MySQL Cluster Write Performance Test
Of the questions I have run into while using MySQL Cluster so far, the ones I most want answered, because they directly determine whether it is usable at all, are its write performance, whether the cluster suits large-scale data storage, and how that storage should be configured.
In earlier tests MySQL Cluster's write performance was consistently poor, and that is the deciding factor for adopting it. Let's now test it more carefully. The environment has changed slightly:
each data node's memory has been expanded to 4 GB.
The cluster configuration is as follows:
```
[ndbd default]
# Options affecting ndbd processes on all data nodes:
NoOfReplicas=2     # Number of replicas
DataMemory=2000M   # How much memory to allocate for data storage
IndexMemory=300M   # How much memory to allocate for index storage
MaxNoOfConcurrentOperations=1200000
MaxNoOfLocalOperations=1320000
```
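The last two parameters are sized together: the MySQL Cluster documentation suggests setting `MaxNoOfLocalOperations` to roughly 1.1 × `MaxNoOfConcurrentOperations`, which is where the 1,320,000 above comes from. A minimal sketch of that arithmetic (the class name is made up for illustration):

```java
// Sketch: derive MaxNoOfLocalOperations from MaxNoOfConcurrentOperations
// using the 1.1x guideline from the MySQL Cluster documentation.
public class NdbOpSizing {
    public static void main(String[] args) {
        long maxConcurrentOps = 1_200_000L; // value used in this test's config
        long maxLocalOps = Math.round(maxConcurrentOps * 1.1);
        System.out.println("MaxNoOfLocalOperations=" + maxLocalOps); // 1320000
    }
}
```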
The test code is as follows:
```java
/**
 * Inserts test data into the given table in batches.
 *
 * @param conn          database connection
 * @param tableName     target table
 * @param totalRowCount total number of rows to insert
 * @param perRowCount   rows per batch (one commit per batch)
 * @param startIndex    starting value for the id column
 * @author lihzh(OneCoder)
 * @throws SQLException
 * @date 2013-1-17 13:57:10
 */
private void insertDataToTable(Connection conn, String tableName,
        long totalRowCount, long perRowCount, long startIndex)
        throws SQLException {
    conn.setAutoCommit(false);
    String sql = "insert into " + tableName + " VALUES(?,?,?)";
    System.out.println("Begin to prepare statement.");
    PreparedStatement statement = conn.prepareStatement(sql);
    long sum = 0L;
    long batchCount = totalRowCount / perRowCount;
    for (long j = 0; j < batchCount; j++) {
        long batchStart = System.currentTimeMillis();
        for (long i = 0; i < perRowCount; i++) {
            long id = j * perRowCount + i + startIndex;
            String name_pre = String.valueOf(id);
            statement.setLong(1, id);
            statement.setString(2, name_pre);
            statement.setString(3, name_pre);
            statement.addBatch();
        }
        System.out.println("It's up to batch count: " + perRowCount);
        statement.executeBatch();
        conn.commit();
        long cost = System.currentTimeMillis() - batchStart;
        System.out.println("Batch data commit finished. Time cost: " + cost);
        sum += cost;
    }
    System.out.println("All data insert finished. Total time cost: " + sum);
    System.out.println("Avg cost: " + sum / batchCount);
}
```
Write tests were run under the following scenarios.
With data loaded and written entirely in memory, insert 100, 1,000, 10,000, and 50,000 records in a single run into a fresh database and table, recording the elapsed time of each (average of 5 runs):
- 100: 260 ms
- 1000: 1940 ms
- 10000: 17683 ms (individual runs 12000-17000 ms)
- 50000: 93308, 94730, 90162, 94849, 162848 ms
For comparison, the same test against an ordinary standalone MySQL instance (2 GB RAM):
- 100: 182 ms
- 1000: 1624 ms
- 10000: 14946 ms
- 50000: 84438 ms
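To put the two sets of numbers side by side, the snippet below averages the five 50,000-row cluster runs listed above and computes the overhead relative to the single-node figure. Only the measurements from this post go in; the class name is invented for illustration:

```java
// Sketch: aggregate the five 50,000-row cluster runs and compare against
// the standalone-MySQL result (84438 ms) from the same test.
public class WriteStats {
    public static void main(String[] args) {
        long[] clusterRuns50k = {93308, 94730, 90162, 94849, 162848};
        long sum = 0;
        for (long c : clusterRuns50k) {
            sum += c;
        }
        double clusterAvg = sum / (double) clusterRuns50k.length; // 107179.4 ms
        double singleNode = 84438.0;
        System.out.printf("cluster avg: %.1f ms, overhead vs single node: %.2fx%n",
                clusterAvg, clusterAvg / singleNode);
    }
}
```

Note how much the one outlier run (162848 ms) pulls the average up; the remaining four runs cluster tightly around 93,000 ms.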
Dual-thread concurrent write test
Since the cluster has only two SQL nodes, the test uses two writer threads, one per node. The code is simply hard-coded:
```java
/**
 * Concurrent write test: two threads, one per SQL node.
 *
 * @author lihzh(OneCoder)
 * @blog http://www.coderli.com
 * @date 2013-2-27 15:39:56
 */
private void parallelInsert() {
    final long start = System.currentTimeMillis();
    Thread t1 = new Thread(new Runnable() {
        @Override
        public void run() {
            try {
                Connection conn = getConnection(DB_IPADDRESS, DB_PORT,
                        DB_NAME, DB_USER, DB_PASSOWRD);
                MySQLClusterDataMachine dataMachine = new MySQLClusterDataMachine();
                dataMachine.insertDataToTable(conn, TABLE_NAME_DATAHOUSE,
                        500, 100, 0);
                long end1 = System.currentTimeMillis();
                System.out.println("Thread 1 cost: " + (end1 - start));
            } catch (SQLException e) {
                e.printStackTrace();
            }
        }
    });
    Thread t2 = new Thread(new Runnable() {
        @Override
        public void run() {
            try {
                Connection conn = getConnection(DB_IPADDRESS_TWO, DB_PORT,
                        DB_NAME, DB_USER, DB_PASSOWRD);
                MySQLClusterDataMachine dataMachine = new MySQLClusterDataMachine();
                dataMachine.insertDataToTable(conn, TABLE_NAME_DATAHOUSE,
                        500, 100, 500);
                long end2 = System.currentTimeMillis();
                System.out.println("Thread 2 cost: " + (end2 - start));
            } catch (SQLException e) {
                e.printStackTrace();
            }
        }
    });
    t1.start();
    t2.start();
}
```
Test results (all times in ms):

| Total rows / per batch | Thread 1 (total/avg, each writes half) | Thread 2 (total/avg) | Parallel total | Single thread, single node |
|---|---|---|---|---|
| 1000/100 | 985/197 | 1005/201 | 1005/201 | 2264/226 |
| 10000/1000 | 9223/1836 | 9297/1850 | 9297/1850 | 19405/1940 |
| 100000/10000 | 121425/12136 | 122081/12201 | 121425/12136 | 148518/14851 |
The results show that for batch writes of up to 10,000 rows, the SQL nodes are the cluster's bottleneck: writing with two threads through both SQL nodes roughly doubles the throughput of a single thread on a single node. Once the batch size grows past that, the gain fades noticeably, which suggests the bottleneck has moved elsewhere in the cluster.
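The speedup claim can be read straight off the table: dividing the single-thread total by the parallel wall time (the slower of the two threads, as reported in the "parallel total" column) gives the effective speedup for each batch size. A small sketch, using only the numbers above (the class name is made up):

```java
// Sketch: speedup of dual-thread/dual-SQL-node writes over single-thread
// single-node, from the totals in the table above.
public class Speedup {
    public static void main(String[] args) {
        long[][] rows = {
                // {parallel wall time, single-thread total}
                {1005, 2264},     // 1000 rows / 100 per batch
                {9297, 19405},    // 10000 / 1000
                {121425, 148518}, // 100000 / 10000
        };
        for (long[] r : rows) {
            System.out.printf("speedup: %.2fx%n", r[1] / (double) r[0]);
        }
        // prints 2.25x, 2.09x, 1.22x
    }
}
```

The drop from ~2.1x to ~1.2x at 10,000-row batches is what points the bottleneck away from the SQL nodes.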
Note: because test environments differ, these figures are only meaningful for comparison within this post, not against other setups. For reference only.
There is much more write testing to do, but this is a stopping point for now. Tests of large-scale storage and queries will follow.