
MongoDB Data Synchronization with Kettle

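Requirements:

1. When a record is added to the source database, the same record is added to the target database.

2. When a record is modified in the source database, the corresponding record is modified in the target database.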
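The two requirements amount to "insert when the key is new, update when it already exists". The following is a minimal sketch of that behaviour in plain Python, not Kettle's implementation: the target is modelled as a dict keyed by `ID`, and the `sync` function and the `name` field are hypothetical names chosen only for illustration.

```python
from typing import Dict, List

Row = Dict[str, object]


def sync(source_rows: List[Row], target: Dict[object, Row], key: str = "ID") -> Dict[str, int]:
    """Apply each source row to the target: insert new keys, update changed rows."""
    counts = {"inserted": 0, "updated": 0}
    for row in source_rows:
        k = row[key]
        if k not in target:
            # Requirement 1: a new source record is added to the target.
            target[k] = dict(row)
            counts["inserted"] += 1
        elif target[k] != row:
            # Requirement 2: a modified source record overwrites the target copy.
            target[k] = dict(row)
            counts["updated"] += 1
    return counts


target: Dict[object, Row] = {}
first = sync([{"ID": 1, "name": "a"}], target)
second = sync([{"ID": 1, "name": "b"}, {"ID": 2, "name": "c"}], target)
```

After the second run the target holds two records, and the record with `ID` 1 carries the modified `name`.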

The example uses three Kettle steps.

 

The configuration of each step is described in detail below.

Source:

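This example connects to a MongoDB database. The collection has four fields, with ID acting as the primary key. The _id field is generated automatically by the system and can be ignored for now.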

Value Mapper:

This step does little in this example; it is included only to test the effect. Configure it as shown in the screenshot.

MongoDbOutput:

The configuration of this step is the key part.

The official documentation explains this tab as follows:

2.2 Selecting the write mode

The MongoDb output step provides a number of options that control what and how data is written to the target Mongo document collection. By default, data is inserted into the target collection. If the specified collection doesn't exist, it will be created before data is inserted. Selecting the Truncate option will delete any existing data in the target collection before inserting begins. Unless unique indexes are being used (see section on indexing below) then Mongo DB will allow duplicate records to be inserted. Mongo DB allows for fast bulk insert operations - the batch size can be configured using the Batch insert size field. If no value is supplied here, then the default size of 100 rows is used.

Selecting the Upsert option changes the write mode from insert to upsert (i.e. update if a match is found, otherwise insert a new record). Information on defining how records are matched can be found in the next section. Standard upsert replaces a matched record with an entire new record based on all the incoming fields specified in the Mongo document fields tab. Modifier update enables modifier ($ operators) to be used to mutate individual fields within matching documents. This type of update is fast and involves minimal network traffic; it also has the ability to update all matching documents, rather than just the first, if the Multi-update option is enabled.
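To make the difference between the write modes quoted above more concrete, here is a rough in-memory sketch. It is not how Kettle or MongoDB is implemented: the collection is a plain Python list of dicts, `_id` handling is ignored, and the helper names (`insert`, `upsert_replace`, `upsert_set`) and the `name` and `city` fields are hypothetical.

```python
from typing import Dict, List

Doc = Dict[str, object]


def matches(doc: Doc, query: Doc) -> bool:
    """True if every key/value pair in the query equals the document's value."""
    return all(doc.get(k) == v for k, v in query.items())


def insert(coll: List[Doc], doc: Doc) -> None:
    """Plain insert: duplicates are allowed when no unique index exists."""
    coll.append(dict(doc))


def upsert_replace(coll: List[Doc], query: Doc, new_doc: Doc) -> None:
    """Standard upsert: replace the first match with the whole new record."""
    for i, doc in enumerate(coll):
        if matches(doc, query):
            coll[i] = dict(new_doc)
            return
    coll.append(dict(new_doc))


def upsert_set(coll: List[Doc], query: Doc, fields: Doc, multi: bool = False) -> None:
    """Modifier update in the style of $set: change only the listed fields."""
    hit = False
    for doc in coll:
        if matches(doc, query):
            doc.update(fields)
            hit = True
            if not multi:
                break  # without multi-update only the first match changes
    if not hit:
        coll.append({**query, **fields})


coll: List[Doc] = []
insert(coll, {"ID": 1, "name": "a", "city": "x"})
insert(coll, {"ID": 1, "name": "a", "city": "x"})           # duplicate accepted
upsert_replace(coll, {"ID": 1}, {"ID": 1, "name": "b"})      # first match loses "city"
upsert_set(coll, {"ID": 1}, {"name": "c"}, multi=True)       # both matches renamed
upsert_set(coll, {"ID": 2}, {"name": "d"})                   # no match, so inserted
```

The replace-style upsert drops any field that is not in the incoming record, whereas the modifier-style update leaves untouched fields such as `city` in place.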

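My understanding is that once the options circled in red are checked, any modification or addition in the source data triggers the corresponding operation in the target database. One more setting is still required, however.

Because ID is the primary key, its "Match field for update" value must be set to Y; otherwise the transformation fails with an error at run time.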
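One way to picture why a match field is needed: the fields flagged Y form the query that locates the existing document, and the remaining fields form the update. The sketch below is an assumption about the general idea, not Kettle's code. The field names other than `ID`, the `FIELD_CONFIG` list, and the `build_upsert` helper are all hypothetical, and the `ValueError` merely stands in for whatever error Kettle actually reports.

```python
from typing import Dict, List, Tuple

# (field name, match field for update?) - hypothetical configuration
FIELD_CONFIG: List[Tuple[str, bool]] = [
    ("ID", True),
    ("name", False),
    ("age", False),
    ("city", False),
]


def build_upsert(row: Dict[str, object],
                 field_config: List[Tuple[str, bool]]) -> Tuple[dict, dict]:
    """Split a row into a match query and a $set update document."""
    match_fields = [name for name, is_match in field_config if is_match]
    if not match_fields:
        # Without a match field there is nothing to look the record up by.
        raise ValueError("at least one field must be marked as a match field")
    query = {name: row[name] for name in match_fields}
    update = {"$set": {name: row[name]
                       for name, is_match in field_config if not is_match}}
    return query, update


query, update = build_upsert(
    {"ID": 7, "name": "n", "age": 3, "city": "c"}, FIELD_CONFIG
)
```

With a driver such as pymongo, a pair like this could be passed to `collection.update_one(query, update, upsert=True)`.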

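The settings listed above are the most important parts of the synchronization process. For more advanced features, consult the official documentation and API in detail.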