[Sync Script] MySQL-to-Elasticsearch synchronization
阿新 • Published: 2017-08-26
The search feature of our company's project uses Elasticsearch, which raises the question of how to keep its data in sync with MySQL.
I looked at several packages online, but each had its drawbacks, so in the end I decided to write my own script. The rough approach:
1. In an endless loop, repeatedly SELECT from the specified tables.
2. Read all rows whose update time is later than a checkpoint (initialized to "1970-01-01 00:00:00").
3. Write the required fields to Elasticsearch.
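The three steps above can be sketched as a small loop. This is a minimal Python 3 sketch, not the script itself: `fetch_updates` and `index_to_es` are hypothetical callables standing in for the MySQL query and the Elasticsearch indexing call, and `max_rounds` is added only so the loop can terminate in a test.

```python
import time

POLL_SLEEP = 3  # seconds to back off when nothing changed


def sync_loop(fetch_updates, index_to_es, last_time, max_rounds=None):
    # Step 1: loop forever (bounded here by max_rounds for testing).
    rounds = 0
    while max_rounds is None or rounds < max_rounds:
        rounds += 1
        # Step 2: rows whose update time is later than the checkpoint.
        rows = fetch_updates(last_time)
        if not rows:
            time.sleep(POLL_SLEEP)
            continue
        # Step 3: push the needed fields to Elasticsearch and
        # advance the checkpoint as rows are processed.
        for row in rows:
            index_to_es(row)
            last_time = row["update_time"]
    return last_time
```

The checkpoint only advances after a row has been indexed, so a crash at worst re-indexes a few rows rather than skipping any.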
Notes:
1. The script may be interrupted or restarted at any point, so the latest update time is persisted to a fixed txt file.
2. To keep the script generic, so that supporting a new table does not require rewriting large parts of it, variables are generated dynamically using locals() and globals().
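Both notes can be illustrated together in a Python 3 sketch (separate from the actual script below; the file path and the `monitor_demo` handler are made up for illustration): the last line of a per-table txt file is the checkpoint, and per-table handlers are resolved by name through globals().

```python
import os

EPOCH = "1970-01-01 00:00:00"


def read_checkpoint(path):
    # The last non-empty line of the txt file is the latest synced time.
    if not os.path.isfile(path):
        return EPOCH
    with open(path) as f:
        lines = [ln.strip() for ln in f if ln.strip()]
    return lines[-1] if lines else EPOCH


def append_checkpoint(path, ts):
    # Append instead of overwrite: a crash mid-write still leaves
    # the previous checkpoint line intact.
    with open(path, "a") as f:
        f.write(ts + "\n")


def monitor_demo(last_time):
    # Hypothetical handler; a real one would query MySQL and index
    # into Elasticsearch, then return the newest update time seen.
    return "2017-08-26 00:00:00"


def run_once(table, path):
    # Note 2: resolve the handler dynamically by name.
    handler = globals()["monitor_" + table]
    last_time = read_checkpoint(path)
    update_time = handler(last_time)
    if update_time != last_time:
        append_checkpoint(path, update_time)
    return update_time
```

Resolving through a plain dict (`handlers[table]`) would be the more conventional choice; the globals() lookup mirrors the trick the script uses.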
The code is as follows:
```python
#!/usr/bin/env python
# coding=utf-8
import sys
sys.path.append('/Users/cangyufu/work_jbkj/elabels-flask')

from modules.utils.commons import app, redispool, db_master, db_slave
from sqlalchemy import text
import os
import datetime
import time
from service.myelasticsearch.index import es
from modules.utils.mysqldb import db_obj_dict

CONST_SLEEP = 3
WORK_INDEX = 'test'


# https://stackoverflow.com/questions/136168/get-last-n-lines-of-a-file-with-python-similar-to-tail
def tail(f, lines=1):
    total_lines_wanted = lines
    BLOCK_SIZE = 1024
    f.seek(0, 2)
    block_end_byte = f.tell()
    lines_to_go = total_lines_wanted
    block_number = -1
    # blocks of size BLOCK_SIZE, in reverse order,
    # starting from the end of the file
    blocks = []
    while lines_to_go > 0 and block_end_byte > 0:
        if block_end_byte - BLOCK_SIZE > 0:
            # read the last block we haven't yet read
            f.seek(block_number * BLOCK_SIZE, 2)
            blocks.append(f.read(BLOCK_SIZE))
        else:
            # file too small, start from the beginning
            f.seek(0, 0)
            # only read what was not read
            blocks.append(f.read(block_end_byte))
        lines_found = blocks[-1].count('\n')
        lines_to_go -= lines_found
        block_end_byte -= BLOCK_SIZE
        block_number -= 1
    all_read_text = ''.join(reversed(blocks))
    return '\n'.join(all_read_text.splitlines()[-total_lines_wanted:])


def is_file_exists(filename):
    # initialize the checkpoint file with the epoch timestamp
    if not os.path.isfile(filename):
        f = open(filename, 'wb')
        f.write("1970-01-01 00:00:00\n")
        f.close()


# pass in the names of the tables to monitor
def sync_main(*args):
    # fail fast if a monitor_<table> function is missing
    for table in args:
        try:
            callable(globals()['monitor_' + table])
        except Exception:
            raise Exception('lack function monitor_{}'.format(table))
    for table in args:
        filename = ''.join(['monitor_', table, '.txt'])
        locals()[table + 'path'] = os.path.join(os.path.dirname(__file__), filename)
        is_file_exists(locals()[table + 'path'])
        locals()[table + 'file'] = open(locals()[table + 'path'], 'rb+')
    try:
        print "begin"
        while True:
            count = 0
            for table in args:
                print 'handling ' + table
                # the last line of the checkpoint file is the latest synced time
                last_time = tail(locals()[table + 'file'], 1)
                update_time = globals()['monitor_' + table](last_time)
                print update_time
                if update_time == last_time:
                    count += 1
                    continue
                locals()[table + 'file'].write(update_time + '\n')
                locals()[table + 'file'].flush()
            # every table is up to date: back off before polling again
            if count == len(args):
                time.sleep(CONST_SLEEP)
    except Exception, e:
        print e
        raise e
    finally:
        for table in args:
            locals()[table + 'file'].close()


########################################################################
# To monitor a table you must implement a function named
# monitor_<table_name>; e.g. to monitor table1, you must implement
# monitor_table1. It receives the start time for the incremental
# update (initially "1970-01-01 00:00:00") and returns the newest
# update time it reached.
########################################################################

def monitor_table1(last_time):
    pass
    return last_time


def monitor_table2(last_time):
    pass
    return last_time


def trans_date_time(dt):
    return datetime.datetime.strptime(dt, "%Y-%m-%d %H:%M:%S")


sync_main('table1', 'table2')
```
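The monitor_* stubs are where the real per-table work goes. Here is a hedged Python 3 sketch of what one could look like, using an in-memory sqlite3 table as a stand-in for MySQL and a plain callback as a stand-in for the `es` client; the `items` schema, column names, and `index_row` parameter are all invented for illustration, and a real version would use the script's db_slave connection and call something like es.index instead.

```python
import sqlite3


def monitor_items(conn, last_time, index_row):
    # Fetch rows updated strictly after the checkpoint, oldest first,
    # and hand each one to the indexer.
    cur = conn.execute(
        "SELECT id, name, update_time FROM items "
        "WHERE update_time > ? ORDER BY update_time",
        (last_time,))
    newest = last_time
    for row_id, name, update_time in cur:
        index_row({"id": row_id, "name": name})  # stand-in for es.index(...)
        newest = update_time
    # Return the newest time seen so the caller can persist it; if
    # nothing changed, the unchanged checkpoint signals "up to date".
    return newest
```

Because the function returns `last_time` unchanged when no rows match, it plugs directly into the main loop's "if update_time == last_time" back-off check.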