
How to generate hundreds of millions of rows of data ready for database import

1. A Python script can easily generate data that meets the requirements, as follows:

# coding: utf-8
import sys, time, datetime

N = 100000000  # target number of rows

def gen_meid():
    # Left as a stub in the original: should yield the deviceID strings, one per row.
    return

def gen_seq():
    # Left as a stub in the original: should yield the seq strings, one per row.
    return

def generate_message(meid, seq):
    # Print one tab-separated row. r'\N' writes a literal \N,
    # the NULL marker in MySQL import files.
    ts = time.time()
    time_st = datetime.datetime.fromtimestamp(ts).strftime('%Y-%m-%d %H:%M:%S')
    print('\t'.join(( meid, seq, r'\N', r'\N', r'\N', r'\N', '0', '0', '0', '0', time_st, r'\N', r'\N', '0', r'\N', r'\N', r'\N', r'\N', time_st )))

def main(args):
    # Header row ("..." stands for the column names elided in the original).
    # Keep this line for MongoDB; delete it for MySQL.
    print('\t'.join(( 'deviceID', 'battery', ... , 'accumulatedTime', 'createDate' )))
    for meid, seq in zip(gen_meid(), gen_seq()):
        generate_message(meid, seq)
    return 0

#==============================
if __name__ == "__main__":
    main(sys.argv)
#==============================

$ python a.py > device.tsv
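
gen_meid() and gen_seq() in the script above return nothing, so the generators still have to be filled in before the script produces rows. Below is a minimal sketch of one way to do that, assuming sequential zero-padded IDs. The ID formats and the helper names make_row and write_rows are hypothetical and do not come from the original script. The timestamp is computed once per run rather than once per row, which is also an assumption made here to avoid 100000000 calls to strftime.

```python
import datetime
import io

NULL = r'\N'  # literal \N, the NULL marker used in the original rows

def gen_meid(n):
    """Yield n hypothetical device IDs: 'A' followed by a 13-digit counter."""
    for i in range(n):
        yield 'A%013d' % i

def gen_seq(n):
    """Yield n hypothetical sequence values as strings, starting at 1."""
    for i in range(1, n + 1):
        yield str(i)

def make_row(meid, seq, time_st):
    """Build one tab-separated row with the same 19-field layout as generate_message()."""
    return '\t'.join((meid, seq, NULL, NULL, NULL, NULL, '0', '0', '0', '0',
                      time_st, NULL, NULL, '0', NULL, NULL, NULL, NULL, time_st))

def write_rows(out, n):
    """Stream n rows to the file-like object out, one row at a time."""
    time_st = datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S')
    for meid, seq in zip(gen_meid(n), gen_seq(n)):
        out.write(make_row(meid, seq, time_st) + '\n')

# Small demonstration with 3 rows instead of 100000000.
buf = io.StringIO()
write_rows(buf, 3)
rows = buf.getvalue().splitlines()
```

For a full run, write_rows(sys.stdout, 100000000) could replace the buffer. Because both generators are lazy, memory use stays flat regardless of the row count.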

2. Split the data (optional)

tail -n +1      device.tsv | head -n 100000 > part1.txt

tail -n +100001 device.tsv | head -n 100000 > part2.txt

tail -n +200001 device.tsv | head -n 100000 > part3.txt

tail -n +300001 device.tsv | head -n 100000 > part4.txt
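
The same split can be sketched in Python, which reads the file in a single pass instead of rescanning it from the start for each part. The function name split_file and the part-name pattern are assumptions made for this illustration. Unlike the four commands above, it keeps going until the whole file has been split.

```python
import itertools
import os
import tempfile

def split_file(path, lines_per_part, prefix):
    """Split path into prefix1.txt, prefix2.txt, ... with lines_per_part lines each.

    Returns the list of part file names that were written.
    """
    parts = []
    with open(path, 'r', encoding='utf-8') as src:
        while True:
            # Take the next block of lines; an empty block means end of file.
            chunk = list(itertools.islice(src, lines_per_part))
            if not chunk:
                break
            name = '%s%d.txt' % (prefix, len(parts) + 1)
            with open(name, 'w', encoding='utf-8') as dst:
                dst.writelines(chunk)
            parts.append(name)
    return parts

# Demonstration on a small temporary file: 10 lines, 4 lines per part.
workdir = tempfile.mkdtemp()
source = os.path.join(workdir, 'device.tsv')
with open(source, 'w', encoding='utf-8') as f:
    f.writelines('row%d\n' % i for i in range(10))
parts = split_file(source, 4, os.path.join(workdir, 'part'))
```

On the real file the call would be along the lines of split_file('device.tsv', 100000, 'part'). Only one block of 100000 lines is held in memory at a time.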

3. Generate a txt file

python a.py > device.txt