
TensorFlow: Fetching CSV Data in Batches


The code is as follows:

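import tensorflow as tf  # uses the TensorFlow 1.x queue-based input API

# Read and parse one record from the file queue
def read_data(file_queue):
    # Skip the first (header) line when reading
    reader = tf.TextLineReader(skip_header_lines=1)
    key, value = reader.read(file_queue)
    # Default values for empty fields in the data source; each default also fixes the column's type
    record_defaults = [[''], [''], [''], [''], [0.], [0.], [0.], [0.], [''], [0], [''], [0.], [''], [''], [0]]
    # Define the decoder: each execution of the read op reads one line from the file,
    # and decode_csv then parses that line into a list of tensors
    province, city, address, postCode, longitude, latitude, price, buildingTypeId, buildingTypeName, tradeTypeId, tradeTypeName, expectedDealPrice, listingDate, delislingDate, daysOnMarket = tf.decode_csv(value, record_defaults)
    # Features are price and expectedDealPrice; the label is daysOnMarket
    return tf.stack([price, expectedDealPrice]), daysOnMarket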
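
To make the role of `record_defaults` concrete, here is a minimal pure-Python sketch of the same idea, without TensorFlow: each default decides the column's type and stands in for an empty field. The helper name `parse_row` and the three-column sample data are hypothetical and only for illustration; the sketch mimics the observable behavior of `tf.decode_csv`, not its implementation.

```python
import csv
import io

def parse_row(fields, record_defaults):
    """Convert one CSV row, using each default for the type and for empty fields."""
    if len(fields) != len(record_defaults):
        raise ValueError("expected %d fields, got %d" % (len(record_defaults), len(fields)))
    parsed = []
    for raw, default in zip(fields, record_defaults):
        if raw == "":
            # Empty field: fall back to the default value
            parsed.append(default)
        else:
            # Non-empty field: convert to the same type as the default
            parsed.append(type(default)(raw))
    return parsed

# Hypothetical sample: city (string), price (float), daysOnMarket (int)
sample = "city,price,daysOnMarket\nToronto,500000.0,12\n,,\n"
defaults = ["", 0.0, 0]

reader = csv.reader(io.StringIO(sample))
next(reader)  # skip the header line, like skip_header_lines=1
rows = [parse_row(fields, defaults) for fields in reader]
print(rows)
```

The second sample row consists only of empty fields, so every value in it comes from `defaults`.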



# Fetch the data in batches
def create_pipeline(filename, batch_size, num_epochs=None):
    file_queue = tf.train.string_input_producer([filename], num_epochs=num_epochs)
    # example is one sample's features and dayOnMarket is its label
    example, dayOnMarket = read_data(file_queue)
    # Minimum number of elements left in the queue after a dequeue;
    # it must be smaller than capacity (the queue length), otherwise an error is raised
    min_after_dequeue = 1000
    # Queue length
    capacity = min_after_dequeue + batch_size
    # Alternative: read from the queue in shuffled order
    # example_batch, daysOnMarket_batch = tf.train.shuffle_batch([example, dayOnMarket], batch_size=batch_size, capacity=capacity, min_after_dequeue=min_after_dequeue)
    # Read in order; batch_size is the number of samples in each returned batch
    example_batch, daysOnMarket_batch = tf.train.batch([example, dayOnMarket], batch_size=batch_size, capacity=capacity)

    return example_batch, daysOnMarket_batch
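
The difference between the sequential `tf.train.batch` call and the commented-out `tf.train.shuffle_batch` call can be illustrated with a simplified pure-Python model. This is only a sketch under stated assumptions: the function names `sequential_batches` and `shuffled_batches` are hypothetical, a trailing partial batch is dropped, and the refill rule is a rough single-threaded approximation of how `capacity` and `min_after_dequeue` interact, not TensorFlow's actual queue implementation.

```python
import random

def sequential_batches(records, batch_size):
    """Yield full batches in input order; a trailing partial batch is dropped."""
    batch = []
    for record in records:
        batch.append(record)
        if len(batch) == batch_size:
            yield batch
            batch = []

def shuffled_batches(records, batch_size, capacity, min_after_dequeue, seed=0):
    """Yield full batches drawn at random from a bounded buffer (simplified model)."""
    if min_after_dequeue >= capacity:
        raise ValueError("min_after_dequeue must be smaller than capacity")
    rng = random.Random(seed)
    source = iter(records)
    buffer, batch = [], []
    exhausted = False
    while True:
        # Refill to capacity whenever the buffer has shrunk to min_after_dequeue,
        # so each draw is made from more than min_after_dequeue candidates
        # for as long as input remains
        if not exhausted and len(buffer) <= min_after_dequeue:
            while len(buffer) < capacity:
                try:
                    buffer.append(next(source))
                except StopIteration:
                    exhausted = True
                    break
        if not buffer:
            break
        # Dequeue one element chosen uniformly at random from the buffer
        batch.append(buffer.pop(rng.randrange(len(buffer))))
        if len(batch) == batch_size:
            yield batch
            batch = []

data = list(range(10))
ordered = list(sequential_batches(data, 4))
print(ordered)

# Same sizing rule as in the pipeline: capacity = min_after_dequeue + batch_size
shuffled = list(shuffled_batches(data, batch_size=5, capacity=8, min_after_dequeue=3))
print(shuffled)
```

In this model a larger `min_after_dequeue` means each draw is made from a larger pool, which gives better mixing at the cost of holding more records in memory.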
