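Importing a CSV file into MySQL on a Mac: ERROR 1290 (HY000) and ERROR 13 (HY000)
Installing MySQL and Workbench on a Mac is quick. Importing CSV data is where I ran into most of the pitfalls.
Starting MySQL
Click MySQL in System Preferences to open the panel for starting the MySQL server. Then run mysql -u root -p in Terminal and enter the password to log in to MySQL.
Before importing, make sure the data is in CSV format. My data was already CSV, so I did not try importing xlsx; other web pages say the imported data must be CSV. The most important thing when importing is to know the format of all the data before deciding each column's type in the database. For example, if user_id contains non-numeric characters and the user_id column is declared as INT, the import reports an Incorrect integer value: error as soon as it reaches a row whose value in that column is a string.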
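One way to check the data before choosing column types is to scan a sample of the CSV and see which columns contain only integers. The following is a minimal sketch using only the Python standard library. The function name guess_column_types, the sample data, and the choice to treat empty strings and the literal text null as missing values are assumptions made for illustration.

```python
import csv
import io


def guess_column_types(csv_text):
    """Guess INT vs. VARCHAR for each column of a small CSV sample."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Start by assuming every column holds integers only.
    is_int = {name: True for name in reader.fieldnames}
    for row in reader:
        for name, value in row.items():
            if name is None:
                continue  # row has more fields than the header; ignore extras
            value = (value or "").strip()
            if value in ("", "null"):
                continue  # missing values do not decide the type (assumption)
            try:
                int(value)
            except ValueError:
                is_int[name] = False
    return {name: "INT" if ok else "VARCHAR" for name, ok in is_int.items()}


# Hypothetical sample: the second user_id contains a letter.
sample = "user_id,merchant_id\n1001,27\nu1002,35\n1003,null\n"
print(guess_column_types(sample))
```

A column reported as VARCHAR here would trigger Incorrect integer value: if it were declared as INT in the table.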
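There are two main ways to import the data:
The first is to import directly in Workbench; the table does not have to be created beforehand.
The second is to import from Terminal. This is much faster, but the table must be created before importing.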
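1. Importing the data into MySQL failed with ERROR 1290 (HY000): The MySQL server is running with the --secure-file-priv option so it cannot execute this statement.
Solution: run show variables like '%secure%'; to check the value of secure_file_priv. If the value is NULL, importing and exporting files is disabled; if it is a directory, only files in that directory are allowed; if it is empty, there is no directory restriction. Current versions of MySQL, however, do not ship a my.cnf under /etc, so the file has to be created by hand, for example from a sample config like the one below: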
cd /etc
sudo vim my.cnf
Without sudo, the file cannot be saved!
# Example MySQL config file for medium systems.
#
# This is for a system with little memory (32M - 64M) where MySQL plays
# an important part, or systems up to 128M where MySQL is used together with
# other programs (such as a web server)
#
# MySQL programs look for option files in a set of
# locations which depend on the deployment platform.
# You can copy this option file to one of those
# locations. For information about these locations, see:
# http://dev.mysql.com/doc/mysql/en/option-files.html
#
# In this file, you can use all long options that a program supports.
# If you want to know which options a program supports, run the program
# with the "--help" option.

# The following options will be passed to all MySQL clients
[client]
default-character-set=utf8
#password = your_password
port = 3306
socket = /tmp/mysql.sock

# Here follows entries for some specific programs

# The MySQL server
[mysqld]
character-set-server=utf8
init_connect='SET NAMES utf8'
port = 3306
socket = /tmp/mysql.sock
skip-external-locking
key_buffer_size = 16M
max_allowed_packet = 1M
table_open_cache = 64
sort_buffer_size = 512K
net_buffer_length = 8K
read_buffer_size = 256K
read_rnd_buffer_size = 512K
myisam_sort_buffer_size = 8M

# Don't listen on a TCP/IP port at all. This can be a security enhancement,
# if all processes that need to connect to mysqld run on the same host.
# All interaction with mysqld must be made via Unix sockets or named pipes.
# Note that using this option without enabling named pipes on Windows
# (via the "enable-named-pipe" option) will render mysqld useless!
#
#skip-networking

# Replication Master Server (default)
# binary logging is required for replication
log-bin=mysql-bin

# binary logging format - mixed recommended
binlog_format=mixed

# required unique id between 1 and 2^32 - 1
# defaults to 1 if master-host is not set
# but will not function as a master if omitted
server-id = 1

# Replication Slave (comment out master section to use this)
#
# To configure this host as a replication slave, you can choose between
# two methods :
#
# 1) Use the CHANGE MASTER TO command (fully described in our manual) -
#    the syntax is:
#
#    CHANGE MASTER TO MASTER_HOST=<host>, MASTER_PORT=<port>,
#    MASTER_USER=<user>, MASTER_PASSWORD=<password> ;
#
#    where you replace <host>, <user>, <password> by quoted strings and
#    <port> by the master's port number (3306 by default).
#
#    Example:
#
#    CHANGE MASTER TO MASTER_HOST='125.564.12.1', MASTER_PORT=3306,
#    MASTER_USER='joe', MASTER_PASSWORD='secret';
#
# OR
#
# 2) Set the variables below. However, in case you choose this method, then
#    start replication for the first time (even unsuccessfully, for example
#    if you mistyped the password in master-password and the slave fails to
#    connect), the slave will create a master.info file, and any later
#    change in this file to the variables' values below will be ignored and
#    overridden by the content of the master.info file, unless you shutdown
#    the slave server, delete master.info and restart the slave server.
#    For that reason, you may want to leave the lines below untouched
#    (commented) and instead use CHANGE MASTER TO (see above)
#
# required unique id between 2 and 2^32 - 1
# (and different from the master)
# defaults to 2 if master-host is set
# but will not function as a slave if omitted
#server-id = 2
#
# The replication master for this slave - required
#master-host = <hostname>
#
# The username the slave will use for authentication when connecting
# to the master - required
#master-user = <username>
#
# The password the slave will authenticate with when connecting to
# the master - required
#master-password = <password>
#
# The port the master is listening on.
# optional - defaults to 3306
#master-port = <port>
#
# binary logging - not required for slaves, but recommended
#log-bin=mysql-bin

# Uncomment the following if you are using InnoDB tables
#innodb_data_home_dir = /usr/local/mysql/data
#innodb_data_file_path = ibdata1:10M:autoextend
#innodb_log_group_home_dir = /usr/local/mysql/data
# You can set .._buffer_pool_size up to 50 - 80 %
# of RAM but beware of setting memory usage too high
#innodb_buffer_pool_size = 16M
#innodb_additional_mem_pool_size = 2M
# Set .._log_file_size to 25 % of buffer pool size
#innodb_log_file_size = 5M
#innodb_log_buffer_size = 8M
#innodb_flush_log_at_trx_commit = 1
#innodb_lock_wait_timeout = 50

[mysqldump]
quick
max_allowed_packet = 16M

[mysql]
no-auto-rehash
# Remove the next comment character if you are not familiar with SQL
#safe-updates
default-character-set=utf8

[myisamchk]
key_buffer_size = 20M
sort_buffer_size = 20M
read_buffer = 2M
write_buffer = 2M

[mysqlhotcopy]
interactive-timeout
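Note that the sample file above does not itself contain a secure_file_priv line. The variable is read at server startup, so presumably it still has to be set under [mysqld] and the server restarted before the value changes. The three cases described in step 1 can be sketched in Python as follows. This is only an illustration of the rule, not MySQL's actual implementation; the function name and the example paths are assumptions, and the sketch assumes that files in subdirectories of the configured directory are also accepted.

```python
from pathlib import PurePosixPath


def load_data_infile_allowed(secure_file_priv, csv_path):
    """Return True if a server-side LOAD DATA INFILE on csv_path would pass
    the secure_file_priv check, following the three cases described above."""
    if secure_file_priv is None:
        return False  # NULL: file import and export are disabled
    if secure_file_priv == "":
        return True  # empty value: no directory restriction
    # Otherwise the file must live under the configured directory.
    allowed_dir = PurePosixPath(secure_file_priv)
    return allowed_dir in PurePosixPath(csv_path).parents


# Hypothetical paths for illustration.
print(load_data_infile_allowed(None, "/var/tmp/pool_oto/train.csv"))
print(load_data_infile_allowed("", "/Users/me/Desktop/train.csv"))
print(load_data_infile_allowed("/var/tmp", "/var/tmp/pool_oto/train.csv"))
```

This check covers only ERROR 1290. The file must also be readable by the operating-system user that runs the MySQL server, which is a separate problem.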
2. At this point, importing the data may fail with
ERROR 13 (HY000): Can't get stat of '/Users/zhangxin/Desktop/pool_oto/ccf_online_stage1_train.csv' (OS errno 13 - Permission denied)
Solution: run show variables like '%tmpdir%'; to see the temporary directory MySQL uses by default, and move the file into that directory. Another option is to change load data infile to load data local infile (unverified): http://www.360doc.com/content/15/1231/20/1073512_524491459.shtml
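The likely cause of errno 13 is that the MySQL server process runs as a different operating-system user and cannot read files under the Desktop folder. Copying the file into a directory the server can read, as described above, can be sketched like this. The function name stage_for_import is hypothetical, and the demo uses a throwaway temporary directory rather than the real tmpdir reported by MySQL.

```python
import os
import shutil
import stat
import tempfile


def stage_for_import(src_path, staging_dir):
    """Copy src_path into staging_dir and make the copy world-readable."""
    os.makedirs(staging_dir, exist_ok=True)
    dst_path = os.path.join(staging_dir, os.path.basename(src_path))
    shutil.copyfile(src_path, dst_path)
    # Read/write for the owner, read-only for group and others (mode 0644).
    os.chmod(dst_path, stat.S_IRUSR | stat.S_IWUSR | stat.S_IRGRP | stat.S_IROTH)
    return dst_path


# Demo with temporary files; in practice staging_dir would be the
# directory shown by: show variables like '%tmpdir%';
with tempfile.TemporaryDirectory() as work:
    src = os.path.join(work, "demo.csv")
    with open(src, "w") as f:
        f.write("a,b\n1,2\n")
    staged = stage_for_import(src, os.path.join(work, "staging"))
    print(staged)
```

The staging directory itself must also be traversable by the server's user, otherwise the same Permission denied error comes back.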
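3. Importing the data into MySQL failed with Incorrect integer value: 'null' for column.
Solution: use Python to convert the NaN values to empty strings, then import the result into the database.
In the end, the following code imported the data successfully!
Python code: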
import pandas as pd
data = pd.read_csv('/var/tmp/pool_oto/ccf_online_stage1_train.csv')
df = data.where(data.notnull(), '')  # replace NaN with the empty string ''
df['Date'] = df['Date'].apply(lambda x: int(x) if x != '' else x)  # the dates were read as floats, so convert them back to integers
df['Date_received'] = df['Date_received'].apply(lambda x: int(x) if x != '' else x)
df.to_csv('/var/tmp/pool_oto/ccf_online_stage1_train2.csv',index = False)
SQL code:
LOAD DATA INFILE '/var/tmp/pool_oto/ccf_online_stage1_train2.csv'
INTO TABLE oto.online_train
FIELDS TERMINATED BY ','
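IGNORE 1 LINES /* skip the first line: it holds the column names, and importing it raises Incorrect integer value: because the header value for user_id is a string */
/* User variables (@) let each value be transformed before it is stored in the database. The variables must be listed in the same order as the columns of the data being imported,
otherwise the imported values will not line up with the column names. */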
(@User_id,@Merchant_id,@Action, @Coupon_id,@Discount_rate, @Date_received,@Date)
SET
User_id = IF(@User_id= '', NULL,@User_id),
Merchant_id= IF(@Merchant_id= '', NULL,@Merchant_id),
Action= IF(@Action= '', NULL,@Action),
Coupon_id = IF(@Coupon_id= '', NULL,@Coupon_id),
Discount_rate= IF(@Discount_rate= '', NULL,@Discount_rate),
Date_received= IF(@Date_received= '', NULL,@Date_received),
Date= IF(@Date= '', NULL,@Date)
;
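Since the statement above repeats the same IF(@col = '', NULL, @col) pattern for every column, and the column order has to match the file, it can help to generate it from a single list of column names. The helper below is a rough sketch: build_load_statement is a hypothetical name, and it does plain string formatting with no escaping, so it is only suitable for trusted paths and column names.

```python
def build_load_statement(csv_path, table, columns):
    """Build a LOAD DATA INFILE statement that stores empty fields as NULL.

    columns must be given in the same order as the fields in the CSV file.
    """
    user_vars = ", ".join("@" + c for c in columns)
    assignments = ",\n".join(
        "  {0} = IF(@{0} = '', NULL, @{0})".format(c) for c in columns
    )
    return (
        "LOAD DATA INFILE '{path}'\n"
        "INTO TABLE {table}\n"
        "FIELDS TERMINATED BY ','\n"
        "IGNORE 1 LINES\n"
        "({vars})\n"
        "SET\n{assignments};"
    ).format(path=csv_path, table=table, vars=user_vars, assignments=assignments)


columns = ["User_id", "Merchant_id", "Action", "Coupon_id",
           "Discount_rate", "Date_received", "Date"]
print(build_load_statement(
    "/var/tmp/pool_oto/ccf_online_stage1_train2.csv", "oto.online_train", columns))
```

Keeping the column list in one place means the user variables and the SET assignments cannot drift out of order.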