
Crawler 2: Storing Data Scraped with Python into MySQL

The data could also be stored in Hive or HDFS; here we choose MySQL.

1. Install MySQL (Python was already configured in the pyspark section)

https://blog.csdn.net/zhouzezhou/article/details/52446608

If the bin directory cannot be found after installation, see:

https://blog.csdn.net/cuicui_ruirui/article/details/105840107

2. Install the Python library for operating MySQL

pip install pymysql -i http://pypi.douban.com/simple --trusted-host pypi.douban.com
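A quick way to confirm the install worked (assuming the pip command above completed without errors) is to import the library and print its version:

import pymysql
print(pymysql.__version__)   # prints the installed pymysql version if the import succeeds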

3. Create the corresponding table in MySQL

mysql> create database pachong;
mysql> use pachong;
-- fields to collect: url, author, bookname, chubanshe, time_print, price
mysql> create table dangdang_book(
    url varchar(100) not null,
    author varchar(20),
    bookname varchar(50),
    chubanshe varchar(50),
    time_print varchar(50),
    price varchar(10)
) charset=utf8;
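The same DDL can also be run from Python instead of the mysql shell; a minimal sketch, assuming a local MySQL server with root access (the password is a placeholder):

import pymysql

# connect without selecting a database, then create it if needed
conn = pymysql.connect(host='localhost', port=3306, user='root',
                       password='****', charset='utf8')
cursor = conn.cursor()
cursor.execute("create database if not exists pachong")
cursor.execute("use pachong")
cursor.execute("""
    create table if not exists dangdang_book(
        url varchar(100) not null,
        author varchar(20),
        bookname varchar(50),
        chubanshe varchar(50),
        time_print varchar(50),
        price varchar(10)
    ) charset=utf8
""")
conn.commit()
conn.close()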

4. Test loading data from Python into MySQL

import pymysql

# connect to the pachong database created above (replace **** with the real password)
conn = pymysql.connect(host='localhost', port=3306, user='root', password='****', db='pachong', charset='utf8')
cursor = conn.cursor()
cursor.execute("insert into `dangdang_book` values('aa','bb','c','d','e','f');")
conn.commit()
conn.close()
mysql> select * from dangdang_book;
+-----+--------+----------+-----------+------------+-------+
| url | author | bookname | chubanshe | time_print | price |
+-----+--------+----------+-----------+------------+-------+
| aa  | bb     | c        | d         | e          | f     |
+-----+--------+----------+-----------+------------+-------+
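For real scraped data it is safer to let pymysql escape the values with %s placeholders instead of building the SQL string by hand; a minimal sketch (the row values here are just placeholders):

import pymysql

conn = pymysql.connect(host='localhost', port=3306, user='root',
                       password='****', db='pachong', charset='utf8')
cursor = conn.cursor()
# %s placeholders let pymysql quote and escape the values,
# which avoids quoting problems in scraped text
sql = "insert into dangdang_book values(%s, %s, %s, %s, %s, %s)"
row = ('http://example.com/book', 'author', 'bookname', 'chubanshe', '2020-01', '39.9')
cursor.execute(sql, row)
conn.commit()
cursor.close()
conn.close()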

5. Next, the URLs can be written to Excel in bulk, and Python can read each URL from Excel and insert the corresponding record into MySQL, as sketched below.
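A rough sketch of that workflow, assuming the URLs sit in the first column of a file named urls.xlsx (read with openpyxl) and that a hypothetical parse_book(url) function returns the six fields for each page:

import pymysql
from openpyxl import load_workbook

def parse_book(url):
    # hypothetical scraper: fetch the page and return the six fields;
    # the real implementation would use the crawling code from part 1
    return (url, 'author', 'bookname', 'chubanshe', 'time_print', 'price')

wb = load_workbook('urls.xlsx')   # assumed filename
ws = wb.active
conn = pymysql.connect(host='localhost', port=3306, user='root',
                       password='****', db='pachong', charset='utf8')
cursor = conn.cursor()
sql = "insert into dangdang_book values(%s, %s, %s, %s, %s, %s)"
# read the first column row by row and insert one record per URL
for row in ws.iter_rows(min_row=1, max_col=1, values_only=True):
    url = row[0]
    if url:
        cursor.execute(sql, parse_book(url))
conn.commit()
conn.close()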