scrapy crawl itcast -o teachers.json: a crawler example
阿新 • Published: 2018-01-11
- spider.py configuration

    # -*- coding: utf-8 -*-
    import scrapy
    from itTeachers.items import ItteachersItem


    class ItcastSpider(scrapy.Spider):
        name = 'itcast'
        allowed_domains = ['itcast.cn']
        start_urls = ['http://www.itcast.cn/channel/teacher.shtml#']

        def
- items.py configuration

    # -*- coding: utf-8 -*-

    # Define here the models for your scraped items
    #
    # See documentation in:
    # https://doc.scrapy.org/en/latest/topics/items.html

    import scrapy


    class ItteachersItem(scrapy.Item):
        # define the fields for your item here like:
        # name = scrapy.Field()
        name = scrapy.Field()
        title = scrapy.Field()
        info = scrapy.Field()
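
Running `scrapy crawl itcast -o teachers.json` writes the scraped items to `teachers.json` as a single JSON array of objects. As a rough sketch of how that file could be read back, assuming the three fields declared in items.py, the helpers below (`parse_feed` and `load_teachers`, both hypothetical names) use only the standard library.

```python
import json
from pathlib import Path

# Fields declared on ItteachersItem in items.py.
EXPECTED_FIELDS = {"name", "title", "info"}


def parse_feed(text):
    """Parse the text of a JSON feed and keep only complete records."""
    records = json.loads(text)
    if not isinstance(records, list):
        raise ValueError("expected a JSON array of items")
    # issubset() iterates over each record's keys.
    return [r for r in records if EXPECTED_FIELDS.issubset(r)]


def load_teachers(path="teachers.json"):
    """Read a feed file produced by `scrapy crawl itcast -o teachers.json`."""
    return parse_feed(Path(path).read_text(encoding="utf-8"))
```

`json.loads` decodes `\uXXXX` escapes, so this reads the feed the same way whether or not `FEED_EXPORT_ENCODING` is set to `utf-8` in the project settings.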