A Simple Java Crawler for Scraping Weather Forecasts

阿新 • Published: 2019-01-16

A crawler scrapes a web page in three main steps:

1. Send a request to the target page;

2. Parse the HTML document that comes back;

3. Store the parsed data.
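
The three steps above can be sketched as one small pipeline. This is only an illustration under stated assumptions: the class name `CrawlPipeline`, the stub fetcher, and the sample HTML are hypothetical and not part of the original program, and a regex stands in for a real HTML parser so the sketch runs offline with the standard library alone.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical class name; illustrates the request -> parse -> store flow only.
final class CrawlPipeline {
    // Step 2: collect the text inside every <li>...</li> element.
    // A regex is enough for this flat sample; real pages are better served by an HTML parser.
    static List<String> parse(String html) {
        List<String> items = new ArrayList<>();
        Matcher m = Pattern.compile("<li\\b[^>]*>(.*?)</li>", Pattern.DOTALL).matcher(html);
        while (m.find()) {
            items.add(m.group(1).trim());
        }
        return items;
    }

    // Steps 1-3 chained: fetch the page, parse it, hand each item to the sink.
    static int crawl(String url, Function<String, String> fetcher, Consumer<String> sink) {
        String html = fetcher.apply(url);   // step 1: request
        List<String> items = parse(html);   // step 2: parse
        for (String item : items) {         // step 3: store
            sink.accept(item);
        }
        return items.size();
    }

    public static void main(String[] args) {
        // Stub fetcher (an assumption) so the sketch needs no network access.
        Function<String, String> stub = u -> "<ul><li>sunny 6/-5</li><li>cloudy 5/-3</li></ul>";
        List<String> stored = new ArrayList<>();
        int n = crawl("http://example.invalid/forecast", stub, stored::add);
        System.out.println(n + " items stored: " + stored);
    }
}
```

Passing the fetcher and the sink as parameters keeps each step replaceable: a real HTTP call or a file writer can be plugged in without touching the parsing logic.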

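This example scrapes the 7-day weather forecast for cities across China. The target site is China Weather (中國天氣網, weather.com.cn), and the scraped data is written to a text file.

HTML parsing uses Jsoup together with regular expressions: in the code below, Jsoup selects the page elements and a regular expression extracts the area codes from the input file.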

Area code reference: https://wenku.baidu.com/view/49166e7265ce050877321331.html
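
As a rough illustration of how area codes turn into forecast URLs, the sketch below pulls codes out of a line of text and builds the page address used later in the program. The class `AreaCodes` and its method names are hypothetical. It also assumes a code is exactly nine digits (for example 101010100), which is stricter than the `\d{2,}` pattern the original program uses; the sample input line is made up for the demonstration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; names are illustrative, not from the original program.
final class AreaCodes {
    // Assumption: an area code is a run of exactly nine digits.
    // The lookarounds reject longer digit runs instead of cutting a piece out of them.
    private static final Pattern CODE = Pattern.compile("(?<!\\d)\\d{9}(?!\\d)");

    // Returns every nine-digit code found in the text, in order of appearance.
    static List<String> extract(String text) {
        List<String> codes = new ArrayList<>();
        Matcher m = CODE.matcher(text);
        while (m.find()) {
            codes.add(m.group());
        }
        return codes;
    }

    // Same URL shape as the original program: base + code + ".shtml".
    static String forecastUrl(String code) {
        return "http://www.weather.com.cn/weather/" + code + ".shtml";
    }

    public static void main(String[] args) {
        // Made-up sample line; the trailing "12" is ignored because it is not nine digits.
        String line = "Beijing 101010100, Shanghai 101020100, page 12";
        for (String code : extract(line)) {
            System.out.println(forecastUrl(code));
        }
    }
}
```

A stricter pattern like this avoids picking up page numbers or other short digit runs from the reference document, at the cost of missing any code that does not fit the nine-digit assumption.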

Implementation:

import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class Spiderweather {
	public static void main(String[] args) {
		// Read the area codes (runs of two or more digits) from the code file.
		// The list is created up front so a read failure leaves it empty, not null.
		List<String> list = new ArrayList<String>();
		Pattern p = Pattern.compile("\\d{2,}");
		try (BufferedReader bufr = new BufferedReader(new FileReader("D:\\bianma.txt"))) {
			String line;
			while ((line = bufr.readLine()) != null) {
				Matcher m = p.matcher(line);
				while (m.find())
					list.add(m.group());
			}
		} catch (IOException e1) {
			e1.printStackTrace();
		}
		// FileWriter creates forecast.txt if it does not exist;
		// try-with-resources closes the writer even when an exception is thrown.
		try (BufferedWriter bufw = new BufferedWriter(new FileWriter("D:\\forecast.txt"))) {
			for (String bm : list) {
				String url = "http://www.weather.com.cn/weather/" + bm + ".shtml";
				try {
					// Step 1: request the forecast page for this area code.
					Document doc = Jsoup.connect(url).get();
					// Step 2: select elements that carry all three classes.
					Elements content = doc.select(".con.today.clearfix");
					for (Element e : content) {
						// Search inside the current element only; no need to re-parse it.
						Elements cru = e.select(".crumbs.fl");
						Elements sky = e.select("li[class^=sky skyid lv]");
						// Step 3: store the location, then one line per forecast day.
						bufw.write(cru.text()); // location
						bufw.newLine();
						for (Element sk : sky) {
							bufw.write(sk.text());
							bufw.newLine();
						}
						bufw.newLine();
					}
					bufw.newLine();
					bufw.flush();
				} catch (Exception e) {
					// A failure on one city should not stop the remaining cities.
					e.printStackTrace();
				}
			}
		} catch (IOException e1) {
			e1.printStackTrace();
		}
		System.out.println("Weather query finished!");
	}
}

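Crawl results:

Beijing > Urban Area
7th (today) cloudy turning sunny 6℃/ -5℃ force 3-4 turning <force 3
8th (tomorrow) sunny with some clouds turning sunny 6℃/ -4℃ <force 3
9th (day after tomorrow) cloudy 6℃/ -3℃ <force 3
10th (Sunday) cloudy 5℃/ -5℃ force 3-4 turning <force 3
11th (Monday) sunny 3℃/ -8℃ <force 3
12th (Tuesday) sunny 2℃/ -6℃ <force 3
13th (Wednesday) cloudy 4℃/ -5℃ <force 3

Beijing > Chaoyang
7th (today) cloudy turning sunny 6℃/ -5℃ force 4-5 turning <force 3
8th (tomorrow) sunny 6℃/ -4℃ <force 3
9th (day after tomorrow) cloudy 6℃/ -3℃ <force 3
10th (Sunday) cloudy 5℃/ -5℃ force 3-4 turning <force 3
11th (Monday) sunny 3℃/ -8℃ <force 3
12th (Tuesday) sunny 2℃/ -6℃ <force 3
13th (Wednesday) cloudy 4℃/ -5℃ <force 3

Beijing > Shunyi
7th (today) cloudy turning sunny 5℃/ -5℃ force 3-4 turning <force 3
8th (tomorrow) sunny 6℃/ -4℃ <force 3
9th (day after tomorrow) cloudy 6℃/ -3℃ <force 3
10th (Sunday) cloudy 5℃/ -5℃ force 3-4 turning <force 3
11th (Monday) sunny 3℃/ -8℃ <force 3
12th (Tuesday) sunny 2℃/ -6℃ <force 3
13th (Wednesday) cloudy 4℃/ -5℃ <force 3

Beijing > Huairou
7th (today) cloudy turning sunny 5℃/ -7℃ force 3-4 turning <force 3
8th (tomorrow) sunny 6℃/ -6℃ <force 3
9th (day after tomorrow) cloudy 6℃/ -5℃ <force 3
10th (Sunday) cloudy 5℃/ -7℃ force 3-4 turning <force 3
11th (Monday) sunny 3℃/ -10℃ <force 3
12th (Tuesday) sunny 2℃/ -8℃ <force 3
13th (Wednesday) cloudy 4℃/ -7℃ <force 3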

……
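
Each result line carries a high/low temperature pair in the form `6℃/ -5℃`. If the stored text needs to be turned back into numbers, a regular expression can do it. The sketch below is one possible approach, not something the original program does; the class `ForecastLine` and its method are hypothetical, and the sample line in `main` is a simplified stand-in for a real result line.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for post-processing the stored forecast lines.
final class ForecastLine {
    // \u2103 is the degree-Celsius sign. The pattern matches "<high>℃/ <low>℃",
    // allowing a minus sign on either value and optional whitespace after the slash.
    private static final Pattern TEMPS =
            Pattern.compile("(-?\\d+)\u2103/\\s*(-?\\d+)\u2103");

    // Returns {high, low} in degrees Celsius, or null when the line has no temperature pair.
    static int[] temperatures(String line) {
        Matcher m = TEMPS.matcher(line);
        if (!m.find()) {
            return null;
        }
        return new int[] { Integer.parseInt(m.group(1)), Integer.parseInt(m.group(2)) };
    }

    public static void main(String[] args) {
        // Simplified sample line (an assumption); real lines also contain the day and weather text.
        int[] t = temperatures("11 sunny 3\u2103/ -8\u2103 <3");
        System.out.println("high=" + t[0] + ", low=" + t[1]);
    }
}
```

Because the pattern requires the ℃ sign right after each number, the day of the month and the wind force digits on the same line are not mistaken for temperatures.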
