Data Analysis and Visualization with pyecharts

阿新 • Published: 2018-12-13

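Scraping data from the web is not enough on its own; the real work is analyzing and presenting it. Most people care about salary, but the salary field is inconsistent: some postings give *k/month, some give *萬/month (1 萬 = 10,000), and others give *萬/year, so the data has to be cleaned first.

Convert every value to one common unit, sort the results into salary ranges, count the postings in each range, and finally plot the counts.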
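
As a rough sketch of the steps above (not the script used in this post), the two helpers below convert a salary string to an average monthly figure and count the values in each range. The names `monthly_average` and `bucket_counts` are hypothetical, the input formats ('1-1.5萬/月', '8-10千/月', '10-20萬/年') are assumed from the description, and dividing yearly figures by 12 is an assumption; the script that follows simply skips yearly postings.

```python
from bisect import bisect_left

UNIT = {"千": 1_000, "萬": 10_000}     # thousand, ten thousand
PERIOD_MONTHS = {"月": 1, "年": 12}    # per month, per year (assumed conversion)
EDGES = [5000, 10000, 15000, 20000, 25000, 30000, 35000, 40000]


def monthly_average(text):
    """Return the average monthly salary for a string such as '1-1.5萬/月'.

    Returns None for formats this sketch does not understand.
    """
    amount, _, period = text.partition("/")
    if period not in PERIOD_MONTHS or not amount or amount[-1] not in UNIT:
        return None
    try:
        bounds = [float(part) for part in amount[:-1].split("-")]
    except ValueError:
        return None
    midpoint = sum(bounds) / len(bounds)
    return midpoint * UNIT[amount[-1]] / PERIOD_MONTHS[period]


def bucket_counts(values, edges=EDGES):
    """Count values per range (previous edge, edge]; values above the last edge are dropped."""
    counts = [0] * len(edges)
    for value in values:
        index = bisect_left(edges, value)  # index of the first edge >= value
        if value > 0 and index < len(edges):
            counts[index] += 1
    return counts


if __name__ == "__main__":
    samples = ["1-1.5萬/月", "8-10千/月", "10-20萬/年", "200元/天"]
    averages = [m for m in map(monthly_average, samples) if m is not None]
    print(averages)
    print(bucket_counts(averages))
```

Using `bisect_left` against the list of upper bounds reproduces the `low < value <= high` ranges of a hand-written if/elif chain, and the ranges can be changed by editing one list.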

import pymysql
import numpy as np
from pyecharts import Bar  # pyecharts 0.5.x import style; from 1.0 on, charts live in pyecharts.charts
from pyecharts import Pie


class Mysqlhelper(object):
    config = {
        "host": "localhost",
        "user": "root",
        "password": "123456",
        "db": "test",
        "charset": "utf8"
    }

    def __init__(self):
        self.connection = None
        self.cursor = None

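    # Query the database and return all matching rows
    def getlist(self, sql, *args):
        try:
            self.connection = pymysql.connect(**Mysqlhelper.config)  # ** unpacks the config dict into keyword arguments
            self.cursor = self.connection.cursor()
            self.cursor.execute(sql, args)
            return self.cursor.fetchall()
        except Exception as ex:
            print(ex)
            return []  # let callers iterate safely when the query fails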
        finally:
            self.close()

    def close(self):
        if self.cursor:
            self.cursor.close()
        if self.connection:
            self.connection.close()


if __name__ == "__main__":
    count=0
    list = []
    list1 = []
    list2 = [5000,10000,15000,20000,25000,30000,35000,40000]
    salary0 = []
    salary1 = []
    salary2 = []
    salary3 = []
    salary4 = []
    salary5 = []
    salary6 = []
    salary7 = []
    city=[]
    helper = Mysqlhelper()
    rows = helper.getlist("select * from t_job")

    # Keep only monthly salaries: column 4 ends with '月' (month).
    # Yearly ('年'), daily ('天') and any other formats are skipped.
    for n in rows:
        if n[4].endswith('月'):
            list.append(n[4])
    for sale in list:
        # e.g. '1-1.5萬/月' -> money[0] = '1-1.5萬' -> money1 = ['1', '1.5萬']
        money = sale.split('/')
        money1 = money[0].split('-')
        if money[0][-1] == '萬':      # 萬 = 10,000
            a = float(money1[0]) * 10000
            b = float(money1[1][:-1]) * 10000
            average = (a + b) / 2     # midpoint of the range
            count += 1
            list1.append(average)
        elif money[0][-1] == '千':    # 千 = 1,000
            a = float(money1[0]) * 1000
            b = float(money1[1][:-1]) * 1000
            average = (a + b) / 2
            count += 1
            list1.append(average)
    for i in list1:
        print(i)
        if 0 < i <= 5000:
            salary0.append(i)
        elif 5000 < i <= 10000:
            salary1.append(i)
        elif 10000 < i <= 15000:
            salary2.append(i)
        elif 15000 < i <= 20000:
            salary3.append(i)
        elif 20000 < i <= 25000:
            salary4.append(i)
        elif 25000 < i <= 30000:
            salary5.append(i)
        elif 30000 < i <= 35000:
            salary6.append(i)
        elif 35000 < i <= 40000:
            salary7.append(i)
    print(min(list1))
    print(max(list1))
    a = len(salary0)
    b = len(salary1)
    c = len(salary2)
    d = len(salary3)
    e = len(salary4)
    f = len(salary5)
    g = len(salary6)
    h = len(salary7)
    list3=[a,b,c,d,e,f,g,h]
    print(list2)   # x-axis: upper bound of each salary range
    print(a,b,c,d,e,f,g,h)
    print(list3)   # y-axis: number of postings in each range


    bar = Bar('Python平均工資')
    bar.add("月薪", list2,list3)
    # bar.show_config()
    bar.render('Python工資柱狀圖.html')

    pie = Pie()
    pie.add("", list2, list3, is_label_show=True)
    #pie.show_config()
    pie.render('Python工資餅狀圖.html')
    '''

    #print(rows)
    citycount=[]
    cityname=['北京','異地招聘','海淀區','朝陽區','豐臺區','昌平區','東城區','延慶區',
              '房山區','通州區','順義區','大興區','懷柔區','西城區','平谷區','門頭溝區']
    beijing=[]
    yidi=[]

    haidian=[]
    chaoyang=[]
    fengtai=[]
    changping=[]
    dongcheng=[]
    yanqing=[]
    fangshan=[]
    tongzhou=[]
    shunyi=[]
    daxing=[]
    huairou=[]
    xicheng=[]
    pinggu=[]
    mentougou=[]


    for n in rows:
        #print(n[3])
        area=n[3].split('-')
        print(area)
        if len(area)==1:
            print(area[0])
            city.append(area[0])
        else:
            print(area[1])
            city.append(area[1])
    print(city)
    print(len(city))
    for i in city:
        if i=='北京':
            beijing.append(i)
        elif i=='異地招聘':
            yidi.append(i)
        elif i=='海淀區':
            haidian.append(i)
        elif i == '朝陽區':
            chaoyang.append(i)
        elif i=='豐臺區':
            fengtai.append(i)
        elif i=='昌平區':
            changping.append(i)
        elif i=='東城區':
            dongcheng.append(i)
        elif i=='延慶區':
            yanqing.append(i)
        elif i=='房山區':
            fangshan.append(i)
        elif i=='通州區':
            tongzhou.append(i)
        elif i=='順義區':
            shunyi.append(i)
        elif i=='大興區':
            daxing.append(i)
        elif i=='懷柔區':
            huairou.append(i)
        elif i=='西城區':
            xicheng.append(i)
        elif i=='平谷區':
            pinggu.append(i)
        elif i=='門頭溝區':
            mentougou.append(i)

    #print(beijing)
    #print(len(beijing))

    a = len(beijing)
    b = len(yidi)
    c = len(haidian)
    d = len(chaoyang)
    e = len(fengtai)
    f = len(changping)
    g = len(dongcheng)
    h = len(yanqing)
    j = len(fangshan)
    k = len(tongzhou)
    l = len(shunyi)
    m = len(daxing)
    n = len(huairou)
    o = len(xicheng)
    p = len(pinggu)
    q = len(mentougou)
    citycount=[a,b,c,d,e,f,g,h,j,k,l,m,n,o,p,q]
    print(cityname)
    print(citycount)

    pie = Pie()
    pie.add("", cityname, citycount, is_label_show=True)
    # pie.show_config()
    pie.render('北京各區Python職位佔比餅狀圖.html')

    bar = Bar('北京各區職位數量')
    bar.add("數量", cityname, citycount)
    # bar.show_config()
    bar.render('北京各區Python職位佔比柱狀圖.html')
    
    '''

The first part of the script is the database helper. It could be moved into its own .py file and imported whenever it is needed.
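
One way to package the helper for reuse, as a minimal sketch: a single function that opens a connection, runs the query, and always closes both the cursor and the connection. The function name `fetch_all` and its `connect` parameter are hypothetical. So that the example runs without a MySQL server, it is demonstrated with the standard-library `sqlite3` module instead of pymysql; note that sqlite3 uses `?` placeholders where pymysql uses `%s`.

```python
import sqlite3
from contextlib import closing


def fetch_all(connect, sql, *args):
    """Run a query and return every row.

    `connect` is any zero-argument callable that returns a DB-API connection,
    so the same function works with different drivers.
    """
    with closing(connect()) as connection:
        with closing(connection.cursor()) as cursor:
            cursor.execute(sql, args)
            return cursor.fetchall()


if __name__ == "__main__":
    # Stand-in database: an in-memory SQLite connection.
    rows = fetch_all(lambda: sqlite3.connect(":memory:"), "select 1 + ?", 2)
    print(rows)
```

Saved in a file such as mysqlhelper.py (a hypothetical name), the function could then be imported and called with pymysql as `fetch_all(lambda: pymysql.connect(**config), "select * from t_job")`.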

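Results:

I also analyzed some data from the Boss直聘 job site, such as experience and education requirements; you can analyze whichever fields interest you in the same way.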


import pymysql
import numpy as np
from pyecharts import Bar
from pyecharts import Pie
import jieba
from collections import Counter
from os import  path

class Mysqlhelper(object):
    config={
        "host":"localhost",
        "user":"root",
        "password":"123456",
        "db":"test",
        "charset":"utf8"
    }

    def __init__(self):
        self.connection=None
        self.cursor=None

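    # Query the database and return all matching rows
    def getlist(self, sql, *args):
        try:
            self.connection = pymysql.connect(**Mysqlhelper.config)  # ** unpacks the config dict into keyword arguments
            self.cursor = self.connection.cursor()
            self.cursor.execute(sql, args)
            return self.cursor.fetchall()
        except Exception as ex:
            print(ex)
            return []  # let callers iterate safely when the query fails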
        finally:
            self.close()

    def close(self):
        if self.cursor:
            self.cursor.close()
        if self.connection:
            self.connection.close()

if __name__=="__main__":
    sale=[]
    exp=[]
    edu=[]
    one = []
    three = []
    five = []
    onein = []
    noexp = []
    qita=[]
    benke=[]
    dazhuan=[]
    noedu=[]
    boshi=[]
    other=[]
    helper = Mysqlhelper()
    rows = helper.getlist("select * from boss_job")
    #print(rows)

    for data in rows:
        #print(data[2])
        #print(data[5])
        #print(data[6])
        sale.append(data[2])
        exp.append(data[5])
        edu.append(data[6])
        if data[5]=='1-3年':
            one.append(data[5])
        elif data[5]=='3-5年':
            three.append(data[5])
        elif data[5]=='5-10年':
            five.append(data[5])
        elif data[5]=='經驗不限':
            noexp.append(data[5])
        elif data[5]=='1年以內':
            onein.append(data[5])
        else:
            qita.append(data[5])  # everything else is plotted as '應屆生' (fresh graduate)
        if data[6]=='本科':
            benke.append(data[6])
        elif data[6]=='大專':
            dazhuan.append(data[6])
        elif data[6]=='博士':
            boshi.append(data[6])
        elif data[6]=='學歷不限':
            noedu.append(data[6])
        else:
            other.append(data[6])



    #     with open('./data/jingyan.txt', 'a', encoding='utf-8') as fp:
    #         fp.write(data[5])
    #         fp.write(',')
    #         fp.flush()
    #         fp.close()
    print(exp)
    print(edu)
    print(len(exp))
    print(len(edu))

    '''
    d = path.dirname(__file__)
    jingyan_text = open(path.join(d, "data//jingyan.txt"), encoding='utf-8').read()
    print(len(jingyan_text))

    jieba.load_userdict("data//jingyan_dict.txt")

    seg_list = jieba.cut_for_search(jingyan_text)
    print(u"[全模式]: ", "/ ".join(seg_list))
    '''
    # sanguo_words = [x for x in jieba.cut(jingyan_text)if x!=','and len(x) >=2]
    # c = Counter(sanguo_words).most_common(20)
    # print(c)
    # print(''.join(jieba.cut(jingyan_text)))

    print(one)
    print(three)
    print(five)
    print(noexp)
    print(onein)
    print(qita)
    a=len(one)
    b=len(three)
    c=len(five)
    d=len(noexp)
    e=len(onein)
    f=len(qita)
    expcount=[f,e,a,b,c,d]
    expfenlei=['應屆生','1年以內','1-3年','3-5年','5-10年','經驗不限']
    print(expcount)
    print(a+b+c+d+e+f)

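    print(other)
    g=len(benke)
    h=len(dazhuan)
    j=len(boshi)
    k=len(noedu)
    m=len(other)   # everything else; plotted as '碩士' (master's), assuming no other degree appears
    educount=[h,g,m,j,k]   # same order as edufenlei below
    edufenlei=['大專','本科','碩士','博士','學歷不限']
    print(educount)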

    '''
    bar = Bar('工作年限')
    bar.add("要求", expfenlei, expcount)
    # bar.show_config()
    bar.render('工作年限柱狀圖.html')

    pie = Pie()
    pie.add("工作", expfenlei, expcount, is_label_show=True)
    # pie.show_config()
    pie.render('工作年限餅狀圖.html')
    '''

    bar = Bar('學歷要求')
    bar.add("學歷", edufenlei, educount)
    # bar.show_config()
    bar.render('學歷要求柱狀圖.html')

    pie = Pie()
    pie.add("學歷", edufenlei, educount, is_label_show=True)
    # pie.show_config()
    pie.render('學歷要求餅狀圖.html')

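I used the most basic list-based approach here. I am not sure whether there is a simpler way, for example with the jieba word-segmentation module.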
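
One simpler option, sketched here as a suggestion rather than as the method used in this post: because each row already holds an exact category label, the standard-library `collections.Counter` can replace the per-category lists and the if/elif chain, and no word segmentation is needed. The function name `count_by_label`, the label '其他' (other), and the sample values are made up for illustration.

```python
from collections import Counter


def count_by_label(values, labels, other_label):
    """Count values per label; anything not in `labels` is pooled under `other_label`.

    Returns the label names and their counts in matching order,
    ready to pass to a chart as x and y data.
    """
    counts = Counter(value if value in labels else other_label for value in values)
    names = list(labels) + [other_label]
    return names, [counts[name] for name in names]


if __name__ == "__main__":
    # Made-up sample of the education column.
    edu = ['本科', '大專', '本科', '碩士', '學歷不限', '本科']
    names, counts = count_by_label(edu, ['大專', '本科', '博士', '學歷不限'], '其他')
    print(names)
    print(counts)
```

Because the names and counts are built from the same list, the labels cannot drift out of order relative to the numbers.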

The charts show that demand for bachelor's degree holders is still strong.
