How Much Python Do You Need to Teach Yourself to Land a Job? 1300+ Job Postings Give the Answer

阿新 • Published: 2020-07-14

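With the growth of the mobile internet and the impact of hot fields such as machine learning, more and more people are coming into contact with Python and starting to learn it. Whether you have a formal computer science background or are switching careers, Python is undoubtedly a very suitable first language for getting into computing. Its syntax is concise and the programs you write are easy to read, which reflects Python's long-standing philosophy of being "simple and elegant": keep the code readable, and express your idea in as little code as possible.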

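So how far do we need to get with Python before we can start looking for a job? As the saying goes, practice is the sole criterion for testing truth, so the answer has to come from what the market demands. After all, a company hires you to do work, not to study on a salary.

So today we will try scraping the Python job postings on Lagou (拉勾網) to see what kind of people the market actually needs.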

Analyzing the Page Structure

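Open the Lagou home page, type the keyword "Python", press F12 to open the browser's developer tools, switch to the "Network" tab and select the "XHR" filter. With everything ready, click search and watch the page's network requests closely.

From these requests we can roughly guess that the data comes from the jobs/positionAjax.json endpoint.

Don't rush; let's verify it. Clear the network log and try paging. After clicking the second page, the request log looks like this.

We can see that the data is fetched with a POST request, and that pn in the Form Data is the current page number. With the page analyzed, we can now write a scraper to pull the data. Your scraper code might look like this.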

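import requests

url = 'https://www.lagou.com/jobs/positionAjax.json?px=new&needAddtionalResult=false'
headers = """
accept: application/json, text/javascript, */*; q=0.01
origin: https://www.lagou.com
referer: https://www.lagou.com/jobs/list_python?px=new&city=%E5%85%A8%E5%9B%BD
user-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.116 Safari/537.36
"""

def headers_to_dict(raw):
    # Helper not shown in the original post; assumed to turn "name: value" lines into a dict
    pairs = (line.split(':', 1) for line in raw.strip().splitlines() if ':' in line)
    return {name.strip(): value.strip() for name, value in pairs}

def write_file(text, path='data.txt'):
    # Helper not shown in the original post; assumed to append one JSON response per line
    with open(path, 'a', encoding='utf-8') as f:
        f.write(text + '\n')

headers_dict = headers_to_dict(headers)

def get_data_from_cloud(page):
    params = {
        'first': 'false',
        'pn': page,   # current page number
        'kd': 'python'  # search keyword
    }
    response = requests.post(url, data=params, headers=headers_dict, timeout=3)
    result = response.text
    write_file(result)

for i in range(76):
    get_data_from_cloud(i + 1)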

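With the program written, heart racing and hands trembling, you press the run button full of anticipation, happily waiting for the data to come in. However, the result you get will very likely look like this.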

{"success":true,"msg":null,"code":0,"content":{"showId":"8302f64","hrInfoMap":{"6851017":{"userId":621208...
{"status":false,"msg":"您操作太頻繁,請稍後再訪問","clientIp":"xxx.yyy.zzz.aaa","state":2402}
...

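Don't doubt it; that is exactly what I got. Lagou has anti-scraping measures in place, and the msg field of the second line says "you are operating too frequently, please try again later". The fix is to avoid scraping too aggressively: pause after each fetch, for example by sleeping a few seconds between requests (the code below uses 5), and send cookie information along with the data request. The improved scraper looks like this: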

import time

import requests

home_url = 'https://www.lagou.com/jobs/list_python?px=new&city=%E5%85%A8%E5%9B%BD'
url = 'https://www.lagou.com/jobs/positionAjax.json?px=new&needAddtionalResult=false'
headers = """
accept: application/json, text/javascript, */*; q=0.01
origin: https://www.lagou.com
referer: https://www.lagou.com/jobs/list_python?px=new&city=%E5%85%A8%E5%9B%BD
user-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.116 Safari/537.36
"""

# headers_to_dict and write_file are the same helpers used in the first listing
headers_dict = headers_to_dict(headers)

def get_data_from_cloud(page):
    params = {
        'first': 'false',
        'pn': page,
        'kd': 'python'
    }
    s = requests.Session()  # create a session object
    s.get(home_url, headers=headers_dict, timeout=3)  # GET the listing page first to obtain cookies
    # post through the same session so the cookies from the GET are sent along
    response = s.post(url, data=params, headers=headers_dict, timeout=3)
    result = response.text
    write_file(result)

def get_data():
    for i in range(76):
        page = i + 1
        get_data_from_cloud(page)
        time.sleep(5)  # pause between requests to avoid being throttled
Barring surprises, this should fetch all the data, 1131 records in total.
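Even with pauses and cookies, an individual request can still come back with the throttling message shown earlier. The following is a minimal sketch of one way to handle that, assuming a throttled reply is any body without a `content` key (as in the sample output above). The names `is_throttled` and `fetch_with_retry`, and the `fetch` callable, are hypothetical and not part of the original script; `fetch(page)` is assumed to return the response text.

```python
import json
import time


def is_throttled(text):
    """Return True if the response body lacks the 'content' payload."""
    try:
        body = json.loads(text)
    except ValueError:
        return True  # not JSON at all: treat it as a failed request
    return not (isinstance(body, dict) and 'content' in body)


def fetch_with_retry(fetch, page, retries=3, delay=5.0, sleep=time.sleep):
    """Call fetch(page) until it returns a usable body or retries run out."""
    for attempt in range(retries + 1):
        text = fetch(page)
        if not is_throttled(text):
            return text
        if attempt < retries:
            sleep(delay * (attempt + 1))  # wait longer after each failure
    return None  # the caller decides whether to skip the page or abort
```

Passing the fetch function and the sleep function in as arguments keeps the retry logic testable without touching the network; in the scraper above, `fetch` could wrap the session request and return `response.text`.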

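Data Cleaning

Above, we saved the fetched JSON data into the file data.txt, which is inconvenient for later analysis. We plan to analyze the data with pandas, so it needs some reformatting first.

The process is not difficult, just a bit tedious. It goes as follows: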

import json

import pandas as pd

def get_data_from_file():
    data = []
    with open('data.txt', encoding='utf-8') as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            result = json.loads(line)
            if 'content' not in result:
                continue  # skip throttled responses that carry no job data
            result_list = result['content']['positionResult']['result']
            for item in result_list:
                # named "row" so the built-in dict is not shadowed
                row = {
                    'city': item['city'],
                    'industryField': item['industryField'],
                    'education': item['education'],
                    'workYear': item['workYear'],
                    'salary': item['salary'],
                    'firstType': item['firstType'],
                    'secondType': item['secondType'],
                    'thirdType': item['thirdType'],
                    # list fields are flattened to comma-separated strings
                    'skillLables': ','.join(item['skillLables'] or []),
                    'companyLabelList': ','.join(item['companyLabelList'] or [])
                }
                data.append(row)
    return data

data = get_data_from_file()
data = pd.DataFrame(data)
data.head(15)

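Data Analysis

Fetching and cleaning data are only means, not ends. The ultimate goal is to mine the recruiters' requirements from the job data and use them as a target for continually improving our own skill set.

City

First, let's see which cities have the most openings. Here we only take the top 15 cities.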

from pyecharts import options as opts
from pyecharts.charts import Bar, Pie

top = 15
citys_value_counts = data['city'].value_counts()
citys = list(citys_value_counts.head(top).index)
city_counts = list(citys_value_counts.head(top))

bar = (
    Bar()
    .add_xaxis(citys)
    .add_yaxis("", city_counts)
)
bar.render_notebook()

pie = (
    Pie()
    .add("", [list(z) for z in zip(citys, city_counts)])
    # set the title and legend options in one call
    .set_global_opts(
        title_opts=opts.TitleOpts(title=""),
        legend_opts=opts.LegendOpts(is_show=False),
    )
)
pie.render_notebook()

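The chart above shows that Beijing accounts for more than a quarter of the postings, followed by Shanghai, Shenzhen and Hangzhou. In terms of demand alone, Hangzhou has replaced Guangzhou among the four first-tier cities.

This also indirectly explains why we head to first-tier cities for our careers.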
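A claim like "more than a quarter" is easier to check as a number than from a pie chart. The sketch below computes each value's share of the total in plain Python; `top_shares` is a hypothetical helper and the sample list is made up for illustration, not taken from the scraped data. With the DataFrame above, `data['city'].value_counts(normalize=True)` should give the same kind of result.

```python
from collections import Counter


def top_shares(values, top=15):
    """Return [(value, count, share)] for the most common values."""
    counts = Counter(values)
    total = sum(counts.values())
    return [(v, n, n / total) for v, n in counts.most_common(top)]


# Made-up sample, not the scraped data
sample = ['Beijing'] * 3 + ['Shanghai'] * 2 + ['Shenzhen'] * 2 + ['Hangzhou']
for city, n, share in top_shares(sample, top=3):
    print(f'{city}: {n} ({share:.0%})')
```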

Education

education_value_counts = data['education'].value_counts()

education = list(education_value_counts.index)
education_counts = list(education_value_counts)

pie = (
    Pie()
    .add("", [list(z) for z in zip(education, education_counts)])
    # set the title and legend options in one call
    .set_global_opts(
        title_opts=opts.TitleOpts(title=""),
        legend_opts=opts.LegendOpts(is_show=False),
    )
)
pie.render_notebook()

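It appears that most companies require at least a bachelor's degree. It has to be said that a bachelor's degree has basically become the minimum requirement for finding a job today (exceptionally capable people aside).

Years of Experience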

work_year_value_counts = data['workYear'].value_counts()
work_year = list(work_year_value_counts.index)
work_year_counts = list(work_year_value_counts)

bar = (
    Bar()
    .add_xaxis(work_year)
    .add_yaxis("", work_year_counts)
)
bar.render_notebook()

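Mid-level engineers with 3-5 years of experience are most in demand, followed by junior engineers with 1-3 years.

This matches how the market works: senior engineers change jobs far less often than junior and mid-level engineers, and a company needs far fewer senior engineers than junior and mid-level ones.

Industry

Next, let's look at which industries the recruiters belong to. The industry data is not very tidy, so each record has to be split on "," individually.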

industrys = list(data['industryField'])
industry_list = [i for item in industrys for i in item.split(',') ]

industry_series = pd.Series(data=industry_list)
industry_value_counts = industry_series.value_counts()

industrys = list(industry_value_counts.head(top).index)
industry_counts = list(industry_value_counts.head(top))

pie = (
    Pie()
    .add("", [list(z) for z in zip(industrys, industry_counts)])
    # set the title and legend options in one call
    .set_global_opts(
        title_opts=opts.TitleOpts(title=""),
        legend_opts=opts.LegendOpts(is_show=False),
    )
)
pie.render_notebook()

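The mobile internet industry accounts for more than a quarter of the demand, which is consistent with our understanding of the overall environment.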
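The industry breakdown above splits each record on an ASCII comma only. If the field mixes several delimiters, a regular-expression split is one possible alternative. This is a sketch under assumptions: the delimiter set (ASCII comma, full-width comma, the enumeration comma "、", "|" and whitespace) is a guess for illustration and has not been checked against the actual Lagou data, and `split_industries` is a hypothetical helper.

```python
import re
from collections import Counter

# Delimiters assumed for illustration; adjust to what the data actually contains
_DELIMS = re.compile(r'[,，、|\s]+')


def split_industries(field):
    """Split one industryField value into clean, non-empty labels."""
    if not field:
        return []
    return [part for part in _DELIMS.split(field) if part]


# Made-up rows: mobile internet + finance, e-commerce + enterprise services, ...
sample = ['移動網際網路,金融', '電商 企業服務', '移動網際網路、資料服務', None]
counts = Counter(label for field in sample for label in split_industries(field))
print(counts.most_common(3))
```

Filtering out empty parts matters because a leading or trailing delimiter would otherwise produce an empty label that gets counted as its own industry.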

Skill Requirements

Let's look at a word cloud of the skills recruiters ask for.

import matplotlib.pyplot as plt
import numpy as np
from wordcloud import WordCloud

word_data = data['skillLables'].str.split(',').apply(pd.Series)
word_data = word_data.replace(np.nan, '')
text = word_data.to_string(header=False, index=False)

# font_path must point to a font with CJK glyphs; this path is macOS-specific
wc = WordCloud(font_path='/System/Library/Fonts/PingFang.ttc', background_color="white", scale=2.5,
               contour_color="lightblue").generate(text)

plt.figure(figsize=(16, 9))
plt.imshow(wc)
plt.axis('off')
plt.show()

Apart from Python, the most frequent terms are backend, MySQL, web scraping, full stack, algorithms and so on.
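A word cloud shows which terms stand out, but not how often each one occurs. As a complement, the sketch below counts the tags directly, assuming they are stored as comma-joined strings as in the skillLables column built during cleaning. `skill_frequencies` is a hypothetical helper, the sample rows are made up, and lower-casing the tags (so that "Python" and "python" count together) is a design choice of this sketch rather than something the original analysis does.

```python
from collections import Counter


def skill_frequencies(label_strings, top=10):
    """Count comma-separated skill tags, ignoring case and blanks."""
    counter = Counter()
    for labels in label_strings:
        for tag in (labels or '').split(','):
            tag = tag.strip().lower()
            if tag:
                counter[tag] += 1
    return counter.most_common(top)


# Made-up rows in the same comma-joined shape as the skillLables column
rows = ['Python,後端,MySQL', 'python,爬蟲', '', 'Python,MySQL']
print(skill_frequencies(rows, top=3))
```

With the DataFrame above, the call would be `skill_frequencies(data['skillLables'])`.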

Salary

Next, let's look at the salaries companies offer.

salary_value_counts = data['salary'].value_counts()
top = 15
salary = list(salary_value_counts.head(top).index)
salary_counts = list(salary_value_counts.head(top))

bar = (
    Bar()
    .add_xaxis(salary)
    .add_yaxis("", salary_counts)
    .set_global_opts(xaxis_opts=opts.AxisOpts(name_rotate=0, name="Salary", axislabel_opts={"rotate": 45}))
)
bar.render_notebook()

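Most companies offer quite attractive salaries, mostly between 15K and 35K. As long as your skills are up to par, it should not be hard to find a job with a salary you are satisfied with.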
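The bar chart counts salary ranges as raw strings, so "15k-30k" and "15k-25k" are treated as unrelated categories. To compare ranges numerically, the strings can be parsed first. This is a minimal sketch assuming every value has the form "<low>k-<high>k" (for example "15k-30k"); anything else returns None. `parse_salary` and `midpoint` are hypothetical helpers, not part of the original analysis.

```python
import re

# Assumed format: "<low>k-<high>k", e.g. "15k-30k"; matching is case-insensitive
_SALARY = re.compile(r'^\s*(\d+)\s*k\s*-\s*(\d+)\s*k\s*$', re.IGNORECASE)


def parse_salary(text):
    """Return (low, high) as integers in K, or None if the text does not match."""
    match = _SALARY.match(text or '')
    if not match:
        return None
    return int(match.group(1)), int(match.group(2))


def midpoint(text):
    """Return the middle of the salary range in K, or None if it cannot be parsed."""
    parsed = parse_salary(text)
    return None if parsed is None else (parsed[0] + parsed[1]) / 2


print(parse_salary('15k-30k'), midpoint('15k-30k'))
```

Applied to the DataFrame, `data['salary'].map(midpoint)` would give a numeric column that can be averaged or binned.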

Benefits

Finally, let's see what extra benefits companies offer.

word_data = data['companyLabelList'].str.split(',').apply(pd.Series)
word_data = word_data.replace(np.nan, '')
text = word_data.to_string(header=False, index=False)

wc = WordCloud(font_path='/System/Library/Fonts/PingFang.ttc', background_color="white", scale=2.5,
               contour_color="lightblue", ).generate(text)

plt.figure(figsize=(16, 9))
plt.imshow(wc)
plt.axis('off')
plt.show()

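Year-end double pay, performance bonuses and flat management are benefits everyone is familiar with. Flat management is characteristic of internet companies, unlike state-owned enterprises or other traditional businesses, where hierarchy matters more.

Summary

Today we scraped 1300+ Python job postings from Lagou, and after analyzing the data we reached the following conclusions:

As for education, you had best hold a bachelor's degree. The market has strong demand for engineers with 1-5 years of experience. The cities with the most demand are Beijing, Shanghai, Shenzhen and Hangzhou, the industry with the most demand is still mobile internet, and most companies can offer good pay.

After this analysis of 1300+ postings, you should have a better picture of the current job market. Knowing both yourself and the market is what improves your odds in your future career.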
