GitHub was acquired by Microsoft? That doesn't stop us from building a GitHub code-leak monitoring tool
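0x01 Roll up your sleeves and get to work
Life is short, I use Python!
Python's powerful libraries, concise syntax, and fast development cycle make it a favorite among developers, so let's build the tool in Python.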
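0x02 Step-by-step walkthrough
1. Logging in to GitHub
There is a catch in the login step: logging in at https://github.com/login redirects to https://github.com/session, where the request body is submitted. The body carries the following parameters:
"commit=Sign+in&utf8=%E2%9C%93&authenticity_token=sClUkea9k0GJ%2BTVRKRYsvLKPGPfLDknMWVSd%2FyWvyGAR9Zz09bipesvXUo8ND2870Q2FEVsQWFKScyqtV0w1PA%3D%3D&login=YourUsername&password=YourPassword"
The commit, utf8, login, and password values are effectively fixed. To log in from the tool, we also need the authenticity_token value, and then submit everything together in a POST request. So how do we get that value?
Open a browser and log in manually with the developer tools open (press F12). After entering the username and password, you can see the request go to https://github.com/session, and the authenticity_token value sits in a hidden input field of the login form.
Although the field is hidden, we can extract it with XPath and then submit it together with the other parameters to log in to GitHub.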
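The token step can be sketched as follows. This is a minimal illustration, not the tool's actual login code: it uses the standard library's html.parser instead of XPath so that it runs without lxml (with lxml, an expression such as //input[@name="authenticity_token"]/@value would play the same role), and the names TokenParser, extract_token, build_login_body, and the sample HTML are all hypothetical. In the real tool the HTML would come from fetching https://github.com/login with a session object, and the resulting body would be posted to https://github.com/session.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode


class TokenParser(HTMLParser):
    """Collects the value of the first <input name="authenticity_token">."""

    def __init__(self):
        super().__init__()
        self.token = None

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if (tag == 'input' and attr.get('name') == 'authenticity_token'
                and self.token is None):
            self.token = attr.get('value')


def extract_token(login_html):
    """Return the hidden authenticity_token from the login page HTML, or None."""
    parser = TokenParser()
    parser.feed(login_html)
    return parser.token


def build_login_body(token, username, password):
    """URL-encode the form fields in the same order as the request body above."""
    return urlencode({
        'commit': 'Sign in',
        'utf8': '\u2713',  # the check mark, encoded as %E2%9C%93
        'authenticity_token': token,
        'login': username,
        'password': password,
    })


# Hypothetical stand-in for the login page markup.
sample_html = (
    '<form action="/session" method="post">'
    '<input type="hidden" name="authenticity_token" value="abc+/123==">'
    '<input type="text" name="login">'
    '</form>'
)

token = extract_token(sample_html)
body = build_login_body(token, 'YourUsername', 'YourPassword')
print(body)
```

Because the token changes between page loads, it has to be fetched and submitted within the same session; reusing a token copied from the browser would not work.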
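2. Searching for keywords and presenting the results
After logging in, request the search URL, fetch the response page, and parse its nodes with XPath to extract the information we want. For XPath syntax, see: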
http://www.runoob.com/xpath/xpath-tutorial.html
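To make the XPath idea concrete, here is a small sketch on a hand-written page. It is an assumption-laden stand-in: the markup below is hypothetical and well-formed, and it uses the standard library's xml.etree.ElementTree, which supports only a subset of XPath (no trailing /@href or /text() steps), so attributes and text are read from the matched elements instead. Real search-result pages are not well-formed XML, which is why a lenient HTML parser such as lxml is the practical choice there.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed stand-in for a search-result page.
SAMPLE = """
<html><body>
  <div class="d-inline-block col-10">
    <a href="/alice/demo">alice/demo</a>
    <a href="/alice/demo/blob/master/config.py">config.py</a>
  </div>
  <div class="d-inline-block col-10">
    <a href="/bob/app">bob/app</a>
    <a href="/bob/app/blob/master/settings.ini">settings.ini</a>
  </div>
</body></html>
"""

root = ET.fromstring(SAMPLE)

# Select the second <a> inside every div carrying the given class attribute.
links = root.findall(".//div[@class='d-inline-block col-10']/a[2]")

paths = [a.get('href') for a in links]  # relative file paths
names = [a.text for a in links]         # file names

for path, name in zip(paths, names):
    print(name, '->', 'https://github.com' + path)
```

The predicate in square brackets filters by attribute value, and a[2] picks the second link in each matching div, which is the same selection pattern the tool applies to the real page.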
We also write the extracted information to a CSV file so it can be reviewed later. The details are as follows:

    import csv
    from time import sleep

    import requests
    from lxml import etree
    from tqdm import tqdm

    def hunter(gUser, gPass, keyword, payloads):
        global sensitive_list
        global tUrls
        sensitive_list = []
        tUrls = []
        try:
            # Create the CSV file; the with block closes it even if an error occurs
            with open('leak.csv', 'w', encoding='utf-8', newline='') as csv_file:
                writer = csv.writer(csv_file)
                # Write the header row
                writer.writerow(['URL', 'Username', 'Upload Time', 'Filename'])
                # Log in (login_github is the login helper from step 1) and search
                s = login_github(gUser, gPass)
                print('Logged in, searching for leaked information......')
                sleep(1)
                for page in tqdm(range(1, 6)):  # Search result pages 1 to 5 for keyword
                    search_code = 'https://github.com/search?p=' + str(page) + '&q=' + keyword + '&type=Code'
                    resp = s.get(search_code)
                    results_code = resp.text
                    dom_tree_code = etree.HTML(results_code)  # Parse the result with lxml's etree
                    Urls = dom_tree_code.xpath('//div[@class="d-inline-block col-10"]/a[2]/@href')  # Repository file paths
                    users = dom_tree_code.xpath('//a[@class="text-bold"]/text()')  # Usernames
                    upload_times = dom_tree_code.xpath('//relative-time/text()')  # Upload times
                    filenames = dom_tree_code.xpath('//div[@class="d-inline-block col-10"]/a[2]/text()')  # Uploaded file names
                    # zip keeps the fields of one search result together in one row
                    for Url, user, upload_time, filename in zip(Urls, users, upload_times, filenames):
                        full_url = 'https://github.com' + Url  # The extracted URL is relative, so prepend the host
                        tUrls.append(full_url)
                        writer.writerow([full_url, user, upload_time, filename])  # Write the row to the CSV file
                    # Fetch the raw code of each hit and search it for the user-defined
                    # payloads (for example password, username, or an IP). URLs whose code
                    # contains a sensitive keyword are stored in sensitive_list for the
                    # alert email sent later.
                    for raw_url in Urls:
                        url = 'https://raw.githubusercontent.com' + raw_url.replace('/blob', '')
                        code = requests.get(url).text
                        for payload in payloads:
                            if payload in code:
                                leak_url = 'Matched payload: ' + payload + '\r\n' + 'https://github.com' + raw_url + '\r\n\r\n\r\n' + 'Code: \r\n' + code + '\r\n\r\n'
                                sensitive_list.append(leak_url)
            return sensitive_list
        except Exception as e:
            print(e)
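The core of the code above is parsing the DOM tree with XPath, extracting the required fields one by one, and writing them to the CSV file. It then requests raw.githubusercontent.com for the source code and matches it against each user-supplied payload. On a match it records the payload, the URL, and the code, which are later sent in the alert email.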
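The matching step on its own can be sketched without any network access. The helper names to_raw_url and find_payloads, and the sample data, are hypothetical and only illustrate the logic described above. One small variation from the tool's code: the sketch replaces the '/blob/' path segment once rather than every '/blob' substring, to avoid touching repository names that happen to start with "blob".

```python
def to_raw_url(blob_path):
    """Turn a relative /user/repo/blob/branch/file path into its raw URL."""
    return 'https://raw.githubusercontent.com' + blob_path.replace('/blob/', '/', 1)


def find_payloads(code, payloads):
    """Return the payloads that occur in the code, in their original order."""
    return [payload for payload in payloads if payload in code]


# Hypothetical sample data standing in for one search hit.
blob_path = '/alice/demo/blob/master/config.py'
code = "db_host = '10.0.0.1'\npassword = 'hunter2'\n"
payloads = ['password', 'secret_key', '10.0.0.1']

hits = find_payloads(code, payloads)
print(to_raw_url(blob_path))
print(hits)
```

Plain substring matching is case-sensitive and will also fire on harmless mentions of a word such as "password", so the results are best treated as candidates for manual review.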
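3. Email alerts
Sending email is not the focus of the tool, but the code is still worth showing:

    import smtplib
    from email import encoders
    from email.header import Header
    from email.mime.base import MIMEBase
    from email.mime.multipart import MIMEMultipart
    from email.mime.text import MIMEText
    from email.utils import formataddr, parseaddr

    def send_warning(host, username, password, sender, receivers, content):
        def _format_addr(s):
            # Encode the display name so that non-ASCII names survive
            name, addr = parseaddr(s)
            return formataddr((Header(name, 'utf-8').encode(), addr))

        msg = MIMEMultipart()
        msg['From'] = _format_addr('Github Security Monitor <%s>' % sender)
        msg['To'] = ','.join(receivers)  # Recipients must be comma-separated
        Subject = 'Github sensitive information leak notice'
        msg['Subject'] = Header(Subject, 'utf-8').encode()
        msg.attach(MIMEText('Dear all \r\n\r\nPlease note: sensitive information may have been uploaded to Github! The following repositories may contain sensitive information!\r\n\r\n' + content + '\r\n\r\n', 'plain', 'utf-8'))
        # Attach the CSV report generated by hunter()
        with open('leak.csv', 'rb') as f:
            m = MIMEBase('text', 'csv', filename='leak.csv')  # text/csv is the MIME type for CSV
            m.add_header('Content-Disposition', 'attachment', filename='leak.csv')
            m.add_header('Content-ID', '<0>')
            m.add_header('X-Attachment-ID', '0')
            m.set_payload(f.read())
            encoders.encode_base64(m)
            msg.attach(m)
        try:
            # The with block closes the connection, even when login or sending fails
            with smtplib.SMTP(host, 25) as server:
                server.login(username, password)
                server.sendmail(sender, receivers, msg.as_string())
            print('Email sent successfully!')
        except Exception as err:
            print(err)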
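4. Reading the configuration file
We create an .ini file so the tool can read the keyword, usernames, passwords, payloads, and other values we want to pass in. The ini configuration file is defined as follows: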
[KEYWORD]
keyword = your main keyword here
[EMAIL]
host = Email server
user = Email User
password = Email password
[SENDER]
sender = The email sender
[RECEIVER]
receiver1 = Email receiver No.1
receiver2 = Email receiver No.2
[Github]
user = Github Username
password = Github Password
[PAYLOADS]
p1 = Payload 1
p2 = Payload 2
p3 = Payload 3
p4 = Payload 4
p5 = Payload 5
p6 = Payload 6
Then we read these values in the main block and pass them to the tool.

    import configparser

    if __name__ == '__main__':
        config = configparser.ConfigParser()
        config.read('info.ini')
        g_User = config['Github']['user']
        g_Pass = config['Github']['password']
        host = config['EMAIL']['host']
        m_User = config['EMAIL']['user']
        m_Pass = config['EMAIL']['password']
        m_sender = config['SENDER']['sender']
        # Every key in [RECEIVER] is one recipient address
        receivers = []
        for k in config['RECEIVER']:
            receivers.append(config['RECEIVER'][k])
        keyword = config['KEYWORD']['keyword']
        # Every key in [PAYLOADS] is one sensitive keyword to look for
        payloads = []
        for key in config['PAYLOADS']:
            payloads.append(config['PAYLOADS'][key])
        sensitive_list = hunter(g_User, g_Pass, keyword, payloads)
        if sensitive_list:
            print('\033[1;31;0mWarning: sensitive information found!\r\n\033[0m')
            print('Sending the alert email......')
            content = ''.join(sensitive_list)
            send_warning(host, m_User, m_Pass, m_sender, receivers, content)
        else:
            print('Good news: no sensitive information found!\r\n')
            print('All checks finished, report generated!\r\n')
            print('Sending the report......\r\n')
            send_mail(host, m_User, m_Pass, m_sender, receivers)
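The code above calls another function, send_mail. It also sends email, just like send_warning, only with different content, so it is not repeated here. That completes the core of the tool. Pretty simple for a veteran, right?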
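Since send_mail is not shown, here is one way it might look. This is a guess at its shape, not the author's code: it keeps the same five positional parameters as the call above, but uses the newer email.message.EmailMessage API instead of the MIME classes used in send_warning. The helper build_report, the csv_path parameter, the subject line, and the message text are all assumptions.

```python
import smtplib
from email.message import EmailMessage


def build_report(sender, receivers, csv_bytes, filename='leak.csv'):
    """Build the report email with the CSV attached (nothing is sent here)."""
    msg = EmailMessage()
    msg['From'] = sender
    msg['To'] = ', '.join(receivers)
    msg['Subject'] = 'Github leak monitoring report'
    msg.set_content(
        'Dear all,\n\n'
        'No sensitive information was found in this run. '
        'The full search results are attached.\n'
    )
    # A generic binary type keeps the attachment byte-for-byte intact.
    msg.add_attachment(csv_bytes, maintype='application',
                       subtype='octet-stream', filename=filename)
    return msg


def send_mail(host, username, password, sender, receivers, csv_path='leak.csv'):
    """Send the report; port 25 without TLS mirrors send_warning."""
    with open(csv_path, 'rb') as f:
        msg = build_report(sender, receivers, f.read())
    with smtplib.SMTP(host, 25) as server:
        server.login(username, password)
        server.send_message(msg)


# Building the message needs no mail server, so it can be checked offline.
demo = build_report('monitor@example.com',
                    ['a@example.com', 'b@example.com'],
                    b'URL,Username,Upload Time,Filename\r\n')
print(demo['Subject'])
```

Separating message construction from delivery keeps the part that talks to the SMTP server small, and lets the message be inspected before anything is sent.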
0x03 Monitoring results
1. Run output
2. Email alert