Decrypting 58同城's Font Encryption (Part 1)
阿新 • Published: 2018-12-11
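When scraping, you often run into anti-scraping measures, and font encryption is one of the harder ones to deal with. This post introduces a fairly simple way to decrypt it.
1. First, find the encrypted font. Open a 58 link: https://zz.58.com/pinpaigongyu/?utm_source=market&spm=u-LlFBrx8a1luDwQM.sgppzq_zbt&PGTID=0d100000-0015-67a3-d744-1bb7d66dd6e2&ClickID=2, as shown in the figure below.
2. The price 700 is rendered with an encrypted font. Inspect that element, find the tag [1] that corresponds to it, and copy it. Then open the page source and search for tag [1], as shown in the figure below:
3. Copy the content marked AAAAAA~AAAAAA (the base64-encoded font data), then write the code: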
from fontTools.ttLib import TTFont import base64 from io import BytesIO from PIL import Image,ImageDraw,ImageFont str = 'AAEAAAALAIAAAwAwR1NVQiCLJXoAAAE4AAAAVE9TLzL4XQjtAAABjAAAAFZjbWFwq8R/YwAAAhAAAAIuZ2x5ZuWIN0cAAARYAAADdGhlYWQT0/0FAAAA4AAAADZoaGVhCtADIwAAALwAAAAkaG10eC7qAAAAAAHkAAAALGxvY2ED7gSyAAAEQAAAABhtYXhwARgANgAAARgAAAAgbmFtZTd6VP8AAAfMAAACanBvc3QFRAYqAAAKOAAAAEUAAQAABmb+ZgAABLEAAAAABGgAAQAAAAAAAAAAAAAAAAAAAAsAAQAAAAEAAOs1n4RfDzz1AAsIAAAAAADYJlj6AAAAANgmWPoAAP/mBGgGLgAAAAgAAgAAAAAAAAABAAAACwAqAAMAAAAAAAIAAAAKAAoAAAD/AAAAAAAAAAEAAAAKADAAPgACREZMVAAObGF0bgAaAAQAAAAAAAAAAQAAAAQAAAAAAAAAAQAAAAFsaWdhAAgAAAABAAAAAQAEAAQAAAABAAgAAQAGAAAAAQAAAAEERAGQAAUAAAUTBZkAAAEeBRMFmQAAA9cAZAIQAAACAAUDAAAAAAAAAAAAAAAAAAAAAAAAAAAAAFBmRWQAQJR2n6UGZv5mALgGZgGaAAAAAQAAAAAAAAAAAAAEsQAABLEAAASxAAAEsQAABLEAAASxAAAEsQAABLEAAASxAAAEsQAAAAAABQAAAAMAAAAsAAAABAAAAaYAAQAAAAAAoAADAAEAAAAsAAMACgAAAaYABAB0AAAAFAAQAAMABJR2lY+ZPJpLnjqeo59kn5Kfpf//AACUdpWPmTyaS546nqOfZJ+Sn6T//wAAAAAAAAAAAAAAAAAAAAAAAAABABQAFAAUABQAFAAUABQAFAAUAAAACQAHAAgABAAKAAEAAwAFAAIABgAAAQYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAADAAAAAAAiAAAAAAAAAAKAACUdgAAlHYAAAAJAACVjwAAlY8AAAAHAACZPAAAmTwAAAAIAACaSwAAmksAAAAEAACeOgAAnjoAAAAKAACeowAAnqMAAAABAACfZAAAn2QAAAADAACfkgAAn5IAAAAFAACfpAAAn6QAAAACAACfpQAAn6UAAAAGAAAAAAAAACgAPgBmAJoAvgDoASQBOAF+AboAAgAA/+YEWQYnAAoAEgAAExAAISAREAAjIgATECEgERAhIFsBEAECAez+6/rs/v3IATkBNP7S/sEC6AGaAaX85v54/mEBigGB/ZcCcwKJAAABAAAAAAQ1Bi4ACQAAKQE1IREFNSURIQQ1/IgBW/6cAicBWqkEmGe0oPp7AAEAAAAABCYGJwAXAAApATUBPgE1NCYjIgc1NjMyFhUUAgcBFSEEGPxSAcK6fpSMz7y389Hym9j+nwLGqgHButl0hI2wx43iv5D+69b+pwQAAQAA/+YEGQYnACEAABMWMzI2NRAhIzUzIBE0ISIHNTYzMhYVEAUVHgEVFAAjIiePn8igu/5bgXsBdf7jo5CYy8bw/sqow/7T+tyHAQN7nYQBJqIBFP9uuVjPpf7
QVwQSyZbR/wBSAAACAAAAAARoBg0ACgASAAABIxEjESE1ATMRMyERNDcjBgcBBGjGvv0uAq3jxv58BAQOLf4zAZL+bgGSfwP8/CACiUVaJlH9TwABAAD/5gQhBg0AGAAANxYzMjYQJiMiBxEhFSERNjMyBBUUACEiJ7GcqaDEx71bmgL6/bxXLPUBEv7a/v3Zbu5mswEppA4DE63+SgX42uH+6kAAAAACAAD/5gRbBicAFgAiAAABJiMiAgMzNjMyEhUUACMiABEQACEyFwEUFjMyNjU0JiMiBgP6eYTJ9AIFbvHJ8P7r1+z+8wFhASClXv1Qo4eAoJeLhKQFRj7+ov7R1f762eP+3AFxAVMBmgHjLfwBmdq8lKCytAAAAAABAAAAAARNBg0ABgAACQEjASE1IQRN/aLLAkD8+gPvBcn6NwVgrQAAAwAA/+YESgYnABUAHwApAAABJDU0JDMyFhUQBRUEERQEIyIkNRAlATQmIyIGFRQXNgEEFRQWMzI2NTQBtv7rAQTKufD+3wFT/un6zf7+AUwBnIJvaJLz+P78/uGoh4OkAy+B9avXyqD+/osEev7aweXitAEohwF7aHh9YcJlZ/7qdNhwkI9r4QAAAAACAAD/5gRGBicAFwAjAAA3FjMyEhEGJwYjIgA1NAAzMgAREAAhIicTFBYzMjY1NCYjIga5gJTQ5QICZvHD/wABGN/nAQT+sP7Xo3FxoI16pqWHfaTSSgFIAS4CAsIBDNbkASX+lf6l/lP+MjUEHJy3p3en274AAAAAABAAxgABAAAAAAABAA8AAAABAAAAAAACAAcADwABAAAAAAADAA8AFgABAAAAAAAEAA8AJQABAAAAAAAFAAsANAABAAAAAAAGAA8APwABAAAAAAAKACsATgABAAAAAAALABMAeQADAAEECQABAB4AjAADAAEECQACAA4AqgADAAEECQADAB4AuAADAAEECQAEAB4A1gADAAEECQAFABYA9AADAAEECQAGAB4BCgADAAEECQAKAFYBKAADAAEECQALACYBfmZhbmdjaGFuLXNlY3JldFJlZ3VsYXJmYW5nY2hhbi1zZWNyZXRmYW5nY2hhbi1zZWNyZXRWZXJzaW9uIDEuMGZhbmdjaGFuLXNlY3JldEdlbmVyYXRlZCBieSBzdmcydHRmIGZyb20gRm9udGVsbG8gcHJvamVjdC5odHRwOi8vZm9udGVsbG8uY29tAGYAYQBuAGcAYwBoAGEAbgAtAHMAZQBjAHIAZQB0AFIAZQBnAHUAbABhAHIAZgBhAG4AZwBjAGgAYQBuAC0AcwBlAGMAcgBlAHQAZgBhAG4AZwBjAGgAYQBuAC0AcwBlAGMAcgBlAHQAVgBlAHIAcwBpAG8AbgAgADEALgAwAGYAYQBuAGcAYwBoAGEAbgAtAHMAZQBjAHIAZQB0AEcAZQBuAGUAcgBhAHQAZQBkACAAYgB5ACAAcwB2AGcAMgB0AHQAZgAgAGYAcgBvAG0AIABGAG8AbgB0AGUAbABsAG8AIABwAHIAbwBqAGUAYwB0AC4AaAB0AHQAcAA6AC8ALwBmAG8AbgB0AGUAbABsAG8ALgBjAG8AbQAAAAIAAAAAAAAAFAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACwECAQMBBAEFAQYBBwEIAQkBCgELAQwAAAAAAAAAAAAAAAAAAAAA' fanti = '餼麣麣' def make_font_file(base64_string:str): #將base64編碼的字型字串解碼成二進位制編碼 bin_data = base64.decodebytes(base64_string.encode()) with open('textwoff.woff','wb') as f: f.write(bin_data) return bin_data #第一種:生成XML檔案,檢視對應編碼 def convert_font_to_xml(bin_data): #ByteIO把一個二進位制記憶體塊當成檔案來操作, font = 
TTFont(BytesIO(bin_data)) #將解碼字型儲存為xml font.saveXML("text.xml") s = make_font_file(base64_string=str) convert_font_to_xml(s)
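A side note on the file name: the script saves the decoded bytes as textwoff.woff, but the first bytes of a font file identify its actual container. The following is a minimal sketch, using only the standard library, that decodes the first 16 characters of the base64 string from step 3 and unpacks the 12-byte header. `container_kind` is a hypothetical helper name introduced here for illustration; it is not part of fontTools.

```python
import base64
import struct

# First 16 characters of the base64 string from step 3 (16 chars decode to 12 bytes).
prefix = "AAEAAAALAIAAAwAw"
header = base64.b64decode(prefix)

# Font header layout: a uint32 signature followed by four uint16 fields, big-endian.
version, num_tables, search_range, entry_selector, range_shift = struct.unpack(
    ">IHHHH", header
)


def container_kind(signature: int) -> str:
    """Name the font container from its 4-byte signature (hypothetical helper)."""
    kinds = {
        0x00010000: "TrueType (sfnt)",
        0x4F54544F: "OpenType/CFF ('OTTO')",
        0x774F4646: "WOFF ('wOFF')",
        0x774F4632: "WOFF2 ('wOF2')",
    }
    return kinds.get(signature, "unknown")


print(container_kind(version), num_tables)
```

For this sample the signature is 0x00010000 with 11 tables, so the data is plain TrueType rather than WOFF. As far as I know, fontTools picks the reader from this signature and not from the file extension, so the .woff name should not affect the XML dump.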
4. This generates an XML file. Find the cmap table in it:
<cmap>
  <tableVersion version="0"/>
  <cmap_format_4 platformID="0" platEncID="3" language="0">
    <map code="0x9476" name="glyph00009"/><!-- CJK UNIFIED IDEOGRAPH-9476 -->
    <map code="0x958f" name="glyph00007"/><!-- CJK UNIFIED IDEOGRAPH-958F -->
    <map code="0x993c" name="glyph00008"/><!-- CJK UNIFIED IDEOGRAPH-993C -->
    <map code="0x9a4b" name="glyph00004"/><!-- CJK UNIFIED IDEOGRAPH-9A4B -->
    <map code="0x9e3a" name="glyph00010"/><!-- CJK UNIFIED IDEOGRAPH-9E3A -->
    <map code="0x9ea3" name="glyph00001"/><!-- CJK UNIFIED IDEOGRAPH-9EA3 -->
    <map code="0x9f64" name="glyph00003"/><!-- CJK UNIFIED IDEOGRAPH-9F64 -->
    <map code="0x9f92" name="glyph00005"/><!-- CJK UNIFIED IDEOGRAPH-9F92 -->
    <map code="0x9fa4" name="glyph00002"/><!-- CJK UNIFIED IDEOGRAPH-9FA4 -->
    <map code="0x9fa5" name="glyph00006"/><!-- CJK UNIFIED IDEOGRAPH-9FA5 -->
  </cmap_format_4>
  <cmap_format_12 platformID="0" platEncID="4" format="12" reserved="0" length="136" language="0" nGroups="10">
    <map code="0x9476" name="glyph00009"/><!-- CJK UNIFIED IDEOGRAPH-9476 -->
    <map code="0x958f" name="glyph00007"/><!-- CJK UNIFIED IDEOGRAPH-958F -->
    <map code="0x993c" name="glyph00008"/><!-- CJK UNIFIED IDEOGRAPH-993C -->
    <map code="0x9a4b" name="glyph00004"/><!-- CJK UNIFIED IDEOGRAPH-9A4B -->
    <map code="0x9e3a" name="glyph00010"/><!-- CJK UNIFIED IDEOGRAPH-9E3A -->
    <map code="0x9ea3" name="glyph00001"/><!-- CJK UNIFIED IDEOGRAPH-9EA3 -->
    <map code="0x9f64" name="glyph00003"/><!-- CJK UNIFIED IDEOGRAPH-9F64 -->
    <map code="0x9f92" name="glyph00005"/><!-- CJK UNIFIED IDEOGRAPH-9F92 -->
    <map code="0x9fa4" name="glyph00002"/><!-- CJK UNIFIED IDEOGRAPH-9FA4 -->
    <map code="0x9fa5" name="glyph00006"/><!-- CJK UNIFIED IDEOGRAPH-9FA5 -->
  </cmap_format_12>
  <cmap_format_0 platformID="1" platEncID="0" language="0">
  </cmap_format_0>
  <cmap_format_4 platformID="3" platEncID="1" language="0">
    <map code="0x9476" name="glyph00009"/><!-- CJK UNIFIED IDEOGRAPH-9476 -->
    <map code="0x958f" name="glyph00007"/><!-- CJK UNIFIED IDEOGRAPH-958F -->
    <map code="0x993c" name="glyph00008"/><!-- CJK UNIFIED IDEOGRAPH-993C -->
    <map code="0x9a4b" name="glyph00004"/><!-- CJK UNIFIED IDEOGRAPH-9A4B -->
    <map code="0x9e3a" name="glyph00010"/><!-- CJK UNIFIED IDEOGRAPH-9E3A -->
    <map code="0x9ea3" name="glyph00001"/><!-- CJK UNIFIED IDEOGRAPH-9EA3 -->
    <map code="0x9f64" name="glyph00003"/><!-- CJK UNIFIED IDEOGRAPH-9F64 -->
    <map code="0x9f92" name="glyph00005"/><!-- CJK UNIFIED IDEOGRAPH-9F92 -->
    <map code="0x9fa4" name="glyph00002"/><!-- CJK UNIFIED IDEOGRAPH-9FA4 -->
    <map code="0x9fa5" name="glyph00006"/><!-- CJK UNIFIED IDEOGRAPH-9FA5 -->
  </cmap_format_4>
  <cmap_format_12 platformID="3" platEncID="10" format="12" reserved="0" length="136" language="0" nGroups="10">
    <map code="0x9476" name="glyph00009"/><!-- CJK UNIFIED IDEOGRAPH-9476 -->
    <map code="0x958f" name="glyph00007"/><!-- CJK UNIFIED IDEOGRAPH-958F -->
    <map code="0x993c" name="glyph00008"/><!-- CJK UNIFIED IDEOGRAPH-993C -->
    <map code="0x9a4b" name="glyph00004"/><!-- CJK UNIFIED IDEOGRAPH-9A4B -->
    <map code="0x9e3a" name="glyph00010"/><!-- CJK UNIFIED IDEOGRAPH-9E3A -->
    <map code="0x9ea3" name="glyph00001"/><!-- CJK UNIFIED IDEOGRAPH-9EA3 -->
    <map code="0x9f64" name="glyph00003"/><!-- CJK UNIFIED IDEOGRAPH-9F64 -->
    <map code="0x9f92" name="glyph00005"/><!-- CJK UNIFIED IDEOGRAPH-9F92 -->
    <map code="0x9fa4" name="glyph00002"/><!-- CJK UNIFIED IDEOGRAPH-9FA4 -->
    <map code="0x9fa5" name="glyph00006"/><!-- CJK UNIFIED IDEOGRAPH-9FA5 -->
  </cmap_format_12>
</cmap>
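Reading the cmap table by eye works for ten entries, but the same lookup can be scripted. The sketch below parses one subtable copied from the dump above with the standard library's xml.etree.ElementTree and builds a code-to-name dictionary. `CMAP_XML` and `glyph_names_by_code` are illustrative names, not part of the original script.

```python
import xml.etree.ElementTree as ET

# One subtable copied from the dump above; the other subtables list the same pairs.
CMAP_XML = """
<cmap>
  <cmap_format_4 platformID="3" platEncID="1" language="0">
    <map code="0x9476" name="glyph00009"/>
    <map code="0x958f" name="glyph00007"/>
    <map code="0x993c" name="glyph00008"/>
    <map code="0x9a4b" name="glyph00004"/>
    <map code="0x9e3a" name="glyph00010"/>
    <map code="0x9ea3" name="glyph00001"/>
    <map code="0x9f64" name="glyph00003"/>
    <map code="0x9f92" name="glyph00005"/>
    <map code="0x9fa4" name="glyph00002"/>
    <map code="0x9fa5" name="glyph00006"/>
  </cmap_format_4>
</cmap>
"""


def glyph_names_by_code(xml_text: str) -> dict:
    """Return {code point: glyph name} for every <map> element in the XML."""
    root = ET.fromstring(xml_text.strip())
    # int(..., 16) accepts the "0x" prefix used in the code attribute.
    return {int(node.get("code"), 16): node.get("name") for node in root.iter("map")}


names = glyph_names_by_code(CMAP_XML)
print(names[0x993C])
```

To run this against the generated file instead of the inline snippet, the root element could be obtained with `ET.parse("text.xml").getroot()` and iterated the same way; since every subtable repeats the same pairs, later entries simply overwrite identical earlier ones.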
5. Check the code point of the encrypted character:
print(fanti[0].encode('unicode-escape'))
# Output
b'\\u993c'
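As an alternative to `unicode-escape`, the built-in `ord` returns the code point as an integer, and `hex` formats it the same way as the `code` attribute in the cmap table. A small sketch over the whole sample string (the variable `codes` is introduced here only for illustration):

```python
fanti = '餼麣麣'

# hex(ord(ch)) yields the code point in the same "0x...." form used by the cmap table.
codes = [hex(ord(ch)) for ch in fanti]
print(codes)
```

The first entry is 0x993c, matching the escaped output above, and the second and third characters share one code point, which is why the rendered price ends in two identical digits.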
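6. Look up the name that corresponds to the code and take the number at its end (for 0x993c the name is glyph00008, so 8). Subtract one, and the result is the real digit behind the encrypted character (8 - 1 = 7). Note that this is the whole trailing number, not just its last digit: glyph00010 gives 10 - 1 = 9. Applied to every character, the string in this post decodes to 700.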
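Steps 5 and 6 can be combined into one small function. This is a sketch under the assumptions of this post: the table is the code-to-name mapping from step 4, and the "glyph number minus one" rule is the one observed for this particular font, so it may not hold for a font fetched from another request. `CMAP` and `decode_digits` are hypothetical names.

```python
# Code point -> glyph name, copied from the cmap table in step 4.
CMAP = {
    0x9476: "glyph00009",
    0x958F: "glyph00007",
    0x993C: "glyph00008",
    0x9A4B: "glyph00004",
    0x9E3A: "glyph00010",
    0x9EA3: "glyph00001",
    0x9F64: "glyph00003",
    0x9F92: "glyph00005",
    0x9FA4: "glyph00002",
    0x9FA5: "glyph00006",
}


def decode_digits(text: str, cmap: dict = CMAP) -> str:
    """Replace each encrypted character with (glyph number - 1); keep other characters."""
    out = []
    for ch in text:
        name = cmap.get(ord(ch))
        if name is None:
            out.append(ch)  # not in the font's table, so leave it unchanged
            continue
        glyph_number = int(name[len("glyph"):])  # "glyph00008" -> 8
        out.append(str(glyph_number - 1))
    return "".join(out)


print(decode_digits('餼麣麣'))
```

For the sample string this prints 700, the same value the browser shows on the page. Characters that are not in the table (for example ordinary text around the price) pass through untouched.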