Python: day3_Basics 1_Sets, file operations, character encoding and transcoding
阿新 • Published: 2017-06-07
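Sets
A set is an unordered collection of unique elements. Its main uses are:
- Deduplication: converting a list to a set removes the duplicates automatically
- Relationship tests: computing the intersection, difference, union and so on between two groups of data
Common operations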
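    s = set([3, 5, 9, 10])   # create a set of numbers
    t = set("Hello")         # create a set of unique characters
    a = t | s                # union of t and s
    b = t & s                # intersection of t and s
    c = t - s                # difference (items in t but not in s)
    d = t ^ s                # symmetric difference (items in t or s, but not in both)

Basic operations:

    t.add('x')               # add one item
    s.update([10, 37, 42])   # add several items to s
    t.remove('H')            # remove one item (raises KeyError if it is missing)

Method, operator form and meaning:

    len(s)                                 number of elements in s
    x in s                                 test whether x is a member of s
    x not in s                             test whether x is not a member of s
    s.issubset(t)               s <= t     test whether every element of s is in t
    s.issuperset(t)             s >= t     test whether every element of t is in s
    s.union(t)                  s | t      new set with every element of s and t
    s.intersection(t)           s & t      new set with the elements common to s and t
    s.difference(t)             s - t      new set with the elements in s but not in t
    s.symmetric_difference(t)   s ^ t      new set with the elements in exactly one of s and t
    s.copy()                               shallow copy of s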
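The two uses described above, deduplication and relationship tests, can be sketched as follows. The names and class rosters are made-up sample data, not part of the original post.

```python
# Deduplicate a list by converting it to a set.
names = ["alice", "bob", "alice", "carol", "bob"]
unique_names = set(names)

# Two hypothetical class rosters to compare.
python_class = {"alice", "bob", "carol"}
linux_class = {"bob", "dave"}

both = python_class & linux_class          # intersection
either = python_class | linux_class        # union
only_python = python_class - linux_class   # difference
exactly_one = python_class ^ linux_class   # symmetric difference

# Sets are unordered, so sort before printing for a stable result.
print(sorted(unique_names))   # ['alice', 'bob', 'carol']
print(sorted(both))           # ['bob']
print(sorted(either))         # ['alice', 'bob', 'carol', 'dave']
print(sorted(only_python))    # ['alice', 'carol']
print(sorted(exactly_one))    # ['alice', 'carol', 'dave']

# discard() behaves like remove() but stays silent if the item is absent.
linux_class.discard("nobody")
```

If a missing element should be treated as an error, remove() is the better choice; discard() suits cleanup code where absence is fine.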
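File operations
Workflow for operating on a file
- Open the file, obtaining a file handle and assigning it to a variable
- Operate on the file through the handle
- Close the file
Suppose we have the following file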
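The three steps above can be sketched like this. The scratch file name demo.txt and the temporary directory are assumptions made for illustration; try/finally guarantees the close step even if the operation fails.

```python
import os
import tempfile

# Hypothetical scratch file in a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")

f = open(path, "w", encoding="utf-8")   # 1. open the file and keep the handle
try:
    f.write("hello\n")                  # 2. operate on the file through the handle
finally:
    f.close()                           # 3. close the file, even after an error

closed_after = f.closed
print(closed_after)                     # True
```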
The file lyrics holds 64 lines: the lyrics of "Yesterday When I Was Young", each English line followed by its Chinese translation. It begins:

    Somehow, it seems the love I knew was always the most destructive kind
    不知為何,我經歷的愛情總是最具毀滅性的的那種
    Yesterday when I was young
    昨日當我年少輕狂
    ...
Basic operations
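    f = open('lyrics', encoding='utf-8')   # open the file (it contains Chinese text, so name the encoding)
    first_line = f.readline()              # read one line
    print('first line:', first_line)
    print('divider'.center(50, '-'))
    data = f.read()                        # read everything that is left; avoid this for large files
    print(data)                            # print the rest of the file
    f.close()                              # close the file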
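The file open modes are:
- r: read-only mode (the default).
- w: write-only mode. [Not readable; creates the file if it does not exist; truncates it if it does.]
- a: append mode. [Not readable; creates the file if it does not exist; writes always go to the end.]
"+" means the file can be both read and written:
- r+: read and write. [Readable; writable; appendable; the file must already exist.]
- w+: write and read. [Truncates the file first, then allows reading back what was written.]
- a+: append and read. [Like a, but also readable.]
"U" means that \r, \n and \r\n are all converted to \n when reading (used together with r or r+). This is a Python 2 feature: Python 3 text mode translates newlines by default, and the U flag was removed in Python 3.11.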
- rU
- r+U
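"b" means binary mode (for example when sending files over FTP or uploading ISO images). In Python 2 it can be omitted on Linux but must be given on Windows when handling binary files. In Python 3 it matters on every platform, because binary mode reads and writes bytes instead of str.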
- rb
- wb
- ab
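A small experiment makes the differences between the modes listed above concrete. This is a sketch for Python 3; the scratch file modes.txt is a made-up name, and the with form used here closes each file automatically.

```python
import io
import os
import tempfile

# Hypothetical scratch file in a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), "modes.txt")

with open(path, "w", encoding="utf-8") as f:   # w creates the file
    f.write("first\n")
with open(path, "w", encoding="utf-8") as f:   # w again truncates the old content
    f.write("second\n")
with open(path, "a", encoding="utf-8") as f:   # a writes at the end
    f.write("third\n")
    try:
        f.read()
        append_is_readable = True
    except io.UnsupportedOperation:            # plain a cannot read
        append_is_readable = False

with open(path, "r+", encoding="utf-8") as f:  # r+ reads, then writes at the end
    before = f.read()
    f.write("fourth\n")

with open(path, "a+", encoding="utf-8") as f:  # a+ can read after seeking back
    f.seek(0)
    after = f.read()

with open(path, "rb") as f:                    # b returns bytes, not str
    raw = f.read()

print(repr(before))                            # 'second\nthird\n'
print(repr(after))                             # 'second\nthird\nfourth\n'
print(append_is_readable, type(raw).__name__)  # False bytes
```

On Windows the bytes in raw contain \r\n line endings, because text mode translated each \n on the way out; binary mode shows the file exactly as stored.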
Other methods
    def close(self):  # real signature unknown; restored from __doc__
        """
        Close the file.

        A closed file cannot be used for further I/O operations.  close() may be
        called more than once without error.
        """
        pass

    def fileno(self, *args, **kwargs):  # real signature unknown
        """ Return the underlying file descriptor (an integer). """
        pass

    def isatty(self, *args, **kwargs):  # real signature unknown
        """ True if the file is connected to a TTY device. """
        pass

    def read(self, size=-1):  # known case of _io.FileIO.read
        """
        Note: this may not read everything back.

        Read at most size bytes, returned as bytes.

        Only makes one system call, so less data may be returned than requested.
        In non-blocking mode, returns None if no data is available.
        Return an empty bytes object at EOF.
        """
        return ""

    def readable(self, *args, **kwargs):  # real signature unknown
        """ True if file was opened in a read mode. """
        pass

    def readall(self, *args, **kwargs):  # real signature unknown
        """
        Read all data from the file, returned as bytes.

        In non-blocking mode, returns as much as is immediately available,
        or None if no data is available.  Return an empty bytes object at EOF.
        """
        pass

    def readinto(self):  # real signature unknown; restored from __doc__
        """ Same as RawIOBase.readinto(). """
        pass  # rarely needed in everyday code; best avoided

    def seek(self, *args, **kwargs):  # real signature unknown
        """
        Move to new file position and return the file position.

        Argument offset is a byte count.  Optional argument whence defaults to
        SEEK_SET or 0 (offset from start of file, offset should be >= 0); other values
        are SEEK_CUR or 1 (move relative to current position, positive or negative),
        and SEEK_END or 2 (move relative to end of file, usually negative, although
        many platforms allow seeking beyond the end of a file).

        Note that not all file objects are seekable.
        """
        pass

    def seekable(self, *args, **kwargs):  # real signature unknown
        """ True if file supports random-access. """
        pass

    def tell(self, *args, **kwargs):  # real signature unknown
        """
        Current file position.

        Can raise OSError for non seekable files.
        """
        pass

    def truncate(self, *args, **kwargs):  # real signature unknown
        """
        Truncate the file to at most size bytes and return the truncated size.

        Size defaults to the current file position, as returned by tell().
        The current file position is changed to the value of size.
        """
        pass

    def writable(self, *args, **kwargs):  # real signature unknown
        """ True if file was opened in a write mode. """
        pass

    def write(self, *args, **kwargs):  # real signature unknown
        """
        Write bytes b to file, return number written.

        Only makes one system call, so not all of the data may be written.
        The number of bytes actually written is returned.  In non-blocking mode,
        returns None if the write would block.
        """
        pass
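The positioning methods documented above (seek, tell, truncate) are easiest to follow on a binary file, where offsets are plain byte counts. A minimal sketch, assuming a made-up scratch file digits.bin:

```python
import os
import tempfile

# Hypothetical scratch file holding ten known bytes.
path = os.path.join(tempfile.mkdtemp(), "digits.bin")
with open(path, "wb") as f:
    f.write(b"0123456789")

with open(path, "r+b") as f:
    head = f.read(4)             # read the first four bytes
    pos_after_read = f.tell()    # the position is now 4

    f.seek(-3, os.SEEK_END)      # three bytes before the end of the file
    tail = f.read()              # read up to EOF

    f.seek(5)                    # absolute offset from the start (SEEK_SET)
    new_size = f.truncate()      # cut the file at the current position

with open(path, "rb") as f:
    remaining = f.read()

print(head, pos_after_read)      # b'0123' 4
print(tail)                      # b'789'
print(new_size, remaining)       # 5 b'01234'
```

In text mode, seek offsets other than 0 should come from an earlier tell() call, since one character can occupy several bytes.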
The with statement
To avoid forgetting to close a file after opening it, manage it with a context manager:
    with open('log', 'r') as f:
        ...
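This way, when the with block finishes, the file is closed and its resources are released automatically.
Since Python 2.7, with can also manage the contexts of several files at once: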
    with open('log1') as obj1, open('log2') as obj2:
        pass
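As a rough illustration of managing two files in one with statement, the sketch below copies a file line by line while numbering the lines. The file names log1 and log2 follow the snippet above; the temporary directory and the sample content are assumptions. Iterating over the file object reads one line at a time, which also avoids loading a large file into memory with read().

```python
import os
import tempfile

# Hypothetical scratch directory with a small source file.
workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "log1")
dst = os.path.join(workdir, "log2")
with open(src, "w", encoding="utf-8") as f:
    f.write("alpha\nbeta\ngamma\n")

# Both files are closed when the block ends, even if an exception is raised.
with open(src, encoding="utf-8") as fin, open(dst, "w", encoding="utf-8") as fout:
    for number, line in enumerate(fin, start=1):
        fout.write("%d: %s" % (number, line))

both_closed = fin.closed and fout.closed
with open(dst, encoding="utf-8") as f:
    result = f.read()

print(both_closed)    # True
print(repr(result))   # '1: alpha\n2: beta\n3: gamma\n'
```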
Character encoding and transcoding
Detailed articles:
http://www.cnblogs.com/yuanchenqi/articles/5956943.html
http://www.diveintopython3.net/strings.html
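Things to know:
1. In Python 2 the default encoding is ASCII. In Python 3 strings are Unicode by default, and sys.getdefaultencoding() reports utf-8.
2. Unicode text can be encoded as UTF-32 (4 bytes per character), UTF-16 (2 bytes for most characters, 4 bytes for those outside the Basic Multilingual Plane) or UTF-8 (1 to 4 bytes per character). UTF-16 is common for in-memory strings, but files are usually stored as UTF-8 because it saves space.
3. In Python 3, encode converts the text and at the same time turns the str into bytes; decode converts it back and turns the bytes into a str.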
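The byte counts and the str/bytes round trip described above can be checked directly in Python 3. The sample strings are arbitrary; the little-endian codec names (utf-16-le, utf-32-le) are used so that no byte order mark is added to the counts.

```python
text = "A中"                       # one ASCII letter and one CJK character

utf8 = text.encode("utf-8")        # 1 + 3 bytes
utf16 = text.encode("utf-16-le")   # 2 + 2 bytes
utf32 = text.encode("utf-32-le")   # 4 + 4 bytes

print(type(text).__name__, type(utf8).__name__)   # str bytes
print(len(utf8), len(utf16), len(utf32))          # 4 4 8

# decode() turns bytes back into the original str.
round_trip = utf8.decode("utf-8")
print(round_trip == text)                         # True

# A character outside the Basic Multilingual Plane needs 4 bytes in UTF-16 too.
emoji = "\U0001F600"
emoji_sizes = (len(emoji.encode("utf-8")), len(emoji.encode("utf-16-le")))
print(emoji_sizes)                                # (4, 4)
```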
The figure above applies only to Python 2.
In Python 2
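    # -*- coding: utf-8 -*-
    __author__ = 'Alex Li'

    import sys
    print(sys.getdefaultencoding())

    # Simplified characters are used because GB2312 cannot encode the traditional forms.
    msg = "我爱北京天安门"
    # In Python 2 a str literal is bytes: decode to unicode first, then encode to the target.
    msg_gb2312 = msg.decode("utf-8").encode("gb2312")
    gb2312_to_gbk = msg_gb2312.decode("gbk").encode("gbk")

    print(msg)
    print(msg_gb2312)
    print(gb2312_to_gbk)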
In Python 3
    # -*- coding: gb2312 -*-   # this line can also be removed
    __author__ = 'Alex Li'

    import sys
    print(sys.getdefaultencoding())

    # Simplified characters are used because GB2312 cannot encode the traditional forms.
    msg = "我爱北京天安门"
    # msg_gb2312 = msg.decode("utf-8").encode("gb2312")
    msg_gb2312 = msg.encode("gb2312")   # str is already unicode, so no decode is needed first
    gb2312_to_unicode = msg_gb2312.decode("gb2312")
    gb2312_to_utf8 = msg_gb2312.decode("gb2312").encode("utf-8")

    print(msg)
    print(msg_gb2312)
    print(gb2312_to_unicode)
    print(gb2312_to_utf8)
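The same transcoding can be applied to whole files by naming the encoding in open(). The sketch below, for Python 3, converts a GB2312 file to UTF-8; the file names and the temporary directory are made up for illustration, and the sample text is written in simplified characters so that GB2312 can encode it.

```python
import os
import tempfile

# Hypothetical scratch files for the conversion.
workdir = tempfile.mkdtemp()
gb_path = os.path.join(workdir, "msg_gb2312.txt")
utf8_path = os.path.join(workdir, "msg_utf8.txt")

msg = "我爱北京天安门"
with open(gb_path, "w", encoding="gb2312") as f:
    f.write(msg)

# Decode from GB2312 while reading, encode to UTF-8 while writing.
with open(gb_path, encoding="gb2312") as fin, \
        open(utf8_path, "w", encoding="utf-8") as fout:
    fout.write(fin.read())

gb_size = os.path.getsize(gb_path)       # 2 bytes per character
utf8_size = os.path.getsize(utf8_path)   # 3 bytes per character
print(gb_size, utf8_size)                # 14 21

# Decoding with the wrong codec fails instead of silently producing garbage.
with open(gb_path, "rb") as f:
    raw = f.read()
try:
    raw.decode("utf-8")
    wrong_codec_failed = False
except UnicodeDecodeError:
    wrong_codec_failed = True
print(wrong_codec_failed)                # True
```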