Python Interview Questions (Basics)

阿新 • Published: 2020-08-13

File Operations

1. There is a JSON Lines file, file.txt, about 10 KB in size

def get_lines():
    with open('file.txt','rb') as f:
        return f.readlines()

if __name__ == '__main__':
    for e in get_lines():
        process(e) # process each line of data

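Now a 10 GB file has to be processed, but only 4 GB of memory is available. If only the get_lines function may be modified and all other code stays unchanged, how should this be implemented? What issues need to be considered?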

def get_lines():
    with open('file.txt','rb') as f:
        for i in f:
            yield i

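Personal view: it is better to return a batch of lines per read; otherwise there are too many read operations.

def get_lines():
    with open('file.txt','rb') as f:
        while True:
            # readlines(hint) returns whole lines totalling roughly 60000 bytes
            data = f.readlines(60000)
            if not data:
                break
            yield from data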

Method contributed by Pandaaaa906:

from mmap import mmap


def get_lines(fp):
    with open(fp,"r+") as f:
        m = mmap(f.fileno(), 0)
        tmp = 0
        for i, char in enumerate(m):
            if char==b"\n":
                yield m[tmp:i+1].decode()
                tmp = i+1
        if tmp < len(m):
            # the last line may not end with a newline
            yield m[tmp:].decode()

if __name__=="__main__":
    for i in get_lines("fp_some_huge_file"):
        print(i)

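Issues to consider: with only 4 GB of memory, the 10 GB file cannot be read in one go, so it has to be read in batches. Reading in batches means keeping track of the position reached by each read. The size of each batch also matters: if it is too small, too much time is spent on read operations. https://stackoverflow.com/questions/30294146/python-fastest-way-to-process-large-file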
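The batching idea can be sketched as follows. This is a minimal illustration rather than a definitive implementation: the helper name iter_lines_batched and the default hint of 64 KiB are assumptions made for the example, and an in-memory io.BytesIO stands in for the real file so the sketch runs anywhere. It relies on readlines(hint), which stops collecting lines once their total size reaches roughly hint bytes, and on the file object itself remembering the read position between batches.

```python
import io

def iter_lines_batched(fileobj, hint=64 * 1024):
    """Yield lines one at a time, reading them in batches of about `hint` bytes."""
    while True:
        batch = fileobj.readlines(hint)   # whole lines only, roughly `hint` bytes
        if not batch:                     # empty list means end of file
            break
        yield from batch

# Stand-in for a large JSON Lines file (hypothetical data).
fake_file = io.BytesIO(b'{"id": 1}\n{"id": 2}\n{"id": 3}\n')
lines = list(iter_lines_batched(fake_file, hint=10))
print(lines)
```

Because the generator still yields one line at a time, the calling loop that passes each line to process() does not need to change.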

2. Fill in the missing code

def print_directory_contents(s_path):
    """
    Takes the name of a directory as its argument and prints
    the paths of the files in that directory,
    as well as the paths of the files in the directories it contains.
    """
    import os
    for s_child in os.listdir(s_path):
        s_child_path = os.path.join(s_path, s_child)
        if os.path.isdir(s_child_path):
            print_directory_contents(s_child_path)
        else:
            print(s_child_path)

Modules and Packages

3. Given a date, determine which day of the year it is

import datetime
def dayofyear():
    year = input("Enter the year: ")
    month = input("Enter the month: ")
    day = input("Enter the day: ")
    date1 = datetime.date(year=int(year),month=int(month),day=int(day))
    date2 = datetime.date(year=int(year),month=1,day=1)
    return (date1-date2).days+1

4. How do you shuffle a sorted list object alist?

import random
alist = [1,2,3,4,5]
random.shuffle(alist)
print(alist)

Data Types

5. Given the dict d = {'a':24,'g':52,'i':12,'k':33}, sort it by value

sorted(d.items(),key=lambda x:x[1])

In the key function, x[0] sorts by key and x[1] sorts by value. Note that sorted returns a list of (key, value) tuples, not a dict.
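A short sketch of both orderings, using the dict from the question. The variable names by_value, by_key and ordered are chosen for the example only. The last step assumes Python 3.7 or later, where dict preserves insertion order.

```python
d = {'a': 24, 'g': 52, 'i': 12, 'k': 33}

by_value = sorted(d.items(), key=lambda x: x[1])   # list of (key, value) tuples
by_key = sorted(d.items(), key=lambda x: x[0])

# Rebuilding a dict from the sorted pairs keeps the value order when iterating.
ordered = dict(by_value)
print(by_value)
print(list(ordered))
```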

6. Dict comprehension

d = {key:value for (key,value) in iterable}

7. Reverse the string "aStr"

print("aStr"[::-1])

8. Convert the string "k:1|k1:2|k2:3|k3:4" into the dict {k:1,k1:2,...}

str1 = "k:1|k1:2|k2:3|k3:4"
def str2dict(str1):
    dict1 = {}
    for item in str1.split('|'):
        key,value = item.split(':')
        dict1[key] = value
    return dict1
# dict comprehension
d = {k:int(v) for t in str1.split("|") for k, v in (t.split(":"), )}

9. Sort the elements of alist by age, from largest to smallest

alist = [{'name':'a','age':20},{'name':'b','age':30},{'name':'c','age':25}]
def sort_by_age(list1):
    # sort the list that was passed in, not the global alist
    return sorted(list1,key=lambda x:x['age'],reverse=True)

10. What will the following code output?

list = ['a','b','c','d','e']
print(list[10:])

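The code prints [] and does not raise an IndexError. As you would expect, accessing a single member with an index beyond the number of members, for example list[10], does raise an IndexError. Slicing a list with a start index beyond the number of members, however, raises nothing and simply returns an empty list. This can be a particularly nasty source of trouble: because no error occurs at run time, the bug is hard to track down.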
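The difference between indexing and slicing can be checked directly. In this small sketch the list is named letters (an assumption for the example, chosen to avoid shadowing the built-in list).

```python
letters = ['a', 'b', 'c', 'd', 'e']

try:
    letters[10]                 # indexing past the end raises
    index_error_raised = False
except IndexError:
    index_error_raised = True

tail = letters[10:]             # slicing clamps out-of-range bounds instead
print(index_error_raised, tail)
```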

11. Write a list comprehension that produces an arithmetic sequence with a common difference of 11

print([x*11 for x in range(10)])

12. Given two lists, how do you find the elements they share and the elements that differ?

list1 = [1,2,3]
list2 = [3,4,5]
set1 = set(list1)
set2 = set(list2)
print(set1 & set2)
print(set1 ^ set2)

13. Write Python code that removes the duplicate elements from a list

l1 = ['b','c','d','c','a','a']
l2 = list(set(l1))
print(l2)

Using the list sort method to restore the original order:

l1 = ['b','c','d','c','a','a']
l2 = list(set(l1))
l2.sort(key=l1.index)
print(l2)

It can also be written like this:

l1 = ['b','c','d','c','a','a']
l2 = sorted(set(l1),key=l1.index)
print(l2)

A plain loop works as well:

l1 = ['b','c','d','c','a','a']
l2 = []
for i in l1:
    if i not in l2:
        l2.append(i)
print(l2)

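14. Given two lists A and B, find the elements they have in common and the elements that differ

Elements common to A and B: print(set(A)&set(B))
Elements that differ between A and B: print(set(A)^set(B))

Company Interview Questions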

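15. What is the difference between new-style and classic classes in Python?

a. In Python, any class that inherits from object is a new-style class.

b. Python 3 has only new-style classes.

c. In Python 2, a class that inherits from object is new-style; a class written without a base class is a classic class.

d. Classic classes are essentially unused in Python today.

e. New-style classes unify class and type: for an instance a of a new-style class, a.__class__ and type(a) give the same result, which is not the case for classic classes.

f. The attribute lookup order under multiple inheritance differs. New-style classes use the C3 linearization (often loosely described as breadth-first), while classic classes use a depth-first search.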
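The lookup order for new-style classes can be inspected through __mro__. The diamond hierarchy below (classes A, B, C, D and the method who) is a made-up example. With a depth-first search the lookup would reach A through B before looking at C; under the new-style order C is consulted first.

```python
class A(object):
    def who(self):
        return "A"

class B(A):
    pass

class C(A):
    def who(self):
        return "C"

class D(B, C):
    pass

mro_names = [cls.__name__ for cls in D.__mro__]
print(mro_names)
print(D().who())   # resolved via C, which comes before A in the MRO
```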

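16. How many built-in data structures does Python have?

a. Integer int, long integer long, floating point float, complex number complex

b. String str, list list, tuple tuple

c. Dictionary dict, set set

d. Python 3 has no long, only an arbitrary-precision int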

17. How do you implement the singleton pattern in Python? Give two implementations.

Method 1: using a decorator

def singleton(cls):
    instances = {}
    def wrapper(*args, **kwargs):
        if cls not in instances:
            instances[cls] = cls(*args, **kwargs)
        return instances[cls]
    return wrapper
    
    
@singleton
class Foo(object):
    pass
foo1 = Foo()
foo2 = Foo()
print(foo1 is foo2)  # True

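Method 2: using a base class. __new__ is the method that actually creates the instance object, so overriding __new__ in the base class ensures that only one instance is ever created.

class Singleton(object):
    def __new__(cls, *args, **kwargs):
        if not hasattr(cls, '_instance'):
            # object.__new__ accepts no extra arguments, so pass only cls
            cls._instance = super(Singleton, cls).__new__(cls)
        return cls._instance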
    
    
class Foo(Singleton):
    pass

foo1 = Foo()
foo2 = Foo()

print(foo1 is foo2)  # True
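One behaviour of the __new__ approach is worth keeping in mind: __init__ still runs on every call, so later calls can overwrite state on the shared instance. The sketch below illustrates this; Config and its value attribute are hypothetical names introduced only for the example.

```python
class Singleton(object):
    def __new__(cls, *args, **kwargs):
        if not hasattr(cls, '_instance'):
            cls._instance = super(Singleton, cls).__new__(cls)
        return cls._instance

class Config(Singleton):          # hypothetical example class
    def __init__(self, value):
        self.value = value        # runs again on every Config(...) call

first = Config(1)
second = Config(2)
print(first is second, first.value)
```

If that is not wanted, the decorator and metaclass variants avoid it, since they return the cached instance without constructing it again.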

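Method 3: using a metaclass. A metaclass is the class used to create class objects. When a class object creates an instance it always goes through the metaclass's __call__ method, so it is enough to make __call__ create only one instance. type is Python's default metaclass.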

class Singleton(type):
    def __call__(cls, *args, **kwargs):
        if not hasattr(cls, '_instance'):
            cls._instance = super(Singleton, cls).__call__(*args, **kwargs)
        return cls._instance


# Python2
class Foo(object):
    __metaclass__ = Singleton

# Python3
class Foo(metaclass=Singleton):
    pass

foo1 = Foo()
foo2 = Foo()
print(foo1 is foo2)  # True

18. Reverse an integer, for example -123 --> -321

class Solution(object):
    def reverse(self,x):
        if -10<x<10:
            return x
        str_x = str(x)
        if str_x[0] !="-":
            str_x = str_x[::-1]
            x = int(str_x)
        else:
            str_x = str_x[1:][::-1]
            x = int(str_x)
            x = -x
        return x if -2147483648<x<2147483647 else 0
if __name__ == '__main__':
    s = Solution()
    reverse_int = s.reverse(-120)
    print(reverse_int)

19. Design and implement a traversal of a directory and its subdirectories that collects .pyc files

Method 1:

import os

def get_files(dir,suffix):
    res = []
    for root,dirs,files in os.walk(dir):
        for filename in files:
            name,suf = os.path.splitext(filename)
            if suf == suffix:
                res.append(os.path.join(root,filename))

    print(res)

get_files("./",'.pyc')

Method 2:

import os

def pick(obj):
    if obj.endswith(".pyc"):
        print(obj)

def scan_path(ph):
    for name in os.listdir(ph):
        # os.listdir returns bare names, so join them with the parent path
        obj = os.path.join(ph, name)
        if os.path.isfile(obj):
            pick(obj)
        elif os.path.isdir(obj):
            scan_path(obj)

if __name__=='__main__':
    path = input('Enter a directory: ')
    scan_path(path)

Method 3:

from glob import iglob


def func(fp, postfix):
    for i in iglob(f"{fp}/**/*{postfix}", recursive=True):
        print(i)

if __name__ == "__main__":
    postfix = ".pyc"
    func(r"K:\Python_script", postfix)

20. Sum the numbers 1 to 100 in one line of code

count = sum(range(0,101))
print(count)

21. The correct way to delete elements while iterating over a list in Python

Iterate over a copy of the list and delete from the original list

a = [1,2,3,4,5,6,7,8]
print(id(a))
print(id(a[:]))
for i in a[:]:
    if i>5:
        pass
    else:
        a.remove(i)
    print(a)
print('-----------')
print(id(a))

#filter
a=[1,2,3,4,5,6,7,8]
b = filter(lambda x: x>5,a)
print(list(b))

Using a list comprehension:

a=[1,2,3,4,5,6,7,8]
b = [i for i in a if i>5]
print(b)

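Deleting in reverse order: deleting an element shifts the elements after it forward, so by iterating from the end, the elements that have not been visited yet keep their positions even though later elements have changed.

a=[1,2,3,4,5,6,7,8]
print(id(a))
for i in range(len(a)-1,-1,-1):
    if a[i]>5:
        pass
    else:
        del a[i]  # delete by index; remove() would delete the first equal value
print(id(a))
print('-----------')
print(a)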
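For contrast, here is a short sketch of what goes wrong when the same list is iterated forwards and modified at once. Each removal shifts the remaining elements one position to the left, so the iterator skips the element that slid into the slot it had just visited.

```python
a = [1, 2, 3, 4, 5, 6, 7, 8]
for i in a:          # iterating and mutating the same list
    if i <= 5:
        a.remove(i)
print(a)             # 2 and 4 survive because the iterator skipped them
```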

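22. String manipulation problem

A pangram is a sentence that contains every letter of the English alphabet, for example: A QUICK BROWN FOX JUMPS OVER THE LAZY DOG. Define and implement a method get_missing_letter that takes a string argument and returns the letters the string is missing in order to become a pangram. Case in the input string should be ignored; the returned letters should all be lowercase and sorted alphabetically (ignore all non-ASCII characters).

The examples below are for illustration only; the double quotes are not part of the input:

(0) Input: "A quick brown fox jumps over the lazy dog"

Returns: ""

(1) Input: "A slow yellow fox crawls under the proactive dog"

Returns: "bjkmqz"

(2) Input: "Lions, and tigers, and bears, oh my!"

Returns: "cfjkpquvwxz"

(3) Input: ""

Returns: "abcdefghijklmnopqrstuvwxyz"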

def get_missing_letter(a):
    s1 = set("abcdefghijklmnopqrstuvwxyz")
    s2 = set(a.lower())
    ret = "".join(sorted(s1-s2))
    return ret
    
print(get_missing_letter("python"))

# other ways to generate the letters
# (range("a", "z") does not work: range only accepts integers)
# Method 1:
import string
letters = string.ascii_lowercase
# Method 2:
letters = "".join(map(chr, range(ord('a'), ord('z') + 1)))

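23. Mutable and immutable types

1. Mutable types include list and dict. Immutable types include string, number and tuple.

2. When a mutable object is modified, the operation works on the object at its existing memory address: the value is changed in place and no new memory is allocated for it.

3. An immutable object cannot be changed in place. An operation that appears to modify it leaves the value at the original address untouched, creates a new object holding the result, and rebinds the name to that new object.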
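The two behaviours can be observed by keeping a second name bound to the original object. This is a small sketch; the variable names are chosen for the example.

```python
nums = [1, 2]
alias = nums
nums.append(3)                 # in-place change, still the same object
list_same_object = alias is nums

text = "ab"
text_alias = text
text += "c"                    # builds a new str and rebinds the name text
str_same_object = text_alias is text

print(list_same_object, alias)
print(str_same_object, text_alias, text)
```

The list alias sees the appended element, while the string alias still refers to the unchanged original value.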

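24. What is the difference between is and ==?

is: compares the id of the two objects, that is, whether the two names refer to the same instance at the same memory address.

==: compares whether the contents or values of the two objects are equal; by default it calls the object's __eq__() method.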
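A brief sketch of the distinction. The Version class is a hypothetical example added to show == dispatching to __eq__.

```python
a = [1, 2, 3]
b = [1, 2, 3]
c = a

print(a == b)    # equal contents
print(a is b)    # but two separate list objects
print(a is c)    # two names for one object

class Version:                       # hypothetical example class
    def __init__(self, number):
        self.number = number

    def __eq__(self, other):         # called by the == operator
        return isinstance(other, Version) and self.number == other.number

v1, v2 = Version(1), Version(1)
print(v1 == v2, v1 is v2)
```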

25. Pick out all the odd numbers in a list and build a new list from them

a = [1,2,3,4,5,6,7,8,9,10]
res = [ i for i in a if i%2==1]
print(res)

26. Compute 1+2+3+10248 in one line of Python

from functools import reduce
# 1. Using the built-in sum function
num = sum([1,2,3,10248])
print(num)
# 2. Using the reduce function
num1 = reduce(lambda x,y :x+y,[1,2,3,10248])
print(num1)

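27. Variable scope in Python (variable lookup order)

The LEGB order of function scopes

1. What is LEGB?

L: local, the scope inside the function itself

E: enclosing, the scope of the enclosing function that a nested function sits inside

G: global, the module-level scope

B: built-in, the scope of Python's built-in names

A name lookup inside a function goes through these four scopes, known as LEGB, and searches them in exactly that order.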
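The lookup order can be traced with one name defined at several levels. This is an illustrative sketch; the function names are made up for the example.

```python
x = "global"

def outer():
    x = "enclosing"

    def inner():
        x = "local"
        return x              # L: found in the local scope

    def inner_no_local():
        return x              # E: not local, so found in the enclosing scope

    return inner(), inner_no_local()

def uses_global():
    return x                  # G: neither local nor enclosing, so global

def uses_builtin():
    return len("abc")         # B: len is found in the built-in scope

print(outer(), uses_global(), uses_builtin())
```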

28. Convert the string `"123"` to `123` without using built-in APIs such as `int()`

Method 1: using the str function

def atoi(s):
    num = 0
    for v in s:
        for j in range(10):
            if v == str(j):
                num = num * 10 + j
    return num

Method 2: using the ord function

def atoi(s):
    num = 0
    for v in s:
        num = num * 10 + ord(v) - ord('0')
    return num

Method 3: using the eval function

def atoi(s):
    num = 0
    for v in s:
        t = "%s * 1" % v
        n = eval(t)
        num = num * 10 + n
    return num

Method 4: combining method 2 with reduce to solve it in one line

from functools import reduce
def atoi(s):
    return reduce(lambda num, v: num * 10 + ord(v) - ord('0'), s, 0)

29.Given an array of integers

Given an array of integers and a target value, find the two numbers in the array that add up to the target. You may assume each input has exactly one answer and that the same element may not be used twice. Example: given nums = [2,7,11,15] and target = 9, return [0,1] because nums[0]+nums[1] = 2+7 = 9.

class Solution:
    def twoSum(self,nums,target):
        """
        :type nums: List[int]
        :type target: int
        :rtype: List[int]
        """
        d = {}  # maps each value seen so far to its index
        size = 0
        while size < len(nums):
            if target-nums[size] in d:
                return [d[target-nums[size]],size]
            d[nums[size]] = size
            size = size +1
solution = Solution()
nums = [2,7,11,15]
target = 9
result = solution.twoSum(nums,target)
print(result)


class Solution(object):
    def twoSum(self, nums, target):
        for i in range(len(nums)):
            num = target - nums[i]
            if num in nums[i+1:]:
                return [i, nums.index(num,i+1)]

Sorting the dicts in a list: given the list object alist below, sort its elements by age from largest to smallest.

alist=[{"name":"a","age":20},{"name":"b","age":30},{"name":"c","age":25}]

alist_sort = sorted(alist,key=lambda e: e.__getitem__('age'),reverse=True)

30. Python code that removes the duplicate elements from a list

def distFunc1(a):
    """Deduplicate with a set"""
    a = list(set(a))
    print(a)

def distFunc2(a):
    """Copy the items into a second list, checking membership along the way"""
    result = []
    for i in a:
        if i not in result:
            result.append(i)
    # call sort if the result needs to be ordered
    result.sort()
    print(result)

def distFunc3(a):
    """Deduplicate with a dict"""
    b = {}
    b = b.fromkeys(a)
    c = list(b.keys())
    print(c)

if __name__ == "__main__":
    a = [1,2,4,2,4,5,7,10,5,5,7,8,9,0,3]
    distFunc1(a)
    distFunc2(a)
    distFunc3(a)

31. Find the 10 most frequent words in a text

import re

# Method 1
def test(filepath):
    
    distone = {}

    with open(filepath) as f:
        for line in f:
            line = re.sub(r"\W+", " ", line)
            lineone = line.split()
            for keyone in lineone:
                if not distone.get(keyone):
                    distone[keyone] = 1
                else:
                    distone[keyone] += 1
    num_ten = sorted(distone.items(), key=lambda x:x[1], reverse=True)[:10]
    num_ten =[x[0] for x in num_ten]
    return num_ten
    
 
# Method 2
# Using most_common from the standard library's collections.Counter
import re
from collections import Counter


def test2(filepath):
    with open(filepath) as f:
        return list(map(lambda c: c[0], Counter(re.sub(r"\W+", " ", f.read()).split()).most_common(10)))

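32. Write a function that satisfies the following conditions

The input is a list containing only numbers and the output is a new list in which every element satisfies both conditions:

1. The element is even

2. The element sits at an even position in the original list (its index is even)

def num_list(num):
    # enumerate gives the real index; num.index(i) returns the first match and breaks on duplicates
    return [v for i, v in enumerate(num) if v % 2 == 0 and i % 2 == 0]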

num = [0,1,2,3,4,5,6,7,8,9,10]
result = num_list(num)
print(result)

33. Use a single list comprehension to produce a new list

The list contains only the even values taken from the even-index slice of the original list

list_data = [1,2,5,8,10,3,18,6,20]
res = [x for x in list_data[::2] if x %2 ==0]
print(res)

34. Generate [1,4,9,16,25,36,49,64,81,100] in one line of code

[x * x for x in range(1,11)]

35. Given a year, month and day, determine which day of the year it is

import datetime

y = int(input("Enter the 4-digit year: "))
m = int(input("Enter the month: "))
d = int(input("Enter the day: "))

targetDay = datetime.date(y,m,d)
dayCount = targetDay - datetime.date(targetDay.year -1,12,31)
print("%s is day %s of the year %s."%(targetDay,dayCount.days,y))

36. Merge two sorted lists l1 and l2 without using extend

def loop_merge_sort(l1,l2):
    tmp = []
    while len(l1)>0 and len(l2)>0:
        if l1[0] <l2[0]:
            tmp.append(l1[0])
            del l1[0]
        else:
            tmp.append(l2[0])
            del l2[0]
    while len(l1)>0:
        tmp.append(l1[0])
        del l1[0]
    while len(l2)>0:
        tmp.append(l2[0])
        del l2[0]
    return tmp
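The version above empties l1 and l2 as it goes, and each del at index 0 shifts the rest of the list. As an alternative, here is a sketch of a two-index merge that leaves the inputs untouched. The function name merge_sorted is an assumption for the example, and the tail is joined with slicing and concatenation rather than extend.

```python
def merge_sorted(l1, l2):
    """Merge two sorted lists into a new sorted list without modifying the inputs."""
    merged = []
    i = j = 0
    while i < len(l1) and j < len(l2):
        if l1[i] <= l2[j]:
            merged.append(l1[i])
            i += 1
        else:
            merged.append(l2[j])
            j += 1
    # One list is exhausted; whatever remains of the other is already sorted.
    return merged + l1[i:] + l2[j:]

left = [1, 3, 5]
right = [2, 4, 6, 8]
print(merge_sorted(left, right))
```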

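37. Given an array of arbitrary length, implement a function

that places all odd numbers before the even numbers, with the odd numbers in ascending order and the even numbers in descending order. For example, the string '1982376455' becomes '1355798642'.

# Method 1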
def func1(l):
    if isinstance(l, str):
        l = [int(i) for i in l]
    l.sort(reverse=True)
    for i in range(len(l)):
        if l[i] % 2 > 0:
            l.insert(0, l.pop(i))
    print(''.join(str(e) for e in l))

# Method 2
def func2(l):
    print("".join(sorted(l, key=lambda x: int(x) % 2 == 0 and 20 - int(x) or int(x))))

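38. Write a function that finds the second-largest number in an integer array

def find_second_large_num(num_list):
    """
    Find the second-largest number in the array
    """
    # Method 1
    # Sort the list and take the second-to-last element
    tmp_list = sorted(num_list)
    print("Method 1\nSecond_large_num is :", tmp_list[-2])
    
    # Method 2
    # Keep two markers: one holds the largest value, two holds the second largest.
    # Traverse the array once. If the current value is greater than one, move one into two and store the value in one; otherwise, if it is greater than two, store it in two.
    # Both start at negative infinity so that a maximum in first position is handled correctly.
    one = float("-inf")
    two = float("-inf")
    for i in range(len(num_list)):
        if num_list[i] > one:
            two = one
            one = num_list[i]
        elif num_list[i] > two:
            two = num_list[i]
    print("Method 2\nSecond_large_num is :", two)
    
    # Method 3
    # Use reduce with the logical operators (and, or)
    # Same idea as method 2, but without if statements.
    # The (0, 0) start value assumes the numbers are non-negative.
    from functools import reduce
    num = reduce(lambda ot, x: ot[1] < x and (ot[1], x) or ot[0] < x and (x, ot[1]) or ot, num_list, (0, 0))[0]
    print("Method 3\nSecond_large_num is :", num)
    
    
if __name__ == '__main__':
    num_list = [34, 11, 23, 56, 78, 0, 9, 12, 3, 7, 5]
    find_second_large_num(num_list)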

39. Read the following code. What does it output?

def multi():
    return [lambda x : i*x for i in range(4)]
print([m(3) for m in multi()])

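The correct answer is [9,9,9,9], not [0,3,6,9]. The cause is late binding in Python closures: a variable used in a closure is looked up when the inner function is called. By the time the functions are finally called, the for loop has finished and i has its last value, 3, so every returned function uses i = 3 and the result is [9,9,9,9].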
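One common way to get [0,3,6,9] is to bind the current value of i as a default argument, since defaults are evaluated when each lambda is defined. The sketch below shows both versions side by side; the names multi_late and multi_fixed are chosen for the example.

```python
def multi_late():
    # i is looked up when the lambda is called, after the loop has ended
    return [lambda x: i * x for i in range(4)]

def multi_fixed():
    # i=i captures the value of i at definition time as a default argument
    return [lambda x, i=i: i * x for i in range(4)]

late = [m(3) for m in multi_late()]
fixed = [m(3) for m in multi_fixed()]
print(late)
print(fixed)
```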

40. Count how many times each character appears in a string

# Method 1
def count_str(str_data):
    """Count the occurrences of each character"""
    dict_str = {} 
    for i in str_data:
        dict_str[i] = dict_str.get(i, 0) + 1
    return dict_str
dict_str = count_str("AAABBCCAC")
str_count_data = ""
for k, v in dict_str.items():
    str_count_data += k + str(v)
print(str_count_data)

# Method 2
from collections import Counter

print("".join(map(lambda x: x[0] + str(x[1]), Counter("AAABBCCAC").most_common())))

41. Specific usage and scenarios of the super function

https://python3-cookbook.readthedocs.io/zh_CN/latest/c08/p07_calling_method_on_parent_class.html
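As a small illustration of the topic (a sketch written for this list, not taken from the linked page), super() follows the method resolution order rather than simply calling the direct parent. In the made-up hierarchy below, every __init__ calls super().__init__(), and the shared Base initializer runs exactly once.

```python
class Base:
    def __init__(self):
        self.calls = ["Base"]

class Left(Base):
    def __init__(self):
        super().__init__()          # next class in the MRO, not necessarily Base
        self.calls.append("Left")

class Right(Base):
    def __init__(self):
        super().__init__()
        self.calls.append("Right")

class Child(Left, Right):
    def __init__(self):
        super().__init__()
        self.calls.append("Child")

calls = Child().calls
print(calls)
```

The MRO of Child is Child, Left, Right, Base, so Left's super() call reaches Right before Base.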