How are functions executed on the x86 architecture?

阿新 • Published: 2019-12-31

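A while ago I was reading about how the Java virtual machine works, and it suddenly hit me that a virtual machine is still a machine, just one implemented in software. Java method invocation looked too complicated and I am not very familiar with bytecode, so I decided to read the assembly that a C compiler produces instead. Goal: understand how a function is actually called and executed on x86.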

assembly syntax for X86

gas (GNU assembler syntax), also known as AT&T style.

This article uses that style.

swap(int,int):
        pushq   %rbp
        movq    %rsp,%rbp
        movl    %edi,-20(%rbp)
        movl    %esi,-24(%rbp)
        movl    -20(%rbp),%eax
        movl    %eax,-4(%rbp)
        movl    -24(%rbp),%eax
        movl    %eax,-20(%rbp)
        movl    -4(%rbp),%eax
        movl    %eax,-24(%rbp)
        nop
        popq    %rbp
        ret

intel syntax

swap(int,int):
        push    rbp
        mov     rbp,rsp
        mov     DWORD PTR [rbp-20],edi
        mov     DWORD PTR [rbp-24],esi
        mov     eax,DWORD PTR [rbp-20]
        mov     DWORD PTR [rbp-4],eax
        mov     eax,DWORD PTR [rbp-24]
        mov     DWORD PTR [rbp-20],eax
        mov     eax,DWORD PTR [rbp-4]
        mov     DWORD PTR [rbp-24],eax
        nop
        pop     rbp
        ret

instruction suffixes

Suffix	Full name	Width
b	byte	8 bits
w	word	16 bits
l	long	32 bits
q	quad	64 bits

addressing mode

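The addressing mode is how the CPU locates the data an instruction operates on.

immediate addressing

movb $0x05,%al

In other words: R[al] = 0x05;

Copies the immediate value 0x05 (1 byte) into register al. The $ prefix marks an immediate operand: the value is encoded in the instruction itself, so no memory access is needed.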

indirect addressing

With indirect addressing the operand lives in memory, at an address computed from a register.

register to memory

movl %eax,-4(%ebp)

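In other words: mem[R[ebp]-4] = R[eax];

Copies the value in register eax to the memory location whose address is the value of ebp minus 4 (that is, R[ebp]-4 is a memory address).

A register that holds a memory address: sound familiar? It is exactly a pointer. This is how C pointers work under the hood.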

memory to register

movl -4(%ebp),%eax

In other words: R[eax] = mem[R[ebp]-4];

Copies the value stored at the memory location whose address is the value of ebp minus 4 into register eax.
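The two mov forms above can be modelled in C as a rough sketch. The `Machine` struct, `MEM_SIZE`, and the helper names are hypothetical, and the "address" in `ebp` is just an index into a small byte array rather than a real address.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { MEM_SIZE = 64 };

/* Toy machine state (hypothetical): a small byte-addressed memory and two registers. */
typedef struct {
    uint8_t  mem[MEM_SIZE];
    uint32_t eax;
    uint32_t ebp;   /* an index into mem, playing the role of an address */
} Machine;

/* movl %eax,-4(%ebp)  =>  mem[R[ebp]-4] = R[eax] */
void store_eax(Machine *m) {
    assert(m->ebp >= 4 && m->ebp <= MEM_SIZE);
    memcpy(&m->mem[m->ebp - 4], &m->eax, sizeof m->eax);
}

/* movl -4(%ebp),%eax  =>  R[eax] = mem[R[ebp]-4] */
void load_eax(Machine *m) {
    assert(m->ebp >= 4 && m->ebp <= MEM_SIZE);
    memcpy(&m->eax, &m->mem[m->ebp - 4], sizeof m->eax);
}

/* Stores a value through ebp, clobbers eax, then loads it back. */
uint32_t round_trip(uint32_t value) {
    Machine m = { .ebp = 32 };
    m.eax = value;
    store_eax(&m);
    m.eax = 0;      /* clobber the register so the load is observable */
    load_eax(&m);
    return m.eax;
}
```

Calling `round_trip` with any 32-bit value returns the same value, because the store and the load use the same computed address, `ebp - 4`.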

program counter for stored program

PC = PC + (instruction size in bytes)

(instruction) (src1) (src2) (dst)

In most processors, the PC is incremented after fetching an instruction, and holds the memory address of ("points to") the next instruction that would be executed. This is where the instruction cycle comes in: fetch, decode, execute. Note that the PC register is advanced automatically as soon as the CPU has fetched an instruction.

(In a processor where the incrementation precedes the fetch, the PC points to the current instruction being executed.)

Note: on x86 the ip (instruction pointer), also called pc (program counter), cannot be used as the destination of an ordinary instruction such as mov, push, or pop. It changes only as a side effect of control-flow instructions such as jmp, call, and ret.

In fact, call and ret manipulate this register indirectly. One effect of call is a push %rip, and one effect of ret is a pop %rip. The two are symmetric!
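As a small sketch of the rule PC = PC + (instruction size in bytes), the C code below adds up instruction lengths. The `Insn` type and both function names are hypothetical; the sizes are the usual x86-64 encodings of these three instructions (1, 3, and 5 bytes), and the load address 0x1000 is made up.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical instruction record: only the encoded length matters here. */
typedef struct {
    const char *text;
    size_t      size;   /* length of the encoding in bytes */
} Insn;

/* PC value once program[index] has been fetched: the address of the next instruction. */
size_t pc_after_fetch(const Insn *program, size_t count, size_t index, size_t base) {
    assert(index < count);
    size_t pc = base;
    for (size_t i = 0; i <= index; i++)
        pc += program[i].size;      /* PC = PC + (instruction size in bytes) */
    return pc;
}

/* Sample: the first three instructions of a function assumed to start at 0x1000. */
size_t sample_pc_after(size_t index) {
    static const Insn program[] = {
        { "pushq %rbp",     1 },
        { "movq %rsp,%rbp", 3 },
        { "call swap",      5 },
    };
    return pc_after_fetch(program, 3, index, 0x1000);
}
```

In this sample, `sample_pc_after(2)` is 0x1009, the address right after the call instruction, which is the value a call would save as the return address.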

change control flow

jmp label

When a jump instruction executes (in the last step of the machine cycle), it puts a new address into the PC. Now the fetch at the top of the next machine cycle fetches the instruction at that new address. Instead of executing the instruction that follows the jump instruction in memory, the processor "jumps" to an instruction somewhere else in memory.

The jmp instruction copies the address of label into the pc register, which changes the program's control flow: execution leaves the original instruction sequence. call label is very similar, since part of what call does is an implicit jmp label. A function call hands control to the callee, but control eventually has to come back to the caller. So before handing over control, call does one extra thing: it pushes the current pc value onto the stack.
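To illustrate how a jump simply overwrites the PC, here is a minimal fetch/execute loop in C. The three-opcode instruction set (`OP_INC`, `OP_JMP`, `OP_HALT`) is invented for this sketch, and the PC is an array index rather than a byte address.

```c
#include <assert.h>
#include <stddef.h>

/* Invented three-opcode instruction set, for illustration only. */
typedef enum { OP_INC, OP_JMP, OP_HALT } Op;

typedef struct {
    Op     op;
    size_t target;   /* used only by OP_JMP */
} Insn;

/* Runs the program and returns the accumulator value at OP_HALT. */
int run(const Insn *prog, size_t count) {
    int    acc = 0;
    size_t pc  = 0;
    for (;;) {
        assert(pc < count);
        Insn in = prog[pc];
        pc++;                           /* fetch: PC now points at the next instruction */
        switch (in.op) {
        case OP_INC:  acc++;            break;
        case OP_JMP:  pc = in.target;   break;   /* the jump overwrites PC */
        case OP_HALT: return acc;
        }
    }
}

/* Sample program: the increment at index 2 is jumped over. */
int run_sample(void) {
    static const Insn prog[] = {
        { OP_INC,  0 },
        { OP_JMP,  3 },
        { OP_INC,  0 },   /* skipped */
        { OP_INC,  0 },
        { OP_HALT, 0 },
    };
    return run(prog, 5);
}
```

Without the jump the sample would increment three times; with it, `run_sample` increments only twice.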

stack management

stack pointer

A stack register is a computer central processor register whose purpose is to keep track of a call stack.

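The push and pop instructions operate on the sp (stack pointer) register.

Base of the current stack frame: held in bp (base pointer)

Allocating stack space: subtract the required size from sp (the stack "grows downward");

Releasing stack space: add the size back to sp (the stack "shrinks upward"); (PS: a rather tedious way of putting it)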

pushl %eax

push value of %eax onto stack

The push instruction places its operand onto the top of the hardware supported stack in memory. Specifically, push first decrements ESP by 4, then places its operand into the contents of the 32-bit location at address [ESP]. ESP (the stack pointer) is decremented by push since the x86 stack grows down, i.e. the stack grows from high addresses to lower addresses.

Note that push stores multi-byte data, which raises the question of how those bytes are laid out in memory, i.e. byte order (endianness).

x86 is little-endian: the least significant byte of a number is stored at the lowest address and the most significant byte at the highest address. (Not discussed further here.)

pushl %eax is equivalent to:

subl $4,%esp       # allocate 4 bytes; the stack grows downward
movl %eax,(%esp)   # copy the value of eax to the address esp points to

popl %eax

pop %eax off stack

The pop instruction removes the 4-byte data element from the top of the hardware-supported stack into the specified operand (i.e. register or memory location). It first moves the 4 bytes located at memory location [ESP] into the specified register or memory location, and then increments ESP by 4.

This is equivalent to:

movl (%esp),%eax   # copy the value at the address esp points to into eax
addl $4,%esp       # release the space
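The push and pop equivalences can be sketched in C on a small byte array. The `Stack` type and function names are hypothetical; bytes are written explicitly in little-endian order so the layout is the same on any host.

```c
#include <assert.h>
#include <stdint.h>

enum { STACK_BYTES = 32 };

/* Toy stack (hypothetical): esp is an index into mem and starts at the high end. */
typedef struct {
    uint8_t  mem[STACK_BYTES];
    uint32_t esp;
} Stack;

void stack_init(Stack *s) {
    s->esp = STACK_BYTES;               /* empty stack: esp sits just past the top of mem */
}

/* pushl value: subl $4,%esp ; movl value,(%esp) */
void pushl(Stack *s, uint32_t value) {
    assert(s->esp >= 4);                /* room for 4 more bytes */
    s->esp -= 4;                        /* the stack grows toward lower addresses */
    for (int i = 0; i < 4; i++)         /* little-endian: low byte at the lowest address */
        s->mem[s->esp + i] = (uint8_t)(value >> (8 * i));
}

/* popl: movl (%esp),dst ; addl $4,%esp */
uint32_t popl(Stack *s) {
    assert(s->esp + 4 <= STACK_BYTES);  /* something must have been pushed */
    uint32_t value = 0;
    for (int i = 0; i < 4; i++)
        value |= (uint32_t)s->mem[s->esp + i] << (8 * i);
    s->esp += 4;                        /* release the space */
    return value;
}
```

After `pushl(&s, 0x11223344)` on a fresh stack, `esp` has dropped from 32 to 28, the byte at index 28 is 0x44, and the byte at index 31 is 0x11. Values come back from `popl` in last-in, first-out order.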

function call and return

call

The call instruction first pushes the current code location onto the hardware supported stack in memory (see the push instruction for details), and then performs an unconditional jump to the code location indicated by the label operand. Unlike the simple jump instructions, the call instruction saves the location to return to when the subroutine completes.

Note that by the time the CPU has fetched the call instruction, the PC has already been advanced past it and points at the next instruction. That PC value is the function's return address. call does two things. First, it saves this ip on the stack. Second, it jumps to label, which changes the PC. call label is equivalent to:

pushq %rip    # pseudo-code: %rip cannot actually be an operand of push
jmp label

ret

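The ret instruction implements a subroutine return mechanism. This instruction first pops a code location off the hardware supported in-memory stack (the PC that call pushed; this value is copied back into the PC register) (see the pop instruction for details). It then performs an unconditional jump to the retrieved code location.

So call (which contains a push) and ret (which contains a pop) are the key to transferring control and getting it back. They also change sp indirectly. This is implemented in hardware, so there is no need to dwell on it. ret is equivalent to:

popq %rip     # pseudo-code: %rip cannot actually be an operand of pop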
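The call/ret pairing can be modelled in a few lines of C. This is only a sketch: the `Cpu` struct, `do_call`, and `do_ret` are hypothetical names, the stack holds whole 64-bit slots instead of bytes, and `pc` is assumed to hold the address of the call instruction before it is fetched.

```c
#include <assert.h>
#include <stdint.h>

enum { STACK_SLOTS = 8 };

/* Toy CPU (hypothetical): a stack of 64-bit slots, a stack index, and a program counter. */
typedef struct {
    uint64_t stack[STACK_SLOTS];
    int      sp;    /* index of the top slot; STACK_SLOTS means the stack is empty */
    uint64_t pc;
} Cpu;

/* call label: push the address of the next instruction, then jump. */
void do_call(Cpu *c, uint64_t call_len, uint64_t label) {
    uint64_t return_addr = c->pc + call_len;   /* PC after the call has been fetched */
    assert(c->sp > 0);
    c->stack[--c->sp] = return_addr;           /* the "pushq %rip" half */
    c->pc = label;                             /* the "jmp label" half */
}

/* ret: pop the saved address back into PC. */
void do_ret(Cpu *c) {
    assert(c->sp < STACK_SLOTS);
    c->pc = c->stack[c->sp++];                 /* the "popq %rip" effect */
}
```

For example, with `pc` at 0x1010 and a 5-byte call to 0x2000, `do_call` leaves 0x1015 on the stack and sets `pc` to 0x2000; `do_ret` then puts `pc` back to 0x1015 and leaves the stack empty again.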

call stack

In computer science, a call stack is a stack data structure that stores information about the active subroutines of a computer program. This kind of stack is also known as an execution stack, program stack, control stack, run-time stack, or machine stack, and is often shortened to just "the stack".

A call stack is used for several related purposes, but the main reason for having one is to keep track of the point to which each active subroutine should return control when it finishes executing. An active subroutine is one that has been called but is yet to complete execution after which control should be handed back to the point of call. Such activations of subroutines may be nested to any level (recursive as a special case), hence the stack structure.

example

For example, if a subroutine DrawSquare calls a subroutine DrawLine from four different places, DrawLine must know where to return when its execution completes. To accomplish this, the address following the instruction that jumps to DrawLine, the return address, is pushed onto the call stack with each call.

code analysis

void swap(int a,int b){
    int tmp = a;
    a = b;
    b = tmp;
}

-- 64-bit machine, AT&T-style assembly
swap(int,int):
        pushq   %rbp # push the base address of the previous frame (main's); equivalent to subq $8,%rsp; movq %rbp,(%rsp)
        movq    %rsp,%rbp # set up the new function's frame: this becomes the new frame base
        movl    %edi,-20(%rbp) # argument a
        movl    %esi,-24(%rbp) # argument b
        movl    -20(%rbp),%eax # load a into %eax
        movl    %eax,-4(%rbp)  # store %eax (a) at %rbp - 4, i.e. tmp = a
        movl    -24(%rbp),%eax # load b into %eax
        movl    %eax,-20(%rbp) # store %eax (b) at %rbp - 20, i.e. a = b
        movl    -4(%rbp),%eax  # load tmp (the original a) from %rbp - 4 into %eax
        movl    %eax,-24(%rbp) # store %eax (the original a) at %rbp - 24, i.e. b = tmp
        nop # does nothing; filler the compiler emits in unoptimized code
        popq    %rbp # equivalent to movq (%rsp),%rbp (restores main's frame base) followed by addq $8,%rsp (releases the slot)
        ret # 1. popq %rip (restores main's pc, the one pushed by call swap) 2. execution continues at that address (the address of movl $0,%eax)
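To make the frame offsets concrete, the sketch below replays swap's mov instructions on a toy byte array in C. The helper names, the 64-byte memory, and the pretend %rbp value of 32 are all assumptions made for illustration; only the offsets -4, -20, and -24 come from the listing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { FRAME_BYTES = 64 };

/* Read or write a 32-bit value at a byte offset in the toy memory. */
static int32_t load32(const uint8_t *mem, size_t addr) {
    int32_t v;
    memcpy(&v, mem + addr, sizeof v);
    return v;
}

static void store32(uint8_t *mem, size_t addr, int32_t v) {
    memcpy(mem + addr, &v, sizeof v);
}

/* Replays swap's body; out[0] and out[1] receive the final contents of the a and b slots. */
void swap_frame(int32_t edi, int32_t esi, int32_t out[2]) {
    uint8_t mem[FRAME_BYTES] = {0};
    size_t  rbp = 32;                       /* pretend value of %rbp after movq %rsp,%rbp */
    int32_t eax;

    store32(mem, rbp - 20, edi);            /* movl %edi,-20(%rbp)   a            */
    store32(mem, rbp - 24, esi);            /* movl %esi,-24(%rbp)   b            */
    eax = load32(mem, rbp - 20);            /* movl -20(%rbp),%eax                */
    store32(mem, rbp - 4, eax);             /* movl %eax,-4(%rbp)    tmp = a      */
    eax = load32(mem, rbp - 24);            /* movl -24(%rbp),%eax                */
    store32(mem, rbp - 20, eax);            /* movl %eax,-20(%rbp)   a = b        */
    eax = load32(mem, rbp - 4);             /* movl -4(%rbp),%eax                 */
    store32(mem, rbp - 24, eax);            /* movl %eax,-24(%rbp)   b = tmp      */

    assert(load32(mem, rbp - 4) == edi);    /* tmp still holds the original a */
    out[0] = load32(mem, rbp - 20);
    out[1] = load32(mem, rbp - 24);
}
```

Calling `swap_frame(1, 2, out)` leaves 2 in the a slot and 1 in the b slot. The exchange happens entirely inside swap's own frame, and rbp never changes while the body runs.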

int main() {
    swap(1,2);
    return 0;
}

main:
        pushq   %rbp
        movq    %rsp,%rbp
        movl    $2,%esi # the caller prepares argument 2
        movl    $1,%edi # the caller prepares argument 1
        call    swap # Once the CPU has fetched call, pc already points at the next instruction, movl $0,%eax. call then does two things. First, it pushes that pc onto the stack (swap's ret pops it back into pc, so the CPU resumes at movl $0,%eax). Second, it jumps to swap's address and starts executing swap's code.
        movl    $0,%eax # return value 0
        popq    %rbp
        ret

C compare to Assembly

asm execute graph

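Note: the diagrams show 64-bit assembly.

Note: every push and pop instruction changes the value of the sp register.

Figure 1: main has executed pushq %rbp and movq %rsp,%rbp, setting up main's stack frame.

Figure 2: main executes call swap. call has two effects: 1. it pushes the address (X) of the instruction movl $0,%eax onto the stack; 2. it jumps to swap's address.

Figure 3: swap's stack frame. At this point rsp and rbp of the new frame point to the same memory address.

Figure 4: every memory address used by the mov instructions is an offset from rbp; rbp itself does not change.

Figure 5: after popq %rbp, main's frame base (rbp) is restored, i.e. rbp has the same value as in Figure 1.

Figure 6: after ret, main's stack frame is restored (mainly rsp, rbp, and pc; personally I treat pc as part of the frame, because a transfer of control between functions always involves a change of pc behind the scenes). ret is equivalent to popq %rip, but since ip (pc) cannot be operated on directly, ret changes it indirectly. pc has now been restored to X by ret, so control is back in main, which carries on executing its own code. swap's stack frame has in effect been destroyed: swap's variables can no longer be accessed. That is the essence of local variables in C.

Note: Figures 1 and 6 are identical, and so are Figures 2 and 5. I did not arrange this on purpose; it simply falls out of the x86 function call mechanism. In the callee (swap), popq %rbp restores the caller's (main's) rbp, and ret restores the address of the caller's next instruction, i.e. it sets pc back to X so that main can carry on. This is what restoring the caller's stack frame means. When main calls swap, main's state (its rbp and pc) is saved: call saves pc and swap's pushq %rbp saves rbp. At the end of swap, popq and ret restore that state. call and ret act as push %rip and pop %rip respectively. A nicely symmetric pair of operations!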
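The point about local variables can be checked from C itself. In the sketch below, `swap` is the article's function (with two `(void)` casts added only to silence unused-value warnings), while `swap_ptr` and the two wrapper functions are hypothetical additions for contrast.

```c
#include <assert.h>

/* The article's swap: a and b are copies that live in swap's own frame. */
void swap(int a, int b) {
    int tmp = a;
    a = b;
    b = tmp;
    (void)a;    /* the swapped copies disappear when the frame is released */
    (void)b;
}

/* Hypothetical variant: receives the addresses of the caller's variables. */
void swap_ptr(int *a, int *b) {
    assert(a != 0 && b != 0);
    int tmp = *a;
    *a = *b;
    *b = tmp;
}

/* Value of x in the caller after each kind of swap. */
int x_after_swap(int x, int y) {
    swap(x, y);
    return x;
}

int x_after_swap_ptr(int x, int y) {
    swap_ptr(&x, &y);
    return x;
}
```

`x_after_swap(1, 2)` still returns 1, because the exchange happened only in swap's frame. `x_after_swap_ptr(1, 2)` returns 2, because the callee wrote through addresses that point into the caller's frame.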

bombs

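pushq   %rbp       # save the frame base of the previous function (the caller)
movq    %rsp,%rbp  # frame base of the new function; in a fresh frame sp and bp point to the same address

A stack frame is maintained by two registers, sp (stack pointer) and bp (base pointer). In unoptimized compiler output these two instructions appear at the start of every function, which raises a question: whose frame base does main save? Creating and destroying a process (or thread) is the job of the OS kernel, so it is tempting to guess that main is called by a function inside the kernel. As far as I can tell, that is not quite how it works. The kernel sets up the process and transfers control to a user-space entry point (conventionally _start), and the C runtime's startup code then calls main like any other function; on Linux with glibc, for example, this goes through __libc_start_main. So the pushq %rbp in main saves whatever rbp that startup code was using, and main's ret returns into the startup code, which passes main's return value to exit. The exit system call is what finally lets the kernel tear the process down and reclaim its resources.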
