Locating the Line of Code Behind an Oops [Compiled Notes]
[17184178.672000] Bad mode in data abort handler detected
[17184178.672000] Internal error: Oops - bad mode: 0 [#1] PREEMPT
[17184178.672000] CPU: 0 Not tainted (2.6.26.5 #1255)
[17184178.672000] PC is at 0xd201393a
[17184178.672000] LR is at 0xd20132fd
[17184178.672000] pc : [<d201393a>] lr : [<d20132fd>] psr: 200000bb
[17184178.672000] sp : d202df68 ip : 00000000 fp : 4021fc34
[17184178.672000] r10: 4003e000 r9 : 00000000 r8 : e0096c0c
[17184178.672000] r7 : e00aa340 r6 : e0096c04 r5 : e009e7c0 r4 : 000000b0
[17184178.672000] r3 : e001a418 r2 : 00000000 r1 : e00a11e8 r0 : 00accbad
[17184178.672000] Flags: nzCv IRQs off FIQs on Mode UND_32
[17184178.672000] Control: 0005317d Table: 21aa4000 DAC: 00000015
[17184178.672000] Stack: (0xd202df68 to 0xd002e800)
[17184178.672000] Code: 6868 0029 2800 d01a (6802)
[17184178.676000] ---[ end trace 31c4d86500000008 ]---
[17179681.444000] Internal error: Oops - bad syscall: ddf04c [#1] PREEMPT
[17179681.444000] Modules linked in: coma_dsr coma_voice coma_ss7 coma_cpi coma_config
[17179681.444000] CPU: 0 Not tainted (2.6.26.5 #659)
[17179681.444000] PC is at __dabt_usr+0x4/0x60
[17179681.444000] LR is at 0x377a4
[17179681.444000] pc : [<b4023884>] lr : [<000377a4>] psr: 80000093
[17179681.444000] sp : 4021fc10 ip : 0000f1b4 fp : 4021fc34
[17179681.444000] r10: 4003e000 r9 : 00000000 r8 : 003d0f00
[17179681.444000] r7 : 00000152 r6 : 400286f8 r5 : 402202b0 r4 : 00000000
[17179681.444000] r3 : 000000e4 r2 : 000000e4 r1 : 000f229c r0 : b428bcec
[17179681.444000] Flags: Nzcv IRQs off FIQs on Mode SVC_32 ISA ARM Segment user
[17179681.444000] Control: 0005317d Table: 21990000 DAC: 00000015
[17179681.444000] Process my_app(pid: 81, stack limit = 0xb5b8a268)
[17179681.444000] Stack: (0x4021fc10 to 0xb5b8c000)
[17179681.444000] fc00: 00000001 000000e4 000f229c 00000004
[17179681.444000] fc20: 402202f8 ffffffff 4021fc54 4021fc38 00035778 00037784 4003e000 00000000
[17179681.444000] fc40: 4021fc60 ffffffff afa66c90 4021fc58 40028788 00035740 00000000 402202b0
[17179681.444000] fc60: 402202f8 402202b0 400286f8 00000152 003d0f00 00000000 4003e000 afa66c90
[17179681.444000] fc80: 4021fc58 40028758 00000000 00000000 00000000 00000000 00000000 00000000
[17179681.444000] fca0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[17179681.444000] fcc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
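The two Oops dumps above differ in the processor mode recorded when the exception was taken. In the first, Mode UND_32 means the CPU was in Undefined mode; in the second, Mode SVC_32 means it was in Supervisor mode, the mode the kernel normally runs in. Code running in user space would show USR_32, so UND_32 does not by itself mean "user space". pc is the address at which the problem occurred.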
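The Flags, IRQs, FIQs, ISA and Mode fields in these dumps are all decoded from the psr value. The sketch below reproduces that decoding, assuming the standard ARM CPSR layout (N/Z/C/V in bits 31..28, I in bit 7, F in bit 6, T in bit 5, mode in bits 4..0). The function name decode_psr is made up for illustration and the Jazelle bit is ignored.

```python
# CPSR mode field encodings (bits [4:0]) as defined by the ARM architecture.
ARM_MODES = {
    0b10000: "USR",
    0b10001: "FIQ",
    0b10010: "IRQ",
    0b10011: "SVC",
    0b10111: "ABT",
    0b11011: "UND",
    0b11111: "SYS",
}


def decode_psr(psr):
    """Split a 32-bit PSR value into the fields shown in the Oops header."""
    # Upper-case letter = flag set, lower-case = flag clear (e.g. "nzCv").
    flags = "".join(
        name.upper() if psr & (1 << bit) else name
        for name, bit in (("n", 31), ("z", 30), ("c", 29), ("v", 28))
    )
    return {
        "flags": flags,
        "irqs": "off" if psr & (1 << 7) else "on",   # I bit masks IRQs
        "fiqs": "off" if psr & (1 << 6) else "on",   # F bit masks FIQs
        "isa": "Thumb" if psr & (1 << 5) else "ARM",  # T bit
        "mode": ARM_MODES.get(psr & 0x1F, "unknown"),
    }


# psr values copied from the two dumps above.
for psr in (0x200000BB, 0x80000093):
    print(hex(psr), decode_psr(psr))
```

For the first dump this gives mode UND with the T bit set, which fits the 16-bit values on its Code: line; for the second it gives mode SVC in ARM state.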
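Some background concepts:
1. The program counter PC (R15) can be used like a general-purpose register, although some instructions restrict how R15 may be used. Because ARM uses a pipeline, a correctly read PC value is the address of the current instruction plus 8 bytes. In other words, for the ARM instruction set, PC points two instructions ahead of the current one. Because ARM instructions are word-aligned, bits 0 and 1 of PC are always 0.
2. Register R13 (SP) is normally used as the stack pointer. Each mode has its own physical R13, which the program initializes. On entry to a mode, the registers that will be used can be saved on the stack that R13 points to and popped again on exit, which preserves the interrupted context.
3. Register R14 is called the link register (LR). In each mode it holds the return address of the current subroutine. When an exception occurs, the R14 of the exception mode is set to the address the exception should return to.
4. Register R12 (shown as ip in the register dumps)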
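Item 1 can be checked with a little arithmetic. The sketch below is illustrative only (the helper names are made up); it assumes the usual rule that an instruction reading PC sees its own address plus 8 in ARM state, and plus 4 in Thumb state. The pc printed in an Oops appears to be the address of the faulting instruction itself rather than this read-ahead value.

```python
def pc_as_read(instr_addr, thumb=False):
    """Value an instruction observes when it reads PC (R15) as an operand."""
    ahead = 4 if thumb else 8  # two instructions ahead in either state
    return (instr_addr + ahead) & 0xFFFFFFFF


def is_word_aligned(addr):
    """True when bits 0 and 1 are clear, as for every ARM-state instruction."""
    return addr & 0x3 == 0


# pc from the skb_release_data Oops (ARM state).
print(hex(pc_as_read(0xC0282B68)), is_word_aligned(0xC0282B68))
# pc from the first dump, whose psr has the Thumb bit set.
print(hex(pc_as_read(0xD201393A, thumb=True)), is_word_aligned(0xD201393A))
```

The second address is only halfword-aligned, which is expected for Thumb code and not a sign of corruption.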
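After adding the netfilter and iptables options to the Linux kernel configuration, I rebuilt the kernel. After downloading it to the development board and booting, plugging in the network cable produced the following error output:
eth1: link up, 100Mbps, full-duplex, lpa 0xCDE1 // printed after the cable is plugged in
------------------First, these messages are printed by the kernel's fault-handling code (die() and panic() both appear in the backtrace further down). Their exact meaning can be looked up directly in the kernel source, which helps a lot.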
Unable to handle kernel paging request at virtual address 06400040
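-----------------------Invalid pointer access. This generally means an illegal address was accessed (here 0x06400040, so not literally a NULL pointer). To find out why the address is invalid, look at the code logic around PC, and add printk calls if necessary.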
pgd = c0004000
[06400040] *pgd=00000000
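Internal error: Oops: 5 [#1] // error message ------The 5 is the fault status value handed to the data abort handler (00000005 also shows up in r6 next to the do_DataAbort frame below). In the classic ARM MMU fault status encoding, 0b0101 is a section translation fault, meaning no mapping exists for the address; it is not specifically a NULL pointer. Check the CPU manual or arch/arm/mm/fault.c for the full table.
Modules linked in: zd1211rw rt73usb rt2x00usb rt2x00lib asix usbnet mac80211 input_polldev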
CPU: 0 Not tainted (2.6.30.4-LanxumDomas #19)
PC is at skb_release_data+0x74/0xc4
--------The current pc. This is extremely useful! You can disassemble the kernel and then look at offset 0x74 from the start of skb_release_data.
LR is at __kfree_skb+0x1c/0xd0
-----------------The current register values follow; they are also helpful.
pc : [<c0282b68>] lr : [<c028266c>] psr: 20000013
sp : c040dd90 ip : c040dda8 fp : c040dda4
r10: 00000001 r9 : c0465fc8 r8 : c3a22000
r7 : c3b54300 r6 : c3b50004 r5 : c3ae1600 r4 : c3ae1600
r3 : 00000000 r2 : c3b50822 r1 : 00000000 r0 : 06400040
Flags: nzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment kernel
Control: c000717f Table: 33af0000 DAC: 00000017
Process swapper (pid: 0, stack limit = 0xc040c268)
Stack: (0xc040dd90 to 0xc040e000)
dd80: c3ae1600 c3ae1600 c040ddbc c040dda8
dda0: c028266c c0282b04 c3ae19c0 c3ae1600 c040ddcc c040ddc0 c0282794 c0282660
-----------------The call stack follows. It shows the program flow, which makes tracing easier.
Backtrace:
[<c0282af4>] (skb_release_data+0x0/0xc4) from [<c028266c>] (__kfree_skb+0x1c/0xd0)
r5:c3ae1600 r4:c3ae1600
[<c0282650>] (__kfree_skb+0x0/0xd0) from [<c0282794>] (kfree_skb+0x24/0x50)
r5:c3ae1600 r4:c3ae19c0
[<c0282770>] (kfree_skb+0x0/0x50) from [<c02f6884>] (br_handle_frame_finish+0x3c/0x18c)
[<c02f6848>] (br_handle_frame_finish+0x0/0x18c) from [<c02f6c0c>] (br_handle_frame+0x238/0x26c)
r8:c3a22000 r7:c3ae19c0 r6:c3ae1600 r5:c3b50004 r4:80000000
[<c02f69d4>] (br_handle_frame+0x0/0x26c) from [<c028a2f4>] (netif_receive_skb+0x1e4/0x380)
r7:00000000 r6:c0465fc8 r5:c3ae1600 r4:c3ae19c0
[<c028a110>] (netif_receive_skb+0x0/0x380) from [<c028a50c>] (process_backlog+0x7c/0xd4)
[<c028a490>] (process_backlog+0x0/0xd4) from [<c0288b44>] (net_rx_action+0xd0/0x194)
[<c0288a74>] (net_rx_action+0x0/0x194) from [<c009cb64>] (__do_softirq+0x78/0x100)
[<c009caec>] (__do_softirq+0x0/0x100) from [<c009cc34>] (irq_exit+0x48/0x50)
[<c009cbec>] (irq_exit+0x0/0x50) from [<c0083048>] (_text+0x48/0x70)
[<c0083000>] (_text+0x0/0x70) from [<c0083a44>] (__irq_svc+0x24/0xa0)
Exception stack(0xc040df48 to 0xc040df90)
df40: f4100000 00000032 f4100000 60000013 c0084ed8 c040c000
df60: c0084ed8 c04398c4 3001ea00 41129200 3001e9cc c040df9c c040dfa0 c040df90
df80: c008550c c0084f38 60000013 ffffffff
r7:c04398c4 r6:04000000 r5:f4000000 r4:ffffffff
[<c0084ed8>] (default_idle+0x0/0xac) from [<c008550c>] (cpu_idle+0x4c/0x68)
[<c00854c0>] (cpu_idle+0x0/0x68) from [<c0320ea0>] (rest_init+0x5c/0x70)
r7:c04104f0 r6:c0020d34 r5:c0439880 r4:c045d164
[<c0320e44>] (rest_init+0x0/0x70) from [<c0008944>] (start_kernel+0x1e0/0x24c)
[<c0008764>] (start_kernel+0x0/0x24c) from [<30008034>] (0x30008034)
r5:c0439968 r4:c0007175
Code: e3500000 0a000006 e3a03000 e5823018 (e5904000)
Kernel panic - not syncing: Fatal exception in interrupt
Backtrace:
[<c0087fd0>] (dump_backtrace+0x0/0x10c) from [<c0321f0c>] (dump_stack+0x18/0x1c)
r7:c0282b6c r6:c0439ec0 r5:c0282b68 r4:c0282b68
[<c0321ef4>] (dump_stack+0x0/0x1c) from [<c0321f5c>] (panic+0x4c/0x12c)
[<c0321f10>] (panic+0x0/0x12c) from [<c00882bc>] (die+0x1e0/0x214)
r3:00000100 r2:00000080 r1:c0439ec0 r0:c03b5a34
[<c00880dc>] (die+0x0/0x214) from [<c008a528>] (__do_kernel_fault+0x6c/0x7c)
[<c008a4bc>] (__do_kernel_fault+0x0/0x7c) from [<c008a684>] (do_page_fault+0x14c/0x25c)
r7:c040f890 r6:06400040 r5:c040dd7c r4:00000000
[<c008a538>] (do_page_fault+0x0/0x25c) from [<c008a89c>] (do_translation_fault+0x78/0x80)
[<c008a824>] (do_translation_fault+0x0/0x80) from [<c00831e0>] (do_DataAbort+0x38/0x9c)
r7:c041075c r6:00000005 r5:c040dd7c r4:c041070c
[<c00831a8>] (do_DataAbort+0x0/0x9c) from [<c0083a00>] (__dabt_svc+0x40/0x60)
Exception stack(0xc040dd48 to 0xc040dd90)
dd40: 06400040 00000000 c3b50822 00000000 c3ae1600 c3ae1600
dd60: c3b50004 c3b54300 c3a22000 c0465fc8 00000001 c040dda4 c040dda8 c040dd90
dd80: c028266c c0282b68 20000013 ffffffff
[<c0282af4>] (skb_release_data+0x0/0xc4) from [<c028266c>] (__kfree_skb+0x1c/0xd0)
r5:c3ae1600 r4:c3ae1600
[<c0282650>] (__kfree_skb+0x0/0xd0) from [<c0282794>] (kfree_skb+0x24/0x50)
r5:c3ae1600 r4:c3ae19c0
[<c0282770>] (kfree_skb+0x0/0x50) from [<c02f6884>] (br_handle_frame_finish+0x3c/0x18c)
[<c02f6848>] (br_handle_frame_finish+0x0/0x18c) from [<c02f6c0c>] (br_handle_frame+0x238/0x26c)
r8:c3a22000 r7:c3ae19c0 r6:c3ae1600 r5:c3b50004 r4:80000000
[<c02f69d4>] (br_handle_frame+0x0/0x26c) from [<c028a2f4>] (netif_receive_skb+0x1e4/0x380)
r7:00000000 r6:c0465fc8 r5:c3ae1600 r4:c3ae19c0
[<c028a110>] (netif_receive_skb+0x0/0x380) from [<c028a50c>] (process_backlog+0x7c/0xd4)
[<c028a490>] (process_backlog+0x0/0xd4) from [<c0288b44>] (net_rx_action+0xd0/0x194)
[<c0288a74>] (net_rx_action+0x0/0x194) from [<c009cb64>] (__do_softirq+0x78/0x100)
[<c009caec>] (__do_softirq+0x0/0x100) from [<c009cc34>] (irq_exit+0x48/0x50)
[<c009cbec>] (irq_exit+0x0/0x50) from [<c0083048>] (_text+0x48/0x70)
[<c0083000>] (_text+0x0/0x70) from [<c0083a44>] (__irq_svc+0x24/0xa0)
Exception stack(0xc040df48 to 0xc040df90)
df40: f4100000 00000032 f4100000 60000013 c0084ed8 c040c000
df60: c0084ed8 c04398c4 3001ea00 41129200 3001e9cc c040df9c c040dfa0 c040df90
df80: c008550c c0084f38 60000013 ffffffff
r7:c04398c4 r6:04000000 r5:f4000000 r4:ffffffff
[<c0084ed8>] (default_idle+0x0/0xac) from [<c008550c>] (cpu_idle+0x4c/0x68)
[<c00854c0>] (cpu_idle+0x0/0x68) from [<c0320ea0>] (rest_init+0x5c/0x70)
r7:c04104f0 r6:c0020d34 r5:c0439880 r4:c045d164
[<c0320e44>] (rest_init+0x0/0x70) from [<c0008944>] (start_kernel+0x1e0/0x24c)
[<c0008764>] (start_kernel+0x0/0x24c) from [<30008034>] (0x30008034)
r5:c0439968 r4:c0007175
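Reading a long Backtrace by eye is error-prone, so it can help to reduce it to a plain call chain. The sketch below is one way to do that, assuming each frame sits on a single line (console-wrapped lines must be re-joined first) and follows the "[<addr>] (callee+off/size) from [<addr>] (caller+off/size)" shape seen above. FRAME_RE and parse_backtrace are illustrative names, not part of any kernel tool.

```python
import re

# One backtrace frame: callee entry, return address, caller symbol + offset.
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\] \((?P<callee>\w+)\+0x[0-9a-f]+/0x[0-9a-f]+\) "
    r"from \[<(?P<ret>[0-9a-f]+)>\] "
    r"\((?P<caller>\w+)\+0x(?P<off>[0-9a-f]+)/0x[0-9a-f]+\)"
)


def parse_backtrace(text):
    """Return (callee, caller, caller_offset, return_address) per frame."""
    return [
        (m["callee"], m["caller"], int(m["off"], 16), int(m["ret"], 16))
        for m in FRAME_RE.finditer(text)
    ]


# First three frames of the backtrace above, with wrapped lines joined.
SAMPLE = """\
[<c0282af4>] (skb_release_data+0x0/0xc4) from [<c028266c>] (__kfree_skb+0x1c/0xd0)
 r5:c3ae1600 r4:c3ae1600
[<c0282650>] (__kfree_skb+0x0/0xd0) from [<c0282794>] (kfree_skb+0x24/0x50)
[<c0282770>] (kfree_skb+0x0/0x50) from [<c02f6884>] (br_handle_frame_finish+0x3c/0x18c)
"""

for callee, caller, off, ret in parse_backtrace(SAMPLE):
    print(f"{callee} <- {caller}+{off:#x} (returns to {ret:#x})")
```

Register lines such as "r5:... r4:..." do not match the pattern and are skipped. The innermost frame comes first, so the list reads from the faulting function outwards.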
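Once an error like the one above appears, the Oops output itself can be used to track it down. The steps below show how to locate the fault from the error messages.
1. Building the kernel also produces a vmlinux file, which can be examined with gdb.
When configuring the kernel with make menuconfig, enable the "Compile the kernel with debug info" option.
Note this line: PC is at skb_release_data+0x74/0xc4
This tells us that skb_release_data is 0xc4 bytes long and that the Oops occurred at offset 0x74 into it. So first look up where skb_release_data starts: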
# grep skb_release_data ./System.map
c0282af4 t skb_release_data
So when the error occurred, the program counter was at c0282af4+0x74=c0282b68, which matches the pc value printed in the Oops.
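The lookup and addition in step 1 can be scripted. The following is a minimal sketch, assuming System.map lines have the "address type name" form shown by the grep output; parse_system_map and resolve_pc are hypothetical helper names. Static symbols can share a name across files, in which case this simple dictionary keeps only the last one.

```python
import re


def parse_system_map(text):
    """Build {symbol: address} from System.map-style 'addr type name' lines."""
    symbols = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 3:
            addr, _kind, name = parts
            symbols[name] = int(addr, 16)
    return symbols


def resolve_pc(pc_line, symbols):
    """Turn 'PC is at sym+0xoff/0xsize' into an absolute address."""
    m = re.search(r"PC is at (\w+)\+0x([0-9a-f]+)/0x([0-9a-f]+)", pc_line)
    if m is None:
        raise ValueError("line has no symbol+offset/size information")
    name, offset, size = m.group(1), int(m.group(2), 16), int(m.group(3), 16)
    if offset >= size:
        raise ValueError("offset lies outside the function")
    return symbols[name] + offset


# The grep output from above stands in for a real System.map here.
SYSTEM_MAP = "c0282af4 t skb_release_data\n"
addr = resolve_pc("PC is at skb_release_data+0x74/0xc4",
                  parse_system_map(SYSTEM_MAP))
print(hex(addr))
```

With a real build, the SYSTEM_MAP string would be replaced by the contents of the System.map file that belongs to the running kernel.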
2. Then inspect it with gdb: run gdb ./vmlinux (from the linux source directory) to enter the debugger.
(gdb) b *0xc0282b68
Breakpoint 1 at 0xc0282b68: file net/core/skbuff.c, line 312.
This tells us which file and which line. The location of the error is now known; the specific cause still has to be worked out.
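The same address-to-line lookup can also be done non-interactively with binutils addr2line, which is convenient when many addresses need resolving. The sketch below only builds the command line; the helper name addr2line_cmd and the arm-linux- toolchain prefix are assumptions for illustration, so substitute whatever cross toolchain built the kernel. It relies on the same debug-info vmlinux as the gdb session.

```python
import shlex


def addr2line_cmd(address, vmlinux="./vmlinux", cross_prefix=""):
    """Argument list for addr2line: -e selects the ELF file, -f prints the
    function name, -i also reports inlined call sites."""
    return [cross_prefix + "addr2line", "-e", vmlinux, "-f", "-i",
            hex(address)]


# The toolchain prefix is hypothetical; the address is the computed pc.
cmd = addr2line_cmd(0xC0282B68, cross_prefix="arm-linux-")
print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run(cmd) on a machine where the toolchain and vmlinux are available.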
3. Disassemble
(gdb) disassemble 0xc0282b68
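Even without gdb, the Code: line of the Oops gives a hint: it lists the instruction words around pc, with the one at pc in parentheses, here e5904000. The sketch below decodes just one instruction class, the ARM-state LDR/STR with an immediate offset, to show how that word relates to the faulting address. It is a deliberately partial decoder written for illustration (decode_ldr_str_imm is a made-up name), not a replacement for a real disassembler.

```python
def decode_ldr_str_imm(word):
    """Decode an ARM-state LDR/STR (immediate offset); None for other words."""
    if word >> 28 == 0xF:
        return None  # unconditional instruction space, not handled here
    if (word >> 26) & 0b11 != 0b01 or (word >> 25) & 1:
        return None  # not a single data transfer, or register-offset form
    load = (word >> 20) & 1   # L bit: 1 = load, 0 = store
    byte = (word >> 22) & 1   # B bit: byte access
    up = (word >> 23) & 1     # U bit: add (1) or subtract (0) the offset
    imm = word & 0xFFF
    return {
        "op": ("ldr" if load else "str") + ("b" if byte else ""),
        "rd": (word >> 12) & 0xF,
        "rn": (word >> 16) & 0xF,
        "offset": imm if up else -imm,
        "pre_indexed": bool((word >> 24) & 1),  # P bit
    }


# Register values copied from the Oops register dump.
regs = {0: 0x06400040, 2: 0xC3B50822}

insn = decode_ldr_str_imm(0xE5904000)  # the word in parentheses
base = regs[insn["rn"]]
target = (base + insn["offset"]) & 0xFFFFFFFF if insn["pre_indexed"] else base
print(f'{insn["op"]} r{insn["rd"]}, [r{insn["rn"]}, #{insn["offset"]}]',
      hex(target))
```

The decoded instruction is a load through r0 with zero offset, and r0 holds 06400040 in the register dump, the same address reported in "Unable to handle kernel paging request". That points at whatever value was loaded into r0 shortly before this instruction.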