Locating the Code Line Behind an Oops [Compiled Notes]

[17184178.672000] Bad mode in data abort handler detected
[17184178.672000] Internal error: Oops - bad mode: 0 [#1] PREEMPT
[17184178.672000] CPU: 0    Not tainted  (2.6.26.5 #1255)
[17184178.672000] PC is at 0xd201393a
[17184178.672000] LR is at 0xd20132fd
[17184178.672000] pc : [<d201393a>]    lr : [<d20132fd>]    psr: 200000bb
[17184178.672000] sp : d202df68  ip : 00000000  fp : 4021fc34
[17184178.672000] r10: 4003e000  r9 : 00000000  r8 : e0096c0c
[17184178.672000] r7 : e00aa340  r6 : e0096c04  r5 : e009e7c0  r4 : 000000b0
[17184178.672000] r3 : e001a418  r2 : 00000000  r1 : e00a11e8  r0 : 00accbad
[17184178.672000] Flags: nzCv  IRQs off  FIQs on  Mode UND_32  ISA Thumb  Segment user
[17184178.672000] Control: 0005317d  Table: 21aa4000  DAC: 00000015
[17184178.672000] Stack: (0xd202df68 to 0xd002e800)
[17184178.672000] Code: 6868 0029 2800 d01a (6802)
[17184178.676000] ---[ end trace 31c4d86500000008 ]---

 

[17179681.444000] Internal error: Oops - bad syscall: ddf04c [#1] PREEMPT

[17179681.444000] Modules linked in: coma_dsr coma_voice coma_ss7 coma_cpi coma_config

[17179681.444000] CPU: 0    Not tainted  (2.6.26.5 #659)

[17179681.444000] PC is at __dabt_usr+0x4/0x60

[17179681.444000] LR is at 0x377a4

[17179681.444000] pc : [<b4023884>]    lr : [<000377a4>]    psr: 80000093

[17179681.444000] sp : 4021fc10  ip : 0000f1b4  fp : 4021fc34

[17179681.444000] r10: 4003e000  r9 : 00000000  r8 : 003d0f00

[17179681.444000] r7 : 00000152  r6 : 400286f8  r5 : 402202b0  r4 : 00000000

[17179681.444000] r3 : 000000e4  r2 : 000000e4  r1 : 000f229c  r0 : b428bcec

[17179681.444000] Flags: Nzcv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment user

[17179681.444000] Control: 0005317d  Table: 21990000  DAC: 00000015

[17179681.444000] Process my_app(pid: 81, stack limit = 0xb5b8a268)

[17179681.444000] Stack: (0x4021fc10 to 0xb5b8c000)

[17179681.444000] fc00:                                     00000001 000000e4 000f229c 00000004

[17179681.444000] fc20: 402202f8 ffffffff 4021fc54 4021fc38 00035778 00037784 4003e000 00000000

[17179681.444000] fc40: 4021fc60 ffffffff afa66c90 4021fc58 40028788 00035740 00000000 402202b0

[17179681.444000] fc60: 402202f8 402202b0 400286f8 00000152 003d0f00 00000000 4003e000 afa66c90

[17179681.444000] fc80: 4021fc58 40028758 00000000 00000000 00000000 00000000 00000000 00000000

[17179681.444000] fca0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

[17179681.444000] fcc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

 

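Two kinds of Oops are shown above. In the first, the Mode field is UND_32 (the CPU was in Undefined-instruction mode), which is taken here to mean the fault came from user space; in the second it is SVC_32 (Supervisor mode, the mode the kernel runs in), which means the fault occurred in kernel space. pc is the address at which the problem occurred.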
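The Flags line of each dump is just a decoded view of the psr value printed a few lines earlier. As a minimal sketch, assuming the classic ARM CPSR layout (condition flags in bits 31..28, I/F/T in bits 7..5, mode in bits 4..0), the two psr values above can be decoded as follows. The function name decode_psr and the mode-name table are illustrative and not taken from the kernel source.

```python
# Mode numbers are the low five bits of the ARM CPSR (classic ARM modes).
# The names mimic what the Oops prints; treat the table as an assumption.
MODES = {
    0x10: "USER_32", 0x11: "FIQ_32", 0x12: "IRQ_32", 0x13: "SVC_32",
    0x17: "ABT_32", 0x1B: "UND_32", 0x1F: "SYS_32",
}

def decode_psr(psr):
    """Return a string shaped like the 'Flags:' line of an ARM Oops."""
    flags = "".join(
        name.upper() if psr & bit else name
        for name, bit in (("n", 1 << 31), ("z", 1 << 30),
                          ("c", 1 << 29), ("v", 1 << 28))
    )
    irqs = "off" if psr & 0x80 else "on"    # I bit set means IRQs masked
    fiqs = "off" if psr & 0x40 else "on"    # F bit set means FIQs masked
    isa = "Thumb" if psr & 0x20 else "ARM"  # T bit selects Thumb state
    mode = MODES.get(psr & 0x1F, "unknown")
    return "%s  IRQs %s  FIQs %s  Mode %s  ISA %s" % (flags, irqs, fiqs, mode, isa)

print(decode_psr(0x200000BB))  # psr of the first dump
print(decode_psr(0x80000093))  # psr of the second dump
```

Running it reproduces the UND_32/Thumb and SVC_32/ARM readings of the two dumps, which is a quick way to sanity-check a register dump that has been copied by hand.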

 

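A few basic concepts:

1. The program counter PC (R15) can be used like a general-purpose register, although some instructions restrict how R15 may be used. Because ARM uses a pipelined design, reading PC yields the address of the current instruction plus 8 bytes; in other words, in the ARM instruction set PC points two instructions past the one being executed. Since ARM instructions are word-aligned, bits 0 and 1 of PC are always 0.

2. Register R13 (SP) is normally used as the stack pointer. Each exception mode has its own banked physical R13, which software must initialise. On entry to a mode, the registers about to be used can be saved on the stack that R13 points to, and on exit they are popped again; this is how the interrupted context is preserved.

3. Register R14 is called the link register (LR). In every mode it holds the return address of the current subroutine; when an exception occurs, the R14 of the exception mode is set to the address that the exception should return to.

4. Register R12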
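To make the pipeline offset concrete, here is a tiny worked example. It assumes the usual rule that an instruction reading PC sees its own address plus 8 in ARM state and plus 4 in Thumb state (the Thumb case is an addition not stated above). The address 0x8000 is hypothetical.

```python
def pc_read_value(instr_addr, thumb=False):
    """Value an instruction at instr_addr observes when it reads PC (R15)."""
    offset = 4 if thumb else 8  # two instructions ahead in either state
    return (instr_addr + offset) & 0xFFFFFFFF

# Hypothetical instruction at 0x8000 that reads PC, e.g. "mov r0, pc".
print(hex(pc_read_value(0x8000)))              # ARM state
print(hex(pc_read_value(0x8000, thumb=True)))  # Thumb state
```

Note that the pc printed in an Oops is already the address of the faulting instruction itself, so no such correction is needed when looking it up.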

 

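After adding the netfilter and iptables options to the Linux kernel configuration, I rebuilt the kernel. After downloading it to the development board and booting, plugging in the network cable produced the following error:

eth1: link up, 100Mbps, full-duplex, lpa 0xCDE1        // printed after plugging in the network cable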

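------------------ First of all, these messages are printed by the kernel's fault-handling path (__do_kernel_fault, die and panic in the backtrace further down). Their exact meaning can be looked up directly in the kernel source, which is very helpful.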
Unable to handle kernel paging request at virtual address 06400040

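----------------------- An invalid-pointer error: this generally means an access to an illegal address (06400040 here, so not literally a NULL pointer). As for why the address is illegal, study the code logic around PC, and printk the relevant values if necessary.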

pgd = c0004000
[06400040] *pgd=00000000
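Internal error: Oops: 5 [#1]  // the error message ------ what does the 5 mean? It is the fault status code handed to the data abort handler (note r6:00000005 under do_DataAbort in the backtrace further down); look it up in your CPU manual or the kernel source. It is not specifically a "NULL pointer" code.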

Modules linked in: zd1211rw rt73usb rt2x00usb rt2x00lib asix usbnet mac80211 input_polldev
CPU: 0    Not tainted  (2.6.30.4-LanxumDomas #19)


PC is at skb_release_data+0x74/0xc4

-------- The current PC, which is extremely useful! You can disassemble the kernel and look at offset 0x74 inside skb_release_data.

LR is at __kfree_skb+0x1c/0xd0

----------------- The current register values follow; they help too.
pc : [<c0282b68>]    lr : [<c028266c>]    psr: 20000013
sp : c040dd90  ip : c040dda8  fp : c040dda4
r10: 00000001  r9 : c0465fc8  r8 : c3a22000
r7 : c3b54300  r6 : c3b50004  r5 : c3ae1600  r4 : c3ae1600
r3 : 00000000  r2 : c3b50822  r1 : 00000000  r0 : 06400040
Flags: nzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment kernel
Control: c000717f  Table: 33af0000  DAC: 00000017
Process swapper (pid: 0, stack limit = 0xc040c268)
Stack: (0xc040dd90 to 0xc040e000)
dd80:                                     c3ae1600 c3ae1600 c040ddbc c040dda8
dda0: c028266c c0282b04 c3ae19c0 c3ae1600 c040ddcc c040ddc0 c0282794 c0282660 

----------------- The call stack (backtrace) follows; it shows the flow of execution and makes tracing easier.

Backtrace:
[<c0282af4>] (skb_release_data+0x0/0xc4) from [<c028266c>] (__kfree_skb+0x1c/0xd0)
 r5:c3ae1600 r4:c3ae1600
[<c0282650>] (__kfree_skb+0x0/0xd0) from [<c0282794>] (kfree_skb+0x24/0x50)
 r5:c3ae1600 r4:c3ae19c0
[<c0282770>] (kfree_skb+0x0/0x50) from [<c02f6884>] (br_handle_frame_finish+0x3c/0x18c)
[<c02f6848>] (br_handle_frame_finish+0x0/0x18c) from [<c02f6c0c>] (br_handle_frame+0x238/0x26c)
 r8:c3a22000 r7:c3ae19c0 r6:c3ae1600 r5:c3b50004 r4:80000000
[<c02f69d4>] (br_handle_frame+0x0/0x26c) from [<c028a2f4>] (netif_receive_skb+0x1e4/0x380)
 r7:00000000 r6:c0465fc8 r5:c3ae1600 r4:c3ae19c0
[<c028a110>] (netif_receive_skb+0x0/0x380) from [<c028a50c>] (process_backlog+0x7c/0xd4)
[<c028a490>] (process_backlog+0x0/0xd4) from [<c0288b44>] (net_rx_action+0xd0/0x194)
[<c0288a74>] (net_rx_action+0x0/0x194) from [<c009cb64>] (__do_softirq+0x78/0x100)
[<c009caec>] (__do_softirq+0x0/0x100) from [<c009cc34>] (irq_exit+0x48/0x50)
[<c009cbec>] (irq_exit+0x0/0x50) from [<c0083048>] (_text+0x48/0x70)
[<c0083000>] (_text+0x0/0x70) from [<c0083a44>] (__irq_svc+0x24/0xa0)
Exception stack(0xc040df48 to 0xc040df90)
df40:                   f4100000 00000032 f4100000 60000013 c0084ed8 c040c000
df60: c0084ed8 c04398c4 3001ea00 41129200 3001e9cc c040df9c c040dfa0 c040df90
df80: c008550c c0084f38 60000013 ffffffff
 r7:c04398c4 r6:04000000 r5:f4000000 r4:ffffffff
[<c0084ed8>] (default_idle+0x0/0xac) from [<c008550c>] (cpu_idle+0x4c/0x68)
[<c00854c0>] (cpu_idle+0x0/0x68) from [<c0320ea0>] (rest_init+0x5c/0x70)
 r7:c04104f0 r6:c0020d34 r5:c0439880 r4:c045d164
[<c0320e44>] (rest_init+0x0/0x70) from [<c0008944>] (start_kernel+0x1e0/0x24c)
[<c0008764>] (start_kernel+0x0/0x24c) from [<30008034>] (0x30008034)
 r5:c0439968 r4:c0007175
Code: e3500000 0a000006 e3a03000 e5823018 (e5904000)
Kernel panic - not syncing: Fatal exception in interrupt
Backtrace:
[<c0087fd0>] (dump_backtrace+0x0/0x10c) from [<c0321f0c>] (dump_stack+0x18/0x1c)
 r7:c0282b6c r6:c0439ec0 r5:c0282b68 r4:c0282b68
[<c0321ef4>] (dump_stack+0x0/0x1c) from [<c0321f5c>] (panic+0x4c/0x12c)
[<c0321f10>] (panic+0x0/0x12c) from [<c00882bc>] (die+0x1e0/0x214)
 r3:00000100 r2:00000080 r1:c0439ec0 r0:c03b5a34
[<c00880dc>] (die+0x0/0x214) from [<c008a528>] (__do_kernel_fault+0x6c/0x7c)
[<c008a4bc>] (__do_kernel_fault+0x0/0x7c) from [<c008a684>] (do_page_fault+0x14c/0x25c)
 r7:c040f890 r6:06400040 r5:c040dd7c r4:00000000
[<c008a538>] (do_page_fault+0x0/0x25c) from [<c008a89c>] (do_translation_fault+0x78/0x80)
[<c008a824>] (do_translation_fault+0x0/0x80) from [<c00831e0>] (do_DataAbort+0x38/0x9c)
 r7:c041075c r6:00000005 r5:c040dd7c r4:c041070c
[<c00831a8>] (do_DataAbort+0x0/0x9c) from [<c0083a00>] (__dabt_svc+0x40/0x60)
Exception stack(0xc040dd48 to 0xc040dd90)
dd40:                   06400040 00000000 c3b50822 00000000 c3ae1600 c3ae1600
dd60: c3b50004 c3b54300 c3a22000 c0465fc8 00000001 c040dda4 c040dda8 c040dd90
dd80: c028266c c0282b68 20000013 ffffffff
[<c0282af4>] (skb_release_data+0x0/0xc4) from [<c028266c>] (__kfree_skb+0x1c/0xd0)
 r5:c3ae1600 r4:c3ae1600
[<c0282650>] (__kfree_skb+0x0/0xd0) from [<c0282794>] (kfree_skb+0x24/0x50)
 r5:c3ae1600 r4:c3ae19c0
[<c0282770>] (kfree_skb+0x0/0x50) from [<c02f6884>] (br_handle_frame_finish+0x3c/0x18c)
[<c02f6848>] (br_handle_frame_finish+0x0/0x18c) from [<c02f6c0c>] (br_handle_frame+0x238/0x26c)
 r8:c3a22000 r7:c3ae19c0 r6:c3ae1600 r5:c3b50004 r4:80000000
[<c02f69d4>] (br_handle_frame+0x0/0x26c) from [<c028a2f4>] (netif_receive_skb+0x1e4/0x380)
 r7:00000000 r6:c0465fc8 r5:c3ae1600 r4:c3ae19c0
[<c028a110>] (netif_receive_skb+0x0/0x380) from [<c028a50c>] (process_backlog+0x7c/0xd4)
[<c028a490>] (process_backlog+0x0/0xd4) from [<c0288b44>] (net_rx_action+0xd0/0x194)
[<c0288a74>] (net_rx_action+0x0/0x194) from [<c009cb64>] (__do_softirq+0x78/0x100)
[<c009caec>] (__do_softirq+0x0/0x100) from [<c009cc34>] (irq_exit+0x48/0x50)
[<c009cbec>] (irq_exit+0x0/0x50) from [<c0083048>] (_text+0x48/0x70)
[<c0083000>] (_text+0x0/0x70) from [<c0083a44>] (__irq_svc+0x24/0xa0)
Exception stack(0xc040df48 to 0xc040df90)
df40:                   f4100000 00000032 f4100000 60000013 c0084ed8 c040c000
df60: c0084ed8 c04398c4 3001ea00 41129200 3001e9cc c040df9c c040dfa0 c040df90
df80: c008550c c0084f38 60000013 ffffffff
 r7:c04398c4 r6:04000000 r5:f4000000 r4:ffffffff
[<c0084ed8>] (default_idle+0x0/0xac) from [<c008550c>] (cpu_idle+0x4c/0x68)
[<c00854c0>] (cpu_idle+0x0/0x68) from [<c0320ea0>] (rest_init+0x5c/0x70)
 r7:c04104f0 r6:c0020d34 r5:c0439880 r4:c045d164
[<c0320e44>] (rest_init+0x0/0x70) from [<c0008944>] (start_kernel+0x1e0/0x24c)
[<c0008764>] (start_kernel+0x0/0x24c) from [<30008034>] (0x30008034)
 r5:c0439968 r4:c0007175
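The number in "Internal error: Oops: 5" can be looked up mechanically. The sketch below assumes the classic ARM two-level (short-descriptor) fault status encoding, with the status taken from bits 3..0 plus bit 10 of the value. The table is reproduced from memory of the kernel's ARM fault table and should be checked against your own kernel version and CPU manual; describe_fsr is a hypothetical helper name.

```python
# Fault descriptions for status values 0..15 of the classic ARM fault
# status encoding. Assumption: verify against your CPU manual and kernel.
FSR_INFO = [
    "vector exception",
    "alignment exception",
    "terminal exception",
    "alignment exception",
    "external abort on linefetch",
    "section translation fault",
    "external abort on linefetch",
    "page translation fault",
    "external abort on non-linefetch",
    "section domain fault",
    "external abort on non-linefetch",
    "page domain fault",
    "external abort on translation",
    "section permission fault",
    "external abort on translation",
    "page permission fault",
]

def describe_fsr(fsr):
    """Map the number printed after 'Oops:' to a fault description."""
    status = (fsr & 0xF) | ((fsr & (1 << 10)) >> 6)  # FS[3:0] plus FS[4]
    if status < len(FSR_INFO):
        return FSR_INFO[status]
    return "status %d (not covered by this sketch)" % status

print(describe_fsr(5))
```

For the dump above this gives a section translation fault, which is consistent with the empty first-level entry (*pgd=00000000) and with do_translation_fault appearing in the backtrace.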

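After an error like the one above, the Oops message itself can be used to locate the fault. The steps below show how.

1. Building the kernel also produces a vmlinux file, which can be examined with gdb.

   When configuring the kernel with make menuconfig, enable the "Compile the kernel with debug info" option.

   Note this line: PC is at skb_release_data+0x74/0xc4

   It tells us that the function skb_release_data is 0xc4 bytes long and that the Oops occurred at offset 0x74 within it. So first find where skb_release_data starts: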

      # grep skb_release_data ./System.map

         c0282af4 t skb_release_data

   So when the error occurred, the program counter was at c0282af4 + 0x74 = c0282b68, which matches the pc value in the register dump.
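The lookup and addition in step 1 can be scripted. This is a minimal sketch, not a replacement for grep: resolve_pc is a hypothetical helper, and the sample System.map text contains the one line quoted above plus a second entry whose address comes from the backtrace (its type letter "T" is an assumption).

```python
import re

# One real line from the article plus one entry whose address comes from the
# backtrace; the type letter "T" on that line is an assumption.
SYSTEM_MAP = """\
c0282650 T __kfree_skb
c0282af4 t skb_release_data
"""

def resolve_pc(pc_line, system_map_text):
    """Turn 'PC is at sym+0xoff/0xsize' into an absolute address."""
    m = re.search(r"PC is at (\w+)\+0x([0-9a-f]+)/0x([0-9a-f]+)", pc_line)
    if m is None:
        raise ValueError("not a symbolic 'PC is at' line")
    symbol = m.group(1)
    offset, size = int(m.group(2), 16), int(m.group(3), 16)
    if offset >= size:
        raise ValueError("offset lies outside the function")
    for line in system_map_text.splitlines():
        fields = line.split()  # address, type letter, symbol name
        if len(fields) == 3 and fields[2] == symbol:
            return int(fields[0], 16) + offset
    raise KeyError(symbol)

print(hex(resolve_pc("PC is at skb_release_data+0x74/0xc4", SYSTEM_MAP)))
```

In practice the second argument would be the contents of the System.map that belongs to the exact kernel build that crashed; a map from a different build gives misleading addresses.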

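2. Then inspect it with gdb: run gdb ./vmlinux (from the linux source directory) to enter the debugger.

   (gdb) b *0xc0282b68

   Breakpoint 1 at 0xc0282b68: file net/core/skbuff.c, line 312.

   This tells us which file and which line. The location of the error is now known; the actual cause still has to be worked out.

3. Disassemble

   (gdb) disassemble 0xc0282b68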
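The Code: line of the dump offers a cross-check for the disassembly: it lists the instruction words around PC, with the word at PC in parentheses. The sketch below pulls out that word and decodes it, assuming ARM state (32-bit words) and handling only unconditional LDR/STR with a plain immediate offset; anything else returns None. Both function names are hypothetical, and this is not a general disassembler.

```python
import re

CODE_LINE = "Code: e3500000 0a000006 e3a03000 e5823018 (e5904000)"

def faulting_word(code_line):
    """Return the parenthesised word, i.e. the instruction at PC."""
    return int(re.search(r"\(([0-9a-f]+)\)", code_line).group(1), 16)

def decode_ldr_str_imm(word):
    """Decode 'ldr/str Rd, [Rn, #imm]' (condition AL); None otherwise."""
    if word >> 28 != 0xE:                 # only the 'always' condition
        return None
    if (word >> 26) & 0x3 != 0b01 or (word >> 25) & 1:
        return None                       # not an immediate-offset transfer
    p, u, b, w, l = [(word >> n) & 1 for n in (24, 23, 22, 21, 20)]
    if not p or w:
        return None                       # skip post-index and writeback
    rn, rd, imm = (word >> 16) & 0xF, (word >> 12) & 0xF, word & 0xFFF
    op = ("ldr" if l else "str") + ("b" if b else "")
    if imm == 0:
        return "%s r%d, [r%d]" % (op, rd, rn)
    return "%s r%d, [r%d, #%s%d]" % (op, rd, rn, "" if u else "-", imm)

word = faulting_word(CODE_LINE)
print(hex(word), decode_ldr_str_imm(word))
```

The decoded instruction is a load through r0, and r0 in the register dump is 06400040, the same address reported in "Unable to handle kernel paging request". That points at a bad pointer value in r0 rather than at the load instruction itself.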