Detecting Memory Leaks with the Linux Kernel's Kmemleak
1. How to use Kmemleak
a. Add "kmemleak=on" to the bootargs in U-Boot
b. Enable the following options in .config
CONFIG_HAVE_DEBUG_KMEMLEAK=y
CONFIG_DEBUG_KMEMLEAK=y
CONFIG_DEBUG_KMEMLEAK_EARLY_LOG_SIZE=400
c. Mount debugfs
mount -t debugfs nodev /sys/kernel/debug/
If everything went well, a kmemleak file now appears under "/sys/kernel/debug/":
/sys/kernel/debug # ls
... kmemleak suspend_stats
/sys/kernel/debug #
We use an example to show how it works:
static ssize_t workqueue_proc_store(struct file *file, const char __user *buffer,
                                    size_t count, loff_t *ppos)
{
#ifdef MM_LEAK_DEBUG
    /* This 32 KiB block is never freed: it is the deliberate leak. */
    char *mm = kmalloc(32*1024, GFP_KERNEL);
    printk("mem leak,ptr = %p\n", mm);
    if (mm) {
        memset(mm, 0x0, 32*1024);
        /* Reusing mm drops the only reference to the first block. */
        mm = kmalloc(32*1024, GFP_KERNEL);
        printk("mem leak,ptr = %p\n", mm);
        kfree(mm);
    }
#endif
    return count;
}
...
static const struct file_operations workqueue_proc_fops = {
    .open = workqueue_proc_open,
    .read = seq_read,
    .write = workqueue_proc_store,
    .llseek = seq_lseek,
    .release = single_release,
};
...
proc_create("workqueue", 0, NULL, &workqueue_proc_fops);
...
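In the code above, the write handler of the proc node leaks memory: the first 32 KiB block is never freed. Let us see how the system detects it.
- Trigger the memory leak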
/ # echo 1 > /proc/workqueue
mem leak,ptr = 9e3c0000
mem leak,ptr = 9e3c8000
- Wait for the kmemleak scan thread to be scheduled (at most 10 min), or run the following command to force a scan right away
echo scan > /sys/kernel/debug/kmemleak
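Note: keep in mind that the first scan is delayed. In mm/kmemleak.c the scan thread sleeps for 1 min before its first run so that the system can bring up completely; the manual "scan" command itself is not subject to that delay.
- The system then detects the memory leak and tells you where to look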
/ # kmemleak: 2 new suspected memory leaks (see /sys/kernel/debug/kmemleak)
- Inspect the suspected memory leaks
/ # cat /sys/kernel/debug/kmemleak
unreferenced object 0x9e3c0000 (size 32768):
  comm "sh", pid 778, jiffies 4294939511 (age 57.630s)
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<80138228>] proc_reg_write+0x5c/0x84
    [<800ebdd4>] __vfs_write+0x1c/0xd8
    [<800ec618>] vfs_write+0x90/0x170
    [<800ece18>] SyS_write+0x3c/0x90
    [<8000f3c0>] ret_fast_syscall+0x0/0x3c
    [<ffffffff>] 0xffffffff
/ #
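When the report gets long, it can help to pull out only the header lines. The following is a minimal user-space sketch, assuming the header keeps the "unreferenced object 0x... (size N):" form shown above. The names leak_entry, parse_leak_header and collect_leaks are hypothetical helpers made up for this illustration; they are not part of the kernel or of any kmemleak tool.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* One "unreferenced object" header line from the kmemleak report. */
struct leak_entry {
    unsigned long addr;  /* start address of the suspected leak */
    size_t size;         /* size of the block in bytes */
};

/*
 * Parse a header line of the form
 *   unreferenced object 0x9e3c0000 (size 32768):
 * Returns 1 and fills *out on a match, 0 for any other line.
 */
int parse_leak_header(const char *line, struct leak_entry *out)
{
    unsigned long addr;
    size_t size;

    assert(line != NULL && out != NULL);
    if (sscanf(line, "unreferenced object 0x%lx (size %zu)", &addr, &size) != 2)
        return 0;
    out->addr = addr;
    out->size = size;
    return 1;
}

/*
 * Read a report stream line by line, store at most max entries,
 * and return how many header lines were seen in total.
 */
size_t collect_leaks(FILE *report, struct leak_entry *entries, size_t max)
{
    char line[256];
    size_t count = 0;
    struct leak_entry e;

    assert(report != NULL && entries != NULL);
    while (fgets(line, sizeof(line), report)) {
        if (parse_leak_header(line, &e)) {
            if (count < max)
                entries[count] = e;
            count++;
        }
    }
    return count;
}
```

A caller would open /sys/kernel/debug/kmemleak with fopen (as root) and pass the stream to collect_leaks; the comm, hex dump and backtrace lines are simply skipped.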
Note: if the interval of the kmemleak scan thread is too long for you, you can change it by hand in mm/kmemleak.c:
#define SECS_SCAN_WAIT 600 /* subsequent auto scanning delay */
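Rebuilding the kernel is not the only option: the kernel's kmemleak documentation also lists a "scan=<secs>" command that sets the automatic scan period at run time. The sketch below is a small user-space helper that writes one command string to the control file. The function name kmemleak_command and the KMEMLEAK_CTRL macro are assumptions made for this example, and the path assumes debugfs is mounted at /sys/kernel/debug as in step c.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Default control file; assumes debugfs is mounted at /sys/kernel/debug. */
#define KMEMLEAK_CTRL "/sys/kernel/debug/kmemleak"

/*
 * Write one command string (for example "scan" or "scan=60") to the
 * kmemleak control file at path. Returns 0 on success, -1 on failure.
 */
int kmemleak_command(const char *path, const char *cmd)
{
    int fd;
    size_t len;
    ssize_t written;

    assert(path != NULL && cmd != NULL);
    len = strlen(cmd);

    fd = open(path, O_WRONLY);
    if (fd < 0)
        return -1;

    /* The kernel parses the command from a single write. */
    written = write(fd, cmd, len);
    if (close(fd) != 0)
        return -1;
    return (written == (ssize_t)len) ? 0 : -1;
}
```

For example, kmemleak_command(KMEMLEAK_CTRL, "scan=60") would ask for a scan every 60 seconds, and kmemleak_command(KMEMLEAK_CTRL, "scan") corresponds to the "echo scan" used above. Both need root privileges and a kernel with kmemleak enabled.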
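2. A deeper look at Kmemleak
To better understand how Kmemleak detects leaks, we can run the following experiment:
We define a global pointer variable at the top of the file and make tmp_ptr point to the leaked block, as follows: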
char *tmp_ptr = NULL;
...
#ifdef MM_LEAK_DEBUG
    char *mm = kmalloc(32*1024, GFP_KERNEL);
    printk("mem leak,ptr = %p\n", mm);
    tmp_ptr = mm;
    if (mm) {
        memset(mm, 0x0, 32*1024);
        mm = kmalloc(32*1024, GFP_KERNEL);
        printk("mem leak,ptr = %p\n", mm);
        kfree(mm);
    }
#endif
...
We trigger the leak and run kmemleak again:
/ # echo 1 > /proc/workqueue
mem leak,ptr = 9e3c0000
mem leak,ptr = 9e3c8000
/ #
/ # echo scan > /sys/kernel/debug/kmemleak
/ # echo scan > /sys/kernel/debug/kmemleak
/ # echo scan > /sys/kernel/debug/kmemleak
/ #
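Unfortunately, kmemleak does not report this leak, because its scan finds that the pointer tmp_ptr still points to the block. Kmemleak treats any block that is still referenced as live, so the system wrongly concludes that nothing has leaked here.
Let us run one more experiment: call "echo 1 > /proc/workqueue" again to trigger the leak a second time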
/ # echo 1 > /proc/workqueue
mem leak,ptr = 9e3c0000
mem leak,ptr = 9e3c8000
/ # echo scan > /sys/kernel/debug/kmemleak
/ # echo scan > /sys/kernel/debug/kmemleak
/ # echo scan > /sys/kernel/debug/kmemleak
/ # echo 1 > /proc/workqueue
mem leak,ptr = 9e3c8000
mem leak,ptr = 9e3d0000
/ #
/ # echo scan > /sys/kernel/debug/kmemleak
kmemleak: 1 new suspected memory leaks (see /sys/kernel/debug/kmemleak)
Then we look at the reported leak
/ # cat /sys/kernel/debug/kmemleak
unreferenced object 0x9e3c0000 (size 32768):
  comm "sh", pid 776, jiffies 4294938930 (age 326.730s)
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<80138228>] proc_reg_write+0x5c/0x84
    [<800ebdd4>] __vfs_write+0x1c/0xd8
    [<800ec618>] vfs_write+0x90/0x170
    [<800ece18>] SyS_write+0x3c/0x90
    [<8000f3c0>] ret_fast_syscall+0x0/0x3c
    [<ffffffff>] 0xffffffff
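This time the system did detect the first leak. The second write made tmp_ptr point to the newly leaked block at 0x9e3c8000, so nothing references the first block at 0x9e3c0000 any more.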
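The two experiments can be replayed with a toy model in user space. The sketch below is not kernel code and all of its names (block, tracker, track_alloc, scan_for_leaks) are hypothetical. It only assumes the core idea seen above: a block counts as referenced if some scanned word holds an address inside it. The real kmemleak also scans the contents of tracked objects, stacks and data sections, which this model leaves out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_BLOCKS 8

/* A tracked allocation, standing in for one kmemleak object. */
struct block {
    uintptr_t start;   /* first address of the block */
    size_t size;       /* length in bytes */
    int referenced;    /* set when the scan finds a pointer into it */
};

struct tracker {
    struct block blocks[MAX_BLOCKS];
    size_t count;
};

/* Record an allocation that has not been freed. */
void track_alloc(struct tracker *t, uintptr_t start, size_t size)
{
    assert(t != NULL && t->count < MAX_BLOCKS);
    t->blocks[t->count].start = start;
    t->blocks[t->count].size = size;
    t->blocks[t->count].referenced = 0;
    t->count++;
}

/*
 * Scan the root words (here: global variables such as tmp_ptr) and
 * return the number of tracked blocks that no root points into.
 */
size_t scan_for_leaks(struct tracker *t, const uintptr_t *roots, size_t n)
{
    size_t i, j, leaks = 0;

    assert(t != NULL);
    for (j = 0; j < t->count; j++)
        t->blocks[j].referenced = 0;

    for (i = 0; i < n; i++) {
        for (j = 0; j < t->count; j++) {
            struct block *b = &t->blocks[j];
            /* Any address inside the block counts as a reference. */
            if (roots[i] >= b->start && roots[i] - b->start < b->size)
                b->referenced = 1;
        }
    }

    for (j = 0; j < t->count; j++)
        if (!t->blocks[j].referenced)
            leaks++;
    return leaks;
}
```

Tracking 0x9e3c0000 with tmp_ptr set to it gives 0 leaks, which matches the silent scans after the first write. Tracking 0x9e3c8000 as well and moving tmp_ptr to it gives 1 leak, the block at 0x9e3c0000, just as in the report above.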
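3. Returning kmemleak's memory to the system
Once the leaks have been found, the memory that kmemleak holds for its tracking data (the kmemleak objects) can be released back to the system. Note that this frees kmemleak's own bookkeeping, not the leaked blocks themselves. The method is as follows: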
# echo scan=off > /sys/kernel/debug/kmemleak
# echo off > /sys/kernel/debug/kmemleak
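Pay special attention to one point: the kmemleak objects cannot be freed while the kmemleak scan thread is running, so we must stop the scan thread first. The "off" command then goes through the following call chain: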
kmemleak_write
    if (strncmp(buf, "off", 3) == 0)
        kmemleak_disable();
            -> schedule_work(&cleanup_work);
                -> kmemleak_do_cleanup
                    -> __kmemleak_do_cleanup
                        -> delete_object_full