An analysis of the kernel thread function kernel_thread
When a device driver needs several tasks to run concurrently, it can start kernel threads. The function that starts a kernel thread is:
int kernel_thread (int ( * fn )( void * ), void * arg, unsigned long flags);
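kernel_thread creates a new thread.
A kernel thread is in effect a process that shares its parent's address space but has its own kernel stack.
Both kernel threads and ordinary processes are created through do_fork(). The maximum number of processes and threads the system allows is set in fork_init():
[kernel/fork.c/fork_init()]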
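void __init fork_init(unsigned long mempages)
{
#ifndef __HAVE_ARCH_TASK_STRUCT_ALLOCATOR
#ifndef ARCH_MIN_TASKALIGN
#define ARCH_MIN_TASKALIGN L1_CACHE_BYTES
#endif
	/* Create a dedicated slab cache for task_struct objects. */
	task_struct_cachep =
		kmem_cache_create("task_struct", sizeof(struct task_struct),
			ARCH_MIN_TASKALIGN, SLAB_PANIC, NULL, NULL);
#endif
	/*
	 * Set the default maximum number of threads to a safe value, because
	 * the per-thread structures would otherwise use too much memory. With
	 * the formula below, the kernel stacks of max_threads threads occupy
	 * at most one eighth of physical memory.
	 * mempages is the number of physical page frames in the system, i.e.
	 * the physical memory size divided by PAGE_SIZE.
	 * For example, on my machine with 512 MB of memory and 4 KB pages,
	 * and with THREAD_SIZE equal to PAGE_SIZE, the maximum number of
	 * threads that can exist at the same time is
	 * (512*2^20/2^12) / 2^3 = 512*2^5 = 16384
	 * (with 8 KB kernel stacks the same formula gives 8192).
	 */
	max_threads = mempages / (8 * THREAD_SIZE / PAGE_SIZE);
	/*
	 * At least 20 threads are needed to boot the system.
	 */
	if(max_threads < 20)
		max_threads = 20;
	/*
	 * The RLIMIT_NPROC limit of init_task, which its children inherit,
	 * is set to max_threads/2, i.e. half of the total number of threads;
	 * that is 8192 in the example above.
	 */
	init_task.signal->rlim[RLIMIT_NPROC].rlim_cur = max_threads/2;
	init_task.signal->rlim[RLIMIT_NPROC].rlim_max = max_threads/2;
}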
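kernel_thread is defined in arch/i386/kernel/process.c.
fn is a pointer to the function to run, arg is the argument passed to that function, and flags are the clone flags handed to do_fork when the thread is created.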
int kernel_thread(int (*fn)(void *), void * arg, unsigned long flags)
{
	struct pt_regs regs;

	memset(&regs, 0, sizeof(regs));

	regs.ebx = (unsigned long) fn;	/* ebx holds the address of the function */
	regs.edx = (unsigned long) arg;	/* edx holds its argument */

	regs.xds = __USER_DS;
	regs.xes = __USER_DS;
	regs.orig_eax = -1;
	regs.eip = (unsigned long) kernel_thread_helper;
	regs.xcs = __KERNEL_CS;
	regs.eflags = X86_EFLAGS_IF | X86_EFLAGS_SF | X86_EFLAGS_PF | 0x2;

	/*
	 * Use do_fork to create the new thread. It shares the parent's
	 * address space (CLONE_VM) and may not be traced (CLONE_UNTRACED).
	 */
	return do_fork(flags | CLONE_VM | CLONE_UNTRACED, 0, &regs, 0, NULL, NULL);
}
[/arch/i386/kernel/process.c/kernel_thread_helper]
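extern void kernel_thread_helper(void);	/* declaration; the body is the assembly below */
__asm__(".section .text\n"
	".align 4\n"
	"kernel_thread_helper:\n\t"
	"movl %edx,%eax\n\t"
	"pushl %edx\n\t"	/* edx holds the argument; push it on the stack */
	"call *%ebx\n\t"	/* ebx holds the function address; call the function */
	"pushl %eax\n\t"	/* pass the function's return value to do_exit */
	"call do_exit\n"	/* terminate the thread */
	".previous");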
kernel_thread calls do_fork, so how does control get from do_fork to kernel_thread_helper? To find out, we follow do_fork further.
[kernel/fork.c/do_fork()]
long do_fork(unsigned long clone_flags,
	     unsigned long stack_start,
	     struct pt_regs *regs,
	     unsigned long stack_size,
	     int __user *parent_tidptr,
	     int __user *child_tidptr)
{
	....
	....
	p = copy_process(clone_flags, stack_start, regs, stack_size, parent_tidptr, child_tidptr, pid);
	....
	....
}
It calls copy_process to give the child a copy of the parent's process environment and of all its registers.
[kernel/fork.c/do_fork()->copy_process]
static task_t *copy_process(unsigned long clone_flags,
			    unsigned long stack_start,
			    struct pt_regs *regs,
			    unsigned long stack_size,
			    int __user *parent_tidptr,
			    int __user *child_tidptr,
			    int pid)
{
	...
	...
	retval = copy_thread(0, clone_flags, stack_start, stack_size, p, regs);
	...
	...
}
copy_process in turn calls copy_thread, which copies the parent's kernel stack frame to the child and adjusts it as needed.
[/arch/i386/kernel/process.c/copy_thread]:
int copy_thread(int nr, unsigned long clone_flags, unsigned long esp,
	unsigned long unused,
	struct task_struct * p, struct pt_regs * regs)
{
	...
	...
	p->thread.eip = (unsigned long) ret_from_fork;
}
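Here the address of ret_from_fork is assigned to p->thread.eip. p->thread.eip holds the address at which the process resumes the next time it is scheduled, so when the newly created thread is scheduled for the first time, it starts executing at ret_from_fork.
At this point the new thread has been created.
[/arch/i386/kernel/entry.S]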
ENTRY(ret_from_fork)
	pushl %eax
	call schedule_tail
	GET_THREAD_INFO(%ebp)
	popl %eax
	jmp syscall_exit
syscall_exit:
	...
work_resched:
	call schedule
	...
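ret_from_fork jumps to syscall_exit, which eventually restores the register frame saved on the child's kernel stack and returns through it. That frame is a copy of the pt_regs built in kernel_thread, and its eip field points to kernel_thread_helper.
So kernel_thread_helper runs next, and it in turn calls the function we specified.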
Also note how system calls are made from a kernel thread: because the thread runs inside the kernel, it can call the system call implementations, such as sys_open(), directly.
The flags that can be passed to kernel_thread are listed below:
/*
* cloning flags:
*/
#define CSIGNAL 0x000000ff /* signal mask to be sent at exit */
#define CLONE_VM 0x00000100 /* set if VM shared between processes */
#define CLONE_FS 0x00000200 /* set if fs info shared between processes */
#define CLONE_FILES 0x00000400 /* set if open files shared between processes */
#define CLONE_SIGHAND 0x00000800 /* set if signal handlers and blocked signals shared */
#define CLONE_IDLETASK 0x00001000 /* set if new pid should be 0 (kernel only)*/
#define CLONE_PTRACE 0x00002000 /* set if we want to let tracing continue on the child too */
#define CLONE_VFORK 0x00004000 /* set if the parent wants the child to wake it up on mm_release */
#define CLONE_PARENT 0x00008000 /* set if we want to have the same parent as the cloner */
#define CLONE_THREAD 0x00010000 /* Same thread group? */
#define CLONE_NEWNS 0x00020000 /* New namespace group? */
#define CLONE_SYSVSEM 0x00040000 /* share system V SEM_UNDO semantics */
#define CLONE_SETTLS 0x00080000 /* create a new TLS for the child */
#define CLONE_PARENT_SETTID 0x00100000 /* set the TID in the parent */
#define CLONE_CHILD_CLEARTID 0x00200000 /* clear the TID in the child */
#define CLONE_DETACHED 0x00400000 /* Unused, ignored */
#define CLONE_UNTRACED 0x00800000 /* set if the tracing process can't force CLONE_PTRACE on this clone */
#define CLONE_CHILD_SETTID 0x01000000 /* set the TID in the child */
#define CLONE_STOPPED 0x02000000 /* Start in stopped state */
/*
* List of flags we want to share for kernel threads,
* if only because they are not used by them anyway.
*/
#define CLONE_KERNEL (CLONE_FS | CLONE_FILES | CLONE_SIGHAND)
The flags value most commonly used for kernel threads is CLONE_KERNEL.