Friday, June 08, 2012

Is linux kernel preemptive or not?

Yes, the kernel is preemptive.
It has been preemptive by default since the 2.6 branch. The old Linux kernel is not preemptive, which means that a process can be preempted only while running in User Mode; non preemptive kernel design is much simpler, since most synchronization problems involving the kernel data structures are easily avoided Prior to Linux kernel version 2.5.4, Linux Kernel was not preemptive which means a process running in kernel mode cannot be moved out of processor until it itself leaves the processor or it starts waiting for some input output operation to get complete.
  Generally a process in user mode can enter into kernel mode using system calls. Previously when the kernel was non-preemptive, a lower priority process could priority invert a higher priority process by denying it access to the processor by repeatedly calling system calls and remaining in the kernel mode. Even if the lower priority process' timeslice expired, it would continue running until it completed its work in the kernel or voluntarily relinquished control. If the higher priority process waiting to run is a text editor in which the user is typing or an MP3 player ready to refill its audio buffer, the result is poor interactive performance. This way non-preemptive kernel was a major drawback at that time.
A preemptive kernel is one that can be interrupted in the middle of a executing code - for instance in response for a system call - to do other things and run other threads, possible that are not in the kernel.
The main advantage in a preemptive kernel is that sys-calls do not block the entire system. if a sys-call takes a long time to finish then it doesn't mean the kernel can't do anything else in this time.
The main disadvantage is that this introduces more complexity to the kernel code, having to handle more end-cases, perform more fine grained locking or use lock-less structures and algorithms.
preemption (more correctly pre-emption) is the act of temporarily interrupting a task being carried out by a computer system, without requiring its cooperation, and with the intention of resuming the task at a later time. Such a change is known as a context switch. It is normally carried out by a privileged task or part of the system known as a preemptive scheduler, which has the power to preempt, or interrupt, and later resume, other tasks in the system.
$ strace echo test                                
execve("/bin/echo", ["echo", "test"], [/* 45 vars */]) = 0       (execve, mmap are system calls)         
brk(0)                                  = 0xc90000  
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f4210db7000
access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)         
mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f4210db5000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f4210c5d000
write(1, "test\n", 5test
)                   = 5
close(1)                                = 0
munmap(0x7f4210c5d000, 4096)            = 0
close(2)                                = 0
exit_group(0)                           = ?

No comments: