LinuxWorld
Subscribe to this site with RSS

Kernel space: Linux runs into a scalability problem

Why is modprobe so slow? Avoiding bottlenecks on 4096-processor systems

Part of the fun of working with truly large machines is that one gets to discover new scalability surprises before anybody else. So the SGI folks often have more fun than many of the rest of us. Their latest discovery has to do with the number of kernel threads which, on a 4096-processor system, leads to some interesting kernel behavior.

To begin, they found out that they could not even boot a kernel with the default configuration. Linux systems normally have a limit of 32768 active processes at any given time. Anybody who has run "ps" will have noted that kernel threads are taking up an increasing number of those slots; your editor's single-processor desktop is running 39 of them. In fact, there are now enough kernel threads on a typical system that they will fill that entire space - and more - on a 4096-CPU machine. This problem is relatively easy to take care of by raising the limit on the number of processes. But it gets more interesting from there.

The init process is the parent of last resort for every other process on the system, including kernel threads. So, on a big system, init has a lot of child processes. These children live on a big linked list; that list must be searched by various functions, including the variants of wait(). If the process being searched for is toward the end of the list, that search can take a long time. Since (1) most kernel threads are long-lived, and (2) new processes are put at the end of the list, chances are that a search will, indeed, be looking for a process at the end.

Then, for the ultimate in fun, load a module into the kernel. The module loading process calls stop_machine_run() when the new module is being linked in; this function creates a high-priority kernel thread for each processor on the system. That thread will grab its assigned CPU and simply sit there until told to exit; while all CPUs are locked up in this way the linking process can be performed. Calling a function like stop_machine_run() is a somewhat antisocial act in the best of times. But, in the 4096-processor system, stop_machine_run() will create 4096 threads, each of which goes on the end of init's child list, and each of which must be searched for when the time comes to clean it up. The result is a system which simply stops for an extended period of time.

React: Give us your thoughts on the issues here.
Use this form to start a public discussion with other Linux World users on this article.
Log In | Register for an account (Why you should)

Note: Register to have your user name appear; otherwise your comment will show up as "Anonymous."

*Anonymous comments will only appear once they are approved by the moderator.

Featured Whitepapers
Newsletter sign-up

Sign up for one of Network World's newsletters compliments of Linux World

Linux & Open Source News Alert
Web Applications Alert
Video and Podcast Alert
Security Alert
Virtualization Alert

Email Address: