Linux has long been an outstanding operating system for a wide range of users in a variety of settings. However, high-performance computing (HPC) users, who must run applications on thousands of nodes, have historically faced challenges that Linux could not effectively address.
Sidebar: Linux vs. Catamount running ocean modeling application
These issues arise for several reasons. First, installing a full, untuned copy of Linux (or of any full-scale operating system) on each node of a large-scale HPC system interferes with the efficient use of processor and communication resources. HPC users have also found that some inherent attributes of Linux, such as the various daemons and services that run by default, can impede application performance as the operating system scales to larger numbers of processors.
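One way to see why background daemons hurt more as node counts grow is a simple probability sketch. It is an illustration under stated assumptions, not a model taken from this article: it assumes the application synchronizes all nodes at every step, that each node is independently interrupted by a background service during a step with probability `p_interrupt`, and that a single interrupted node delays the whole step. The function name and the sample probability are hypothetical.

```python
def prob_step_delayed(p_interrupt: float, nodes: int) -> float:
    """Chance that at least one of `nodes` nodes is interrupted in a step.

    Assumption (illustrative only): interruptions are independent across
    nodes, and one interrupted node stalls the whole synchronized step.
    """
    if not 0.0 <= p_interrupt <= 1.0:
        raise ValueError("p_interrupt must be between 0 and 1")
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    # P(no node interrupted) = (1 - p)^N, so P(at least one) is its complement.
    return 1.0 - (1.0 - p_interrupt) ** nodes


if __name__ == "__main__":
    # Hypothetical per-node, per-step interruption probability of 0.1 percent.
    for n in (1, 100, 1000, 10000):
        print(f"{n:>6} nodes: {prob_step_delayed(0.001, n):.3f}")
```

Under these assumptions, an interruption that affects a single node only 0.1 percent of the time delays roughly 63 percent of steps on 1,000 nodes, which is why trimming default services matters far more at scale than on one machine.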
Given these issues, the largest-scale HPC facilities have traditionally employed alternative specialized lightweight operating systems on compute nodes, while using Linux at the system level. Unfortunately, this strategy is not viable for all types of HPC users. After all, a specialized operating system tuned explicitly for a particular application environment simply cannot provide the breadth of services and features that may be required by users in enterprises and other types of HPC environments.
The ideal solution for many HPC users would be a combination of full-blown Linux at the system level, with compute nodes employing a lightweight Linux that is optimized for HPC systems. Today, Cray and others in the HPC community are working to deliver just that. In the short term, this “Linux on Compute Node” strategy will offer the greatest benefits to users of larger-scale HPC systems, allowing them to achieve better application performance without sacrificing the familiarity and feature set of Linux. However, as enterprise HPC users and applications continually demand greater scalability and more processors, this innovation ultimately may extend significant advantages to users in all types of HPC environments.
The biggest problem HPC users have with running full-blown Linux on all compute nodes is that Linux was designed primarily for enterprise environments, supporting desktop and server workloads. As a result, Linux is optimized for “capacity operation”: delivering the greatest possible throughput when the operating system must handle many small jobs, and fast single-node interactive response, such as prompt processing of Web server requests. In an HPC environment, however, users care more about “capability operation,” that is, achieving the best possible performance of a single application running across the entire system.