LinuxWorld
Subscribe to this site with RSS

Kernel space: The Linux trace toolkit's next generation

Kernel debugging systems can slow the kernel enough to hide or change the bugs they're looking for. A new tracing system would minimize the performance penalty of running with tracing on.

Instrumenting a running kernel for debugging or profiling is on the wish list of many administrators and developers. Advocates of OpenSolaris like to point to DTrace as a feature that Linux lacks, though SystemTap has started to close that gap. The Linux Trace Toolkit next generation (LTTng) takes a different approach and was recently submitted for inclusion in the kernel (in two patches: architecture independent and architecture dependent).

LTTng relies upon kernel markers to provide static probe points for its kernel tracing activities. It also provides the ability to trace userspace programs and combine that data with kernel tracing data to give a detailed view of the internals of the system. Unlike other tools, LTTng takes a post-processing approach, storing the data away as efficiently as possible for later analysis. This is in contrast to SystemTap and DTrace, which have their own mini-languages that specify what to do as each trace point is reached.

One of the major design goals of LTTng is to have as little impact on the system as possible, not only when it is actually tracing events, but also when it is disabled. Kernel hackers are quite resistant to debugging solutions that add any significant performance penalty when not in use. In addition, any significant delays while enabled may change the system timing such that the bug or condition being studied does not occur. For this reason, LTTng does not take the path that various dynamic tracing solutions have used and avoids the expense of a breakpoint interrupt by using the static markers.

Another major design goal is to provide monotonically increasing timestamp values for events. The original LTT uses timestamps derived from the kernel Network Time Protocol (NTP) time, which can fluctuate somewhat as adjustments are made— sometimes going backward. LTTng uses a timestamp derived from the hardware clocks that will work on various processor architectures and clock speeds. In addition, the timestamps can be correlated between different processors in a multi-processor system.

React: Give us your thoughts on the issues here.
Use this form to start a public discussion with other Linux World users on this article.
Log In | Register for an account (Why you should)

Note: Register to have your user name appear; otherwise your comment will show up as "Anonymous."

*Anonymous comments will only appear once they are approved by the moderator.

Featured Whitepapers
Newsletter sign-up

Sign up for one of Network World's newsletters compliments of Linux World

Linux & Open Source News Alert
Web Applications Alert
Video and Podcast Alert
Security Alert
Virtualization Alert

Email Address: