Long ago, the prerequisite UNIX performance book was Adrian Cockcroft's
1994 classic, Sun Performance and Tuning: SPARC & Solaris, later
updated in 1998 as Java and the Internet. As Solaris evolved to include
the invaluable DTrace observability features, new essential performance
references have been published, such as Solaris Performance and Tools:
DTrace and MDB Techniques for Solaris 10 and OpenSolaris (2006) by
McDougall, Mauro, and Gregg, and DTrace: Dynamic Tracing in Oracle
Solaris, Mac OS X and FreeBSD (2011), also by Mauro and Gregg.
Much has occurred in Solaris Land since those books appeared, notably
Oracle's acquisition of Sun Microsystems in 2010 and the demise of the
OpenSolaris community. But operating system technologies have
continued to improve markedly in recent years, driven by stunning
advances in multicore processor architecture, virtualization, and
the massive scalability requirements of cloud computing.
A new performance reference was needed, and I eagerly waited for
something that thoroughly covered modern, distributed computing
performance issues from the ground up. Well, there's a new classic
now, authored yet again by Brendan Gregg, former Solaris kernel
engineer at Sun and now Lead Performance Engineer at Joyent.
Systems Performance: Enterprise and the Cloud is a modern, very
comprehensive guide to general system performance principles and
practices, as well as a highly detailed reference for specific UNIX
and Linux observability tools used to examine and diagnose operating
system behavior. It provides thorough definitions of terms,
explains performance diagnostic Best Practices and "Worst Practices"
(called "anti-methods"), and covers key observability tools
including DTrace, SystemTap, and all the traditional UNIX utilities
like vmstat, ps, iostat, and many others.
The book focuses on operating system performance principles and
expands on these with respect to Linux (Ubuntu, Fedora, and CentOS
are cited), and to Solaris and its derivatives [1]; it is not
directed at any one OS, so it is extremely useful as a broad
performance reference.
The author goes beyond the intricacies of performance analysis and
shows how to interpret and visualize statistical information
gathered from the observability tools. It's often difficult to
extract understanding from voluminous rows of text output, and
techniques are provided to assist with summarizing, visualizing, and
interpreting the performance data.
Gregg includes myriad useful references from the system performance
literature, including a "Who's Who" of contributors to this great
body of diagnostic tools and methods.
This outstanding book should be required reading for UNIX
and Linux system administrators as well as anyone charged with
diagnosing OS performance issues. Moreover, the book can
easily serve as a textbook for a graduate-level course in operating
systems [2].
[1] Solaris 11, of course, and Joyent's SmartOS (developed from
OpenSolaris)
[2] Gregg has taught system performance seminars for many years; I
have also taught such courses. This book would be perfect for the
OS component of an advanced CS curriculum.