
ISBN 1-56592-625-0 Price approximately £32 UK
David HM Spector
9 Chapters, 4 Appendices, Glossary and Index in 332 pages
Chapter List
1 Introduction
2 Basic Concepts
3 Designing Clusters
4 Building Clusters
5 Software Installation and Configuration
6 Managing Clusters
7 Tools and Libraries for Parallel Programming
8 Programming in a Parallel Environment
9 Application Examples
Appendix A Resources
Appendix B Message Passing APIs
Appendix C Installation Scripts
Appendix D The Cluster Administration Database
Building Linux Clusters is about putting computers together in a cluster and making them do something that they wouldn't do otherwise. It is, of course, based around the Beowulf project, and NASA has done much work on "Beo" clusters, as they are known to many physicists and computing boffins.
Review
If you are a professional physicist or an IT expert you will have heard of the Beowulf project and of its legendary ability to make supercomputers out of the most mundane hardware. To quote from the preface:
"Back in 1994, a couple of research scientists, Thomas Sterling and Donald Becker, at NASA's Center of Excellence in Space Data and Information Sciences (CESDIS) in Greenbelt, Maryland, embarked on a project to build a parallel computer out of off the shelf components. They wanted to make a low cost, yet efficient, system for processing large space science data sets. Their first system, a 16 node network of workstations, was constructed out of Intel DX4 processors. It was interconnected by a novel channel bonding method that allowed them to tie together multiple 10Mbit/second Ethernets to balance network performance without the use of then expensive network switching systems. With the addition of message passing software such as the Parallel Virtual Machine (PVM) system, they were able to build a quite effective parallel processing computer on a shoe string budget. They named the results of their work "Beowulf". Their experiment in distributed processing was an unmitigated success, principally because of the underlying system software they chose to base their efforts on: Linux."
I knew this to be the way things were before I read the book, and had at that time sent some mail to the people at NASA to ask a few questions. However, for the first-time reader of clustering software and parallel computing, the idea that a few 486s might do something for you may come as something of a shock. It should not. More than two Cray operators have told me that they can't understand why the Cray computer is still around.
Chapter one kicks off with a brilliant but brief account of the history of computers and computing science. It explains how things were in the 1950s and 1970s and how things changed after Intel and other companies came along. As so often happens in fields of discovery, science fiction became science fact, and we now use computers that would never have been thought of even as little as twenty years ago. There are a few very useful pages on what clusters can be used for. Chapter two goes into basic concepts and why we need clusters.
Chapter three introduces, in a few brief words, a topic that not many of us have read about or been involved in. Designing clusters is one of those things that most people assume is done by someone else. If you haven't built a cluster before, then this is the part of the book that will interest you the most. Building clusters, in chapter four, is the part I really like: the bit where you actually get your hands on the tools and build it. Next comes software installation and configuration. This is really good, the kind of thing that should be shown in every modern art gallery. Very much Tate Modern.
The sixth chapter goes into managing clusters. Most people need to know this kind of thing, although I've heard an in-joke that the Caltech janitor could take care of it. There are some good graphical examples in this book that show GUI tools that make cluster management easy to understand. It reminds me of the Using Samba book, which is also much easier to understand than the others.
The really sophisticated part of the book comes at chapter seven. Tools and Libraries for Parallel Programming is very helpful, but perhaps another hundred pages would have been useful? Programming in a Parallel Environment is also helpful, and not the sort of thing you'll find on a shelf just anywhere. The final chapter, on application examples, is invaluable.
Colophon