LinuxDevCenter.com

oreilly.comSafari Books Online.Conferences.

We've expanded our Linux news coverage and improved our search! Search for all things Linux across O'Reilly!

Search
Search Tips

advertisement

Listen Print Subscribe to Linux Subscribe to Newsletters

Linux Clusters - Using Linux for Power Computing
Pages: 1, 2

How do they work?

Linux clusters are a novel application of standard programming techniques writ very large. All modern software is written in a "divide and conquer" fashion. That is, a problem is analyzed and broken down into easily solvable parts and then program elements (functions, subroutines, or methods, depending upon your terminology or style of programming) are built to solve a particular piece of the problem. Usually this process is executed on a single CPU or sometimes on a symmetric multiprocessor (SMP) system using a technique called "multithreading" to allow a single program to spin off subtasks that can execute some of the subroutines.



Multithreading is a good way to squeeze performance and efficiency out of a computer, but performance gains are limited by the absolute amount of computer power you have in a given machine, and there are a limited number of processors that can be put into a single system, usually fewer than 64.

This works very well for most problems that are fairly linear in their use of processor time, but large-scale computation problems need a more powerful solution. Some classes of problems, including weather forecasting, the design and simulation of semiconductors or aviations systems, and data mining (such as looking for patterns in customer buying habits) require immense amounts of computing time and often both consume and generate many gigabytes of data. Problems such as these cannot be done with any accuracy in reasonable amounts of time (i.e., before the results would be rendered worthless) on regular off-the-shelf computer systems. These kinds of problems also require specialized programming techniques that can be used to break down the problem-solving program into bite-sized chunks, but the data too must be structured in such a way that the program can easily access it.

In order to make a pile of PCs into a supercomputer, software is needed to help manage the information flow between nodes in the cluster and to help distribute the work. The most common packages used to do this are the Parallel Virtual Machine (PVM) and the Local Area Multicomputer (LAM). These packages, the first developed at Oak Ridge National Labs and the second by Notre Dame University, are programming libraries that implement a message passing system that allows nodes on a network to cooperatively work on a problem.

What can you do with them?

As mentioned, Beowulf clusters are being used for a large variety of applications, including weather forecasting and high-energy physics problems such as the modeling of black holes. On a more down-to-earth level Beowulf clusters can be used to create lifelike animations or other computer-generated graphics -- films such as The Matrix, Titanic, and Toy Story all made extensive use of clustered computers to generate the huge amount of imagery that was required to make these films. Clusters are also being used more and more for applications such as data mining, simulation of semiconductors, CAD systems for developing everything from packaging to sneakers, and even the sequencing of the human genome.

Future of clusters

As the cost of hardware and storage continues to decline, the use of clusters will increase dramatically. Already most database vendors support SMP machines and several, including Oracle, are starting to release versions of their software for Linux clusters. As these mainstay packages become available, they will drive the availability of a whole new class of applications that run on these machines, ranging from serious business applications to entertainment applications such as online gaming systems.

Related Links

• The Beowulf Project

• Beowulf Underground

• IBM's DB2 product family

• Oracle on Linux

• TiVo Personal Television Service

• LAM / MPI Parallel Computing

• PVM: Parallel Virtual Machine

• Top 500 Supercomputer Sites

• Building Linux Clusters

Another effect of the ever-increasing capability of computer hardware is its ever-decreasing size. Traditional supercomputers are very large beasts, and, until recently, clusters were even larger since they are, by definition, collections of rack-mounted servers. As system sizes decrease, the physical size of a cluster will decrease as well, while the overall computational capabilities will increase. Right now a 48-node cluster made of 2U (3 ½" high) rack-mount servers would take one and a half 70" rack enclosures and weigh over 1000 pounds -- and this wouldn't even include the networking gear! In the near future, with the availability of 1U (1 ¾" high) systems, a 48-node cluster will soon fit in a closet, and with upcoming and single-board computers, the same cluster will fit under your desk. All of these developments will bring more and more compute-power to bear on your problems.

Want to get in on the fun? Build your own cluster!

One of the nice things about having a regular column is that you get a soapbox for your own projects! As the author of the upcoming O'Reilly book Building Linux Clusters, I recommend that if you're interested in getting your feet wet in the world of clusters and parallel computing, this is a good place to start. Building Linux Clusters is a hand-on introduction to cluster building that will help you decide what hardware to use, whether it's all new systems or older systems you may have "lying around," and help you get your cluster up and running in a matter of hours with a customized version of Red Hat Linux geared for cluster-building.

David HM Spector is President & CEO of Really Fast Systems, LLC, an infrastructure consulting and product development company based in New York


Read more Linux in the Enterprise columns.

Discuss this article in the O'Reilly Network Linux Forum.

Return to the Linux DevCenter.

 




Tagged Articles

Be the first to post this article to del.icio.us

Sponsored Resources

  • Inside Lightroom
Advertisement

Sponsored by:

O'Reilly Media

©2009, O'Reilly Media, Inc.
(707) 827-7000 / (800) 998-9938
All trademarks and registered trademarks appearing on oreilly.com are the property of their respective owners.
About O'Reilly
Academic Solutions
Authors
Contacts
Customer Service
Jobs
Newsletters
O'Reilly Labs
Press Room
Privacy Policy
RSS Feeds
Terms of Service
User Groups
Writing for O'Reilly
Content Archive
Business Technology
Computer Technology
Google
Microsoft
Mobile
Network
Operating System
Digital Photography
Programming
Software
Web
Web Design
More O'Reilly Sites
O'Reilly Radar
Ignite
Tools of Change for Publishing
Digital Media
Inside iPhone
O'Reilly FYI
makezine.com
craftzine.com
hackszine.com
perl.com
xml.com

Partner Sites
InsideRIA
java.net
O'Reilly Insights on Forbes.com