Beowulf Cluster Computing with Linux (Scientific and Engineering Computation)

William Gropp

Language: English

Pages: 660

ISBN: 0262692929

Format: PDF / Kindle (mobi) / ePub

Use of Beowulf clusters (collections of off-the-shelf commodity computers programmed to act in concert, resulting in supercomputer performance at a fraction of the cost) has spread far and wide in the computational science community. Many application groups are assembling and operating their own "private supercomputers" rather than relying on centralized computing centers. Such clusters are used in climate modeling, computational biology, astrophysics, and materials science, as well as non-traditional areas such as financial modeling and entertainment. Much of this new popularity can be attributed to the growth of the open-source movement.

The second edition of Beowulf Cluster Computing with Linux has been completely updated; all three stand-alone sections have important new material. The introductory material in the first part now includes a new chapter giving an overview of the book and background on cluster-specific issues, including why and how to choose a cluster, as well as new chapters on cluster initialization systems (including ROCKS and OSCAR) and on network setup and tuning. The information on parallel programming in the second part now includes chapters on basic parallel programming and available libraries and programs for clusters. The third and largest part of the book, which describes software infrastructure and tools for managing cluster resources, has new material on cluster management and on the Scyld system.















area networks (e.g., Myrinet) and run widely available low-cost or no-cost software for managing system resources and coordinating parallel execution. Such systems exhibit exceptional price/performance for many applications. Cluster farms: existing local area networks of PCs and workstations serving either as dedicated user stations or servers that, when idle, can be employed to perform pending work from outside users. Exploiting job stream parallelism,

systems. To build an effective parallel computer, you should start with the best uniprocessor. Of course, this tendency must be tempered by cost. The overall price/performance ratio for your favorite application is probably the most important consideration. The highest performance processor at any point in time rarely has the best price/performance ratio. Usually it is the processor one generation or one-half generation behind the cutting edge that is available with the most attractive ratio of
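The price/performance comparison described in this excerpt can be sketched in a few lines of shell. The processor names, prices, and benchmark scores below are hypothetical placeholders, not figures from the book; substitute measurements from your own application.

```shell
#!/bin/sh
# Sketch: pick the processor with the best performance per dollar.
# Columns (all values hypothetical): name, price in dollars, application benchmark score.
best=$(printf '%s\n' \
  'cutting-edge 1000 100' \
  'one-generation-back 500 80' \
  'two-generations-back 300 40' |
  awk '{
         ratio = $3 / $2                 # performance per dollar
         if (ratio > max) { max = ratio; name = $1 }
       }
       END { print name }')

echo "$best"
```

With these made-up numbers the fastest processor is not the one with the best ratio, which is the pattern the excerpt describes.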

floppy disk. The Linux kernel can dynamically load the MSDOS file system kernel module when it detects a request to mount an MSDOS file system. The resident size of the kernel remains small until it needs to dynamically add more functionality. By moving as many features as possible out of the kernel core and into dynamically loadable modules, the legendary stability of Linux compared with legacy operating systems is achieved. Linux distributions, in an attempt to support as many different

‘proc’ and ‘mnt’ directories are empty, as they will be used as mount points during the cloning process. The ‘dev’ directory contains all the standard Linux device files. Device files are special, and cannot be copied normally. The easiest way to create this directory is by letting tar do the work by executing the following command as root:

    tar -C / -c -f - dev | tar xf -

This will create a ‘dev’ directory containing all the device files found on the worldly node. All the remaining directories are
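The tar pipeline in this excerpt can be tried safely on an ordinary directory before running it as root against the real device files. The following is a minimal sketch, assuming a POSIX shell with `mktemp -d` and `tar` available. The scratch directories and the stand-in file `null.txt` are hypothetical and are not part of the book's cloning procedure.

```shell
#!/bin/sh
# Sketch: copy a directory tree through a tar pipe, as the excerpt does for 'dev'.
set -eu

work=$(mktemp -d)                               # hypothetical scratch area
mkdir -p "$work/src/dev" "$work/dst"
echo "placeholder" > "$work/src/dev/null.txt"   # stand-in for a device file

# The first tar changes to the source root (-C), archives 'dev', and writes
# the archive to stdout (-f -). The second tar reads the archive from stdin
# and unpacks it under the destination directory.
tar -C "$work/src" -c -f - dev | tar -C "$work/dst" -x -f -

copied=$(cat "$work/dst/dev/null.txt")
echo "$copied"

rm -rf "$work"                                  # clean up the scratch area
```

The command in the excerpt omits `-C` on the extracting side, so it unpacks into whatever the current directory is. Naming the destination explicitly, as here, is a design choice that avoids unpacking into the wrong place.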

nodes. The use of utilities such as userdel (or deluser) and groupdel helps to ensure that the worldly node files are not simultaneously updated, which is a good first step in maintaining consistency. The alternative method of managing user accounts is to use a directory service, such as NIS, to store user account information. NIS stores all directory data in a central server, which is contacted by clients (the nodes) to perform authentication. This eases system administration considerably, since
