Cluster computing: the state-of-the-art in theory and practice
Rapid improvements in network and processor performance are revolutionizing high-performance computing, transforming clustered commodity workstations into the supercomputing solution of choice. This book brings together contributions from more than 100 leading practitioners, offering a single source for up-to-the-minute information on virtually every key system-related issue in high-performance cluster computing. The book contains expert coverage of commodity supercomputing systems and architectures; Internet-based wide area metacomputing systems; the role of Java; new applications and algorithms; advanced techniques for enhancing availability and throughput; and much more.
Discover the state-of-the-art in:
* Communal multiprocessing/adaptive parallelism techniques for resource sharing
* Networking, lightweight protocols, active messages, killer switches, and I/O
* Cluster middleware and resource management systems
* Cluster computing programming environments, tools, and paradigms
* Administering high-performance clustered systems
High Performance Cluster Computing, Volume 1: Architectures and Systems captures the
High Performance Cluster Computing contains academic articles concerning supercomputing collected from researchers around the world. Though the book is targeted primarily at graduate students and researchers in computer science, the general reader may find great value in its overview of the current state of high-performance computing.
Computer science experts address many aspects of high performance computing, beginning with the state-of-the-art concepts and basic terminology related to cluster computing. Their investigations provide immediate solutions to engineering problems like optimized node arrangements for low-cost workstations yoked together to solve problems in parallel. One article describes such a cluster created for the Department of Energy that uses 9,000 Pentium CPUs to model nuclear detonations.
Various contributors also consider the requirements necessary for improving parallel programs in terms of speed and logic, including reductions in network latencies and enhanced file and I/O access. One contributor even suggests that Network RAM--unused RAM in systems on the same network--may someday challenge the hard disk for fast--and permanent--data storage.
In all, High Performance Cluster Computing works as an up-to-date, central repository of current thinking on interconnecting computers and processors to improve speed and performance. It provides a valuable roadmap of the state of the art in computer science research as well as some potential benefits for forward-looking corporate computing professionals. --Richard Dragan
From the Inside Flap:
Preface
The initial idea leading to cluster computing was developed in the 1960s by IBM as a way of linking large mainframes to provide a cost-effective form of commercial parallelism. During those days, IBM's HASP (Houston Automatic Spooling Priority) system and its successor, JES (Job Entry Subsystem), provided a way of distributing work to a user-constructed mainframe cluster. IBM still supports clustering of mainframes through its Parallel Sysplex system, which allows the hardware, operating system, middleware, and system management software to provide dramatic performance and cost improvements while permitting large mainframe users to continue to run their existing applications.
However, cluster computing did not gain momentum until three trends converged in the 1980s: high performance microprocessors, high-speed networks, and standard tools for high performance distributed computing. A possible fourth trend is the increased need for computing power in computational science and commercial applications, coupled with the high cost and low accessibility of traditional supercomputers. These building blocks are also known as killer-microprocessors, killer-networks, killer-tools, and killer-applications, respectively. The recent advances in these technologies and their availability as cheap commodity components are making clusters or networks of computers (PCs, workstations, and SMPs) an appealing vehicle for cost-effective parallel computing. Clusters, built using commodity-off-the-shelf (COTS) hardware components as well as free, or commonly used, software, are playing a major role in redefining the concept of supercomputing.
The trend in parallel computing is to move away from specialized traditional supercomputing platforms, such as the Cray/SGI T3E, to cheaper and general purpose systems consisting of loosely coupled components built up from single or multiprocessor PCs or workstations. This approach has a number of advantages, including being able to build a platform for a given budget which is suitable for a large class of applications and workloads.
This book is motivated by the fact that parallel computing on a network of computers using commodity components has received increased attention recently, and noticeable progress towards usable systems has been made. A number of researchers in academia and industry have been active in this field of research. Although research in this area is still in its early stage, promising results have been demonstrated by experimental systems built in academic and industrial laboratories. There is a need for better understanding of what cluster computing can offer, how cluster computers can be constructed, and what the impacts of clustering on high performance computing will be.
Though a significant number of research articles have been published in various conference proceedings and journals, the results are scattered in many places, are hard to obtain, and are difficult to understand, especially for beginners. This book, the first of its kind, gathers current and comprehensive technical coverage of the field in one place and presents it in a tutorial form. The book's coverage reflects the state-of-the-art in high-level architecture, design, and development, and points out possible directions for further research and development.
Organization
This book is a collection of chapters written by leading scientists active in the area of parallel computing using networked computers. The primary purpose of the book is to provide an authoritative overview of this field's state-of-the-art. The emphasis is on the following aspects of cluster computing:
Requirements, Issues, and Services
System Area Networks, Communication Protocols, and High Performance I/O Techniques
Resource Management, Scheduling, Load Balancing, and System Availability
Possible Models for Cluster-based Parallel Systems
Programming Models and Environments
Algorithms and Applications of Clusters
The work on High Performance Cluster Computing appears in two volumes:
Volume 1: Systems and Architectures
Volume 2: Programming and Applications
This book, Volume 1, consists of 36 chapters, which are grouped into the following four parts:
Part I: Requirements and General Issues
Part II: Networking, Protocols, and I/O
Part III: Process Scheduling, Load Sharing, and Balancing
Part IV: Representative Cluster Systems
Part I focuses on cluster computing requirements and issues related to components, single system image, high performance, high availability, scalability, deployment, administration, and wide-area computing. Part II covers system area networks, light-weight communication protocols, and I/O. Part III discusses techniques and algorithms of process scheduling, migration, and load balancing along with representative systems. Part IV covers system architectures of some of the popular academic and commercial cluster-based systems such as Beowulf and SP/2.
Readership
The book is primarily written for graduate students and researchers interested in the area of parallel and distributed computing. However, it is also suitable for practitioners in industry and government laboratories.
The interdisciplinary nature of the book is likely to appeal to a wide audience. Readers will find this book to be a valuable source of information on recent advances and future directions of parallel computation using networked computers. This is the first book addressing various technological aspects of cluster computing in depth, and we expect that the book will be an informative and useful reference in this new and fast-growing research area.
The organization of this book makes it particularly useful for graduate courses. It can be used as a text for a research-oriented or seminar-based advanced graduate course. Graduate students will find the material covered by this book to be stimulating and inspiring. Using this book, they can identify interesting and important research topics for their Master's and Ph.D. work. It can also serve as a supplementary book for regular courses, taught in Computer Science, Computer Engineering, Electrical Engineering, and Computational Science and Informatics Departments, including:
Advanced Computer Architecture
Advances in Networking Technologies
High Performance Distributed Computing
Distributed and Concurrent Systems
High Performance Computing
Trends in Distributed Operating Systems
Cluster Computing and its Architecture
Rajkumar Buyya
Monash University, Melbourne, Australia (firstname.lastname@example.org / rajkumar@ieee)
February, 1999