Wednesday, May 21, 2008

A cluster Computer and Its Architecture

A cluster is a type of parallel or distributed processing system, which consists of a collection of interconnected stand-alone computers working together as a single, integrated computing resource.
A cluster is a type of parallel or distributed processing system, which consists of a collection of interconnected stand-alone computers working together as a single, integrated computing resource.
A computer node can be a single or multiprocessor system (PCs, workstations, or SMPs) with memory, I O facilities, and an operating system. A cluster generally refers to two or more computers (nodes) connected together. The nodes can exist in a single cabinet or be physically separated and connected via a LAN. An inter- connected (LAN-based) cluster of computers can appear as a single system to users and applications. Such a system can provide a cost-evective way to gain features and benefits (fast and reliable services) that have historically been found only on more expensive proprietary shared memory systems. The typical architecture of a cluster is shown in Figure above

The following are some prominent components of cluster computers:
Multiple High Performance Computers (PCs, Workstations, or SMPs)
State-of-the-art Operating Systems (Layered or Micro-kernel based)
High Performance Networks Switches (such as Gigabit Ethernet and Myrinet)
Network Interface Cards (NICs)
Fast Communication Protocols and Services (such as Active and Fast Messages)
Cluster Middleware (Single System Image (SSI) and System Availability Infrastructure)
Hardware (such as Digital (DEC) Memory Channel, hardware DSM, and SMP techniques)
Operating System Kernel or Gluing Layer (such as Solaris MC and GLUn ix)
Applications and Subsystems
Applications (such as system management tools and electronic forms)
Run-time Systems (such as software DSM and parallel file-system)
Resource Management and Scheduling software (such as LSF (Load Sharing Facility) and CODINE (COmputing in DIstributed Net-worked Environments))
Parallel Programming Environments and Tools (such as compilers, PVM (Parallel Virtual Machine), and MPI (Message Passing Interface))
Applications
Sequential
Parallel or Distributed
The network interface hardware acts as a communication processor and is responsible for transmitting and receiving packets of data between cluster nodes via a network switch.
Communication software offers a means of fast and reliable data communication among cluster nodes and to the outside world. Often, clusters with a special network switch like Myrinet use communication protocols such as active messages for fast communication among its nodes. They potentially bypass the operating system and thus remove the critical communication overheads providing direct user-level access to the network interface.
The cluster nodes can work collectively as an integrated computing resource, or they can operate as individual computers. The cluster middleware is responsible for offering an illusion of a united system image (single system image) and availability out of a collection on independent but interconnected computers.
Programming environments can offer portable, efficient, and easy-to-use tools for development of applications. They include message passing libraries, debuggers, and profilers. It should not be forgotten that clusters could be used for the execution of sequential or parallel applications.

Reference/Source
Parallel Programming Models and Paradigms
Lui s Moura Silva and Rajkumar Buyya

No comments: