A computer cluster, or simply a cluster, is a type of computer system. It connects a group of loosely integrated computer software or hardware to work closely together to complete computing tasks. In a sense, they can be regarded as a single computer.
Introduction

In a cluster system, a single computer is usually referred to as a node. It is typically connected via a local area network, but there are other possible connection methods as well. Cluster computers are usually used to enhance the computing speed and/or reliability of a single computer. Generally speaking, cluster computers have a much higher performance-to-price ratio compared to a single computer, such as a workstation or a supercomputer. [1]

Cluster classification
Clusters can be classified into two types: homogeneous and heterogeneous. The difference lies in whether the architectures of the computers that make up the cluster system are the same. Cluster computers can be divided into the following categories based on their functions and structures [1]:
High-availability clusters
Load balancing clusters
High-performance (HPC) clusters
Grid computing
High-availability clusters
Generally, it refers to the situation where if a certain node in the cluster fails, the tasks on that node will automatically be transferred to other normal nodes. It also refers to the ability to offline maintain a certain node in the cluster and then bring it back online, without affecting the operation of the entire cluster. [1]
Load balancing cluster
When a load balancing cluster is running, it usually distributes the workload to a group of servers at the back end through one or more front-end load balancers, thereby achieving high performance and high availability for the entire system. Such computer clusters are sometimes also called server farms. Generally, high availability clusters and load balancing clusters use similar technologies, or possess both high availability and load balancing features.
The Linux Virtual Server (LVS) project provides the most commonly used load balancing software on the Linux operating system. [1]
High-performance computing cluster
The high-performance computing cluster increases computing power by distributing computing tasks to different computing nodes within the cluster. Therefore, it is mainly applied in the field of scientific computing. The popular HPC uses the Linux operating system and some other free software to complete parallel computing. This cluster configuration is usually called a Beowulf cluster. Such clusters usually run specific programs to leverage the parallel capabilities of the HPC cluster. These programs generally use specific runtime libraries, such as the MPI library specifically designed for scientific computing.
HPC clusters are particularly suitable for computing jobs where there is a large amount of data communication between the computing nodes during the computation, such as intermediate results of one node or situations that affect the calculation results of other nodes. [1]
Grid Computing
Grid computing or grid clusters is a technology closely related to cluster computing. The main difference between grids and traditional clusters is that a grid connects a group of related but untrusted computers, and its operation is more like a computing public facility rather than an independent computer. Moreover, grids usually support a larger variety of computer sets than clusters.
Grid computing is optimized for tasks with many independent operations. During the computing process, there is no need for sharing data between jobs. Grids mainly serve to manage the job allocation among computers that are independently executing tasks. Resources such as storage can be shared by all nodes, but the intermediate results of the jobs will not affect the progress of the jobs on other grid nodes.