Pegasus consists of 210 compute nodes accessible through 4 highly available login nodes. The Dell-built cluster uses PowerEdge R740 and C4140 servers and can be broken down into compute, GPU (small and large), high-memory, and high-throughput nodes. All nodes run CentOS 8 and use the SLURM job scheduler. The cluster is capable of a total of 2.14 PFLOPS in single precision.
Core Counts
Total CPU Cores*: 8,112
Total NVIDIA Tensor Cores: 76,800
Total NVIDIA CUDA Cores: 614,400
A breakdown can be found here
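As a rough cross-check of the GPU core totals above, the sketch below divides them by the per-GPU figures NVIDIA publishes for the Tesla V100 (640 Tensor Cores and 5,120 CUDA cores). Those per-GPU figures are not stated on this page, and treating the totals as V100-only is an assumption.

```python
# Per-GPU core counts for the NVIDIA Tesla V100. These come from NVIDIA's
# public specifications, not from this page (assumption).
V100_TENSOR_CORES = 640
V100_CUDA_CORES = 5120

# Totals quoted in the Core Counts list.
TOTAL_TENSOR_CORES = 76_800
TOTAL_CUDA_CORES = 614_400


def implied_gpu_count(total_cores: int, cores_per_gpu: int) -> int:
    """Return the GPU count a core total implies; raise if it does not divide evenly."""
    count, remainder = divmod(total_cores, cores_per_gpu)
    if remainder:
        raise ValueError("total is not a whole multiple of the per-GPU count")
    return count


gpus_from_tensor = implied_gpu_count(TOTAL_TENSOR_CORES, V100_TENSOR_CORES)
gpus_from_cuda = implied_gpu_count(TOTAL_CUDA_CORES, V100_CUDA_CORES)
print(gpus_from_tensor, gpus_from_cuda)
```

Both totals imply the same number of V100-class GPUs, which is consistent with the figures covering the V100 nodes only. Treat that as an inference rather than a statement from the cluster's maintainers.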
Compute Nodes
There are 108 CPU nodes in Pegasus. Each of these is a:
- Dell PowerEdge R740 server
- Dual 20-Core 3.70GHz Intel Xeon Gold 6148 processors
- 96GB of 2666MHz DDR4 ECC Registered DRAM
- 800 GB SSD onboard storage (used for boot and local scratch space)
- Mellanox EDR InfiniBand controller to 100Gb/s fabric
GPU Nodes
18 One GPU nodes
There are 16 Small GPU nodes in Pegasus. Each of these is a:
- Dell PowerEdge R740 server
- (2) NVIDIA Tesla V100 GPUs
- Dual 20-Core 3.70GHz Intel Xeon Gold 6148 processors
- 192GB of 2666MHz DDR4 ECC Registered DRAM
- 800 GB SSD onboard storage (used for boot and local scratch space)
- Mellanox EDR InfiniBand controller to 100Gb/s fabric
21 Four GPU nodes
There are 21 Large GPU nodes in Pegasus. Each of these is a:
- Dell C4140 server
- 6TB NVMe card
- Four (4) NVIDIA Tesla V100 SXM2 16GB GPUs with NVLink enabled
- Dual 18-Core 3.70GHz Intel Xeon Gold 6140 processors
- 384GB of 2666MHz DDR4 ECC Registered DRAM
- 800 GB SSD onboard storage (used for boot and local scratch space)
- Mellanox EDR InfiniBand controller to 100Gb/s fabric
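Since jobs reach these nodes through SLURM, a request for a whole four-GPU node might be rendered along the lines of the sketch below. The helper `sbatch_header` is hypothetical, the `gpu` GRES name and the 360G memory request (leaving headroom under 384GB) are assumptions about the site configuration, and no partition is set because partition names are not given on this page.

```python
def sbatch_header(job_name: str, cpus: int, mem_gb: int, gpus: int = 0,
                  time_limit: str = "01:00:00") -> str:
    """Render #SBATCH directives for a single-node job (hypothetical helper)."""
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        "#SBATCH --nodes=1",
        f"#SBATCH --cpus-per-task={cpus}",
        f"#SBATCH --mem={mem_gb}G",
        f"#SBATCH --time={time_limit}",
    ]
    if gpus:
        # Assumes GPUs are exposed as a generic resource named "gpu".
        lines.append(f"#SBATCH --gres=gpu:{gpus}")
    return "\n".join(lines) + "\n"


# Dual 18-core CPUs give 36 cores; request all four V100 GPUs on the node.
print(sbatch_header("v100-test", cpus=36, mem_gb=360, gpus=4))
```

The printed text can be placed at the top of a batch script and followed by the commands to run. Check the local SLURM configuration for the actual partition and GRES names before using it.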
2 Eight A100 GPU nodes
There are 2 Eight GPU nodes in Pegasus. Each of these is a:
- Lenovo ThinkSystem SR670 V2
- (8) NVIDIA A100 80GB PCIe Gen4 Passive GPUs
- Dual 26-Core Intel Xeon Gold 5320 processors
- 512GB TruDDR4 3200 MHz (2Rx4 1.2V) RDIMM
- 800 GB SSD onboard storage (used for boot and local scratch space)
- Mellanox EDR InfiniBand controller to 100Gb/s fabric
2 GH200 Grace Hopper Superchip nodes
There are 2 Grace Hopper Superchip nodes in Pegasus. Each of these is a:
- Quanta S74G-2U Grace Hopper
- (8) NVIDIA A100 80GB PCIe Gen4 Passive GPU
- Grace CPU with 72 Arm Neoverse V2 cores, up to 480GB LPDDR5X memory
- Hopper H100 GPU with 96GB HBM3 memory
- 7.68TB E1.S SSD
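Because the Grace CPU uses Arm cores while the other nodes listed here use Intel Xeon processors, software built for x86_64 will generally need an Arm build to run on these nodes. A minimal sketch of how a script could detect which CPU family it is running on follows; the helper name `cpu_family` is hypothetical.

```python
import platform


def cpu_family(machine: str) -> str:
    """Map a platform.machine() string to a coarse CPU family label."""
    m = machine.lower()
    if m in ("aarch64", "arm64"):
        return "arm64"
    if m in ("x86_64", "amd64"):
        return "x86_64"
    return "unknown"


# On Linux, platform.machine() typically reports "aarch64" on Arm CPUs
# and "x86_64" on Intel Xeon CPUs.
print(cpu_family(platform.machine()))
```

A job script could branch on this label to select the matching software build.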
High Throughput Node
There are 6 High Throughput nodes in Pegasus. Each of these is a:
- Dell PowerEdge R740 server
- Dual 4-Core 3.70GHz Intel Xeon Gold 5122 processors
- 384GB of 2666MHz DDR4 ECC Registered DRAM
- 800 GB SSD onboard storage (used for boot and local scratch space)
- Mellanox EDR InfiniBand controller to 100Gb/s fabric
Medium Memory Node
There are 54 Medium Memory Nodes in Pegasus. Each of these is a:
- Dell PowerEdge R740 server
- Dual 20-Core 3.70GHz Intel Xeon Gold 6148 processors
- 384GB of 2666MHz DDR4 ECC Registered DRAM
- 800 GB SSD onboard storage (used for boot and local scratch space)
- Mellanox EDR InfiniBand controller to 100Gb/s fabric
High Memory Node
There are 2 High Memory Nodes in Pegasus. Each of these is a:
- Dell PowerEdge R740 server
- Dual 18-Core 3.70GHz Intel Xeon Gold 6140M processors
- 3TB of 2666MHz DDR4 ECC Registered DRAM
- 800 GB SSD onboard storage (used for boot and local scratch space)
- Mellanox EDR InfiniBand controller to 100Gb/s fabric
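To compare the CPU-only node classes when sizing a job's memory request, the sketch below computes installed memory per core from the socket, core, and DRAM figures listed for each class. It treats 3TB as 3,072GB, which is an assumption, and it ignores memory reserved for the operating system, so usable amounts will be somewhat lower.

```python
# (sockets, cores per socket, installed memory in GB), taken from the
# node descriptions on this page. 3TB is taken as 3072GB (assumption).
NODE_CLASSES = {
    "compute": (2, 20, 96),
    "high_throughput": (2, 4, 384),
    "medium_memory": (2, 20, 384),
    "high_memory": (2, 18, 3072),
}


def mem_per_core_gb(sockets: int, cores_per_socket: int, mem_gb: int) -> float:
    """Return installed memory in GB divided by the node's total core count."""
    return mem_gb / (sockets * cores_per_socket)


for name, spec in NODE_CLASSES.items():
    print(f"{name}: {mem_per_core_gb(*spec):.1f} GB per core")
```

A job that needs much more memory per core than the standard compute nodes provide is a candidate for the medium or high memory classes.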
Login Nodes
There are 4 Highly Available Login Nodes in Pegasus. Each of these is a:
- Dell PowerEdge R740 server
- Dual 16-Core 3.70GHz Intel Xeon Gold 6130 processors
- 192GB of 2666MHz DDR4 ECC Registered DRAM
- 2TB RAID 1 HDD onboard storage (used for boot and local scratch space)
- 40Gb/s Ethernet for the external network
- Mellanox EDR InfiniBand controller to 100Gb/s fabric
Head Nodes
There are 2 Highly Available Head Nodes to control Pegasus. Each of these is a:
- Dell PowerEdge R740 server
- Dual 16-Core 3.70GHz Intel Xeon Gold 6130 processors
- 192GB of 2666MHz DDR4 ECC Registered DRAM
- 5TB RAID 5 HDD onboard storage
- 40Gb/s Ethernet for the external network
- Mellanox EDR InfiniBand controller to 100Gb/s fabric
Filesystem
For NFS, the cluster uses a Qumulo cluster with 2PB replicated across
For scratch, the cluster uses a Lenovo DSS solution with 2PB of storage.
Note: Neither filesystem is meant for long-term storage, and data on either is subject to removal requests from support staff.
*Not including Login Nodes or Head Nodes