
Pegasus Hardware Configuration

Pegasus consists of 210 compute nodes accessible through 4 highly available login nodes. The cluster is built primarily on Dell PowerEdge R740 and C4140 servers and can be broken down into compute, GPU (small and large), high memory, and high throughput nodes. All nodes run CentOS 8 and use the SLURM job scheduler. In total, the cluster is capable of 2.14 PFLOPS of single-precision performance.
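Since every node is scheduled through SLURM, work is normally submitted as a batch job. The sketch below builds and submits a minimal batch script from Python; the job name, time limit, and resource values are illustrative placeholders and are not taken from this documentation, so consult the scheduler configuration before using them.

  import subprocess
  import tempfile

  # Minimal SLURM batch script. The resource values below are placeholders;
  # 40 tasks and ~96GB reflect the compute node specs described on this page.
  job_script = """#!/bin/bash
  #SBATCH --job-name=example
  #SBATCH --nodes=1
  #SBATCH --ntasks-per-node=40
  #SBATCH --mem=90G
  #SBATCH --time=01:00:00
  srun hostname
  """

  with tempfile.NamedTemporaryFile("w", suffix=".sbatch", delete=False) as f:
      f.write(job_script)
      path = f.name

  # Hand the script to the scheduler.
  subprocess.run(["sbatch", path], check=True)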

Core Counts

 

Total CPU Cores*: 8,112

Total NVIDIA Tensor Cores: 76,800

Total NVIDIA CUDA Cores: 614,400

 

A breakdown can be found below.
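The per-node core counts in the breakdown can, in principle, be cross-checked from a login node by asking SLURM for the CPU count of every node. A minimal sketch using standard sinfo output options; nothing in it is specific to Pegasus:

  import subprocess

  # One line per node: node name and CPU count.
  # "%N" = node name, "%c" = CPUs per node; -h suppresses the header.
  out = subprocess.run(
      ["sinfo", "-N", "-h", "-o", "%N %c"],
      check=True, capture_output=True, text=True,
  ).stdout

  cores = {}
  for line in out.splitlines():
      node, cpus = line.split()
      cores[node] = int(cpus)   # -N mode may repeat a node across partitions

  print(f"nodes: {len(cores)}, total CPU cores: {sum(cores.values())}")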

Compute Nodes

There are 108 CPU nodes in Pegasus. Each of these is a:

  • Dell PowerEdge R740 server
  • Dual 20-Core 3.70GHz Intel Xeon Gold 6148 processors
  • 96GB of 2666MHz DDR4 ECC Registered DRAM
  • 800 GB SSD onboard storage (used for boot and local scratch space)
  • Mellanox EDR InfiniBand controller connected to the 100Gb/s fabric

GPU Nodes

 

16 Two GPU nodes

There are 16 Small GPU nodes in Pegasus. Each of these is a:

  • Dell PowerEdge R740 server
  • (2) NVIDIA Tesla V100 GPUs
  • Dual 20-Core 3.70GHz Intel Xeon Gold 6148 processors
  • 192GB of 2666MHz DDR4 ECC Registered DRAM
  • 800 GB SSD onboard storage (used for boot and local scratch space)
  • Mellanox EDR InfiniBand controller connected to the 100Gb/s fabric

21 Four GPU nodes

There are 21 Large GPU nodes in Pegasus. Each of these is a:

  • Dell C4140 server
  • 6TB NVMe card
  • Four (4) NVIDIA Tesla V100 SXM2 16GB GPUs with NVLink enabled
  • Dual 18-Core 3.70GHz Intel Xeon Gold 6140 processors
  • 384GB of 2666MHz DDR4 ECC Registered DRAM
  • 800 GB SSD onboard storage (used for boot and local scratch space)
  • Mellanox EDR InfiniBand controller connected to the 100Gb/s fabric

2 Eight A100 GPU nodes

There are 2 Eight GPU nodes in Pegasus. Each of these is a:

  • Lenovo ThinkSystem SR670 V2
  • (8) NVIDIA A100 80GB PCIe Gen4 Passive GPUs
  • Dual 26-Core Intel Xeon Gold 5320 processors
  • 512GB TruDDR4 3200MHz (2Rx4 1.2V) RDIMM
  • 800 GB SSD onboard storage (used for boot and local scratch space)
  • Mellanox EDR InfiniBand controller connected to the 100Gb/s fabric

2 GH200 Grace Hopper Superchip nodes

There are 2 Grace Hopper Superchip nodes in Pegasus. Each of these is a:

  • Quanta S74G-2U Grace Hopper
  • Grace CPU with 72 Arm Neoverse V2 cores and up to 480GB of LPDDR5X memory
  • Hopper H100 GPU with 96GB of HBM3 memory
  • 7.68TB E1.S SSD
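With V100, A100, and H100 devices spread across these GPU node types, it can be useful for a job to report at runtime which devices it was actually allocated. A minimal sketch using the NVIDIA Management Library Python bindings; it assumes the pynvml package is available on the node, which this page does not state:

  from pynvml import (nvmlInit, nvmlShutdown, nvmlDeviceGetCount,
                      nvmlDeviceGetHandleByIndex, nvmlDeviceGetName,
                      nvmlDeviceGetMemoryInfo)

  nvmlInit()
  try:
      for i in range(nvmlDeviceGetCount()):
          handle = nvmlDeviceGetHandleByIndex(i)
          name = nvmlDeviceGetName(handle)
          if isinstance(name, bytes):        # older pynvml releases return bytes
              name = name.decode()
          mem = nvmlDeviceGetMemoryInfo(handle)
          print(f"GPU {i}: {name}, {mem.total / 2**30:.0f} GiB")
  finally:
      nvmlShutdown()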

High Throughput Nodes

There are 6 High Throughput nodes in Pegasus. Each of these is a:

  • Dell PowerEdge R740 server
  • Dual 4-Core 3.70GHz Intel Xeon Gold 5122 processors
  • 384GB of 2666MHz DDR4 ECC Registered DRAM
  • 800 GB SSD onboard storage (used for boot and local scratch space)
  • Mellanox EDR InfiniBand controller connected to the 100Gb/s fabric

Medium Memory Nodes

There are 54 Medium Memory Nodes in Pegasus. Each of these is a:

  • Dell PowerEdge R740 server
  • Dual 20-Core 3.70GHz Intel Xeon Gold 6148 processors
  • 384GB of 2666MHz DDR4 ECC Registered DRAM
  • 800 GB SSD onboard storage (used for boot and local scratch space)
  • Mellanox EDR InfiniBand controller connected to the 100Gb/s fabric

High Memory Nodes

There are 2 High Memory Nodes in Pegasus. Each of these is a:

  • Dell PowerEdge R740 server
  • Dual 18-Core 3.70GHz Intel Xeon Gold 6140M processors
  • 3TB of 2666MHz DDR4 ECC Registered DRAM
  • 800 GB SSD onboard storage (used for boot and local scratch space)
  • Mellanox EDR InfiniBand controller connected to the 100Gb/s fabric

Login Nodes

There are 4 Highly Available Login Nodes in Pegasus. Each of these is a:

  • Dell PowerEdge R740 server
  • Dual 16-Core 3.70GHz Intel Xeon Gold 6130 processors
  • 192GB of 2666MHz DDR4 ECC Registered DRAM
  • 2TB RAID 1 HDD onboard storage (used for boot and local scratch space)
  • 40Gb/s Ethernet for the external network
  • Mellanox EDR InfiniBand controller connected to the 100Gb/s fabric

Head Nodes

There are 2 Highly Available Head Nodes to control Pegasus. Each of these is a:

  • Dell PowerEdge R740 server
  • Dual 16-Core 3.70GHz Intel Xeon Gold 6130 processors
  • 192GB of 2666MHz DDR4 ECC Registered DRAM
  • 5TB RAID 5 HDD onboard storage
  • 40Gb/s Ethernet for the external network
  • Mellanox EDR InfiniBand controller connected to the 100Gb/s fabric

Filesystem

For NFS, the cluster uses a Qumulo cluster with 2PB of replicated storage.

For scratch, the cluster uses a Lenovo DSS solution with 2PB of storage.

Note: Neither filesystem is meant for long-term storage, and data on both is subject to removal requests from support staff.
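Because neither shared filesystem is intended to hold data long term, and every node carries a local SSD for scratch, a common pattern is to stage working data onto node-local scratch for the duration of a job and copy results back afterwards. A minimal sketch of that pattern; the /local/scratch, home, and scratch paths below are placeholders, not paths documented for Pegasus:

  import os
  import shutil

  # Placeholder locations -- substitute the real Pegasus paths.
  shared_input = "/nethome/username/project/input.dat"   # NFS (Qumulo) home
  local_scratch = "/local/scratch"                       # node-local SSD
  shared_results = "/scratch/username/project/results"   # DSS scratch space

  job_dir = os.path.join(local_scratch, os.environ.get("SLURM_JOB_ID", "interactive"))
  os.makedirs(job_dir, exist_ok=True)

  # Stage input onto the node-local SSD, work there, then copy results back.
  staged = shutil.copy(shared_input, job_dir)
  output = os.path.join(job_dir, "output.dat")
  # ... run the actual computation against `staged`, writing to `output` ...

  os.makedirs(shared_results, exist_ok=True)
  shutil.copy(output, shared_results)
  shutil.rmtree(job_dir)   # clean up local scratch when finished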

 

*Not including Login Nodes or Head Nodes