We are excited to announce the next phase of High Performance Computing at GWU: a brand new cluster to stand alongside the systems we have all come to know. In its newest iteration, GW's premier resource for high-performance computing takes the shape of Pegasus. Pegasus consists of 210 compute nodes accessible through 4 highly available login nodes. The Dell-built cluster uses PowerEdge R740 and C4140 servers and breaks down into compute, GPU (small and large), medium memory, high memory, and high throughput nodes. All nodes run CentOS 7.8 and use the SLURM job scheduler. In total, the cluster is capable of 2.14 PFLOPS of single-precision performance.
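Access to the compute nodes is through SLURM on the login nodes. As a minimal sketch of what a submission might look like, the snippet below builds and submits a trivial batch job; the job name, resource requests, and time limit are illustrative placeholders rather than Pegasus-specific settings, so check `sinfo` and the cluster documentation for the actual partitions and limits.

```python
# Minimal sketch of submitting a batch job to SLURM from a Pegasus login node.
# The job name, node/task counts, and time limit are illustrative placeholders;
# consult `sinfo` for the cluster's actual partitions and limits.
import subprocess
import tempfile

job_script = """#!/bin/bash
#SBATCH --job-name=hello-pegasus
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --time=00:05:00

hostname
"""

# Write the batch script to a temporary file and hand it to sbatch.
with tempfile.NamedTemporaryFile("w", suffix=".sh", delete=False) as handle:
    handle.write(job_script)
    script_path = handle.name

# On success, sbatch prints "Submitted batch job <jobid>".
result = subprocess.run(
    ["sbatch", script_path],
    capture_output=True,
    text=True,
    check=True,
)
print(result.stdout.strip())
```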
Core Counts
Total CPU Cores*: 8,112
Total NVIDIA Tensor Cores: 76,800
Total NVIDIA CUDA Cores: 614,400
A breakdown can be found below.
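As a quick sanity check (assuming NVIDIA's published V100 specifications of 640 Tensor Cores and 5,120 CUDA cores per GPU), the GPU totals follow directly from the 120 Tesla V100s listed in the node breakdown:
- V100 GPUs: 16 small GPU nodes × 2 + 22 large GPU nodes × 4 = 120
- Tensor Cores: 120 × 640 = 76,800
- CUDA Cores: 120 × 5,120 = 614,400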
Compute Nodes
There are 108 CPU nodes in Pegasus. Each is a Dell PowerEdge R740 server with:
- Dual 20-Core 3.70GHz Intel Xeon Gold 6148 processors
- 96GB of 2666MHz DDR4 ECC Registered DRAM
- 800GB SSD onboard storage (used for boot and local scratch space)
- Mellanox EDR InfiniBand controller to the 100Gb/s fabric
GPU Nodes
Small GPU Nodes
There are 16 Small GPU nodes in Pegasus. Each is a Dell PowerEdge R740 server with:
- Two (2) NVIDIA Tesla V100 GPUs
- Dual 20-Core 3.70GHz Intel Xeon Gold 6148 processors
- 192GB of 2666MHz DDR4 ECC Registered DRAM
- 800GB SSD onboard storage (used for boot and local scratch space)
- Mellanox EDR InfiniBand controller to the 100Gb/s fabric
Large GPU Nodes
There are 22 Large GPU nodes in Pegasus. Each is a Dell PowerEdge C4140 server with:
- 6TB NVMe card
- Four (4) NVIDIA Tesla V100 SXM2 16GB GPUs with NVLink enabled
- Dual 18-Core 3.70GHz Intel Xeon Gold 6140 processors
- 384GB of 2666MHz DDR4 ECC Registered DRAM
- 800GB SSD onboard storage (used for boot and local scratch space)
- Mellanox EDR InfiniBand controller to the 100Gb/s fabric
High Throughput Nodes
There are 6 High Throughput nodes in Pegasus. Each is a Dell PowerEdge R740 server with:
- Dual 4-Core 3.70GHz Intel Xeon Gold 5122 processors
- 384GB of 2666MHz DDR4 ECC Registered DRAM
- 800GB SSD onboard storage (used for boot and local scratch space)
- Mellanox EDR InfiniBand controller to the 100Gb/s fabric
Medium Memory Nodes
There are 54 Medium Memory nodes in Pegasus. Each is a Dell PowerEdge R740 server with:
- Dual 20-Core 3.70GHz Intel Xeon Gold 6148 processors
- 384GB of 2666MHz DDR4 ECC Registered DRAM
- 800GB SSD onboard storage (used for boot and local scratch space)
- Mellanox EDR InfiniBand controller to the 100Gb/s fabric
High Memory Nodes
There are 2 High Memory nodes in Pegasus. Each is a Dell PowerEdge R740 server with:
- Dual 18-Core 3.70GHz Intel Xeon Gold 6140M processors
- 3TB of 2666MHz DDR4 ECC Registered DRAM
- 800GB SSD onboard storage (used for boot and local scratch space)
- Mellanox EDR InfiniBand controller to the 100Gb/s fabric
Login Nodes
There are 4 highly available login nodes in Pegasus. Each is a Dell PowerEdge R740 server with:
- Dual 16-Core 3.70GHz Intel Xeon Gold 6130 processors
- 192GB of 2666MHz DDR4 ECC Registered DRAM
- 2TB RAID 1 HDD onboard storage (used for boot and local scratch space)
- 40Gb/s Ethernet for the external network
- Mellanox EDR InfiniBand controller to the 100Gb/s fabric
Head Nodes
There are 2 highly available head nodes that control Pegasus. Each is a Dell PowerEdge R740 server with:
- Dual 16-Core 3.70GHz Intel Xeon Gold 6130 processors
- 192GB of 2666MHz DDR4 ECC Registered DRAM
- 5TB RAID 5 HDD onboard storage
- 40Gb/s Ethernet for the external network
- Mellanox EDR InfiniBand controller to the 100Gb/s fabric
Filesystem
For NFS storage, the cluster uses a DDN GS7K storage appliance with a total of 2PB of space, connected to the compute and login nodes via Mellanox EDR InfiniBand over the 100Gb/s fabric.
For scratch/high-speed storage, the cluster uses a DDN ES14K Lustre appliance providing 2PB of parallel scratch storage, also connected to the compute nodes via Mellanox EDR InfiniBand over the 100Gb/s fabric.
Note: Neither filesystem is meant for long-term storage, and data on either may be subject to removal requests from support staff.
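The intended pattern (shown here as a minimal sketch, assuming hypothetical mount points since the actual Pegasus paths are not listed on this page) is to do heavy intermediate I/O on the Lustre scratch space and copy only final results back to NFS-backed storage:

```python
# Sketch of the intended I/O pattern: heavy intermediate output goes to the
# Lustre scratch filesystem, and only final results are copied back to
# NFS-backed space. The /scratch layout below is a hypothetical placeholder;
# substitute the actual Pegasus mount points.
import os
import shutil

user = os.environ.get("USER", "someuser")
scratch_dir = os.path.join("/scratch", user, "run_001")  # hypothetical path
os.makedirs(scratch_dir, exist_ok=True)

# ... the job writes its large intermediate files under scratch_dir ...
result_file = os.path.join(scratch_dir, "results.txt")
with open(result_file, "w") as out:
    out.write("final results\n")

# Copy only the final output back to NFS-backed space. Remember that neither
# filesystem is meant for long-term storage.
dest_dir = os.path.join(os.path.expanduser("~"), "results", "run_001")
os.makedirs(dest_dir, exist_ok=True)
shutil.copy2(result_file, dest_dir)
```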
*Not including Login Nodes or Head Nodes