
R2 GPU/CPU Cluster

The R2 HPC cluster is a high-availability Linux cluster (CentOS 7) with 22 compute nodes, each with dual Intel Xeon E5-2680 14-core CPUs, for a total of 616 CPU cores. It also has five GPU nodes, each with dual Nvidia Tesla P100 (NVLink) cards; each GPU has 3,584 cores and delivers up to 4.7 teraflops of double-precision performance, for a total of 35,840 GPU cores. Compute resources are controlled by the Slurm scheduler.

Users with R2 accounts can log in to the R2 User Portal at http://r2.boisestate.edu/.
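For command-line work, sessions are opened over SSH (see Environment Settings below). A minimal login sketch, assuming your R2 username and that the SSH host matches the portal address r2.boisestate.edu (the actual login hostname may differ):

    # Connect to the cluster login node with your R2 username
    ssh username@r2.boisestate.edu

    # Once logged in, confirm the default shell and any modules loaded by default
    echo $SHELL     # expected: /bin/bash
    module list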

  • Head Node(s) – High Availability Fail Over
    Motherboard: Dell PE R730/xd
    CPU: Dual Intel Xeon E5-2623 4-core 2.6GHz
    Memory: 64GB
    Ethernet: Dual 10GigE, Dual GigE
    InfiniBand: Mellanox ConnectX-3 VPI FDR, QSFP+ 40/56GbE
    Data Storage: Dell MD3460 12G SAS [60 TB] RAID-6 XFS
  • Compute Nodes 1-22
    Motherboard: Dell PE R630
    CPU: Dual Intel Xeon E5-2680 v4 14-core 2.4GHz
    Memory: 192GB
    Ethernet: Quad Port GigE
    InfiniBand: Mellanox ConnectX-3 VPI FDR, QSFP+ 40/56GbE
  • GPU Nodes 23-27
    Motherboard: Dell PE R730/xd
    CPU: Dual Intel Xeon E5-2680 v4 14-core 2.4GHz
    GPU: Dual Nvidia Tesla P100 (NVLink), 3584 cores each
    Memory: 256GB
    Ethernet: Quad Port GigE
    InfiniBand: Mellanox ConnectX-3 VPI FDR, QSFP+ 40/56GbE
  • Theoretical Peak Performance
    54 CPUs @ ~570 GFlops each = ~30.8 TFlops
    10 GPUs @ ~10.6 TFlops each = ~106 TFlops
    Total theoretical peak = ~137 TFlops
  • Environment Settings
    The default Linux shell is Bash.
    Environment settings are controlled with Environment Modules.
    The cluster is accessed via secure remote sessions (SSH).
  • File Systems
    Head Nodes: 600GB mirrored; /cm/shared/apps 5TB; /home 30TB; /scratch 18TB
    Compute/GPU Nodes: 600GB; /cm/shared/apps 5TB; /home 30TB; /scratch 18TB
    Home directories have a 50GB quota.
  • Cluster Management/Scheduler
    The cluster is managed using Bright Cluster Manager version 7.3 in conjunction with Slurm Workload Manager 16.05.
    The HPC Systems Administrator uses these tools to provide:
    • User account requests via Accounts & Access
    • LDAP-based user creation and authentication.
    • Compute Node Install and Provisioning.
      • Image creation / capture / install
      • RootShell image administration & patching
      • IPMI remote monitoring and BIOS configuration.
    • Network configuration.
      • Cluster Networks (Intranet / Internet)
      • DNS, DHCP, and NAT
      • NFS shared file exports
    • Master Node install and maintenance.
      • Patch Management
      • Compiler Management
      • MPI Environment Management
      • CUDA Development Management
      • Application Compiles & Installs
    • Job scheduling with Slurm (see the example commands after this list). Scheduling is based on:
      • Available node resources
      • Job queues based on group policies and node resources/ownership
      • Job submission based on queue policy using a given parallel environment
    • Cluster Monitoring and Reporting.
    • Cluster Data Storage Creation / Maintenance / Backup
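As a quick reference for the scheduling item above, the standard Slurm commands below show the queues (partitions) and node resources the scheduler exposes; the node name is a placeholder and the partition names in the output are cluster-specific:

    # List partitions, their nodes, and their current state
    sinfo

    # Show detailed resources (CPUs, memory, GPUs) for a single node; name is illustrative
    scontrol show node node001

    # Show your pending and running jobs
    squeue -u $USER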

Please follow the links below for information on each component:

R2 Linux Environment Modules
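As a brief illustration of the Modules environment noted under Environment Settings, a typical session might look like the sketch below; the module names (gcc, openmpi) are examples only, and the modules actually installed on R2 may differ:

    # List the software modules available on the cluster
    module avail

    # Load a compiler and an MPI stack (names/versions are illustrative)
    module load gcc
    module load openmpi

    # Show what is currently loaded
    module list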

R2 Slurm Workload Manager 16.05
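As an illustration of submitting work through Slurm, the sketch below shows a minimal batch script; the partition name, GPU request, and module name are assumptions and should be checked against the R2 Slurm documentation linked above:

    #!/bin/bash
    #SBATCH --job-name=example
    #SBATCH --ntasks=28              # one full compute node (dual 14-core CPUs)
    #SBATCH --time=01:00:00          # wall-clock limit
    #SBATCH --partition=defq         # partition name is an assumption
    ##SBATCH --gres=gpu:1            # uncomment to request a P100 on a GPU node

    module load gcc                  # module name is illustrative
    srun ./my_program                # replace with your executable

Save the script (for example as job.sh), submit it with "sbatch job.sh", and monitor it with "squeue -u $USER".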

For more information, please contact researchcomputing@boisestate.edu.