The New Blade System from T-Platforms

cse.scitech.ac.uk


T-Platforms blade server “V-Class” and Storage

with the ClustrX “Operating System”

Practical HPC

Igor Zacharov

Technical Marketing

Igor.zacharov@t‐platforms.com


T‐Platforms Projects/Installations

2002 — Foundation

2003 — SKIF K‐500 cluster, 407th position in the global Top500

2004 — SKIF K‐1000 cluster, 98th position in Top500

2007 — SKIF Siberia supercomputer, 72nd position in Top500

2008 — SKIF MSU supercomputer (60 TF), 36th in Top500;

• Proprietary platforms based on 9‐core IBM PowerXCell 8i;

• Establishment of a dedicated HPC service company

2009 — 420 TFlops supercomputer built for MSU, 12th in Top500

• T‐Blade2‐XN (Intel Xeon based) delivering 18 TFlops per rack

2011 — T‐Blade2‐TL (Nvidia Fermi based) installation, 1.3 PF

2012 — V‐Class: modular HPC

2013 — Project 10 PF (next generation TB‐3)


T‐Platforms TB2 product


MSU “Lomonosov” installation

1.3 Pflop/s Intel Xeon/Nvidia GPGPU hybrid system


Announcement

“V‐CLASS” – A MODULAR HPC SYSTEM


V-Class System Key Features

• Chassis: 5U 19” with 10 nodes (dual CPU sockets)

• Combined power and cooling (fan) infrastructure

• Redundant (N+1) components allow increased uptime

• Management module with network switch:

• Consolidation of the Ethernet ports

• Single point of control for chassis and nodes


Compute Node V20Xs


Concept and photos

V205-B1A, engineering sample

V20Xs tray, engineering sample


System Specification

Chassis Enclosure: 5U; Dimensions: L820 x W442 x H220 mm

• Chassis Controller with Ethernet switch (management network)

• IMU (Integrated Management Unit) Software for startup and control: GUI Interface

• Power draw ~3.8 kW, 3+1 power supply units; supply efficiency near 95% (Platinum)

Nodes and Trays

• 10 dual‐socket (10 trays) (Intel/AMD)

• Intel: 5 dual‐socket x86 + 2 x GPU (NVIDIA® Tesla M2050 / M2070 / M2090)

• AMD: 5 dual‐socket x86 + 1 x GPU (NVIDIA® Tesla M2050 / M2070 / M2090)

Processors

• Intel E5 series (Sandy Bridge) (TDP up to 95 Watt)

• AMD 61xx and 62xx series (Bulldozer) (ACP up to 80 Watt)

Memory

• AMD: 256GB/node, 16 x DDR3 RDIMM/UDIMM ECC 1333/1600 MHz

• Intel: 256GB/node; 16 x DDR3 RDIMM/UDIMM ECC 1333/1600 MHz

Storage: Internal disks: 2.5" SATA 3Gb/s (AMD), 6Gb/s (Intel) HDD or SSD drives (2/node) (Cold swap)

Expansion Slots

• AMD: 1 PCI‐E x16 2.0 for LP MD2 form‐factor cards

• Intel: 2 PCI‐E x16 3.0 for LP MD2 form‐factor cards

Communication

• Dual GbE/node, optional QDR (AMD), FDR (Intel) InfiniBand / 10GbE port/node


Power Management in Chassis Controller

• Measurement of momentary power draw on the processor nodes

• Information is part of the overall monitoring of the system

• Information available to the Resource Manager

• Monitoring by external server(s) via the management network

• Resource Manager policies:

• Minimize average power usage

• Grouping of processes

• Knowledge of the application profile

• Research activity (follow‐up from the HOPSA project): constant power usage, average yearly power (a monitoring sketch follows below)
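A minimal sketch of the idea above, assuming the chassis controller can be polled for per-node momentary power over the management network: readings are kept in a sliding window and reduced to the averages that the research activity (constant power usage, average yearly power) works with. The read_node_power() helper, polling period and window length are hypothetical, not part of the documented product.

# Minimal sketch: turning momentary per-node power readings into sliding-window
# averages for the monitoring database and the Resource Manager. The
# read_node_power() helper is a hypothetical stand-in; the real chassis-controller
# interface is not documented in this deck.
import random
import time
from collections import defaultdict, deque


def read_node_power(node_id):
    """Hypothetical query of one node's momentary power draw, in watts.

    Simulated here so the sketch is self-contained."""
    return 250.0 + random.uniform(-30.0, 30.0)


def monitor(nodes, window_s=300, period_s=5):
    """Poll every `period_s` seconds; yield per-node averages over `window_s`."""
    samples = defaultdict(deque)              # node -> deque of (timestamp, watts)
    while True:
        now = time.time()
        for node in nodes:
            samples[node].append((now, read_node_power(node)))
            while samples[node][0][0] < now - window_s:   # drop stale samples
                samples[node].popleft()
        yield {n: sum(w for _, w in s) / len(s) for n, s in samples.items()}
        time.sleep(period_s)


# Example: one round of averages for four nodes
print(next(monitor(["node01", "node02", "node03", "node04"])))

In practice the averages would be written into the monitoring database and exposed to the Resource Manager rather than printed.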


Expanding…

COOLING SOLUTIONS


Super Cool Enclosure

• 25 kW nominal Rear Door Heat Exchanger (RDHx)

There is another way

• Precision cooling

• No need for a ‘Drip Tray’

• Six hot‐swap fans

• Fan fail alarm as standard, visual and data line transmission

• Variable‐speed fans

• Built‐in redundancy

• In excess of 80% venting on the front door

• Optional remote monitoring and administration

• Unique styling

• 1000 kg weight loading as standard

• Full range of accessories…


W ≤ 24 kW

Tout = 25 ˚C, Tin = 20 ˚C

Glycol 25% (a flow‐rate estimate is sketched below)
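As a rough cross-check of the figures above, the arithmetic below estimates the coolant flow needed to carry 24 kW with a 5 ˚C temperature rise. The specific heat and density used for the 25% glycol mixture are approximate textbook values assumed for the example, not vendor data.

# Rough flow-rate check for the quoted figures (W <= 24 kW, Tin = 20 C,
# Tout = 25 C, 25% glycol). Coolant properties are assumed approximations.
heat_load_w = 24_000           # W, maximum heat rejected by the rear door
delta_t = 25.0 - 20.0          # K, coolant temperature rise
cp = 3_900.0                   # J/(kg*K), assumed specific heat of 25% glycol mix
rho = 1_020.0                  # kg/m^3, assumed density of the mixture

mass_flow = heat_load_w / (cp * delta_t)        # ~1.2 kg/s
volume_flow_m3h = mass_flow / rho * 3600        # ~4.3 m^3/h

print(f"required coolant flow: {mass_flow:.2f} kg/s (~{volume_flow_m3h:.1f} m^3/h)")

Under these assumptions the loop needs on the order of 1.2 kg/s, roughly 4.3 m³/h of coolant.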


Storage Systems

SOFTWARE BASED STORAGE


AVRO‐RAID software‐based storage

IB network

Software for RAID 0, 6, 10 in SAN infrastructure

(patent on software algorithm)

• Fast recovery


AVRO‐RAID Software based storage

• Highest performance in RAID 6

• Patented, parallel parity calculation (a generic dual‐parity sketch follows below):

writing and reading at the same speed

• Sustained performance with ongoing reconstruction

• Minimum time for RAID initialization

• Configurable cache size (up to 24 GB)

• Unrestricted LUN size
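For context on the RAID 6 bullets above, here is a generic dual-parity sketch: P is a plain XOR across the data chunks and Q is a Reed-Solomon weighted sum over GF(2^8). It illustrates the standard technique only and is not T-Platforms' patented parallel algorithm, whose details the deck does not disclose.

# Minimal sketch of standard RAID 6 dual parity (P = XOR, Q = Reed-Solomon
# over GF(2^8)). Generic technique only; NOT the patented AVRO-RAID algorithm.

def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) with polynomial x^8+x^4+x^3+x^2+1 (0x11d)."""
    p = 0
    for _ in range(8):
        if b & 1:
            p ^= a
        carry = a & 0x80
        a = (a << 1) & 0xFF
        if carry:
            a ^= 0x1D
        b >>= 1
    return p


def pq_parity(chunks):
    """Compute P (XOR) and Q (weighted GF(2^8) sum) parity for one stripe.

    chunks: list of equal-length byte strings, one per data disk."""
    length = len(chunks[0])
    p = bytearray(length)
    q = bytearray(length)
    for i, data in enumerate(chunks):
        g_i = 1
        for _ in range(i):            # g^i with generator g = 2
            g_i = gf_mul(g_i, 2)
        for j in range(length):
            p[j] ^= data[j]
            q[j] ^= gf_mul(g_i, data[j])
    return bytes(p), bytes(q)


# Example: 4 data disks, 8-byte chunks
disks = [bytes([d] * 8) for d in (0x11, 0x22, 0x33, 0x44)]
P, Q = pq_parity(disks)
print(P.hex(), Q.hex())

Because both parities are independent byte-wise operations, they vectorize and parallelize naturally across cores and stripe offsets.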


Software and Management

CLUSTRX


What’s ClustrX “Operating System”?

• OS for supercomputers, not only for nodes

• Integral solution: compute nodes + network + infrastructure

• Heterogeneous & scalable

Architecture (diagram labels):

• Network interface

• ClustrX Watch: global system monitoring

• Virtualized services: aggregation servers, database, logic layers …

• Clustrx (node) agent, lightweight (a generic agent sketch follows below)
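The deck shows a lightweight node agent feeding aggregation servers, a database and logic layers; ClustrX itself is written in Erlang and C/C++ (see the subsystems slide). The Python below is only a generic illustration of that push model, with a placeholder aggregation address and made-up sensor values.

# Generic illustration of a lightweight node agent pushing sensor samples to an
# aggregation server over the management network. The address, message format
# and read_sensors() values are hypothetical placeholders.
import json
import socket
import time

AGGREGATOR = ("127.0.0.1", 9100)      # placeholder for the aggregation server


def read_sensors():
    """Stand-in for reading local sensors (temperatures, fan speeds, power)."""
    return {"cpu_temp_c": 54.0, "fan_rpm": 7200, "node_power_w": 310.0}


def run_agent(node_id, period_s=10, samples=3):
    """Send `samples` timestamped JSON readings, one every `period_s` seconds."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    for _ in range(samples):
        message = {"node": node_id, "ts": time.time(), "sensors": read_sensors()}
        sock.sendto(json.dumps(message).encode(), AGGREGATOR)   # fire-and-forget
        time.sleep(period_s)


run_agent("node01", period_s=1)

A fire-and-forget push keeps the agent cheap on the node; the aggregation servers and database behind it absorb the heavier work, and an occasionally lost sample is tolerable for monitoring.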


Clustrx Subsystems & Installation

1. Clustrx Watch - monitoring and control

2. dConf - cluster-wide, decentralized, distributed storage for configuration data

3. Resource manager - POSIX-compliant, modular, scalable, CLOUD-ready, …; loads an OS for a job

4. Network boot & provisioning - infrastructure to support any number of computing nodes

Written in Erlang and C/C++; Linux open-source modules


ClustrX Watch: monitoring hardware

• All equipment

• Selected equipment

• Sensors and their status


ClustrX watch: measurement statistics


(Chart: measurement statistics across the cluster nodes)

ClustrX Architecture (Admin view)

• Not hardware‐specific

• Software can be obtained from the UK‐based company Erlang-Solutions

Management nodes


Resource Management & Power control

• Monitoring data builds a database of events

• Power consumption per board

• Power consumption per component (future)

• The Resource Manager can use the data to schedule jobs (a placement sketch follows below):

• Max performance

• Optimize power

• Max cut power

• Constant power throughout the year

• …

Configuring software to your needs
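A minimal sketch of how a resource manager could turn the policies listed above into a node-placement decision using the monitored per-board power. The node data, the 300 W target and the selection heuristics are illustrative assumptions, not a description of ClustrX's actual scheduler.

# Illustrative node-selection heuristic driven by monitored per-board power.
# Policies mirror the bullets above; data structures are made up for the example.

def pick_nodes(free_nodes, needed, policy, power_w, perf_score):
    """Choose `needed` nodes from `free_nodes` according to a power policy.

    power_w: dict node -> recent average power draw (W), from monitoring
    perf_score: dict node -> relative performance score"""
    if policy == "max_performance":
        ranked = sorted(free_nodes, key=lambda n: perf_score[n], reverse=True)
    elif policy == "optimize_power":
        # Prefer nodes that currently draw the least power (coolest boards).
        ranked = sorted(free_nodes, key=lambda n: power_w[n])
    elif policy == "constant_power":
        # Stay near a long-term target: prefer nodes closest to a per-node share.
        target_per_node = 300.0                      # W, illustrative target
        ranked = sorted(free_nodes, key=lambda n: abs(power_w[n] - target_per_node))
    else:
        ranked = list(free_nodes)
    return ranked[:needed]


# Example call with made-up monitoring data
nodes = ["n01", "n02", "n03", "n04"]
power = {"n01": 280.0, "n02": 350.0, "n03": 240.0, "n04": 310.0}
perf = {"n01": 1.0, "n02": 1.4, "n03": 1.0, "n04": 1.4}
print(pick_nodes(nodes, 2, "optimize_power", power, perf))   # -> ['n03', 'n01']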


Using the power of ClustrX

VIRTUAL SUPERCOMPUTER

(VSC)


Virtual Supercomputer (VSC)

The VSC is an addition to personal/small‐scale computing clusters, which helps utilize idle resources of large‐scale ones.

1) Bundle the workload in a virtual machine (from a repository of prepackaged software)

2) Ship it to the front-end

3) The ClustrX Resource Manager PXE-boots the VM to execute (on the specified number of nodes)

4) ClustrX monitoring & finalize

5) Get the results back: accounting, billing

(The sequence is sketched in code below.)
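The five numbered steps above, expressed as one orchestration flow. Every helper is a hypothetical stub so the sketch stays self-contained; the deck does not describe the real front-end or Resource Manager interfaces.

# Sketch of the five VSC steps as one flow. All helpers are hypothetical stubs.

def bundle_in_vm(workload):          # 1) bundle workload in a virtual machine
    return f"vm-image({workload})"

def ship_to_frontend(vm_image):      # 2) ship the image to the front-end
    return "job-0001"                # hypothetical job identifier

def pxe_boot_vm(job_id, nodes):      # 3) Resource Manager PXE-boots the VM
    print(f"{job_id}: booting VM on {nodes} nodes")

def monitor_and_finalize(job_id):    # 4) ClustrX monitoring & finalize
    print(f"{job_id}: finished")

def fetch_results(job_id):           # 5) get results back (accounting/billing follow)
    return {"job": job_id, "status": "ok"}


def run_vsc_job(workload, nodes):
    job_id = ship_to_frontend(bundle_in_vm(workload))
    pxe_boot_vm(job_id, nodes)
    monitor_and_finalize(job_id)
    return fetch_results(job_id)


print(run_vsc_job("my-solver", nodes=16))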


Putting it all together

T‐PLATFORMS HARDWARE, SOFTWARE

AND SERVICES IN THE UK


T‐Platforms Systems, Software, Services

Hardware solutions (modular HPC): TB2-TL/XN T-Blade, V-Class, T-Mini P-Class

Infrastructure solutions: Super Cool Enclosure

«Clustrx» Operating System

Services

• Design of the HPC solution

• Modular hardware

• Delivery by T‐Platforms GmbH

• Support by Integrex, UK

• Provision of ClustrX for cluster management and control:

• Multi‐vendor hardware support

• Power‐aware resource management

• VSC

• Support by Erlang‐Solutions, UK
