IBM’s answer to the cost-effective supercomputer has already been up and running for several months now, but only recently has the company disclosed any tangible details about its so-called Vela project.
In a blog post discussing the details, IBM revealed that the research, authored by five workers at the company, tackles the problems with previous supercomputers and their lack of readiness for AI tasks.
In order to adapt the supercomputer model for this emerging type of workload, the company sheds some light on the decisions it made regarding the use of affordable but powerful hardware.
IBM’s Vela AI supercomputer
The work highlights that “building a [traditional] supercomputer has meant bare metal nodes, high-performance networking hardware… parallel file systems, and other items usually associated with high-performance computing (HPC).”
While it’s clear that these supercomputers can handle heavy AI workloads, including the one designed for OpenAI, the startup behind the popular ChatGPT chatbot, a lack of optimization has meant that traditional supercomputers may lack valuable power in some areas and carry an excess in others, leading to unnecessary spend.
While it has long been accepted that bare metal nodes are the most suitable for AI, IBM wanted to explore offering these up inside a virtual machine (VM). The result, according to Big Blue, is big efficiency gains.
“Following a significant amount of research and discovery, we devised a way to expose all of the capabilities on the node (GPUs, CPUs, networking, and storage) into the VM so that the virtualization overhead is less than 5%, which is the lowest overhead in the industry that we are aware of.”
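IBM’s post does not spell out the exact benchmark behind that figure, but an overhead claim of this kind boils down to simple arithmetic: run the same workload on bare metal and inside the VM, then compare throughput. A minimal sketch of that calculation, with throughput numbers that are purely hypothetical rather than IBM’s measurements:

```python
# Hypothetical illustration of how a virtualization-overhead figure
# like "less than 5%" can be derived. The throughput values below are
# made up for illustration, not IBM's published measurements.

def virtualization_overhead(bare_metal_throughput: float, vm_throughput: float) -> float:
    """Return the overhead as a percentage of bare-metal performance."""
    return (bare_metal_throughput - vm_throughput) / bare_metal_throughput * 100

# Example: samples/second from the same training job on both setups (assumed values).
bare_metal = 1000.0  # samples/s on bare metal
inside_vm = 960.0    # samples/s inside the VM

print(f"Virtualization overhead: {virtualization_overhead(bare_metal, inside_vm):.1f}%")
# -> Virtualization overhead: 4.0%, i.e. under the 5% figure IBM cites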
In terms of node design, each Vela node packs 80GB of GPU memory, 1.5TB of DRAM, and four 3.2TB NVMe storage drives.
The Next Platform estimates that, if IBM wanted to enter its supercomputer in the Top500 rankings, it would deliver around 27.9 petaflops of performance, placing it in 15th place according to November 2022’s rankings.
While today’s supercomputers are able to handle AI workloads, huge advances in artificial intelligence combined with the pressing need for cost efficiency highlight the need for such a machine.
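The Next Platform’s number is an estimate rather than a measured Linpack run, and estimates of this kind are typically built by multiplying node count by per-accelerator peak throughput. A rough sketch of that arithmetic follows; the node count, GPUs per node, and per-GPU figure are assumptions for illustration only, since none of them appear in IBM’s post:

```python
# Back-of-the-envelope peak-performance estimate of the kind Top500
# watchers make. All inputs below are assumptions for illustration.

nodes = 180                 # assumed number of Vela nodes
gpus_per_node = 8           # assumed GPUs per node
peak_tflops_per_gpu = 19.5  # assumed FP64 peak per GPU, in teraflops

peak_petaflops = nodes * gpus_per_node * peak_tflops_per_gpu / 1000
print(f"Estimated peak: {peak_petaflops:.1f} petaflops")
# With these assumed inputs: ~28.1 petaflops, in the same ballpark as
# The Next Platform's 27.9 petaflops estimate.
```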