History of Supercomputers

Many of us are familiar with computers. You’re likely using one now to read this blog post, since laptops, smartphones and tablets all rely on essentially the same underlying computing technology. Supercomputers, on the other hand, are somewhat esoteric: they’re often thought of as hulking, costly, energy-sucking machines developed, by and large, for government institutions, research centers and large firms.

Take, for instance, China’s Sunway TaihuLight, currently the world’s fastest supercomputer, according to Top500’s supercomputer rankings. It’s composed of 41,000 chips (the processors alone weigh over 150 tons), cost about $270 million and has a power rating of 15,371 kW. On the plus side, however, it’s capable of performing quadrillions of calculations per second and can store up to 100 million books. And like other supercomputers, it’ll be used to tackle some of the most complex tasks in science, such as weather forecasting and drug research.

The notion of a supercomputer first arose in the 1960s, when an electrical engineer named Seymour Cray embarked on creating the world’s fastest computer. Cray, considered the “father of supercomputing,” had left his post at business computing giant Sperry-Rand to join the newly formed Control Data Corporation so that he could focus on developing scientific computers.

The title of world’s fastest computer was held at the time by the IBM 7030 “Stretch,” one of the first to use transistors instead of vacuum tubes.  

In 1964, Cray introduced the CDC 6600, which featured innovations such as silicon transistors in place of germanium ones and a Freon-based cooling system.

More importantly, it ran at a speed of 40 MHz, executing roughly three million floating-point operations per second, which made it the fastest computer in the world. Often considered to be the world’s first supercomputer, the CDC 6600 was 10 times faster than most computers and three times faster than the IBM 7030 Stretch. The title was eventually relinquished in 1969 to its successor, the CDC 7600.

In 1972, Cray left Control Data Corporation to form his own company, Cray Research. After some time raising seed capital and financing from investors, Cray debuted the Cray 1, which again raised the bar for computer performance by a wide margin. The new system ran at a clock speed of 80 MHz and performed 136 million floating-point operations per second (136 megaflops). Other unique features included a newer type of processor design (vector processing) and a speed-optimized, horseshoe-shaped layout that minimized the length of the circuits. The Cray 1 was installed at Los Alamos National Laboratory in 1976.

By the 1980s, Cray had established himself as the preeminent name in supercomputing, and any new release was widely expected to surpass his previous efforts. So while Cray was busy working on a successor to the Cray 1, a separate team at the company put out the Cray X-MP, a model that was billed as a more “cleaned up” version of the Cray 1.

It shared the same horseshoe-shaped design but boasted multiple processors and shared memory, and it is sometimes described as two Cray 1s linked together as one. In fact, the Cray X-MP (800 megaflops) was one of the first “multiprocessor” designs and helped open the door to parallel processing, wherein computing tasks are split into parts and executed simultaneously by different processors.
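To make the idea of parallel processing a little more concrete, here is a minimal Python sketch of the split-and-combine pattern described above. It is only an illustration, not how the Cray X-MP was actually programmed: the function names (`split_range`, `sum_of_squares`, `parallel_sum_of_squares`) and the toy task of summing squares are assumptions made for the example. It uses a thread pool by default, which shows the structure of the technique; for work that truly runs on several cores at once in Python, `concurrent.futures.ProcessPoolExecutor` could be passed in instead.

```python
from concurrent.futures import ThreadPoolExecutor


def split_range(n, parts):
    """Split range(n) into `parts` contiguous (start, stop) chunks."""
    step, extra = divmod(n, parts)
    chunks, start = [], 0
    for i in range(parts):
        # The first `extra` chunks take one additional item each.
        stop = start + step + (1 if i < extra else 0)
        chunks.append((start, stop))
        start = stop
    return chunks


def sum_of_squares(chunk):
    """Work done by one worker: sum i*i over its own chunk."""
    start, stop = chunk
    return sum(i * i for i in range(start, stop))


def parallel_sum_of_squares(n, workers=4, executor_cls=ThreadPoolExecutor):
    """Split the task, hand each part to a worker, then combine the results.

    `executor_cls` is a hypothetical knob for this sketch; any executor
    class from concurrent.futures should work here.
    """
    chunks = split_range(n, workers)
    with executor_cls(max_workers=workers) as pool:
        partials = list(pool.map(sum_of_squares, chunks))
    return sum(partials)


if __name__ == "__main__":
    print(parallel_sum_of_squares(1_000_000))
```

The key point is that each worker only ever sees its own chunk, so the parts can be computed independently and the partial results added together at the end.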

The Cray X-MP, which was continually updated, served as the standard-bearer until the long-anticipated launch of the Cray 2 in 1985. Like its predecessors, Cray’s latest and greatest took on the same horseshoe-shaped design and basic layout, with integrated circuits stacked together on logic boards. This time, however, the components were crammed so tightly that the computer had to be immersed in a liquid cooling system to dissipate the heat.

The Cray 2 came equipped with eight processors, with a “foreground processor” in charge of handling storage and memory and giving instructions to the “background processors,” which were tasked with the actual computation. Altogether, it packed a processing speed of 1.9 billion floating-point operations per second (1.9 gigaflops), two times faster than the Cray X-MP.

Needless to say, Cray and his designs ruled the early era of supercomputing. But he wasn’t the only one advancing the field. The early 1980s also saw the emergence of massively parallel computers, powered by thousands of processors all working in tandem to smash through performance barriers. Some of the first multiprocessor systems were created by W. Daniel Hillis, who came up with the idea as a graduate student at the Massachusetts Institute of Technology. The goal at the time was to overcome the speed limitations of having a CPU direct computations among the other processors by developing a decentralized network of processors that functioned similarly to the brain’s neural network. His implemented solution, introduced in 1985 as the Connection Machine, or CM-1, featured 65,536 interconnected single-bit processors.

The early 1990s marked the beginning of the end for Cray’s stranglehold on supercomputing. By then, the supercomputing pioneer had split off from Cray Research to form Cray Computer Corporation. Things started to go south for the company when the Cray 3 project, the intended successor to the Cray 2, ran into a whole host of problems.

One of Cray’s major mistakes was opting for gallium arsenide semiconductors, a newer technology, as a way to achieve his stated goal of a twelvefold improvement in processing speed. Ultimately, the difficulty of producing them, along with other technical complications, delayed the project for years and led many of the company’s potential customers to lose interest. Before long, the company ran out of money, and it filed for bankruptcy in 1995.

Cray’s struggles gave way to a changing of the guard of sorts, as competing Japanese computing systems came to dominate the field for much of the decade. Tokyo-based NEC Corporation first came onto the scene in 1989 with the SX-3 and a year later unveiled a four-processor version that took over as the world’s fastest computer, only to be eclipsed in 1993. That year, Fujitsu’s Numerical Wind Tunnel, with the brute force of 166 vector processors, became the first supercomputer to surpass 100 gigaflops. (Side note: to give you an idea of how rapidly the technology advances, the fastest consumer processors in 2016 can easily do more than 100 gigaflops, but at the time it was particularly impressive.) In 1996, the Hitachi SR2201 upped the ante with 2,048 processors to reach a peak performance of 600 gigaflops.

Now where was Intel? The company that had established itself as the consumer market’s leading chipmaker didn’t really make a splash in the realm of supercomputing until towards the end of the century.

This was because the technologies were altogether very different animals. Supercomputers, for instance, were designed to jam in as much processing power as possible, while personal computers were all about squeezing efficiency from minimal cooling capabilities and a limited energy supply. So in 1993, Intel engineers finally took the plunge and went massively parallel with the 3,680-processor Intel XP/S 140 Paragon, which by June of 1994 had climbed to the summit of the supercomputer rankings. In fact, it was the first massively parallel processor supercomputer to be indisputably the fastest system in the world.

Up to this point, supercomputing had been mainly the domain of those with the kind of deep pockets needed to fund such ambitious projects. That all changed in 1994, when contractors at NASA's Goddard Space Flight Center, who didn’t have that kind of luxury, came up with a clever way to harness the power of parallel computing by linking and configuring a series of personal computers over an Ethernet network. The “Beowulf cluster” system they developed consisted of 16 486DX processors, was capable of operating in the gigaflops range and cost less than $50,000 to build. It also had the distinction of running Linux rather than Unix, before Linux became the operating system of choice for supercomputers. Pretty soon, do-it-yourselfers everywhere were following similar blueprints to set up their own Beowulf clusters.

After relinquishing the title in 1996 to the Hitachi SR2201, Intel came back that year with a design based on the Paragon called ASCI Red, which consisted of more than 6,000 200 MHz Pentium Pro processors. Despite moving away from vector processors in favor of off-the-shelf components, the ASCI Red gained the distinction of being the first computer to break the one trillion flops barrier (1 teraflops). By 1999, upgrades enabled it to surpass three trillion flops (3 teraflops). The ASCI Red was installed at Sandia National Laboratories and was used primarily to simulate nuclear explosions and assist in the maintenance of the country’s nuclear arsenal.

After Japan retook the supercomputing lead for a period with the 35.9-teraflops NEC Earth Simulator, IBM brought supercomputing to unprecedented heights starting in 2004 with the Blue Gene/L. That year, IBM debuted a prototype that, at 36 teraflops, just barely edged the Earth Simulator. And by 2007, engineers had ramped up the hardware to push its processing capability to a peak of nearly 600 teraflops. Interestingly, the team was able to reach such speeds by using a larger number of chips that were relatively low-power but more energy-efficient. In 2008, IBM broke ground again when it switched on the Roadrunner, the first supercomputer to exceed one quadrillion floating-point operations per second (1 petaflops).