SALT LAKE CITY, Utah -- The U.S. Dept. of Energy, which builds the world's largest supercomputers, is now targeting 2020 to 2022 for an exascale system, two to four years later than earlier expectations.
The new timeframe assumes that Congress will fund the project in the fiscal 2014 budget. The White House will deliver its budget request to Congress early next year for fiscal 2014, which begins next Oct. 1.
Despite a belief among scientists that exascale systems can help deliver scientific breakthroughs, improve U.S. competitiveness and deepen the understanding of problems like climate change, the development effort has so far received limited funding -- nowhere near the billions of dollars likely needed.
Experts had previously expected an exascale system to arrive in 2018. Those expectations were based, in part, on predictable increases in compute power.
In 1997, the ASCI Red supercomputer, built by Intel and installed at Sandia National Laboratories, broke the teraflop barrier, or one trillion calculations per second. ASCI Red cost $55 million to build.
By comparison, Intel's just-released Phi 60-core co-processor, which is also capable of one teraflop, is priced at $2,649.
In 2008, about a decade after ASCI Red debuted, IBM's Roadrunner began operating at Los Alamos National Laboratory. Roadrunner operated at petaflop speeds, or 1,000 trillion (one quadrillion) sustained floating point operations per second.
The next leap, an exaflop, is 1,000 petaflops.
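The scale steps and prices above can be checked with a few lines of arithmetic. The following is a minimal sketch rather than anything from the DOE or Intel; the function names are illustrative, and the price comparison uses the two dollar figures as quoted, with no adjustment for inflation.

```python
# Performance scales named in the article, in floating point operations per second.
TERAFLOP = 10**12   # one trillion
PETAFLOP = 10**15   # one quadrillion
EXAFLOP = 10**18    # the exascale target

# Prices quoted in the article for roughly one teraflop of compute.
ASCI_RED_COST_USD = 55_000_000   # ASCI Red, 1997
XEON_PHI_PRICE_USD = 2_649       # Intel Phi co-processor, 2012


def ratio(larger: int, smaller: int) -> int:
    """Return how many times 'smaller' fits into 'larger'."""
    return larger // smaller


def price_drop_factor(old_price: float, new_price: float) -> float:
    """Return how many times cheaper the new price is (nominal dollars)."""
    return old_price / new_price


print("Petaflop vs. teraflop:", ratio(PETAFLOP, TERAFLOP))
print("Exaflop vs. petaflop:", ratio(EXAFLOP, PETAFLOP))
print("Exaflop vs. teraflop:", ratio(EXAFLOP, TERAFLOP))
print("Nominal price drop per teraflop:",
      round(price_drop_factor(ASCI_RED_COST_USD, XEON_PHI_PRICE_USD)))
```

Each named step is a factor of 1,000, so an exaflop is a million teraflops. The price comparison is rough: one figure is the cost of a complete installed system and the other is the list price of a single co-processor.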
The DOE is working on a report for Congress that will detail its "Exascale Computing Initiative" (ECI). The report, initially due in February, is expected to spell out a plan and cost for building an exascale system.
William Harrod, research division director for advanced scientific computing in the DOE Office of Science, previewed the ECI report at the SC12 supercomputing conference held here last week.
"When we started this, [the timetable was] 2018; now it's become 2020 but really it is 2022," said Harrod.
"I have no doubt that somebody out there could put together an exaflop system in the 2018-2020 timeframe, but I don't think it's going to be one that's going to be destined for solving real world applications," said Harrod.
China, Europe and Japan are all working on exascale initiatives, so it's not assured that the U.S. will deliver the first exascale system.
China, in particular, has been investing heavily in large HPC systems and in its own microprocessor and interconnect technologies.
The U.S. set up some strict criteria for its exascale effort.
The system needs to be relatively low power as well as be a platform for a wide range of applications. The government also wants exascale research spending to lead to marketable technologies that can help the IT industry.
The U.S. plan, when delivered to Congress, will call for building two or three prototype systems by 2018. Once a technology approach is proven, the U.S. will order anywhere from one to three exascale systems, said Harrod.
Exascale system development poses a unique set of power, memory, concurrency and resiliency challenges.
Resiliency refers to the ability to keep a massive system, with millions of cores, continuously running despite component failures. "I think resiliency is going to be a great challenge and it really would be nice if the computer would stay up for more than a couple of hours," said Harrod.
The scale of the challenge is evident in the power goals.
The U.S. wants an exascale system that needs no more than 20 megawatts (MW) of power. In contrast, the leading petascale systems in operation today use as much as 8 MW or more.
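To see what the 20 MW ceiling implies, the power goal can be restated as an efficiency figure. The sketch below is only illustrative. The exaflop and 20 MW values come from the article, but the "today" machine is an assumption: the article gives a power draw of about 8 MW for leading systems without stating their performance, so a hypothetical 10-petaflop system is used for comparison.

```python
EXAFLOP = 10**18                   # floating point operations per second
POWER_BUDGET_WATTS = 20 * 10**6    # the 20 MW ceiling cited in the article


def gigaflops_per_watt(flops: float, watts: float) -> float:
    """Return energy efficiency in gigaflops per watt."""
    return flops / watts / 1e9


target = gigaflops_per_watt(EXAFLOP, POWER_BUDGET_WATTS)

# ASSUMPTION (not from the article): a hypothetical 10-petaflop system
# drawing 8 MW, used only to give the target some scale.
ASSUMED_FLOPS = 10 * 10**15
ASSUMED_WATTS = 8 * 10**6
assumed_today = gigaflops_per_watt(ASSUMED_FLOPS, ASSUMED_WATTS)

print(f"Exascale target: {target:.1f} gigaflops per watt")
print(f"Assumed petascale system: {assumed_today:.2f} gigaflops per watt")
print(f"Improvement needed under that assumption: {target / assumed_today:.0f}x")
```

An exaflop within 20 MW works out to 50 gigaflops per watt. The size of the gap depends entirely on the assumed baseline, so the last figure should be read as an order-of-magnitude indication, not a measurement.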
Although processor capability remains paramount, it is not the center of attention in exascale system design.
Dave Turek, vice president of exascale systems at IBM, said the real change with exascale systems isn't around the microprocessor, especially in the era of big data. "It's really settled around the idea of data and minimizing data movement as the principal design philosophy behind what comes in the future," he said.
In today's systems, data has to travel a long way, which uses up power. The datasets "being generated are so large that it's basically impractical to write the data out to disk and bring it all back in to analyze it," said Harrod.
"We need systems that have large memory capacity," said Harrod. "If we limit the memory capacity we limit the ability to execute the applications as they need to be run," he said.
Exascale systems require a new programming model, and for now there isn't one.
High performance computing allows scientists to model, simulate and visualize processes. The systems can run endless scenarios to test hypotheses, such as discovering how a drug may interact with a cell or how a solar cell operates.
Larger systems allow scientists to expand resolution, or look at problems in finer detail, as well as increase the amount of physics applied to any problem.
The U.S. research effort would aim to fully utilize the potential of exascale, and achieve a "one billion concurrency."
To give some perspective on that goal, researchers at the Argonne National Lab developed a multi-petaflop simulation of the universe. Salman Habib, a physicist at the lab, said the simulation achieved 13.94 petaflops sustained on more than 1.5 million cores, with a total concurrency of 6.3 million at 4 threads per core on IBM's Sequoia system.
The project is the largest cosmological simulation to date.
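The concurrency figures Habib cites can be set against the billion-way goal with simple arithmetic. This is a rough sketch using only the rounded numbers quoted above; the variable names are illustrative, and the exact core count of the machine is not taken from the article.

```python
THREADS_PER_CORE = 4
REPORTED_CONCURRENCY = 6_300_000      # "6.3 million", as quoted (rounded)
TARGET_CONCURRENCY = 1_000_000_000    # the "one billion concurrency" goal

# Cores implied by the quoted concurrency at 4 threads per core.
implied_cores = REPORTED_CONCURRENCY / THREADS_PER_CORE

# How much more concurrency the exascale goal asks for.
gap_factor = TARGET_CONCURRENCY / REPORTED_CONCURRENCY

print(f"Implied cores: {implied_cores:,.0f}")
print(f"Billion-way target is about {gap_factor:.0f}x the reported concurrency")
```

The implied core count is consistent with the "more than 1.5 million cores" in the quote, and the billion-way target is roughly 160 times the concurrency this run achieved.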
"Much as we would all like to, we can't build our own universes to test various ideas about what is happening in the one real universe. Because of this inability to carry out true cosmological experiments, we run virtual experiments inside the computer and then compare the results against observations -- in this sense, large-scale computing is absolutely necessary for cosmology," said Habib.
To accomplish the task, researchers must run hundreds or thousands of virtual universes to tune their understanding. "To carry out such simulation campaigns at high fidelity requires computer power at the exascale," said Habib. "What is exciting is that by the time this power will be available, the observations and the simulations will also be keeping pace."
The total number of nodes in an exascale system will likely be in the 100,000 range, like the smaller systems today. Now, though, each node is becoming more parallel and powerful, said Pete Beckman, the director of the Exascale Technology and Computing Institute at Argonne National Laboratory.
Each IBM Blue Gene/Q node, for instance, has 16 cores running 64 threads. As time goes on, the number of threads per node will increase from the hundreds to upwards of a thousand.
"Now, when you have 1,000 independent threads of operation on a node, then the whole system ends up with billion-way concurrency," said Beckman.
"The real change is programming in the node and the parallelism to hide latency, to hide the communication to the other nodes, so that requires lots of parallelism and concurrency," said Beckman.
The new systems will require adaptive programming models, said Beckman. Until an approach is settled, it is going to be a "disruptive few years in terms of programming models."
Vendors will have to change their approaches to building software, said Harrod.
"Almost all the vendors have 50 years of legacy built into their system software - 50 years of effort where nobody ever cared about energy efficiency, reliability, minimizing data movement - that's not there, so therefore we need to change that," said Harrod.
Harrod believes the problems can be solved, but that the U.S. will have to invest in new technologies. "We have to push the vendors to go where they are not really interested in going," he said.
Harrod said the U.S. can't build a "stunt machine," or a one-off system that has limited usefulness. The exascale effort has to result in marketable technologies, he said.
"If I can do a 20MW exascale system in 500 cabinets that means we have a petaflop in a single cabinet -- that's amazing," said Harrod. Such a result would mean that a petascale system could be small enough to fit in the data closet of an academic department or business unit.
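Harrod's cabinet example can be worked through directly. The sketch below takes the 20 MW, 500-cabinet and exaflop figures from his quote and assumes, purely for illustration, that performance and power are spread evenly across cabinets.

```python
EXAFLOP_IN_PETAFLOPS = 1000    # one exaflop is 1,000 petaflops
CABINETS = 500                 # cabinet count from Harrod's example
SYSTEM_POWER_KW = 20 * 1000    # 20 MW expressed in kilowatts

# ASSUMPTION: performance and power divide evenly across cabinets.
petaflops_per_cabinet = EXAFLOP_IN_PETAFLOPS / CABINETS
kilowatts_per_cabinet = SYSTEM_POWER_KW / CABINETS

print(f"Performance per cabinet: {petaflops_per_cabinet:.1f} petaflops")
print(f"Power per cabinet: {kilowatts_per_cabinet:.0f} kW")
```

Under that even split, each cabinet would deliver about two petaflops on roughly 40 kW, which more than covers the "petaflop in a single cabinet" Harrod describes.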
"We have to do a fair amount of research before we can actually start going out and designing and developing these computers," said Harrod. "We actually don't know exactly how to design and develop these computers at this point in time."
Funding for an exascale system remains a question. The U.S. has approved funds to cover preliminary efforts, about $73 million, but has not yet allocated exascale program funding.
"We don't anticipate the ECI [exascale] funding to start before 2014," said Harrod.
FY 2014 begins Oct. 1, 2013. But current fiscal problems in Congress, the so-called fiscal cliff in particular, make Harrod pessimistic about funding for next year. "To be honest, I would be somewhat doubtful of that at this point in time," he said.
"The biggest problem is the budget," said Harrod. "Until I have a budget, I really don't know what I'm doing," he said.
Patrick Thibodeau covers SaaS and enterprise applications, outsourcing, government IT policies, data centers and IT workforce issues for Computerworld.