A version of this article appeared October 8, 2013, on page A1 in the U.S. edition of The Wall Street Journal, with the headline: Meltdowns Hobble NSA Data Center.
Meltdowns Hobble NSA Data Center
Investigators Stumped by What's Causing Power Surges That Destroy Equipment
Chronic electrical surges at the massive new data-storage facility central
to the National Security Agency's spying operation have destroyed hundreds
of thousands of dollars' worth of machinery and delayed the center's opening
for a year, according to project documents and current and former officials.
There have been 10 meltdowns in the past 13 months that have prevented the
NSA from using computers at its new Utah data-storage center, slated to be
the spy agency's largest, according to project documents reviewed by The
Wall Street Journal.
One project official described the electrical troubles—so-called arc
fault failures—as "a flash of lightning inside a 2-foot box." These
failures create fiery explosions, melt metal and cause circuits to fail,
the official said.
The causes remain under investigation, and there is disagreement whether
proposed fixes will work, according to officials and project documents. One
Utah project official said the NSA planned this week to turn on some of its
NSA spokeswoman Vanee Vines acknowledged problems but said "the failures
that occurred during testing have been mitigated. A project of this magnitude
requires stringent management, oversight, and testing before the government
accepts any building."
The Utah facility, one of the Pentagon's biggest U.S. construction projects,
has become a symbol of the spy agency's surveillance prowess, which gained
broad attention in the wake of leaks from NSA contractor Edward Snowden.
It spans more than one million square feet, with construction costs pegged
at $1.4 billion—not counting the Cray supercomputers that will reside
Exactly how much data the NSA will be able to store there is classified.
Engineers on the project believe the capacity is bigger than Google's largest
data center. Estimates are in a range difficult to imagine but outside experts
believe it will keep exabytes or zettabytes of data. An exabyte is roughly
100,000 times the size of the printed material in the Library of Congress;
a zettabyte is 1,000 times larger.
But without a reliable electrical system to run computers and keep them cool,
the NSA's global surveillance data systems can't function. The NSA chose
Bluffdale, Utah, to house the data center largely because of the abundance
of cheap electricity. It continuously uses 65 megawatts, which could power
a small city of at least 20,000, at a cost of more than $1 million a month,
according to project officials and documents.
Utah is the largest of several new NSA data centers, including a nearly $900
million facility at its Fort Meade, Md., headquarters and a smaller one in
San Antonio. The first of four data facilities at the Utah center was originally
scheduled to open in October 2012, according to project documents.
In the wake of the Snowden leaks, the NSA has been criticized for its expansive
domestic operations. Through court orders, the NSA collects the phone records
of nearly all Americans and has built a system with telecommunications companies
that provides coverage of roughly 75% of Internet communications in the U.S.
In another program called Prism, companies including Google, Microsoft, Facebook
and Yahoo are under court orders to provide the NSA with account information.
The agency said it legally sifts through the collected data to advance its
foreign intelligence investigations.
The data-center delays show that the NSA's ability to use its powerful
capabilities is undercut by logistical headaches. Documents and interviews
paint a picture of a project that cut corners to speed building.
Backup generators have failed numerous tests, according to project documents,
and officials disagree about whether the cause is understood. There are also
disagreements among government officials and contractors over the adequacy
of the electrical control systems, a project official said, and the cooling
systems also remain untested.
The Army Corps of Engineers is overseeing the data center's construction.
Chief of Construction Operations Norbert Suter said, "the cause of the
electrical issues was identified by the team, and is currently being corrected
by the contractor." He said the Corps would ensure the center is "completely
reliable" before handing it over to the NSA.
But another government assessment concluded the contractor's proposed solutions
fall short and the causes of eight of the failures haven't been conclusively
determined. "We did not find any indication that the proposed equipment
modification measures will be effective in preventing future incidents,"
said a report last week by special investigators from the Army Corps of Engineers
known as a Tiger Team.
The architectural firm KlingStubbins designed the electrical system. The
firm is a subcontractor to a joint venture of three companies: Balfour Beatty
Construction, DPR Construction and Big-D Construction Corp. A KlingStubbins
official referred questions to the Army Corps of Engineers.
The joint venture said in a statement it expected to submit a report on the
problems within 10 days: "Problems were discovered with certain parts of
the unique and highly complex electrical system. The causes of those problems
have been determined and a permanent fix is being implemented."
The first arc fault failure at the Utah plant was on Aug. 9, 2012, according
to project documents. Since then, the center has had nine more failures,
most recently on Sept. 25. Each incident caused as much as $100,000 in damage,
according to a project official.
It took six months for investigators to determine the causes of two of the
failures. In the months that followed, the contractors employed more than
30 independent experts who conducted 160 tests over 50,000 man-hours, according
to project documents.
This summer, the Army Corps of Engineers dispatched its Tiger Team, officials
said. In an initial report, the team said the cause of the failures remained
unknown in all but two instances.
The team said the government has incomplete information about the design
of the electrical system, a gap that could pose new problems if settings need
to change on circuit breakers. The report concluded that efforts to "fast track"
the Utah project bypassed regular quality controls in design and construction.
Contractors have started installing devices that insulate the power system
from a failure and would reduce damage to the electrical machinery. But the
fix wouldn't prevent the failures, according to project documents and current
and former officials.
Contractor representatives wrote last month to NSA officials to acknowledge
the failures and describe their plan to ensure there is reliable electricity
for computers. The representatives said they didn't know the true source
of the failures but proposed remedies they believed would work. With those
measures and others in place, they said, they had "high confidence that the
electrical systems will perform as required by the contract."
A couple of weeks later, on Sept. 23, the contractors reported they had uncovered
the "root cause" of the electrical failures, citing a "consensus" among 30
investigators, which didn't include government officials. Their proposed
solution was the same device they had already begun installing.
The Army Corps of Engineers' Tiger Team said the contractor's explanations
were unproven. The causes of the incidents "are not yet sufficiently understood
to ensure that [the NSA] can expect to avoid these incidents in the future,"
their report said.
Write to Siobhan Gorman at email@example.com