NSA to store yottabytes of surveillance data in Utah megarepository

There’s an interesting article
in the current New York Review of Books (predictably, a book review)
detailing the history of the National Security Agency, that shadowy
power-behind-the-power to which we surrender much of our privacy. That
in itself is interesting, but I found the introduction a bit shocking:
the NSA is constructing a datacenter in the Utah desert that they
project will be storing yottabytes of surveillance data. And what is a yottabyte? I’m glad you asked.
There are a thousand gigabytes in a terabyte, a thousand terabytes
in a petabyte, a thousand petabytes in an exabyte, a thousand exabytes
in a zettabyte, and a thousand zettabytes in a yottabyte. In other
words, a yottabyte is 1,000,000,000,000,000 GB. Are you paranoid yet?
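If you'd rather check that arithmetic than take my word for it, here is a minimal sketch in Python. It assumes the decimal (power-of-1,000) prefixes used above; the names `PREFIXES` and `bytes_in` are just made up for illustration.

```python
# Decimal (SI) storage prefixes, each a factor of 1,000 larger than the last.
# Assumption: powers of 1,000, as in the post, not the binary powers of 1,024.
PREFIXES = ["kilo", "mega", "giga", "tera", "peta", "exa", "zetta", "yotta"]


def bytes_in(prefix: str) -> int:
    """Return the number of bytes in one <prefix>byte."""
    return 1000 ** (PREFIXES.index(prefix) + 1)


# How many gigabytes fit in a single yottabyte?
gigabytes_per_yottabyte = bytes_in("yotta") // bytes_in("giga")
print(f"{gigabytes_per_yottabyte:,} GB")  # 1,000,000,000,000,000 GB
```

Five steps of a thousand separate giga from yotta, which is where the fifteen zeroes come from.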
The
more salient question is, of course, what are they storing that, by
some estimates, is going to take up thousands of times more space than all
the world’s known computers combined? Don’t think they’re going to say;
they didn’t grow to their current level of shadowy omniscience by
disclosing things like that to the public. However, speculation isn’t
too hard on this topic. Now more than ever, surveillance is a data
game. What with millions of phones being tapped and all data
duplicated, constant recording of all radio traffic, and 24-hour
high-definition video surveillance by satellite, there are at least
terabytes of data coming in every day. And who knows when you’ll have to sift
through August 2007’s overhead footage of Baghdad for heat signatures
in order to confirm some other intelligence?
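To get a feel for the scale, here is a rough back-of-the-envelope sketch in Python of how long it would take to fill a yottabyte at a steady daily intake. The intake rates are purely hypothetical round numbers picked for illustration, not figures from the article, and `years_to_fill` is a made-up helper name.

```python
# Decimal (SI) sizes in bytes.
BYTES_PER_TB = 10**12
BYTES_PER_PB = 10**15
BYTES_PER_YB = 10**24


def years_to_fill(capacity_bytes: int, bytes_per_day: int) -> float:
    """Years needed to fill a store at a constant daily intake (365-day years)."""
    return capacity_bytes / bytes_per_day / 365


# Hypothetical intake rates, for illustration only.
for label, rate in [("1 TB/day", BYTES_PER_TB), ("1 PB/day", BYTES_PER_PB)]:
    print(f"{label}: about {years_to_fill(BYTES_PER_YB, rate):.2e} years")
```

At a terabyte a day that works out to roughly 2.7 billion years, so a yottabyte-scale estimate only makes sense if the daily intake is assumed to grow enormously.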
As for the medium on which the data might be stored, that’s
anybody’s guess. Whoever’s making the estimates is probably playing a
bit fast and loose with exponential curves, but if any of the alternative storage technologies
we cover here on CG are any indication, yottabytes won’t seem so big a
few years from now. We can be sure, however, that despite their better
dollars-per-gigabyte cost, spinning hard disks won’t be in use as a
main medium. The electricity required, mean time between failures, and
other maintenance issues are probably unacceptable for an
economy-minded government agency — interestingly, it seems that lack of
electricity is one of the NSA’s primary concerns.
The article mentions that the NSA’s equivalent in the UK, the
Government Communications Headquarters, asked that all telecoms
providers store and hand over a huge amount of customer data for an
entire year. They refused,
citing “grave misgivings” and noting that at any rate the level of data
collection expected was “impossible in principle.” Tut tut! Those Brits
lacked the American can-do spirit. Thus it was that AT&T
and other telecoms instantly complied with US mandates following
September 11. The extent of the government’s meddling with switches,
routers, antennas, and so on may never be fully known, but I wouldn’t
be surprised if everyone reading this article is on the record
somewhere. Storage capacity of this magnitude implies a truly
unprecedented number of subjects for monitoring.
There is talk of the NSA shutting down altogether or being rolled
into another agency, but I suspect that the “too big to fail” idea, as
well as the “our safety is worth any price” dogma, will prevent that
eventuality. It’s more reasonable to ask when or if its expansion will
cease being sustainable. These datacenters, and the yottabytes they
will hold, are extremely expensive and practically have
bull’s-eyes painted on them for the enemy (whoever he is), though at
under $10bn the NSA’s budget is a footnote compared to other programs
and agencies. So is the increasingly (to use a semi-word that is only
rarely usable) tentacular NSA a necessary evil of the digital age, or a
cancerous money sink born from the colossal intelligence competition of
the Cold War?
The answer will only be visible in retrospect years from now, perhaps when a sequel to the book being reviewed (The Secret Sentry: The Untold History of the National Security Agency,
by Matthew M. Aid) is released covering the heavily redacted records of
the early 2000s. In the meantime, it’s probably best to assume that the
walls have ears.
(Updated with a note on storage medium)
Update 2: A commenter points out
that in the study cited, yottabytes are only one possible estimate for
total storage requirements. The more realistic estimates are in the
hundreds of petabytes, which is much easier for a datacenter to
accommodate. That said, I’m leaving the post as it is because the
speculation still stands with “only” hundreds of petabytes being stored
in these datacenters. However, adjust your tinfoil hats accordingly.