How anonymous is Bitcoin, really?

NEW YORK • Data scientist Alyssa Blackburn from Rice University and Baylor College of Medicine in Houston has spent several years performing digital detective work with her trusty lab assistant, Hail Mary, a shiny black computer with orange trim.
She has been collecting and analysing leaks from the Bitcoin blockchain, the immutable public ledger that has recorded all transactions since the cryptocurrency's launch in January 2009.
Bitcoin represents a techno-utopian dream. Satoshi Nakamoto, its pseudonymous inventor, proposed that the world run not on centralised financial institutions but on an egalitarian, maths-based electronic money system distributed through a computer network.
The system would be "trustless" - that is, it would not rely on a trusted party, like a bank or government, to arbitrate deals. Rather, as Satoshi Nakamoto wrote in a 2008 white paper, it would be anchored in "cryptographic proof instead of trust". Or, as T-shirts proclaim: "In Code We Trust."
The practicalities have proved complicated. Price turbulence is enough to induce the Bitcoin bends, and the system is environmentally destructive, since the computational network uses exorbitant amounts of electricity.
Ms Blackburn said her project was agnostic to Bitcoin's pros and cons. Her goal was to pierce the scrim of anonymity, track the transaction flow from day 1 and study how the world's largest cryptoeconomy emerged.
Satoshi Nakamoto had presented the currency as anonymous: For Bitcoin transactions, users employ pseudonyms, or addresses - alphanumeric cloaks that hide their real identities.
And there was apparent confidence in the anonymity; in 2011, WikiLeaks announced it would accept donations via Bitcoin. But over time, research revealed data leakage; the identity protections were not so watertight after all.
"Drip by drip, information leakage erodes the once-impenetrable blocks, carving out a new landscape of socioeconomic data," Ms Blackburn and her collaborators report in their new paper, which has not yet been published in a peer-reviewed journal.
Aggregating multiple leakages, Ms Blackburn consolidated many Bitcoin addresses, which might have seemed to represent many miners, into few. She pieced together a catalogue of agents and concluded that, in those first two years, 64 key players - some of them the community's "founders", as the researchers called them - mined most of the Bitcoin that existed at the time.
"What they figured out, just how concentrated early mining and use of Bitcoin was, that's a scientific discovery," said University of Chicago economist Eric Budish.
Professor Budish, who has done research in this realm, was given a two-hour preview of the work by the authors over video. Referring to those early key players, he suggested that the paper be titled The Bitcoin 64.
Computer scientist Jaron Lanier, who is based in Berkeley, California, and was an early reader of the paper, called the investigation important and significant in its ambitions and social implications.
"This thing isn't hermetically sealed," he said. "I don't think it's the end of the story. I think there's further innovation that will take place, extracting information from these types of systems."
One of Ms Blackburn's tactics was simple perseverance. "I kicked it till it broke," she said, recalling how principal investigator Erez Lieberman Aiden, an applied mathematician, computer scientist and geneticist at Baylor College of Medicine and Rice University, characterised her method.
More precisely, Ms Blackburn developed hacks for the period of time that was of particular interest: from the cryptocurrency's start to when Bitcoin achieved parity with the US dollar in February 2011, which coincided with the establishment of the Silk Road, a Bitcoin-based black market.
She leveraged human lapses such as insecure user behaviour; she exploited operational features inherent to Bitcoin's software; she deployed established techniques for linking the pseudonymous addresses; and she developed new techniques.
Ms Blackburn was particularly interested in miners, the agents who verify transactions by engaging in an elaborate computational tournament - a puzzle hunt, of sorts, guessing and checking random numbers against a target, in search of a lucky number. When a miner wins, they earn Bitcoin income. Whether 64 seems like a small or large number of key miners depends on one's proximity to the crypto undertow.
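The guess-and-check tournament can be illustrated with a toy sketch. This is not Bitcoin's actual block format: the header bytes, the 8-byte nonce and the `difficulty_bits` parameter are assumptions made for illustration, and only the general idea (hash, compare against a target, try again) follows the description above.

```python
import hashlib


def block_hash(header: bytes, nonce: int) -> int:
    """Double SHA-256 of header plus nonce, read as one large integer."""
    payload = header + nonce.to_bytes(8, "little")
    digest = hashlib.sha256(hashlib.sha256(payload).digest()).digest()
    return int.from_bytes(digest, "big")


def mine(header: bytes, difficulty_bits: int, max_tries: int = 1_000_000):
    """Guess nonces in order until the hash falls below the target."""
    target = 1 << (256 - difficulty_bits)  # smaller target = harder puzzle
    for nonce in range(max_tries):
        if block_hash(header, nonce) < target:
            return nonce  # the "lucky number"
    return None  # no winner within max_tries


def is_valid(header: bytes, nonce: int, difficulty_bits: int) -> bool:
    """Anyone can verify a claimed win with a single hash."""
    return block_hash(header, nonce) < (1 << (256 - difficulty_bits))
```

Finding a winning nonce takes many guesses on average, while checking one takes a single hash. That asymmetry is what lets the rest of the network cheaply confirm a miner's win.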
Scholars have questioned whether Bitcoin is truly a decentralised currency. From Dr Lieberman Aiden's perspective, the population under probe was "even more concentrated than it seems".
Although the analysis showed that the big players numbered 64 over two years, at any given moment, according to the researchers' modelling, the effective size of that population was only five or six. And on many occasions, just one or two people held most of the mining power.
As Ms Blackburn described it, there were very few people "wearing the crown", functioning as arbiters of the network - "which is not the ethos of decentralised trustless crypto", she said.

FINDING TREASURES IN THE DATA

For Ms Blackburn and Dr Lieberman Aiden, Bitcoin's data - 324 or so gigabytes archived in the blockchain - presented a cache of temptation.
Dr Lieberman Aiden's lab does biological physics and, more widely, applied mathematics; one focus is three-dimensional genome mapping. But as a scholar, he is also intrigued by the use of new kinds of data to explore complex phenomena. In 2011, he published a quantitative cultural analysis using more than five million digitised books from 1800 to 2000, with Google Books and collaborators. "Culturomics", he called it.
For instance, the team introduced the Google Ngram Viewer, which lets users type in a word or phrase and observe its usage plotted over the centuries. In the same spirit, he wondered what treasures might be submersed in Bitcoin's data lake.
"We literally have a record of every single transaction," he said. "These are remarkable economic and sociological data sets. Clearly, there's a lot of information in there, if you can get at it."
Getting at it proved non-trivial. Ms Blackburn was barred from the university's supercomputing cluster - with her file folder labelled "Bitcoin", she was suspected of mining the cryptocurrency. She said she tried to convince an administrator that she was doing research, but "they were completely unmoved".
A key tactic of Ms Blackburn's was to trace patterns in plots of numbers that in theory should have been random and meaningless. In one case, she was chasing the "extranonce", one piece of the mining puzzle: a short field of 0s and 1s tucked within a longer string that encodes each block, or bundle, of transactions.
The extranonce leaked information about a computer's activity. This led her to reconstruct the miners' behaviour: when they were mining, when they stopped and when they started up again. She speculates that the extranonce's leaky behaviour was tolerated because it allowed Bitcoin's creator to keep an eye on miners; the source code was modified to plug this leak shortly before Satoshi Nakamoto disappeared from the public Bitcoin community in December 2010.
Once Ms Blackburn had put various toeholds to use - allowing her to erode the identity-masking protections - she began merging addresses, linking nodes on a graph, consolidating the effective population of mining agents. Then she cross-referenced and validated the results with information scraped from Bitcoin discussion forums and blogs.
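Merging addresses by linking nodes on a graph can be sketched with a standard union-find structure. This is an illustrative sketch, not the study's code: the `links` input, a list of address pairs that some piece of evidence ties to the same agent, is an assumption, as are the address names in the usage note.

```python
def merge_addresses(links):
    """Group addresses into agents, given pairs believed to share an owner."""
    parent = {}

    def find(addr):
        # Follow parent pointers to the cluster's representative,
        # shortening the path along the way.
        parent.setdefault(addr, addr)
        while parent[addr] != addr:
            parent[addr] = parent[parent[addr]]
            addr = parent[addr]
        return addr

    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two clusters

    clusters = {}
    for addr in list(parent):
        clusters.setdefault(find(addr), set()).add(addr)
    return list(clusters.values())
```

Given the links `("addr_A", "addr_B")`, `("addr_B", "addr_C")` and `("addr_D", "addr_E")`, five apparent addresses collapse into two agents. Each new link can only shrink the count, which is consistent with a catalogue that falls as more leaks are combined.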
Initially, the catalogue of agents who mined most of the Bitcoin tallied a couple of thousand; then it hovered for a while around 200. Ultimately, Hail Mary spat out 64.
The study's purpose was not to name names; it's the job of the federal authorities to bust Bitcoin criminals. But the researchers pinpointed the identities of a couple of the top players who were publicly known Bitcoin criminals:
• Agent No. 19 is Michael Mancil Brown, aka "Dr Evil", who was found guilty of a 2012 fraud and extortion scheme involving then candidate for president Mitt Romney.
• Agent No. 67 is associated with Ross Ulbricht, aka "DreadPirateRoberts", creator of the Silk Road.
Naturally, Agent No. 1 is Satoshi Nakamoto - whose true identity the researchers did not try to determine.
Once the catalogue of agents was done, Ms Blackburn analysed the income they had reaped from mining, and found that within a few months of the cryptocurrency's introduction - and contrary to Bitcoin's egalitarian promise - a classic distribution of income inequality emerged: A small fraction of the miners held most of the wealth and power.
In the formal study, Ms Blackburn also observed that the concentration of resources threatened the network's security, with a miner's computational resources being directly proportionate to his or her mining income.
On several occasions, individual miners wielded more than 50 per cent of the computational power and, as a result, could have taken over like a tyrant using what is called a "51 per cent attack". For instance, they could have cheated the system and repeatedly spent the same bitcoins on different transactions.
University College London cryptographer Sarah Meiklejohn said the investigation's findings, assuming they were error-free, provide empirical confirmation of an "intuition that has been floating around in this space for a while".
"We all kind of knew that mining was fairly centralised," she said. "There aren't that many miners. This is true even today, of course, and it was even more true at the beginning."
As for what should be done about it, "we do need to really examine that question", she said. "How do we make mining more decentralised?"
She thought the results of this investigation might encourage the field to take the issue more seriously. But to add a twist, Ms Blackburn found that while some miners had the power to execute 51 per cent attacks, they repeatedly chose not to. Rather, they acted altruistically - preserving the cryptocurrency's integrity, even though the decentralisation-based fraud-prevention mechanism had been compromised.
One moral of the story, Ms Blackburn said, is simply: "You have to be careful." There is a limited timeline for encryption, "a horizon beyond which it will no longer be useful. When you are encrypting private data and making it public, you cannot assume that it'll be private forever".
NYTIMES