Over the past year, we developed a data-driven method for identifying, quantifying, and comparing ransom payments in the Bitcoin ecosystem across 35 ransomware families. The study was conducted in partnership with Bernhard Haslhofer from the Austrian Institute of Technology (AIT) and Benoît Dupont from the Université de Montréal (UdeM). It resulted in a paper that will be presented, alongside the work of renowned academic researchers, at the 17th Annual Workshop on the Economics of Information Security (WEIS2018) in Innsbruck, Austria.
WEIS2018 is “the leading forum for interdisciplinary scholarship on information security and privacy, combining expertise from the fields of economics, social science, business, law, policy, and computer science.”
The paper was published in the proceedings of the conference. This blog post provides a quick summary of the methodology developed for tracing ransom payments and presents the study’s estimate of the lower-bound direct financial impact of 35 ransomware families.
Ransom Payments in Bitcoin: A Perfect Opportunity
Ransomware prevents a user from accessing a device and its files until a ransom is paid to the attacker, most frequently in Bitcoin. With over 500 known ransomware families, it has become one of the dominant cybercrime threats for law enforcement, security professionals and the public. Ransomware Bitcoin payments represent a perfect research opportunity to quantify at least the lower-bound direct financial impact of ransomware attacks (to learn more about Bitcoin, you can refer to this source). Indeed:
- Most ransoms are paid in Bitcoin,
- Bitcoin transactions are available publicly,
- Clustering heuristics as well as tools have been developed in the past years to extract information from each Bitcoin transaction.
Tracing Ransomware Monetary Payments
Scraping the web, we found 7,118 Bitcoin addresses related to 35 ransomware families. The first step in our methodology was to expand the dataset using the multi-input heuristic.
Multi-Input Heuristic: Key for Analyzing Bitcoin Transactions
A number of heuristics have been developed to analyze Bitcoin transactions and group addresses found in the blockchain into maximal subsets (clusters) that can be associated with different real-world actors. A key one is the multi-input heuristic. It assumes that two addresses used as inputs in the same transaction are controlled by the same real-world actor, since spending from both requires both private keys. Thus, if addresses A and B are used as inputs in one transaction and addresses A and C are used as inputs in another, one can infer that A, B and C belong to the same real-world actor. This heuristic is, however, not valid for CoinJoin transactions or for transactions from mixing services (tumblers), where inputs from unrelated users are deliberately combined.
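As an illustration only, the multi-input heuristic can be sketched with a union-find (disjoint-set) structure. This is not the GraphSense implementation; the function name `cluster_addresses` and the input shape (one list of input addresses per transaction) are assumptions made for this example.

```python
from collections import defaultdict


def cluster_addresses(transactions):
    """Group addresses that appear together as inputs of a transaction.

    `transactions` is an iterable of input-address lists, one per transaction
    (a hypothetical, simplified input format).
    Returns a list of clusters, each a set of addresses.
    """
    parent = {}

    def find(addr):
        # Walk up to the cluster representative, compressing the path.
        parent.setdefault(addr, addr)
        while parent[addr] != addr:
            parent[addr] = parent[parent[addr]]
            addr = parent[addr]
        return addr

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for inputs in transactions:
        if not inputs:
            continue
        first = inputs[0]
        find(first)  # register single-input transactions too
        for other in inputs[1:]:
            union(first, other)

    clusters = defaultdict(set)
    for addr in list(parent):
        clusters[find(addr)].add(addr)
    return list(clusters.values())


# The example from the text: A and B co-spend, then A and C co-spend.
txs = [["A", "B"], ["A", "C"], ["D"]]
clusters = cluster_addresses(txs)
```

Here A, B and C end up in one cluster and D in its own. A real pipeline would also need to exclude CoinJoin and mixer transactions before clustering, which this sketch does not attempt.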
In terms of attribution, if one address in a cluster can be associated with a tag (such as belonging to a large exchange or an organization), then the whole cluster can be associated with that tag. This effectively deanonymizes the whole cluster.
Data Process
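A minimal sketch of this tag propagation, assuming clusters are plain sets of addresses and tags are stored in a dictionary keyed by address. The function name and the tag value are hypothetical and only serve the example.

```python
def propagate_tags(clusters, known_tags):
    """Label every address in a cluster with the tags known for any member.

    `clusters` is a list of address sets; `known_tags` maps address -> tag.
    Returns a dict mapping each address to a frozenset of tags.
    """
    labelled = {}
    for cluster in clusters:
        # Collect every tag attached to at least one member of the cluster.
        tags = frozenset(known_tags[a] for a in cluster if a in known_tags)
        for addr in cluster:
            labelled[addr] = tags
    return labelled


clusters = [{"A", "B", "C"}, {"D"}]
known_tags = {"B": "example-exchange"}  # hypothetical tag
labels = propagate_tags(clusters, known_tags)
```

In this toy input, A and C inherit the tag attached to B, while D stays untagged.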
We applied the multi-input heuristic to each of the 7,118 Bitcoin addresses in the seed dataset. We then applied a time filter to the expanded dataset, based on the start date of each ransomware campaign. To determine that date, we used Google Trends data and extracted the first month in which online searches about the ransomware family took place. The whole processing of Bitcoin transactions was conducted with the open-source GraphSense cryptocurrency analytics platform and is summarized in Figure 1.
Figure 1 – Data Process Created by Bernhard Haslhofer
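The time filter can be sketched as follows. The post does not spell out the exact rule, so this example assumes that transactions dated before the first month of search interest are discarded; the data, field names and function names are made up for illustration.

```python
from datetime import date


def first_search_month(monthly_interest):
    """Return the earliest month with non-zero search interest, or None."""
    for month in sorted(monthly_interest):
        if monthly_interest[month] > 0:
            return month
    return None


def apply_time_filter(transactions, start):
    """Keep transactions dated on or after the assumed campaign start."""
    return [tx for tx in transactions if tx["date"] >= start]


# Made-up monthly search interest for a hypothetical ransomware family.
interest = {
    date(2016, 1, 1): 0,
    date(2016, 2, 1): 12,
    date(2016, 3, 1): 40,
}
txs = [
    {"address": "A", "date": date(2015, 11, 20), "btc": 0.5},
    {"address": "A", "date": date(2016, 2, 14), "btc": 1.0},
]
start = first_search_month(interest)
kept = apply_time_filter(txs, start)
```

With this input, the November 2015 transaction is dropped because it predates the first month of search interest (February 2016).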
Once the time filter was applied, we built family-specific graphs focusing on outgoing relationships. In short, for each address related to a family, we mapped where the money was being sent. For example, Figure 2 shows an outgoing-relationships graph for the CryptoHitman ransomware family.
Figure 2 – CryptoHitman Outgoing-Relationships Graph
In Figure 2, red nodes represent expanded addresses belonging to the CryptoHitman family and gray nodes represent addresses not in the dataset. The graph shows that some addresses are key: they receive money more than once from known CryptoHitman addresses. We call these addresses collectors: addresses used to collect or aggregate payments. Figure 3 displays two Locky collectors.
Figure 3 – Locky Collector Addresses
We determined that each address receiving money more than once from addresses known to be related to the same ransomware family can be considered a collector address for that family. While investigating collectors, we found that some were directly related to known real-world actors (tagged clusters). Among others, we found that some collectors were related to:
- 86 exchange organizations (e.g. BTC-e.com, LocalBitcoin.com, Kraken.com)
- 47 gambling sites (e.g. SatoshiDice.com, Bitzillions.com, SatoshiMines.com)
- 12 mixing services (e.g. BitcoinFog.info, Helix Mixer)
This shows that not all ransomware authors bother to launder their money: some of them quickly cash out their earnings through various exchanges. Such reckless behavior avoids the risks associated with using a third party to launder money, but increases the risk of being caught by law enforcement.
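The collector rule described above can be sketched in a few lines. This is an illustration under assumptions, not the code used in the study: transfers are modeled as simple (sender, receiver) pairs, and "more than once" is counted per payment.

```python
from collections import Counter


def find_collectors(transfers, family_addresses):
    """Return addresses paid more than once by known family addresses.

    `transfers` is an iterable of (sender, receiver) pairs, one per payment
    (a hypothetical, simplified format).
    """
    received = Counter(
        receiver for sender, receiver in transfers if sender in family_addresses
    )
    return {addr for addr, count in received.items() if count > 1}


family = {"F1", "F2", "F3"}
transfers = [("F1", "X"), ("F2", "X"), ("F3", "Y"), ("Z", "Y")]
collectors = find_collectors(transfers, family)
```

Only X qualifies here: Y is paid twice, but only one of those payments comes from a known family address.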
Lower Bound Direct Financial Impacts
Removing the collector addresses from the filtered dataset (to avoid double counting), we estimated the lower bound direct financial impacts of ransomware attacks for each family, shown in Table 1.
Family | Addresses | BTC | USD |
---|---|---|---|
Locky | 6,827 | 15,399.01 | 7,834,737 |
CryptXXX | 1,304 | 3,339.68 | 1,878,696 |
DMALockerv3 | 147 | 1,505.78 | 1,500,630 |
SamSam | 41 | 632.01 | 599,687 |
Cryptolocker | 944 | 1,511.71 | 519,991 |
GlobeImposter | 1 | 96.94 | 116,014 |
WannaCry | 6 | 55.34 | 102,703 |
CryptoTorLocker2015 | 94 | 246.32 | 67,221 |
APT | 2 | 36.07 | 31,971 |
NoobCrypt | 17 | 54.34 | 25,080 |
Globe | 49 | 33.03 | 24,319 |
Globev3 | 18 | 14.34 | 16,008 |
EDA2 | 23 | 7.1 | 15,111 |
NotPetya | 1 | 4.39 | 11,458 |
Razy | 1 | 10.75 | 8,073 |
Table 1 – Received Payment per Ransomware Family
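As a rough sketch of the estimation step, the snippet below sums what family addresses received while skipping collector addresses, so that money forwarded from a payment address to a collector is not counted twice. The data layout and names are assumptions for illustration, and the conversion to USD is left out because the post does not describe it.

```python
def lower_bound_received(payments, family_addresses, collectors):
    """Sum BTC received by family addresses, skipping collector addresses.

    `payments` is an iterable of (receiver, btc) pairs (hypothetical format).
    """
    payment_addresses = set(family_addresses) - set(collectors)
    return sum(btc for receiver, btc in payments if receiver in payment_addresses)


family = {"F1", "F2", "C1"}
collectors = {"C1"}
payments = [("F1", 1.5), ("F2", 0.5), ("C1", 2.0)]
total = lower_bound_received(payments, family, collectors)
```

In this toy example the 2.0 BTC arriving at collector C1 is the same money that F1 and F2 received earlier, so only the first two payments are counted.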
Table 1 illustrates that the market is top-heavy: only a few players are responsible for most of the ransom payments. Indeed, based on the USD figures in Table 1, Locky, CryptXXX and DMALockerv3 make up about 88% of the market, while the 32 other families share the remaining 12%. This means that law enforcement’s limited resources could focus on the few capable players in the market.
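The concentration implied by Table 1 can be checked with a short calculation. The figures are copied from the table and from the overall total reported in this post; the variable names are only illustrative.

```python
# USD totals for the three largest families, copied from Table 1.
top_three_usd = {
    "Locky": 7_834_737,
    "CryptXXX": 1_878_696,
    "DMALockerv3": 1_500_630,
}
# Overall minimum market worth in USD, as reported in this post.
market_total_usd = 12_768_536

top_share = sum(top_three_usd.values()) / market_total_usd
rest_share = 1 - top_share
print(f"top three: {top_share:.1%}, all others: {rest_share:.1%}")
```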
Summing up the amounts, we find that, from 2013 to mid-2017, the market for ransomware payments has a minimum worth of
USD 12,768,536 (22,967.54 BTC)
The minimum worth of the market for ransom payments, taking into account 35 families, seems relatively modest compared to the hype surrounding the issue; the overall direct and indirect damages they caused to individual and organizational victims are much higher.
Implications
We conclude that the low performance of most ransomware families may be explained by multiple factors, such as the effectiveness of the various tools developed in recent years to prevent ransomware attacks. The community project “No More Ransom!”, which makes ransomware decryption tools available to victims, may also be of great help. Victims may also decide not to pay the ransom or may have backups, limiting ransom payments. These results are in accordance with Kharraz et al. (2015), who studied 1,359 samples from 15 ransomware families, and Gazet (2010), who reverse-engineered 15 ransomware samples. Both studies found that most ransomware families used superficial and flawed techniques to encrypt files, with few having actual destructive capabilities. This observation does not mean that the ransomware threat should be underestimated, since some ransomware families are indeed successful and destructive, but it helps to better understand the structure of the ransomware market.
In terms of future work, we plan on extending our analysis to additional ransomware families and supplementary payment addresses. We could also study other illicit activities channeling financial transactions through the Bitcoin network, such as other extortion cases, trafficking of illicit goods or money laundering.
Our research is entirely reproducible, and we encourage anyone to use the tools and the data provided in the links below: