Attribution is Hard, Part 2
Last week in Attribution is Hard, Part 1, I described a classic hacking incident and discussed the challenges of establishing attribution. This week, I explain what weak attribution is, and I conclude the discussion on the four requirements of establishing attribution.
Last week’s cliffhanger probably left you wondering what I mean by "weak attribution." Well, there are several forms of weak attribution that warrant discussion.
The first form of weak attribution is an argument based on the tools used, when those tools are available in the wild to security researchers. Just because an attacker used a particular tool doesn’t mean that some other frequent user of that tool is your current perpetrator. There are plenty of hacking tools available for repurposing by other attackers. I hate to sound like a cynic, but apparently some people haven’t yet realized that there are security researchers who play both sides of the game board; if I wanted to go rogue, I could assemble a state-of-the-art set of custom "state-sponsored" quality malware in about a week. Tools are clues, but not fingerprints.
The second weak form of attribution is the "cui bono" argument: who benefits from a given attack. Cui bono doesn’t hold water because it assumes that all attacks are motivated and that the accuser actually understands all the potential enemies of a given target. A few years ago, I had a conversation with someone who felt that he was undergoing a high-level “state sponsored” set of attacks. After looking at some data he provided, I had to inform him that he was being targeted by a robotic hacking tool. Cui bono assumes that the attacker has motives at all and isn’t a robot. If the attacker actually does have motives, Occam’s razor tells us that the simplest assumption is that it’s just some sociopath poking around or a criminal hunting for credit card databases. And, last but far from least, things sometimes happen at the same time in the real world – geopolitical stresses may coincide with a perfectly normal hacking attack, and the target is wrong to assume that the attack was motivated by whatever was on the front page of the news.
The last form of weak attribution that we need to dispense with is IP addresses. During 2010, there were a lot of hacking attacks attributed to China, based on IP address blocks. What that says is exactly nothing. Chinese users make up a very large share of the global Internet population (roughly a fifth around 2010) and they have just as many (if not more) cybercafes and unsecured machines that can launder connections as anyone else. To attribute a particular attack to Chinese origin, you would need solid evidence that the IP address in China was the origin of the traffic and not merely a system that eventually relayed it. That’s a very tall order, because showing that a system is the origin of traffic practically amounts to proving a negative – namely that there was no relaying of any kind in use, whether by a known technique or an unknown one.
I have a friend whose idea of fun is to watch how many times people try to penetrate his laptop on public networks, presumably to steal data from it or to use it as a relay. Imagine if you did a complete back-track to a laptop at a Starbucks in Texas and never discovered that the laptop had been hacked and used as a relay by someone sitting on the other side of the coffee shop! IP addresses are such weak evidence that they are not even relevant to effective attribution (though they are interesting for searching in logs); they’re almost a distraction.
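To make the relay problem concrete, here is a minimal sketch in Python. Everything in it is an assumption made up for illustration: the host names, the documentation-range IP addresses, and the idea that each system's logs record only the address of its immediate peer. It is not a real forensic tool, just a way to see where a back-track stops.

```python
from typing import Dict, List, Optional

# Hypothetical data: for each host, the address its logs show as the source
# of the suspicious connection. None means the host kept no usable logs.
CONNECTION_LOGS: Dict[str, Optional[str]] = {
    "victim.example": "203.0.113.7",   # what the target sees
    "203.0.113.7": "198.51.100.22",    # a relay that happened to keep logs
    "198.51.100.22": None,             # e.g. a hacked laptop with no logs
}


def backtrack(start: str, logs: Dict[str, Optional[str]]) -> List[str]:
    """Follow logged source addresses hop by hop until the trail goes cold."""
    trail = [start]
    seen = {start}
    current = start
    while True:
        upstream = logs.get(current)
        # Stop when there is no log entry, or when we would loop.
        if upstream is None or upstream in seen:
            break
        trail.append(upstream)
        seen.add(upstream)
        current = upstream
    return trail


trail = backtrack("victim.example", CONNECTION_LOGS)
apparent_origin = trail[-1]  # only where the evidence ran out
print(" <- ".join(trail))
```

The last address in the trail is simply the point where the logs ran out. Nothing in the data distinguishes "this machine started the attack" from "this machine relayed it and we cannot see further," which is the proving-a-negative problem described above.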
Wrapping up the evidence
Generally, whenever we talk about attribution, we forget to discuss one of the most important (and unpredictable) factors in a successful attribution: the audience. This is not a science lab, where we can establish cause and effect by varying inputs into a controlled experiment; it’s more like a court of law, in which a prosecutor is trying to establish that so-and-so did such-and-such. Inevitably, that brings in all the issues I have discussed, but it also means that there must be someone to convince. If you’re blaming someone for a crime and present no evidence at all, you’re not attempting to establish attribution – you’re just finger-pointing. To convince your audience/jury, the evidence presented must tell a compelling story, which is why I say that at a minimum, attribution depends on evidence collected from multiple systems that support your reconstruction of the attacker’s actions. That’s why, until recently, US law did not allow secret evidence in court: it’s hard to challenge evidence that you know nothing about; but even more importantly, it’s hard to be convinced by evidence that you know nothing about. The "I could tell you but then I’d have to kill you" kind of evidence only works in bad Hollywood fantasies.
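As a rough illustration of what "evidence collected from multiple systems" can look like in practice, the following Python sketch merges events from several sources into one timeline. The system names, timestamps, and event descriptions are all hypothetical, and the `Event` type and both helper functions are invented for this example; a real reconstruction would also have to deal with clock skew and log integrity.

```python
from datetime import datetime
from typing import List, NamedTuple, Set


class Event(NamedTuple):
    when: datetime
    system: str
    detail: str


def build_timeline(*sources: List[Event]) -> List[Event]:
    """Merge events from every source and order them by timestamp."""
    merged = [event for source in sources for event in source]
    return sorted(merged, key=lambda event: event.when)


def corroborating_systems(timeline: List[Event]) -> Set[str]:
    """Return the distinct systems that contributed evidence."""
    return {event.system for event in timeline}


# Hypothetical evidence from three independent systems.
firewall = [Event(datetime(2010, 6, 1, 2, 14), "firewall", "inbound connection allowed")]
web = [Event(datetime(2010, 6, 1, 2, 15), "web server", "unexpected admin login")]
database = [Event(datetime(2010, 6, 1, 2, 19), "database", "bulk export of customer table")]

timeline = build_timeline(database, firewall, web)
for event in timeline:
    print(event.when.isoformat(), event.system, event.detail)
```

A story supported by a single system is easy to dispute. The more independent systems that line up in the same order, the more of the reconstruction an audience can actually check for themselves.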
Any evidence presented in an attribution must stand up to expert scrutiny, which is a severe problem with today’s media. Attribution is highly technical and I do not expect a typical news commentator to dig deeply enough to understand why an IP address is not convincing enough to nail an attribution. Unfortunately, this is an expert’s specialty, which means that experts need to assess the evidence and publish their conclusions before anyone goes off half-cocked and starts finger-pointing. Whenever I think of this, I remember the sinking of the ROKS Cheonan – a South Korean corvette that suffered an explosion and sank under mysterious conditions. Foul play was immediately suspected and a team of naval disaster experts from the US, UK, Sweden, Canada and Australia was assembled to examine the pieces of the ship as they were recovered. They concluded with a massive, detailed report (which was contested by Russian and Chinese naval sources) which included detailed analysis of the patterns of destruction, pieces of torpedo found at the scene, and other evidence. In such a case where an attribution might relate to a cause for war or retaliation, the process must be methodical (and therefore slow) and the evidence must be presented and weighed by the appropriate experts. Of course, those experts may still disagree (as with the Cheonan case) – and that’s kind of the point. The case for attribution must survive expert scrutiny or it’s not a case at all.
This stuff is important because eventually a computer security attribution will lead to military action and people may die. Let’s hope no one dies because the machine at his IP address had a remote-control Trojan horse program on it. In our infosec community, we’re the experts, and it’s our responsibility to resist hopping on a bandwagon of finger-pointing and instead to deliver sober and rational arguments based on our best analysis of the evidence we see in front of us.