The 1980s film “WarGames” asked a computer to learn whether global thermonuclear war made sense. In the film, thermonuclear war didn’t make sense, but what if, in real life, preemptive cyberattacks were our best hope for winning? Or better yet, what are the cyberwar scenarios and incentives under which peace is the best strategy, just as in “WarGames”? Or is it the reverse, where the best thing to do is invest in offense?
We don’t like thinking about offense and attack in cyber. But if you think about offense, can’t you develop a better defense? That’s tricky to do informally; what we need are frameworks for making decisions and strategy.
Game theory is a branch of mathematics that allows us to reason through cyberattack/defense scenarios without spinning in philosophical circles. Game theory was created by a polymath and computing pioneer named John von Neumann. Johnny was an interesting guy with a penchant for creating unusual acronyms such as “MAD” for Mutually Assured Destruction. He also had a knack for asking the hard questions, one famous quip being “If you say why not bomb [the Soviets] tomorrow, I say, why not today?” MAD allowed Johnny to answer that question. Ultimately MAD is why the WOPR computer in WarGames decided not to launch a thermonuclear war, concluding that “the only winning move is not to play.”
Cyber clearly isn’t thermonuclear war. We need to think about how it’s different, build models, and see how those models play out. We need to ask hard questions, just like Johnny.
For example, an exploit is just bits on a wire and can be copied if your opponent happens to log the attack. You can use that information to “reflect” or “ricochet” the exploit against your opponent, or you can decide to use that new knowledge to create a patch. When you capture someone else’s exploit and use it (or patch it), you’ve used their energy against them. If you can better use an adversary’s energy and time for your own benefit, you have a higher chance of succeeding.
The U.S. government has a responsibility to protect the nation, which (quite rationally) entails both cyber offense and defense. Stated U.S. policy is to prioritize defense, with decisions about whether to disclose or retain a vulnerability weighed through the Vulnerabilities Equities Process (VEP). The National Security Agency has said it typically discloses about 91% of the vulnerabilities its researchers uncover after evaluation through the VEP.
Controversially, I’m not a security expert who believes the NSA should outright disclose every vulnerability. Offense does have value, and cyber offense has played a key part in real-world events like hacking into the DNC server (bad) or damaging a dangerous state’s nuclear materials program (good). If the choice is between killing people with a bomb and achieving the same end with a cyberattack, I think cyber may make more sense. The decision to disclose or not shouldn’t be made in isolation. It comes down to the game, as you see it, and what strategy is most likely to achieve the desired outcome.
Game theory isn’t just for nation-states; it’s a way of modeling scenarios and guiding decisions. You can model the probability that someone else will take a particular action and what you’ll do to counter it.
One thing is clear: cyber offense and defense isn’t chess. It’s a game of poker. In chess, you have complete visibility into your opponent’s position and moves. In poker, you lack that visibility, and the same is true in the cyber realm. In cyber you can’t be certain what exploits your adversary knows about, whether they are using an exploit they’ve already disclosed, or whether your zero-day is really a zero-day globally.
Strategy means you’ve thought through the larger picture of various alternatives, risks, and rewards. You’ve built a game, not in the playful, fun sense, but one that allows you to reason through actions, incentives, and possibilities.
Cyber should be no different. As I wrote this article, I imagined well-known security experts screaming things like “responsible disclosure is the ethical choice” and “we have more risk as a nation when we don’t responsibly disclose.” To such experts, I’m asking you to stop and play devil’s advocate for a moment. Hack at your assumptions and really test them. I believe it leads to better thinking.
How do we think through what to do?
Let’s play a game.
Imagine you found a new zero-day vulnerability. You can either disclose the zero-day vulnerability or create an exploit and attack others. Your actions have consequences, and you have the ability to play a sequence of actions.
In game theory, we create a game state to capture that context. Game theory also asks us to be formal and provide the utility, positive or negative, for each action. Ask anyone in risk assessment: if you don’t have a cost for an action, you can’t assess the risk. The nice thing about game theory is that you can try different utility functions and see how they change the outcome. For example, how does a defender’s strategy change if the cost of being exploited is $10 vs. $1 million?
Let’s start out simple with just two players: Red and Blue. Each player is running the same software, say Windows 10. Since both parties are running the same software, each is affected by a new zero-day vulnerability. A player can only exploit or disclose when they find a zero-day. If they choose to exploit, they get a “point” per system they compromise. Each player wants to win by getting the most points or at least tie.
The number of computer systems matters in this game, because it highlights the potential asymmetry in the risk a particular vulnerability may pose. Let’s assume Red and Blue are different: Blue has 10 computers and Red has three.
(Red’s Perspective): If I discover a new zero-day, I can get up to 10 points attacking Blue’s computers. I only have three vulnerable computers, so at most Blue can get three points. Since 10 > 3, I’ll always attack.
(Blue’s Perspective): If I discover a new zero-day, I can get up to three points attacking Red’s computers. However, if Red finds it, they’ll get 10 points. It makes sense to disclose and patch, assuming I can get the patches installed before Red attacks.
In this game, Blue is incentivized not to attack. Ethically that seems like a good outcome. Unfortunately, Red is incentivized to wage war. Later we will look at one way Red could be incentivized to make peace.
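To make the incentives concrete, here’s a minimal sketch of that calculation in code. The system counts come from the setup above; the scoring rule (one point per compromised system, nothing for disclosing) and the function names are just my illustration, not a formal model.

```python
# Toy sketch of the Red/Blue zero-day game described above.
# System counts come from the text; the "disclose scores zero" rule and the
# names are illustrative assumptions.

BLUE_SYSTEMS = 10
RED_SYSTEMS = 3

def best_move(my_systems: int, their_systems: int) -> str:
    """One point per system compromised; disclosing scores zero but removes
    the vulnerability for both sides. Attack only if the upside (points from
    their systems) beats the downside (points I lose if they find it too)."""
    gain_if_i_attack = their_systems
    loss_if_they_attack = my_systems
    return "attack" if gain_if_i_attack > loss_if_they_attack else "disclose"

print("Red:", best_move(RED_SYSTEMS, BLUE_SYSTEMS))   # attack   (10 > 3)
print("Blue:", best_move(BLUE_SYSTEMS, RED_SYSTEMS))  # disclose (3 < 10)
```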
Even this simple example highlights some lessons and properties:
The term “zero-day vulnerability” is a bit of a misnomer. If you find a previously unreported vulnerability, that doesn’t mean no one else knows about it. It means no one else has publicly disclosed it.
Suppose Blue found a new zero-day using either an off-the-shelf fuzzer like AFL or a super-secret technique of Blue’s own.
The method you used to find the vulnerability changes the probability that your opponent also finds it. If it took you two days to discover, it is likely your attacker can also discover the vulnerability in about two days; the time it takes to find a vulnerability reflects how easy or difficult it is to discover. However, the super-secret technique you used is yours alone. If it finds a vulnerability that AFL doesn’t, that vulnerability likely has a longer shelf life before someone else discovers it.
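As a back-of-the-envelope way to reason about shelf life, here’s a toy sketch. The exponential rediscovery assumption and the example numbers are mine, not something from the article or any study.

```python
import math

def p_rediscovered(your_days_to_find: float, horizon_days: float) -> float:
    """Toy model: assume an opponent's independent rediscovery time is
    exponentially distributed with the same mean as your own time-to-find.
    Returns the probability they also find the bug within horizon_days."""
    return 1.0 - math.exp(-horizon_days / your_days_to_find)

# A bug a public fuzzer like AFL found in 2 days: almost surely rediscovered
# within a month.
print(round(p_rediscovered(2, 30), 2))     # ~1.0
# A bug a secret technique found after 200 days of effort: longer shelf life.
print(round(p_rediscovered(200, 30), 2))   # ~0.14
```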
You can also start to estimate how many new exploits your adversary may have. For example, Google has reported over 3,849 new security-critical bugs using their oss-fuzz infrastructure over three years, which works out to about 3.5 per day. Think about it: Google, statistically, will find 28 new security issues between Christmas and New Year’s. Google has nation-state offensive capabilities. Yes, weaponization takes more time and not all 3.5 vulnerabilities per day can be weaponized, but you get the gist.
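The arithmetic behind those estimates is simple enough to check; the eight-day holiday window is my assumption.

```python
bugs_reported = 3849          # security-critical bugs Google reported via oss-fuzz
days = 3 * 365                # roughly three years
per_day = bugs_reported / days
print(round(per_day, 1))      # ~3.5 bugs per day

holiday_days = 8              # Dec. 25 through Jan. 1, inclusive (assumption)
print(round(per_day * holiday_days))  # ~28 new security issues over the break
```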
Google uses open source tools in its oss-fuzz effort to find bugs before attackers do. If you’re serious about being proactive, I recommend you follow Google’s lead and employ the same techniques attackers use. Even if you had a magical technique of your own that found every bug, fuzzing and the other techniques attackers rely on still help: if you find a bug with such a tool and have data on how long it took, you can use that information to gauge how long it would take an attacker, and therefore the risk.
Exploits are bits and can be copied. What if you got really good at ricochet? That changes the strategy in game models. Interestingly, it can provide a real incentive for everyone not to attack.
What if, when Blue launches an attack, Red can ricochet? Blue can start reasoning about the possibilities:
Consider some extreme values. If Red can 100% ricochet any attack, then Blue should never choose to attack. If Red has 0% chance of a ricochet, Blue should always attack in this game. The extreme values help clarify scenarios, but we don’t need to assume 100% ricochet. What if Red siphoned off just 10% of their traffic for really deep analysis? There is only a probability they see the zero-day; is it enough to disincentivize Blue?
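Here’s a rough sketch of Blue’s expected-value calculation under the toy scoring used earlier. The payoff rule (a ricochet costs Blue every one of its systems) is my own simplification, not a claim about real-world losses.

```python
BLUE_SYSTEMS = 10
RED_SYSTEMS = 3

def blue_expected_net(p_ricochet: float) -> float:
    """Blue's expected net points from attacking: Blue compromises Red's
    systems, but with probability p_ricochet Red captures the exploit and
    throws it back at all of Blue's systems. Toy payoff, not real losses."""
    return RED_SYSTEMS - p_ricochet * BLUE_SYSTEMS

for p in (0.0, 0.1, 0.3, 1.0):
    print(p, round(blue_expected_net(p), 1))
# 0.0 ->  3.0  no ricochet: Blue always attacks
# 0.1 ->  2.0  10% deep inspection alone isn't enough to deter in this model
# 0.3 ->  0.0  break-even: anything above 3/10 tips Blue toward disclosing
# 1.0 -> -7.0  guaranteed ricochet: Blue never attacks
```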
What is interesting in ricochet is it incentivizes peace even when there is a vulnerability. A bit like MAD, but without the world being destroyed. If Red and Blue have an equal number of systems, and both have ricochet, neither should attack. It’s like the old saying: those who live in glass houses should not throw stones.
To me, the framework suggests the U.S. is behaving rationally. It likely has the most to lose if someone else finds and weaponizes a vulnerability. Rationally (not just ethically), it makes sense to put its thumb on the disclosure side of the scale.
Beyond outright ricochet, we can think of a disclosure as providing partial information to an attacker, and that also guides decisions. Blue may further reason:
Ricochet is not necessarily revenge. For example, suppose Red had an ally, Orange, with 10 vulnerable computers. Previously, without the ally in the picture, the incentives seemed to promote Red attacking. If Blue can ricochet, Blue can disincentivize Red by ricocheting attacks onto Orange. Red now has to be comfortable losing the three points from its own systems if the vulnerability is discovered, plus the 10 points of damage Blue can inflict on Orange. In this new world, Red should no longer attack first.
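Continuing the same toy scoring, here’s a sketch of Red’s side of the ledger once Blue can ricochet into Orange; again, the payoff rule and the probabilities are my own simplification.

```python
BLUE_SYSTEMS = 10
RED_SYSTEMS = 3
ORANGE_SYSTEMS = 10   # Red's ally

def red_expected_net(p_ricochet: float) -> float:
    """Red's expected net points from attacking Blue: Red gains Blue's systems,
    but with probability p_ricochet Blue captures the exploit and ricochets it
    onto Red's own systems and Red's ally Orange."""
    return BLUE_SYSTEMS - p_ricochet * (RED_SYSTEMS + ORANGE_SYSTEMS)

print(red_expected_net(0.0))   # 10.0: without ricochet, Red attacks
print(red_expected_net(1.0))   # -3.0: a reliable ricochet flips Red's incentive
# Break-even in this model: Red is deterred once p_ricochet > 10/13, about 0.77.
```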
Imagine a crazy world where Russia simply said, “If I see a cyberattack, I will ricochet the same attack against every vulnerable computer in Israel.” That would incentivize Israel not just to keep the peace with Russia, but also to pressure its allies not to attack. It would also guide national policy (e.g., getting really good at ricochet).
Even if you can’t ricochet, the game theory suggests you should disclose not just vulnerabilities you find, but also those launched against you. Attack/defense hacking competitions teach us the best thing to do is attack the weakest player first: if you use an exploit against a weak player and they detect it, you know not to use it against a strong player. That doesn’t mean a stronger player wouldn’t detect it as well, but it does provide some information.
If you disclosed every attack on your network, and especially any new zero-day, you could be disincentivizing attackers. At the very least, it would make sense for them to attack someone else before they attack you.
Specifics on how to use game theory at a national level are theoretical at this point. The NSA awarded its “Science of Security” award to a paper on the subject written by my graduate student Tiffany Bao and myself. The paper makes simplifying assumptions that likely don’t capture many real-world factors. It assumes rationality when we know the world isn’t rational. It takes an actuarial viewpoint, where there is a utility function for consequences, when the real world is more complicated. The point is to shed light on a method of thinking that has worked for MAD, in economics, and in other areas. Game theory, in general, can also highlight where we can place incentives that may not be obvious, and whether those incentives actually change the game we (think we) are playing.
In 2014, Tiffany Bao and Steven Turner, from my lab, published a paper on recognizing functions inside compiled commercial-off-the-shelf (COTS) binaries. Function identification is a fundamental challenge in vulnerability research: common wisdom holds that the better you do at function identification, the more productive your vulnerability research will be, because vulnerability discovery doesn’t work well until you’ve broken a COTS application down into functions. So the first step in vulnerability research is often to reverse engineer the COTS application into its functions.
Our experiments showed that our tool ByteWeight had 99% precision and 98% recall on 64-bit binaries, where IDA Pro (a state-of-the-art commercial tool) had 74% precision and 55% recall. A solid improvement.
We then asked, “to what end would such an improvement matter?” It’s easy to cherry-pick cases where a missing function makes it harder to find a vulnerability. But anecdotes aren’t great when deciding strategy. So, we built two worlds: one where ByteWeight was used and one where it wasn’t.
To make it more concrete, we analyzed whether ShellPhish’s decision to use ByteWeight mattered in their third-place finish at the DARPA Cyber Grand Challenge (CGC). In world 1, ShellPhish used ByteWeight and we had their real-life performance in the CGC. In world 2, we analyzed how well ShellPhish would have done without ByteWeight. It didn’t matter; they still took third place. The details are in Tiffany Bao’s Ph.D. thesis (section 6.3.1).
A game-theoretic analysis showed that IDA Pro was “good enough”: function identification wasn’t the barrier in the CGC. A better strategy would have been to put the equivalent R&D dollars into better fuzzing techniques, because getting to 100% effectiveness at function identification would not have changed the outcome compared to just using IDA Pro.
What can you learn? If you have a goal, such as finding zero-days or improving defense, ask the question “how will this change the outcome if we are successful?” Assume research has a breakthrough and is 100% effective. Does that change the game?
For example, suppose you can invest in a really deep static analysis tool that highlights buggy lines of code and identifies 100% of all flaws, but whose reports are difficult to act on. Is that deep analysis really benefitting you compared to something less deep but more actionable? The goal is typically not just to find flaws, but to reduce the window from when a vulnerability is introduced to when a patch is fielded. Think through all the incentives that go into such a program.
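One way to make that trade-off concrete is a back-of-the-envelope exposure-window model. Everything here, the find rates, the days-to-act, the one-year baseline, is hypothetical; the point is the shape of the comparison, not the numbers.

```python
def expected_exposure_days(find_rate: float, days_to_act: float,
                           baseline_days: float) -> float:
    """Hypothetical model: a flaw the tool misses stays exposed for
    baseline_days; a flaw it finds stays exposed for days_to_act (the time to
    triage the report, fix, and field a patch). Returns expected days exposed
    per flaw."""
    return find_rate * days_to_act + (1 - find_rate) * baseline_days

BASELINE = 365  # assume an unfound flaw lingers about a year (made up)

deep_but_noisy   = expected_exposure_days(find_rate=1.0, days_to_act=180,
                                          baseline_days=BASELINE)
shallow_but_fast = expected_exposure_days(find_rate=0.8, days_to_act=14,
                                          baseline_days=BASELINE)

print(round(deep_but_noisy, 1))     # 180.0 expected days exposed per flaw
print(round(shallow_but_fast, 1))   # 84.2 expected days exposed per flaw
# With these made-up numbers the more actionable tool shrinks the window more,
# even though it finds fewer flaws. Change the assumptions and the answer can
# flip; the point is to model the window, not the raw find count.
```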
Strategy means you’ve thought through the larger picture of various alternatives, risks, and rewards. You’ve built a game, not in the playful, fun sense, but one that allows you to reason through actions, incentives, and possibilities.
Cyber should be no different. As I wrote this blog, I imagined well-known security experts preaching that “responsible disclosure is the ethical choice” or that “we have more risk as a nation when we don’t responsibly disclose.” To such experts, I’m asking you to stop and play devil’s advocate for a moment. Hack at your assumptions and really test them. I believe it leads to better thinking.
Originally published at The New Stack.