Tell me if you’ve heard this one: there is a new advanced network intrusion detection device that uses modern, super-smart Machine Learning (ML) to root out known and unknown intrusions. The IDS device is so smart, it learns what’s normal on your network and immediately informs you when it sees an anomaly. Or maybe it’s an intrusion prevention system (IPS) that will then block all malicious traffic. This AI-enabled solution boasts 99% accuracy detecting attacks. Even more, it can detect previously unknown attacks. Exciting, right?
That’s an amazing sales pitch, but can we actually do it? I’m not sold yet. Here are the two big reasons why:
1. The base rate fallacy: when real intrusions are rare, even a highly accurate detector buries analysts in false positives.
2. Lack of robustness: ML models are not built to keep working once an active adversary learns about them and adapts.
Put simply, ML algorithms are not generally designed to defeat an active adversary. Indeed, academic research in adversarial machine learning is still in its infancy, let alone real products built on it. Make no mistake — there is amazing research and there are amazing researchers, but I don’t think the field is ready for full autonomy.
The vision of autonomous security is machines that detect, react, and defend on their own. The two reasons I listed above rule out full autonomy. In a nutshell: detection drowning in false positives can’t safely drive an automated response, and a defense that attackers can evade once they study it can’t be left to run itself.
Let’s dig into netsec and ML, specifically using intrusion detection and prevention as our main setting.
One key reason I’m still not sold on ML comes down to this: vendors and buyers often confuse detection rates (usually quoted as accuracy) with the true rate of intrusions. With an IDS, I assume users want to identify real intrusions. They don’t want to study the attacker; they want to detect when the attacker actually succeeds. Unfortunately, those expectations don’t match reality.
In the first paragraph of this blog post I tried to fool you. Did you catch it?
I switched detecting attacks with detecting intrusions. These are different. Attacks are a possible indication of an intrusion, but they may or may not be successful. An intrusion is someone having success.
Consider three scenarios: a port sweep (e.g., with nmap), a script kiddie trying default passwords, and a targeted exploit against your system. The nmap sweep doesn’t result in compromise, and the script kiddie’s scripts are likely to be unsuccessful, but the targeted attack -- that’s something quite different. It’s likely to succeed. You want to focus your limited resources, manpower, and attention on real intrusions.
The key concept to understand is the base rate fallacy. It’s not so much a fallacy as a difficulty humans have in readily understanding statistics. (For the mathematically minded, the key paper I cover in my coursework at Carnegie Mellon University is Stefan Axelsson’s. If you want to annoy a sales rep, ask whether they’ve read it.) We see this every day when gamblers think they are on a hot streak. Every so often they win, so they think their system is a “winner”. The same principle applies to security -- if every so often your ML algorithm is right, your brain can trick you into thinking it’s better than it really is.
The typical example of the base rate fallacy goes like this: you’ve gone to the doctor, and they’ve run a test that is 99% accurate for a disease. The bad news: the test says you have the disease. The good news: the disease itself is really rare -- only 1 in 500 people get it. What is the probability you really have the disease?
This is the fallacy: most people assume that since the test is 99% accurate, they are very likely to have the disease. However, the math says otherwise. In fact, the actual probability you have the disease is only about 20%. How can that be?
Think of it like this: a 99% accurate test (here) means that 1 out of every 100 people will test positive. However, we know only 1 in 500 people actually have the disease. Out of 500 tests, roughly 5 people test positive, but only 1 is a true positive… or only about 20%. (If you are a mathematician, you can calculate the exact probability with Bayes’ theorem. We use the approximation because it’s easier to grapple with than conditional probabilities.)
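For the curious, the exact Bayes’ theorem calculation mentioned in the parenthetical takes only a few lines of Python. This sketch assumes, as the example does, that the “99% accurate” test has both a 99% true positive rate and a 1% false positive rate:

```python
# Exact posterior probability of disease given a positive test,
# via Bayes' theorem, using the numbers from the example above.
sensitivity = 0.99          # P(positive | disease)
false_positive_rate = 0.01  # P(positive | no disease)
prior = 1 / 500             # P(disease): 1 in 500 people

# P(positive) over the whole population, then P(disease | positive)
evidence = sensitivity * prior + false_positive_rate * (1 - prior)
posterior = sensitivity * prior / evidence

print(f"P(disease | positive test) = {posterior:.1%}")  # → 16.6%
```

The exact answer is about 16.6% -- the same ballpark as the rough 1-in-5 approximation, and just as far from the 99% most people intuitively expect.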
Things get worse as the numbers get larger. For example, suppose the rate of real intrusions is only 1 in 1,000,000 events. With a 99% accurate detection rate you’ll have approximately 9,999 false positives for every true alarm. Who has time to chase that many false positives?
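The arithmetic behind that figure is easy to verify. A quick sketch, using the illustrative numbers from the paragraph above (one real intrusion per 1,000,000 events, and a 1% false positive rate):

```python
# False alarms per true alarm when intrusions are rare.
events = 1_000_000
true_intrusions = 1
false_positive_rate = 0.01  # the "99% accurate" detector

# 1% of the 999,999 benign events still trip the detector.
false_alarms = int(false_positive_rate * (events - true_intrusions))
print(f"{false_alarms:,} false alarms for every {true_intrusions} true alarm")
```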
How does this apply to IDS and to any detection algorithm, machine learning or not? If the actual rate of successful intrusions is low, then even a 99% accurate IDS (as in the sales literature) will generate a huge number of false positives. One rule of thumb used by SOCs [credit: Michael Collins at USC] is that an analyst can handle about 10 events per hour. If almost every alert is just an attack (a false positive) rather than a real intrusion, you’ll at best inundate analysts with false positives or at worst teach them to ignore the IDS.
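To make the staffing implication concrete, here is a back-of-the-envelope sketch using the ~10-events-per-analyst-hour rule of thumb cited above. The daily alert volume and shift length are hypothetical numbers chosen purely for illustration:

```python
# Back-of-the-envelope SOC staffing for alert triage.
alerts_per_day = 2_000        # hypothetical daily alert volume
events_per_analyst_hour = 10  # rule of thumb from the text
shift_hours = 8               # assumed shift length

capacity_per_analyst = events_per_analyst_hour * shift_hours  # 80/day
analysts_needed = -(-alerts_per_day // capacity_per_analyst)  # ceiling division

print(f"{analysts_needed} analysts just to triage {alerts_per_day} alerts/day")
```

Even a modest alert stream translates into a triage team of dozens -- which is why mature SOCs tune detectors down to the volume their people can actually handle.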
Indeed, the math shows that almost any false positive rate is likely too high.
What I mean by robustness is: does it still work once the attacker expects it? Let’s use an analogy in which you are the attacker. Pretend, for a moment, that you want to get into the illegal import business. You go through a security checkpoint and bam -- you’re caught! So far so good; law enforcement (the IDS in this analogy) is winning. They won because you had no idea what they were really checking for, and you likely triggered something.
As a determined criminal, what would you do? The natural thing is to figure out the checkpoints and the checkpoint rules and then evade them, right? In other words, as soon as you recognize the defense, you’ll figure out how to evade it. The same is true for attackers.
An IDS is robust if it continues to detect even when the attacker changes their methods. That’s the key problem: right now, ML seems attractive because attackers are not trying to fool it. As soon as you’ve deployed it, attackers will notice they are not getting through and try to evade it.
A theorem applies here: the (literally named) No Free Lunch (NFL) theorem. The original paper is worth reading if you are mathematically minded, and easier-to-read summaries exist. The theorem states that there is no one model that works best for every problem. In our security setting, the implication is that for any ML algorithm, an adversary can likely craft inputs that violate its assumptions, causing the algorithm to perform poorly.
Current research in ML security has had a hard time showing it’s robust. Adversarial examples -- small, carefully crafted perturbations to an input that flip a model’s decision -- are one well-documented illustration.
The lesson: attackers can learn about your defenses and if the defense isn’t robust, they’ll be able to evade it.
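As a toy illustration of that lesson (my own sketch, not any real product’s logic): imagine a detector that flags any source touching more than some number of ports per minute. Once an attacker learns the rule, they evade it by simply scanning more slowly:

```python
# Toy rate-based scan detector and the evasion it invites.
# Entirely illustrative: real detectors and attackers are far more
# complex, but the adaptation dynamic is the same.
THRESHOLD = 20  # flag any source touching > 20 ports per minute

def is_flagged(ports_per_minute: int) -> bool:
    return ports_per_minute > THRESHOLD

naive_attacker = 100   # fast nmap-style sweep: caught
adapted_attacker = 15  # same sweep, spread out over time: missed

assert is_flagged(naive_attacker)
assert not is_flagged(adapted_attacker)
```

Any fixed decision boundary -- hand-written rule or learned model -- invites exactly this probing and adaptation once the attacker knows it exists.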
Overall, I’m not sold. In this blog post, we covered two reasons: the base rate fallacy, which means even an accurate detector drowns analysts in false positives when real intrusions are rare, and the lack of robustness against adversaries who adapt to evade the defense.
This is just the start. There are other technical problems we didn’t get into, such as whether the data you used for training is representative of your network. Companies like Google and AT&T have immense data and still struggle with this. There are also organizational issues, such as whether your SLA is calibrated so your people don’t have to manage the unknown. Mature security operations centers often first figure out how much they can handle, then tune the detector down appropriately.
First, think about whether you’re looking for a short-term fix or a long-term fix that requires robustness against evasion. If you only want to stop internet chaff, new machine learning products may help. But there is no scientific consensus that they will remain hard to evade once attackers learn about them.
Second, think hard about what you want to detect: do you want to study the attacker, or are you in charge of responding to real problems? For example, the base rate fallacy teaches us that if your organization sees relatively few intrusions per attack (ask your team if you don’t know!), the iron-clad rules of math impose hard limits on any approach -- ML or not -- and those limits may not be in your favor.
Where can ML really help? The jury is still out, but the general principle is that ML is statistics, and it applies best where you are trying to marginally boost your statistical accuracy. I use the word “statistics” on purpose: you have to accept that there is risk. For example, Google has had tremendous success boosting ad click rates with machine learning, because a 5% improvement means millions (billions?) more in revenue -- but is a 5% boost enough for you?
A second place ML can help is in getting rid of unsophisticated attackers -- for example, the script kiddie using a well-known exploit. In that setting we’ve removed the need for robustness, since we’ve defined away anyone really trying to fool the algorithm.
Finally, I want to reiterate that amazing research is being done in ML today, by amazing researchers. We need more of both. My opinion is that we are not “there” yet, at least not in the way an average user would expect.
In the intro, I said I have more confidence that we can make parts of application security fully autonomous. The reason is:
But do I think there is a chance ML-powered IDS will become a fully autonomous network defense? Not today.
Originally published at CSOOnline