The Case for Autonomous Security Testing
The AI Movies Were (Kind Of) Right.
I hold on tightly to my hyperbolic belief that all hacker and AI movies are destined to be horrible (Sorry, Chris Hemsworth!). I refuse to let them hypnotize me with their impeccable CG and realistic explosions. The plots are always unbearable, perpetuating the far-fetched notion that computers will eventually run our lives into destruction.
But we still watch them because there’s a hint of truth and a glimmer of possibility. I periodically audit my life and am reminded of how dependent I am on software, how deeply it permeates my life, and how quietly it keeps my day-to-day humming along, rarely making its presence known. While reality is not nearly as gloomy as the movies assert, I can’t deny there’s validity in those plots. To reluctantly quote the overused phrase: software is everywhere.
From ride-sharing to home-sharing to content-sharing, software has reimagined not only our lives but also the way markets operate. The new possibilities that software offers have increased demand for applications. As a result, we are on track to release 111 billion lines of new code every year. That’s fantastic!
There’s always a but, isn’t there?
With more code come mistakes. With mistakes come vulnerabilities. With vulnerabilities comes risk. With risk come breaches. Unfortunately, the old phrase rings true: you can’t have your cake and eat it too.
There is a Cost to Doing Nothing.
Some maintain that, just as failure is a requirement of innovation, mistakes are a part of pushing out more code. While true, this logic is often weaponized as an excuse for complacency, and it perpetuates short-sighted judgement.
The reality is that cybersecurity is a matter of survival. 60% of hacked small and medium-sized companies go out of business within 6 months. In a National Cyber Security Alliance report, research showed that of the 1,009 small businesses surveyed, each with up to 500 employees, 10% went out of business, 25% filed for bankruptcy, and 37% experienced financial loss.
Large organizations aren’t exempt either.
In 2015, two renowned security researchers uncovered a critical vulnerability that allowed remote control of the then-latest Jeep Cherokee. They used an advanced security testing technique known as “fuzz testing.” Following the finding, Fiat Chrysler Automobiles (FCA) faced scrutiny from the National Highway Traffic Safety Administration for failing to follow federal laws requiring expeditious recall notifications and fixes. FCA was given a $105 million penalty for failing to complete 23 safety recalls covering more than 11 million vehicles. Its approach highlighted its neglect of consumer safety. This steep punishment was a humbling setback. The market speculates that the remediation cost over $1B in total. FCA’s stock value declined steadily for nearly two years before showing signs of growth.
Well, that was 6 years ago and an extreme case, you may think. But security’s deep impact on company performance isn’t a thing of the past. On May 5, 2021, Peloton made headlines after it was reported that its APIs were leaking private user information. The culprit? A vulnerability disclosed by Jan Masters, a security researcher at a penetration testing company. The news went viral on May 4. Between May 4 and May 5, Peloton stock fell 13% to $82. At its highest point this year, the company’s stock was trading for as much as $167. Surprisingly, Peloton’s stock felt greater downward pressure after the API leak announcement than after the company announced the recall of all of its treadmills due to their link to 70 injuries and the death of one child.
From a financial and company reputation perspective, it pays off to pay attention to your products’ security. It could even be the difference between winning and losing.
However, when we look at the scale of code being developed and deployed, it feels nearly impossible for security testing to keep pace. To make matters worse, the gap between software security talent and software security demand has been widening year after year. There simply isn’t enough skill to go around.
Perhaps it’s not as impossible as we thought, thanks to the recent debunking of these two security-related myths.
Myth #1: The Goal of Security is to be 100% Secure
It’s a common misconception that the goal in security is to be 100% secure. Security is more like a leaderboard, where the goal is to remain ahead of the attacker.
This was cemented in the 2016 DARPA Cyber Grand Challenge (more on this later).
Myth #2: Government Bureaucracy Stifles Innovation
DARPA was originally created on February 7, 1958 by President Eisenhower in response to the Soviet launching of Sputnik 1 in 1957. DARPA is responsible for the development of emerging military technologies.
DARPA’s influence transcends military use cases. DARPA-funded projects have propelled the development of innovative and historically significant technologies outside of the military, including computer networking and graphical user interfaces. It’s no coincidence that since DARPA’s first autonomous vehicle challenge in 2004, we’ve started seeing an influx of autonomous driving capabilities, whether it’s in your Tesla or in your recent Motional ride-share.
In 2016, DARPA left onlookers scratching their heads when they announced the Cyber Grand Challenge, a competition to create autonomous systems that are capable of reasoning through flaws in software, developing patches, and deploying them on the network in real time. DARPA further justified its need:
The need for automated, scalable, machine-speed vulnerability detection and patching is large and growing fast as more and more systems—from household appliances to major military platforms—get connected to and become dependent upon the internet. Today, the process of finding and countering bugs, hacks, and other cyber infection vectors is still effectively artisanal. Professional bug hunters, security coders, and other security pros work tremendous hours, searching millions of lines of code to find and fix vulnerabilities that could be taken advantage of by users with ulterior motives.
This justification, supplemented with claims like “future wars are destined to be fought on civilian infrastructures of satellite systems, electric power grids, communication networks, and transportation systems,” made the entire situation feel like a scene straight out of a Spielberg movie. Little did we know, this challenge would be one of those DARPA experiments that reaches beyond federal uses.
2016 DARPA Cyber Grand Challenge
DARPA poured $2 million into its machine-only capture-the-flag (CTF) competition. As mentioned above, the objective of this competition was to create autonomous systems that are capable of reasoning through flaws in software, developing patches, and deploying them on the network in real time. No human intervention was permitted. The last system left standing would be deemed the winner. Seven research teams competed, but only one system walked out of the competition as the winner: ForAllSecure’s cyber reasoning system, Mayhem (now Mayhem for Code).
Articles often highlight what made the difference: Mayhem’s accurate results and its ability to leverage those results to autonomously make intelligent decisions when assessing the ROI of fixing vulnerabilities. That’s right, it wasn’t the most secure system that won the competition (Remember, I said I’d get into this later? Well, this is later). It was the system that was best able to manage its limited resources through reasoning. For example, deploying a patch sometimes degraded system performance. In these instances, Mayhem opted to remain insecure, because the risk the vulnerability presented was low compared to the larger cost to system performance.
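To make that trade-off concrete, here is a minimal sketch of a patch-or-not decision. It is not Mayhem’s actual logic. The PatchCandidate fields, the scoring rule, and every number below are assumptions invented purely for illustration: a patch is deployed only when the expected loss from leaving the flaw open outweighs the performance cost of the fix.

```python
from dataclasses import dataclass


@dataclass
class PatchCandidate:
    # All fields are hypothetical and use arbitrary "points" as units.
    name: str
    exploit_likelihood: float  # assumed chance the flaw gets exploited (0..1)
    breach_cost: float         # assumed loss if the flaw is exploited
    performance_cost: float    # assumed loss from running the slower, patched code


def expected_risk(candidate: PatchCandidate) -> float:
    """Expected loss from leaving the vulnerability unpatched."""
    return candidate.exploit_likelihood * candidate.breach_cost


def should_patch(candidate: PatchCandidate) -> bool:
    """Patch only when the expected risk exceeds the cost of the patch."""
    return expected_risk(candidate) > candidate.performance_cost


candidates = [
    PatchCandidate("heap-overflow", exploit_likelihood=0.8,
                   breach_cost=100.0, performance_cost=20.0),
    PatchCandidate("rare-null-deref", exploit_likelihood=0.05,
                   breach_cost=40.0, performance_cost=15.0),
]

decisions = {c.name: should_patch(c) for c in candidates}
for name, patch in decisions.items():
    print(f"{name}: {'patch' if patch else 'leave as-is'}")
```

In this toy scoring, the high-risk flaw is patched while the low-risk one is left alone because the fix would cost more than the risk it removes, which mirrors the kind of reasoning described above.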
Rarely do articles highlight what the systems had in common. None of the systems leveraged today’s most popular security testing technique, SAST. In fact, nearly all the systems relied on fuzz testing. Guided fuzz testing, as a technique, showed strength in automated analysis from both a speed and scale perspective. It’s also the only security testing technique available today that is capable of bringing intelligent, autonomous capabilities to these cyber reasoning systems.
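For readers new to the idea, the sketch below shows the core loop of guided fuzzing in miniature. It is a toy and not how Mayhem or any production fuzzer is built. The target function, its hand-written branch markers, and the exhaustive single-byte mutation strategy are all assumptions chosen to keep the example small and deterministic. Real fuzzers typically rely on compiler or binary instrumentation and randomized mutations.

```python
def target(data: bytes, coverage: set) -> None:
    """Hypothetical program under test; records each branch it reaches."""
    if len(data) > 0 and data[0] == ord("F"):
        coverage.add("b1")
        if len(data) > 1 and data[1] == ord("U"):
            coverage.add("b2")
            if len(data) > 2 and data[2] == ord("Z"):
                raise RuntimeError("simulated crash")


def mutations(data: bytes):
    """Yield every single-byte substitution of the input."""
    for i in range(len(data)):
        for value in range(256):
            yield data[:i] + bytes([value]) + data[i + 1:]


def fuzz(seed: bytes, max_execs: int = 10_000):
    """Return (crashing_input_or_None, number_of_executions)."""
    queue = [seed]   # inputs worth mutating further
    seen = set()     # branches reached by any input so far
    execs = 0
    index = 0
    while index < len(queue):
        parent = queue[index]
        index += 1
        for candidate in mutations(parent):
            if execs >= max_execs:
                return None, execs
            execs += 1
            coverage = set()
            try:
                target(candidate, coverage)
            except RuntimeError:
                return candidate, execs
            # The "guided" part: keep inputs that reach new branches.
            if not coverage <= seen:
                seen |= coverage
                queue.append(candidate)
    return None, execs


crash_input, executions = fuzz(b"\x00\x00\x00")
print(crash_input, executions)
```

Because each input that reaches a new branch is kept and mutated further, the loop finds the crashing three-byte input in a few thousand executions. Blindly enumerating all 256^3 three-byte inputs could take millions. That feedback loop is what lets guided fuzzing scale without a human steering it.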
Mayhem’s win drew considerable attention from organizations seeking relief from the challenges and complexities of application security. Mayhem’s capabilities resonated far and wide. Overwhelmed by the enormous market interest, we have since made it our mission to bring this autonomous technology to the masses.
The Case for Autonomous Security Testing
Hacker and AI movies have led society to believe that computers will eventually possess the intelligence to lead the human race to its demise. As we make Mayhem’s autonomous technology widely available, we’ve come to learn that this is a false narrative. It’s actually the complete opposite.
Autonomous technology isn’t about destroying, much less replacing, the human element. It’s about elevating human potential. Today we waste so much brilliant talent on boring, manual tasks (“artisanal,” as DARPA characterizes them, if you recall).
Humans should have more creative roles. Imagine how many more boundaries we can push when we free developers to do what they do best: solve humanity's greatest challenges through code.
We believe that the implementation of autonomous security testing will mark a pivotal moment in history, multiplying not only the scale but also the speed at which we will be able to push out ingenious solutions. We’ll no longer be bound, slowed down, or distracted by doubts about safety and security.
We conduct security in the background, so organizations can focus on developing game-changing applications like Uber, Airbnb, and Spotify, rather than maintaining them. Our mission is to make way for those possibilities. We’re making big things happen. To reluctantly quote the overused phrase: we’re changing the world. Really.