For two years Heartbleed was a zero-day in OpenSSL until fuzz testing exposed it. How many others are in the wild now? And how will we find the next one?
In this episode I talk about how Heartbleed (CVE-2014-0160) was found and also interview Rauli Kaksonen, someone who was at Codenomicon at the time of its discovery and is now a senior security specialist at the University of Oulu in Finland, about how new security tools are still needed to find the next big zero day.
Listen to EP 10: Hunting The Next Heartbleed
Vamosi: Imagine being able to attack a company’s servers without leaving a trace. A kind of digital smash and grab of sensitive information such as the encryption keys created to protect sensitive transactions on a site like Amazon, or your bank, with no way to trace any of it back to you. This isn’t a typical person-in-the-middle attack; that would leave too many fingerprints. No, this would be almost the perfect crime, and done with a very simple yet subtle zero day.
Such a scenario isn’t fantasy; something like this actually existed between 2012 and 2014. I’m talking about Heartbleed or CVE-2014-0160. It is a vulnerability in SSL/TLS, protocols that are designed to protect data in transit. Except during that two-year window, there was a serious vulnerability in OpenSSL that no one knew about.
What I want to know is how that vulnerability was able to persist for so long. I mean, it was open source, right? And how many other serious vulnerabilities like Heartbleed are lurking unknown in the applications we use every day, in the websites we depend on, and in the devices we carry? To answer these questions I’ll talk with an expert on developing security testing tools, someone who was, coincidentally, there the moment Heartbleed was discovered.
Welcome to the Hacker Mind, an original podcast from ForAllSecure. It’s about challenging our expectations about people who hack for a living.
I’m Robert Vamosi and this episode I’m going to talk about one of the most perfect examples we have of a dangerous zero day, Heartbleed, and how fuzz testing was used to discover this new class of vulnerability, and what the security industry can do in terms of developing new tools in the future to help stop the next Heartbleed, such as using more automation and perhaps even machine learning or AI.
So what is Heartbleed? We hear the name, but do we really understand the vulnerability? Secure Sockets Layer or SSL and its successor Transport Layer Security or TLS are complex protocols that operate behind the little padlock you see on the address bar of your preferred web browser. There are actually a lot of things going on there, but suffice it to say SSL establishes an encrypted channel between your browser and the site you are trying to access. Once it’s set up, it needs to stay up until you leave that site, so it needs to know whether the two parties are still there, so there is this constant communication going on in the background called the heartbeat function of SSL.
When the heartbeat is operating, by design, it sends a message -- “I’m here” - and the web site responds “I am too”. The reality is more nuanced. There’s the initial Hello, followed by a secret message and then the length of the secret message. So Hello. Bird. 4 characters. And the server would send back Hello. Bird. 4 characters. And this happens over and over until you leave the web site and break the encrypted channel and establish a new one on another website.
That’s how it’s supposed to work.
Fact is, not everyone rolls their own SSL/TLS. Nor should they. History is filled with failed examples of homegrown encryption, some with disastrous consequences. As I said, it’s pretty complicated to implement if you don’t already know something about encryption, so a lot of sites use something premade such as OpenSSL. No shame in that. Sites affected included Google, Dropbox, Netflix, and even Facebook. And this is open source software, meaning that there’s some developers or some project behind it that has already built out the basics of what you need to put it into your code and start using it.
So on December 31, 2011, at almost midnight, a developer with direct access to OpenSSL, Robin Seggelmann, committed the change that changed the heartbeat function. We know the name and we know the time because it’s recorded in the logs. He later told the Sydney Morning Herald "In one of the new features, unfortunately, I missed validating a variable containing a length."
Length. What is he talking about? In OpenSSL the heartbeat function didn’t bother to limit the length of the characters used. So, remember our example with the word bird? So, if a site were using OpenSSL 1.0.1, and a criminal hacker knew this, they could create a malformed heartbeat such that it would say something like Hello. Bird. 500 characters. And the server, dutifully, would have to respond Hello. Bird. And then four hundred and ninety six other characters. And that’s a problem. Those four hundred and ninety six characters were probably just sitting somewhere in memory. And those four hundred and ninety six characters probably included recently used encryption keys, passwords, social security numbers, and other PII. It would be a massive data breach. And it wouldn’t just be that one time. The criminal hacker could make this request over and over, remember, it’s a heartbeat, amassing a large amount of data quickly, if not just using the encryption keys already obtained to eavesdrop on all the communications happening at that time.
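The "Hello. Bird. 500 characters" exchange can be sketched in a few lines of Python. This is a toy simulation, not OpenSSL code: the function names, the fake memory contents, and the secrets are all made up for illustration. The only detail taken from the real fix is its shape, a bounds check that silently discards a heartbeat whose claimed length exceeds the payload actually sent.

```python
# Toy model of the heartbeat over-read. All names and data are hypothetical.

# Stand-in for whatever happens to sit next to the payload in server memory.
ADJACENT_MEMORY = b"|password=hunter2|key=0xDEADBEEF|"


def vulnerable_heartbeat(payload: bytes, claimed_len: int) -> bytes:
    """Echo back `claimed_len` bytes, trusting the sender's length field."""
    memory = payload + ADJACENT_MEMORY
    return memory[:claimed_len]  # no check against len(payload)


def patched_heartbeat(payload: bytes, claimed_len: int) -> bytes:
    """Same echo, but discard requests that claim more than was sent."""
    if claimed_len > len(payload):
        return b""  # silently drop the malformed heartbeat
    return payload[:claimed_len]


if __name__ == "__main__":
    print(vulnerable_heartbeat(b"bird", 4))    # honest request
    print(vulnerable_heartbeat(b"bird", 500))  # malformed request leaks neighbours
    print(patched_heartbeat(b"bird", 500))     # patched server returns nothing
```

The honest request gets back only its own payload; the malformed one gets the payload plus whatever followed it in the simulated memory, which is the whole bug in miniature.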
I said this would be the perfect crime: if you have the encryption keys, you have access to the live session and no way for anyone to know you had that access. And if you could initiate a heartbeat before authentication was complete on the site, you could smash and grab the encrypted information before anyone even knew who you were. You could use a Starbucks free wi-fi and virtually leave no trace behind.
So this zero day -- in that the operators of OpenSSL didn’t know about it -- persisted for two years. You may have heard of Linus' law. It states that "given enough eyeballs, all bugs are shallow" -- except that in this case all those eyeballs looking at the code just weren’t enough to discover the open source-based Heartbleed vulnerability; it was hiding in a blind spot.
And traditional application security tools like static analysis, they couldn’t find it.
It took something different to discover Heartbleed.
What is Fuzz testing? It is certainly not a new solution, but it still isn’t used nearly enough in my opinion. Fuzzing involves sending invalid input to stress test software - either the program can handle the invalid input or it can’t.
In discovering Heartbleed what the researchers got back wasn’t a crash, it wasn’t a fault, it was anomalous behavior. Consistently anomalous behavior. If you inputted 70 characters, you got back 70 characters. If you inputted 400 characters, you got back 400 characters. Funny thing is the protocol spec said it should be limited, but at least in OpenSSL 1.0.1 there was no upper limit, and it was returning more data than should have been allowed. That much was obvious.
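To make the "anomaly, not a crash" point concrete, here is a minimal fuzzing sketch in Python. It assumes a hypothetical target function that takes a payload and a claimed length and returns a reply; both `leaky_echo` and `fuzz` are invented for this illustration and do not reflect how Codenomicon's or Google's tooling worked. The loop flags two things: exceptions (the classic crash) and replies longer than the payload that was sent (the quieter anomaly).

```python
import random


def leaky_echo(payload: bytes, claimed_len: int) -> bytes:
    """Hypothetical target with a missing bounds check on the length field."""
    memory = payload + bytes(range(256)) * 4  # stand-in for adjacent memory
    return memory[:claimed_len]


def fuzz(target, trials: int = 200, seed: int = 0):
    """Send random payload/length pairs and collect anything suspicious."""
    rng = random.Random(seed)  # seeded so runs are repeatable
    anomalies = []
    for _ in range(trials):
        size = rng.randrange(1, 16)
        payload = bytes(rng.randrange(256) for _ in range(size))
        claimed = rng.randrange(0, 512)  # often larger than the payload
        try:
            reply = target(payload, claimed)
        except Exception as exc:  # a crash is the obvious kind of finding
            anomalies.append((payload, claimed, f"crash: {exc!r}"))
            continue
        if len(reply) > len(payload):  # no crash, but more data than was sent
            anomalies.append((payload, claimed, f"over-long reply: {len(reply)} bytes"))
    return anomalies


if __name__ == "__main__":
    findings = fuzz(leaky_echo)
    print(f"{len(findings)} anomalous replies out of 200 requests")
```

A fuzzer that only watches for crashes would report nothing against this target, which is why the check on reply length matters.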
In March of 2014, researchers from Google and a Finnish start up named Codenomicon both turned their attention to OpenSSL. Coincidence, right? Actually it happens a lot, where two independent researchers will look at the same problem at the same time. There are a lot of dupes reported with, say, bug bounties, for example.
The thing is criminal hackers also know how to fuzz test -- and they might have found Heartbleed first but we have no way of knowing. Personally, I don’t think anyone successfully exploited Heartbleed, but the thing to know is that criminal hackers do use open source fuzzing tools. They use them all the time. And some of these free fuzzers take a lot of time to run; they require some expertise to separate the signal from the noise. But we also know that some criminal hackers have plenty of time and at least some software expertise.
Determining whether someone had read the memory from your server is something that is hard, if not impossible to do after the fact. We really have no way of knowing whether criminal hackers even knew Heartbleed existed before the public announcement, or if anyone was in fact able to steal data from the 600 thousand vulnerable web servers at the time.
But here’s the thing: SSL is one of the most common protocols, therefore the amount of traffic that is generated is enormous. So with a malformed request sneaked in among the millions that come in every day -- say, if you are a big bank or a well-known organization -- finding that needle in the needle stack, even if you knew what you were looking for, is going to be even harder.
Here’s why: Heartbleed was something different, it wasn’t just a crash or a fault, it was an anomaly, so, again, I think it’s unlikely criminal hackers would have noticed that.
The other good news? Commercial fuzz testing tools have gotten much better. There’s much more automation involved today, so it’s not just Google, Microsoft, and Apple who are fuzzing their software. More and more mid-sized companies now fuzz test their software during development -- so that’s a win-win for everyone. So will we detect the next Heartbleed? I don’t know.
I called up someone I used to work with, someone who was there at the moment Heartbleed was discovered, someone who is now at a university in Finland, and spends his full time these days thinking about these security tools questions. I’ll let him introduce himself.
Kaksonen: Rauli Kaksonen, a senior security specialist in the University of Oulu.
Vamosi: Rauli and I started by discussing the big picture: the beginning of the software development lifecycle, when the code is first developed, but then there’s also the other side, after the software is released and a vulnerability is found. Companies need to think of tools that help them test the whole lifecycle, and not just the part about getting your app to market.
Kaksonen: I have been looking at the space of security tools. What's the big picture there? And it seems that the obvious first step is proactive tools that look for vulnerabilities already in the R&D phase, that scan applications and systems before they are in production. Then there is security scanning of live systems. And then they move on: they decide that something bad has already happened and we need to respond. We need to react to the problems that have happened. And then, after an incident, there's also a need to do some forensic analysis. Let's see what happened, what can we learn about it. And, well, in a legal context we want to find out who did it and also go to court if possible. So it's the whole spectrum, from the development phase, where we start looking at what are the things to be fixed at this early stage, to very late in the cycle, when something bad happens: who can we blame, who can we ask for compensation, perhaps, if anybody.
Vamosi: So what makes a good security tool? And is it better to be open source or commercial?
Kaksonen: Well, I guess it doesn't really matter that much if it's an open source tool or a commercial tool. I think, in essence, they have to solve a problem for the user for the user to use it. And that's the key. Then obviously there are things like usability. Open source tools are usually used more by security specialists, who appreciate command-line interfaces, APIs, and so on. But obviously, ease of use in that context is a very important quality. The tool must not crash, so it must be reliable and effective. Especially in a security context, you have to trust the result, so that the findings are real, or if you didn't find anything, it's because there was nothing to find. Then there is support, and so forth. This is difficult for open source tools. Obviously, your users typically are not paying you, so why would you support them? And that's where you usually need a community of people that can support each other, and when the tool becomes more popular, then perhaps there is some company that is able to provide commercial support. And then the tool must be maintained so that it is fixed and upgraded as needed. So I guess the bottom line is that it doesn't matter what kind of tool it is; the underlying question is the same: what makes a good tool.
Vamosi: Fuzz testing tools came out of academic research. The most famous story is from the late 1980s, of professor Barton Miller at the University of Wisconsin, who, one summer’s night, in the middle of a thunderstorm, noticed that his remote session with the university was influenced by electrical discharges in the air. Gibberish was being inputted and the program sometimes did and sometimes did not behave as expected. This is often cited as where the idea of putting in invalid data to test software comes from, but I recently heard of another. This dates back to the 1970s. It’s the idea of taking punch cards out of the trash and running these random program cards against your program to see if it was resilient enough. Very similar idea. Whatever the origin of fuzzing, there were a lot of academic papers produced around the phenomenon of using invalid input to trigger unexpected results; a lot of papers before we started to see open source and commercial versions. So, is it fair to say a lot of security tools are the result of academic research?
Kaksonen: I don't know, I'm not sure how the first fuzzer came to be, but I think that in fact it was an academic exercise. The idea came from security or reliability research, but obviously these days many of the fuzzing tools come from different backgrounds. I don't think it has been studied that much. I think there's a fair amount of tools coming from academic backgrounds, but then also, especially, individuals who have built tools for their own use and then give them to the community to use and to improve. And also there are companies which make free versions of their tools, or promote them because they really think it's good for the world, and then have a premium paid version. So there are all kinds of ways that tools come to life.
Vamosi: So let’s say I have a great idea for a tool. There’s this process that takes so many minutes to perform that I need to do every time; wouldn’t it be great if a tool could do that for me, and just give me the results? So I create it. Now what? How does anyone else come to find out about it?
Kaksonen: Well, I think then it's marketing, but obviously very few open-source developers have a marketing budget. But I'm certain you have a community: people having similar problems. Maybe you created the tool to solve a problem that many other people have, so that community is probably your best friend here. You get early adopters. If it's a good tool, they will start using it and they will tell others. And one very good way is to get great results. Find vulnerabilities, or whatever your tool is doing, and report them, and people will come and ask, well, how did you find that thing? And then you can tell them it's this fine new tool. So I don't think a tool can be great without the users, actually. They go hand in hand: you get users, you get the community, and the community helps to improve the tool.
Vamosi: So turn that around. Just because I can market a new tool, how does someone else know whether or not it’s the real deal? How do you know if it even works, or works well?
Kaksonen: My take is that this is a problem. Many tools out there are not thoroughly tested. Even if they are, there are updates, and afterwards they may work differently in the new version. There might be regressions, and so on. Obviously there is a huge scale of things. Some open-source tools are backed by companies or foundations and they are rigorously tested. And then some are made by individuals. At some point, the individual has moved on and the tool is now abandonware. Obviously, at some point, with those tools the users will abandon them as well. And sometimes they are picked up by somebody else who takes the responsibility of making a new version of it. Make a branch of it. And maybe change the name. So there are many different things happening for different tools.
Vamosi: So is there a need for third-party validation or testing? I got my start writing about malware; this was back even before the term malware existed. There were times when new viruses and later worms would be reported daily, when it seemed like this was out of control. Then there came all these antivirus and antimalware products, and there was a need to report which ones were better at solving the problem. I turned to a few third parties, such as AV-Comparatives out of the University of Innsbruck, Austria. They provided objective analysis of how well, say, Norton Antivirus worked vs Trend Micro Antivirus. But antimalware is a niche category. Who’s going to test network scanners, port scanners, and software code?
Kaksonen: Yes, that's true. I guess there have been, for example, antivirus tool comparisons that can be found, and so on, but I'm not aware of any systematic, long-running effort of measuring a set of tools and how they perform. It would be interesting. I think it would be great. It's very hard to see how that could happen. But there are some things that are similar to third-party validation. For example, these days all these open source repositories, I mean public repositories, have code quality checks and code scanning features that can measure the quality of the software. And also, in my current academic setting, we have looked at doing some testing of open source security tools. But the problem is that the checks are general in nature: we check that the tools install properly, and so on. It's very hard to see that there would be somebody who looks at the specific feature of the tool, the specific reason the tool was made, and measures how good it is. So there is a community, and the users, again: if they choose to use the tool, maybe you can also trust that it's a good tool. But it would be great if there were somebody really measuring the tools and giving them good feedback.
Vamosi: So we have a plethora of security tools, some vetted, most not. Back to an earlier question: Where do I find these? Certainly I don’t go to the dark web for them, or do I?
Kaksonen: Please do not. The internet has some source sites. Google what you want to do and you will find these tools, and so on. And then there are, for example, dedicated Linux distributions which serve as collections of tools ready to be used. I think it's a great way to start doing some security analysis. There are also some Windows-based virtual machine images. To get started, [at the university] we also collected some tools that we think could be used in a ready-to-use way. And we actually use containers to package those tools so that it is easy to get started running them.
Vamosi: So what are some upcoming trends in security tools?
Kaksonen: My take is this: We are more networked today and we have this internet of things, and bring your own devices. Everything is IP-based. We need, unfortunately, more and more security. Well, not unfortunately. We need more and more security analysis today because there are more incidents and more things to protect. So the demand for security analysis is increasing. And obviously that means more users for the tools, and with more users, not everybody can be a super cybersecurity expert. So we need tools that are easier to use. And some of those are going to be commercial tools, and then probably some are going to be cloud-based solutions because, well, cloud is good. And then also these open source tools are getting more users. So what we study now is how to make the tools easier to use, install, and run, and how to make it easier to automate tasks with those tools. And then, how to support users in sharing, for example, tool configurations, results, and information about those. How do you go towards efficiency and what are the results? And we aim to scale the analysis capability to match the requirements of society going forward. It means more people but also more effective use of the tools. So I think the future is more tools, more users of tools.
Vamosi: And for commercial tools, what about the role of automation?
Kaksonen: Well, certainly there's a lot of room to automate the analysis of, let's say, security incidents, but also the analysis of the security posture of companies and applications, and so on. And by automating the, let's say, mundane everyday tasks, that leaves more time for the people to look for the interesting, for the unexpected results, and so on. I think AI has a role. These tools produce a lot of results, so there's a lot of data, and when there is a lot of data, then machine learning and other artificial intelligence systems have something they can work on and analyze, so I think it's natural to move in that direction. Well, I think it's the proactive things that are proceeding, because we really want to be ahead of the curve, and to find the vulnerabilities before somebody gets into the system or before zero days are discovered. We want to find stuff already in R&D, before stuff goes into production, so that's at least where we should put the focus. Because after we already have an incident, everything is so much more difficult and much more expensive. And something bad has already had to happen.
Vamosi: I should also mention that automation helps with remediation. Ideally you want a test solution that validates the vulnerability before you have your developers wasting time on something that wasn’t really something. So there again, we have humans involved. So would an automated tool have been able to find something subtle like Heartbleed?
Kaksonen: Interesting question. I think, well, I don't expect automation to find a new kind of software vulnerability like Heartbleed. It was really a new kind of vulnerability. Obviously, now that we know that those things exist, we can probably find them. But I would imagine that new categories of vulnerabilities will in the future also be found by clever individuals.
Vamosi: Is that to say that humans are still superior to automated tools?
Kaksonen: Obviously, the capability of a human is very limited in the amount of data he or she can process. So it's always that the human looks at only a limited amount of data. We need automation, or artificial intelligence, to pick and select the data. It is simply too much to digest otherwise already.
Vamosi: So Rauli was part of Codenomicon which helped find Heartbleed. Does he have any advice for someone with tools who might want to go into business or maybe just want to find the next Heartbleed?
Kaksonen: How do you start a startup company? I've always said that the most important thing is the idea. What is the problem that you solve for the customer? And if you have a great solution for the problem, then you are on your way towards success. And part of having a great idea is also timing. So, if you are too early, and you don't find customers, you run out of money. If you are too late, there's fierce competition, and then it's no longer a question of who has the greatest technology but of who has the greatest marketing, which is an important function. But, as an engineer, I think I was so privileged to start a company which just happened to be in the right place at the right time, so that we could evangelize our technology to customers: you know, this is something you can do. And they were happy and they bought from us, because it was so good.
Vamosi: A couple of interesting footnotes to Heartbleed.
The amount of coordination between Google and Codenomicon can’t be overstated. They reached out to their respective Computer Emergency Response Teams (CERTs) in the US and Finland, and were able to work with the OpenSSL folks immediately. This while contacting major players like Yahoo and Facebook a few days ahead of the announcement, which mitigated the risk of data breaches. It also established a good precedent for future vulnerability disclosures.
And I’ve written about this elsewhere, but despite what you may think about naming vulnerabilities, it is so much easier to remember Heartbleed than CVE-2014-0160 -- at least I think so.
And, finally, although the patch has been available since 2014, if you today fire up Shodan, the IoT device search engine, you’ll find over 250 thousand devices still using the vulnerable versions of OpenSSL. That’s an alarming number, until you start to drill down. These devices are sensors and monitors, often left out in the field, with no extra resources or capability for updates. In theory you could still attack them, but the data leaked wouldn’t be all that interesting, I would guess. Perhaps the temperature, or the soil acidity. These devices will remain vulnerable. Eventually, as they start to fail, they will be replaced with non-vulnerable versions. And we’ll stop seeing large numbers attributed to Heartbleed on Shodan and other resources. At least I hope so.
For The Hacker Mind, I remain, Hello Bird 4 Characters, Robert Vamosi