The xz apocalypse that almost was*
*Paraphrased prompt for header image. “Man dressed in black lurks around the open backdoor of a farmhouse where a desktop computer can be seen. A rubber duck is on the porch. Make it in the style of regionalism.” Why a rubber duck? Bailey the duck got huge engagement from our RSAC content and who am I to deny the people what they want?
It’s been a while since I’ve blogged. Things have been busy; Bitsight released a great report authored by yours truly on CISA’s KEV catalog. It’s really great if I do say so myself so I highly recommend going and giving it a glance.
Aside from that, there have been a lot of things on my mind. NIST’s National Vulnerability Database (NVD) has had a “degradation of service,” which means the main source of vulnerability information that so many APIs rely upon is in danger of being significantly less useful. I got to kibitz with the vulnerability glitterati1 at VulnCon at the end of March. While there, my colleague Sander Vinberg and I gave a barn burner of a talk to a packed crowd about the history and evolution of CVEs and their associated frameworks. I even talked to the press a bit. A whole post on that meeting and what the decline in NVD might mean for security research is forthcoming, but for now I wanted to talk about another big story that hit last month, the backdoor that got inserted into the open source xz library and how it almost got disseminated across the globe.
For those not “in the know”, this was an absolutely fascinating story about a seemingly good natured developer taking over maintenance of a small, but crucial compression library, and then inserting a backdoor into the library which would have given them access to a large number of OpenSSH servers2. A great deal of digital ink has been spilled on the ins and outs of the attack, but here is a quick timeline synopsis of what happened.
- A patch to make the OpenSSH Daemon compatible with systemd causes OpenSSH to load libsystemd which in turn loads liblzma. This in itself is benign, but opened the avenue for attack.
- A contributor going by the name of Jia Tan begins contributing to the liblzma and xz compression library projects. Over the course of three years, this person contributes and earns the trust of the relatively small community of developers. Eventually pressure is put on the original maintainer to step down and hand the reins to Jia Tan.
- In February of this year, malicious payloads are added to the repository and end up in release builds of xz.
- A month later, Andres Freund, a PostgreSQL developer, notices some performance degradation with new versions of the library, investigates, and finds that a backdoor had been introduced through a binary test file in the source repository.
- The internet collectively freaks out.
Given that I make my bones in the security hot-take economy, it was interesting to read folks’ conclusions about this particular incident. Beyond questions of attribution of the mysterious “Jia Tan”, the fascinating details of how the backdoor actually worked, and of course the many, many funny memes, one particular disagreement was over exactly how close a call this was. Some folks felt it was somewhere in between “Armageddon” and “Ragnarok”, and that if the attacker had been slightly more competent every OpenSSH server on the internet would have been vulnerable to this attacker. Others insisted this was caught early, and that catching it before it became widespread was inevitable given how many people use OpenSSH.
So what can Bitsight say?
Given Bitsight’s pretty broad view of the Internet, I thought I could contribute to the discussion a bit and ask “how bad could this have been?” and as a corollary “how many chances would there have been to notice?” So let’s get into the “how bad could this have been?” question first.
The first part of answering any research question like this is making it just a bit more concrete. So let’s make it a bit more measurable: “How many SSH servers use OpenSSH and are running on Operating Systems that use systemd and therefore would have eventually used the backdoored xz library in future releases?” Of course, the best part of this kind of question is that it's easy to break down and get distracted along the way by all the weird stuff that pops up in our data. Let’s break out an envelope we can write on the back of and do some estimation.
First, exactly how many SSH servers are on the Internet? I took a look at our global internet scanning in the first quarter of 2024, and found ~38.3M3 different SSH servers (running various types of SSH server software on various ports) on ~37.0M IP addresses over the three months we measured. That’s a lot of communication! As you might expect nearly all of those are hosted on the RFC suggested port 22, but we found servers running on 252 different ports, with some interesting popular ones:
It’s wild to see SSH servers running on the typical HTTP and HTTPS ports of 80 and 443. One of the most fun diversions for me working at Bitsight is finding weird stuff in data like this. For example, I found an IP address in Malaysia running SSH servers on 33 different ports in a single day, and one network service provider owned an IPv4 address that used SSH on 250 distinct ports over the first three months of 2024. Neither of these were cloud service providers or telecoms, but rather infrastructure technology companies, who actually have very good reasons for doing weird networking things like this. This is really more of a roadside (network-side?) curiosity.
So the answer to the first question, how many SSH servers are out there, turns out to be “quite a lot.” How many are running OpenSSH specifically? Most SSH servers will helpfully announce themselves with a banner when connections are attempted (though this banner has varying levels of veracity and utility). Using this and some other probing information, we can usually determine the software a server is running. This is an area where this very good open source solution dominates, as we can see in Figure 2.
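To make the banner idea concrete, here is a minimal sketch of how one might bucket SSH identification strings by software family. It assumes only the general `SSH-protoversion-softwareversion comments` layout from RFC 4253; the function names and the sample banners are illustrative, and this is not a description of Bitsight's actual pipeline, which also uses other probing information.

```python
import re
from collections import Counter

# RFC 4253 identification string: "SSH-protoversion-softwareversion[ SP comments]"
BANNER_RE = re.compile(
    r"^SSH-(?P<proto>[^-]+)-(?P<software>\S+)(?: (?P<comments>.*))?$"
)


def classify_banner(banner: str) -> str:
    """Return a coarse software family for one SSH identification string."""
    match = BANNER_RE.match(banner.strip())
    if match is None:
        return "unparseable"
    software = match.group("software").lower()
    if software.startswith("openssh"):
        return "openssh"
    if software.startswith("dropbear"):
        return "dropbear"
    return "other"


def software_shares(banners):
    """Fraction of banners per family (empty input yields an empty dict)."""
    counts = Counter(classify_banner(b) for b in banners)
    total = sum(counts.values())
    return {family: n / total for family, n in counts.items()}


# Made-up sample banners, for illustration only.
sample = [
    "SSH-2.0-OpenSSH_9.6p1 Ubuntu-3ubuntu13",
    "SSH-2.0-OpenSSH_8.4p1 Debian-5+deb11u3",
    "SSH-2.0-dropbear_2022.83",
    "SSH-2.0-libssh_0.9.6",
]
print(software_shares(sample))
```

Since a banner is self-reported, a classifier like this inherits the "varying levels of veracity" caveat: anything that lies in its banner lands in the wrong bucket.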
~70% of SSH servers are running some form of OpenSSH, followed by 24% using some form of dropbear. The remaining 6% run a smattering of other more obscure software, some of which might simply be OpenSSH wearing a crude disguise, but we aren’t going to dig into specifics.
OK, so we see ~26M SSH servers running OpenSSH, but what percentage of those are running on OSes that use systemd and therefore the backdoored xz library, that is to say, Debian, Ubuntu, Red Hat, and Fedora? The trouble here is that determining the OS an OpenSSH server is running is pretty tough. Sometimes the data is right there in the banner, but sometimes it’s not. Through Bitsight’s arcane knowledge and finely honed techniques, we can determine this a little less than half the time.
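As a rough illustration of the "sometimes the data is right there in the banner" case, the sketch below looks for a distribution name in the free-text comment that follows the software version. The comment formats in the examples are assumptions based on commonly seen Debian and Ubuntu builds of OpenSSH, and the function name is hypothetical; servers that say nothing about their OS simply come back as unknown, which is why additional techniques are needed.

```python
def os_hint(banner: str) -> str:
    """Guess the OS family from the comment part of an SSH banner, if any."""
    parts = banner.strip().split(" ", 1)
    if len(parts) < 2:
        # No comment field at all, so the banner gives no OS information.
        return "unknown"
    comment = parts[1].lower()
    for distro in ("ubuntu", "debian"):
        if distro in comment:
            return distro
    return "unknown"


# Made-up sample banners, for illustration only.
for banner in (
    "SSH-2.0-OpenSSH_8.9p1 Ubuntu-3ubuntu0.6",
    "SSH-2.0-OpenSSH_9.2p1 Debian-2+deb12u2",
    "SSH-2.0-OpenSSH_8.7",
):
    print(banner, "->", os_hint(banner))
```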
A pretty big fraction of OpenSSH servers are Ubuntu and Debian, and if we had to guess, a large portion of those “Unknown OSes” are likely Debian-based or Red Hat/Fedora-based. Here we might fork our estimates a bit and give an upper and lower bound of what might have been affected if our worst nightmares had come true. The upper bound will assume the worst about all those “unknown OS”s, with the lower bound counting only stuff we can confirm.
One last step and we’ll get to some estimations of the scope of the potential xz apocalypse. We can also gain a sense of the versions of Ubuntu and Debian that folks were using and how close they were to using the “bleeding edge” version that might have pulled in the backdoored version of xz.
What we can see in Figure 4 is that most organizations were not close to using the most up-to-date version of either OS (12.5 for Debian and 22.04 for Ubuntu when the CVE was published). It’s possible that the servers with indeterminate versions were indeed that bleeding edge version, but that’s still only 13% for Debian and 44% for Ubuntu. Everyone else would have had to go through the arduous process of updating their OS before they would actually be vulnerable.
This in itself raises an interesting research question: Organizations should certainly maintain the software they are using within the range of supported versions, but where in that range is optimal? Should folks use bleeding edge, or the oldest long term support version available? Somewhere in the middle? Seems ripe for some analysis…
But what about our original question? At the lower bound, if we only consider servers that might be running bleeding edge versions of OSes that would be exposed, it’d likely only have been around 5M OpenSSH servers. This is certainly a substantial number, but not every single one. At the top end we’d still be worried about 26M, which is functionally all of them.
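For anyone who wants to check the back of the envelope, the arithmetic can be written out as a short sketch. Every input is a rounded figure quoted in this post; the function and key names are just labels chosen for illustration. Note that 70% of 38.3M works out to roughly 26.8M, which the post rounds to ~26M.

```python
def envelope(total_ssh: float, openssh_share: float,
             lower_bound: float, upper_bound: float) -> dict:
    """Back-of-the-envelope funnel using the rounded figures from the post."""
    return {
        # OpenSSH servers implied by the total and the approximate share.
        "openssh_estimate": total_ssh * openssh_share,
        # How much of the upper bound the confirmed lower bound represents.
        "lower_as_share_of_upper": lower_bound / upper_bound,
        # Gap between the pessimistic and the confirmed estimate.
        "spread": upper_bound - lower_bound,
    }


result = envelope(
    total_ssh=38.3e6,      # SSH servers seen in Q1 2024
    openssh_share=0.70,    # approximate OpenSSH share
    lower_bound=5e6,       # servers that might be on bleeding edge versions
    upper_bound=26e6,      # all OpenSSH servers, assuming the worst
)
for name, value in result.items():
    print(f"{name}: {value:,.2f}")
```

The lower bound comes out at a bit under a fifth of the upper bound, which is a reminder of how much the answer depends on what we assume about the servers we cannot identify.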
As one last bit of analysis, we might ask which industries are most likely to have OpenSSH servers.
Telecoms are unsurprisingly at the top (Figure 5), with Tech and Education following close behind. Perhaps what’s most interesting here is that in all cases (even telecoms at 28%) the vast majority of organizations don’t have OpenSSH servers running. So while there are certainly a lot of them out there, this doesn’t necessarily mean every organization is at risk.
So how big of an apocalypse could this have been? Honestly, it’s not quite clear, but it was also not clearly terrible, meaning that this doesn’t really count as an apocalypse in our book. After all, we’ve got to uphold some standards around here.
One last point we’d like to make is that the large install base provides many opportunities for discovery of the backdoor, both before it was deployed and after. Even if a small percentage (0.1%) of system administrators and maintainers who rely on OpenSSH or xz really dug into a particular update, that’s still tens of thousands of pairs of eyes examining a particular update. Which raises the question of whether a tiny, much relied on package really is a weak point at a critical spot in the metaphorical block tower4.
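The "pairs of eyes" estimate is easy to play with. The sketch below assumes, purely for illustration, one administrator per OpenSSH server and uses the 26M upper bound from this post as the install base; the real population of people relying on OpenSSH or xz is not something we measured, and the function name is hypothetical.

```python
def expected_reviewers(install_base: int, review_rate: float) -> int:
    """Number of people who would dig into an update at a given review rate."""
    return round(install_base * review_rate)


OPENSSH_UPPER_BOUND = 26_000_000  # upper-bound OpenSSH server count from the post

# Sweep a few review rates around the 0.1% used in the text.
for rate in (0.0001, 0.001, 0.01):
    reviewers = expected_reviewers(OPENSSH_UPPER_BOUND, rate)
    print(f"{rate:.2%} review rate -> {reviewers:,} reviewers")
```

At 0.1% this gives about 26,000 reviewers, in line with the "tens of thousands" above, and the sweep shows how sensitive that number is to the assumed rate.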
I would guess all those eyes watching are attached to hands ready to jump in the case that block starts to fail, so maybe there is more resilience than people assume. As part of this research I wanted to look into the dependency trees to see what other software might have been affected by this particular backdoor, but given the specificity of the attack path, that might have been less compelling than what I’ve done here.
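For the curious, the dependency-tree question can be sketched as a reverse reachability search: given a package, find everything that directly or transitively depends on it. The toy graph below contains the one chain described in this post (sshd loading libsystemd, which loads liblzma, on systems carrying the systemd patch) plus a few clearly hypothetical packages. A real analysis would need a distribution's actual package metadata, which is not shown here.

```python
from collections import deque

# Toy dependency graph: package -> libraries it loads.
DEPENDS_ON = {
    "sshd": ["libsystemd"],            # only with the systemd patch, per the post
    "libsystemd": ["liblzma"],
    "liblzma": [],
    "example-archiver": ["liblzma"],   # hypothetical package
    "example-daemon": ["libsystemd"],  # hypothetical package
    "example-unrelated": [],           # hypothetical package
}


def dependents_of(target: str, depends_on: dict) -> set:
    """Return every package that directly or transitively depends on target."""
    # Invert the graph so we can walk from a library up to its users.
    reverse = {}
    for package, deps in depends_on.items():
        for dep in deps:
            reverse.setdefault(dep, set()).add(package)

    seen = set()
    queue = deque([target])
    while queue:
        current = queue.popleft()
        for parent in reverse.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


print(sorted(dependents_of("liblzma", DEPENDS_ON)))
```

Being reachable in a graph like this only means a package loads the library. As noted above, the backdoor's attack path was specific to OpenSSH, so reachability overstates actual exposure.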
This story brings up further questions as well. Why did the attacker choose to insert a backdoor rather than just slip in a vulnerability? We learned a few years ago that doing so is possible even for the venerable Linux kernel. Once discovered, though, a vulnerability can be exploited by anyone, so this attacker was maybe getting greedy and was willing to trade potentially easier discovery and no plausible deniability for exclusive access to OpenSSH servers. We’d note that exclusive access was only possible because the attacker ensured the backdoor could only be opened with a private key they held5, and one that has yet to be cracked. Another addition to the list of reasons this was such a fascinating story.
Then again, maybe this is happening all the time, and this is the first we’ve noticed. Many open source communities are filled with pseudonymous folks contributing in free form with varying degrees of governance. This is the hallmark of an interesting incident, though: it gives rise to wide-ranging and abstract research questions. We hope to continue to visit those in the future.
1Read “thought leadership”
2The main purpose of this post is to estimate this exactly, but we need to lay some narrative groundwork
3Not to toot our own horn, but this is more than Shodan currently displays (28.6M). That is likely because we are looking at the whole first quarter of 2024 rather than Shodan’s last 30 days.
4One of Randall Munroe’s classics
5Some backdoors inserted in software can also be exploited by anyone who finds them. This one, not so much. The attacker held a private key that allowed them to send any backdoored SSH server an SSH certificate with an embedded malicious payload, which the server would then execute. Because of the robust cryptography used, it’s likely the key itself will never be recovered.