by John Speed Meyers

What software supply chain security really means

Sep 5, 2024

In the beginning, we identified two major types of software supply chain attacks and nine minor types. The world keeps insisting on a broader definition.


In the spring of 2020, it really mattered to me what the definition of “software supply chain security” ought to be. I was working at In-Q-Tel, a strategic investor for the US intelligence community, and co-authoring a research paper that attempted to measure the frequency of software supply chain attacks. We picked a definition that emphasized instances of malicious software being widely distributed through an existing distribution channel. Almost as soon as the paper was published, though, the definition broke down, repeatedly.

Now, writing in the summer of 2024, several major software supply chain incidents later, I’ve come to accept that the only workable definition, which I’ll discuss below, is broad. It is so broad that a careful observer might even accuse the definition of encompassing all of software security.

In short, this piece is a reflection on my wrestling with the definition of software supply chain security: pinning down a definition, confronting mounting evidence of the definition’s flaws, and accepting a broader definition.

The original definition

While drafting a research paper later published as “Counting Broken Links,” my co-authors and I methodically combed through old news reports, whitepapers, GitHub issues, and other sources to find all known software supply chain attacks. One day, the lead author, Dan Geer, who is something like the Gandalf of quantitative computer security research, challenged us to provide a definition of a “software supply chain attack,” the very thing we intended to count. We settled on this definition: software supply chain attacks “intentionally insert malicious functionality into build, source, or publishing infrastructure or into software components with the goal of propagating that malicious functionality through existing distribution methods.” In short, it was all about the distribution of malicious code through existing channels.

The article proposed two major types of software supply chain attacks and nine minor types. Based on the historical attack data we collected, we argued that the two major types were attacks on “build, source, and publishing infrastructure” and attacks on “software registries.” For instance, back-doored compilers, popularized in the O.G. article on software supply chain security by Ken Thompson, fall in the first category; the legions of malicious open source software packages that have been discovered over the past 10 years fall in the second. The table below provides the categories and data in the original report.

Count of Reports:Attacks by major and minor categories

Major type                                     Minor type                              Reports:Attacks
Build, source, and publishing infrastructure   Build system compromise                 11:13
Build, source, and publishing infrastructure   Firmware implant                        7:32
Build, source, and publishing infrastructure   Source code system compromise           9:39
Build, source, and publishing infrastructure   Publishing: Certificate attack          6:18
Build, source, and publishing infrastructure   Publishing: Delivery system compromise  29:35
Software registry                              Account takeover                        11:14
Software registry                              Dependency compromise                   12:333
Software registry                              Malicious package                       51:1,373
Software registry                              Typosquatting                           15:1,247

Table from Geer, Tozer, and Meyers, “Counting Broken Links,” USENIX ;login:, December 2020. Note: Each count gives the number of “reports” and the number of “attacks,” separated by a colon. A “report” is a public disclosure of one or more software supply chain attacks, e.g., a blog post from a security researcher. An “attack,” or incident, is a single instance of an attack reaching a target, e.g., the download of a compromised application from a download server.
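To make the proportions in the table concrete, here is a minimal Python sketch that sums reports and attacks for each major type. The counts are copied from the table. The assignment of minor types to major types (the first five to infrastructure, the last four to software registry) is an assumption based on the order of the table's header, and the names `COUNTS` and `totals_by_major_type` are illustrative only.

```python
# Counts copied from the table: (reports, attacks) per minor type.
# Assumption: the first five minor types belong to the infrastructure
# category and the last four to the software registry category.
COUNTS = {
    "Build, source, and publishing infrastructure": {
        "Build system compromise": (11, 13),
        "Firmware implant": (7, 32),
        "Source code system compromise": (9, 39),
        "Publishing: Certificate attack": (6, 18),
        "Publishing: Delivery system compromise": (29, 35),
    },
    "Software registry": {
        "Account takeover": (11, 14),
        "Dependency compromise": (12, 333),
        "Malicious package": (51, 1373),
        "Typosquatting": (15, 1247),
    },
}


def totals_by_major_type(counts):
    """Return {major_type: (total_reports, total_attacks)}."""
    totals = {}
    for major, minors in counts.items():
        reports = sum(r for r, _ in minors.values())
        attacks = sum(a for _, a in minors.values())
        totals[major] = (reports, attacks)
    return totals


if __name__ == "__main__":
    for major, (reports, attacks) in totals_by_major_type(COUNTS).items():
        print(f"{major}: {reports} reports, {attacks} attacks")
```

Under that grouping, software registry incidents account for 2,967 of the 3,104 attacks counted, while making up 89 of the 151 reports.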

The “Counting Broken Links” article with this table was published the same week in late 2020 that SolarWinds, the mother of all software supply chain attacks, came to light. During this compromise, Russian intelligence operatives corrupted the build process of SolarWinds, a major network software company, and implanted malicious code that traveled via SolarWinds’ own software updates to its customers. Our definition was consistent with this attack. We presented at the NSA’s Science of Security conference and our GitHub repository with the underlying data started gaining traction. But then everything started to fall apart.

The definition breaks down

A mere three months later, a new type of attack materialized that didn’t fit within the existing typology. The term “dependency confusion” was coined when security researcher Alex Birsan self-published a Medium article subtitled “How I Hacked into Apple, Microsoft, and Dozens of Other Companies.” What was clever about this attack is how it took advantage of the non-intuitive behavior of package managers, allowing an attacker to trick developers into downloading malicious code from an external package registry rather than, as planned, an internal package registry. While similar to typosquatting, which was already a minor category in our typology, this attack didn’t actually involve a typo. Our original definition of software supply chain security had already been stretched. We added another minor category and moved on.
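The mechanics of dependency confusion can be sketched with a toy resolver. This is a simplified model, not the behavior of any particular package manager: it assumes a resolver that consults both an internal and a public registry and simply takes the highest version it finds, regardless of origin. The package name `acme-billing`, the registry contents, and the functions `naive_resolve` and `pinned_resolve` are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    version: tuple[int, ...]
    registry: str  # which registry served this candidate


def parse_version(text: str) -> tuple[int, ...]:
    """Turn '1.4.0' into (1, 4, 0) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def naive_resolve(name: str, registries: dict[str, dict[str, list[str]]]) -> Candidate:
    """Pick the highest version of `name` across all registries, ignoring origin."""
    candidates = [
        Candidate(name, parse_version(version), registry_name)
        for registry_name, packages in registries.items()
        for version in packages.get(name, [])
    ]
    if not candidates:
        raise LookupError(f"no registry provides {name!r}")
    return max(candidates, key=lambda c: c.version)


def pinned_resolve(name: str, registries: dict[str, dict[str, list[str]]],
                   allowed_registry: str) -> Candidate:
    """Resolve `name` from one explicitly named registry only."""
    return naive_resolve(name, {allowed_registry: registries[allowed_registry]})


# Hypothetical setup: a private package lives on the internal registry, and
# an attacker has published the same name publicly with an inflated version.
REGISTRIES = {
    "internal": {"acme-billing": ["1.4.0"]},
    "public": {"acme-billing": ["99.0.0"]},
}

if __name__ == "__main__":
    print(naive_resolve("acme-billing", REGISTRIES).registry)
    print(pinned_resolve("acme-billing", REGISTRIES, "internal").registry)
```

In this model no name is misspelled; the attacker's copy wins only because the resolver treats both registries as equally trustworthy. Restricting a private name to a single registry, as `pinned_resolve` does, removes that ambiguity.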

Then in December 2021, Log4Shell happened and the “internet was on fire.” Now our typology suffered a mortal wound. The earlier typology focused exclusively on the insertion of malicious code, but the Log4Shell vulnerability didn’t involve malicious code. Nevertheless, Log4Shell clearly represented a widespread vulnerability in the software supply chain. It was an easily exploited and severe vulnerability, introduced by a flaw in a widely popular open source Java logging library. The episode revealed a crucial flaw in our existing definition of software supply chain security: unintentional security flaws in widely used open source software had no place in it. That original typology, for the purposes of my career, was dead only 18 months after invention.

Accepting a broader definition

Upon reflection, the “supply chain” aspect of software supply chain security suggests the crucial ingredient of an improved definition. Software producers, like manufacturers, have a supply chain. And software producers, like manufacturers, require inputs and then perform a manufacturing process to build a finished product. In other words, a software producer uses components, developed by third parties and themselves, and technologies to write, build, and distribute software. A vulnerability in or compromise of this chain, whether it arises from malicious code or from an exploitable unintentional flaw, is what software supply chain security is about. I should mention that a similar, rival data set maintained by the Atlantic Council uses this broader definition. (Full disclosure: I’m now a non-resident fellow at the Atlantic Council. If you can’t beat ’em, join ’em.)
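The difference between the two definitions can be sketched as a pair of predicates. This is an illustration under simplifying assumptions, not a formal taxonomy: the `Incident` fields and the functions `fits_original` and `fits_broader` are hypothetical, and the flags for the two incidents are a reading of how the text above describes them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Incident:
    name: str
    # Touches components or the tools used to write, build, and distribute software.
    in_supply_chain: bool
    # An attacker intentionally inserted malicious functionality.
    malicious_insertion: bool
    # The affected code reached users through existing distribution methods.
    spread_via_existing_channel: bool


def fits_original(incident: Incident) -> bool:
    """Original definition: intentional malicious code, propagated through existing channels."""
    return (
        incident.in_supply_chain
        and incident.malicious_insertion
        and incident.spread_via_existing_channel
    )


def fits_broader(incident: Incident) -> bool:
    """Broader definition: any vulnerability in or compromise of the chain."""
    return incident.in_supply_chain


SOLARWINDS = Incident("SolarWinds", True, True, True)
LOG4SHELL = Incident("Log4Shell", True, False, True)

if __name__ == "__main__":
    for incident in (SOLARWINDS, LOG4SHELL):
        print(incident.name, fits_original(incident), fits_broader(incident))
```

Dropping the malicious-insertion requirement is what lets the broader definition cover an unintentional flaw such as Log4Shell. It is also why so much of ordinary software security ends up qualifying.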

I admit to still having one general reservation about this definition: It can feel like software supply chain security subsumes all of software security, especially the sub-discipline often called application security. When a developer writes a buffer overflow in the open source software library your application depends upon, is that application security? Yep! Is that also software supply chain security? Yep! Perhaps the subsuming of software security by software supply chain security is inevitable in an era in which software development depends so heavily on open source software. Research I co-authored suggests that the typical claim that 80 to 90 percent of modern software applications are actually open source code is, in fact, a conservative estimate. Our measurements indicated that some smaller software applications are more than 99 percent open source software.

In short, I’ve come to accept that new attack types will continue to occur, forcing the creation of new “minor” categories, and that the broader definition too will likely need to evolve. In other words, writing in the summer of 2024, it now matters to me a lot less what the definition of “software supply chain security” ought to be.

John Speed Meyers is the head of Chainguard Labs at Chainguard. He is also a non-resident senior fellow at the Atlantic Council.


New Tech Forum provides a venue for technology leaders, including vendors and other outside contributors, to explore and discuss emerging enterprise technology in unprecedented depth and breadth. The selection is subjective, based on our pick of the technologies we believe to be important and of greatest interest to InfoWorld readers. InfoWorld does not accept marketing collateral for publication and reserves the right to edit all contributed content. Send all inquiries to doug_dineley@foundryco.com.