The Hidden Risks of Open Source: How Bitergia is Changing the Game


This blog post is based on a panel discussion at the 2024 SOSS Community Day Europe (a 19:38-minute video) and introduces the new Bitergia Risk Radar. This tool uses project health metrics to help large companies address the growing risk in open source software supply chains. Continue reading to learn what motivated one customer to uncover their risk and how Bitergia worked with them to make that possible.

The Log4j Incident: A Wake-Up Call

The development of Bitergia Risk Radar began with a need in the open source security space. Specifically, it began when the developers at ING Bank realized that the software supply chains that their organization and others relied on may not be as secure as they had thought. 

ING is a company with structured processes set up around software composition analysis (SCA) scanning and compliance risk models. But ING, like everyone else, did not see the Log4j vulnerability coming. This vulnerability impacted millions across the Internet who used the popular open source logging library, leaving all of those users exposed to malware attacks and data theft. Fortunately, the community was able to quickly identify and fix the problem, and the ING engineers were then able to address it in their own software. But after that scramble, there was a creeping feeling of "what if?" What if there are more vulnerabilities waiting to happen?

Wietse Braam, a Senior Manager and Area Lead responsible for the team that develops the global CI/CD solution for ING, and the lead behind the collaboration between ING and Bitergia, described his reaction to the event: “We got the brewing feeling that we only saw the fluffy tail of the beast hidden underneath.” He wondered, “What if this was a library that was used everywhere in our company and there was no community, or the community wasn’t able to fix this one?”

So Braam and others at ING started discussions with their architects and some industry leaders. They wanted to know: Was it possible to measure this risk? Was the risk really there? “Luckily,” he said, “we stumbled upon the guys from Bitergia to help do some investigation into this risk that we foresaw.”

The goal of their work together was to develop a risk model that analyzes the sustainability of projects based on community dynamics and development processes, one capable of helping large companies predict potential vulnerabilities in the software supply chain before they become a costly problem.

The Growing Need for Open Source Supply Chain Security

ING is not alone in its quest for a more secure software ecosystem. As organizations increasingly rely on open source components, the risks associated with supply chain vulnerabilities have become more pronounced. In fact, Gartner has predicted that “by 2025, 45% of organizations worldwide will have experienced attacks on their software supply chains, a three-fold increase from 2021.” 

Ana Jimenez Santamaria of the TODO Group works with OSPO managers and has observed the increased concern first hand. She describes a trend of managers wanting to do more to strengthen software supply chain security.

Ana identifies two main challenges for managers seeking supply chain security. The first challenge has to do with tooling: It is “finding or developing tooling that is capable of detecting vulnerabilities in terms of open source project health.” Many security models exist that analyze the software code or that assess license compliance. There are also some that look at good development practices. All of these play an important role in the security ecosystem, identifying vulnerabilities and preventing security problems.

However, before Bitergia began developing the Bitergia Risk Radar, there was no tool that looked at risk from a development and processes perspective. There was no tool that evaluated sustainable development practices, and that answered questions like:

  • How efficiently are the libraries being maintained?

  • How many developers are maintaining them? 

  • Are new developers joining?

  • Where are vulnerabilities most likely to happen?
 

With inspiration and collaboration from ING, Bitergia is filling this gap in tooling to help companies get a fuller picture of their dependency risk. Companies can go beyond scanning code for current vulnerabilities and start predicting where future vulnerabilities are most likely to happen.

The metrics and analysis that Bitergia Risk Radar provides can also help open source managers address the second main challenge that Ana discussed: setting up effective processes around supply chain security. She explained that managers want to know “which projects have vulnerabilities, which projects need help, and how they should be providing help in terms of infrastructure, money, and so on.”

Without data, it’s hard to set up effective processes. The data and analysis about developer activity at the core of Bitergia Risk Radar can help to predict vulnerabilities before they happen. That insight can then help organizations direct their efforts and resources in ways that uphold supply chain security.

Bitergia's Approach: A Project Health Risk Assessment

Bitergia’s approach to risk assessment is rooted in the understanding that open source projects are communities. By analyzing the health and activity of these communities, Bitergia can identify potential risks before they manifest as vulnerabilities. 

Bitergia Risk Radar focuses on seven key metrics to evaluate the maintenance and sustainability of dependencies. For some companies and projects, other metrics are more relevant, and so Bitergia uses those instead. In every case, the metrics fall into three categories:

  1. Community sustainability indicators, such as the growth of newcomers,
  2. Process-oriented metrics and good practices, which include average lead times and review efficiency,
  3. Maintenance metrics, such as the backlog management index.
 

Miguel Angel Fernandez is one of the data scientists behind the development of the risk model. He explains the approach: “The idea is to take the concept of ‘code smells,’ indicators that point to buggy parts of code, and apply it on a community activity level.” So “community smells” are metrics that point to parts of the community that may not be working well or that could be improved. “By assessing these ‘community smells,’” Miguel Angel concludes, “we can identify projects that could have issues in the future.”

Bitergia runs the model against the software development repositories of the dependencies used by a project or set of projects. The end result is a risk score for each dependency and an accessible breakdown of the data. The model aggregates the results of all of these metrics into a single score from 1 to 10 for each dependency, with 10 being the most risky.

Users can then drill down to see the breakdown of that score. They can identify any specific weakness in the community activity, whether it’s insufficient maintenance, low efficiency, or something else.
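To make the idea of a single score with a drill-down more concrete, here is a minimal sketch in Python. It is not Bitergia's model: the post does not say how the metrics are combined, so the unweighted averaging, the function name `risk_breakdown`, the metric keys, and the example ratings are all assumptions made purely for illustration.

```python
from statistics import mean


def risk_breakdown(metrics_by_category):
    """Return (overall_score, per_category_scores).

    Assumption: every metric is already rated from 1 to 10 (10 = most risky)
    and categories are combined with a plain unweighted mean. The real
    aggregation used by Bitergia Risk Radar is not described in the post.
    """
    per_category = {
        category: mean(ratings.values())
        for category, ratings in metrics_by_category.items()
    }
    overall = round(mean(per_category.values()))
    return overall, per_category


# Hypothetical ratings for one dependency, grouped by the three
# categories named in the text. The numbers are invented.
example_dependency = {
    "community": {"newcomer_growth": 9},
    "process": {"average_lead_time": 4, "review_efficiency": 2},
    "maintenance": {"backlog_management_index": 6},
}

overall, per_category = risk_breakdown(example_dependency)
weakest = max(per_category, key=per_category.get)

print(overall)       # 6
print(per_category)  # {'community': 9, 'process': 3, 'maintenance': 6}
print(weakest)       # community
```

In this toy example the overall score looks moderate, but the breakdown shows that the community category is where the risk is concentrated, which is the kind of weakness the drill-down is meant to surface.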

From there, managers can take action. The information this tool can provide is invaluable to setting up processes, allocating resources, and setting hearts at ease that the dependencies underlying their projects are sustainable.

Bitergia's Proof of Concept: A Revealing Experiment

Bitergia set to work on a proof of concept for applying Bitergia Risk Radar. Before applying the model to ING’s dependencies, Bitergia ran the model against a subset of the Kubernetes dependencies, looking specifically at those dependencies’ last 12 months of activity (from June 2023 to June 2024). As Kubernetes is a large and highly trusted development project, Bitergia was able to use it to fine-tune the model and to get a large sample of results.

These results were illuminating. Of the 347 dependencies analyzed, 179 (about 52%) received a total risk score of low to medium. However, 37 dependencies received a high risk score, and 131 were rated very high risk.
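As a quick check of the proportions, the counts reported above can be converted into shares of the total. The band labels come from the text; the variable names are illustrative only.

```python
# Counts per risk band as reported for the Kubernetes proof of concept.
counts = {"low to medium": 179, "high": 37, "very high": 131}

total = sum(counts.values())
shares = {band: round(100 * n / total, 1) for band, n in counts.items()}

print(total)   # 347
print(shares)  # {'low to medium': 51.6, 'high': 10.7, 'very high': 37.8}
```

Taken together, nearly half of the analyzed dependencies landed in the high or very high bands.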

According to the model, a very high risk score is the result of a dependency having very little community activity or none at all. This is an important ‘community smell,’ because zero maintenance is inherently risky. It leaves a dependency open to future vulnerabilities. Miguel Angel was surprised by this result: “I expected some [unmaintained dependencies] in such a large project as Kubernetes…but not so many.” 

Further, even in the dependencies with a low total risk score, Miguel Angel found that the individual metrics with a higher risk were related to the number of people maintaining the code. The “Pony Factor,” for example, is a metric that identifies the number of people contributing 50% of a project’s code. According to this metric, five or more contributors is low risk, two to five is medium risk, and only one or two is high risk. Ideally, a project has more than five people contributing half of the code. But this is usually not the case.

Twenty-eight of the dependencies with a low total risk score still had a high risk in the Pony Factor. If these one or two maintainers leave the project, and others do not replace them, they take all of the knowledge of the project with them. The dependency may then be left open to future vulnerabilities.
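The Pony Factor described above can be sketched in a few lines of Python. This is an illustration rather than Bitergia's implementation: it assumes contribution is measured by commit count (the post only says "code"), and because the thresholds quoted in the text overlap at two and five, the mapping in `pony_risk` is just one possible reading. Both function names are hypothetical.

```python
from collections import Counter


def pony_factor(commit_authors, share=0.5):
    """Smallest number of contributors who together authored at least
    `share` of the commits. Returns 0 when there are no commits."""
    counts = Counter(commit_authors)
    total = sum(counts.values())
    if total == 0:
        return 0
    covered = 0
    # Walk contributors from most to least active until the share is reached.
    for rank, (_, n) in enumerate(counts.most_common(), start=1):
        covered += n
        if covered >= share * total:
            return rank
    return len(counts)


def pony_risk(factor):
    """Assumed reading of the thresholds in the text:
    five or more is low, three or four is medium, two or fewer is high.
    A factor of 0 (no activity at all) also falls into 'high' here."""
    if factor >= 5:
        return "low"
    if factor >= 3:
        return "medium"
    return "high"


# Invented commit log: one author per commit.
authors = ["ana"] * 6 + ["ben"] * 3 + ["cy"]
factor = pony_factor(authors)
print(factor, pony_risk(factor))  # 1 high
```

Here a single contributor accounts for more than half of the commits, so the project would be flagged as high risk on this metric even if its other metrics looked healthy.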

These are some of the most interesting results from this experiment with Kubernetes, although this process yielded a great deal more information that Bitergia has used to fine-tune and focus their model and analysis.

Bitergia went on to help ING run the model against nine of their own repositories and discovered a concerning number of dependencies that had not been supported for years. Fortunately, these components were side projects, not integral to the business. But the exercise made the importance of proactive risk assessment very clear to ING as a way to predict and prevent future vulnerabilities.

ING is continuing to work with Bitergia to scale up and run the model against many more of its roughly 80,000 repositories.

The proof of concept that Bitergia ran with Kubernetes and their work with ING give a fascinating look at what Bitergia Risk Radar can uncover for companies.

The Broader Implications: Filling a Need in Open Source Security

As the complexity of software supply chains continues to grow, the need for effective security and risk management solutions will only become more critical. Bitergia’s work with ING and Kubernetes demonstrates the power of a community health approach to addressing this challenge.

Bitergia Risk Radar is a valuable tool for those who are seeking to improve their company’s open source supply chain security. By identifying potential risks early, companies can take proactive steps to mitigate them, reducing the likelihood of costly security breaches. 

With knowledge and insights, they can take action. And they can ease their minds about the risk lurking in their supply chain, or, as Wietse Braam at ING called it, the “beast hidden underneath.”

 

This blog post is based on the 2024 SOSS Community Day Panel Discussion: “OSS Dependency Health: Towards Maturity and Sustainability Risk Assessment Model.”

 


Julia Lawson

Technical Writer at Bitergia
