Open-source software development has become an increasingly popular practice. Today's software systems comprise first-party code and third-party dependencies built through a complex supply chain process involving different individuals, organizations, and tools. An attacker can compromise any step in the process by deliberately incorporating vulnerabilities into the code to be triggered at a later stage of the software life cycle. The recent impactful attacks on SolarWinds or Log4j vulnerability are examples of many such rapidly-increasing attacks. In this project, we will lay the foundations of providing provably-secure open-source software - and to prove that it is secure.
Information-flow control is a well-known mechanism to reason about confidentiality and integrity. A security property states that there is no illegal information flow, e.g., no secret data is leaked to public channels or no tainted data is ever passed to sensitive sinks. We introduce the concept of security summary, which states when it is secure to use an artifact (i.e., there is no illegal flow) and what are the effects of using the artifact on the security-related behaviour. Security summaries are a conceptually simple form of assume-guarantee reasoning with two key ingredients:
(1) a guard, which lists conditions under which using the software is secure, and
(2) an effect, which expresses the (security) consequences of using it.
While the concept is simple, implementing it is not: the smaller problem is that the software we want to reason about may contain thousands of lines of code, while the larger problem is that it will rely on the use of libraries that have thousands of concepts with millions of lines of code and intricate interplay. The question is more "where to start?" than "how to proceed?", unless we are prepared to be constrained to meaningless toy problems. We will address this question by exploiting the compositional character of summary-based reasoning. Security summaries of methods calls are key to establishing the security summaries of methods that rely on them. In this way, we can reduce the problem of reasoning about the security of a large application into the smaller problems of reasoning about the security of individual small methods and compose their results to establish the security of a large application.
It is quite possible to make security assumptions and then trust them. While this makes software reliable only relative to such assumptions, it allows for successively replacing assumptions with certificates (i.e., correct security summaries), or uncertified methods by certified ones. Once such a process is in full swing, certified libraries will become valuable assets for open-source software development, which will bring them into existence purely by the competitive advantage they provide over uncertified ones.
The methods we develop will allow for automatically producing correct security summaries and transparently releasing them, so that the code consumer will be able to check and validate the security of a code before reusing it, and also detect any misbehaviour along the supply chain.
Security summaries also hold many research challenges. For example, methods may come with a certain degree of nondeterminism, and it is not necessary that all resolutions of this nondeterminism satisfy the desired security guarantees - but we need to find one that does. Similarly, while the pathway from security summaries from called methods to the overall desired property is clear, the reverse way (from our overall goals to requirements on the methods called) provides leeway. We will deliver sharp requirements, which will make it easier to update or replace the method called, because the requirements its replacement has to fulfill are relaxed.
Tackling these problems allows us to combine interesting theoretical challenges with practical relevance, that will help produce tomorrow's secure systems.
|