Inferring causal relationships has long been one of the main objectives in many fields of study. Present-day examples include, but are not limited to, inferring potential causes of cancer, the effects of gene manipulation, and the causes of immigration and its effects on the economy.
Specific statistical models, known generically as causal models, have been used to infer such causal relationships from observed data. Extensive research has been conducted on defining, interpreting, and applying causal models, and these models have recently become mainstream in statistics and computer science. Today, a very popular approach to inferring causal relationships is based on graphical causal models. These apply graphical (Markov) models, which are statistical models over graphs whose nodes are random variables representing the quantities of interest. Edges indicate probabilistic dependence among these variables conditional on some other variables in the graph; under some additional assumptions, these dependencies can be interpreted as causal relationships. Graphical models have been used extensively in statistics, probability theory, and machine learning, and are applied in a wide range of areas from genetics to economics.
In parallel, the surge of online and other social networks, as well as other types of data represented as networks, has led to the introduction of a wide range of statistical (random) network models. However, inferring causal relationships among the edges of a network, or between the nodal attributes and the edges of a network, is an important yet less studied task. Despite some research using non-graphical causal models, and some attempts at approaches based on graphical models, a general causal framework for network models is lacking in the literature. The main objective of this project is to apply the theory of graphical causal models to random networks by exploiting a natural connection between graphical models and network models.
Both graphical and network models use graphs. However, in contrast to graphical models, in network models the nodes of the graph are fixed individuals and the edges are random. Although these two types of models have been introduced and studied in completely different branches of statistics, there is a natural connection between them. This connection allows us, in principle, to develop every theory pertaining to graphical models for network models by specializing the theory of graphical models to the specific distributions of random network models as well as the specific type of graphs used in graphical models. This is a novel approach, which will be the basis for performing graphical causal inference on networks. It is especially timely because of the surge in different types of network data that are particularly well suited to causal modeling.
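To make the network-model setting concrete, the following is a minimal sketch of a random network in which the nodes are fixed and each possible edge is a random variable. The independent Bernoulli edge distribution, the function name `sample_random_network`, and the parameter `edge_prob` are illustrative assumptions; the text does not commit to any particular network model.

```python
import itertools
import random


def sample_random_network(nodes, edge_prob, rng):
    """Sample one undirected network over a fixed node set.

    Assumption for illustration only: every possible edge (dyad) is an
    independent Bernoulli(edge_prob) variable. Real network models usually
    allow dependence among edges.
    """
    # The node set is fixed, so the set of dyads is fixed as well.
    dyads = list(itertools.combinations(nodes, 2))
    # Only the edge indicators are random: 1 means the edge is present.
    return {dyad: int(rng.random() < edge_prob) for dyad in dyads}


nodes = ["a", "b", "c", "d"]          # fixed individuals
rng = random.Random(0)                # seeded for reproducibility
network = sample_random_network(nodes, 0.5, rng)

for dyad, present in network.items():
    print(dyad, present)
```

In this sketch the random quantities are the edge indicators, one per dyad, which is what distinguishes the setting from a graphical model, where the random variables sit on the nodes.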
