The OSCAR project is a collaborative effort to shape a new computer algebra system, written in Julia. OSCAR is built on top of the four "cornerstone systems" ANTIC (for number theory), GAP (for group and representation theory), polymake (for polyhedral and tropical geometry) and Singular (for commutative algebra and algebraic geometry). We present examples to showcase the current version 0.14.0. This is joint work with The OSCAR Development Team, currently led by Wolfram Decker, Claus Fieker, Max Horn and Michael Joswig.
Interested participants can also install OSCAR before the workshop. More information about the installation can be found at https://www.oscar-system.org/install/
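For reference, a minimal installation sketch from the Julia REPL (see the link above for platform-specific notes):

    using Pkg
    Pkg.add("Oscar")   # fetch OSCAR and its dependencies
    using Oscar        # load the system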
After a demo of the OSCAR system, we introduce the mrdi file format and discuss the advantages of using serialization for collaborative work and scientific research. We demonstrate how users can benefit from OSCAR's built-in serialization mechanism, which employs that file format. Key applications include the reproduction of mathematical results computed with OSCAR and the interoperability between OSCAR and other software applications.
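As a small illustration of the serialization workflow, the following sketch assumes the save/load interface of recent OSCAR releases; the object and file name are only examples:

    using Oscar
    K, a = quadratic_field(-1)   # an example OSCAR object
    save("field.mrdi", K)        # write K to disk in the mrdi file format
    K2 = load("field.mrdi")      # reconstruct it, possibly in another session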
Graphs are often used as representations of conditional independence structures of random vectors. In stochastic processes, one may use graphs to represent so-called local independence. Local independence is an asymmetric notion of independence which describes how a system of stochastic processes (e.g., point processes or diffusions) evolves over time. Let A, B, and C be three subsets of the coordinate processes of the stochastic system. Intuitively speaking, B is locally independent of A given C if, at every point in time, knowing the past of both A and C is no more informative about the present of B than knowing the past of C alone. Directed graphs can be used to describe the local independence structure of the stochastic processes using a separation criterion which is analogous to d-separation. In such a local independence graph, each node represents an entire coordinate process rather than a single random variable.
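To fix ideas, here is one standard way to make this precise for a multivariate counting process with intensities (the notation is ours, for illustration): B is locally independent of A given C if, for all times $t$ and all coordinates $\beta \in B$,
\[
\mathbb{E}\left[\lambda^{\beta}_{t} \mid \mathcal{F}^{A \cup C}_{t}\right] = \mathbb{E}\left[\lambda^{\beta}_{t} \mid \mathcal{F}^{C}_{t}\right],
\]
where $\lambda^{\beta}$ is the intensity of coordinate process $\beta$ and $\mathcal{F}^{D}_{t}$ denotes the history generated by the coordinate processes in $D$ up to time $t$.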
In this talk, we will describe various properties of graphical models of local independence and then turn our attention to the case where the system is only partially observed, i.e., some coordinate processes are unobserved. In this case, one can use so-called directed mixed graphs to describe the local independence structure of the observed coordinate processes. Several directed mixed graphs may describe the same local independence model, and therefore it is of interest to characterize such equivalence classes of directed mixed graphs. It turns out that directed mixed graphs satisfy a certain maximality property which allows one to construct a simple graphical representation of an entire Markov equivalence class of marginalized local independence graphs. This is convenient as the equivalence class can be learned from data and its graphical representation concisely describes what underlying structure could have generated the observed local independencies.
Deciding Markov equivalence of two directed mixed graphs is computationally hard, and we introduce a class of equivalence relations that are weaker than Markov equivalence, i.e., lead to larger equivalence classes. The weak equivalence classes enjoy many of the same properties as the Markov equivalence classes, and they provide a computationally feasible framework while retaining a clear interpretation. We discuss how this can be used for graphical modeling and causal structure learning based on local independence.
We study the optimal sample complexity of structure learning in Gaussian structural equation models. In the first part of the talk, we compare the complexity of structure learning via the PC algorithm and distribution learning via the Chow-Liu algorithm in directed polytrees. We will show how both algorithms are optimal under different assumptions and lead to different statistical complexities. Moving beyond polytrees, we then investigate the problem of neighbourhood selection, which is an important primitive when learning the overall structure of a graphical model. We will introduce a new estimator, called klBSS, and compare its performance to best subset selection (BSS). We show by example that, even when the structure is unknown, the existence of underlying structure can reduce the sample complexity of neighbourhood selection compared to classical methods such as BSS and the Lasso.
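For orientation, recall the best subset selection baseline for the neighbourhood of a response $Y$ with design $X$ (illustrative notation; klBSS, the new estimator, is discussed in the talk):
\[
\hat{S}_{\mathrm{BSS}} \in \operatorname*{arg\,min}_{|S| \le k} \; \min_{\beta} \; \frac{1}{n} \sum_{i=1}^{n} \left( Y_i - X_{i,S}^{\top} \beta \right)^{2},
\]
where the minimum runs over all candidate neighbourhoods $S$ of size at most $k$.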
Gaussian processes (GPs) are widely used to model spatial dependency in geostatistical data, yet exact Bayesian inference has an intractable time complexity of $O(n^3)$. The Vecchia approximation has become a popular solution to this computational issue, where spatial dependency is characterized by a sparse directed acyclic graph (DAG) that allows scalable Bayesian inference. Despite its popularity in practice, little is understood about its theoretical properties. In this paper, we systematically study the posterior contraction rates of Vecchia approximations of GPs. Under minimal regularity conditions, we prove that, by appropriate selection of the underlying DAG, the Vecchia-approximated GP possesses the same posterior contraction rates as the mother GP. Therefore, by optimal choices of the tuning hyper-parameters, the Vecchia approximation can achieve the minimax contraction rate, providing strong frequentist guarantees for the procedure. Our theoretical findings are also demonstrated numerically using synthetic and real-world data sets.
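Concretely, with $\mathrm{pa}(i)$ denoting the parent set of site $i$ in the chosen DAG, the Vecchia approximation replaces the joint Gaussian density by the DAG-indexed product of conditionals
\[
p(y_1, \ldots, y_n) \;\approx\; \prod_{i=1}^{n} p\left(y_i \mid y_{\mathrm{pa}(i)}\right),
\]
so that each factor involves only a small conditioning set; with $|\mathrm{pa}(i)| \le m$, the cost per factor is $O(m^3)$ rather than $O(n^3)$.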
Colored Gaussian DAG models generalize linear structural equation models by allowing additional equalities to be specified among the error variances and regression coefficients. We show that these models are smooth manifolds and give a characterization of their vanishing ideals up to a saturation. We also initiate the study of faithfulness and structural identifiability. Our results are facilitated by an in-depth analysis of parameter identification maps for ordinary Gaussian DAG models and our techniques carry over easily to other classes of rationally identifiable statistical models. This is joint work with Kaie Kubjas, Pratik Misra and Liam Solus.
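For concreteness, the underlying linear structural equation model can be written (in illustrative notation) as
\[
X = \Lambda^{\top} X + \varepsilon, \qquad \varepsilon \sim \mathcal{N}(0, \Omega), \qquad \Sigma = (I - \Lambda)^{-\top} \Omega \, (I - \Lambda)^{-1},
\]
with $\Lambda$ supported on the edges of the DAG and $\Omega$ diagonal; a coloring partitions the edges and nodes into classes and requires the corresponding entries of $\Lambda$, respectively $\Omega$, to be equal within each class.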
When learning a causal model of a system, a key motivation is the use of that model for downstream decision-making. In this talk, I will take a decision-centric perspective on causal structure learning, focused on a simple setting that is amenable to careful statistical analysis. In particular, we study causal effect estimation via covariate adjustment, when the causal graph is unknown, all variables are discrete, and the non-descendants of treatment are given.

We propose an algorithm which searches for a data-dependent "approximate" adjustment set via conditional independence testing, and analyze the bias-variance tradeoff entailed by this procedure. We prove matching upper and lower bounds on omitted confounding bias in terms of small violations of conditional independence. Further, we provide a finite-sample bound on the complexity of correctly selecting an "approximate" adjustment set and of estimating the resulting adjustment functional, using results from the property testing literature.

We demonstrate our algorithm on synthetic and real-world data, outperforming methods which ignore structure learning or which perform structure learning separately from causal effect estimation. I conclude with some open questions at the intersection of structure learning and causal effect estimation.
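In its simplest form, the adjustment functional in question is the familiar back-door expression for a discrete treatment $T$, outcome $Y$, and adjustment set $Z$ (notation illustrative):
\[
\tau = \sum_{z} \left( \mathbb{E}[Y \mid T = 1, Z = z] - \mathbb{E}[Y \mid T = 0, Z = z] \right) \mathbb{P}(Z = z),
\]
and the algorithm's task is to select a $Z$ for which this identity holds at least approximately.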
In the context of linear regression, we construct a data-driven convex loss function with respect to which empirical risk minimisation yields optimal asymptotic variance in the downstream estimation of the regression coefficients. Our semiparametric approach targets the best decreasing approximation of the derivative of the log-density of the noise distribution. At the population level, this fitting process is a nonparametric extension of score matching, corresponding to a log-concave projection of the noise distribution with respect to the Fisher divergence. The procedure is computationally efficient, and we prove that it attains the minimal asymptotic covariance among all convex M-estimators. As an example of a non-log-concave setting, for Cauchy errors, the optimal convex loss function is Huber-like, and our procedure yields an asymptotic efficiency greater than 0.87 relative to the oracle maximum likelihood estimator of the regression coefficients that uses knowledge of this error distribution; in this sense, we obtain robustness without sacrificing much efficiency.
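Schematically, writing $f_0$ for the noise density and $\psi_0 = (\log f_0)'$ for its score, the population-level target is the best decreasing approximation
\[
\psi^{\star} \in \operatorname*{arg\,min}_{\psi \, \text{decreasing}} \int \left( \psi - \psi_0 \right)^{2} f_0,
\]
a projection with respect to the Fisher divergence; the convex loss is then an antiderivative of $-\psi^{\star}$ (our notation, for illustration).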
t.b.a.