Decision makers desire to implement decision rules that, when applied to individuals in the population of interest, yield the best possible outcomes. For example, the current focus on precision medicine reflects the search for individualized treatment decisions, adapted to a patient's characteristics. In this presentation, I will consider how to formulate, choose and estimate effects that guide individualized treatment decisions. In particular, I will introduce a class of regimes that are guaranteed to outperform conventional optimal regimes in settings with unmeasured confounding. I will further consider how to identify or bound these "superoptimal" regimes and their values. The performance of the superoptimal regimes will be illustrated in two examples from medicine and economics.
A fundamental challenge in causal induction is inferring the underlying graph structure from observational and interventional data. While traditional algorithms rely on candidate graph generation and score-based evaluation, my research takes a different approach. I developed neural causal models that leverage the flexibility and scalability of neural networks to infer causal relationships, enabling more expressive mechanisms and facilitating analysis of larger systems.
Building on this foundation, I further explored causal foundation models, inspired by the success of large language models. These models are trained on massive datasets of diverse causal graphs, learning to predict causal structures from both observational and interventional data. This "black box" approach achieves remarkable performance and scalability, significantly surpassing traditional methods. The resulting model exhibits robust generalization to new synthetic graphs, resilience under train-test distribution shifts, and state-of-the-art performance on naturalistic graphs with low sample complexity.
We then leverage LLMs and their metacognitive processes to causally organize skills. By labeling problems and clustering them into interpretable categories, LLMs learn to categorize skills; these skill categories act as causal variables that enable skill-based prompting and enhance mathematical reasoning.
This integrated perspective, combining causal induction in graph structures with emergent skills in LLMs, advances our understanding of how skills function as causal variables. It offers a structured pathway to unlock complex reasoning capabilities in AI, paving the way from simple word prediction to sophisticated causal reasoning in LLMs.
Applying causal discovery algorithms to real-world problems can be challenging, especially when unobserved confounders may be present. In theory, constraint-based approaches like FCI can handle this, but they implicitly rely on sparse networks to complete within a reasonable amount of time. In practice, however, many relevant systems are anything but sparse. For example, protein-protein interaction networks often contain high-density nodes (‘hub nodes’) representing key proteins that are central to many processes in the cell. Other systems, like electric power grids and social networks, can exhibit strong clustering tendencies, leading to so-called small-world networks. In such cases, even basic constraint-based algorithms like PC can get stuck in the high-density regions, failing to produce any useful output. Existing approaches to deal with this, like Anytime-FCI, were designed so that the search can be interrupted at any stage and still produce a sound causal model as soon as possible, but unfortunately they can require even more time to complete than simply letting FCI run by itself.
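To make concrete why dense regions are problematic, here is a minimal, illustrative sketch of the skeleton phase of the standard PC algorithm (not the new method presented in the talk), using Fisher-z partial-correlation tests under a linear-Gaussian assumption. At search level k, each edge incident to a node of degree d may require up to C(d−1, k) conditional-independence tests, which is exactly what blows up at hub nodes. Function names and the hardcoded critical value are illustrative choices.

```python
import numpy as np
from itertools import combinations

def partial_corr(C, i, j, S):
    """Partial correlation of variables i, j given index set S,
    read off from the inverse of the corresponding submatrix of
    the correlation matrix C."""
    idx = [i, j] + list(S)
    P = np.linalg.inv(C[np.ix_(idx, idx)])
    return -P[0, 1] / np.sqrt(P[0, 0] * P[1, 1])

def pc_skeleton(data, crit=1.959964):
    """Skeleton phase of the PC algorithm (illustrative sketch).

    data: (n, p) array of samples; crit: two-sided N(0,1) critical
    value (default corresponds to alpha = 0.05).  Returns the set of
    undirected edges as frozensets {i, j}.
    """
    n, p = data.shape
    C = np.corrcoef(data, rowvar=False)
    adj = {i: set(range(p)) - {i} for i in range(p)}
    level = 0
    # Grow the conditioning-set size until no node has enough neighbours.
    while any(len(adj[i]) > level for i in adj):
        for i in range(p):
            for j in list(adj[i]):
                others = adj[i] - {j}
                if len(others) < level:
                    continue
                # Up to C(deg(i)-1, level) tests for this one edge:
                # this is the combinatorial explosion at hub nodes.
                for S in combinations(sorted(others), level):
                    r = np.clip(partial_corr(C, i, j, S), -0.99999, 0.99999)
                    stat = np.sqrt(n - level - 3) * abs(np.arctanh(r))
                    if stat < crit:          # cannot reject independence
                        adj[i].discard(j)    # remove edge i -- j
                        adj[j].discard(i)
                        break
        level += 1
    return {frozenset((i, j)) for i in adj for j in adj[i]}
```

On a linear-Gaussian chain X → Y → Z, the marginal X–Z correlation is strong, so the spurious X–Z edge survives level 0 and is typically only removed at level 1 when conditioning on Y; in a dense graph, reaching the level that separates two hubs can require exponentially many such tests.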
In this talk I will present a new approach to causal discovery in graphs with high-density regions. Based on a radically new search strategy and modified orientation rules, it builds up the causal graph on the fly, updating the model after each validated edge removal while remaining sound throughout. It exploits the intermediate causal model to efficiently reduce the number and size of the conditional independence tests, and automatically prioritizes ‘low-hanging fruit’, leaving difficult (high-density) regions until last. The resulting ‘anytime-anywhere FCI+’ is a true anytime algorithm that is not only faster than its traditional counterparts but also more flexible: it can easily be adapted to handle arbitrary edge removals as well, opening up new possibilities for, e.g., targeted cause-effect recovery in large graphs.
The Euler characteristic of a very affine variety encodes the algebraic complexity of solving likelihood (or scattering) equations on this variety. I will discuss this quantity for the Grassmannian with d hyperplane sections removed. I will present a combinatorial formula, and explain how to compute this Euler characteristic in practice, both symbolically and numerically. Connections to (and between) statistics and particle physics will also be discussed. This is joint work with Elia Mazzucchelli and Kexin Wang. This talk is part of a special series of lectures on algebraic statistics.
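For background (a standard fact, not a claim of the abstract): the link between the Euler characteristic and the complexity of the likelihood equations is Huh's theorem, which states that for a smooth very affine variety X and generic exponents, the number of critical points of the log-likelihood (the ML degree) equals the signed Euler characteristic.

```latex
% Likelihood (scattering) equations as critical-point equations of the
% master function on a very affine variety X, for generic exponents s_i:
\[
  \mathcal{L}_s \;=\; \prod_i f_i^{\,s_i},
  \qquad
  \mathrm{d}\log \mathcal{L}_s \;=\; \sum_i s_i \,\mathrm{d}\log f_i \;=\; 0 .
\]
% Huh's theorem (X smooth very affine, s generic):
\[
  \#\{\text{critical points}\} \;=\; \mathrm{MLdegree}(X) \;=\; (-1)^{\dim X}\,\chi(X).
\]
```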
We extend the scope of a recently introduced dependence coefficient between a scalar response Y and a multivariate covariate X to the case where X takes values in a general metric space. Particular attention is paid to the case where X is a curve. While this extension is straightforward on the population level, the asymptotic behavior of the estimator we consider is delicate. It crucially depends on the nearest neighbor structure of the infinite-dimensional covariate sample, where the deterministic bounds on the degrees of the nearest neighbor graphs that are available in multivariate settings no longer exist. The main contribution of this paper is to give some insight into this matter and to suggest a way to overcome the problem for our purposes. As an important application of our results, we consider an independence test.
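For readers unfamiliar with this family of coefficients: the multivariate estimator being extended appears to be of the Azadkia–Chatterjee type, built from ranks of Y and nearest neighbours of X. A minimal sketch of the Euclidean version, with brute-force nearest neighbours standing in for the general metric-space distance, might look as follows (the function name and implementation details are illustrative, not the paper's):

```python
import numpy as np

def dependence_coefficient(Y, X):
    """Azadkia--Chatterjee-style coefficient T_n(Y, X), illustrative sketch.

    Y: length-n response vector; X: (n, d) covariate array.
    Nearest neighbours are found by brute force under the Euclidean
    metric; in the metric-space extension one would substitute the
    relevant distance on X (e.g. an L^2 distance between curves).
    """
    Y = np.asarray(Y, dtype=float)
    X = np.asarray(X, dtype=float)
    n = len(Y)
    # R_i = #{j : Y_j <= Y_i},  L_i = #{j : Y_j >= Y_i}
    R = np.sum(Y[None, :] <= Y[:, None], axis=1)
    L = np.sum(Y[None, :] >= Y[:, None], axis=1)
    # M(i): index of the nearest neighbour of X_i (self excluded);
    # the degrees of this nearest neighbour graph are exactly what the
    # asymptotic analysis in the abstract is concerned with.
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(D, np.inf)
    M = np.argmin(D, axis=1)
    num = np.sum(n * np.minimum(R, R[M]) - L.astype(float) ** 2)
    den = np.sum(L * (n - L))
    return num / den
```

The coefficient is close to 0 when Y is independent of X and close to 1 when Y is a noiseless function of X, which is what makes it suitable as the basis of an independence test.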