Oberseminar Statistics and Data Science
Factor analysis is a statistical technique that explains correlations among observed random variables through a smaller number of unobserved factors. In traditional full factor analysis, each observed variable is influenced by every factor. Many applications, however, exhibit sparsity patterns in which each observed variable depends only on a subset of the factors. In this talk, we will discuss parameter identifiability of sparse factor analysis models. In particular, we present a sufficient condition for parameter identifiability that generalizes the well-known Anderson-Rubin condition and is tailored to the sparse setting. This is joint work with Mathias Drton, Miriam Kranzlmüller, and Irem Portakal.
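For orientation, a minimal sketch of the model behind this abstract, in standard notation; the symbols and the particular zero pattern below are illustrative and not taken from the talk.

```latex
% Factor model with p observed variables and k < p latent factors:
%   X = \Lambda f + \varepsilon,  f ~ N(0, I_k),  \varepsilon ~ N(0, \Psi),  \Psi diagonal,
% so that the implied covariance is
\[
  \Sigma \;=\; \operatorname{Cov}(X) \;=\; \Lambda \Lambda^{\top} + \Psi .
\]
% In the sparse setup, a zero pattern is fixed in the loading matrix,
% e.g. for p = 4 and k = 2:
\[
  \Lambda \;=\;
  \begin{pmatrix}
    \lambda_{11} & 0 \\
    \lambda_{21} & 0 \\
    0            & \lambda_{32} \\
    \lambda_{41} & \lambda_{42}
  \end{pmatrix},
\]
% and parameter identifiability asks when (\Lambda, \Psi) can be recovered
% from \Sigma given this pattern.
```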
Graphical continuous Lyapunov models offer a novel framework for the statistical modeling of correlated multivariate data. These models define the covariance matrix through a continuous Lyapunov equation, parameterized by the drift matrix of the underlying dynamic process. In this talk, I will discuss key results on the defining equations of these models and explore the challenge of structural identifiability. Specifically, I will present conditions under which models derived from different directed acyclic graphs (DAGs) are equivalent and provide a transformational characterization of such equivalences. This is based on ongoing work with Carlos Amendola, Tobias Boege, and Ben Hollering.
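As a concrete illustration of the defining equation, a small numerical sketch follows; the graph, the stable drift matrix, and the convention C = 2I are assumptions chosen for illustration, not details from the talk.

```python
# Sketch: a graphical continuous Lyapunov model generates a covariance matrix
# Sigma as the solution of  M @ Sigma + Sigma @ M.T + C = 0, where the drift
# matrix M carries the sparsity of a graph. Conventions here (stable M, one
# edge-direction encoding, C = 2*I) are illustrative assumptions.
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

# Drift matrix for the DAG 1 -> 2 -> 3; off-diagonal entry M[i, j] != 0
# encodes an edge j -> i. The negative diagonal makes M stable, so the
# Lyapunov equation has a unique positive definite solution.
M = np.array([
    [-1.0, 0.0, 0.0],
    [0.8, -1.0, 0.0],
    [0.0, 0.5, -1.0],
])
C = 2.0 * np.eye(3)

# scipy solves A X + X A^T = Q, so pass Q = -C to match M Sigma + Sigma M^T = -C.
Sigma = solve_continuous_lyapunov(M, -C)

print(np.allclose(M @ Sigma + Sigma @ M.T + C, 0.0))  # True
print(Sigma)  # stationary covariance of the underlying dynamic process
```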
While the theory of causality is widely viewed as an extension of probability theory, a view we share, there has been no universally accepted axiomatic framework for causality analogous to Kolmogorov's measure-theoretic axiomatization of probability theory. Instead, many competing frameworks exist, such as structural causal models or the potential outcomes framework, most of which have the flavor of statistical models. To fill this gap, we propose the notion of causal spaces, consisting of a probability space along with a collection of transition probability kernels, called causal kernels, which satisfy two simple axioms and encode causal information that probability spaces cannot. The proposed framework is not only rigorously grounded in measure theory; it also sheds light on long-standing limitations of existing frameworks, including, for example, cycles, latent variables, and stochastic processes. Our hope is that causal spaces will play the same role for the theory of causality that probability spaces play for the theory of probability.
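To fix notation for the objects named in this abstract: the tuple notation below is ours, and the two axioms themselves are the content of the work.

```latex
% A causal space, as described above, augments a probability space with kernels:
%   (\Omega, \mathcal{F}, \mathbb{P}, \mathbb{K}),
% where (\Omega, \mathcal{F}, \mathbb{P}) is an ordinary probability space and
% \mathbb{K} is a collection of causal kernels. Each kernel is a transition
% probability kernel in the usual measure-theoretic sense,
\[
  K : \Omega \times \mathcal{F} \to [0,1],
\]
% such that K(\omega, \cdot) is a probability measure on (\Omega, \mathcal{F})
% for every \omega, and K(\cdot, A) is measurable for every A \in \mathcal{F}.
% The two axioms these kernels must satisfy, and the causal information they
% encode beyond \mathbb{P}, are the subject of the talk.
```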
Clinical data often include a mix of continuous measurements and covariates that have been discretized, typically to protect privacy, meet reporting obligations, or simplify clinical interpretation. This combination, along with the nonlinear and tail-asymmetric dependence frequently observed in clinical data, affects the behavior of regression and variable-selection methods. Copula models, which separate marginal behavior from the dependence structure, provide a principled approach to studying these effects. In this talk, we analyze how discretizing a continuous covariate into equiprobable categories impacts conditional quantiles and likelihoods in bivariate copula models. For the Clayton and Frank families, we derive closed-form anchor points: for a given category, we identify the continuous covariate value at which the conditional quantile under the continuous model matches that of the discretized one. These anchors provide an exact measure of discretization bias, which is small near the center of the distribution but can be substantial in the tails. Simulations across five copula families show that likelihood-based variable selection may over- or under-weight discretized covariates, depending on the dependence structure. We conclude by comparing, in the same simulation settings, polyserial and Pearson correlations as well as Kendall's tau-b. Our results have practical implications for copula-based modeling of mixed-type data.
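A numerical sketch of the anchor-point idea for the Clayton family; the parameter values, function names, and root-finding shortcut below are illustrative assumptions (the talk derives the anchors in closed form).

```python
# Sketch: within an equiprobable category (a, b) for the covariate U (copula
# scale, so U ~ Uniform(0,1)), find the anchor u* at which the conditional
# quantile of V given U = u* under the continuous model equals the conditional
# quantile of V given U in (a, b) under the discretized model.
import numpy as np
from scipy.optimize import brentq

theta = 2.0  # Clayton dependence parameter (assumed value)
tau = 0.9    # quantile level in the upper tail, where bias is largest

def clayton_cdf(u, v):
    # Clayton copula C(u, v) = (u^-theta + v^-theta - 1)^(-1/theta).
    return (u**-theta + v**-theta - 1.0) ** (-1.0 / theta)

def cond_quantile_continuous(tau, u):
    # Closed-form inverse h-function: the v with P(V <= v | U = u) = tau.
    return ((tau * u**(theta + 1)) ** (-theta / (1 + theta))
            + 1.0 - u**-theta) ** (-1.0 / theta)

def cond_quantile_discretized(tau, a, b):
    # The v with P(V <= v | U in (a, b)) = tau; since U is uniform, the
    # conditional CDF is (C(b, v) - C(a, v)) / (b - a).
    f = lambda v: (clayton_cdf(b, v) - clayton_cdf(a, v)) / (b - a) - tau
    return brentq(f, 1e-10, 1.0 - 1e-10)

# Four equiprobable categories; look at the last (upper-tail) one.
a, b = 0.75, 1.0 - 1e-9
v_disc = cond_quantile_discretized(tau, a, b)

# Anchor point: the u* in (a, b) whose continuous conditional quantile
# matches the discretized one.
u_star = brentq(lambda u: cond_quantile_continuous(tau, u) - v_disc, a, b)
print(f"discretized quantile {v_disc:.4f}, anchor u* = {u_star:.4f}")
```

The gap between cond_quantile_continuous(tau, u) and v_disc, as u ranges over the category, is one way to visualize the discretization bias described above.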
Inference of the conditional dependence structure is challenging when many covariates are present. In numerous applications, only a low-dimensional projection of the covariates influences the conditional distribution. The smallest subspace that captures this effect is known in the literature as the central subspace. We show that inference of the central subspace of a vector random variable Y conditioned on a vector of covariates X can be separated into inference of the marginal central subspaces of the components of Y conditioned on X and of the copula central subspace, which we define in this work. Further discussion addresses sufficient dimension reduction subspaces for conditional association measures. An adaptive nonparametric method is introduced for estimating the central dependence subspaces, achieving parametric convergence rates under mild conditions. Simulation studies illustrate the practical performance of the proposed approach.
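For context, the standard definition of the central subspace from the sufficient dimension reduction literature (the copula central subspace itself is defined in the talk):

```latex
% A subspace spanned by the columns of a matrix B is a dimension reduction
% subspace for Y given X if
\[
  Y \;\perp\!\!\!\perp\; X \;\mid\; B^{\top} X,
\]
% i.e. the conditional distribution of Y given X depends on X only through
% the projection B^{\top} X. The central subspace is the intersection of all
% such subspaces, provided that intersection is itself a dimension reduction
% subspace; it is the smallest subspace capturing the effect of X on the
% conditional law of Y.
```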