In this work, we establish the stability of generic polycycles of hybrid planar vector fields, extending previously known results in the literature. The polycycles considered here may contain hyperbolic saddles, tangential singularities, and jump singularities.
There has been increasing interest in the estimation of risk capital within enterprise risk models, particularly through Monte Carlo methods. A key challenge in this area is accurately characterizing the distribution of a company's available capital, which depends on the conditional expected value of future cash flows. Two prominent approaches are available: the "regress-now" method, which projects cash flows and regresses their discounted values on basis functions, and the "regress-later" method, which approximates cash flows using realizations of tractable processes and subsequently calculates the conditional expected value in closed form. This paper demonstrates that the left and right singular functions of the valuation operator serve as robust approximators for both approaches, enhancing their performance. Furthermore, we describe situations where each method outperforms the other, with the regress-later method demonstrating superior performance under certain conditions, while the regress-now method generally exhibits greater robustness.
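To make the regress-now idea concrete, here is a minimal, self-contained sketch under simplified, illustrative assumptions (a single geometric Brownian risk driver, a hypothetical call-like cash flow, a polynomial basis); it is not the paper's valuation operator or basis choice.

import numpy as np

# Minimal regress-now sketch under illustrative assumptions: one geometric
# Brownian risk driver, a hypothetical call-like cash flow at T, and a
# polynomial basis -- not the paper's setup or basis choice.
rng = np.random.default_rng(0)
n_paths, r, sigma, t, T = 10_000, 0.02, 0.2, 1.0, 2.0

# Risk driver at the valuation date t and at the cash-flow date T.
Z1, Z2 = rng.standard_normal((2, n_paths))
S_t = np.exp((r - 0.5 * sigma**2) * t + sigma * np.sqrt(t) * Z1)
S_T = S_t * np.exp((r - 0.5 * sigma**2) * (T - t) + sigma * np.sqrt(T - t) * Z2)

# Discounted cash flow whose conditional expectation at time t we want.
discounted = np.exp(-r * (T - t)) * np.maximum(S_T - 1.0, 0.0)

# Regress-now: regress discounted cash flows on basis functions of S_t.
basis = np.vander(S_t, N=4)                   # columns S_t^3, S_t^2, S_t, 1
coef, *_ = np.linalg.lstsq(basis, discounted, rcond=None)
capital_estimate = basis @ coef               # approximates E[discounted CF | S_t]

The regress-later variant would instead approximate the cash flow S_T itself by tractable basis functions and compute the conditional expectation of each basis function analytically.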
Geometric packing problems have been investigated for centuries in mathematics; a notable example is Kepler's conjecture, dating from the 1600s, on the packing of spheres in Euclidean space.
In this talk, we mainly address the knapsack problem for hyperspheres. The input is a set of hyperspheres associated with profits and a hyperrectangular container, the knapsack. The goal is to pack a maximum-profit subset of the hyperspheres into the knapsack. For this problem, we present a PTAS. For a more general version, in which arbitrary fat objects may replace the hyperspheres, we present a resource augmentation scheme (an algorithm that finds a solution of at least optimal profit in a slightly enlarged knapsack), as formalized below.
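In symbols (a hedged formalization of the description above, with \(p\) the total profit, \(I\) the input, and \(K\) the knapsack): for every fixed \(\varepsilon > 0\) the scheme returns a packing \(\mathcal{A}_\varepsilon(I)\) with
\[
p\bigl(\mathcal{A}_\varepsilon(I)\bigr) \;\ge\; \mathrm{OPT}(I, K), \qquad \mathcal{A}_\varepsilon(I) \text{ packed into a knapsack enlarged by a factor } (1+\varepsilon).
\]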
Decision makers desire to implement decision rules that, when applied to individuals in the population of interest, yield the best possible outcomes. For example, the current focus on precision medicine reflects the search for individualized treatment decisions, adapted to a patient's characteristics. In this presentation, I will consider how to formulate, choose and estimate effects that guide individualized treatment decisions. In particular, I will introduce a class of regimes that are guaranteed to outperform conventional optimal regimes in settings with unmeasured confounding. I will further consider how to identify or bound these "superoptimal" regimes and their values. The performance of the superoptimal regimes will be illustrated in two examples from medicine and economics.
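In the usual potential-outcomes notation (my gloss, not necessarily the talk's exact formulation), the conventional optimal regime maximizes the conditional mean outcome,
\[
d^{\mathrm{opt}}(l) \;\in\; \operatorname*{arg\,max}_{a}\; \mathbb{E}\bigl[Y^{a} \mid L = l\bigr],
\]
where \(Y^{a}\) is the potential outcome under treatment \(a\) and \(L\) the measured covariates; a superoptimal regime is one whose value \(\mathbb{E}[Y^{d}]\) is guaranteed to be at least \(\mathbb{E}[Y^{d^{\mathrm{opt}}}]\) even when \(L\) omits some confounders.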
A fundamental challenge in causal induction is inferring the underlying graph structure from observational and interventional data. While traditional algorithms rely on candidate graph generation and score-based evaluation, my research takes a different approach. I developed neural causal models that leverage the flexibility and scalability of neural networks to infer causal relationships, enabling more expressive mechanisms and facilitating analysis of larger systems.
Building on this foundation, I further explored causal foundation models, inspired by the success of large language models. These models are trained on massive datasets of diverse causal graphs, learning to predict causal structures from both observational and interventional data. This "black box" approach achieves remarkable performance and scalability, significantly surpassing traditional methods. The models exhibit robust generalization to new synthetic graphs, resilience under train-test distribution shifts, and state-of-the-art performance on naturalistic graphs with low sample complexity.
We then leverage LLMs and their metacognitive processes to organize skills causally. By labeling problems and clustering them into interpretable categories, LLMs gain the ability to categorize skills; the resulting skill category acts as a causal variable, enabling skill-based prompting and enhancing mathematical reasoning.
This integrated perspective, combining causal induction in graph structures with emergent skills in LLMs, advances our understanding of how skills function as causal variables. It offers a structured pathway to unlock complex reasoning capabilities in AI, paving the way from simple word prediction to sophisticated causal reasoning in LLMs.
Currently, there is no universal definition of a random bifurcation. Many of the existing definitions either fail or are too complicated to validate in applications. To address this, new definitions of a random bifurcation were proposed and subsequently validated for the supercritical pitchfork bifurcation with additive Brownian noise [1]. However, the stochastic integration used in [1] suffers from multiple drawbacks, which can be circumvented by using rough integrals instead. In my master's thesis, we translated the work of [1] into the rough path formalism (made precise in [2], [3]). We were able to reproduce the results of [1] using rough instead of stochastic integration. Furthermore, taking advantage of the rough path formalism, we could additionally investigate the case where the additive noise is given by fractional Brownian motion with Hurst parameter \(H \in \left(\tfrac{1}{3}, \tfrac{1}{2}\right)\). In this talk, we discuss previously existing definitions of a random bifurcation and compare them to the concepts newly proposed in [1]. In particular, we point out where existing definitions fall short. Furthermore, we introduce the rough path formalism and argue why it is such a fitting choice for investigating random bifurcations. Moreover, we compare rough to stochastic integrals and highlight the major advantages of the former. Finally, we present our bifurcation results for the rough fractional pitchfork equation with additive (fractional) Brownian noise.
References
[1] M. Callaway, T. S. Doan, J. S. W. Lamb and M. Rasmussen. "The dichotomy spectrum for random dynamical systems and pitchfork bifurcations with additive noise". In: Annales de l'Institut Henri Poincaré – Probabilités et Statistiques 53.4 (2017). doi: 10.1214/16-AIHP763.
[2] P. K. Friz and M. Hairer. A Course on Rough Paths. Berlin: Springer, 2014.
[3] P. K. Friz and N. B. Victoir. Multidimensional Stochastic Processes as Rough Paths: Theory and Applications. Cambridge: Cambridge University Press, 2010.
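For orientation, the prototypical equation studied in [1] is the stochastic pitchfork equation with additive noise,
\[
\mathrm{d}x_t \;=\; \bigl(\alpha x_t - x_t^3\bigr)\,\mathrm{d}t + \sigma\,\mathrm{d}W_t,
\]
and the rough fractional version discussed in the talk replaces \(W\) by a fractional Brownian motion \(B^H\) with \(H \in \left(\tfrac{1}{3}, \tfrac{1}{2}\right)\), the equation then being interpreted via rough integration.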
In this talk, we present a physically motivated model of diffusion dynamics for a system of n hard spheres (colloids) evolving in a bath of infinitely many very small particles (polymers). We first show that this two-type system with reflection admits a unique strong solution. We then explore the main feature of the model: by projecting the stationary measure onto the subset of the large spheres, these now feel a new attractive short-range dynamical interaction with each other, known in the physics literature as a depletion force, due to the (hidden) small particles. We are able to construct a natural gradient system with depletion interaction, having the projected measure as its stationary measure. Finally, we will see how this dynamics yields, in the high-density limit for the small particles, a constructive dynamical approach to the famous discrete geometry problem of maximising the contact number of n identical spheres. Based on joint work with M. Fradon, J. Kern, and S. Rœlly.
Applying causal discovery algorithms to real-world problems can be challenging, especially when unobserved confounders may be present. In theory, constraint-based approaches like FCI can handle this, but they implicitly rely on sparse networks to complete within a reasonable amount of time. In practice, however, many relevant systems are anything but sparse. For example, in protein-protein interaction networks, there are often high-density nodes ('hub nodes') representing key proteins that are central to many processes in the cell. Other systems, like electric power grids and social networks, can exhibit high-clustering tendencies, leading to so-called small-world networks. In such cases, even basic constraint-based algorithms like PC can get stuck in the high-density regions, failing to produce any useful output. Existing approaches to deal with this, like Anytime-FCI, were designed to interrupt the search at any stage and then produce a sound causal model output as soon as possible, but unfortunately can require even more time to complete than simply letting FCI run by itself.
In this talk I will present a new approach to causal discovery in graphs with high-density regions. Based on a radically new search strategy and modified orientation rules, it builds up the causal graph on the fly, updating the model after each validated edge removal while remaining sound throughout. It exploits the intermediate causal model to reduce the number and size of the conditional independence tests, and it automatically prioritizes 'low-hanging fruit', leaving difficult (high-density) regions until last. The resulting 'anytime-anywhere FCI+' is a true anytime algorithm that is not only faster than its traditional counterparts but also more flexible: it can easily be adapted to handle arbitrary edge removals as well, opening up new possibilities for, e.g., targeted cause-effect recovery in large graphs.
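For context, the following is a minimal sketch of the standard PC/FCI skeleton phase (with ci_test a placeholder conditional-independence oracle), not the new algorithm; it makes the bottleneck visible: an edge at a node with \(d\) neighbors may require up to \(\binom{d-1}{k}\) tests per conditioning-set size \(k\).

from itertools import combinations

def pc_skeleton(nodes, ci_test, max_k=3):
    """Standard PC-style skeleton search (sketch only). ci_test(i, j, S)
    is a placeholder oracle returning True if i is independent of j
    given the set S. Dense regions are costly: a node with d neighbors
    can trigger up to C(d-1, k) tests per edge and level k."""
    adj = {v: set(nodes) - {v} for v in nodes}      # start from the complete graph
    for k in range(max_k + 1):                      # growing conditioning-set size
        for i in nodes:
            for j in sorted(adj[i]):
                if j not in adj[i]:                 # edge already removed this pass
                    continue
                candidates = adj[i] - {j}
                if len(candidates) < k:
                    continue
                for S in combinations(sorted(candidates), k):
                    if ci_test(i, j, set(S)):       # independence found:
                        adj[i].discard(j)           # remove the edge and move on
                        adj[j].discard(i)
                        break
    return adj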
Quantum walks (QWs) can be viewed as quantum analogs of classical random walks. Mathematically, a QW is described as a unitary, local operator acting on a grid and can be written as a product of shift and coin operators. We highlight differences to classical random walks and stress their connection to quantum algorithms (e.g., Grover's algorithm). If the QW is assumed to be translation invariant, applying the Fourier transform yields a multiplication operator, whose band structure we briefly study. After equipping the underlying lattice with random phases, we turn to dynamical localization: the probability to move from one lattice site to another decreases, on average, exponentially in the distance, no matter how many steps the QW takes. We sketch the proof of dynamical localization on the hexagonal lattice in the regime of strong disorder, which uses a finite volume method.
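As a minimal illustration of the shift-and-coin structure (a one-dimensional Hadamard walk; the talk's hexagonal lattice with random phases is not modeled here):

import numpy as np

# Minimal 1D coined quantum walk: one Hadamard coin step followed by a
# coin-conditioned shift, repeated. Illustrates the shift/coin product
# structure only -- not the hexagonal-lattice model with random phases.
n_sites, n_steps = 201, 50
H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)           # Hadamard coin

psi = np.zeros((2, n_sites), dtype=complex)            # psi[c, x]: coin c, site x
psi[:, n_sites // 2] = np.array([1, 1j]) / np.sqrt(2)  # balanced initial coin state

for _ in range(n_steps):
    psi = H @ psi                                      # coin acts at every site
    psi[0] = np.roll(psi[0], -1)                       # coin 0 moves left
    psi[1] = np.roll(psi[1], +1)                       # coin 1 moves right

prob = (np.abs(psi) ** 2).sum(axis=0)                  # position distribution

Unlike a classical random walk, whose position distribution spreads diffusively (width of order the square root of the number of steps), this distribution spreads ballistically, with two characteristic peaks moving linearly outward.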
Some PDEs admit symmetric ‘pattern’ solutions, such as traveling or rotating waves. We investigate what happens to such patterns when noise is present in the system. We show that the patterns persist for a long time with high probability, and describe the (noise-induced) motion of the pattern. The symmetries of the PDE and the pattern play a key role in describing the dynamics, and we give special attention to the situation where the symmetry group is not commutative.
In smooth optimization, Taylor expansion is a powerful tool for analyzing the speed of convergence of solution methods. In nonsmooth optimization, this tool cannot be applied anymore, as there is no suitable generalization of Taylor expansion for a nonsmooth function. As a result, although many different solution methods have been proposed, the speeds of convergence of these methods are rarely analyzed. In this talk, I will present a novel approach based on the Goldstein \(\varepsilon\)-subdifferential for analyzing convergence behavior in nonsmooth optimization. More precisely, given a converging sequence and upper bounds for the distance of the \(\varepsilon\)-subdifferential to zero for vanishing \(\varepsilon\), we derive an estimate for the distance of said sequence to the minimizer. The assumptions we require for this are polynomial growth around the minimizer and, depending on the order of growth, a higher-order generalization of semismoothness. After giving an overview of the assumptions and the theoretical results, I will show how these results lead to a better understanding of the behavior of gradient sampling methods.
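For reference, the Goldstein \(\varepsilon\)-subdifferential of a locally Lipschitz function \(f\) at \(x\) is
\[
\partial_\varepsilon f(x) \;=\; \operatorname{conv}\Bigl(\bigcup_{y \in \overline{B}_\varepsilon(x)} \partial f(y)\Bigr),
\]
with \(\partial f\) the Clarke subdifferential, and the growth assumption mentioned above has the shape \(f(x) - f(x^*) \ge c\,\lVert x - x^*\rVert^p\) near the minimizer \(x^*\) (the precise constants and orders from the talk are not reproduced here).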
The Traveling Salesman Problem (TSP) is one of the most studied hard problems in combinatorial optimization. We developed a new algorithm that uses a connection between minimum cost flow (MCF) problems and the TSP to improve on a given local optimum. MCF problems can be solved efficiently through linear programming or combinatorial algorithms such as cycle-cancelling. We investigated the potential of cycle-cancelling in the context of the TSP. By restricting the search space of cycles to cancel for a given tour, our computational results show that only a small number of subtours is created, and a simple patching step suffices for a high success rate and gap closure towards an optimum.
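As background, the core step of cycle-cancelling is detecting a negative-cost cycle in the residual graph; a standard Bellman-Ford sketch of that building block (not our TSP-specific restriction or patching step) is:

def find_negative_cycle(n, edges):
    """Bellman-Ford negative-cycle detection: the core step of
    cycle-cancelling for min-cost flow. edges is a list of (u, v, cost)
    triples over nodes 0..n-1. Returns the node list of a negative-cost
    cycle, or None if none exists."""
    dist = [0.0] * n          # implicit virtual source connected to every node
    pred = [-1] * n
    last = -1
    for _ in range(n):        # a relaxation in the n-th full pass proves a cycle
        last = -1
        for u, v, cost in edges:
            if dist[u] + cost < dist[v]:
                dist[v] = dist[u] + cost
                pred[v] = u
                last = v
        if last == -1:
            return None       # converged early: no negative cycle exists
    for _ in range(n):        # walk predecessors to land inside the cycle
        last = pred[last]
    cycle, v = [last], pred[last]
    while v != last:
        cycle.append(v)
        v = pred[v]
    return cycle[::-1]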
Competitive games of complete information famously always have Nash equilibria, but it is well-known that correlation (“advice”) can yield new equilibria, sometimes with preferable collective properties (social welfare, fairness, …). While it is known that quantum correlations in the form of entanglement do not imply further correlated equilibria, this situation changes when going to so-called Bayesian games of incomplete information, where each player has to react to a privately known type, while being ignorant about the type of the other players.
In fact, when all players in the game share the same payoff function, this reproduces the “nonlocal games” studied in quantum mechanics since Bell, in which case the CHSH inequality separates classical from quantum correlation and quantum from no-signalling.
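Concretely, with \(E(a,b)\) denoting the correlation of the two players' answers for question pair \((a,b)\), the CHSH quantity
\[
\mathcal{S} \;=\; E(a,b) + E(a,b') + E(a',b) - E(a',b')
\]
satisfies \(|\mathcal{S}| \le 2\) for classical correlations, \(|\mathcal{S}| \le 2\sqrt{2}\) for quantum correlations (Tsirelson's bound), and \(|\mathcal{S}| \le 4\) for general no-signalling correlations.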
In the talk, I will review how classical correlations, shared quantum states and no-signalling correlations create a hierarchy of sets of classically correlated, quantum and belief-invariant equilibria of conflict-of-interest games, all containing the set of Nash equilibria. I will show a simple and easy-to-analyze construction of such games based on quantum pseudo-telepathy games, which exhibits the quantum advantages of larger social welfare and of a fairer distribution of payoff.
[Based on work with V. Auletta, D. Ferraioli, A. Rai and G. Scarpa, arXiv:1605.07896, and with M. Cerda (BSc thesis, UAB 2021).]
The Euler characteristic of a very affine variety encodes the algebraic complexity of solving likelihood (or scattering) equations on this variety. I will discuss this quantity for the Grassmannian with d hyperplane sections removed. I will demonstrate a combinatorial formula, and explain how to compute this Euler characteristic in practice, both symbolically and numerically. Connections to (and between) statistics and particle physics will also be discussed. This is joint work with Elia Mazzucchelli and Kexin Wang.
This talk is part of a special series of lectures on algebraic statistics.
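The link invoked here is the fact, due to Huh, that for a smooth very affine variety \(X\) the maximum likelihood degree equals the signed Euler characteristic,
\[
\mathrm{MLdeg}(X) \;=\; (-1)^{\dim X}\,\chi(X),
\]
so computing \(\chi\) of the Grassmannian with hyperplane sections removed counts the complex critical points of the log-likelihood (or scattering) equations.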
In this talk, we analyse Lyapunov exponents of recurrent neural networks. First, we define the Lyapunov spectrum of a random dynamical system by stating Oseledets' theorem and consider some special cases, such as Oseledets' theorem for triangular cocycles and its consequences for the Lyapunov spectrum of cocycles with invariant subspaces. After briefly recalling the concept of entropy, we take a look at skew product flows and the so-called fiber entropy, and connect it with Lyapunov exponents using Pesin's formula. Subsequently, we consider a lower bound for the top Lyapunov exponent of random dynamical systems of a special shape. Finally, we apply the derived statements to continuous-time and discrete-time recurrent neural networks by interpreting them as random dynamical systems, and discuss the implications in terms of memory capacity.
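For reference, the Lyapunov exponent in direction \(v\) and Pesin's formula (the latter valid under suitable absolute-continuity assumptions on the invariant measure \(\mu\)) read
\[
\lambda(x, v) \;=\; \lim_{n \to \infty} \frac{1}{n} \log \bigl\lVert D_x f^{\,n} v \bigr\rVert,
\qquad
h_\mu(f) \;=\; \int \sum_{i \,:\, \lambda_i > 0} \lambda_i \dim E_i \,\mathrm{d}\mu ,
\]
where the \(E_i\) are the Oseledets subspaces associated with the exponents \(\lambda_i\).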
No-Free-Lunch theorems are important results in the mathematical foundations of statistical learning. They typically state that, in expectation with respect to a uniformly chosen target concept, no machine learning algorithm performs better on unseen data than random guessing. Put differently, one algorithm can only outperform another when supplied with sufficient a priori knowledge by means of training data or design. In this talk, I will present a new kind of No-Free-Lunch theorem, namely for so-called autoregressive models, most prominently used in the large language models powering, e.g., OpenAI's ChatGPT. These can be represented by higher-order Markov chains whose kernels are learned during training. I will discuss the key points of the proof and put the result into perspective with scenarios relevant to natural language processing.
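For comparison, the classical statement (paraphrased, in Wolpert's spirit, not the talk's autoregressive version) says that for any two learners \(A\) and \(B\), averaging the off-training-set error uniformly over all target concepts \(f \in \mathcal{F}\) gives
\[
\frac{1}{\lvert\mathcal{F}\rvert} \sum_{f \in \mathcal{F}} \mathrm{err}_f(A) \;=\; \frac{1}{\lvert\mathcal{F}\rvert} \sum_{f \in \mathcal{F}} \mathrm{err}_f(B).
\]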
Deep neural networks have led to breakthrough results in numerous practical machine learning tasks. In this lecture, we will attempt a journey through the mathematical universe behind these practical successes, elucidating the theoretical underpinnings of deep neural networks in functional analysis, harmonic analysis, complex analysis, approximation theory, dynamical systems, Kolmogorov complexity, optimal transport, fractal geometry, mathematical logic, and automata theory.
Invited by Prof. Gitta Kutyniok.
We extend the scope of a recently introduced dependence coefficient between a scalar response Y and a multivariate covariate X to the case where X takes values in a general metric space. Particular attention is paid to the case where X is a curve. While on the population level this extension is straightforward, the asymptotic behavior of the estimator we consider is delicate. It crucially depends on the nearest neighbor structure of the infinite-dimensional covariate sample, where deterministic bounds on the degrees of the nearest neighbor graphs, available in multivariate settings, no longer exist. The main contribution of this paper is to give some insight into this matter and to suggest a way to overcome the problem for our purposes. As an important application of our results, we consider an independence test.
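If the coefficient in question is the one of Azadkia and Chatterjee (a plausible reading of "recently introduced", stated here as an assumption), its population version is
\[
T(Y, X) \;=\; \frac{\displaystyle\int \operatorname{Var}\bigl(\mathbb{E}[\mathbf{1}\{Y \ge t\} \mid X]\bigr)\,\mathrm{d}\mu(t)}{\displaystyle\int \operatorname{Var}\bigl(\mathbf{1}\{Y \ge t\}\bigr)\,\mathrm{d}\mu(t)},
\]
where \(\mu\) is the law of \(Y\); \(T = 0\) if and only if \(Y\) and \(X\) are independent, and \(T = 1\) if and only if \(Y\) is a measurable function of \(X\). Its estimator is built from nearest neighbors among the covariates, which is exactly where the metric-space setting becomes delicate.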