Wide neural networks are often described by Gaussian process limits, which capture their typical behavior at initialization and under lazy training regimes. However, these approximations fail to describe the rare but structurally significant fluctuations that govern finite-width effects, posterior concentration, and feature learning. In this talk, we present a large deviation perspective on deep neural networks that goes beyond Gaussian limits.
We first discuss recent results establishing a functional large deviation principle for fully connected networks with Gaussian initialization and Lipschitz activations, including ReLU. This provides a probabilistic description of the entire network output as a random function on compact input sets, with a rate function characterized by a recursive variational structure across layers.
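Schematically, such a functional large deviation principle states that deviations of the width-$n$ network output $f_n$ from its Gaussian-process limit are exponentially rare, with a rate function built recursively across layers. The notation below is illustrative only; the precise rate function and per-layer cost are given in the referenced papers:

```latex
% Schematic LDP at speed n (the width): I_L is the rate function for an
% L-layer network, with a recursive variational structure across layers.
\mathbb{P}\big( f_n \approx f \big) \asymp e^{-n\, I_L(f)},
\qquad
I_\ell(f) \;=\; \inf_{g} \Big\{\, I_{\ell-1}(g) + J_\ell\big(f \mid g\big) \,\Big\},
```

where $J_\ell(f \mid g)$ is a schematic cost of producing layer-$\ell$ output $f$ from layer-$(\ell-1)$ output $g$.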
We then turn to Bayesian neural networks and show how large deviation theory leads to an explicit variational characterization of the posterior over predictors. In contrast with Gaussian process theory, where the kernel is fixed, the large deviation rate function involves a joint optimization over both outputs and internal kernels, yielding a natural notion of feature learning at the functional level. Numerical experiments illustrate how this framework captures non-Gaussian effects, posterior deformation, and data-dependent kernel selection in moderately wide networks.
These results suggest that large deviations provide a principled framework to understand representation learning in wide neural networks, bridging probabilistic asymptotics with practical behavior beyond Gaussian approximations.
Based on arXiv:2601.18276 and arXiv:2602.22925.
In recent years, conservation laws with space-dependent nonlocal fluxes have attracted growing attention due to their wide range of applications in fluid mechanics and related fields, including granular flow, sedimentation, aggregation phenomena, crowd dynamics, and, in particular, traffic flow, which serves as the main motivating example of this talk.
We provide an introduction to nonlocal conservation laws, where nonlocality is modeled through a spatial convolution operator. We discuss key analytical properties of these models, including the uniqueness of weak solutions and the singular limit as the convolution kernel converges to a Dirac delta.
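A prototypical equation of this type (generic notation, as commonly written in the traffic-flow literature) is

```latex
% Nonlocal conservation law: the flux at x depends on the density
% averaged through a convolution kernel \omega_\eta of width \eta.
\partial_t \rho + \partial_x \Big( \rho\, v\big( \rho * \omega_\eta \big) \Big) = 0,
\qquad
(\rho * \omega_\eta)(t,x) = \int_{\mathbb{R}} \rho(t,y)\, \omega_\eta(x-y)\, dy,
```

and the singular limit mentioned above corresponds to letting $\omega_\eta$ converge to a Dirac delta as $\eta \to 0$, formally recovering a local conservation law.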
Furthermore, we present a general framework for the numerical approximation of such equations and highlight possible extensions of both the theory and the numerical methods, as well as remaining open problems in the literature.
Mercer's celebrated theorem is refined and extended by introducing a novel class of higher-order kernel operators that includes the common integral operator only as a special case. These operators genuinely take into account information encoded in the (weak) derivatives of a kernel, and their natural domains are Sobolev spaces of order k over some bounded d-dimensional domain, where k depends on the order of (weak) differentiability.

The spectral decomposition of such higher-order kernel operators leads to Mercer-type expansions, which are optimal in terms of the Sobolev norm and, if k > d, also converge uniformly without requiring the kernel to be positive definite.

Nuclearity of higher-order kernel operators is confirmed for positive definite kernels, and a major refinement of Mercer's theorem is obtained that implies trace formulas and a simple rate for the uniform convergence (including derivatives) in terms of the eigenvalues. A further immediate consequence is novel spectral representations of RKHSs.

Finally, applied to the covariance kernel of a (weakly) differentiable stochastic process, these refinements also yield novel Karhunen–Loève-type expansions allowing for simultaneous approximation of the process and its (weak) derivatives in a mean-square-optimal sense.
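For reference, the classical statement being refined: for a continuous, symmetric, positive definite kernel $K$ on a compact set, Mercer's theorem gives

```latex
K(x,y) \;=\; \sum_{i=1}^{\infty} \lambda_i\, e_i(x)\, e_i(y),
```

where $(\lambda_i, e_i)$ are the eigenpairs of the integral operator $f \mapsto \int K(\cdot, y) f(y)\, dy$ and the series converges absolutely and uniformly. The higher-order operators above replace this with expansions that are optimal in the Sobolev norm and drop the positive definiteness requirement.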
In recent years, fractional ordinary differential equations (FODEs) have become an essential tool for the modelling of viscoelasticity, neuron behaviour, fluid dynamics, electrical circuits, and more. The distinguishing feature of FODEs is the use of a fractional derivative, which generalises the classical derivative to non-integer order.
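A common choice of fractional derivative in this setting (one among several definitions used in the FODE literature) is the Caputo derivative of order $\alpha \in (0,1)$:

```latex
% Caputo fractional derivative: the integral over [0, t] is what makes
% the operator non-local in time.
{}^{C}\!D^{\alpha} f(t)
\;=\;
\frac{1}{\Gamma(1-\alpha)} \int_0^t \frac{f'(s)}{(t-s)^{\alpha}}\, ds ,
```

which reduces to the classical derivative as $\alpha \to 1$.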
The fractional derivative is a non-local operator: the whole history of the function affects the value of the derivative at a given point. This non-locality introduces analytical difficulties when extending standard methods from classical dynamical systems to the FODE framework, similar to the challenges faced with time-delay systems. This is particularly evident in the theory of invariant manifolds; for example, the classical notion of invariance is no longer well-posed for fractional dynamical systems. The literature presents conflicting results on this topic: some studies claim that stable and invariant centre manifolds exist, while one work disputes that claim.
The aim of this talk is to resolve this contradiction and, along the way, to provide a concrete framework for the analysis of fractional dynamical systems.
TBA
We study a natural generalization of the Unsplittable Flow Problem (UFP) on a path. We generalize the problem by allowing multiple parallel paths (machines). The objective is to find a profit-maximizing feasible assignment of n given tasks to m machines. Each task has a given machine-independent start and finish time, as well as a profit and a resource demand. Each machine is characterized by a given time-varying capacity. At any point in time, the total demand of tasks assigned to a machine must not exceed its available capacity. To the best of our knowledge, this is the first work to study the multiple path generalization of the UFP on a path.
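The core feasibility constraint can be sketched in a few lines. The following is an illustrative checker (the representation of tasks, assignments, and time-varying capacities is our own, not from the paper): a task occupies every machine time slot in its window, and the summed demand per machine and time slot must stay within capacity.

```python
from typing import List, Tuple


def is_feasible(
    tasks: List[Tuple[int, int, float]],   # (start, finish, demand) per task
    assignment: List[int],                  # machine index for each task
    capacity: List[List[float]],            # capacity[m][t]: machine m at time t
) -> bool:
    """Check the defining constraint: at every point in time, the total
    demand of tasks assigned to a machine must not exceed its capacity."""
    num_machines = len(capacity)
    horizon = len(capacity[0])
    load = [[0.0] * horizon for _ in range(num_machines)]
    for (start, finish, demand), m in zip(tasks, assignment):
        for t in range(start, finish):      # task is active on [start, finish)
            load[m][t] += demand
    return all(
        load[m][t] <= capacity[m][t]
        for m in range(num_machines)
        for t in range(horizon)
    )
```

For example, two overlapping unit-demand tasks fit on two unit-capacity machines but not on one, which is exactly the disjointness effect distinguishing the multi-machine setting from UFP on a single path.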
Assuming that all machines have uniform capacities and the number of machines is constant, we present a polynomial-time (2 + ϵ)-approximation algorithm for this NP-hard problem, derived by combining randomized rounding techniques with dynamic programming. Furthermore, under the no-bottleneck assumption (NBA) and assuming that the tasks are small, i.e., their demand-to-capacity ratio stays below a fixed value δ on every machine throughout their start-to-finish interval, we present a randomized-rounding-based polynomial-time constant-factor approximation algorithm.
Additionally, for the case of uniform capacities where demands are at most half of the capacity, we present a generalization of the "List Algorithm" given in [2]. We prove that this LP-rounding-based method achieves a polynomial-time 3-approximation. This result extends the 2-approximation from the single-machine setting, with the increased ratio reflecting the additional disjointness constraints required when assigning tasks across multiple parallel machines. Finally, we exhibit instances with unit demands that possess an integrality gap greater than 1, a phenomenon that does not occur in the standard UFP on a path (i.e., the single machine case).
Divergent asymptotic expansions are ubiquitous in mathematical physics, yet they often encode far more information than their formal nature suggests. In this talk, I will present ideas from resurgence theory, which provide a systematic way to reconstruct analytic functions from such expansions.
As an example, I will consider the exact WKB method, where asymptotic series arise as formal solutions of Schrödinger operators. Resurgence reveals how different analytic realizations of these series are related through Stokes phenomena: discrete jumps that encode nonperturbative effects and, geometrically, changes of triangulations of an underlying Riemann surface.
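The basic resummation tool underlying this reconstruction is Borel summation (standard notation, stated here schematically): a factorially divergent series is mapped to its Borel transform and resummed by a Laplace integral along a direction $\theta$,

```latex
\varphi(x) = \sum_{n \ge 0} a_n\, x^{n+1}
\;\longmapsto\;
\widehat{\varphi}(\zeta) = \sum_{n \ge 0} \frac{a_n}{n!}\, \zeta^{n},
\qquad
\mathcal{S}_\theta \varphi(x) = \int_0^{e^{i\theta} \infty} e^{-\zeta/x}\, \widehat{\varphi}(\zeta)\, d\zeta .
```

The Stokes jumps occur precisely when the ray of integration crosses a singularity of $\widehat{\varphi}$, and the residues of these singularities carry the nonperturbative data.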
TBA