Research Statement

Mastery of matter means approximate computation

\[ \begin{aligned} \mathcal{L}_{\mathrm{SM}} ={}& -\frac{1}{4} F_{\mu\nu} F^{\mu\nu} + i\bar{\psi} \cancel{D} \psi \\ &+ \psi_i y_{ij} \psi_j \phi + \text{h.c.} \\ &+ \lvert D_\mu \phi \rvert^2 - V(\phi) \end{aligned} \]

Everything we observe on Earth is governed by this equation, the appropriately named Standard Model of particle physics. It is fascinating from the machine learning point of view: the ultimate example of inductive bias. From a relatively small experimental dataset, we inferred an equation that describes, with almost perfect generalizability, a huge range of phenomena, from the behavior of quarks in the Large Hadron Collider to the electrons inside an H100 GPU and the DNA of living organisms.

In a way, we have solved science. Aristotle claimed that by using pure reason he could understand the world; now we finally can. And yet there is no cure for cancer, no fusion power plant, and no printer that consistently connects to the Wi-Fi on the first try – all of which are most certainly permitted by the Standard Model. The cause of this most unfortunate gap is our meager reasoning power. Anything more complex than a hydrogen atom can't be solved exactly – we enter the realm of approximation. Machine learning is, by its nature, the automated tool for building approximations. My research is about using it to expand the frontier of what we can understand and control in the physical world.

Vision

Forward problem: Prediction

The most straightforward application of machine learning is to take an expensive ab initio simulator, run it enough times, and train a model to approximate the result. I did this for a Cherenkov detector, for example. This naive approach has crucial practical and fundamental limitations. Practically, for it to make sense, the training dataset must be much smaller than the number of future model invocations. Fundamentally, such an approach will only ever be capable of modelling systems that can be reliably and cheaply simulated in the first place. Particle physics detectors, where Monte Carlo simulation repeatedly models interactions of the same types of particles with the same detector, are a good example of where a straightforward ML surrogate works.
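
The surrogate recipe and its break-even condition can be sketched in a few lines. This is a toy illustration, not the Cherenkov detector setup: `expensive_simulator` is a hypothetical stand-in for a real simulation, the surrogate is a plain piecewise-linear interpolant rather than a neural network, and the cost figures passed to `break_even_calls` are made up.

```python
import bisect
import math


def expensive_simulator(x: float) -> float:
    """Hypothetical stand-in for an ab initio simulation (toy smooth response)."""
    return math.sin(3.0 * x) * math.exp(-0.5 * x)


def train_surrogate(n_train: int, lo: float, hi: float):
    """Run the simulator n_train times on a grid and return a cheap interpolant."""
    xs = [lo + (hi - lo) * i / (n_train - 1) for i in range(n_train)]
    ys = [expensive_simulator(x) for x in xs]

    def surrogate(x: float) -> float:
        # Clamp queries to the training range: the surrogate is not trusted outside it.
        x = min(max(x, lo), hi)
        # Index of the right-hand grid point of the interval containing x.
        j = min(max(bisect.bisect_right(xs, x), 1), n_train - 1)
        t = (x - xs[j - 1]) / (xs[j] - xs[j - 1])
        return (1.0 - t) * ys[j - 1] + t * ys[j]

    return surrogate


def break_even_calls(n_train: int, cost_sim: float, cost_surrogate: float) -> float:
    """Number of future calls above which training the surrogate pays off.

    Solves n_train * cost_sim + n * cost_surrogate = n * cost_sim for n.
    """
    return n_train * cost_sim / (cost_sim - cost_surrogate)


surrogate = train_surrogate(n_train=200, lo=0.0, hi=4.0)
max_err = max(
    abs(surrogate(0.01 * k) - expensive_simulator(0.01 * k)) for k in range(401)
)
print(f"max abs error on a dense grid: {max_err:.2e}")
print(f"break-even invocations: {break_even_calls(200, 1.0, 0.001):.1f}")
```

When the surrogate is far cheaper than the simulator, the break-even count is only slightly above the training set size, so the approach pays off only if the model is called many more times than the simulator was run to train it.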

The infinitely more interesting and challenging problem is expanding the range of systems that can be modelled: solve the Schrödinger equation for 10 atoms, then train a model to predict for 1000. The issue is that such a model can't ever be a naive black box – it must contain some assumptions, some inductive bias about how the world operates. For example, once we agree that interatomic interactions are limited by physical distance, we suddenly can train an ML model on small systems and predict the properties of much larger ones, leading to machine-learning interatomic potentials, the most successful application of ML in materials science to date. We are looking for similar scale transitions across the board.
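
One common way to write the locality assumption is as a sum of per-atom contributions that depend only on a finite neighborhood. This is a generic sketch rather than any specific model's formulation; the symbols \(\varepsilon_\theta\) (a learned function with parameters \(\theta\)) and \(r_c\) (a cutoff radius) are notation introduced here for illustration.

\[ E(\mathbf{r}_1, \dots, \mathbf{r}_N) \approx \sum_{i=1}^{N} \varepsilon_\theta\bigl( \{ \mathbf{r}_j - \mathbf{r}_i : j \neq i,\ \lVert \mathbf{r}_j - \mathbf{r}_i \rVert < r_c \} \bigr) \]

Because \(\varepsilon_\theta\) only ever sees neighborhoods of bounded size, parameters fitted on small systems can be evaluated on systems of any size, provided the local environments in the large system resemble those seen in training.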

Short-range interactions are the most obvious inductive bias to use. But it's unlikely that it would lead us to things like a tractable superconductivity model. We need better and more creative assumptions. The ongoing explosion of reasoning models offers enticing opportunities here:

AI Theoretician – which would carry out analytical math to bound and isolate the black-box approximation among strictly derived equations, similarly to the Nobel Prize-winning density functional theory.
AI Phenomenologist – its softer brother, who doesn't rely on strict mathematical derivation, but uses the available experimental data and general physical principles to build a working model.

Inverse problem: Mastery

From the point of view of applied science, prediction is just scaffolding that enables us to make useful things. The ultimate goal is to control the system, to design it to our needs. This is the inverse problem. It is also much harder, as it requires us to navigate the space of possible solutions rather than just predict the outcome for a given input.

For most real-world applications, the utility function is just too complex to define and too expensive to evaluate. The traditional solution is to define a hierarchy of proxy functions, each approximating the utility function at a different level of abstraction and cost. For example, see our review on automated 2D material design. Again, modern reasoning AI offers an opportunity here: instead of being manually crafted, proxy functions can be generated and refined based on literature and upstream results.
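
A proxy hierarchy is often used as a screening funnel: cheap proxies prune the candidate pool so that the expensive ones see only a few survivors. The sketch below is a toy illustration under stated assumptions, not the workflow from the review: `utility`, `coarse_proxy`, `medium_proxy`, and the keep counts are all hypothetical, and the proxies are simply the utility evaluated at reduced resolution.

```python
def screening_funnel(candidates, proxies, keep_counts):
    """Filter candidates through proxies ordered from cheapest to most expensive.

    Returns the final survivors and the number of evaluations spent at each stage.
    """
    survivors = list(candidates)
    evaluations = []
    for proxy, keep in zip(proxies, keep_counts):
        evaluations.append(len(survivors))
        # Rank by the current proxy (higher is better) and keep the top ones.
        ranked = sorted(survivors, key=proxy, reverse=True)
        survivors = ranked[:keep]
    return survivors, evaluations


TARGET = 0.7  # hypothetical optimum of the toy utility


def utility(x):
    """Toy stand-in for the true, expensive utility function."""
    return -((x - TARGET) ** 2)


def coarse_proxy(x):
    """Cheapest proxy: sees the candidate at one-decimal resolution."""
    return utility(round(x, 1))


def medium_proxy(x):
    """Intermediate proxy: sees the candidate at two-decimal resolution."""
    return utility(round(x, 2))


candidates = [i / 1000 for i in range(1000)]
survivors, evaluations = screening_funnel(
    candidates,
    proxies=[coarse_proxy, medium_proxy, utility],
    keep_counts=[100, 10, 1],
)
print("evaluations per stage:", evaluations)
print("selected candidate:", survivors)
```

In this setup the full utility is evaluated on only 10 of the 1000 candidates. The funnel works only as well as the proxies rank the true optimum among the survivors, which is exactly where proxy design matters.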

Once the search space and the fitness function are fixed at some level, the problem becomes one of optimization. And global optimization is hard, with no universal algorithm that works for all problems. Optimization typically has two trade-offs to balance: exploration versus exploitation, and the cost of the surrogate model versus the cost of evaluation. This leads to very different approaches in different domains.

In our work on a terahertz antenna, the fitness function was well defined and affordable to evaluate, and the search space was high-dimensional, so we went with annealing.
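
For readers unfamiliar with the technique, here is a minimal sketch of simulated annealing, assuming that is the kind of annealing meant here. It is not the antenna code: `toy_fitness`, the Gaussian single-coordinate move, the geometric cooling schedule, and all parameter values are illustrative assumptions.

```python
import math
import random


def simulated_annealing(fitness, x0, n_steps=20000, t_start=1.0, t_end=1e-3,
                        step=0.1, seed=0):
    """Maximize fitness(x) over real vectors with Metropolis acceptance."""
    rng = random.Random(seed)
    x, fx = list(x0), fitness(x0)
    best_x, best_f = list(x), fx
    for k in range(n_steps):
        # Geometric cooling: high temperature explores, low temperature exploits.
        t = t_start * (t_end / t_start) ** (k / (n_steps - 1))
        # Local move: perturb one randomly chosen coordinate.
        cand = list(x)
        i = rng.randrange(len(cand))
        cand[i] += rng.gauss(0.0, step)
        fc = fitness(cand)
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if fc >= fx or rng.random() < math.exp((fc - fx) / t):
            x, fx = cand, fc
            if fx > best_f:
                best_x, best_f = list(x), fx
    return best_x, best_f


def toy_fitness(x):
    """Hypothetical multimodal objective with its global maximum of 0 at the origin."""
    return -sum(xi * xi + 1.0 - math.cos(2.0 * math.pi * xi) for xi in x)


x0 = [2.0] * 20
best_x, best_f = simulated_annealing(toy_fitness, x0)
print(f"start fitness: {toy_fitness(x0):.3f}, best found: {best_f:.3f}")
```

The temperature controls the balance between exploration and exploitation, and every step costs one fitness evaluation, so the method fits cases where evaluations are affordable.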

In materials science, on the other hand, true fitness is often prohibitively e