About me and this page

Hello there 👋!

My name is Natalí de Santi. I am a physicist working on statistical, computational, and machine learning (ML) methods applied to Cosmology and Astrophysics.

I am currently a Postdoctoral Scholar at the Berkeley Center for Cosmological Physics (BCCP), University of California, Berkeley (UC Berkeley), and an affiliated researcher at the Lawrence Berkeley National Laboratory (LBL).

My main scientific interests include:

  • large-scale structure of the Universe,
  • dark matter and dark energy,
  • cosmological parameter inference,
  • N-body and hydrodynamical simulations,
  • the halo-galaxy connection,
  • BAO reconstruction and parameter estimation.

Beyond Physics itself, I am deeply curious about data science, machine learning, probabilistic modeling, and scientific programming.

I obtained my PhD from the University of São Paulo (USP), under the supervision of Dr. Luis Raul Weber Abramo (USP) and Dr. Francisco Villaescusa-Navarro (Flatiron Institute).
My doctoral thesis focuses on how machine learning methods can be used to extract cosmological information from simulated halo and galaxy catalogs, and it is available here.

A little about me

Since childhood, I have been fascinated by science, especially Astronomy and Physics, as well as by computers.
As you can see in the picture below, when I was only four years old, I had a peculiar talent for drawing ducks in Microsoft Paint 🦆

This curiosity led me to participate in Astronomy Olympiads from an early age.
During high school, I took part in experimental research on superconductivity with Dr. Antonio Carlos Hernandes at USP São Carlos, an experience that ultimately convinced me to pursue Physics rather than Astronomy.

I completed my undergraduate studies at the Institute of Physics of São Carlos (IFSC), University of São Paulo.
I initially worked on experimental materials physics within the former CCMC group, before transitioning to a more theoretical and computational path.
My second undergraduate research project focused on Particle Physics, under the supervision of Dr. Attilio Cucchieri.

For my master’s degree, I moved to the Federal University of São Carlos (UFSCar), working with Dr. Raphael Santarelli.
During this period, I studied General Relativity and Quantum Field Theory in Curved Spacetime. My research included a review of Hawking radiation and a discussion of how the mass of a Schwarzschild black hole evolves in time once this effect is taken into account.

During my PhD, now fully focused on Cosmology, I joined the University of São Paulo in São Paulo and later spent a year as a guest researcher and CCA Predoctoral Fellow at the Flatiron Institute (2022-2023), my first long research experience abroad.

Research interests and ongoing work

Broadly speaking, my research lies at the intersection of cosmology, simulations, and machine learning, with a focus on building robust, physically interpretable, and uncertainty-aware inference methods.

1️⃣ Field-level simulation-based inference for cosmology

A major part of my work focuses on likelihood-free, field-level inference using galaxy and halo catalogs.

  • We showed that graph neural networks (GNNs) combined with moment neural networks (MNNs) can extract cosmological information directly from galaxy phase-space data, probing smaller scales than ever before, while remaining robust to astrophysical uncertainties. More details can be found here.

  • We extended this framework to include observational effects, such as masking, velocity and radial distance uncertainties, and galaxy selection, demonstrating that these models retain strong performance on realistic data (paper).

  • By combining GNNs with symbolic regression, we extracted analytic equations that explain how these networks infer cosmological parameters, revealing physically interpretable relations that are robust across simulations and galaxy formation models (paper).

  • Within the CAMELS project, we demonstrated that training on diverse hydrodynamical simulations, particularly Astrid, is key to building ML models that generalize across astrophysics and subgrid physics (paper).

  • More recently, we showed that semi-analytic models (SAMs) can be used to train GNN-based inference models that extrapolate remarkably well to full hydrodynamical simulations, offering a fast and efficient path to cosmology-ready mock catalogs (paper).

2️⃣ Improving cosmological covariance matrices with machine learning

I also work on reducing the computational cost of cosmological analyses by improving covariance matrices.

  • Using convolutional neural networks (CNNs), we developed a denoising approach that allows accurate covariance matrices to be constructed from only tens to hundreds of simulations, instead of tens of thousands, while preserving parameter inference accuracy (paper). The denoiser also showed extrapolation power: trained on matrices built from ExSHalos data, it carried over to data from the Quijote simulations.
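
As a rough illustration of why the number of simulations matters, the sketch below estimates a covariance matrix from mock data vectors and compares the result for a small and a large number of mocks. It is not the CNN denoiser from the paper. The toy "true" covariance, the number of bins, the mock counts, and the function names are all assumptions made for this example.

```python
import numpy as np


def sample_covariance(mocks):
    """Unbiased sample covariance of data vectors stacked as rows (n_mocks, n_bins)."""
    mocks = np.asarray(mocks, dtype=float)
    n_mocks = mocks.shape[0]
    centered = mocks - mocks.mean(axis=0)
    return centered.T @ centered / (n_mocks - 1)


def relative_error(estimate, truth):
    """Frobenius-norm distance between two matrices, relative to the truth."""
    return np.linalg.norm(estimate - truth) / np.linalg.norm(truth)


rng = np.random.default_rng(0)
n_bins = 20

# Toy "true" covariance (an assumption): correlations decay with bin separation.
idx = np.arange(n_bins)
true_cov = 0.5 ** np.abs(idx[:, None] - idx[None, :])

# Draw Gaussian mocks and measure how noisy the estimate is for each mock count.
errors = {}
for n_mocks in (50, 5000):
    mocks = rng.multivariate_normal(np.zeros(n_bins), true_cov, size=n_mocks)
    errors[n_mocks] = relative_error(sample_covariance(mocks), true_cov)

for n_mocks, err in errors.items():
    print(f"{n_mocks:5d} mocks -> relative error {err:.3f}")
```

The estimate from a few dozen mocks is visibly noisier than the one from thousands, which is the gap a denoising step aims to close.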

3️⃣ Modeling the halo-galaxy connection with probabilistic ML

Another core research direction is understanding and modeling the stochastic nature of the halo-galaxy connection.

  • We used multiple ML models (ERT, LGBM, kNN, and NN) and data augmentation techniques (SMOGN) to predict galaxy properties from halo features, improving the description of scatter in halo-galaxy relations (paper).

  • By reformulating regression as a classification problem, we recovered full probability distributions of galaxy properties for the first time (paper).

  • We later compared different probabilistic ML methods, including a multivariate Gaussian distribution, a multilayer perceptron classifier, and normalizing flows, to identify which approaches best capture uncertainty in galaxy populations (paper).
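
The idea of recasting regression as classification can be sketched in a few lines: bin the continuous target, train a classifier on the bin labels, and read the predicted class probabilities as a discretized probability distribution. The example below is only a toy version under stated assumptions. The halo mass to stellar mass relation is invented, a simple k-nearest-neighbours classifier written in NumPy stands in for the models used in the papers, and all names are hypothetical.

```python
import numpy as np


def to_classes(y, edges):
    """Map continuous targets to integer bin labels 0 .. len(edges) - 2."""
    return np.clip(np.digitize(y, edges) - 1, 0, len(edges) - 2)


def knn_class_probabilities(x_train, labels, x_query, n_classes, k=50):
    """Class probabilities per query point from its k nearest neighbours (1D feature)."""
    dist = np.abs(x_query[:, None] - x_train[None, :])
    nearest = np.argsort(dist, axis=1)[:, :k]
    neighbour_labels = labels[nearest]  # shape (n_query, k)
    # Fraction of neighbours falling in each class; each row sums to 1.
    return np.stack(
        [(neighbour_labels == c).mean(axis=1) for c in range(n_classes)], axis=1
    )


rng = np.random.default_rng(1)

# Toy data (an assumption): a halo-mass feature and a stellar-mass target with scatter.
log_mhalo = rng.uniform(11.0, 14.0, size=2000)
log_mstar = 0.8 * (log_mhalo - 11.0) + 9.0 + rng.normal(0.0, 0.2, size=2000)

# Discretize the target into 20 bins and turn the regression into classification.
n_classes = 20
edges = np.linspace(log_mstar.min(), log_mstar.max(), n_classes + 1)
labels = to_classes(log_mstar, edges)

# Predicted distribution over stellar-mass bins for two example halos.
query = np.array([11.5, 13.5])
pdf = knn_class_probabilities(log_mhalo, labels, query, n_classes=n_classes, k=50)

# The full distribution is available, not just a point estimate such as the mean.
centers = 0.5 * (edges[:-1] + edges[1:])
mean_pred = pdf @ centers
print(mean_pred)
```

Each row of `pdf` is a histogram over stellar-mass bins for one halo, so the scatter at fixed halo mass is kept rather than averaged away.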

This page

This blog is a space where I share my research and related topics, ranging from how we simulate the Universe on a computer to modern machine learning techniques for cosmology, with the goal of making these ideas accessible while remaining scientifically rigorous.