Sitemap

A list of all the posts and pages found on the site. For you robots out there, an XML version is available for digesting as well.

Pages

Posts

Pytorch

8 minute read

Published:

Hello there! If you have just arrived at this blog and would like to learn something about PyTorch, I believe you can find something here to start playing around with. In this post we will see how to write simple code to solve a simple regression problem. This means we will predict a quantity $y$ from values of its predictor (the input) $x$, for a noisy linear relation $y = a x + b$.

Graph Neural Networks

17 minute read

Published:

Have you ever heard about graphs? What about graph neural networks (GNNs)? If you have never paid attention to them, let me point out that structures like the one in the image below are, in fact, graphs. And they are all over the internet.

10 years from my undergraduation in Physics

20 minute read

Published:

In this post I bring a flashback from my undergrad in Physics, because this year (2022) marks 10 years since the beginning of this wonderful journey! What is written here is only my personal experience and does not reflect what happens to all students: first, because I am old (a lot has changed at USP, the university where I did my undergrad, and in the Physics major itself), and second (of course) because the experience differs between majors.

Schwarzschild Black Holes

7 minute read

Published:

This post is dedicated to the Schwarzschild solution of Einstein’s equations. Here I will follow a simple route to derive it and show why this solution gives rise to what we call the most basic kind of black hole: the Schwarzschild black hole.

Casimir effect

15 minute read

Published:

In this post you will see what the bizarre Casimir effect is. I will start by telling you about quantum fluctuations, give you a qualitative description of the effect and then two quantitative ones. As I warned you in the previous post (Harmonic Oscillator), the physics and mathematics behind this effect are those of quantum harmonic oscillators!

Harmonic Oscillator

14 minute read

Published:

This post is about the basic unit of Physics, the harmonic oscillator. I start by telling you why it is so important in Physics and describing its classical version. Then I move on to its quantum version. This “transition” is a good way to see the Correspondence Principle, and here I explain why!

A bit of BaTi0.9Zr0.1O3

5 minute read

Published:

In this post you will see a bit about barium titanate (BaTiO3) with the inclusion of zirconium, i.e., the material BaTi0.9Zr0.1O3. I start with the motivation behind this work and its objectives. Then I describe the methodology used to produce the powders and the pellets, and finish with the electrical characterization of the materials produced.

Superconductivity

7 minute read

Published:

In this post I will summarize what superconductivity is, tell you about some of its history, go through a little of the theory (without touching the “mathematics” behind it, because this work was done when I was a teenager) and show you all the stages of producing the cuprate YBaCuO pellets and developing a support to observe magnetic levitation with them.

Portfolio

Publications

Mimicking the halo-galaxy connection using machine learning

Published in MNRAS, 2022

Elucidating the connection between the properties of galaxies and the properties of their hosting haloes is a key element in theories of galaxy formation. When the spatial distribution of objects is also taken under consideration, investigating the halo-galaxy connection becomes very relevant for cosmological measurements. In this paper, we use machine learning (ML) techniques to analyze these intricate relations in the IllustrisTNG300 magnetohydrodynamical simulation. We employ four different algorithms: extremely randomized trees (ERT), K-nearest neighbors (kNN), light gradient boosting machine (LGBM), and neural networks (NN), along with a stacked model where we combine results from all four approaches. Overall, the different ML algorithms produce consistent results in terms of predicting galaxy properties from a set of input halo properties that include halo mass, concentration, spin, and halo overdensity. For stellar mass, the (predicted v. true) Pearson correlation coefficient is 0.98, dropping down to 0.7-0.8 for specific star formation rate (sSFR), colour, and size. In addition, we test an existing data augmentation technique, designed to alleviate the problem of unbalanced datasets, and show that it improves slightly the shape of the predicted distributions. We also demonstrate that our predictions are good enough to reproduce the power spectra of multiple galaxy populations, defined in terms of stellar mass, sSFR, colour, and size with high accuracy. Our results align with previous reports suggesting that certain galaxy properties cannot be reproduced using halo features alone.

Recommended citation: Natalí S. M. de Santi, Natália V. N. Rodrigues, Antonio D. Montero-Dorta, L. Raul Abramo, Beatriz Tucci, M. Celeste Artale, Mimicking the halo–galaxy connection using machine learning, Monthly Notices of the Royal Astronomical Society, Volume 514, Issue 2, August 2022, Pages 2463–2478, https://doi.org/10.1093/mnras/stac1469

Improving cosmological covariance matrices with machine learning

Published in JCAP, 2022

Cosmological covariance matrices are fundamental for parameter inference, since they are responsible for propagating uncertainties from the data down to the model parameters. However, when data vectors are large, in order to estimate accurate and precise matrices we need huge numbers of observations, or rather costly simulations - neither of which may be viable. In this work we propose a machine learning approach to alleviate this problem in the context of the matrices used in the study of large-scale structure. With only a small amount of data (matrices built with samples of 50-200 halo power spectra) we are able to provide significantly improved matrices, which are almost indistinguishable from the ones built from much larger samples (thousands of spectra). In order to perform this task we trained convolutional neural networks to denoise the matrices, using in the training process a data set made up entirely of spectra extracted from simple, inexpensive halo simulations (mocks). We then show that the method not only removes the noise in the matrices of the cheap simulation, but it is also able to successfully denoise the matrices of halo power spectra from N-body simulations. We compare the denoised to the other matrices using several metrics, and in all of them they score better, without any signs of spurious artifacts. With the help of the Wishart distribution we derive an analytical extrapolation for the effective sample augmentation allowed by the denoiser. Finally, we show that, by using the denoised matrices, the cosmological parameters can be recovered with nearly the same accuracy as when using matrices built with a sample of 30,000 spectra in the case of the cheap simulations, and with 15,000 spectra in the case of the N-body simulations. Of particular interest is the bias in the Hubble parameter H0, which was significantly reduced after applying the denoiser.

Recommended citation: de Santi, N. S. M. and Abramo, L. R. 2022, DOI:10.1088/1475-7516/2022/09/013 [https://arxiv.org/pdf/2205.10881.pdf](https://iopscience.iop.org/article/10.1088/1475-7516/2022/09/013)

Primeiros passos na obtenção de parâmetros cosmológicos utilizando matrizes de covariância cosmológicas sem ruído

Published in Blucher Physics Proceedings, 2022

Covariance matrices are among the most important ingredients of data analysis in cosmology: they not only represent our understanding of the nature of the uncertainties, but also reflect the propagation of statistical errors and depend on the assumptions, arising from the theoretical models employed, that are used to reduce the data. To represent the true statistical errors, a large amount of data is needed to build these matrices, that is, hundreds of thousands of observations or very expensive simulations, which is not always feasible. To address this problem, the use of machine learning techniques was proposed, with a complete pipeline for the task. A simulation of a Gaussian field with three parameters was implemented. The linear power spectrum was then computed for each map produced, and hundreds of covariance matrices were computed using different numbers of spectra. The matrices were used as input to a convolutional neural network to remove the noise from those built with few data. Finally, the resulting denoised covariance matrices were used to recover the simulation parameters with Markov chain Monte Carlo (MCMC) algorithms. The results showed that this technique is able to produce good covariance matrices even with little input data, greatly reducing the errors on the recovered cosmological parameters.

Recommended citation: de Santi, N. S. M. and Abramo, L. R. 2022, [DOI: 10.5151/astrocientistas2021-11](https://www.proceedings.blucher.com.br/article-details/primeiros-passos-na-obteno-de-parmetros-cosmolgicos-utilizando-matrizes-de-covarincia-cosmolgicas-sem-rudo-37453)

Robust field-level likelihood-free inference with galaxies

Published in arXiv, 2023

We train graph neural networks to perform field-level likelihood-free inference using galaxy catalogs from state-of-the-art hydrodynamic simulations of the CAMELS project. Our models are rotationally, translationally, and permutation invariant and have no scale cutoff. By training on galaxy catalogs that only contain the 3D positions and radial velocities of approximately 1,000 galaxies in tiny volumes of $(25\,h^{-1}\,\mathrm{Mpc})^3$, our models achieve a precision of approximately 12% when inferring the value of Ωm. To test the robustness of our models, we evaluated their performance on galaxy catalogs from thousands of hydrodynamic simulations, each with different efficiencies of supernova and AGN feedback, run with five different codes and subgrid models, including IllustrisTNG, SIMBA, Astrid, Magneticum, and SWIFT-EAGLE. Our results demonstrate that our models are robust to astrophysics, subgrid physics, and subhalo/galaxy finder changes. Furthermore, we test our models on 1,024 simulations that cover a vast region in parameter space - variations in 5 cosmological and 23 astrophysical parameters - finding that the model extrapolates really well. Including both positions and velocities is key to building robust models, and our results indicate that our networks have likely learned an underlying physical relation that does not depend on galaxy formation and is valid on scales larger than, at least, $\sim 10\,h^{-1}\,\mathrm{kpc}$.

Recommended citation: Natalí S. M. de Santi et al 2023 ApJ 952 69 [DOI 10.3847/1538-4357/acd1e2](https://iopscience.iop.org/article/10.3847/1538-4357/acd1e2/meta)

A universal equation to predict Ωm from halo and galaxy catalogues

Published in arXiv, 2023

We discover analytic equations that can infer the value of Ωm from the positions and velocity moduli of halo and galaxy catalogues. The equations are derived by combining a tailored graph neural network (GNN) architecture with symbolic regression. We first train the GNN on dark matter halos from Gadget N-body simulations to perform field-level likelihood-free inference, and show that our model can infer Ωm with ∼6% accuracy from halo catalogues of thousands of N-body simulations run with six different codes: Abacus, CUBEP3M, Gadget, Enzo, PKDGrav3, and Ramses. By applying symbolic regression to the different parts comprising the GNN, we derive equations that can predict Ωm from halo catalogues of simulations run with all of the above codes with accuracies similar to those of the GNN. We show that by tuning a single free parameter, our equations can also infer the value of Ωm from galaxy catalogues of thousands of state-of-the-art hydrodynamic simulations of the CAMELS project, each with a different astrophysics model, run with five distinct codes that employ different subgrid physics: IllustrisTNG, SIMBA, Astrid, Magneticum, SWIFT-EAGLE. Furthermore, the equations also perform well when tested on galaxy catalogues from simulations covering a vast region in parameter space that samples variations in 5 cosmological and 23 astrophysical parameters. We speculate that the equations may reflect the existence of a fundamental physics relation between the phase-space distribution of generic tracers and Ωm, one that is not affected by galaxy formation physics down to scales as small as $10\,h^{-1}\,\mathrm{kpc}$.

Recommended citation: Shao, H., de Santi, N. S. M., Villaescusa-Navarro, F., et al. 2023, arXiv:2302.14591. doi:10.48550/arXiv.2302.14591 [https://arxiv.org/pdf/2302.14591.pdf](https://arxiv.org/pdf/2302.14591.pdf)

The CAMELS project: Expanding the galaxy formation model space with new ASTRID and 28-parameter TNG and SIMBA suites

Published in arXiv, 2023

We present CAMELS-ASTRID, the third suite of hydrodynamical simulations in the Cosmology and Astrophysics with MachinE Learning (CAMELS) project, along with new simulation sets that extend the model parameter space based on the previous frameworks of CAMELS-TNG and CAMELS-SIMBA, to provide broader training sets and testing grounds for machine-learning algorithms designed for cosmological studies. CAMELS-ASTRID employs the galaxy formation model following the ASTRID simulation and contains 2,124 hydrodynamic simulation runs that vary 3 cosmological parameters (Ωm, σ8, Ωb) and 4 parameters controlling stellar and AGN feedback. Compared to the existing TNG and SIMBA simulation suites in CAMELS, the fiducial model of ASTRID features the mildest AGN feedback and predicts the least baryonic effect on the matter power spectrum. The training set of ASTRID covers a broader variation in the galaxy populations and the baryonic impact on the matter power spectrum compared to its TNG and SIMBA counterparts, which can make machine-learning models trained on the ASTRID suite exhibit better extrapolation performance when tested on other hydrodynamic simulation sets. We also introduce extension simulation sets in CAMELS that widely explore 28 parameters in the TNG and SIMBA models, demonstrating the enormity of the overall galaxy formation model parameter space and the complex non-linear interplay between cosmology and astrophysical processes. With the new simulation suites, we show that building robust machine-learning models favors training and testing on the largest possible diversity of galaxy formation models. We also demonstrate that it is possible to train accurate neural networks to infer cosmological parameters using the high-dimensional TNG-SB28 simulation set.

Recommended citation: Ni, Y., Genel, S., Anglés-Alcázar, D., et al. 2023, arXiv:2304.02096. doi:10.48550/arXiv.2304.02096 [https://arxiv.org/abs/2304.02096](https://arxiv.org/abs/2304.02096)

High-fidelity reproduction of central galaxy joint distributions with Neural Networks

Published in MNRAS, 2023

The relationship between galaxies and haloes is central to the description of galaxy formation, and a fundamental step towards extracting precise cosmological information from galaxy maps. However, this connection involves several complex processes that are interconnected. Machine Learning methods are flexible tools that can learn complex correlations between a large number of features, but are traditionally designed as deterministic estimators. In this work, we use the IllustrisTNG300-1 simulation and apply neural networks in a binning classification scheme to predict probability distributions of central galaxy properties, namely stellar mass, colour, specific star formation rate, and radius, using as input features the halo mass, concentration, spin, age, and the overdensity on a scale of $3\,h^{-1}\,\mathrm{Mpc}$. The model captures the intrinsic scatter in the relation between halo and galaxy properties, and can thus be used to quantify the uncertainties related to the stochasticity of the galaxy properties with respect to the halo properties. In particular, with our proposed method, one can define and accurately reproduce the properties of the different galaxy populations in great detail. We demonstrate the power of this tool by directly comparing traditional single-point estimators and the predicted joint probability distributions, and also by computing the power spectrum of a large number of tracers defined on the basis of the predicted colour-stellar mass diagram. We show that the neural networks reproduce clustering statistics of the individual galaxy populations with excellent precision and accuracy.

Recommended citation: Natália V N Rodrigues, Natalí S M de Santi, Antonio D Montero-Dorta, L Raul Abramo, High-fidelity reproduction of central galaxy joint distributions with neural networks, Monthly Notices of the Royal Astronomical Society, Volume 522, Issue 3, July 2023, Pages 3236–3247, [https://doi.org/10.1093/mnras/stad1186](https://doi.org/10.1093/mnras/stad1186)

A Hierarchy of Normalizing Flows for Modelling the Galaxy-Halo Relationship

Published in arXiv, 2023

Using a large sample of galaxies taken from the Cosmology and Astrophysics with MachinE Learning Simulations (CAMELS) project, a suite of hydrodynamic simulations varying both cosmological and astrophysical parameters, we train a normalizing flow (NF) to map the probability of various galaxy and halo properties conditioned on astrophysical and cosmological parameters. By leveraging the learnt conditional relationships we can explore a wide range of interesting questions, whilst enabling simple marginalisation over nuisance parameters. We demonstrate how the model can be used as a generative model for arbitrary values of our conditional parameters; we generate halo masses and matched galaxy properties, and produce realisations of the halo mass function as well as a number of galaxy scaling relations and distribution functions. The model represents a unique and flexible approach to modelling the galaxy-halo relationship.

Recommended citation: Lovell, C. C., Hassan, S., Anglés-Alcázar, D., et al. 2023, arXiv:2307.06967. doi:10.48550/arXiv.2307.06967 [https://arxiv.org/abs/2307.06967](https://arxiv.org/abs/2307.06967)

Field-level simulation-based inference with galaxy catalogs: the impact of systematic effects

Published in arXiv, 2023

It has recently been shown that a powerful way to constrain cosmological parameters from galaxy redshift surveys is to train graph neural networks to perform field-level likelihood-free inference without imposing cuts on scale. In particular, de Santi et al. (2023) developed models, robust to uncertainties in astrophysics and subgrid models, that could accurately infer the value of Ωm from catalogs that only contain the positions and radial velocities of galaxies.

Recommended citation: Natalí S. M. de Santi et al 2023 ApJ 952 69 [https://arxiv.org/abs/2310.15234](https://arxiv.org/abs/2310.15234)

Talks

Published:

Teaching

Tutor Experience

Tutoring, Tutores Educação Multidisciplinar, 2018

Tutoring in Physics, elementary school II level.

Private teacher

Undergraduate revision lectures, Flexus Educacional LTDA-ME, 2018

Revision lectures for the course LE201 - General Physics I at the University of Campinas (UNICAMP), Limeira campus.

Teaching Assistant of Classical Electrodynamics I

Teaching Assistant, University of São Paulo (USP), 2021

I helped graduate physics students with problems, concepts and theory related to Classical Electrodynamics I. In addition, the professor and I prepared the problem sets, and I wrote up the solutions and made them available to the students.