Portfolio item number 1
Published:
Short description of portfolio item number 1
Published:
Short description of portfolio item number 2
Published in Revista Brasileira de Ensino de Física, 2019
This paper covers the first part of my Master's research.
Recommended citation: Santi, Natali Soler Matubaro de, & Santarelli, Raphael. (2019). Desvendando a radiação Hawking. Revista Brasileira de Ensino de Física, 41(3), e20180312. Epub March 11, 2019. https://doi.org/10.1590/1806-9126-rbef-2018-0312 http://www.scielo.br/pdf/rbef/v41n3/1806-9126-RBEF-41-3-e20180312.pdf
Published in Brazilian Journal of Physics, 2019
This paper contains the numerical results from my Master's research.
Recommended citation: de Santi, N.S.M., Santarelli, R. Mass Evolution of Schwarzschild Black Holes. Braz J Phys 49, 897–913 (2019). https://doi.org/10.1007/s13538-019-00708-y https://link.springer.com/article/10.1007%2Fs13538-019-00708-y
Published in MNRAS, 2022
Elucidating the connection between the properties of galaxies and the properties of their hosting haloes is a key element in theories of galaxy formation. When the spatial distribution of objects is also taken into consideration, investigating the halo-galaxy connection becomes very relevant for cosmological measurements. In this paper, we use machine learning (ML) techniques to analyze these intricate relations in the IllustrisTNG300 magnetohydrodynamical simulation. We employ four different algorithms: extremely randomized trees (ERT), K-nearest neighbors (kNN), light gradient boosting machine (LGBM), and neural networks (NN), along with a stacked model where we combine results from all four approaches. Overall, the different ML algorithms produce consistent results in terms of predicting galaxy properties from a set of input halo properties that include halo mass, concentration, spin, and halo overdensity. For stellar mass, the (predicted vs. true) Pearson correlation coefficient is 0.98, dropping down to 0.7-0.8 for specific star formation rate (sSFR), colour, and size. In addition, we test an existing data augmentation technique, designed to alleviate the problem of unbalanced datasets, and show that it slightly improves the shape of the predicted distributions. We also demonstrate that our predictions are good enough to reproduce the power spectra of multiple galaxy populations, defined in terms of stellar mass, sSFR, colour, and size with high accuracy. Our results align with previous reports suggesting that certain galaxy properties cannot be reproduced using halo features alone.
Recommended citation: Natalí S. M. de Santi, Natália V. N. Rodrigues, Antonio D. Montero-Dorta, L. Raul Abramo, Beatriz Tucci, M. Celeste Artale, Mimicking the halo–galaxy connection using machine learning, Monthly Notices of the Royal Astronomical Society, Volume 514, Issue 2, August 2022, Pages 2463–2478, https://doi.org/10.1093/mnras/stac1469
Published in JCAP, 2022
Cosmological covariance matrices are fundamental for parameter inference, since they are responsible for propagating uncertainties from the data down to the model parameters. However, when data vectors are large, in order to estimate accurate and precise matrices we need huge numbers of observations, or rather costly simulations - neither of which may be viable. In this work we propose a machine learning approach to alleviate this problem in the context of the matrices used in the study of large-scale structure. With only a small amount of data (matrices built with samples of 50-200 halo power spectra) we are able to provide significantly improved matrices, which are almost indistinguishable from the ones built from much larger samples (thousands of spectra). In order to perform this task we trained convolutional neural networks to denoise the matrices, using in the training process a data set made up entirely of spectra extracted from simple, inexpensive halo simulations (mocks). We then show that the method not only removes the noise in the matrices of the cheap simulation, but it is also able to successfully denoise the matrices of halo power spectra from N-body simulations. We compare the denoised to the other matrices using several metrics, and in all of them they score better, without any signs of spurious artifacts. With the help of the Wishart distribution we derive an analytical extrapolation for the effective sample augmentation allowed by the denoiser. Finally, we show that, by using the denoised matrices, the cosmological parameters can be recovered with nearly the same accuracy as when using matrices built with a sample of 30,000 spectra in the case of the cheap simulations, and with 15,000 spectra in the case of the N-body simulations. Of particular interest is the bias in the Hubble parameter H0, which was significantly reduced after applying the denoiser.
Recommended citation: de Santi, N. S. M. and Abramo, L. R. 2022, DOI:10.1088/1475-7516/2022/09/013 [https://arxiv.org/pdf/2205.10881.pdf](https://iopscience.iop.org/article/10.1088/1475-7516/2022/09/013)
Published in Blucher Physics Proceedings, 2022
Covariance matrices are one of the most important pieces of data analysis in cosmology: they not only represent our understanding of the nature of the uncertainties, but also reflect the propagation of statistical errors and depend on the assumptions, arising from the theoretical models, used to reduce the data. To represent the true statistical errors, a large amount of data is needed to build these matrices, that is, hundreds of thousands of observations or very expensive simulations, which are not always feasible to obtain. To address this problem, the use of machine learning techniques was proposed, with a complete pipeline for the task. A simulation of a Gaussian field with three parameters was implemented. The linear power spectrum was then computed for each map produced, and hundreds of covariance matrices were computed using different numbers of spectra. The matrices were used as input to a convolutional neural network to remove the noise from those built with few data. Finally, the resulting denoised covariance matrices were used to recover the simulation parameters with Markov chain Monte Carlo (MCMC) algorithms. The results showed that this technique can produce good covariance matrices even with few input data, greatly reducing the errors on the recovered cosmological parameters.
Recommended citation: de Santi, N. S. M. and Abramo, L. R. 2022, DOI: 10.5151/astrocientistas2021-11 [https://www.proceedings.blucher.com.br/article-details/primeiros-passos-na-obteno-de-parmetros-cosmolgicos-utilizando-matrizes-de-covarincia-cosmolgicas-sem-rudo-37453](https://www.proceedings.blucher.com.br/article-details/primeiros-passos-na-obteno-de-parmetros-cosmolgicos-utilizando-matrizes-de-covarincia-cosmolgicas-sem-rudo-37453)
Published in arXiv, 2023
We train graph neural networks to perform field-level likelihood-free inference using galaxy catalogs from state-of-the-art hydrodynamic simulations of the CAMELS project. Our models are rotationally, translationally, and permutation invariant and have no scale cutoff. By training on galaxy catalogs that only contain the 3D positions and radial velocities of approximately 1,000 galaxies in tiny volumes of (25 h⁻¹ Mpc)³, our models achieve a precision of approximately 12% when inferring the value of Ωm. To test the robustness of our models, we evaluated their performance on galaxy catalogs from thousands of hydrodynamic simulations, each with different efficiencies of supernova and AGN feedback, run with five different codes and subgrid models, including IllustrisTNG, SIMBA, Astrid, Magneticum, and SWIFT-EAGLE. Our results demonstrate that our models are robust to astrophysics, subgrid physics, and subhalo/galaxy finder changes. Furthermore, we test our models on 1,024 simulations that cover a vast region in parameter space - variations in 5 cosmological and 23 astrophysical parameters - finding that the model extrapolates really well. Including both positions and velocities is key to building robust models, and our results indicate that our networks have likely learned an underlying physical relation that does not depend on galaxy formation and is valid on scales larger than, at least, ∼10 h⁻¹ kpc.
Recommended citation: Natalí S. M. de Santi et al 2023 ApJ 952 69 [DOI 10.3847/1538-4357/acd1e2](https://iopscience.iop.org/article/10.3847/1538-4357/acd1e2/meta)
Published in arXiv, 2023
We discover analytic equations that can infer the value of Ωm from the positions and velocity moduli of halo and galaxy catalogues. The equations are derived by combining a tailored graph neural network (GNN) architecture with symbolic regression. We first train the GNN on dark matter halos from Gadget N-body simulations to perform field-level likelihood-free inference, and show that our model can infer Ωm with ∼6% accuracy from halo catalogues of thousands of N-body simulations run with six different codes: Abacus, CUBEP3M, Gadget, Enzo, PKDGrav3, and Ramses. By applying symbolic regression to the different parts comprising the GNN, we derive equations that can predict Ωm from halo catalogues of simulations run with all of the above codes with accuracies similar to those of the GNN. We show that by tuning a single free parameter, our equations can also infer the value of Ωm from galaxy catalogues of thousands of state-of-the-art hydrodynamic simulations of the CAMELS project, each with a different astrophysics model, run with five distinct codes that employ different subgrid physics: IllustrisTNG, SIMBA, Astrid, Magneticum, SWIFT-EAGLE. Furthermore, the equations also perform well when tested on galaxy catalogues from simulations covering a vast region in parameter space that samples variations in 5 cosmological and 23 astrophysical parameters. We speculate that the equations may reflect the existence of a fundamental physics relation between the phase-space distribution of generic tracers and Ωm, one that is not affected by galaxy formation physics down to scales as small as 10 h⁻¹ kpc.
Recommended citation: Shao, H., de Santi, N. S. M., Villaescusa-Navarro, F., et al. 2023, arXiv:2302.14591. doi:10.48550/arXiv.2302.14591 [https://arxiv.org/pdf/2302.14591.pdf](https://arxiv.org/pdf/2302.14591.pdf)
Published in arXiv, 2023
We present CAMELS-ASTRID, the third suite of hydrodynamical simulations in the Cosmology and Astrophysics with MachinE Learning (CAMELS) project, along with new simulation sets that extend the model parameter space based on the previous frameworks of CAMELS-TNG and CAMELS-SIMBA, to provide broader training sets and testing grounds for machine-learning algorithms designed for cosmological studies. CAMELS-ASTRID employs the galaxy formation model following the ASTRID simulation and contains 2,124 hydrodynamic simulation runs that vary 3 cosmological parameters (Ωm, σ8, Ωb) and 4 parameters controlling stellar and AGN feedback. Compared to the existing TNG and SIMBA simulation suites in CAMELS, the fiducial model of ASTRID features the mildest AGN feedback and predicts the least baryonic effect on the matter power spectrum. The training set of ASTRID covers a broader variation in the galaxy populations and the baryonic impact on the matter power spectrum compared to its TNG and SIMBA counterparts, which can make machine-learning models trained on the ASTRID suite exhibit better extrapolation performance when tested on other hydrodynamic simulation sets. We also introduce extension simulation sets in CAMELS that widely explore 28 parameters in the TNG and SIMBA models, demonstrating the enormity of the overall galaxy formation model parameter space and the complex non-linear interplay between cosmology and astrophysical processes. With the new simulation suites, we show that building robust machine-learning models favors training and testing on the largest possible diversity of galaxy formation models. We also demonstrate that it is possible to train accurate neural networks to infer cosmological parameters using the high-dimensional TNG-SB28 simulation set.
Recommended citation: Ni, Y., Genel, S., Anglés-Alcázar, D., et al. 2023, arXiv:2304.02096. doi:10.48550/arXiv.2304.02096 [https://arxiv.org/abs/2304.02096](https://arxiv.org/abs/2304.02096)
Published in MNRAS, 2023
The relationship between galaxies and haloes is central to the description of galaxy formation, and a fundamental step towards extracting precise cosmological information from galaxy maps. However, this connection involves several complex processes that are interconnected. Machine Learning methods are flexible tools that can learn complex correlations between a large number of features, but are traditionally designed as deterministic estimators. In this work, we use the IllustrisTNG300-1 simulation and apply neural networks in a binning classification scheme to predict probability distributions of central galaxy properties, namely stellar mass, colour, specific star formation rate, and radius, using as input features the halo mass, concentration, spin, age, and the overdensity on a scale of 3 h⁻¹ Mpc. The model captures the intrinsic scatter in the relation between halo and galaxy properties, and can thus be used to quantify the uncertainties related to the stochasticity of the galaxy properties with respect to the halo properties. In particular, with our proposed method, one can define and accurately reproduce the properties of the different galaxy populations in great detail. We demonstrate the power of this tool by directly comparing traditional single-point estimators and the predicted joint probability distributions, and also by computing the power spectrum of a large number of tracers defined on the basis of the predicted colour-stellar mass diagram. We show that the neural networks reproduce clustering statistics of the individual galaxy populations with excellent precision and accuracy.
Recommended citation: Natália V N Rodrigues, Natalí S M de Santi, Antonio D Montero-Dorta, L Raul Abramo, High-fidelity reproduction of central galaxy joint distributions with neural networks, Monthly Notices of the Royal Astronomical Society, Volume 522, Issue 3, July 2023, Pages 3236–3247, [https://doi.org/10.1093/mnras/stad1186](https://doi.org/10.1093/mnras/stad1186)
Published in arXiv, 2023
Using a large sample of galaxies taken from the Cosmology and Astrophysics with MachinE Learning Simulations (CAMELS) project, a suite of hydrodynamic simulations varying both cosmological and astrophysical parameters, we train a normalizing flow (NF) to map the probability of various galaxy and halo properties conditioned on astrophysical and cosmological parameters. By leveraging the learnt conditional relationships we can explore a wide range of interesting questions, whilst enabling simple marginalisation over nuisance parameters. We demonstrate how the model can be used as a generative model for arbitrary values of our conditional parameters; we generate halo masses and matched galaxy properties, and produce realisations of the halo mass function as well as a number of galaxy scaling relations and distribution functions. The model represents a unique and flexible approach to modelling the galaxy-halo relationship.
Recommended citation: Lovell, C. C., Hassan, S., Anglés-Alcázar, D., et al. 2023, arXiv:2307.06967. doi:10.48550/arXiv.2307.06967 [https://arxiv.org/abs/2307.06967](https://arxiv.org/abs/2307.06967)
Published in arXiv, 2023
It has recently been shown that a powerful way to constrain cosmological parameters from galaxy redshift surveys is to train graph neural networks to perform field-level likelihood-free inference without imposing cuts on scale. In particular, de Santi et al. (2023) developed models that accurately infer the value of Ωm from catalogs containing only the positions and radial velocities of galaxies, and that are robust to uncertainties in astrophysics and subgrid models.
Recommended citation: Natalí S. M. de Santi et al 2023 ApJ 952 69 [https://arxiv.org/abs/2310.15234](https://arxiv.org/abs/2310.15234)
Published:
Poster presentation, XIX Congresso de Iniciação Científica da UFSCar (XIX CIC), Federal University of São Carlos, São Carlos, SP, Brazil
Published:
Poster presentation, XIII Semana de Física (XIII SeFís), Federal University of São Carlos, São Carlos, SP, Brazil
Published:
Virtual talk, Group Meeting: Prof. Francesco Shankar, University of Southampton, Southampton, UK
Published:
Virtual talk, Journal Club: Astronomical Observatory of Trieste, I.N.A.F., Trieste, Italy
Published:
Virtual talk, AI in Astronomy, IAG, USP, São Paulo, SP, Brazil
Published:
Virtual presentation, II mini-Workshop on the halo-galaxy connection for the South American community, Universidad Técnica Federico Santa María, Santiago, Chile
Published:
Talk, Galaxy Formation Group Meeting, Flatiron Institute, New York, NY, US
Published:
Virtual talk, Journal club: Berkeley Cosmology Group, UC Berkeley, Berkeley, CA, US
Published:
Talk, Cosmology Lunch Talks, Institute of Advanced Studies (IAS), Princeton, NJ, US
Published:
Talk, Tri-state Cosmology X Data Science Meeting, Flatiron Institute
Published:
Talk, Group Seminars - Prof. Bhuvnesh Jain, UPenn, Philadelphia, PA, US
Published:
Talk, Data-Science X Astro seminar, Yale University, New Haven, CT, US
Published:
Talk, Renoir Science Seminar, Recherche Energie NOIRe (RENOIR), Marseille, France
Published:
Talk, CITA cosmology discussion, University of Toronto, Toronto, Canada
Published:
Talk, Seminários do CBPF/COTEC, CBPF, Rio de Janeiro, Brazil
Published:
Virtual talk, Yale Astronomy Data Science Seminar, Yale
Published:
Virtual talk, ML Session of DoA, Tsinghua University, Beijing, China
Tutoring, Tutores Educação Multidisciplinar, 2018
Tutoring in Physics at the Elementary School II level (Brazilian Ensino Fundamental II).
Undergraduate revision lectures, Flexus Educacional LTDA-ME, 2018
Revision lectures for the course LE201 - General Physics I at the University of Campinas (UNICAMP), Limeira campus.
Tutorial, XIX Physics Week, 2019
Coordinates
Teaching Assistant, University of São Paulo (USP), 2020
I helped undergraduate engineering students with problems, concepts, and theory related to Electromagnetism.
Teaching Assistant, University of São Paulo (USP), 2021
I helped graduate physics students with problems, concepts, and theory related to Classical Electrodynamics I. In addition, the professor and I designed the problem sets, and I wrote up the solutions and made them available to the students.
Tutorial, Galaxy Formation and Evolution in the Data Science Era, 2023
Coordinates
Tutorial, AI in Astronomy, 2023
Coordinates