João M. Pereira
I am an Assistant Professor at the Instituto de Matemática Pura e Aplicada. I am an applied mathematician studying machine learning, multilinear algebra, and information theory. My research centers on scalable methods and fundamental statistical limits of high-dimensional inverse problems, including tensor decomposition, cryo-electron microscopy, and partial and stochastic differential equations. In addition, I am interested in machine learning, deep learning, statistics, information theory, and optimization.
I was a Ph.D. student of Emmanuel Abbe and Amit Singer at Princeton University and worked as a postdoc with Vahid Tarokh at Duke University, and with Joe Kileel and Rachel Ward at the University of Texas at Austin.
Recorded Talks
Recent and Upcoming Talks
Linear Algebra and Applications (March – June 2023)
Materials (in Portuguese)
Homework (in Portuguese)
Linear Algebra and Applications (March – June 2024)
Monday and Wednesday, 10:30–12:00
Office Hours: Tuesday, 13:30–15:00, Room 302
Materials (in Portuguese)
Homework (in Portuguese)
Tensor Decompositions
We proposed the Subspace Power Method (SPM) for decomposing low-rank symmetric tensors. Numerical experiments indicate that this method is an order of magnitude faster than the state of the art. In a follow-up work we studied its optimization landscape and showed that, with a suitable initialization, it is provably efficient in certain regimes. Furthermore, the algorithm can decompose moment tensors implicitly, so moment tensors that would otherwise occupy 100 PB of memory can be decomposed in a matter of seconds.
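To fix ideas, here is a minimal sketch of the classical tensor power iteration that methods like SPM build on; this is not the SPM algorithm itself, just the basic idea of extracting one component of a symmetric tensor by repeated contraction.

```python
import numpy as np

def tensor_power_iteration(T, n_iter=50, seed=0):
    """Recover one component of a symmetric order-3 tensor by power iteration.

    Repeatedly contracts T along two modes with the current iterate and
    renormalizes; for a (near) rank-1 tensor this converges to the component.
    """
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(T.shape[0])
    x /= np.linalg.norm(x)
    for _ in range(n_iter):
        y = np.einsum('ijk,j,k->i', T, x, x)  # contract T along two modes
        x = y / np.linalg.norm(y)
    lam = np.einsum('ijk,i,j,k->', T, x, x, x)  # component weight
    return lam, x

# Build a rank-1 symmetric tensor 2 * a⊗a⊗a and recover a and its weight.
a = np.array([3.0, 4.0]) / 5.0
T = 2.0 * np.einsum('i,j,k->ijk', a, a, a)
lam, x = tensor_power_iteration(T)
```

For a rank-1 input the iteration converges in one step; the full symmetric decomposition problem, which SPM addresses, repeats such extractions while handling multiple interacting components.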
Illustration of symmetric tensor decomposition
Learning Laws of Datasets
We use machine learning tools to discover laws and equations, such as partial differential equations (PDEs) and stochastic differential equations (SDEs), that govern high-dimensional datasets. We proposed a method that, given noisy data points of a function solving a PDE, learns both the function and the underlying PDE. For SDEs, we developed a method for learning a latent, unknown, low-dimensional SDE that governs observed high-dimensional data. For example, given a video of a ball moving in the plane according to an SDE, the method learns the two-dimensional SDE that governs the ball's coordinates.
Performance of the method for learning PDEs from data: given true function values (left), corrupted with noise (middle), the method denoises the function (right) and learns the Helmholtz equation. Credit: Ali Hasan
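The simplest instance of learning an SDE from data is parametric: here is a toy sketch (not our method) that recovers the drift and diffusion coefficients of a one-dimensional Ornstein–Uhlenbeck process from an observed path by least-squares regression of the increments.

```python
import numpy as np

# Simulate dX = -theta * X dt + sigma dW by Euler-Maruyama,
# then estimate theta and sigma from the path alone.
rng = np.random.default_rng(0)
theta, sigma, dt, n = 1.5, 0.5, 1e-3, 200_000

x = np.empty(n)
x[0] = 1.0
for t in range(n - 1):
    x[t + 1] = x[t] - theta * x[t] * dt + sigma * np.sqrt(dt) * rng.standard_normal()

dx = np.diff(x)
# Regressing dX on X*dt gives -theta as the slope.
theta_hat = -np.sum(x[:-1] * dx) / np.sum(x[:-1] ** 2 * dt)
# The residual quadratic variation estimates sigma^2.
sigma_hat = np.sqrt(np.sum((dx + theta_hat * x[:-1] * dt) ** 2) / (n * dt))
```

Our actual setting is much harder: the governing SDE is low-dimensional and latent, observed only through a high-dimensional, nonlinear map (e.g. video frames), so the latent coordinates must be learned jointly with the dynamics.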
Fundamental Limits of Multireference Alignment and Cryo-EM
Cryo-electron microscopy (cryo-EM) is a technique for determining the 3D structure of molecules. A crucial challenge in cryo-EM is estimating the 3D electrostatic potential of a molecule from very noisy 2D projections taken at unknown viewing directions. This problem can be seen as an instance of multireference alignment (MRA), in which the observations result from a group action on the signal, which is then corrupted with noise. In the low signal-to-noise ratio (SNR) regime, which is the prevalent regime in cryo-EM, it has been shown that the fundamental limits of MRA are determined by the lower-order moments of the data. This result follows from a Taylor expansion of the Kullback–Leibler divergence between Gaussian mixture models, valid in the low SNR regime, which I obtained and extended in my Ph.D. thesis to other random mixture models.
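Schematically (constants, norms, and regularity conditions omitted; this is a paraphrase of the low-SNR picture, not the thesis's exact statement), the expansion ties the divergence between two signals to the differences of their moment tensors:

```latex
% Schematic only: c_m are constants, \sigma^2 is the noise variance, and
% M_m(\theta) denotes the m-th moment tensor of the group orbit of \theta.
D_{\mathrm{KL}}\!\left(P_{\theta}\,\middle\|\,P_{\theta'}\right)
  \;\approx\; \sum_{m \ge 1} \frac{c_m}{\sigma^{2m}}
  \bigl\| M_m(\theta) - M_m(\theta') \bigr\|^{2},
  \qquad \sigma \to \infty .
```

In words: as the noise grows, distinguishing two signals at moment order $m$ costs a factor of $\sigma^{2m}$ in samples, so the lowest moment order at which the orbits differ dictates the sample complexity.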
Single-particle reconstruction problem in cryo-EM. Credit: Amit Singer
Pseudo-spectra of Time-Frequency Localization Operators
We used inequalities arising from the trace and norm of time-frequency localization operators to study their pseudo-spectra and obtain other results. Moreover, we used similar ideas to show that a class of determinantal point processes, associated with the Schrödinger representation of the Heisenberg group, is hyperuniform.
Left: Poisson process, not hyperuniform; middle: hyperuniform; right: lattice/crystal, also hyperuniform. Credit: Wikipedia
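Hyperuniformity can be checked numerically: the variance of the number of points in a large window grows more slowly than the window size. This toy sketch uses simple 1-D processes (a Poisson process versus a jittered lattice), not the Heisenberg-group determinantal processes from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
domain, n = 10_000.0, 10_000  # unit-intensity point processes on [0, domain]

def count_variance(points, window, n_trials=2000):
    """Variance of the point count in a randomly placed interval of given length."""
    starts = rng.uniform(0, domain - window, size=n_trials)
    counts = np.searchsorted(points, starts + window) - np.searchsorted(points, starts)
    return counts.var()

poisson = np.sort(rng.uniform(0, domain, size=n))                    # not hyperuniform
lattice = np.sort(np.arange(n) + rng.uniform(-0.25, 0.25, size=n))   # hyperuniform

window = 100.0
var_poisson = count_variance(poisson, window)  # grows like the window size
var_lattice = count_variance(lattice, window)  # stays O(1): boundary effects only
```

For the Poisson process the count variance is comparable to the window size (here about 100), while for the jittered lattice only the two window endpoints contribute, so the variance stays bounded as the window grows.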