

Learning Latent Variable Models: Efficient Algorithms and Applications 

Monday, December 2, 2019

12 p.m.

ISI seminar room, 1st floor

Matteo Ruffini
 Data Scientist at Skyscanner

Latent variable models are a wide class of probabilistic models characterized by the presence of hidden variables that influence the outcomes of observable data. Models belonging to this class include mixture models, latent Dirichlet allocation and hidden Markov models, and they are routinely used for machine learning tasks such as clustering, topic modeling and time-series analysis. The Expectation-Maximization (EM) method has for decades been the standard approach to learning these models, despite its well-known limitations: EM is slow and prone to learning suboptimal models. To overcome these issues, a variety of techniques based on the Method of Moments (MoM) have been proposed during the last few years; these methods are in general faster than EM - requiring a single pass over the data - and come with provable guarantees on learning accuracy in polynomial time.

During this talk, I will go through the theory of methods of moments for learning latent variable models. I will present their key concepts, identify their advantages and drawbacks with respect to traditional methods, and present some recent advancements aimed at improving their applicability to real-world data. I will also provide some examples of how methods of moments can be used in real-world scenarios, focusing on two areas of application: natural language processing - showing how they can be applied to efficiently learn topic models - and healthcare analytics - applying methods of moments to cluster patients into groups with homogeneous clinical profiles.

Short Bio:

Matteo Ruffini recently obtained a Ph.D. in artificial intelligence from BarcelonaTech (UPC), where he worked on methods for learning probabilistic latent variable models. His research focuses on tensor-based methods of moments - a family of techniques that makes it possible to learn high-dimensional models with short running times - and on applications to topic modeling and healthcare analytics. Besides his research activity, Matteo works as a Senior Data Scientist at Skyscanner, where he develops recommender systems that automatically suggest personalized trip solutions to Skyscanner users. Previously, he led a data science team at ToolsGroup - a software house providing AI solutions for supply chain optimization - where he applied machine learning techniques to sales forecasting and inventory planning.