Brighton Pavilion

10th Annual Conference of the International Speech Communication Association

ISCA Interspeech 2009 Brighton

Tutorials Day - Sunday 6 September 2009

T-2: Dealing with High Dimensional Data with Dimensionality Reduction

Presented by Neil D. Lawrence and Jon Barker

Outline

The tutorial sets out to transfer the latest developments in dimensionality reduction from machine learning to the speech processing community in a coherent and complete manner. Its specific aims are as follows:

  • Develop an intuition amongst participants about why high dimensional data is “special”.
  • Review principal component analysis as a standard linear approach to dimensionality reduction. Mention extensions in passing (CCA, factor analysis, LDA).
  • Build on principal component analysis to introduce the family of spectral approaches that has recently gained much attention in machine learning. Focus on their strengths and weaknesses.
  • Build on principal component analysis to introduce probabilistic approaches to dimensionality reduction such as the GTM and the GP-LVM. Introducing the GP-LVM will also involve an introduction to Gaussian processes.
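
As a rough illustration of the first aim, one well-known way in which high dimensional data behaves unusually is that pairwise distances concentrate: as the dimension grows, all points end up at nearly the same distance from one another. The sketch below is not taken from the tutorial material. It assumes isotropic Gaussian samples, and the function name `relative_distance_spread` is hypothetical.

```python
import numpy as np


def relative_distance_spread(dim, n_points=200, seed=0):
    """Return std/mean of pairwise Euclidean distances for Gaussian samples.

    Assumption: data are drawn from a standard isotropic Gaussian; real
    speech features will not follow this distribution exactly.
    """
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n_points, dim))
    # Squared pairwise distances from the Gram matrix.
    sq_norms = np.sum(X**2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T
    # Keep each unordered pair once (strict upper triangle).
    iu = np.triu_indices(n_points, k=1)
    dists = np.sqrt(np.maximum(sq_dists[iu], 0.0))
    return dists.std() / dists.mean()


spreads = {d: relative_distance_spread(d) for d in (2, 20, 200, 2000)}
for d, s in spreads.items():
    print(f"dim={d:5d}  std/mean of pairwise distances = {s:.3f}")
```

The ratio shrinks steadily as the dimension increases, which is one reason that intuitions formed in two or three dimensions can mislead when applied to high dimensional feature vectors.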

Dimensionality reduction is a standard component of the toolkit in any area of data modeling. Over the last decade algorithmic development in the area of dimensionality reduction has been rapid. Approaches such as Isomap, LLE, and maximum variance unfolding have extended the methodologies available to the practitioner. More recently, probabilistic dimensionality reduction techniques have been used with great success in modeling of human motion. How are all these approaches related? What are they useful for? In this tutorial our aim is to develop an understanding of high dimensional data and of the problems that arise in dealing with it. We will motivate the use of non-linear dimensionality reduction as a solution for these problems. The keystone that unifies the various approaches to non-linear dimensionality reduction is principal component analysis. We will show how it underpins spectral methods and attempt to cast spectral approaches within the same unifying framework. We will further build on principal component analysis to introduce probabilistic approaches to non-linear dimensionality reduction. These approaches have become increasingly popular in graphics and vision through the Gaussian Process Latent Variable Model. We will review the GP-LVM and also consider earlier approaches such as the Generative Topographic Mapping and Latent Density Networks.
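
One concrete sense in which principal component analysis underpins spectral methods can be sketched numerically: classical multidimensional scaling, which works only from the matrix of squared Euclidean distances between points, recovers the same low dimensional coordinates as PCA up to a sign flip per axis. The example below is a minimal sketch on synthetic data, not the tutorial's own code. The variable names and the choice of classical MDS as the representative spectral method are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, q = 50, 10, 2  # points, data dimension, latent dimension (illustrative)

# Synthetic correlated data (assumption: stands in for real features).
Y = rng.standard_normal((n, d)) @ rng.standard_normal((d, d))
Yc = Y - Y.mean(axis=0)

# PCA: project centred data onto the top-q eigenvectors of the covariance.
cov = Yc.T @ Yc / n
evals, evecs = np.linalg.eigh(cov)
top = np.argsort(evals)[::-1][:q]
X_pca = Yc @ evecs[:, top]

# Classical MDS: start from squared inter-point distances only.
sq_norms = np.sum(Y**2, axis=1)
D2 = sq_norms[:, None] + sq_norms[None, :] - 2.0 * Y @ Y.T
H = np.eye(n) - np.ones((n, n)) / n   # centring matrix
B = -0.5 * H @ D2 @ H                 # equals the centred Gram matrix
b_evals, b_evecs = np.linalg.eigh(B)
top_b = np.argsort(b_evals)[::-1][:q]
X_mds = b_evecs[:, top_b] * np.sqrt(b_evals[top_b])

# Eigenvectors are defined only up to sign, so align signs before comparing.
signs = np.sign(np.sum(X_pca * X_mds, axis=0))
X_mds_aligned = X_mds * signs

print("max abs difference:", np.max(np.abs(X_pca - X_mds_aligned)))
```

Methods such as Isomap can be read as keeping the eigendecomposition step but replacing the Euclidean distance matrix `D2` with one derived from a neighbourhood graph.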

A key focus of the tutorial will be the difference between the probabilistic approaches and the more commonly applied spectral approaches. In particular we will emphasise the distance preservation character of the probabilistic approaches: namely that local distances in the data are not necessarily preserved in the latent space. This contrasts with spectral algorithms, which typically aim to preserve such local distances. These different characteristics mean that probabilistic approaches complement the spectral approaches, but they bring their own range of associated problems, in particular local minima in the optimisation space. Heuristics for avoiding these local minima will also be discussed.

Speaker Biographies

Neil Lawrence is a Senior Research Fellow in the School of Computer Science at the University of Manchester, U.K. Prior to this appointment he was a Senior Lecturer in the Department of Computer Science at the University of Sheffield, U.K., where he was head of the Machine Learning Research Group. His main research interest is machine learning through probabilistic models. He is interested in both the algorithmic side of these models and their application in areas such as bioinformatics, speech, vision and graphics. His PhD was awarded in 2000 from the Computer Lab at the University of Cambridge. He then spent a year at Microsoft Research, Cambridge before moving to Sheffield in 2001 and then to Manchester in 2007.

Jon Barker obtained a degree in Electrical and Information Sciences from the University of Cambridge, followed by a PhD in Computer Science from the University of Sheffield in 1998. After graduating he spent a year at the Institut Communication Parlee, Grenoble, studying audio-visual speech perception before returning to the Speech and Hearing Group at Sheffield, where he is now a Senior Lecturer. His research interests centre around human and machine approaches to the organisation of auditory and auditory-visual scenes. In particular, he is interested in how statistical models of individual sound sources – e.g. models of speech – can be used to answer questions about complex scenes where *multiple* sound sources are simultaneously active. Since completing his PhD he has authored or coauthored over 40 papers in this area.