
Marginally Interesting: On Relevant Dimensions in Kernel Feature Spaces

I finally found some time to write a short overview article about our latest JMLR paper. It discusses some interesting insights into the dimensionality of your data in a kernel feature space when you only consider the information relevant to your supervised learning problem. The paper also presents a method for estimating this dimensionality for a given kernel and data set.

The overview also contains a few lines of matlab code with which you can have a look at the effect yourself. It all boils down to pictures like these:

What you see here is the contribution of the individual kernel PCA components to the Y samples, split into the smooth part (red) and the noise (blue), on a toy data set, of course. The kernel PCA components are sorted by decreasing variance. What you can see is that the smooth part is concentrated in the leading kernel PCA components, while the later components only contribute to the noise. This means that even in an infinite-dimensional feature space, the relevant information is contained in a low-dimensional subspace.
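If you want to play with the effect before reading the overview, here is a rough sketch along those lines (this is not the matlab code from the article; the RBF kernel, its width, and the sine-plus-noise toy data are just assumed choices). The idea is to project the smooth part of Y and the noise separately onto the kernel PCA directions and plot the magnitudes per component.

```matlab
% Minimal sketch (not the code from the overview): how the smooth part of Y
% concentrates in the leading kernel PCA components while noise does not.

n = 200;
X = sort(rand(n, 1));            % 1-D toy inputs in [0, 1]
f = sin(2 * pi * X);             % smooth target function (assumed)
noise = 0.3 * randn(n, 1);       % additive noise (assumed level)
Y = f + noise;

% RBF kernel matrix with an assumed width
w = 0.1;
D = bsxfun(@minus, X, X').^2;    % squared pairwise distances
K = exp(-D / (2 * w^2));

% centered kernel matrix and its eigendecomposition = kernel PCA
H = eye(n) - ones(n) / n;
Kc = H * K * H;
[U, L] = eig((Kc + Kc') / 2);    % symmetrize for numerical stability
[~, idx] = sort(diag(L), 'descend');
U = U(:, idx);                   % components sorted by decreasing variance

% contribution of each kernel PCA component to the smooth part and the noise
cf = abs(U' * f);
cn = abs(U' * noise);

semilogy(1:n, cf, 'r', 1:n, cn, 'b');
xlabel('kernel PCA component');
ylabel('contribution to Y');
legend('smooth part', 'noise');
```

With a smooth target like this, the red contributions should fall off after the first few components while the blue ones stay spread out over all of them, which is exactly the effect described above.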

Posted at 2008-11-17 10:06:00 +0100
