Articles on machine learning

VC dimension for Rotatable Rectangles

It can be shown that the VC dimension of rotatable rectangles is 7. The problem is that I cannot see how to approach the solution. So far I have used brute force for this kind of problem: drawing points in different configurations and checking whether the hypothesis class shatters them. In this case the heptagon is […]

Why use the kernel trick in an SVM as opposed to just transforming the data?

Why use the kernel trick in a support vector machine as opposed to just transforming the data and then using a linear classifier? Certainly, we’ll approximately double the amount of memory required to hold the data, original plus transformed, but beyond that it seems like the amount of computation remains about the same. What are […]
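As a minimal sketch of the trade-off the question asks about, the degree-2 polynomial kernel $k(x,z) = (x^Tz)^2$ returns the same number as an inner product of explicitly transformed vectors, without ever building those vectors. The feature map `phi` and the helper names are illustrative assumptions, not part of the original question.

```python
import itertools
import math

def phi(x):
    """Explicit degree-2 feature map (assumed for illustration):
    all ordered products x_i * x_j, i.e. d**2 features for d inputs."""
    return [xi * xj for xi, xj in itertools.product(x, repeat=2)]

def dot(u, v):
    """Plain inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))

def poly2_kernel(x, z):
    """Kernel trick: one inner product in the original space, then a square."""
    return dot(x, z) ** 2

x = [1.0, 2.0, 3.0]
z = [0.5, -1.0, 2.0]

explicit = dot(phi(x), phi(z))   # works in the 9-dimensional feature space
implicit = poly2_kernel(x, z)    # works in the original 3-dimensional space
print(explicit, implicit, len(phi(x)))
```

For a degree-2 map the explicit route costs $d^2$ features per point rather than $d$, so the overhead is more than a doubling of memory, and for kernels such as the Gaussian RBF the feature space is infinite-dimensional, so the transformed data cannot be stored at all.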

Machine Learning: Linear Regression models

I’m currently in a course learning about neural networks and machine learning, and I came across these two formulas on this textbook page on linear regression: 1) $y(x) = a + bx$ and 2) $y(x) = w^{T}\phi(x)$. What is the difference between these two formulas? How are they related? They seem to perform the same […]
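One way to relate the two formulas is to pick the basis $\phi(x) = (1, x)^T$ and the weights $w = (a, b)^T$, in which case formula 2 reduces to formula 1. The sketch below checks this numerically; the function names and the sample values of `a`, `b`, and `xs` are made up for illustration.

```python
def phi(x):
    """Basis functions assumed for this sketch: a constant term and x itself."""
    return [1.0, x]

def y_simple(x, a, b):
    """Formula 1: y(x) = a + b x."""
    return a + b * x

def y_basis(x, w, basis=phi):
    """Formula 2: y(x) = w^T phi(x), for any list-valued basis function."""
    return sum(w_i * f_i for w_i, f_i in zip(w, basis(x)))

a, b = 2.0, 0.5          # hypothetical intercept and slope
w = [a, b]               # the same parameters packed into a weight vector
xs = [-1.0, 0.0, 3.0]

simple = [y_simple(x, a, b) for x in xs]
general = [y_basis(x, w) for x in xs]
print(simple, general)
```

Swapping in a different `basis`, for example one returning `[1.0, x, x * x]` together with a three-element `w`, gives a model that is nonlinear in $x$ but still linear in the weights, which is what the second formula adds over the first.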

Help in understanding the derivation of the posterior in Gaussian processes

According to the textbook Gaussian Processes for Machine Learning, it is given that \begin{align*} p(w\mid X,y) &\propto \exp\left(-\frac{1}{2\sigma_n^2}(y-X^Tw)^T(y-X^Tw)\right)\exp\left(-\frac{1}{2}w^T\Sigma_{p}^{-1}w\right) \\ &\propto \exp\left(-\frac{1}{2}(w-\bar{w})^T\left(\frac{1}{\sigma_n^2}XX^T + \Sigma_p^{-1}\right)(w-\bar{w})\right) \end{align*} where $\bar{w} = \sigma_n^{-2}(\sigma_n^{-2}XX^T + \Sigma_p^{-1})^{-1}Xy$. I can’t really understand how the first step leads to the second step. Can someone kindly show me how the derivation is done? Thanks.
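One way to see the step is to complete the square in $w$. The sketch below introduces the shorthand $A$ and $b$ (not in the excerpt) and assumes $\Sigma_p$ is symmetric, so that $A$ is symmetric as well; every term that does not depend on $w$ is absorbed into the proportionality constant.

```latex
\begin{align*}
&-\frac{1}{2\sigma_n^2}(y-X^Tw)^T(y-X^Tw) - \frac{1}{2}w^T\Sigma_p^{-1}w \\
&\quad= -\frac{1}{2}\left( w^T\left(\sigma_n^{-2}XX^T + \Sigma_p^{-1}\right)w
      - 2\sigma_n^{-2}w^TXy + \sigma_n^{-2}y^Ty \right) \\
&\quad= -\frac{1}{2}\left( w^TAw - 2w^Tb \right) + \text{const},
      \qquad A = \sigma_n^{-2}XX^T + \Sigma_p^{-1},\quad b = \sigma_n^{-2}Xy \\
&\quad= -\frac{1}{2}(w - A^{-1}b)^TA\,(w - A^{-1}b) + \frac{1}{2}b^TA^{-1}b + \text{const}.
\end{align*}
```

Neither $\frac{1}{2}b^TA^{-1}b$ nor the constant depends on $w$, so exponentiating leaves $\exp\left(-\frac{1}{2}(w-\bar{w})^TA(w-\bar{w})\right)$ with $\bar{w} = A^{-1}b = \sigma_n^{-2}(\sigma_n^{-2}XX^T + \Sigma_p^{-1})^{-1}Xy$.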

A concise guide to basic probability

I am looking for a reference to learn probability theory which satisfies the following criteria: Not beginner oriented, but starting from the basics. That means it can assume the reader has some mathematical maturity, and analysis, linear algebra, etc. can be assumed from page one. On the other hand, it should assume no knowledge […]

Deriving the cost function using MLE: Why use the log function?

I am learning machine learning from Andrew Ng’s open-class notes and coursera.org. I am trying to understand how the cost function for logistic regression is derived. I will start with the cost function for linear regression and then get to my question about logistic regression. (Btw a similar question was asked here, which answers […]

How do Gaussian mixture models work?

I am given an example: Suppose 1000 observations are drawn from $N(0,1)$ and $N(5,2)$ with mixing parameters $\pi_{1}=0.2$ and $\pi_{2}=0.8$ respectively. Suppose we only know $\sigma$ and want to estimate $\mu$ and $\pi$. How does one go about using Gaussian Mixture models to estimate these parameters? I know I have to use the EM algorithm […]
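A minimal EM sketch for this setup, with the standard deviations held fixed and only the means and mixing weights updated, could look like the following. It reads the second parameter of $N(5,2)$ as a standard deviation, which is an assumption (use $\sqrt{2}$ instead if it is a variance); the function names, the seed, the initialisation, and the iteration count are also illustrative choices.

```python
import math
import random

def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def em_known_sigma(data, sigmas, n_iter=200):
    """EM for a 1-D Gaussian mixture with known sigmas (needs >= 2 components).
    Returns estimated means and mixing weights."""
    k = len(sigmas)
    lo, hi = min(data), max(data)
    mus = [lo + (hi - lo) * j / (k - 1) for j in range(k)]  # crude initial means
    pis = [1.0 / k] * k                                     # equal initial weights
    for _ in range(n_iter):
        # E-step: responsibility of component j for each observation.
        resp = []
        for x in data:
            weighted = [pis[j] * normal_pdf(x, mus[j], sigmas[j]) for j in range(k)]
            total = sum(weighted)
            resp.append([w / total for w in weighted])
        # M-step: responsibility-weighted means and average responsibilities.
        for j in range(k):
            n_j = sum(r[j] for r in resp)
            mus[j] = sum(r[j] * x for r, x in zip(resp, data)) / n_j
            pis[j] = n_j / len(data)
    return mus, pis

# Simulate the example: 1000 draws, weights 0.2 and 0.8.
rng = random.Random(0)
true_mus, sigmas, true_pis = [0.0, 5.0], [1.0, 2.0], [0.2, 0.8]
data = []
for _ in range(1000):
    j = 0 if rng.random() < true_pis[0] else 1
    data.append(rng.gauss(true_mus[j], sigmas[j]))

mu_hat, pi_hat = em_known_sigma(data, sigmas)
print(mu_hat, pi_hat)
```

Because $\sigma$ is known, the M-step has no variance update; in the general case each component's variance would also be re-estimated from the same responsibilities.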

How does one derive Radial Basis Function (RBF) Networks as the smoothest interpolation of points?

I was reading/watching Caltech’s ML course, and it said that one can derive the Gaussian RBF kernel from the solution to the smoothest interpolation that minimizes squared loss, i.e. one can derive the predictor/interpolator: $$ f(x) = \sum^{K}_{k=1} c_k \exp( -\beta_k \| x - w_k \|^2 )$$ from the empirical risk minimizer with a smoothness regularizer: […]
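To make the notation concrete, the predictor $f(x)$ can be evaluated as below. This only computes the formula for given centers $w_k$, coefficients $c_k$, and widths $\beta_k$; it does not perform the derivation from the regularized risk, and the function name and sample values are hypothetical.

```python
import math

def rbf_predict(x, centers, coeffs, betas):
    """Evaluate f(x) = sum_k c_k * exp(-beta_k * ||x - w_k||^2)."""
    total = 0.0
    for w_k, c_k, beta_k in zip(centers, coeffs, betas):
        sq_dist = sum((x_i - w_i) ** 2 for x_i, w_i in zip(x, w_k))
        total += c_k * math.exp(-beta_k * sq_dist)
    return total

centers = [[0.0, 0.0], [1.0, 1.0]]   # hypothetical centers w_k
coeffs = [1.0, 2.0]                  # hypothetical coefficients c_k
betas = [1.0, 0.5]                   # hypothetical widths beta_k

value = rbf_predict([0.0, 0.0], centers, coeffs, betas)
print(value)
```

At the first center the first term contributes exactly $c_1$, and the second term decays with the squared distance to $w_2$, which is the locality the Gaussian form encodes.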

What is an example of an SVM kernel where one implicitly uses an infinite-dimensional space?

Reading the Wikipedia article about SVMs, I noticed: “More formally, a support vector machine constructs a hyperplane or set of hyperplanes in a high- or infinite-dimensional space, which can be used for classification, regression, or other tasks.” I continued with “A Tutorial on Support Vector Machines for Pattern Recognition” by Christopher J. C. Burges and stumbled […]
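A commonly cited example is the Gaussian (RBF) kernel. In one dimension, $k(x,z) = \exp(-\tfrac{1}{2}(x-z)^2)$ equals $\sum_{n=0}^{\infty} \phi_n(x)\phi_n(z)$ with $\phi_n(x) = e^{-x^2/2}\, x^n / \sqrt{n!}$, which follows from the Taylor series of $e^{xz}$, so the implicit feature vector has infinitely many coordinates. The sketch below truncates that series and compares it with the exact kernel value; the function names and test points are assumptions made for illustration.

```python
import math

def rbf_kernel(x, z):
    """Exact 1-D Gaussian kernel with unit bandwidth."""
    return math.exp(-0.5 * (x - z) ** 2)

def truncated_features(x, n_terms):
    """First n_terms coordinates of the infinite feature map phi_n(x)."""
    scale = math.exp(-0.5 * x * x)
    return [scale * x ** n / math.sqrt(math.factorial(n)) for n in range(n_terms)]

def truncated_kernel(x, z, n_terms):
    """Inner product of the truncated feature vectors."""
    fx = truncated_features(x, n_terms)
    fz = truncated_features(z, n_terms)
    return sum(a * b for a, b in zip(fx, fz))

x, z = 0.8, -0.3                      # hypothetical inputs
exact = rbf_kernel(x, z)
errors = {m: abs(truncated_kernel(x, z, m) - exact) for m in (2, 5, 20)}
print(exact, errors)
```

No finite truncation reproduces the kernel exactly for all inputs, which is why the kernel is evaluated directly rather than through an explicit feature map.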

First-course linear algebra books that start with basic algebra?

I’m 30 years old, and the only math I can remember from college is basic algebra and some probabilities. Next month, I have a machine learning project I’d like to work on, but I’ll need a solid footing in linear algebra first. Are there any books or tutorials that can take me from the spotty […]