It can be shown that the VC dimension of rotatable rectangles is 7. The problem is that I cannot understand how to approach the solution. So far I have used brute force to solve this kind of problem: I draw points in different shapes and check whether the hypothesis class shatters them. In this case the heptagon is […]

Why use the kernel trick in a support vector machine as opposed to just transforming the data and then using a linear classifier? Certainly, we’ll approximately double the amount of memory required to hold the data, original plus transformed, but beyond that it seems like the amount of computation remains about the same. What are […]
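
One way to see the trade-off is to compare an explicit feature map with its kernel on a small case. The sketch below is only an illustration: the degree-2 map `phi` and the function `poly_kernel` are names chosen here, not part of the question. For this kernel the two routes give the same inner product, but the explicit feature space grows quickly with input dimension and degree, and for the Gaussian (RBF) kernel it is infinite-dimensional, so the transformed data cannot simply be stored.

```python
import numpy as np

def phi(x):
    """Explicit degree-2 feature map for a 2-D input (illustrative choice)."""
    x1, x2 = x
    return np.array([x1**2, np.sqrt(2) * x1 * x2, x2**2])

def poly_kernel(x, z):
    """Homogeneous degree-2 polynomial kernel k(x, z) = (x . z)^2."""
    return np.dot(x, z) ** 2

x = np.array([1.0, 2.0])
z = np.array([3.0, -1.0])

# Route 1: transform both points, then take an ordinary dot product.
explicit = np.dot(phi(x), phi(z))

# Route 2: evaluate the kernel directly in the original 2-D space.
implicit = poly_kernel(x, z)

print(explicit, implicit)
```

The kernel route never materialises `phi(x)`, which is what matters once the feature space is very large or infinite.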

I’m currently in a course learning about neural networks and machine learning, and I came across these two formulas in this textbook page on linear regression: 1) $y(x) = a + bx$ and 2) $y(x) = w^{T}\phi(x)$ What is the difference between these two formulas? How are they related? They seem to perform the same […]
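
A small numerical sketch of how the two formulas can coincide, assuming the basis function is chosen as $\phi(x) = (1, x)^T$ and the weights as $w = (a, b)^T$. These choices and the values of `a` and `b` are assumptions made for illustration; the textbook may use other basis functions.

```python
import numpy as np

def phi(x):
    """Basis functions (1, x): a constant term plus the raw input."""
    return np.array([1.0, x])

a, b = 0.5, 2.0          # illustrative intercept and slope
w = np.array([a, b])     # weight vector matching phi above

xs = np.linspace(-1.0, 1.0, 5)
line = a + b * xs                              # formula 1: y = a + b x
basis = np.array([w @ phi(x) for x in xs])     # formula 2: y = w^T phi(x)

print(line)
print(basis)
```

With a different `phi`, for example polynomial or Gaussian basis functions, the second formula describes models that are nonlinear in $x$ but still linear in $w$.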

According to the textbook Gaussian Process in Machine Learning, it is given that \begin{align*} p(w\mid X,y) &\propto \exp\left(-\frac{1}{2\sigma_n^2}(y-X^Tw)^T(y-X^Tw)\right)\exp\left(-\frac{1}{2}w^T\Sigma_{p}^{-1}w\right) \\ &\propto \exp\left(-\frac{1}{2}(w-\bar{w})^T\left(\frac{1}{\sigma_n^2}XX^T + \Sigma_p^{-1}\right)(w-\bar{w})\right) \end{align*} where $\bar{w} = \sigma_n^{-2}(\sigma_n^{-2}XX^T + \Sigma_p^{-1})^{-1}Xy$. I can’t really understand how the first step leads to the second step. Can someone kindly show me how the derivation is done? Thanks
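
A sketch of the step in question, by completing the square in $w$. The shorthand $A = \sigma_n^{-2}XX^T + \Sigma_p^{-1}$ and $b = \sigma_n^{-2}Xy$ is introduced here for convenience and does not appear in the excerpt; $A$ is assumed symmetric and invertible. Adding the two exponents and using $y^TX^Tw = w^TXy$ (a scalar) gives \begin{align*} -\frac{1}{2\sigma_n^2}(y-X^Tw)^T(y-X^Tw) - \frac{1}{2}w^T\Sigma_p^{-1}w &= -\frac{1}{2}\left(w^TAw - 2w^Tb\right) - \frac{1}{2\sigma_n^2}y^Ty \\ &= -\frac{1}{2}(w-A^{-1}b)^TA(w-A^{-1}b) + \frac{1}{2}b^TA^{-1}b - \frac{1}{2\sigma_n^2}y^Ty. \end{align*} The last two terms do not depend on $w$, so they are absorbed into the proportionality constant, leaving the quadratic form with $\bar{w} = A^{-1}b = \sigma_n^{-2}(\sigma_n^{-2}XX^T + \Sigma_p^{-1})^{-1}Xy$.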

I am looking for a reference to learn probability theory which satisfies the following criteria: Not beginner oriented, but starting from the basics. That is, it can assume the reader has some mathematical maturity, and analysis, linear algebra, etc. can be assumed from page one. On the other hand, it should assume no knowledge […]

I am learning machine learning from Andrew Ng’s open-class notes and coursera.org. I am trying to understand how the cost function for the logistic regression is derived. I will start with the cost function for linear regression and then get to my question with logistic regression. (Btw a similar question was asked here, which answers […]

I am given an example: Suppose 1000 observations are drawn from $N(0,1)$ and $N(5,2)$ with mixing parameters $\pi_{1}=0.2$ and $\pi_{2}=0.8$ respectively. Suppose we only know $\sigma$ and want to estimate $\mu$ and $\pi$. How does one go about using Gaussian Mixture models to estimate these parameters? I know I have to use the EM algorithm […]
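
A minimal EM sketch for this setup, under stated assumptions: $N(5,2)$ is read as variance 2 (standard deviation $\sqrt{2}$), the component standard deviations are treated as known, and the function name `em_known_sigma`, the starting values, and the iteration count are all illustrative choices rather than anything from the example.

```python
import numpy as np

def em_known_sigma(x, sigma, mu_init, pi_init, n_iter=200):
    """EM for a 1-D Gaussian mixture with known standard deviations.

    Only the means and mixing weights are updated.
    """
    x = np.asarray(x, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    mu = np.asarray(mu_init, dtype=float).copy()
    pi = np.asarray(pi_init, dtype=float).copy()
    for _ in range(n_iter):
        # E-step: responsibilities r[i, k] proportional to pi_k * N(x_i | mu_k, sigma_k^2).
        # The shared constant -0.5*log(2*pi) cancels in the normalisation, so it is omitted.
        log_w = np.log(pi) - np.log(sigma) - 0.5 * ((x[:, None] - mu) / sigma) ** 2
        log_w -= log_w.max(axis=1, keepdims=True)   # stabilise the exponentials
        r = np.exp(log_w)
        r /= r.sum(axis=1, keepdims=True)
        # M-step: responsibility-weighted means and average responsibilities.
        nk = r.sum(axis=0)
        mu = (r * x[:, None]).sum(axis=0) / nk
        pi = nk / len(x)
    return mu, pi

# Simulate the example: 1000 draws, weights 0.2 and 0.8.
rng = np.random.default_rng(0)
n = 1000
sigma = np.array([1.0, np.sqrt(2.0)])   # assumption: N(5, 2) means variance 2
from_first = rng.random(n) < 0.2
x = np.where(from_first,
             rng.normal(0.0, sigma[0], n),
             rng.normal(5.0, sigma[1], n))

mu_hat, pi_hat = em_known_sigma(x, sigma, mu_init=[-1.0, 6.0], pi_init=[0.5, 0.5])
print(mu_hat, pi_hat)
```

Because the variances are fixed, the M-step has no variance update; EM can still converge to a poor local optimum from bad starting values, so trying several initialisations is a reasonable precaution.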

I was reading/watching Caltech’s ML course and it said that one could derive the RBF Gaussian kernel from the solution to smoothest interpolation that minimizes squared loss, i.e. one can derive the predictor/interpolator: $$ f(x) = \sum^{K}_{k=1} c_k \exp( -\beta_k \| x - w_k \|^2 )$$ from the Empirical Risk minimizer with a smoothest Regularizer: […]

Reading the Wikipedia article about SVMs, I noticed More formally, a support vector machine constructs a hyperplane or set of hyperplanes in a high- or infinite-dimensional space, which can be used for classification, regression, or other tasks. I continued with “A Tutorial on Support Vector Machines for Pattern Recognition” by Christopher JC Burges and stumbled […]

I’m 30 years old, and the only math I can remember from college is basic algebra and some probabilities. Next month, I have a machine learning project I’d like to work on, but I’ll need a solid footing in linear algebra first. Are there any books or tutorials that can take me from the spotty […]
