- Is there a danger to a human that is linked to ground wire of home electrical system?
- How do jets/airplane cold start?
- Stop free-roaming dogs from entering my property
- How can I effectively patch up holes in insect screen?
- How to add a Headrest to this chair?
- How to get rid of print from fabric clothing?
- How can I straighten a twisted necktie?
- What is the best way to clean up the spider webs?
- Prevent rubber bands from ageing?
- Efficient life hack to peel a jicama
- Inflation, Future, And Value of Money - deciding to buy a house
- What does it mean for a technology to have a $|\rho|<1$?
- How to use unread-command-events variable?
- Could a gas giant with layer similar to Earth atmosphere exist?
- What change to laws of nature could disable modern technology, but allow nature as we know it? Or is it impossible?
- How would battle strategies change if airplanes had much more limited ranges?
- Is Quora a disruptor for StackExchange?
- Where should i find a co-founder and teammates for my startup?
- Why a girl is forced to get married with in an year after her Mothers death
- Caching Issue “unserialize(): Error at offset 0 of 24 bytes”
K-means: why reduce dimensions first?
I'm a bit confused about the usefulness of reducing dimensions before doing a k-means clustering.
Suppose you want to apply k-means to a set points $(x_i)$ with high dimension. You want to minimize the cost function $\sum_i \|x_i-c_i\|^2$ where $c_i$ is the center of the cluster $x_i$ belongs to.
You have basically two methods:
A: do a k-means (Lloyd) directly on $(x_i)$
B: reduce the number of dimensions with some dimensional reduction method (such as SVD/PCA), and then apply k-means to the points with reduced dimensions
On the one hand, A is unlikely to find the global minimum, replications will help getting closer to it. It might have a high computational cost due to handling high dimension vectors and many replications.
On the other hand, B is more likely to get close to the global minimum (or even reach it) with fewer replications. But the minimum on the reduced version is not the original minimum (it is however known to be close to it). There is of course an