K-means: why reduce dimensions first?
I'm a bit confused about the usefulness of reducing dimensions before doing a k-means clustering.
Suppose you want to apply k-means to a set of high-dimensional points $(x_i)$. You want to minimize the cost function $\sum_i \|x_i-c_i\|^2$, where $c_i$ is the center of the cluster that $x_i$ belongs to.
You have basically two methods:
A: do a k-means (Lloyd) directly on $(x_i)$
B: reduce the number of dimensions with some dimensionality reduction method (such as SVD/PCA), and then apply k-means to the reduced points
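To make the comparison concrete, here is a minimal sketch of both methods using only NumPy. Everything in it is an assumption made for illustration: the synthetic data (three Gaussian blobs in 50 dimensions), the helper names `lloyd`, `pca_reduce` and `cost`, the number of restarts, and the choice of 2 retained components. The key point is that the labels found by B are scored with the cost function in the original space, so the two numbers are comparable.

```python
import numpy as np


def lloyd(X, k, n_init=5, n_iter=50, seed=0):
    """Plain Lloyd k-means with random restarts; returns the best labels found."""
    rng = np.random.default_rng(seed)
    best_labels, best_cost = None, np.inf
    for _ in range(n_init):
        # Initialise the centers on k distinct data points (fancy indexing copies).
        centers = X[rng.choice(len(X), size=k, replace=False)]
        for _ in range(n_iter):
            # Assignment step: index of the nearest center for every point.
            d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
            labels = d2.argmin(axis=1)
            # Update step: move each non-empty cluster's center to its mean.
            for j in range(k):
                if np.any(labels == j):
                    centers[j] = X[labels == j].mean(axis=0)
        # Score this restart and keep it if it is the best so far.
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        run_cost = d2.min(axis=1).sum()
        if run_cost < best_cost:
            best_labels, best_cost = d2.argmin(axis=1), run_cost
    return best_labels


def pca_reduce(X, m):
    """Project the centered data onto its top m principal directions via SVD."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:m].T


def cost(X, labels):
    """Sum of squared distances to the cluster means, measured in the space of X."""
    return sum(((X[labels == j] - X[labels == j].mean(axis=0)) ** 2).sum()
               for j in np.unique(labels))


# Hypothetical test data: 3 blobs of 100 points each in 50 dimensions.
rng = np.random.default_rng(42)
true_centers = rng.normal(scale=5.0, size=(3, 50))
X = np.vstack([c + rng.normal(size=(100, 50)) for c in true_centers])

labels_a = lloyd(X, k=3)                    # method A: k-means on the raw points
labels_b = lloyd(pca_reduce(X, m=2), k=3)   # method B: k-means after PCA

# Both partitions are evaluated with the original, full-dimensional cost.
cost_a = cost(X, labels_a)
cost_b = cost(X, labels_b)
print(f"A: {cost_a:.1f}   B: {cost_b:.1f}")
```

On well-separated blobs like these the two costs will usually be close; the sketch is only meant to show how to compare A and B on equal terms, not to settle which one is better in general.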
On the one hand, A is unlikely to find the global minimum, although replications help it get closer. It can also have a high computational cost, since it handles high-dimensional vectors and needs many replications.
On the other hand, B is more likely to get close to the global minimum (or even reach it) with fewer replications. But the minimum on the reduced version is not the original minimum (it is however known to be close to it). There is of course an