# K-means Clustering

The other algorithm we will cover is the k-means algorithm. Unlike hierarchical clustering, k-means requires us to choose the number of clusters (k) beforehand.

The basic algorithm for k-means clustering is as follows:

1. Choose a value of k, the number of clusters we want to end up with.
2. Select k starting points to serve as the initial cluster means.
3. For each data point, find the closest mean vector and assign the point to the corresponding cluster.
4. For each cluster, update its mean vector according to the current assignments.
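The procedure described above can be sketched in a few lines of plain Python. This is a minimal illustration rather than a reference implementation: the function name `kmeans`, the `max_iters` and `seed` parameters, the choice of random data points as starting means, and the toy data are all assumptions made for this example.

```python
import random
from math import dist  # Euclidean distance, available in Python 3.8+


def kmeans(points, k, max_iters=100, seed=0):
    """Cluster `points` (a list of equal-length tuples) into k groups."""
    rng = random.Random(seed)
    # Initialization (assumed strategy): use k distinct data points as starting means.
    means = rng.sample(points, k)
    labels = None

    for _ in range(max_iters):
        # Assignment: give every point the index of its closest mean.
        new_labels = [
            min(range(k), key=lambda j: dist(p, means[j])) for p in points
        ]
        # Convergence: stop once no point changes cluster.
        if new_labels == labels:
            break
        labels = new_labels

        # Update: move each mean to the centroid of its assigned points.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old mean if a cluster ends up empty
                means[j] = tuple(sum(c) / len(members) for c in zip(*members))

    return means, labels


# Toy data (hypothetical): two well-separated groups of three points each.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
means, labels = kmeans(points, k=2)
print(labels)
print(means)
```

On this toy data the first three points should share one label and the last three the other. Which group receives label 0 depends on the random starting means.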

![](https://gblobscdn.gitbook.com/assets%2F-MY9eKRi9g1FeDj7RFjh%2F-MYUWhbH2cp3fX78Hvhv%2F-MYUX4yQl7FeuZVltErc%2Fkmeans.png?alt=media\&token=f19ab96b-dbc7-42c1-81c6-d5ceea48e613)

We keep repeating the last two steps (assignment and update) until a stopping criterion is met. Unlike hierarchical clustering, k-means does not finish after a number of steps that we can predict in advance. It can stop at convergence, when the algorithm no longer reassigns any points, or, if convergence is slow, at a user-defined maximum number of iterations. This contrasts with hierarchical clustering, which has a predictable termination step (when everything is inside one cluster). Additionally, the k-means algorithm may produce different outcomes depending on how we choose the initial k points. Here is an animation that shows how k-means clustering behaves.

![](https://gblobscdn.gitbook.com/assets%2F-MY9eKRi9g1FeDj7RFjh%2F-MYUWhbH2cp3fX78Hvhv%2F-MYUX9iD7AOLYLYNsUIA%2Fgraph.png?alt=media\&token=ad0e4ed0-ea9a-41d9-8517-48350fd14e8b)
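To make the dependence on the initial k points concrete, here is a small sketch that runs the same assign/update loop from two different hand-picked starting positions. The helper names `lloyd` and `wcss`, the four-point rectangle, and the starting means are all assumptions for illustration. The within-cluster sum of squares is a common way to compare results, although it is not discussed in the text above.

```python
from math import dist  # Euclidean distance, available in Python 3.8+


def lloyd(points, means, max_iters=100):
    """Run the assign/update loop starting from the given means."""
    means = list(means)
    labels = None
    for _ in range(max_iters):
        # Assignment: index of the closest mean for every point.
        new_labels = [
            min(range(len(means)), key=lambda j: dist(p, means[j]))
            for p in points
        ]
        if new_labels == labels:  # converged: nothing was reassigned
            break
        labels = new_labels
        # Update: recompute each mean from its current members.
        for j in range(len(means)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                means[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return means, labels


def wcss(points, means, labels):
    """Within-cluster sum of squared distances (lower means tighter clusters)."""
    return sum(dist(p, means[lab]) ** 2 for p, lab in zip(points, labels))


# Hypothetical data: corners of a wide, short rectangle.
points = [(0.0, 0.0), (0.0, 1.0), (4.0, 0.0), (4.0, 1.0)]

# Two hypothetical initializations: one mean per side vs. both on the left side.
for start in ([(0.0, 0.0), (4.0, 0.0)], [(0.0, 0.0), (0.0, 1.0)]):
    means, labels = lloyd(points, start)
    print(start, "->", labels, wcss(points, means, labels))
```

With the first start the loop settles on a left/right split (sum of squares 1.0). With the second it settles on a top/bottom split (sum of squares 16.0) and never leaves it, even though the left/right split is clearly tighter. A common remedy is to run k-means several times from different starting points and keep the best result.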

## Reference Used

<https://zhonglab.gitbook.io/3dgenome/appendix-homework/students-note/a-brief-introduction-to-machine-learning#k-means-clustering>

