R

A deep dive into partitioning around medoids

Series: Kmeans and Its Variants

In this final article in my mini-series on k-means and its variants, I will talk about the k-medoids algorithm, also commonly called partitioning around medoids (PAM). It has the beauty of being basically deterministic and of finding very good solutions reliably.

How to cluster noisy data sets

Series: Kmeans and Its Variants

Real-world data sets often come with many outliers that you might not be able to remove completely during the data cleanup phase. If you have run into this problem, I want to introduce you to the k-medians algorithm. By using the median instead of the mean as the cluster center, together with a more robust dissimilarity metric, it is much less sensitive to outliers than k-means.
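
To make the outlier argument concrete, here is a minimal sketch in Python (standard library only) that compares the two center updates on a single one-dimensional cluster. The data values and variable names are made up for illustration and do not come from the article itself.

```python
from statistics import mean, median

# Hypothetical 1-D cluster: five well-behaved points plus one outlier.
points = [1.0, 2.0, 3.0, 4.0, 5.0, 100.0]

# k-means style update: the center is the arithmetic mean of the members.
mean_center = mean(points)

# k-medians style update: the center is the median of the members.
median_center = median(points)

# Total absolute deviation of the points from a candidate center.
# The median minimizes this quantity in one dimension.
def total_abs_dev(center):
    return sum(abs(p - center) for p in points)

print("mean center:  ", mean_center)
print("median center:", median_center)
print("abs. deviation at mean:  ", total_abs_dev(mean_center))
print("abs. deviation at median:", total_abs_dev(median_center))
```

With these numbers the single outlier drags the mean to roughly 19.2, far from the bulk of the data, while the median stays at 3.5.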