Yet another article on YAML

Introduction to the XML format

Series: Common File Formats

All you need to know about JSON

Series: Common File Formats

A deep dive into partitioning around medoids

Series: Kmeans and Its Variants

In this final article in my mini-series on k-means and its variants, I will talk about the k-medoids algorithm, also commonly called partitioning around medoids (PAM). It has the beauty of being basically deterministic and of finding very good solutions reliably.

How to cluster noisy data sets

Series: Kmeans and Its Variants

Real-world data sets often come with many outliers that you might not be able to remove completely during the data cleanup phase. If you have run into this problem, I want to introduce you to the k-medians algorithm. By using the median instead of the mean, and using a more robust dissimilarity metric, it is much less sensitive to outliers.

The k-means++ algorithm to kick start your initialization

Series: Kmeans and Its Variants

k-means is a very simple and ubiquitous clustering algorithm. But quite often it does not work on your problem, for example because the initialization is bad. Fortunately, there is an improved initialization method, k-means++, which can help alleviate this problem.

A deep dive into kmeans

Series: Kmeans and Its Variants

A simple framework for performance metrics

The list of performance metrics is seemingly never-ending. Especially if you are new to data science, you can easily feel stranded in an ocean of choices. Learn how the metrics connect to each other and how you can use those connections to choose the best metric for your problem and model.

The Dendrite Nanomap

Synapses are at the center of neuronal communication, yet they are incredibly difficult to study. During my PhD I created a comprehensive, quantitative model of the postsynapse at the nanometer scale.

Data Science