What is a dendrogram, and how is it used in hierarchical clustering?

A dendrogram is a tree-like diagram that shows the order in which clusters are formed by a hierarchical clustering algorithm. The individual data points are laid out along the x-axis, and branches connect each observation to the cluster it is merged into. The y-axis shows the distance between the two clusters (or observations) that are joined at each step of the process. Because hierarchical clustering does not require the user to specify the number of clusters in advance, a common heuristic for choosing one is to horizontally “cut” the dendrogram at the height where the distance between the clusters being merged starts to become too large. A large merge distance means that more dissimilar points are being combined into the same cluster.
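To make the y-axis concrete, the following is a minimal pure-Python sketch of agglomerative clustering that records the height of every merge. It assumes one-dimensional data and single linkage (the distance between two clusters is the distance between their closest members); the function name `single_linkage_merges` and the data values are hypothetical, chosen only for illustration. It also assumes one particular way of formalizing "too large": cutting in the middle of the biggest jump between consecutive merge heights.

```python
from itertools import combinations


def single_linkage_merges(points):
    """Return a list of (height, merged_cluster) pairs, one per merge step.

    Hypothetical helper: naive single linkage on 1-D points, written for
    readability rather than speed.
    """
    clusters = [[i] for i in range(len(points))]  # every point starts alone
    merges = []
    while len(clusters) > 1:
        # Find the pair of clusters whose closest members are nearest.
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            d = min(abs(points[i] - points[j])
                    for i in clusters[a] for j in clusters[b])
            if best is None or d < best[0]:
                best = (d, a, b)
        d, a, b = best
        merged = sorted(clusters[a] + clusters[b])
        merges.append((d, merged))  # d is the height drawn on the y-axis
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return merges


# Hypothetical data: two tight groups that are far apart.
points = [1.0, 2.0, 4.0, 10.0, 11.5]
merges = single_linkage_merges(points)
heights = [h for h, _ in merges]

# Assumed heuristic: cut halfway through the largest jump in merge height.
gaps = [later - earlier for earlier, later in zip(heights, heights[1:])]
largest = max(range(len(gaps)), key=gaps.__getitem__)
cut_height = (heights[largest] + heights[largest + 1]) / 2

# Every merge above the cut is undone, so each one adds a cluster.
n_clusters = 1 + sum(h > cut_height for h in heights)

print(heights)     # [1.0, 1.5, 2.0, 6.0]
print(cut_height)  # 4.0
print(n_clusters)  # 2
```

Here the first three merges happen at small heights, while the last one joins the two groups at a much greater height, so a cut placed in that gap leaves two clusters. Other linkage rules (complete, average, Ward) would give different heights for the same data.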

In the example drawing below, cut 1 would produce 2 clusters: the first containing observations 1, 2, 3, 4, 5 and the second containing observations 6, 7, 8. Cut 2 would produce 3 clusters: the first containing observations 1, 2, 3; the second, observations 4, 5; and the third, observations 6, 7, 8. Finally, cut 3 would produce 5 clusters: the first containing observations 1, 2; the second, observation 3 alone; the third, observations 4, 5; the fourth, observation 6 alone; and the fifth, observations 7, 8.
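The three cuts can be reproduced with SciPy's `linkage` and `fcluster` functions. This is only a sketch: the coordinates below are hypothetical values chosen so that the resulting tree has the same merge order as the one described, and they are not the data behind the drawing. The cut heights (80, 35, 15), the choice of single linkage, and the helper name `cut` are likewise assumptions.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Hypothetical 1-D positions for observations 1..8 (not the original data).
positions = np.array([0, 10, 30, 70, 82, 200, 230, 238], dtype=float)
Z = linkage(positions.reshape(-1, 1), method="single")


def cut(Z, height):
    """Cut the tree at `height` and return clusters as lists of observation numbers."""
    labels = fcluster(Z, t=height, criterion="distance")
    groups = {}
    for obs, label in enumerate(labels, start=1):  # observations numbered from 1
        groups.setdefault(int(label), []).append(obs)
    return sorted(groups.values())


cut_1 = cut(Z, 80)  # high cut: only the final merge is undone
cut_2 = cut(Z, 35)  # middle cut
cut_3 = cut(Z, 15)  # low cut: only the tightest pairs stay together

print(cut_1)  # [[1, 2, 3, 4, 5], [6, 7, 8]]
print(cut_2)  # [[1, 2, 3], [4, 5], [6, 7, 8]]
print(cut_3)  # [[1, 2], [3], [4, 5], [6], [7, 8]]
```

Lowering the cut height can only split existing clusters, never regroup them, which is why each finer cut is a refinement of the coarser one. To draw the tree itself, `scipy.cluster.hierarchy.dendrogram(Z)` can be called with matplotlib installed.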