Unsupervised classification is a type of classification in which the data is not labeled. The algorithm is not told which classes exist; it has to discover the structure in the data on its own. In that sense it reverses the order of supervised classification: pixels are first grouped into spectral clusters, and only afterward are those clusters assigned to meaningful classes.
One way to do unsupervised classification is to use a clustering algorithm, or clusterer. A clusterer takes a set of data points and tries to find groups of points that are similar to each other. Once it has found these groups, it gives each one an arbitrary identifier; attaching a meaningful label to each group is a separate step. In Earth Engine, these algorithms are ee.Clusterer objects. They are “self-taught” algorithms that do not use a set of labeled training data (i.e., they are “unsupervised”). There are many different clustering algorithms, but some of the most common ones include k-means, hierarchical clustering, and DBSCAN.
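- K-means is a simple clustering algorithm that divides the data into k groups. It starts from k initial cluster centers (often chosen at random), assigns each data point to the nearest center, recomputes each center as the mean of the points assigned to it, and repeats these two steps until the assignments stop changing.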
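- Hierarchical clustering is a more complex approach that builds a tree of nested clusters. The divisive variant starts with all of the data points in one cluster and repeatedly splits clusters, in the limit until each cluster contains only one data point. The more common agglomerative variant works in the opposite direction: it starts with every point in its own cluster and repeatedly merges the two closest clusters. Cutting the tree at a chosen level gives the final groups.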
- DBSCAN is a density-based clustering algorithm that copes well with outliers, which are data points that do not fit into any cluster. It grows clusters from points that have at least a minimum number of neighbors within a given radius, and it marks points that lie in no sufficiently dense region as noise instead of forcing them into a cluster.
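To make the k-means description above concrete, here is a minimal pure-Python sketch of the assign-then-update loop. It is an illustration only, not the implementation that Earth Engine uses; the function names (`k_means`, `squared_distance`, `mean_point`) and the parameters `max_iter` and `seed` are assumptions introduced for this example.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two points of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def k_means(points, k, max_iter=100, seed=0):
    """Cluster `points` (a list of numeric tuples) into k groups.

    Returns (assignments, centroids), where assignments[i] is the index
    of the cluster that points[i] belongs to.
    """
    rng = random.Random(seed)
    # Use k randomly chosen data points as the initial cluster centers.
    centroids = rng.sample(points, k)
    assignments = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest center.
        new_assignments = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # Nothing changed, so the algorithm has converged.
        assignments = new_assignments
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # An empty cluster keeps its previous center.
                centroids[c] = mean_point(members)
    return assignments, centroids


if __name__ == "__main__":
    # Toy two-band "pixels": two visibly separate groups.
    pixels = [(0, 0), (1, 0), (2, 0), (100, 0), (101, 0), (102, 0)]
    labels, centers = k_means(pixels, k=2)
    print(labels, centers)
```

The cluster indices returned are arbitrary: which group ends up as 0 and which as 1 depends on the random initialization, which is why a separate labeling step is needed afterward.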
To better understand unsupervised classification, think of it as tackling an unfamiliar task by first gathering as much information as possible. For example, imagine learning a new language without knowing its basic grammar, only by watching a TV series in that language, listening to examples, and finding patterns.
As with supervised classification, unsupervised classification in Earth Engine follows this workflow:
- Assemble features with numeric properties in which to find clusters (training data).
- Select and instantiate a clusterer.
- Train the clusterer with the training data.
- Apply the clusterer to the scene (classification).
- Label the clusters.
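The workflow above can be sketched with the Earth Engine Python API roughly as follows. This is a hedged outline, not a definitive recipe: the helper name `cluster_scene` and its defaults (`n_clusters`, `scale`, `num_pixels`) are assumptions made for this example, and k-means (`ee.Clusterer.wekaKMeans`) is just one of the available clusterers. The `ee` module is imported inside the function so that the definition can be loaded before an Earth Engine session is authenticated.

```python
def cluster_scene(image, region, n_clusters=4, scale=30, num_pixels=5000):
    """Cluster an ee.Image without labeled data (hypothetical helper).

    image:  an ee.Image whose bands are the numeric input properties.
    region: an ee.Geometry from which training pixels are sampled.
    Returns an ee.Image with one band of integer cluster IDs.
    """
    import ee  # Earth Engine Python API; needs ee.Initialize() beforehand.

    # 1. Assemble features with numeric properties (training data):
    #    a random sample of pixels, each carrying the band values.
    training = image.sample(region=region, scale=scale, numPixels=num_pixels)

    # 2. Select and instantiate a clusterer (k-means here, by assumption),
    # 3. then train it on the sampled features.
    clusterer = ee.Clusterer.wekaKMeans(n_clusters).train(training)

    # 4. Apply the trained clusterer to every pixel of the scene.
    clustered = image.cluster(clusterer)

    # 5. Labeling is left to the analyst: the cluster IDs are arbitrary
    #    integers and carry no meaning until they are inspected.
    return clustered
```

After calling `ee.Initialize()`, the function would be used as `cluster_scene(image, region)` with an `ee.Image` and an `ee.Geometry` of your choice. The sample size and scale shown are placeholders and should be adapted to the sensor and study area.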