Lab 11 - Supervised Classification

Supervised classification uses a training dataset with known labels, representing the spectral characteristics of each land cover class of interest, to “supervise” the classification. The overall approach to supervised classification in Earth Engine can be summarized as follows: 

  1. Get a scene.
  2. Collect training data.
  3. Select and train a classifier using the training data.
  4. Classify the image using the selected classifier.

Let's get started.

  1. We will begin by manually creating training data based on a clear Landsat image. Copy the code block below to define your Landsat 8 scene variable and add it to the map. We will use a point in Milan, Italy, as the center of the area for our image classification.
  2. // Create an Earth Engine Point object over Milan.
    var pt = ee.Geometry.Point([9.453, 45.424]);

    // Filter the Landsat 8 collection and select the least cloudy image.
    var landsat = ee.ImageCollection('LANDSAT/LC08/C02/T1_L2')
        .filterBounds(pt)
        .filterDate('2019-01-01', '2020-01-01')
        .sort('CLOUD_COVER')
        .first();

    // Center the map on that image.
    Map.centerObject(landsat, 8);

    // Add Landsat image to the map.
    var visParams = {
        bands: ['SR_B4', 'SR_B3', 'SR_B2'],
        min: 7000,
        max: 12000
    };
    Map.addLayer(landsat, visParams, 'Landsat 8 image');
    The resulting Landsat 8 image displayed on the map.
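
    As an optional check, you can print the selected scene to confirm which image was chosen and how cloudy it is (CLOUD_COVER is the same property we sorted by above):

    // Print the selected scene and its cloud cover percentage.
    print('Selected Landsat scene', landsat);
    print('Cloud cover (%)', landsat.get('CLOUD_COVER'));
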
  3. Using the Geometry Tools, we will create points on the Landsat image that represent land cover classes of interest to use as our training data. We’ll need to do two things: (1) identify where each land cover occurs on the ground, and (2) label the points with the proper class number. For this exercise, we will use the classes and codes shown in the table below.
  4. Land cover classes

     Class         Class code
     Forest        0
     Developed     1
     Water         2
     Herbaceous    3

  5. In the Geometry Tools, click on the marker option. This will create a point geometry which will show up as an import named “geometry”. Click on the gear icon to configure this import.

  6. Where the Geometry Tools are located in the Code Editor.
  7. We will start by collecting forest points, so name the import forest. Import it as a FeatureCollection, and then click + Property. Name the new property “class” and give it a value of 0. We can also choose a color to represent this class. For a forest class, it is natural to choose a green color. You can choose the color you prefer by clicking on it, or, for more control, you can use a hexadecimal value.

  8. Hexadecimal values are used throughout the digital world to represent specific colors across computers and operating systems. They are specified by six hexadecimal digits arranged in three pairs, with one pair each for the red, green, and blue brightness values. If you’re unfamiliar with hexadecimal values, imagine for a moment that colors were specified in pairs of base 10 numbers instead of pairs of base 16. In that case, a bright pure red value would be “990000”; a bright pure green value would be “009900”; and a bright pure blue value would be “000099”. A value like “501263” would be a mixture of the three colors, not especially bright, having roughly equal amounts of blue and red, and much less green: a color that would be a shade of purple. To create numbers in the hexadecimal system, which might feel entirely natural if humans had evolved to have 16 fingers, sixteen “digits” are needed: a base 16 counter goes 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F, then 10, 11, and so on. Given that counting framework, the number “FF” is like “99” in base 10: the largest two-digit number. For example, the hexadecimal color “7F1FA2”, which has roughly equal amounts of blue and red and much less green, is a shade of purple.

  9. Returning to the coloring of the forest points, the hexadecimal value “589400” is a little bit of red, about twice as much green, and no blue: a deep green. Enter that value, with or without the “#” in front, and click OK after finishing the configuration.
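
    If you want to see the arithmetic, the short snippet below (optional, plain JavaScript that runs in the Code Editor) splits that hex color into its red, green, and blue pairs and converts each pair to a base 10 number.

    // Decompose the forest color '589400' into red, green, and blue values.
    var hex = '589400';
    var red = parseInt(hex.slice(0, 2), 16);   // 0x58 = 88
    var green = parseInt(hex.slice(2, 4), 16); // 0x94 = 148
    var blue = parseInt(hex.slice(4, 6), 16);  // 0x00 = 0
    print('RGB for #' + hex, [red, green, blue]);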

  10. The configured geometry import (screenshot).
  11. Now, in the Geometry Imports, we will see that the import has been renamed forest. Click on it to activate the drawing mode in order to start collecting forest points. 

  12. ready to start collecting points
  13. Now, start collecting points over forested areas. Zoom in and out as needed. You can use the satellite basemap to assist you, but the basis of your collection should be the Landsat image. Remember that the more points you collect, the more the classifier will learn from the information you provide. For now, let’s set a goal to collect 25 points per class. Click Exit next to Point drawing when finished.

  14. collected points
  15. Repeat the same process for the other classes by creating new layers. Don’t forget to import using the FeatureCollection option as mentioned above. For the developed class, collect points over urban areas. For the water class, collect points over the Ligurian Sea, and also look for other bodies of water, like rivers. For the herbaceous class, collect points over agricultural fields. Remember to set the “class” property for each class to its corresponding code (see the table above) and click Exit once you finish collecting points for each class. We will be using the following hexadecimal colors for the other classes: #FF0000 for developed, #1A11FF for water, and #D0741E for herbaceous.

  16. click the new layer button
  17. You should now have four FeatureCollection imports named forest, developed, water, and herbaceous.
  18. Collected Training Samples
  19.  The next step is to combine all the training feature collections into one. Copy and paste the code below to combine them into one FeatureCollection called trainingFeatures. Here, we use the flatten method to avoid having a collection of feature collections—we want individual features within our FeatureCollection.
  20. // Combine training feature collections.
    var trainingFeatures = ee.FeatureCollection([
        forest, developed, water, herbaceous
    ]).flatten(); 
  21. Note: Alternatively, you could use an existing set of reference data. For example, the European Space Agency (ESA) WorldCover dataset is a global map of land use and land cover derived from ESA’s Sentinel-2 imagery at 10 m resolution. With existing datasets, we can randomly place points on pixels classified as the classes of interest (if you are curious, you can explore the Earth Engine documentation to learn about the ee.Image.stratifiedSample and the ee.FeatureCollection.randomPoints methods). The drawback is that these global datasets will not always contain the specific classes of interest for your region, or may not be entirely accurate at the local scale. Another option is to use samples collected in the field (e.g., GPS points).
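
    If you are curious, the sketch below shows one way such a workflow could look. The ESA/WorldCover/v100 asset ID, its 'Map' band name, and the parameter values are assumptions to verify against the Earth Engine Data Catalog, and the WorldCover class codes would still need to be remapped to the 0–3 codes used in this lab.

    // Sketch: draw stratified random points from the ESA WorldCover map
    // instead of digitizing training data by hand (not required for this lab).
    var worldCover = ee.ImageCollection('ESA/WorldCover/v100').first();
    var wcSamples = worldCover.stratifiedSample({
        numPoints: 25,               // points per WorldCover class
        classBand: 'Map',            // WorldCover's land cover band
        region: landsat.geometry(),  // restrict sampling to our Landsat scene
        scale: 10,
        seed: 42,
        geometries: true             // keep the point geometries
    });
    print('WorldCover samples', wcSamples.limit(10));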

  22. In the combined FeatureCollection, each Feature point should have a property called “class”. The class values are consecutive integers from 0 to 3. Verify that this is true by printing trainingFeatures and checking the properties of the features in the console.

  23. print(trainingFeatures);
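
    Optionally, you can also confirm the class codes directly. A minimal check, assuming the training points were labeled as described above:

    // List the distinct class codes found in the combined training data.
    print('Class codes', trainingFeatures.aggregate_array('class').distinct());
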
  24. Now that we have our training points, copy and paste the code below to extract the band information for each class at each point location.

  25. // Define prediction bands.
    var predictionBands = [
        'SR_B1', 'SR_B2', 'SR_B3', 'SR_B4', 'SR_B5', 'SR_B6', 'SR_B7',
        'ST_B10'
    ];

    // Sample training points.
    var classifierTraining = landsat.select(predictionBands)
        .sampleRegions({
            collection: trainingFeatures,
            properties: ['class'],
            scale: 30
        });
  26. First, we define the prediction bands so that the spectral (SR_B1–SR_B7) and thermal (ST_B10) information is extracted for each class. Then, we use the sampleRegions method to sample that information from the Landsat image at each point location. This method requires the FeatureCollection (our reference points), the property to extract (“class”), and the pixel scale (in meters).

  27. You can check whether the classifierTraining object extracted the properties of interest by printing it and expanding the first feature. You should see the band and class information.
  28. print(classifierTraining);
  29. The expanded first feature of classifierTraining, showing band values and the class property (screenshot).
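
    As an additional optional check (not required for the lab), you can count how many samples were extracted per class:

    // Count the extracted samples per class code (0-3).
    print('Samples per class', classifierTraining.aggregate_histogram('class'));
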
  30. Now we can choose a classifier. The choice of classifier is not always obvious, and there are many options from which to pick—you can quickly expand the ee.Classifier object under Docs to get an idea of how many options we have for image classification. Therefore, we will be testing different classifiers and comparing their results. We will start with a Classification and Regression Tree (CART) classifier, a well-known classification algorithm that has been around for decades. 
  31. Example of a decision tree for satellite image classification. Values and classes are hypothetical.
  32. In a decision tree like the one above, each node applies a threshold test to one of the input band values, splitting the data accordingly, and each leaf assigns a class label. The values and classes in the figure are hypothetical.
  33. Copy and paste the code below to instantiate a CART classifier (ee.Classifier.smileCart) and train it.
  34. //////////////// CART Classifier ///////////////////

    // Train a CART Classifier.
    var classifier = ee.Classifier.smileCart().train({
        features: classifierTraining,
        classProperty: 'class',
        inputProperties: predictionBands
    });
  35. Essentially, the classifier contains the mathematical rules that link labels to spectral information. If you print the variable classifier and expand its properties, you can confirm the basic characteristics of the object (bands, properties, and classifier being used).
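
    For example, a one-line check that does exactly what the paragraph above describes:

    // Inspect the trained classifier's basic characteristics in the console.
    print(classifier);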

  36. If you print classifier.explain(), you can find a property called “tree” that contains the decision rules.

  37. print(classifier.explain());
  38. In the console, expand the printed result and look for the “tree” property to see the decision rules the classifier learned.

  39. After training the classifier, copy and paste the code below to classify the Landsat image and add it to the Map.

  40. // Classify the Landsat image.
    var classified = landsat.select(predictionBands).classify(classifier);

    // Define classification image visualization parameters.
    var classificationVis = {
        min: 0,
        max: 3,
        palette: ['589400', 'ff0000', '1a11ff', 'd0741e']
    };

    // Add the classified image to the map.
    Map.addLayer(classified, classificationVis, 'CART classified');
  41. Note that, in the visualization parameters, we define a palette parameter which in this case represents colors for each pixel value (0–3, our class codes). We use the same hexadecimal colors used when creating our training points for each class. This way, we can associate a color with a class when visualizing the classified image in the Map.

  42. Results of the CART classification.
  43. Inspect the result: activate the Landsat image layer and the satellite basemap and overlay them with the classified image. Change the layers’ transparency to inspect some areas. What do you notice? The result might not look very satisfactory in some areas (e.g., confusion between the developed and herbaceous classes). Why do you think this is happening? There are a few options for handling misclassification errors:
    1. Collect more training data: We can try incorporating more points to have a more representative sample of the classes.
    2. Tune the model: Classifiers typically have “hyperparameters,” which are set to default values. In the case of classification trees, there are ways to tune the number of leaves in the tree, for example.
    3. Try other classifiers: If a classifier’s results are unsatisfying, we can try some of the other classifiers in Earth Engine to see if the result is better or different.
    4. Expand the collection locations: It is good practice to collect points across the entire image and not just focus on one location. Also, look for pixels of the same class that show variability (e.g., for the developed class, building rooftops look different than house rooftops; for the herbaceous class, crop fields show distinctive seasonality/phenology).
    5. Add more predictors: We can try adding spectral indices to the input variables; this way, we are feeding the classifier new, unique information about each class. For example, there is a good chance that a vegetation index specialized for detecting vegetation health (e.g., NDVI) would improve the developed versus herbaceous classification (see the sketch after this list).
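
    The snippet below is a sketch of option 5, adding NDVI as an extra predictor. It reuses the variables defined earlier in this lab (landsat, predictionBands, trainingFeatures); the new band and variable names (ndvi, landsatWithNdvi, predictionBandsNdvi, trainingWithNdvi) are introduced here only for illustration.

    // Compute NDVI from the Landsat 8 surface reflectance bands (NIR = SR_B5,
    // red = SR_B4) and append it to the image as a new band.
    // Note: for strictly correct NDVI values you would apply the Collection 2
    // scale factors first; for classification purposes the unscaled version
    // still adds useful information.
    var ndvi = landsat.normalizedDifference(['SR_B5', 'SR_B4']).rename('NDVI');
    var landsatWithNdvi = landsat.addBands(ndvi);

    // Extend the list of prediction bands with the new index.
    var predictionBandsNdvi = predictionBands.concat(['NDVI']);

    // Re-sample the training points; training and classification then proceed
    // exactly as before, using predictionBandsNdvi instead of predictionBands.
    var trainingWithNdvi = landsatWithNdvi.select(predictionBandsNdvi)
        .sampleRegions({
            collection: trainingFeatures,
            properties: ['class'],
            scale: 30
        });
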
  44. For now, we will try another widely used supervised learning classifier: Random Forests (RF). The RF algorithm (Breiman 2001, Pal 2005) builds on the concept of decision trees, but adds strategies to make them more powerful. It is called a “forest” because it operates by constructing a multitude of decision trees. As mentioned previously, a decision tree creates the rules used to make decisions. A Random Forest randomly chooses subsets of the features and of the training observations to build each decision tree, and then uses the full set of trees to estimate the class. It is a great choice when you do not have a lot of insight about the training data.

  45. diagram of Random Forest
  46. Copy and paste the code below to train the RF classifier (ee.Classifier.smileRandomForest) and apply the classifier to the image. The RF algorithm requires, as its argument, the number of trees to build. We will use 50 trees.

  47. /////////////// Random Forest Classifier /////////////////////

    // Train RF classifier.
    var RFclassifier = ee.Classifier.smileRandomForest(50).train({
        features: classifierTraining,
        classProperty: 'class',
        inputProperties: predictionBands
    });

    // Classify Landsat image.
    var RFclassified = landsat.select(predictionBands).classify(
        RFclassifier);

    // Add classified image to the map.
    Map.addLayer(RFclassified, classificationVis, 'RF classified');
  48. Note that in the ee.Classifier.smileRandomForest documentation (Docs tab), there is a seed (random number) parameter. Setting a seed allows you to exactly replicate your model each time you run it. Any number is acceptable as a seed.
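
    For instance, a minimal sketch of training a reproducible Random Forest; the seed value 7 is arbitrary, and the dictionary-style arguments mirror the ones used above.

    // Train an RF classifier with a fixed seed so results are reproducible.
    var RFclassifierSeeded = ee.Classifier.smileRandomForest({
        numberOfTrees: 50,
        seed: 7
    }).train({
        features: classifierTraining,
        classProperty: 'class',
        inputProperties: predictionBands
    });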

  49. results of random forest
  50. Inspect the result. How does this classified image differ from the CART one? Is the classification better or worse? Zoom in and out and change the transparency of layers as needed.
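
    If you would like a rough quantitative comparison to accompany the visual one, the optional sketch below computes the fraction of pixels where the CART and RF maps agree. It is evaluated at a coarse 300 m scale to keep the computation fast; the variable names agreement and stats are introduced here for illustration.

    // Compare the two classified maps: 1 where CART and RF assign the same
    // class, 0 otherwise, averaged over the scene footprint.
    var agreement = classified.eq(RFclassified).rename('agree');
    var stats = agreement.reduceRegion({
        reducer: ee.Reducer.mean(),
        geometry: landsat.geometry(),
        scale: 300,
        maxPixels: 1e9
    });
    print('Fraction of pixels where CART and RF agree', stats);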

  51. For submission, submit a URL to your code. At the end of your code, answer the following questions as comments.
    1. How does the Random Forest classified image differ from the CART one?
    2. Why do you think the CART classification had classification errors, and how would you improve the classification while still using CART?
    3. Which of the two methods produces the better classification? Explain through visual examination of the images.

 

Lab Submission

Submit lab via email.

Subject: Lab 11 - Supervised Classification - [Your Name]