Lab 13 - Improving your Classifications

Objective: Apply techniques to improve classification accuracy through better training data, additional predictors, and parameter tuning.

What You'll Learn

  • Improve supervised classification through better training data collection
  • Add spectral indices (NDVI, NDWI) as additional predictor bands
  • Tune Random Forest parameters for better results
  • Adjust unsupervised clustering parameters
  • Compare different clustering algorithms

Building On Previous Learning

This lab directly extends your work from:

  • Lab 11 - Supervised classification with Random Forest
  • Lab 12 - Unsupervised k-means clustering

Why This Matters

Initial classification results are rarely perfect. Understanding how to systematically improve your results is critical for:

  • Accuracy: Meeting project requirements for land cover mapping
  • Reliability: Producing consistent results across different scenes
  • Understanding: Knowing WHY classifications fail helps you fix them

Before You Start

  • Prerequisites: Complete Labs 11-12 and gather feedback on your initial classification outputs.
  • Estimated time: 60 minutes
  • Materials: Earth Engine access, saved classifier scripts, and your Lab 12 code.

Key Terms

Predictor Bands
The image bands used by the classifier to make predictions. More informative bands can improve accuracy.
Feature Space
The multi-dimensional space defined by all predictor bands where pixels are classified.
Hyperparameter
Settings that control how an algorithm works (e.g., number of trees, k value) and that must be set by the analyst rather than learned from the data.
Spectral Separability
How distinguishable different classes are based on their spectral signatures.

Introduction

In this lab, you will improve the classifications you produced in Lab 11 - Supervised Classification and Lab 12 - Unsupervised Classification. Work through the assignments below and see how much you can improve your results.

You can use your code from Lab 12 or use the following code: https://code.earthengine.google.com/82c02e163ca780304a5536c29c6c4461

Strategies for Improvement

| Strategy | Applies To | What It Improves |
|---|---|---|
| More training points | Supervised | Class representation, spectral variability |
| Additional predictor bands | Both | Class separability |
| More trees in RF | Supervised | Model stability (with diminishing returns) |
| More sample pixels | Unsupervised | Cluster quality |
| Adjust k value | Unsupervised | Class granularity |

Assignments

Assignment 1: Collect More Training Points

For the supervised classification, try collecting more points for each class.

  • Collect points across the entire composite, not just one location
  • Look for pixels of the same class that show variability
  • For water: collect pixels in parts of rivers that vary in color
  • For developed: collect pixels from different rooftops and materials

Tip: The more spectrally representative your training points are, the better the classifier can generalize to the entire image.
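After collecting points, you need to combine them into one training dataset. A minimal sketch, assuming your class geometries are imported as FeatureCollections named water, vegetation, and developed, each with a 'class' property set in the import settings (your names and classes may differ):

// Merge the per-class point collections into one training set
var trainingPoints = water.merge(vegetation).merge(developed);

// Extract band values at each point from your composite ('Landsat' here)
var training = Landsat.sampleRegions({
    collection: trainingPoints,
    properties: ['class'],
    scale: 30
});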

Assignment 2: Add Predictor Bands

Usually, the more spectral information you feed the classifier, the easier it is to separate classes.

Add NDVI or NDWI as a new predictor band:

// Calculate NDVI and add it as a band
var ndvi = Landsat.normalizedDifference(['SR_B5', 'SR_B4']).rename('NDVI');
var imageWithNDVI = Landsat.addBands(ndvi);

// Update inputProperties to include NDVI, and remember to sample your
// training data from imageWithNDVI rather than the original composite
var inputProperties = ['SR_B2', 'SR_B3', 'SR_B4', 'SR_B5', 'SR_B6', 'SR_B7', 'NDVI'];

Questions to answer in your comments:

  • Does adding NDVI help the classification?
  • Check developed areas classified as herbaceous (or vice versa)

Assignment 3: Tune Random Forest Trees

Use more trees in the Random Forest classifier:

// Try different numbers of trees
var classifier100 = ee.Classifier.smileRandomForest(100).train({...});
var classifier200 = ee.Classifier.smileRandomForest(200).train({...});

Questions to answer:

  • Do you see improvements compared to 50 trees?
  • How does computation time change?

Note: More trees increase computation time, and accuracy gains diminish beyond a certain point.
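A full train() call might look like the following sketch, assuming a 'training' FeatureCollection with a 'class' property and the 'inputProperties' list from Assignment 2 (names are illustrative and may differ from your script):

// Train a 100-tree Random Forest and classify the NDVI-augmented image
var classifier100 = ee.Classifier.smileRandomForest(100).train({
    features: training,
    classProperty: 'class',
    inputProperties: inputProperties
});
var classified100 = imageWithNDVI.classify(classifier100);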

Assignment 4: Increase Sample Size (Unsupervised)

Increase the number of samples extracted from the composite:

var training = Landsat.sample({
    region: Landsat.geometry(),
    scale: 30,
    numPixels: 5000,  // Increased from 1000
    tileScale: 8      // Higher tileScale helps avoid out-of-memory errors
});

Does increasing samples improve the unsupervised result?

Assignment 5: Increase K Clusters

Increase the number of k clusters:

// Try 10 clusters instead of 4
var clusterer = ee.Clusterer.wekaKMeans(10).train(training);

Questions to answer:

  • Does the classified map result in meaningful classes?
  • Can multiple clusters be merged into your original 4 classes?
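If several clusters correspond to the same land cover type, you can collapse them with remap(). A sketch, assuming a 10-cluster result; the cluster-to-class mapping shown is hypothetical and must come from your own visual inspection:

// Apply the clusterer, then merge 10 clusters into 4 classes
var result = Landsat.cluster(clusterer);
var merged = result.remap(
    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],  // original cluster IDs
    [0, 0, 1, 1, 1, 2, 2, 3, 3, 3]   // assigned land cover classes (hypothetical)
).rename('landcover');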

Assignment 6: Test Other Clustering Algorithms

Try other options under the ee.Clusterer object:

// Try X-Means (automatically determines optimal k)
var clusterer = ee.Clusterer.wekaXMeans({
    minClusters: 2,
    maxClusters: 10
}).train(training);

// Try Cascade K-Means
var clusterer = ee.Clusterer.wekaCascadeKMeans({
    minClusters: 2,
    maxClusters: 10
}).train(training);

How do results differ from standard k-means?

Check Your Understanding

  1. Why is it important to collect training points across the entire image rather than in one area?
  2. How does adding NDVI as a predictor band help differentiate vegetation from developed areas?
  3. What are the trade-offs of using more trees in Random Forest?
  4. When might you want MORE than 4 clusters even if you only have 4 final classes?

Troubleshooting

Problem: Adding NDVI causes an error about band names

Solution: Make sure you add the NDVI band to the image BEFORE sampling, and update your inputProperties list to include 'NDVI'.

Problem: More trees makes computation too slow

Solution: 100-200 trees is usually sufficient. Beyond that, you get diminishing returns. Consider exporting the classification as a task instead of running on-the-fly.

Problem: 10 clusters creates classes that don't make sense

Solution: This is expected! More clusters create finer spectral distinctions that may not correspond to meaningful land cover. The analyst must interpret and merge clusters.

Pro Tips

  • Iterate systematically: Change one variable at a time to understand its impact
  • Document everything: Add comments showing what you changed and the result
  • Use validation data: Set aside some training points for testing accuracy
  • Consider your goal: More accurate isn't always better if it takes too long to compute
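The validation tip above can be sketched as a random train/test split followed by a confusion matrix. This assumes a 'training' FeatureCollection with a 'class' property and an 'inputProperties' list as in the earlier assignments:

// Hold out ~30% of training points for validation
var withRandom = training.randomColumn('random');
var trainSet = withRandom.filter(ee.Filter.lt('random', 0.7));
var testSet = withRandom.filter(ee.Filter.gte('random', 0.7));

// Train on the 70% split only
var classifier = ee.Classifier.smileRandomForest(100).train({
    features: trainSet,
    classProperty: 'class',
    inputProperties: inputProperties
});

// Classify the held-out points and report accuracy
var validated = testSet.classify(classifier);
var errorMatrix = validated.errorMatrix('class', 'classification');
print('Overall accuracy:', errorMatrix.accuracy());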

Key Takeaways

  • Classification accuracy can be improved through better training data, more predictor bands, and parameter tuning
  • There are always trade-offs between accuracy, computation time, and complexity
  • Spectral indices like NDVI can significantly improve class separability
  • Systematic experimentation helps you understand what works for your specific data

📋 Lab Submission

Subject: Lab 13 - Improving your Classifications - [Your Name]

Submit:

A shareable URL to your code that includes:

  1. Completed assignments with clear comment separators
  2. Comments explaining what you changed and the results
  3. Your observations about which improvements worked best

Total: 60 Points (10 per assignment)