Lab 13 - Improving your Classifications

Objective: Apply techniques to improve classification accuracy through better training data, additional predictors, and parameter tuning.

What You'll Learn

  • Improve supervised classification through better training data collection
  • Add spectral indices (NDVI, NDWI) as additional predictor bands
  • Tune Random Forest parameters for better results
  • Adjust unsupervised clustering parameters
  • Compare different clustering algorithms

Building On Previous Learning

This lab directly extends your work from:

  • Lab 11 - Supervised classification with Random Forest
  • Lab 12 - Unsupervised k-means clustering

Why This Matters

Initial classification results are rarely perfect. Understanding how to systematically improve your results is critical for:

  • Accuracy: Meeting project requirements for land cover mapping
  • Reliability: Producing consistent results across different scenes
  • Understanding: Knowing WHY classifications fail helps you fix them

Before You Start

  • Prerequisites: Complete Labs 11-12 and gather feedback on your initial classification outputs.
  • Estimated time: 60 minutes
  • Materials: Earth Engine access, saved classifier scripts, and your Lab 12 code.

Key Terms

Predictor Bands
The image bands used by the classifier to make predictions. More informative bands can improve accuracy.
Feature Space
The multi-dimensional space defined by all predictor bands where pixels are classified.
Hyperparameter
Settings that control how an algorithm works (e.g., number of trees, k value) and that must be set by the analyst rather than learned from the data.
Spectral Separability
How distinguishable different classes are based on their spectral signatures.

Introduction

In this lab, you will improve the classifications you produced in Lab 11 - Supervised Classification and Lab 12 - Unsupervised Classification. Work through the assignments below and see how much you can improve your results.

You can use your code from Lab 12 or use the following code: https://code.earthengine.google.com/82c02e163ca780304a5536c29c6c4461

Strategies for Improvement

| Strategy | Applies To | What It Improves |
|---|---|---|
| More training points | Supervised | Class representation, spectral variability |
| Additional predictor bands | Both | Class separability |
| More trees in RF | Supervised | Model stability (with diminishing returns) |
| More sample pixels | Unsupervised | Cluster quality |
| Adjust k value | Unsupervised | Class granularity |

Assignments

Assignment 1: Collect More Training Points

For the supervised classification, try collecting more points for each class.

  • Collect points across the entire composite, not just one location
  • Look for pixels of the same class that show variability
  • For water: collect pixels in parts of rivers that vary in color
  • For developed: collect pixels from different rooftops and materials

Tip: The more spectrally representative your training points are, the better the classifier can generalize to the entire image.
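After collecting points, you need to combine them into one training dataset. A minimal sketch, assuming your class geometries are imported as FeatureCollections named water, vegetation, and developed, each with a 'class' property set in the import settings (your names and classes may differ):

// Merge the per-class point collections into one training set
var trainingPoints = water.merge(vegetation).merge(developed);

// Extract band values at each point from your composite ('Landsat' here)
var training = Landsat.sampleRegions({
    collection: trainingPoints,
    properties: ['class'],
    scale: 30
});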

Assignment 2: Add Predictor Bands

Usually, the more spectral information you feed the classifier, the easier it is to separate classes.

Add NDVI or NDWI as a new predictor band:

// Calculate NDVI and add it as a band
var ndvi = Landsat.normalizedDifference(['SR_B5', 'SR_B4']).rename('NDVI');
var imageWithNDVI = Landsat.addBands(ndvi);

// Update inputProperties to include NDVI, and remember to sample your
// training data from imageWithNDVI rather than the original composite
var inputProperties = ['SR_B2', 'SR_B3', 'SR_B4', 'SR_B5', 'SR_B6', 'SR_B7', 'NDVI'];

Questions to answer in your comments:

  • Does adding NDVI help the classification?
  • Check developed areas classified as herbaceous (or vice versa)

Assignment 3: Tune Random Forest Trees

Use more trees in the Random Forest classifier:

// Try different numbers of trees
var classifier100 = ee.Classifier.smileRandomForest(100).train({...});
var classifier200 = ee.Classifier.smileRandomForest(200).train({...});

Questions to answer:

  • Do you see improvements compared to 50 trees?
  • How does computation time change?

Note: More trees increase computation time, and accuracy gains diminish beyond a certain point.
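A full train() call might look like the following sketch, assuming a 'training' FeatureCollection with a 'class' property and the 'inputProperties' list from Assignment 2 (names are illustrative and may differ from your script):

// Train a 100-tree Random Forest and classify the NDVI-augmented image
var classifier100 = ee.Classifier.smileRandomForest(100).train({
    features: training,
    classProperty: 'class',
    inputProperties: inputProperties
});
var classified100 = imageWithNDVI.classify(classifier100);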

Assignment 4: Increase Sample Size (Unsupervised)

Increase the number of samples extracted from the composite:

var training = Landsat.sample({
    region: Landsat.geometry(),
    scale: 30,
    numPixels: 5000,  // Increased from 1000
    tileScale: 8      // Higher tileScale helps avoid out-of-memory errors
});

Does increasing samples improve the unsupervised result?

Assignment 5: Increase K Clusters

Increase the number of k clusters:

// Try 10 clusters instead of 4
var clusterer = ee.Clusterer.wekaKMeans(10).train(training);

Questions to answer:

  • Does the classified map result in meaningful classes?
  • Can multiple clusters be merged into your original 4 classes?
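If several clusters correspond to the same land cover type, you can collapse them with remap(). A sketch, assuming a 10-cluster result; the cluster-to-class mapping shown is hypothetical and must come from your own visual inspection:

// Apply the clusterer, then merge 10 clusters into 4 classes
var result = Landsat.cluster(clusterer);
var merged = result.remap(
    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],  // original cluster IDs
    [0, 0, 1, 1, 1, 2, 2, 3, 3, 3]   // assigned land cover classes (hypothetical)
).rename('landcover');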

Assignment 6: Test Other Clustering Algorithms

Try other options under the ee.Clusterer object:

// Try X-Means (automatically determines optimal k)
var clusterer = ee.Clusterer.wekaXMeans({
    minClusters: 2,
    maxClusters: 10
}).train(training);

// Try Cascade K-Means
var clusterer = ee.Clusterer.wekaCascadeKMeans({
    minClusters: 2,
    maxClusters: 10
}).train(training);

How do results differ from standard k-means?

Check Your Understanding

  1. Why is it important to collect training points across the entire image rather than in one area?
  2. How does adding NDVI as a predictor band help differentiate vegetation from developed areas?
  3. What are the trade-offs of using more trees in Random Forest?
  4. When might you want MORE than 4 clusters even if you only have 4 final classes?

Troubleshooting

Problem: Adding NDVI causes an error about band names

Solution: Make sure you add the NDVI band to the image BEFORE sampling, and update your inputProperties list to include 'NDVI'.

Problem: More trees makes computation too slow

Solution: 100-200 trees is usually sufficient. Beyond that, you get diminishing returns. Consider exporting the classification as a task instead of running on-the-fly.

Problem: 10 clusters creates classes that don't make sense

Solution: This is expected! More clusters create finer spectral distinctions that may not correspond to meaningful land cover. The analyst must interpret and merge clusters.

Pro Tips

  • Iterate systematically: Change one variable at a time to understand its impact
  • Document everything: Add comments showing what you changed and the result
  • Use validation data: Set aside some training points for testing accuracy
  • Consider your goal: More accurate isn't always better if it takes too long to compute
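The validation tip above can be sketched as a random train/test split followed by a confusion matrix. This assumes a 'training' FeatureCollection with a 'class' property and an 'inputProperties' list as in the earlier assignments:

// Hold out ~30% of training points for validation
var withRandom = training.randomColumn('random');
var trainSet = withRandom.filter(ee.Filter.lt('random', 0.7));
var testSet = withRandom.filter(ee.Filter.gte('random', 0.7));

// Train on the 70% split only
var classifier = ee.Classifier.smileRandomForest(100).train({
    features: trainSet,
    classProperty: 'class',
    inputProperties: inputProperties
});

// Classify the held-out points and report accuracy
var validated = testSet.classify(classifier);
var errorMatrix = validated.errorMatrix('class', 'classification');
print('Overall accuracy:', errorMatrix.accuracy());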

Key Takeaways

  • Classification accuracy can be improved through better training data, more predictor bands, and parameter tuning
  • There are always trade-offs between accuracy, computation time, and complexity
  • Spectral indices like NDVI can significantly improve class separability
  • Systematic experimentation helps you understand what works for your specific data

📋 Lab Submission

Subject: Lab 13 - Improving your Classifications - [Your Name]

Submit:

A shareable URL to your code that includes:

  1. Completed assignments with clear comment separators
  2. Comments explaining what you changed and the results
  3. Your observations about which improvements worked best

Total: 60 Points (10 per assignment)