Accuracy Assessment

A classification is only as good as its accuracy. This module teaches you to measure how well your classifier performs using confusion matrices, overall accuracy, and class-specific metrics.

Learning objectives

  • Split data into training and validation sets.
  • Generate and interpret a confusion matrix.
  • Calculate overall accuracy, producer's accuracy, and user's accuracy.
  • Understand the kappa coefficient and when to use it.
  • Apply best practices for unbiased accuracy assessment.

Why it matters

Without accuracy assessment, you don't know if your map is 50% or 95% correct. Decision-makers need to understand the reliability of your results before using them for policy, conservation, or planning.

Key vocabulary

Confusion Matrix
A table comparing predicted classes (columns) against actual classes (rows).
Overall Accuracy
Percentage of correctly classified samples: (correct / total) × 100.
Producer's Accuracy
The fraction of actual samples of a class that the classifier correctly labeled (correct / row total; the complement of omission error).
User's Accuracy
The fraction of pixels mapped as a class that truly belong to it (correct / column total; the complement of commission error).
Kappa Coefficient
Agreement adjusted for chance (0 = no better than chance, 1 = perfect; negative values mean worse than chance). See the formula below.
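
Kappa compares the observed agreement p_o (the overall accuracy, as a proportion) with the agreement p_e expected by chance from the row and column totals:

kappa = (p_o - p_e) / (1 - p_e)

where p_e is the sum over classes of (row total × column total) / N². A worked example follows the confusion matrix below.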

Quick win: Complete accuracy assessment

This example trains a classifier and evaluates it with a confusion matrix:

// 1. Load image, select the surface reflectance bands, then apply the
// Collection 2 Level-2 scale factors (which are specific to the SR bands)
var image = ee.Image('LANDSAT/LC08/C02/T1_L2/LC08_044034_20210623')
  .select(['SR_B2', 'SR_B3', 'SR_B4', 'SR_B5', 'SR_B6', 'SR_B7'])
  .multiply(0.0000275).add(-0.2);

// 2. Create training points (in practice, use imported geometry)
var water = ee.FeatureCollection([
  ee.Feature(ee.Geometry.Point([-122.085, 37.422]), {class: 0}),
  ee.Feature(ee.Geometry.Point([-122.090, 37.420]), {class: 0})
]);
var vegetation = ee.FeatureCollection([
  ee.Feature(ee.Geometry.Point([-122.100, 37.430]), {class: 1}),
  ee.Feature(ee.Geometry.Point([-122.105, 37.432]), {class: 1})
]);
var urban = ee.FeatureCollection([
  ee.Feature(ee.Geometry.Point([-122.070, 37.415]), {class: 2}),
  ee.Feature(ee.Geometry.Point([-122.065, 37.418]), {class: 2})
]);

var allPoints = water.merge(vegetation).merge(urban);

// 3. Sample the image at point locations
var training = image.sampleRegions({
  collection: allPoints,
  properties: ['class'],
  scale: 30
});

// 4. Add random column for splitting
var withRandom = training.randomColumn('random');
var trainSet = withRandom.filter(ee.Filter.lt('random', 0.7));
var testSet = withRandom.filter(ee.Filter.gte('random', 0.7));

// 5. Train classifier on training set
var classifier = ee.Classifier.smileRandomForest(50).train({
  features: trainSet,
  classProperty: 'class',
  inputProperties: image.bandNames()
});

// 6. Classify the test set
var validated = testSet.classify(classifier);

// 7. Get confusion matrix
var confusionMatrix = validated.errorMatrix('class', 'classification');

// 8. Print results (note: EE calls user's accuracy consumersAccuracy)
print('Confusion Matrix:', confusionMatrix);
print('Overall Accuracy:', confusionMatrix.accuracy());
print('Kappa:', confusionMatrix.kappa());
print("Producer's Accuracy:", confusionMatrix.producersAccuracy());
print("User's Accuracy:", confusionMatrix.consumersAccuracy());

// 9. Apply classifier to image
var classified = image.classify(classifier);
Map.centerObject(image, 10);
Map.addLayer(classified, {min: 0, max: 2, palette: ['blue', 'green', 'gray']}, 'Classification');

What you should see

Console output showing the confusion matrix, overall accuracy (0-1), kappa coefficient, and per-class accuracies. The map displays the classified image.

Understanding the confusion matrix

The confusion matrix shows where your classifier makes mistakes:

                     Predicted: Water   Predicted: Vegetation   Predicted: Urban   Row Total
Actual: Water               45                    3                     2              50
Actual: Vegetation           2                   43                     5              50
Actual: Urban                1                    4                    45              50
Column Total                48                   50                    52             150
  • Diagonal entries: Correct classifications
  • Off-diagonal: Misclassifications
  • Overall Accuracy: (45+43+45)/150 = 88.7%
  • Producer's Accuracy (Water): 45/50 = 90% (how much water was correctly found)
  • User's Accuracy (Water): 45/48 = 93.8% (how reliable "water" pixels are)
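  • Kappa: chance agreement p_e = (50×48 + 50×50 + 50×52) / 150² ≈ 0.333, so kappa = (0.887 - 0.333) / (1 - 0.333) ≈ 0.83 (using the formula from the vocabulary section)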

Best practices

1. Use independent validation data

Never test on the same points you trained with. Options:

  • Random split: 70% training, 30% validation (easiest)
  • Spatial split: Train in one area, test in another, so spatial autocorrelation does not inflate the accuracy estimate (see the sketch after this list)
  • Temporal split: Train on one date, test on another
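
A minimal sketch of the spatial split, assuming two hypothetical rectangles on either side of a dividing line (the coordinates are illustrative, not part of the original example):

// Hypothetical spatial split: points in the west train, points in the east test
var westRegion = ee.Geometry.Rectangle([-122.12, 37.40, -122.085, 37.45]);
var eastRegion = ee.Geometry.Rectangle([-122.085, 37.40, -122.05, 37.45]);
var trainSpatial = allPoints.filterBounds(westRegion);
var testSpatial = allPoints.filterBounds(eastRegion);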

2. Stratified sampling

Ensure each class is represented proportionally in both training and test sets:

// Stratified split: assign the random column within each class before
// merging, so every class lands in roughly 70/30 proportions
var split = water.randomColumn('random')
  .merge(vegetation.randomColumn('random'))
  .merge(urban.randomColumn('random'));
var trainSet = split.filter(ee.Filter.lt('random', 0.7));
var testSet = split.filter(ee.Filter.gte('random', 0.7));
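
If you need a guaranteed number of points per class, ee.Image.stratifiedSample can draw them directly from a class-labeled image; the point count and region below are illustrative:

// Draw a fixed number of validation points per class from the classified image
var stratPoints = classified.stratifiedSample({
  numPoints: 50,                 // points per class
  classBand: 'classification',
  region: image.geometry(),
  scale: 30,
  geometries: true
});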

3. Sufficient sample size

Rule of thumb: At least 50 validation points per class, more for rare classes.

4. Report all metrics

Don't just report overall accuracy—include the full confusion matrix and per-class metrics.

Pro tips

  • Use .errorMatrix('actual', 'predicted') with the arguments in that order: actual labels first, predicted labels second.
  • For large datasets, use testSet.limit(5000) to avoid memory issues.
  • Export the confusion matrix to Drive for publication-ready tables (see the sketch after this list).
  • Consider the F1-score for imbalanced datasets (available per class via confusionMatrix.fscore()).
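
One common export pattern, sketched here with placeholder names for the description and file:

// Wrap the matrix in a geometry-less Feature so it can be exported as CSV
var matrixExport = ee.FeatureCollection([
  ee.Feature(null, {
    matrix: confusionMatrix.array(),
    accuracy: confusionMatrix.accuracy()
  })
]);
Export.table.toDrive({
  collection: matrixExport,
  description: 'confusion_matrix',
  fileFormat: 'CSV'
});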

Try it: Improve your accuracy

  1. Add more training points for each class (aim for 50+ per class).
  2. Include additional spectral indices (NDVI, NDWI) as input bands (items 2 and 3 are sketched after this list).
  3. Experiment with different classifiers (CART, SVM, Random Forest).
  4. Compare accuracy metrics before and after each change.
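
A sketch of items 2 and 3, using the built-in normalizedDifference helper; the band pairs follow the Landsat 8 naming used in the quick win:

// Add NDVI (NIR vs. red) and NDWI (green vs. NIR) as extra predictor bands
var ndvi = image.normalizedDifference(['SR_B5', 'SR_B4']).rename('NDVI');
var ndwi = image.normalizedDifference(['SR_B3', 'SR_B5']).rename('NDWI');
var withIndices = image.addBands(ndvi).addBands(ndwi);
// Re-run sampleRegions and train on withIndices, using
// withIndices.bandNames() as inputProperties

// Alternative classifiers to compare (train exactly as in step 5)
var cart = ee.Classifier.smileCart();
var svm = ee.Classifier.libsvm();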

Common mistakes

  • Testing on training data (produces overly optimistic accuracy).
  • Ignoring class imbalance (rare classes may have poor accuracy).
  • Reporting only overall accuracy (hides per-class problems).
  • Using too few validation points (unstable accuracy estimates).
  • Confusing producer's and user's accuracy.

Quick self-check

  1. Why should training and validation data be separate?
  2. What does a high user's accuracy but low producer's accuracy indicate?
  3. What is kappa, and when is it useful?
  4. How many validation points per class is a good minimum?

Next steps