A classification is only as good as its accuracy. This module teaches you to measure how well your classifier performs using confusion matrices, overall accuracy, and class-specific metrics.
Learning objectives
- Split data into training and validation sets.
- Generate and interpret a confusion matrix.
- Calculate overall accuracy, producer's accuracy, and user's accuracy.
- Understand kappa coefficient and when to use it.
- Apply best practices for unbiased accuracy assessment.
Why it matters
Without accuracy assessment, you don't know if your map is 50% or 95% correct. Decision-makers need to understand the reliability of your results before using them for policy, conservation, or planning.
Key vocabulary
- Confusion Matrix
- A table comparing predicted classes (columns) against actual classes (rows).
- Overall Accuracy
- Percentage of correctly classified samples: (correct / total) × 100.
- Producer's Accuracy
- How well the classifier captures a class: correctly classified reference samples divided by total reference samples of that class (complement of omission error).
- User's Accuracy
- How reliable the map is for a class: correctly classified mapped pixels divided by total pixels mapped as that class (complement of commission error).
- Kappa Coefficient
- Accuracy adjusted for chance agreement (0 = no better than chance, 1 = perfect).
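For reference, kappa compares the observed agreement p_o (the overall accuracy) with the chance agreement p_e implied by the row and column totals:

\kappa = \frac{p_o - p_e}{1 - p_e}, \qquad p_e = \sum_{i} \frac{r_i \, c_i}{N^2}

where r_i and c_i are the row and column totals for class i and N is the total sample count. For the worked table later in this module, p_o = 0.887 and p_e = (50×48 + 50×50 + 50×52) / 150² = 1/3, giving κ ≈ 0.83.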
Quick win: Complete accuracy assessment
This example trains a classifier and evaluates it with a confusion matrix:
// 1. Load a Landsat 8 surface reflectance image and apply scale factors
var image = ee.Image('LANDSAT/LC08/C02/T1_L2/LC08_044034_20210623')
  .select(['SR_B2', 'SR_B3', 'SR_B4', 'SR_B5', 'SR_B6', 'SR_B7'])
  .multiply(0.0000275).add(-0.2);
// 2. Create training points (toy example; in practice, import many more points per class)
var water = ee.FeatureCollection([
  ee.Feature(ee.Geometry.Point([-122.085, 37.422]), {'class': 0}),
  ee.Feature(ee.Geometry.Point([-122.090, 37.420]), {'class': 0})
]);
var vegetation = ee.FeatureCollection([
  ee.Feature(ee.Geometry.Point([-122.100, 37.430]), {'class': 1}),
  ee.Feature(ee.Geometry.Point([-122.105, 37.432]), {'class': 1})
]);
var urban = ee.FeatureCollection([
  ee.Feature(ee.Geometry.Point([-122.070, 37.415]), {'class': 2}),
  ee.Feature(ee.Geometry.Point([-122.065, 37.418]), {'class': 2})
]);
var allPoints = water.merge(vegetation).merge(urban);
// 3. Sample the image at point locations
var training = image.sampleRegions({
  collection: allPoints,
  properties: ['class'],
  scale: 30
});
// 4. Add a random column for splitting
var withRandom = training.randomColumn('random');
var trainSet = withRandom.filter(ee.Filter.lt('random', 0.7));
var testSet = withRandom.filter(ee.Filter.gte('random', 0.7));
// 5. Train a classifier on the training set
var classifier = ee.Classifier.smileRandomForest(50).train({
  features: trainSet,
  classProperty: 'class',
  inputProperties: image.bandNames()
});
// 6. Classify the test set
var validated = testSet.classify(classifier);
// 7. Get the confusion matrix (actual property first, predicted second)
var confusionMatrix = validated.errorMatrix('class', 'classification');
// 8. Print results (consumersAccuracy is Earth Engine's name for user's accuracy)
print('Confusion Matrix:', confusionMatrix);
print('Overall Accuracy:', confusionMatrix.accuracy());
print('Kappa:', confusionMatrix.kappa());
print('Producer Accuracy:', confusionMatrix.producersAccuracy());
print('User Accuracy:', confusionMatrix.consumersAccuracy());
// 9. Apply the classifier to the full image
var classified = image.classify(classifier);
Map.centerObject(image, 10);
Map.addLayer(classified, {min: 0, max: 2, palette: ['blue', 'green', 'gray']}, 'Classification');
What you should see
Console output showing the confusion matrix, overall accuracy (a value from 0 to 1), kappa coefficient, and per-class accuracies. The map displays the classified image. With the tiny six-point example, the test set may hold only one or two points, so expect unstable numbers until you add more training data.
Understanding the confusion matrix
The confusion matrix shows where your classifier makes mistakes:
| | Predicted: Water | Predicted: Vegetation | Predicted: Urban | Row Total |
|---|---|---|---|---|
| Actual: Water | 45 | 3 | 2 | 50 |
| Actual: Vegetation | 2 | 43 | 5 | 50 |
| Actual: Urban | 1 | 4 | 45 | 50 |
| Column Total | 48 | 50 | 52 | 150 |
- Diagonal: Correct classifications
- Off-diagonal: Misclassifications
- Overall Accuracy: (45 + 43 + 45) / 150 = 88.7% (verified in the sketch below)
- Producer's Accuracy (Water): 45/50 = 90% (how much of the actual water was found)
- User's Accuracy (Water): 45/48 = 93.8% (how reliable the mapped "water" pixels are)
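You can reproduce these numbers in the Code Editor by wrapping the table in an ee.ConfusionMatrix. A minimal sketch (the values are the hypothetical table above, not real results):

// Build the worked-example table as a ConfusionMatrix and verify the metrics
var example = ee.ConfusionMatrix(ee.Array([
  [45, 3, 2],
  [2, 43, 5],
  [1, 4, 45]
]));
print('Overall accuracy:', example.accuracy());            // ≈ 0.887
print("Producer's accuracy:", example.producersAccuracy()); // per-class column
print("User's accuracy:", example.consumersAccuracy());     // per-class row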
Best practices
1. Use independent validation data
Never test on the same points you trained with. Options:
- Random split: 70% training, 30% validation (easiest)
- Spatial split: Train in one area, test in another (keeps spatially autocorrelated points from inflating accuracy; see the sketch after this list)
- Temporal split: Train on one date, test on another
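A minimal sketch of a spatial split, assuming two hypothetical rectangles (trainRegion and testRegion) that you would normally draw yourself in the Code Editor:

// Split reference points by location instead of at random
var trainRegion = ee.Geometry.Rectangle([-122.12, 37.40, -122.08, 37.44]);
var testRegion = ee.Geometry.Rectangle([-122.08, 37.40, -122.04, 37.44]);
var spatialTrain = allPoints.filterBounds(trainRegion);
var spatialTest = allPoints.filterBounds(testRegion);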
2. Stratified sampling
Ensure each class is represented proportionally in both training and test sets. A simple random split only approximates this, so check the per-class counts; for a guaranteed number per class, see the sketch below:
// Random split (preserves class proportions only approximately)
var split = allPoints.randomColumn('random');
var trainSet = split.filter(ee.Filter.lt('random', 0.7));
var testSet = split.filter(ee.Filter.gte('random', 0.7));
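If you need a fixed number of samples per class, Earth Engine's stratifiedSample can draw them from a labeled image. A minimal sketch, assuming the classified result from the quick win serves as the label source (in a real assessment, labels should come from independent reference data):

// Draw up to 100 pixels per class, carrying the predictor bands along
var stratified = classified.addBands(image).stratifiedSample({
  numPoints: 100,              // per class
  classBand: 'classification',
  region: image.geometry(),
  scale: 30,
  seed: 42,
  geometries: true
});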
3. Sufficient sample size
Rule of thumb: At least 50 validation points per class, more for rare classes.
4. Report all metrics
Don't just report overall accuracy—include the full confusion matrix and per-class metrics.
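One convenient way to keep a record of the full matrix is to export it. A minimal sketch, assuming the confusionMatrix variable from the quick win above:

// Wrap the matrix and summary metrics in a null-geometry Feature so
// they can be exported as a one-row CSV (a common Code Editor workaround)
var matrixFeature = ee.Feature(null, {
  matrix: confusionMatrix.array(),
  overall_accuracy: confusionMatrix.accuracy(),
  kappa: confusionMatrix.kappa()
});
Export.table.toDrive({
  collection: ee.FeatureCollection([matrixFeature]),
  description: 'accuracy_assessment',
  fileFormat: 'CSV'
});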
Pro tips
- Use .errorMatrix('actual', 'predicted') with the correct argument order: the reference (actual) property first, the predicted property second.
- For large datasets, use testSet.limit(5000) to avoid memory issues.
- Export the confusion matrix to Drive for publication-ready tables (see the sketch under "Report all metrics" above).
- Consider the F1-score for imbalanced datasets (not built-in, but calculable from precision and recall, as sketched below).
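A minimal sketch of a per-class F1 computation, assuming the confusionMatrix variable from the quick win (and that every class appears in the test set, so no division by zero occurs):

// producersAccuracy() (recall) is an N x 1 column; consumersAccuracy()
// (precision) is a 1 x N row, so transpose it before combining
var recall = confusionMatrix.producersAccuracy();
var precision = confusionMatrix.consumersAccuracy().matrixTranspose();
// F1 = 2 * precision * recall / (precision + recall), per class
var f1 = precision.multiply(recall).multiply(2)
  .divide(precision.add(recall));
print('Per-class F1:', f1);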
Try it: Improve your accuracy
- Add more training points for each class (aim for 50+ per class).
- Include additional spectral indices (NDVI, NDWI) as input bands (see the sketch after this list).
- Experiment with different classifiers (CART, SVM, Random Forest).
- Compare accuracy metrics before and after each change.
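A minimal sketch for adding NDVI and NDWI as extra predictor bands, using the Landsat 8 SR band names from the quick win (B5 = NIR, B4 = red, B3 = green):

// Compute the indices and append them to the predictor stack
var ndvi = image.normalizedDifference(['SR_B5', 'SR_B4']).rename('NDVI');
var ndwi = image.normalizedDifference(['SR_B3', 'SR_B5']).rename('NDWI');
var withIndices = image.addBands(ndvi).addBands(ndwi);
// Re-run sampleRegions and training with withIndices in place of image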
Common mistakes
- Testing on training data (produces overly optimistic accuracy).
- Ignoring class imbalance (rare classes may have poor accuracy).
- Reporting only overall accuracy (hides per-class problems).
- Using too few validation points (unstable accuracy estimates).
- Confusing producer's and user's accuracy.
Quick self-check
- Why should training and validation data be separate?
- What does a high user's accuracy but low producer's accuracy indicate?
- What is kappa, and when is it useful?
- How many validation points per class is a good minimum?
Next steps
- Introduction to Classification
- Unsupervised Classification
- Exporting Data - save your accuracy report