Chapter 05

Digitizing Vector Data

The art of precision. Learn how to convert geographic features into clean, accurate, and topologically sound vector data.

At a Glance

Prereqs: Chapters 01, 04
Time: 25 min read + 30 min practice
Deliverable: Digitized training polygons

Learning outcomes

  • Digitize clean vector features with appropriate vertex density.
  • Apply basic QA checks (snapping, no gaps/overlaps).
  • Explain how digitizing choices affect analysis and classification.

Key terms

digitizing, snapping, topology, vertex, attribute table, training data

Stop & check

  1. Why is snapping important when digitizing adjacent polygons?

    Answer: It prevents gaps and overlaps.

    Why: Topological errors propagate into area totals and overlays.

    Common misconception: "Tiny errors do not matter." In fact, they can create thousands of slivers in analysis.

  2. Why do digitized training labels affect classification results?

    Answer: The model learns whatever you label as truth.

    Why: Label bias and mixed pixels reduce separability and accuracy.

    Common misconception: "More labels always help." Better labels and class balance matter more.

Try it (5 minutes)

  1. Digitize one simple polygon and zoom in to inspect edges.
  2. List one place where you should avoid digitizing (ambiguous boundary).

Lab (Two Tracks)

Both tracks produce the same deliverable: a small set of labeled polygons (3-5 classes) plus a short QA note.

Desktop GIS Track (ArcGIS Pro / QGIS)

Digitize training polygons on imagery, add a class attribute, and run a topology/geometry check. Export the result as GeoPackage or GeoJSON.

Remote Sensing Track (Google Earth Engine)

Create a FeatureCollection of training polygons (drawing tools) and export it. Write a QA note about class balance.
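For the QA note on class balance, a quick tally of polygon count and total area per class is often enough to spot a lopsided training set. The following is a minimal sketch in plain Python, assuming the polygons are available as (class name, list of x/y vertices) pairs in a projected coordinate system. The sample data, the function names, and the input layout are illustrative assumptions, not part of either lab track.

```python
from collections import defaultdict


def polygon_area(ring):
    """Planar area of a simple polygon ring using the shoelace formula."""
    total = 0.0
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]  # wrap around to close the ring
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def class_balance(features):
    """Return {class_name: (polygon_count, total_area)}."""
    summary = defaultdict(lambda: [0, 0.0])
    for class_name, ring in features:
        summary[class_name][0] += 1
        summary[class_name][1] += polygon_area(ring)
    return {name: (count, area) for name, (count, area) in summary.items()}


# Hypothetical training polygons, coordinates in metres.
training = [
    ("forest", [(0, 0), (100, 0), (100, 100), (0, 100)]),
    ("forest", [(200, 0), (300, 0), (300, 50), (200, 50)]),
    ("water", [(0, 200), (20, 200), (20, 220), (0, 220)]),
]

balance = class_balance(training)
for name, (count, area) in sorted(balance.items()):
    print(f"{name}: {count} polygon(s), {area:.0f} square metres")
```

In this made-up sample, forest has far more labeled area than water, which is the kind of imbalance worth recording in the QA note.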

Common mistakes

  • Over-digitizing: too many vertices without adding accuracy.
  • Labeling mixed pixels (boundaries) as pure classes.
  • Forgetting to record class definitions (what counts as forest vs shrub?).
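
Over-digitizing can be made concrete by checking how many vertices survive a simplification pass at a tolerance well below the imagery resolution. The sketch below uses the Douglas-Peucker algorithm, which this chapter does not name; it is chosen here only as a familiar illustration. The sample line and the 0.05 tolerance are assumptions.

```python
import math


def point_segment_distance(p, a, b):
    """Shortest distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its endpoints.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def simplify(points, tolerance):
    """Douglas-Peucker simplification of an open line."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    worst_index, worst_dist = 0, 0.0
    for i in range(1, len(points) - 1):
        d = point_segment_distance(points[i], first, last)
        if d > worst_dist:
            worst_index, worst_dist = i, d
    if worst_dist <= tolerance:
        # Every intermediate vertex is within tolerance: drop them all.
        return [first, last]
    left = simplify(points[: worst_index + 1], tolerance)
    right = simplify(points[worst_index:], tolerance)
    return left[:-1] + right  # avoid repeating the shared vertex


# Hypothetical over-digitized edge: four nearly collinear vertices, then a corner.
over_digitized = [(0, 0), (1, 0.02), (2, -0.01), (3, 0.01), (4, 0), (4, 3)]
simplified = simplify(over_digitized, tolerance=0.05)
print(len(over_digitized), "->", len(simplified), "vertices")
```

If most vertices disappear at a tolerance smaller than one pixel, they were not adding accuracy. The real corner at (4, 0) is kept because it lies far from the straight line between the endpoints.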

Further reading: https://www.ucgis.org/site/gis-t-body-of-knowledge

What is Digitizing?

Digitizing is the process of converting geographic data from a scanned map or satellite image into a digital vector format (points, lines, and polygons). While modern AI can automate some of this, manual heads-up digitizing remains a core skill for GIS analysts where high precision is required.

⚠️ The Golden Rule: Trash in, trash out. If your digitizing is sloppy, every analysis you perform afterwards will be flawed. Precision starts here.

Interactive: The Snapping Challenge

In vector GIS, connected features must share exactly the same coordinates where they meet. Snapping automatically pulls your cursor to an existing node whenever the cursor comes within the Snapping Tolerance, a search radius that is usually drawn as a dashed circle around the cursor.
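
The snapping behavior described here can be sketched in a few lines: find the nearest existing node, and use its coordinates only if it falls inside the tolerance radius. This is a simplified illustration, not how any particular GIS package implements snapping. The function name, node list, and tolerance value are assumptions.

```python
import math


def snap(cursor, nodes, tolerance):
    """Return the nearest node within tolerance, or the cursor unchanged."""
    best, best_dist = None, tolerance
    for node in nodes:
        d = math.hypot(cursor[0] - node[0], cursor[1] - node[1])
        if d <= best_dist:
            best, best_dist = node, d
    return best if best is not None else cursor


# Hypothetical existing nodes and a tolerance of 0.5 map units.
nodes = [(10.0, 10.0), (20.0, 10.0)]
snapped = snap((10.3, 9.8), nodes, tolerance=0.5)    # inside the radius
unsnapped = snap((15.0, 10.0), nodes, tolerance=0.5)  # too far from any node
print(snapped, unsnapped)
```

The first click lands exactly on the existing node, so the two features share identical coordinates. The second click is left where it was because no node lies within the tolerance.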


Common Digitizing Errors

Maintaining topology is crucial. Watch out for these common mistakes:

  • Overshoots: Lines that extend past their intended junction.
  • Undershoots: Lines that fail to reach their intended junction (gaps).
  • Sliver Polygons: Tiny, accidental polygons created by gaps or overlaps between two adjacent polygons.
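
Sliver polygons tend to be both small and very thin, so one common QA heuristic is to flag polygons whose area and compactness both fall below a threshold. The sketch below uses the ratio 4πA/P² as the compactness measure (1.0 for a circle, about 0.785 for a square). Both threshold values are hypothetical and would need tuning to the map scale and units; this is one possible check, not a full topology validation.

```python
import math


def ring_area_perimeter(ring):
    """Return (area, perimeter) of a simple polygon ring."""
    area2, perimeter = 0.0, 0.0
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        area2 += x1 * y2 - x2 * y1
        perimeter += math.hypot(x2 - x1, y2 - y1)
    return abs(area2) / 2.0, perimeter


def is_sliver(ring, max_area=50.0, min_compactness=0.1):
    """Flag polygons that are both small and thin (assumed thresholds)."""
    area, perimeter = ring_area_perimeter(ring)
    if perimeter == 0:
        return True  # degenerate geometry: all vertices coincide
    compactness = 4 * math.pi * area / perimeter ** 2
    return area < max_area and compactness < min_compactness


# Hypothetical data: a 100 x 100 parcel and a 0.1-wide gap beside it.
parcel = [(0, 0), (100, 0), (100, 100), (0, 100)]
gap = [(100, 0), (100.1, 0), (100.1, 100), (100, 100)]
print(is_sliver(parcel), is_sliver(gap))
```

Requiring both conditions keeps small but compact features, such as a garden shed, from being flagged alongside genuine slivers.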

Regional Decision: Mapping the New Campus

Your university is building a new research wing. You have been handed a high-resolution drone image of the construction site. You need to digitize the new building's footprint.

The Dilemma: One corner of the building is partially obscured by a tree canopy. Do you:

  • A) Guess the corner based on the visible roof lines?
  • B) Use the "Parallel" tool to infer the corner from the opposite wall?
  • C) Wait for the winter "leaf-off" imagery to be collected?
Expert Insight: Most professionals choose Option B. Using geometric constraints (orthogonality) is more accurate than guessing, and project deadlines rarely allow for waiting for a new sensor pass.
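
The geometric idea behind Option B can be shown with simple vector arithmetic: if three corners of a rectangular footprint are visible, the hidden corner follows from the fact that opposite walls are parallel and equal in length. This sketch shows the underlying constraint only, not the actual behavior of any software's "Parallel" tool. The coordinates and function names are made up for illustration.

```python
def infer_fourth_corner(a, b, c):
    """Given consecutive corners a, b, c of a parallelogram, return the
    corner opposite b. Opposite sides are parallel and equal, so d = a + c - b."""
    return (a[0] + c[0] - b[0], a[1] + c[1] - b[1])


def is_right_angle(a, b, c, tol=1e-6):
    """Check that walls b-a and b-c meet at 90 degrees (dot product near 0)."""
    bax, bay = a[0] - b[0], a[1] - b[1]
    bcx, bcy = c[0] - b[0], c[1] - b[1]
    return abs(bax * bcx + bay * bcy) <= tol


# Hypothetical visible corners of a rotated rectangular building.
a, b, c = (2, 1), (6, 4), (3, 8)
hidden = infer_fourth_corner(a, b, c)
print(hidden, is_right_angle(a, b, c))
```

Checking the right angle at the visible corner first is a useful sanity test: if the digitized walls are not orthogonal, the inferred corner inherits that error.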

Summary of Big Ideas

  • Heads-up digitizing is the standard method for manually creating vector data directly over an image.
  • Snapping is the critical tool for ensuring topology and geometric connectivity.
  • Attribute entry happens alongside digitizing, ensuring the data is "rich" from the moment of creation.
  • Clean data is the foundation of all valid spatial analysis—errors here propagate through the entire workflow.

Chapter 05 Checkpoint

1. What tool should you ALWAYS use to ensure that two adjacent polygons share a perfect boundary?

  • Snapping (and Auto-Complete Polygon tools)
  • Eye-balling it at high zoom

2. A "sliver polygon" is usually caused by:

  • Poor topology and lack of snapping between features.
  • Using the wrong map projection.

Chapter Glossary

Nodes: The start and end points of a line feature; essential for topology.
Vertices: The intermediate points that define the turns and shape of a line or polygon.
Snapping Tolerance: The "gravity" radius within which the cursor will automatically connect to a nearby feature.

BoK Alignment

Topics in the UCGIS GIS&T Body of Knowledge that support this chapter.