LAB INSTRUCTIONS

John Snow & Spatial Analytics

The 1854 Cholera Epidemic

🎯 Lab Objectives

In this lab, you will step into the shoes of Dr. John Snow, the "father of modern epidemiology." In 1854, a severe cholera outbreak hit London's Soho district. Before the germ theory of disease was widely accepted, Snow mapped the cholera deaths and the public water pumps, ultimately pinpointing the Broad Street pump as the source of the infected water.

You will recreate this historic analysis using two different approaches:

  1. Option 1: QGIS (Traditional Desktop GIS) - A visual, click-driven approach to spatial analysis and cartography.
  2. Option 2: VS Code & Python (Modern AI-Assisted Scripting) - A code-driven approach using Python, guided by GitHub Copilot.

💾 Data Preparation

You will need two datasets for this lab (provided by your instructor):
  • cholera_deaths.csv (Contains fields: id, deaths_count, longitude, latitude)
  • water_pumps.csv (Contains fields: pump_name, longitude, latitude)
Ensure your CSV files are saved in an easily accessible folder on your computer.
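
Before starting either option, it can help to confirm that each file has the columns listed above. The following is a minimal sketch using only the Python standard library, so it runs before any packages are installed. The helper name `missing_columns` and the sample row are illustrative assumptions, not part of the lab data.

```python
import csv
import io

# Expected columns, taken from the field lists above.
EXPECTED_COLUMNS = {
    "cholera_deaths.csv": {"id", "deaths_count", "longitude", "latitude"},
    "water_pumps.csv": {"pump_name", "longitude", "latitude"},
}


def missing_columns(csv_file, expected):
    """Return the expected column names that are absent from the CSV header."""
    header = next(csv.reader(csv_file), [])
    return expected - {name.strip() for name in header}


# Made-up sample standing in for the real file (an assumption, not lab data).
sample = io.StringIO("id,deaths_count,longitude,latitude\n1,3,-0.1379,51.5134\n")
print(missing_columns(sample, EXPECTED_COLUMNS["cholera_deaths.csv"]))  # an empty set means nothing is missing
```

To check a real file, pass an open file handle instead of the sample, for example `open("cholera_deaths.csv", newline="")`.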

🗺️ Option 1: QGIS Analysis

⚠️ Legacy Note: ArcMap Screencast

If you would like to see how this lab was historically performed, you can watch the original ArcMap screencast here.

Please note that Esri's ArcMap software has been officially retired. The concepts in the video remain identical, but we will adapt the exact clicks and buttons to modern QGIS in the steps below!

Step 1: Base Map Setup

  1. Open QGIS.
  2. Start a new project.
  3. In the Browser Panel on the left, expand XYZ Tiles.
  4. Double-click on OpenStreetMap to add a basemap of the world to your canvas.

Step 2: Importing the Data

  1. Go to Layer > Add Layer > Add Delimited Text Layer... (or click the comma icon).
  2. Click the ... button next to File name and select cholera_deaths.csv.
  3. In the Geometry Definition section, set X field to longitude and Y field to latitude.
  4. Set Geometry CRS to EPSG:4326 - WGS 84.
  5. Click Add, then Close. The deaths will appear as points on your map.
  6. Repeat steps 1-5 for the water_pumps.csv file.

Step 3: Symbology (Making it Readable)

  1. Double-click the cholera_deaths layer in the Layers Panel (bottom-left) to open its Properties.
  2. Go to the Symbology tab. Change the color to Red and reduce the size to 1.5 millimeters. Click OK.
  3. Double-click the water_pumps layer.
  4. Go to Symbology. Change the symbol type to an SVG marker, pick a distinct icon (like a blue cross or a water drop), and increase the size to 4.0 millimeters. Click OK.

Step 4: Spatial Analysis (Kernel Density Estimation)

Dr. Snow visually clustered the deaths. We will use an algorithm to find the "hotspot."

  1. From the top menu, select Processing > Toolbox.
  2. In the Toolbox search bar, type Heatmap.
  3. Double-click on Heatmap (Kernel Density Estimation).
  4. Set the parameters:
    • Point layer: cholera_deaths
    • Radius: 0.002 (the layer's map units are degrees; roughly 200 meters north-south at London's latitude)
    • Weight from field: deaths_count
    • Pixel size X / Y: 0.0001
  5. Click Run and close the dialog when finished. A black-and-white raster layer will appear.
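
Because the layer uses EPSG:4326, the radius is entered in degrees rather than meters. The sketch below shows the arithmetic behind the "roughly 200 meters" figure. It assumes a spherical Earth with about 111,320 meters per degree of latitude, and the function name `degrees_to_meters` is hypothetical.

```python
import math

METERS_PER_DEGREE_LAT = 111_320  # approximate, assumes a spherical Earth


def degrees_to_meters(degrees, latitude):
    """Return (north_south_m, east_west_m) spanned by `degrees` at `latitude`."""
    north_south = degrees * METERS_PER_DEGREE_LAT
    # Lines of longitude converge toward the poles, so east-west spans shrink.
    east_west = north_south * math.cos(math.radians(latitude))
    return north_south, east_west


ns, ew = degrees_to_meters(0.002, 51.513)
print(f"0.002 degrees is about {ns:.0f} m north-south and {ew:.0f} m east-west")
```

At Soho's latitude the same radius covers less ground east-west than north-south, so the search area is slightly squashed on the ground. That is acceptable for this lab. If a true metric radius matters, one option is to reproject the points to a projected CRS such as EPSG:27700 (British National Grid) first.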

Step 5: Styling the Heatmap

  1. Double-click the new Heatmap layer and go to Symbology.
  2. Change Render type to Singleband pseudocolor.
  3. Choose a Color ramp like YlOrRd (Yellow-Orange-Red).
  4. Open the Transparency tab on the left and set Global Opacity to 60%. Click OK.
  5. Analysis: Look at the dark red hotspot. Which water pump is located squarely inside the highest concentration of deaths? (Hint: The Broad Street pump).
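
For a sense of what the heatmap raster represents, here is a simplified sketch of weighted kernel density estimation in plain Python. It assumes a quartic kernel and skips the scaling options QGIS offers, so treat it as an illustration of the idea rather than the tool's actual implementation. The sample points are made up.

```python
import math


def quartic_kernel(distance, radius):
    """Quartic kernel: 1 at the point itself, falling smoothly to 0 at `radius`."""
    if distance >= radius:
        return 0.0
    u = distance / radius
    return (1 - u * u) ** 2


def density_at(x, y, points, radius):
    """Sum the weighted kernel contributions of every (px, py, weight) point."""
    return sum(
        weight * quartic_kernel(math.hypot(x - px, y - py), radius)
        for px, py, weight in points
    )


# Made-up (longitude, latitude, deaths_count) rows, for illustration only.
sample_deaths = [
    (-0.1370, 51.5133, 4),
    (-0.1368, 51.5134, 2),
    (-0.1372, 51.5131, 3),
    (-0.1330, 51.5150, 1),
]

near_cluster = density_at(-0.1370, 51.5133, sample_deaths, radius=0.002)
far_away = density_at(-0.1300, 51.5180, sample_deaths, radius=0.002)
print(near_cluster > far_away)
```

Each raster pixel holds a value like `density_at` for its own center, which is why pixels near many heavily weighted deaths show up in the darkest red.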

💻 Option 2: Python Scripting with VS Code & GitHub Copilot

For students who want to build spatial analysis pipelines through code.

Step 1: Environment Setup

  1. Open VS Code and create a new folder for your lab.
  2. Copy the two CSV datasets into this folder.
  3. Open a terminal in VS Code (Ctrl + `) and install the required libraries:
    pip install pandas folium

Step 2: Using GitHub Copilot to Write the Script

Create a new file called snow_analysis.py. Instead of writing the code from scratch, let's use Copilot!

Type the following comment block at the top of your Python file and press Enter. Watch Copilot suggest the code, and press Tab to accept its suggestions line by line or block by block.

"""
Goal: Recreate John Snow's Cholera Map using Python and Folium.
1. Load cholera_deaths.csv and water_pumps.csv using pandas.
2. Create a Folium map centered on Soho, London (latitude 51.513, longitude -0.136), zoom start 15.
3. Add water pumps to the map as blue markers.
4. Add cholera deaths to the map as red circle markers. The radius should be based on the 'deaths_count'.
5. Create a HeatMap of the deaths and add it to the map.
6. Save the map as 'john_snow_map.html'.
"""

Step 3: Reference Code

If Copilot gets stuck, here is a reference script that works with the column names listed under Data Preparation:

import pandas as pd
import folium
from folium.plugins import HeatMap

# 1. Load the data
deaths_df = pd.read_csv('cholera_deaths.csv')
pumps_df = pd.read_csv('water_pumps.csv')

# 2. Initialize the map over 1854 Soho, London
m = folium.Map(location=[51.513, -0.136], zoom_start=15, tiles='CartoDB positron')

# 3. Add the water pumps
for idx, row in pumps_df.iterrows():
    folium.Marker(
        location=[row['latitude'], row['longitude']],
        popup=f"Pump: {row.get('pump_name', 'Unknown')}",
        icon=folium.Icon(color='blue', icon='tint')
    ).add_to(m)

# 4. Add the deaths as circle markers
for idx, row in deaths_df.iterrows():
    # iterrows() upcasts an all-numeric row to float, so cast the count back to int
    count = int(row.get('deaths_count', 1))
    folium.CircleMarker(
        location=[row['latitude'], row['longitude']],
        radius=count * 2,  # Scale by death count
        color='red',
        fill=True,
        fill_opacity=0.6,
        popup=f"Deaths: {count}"
    ).add_to(m)

# 5. Add a Heatmap
heat_data = [[row['latitude'], row['longitude'], row.get('deaths_count', 1)] for idx, row in deaths_df.iterrows()]
HeatMap(heat_data, radius=15, blur=10).add_to(m)

# 6. Save the map
m.save('john_snow_map.html')
print("Map successfully saved to john_snow_map.html!")

Step 4: Run and Analyze

  1. Run the script in the VS Code terminal: python snow_analysis.py
  2. Open the resulting john_snow_map.html in your web browser.
  3. Observe the interactive map. The heatmap should center on the Broad Street pump, the same source Dr. Snow identified in 1854!
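
To back up the visual impression with a number, one option is to tally the deaths by nearest pump. The sketch below uses only the standard library and straight-line (haversine) distance, which ignores the street layout people actually walked. The function names, pump names, and coordinates are made-up placeholders, not the lab data.

```python
import math
from collections import Counter


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two latitude/longitude points."""
    earth_radius_m = 6_371_000
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = math.sin(d_phi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    return 2 * earth_radius_m * math.asin(math.sqrt(a))


def deaths_per_nearest_pump(deaths, pumps):
    """Add each location's deaths_count to the tally of its closest pump."""
    tally = Counter()
    for death in deaths:
        nearest = min(
            pumps,
            key=lambda p: haversine_m(death["latitude"], death["longitude"], p["latitude"], p["longitude"]),
        )
        tally[nearest["pump_name"]] += death["deaths_count"]
    return tally


# Made-up rows for illustration only; names and coordinates are placeholders.
sample_pumps = [
    {"pump_name": "Pump A", "latitude": 51.5133, "longitude": -0.1367},
    {"pump_name": "Pump B", "latitude": 51.5150, "longitude": -0.1330},
]
sample_deaths = [
    {"deaths_count": 3, "latitude": 51.5134, "longitude": -0.1370},
    {"deaths_count": 2, "latitude": 51.5131, "longitude": -0.1365},
    {"deaths_count": 1, "latitude": 51.5149, "longitude": -0.1332},
]

print(deaths_per_nearest_pump(sample_deaths, sample_pumps).most_common())
```

With the real data, the two lists could come from `deaths_df.to_dict("records")` and `pumps_df.to_dict("records")` in the reference script.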

📝 Lab Deliverable

Take a screenshot of either your QGIS canvas with the heatmap or your interactive Folium map in the browser. Submit this alongside a 1-paragraph summary explaining how spatial visualization helped identify the source of the outbreak.