Pipeline guide

This page walks through each stage of the ThermoKourt pipeline with practical examples and tips for common problems.

Stage 1 — Arena extraction

What it does

Motif records long sessions as sequential .mp4 chunks (typically 18 000 frames each at 25 fps, i.e. 12 minutes per chunk). arena_extractor concatenates these chunks and crops each circular arena into its own video file.
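The chunk arithmetic can be checked with a short sketch. It assumes every chunk except possibly the last holds exactly 18 000 frames. The helper names are illustrative and are not part of arena_extractor.

```python
FRAMES_PER_CHUNK = 18_000  # typical chunk length quoted in the text
FPS = 25                   # recording frame rate quoted in the text


def chunk_duration_s(frames=FRAMES_PER_CHUNK, fps=FPS):
    """Duration of one chunk in seconds."""
    return frames / fps


def locate_frame(global_frame, frames_per_chunk=FRAMES_PER_CHUNK):
    """Map a frame index in the concatenated video to (chunk_index, frame_in_chunk).

    Assumes all chunks before the one containing the frame are full length.
    """
    return divmod(global_frame, frames_per_chunk)


print(chunk_duration_s() / 60)  # 12.0 (minutes per chunk)
print(locate_frame(45_000))     # (2, 9000): third chunk, frame 9000
```

A mapping like this is handy when a problem seen at a given time in an arena video has to be traced back to the original Motif chunk.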

Arena detection

The first frame is analysed with a Hough Circle Transform. The detector sweeps accumulator thresholds from strict to loose until it finds the expected number of arenas (default: 10). It tries OpenCV first and falls back to scikit-image if OpenCV is not installed.
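The strict-to-loose sweep can be sketched as follows. This is an illustration of the idea, not the arena_extractor source: the function names, the threshold values, the fixed Hough parameters (dp=1.2, param1=100) and the "return the closest result" fallback are all assumptions. The only real library call is cv2.HoughCircles, where param2 is the accumulator threshold for the HOUGH_GRADIENT method.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Circle = Tuple[float, float, float]  # (x, y, radius) in pixels


def sweep_thresholds(
    detect: Callable[[float], List[Circle]],
    thresholds: Sequence[float],
    expected: int = 10,
) -> Tuple[Optional[float], List[Circle]]:
    """Try thresholds in the given order (strict first) until `expected` circles appear.

    Returns (threshold, circles) on an exact match. If no threshold matches,
    returns (None, circles) for the attempt whose count was closest to `expected`
    (an assumed fallback, so an editor would still have something to show).
    """
    best: List[Circle] = []
    for threshold in thresholds:
        circles = detect(threshold)
        if len(circles) == expected:
            return threshold, circles
        if abs(len(circles) - expected) < abs(len(best) - expected):
            best = circles
    return None, best


def opencv_detector(gray, min_radius: int, max_radius: int):
    """Wrap cv2.HoughCircles as a detect(threshold) callable (requires OpenCV)."""
    import cv2  # imported lazily so the sweep logic has no hard dependency

    def detect(threshold: float) -> List[Circle]:
        found = cv2.HoughCircles(
            gray, cv2.HOUGH_GRADIENT, 1.2, 2 * min_radius,
            param1=100, param2=threshold,  # param2: accumulator threshold
            minRadius=min_radius, maxRadius=max_radius,
        )
        return [] if found is None else [tuple(map(float, c)) for c in found[0]]

    return detect
```

With OpenCV installed, a call might look like sweep_thresholds(opencv_detector(gray, 80, 140), [60, 50, 40, 30, 20]), where gray is the first frame as an 8-bit greyscale image and the radius bounds are placeholders for your own setup. Keeping the detector behind a callable also makes it straightforward to swap in a scikit-image implementation.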

Interactive editor

The detected circles are presented in a matplotlib window. Each arena has:

  • A centre marker (+) — drag to reposition

  • A rim handle (◇) — drag to resize

  • A dashed bounding box — shows the actual crop region
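To make the relationship between a circle and its crop region concrete, here is a minimal sketch. It assumes the crop is the axis-aligned square that encloses the circle, clamped to the frame. The function name and the optional margin parameter are hypothetical, and the real extractor may pad or round differently.

```python
def crop_box(cx, cy, r, frame_w, frame_h, margin=0):
    """Return (x0, y0, x1, y1) of the square enclosing a circle, clamped to the frame.

    x1 and y1 are exclusive, so frame[y0:y1, x0:x1] selects the crop.
    """
    x0 = max(0, int(round(cx - r - margin)))
    y0 = max(0, int(round(cy - r - margin)))
    x1 = min(frame_w, int(round(cx + r + margin)))
    y1 = min(frame_h, int(round(cy + r + margin)))
    return x0, y0, x1, y1


print(crop_box(100, 100, 50, 1920, 1080))  # (50, 50, 150, 150)
print(crop_box(30, 40, 50, 1920, 1080))    # (0, 0, 80, 90): clipped at the frame edge
```

The second call shows why an arena placed too close to the frame edge yields a smaller, non-square crop, which is worth checking in the editor before extracting.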

Key bindings:

Key                  Action
Drag centre          Move arena
Drag rim             Resize arena
Right-click empty    Add new arena
Right-click centre   Delete arena
A                    Re-run auto-detection
+/-                  Adjust all radii ±5 px
U                    Set all radii to median
S                    Save arena positions to JSON
Q / Enter            Accept and extract
Escape               Abort
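The two bulk radius operations (+/- and U) amount to simple list transforms. The sketch below illustrates them with hypothetical helper names. The minimum-radius guard is an assumption added so that repeated shrinking cannot produce a non-positive radius.

```python
from statistics import median


def adjust_radii(radii, delta, min_radius=1):
    """Add `delta` pixels to every radius (e.g. +5 or -5), never going below min_radius."""
    return [max(min_radius, r + delta) for r in radii]


def unify_radii(radii):
    """Set every radius to the median of the current radii."""
    m = median(radii)
    return [m] * len(radii)


radii = [100, 104, 98]
print(adjust_radii(radii, 5))  # [105, 109, 103]
print(unify_radii(radii))      # [100, 100, 100]
```

Using the median rather than the mean means a single badly detected circle does not pull every arena towards its radius.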

Output

For a recording named droso_18deg_20251002_102448 with 10 arenas, the extractor writes:

droso_18deg_20251002_102448_arenas.json   # arena positions (reusable)
droso_18deg_20251002_102448_arena_00.mp4
droso_18deg_20251002_102448_arena_01.mp4
...
droso_18deg_20251002_102448_arena_09.mp4
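The naming pattern above is regular enough to generate in a script, for example when checking that an extraction finished. The helper below is a sketch with a hypothetical name. It only reproduces the pattern shown in the listing (zero-based, two-digit arena index).

```python
def arena_outputs(recording, n_arenas=10):
    """Return (json_name, video_names) following the naming pattern shown above."""
    json_name = f"{recording}_arenas.json"
    videos = [f"{recording}_arena_{i:02d}.mp4" for i in range(n_arenas)]
    return json_name, videos


json_name, videos = arena_outputs("droso_18deg_20251002_102448")
print(json_name)   # droso_18deg_20251002_102448_arenas.json
print(videos[0])   # droso_18deg_20251002_102448_arena_00.mp4
print(videos[-1])  # droso_18deg_20251002_102448_arena_09.mp4
```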

Reusing arena positions

If arena positions are stable across recordings (same camera setup), pass a previously saved JSON file with --arenas:

thermokourt-extract /data/new_recording --arenas old_recording_arenas.json

Stage 2 — Identity tracking

Why idtracker.ai?

For 3 unmarked Drosophila in a small arena, idtracker.ai v6 achieves 99.9% identity accuracy on comparable benchmarks. It works by learning a visual fingerprint for each individual, which means it can distinguish the two males and the headless female without any physical markers.

Alternatives considered:

  • DeepLabCut (multi-animal): excellent for pose estimation but identity tracking is less robust for visually similar flies

  • SLEAP: fast and modular, but flow-shift tracking has higher ID-switch rates than idtracker.ai for small groups

  • STCS: promising newer approach but less battle-tested

Running the tracker

thermokourt-track arena_00.mp4 --n_animals 3 --backend idtracker

Output: an HDF5 file with per-frame centroid coordinates and identity labels.

Stage 3 — Identity overlay

This stage renders a colour-coded halo around each tracked individual:

  • Male 1: teal (semi-transparent halo)

  • Male 2: orange (semi-transparent halo)

  • Headless female: automatically assigned remaining colour

thermokourt-overlay arena_00.mp4 --tracks arena_00_tracks.h5
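A semi-transparent halo is, at its core, alpha blending with an opacity that falls off with distance from the centroid. The per-pixel sketch below illustrates the idea under several assumptions: the RGB values for teal and orange, the linear fall-off and the 0.5 peak opacity are illustrative choices, not values taken from thermokourt-overlay.

```python
TEAL = (0, 128, 128)    # assumed RGB value for "teal"
ORANGE = (255, 165, 0)  # assumed RGB value for "orange"


def halo_alpha(dist, radius, max_alpha=0.5):
    """Opacity fading linearly from max_alpha at the centroid to 0 at `radius`."""
    if dist >= radius:
        return 0.0
    return max_alpha * (1 - dist / radius)


def blend(pixel, colour, alpha):
    """Alpha-blend `colour` over `pixel`; alpha=0 keeps the pixel, alpha=1 replaces it."""
    return tuple(round((1 - alpha) * p + alpha * c) for p, c in zip(pixel, colour))


print(halo_alpha(10, 20))              # 0.25
print(blend((0, 0, 0), TEAL, 0.5))     # (0, 64, 64)
print(blend((200, 200, 200), ORANGE, halo_alpha(20, 20)))  # (200, 200, 200): outside the halo
```

Because the opacity never reaches 1, the fly itself stays visible under the halo, which matters for the manual scoring in the next stage.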

Stage 4 — Manual ethogram annotation

Open overlay videos in GameThogram. Define behaviours (courtship wing extension, lunge, chase, etc.) and score frame-by-frame with a gamepad.
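Frame-by-frame scores are often easier to inspect as bouts with start and end times. The sketch below assumes the annotations can be loaded as one label per frame, with None for unscored frames. That layout is an assumption for illustration and not GameThogram's documented export format. The 25 fps value is the recording rate quoted in Stage 1.

```python
from itertools import groupby

FPS = 25  # recording frame rate from Stage 1


def frames_to_bouts(labels, fps=FPS):
    """Collapse per-frame labels into (label, start_s, end_s) bouts.

    Consecutive frames with the same label form one bout; None frames are skipped.
    """
    bouts = []
    start = 0
    for label, run in groupby(labels):
        n = len(list(run))
        if label is not None:
            bouts.append((label, start / fps, (start + n) / fps))
        start += n
    return bouts


labels = [None] * 25 + ["wing_extension"] * 50 + [None] * 25 + ["chase"] * 25
print(frames_to_bouts(labels))
# [('wing_extension', 1.0, 3.0), ('chase', 4.0, 5.0)]
```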

Stage 5 — Automated scoring

Train a temporal CNN on the manual annotations, then batch-infer across all remaining videos on the Aoraki HPC cluster:

# On Aoraki
sbatch scripts/slurm/train_scorer.sh --annotations /data/manual/ --epochs 100
sbatch scripts/slurm/infer_scorer.sh --model best.pt --videos /data/overlays/