Usage

STAG processes wearable accelerometer data through five sequential stages. Each stage can be run independently once its inputs are available.

Stage 1 — Sensor synchronisation

The head and ear accelerometers are aligned using calibration-drop events (three controlled drops from 1.5 m recorded by both sensors simultaneously).

import pandas as pd
from stag.sync.data_sync import BetterDataSync

head_df = pd.read_csv("raw_data/R1_D1_head.csv")
ear_df  = pd.read_csv("raw_data/R1_D1_ear.csv")

syncer = BetterDataSync(
    deer_id="R1_D1",
    head_data=head_df,
    ear_data=ear_df,
    window_dict={"start": 0, "end": 50000},
    mkplot=True,
    plot_folder="plots/sync/",
)
syncer.run_synchronization()

Alternatively, run the command-line script for HPC submission:

python scripts/run_sync_aoraki.py --deer_code R1_D1 --path_sys cluster_paths
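
The idea behind drop-based alignment can be illustrated with a cross-correlation lag estimate. This is a minimal sketch on synthetic data, not the method used by BetterDataSync: the function name `estimate_lag` and the impulse-shaped "drops" are assumptions made for illustration only.

```python
import numpy as np


def estimate_lag(reference: np.ndarray, delayed: np.ndarray) -> int:
    """Return the number of samples by which `delayed` lags `reference`.

    Hypothetical helper: a positive result means events appear later
    in `delayed` than in `reference`.
    """
    ref = reference - reference.mean()
    dly = delayed - delayed.mean()
    corr = np.correlate(dly, ref, mode="full")
    # In "full" mode, index len(ref) - 1 corresponds to zero lag.
    return int(np.argmax(corr)) - (len(ref) - 1)


# Synthetic example: three sharp "drop" impulses seen by both sensors,
# with the second sensor's clock running 37 samples behind.
head = np.zeros(1000)
head[[100, 400, 700]] = 1.0
ear = np.roll(head, 37)

lag = estimate_lag(head, ear)
ear_aligned = ear[lag:]  # discard the leading samples to line the traces up
```

Sharp, isolated events such as drops give a single dominant correlation peak, which is why they are convenient as synchronisation markers.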

Stage 2 — Database construction

Synchronised .h5 files are ingested into a SQLite database using the SQLAlchemy ORM defined in stag.database.

# Insert a single deer's data
python stag/database/deer_info.py sqlite:///deer_data.db data/synced/R1_D1.h5

# Batch insert via SLURM
sbatch slurm/make_deer_db.sh

Stage 3 — GPS trajectory features

Ground speed and tortuosity are computed from GPS fixes projected onto the New Zealand Map Grid (EPSG:27200).

from stag.gps.analysis import main as process_gps

deer_df = process_gps("data/synced/R1_D1.h5")
print(deer_df[["abs_speed_mPs", "tortuosity"]].describe())
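
For readers unfamiliar with these features, the sketch below shows one common way to derive them from projected coordinates. It assumes easting/northing in metres and timestamps in seconds, and it defines tortuosity as path length divided by straight-line displacement; STAG's own definition and windowing may differ. The function names are hypothetical.

```python
import numpy as np


def ground_speed(x, y, t):
    """Speed (m/s) between consecutive fixes.

    x, y: projected coordinates in metres; t: timestamps in seconds.
    """
    x, y, t = (np.asarray(a, dtype=float) for a in (x, y, t))
    step = np.hypot(np.diff(x), np.diff(y))  # distance per step
    return step / np.diff(t)


def tortuosity(x, y):
    """Path length over straight-line displacement (1.0 = straight path)."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    path = np.hypot(np.diff(x), np.diff(y)).sum()
    straight = np.hypot(x[-1] - x[0], y[-1] - y[0])
    # A closed loop has zero displacement; report it as infinitely tortuous.
    return np.inf if straight == 0 else path / straight


# Toy track: 3 m east in 1 s, then 4 m north in 2 s.
speeds = ground_speed([0, 3, 3], [0, 0, 4], [0, 1, 3])
ratio = tortuosity([0, 3, 3], [0, 0, 4])
```

Working in a projected grid rather than latitude/longitude is what allows plain Euclidean distances to be used here.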

Stage 4 — GPU clustering

This stage runs k-means clustering with a contiguous leave-out stability analysis. The script accepts command-line arguments for integration with SLURM array jobs.

python stag/clustering/kmeans.py \
    -t deer8 -nc 8 -ds 0 -dp 0 -rs 0 \
    -df data/clust_data_deer8.npy \
    -sd results/clustering/
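
A contiguous leave-out removes one unbroken block of samples before clustering, so that the resulting centroids can be compared against those from the full data set. The sketch below is an illustration under assumptions: the function name is hypothetical, and expressing deletion size and position as fractions of the recording is a guess at what -ds and -dp control.

```python
import numpy as np


def contiguous_leave_out(data, size_frac, pos_frac):
    """Drop one contiguous block of rows from `data`.

    size_frac: fraction of rows to delete (assumed meaning of "deletion size").
    pos_frac:  fractional start position of the block (assumed meaning of
               "deletion position"); clamped so the block fits in the data.
    """
    n = len(data)
    n_drop = int(round(n * size_frac))
    start = min(int(round(n * pos_frac)), n - n_drop)
    keep = np.ones(n, dtype=bool)
    keep[start:start + n_drop] = False
    return data[keep]


samples = np.arange(100)
reduced = contiguous_leave_out(samples, size_frac=0.10, pos_frac=0.50)
```

Deleting a contiguous block, rather than random samples, respects the temporal autocorrelation of accelerometer data: neighbouring samples are near-duplicates, so random deletion would barely perturb the clustering.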

To sweep over the full parameter grid (k = 2–50, deletion sizes 0/10/25/50 %, deletion positions in 2 % steps):

sbatch slurm/run_slurm_main_clustering.sh
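
To give a sense of the size of this sweep, the grid can be enumerated as a plain Cartesian product. This is a hypothetical enumeration: it assumes deletion positions run from 0 % to 98 %, and the actual SLURM script may order the combinations differently or prune redundant ones (for example, every position is equivalent when the deletion size is 0 %).

```python
from itertools import product

ks = range(2, 51)                      # k = 2..50
deletion_sizes = (0, 10, 25, 50)       # percent of samples removed
deletion_positions = range(0, 100, 2)  # percent, 2 % steps (assumed 0..98)

grid = list(product(ks, deletion_sizes, deletion_positions))


def params_for_task(task_id):
    """Map a zero-based array-job index to (k, deletion size, deletion position)."""
    return grid[task_id]
```

Under these assumptions the grid holds 49 × 4 × 50 = 9800 combinations, which is why the sweep is submitted as a batch job rather than run interactively.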

Post-hoc model selection uses stag.clustering.meta_analysis.ClusterMetaAnalysis to evaluate Calinski–Harabasz quality and Hungarian-matched centroid stability across all runs.
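
Hungarian matching is needed because k-means assigns cluster indices arbitrarily, so centroid 3 in one run may correspond to centroid 0 in another. The sketch below pairs centroids from two runs with SciPy's `linear_sum_assignment` and reports the mean matched distance. It is an illustration only: the function name and the choice of Euclidean distance as the stability measure are assumptions, not the ClusterMetaAnalysis implementation.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def matched_centroid_distance(ref_centroids, new_centroids):
    """Pair centroids across two runs and return (mapping, mean distance).

    mapping[i] is the index in `new_centroids` matched to ref_centroids[i].
    """
    # Pairwise Euclidean distances, shape (k, k).
    diff = ref_centroids[:, None, :] - new_centroids[None, :, :]
    cost = np.linalg.norm(diff, axis=2)
    rows, cols = linear_sum_assignment(cost)  # minimises total matched distance
    return cols, cost[rows, cols].mean()


# Toy example: the second run finds the same centroids, permuted and shifted by 0.1.
ref = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
new = ref[[2, 0, 1]] + 0.1
mapping, mean_dist = matched_centroid_distance(ref, new)
```

A small mean matched distance across leave-out runs suggests that the centroids for that k are stable with respect to which part of the recording is removed.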

Stage 5 — Behavioural analysis

The cluster label sequence is analysed for transition probabilities, bout durations, and super-prototype motifs.

from stag.analysis.label_analysis import LabelAnalyser

analyser = LabelAnalyser("results/clustering/labels.npy", fps=50)
analyser.main(cutoff=2, save_path="results/label_analysis.json")

The output JSON contains per-centroid statistics (percentage, mean bout duration ± SEM) and the full transition matrix.
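
The quantities above can be illustrated on a toy label sequence. This sketch is not LabelAnalyser itself: the function names are hypothetical, transitions are counted between consecutive bouts (so there are no self-transitions), and the `cutoff` argument is not modelled.

```python
import numpy as np


def bouts(labels):
    """Run-length encode a label sequence into (bout_labels, bout_lengths)."""
    labels = np.asarray(labels)
    change = np.flatnonzero(np.diff(labels)) + 1  # indices where the label changes
    starts = np.concatenate(([0], change))
    lengths = np.diff(np.concatenate((starts, [len(labels)])))
    return labels[starts], lengths


def transition_matrix(bout_labels, n_states):
    """Row-normalised counts of transitions between consecutive bouts."""
    counts = np.zeros((n_states, n_states))
    np.add.at(counts, (bout_labels[:-1], bout_labels[1:]), 1)
    row_sums = counts.sum(axis=1, keepdims=True)
    # States that are never left keep an all-zero row.
    return np.divide(counts, row_sums, out=np.zeros_like(counts), where=row_sums > 0)


fps = 50  # frames per second, as in the call above
labels = [0, 0, 0, 1, 1, 0, 2, 2, 2, 2]
bout_labels, bout_lengths = bouts(labels)
P = transition_matrix(bout_labels, n_states=3)
mean_bout_s = bout_lengths[bout_labels == 0].mean() / fps  # mean bout duration of state 0
```

Dividing bout lengths by the frame rate converts them from frames to seconds, which is the unit a reader would expect for a bout duration.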