Analysis & Plotting
Data Ingestion
W.I.N.G.S. — Ingest GPU simulation results
Reads all CSV files from the GPU results directory, parses the experimental conditions from the filenames, and produces a single combined CSV ready for plot_results.py.
- Filename format (matches both old CPU and new GPU outputs):
cytoplasmic_incompatibility_<True|False>_male_killing_<True|False>_ increased_exploration_rate_<True|False>_increased_eggs_<True|False>_<replicate>.csv
- CSV columns expected: Population Size, Infection Rate
(Day is added automatically as row index + 1)
- Output columns (what plot_results.py expects):
Day, Population Size, Infection Rate, Cytoplasmic Incompatibility, Male Killing, Increased Exploration Rate, Increased Eggs, Replicate ID
- wings.analysis.ingest.parse_conditions_from_filename(filename)[source]
Extract experimental conditions and replicate ID from filename.
Returns
tuple : (ci: bool, mk: bool, er: bool, eg: bool, rep_id: int)
- Parameters:
filename (str)
- wings.analysis.ingest.normalise_columns(df)[source]
Ensure the DataFrame has ‘Population Size’ and ‘Infection Rate’ columns regardless of what the CSV header actually says. Handles the GPU output (already correct) and any minor variations.
- wings.analysis.ingest.main(data_dir, output_file)[source]
CLI entry point for data ingestion.
Parses
--input-dirand--outputarguments, callsingest_directory(), and writes the combined CSV.
W.I.N.G.S. — Δp sweep data ingestion.
Reads individual ABM CSV outputs from the delta-p sweep and combines them into a single CSV for plotting.
- Filename convention:
{PHENOTYPE}_frac{NNN}_rep{R}.csv e.g. CI_frac050_rep3.csv → phenotype=CI, fraction=0.50, rep=3
Each per-run CSV has columns: Day, Population Size, Infection Rate, …
Output CSV adds: Phenotype, Infected Fraction, Replicate ID
- Usage:
python ingest_delta_p.py –input-dir /projects/…/abm_delta_p –output data/combined_delta_p.csv
Publication Figures
W.I.N.G.S. — Publication figure generator.
Produces parallel figure sets for the Agent-Based Model (ABM) and the Wright-Fisher fixed-size Model (WFM), organised by biologically meaningful combination subsets.
Colour scheme: Paul Tol’s qualitative palette (colourblind-safe). Line plots differentiate combos via colour + dash pattern + line width. All figures exported as both PNG (300 dpi) and SVG (text as text objects).
Figure catalogue
- For EACH model (ABM / WFM):
1–2. infection_over_time_{subset}.{png,svg} 3–4. final_infection_{subset}.{png,svg} 5–6. time_to_fixation_{subset}.{png,svg} 7. heatmap_infection.{png,svg} 8. heatmap_fixation_pct.{png,svg} 9. heatmap_fixation_time.{png,svg} ABM-only: 10–11. population_over_time_{subset}.{png,svg} 12. heatmap_population.{png,svg}
Usage
python plot_wings.py –model abm –input data/combined_abm.csv python plot_wings.py –model wfm –input data/combined_wfm.csv
- wings.analysis.plot_wings.get_style(label)[source]
Return (colour, dash, linewidth) for a combo label.
- wings.analysis.plot_wings.parse_exclude(exclude_str)[source]
Parse a comma-separated exclusion string into a set of mechanic names.
- Parameters:
exclude_str (str or None) – Comma-separated mechanic abbreviations, e.g.
"MK"or"MK,IE". Case-insensitive.- Returns:
Uppercase mechanic abbreviations to exclude.
- Return type:
- Raises:
SystemExit – If an unrecognised abbreviation is provided.
- wings.analysis.plot_wings.filter_subset_by_exclusion(subset, excluded)[source]
Remove combos from a subset that use any excluded mechanic.
- wings.analysis.plot_wings.filter_heatmap_configs(row_configs, col_configs, excluded)[source]
Remove heatmap rows/columns that contain an excluded mechanic.
- wings.analysis.plot_wings.combo_label(ci, mk, er, ie)[source]
Build a human-readable label from boolean effect flags.
- wings.analysis.plot_wings.save_fig(fig, path_stem)[source]
Save a matplotlib figure as PNG (300 dpi) and SVG (editable text).
- Parameters:
fig (matplotlib.figure.Figure) – The figure to save.
path_stem (str) – Output path without extension (e.g.
"figures/infection_rate"). Produces.pngand.svg.
- wings.analysis.plot_wings.load_data(path)[source]
Load a combined simulation CSV produced by
ingest.ingest_directory().Normalises boolean effect columns and adds a
Combolabel column.- Parameters:
path (str) – Path to the combined CSV.
- Returns:
Loaded data with boolean columns and combo labels.
- Return type:
- wings.analysis.plot_wings.filter_combos(df, subset)[source]
Filter a DataFrame to only the specified effect combinations.
- Parameters:
df (pandas.DataFrame) – Combined simulation data.
subset (list[tuple]) – List of
(ci, mk, er, ie)boolean tuples.
- Returns:
Filtered copy containing only matching rows.
- Return type:
- wings.analysis.plot_wings.get_ordered_labels(subset)[source]
Return combo labels in the order defined by the subset.
- wings.analysis.plot_wings.compute_timeseries_stats(df, time_col='Day')[source]
Compute per-timepoint summary statistics across replicates.
For each combination × time point, calculates median, IQR (25th/75th percentiles), and 5th/95th percentiles for both infection rate and population size.
- Parameters:
df (pandas.DataFrame) – Simulation data.
time_col (str) – Name of the time column. Defaults to
"Day".
- Returns:
Summary statistics with columns
inf_median,inf_q25,inf_q75, etc.- Return type:
- wings.analysis.plot_wings.compute_final_values(df, time_col='Day')[source]
Extract the final time-point values for each replicate.
- Parameters:
df (pandas.DataFrame) – Simulation data.
time_col (str) – Name of the time column. Defaults to
"Day".
- Returns:
One row per combo × replicate, containing the last recorded infection rate and population size.
- Return type:
- wings.analysis.plot_wings.compute_time_to_fixation(df, time_col='Day', threshold=0.99)[source]
Find the first time point where infection rate ≥ threshold.
- Parameters:
df (pandas.DataFrame) – Simulation data.
time_col (str) – Time column name. Defaults to
"Day".threshold (float) – Fixation threshold. Defaults to
0.99.
- Returns:
Columns
Combo,Replicate ID,t_fix(NaN if fixation never reached).- Return type:
- wings.analysis.plot_wings.plot_timeseries(df, subset, subset_name, metric, ylabel, title, path_stem, time_col='Day', is_abm=True, skip_before=None)[source]
Line + ribbon plot of a metric over time for each combination.
Draws median line with IQR ribbon and 5th–95th percentile whisker band. ABM plots use symlog x-axis; WFM uses linear.
- Parameters:
df (pandas.DataFrame) – Simulation data.
subset_name (str) – Human-readable subset name for title.
metric (str) – Column prefix (
"inf"or"pop").ylabel (str) – Y-axis label.
title (str) – Figure title.
path_stem (str) – Output path without extension.
time_col (str) – Time column name. Defaults to
"Day".is_abm (bool) – Use ABM-specific formatting. Defaults to
True.skip_before (int, optional) – Skip time points before this day.
- wings.analysis.plot_wings.plot_final_infection(df, subset, subset_name, title, path_stem, time_col='Day')[source]
Strip plot of final infection rate with fixation-% annotation.
Each replicate is a jittered dot; a diamond shows the median; a vertical bar shows the IQR. Fixation percentage is annotated above each combination.
- wings.analysis.plot_wings.plot_final_strip(df, subset, subset_name, metric_col, ylabel, title, path_stem, time_col='Day', csv_col_name=None, fixation_annot=False)[source]
Strip + median diamond + IQR bar for any final-timepoint metric. Exports companion CSV with columns: mechanic, {csv_col_name}.
- wings.analysis.plot_wings.plot_time_to_fixation(df, subset, subset_name, title, path_stem, time_col='Day', is_abm=True)[source]
Violin + strip plot of time-to-fixation across replicates.
Combinations that never reach fixation are listed as italic annotations.
- Parameters:
- wings.analysis.plot_wings.plot_heatmap(df, metric_func, cmap, cbar_label, title, path_stem, time_col='Day', fmt='.2f', vmin=None, vmax=None, csv_raw_func=None, csv_value_col='value', excluded=None)[source]
Heatmap of effect combinations, with optional mechanic exclusion.
Rows represent the CI × MK axis; columns represent the ER × IE axis. When mechanics are excluded via
excluded, the corresponding rows or columns are removed and the heatmap shrinks accordingly.- Parameters:
df (pandas.DataFrame) – Simulation data.
metric_func (callable) – Function
(df_subset, time_col) → floatcomputing the cell value.cmap (str) – Matplotlib colourmap name.
cbar_label (str) – Colourbar label.
title (str) – Figure title.
path_stem (str) – Output path without extension.
time_col (str) – Time column. Defaults to
"Day".fmt (str) – Number format string. Defaults to
".2f".vmin (float, optional) – Colourmap minimum.
vmax (float, optional) – Colourmap maximum.
csv_raw_func (callable, optional) – Per-replicate extraction function for CSV export.
excluded (set[str], optional) – Mechanic abbreviations to exclude.
- wings.analysis.plot_wings.raw_final_infection(df_sub, time_col)[source]
One row per replicate: final infection rate.
- wings.analysis.plot_wings.raw_fixation_binary(df_sub, time_col)[source]
One row per replicate: 1 if reached fixation, 0 otherwise.
- wings.analysis.plot_wings.raw_time_to_fixation(df_sub, time_col)[source]
One row per replicate: time to fixation (NaN if never).
- wings.analysis.plot_wings.raw_final_population(df_sub, time_col)[source]
One row per replicate: final population size.
- wings.analysis.plot_wings.raw_final_infected_count(df_sub, time_col)[source]
One row per replicate: final number of infected beetles.
- wings.analysis.plot_wings.generate_figures(df, model, outdir, time_col='Day', excluded=None)[source]
Generate the complete set of publication figures for one model.
Produces time series, strip plots, violin plots, and heatmaps for both combo subsets (Individual Effects, ER-Centric), plus ABM-only population size figures.
- Parameters:
df (pandas.DataFrame) – Combined simulation data.
model (str) –
"abm"or"wfm".outdir (str) – Output directory for figures.
time_col (str) – Time column. Defaults to
"Day".excluded (set[str], optional) – Mechanic abbreviations to exclude (e.g.
{"MK"}). Combos using excluded mechanics are removed from subsets, and heatmap rows/columns are pruned.
Δp Analysis
W.I.N.G.S. — Δp vs p figure (complementary strategies).
Extracts per-interval infection frequency change (Δp) from ABM time series and overlays analytical models.
Key design choices
First-passage extraction: for each replicate, only uses the FIRST time the trajectory passes through each p-bin (ascending), avoiding contamination from equilibrium wobble or stalled runs.
Asymmetric CI model: A·p^α·(1-p)^γ allows the peak to shift right, capturing spatial/growth effects the mean-field Turelli model misses.
Diminishing-returns ER model: s_0·(1-p)^β captures mate-finding saturation.
- Outputs (PNG 300dpi + SVG with editable text):
delta_p_vs_p.{png,svg} — 3-panel (CI, ER, CI+ER) delta_p_overlay.{png,svg} — single-panel with crossing point
- Usage:
python plot_delta_p.py –input data/combined_delta_p.csv python plot_delta_p.py –input data/combined_delta_p.csv –dt 14 –ascending-only
- wings.analysis.plot_delta_p.detect_columns(df)[source]
Auto-detect the time and infection rate column names.
Searches for common variants (
Day,Days,Time, etc.) case-insensitively.- Parameters:
df (pandas.DataFrame) – Input data.
- Returns:
(time_column_name, infection_column_name).- Return type:
- Raises:
SystemExit – If either column cannot be found.
- wings.analysis.plot_delta_p.model_ci_asymmetric(p, A, alpha, gamma)[source]
Asymmetric CI selection coefficient:
A · p^α · (1-p)^γ.Generalises the Turelli (1994) model
p(1-p) · s_h / (1 - p·s_h)by allowing different exponents on the left (α) and right (γ) sides. When α > γ the peak shifts right of 0.5, capturing spatial clustering and population growth effects.- Parameters:
p (numpy.ndarray) – Infection frequency array.
A (float) – Amplitude parameter.
alpha (float) – Left exponent (rise rate at low p).
gamma (float) – Right exponent (decay rate at high p).
- Returns:
Expected Δp per generation.
- Return type:
- wings.analysis.plot_delta_p.model_er(p, s_0, beta)[source]
ER selection coefficient with diminishing returns:
s_0 · (1-p)^β.The mate-finding advantage of increased exploration decays as infected females become common and the mating landscape saturates.
- Parameters:
p (numpy.ndarray) – Infection frequency array.
s_0 (float) – Base advantage at p → 0.
beta (float) – Decay exponent (higher = faster saturation).
- Returns:
Expected Δp per generation from ER.
- Return type:
- wings.analysis.plot_delta_p.model_ci_er(p, A, alpha, gamma, s_0, beta)[source]
Combined CI + ER selection (additive).
- Parameters:
- Returns:
Combined expected Δp.
- Return type:
- wings.analysis.plot_delta_p.extract_delta_p(df, time_col, inf_col, dt=7, ascending_only=False)[source]
Extract
(p, Δp)pairs from replicate time series.For each replicate, samples the infection rate at intervals of
dtdays and computes the change. Skips the pre-hatch period (day < 23) and absorbing boundaries (p < 0.01 or p > 0.99).- Parameters:
df (pandas.DataFrame) – Combined simulation data.
time_col (str) – Name of the time column.
inf_col (str) – Name of the infection rate column.
dt (int) – Sampling interval in days. Defaults to
7.ascending_only (bool) – If
True, only use intervals where Δp > 0 and only the first passage through each 5%-bin. Defaults toFalse.
- Returns:
Columns
Phenotype,p,delta_p,initial_fraction, and optionallypop_size.- Return type:
- wings.analysis.plot_delta_p.bin_delta_p(dp_df, n_bins=20)[source]
Bin
(p, Δp)data by infection frequency and compute statistics.- Parameters:
dp_df (pandas.DataFrame) – Raw Δp data from extraction.
n_bins (int) – Number of equal-width bins over [0, 1]. Defaults to
20.
- Returns:
Per-bin summary with
Phenotype,p_mid,dp_median,dp_q25,dp_q75,dp_mean,n.- Return type:
- wings.analysis.plot_delta_p.fit_analytical(binned_df)[source]
Fit the asymmetric CI and diminishing-returns ER models.
Uses
scipy.optimize.curve_fiton the binned median Δp data. Falls back to default parameters if fitting fails or data is insufficient.- Parameters:
binned_df (pandas.DataFrame) – Binned Δp data.
- Returns:
Fitted parameters
{A, alpha, gamma, s_0, beta}.- Return type:
- wings.analysis.plot_delta_p.save_fig(fig, path_stem)[source]
Save a figure as PNG (300 dpi) and SVG (editable text).
- Parameters:
fig (matplotlib.figure.Figure) – Figure to save.
path_stem (str) – Output path without extension.
- wings.analysis.plot_delta_p.plot_delta_p_figure(binned_df, params, outdir, dt)[source]
Three-panel figure: CI, ER, CI+ER with ABM data and analytical overlay.
Each panel shows binned ABM median + IQR ribbon and a dashed analytical curve. CI panels mark the unstable equilibrium (Δp = 0 crossing).
- Parameters:
binned_df (pandas.DataFrame) – Binned Δp data.
params (dict) – Fitted analytical parameters.
outdir (str) – Output directory.
dt (int) – Time interval used for Δp.
- wings.analysis.plot_delta_p.plot_overlay_figure(binned_df, params, outdir, dt)[source]
Single-panel overlay of all three phenotypes with crossing point.
Marks the frequency where CI overtakes ER as the dominant driver of spread (“ER → CI handoff”) and shades the ER-dominant and CI-dominant frequency regions.
- Parameters:
binned_df (pandas.DataFrame) – Binned Δp data.
params (dict) – Fitted analytical parameters.
outdir (str) – Output directory.
dt (int) – Time interval used for Δp.
- wings.analysis.plot_delta_p.main()[source]
CLI entry point for the Δp figure generator.
- Supports four modes via
--mode: all: Full trajectory averaging (default).ascending: First-passage, Δp > 0 only.initial: One Δp per replicate (Turelli-correct).compare: Run all three and produce comparison figure.
- Supports four modes via
Statistical Tests
- wings.analysis.stats.read_data(file_path)[source]
Read a simulation CSV into a pandas DataFrame.
- Parameters:
file_path (str) – Path to the CSV file.
- Returns:
Raw simulation data.
- Return type:
- wings.analysis.stats.bootstrap_ci(data, n_bootstrap=1000, ci=0.95)[source]
Compute a bootstrap confidence interval for the median.