arista.ingest.parsers.alex

Walk an Alex-layout tree and yield IngestRecord instances.

The expected layout is the flat form produced by arista-preprocess (and committed for the in-repo subset):

<root>/<genotype>/<animal_label>/<fiji>.csv

where:

The deep HCS layout (<genotype>/<date>/<exp>/Arista_<side>/) is detected but not parsed: matching paths are reported in the skipped list. A parser for it is planned for the next ingest sprint.
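As an illustration of the distinction between the two layouts, the sketch below classifies a CSV path by its position relative to the root. The helper name classify_layout is hypothetical and not part of arista, and the rule for the deep form (a fourth path component starting with "Arista_") is an assumption read off the layout string above; the actual detection logic may differ.

```python
from pathlib import PurePosixPath


def classify_layout(root, csv_path):
    """Return 'flat', 'deep', or 'unknown' for a CSV path under root.

    Hypothetical helper for illustration only; not part of arista.
    """
    parts = PurePosixPath(csv_path).relative_to(root).parts
    # Flat form: <genotype>/<animal_label>/<fiji>.csv (exactly three components).
    if len(parts) == 3 and parts[-1].lower().endswith(".csv"):
        return "flat"
    # Deep HCS form (assumed): <genotype>/<date>/<exp>/Arista_<side>/<file>.
    if len(parts) >= 5 and parts[3].startswith("Arista_"):
        return "deep"
    return "unknown"
```

Under these assumptions, only paths classified as "flat" would produce a record; everything else would be reported as skipped with a reason.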

Stimulus protocol: according to Alex’s records, all 641 of his sessions use ascAmp. The parser defaults to that value, and the CLI exposes a flag to override it per ingest run.

Functions

discover_alex_records(source_root, *[, ...])

Yield one DiscoveryResult per CSV under source_root.

Classes

DiscoveryResult(csv_path, record, reason)

Outcome of discover_alex_records() for one CSV path.

IngestRecord(researcher_name, strain_name, ...)

One ingest-ready unit: dimension lookups + samples in one bundle.

class arista.ingest.parsers.alex.DiscoveryResult(csv_path, record, reason)[source]

Bases: object

Outcome of discover_alex_records() for one CSV path.

Parameters:
csv_path: Path
reason: str | None
record: IngestRecord | None
class arista.ingest.parsers.alex.IngestRecord(researcher_name, strain_name, recording_date, sex, animal_number, arista_suffix, cell_type_code, cell_number, hemisphere, stimulus_name, fps, n_samples, duration_s, drift_method, samples_df, source_csv, notes=None)[source]

Bases: object

One ingest-ready unit: dimension lookups + samples in one bundle.

The orchestrator consumes a stream of these and translates each into one animals row (or lookup), one recordings row, and N samples rows.
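The fan-out described above (one animals row, one recordings row, N samples rows) might look roughly like the following. This is a minimal sketch, not the orchestrator's code: fan_out is a hypothetical name, the record is modeled as a plain dict, the samples as a plain list rather than a pandas.DataFrame, and the row keys (recording_id, sample_index, value) are assumptions made for illustration.

```python
def fan_out(record, recording_id):
    """Split one record-like mapping into (animal_row, recording_row, sample_rows).

    Hypothetical sketch; column names are assumptions, not the real schema.
    """
    # One animals row (or the key used to look up an existing one).
    animal_row = {k: record[k] for k in ("strain_name", "sex", "animal_number")}
    # One recordings row carrying per-recording metadata.
    recording_row = {
        "recording_id": recording_id,
        "recording_date": record["recording_date"],
        "stimulus_name": record["stimulus_name"],
        "fps": record["fps"],
        "n_samples": record["n_samples"],
    }
    # N samples rows, one per sample, each tied back to the recording.
    sample_rows = [
        {"recording_id": recording_id, "sample_index": i, "value": v}
        for i, v in enumerate(record["samples"])
    ]
    return animal_row, recording_row, sample_rows
```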

Parameters:
animal_number: int
arista_suffix: str | None
cell_number: int
cell_type_code: str
drift_method: str
duration_s: float
fps: float
hemisphere: str | None
n_samples: int
notes: str | None = None
recording_date: str
researcher_name: str
samples_df: pandas.DataFrame
sex: str
source_csv: Path
stimulus_name: str
strain_name: str
arista.ingest.parsers.alex.discover_alex_records(source_root, *, stimulus_name='ascAmp')[source]

Yield one DiscoveryResult per CSV under source_root.

Walks <root>/<genotype>/<animal_label>/<fiji>.csv only. Paths that do not match the flat layout are yielded with record=None and a populated reason; the CLI surfaces them in the startup banner.

Parameters:
  • source_root (Path) – Root directory (e.g. preprocessed_output/alex/).

  • stimulus_name (str) – Stimulus protocol to assign to every record. Defaults to ascAmp per Alex’s 641 sessions.

Yields:

DiscoveryResult instances in deterministic alphabetical order.

Return type:

Iterator[DiscoveryResult]
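Because every CSV yields a result whether or not it parsed, a caller typically needs to separate usable records from skipped paths. The sketch below shows one way to do that. DiscoveryResultStub is a stand-in that mirrors only the three documented fields so the example runs without arista installed, and partition is a hypothetical helper, not part of the package.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Any, Optional


@dataclass
class DiscoveryResultStub:
    """Stand-in with the documented fields: csv_path, record, reason."""
    csv_path: Path
    record: Optional[Any]
    reason: Optional[str]


def partition(results):
    """Split results into (records, skipped), preserving iteration order.

    skipped holds (csv_path, reason) pairs for results with record=None.
    """
    records, skipped = [], []
    for result in results:
        if result.record is None:
            skipped.append((result.csv_path, result.reason))
        else:
            records.append(result.record)
    return records, skipped
```

With the real package, the same helper could presumably be fed directly from discover_alex_records(Path("preprocessed_output/alex/")), optionally passing stimulus_name to override the ascAmp default.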