arista.ingest.parsers.robert

Walk Robert Kossen’s Compiled_data_pickled/<genotype>/*.txt tree.

Each TXT carries a five-line #-prefixed provenance header followed by a whitespace-separated 3-column body:

#  date: 2018-02-07
#  genotype: CantonS
#  gender: m01
#  stimulus: ascAmp
#  celltype: CC
0 21.47 -0.004010
1 21.47 0.001740
…
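The layout above can be read with a few lines of standard-library Python. The following is a minimal sketch, not the parser's actual implementation: `parse_txt` is a hypothetical helper, and the column meanings (a sample index first, an unnamed second value, `dfbf` last) are assumptions inferred from the example rows.

```python
from typing import Dict, List, Tuple


def parse_txt(text: str) -> Tuple[Dict[str, str], List[Tuple[int, float, float]]]:
    """Split a TXT export into header fields and numeric body rows."""
    header: Dict[str, str] = {}
    rows: List[Tuple[int, float, float]] = []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        if stripped.startswith("#"):
            # "#  date: 2018-02-07" -> key "date", value "2018-02-07"
            key, _, value = stripped.lstrip("#").partition(":")
            header[key.strip()] = value.strip()
            continue
        parts = stripped.split()
        if len(parts) != 3:
            raise ValueError(f"expected 3 columns, got {len(parts)}: {line!r}")
        # Assumed column order: sample index, unnamed second value, dfbf.
        rows.append((int(parts[0]), float(parts[1]), float(parts[2])))
    return header, rows


example = """\
#  date: 2018-02-07
#  genotype: CantonS
#  gender: m01
#  stimulus: ascAmp
#  celltype: CC
0 21.47 -0.004010
1 21.47 0.001740
"""
header, rows = parse_txt(example)
```

The sketch raises on any body line that does not have exactly three fields, which keeps a malformed file from being ingested silently.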

Filenames follow YYYY-MM-DD_<strain>_<sex><animal_num>[suffix]_<stim>_<cell>_<n>.txt. The per-file gender header field packs three things into one string: the sex, the animal-of-the-day number, and an optional b suffix marking the second arista (“m01”, “f02b”, …).
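Unpacking the gender field could look like the sketch below. `GenderField` and `unpack_gender` are hypothetical names, and the pattern assumes the sex is a single `m` or `f` and that `b` is the only suffix, which is what the two examples suggest but not something the format guarantees.

```python
import re
from typing import NamedTuple


class GenderField(NamedTuple):
    sex: str             # "m" or "f" (assumed alphabet)
    animal_num: int      # animal-of-the-day number
    second_arista: bool  # True when the optional "b" suffix is present


# Assumed shape: one sex letter, one or more digits, optional trailing "b".
_GENDER_RE = re.compile(r"^(?P<sex>[mf])(?P<num>\d+)(?P<suffix>b?)$")


def unpack_gender(value: str) -> GenderField:
    """Split a packed gender string such as "f02b" into its parts."""
    match = _GENDER_RE.match(value.strip())
    if match is None:
        raise ValueError(f"unrecognised gender field: {value!r}")
    return GenderField(
        sex=match["sex"],
        animal_num=int(match["num"]),
        second_arista=match["suffix"] == "b",
    )
```

Under these assumptions, `unpack_gender("m01")` gives sex `m`, animal 1, first arista, and `unpack_gender("f02b")` gives sex `f`, animal 2, second arista.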

The dfbf column in the TXT is the canonical post-pipeline ΔF/F₀ (drift-corrected per the pytci massiveAligner workflow). Robert’s TXT export dropped the original choice of fit method, so recordings.drift_correction is set to 'unknown', and the raw pre-correction trace is not recoverable from the TXT alone. We store the TXT value in samples.dfbf and leave samples.dfbf_drift_corrected NULL: that column represents “raw fed in vs corrected from raw”, and we lack the raw side.
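As a rough illustration of that storage decision, the sketch below builds the values that would go into the two tables. Only `drift_correction`, `dfbf` and `dfbf_drift_corrected` come from the description above; `build_rows` and the `sample_index` key are hypothetical, and the real schema may carry more columns.

```python
from typing import Dict, List, Optional, Sequence, Tuple

SampleRow = Dict[str, Optional[float]]


def build_rows(dfbf_values: Sequence[float]) -> Tuple[Dict[str, str], List[SampleRow]]:
    """Map TXT dfbf values onto recording-level and sample-level fields."""
    # The fit method was lost in the TXT export, so it is recorded as unknown.
    recording = {"drift_correction": "unknown"}
    samples: List[SampleRow] = [
        {
            "sample_index": index,         # hypothetical key name
            "dfbf": value,                 # already drift-corrected upstream
            "dfbf_drift_corrected": None,  # NULL: no raw trace to correct from
        }
        for index, value in enumerate(dfbf_values)
    ]
    return recording, samples


recording, samples = build_rows([-0.004010, 0.001740])
```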

Niko’s 2016 recordings were re-processed through this same pipeline and live in NSybLexALexOpGCamp6/, so this parser ingests them as well and attributes them to the same researcher as the rest of the compiled tree.

Module Attributes

DEFAULT_FPS

Default sample rate.

Functions

discover_robert_records(source_root)

Yield one DiscoveryResult per TXT under Compiled_data_pickled/.

arista.ingest.parsers.robert.DEFAULT_FPS = 10.0

Default sample rate. Robert’s pipeline assumed 10 fps throughout the corpus, and the TXT carries no per-file fps, so the value is hard-coded.
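With a fixed rate, a sample's timestamp follows directly from its position in the file. A minimal sketch, assuming the first TXT column is a zero-based frame index (the format does not state this) and using a hypothetical helper name:

```python
DEFAULT_FPS = 10.0  # mirrors the module constant


def frame_to_seconds(frame_index: int, fps: float = DEFAULT_FPS) -> float:
    """Convert a zero-based frame index to elapsed seconds."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return frame_index / fps
```

At 10 fps, frame 25 falls 2.5 s into the recording.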

arista.ingest.parsers.robert.discover_robert_records(source_root)

Yield one DiscoveryResult per TXT under Compiled_data_pickled/.

The expected tree is <source_root>/<genotype>/*.txt where <genotype> is one of CantonS / NompC3_NSybLexALexOpGCamp6 / NompC-HeterozControl / NompCPbac / NompCRescue / NompCOverExpression / NompCGal4-Ctrl-NCBG / NompCGal4-Ctrl-WTBG / UASNompC-Ctrl-NCBG / NSybLexALexOpGCamp6 / ColdAdapt / HotAdapt / AristaBending.

Files outside that pattern are yielded with record=None and a populated reason so that the CLI can display them.

Parameters:

source_root (Path) – Path to Compiled_data_pickled/ (or an equivalent tree of <genotype>/<txt>).

Return type:

Iterator[DiscoveryResult]
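The walk described above can be approximated as follows. This is a self-contained sketch, not the module's code: `SketchResult` stands in for DiscoveryResult (only its `record` and `reason` fields are taken from the description; `path` and the dict-shaped record are assumptions), and treating an unlisted genotype directory as "outside the pattern" is also an assumption.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Iterator, Optional

KNOWN_GENOTYPES = frozenset({
    "CantonS", "NompC3_NSybLexALexOpGCamp6", "NompC-HeterozControl",
    "NompCPbac", "NompCRescue", "NompCOverExpression",
    "NompCGal4-Ctrl-NCBG", "NompCGal4-Ctrl-WTBG", "UASNompC-Ctrl-NCBG",
    "NSybLexALexOpGCamp6", "ColdAdapt", "HotAdapt", "AristaBending",
})


@dataclass
class SketchResult:
    """Hypothetical stand-in for DiscoveryResult."""
    path: Path
    record: Optional[dict]
    reason: Optional[str]


def discover_sketch(source_root: Path) -> Iterator[SketchResult]:
    """Yield one result per TXT, flagging files outside <genotype>/*.txt."""
    for path in sorted(source_root.rglob("*.txt")):
        parts = path.relative_to(source_root).parts
        if len(parts) != 2:
            # Too shallow or too deep to match <genotype>/<file>.txt.
            yield SketchResult(path, None, "not at <genotype>/<file>.txt depth")
        elif parts[0] not in KNOWN_GENOTYPES:
            yield SketchResult(path, None, f"unknown genotype directory {parts[0]!r}")
        else:
            yield SketchResult(path, {"genotype": parts[0]}, None)
```

Yielding rejected files instead of skipping them lets a caller print every `reason` next to its path, which matches the stated goal of making skipped files visible.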