arista.preprocess
Headless preprocessing pipeline — public API re-exports.
- class arista.preprocess.DriftFit(method, fitted, residual_ssq, aic, params=<factory>)[source]
Bases: object
A fitted drift model plus its evaluation on the full trace.
- Parameters:
- fitted: numpy.ndarray
- class arista.preprocess.FijiRecording(frame, dfbf)[source]
Bases: object
Raw Fiji ROI export: one ΔF/F₀ value per imaging frame.
- Parameters:
frame (numpy.ndarray)
dfbf (numpy.ndarray)
- dfbf: numpy.ndarray
- frame: numpy.ndarray
- class arista.preprocess.Recording(frame, time_s, sensor_t_c, target_t_c, drive_t_c, dfbf, dfbf_drift_corrected=None, drift_method='none', recording_date=None)[source]
Bases: object
Frame-aligned, optionally drift-corrected Ca²⁺ recording.
One row per imaging frame. Matches the column layout of the samples table in [[Database Schema]] so the ingester can bulk-insert directly.
dfbf_drift_corrected is None whenever drift_method == "none".
recording_date is the calendar date the recording started, extracted from the first sensor MAT epoch during alignment. The ingester uses it to populate animals.recording_date.
- Parameters:
- dfbf: np.ndarray
- frame: np.ndarray
- sensor_t_c: np.ndarray
- target_t_c: np.ndarray
- time_s: np.ndarray
- class arista.preprocess.SensorRecord(epoch_time, frame, sensor_t_c, target_t_c, drive_t_c)[source]
Bases: object
Raw MATLAB sensor record: a continuously logged 5-column array.
Each MAT file holds a data matrix whose columns are, in order: epoch_time (MATLAB serial datenum), frame index (1-based!), sensor_T (°C, actually measured), target_T (°C, set-point) and drive_T (°C, applied to the Peltier element). The log rate is higher than the imaging rate, so multiple sensor rows share the same frame number; arista.preprocess.align() collapses them.
- Parameters:
epoch_time (numpy.ndarray)
frame (numpy.ndarray)
sensor_t_c (numpy.ndarray)
target_t_c (numpy.ndarray)
drive_t_c (numpy.ndarray)
- drive_t_c: numpy.ndarray
- epoch_time: numpy.ndarray
- frame: numpy.ndarray
- sensor_t_c: numpy.ndarray
- target_t_c: numpy.ndarray
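The epoch_time column holds MATLAB serial datenums, and recording_date is described above as being derived from the first such epoch. The following is a minimal sketch of that conversion under standard datenum semantics (fractional days counted from year 0000). The helper name datenum_to_datetime and the example value are hypothetical and not part of arista.preprocess.

```python
from datetime import datetime, timedelta

# MATLAB's datenum counts days from year 0000, Python's ordinal from year 0001;
# the two scales differ by 366 days.
_MATLAB_TO_PY_ORDINAL_DAYS = 366


def datenum_to_datetime(datenum: float) -> datetime:
    """Convert a MATLAB serial datenum to a naive datetime (hypothetical helper)."""
    whole_days = int(datenum)
    day_fraction = datenum - whole_days
    return (
        datetime.fromordinal(whole_days - _MATLAB_TO_PY_ORDINAL_DAYS)
        + timedelta(days=day_fraction)
    )


# Example value: datenum 730486.5 is noon on 2000-01-01.
first_epoch = 730486.5
recording_date = datenum_to_datetime(first_epoch).date()
```

The result is timezone-naive; whether the acquisition PC logged local time or UTC is not stated here and would need to be checked against the rig.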
- arista.preprocess.apply_drift(recording, fit)[source]
Subtract a fit from the ΔF/F trace and return a new Recording.
If fit is None the recording is returned with drift_method = "none" and dfbf_drift_corrected = None (i.e. drift correction was explicitly not applied; the original dfbf column remains the source of truth).
- arista.preprocess.assemble_recording(fiji, sensor)[source]
Run stages A + B + C and return a frame-aligned Recording.
Drift correction is not applied here; that is arista.preprocess.drift’s job. The returned recording has dfbf_drift_corrected = None and drift_method = "none".
The output length is the intersection of the Fiji frame range and the (post-collapse, post-interp) sensor frame range. Any Fiji frames without a matching sensor frame are dropped silently; this matches legacy behaviour and only ever clips a handful of trailing frames in well-formed recordings.
- Parameters:
fiji (FijiRecording) – Fiji ΔF/F₀ trace.
sensor (SensorRecord) – Raw sensor record.
- Returns:
A Recording aligned to Fiji’s frame numbers.
- Return type:
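The frame-intersection rule described for assemble_recording can be illustrated with plain NumPy. This is a sketch of the idea only, not the library's implementation; the arrays below are made-up inputs.

```python
import numpy as np

# Hypothetical inputs: Fiji exported frames 0..9, the sensor log covers 0..7.
fiji_frame = np.arange(10)
fiji_dfbf = np.linspace(0.0, 0.9, 10)
sensor_frame = np.arange(8)
sensor_t_c = np.full(8, 25.0)

# Keep only frames present on both sides; the index arrays locate them
# in each original array.
common, fiji_idx, sensor_idx = np.intersect1d(
    fiji_frame, sensor_frame, return_indices=True
)

aligned_dfbf = fiji_dfbf[fiji_idx]
aligned_t_c = sensor_t_c[sensor_idx]

# Fiji frames with no sensor match (here the two trailing frames).
n_dropped = fiji_frame.size - common.size
```

Because the drop is silent, a caller that cares about clipped frames could compare the input and output lengths as n_dropped does here.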
- arista.preprocess.collapse_to_frames(sensor)[source]
Group the continuous sensor log by frame and take the per-frame mean.
Sensor rows where frame == 0 are pre-stimulus calibration data and are dropped (matches the frame > 0 cut in the legacy pipeline). Frame numbers are decremented by 1 so the result is 0-indexed, aligning with Fiji’s frame numbering.
- Parameters:
sensor (SensorRecord) – Raw SensorRecord straight from arista.preprocess.io.read_sensor_mat().
- Returns:
A DataFrame indexed by integer 0-based frame number, with columns epoch_time, sensor_t_c, target_t_c, drive_t_c. NaN-padded over any missing intermediate frames so interpolate_missing_frames() can fill them.
- Return type:
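The collapse step (drop frame 0, average per frame, shift to 0-based, pad gaps with NaN) might look roughly like the pandas sketch below. It uses a single made-up column and assumes the contiguous range runs from the smallest to the largest observed frame; the real function may differ in both respects.

```python
import numpy as np
import pandas as pd

# Hypothetical raw log: frame 0 is pre-stimulus, frame 3 was never logged.
raw = pd.DataFrame(
    {
        "frame": [0, 0, 1, 1, 2, 4, 4],
        "sensor_t_c": [24.0, 24.0, 25.0, 27.0, 30.0, 31.0, 33.0],
    }
)

kept = raw[raw["frame"] > 0]              # drop pre-stimulus calibration rows
per_frame = kept.groupby("frame").mean()  # one row per frame: the per-frame mean
per_frame.index = per_frame.index - 1     # 1-based -> 0-based frame numbers

# Reindex to a contiguous range so missing frames show up as NaN rows.
full_range = pd.RangeIndex(
    per_frame.index.min(), per_frame.index.max() + 1, name="frame"
)
per_frame = per_frame.reindex(full_range)
```

Here the original frames 1, 2 and 4 become 0, 1 and 3, and the missing frame appears as a NaN row at index 2.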
- arista.preprocess.correct_drift(recording, method='auto')[source]
Convenience: fit all candidates, pick the best, apply it.
For headless / batch / CI use. Matches the default behaviour that arista-preprocess drift --method auto will expose at the CLI level in Phase 3.
- arista.preprocess.fit_all(t, y)[source]
Compute linear, poly and exp fits; return them in a dict by method name.
The exponential fit may fail to converge on flat traces; in that case it is omitted from the returned dict (rather than raising) so the AIC chooser can still pick between linear and poly.
- Parameters:
t (numpy.ndarray)
y (numpy.ndarray)
- Return type:
- arista.preprocess.fit_exponential(t, y)[source]
a·exp(-b·t) + c fit. Mirrors pytci’s fitExp bounds + p0.
- Parameters:
t (numpy.ndarray)
y (numpy.ndarray)
- Return type:
- arista.preprocess.fit_linear(t, y)[source]
Degree-1 polyfit on the first and last _LINEAR_TAIL_FRAMES frames.
Matches pytci’s fitLinear: the fit is trained on the pre- and post-stimulus tails only, then evaluated over the whole trace. This deliberately ignores the stimulus-evoked excursions so the linear component captures photobleach drift rather than the response.
- Parameters:
t (numpy.ndarray)
y (numpy.ndarray)
- Return type:
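The tails-only linear fit can be sketched with np.polyfit as below. The tail length of 50 is an assumed stand-in, since the actual value of _LINEAR_TAIL_FRAMES is not given here, and fit_linear_tails is a hypothetical name that returns only the fitted curve rather than a DriftFit.

```python
import numpy as np

TAIL_FRAMES = 50  # assumed value; the real _LINEAR_TAIL_FRAMES is not stated


def fit_linear_tails(t: np.ndarray, y: np.ndarray, tail: int = TAIL_FRAMES) -> np.ndarray:
    """Fit a line to the first and last `tail` samples, evaluate it everywhere."""
    t_tails = np.concatenate([t[:tail], t[-tail:]])
    y_tails = np.concatenate([y[:tail], y[-tail:]])
    slope, intercept = np.polyfit(t_tails, y_tails, deg=1)
    return slope * t + intercept


# Synthetic trace: linear drift plus a stimulus-evoked bump in the middle.
t = np.arange(300, dtype=float)
drift = 0.5 - 0.001 * t
bump = np.where((t > 120) & (t < 180), 1.0, 0.0)
y = drift + bump

fitted = fit_linear_tails(t, y)
corrected = y - fitted  # drift removed, response preserved
```

Because the bump lies entirely outside the tails, the fitted line recovers the drift and the subtraction leaves the response intact; a whole-trace linear fit would instead be pulled upward by the bump.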
- arista.preprocess.fit_polynomial(t, y, degree=4)[source]
Polynomial fit of the given degree over the whole trace (default 4, the pytci default).
- Parameters:
t (numpy.ndarray)
y (numpy.ndarray)
degree (int)
- Return type:
- arista.preprocess.interpolate_missing_frames(per_frame)[source]
Linearly interpolate NaN values in a per-frame dataframe.
Replicates aristaSingleCellData.interpolateMissingFrames: any NaN values left after arista.preprocess.align.collapse_to_frames() has reindexed the per-frame table to a contiguous integer range are filled by DataFrame.interpolate(method="linear").
- Parameters:
per_frame (pandas.DataFrame) – DataFrame whose index is the integer frame number. Typical columns are epoch_time / sensor_t_c / target_t_c / drive_t_c but the function is agnostic.
- Returns:
A new DataFrame with NaNs filled. The index is preserved.
- Return type:
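As a small illustration of the fill behaviour, the snippet below applies DataFrame.interpolate(method="linear") to a made-up per-frame table with one missing frame.

```python
import numpy as np
import pandas as pd

# Hypothetical per-frame table with frame 2 missing (NaN).
per_frame = pd.DataFrame(
    {"sensor_t_c": [26.0, 30.0, np.nan, 32.0]},
    index=pd.RangeIndex(4, name="frame"),
)

# Returns a new DataFrame; the input is left untouched.
filled = per_frame.interpolate(method="linear")
```

With method="linear" pandas treats rows as equally spaced and ignores the index values, which is appropriate here because the table has already been reindexed to a contiguous frame range.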
- arista.preprocess.is_broken_sensor(sensor, min_rows=1000)[source]
Return True if a sensor MAT file looks truncated.
The heuristic mirrors pytci: a full recording logs the sensor at a much higher rate than the imaging frame rate, so an honest MAT file has thousands of rows. Anything substantially smaller is almost certainly a partially-written file from a crashed acquisition.
- Parameters:
sensor (SensorRecord) – Sensor record from arista.preprocess.io.read_sensor_mat().
min_rows (int) – Threshold below which we declare the file broken (default 1000, matching pytci).
- Return type:
- arista.preprocess.load_template(stimulus_name)[source]
Substitute a median-template sensor trace for a broken MAT.
Not yet implemented in v0.1; raises with a clear pointer to the sprint plan rather than silently returning bogus data.
- Parameters:
stimulus_name (str) – Canonical stimulus name (e.g. "adaptation").
- Return type:
- arista.preprocess.pick_best(fits, method='auto')[source]
Select one fit from a fit_all() result.
- Parameters:
- Returns:
The chosen DriftFit, or None if method == "none".
- Raises:
ValueError – If method is not a valid choice, or if a forced method is requested but missing from fits.
- Return type:
DriftFit | None
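The selection logic can be sketched as follows. The AIC expression used is the common least-squares form n·ln(RSS/n) + 2k, which is an assumption; the exact formula behind DriftFit.aic is not documented here. The pick helper and the numbers are hypothetical, and the sketch works on method names rather than DriftFit objects.

```python
import math


def aic_least_squares(residual_ssq: float, n: int, k: int) -> float:
    """AIC for a least-squares fit with Gaussian errors (assumed formula)."""
    return n * math.log(residual_ssq / n) + 2 * k


# Hypothetical fit summaries: (residual sum of squares, number of parameters).
# "exp" is absent, as it would be when the exponential fit failed to converge.
n_samples = 100
candidates = {"linear": (2.0, 2), "poly": (1.9, 5)}
scores = {
    name: aic_least_squares(rss, n_samples, k)
    for name, (rss, k) in candidates.items()
}


def pick(scores: dict, method: str = "auto"):
    """Return the chosen method name, or None for method == "none"."""
    if method == "none":
        return None
    if method == "auto":
        return min(scores, key=scores.get)  # lowest AIC wins
    if method not in scores:
        raise ValueError(f"method {method!r} not available; have {sorted(scores)}")
    return method
```

In this example the polynomial has a slightly smaller residual, but its three extra parameters cost more than the improvement is worth, so "auto" selects the linear fit. Requesting "exp" raises ValueError because that fit is missing.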
- arista.preprocess.read_fiji_csv(path)[source]
Read a Fiji ΔF/F₀ ROI export into a FijiRecording.
Accepts any of the three header conventions documented in the module docstring. A header is required: headerless CSVs are not supported, because reading one would silently re-interpret the first frame as a column name and corrupt downstream alignment.
- Parameters:
- Returns:
Frozen FijiRecording with frame and dfbf as numpy arrays.
- Raises:
ValueError – If neither a frame-like nor a value-like column is present, or if frame indices are not monotonically increasing.
- Return type:
- arista.preprocess.read_recording_csv(path)[source]
Re-read a canonical write_recording_csv() output.
- arista.preprocess.read_sensor_mat(path)[source]
Read a MATLAB temperature_data_*.mat sensor record.
The MAT file must contain a top-level variable data shaped (n_samples, 5): [epoch_time, frame, sensor_T, target_T, drive_T].
- Parameters:
- Returns:
Frozen SensorRecord with the five columns as separate arrays.
- Raises:
KeyError – If the MAT file lacks a data variable.
ValueError – If the data matrix is not 5 columns wide.
- Return type:
- arista.preprocess.write_recording_csv(recording, path)[source]
Persist a Recording to disk as a canonical CSV.
Column order matches the samples table in [[Database Schema]] so arista-ingest can COPY-style load without remapping. Two #-prefixed header lines carry recording-level provenance:
- # drift_method: <method> – which drift correction was applied
- # recording_date: <YYYY-MM-DD> – calendar date of frame 0, omitted when the recording carries no date
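The layout described above, with #-prefixed provenance lines followed by an ordinary CSV body, can be produced and read back with pandas as sketched here. The two columns, the date and the in-memory buffer are illustrative only; the real column set follows the samples table, which is not reproduced in this section.

```python
import io

import pandas as pd

# Hypothetical two-column excerpt of a recording.
samples = pd.DataFrame({"frame": [0, 1, 2], "dfbf": [0.25, 0.5, 0.125]})

buffer = io.StringIO()
buffer.write("# drift_method: none\n")
buffer.write("# recording_date: 2024-01-15\n")  # made-up date
samples.to_csv(buffer, index=False)
text = buffer.getvalue()

# Recover the provenance lines, then let pandas skip them via comment="#".
provenance = dict(
    line[2:].split(": ", 1)
    for line in text.splitlines()
    if line.startswith("# ")
)
round_trip = pd.read_csv(io.StringIO(text), comment="#")
```

Keeping provenance in comment lines means generic CSV readers that honour a comment character still see a clean table, while a dedicated reader can parse the two header lines first.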
Modules
- Frame-align a sensor record against a Fiji ΔF/F trace.
- Drift-correction fits and chooser.
- Fill DAQ-dropped frames via linear interpolation.
- File I/O for raw inputs and preprocessed outputs.
- Detect a broken / truncated sensor MAT file.