Input section #
The [input] table enumerates all FastQ sources that make up a fragment. At least one segment must be declared.
[input]
read1 = ['fileA_1.fastq', 'fileB_1.fastq.gz', 'fileC_1.fastq.zst'] # required: one or more paths
read2 = "fileA_2.fastq.gz" # optional
index1 = ['index1_A.fastq.gz'] # optional
# interleaved = [...] # optional, see below
| Key | Required | Value type | Notes |
|---|---|---|---|
| segment name (e.g. read1) | Yes (at least one) | string or array of strings | Each unique key defines a segment; arrays concatenate multiple files in order. |
| interleaved | No | array of strings | Enables interleaved reading; must list segment names in their in-file order. |
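To make the "string or array of strings" rule concrete, here is a minimal Python sketch of how a parsed [input] table could be normalized. It is an illustration only, not the tool's implementation: the function name normalize_input is hypothetical, the dictionary mirrors what a TOML parser (for example Python's tomllib) would return, and the sketch assumes that every key other than interleaved is a segment name, as the table above suggests.

```python
INTERLEAVED_KEY = "interleaved"  # the one key in the table above that is not a segment


def normalize_input(table):
    """Return ({segment: [paths, ...]}, interleaved_order_or_None).

    Hypothetical helper: treats every key except 'interleaved' as a segment.
    """
    segments = {}
    for key, value in table.items():
        if key == INTERLEAVED_KEY:
            continue
        # A bare string is treated as shorthand for a one-element array.
        segments[key] = [value] if isinstance(value, str) else list(value)
    if not segments:
        raise ValueError("[input] must declare at least one segment")
    return segments, table.get(INTERLEAVED_KEY)


# What a TOML parser would hand back for a small [input] table.
parsed_input = {
    "read1": ["fileA_1.fastq", "fileB_1.fastq.gz"],
    "read2": "fileA_2.fastq.gz",
}
segments, interleaved = normalize_input(parsed_input)
```

After normalization every segment maps to a list of paths, so later stages can treat single-file and multi-file segments the same way.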
Additional points:
- Segment names are user-defined and case sensitive. Common conventions include read1, read2, index1, and index2.
- Compression is auto-detected by inspecting file headers.
- Every segment must provide the same number of reads. Cardinality mismatches raise a validation error.
- Multiple files per segment are concatenated virtually; the processor streams them sequentially.
- The segment name ‘All’ is reserved, because some steps use it to indicate that they operate on all segments.
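Two of the points above, header-based compression detection and the equal-read-count requirement, can be sketched in a few lines of Python. This is a hedged illustration rather than the processor's actual code: the function names are hypothetical, and the magic numbers are the standard leading bytes of gzip and Zstandard streams.

```python
import gzip

# Leading magic bytes of gzip and Zstandard streams.
MAGIC = {
    b"\x1f\x8b": "gzip",
    b"\x28\xb5\x2f\xfd": "zstd",
}


def sniff_compression(header: bytes) -> str:
    """Guess the compression from the first bytes of a file (hypothetical helper)."""
    for magic, name in MAGIC.items():
        if header.startswith(magic):
            return name
    return "none"  # assumption: anything else is uncompressed FastQ


def check_same_read_count(counts):
    """Raise if the per-segment read counts differ (hypothetical helper)."""
    if len(set(counts.values())) > 1:
        raise ValueError(f"segments differ in read count: {counts}")


compressed = gzip.compress(b"@r1\nACGT\n+\nIIII\n")
print(sniff_compression(compressed[:4]))  # gzip
print(sniff_compression(b"@r1\n"))        # none
check_same_read_count({"read1": 1000, "read2": 1000})  # passes silently
```

Sniffing the header instead of trusting the file extension means a misnamed file, such as a gzip stream saved as .fastq, would still be read correctly under this scheme.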
Interleaved input #
Some datasets store all segments in a single file. Activate interleaved mode and describe how the segments are ordered:
[input]
source = ['interleaved.fq'] # this 'virtual' segment will not be available for steps downstream
interleaved = ["read1", "read2", "index1", "index2"]
Rules for interleaving:
- The [input] table must contain exactly one data source when interleaved is present.
- The interleaved list dictates how reads are grouped into fragments. The length of the list equals the number of segments.
- Downstream steps reference the declared segment names exactly as written in the list.
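The grouping rule can be pictured with a short Python sketch: consecutive records from the single source are dealt out to the names in the interleaved list, and each full round forms one fragment. The functions read_records and deinterleave are hypothetical names for this illustration, and the sketch assumes simple four-line FastQ records; it does not describe how the processor itself is implemented.

```python
from itertools import islice


def read_records(lines):
    """Yield 4-line FastQ records as (name, sequence, separator, quality)."""
    it = iter(lines)
    while True:
        record = tuple(islice(it, 4))
        if not record:
            return
        if len(record) != 4:
            raise ValueError("truncated FastQ record")
        yield record


def deinterleave(lines, order):
    """Group consecutive records into fragments keyed by segment name."""
    records = read_records(lines)
    while True:
        group = list(islice(records, len(order)))
        if not group:
            return
        if len(group) != len(order):
            raise ValueError("record count is not a multiple of the segment count")
        yield dict(zip(order, group))


text = "@f1/1\nACGT\n+\nIIII\n@f1/2\nTTGA\n+\nIIII\n"
fragments = list(deinterleave(text.splitlines(), ["read1", "read2"]))
print(fragments[0]["read2"][1])  # TTGA
```

Because the fragments are keyed by the names from the list, later steps in this sketch would look up "read1" or "read2" rather than the name of the single source.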