Input Section

The [input] table enumerates all FastQ sources that make up a fragment. At least one segment must be declared.

[input]
    read1 = ['fileA_1.fastq', 'fileB_1.fastq.gz', 'fileC_1.fastq.zst'] # required: one or more paths
    read2 = "fileA_2.fastq.gz"                                      # optional
    index1 = ['index1_A.fastq.gz']                                   # optional
    # interleaved = [...]                                            # optional, see below
| Key | Required | Value type | Notes |
| --- | --- | --- | --- |
| segment name (e.g. read1) | Yes (at least one) | string or array of strings | Each unique key defines a segment; arrays concatenate multiple files in order. |
| interleaved | No | array of strings | Enables interleaved reading; must list segment names in their in-file order. |
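
The "string or array of strings" rule can be illustrated with a small Python sketch. It assumes the [input] table has already been parsed into a dict; segment_files is a hypothetical helper written for illustration, not part of the processor.

```python
# Illustrative sketch only: segment_files is a hypothetical helper.
NON_SEGMENT_KEYS = {"interleaved"}  # a setting, not a segment


def segment_files(input_table: dict) -> dict:
    """Map each segment name to its ordered list of file paths."""
    segments = {}
    for key, value in input_table.items():
        if key in NON_SEGMENT_KEYS:
            continue
        # A bare string is treated as a one-element array.
        segments[key] = [value] if isinstance(value, str) else list(value)
    if not segments:
        raise ValueError("[input] must declare at least one segment")
    return segments


example = {
    "read1": ["fileA_1.fastq", "fileB_1.fastq.gz", "fileC_1.fastq.zst"],
    "read2": "fileA_2.fastq.gz",
    "index1": ["index1_A.fastq.gz"],
}
files = segment_files(example)
```

Here files["read2"] becomes a one-element list, so later code can treat every segment the same way regardless of how it was written in the configuration.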

Additional points:

  • Segment names are user-defined and case sensitive. Common conventions include read1, read2, index1, and index2.
  • Compression is auto-detected by inspecting file headers, so the file extension does not have to match the format.
  • Every segment must provide the same number of reads. Cardinality mismatches raise a validation error.
  • Multiple files per segment are concatenated virtually; the processor streams them sequentially.
  • The segment name ‘All’ is reserved, because some steps use it to indicate that they operate on all segments.
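
Two of the points above, header-based compression detection and the equal-read-count check, can be sketched in Python. This is an illustration under stated assumptions, not the processor's actual logic: the function names are hypothetical, and only the well-known gzip and Zstandard magic numbers are checked.

```python
# Illustrative sketch only; both function names are hypothetical.
GZIP_MAGIC = b"\x1f\x8b"          # first two bytes of a gzip member
ZSTD_MAGIC = b"\x28\xb5\x2f\xfd"  # Zstandard frame magic number


def detect_compression(header: bytes) -> str:
    """Guess the compression format from the first bytes of a file."""
    if header.startswith(GZIP_MAGIC):
        return "gzip"
    if header.startswith(ZSTD_MAGIC):
        return "zstd"
    # Assumption: anything else is treated as uncompressed text.
    return "none"


def check_equal_read_counts(counts: dict) -> None:
    """Raise ValueError if the segments report different read counts."""
    if len(set(counts.values())) > 1:
        raise ValueError(f"segments differ in read count: {counts}")
```

In practice the header would come from something like open(path, "rb").read(4), and the counts from reading each segment to the end.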

Interleaved input

Some datasets store all segments in a single file. Activate interleaved mode and describe how the segments are ordered:

[input]
    source = ['interleaved.fq'] # this 'virtual' segment will not be available for steps downstream
    interleaved = ["read1", "read2", "index1", "index2"]

Rules for interleaving:

  • The [input] table must contain exactly one data source when interleaved is present.
  • The interleaved list dictates how reads are grouped into fragments. The length of the list equals the number of segments.
  • Downstream steps reference the declared segment names exactly as written in the list.
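
The grouping rule can be sketched in Python: consecutive records from the single source are assigned to the names in the interleaved list, one fragment at a time. group_fragments is a hypothetical helper for illustration, and records stand in for already-parsed FASTQ entries.

```python
# Illustrative sketch only: group_fragments is a hypothetical helper.
from itertools import islice


def group_fragments(records, segment_names):
    """Yield one dict per fragment, mapping segment name to record."""
    it = iter(records)
    n = len(segment_names)
    while True:
        chunk = list(islice(it, n))
        if not chunk:
            return
        if len(chunk) != n:
            # The file ended in the middle of a fragment.
            raise ValueError("record count is not a multiple of the segment count")
        yield dict(zip(segment_names, chunk))


records = ["r1_a", "r2_a", "r1_b", "r2_b"]
fragments = list(group_fragments(records, ["read1", "read2"]))
```

With two segment names and four records, this yields two fragments, each holding one read1 and one read2 record.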