<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Reference on mbf-fastq-processor documentation</title><link>https://tyberiusprime.github.io/fastqrab/v0.8.1/docs/reference/</link><description>Recent content in Reference on mbf-fastq-processor documentation</description><generator>Hugo</generator><language>en-us</language><atom:link href="https://tyberiusprime.github.io/fastqrab/v0.8.1/docs/reference/index.xml" rel="self" type="application/rss+xml"/><item><title/><link>https://tyberiusprime.github.io/fastqrab/v0.8.1/docs/reference/cli/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/v0.8.1/docs/reference/cli/</guid><description>&lt;h1 id="command-line-interface">
 Command line interface
 &lt;a class="anchor" href="#command-line-interface">#&lt;/a>
&lt;/h1>
&lt;p>mbf-fastq-processor is configured exclusively through a TOML document. The CLI is therefore intentionally minimal and focuses on selecting the configuration and the working directory.&lt;/p>
&lt;h2 id="usage">
 Usage
 &lt;a class="anchor" href="#usage">#&lt;/a>
&lt;/h2>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-text" data-lang="text">&lt;span style="display:flex;">&lt;span>mbf-fastq-processor process [config.toml] [--allow-overwrite]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>mbf-fastq-processor template
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>mbf-fastq-processor interactive [config.toml]
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;h3 id="process">
 Process
 &lt;a class="anchor" href="#process">#&lt;/a>
&lt;/h3>
&lt;p>Process FASTQ files as described in &amp;lt;config.toml&amp;gt; (see the &lt;a href="https://tyberiusprime.github.io/fastqrab/v0.8.1/fastqrab/v0.8.1/docs/reference/toml/">TOML format reference&lt;/a>).
Relative paths are resolved against the current shell directory.&lt;/p>
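&lt;p>As a minimal sketch, a configuration that &lt;code>process&lt;/code> could consume might look like the following. The file name and prefix are placeholders chosen for illustration; only keys described in the input and output sections of this reference are used.&lt;/p>
&lt;pre>&lt;code class="language-toml"># config.toml (hypothetical example)
[input]
    read1 = &amp;#34;sample_1.fastq.gz&amp;#34;   # placeholder path, resolved against the current shell directory

[output]
    prefix = &amp;#34;output&amp;#34;   # base name for all files produced by the run
&lt;/code>&lt;/pre>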
&lt;p>The config.toml argument can be omitted if and only if there is exactly one .toml file in the current directory and it contains both an [input] and an [output] section.&lt;/p></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/v0.8.1/docs/reference/input-section/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/v0.8.1/docs/reference/input-section/</guid><description>&lt;h1 id="input-section">
 Input section
 &lt;a class="anchor" href="#input-section">#&lt;/a>
&lt;/h1>
&lt;p>The &lt;code>[input]&lt;/code> table enumerates all read sources that make up a fragment.
At least one segment must be declared.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>[&lt;span style="color:#a6e22e">input&lt;/span>]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">read1&lt;/span> = [&lt;span style="color:#e6db74">&amp;#39;fileA_1.fastq&amp;#39;&lt;/span>, &lt;span style="color:#e6db74">&amp;#39;fileB_1.fastq.gz&amp;#39;&lt;/span>, &lt;span style="color:#e6db74">&amp;#39;fileC_1.fastq.zst&amp;#39;&lt;/span>] &lt;span style="color:#75715e"># required: one or more paths&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">read2&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;fileA_2.fastq.gz&amp;#34;&lt;/span> &lt;span style="color:#75715e"># optional&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">index1&lt;/span> = [&lt;span style="color:#e6db74">&amp;#39;index1_A.fastq.gz&amp;#39;&lt;/span>] &lt;span style="color:#75715e"># optional&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#75715e"># interleaved = [...] # optional, see below&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;table>
 &lt;thead>
 &lt;tr>
 &lt;th>Key&lt;/th>
 &lt;th>Required&lt;/th>
 &lt;th>Value type&lt;/th>
 &lt;th>Notes&lt;/th>
 &lt;/tr>
 &lt;/thead>
 &lt;tbody>
 &lt;tr>
 &lt;td>segment name (e.g. &lt;code>read1&lt;/code>)&lt;/td>
 &lt;td>Yes (at least one)&lt;/td>
 &lt;td>string or array of strings&lt;/td>
 &lt;td>Each unique key defines a segment; arrays concatenate multiple files in order.&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>&lt;code>interleaved&lt;/code>&lt;/td>
 &lt;td>No&lt;/td>
 &lt;td>array of strings&lt;/td>
 &lt;td>Enables interleaved reading; must list segment names in their in-file order.&lt;/td>
 &lt;/tr>
 &lt;/tbody>
&lt;/table>
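&lt;p>A hedged sketch of interleaved input, assuming a single file in which the records of &lt;code>read1&lt;/code> and &lt;code>read2&lt;/code> alternate. The key name &lt;code>source&lt;/code> and the file name are assumptions made for illustration and are not taken from this reference; consult the full input documentation for the exact rules.&lt;/p>
&lt;pre>&lt;code class="language-toml">[input]
    source = [&amp;#39;interleaved.fastq.gz&amp;#39;]   # hypothetical key and path holding the interleaved file
    interleaved = [&amp;#34;read1&amp;#34;, &amp;#34;read2&amp;#34;]   # segment names in their in-file order
&lt;/code>&lt;/pre>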
&lt;p>Additional points:&lt;/p></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/v0.8.1/docs/reference/output-section/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/v0.8.1/docs/reference/output-section/</guid><description>&lt;h1 id="output-section">
 Output section
 &lt;a class="anchor" href="#output-section">#&lt;/a>
&lt;/h1>
&lt;p>The &lt;code>[output]&lt;/code> table controls how transformed reads and reporting artefacts are written.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>[&lt;span style="color:#a6e22e">output&lt;/span>]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">prefix&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;output&amp;#34;&lt;/span> &lt;span style="color:#75715e"># required.&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">format&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;Fastq&amp;#34;&lt;/span> &lt;span style="color:#75715e"># (optional) output format, defaults to &amp;#39;Fastq&amp;#39;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>					 &lt;span style="color:#75715e"># Valid values are: Fastq, Fasta, BAM and None (for no sequence output)&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">compression&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;Gzip&amp;#34;&lt;/span> &lt;span style="color:#75715e"># Raw | Uncompressed | Gzip | Zstd | None (default: Raw)&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">suffix&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;.fq.gz&amp;#34;&lt;/span> &lt;span style="color:#75715e"># optional override; inferred from format when omitted&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">compression_level&lt;/span> = &lt;span style="color:#ae81ff">6&lt;/span> &lt;span style="color:#75715e"># gzip: 0-9, zstd: 1-22, bam: 0-9 (BGZF); defaults are gzip=6, zstd=5&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">ix_separator&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;_&amp;#34;&lt;/span> &lt;span style="color:#75715e"># optional separator between prefix, infixes, and segments. Defaults to &amp;#39;_&amp;#39;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">report_json&lt;/span> = &lt;span style="color:#66d9ef">false&lt;/span> &lt;span style="color:#75715e"># write prefix.json&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">report_html&lt;/span> = &lt;span style="color:#66d9ef">true&lt;/span> &lt;span style="color:#75715e"># write prefix.html&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">output&lt;/span> = [&lt;span style="color:#e6db74">&amp;#34;read1&amp;#34;&lt;/span>, &lt;span style="color:#e6db74">&amp;#34;read2&amp;#34;&lt;/span>] &lt;span style="color:#75715e"># limit which segments become FASTQ files&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">interleave&lt;/span> = &lt;span style="color:#66d9ef">false&lt;/span> &lt;span style="color:#75715e"># emit a single interleaved FASTQ&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">stdout&lt;/span> = &lt;span style="color:#66d9ef">false&lt;/span> &lt;span style="color:#75715e"># stream to stdout instead of files&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">chunk_size&lt;/span> = &lt;span style="color:#ae81ff">100000&lt;/span> &lt;span style="color:#75715e"># Write multiple, numbered output files, each a maximum of chunk_size reads/molecules.&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">output_hash_uncompressed&lt;/span> = &lt;span style="color:#66d9ef">false&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">output_hash_compressed&lt;/span> = &lt;span style="color:#66d9ef">false&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;table>
 &lt;thead>
 &lt;tr>
 &lt;th>Key&lt;/th>
 &lt;th>Default&lt;/th>
 &lt;th>Description&lt;/th>
 &lt;/tr>
 &lt;/thead>
 &lt;tbody>
 &lt;tr>
 &lt;td>&lt;code>prefix&lt;/code>&lt;/td>
 &lt;td>&lt;code>&amp;quot;output&amp;quot;&lt;/code>&lt;/td>
 &lt;td>Base name for all files produced by the run.&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>&lt;code>format&lt;/code>&lt;/td>
 &lt;td>&lt;code>&amp;quot;Fastq&amp;quot;&lt;/code>&lt;/td>
 &lt;td>Output format. Valid values are: &lt;code>Fastq&lt;/code>, &lt;code>Fasta&lt;/code>, &lt;code>Bam&lt;/code>, and &lt;code>None&lt;/code> (for no sequence output).&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>&lt;code>compression&lt;/code>&lt;/td>
 &lt;td>&lt;code>&amp;quot;Uncompressed&amp;quot;&lt;/code>&lt;/td>
 &lt;td>Compression format for read outputs. Valid values are: &lt;code>Gzip&lt;/code>, &lt;code>Zstd&lt;/code>, &lt;code>Uncompressed&lt;/code> (alias: &lt;code>&amp;quot;Raw&amp;quot;&lt;/code>). Must not be set for BAM output.&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>&lt;code>suffix&lt;/code>&lt;/td>
 &lt;td>derived from format&lt;/td>
 &lt;td>Override file extension when interop with other tooling demands a specific suffix.&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>&lt;code>compression_level&lt;/code>&lt;/td>
 &lt;td>gzip: 6, zstd: 5&lt;/td>
 &lt;td>Fine-tune compression effort. Ignored for &lt;code>Raw&lt;/code>/&lt;code>None&lt;/code>. &lt;code>Bam&lt;/code> maps directly to the BGZF level (0–9).&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>&lt;code>report_json&lt;/code> / &lt;code>report_html&lt;/code>&lt;/td>
 &lt;td>&lt;code>false&lt;/code>&lt;/td>
 &lt;td>Toggle structured or interactive reports.&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>&lt;code>output&lt;/code>&lt;/td>
 &lt;td>all input segments&lt;/td>
 &lt;td>Restrict the subset of segments written to disk. Use an empty list to suppress FASTQs while still running steps that depend on fragment data.&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>&lt;code>interleave&lt;/code>&lt;/td>
 &lt;td>&lt;code>false&lt;/code>&lt;/td>
 &lt;td>Generate a single interleaved FASTQ (&lt;code>{prefix}_interleaved.fq*&lt;/code>).&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>&lt;code>stdout&lt;/code>&lt;/td>
 &lt;td>&lt;code>false&lt;/code>&lt;/td>
 &lt;td>Write to stdout. Forces uncompressed output (&lt;code>compression = &amp;quot;Raw&amp;quot;&lt;/code>). Sets &lt;code>interleave = true&lt;/code> if more than one segment is listed in &lt;code>output&lt;/code>.&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>&lt;code>output_hash_uncompressed&lt;/code> / &lt;code>output_hash_compressed&lt;/code>&lt;/td>
 &lt;td>&lt;code>false&lt;/code>&lt;/td>
 &lt;td>Emit SHA-256 checksums.&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>&lt;code>ix_separator&lt;/code>&lt;/td>
 &lt;td>&lt;code>&amp;quot;_&amp;quot;&lt;/code>&lt;/td>
 &lt;td>Separator inserted between &lt;code>prefix&lt;/code>, any infix (demultiplex labels, inspect names, etc.), and segment names.&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>&lt;code>chunk_size&lt;/code>&lt;/td>
 &lt;td>(unlimited)&lt;/td>
 &lt;td>Split outputs into multiple files, each containing at most &lt;code>chunk_size&lt;/code> reads/molecules. For non-interleaved output files the limit counts reads; for interleaved files it counts molecules. This means that when mixing interleaved and non-interleaved output, you get the same number of files. Files are numbered sequentially, e.g. &lt;code>output_read1_0.fq.gz&lt;/code>, &amp;hellip; Numbers start at 0 and use the minimum number of (base 10) digits necessary for alphabetical sorting (already produced files are renamed whenever an additional digit is needed).&lt;/td>
 &lt;/tr>
 &lt;/tbody>
&lt;/table>
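&lt;p>As an illustration of the table above, the following sketch suppresses sequence output while still writing both reports. The prefix is a placeholder; every key is taken from the table.&lt;/p>
&lt;pre>&lt;code class="language-toml">[output]
    prefix = &amp;#34;qc_only&amp;#34;   # placeholder base name
    output = []            # empty list: no FASTQ files are written
    report_json = true     # write qc_only.json
    report_html = true     # write qc_only.html
&lt;/code>&lt;/pre>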
&lt;p>Generated filenames join these components with &lt;code>ix_separator&lt;/code> (default &lt;code>_&lt;/code>), e.g. &lt;code>{prefix}_{segment}{suffix}&lt;/code>. Interleaving replaces &lt;code>segment&lt;/code> with &lt;code>interleaved&lt;/code>; demultiplexing adds per-barcode infixes before the segment. Checksums use &lt;code>.uncompressed.sha256&lt;/code> or &lt;code>.compressed.sha256&lt;/code> suffixes.&lt;/p></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/v0.8.1/docs/reference/demultiplex/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/v0.8.1/docs/reference/demultiplex/</guid><description>&lt;h2 id="demultiplexed-output">
 Demultiplexed output
 &lt;a class="anchor" href="#demultiplexed-output">#&lt;/a>
&lt;/h2>
&lt;p>&lt;a href="https://tyberiusprime.github.io/fastqrab/v0.8.1/fastqrab/v0.8.1/docs/reference/demultiplex/">Demultiplex&lt;/a> is a magic transformation that forks the output.&lt;/p>
&lt;p>You receive one set of output files per barcode (combination) defined.&lt;/p>
&lt;p>Transformations downstream are (virtually) duplicated,
so you can, for example, keep only the first reads of each barcode
and get reports both for all reads and for each barcode separately.&lt;/p>
&lt;p>Demultiplexing can be done on barcodes, or on boolean tags.&lt;/p>
&lt;h3 id="based-on-barcodes">
 Based on barcodes
 &lt;a class="anchor" href="#based-on-barcodes">#&lt;/a>
&lt;/h3>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;Demultiplex&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">in_label&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;mytag&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">barcodes&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;mybarcodes&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">output_unmatched&lt;/span> = &lt;span style="color:#66d9ef">true&lt;/span> &lt;span style="color:#75715e"># if set, write reads not matching any barcode&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#75715e"># to a file like output_prefix_no-barcode_1.fq&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>[&lt;span style="color:#a6e22e">barcodes&lt;/span>.&lt;span style="color:#a6e22e">mybarcodes&lt;/span>] &lt;span style="color:#75715e"># may be defined before or after the step that uses it.&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>&lt;span style="color:#75715e"># separate multiple regions with a _&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>&lt;span style="color:#75715e"># a Mapping of barcode -&amp;gt; output name.&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>&lt;span style="color:#a6e22e">AAAAAA_CCCCCC&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;sample-1&amp;#34;&lt;/span> &lt;span style="color:#75715e"># output files are named prefix{ix_separator}barcode_prefix{ix_separator}segment.suffix&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#75715e"># with the separator defaulting to &amp;#39;_&amp;#39;, e.g. output_sample-1_1.fq.gz&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#75715e"># or output_sample-1_report.fq.gz&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;h3 id="based-on-boolean-tags">
 Based on boolean tags
 &lt;a class="anchor" href="#based-on-boolean-tags">#&lt;/a>
&lt;/h3>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">segment&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;read1&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;TagOtherFileByName&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">out_label&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;a_bool_tag&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">filename&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;path/to/boolean_tags.tsv&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">false_positive_rate&lt;/span> = &lt;span style="color:#ae81ff">0&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;Demultiplex&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">in_label&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;a_bool_tag&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Note that Demultiplex does not extract the barcodes from the reads itself; use an extract step, such as &lt;a href="https://tyberiusprime.github.io/fastqrab/v0.8.1/fastqrab/v0.8.1/docs/reference/tag-steps/extract/extractregion/">ExtractRegion&lt;/a>, to create the tag first.&lt;/p></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/v0.8.1/docs/reference/options/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/v0.8.1/docs/reference/options/</guid><description>&lt;h1 id="options">
 Options
 &lt;a class="anchor" href="#options">#&lt;/a>
&lt;/h1>
&lt;p>There is a small set of runtime knobs exposed under &lt;code>[options]&lt;/code>. Most workflows can rely on the defaults.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>[&lt;span style="color:#a6e22e">options&lt;/span>]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">thread_count&lt;/span> = &lt;span style="color:#ae81ff">-1&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">block_size&lt;/span> = &lt;span style="color:#ae81ff">10000&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">buffer_size&lt;/span> = &lt;span style="color:#ae81ff">102400&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">accept_duplicate_files&lt;/span> = &lt;span style="color:#66d9ef">false&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">spot_check_read_pairing&lt;/span> = &lt;span style="color:#66d9ef">true&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;table>
 &lt;thead>
 &lt;tr>
 &lt;th>Key&lt;/th>
 &lt;th>Default&lt;/th>
 &lt;th>Description&lt;/th>
 &lt;/tr>
 &lt;/thead>
 &lt;tbody>
 &lt;tr>
 &lt;td>&lt;code>thread_count&lt;/code>&lt;/td>
 &lt;td>&lt;code>-1&lt;/code>&lt;/td>
 &lt;td>Worker threads for transformations. &lt;code>-1&lt;/code> autotunes per CPU; most runtime is still dominated by decompression threads, so gains are modest.&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>&lt;code>block_size&lt;/code>&lt;/td>
 &lt;td>&lt;code>10000&lt;/code>&lt;/td>
 &lt;td>Number of fragments pulled per batch. Increase for very large runs when IO is abundant; decrease to reduce peak memory use.&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>&lt;code>buffer_size&lt;/code>&lt;/td>
 &lt;td>&lt;code>102400&lt;/code>&lt;/td>
 &lt;td>Initial bytes reserved per block. The allocator grows buffers on demand, so tuning is rarely necessary.&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>&lt;code>accept_duplicate_files&lt;/code>&lt;/td>
 &lt;td>&lt;code>false&lt;/code>&lt;/td>
 &lt;td>Permit the same path to appear multiple times across segments. Useful for fixtures or synthetic tests; keep disabled to catch accidental copy/paste errors.&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>&lt;code>spot_check_read_pairing&lt;/code>&lt;/td>
 &lt;td>&lt;code>true&lt;/code>&lt;/td>
 &lt;td>Sample every 1000th fragment to ensure paired reads still share a name prefix. Disable it when names are intentionally divergent, or rely on &lt;code>ValidateName&lt;/code> when you need to customise the separator.&lt;/td>
 &lt;/tr>
 &lt;/tbody>
&lt;/table>
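&lt;p>A sketch of a more memory-conservative setup, based on the descriptions above. The concrete values are illustrative assumptions, not recommendations.&lt;/p>
&lt;pre>&lt;code class="language-toml">[options]
    thread_count = 4    # fixed number of worker threads instead of autotuning (-1)
    block_size = 2000   # fewer fragments per batch than the default 10000, to reduce peak memory use
&lt;/code>&lt;/pre>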
&lt;p>Changing these knobs can affect memory pressure and concurrency behaviour. Measure before and after if you deviate from defaults.&lt;/p></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/v0.8.1/docs/reference/out_of_scope/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/v0.8.1/docs/reference/out_of_scope/</guid><description>&lt;h1 id="out-of-scope">
 Out of scope
 &lt;a class="anchor" href="#out-of-scope">#&lt;/a>
&lt;/h1>
&lt;p>Things mbf-fastq-processor will explicitly not do and that won&amp;rsquo;t be implemented.&lt;/p>
&lt;h2 id="anything-based-on-averaging-phred-scores">
 Anything based on averaging phred scores
 &lt;a class="anchor" href="#anything-based-on-averaging-phred-scores">#&lt;/a>
&lt;/h2>
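&lt;p>For example, trimming or filtering based on the average quality in a sliding window.
Arithmetic averaging of phred scores is wrong, because phred scores are logarithmic: the mean of the scores does not correspond to the mean error probability.&lt;/p>
&lt;p>See &lt;a href="https://tyberiusprime.github.io/fastqrab/v0.8.1/fastqrab/v0.8.1/docs/reference/tag-steps/calc/calcmeanquality/">ExtractMeanQuality&lt;/a>.&lt;/p>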
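&lt;p>A short worked example of why the arithmetic mean misleads, using the standard definition of a phred score, Q = -10 * log10(p), where p is the error probability. The two scores are chosen purely for illustration.&lt;/p>
&lt;pre>&lt;code>Q1 = 10  -&amp;gt;  p1 = 10^(-10/10) = 0.1
Q2 = 40  -&amp;gt;  p2 = 10^(-40/10) = 0.0001

arithmetic mean of scores:  (10 + 40) / 2 = 25           -&amp;gt;  implies p = 10^(-2.5) ≈ 0.0032
mean error probability:     (0.1 + 0.0001) / 2 = 0.05005  -&amp;gt;  Q = -10 * log10(0.05005) ≈ 13.0
&lt;/code>&lt;/pre>
&lt;p>The averaged score of 25 understates the actual mean error probability roughly 16-fold.&lt;/p>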
&lt;h3 id="corresponding-options-in-other-software">
 Corresponding options in other software
 &lt;a class="anchor" href="#corresponding-options-in-other-software">#&lt;/a>
&lt;/h3>
&lt;ul>
&lt;li>Trimmomatic SLIDINGWINDOW&lt;/li>
&lt;li>fastp &lt;code>--cut_front&lt;/code>&lt;/li>
&lt;li>fastp &lt;code>--cut_tail&lt;/code>&lt;/li>
&lt;li>fastp &lt;code>--cut_right&lt;/code>&lt;/li>
&lt;/ul>
&lt;h2 id="fast5">
 Fast5
 &lt;a class="anchor" href="#fast5">#&lt;/a>
&lt;/h2>
&lt;p>&lt;a href="https://medium.com/@shiansu/a-look-at-the-nanopore-fast5-format-f711999e2ff6">https://medium.com/@shiansu/a-look-at-the-nanopore-fast5-format-f711999e2ff6&lt;/a>
Oxford Nanopore squiggle data.
Apparently no formal spec.&lt;/p>
&lt;h2 id="kallisto-bus-format">
 kallisto BUS format
 &lt;a class="anchor" href="#kallisto-bus-format">#&lt;/a>
&lt;/h2>
&lt;ul>
&lt;li>A brief barcode/UMI format for single-cell RNA-seq.&lt;/li>
&lt;li>Needs an 'equivalence class', i.e. at least pseudo-alignment.&lt;/li>
&lt;li>Weird length restrictions on barcodes and UMIs (1(!)-32), but stores the length in a uint32&amp;hellip;&lt;/li>
&lt;/ul></description></item></channel></rss>