<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Introduction on fastqrab documentation</title><link>https://tyberiusprime.github.io/fastqrab/main/</link><description>Recent content in Introduction on fastqrab documentation</description><generator>Hugo</generator><language>en-us</language><atom:link href="https://tyberiusprime.github.io/fastqrab/main/index.xml" rel="self" type="application/rss+xml"/><item><title/><link>https://tyberiusprime.github.io/fastqrab/main/docs/concepts/philosophy/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/concepts/philosophy/</guid><description>&lt;h1 id="philosophy">
 Philosophy
 &lt;a class="anchor" href="#philosophy">#&lt;/a>
&lt;/h1>
&lt;p>fastqrab transforms (DNA) sequencing reads for downstream analysis.&lt;/p>
&lt;p>Its focus is on&lt;/p>
&lt;ul>
&lt;li>correctness&lt;/li>
&lt;li>reproducibility&lt;/li>
&lt;li>a lack of surprises&lt;/li>
&lt;li>friendliness&lt;/li>
&lt;li>speed&lt;/li>
&lt;/ul>
&lt;h2 id="correctness">
 Correctness
 &lt;a class="anchor" href="#correctness">#&lt;/a>
&lt;/h2>
&lt;p>We strive to do the right thing, always.&lt;/p>
&lt;p>To that end, fastqrab is tested with more than 500
end-to-end, input-to-output tests, both during development and via
continuous integration.&lt;/p>
&lt;h2 id="reproducibility">
 Reproducibility
 &lt;a class="anchor" href="#reproducibility">#&lt;/a>
&lt;/h2>
&lt;p>Repeated runs on the same bits (input data &amp;amp; configuration)
must deliver the same output bits. Every time.&lt;/p></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/main/docs/development/getting_started/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/development/getting_started/</guid><description>&lt;h1 id="getting-started-with-development">
 Getting started with development
 &lt;a class="anchor" href="#getting-started-with-development">#&lt;/a>
&lt;/h1>
&lt;p>The easiest way to get started with working on fastqrab
is to clone the repository:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-bash" data-lang="bash">&lt;span style="display:flex;">&lt;span>jj git clone --colocate https://github.com/TyberiusPrime/fastqrab
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>or&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-bash" data-lang="bash">&lt;span style="display:flex;">&lt;span>git clone https://github.com/TyberiusPrime/fastqrab
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>if you&amp;rsquo;re not yet convinced that &lt;a href="https://docs.jj-vcs.dev/latest/">Jujutsu&lt;/a> is the better git.&lt;/p>
&lt;h2 id="development-environment">
 Development environment
 &lt;a class="anchor" href="#development-environment">#&lt;/a>
&lt;/h2>
&lt;p>Use &lt;code>nix develop&lt;/code> to enter a shell with all the necessary requirements, provided by &lt;a href="https://nix.dev/">Nix&lt;/a>.&lt;/p>
&lt;p>If you don&amp;rsquo;t use Nix, you&amp;rsquo;re on your own to supply a matching Rust compiler,
openssl and pkg-config.&lt;/p></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/main/docs/faq/trouble_shooting/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/faq/trouble_shooting/</guid><description>&lt;h1 id="troubleshooting">
 Troubleshooting
 &lt;a class="anchor" href="#troubleshooting">#&lt;/a>
&lt;/h1>
&lt;h2 id="i-dont-know-what-to-do-after-an-error-message">
 I don&amp;rsquo;t know what to do after an error message
 &lt;a class="anchor" href="#i-dont-know-what-to-do-after-an-error-message">#&lt;/a>
&lt;/h2>
&lt;p>Well, that&amp;rsquo;s below our target: we want our error messages to tell
you enough to fix the problem yourself.&lt;/p>
&lt;p>Please file a bug report in our &lt;a href="https://github.com/tyberiusPrime/fastqrab/issues">issue tracker&lt;/a>.&lt;/p>
&lt;h2 id="i-received-a-friendly-panic-message">
 I received a friendly panic message
 &lt;a class="anchor" href="#i-received-a-friendly-panic-message">#&lt;/a>
&lt;/h2>
&lt;p>These look like this&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-bash" data-lang="bash">&lt;span style="display:flex;">&lt;span>Well, this is embarrassing.
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>fastqrab had a problem and crashed. To help us diagnose the problem you can send us a crash report.
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>We have generated a report file at &lt;span style="color:#e6db74">&amp;#34;/tmp/nix-shell.ty6dWk/report-89ffc0f3-8076-4f55-83f0-c0a5d2fb3b55.toml&amp;#34;&lt;/span>.
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>This kind of error message, which wraps a Rust &amp;lsquo;panic&amp;rsquo;, only happens if fastqrab has managed
to get into an impossible or unforeseen state.&lt;/p></description></item><item><title>LLM Configuration Guide</title><link>https://tyberiusprime.github.io/fastqrab/main/docs/reference/llm-guide/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/reference/llm-guide/</guid><description>&lt;h1 id="llm-configuration-generation-guide">
 LLM Configuration Generation Guide
 &lt;a class="anchor" href="#llm-configuration-generation-guide">#&lt;/a>
&lt;/h1>
&lt;p>This guide is optimized for Large Language Models to generate valid &lt;code>fastqrab&lt;/code> configurations. It provides structured information with explicit types, constraints, and patterns.&lt;/p>
&lt;h2 id="configuration-structure">
 Configuration Structure
 &lt;a class="anchor" href="#configuration-structure">#&lt;/a>
&lt;/h2>
&lt;p>Every configuration has 2 required sections and 3 optional sections:&lt;/p>
&lt;pre tabindex="0">&lt;code># example-only - structure overview, not valid TOML
[input] # REQUIRED: Define input files
[[step]] # OPTIONAL: Processing steps (0 or more, order matters)
[output] # REQUIRED: Output configuration
[barcodes.*] # OPTIONAL: Barcode definitions for demultiplexing
[options] # OPTIONAL: Global processing options
&lt;/code>&lt;/pre>&lt;h2 id="quick-start-patterns">
 Quick Start Patterns
 &lt;a class="anchor" href="#quick-start-patterns">#&lt;/a>
&lt;/h2>
&lt;h3 id="pattern-1-basic-quality-report">
 Pattern 1: Basic Quality Report
 &lt;a class="anchor" href="#pattern-1-basic-quality-report">#&lt;/a>
&lt;/h3>
&lt;p>Generate reports without modifying sequences.&lt;/p></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/main/docs/development/custom_transformation/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/development/custom_transformation/</guid><description>&lt;h1 id="implementing-your-own-transformation">
 Implementing your own transformation
 &lt;a class="anchor" href="#implementing-your-own-transformation">#&lt;/a>
&lt;/h1>
&lt;p>Let&amp;rsquo;s implement an example step (also called a &amp;rsquo;transformation&amp;rsquo;) that converts your reads to FuNkYcAsE
by upper/lower-casing every other letter.&lt;/p>
&lt;p>This guide assumes you have basic linux command line knowledge, and that you
can edit text files (source code).&lt;/p>
&lt;p>We are going to start by devising a test case, making sure it fails,
and then step by step adding all the parts we need. This will illustrate
all the infrastructure the project has to support you in this.&lt;/p></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/main/docs/reference/CLI/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/reference/CLI/</guid><description>&lt;h1 id="command-line-interface">
 Command line interface
 &lt;a class="anchor" href="#command-line-interface">#&lt;/a>
&lt;/h1>
&lt;p>fastqrab is configured exclusively through a TOML document. The CLI is therefore intentionally minimal and focuses on selecting the configuration and the working directory.&lt;/p>
&lt;h2 id="usage">
 Usage
 &lt;a class="anchor" href="#usage">#&lt;/a>
&lt;/h2>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-text" data-lang="text">&lt;span style="display:flex;">&lt;span>fastqrab process [config.toml|-] [--allow-overwrite]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>fastqrab template
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>fastqrab verify [config.toml] [--output-dir &amp;lt;OUTPUT_DIR&amp;gt;]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>fastqrab interactive [config.toml]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>fastqrab completions &amp;lt;SHELL&amp;gt;
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;h3 id="process">
 Process
 &lt;a class="anchor" href="#process">#&lt;/a>
&lt;/h3>
&lt;p>Process FASTQ as described in &amp;lt;config.toml&amp;gt; (see the &lt;a href="https://tyberiusprime.github.io/fastqrab/main/fastqrab/main/docs/reference/toml/">TOML format reference&lt;/a>).
Relative paths are resolved against the current shell directory.&lt;/p></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/main/docs/reference/input-section/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/reference/input-section/</guid><description>&lt;h1 id="input-section">
 Input section
 &lt;a class="anchor" href="#input-section">#&lt;/a>
&lt;/h1>
&lt;p>The &lt;code>[input]&lt;/code> table enumerates all read sources that make up a fragment.
At least one segment must be declared.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>[&lt;span style="color:#a6e22e">input&lt;/span>]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">read1&lt;/span> = [&lt;span style="color:#e6db74">&amp;#39;fileA_1.fastq&amp;#39;&lt;/span>, &lt;span style="color:#e6db74">&amp;#39;fileB_1.fastq.gz&amp;#39;&lt;/span>, &lt;span style="color:#e6db74">&amp;#39;fileC_1.fastq.zst&amp;#39;&lt;/span>] &lt;span style="color:#75715e"># required: one or more paths&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">read2&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;fileA_2.fastq.gz&amp;#34;&lt;/span> &lt;span style="color:#75715e"># optional&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">index1&lt;/span> = [&lt;span style="color:#e6db74">&amp;#39;index1_A.fastq.gz&amp;#39;&lt;/span>] &lt;span style="color:#75715e"># optional&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#75715e"># interleaved = [...] # optional, see below&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;table>
 &lt;thead>
 &lt;tr>
 &lt;th>Key&lt;/th>
 &lt;th>Required&lt;/th>
 &lt;th>Value type&lt;/th>
 &lt;th>Notes&lt;/th>
 &lt;/tr>
 &lt;/thead>
 &lt;tbody>
 &lt;tr>
 &lt;td>segment name (e.g. &lt;code>read1&lt;/code>)&lt;/td>
 &lt;td>Yes (at least one)&lt;/td>
 &lt;td>string or array of strings&lt;/td>
 &lt;td>Each unique key defines a segment; arrays concatenate multiple files in order.&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>&lt;code>interleaved&lt;/code>&lt;/td>
 &lt;td>No&lt;/td>
 &lt;td>array of strings&lt;/td>
 &lt;td>Enables interleaved reading; must list segment names in their in-file order.&lt;/td>
 &lt;/tr>
 &lt;/tbody>
&lt;/table>
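&lt;p>As a hedged illustration of the table above, the following sketch declares two segments, each concatenating two files in order. The file names are hypothetical, and the sketch assumes that the files of paired segments are listed in matching order.&lt;/p>
&lt;pre tabindex="0">&lt;code class="language-toml" data-lang="toml"># illustrative sketch, file names are placeholders
[input]
    read1 = [&amp;#39;lane1_R1.fastq.gz&amp;#39;, &amp;#39;lane2_R1.fastq.gz&amp;#39;] # segment &amp;#39;read1&amp;#39;: two files, read in this order
    read2 = [&amp;#39;lane1_R2.fastq.gz&amp;#39;, &amp;#39;lane2_R2.fastq.gz&amp;#39;] # segment &amp;#39;read2&amp;#39;: same file order (assumption)
&lt;/code>&lt;/pre>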
&lt;p>Additional points:&lt;/p></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/main/docs/reference/output-section/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/reference/output-section/</guid><description>&lt;h1 id="output-section">
 Output section
 &lt;a class="anchor" href="#output-section">#&lt;/a>
&lt;/h1>
&lt;p>The &lt;code>[output]&lt;/code> table controls how transformed reads and reporting artefacts are written.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>[&lt;span style="color:#a6e22e">output&lt;/span>]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">prefix&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;output&amp;#34;&lt;/span> &lt;span style="color:#75715e"># required.&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">format&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;Fastq&amp;#34;&lt;/span> &lt;span style="color:#75715e"># (optional) output format, defaults to &amp;#39;Fastq&amp;#39;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>					 &lt;span style="color:#75715e"># Valid values are: Fastq, Fasta, Bam and None (for no sequence output)&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">compression&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;Gzip&amp;#34;&lt;/span> &lt;span style="color:#75715e"># Raw | Uncompressed | Gzip | Zstd | None (default: Raw)&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">compression_threads&lt;/span> = &lt;span style="color:#ae81ff">5&lt;/span> &lt;span style="color:#75715e"># (optional) number of threads to use for compressing gzip data&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">suffix&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;.fq.gz&amp;#34;&lt;/span> &lt;span style="color:#75715e"># optional override; inferred from format when omitted&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">compression_level&lt;/span> = &lt;span style="color:#ae81ff">6&lt;/span> &lt;span style="color:#75715e"># gzip: 0-9, zstd: 1-22, bam: 0-9 (BGZF); defaults are gzip=6, zstd=5&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">ix_separator&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;_&amp;#34;&lt;/span> &lt;span style="color:#75715e"># optional separator between prefix, infixes, and segments. Defaults to &amp;#39;_&amp;#39;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">report_json&lt;/span> = &lt;span style="color:#66d9ef">false&lt;/span> &lt;span style="color:#75715e"># write prefix.json (default: false)&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">report_html&lt;/span> = &lt;span style="color:#66d9ef">true&lt;/span> &lt;span style="color:#75715e"># write prefix.html (default: false)&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">report_timing&lt;/span> = &lt;span style="color:#66d9ef">false&lt;/span> &lt;span style="color:#75715e"># write prefix.timing.json (default: false)&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">output&lt;/span> = [&lt;span style="color:#e6db74">&amp;#34;read1&amp;#34;&lt;/span>, &lt;span style="color:#e6db74">&amp;#34;read2&amp;#34;&lt;/span>] &lt;span style="color:#75715e"># limit which segments become FASTQ files&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">interleave&lt;/span> = &lt;span style="color:#66d9ef">false&lt;/span> &lt;span style="color:#75715e"># emit a single interleaved FASTQ&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">stdout&lt;/span> = &lt;span style="color:#66d9ef">false&lt;/span> &lt;span style="color:#75715e"># stream to stdout instead of files&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">chunk_size&lt;/span> = &lt;span style="color:#ae81ff">100000&lt;/span> &lt;span style="color:#75715e"># Write multiple, numbered output files, each a maximum of chunk_size reads/molecules.&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">output_hash_uncompressed&lt;/span> = &lt;span style="color:#66d9ef">false&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">output_hash_compressed&lt;/span> = &lt;span style="color:#66d9ef">false&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>[&lt;span style="color:#a6e22e">output&lt;/span>.&lt;span style="color:#a6e22e">bam&lt;/span>] &lt;span style="color:#75715e"># only valid when format = bam&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">comment_separation_char&lt;/span> = &lt;span style="color:#e6db74">&amp;#39; &amp;#39;&lt;/span> &lt;span style="color:#75715e"># read parts after this get put into a &amp;#39;CO&amp;#39; tag (free-text comment, null terminated in BAM)&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">tag_to_bam_tag&lt;/span> = {&lt;span style="color:#e6db74">&amp;#39;a-tag-label&amp;#39;&lt;/span> = &lt;span style="color:#e6db74">&amp;#39;XY&amp;#39;&lt;/span>} &lt;span style="color:#75715e"># write a-tag-label to bam tag &amp;#39;XY&amp;#39;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">tag_to_reference&lt;/span> = {
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">tag&lt;/span> = &lt;span style="color:#e6db74">&amp;#39;assigned-barcode-name&amp;#39;&lt;/span>,
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#75715e"># needs one of &lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#75715e"># references_from_bam = &amp;#34;template.bam&amp;#34;, # take reference information from here&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#75715e"># or&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">references_from_barcodes&lt;/span> = &lt;span style="color:#e6db74">&amp;#39;barcode-section-name&amp;#39;&lt;/span> &lt;span style="color:#75715e"># take references from this barcode section.&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> }
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;table>
 &lt;thead>
 &lt;tr>
 &lt;th>Key&lt;/th>
 &lt;th>Default&lt;/th>
 &lt;th>Description&lt;/th>
 &lt;/tr>
 &lt;/thead>
 &lt;tbody>
 &lt;tr>
 &lt;td>&lt;code>prefix&lt;/code>&lt;/td>
 &lt;td>&lt;code>&amp;quot;output&amp;quot;&lt;/code>&lt;/td>
 &lt;td>Base name for all files produced by the run.&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>&lt;code>format&lt;/code>&lt;/td>
 &lt;td>&lt;code>&amp;quot;Fastq&amp;quot;&lt;/code>&lt;/td>
 &lt;td>Output format. Valid values are: &lt;code>Fastq&lt;/code>, &lt;code>Fasta&lt;/code>, &lt;code>Bam&lt;/code>, and &lt;code>None&lt;/code> (for no sequence output).&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>&lt;code>compression&lt;/code>&lt;/td>
 &lt;td>&lt;code>&amp;quot;Uncompressed&amp;quot;&lt;/code>&lt;/td>
 &lt;td>Compression format for read outputs. Valid values are: &lt;code>Gzip&lt;/code>, &lt;code>Zstd&lt;/code>, &lt;code>Uncompressed&lt;/code> (alias: &lt;code>&amp;quot;Raw&amp;quot;&lt;/code>). Must not be set for BAM&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>&lt;code>compression_threads&lt;/code>&lt;/td>
 &lt;td>auto&lt;/td>
 &lt;td>If using gzip compression, how many threads should be used for compression. See &lt;a href="https://tyberiusprime.github.io/fastqrab/main/fastqrab/main/docs/reference/threading/">threading&lt;/a>.&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>&lt;code>suffix&lt;/code>&lt;/td>
 &lt;td>derived from format&lt;/td>
 &lt;td>Override file extension when interop with other tooling demands a specific suffix.&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>&lt;code>compression_level&lt;/code>&lt;/td>
 &lt;td>gzip: 6, zstd: 5&lt;/td>
 &lt;td>Fine-tune compression effort. Ignored for &lt;code>Raw&lt;/code>/&lt;code>None&lt;/code>. &lt;code>Bam&lt;/code> maps directly to the BGZF level (0–9).&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>&lt;code>report_json&lt;/code> / &lt;code>report_html&lt;/code>&lt;/td>
 &lt;td>&lt;code>false&lt;/code>&lt;/td>
 &lt;td>Toggle structured or interactive reports.&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>&lt;code>report_timing&lt;/code>&lt;/td>
 &lt;td>&lt;code>false&lt;/code>&lt;/td>
 &lt;td>Emit a JSON file with detailed timing information for all steps.&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>&lt;code>output&lt;/code>&lt;/td>
 &lt;td>all input segments&lt;/td>
 &lt;td>Restrict the subset of segments written to disk. Use an empty list to suppress FASTQs while still running steps that depend on fragment data.&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>&lt;code>interleave&lt;/code>&lt;/td>
 &lt;td>&lt;code>false&lt;/code>&lt;/td>
 &lt;td>Generate a single interleaved FASTQ (&lt;code>{prefix}_interleaved.fq*&lt;/code>).&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>&lt;code>stdout&lt;/code>&lt;/td>
 &lt;td>&lt;code>false&lt;/code>&lt;/td>
 &lt;td>Write to stdout. Forces &lt;code>format = &amp;quot;Raw&amp;quot;&lt;/code>. Sets &lt;code>interleave = true&lt;/code> if more than one segment is listed in &lt;code>output&lt;/code>.&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>&lt;code>output_hash_uncompressed&lt;/code> / &lt;code>output_hash_compressed&lt;/code>&lt;/td>
 &lt;td>&lt;code>false&lt;/code>&lt;/td>
 &lt;td>Emit SHA-256 checksums.&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>&lt;code>ix_separator&lt;/code>&lt;/td>
 &lt;td>&lt;code>&amp;quot;_&amp;quot;&lt;/code>&lt;/td>
 &lt;td>Separator inserted between &lt;code>prefix&lt;/code>, any infix (demultiplex labels, inspect names, etc.), and segment names.&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>&lt;code>chunk_size&lt;/code>&lt;/td>
 &lt;td>(unlimited)&lt;/td>
 &lt;td>Split outputs into multiple files, each containing at most &lt;code>chunk_size&lt;/code> reads/molecules. For non-interleaved output files, the limit is &lt;code>chunk_size&lt;/code> reads; for interleaved files, it is &lt;code>chunk_size&lt;/code> molecules. This means that when mixing interleaved and non-interleaved output, you get the same number of files. Files are numbered sequentially, e.g. &lt;code>output_read1_0.fq.gz&lt;/code>, &amp;hellip; Numbers start at 0 and use the minimum number of (base 10) digits necessary for alphabetical sorting (by renaming already produced files whenever an extension is needed).&lt;/td>
 &lt;/tr>
 &lt;/tbody>
&lt;/table>
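&lt;p>A minimal sketch combining some of the keys above, assuming a run that should write Zstd-compressed FASTQ files plus an HTML report. The prefix &lt;code>sample&lt;/code> is a placeholder; all other values are taken from the valid values listed in the table.&lt;/p>
&lt;pre tabindex="0">&lt;code class="language-toml" data-lang="toml"># illustrative sketch, not a complete configuration
[output]
    prefix = &amp;#34;sample&amp;#34; # placeholder base name
    format = &amp;#34;Fastq&amp;#34;
    compression = &amp;#34;Zstd&amp;#34;
    compression_level = 5 # within the documented zstd range of 1-22
    report_html = true
    output_hash_compressed = true # also emit SHA-256 checksums of the compressed files
&lt;/code>&lt;/pre>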
&lt;p>Generated filenames join these components with &lt;code>ix_separator&lt;/code> (default &lt;code>_&lt;/code>), e.g. &lt;code>{prefix}_{segment}{suffix}&lt;/code>. Interleaving replaces &lt;code>segment&lt;/code> with &lt;code>interleaved&lt;/code>; demultiplexing adds per-barcode infixes before the segment. Checksums use &lt;code>.uncompressed.sha256&lt;/code> or &lt;code>.compressed.sha256&lt;/code> suffixes.&lt;/p></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/main/docs/reference/filter-steps/FilterByTag/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/reference/filter-steps/FilterByTag/</guid><description>&lt;h1 id="filterbytag">
 FilterByTag
 &lt;a class="anchor" href="#filterbytag">#&lt;/a>
&lt;/h1>
&lt;p>This transformation filters molecules based on the presence or absence of a specified tag.&lt;/p>
&lt;p>Use &amp;ldquo;Keep&amp;rdquo; to retain molecules that have the tag, or &amp;ldquo;Remove&amp;rdquo; to discard molecules that have the tag.&lt;/p>
&lt;p>If used on a boolean tag, the boolean value of the tag determines whether the molecule is kept or removed.&lt;/p>
&lt;p>For numeric tags, use &lt;a href="https://tyberiusprime.github.io/fastqrab/main/fastqrab/main/docs/reference/filter-steps/FilterByNumericTag/">FilterByNumericTag&lt;/a>.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;FilterByTag&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">in_label&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;mytag&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">keep_or_remove&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;Keep&amp;#34;&lt;/span> &lt;span style="color:#75715e"># or &amp;#34;Remove&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/main/docs/reference/report-steps/Report/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/reference/report-steps/Report/</guid><description>&lt;h1 id="report">
 Report
 &lt;a class="anchor" href="#report">#&lt;/a>
&lt;/h1>
&lt;p>Capture data for the final report (see &lt;a href="https://tyberiusprime.github.io/fastqrab/main/fastqrab/main/docs/reference/output-section/">the output section&lt;/a>).&lt;/p>
&lt;p>You can add multiple reports, at any stage of your transformation chain
to get e.g. before/after filtering reports.&lt;/p>
&lt;p>&lt;a href="../../../../html/example_report.html">Example report&lt;/a>&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#39;Report&amp;#39;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">name&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;report&amp;#34;&lt;/span> &lt;span style="color:#75715e"># Key that the report will be listed under. Must be distinct&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">count&lt;/span> = &lt;span style="color:#66d9ef">true&lt;/span> &lt;span style="color:#75715e"># count reads at this position&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">base_statistics&lt;/span> = &lt;span style="color:#66d9ef">false&lt;/span> &lt;span style="color:#75715e"># include base distribution at each read position, q20, q30, total, gc bases&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">length_distribution&lt;/span> = &lt;span style="color:#66d9ef">false&lt;/span> &lt;span style="color:#75715e"># capture read length distribution&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">duplicate_count_per_read&lt;/span> = &lt;span style="color:#66d9ef">false&lt;/span> &lt;span style="color:#75715e"># count duplicates using Cuckoo filter on each read1/read2/index1/index2&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">duplicate_count_per_fragment&lt;/span> = &lt;span style="color:#66d9ef">false&lt;/span> &lt;span style="color:#75715e"># count duplicates using Cuckoo filter, on concatenated read1/read2/index1/index2&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">count_oligos&lt;/span> = {&lt;span style="color:#a6e22e">name1&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;AGTC&amp;#34;&lt;/span>, &lt;span style="color:#a6e22e">name2&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;ACCCCC&amp;#34;&lt;/span>} &lt;span style="color:#75715e"># map of name -&amp;gt; sequence. Full match only, no iupac&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">count_oligos_segment&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;all&amp;#34;&lt;/span> &lt;span style="color:#75715e"># segment to count oligos in, can be &amp;#39;all&amp;#39;, &amp;#39;read1&amp;#39;, ...&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">tag_histograms&lt;/span> = [&lt;span style="color:#e6db74">&amp;#34;mytag&amp;#34;&lt;/span>] &lt;span style="color:#75715e"># Calculate a histogram for this tag&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Statistics available (for each &amp;lsquo;segment&amp;rsquo;. If demultiplexed, per barcode combination):&lt;/p></description></item><item><title>Barcodes section</title><link>https://tyberiusprime.github.io/fastqrab/main/docs/reference/barcodes/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/reference/barcodes/</guid><description>&lt;h1 id="barcodes-section">
 &lt;code>[barcodes.*]&lt;/code> section
 &lt;a class="anchor" href="#barcodes-section">#&lt;/a>
&lt;/h1>
&lt;p>Barcode tables supply the sequence-to-sample-name mappings used by
&lt;a href="https://tyberiusprime.github.io/fastqrab/main/fastqrab/main/docs/reference/Demultiplex/">Demultiplex&lt;/a>,
&lt;a href="https://tyberiusprime.github.io/fastqrab/main/fastqrab/main/docs/reference/tag-steps/using/HammingCorrect/">HammingCorrect&lt;/a>,
and
&lt;a href="https://tyberiusprime.github.io/fastqrab/main/fastqrab/main/docs/reference/tag-steps/using/AssignToReference/">AssignToReference&lt;/a>.&lt;/p>
&lt;p>Each table is an independent named dictionary. The name is chosen by the
user and referenced from the step that consumes it.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>&lt;span style="color:#75715e"># ignore_in_test&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>[&lt;span style="color:#a6e22e">barcodes&lt;/span>.&lt;span style="color:#a6e22e">my_barcodes&lt;/span>]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>&lt;span style="color:#a6e22e">AAAAAA&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;sample-1&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>&lt;span style="color:#a6e22e">CCCCCC&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;sample-2&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>&lt;span style="color:#a6e22e">GGGGGG&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;sample-3&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>The table may appear anywhere in the TOML file; forward and backward
references are both valid.&lt;/p>
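&lt;p>As a minimal sketch of this, the fragment below defines the table ahead of the step that references it; the reverse order is equally valid.
The step options are the ones documented for Demultiplex. The tag name &lt;code>mytag&lt;/code> is a placeholder, and the extract step that would create it is omitted.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>[&lt;span style="color:#a6e22e">barcodes&lt;/span>.&lt;span style="color:#a6e22e">my_barcodes&lt;/span>]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>&lt;span style="color:#a6e22e">AAAAAA&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;sample-1&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>&lt;span style="color:#a6e22e">CCCCCC&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;sample-2&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;Demultiplex&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">in_label&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;mytag&amp;#34;&lt;/span> &lt;span style="color:#75715e"># placeholder: a tag created by an earlier extract step&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">barcodes&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;my_barcodes&amp;#34;&lt;/span> &lt;span style="color:#75715e"># name of the table defined above&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>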
&lt;h2 id="key-format">
 Key format
 &lt;a class="anchor" href="#key-format">#&lt;/a>
&lt;/h2>
&lt;p>Keys are DNA sequences using uppercase IUPAC nucleotide codes.
All standard IUPAC ambiguity codes are accepted (e.g. &lt;code>N&lt;/code>, &lt;code>R&lt;/code>, &lt;code>Y&lt;/code>, &lt;code>W&lt;/code>, …).&lt;/p></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/main/docs/reference/Demultiplex/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/reference/Demultiplex/</guid><description>&lt;h2 id="demultiplexed-output">
 Demultiplexed output
 &lt;a class="anchor" href="#demultiplexed-output">#&lt;/a>
&lt;/h2>
&lt;p>&lt;a href="https://tyberiusprime.github.io/fastqrab/main/fastqrab/main/docs/reference/Demultiplex/">Demultiplex&lt;/a> is a &amp;lsquo;magic&amp;rsquo; transformation that forks the output.&lt;/p>
&lt;p>You receive one set of output files per barcode (combination) defined.&lt;/p>
&lt;p>Transformations downstream are (virtually) duplicated,
so you can, for example, keep only the first reads of each barcode
and get reports both for all reads combined and for each barcode separately.&lt;/p>
&lt;p>Demultiplexing can be done on barcodes, or on boolean tags, and can happen multiple times.&lt;/p>
&lt;h3 id="based-on-barcodes">
 Based on barcodes
 &lt;a class="anchor" href="#based-on-barcodes">#&lt;/a>
&lt;/h3>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;Demultiplex&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">in_label&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;mytag&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">barcodes&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;mybarcodes&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">output_unmatched&lt;/span> = &lt;span style="color:#66d9ef">true&lt;/span> &lt;span style="color:#75715e"># if set, write reads not matching any barcode&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#75715e"># to a file like output_prefix_no-barcode_1.fq&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>[&lt;span style="color:#a6e22e">barcodes&lt;/span>.&lt;span style="color:#a6e22e">mybarcodes&lt;/span>] &lt;span style="color:#75715e"># may be defined before or after the step&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>&lt;span style="color:#75715e"># separate multiple regions with a _&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>&lt;span style="color:#75715e"># a Mapping of barcode -&amp;gt; output name.&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>&lt;span style="color:#a6e22e">AAAAAA_CCCCCC&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;sample-1&amp;#34;&lt;/span> &lt;span style="color:#75715e"># output files are named prefix{ix_separator}barcode_prefix{ix_separator}segment.suffix&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#75715e"># with the separator defaulting to &amp;#39;_&amp;#39;, e.g. output_sample-1_1.fq.gz&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#75715e"># or output_sample-1_report.fq.gz&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;h3 id="based-on-boolean-tags">
 Based on boolean tags
 &lt;a class="anchor" href="#based-on-boolean-tags">#&lt;/a>
&lt;/h3>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;TagOtherFile&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">source&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;name:read1&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">out_label&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;a_bool_tag&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">filename&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;path/to/boolean_tags.tsv&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">false_positive_rate&lt;/span> = &lt;span style="color:#ae81ff">0&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;Demultiplex&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">in_label&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;a_bool_tag&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#75715e"># output_unmatched is not valid for boolean tags&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Note that this example does not
extract the barcodes from the read
(use an extract step, such as &lt;a href="https://tyberiusprime.github.io/fastqrab/main/fastqrab/main/docs/reference/tag-steps/extract/ExtractRegion/">ExtractRegion&lt;/a>).&lt;/p></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/main/docs/reference/filter-steps/FilterByNumericTag/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/reference/filter-steps/FilterByNumericTag/</guid><description>&lt;h1 id="filterbynumerictag">
 FilterByNumericTag
 &lt;a class="anchor" href="#filterbynumerictag">#&lt;/a>
&lt;/h1>
&lt;p>Remove molecules by thresholding on a numeric tag.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;CalcLength&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">out_label&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;mytag&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">segment&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;read1&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;FilterByNumericTag&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">in_label&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;mytag&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">keep_or_remove&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;Keep&amp;#34;&lt;/span> &lt;span style="color:#75715e"># or &amp;#34;Remove&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">min_value&lt;/span> = &lt;span style="color:#ae81ff">5&lt;/span> &lt;span style="color:#75715e"># &amp;gt;= this, optional&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">max_value&lt;/span> = &lt;span style="color:#ae81ff">21&lt;/span> &lt;span style="color:#75715e"># &amp;lt; this, optional&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>The example keeps only reads that are 5 to 20 bases long: &lt;code>min_value&lt;/code> is inclusive, &lt;code>max_value&lt;/code> is exclusive.&lt;/p>
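&lt;p>Since both bounds are marked optional, a one-sided threshold can be sketched by leaving one of them out.
The fragment below reuses the &lt;code>mytag&lt;/code> length tag and assumes that an omitted bound means no limit on that side, so it would keep every read of at least 5 bases.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;FilterByNumericTag&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">in_label&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;mytag&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">keep_or_remove&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;Keep&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">min_value&lt;/span> = &lt;span style="color:#ae81ff">5&lt;/span> &lt;span style="color:#75715e"># inclusive lower bound; max_value omitted (assumed: no upper limit)&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>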
&lt;p>Consider using an &lt;a href="https://tyberiusprime.github.io/fastqrab/main/fastqrab/main/docs/reference/tag-steps/convert/EvalExpression/">EvalExpression&lt;/a> for more complicated decisions.&lt;/p></description></item><item><title>Parser Architecture</title><link>https://tyberiusprime.github.io/fastqrab/main/docs/concepts/parser-architecture/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/concepts/parser-architecture/</guid><description>&lt;h1 id="parser-architecture">
 Parser Architecture
 &lt;a class="anchor" href="#parser-architecture">#&lt;/a>
&lt;/h1>
&lt;h2 id="overview">
 Overview
 &lt;a class="anchor" href="#overview">#&lt;/a>
&lt;/h2>
&lt;p>fastqrab uses a custom-built parser designed for high performance and correctness when processing FASTQ.
The parser&amp;rsquo;s design emphasizes:&lt;/p>
&lt;ol>
&lt;li>&lt;strong>Zero-copy parsing&lt;/strong> where possible to minimize memory allocations&lt;/li>
&lt;li>&lt;strong>Streaming architecture&lt;/strong> to handle files of any size&lt;/li>
&lt;li>&lt;strong>Transparent compression&lt;/strong> support (raw, gzip, zstd)&lt;/li>
&lt;li>&lt;strong>Cross-platform compatibility&lt;/strong> (Unix/Windows line endings)&lt;/li>
&lt;/ol>
&lt;p>(FASTA and BAM files are processed differently; see below.)&lt;/p>
&lt;h2 id="the-zero-copy-challenge-with-compressed-files">
 The Zero-Copy Challenge with Compressed Files
 &lt;a class="anchor" href="#the-zero-copy-challenge-with-compressed-files">#&lt;/a>
&lt;/h2>
&lt;h3 id="why-not-pure-zero-copy">
 Why Not Pure Zero-Copy?
 &lt;a class="anchor" href="#why-not-pure-zero-copy">#&lt;/a>
&lt;/h3>
&lt;p>A common optimization in bioinformatics tools is &amp;ldquo;zero-copy&amp;rdquo; parsing,
where the parser operates directly on memory-mapped file contents without allocating separate buffers.
This works well for uncompressed files stored on fast storage in suitable file formats.&lt;/p></description></item><item><title>Parser Architecture</title><link>https://tyberiusprime.github.io/fastqrab/main/docs/development/parser-architecture/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/development/parser-architecture/</guid><description>&lt;h1 id="parser-architecture">
 Parser Architecture
 &lt;a class="anchor" href="#parser-architecture">#&lt;/a>
&lt;/h1>
&lt;h2 id="overview">
 Overview
 &lt;a class="anchor" href="#overview">#&lt;/a>
&lt;/h2>
&lt;p>fastqrab uses a custom-built parser designed for high performance and correctness when processing FASTQ, FASTA, and BAM files. The parser&amp;rsquo;s design emphasizes:&lt;/p>
&lt;ol>
&lt;li>&lt;strong>Zero-copy parsing&lt;/strong> where possible to minimize memory allocations&lt;/li>
&lt;li>&lt;strong>Streaming architecture&lt;/strong> to handle files of any size&lt;/li>
&lt;li>&lt;strong>Transparent compression&lt;/strong> support (uncompressed, gzip, zstd)&lt;/li>
&lt;li>&lt;strong>Stateful parsing&lt;/strong> to handle reads spanning block boundaries&lt;/li>
&lt;li>&lt;strong>Cross-platform compatibility&lt;/strong> (Unix/Windows line endings)&lt;/li>
&lt;/ol>
&lt;h2 id="the-zero-copy-challenge-with-compressed-files">
 The Zero-Copy Challenge with Compressed Files
 &lt;a class="anchor" href="#the-zero-copy-challenge-with-compressed-files">#&lt;/a>
&lt;/h2>
&lt;h3 id="why-not-pure-zero-copy">
 Why Not Pure Zero-Copy?
 &lt;a class="anchor" href="#why-not-pure-zero-copy">#&lt;/a>
&lt;/h3>
&lt;p>A common optimization in bioinformatics tools is &amp;ldquo;zero-copy&amp;rdquo; parsing, where the parser operates directly on memory-mapped file contents without allocating separate buffers. This works well for uncompressed files stored on fast storage.&lt;/p></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/main/docs/development/coverage/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/development/coverage/</guid><description>&lt;h1 id="code-coverage">
 Code Coverage
 &lt;a class="anchor" href="#code-coverage">#&lt;/a>
&lt;/h1>
&lt;h2 id="running-coverage">
 Running coverage
 &lt;a class="anchor" href="#running-coverage">#&lt;/a>
&lt;/h2>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-bash" data-lang="bash">&lt;span style="display:flex;">&lt;span>python3 dev/coverage.py
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>This prints a line-coverage summary with exclusions applied (see below).&lt;/p>
&lt;h3 id="other-formats">
 Other formats
 &lt;a class="anchor" href="#other-formats">#&lt;/a>
&lt;/h3>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-bash" data-lang="bash">&lt;span style="display:flex;">&lt;span>python3 dev/coverage.py --lcov &lt;span style="color:#75715e"># writes coverage.lcov&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>python3 dev/coverage.py --html &lt;span style="color:#75715e"># writes coverage-html/html/index.html&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>python3 dev/coverage.py --json &lt;span style="color:#75715e"># writes coverage.json&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>python3 dev/coverage.py --all &lt;span style="color:#75715e"># all of the above + summary&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>python3 dev/coverage.py --open &lt;span style="color:#75715e"># --html and open in browser&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>The summary, LCOV, and HTML outputs all apply exclusions. HTML is rendered by
&lt;code>genhtml&lt;/code> (from the &lt;code>lcov&lt;/code> package, included in the nix dev shell) from the
post-processed lcov file; excluded lines appear as uninstrumented (white, not
red). JSON is generated directly by &lt;code>cargo-llvm-cov&lt;/code> and does &lt;strong>not&lt;/strong> reflect
exclusion comments.&lt;/p></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/main/docs/reference/filter-steps/FilterEmpty/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/reference/filter-steps/FilterEmpty/</guid><description>&lt;h2 id="filterempty">
 FilterEmpty
 &lt;a class="anchor" href="#filterempty">#&lt;/a>
&lt;/h2>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;FilterEmpty&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">segment&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;read1&amp;#34;&lt;/span> &lt;span style="color:#75715e"># Any of your input segments, or &amp;#39;All&amp;#39;&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Drop the molecule if the read has length 0.
(Use after other processing.)&lt;/p>
&lt;p>This gets expanded to an internal tag and a FilterByNumericTag step.&lt;/p>
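&lt;p>A rough sketch of what such an expansion could look like when written out by hand, using the CalcLength and FilterByNumericTag steps from this reference.
The tag name &lt;code>read1_length&lt;/code> is hypothetical; the internal tag that fastqrab generates is not documented here and may differ.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;CalcLength&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">out_label&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;read1_length&amp;#34;&lt;/span> &lt;span style="color:#75715e"># hypothetical tag name&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">segment&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;read1&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;FilterByNumericTag&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">in_label&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;read1_length&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">keep_or_remove&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;Keep&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">min_value&lt;/span> = &lt;span style="color:#ae81ff">1&lt;/span> &lt;span style="color:#75715e"># keep only reads with at least one base&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>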
&lt;p>This is necessary if your modifications can produce &amp;lsquo;empty&amp;rsquo; reads;
downstream aligners like STAR tend to dislike these in their input.&lt;/p>
&lt;p>With segment=&amp;lsquo;All&amp;rsquo;, a molecule is only dropped if it is empty in all parts.
Use multiple &lt;a href="https://tyberiusprime.github.io/fastqrab/main/fastqrab/main/docs/reference/filter-steps/FilterEmpty/">FilterEmpty&lt;/a> steps to filter if any part is empty.&lt;/p></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/main/docs/reference/modification-steps/Postfix/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/reference/modification-steps/Postfix/</guid><description>&lt;h1 id="postfix">
 Postfix
 &lt;a class="anchor" href="#postfix">#&lt;/a>
&lt;/h1>
&lt;p>Add DNA to the end of read sequences.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;Postfix&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">seq&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;agtc&amp;#34;&lt;/span> &lt;span style="color:#75715e"># DNA sequence to add at end of reads. Checked to consist only of a, g, t, c, n&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">qual&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;IIII&amp;#34;&lt;/span> &lt;span style="color:#75715e"># same length as seq. Your responsibility to have valid phred values&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">segment&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;read1&amp;#34;&lt;/span> &lt;span style="color:#75715e"># Any of your input segments&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">if_tag&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;mytag&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">encoding&lt;/span> = &lt;span style="color:#e6db74">&amp;#39;Illumina1.8&amp;#39;&lt;/span> &lt;span style="color:#75715e"># optional, default=sanger &amp;#39;Illumina1.8|Illumina1.3|Sanger|Solexa&amp;#39;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#75715e"># Illumina1.8 is an alias for Sanger.&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>This transformation adds a specified sequence and corresponding quality scores to the end of reads.&lt;/p></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/main/docs/reference/modification-steps/Prefix/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/reference/modification-steps/Prefix/</guid><description>&lt;h1 id="prefix">
 Prefix
 &lt;a class="anchor" href="#prefix">#&lt;/a>
&lt;/h1>
&lt;p>Add DNA to the beginning of read sequences.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;Prefix&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">seq&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;agtTCAa&amp;#34;&lt;/span> &lt;span style="color:#75715e"># DNA sequence to add at beginning of reads. Checked to consist only of a, g, t, c, n&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">qual&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;IIIBIII&amp;#34;&lt;/span> &lt;span style="color:#75715e"># same length as seq. Your responsibility to have valid phred values&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">segment&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;read1&amp;#34;&lt;/span> &lt;span style="color:#75715e"># Any of your input segments&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">if_tag&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;mytag&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">encoding&lt;/span> = &lt;span style="color:#e6db74">&amp;#39;Illumina1.8&amp;#39;&lt;/span> &lt;span style="color:#75715e"># optional, default=sanger &amp;#39;Illumina1.8|Illumina1.3|Sanger|Solexa&amp;#39;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#75715e"># Illumina1.8 is an alias for Sanger.&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>This transformation adds a specified sequence and corresponding quality scores to the beginning of reads.&lt;/p></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/main/docs/reference/modification-steps/ReplaceTagWithLetter/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/reference/modification-steps/ReplaceTagWithLetter/</guid><description>&lt;h1 id="replacetagwithletter">
 ReplaceTagWithLetter
 &lt;a class="anchor" href="#replacetagwithletter">#&lt;/a>
&lt;/h1>
&lt;p>Replace sequence bases in tagged regions with a specified letter.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;ReplaceTagWithLetter&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">in_label&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;mytag&amp;#34;&lt;/span> &lt;span style="color:#75715e"># Tag containing regions to replace&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">letter&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;N&amp;#34;&lt;/span> &lt;span style="color:#75715e"># Replacement character (defaults to &amp;#39;N&amp;#39;)&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>This transformation replaces all sequence bases within the regions defined by a
tag with a specified replacement character. Quality scores are preserved
unchanged. This is commonly used to mask low-quality regions as &amp;lsquo;N&amp;rsquo; characters.&lt;/p>
&lt;h2 id="parameters">
 Parameters
 &lt;a class="anchor" href="#parameters">#&lt;/a>
&lt;/h2>
&lt;ul>
&lt;li>&lt;code>in_label&lt;/code>: Name of the tag containing regions to be replaced&lt;/li>
&lt;li>&lt;code>letter&lt;/code>: Single character to replace bases with (defaults to &amp;lsquo;N&amp;rsquo; if not specified)&lt;/li>
&lt;/ul>
&lt;h2 id="example-use-cases">
 Example Use Cases
 &lt;a class="anchor" href="#example-use-cases">#&lt;/a>
&lt;/h2>
&lt;ul>
&lt;li>Mask low-quality bases identified by &lt;a href="https://tyberiusprime.github.io/fastqrab/main/fastqrab/main/docs/reference/tag-steps/extract/ExtractRegionsOfLowQuality/">ExtractRegionsOfLowQuality&lt;/a>&lt;/li>
&lt;li>Replace specific sequence motifs identified by other extraction steps&lt;/li>
&lt;li>Convert tagged regions to ambiguous bases for downstream analysis&lt;/li>
&lt;/ul>
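&lt;p>The second use case can be sketched by pairing an extraction step with this one.
The fragment assumes that ExtractRegex produces a tag carrying location information; the tag name &lt;code>motif&lt;/code> and the search pattern are illustrative only.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;ExtractRegex&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">out_label&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;motif&amp;#34;&lt;/span> &lt;span style="color:#75715e"># hypothetical tag name&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">search&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;GAATTC&amp;#34;&lt;/span> &lt;span style="color:#75715e"># illustrative motif, not taken from the fastqrab docs&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">source&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;read1&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;ReplaceTagWithLetter&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">in_label&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;motif&amp;#34;&lt;/span> &lt;span style="color:#75715e"># the tag created by the step above&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">letter&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;N&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>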
&lt;p>The tag must have been created by a previous extraction step and must contain location information.&lt;/p></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/main/docs/reference/modification-steps/TrimAtTag/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/reference/modification-steps/TrimAtTag/</guid><description>&lt;h1 id="trimattag">
 TrimAtTag
 &lt;a class="anchor" href="#trimattag">#&lt;/a>
&lt;/h1>
&lt;p>Trim the read at the position of a tag.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;TrimAtTag&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">in_label&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;mytag&amp;#34;&lt;/span> &lt;span style="color:#75715e"># must be a location tag&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">direction&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;Start&amp;#34;&lt;/span> &lt;span style="color:#75715e"># or &amp;#34;End&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">keep_tag&lt;/span> = &lt;span style="color:#66d9ef">false&lt;/span> &lt;span style="color:#75715e"># if true, the tag sequence is kept in the read&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>This transformation trims the read at the position where a tag was found.&lt;/p>
&lt;p>The &lt;code>direction&lt;/code> parameter determines whether to trim from the start or end of the tag,
and &lt;code>keep_tag&lt;/code> determines whether the tag sequence itself is retained.&lt;/p></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/main/docs/reference/report-steps/QuantifyTag/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/reference/report-steps/QuantifyTag/</guid><description>&lt;h1 id="quantifytag">
 QuantifyTag
 &lt;a class="anchor" href="#quantifytag">#&lt;/a>
&lt;/h1>
&lt;p>Count the occurrences of each tag sequence.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;QuantifyTag&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">in_label&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;mytag&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">infix&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;tagcount&amp;#34;&lt;/span> &lt;span style="color:#75715e"># output file is output{ix_separator}tagcount.qr.json (default &amp;#39;_&amp;#39; → output_tagcount.qr.json)&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">region_separator&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;_&amp;#34;&lt;/span> &lt;span style="color:#75715e"># optional. If the tag consists of multiple regions, join them with this string&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>This transformation counts how many times each unique tag value appears and outputs
the results to a JSON file.&lt;/p>
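&lt;p>A sketch of a complete tag-then-count pair, combining the ExtractRegex example from this reference with the step above.
Under these settings it would count how often each two-base value between the &lt;code>CT&lt;/code> anchors occurs; the choice of pattern is illustrative.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;ExtractRegex&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">out_label&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;mytag&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">search&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;^CT(..)CT&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">replacement&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;$1&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">source&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;read1&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;QuantifyTag&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">in_label&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;mytag&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">infix&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;tagcount&amp;#34;&lt;/span> &lt;span style="color:#75715e"># results go to output_tagcount.qr.json with the default separator&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>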
&lt;h3 id="demultiplex-interaction">
 Demultiplex interaction
 &lt;a class="anchor" href="#demultiplex-interaction">#&lt;/a>
&lt;/h3>
&lt;p>Barcodes are counted per demultiplexed stream.&lt;/p></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/main/docs/reference/tag-steps/calc/CalcLength/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/reference/tag-steps/calc/CalcLength/</guid><description>&lt;h1 id="calclength">
 CalcLength
 &lt;a class="anchor" href="#calclength">#&lt;/a>
&lt;/h1>
&lt;p>Extract the length of a read as a tag.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;CalcLength&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">out_label&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;mytag&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">segment&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;read1&amp;#34;&lt;/span> &lt;span style="color:#75715e"># Any of your input segments, or &amp;#39;All&amp;#39;&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>This transformation creates a tag containing the length of the specified read.&lt;/p></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/main/docs/reference/tag-steps/extract/ExtractRegex/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/reference/tag-steps/extract/ExtractRegex/</guid><description>&lt;h1 id="extractregex">
 ExtractRegex
 &lt;a class="anchor" href="#extractregex">#&lt;/a>
&lt;/h1>
&lt;p>Extract the result of a regular expression search. Stores an empty string if there is no match.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;ExtractRegex&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">out_label&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;mytag&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">search&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;^CT(..)CT&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">replacement&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;$1&amp;#34;&lt;/span> &lt;span style="color:#75715e"># optional. standard regex replacement syntax&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">source&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;read1&amp;#34;&lt;/span> &lt;span style="color:#75715e"># An input segment (to read from sequence), or name:&amp;lt;segment&amp;gt; to read from a read&amp;#39;s name.&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>This transformation searches for a regular expression pattern in the specified
read and extracts the matching portion as a tag.&lt;/p>
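&lt;p>As a sketch of the &lt;code>name:&amp;lt;segment&amp;gt;&lt;/code> source form mentioned in the comment above, the step below would pull an eight-base UMI out of a read name. The name layout (&lt;code>..._ACGTACGT&lt;/code>) and the label &lt;code>umi&lt;/code> are assumptions made for illustration, and exactly which part of the name the pattern is matched against is not stated here.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">[[step]]
 action = &amp;#34;ExtractRegex&amp;#34;
 out_label = &amp;#34;umi&amp;#34; # hypothetical label
 search = &amp;#34;_([ACGT]{8})&amp;#34; # assumes names carry the UMI after an underscore
 replacement = &amp;#34;$1&amp;#34; # keep only the captured bases
 source = &amp;#34;name:read1&amp;#34; # search the name of read1, not its sequence
&lt;/code>&lt;/pre>&lt;/div>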
&lt;p>The value stored in the tag is the match after &lt;code>replacement&lt;/code> has been applied.&lt;/p></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/main/docs/reference/tag-steps/extract/ExtractRegion/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/reference/tag-steps/extract/ExtractRegion/</guid><description>&lt;h1 id="extractregion">
 ExtractRegion
 &lt;a class="anchor" href="#extractregion">#&lt;/a>
&lt;/h1>
&lt;p>Extract a fixed-position region.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;ExtractRegion&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">start&lt;/span> = &lt;span style="color:#ae81ff">5&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">length&lt;/span> = &lt;span style="color:#ae81ff">8&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">source&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;read1&amp;#34;&lt;/span> &lt;span style="color:#75715e"># Any of your input segments, name:&amp;lt;segment_name&amp;gt; or tag:&amp;lt;any_location_or_string_tag&amp;gt;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">anchor&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;Start&amp;#34;&lt;/span> &lt;span style="color:#75715e"># &amp;#39;Start&amp;#39; or &amp;#39;End&amp;#39; - start is relative to this&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">out_label&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;umi&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>This transformation extracts a fixed-length region from the specified read at a
given position and stores it as a tag.&lt;/p>
&lt;p>Start positions are 0-based.&lt;/p>
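&lt;p>A minimal sketch of the &lt;code>End&lt;/code> anchor, using only the options shown above: with a negative start counted from the end of the read, this step should tag the final base of &lt;code>read1&lt;/code>. The label &lt;code>last_base&lt;/code> is an arbitrary name chosen for illustration.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">[[step]]
 action = &amp;#34;ExtractRegion&amp;#34;
 start = -1 # counted from the end of the read
 length = 1
 source = &amp;#34;read1&amp;#34;
 anchor = &amp;#34;End&amp;#34;
 out_label = &amp;#34;last_base&amp;#34; # hypothetical label
&lt;/code>&lt;/pre>&lt;/div>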
&lt;p>End positions require negative starts (as in Python: start=-1, length=1 is the last character).&lt;/p></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/main/docs/reference/tag-steps/extract/ExtractRegions/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/reference/tag-steps/extract/ExtractRegions/</guid><description>&lt;h1 id="extractregions">
 ExtractRegions
 &lt;a class="anchor" href="#extractregions">#&lt;/a>
&lt;/h1>
&lt;p>Extract from multiple regions with flexible source and anchoring options.&lt;/p>
&lt;h2 id="basic-usage">
 Basic Usage
 &lt;a class="anchor" href="#basic-usage">#&lt;/a>
&lt;/h2>
&lt;p>Extract from fixed positions in sequence data:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;ExtractRegions&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">regions&lt;/span> = [
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> {&lt;span style="color:#a6e22e">source&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;read1&amp;#34;&lt;/span>, &lt;span style="color:#a6e22e">start&lt;/span> = &lt;span style="color:#ae81ff">0&lt;/span>, &lt;span style="color:#a6e22e">length&lt;/span> = &lt;span style="color:#ae81ff">8&lt;/span>, &lt;span style="color:#a6e22e">anchor&lt;/span>= &lt;span style="color:#e6db74">&amp;#34;Start&amp;#34;&lt;/span>},
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> {&lt;span style="color:#a6e22e">source&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;read1&amp;#34;&lt;/span>, &lt;span style="color:#a6e22e">start&lt;/span> = &lt;span style="color:#ae81ff">-12&lt;/span>, &lt;span style="color:#a6e22e">length&lt;/span> = &lt;span style="color:#ae81ff">4&lt;/span>, &lt;span style="color:#a6e22e">anchor&lt;/span>=&lt;span style="color:#e6db74">&amp;#34;End&amp;#34;&lt;/span>},
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> ]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">out_label&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;barcode&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;h2 id="advanced-usage-with-sources">
 Advanced Usage with Sources
 &lt;a class="anchor" href="#advanced-usage-with-sources">#&lt;/a>
&lt;/h2>
&lt;p>Extract from tag-derived positions:&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>&lt;span style="color:#75715e"># First create an anchor tag&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;ExtractIUPAC&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">search&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;CAYA&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">out_label&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;anchor_tag&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">segment&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;read1&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">anchor&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;Anywhere&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">max_mismatches&lt;/span> = &lt;span style="color:#ae81ff">0&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>&lt;span style="color:#75715e"># Then extract relative to that anchor&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;ExtractRegions&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">regions&lt;/span> = [
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> { &lt;span style="color:#a6e22e">source&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;tag:anchor_tag&amp;#34;&lt;/span>, &lt;span style="color:#a6e22e">start&lt;/span> = &lt;span style="color:#ae81ff">-2&lt;/span>, &lt;span style="color:#a6e22e">length&lt;/span> = &lt;span style="color:#ae81ff">4&lt;/span>, &lt;span style="color:#a6e22e">anchor&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;Start&amp;#34;&lt;/span> },
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> { &lt;span style="color:#a6e22e">source&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;tag:anchor_tag&amp;#34;&lt;/span>, &lt;span style="color:#a6e22e">start&lt;/span> = &lt;span style="color:#ae81ff">4&lt;/span>, &lt;span style="color:#a6e22e">length&lt;/span> = &lt;span style="color:#ae81ff">1&lt;/span>, &lt;span style="color:#a6e22e">anchor&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;Start&amp;#34;&lt;/span> }
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> ]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">out_label&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;relative_regions&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Extract from read names:&lt;/p></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/main/docs/reference/tag-steps/extract/ExtractRegionsOfLowQuality/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/reference/tag-steps/extract/ExtractRegionsOfLowQuality/</guid><description>&lt;h1 id="extractregionsoflowquality">
 ExtractRegionsOfLowQuality
 &lt;a class="anchor" href="#extractregionsoflowquality">#&lt;/a>
&lt;/h1>
&lt;p>Extract regions where bases have quality scores below threshold, with a minimum length requirement.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;ExtractRegionsOfLowQuality&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">segment&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;read1&amp;#34;&lt;/span> &lt;span style="color:#75715e"># Any of your input segments&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">min_quality&lt;/span> = &lt;span style="color:#ae81ff">60&lt;/span> &lt;span style="color:#75715e"># Quality threshold (Phred+33)&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">min_length&lt;/span> = &lt;span style="color:#ae81ff">5&lt;/span> &lt;span style="color:#75715e"># Minimum region length (bp)&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">out_label&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;low_quality_regions&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>This transformation scans through quality scores of the specified segment and identifies contiguous regions where quality scores are below the specified threshold. Each low-quality region that meets the minimum length requirement becomes a tagged region with location information (start position and length).&lt;/p></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/main/docs/reference/tag-steps/tag/TagOtherFile/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/reference/tag-steps/tag/TagOtherFile/</guid><description>&lt;h1 id="tagotherfile">
 TagOtherFile
 &lt;a class="anchor" href="#tagotherfile">#&lt;/a>
&lt;/h1>
&lt;p>Marks reads based on whether &amp;rsquo;they&amp;rsquo; are present in another file.&lt;/p>
&lt;p>Supports comparing by read sequence, read name, and tags.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;TagOtherFile&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">source&lt;/span> = &lt;span style="color:#e6db74">&amp;#39;read1&amp;#39;&lt;/span> &lt;span style="color:#75715e"># &amp;lt;segment&amp;gt;, name:&amp;lt;segment&amp;gt; or tag:&amp;lt;tag_name&amp;gt;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">out_label&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;present_in_other_file&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">filename&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;names.fastq&amp;#34;&lt;/span> &lt;span style="color:#75715e"># Can read fastq (also compressed), or SAM/BAM, or fasta files&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">false_positive_rate&lt;/span> = &lt;span style="color:#ae81ff">0.01&lt;/span> &lt;span style="color:#75715e"># false positive rate (0..1)&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">seed&lt;/span> = &lt;span style="color:#ae81ff">42&lt;/span> &lt;span style="color:#75715e"># seed for randomness&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">include_mapped&lt;/span> = &lt;span style="color:#66d9ef">true&lt;/span> &lt;span style="color:#75715e"># in case of BAM/SAM, whether to include aligned reads&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">include_unmapped&lt;/span> = &lt;span style="color:#66d9ef">true&lt;/span> &lt;span style="color:#75715e"># in case of BAM/SAM, whether to include unaligned reads&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#75715e"># other_read_name_end_character = &amp;#34; &amp;#34; # in name: mode, cut the other file&amp;#39;s read names at this character&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>This step annotates reads by comparing them to another file.&lt;/p></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/main/docs/reference/tag-steps/using/ConcatTags/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/reference/tag-steps/using/ConcatTags/</guid><description>&lt;h1 id="concattags">
 ConcatTags
 &lt;a class="anchor" href="#concattags">#&lt;/a>
&lt;/h1>
&lt;p>Concatenate multiple tags into a single tag.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>&lt;span style="color:#75715e"># ignore_in_test&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;ConcatTags&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">in_labels&lt;/span> = [&lt;span style="color:#e6db74">&amp;#34;mytag&amp;#34;&lt;/span>, &lt;span style="color:#e6db74">&amp;#34;mytag2&amp;#34;&lt;/span>] &lt;span style="color:#75715e"># list of tags to concatenate (minimum 2)&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">out_label&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;combined&amp;#34;&lt;/span> &lt;span style="color:#75715e"># output tag name&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">on_missing&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;merge_present&amp;#34;&lt;/span> &lt;span style="color:#75715e"># required: &amp;#34;merge_present&amp;#34; or &amp;#34;set_missing&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">separator&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;_&amp;#34;&lt;/span> &lt;span style="color:#75715e"># (optional) separator for string concatenation&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>This transformation combines multiple tags into a single output tag. The behavior depends on the types of input tags and how missing tags are handled.&lt;/p>
&lt;h2 id="behavior-by-tag-type">
 Behavior by Tag Type
 &lt;a class="anchor" href="#behavior-by-tag-type">#&lt;/a>
&lt;/h2>
&lt;h3 id="location-tags-only">
 Location Tags Only
 &lt;a class="anchor" href="#location-tags-only">#&lt;/a>
&lt;/h3>
&lt;p>When all input tags are location tags (e.g., from &lt;code>ExtractIUPAC&lt;/code>, &lt;code>ExtractAnchor&lt;/code>), the transformation:&lt;/p></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/main/docs/reference/tag-steps/using/ForgetAllTags/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/reference/tag-steps/using/ForgetAllTags/</guid><description>&lt;h1 id="forgetalltags">
 ForgetAllTags
 &lt;a class="anchor" href="#forgetalltags">#&lt;/a>
&lt;/h1>
&lt;p>Remove every tag currently stored for the read batch.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;ForgetAllTags&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Use this when you want to clear all tag labels before continuing.
It is handy after persisting tags to an external table, or before running
steps that must not see previously extracted metadata.&lt;/p></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/main/docs/reference/tag-steps/using/ForgetTag/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/reference/tag-steps/using/ForgetTag/</guid><description>&lt;h1 id="forgettag">
 ForgetTag
 &lt;a class="anchor" href="#forgettag">#&lt;/a>
&lt;/h1>
&lt;p>Forget about a tag. Useful if you want to store tags in a table, but not this one.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;ForgetTag&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">in_label&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;mytag&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>This transformation removes a specified tag from the molecule&amp;rsquo;s tag collection.&lt;/p></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/main/docs/reference/tag-steps/using/StoreTagBackInSequence/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/reference/tag-steps/using/StoreTagBackInSequence/</guid><description>&lt;h1 id="storetagbackinsequence">
 StoreTagBackInSequence
 &lt;a class="anchor" href="#storetagbackinsequence">#&lt;/a>
&lt;/h1>
&lt;p>Store the tag&amp;rsquo;s replacement in the sequence, replacing the original sequence at that location.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;StoreTagBackInSequence&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">in_label&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;mytag&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">ignore_missing&lt;/span> = &lt;span style="color:#66d9ef">true&lt;/span> &lt;span style="color:#75715e"># if false, an error is raised if the tag is missing&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>This transformation stores the tag&amp;rsquo;s value back into the sequence, replacing the original sequence at that location.&lt;/p>
&lt;p>Note that if this changes the length of the sequence, existing location tags will lose their location data (though they retain their sequence).&lt;/p></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/main/docs/reference/tag-steps/using/StoreTagInComment/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/reference/tag-steps/using/StoreTagInComment/</guid><description>&lt;h1 id="storetagincomment">
 StoreTagInComment
 &lt;a class="anchor" href="#storetagincomment">#&lt;/a>
&lt;/h1>
&lt;p>Store currently present tags as comments on read names.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;StoreTagInComment&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">in_labels&lt;/span> = [&lt;span style="color:#e6db74">&amp;#34;mytag&amp;#34;&lt;/span>] &lt;span style="color:#75715e"># Store these tags. Not optional. May be a single string as well. Alias &amp;#39;in_label&amp;#39;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">segment&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;read1&amp;#34;&lt;/span> &lt;span style="color:#75715e"># Any of your input segments, or &amp;#39;All&amp;#39;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">comment_insert_char&lt;/span> = &lt;span style="color:#e6db74">&amp;#34; &amp;#34;&lt;/span> &lt;span style="color:#75715e"># (optional) char at which to insert comments&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">comment_separator&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;|&amp;#34;&lt;/span> &lt;span style="color:#75715e"># (optional) char to separate comments&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">region_separator&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;_&amp;#34;&lt;/span> &lt;span style="color:#75715e"># (optional) char to separate regions in a tag, if it has multiple&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Comments are key=value pairs, separated by &lt;code>comment_separator&lt;/code> which defaults to &amp;lsquo;|&amp;rsquo;.
They get inserted before the first &lt;code>comment_insert_char&lt;/code>, which defaults to
&lt;a href="https://tyberiusprime.github.io/fastqrab/main/fastqrab/main/docs/reference/input-section/#input-options">&lt;code>input.options.read_comment_char&lt;/code>&lt;/a>.&lt;/p></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/main/docs/reference/tag-steps/using/StoreTagsInTable/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/reference/tag-steps/using/StoreTagsInTable/</guid><description>&lt;h1 id="storetagsintable">
 StoreTagsInTable
 &lt;a class="anchor" href="#storetagsintable">#&lt;/a>
&lt;/h1>
&lt;p>Store the tags in a TSV table.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;StoreTagsInTable&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">infix&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;tags&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">compression&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;Raw&amp;#34;&lt;/span> &lt;span style="color:#75715e"># Raw, Gzip, Zstd&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">region_separator&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;_&amp;#34;&lt;/span> &lt;span style="color:#75715e"># (optional) char to separate regions in a tag, if it has multiple&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">in_labels&lt;/span> = [&lt;span style="color:#e6db74">&amp;#34;mytag&amp;#34;&lt;/span>, ] &lt;span style="color:#75715e"># Store just these tags. Optional, all tags are stored if not set&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">include_read_name&lt;/span> = &lt;span style="color:#66d9ef">true&lt;/span> &lt;span style="color:#75715e"># (optional) include the ReadName column. Default: true&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>This transformation writes all current tags to a tab-separated values (TSV) table file for further analysis.&lt;/p></description></item><item><title>Extract IUPAC</title><link>https://tyberiusprime.github.io/fastqrab/main/docs/reference/tag-steps/extract/ExtractIUPAC/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/reference/tag-steps/extract/ExtractIUPAC/</guid><description>&lt;h1 id="extractiupac">
 ExtractIUPAC
 &lt;a class="anchor" href="#extractiupac">#&lt;/a>
&lt;/h1>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;ExtractIUPAC&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">out_label&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;mytag&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">anchor&lt;/span> = &lt;span style="color:#e6db74">&amp;#39;Left&amp;#39;&lt;/span> &lt;span style="color:#75715e"># Left | Right | Anywhere&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">search&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;CTN&amp;#34;&lt;/span> &lt;span style="color:#75715e"># what we are searching. May also be a list [&amp;#34;CTN&amp;#34;, &amp;#34;GAN&amp;#34;, ...]&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">segment&lt;/span> = &lt;span style="color:#e6db74">&amp;#39;read1&amp;#39;&lt;/span> &lt;span style="color:#75715e"># Any of your input segments&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">max_mismatches&lt;/span> = &lt;span style="color:#ae81ff">0&lt;/span> &lt;span style="color:#75715e"># required. How many mismatches are allowed&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Search for and extract a sequence from the read, defined by an &lt;a href="https://doi.org/10.1093%2Fnar%2F13.9.3021">IUPAC string&lt;/a>.&lt;/p>
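&lt;p>A sketch of the list form of &lt;code>search&lt;/code> noted in the comment above, assuming two alternative patterns expected at the left end of the read. The label &lt;code>prefix&lt;/code> is made up for illustration, and how a read matching more than one pattern is handled is not stated here.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">[[step]]
 action = &amp;#34;ExtractIUPAC&amp;#34;
 out_label = &amp;#34;prefix&amp;#34; # hypothetical label
 anchor = &amp;#34;Left&amp;#34;
 search = [&amp;#34;CTN&amp;#34;, &amp;#34;GAN&amp;#34;] # either pattern may match
 segment = &amp;#34;read1&amp;#34;
 max_mismatches = 0
&lt;/code>&lt;/pre>&lt;/div>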
&lt;p>Anchor is the regex equivalent of ^ (Left), $ (Right) or no anchor (Anywhere).&lt;/p></description></item><item><title>Extract IUPAC suffix</title><link>https://tyberiusprime.github.io/fastqrab/main/docs/reference/tag-steps/extract/ExtractIUPACSuffix/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/reference/tag-steps/extract/ExtractIUPACSuffix/</guid><description>&lt;h1 id="extractiupacsuffix">
 ExtractIUPACSuffix
 &lt;a class="anchor" href="#extractiupacsuffix">#&lt;/a>
&lt;/h1>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;ExtractIUPACSuffix&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">out_label&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;mytag&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">search&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;AGTCA&amp;#34;&lt;/span> &lt;span style="color:#75715e"># the adapter to trim. Straight bases only, no IUPAC.&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">segment&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;read1&amp;#34;&lt;/span> &lt;span style="color:#75715e"># Any of your input segments (default: read1)&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">min_length&lt;/span> = &lt;span style="color:#ae81ff">3&lt;/span> &lt;span style="color:#75715e"># uint, the minimum length of match between the end of the read and&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#75715e"># the start of the adapter&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">max_mismatches&lt;/span> = &lt;span style="color:#ae81ff">0&lt;/span> &lt;span style="color:#75715e"># required. How many mismatches to accept&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Find a potentially truncated &lt;a href="https://doi.org/10.1093%2Fnar%2F13.9.3021">IUPAC string&lt;/a> sequence at the end of a read.&lt;/p>
&lt;p>This is a simple comparison that allows up to &lt;code>max_mismatches&lt;/code> mismatches (Hamming distance) and requires only the first &lt;code>min_length&lt;/code>
bases of the query to match at the end of the read.&lt;/p></description></item><item><title>Extract IUPAC with Indels</title><link>https://tyberiusprime.github.io/fastqrab/main/docs/reference/tag-steps/extract/ExtractIUPACWithIndel/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/reference/tag-steps/extract/ExtractIUPACWithIndel/</guid><description>&lt;h1 id="extractiupacwithindel">
 ExtractIUPACWithIndel
 &lt;a class="anchor" href="#extractiupacwithindel">#&lt;/a>
&lt;/h1>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;ExtractIUPACWithIndel&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">out_label&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;adapter&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">search&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;AGTC&amp;#34;&lt;/span> &lt;span style="color:#75715e"># IUPAC pattern to align against&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">max_mismatches&lt;/span> = &lt;span style="color:#ae81ff">1&lt;/span> &lt;span style="color:#75715e"># allowed substitutions (IUPAC-aware)&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">max_indel_bases&lt;/span> = &lt;span style="color:#ae81ff">1&lt;/span> &lt;span style="color:#75715e"># total insertions + deletions allowed&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">max_total_edits&lt;/span> = &lt;span style="color:#ae81ff">2&lt;/span> &lt;span style="color:#75715e"># optional overall edit ceiling&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">anchor&lt;/span> = &lt;span style="color:#e6db74">&amp;#39;Anywhere&amp;#39;&lt;/span> &lt;span style="color:#75715e"># Left | Right | Anywhere&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">segment&lt;/span> = &lt;span style="color:#e6db74">&amp;#39;read1&amp;#39;&lt;/span> &lt;span style="color:#75715e"># defaults to read1&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Locate an &lt;a href="https://doi.org/10.1093%2Fnar%2F13.9.3021">IUPAC&lt;/a> pattern even when the read contains small insertions or deletions relative to the pattern. The extractor performs a semiglobal alignment (pattern vs. read segment) using IUPAC-aware scoring and returns the aligned span as a location tag.&lt;/p></description></item><item><title>Store Tag In FASTQ</title><link>https://tyberiusprime.github.io/fastqrab/main/docs/reference/tag-steps/using/StoreTagInFastQ/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/reference/tag-steps/using/StoreTagInFastQ/</guid><description>&lt;h1 id="storetaginfastq">
 StoreTagInFastQ
 &lt;a class="anchor" href="#storetaginfastq">#&lt;/a>
&lt;/h1>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>&lt;span style="color:#75715e"># Store the content of a tag in a fastq file.&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>&lt;span style="color:#75715e"># Needs a &amp;#39;location&amp;#39; tag.&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>&lt;span style="color:#75715e"># Can store other tags in the read name.&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>&lt;span style="color:#75715e"># quality scores are set to &amp;#39;~&amp;#39;.&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>&lt;span style="color:#75715e"># With demultiplexing: creates separate files per barcode&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> [[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;StoreTagInFastQ&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">in_label&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;mytag&amp;#34;&lt;/span> &lt;span style="color:#75715e"># tag to store. File name is derived with this as infix&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">format&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;Fastq&amp;#34;&lt;/span> &lt;span style="color:#75715e"># FASTQ / FASTA / BAM&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">compression&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;gz&amp;#34;&lt;/span> &lt;span style="color:#75715e"># or &amp;#34;zstd&amp;#34; | &amp;#34;none&amp;#34; # (optional) compression format, not if format == BAM&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">compression_level&lt;/span> = &lt;span style="color:#ae81ff">6&lt;/span> &lt;span style="color:#75715e"># (optional) compression level for gzip (0-9) or zstd (1-22)&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> 					 &lt;span style="color:#75715e"># defaults: gzip=6, zstd=5&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">comment_tags&lt;/span> = []&lt;span style="color:#75715e"># e.g. [&amp;#34;other_tag&amp;#34;] # see StoreTagInComment&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">comment_insert_char&lt;/span> = &lt;span style="color:#e6db74">&amp;#39; &amp;#39;&lt;/span> &lt;span style="color:#75715e"># (optional) char at which to insert comments&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">comment_separator&lt;/span> = &lt;span style="color:#e6db74">&amp;#39;|&amp;#39;&lt;/span> &lt;span style="color:#75715e"># (optional) char to separate comments&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">region_separator&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;_&amp;#34;&lt;/span> &lt;span style="color:#75715e"># (optional) char to separate regions in a tag, if it has multiple&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Store the sequence of a tag in a fastq file,
with other tags (and virtual tags) optionally stored in the read name as comments.&lt;/p></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/main/docs/reference/tag-steps/calc/CalcExpectedError/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/reference/tag-steps/calc/CalcExpectedError/</guid><description>&lt;h1 id="calcexpectederror">
 CalcExpectedError
 &lt;a class="anchor" href="#calcexpectederror">#&lt;/a>
&lt;/h1>
&lt;p>Compute aggregated per-base error probabilities (expected errors) for each read assuming PHRED+33 qualities.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;CalcExpectedError&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">out_label&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;expected_error&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">aggregate&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;sum&amp;#34;&lt;/span> &lt;span style="color:#75715e"># or &amp;#34;max&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">segment&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;read1&amp;#34;&lt;/span> &lt;span style="color:#75715e"># Any of your input segments, or &amp;#39;All&amp;#39;&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>If your data is not encoded as Phred+33, convert it first (for example, with &lt;code>ConvertQuality&lt;/code>) before running this step. Values outside the Phred+33 range abort the run with an error.&lt;/p>
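&lt;p>As an illustration only, the calculation can be sketched in Python using the standard Phred relation (error probability = 10^(-Q/10), with Q = ASCII code - 33). The function name &lt;code>expected_error&lt;/code> and the accepted character range (33 to 126) are assumptions made for this sketch, not fastqrab's implementation.&lt;/p>

```python
import math


def expected_error(qualities, aggregate="sum"):
    """Aggregate per-base error probabilities of a Phred+33 quality string.

    Hypothetical helper for illustration; not part of fastqrab.
    """
    probabilities = []
    for ch in qualities:
        code = ord(ch)
        # Assumption: Phred+33 covers the printable ASCII range '!' (33) to '~' (126).
        if code not in range(33, 127):
            raise ValueError(f"quality character {ch!r} is outside the Phred+33 range")
        phred = code - 33
        probabilities.append(10 ** (-phred / 10))
    if aggregate == "sum":
        return sum(probabilities)
    if aggregate == "max":
        # The worst base is the one with the highest error probability.
        return max(probabilities, default=0.0)
    raise ValueError(f"unknown aggregate: {aggregate!r}")


# 'I' encodes Q40, '5' encodes Q20, '!' encodes Q0.
print(expected_error("II5!", "sum"))
print(expected_error("II5!", "max"))
```

&lt;p>With &lt;code>sum&lt;/code> the value grows with read length, while &lt;code>max&lt;/code> depends only on the single worst base.&lt;/p>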
&lt;p>Set &lt;code>aggregate = &amp;quot;sum&amp;quot;&lt;/code> to calculate the sum of per-base error probabilities.
Use &lt;code>aggregate = &amp;quot;max&amp;quot;&lt;/code> to store only the worst base&amp;rsquo;s error probability for each read or read pair.&lt;/p></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/main/docs/reference/tag-steps/using/StoreTagInSequence/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/reference/tag-steps/using/StoreTagInSequence/</guid><description>&lt;h1 id="storetaginsequence">
 StoreTagInSequence
 &lt;a class="anchor" href="#storetaginsequence">#&lt;/a>
&lt;/h1>
&lt;p>Insert a tag&amp;rsquo;s string value into a read sequence at the position defined by another location tag.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;StoreTagInSequence&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">in_value_label&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;mytag&amp;#34;&lt;/span> &lt;span style="color:#75715e"># location or string tag to insert&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">in_position_label&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;mytag2&amp;#34;&lt;/span> &lt;span style="color:#75715e"># location tag defining where to insert&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">anchor&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;Start&amp;#34;&lt;/span> &lt;span style="color:#75715e"># &amp;#34;Start&amp;#34;/&amp;#34;left&amp;#34; or &amp;#34;End&amp;#34;/&amp;#34;right&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;h2 id="parameters">
 Parameters
 &lt;a class="anchor" href="#parameters">#&lt;/a>
&lt;/h2>
&lt;table>
 &lt;thead>
 &lt;tr>
 &lt;th>Parameter&lt;/th>
 &lt;th>Type&lt;/th>
 &lt;th>Required&lt;/th>
 &lt;th>Description&lt;/th>
 &lt;/tr>
 &lt;/thead>
 &lt;tbody>
 &lt;tr>
 &lt;td>&lt;code>in_value_label&lt;/code>&lt;/td>
 &lt;td>location or string tag&lt;/td>
 &lt;td>yes&lt;/td>
 &lt;td>Tag whose sequence is inserted into the read&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>&lt;code>in_position_label&lt;/code>&lt;/td>
 &lt;td>location tag&lt;/td>
 &lt;td>yes&lt;/td>
 &lt;td>Tag that defines the insertion position&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>&lt;code>anchor&lt;/code>&lt;/td>
 &lt;td>&lt;code>&amp;quot;Start&amp;quot;&lt;/code> / &lt;code>&amp;quot;left&amp;quot;&lt;/code> / &lt;code>&amp;quot;End&amp;quot;&lt;/code> / &lt;code>&amp;quot;right&amp;quot;&lt;/code>&lt;/td>
 &lt;td>yes&lt;/td>
 &lt;td>Whether to insert before the leftmost position (&lt;code>Start&lt;/code>) or after the rightmost end (&lt;code>End&lt;/code>) of the position tag&lt;/td>
 &lt;/tr>
 &lt;/tbody>
&lt;/table>
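&lt;p>The anchor rule from the table can be sketched in Python as follows. This is a minimal illustration, not fastqrab's code: the function &lt;code>insert_at_anchor&lt;/code> and the representation of a location tag as a list of 0-based &lt;code>(start, length)&lt;/code> pairs are assumptions, and quality scores are ignored.&lt;/p>

```python
def insert_at_anchor(sequence, value, regions, anchor):
    """Insert `value` into `sequence` relative to the regions of a position tag.

    `regions` is a hypothetical list of 0-based (start, length) pairs.
    """
    if anchor in ("Start", "left"):
        # Insert before the leftmost position covered by the tag.
        position = min(start for start, _length in regions)
    elif anchor in ("End", "right"):
        # Insert after the rightmost end covered by the tag.
        position = max(start + length for start, length in regions)
    else:
        raise ValueError(f"unknown anchor: {anchor!r}")
    return sequence[:position] + value + sequence[position:]


read = "AAACCCGGG"
position_tag = [(3, 3)]  # the "CCC" stretch
print(insert_at_anchor(read, "TT", position_tag, "Start"))
print(insert_at_anchor(read, "TT", position_tag, "End"))
```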
&lt;h2 id="how-it-works">
 How it works
 &lt;a class="anchor" href="#how-it-works">#&lt;/a>
&lt;/h2>
&lt;p>&lt;code>in_position_label&lt;/code> is a location tag pointing to a region (or multiple regions) in a
read. The insertion point is derived from the anchor:&lt;/p></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/main/docs/reference/tag-steps/convert/ConvertRegionsToLength/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/reference/tag-steps/convert/ConvertRegionsToLength/</guid><description>&lt;h1 id="convertregionstolength">
 ConvertRegionsToLength
 &lt;a class="anchor" href="#convertregionstolength">#&lt;/a>
&lt;/h1>
&lt;p>Turn region tags (such as those produced by &lt;code>ExtractRegion&lt;/code>/&lt;code>ExtractRegions&lt;/code>) into numeric length tags.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;ExtractRegion&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">out_label&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;adapter&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">source&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;read1&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">start&lt;/span> = &lt;span style="color:#ae81ff">0&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">len&lt;/span> = &lt;span style="color:#ae81ff">12&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">anchor&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;Start&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;ConvertRegionsToLength&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">out_label&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;adapter_len&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">in_label&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;adapter&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;ul>
&lt;li>The new tag stores the total span (in bases) covered by all regions on each read.&lt;/li>
&lt;li>Reads without the source tag receive a length of &lt;code>0&lt;/code>.&lt;/li>
&lt;li>&lt;code>out_label&lt;/code> must be different from &lt;code>in_label&lt;/code>; the step keeps the original region tag.&lt;/li>
&lt;/ul></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/main/docs/reference/tag-steps/convert/EvalExpression/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/reference/tag-steps/convert/EvalExpression/</guid><description>&lt;h1 id="evalexpression">
 EvalExpression
 &lt;a class="anchor" href="#evalexpression">#&lt;/a>
&lt;/h1>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;EvalExpression&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">out_label&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;outtag&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">expression&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;log(2, mytag + 1)&amp;#34;&lt;/span> &lt;span style="color:#75715e"># log to base 2&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">result_type&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;numeric&amp;#34;&lt;/span> &lt;span style="color:#75715e"># or bool.&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Calculate a &lt;a href="https://docs.rs/fasteval/latest/fasteval/">fasteval&lt;/a> expression on your tags,
which you can then pass to &lt;a href="https://tyberiusprime.github.io/fastqrab/main/fastqrab/main/docs/reference/filter-steps/FilterByTag/">FilterByTag&lt;/a>.&lt;/p>
&lt;p>You can use any tags previously defined on the molecule as variables in the expression.&lt;/p>
&lt;p>Additionally, a set of virtual tags is available:&lt;/p>
&lt;ul>
&lt;li>&lt;code>len_&amp;lt;segment-name&amp;gt;&lt;/code> - the length of the specified segment (e.g. &lt;code>len_read1&lt;/code>).&lt;/li>
&lt;li>&lt;code>len_&amp;lt;tag-label&amp;gt;&lt;/code> - the length of the specified tag (e.g. &lt;code>len_mytag&lt;/code>). For location tags,
this is the length of the underlying matched regions (which may change / be lost when reads are truncated - eval before truncation if necessary). For string tags (= &lt;a href="https://tyberiusprime.github.io/fastqrab/main/fastqrab/main/docs/reference/tag-steps/extract/ExtractRegex/">ExtractRegex&lt;/a> with &lt;code>source=name:...&lt;/code>) this is the length of the &lt;em>replaced&lt;/em> string.&lt;/li>
&lt;li>&lt;code>read_no&lt;/code> - the running number of the read (starting with 0)&lt;/li>
&lt;/ul>
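&lt;p>To make the variable model concrete, here is a small Python sketch of how the example expression &lt;code>log(2, mytag + 1)&lt;/code> could be read, together with the virtual tags listed above. The helper names &lt;code>variables_for&lt;/code> and &lt;code>eval_example&lt;/code> are hypothetical; the real evaluation is done by fasteval inside fastqrab.&lt;/p>

```python
import math


def variables_for(read_no, segments, numeric_tags):
    """Build the variables an expression can see (hypothetical helper)."""
    variables = dict(numeric_tags)
    for name, sequence in segments.items():
        # Virtual tag: length of each segment, e.g. len_read1.
        variables[f"len_{name}"] = len(sequence)
    # Virtual tag: running number of the read, starting with 0.
    variables["read_no"] = read_no
    return variables


def eval_example(variables):
    """Python reading of the expression "log(2, mytag + 1)"."""
    # The expression language puts the base first: log(base, val).
    return math.log(variables["mytag"] + 1, 2)


variables = variables_for(0, {"read1": "ACGTACGTAC"}, {"mytag": 7})
print(variables["len_read1"], variables["read_no"], eval_example(variables))
```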
&lt;h2 id="language">
 Language
 &lt;a class="anchor" href="#language">#&lt;/a>
&lt;/h2>
&lt;p>Besides the regular arithmetic operators (+, -, *, /, %, ^)
this supports log(base, val), e(), pi(), int(), ceil(), floor(), round(), abs(), sign(), min(a,b,&amp;hellip;), max(a,b,&amp;hellip;),
sin(radians), cos(radians), tan(radians), sinh(radians), cosh(radians), and tanh(radians).
Use any defined tag by name. Location/string tags are converted to booleans by their presence.&lt;/p></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/main/docs/reference/Options/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/reference/Options/</guid><description>&lt;h1 id="options">
 Options
 &lt;a class="anchor" href="#options">#&lt;/a>
&lt;/h1>
&lt;p>There is a small set of runtime knobs exposed under &lt;code>[options]&lt;/code>.&lt;/p>
&lt;p>Most workflows can rely on the defaults.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>[&lt;span style="color:#a6e22e">options&lt;/span>]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">threads&lt;/span> = &lt;span style="color:#ae81ff">10&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">max_blocks_in_flight&lt;/span> = &lt;span style="color:#ae81ff">100&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">block_size&lt;/span> = &lt;span style="color:#ae81ff">10000&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">buffer_size&lt;/span> = &lt;span style="color:#ae81ff">102400&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">accept_duplicate_files&lt;/span> = &lt;span style="color:#66d9ef">false&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">spot_check_read_pairing&lt;/span> = &lt;span style="color:#66d9ef">true&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;table>
 &lt;thead>
 &lt;tr>
 &lt;th>Key&lt;/th>
 &lt;th>Default&lt;/th>
 &lt;th>Description&lt;/th>
 &lt;/tr>
 &lt;/thead>
 &lt;tbody>
 &lt;tr>
 &lt;td>&lt;code>threads&lt;/code>&lt;/td>
 &lt;td>(auto)&lt;/td>
 &lt;td>Worker threads for transformations. See &lt;a href="https://tyberiusprime.github.io/fastqrab/main/fastqrab/main/docs/reference/threading/">threading&lt;/a>.&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>&lt;code>max_blocks_in_flight&lt;/code>&lt;/td>
 &lt;td>&lt;code>100&lt;/code>&lt;/td>
 &lt;td>Maximum number of blocks that may be processed concurrently. Lowering this limits RAM usage.&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>&lt;code>block_size&lt;/code>&lt;/td>
 &lt;td>&lt;code>10000&lt;/code>&lt;/td>
 &lt;td>Number of fragments pulled per batch. Increase for very large runs when IO is abundant; decrease to reduce peak memory use.&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>&lt;code>buffer_size&lt;/code>&lt;/td>
 &lt;td>&lt;code>102400&lt;/code>&lt;/td>
 &lt;td>Initial bytes reserved per block. The allocator grows buffers on demand, so tuning is rarely necessary.&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>&lt;code>accept_duplicate_files&lt;/code>&lt;/td>
 &lt;td>&lt;code>false&lt;/code>&lt;/td>
 &lt;td>Permit the same path to appear multiple times across segments. Useful for fixtures or synthetic tests; keep disabled to catch accidental copy/paste errors.&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>&lt;code>spot_check_read_pairing&lt;/code>&lt;/td>
 &lt;td>&lt;code>true&lt;/code>&lt;/td>
 &lt;td>Sample every 1000th fragment to ensure paired reads still share a name prefix; disable when names are intentionally divergent or rely on &lt;code>ValidateName&lt;/code> to customise the separator.&lt;/td>
 &lt;/tr>
 &lt;/tbody>
&lt;/table>
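&lt;p>As a rough way to reason about memory pressure, the table suggests two simple products. The sketch below is back-of-the-envelope arithmetic under the assumption that every in-flight block is full; the function &lt;code>in_flight_estimate&lt;/code> is hypothetical, and actual memory use depends on read lengths and on buffers growing on demand.&lt;/p>

```python
def in_flight_estimate(max_blocks_in_flight=100, block_size=10000, buffer_size=102400):
    """Rough upper bounds implied by the [options] values (illustrative only)."""
    # Each block holds up to block_size fragments.
    fragments = max_blocks_in_flight * block_size
    # buffer_size is only the initial reservation per block.
    initial_bytes = max_blocks_in_flight * buffer_size
    return fragments, initial_bytes


defaults = in_flight_estimate()
reduced = in_flight_estimate(max_blocks_in_flight=20, block_size=5000)
print(defaults)
print(reduced)
```

&lt;p>Lowering either &lt;code>max_blocks_in_flight&lt;/code> or &lt;code>block_size&lt;/code> shrinks the first product, which is why both are described as ways to reduce memory use.&lt;/p>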
&lt;p>Changing these knobs can affect memory pressure and concurrency behavior.&lt;/p></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/main/docs/reference/tag-steps/calc/CalcKmers/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/reference/tag-steps/calc/CalcKmers/</guid><description>&lt;h1 id="calckmers">
 CalcKmers
 &lt;a class="anchor" href="#calckmers">#&lt;/a>
&lt;/h1>
&lt;p>Count the number of kmers from a read that match those in a database built from reference sequences.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;CalcKmers&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">out_label&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;mytag&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">segment&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;read1&amp;#34;&lt;/span> &lt;span style="color:#75715e"># Any of your input segments, or &amp;#39;All&amp;#39;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">filename&lt;/span> = [&lt;span style="color:#e6db74">&amp;#39;reference.fa&amp;#39;&lt;/span>, &lt;span style="color:#e6db74">&amp;#39;database.fq&amp;#39;&lt;/span>] &lt;span style="color:#75715e"># Path (string) or list of such&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">count_reverse_complement&lt;/span> = &lt;span style="color:#66d9ef">true&lt;/span> &lt;span style="color:#75715e"># whether to also include each revcomp of a kmer in the database&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">k&lt;/span> = &lt;span style="color:#ae81ff">21&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">min_count&lt;/span> = &lt;span style="color:#ae81ff">2&lt;/span> &lt;span style="color:#75715e"># optional, defaults to 1&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>This transformation:&lt;/p>
&lt;ol>
&lt;li>Builds a kmer database from the specified sequence files (all &lt;a href="https://tyberiusprime.github.io/fastqrab/main/fastqrab/main/docs/reference/input-section/">input formats&lt;/a>)&lt;/li>
&lt;li>Extracts all kmers of length &lt;code>k&lt;/code> from the reference sequences&lt;/li>
&lt;li>Filters kmers by &lt;code>min_count&lt;/code> (minimum occurrences in the reference to be included)&lt;/li>
&lt;li>For each read, counts how many of its kmers appear in the database&lt;/li>
&lt;li>Creates a numeric tag with the kmer match count&lt;/li>
&lt;/ol>
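&lt;p>The steps above can be sketched in plain Python. This is a simplified illustration, not the fastqrab implementation: the function names are hypothetical, references are passed as in-memory strings instead of files, and the way reverse complements are counted is an assumption based on the parameter descriptions.&lt;/p>

```python
from collections import Counter

VALID_BASES = set("ACGT")
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def kmers(sequence, k):
    """Yield upper-cased kmers that consist only of A, C, G and T."""
    sequence = sequence.upper()
    for i in range(len(sequence) - k + 1):
        kmer = sequence[i:i + k]
        if set(kmer).issubset(VALID_BASES):
            yield kmer


def reverse_complement(kmer):
    return kmer.translate(COMPLEMENT)[::-1]


def build_database(references, k, min_count=1, count_reverse_complement=True):
    """Steps 1 to 3: count reference kmers, then keep those seen often enough."""
    counts = Counter()
    for reference in references:
        for kmer in kmers(reference, k):
            counts[kmer] += 1
            if count_reverse_complement:
                counts[reverse_complement(kmer)] += 1
    return {kmer for kmer, n in counts.items() if n >= min_count}


def count_matches(read, database, k):
    """Step 4: count how many kmers of the read are in the database."""
    return sum(1 for kmer in kmers(read, k) if kmer in database)


database = build_database(["ACGTACGTAA"], k=4, min_count=2)
hits = count_matches("ttACGTAcgn", database, k=4)  # step 5: the numeric tag value
print(sorted(database), hits)
```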
&lt;h2 id="parameters">
 Parameters
 &lt;a class="anchor" href="#parameters">#&lt;/a>
&lt;/h2>
&lt;ul>
&lt;li>&lt;strong>out_label&lt;/strong>: Tag name to store the kmer count&lt;/li>
&lt;li>&lt;strong>segment&lt;/strong>: Which segment to quantify (read1, read2, index1, index2, or &amp;lsquo;All&amp;rsquo;)&lt;/li>
&lt;li>&lt;strong>filename&lt;/strong>: Path, or list of paths, of the sequence files to build the kmer database from&lt;/li>
&lt;li>&lt;strong>count_reverse_complement&lt;/strong>: (alias: &amp;ldquo;canonical&amp;rdquo;) Whether to include reverse complements of kmers in the database (&amp;lsquo;canonical kmers&amp;rsquo;)&lt;/li>
&lt;li>&lt;strong>k&lt;/strong>: Kmer length&lt;/li>
&lt;li>&lt;strong>min_count&lt;/strong>: Minimum number of times a kmer must appear in the reference files to be included in the database (default: 1). Sum of forward and reverse complement counts if &lt;code>count_reverse_complement&lt;/code> is true.&lt;/li>
&lt;/ul>
&lt;h2 id="use-cases">
 Use Cases
 &lt;a class="anchor" href="#use-cases">#&lt;/a>
&lt;/h2>
&lt;ul>
&lt;li>&lt;strong>Contamination detection&lt;/strong>: Quantify or filter reads matching known contaminant sequences&lt;/li>
&lt;li>&lt;strong>Quality control&lt;/strong>: Count kmers from adapter or primer sequences&lt;/li>
&lt;li>&lt;strong>Species identification&lt;/strong>: Measure presence of species-specific kmers&lt;/li>
&lt;/ul>
&lt;h2 id="notes">
 Notes
 &lt;a class="anchor" href="#notes">#&lt;/a>
&lt;/h2>
&lt;ul>
&lt;li>Only kmers consisting entirely of valid DNA bases (A, C, G, T) are counted; kmers containing N or other ambiguous bases are skipped&lt;/li>
&lt;li>Kmer matching is case-insensitive&lt;/li>
&lt;/ul></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/main/docs/reference/threading/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/reference/threading/</guid><description>&lt;h1 id="multithreading-considerations">
 Multithreading considerations
 &lt;a class="anchor" href="#multithreading-considerations">#&lt;/a>
&lt;/h1>
&lt;p>Mbf-fastq-processor is inherently multi-threaded, and strives
to make full use of your machine&amp;rsquo;s cores.&lt;/p>
&lt;p>Usually, you should not need to influence the thread counts.&lt;/p>
&lt;p>If you want to limit CPU usage, we suggest either systemd based resource control
or tools like &lt;a href="https://github.com/opsengine/cpulimit">cpulimit&lt;/a> instead of coarsely
changing thread counts.&lt;/p>
&lt;h2 id="threading-architecture">
 Threading architecture
 &lt;a class="anchor" href="#threading-architecture">#&lt;/a>
&lt;/h2>
&lt;p>Mbf-fastq-processor runs the following thread stack for a (non-interleaved) configuration:&lt;/p>
&lt;pre tabindex="0">&lt;code>[decompression / reader threads]
 ↓
[parsers]
 ↓
combining thread
 ↓
[workpool handling steps]
 ↓
[output threads]
&lt;/code>&lt;/pre>&lt;h3 id="decompression--reader-threads">
 Decompression / reader threads
 &lt;a class="anchor" href="#decompression--reader-threads">#&lt;/a>
&lt;/h3>
&lt;p>The number of decompression / reading threads is
(in order of precedence)&lt;/p></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/main/docs/reference/benchmark-section/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/reference/benchmark-section/</guid><description>&lt;h1 id="benchmark">
 Benchmark
 &lt;a class="anchor" href="#benchmark">#&lt;/a>
&lt;/h1>
&lt;p>For profiling and benchmarking (individual) steps,
fastqrab has a special benchmark mode.&lt;/p>
&lt;p>This mode focuses on benchmarking the steps,
and avoids (most) input and output runtime.&lt;/p>
&lt;p>Enable it by adding this TOML section.
The output section becomes optional (and ignored)
when benchmarking is enabled.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>[&lt;span style="color:#a6e22e">benchmark&lt;/span>]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">enable&lt;/span> = &lt;span style="color:#66d9ef">true&lt;/span> &lt;span style="color:#75715e"># required to enable benchmark mode&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">quiet&lt;/span> = &lt;span style="color:#66d9ef">false&lt;/span> &lt;span style="color:#75715e"># default. If true, don&amp;#39;t output timing information&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">molecule_count&lt;/span> = &lt;span style="color:#ae81ff">1_000_000&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Benchmark mode:&lt;/p>
&lt;ul>
&lt;li>disables (regular) output,&lt;/li>
&lt;li>runs in a temp directory,&lt;/li>
&lt;li>repeats the first molecule &amp;lsquo;block&amp;rsquo; of &lt;a href="https://tyberiusprime.github.io/fastqrab/main/fastqrab/main/docs/reference/Options/">&lt;code>Options.block_size&lt;/code>&lt;/a> reads
until &lt;code>molecule_count&lt;/code> has been exceeded.&lt;/li>
&lt;/ul>
&lt;p>The last point means that we spend very little time
reading &amp;amp; decompressing (which, without rapidgzip / parallel BAM processing, are the largest
runtime parts) and focus on the steps. The drawback is that your pipeline
sees the same reads over and over, which of course will lead to a different
&amp;lsquo;hit&amp;rsquo; profile for set based tests such as duplication counting,
&lt;a href="https://tyberiusprime.github.io/fastqrab/main/fastqrab/main/docs/reference/tag-steps/tag/TagOtherFile/">TagOtherFile&lt;/a>,
and &lt;a href="https://tyberiusprime.github.io/fastqrab/main/fastqrab/main/docs/reference/Demultiplex/">Demultiplex&lt;/a>&lt;/p></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/main/docs/development/benchmarking/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/development/benchmarking/</guid><description>&lt;h1 id="performance-benchmarking">
 Performance Benchmarking
 &lt;a class="anchor" href="#performance-benchmarking">#&lt;/a>
&lt;/h1>
&lt;p>fastqrab includes a comprehensive benchmarking suite for measuring and analyzing the performance of individual transformation steps.&lt;/p>
&lt;h2 id="overview">
 Overview
 &lt;a class="anchor" href="#overview">#&lt;/a>
&lt;/h2>
&lt;p>The benchmarking system uses:&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Criterion&lt;/strong>: Industry-standard Rust benchmarking framework&lt;/li>
&lt;li>&lt;strong>Benchmark mode&lt;/strong>: Built-in mode that focuses on step performance while minimizing I/O overhead&lt;/li>
&lt;/ul>
&lt;h2 id="quick-start">
 Quick Start
 &lt;a class="anchor" href="#quick-start">#&lt;/a>
&lt;/h2>
&lt;h3 id="running-benchmarks">
 Running Benchmarks
 &lt;a class="anchor" href="#running-benchmarks">#&lt;/a>
&lt;/h3>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-bash" data-lang="bash">&lt;span style="display:flex;">&lt;span>&lt;span style="color:#75715e"># Run all step benchmarks&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>cargo bench --bench simple_benchmarks
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;h2 id="benchmark-architecture">
 Benchmark Architecture
 &lt;a class="anchor" href="#benchmark-architecture">#&lt;/a>
&lt;/h2>
&lt;h3 id="benchmark-mode">
 Benchmark Mode
 &lt;a class="anchor" href="#benchmark-mode">#&lt;/a>
&lt;/h3>
&lt;p>The processor includes a special benchmark mode that:&lt;/p></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/main/docs/reference/adapters/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/reference/adapters/</guid><description>&lt;h1 id="sequencing-adapters">
 Sequencing Adapters
 &lt;a class="anchor" href="#sequencing-adapters">#&lt;/a>
&lt;/h1>
&lt;p>Paste this &lt;code>count_oligos&lt;/code> block into a &lt;code>Report&lt;/code> step to identify which adapter is present in your data.
See &lt;a href="https://tyberiusprime.github.io/fastqrab/main/fastqrab/main/docs/how-to/cookbooks/10-adapter-identification/">Cookbook 10: Adapter Identification&lt;/a> for a complete example.&lt;/p>
&lt;p>&lt;code>count_oligos&lt;/code> performs exact, full-sequence matching — no mismatches, no IUPAC wildcards.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]] 
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#39;Report&amp;#39;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#75715e"># ...&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">count_oligos&lt;/span> = {
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#75715e"># Illumina TruSeq / standard adapters&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#75715e"># https://support-docs.illumina.com/SHARE/AdapterSequences/adapter-sequences.htm&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#e6db74">&amp;#34;Illumina Nextera/AmpliSeq&amp;#34;&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;CTGTCTCTTATACACATCT&amp;#34;&lt;/span>,
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#e6db74">&amp;#34;Illumina TruSeq R1/miRNA&amp;#34;&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;AGATCGGAAGAGCACACGTCTGAACTCCAGTCA&amp;#34;&lt;/span>,
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#e6db74">&amp;#34;Illumina TruSeq R2&amp;#34;&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT&amp;#34;&lt;/span>,
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#e6db74">&amp;#34;Illumina Small RNA R2&amp;#34;&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;GATCGTCGGACTGTAGAACTCTGAACGTGTAGA&amp;#34;&lt;/span>,
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#e6db74">&amp;#34;Illumina Single End Adapter 1&amp;#34;&lt;/span>= &lt;span style="color:#e6db74">&amp;#34;GATCGGAAGAGCTCGTATGCCGTCTTCTGCTTG&amp;#34;&lt;/span>,
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#e6db74">&amp;#34;Illumina Paired End Adapter 2&amp;#34;&lt;/span>= &lt;span style="color:#e6db74">&amp;#34;GATCGGAAGAGCGGTTCAGCAGGAATGCCGAG&amp;#34;&lt;/span>,
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#75715e"># MGI/BGI adapters&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#75715e"># http://seqanswers.com/forums/showthread.php?t=87647 (2nd post)&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#e6db74">&amp;#34;BGI Forward&amp;#34;&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;AAGTCGGAGGCCAAGCGGTCTTAGGAAGACAA&amp;#34;&lt;/span>,
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#e6db74">&amp;#34;BGI Reverse&amp;#34;&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;AAGTCGGATCGTAGCCATGTCGTTCTGTGAGCCAAGGAGTTG&amp;#34;&lt;/span>,
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#75715e"># QIASeq&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#e6db74">&amp;#34;QIASeq miRNA&amp;#34;&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;AACTGTAGGCACCATCAAT&amp;#34;&lt;/span>,
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#75715e"># ABI SOLiD&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#e6db74">&amp;#34;ABI SOLiD3 Adapter A&amp;#34;&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;CTGCCCCGGGTTCCTCATTCTCTCAGCAGCATG&amp;#34;&lt;/span>,
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#e6db74">&amp;#34;ABI SOLiD3 Adapter B&amp;#34;&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;CCACTACGCCTCCGCTTTCCTCTCTATGGGCAGTCGGTGAT&amp;#34;&lt;/span>,
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#75715e"># Clontech&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#e6db74">&amp;#34;Clontech Universal Primer Mix&amp;#34;&lt;/span>= &lt;span style="color:#e6db74">&amp;#34;CTAATACGACTCACTATAGGGCAAGCAGTGGTATCAACGCAGAGT&amp;#34;&lt;/span>,
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> }
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>More adapters may be found in the &lt;a href="https://github.com/OpenGene/fastp/blob/master/src/knownadapters.h">fastp source&lt;/a>&lt;/p></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/main/docs/concepts/segments/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/concepts/segments/</guid><description>&lt;h1 id="segments">
 Segments
 &lt;a class="anchor" href="#segments">#&lt;/a>
&lt;/h1>
&lt;p>Modern sequencers, particularly Illumina sequencers, can read multiple times from one (amplified) DNA molecule, producing multiple &amp;lsquo;segments&amp;rsquo; (often called &amp;lsquo;reads&amp;rsquo;) that together form a &amp;lsquo;molecule&amp;rsquo; or &amp;lsquo;fragment&amp;rsquo;.&lt;/p>
&lt;h2 id="definition-and-configuration">
 Definition and Configuration
 &lt;a class="anchor" href="#definition-and-configuration">#&lt;/a>
&lt;/h2>
&lt;p>Segments are defined in the &lt;code>[input]&lt;/code> section of your TOML configuration. Each segment corresponds to one FASTQ file (or stream in interleaved formats), and segment names are arbitrary but should be meaningful.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>[&lt;span style="color:#a6e22e">input&lt;/span>]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">read1&lt;/span> = [&lt;span style="color:#e6db74">&amp;#34;sample_R1.fq.gz&amp;#34;&lt;/span>]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">read2&lt;/span> = [&lt;span style="color:#e6db74">&amp;#34;sample_R2.fq.gz&amp;#34;&lt;/span>]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">index1&lt;/span> = [&lt;span style="color:#e6db74">&amp;#34;sample_I1.fq.gz&amp;#34;&lt;/span>]
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>In this example, three segments are defined: &lt;code>read1&lt;/code>, &lt;code>read2&lt;/code>, and &lt;code>index1&lt;/code>.&lt;/p></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/main/docs/concepts/source/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/concepts/source/</guid><description>&lt;h1 id="source">
 Source
 &lt;a class="anchor" href="#source">#&lt;/a>
&lt;/h1>
&lt;p>When a step refers to a &amp;lsquo;source&amp;rsquo; (instead of a &lt;a href="https://tyberiusprime.github.io/fastqrab/main/fastqrab/main/docs/concepts/segments/">&lt;code>segment&lt;/code>&lt;/a>), it means the step can read from multiple types of data: segment sequences, segment names, or tag values.&lt;/p>
&lt;h2 id="overview">
 Overview
 &lt;a class="anchor" href="#overview">#&lt;/a>
&lt;/h2>
&lt;p>The &lt;code>source&lt;/code> parameter generalizes the &lt;code>segment&lt;/code> parameter, allowing steps to operate on different kinds of string data within a fragment. This flexibility enables advanced workflows like extracting patterns from read names, processing tag-derived sequences, or combining multiple data sources.&lt;/p></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/main/docs/concepts/step/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/concepts/step/</guid><description>&lt;h1 id="step">
 Step
 &lt;a class="anchor" href="#step">#&lt;/a>
&lt;/h1>
&lt;p>A step is one coherent manipulation of the FASTQ stream and its associated data.&lt;/p>
&lt;h2 id="overview">
 Overview
 &lt;a class="anchor" href="#overview">#&lt;/a>
&lt;/h2>
&lt;p>Steps are the building blocks of a processing pipeline. Each step is declared as a &lt;code>[[step]]&lt;/code> entry in the TOML configuration file, and the complete pipeline executes steps sequentially from top to bottom.&lt;/p>
&lt;p>Every step operates on complete fragments (molecules), ensuring that paired segments remain synchronized. If a filtering step removes a fragment based on criteria from &lt;code>read1&lt;/code>, the corresponding &lt;code>read2&lt;/code>, &lt;code>index1&lt;/code>, and any other segments are automatically removed alongside it.&lt;/p></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/main/docs/concepts/tag/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/concepts/tag/</guid><description>&lt;h1 id="tag--label">
 Tag / Label
 &lt;a class="anchor" href="#tag--label">#&lt;/a>
&lt;/h1>
&lt;p>A regular tag is a piece of fragment-derived metadata that one step in the pipeline
produces, and other steps may consume, transform, or export.&lt;/p>
&lt;p>A virtual tag is a tag created on the fly that exists only
for the step that uses it and disappears right afterwards.&lt;/p>
&lt;h2 id="overview---regular-tags">
 Overview - Regular tags
 &lt;a class="anchor" href="#overview---regular-tags">#&lt;/a>
&lt;/h2>
&lt;p>Tags enable sophisticated workflows by decoupling data extraction from data
usage. Instead of hardcoding logic like &amp;ldquo;trim adapters AND filter by adapter
presence&amp;rdquo; into a single step, you extract adapter locations as a tag, then use
that tag in multiple downstream operations.&lt;/p></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/main/docs/faq/changelog/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/faq/changelog/</guid><description>&lt;h1 id="changelog">
 Changelog
 &lt;a class="anchor" href="#changelog">#&lt;/a>
&lt;/h1>
&lt;h2 id="v090">
 v0.9.0
 &lt;a class="anchor" href="#v090">#&lt;/a>
&lt;/h2>
&lt;h3 id="general">
 General
 &lt;a class="anchor" href="#general">#&lt;/a>
&lt;/h3>
&lt;ul>
&lt;li>Renamed project from mbf-fastq-processor to fastqrab&lt;/li>
&lt;li>Much improved error messages pinpointing exactly what needs to change.&lt;/li>
&lt;/ul>
&lt;h3 id="new--renamed-steps">
 New &amp;amp; renamed steps
 &lt;a class="anchor" href="#new--renamed-steps">#&lt;/a>
&lt;/h3>
&lt;ul>
&lt;li>New step: ConcatTags — concatenate multiple tags into one&lt;/li>
&lt;li>New step: Lowercase — unified replacement for LowercaseTag/LowercaseSequence&lt;/li>
&lt;li>New step: SpotCheckReadPairing — Hamming-distance based read pairing validation&lt;/li>
&lt;li>New step: ExtractIUPACSuffix&lt;/li>
&lt;li>OtherFile unified: OtherFileByName and OtherFileBySequence merged into one step&lt;/li>
&lt;li>ExtractAnchor merged into ExtractRegions&lt;/li>
&lt;li>Renamed NCount to NContent, in line with GCContent&lt;/li>
&lt;li>NContent/GCContent now support (and require) relative (counts or rate?)&lt;/li>
&lt;/ul>
&lt;h3 id="step-changes">
 Step changes
 &lt;a class="anchor" href="#step-changes">#&lt;/a>
&lt;/h3>
&lt;ul>
&lt;li>Conditional Swap/ReverseComplement variants merged into the main steps&lt;/li>
&lt;li>&lt;a href="https://tyberiusprime.github.io/fastqrab/main/fastqrab/main/docs/concepts/tag/#conditional-processing">if tag&lt;/a> support on 8 core editing steps for conditional read editing&lt;/li>
&lt;li>ExtractIUPAC: multiple queries in one step, max_mismatches now required, large performance improvements&lt;/li>
&lt;li>min_length added to ExtractRegionsOfLowQuality&lt;/li>
&lt;li>Quality checking added to Prefix/Postfix&lt;/li>
&lt;li>Tag replacement within regular expressions&lt;/li>
&lt;/ul>
&lt;h3 id="output-changes">
 Output changes
 &lt;a class="anchor" href="#output-changes">#&lt;/a>
&lt;/h3>
&lt;ul>
&lt;li>Tag histogram reports; demultiplex data nested under &amp;lsquo;demultiplex&amp;rsquo; key in reports&lt;/li>
&lt;/ul>
&lt;h3 id="performance">
 Performance
 &lt;a class="anchor" href="#performance">#&lt;/a>
&lt;/h3>
&lt;ul>
&lt;li>Redesigned multi-core engine: workpool based, better controllable, better documented&lt;/li>
&lt;li>Default thread count now uses all available CPU cores&lt;/li>
&lt;li>Rapidgzip for parallel gzip decompression, now also for FASTA; auto-detected; included in Nix builds&lt;/li>
&lt;li>Arena-based parsers for FASTA and BAM&lt;/li>
&lt;li>Parallel BAM decoding&lt;/li>
&lt;li>Multicore EvalExpression, ReportCountOligos, ReportLengthDistribution&lt;/li>
&lt;li>Massively improved Prefix/Postfix performance&lt;/li>
&lt;li>Merge base statistics ~80% faster&lt;/li>
&lt;li>ConcatTags ~15% faster&lt;/li>
&lt;li>IUPAC matching: replaced Sassy with optimized pure-Rust implementation&lt;/li>
&lt;li>Optimized SwapConditional, TrimAtTag, StoreTagBackInSequence, FilterReservoirSample, Rename&lt;/li>
&lt;li>Dynamic cuckoo filter sizing; initial_filter_capacity documented; read count estimation&lt;/li>
&lt;/ul>
&lt;h3 id="other">
 Other
 &lt;a class="anchor" href="#other">#&lt;/a>
&lt;/h3>
&lt;ul>
&lt;li>verify command: validates a pipeline produces expected output; auto-detects config, captures stdout/stderr&lt;/li>
&lt;li>Configuration TOML can now be read from stdin (not possible when the reads themselves come from stdin)&lt;/li>
&lt;li>Shell autocompletion for bash, fish, and zsh&lt;/li>
&lt;li>benchmark mode and per-step benchmark harness&lt;/li>
&lt;li>template command: shows help on error&lt;/li>
&lt;li>LLM configuration guide and template.toml rewrite for LLM-assisted config generation&lt;/li>
&lt;li>TagLabel type: all tag names are now strongly typed; duplicate tag names produce a clear error&lt;/li>
&lt;li>IndexMap replaces HashMap everywhere for deterministic output order&lt;/li>
&lt;li>unwrap() replaced with expect() throughout; clippy::unwrap_used now denied&lt;/li>
&lt;li>MSRV pinned to match flake.nix Rust version&lt;/li>
&lt;li>Security: upgraded bytes crate (GHSA-434x-w66g-qw3r)&lt;/li>
&lt;li>Upgraded dependencies&lt;/li>
&lt;/ul>
&lt;h3 id="documentation">
 Documentation
 &lt;a class="anchor" href="#documentation">#&lt;/a>
&lt;/h3>
&lt;ul>
&lt;li>Four new cookbooks for common FastQ processing tasks&lt;/li>
&lt;li>Copy-to-clipboard button in docs&lt;/li>
&lt;li>Documentation URLs included in validation failure messages&lt;/li>
&lt;li>Added mascot&lt;/li>
&lt;/ul>
&lt;h3 id="bug-fixes">
 Bug fixes
 &lt;a class="anchor" href="#bug-fixes">#&lt;/a>
&lt;/h3>
&lt;ul>
&lt;li>Fixed fastp merge algorithm (replaced with direct port of the reference C++ algorithm)&lt;/li>
&lt;li>Fixed invalid FASTQ detection when comment line doesn&amp;rsquo;t start with &amp;lsquo;+&amp;rsquo;&lt;/li>
&lt;li>Fixed Windows newline detection edge case in parser&lt;/li>
&lt;li>Fixed Local-Local FastQElement swap&lt;/li>
&lt;li>Fixed demultiplex &amp;amp; fragment count in reports&lt;/li>
&lt;li>Fixed Head short-circuit (broken by SpotCheckReadPairing)&lt;/li>
&lt;li>Fixed ignore_unaligned → now include_mapped / include_unmapped&lt;/li>
&lt;li>Fixed barcode overlapping multiple matches&lt;/li>
&lt;/ul>
&lt;h2 id="v081">
 v0.8.1
 &lt;a class="anchor" href="#v081">#&lt;/a>
&lt;/h2>
&lt;ul>
&lt;li>GitHub release workflow test&lt;/li>
&lt;/ul>
&lt;h2 id="v080">
 v0.8.0
 &lt;a class="anchor" href="#v080">#&lt;/a>
&lt;/h2>
&lt;ul>
&lt;li>Versioned documentation&lt;/li>
&lt;li>First revision where every major feature is in place. Changelog starts here.&lt;/li>
&lt;/ul></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/main/docs/how-to/cookbooks/01-basic-quality-report/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/how-to/cookbooks/01-basic-quality-report/</guid><description>&lt;h1 id="cookbook-01-basic-quality-report">
 Cookbook 01: Basic Quality Report
 &lt;a class="anchor" href="#cookbook-01-basic-quality-report">#&lt;/a>
&lt;/h1>
&lt;h2 id="use-case">
 Use Case
 &lt;a class="anchor" href="#use-case">#&lt;/a>
&lt;/h2>
&lt;p>You have FastQ files from a sequencing run and want to generate comprehensive quality reports to assess:&lt;/p>
&lt;ul>
&lt;li>Read quality scores&lt;/li>
&lt;li>Base composition&lt;/li>
&lt;li>Read length distribution&lt;/li>
&lt;li>Duplicate read counts&lt;/li>
&lt;/ul>
&lt;p>This is typically the first step in any sequencing data analysis to understand data quality before downstream processing.&lt;/p>
&lt;h2 id="what-this-pipeline-does">
 What This Pipeline Does
 &lt;a class="anchor" href="#what-this-pipeline-does">#&lt;/a>
&lt;/h2>
&lt;ol>
&lt;li>Reads input FastQ file(s)&lt;/li>
&lt;li>Generates a comprehensive quality report including:
&lt;ul>
&lt;li>Base quality statistics&lt;/li>
&lt;li>Base distribution across positions&lt;/li>
&lt;li>Read length distribution&lt;/li>
&lt;li>Duplicate read counting&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Outputs reports in both HTML (human-readable) and JSON (machine-readable) formats&lt;/li>
&lt;li>Passes through all reads unchanged (no filtering)&lt;/li>
&lt;/ol>
&lt;h2 id="input-files">
 Input Files
 &lt;a class="anchor" href="#input-files">#&lt;/a>
&lt;/h2>
&lt;ul>
&lt;li>&lt;code>input/sample_R1.fq&lt;/code> - Forward reads (Read 1) from paired-end sequencing&lt;/li>
&lt;/ul>
&lt;h2 id="output-files">
 Output Files
 &lt;a class="anchor" href="#output-files">#&lt;/a>
&lt;/h2>
&lt;ul>
&lt;li>&lt;code>output_R1.fq&lt;/code> - Passed-through reads (identical to input)&lt;/li>
&lt;li>&lt;code>output.report_initial.html&lt;/code> - HTML quality report&lt;/li>
&lt;li>&lt;code>output.report_initial.json&lt;/code> - JSON quality report with detailed statistics&lt;/li>
&lt;/ul>
&lt;h2 id="when-to-use-this">
 When to Use This
 &lt;a class="anchor" href="#when-to-use-this">#&lt;/a>
&lt;/h2>
&lt;ul>
&lt;li>First analysis of new sequencing data&lt;/li>
&lt;li>Quality control before committing to expensive downstream analysis&lt;/li>
&lt;li>Comparing data quality across different sequencing runs&lt;/li>
&lt;li>Identifying potential issues (adapter contamination, quality drop-off, etc.)&lt;/li>
&lt;/ul>
&lt;h2 id="download">
 Download
 &lt;a class="anchor" href="#download">#&lt;/a>
&lt;/h2>
&lt;p>&lt;a href="../../../../cookbooks/01-basic-quality-report.tar.gz">Download 01-basic-quality-report.tar.gz&lt;/a> for a complete, runnable example including expected output files.&lt;/p></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/main/docs/how-to/cookbooks/02-umi-extraction/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/how-to/cookbooks/02-umi-extraction/</guid><description>&lt;h1 id="cookbook-02-umi-extraction">
 Cookbook 02: UMI Extraction
 &lt;a class="anchor" href="#cookbook-02-umi-extraction">#&lt;/a>
&lt;/h1>
&lt;h2 id="use-case">
 Use Case
 &lt;a class="anchor" href="#use-case">#&lt;/a>
&lt;/h2>
&lt;p>You have sequencing data with Unique Molecular Identifiers (UMIs) embedded in the reads. UMIs are short random barcodes added during library preparation that allow you to:&lt;/p>
&lt;ul>
&lt;li>Identify and remove PCR duplicates&lt;/li>
&lt;li>Distinguish true biological duplicates from amplification artifacts&lt;/li>
&lt;li>Improve accuracy in quantitative analyses (RNA-seq, ATAC-seq, etc.)&lt;/li>
&lt;/ul>
&lt;h2 id="what-this-pipeline-does">
 What This Pipeline Does
 &lt;a class="anchor" href="#what-this-pipeline-does">#&lt;/a>
&lt;/h2>
&lt;ol>
&lt;li>Reads input FastQ file with UMIs at the start of read1&lt;/li>
&lt;li>Extracts the UMI sequence (first 8 bases) and creates a tag&lt;/li>
&lt;li>Stores the UMI in the read comment (FASTQ header)&lt;/li>
&lt;li>Removes the UMI bases from the read sequence (so they don&amp;rsquo;t interfere with alignment)&lt;/li>
&lt;li>Outputs modified reads with UMI preserved in the header&lt;/li>
&lt;/ol>
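&lt;p>The steps above can be sketched in a few lines of Python. This is only an illustration of the idea, not fastqrab&amp;rsquo;s implementation: the function name &lt;code>extract_umi&lt;/code> and the &lt;code>umi=&lt;/code> header prefix are assumptions made for this example.&lt;/p>

```python
def extract_umi(name, sequence, quality, umi_length=8):
    """Move the leading UMI from the sequence into the read name."""
    umi = sequence[:umi_length]
    # Append the UMI to the header; the "umi=" prefix is an illustrative choice.
    new_name = f"{name} umi={umi}"
    # Trim sequence and quality together so they stay the same length.
    return new_name, sequence[umi_length:], quality[umi_length:]


name, seq, qual = extract_umi("read_1", "ACGTACGTTTGGCCAA", "IIIIIIIIFFFFFFFF")
print(name, seq, qual)
```

&lt;p>Trimming the quality string alongside the sequence matters: a FASTQ record whose sequence and quality lengths differ is invalid.&lt;/p>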
&lt;h2 id="input-files">
 Input Files
 &lt;a class="anchor" href="#input-files">#&lt;/a>
&lt;/h2>
&lt;ul>
&lt;li>&lt;code>input/sample_R1.fq&lt;/code> - Reads with 8bp UMI at the start&lt;/li>
&lt;/ul>
&lt;h2 id="output-files">
 Output Files
 &lt;a class="anchor" href="#output-files">#&lt;/a>
&lt;/h2>
&lt;ul>
&lt;li>&lt;code>output_R1.fq&lt;/code> - Reads with UMI in comment, UMI bases removed from sequence&lt;/li>
&lt;/ul>
&lt;h2 id="configuration-highlights">
 Configuration Highlights
 &lt;a class="anchor" href="#configuration-highlights">#&lt;/a>
&lt;/h2>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#75715e"># Extract UMI from positions 0-7 (8 bases)&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#39;ExtractRegions&amp;#39;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">label&lt;/span> = &lt;span style="color:#e6db74">&amp;#39;umi&amp;#39;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">regions&lt;/span> = [{&lt;span style="color:#a6e22e">source&lt;/span> = &lt;span style="color:#e6db74">&amp;#39;read1&amp;#39;&lt;/span>, &lt;span style="color:#a6e22e">start&lt;/span> = &lt;span style="color:#ae81ff">0&lt;/span>, &lt;span style="color:#a6e22e">length&lt;/span> = &lt;span style="color:#ae81ff">8&lt;/span>, &lt;span style="color:#a6e22e">anchor&lt;/span>=&lt;span style="color:#e6db74">&amp;#34;Start&amp;#34;&lt;/span>}]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#75715e"># Store UMI in the FASTQ comment&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#39;StoreTagInComment&amp;#39;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">label&lt;/span> = &lt;span style="color:#e6db74">&amp;#39;umi&amp;#39;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#75715e"># Remove the UMI bases from the read&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#39;CutStart&amp;#39;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">target&lt;/span> = &lt;span style="color:#e6db74">&amp;#39;Read1&amp;#39;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">n&lt;/span> = &lt;span style="color:#ae81ff">8&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;h2 id="workflow-details">
 Workflow Details
 &lt;a class="anchor" href="#workflow-details">#&lt;/a>
&lt;/h2>
&lt;p>&lt;strong>Before processing:&lt;/strong>&lt;/p></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/main/docs/how-to/cookbooks/03-lexogen-quantseq/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/how-to/cookbooks/03-lexogen-quantseq/</guid><description>&lt;h1 id="cookbook-03-lexogen-quantseq-processing">
 Cookbook 03: Lexogen QuantSeq Processing
 &lt;a class="anchor" href="#cookbook-03-lexogen-quantseq-processing">#&lt;/a>
&lt;/h1>
&lt;h2 id="use-case">
 Use Case
 &lt;a class="anchor" href="#use-case">#&lt;/a>
&lt;/h2>
&lt;p>Lexogen QuantSeq is a popular 3&amp;rsquo; mRNA sequencing protocol optimized for gene expression profiling. The library structure includes:&lt;/p>
&lt;ul>
&lt;li>&lt;strong>First 8 bases&lt;/strong>: UMI (Unique Molecular Identifier) for deduplication&lt;/li>
&lt;li>&lt;strong>Next 6 bases&lt;/strong>: Random hexamer primer sequence (needs removal)&lt;/li>
&lt;li>&lt;strong>Remaining sequence&lt;/strong>: Actual cDNA from the 3&amp;rsquo; end of transcripts&lt;/li>
&lt;/ul>
&lt;p>This cookbook demonstrates the standard preprocessing for QuantSeq data before alignment.&lt;/p>
&lt;h2 id="what-this-pipeline-does">
 What This Pipeline Does
 &lt;a class="anchor" href="#what-this-pipeline-does">#&lt;/a>
&lt;/h2>
&lt;ol>
&lt;li>Extracts the 8bp UMI from the start of reads&lt;/li>
&lt;li>Stores the UMI in the read comment (FASTQ header)&lt;/li>
&lt;li>Removes the first 14 bases total (8bp UMI + 6bp random hexamer)&lt;/li>
&lt;li>Outputs processed reads ready for alignment&lt;/li>
&lt;/ol>
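&lt;p>As a rough illustration of the read layout described above, the following Python sketch splits a raw read at the fixed offsets (8 bases of UMI, then 6 bases of random hexamer). The function name and the example sequence are made up for this sketch and are not part of fastqrab.&lt;/p>

```python
UMI_LENGTH = 8       # from the library structure described above
HEXAMER_LENGTH = 6   # random hexamer primer that follows the UMI


def split_quantseq_read(sequence):
    """Split a raw read into (umi, hexamer, cdna) by fixed offsets."""
    umi = sequence[:UMI_LENGTH]
    hexamer = sequence[UMI_LENGTH:UMI_LENGTH + HEXAMER_LENGTH]
    cdna = sequence[UMI_LENGTH + HEXAMER_LENGTH:]
    return umi, hexamer, cdna


# Hypothetical read: UMI + hexamer + cDNA
umi, hexamer, cdna = split_quantseq_read("AACCGGTT" + "TGCATG" + "GATTACAGATTACA")
print(umi, hexamer, cdna)
```

&lt;p>Only the &lt;code>cdna&lt;/code> part is passed on to alignment; the UMI survives in the read comment and the hexamer is discarded.&lt;/p>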
&lt;h2 id="input-files">
 Input Files
 &lt;a class="anchor" href="#input-files">#&lt;/a>
&lt;/h2>
&lt;ul>
&lt;li>&lt;code>input/quantseq_sample.fq&lt;/code> - Raw QuantSeq reads with UMI and random hexamer&lt;/li>
&lt;/ul>
&lt;h2 id="output-files">
 Output Files
 &lt;a class="anchor" href="#output-files">#&lt;/a>
&lt;/h2>
&lt;ul>
&lt;li>&lt;code>output_read1.fq&lt;/code> - Processed reads with:
&lt;ul>
&lt;li>UMI stored in comment&lt;/li>
&lt;li>First 14bp removed&lt;/li>
&lt;li>Ready for alignment to reference genome&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h2 id="workflow-details">
 Workflow Details
 &lt;a class="anchor" href="#workflow-details">#&lt;/a>
&lt;/h2>
&lt;p>&lt;strong>Raw read structure:&lt;/strong>&lt;/p></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/main/docs/how-to/cookbooks/04-phiX-removal/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/how-to/cookbooks/04-phiX-removal/</guid><description>&lt;h1 id="cookbook-04-phix-removal">
 Cookbook 04: PhiX Removal
 &lt;a class="anchor" href="#cookbook-04-phix-removal">#&lt;/a>
&lt;/h1>
&lt;h2 id="use-case">
 Use Case
 &lt;a class="anchor" href="#use-case">#&lt;/a>
&lt;/h2>
&lt;p>You have Illumina PhiX spike-in sequences in your dataset and want to remove those contaminating reads before downstream analysis. PhiX is commonly added as a control to increase base diversity during sequencing runs.&lt;/p>
&lt;h2 id="what-this-pipeline-does">
 What This Pipeline Does
 &lt;a class="anchor" href="#what-this-pipeline-does">#&lt;/a>
&lt;/h2>
&lt;p>This cookbook demonstrates how to identify and remove PhiX contamination using k-mer counting:&lt;/p>
&lt;ol>
&lt;li>&lt;strong>Count k-mers&lt;/strong>: Uses &lt;code>CalcKmers&lt;/code> to count how many 30-mers from each read match the PhiX genome&lt;/li>
&lt;li>&lt;strong>Export data&lt;/strong>: Saves k-mer counts to a TSV table for analysis&lt;/li>
&lt;li>&lt;strong>Filter reads&lt;/strong>: Removes reads with high PhiX k-mer counts (≥25 matching k-mers)&lt;/li>
&lt;/ol>
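&lt;p>The counting and filtering idea can be sketched in Python as follows. This is a simplified illustration under stated assumptions: the function names are invented for this example, the reference is a toy string rather than the PhiX genome, and reverse-complement handling is ignored. How &lt;code>CalcKmers&lt;/code> does this internally is not shown here.&lt;/p>

```python
def kmers(sequence, k):
    """Return every length-k substring of the sequence."""
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def count_matching_kmers(read, reference_kmers, k):
    """Count how many k-mers of the read occur in the reference set."""
    return sum(1 for kmer in kmers(read, k) if kmer in reference_kmers)


def keep_read(read, reference_kmers, k=30, max_hits=25):
    """Keep a read only if it has fewer than max_hits matching k-mers."""
    return max_hits > count_matching_kmers(read, reference_kmers, k)


# Toy example with k=3 so the numbers are easy to follow.
reference_kmers = set(kmers("ACGTACGGA", 3))
hits = count_matching_kmers("ACGTTT", reference_kmers, 3)
print(hits)
```

&lt;p>In the toy example, the read shares the 3-mers &lt;code>ACG&lt;/code> and &lt;code>CGT&lt;/code> with the reference, so it has two hits. With the cookbook&amp;rsquo;s settings, a read is dropped once 25 or more of its 30-mers match.&lt;/p>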
&lt;h2 id="understanding-the-approach">
 Understanding the Approach
 &lt;a class="anchor" href="#understanding-the-approach">#&lt;/a>
&lt;/h2>
&lt;h3 id="k-mer-counting">
 K-mer Counting
 &lt;a class="anchor" href="#k-mer-counting">#&lt;/a>
&lt;/h3>
&lt;p>The &lt;code>CalcKmers&lt;/code> step counts how many k-mers (short subsequences of length k) from each read are present in the PhiX reference genome:&lt;/p></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/main/docs/how-to/cookbooks/05-quality-filtering/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/how-to/cookbooks/05-quality-filtering/</guid><description>&lt;h1 id="cookbook-05-quality-filtering">
 Cookbook 05: Quality Filtering
 &lt;a class="anchor" href="#cookbook-05-quality-filtering">#&lt;/a>
&lt;/h1>
&lt;h2 id="use-case">
 Use Case
 &lt;a class="anchor" href="#use-case">#&lt;/a>
&lt;/h2>
&lt;p>You have sequencing data with varying quality and want to remove low-quality reads before downstream analysis. Poor quality reads can introduce errors in variant calling, assembly, and other analyses.&lt;/p>
&lt;h2 id="what-this-pipeline-does">
 What This Pipeline Does
 &lt;a class="anchor" href="#what-this-pipeline-does">#&lt;/a>
&lt;/h2>
&lt;p>This cookbook demonstrates quality-based filtering using expected error calculation:&lt;/p>
&lt;ol>
&lt;li>&lt;strong>Calculate Expected Errors&lt;/strong>: Uses &lt;code>CalcExpectedError&lt;/code> to compute the expected number of base call errors per read based on quality scores&lt;/li>
&lt;li>&lt;strong>Filter Low-Quality Reads&lt;/strong>: Uses &lt;code>FilterByNumericTag&lt;/code> to remove reads exceeding an error threshold&lt;/li>
&lt;li>&lt;strong>Generate Reports&lt;/strong>: Creates quality reports before and after filtering to show improvement&lt;/li>
&lt;/ol>
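&lt;p>For orientation, expected error is commonly computed as the sum of per-base error probabilities, where a Phred score Q corresponds to a probability of 10^(-Q/10). The Python sketch below assumes Phred+33 encoded quality strings; the function names and the threshold of 1.0 are illustrative choices, not values taken from this cookbook.&lt;/p>

```python
def expected_error(quality, phred_offset=33):
    """Sum the per-base error probabilities encoded in a quality string."""
    return sum(10 ** (-(ord(ch) - phred_offset) / 10) for ch in quality)


def passes(quality, max_expected_error):
    """Keep a read whose expected error does not exceed the threshold."""
    return max_expected_error >= expected_error(quality)


# 'I' encodes Q40 (error probability 0.0001), '+' encodes Q10 (0.1).
ee = expected_error("II++")
print(round(ee, 4), passes("II++", 1.0))
```

&lt;p>Two Q10 bases contribute far more to the total than two Q40 bases, which is why expected error reacts to a few bad bases that an average quality score would smooth over.&lt;/p>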
&lt;h2 id="understanding-expected-error">
 Understanding Expected Error
 &lt;a class="anchor" href="#understanding-expected-error">#&lt;/a>
&lt;/h2>
&lt;p>Expected error (EE) is a more nuanced quality metric than average quality score:&lt;/p></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/main/docs/how-to/cookbooks/06-adapter-trimming/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/how-to/cookbooks/06-adapter-trimming/</guid><description>&lt;h1 id="cookbook-06-adapter-trimming-with-polya-tail-removal">
 Cookbook 06: Adapter Trimming with PolyA Tail Removal
 &lt;a class="anchor" href="#cookbook-06-adapter-trimming-with-polya-tail-removal">#&lt;/a>
&lt;/h1>
&lt;h2 id="use-case">
 Use Case
 &lt;a class="anchor" href="#use-case">#&lt;/a>
&lt;/h2>
&lt;p>You have RNA-seq data that contains:&lt;/p>
&lt;ul>
&lt;li>&lt;strong>PolyA tails&lt;/strong>: Stretches of A bases at the 3&amp;rsquo; end (or polyT at 5&amp;rsquo; for reverse strand)&lt;/li>
&lt;li>&lt;strong>Sequencing adapters&lt;/strong>: Illumina or other adapter sequences that need removal before alignment&lt;/li>
&lt;/ul>
&lt;p>These artifacts can interfere with alignment and downstream analysis if not removed.&lt;/p>
&lt;h2 id="what-this-pipeline-does">
 What This Pipeline Does
 &lt;a class="anchor" href="#what-this-pipeline-does">#&lt;/a>
&lt;/h2>
&lt;p>This cookbook demonstrates a complete adapter and polyA trimming workflow:&lt;/p></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/main/docs/how-to/cookbooks/07-demultiplexing/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/how-to/cookbooks/07-demultiplexing/</guid><description>&lt;h1 id="cookbook-07-demultiplexing-by-inline-barcode">
 Cookbook 07: Demultiplexing by Inline Barcode
 &lt;a class="anchor" href="#cookbook-07-demultiplexing-by-inline-barcode">#&lt;/a>
&lt;/h1>
&lt;h2 id="use-case">
 Use Case
 &lt;a class="anchor" href="#use-case">#&lt;/a>
&lt;/h2>
&lt;p>You have pooled sequencing data from multiple samples that were tagged with unique barcode sequences during library preparation
and have not been demultiplexed by your sequencing facility.&lt;/p>
&lt;p>You need to:&lt;/p>
&lt;ul>
&lt;li>Extract the barcode(s) from each read&lt;/li>
&lt;li>Correct sequencing errors in barcodes&lt;/li>
&lt;li>Separate reads into individual files per sample&lt;/li>
&lt;/ul>
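&lt;p>The three tasks above can be sketched in plain Python. This is not how fastqrab implements demultiplexing; it assumes the barcode sits at the start of the read and that error correction means accepting a single unambiguous barcode within a Hamming distance limit. All names and defaults are hypothetical.&lt;/p>

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def assign_sample(read_seq, barcodes, barcode_len, max_mismatches=1):
    """Return (sample, remaining_sequence), or (None, read_seq) if unassigned.

    `barcodes` maps a barcode sequence to a sample name (hypothetical input).
    """
    observed = read_seq[:barcode_len]
    if len(observed) != barcode_len:
        return None, read_seq  # read is shorter than the barcode
    hits = [
        sample
        for barcode, sample in barcodes.items()
        if max_mismatches >= hamming(observed, barcode)
    ]
    # Only accept unambiguous assignments
    if len(hits) == 1:
        return hits[0], read_seq[barcode_len:]
    return None, read_seq
```

&lt;p>Each assigned read would then be written to the output file of its sample; unassigned reads are usually collected separately.&lt;/p>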
&lt;p>This is common in multiplexed sequencing runs to maximize sequencing efficiency and reduce costs.&lt;/p></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/main/docs/how-to/cookbooks/08-length-filtering/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/how-to/cookbooks/08-length-filtering/</guid><description>&lt;h1 id="cookbook-08-read-length-filtering-and-truncation">
 Cookbook 08: Read Length Filtering and Truncation
 &lt;a class="anchor" href="#cookbook-08-read-length-filtering-and-truncation">#&lt;/a>
&lt;/h1>
&lt;h2 id="use-case">
 Use Case
 &lt;a class="anchor" href="#use-case">#&lt;/a>
&lt;/h2>
&lt;p>You have sequencing data with variable read lengths and need to:&lt;/p>
&lt;ul>
&lt;li>Remove reads that are too short (may align poorly or represent artifacts)&lt;/li>
&lt;li>Remove reads that are too long (may indicate technical issues)&lt;/li>
&lt;li>Truncate all reads to a uniform length (required by some downstream tools)&lt;/li>
&lt;/ul>
&lt;p>Read length filtering is important for:&lt;/p>
&lt;ul>
&lt;li>Quality control after adapter trimming&lt;/li>
&lt;li>Preparing data for tools that require uniform read lengths&lt;/li>
&lt;li>Removing degraded or artifactual sequences&lt;/li>
&lt;/ul>
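&lt;p>The three operations listed in this use case amount to a length window followed by a slice. A minimal Python sketch, with hypothetical parameter names that do not correspond to fastqrab options:&lt;/p>

```python
def filter_and_truncate(reads, min_length, max_length, truncate_to):
    """Yield (name, seq, qual) tuples that pass the length window, truncated.

    Reads outside [min_length, max_length] are dropped; the rest are cut
    to at most `truncate_to` bases, with qualities cut to match.
    """
    for name, seq, qual in reads:
        if min_length > len(seq) or len(seq) > max_length:
            continue
        yield name, seq[:truncate_to], qual[:truncate_to]
```

&lt;p>Note that truncation alone only gives uniform lengths if &lt;code>min_length&lt;/code> is at least &lt;code>truncate_to&lt;/code>; shorter reads that pass the filter stay shorter.&lt;/p>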
&lt;h2 id="what-this-pipeline-does">
 What This Pipeline Does
 &lt;a class="anchor" href="#what-this-pipeline-does">#&lt;/a>
&lt;/h2>
&lt;p>This cookbook demonstrates comprehensive read length management:&lt;/p></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/main/docs/how-to/cookbooks/09-fastp-equivalent/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/how-to/cookbooks/09-fastp-equivalent/</guid><description>&lt;h1 id="cookbook-09-fastp-equivalent-workflow">
 Cookbook 09: Fastp-Equivalent Workflow
 &lt;a class="anchor" href="#cookbook-09-fastp-equivalent-workflow">#&lt;/a>
&lt;/h1>
&lt;h2 id="use-case">
 Use Case
 &lt;a class="anchor" href="#use-case">#&lt;/a>
&lt;/h2>
&lt;p>You want to replicate the default behavior of &lt;a href="https://github.com/OpenGene/fastp/">fastp&lt;/a> — a popular all-in-one FASTQ preprocessor — using a configurable pipeline. This is useful when you need reproducible, step-by-step control over each filtering stage, or want to extend the workflow beyond what fastp offers.&lt;/p>
&lt;h2 id="what-this-pipeline-does">
 What This Pipeline Does
 &lt;a class="anchor" href="#what-this-pipeline-does">#&lt;/a>
&lt;/h2>
&lt;p>This cookbook replicates fastp&amp;rsquo;s default single-end processing pipeline:&lt;/p>
&lt;ol>
&lt;li>&lt;strong>PolyG Trimming&lt;/strong>: Uses &lt;code>ExtractPolyTail&lt;/code> + &lt;code>TrimAtTag&lt;/code> to remove polyG tails (Illumina NextSeq/NovaSeq artifact)&lt;/li>
&lt;li>&lt;strong>Adapter Trimming&lt;/strong>: Uses &lt;code>ExtractIUPAC&lt;/code> + &lt;code>TrimAtTag&lt;/code> to remove the Illumina TruSeq R1 adapter&lt;/li>
&lt;li>&lt;strong>N-base Filtering&lt;/strong>: Uses &lt;code>CalcNCount&lt;/code> + &lt;code>FilterByNumericTag&lt;/code> to remove reads with too many ambiguous bases (&lt;code>--n_base_limit 5&lt;/code>)&lt;/li>
&lt;li>&lt;strong>Quality Filtering&lt;/strong>: Uses &lt;code>CalcQualifiedBases&lt;/code> + &lt;code>FilterByNumericTag&lt;/code> to remove reads with too many low-quality bases (&lt;code>--qualified_quality_phred 15&lt;/code>, &lt;code>--unqualified_percent_limit 40&lt;/code>)&lt;/li>
&lt;li>&lt;strong>Length Filtering&lt;/strong>: Uses &lt;code>CalcLength&lt;/code> + &lt;code>FilterByNumericTag&lt;/code> to remove reads shorter than 15bp (&lt;code>--length_required 15&lt;/code>)&lt;/li>
&lt;/ol>
&lt;h2 id="fastp-defaults-replicated">
 Fastp Defaults Replicated
 &lt;a class="anchor" href="#fastp-defaults-replicated">#&lt;/a>
&lt;/h2>
&lt;table>
 &lt;thead>
 &lt;tr>
 &lt;th>Fastp parameter&lt;/th>
 &lt;th>Value&lt;/th>
 &lt;th>Pipeline step&lt;/th>
 &lt;/tr>
 &lt;/thead>
 &lt;tbody>
 &lt;tr>
 &lt;td>&lt;code>--poly_g_min_len&lt;/code>&lt;/td>
 &lt;td>10&lt;/td>
 &lt;td>&lt;code>ExtractPolyTail min_length = 10&lt;/code>&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>&lt;code>--adapter_sequence&lt;/code>&lt;/td>
 &lt;td>&lt;code>AGATCGGAAGAGCACACGTCTGAACTCCAGTCA&lt;/code>&lt;/td>
 &lt;td>&lt;code>ExtractIUPAC query = ...&lt;/code>&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>&lt;code>--n_base_limit&lt;/code>&lt;/td>
 &lt;td>5&lt;/td>
 &lt;td>&lt;code>FilterByNumericTag max_value = 6&lt;/code>&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>&lt;code>--qualified_quality_phred&lt;/code>&lt;/td>
 &lt;td>15 (Phred) → 48 (ASCII)&lt;/td>
 &lt;td>&lt;code>CalcQualifiedBases threshold = 48&lt;/code>&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>&lt;code>--unqualified_percent_limit&lt;/code>&lt;/td>
 &lt;td>40% → keep if ≥ 60% qualified&lt;/td>
 &lt;td>&lt;code>FilterByNumericTag min_value = 0.60&lt;/code>&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>&lt;code>--length_required&lt;/code>&lt;/td>
 &lt;td>15&lt;/td>
 &lt;td>&lt;code>FilterByNumericTag min_value = 15&lt;/code>&lt;/td>
 &lt;/tr>
 &lt;/tbody>
&lt;/table>
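&lt;p>The two quality-related rows of the table can be checked with a few lines of Python. This sketch assumes Phred+33 encoding and only illustrates the arithmetic; the function names are hypothetical and it is not fastqrab&amp;rsquo;s or fastp&amp;rsquo;s implementation.&lt;/p>

```python
PHRED_OFFSET = 33  # assumes Phred+33 (Sanger / Illumina 1.8+) encoding


def qualified_fraction(quality, min_phred=15):
    """Fraction of bases whose Phred score is at least `min_phred`."""
    if not quality:
        return 0.0
    threshold = min_phred + PHRED_OFFSET  # 15 + 33 = 48
    qualified = sum(1 for c in quality if ord(c) >= threshold)
    return qualified / len(quality)


def passes_quality_filter(quality, min_fraction=0.60):
    """Keep a read when at least 60% of its bases are qualified."""
    return qualified_fraction(quality) >= min_fraction
```

&lt;p>Allowing at most 40% unqualified bases is the same condition as requiring at least 60% qualified bases, which is why the filter uses a minimum of 0.60.&lt;/p>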
&lt;p>&lt;strong>Note on quality scores&lt;/strong>: Quality values in FASTQ files are ASCII-encoded. With the Phred+33 offset, Q15 corresponds to ASCII code 48, the character &lt;code>0&lt;/code> (&lt;code>15 + 33 = 48&lt;/code>).&lt;/p></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/main/docs/how-to/cookbooks/10-adapter-identification/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/how-to/cookbooks/10-adapter-identification/</guid><description>&lt;h1 id="cookbook-10-adapter-identification">
 Cookbook 10: Adapter Identification
 &lt;a class="anchor" href="#cookbook-10-adapter-identification">#&lt;/a>
&lt;/h1>
&lt;h2 id="use-case">
 Use Case
 &lt;a class="anchor" href="#use-case">#&lt;/a>
&lt;/h2>
&lt;p>You have a FASTQ file and want to identify which sequencing adapter is present
before trimming — or to confirm no adapter contamination remains after
trimming. This is useful when the adapter type is unknown, when working with
data from multiple library prep kits, or when validating a trimming step.&lt;/p>
&lt;h2 id="what-this-pipeline-does">
 What This Pipeline Does
 &lt;a class="anchor" href="#what-this-pipeline-does">#&lt;/a>
&lt;/h2>
&lt;ol>
&lt;li>Runs a single &lt;code>Report&lt;/code> step that counts exact occurrences of each common
adapter sequence in every read (&lt;code>count_oligos&lt;/code>)&lt;/li>
&lt;li>Writes an &lt;a href="../../../../cookbooks/10-adapter-identification/output.html">HTML&lt;/a> and JSON report — no reads are filtered or written to disk&lt;/li>
&lt;/ol>
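&lt;p>The exact matching in step 1 behaves like a plain substring test. The Python sketch below illustrates that idea under the assumption that each read is counted at most once per adapter; the function name is hypothetical and this is not the &lt;code>Report&lt;/code> step&amp;rsquo;s actual code.&lt;/p>

```python
def count_reads_with_oligo(reads, oligos):
    """Count, per oligo, the reads that contain it verbatim.

    No mismatches and no wildcards: a read counts only if the full
    oligo appears somewhere in its sequence.
    """
    counts = {oligo: 0 for oligo in oligos}
    for seq in reads:
        for oligo in oligos:
            if oligo in seq:
                counts[oligo] += 1
    return counts
```

&lt;p>Because matching is exact, a partial adapter at the very end of a read is not counted, so a count of zero does not rule out truncated adapter remnants.&lt;/p>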
&lt;h2 id="how-count_oligos-works">
 How count_oligos Works
 &lt;a class="anchor" href="#how-count_oligos-works">#&lt;/a>
&lt;/h2>
&lt;p>&lt;code>count_oligos&lt;/code> performs exact, full-sequence matching across every read. A read
is counted if the probe sequence appears verbatim anywhere within it. There are
no mismatches and no IUPAC wildcards. A non-zero count means reads carry at
least one complete copy of that adapter.&lt;/p></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/main/docs/reference/filter-steps/FilterReservoirSample/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/reference/filter-steps/FilterReservoirSample/</guid><description>&lt;h1 id="filterreservoirsample">
 FilterReservoirSample
 &lt;a class="anchor" href="#filterreservoirsample">#&lt;/a>
&lt;/h1>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;FilterReservoirSample&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">n&lt;/span> = &lt;span style="color:#ae81ff">10_000&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">seed&lt;/span> = &lt;span style="color:#ae81ff">59014&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Keep a fixed number of reads using &lt;a href="https://en.wikipedia.org/wiki/Reservoir_sampling">reservoir sampling&lt;/a>, so that every input read has an equal probability of being selected.&lt;/p>
&lt;p>This means we need to keep n reads in memory, and they get processed as one
large block at the end by all downstream steps.&lt;/p>
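&lt;p>For readers unfamiliar with the technique, here is a minimal Python sketch of the classic reservoir sampling scheme (Algorithm R). It shows why exactly n reads are held in memory and only become available once the input is exhausted. It is an illustration only: fastqrab&amp;rsquo;s random number generator and exact algorithm may differ, so the same seed will not select the same reads.&lt;/p>

```python
import random


def reservoir_sample(reads, n, seed):
    """Algorithm R: keep n items from an iterable of unknown length."""
    rng = random.Random(seed)
    reservoir = []
    for i, read in enumerate(reads):
        if n > i:
            # Fill the reservoir with the first n reads
            reservoir.append(read)
        else:
            # Replace a random slot with probability n / (i + 1)
            j = rng.randint(0, i)  # inclusive on both ends
            if n > j:
                reservoir[j] = read
    return reservoir
```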
&lt;p>That means it&amp;rsquo;s not the right tool if you want to sample millions of reads,
but it is the right tool if you want a small, fixed number of reads.&lt;/p></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/main/docs/reference/filter-steps/FilterSample/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/reference/filter-steps/FilterSample/</guid><description>&lt;h1 id="filtersample">
 FilterSample
 &lt;a class="anchor" href="#filtersample">#&lt;/a>
&lt;/h1>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;FilterSample&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">p&lt;/span> = &lt;span style="color:#ae81ff">0.5&lt;/span> &lt;span style="color:#75715e"># float, the chance for any given read to be kept&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#75715e"># 0..1&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">seed&lt;/span> = &lt;span style="color:#ae81ff">42&lt;/span> &lt;span style="color:#75715e"># (optional) random seed for reproducibility&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Randomly sample a percentage of reads.
Requires a random seed to ensure reproducibility.&lt;/p></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/main/docs/reference/modification-steps/ConvertQuality/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/reference/modification-steps/ConvertQuality/</guid><description>&lt;h1 id="convertquality">
 ConvertQuality
 &lt;a class="anchor" href="#convertquality">#&lt;/a>
&lt;/h1>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;ConvertQuality&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">from&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;Illumina1.8&amp;#34;&lt;/span> &lt;span style="color:#75715e"># Illumina1.8|Illumina1.3|Sanger|Solexa&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">to&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;Solexa&amp;#34;&lt;/span> &lt;span style="color:#75715e"># same options as from. Illumina1.8 is an alias for Sanger&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Convert quality scores between various encodings / meanings.&lt;/p>
&lt;p>See &lt;a href="https://en.wikipedia.org/wiki/Phred_quality_score">https://en.wikipedia.org/wiki/Phred_quality_score&lt;/a>&lt;/p>
&lt;p>Errors if &lt;code>from&lt;/code> and &lt;code>to&lt;/code> are the same.&lt;/p>
&lt;p>This step introduces a &lt;a href="https://tyberiusprime.github.io/fastqrab/main/fastqrab/main/docs/reference/validation-steps/ValidateQuality/">ValidateQuality&lt;/a> step automatically before it.&lt;/p>
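&lt;p>To show what converting between &amp;ldquo;encodings / meanings&amp;rdquo; involves, the Python sketch below re-encodes a Sanger (Phred+33) quality string as Solexa (offset 64). Solexa scores use a different formula from Phred scores, so this is more than an offset shift. The rounding and the clamp at -5 follow common convention and are assumptions here, not fastqrab&amp;rsquo;s documented behaviour; the function names are hypothetical.&lt;/p>

```python
import math


def solexa_from_phred(q_phred):
    """Convert a non-negative Phred score to the Solexa scale (clamped at -5)."""
    if q_phred == 0:
        return -5.0
    return max(-5.0, 10 * math.log10(10 ** (q_phred / 10) - 1))


def phred_from_solexa(q_solexa):
    """Convert a Solexa score back to the Phred scale."""
    return 10 * math.log10(10 ** (q_solexa / 10) + 1)


def sanger_to_solexa(quality):
    """Re-encode a Phred+33 quality string as a Solexa+64 quality string."""
    return "".join(
        chr(round(solexa_from_phred(ord(c) - 33)) + 64) for c in quality
    )
```

&lt;p>For high scores the two scales are nearly identical (Q40 stays 40), while low scores diverge: Phred 0 maps to Solexa -5.&lt;/p>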
&lt;h2 id="corresponding-options-in-other-software">
 Corresponding options in other software
 &lt;a class="anchor" href="#corresponding-options-in-other-software">#&lt;/a>
&lt;/h2>
&lt;ul>
&lt;li>trimmomatic TOPHRED33&lt;/li>
&lt;/ul></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/main/docs/reference/modification-steps/CutEnd/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/reference/modification-steps/CutEnd/</guid><description>&lt;h1 id="cutend">
 CutEnd
 &lt;a class="anchor" href="#cutend">#&lt;/a>
&lt;/h1>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;CutEnd&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">n&lt;/span> = &lt;span style="color:#ae81ff">5&lt;/span> &lt;span style="color:#75715e"># positive integer, cut n nucleotides from the end of the read&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">segment&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;read1&amp;#34;&lt;/span> &lt;span style="color:#75715e"># Any of your input segments (default: read1)&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">if_tag&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;mytag&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Cut nucleotides from the end of the read.&lt;/p>
&lt;p>May produce empty reads; filter those with &lt;a href="https://tyberiusprime.github.io/fastqrab/main/fastqrab/main/docs/reference/filter-steps/FilterEmpty/">FilterEmpty&lt;/a>.&lt;/p>
&lt;p>Optionally only applies if a &lt;a href="https://tyberiusprime.github.io/fastqrab/main/fastqrab/main/docs/concepts/tag/">tag&lt;/a> is truthy.&lt;/p></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/main/docs/reference/modification-steps/CutStart/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/reference/modification-steps/CutStart/</guid><description>&lt;h3 id="cutstart">
 CutStart
 &lt;a class="anchor" href="#cutstart">#&lt;/a>
&lt;/h3>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;CutStart&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">n&lt;/span> = &lt;span style="color:#ae81ff">5&lt;/span> &lt;span style="color:#75715e"># positive integer, cut n nucleotides from the start of the read&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">segment&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;read1&amp;#34;&lt;/span> &lt;span style="color:#75715e"># Any of your input segments&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">if_tag&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;mytag&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Cut nucleotides from the start of the read.&lt;/p>
&lt;p>May produce empty reads; filter those with &lt;a href="https://tyberiusprime.github.io/fastqrab/main/fastqrab/main/docs/reference/filter-steps/FilterEmpty/">FilterEmpty&lt;/a>.&lt;/p>
&lt;p>Optionally only applies if a &lt;a href="https://tyberiusprime.github.io/fastqrab/main/fastqrab/main/docs/concepts/tag/">tag&lt;/a> is truthy.&lt;/p></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/main/docs/reference/modification-steps/ExtractToName/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/reference/modification-steps/ExtractToName/</guid><description>&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;ExtractRegions&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">regions&lt;/span> = [
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> {&lt;span style="color:#a6e22e">segment&lt;/span>= &lt;span style="color:#e6db74">&amp;#34;Read1&amp;#34;&lt;/span>, &lt;span style="color:#a6e22e">start&lt;/span> = &lt;span style="color:#ae81ff">0&lt;/span>, &lt;span style="color:#a6e22e">length&lt;/span> = &lt;span style="color:#ae81ff">8&lt;/span>},
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> {&lt;span style="color:#a6e22e">segment&lt;/span>= &lt;span style="color:#e6db74">&amp;#34;Read1&amp;#34;&lt;/span>, &lt;span style="color:#a6e22e">start&lt;/span> = &lt;span style="color:#ae81ff">12&lt;/span>, &lt;span style="color:#a6e22e">length&lt;/span> = &lt;span style="color:#ae81ff">4&lt;/span>},
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> ]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">out_label&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;umi&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">region_separator&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;_&amp;#34;&lt;/span> &lt;span style="color:#75715e"># (optional) str, what to put between the regions, defaults to &amp;#39;_&amp;#39;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;StoreTagInComment&amp;#34;&lt;/span> 
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">in_label&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;umi&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Extract a sequence from the read and place it in the read name&amp;rsquo;s comment section,
so a (space separated) &amp;lsquo;key=value&amp;rsquo; pair is added to the read name.&lt;/p>
&lt;p>Supports extracting multiple regions.&lt;/p>
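&lt;p>The effect of the two steps above on a single read can be sketched in Python. The helpers are hypothetical and only mirror the configuration shown: two regions joined by &lt;code>_&lt;/code>, then appended to the read name as a &amp;lsquo;key=value&amp;rsquo; pair. The exact comment formatting fastqrab produces may differ.&lt;/p>

```python
def extract_regions(seq, regions, separator="_"):
    """Join the (start, length) slices of `seq` with `separator`."""
    return separator.join(seq[start:start + length] for start, length in regions)


def store_in_comment(name, label, value):
    """Append a space-separated 'key=value' pair to a read name."""
    return f"{name} {label}={value}"


# Regions as in the example: 8 bases from position 0, 4 bases from position 12
umi = extract_regions("ACGTACGTNNNNTTGGCC", [(0, 8), (12, 4)])
new_name = store_in_comment("read1", "umi", umi)
```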
&lt;p>See &lt;a href="https://tyberiusprime.github.io/fastqrab/main/fastqrab/main/docs/reference/tag-steps/">the tag section&lt;/a> for more tag generation options.&lt;/p></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/main/docs/reference/modification-steps/Head/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/reference/modification-steps/Head/</guid><description>&lt;h1 id="head">
 Head
 &lt;a class="anchor" href="#head">#&lt;/a>
&lt;/h1>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;Head&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">n&lt;/span> = &lt;span style="color:#ae81ff">1000&lt;/span> &lt;span style="color:#75715e"># positive integer, number of reads to keep&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Output just the first n molecules.&lt;/p>
&lt;h3 id="demultiplex-interaction">
 Demultiplex interaction
 &lt;a class="anchor" href="#demultiplex-interaction">#&lt;/a>
&lt;/h3>
&lt;p>If present after a demultiplex step, includes n molecules in each stream.&lt;/p></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/main/docs/reference/modification-steps/Lowercase/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/reference/modification-steps/Lowercase/</guid><description>&lt;h1 id="lowercase">
 Lowercase
 &lt;a class="anchor" href="#lowercase">#&lt;/a>
&lt;/h1>
&lt;p>Convert sequences, tags, or read names to lowercase.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;Lowercase&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">target&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;read1&amp;#34;&lt;/span> &lt;span style="color:#75715e"># Any input segment, &amp;#39;All&amp;#39;, &amp;#39;tag:mytag&amp;#39;, or &amp;#39;name:read1&amp;#39;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#75715e">#if_tag = &amp;#34;mytag&amp;#34; # Optional: only apply if tag is truthy&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;h2 id="target-options">
 Target Options
 &lt;a class="anchor" href="#target-options">#&lt;/a>
&lt;/h2>
&lt;ul>
&lt;li>&lt;strong>Segment&lt;/strong>: &lt;code>&amp;quot;read1&amp;quot;&lt;/code>, &lt;code>&amp;quot;read2&amp;quot;&lt;/code>, &lt;code>&amp;quot;index1&amp;quot;&lt;/code>, &lt;code>&amp;quot;index2&amp;quot;&lt;/code>, or &lt;code>&amp;quot;All&amp;quot;&lt;/code> - lowercases the sequence&lt;/li>
&lt;li>&lt;strong>Tag&lt;/strong>: &lt;code>&amp;quot;tag:mytag&amp;quot;&lt;/code> - lowercases the tag&amp;rsquo;s sequence content (Location-type tags only)&lt;/li>
&lt;li>&lt;strong>Name&lt;/strong>: &lt;code>&amp;quot;name:read1&amp;quot;&lt;/code> - lowercases the read name (not including comments)&lt;/li>
&lt;/ul>
&lt;p>Optionally only applies if a &lt;a href="https://tyberiusprime.github.io/fastqrab/main/fastqrab/main/docs/concepts/tag/">tag&lt;/a> is truthy via &lt;code>if_tag&lt;/code>.&lt;/p></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/main/docs/reference/modification-steps/MergeReads/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/reference/modification-steps/MergeReads/</guid><description>&lt;h1 id="mergereads">
 MergeReads
 &lt;a class="anchor" href="#mergereads">#&lt;/a>
&lt;/h1>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;MergeReads&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">segment1&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;read1&amp;#34;&lt;/span> &lt;span style="color:#75715e"># First segment&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">segment2&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;read2&amp;#34;&lt;/span> &lt;span style="color:#75715e"># Second segment&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">reverse_complement_segment2&lt;/span> = &lt;span style="color:#66d9ef">true&lt;/span> &lt;span style="color:#75715e"># Whether to reverse complement segment2 (suggested: true)&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">algorithm&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;FastpSeemsWeird&amp;#34;&lt;/span> &lt;span style="color:#75715e"># Algorithm: &amp;#34;fastp_seems_weird&amp;#34;. Further algorithms are in planning&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">min_overlap&lt;/span> = &lt;span style="color:#ae81ff">30&lt;/span> &lt;span style="color:#75715e"># Minimum overlap length required&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">max_mismatch_rate&lt;/span> = &lt;span style="color:#ae81ff">0.2&lt;/span> &lt;span style="color:#75715e"># Maximum allowed mismatch rate (0.0-1.0) (suggested: 0.2)&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">max_mismatch_count&lt;/span> = &lt;span style="color:#ae81ff">5&lt;/span> &lt;span style="color:#75715e"># Maximum allowed absolute mismatches (suggested: 5)&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">no_overlap_strategy&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;as_is&amp;#34;&lt;/span> &lt;span style="color:#75715e"># What to do when no overlap found: &amp;#34;as_is&amp;#34; or &amp;#34;concatenate&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">concatenate_spacer&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;NNNN&amp;#34;&lt;/span> &lt;span style="color:#75715e"># (optional) Required if no_overlap_strategy = &amp;#34;concatenate&amp;#34;. Spacer sequence to insert between reads&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">spacer_quality_char&lt;/span> = &lt;span style="color:#ae81ff">33&lt;/span> &lt;span style="color:#75715e"># (optional) Quality score for spacer bases (suggested: 33 = Phred quality 0)&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#75715e"># out_label = &amp;#34;merged&amp;#34; # (optional) output Tag label for boolean merge status&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Merges paired-end reads from two segments by detecting their overlap and resolving mismatches.&lt;/p></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/main/docs/reference/modification-steps/Rename/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/reference/modification-steps/Rename/</guid><description>&lt;h3 id="rename">
 Rename
 &lt;a class="anchor" href="#rename">#&lt;/a>
&lt;/h3>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;Rename&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">search&lt;/span> = &lt;span style="color:#e6db74">&amp;#39;(.)/([12])$&amp;#39;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">replacement&lt;/span> = &lt;span style="color:#e6db74">&amp;#39;$1 $2&amp;#39;&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Apply a regular expression based renaming to the reads.&lt;/p>
&lt;p>It is always applied to all available segments (read1, read2, index1, index2).&lt;/p>
&lt;p>The example above fixes old-school MGI read names for downstream processing, like
fastp&amp;rsquo;s &lt;code>--fix_mgi_id&lt;/code> option.&lt;/p>
&lt;p>You can use the full power of the &lt;a href="https://docs.rs/regex/latest/regex/">rust regex crate&lt;/a> here.&lt;/p>
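&lt;p>To see what such a rename does to a read name, here is a rough Python equivalent of the MGI fix. It uses a pattern with the same intent as the example (a trailing &lt;code>/1&lt;/code> or &lt;code>/2&lt;/code> becomes a space-separated suffix). Python&amp;rsquo;s &lt;code>re&lt;/code> module writes capture groups as &lt;code>\1&lt;/code> where the Rust regex crate uses &lt;code>$1&lt;/code>, and the two engines differ in other details, so treat this purely as an illustration. The function name and the sample read name are made up.&lt;/p>

```python
import re

# Last character before the slash, then a trailing read number 1 or 2
MGI_SUFFIX = re.compile(r"(.)/([12])$")


def fix_mgi_name(name):
    """Turn a trailing '/1' or '/2' into ' 1' or ' 2'."""
    return MGI_SUFFIX.sub(r"\1 \2", name)
```

&lt;p>Names without the trailing suffix do not match the pattern and pass through unchanged.&lt;/p>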
&lt;h4 id="read-index-placeholder">
 Read index placeholder
 &lt;a class="anchor" href="#read-index-placeholder">#&lt;/a>
&lt;/h4>
&lt;p>After the regex replacement runs, the special literal &lt;code>{{READ_INDEX}}&lt;/code> is expanded to
the running 1-based index of each logical read. When multiple segments are present
(for example &lt;code>read1&lt;/code>/&lt;code>read2&lt;/code> pairs), every segment for the same read receives the
same index so pairs stay aligned. This makes it easy to re-sequence identifiers:&lt;/p></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/main/docs/reference/modification-steps/ReverseComplement/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/reference/modification-steps/ReverseComplement/</guid><description>&lt;h1 id="reversecomplement">
 ReverseComplement
 &lt;a class="anchor" href="#reversecomplement">#&lt;/a>
&lt;/h1>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;ReverseComplement&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">segment&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;read1&amp;#34;&lt;/span> &lt;span style="color:#75715e"># Any of your input segments (default: read1)&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">if_tag&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;mytag&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Reverse-complements the read sequence (and reverses the quality).&lt;/p>
&lt;p>This supports IUPAC codes (U is complemented to A, so it&amp;rsquo;s not strictly
reversible). Unknown letters are output verbatim.&lt;/p>
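&lt;p>The behaviour described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the fastqrab implementation: the table below is the standard IUPAC complement mapping, and the lowercase handling is an assumption.&lt;/p>

```python
# IUPAC complement pairs. U is mapped to A, which is why a round trip
# does not restore U.
_PAIRS = {
    "A": "T", "C": "G", "G": "C", "T": "A", "U": "A",
    "R": "Y", "Y": "R", "S": "S", "W": "W", "K": "M", "M": "K",
    "B": "V", "V": "B", "D": "H", "H": "D", "N": "N",
}
_KEYS = "".join(_PAIRS)
_VALUES = "".join(_PAIRS.values())
# Assumption: lowercase letters are complemented the same way.
_TABLE = str.maketrans(_KEYS + _KEYS.lower(), _VALUES + _VALUES.lower())


def reverse_complement(seq, qual):
    """Return the reverse-complemented sequence and the reversed qualities."""
    # Letters missing from the table are left as they are by translate().
    return seq.translate(_TABLE)[::-1], qual[::-1]


print(reverse_complement("ACGTN", "IIII#"))
print(reverse_complement("AUX", "123"))
```

&lt;p>Applying the function twice to a sequence containing &lt;code>U&lt;/code> yields &lt;code>T&lt;/code> in its place, which is the irreversibility mentioned above.&lt;/p>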
&lt;p>Useful to combine with &lt;a href="https://tyberiusprime.github.io/fastqrab/main/fastqrab/main/docs/reference/modification-steps/Swap/">Swap&lt;/a>.&lt;/p>
&lt;p>Optionally only applies if a &lt;a href="https://tyberiusprime.github.io/fastqrab/main/fastqrab/main/docs/concepts/tag/">tag&lt;/a> is truthy.&lt;/p></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/main/docs/reference/modification-steps/Skip/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/reference/modification-steps/Skip/</guid><description>&lt;h1 id="skip">
 Skip
 &lt;a class="anchor" href="#skip">#&lt;/a>
&lt;/h1>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;Skip&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">n&lt;/span> = &lt;span style="color:#ae81ff">1000&lt;/span> &lt;span style="color:#75715e"># positive integer, number of reads to skip&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Skips the first n molecules.&lt;/p>
&lt;h3 id="demultiplex-interaction">
 Demultiplex interaction
 &lt;a class="anchor" href="#demultiplex-interaction">#&lt;/a>
&lt;/h3>
&lt;p>If present after a demultiplex step, skips the first n molecules in that stream.&lt;/p></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/main/docs/reference/modification-steps/Swap/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/reference/modification-steps/Swap/</guid><description>&lt;h1 id="swap">
 Swap
 &lt;a class="anchor" href="#swap">#&lt;/a>
&lt;/h1>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;Swap&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">segment_a&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;read1&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">segment_b&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;read2&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">if_tag&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;mytag&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Swaps exactly two segments.&lt;/p>
&lt;p>Arguments &lt;code>segment_a&lt;/code>/&lt;code>segment_b&lt;/code> are only necessary if there are more than two segments defined in the input.&lt;/p>
&lt;p>Optionally only applies if a &lt;a href="https://tyberiusprime.github.io/fastqrab/main/fastqrab/main/docs/concepts/tag/">tag&lt;/a> is truthy.&lt;/p></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/main/docs/reference/modification-steps/Truncate/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/reference/modification-steps/Truncate/</guid><description>&lt;h1 id="maxlen">
 MaxLen
 &lt;a class="anchor" href="#maxlen">#&lt;/a>
&lt;/h1>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;Truncate&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">n&lt;/span> = &lt;span style="color:#ae81ff">100&lt;/span> &lt;span style="color:#75715e"># the maximum length of the read. Cut at end if longer&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">segment&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;read1&amp;#34;&lt;/span> &lt;span style="color:#75715e"># Any of your input segments (default: read1)&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">if_tag&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;mytag&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Cut the read down to at most &lt;code>n&lt;/code> bases.&lt;/p>
&lt;p>Optionally only applies if a &lt;a href="https://tyberiusprime.github.io/fastqrab/main/fastqrab/main/docs/concepts/tag/">tag&lt;/a> is truthy.&lt;/p></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/main/docs/reference/modification-steps/Uppercase/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/reference/modification-steps/Uppercase/</guid><description>
&lt;h1 id="uppercase">
 Uppercase
 &lt;a class="anchor" href="#uppercase">#&lt;/a>
&lt;/h1>
&lt;p>Convert sequences, tags, or read names to uppercase.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;Uppercase&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">target&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;read1&amp;#34;&lt;/span> &lt;span style="color:#75715e"># Any input segment, &amp;#39;All&amp;#39;, &amp;#39;tag:mytag&amp;#39;, or &amp;#39;name:read1&amp;#39;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#75715e">#if_tag = &amp;#34;mytag&amp;#34; # Optional: only apply if tag is truthy&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;h2 id="target-options">
 Target Options
 &lt;a class="anchor" href="#target-options">#&lt;/a>
&lt;/h2>
&lt;ul>
&lt;li>&lt;strong>Segment&lt;/strong>: &lt;code>&amp;quot;read1&amp;quot;&lt;/code>, &lt;code>&amp;quot;read2&amp;quot;&lt;/code>, &lt;code>&amp;quot;index1&amp;quot;&lt;/code>, &lt;code>&amp;quot;index2&amp;quot;&lt;/code>, or &lt;code>&amp;quot;All&amp;quot;&lt;/code> - uppercases the sequence&lt;/li>
&lt;li>&lt;strong>Tag&lt;/strong>: &lt;code>&amp;quot;tag:mytag&amp;quot;&lt;/code> - uppercases the tag&amp;rsquo;s sequence content (Location-type tags only)&lt;/li>
&lt;li>&lt;strong>Name&lt;/strong>: &lt;code>&amp;quot;name:read1&amp;quot;&lt;/code> - uppercases the read name (not including comments)&lt;/li>
&lt;/ul>
&lt;p>Optionally only applies if a &lt;a href="https://tyberiusprime.github.io/fastqrab/main/fastqrab/main/docs/concepts/tag/">tag&lt;/a> is truthy via &lt;code>if_tag&lt;/code>.&lt;/p></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/main/docs/reference/Out_Of_Scope/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/reference/Out_Of_Scope/</guid><description>&lt;h1 id="out-of-scope">
 Out of scope
 &lt;a class="anchor" href="#out-of-scope">#&lt;/a>
&lt;/h1>
&lt;p>Things fastqrab will explicitly not do and that won&amp;rsquo;t be implemented.&lt;/p>
&lt;h2 id="anything-based-on-averaging-phred-scores">
 Anything based on averaging phred scores
 &lt;a class="anchor" href="#anything-based-on-averaging-phred-scores">#&lt;/a>
&lt;/h2>
&lt;p>This covers any trimming or filtering based on the average quality, for example in a sliding window.
Arithmetic averaging of Phred scores is wrong, because the scores are logarithmic.&lt;/p>
&lt;p>See &lt;a href="https://tyberiusprime.github.io/fastqrab/main/fastqrab/main/docs/reference/tag-steps/calc/CalcMeanQuality/">CalcMeanQuality&lt;/a>.&lt;/p>
&lt;h3 id="corresponding-options-in-other-software">
 Corresponding options in other software
 &lt;a class="anchor" href="#corresponding-options-in-other-software">#&lt;/a>
&lt;/h3>
&lt;ul>
&lt;li>Trimmomatic SLIDINGWINDOW&lt;/li>
&lt;li>fastp &amp;ndash;cut_front&lt;/li>
&lt;li>fastp &amp;ndash;cut_tail&lt;/li>
&lt;li>fastp &amp;ndash;cut_right&lt;/li>
&lt;/ul>
&lt;h2 id="fast5">
 Fast5
 &lt;a class="anchor" href="#fast5">#&lt;/a>
&lt;/h2>
&lt;p>&lt;a href="https://medium.com/@shiansu/a-look-at-the-nanopore-fast5-format-f711999e2ff6">https://medium.com/@shiansu/a-look-at-the-nanopore-fast5-format-f711999e2ff6&lt;/a>
Oxford Nanopore squiggle data.
Apparently no formal spec.&lt;/p>
&lt;h2 id="kallisto-bus-format">
 kallisto BUS format
 &lt;a class="anchor" href="#kallisto-bus-format">#&lt;/a>
&lt;/h2>
&lt;pre>&lt;code>- a brief barcode/umi format for single cell RNA-seq
- needs an 'equivalence class' - i.e. at least pseudo alignment
- weird length restrictions on barcodes and umis (1(!)-32), 
 but stores the length in an uint32...
&lt;/code>&lt;/pre>
&lt;h2 id="alignment">
 Alignment
 &lt;a class="anchor" href="#alignment">#&lt;/a>
&lt;/h2>
&lt;p>While it&amp;rsquo;s tempting to leverage the fastq parsing for an aligner,
aligning molecules to references is out of scope for the 1.0 target.&lt;/p></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/main/docs/reference/report-steps/Inspect/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/reference/report-steps/Inspect/</guid><description>&lt;h1 id="inspect">
 Inspect
 &lt;a class="anchor" href="#inspect">#&lt;/a>
&lt;/h1>
&lt;p>Dump a few reads to a FASTQ file for inspection at this point in the graph.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;Inspect&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">n&lt;/span> = &lt;span style="color:#ae81ff">1000&lt;/span> &lt;span style="color:#75715e"># how many molecules &lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">infix&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;inspect_at_point&amp;#34;&lt;/span> &lt;span style="color:#75715e"># output filename infix&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">segment&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;read1&amp;#34;&lt;/span> &lt;span style="color:#75715e"># Any of your input segments (use &amp;#34;all&amp;#34; for interleaved output)&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">format&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;FASTQ&amp;#34;&lt;/span> &lt;span style="color:#75715e"># output format: FASTQ, FASTA (no BAM)&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">suffix&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;compressed&amp;#34;&lt;/span> &lt;span style="color:#75715e"># (optional) custom suffix for filename&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">compression&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;gzip&amp;#34;&lt;/span> &lt;span style="color:#75715e"># (optional) compression format: raw, gzip, zstd. Defaults to uncompressed&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">compression_level&lt;/span> = &lt;span style="color:#ae81ff">1&lt;/span> &lt;span style="color:#75715e"># (optional) compression level for gzip/zstd/bam (gzip, zstd: 1-22)&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#75715e"># defaults: gzip=6, zstd=5&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Output filename pattern:&lt;/p></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/main/docs/reference/report-steps/Progress/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/reference/report-steps/Progress/</guid><description>&lt;h1 id="progress">
 Progress
 &lt;a class="anchor" href="#progress">#&lt;/a>
&lt;/h1>
&lt;p>Emit progress to stdout (default) or, if &lt;code>output_infix&lt;/code> is set, to a .progress log file.
The filename is &lt;code>{output_prefix}{ix_separator}{infix}.progress&lt;/code>; the default separator is &lt;code>_&lt;/code>.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;Progress&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">n&lt;/span> = &lt;span style="color:#ae81ff">100_000&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">output_infix&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;progress&amp;#34;&lt;/span> &lt;span style="color:#75715e"># optional&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Every &lt;code>n&lt;/code> reads, report total progress and total reads per second.
At the end, report final runtime and reads/second.&lt;/p></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/main/docs/reference/tag-steps/calc/CalcComplexity/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/reference/tag-steps/calc/CalcComplexity/</guid><description>&lt;h1 id="calccomplexity">
 CalcComplexity
 &lt;a class="anchor" href="#calccomplexity">#&lt;/a>
&lt;/h1>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;CalcComplexity&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">out_label&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;complexity&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">segment&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;read1&amp;#34;&lt;/span> &lt;span style="color:#75715e"># Any of your input segments, or &amp;#39;All&amp;#39;&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Calculate read complexity as the fraction of bases that differ from their predecessor.
The value ranges from 0.0 to 1.0.&lt;/p>
&lt;p>A good filter value might be 0.30, which means 30% complexity is required. See
&lt;a href="https://tyberiusprime.github.io/fastqrab/main/fastqrab/main/docs/reference/filter-steps/FilterByNumericTag/">FilterByNumericTag&lt;/a>.&lt;/p>
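&lt;p>To make the measure concrete, here is a minimal Python sketch of such a complexity score. It is an illustration, not the fastqrab implementation. Normalising by the number of adjacent base pairs and returning 0.0 for reads shorter than two bases are both assumptions.&lt;/p>

```python
def complexity(seq):
    """Fraction of bases that differ from the base before them (0.0 to 1.0)."""
    if len(seq) in (0, 1):
        # Assumption: reads without any adjacent pair score 0.0.
        return 0.0
    changes = sum(1 for prev, cur in zip(seq, seq[1:]) if prev != cur)
    # Assumption: normalise by the number of adjacent pairs (length - 1).
    return changes / (len(seq) - 1)


print(complexity("AAAAAAAAAA"))  # homopolymer
print(complexity("ACGTACGTAC"))  # every base differs from its predecessor
print(complexity("AAAAACCCCC"))  # a single change in nine adjacent pairs
```

&lt;p>With a threshold of 0.30, the first and third example reads would fall below the cutoff and the second would pass.&lt;/p>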
&lt;h2 id="corresponding-options-in-other-software">
 Corresponding options in other software
 &lt;a class="anchor" href="#corresponding-options-in-other-software">#&lt;/a>
&lt;/h2>
&lt;ul>
&lt;li>fastp: --low_complexity_filter&lt;/li>
&lt;/ul></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/main/docs/reference/tag-steps/calc/CalcMeanQuality/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/reference/tag-steps/calc/CalcMeanQuality/</guid><description>&lt;h3 id="calcmeanquality">
 CalcMeanQuality
 &lt;a class="anchor" href="#calcmeanquality">#&lt;/a>
&lt;/h3>
&lt;p>We don&amp;rsquo;t support calculating the &amp;lsquo;average quality&amp;rsquo;.&lt;/p>
&lt;p>This is typically a bad idea, see &lt;a href="https://www.drive5.com/usearch/manual/avgq.html">https://www.drive5.com/usearch/manual/avgq.html&lt;/a> for a discussion of the issues.&lt;/p>
&lt;p>To illustrate, a read with 140 x Q35 and 10 x Q2 bases has an &amp;lsquo;average&amp;rsquo; Phred of 33, but 6.4 expected wrong bases.
A read with 150 x Q25 has a much worse &amp;lsquo;average&amp;rsquo; Phred of 25, but a much lower expected number of errors at 0.5.&lt;/p>
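&lt;p>The numbers above can be checked with a short Python sketch. It assumes the usual Phred definition, where a score Q corresponds to an error probability of 10^(-Q/10). The function names are illustrative only.&lt;/p>

```python
def mean_phred(quals):
    """Arithmetic mean of Phred scores (the misleading number)."""
    return sum(quals) / len(quals)


def expected_errors(quals):
    """Sum of per-base error probabilities, p = 10 ** (-Q / 10)."""
    return sum(10 ** (-q / 10) for q in quals)


mixed = [35] * 140 + [2] * 10
uniform = [25] * 150

for label, quals in (("140 x Q35 + 10 x Q2", mixed), ("150 x Q25", uniform)):
    print(label, round(mean_phred(quals), 1), round(expected_errors(quals), 1))
```

&lt;p>The mixed read has the higher mean score (about 32.8) yet roughly 6.4 expected errors, while the uniform Q25 read comes out at about 0.5.&lt;/p>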
&lt;h2 id="corresponding-options-in-other-software">
 Corresponding options in other software
 &lt;a class="anchor" href="#corresponding-options-in-other-software">#&lt;/a>
&lt;/h2>
&lt;ul>
&lt;li>Trimmomatic: AVGQUAL:&lt;/li>
&lt;li>fastp: &amp;ndash;average_qual&lt;/li>
&lt;/ul></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/main/docs/reference/tag-steps/calc/CalcQualifiedBases/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/reference/tag-steps/calc/CalcQualifiedBases/</guid><description>&lt;h1 id="calcqualifiedbases">
 CalcQualifiedBases
 &lt;a class="anchor" href="#calcqualifiedbases">#&lt;/a>
&lt;/h1>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;CalcQualifiedBases&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">threshold&lt;/span> = &lt;span style="color:#e6db74">&amp;#39;C&amp;#39;&lt;/span> &lt;span style="color:#75715e"># the quality value &amp;gt;= which a base is qualified &lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#75715e"># In your phred encoding. Typically 33..75&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#75715e"># a byte or a number 0...255&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">op&lt;/span> = &lt;span style="color:#e6db74">&amp;#39;worse&amp;#39;&lt;/span> &lt;span style="color:#75715e"># see below.&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">segment&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;read1&amp;#34;&lt;/span> &lt;span style="color:#75715e"># Any of your input segments, or &amp;#39;All&amp;#39;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">out_label&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;tag_name&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">relative&lt;/span> = &lt;span style="color:#66d9ef">true&lt;/span> &lt;span style="color:#75715e"># a rate (true) or a count (false)&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Calculate the number of bases that are &amp;lsquo;qualified&amp;rsquo;, that is
above/below a user-defined threshold.&lt;/p></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/main/docs/reference/tag-steps/extract/ExtractLongestPolyX/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/reference/tag-steps/extract/ExtractLongestPolyX/</guid><description>&lt;h1 id="extractlongestpolyx">
 ExtractLongestPolyX
 &lt;a class="anchor" href="#extractlongestpolyx">#&lt;/a>
&lt;/h1>
&lt;p>Find the longest homopolymer stretch anywhere in the read (unlike &lt;code>ExtractPolyTail&lt;/code>, which only considers suffixes).&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;ExtractLongestPolyX&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">out_label&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;my_tag&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">segment&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;read1&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">min_length&lt;/span> = &lt;span style="color:#ae81ff">5&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">base&lt;/span> = &lt;span style="color:#e6db74">&amp;#39;.&amp;#39;&lt;/span> &lt;span style="color:#75715e"># search for any homopolymer (A/C/G/T/N)&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">max_mismatch_rate&lt;/span> = &lt;span style="color:#ae81ff">0.15&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">max_consecutive_mismatches&lt;/span> = &lt;span style="color:#ae81ff">2&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;ul>
&lt;li>&lt;code>base&lt;/code> accepts a concrete nucleotide (&lt;code>A&lt;/code>, &lt;code>C&lt;/code>, &lt;code>G&lt;/code>, &lt;code>T&lt;/code>, &lt;code>N&lt;/code>) or &lt;code>.&lt;/code> to search all of &lt;code>ACGT&lt;/code> and report the longest hit.&lt;/li>
&lt;li>&lt;code>max_mismatch_rate&lt;/code> and &lt;code>max_consecutive_mismatches&lt;/code> mirror &lt;code>ExtractPolyTail&lt;/code>; they control how permissive the run detection is.&lt;/li>
&lt;li>When no run satisfies &lt;code>min_length&lt;/code>, the tag is reported as missing.&lt;/li>
&lt;li>Only one run is reported, even if multiple runs of the same length exist; in that case, the first run found is reported.&lt;/li>
&lt;/ul></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/main/docs/reference/tag-steps/extract/ExtractLowQualityEnd/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/reference/tag-steps/extract/ExtractLowQualityEnd/</guid><description>&lt;h1 id="trimqualityend">
 TrimQualityEnd
 &lt;a class="anchor" href="#trimqualityend">#&lt;/a>
&lt;/h1>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;ExtractLowQualityEnd&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">out_label&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;low_quality_ends&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">min_qual&lt;/span> = &lt;span style="color:#ae81ff">20&lt;/span> &lt;span style="color:#75715e"># u8, minimum quality to keep (in whatever your score is encoded in)&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#75715e"># either a char like &amp;#39;A&amp;#39; or a number 0..128 (typical phred score is 33..75)&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">segment&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;read1&amp;#34;&lt;/span> &lt;span style="color:#75715e"># Any of your input segments&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Define a region of low quality bases at the end of reads.&lt;/p>
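&lt;p>As a sketch of the idea, the following Python function finds the trailing run of bases whose encoded quality is below &lt;code>min_qual&lt;/code>. It is an illustration only: returning the region as a (start, length) pair, and returning nothing when no base qualifies, are assumptions about how such a region could be represented.&lt;/p>

```python
def low_quality_end(qual, min_qual):
    """Return (start, length) of the trailing low-quality run, or None.

    qual is the quality string as found in the FASTQ record; min_qual is
    compared against the raw encoded byte value, without offset removal.
    """
    start = len(qual)
    # Walk backwards while the encoded quality is below the threshold.
    while start != 0 and min_qual > ord(qual[start - 1]):
        start -= 1
    length = len(qual) - start
    return (start, length) if length else None


print(low_quality_end("IIII##", ord("5")))
print(low_quality_end("IIIIII", ord("5")))
```

&lt;p>The scan stops at the first base from the end that meets the threshold, so low-quality bases in the middle of the read are not part of the region.&lt;/p>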
&lt;h2 id="corresponding-options-in-other-software">
 Corresponding options in other software
 &lt;a class="anchor" href="#corresponding-options-in-other-software">#&lt;/a>
&lt;/h2>
&lt;ul>
&lt;li>Trimmomatic: TRAILING (if paired with &lt;a href="https://tyberiusprime.github.io/fastqrab/main/fastqrab/main/docs/reference/modification-steps/TrimAtTag/">TrimAtTag&lt;/a>)&lt;/li>
&lt;/ul></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/main/docs/reference/tag-steps/extract/ExtractLowQualityStart/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/reference/tag-steps/extract/ExtractLowQualityStart/</guid><description>&lt;h1 id="trimqualitystart">
 TrimQualityStart
 &lt;a class="anchor" href="#trimqualitystart">#&lt;/a>
&lt;/h1>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;ExtractLowQualityStart&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">min_qual&lt;/span> = &lt;span style="color:#ae81ff">20&lt;/span> &lt;span style="color:#75715e"># u8, minimum quality to keep (in whatever your score is encoded in)&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#75715e"># either a char like &amp;#39;A&amp;#39; or a number 0..128 (typical phred score is 33..75)&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">segment&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;read1&amp;#34;&lt;/span> &lt;span style="color:#75715e"># Any of your input segments&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">out_label&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;bad_starts&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Define a region of low quality bases (below the threshold) at the start of the read.&lt;/p>
&lt;h2 id="corresponding-options-in-other-software">
 Corresponding options in other software
 &lt;a class="anchor" href="#corresponding-options-in-other-software">#&lt;/a>
&lt;/h2>
&lt;ul>
&lt;li>Trimmomatic: LEADING (if combined with &lt;a href="https://tyberiusprime.github.io/fastqrab/main/fastqrab/main/docs/reference/modification-steps/TrimAtTag/">TrimAtTag&lt;/a>)&lt;/li>
&lt;/ul></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/main/docs/reference/tag-steps/extract/ExtractPolyTail/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/reference/tag-steps/extract/ExtractPolyTail/</guid><description>&lt;h1 id="extractpolytail">
 ExtractPolyTail
 &lt;a class="anchor" href="#extractpolytail">#&lt;/a>
&lt;/h1>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;ExtractPolyTail&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">out_label&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;tag_label&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">segment&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;read1&amp;#34;&lt;/span> &lt;span style="color:#75715e"># Any of your input segments (default: read1)&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">min_length&lt;/span> = &lt;span style="color:#ae81ff">5&lt;/span> &lt;span style="color:#75715e"># positive integer, the minimum number of repeats of the base&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">base&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;A&amp;#34;&lt;/span> &lt;span style="color:#75715e"># one of AGTCN., the &amp;#39;base&amp;#39; to trim (or . for any repeated base)&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">max_mismatch_rate&lt;/span> = &lt;span style="color:#ae81ff">0.1&lt;/span> &lt;span style="color:#75715e"># float 0.0..=1.0, how many mismatches are allowed in the repeat&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">max_consecutive_mismatches&lt;/span> = &lt;span style="color:#ae81ff">3&lt;/span> &lt;span style="color:#75715e"># how many consecutive mismatches are allowed&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Identify either a specific letter (AGTC or N) repetition,
or any base repetition (base = &amp;lsquo;.&amp;rsquo;) at the end of the read.&lt;/p></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/main/docs/reference/tag-steps/tag/TagDuplicates/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/reference/tag-steps/tag/TagDuplicates/</guid><description>&lt;h3 id="filterduplicates">
 FilterDuplicates
 &lt;a class="anchor" href="#filterduplicates">#&lt;/a>
&lt;/h3>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;TagDuplicates&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">false_positive_rate&lt;/span> = &lt;span style="color:#ae81ff">0.00001&lt;/span> &lt;span style="color:#75715e">#&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#75715e"># the false positive rate of the filter.&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#75715e"># 0..1&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">seed&lt;/span> = &lt;span style="color:#ae81ff">59&lt;/span> &lt;span style="color:#75715e"># required!&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">source&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;All&amp;#34;&lt;/span> &lt;span style="color:#75715e"># Any input segment, &amp;#39;All&amp;#39;, &amp;#39;tag:&amp;lt;tag-name&amp;gt;&amp;#39; or &amp;#39;name:&amp;lt;segment&amp;gt;&amp;#39;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#75715e"># split_character = &amp;#34;/&amp;#34; # required when using name:&amp;lt;segment&amp;gt;, and only accepted then&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">out_label&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;dups&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#75715e"># initial_filter_capacity = 10_000_000 # optional. Auto detected by default&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;FilterByTag&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">in_label&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;dups&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">keep_or_remove&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;Remove&amp;#34;&lt;/span> &lt;span style="color:#75715e"># Keep|Remove&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Tag duplicates (2nd onwards) from the stream using a &lt;a href="https://en.wikipedia.org/wiki/Cuckoo_filter">Cuckoo filter&lt;/a>.&lt;/p></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/main/docs/reference/tag-steps/using/AssignToReference/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/reference/tag-steps/using/AssignToReference/</guid><description>&lt;h1 id="assigntoreference">
 AssignToReference
 &lt;a class="anchor" href="#assigntoreference">#&lt;/a>
&lt;/h1>
&lt;p>Assign each query sequence to the closest entry in the barcodes section,
using Hamming distance.&lt;/p>
&lt;p>(As opposed to &lt;a href="https://tyberiusprime.github.io/fastqrab/main/fastqrab/main/docs/redirects/HammingCorrect/">HammingCorrect&lt;/a>,
which will correct to the closest barcode sequence.)&lt;/p>
&lt;p>At start-up the step builds an efficient Hamming-distance index over the database. For every read,
the tag supplied in &lt;code>in_label&lt;/code> is looked up in the index and the name of the
closest matching reference entry is written to &lt;code>out_label&lt;/code> as a string tag.&lt;/p></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/main/docs/reference/tag-steps/using/HammingCorrect/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/reference/tag-steps/using/HammingCorrect/</guid><description>&lt;h1 id="hammingcorrect">
 HammingCorrect
 &lt;a class="anchor" href="#hammingcorrect">#&lt;/a>
&lt;/h1>
&lt;p>Correct a tag to the closest of a predefined set of &amp;lsquo;barcodes&amp;rsquo;, measured by Hamming distance.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;HammingCorrect&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">in_label&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;mytag&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">out_label&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;my_corrected_tag&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">barcodes&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;mybarcodelist&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">max_hamming_distance&lt;/span> = &lt;span style="color:#ae81ff">1&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">on_no_match&lt;/span> = &lt;span style="color:#e6db74">&amp;#39;remove&amp;#39;&lt;/span> &lt;span style="color:#75715e"># &amp;#39;remove&amp;#39;, &amp;#39;empty&amp;#39;, &amp;#39;keep&amp;#39;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">name_split_character&lt;/span> = &lt;span style="color:#e6db74">&amp;#39;|&amp;#39;&lt;/span> &lt;span style="color:#75715e"># optional&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>[&lt;span style="color:#a6e22e">barcodes&lt;/span>.&lt;span style="color:#a6e22e">mybarcodelist&lt;/span>]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#e6db74">&amp;#34;AAAA&amp;#34;&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;label_ignored&amp;#34;&lt;/span> &lt;span style="color:#75715e"># only read when demultiplexing &lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>&lt;code>on_no_match&lt;/code> controls what happens if the tag cannot be corrected within &lt;code>max_hamming_distance&lt;/code>:&lt;/p>
&lt;ul>
&lt;li>remove: Remove the hit (location and sequence), useful for &lt;a href="https://tyberiusprime.github.io/fastqrab/main/fastqrab/main/docs/reference/filter-steps/FilterByTag/">FilterByTag&lt;/a> later.&lt;/li>
&lt;li>keep: Keep the original tag (and location)&lt;/li>
&lt;li>empty: Keep the original location, but set the tag to empty.&lt;/li>
&lt;/ul>
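&lt;p>A minimal sketch of the &lt;code>remove&lt;/code> case: the HammingCorrect step is followed by a FilterByTag step that keeps only reads whose tag could be corrected. It reuses the labels from the example above and assumes that FilterByTag with &lt;code>keep_or_remove = &amp;#34;Keep&amp;#34;&lt;/code> treats a removed hit as an absent tag; check the FilterByTag reference before relying on this.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">[[step]]
    action = &amp;#34;HammingCorrect&amp;#34;
    in_label = &amp;#34;mytag&amp;#34;
    out_label = &amp;#34;my_corrected_tag&amp;#34;
    barcodes = &amp;#34;mybarcodelist&amp;#34;
    max_hamming_distance = 1
    on_no_match = &amp;#39;remove&amp;#39; # uncorrectable tags lose their hit

[[step]]
    action = &amp;#34;FilterByTag&amp;#34;
    in_label = &amp;#34;my_corrected_tag&amp;#34;
    keep_or_remove = &amp;#34;Keep&amp;#34; # assumption: drops reads whose hit was removed

[barcodes.mybarcodelist]
    &amp;#34;AAAA&amp;#34; = &amp;#34;label_ignored&amp;#34;
&lt;/code>&lt;/pre>&lt;/div>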
&lt;p>Note that HammingCorrect removes the location information on tags if
they spanned more than one region. (This is an implementation limitation, not a conceptual one).&lt;/p></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/main/docs/reference/validation-steps/ValidateAllReadsSameLength/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/reference/validation-steps/ValidateAllReadsSameLength/</guid><description>&lt;h1 id="validateallreadssamelength">
 ValidateAllReadsSameLength
 &lt;a class="anchor" href="#validateallreadssamelength">#&lt;/a>
&lt;/h1>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;ValidateAllReadsSameLength&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">source&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;read1&amp;#34;&lt;/span> &lt;span style="color:#75715e"># Any segment, &amp;#39;All&amp;#39;, &amp;#39;tag:&amp;lt;name&amp;gt;&amp;#39; or &amp;#39;name:&amp;lt;segment&amp;gt;&amp;#39;&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Validates that all reads have the same sequence/tag/name length.&lt;/p>
&lt;p>Useful when you want to verify read length consistency in your pipeline.&lt;/p>
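&lt;p>For instance, to check that a previously extracted tag has the same length in every read, point &lt;code>source&lt;/code> at the tag. This is only a sketch: &lt;code>mytag&lt;/code> is a hypothetical label that an earlier step in your configuration would have to define.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">[[step]]
    action = &amp;#34;ValidateAllReadsSameLength&amp;#34;
    source = &amp;#34;tag:mytag&amp;#34; # hypothetical tag label from an earlier step
&lt;/code>&lt;/pre>&lt;/div>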
&lt;p>(For names, only the part before the first
input.options.read_comment_character, that is the name without its comment, is used.)&lt;/p></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/main/docs/reference/validation-steps/ValidateName/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/reference/validation-steps/ValidateName/</guid><description>&lt;h1 id="validatename">
 ValidateName
 &lt;a class="anchor" href="#validatename">#&lt;/a>
&lt;/h1>
&lt;p>Verify that all segments have the same read name (or a shared prefix).&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;ValidateName&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#75715e"># Optional separator character; the comparison stops at the first match&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">readname_end_char&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;_&amp;#34;&lt;/span> &lt;span style="color:#75715e"># Optional. Do not set for exact matching. Otherwise, a byte character&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">sample_stride&lt;/span> = &lt;span style="color:#ae81ff">1000&lt;/span> &lt;span style="color:#75715e"># Check every nth fragment, default 1000. Must be &amp;gt; 0. Starts with first read&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>When no separator character (readname_end_char) is provided the
entire name must match exactly across all segments.&lt;/p></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/main/docs/reference/validation-steps/ValidateQuality/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/reference/validation-steps/ValidateQuality/</guid><description>&lt;h1 id="validatequality">
 ValidateQuality
 &lt;a class="anchor" href="#validatequality">#&lt;/a>
&lt;/h1>
&lt;p>Validate that all quality scores fall within the range accepted by the chosen encoding.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;ValidateQuality&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">segment&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;read1&amp;#34;&lt;/span> &lt;span style="color:#75715e"># Any of your input segments, or &amp;#39;All&amp;#39;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">encoding&lt;/span> = &lt;span style="color:#e6db74">&amp;#39;Illumina1.8&amp;#39;&lt;/span> &lt;span style="color:#75715e"># &amp;#39;Illumina1.8|Illumina1.3|Sanger|Solexa&amp;#39;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#75715e"># Illumina1.8 is an alias for Sanger.&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>The encoding defines the accepted range of values.&lt;/p>
&lt;p>If you want to convert quality codes, use &lt;a href="https://tyberiusprime.github.io/fastqrab/main/fastqrab/main/docs/reference/modification-steps/ConvertQuality/">ConvertQuality&lt;/a>.&lt;/p>
&lt;p>See &lt;a href="https://pmc.ncbi.nlm.nih.gov/articles/PMC2847217/">https://pmc.ncbi.nlm.nih.gov/articles/PMC2847217/&lt;/a> , table 1&lt;/p></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/main/docs/reference/validation-steps/ValidateReadNamesPrintable/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/reference/validation-steps/ValidateReadNamesPrintable/</guid><description>&lt;h1 id="validatereadnamesprintable">
 ValidateReadNamesPrintable
 &lt;a class="anchor" href="#validatereadnamesprintable">#&lt;/a>
&lt;/h1>
&lt;p>Validate that every read name conforms to the SAM/BAM specification.&lt;/p>
&lt;p>The SAM specification requires that query names (QNAME) match &lt;code>[!-?A-~]{1,254}&lt;/code>:
printable ASCII characters excluding &lt;code>@&lt;/code> and space, with a maximum length of 254.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;ValidateReadNamesPrintable&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>No additional parameters are needed — the allowed character set is fixed by the SAM spec.&lt;/p>
&lt;h2 id="when-to-use">
 When to use
 &lt;a class="anchor" href="#when-to-use">#&lt;/a>
&lt;/h2>
&lt;p>Add this step when your pipeline writes non-BAM output that will become BAM eventually
and you suspect read names may contain invalid characters.&lt;/p></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/main/docs/reference/validation-steps/ValidateReadPairing/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/reference/validation-steps/ValidateReadPairing/</guid><description>&lt;h1 id="validatereadpairing">
 ValidateReadPairing
 &lt;a class="anchor" href="#validatereadpairing">#&lt;/a>
&lt;/h1>
&lt;p>Confirms for every &lt;code>sample_stride&lt;/code>th read &amp;lsquo;pair&amp;rsquo; that the names are
identical but for one letter.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;ValidateReadPairing&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">sample_stride&lt;/span> = &lt;span style="color:#ae81ff">1000&lt;/span> &lt;span style="color:#75715e"># Check every nth fragment, default 1000. Must be &amp;gt; 0. Starts with first read&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Ensures&lt;/p>
&lt;ul>
&lt;li>read names between segments have the same length&lt;/li>
&lt;li>read names between segments have a hamming distance of at most one.&lt;/li>
&lt;/ul>
&lt;p>Note that this validation requires at least two input segments.&lt;/p>
&lt;p>(See also: &lt;a href="https://tyberiusprime.github.io/fastqrab/main/fastqrab/main/docs/reference/validation-steps/ValidateName/">&lt;code>ValidateName&lt;/code>&lt;/a>,
which validates after truncating at the first occurrence of a character).&lt;/p></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/main/docs/reference/validation-steps/ValidateSeq/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/reference/validation-steps/ValidateSeq/</guid><description>&lt;h1 id="validateseq">
 ValidateSeq
 &lt;a class="anchor" href="#validateseq">#&lt;/a>
&lt;/h1>
&lt;p>Validate that only allowed characters are in the sequence.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;ValidateSeq&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">allowed&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;AGTC&amp;#34;&lt;/span> &lt;span style="color:#75715e"># String. Example &amp;#39;ACGTN&amp;#39;, the allowed characters&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">segment&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;read1&amp;#34;&lt;/span> &lt;span style="color:#75715e"># Any of your input segments, or &amp;#39;All&amp;#39;&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div></description></item><item><title>(Older) Documentation Versions</title><link>https://tyberiusprime.github.io/fastqrab/main/docs/older_versions/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/older_versions/</guid><description>&lt;p>Available builds:&lt;/p>
&lt;ul>
&lt;li>&lt;strong>main&lt;/strong> (this build)&lt;/li>
&lt;li>&lt;a href="https://tyberiusprime.github.io/fastqrab/v0.8.1/">v0.8.1&lt;/a>&lt;/li>
&lt;li>&lt;a href="https://tyberiusprime.github.io/fastqrab/v0.8.0-test/">v0.8.0-test&lt;/a>&lt;/li>
&lt;li>&lt;a href="https://tyberiusprime.github.io/fastqrab/v0.8.0/">v0.8.0&lt;/a>&lt;/li>
&lt;/ul></description></item><item><title>adapters</title><link>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/adapters/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/adapters/</guid><description/></item><item><title>AssignToReference</title><link>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/AssignToReference/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/AssignToReference/</guid><description/></item><item><title>barcodes</title><link>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/barcodes/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/barcodes/</guid><description/></item><item><title>benchmark-section</title><link>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/benchmark-section/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/benchmark-section/</guid><description/></item><item><title>Calc Base Content</title><link>https://tyberiusprime.github.io/fastqrab/main/docs/reference/tag-steps/calc/CalcBaseContent/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/reference/tag-steps/calc/CalcBaseContent/</guid><description>&lt;h1 id="calcbasecontent">
 CalcBaseContent
 &lt;a class="anchor" href="#calcbasecontent">#&lt;/a>
&lt;/h1>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;CalcBaseContent&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">segment&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;read1&amp;#34;&lt;/span> &lt;span style="color:#75715e"># Any of your input segments, or &amp;#39;All&amp;#39;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">out_label&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;at_content&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">bases_to_count&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;AT&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">bases_to_ignore&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;N&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">relative&lt;/span> = &lt;span style="color:#66d9ef">true&lt;/span> &lt;span style="color:#75715e"># default&lt;/span>
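&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Computes the rate (0..=1) of bases that match &lt;code>bases_to_count&lt;/code>, while removing any
bases listed in &lt;code>bases_to_ignore&lt;/code> from the denominator (if &lt;code>relative = true&lt;/code>).
For example, with &lt;code>bases_to_count = &amp;quot;AT&amp;quot;&lt;/code> and &lt;code>bases_to_ignore = &amp;quot;N&amp;quot;&lt;/code>, the read &lt;code>AATTNG&lt;/code> yields 4 / 5 = 0.8.&lt;/p>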
&lt;p>Both lists are case-insensitive and accept only ASCII letters. When no bases
remain after filtering, the step returns &lt;code>0&lt;/code>.&lt;/p>
&lt;p>Set &lt;code>relative = false&lt;/code> to emit absolute base counts instead of a rate.
Absolute mode requires &lt;code>bases_to_ignore&lt;/code> to remain unset, otherwise the
configuration check fails.&lt;/p></description></item><item><title>Calc GC Content</title><link>https://tyberiusprime.github.io/fastqrab/main/docs/reference/tag-steps/calc/CalcGCContent/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/reference/tag-steps/calc/CalcGCContent/</guid><description>&lt;h1 id="calcgccontent">
 CalcGCContent
 &lt;a class="anchor" href="#calcgccontent">#&lt;/a>
&lt;/h1>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;CalcGCContent&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">segment&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;read1&amp;#34;&lt;/span> &lt;span style="color:#75715e"># Any of your input segments, or &amp;#39;All&amp;#39;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">out_label&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;gc&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">relative&lt;/span> = &lt;span style="color:#66d9ef">true&lt;/span> &lt;span style="color:#75715e"># a rate (true) or a count (false)&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Count what percentage of bases are GC (as opposed to AT).
Non-AGTC bases (e.g. N) are ignored.&lt;/p>
&lt;p>Output is 0..100.&lt;/p>
&lt;p>Wrapper around &lt;a href="https://tyberiusprime.github.io/fastqrab/main/fastqrab/main/docs/reference/tag-steps/calc/CalcBaseContent/">CalcBaseContent&lt;/a> (with &lt;code>bases_to_count = &amp;quot;GC&amp;quot;, bases_to_ignore = &amp;quot;N&amp;quot;, relative = true&lt;/code>).&lt;/p></description></item><item><title>CalcBaseContent</title><link>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/CalcBaseContent/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/CalcBaseContent/</guid><description/></item><item><title>CalcComplexity</title><link>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/CalcComplexity/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/CalcComplexity/</guid><description/></item><item><title>CalcExpectedError</title><link>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/CalcExpectedError/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/CalcExpectedError/</guid><description/></item><item><title>CalcGCContent</title><link>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/CalcGCContent/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/CalcGCContent/</guid><description/></item><item><title>CalcKmers</title><link>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/CalcKmers/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/CalcKmers/</guid><description/></item><item><title>CalcLength</title><link>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/CalcLength/</link><pubDate>Mon, 01 Jan 0001 00:00:00 
+0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/CalcLength/</guid><description/></item><item><title>CalcMeanQuality</title><link>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/CalcMeanQuality/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/CalcMeanQuality/</guid><description/></item><item><title>CalcNContent</title><link>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/CalcNContent/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/CalcNContent/</guid><description/></item><item><title>CalcNContent</title><link>https://tyberiusprime.github.io/fastqrab/main/docs/reference/tag-steps/calc/CalcNContent/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/reference/tag-steps/calc/CalcNContent/</guid><description>&lt;h1 id="calcncount">
 CalcNContent
 &lt;a class="anchor" href="#calcncount">#&lt;/a>
&lt;/h1>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;CalcNContent&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">segment&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;read1&amp;#34;&lt;/span> &lt;span style="color:#75715e"># Any of your input segments, or &amp;#39;All&amp;#39;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">out_label&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;ncount&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">relative&lt;/span> = &lt;span style="color:#66d9ef">true&lt;/span> &lt;span style="color:#75715e"># a rate (true) or a count (false)&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Count how many N are present in the read.&lt;/p>
&lt;p>This step is a convenient wrapper for
&lt;a href="./CalcBaseContent.md">&lt;code>CalcBaseContent&lt;/code>&lt;/a> with &lt;code>bases_to_count = &amp;quot;N&amp;quot;&lt;/code>.&lt;/p>
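&lt;p>Under that description, the example above should behave like the following CalcBaseContent step. This is a sketch that assumes &lt;code>segment&lt;/code>, &lt;code>out_label&lt;/code> and &lt;code>relative&lt;/code> are passed through unchanged, which the text above does not state explicitly.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">[[step]]
    action = &amp;#34;CalcBaseContent&amp;#34;
    segment = &amp;#34;read1&amp;#34;
    out_label = &amp;#34;ncount&amp;#34;
    bases_to_count = &amp;#34;N&amp;#34;
    relative = true # a rate (true) or a count (false)
&lt;/code>&lt;/pre>&lt;/div>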
&lt;h2 id="corresponding-options-in-other-software">
 Corresponding options in other software
 &lt;a class="anchor" href="#corresponding-options-in-other-software">#&lt;/a>
&lt;/h2>
&lt;ul>
&lt;li>fastp: &lt;code>--n_base_limit&lt;/code> (if combined with &lt;a href="https://tyberiusprime.github.io/fastqrab/main/fastqrab/main/docs/reference/filter-steps/FilterByNumericTag/">FilterByNumericTag&lt;/a>)&lt;/li>
&lt;/ul></description></item><item><title>CalcQualifiedBases</title><link>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/CalcQualifiedBases/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/CalcQualifiedBases/</guid><description/></item><item><title>CLI</title><link>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/CLI/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/CLI/</guid><description/></item><item><title>ConcatTags</title><link>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/ConcatTags/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/ConcatTags/</guid><description/></item><item><title>Convert To Rate</title><link>https://tyberiusprime.github.io/fastqrab/main/docs/reference/tag-steps/convert/ConvertToRate/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/reference/tag-steps/convert/ConvertToRate/</guid><description>&lt;h1 id="converttorate">
 ConvertToRate
 &lt;a class="anchor" href="#converttorate">#&lt;/a>
&lt;/h1>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;CalcBaseContent&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">segment&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;read1&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">bases_to_count&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;A&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">relative&lt;/span> = &lt;span style="color:#66d9ef">false&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">out_label&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;a_count&amp;#34;&lt;/span> &lt;span style="color:#75715e"># absolute A-base count&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;ConvertToRate&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">in_label&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;a_count&amp;#34;&lt;/span> &lt;span style="color:#75715e"># The numeric tag to divide by read length&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">out_label&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;a_rate&amp;#34;&lt;/span> &lt;span style="color:#75715e"># Output tag label&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">segment&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;read1&amp;#34;&lt;/span> &lt;span style="color:#75715e"># Segment to measure length from, or &amp;#39;All&amp;#39; for total length (default: only segment if single-read input)&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Divide an existing numeric tag by the read length to produce a normalized rate.&lt;/p>
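&lt;p>A minimal sketch of the &lt;code>All&lt;/code> variant mentioned in the comment above, assuming a paired-end input and an &lt;code>a_count&lt;/code> tag defined as in the first example. Under those assumptions the step would divide by the total read length rather than by the length of &lt;code>read1&lt;/code> alone. The label &lt;code>a_rate_total&lt;/code> is a hypothetical name chosen for illustration.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0">&lt;code class="language-toml" data-lang="toml">[[step]]
    action = &amp;#34;ConvertToRate&amp;#34;
    in_label = &amp;#34;a_count&amp;#34; # numeric tag produced by the CalcBaseContent step above
    out_label = &amp;#34;a_rate_total&amp;#34; # hypothetical label, not taken from the original docs
    segment = &amp;#34;All&amp;#34; # assumption: divide by the total length instead of read1 only
&lt;/code>&lt;/pre>&lt;/div>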
&lt;p>Typical use case: divide the output of an
&lt;a href="https://tyberiusprime.github.io/fastqrab/main/docs/reference/tag-steps/convert/EvalExpression/">EvalExpression&lt;/a> step
by the read length to get a fraction (rate).&lt;/p></description></item><item><title>ConvertQuality</title><link>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/ConvertQuality/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/ConvertQuality/</guid><description/></item><item><title>ConvertRegionsToLength</title><link>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/ConvertRegionsToLength/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/ConvertRegionsToLength/</guid><description/></item><item><title>ConvertToRate</title><link>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/ConvertToRate/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/ConvertToRate/</guid><description/></item><item><title>CutEnd</title><link>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/CutEnd/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/CutEnd/</guid><description/></item><item><title>CutStart</title><link>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/CutStart/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/CutStart/</guid><description/></item><item><title>Demultiplex</title><link>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/Demultiplex/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/Demultiplex/</guid><description/></item><item><title>EvalExpression</title><link>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/EvalExpression/</link><pubDate>Mon, 01 Jan 0001 00:00:00 
+0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/EvalExpression/</guid><description/></item><item><title>ExtractIUPAC</title><link>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/ExtractIUPAC/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/ExtractIUPAC/</guid><description/></item><item><title>ExtractIUPACSuffix</title><link>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/ExtractIUPACSuffix/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/ExtractIUPACSuffix/</guid><description/></item><item><title>ExtractIUPACWithIndel</title><link>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/ExtractIUPACWithIndel/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/ExtractIUPACWithIndel/</guid><description/></item><item><title>ExtractLongestPolyX</title><link>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/ExtractLongestPolyX/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/ExtractLongestPolyX/</guid><description/></item><item><title>ExtractLowQualityEnd</title><link>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/ExtractLowQualityEnd/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/ExtractLowQualityEnd/</guid><description/></item><item><title>ExtractLowQualityStart</title><link>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/ExtractLowQualityStart/</link><pubDate>Mon, 01 Jan 0001 00:00:00 
+0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/ExtractLowQualityStart/</guid><description/></item><item><title>ExtractPolyTail</title><link>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/ExtractPolyTail/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/ExtractPolyTail/</guid><description/></item><item><title>ExtractRegex</title><link>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/ExtractRegex/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/ExtractRegex/</guid><description/></item><item><title>ExtractRegion</title><link>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/ExtractRegion/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/ExtractRegion/</guid><description/></item><item><title>ExtractRegions</title><link>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/ExtractRegions/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/ExtractRegions/</guid><description/></item><item><title>ExtractRegionsOfLowQuality</title><link>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/ExtractRegionsOfLowQuality/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/ExtractRegionsOfLowQuality/</guid><description/></item><item><title>ExtractToName</title><link>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/ExtractToName/</link><pubDate>Mon, 01 Jan 0001 00:00:00 
+0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/ExtractToName/</guid><description/></item><item><title>FilterByNumericTag</title><link>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/FilterByNumericTag/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/FilterByNumericTag/</guid><description/></item><item><title>FilterByTag</title><link>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/FilterByTag/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/FilterByTag/</guid><description/></item><item><title>FilterEmpty</title><link>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/FilterEmpty/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/FilterEmpty/</guid><description/></item><item><title>FilterReservoirSample</title><link>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/FilterReservoirSample/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/FilterReservoirSample/</guid><description/></item><item><title>FilterSample</title><link>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/FilterSample/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/FilterSample/</guid><description/></item><item><title>ForgetAllTags</title><link>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/ForgetAllTags/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/ForgetAllTags/</guid><description/></item><item><title>ForgetTag</title><link>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/ForgetTag/</link><pubDate>Mon, 01 Jan 0001 00:00:00 
+0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/ForgetTag/</guid><description/></item><item><title>HammingCorrect</title><link>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/HammingCorrect/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/HammingCorrect/</guid><description/></item><item><title>Head</title><link>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/Head/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/Head/</guid><description/></item><item><title>input-section</title><link>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/input-section/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/input-section/</guid><description/></item><item><title>Inspect</title><link>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/Inspect/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/Inspect/</guid><description/></item><item><title>llm-guide</title><link>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/llm-guide/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/llm-guide/</guid><description/></item><item><title>Lowercase</title><link>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/Lowercase/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/Lowercase/</guid><description/></item><item><title>MergeReads</title><link>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/MergeReads/</link><pubDate>Mon, 01 Jan 0001 00:00:00 
+0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/MergeReads/</guid><description/></item><item><title>Options</title><link>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/Options/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/Options/</guid><description/></item><item><title>Out_Of_Scope</title><link>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/Out_Of_Scope/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/Out_Of_Scope/</guid><description/></item><item><title>output-section</title><link>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/output-section/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/output-section/</guid><description/></item><item><title>Postfix</title><link>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/Postfix/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/Postfix/</guid><description/></item><item><title>Prefix</title><link>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/Prefix/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/Prefix/</guid><description/></item><item><title>Progress</title><link>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/Progress/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/Progress/</guid><description/></item><item><title>QuantifyTag</title><link>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/QuantifyTag/</link><pubDate>Mon, 01 Jan 0001 00:00:00 
+0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/QuantifyTag/</guid><description/></item><item><title>Rename</title><link>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/Rename/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/Rename/</guid><description/></item><item><title>ReplaceTagWithLetter</title><link>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/ReplaceTagWithLetter/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/ReplaceTagWithLetter/</guid><description/></item><item><title>Report</title><link>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/Report/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/Report/</guid><description/></item><item><title>ReverseComplement</title><link>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/ReverseComplement/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/ReverseComplement/</guid><description/></item><item><title>Skip</title><link>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/Skip/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/Skip/</guid><description/></item><item><title>StoreTagBackInSequence</title><link>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/StoreTagBackInSequence/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/StoreTagBackInSequence/</guid><description/></item><item><title>StoreTagInComment</title><link>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/StoreTagInComment/</link><pubDate>Mon, 01 Jan 0001 00:00:00 
+0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/StoreTagInComment/</guid><description/></item><item><title>StoreTagInFastQ</title><link>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/StoreTagInFastQ/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/StoreTagInFastQ/</guid><description/></item><item><title>StoreTagInSequence</title><link>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/StoreTagInSequence/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/StoreTagInSequence/</guid><description/></item><item><title>StoreTagsInTable</title><link>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/StoreTagsInTable/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/StoreTagsInTable/</guid><description/></item><item><title>Swap</title><link>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/Swap/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/Swap/</guid><description/></item><item><title>TagDuplicates</title><link>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/TagDuplicates/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/TagDuplicates/</guid><description/></item><item><title>TagOtherFile</title><link>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/TagOtherFile/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/TagOtherFile/</guid><description/></item><item><title>threading</title><link>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/threading/</link><pubDate>Mon, 01 Jan 0001 00:00:00 
+0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/threading/</guid><description/></item><item><title>TrimAtTag</title><link>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/TrimAtTag/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/TrimAtTag/</guid><description/></item><item><title>Truncate</title><link>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/Truncate/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/Truncate/</guid><description/></item><item><title>Uppercase</title><link>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/Uppercase/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/Uppercase/</guid><description/></item><item><title>ValidateAllReadsSameLength</title><link>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/ValidateAllReadsSameLength/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/ValidateAllReadsSameLength/</guid><description/></item><item><title>ValidateName</title><link>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/ValidateName/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/ValidateName/</guid><description/></item><item><title>ValidateQuality</title><link>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/ValidateQuality/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/ValidateQuality/</guid><description/></item><item><title>ValidateReadNamesPrintable</title><link>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/ValidateReadNamesPrintable/</link><pubDate>Mon, 01 Jan 0001 00:00:00 
+0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/ValidateReadNamesPrintable/</guid><description/></item><item><title>ValidateReadPairing</title><link>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/ValidateReadPairing/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/ValidateReadPairing/</guid><description/></item><item><title>ValidateSeq</title><link>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/ValidateSeq/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/main/docs/redirects/ValidateSeq/</guid><description/></item></channel></rss>