<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Tag Generation on mbf-fastq-processor documentation</title><link>https://tyberiusprime.github.io/fastqrab/v0.8.0-test/docs/reference/tag-steps/generation/</link><description>Recent content in Tag Generation on mbf-fastq-processor documentation</description><generator>Hugo</generator><language>en-us</language><atom:link href="https://tyberiusprime.github.io/fastqrab/v0.8.0-test/docs/reference/tag-steps/generation/index.xml" rel="self" type="application/rss+xml"/><item><title/><link>https://tyberiusprime.github.io/fastqrab/v0.8.0-test/docs/reference/tag-steps/generation/extractanchor/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/v0.8.0-test/docs/reference/tag-steps/generation/extractanchor/</guid><description>&lt;h1 id="extractanchor">
 ExtractAnchor
 &lt;a class="anchor" href="#extractanchor">#&lt;/a>
&lt;/h1>
&lt;p>Extract regions relative to a previously tagged anchor position.&lt;/p>
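&lt;p>As a rough illustration of anchor-relative extraction, the Python sketch below resolves &lt;code>[start, length]&lt;/code> pairs against an anchor position and joins the pieces with a separator. It is not the tool's implementation: the function name &lt;code>extract_relative&lt;/code> is hypothetical, positions are assumed to be 0-based, and every region is assumed to lie fully inside the read.&lt;/p>

```python
def extract_relative(read, anchor_pos, regions, separator="_"):
    """Join the slices described by (start, length) pairs relative to anchor_pos.

    Assumption: every region lies fully inside the read; out-of-range
    handling is left out of this sketch.
    """
    parts = []
    for start, length in regions:
        begin = anchor_pos + start  # a negative start reaches left of the anchor
        parts.append(read[begin:begin + length])
    return separator.join(parts)


# Hypothetical read in which the anchor match "CACA" starts at index 4.
read = "GGTTCACAGTAC"
tag = extract_relative(read, read.find("CACA"), [(-2, 4), (4, 1)])
```

&lt;p>With this made-up read, the first pair covers the two bases before the anchor plus its first two bases, the second pair covers the single base right after the four-base match, and the resulting value is &lt;code>TTCA_G&lt;/code>.&lt;/p>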
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>&lt;span style="color:#75715e"># First create an anchor tag. Iupac, regex, ExtractRegion, your choice.&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;ExtractIUPAC&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">search&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;CAYA&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">label&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;anchor_tag&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">segment&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;read1&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">anchor&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;Anywhere&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">max_mismatches&lt;/span> = &lt;span style="color:#ae81ff">0&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>&lt;span style="color:#75715e"># Then extract relative to that anchor&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;ExtractAnchor&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">label&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;mytag&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">input_label&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;anchor_tag&amp;#34;&lt;/span> &lt;span style="color:#75715e"># tag that provides the anchor position&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">regions&lt;/span> = [[&lt;span style="color:#ae81ff">-2&lt;/span>, &lt;span style="color:#ae81ff">4&lt;/span>], [&lt;span style="color:#ae81ff">4&lt;/span>, &lt;span style="color:#ae81ff">1&lt;/span>]] &lt;span style="color:#75715e"># [start, length] pairs relative to anchor&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">region_separator&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;_&amp;#34;&lt;/span> &lt;span style="color:#75715e"># (optional) separator between regions&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>This transformation uses the leftmost position of a previously established tag as the anchor point and extracts specified regions relative to that position.&lt;/p></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/v0.8.0-test/docs/reference/tag-steps/generation/extractlength/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/v0.8.0-test/docs/reference/tag-steps/generation/extractlength/</guid><description>&lt;h1 id="extractlength">
 ExtractLength
 &lt;a class="anchor" href="#extractlength">#&lt;/a>
&lt;/h1>
&lt;p>Extract the length of a read as a tag.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;ExtractLength&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">label&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;mytag&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">segment&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;read1&amp;#34;&lt;/span> &lt;span style="color:#75715e"># Any of your input segments, or &amp;#39;All&amp;#39;&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>This transformation creates a tag containing the length of the specified read.&lt;/p></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/v0.8.0-test/docs/reference/tag-steps/generation/extractregex/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/v0.8.0-test/docs/reference/tag-steps/generation/extractregex/</guid><description>&lt;h1 id="extractregex">
 ExtractRegex
 &lt;a class="anchor" href="#extractregex">#&lt;/a>
&lt;/h1>
&lt;p>Extract the result of a regular expression match. An empty string is stored if the pattern is not found.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;ExtractRegex&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">label&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;mytag&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">search&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;^CT(..)CT&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">replacement&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;$1&amp;#34;&lt;/span> &lt;span style="color:#75715e"># standard regex replacement syntax&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">segment&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;read1&amp;#34;&lt;/span> &lt;span style="color:#75715e"># Any of your input segments&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>This transformation searches for a regular expression pattern in the specified read and extracts the matching portion as a tag.&lt;/p></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/v0.8.0-test/docs/reference/tag-steps/generation/extractregion/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/v0.8.0-test/docs/reference/tag-steps/generation/extractregion/</guid><description>&lt;h1 id="extractregion">
 ExtractRegion
 &lt;a class="anchor" href="#extractregion">#&lt;/a>
&lt;/h1>
&lt;p>Extract a fixed position region.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;ExtractRegion&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">start&lt;/span> = &lt;span style="color:#ae81ff">5&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">length&lt;/span> = &lt;span style="color:#ae81ff">8&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">segment&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;read1&amp;#34;&lt;/span> &lt;span style="color:#75715e"># Any of your input segments&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">label&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;umi&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>This transformation extracts a fixed-length region from the specified read at a given position and stores it as a tag.&lt;/p>
&lt;p>Use &lt;a href="https://tyberiusprime.github.io/fastqrab/v0.8.0-test/fastqrab/v0.8.0-test/docs/reference/tag-steps/generation/extractregions/">ExtractRegions&lt;/a> if your region is actually multiple regions (possibly from different segments).&lt;/p></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/v0.8.0-test/docs/reference/tag-steps/generation/extractregions/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/v0.8.0-test/docs/reference/tag-steps/generation/extractregions/</guid><description>&lt;h1 id="extractregions">
 ExtractRegions
 &lt;a class="anchor" href="#extractregions">#&lt;/a>
&lt;/h1>
&lt;p>Extract from multiple fixed position regions.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;ExtractRegions&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">regions&lt;/span> = [
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> {&lt;span style="color:#a6e22e">segment&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;read1&amp;#34;&lt;/span>, &lt;span style="color:#a6e22e">start&lt;/span> = &lt;span style="color:#ae81ff">0&lt;/span>, &lt;span style="color:#a6e22e">length&lt;/span> = &lt;span style="color:#ae81ff">8&lt;/span>},
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> {&lt;span style="color:#a6e22e">segment&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;read1&amp;#34;&lt;/span>, &lt;span style="color:#a6e22e">start&lt;/span> = &lt;span style="color:#ae81ff">12&lt;/span>, &lt;span style="color:#a6e22e">length&lt;/span> = &lt;span style="color:#ae81ff">4&lt;/span>},
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> ]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">label&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;barcode&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>This transformation extracts multiple fixed-length regions from reads and concatenates them into a single tag.&lt;/p>
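&lt;p>The concatenation can be pictured with a short Python sketch. It is illustrative only: the function name &lt;code>extract_regions&lt;/code> and the example read are hypothetical, &lt;code>start&lt;/code> is assumed to be 0-based, and the pieces are assumed to be joined without a separator.&lt;/p>

```python
def extract_regions(segments, regions):
    """Concatenate fixed-position slices taken from named segments.

    segments maps a segment name to its sequence; each region is a dict with
    'segment', 'start' (assumed 0-based) and 'length'.
    """
    return "".join(
        segments[r["segment"]][r["start"]:r["start"] + r["length"]]
        for r in regions
    )


# Hypothetical 16-base read; the regions mirror the configuration above.
segments = {"read1": "AAAACCCCGGGGTTTT"}
barcode = extract_regions(
    segments,
    [
        {"segment": "read1", "start": 0, "length": 8},
        {"segment": "read1", "start": 12, "length": 4},
    ],
)
```

&lt;p>For this made-up read the tag value would be &lt;code>AAAACCCCTTTT&lt;/code>: the first eight bases followed by the last four.&lt;/p>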
&lt;p>An ExtractRegions step with only one region is exactly equivalent to &lt;a href="https://tyberiusprime.github.io/fastqrab/v0.8.0-test/fastqrab/v0.8.0-test/docs/reference/tag-steps/generation/extractregion/">ExtractRegion&lt;/a>.&lt;/p></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/v0.8.0-test/docs/reference/tag-steps/generation/extractregionsoflowquality/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/v0.8.0-test/docs/reference/tag-steps/generation/extractregionsoflowquality/</guid><description>&lt;h1 id="extractregionsoflowquality">
 ExtractRegionsOfLowQuality
 &lt;a class="anchor" href="#extractregionsoflowquality">#&lt;/a>
&lt;/h1>
&lt;p>Extract regions (min size 1 bp) where bases have quality scores below threshold.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;ExtractRegionsOfLowQuality&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">segment&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;read1&amp;#34;&lt;/span> &lt;span style="color:#75715e"># Any of your input segments&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">min_quality&lt;/span> = &lt;span style="color:#ae81ff">60&lt;/span> &lt;span style="color:#75715e"># Quality threshold (Phred+33)&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">label&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;low_quality_regions&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>This transformation scans through quality scores of the specified segment and identifies contiguous regions where quality scores are below the specified threshold. Each low-quality region becomes a tagged region with location information (start position and length).&lt;/p>
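&lt;p>The scan described above can be sketched in a few lines of Python. This is a minimal illustration rather than the tool's code: the function name &lt;code>low_quality_regions&lt;/code> is hypothetical, and the threshold is assumed to be compared against the raw ASCII value of each quality character.&lt;/p>

```python
def low_quality_regions(qualities, min_quality):
    """Return (start, length) for each contiguous run of bases below min_quality.

    qualities is the FASTQ quality string; min_quality is compared against the
    raw ASCII value of each character (an assumption of this sketch).
    """
    regions = []
    run_start = None
    for pos, char in enumerate(qualities):
        if min_quality > ord(char):          # base is below the threshold
            if run_start is None:
                run_start = pos              # open a new run
        elif run_start is not None:
            regions.append((run_start, pos - run_start))  # close the run
            run_start = None
    if run_start is not None:                # run extends to the end of the read
        regions.append((run_start, len(qualities) - run_start))
    return regions


# 'I' is ASCII 73 and '#' is ASCII 35, so only the '#' runs fall below 60.
regions = low_quality_regions("II##III#", 60)
```

&lt;p>For the quality string &lt;code>II##III#&lt;/code> this yields two regions: one starting at position 2 with length 2, and one starting at position 7 with length 1.&lt;/p>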
&lt;h2 id="parameters">
 Parameters
 &lt;a class="anchor" href="#parameters">#&lt;/a>
&lt;/h2>
&lt;ul>
&lt;li>&lt;code>segment&lt;/code>: Which read to analyze for low-quality regions&lt;/li>
&lt;li>&lt;code>min_quality&lt;/code>: Quality score threshold using Phred+33 encoding. See &lt;a href="https://en.wikipedia.org/wiki/Phred_quality_score#Symbols">Phred quality score&lt;/a> for ASCII character mapping&lt;/li>
&lt;li>&lt;code>label&lt;/code>: Tag name to store the extracted regions&lt;/li>
&lt;/ul>
&lt;h2 id="example">
 Example
 &lt;a class="anchor" href="#example">#&lt;/a>
&lt;/h2>
&lt;p>With &lt;code>min_quality = 60&lt;/code> (ASCII &amp;lsquo;&amp;lt;&amp;rsquo;), any bases with quality scores below &amp;lsquo;&amp;lt;&amp;rsquo; will be identified as low-quality regions. This is useful for masking or filtering poor-quality sequences.&lt;/p></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/v0.8.0-test/docs/reference/tag-steps/generation/tagotherfilebyname/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/v0.8.0-test/docs/reference/tag-steps/generation/tagotherfilebyname/</guid><description>&lt;h1 id="tagotherfilebyname">
 TagOtherFileByName
 &lt;a class="anchor" href="#tagotherfilebyname">#&lt;/a>
&lt;/h1>
&lt;p>Mark reads based on whether their names are present in another file.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;TagOtherFileByName&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">segment&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;read1&amp;#34;&lt;/span> &lt;span style="color:#75715e"># which segment&amp;#39;s name are we using&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">label&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;present_in_other&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">filename&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;names.fastq&amp;#34;&lt;/span> &lt;span style="color:#75715e"># Can read fastq (also compressed), or sam/bam files&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">false_positive_rate&lt;/span> = &lt;span style="color:#ae81ff">0.01&lt;/span> &lt;span style="color:#75715e"># false positive rate (0..1)&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">seed&lt;/span> = &lt;span style="color:#ae81ff">42&lt;/span> &lt;span style="color:#75715e"># seed for randomness&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">ignore_unaligned&lt;/span> = &lt;span style="color:#66d9ef">false&lt;/span> &lt;span style="color:#75715e"># in case of BAM/SAM, whether to ignore unaligned reads&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">fastq_readname_end_char&lt;/span> = &lt;span style="color:#e6db74">&amp;#34; &amp;#34;&lt;/span> &lt;span style="color:#75715e"># (optional) char (byte value) at which to cut input fastq read names before comparing. If not set, no cutting is done.&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">reference_readname_end_char&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;/&amp;#34;&lt;/span> &lt;span style="color:#75715e"># (optional) char (byte value) at which to cut reference read names before storing them.&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>This step marks reads by comparing their names against names from another file.&lt;/p></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/v0.8.0-test/docs/reference/tag-steps/generation/tagotherfilebysequence/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/v0.8.0-test/docs/reference/tag-steps/generation/tagotherfilebysequence/</guid><description>&lt;h1 id="tagotherfilebysequence">
 TagOtherFileBySequence
 &lt;a class="anchor" href="#tagotherfilebysequence">#&lt;/a>
&lt;/h1>
&lt;p>Mark reads based on whether their sequences are present in another file.&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;TagOtherFileBySequence&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">label&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;present_in_other_file&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">filename&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;sequences.fastq&amp;#34;&lt;/span> &lt;span style="color:#75715e"># fastq (also compressed), or sam/bam files&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">segment&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;read1&amp;#34;&lt;/span> &lt;span style="color:#75715e"># Any of your input segments&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">false_positive_rate&lt;/span> = &lt;span style="color:#ae81ff">0.01&lt;/span> &lt;span style="color:#75715e"># false positive rate (0..1)&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">seed&lt;/span> = &lt;span style="color:#ae81ff">42&lt;/span> &lt;span style="color:#75715e"># seed for randomness&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>This step annotates reads by comparing their sequences against sequences from another file.&lt;/p>
&lt;p>Please note our &lt;a href="https://tyberiusprime.github.io/fastqrab/v0.8.0-test/fastqrab/v0.8.0-test/docs/faq/#cuckoo-filtering">remarks about cuckoo filters&lt;/a>.&lt;/p></description></item><item><title>Extract IUPAC</title><link>https://tyberiusprime.github.io/fastqrab/v0.8.0-test/docs/reference/tag-steps/generation/extractiupac/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/v0.8.0-test/docs/reference/tag-steps/generation/extractiupac/</guid><description>&lt;h1 id="extractiupac">
 ExtractIUPAC
 &lt;a class="anchor" href="#extractiupac">#&lt;/a>
&lt;/h1>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;ExtractIUPAC&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">label&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;mytag&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">anchor&lt;/span> = &lt;span style="color:#e6db74">&amp;#39;Left&amp;#39;&lt;/span> &lt;span style="color:#75715e"># Left | Right | Anywhere&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">search&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;CTN&amp;#34;&lt;/span> &lt;span style="color:#75715e"># what we are searching&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">segment&lt;/span> = &lt;span style="color:#e6db74">&amp;#39;read1&amp;#39;&lt;/span> &lt;span style="color:#75715e"># Any of your input segments&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Search and extract a sequence from the read, defined by an &lt;a href="https://doi.org/10.1093%2Fnar%2F13.9.3021">IUPAC string&lt;/a>.&lt;/p>
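&lt;p>To make the matching rule concrete, here is a small Python sketch of an exact, leftmost IUPAC search, which corresponds roughly to &lt;code>anchor = 'Anywhere'&lt;/code> with no mismatches allowed. The code table is the standard IUPAC nucleotide alphabet; the function name &lt;code>find_iupac&lt;/code> and the example read are hypothetical, and the tool's own search may differ in detail.&lt;/p>

```python
# Standard IUPAC nucleotide codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def find_iupac(read, pattern):
    """Return the start of the leftmost exact IUPAC match, or None."""
    last_start = len(read) - len(pattern)
    for start in range(last_start + 1):
        window = read[start:start + len(pattern)]
        if all(base in IUPAC[code] for base, code in zip(window, pattern)):
            return start
    return None


# Hypothetical read; "CTN" matches "CTG" starting at index 2.
position = find_iupac("AACTGCTA", "CTN")
```

&lt;p>The second possible match, &lt;code>CTA&lt;/code> at index 5, is never reported because the search stops at the first hit.&lt;/p>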
&lt;p>If anchor = &amp;lsquo;Anywhere&amp;rsquo;, ExtractIUPAC will find the leftmost occurrence.&lt;/p></description></item><item><title>Extract IUPAC suffix</title><link>https://tyberiusprime.github.io/fastqrab/v0.8.0-test/docs/reference/tag-steps/generation/extractiupacsuffix/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/v0.8.0-test/docs/reference/tag-steps/generation/extractiupacsuffix/</guid><description>&lt;h1 id="extractiupacsuffix">
 ExtractIUPACSuffix
 &lt;a class="anchor" href="#extractiupacsuffix">#&lt;/a>
&lt;/h1>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;ExtractIUPACSuffix&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">label&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;mytag&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">query&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;AGTCA&amp;#34;&lt;/span> &lt;span style="color:#75715e"># the adapter to trim. Straight bases only, no IUPAC.&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">segment&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;read1&amp;#34;&lt;/span> &lt;span style="color:#75715e"># Any of your input segments (default: read1)&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">min_length&lt;/span> = &lt;span style="color:#ae81ff">3&lt;/span> &lt;span style="color:#75715e"># uint, the minimum length of match between the end of the read and&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#75715e"># the start of the adapter&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">max_mismatches&lt;/span> = &lt;span style="color:#ae81ff">1&lt;/span> &lt;span style="color:#75715e"># How many mismatches to accept&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Find a potentially truncated &lt;a href="https://doi.org/10.1093%2Fnar%2F13.9.3021">IUPAC string&lt;/a> sequence at the end of a read.&lt;/p>
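&lt;p>One way to picture the suffix search is the Python sketch below. It is a guess at the general idea, not the tool's algorithm: the function name &lt;code>find_suffix&lt;/code> is hypothetical, and the sketch assumes that longer overlaps are tried first and that the first overlap within the mismatch budget wins.&lt;/p>

```python
def find_suffix(read, query, min_length, max_mismatches):
    """Return the start index where a prefix of query overlaps the read end, or None.

    Assumption: longer overlaps are tried first and the first one within the
    mismatch budget wins; the tool may choose differently.
    """
    longest = min(len(read), len(query))
    for overlap in range(longest, min_length - 1, -1):
        tail = read[len(read) - overlap:]
        mismatches = sum(1 for a, b in zip(tail, query[:overlap]) if a != b)
        if max_mismatches >= mismatches:
            return len(read) - overlap
    return None


# Hypothetical read ending in "AGT", the first three bases of the query.
start = find_suffix("CCGGTTAGT", "AGTCA", min_length=3, max_mismatches=0)
```

&lt;p>Here the overlaps of length 5 and 4 fail, while the three-base overlap &lt;code>AGT&lt;/code> matches exactly, so the match starts at index 6 of the read.&lt;/p>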
&lt;p>The comparison is a simple Hamming-distance check that allows up to &lt;code>max_mismatches&lt;/code> mismatches and requires only the first &lt;code>min_length&lt;/code>
bases of the query to match at the end of the read.&lt;/p></description></item><item><title>Extract IUPAC with Indels</title><link>https://tyberiusprime.github.io/fastqrab/v0.8.0-test/docs/reference/tag-steps/generation/extractiupacwithindel/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/v0.8.0-test/docs/reference/tag-steps/generation/extractiupacwithindel/</guid><description>&lt;h1 id="extractiupacwithindel">
 ExtractIUPACWithIndel
 &lt;a class="anchor" href="#extractiupacwithindel">#&lt;/a>
&lt;/h1>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;ExtractIUPACWithIndel&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">label&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;adapter&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">search&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;AGTC&amp;#34;&lt;/span> &lt;span style="color:#75715e"># IUPAC pattern to align against&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">max_mismatches&lt;/span> = &lt;span style="color:#ae81ff">1&lt;/span> &lt;span style="color:#75715e"># allowed substitutions (IUPAC-aware)&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">max_indel_bases&lt;/span> = &lt;span style="color:#ae81ff">1&lt;/span> &lt;span style="color:#75715e"># total insertions + deletions allowed&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">max_total_edits&lt;/span> = &lt;span style="color:#ae81ff">2&lt;/span> &lt;span style="color:#75715e"># optional overall edit ceiling&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">anchor&lt;/span> = &lt;span style="color:#e6db74">&amp;#39;Anywhere&amp;#39;&lt;/span> &lt;span style="color:#75715e"># Left | Right | Anywhere&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">segment&lt;/span> = &lt;span style="color:#e6db74">&amp;#39;read1&amp;#39;&lt;/span> &lt;span style="color:#75715e"># defaults to read1&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Locate an &lt;a href="https://doi.org/10.1093%2Fnar%2F13.9.3021">IUPAC&lt;/a> pattern even when the read contains small insertions or deletions relative to the pattern. The extractor performs a semiglobal alignment (pattern vs. read segment) using IUPAC-aware scoring and returns the aligned span as a location tag.&lt;/p></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/v0.8.0-test/docs/reference/tag-steps/generation/extractlowcomplexity/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/v0.8.0-test/docs/reference/tag-steps/generation/extractlowcomplexity/</guid><description>&lt;h1 id="extractlowcomplexity">
 ExtractLowComplexity
 &lt;a class="anchor" href="#extractlowcomplexity">#&lt;/a>
&lt;/h1>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;ExtractLowComplexity&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">label&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;complexity&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">segment&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;read1&amp;#34;&lt;/span> &lt;span style="color:#75715e"># Any of your input segments, or &amp;#39;All&amp;#39;&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Calculate read complexity as the fraction of bases that differ from their predecessor.&lt;/p>
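&lt;p>The measure can be sketched in Python as follows. This is an illustration under stated assumptions, not the tool's code: the function name &lt;code>complexity&lt;/code> is hypothetical, and the number of changes is assumed to be divided by the number of adjacent base pairs (read length minus one).&lt;/p>

```python
def complexity(sequence):
    """Fraction of bases that differ from the base immediately before them.

    Assumption: the count is divided by the number of adjacent pairs
    (length - 1); the tool's exact denominator may differ.
    """
    if 2 > len(sequence):
        return 0.0
    changes = sum(1 for prev, cur in zip(sequence, sequence[1:]) if prev != cur)
    return changes / (len(sequence) - 1)


low = complexity("AAAAAAAAAT")   # one change across nine adjacent pairs
high = complexity("ACGTACGTAC")  # every base differs from its predecessor
```

&lt;p>Under these assumptions a homopolymer-like read scores close to 0, while a read in which every base differs from the previous one scores 1.&lt;/p>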
&lt;p>A good filter value might be 0.30, which means 30% complexity is required. See
&lt;a href="https://tyberiusprime.github.io/fastqrab/v0.8.0-test/fastqrab/v0.8.0-test/docs/reference/filter-steps/filterbynumerictag/">FilterByNumericTag&lt;/a>.&lt;/p>
&lt;h2 id="corresponding-options-in-other-software">
 Corresponding options in other software
 &lt;a class="anchor" href="#corresponding-options-in-other-software">#&lt;/a>
&lt;/h2>
&lt;ul>
&lt;li>fastp: --low_complexity_filter&lt;/li>
&lt;/ul></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/v0.8.0-test/docs/reference/tag-steps/generation/extractlowqualityend/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/v0.8.0-test/docs/reference/tag-steps/generation/extractlowqualityend/</guid><description>&lt;h1 id="trimqualityend">
 ExtractLowQualityEnd
 &lt;a class="anchor" href="#trimqualityend">#&lt;/a>
&lt;/h1>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;ExtractLowQualityEnd&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">label&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;low_quality_ends&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">min_qual&lt;/span> = &lt;span style="color:#ae81ff">20&lt;/span> &lt;span style="color:#75715e"># u8, minimum quality to keep (in whatever your score is encoded in)&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#75715e"># either a char like &amp;#39;A&amp;#39; or a number 0..128 (typical phred score is 33..75)&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">segment&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;read1&amp;#34;&lt;/span> &lt;span style="color:#75715e"># Any of your input segments&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Define the region of low-quality bases (below &lt;code>min_qual&lt;/code>) at the end of the read.&lt;/p>
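&lt;p>The idea can be sketched in Python as follows. This is an illustration, not the tool&amp;rsquo;s implementation: it assumes the region is the unbroken trailing run of scores below the threshold, and that the threshold is compared against the raw encoded quality byte. The function name and the (start, length) return shape are hypothetical.&lt;/p>

```python
def low_quality_end(qualities: bytes, min_qual: int):
    """Return (start, length) of the trailing run below min_qual, or None."""
    start = len(qualities)
    # Walk backwards while the score just before `start` is below the threshold.
    while start > 0 and min_qual > qualities[start - 1]:
        start -= 1
    length = len(qualities) - start
    return (start, length) if length else None


print(low_quality_end(b"IIIII###", ord("5")))  # (5, 3)
print(low_quality_end(b"IIIIIIII", ord("5")))  # None
```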
&lt;h2 id="corresponding-options-in-other-software">
 Corresponding options in other software
 &lt;a class="anchor" href="#corresponding-options-in-other-software">#&lt;/a>
&lt;/h2>
&lt;ul>
&lt;li>Trimmomatic: TRAILING (if paired with &lt;a href="https://tyberiusprime.github.io/fastqrab/v0.8.0-test/fastqrab/v0.8.0-test/docs/reference/modification-steps/trimattag/">TrimAtTag&lt;/a>)&lt;/li>
&lt;/ul></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/v0.8.0-test/docs/reference/tag-steps/generation/extractlowqualitystart/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/v0.8.0-test/docs/reference/tag-steps/generation/extractlowqualitystart/</guid><description>&lt;h1 id="trimqualitystart">
 ExtractLowQualityStart
 &lt;a class="anchor" href="#trimqualitystart">#&lt;/a>
&lt;/h1>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;ExtractLowQualityStart&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">min_qual&lt;/span> = &lt;span style="color:#ae81ff">20&lt;/span> &lt;span style="color:#75715e"># u8, minimum quality to keep (in whatever your score is encoded in)&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#75715e"># either a char like &amp;#39;A&amp;#39; or a number 0..128 (typical phred score is 33..75)&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">segment&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;read1&amp;#34;&lt;/span> &lt;span style="color:#75715e"># Any of your input segments&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">label&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;bad_starts&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Define the region of low-quality bases (below &lt;code>min_qual&lt;/code>) at the start of the read.&lt;/p>
&lt;h2 id="corresponding-options-in-other-software">
 Corresponding options in other software
 &lt;a class="anchor" href="#corresponding-options-in-other-software">#&lt;/a>
&lt;/h2>
&lt;ul>
&lt;li>Trimmomatic: LEADING (if combined with &lt;a href="https://tyberiusprime.github.io/fastqrab/v0.8.0-test/fastqrab/v0.8.0-test/docs/reference/modification-steps/trimattag/">TrimAtTag&lt;/a>)&lt;/li>
&lt;/ul></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/v0.8.0-test/docs/reference/tag-steps/generation/extractmeanquality/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/v0.8.0-test/docs/reference/tag-steps/generation/extractmeanquality/</guid><description>&lt;h3 id="extractmeanquality">
 ExtractMeanQuality
 &lt;a class="anchor" href="#extractmeanquality">#&lt;/a>
&lt;/h3>
&lt;p>We don&amp;rsquo;t support calculating the &amp;lsquo;average quality&amp;rsquo;.&lt;/p>
&lt;p>This is typically a bad idea, see &lt;a href="https://www.drive5.com/usearch/manual/avgq.html">https://www.drive5.com/usearch/manual/avgq.html&lt;/a> for a discussion of the issues.&lt;/p>
&lt;p>To illustrate, a read with 140 x Q35 and 10 x Q2 bases has an &amp;lsquo;average&amp;rsquo; phred of 33, but 6.4 expected wrong bases.
A read with 150 x Q25 has a much worse &amp;lsquo;average&amp;rsquo; phred of 25, but a much lower expected number of errors at 0.5!&lt;/p>
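&lt;p>The numbers above can be reproduced with a short Python sketch. It uses the standard phred relation p = 10^(-Q/10) and sums the per-base error probabilities; the helper names are hypothetical.&lt;/p>

```python
def mean_phred(quals):
    """Arithmetic mean of the phred scores."""
    return sum(quals) / len(quals)


def expected_errors(quals):
    """Sum of per-base error probabilities, p = 10 ** (-Q / 10)."""
    return sum(10 ** (-q / 10) for q in quals)


mixed = [35] * 140 + [2] * 10   # mostly excellent, ten very bad bases
uniform = [25] * 150            # uniformly mediocre

print(round(mean_phred(mixed), 1), round(expected_errors(mixed), 1))      # 32.8 6.4
print(round(mean_phred(uniform), 1), round(expected_errors(uniform), 1))  # 25.0 0.5
```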
&lt;h2 id="corresponding-options-in-other-software">
 Corresponding options in other software
 &lt;a class="anchor" href="#corresponding-options-in-other-software">#&lt;/a>
&lt;/h2>
&lt;ul>
&lt;li>Trimmomatic: AVGQUAL:&lt;/li>
&lt;li>fastp: --average_qual&lt;/li>
&lt;/ul></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/v0.8.0-test/docs/reference/tag-steps/generation/extractpolytail/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/v0.8.0-test/docs/reference/tag-steps/generation/extractpolytail/</guid><description>&lt;h1 id="extractpolytail">
 ExtractPolyTail
 &lt;a class="anchor" href="#extractpolytail">#&lt;/a>
&lt;/h1>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;ExtractPolyTail&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">label&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;tag-label&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">segment&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;read1&amp;#34;&lt;/span> &lt;span style="color:#75715e"># Any of your input segments (default: read1)&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">min_length&lt;/span> = &lt;span style="color:#ae81ff">5&lt;/span> &lt;span style="color:#75715e"># positive integer, the minimum number of repeats of the base&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">base&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;A&amp;#34;&lt;/span> &lt;span style="color:#75715e"># one of AGTCN., the &amp;#39;base&amp;#39; to trim (or . for any repeated base)&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">max_mismatch_rate&lt;/span> = &lt;span style="color:#ae81ff">0.1&lt;/span> &lt;span style="color:#75715e"># float 0.0..=1.0, how many mismatches are allowed in the repeat&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">max_consecutive_mismatches&lt;/span> = &lt;span style="color:#ae81ff">3&lt;/span> &lt;span style="color:#75715e"># how many consecutive mismatches are allowed&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;TrimAtTag&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">label&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;tag-label&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">direction&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;End&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">keep_tag&lt;/span> = &lt;span style="color:#66d9ef">false&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Identify a repetition of either a specific base (A, G, T, C or N)
or of any base (base = &amp;lsquo;.&amp;rsquo;) at the end of the read.&lt;/p></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/v0.8.0-test/docs/reference/tag-steps/generation/extractqualifiedbases/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/v0.8.0-test/docs/reference/tag-steps/generation/extractqualifiedbases/</guid><description>&lt;h1 id="extractqualifiedbases">
 ExtractQualifiedBases
 &lt;a class="anchor" href="#extractqualifiedbases">#&lt;/a>
&lt;/h1>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;ExtractQualifiedBases&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">min_quality&lt;/span> = &lt;span style="color:#ae81ff">30&lt;/span> &lt;span style="color:#75715e"># the quality value &amp;gt;= which a base is qualified &lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#75715e"># In your phred encoding. Typically 33..75&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#75715e"># a byte or a number 0...255&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">segment&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;read1&amp;#34;&lt;/span> &lt;span style="color:#75715e"># Any of your input segments, or &amp;#39;All&amp;#39;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">label&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;tag_name&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Calculate the percentage of bases that are &amp;lsquo;unqualified&amp;rsquo;,
that is, below a user-defined quality threshold.&lt;/p>
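&lt;p>As a rough Python illustration of the calculation described here: the sketch below counts the bases whose encoded quality byte is below the threshold and divides by the read length. Whether the tool stores the value as a fraction or a percentage, and how it treats empty reads, are assumptions; the function name is hypothetical.&lt;/p>

```python
def unqualified_rate(qualities: bytes, min_quality: int) -> float:
    """Fraction of bases whose encoded quality is below min_quality."""
    if not qualities:
        # Assumption: an empty read has no unqualified bases.
        return 0.0
    below = sum(1 for q in qualities if min_quality > q)
    return below / len(qualities)


print(unqualified_rate(b"IIII##II", ord("5")))  # 0.25, two of eight bases
```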
&lt;h2 id="corresponding-options-in-other-software">
 Corresponding options in other software
 &lt;a class="anchor" href="#corresponding-options-in-other-software">#&lt;/a>
&lt;/h2>
&lt;ul>
&lt;li>fastp: --qualified_quality_phred / --unqualified_percent_limit (if combined with &lt;a href="https://tyberiusprime.github.io/fastqrab/v0.8.0-test/fastqrab/v0.8.0-test/docs/reference/filter-steps/filterbynumerictag/">FilterByNumericTag&lt;/a>)&lt;/li>
&lt;/ul></description></item><item><title/><link>https://tyberiusprime.github.io/fastqrab/v0.8.0-test/docs/reference/tag-steps/generation/tagduplicates/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/v0.8.0-test/docs/reference/tag-steps/generation/tagduplicates/</guid><description>&lt;h3 id="filterduplicates">
 TagDuplicates
 &lt;a class="anchor" href="#filterduplicates">#&lt;/a>
&lt;/h3>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;TagDuplicates&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">false_positive_rate&lt;/span> = &lt;span style="color:#ae81ff">0.00001&lt;/span> &lt;span style="color:#75715e">#&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#75715e"># the false positive rate of the filter.&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#75715e"># 0..1&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">seed&lt;/span> = &lt;span style="color:#ae81ff">59&lt;/span> &lt;span style="color:#75715e"># required!&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">segment&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;All&amp;#34;&lt;/span> &lt;span style="color:#75715e"># Any of your input segments, or &amp;#39;All&amp;#39;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">label&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;dups&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;FilterByBoolTag&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">label&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;dups&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">keep_or_remove&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;Remove&amp;#34;&lt;/span> &lt;span style="color:#75715e"># Keep|Remove&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Tag duplicates (2nd onwards) from the stream using a &lt;a href="https://en.wikipedia.org/wiki/Cuckoo_filter">Cuckoo filter&lt;/a>.&lt;/p>
&lt;p>A Cuckoo filter is a probabilistic data structure, so there is a false positive rate
and a tunable memory requirement.&lt;/p>
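&lt;p>To show what &amp;lsquo;2nd onwards&amp;rsquo; means, here is a small Python sketch that swaps the Cuckoo filter for an exact set. It is not how the tool works internally: an exact set has no false positives but its memory grows with every distinct read, which is the cost the probabilistic filter avoids. The function name is hypothetical.&lt;/p>

```python
def tag_duplicates(reads):
    """Return one boolean per read: True if the sequence was seen before."""
    seen = set()
    tags = []
    for seq in reads:
        tags.append(seq in seen)  # the first occurrence is never tagged
        seen.add(seq)
    return tags


print(tag_duplicates(["ACGT", "TTTT", "ACGT", "ACGT"]))
# [False, False, True, True]
```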
&lt;p>Needs a seed for the random number generator, and a segment
to know which reads to consider for deduplication (filters the complete molecule, like
all other filters of course).&lt;/p></description></item><item><title>Extract GC Content</title><link>https://tyberiusprime.github.io/fastqrab/v0.8.0-test/docs/reference/tag-steps/generation/extractgccontent/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/v0.8.0-test/docs/reference/tag-steps/generation/extractgccontent/</guid><description>&lt;h1 id="extractgccontent">
 ExtractGCContent
 &lt;a class="anchor" href="#extractgccontent">#&lt;/a>
&lt;/h1>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;ExtractGCContent&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">segment&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;read1&amp;#34;&lt;/span> &lt;span style="color:#75715e"># Any of your input segments, or &amp;#39;All&amp;#39;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">label&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;gc&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Count what percentage of bases are GC (as opposed to AT).
Non-AGTC bases (e.g. N) are ignored in both the numerator and denominator.&lt;/p></description></item><item><title>Extract N Count</title><link>https://tyberiusprime.github.io/fastqrab/v0.8.0-test/docs/reference/tag-steps/generation/extractncount/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://tyberiusprime.github.io/fastqrab/v0.8.0-test/docs/reference/tag-steps/generation/extractncount/</guid><description>&lt;h1 id="extractncount">
 ExtractNCount
 &lt;a class="anchor" href="#extractncount">#&lt;/a>
&lt;/h1>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-toml" data-lang="toml">&lt;span style="display:flex;">&lt;span>[[&lt;span style="color:#a6e22e">step&lt;/span>]]
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">action&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;ExtractNCount&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">segment&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;read1&amp;#34;&lt;/span> &lt;span style="color:#75715e"># Any of your input segments, or &amp;#39;All&amp;#39;&lt;/span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span> &lt;span style="color:#a6e22e">label&lt;/span> = &lt;span style="color:#e6db74">&amp;#34;ncount&amp;#34;&lt;/span>
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;p>Count how many N bases are present in the read.&lt;/p>
&lt;h2 id="corresponding-options-in-other-software">
 Corresponding options in other software
 &lt;a class="anchor" href="#corresponding-options-in-other-software">#&lt;/a>
&lt;/h2>
&lt;ul>
&lt;li>fastp: --n_base_limit (if combined with &lt;a href="https://tyberiusprime.github.io/fastqrab/v0.8.0-test/fastqrab/v0.8.0-test/docs/reference/filter-steps/filterbynumerictag/">FilterByNumericTag&lt;/a>)&lt;/li>
&lt;/ul></description></item></channel></rss>