ExtractRegionsOfLowQuality #
Extract regions where bases have quality scores below a threshold, subject to a minimum length requirement.
[[step]]
action = "ExtractRegionsOfLowQuality"
segment = "read1" # Any of your input segments
min_quality = 60 # Quality threshold as a Phred+33 ASCII code (60 = '<', i.e. Phred 27)
min_length = 5 # Minimum region length (bp)
out_label = "low_quality_regions"
This transformation scans through quality scores of the specified segment and identifies contiguous regions where quality scores are below the specified threshold. Each low-quality region that meets the minimum length requirement becomes a tagged region with location information (start position and length).
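The scan described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the tool's actual implementation: the function name `low_quality_regions` is hypothetical, and it assumes 0-based start positions and that `min_quality` is compared against the raw Phred+33 byte of each quality character.

```python
def low_quality_regions(qualities: str, min_quality: int, min_length: int):
    """Return (start, length) for each run of bases below min_quality.

    Assumptions (not taken from the tool's source): positions are 0-based,
    and min_quality is the raw Phred+33 ASCII code of the threshold.
    """
    if min_length < 1:
        raise ValueError("min_length must be >= 1")

    regions = []
    run_start = None  # index where the current low-quality run began

    for i, ch in enumerate(qualities):
        if ord(ch) < min_quality:
            # Base is below the threshold: open a run if none is open.
            if run_start is None:
                run_start = i
        elif run_start is not None:
            # Run just ended: keep it only if it is long enough.
            if i - run_start >= min_length:
                regions.append((run_start, i - run_start))
            run_start = None

    # A run may extend to the end of the read.
    if run_start is not None and len(qualities) - run_start >= min_length:
        regions.append((run_start, len(qualities) - run_start))

    return regions


# 'I' (73) is above the threshold of 60, '#' (35) is below it.
print(low_quality_regions("IIII#####IIII##II######", 60, 5))
```

In this sketch the two-base run of `#` in the middle is dropped because it is shorter than `min_length`, while the runs of five and six bases are reported.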
Parameters #
segment: Which read to analyze for low-quality regions.
min_quality: Quality score threshold using Phred+33 encoding. See Phred quality score for the ASCII character mapping.
min_length: Minimum length (in bases) for a region to be extracted. Must be >= 1.
out_label: Tag name under which the extracted regions are stored.
Example #
With min_quality = 60 (ASCII ‘<’, which is Phred 27 under Phred+33) and min_length = 5, any contiguous stretch of 5 or more bases with quality scores below ‘<’ is identified as a low-quality region. This is useful for masking or filtering poor-quality sequences.
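To make the threshold arithmetic concrete, the following sketch converts a `min_quality` value into its ASCII character and Phred score. It assumes, as the example above implies, that `min_quality` is given as the Phred+33 ASCII code; the helper names are hypothetical and are not part of the tool.

```python
PHRED_OFFSET = 33  # Phred+33 encoding: ASCII code = Phred score + 33


def describe_threshold(min_quality: int):
    """Return (ASCII character, Phred score) for a Phred+33 ASCII code."""
    return chr(min_quality), min_quality - PHRED_OFFSET


def error_probability(phred: int) -> float:
    """Standard Phred definition: P(error) = 10 ** (-Q / 10)."""
    return 10 ** (-phred / 10)


char, phred = describe_threshold(60)
print(char, phred)  # prints: < 27
print(f"{error_probability(phred):.4f}")
```

Under this reading, a threshold of 60 flags bases whose estimated error probability is worse than roughly 1 in 500.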
Notes #
Note that one read may have multiple low-quality regions. TrimAtTag will cut at the outermost of them.