StoreTagInSequence
Insert a tag’s string value into a read sequence at the position defined by another location tag.
[[step]]
action = "StoreTagInSequence"
in_value_label = "mytag" # location or string tag to insert
in_position_label = "mytag2" # location tag defining where to insert
anchor = "Start" # "Start"/"left" or "End"/"right"
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| in_value_label | location or string tag | yes | Tag whose sequence is inserted into the read |
| in_position_label | location tag | yes | Tag that defines the insertion position |
| anchor | "Start" / "left" / "End" / "right" | yes | Whether to insert before the leftmost position (Start) or after the rightmost end (End) of the position tag |
How it works
in_position_label is a location tag pointing to a region (or multiple regions) in a
read. The insertion point is derived from the anchor:
- Start/left: insert before the leftmost start coordinate across all regions.
- End/right: insert after the rightmost start + len coordinate across all regions.
The bytes inserted come from in_value_label:
- If it is a location tag, its sequences are joined (without spacer) and used.
- If it is a string tag, the string value is used directly.
Quality scores for the inserted bases are set to ~ (Phred+33 = Q93, maximum Sanger quality).
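The rules above can be modelled in a few lines of Python. This is only an illustrative sketch, not the tool's implementation: the function names `insertion_point` and `insert_tag_value` and the representation of regions as `(start, length)` tuples are assumptions made for this example.

```python
def insertion_point(regions, anchor):
    """Derive the insertion index from a list of (start, length) regions."""
    if anchor in ("Start", "left"):
        # Leftmost start coordinate across all regions.
        return min(start for start, _ in regions)
    if anchor in ("End", "right"):
        # Rightmost start + len coordinate across all regions.
        return max(start + length for start, length in regions)
    raise ValueError(f"unknown anchor: {anchor!r}")


def insert_tag_value(seq, qual, value, regions, anchor):
    """Insert `value` into `seq`; inserted bases get quality '~' (Q93)."""
    pos = insertion_point(regions, anchor)
    new_seq = seq[:pos] + value + seq[pos:]
    new_qual = qual[:pos] + "~" * len(value) + qual[pos:]
    return new_seq, new_qual


# A location tag contributes its sequences joined without a spacer;
# a string tag would be used as-is.
value = "".join(["AAA"])
print(insert_tag_value("AAACCCGGG", "IIIIIIIII", value, [(3, 3)], "Start"))
```

The read, tag value, and region used here are the same as in the Example section, so the printed pair can be compared against the result stated there.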
Location tag adjustment
After insertion, all location tags on the same segment are updated:
- Locations after the insertion point are shifted forward by the number of inserted bases.
- Locations that straddle the insertion point (start before, end after) have their coordinate information removed (sequence data is preserved).
- Locations before the insertion point are unchanged.
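The three cases can be expressed as a small Python function. This is a sketch under stated assumptions: the name `adjust_location` and the `(start, length)` representation are hypothetical, and the handling of boundary cases (a location that starts or ends exactly at the insertion point) is a guess, since the text above does not specify it.

```python
def adjust_location(start, length, insert_pos, inserted_len):
    """Return the updated (start, length), or None if coordinates are dropped.

    Assumption: a location starting exactly at insert_pos counts as "after"
    the insertion, and one ending exactly at insert_pos counts as "before".
    """
    end = start + length
    if start >= insert_pos:
        # Entirely after the insertion point: shift forward.
        return (start + inserted_len, length)
    if end <= insert_pos:
        # Entirely before the insertion point: unchanged.
        return (start, length)
    # Straddles the insertion point: coordinates are removed
    # (the tag's sequence data would be kept elsewhere).
    return None


print(adjust_location(6, 3, insert_pos=3, inserted_len=3))
print(adjust_location(0, 3, insert_pos=3, inserted_len=3))
print(adjust_location(2, 4, insert_pos=3, inserted_len=3))
```

Returning `None` stands in for "coordinate information removed"; a real implementation would keep the tag's sequence and only clear its location.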
Behaviour when tags are missing
If in_value_label is Missing or produces an empty sequence, or if in_position_label
carries no location information, the read is left unchanged and no error is raised.
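As a rough illustration, the no-op conditions amount to a guard evaluated before any insertion. The function name `should_insert`, the use of `None` to model a Missing tag, and the list-of-regions argument are assumptions of this sketch, not the tool's API.

```python
def should_insert(value, regions):
    """Return True only if there is something to insert and a place to put it."""
    if value is None or len(value) == 0:
        # Missing value tag, or a tag that yields an empty sequence.
        return False
    if not regions:
        # Position tag carries no location information.
        return False
    return True


print(should_insert("AAA", [(3, 3)]))
print(should_insert(None, [(3, 3)]))
print(should_insert("AAA", []))
```

When the guard returns `False`, the read would be passed through unchanged rather than raising an error.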
Example
Given read1 = AAACCCGGG (quality IIIIIIIII), with:
- val_tag extracting bases 0–2 (AAA, location [0,3])
- pos_tag extracting bases 3–5 (CCC, location [3,3])
# ignore_in_test
[[step]]
action = "StoreTagInSequence"
in_value_label = "val_tag"
in_position_label = "pos_tag"
anchor = "Start"
Result: AAA inserted before position 3 → AAAAAACCCGGG (quality III~~~IIIIII).
With anchor = "End":
Result: AAA inserted after position 6 → AAACCCAAAGGG (quality IIIIII~~~III).