feat: add fgumi modules and bump fgumi family to 0.4.0 #12178
Open
nh13 wants to merge 9 commits into
Add an nf-core module wrapping `fgumi fastq`. Convert a BAM file to interleaved gzipped FASTQ. Mirrors the existing fgumi modules and is pinned to fgumi 0.4.0. Tested against the nf-core UMI test fixtures with a stub run.
Add an nf-core module wrapping `fgumi simplex-metrics`. Collect QC metrics for simplex (single-strand) UMI sequencing data from a UMI-grouped BAM. Mirrors the existing fgumi modules and is pinned to fgumi 0.4.0. Tested against the nf-core UMI test fixtures with a stub run.
Add an nf-core module wrapping `fgumi codec`. Call CODEC consensus reads from a UMI-grouped BAM. Mirrors the existing fgumi modules and is pinned to fgumi 0.4.0. Tested against the nf-core UMI test fixtures with a stub run.
Add an nf-core module wrapping `fgumi downsample`. Downsample a BAM by UMI family using a streaming algorithm. Mirrors the existing fgumi modules and is pinned to fgumi 0.4.0. Tested against the nf-core UMI test fixtures with a stub run.
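For reviewers unfamiliar with family-level downsampling, the idea can be sketched as follows. This is an illustrative sketch only, not fgumi's actual algorithm: it assumes each read carries a family identifier and decides per family (not per read) by hashing that identifier, so a single pass over the input suffices and all reads of a family are kept or dropped together. The `Read` tuple and the `fraction` and `seed` parameters are hypothetical names and do not correspond to fgumi options.

```python
import hashlib
from typing import Iterable, Iterator, Tuple

# Hypothetical stand-in for a BAM record: (family_id, read_name).
Read = Tuple[str, str]


def keep_family(family_id: str, fraction: float, seed: int = 42) -> bool:
    """Decide deterministically whether a whole UMI family is retained."""
    digest = hashlib.sha256(f"{seed}:{family_id}".encode()).digest()
    # Compare 64 hash bits against an integer threshold to avoid float rounding.
    value = int.from_bytes(digest[:8], "big")
    return value < int(fraction * 2**64)


def downsample(reads: Iterable[Read], fraction: float, seed: int = 42) -> Iterator[Read]:
    """Stream reads once, yielding only those whose family is retained."""
    for family_id, name in reads:
        if keep_family(family_id, fraction, seed):
            yield (family_id, name)


if __name__ == "__main__":
    reads = [(f"fam{i % 10}", f"read{i}") for i in range(100)]
    kept = list(downsample(reads, 0.5))
    print(len({fam for fam, _ in kept}), "families kept")
```

Because the decision depends only on the family identifier and the seed, no per-family state has to be buffered, which is what makes a streaming pass possible in this sketch.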
Add an nf-core module wrapping `fgumi correct`. Correct UMIs in a BAM file (RX tag) to a fixed set of known UMIs (supplied via task.ext.args). Mirrors the existing fgumi modules and is pinned to fgumi 0.4.0. Tested against the nf-core UMI test fixtures with a stub run.
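As a rough illustration of what correcting to a fixed set of known UMIs involves, here is a minimal sketch. It is not taken from fgumi: it assumes correction by Hamming distance, and the `max_mismatches` and `min_margin` parameters are hypothetical names chosen for this example, not fgumi flags. An observed UMI is mapped to its nearest known UMI only when that match is close enough and clearly better than the runner-up.

```python
from typing import Optional, Sequence


def hamming(a: str, b: str) -> int:
    """Count mismatching positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("UMIs must have equal length")
    return sum(x != y for x, y in zip(a, b))


def correct_umi(observed: str, known: Sequence[str],
                max_mismatches: int = 1, min_margin: int = 2) -> Optional[str]:
    """Return the closest known UMI, or None if no confident match exists."""
    if not known:
        return None
    # Rank known UMIs by distance to the observed sequence.
    ranked = sorted((hamming(observed, k), k) for k in known)
    best_dist, best = ranked[0]
    if best_dist > max_mismatches:
        return None  # too far from every known UMI
    if len(ranked) > 1 and ranked[1][0] - best_dist < min_margin:
        return None  # ambiguous: the runner-up is nearly as close
    return best


if __name__ == "__main__":
    known_umis = ["AAAAAA", "CCCCCC", "GGGGGG", "TTTTTT"]
    print(correct_umi("AAAAAT", known_umis))
    print(correct_umi("AAATTT", known_umis))
```

In the module itself the known UMI set is supplied via `task.ext.args`, as stated in the commit message; the sketch only shows the matching logic under the assumptions above.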
Add an nf-core module wrapping `fgumi clip`. Clip overlapping reads in a queryname-sorted BAM, regenerating tags against a reference FASTA. Mirrors the existing fgumi modules and is pinned to fgumi 0.4.0. Tested against the nf-core UMI test fixtures with a stub run.
Add an nf-core module wrapping `fgumi zipper`. Zip an unmapped UMI BAM together with its aligned BAM, transferring UMI tags onto the aligned reads. Mirrors the existing fgumi modules and is pinned to fgumi 0.4.0. Tested against the nf-core UMI test fixtures with a stub run.
Add an nf-core module wrapping `fgumi dedup`. Mark or remove PCR duplicates using UMI information; emits the deduplicated BAM, metrics, and a family-size histogram. Mirrors the existing fgumi modules and is pinned to fgumi 0.4.0. Tested against the nf-core UMI test fixtures with a stub run.
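To make the family-size histogram concrete, here is a minimal sketch of how such a histogram can be derived. It assumes one family identifier per read (for example the value of a molecular-identifier tag) and is not a description of fgumi's output format, which is not shown in this PR; the function name is hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable


def family_size_histogram(family_ids: Iterable[str]) -> Dict[int, int]:
    """Map each family size to the number of families of that size."""
    sizes = Counter(family_ids)          # family id -> number of reads
    histogram = Counter(sizes.values())  # family size -> number of families
    return dict(sorted(histogram.items()))


if __name__ == "__main__":
    ids = ["a", "a", "a", "b", "c", "c"]
    print(family_size_histogram(ids))
```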
Bump the eight existing fgumi modules (extract, group, simplex, duplex, duplexmetrics, filter, sort, merge) from fgumi 0.2.0 to 0.4.0 so the entire fgumi module family pins the latest release. Updates the conda pins and both container URLs, and regenerates the nf-test snapshots.
Contributor
Ugh, 8 new modules in one PR :(
Description
Adds eight new modules for fgumi, high-performance tools for UMI-tagged sequencing data, and bumps the entire fgumi module family to `fgumi=0.4.0` (the latest release):
New modules:
- `fgumi/fastq`: convert a BAM to interleaved gzipped FASTQ.
- `fgumi/simplexmetrics`: collect QC metrics for simplex UMI data.
- `fgumi/codec`: call CODEC consensus reads from a grouped BAM.
- `fgumi/downsample`: downsample a BAM by UMI family.
- `fgumi/correct`: correct UMIs to a fixed set of known UMIs.
- `fgumi/clip`: clip overlapping reads against a reference.
- `fgumi/zipper`: zip an unmapped UMI BAM with its aligned BAM.
- `fgumi/dedup`: mark/remove PCR duplicates using UMI information.
Version bump:
- The eight existing modules (`extract`, `group`, `simplex`, `duplex`, `duplexmetrics`, `filter`, `sort`, `merge`) are bumped from `fgumi=0.2.0` to `fgumi=0.4.0`, so all sixteen `fgumi/*` modules pin the same latest release. Snapshots regenerated accordingly.
Each new module is tested against the nf-core UMI test fixtures (with setup chains via `fgumi/extract`, `fgumi/sort`, and `samtools/sort` where required) plus stub runs. `fgumi/review` is intentionally left for a follow-up PR as it requires a dedicated VCF + consensus/grouped BAM fixture set.
PR checklist
- `fgumi/*` modules).
- `label`.
- `bioconda::fgumi=0.4.0`; Seqera Wave community container).
- `nf-core modules test fgumi/<sub> --profile docker` passes for all sixteen modules.
- `topic: versions`.