Added parentage functions #59
Pull request overview
This PR updates package exports/imports to support new pedigree/parentage-related functionality, including exposing validate_pedigree() and tightening check_ped() dependency declarations.
Changes:
- Export `validate_pedigree()` and add its `data.table` roxygen imports.
- Replace broad `@import` directives in `check_ped()` with more specific `@importFrom` entries.
- Regenerate/update `NAMESPACE` accordingly (including adding `tools::file_path_sans_ext` and `janitor::clean_names` imports).
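As a rough illustration of the `@import` to `@importFrom` change described above, here is a minimal sketch. The helper `strip_extension()` and the `dplyr` function names in the roxygen tag are hypothetical and are not taken from the PR; only `tools::file_path_sans_ext` is named in the review.

```r
# Hypothetical roxygen header: the imported dplyr names are illustrative only.
# A broad tag such as "@import dplyr" pulls in every export of the package;
# "@importFrom" lists just the functions that are actually called.
#' @importFrom tools file_path_sans_ext
#' @importFrom dplyr filter mutate
strip_extension <- function(path) {
  # A fully qualified call works with or without the NAMESPACE import.
  tools::file_path_sans_ext(basename(path))
}

strip_extension("data/pedigree.txt")
```

Narrow imports tend to reduce the risk of name clashes between packages, at the cost of keeping the tag list in sync with the code.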
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| R/validate_pedigree.R | Adds @importFrom data.table ... for validate_pedigree() and ensures it’s exported. |
| R/check_ped.R | Switches roxygen imports to @importFrom for dplyr/janitor and adds tools::file_path_sans_ext. |
| NAMESPACE | Updates exports/imports to match roxygen (adds export(validate_pedigree), switches janitor import style, adds additional dplyr imports). |
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Pull request overview
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@              Coverage Diff               @@
##           development      #59      +/-   ##
===============================================
- Coverage        81.53%   78.96%   -2.58%
===============================================
  Files               23       23
  Lines             2616     2814    +198
===============================================
+ Hits              2133     2222     +89
- Misses             483      592    +109
…ement for documentation on check_ped
…and to plot results.
Pull request overview
Copilot reviewed 12 out of 13 changed files in this pull request and generated 6 comments.
Comments suppressed due to low confidence (1)
man/validate_pedigree.Rd:90
- The function always writes `corrected_pedigree.txt` to the working directory (even when `write_results = FALSE`), but this side-effect isn’t documented here. Please document this output file (including when it’s written and how it relates to `write_results`) so users aren’t surprised by unexpected files in their working directory.
\description{
Validates parent-offspring trios by calculating Mendelian error rates from
SNP genotype data. Identifies incorrect parentage assignments and suggests
best-matching replacements. If a list of founders is supplied, trios that
are declared founders (both parents coded as 0) are preserved unchanged
with no recommendations. Trios removed due to missing genotype data are
retained in the output with a NO_GENOTYPE_DATA status.
}
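To make the "Mendelian error rate" in the description above concrete, here is a minimal sketch for a single trio. It assumes diploid genotypes coded as allele dosage 0, 1, 2 and is not the implementation in `validate_pedigree()`; the function names and example vectors are hypothetical.

```r
# Gametes a diploid parent with dosage 0, 1, or 2 can transmit
# (counted as copies of the alternate allele).
gametes <- list(`0` = 0, `1` = c(0, 1), `2` = 1)

# TRUE when the child's dosage cannot arise from the two parents.
is_mendelian_error <- function(child, male, female) {
  allowed <- outer(gametes[[as.character(male)]],
                   gametes[[as.character(female)]], `+`)
  !(child %in% allowed)
}

# Percentage of non-missing markers that violate Mendelian inheritance.
mendelian_error_rate <- function(child, male, female) {
  ok <- !is.na(child) & !is.na(male) & !is.na(female)
  if (!any(ok)) return(NA_real_)
  errors <- mapply(is_mendelian_error, child[ok], male[ok], female[ok])
  100 * mean(errors)
}

child  <- c(1, 0, 2, 1, NA)
male   <- c(0, 0, 0, 2, 1)
female <- c(2, 1, 2, 2, 1)
mendelian_error_rate(child, male, female)  # 2 errors out of 4 usable markers: 50
```

Markers with any missing genotype are skipped here, which is one plausible convention; the package may handle missing data differently.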
Pull request overview
Copilot reviewed 12 out of 13 changed files in this pull request and generated 3 comments.
Pull request overview
Copilot reviewed 12 out of 13 changed files in this pull request and generated 2 comments.
Comments suppressed due to low confidence (1)
NAMESPACE:33
`NAMESPACE` no longer imports `janitor`, and a repo-wide search shows no remaining `janitor::` usage. However, `DESCRIPTION` still lists `janitor` under `Imports`, which will typically trigger an R CMD check NOTE about unused imports. Either remove `janitor` from `DESCRIPTION` or reintroduce an actual use/import as appropriate.
import(dplyr)
import(parallel)
import(quadprog)
import(stringr)
import(tibble)
import(tidyr)
import(vcfR)
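The check described in this comment can be approximated with a small base R script. The sketch below is an assumption-laden helper (`unused_imports()` is hypothetical, not part of the package): it only looks for `pkg::` calls in `R/` and for `import()`/`importFrom()` lines in `NAMESPACE`, so it can miss other forms of use.

```r
# Hypothetical helper: list packages in DESCRIPTION Imports that are neither
# imported in NAMESPACE nor called as `pkg::` anywhere under R/.
unused_imports <- function(pkg_dir = ".") {
  desc <- read.dcf(file.path(pkg_dir, "DESCRIPTION"), fields = "Imports")
  imports <- trimws(strsplit(desc[1, "Imports"], ",")[[1]])
  imports <- sub("\\s*\\(.*\\)$", "", imports)  # drop version constraints
  imports <- imports[nzchar(imports)]

  ns <- readLines(file.path(pkg_dir, "NAMESPACE"), warn = FALSE)
  r_files <- list.files(file.path(pkg_dir, "R"), pattern = "\\.[Rr]$",
                        full.names = TRUE)
  code <- unlist(lapply(r_files, readLines, warn = FALSE))

  used <- vapply(imports, function(p) {
    in_ns   <- any(grepl(paste0("^import(From)?\\(", p, "[,)]"), ns))
    in_code <- any(grepl(paste0(p, "::"), code, fixed = TRUE))
    in_ns || in_code
  }, logical(1))

  unname(imports[!used])
}
```

Run from the package root, `unused_imports()` would be expected to return `"janitor"` in the situation the comment describes, though R CMD check remains the authoritative test.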
Pull request overview
Copilot reviewed 12 out of 13 changed files in this pull request and generated 2 comments.
Pull request overview
Copilot reviewed 12 out of 19 changed files in this pull request and generated 7 comments.
Files not reviewed (6)
- man/allele_freq_poly.Rd: Language not supported
- man/check_ped.Rd: Language not supported
- man/find_parentage.Rd: Language not supported
- man/imputation_concordance.Rd: Language not supported
- man/solve_composition_poly.Rd: Language not supported
- man/validate_pedigree.Rd: Language not supported
Comments suppressed due to low confidence (1)
DESCRIPTION:66
`janitor` remains listed under Imports, but there are no longer any `janitor::` (or `clean_names`) usages in the R sources. This makes `janitor` an unnecessary hard dependency; consider removing it from DESCRIPTION Imports if it's truly unused.
Imports:
parallel,
dplyr,
Rdpack (>= 0.7),
readr (>= 2.1.5),
reshape2 (>= 1.4.4),
rlang,
tidyr (>= 1.3.1),
vcfR (>= 1.15.0),
Rsamtools,
Biostrings,
pwalign,
janitor,
quadprog,
tibble,
stringr,
data.table
Suggests:
… or file paths, updated documentation and test files
Pull request overview
Copilot reviewed 13 out of 20 changed files in this pull request and generated 4 comments.
…ring genotype matrices for comparisons. Updated test file to accommodate new arguments
Pull request overview
Copilot reviewed 13 out of 20 changed files in this pull request and generated 5 comments.
} else if (decision %in% c("remove_both",
                           "low_markers_remove_both",
                           "low_markers_remove_male_parent",
                           "low_markers_remove_female_parent")) {
  if (grepl("male", decision))
    data.table::set(corrected_pedigree, row_idx, "male_parent", "0")
  if (grepl("female", decision))
    data.table::set(corrected_pedigree, row_idx, "female_parent", "0")
  if (decision %in% c("low_markers_remove_both", "remove_both")) {
    data.table::set(corrected_pedigree, row_idx, "male_parent", "0")
    data.table::set(corrected_pedigree, row_idx, "female_parent", "0")
  }
}
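One thing worth noting about the snippet above, if the `grepl("male", decision)` check runs as shown: the string "female" contains "male", so a female-only decision would also match the male branch. The sketch below demonstrates this and shows one possible exact-match alternative; `parents_to_clear()` is a hypothetical helper, not code from the PR.

```r
decision <- "low_markers_remove_female_parent"
grepl("male", decision)  # TRUE, because "female" contains "male"

# Hypothetical alternative: map each decision to the columns to reset,
# using exact matching instead of substring search.
parents_to_clear <- function(decision) {
  switch(decision,
    remove_both = ,
    low_markers_remove_both = c("male_parent", "female_parent"),
    low_markers_remove_male_parent = "male_parent",
    low_markers_remove_female_parent = "female_parent",
    character(0)  # any other decision clears nothing
  )
}

parents_to_clear("low_markers_remove_female_parent")
```

The returned column names could then be looped over with `data.table::set()`, so each decision touches only the columns it names.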
show_ties = TRUE,
allow_selfing = TRUE,
verbose = TRUE,
write_txt = TRUE,
plot_results = TRUE
)
}
\arguments{
\item{genotypes_file}{Path to a TSV/CSV/TXT file containing genotype data.
Must include an 'ID' column followed by marker columns coded as 0, 1, 2
(allele dosage).}
\item{genotypes_file}{Path to a TSV/CSV/TXT file, OR a data.frame /
data.table with an 'id' column followed by marker columns coded as 0, 1, 2.}

\item{parents_file}{Path to a TSV/CSV/TXT file listing candidate parent IDs.
Must include an 'ID' column. An optional 'Sex' column with values
'M' (male parent), 'F' (female parent), or 'A' (ambiguous) determines
which parents are tested for each role. If absent, all parents are treated
as ambiguous.}
\item{parents_file}{Path to a TSV/CSV/TXT file, OR a data.frame /
data.table with an 'id' column and an optional 'sex' column
('M', 'F', or 'A'). If absent, all parents are treated as ambiguous.}

\item{progeny_file}{Path to a TSV/CSV/TXT file listing progeny IDs to assign.
Must include an 'ID' column.}
\item{progeny_file}{Path to a TSV/CSV/TXT file, OR a data.frame /
data.table with an 'id' column.}

\item{method}{Character. Parentage assignment method. One of:
\itemize{
\item \code{"best_male_parent"}: finds the best male parent for each
progeny using homozygous mismatch rate.
\item \code{"best_female_parent"}: finds the best female parent for each
progeny using homozygous mismatch rate.
\item \code{"best_match"}: finds the single best parent (either sex)
using homozygous mismatch rate.
\item \code{"best_pair"}: finds the best male-female parent pair for
each progeny using full Mendelian error rate (default).
}}
\item{method}{Character. One of \code{"best_male_parent"},
\code{"best_female_parent"}, \code{"best_match"}, or
\code{"best_pair"} (default).}

\item{min_markers}{Integer. Minimum number of non-missing markers required
to report a parentage assignment. Progeny-parent comparisons with fewer
markers are flagged as \code{LOW_MARKERS} and no assignment is made
(default: \code{10}).}
\item{min_markers}{Integer. Minimum markers required; fewer flags
\code{low_markers} (default: \code{10}).}

\item{error_threshold}{Numeric. Maximum mismatch percentage to report a
parentage assignment as confident. Assignments above this threshold are
flagged as \code{HIGH_ERROR} in the \code{Assignment_Status} column
(default: \code{5.0}). Must be between 0 and 100.}
\item{error_threshold}{Numeric. Maximum mismatch percentage; exceeded values
flag \code{high_error} (default: \code{5.0}). Must be between 0 and 100.}

\item{show_ties}{Logical. If \code{TRUE}, all tied best pairs (after
tie-breaking by maximum markers tested) are reported as additional columns
(\code{Male_Parent_1}, \code{Male_Parent_2}, etc.) when
\code{method = "best_pair"}. The base columns (\code{Male_Parent},
\code{Female_Parent}, etc.) are always populated with the top result.
If \code{FALSE}, only one tied pair is reported with a warning.
Default is \code{TRUE}.}
\item{show_ties}{Logical. If \code{TRUE}, tied best pairs are appended as
suffix columns. Default is \code{TRUE}.}

\item{allow_selfing}{Logical. If \code{FALSE}, male-female parent pairs where
both IDs are identical are excluded when \code{method = "best_pair"}.
Default is \code{TRUE}.}
\item{allow_selfing}{Logical. If \code{FALSE}, pairs with identical male and
female parent IDs are excluded. Default is \code{TRUE}.}

\item{verbose}{Logical. If \code{TRUE}, prints progress messages, summary
statistics, and the results table to the console. Default is \code{TRUE}.}
\item{verbose}{Logical. If \code{TRUE}, prints progress and summary.
Default is \code{TRUE}.}

\item{write_txt}{Logical. If \code{TRUE}, writes results to
\code{parentage_results_dt.txt} in the working directory. Default is
\code{TRUE}.}
\item{plot_results}{Logical. If \code{TRUE}, plots the Mendelian error
distribution. Requires \code{ggplot2}. Default is \code{TRUE}.}
}
ggplot2::geom_vline(xintercept = error_threshold,
                    linetype = "dashed", color = "black", linewidth = 1) +
ggplot2::scale_x_continuous(breaks = seq(0, 100, by = 5)) +
ggplot2::scale_y_continuous(breaks = seq(0, 10000, by = 5)) +
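The fixed y-axis sequence in the snippet above defines a large number of candidate breaks regardless of the data. As a hedged sketch of one alternative, base R's `pretty()` can derive a small set of breaks from the observed counts; the `counts` vector here is hypothetical example data, not output from the package.

```r
# The hard-coded sequence yields 2001 candidate break positions.
length(seq(0, 10000, by = 5))

# Hypothetical bin counts standing in for the histogram heights.
counts <- c(3, 12, 7, 1)

# Roughly five evenly spaced, round-numbered breaks covering the data range.
y_breaks <- pretty(c(0, max(counts)), n = 5)
y_breaks
```

A vector like `y_breaks` could be passed to the `breaks` argument of `ggplot2::scale_y_continuous()`, which also accepts a function of the axis limits if the counts are not known in advance.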
plot_df$plot_status <- dplyr::case_when(
  plot_df$recommended_correction %in% c("none", "keep_both",
                                        "low_markers_keep_both") ~ "pass",
  plot_df$recommended_correction %in% c("remove_male_parent",
                                        "remove_female_parent",
                                        "low_markers_remove_male_parent",
                                        "low_markers_remove_female_parent") ~ "fail_one_parent",
  plot_df$recommended_correction %in% c("remove_both",
                                        "low_markers_remove_both") ~ "fail_both_parents",
  TRUE ~ "other"
)
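The same classification can be expressed as a lookup table, which keeps each correction code and its plot status on one line. This is only a sketch of an equivalent base R approach under the mapping shown above; `status_lookup` and `classify_correction()` are hypothetical names.

```r
# Hypothetical lookup: correction code -> plot status, mirroring the case_when().
status_lookup <- c(
  none                             = "pass",
  keep_both                        = "pass",
  low_markers_keep_both            = "pass",
  remove_male_parent               = "fail_one_parent",
  remove_female_parent             = "fail_one_parent",
  low_markers_remove_male_parent   = "fail_one_parent",
  low_markers_remove_female_parent = "fail_one_parent",
  remove_both                      = "fail_both_parents",
  low_markers_remove_both          = "fail_both_parents"
)

classify_correction <- function(x) {
  out <- unname(status_lookup[x])
  out[is.na(out)] <- "other"  # unknown or missing codes fall through
  out
}

classify_correction(c("keep_both", "remove_female_parent", "remove_both", "unexpected"))
```

Whether this is preferable to `case_when()` is a style choice; the lookup avoids a `dplyr` dependency for this step.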
library(testthat)
library(data.table)

set.seed(101919)
Added 2 parentage functions along with test files for them. Updated importFrom statements for check_ped.