Added parentage functions#59

Closed
josuechinchilla wants to merge 18 commits into development from add_parentage_functions

Conversation

@josuechinchilla

Collaborator

Added 2 parentage functions along with test files for them. Updated importFrom statements for check_ped.

Copilot AI left a comment

Contributor

Pull request overview

This PR updates package exports/imports to support new pedigree/parentage-related functionality, including exposing validate_pedigree() and tightening check_ped() dependency declarations.

Changes:

  • Export validate_pedigree() and add its data.table roxygen imports.
  • Replace broad @import directives in check_ped() with more specific @importFrom entries.
  • Regenerate/update NAMESPACE accordingly (including adding tools::file_path_sans_ext and janitor::clean_names imports).

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

File Description
R/validate_pedigree.R Adds @importFrom data.table ... for validate_pedigree() and ensures it’s exported.
R/check_ped.R Switches roxygen imports to @importFrom for dplyr/janitor and adds tools::file_path_sans_ext.
NAMESPACE Updates exports/imports to match roxygen (adds export(validate_pedigree), switches janitor import style, adds additional dplyr imports).

Comment thread R/check_ped.R Outdated
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.


Comment thread R/validate_pedigree.R Outdated
Comment thread R/check_ped.R Outdated
@codecov

codecov Bot commented Apr 22, 2026

Codecov Report

❌ Patch coverage is 54.51713% with 146 lines in your changes missing coverage. Please review.
✅ Project coverage is 78.96%. Comparing base (ccc0952) to head (f798362).

Files with missing lines    Patch %    Lines
R/find_parentage.R          16.45%     66 Missing ⚠️
R/validate_pedigree.R       32.63%     64 Missing ⚠️
R/check_ped.R               89.11%     16 Missing ⚠️
Additional details and impacted files
@@               Coverage Diff               @@
##           development      #59      +/-   ##
===============================================
- Coverage        81.53%   78.96%   -2.58%     
===============================================
  Files               23       23              
  Lines             2616     2814     +198     
===============================================
+ Hits              2133     2222      +89     
- Misses             483      592     +109     

@josuechinchilla removed the request for review from alex-sandercock on April 23, 2026, 20:13

Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 12 out of 13 changed files in this pull request and generated 6 comments.

Comments suppressed due to low confidence (1)

man/validate_pedigree.Rd:90

  • The function always writes corrected_pedigree.txt to the working directory (even when write_results = FALSE), but this side-effect isn’t documented here. Please document this output file (including when it’s written and how it relates to write_results) so users aren’t surprised by unexpected files in their working directory.
\description{
Validates parent-offspring trios by calculating Mendelian error rates from
SNP genotype data. Identifies incorrect parentage assignments and suggests
best-matching replacements. If a list of founders is supplied, trios that
are declared founders (both parents coded as 0) are preserved unchanged
with no recommendations. Trios removed due to missing genotype data are
retained in the output with a NO_GENOTYPE_DATA status.
}
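
For readers unfamiliar with the metric named in this description, a Mendelian error rate for one trio can be sketched as below. This is a minimal illustration, not the package's implementation: it assumes diploid dosage coding (0, 1, 2) with NA for missing calls, and the function name `mendelian_error_rate` is hypothetical.

```r
# Sketch only: diploid dosage coding (0, 1, 2) is assumed; NA marks missing calls.
mendelian_error_rate <- function(progeny, male_parent, female_parent) {
  # Keep only markers genotyped in all three individuals
  ok <- !is.na(progeny) & !is.na(male_parent) & !is.na(female_parent)
  p <- progeny[ok]
  m <- male_parent[ok]
  f <- female_parent[ok]

  # Each parent transmits 0 or 1 copy: dosage 0 -> 0, dosage 1 -> 0 or 1, dosage 2 -> 1
  lo <- (m == 2) + (f == 2)   # fewest copies the progeny can inherit
  hi <- (m >= 1) + (f >= 1)   # most copies the progeny can inherit

  # A marker is a Mendelian error when the progeny dosage falls outside [lo, hi]
  errors <- p < lo | p > hi
  list(
    n_markers = length(p),
    n_errors  = sum(errors),
    error_pct = if (length(p) > 0) 100 * mean(errors) else NA_real_
  )
}

res <- mendelian_error_rate(
  progeny       = c(0, 1, 2, 2, NA),
  male_parent   = c(0, 1, 2, 0, 1),
  female_parent = c(0, 1, 2, 1, 1)
)
print(res)
```

In this toy trio the fourth marker is the only inconsistency (parents 0 and 1 cannot produce a progeny dosage of 2), and the fifth marker is skipped because the progeny call is missing.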

Comment thread R/find_parentage.R Outdated
Comment thread R/find_parentage.R
Comment thread R/find_parentage.R Outdated
Comment thread R/check_ped.R Outdated
Comment thread R/check_ped.R Outdated
Comment thread R/validate_pedigree.R Outdated

Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 12 out of 13 changed files in this pull request and generated 3 comments.


Comment thread NAMESPACE
Comment thread R/validate_pedigree.R
Comment thread man/validate_pedigree.Rd Outdated

Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 12 out of 13 changed files in this pull request and generated 2 comments.

Comments suppressed due to low confidence (1)

NAMESPACE:33

  • NAMESPACE no longer imports janitor, and a repo-wide search shows no remaining janitor:: usage. However, DESCRIPTION still lists janitor under Imports, which will typically trigger an R CMD check NOTE about unused imports. Either remove janitor from DESCRIPTION or reintroduce an actual use/import as appropriate.
import(dplyr)
import(parallel)
import(quadprog)
import(stringr)
import(tibble)
import(tidyr)
import(vcfR)
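
The repo-wide search mentioned above could be reproduced with something like the following base R sketch. The helper `find_pkg_usage` is hypothetical and only looks for literal `pkg::` references in `.R` files, so it would not detect bare calls that rely on an `importFrom` entry.

```r
# Hypothetical helper: list R source lines that contain a literal `pkg::` reference.
find_pkg_usage <- function(pkg, dir = "R") {
  files  <- list.files(dir, pattern = "\\.[Rr]$", full.names = TRUE)
  needle <- paste0(pkg, "::")
  hits <- lapply(files, function(f) {
    lines <- readLines(f, warn = FALSE)
    idx   <- which(grepl(needle, lines, fixed = TRUE))
    if (length(idx) == 0) return(NULL)
    data.frame(file = basename(f), line = idx, text = trimws(lines[idx]),
               stringsAsFactors = FALSE)
  })
  hits <- do.call(rbind, hits)
  if (is.null(hits)) {
    # No matches: return an empty data frame with the same columns
    hits <- data.frame(file = character(), line = integer(), text = character(),
                       stringsAsFactors = FALSE)
  }
  hits
}

# Demo on a throwaway directory rather than a real package checkout
demo_dir <- file.path(tempdir(), "pkg_usage_demo")
dir.create(demo_dir, showWarnings = FALSE)
writeLines(c("f <- function(df) {", "  janitor::clean_names(df)", "}"),
           file.path(demo_dir, "a.R"))
writeLines("g <- function(x) x + 1", file.path(demo_dir, "b.R"))

usage <- find_pkg_usage("janitor", demo_dir)
print(usage)
```

An empty result for a package that is still listed under Imports in DESCRIPTION is the situation this comment describes.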

Comment thread R/check_ped.R
Comment thread R/check_ped.R Outdated

Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 12 out of 13 changed files in this pull request and generated 2 comments.


Comment thread R/check_ped.R
Comment thread R/find_parentage.R Outdated

Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 12 out of 19 changed files in this pull request and generated 7 comments.

Files not reviewed (6)
  • man/allele_freq_poly.Rd: Language not supported
  • man/check_ped.Rd: Language not supported
  • man/find_parentage.Rd: Language not supported
  • man/imputation_concordance.Rd: Language not supported
  • man/solve_composition_poly.Rd: Language not supported
  • man/validate_pedigree.Rd: Language not supported
Comments suppressed due to low confidence (1)

DESCRIPTION:66

  • janitor remains listed under Imports, but there are no longer any janitor:: (or clean_names) usages in the R sources. This makes janitor an unnecessary hard dependency; consider removing it from DESCRIPTION Imports if it's truly unused.
Imports: 
    parallel,
    dplyr,
    Rdpack (>= 0.7),
    readr (>= 2.1.5),
    reshape2 (>= 1.4.4),
    rlang,
    tidyr (>= 1.3.1),
    vcfR (>= 1.15.0),
    Rsamtools,
    Biostrings,
    pwalign,
    janitor,
    quadprog,
    tibble,
    stringr,
    data.table
Suggests: 

Comment thread R/check_ped.R Outdated
Comment thread R/check_ped.R
Comment thread R/check_ped.R
Comment thread tests/testthat/test-check_ped.R Outdated
Comment thread man/check_ped.Rd
Comment thread NAMESPACE
Comment thread R/imputation_concordance.R

Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 13 out of 20 changed files in this pull request and generated 4 comments.

Files not reviewed (6)
  • man/allele_freq_poly.Rd: Language not supported
  • man/check_ped.Rd: Language not supported
  • man/find_parentage.Rd: Language not supported
  • man/imputation_concordance.Rd: Language not supported
  • man/solve_composition_poly.Rd: Language not supported
  • man/validate_pedigree.Rd: Language not supported

Comment thread NAMESPACE
Comment thread R/check_ped.R
Comment thread R/find_parentage.R
Comment thread R/find_parentage.R
…ring genotype matrices for comparisons. Updated test file to accommodate new arguments

Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 13 out of 20 changed files in this pull request and generated 5 comments.

Files not reviewed (6)
  • man/allele_freq_poly.Rd: Language not supported
  • man/check_ped.Rd: Language not supported
  • man/find_parentage.Rd: Language not supported
  • man/imputation_concordance.Rd: Language not supported
  • man/solve_composition_poly.Rd: Language not supported
  • man/validate_pedigree.Rd: Language not supported

Comment thread R/validate_pedigree.R
Comment on lines +321 to 333
} else if (decision %in% c("remove_both",
                           "low_markers_remove_both",
                           "low_markers_remove_male_parent",
                           "low_markers_remove_female_parent")) {
  if (grepl("male", decision))
    data.table::set(corrected_pedigree, row_idx, "male_parent", "0")
  if (grepl("female", decision))
    data.table::set(corrected_pedigree, row_idx, "female_parent", "0")
  if (decision %in% c("low_markers_remove_both", "remove_both")) {
    data.table::set(corrected_pedigree, row_idx, "male_parent", "0")
    data.table::set(corrected_pedigree, row_idx, "female_parent", "0")
  }
}
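
One thing that may be worth checking in the snippet above: `"female"` contains the substring `"male"`, so `grepl("male", decision)` is also TRUE for the female-only decision and would appear to clear `male_parent` as well. The text of the review comment is not visible here, so whether this is what was flagged is an assumption. The sketch below only demonstrates the string behaviour, with suffix-anchored patterns as one possible alternative.

```r
decisions <- c("remove_both",
               "low_markers_remove_both",
               "low_markers_remove_male_parent",
               "low_markers_remove_female_parent")

# Substring test: also fires for the female-only decision, because "female" contains "male"
substring_hits_male <- grepl("male", decisions)

# Illustrative alternative: anchor on the full suffix, or on "_both"
clears_male   <- grepl("(_male_parent|_both)$", decisions)
clears_female <- grepl("(_female_parent|_both)$", decisions)

print(data.frame(decisions, substring_hits_male, clears_male, clears_female))
```

With the anchored patterns, each decision maps to exactly the parent columns its name implies, which would also make the separate `remove_both` branch unnecessary.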
Comment thread man/find_parentage.Rd
Comment on lines 14 to 52
show_ties = TRUE,
allow_selfing = TRUE,
verbose = TRUE,
write_txt = TRUE
plot_results = TRUE
)
}
\arguments{
\item{genotypes_file}{Path to a TSV/CSV/TXT file containing genotype data.
Must include an 'ID' column followed by marker columns coded as 0, 1, 2
(allele dosage).}
\item{genotypes_file}{Path to a TSV/CSV/TXT file, OR a data.frame /
data.table with an 'id' column followed by marker columns coded as 0, 1, 2.}

\item{parents_file}{Path to a TSV/CSV/TXT file listing candidate parent IDs.
Must include an 'ID' column. An optional 'Sex' column with values
'M' (male parent), 'F' (female parent), or 'A' (ambiguous) determines
which parents are tested for each role. If absent, all parents are treated
as ambiguous.}
\item{parents_file}{Path to a TSV/CSV/TXT file, OR a data.frame /
data.table with an 'id' column and an optional 'sex' column
('M', 'F', or 'A'). If absent, all parents are treated as ambiguous.}

\item{progeny_file}{Path to a TSV/CSV/TXT file listing progeny IDs to assign.
Must include an 'ID' column.}
\item{progeny_file}{Path to a TSV/CSV/TXT file, OR a data.frame /
data.table with an 'id' column.}

\item{method}{Character. Parentage assignment method. One of:
\itemize{
\item \code{"best_male_parent"} — finds the best male parent for each
progeny using homozygous mismatch rate.
\item \code{"best_female_parent"} — finds the best female parent for each
progeny using homozygous mismatch rate.
\item \code{"best_match"} — finds the single best parent (either sex)
using homozygous mismatch rate.
\item \code{"best_pair"} — finds the best male-female parent pair for
each progeny using full Mendelian error rate (default).
}}
\item{method}{Character. One of \code{"best_male_parent"},
\code{"best_female_parent"}, \code{"best_match"}, or
\code{"best_pair"} (default).}

\item{min_markers}{Integer. Minimum number of non-missing markers required
to report a parentage assignment. Progeny-parent comparisons with fewer
markers are flagged as \code{LOW_MARKERS} and no assignment is made
(default: \code{10}).}
\item{min_markers}{Integer. Minimum markers required; fewer flags
\code{low_markers} (default: \code{10}).}

\item{error_threshold}{Numeric. Maximum mismatch percentage to report a
parentage assignment as confident. Assignments above this threshold are
flagged as \code{HIGH_ERROR} in the \code{Assignment_Status} column
(default: \code{5.0}). Must be between 0 and 100.}
\item{error_threshold}{Numeric. Maximum mismatch percentage; exceeded values
flag \code{high_error} (default: \code{5.0}). Must be between 0 and 100.}

\item{show_ties}{Logical. If \code{TRUE}, all tied best pairs (after
tie-breaking by maximum markers tested) are reported as additional columns
(\code{Male_Parent_1}, \code{Male_Parent_2}, etc.) when
\code{method = "best_pair"}. The base columns (\code{Male_Parent},
\code{Female_Parent}, etc.) are always populated with the top result.
If \code{FALSE}, only one tied pair is reported with a warning.
Default is \code{TRUE}.}
\item{show_ties}{Logical. If \code{TRUE}, tied best pairs are appended as
suffix columns. Default is \code{TRUE}.}

\item{allow_selfing}{Logical. If \code{FALSE}, male-female parent pairs where
both IDs are identical are excluded when \code{method = "best_pair"}.
Default is \code{TRUE}.}
\item{allow_selfing}{Logical. If \code{FALSE}, pairs with identical male and
female parent IDs are excluded. Default is \code{TRUE}.}

\item{verbose}{Logical. If \code{TRUE}, prints progress messages, summary
statistics, and the results table to the console. Default is \code{TRUE}.}
\item{verbose}{Logical. If \code{TRUE}, prints progress and summary.
Default is \code{TRUE}.}

\item{write_txt}{Logical. If \code{TRUE}, writes results to
\code{parentage_results_dt.txt} in the working directory. Default is
\code{TRUE}.}
\item{plot_results}{Logical. If \code{TRUE}, plots the Mendelian error
distribution. Requires \code{ggplot2}. Default is \code{TRUE}.}
}
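
As a rough illustration of how the documented `min_markers` and `error_threshold` arguments could interact with a homozygous mismatch rate, consider the sketch below. It is not taken from `find_parentage()`: the function name `homozygous_mismatch`, the choice of denominator (all markers non-missing in both individuals), and the `"ok"` status label are assumptions.

```r
# Sketch only: denominator and the "ok" label are assumptions, not the package's definitions.
homozygous_mismatch <- function(progeny, parent,
                                min_markers = 10, error_threshold = 5.0) {
  # Compare only markers genotyped in both individuals
  ok <- !is.na(progeny) & !is.na(parent)
  n  <- sum(ok)

  # Opposite homozygotes: parent 0 vs progeny 2, or parent 2 vs progeny 0
  opposite <- abs(progeny[ok] - parent[ok]) == 2
  pct <- if (n > 0) 100 * sum(opposite) / n else NA_real_

  status <- if (n < min_markers) {
    "low_markers"
  } else if (pct > error_threshold) {
    "high_error"
  } else {
    "ok"
  }
  list(n_markers = n, mismatch_pct = pct, status = status)
}

parent  <- c(0, 2, 2, 1, 0, NA)
progeny <- c(2, 2, 0, 0, 0, 1)

res <- homozygous_mismatch(progeny, parent, min_markers = 5)
print(res)
```

With the default `min_markers = 10`, the same five-marker comparison would be flagged `low_markers` before the error threshold is consulted.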
Comment thread R/find_parentage.R
ggplot2::geom_vline(xintercept = error_threshold,
linetype = "dashed", color = "black", linewidth = 1) +
ggplot2::scale_x_continuous(breaks = seq(0, 100, by = 5)) +
ggplot2::scale_y_continuous(breaks = seq(0, 10000, by = 5)) +
Comment thread R/validate_pedigree.R
Comment on lines +365 to +375
plot_df$plot_status <- dplyr::case_when(
  plot_df$recommended_correction %in% c("none", "keep_both",
                                        "low_markers_keep_both") ~ "pass",
  plot_df$recommended_correction %in% c("remove_male_parent",
                                        "remove_female_parent",
                                        "low_markers_remove_male_parent",
                                        "low_markers_remove_female_parent") ~ "fail_one_parent",
  plot_df$recommended_correction %in% c("remove_both",
                                        "low_markers_remove_both") ~ "fail_both_parents",
  TRUE ~ "other"
)
library(testthat)
library(data.table)

set.seed(101919)
Labels

in_progress not ready to merge

3 participants