Add parentage functions #64
Merged
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…ement for documentation on check_ped
…and to plot results.
… or file paths, updated documentation and test files
…ring genotype matrices for comparisons. Updated test file to accommodate new arguments
Replaced separate with extract to parse SNP ID into CHROM and POS. Updated POS formatting to handle leading zeros.
Update updog2vcf
Pull request overview
This PR updates BIGr’s pedigree/parentage tooling by introducing list-based, in-memory-friendly outputs (and optional plotting) for parentage workflows, alongside broad documentation/test updates and minor package configuration changes.
Changes:
- Refactors `validate_pedigree()` and `find_parentage()` to return invisible named lists (with `full_results`, subsets, and optional ggplot output) and to accept in-memory inputs in addition to file paths.
- Overhauls `check_ped()` behavior and tests to align with `id`/`male_parent`/`female_parent` naming and a richer, structured report.
- Updates documentation/man pages, NAMESPACE imports/exports, and package metadata (version bump, roxygen config, build ignores).
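The return shape described in the first bullet can be sketched as follows. This is a minimal illustration only, not BIGr's implementation: `read_input()` and `summarise_matches()` are hypothetical helpers, the `error_pct` column and the `passed`/`failed` subset names are assumptions, and the optional ggplot element is left out so the example runs with base R alone.

```r
# Hypothetical helper (not part of BIGr): accept a data frame or a file path.
read_input <- function(x) {
  if (is.data.frame(x)) return(x)
  if (is.character(x) && length(x) == 1 && file.exists(x)) {
    return(utils::read.delim(x, stringsAsFactors = FALSE))
  }
  stop("Input must be a data frame or a path to an existing file")
}

# Hypothetical function showing the list-based, invisible return value.
summarise_matches <- function(results, error_threshold = 5) {
  results <- read_input(results)
  out <- list(
    full_results = results,
    passed = results[results$error_pct <= error_threshold, , drop = FALSE],
    failed = results[results$error_pct >  error_threshold, , drop = FALSE]
  )
  invisible(out)  # nothing is printed unless the caller assigns or prints it
}

demo <- data.frame(id = c("a", "b", "c"), error_pct = c(1, 7, 3))
res <- summarise_matches(demo)
names(res)
nrow(res$passed)
```

Returning the list invisibly keeps the console quiet for interactive calls while still letting callers capture every table in one object instead of reading files back from disk.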
Reviewed changes
Copilot reviewed 14 out of 21 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| R/validate_pedigree.R | Refactors pedigree validation to list-based output, adds plotting, removes file-writing, adds in-memory input support. |
| R/find_parentage.R | Refactors parentage assignment to list-based output, adds self-match controls and plotting, removes file-writing, adds in-memory input support. |
| R/check_ped.R | Reworks pedigree QC/correction reporting and options; supports in-memory inputs and standardizes column naming. |
| R/updog2vcf.R | Adjusts SNP-to-CHROM/POS parsing logic to be more robust to extra underscores and leading zeros. |
| R/imputation_concordance.R | Documentation/import hygiene improvements for concordance utility. |
| R/breedtools_functions.R | Documentation and readability refactors in breed composition helpers. |
| NAMESPACE | Exports validate_pedigree; expands targeted imports (dplyr/ggplot2/janitor/data.table). |
| DESCRIPTION | Version bump to 0.8.0 and roxygen2 config update. |
| inst/check_ped_test.txt | Updates fixture headers/content to male_parent/female_parent convention. |
| tests/testthat/test-validate_pedigree.R | Rewrites/expands tests for new return structure, plotting flag, and in-memory inputs. |
| tests/testthat/test-find_parentage.R | Rewrites/expands tests for new return structure, new flags, plotting flag, and in-memory inputs. |
| tests/testthat/test-check_ped.R | Replaces prior minimal test with comprehensive structured tests for new report outputs and behaviors. |
| tests/testthat/corrected_pedigree.txt | Removes previously written-output fixture (reflects “no write” behavior). |
| man/*.Rd | Regenerates man pages to match refactored APIs and docs. |
| .Rbuildignore | Ignores .positai and .claude directories. |
| .gitignore | Ignores .positai. |
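The `R/updog2vcf.R` row describes parsing SNP IDs into CHROM and POS in a way that tolerates extra underscores and leading zeros. A rough sketch of that idea is shown here using a base R regular expression as a stand-in for `tidyr::extract`, which the commit message mentions. `parse_snp_id()` is a hypothetical name, the `CHROM_POS` ID layout is an assumption, and converting POS to an integer is only one possible way to handle leading zeros.

```r
# Hypothetical helper: split "CHROM_POS" on the LAST underscore only,
# so chromosome names that themselves contain underscores stay intact.
parse_snp_id <- function(snp) {
  pattern <- "^(.*)_([0-9]+)$"
  ok <- grepl(pattern, snp)
  chrom <- ifelse(ok, sub(pattern, "\\1", snp), NA_character_)
  pos   <- ifelse(ok, sub(pattern, "\\2", snp), NA_character_)
  data.frame(
    SNP   = snp,
    CHROM = chrom,
    POS   = as.integer(pos),  # "000123" becomes 123 (assumed handling)
    stringsAsFactors = FALSE
  )
}

parsed <- parse_snp_id(c("chr1_000123", "scaffold_12_4500", "badid"))
parsed
```

IDs that do not end in an underscore followed by digits come back with `NA` in both columns rather than raising an error, which makes malformed markers easy to spot.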
Files not reviewed (6)
- man/allele_freq_poly.Rd: Language not supported
- man/check_ped.Rd: Language not supported
- man/find_parentage.Rd: Language not supported
- man/imputation_concordance.Rd: Language not supported
- man/solve_composition_poly.Rd: Language not supported
- man/validate_pedigree.Rd: Language not supported
Comment on lines +92 to 110

      valid_ids <- genos$id
      removed_parents <- base::setdiff(all_parents$id, valid_ids)
      if (base::length(removed_parents) > 0) {
        warning("The following parent IDs were not in the genotype file and will not be analyzed: ",
                paste(removed_parents, collapse = ", "), call. = FALSE)
    -   all_parents <- all_parents[ID %in% valid_ids]
    +   all_parents <- all_parents[id %in% valid_ids]
      }

    - removed_progeny <- base::setdiff(progeny_candidates$ID, valid_ids)
    + removed_progeny <- base::setdiff(progeny_candidates$id, valid_ids)
      if (base::length(removed_progeny) > 0) {
        warning("The following progeny IDs were not in the genotype file and will not be analyzed: ",
                paste(removed_progeny, collapse = ", "), call. = FALSE)
    -   progeny_candidates <- progeny_candidates[ID %in% valid_ids]
    +   progeny_candidates <- progeny_candidates[id %in% valid_ids]
      }

    - if (!"Sex" %in% base::colnames(all_parents)) {
    -   warning("No 'Sex' column in parents file. All parents treated as ambiguous ('A').")
    -   all_parents[, Sex := "A"]
    + if (!"sex" %in% base::colnames(all_parents)) {
    +   warning("No 'sex' column in parents file. All parents treated as ambiguous ('A').")
    +   all_parents[, sex := "A"]
      }
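The excerpt above repeats the same warn-and-filter step for parents and progeny. A compact base R sketch of that pattern is given here, assuming plain data frames rather than the data.table objects used in the PR. `keep_genotyped()` is a hypothetical helper and does not appear in the diff.

```r
# Hypothetical helper: drop rows whose id has no genotype record,
# warning once with the full list of dropped ids.
keep_genotyped <- function(tbl, valid_ids, label) {
  missing_ids <- setdiff(tbl$id, valid_ids)
  if (length(missing_ids) > 0) {
    warning("The following ", label,
            " IDs were not in the genotype file and will not be analyzed: ",
            paste(missing_ids, collapse = ", "), call. = FALSE)
    tbl <- tbl[tbl$id %in% valid_ids, , drop = FALSE]
  }
  tbl
}

parents   <- data.frame(id = c("P1", "P2", "P3"), stringsAsFactors = FALSE)
genos_ids <- c("P1", "P3", "X9")

# suppressWarnings() is used only to keep this demo quiet.
kept <- suppressWarnings(keep_genotyped(parents, genos_ids, "parent"))

# Mirror the fallback in the excerpt: no 'sex' column means ambiguous ("A").
if (!"sex" %in% colnames(kept)) kept$sex <- "A"
kept
```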
Comment on lines +325 to +327

    final_df <- merge(results_dt, tie_rows, by = "id", all.x = TRUE)
    for (col in base::names(final_df))
      data.table::set(final_df, which(final_df[[col]] == ""), col, NA_character_)
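The loop above assigns `NA_character_` in every column where a value equals `""`. One cautious variant, sketched here in base R on a plain data frame, restricts the replacement to character columns so numeric columns are never touched. This is an illustration under that assumption, not necessarily the change the reviewer had in mind, and `blank_to_na()` is a hypothetical name.

```r
# Hypothetical helper: turn empty strings into NA, character columns only.
blank_to_na <- function(df) {
  is_chr <- vapply(df, is.character, logical(1))
  for (col in names(df)[is_chr]) {
    blank <- !is.na(df[[col]]) & df[[col]] == ""
    df[[col]][blank] <- NA_character_
  }
  df
}

demo <- data.frame(
  id    = c("a", "b"),
  tie   = c("", "x"),
  score = c(0.1, 0.2),
  stringsAsFactors = FALSE
)
cleaned <- blank_to_na(demo)
cleaned
```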
Comment on lines +412 to +416

    ggplot2::geom_vline(xintercept = error_threshold,
                        linetype = "dashed", color = "black", linewidth = 1) +
    ggplot2::scale_x_continuous(breaks = seq(0, 100, by = 5)) +
    ggplot2::scale_y_continuous(breaks = seq(0, 10000, by = 5)) +
    ggplot2::scale_fill_manual(
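The fixed y-axis sequence in this excerpt defines 2001 candidate breaks regardless of how many samples are plotted. As a hedged alternative, breaks could be derived from the data range; the sketch below uses base R's `pretty()` and keeps only whole numbers, since the axis shows counts. `count_breaks()` is a hypothetical helper, and whether this matches the reviewer's actual comment is not known from the page.

```r
# Number of breaks requested by the fixed sequence in the excerpt.
length(seq(0, 10000, by = 5))  # 2001

# Hypothetical helper: a small set of whole-number breaks for a count axis.
count_breaks <- function(max_count, n = 8) {
  brks <- pretty(c(0, max_count), n = n)
  brks[brks == round(brks)]  # drop fractional breaks for small counts
}

count_breaks(37)
count_breaks(3)
```

A function like this could be passed the maximum bin count before building the plot, so the axis stays readable for both small and large pedigrees.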
Comment on lines 85 to 93

      #### Check required columns ####
    - required_ped_cols <- c("ID", "Male_Parent", "Female_Parent")
    + required_ped_cols <- c("id", "male_parent", "female_parent")
      missing_cols <- base::setdiff(required_ped_cols, base::names(pedigree))
      if (base::length(missing_cols) > 0)
        stop("Pedigree file missing required columns: ",
             base::paste(missing_cols, collapse = ", "))
    - if (!"ID" %in% base::names(genos))
    -   stop("Genotypes file must have an 'ID' column")
    + if (!"id" %in% base::names(genos))
    +   stop("Genotypes file must have an 'id' column")
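Because the required names change from `ID`/`Male_Parent`/`Female_Parent` to lowercase, files with the older headers would fail this check unless names are normalized first. The sketch below shows one way to do that with `tolower()` before checking. `check_required_cols()` is a hypothetical helper, and whether the PR normalizes names at this point (for example through the newly imported janitor functions) is not stated on the page.

```r
# Hypothetical helper: lowercase the column names, then verify required ones.
check_required_cols <- function(df, required, what) {
  names(df) <- tolower(names(df))
  missing_cols <- setdiff(required, names(df))
  if (length(missing_cols) > 0) {
    stop(what, " missing required columns: ",
         paste(missing_cols, collapse = ", "), call. = FALSE)
  }
  df
}

legacy <- data.frame(ID = "a", Male_Parent = "b", Female_Parent = "c",
                     stringsAsFactors = FALSE)
ped <- check_required_cols(legacy,
                           c("id", "male_parent", "female_parent"),
                           "Pedigree file")
names(ped)
```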
This pull request introduces several improvements and updates to the BIGr package, including enhanced documentation, new functionality, and updates to dependencies and imports. The most significant changes are the addition of a new pedigree validation export, expanded documentation for core functions, and updates to how dependencies are managed.
Documentation and Functionality Enhancements:
- Expanded documentation for the `allele_freq_poly` and `solve_composition_poly` functions in `R/breedtools_functions.R`, including clearer parameter descriptions, return values, and example usage. This makes the functions easier to understand and use for both developers and end-users.
- Refactored `allele_freq_poly`, `QPsolve`, and `solve_composition_poly` for better readability and maintainability, including more explicit use of `base::` for base R functions and improved handling of input matrices.

New Exports and Imports:
- Added `validate_pedigree` to the list of exported functions in `NAMESPACE`, making it available for package users.
- Updated `NAMESPACE` imports to include additional functions from `dplyr`, `ggplot2`, and `janitor`, supporting new or updated functionality and improving data manipulation and visualization capabilities.

Dependency and Configuration Updates:
- Bumped the package version from `0.7.0` to `0.8.0` in `DESCRIPTION` to reflect the new features and changes.
- Added `Config/roxygen2/version: 8.0.0` to `DESCRIPTION` to specify the required Roxygen2 version for documentation generation.