Adds support for more types of FractalHeap (used for storing attributes)#244

Open
bnlawrence wants to merge 5 commits into
mainfrom
issue230-huge-heap-from-main

@bnlawrence bnlawrence commented Jun 24, 2026

Collaborator

Description

This pull request includes implementation and tests for two types of FractalHeap which we had not previously encountered (as described in #243).

In practice, we have added code to the Fractal_Heap class to replace some "NotImplementedError" branches, along with support for the particular kind of b-tree used in the fractal heaps. One large test file, which exercises one of the two new branches, is included; we couldn't shrink it because every attempt to rewrite it produced a different kind of FractalHeap. We have not been able to generate a real file that uses the other branch, so some synthetic testing has been included.

Closes #243

Checklist

  • This pull request has a descriptive title and labels
  • This pull request has a minimal description (most was discussed in the issue, but a two-liner description is still desirable)
  • Unit tests have been added (if codecov test fails)
  • Any changed dependencies have been added or removed correctly (if need be)
  • If you are working on the documentation, please ensure the current build passes
  • All tests pass

codecov Bot commented Jun 24, 2026

Codecov Report

❌ Patch coverage is 65.78947% with 26 lines in your changes missing coverage. Please review.
✅ Project coverage is 78.48%. Comparing base (b9e3df5) to head (ba4a18a).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| pyfive/misc_low_level.py | 58.18% | 16 Missing and 7 partials ⚠️ |
| pyfive/btree.py | 85.71% | 2 Missing and 1 partial ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #244      +/-   ##
==========================================
- Coverage   78.65%   78.48%   -0.18%     
==========================================
  Files          15       15              
  Lines        3345     3416      +71     
  Branches      534      546      +12     
==========================================
+ Hits         2631     2681      +50     
- Misses        579      593      +14     
- Partials      135      142       +7     

Copilot AI left a comment

Pull request overview

This PR extends pyfive’s Fractal Heap support to handle additional huge/tiny object encodings encountered in real-world netCDF/HDF5 files (notably CMIP6), including huge-object lookup through a v2 B-tree, and adds tests/fixtures to exercise these branches.

Changes:

  • Implement decoding for tiny heap IDs and huge heap IDs (direct + indirect via v2 B-tree) in FractalHeap.get_data().
  • Add BTreeV2HugeObjectsIndirect to parse/iterate huge-object records stored in HDF5 v2 B-trees.
  • Add tests including a real CMIP6 fixture-based test and synthetic tiny-object decoding tests; update pre-commit config to allow the large fixture file.
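As a rough illustration of what decoding a tiny heap ID involves, here is a minimal sketch based on one reading of the HDF5 file format specification, not on the code in this PR. It assumes the "normal" tiny sub-type, where the first byte carries the version in bits 6-7, the ID type in bits 4-5 (0 managed, 1 huge, 2 tiny), and the object length minus one in bits 0-3, with the object bytes stored inline after it. The function names `classify_heap_id` and `decode_tiny_normal` are hypothetical and do not exist in pyfive.

```python
# Heap ID types as encoded in bits 4-5 of the first ID byte
# (assumed from the HDF5 file format specification).
MANAGED, HUGE, TINY = 0, 1, 2


def classify_heap_id(heap_id: bytes) -> tuple[int, int]:
    """Return (version, id_type) from the first byte of a heap ID."""
    first = heap_id[0]
    version = (first >> 6) & 0b11   # bits 6-7
    id_type = (first >> 4) & 0b11   # bits 4-5
    return version, id_type


def decode_tiny_normal(heap_id: bytes) -> bytes:
    """Extract the inline payload of a 'normal' tiny heap ID.

    Hypothetical helper: the extended tiny sub-type (12-bit length)
    is deliberately not handled here.
    """
    _, id_type = classify_heap_id(heap_id)
    if id_type != TINY:
        raise ValueError("not a tiny heap ID")
    length = (heap_id[0] & 0x0F) + 1  # stored as length minus one
    return heap_id[1:1 + length]


# Synthetic ID: type 2 (tiny), stored length 2 (so 3 bytes), padded to 8 bytes.
example_id = bytes([0x22]) + b"abc" + b"\x00" * 4
payload = decode_tiny_normal(example_id)
```

Because the payload lives inside the ID itself, a synthetic test of this branch needs no file at all, which fits the note in the description that one branch could only be covered synthetically.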

Reviewed changes

Copilot reviewed 4 out of 5 changed files in this pull request and generated 2 comments.

| File | Description |
|---|---|
| pyfive/misc_low_level.py | Adds huge/tiny Fractal Heap ID decoding and indirect huge-object lookup via a v2 B-tree. |
| pyfive/btree.py | Introduces a v2 B-tree reader for indirectly accessed huge objects. |
| tests/test_fractal_heap.py | Adds coverage for tiny IDs and a real CMIP6 huge-object case; updates huge-object expectations. |
| .pre-commit-config.yaml | Excludes the large CMIP fixture from the “added large files” pre-commit hook. |
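For the directly accessed huge-object case, the heap ID is understood (again from the HDF5 file format specification rather than from this PR's code) to hold the object's file address and length immediately after the type byte, for unfiltered objects. The sketch below assumes 8-byte little-endian offsets and lengths, which in a real file come from the superblock; `decode_huge_direct` is a hypothetical name. Deciding whether an ID is direct or indirect depends on heap header fields that are not modelled here.

```python
HUGE = 1  # ID type value in bits 4-5 of the first byte (assumed from the spec)


def decode_huge_direct(heap_id: bytes,
                       offset_size: int = 8,
                       length_size: int = 8) -> tuple[int, int]:
    """Return (address, length) from a directly accessed, unfiltered huge ID.

    Hypothetical helper: offset_size and length_size are assumptions
    standing in for the "size of offsets" and "size of lengths" values
    recorded in the file's superblock.
    """
    id_type = (heap_id[0] >> 4) & 0b11
    if id_type != HUGE:
        raise ValueError("not a huge heap ID")
    addr_end = 1 + offset_size
    address = int.from_bytes(heap_id[1:addr_end], "little")
    length = int.from_bytes(heap_id[addr_end:addr_end + length_size], "little")
    return address, length


# Synthetic ID: type 1 (huge), object at address 4096 with length 70000.
example_id = (bytes([0x10])
              + (4096).to_bytes(8, "little")
              + (70000).to_bytes(8, "little"))
address, length = decode_huge_direct(example_id)
```

In the indirect case the bytes after the type byte are instead a key into the v2 B-tree, and the address and length are read from the matching B-tree record.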

Comment thread pyfive/btree.py Outdated
Comment thread tests/test_fractal_heap.py Outdated

@davidhassell davidhassell left a comment

Collaborator

Looks good, as far as I can tell (I've eyeballed the code and tests, and run the test, and played on the command line)

@valeriupredoi

Collaborator

A bit more testing would be nice, though. The RTD failure is unrelated to this PR; see #245.

@bnlawrence

Collaborator Author

Can't test half of this because we can't create synthetic data with these properties. At some point mocking it becomes pointless ...

@valeriupredoi

Collaborator

> Can't test half of this because we can't create synthetic data with these properties. At some point mocking it becomes pointless ...

no probs, cov hit is not big 😃

@valeriupredoi

Collaborator

docs build issue sorted, this is ready for merge when @bnlawrence is happy to 🍺

Development

Successfully merging this pull request may close these issues.

CMIP6 files in the wild use fractal heaps with huge instances and internal b-tree indexes

4 participants