Skip to content

fix: prefer entity key in TOML file when restoring#607

Merged
ormsbee merged 2 commits into
openedx:mainfrom
ormsbee:fix-empty-restore
Jun 25, 2026
Merged

fix: prefer entity key in TOML file when restoring#607
ormsbee merged 2 commits into
openedx:mainfrom
ormsbee:fix-empty-restore

Conversation

@ormsbee

@ormsbee ormsbee commented Jun 18, 2026

Copy link
Copy Markdown
Contributor

Prior to this commit, we would always assume the entity key would match the directory naming, but that is not a hard requirement. Furthermore, it looks like something in the backup code is adding hashes to the directories unnecessarily, so this was coming up quite often (e.g. where a Unit and HTMLBlock share the last part of their key).

@kdmccormick kdmccormick left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a couple small requests

Comment on lines +1068 to +1073
# The ordering of the file processing is important because we need to
# ensure that TOML files for a given component are processed before the
# static files for that component. '.' sorts before '/', so "foo.toml"
# will sort before "foo/component_versions/v1/static/figure1.webp" or
# any other subdirectory of the "foo" component. Processing the TOML
# first allows us to map the directory to a entity key.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[thinking out loud -- no action needed] I was wondering if this would break for entity refs that contain dots, but it seems like it should be OK since entities/foo.bar.toml will still sort before entities/foo.bar/component_versions/etc.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, as long as the export keeps the convention that the TOML file and the dir structure have matching names, then we can rely on the sort order. That matching name is the only thing actually correlating the two, so if those don't match, the archive is broken.

Comment thread src/openedx_content/applets/backup_restore/zipper.py Outdated
Comment thread src/openedx_content/applets/backup_restore/zipper.py Outdated
Comment thread src/openedx_content/applets/backup_restore/zipper.py Outdated
Comment thread src/openedx_content/applets/backup_restore/zipper.py Outdated
Comment thread src/openedx_content/applets/backup_restore/zipper.py
@ormsbee

ormsbee commented Jun 23, 2026

Copy link
Copy Markdown
Contributor Author

@kdmccormick: Should be good for another review pass. Before merging, I'll squash down to two commits, one for the compressionlevel thing, and one for everything else.

@kdmccormick kdmccormick left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two tiny nits, LGTM otherwise! No need for re-review.

I wasn't able to manually verify this myself (dev env issues).

Comment thread src/openedx_content/applets/backup_restore/zipper.py Outdated
Comment thread src/openedx_content/applets/backup_restore/zipper.py Outdated
ormsbee and others added 2 commits June 24, 2026 18:16
Prior to this commit, we would always assume the entity key would match
the directory naming, but that is not a hard requirement. Furthermore,
it looks like something in the backup code is adding hashes to the
directories unnecessarily, so this was coming up quite often.

Co-authored-by: Kyle McCormick <kyle@kylemccormick.me>
Modifying the Zip compressionlevel usually doesn't make much of a
difference, but when experimenting with a large library, I was seeing a
7.5% reduction in file size (614 MB vs 568 MB) by just toggling this
parameter. I suspect that this has something to do with how we're
incrementally building the Zip file during the backup process, since
simply creating a Zip file all at once from a directory does not show
this kind of difference.
@ormsbee ormsbee force-pushed the fix-empty-restore branch from 6ddb80b to d208892 Compare June 24, 2026 22:17
@ormsbee ormsbee merged commit 8e08a2d into openedx:main Jun 25, 2026
7 checks passed
@ormsbee ormsbee deleted the fix-empty-restore branch June 25, 2026 00:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Backup -> Restore -> Backup cycle for library loses content

2 participants