Skip to content

[COMPRESS-727] Provide a symlink-resistant and concurrency-safe archive Extractor#776

Open
tomasilluminati wants to merge 2 commits into
apache:masterfrom
tomasilluminati:COMPRESS-727-secure-extractor
Open

[COMPRESS-727] Provide a symlink-resistant and concurrency-safe archive Extractor#776
tomasilluminati wants to merge 2 commits into
apache:masterfrom
tomasilluminati:COMPRESS-727-secure-extractor

Conversation

@tomasilluminati

Copy link
Copy Markdown

Adds a security-first Extractor under archivers. extractor with a SecureDirectoryStream-based race-safe path on Linux and a best-effort base extractor elsewhere. Symlink and special-file policies, safe hard-link resolution, tests and a changes.xml entry are included.

…ve Extractor

Adds a security-first Extractor under archivers. extractor with a SecureDirectoryStream-based race-safe path on Linux and a best-effort base extractor elsewhere. Symlink and special-file policies, safe hard-link resolution, tests and a changes.xml entry are included.
@tomasilluminati

Copy link
Copy Markdown
Author

@ppkarwasz PR Opened.

Compare Path values instead of toString() so the platform separator does not break the test.
@tomasilluminati

Copy link
Copy Markdown
Author

@ppkarwasz Fixed the Windows Test Errors.

@garydgregory garydgregory left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello @tomasilluminati
This is interesting and I am hoping we can extract something more generic on this theme.

The background nugget is hidden in Apache Commons Text in the package-private class PathFence.

I extracted the class for addition in Apache Commons IO in PR 805 but it might not be good enough. The IO PR is something I had started to iterate with @ppkarwasz but other things popped up since and now seems like a good time to revist.

There are two concerns: resolving relative paths and dealing with various types of file system links. Handling the relative path issue is "simple", the links less so. At first glance it looks like you've covered many cases here.

I'm not sure if both concerns should be intermingled in one class or composed.

Can you see a PathFence class as something that can be improved on and reused here?

The PathFence could also be reused in Apache Commons Configuration for example.

Looking forward to your thoughts.

TY!

@tomasilluminati

Copy link
Copy Markdown
Author

@garydgregory
Thanks, this is a great direction, I'm keen to reuse PathFence instead of carrying my own containment code.

The syntactic part is a clean fit. toAbsolutePath().normalize() plus the within-root check is generic, and Compress, Configuration and Text can all share it as-is.

Links are the tricky bit, and I'd lean toward composing rather than folding them in. Part of why they resist living in a path predicate is that during extraction the entry doesn't exist yet, so I can't just toRealPath() the target. The real protection ends up being operational, resolve each component no-follow, create with CREATE_NEW so an existing link is never followed, and on Linux write relative to the directory handle (SecureDirectoryStream) to close the TOCTOU window. That's bound to the act of creating files, not to judging a path.

Where @ppkarwasz point lands well is the already-on-disk case, like a trusted config dir. There a resolving check (toRealPath then the same within-root test) makes sense, and that could be an optional mode in PathFence. The extractor would lean on the syntactic decision and keep its no-follow creation layer on top.

So my vote is composed. PathFence owns the containment decision, extraction-specific link safety sits around it, and PathFence stays light enough for Configuration to reuse.

Happy to dig into the IO PR with you and Piotr whenever it's a good time, and to bring over the link cases I already cover as tests. Thanks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants