Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Jetty 12 : Resource API Review #8474

Closed
joakime opened this issue Aug 17, 2022 · 6 comments
Closed

Jetty 12 : Resource API Review #8474

joakime opened this issue Aug 17, 2022 · 6 comments
Assignees

Comments

@joakime
Copy link
Contributor

joakime commented Aug 17, 2022

org.eclipse.jetty.util.resource.Resource Proposal

High level

The Resource abstract class should be made into an Interface, with minimal default implementations (if any).
Existing static methods behaviors stay.

The ResourceCollection should be renamed.
It's ultimately how we composite a set of Resources together.

Its current implementation is a List, so calling it ResourceList is tempting.
However, the current implementation flattens and uniques the incoming Resources, so it cannot have a nested composite inside of it.
This was done in early iterations when ResourceCollection was managing the resources (mounts).
It was felt that having multiple composites, possibly from different management scopes (now called ResourceFactory) could be a recipe for shooting your whole leg off.

This concern is not valid anymore IMO.
We could allow a composite to have other composites.
So lets call it CompositeResource.

My argument here is that List implies the same entry type, and is flat, but Composite makes it clear that it structured.

The Resource interface should also expose a Iterable<Resource> (more on that below).

Use Cases

Use Case: Create Resource from multiple Paths and pass it around

Simple creation, but used as a composite

Resource resourceBase = resourceFactory.newResource(listOfUris);

contextHandler.setResourceBase(resourceBase);

Allowed URIs can reference locations that do not exist.
A composite with zero entries should throw an exception.

Use Case: Create Resource from single path and pass it around

Simple creation, used as a simple resource

Resource resourceBase = resourceFactory.newResource(uri);

contextHandler.setResourceBase(resourceBase);

Allowed URI or Path can reference locations that do not exist.

Use Case: combine multiple resources into a single Resource

Combine Resources together to form a composite Resource.

ResourceCollection rc1 = Resource.combine(
    List.of(
        resourceFactory.newResource(one), // simple directory
        resourceFactory.newResource(two), // simple directory
        resourceFactory.newResource(three) // simple directory
    )
);

ResourceCollection rc2 = Resource.combine(
    List.of(
        // the original ResourceCollection
        rc1,
        // a duplicate entry
        resourceFactory.newResource(two), // simple directory
        // a new entry
        resourceFactory.newResource(dirFoo) // simple directory
    )
);

rc2.getResources(); // has 4 entries [one, two, three, dirFoo]

Constructing a composite resource from other resources, even other composite resources.
Duplicate entries are eliminated.
The order of the entries in the combine(Resource .... resources) is relevant to the resulting composite.

Use Case: Resolve sub-resource

We have a resource, and we want a sub resource.
This API is .resolve(String), and is more flexible in inputs than Path.resolve(String) or URI.resolve(String).
But should it ever return a Resource if that resource doesn't exist? Potential API change, return null if it doesn't exist.

Resource resourceBase = resourceFactory.newResource(listOfUris);

Resource resource = resourceBase.resolve("welcome.html");
// a resource, might not exist, possibly a composite
// We should not return the resource if it doesn't exist. Should it be null?

// In the normal runtime in a webapp usages.
Resource resource = resourceBase.resolve("deep/one/index.html");

The use case of using Resource.resolve(a) against something that was returned from a .resolve(b) earlier is actually quite rare in our code, the most common use case is to take the initial configured Resource and only resolve deeply.

A .resolve(String) can result in a composite of entries, so we need to handle that possibility in code.

Resource resourceBase = ... ;

Resource webInf = resourceBase.resolve("WEB-INF");
if (webInf == null) return;
Resource jettyEnvs = webInf.resolve("jetty-env.xml");
if (jettyEnvs == null) return;
for (Resource xml: jettyEnvs)
{
   // process XML via XmlConfiguration
}

Use Case: Alias Checking

If the input Resource.resolve(String) can be a reference to a file in a non-real way.
There is a need to know if this reference results in a non-standard reference to said Resource.
Differences in the URI provided vs URI of path, or discovered Path vs Real Path, indicate if this reference is an alias to another Path location. (eg: symlink, windows alt references, case-insensitive filesystems, etc)

// Path : /foo/  (is the resourceBase below)
//           bar/
//             test.txt

Resource resourceBase = resourceFactory.newResource(fooDir);

// windows example (case insensitve)
Resource resource = resourceBase.resolve("BAR/test.txt");
// This is an alias, and the real path is "/bar/test.txt"

// The Alias handling is only needed by DefaultServlet or ResourceService ...
// So limit the Alias calculation to when it's actually needed
Path aliasPath = resource.getAlias(); // calculates the Alias at this point in time (stored in Resource if asked once)
if (aliasPath != null) // does record exist?
  return AliasCheckers.check(pathInContext, aliasPath);

Use Case: We want a shallow directory listing

Typical fetch of shallow list from resource.
Input Resource might be a composite.
No expected ordering.

Resource resourceBase = ... ;

// Simple case
List<Resource> listing = resourceBase.list();
for (Resource entry: listing)
{
    // process listing entries
}

We should be able to take the List, so we can filter, sort, map, etc it further.

List<Resource> hits = resourceBase.list();
hits.sort(byLastModified);
List<Resource> listing = resourceBase.list();
List<Path> hits = listing.stream()
  .filter(FileID::notHidden)
  .map(Resource::getPath)
  .map(Path::getFileName)
  .unique()
  .toList();
/* Example of finding WEB-INF/lib/*.jar files.
 * Take a collection of URIs:
 *   file:///path/to/dir/
 *   jar:file:///path/webapp.war!/
 *   jar:file:///path/lib.jar!/
 */

Resource baseResource = resourceFactory.newResource(listOfUris);
Resource libs = baseResource.resolve("WEB-INF/lib");
if (libs == null) return;
List<Resource> jars = libs.list().stream().filter(FileID::isArchive).toList();
for (Resource jar: jars)
{
  // process Jar
}

Use Case: Deep walk / Find resources

Sometimes we need to walk deeply all of the children of the resources, even if it's a composite.

/* We want to find TLDS
 * 
 *   file:///path/to/dir/
 *   jar:file:///path/webapp.war!/
 *   jar:file:///path/lib.jar!/
 *
 * look for META-INF/.../*.tld
 * eg:  META-INF/foo.tld
 *      META-INF/deep/path/bar.tld
 */
Resource metaInfs = baseResource.resolve("META-INF"); // possibly a composite
if (metaInfs == null) return;
List<Resource> tlds = metaInfs.walk()
    .stream()
    .filter(FileID::notHidden)
    .filter(FileID::isTld)
    .sorted(ResourceCollator::byName)
    .toList()
for (Resource tld: tlds)
{
    // process tld
}

API Proposals

Resource implements Iterable ✔️

NEW API.

The core Resource implementation should also be able to exposed as an Iterable<Resource>

  • Each entry in the Iterable should be a non-composite. :X:
  • When PathResource, return a single instance iterator of self/this. :X:
  • When ResourceCollection, return an iterator of all the resources tracked by the composite. :X:
    Nested composites makes this complicated, but not impossible.

Resource Resource.resolve(String)

Keep this.

  • When PathResource, resolve to single existing Resource :X:
  • Behavior Change: When ResourceCollection, resolve to all shallow hits across resources :X:
  • Behavior Change: When CompositeResource has nested CompositeResource, call nested.resolve(String), this will result
    in a CompositeResource return value that has no nested CompositeResource.
  • Behavior Change: Can never return a result that would be a non-existent Resource. :PENDING: (will have impact on exist() API, which would be removed)
  • Attempt to access sub-resource outside of the Resource results in an IllegalArgumentException :X:

Resource Resource.resolveNonExistent(String)

Path Resource.getPath() ✔️

Keep this.

  • When PathResource always valid. :X:
  • When ResourceCollection always null. :X:

boolean Resource.isContainedIn(Resource r)

Keep this.

  • When PathResource simple Path.startsWith(Path) should be sufficient :X:
  • When MountedPathResource better testing.
  • When ResourceCollection should delegate to children.isContainedIn(), first to return true.

boolean Resource.isSame(Resource r) ✔️

Eliminate, not used anywhere outside of testing

boolean Resource.exists() :X:

Neutral opinion.
Consider elimination, as this would have no meaning if the
change above in resolve(String) not returning a resource that doesn't exist.
Although, if the newResource(...) could create a Resource that doesn't exist (yet).

  • When PathResource, a simple Files.exists :X:
  • When ResourceCollection, any child exists. :X:

boolean Resource.isDirectory() :X:

Keep this.

  • When PathResource simple should just delegate to Files.isDirectory, false if IOException :X:
  • When ResourceCollection this can be files and/or directories, this should return false. :X:

Instant Resource.lastModified() :X:

SIGNATURE CHANGE: Different return type.
Was long (as millis since epoch).
Now an Instant, as this is more generally useful. :X:

  • When PathResource simple should just delegate to Files.getLastModified, EPOCH if IOException :X:
  • When ResourceCollection this can be files and/or directories, and even nested composites, :X:
    this should return either EPOCH or walk the composite and find the newest lastmodified of the
    composite (not a fan of this approach).

long Resource.length() :X:

Keep this.

  • When PathResource, delegate to Files.size, -1L if IOException :X:
  • When ResourceCollection, return -1L always :X:

URI Resource.getURI() :X:

  • When PathResource, just return Path.toUri() :X:
  • When ResourceCollection, return first child with URI. :X:

String Resource.getName() :X:

Keep it.

  • When PathResource, return Path.toAbsolutePath().toString() :X:
  • When CompositeResource, return first child with name. :X:

InputStream Resource.newInputStream()

Evaluate use.
There has been a desire to remove this and have this done in HttpContent instead.

  • When PathResource, delegate to Files.newInputStream
  • When CompositeResource, throw UnsupportedOperationException always.

ReadableByteChannel Resource.newReadableByteChannel()

Evaluate use.
There has been a desire to remove this and have this done in HttpContent instead.

  • When PathResource, delegate to Files.newByteChannel
  • When CompositeResource, throw UnsupportedOperationException always.

SeekableByteChannel Resource.newByteChannel()

NEW API, only if we keep the other newInputStream/newReadableByteChannel APIs above
Now SeekableByteChannel.

  • When PathResource, delegate to Files.newByteChannel
  • When CompositeResource, throw UnsupportedOperationException always.

boolean Resource.isMemoryMappable()

Eliminate it.
This appears to not be used.

List<String> Resource.list()

Eliminate this signature.

List<Resource> Resource.list()

NEW SIGNATURE

Returns a shallow list of resources that the Resource contains.

  • When PathResource ...
    • if a File, return just the File
    • if a Dir, return the shallow list of entries in the Directory
  • When CompositeResource, iterate all resource entries and ...
    • if Resource is a File, add File as-is.
    • if Resource is a Dir, add all shallow entries in directory.

This API will not remove duplicate (by filename) entries like in Jetty 9/10/11.

That is a task for the user of the API, not built into the Resource layer.

For example, classpath building requires the duplicates to be present, HTML directory listing maybe doesn't.

String Resource.getFileName()

NEW API

Returns the filename portion of the path, or null if the Resource is not based on a Path (eg: MemoryResource)

boolean Resource.isAlias()

Eliminate, in favor of new API.

Path Resource.getAlias()

NEW API: Performs same algorithm as what is in the current PathResource.checkAliasPath() right now.

A return of null means there is no alias to worry about.

  • When PathResource, perform checkAliasPath() algorithm, store value in Resource, and return real-path.
  • When CompositeResource, return null always.

URI Resource.getAlias()

Eliminate, not used in code anywhere.

void Resource.copyTo(Path)

Unfortunately, as the user of the Resource you need to be sure that your Resource is even capable of being copied, or even needs to be copied.
The behaviors in Jetty 9 thru 11 in this regard is also quietly broken in many places (as copyTo() doesn't do anything in a few cases I looked into in Jetty 11)

In Jetty 12 if you are handed a resource (which is always a Path btw)

For Jar/Zip/War files, you have 2 ways to represent them in a Resource.

As the Jar/Zip/War itself, which is just a File, and is no more special then any other File.
As the root directory of the Jar/Zip/War archive (represented via the zipfs layer), this is a Directory, and is no different than any other Directory.
The logic that we need to settle on as a result of this issue ...

If the Resource _____ then copyTo(Path) means what?

Is a File

  • The destination is a directory then you want the file copied to the destination directory (with the same name?)
  • The destination is a file, then you want the file contents copied to the destination file.
  • The destination doesn't exist, then what? do we create a directory? or a file? or fail the copyTo?

Is a Directory

  • The destination is a directory, then you want the source directory copied to the destination directory.
  • What to do with links? (not just unix symlinks, but any links from any filesystem type, even ntfs)
  • The destination is a file, then what? Do you want to create a JAR/ZIP? Concat all of the contents of the source into the file?
  • The destination doesn't exist, then what? do we create a file? or a directory? or fail the copyTo?

does not exist

  • This case should throw an exception, as this points to a bug in the user of a Resource, it should never have reached the point of a blind copyTo(Path)

Is a ResourceCollection

  • Iterate through each entry in the ResourceCollection applying the directory logic above (as only Directories can be in a ResourceCollection)?
  • Fail? as it's a meaningless operation? I'm in favor of this approach.
  • Leave the user of the ResourceCollection to decide on a per sub-resource basis? (eg: copy only the META-INF/**/*.tld files to destination?)

String Resource.getWeakETag()

Eliminate.

This should be removed and centralized in the cache handling for ResourceService

String Resource.getWeakETag(String suffix)

This should be removed and centralized in the cache handling for ResourceService

byte[] Resource.longToBytes(long)

INTERNAL API, private, used only by getWeakETag() impls

This can be removed once the two getWeakETag() methods are centralized in the cache handling for ResourceService

Collection Resource.getAllResources()

Eliminate, not used outside of test cases.

List<Resource> walk(Predicate ... predicates);

NEW API.
Deep walk of the resource.

  • No predicates means all things are selected.
  • When PathResource
    • if a File, return just the File if it matches the predicates.
    • if a Dir, use Files.walk() with predicates and return
  • When CompositeResource, iterate all resource entries and ...
    • if Resource is a File, add File if it matches the predicates.
    • if a Dir, use Files.walk() with predicates and return

Resource findFirst(String);

NEW API.
Only needed if we support Composite.
Different than .resolve(String) in that it guarantees a single Resource, never a composite.

  • When PathResource, return resolved Resource, if it exists.
  • When CompositeResource, return first hit, that exists.
@janbartel
Copy link
Contributor

You've left out the use-case from the jetty maven plugin: the WebAppContext can be configured directly in the pom.xml. Therefore a user may have configured a resource base that is either a single Resource, or a ResourceCollection. The plugin then needs a way to overlay on top of that all of the various unassembled directories and/or wars that are declared dependencies. So the base resource is a Resource; ResourceCollection must be a Resource; ResourceCollection may be composed of other ResourceCollections.

@gregw
Copy link
Contributor

gregw commented Aug 18, 2022

With regards to the question of should resolved return s resource that does not exist, i think it should.

There is the rare case of needing to support PUT. Also i think we create resources in things like quick start

Checking for exists if pretty much the same hassle as null checking anyway

@joakime
Copy link
Contributor Author

joakime commented Sep 6, 2022

You've left out the use-case from the jetty maven plugin: the WebAppContext can be configured directly in the pom.xml. Therefore a user may have configured a resource base that is either a single Resource, or a ResourceCollection. The plugin then needs a way to overlay on top of that all of the various unassembled directories and/or wars that are declared dependencies. So the base resource is a Resource; ResourceCollection must be a Resource; ResourceCollection may be composed of other ResourceCollections.

This is how it currently is in Jetty 12.0.x.

See testcase: https://github.com/eclipse/jetty.project/blob/8953873e67cebd204366c9c95ab68660f56f0b4a/jetty-core/jetty-util/src/test/java/org/eclipse/jetty/util/resource/ResourceCollectionTest.java#L134-L174

You can combine (using ResourceFactory.combine()) two or more resources regardless of type (eg: PathResource and ResourceCollection), and get a ResourceCollection from its result.

I'll add this as a use case above.

@joakime
Copy link
Contributor Author

joakime commented Sep 6, 2022

There is the rare case of needing to support PUT. Also i think we create resources in things like quick start

Checking Jetty 9/10/11 I see a PutFilter, not there in Jetty 12.0.x

https://github.com/eclipse/jetty.project/blob/jetty-10.0.x/jetty-servlets/src/main/java/org/eclipse/jetty/servlets/PutFilter.java

It doesn't use Resource anyway, so not a problem.

QuickStart has a few places where it creates a Resource, those all need to be addressed, as it doesn't take into account composite behaviors, and can easily break right now (eg: if there's more than 1 WEB-INF directory discovered, or the quickstart-web.xml is in a place that cannot be written to)

joakime added a commit that referenced this issue Sep 6, 2022
…issame

Issue #8474 - Jetty 12 - Eliminate `Resource.isSame`
joakime added a commit that referenced this issue Sep 6, 2022
…ismemorymappable

Issue #8474 - Jetty 12 - Eliminate `Resource.isMemoryMappable`
joakime added a commit that referenced this issue Sep 6, 2022
Issue #8474 - Jetty 12 - Introduce `Iterable<Resource>` to base `Resource`
@olamy olamy moved this to To do in Jetty 12.0.ALPHAS Sep 7, 2022
joakime added a commit that referenced this issue Sep 7, 2022
…ified-to-instant

Issue #8474 - Jetty 12 - Change signature of `Resource.lastModified()`
joakime added a commit that referenced this issue Sep 12, 2022
joakime added a commit that referenced this issue Sep 12, 2022
joakime added a commit that referenced this issue Sep 13, 2022
* Introduce Resource.getFileName and use it
* Correct signature of Resource.list and use it

Signed-off-by: Joakim Erdfelt <[email protected]>
joakime added a commit that referenced this issue Sep 14, 2022
…FileName()` addition (#8582)

* Issue #8474 - Resource.list and Resource.getFileName work
* Correcting signature of Resource.list and use it
* Introduce Resource.getFileName and use it
* Adding URIUtil.addPath() test to show file+str behavior

Signed-off-by: Joakim Erdfelt <[email protected]>
@joakime
Copy link
Contributor Author

joakime commented Jan 26, 2023

The only remaining quirk is the copyTo(Path) ambiguity.

The idea of making copyTo(Path) support IncludeExcludeSet to eliminate/centralize the SelectiveJarResource implementations is awkward.

For a few reasons.

  • The current implementations of SelectiveJarResource is based on plexus-utils SelectorUtils to have consistent glob/pattern selection for includes/exclude to what is used in maven itself.
  • The use of copyTo(Path) isn't guarded for this kind of entry walking, the Resource could be a CombinedResource or even a simple PathResource and might not have entries to even walk. But the implementation of SelectiveJarResource doesn't guard against that.

I could considered an .extractTo(Path, IncludeExcludeSet) which is more clear, and not ambiguous, but would NOT be based on plexus-utils SelectorUtils (as I don't want to introduce a plexus-utils dep on jetty-util). However that would likely change the behavior of the maven plugin users that actually specify the includes/excludes on overlay based configs.

So leaving the 3 flawed implementations of SelectiveJarResource seems to be the only sane choice.

@joakime
Copy link
Contributor Author

joakime commented Mar 28, 2023

Closing.
Specific issues can be opened for new things.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants