Local Module Search Paths #12923
Comments
I was wondering if one could always use I don't really like the flag name
|
This proposal strikes me as being pretty complex in its use of a very delicate set of ordered flags to define a hierarchy. And I have to admit that I tend to get pretty skittish about proposals that start to require the language or compiler to start to have some sort of concept of a (mason) package (is there precedent for this in other languages?). To me, from the language/compiler perspective, a mason package is just a module that uses whatever modules it needs to and defines whatever submodules it wants to, so I worry about the need to teach the compiler about packages if it's not necessary. I know I've gotten pushback on this before, but are we sure that such module-specific dependences shouldn't be specified in |
Did you see that Example 1 and Example 2 did not use any new flags at all? (I.e. I think the solution to Example 1 and Example 2 is the main part of the proposal; the new flags add additional functionality and if they're the only thing you don't like then it's worth separating that). In particular I'd like to know if you object somehow to Example 1 or Example 2. Besides that, there is nothing about this proposal that ties it directly to mason. The names of the flags use the term "package" to reflect their expected common use. @ben-albrecht and I discussed another option for the flag names,
That might be possible but I don't think it would solve the problem for Mason packages using modules with the same name by itself. Something about how the compiler handles the module paths will have to change to solve that problem. Or we could conceivably insist that Mason packages using more than one .chpl file use a require statement from the main package .chpl file. In any case I think there is more to your Anyway, I view the main idea of this proposal to be this:
Wouldn't having |
I don't think I'm objecting to this aspect of the proposal; more to the use of command-line flags to specify per-module behavior; and somewhat to the elevation of packages to a compiler-/language-level concept if it can be avoided (not using the term "package" in the flags seems like a dodge... is there no way we can relate these concepts to modules directly?).
Today, testit.chpl:
require "M.chpl";
proc main() {
  M.foo();
}
M.chpl:
proc foo() {
  writeln("In M.foo()");
}
Works if you do:
I don't think we've ever had support for require "subdir/M.chpl"; I think we could also consider adding support for I also wonder sometimes about specifying paths using config params, though this is harder for module paths than -I / -L paths because the current compiler architecture wants/needs to know those long before param resolution has occurred (but perhaps if it were restricted to simple string literals and compile-line config params? What the I could imagine potentially needing to make other flavors of
I think that's right. The kernel of what I like about thinking about this in the context of |
We could think about making all
That's good to know. I think we could immediately implement the change to fix Example 1 and Example 2.
I think we should talk more about command-line vs. source code for these things, but my position is that we'll ultimately want to support both.
Sure, the flags could be called Anyway, let's talk more (maybe in another issue) about how you'd imagine supporting submodules in a different file from a module. Such a functionality will be important in the event that the submodule wishes to refer to private functions in a parent module (since just making them both top-level modules will not allow access to the private functions). |
After discussing offline with Michael: I remain unconvinced that this feature is a necessity in Chapel, though I'll admit I'm not certain about that. The direction I'd prefer to invest in for the short-term is to explore the ability to break a module and its submodules up across multiple files, and then come back to this issue. Specifically, for example 1, it seems to me that if L is intended to be a module that helps M define its behavior but that nobody else should know about, that L should be a sub-module of M rather than a top-level module that somehow only M knows about. Or, put another way, I don't think there should be a way to inject module names into the top-level namespace that some modules can see but others can't (any more than I think there should be a way to declare a module-scope variable that some functions can see but other functions cannot). So to me, the question example 1 poses is "Did the author actually want L to be a sub-module of M?" and if so "Is the issue really that they want a way to split module M and its submodules across multiple files to avoid having to define L within the M.chpl file?" |
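For concreteness, the sub-module arrangement described above can already be written in a single file. The following is a minimal sketch, not taken from the thread: the module names M and L follow Example 1, while `mfunction`, `lfunction`, and the `Main` module are illustrative assumptions.

```chapel
// Sketch only: L is nested in M and marked private, so it is not
// injected into the top-level namespace.
module M {
  private module L {
    proc lfunction() {
      writeln("in lfunction");
    }
  }

  // Public entry point of M; it can use L because L is declared in M's scope.
  proc mfunction() {
    use L;
    lfunction();
  }
}

module Main {
  proc main() {
    use M;
    mfunction();
    // use L;  // not expected to resolve: there is no top-level module L
  }
}
```

What this sketch does not address is the open question in the comment above: how to keep this module hierarchy while moving L's body out of M.chpl into its own file.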
@bradcray - Earlier you stated you were not opposed to the local module search paths part of the proposal, but this reads as though you now are. Have I interpreted your current stance correctly?
Yes, the author could make L a sub-module of M if we had a solution to splitting sub-modules across multiple files, as you suggest. However, there are some challenges with the sub-module approach, e.g., when a project has a diamond-shaped dependency:
In any case, I think a good next step would be to explore the separated-submodule idea a bit more in a new issue and understand how we might handle some of the challenging cases, so that we have more concrete ideas to compare against each other - as @mppf mentions above. |
I guess that's accurate and that the off-line discussion made me more skeptical about their importance than I had been previously. It might be most accurate to say that I'd like to see whether supporting the ability to break nested modules across multiple files + minor tweaks to directory/file organizations and conventions would obviate the need to support module-specific search paths. I don't see your diamond-shape case as presenting a problem for sub-modules. I think the module structure you're saying you want is:
module M {
  private use L, K;
  private module L {
    private use Utils;
  }
  private module K {
    private use Utils;
  }
  private module Utils {
  }
}
And then the question becomes "How would we permit you to break this structure up across multiple files?" |
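The nested structure above can be fleshed out into a single-file program to check that the diamond resolves as intended. This is a minimal sketch under stated assumptions: the procedure names `helper`, `lfunction`, `kfunction`, and `mfunction`, and the `Main` module, are hypothetical additions and not part of the original comment.

```chapel
module M {
  private use L, K;

  private module L {
    private use Utils;   // Utils is found in the enclosing scope of M
    proc lfunction() {
      helper("L");
    }
  }

  private module K {
    private use Utils;   // the same Utils module, shared with L
    proc kfunction() {
      helper("K");
    }
  }

  private module Utils {
    // Hypothetical shared helper used by both L and K.
    proc helper(caller: string) {
      writeln("Utils.helper called from ", caller);
    }
  }

  // The only symbol that M is meant to expose in this sketch.
  proc mfunction() {
    lfunction();
    kfunction();
  }
}

module Main {
  proc main() {
    use M;
    mfunction();
  }
}
```

Both L and K reach the same Utils through M's scope, so the diamond needs no duplicate copy of the module and no per-module search path; the remaining question is only how to split these bodies across files.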
Here's one answer to my question (albeit one that's generally been met with negative reviews, but just to start somewhere...):
M.chpl:
module M {
  private use L, K;
  include "subdir/L.chpl", "subdir/K.chpl", "subdir/Util.chpl";
}
(where L.chpl, K.chpl, and Util.chpl each define the respective module from my previous comment). Properties:
|
I think that some of the criticism of I'm not sure I'm on board with the requirement that the Additionally the need to put a path like
So here is a straw-person counter-proposal:
module M {
  private module L;
}
Here the compiler could interpret |
Sorry, I didn't mean to imply that So, starting with your preferred directory structure:
I'm thinking about how main-module.chpl found M.chpl to begin with? (where the third answer below is what I think this issue is assuming, but for completeness...). One possibility is that it's in the module search path. But if that's the case, then L.chpl is also in the module search path suggesting that any other module's A second possibility is that A third possibility is that So this makes me think that we should look into no longer having command-line Chapel files affect the global module search path, see what tests break, and whether we find them compelling. If not, we can change this behavior to not affect the global module search path, and not require a local module search path either (at least for this case/reason). This is a simple change to make (see https://github.com/bradcray/chapel/tree/relative-chpl-dont-affect-modpath) and it looks like < 75 tests use the feature, so I'll run a spot-check on them tonight and do a full run to make sure I didn't miss anything when nightly testing isn't about to run (failures due to spot-check: (For historical purposes: Why did we take this behavior? I think it's because if a file Anyway, if we were to change this, then I think we wouldn't need a module-local search path for this case either (and at this point I want to foreshadow an important sidebar that makes up the final three paragraphs of this comment). Am I missing any other ways that main-module could know about M's location?
Just to be clear, I don't share this concern, at least for cases like this. I think it's reasonable for an author of a big Chapel module who wants to break it into separate files to organize those files using subdirectories and specify relative paths to get to the files where they live. I'm also not sure that those who have objected to putting paths into sources in the past would object to cases like this either; what I recall hearing objections to was more around putting library search paths or include paths into sources for system-wide packages. But maybe there is a reason to avoid even simple relative paths like this when creating little code clusters that I'm not seeing.
It's also similar to LaTeX's The main criticism I've heard about I don't mean to imply that having an All that said, I'm far more happy to wrestle with counterproposals to the "how do we break a module across multiple files" question than the "how do we create module-specific search directories" question because I think it solves two problems: (1) how to avoid huge monolithic files in Chapel and (2) how to encapsulate private modules so that they don't pollute the top-level program namespace. That said, I have to admit that I'm not crazy about Michael's counterproposal:
module M {
  private module L;
}
As Chapel stands today, I interpret this as: "I'm defining a private module named L. It has no body / contents" (similar to how [One historical note that I've brushed up against a few times in this issue and want to get out in the open again: The current behavior in which We made a start at doing something better with a grammar called modulefinder.ypp (that can be found in the git archives) which was meant to be a clone of chapel.ypp that mostly just dropped code on the ground but knew how to navigate comments and strings to avoid false positives. Then the idea was to create little index files that would say which modules were defined by each I think this model still has merit (lots more than the current system), though there are challenges as well: For example, if the modules that are defined by a file depend on the settings of a |
I'd forgotten that Bryant also gave it a thumbs-down for other reasons in issue #10909. |
Not that I know of. Indeed, the Mason case is that
I'm not seeing how which modules a file defines could depend on a config param currently. Are you imagining that some other feature is introduced?
Right, I think we need to choose one (or more) of these:
I have in the past bristled at the way that I think we still have a problem that requires module-local search paths. The reason for that is that if we allow a (more explicit) way to indicate where a module is coming from, then it needs to be checked before the global module search path and not apply to other modules. I tried using
chpl main-module.chpl M/src/M.chpl
// main-module.chpl
use M;
proc main() {
  mfunction();
  use L; // currently compiles but I want it to be an error
         // because L is intended to be private to the package M.
}
// M/src/M.chpl
module M {
  //require "L.chpl"; // doesn't find L.chpl
  //require "./L.chpl"; // doesn't find L.chpl
  require "M/src/subdir/L.chpl"; // works but requires specific working directory
  proc mfunction() {
    use L only;
    L.lfunction();
  }
}
// M/src/subdir/L.chpl
module L {
  writeln("initing L");
  proc lfunction() {
    writeln("in lfunction");
  }
}
An idea of module-local search paths would solve 2 problems in this example:
|
Sure, we can address that. What about
module M {
  private module L in "L.chpl";
}
Anyway I think the big question is if we want submodules-in-different-files to be handled by:
|
From the original post, I completely agree with Examples 1 and 2. Because that would be only a behavioral change with the compiler with no new compiler options, I don't see a reason not to do it today. For Subdirectories and Examples 3 and 4: nope. I don't want to be in perpetual servitude to an arbitrary layout of my filesystem directories. The user should not be allowed to arbitrarily put files in subdirectories only to then go about representing that layout differently in their actual code. The code should dictate where to put modules, not the other way around. That way, we don't get into this mess with For the following code:
module MyModule {
  use Submodule1;
  use Submodule2;
}
There should only be a few known layouts for it, including a few combinatorial layouts between 2 and 3.
Look at the directory tree. Can you tell which are the parent modules and which are the submodules? The compiler should be able to do this too without any new compiler options. So, Example 3 is turned into:
and Example 4:
Most of the comments in this thread talk about the Original Examples 3 and 4, which I am not in favor of supporting because of the unnecessary complexity and discussion it has generated. This solution seems way cleaner. |
Reading through the comments more carefully, Brad's responses have actually been pushing back against the idea of local module search paths. Global module search paths are right in line with the current status quo where module members have default public visibility and
Example 1 is good to me. I want to expose a set of public APIs through a top-level module called M. I don't mind defining a submodule L in a separate file if it means I can break away some of those components into a logical grouping. If I can't break that module into a logical grouping called L, then they deserve to be in a monolithic file because it's all related functionality anyway. One place where this helps is with #12712 in a refactor of all stable standard modules into top-level module
While just a straw-person, the problem with this approach is that it doesn't work for #12712.
The local module search paths would need to learn how to look for files too. The problem doesn't occur if
I don't want to embed directory paths to files at all. It's unnecessary information if we outright forbid that directory structure from existing with local module search paths. More critically, semantic imports through
I wouldn't put too much emphasis on the test suite given that there likely aren't a lot of tests with large hierarchical module dependencies in there.
I dislike all of the stated behaviors. 😄 Maybe that's because I like controlling Chapel things from within Chapel source and not controlling C or command-line things from within Chapel source. The compiler can control the local module search paths without having to resort to filesystem-level constructs and the duplication of functionality when nested
// A_Include.chpl
module A {
  include "B.chpl"
  include "C.chpl"
  include "D.chpl"
}
// A_Include_Bad.chpl
module A {
  module B {
    include "B.chpl" // Did I just wrap B in an outer B?
  }
  module C {
    include "C.chpl"
  }
  module D {
    include "C.chpl" // Whoops.
  }
}
// A_LocalModuleSearchPaths.chpl
module A {
  use B;
  use C;
  use D;
}
While I don't want to go up against the behemoth that is LaTeX, this is one of the first results regarding
Because we can be better! Do you want faster horses or a car? 👍
I'd personally like whatever solution occurs with modules to be useful enough that
I'd advocate for
See above regarding C++ modules after decades of experience with
Unless the language enforces it, someone will do something tricky with it, which is where I'd advocate for something much more restrictive that provides the necessary information for compilers to do their jobs while giving enough flexibility to the programmer to lay out their directory structure without low-level hooks to let humans be the creative entities that they are in doing said tricky things. I'd argue that this isn't new ground; modern languages are all gravitating away from these low-level behaviors regarding package directory structures. Though taken to the extreme, something to avoid is Python's behavior where it's
Uh, so if I wanted to edit someone else's module L, I'd have to grep all source files to find it? I'm glad that's not the world we ended up in; restrictive is better for the compiler! (From my previous point, that also sounds pretty expensive for the filesystem.)
I agree; |
Handy references to consider:
|
Another example. I'll use
module Foo {
  public use Bar.Baz;
  private use Details; // Only available to Foo and any of its submodules like Bar or Bar.Baz.
}
module Bar {
  public use Baz; // If we don't do this, then Foo can't `use Bar.Baz` either.
}
proc main() {
  use Foo;
  //use Foo.Bar; // No! You can't do this. Foo didn't `public use` Bar.
  use Foo.Baz; // Ok. Foo has Baz in scope. main doesn't need to know that it occurs through Bar.
  //use Foo.Bar.Baz; // Nope!
  //use Foo.Details; // Can't do this either. Details is private.
}
But note the dragons. I'm not sure how Details will get exposed to Bar or Baz. Rust can use a top-of-level absolute path to handle this case. So maybe this isn't quite the right model and we have to be even more restrictive and start our local search elsewhere like the Edit: Added search paths for modules under Foo looking for Details. Maybe? I don't know. |
@BryantLam - wow that's a lot of comments :) I wanted to bring up some things related to your proposal in #12923 (comment) but please note that I'm not yet trying to express an opinion about it.
Difference from directory-is-a-module?
First, I want to understand how your proposal is different from what I proposed in #10946. I think that the difference is that your proposal doesn't actually create submodules. Instead, it is a module search adjustment. In particular, when looking for Does the
|
I think that's right. My proposal doesn't create submodules; rather it limits the module search paths instead. After looking at #10946 more closely, I didn't grasp the distinction between a submodule versus
Whoops! Sorry, you're right! I read your code snippet too quickly and thought that
I think my proposal--being an extension of the original post at the top--will likely not address the submodules distinction unless strategy 1 enables that effect. The proposal only handles the local search paths, so I think if we substitute
More (hopefully relevant) references:
|
@BryantLam - Thanks especially for the links to Rust's previous work. I see also Revisiting Rust's modules, part 2 and - from a different author - The Rust module system is too confusing. Also, it looks like what was actually agreed upon and implemented from those blog posts is in RFC 2126. As I understand it, that RFC does support the idea that a directory can represent a module - but it doesn't make it mandatory. (In particular, if you search for One thing that is clear to me is that just as this is one of the few Chapel issues where I see people putting 👎 on ideas... the module system discussions for Rust were pretty contentious. Somehow it seems inherent to the topic. Anyway, it looks to me like the authors of the blog posts mentioned above would like for Rust modules to more closely map to files and directories. However AFAIK this is not what Rust has done, at least in part due to backwards compatibility issues. I think that is a reasonable direction for us to go - or to at least seriously consider. Certainly one could view #10946 as a starting point in that direction. Note that some of the blog posts even argue for deprecating There is an important difference between #10946 and the Rust proposals around directory-as-module. In #10946, I proposed that the files within a directory would be submodules. But in Revisiting Rust's modules, the files in a directory collectively create a module, and submodules are stored in their own subdirectory. Why do they think that it's generally better for the directory structure to match the module hierarchy? Because it's less confusing (especially for beginners) and also because it allows one to know where to look for a particular piece of code in a larger project. So, I think the main question at this point is this - should the recommended style for submodules in different files involve an idea of directories representing submodules?
That is, that a module can be represented by a directory, with submodules represented by subdirectories? What could this look like, in terms of Example 1 from this issue? Example 1 using
|
More links from Rust:
I wrestled with this topic myself coming from a C/C++ background, but I do believe it is better that Chapel enforces a packaging standard in the long run. The majority of programmers are reading/maintaining code way more often than writing code. I'd personally want any user of Chapel to be able to quickly learn/scan through any package source because all packages/codebases would have a consistent filesystem layout, whatever that layout may be.
I completely agree. This rationale deserves repeating because Python source is laid out in a similar way and I'd like to think that the spaces-vs-tabs debate went away with code formatters and style guides similar to how Python's rigid packaging hierarchy removed a similar debate for new codes being written.
I agree, especially since it is easier for a new user to grok. While I do empathize with @bradc's desire for what amounts to multiple files inlined into a If such a feature were desired, the proposal in #10946 actually notes inlined-files-as-module as a possibility from the original Rust proposal where files in a directory were concatenated/inlined into that module and submodules must be directories. This model would not be that hard to understand either, but it is different enough from Python and Rust that it has to be taught. I'm okay with either option since the inlining/concatenating approach affords an additional capability. It does, however, deviate from Chapel's notion of file-level modules, though that problem is also present with #10909's include statement. More questions related to module search paths:
Question: Will Chapel need the same pathing distinctions that Rust and Python have?
Ambiguities? Visibility of name conflicts between user modules and Mason packages? For example, how would you specify between the two
module Baz {
  ...
}
module Foo {
  module Bar {
    module Foo {}
    use Foo; // ambiguous; or (my preference) child-Foo using relative paths
    // Python-like syntax.
    use .Foo; // child-Foo referencing `self::Foo` module in Rust
    use ..; // parent-Foo referencing `super` module in Rust
    use Baz; // Today, this would work. Should it? What about other packages?
    use /Baz; // Unambiguous from top. My fake syntax that starts from "root".
    // .. What does "top" even mean?
  }
}
Another example in #10946 (comment).
Question: What's the default search-path behavior?
Absolute vs. relative pathing. Is there one? Relative pathing is more natural. This would affect the ambiguous search case and
Question: What's the visibility for items in a "package"?
The compilation boundary in Rust is a crate. The default visibility for items in a crate was changed from private to Chapel doesn't really have a notion for a package, but I think the question of item visibility will also be a concern in order to minimize re-exports. Python doesn't have this problem because everything is public visibility (for better or worse), so maybe it's not a big concern since Chapel already has default public behavior; the main downside to this behavior is when someone |
I think so - I think we'll need a way to specify the difference between an absolute path and a relative one, at the least. I think this is only about
I would agree that relative paths are more natural. However I'm open to considering the alternative.
We could introduce a visibility like But either way, if there is some module That leads me to wonder if it would be good enough to rely on that property to control whether or not |
I've ignored this issue for a month because it was driving me a little crazy when I was active on it, and then the conversation snowballed to the point where I was unable to keep up (which then sapped my motivation to even try to catch up). I started into an attempt to catch up with it today and quickly felt overwhelmed again, so ended up just taking a really quick first pass through it, mostly skimming for the sake of time, and trying not to get too hung up on details. I have a feeling that what we're going to need at some point (maybe now, but more likely not quite yet) is a new issue proposing a strawperson plan that wouldn't require everyone to digest the discussion on this one in order to understand it. As a baby step towards catching up and re-engaging on this topic, let me try and state the concern that I was left with a month ago and that's been rattling around in my head since dropping off. It seems relevant to a few of the comments that caught my eye today as I was trying to catch up, like this one:
In doing so, I'm going to ignore (for now) a bunch of other questions that were asked of me and comments that seemed like they wanted a response to try and keep this manageable. Moreover, I'm going to do this without talking about files and directories at all because I think my concern is unrelated to that aspect of the issue (which is unfortunate, since that's the topic of the issue! :) ). My mental model of Chapel's namespaces and scoping (which I believe matches what is implemented today) goes something like this:
Given that, when I think of a Chapel program, I tend to think of its structure as being formed around the nesting or hierarchy of modules. For example, given the code:
module M1 {
  module S1 { ... }
  module S2 { ... }
}
module M2 {
  module S1 { ... }
  module S2 { ... }
}
my mind pictures the following (and apologies, but I'm going to use a directory hierarchy notation for convenience, though I'm not trying to tie this back to directories and files in any way):
Moreover, when I think of OK, so where a lot of this conversation hung me up a month ago is due to what seemed to me like a recurring theme of "I want to use a module / make it known to the Chapel compiler, but I don't want anyone else to be able to see it, yet I don't want to make it a submodule." And to my thinking, that seems inelegant and counter to Chapel's design. As a specific example to talk about, let's go all the way back to example 1:
In my mind, regardless of the arrangement of files and directories here, the options for the module hierarchy on master today are either: case 1: sub-module
in which case nobody can get to or: case 2: sibling module
in which case we shouldn't be surprised if others can see What I worry about is that it felt like the original post and several of the comments have been wanting something new and different like the following: case 3: private sibling module
To me, this feels like a new, complicated, and unnecessary concept that I'd like to avoid if at all possible. That is, I believe that if you don't want others to know about So my baby step for today is to pause at this point and see whether anyone (but particularly @mppf and @BryantLam) disagree with what I've written here (where you're welcome to point out "Yeah, we'd already come to this same conclusion midway through that huge conversation you skimmed"). Most specifically:
|
My viewpoint today is that I'd be pretty happy with a system emulating some of the Rust proposals (e.g. what I outlined in #12923 (comment)) which provides for easy submodules but not for the private-sibling-module pattern. However I view this as having some of the same features of the original proposal in terms of customizing how the compiler "finds" modules (since e.g. M.chpl can access M/L.chpl but the rest of the source code cannot). But yes, it does so with submodules rather than private-sibling-modules. I think that it would be reasonable for an author trying to achieve Case 3 from the original issue description to put L.chpl inside of M somehow. I have some concern that if doing so is not really easy / intuitive in terms of files and directories, users won't do it. Additionally, I think the question of whether or not it should be an error for Mason packages is a bit fraught. Perhaps mason should merely print out the names of the modules that are being exported. |
I agree with Michael. Admittedly, this issue went off on a tangent for a bit, but it's all related to the original post's issue of conflicting same-named modules in the global module path (#8470). Reusing Example 1 from the original post (modified to include
The module hierarchy is:
Today, this program cannot be compiled without conflicting-module errors.
How do you solve this issue without local module search paths? One option is to do it using low-level primitives like the Edit: I do agree with you. I don't think case 3 of private sibling modules is something I'm overly concerned with in the discussion regarding how to lay out code. In libraries, there will be a package-level module (i.e., Mason package) that has to be exposed as the entry module into that library, similar to a main module of an application. It's why these questions are particularly relevant regarding visibility of symbols within a package boundary. Edit2: Part of the debate which eventually led to #10946 was how to split submodules into other files so the compiler can still find them. |
Thanks for (eventually) asking the simple yes/no questions I was looking for with only minimal (5ish?) other unrelated paragraphs. I'll get back to the topic at hand soon. |
I've split off the specific concrete proposal I think we might have some agreement on into #13524. |
We implemented something along the lines of #13524. Closing this one. |
This issue is a proposal for a solution to the duplicate module names in mason
packages issue (#8470):
This issue could be solved by introducing a concept of local module
search paths, i.e. each module contains its own module search path rather than
using a single global module search path for all modules.
Consider the following example:
Example 1
Directory Layout:
We would like a main module in a different directory to be able to use M directly and not L, and we need to somehow provide the compiler with the location of M.chpl.
Compilation of Main Module:
Today, the global module search path looks something like:
Therefore, L is still accessible to the main module.
In this proposal we'd like for the local module search paths to be as follows:
Therefore, only M can access L directly.
Example 2
Suppose the main-module from before now requires a mason package, [email protected]:
The local module search paths under this proposal would be as follows:
Subdirectories
What if there are subdirectories? To support this case, we will need new
compilation flags that can modify local module search paths.
The proposed compilation flags for modifying module search paths are:
--include-package <moduleFile> adds a module (<moduleFile>) to the local module search path of the main module.
--include-subpackage <moduleFile> adds a module (<moduleFile>) to the local module search path of the last module listed in an --include-package or --include-subpackage flag.
--package-private-M <path> adds a path or module file (<path>) to the local module search path of the last module listed in an --include-package or --include-subpackage flag.
-M <path> adds a path or module file (<path>) to the local module search path of all modules being compiled, i.e. the global module search path.
Example 3
Directory Layout:
Compilation Command:
The local module search paths under this proposal would be as follows:
Example 4
Directory Layout:
Compilation Command:
The local module search paths under this proposal would be as follows:
Note: The proposed flag names here are placeholders (especially --package-private-M) so feedback is welcome on those.