Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Modify Import Syntax to Include Username Prefix #1767

Closed
7 tasks
radeusgd opened this issue May 28, 2021 · 15 comments
Closed
7 tasks

Modify Import Syntax to Include Username Prefix #1767

radeusgd opened this issue May 28, 2021 · 15 comments
Assignees
Labels
--breaking Important: a change that will break a public API or user-facing behaviour p-high Should be completed in the next sprint

Comments

@radeusgd
Copy link
Member

radeusgd commented May 28, 2021

Summary

The libraries are meant to be identified by their author's username, so we want to support the syntax import Library_Author.Library_Name.Some_Module.Some_Symbol.

All libraries will have an author, so we are not making the author prefix optional, instead - every import (of course besides polyglot imports as these are a separate thing altogether) will now need to contain that prefix.

In particular, the standard libraries will use the prefix Standard.

Value

  • Libraries can be identified based on their author.
  • The import syntax will stay consistent.

Specification

  • Update the parser to support the new syntax (both imports and exports).
  • Change import resolution logic to take the prefixes into account.
    • The prefix should be read as author's name from the librarie's package.yaml (the same as the library name is taken from there).
  • Split the Standard library back into separate modules, all having the Standard prefix.
  • Update any other components that were affected by this change.
    • This may include making the Lanugage Server of new syntax, updating imports in tests and example to also have a prefix etc.

Acceptance Criteria & Test Cases

  • The above specification has been completed.
  • There are no regressions in existing tests.
    • In particular, all standard library tests are correctly configured on the CI and passing.
  • There are tests that check the new syntax.
@radeusgd radeusgd added Category: Libraries --breaking Important: a change that will break a public API or user-facing behaviour p-high Should be completed in the next sprint labels May 28, 2021
@radeusgd
Copy link
Member Author

I'd like to discuss a few things here (@wdanilo @MichaelMauderer @iamrecursion):

Prefix for local projects

An edge case that we have to handle is that one may create a project while not being logged in and want to import modules from that project. As we do not have relative imports, they will need an absolute path so the project itself needs a prefix. What should it be, since the user is not logged in?

As far as I understand we do not want to force the user to log-in to create a new project and we also would prefer to avoid having to modify imports once the user logs in to rename the prefix. My suggestion would be to have a reserved prefix (much like Standard is reserved) for local projects - be it local or project. This way the local projects have a stable path that can be used. We do not have to worry that this path is different from a username because projects created in this way will not be shared as libraries - creating a library requries being logged in and will use the proper prefix.

Author field in Standard libraries

As far as I understand, we want the Standard to include not only our own libraries but also 'community-contrib'. In that case, the above would require to modify the author field to be set to Standard to force the correct prefix. Are we ok with that? Library authors may not like this, they can still be listed as maintainer but it does not look good that the authorship is unclear.

One solution that I'd suggest is having a separate field prefix which, if not present, will fall back to author. When uploading the field will be checked alongside author. This has the advantage that in the future we could have special permissions for uploading libraries with the Standard prefix that could allow their maintainers to easily upload updates to these libraries without giving them access to the whole standard library (the authentication could verify that the author is correct and check if they have permission for the Standard prefix enabled).

@radeusgd radeusgd mentioned this issue May 28, 2021
28 tasks
@radeusgd
Copy link
Member Author

Prefix syntax

We need to come up with criteria for the usernames/import prefixes.

I'd suggest just [A-Za-z0-9_]+ for greatest compatibility (let's keep in mind that these will be used as part of filesystem paths when the user downloads a library) but I'm open to other suggestions.

@radeusgd
Copy link
Member Author

radeusgd commented Jun 2, 2021

I think this should take 2-3 days but I set 4 to be safe as this affects many parts of the engine so it's highly possible there will be unexpected issues.

@iamrecursion
Copy link
Contributor

An edge case that we have to handle is that one may create a project while not being logged in and want to import modules from that project. As we do not have relative imports, they will need an absolute path so the project itself needs a prefix. What should it be, since the user is not logged in?

I like the local prefix you suggest. It's clean and clear.

One solution that I'd suggest is having a separate field prefix which, if not present, will fall back to author. When uploading the field will be checked alongside author. This has the advantage that in the future we could have special permissions for uploading libraries with the Standard prefix that could allow their maintainers to easily upload updates to these libraries without giving them access to the whole standard library (the authentication could verify that the author is correct and check if they have permission for the Standard prefix enabled).

My concern with a separate prefix field is that it would only be used for the standard libraries. It'd be a bit confusing to basically deny most users the ability to use the field but have it available. That said, I can't think of a better idea at this juncture.

@kustosz
Copy link
Contributor

kustosz commented Jun 7, 2021

If we're going with such a naming scheme, I would opt to change Standard to Enso. import Enso.Base does not look bad and is consistent with the design choice of including author names in imports.

@kustosz
Copy link
Contributor

kustosz commented Jun 7, 2021

Right, so this is next on my plate, we need to make calls on the unresolved issues:

  1. Do we want to do anything about the prefix? IMO the fact that we feel the need for this is a big design smell – why are we implementing a feature that we ourselves want to work around? We should be fine with using the prefix Enso or rethink this functionality altogether.
  2. The local thing seems a bit off – it's not really local as much as current_project – maybe should just be project? Or this_project? Or maybe no prefix at all, just a .? Something like import .My_Module?

@radeusgd
Copy link
Member Author

radeusgd commented Jun 7, 2021

If we're going with such a naming scheme, I would opt to change Standard to Enso. import Enso.Base does not look bad and is consistent with the design choice of including author names in imports.

As far as I remember the Standard library was also meant to include user-contributed libraries that were deemed part of it, which would not fit to this scheme. @wdanilo I think it was you mentioning these user-contrib libraries at some point, what do you think here?

The local thing seems a bit off – it's not really local as much as current_project – maybe should just be project? Or this_project? Or maybe no prefix at all, just a .? Something like import .My_Module?

I think this may be a misunderstanding - to my understanding, local is nothing special, you shouldn't need to think about it within this task. It would just be a reserved prefix that will be selected for new projects when the user is not signed-in, so when creating the project instead of using e.g. kustosz (because we don't know it until you sign-in), we will use local as a placeholder - instead of something like unknown. But this would act the same as any author's name, just a default value for it. So I don't see any special logic in the parser needed for that.

@radeusgd
Copy link
Member Author

radeusgd commented Jun 7, 2021

Btw. we still have a problem with using author for the prefix - currently our spec allows multiple authors, as the authors field can be a list. We could take the first one or something, but that sounds extremely arbitrary.

So I think that we may need to add a separate field anyway, I'd suggest calling it prefix or owner or something similar; it would normally default to the first author when creating the project, but it could also handle these other edge cases.

@wdanilo
Copy link
Member

wdanilo commented Jun 8, 2021

If we're going with such a naming scheme, I would opt to change Standard to Enso. import Enso.Base does not look bad and is consistent with the design choice of including author names in imports.

The problem with this is that Enso is not the author of things that will be included under this prefix. Semantic naming is always better than anything else. Things that we will put in Standard are libraries of different people that we think are good, proven, and should be used as a "standard". The name "Enso" doesn't tell that, thus I believe that the name "Standard" is much better.

Do we want to do anything about the prefix? IMO the fact that we feel the need for this is a big design smell – why are we implementing a feature that we ourselves want to work around? We should be fine with using the prefix Enso or rethink this functionality altogether.

TBH I don't understand this question – where were any sentences that we want to work around any of the described features? Also, what is the prefix you are referring to? Is this Standard vs Enso prefix, or the proposed prefix field in the config?

The local thing seems a bit off – it's not really local as much as current_project – maybe should just be project? Or this_project? Or maybe no prefix at all, just a .? Something like import .My_Module?

I also don't like local - it is not clear if it tells about local files on my computer (vs published libs), or something else. The project prefix is much better. I'd not use the dot-prefix syntax, as maybe in the future we will allow relative imports and this syntax is pretty popular for relative imports. Also, the dot-prefix syntax is something that would be confusing for people at the first sight, while the project name is very explicit and self-describing.

As far as I remember the Standard library was also meant to include user-contributed libraries that were deemed part of it, which would not fit to this scheme. @wdanilo I think it was you mentioning these user-contrib libraries at some point, what do you think here?

That's right. We've been discussing that already with Marcin too. In fact, the name "Standard" was invented by Marcin during this discussion after me telling exactly this argument ;)

My concern with a separate prefix field is that it would only be used for the standard libraries. It'd be a bit confusing to basically deny most users the ability to use the field but have it available. That said, I can't think of a better idea at this juncture.

I fully agree with Ara here. Let's be config-less if that is possible. Telling which library belongs to "Standard" is our job (the core team). This should not live in user-maintaining configs. Instead, there should be a list of libraries on the server that will just contain all names of libs we consider to be "standard". This list would be updated by us.

@kustosz
Copy link
Contributor

kustosz commented Jun 8, 2021

@wdanilo right, so here's where I'm confused. We talk about wanting the prefix to be Standard but also about this being plucked from the author field. Are we then calling our team Standard? And when a library gets included, do we want the original author to change their name to Standard?

Particularly the latter (a super awesome community member has to change their name to Standard and lose their original attribution) bothers me – I wouldn't be happy to re-author my super-awesome library – the system of including libs under Standard should be a reward and should foster community member growth, not anonymize them.

Which suggests to me that this should be a separate config field – like namespace (this is what we are calling prefix so far in this issue). Essentially the 3 solutions I see are:

  1. We take the namespace from the author field. This forces us to be named Standard and also forces community members to be named Standard sometimes (i.e. when they are appreciated, we ask them to forgo authorship and attribution) – which I think will have negative effects on the community.
  2. We take the namespace from the author field, but call it Enso for us and cancel the idea of including great community libs under this umbrella. This seems clean, if we can find other ways to prominently feature a library.
  3. We take the namespace from a separate config field – this is a bit clunky, but also works well with all the different scenarios.

@kustosz
Copy link
Contributor

kustosz commented Jun 8, 2021

We've discussed the issue with @wdanilo and the resolution is: the author prefix is going to be a separate entity in our libraries system, akin to GitHub's organizations, allowing users to alias their libraries with some organizational name, but defaulting to the username.

@radeusgd
Copy link
Member Author

radeusgd commented Jun 8, 2021

I've been told that there were discussions that we want to be able to create aliases for library names, like for example when a user Foo creates a library Foo.Bar but at some point we want it to be available under Standard.Bar too.

I'm not sure if I understand why we want this to be an alias instead of just publishing new versions of the library under a new name. Introducing the alias can be 1. confusing - is Standard.Bar:1.2.3 the same thing as Foo.Bar:1.2.3 or a different thing? How do I know? What if in a single project one time I import using Standard.Bar and in other places using Foo.Bar? and 2. complicates the resolution logic significantly. This is not something that can be just 'handled' by the repository - the resolution logic that is performed on the local machine must also account for this, for example if one project does import Foo.Bar and another does import Standard.Bar it somehow needs to know that both should point to the same library, so we would need to add some configuration for this kind of mapping just for this one particular edge case.

If we really want to have these aliases (although in my opinion this can be confusing to users as written above), I'd suggest not adding any logic for handling that just for this special case, but instead handling it by just creating copies of the library - so that when we assign Foo.Bar to the Standard, we could add special logic (only) to our upload service such that when Foo is uploading the library Bar it would upload it and make a copy renamed to Standard.Bar and simply contain two copies of the library. This way at least the client side logic would not need to account for these resolution rules.

But I would mostly suggest just renaming future versions of the library to Standard.Bar to avoid confusion which one is which (while still of course keeping the authors field unmodified).

@wdanilo AFAIK you were suggesting these aliases, what do you think about the above?


I fully agree with Ara here. Let's be config-less if that is possible. Telling which library belongs to "Standard" is our job (the core team). This should not live in user-maintaining configs. Instead, there should be a list of libraries on the server that will just contain all names of libs we consider to be "standard". This list would be updated by us.

It is impossible to do this in a completely 'configless' way - at some place there needs to be this configuration file and it cannot be just on our servers - it must somehow reside on the local machine as the local machine must know how to resolve Standard.Bar. That is why in my opinion the best place to keep this information is in the package.yaml file, because introducing another config file (the standard list) that must be downloaded from our servers and defines additional logic for handling this one special case is something that I thought we wanted to avoid (adding special logic, on the clientside, just for this one single special case).

I guess an alternative solution could be to be just based on the directory, i.e. if the package is located in libraries/Standard/Foo/1.2.3/package.yaml then we know that this library has the Standard prefix. That sounds good to me because we avoid duplicating this information inside of package.yaml (although we duplicate the library name and version anyway). On the other hand I'm not sure if basing this kind of information just on the directory is a good solution. It may be especially problematic for user-created libraries. If the library is created within the IDE, the IDE can take care to put it in the right location (so that the directory-based prefix matches the username). But in the CLI we may not be able to enforce this directory structure (as the user may put the code wherever they want).

@radeusgd
Copy link
Member Author

radeusgd commented Jun 8, 2021

One more thought:

Please note that the prefix is not only a notion of a library but also of a project - to have an unified import syntax, the project's module should also have some kind of a prefix. I thought that initially we wanted to go for the project's prefix to also be correlated with the author's username or if they are not logged in, fall back to some default.

But if we don't want to include the prefix in the configuration, then how will we infer the prefix for a given project? For libraries we could check their directory structure (e.g. a library stored in parent directory Foo can be thought to have prefix Foo), we could even do the same for projects created by the IDE (just store them under ~/enso/projects/prefix-name/project-name), but we definitely cannot do this for projects created using the CLI. So either we need to allow to configure the prefix somehow or we need to have a special reserved name that refers to 'the current project' (since we cannot refer to other projects which are not libraries, this could be enough).

Which one do you think we should go for? @wdanilo @iamrecursion @MichaelMauderer @kustosz I think your input could be very valuable here.

@kustosz
Copy link
Contributor

kustosz commented Jun 9, 2021

The more I think about the default prefix, the more I think we should also implement the import .My_Mod syntax as part of this task. This obviates the need for both author and library name and would allow us to import other files from the current project before it is designated as a library. Would also make the project absolutely name- and author-change proof. @wdanilo please say something here :)

@kustosz
Copy link
Contributor

kustosz commented Jun 10, 2021

So we've discussed the above with Wojciech and the resolution is: this piece of syntax should be import project.My_Mod. And cc @radeusgd because this indeed means that the project prefix is special (replaces both namespace and project name in import)

@kustosz kustosz closed this as completed Jun 25, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
--breaking Important: a change that will break a public API or user-facing behaviour p-high Should be completed in the next sprint
Projects
None yet
Development

No branches or pull requests

4 participants