Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Mill cannot handle forward slash in crossValue #2835

Closed
jackkoenig opened this issue Oct 9, 2023 · 16 comments · Fixed by #2984 or #2986
Closed

Mill cannot handle forward slash in crossValue #2835

jackkoenig opened this issue Oct 9, 2023 · 16 comments · Fixed by #2984 or #2986
Milestone

Comments

@jackkoenig
Copy link

jackkoenig commented Oct 9, 2023

I am using mill version 0.11.5.

My main goal is handling a nested hierarchy of files with Dynamic Cross Modules, but I am running into issues where attempting to use relative paths as cross values causes mill to error out.

Consider the following:

import mill._
object foo extends Cross[Foo]("a", "b/c")
trait Foo extends Cross.Module[String] {
  def value = T { crossValue }
}

Mill will happily build these cross modules:

$ mill resolve foo._
foo[a]
foo[b/c]

But when you try to do Tasks that cache to disk, I think Mill is hitting an internal error due to the /:

$ mill foo[_].value
Exception in thread "MillServerActionRunner" os.PathError$InvalidSegment: [b/c] is not a valid path segment. [/] is not a valid character to appear in a path segment. If you want to parse an absolute or relative path that may have multiple segments, e.g. path-strings coming from external sources use the Path(...) or RelPath(...) constructor calls to convert them. 
        at os.BasePath$.fail$1(Path.scala:125)
        at os.BasePath$.checkSegment(Path.scala:131)
        at os.PathChunk$StringPathChunk.<init>(Path.scala:25)
        at os.PathChunk$.StringPathChunk(Path.scala:24)
        at mill.eval.EvaluatorPaths$.$anonfun$resolveDestPaths$3(EvaluatorPaths.scala:31)
        at scala.collection.immutable.List.map(List.scala:250)
        at scala.collection.immutable.List.map(List.scala:79)
        at os.PathChunk$SeqPathChunk.<init>(Path.scala:46)
        at os.PathChunk$.SeqPathChunk(Path.scala:43)
        at mill.eval.EvaluatorPaths$.resolveDestPaths(EvaluatorPaths.scala:31)
        at mill.eval.GroupEvaluator.evaluateGroupCached(GroupEvaluator.scala:162)
        at mill.eval.GroupEvaluator.evaluateGroupCached$(GroupEvaluator.scala:44)
        at mill.eval.EvaluatorImpl.evaluateGroupCached(EvaluatorImpl.scala:15)
        at mill.eval.EvaluatorCore.$anonfun$evaluate0$2(EvaluatorCore.scala:116)
        at scala.concurrent.impl.Promise$Transformation.run(Promise.scala:467)
        at mill.eval.ExecutionContexts$RunNow$.execute(ExecutionContexts.scala:13)
        at scala.concurrent.impl.Promise$Transformation.submitWithValue(Promise.scala:429)
        at scala.concurrent.impl.Promise$DefaultPromise.submitWithValue(Promise.scala:338)
        at scala.concurrent.impl.Promise$DefaultPromise.dispatchOrAddCallbacks(Promise.scala:312)
        at scala.concurrent.impl.Promise$DefaultPromise.map(Promise.scala:182)
        at mill.eval.EvaluatorCore.$anonfun$evaluate0$1(EvaluatorCore.scala:92)
        at mill.eval.EvaluatorCore.$anonfun$evaluate0$1$adapted(EvaluatorCore.scala:90)
        at scala.collection.immutable.Vector.foreach(Vector.scala:2124)
        at mill.eval.EvaluatorCore.evaluate0(EvaluatorCore.scala:90)
        at mill.eval.EvaluatorCore.$anonfun$evaluate$1(EvaluatorCore.scala:43)
        at scala.util.DynamicVariable.withValue(DynamicVariable.scala:59)
        at mill.eval.EvaluatorCore.evaluate(EvaluatorCore.scala:34)
        at mill.eval.EvaluatorCore.evaluate$(EvaluatorCore.scala:26)
        at mill.eval.EvaluatorImpl.evaluate(EvaluatorImpl.scala:15)
        at mill.main.RunScript$.evaluateNamed(RunScript.scala:38)
        at mill.main.RunScript$.$anonfun$evaluateTasksNamed$2(RunScript.scala:26)
        at scala.util.Either.map(Either.scala:382)
        at mill.main.RunScript$.evaluateTasksNamed(RunScript.scala:26)
        at mill.runner.MillBuildBootstrap$.evaluateWithWatches(MillBuildBootstrap.scala:399)
        at mill.runner.MillBuildBootstrap.$anonfun$processFinalTargets$3(MillBuildBootstrap.scala:308)
        at scala.util.DynamicVariable.withValue(DynamicVariable.scala:59)
        at mill.runner.MillBuildBootstrap.processFinalTargets(MillBuildBootstrap.scala:308)
        at mill.runner.MillBuildBootstrap.evaluateRec(MillBuildBootstrap.scala:196)
        at mill.runner.MillBuildBootstrap.evaluate(MillBuildBootstrap.scala:48)
        at mill.runner.MillMain$.$anonfun$main0$6(MillMain.scala:234)
        at mill.runner.Watching$.watchLoop(Watching.scala:27)
        at mill.runner.MillMain$.$anonfun$main0$1(MillMain.scala:219)
        at scala.util.DynamicVariable.withValue(DynamicVariable.scala:59)
        at scala.Console$.withErr(Console.scala:193)
        at mill.api.SystemStreams$.$anonfun$withStreams$2(SystemStreams.scala:62)
        at scala.util.DynamicVariable.withValue(DynamicVariable.scala:59)
        at scala.Console$.withOut(Console.scala:164)
        at mill.api.SystemStreams$.$anonfun$withStreams$1(SystemStreams.scala:61)
        at scala.util.DynamicVariable.withValue(DynamicVariable.scala:59)
        at scala.Console$.withIn(Console.scala:227)
        at mill.api.SystemStreams$.withStreams(SystemStreams.scala:60)
        at mill.runner.MillMain$.main0(MillMain.scala:101)
        at mill.runner.MillServerMain$.main0(MillServerMain.scala:83)
        at mill.runner.MillServerMain$.main0(MillServerMain.scala:35)
        at mill.runner.Server.$anonfun$handleRun$1(MillServerMain.scala:187)
        at java.base/java.lang.Thread.run(Thread.java:833)

If I had to guess, I would guess that

val targetPath = workspacePath / segmentStrings
should be converting the Seq[String] to a os.RelPath.

@lefou
Copy link
Member

lefou commented Oct 10, 2023

I don't think allowing a cross-value to open up multiple sub-paths is the right solution. We already map n cross-values to n sub-directories. Now, when we also allow the value itself to contain sub-directories, we can't no longer guarantee unique T.dest directories.

Here is my counter example, in which both cross module instances whould use the same T.dest path for their targets.

import mill._
object foo extends Cross[Foo](("a", "b/c"), ("a/b", "c"))
trait Foo extends Cross.Module2[String, String] {
  ...
}

Instead, we could encode the special / character in the path. But that would create less nice cache directory names.

@jackkoenig
Copy link
Author

jackkoenig commented Oct 10, 2023

Fair point, and thanks for the counter example.

Instead, we could encode the special / character in the path. But that would create less nice cache directory names.

I think this also highlights that maybe there are two asks here:

  1. Support / in Strings used for Cross.Modules. Perhaps with a special character as you said
  2. Support os.RelPath (or some mill built-in alternative) as the type of Cross.Modules (via implementation of `ToSegments).

os.RelPath could have the behavior I want since there's no chance of collision in that case (and is probably what people should be steered toward if they are creating Strings with / in the name). Per (1), I think a special character is fine. Honestly, even just erroring and pointing people toward using os.RelPath would be reasonable since I suspect that would handle most use cases.

EDIT: my above proposal for os.RelPath hits the exact same issue you pointed out for Strings with /. Another option is to append each path segment in os.RelPath with $, while leaving the final segment without it, eg.

implicit object RelPathToPathSegment extends mill.define.Cross.ToSegments[os.RelPath](
  p => p.segments.init.map(_ + "$") ++: List(p.segments.last)
)
object fizz extends Cross[Fizz]((os.RelPath("a/b"), os.RelPath("c/d/e")), (os.RelPath("a/b/c"), os.RelPath("d/e")))
trait Fizz extends Cross.Module2[os.RelPath, os.RelPath]
$ mill resolve fizz._
fizz[a$,b$,c,d$,e]
fizz[a$,b,c$,d$,e]

@jackkoenig
Copy link
Author

FWIW, I'm able to deal with this in my own code with:

implicit object RelPathToPathSegment extends mill.define.Cross.ToSegments[os.RelPath](_.segments.toList)

@lefou
Copy link
Member

lefou commented Oct 10, 2023

I strongly suggest to use os.SubPath instead of os.RelPath, to avoid any unwanted effects by escaping the base, e.g. by using ...

@jackkoenig
Copy link
Author

I didn't realize SubPath existed, thanks for the tip!

@lihaoyi
Copy link
Member

lihaoyi commented Oct 12, 2023

I think it should be ok to add a ToSegment[os.SubPath]. After all, we already have ToSegment[Seq[T: ToSegment]] and ToSegment[List[T: ToSegment]], which have the same failure modes and potential for path conflicts.

Since we can't give strong guarantees here anyway, might as well make it a bit more convenient in some common use cases

@lefou
Copy link
Member

lefou commented Oct 12, 2023

Since we can't give strong guarantees here anyway, might as well make it a bit more convenient in some common use cases

Aside the strong guarantees, we at least assume uniqueness, right? Otherwise, we should implement collision detection when calculating metadata destination paths.

@jackkoenig
Copy link
Author

jackkoenig commented Oct 12, 2023

I think collision detection is necessary anyway. You can construct a collision already with Cross.Module2[Seq[String], Seq[String]]:

object buzz extends Cross[Buzz](Seq((Seq("a", "b"), Seq("c")), (Seq("a"), Seq("b", "c"))))
trait Buzz extends Cross.Module2[Seq[String], Seq[String]] {
  def func = T { T.dest }
}
$ mill resolve buzz._
buzz[a,b,c]

@lefou
Copy link
Member

lefou commented Oct 12, 2023

What is this? Since when does this work? I'm surprised and think, this wasn't even possible in Mill 0.10 / 0.9.

@jackkoenig
Copy link
Author

jackkoenig commented Oct 12, 2023

I'm using mill 0.11.5, you can also show the same thing with Cross.Module[Seq[Seq[String]]]:

object buzz extends Cross[Buzz](Seq(Seq(Seq("a", "b"), Seq("c")), Seq(Seq("a"), Seq("b", "c"))))
trait Buzz extends Cross.Module[Seq[Seq[String]]] {
  def func = T { T.dest }
}

@lefou
Copy link
Member

lefou commented Oct 12, 2023

I guess, this is due to the ToSegemnt type class mentioned by @lihaoyi above. I wonder, why we should have a generic version supporting sequences or lists. Without, the mapping should be pretty collision free. I think, it's something newer, introduced with 0.11. We should revisit and discuss the collision thing. I'd rather have some less automatic mapping, but a less error-prone setup.

@lihaoyi
Copy link
Member

lihaoyi commented Oct 13, 2023

Before 0.11.0, we basically only allowed Strings and tuples of Strings, and any enforcement was done at runtime (we took in Anys and did type tests to decide what to do). That was definitely too anemic a data model, so we expanded it. Without too much care or analysis, hence the chance for conflicts today.

One option we could go with is to allow primitives and Tuples of primitives, but allow Lists/Seqs/SubPaths only outside of Tuples, OR only allow Lists/Seqs/SubPath as the rightmost elements of the tuple. That would allow us to provide both convenience and strict-ish compile-time enforcement of conflict-free-ness. It will never be 100% - even passing in Strings we have to do runtime checks to avoid duplicates- but it would be better than what we have now. However, it seems like it would likely require a reworking of the ToSegment typeclass hierarchy to split it into two related typeclasses, which is a bin-compat breakage that can't land in 0.11.x

As a compromise, we could add a ToSegment[os.SubPath] today for convenience, add some basic asserts to enforce conflict-free-ness via runtime checks, and when 0.12.x rolls around we could refactor the typeclass hierarchy to move the conflict-free enforcement to compile time

TBH I think just doing runtime enforcement is good enough. "runtime" here is before tasks actually run, so it should give pretty quick feedback anyway, and we can make sure the error is good. No need to be too stuck to compile-time validation when in this case we will always need runtime checks anyway: having the keys defined at runtime is virtually the raison detre of cross modules in the first place!

@lefou
Copy link
Member

lefou commented Oct 13, 2023

@lihaoyi Yeah, I also think runtime-checks are good enough. We known all effective cross instances and their segments, so it's pretty easy and fast to check for collisions.

@jackkoenig wrote:

$ mill resolve buzz._
buzz[a,b,c]

Disregarding how many ways exists to configure the build to end up with the same resolved target name, we only need to make sure they don't collide. We only need a collision free mapping from a parseable string-representation (given at the CLI) to a cross instance, however it is constructed in the code. It's not relevant, if we could refactor the code but end up with the same target name. It might even a feature, as it allows to grow a build over time without adapting the surrounding knowlegde and tooling, as long as it fits.

Adding a ToSegment[os.SubPath] (but not ToSegment[os.RelPath]) and a runtime-check for collisions is the way to go, I think.

lefou added a commit that referenced this issue Jan 23, 2024
Also add an implicit `ToSegements[os.SubPath]` to accept `os.SubPath` as
values in cross modules.

Fix #2835

Pull request: #2984
@lefou
Copy link
Member

lefou commented Jan 23, 2024

With PR #2984, we closed this issue, as discussed. Please be aware, that the initial use case (a cross value as string containing a forward slash) is still not supported.

@lefou
Copy link
Member

lefou commented Jan 23, 2024

I opened #2986 to make slashes in crossValue (the OP use case) work.

@jackkoenig
Copy link
Author

Thanks for solving this @lefou!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment