From 451b43e6479e96db11ff544118c8552c5d584bb3 Mon Sep 17 00:00:00 2001 From: Deepti Gandluri Date: Thu, 25 Aug 2022 12:55:19 -0700 Subject: [PATCH 01/11] Add notes for 8/23 GC meeting --- gc/2022/GC-08-23.md | 184 ++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 184 insertions(+) diff --git a/gc/2022/GC-08-23.md b/gc/2022/GC-08-23.md index 236ccc8e..4060feb0 100644 --- a/gc/2022/GC-08-23.md +++ b/gc/2022/GC-08-23.md @@ -33,3 +33,187 @@ Installation is required, see the calendar invite. ## Meeting Notes ### Introduction of attendees + +- Francis McCabe +- Ben Titzer +- Asumu Takikawa +- Justin Michaud +- Mathias Liedtke +- Zalim Bashorov +- Alon Zakai +- Ilya Rezvov +- Deepti Gandluri +- Andreas Rossberg +- Ashley Nelson +- Jakob Kummerow +- Aske Simon Christensen +- Conrad Watt +- Luke Wagner +- Rick +- Michael Knyzek +- Emanuel Ziegler + +### Discussions + +#### Type annotations ([#27](https://github.com/WebAssembly/function-references/issues/27)) + +AZ: There is a future type system feature that might need annotations, we probably need an example of a future type system that will not be solved by adding annotations later. I don’t think we’ve seen examples of that + +AR: One thing to be clear about is that adding annotations later is not going to fit any problems that we run into later, we can’t retrofit the type annotations on existing instructions, completely depends on the nature of the problem we run into. In general I assume there are ways to work around them. In terms of an example, I can appreciate to see something concrete, I don’t think we have anything. One thing I also want to say is that for people who haven’t worked with advanced type systems, it’s very easy to run into problems that we don’t foresee. I don’t want to run this, either algorithmic complexity explodes, or we reach an unsolvable state, especially when we lead into type parameters. Either we have to cut/restrict features in that space, this is a warning I want to give as not an easy thing to decide. The main advantage to omitting is the size win + +DG: Validation complexity? + +AR: That was an assumption that was made, JK’s numbers didn’t show anything substantial in terms of complexity + +BT: Type systems are fragile, and we don’t know yet what future extensions we want, and we don’t want to design things out by leaving them out. There are also other tools that process Wasm code, tools that want to inspect how fields of a struct are used, all those tools then have to reconstruct the type systems, and more inference is required, more likely that the tools will all have the burden that want to process the bytecode + +AR: Some folks that do program analysis were really happy that they didn’t have to do type inference, otherwise you have to do non-local type reconstruction, it’s only simple when you’re looking at whole programs, some of them you’re looking at program fragments, we have to shuffle around a bunch of constraints if you only have partial knowledge of the program + +AZ: I appreciate that it makes the analysis simple, you do need to do analysis but it’s local the type is present on the stack, example a call instruction we don’t annotate the return type of the binary and you have to do a global look up. There is already a some amount of burden, but it’s not a large burden + +AR: This is a different type of burden, you need to know what all the indices that you can refer to what their types are, an invariant environment, where they are explicitly declared in the program, here you have to have context of the instruction sequence that can be potentially unbounded, depending on what the instructions sequence is, there’s no bound on how far you have to look back, that’s a sort of different dimension from looking a + +DG: Are there examples of tools that we can reach out to to look at what kinds analysys these tools do? Theoretically all sorts of analysys is possible, looking for something used in practise + +BT: There aren’t tools for Wasm GC, if you go by analogy, for Java bytecode, there are lots of bytecode analysis tools, examples: find me all uses of this field, or some analysis at the class file level, Java bytecode has enough annotation that you can do this + +AR: A generalization of this is a simple example of finding out where a type is used, or size of a certain type. If you have that in the instructions explicitly, it’s easy to have that. Certain transformations want to have this information, not sure we have any concrete examples of tools, I would be interested in hearing of them too + +DG: Even in a limited context if there are tools that perform some sort of analysis right now we could extrapolate what the costs would be, a lot of the discussions here about what sorts of analysis the tools would perform is also hypothetical right now + +AR: Yeah, there are a lot of hypotheticals right now, in the presence of hypotheticals, it’s safe to be conservative and type annotations are a conservative choice. We can always add unannotated versions of instructions later if we find out that there’s a problem that sizes are really too big in practice, once we don’t have them we can’t go back + +AZ: Can you clarify why it’s easier to add unannotated versions later vs. adding the annotated versions later? + +AR: When we run into problems with lack of annotations, we can’t add them because of backwards compatibility, because we’re tied into supporting these instructions forever. If there’s interference with some feature we want to add, then we basically a bad place. We have to resolve this somehow. The other way is more robust, the size optimization is always something you can add at later. +AZ: Why do we think we’d be at a point of time in the future where we think the risk is lower vs. now? Do you think we’ll come to a point where we’ve implemented all the tricky things in the type system? That seems unlikely. We’d always have the argument against adding them, we’ll always thinking it’s risky to add. Worry is always that we will keep suffering the 5% loss indefinitely, cost can be significant in the long run, and we’re choosing to opt into it, we might always say it’s risky to add them + +AR: there’s never a point to be sure, over time we might have more data about what size data is more relevant, personally I think 5% is not an issue but ymmv, but we will be able to gather more data on that. The other side is that some features that are at higher risk, we will have added already, we will know more about whether these risks materialize, there will always be potential more things there are very likely features that we have them, type parameters for example, once we have those I’d be somewhat more comfortablle + +JK: Having annotations make some tools simpler, don’t think that’s a convincing argument. In particular a bunch of arguments regarding tools don’t make much sense, if you want to look for all occurrences of something, then you have to look everywhere anyway because you can reconstruct the types in a single linear pass over the module, and you have to do a single linear pass over the module if you want to look for all occurrences of something, that’s not really a significant burden, it only affects tools that process fragments of Wasm functions. Do we really want to make regular production modules that ship to billions of users larger by x%, because there could be a tool that wants to analyze fragments, and that tool would be slightly simpler? I remain extremely unconvinced by that line of reasoning. + +BT: We are designing a format that has many different purposes, it’s a code format for interchange, we should prioritize all the tools that process that format, producers and consumers, and I don’t see how we justify that.. We could make the reverse argument, why aren’t there annotations that would be useful for tool authors and engines? From a different point of view that might be a hostile design decision + +AR: This is very real feedback from people that work in the research space, one of the top things that makes their life easier with Wasm was that there is no overload, which is the trivial case of the problem we’re discussing right now. We’re talking about a much more inconvenient relaxation.. I would also like to push back on the size arguments, it falls into the category of premature optimization, we haven’t really optimized for size as much previously, putting in adhoc solutions to individual small things, we could have had a const 0 inaructions, but we decided against it, we don’t have a negate instructions, and all these things we never decided to size optimized, we should do a more targeted approach instead of adhoc ideas, we should stay consistent with that + +BT: I think we would call this layer 1 compression, I see what we’re doing as we’re trying to use the type reconstruction algorithm as a way to compression, if we really want to use this, that might be really cool. That’s a compression mechanism that might not be available to a general purpose zip algorithm, if we were to continue looking at that idea, that could exist at another level, and we can have another tool that decompresses, still get the best of both worlds + +AZ: Fair point that the compression could be done at another level, we haven’t focused on micro optimizing things. Want to push back against some of these, while in theory we could do something in the space, but it would take time, it’s unclear that we would actually do this. Unclear that we would actually have the resources to work on this, there are things we could have done the past, but did not but those are things (const 0, const 1) that compress well, on the other hand, type annotations are high entropy thing and have a higher cost, this stuff does matter, we have a lot of use cases on teh web where there are limited on the size of the download, while it doesn’t matter in some countries with unlimited bandwidth, other countries it isn’t. 5% is significant, lots of efforts to reduce code size overall and ever % matters. IT is likely that if we don’t save this now, it’s possible that we’ll be stuck with this for a very long time and that would be wasteful. + +DG: We’ve had discussions on both sides of this, one of the things that I’d like to circle back on, was that there are future features that there is a point of time in the future that we might feel comfortable about not having type annotations, you mentioned type parameters, maybe we can dig into that a little more what are other examples of future features? It would be nice to quantify that, at what point of time do we think the risk is lower, or is there a feature set you have in mind? + +AR: It’s really a spectrum, certainty is increasing monotonically, at some point we can have a more informed decision about this. What the features would be, or what the timeline would be, I would make no predictions at this point given how long we’ve worked on at this point + +DG: phrased this wrong, maybe looking for more of an opinion, the point that this could be unbounded w.r.t to time/features it seems uncomfortable at a time that we see the binary size being bloated that we want to make sure that we’re not adding size increase in unless necessary + +FM: What would be the rationale for having type parameters? I’m not sure that there is one + +AR: Currently there is no way to create code that operates the same on any reference, where you don’t really care what the reference is. This is a frequent thing for compilers, in Wasm you can’t express them because every reference has to be concretely defined, i view this feature as giving a way to generate this kind of code that native code compilers are able to generate + +FM: You can use any type + +AR: But then you have a lot of casts, examples: polymorphic or dynamic languages. Main use case is all the languages which operate with a uniform representation. In many cases they don’t care about what the type is, but right now you have to go to the top type and go back. There are a lots of casts, will be expensive in practice. This is the type of cast you can get rid of because they’re only there because the type system of Wasm is unique.This is one very obvious case where we should be better without going overboard. + +LW: Wanted to make the case for taking the layer 1 approach seriously, if we compress in some cases but not others, we’ll keep motivating ourselves not to work on this. There are significant wins to be had over the current status quo to doing this consistently. Doing individual small optimizations motivates us less to do this.Separately I was talking about this opcode table approach, where you define single bytes that could expand to opcodes and some fixed set of immediates, the idea being that you could do it inline in the decoder instead of the streaming phase, we abandoned it because we didn’t have time, and that we only got size savings with gzip and not brotli. The hope was that in the future that a simplistic scheme that we could implement to get rid of redundant immediates, if the normal bytecode stake could use this scheme, it could result in savings at multiple stages + +BT: I’ll +1 to that, the tools want annotations and we might want to keep the annotations indefinitely, but have the tools do the compression. + +DG: That’s a reasonable answer, that assumes that we do the work to handle compression at a different layer, and that the tools do the right thing and we have the tools in place to do that right? + +BT: Right, Isee layer 1 compression as the thing that’ll give us the best of both worlds. We could use the Wasm type system as the type compression, and still have type annotation in the bytecode for future tools, and future feature extensions + +DG: Any other thoughts? + +CW: Restate something came up implicitly, in a couple of years from now, if we do have a more solidified view of what extensions we want in the future, we can still add annotationless versions in the future with essentially the same code size savings. + +AR: That’s the conservativeness argument + +DG: It doesn’t have to be an example for this particular set, but is there an example that you could point to of something that was broken, looking for something that’s not as hypothetical as future problems, extensions, but even a toy type system example that can run into this problem + +AR: Select instruction is an example it didn’t have annotations, and once we added subtyping we realized we have to compute LUBs now we did this hack, where it was moderately ugly, we’ve already run into this problem, the br_table was also simple but hack not so ugly, simple example that everyone can understand. With the kind of things we’re talking about now, the problems would be much more intricate. IT’s not completely generic types, it’s very specific types + +FM: Is it fair to say that you run into trouble when you have to do the LUB? + +AR: Depends on how you define trouble, it’s very expensive, depending on how large your types are. In Wasm, we require that all types that occur in validation are defined, when you compute an LUB you may have a type that doesn’t have a definition, doesn’t have an index it can refer to. You have to extend the whole type system with synthesized types, that’s a whole new level of complexity as well which we do avoid right now + +BT: Ran into this problem for func.bind, which has a synthesized type, we run into the same issue with any type of inference that it creates types that you haven’t written down. It’s because we have this subtyping relationship that we have to declare, if you haven’t written one of the subtypes in the chain, it’s essentially a synthesized type, and how is that type canonicalized? We don’t have to mandate type canonicalization + +AR: LUBs only work when you have proper lattice in the type system, or at least a semi lattice. For br_table you need GLBs (?) so you would need a full lattice, that’s a severe constraint. There are cases where you can’t synthesize a proper type, there is no principle type that you could synthesize, it’s not clear how you would solve this problem + +DG: The select, and br_table examples make sense, but the larger picture of exactly what problems we would run into is still hard for me to grasp. + +BT: Look at the func.bind example, Andreas had an example of how this would work, without annotations we would have to do this operation, I’m not sure what the complexity would be it would have made it easier to have annoptations on func.bind + +JK: We can always require annotations in places where annotations are required, for example we could say that the hypothetical struct.get is valid if and only if the type of the thing on the value type is uniquely defined. If that’s not the case, a new instruction has to be used, or an explicit ref.cast or something else would need to be introduced + +AR: Highlighted the problem again, that’s what doesn’t work. The condition under which.. You would need to be able to decide whether the type is principled or not, and deciding that as complex as as deriving all tht types that are possible. You have a decision process that is arbitrarily expensive + +JK: To validate the annotation, which you have to do in an engine, you do have to check the type on the stack, for example, you have to check is there even a struct to do struct.get on that, and if you can tell easily enough whether you have a struct at all, then presumably you can always tell whether that has a type parameter or not + +AR: It’s more complicated than that. In general, do you have a principled type or not? Is there a most generic type that covers all possibilities. If you don’t have that then you need back checking in general. This is a thing you don’t want to do. To even decide whether your type is principled, that’s not something you can do without backtracking either. You have to explore all the search space where you could have decided otherwise. I have seen systems that track the principality of the types used, but that is very research, I don’t know how well it generalizes. In general, the assumption that you can make this distinction doesn’t work + +BT: In practise what would happen is that the annotation less instruction become less and less useful over time, because you add features that can’t use them, and then you have to come up with rules for when they can be used and not and they can be very subtle + +AR: Because you’re very likely to use properties like substitutibility, if you have very adhoc restrictionson what types are allowed somewhere, one essential rule of subtyping is that whenever you can use a type, you can always use a subtype, if you break the property, you’ve broken the entire type system, so thes rules have to work in a way that you don’t break the property, I’m not sure that it’s possible + +JK: We can make the rules as simple as we want them to be, not saying that’s a particularly good design, but just saying that we do have the power to make the simplest rules possible + +AR: That would make merging modules broken + +JK: Merging modules is already broken, you can’t merge a GC enabled module into a non-GC enabled module and then expect it to run on a non-GC enabled engines that the old module can run on, module merging argument would rule out adding any feature ever + +AR: I disagree, that doesn’t make sense + +BT: Obviously if you merge modules, then you have to run on engines that support all those modules + +CW: To flip it around, if we want to make the point that the annotation less instructions are reasonable, the check to distinguish whether we need an annotation or not is not too bad, I think we’ll be in a better place to make our judgment. At the point that we’re past the GC MVP, at some point we are going to care about code size. This shouldn’t lock us out about caring about code size, but we’re trying to hold together a lot of discussions about the type system right now, it makes sense to be conservative to get the MVP through the door. + +BT: The rules are exactly the same whether you need an annotation or not, to determine when you can compress, either way we can end up with different opcodes in the encoding + +CW: At some point we care about code size, then we can decide the rules of when an annotation is ok and when it isn’t. Give different opcodes and we get the same size opcodes + +DG: The concern I have with that is that some point is very indefinitely into the future and we’re not able to define what that point could be, hypothetically we could be doing several things with layer 1 compression, just want to mention the other side that the indefinite period is concerning + +AR: That is true of any feature, if you consider size savings features, then different features have different priorities, and as features are requested more and more they become higher priority, so let’s see where we land + +BT: Can believe space is a priority, there are other techniques that are just valuable, if code size is a high priority, we can always do things to factor them out, tooling things that would help in this domain. Giving it a full try there, it was mentioned in the thread, Java module compiled to JS vs WebASsembly GC have a 2x space gap, there are probably more reasons that the immediates to struct.get to that. There are different design decisions that can be made to save code space too + +AR: In general if you want to optimize for code size, you have to do that in a targeted way at a higher level, this is a problem we can’t solve without having domain specific abstractions + +JK: BT says if you really cares you can sacrifice performance for it, and AR says well if you really care you can do something other than Wasm. Still unconvinced about this argument + +FM: Pushing back argument the in-browser toolchain argument, not consistent with CSP, the CSP policy applies to the module that has to instantiate, it strongly limits the type of things we can do at the tooling level + +AR: Not sure how to respond to that, if that was a constraint we want to work around, then we’d end up turning Wasm into a source language. If we can’t add additional layers of abstraction, then we’re doomed + +FM: We looked at doing macros, the macro question keeps coming up, but as it happened that I was working on CSP at the time, it causes sufficient pain to drop it at the time, it’s possible we could fix it + +CW: To me the core of what’s going on is the meta question, of when we want to do that worrying, can we defer the eat the 5% and come back to it later + +BT: Is 5% shippable? + +AZ: 5% is not going to make the difference between shipping and not, it could be something we suffer for a long time, it affects everyone, not just one subset of people working on toolchains/specs, there are risks to people across the board and we have to consider that + +BT: I do want to discount that, it is a binary question of can we ship, that determines whether we want to remove annotations +AZ: I think we could ship either way, just one way would be less good + +Poll: **We should remove type annotations from array and struct accessor instructions rather than add a type annotation to call_ref.** + +Non binding poll, we should do F/N/A + +|F|N|A| +|-|-|-| +|3|5|6| + +Other comments: +JM: Neutral: Code size is a big problem now, and we should prioritize end users over toolchains, but the worries about future type system enhancements seem to have a lot of merit + +AT: Weak preference in my case, I agree with picking the conservative option for now while investigating other ways to save space and possibly revisiting this + +BT: I don’t think we should remove annotations at all, but handle compression at a different layer. + +AR: What are next steps? + +DG: Could be many options, do we thing this is sufficient consensus or want another poll? With majority vote against for this poll, do we need a separate poll to add annotations for call_ref? Let’s follow up on the issue as we’re out of time. + +### Closure From a1e042967829009fa7599a360a4338be3d8be476 Mon Sep 17 00:00:00 2001 From: Deepti Gandluri Date: Thu, 25 Aug 2022 13:28:26 -0700 Subject: [PATCH 02/11] Update gc/2022/GC-08-23.md Co-authored-by: Thomas Lively <7121787+tlively@users.noreply.github.com> --- gc/2022/GC-08-23.md | 1 + 1 file changed, 1 insertion(+) diff --git a/gc/2022/GC-08-23.md b/gc/2022/GC-08-23.md index 4060feb0..b78e43b0 100644 --- a/gc/2022/GC-08-23.md +++ b/gc/2022/GC-08-23.md @@ -195,6 +195,7 @@ BT: Is 5% shippable? AZ: 5% is not going to make the difference between shipping and not, it could be something we suffer for a long time, it affects everyone, not just one subset of people working on toolchains/specs, there are risks to people across the board and we have to consider that BT: I do want to discount that, it is a binary question of can we ship, that determines whether we want to remove annotations + AZ: I think we could ship either way, just one way would be less good Poll: **We should remove type annotations from array and struct accessor instructions rather than add a type annotation to call_ref.** From 184ebac44a7fe2badb3627d771bd797d0e13a60e Mon Sep 17 00:00:00 2001 From: Deepti Gandluri Date: Thu, 25 Aug 2022 13:28:47 -0700 Subject: [PATCH 03/11] Update gc/2022/GC-08-23.md Co-authored-by: Thomas Lively <7121787+tlively@users.noreply.github.com> --- gc/2022/GC-08-23.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gc/2022/GC-08-23.md b/gc/2022/GC-08-23.md index b78e43b0..68076903 100644 --- a/gc/2022/GC-08-23.md +++ b/gc/2022/GC-08-23.md @@ -130,7 +130,7 @@ AR: That’s the conservativeness argument DG: It doesn’t have to be an example for this particular set, but is there an example that you could point to of something that was broken, looking for something that’s not as hypothetical as future problems, extensions, but even a toy type system example that can run into this problem -AR: Select instruction is an example it didn’t have annotations, and once we added subtyping we realized we have to compute LUBs now we did this hack, where it was moderately ugly, we’ve already run into this problem, the br_table was also simple but hack not so ugly, simple example that everyone can understand. With the kind of things we’re talking about now, the problems would be much more intricate. IT’s not completely generic types, it’s very specific types +AR: Select instruction is an example it didn’t have annotations, and once we added subtyping we realized we have to compute LUBs now we did this hack, where it was moderately ugly, we’ve already run into this problem, the br_table was also simple but hack not so ugly, simple example that everyone can understand. With the kind of things we’re talking about now, the problems would be much more intricate. It’s not completely generic types, it’s very specific types FM: Is it fair to say that you run into trouble when you have to do the LUB? From afa68cf94c780bd63ad7035da8b352ad0745b6e4 Mon Sep 17 00:00:00 2001 From: Deepti Gandluri Date: Thu, 25 Aug 2022 13:28:58 -0700 Subject: [PATCH 04/11] Update gc/2022/GC-08-23.md Co-authored-by: Thomas Lively <7121787+tlively@users.noreply.github.com> --- gc/2022/GC-08-23.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gc/2022/GC-08-23.md b/gc/2022/GC-08-23.md index 68076903..57a237fe 100644 --- a/gc/2022/GC-08-23.md +++ b/gc/2022/GC-08-23.md @@ -120,7 +120,7 @@ BT: I’ll +1 to that, the tools want annotations and we might want to keep the DG: That’s a reasonable answer, that assumes that we do the work to handle compression at a different layer, and that the tools do the right thing and we have the tools in place to do that right? -BT: Right, Isee layer 1 compression as the thing that’ll give us the best of both worlds. We could use the Wasm type system as the type compression, and still have type annotation in the bytecode for future tools, and future feature extensions +BT: Right, I see layer 1 compression as the thing that’ll give us the best of both worlds. We could use the Wasm type system as the type compression, and still have type annotation in the bytecode for future tools, and future feature extensions DG: Any other thoughts? From fdb714c754e7a97c84735d7f43ae1c0333e28cdf Mon Sep 17 00:00:00 2001 From: Deepti Gandluri Date: Thu, 25 Aug 2022 13:29:09 -0700 Subject: [PATCH 05/11] Update gc/2022/GC-08-23.md Co-authored-by: Thomas Lively <7121787+tlively@users.noreply.github.com> --- gc/2022/GC-08-23.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gc/2022/GC-08-23.md b/gc/2022/GC-08-23.md index 57a237fe..9fc81635 100644 --- a/gc/2022/GC-08-23.md +++ b/gc/2022/GC-08-23.md @@ -108,7 +108,7 @@ DG: phrased this wrong, maybe looking for more of an opinion, the point that thi FM: What would be the rationale for having type parameters? I’m not sure that there is one -AR: Currently there is no way to create code that operates the same on any reference, where you don’t really care what the reference is. This is a frequent thing for compilers, in Wasm you can’t express them because every reference has to be concretely defined, i view this feature as giving a way to generate this kind of code that native code compilers are able to generate +AR: Currently there is no way to create code that operates the same on any reference, where you don’t really care what the reference is. This is a frequent thing for compilers, in Wasm you can’t express them because every reference has to be concretely defined, I view this feature as giving a way to generate this kind of code that native code compilers are able to generate FM: You can use any type From 8399e70f498325dad72776265d9dc87a6e7f6e00 Mon Sep 17 00:00:00 2001 From: Deepti Gandluri Date: Thu, 25 Aug 2022 13:29:25 -0700 Subject: [PATCH 06/11] Update gc/2022/GC-08-23.md Co-authored-by: Thomas Lively <7121787+tlively@users.noreply.github.com> --- gc/2022/GC-08-23.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gc/2022/GC-08-23.md b/gc/2022/GC-08-23.md index 9fc81635..2c3afcaf 100644 --- a/gc/2022/GC-08-23.md +++ b/gc/2022/GC-08-23.md @@ -100,7 +100,7 @@ BT: I think we would call this layer 1 compression, I see what we’re doing as AZ: Fair point that the compression could be done at another level, we haven’t focused on micro optimizing things. Want to push back against some of these, while in theory we could do something in the space, but it would take time, it’s unclear that we would actually do this. Unclear that we would actually have the resources to work on this, there are things we could have done the past, but did not but those are things (const 0, const 1) that compress well, on the other hand, type annotations are high entropy thing and have a higher cost, this stuff does matter, we have a lot of use cases on teh web where there are limited on the size of the download, while it doesn’t matter in some countries with unlimited bandwidth, other countries it isn’t. 5% is significant, lots of efforts to reduce code size overall and ever % matters. IT is likely that if we don’t save this now, it’s possible that we’ll be stuck with this for a very long time and that would be wasteful. -DG: We’ve had discussions on both sides of this, one of the things that I’d like to circle back on, was that there are future features that there is a point of time in the future that we might feel comfortable about not having type annotations, you mentioned type parameters, maybe we can dig into that a little more what are other examples of future features? It would be nice to quantify that, at what point of time do we think the risk is lower, or is there a feature set you have in mind? +DG: We’ve had discussions on both sides of this, one of the things that I’d like to circle back on, was that there are future features that there is a point of time in the future that we might feel comfortable about not having type annotations, you mentioned type parameters, maybe we can dig into that a little more what are other examples of future features? It would be nice to quantify that, at what point of time do we think the risk is lower, or is there a feature set you have in mind? AR: It’s really a spectrum, certainty is increasing monotonically, at some point we can have a more informed decision about this. What the features would be, or what the timeline would be, I would make no predictions at this point given how long we’ve worked on at this point From b06bdd691fa6aaf2c3dfdf39a7c3c8e14a37eb5c Mon Sep 17 00:00:00 2001 From: Deepti Gandluri Date: Thu, 25 Aug 2022 13:29:39 -0700 Subject: [PATCH 07/11] Update gc/2022/GC-08-23.md Co-authored-by: Thomas Lively <7121787+tlively@users.noreply.github.com> --- gc/2022/GC-08-23.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gc/2022/GC-08-23.md b/gc/2022/GC-08-23.md index 2c3afcaf..559b937e 100644 --- a/gc/2022/GC-08-23.md +++ b/gc/2022/GC-08-23.md @@ -98,7 +98,7 @@ AR: This is very real feedback from people that work in the research space, one BT: I think we would call this layer 1 compression, I see what we’re doing as we’re trying to use the type reconstruction algorithm as a way to compression, if we really want to use this, that might be really cool. That’s a compression mechanism that might not be available to a general purpose zip algorithm, if we were to continue looking at that idea, that could exist at another level, and we can have another tool that decompresses, still get the best of both worlds -AZ: Fair point that the compression could be done at another level, we haven’t focused on micro optimizing things. Want to push back against some of these, while in theory we could do something in the space, but it would take time, it’s unclear that we would actually do this. Unclear that we would actually have the resources to work on this, there are things we could have done the past, but did not but those are things (const 0, const 1) that compress well, on the other hand, type annotations are high entropy thing and have a higher cost, this stuff does matter, we have a lot of use cases on teh web where there are limited on the size of the download, while it doesn’t matter in some countries with unlimited bandwidth, other countries it isn’t. 5% is significant, lots of efforts to reduce code size overall and ever % matters. IT is likely that if we don’t save this now, it’s possible that we’ll be stuck with this for a very long time and that would be wasteful. +AZ: Fair point that the compression could be done at another level, we haven’t focused on micro optimizing things. Want to push back against some of these, while in theory we could do something in the space, but it would take time, it’s unclear that we would actually do this. Unclear that we would actually have the resources to work on this, there are things we could have done the past, but did not but those are things (const 0, const 1) that compress well, on the other hand, type annotations are high entropy thing and have a higher cost, this stuff does matter, we have a lot of use cases on the web where there are limited on the size of the download, while it doesn’t matter in some countries with unlimited bandwidth, other countries it isn’t. 5% is significant, lots of efforts to reduce code size overall and ever % matters. It is likely that if we don’t save this now, it’s possible that we’ll be stuck with this for a very long time and that would be wasteful. DG: We’ve had discussions on both sides of this, one of the things that I’d like to circle back on, was that there are future features that there is a point of time in the future that we might feel comfortable about not having type annotations, you mentioned type parameters, maybe we can dig into that a little more what are other examples of future features? It would be nice to quantify that, at what point of time do we think the risk is lower, or is there a feature set you have in mind? From 1d385282349573ea5287f8207f4fe8d3b09a9604 Mon Sep 17 00:00:00 2001 From: Deepti Gandluri Date: Thu, 25 Aug 2022 13:29:58 -0700 Subject: [PATCH 08/11] Update gc/2022/GC-08-23.md Co-authored-by: Thomas Lively <7121787+tlively@users.noreply.github.com> --- gc/2022/GC-08-23.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gc/2022/GC-08-23.md b/gc/2022/GC-08-23.md index 559b937e..2e4d2d11 100644 --- a/gc/2022/GC-08-23.md +++ b/gc/2022/GC-08-23.md @@ -94,7 +94,7 @@ JK: Having annotations make some tools simpler, don’t think that’s a convinc BT: We are designing a format that has many different purposes, it’s a code format for interchange, we should prioritize all the tools that process that format, producers and consumers, and I don’t see how we justify that.. We could make the reverse argument, why aren’t there annotations that would be useful for tool authors and engines? From a different point of view that might be a hostile design decision -AR: This is very real feedback from people that work in the research space, one of the top things that makes their life easier with Wasm was that there is no overload, which is the trivial case of the problem we’re discussing right now. We’re talking about a much more inconvenient relaxation.. I would also like to push back on the size arguments, it falls into the category of premature optimization, we haven’t really optimized for size as much previously, putting in adhoc solutions to individual small things, we could have had a const 0 inaructions, but we decided against it, we don’t have a negate instructions, and all these things we never decided to size optimized, we should do a more targeted approach instead of adhoc ideas, we should stay consistent with that +AR: This is very real feedback from people that work in the research space, one of the top things that makes their life easier with Wasm was that there is no overload, which is the trivial case of the problem we’re discussing right now. We’re talking about a much more inconvenient relaxation. I would also like to push back on the size arguments, it falls into the category of premature optimization, we haven’t really optimized for size as much previously, putting in ad hoc solutions to individual small things, we could have had a const 0 instruction, but we decided against it, we don’t have a negate instruction, and all these things we never decided to size optimize, we should do a more targeted approach instead of ad hoc ideas, we should stay consistent with that. BT: I think we would call this layer 1 compression, I see what we’re doing as we’re trying to use the type reconstruction algorithm as a way to compression, if we really want to use this, that might be really cool. That’s a compression mechanism that might not be available to a general purpose zip algorithm, if we were to continue looking at that idea, that could exist at another level, and we can have another tool that decompresses, still get the best of both worlds From 9db98861bde33ebf5f5c556ae5bdb73b77884376 Mon Sep 17 00:00:00 2001 From: Deepti Gandluri Date: Thu, 25 Aug 2022 13:30:13 -0700 Subject: [PATCH 09/11] Update gc/2022/GC-08-23.md Co-authored-by: Thomas Lively <7121787+tlively@users.noreply.github.com> --- gc/2022/GC-08-23.md | 1 + 1 file changed, 1 insertion(+) diff --git a/gc/2022/GC-08-23.md b/gc/2022/GC-08-23.md index 2e4d2d11..f0f3e6eb 100644 --- a/gc/2022/GC-08-23.md +++ b/gc/2022/GC-08-23.md @@ -86,6 +86,7 @@ AR: Yeah, there are a lot of hypotheticals right now, in the presence of hypothe AZ: Can you clarify why it’s easier to add unannotated versions later vs. adding the annotated versions later? AR: When we run into problems with lack of annotations, we can’t add them because of backwards compatibility, because we’re tied into supporting these instructions forever. If there’s interference with some feature we want to add, then we basically a bad place. We have to resolve this somehow. The other way is more robust, the size optimization is always something you can add at later. + AZ: Why do we think we’d be at a point of time in the future where we think the risk is lower vs. now? Do you think we’ll come to a point where we’ve implemented all the tricky things in the type system? That seems unlikely. We’d always have the argument against adding them, we’ll always thinking it’s risky to add. Worry is always that we will keep suffering the 5% loss indefinitely, cost can be significant in the long run, and we’re choosing to opt into it, we might always say it’s risky to add them AR: there’s never a point to be sure, over time we might have more data about what size data is more relevant, personally I think 5% is not an issue but ymmv, but we will be able to gather more data on that. The other side is that some features that are at higher risk, we will have added already, we will know more about whether these risks materialize, there will always be potential more things there are very likely features that we have them, type parameters for example, once we have those I’d be somewhat more comfortablle From d47c24b58d907fb95cdcae520196ef96f2446a75 Mon Sep 17 00:00:00 2001 From: Deepti Gandluri Date: Thu, 25 Aug 2022 14:53:59 -0700 Subject: [PATCH 10/11] Update gc/2022/GC-08-23.md Co-authored-by: Andreas Rossberg --- gc/2022/GC-08-23.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gc/2022/GC-08-23.md b/gc/2022/GC-08-23.md index f0f3e6eb..c9408138 100644 --- a/gc/2022/GC-08-23.md +++ b/gc/2022/GC-08-23.md @@ -151,7 +151,7 @@ AR: Highlighted the problem again, that’s what doesn’t work. The condition u JK: To validate the annotation, which you have to do in an engine, you do have to check the type on the stack, for example, you have to check is there even a struct to do struct.get on that, and if you can tell easily enough whether you have a struct at all, then presumably you can always tell whether that has a type parameter or not -AR: It’s more complicated than that. In general, do you have a principled type or not? Is there a most generic type that covers all possibilities. If you don’t have that then you need back checking in general. This is a thing you don’t want to do. To even decide whether your type is principled, that’s not something you can do without backtracking either. You have to explore all the search space where you could have decided otherwise. I have seen systems that track the principality of the types used, but that is very research, I don’t know how well it generalizes. In general, the assumption that you can make this distinction doesn’t work +AR: It’s more complicated than that. In general, do you have a principal type or not? Is there a most generic type that covers all possibilities. If you don’t have that then you need back tracking in general. This is a thing you don’t want to do. To even decide whether your type is principal, that’s not something you can generally do without backtracking either. You have to explore all the search space where you could have decided otherwise. I have seen systems that track the principality of the types used, but that is very researchy, I don’t know how well it generalizes. In general, the assumption that you can make this distinction doesn’t work BT: In practise what would happen is that the annotation less instruction become less and less useful over time, because you add features that can’t use them, and then you have to come up with rules for when they can be used and not and they can be very subtle From 233337c5844c53fa6740fdf7a893a87cd29e9fd7 Mon Sep 17 00:00:00 2001 From: Deepti Gandluri Date: Thu, 25 Aug 2022 14:54:37 -0700 Subject: [PATCH 11/11] Update gc/2022/GC-08-23.md Co-authored-by: Andreas Rossberg --- gc/2022/GC-08-23.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gc/2022/GC-08-23.md b/gc/2022/GC-08-23.md index c9408138..bdd84df6 100644 --- a/gc/2022/GC-08-23.md +++ b/gc/2022/GC-08-23.md @@ -155,7 +155,7 @@ AR: It’s more complicated than that. In general, do you have a principal type BT: In practise what would happen is that the annotation less instruction become less and less useful over time, because you add features that can’t use them, and then you have to come up with rules for when they can be used and not and they can be very subtle -AR: Because you’re very likely to use properties like substitutibility, if you have very adhoc restrictionson what types are allowed somewhere, one essential rule of subtyping is that whenever you can use a type, you can always use a subtype, if you break the property, you’ve broken the entire type system, so thes rules have to work in a way that you don’t break the property, I’m not sure that it’s possible +AR: Because you’re very likely to lose properties like substitutability, if you have very adhoc restrictionson what types are allowed somewhere. One essential principle of subtyping is that whenever you can use a type, you can always use a subtype. If you break the property, you’ve broken the entire type system, so these rules have to work in a way that you don’t break the property, I’m not sure that is possible in general JK: We can make the rules as simple as we want them to be, not saying that’s a particularly good design, but just saying that we do have the power to make the simplest rules possible