-
Notifications
You must be signed in to change notification settings - Fork 5.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Type inference experiments. #14510
Type inference experiments. #14510
Conversation
15f74df
to
c1118b4
Compare
c1118b4
to
ec9b3fe
Compare
1420b8f
to
b2bafc5
Compare
b2363a8
to
acad00f
Compare
m_typeConstructors.at(m_primitiveTypeConstructors.at(PrimitiveType::TypeFunction).m_index).arities = {Arity{std::vector<Sort>{{typeSort},{typeSort}}, classType}}; | ||
m_typeConstructors.at(m_primitiveTypeConstructors.at(PrimitiveType::Function).m_index).arities = {Arity{std::vector<Sort>{{typeSort, typeSort}}, classType}}; | ||
m_typeConstructors.at(m_primitiveTypeConstructors.at(PrimitiveType::Function).m_index).arities = {Arity{std::vector<Sort>{{typeSort, typeSort}}, classType}}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess this one should be Itself
too as in #14510 (comment)?
m_typeConstructors.at(m_primitiveTypeConstructors.at(PrimitiveType::TypeFunction).m_index).arities = {Arity{std::vector<Sort>{{typeSort},{typeSort}}, classType}}; | |
m_typeConstructors.at(m_primitiveTypeConstructors.at(PrimitiveType::Function).m_index).arities = {Arity{std::vector<Sort>{{typeSort, typeSort}}, classType}}; | |
m_typeConstructors.at(m_primitiveTypeConstructors.at(PrimitiveType::Function).m_index).arities = {Arity{std::vector<Sort>{{typeSort, typeSort}}, classType}}; | |
m_typeConstructors.at(m_primitiveTypeConstructors.at(PrimitiveType::TypeFunction).m_index).arities = {Arity{std::vector<Sort>{{typeSort},{typeSort}}, classType}}; | |
m_typeConstructors.at(m_primitiveTypeConstructors.at(PrimitiveType::Function).m_index).arities = {Arity{std::vector<Sort>{{typeSort, typeSort}}, classType}}; | |
m_typeConstructors.at(m_primitiveTypeConstructors.at(PrimitiveType::Itself).m_index).arities = {Arity{std::vector<Sort>{{typeSort, typeSort}}, classType}}; |
size_t arguments() const | ||
{ | ||
solAssert(!arities.empty()); | ||
return arities.front().argumentSorts.size(); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the first element in arities
meant to be always PrimitiveClass::Type
? Can we assert that? If not, what would a different type class on the first position mean?
Can a type constructor have more than one PrimitiveClass::Type
in arities
? If so, what would that mean?
If there's always exactly one, then it seems to me that it we should have a separate field for it and not just lump it with other things in arities
. It seems different in some ways and it's even excluded by some checks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, and by the way calling this thing Arity
confused me a lot. Looking at other code I assumed it was just something describing the number of args, but it looks like an equivalent of TypeConstant
but for type classes. I'm only now noticing that it has a type class inside, and TypeSystem::instantiateClass()
makes more sense (it was a bit weird that you'd not need to say which class you're instantiating).
Can we come up with a more descriptive name for it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's how these things are called :-). If you can come up with a better name, feel free to suggest it.
It describes for a type constructor that applying it to type arguments of specific sorts (the argument sorts in the arity), the result will have a specific class (the class in the arity).
Sort baseSort{{primitiveClass(PrimitiveClass::Type)}}; | ||
size_t index = m_typeConstructors.size(); | ||
m_typeConstructors.emplace_back(TypeConstructorInfo{ | ||
_name, | ||
_canonicalName, | ||
{Arity{std::vector<Sort>{_arguments, baseSort}, primitiveClass(PrimitiveClass::Type)}}, | ||
_declaration | ||
}); | ||
TypeConstructor constructor{index}; | ||
if (_arguments) | ||
{ | ||
std::vector<Sort> argumentSorts; | ||
std::generate_n(std::back_inserter(argumentSorts), _arguments, [&](){ return Sort{{primitiveClass(PrimitiveClass::Type)}}; }); | ||
std::vector<Type> argumentTypes; | ||
std::generate_n(std::back_inserter(argumentTypes), _arguments, [&](){ return freshVariable({}); }); | ||
auto error = instantiateClass(type(constructor, argumentTypes), Arity{argumentSorts, primitiveClass(PrimitiveClass::Type)}); | ||
solAssert(!error, *error); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't get the purpose of instantiateClass()
call here. Unless I'm missing something, it will simply insert a second, identical copy of PrimitiveClass::Type
arity into arities
and with how it's called, validations inside can never fail (we're calling it on a type constant and using matching numbers of args).
Is this bit just redundant? If so, why is it done only when there is at least one argument?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The system currently has a primitive class Type
that contains all types (ultimately, we may not need that, actually, but currently at least it's there). When you declare a new type constructor, you need to clarify that it, applied to arguments of Type
sorts gives you something of class Type
- that doesn't happen automatically.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But why do we need two identical copies of that class in arities
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, yeah, we don't :-) - we should probably not add it manually in here and only let instantiateClass
handle it, I need to read through it properly myself again :-).
{ | ||
SecondarySourceLocation ssl; | ||
ssl.append("Previous instantiation.", instantiation->second->location()); | ||
m_errorReporter.typeError(6620_error, _typeClassInstantiation.location(), ssl, "Duplicate type class instantiation."); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, I find it odd that you can constrain type arguments of instantiations and we can distinguish instantiations with different constraints (they'll have different Arity
), but you can't instantiate multiple times.
For example:
pragma experimental solidity;
type Vector(Item);
class Self: Container {}
class Self: Number {}
class Self: String {}
instantiation Vector(Item: Number): Container {}
instantiation Vector(Item: String): Container {}
TypeError 6620: (171-219): Duplicate type class instantiation.
Is this something we're going to allow that in the future? Does it just require support for parameterized type classes? Otherwise I don't see much point in allowing those constraints in instantiations.
I'm also generally unsure at what places we're going to allow constraints in the first place, because the prototype is pretty inconsistent about that. It disallows some things I'd expect to be possible (e.g. type Vector(T: Number);
) while allowing some wild things (e.g. let x: T: Number: Number: Number;
). It's incomplete so it's understandable, but it makes it hard to figure out what is supposed to be valid eventually and what not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is generally impossible. How would you distinguish between the two instantiations? I.e. how would you ever be able to tell which is the right one? There can only ever be a unique instantiation for any type constructor.
In situations in which you'd want to do this, you'd, instead, move the specific logic to a new type class for the argument. I.e. have instantiation Vector(Item: NumberOrString): Container {}
and appropriately define NumberOrString
(we currently don't have that in the PR, but conceptually it's possible to declare Number
and String
subclasses of NumberOrString
by implementing the interface of NumberOrString
with what's available for Number
or String
- but in general the current PR doesn't properly handle a dependency/inheritance hierarchy of type classes yet - partially because this would also only need to change once we generalize them anyways)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm also generally unsure at what places we're going to allow constraints in the first place, because the prototype is pretty inconsistent about that. It disallows some things I'd expect to be possible (e.g. type Vector(T: Number);) while allowing some wild things (e.g. let x: T: Number: Number: Number;). It's incomplete so it's understandable, but it makes it hard to figure out what is supposed to be valid eventually and what not.
Type definitions are general and cannot be constrained to arguments of specific type classes (there's no use for doing that) It's only type class instantiations that can depend on specific argument sorts. So that part is intentional.
Stuff like let x:T:Number:Number:Number;
- if that's really allowed - is just laziness so far. Remember that this is experimental and not yet meant to be perfect - it will have to change anyways, once we generalize type classes to more than one type argument.
experimental::Type TypeSystem::freshVariable(Sort _sort) | ||
{ | ||
size_t index = m_numTypeVariables++; | ||
return TypeVariable(index, std::move(_sort)); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We increase the number first so type variable IDs start at 1 rather than 0. Is that intentional? If not, it would be more consistent with other IDs in the AST to start at 0.
while (index /= 26) | ||
varName += static_cast<char>('a' + (index%26)); | ||
reverse(varName.begin(), varName.end()); | ||
stream << '\'' << varName; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there an established convention for naming fixed free type variables? Is prefixing them with "
a good idea?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Change generic ones to be prefixed by ?
instead of '
and use '
for fixed ones.
experimental::Type TypeSystem::freshTypeVariable(Sort _sort) | ||
{ | ||
_sort.classes.emplace(primitiveClass(PrimitiveClass::Type)); | ||
return freshVariable(_sort); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does a fresh non-type variable even mean? Does it have any sensible interpretation in the type system?
The only meaningful use of freshVariable()
on its own currently seems to be when we're declaring the type
primitive class. If seems to me that it should be kept as an internal helper. Especially given that the naming is confusing - it's not immediately clear what the difference between a "variable" and a "type variable" is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Higher-kinded type variables would be examples - but we don't have that implemented and potentially won't.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about naming then? Can we have something better than freshVariable()
?
bdd2f02
to
d7d4de0
Compare
9cf8459
to
f318787
Compare
Removing the commit with a hack that disabled codegen in syntax tests before I merge #14660. |
This pull request is stale because it has been open for 14 days with no activity. |
326ef77
to
b3a7739
Compare
Ok, there seems to be some difference in bytecode comparison's handling of unimplemented feature errors between cli and standard json - and something in soltest needs to deal with them more gracefully, but it doesn't look like much is left Well, and the error code test coverage of course needs to ignore this (we need to see if there's a better way than adding all of them to the exclusion list, but if need be we can do that) |
We also have to deal with uncovered error codes somehow, and I'm not sure that ignoring them is the best option; should likely have at least some basic test coverage for them. Also, the |
On a side note - why are we disallowing this in the experimental mode - we can't have semantic tests without, except we already have working tests, so all this assert did was to essentially disallow previously working functionality. |
Yeah, I see no reason to disable it. When reviewing #14659 I assumed that all the asserted outputs are unusable due to missing annotation and either segfault or produce broken output. If they are not, we should re-enable them. BTW, does
It's not different handling between CLI and StandardJSON. It's actually the lack of different handling. CLI and StandardJSON behave differently with regard to what outputs they produce as I already reported in #13925. For example StandardJSON may give you metadata even in presence of codegen ICEs. I still think that the right way to solve this is to make these two interfaces behave consistently. But it can of course be solved also by hard-coding the report generatorto ignore metadata in some cases. That's messier though.
I think we should just add them all to the exclusion list. That will force us to later do a proper pass of fixing error handling and adding proper coverage for them rather than glossing over it just to get the script to shut up. Right now the prototype is not complete enough for us to be able to cover it properly (unstable syntax, testing corner cases very often triggers ICEs). If we want to avoid forgetting about this, we could add an issue for it already or list it in Experimental Solidity quirks to deal with eventually. |
I made a quick checklist here to track remaining work on this: #14729. |
Co-authored-by: Kamil Śliwak <[email protected]> Co-authored-by: Matheus Aguiar <[email protected]> Co-authored-by: Nikola Matic <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually just wanted to remark these comments for future reference, but then still approve and merge - but I actually cannot approve, since I'm the author :-).
// TODO: consider still asserting unless we are in experimental solidity. | ||
// solAssert(m_typeName, ""); solAssert(!m_typeExpression, ""); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should note this down somewhere, resp. take care of this sooner than later.
{ | ||
solAssert(_kind == Token::Constructor || _kind == Token::Function || _kind == Token::Fallback || _kind == Token::Receive, ""); | ||
solAssert(isOrdinary() == !name().empty(), ""); | ||
// TODO: assert _returnParameters implies non-experimental _experimentalReturnExpression implies experimental |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would also be nice to clean up soon. Unfortunately a bit of a hassle to pass the information whether the AST is experimental down to these...
@@ -2138,7 +2152,8 @@ class BinaryOperation: public Expression | |||
): | |||
Expression(_id, _location), m_left(std::move(_left)), m_operator(_operator), m_right(std::move(_right)) | |||
{ | |||
solAssert(TokenTraits::isBinaryOp(_operator) || TokenTraits::isCompareOp(_operator), ""); | |||
// TODO: assert against colon for non-experimental solidity |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another one that'd be nice to get rid off.
return std::apply([&](auto... _indexTuple) { | ||
return ([&](auto&& _step) { | ||
for (auto source: _sourceUnits) | ||
if (!_step.analyze(*source)) | ||
return false; | ||
return true; | ||
}(std::tuple_element_t<decltype(_indexTuple)::value, AnalysisSteps>{*this}) && ...); | ||
}, makeIndexTuple(std::make_index_sequence<std::tuple_size_v<AnalysisSteps>>{})); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We actually kept this until now, haha :-D, well, now we merge it.
// TODO move after step introduced in https://github.com/ethereum/solidity/pull/14578, but before TypeInference | ||
FunctionDependencyAnalysis, | ||
TypeInference, | ||
DebugWarner |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure we should keep this here on develop
} | ||
} | ||
|
||
// TODO: clean up rational parsing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, all this is half-dead and duplicated code that won't stay like this at all, but well, works for now...
|
||
if (!m_activeInstantiations.empty()) | ||
{ | ||
// TODO: This entire logic is superfluous - I thought mutually recursive dependencies between |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Haha, this we should also get rid of quickly until anyone thinks this is actually necessary
using namespace solidity::frontend; | ||
using namespace solidity::frontend::experimental; | ||
|
||
/*std::optional<TypeConstructor> experimental::typeConstructorFromTypeName(Analysis const& _analysis, TypeName const& _typeName) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Those comments we could also have cleaned out still :-).
There is already a PR with the workaround to the failing prb-math test. |
This is far from complete and even the parts that are in here need some reworking, so I'm mainly posting it for a basis for discussing and splitting up further work.
Notes:
SyntaxRestrictor
is preliminary and (probably) incomplete.DebugWarner
is just used for printing the result of type inference as info messages.Among the issues:
IRVariable
mechanism to account for larger types.Also it'd be interesting to explore if and how "size on stack" could be conceptually a construct defined in-language (i.e. a type class property that is merely directly inherited/derived on type declarations; optional - also fine to determine it hard-coded for now).
What should rather happen is:
word
type that may cross the assembly barrier). This means name resolution and type checking (other than for the cases involving special semantics) works as usual, so this immediately yields proper scoping and reduces special-casing. E.g.type word = __builtin("word");
or as separate top-level statement__builtin_type("word", word);
type word1 = __builtin("word"); type word2 = _builtin("word");
is invalid.function f(x:a) {}
works, whilefunction f(x:a) -> a {}
fails with undeclared identifier). Potentially we should have explicit syntax for free type variables (like'a
or?a
) or for declaring free types (likefunction f<a>(x:a) -> a
orforall a. function f(x:a) -> a
)HM(X)
-style)Infrastructure issues:
Further notes for future development: https://notes.ethereum.org/_OSmtx9aQAOHQXwa60IDsQ