Specify object subtyping #43
Comments
@hasithaa Which interpretation of private does the current implementation use?
The current implementation is based on the 2nd interpretation, where access is allowed only from methods that belong to the same object type descriptor (private to the object type).

import ballerina/io;

type Person object {
    private string email;

    function __init(string emailAddr = "[email protected]") {
        self.email = emailAddr;
    }

    function foo(Person p) {
        io:println(self.email);
        // p.email is private, but foo belongs to the same object type descriptor, so access is allowed
        io:println(p.email);
    }
};

public function main() {
    Person p1 = new;
    Person p2 = new(emailAddr = "[email protected]");
    p1.foo(p2);
}
Great. I think the 2nd interpretation is the more useful one. This means Approach B won’t work, which leaves Approach A.
Btw, is it a strict requirement to have structural subtyping for objects? I know Flow and TypeScript have some nominal typing for classes. I came across the following paper, which describes an approach that integrates nominal and structural subtyping to get rid of the weaknesses of both. The example situations it describes are practical situations that we've faced in the current Ballerina compiler front-end written in Java, as well as in the Ballerina JVM back-end written in Ballerina that we are working on these days. FYI, we've defined structural subtyping for data-related types and that works well. But for objects, possibly we can think of a different approach. WDYT?
Well, one problem is that containers, which are typed semantically, can contain objects, which means that typing of objects also has to be defined in terms of a subset relationship between sets of values. I suppose you could think of an object as having the class name as part of its value, but I think you would want a very different syntax if that is your conceptual model. Maybe something to think about for Ballerina v2 or v3.
I will open a new issue to discuss this matter. Btw, I have a concern about point 8 of Approach A in your original proposal. Let me explain it using the following example.

Ballerina module: pkg1

import ballerina/io;

public type Person object {
    int age = 20;
    private string email = "[email protected]";

    public function runWith(Person other) {
        io:println(self.email);
        // Privacy violation 1
        io:println(other.email);
    }
};

public function tryRunning(Person p1, Person p2) {
    p1.runWith(p2);
    io:println(p1.age);
    // Privacy violation 2
    io:println(p2.age);
}

Ballerina module: pkg2

import ballerina/io;

public type Student object {
    int age = 32;
    string email = "[email protected]";

    public function runWith(Student other) {
        io:println(self.email);
        io:println(other.email);
    }

    public function walkWith(Student other) {
    }
};

Ballerina module: pkgmain

import pkg1;
import pkg2;
import ballerina/io;

public function main() {
    var p = new pkg1:Person();
    var s = new pkg2:Student();
    pkg1:tryRunning(p, s);
}

In this example, the pkg2:Student object type is a subtype of the pkg1:Person object type according to Approach A. Also, according to point 8, the visibility of the field "email" in pkg2:Student is greater than the visibility of the field "email" in pkg1:Person. But this example demonstrates that pkg1 has access to a field of pkg2:Student that has "module" visibility.
However, I agree that privacy interpretation 1 (introduced above) is less useful, even though it does not break privacy rules. It has many drawbacks compared to interpretation 2: some scenarios cannot be implemented without exposing private members to the whole world, and an implementation that checks interpretation 1 can be sub-optimal compared to one that checks interpretation 2. Even though interpretation 2 has the privacy violations explained in my comment above, it is the most useful interpretation. Therefore privacy violation 1 in my comment above may not be a concern. However, I think the 2nd violation can be a problem.
Point 8 of my proposed approach needs rethinking. @sameerajayasoma Don't we have a problem even with simpler cases? For example
I think we need to distinguish between the visibility specifier (public, nothing or private) and the region of code to which a visibility specifier limits access. We can describe the visibility region as being one of public, module(M) or object(M, O), where M is a reference to a module and O is a reference to a specific (non-abstract) object type descriptor. Before doing type-checking, we need to resolve each (context-dependent) visibility specifier into a (context-independent) visibility region.

There is a partial order on visibility regions corresponding to region inclusion, with

public > module(M) > object(M, O)

Some regions are incomparable (<>):

module(M1) <> module(M2)
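The region ordering described in this comment can be sketched in a few lines. This is a minimal Python model, not Ballerina compiler code: the names `Region`, `includes`, `comparable` and `resolve` are hypothetical, and the sketch assumes that "nothing" as a specifier resolves to module-level visibility.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Region:
    """A context-independent visibility region (hypothetical model)."""
    kind: str                      # "public", "module" or "object"
    module: Optional[str] = None   # M, for module(M) and object(M, O)
    obj: Optional[str] = None      # O, for object(M, O)


PUBLIC = Region("public")


def module(m):
    return Region("module", m)


def obj(m, o):
    return Region("object", m, o)


def includes(a, b):
    """True when region a contains region b, i.e. a >= b in the partial order."""
    if a == b or a.kind == "public":
        return True
    # module(M) contains object(M, O) for every descriptor O in M
    if a.kind == "module" and b.kind == "object":
        return a.module == b.module
    return False


def comparable(a, b):
    """False for incomparable regions such as module(M1) and module(M2)."""
    return includes(a, b) or includes(b, a)


def resolve(specifier, m, o):
    """Resolve a context-dependent specifier used inside descriptor O of module M."""
    if specifier == "public":
        return PUBLIC
    if specifier == "private":
        return obj(m, o)
    return module(m)  # assumption: no specifier means module-level visibility
```

For instance, `includes(module("M"), obj("M", "O"))` holds, while `comparable(module("M1"), module("M2"))` does not, which matches the ordering and the incomparable case listed above.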
I have a concern about considering object members that have module and private visibility for object subtyping. Imagine that I need to define an object type (S) that is a subtype of another object type (T) defined in a different Ballerina module. Now I need to know the internal structure (including module-level and private members) of T in order to make S a subtype of T. The internal structure is not available unless you have access to the source code of the module. That can be a problem IMO. Can we think of a model where we define object subtype relationships based only on abstract objects?
If an abstract object type (AT1) contains members with module visibility, then other packages won't be able to create subtypes of AT1. This model is similar to Go lang interfaces.
Suppose *T copies the members along with the resolved visibility regions instead of the visibility specifiers. This makes a difference for module-level visibility. At the moment abstract object types cannot have private members. But I think we should also say that public abstract object types should have only public members. However, I think it’s fine for protected abstract object types to have protected members.
Sorry. I edited my comment to remove the restriction for public abstract objects having members with module (protected) visibility a few mins ago. (didn't see your reply). Go lang has done something like that and it is useful.
Based on further discussions with @sameerajayasoma, it actually does make sense to have public abstract objects with module visibility. What it means is that only the module that defines the abstract type can subtype it.
The overall effect is that objects with private fields/methods cannot have subtypes. This makes sense given that
For object types with module-level visibility, only that module can create subtypes. Objects with only public fields/methods can be subtyped freely, just like with records.
If module M defines an abstract object type T with module-visibility members, then only M can use *T.
The previous comment but one is not quite correct, since a member of a subtype can have more visibility than the corresponding member of the supertype (consistent with the substitution principle).
Means private to object type not private to object value. See discussion in issue #43.
To summarize, an object type T’ is a subtype of an object type T, if and only if for every field and method f of T, there is a field or method f’ of T’, such that
The last point is motivated by substitutability. We want it to be possible to substitute T objects with T’ objects. What is needed for this is for T’ to provide at least the level of access to a method or field that T does.

Since objects can be members of containers, and subtyping of containers is defined semantically in terms of set relationships between the sets of values (shapes) that types denote, we need to define the shape of an object in such a way that the above subtype relationship holds. We can do this by saying that the shape of an object is a pair of maps <F,M>, where F maps keys to the shapes of fields and M maps keys to the shapes of methods, each of which will be a function shape. In both cases, a key is a pair <R, S>, where R is a code region and S is a string for the name of the field or method. A code region is either a module or an object type descriptor. F or M will have an entry <R, S> if there is a field or method named S that is visible in region R. Thus a field or method declared as public will have a map entry for every code region; one declared as module level will have a map entry for the code region of its module and of every object type descriptor in that module; one declared as private will have a map entry only for the code region of its object type descriptor.

An object type is open and includes all shapes that have at least the fields and member functions specified in the type.
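The <F,M> construction above can be modelled roughly as follows. This is a Python sketch under stated assumptions, not the spec's definition: the universe of code regions is a small hypothetical list (`UNIVERSE`), member shapes are plain strings compared by equality rather than by a real subshape relation, and all names (`visible_regions`, `member_map`, `object_type`, `is_subtype`, the descriptors `A`, `B`, `IdOnly`) are made up for illustration.

```python
# Hypothetical finite universe of code regions; in the real model it is open-ended.
UNIVERSE = [
    ("module", "m1"), ("object", "m1", "A"),
    ("module", "m2"), ("object", "m2", "B"),
]


def visible_regions(visibility, module, descriptor):
    """Regions in which a member with the given visibility can be seen."""
    if visibility == "public":
        return set(UNIVERSE)
    if visibility == "module":
        # the module itself plus every object type descriptor inside it
        return {r for r in UNIVERSE if r[1] == module}
    return {("object", module, descriptor)}  # private


def member_map(members, module, descriptor):
    """Build a map keyed by <R, S> from (name, visibility, shape) triples."""
    return {
        (region, name): shape
        for name, visibility, shape in members
        for region in visible_regions(visibility, module, descriptor)
    }


def object_type(module, descriptor, fields, methods):
    """Return the pair <F, M> that an object type descriptor requires."""
    return (member_map(fields, module, descriptor),
            member_map(methods, module, descriptor))


def is_subtype(sub, sup):
    """Open-type rule: sub must provide every entry that sup requires."""
    return all(
        key in sub_map and sub_map[key] == shape  # simplification: exact match
        for sub_map, sup_map in zip(sub, sup)
        for key, shape in sup_map.items()
    )


EMPTY = ({}, {})  # models `object {}`: no required entries
A = object_type("m1", "A",
                fields=[("id", "public", "int"), ("secret", "private", "string")],
                methods=[("show", "public", "function () returns ()")])
B = object_type("m2", "B",
                fields=[("id", "public", "int"), ("extra", "public", "string")],
                methods=[("show", "public", "function () returns ()")])
ID_ONLY = object_type("m1", "IdOnly", fields=[("id", "public", "int")], methods=[])
```

In this model every type is a subtype of `EMPTY`, both `A` and `B` are subtypes of `ID_ONLY`, and `B` is not a subtype of `A`, because `B` has no entry for the private key `<object(m1, A), "secret">`.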
Do we need to take the initializer into account for subtyping? The current behaviour is to ignore the initializer.
You mean the __init method, right? If you can call it explicitly, type safety requires it to be taken into account. At the moment, the spec does not prohibit calling it explicitly.
I think we have two choices as to how we handle visibility. When S is a subtype of T, and a method or field f occurs in both S and T, we have two choices for the requirement of the visibility regions of f in S and in T, either
1 allows more subtyping relationships and is, I believe, what is required for substitutability, but 2 is perhaps easier to understand and has the advantage that it allows us to approximate nominal typing: if you create an object with a protected field/method, then only objects defined in the same module can be a subtype; similarly, if you create an object with a private field/method, then the only objects that could be a subtype are local objects created by methods of that object. @sanjiva, @sameerajayasoma, @hasithaa Any thoughts?
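The difference between the two choices can be illustrated with a small Python sketch. It rests on an interpretation of the surrounding comments, since the exact wording of the two options is not preserved in this thread: option 1 is read as "the visibility region of f in S must include the region of f in T", and option 2 as "the two regions must be the same". All names here (`includes`, `is_subtype`, `T`, `SAME_MODULE`, `WIDENED`, `OTHER_MODULE`) are hypothetical.

```python
PUBLIC = ("public",)


def module(m):
    return ("module", m)


def obj(m, o):
    return ("object", m, o)


def includes(a, b):
    """True when visibility region a contains region b."""
    if a == b or a == PUBLIC:
        return True
    return a[0] == "module" and b[0] == "object" and a[1] == b[1]


def is_subtype(sub, sup, option):
    """sub and sup map member names to (type name, visibility region)."""
    for name, (sup_type, sup_region) in sup.items():
        if name not in sub:
            return False
        sub_type, sub_region = sub[name]
        if sub_type != sup_type:  # simplification: member types must match exactly
            return False
        if option == 1 and not includes(sub_region, sup_region):
            return False  # assumed option 1: the subtype may only widen visibility
        if option == 2 and sub_region != sup_region:
            return False  # assumed option 2: visibility regions must be identical
    return True


# T is declared in module m1 with a module-level ("protected") field.
T = {"age": ("int", module("m1"))}
SAME_MODULE = {"age": ("int", module("m1")), "name": ("string", PUBLIC)}
WIDENED = {"age": ("int", PUBLIC)}
OTHER_MODULE = {"age": ("int", module("m2"))}
```

Under this reading, `SAME_MODULE` is a subtype of `T` under both options, `WIDENED` only under option 1, and `OTHER_MODULE` under neither, which is the "only objects defined in the same module can be a subtype" behaviour that option 2 is said to give.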
I think the idea that a subtype expands the visibility region is going to cause a lot of confusion to mort programmers. So I prefer option 2.
+1 for option 2. AFAIR, that is the option that was discussed in an offline discussion with @jclark on the 4th of April. It is summarized in this comment: #43 (comment)
Btw, these object subtyping rules give rise to many patterns that are useful in Ballerina programs with multiple Ballerina modules. They are related to the substitutability principle in OOP.
Option 2 is what I put in the spec.
The spec needs to specify how object subtyping works.
This will almost certainly depend on function subtyping #28.
Can we define this in terms of shapes? (ie type denotes set of shapes, and subtype means subset of denoted set of shapes)
It would be intuitive if every object type was a subtype of object {}, since then we would not need any special way to write a type corresponding to the object basic type.
Privacy
One significant complexity that relates only to objects is privacy. We need to be clear on what private means. Two possible interpretations
Approach A
One possible approach is as follows. An object type S is a subtype of an object type T provided that the only differences between S and T are the following:
Approach B
An alternative approach is that private aspects of value do not affect type.
This only works with the first interpretation of private.