Proposal: Destructible Types #161
Comments
I would prefer a way that allows me to apply this not just to new types but to existing code. One issue I see is this essentially mimics These two concerns are showstoppers for me and so I would not vote to include this in its current form. Another issue as-is with this proposal is that when reading code, there's no obvious way to determine that a variable will have side effects when it goes out of scope. If there's not a I believe something closer to this would get about 90% of the way there and be a lot more usable: say a way to mark an
This would give scope-bound deterministic disposal, single ownership safety, obvious code readability, be immediately useful when using existing code, and I believe could be implemented without changing the VM similar to Additional safety regarding single ownership could be had by adding a weak reference designator that forces explicit ownership to exist elsewhere:
But, this may be of limited usefulness if "get reader" allows getting a raw instance. (Which, I think is very important to allow) |
What if you used C++ stack allocation construction syntax to visually separate garbage collected objects from destructible objects? Would that allow any object to be destructible then? I think you'd need CLR support at that point though.
OutputMessageOnDestruction omod("Hello world!"); |
👍 |
Would this work together with
Does that mean that array or |
That's definitely something I want to see, and reminiscent of Rust with its compile-time checking. Generalized enough, one can imagine a core runtime that almost doesn't need the GC at all. What about returning a destructible type? Would it need to be
Or would it be implicit?
If the caller doesn't use the returned value, is it destructed immediately, or only at the end of the current scope?
|
@RichiCoder1 from C++ experience, it is not uncommon to have one "owner" object and several other objects that still need to use the instance somehow. I can see the same need here. Here's an exercise. Say
I still want to allow these other methods (and any objects they allocate in their use of it) to use it while maintaining strict lifetime control at a top level. A ref param won't work here if Write() is an async method that needs to put the FileStream into its state machine object. |
No. One of the goals of this proposal is to have destruction be deterministic and getting that requires the implementation of the type obey certain restrictions. Hence it wouldn't be possible to make any type destructible simply by changing its construction syntax. |
👍 This is one I've thought about several times, as having deterministic management of resources would be extremely useful in a lot of areas, such as scientific programming and game development. Could this potentially be used to allow for deterministic management of heap allocated memory? Effectively implementing the C++ Can these types be stored in collections? I noticed the proposal requires that an object containing fields that are destructible must itself be destructible? How would that work with arrays and collections? |
@jaredpar The new construction syntax idea was simply so that you could tell how objects are allocated at the point of allocation.
A a = new A(); // Destructible?
B b(); // Destructible!
C# has done a very good job of making things explicit at the call site and not just at the declaration site; What are the other "certain restrictions"? Why couldn't the CLR initialize any given object type on stack memory instead of heap allocated memory and have the finalizer run when it goes out of scope if it hasn't been moved? There's lambda capture to worry about I suppose, and concurrent modification, neither of which is anything remotely resembling easy. |
I agree that being able to distinguish between the two types here is important. In the past we did consider taking a page out of the F# book here and doing the following:
use var A a = new A(); // destructible
The overall feeling when we did this was mixed. Sure it made the construction more obvious but there was also concern it was adding too much verbosity to the code. Eventually we ripped it out in favor of finding a better solution later on.
It's not just moving that is a problem but even simple aliasing. If the implementation of a method should put |
Good point, I take it for granted sometimes that C++ lets people do stupid things if they really feel like it. I assume you meant the implementation of a destructible object's method, so if you put I really like your sample syntax also, with a minor tweak. If that caused mixed feelings, I don't know what to tell you :-)
use a = new A(); // destructible shorthand, implicit var
use A a = new A(); // destructible longhand, explicit type |
Maybe somehow allow |
Boxing is definitely an option we explored and feel is necessary for completeness. We even called it simply
|
@jaredpar Sounds great and exactly what I was thinking about. I think I read a previous article about discussion around destructible types internally. Maybe piggyback on other ideas in this thread and follow
File! file = File.GetFile("....."); // <--- Implicitly is boxed up.
files.Add(file);
Random thought: possibly include a generic constraint
public destructible class DestructibleList<T> : IList<T> where T : destructible
{
// .. implementation
}
public void ProcessFiles(string folderName)
{
DirectoryInfo di = new DirectoryInfo(folderName);
use DestructibleList<File> files = di.GetFiles("*");
foreach(File! file in files)
{
// ... do stuff
}
// ...files go away here.
}
Though that raises the next question: how would a list of destructibles work with something like
Addendum: How would destructible types play with |
Overall, I like the proposal, but a couple of things are not entirely clear to me.... would someone mind expanding on this:
|
Scope-based, single-owner destruction is a good idea in certain situations, and I like it, but there may still be many situations that the proposal cannot solve. |
I was originally concerned that destructible types could require VM changes which would have a negative impact on the performance of the garbage collector. I no longer believe that is the case. My current understanding of the proposal is the following (with the exception of lambda considerations):
The above rules could be expanded to support a destructible
The above is surely incomplete but may serve as a starting point for defining the semantics of destructible types. |
Regarding
When a destructible value is placed into a "box", ownership is assigned to that box. |
@sharwell I assumed by reading the proposal that a destructible type doesn't implement |
Why not? @stephentoub's proposal contains this:
var omod1 = new OutputMessageOnDestruction("Destructed!");
SomeMethod(move omod1); // Ok, 'omod1' now uninitialized; won't be destructed
I think this is reasonable, i.e. if the parameter is not
So, collections of destructible types wouldn't be allowed (at least not without |
Use of destructible types in an array poses an interesting, but not insurmountable, challenge. Consider the following:
In this scenario, values is (semantically) treated as a |
@MrJul and @RichiCoder1: The implementation of |
I agree, I updated my post above. |
Right now I don't see a way to provide destructible guarantees for existing generic data types. Perhaps if we introduce a |
The generated code for a destructible type would not have an |
Just a few points I'd like to have clarified in the proposal. Given Taking a reference of a destructible reference is done via Taking ownership of a destructible reference is done via Can ownership be taken from references? Is this allowed (I hope not)? Is it absolutely required that when
Regardless, I do think use of the special keywords on assignment should be required for destructible types. Without them, compile time checking is impossible / really difficult. Additionally, declaration time specification is ideal - class markup is also handy for requiring specific declaration markup but I do not believe it should be the method of implementation. Lastly, I didn't see this in the specification: what happens to all the references? Does the run-time set them all to null or is there some kind of destroyed operator being introduced? I very much hope that we're not expecting the developer to track the state of these things without support from the run-time. @Grauenwolf I disagree completely. The disposable pattern has been a weakness of C# since its inception. The sooner we can be without it, the better. The fact that the Dispose method could be called by any interacting code has made the entire pattern fragile. For example: there are a number of |
Any proposal based on abandoning IDisposable is doomed from the outset due to backwards compatibility issues. |
@Grauenwolf I do not disagree. There is no possible way to abandon the disposable pattern for many years. However, given that the pattern is subject to misuse and can be an enforced cause of bugs it should be phased out over time. To assume that because something is common that it must always be, is akin to assuming that because it is common for people to drive gasoline powered automobiles that electric powered automobiles should never be considered. I believe history will prove this assumption incorrect. 😄 |
I would like to suggest a poor-man's version of this feature, where the existing
This does not address the |
|
would compile (checking the constraint
Point 4. Code like:
is compiled as
Point 5. The compiler can detect cases like:
|
|
For your 1. and 2. I understand your concern is some method like:
This code should not raise a warning, since it returns the For 3. The order of disposal is in the declaration order. For another order, implement For 4. Then the For 5. That's an example where the compiler might raise warning (the Anyway, I'm just trying to suggest exploring a lightweight approach to the problems addressed by this topic, but which is complementary to the existing |
@govert I think that code should raise a warning. The IDisposable semantics are not being recognized in that method somehow. You should do something explicit to not get it: Given:
This should give a warning to the effect of IDisposable instance marked with RequiresUsingAttribute not disposed
In this case an analyzer could offer the fix (because
You could also detect class level usages:
I think there is merit to such an approach and it could be done entirely with attributes and analyzers without changes to the compiler. |
@bbarry: although the feature is already on the "probably never" list, I want to point out that you cannot easily work around such using-enforcements, e.g., what to do in NUnit tests where you have a SetUp and a TearDown method? How to tell the system that the Dispose() method is called by reflection? |
@drauch |
It would be nice to merge this proposal with dotnet/csharplang#6611 and somehow #181, so
// instead of
var foo = new Destructible();
var bar = move foo; // mandatory move
// we could just
let foo = new Whatever(); // owned by scope
{
let bar = foo; // implicitly (temporarily, in this case) move
foo.Bar(); // ERROR: use of moved value
}
foo.Bar(); // OK
This would cause a compiler error when accessing the collection in
let list = new List<T> { ... };
foreach(var item in list) // list is borrowed
{
WriteLine(item);
list.Add( ... ); // ERROR: use of moved value
}
list.Add( ... ); // OK
Probably another keyword instead of |
We are now taking language feature discussion in other repositories:
Features that are under active design or development, or which are "championed" by someone on the language design team, have already been moved either as issues or as checked-in design documents. For example, the proposal in this repo "Proposal: Partial interface implementation a.k.a. Traits" (issue 16139 and a few other issues that request the same thing) are now tracked by the language team at issue 52 in https://github.com/dotnet/csharplang/issues, and there is a draft spec at https://github.com/dotnet/csharplang/blob/master/proposals/default-interface-methods.md and further discussion at issue 288 in https://github.com/dotnet/csharplang/issues. Prototyping of the compiler portion of language features is still tracked here; see, for example, https://github.com/dotnet/roslyn/tree/features/DefaultInterfaceImplementation and issue 17952. In order to facilitate that transition, we have started closing language design discussions from the roslyn repo with a note briefly explaining why. When we are aware of an existing discussion for the feature already in the new repo, we are adding a link to that. But we're not adding new issues to the new repos for existing discussions in this repo that the language design team does not currently envision taking on. Our intent is to eventually close the language design issues in the Roslyn repo and encourage discussion in one of the new repos instead. Our intent is not to shut down discussion on language design - you can still continue discussion on the closed issues if you want - but rather we would like to encourage people to move discussion to where we are more likely to be paying attention (the new repo), or to abandon discussions that are no longer of interest to you. If you happen to notice that one of the closed issues has a relevant issue in the new repo, and we have not added a link to the new issue, we would appreciate you providing a link from the old to the new discussion. 
That way people who are still interested in the discussion can start paying attention to the new issue. Also, we'd welcome any ideas you might have on how we could better manage the transition. Comments and discussion about closing and/or moving issues should be directed to #18002. Comments and discussion about this issue can take place here or on an issue in the relevant repo. I think dotnet/csharplang#121 is the best place to continue this discussion. |
Background
C# is a managed language. One of the primary things that's “managed” is memory, a key resource that programs require. Programs are able to instantiate objects, requesting memory from the system, and at some point later when they're done with the memory, that memory can be reclaimed automatically by the system's garbage collector (GC). This reclaiming of memory happens non-deterministically, meaning that even though some memory is now unused and can be reclaimed, exactly when it will be is up to the system rather than being left to the programmer to determine.
Other languages, in particular those that don't use garbage collection, are more deterministic in when memory will be reclaimed. C++, for example, requires that developers explicitly free their memory; there is typically no GC to manage this for the developer, but that also means the developer gets complete control over when resources are reclaimed, as they're handling it themselves.
Memory is just one example of a resource. Another might be a handle to a file or to a network connection. As with any resource, a developer using C++ needs to be explicit about when such resources are freed; often this is done using a “smart pointer,” a type that looks like a pointer but that provides additional functionality on top of it, such as keeping track of any outstanding references to the pointer and freeing the underlying resource when the last reference is released.
C# provides multiple ways of working with such “unmanaged” resources, resources that, unlike memory, are not implicitly managed by the system. One way is by linking such a resource to a piece of memory; since the system does know how to track objects and to release the associated memory after that object is no longer being referenced, the system allows developers to piggyback on this and to associate an additional piece of logic that should be run when the object is collected. This logic, known as a “finalizer,” allows a developer to create an object that wraps an unmanaged resource, and then to release that resource when the associated object is collected. This can be a significant simplification from a usability perspective, as it allows the developer to treat any resource just as it does memory, allowing the system to automatically clean up after the developer.
However, there are multiple downsides to this approach, and some of the biggest reliability problems in production systems have resulted from an over-reliance on finalization. One issue is that the system is managing memory, not unmanaged resources. It has heuristics that help it to determine the appropriate time to clean up memory based on the system's understanding of the memory being used throughout the system, but such a view of memory doesn't provide an accurate picture about any pressures that might exist on the associated unmanaged resources. For example, if the developer has allocated but then stopped using a lot of file-related objects, unless the developer has allocated enough memory to trigger the garbage collector to run, the system will not know that it should run the garbage collector because it doesn't know how to monitor the “pressure” on the file system. Over the years, a variety of techniques have been developed to help the system with this, but none of them have addressed the problem completely. There is also a performance impact to abusing the GC in this manner, in that allocating lots of finalizable objects can add a significant amount of overhead to the system.
The biggest issue with relying on finalizers is the non-determinism that results. As mentioned, the developer doesn't have control over when exactly the resources will be reclaimed, and this can lead to a wide variety of problems. Consider an object that's used to represent a file: the object is created when the file is opened, and when the object is finalized, the file is closed. A developer opens the file, manipulates it, and then releases the object associated with it; at this point, the file is still open, and it won't be closed until some non-deterministic point in the future when the system decides to run the garbage collector and finalize any unreachable objects. In the meantime, other code in the system might try to access the file, and be denied, even though no one is actively still using it.
To address this, the .NET Framework has provided a means for doing more deterministic resource management: IDisposable. IDisposable is a deceptively simple interface that exposes a single Dispose method. This method is meant to be implemented by an object that wraps an unmanaged resource, either directly (a field of the object points to the resource) or indirectly (a field of the object points to another disposable object), which the Dispose method frees. C# then provides the 'using' construct to make it easier to create resources used for a particular scope and then freed at the end of that scope:
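A minimal sketch of this pattern in today's C#, assuming a hypothetical `TrackedResource` type that records its lifetime in a log (neither the type nor the log comes from the proposal):

```csharp
using System;
using System.Collections.Generic;

// Hypothetical resource wrapper, for illustration only.
public sealed class TrackedResource : IDisposable
{
    private readonly string _name;
    private readonly List<string> _log;
    private bool _disposed;

    public TrackedResource(string name, List<string> log)
    {
        _name = name;
        _log = log;
        _log.Add("open " + _name);
    }

    public void Dispose()
    {
        if (_disposed) return;      // Dispose should tolerate repeated calls
        _disposed = true;
        _log.Add("close " + _name);
    }
}

public static class UsingDemo
{
    public static List<string> Run()
    {
        var log = new List<string>();
        try
        {
            // Dispose runs when control leaves the block, even via an exception.
            using (var file = new TrackedResource("data.txt", log))
            {
                log.Add("work");
                throw new InvalidOperationException("simulated failure");
            }
        }
        catch (InvalidOperationException)
        {
            log.Add("caught");
        }
        return log;
    }
}
```

Calling `UsingDemo.Run()` should yield the entries `open data.txt`, `work`, `close data.txt`, `caught`, in that order: the resource is released before the exception reaches the outer handler.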
Problem
While helpful in doing more deterministic resource management, the IDisposable mechanism does suffer from problems. For one, there's no guarantee made that it will be used to deterministically free resources. You're able to, but not required to, use a 'using' to manage an IDisposable instance.
This is complicated further by cases where an IDisposable instance is embedded in another object. Over the years, FxCop rules have been developed to help developers track cases where an IDisposable goes undisposed, but the rules have often yielded non-trivial numbers of both false positives and false negatives, resulting in the rules often being disabled.
Additionally, the IDisposable pattern is notoriously difficult to implement correctly, compounded by the fact that because objects may not be deterministically disposed of via IDisposable, IDisposable objects also frequently implement finalizers, making the pattern that much more challenging to get right. Helper classes (like SafeHandle) have been introduced over the years to assist with this, but the problem still remains for a large number of developers.
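For context, the conventional shape of that pattern looks roughly like the sketch below. `NativeBuffer` is a hypothetical type introduced here for illustration; it uses `Marshal.AllocHGlobal` as a stand-in for an arbitrary unmanaged resource.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical wrapper around unmanaged memory, for illustration only.
public class NativeBuffer : IDisposable
{
    private IntPtr _handle;

    public NativeBuffer(int size)
    {
        _handle = Marshal.AllocHGlobal(size);
    }

    public bool IsDisposed => _handle == IntPtr.Zero;

    // Deterministic path: called directly or through 'using'.
    public void Dispose()
    {
        Dispose(true);
        GC.SuppressFinalize(this);   // the finalizer is no longer needed
    }

    // Non-deterministic fallback: runs only if Dispose was never called.
    ~NativeBuffer()
    {
        Dispose(false);
    }

    protected virtual void Dispose(bool disposing)
    {
        if (_handle == IntPtr.Zero) return;   // already released, or never allocated

        // When 'disposing' is true, managed IDisposable fields would also be
        // disposed here; this sketch has none.
        Marshal.FreeHGlobal(_handle);
        _handle = IntPtr.Zero;
    }
}
```

Even in this small case the author has to keep two release paths consistent and guard against double release, which is the burden the paragraph above describes.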
Solution: Destructible Types
To address this, we could add the notion of "destructible types" to C#, which would enable the compiler to ensure that resources are deterministically freed. The syntax for creating a destructible type, which could be either a struct or a class, would be straightforward: annotate the type as 'destructible' and then use the '~' (the same character used to name finalizers) to name the destructor.
An instance of this type may then be constructed, and the compiler guarantees that the resource will be destructed when the instance goes out of scope:
No matter what happens in SomeMethod, regardless of whether it returns successfully or throws an exception, the destructor of 'omod' will be invoked as soon as the 'omod' variable goes out of scope at the end of the method, guaranteeing that “Destructed!” will be written to the console.
Note that it's possible for a destructible value type to be initialized to a default value, and as such the destruction could be run when none of the fields have been initialized. Destructible value type destructors need to be coded to handle this, as was done in the 'OutputMessageOnDestruction' type previously by checking whether the message was non-null before attempting to output it.
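The same hazard already exists for disposable structs in today's C#, so it can be sketched without the proposed syntax. The following is an approximation using `IDisposable`, not the proposed destructor; `MessageOnDispose` and its `sink` parameter are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical struct approximating the null check described above.
public struct MessageOnDispose : IDisposable
{
    private readonly string _message;
    private readonly List<string> _sink;

    public MessageOnDispose(string message, List<string> sink)
    {
        _message = message;
        _sink = sink;
    }

    public void Dispose()
    {
        // default(MessageOnDispose) leaves both fields null, so guard first.
        if (_message != null && _sink != null)
        {
            _sink.Add(_message);
        }
    }
}

public static class DefaultValueDemo
{
    public static List<string> Run()
    {
        var sink = new List<string>();
        using (var initialized = new MessageOnDispose("Destructed!", sink)) { }
        using (var empty = default(MessageOnDispose)) { }   // must not throw
        return sink;
    }
}
```

`DefaultValueDemo.Run()` should return a list containing only `Destructed!`: the default-initialized value is cleaned up without writing anything.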
Now, back to the original example, consider what would happen if 'omod' were stored into another variable. We'd then end up with two variables effectively wrapping the same resource, and if both variables were then destructed, our resource would effectively be destructed twice (in our example resulting in “Destructed!” being written twice), which is definitely not what we want. Fortunately, the compiler would ensure this can't happen. The following code would fail to compile:
The compiler would prevent such situations from occurring by guaranteeing that there will only ever be one variable that effectively owns the underlying resource. If you want to assign to another variable, you can do that, but you need to use the 'move' keyword (#160) to transfer the ownership from one to the other; this effectively performs the copy and then zeroes out the previous value so that it's no longer usable. In compiler speak, a destructible type would be a "linear type," guaranteeing that destructible values are never inappropriately “aliased”.
This applies to passing destructible values into method calls as well. In order to pass a destructible value into a method, it must be 'move'd, and when the method's parameter goes out of scope when the method returns, the value will be destructed:
In this case, the value needs to be moved into SomeMethod so that SomeMethod can take ownership of the destruction. If you want to be able to write a helper method that works with a destructible value but that doesn't assume ownership for the destruction, the value can be passed by reference:
In addition to being able to destructively read a destructible instance using 'move' and being able to pass a destructible instance by reference to a method, you can also access fields of or call instance methods on destructible instances. You can also store destructible instances in fields of other types, but those other types must also be destructible types, and the compiler guarantees that these fields will get destructed when the containing type is destructed.
There would be a well-defined order in which destruction happens when destructible types contain other destructible types. Destructible fields would be destructed in the reverse order from which the fields are declared on the containing type. The fields of a derived type are destructed before the fields of a base type. And user-defined code runs in a destructor before the type's fields are destructed.
Similarly, there'd be a well-defined order for how destruction happens with locals. Destructible locals are destructed at the end of the scope in which they are created, in reverse declaration order. Further, destructible temporaries (destructible values produced as the result of an expression and not immediately stored into a storage location) would behave exactly like destructible locals declared at the same position, except that the scope of a destructible temporary is the full expression in which it is created.
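The reverse-declaration ordering for locals can be observed in current C# with `using` declarations (C# 8 and later), which dispose in reverse order at the end of the enclosing scope. This is an analogy rather than the proposed feature; `Scoped` and `OrderDemo` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical disposable that records when it is released.
public sealed class Scoped : IDisposable
{
    private readonly string _name;
    private readonly List<string> _log;

    public Scoped(string name, List<string> log)
    {
        _name = name;
        _log = log;
    }

    public void Dispose() => _log.Add("dispose " + _name);
}

public static class OrderDemo
{
    public static List<string> Run()
    {
        var log = new List<string>();
        Work(log);
        return log;
    }

    private static void Work(List<string> log)
    {
        using var first = new Scoped("first", log);
        using var second = new Scoped("second", log);
        log.Add("body");
        // End of scope: 'second' is disposed, then 'first'.
    }
}
```

`OrderDemo.Run()` should produce `body`, `dispose second`, `dispose first`, so a later local that depends on an earlier one is released while its dependency is still alive.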
Destructible locals may also be captured into lambdas. Doing so results in the closure instance itself being destructible (since it contains destructible fields resulting from capturing destructible locals), which in turn means that the delegate to which the lambda is bound must also be destructible. Just capturing a local by reference into a closure would be problematic, as it would result in a destructible value being accessible both to the containing method and to the lambda. To deal with this, closures may capture destructible values, but only if an explicit capture list (#117) is used to 'move' the destructible value into the lambda (such support would also require destructible delegate types):
The destructible types feature would enable a developer to express some intention around how something should behave, enabling the compiler to then do a lot of heavy lifting for the developer in making sure that the program is as correct-by-construction as possible. Developers familiar with C++ should feel right at home using destructible types, as it provides a solid Resource Acquisition Is Initialization (RAII) approach to ensuring that resources are properly destructed and that resource leaks are avoided.