Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[API Proposal]: DirectoryInfo.Copy / Directory.Copy #60903

Open
Symbai opened this issue Oct 27, 2021 · 37 comments
Open

[API Proposal]: DirectoryInfo.Copy / Directory.Copy #60903

Symbai opened this issue Oct 27, 2021 · 37 comments
Labels
api-needs-work API needs work before it is approved, it is NOT ready for implementation area-System.IO
Milestone

Comments

@Symbai
Copy link

Symbai commented Oct 27, 2021

Background and motivation

The purpose is to copy the entire directory to a new place, optionally recursive. Currently the only way is to create the target directory first, then enumerate all files in the source directory and copy them one by one. While we already have DirectoryInfo.MoveTo() and DirectoryInfo.Delete() many developers are looking forward for DirectoryInfo.Copy as well, see high voted https://stackoverflow.com/questions/58744/copy-the-entire-contents-of-a-directory-in-c-sharp

In terms for exception handling it could be the same as for the existing Move or File.Copy methods. There is a further discussion here on improving the exception handling but I believe this should be tracked and discussed in another issue. This proposal is for closing the gap only.

//Edit: 11/25/21, added overwriteExistingFiles overload to main proposal based on @jozkee feedback.

API Proposal

namespace System.IO
{
    public sealed class DirectoryInfo : FileSystemInfo
    {
        public void Copy(string destDirName);
        public void Copy(string destDirName, bool recursive);
        public void Copy(string destDirName, bool recursive, bool overwriteExistingFiles);
    }
}

API Usage

//OLD behavior
foreach (string dir in Directory.GetDirectories("F:\\Folder", "*", SearchOption.AllDirectories))
{
	Directory.CreateDirectory(dir.Replace("F:\\Folder", "F:\\newFolder"));
}
foreach (string file in Directory.GetFiles("F:\\Folder", "*.*",SearchOption.AllDirectories))
{
	File.Copy(file, file.Replace("F:\\Folder", "F:\\newFolder"), true);
}

//New behavior
DirectoryInfo di = new("F:\\Folder");
di.Copy("F:\\newFolder", true);

Alternative Designs

Alternatively it could (also?) be part of System.IO.Directory class, they also have Move and Delete methods.

namespace System.IO
{
    public static class Directory
    {
        public void Copy(string sourceDirName, string destDirName);
        public void Copy(string sourceDirName, string destDirName, bool recursive);
        public void Copy(string sourceDirName, string destDirName, bool recursive, bool overwriteExistingFiles);
    }
}

Risks

No response

@Symbai Symbai added the api-suggestion Early API idea and discussion, it is NOT ready for implementation label Oct 27, 2021
@dotnet-issue-labeler dotnet-issue-labeler bot added area-System.IO untriaged New issue has not been triaged by the area owner labels Oct 27, 2021
@ghost
Copy link

ghost commented Oct 27, 2021

Tagging subscribers to this area: @dotnet/area-system-io
See info in area-owners.md if you want to be subscribed.

Issue Details

Background and motivation

The purpose is to copy the entire directory to a new place, optionally recursive. Currently the only way is to create the target directory first, then enumerate all files in the source directory and copy them one by one. While we already have DirectoryInfo.MoveTo() and DirectoryInfo.Delete() many developers are looking forward for DirectoryInfo.Copy as well, see high voted https://stackoverflow.com/questions/58744/copy-the-entire-contents-of-a-directory-in-c-sharp

In terms for exception handling it could be the same as for the existing Move methods.

API Proposal

namespace System.IO
{
    public sealed class DirectoryInfo : FileSystemInfo
    {
        public void Copy(string destDirName);
        public void Copy(string destDirName, bool recursive);
    }
}

API Usage

//OLD behavior
foreach (string dir in Directory.GetDirectories("F:\\Folder", "*", SearchOption.AllDirectories))
{
	Directory.CreateDirectory(dir.Replace("F:\\Folder", "F:\\newFolder"));
}
foreach (string file in Directory.GetFiles("F:\\Folder", "*.*",SearchOption.AllDirectories))
{
	File.Copy(file, file.Replace("F:\\Folder", "F:\\newFolder"), true);
}

//New behavior
DirectoryInfo di = new("F:\\Folder");
di.Copy("F:\\newFolder", true);

Alternative Designs

Alternatively it could (also?) be part of System.IO.Directory class instead, they also have Move and Delete methods.

namespace System.IO
{
    public static class Directory
    {
        public void Copy(string sourceDirName, string destDirName);
        public void Copy(string sourceDirName, string destDirName, bool recursive);
    }
}

Something else to think about is if we add a third overload to optionally overwrite existing files. While this is useful only Delete methods have these but existing Move not.

namespace System.IO
{
    public static class Directory
    {
        public void Copy(string sourceDirName, string destDirName, bool recursive, bool overwriteExistingFiles);
    }

    public sealed class DirectoryInfo : FileSystemInfo
    {
        public void Copy(string destDirName, bool recursive, bool overwriteExistingFiles);
    }
}

Risks

No response

Author: Symbai
Assignees: -
Labels:

api-suggestion, area-System.IO, untriaged

Milestone: -

@FilipToth
Copy link
Contributor

Does seem OK, but I would definitely also go for both the alternative design and the original design. I think it would be weird for us to only implement this inside Directory or only inside DirectoryInfo. We definitely don't want any unexpected differences between the two API surfaces.

@batzen
Copy link
Contributor

batzen commented Oct 27, 2021

Getting some kind of status updates during copy operations would also be nice.
What would be of interest, at least for me, would be:

  • Number of directories being copied
  • Number of files being copied
  • Total number of items being copied
  • Current number of items copied (this should be some kind of live reporting)

@Symbai
Copy link
Author

Symbai commented Oct 27, 2021

Getting some kind of status updates during copy operations would also be nice. What would be of interest, at least for me, would be:

Yes I absolutely agree, but you don't get these from existing Move APIs as of now. For a higher chance that this API will get accepted at all, my propose stick to same behavior as the existing Move APIs. Adding status updates is probably better for another API proposal, in which case proposes to add these status updates to other existing APIs (Move, System.IO.File APIs etc). I don't think it makes sense to discuss something like this in this API request.

@Frassle
Copy link
Contributor

Frassle commented Nov 1, 2021

Yes I absolutely agree, but you don't get these from existing Move APIs as of now.

Move is very different behavior to copy. Move is just a rename which is why it throws if you try and move to a different volume, so it can't emit any of this data because its a single atomic operation.
Copy has to copy each thing over file by file and so those statistics are relevant. There's also a lot more cases for this to fail. Like what if you've copied over half then there's an IO error on one file, or the new location runs out of space, or two copies are happening at once and your getting conflicts. There's also questions of if the copy should be done in parallel or not, and what order (breadth vs depth) the copy should be done in.

It would be really handy to have a built-in copy function but it needs to be carefully thought out.

@Symbai
Copy link
Author

Symbai commented Nov 1, 2021

That's interesting, didn't know that. On Windows if it fails on one file in the middle, the whole copy operation fails (in some cases you can skip the failing file but not in all). I would have expect something similar from .NET api. copying files one by one and simply throws if it fails on one file. Running out of space is something the developer can check himself first. For any other IO issue it can just throw. Developer can catch, handle and re-try if necessary. This is clearly not THE BEST behavior but in order to make it the best behavior there is a lot of other stuff to do like @batzen suggested. If you look at the high voted solutions from Stackoverflow its also just file copy one by one. If it fails somewhere the whole operation fails. There is no deeper logic in here for now and it still makes many devs happy enough. You can also look at the recommended solution on MSDN or look into the VB API, its the same. So if three different sources are behaving the same then I would suggest to do the same. Highest chance this API gets approved. If we lose ourselves in questioning how we can handle each edge scenario, API will likely get denied because of being too complex and too different to any existing IO api. My proposal is to have an inbuilt solution for something simple and useful like that, not relying to Stackoverflow/VB or MDSN. It can easily be the same/similar implementation.

@Clockwork-Muse
Copy link
Contributor

On Windows if it fails on one file in the middle, the whole copy operation fails (in some cases you can skip the failing file but not in all).

The operation fails, but does it roll back the files already copied? If not, now you have to figure out the resulting state.

Running out of space is something the developer can check himself first.

Nope (or at least, that's not sufficient). There's multiple reasons the available space can change in the middle of the copy operation (and by that I mean during the file byte copy).

That aside, I'm assuming that, like pretty much the rest of the IO stack we're going to be wrapping some OS function here, so capabilities may be dictated by what the various OS flavors provide.

@Mrxx99
Copy link
Contributor

Mrxx99 commented Nov 10, 2021

Does it also make sense to provide these new Copy methods as async aswell? (But this would also apply to File.Copy)

@Symbai
Copy link
Author

Symbai commented Nov 11, 2021

Does it also make sense to provide these new Copy methods as async aswell? (But this would also apply to File.Copy)

This should be tracked by another issue, for example here #20695 (see also #20697). This issue is only about filling the gap and provide a basic directory copy method first.

@ygra
Copy link

ygra commented Nov 21, 2021

Another likely complication, or at least something where a decision had to be made:

  • Symbolic links and directory junctions (volume mount points are similar, right?): copy the link or the target?

@ghost ghost added the needs-further-triage Issue has been initially triaged, but needs deeper consideration or reconsideration label Nov 21, 2021
@Symbai
Copy link
Author

Symbai commented Nov 21, 2021

Another likely complication, or at least something where a decision had to be made:

  • Symbolic links and directory junctions (volume mount points are similar, right?): copy the link or the target?

Simply follow the behavior of File.Copy. If the current behavior is not what you believe it should be than make another issue.

@AraHaan
Copy link
Member

AraHaan commented Nov 22, 2021

That's interesting, didn't know that. On Windows if it fails on one file in the middle, the whole copy operation fails (in some cases you can skip the failing file but not in all). I would have expect something similar from .NET api. copying files one by one and simply throws if it fails on one file. Running out of space is something the developer can check himself first. For any other IO issue it can just throw. Developer can catch, handle and re-try if necessary. This is clearly not THE BEST behavior but in order to make it the best behavior there is a lot of other stuff to do like @batzen suggested. If you look at the high voted solutions from Stackoverflow its also just file copy one by one. If it fails somewhere the whole operation fails. There is no deeper logic in here for now and it still makes many devs happy enough. You can also look at the recommended solution on MSDN or look into the VB API, its the same. So if three different sources are behaving the same then I would suggest to do the same. Highest chance this API gets approved. If we lose ourselves in questioning how we can handle each edge scenario, API will likely get denied because of being too complex and too different to any existing IO api. My proposal is to have an inbuilt solution for something simple and useful like that, not relying to Stackoverflow/VB or MDSN. It can easily be the same/similar implementation.

that space issue can be avoided by first obtaining the size of each file that needs to be copied and then using the new .NET 6 file apis to reserve that space so that way the copy operation won't fail.

There is also another added benefit of doing that as well, you can then throw if the reservation of space for a specific file fails and be able to tell them for which file it happened on in the exception message.

@jozkee
Copy link
Member

jozkee commented Nov 24, 2021

Simply follow the behavior of File.Copy.

Agree.

The proposal makes sense, I would just add the overwriteExistingFiles to the main proposal.

@jozkee jozkee added api-ready-for-review API is ready for review, it is NOT ready for implementation and removed api-suggestion Early API idea and discussion, it is NOT ready for implementation untriaged New issue has not been triaged by the area owner labels Nov 24, 2021
@bartonjs
Copy link
Member

bartonjs commented Nov 30, 2021

Video

  • We changed the 3 overloads to two with default parameters.
  • We added cancellation to the still-synchronous methods
  • We renamed some parameters
  • We added a boolean return value. False indiciates that something couldn't be read (a file couldn't be read, or a directory couldn't be enumerated). Write failures should throw.
namespace System.IO
{
    public partial class DirectoryInfo : FileSystemInfo
    {
        public bool CopyTo(string destinationPath, bool recursive);
        public bool CopyTo(string destinationPath, bool recursive, bool skipExistingFiles=true, CancellationToken cancellationToken=default);
    }
    public partial class Directory
    {
        public static bool Copy(string sourcePath, string destinationPath, bool recursive);
        public static bool Copy(string sourcePath, string destinationPath, bool recursive, bool skipExistingFiles=true, CancellationToken cancellationToken=default);
    }
}

@bartonjs bartonjs added api-approved API was approved in API review, it can be implemented and removed api-ready-for-review API is ready for review, it is NOT ready for implementation labels Nov 30, 2021
@adamsitnik adamsitnik added help wanted [up-for-grabs] Good issue for external contributors and removed needs-further-triage Issue has been initially triaged, but needs deeper consideration or reconsideration labels Dec 1, 2021
@carlossanlop carlossanlop removed the help wanted [up-for-grabs] Good issue for external contributors label Dec 4, 2021
@Symbai
Copy link
Author

Symbai commented Dec 10, 2021

@deeprobin Did you gave up? No response on your PR for 6 days.

@AraHaan
Copy link
Member

AraHaan commented Dec 11, 2021

No, they just need to work everything out with reviews first.

@deeprobin
Copy link
Contributor

@deeprobin Did you gave up? No response on your PR for 6 days.

I have not given up. However, I also have other PRs to work on apart from this one still needs clarification, as @AranHaan already said.

No, they just need to work everything out with reviews first.

If I have the time, I will of course add the ref-assembly implementation, doc-comments and tests.

@GSPP
Copy link

GSPP commented Dec 17, 2021

Consider using an enum for those options? I'm sure that more will come up in the future (for example options around permissions). Adding booleans is not a scalable API design. Also, an enum would be more descriptive to read.


I wonder why someone would not want a recursive copy. Is this needed at all?


What does the overload void Copy(string destDirName) do? Is it recursive or not? Nobody can know until they consult the documentation. It cannot be deduced logically. It might be better to require the recursion parameter (or the proposed options enum).

@ghost ghost added the in-pr There is an active PR which will close this issue when it is merged label Jan 20, 2022
@ghost ghost removed the in-pr There is an active PR which will close this issue when it is merged label Apr 15, 2022
@jozkee jozkee added api-needs-work API needs work before it is approved, it is NOT ready for implementation and removed api-approved API was approved in API review, it can be implemented labels Apr 15, 2022
@jozkee
Copy link
Member

jozkee commented Apr 15, 2022

More feedback came up on the implementing PR. We need to reach consensus on the error policy before adding this API.

#62375 (comment)

If it returns false then now I have to do something more complicated

Do what? How do you know what went wrong and what to do differently next time / how to fix it so it doesn't happen again?

With no retry policy/error handler callback/etc, the method feels, to me, too trivial to be worth adding if it throws on the first failure (except for a few degenerate cases, like "your destination is an illegal path").

I agree with that. I also think it's not worth adding if it only does some of what you asked it to do and happily returns false with no additional information. If we want to throw an exception after it's completed copying whatever it can, that's fine; the API review discussed either throwing the first exception or the last exception or an aggregate... any of those seems better than returning false (though with an aggregate I'd worry about memory impact if lots of things in the copy failed).

@iSazonov
Copy link
Contributor

iSazonov commented Jun 16, 2022

We could consider follow https://en.cppreference.com/w/cpp/filesystem/copy as good start point.
As for error handling, it seems we have no need in "error policy". I'd accept that we do in file/directory enumeration API - use ContinueOnError().
Of cause we need to improve and implement the API:
#1953 (bool ContinueOnErrorPredicate(FileSystemErrorInfo errorInfo))
#52666
and perhaps other APIs for links (but of cause they can be internal implementation details at first step).

@Symbai
Copy link
Author

Symbai commented Jun 16, 2022

We could consider follow https://en.cppreference.com/w/cpp/filesystem/copy as good start point.

Why not considering the VB.NET implementation instead? It already has a .NET runtime implementation which can be taken and improved where it needs. It also handles errors which way we can change if necessary. And it's been one of the recommended solutions of "How can I do this in C#" 🤷‍♂️

I mean of course we can take a completely different language implementation as a reference point. Completely starting the .NET implementation from scratch. Making it much more complex than necessary with callbacks. Its just that... why making it more difficult than necessary? For me the VB.NET implementation is pretty much perfect because:

  1. It continues on errors
  2. Its a one-liner. I dont have to hook an event, use try-catch etc.

The only thing I would probably change is that it not only returns the last exception but all exceptions, so I could see for example all the files which have failed if necessary.

@iSazonov
Copy link
Contributor

Why not considering the VB.NET implementation instead?

Can you add a reference on the implementation?

@iSazonov
Copy link
Contributor

Internal implementation: https://source.dot.net/#Microsoft.VisualBasic.Core/Microsoft/VisualBasic/FileIO/FileSystem.vb,f7120b635fcfb47f,references Public API: https://source.dot.net/#Microsoft.VisualBasic.Core/Microsoft/VisualBasic/FileIO/FileSystem.vb,0088e80b4572af96

Here it copy the files one by one and continues on errors (but saving the exception): https://source.dot.net/#Microsoft.VisualBasic.Core/Microsoft/VisualBasic/FileIO/FileSystem.vb,1039

@Symbai Thanks for additional information! Looking VB implementation I think it is too "naive" and does not meet the requirements of the .Net C# community at all.

@Symbai
Copy link
Author

Symbai commented Jun 17, 2022

Can you be more specific? And did you by chance misread my previous comment where I wrote can be taken and improved where it needs. Because I have no idea what else you are going to want from a method that copies files from A to B. I for example who proposed the API have no other needs and if you would also consider the official C# recommendation that is even more "naive". So please tell us exactly what's missing and why it cannot be added when taking the VB.NET method as a reference.

@Symbai
Copy link
Author

Symbai commented Aug 23, 2022

@jozkee @stephentoub

What's the status on this? The proposal has been approved but the PR has been rejected, because the implementation wasn't good enough. What's next? Does this need to be reviewed again because in the review process you guys and girls are discussing how it should get implemented? If so, could someone please re-add the "ready-for-review" label? If not, what needs to be done?

@stephentoub
Copy link
Member

#74273 (comment)

@terrajobst
Copy link
Member

terrajobst commented Aug 24, 2022

Let me summarize:

  1. The idea was to provide a convenience method for utilities/scripting scenarios. In that vein, we felt it wasn't useful to throw on the first error because it would make the method complex (e.g. by taking a Func<IOException, bool> or something like that).

  2. Originally, the idea was that we simply ignore any errors and do a best effort copy. During the review @bartonjs suggested to at least return a bool to signal that not everything was copied.

  3. Based on the discussion in he PR it seems people fundamentally disagree with that API.

I'm at the point now where I think we should reject this feature and let people do their own recursive walk because that means they have full control over the policy because offering an API with a configurable policy seems more complicated.

@deeprobin
Copy link
Contributor

@terrajobst
I agree with you there. Adding a feature like this to the constillation doesn't make sense in my opinion because of the error handling, which every user would probably need differently.

for ex.
User A wants a best-effort copy
User B wants to abort on error
User C wants to undo the changes in case of an error

@iSazonov
Copy link
Contributor

This isn't the first time I've been confused as to why having overwriteExistingFiles in the API is more important than other scenarios.
What if I want to add a filter of what to copy or skip? Or collect all copy errors or have custom error handler? These questions arise for any file API which uses some enumerator internally but only for API from the issue.

@GSPP
Copy link

GSPP commented Aug 29, 2022

Is the set of possible configuration options really that large? I'd say that the following (configured as an enum) would cover 99% of the use cases.

  • Recursive
  • Overwrite
  • Continue on error

I'd say that the settings "recursive, no overwrite, no continue on error" are already the 95% use case. Most programmatic copies are really simple... Just duplicate a plain directory without any complications. Programmatic copies tend to be much simpler than what a user would want to do in a GUI such as Windows Explorer.

As suggested above, symbolic links would simply match the File.Copy behavior.

This feature does not have to be a swiss army knife. If a user wants that they have to do it themselves.

In my view, it is a long-standing missing feature that .NET cannot... copy a directory. This seems like a highly desirable thing to have. I certainly have needed it.

Downsides of writing a custom copy algorithm:

  • It takes a non-trivial amount of work.
  • It can be quite hard for junior developers (recursion).
  • Thinking about edge cases in the file system is non-trivial.
  • There are bad ways of doing it (unclear code, slow performance).
  • It feels like an omission in .NET.

Can we at least keep this issue open as a tracking issue? Closing it would be a strong signal for future visitors that the very concept has been rejected. It seems that, at this time, the team is merely not seeing a good path to do it but they do not reject the idea outright.

@Clockwork-Muse
Copy link
Contributor

  • OverwriteIfNewer

Maybe? Or the ability to provide a filter func.

@WeihanLi
Copy link
Contributor

Any update so far?

@Symbai
Copy link
Author

Symbai commented Nov 13, 2023

Any update so far?

No. According to stephentoub this proposal needs someone from the .NET runtime team, who really cares about it, to push it forward, see his comment #74273 (comment) but I cannot explain you why.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api-needs-work API needs work before it is approved, it is NOT ready for implementation area-System.IO
Projects
None yet
Development

Successfully merging a pull request may close this issue.