Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add more entropy to GetTempFileName() on Windows #73793

Closed
danmoseley opened this issue Aug 11, 2022 · 11 comments · Fixed by #74855 or dotnet/dotnet-api-docs#8361
Closed

Add more entropy to GetTempFileName() on Windows #73793

danmoseley opened this issue Aug 11, 2022 · 11 comments · Fixed by #74855 or dotnet/dotnet-api-docs#8361
Assignees
Labels
area-System.IO help wanted [up-for-grabs] Good issue for external contributors
Milestone

Comments

@danmoseley
Copy link
Member

danmoseley commented Aug 11, 2022

On Unix, GetTempFileName() wraps mkstemps, requesting 6 characters of entropy, which should enable about 62^6 = approx 6 x 10^10 possible names, in a pattern like tmpCkPaMC.tmp. Through mkstemps it creates an empty file with that path. Presumably mkstemps passes the appropriate flag to ensure it's creating a new file, although given the amount of entropy it is essentially certain to succeed first time so I don't know whether it bothers to loop.

On Windows, GetTempFileName() simply wraps Windows's GetTempFileNameW. This is limited to allowing 65,536 possible names, in a pattern like tmpABD1.tmp. If all names are in use, as might happen if there is a process on the system with a temp file leak, it fails. Also, it loops through the possibilities passing CREATE_NEW to CreateFile until it succeeds, so if the directory is congested it potentially calls CreateFile thousands of times before succeeding.

It is time to replace the Windows implementation of GetTempFileName() to make it as strong as the Unix one -- robust to temp file leaks and also more efficient. We can do this by generating ~6 chars of randomness ourselves, rater than calling GetTempFileNameW.

Could this be a breaking change? The following seem very unlikely --

  • App depends on the specific size of the name (eg., for sizing a buffer for a pinvoke) -- unlikely as the temp path itself, which prefixes it, varies in length
  • App offers a file picker with a filter like tmp????.tmp to allow a user to pick a file from temp - why would the user pick a file from temp?
  • App enumerates temp with a filter like tmp????.tmp -- I've not encountered an app that enumerates temp in this way, since these files are shared with arbitrary other apps.
  • App is expecting to create 65K files and then fail
  • Temp path length approaching MAX_PATH such that adding 2 characters would hit the limit.
  • Any scripts or code that clean up files matching tmp????.tmp
  • Any other possibilities? @marklio ?

If temp is on a filesystem that requires 8.3 names (FAT32 maybe?) this longer name would fail. If that's the case we should fallback to 4 characters when encountering the particular error writing the file. (Alternatively we could drop tmp prefix to make room, but that might break some app assumption, so I'd be inclined to not)

cc @eerhardt

@ghost
Copy link

ghost commented Aug 11, 2022

Tagging subscribers to this area: @dotnet/area-system-io
See info in area-owners.md if you want to be subscribed.

Issue Details

On Unix, GetTempFileName() wraps mkstemps, requesting 6 characters of entropy, which should enable about 62^6 = approx 6 x 10^10 possible names, in a pattern like tmpCkPaMC.tmp. Through mkstemps it creates an empty file with that path. Presumably mksstemps passes the appropriate flag to ensure it's creating the file, although given the entropy it is essentially certain to succeed first time so I don't know whether it bothers to loop.

On Windows, GetTempFileName() simply wraps Windows's GetTempFileNameW. This is limited to allowing 65,536 possible names, in a pattern like tmpABD1.tmp. If all names are in use, as might happen if there is a process on the system with a temp file leak, it fails. Also, it loops through the possibilities passing CREATE_NEW to CreateFile until it succeeds, so if the directory is congested it potentially calls CreateFile thousands of times before succeeding.

It is time to replace the Windows implementation of GetTempFileName() to make it as strong as the Unix one -- robust to temp file leaks and also more efficient. We can do this by generating ~6 chars of randomness ourselves, rater than calling GetTempFileNameW.

Could this be a breaking change? The following seem very unlikely --

  • App depends on the specific size of the name (eg., for sizing a buffer for a pinvoke) -- unlikely as the temp path itself, which prefixes it, varies in length
  • App offers a file picker with a filter like tmp????.tmp to allow a user to pick a file from temp - why would the user pick a file from temp?
  • App enumerates temp with a filter like tmp????.tmp -- I've not encountered an app that enumerates temp in this way, since these files are shared with arbitrary other apps.
  • App is expecting to create 65K files and then fail
  • Temp path length approaching MAX_PATH such that adding 2 characters would hit the limit.
  • Any scripts or code that clean up files matching tmp????.tmp
  • Any other possibilities? @marklio ?

If temp is on a filesystem that requires 8.3 names (FAT32 maybe?) this longer name would fail. If that's the case we should fallback to 4 characters when encountering the particular error writing the file.

cc @eerhardt

Author: danmoseley
Assignees: -
Labels:

area-System.IO

Milestone: -

@ghost ghost added the untriaged New issue has not been triaged by the area owner label Aug 11, 2022
@danmoseley
Copy link
Member Author

On Windows, we would have a little less variation than Unix, as we need to be case insensitive. So 6 chars would be approx 36 ^ 6 which is 2 x 10^9 - not so great.

It would be better to change to use GetRandomFileName(), which drops the tmp prefix and suffix and creates an 8.3 name with a character set of 32 characters, so 11^32 which is 2 x 10^33 .. vast.

@marklio
Copy link

marklio commented Aug 11, 2022

@danmoseley You say "It's time" to make this change, but I'm curious as to why. Is GetTempFileName() really a source of problem for apps today?

Overall, this list is really good. The other thing to consider from a reliability standpoint is that you could turn a stable "out of entropy and can't create a file" error into errors that the app isn't hardened against like being out of disk space or other disk quota (it used to be common for temp/profile directories to have quotas on shared systems), or overrunning some data structure that hasn't encountered so many files (all things it could hit today, but are limited by this API). All in all, the risk there seems pretty low.

Experience around usage of this API in infrastructure leads me to believe that MAX_PATH issues are likely to be the most impactful. Historically, tests and tests infrastructure have definitely ended up in precarious positions where single character changes/additions push you over the limit. GetTempPathW is easily and commonly steered with commonly-used environment variables. You could take the same approach as your proposed 8.3 fallback in such a case, and basically eliminate that as a source of breakage. The other sources of break seem less likely. A cleanup script that targets this patterns specifically already has to deal with the fact that:

  • the keyspace is shared among many apps (unless it has steered the temp path, in which case there is no need for it to be so specific)
  • it may not be able to clean up temp files in use by other apps

It's unclear how the proposal achieves robustness to temp file leaks. Do you just mean that a leak takes longer to cause a failure to create a unique file name? (which leaves it more likely to encounter some other error as I mentioned above).

@eerhardt
Copy link
Member

Note that in #72881 (comment), it was approved to hide Path.GetTempFileName() and instead expose File.CreateTempFile(string? prefix = null, string? suffix = null);.

We could leave GetTempFileName() as-is today, and instead make the new File.CreateTempFile have "more entropy". We will need to write the logic ourselves anyway on Windows because File.CreateTempFile takes an optional prefix and suffix. Windows's GetTempFileNameW doesn't allow for a suffix to be specified, only a prefix.

@danmoseley
Copy link
Member Author

danmoseley commented Aug 11, 2022

It's unclear how the proposal achieves robustness to temp file leaks. Do you just mean that a leak takes longer to cause a failure to create a unique file name? (which leaves it more likely to encounter some other error as I mentioned above).

It is just as easy to leak (easier actually as you point out since it will take longer to fail). The robustness is simply that it prevents another app's leak from breaking your app. Something that can easily happen today as the space is quite small and Windows has no automated daemon that cleans up temp. It need not be a bug in an app -- imagine you commonly kill apps (installers eg) that write temp files.

Having said that I just looked at my temp and I don't see any tmpXXXX.tmp files -- suggesting it's becoming more common for apps to use their own prefixes. eg I see names like wct9B72.tmp (which likely come from GetTempFileNameW with a custom prefix) but also eg 51ad8f29-cec7-4506-a39c-c51d7119bf39.tmp. But our GetTempFileName() doesn't accept a prefix, so .NET apps aren't being robust like that.

@danmoseley
Copy link
Member Author

danmoseley commented Aug 11, 2022

We could leave GetTempFileName() as-is today, and instead make the new File.CreateTempFile have "more entropy".

We could, but realistically hiding it only redirects newly authored code, there will always be a vast amount of code that uses the existing method. It would be nice to make that more robust/faster, as long as the breakingness is very low.

Yes certainly the new method should have at least as much entropy as the Unix GetTempFileName() and perhaps as much as GetRandomFileName().

@jkotas
Copy link
Member

jkotas commented Aug 11, 2022

Having said that I just looked at my temp and I don't see any tmpXXXX.tmp files -- suggesting it's becoming more common for apps to use their own prefixes.

You can find many posts on stackoverflow and elsewhere going 10+ years that basically say "Do not use GetTempFileName, it is a broken API". The fact that you do not see any tmpXXXX.tmp files suggests that it is indeed a broken API and all mainstream apps migrated away from using it.

@danmoseley
Copy link
Member Author

You can find many posts on stackoverflow and elsewhere going 10+ years that basically say "Do not use GetTempFileName, it is a broken API". The fact that you do not see any tmpXXXX.tmp files suggests that it is indeed a broken API and all mainstream apps migrated away from using it.

I agree, and we should fix our API to help .NET apps do that without having to write their own code and potentially get it wrong.

@adamsitnik adamsitnik added this to the 8.0.0 milestone Aug 12, 2022
@ghost ghost removed the untriaged New issue has not been triaged by the area owner label Aug 12, 2022
@danmoseley danmoseley added the help wanted [up-for-grabs] Good issue for external contributors label Aug 17, 2022
@vcsjones
Copy link
Member

I don't know if it has a material impact or not, but I thought I would point out that VB's FileSystem.GetTempFileName() defers to this API, so changing this API will change the behavior there as well.

Public Shared Function GetTempFileName() As String
Return System.IO.Path.GetTempFileName()
End Function

Do we want VB to retain its existing behavior, or is inheriting this new behavior okay?

@danmoseley
Copy link
Member Author

Is there any reason to be particularly concerned about back compat when called through the VB API?

@ghost ghost added in-pr There is an active PR which will close this issue when it is merged and removed in-pr There is an active PR which will close this issue when it is merged labels Aug 31, 2022
@ghost ghost removed the in-pr There is an active PR which will close this issue when it is merged label Sep 2, 2022
@danmoseley danmoseley reopened this Sep 2, 2022
@danmoseley
Copy link
Member Author

Reopening to update docs

@danmoseley danmoseley self-assigned this Sep 2, 2022
@ghost ghost locked as resolved and limited conversation to collaborators Oct 7, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-System.IO help wanted [up-for-grabs] Good issue for external contributors
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants