EEXIST error when using functions_worker_process_count #1230

Closed
ejizba opened this issue Oct 16, 2023 · 4 comments

ejizba commented Oct 16, 2023

In Azure Functions, users can run multiple worker processes on the same instance if they use the functions_worker_process_count setting (docs here). Unfortunately, that can lead to the following uncaught exception:

Error while saving data to disk: [object Error]{ stack: 'Error: EEXIST: file already exists, mkdir 'C:\local\Temp\appInsights-node...

I don't know the best behavior here, but at minimum this error should be handled more gracefully instead of throwing an uncaught exception which can crash the Node.js process.
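For illustration, one way to handle this more gracefully is to treat "directory already exists" as success when several processes race to create the same directory. This is a minimal sketch, not the SDK's actual code: the helper name `ensureDirSync` and the directory name `example-telemetry-cache` are made up for the example.

```javascript
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

// Create a directory, treating "already exists" as success.
// Hypothetical helper for illustration; not part of any SDK.
function ensureDirSync(dir) {
  try {
    fs.mkdirSync(dir);
  } catch (err) {
    // Another process may have created the directory first.
    // Only EEXIST is swallowed; every other error is rethrown.
    if (err.code !== "EEXIST") {
      throw err;
    }
  }
}

// Hypothetical cache directory name, assumed for this example.
const cacheDir = path.join(os.tmpdir(), "example-telemetry-cache");
ensureDirSync(cacheDir);
ensureDirSync(cacheDir); // the second call hits EEXIST internally and does not throw
```

Note that this also swallows EEXIST when a regular file already sits at that path, so a real implementation may want to check that the path is a directory. Calling `fs.mkdirSync(dir, { recursive: true })` is another option, since it does not raise an error when the directory already exists.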

JacksonWeber self-assigned this Jan 26, 2024
JacksonWeber added a commit to Azure/azure-sdk-for-js that referenced this issue Feb 2, 2024
…for Telemetry Caching (#28399)

### Packages impacted by this PR
@azure/monitor-opentelemetry-exporter

### Issues associated with this PR
microsoft/ApplicationInsights-node.js#1230

### Describe the problem that is addressed by this PR
Append the process ID to the file name created for holding disk cached
telemetry. This should resolve the issue with multiple Azure Functions
cores attempting to read/write/delete the same file when functions are
scaled to use multiple cores.

Extended this logic outside of Azure Functions so that in any case where
the SDK could be run concurrently we create distinct cache files.

### Command used to generate this PR:**_(Applicable only to SDK release
request PRs)_

### Checklists
- [x] Added impacted package name to the issue description
- [ ] Does this PR needs any fixes in the SDK Generator?** _(If so,
create an Issue in the
[Autorest/typescript](https://github.com/Azure/autorest.typescript)
repository and link it here)_
- [x] Added a changelog (if necessary)
v-weiyding pushed a commit to v-weiyding/azure-sdk-for-js that referenced this issue Feb 4, 2024
…for Telemetry Caching (Azure#28399)

@JacksonWeber
Contributor

@ejizba Just wanted to confirm with you that the solution implemented in the above PRs works for you. Thanks.

@ejizba
Author

ejizba commented May 17, 2024

@JacksonWeber this was a customer incident and I don't know exactly how to repro this issue myself. I tried on an old version of the appinsights package but I didn't see the error. Do you have any tips for how to hit this code path? When is that file used?

@JacksonWeber
Contributor

@ejizba IIRC the only way I could figure out to create the situation required to reproduce this problem was to create an Azure Function with access to multiple threads, and then have a sufficiently resource-intensive process run on that function. I wasn't able to reproduce it, but the logic behind the PR was to create a naming convention for cache files that includes the PID, so that no two processes could attempt to operate on the same file at once, even in the situation the customer described.
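As a rough sketch of the PID-based naming idea described above (not the code from the PR): the function name `cacheFilePath`, the timestamp prefix, the `.ai.json` suffix, and the directory name are all assumptions made for illustration.

```javascript
const os = require("node:os");
const path = require("node:path");

// Build a cache file path that differs per process by embedding the PID.
// The "<timestamp>-<pid>.ai.json" layout is an assumed convention for this
// example, not the exporter's actual file name format.
function cacheFilePath(baseDir, pid = process.pid, now = Date.now()) {
  return path.join(baseDir, `${now}-${pid}.ai.json`);
}

// Hypothetical cache directory, assumed for this example.
const baseDir = path.join(os.tmpdir(), "example-telemetry-cache");

// Two worker processes writing at the same millisecond still get distinct files.
const fileForWorkerA = cacheFilePath(baseDir, 1001, 1700000000000);
const fileForWorkerB = cacheFilePath(baseDir, 1002, 1700000000000);
console.log(fileForWorkerA !== fileForWorkerB);
```

The processes can still share one cache directory under this scheme; what changes is that each process only reads, writes, and deletes files carrying its own PID.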

@ejizba
Author

ejizba commented May 17, 2024

Okay I reviewed the PRs and they look good to me. This was only a single incident and it was arguably not even their main problem (high CPU). I think it's safe to close for now and we can just open a new issue if we ever see it happen again.

ejizba closed this as completed May 17, 2024