Referenced Object is never recycled even if napi_delete_reference is called #39915
Comments
There is never any guarantee about when the gc will collect objects or when finalizers will run, and no real guarantee that RSS will go down when memory is released, so it's often hard to be sure whether there is a real leak or not. Have you tried running for a long period and seeing if it actually fails due to an out of memory at some point? |
After this change:

async function main() {
  let i = 0
++  while (true) {
--  for (const [i] of Array.from({ length: 10000000 }).entries()) {
    let refCount = refBuffer(BigBuffer)
    if (refCount) {
      throw new Error(`refCount should be 0, but got ${refCount}`)
    }
    if (i % 1000000 === 0) {
      await sleep()
      if (typeof global.gc === 'function') {
        global.gc()
      }
      displayMemoryUsageFromNode(initial)
    }
++    i++
  }
}

And after running this program for a while, it gets stuck and the console outputs the memory usage:
And I removed the into_ref / unref calls (reading the buffer value directly instead):

#[js_function(1)]
fn ref_buffer(ctx: napi::CallContext) -> napi::Result<napi::JsNumber> {
  let buffer = ctx.get::<napi::JsBuffer>(0)?;
++  let ref_buffer = buffer.into_value()?;
++  let ref_count = ref_buffer.len();
++  ctx.env.create_uint32(ref_count as u32)
--  let ref_buffer = buffer.into_ref()?;
--  let ref_count = ref_buffer.unref(ctx.env.clone())?;
--  ctx.env.create_uint32(ref_count)
}

And the program seems to be fine afterward:
|
I've updated the code in https://github.com/Brooooooklyn/node-ref-object-leak:

#[js_function(1)]
fn ref_buffer(ctx: napi::CallContext) -> napi::Result<napi::JsNumber> {
--  let buffer = ctx.get::<napi::JsBuffer>(0)?;
--  let ref_buffer = buffer.into_ref()?;
++  let buffer = ctx.get::<napi::JsObject>(0)?;
++  let ref_buffer = ctx.env.create_reference(buffer)?;
  let ref_count = ref_buffer.unref(ctx.env.clone())?;
  ctx.env.create_uint32(ref_count)
}

And if I pass { BigBuffer } (a fresh object wrapping the Buffer) instead of the Buffer itself, the program is fine:

const { readFileSync } = require('fs')
const { refBuffer } = require('./index.node')
const { displayMemoryUsageFromNode } = require('./util')

const BigBuffer = readFileSync('yarn.lock')
const initial = process.memoryUsage()

async function main() {
  let i = 0
  while (true) {
--    let refCount = refBuffer(BigBuffer) // Memory leak
++    let refCount = refBuffer({ BigBuffer }) // fine
    if (refCount) {
      throw new Error(`refCount should be 0, but got ${refCount}`)
    }
    if (i % 1000000 === 0) {
      await sleep()
      if (typeof global.gc === 'function') {
        global.gc()
      }
      displayMemoryUsageFromNode(initial)
    }
    i++
  }
}

function sleep() {
  return new Promise((resolve) => {
    setTimeout(resolve, 1000)
  })
}

main()
|
Looking at the Node-API code: Line 2452 in 73d5f8a
One thing that change does do is introduce an allocation point where gc might be triggered. If you run with --trace-gc, do you see a difference in the behaviour? I also see that you ref the same BigBuffer, so unless I misunderstand, the size of that should not make a difference since there is only one of those objects regardless of how many times it is referenced. |
Spent some more time looking at this today. Changing the code to be:

const { readFileSync } = require('fs')
const { refBuffer } = require('./index.node')
const { displayMemoryUsageFromNode } = require('./util')

const BigBuffer = new Object();
const initial = process.memoryUsage()

async function main() {
  let i = 0
  while (true) {
    let refCount = refBuffer(new Object())
    if (refCount) {
      throw new Error(`refCount should be 0, but got ${refCount}`)
    }
    if (i % 1000000 === 0) {
      await sleep()
      if (typeof global.gc === 'function') {
        global.gc()
      }
      displayMemoryUsageFromNode(initial)
    }
    i++
  }
}

function sleep() {
  return new Promise((resolve) => {
    setTimeout(resolve, 100)
  })
}

main()

I see that rss grows to about 22MB and then stays there. If I then edit the rust code to remove the unref, then the rss quickly grows, reaching several GB in 3-4 seconds. Fixing the code to add the unref back, I then see that rss is stable again. That confirms that unref is doing what we think in the case where we are passing in a new object each time. If I then change the code so that instead of a new object it references the same object (one created through simply new Object()) every time, I again see the increasing rss. This is the change to the code above to see the difference in behaviour:

let refCount = refBuffer(BigBuffer)
// let refCount = refBuffer(new Object())
|
Next I modified the code so that every time the stats were printed out we would create a new object and then use that for the call to refBuffer (with BigBuffer declared using let so it can be reassigned):

.
.
.
      displayMemoryUsageFromNode(initial)
      BigBuffer = new Object();
    }

After doing that the rss grew to 393MB (it went up higher, then shrank back down) and stayed steady at that level. |
Based on the above, this is what I think is happening. The following code creates a reference to Object A. That reference is then unrefed, but the reference itself is not deleted. When the finalizer for the object runs it will delete the reference for you.

let ref_buffer = ctx.env.create_reference(buffer)?;
let ref_count = ref_buffer.unref(ctx.env.clone())?;

BUT before the gc runs, triggering the finalizer, another reference is created to the same Object A. This means it will not be collected by the gc, the finalizer is not run, and the first reference that was created will not be cleaned up.

This matches the other data: if we use a new object every time we don't see any growth in RSS, since it's a new Object each time, so creating the second reference does not prevent the earlier one from being cleaned up. It also matches the case where we re-use the same Object for a period of time and then start using another one: we see RSS grow, but then stabilize as we end up with a maximum number of references held alive related to the same object.

So I think this is working as designed, even if it's not exactly what we'd like. We need to use the finalizer for the object to clean up the C structure allocated for the reference. If you prevent the Object from being collected, and therefore the finalizer from being run, by creating another reference, you will retain memory until the Object can be collected. Therefore, repeatedly creating a new reference on the same object will cause a memory leak, as the Object will never be able to be collected. That is something that is easy to make happen in a simple test but unlikely to occur in a real-life scenario. |
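To make the pattern described above concrete, here is a minimal sketch of the same per-call sequence using the raw Node-API C calls (the repro itself uses napi-rs; the helper name below is illustrative and error handling is omitted):

#include <node_api.h>

// `value` is the same JS object (Object A) on every call.
static uint32_t ref_and_unref(napi_env env, napi_value value) {
  napi_ref ref;
  napi_create_reference(env, value, 1, &ref);   // allocates a native reference structure

  uint32_t refcount = 0;
  napi_reference_unref(env, ref, &refcount);    // count drops to 0, but `ref` itself still exists

  // No napi_delete_reference here: the native structure is only freed when the
  // object's finalizer runs. If a new reference to the same object keeps being
  // created before the gc can collect it, the finalizer never runs and these accumulate.
  return refcount;
}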
I'll add that even if you delete the reference in addition to unrefing it, the same behaviour may occur. This is because if a finalizer is queued we need to wait to delete the reference until after the finalizer runs. You can see the code that could end up deferring the delete until after the finalizer runs here: Line 507 in 861c3e0
However, if you avoid unrefing before you delete, the delete will take place immediately as we know it could not have been queued for finalization since the reference count was >0. |
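A minimal sketch of that shape with the raw Node-API C calls (illustrative helper name, error handling omitted): delete the reference while its count is still above zero, rather than unrefing it and leaving cleanup to the object's finalizer.

#include <node_api.h>

static void ref_then_delete(napi_env env, napi_value value) {
  napi_ref ref;
  napi_create_reference(env, value, 1, &ref);  // strong reference, count == 1

  // ... use the reference while it is needed ...

  // Per the comment above: because the count is still > 0, the reference cannot have
  // been queued for finalization, so this delete takes place immediately.
  napi_delete_reference(env, ref);
}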
@mhdawson IINM when we create a buffer on the native side with |
@gabrielschulhof I need a bit more context/info on how that might resolve the issue. |
Based on discussion in the Node-API team meeting, we agreed that adding some additional explanation to the documentation would help, along with a recommendation to do a delete instead of an unref when that makes sense. |
Created this PR to update the documentation - #42035 |
There is now a concept of a nogc callback which helps with this problem: https://nodejs.org/docs/latest/api/n-api.html#node_api_nogc_finalize. @Brooooooklyn can you validate that this fixes the issue for you? |
@mhdawson I'll try it asap, thank you! |
Version
14.17.5
Platform
All
Subsystem
No response
What steps will reproduce the bug?
See https://github.com/Brooooooklyn/node-ref-object-leak for detail
I have a js function on the native side which accepts a JsBuffer. The napi_create_reference, napi_reference_unref and napi_delete_reference are all called in this function. On the JavaScript side (the loop shown earlier in this thread), we can see the memory is growing.
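For context, here is a rough sketch of what the native refBuffer function does, expressed with the raw Node-API C calls (the actual repro is written with napi-rs, so the C function below is illustrative rather than the repo's code; error handling is omitted):

#include <node_api.h>

// Creates a reference to the incoming Buffer, unrefs it, deletes it, and returns the count.
static napi_value RefBuffer(napi_env env, napi_callback_info info) {
  size_t argc = 1;
  napi_value argv[1];
  napi_get_cb_info(env, info, &argc, argv, NULL, NULL);

  napi_ref ref;
  napi_create_reference(env, argv[0], 1, &ref);

  uint32_t refcount = 0;
  napi_reference_unref(env, ref, &refcount);   // expected to drop to 0

  napi_delete_reference(env, ref);             // as discussed above, cleanup can still be
                                               // deferred until the Buffer's finalizer runs

  napi_value result;
  napi_create_uint32(env, refcount, &result);  // the JS side asserts this is 0
  return result;
}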
How often does it reproduce? Is there a required condition?
Always
What is the expected behavior?
Buffer is deallocated
What do you see instead?
Buffer is never deallocated
Additional information
Brooooooklyn/snappy#25