Bug in LRU cache can cause persistent server failure #344

Closed
2 tasks done
tobiasdcl opened this issue Nov 29, 2023 · 9 comments · Fixed by #345

Comments

@tobiasdcl

tobiasdcl commented Nov 29, 2023

Prerequisites

  • I have written a descriptive issue title
  • I have searched existing issues to ensure the bug has not already been reported

Fastify version

4.24.3

Plugin version

9.0.0

Node.js version

20.9.0

Operating system

Linux

Operating system version (e.g. 20.04, 11.3, 10)

Ubuntu 22.04

Description

Hey folks,
first of all thanks for building and maintaining this library ❤️

I found a bug in the most recent version of fastify-rate-limit (9.0.0 at the time of writing) when using the default LocalStore.

Summary

  • if the first two cache keys are identical and the cache limit is reached once, all subsequent requests fail

Based on my current understanding the root cause is a bug in the underlying LRU implementation toad-cache - I will create a dedicated issue there and link it here (update: done).

Impact

If triggered, all routes protected by fastify-rate-limit will persistently error out.

Steps to Reproduce

Create the following file as repro.mjs and then run node repro.mjs:

import Fastify from 'fastify'

(async () => {
  const fastify = Fastify({
    logger: true
  })
  await fastify.register(import('@fastify/rate-limit'), {
    global: true,
    cache: 5,
  })

  fastify.get('/', (_request, reply) => {
    reply.send({ hello: 'fastify' })
  })

  fastify.listen({ port: 8000 }, (err, address) => {
    if (err) throw err
    console.log(`Server running at ${address}`)
  })

  await fastify.inject({
    method: 'GET',
    url: '/',
    remoteAddress: '10.0.0.1'
  })

  /**
   * >>> This is triggering the bug in the LRU cache <<<
   */
  await fastify.inject({
    method: 'GET',
    url: '/',
    remoteAddress: '10.0.0.1'
  })
  await fastify.inject({
    method: 'GET',
    url: '/',
    remoteAddress: '10.0.0.2'
  })
  await fastify.inject({
    method: 'GET',
    url: '/',
    remoteAddress: '10.0.0.3'
  })
  await fastify.inject({
    method: 'GET',
    url: '/',
    remoteAddress: '10.0.0.4'
  })
  await fastify.inject({
    method: 'GET',
    url: '/',
    remoteAddress: '10.0.0.5'
  })

  /**
   * This request causes the first cache eviction because it is the 6th IP while the cache size is set to 5.
   * Normally this is not a problem, but here it triggers the `evict` function of the LRU cache - causing a failure.
   * Due to the nature of the bug all following requests will fail and the application cannot recover on its own.
   */
  await fastify.inject({
    method: 'GET',
    url: '/',
    remoteAddress: '10.0.0.6'
  })
})()

raw output from node repro.mjs:

{"level":30,"time":1701280599225,"pid":65968,"hostname":"334e6cd6cd00","reqId":"req-1","req":{"method":"GET","url":"/","hostname":"localhost:80","remoteAddress":"10.0.0.1"},"msg":"incoming request"}
{"level":30,"time":1701280599227,"pid":65968,"hostname":"334e6cd6cd00","reqId":"req-1","res":{"statusCode":200},"responseTime":1.9484590291976929,"msg":"request completed"}
{"level":30,"time":1701280599227,"pid":65968,"hostname":"334e6cd6cd00","reqId":"req-2","req":{"method":"GET","url":"/","hostname":"localhost:80","remoteAddress":"10.0.0.1"},"msg":"incoming request"}
{"level":30,"time":1701280599227,"pid":65968,"hostname":"334e6cd6cd00","reqId":"req-2","res":{"statusCode":200},"responseTime":0.15683400630950928,"msg":"request completed"}
{"level":30,"time":1701280599228,"pid":65968,"hostname":"334e6cd6cd00","reqId":"req-3","req":{"method":"GET","url":"/","hostname":"localhost:80","remoteAddress":"10.0.0.2"},"msg":"incoming request"}
{"level":30,"time":1701280599228,"pid":65968,"hostname":"334e6cd6cd00","reqId":"req-3","res":{"statusCode":200},"responseTime":0.140083909034729,"msg":"request completed"}
{"level":30,"time":1701280599228,"pid":65968,"hostname":"334e6cd6cd00","reqId":"req-4","req":{"method":"GET","url":"/","hostname":"localhost:80","remoteAddress":"10.0.0.3"},"msg":"incoming request"}
{"level":30,"time":1701280599228,"pid":65968,"hostname":"334e6cd6cd00","reqId":"req-4","res":{"statusCode":200},"responseTime":0.09191691875457764,"msg":"request completed"}
{"level":30,"time":1701280599228,"pid":65968,"hostname":"334e6cd6cd00","reqId":"req-5","req":{"method":"GET","url":"/","hostname":"localhost:80","remoteAddress":"10.0.0.4"},"msg":"incoming request"}
{"level":30,"time":1701280599228,"pid":65968,"hostname":"334e6cd6cd00","reqId":"req-5","res":{"statusCode":200},"responseTime":0.12541699409484863,"msg":"request completed"}
{"level":30,"time":1701280599228,"pid":65968,"hostname":"334e6cd6cd00","reqId":"req-6","req":{"method":"GET","url":"/","hostname":"localhost:80","remoteAddress":"10.0.0.5"},"msg":"incoming request"}
{"level":30,"time":1701280599228,"pid":65968,"hostname":"334e6cd6cd00","reqId":"req-6","res":{"statusCode":200},"responseTime":0.0599590539932251,"msg":"request completed"}
{"level":30,"time":1701280599228,"pid":65968,"hostname":"334e6cd6cd00","reqId":"req-7","req":{"method":"GET","url":"/","hostname":"localhost:80","remoteAddress":"10.0.0.6"},"msg":"incoming request"}
{"level":50,"time":1701280599229,"pid":65968,"hostname":"334e6cd6cd00","reqId":"req-7","req":{"method":"GET","url":"/","hostname":"localhost:80","remoteAddress":"10.0.0.6"},"res":{"statusCode":500},"err":{"type":"TypeError","message":"Cannot read properties of null (reading 'key')","stack":"TypeError: Cannot read properties of null (reading 'key')\n    at LruObject.evict (/workspace/node_modules/toad-cache/dist/toad-cache.cjs:406:30)\n    at LruObject.set (/workspace/node_modules/toad-cache/dist/toad-cache.cjs:470:12)\n    at LocalStore.incr (/workspace/node_modules/@fastify/rate-limit/store/LocalStore.js:41:12)\n    at res (/workspace/node_modules/@fastify/rate-limit/index.js:214:15)\n    at new Promise (<anonymous>)\n    at Object.<anonymous> (/workspace/node_modules/@fastify/rate-limit/index.js:213:25)\n    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)"},"msg":"Cannot read properties of null (reading 'key')"}
{"level":30,"time":1701280599229,"pid":65968,"hostname":"334e6cd6cd00","reqId":"req-7","res":{"statusCode":500},"responseTime":0.6233330965042114,"msg":"request completed"

prettified stacktrace:

[17:42:51.262] ERROR (63715): Cannot read properties of null (reading 'key')
    reqId: "req-7"
    req: {
      "method": "GET",
      "url": "/",
      "hostname": "localhost:80",
      "remoteAddress": "10.0.0.6"
    }
    res: {
      "statusCode": 500
    }
    err: {
      "type": "TypeError",
      "message": "Cannot read properties of null (reading 'key')",
      "stack":
          TypeError: Cannot read properties of null (reading 'key')
              at LruObject.evict (/workspace/node_modules/toad-cache/dist/toad-cache.cjs:406:30)
              at LruObject.set (/workspace/node_modules/toad-cache/dist/toad-cache.cjs:470:12)
              at LocalStore.incr (/workspace/node_modules/@fastify/rate-limit/store/LocalStore.js:41:12)
              at res (/workspace/node_modules/@fastify/rate-limit/index.js:214:15)
              at new Promise (<anonymous>)
              at Object.<anonymous> (/workspace/node_modules/@fastify/rate-limit/index.js:213:25)
              at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
    }

Expected Behavior

  • should get 200 responses for all injected requests

Actual Behavior

  • once the cache limit is reached (given that the first two cache keys were identical), all subsequent requests fail

@Uzlopak
Contributor

Uzlopak commented Nov 29, 2023

@gurgunday
@kibertoad

@tobiasdcl
Author

tobiasdcl commented Nov 29, 2023

I can submit a PR but would like some guidance on the approach. Given that the issue is in an underlying library (if my understanding is correct), I am not sure whether you want to revert to the previous LRU cache or wait for a fix in toad-cache.
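
As a possible stopgap on my side until a fix lands (just a sketch, not something the plugin ships): the plugin's documented store option could be pointed at a plain Map-backed store so the broken LRU path is never hit. SimpleMapStore below is a made-up name, it assumes options.timeWindow arrives in milliseconds as in the custom-store example in the README, and unlike the LRU-backed LocalStore it does not cap the number of tracked keys:

// hedged sketch only - a minimal Map-backed store implementing the documented
// custom-store interface: constructor(options), incr(key, callback), child(routeOptions)
class SimpleMapStore {
  constructor (options) {
    this.options = options
    this.hits = new Map() // key -> { current, resetAt }
  }

  incr (key, callback) {
    const now = Date.now()
    let entry = this.hits.get(key)
    if (!entry || entry.resetAt <= now) {
      // start a fresh window for this key
      entry = { current: 0, resetAt: now + this.options.timeWindow }
      this.hits.set(key, entry)
    }
    entry.current++
    callback(null, { current: entry.current, ttl: entry.resetAt - now })
  }

  child (routeOptions) {
    // per-route stores get the merged options, as in the README example
    return new SimpleMapStore({ ...this.options, ...routeOptions })
  }
}

// usage in the repro above:
// await fastify.register(import('@fastify/rate-limit'), { global: true, store: SimpleMapStore })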

@tobiasdcl
Author

tobiasdcl commented Nov 29, 2023

Here are some more details. The problem is that lru.first is set to null if a value is first set and then retrieved via get:

[Two screenshots (2023-11-29): debugger output showing lru.first becoming null after a set followed by a get]

and once lru.first is set to null, it is just a matter of time before the cache limit is reached and the first evict call happens - from that point on it is game over 😅
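
For reference, the same failure can be sketched directly against toad-cache, without Fastify involved (assuming the LruObject class from the stack trace is a named export of the package; the size 5 mirrors the cache: 5 option above):

import { LruObject } from 'toad-cache'

const lru = new LruObject(5)

lru.set('10.0.0.1', 1) // first insert
lru.get('10.0.0.1')    // suspected bug: promoting the only entry leaves lru.first === null

// fill the remaining four slots with distinct keys
lru.set('10.0.0.2', 1)
lru.set('10.0.0.3', 1)
lru.set('10.0.0.4', 1)
lru.set('10.0.0.5', 1)

// the 6th distinct key forces evict(), which dereferences lru.first.key
lru.set('10.0.0.6', 1) // TypeError: Cannot read properties of null (reading 'key')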

@Uzlopak
Contributor

Uzlopak commented Nov 29, 2023

probably has to be fixed in toad-cache and not here.

@tobiasdcl
Author

probably has to be fixed in toad-cache and not here.

I created a PR 🙌

@Uzlopak
Contributor

Uzlopak commented Nov 29, 2023

Nice job!

@gurgunday
Member

Thank you so much for the PR!

@kibertoad
Member

@tobiasdcl 9.0.1 with the fix was released! thank you for the report and the PR

@tobiasdcl
Author

@tobiasdcl 9.0.1 with the fix was released! thank you for the report and the PR

Awesome, thank you for the super fast fix 🥳
