raft crashes with error="key too large" (DoS) #13281

Closed
slm0n87 opened this issue Nov 25, 2021 · 2 comments · Fixed by #13282

slm0n87 commented Nov 25, 2021

Describe the bug
Internal raft storage crashes when a user puts secrets into deeply nested paths.

To Reproduce
  0. Have a running cluster (or single node) that uses the raft storage backend and has a KV secrets engine mounted at "secret".
  1. Run vault login ...
  2. Run for count in {550..600}; do echo "COUNT:$count"; vault kv put "secret/$(for i in $(seq 1 "$count"); do printf '1/'; done)" foo=bar; done
  3. For me, the raft cluster crashed at a count of 564 directories.
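
For anyone scripting the reproduction outside a shell, the path construction in step 2 can be sketched in Go. This is only an illustration: `nestedPath` is a hypothetical helper, and the program prints the commands' path lengths rather than calling Vault.

```go
package main

import (
	"fmt"
	"strings"
)

// nestedPath is a hypothetical helper that mirrors the inner shell loop:
// it returns depth copies of "1/", e.g. depth 3 yields "1/1/1/".
func nestedPath(depth int) string {
	if depth <= 0 {
		return ""
	}
	return strings.Repeat("1/", depth)
}

func main() {
	// Same range as the outer loop in step 2.
	for depth := 550; depth <= 600; depth++ {
		p := nestedPath(depth)
		// Only report the size; running "vault kv put secret/<p> foo=bar"
		// is left to the reader so the sketch has no side effects.
		fmt.Printf("COUNT:%d nested path is %d bytes\n", depth, len(p))
	}
}
```

At a depth of 564 the nested part of the request path is 1128 bytes, so the key that reaches storage is presumably much longer than the path the client sends. The thread does not say why.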

Expected behavior
Vault should prevent the raft cluster from crashing.

Environment:

  • Vault Server Version (retrieve with vault status): 1.9.0
  • Vault CLI Version (retrieve with vault version): Vault v1.8.4 ('925bc650ad1d997e84fbb832f302a6bfe0105bbb+CHANGES')
  • Server Operating System/Architecture: Ubuntu 20.04 LTS

Vault server configuration file(s):

# Ansible managed

storage "raft" {
  path    = "/space/raft-storage/xxxxxxxx-cnxyaqpi"
  node_id = "xxxxxxxx"
}

listener "tcp" {
  address = "0.0.0.0:8200"
  cluster_address = "x.x.x.x:8201"
  tls_disable = "true"
  # trust x-forwarded-for header of HA-Proxies
  x_forwarded_for_authorized_addrs = "x.x.x.x/32,x.x.x.x/32"
  x_forwarded_for_reject_not_present = "false"
}

seal "transit" {
  address = "http://x.x.x.x:8200"
  # token is read from VAULT_TOKEN env
  disable_renewal = "true"
  // Key configuration
  key_name           = "unseal_key"
  mount_path         = "transit/"
}

cluster_addr = "http://x.x.x.x:8201"
api_addr = "http://vault-test.app.xxxxxxxx:8200"
cluster_name = "vault"
raw_storage_endpoint = "true"
ui = "true"

Vault systemd logs:

Nov 25 17:42:13 xxxxxxxx vault[24393]: 2021-11-25T17:42:13.678Z [ERROR] storage.raft.fsm: failed to store data: error="key too large"
Nov 25 17:42:13 xxxxxxxx vault[24393]: panic: failed to store data
Nov 25 17:42:13 xxxxxxxx vault[24393]: goroutine 422 [running]:
Nov 25 17:42:13 xxxxxxxx vault[24393]: github.com/hashicorp/vault/physical/raft.(*FSM).ApplyBatch(0xc000207f40, {0xc000970838, 0x1, 0x44864f})
Nov 25 17:42:13 xxxxxxxx vault[24393]:         /home/runner/work/vault/vault/physical/raft/fsm.go:678 +0x7bf
Nov 25 17:42:13 xxxxxxxx vault[24393]: github.com/hashicorp/go-raftchunking.(*ChunkingBatchingFSM).ApplyBatch(0xc0004da150, {0xc000970830, 0x1, 0x2})
Nov 25 17:42:13 xxxxxxxx vault[24393]:         /home/runner/go/pkg/mod/github.com/hashicorp/[email protected]/fsm.go:234 +0x36e
Nov 25 17:42:13 xxxxxxxx vault[24393]: github.com/hashicorp/raft.(*Raft).runFSM.func2({0xc000f1c200, 0x1, 0xc0004da150})
Nov 25 17:42:13 xxxxxxxx vault[24393]:         /home/runner/go/pkg/mod/github.com/hashicorp/[email protected]/fsm.go:141 +0x1e9
Nov 25 17:42:13 xxxxxxxx vault[24393]: github.com/hashicorp/raft.(*Raft).runFSM(0xc000563b80)
Nov 25 17:42:13 xxxxxxxx vault[24393]:         /home/runner/go/pkg/mod/github.com/hashicorp/[email protected]/fsm.go:216 +0x35a
Nov 25 17:42:13 xxxxxxxx vault[24393]: github.com/hashicorp/raft.(*raftState).goFunc.func1()
Nov 25 17:42:13 xxxxxxxx vault[24393]:         /home/runner/go/pkg/mod/github.com/hashicorp/[email protected]/state.go:146 +0x62
Nov 25 17:42:13 xxxxxxxx vault[24393]: created by github.com/hashicorp/raft.(*raftState).goFunc
Nov 25 17:42:13 xxxxxxxx vault[24393]:         /home/runner/go/pkg/mod/github.com/hashicorp/[email protected]/state.go:144 +0x92
Nov 25 17:42:13 xxxxxxxx systemd[1]: vault.service: Main process exited, code=exited, status=2/INVALIDARGUMENT
Nov 25 17:42:13 xxxxxxxx systemd[1]: vault.service: Failed with result 'exit-code'.

Additional context
I discovered the issue during performance and security testing.
Hopefully no real user will ever create a path with 564 directories, but an attacker could use this method to crash the whole raft cluster.
I would classify this issue as a denial-of-service bug.
I was not able to recover from that state; I had to build a new, empty raft cluster and restore the last raft snapshot.

@ncabatoff
Collaborator

Nice find, thanks for the bug report @slm0n87!

slm0n87 commented Nov 26, 2021

Great, thanks for the super fast fix.
With the new version, the client now gets the following response and the raft cluster no longer crashes:

URL: PUT https://vault-test.app.xxx.xxx/v1/xxx-test/data/.......
...
Code: 500. Errors:

* 1 error occurred:
	* put failed due to key being too large, max key size for integrated storage is 32768

mladlow modified the milestones: 1.9.1, 1.7.7 on Nov 29, 2021