Memory leak or poor memory performance. #1228

Closed
tau0 opened this issue Dec 15, 2014 · 9 comments
Comments

@tau0

tau0 commented Dec 15, 2014

$ grep "influxdb" /var/log/messages
Dec 11 18:30:32 crosstalk kernel: [18228436.798056] Out of memory: Kill process 7657 (influxdb) score 670 or sacrifice child
Dec 11 18:30:32 crosstalk kernel: [18228436.838105] Killed process 7657 (influxdb) total-vm:35068428kB, anon-rss:33083764kB, file-rss:880kB
Dec 12 02:34:17 crosstalk kernel: [18257484.632229] influxdb invoked oom-killer: gfp_mask=0x201da, order=0, oom_score_adj=0
Dec 12 02:34:17 crosstalk kernel: [18257484.632234] influxdb cpuset=/ mems_allowed=0-1
Dec 12 02:34:17 crosstalk kernel: [18257484.632238] CPU: 9 PID: 2910 Comm: influxdb Not tainted 3.10.28-4 #1
@tau0
Author

tau0 commented Dec 15, 2014

Just after restart:

tau0@crosstalk:~$ cat /proc/`pidof influxdb`/status|grep Vm
VmPeak:  5877544 kB
VmSize:  5877544 kB
VmLck:         0 kB
VmPin:         0 kB
VmHWM:   1256124 kB
VmRSS:   1255912 kB
VmData:  5838296 kB
VmStk:       136 kB
VmExe:     14984 kB
VmLib:      3416 kB
VmPTE:      3084 kB
VmSwap:        0 kB
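
To see how fast the resident size climbs from there, the same /proc counter can be sampled in a loop, e.g. (a rough sketch):

$ while pidof influxdb > /dev/null; do date '+%H:%M:%S'; grep VmRSS /proc/$(pidof influxdb)/status; sleep 60; done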

@toddboom
Contributor

@tau0 Can you tell me a little bit more about your setup? (Hardware specifications, read/write load, data size, etc.)

@tau0
Author

tau0 commented Dec 17, 2014

HW: https://gist.github.com/tau0/0c44330d04400e93a9e4
I don't know how to measure the read/write load correctly, but I'd expect something around 10^6 writes per day.
Db size: https://gist.github.com/tau0/44244079f9ae48287a55

@imcom

imcom commented Jan 2, 2015

top - 09:42:13 up 130 days, 5:07, 2 users, load average: 4.39, 4.16, 5.04
Tasks: 245 total, 2 running, 243 sleeping, 0 stopped, 0 zombie
Cpu(s): 23.0%us, 1.9%sy, 0.0%ni, 73.9%id, 0.8%wa, 0.0%hi, 0.4%si, 0.0%st
Mem: 65898548k total, 5616964k used, 60281584k free, 48872k buffers

Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 24
On-line CPU(s) list: 0-23
Thread(s) per core: 2
Core(s) per socket: 6
Socket(s): 2
NUMA node(s): 2
Vendor ID: GenuineIntel
CPU family: 6
Model: 45
Stepping: 7
CPU MHz: 2000.040
BogoMIPS: 3999.97
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 15360K
NUMA node0 CPU(s): 0,2,4,6,8,10,12,14,16,18,20,22
NUMA node1 CPU(s): 1,3,5,7,9,11,13,15,17,19,21,23

Hi @toddboom, I ran into a similar situation today: one of my InfluxDB servers crashed due to OOM.
Above is my hardware setup; the server runs only InfluxDB. I handle approximately 120K points per second for both writes and reads. I run queries like "select func(x) from x where time > now() - 5m group by time(1m)" and write their output back into InfluxDB for further querying (roughly the pattern sketched below). I don't use continuous queries because they often hang in my production environment, but that's another story. The OOM killer, though, really is a killer. Alternatively, is there any way to free memory from a running InfluxDB process?
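
For reference, the read-then-write-back pattern looks roughly like this over the 0.8 HTTP API (database name, series names, credentials, and the example values are placeholders, and exact parameters such as auth and time precision depend on your setup):

$ # 1) run the downsampling query (5-minute window, 1-minute buckets)
$ curl -G 'http://localhost:8086/db/mydb/series?u=root&p=root' \
    --data-urlencode "q=select mean(value) from cpu where time > now() - 5m group by time(1m)"

$ # 2) write the aggregated points back to a separate series
$ # ('time' is an epoch timestamp; make sure it matches your configured time precision)
$ curl -X POST 'http://localhost:8086/db/mydb/series?u=root&p=root' \
    -d '[{"name":"cpu.1m.mean","columns":["time","value"],"points":[[1420000000000,0.42]]}]'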

@dimatha

dimatha commented Jan 14, 2015

Same issue here. I just set up a PoC running on a VM. It ran fine for a few days with 4 GB of memory, but today it started to consume all the memory, even after I added an additional 4 GB of RAM. No continuous queries here either; I'm using the collectd input.

shard_db_v2]# du -sh * .
1.2G 00001
100K 00002

Not sure how to get the size of the database itself or the total number of points.
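
For what it's worth, a rough way to check the on-disk size and a per-series point count (the data path assumes the default packaged 0.8 install; mydb and some_series are placeholders, and I don't know of a single built-in grand total):

$ # total on-disk size of all shards
$ du -sh /opt/influxdb/shared/data/db/shard_db_v2/*

$ # point count for one series, run per series
$ curl -G 'http://localhost:8086/db/mydb/series?u=root&p=root' \
    --data-urlencode "q=select count(value) from some_series"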

@joelgriffiths

I'm getting this quite often on c3.xlarge instances, particularly when my boss tries to select a week's worth of data at 0.1 s intervals. I'm trying to tune down max-open-shards and max-open-files, but I can't find much current documentation on these parameters. The example on the website shows max-open-files set to 40, which seems unusually restrictive. Currently I have max-open-shards at 10 and max-open-files at 1000, with the nofile ulimit set to 65536 (see the sketch below); I haven't crashed yet with these settings, but I'm afraid to let my boss use it again. When it crashed before, max-open-shards was set to 0 (unlimited).
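
Concretely, these are the settings I mean; the values are the ones described above, and the exact section/placement in the 0.8.x config.toml may vary by minor version, so treat this as a sketch:

# config.toml excerpt (0.8.x)
max-open-shards = 10    # was 0 (unlimited) when it crashed
max-open-files  = 1000  # the website example shows 40

$ # file-descriptor limit for the influxdb user
$ ulimit -n
65536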

@pauldix
Member

pauldix commented Apr 7, 2015

This is likely fixed in 0.9.0 and we won't be making fix releases for the 0.8 line. Please test against the latest 0.9.0 RC and let us know if there's an issue.

@pauldix pauldix closed this as completed Apr 7, 2015
@lokeshintuit

@pauldix Are you sure the memory issues are fixed? You're marking this closed without verifying. I'm also seeing memory issues on InfluxDB 0.8.8.

dmesg:
[2425195.754801] Out of memory: Kill process 24552 (influxdb) score 983 or sacrifice child
[2425195.764315] Killed process 24552 (influxdb) total-vm:154248772kB, anon-rss:129593940kB, file-rss:0kB

@beckettsean
Contributor

@lokeshverizon There's enough difference between 0.8.8 and 0.9.0 that bugs in one don't offer much insight into the other. The memory leaks in 0.8.8 will not be fixed. Most likely 0.9.0 has new memory leaks of its own to find, but the usage patterns that lead to these leaks are not particularly strange or unusual.
