Memory leak or poor memory performance. #1228

Closed
tau0 opened this issue Dec 15, 2014 · 9 comments
Comments

@tau0

tau0 commented Dec 15, 2014

$ grep "influxdb" /var/log/messages
Dec 11 18:30:32 crosstalk kernel: [18228436.798056] Out of memory: Kill process 7657 (influxdb) score 670 or sacrifice child
Dec 11 18:30:32 crosstalk kernel: [18228436.838105] Killed process 7657 (influxdb) total-vm:35068428kB, anon-rss:33083764kB, file-rss:880kB
Dec 12 02:34:17 crosstalk kernel: [18257484.632229] influxdb invoked oom-killer: gfp_mask=0x201da, order=0, oom_score_adj=0
Dec 12 02:34:17 crosstalk kernel: [18257484.632234] influxdb cpuset=/ mems_allowed=0-1
Dec 12 02:34:17 crosstalk kernel: [18257484.632238] CPU: 9 PID: 2910 Comm: influxdb Not tainted 3.10.28-4 #1
@tau0
Author

tau0 commented Dec 15, 2014

Just after restart:

tau0@crosstalk:~$ cat /proc/`pidof influxdb`/status|grep Vm
VmPeak:  5877544 kB
VmSize:  5877544 kB
VmLck:         0 kB
VmPin:         0 kB
VmHWM:   1256124 kB
VmRSS:   1255912 kB
VmData:  5838296 kB
VmStk:       136 kB
VmExe:     14984 kB
VmLib:      3416 kB
VmPTE:      3084 kB
VmSwap:        0 kB
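
To see how fast the resident size climbs from there, the same /proc counter can be sampled in a loop, e.g. (a rough sketch):

$ while pidof influxdb > /dev/null; do date '+%H:%M:%S'; grep VmRSS /proc/$(pidof influxdb)/status; sleep 60; done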

@toddboom
Contributor

@tau0 Can you tell me a little bit more about your setup? (Hardware specifications, read/write load, data size, etc.)

@tau0
Author

tau0 commented Dec 17, 2014

HW: https://gist.github.com/tau0/0c44330d04400e93a9e4
I don't know how to measure the read/write load correctly, but I'd expect something around 10^6 writes per day.
Db size: https://gist.github.com/tau0/44244079f9ae48287a55

@imcom

imcom commented Jan 2, 2015

top - 09:42:13 up 130 days, 5:07, 2 users, load average: 4.39, 4.16, 5.04
Tasks: 245 total, 2 running, 243 sleeping, 0 stopped, 0 zombie
Cpu(s): 23.0%us, 1.9%sy, 0.0%ni, 73.9%id, 0.8%wa, 0.0%hi, 0.4%si, 0.0%st
Mem: 65898548k total, 5616964k used, 60281584k free, 48872k buffers

Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 24
On-line CPU(s) list: 0-23
Thread(s) per core: 2
Core(s) per socket: 6
Socket(s): 2
NUMA node(s): 2
Vendor ID: GenuineIntel
CPU family: 6
Model: 45
Stepping: 7
CPU MHz: 2000.040
BogoMIPS: 3999.97
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 15360K
NUMA node0 CPU(s): 0,2,4,6,8,10,12,14,16,18,20,22
NUMA node1 CPU(s): 1,3,5,7,9,11,13,15,17,19,21,23

Hi @toddboom, I ran into a similar situation today: one of my InfluxDB servers crashed due to OOM.
Above is my hardware setup; the server runs only InfluxDB. I handle approximately 120K points per second for both writes and reads. I run queries like "select func(x) from x where time > now() - 5m group by time(1m)" and write their output back into InfluxDB for further querying (roughly the pattern sketched below). I don't use continuous queries because they often hang in my production environment, but that's another story. The OOM killer, though, really is a killer. Alternatively, is there any way to free memory from a running InfluxDB process?
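
For reference, the read-then-write-back pattern looks roughly like this over the 0.8 HTTP API (database name, series names, credentials, and the example values are placeholders, and exact parameters such as auth and time precision depend on your setup):

$ # 1) run the downsampling query (5-minute window, 1-minute buckets)
$ curl -G 'http://localhost:8086/db/mydb/series?u=root&p=root' \
    --data-urlencode "q=select mean(value) from cpu where time > now() - 5m group by time(1m)"

$ # 2) write the aggregated points back to a separate series
$ # ('time' is an epoch timestamp; make sure it matches your configured time precision)
$ curl -X POST 'http://localhost:8086/db/mydb/series?u=root&p=root' \
    -d '[{"name":"cpu.1m.mean","columns":["time","value"],"points":[[1420000000000,0.42]]}]'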

@dimatha

dimatha commented Jan 14, 2015

Same issue here. I just set up a PoC running on a VM. It ran fine for a few days with 4 GB of memory, but today it started to consume all the memory, even after I added an additional 4 GB of RAM. No continuous queries here either; I'm using the collectd input.

shard_db_v2]# du -sh * .
1.2G 00001
100K 00002

Not sure how to get the size of the database itself or the total number of points.
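
For what it's worth, a rough way to check the on-disk size and a per-series point count (the data path assumes the default packaged 0.8 install; mydb and some_series are placeholders, and I don't know of a single built-in grand total):

$ # total on-disk size of all shards
$ du -sh /opt/influxdb/shared/data/db/shard_db_v2/*

$ # point count for one series, run per series
$ curl -G 'http://localhost:8086/db/mydb/series?u=root&p=root' \
    --data-urlencode "q=select count(value) from some_series"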

@joelgriffiths

I'm getting this quite often on c3.xlarge instances, particularly when my boss tries to select a week's worth of data at 0.1 s intervals. I'm trying to tune down max-open-shards and max-open-files, but I can't find much current documentation on these parameters. The example on the website shows max-open-files set to 40, which seems unusually restrictive. Currently I have max-open-shards at 10 and max-open-files at 1000, with the nofile ulimit set to 65536 (see the sketch below); I haven't crashed yet with these settings, but I'm afraid to let my boss use it again. When it crashed before, max-open-shards was set to 0 (unlimited).
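
Concretely, these are the settings I mean; the values are the ones described above, and the exact section/placement in the 0.8.x config.toml may vary by minor version, so treat this as a sketch:

# config.toml excerpt (0.8.x)
max-open-shards = 10    # was 0 (unlimited) when it crashed
max-open-files  = 1000  # the website example shows 40

$ # file-descriptor limit for the influxdb user
$ ulimit -n
65536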

@pauldix
Member

pauldix commented Apr 7, 2015

This is likely fixed in 0.9.0 and we won't be making fix releases for the 0.8 line. Please test against the latest 0.9.0 RC and let us know if there's an issue.

@pauldix pauldix closed this as completed Apr 7, 2015
@lokeshintuit

@pauldix Are you sure the memory issues are fixed? You're marking this closed without verifying. I'm also seeing memory issues on InfluxDB 0.8.8.

dmesg:
[2425195.754801] Out of memory: Kill process 24552 (influxdb) score 983 or sacrifice child
[2425195.764315] Killed process 24552 (influxdb) total-vm:154248772kB, anon-rss:129593940kB, file-rss:0kB

@beckettsean
Contributor

@lokeshverizon There's enough difference between 0.8.8 and 0.9.0 that bugs in one don't offer much insight into the other. The memory leaks in 0.8.8 will not be fixed. Most likely 0.9.0 has new memory leaks of its own to find, but the usage patterns that lead to these leaks are not particularly strange or unusual.
