High memory usage with new forking parser cache #7503
Comments
Maybe add this code to the memory_usage.py file, and have it save the data to tmp/w3af-psutil.data. Before I do something like that, remember that it might already contain something like it. Also remember that it might be a good idea to integrate this into the collector instead of w3af:

```json
{
"CPU": {
"guest": 0.0,
"guest_nice": 0.0,
"idle": 46.05,
"iowait": 43.59,
"irq": 4.19,
"nice": 4.32,
"softirq": 0.0,
"steal": 263.34,
"system": 121.42,
"user": 390.76
},
"Load average": [
0.89,
0.98,
0.6
],
"Network": {
"docker0": {
"bytes_recv": 0,
"bytes_sent": 0,
"dropin": 0,
"dropout": 0,
"errin": 0,
"errout": 0,
"packets_recv": 0,
"packets_sent": 0
},
"eth0": {
"bytes_recv": 265032109,
"bytes_sent": 3876973,
"dropin": 0,
"dropout": 0,
"errin": 0,
"errout": 0,
"packets_recv": 180723,
"packets_sent": 48957
},
"lo": {
"bytes_recv": 0,
"bytes_sent": 0,
"dropin": 0,
"dropout": 0,
"errin": 0,
"errout": 0,
"packets_recv": 0,
"packets_sent": 0
}
},
"Swap memory": {
"free": 939520000,
"percent": 0.0,
"sin": 0,
"sout": 0,
"total": 939520000,
"used": 0
},
"Virtual memory": {
"active": 533528576,
"available": 1616830464,
"buffers": 35323904,
"cached": 938897408,
"free": 642609152,
"inactive": 461950976,
"percent": 6.6,
"total": 1731534848,
"used": 1088925696
}
}
```
The psutil data might not be enough; one of the problems is that it's only retrieved once |
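To address the "only retrieved once" problem, the collection could be run on an interval. Here is a minimal sketch, not w3af's actual code: the file path reuses tmp/w3af-psutil.data from the comment above, and the 30-second interval is my assumption.

```python
# Hypothetical sketch: sample the same psutil metric groups on an interval
# and append one JSON document per line, so the data file shows how memory
# evolves over time instead of a single snapshot. Not w3af's actual code.
import json
import os
import time

import psutil


def collect_snapshot():
    # Mirror the metric groups shown in the JSON dump above.
    return {
        'CPU': psutil.cpu_times()._asdict(),
        'Load average': list(os.getloadavg()),
        'Network': {nic: counters._asdict()
                    for nic, counters in psutil.net_io_counters(pernic=True).items()},
        'Swap memory': psutil.swap_memory()._asdict(),
        'Virtual memory': psutil.virtual_memory()._asdict(),
    }


def sample_forever(path='tmp/w3af-psutil.data', interval=30):
    # Append timestamped snapshots; path and interval are assumptions.
    while True:
        with open(path, 'a') as fh:
            fh.write(json.dumps({'ts': time.time(),
                                 'data': collect_snapshot()}) + '\n')
        time.sleep(interval)
```

Integrating this into the collector (as suggested above) would just mean calling `collect_snapshot()` from its existing sampling loop instead of running `sample_forever()`.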
Using andresriancho/w3af-performance-analysis@198c0e0 it's possible to get this output:
Which shows that:
In most of the data tuples above we see that the main process uses X% RAM and the SubDaemonPoolWorker uses X-1%, which seems strange and makes me think that either:
|
4255a32 will help understand the shared memory usage in subprocesses |
Need to run |
This is really bad... not much memory is shared between parent (10327) and child (11418):
Is this the right type of shared memory I'm looking for? |
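One way to cross-check the number (a Linux-only sketch; summing `Shared_Clean` and `Shared_Dirty` from `/proc/<pid>/smaps` is my assumption about which "shared" figure matters here, since those fields include copy-on-write pages inherited across `fork()`):

```python
# Linux-only sketch: estimate how much of a process' resident memory is
# shared (e.g. copy-on-write pages inherited from the parent after fork)
# by summing the Shared_* fields of /proc/<pid>/smaps.
import os


def shared_kb(pid):
    total = 0
    with open('/proc/%d/smaps' % pid) as fh:
        for line in fh:
            if line.startswith(('Shared_Clean:', 'Shared_Dirty:')):
                total += int(line.split()[1])  # values are reported in kB
    return total
```

Comparing `shared_kb(10327)` against `shared_kb(11418)` right after the fork should show a large shared component in the child if copy-on-write is working as expected.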
TODO: for debugging, add the profiling tools to this function, so we can better track what happens inside the sub-process: `def init_worker(log_queue):` |
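A minimal sketch of what that could look like, assuming we only want per-worker cProfile data (the dump path and the `atexit` hook are my assumptions, not w3af's code):

```python
# Hypothetical sketch: start a cProfile profiler in each pool worker and
# dump its stats when the worker process exits. Not w3af's actual code.
import atexit
import cProfile
import os


def init_worker(log_queue):
    profiler = cProfile.Profile()
    profiler.enable()
    # One .pstats file per worker pid, inspectable later with pstats.Stats().
    atexit.register(profiler.dump_stats,
                    '/tmp/w3af-worker-%d.pstats' % os.getpid())
    return profiler
```

Since `init_worker` runs once in every pool worker, this would capture everything the sub-process executes, which is exactly what we need to see whether it is running "all w3af".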
Looks like this might be useful: |
Does this mean that the worker is sending http traffic? What's the traceback for these calls to
|
This issue of the sub-process running "all w3af" looks like http://stackoverflow.com/a/7994465/1347554 , but that can't be... |
https://docs.python.org/3.4/library/multiprocessing.html#contexts-and-start-methods |
Latest billiard, the one from the repo (not pypi), is a backport of the 3.4 multiprocessing module for 2.7. I want to test that with the spawn start method: https://docs.python.org/3.4/library/multiprocessing.html#multiprocessing.set_start_method |
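A sketch of what that test could look like, assuming 3.4-style multiprocessing (or the billiard backport with the same API); the worker function and pool size are placeholders. The point of `spawn` is that each worker starts from a fresh interpreter instead of inheriting the parent's heap, which is exactly the behavior that matters for this memory issue.

```python
# Sketch: 'spawn' workers start from a fresh interpreter, so they do not
# inherit the parent's (possibly huge) heap the way fork()ed workers do.
import multiprocessing as mp


def parse_document(doc):
    # Placeholder for the real document-parsing work.
    return len(doc)


if __name__ == '__main__':
    ctx = mp.get_context('spawn')
    pool = ctx.Pool(processes=2)
    try:
        lengths = pool.map(parse_document, ['<html/>', '<html></html>'])
        assert lengths == [7, 13]
    finally:
        pool.close()
        pool.join()
```

Note the mandatory `if __name__ == '__main__'` guard: with `spawn`, the child re-imports the module, so unguarded pool creation would recurse.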
Something I wanted to play with for a while... but not sure if it's the best thing in this scenario: http://zerorpc.dotcloud.com/ With timeouts:

And the server:

```python
import time

import zerorpc


class Cooler(object):
    """ Various convenience methods to make things cooler. """

    def add_man(self, sentence):
        """ End a sentence with ", man!" to make it sound cooler, and
        return the result. """
        return sentence + ", man!"

    def add_42(self, n):
        """ Add 42 to an integer argument to make it cooler, and return the
        result. """
        return n + 42

    def boat(self, sentence):
        """ Replace a sentence with "I'm on a boat!", and return that,
        because it's cooler. """
        return "I'm on a boat!"

    def sleep(self, num):
        time.sleep(num)
        return 'slept %s' % num


s = zerorpc.Server(Cooler())
s.bind("tcp://0.0.0.0:4242")
s.run()
```

I could easily start the zerorpc server in a different process using Popen and make it listen on a unix socket
The document parser zerorpc-server could have a signal handler for Ctrl+C which would stop the server, kill any running tasks, clean up, etc. and end the process. From the main process we could send the signal when we need to kill the sub-process. Instead of having only one document parser server, I could use zeroworkers to have N workers (like I have now with the pool) and one task generator (the main process) |
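A sketch of that shutdown protocol (the `cleanup` body is a placeholder; nothing here is w3af's actual code):

```python
# Sketch of the proposed shutdown protocol: the parser server installs a
# SIGINT handler that cleans up and exits, and the main process sends the
# signal when it wants to kill the sub-process. cleanup() is a placeholder.
import os
import signal
import sys


def cleanup():
    # Stop running parse tasks, close sockets, remove temp files, etc.
    pass


def handle_sigint(signum, frame):
    cleanup()
    sys.exit(0)


signal.signal(signal.SIGINT, handle_sigint)

# From the main process, the kill would look like:
#     os.kill(server_pid, signal.SIGINT)
```

Because `sys.exit()` raises `SystemExit`, any `finally` blocks in the server's request loop still run, so in-flight tasks get a chance to clean up.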
https://pypi.python.org/pypi/Pyro4 looks good:
|
http://rpyc.readthedocs.org/ also looks good
|
Maybe the sub-processes could be started/stopped using something similar to https://pypi.python.org/pypi/python-daemon ? (source https://www.python.org/dev/peps/pep-3143/) |
I also need to create a new AMI for launching the collector since with the latest tools it's really slow to boot. andresriancho/collector#10 |
https://circleci.com/gh/andresriancho/w3af/1428 yields the message:
This was not related to the parsers/subprocesses, it was because of a memory leak in lxml |
Solved in |
Source code
https://github.com/andresriancho/w3af/blob/develop/w3af/core/data/parsers/parser_cache.py#L81
https://github.com/andresriancho/w3af/blob/develop/w3af/core/data/parsers/parser_cache.py#L161
Description
High memory usage with the new forking parser cache; this appears after several minutes of running w3af using collector: