High memory usage with new forking parser cache #7503

Closed
2 of 3 tasks
andresriancho opened this issue Jan 13, 2015 · 20 comments

Comments

@andresriancho

Source code

https://github.com/andresriancho/w3af/blob/develop/w3af/core/data/parsers/parser_cache.py#L81
https://github.com/andresriancho/w3af/blob/develop/w3af/core/data/parsers/parser_cache.py#L161

Description

High memory usage with the new forking parser cache. This started appearing after several minutes of running w3af with the collector:

Exception in thread Thread-15:
Traceback (most recent call last):
  File "/usr/lib/python2.7/threading.py", line 810, in __bootstrap_inner
    self.run()
  File "/usr/lib/python2.7/threading.py", line 763, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/usr/lib/python2.7/multiprocessing/pool.py", line 326, in _handle_workers
    pool._maintain_pool()
  File "/usr/lib/python2.7/multiprocessing/pool.py", line 230, in _maintain_pool
    self._repopulate_pool()
  File "/usr/lib/python2.7/multiprocessing/pool.py", line 223, in _repopulate_pool
    w.start()
  File "/usr/lib/python2.7/multiprocessing/process.py", line 130, in start
    self._popen = Popen(self)
  File "/usr/lib/python2.7/multiprocessing/forking.py", line 121, in __init__
    self.pid = os.fork()
OSError: [Errno 12] Cannot allocate memory
/home/ubuntu/run_w3af.sh: line 64: 10302 Killed                  ./w3af_console -s /tmp/test-script.w3af

Fatal error: sudo() received nonzero return code 137 while executing!

Requested: /home/ubuntu/run_w3af.sh
Executed: sudo -S -p 'sudo password:'  /bin/bash -l -c "export VERSION=\"da7afe3\" && /home/ubuntu/run_w3af.sh"

Aborting.
A remote error was found.

Tasks

  • Change the collector to be able to collect information even when the run_w3af.sh command fails
  • Add a profiling plugin to w3af that counts how many sub-processes it has and, if possible, their memory usage (real usage; remember copy-on-write)
  • Identify where the memory is being used
andresriancho self-assigned this Jan 13, 2015
andresriancho added this to the 1.6.1 - Bug fixing after 1.6 milestone Jan 13, 2015
@andresriancho

Maybe add this code to the memory_usage.py file, and have it save the data to a .os_memory file?
https://raw.githubusercontent.com/pixelb/ps_mem/master/ps_mem.py

Before I do something like that, remember tmp/w3af-psutil.data, which might already contain something like it. Also remember that it might be a good idea to integrate this into the collector instead of w3af:

{
    "CPU": {
        "guest": 0.0, 
        "guest_nice": 0.0, 
        "idle": 46.05, 
        "iowait": 43.59, 
        "irq": 4.19, 
        "nice": 4.32, 
        "softirq": 0.0, 
        "steal": 263.34, 
        "system": 121.42, 
        "user": 390.76
    }, 
    "Load average": [
        0.89, 
        0.98, 
        0.6
    ], 
    "Network": {
        "docker0": {
            "bytes_recv": 0, 
            "bytes_sent": 0, 
            "dropin": 0, 
            "dropout": 0, 
            "errin": 0, 
            "errout": 0, 
            "packets_recv": 0, 
            "packets_sent": 0
        }, 
        "eth0": {
            "bytes_recv": 265032109, 
            "bytes_sent": 3876973, 
            "dropin": 0, 
            "dropout": 0, 
            "errin": 0, 
            "errout": 0, 
            "packets_recv": 180723, 
            "packets_sent": 48957
        }, 
        "lo": {
            "bytes_recv": 0, 
            "bytes_sent": 0, 
            "dropin": 0, 
            "dropout": 0, 
            "errin": 0, 
            "errout": 0, 
            "packets_recv": 0, 
            "packets_sent": 0
        }
    }, 
    "Swap memory": {
        "free": 939520000, 
        "percent": 0.0, 
        "sin": 0, 
        "sout": 0, 
        "total": 939520000, 
        "used": 0
    }, 
    "Virtual memory": {
        "active": 533528576, 
        "available": 1616830464, 
        "buffers": 35323904, 
        "cached": 938897408, 
        "free": 642609152, 
        "inactive": 461950976, 
        "percent": 6.6, 
        "total": 1731534848, 
        "used": 1088925696
    }
}

The psutil data might not be enough; one of the problems is that it is only retrieved once.
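
A minimal sketch of what a periodic collector could look like, assuming psutil is available; the output path and sampling interval are arbitrary choices for illustration:

import json
import time

import psutil


def sample_os_memory(output_file='/tmp/w3af-os-memory.data', interval=30):
    """
    Append a snapshot of OS-level memory usage to a JSON-lines file every
    `interval` seconds, instead of retrieving the psutil data only once.
    """
    while True:
        per_process = []
        for proc in psutil.process_iter():
            try:
                per_process.append({'pid': proc.pid,
                                    'name': proc.name(),
                                    'memory_percent': proc.memory_percent()})
            except (psutil.NoSuchProcess, psutil.AccessDenied):
                # The process died or we are not allowed to inspect it
                continue

        snapshot = {'timestamp': time.time(),
                    'virtual_memory': dict(psutil.virtual_memory()._asdict()),
                    'swap_memory': dict(psutil.swap_memory()._asdict()),
                    'per_process': per_process}

        with open(output_file, 'a') as output:
            output.write(json.dumps(snapshot) + '\n')

        time.sleep(interval)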

@andresriancho

Using andresriancho/w3af-performance-analysis@198c0e0 it's possible to get this output:

PSUtils memory usage summary
    0
        10351: 1.68% - SyncManager-1._run_server
        10334: 5.54% - (unknown)

    1
        10351: 1.72% - SyncManager-1._run_server
        12070: 11.85% - SubDaemonPoolWorker-7.worker
        10334: 12.78% - (unknown)

    2
        10351: 1.74% - SyncManager-1._run_server
        10334: 26.40% - (unknown)

    3
        10351: 1.74% - SyncManager-1._run_server
        10334: 35.56% - (unknown)
        18390: 33.96% - (unknown)

    4
        10351: 1.74% - SyncManager-1._run_server
        10334: 44.82% - (unknown)

    5
        10351: 1.74% - SyncManager-1._run_server
        10334: 50.51% - (unknown)

    6
        10351: 1.74% - SyncManager-1._run_server
        10334: 55.65% - (unknown)
        25349: 55.00% - (unknown)

    7
        10351: 1.74% - SyncManager-1._run_server
        10334: 61.72% - (unknown)

    8
        10351: 1.63% - SyncManager-1._run_server
        10334: 68.65% - (unknown)

    9
        30818: 68.41% - (unknown)
        10351: 0.37% - SyncManager-1._run_server
        10334: 68.87% - (unknown)

    10
        10351: 0.35% - SyncManager-1._run_server
        10334: 66.70% - (unknown)
        31164: 66.07% - SubDaemonPoolWorker-85.worker

    11
        10351: 0.27% - SyncManager-1._run_server
        10334: 70.25% - (unknown)
        31164: 41.74% - SubDaemonPoolWorker-85.worker

    12
        10351: 0.24% - SyncManager-1._run_server
        10334: 77.76% - (unknown)
        1001: 76.95% - SubDaemonPoolWorker-86.worker

    13
        10351: 0.24% - SyncManager-1._run_server
        10334: 77.96% - (unknown)

    14
        10351: 0.15% - SyncManager-1._run_server
        10334: 75.20% - (unknown)

    15
        10351: 0.14% - SyncManager-1._run_server
        10334: 77.95% - (unknown)
        2685: 77.10% - SubDaemonPoolWorker-89.worker

    16
        10351: 0.15% - SyncManager-1._run_server
        10334: 78.67% - (unknown)

    17
        3328: 80.97% - SubDaemonPoolWorker-91.worker
        10351: 0.13% - SyncManager-1._run_server
        10334: 82.75% - (unknown)

    18
        10351: 0.14% - SyncManager-1._run_server
        10334: 80.95% - (unknown)

    19
        10334: 82.42% - (unknown)
        10351: 0.16% - SyncManager-1._run_server

    20
        10334: 82.94% - (unknown)
        10351: 0.16% - SyncManager-1._run_server

    21
        10334: 84.02% - (unknown)
        10351: 0.16% - SyncManager-1._run_server

    22
        10334: 86.08% - (unknown)
        10351: 0.14% - SyncManager-1._run_server

Which shows that:

  • 10334: 70.25% - (unknown) is the main w3af process, using a considerable amount of RAM
  • 10351: 0.27% - SyncManager-1._run_server is the self._processes = manager.dict() which doesn't use much RAM (expected but good to confirm)
  • 31164: 41.74% - SubDaemonPoolWorker-85.worker is the parser worker process
  • There are at most three processes running at any time, which is good/expected but was something I was worried about: zombie processes are not present.

I would expect PID 31164 to use less RAM.

In most of the data tuples above we see that the main process uses X% RAM and the SubDaemonPoolWorker uses X-1%, which seems strange and makes me think that either:

  • os.fork() is not working as expected and actually copying all the process memory into the child
  • I'm measuring memory usage in a suboptimal way, which leads to the shared memory pages being counted

@andresriancho

4255a32 will help understand the shared memory usage in subprocesses

@andresriancho

Need to run ./collector examples/w3af/config-lock.yml 4255a32

@andresriancho

This is really bad... not much memory is shared between the parent (10327) and the child (11418):

    1
        10327 - (unknown)
            RSS (total): 208.6 MB
            Shared: 7.6 MB
            Percent OS used: 12.05 %

        11418 - SubDaemonPoolWorker-5.worker
            RSS (total): 169.7 MB
            Shared: 2.6 MB
            Percent OS used: 9.80 %

        10344 - SyncManager-1._run_server
            RSS (total): 29.7 MB
            Shared: 1.7 MB
            Percent OS used: 1.71 %

Is this the right type of shared memory I'm looking for?
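
One way to cross-check whether that "Shared" number is the right thing to look at is to read /proc/<pid>/smaps directly and compare RSS against PSS (proportional set size, which splits each shared page between the processes that map it). A rough sketch, assuming a Linux /proc filesystem:

def memory_from_smaps(pid):
    """
    Return (rss, pss, shared) in kB for `pid`, summed over all the mappings
    listed in /proc/<pid>/smaps.

    If RSS and PSS are almost equal, then very little memory is actually
    shared with other processes: the copy-on-write pages created by
    os.fork() have really been copied.
    """
    rss = pss = shared = 0

    with open('/proc/%s/smaps' % pid) as smaps:
        for line in smaps:
            fields = line.split()

            if line.startswith('Rss:'):
                rss += int(fields[1])
            elif line.startswith('Pss:'):
                pss += int(fields[1])
            elif line.startswith(('Shared_Clean:', 'Shared_Dirty:')):
                shared += int(fields[1])

    return rss, pss, shared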

@andresriancho

TODO for debugging: add the profiling tools to this function so we can better track what happens inside the sub-process:

def init_worker(log_queue):

@andresriancho

Looks like this might be useful:
http://pytracemalloc.readthedocs.org/install.html
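
A rough sketch of how pytracemalloc could be hooked into the worker initializer above; it assumes the patched Python build that pytracemalloc requires on 2.7, and the wrapper / dump-file names are made up for illustration:

import os

# init_worker() is the initializer from parser_cache.py mentioned above
from w3af.core.data.parsers.parser_cache import init_worker

try:
    import tracemalloc          # provided by pytracemalloc on Python 2.7
    HAS_TRACEMALLOC = True
except ImportError:
    HAS_TRACEMALLOC = False


def init_worker_with_profiling(log_queue):
    """Hypothetical init_worker() replacement that also starts tracemalloc."""
    init_worker(log_queue)

    if HAS_TRACEMALLOC:
        tracemalloc.start()


def dump_worker_memory_top(limit=15):
    """Write the top memory-allocating source lines of this worker to disk."""
    if not HAS_TRACEMALLOC:
        return

    snapshot = tracemalloc.take_snapshot()

    output_file = '/tmp/w3af-worker-%s.tracemalloc' % os.getpid()
    with open(output_file, 'w') as output:
        for stat in snapshot.statistics('lineno')[:limit]:
            output.write('%s\n' % stat)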

@andresriancho

Does this mean that the worker is sending HTTP traffic? What's the traceback for these calls to GET, _send_404, etc.? One way to get those tracebacks is sketched after the listing.

./wpa ~/performance_info/012d55a/i-6dd0c896/tmp/ 13959 -3

Performance analysis
====================

CPU usage by function
    Measurement #0 (w3af-13870-2015-27-01-03_00.cpu)
        1: extended_urllib.py:305 (ExtendedUrllib.GET)
        2: extended_urllib.py:491 (ExtendedUrllib._send)
        3: urllib2.py:386 (OpenerDirector.open)
        4: decorators.py:80 (f_retry)
        5: fingerprint_404.py:175 (fingerprint_404._send_404)
        6: threading.py:308 (_Condition.wait)
        7: pool.py:226 (Pool._maintain_pool)
        8: pool.py:195 (Pool._join_exited_workers)
        9: Queue.py:150 (Queue.get)
        10: HTTPResponse.py:129 (from_httplib_resp)
        11: urllib2.py:374 (OpenerDirector._call_chain)
        12: urllib2.py:414 (OpenerDirector._open)
        13: strategy.py:293 (w3af_core_strategy.verify_target_server)
        14: __init__.py:77 (DaemonProcess.exitcode)
        15: keepalive.py:782 (HTTPHandler.http_open)

    Measurement #1 (w3af-13959-2015-27-01-03_01.cpu)
        1: parser_cache.py:279 (init_worker)
        2: extended_urllib.py:305 (ExtendedUrllib.GET)
        3: extended_urllib.py:491 (ExtendedUrllib._send)
        4: urllib2.py:386 (OpenerDirector.open)
        5: decorators.py:22 (return_exceptions_wrapper)
        6: decorators.py:4 (apply_with_return_error)
        7: parser_cache.py:267 (ProcessDocumentParser.__init__)
        8: document_parser.py:39 (ProcessDocumentParser.__init__)
        9: html.py:45 (HTMLParser.__init__)
        10: sgml.py:65 (HTMLParser.__init__)
        11: sgml.py:140 (HTMLParser._parse)
        12: decorators.py:80 (f_retry)
        13: fingerprint_404.py:175 (fingerprint_404._send_404)
        14: __init__.py:122 (dump)
        15: url.py:175 (URL.__init__)
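
To answer the traceback question above, one option is to install a signal handler in the worker that dumps the stack of every thread on demand. This only uses the standard library; the choice of SIGUSR1 is arbitrary:

import signal
import sys
import traceback


def dump_thread_stacks(signum, frame):
    """Write the current stack of every thread in this process to stderr."""
    for thread_id, thread_frame in sys._current_frames().items():
        sys.stderr.write('\n--- Thread %s ---\n' % thread_id)
        sys.stderr.write(''.join(traceback.format_stack(thread_frame)))


def install_stack_dumper():
    """
    Call this from the worker initializer; afterwards `kill -USR1 <pid>`
    shows exactly where the worker is when it appears to be sending HTTP
    traffic on its own.
    """
    signal.signal(signal.SIGUSR1, dump_thread_stacks)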

@andresriancho

This issue of the sub-process running "all w3af" looks like http://stackoverflow.com/a/7994465/1347554, but that can't be...


@andresriancho

https://docs.python.org/3.4/library/multiprocessing.html#contexts-and-start-methods
"Note that safely forking a multithreaded process is problematic."

@andresriancho

The latest billiard, the one from the repo (not PyPI), is a backport of the 3.4 multiprocessing module to Python 2.7:
https://github.com/celery/billiard

I want to test that with the spawn start method: https://docs.python.org/3.4/library/multiprocessing.html#multiprocessing.set_start_method
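
A minimal sketch of what a spawn-based pool could look like; that the repo version of billiard exposes the same get_context() API as Python 3.4 multiprocessing is an assumption to verify:

# billiard from the git repo backports the Python 3.4 multiprocessing API,
# including start methods, to Python 2.7 (to be confirmed)
import billiard

if __name__ == '__main__':
    # 'spawn' starts a fresh interpreter instead of fork()ing the large,
    # multi-threaded main w3af process, so the child does not inherit the
    # parent's memory
    ctx = billiard.get_context('spawn')

    pool = ctx.Pool(processes=2)
    try:
        print(pool.map(abs, [-1, -2, -3]))
    finally:
        pool.close()
        pool.join()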

@andresriancho

Something I wanted to play with for a while... but not sure if it's the best thing in this scenario: http://zerorpc.dotcloud.com/

With timeouts:

In [1]: import zerorpc
In [2]: c = zerorpc.Client()
In [3]: c.connect("tcp://127.0.0.1:4242")
Out[3]: [None]
In [6]: c._timeout
Out[6]: 30
In [9]: c.add_42(1)
Out[9]: 43
In [14]: r = c.add_42(1, async=True)
In [15]: r
Out[15]: <gevent.event.AsyncResult at 0x7f1910006450>
In [16]: r.get()
Out[16]: 43
In [17]: r.get()
Out[17]: 43

#
# Sleep and timeout
#
In [20]: r = c.sleep(15, async=True)
In [21]: r.get(timeout=1)
---------------------------------------------------------------------------
Timeout                                   Traceback (most recent call last)
...
Timeout: 1 second


#
# When there is a timeout the task keeps running, and at some point it finishes
#
In [22]: r.get(timeout=10)
Out[22]: 'slept 15'


#
# Timeout without async
#
In [23]: c.sleep(15, timeout=1)
---------------------------------------------------------------------------
TimeoutExpired                            Traceback (most recent call last)
...
TimeoutExpired: timeout after 1s, when calling remote method sleep

And the server:

import time

class Cooler(object):
    """ Various convenience methods to make things cooler. """

    def add_man(self, sentence):
        """ End a sentence with ", man!" to make it sound cooler, and
        return the result. """
        return sentence + ", man!"

    def add_42(self, n):
        """ Add 42 to an integer argument to make it cooler, and return the
        result. """
        return n + 42

    def boat(self, sentence):
        """ Replace a sentence with "I'm on a boat!", and return that,
        because it's cooler. """
        return "I'm on a boat!"

    def sleep(self, num):
        time.sleep(num)
        return 'slept %s' % num

import zerorpc

s = zerorpc.Server(Cooler())
s.bind("tcp://0.0.0.0:4242")
s.run()

I could easily start the zerorpc server in a different process using Popen and make it listen on a unix socket (--bind ipc:///tmp/randomly-generated-id.w3af, readable and writable only by the current user). Then I would call the document parser through zerorpc. Some questions remain open:

  • I can timeout the "get()", but that doesn't kill the running remote function. Is there a way to do that without killing the whole zerorpc server?
  • What happens when multiple threads want to consume the remote service? Are the requests queued somewhere? In memory? Does execution block?
  • Does the remote service run multiple threads?

The document parser zerorpc server could have a signal handler for Ctrl+C which would stop the server, kill any running tasks, clean up, etc., and end the process. From the main process we could send that signal when we need to kill the sub-process.

Instead of having only one document parser server, I could use zeroworkers to have N workers (like I have now with the pool) and one task generator (the main process).
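
A rough sketch of the Popen plus unix-socket idea; parser_server.py is a hypothetical script that does the zerorpc.Server(...).bind()/run() dance shown above for the document parser:

import subprocess
import sys
import uuid

import zerorpc


def start_parser_server():
    """
    Start the document parser zerorpc server in a separate process,
    listening on a per-run unix socket, and return a connected client plus
    the Popen handle so the main process can terminate it later.
    """
    endpoint = 'ipc:///tmp/%s.w3af' % uuid.uuid4()

    # Hypothetical script that creates zerorpc.Server(DocumentParserService())
    # and binds it to the endpoint passed on the command line
    server = subprocess.Popen([sys.executable, 'parser_server.py',
                               '--bind', endpoint])

    client = zerorpc.Client()
    client.connect(endpoint)

    return client, server


# Usage sketch: call the (hypothetical) remote method asynchronously and
# time out the get(), as in the IPython session above
#
#     client, server = start_parser_server()
#     try:
#         result = client.get_document_parser(http_response, async=True)
#         parser = result.get(timeout=10)
#     finally:
#         server.terminate()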

@andresriancho

https://pypi.python.org/pypi/Pyro4 looks good:

  • Better than zerorpc (which seems to be abandoned)
  • It's pure-python and has only one (also pure python) dependency
  • MIT licensed, which is GPL compatible
  • Has support for remote tracebacks
  • Has support for futures with timeouts (wait method of futures); see the sketch below
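
A short sketch of how the futures-with-timeout part could look; the async proxy API (Pyro4.async and FutureResult.wait) is recalled from the Pyro4 docs and should be double-checked, and the object name / socket path are made up:

import Pyro4


def parse_with_timeout(http_response, timeout=10):
    """Call a (hypothetical) remote document parser, give up after `timeout`."""
    # Proxy to a parser service exposed by a sub-process over a unix socket
    parser = Pyro4.Proxy('PYRO:document.parser@./u:/tmp/w3af-parser.sock')

    # NOTE: Pyro4.async / FutureResult.wait are assumed from the Pyro4 docs;
    # the wrapped proxy returns futures instead of blocking
    async_parser = Pyro4.async(parser)
    future = async_parser.get_document_parser(http_response)

    # FutureResult.wait() takes a timeout and reports whether the call finished
    if future.wait(timeout=timeout):
        return future.value

    # Timed out: same open question as with zerorpc, the remote call is
    # probably still running in the sub-process
    return None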

@andresriancho

http://rpyc.readthedocs.org/ also looks good:

  • Better than zerorpc (which seems to be abandoned)
  • Active mailing list
  • Active development
  • Async/timed/wait
  • Has support for remote tracebacks
  • It's pure-python and has only one (also pure python) dependency

@andresriancho

Maybe the sub-processes could be started/stopped using something similar to https://pypi.python.org/pypi/python-daemon? (source: https://www.python.org/dev/peps/pep-3143/)
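
For reference, python-daemon's entry point is the PEP 3143 DaemonContext; whether full daemonization (detaching from the parent) is actually what the parser sub-process needs is still an open question. A minimal sketch, with a hypothetical server entry point:

import daemon  # https://pypi.python.org/pypi/python-daemon


def run_parser_daemon():
    """Run the (hypothetical) document parser server as a detached daemon."""
    with daemon.DaemonContext():
        # Inside the context the process has been forked away from its
        # parent, with stdio detached and umask/working directory reset
        run_document_parser_server()  # hypothetical entry point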

@andresriancho

I also need to create a new AMI for launching the collector since with the latest tools it's really slow to boot. andresriancho/collector#10

@andresriancho

https://circleci.com/gh/andresriancho/w3af/1428 yields the message:

Warning: The build VMs have a memory limit of 4G. Your build hit this limit
on one or more containers, and your build results are likely invalid.

This was not related to the parsers/sub-processes; it was caused by a memory leak in lxml.

@andresriancho

Solved in the develop branch.
