Hang on calculating a large number #538

LordAro · 2014-05-13T17:06:51Z

It seems to be trivially easy to cause the bot to hang.

".calc 99999999999**9"

hung for several minutes, and was maxing out a core on my server

elad661 · 2014-05-13T17:31:07Z

Can reproduce.

It's threaded, so I guess the reason this freezes is the GIL.

We need to see how we can optimize .calc or prevent calculations with such big numbers.

lramati · 2014-05-14T00:12:24Z

Maybe before running the calculation, calculate a computational complexity
based on the operators present and not running if the complexity is too
high?

ari-koivula · 2014-05-16T12:25:46Z

I suggest adding adding constraints on maximum magnitudes of inputs to exponential operations like multiplication and division. Something like 10**1000 could be reasonable.

Exponentiation needs an additional constraint. Maybe something like exp < 10**1000 and exp * (1+log10(number))**3 < 1000 based on theoretical computational complexity of exponentiation.

saelo · 2014-07-06T13:08:23Z

I've prepared a basic fix for this issue which simply spawns a seperate process and kills it
after some time: https://github.com/saelo/willie/commit/c391b13463031e02855b6573cca9a13da557fdf8

I'm sure there's stuff to imporove here. Let me know what you think.

ari-koivula · 2014-07-07T14:19:59Z

@saelo Have you tested this? How does it fare when a big calculation is spammed? It's no good if the host can be flooded with python processes, with each one trying to alloc gigabytes of memory to do the calculation. Something like (2100)(2**100) can allocate quite a bit of memory in a few seconds.

saelo · 2014-07-07T16:02:06Z

@Venti- yeah, I've tested it and it's definately not perfect but at least prevents the bot from disconnecting due to timeout. I've ran ~20 huge calculations at the same time on a raspberry pi, the bot is a lot slower during that time but recovers.
You can probably still DOS the server with 100% CPU load if you wanted to though... Then again you can probably do that without the calc module as well. Memory is not really an issue on the RPI as the CPU is "slow enough", don't know about other systems.
To fix that I would suggest limiting the number of times a specific action/command can be triggered within a couple of seconds inside the framework. Alternatively the calc module could be changed to do that but I think doing it in the framework would be better.

ari-koivula · 2014-07-07T23:06:23Z

While your solution works, I don't really like the idea of spinning up new python processes just to do a simple computation (something computers are supposed to be pretty good at), so I implemented my own suggestion from earlier.

I made the limits quite liberal so you might be able to break it if you try. I based them on how long the calculations took on my laptop and set the limit on around 0.5s. I figured that should be low enough for something like Raspberry to be able to do them in 5s. You could test that though by running the following script on your Raspberry:

Here is a script to print a table of how long pow takes for various arguments around the 1s mark:

from timeit import timeit
print "   " + "".join("{e:>5}e5".format(e=e) for e in range(1, 10))
for n in range(1, 10):
    print "e{n:<2}".format(n=n),
    for e in range(1, 10):
        print "{time:>6.2f}".format(
                 time=timeit("{n}**({e} * 10**5)".format(n=10**n, e=e), number=1)),
    print ""

Here are the result on my laptop (i7-2670QM): x=exp, y=num

-      1e5    2e5    3e5    4e5    5e5    6e5    7e5    8e5    9e5
e1    0.03   0.09   0.16   0.25   0.35   0.46   0.60   0.73   0.88 
e2    0.08   0.24   0.46   0.73   1.03   1.40   1.80   2.21   2.63 
e3    0.15   0.46   0.87   1.39   1.99   2.63   3.35   4.18   5.15 
e4    0.24   0.73   1.39   2.20   3.11   4.18   5.39   6.59   7.88 
e5    0.34   1.03   2.00   3.12   4.48   5.97   7.56   9.37  11.34 
e6    0.46   1.39   2.62   4.16   5.97   7.86  10.09  12.56  15.39 
e7    0.60   1.79   3.34   5.39   7.60  10.16  13.00  16.23  19.44 
e8    0.73   2.20   4.18   6.60   9.37  12.60  16.26  19.83  23.70 
e9    0.87   2.62   5.15   7.93  11.34  15.44  19.40  23.66  28.58

ari-koivula · 2014-07-08T08:05:32Z

I guess we should still ad a overall time limit to the evaluator to avoid cases where some complex expression is chained enough times to cause a significant delay. I first thought that the limited input length would take care of that, but with a slow enough processor and the right expression it probably is still a problem. Setting an overall time limit in the evaluator while the individual operations are limited in time should take care of that though. I will do that later today or tomorrow.

I also noticed that for large expressions the most time can actually go to turning the 10**9 digit number into characters, which is why I set a limit on the magnitude of numbers than can be returned. Another way to handle that might be to print just the magnitude and a few digits for large numbers.

saelo · 2014-07-08T17:05:37Z

Sure, works for me.
Apparently (for python2) you also need to check for long though:

>>> isinstance(1000000000000000000000000000, int)
False

For the record here is the output of your script from an RPI (overclocked at 900MHz):

       1e5    2e5    3e5    4e5    5e5    6e5    7e5    8e5    9e5
e1    0.25   0.70   1.45   2.36   2.92   4.80   5.74   6.62   7.37 
e2    0.64   1.94   3.90   6.55   9.17  13.07  15.34  20.14  22.71 
e3    1.48   4.21   7.62  12.98  17.62  23.35  30.51  38.11  43.11 
e4    2.39   6.37  12.67  19.15  28.58  39.07  46.19  58.07  71.57 
e5    3.05   9.55  17.40  28.91  45.25  53.52  68.06  85.06 106.04 
e6    4.22  13.36  23.30  38.51  51.99  70.17  92.41 118.36 134.45 
e7    5.32  15.37  30.89  46.53  67.23  85.88 120.25 140.42 164.13 
e8    5.85  17.60  35.50  53.01  76.90 107.40 128.93 160.87 194.99 
e9    7.09  21.37  41.51  64.69  95.64 123.31 156.29 194.97 246.18

ari-koivula · 2014-07-08T17:34:39Z

Yeah. Python 3 doesn't have longs, but I think comparing against numbers.Integral should do the job.

- Added default timeout of 5s to expression evaluator, that will throw an error before evaluating the next statement if the timeout has passed. This should stop an expression consisting of several long running statements from taking too long. - Calc now uses general number formatting from str.format to print the number according to it's magnitude. - Fixes bug in Python 2 from #538, where long ints are let through by using abstract base class numbers.Integral.

ari-koivula · 2014-07-08T18:02:06Z

From looking at your results it's seems like the default limit does indeed limit the runtime to roughly 5s on the pi. I guess you could scale the limit function by timing some calculation at initialization time or something, but I think I'm done with this for now.

- Fixes sopel-irc#538. - Removed the drivel from calc when exceptions are raised. Printing the actual error might not be pretty, but at least it's not completely useless.

- Added default timeout of 5s to expression evaluator, that will throw an error before evaluating the next statement if the timeout has passed. This should stop an expression consisting of several long running statements from taking too long. - Calc now uses general number formatting from str.format to print the number according to it's magnitude. - Fixes bug in Python 2 from sopel-irc#538, where long ints are let through by using abstract base class numbers.Integral.

elad661 added Bug labels May 13, 2014

ari-koivula closed this as completed in 8730ef3 Jul 7, 2014

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Hang on calculating a large number #538

Hang on calculating a large number #538

LordAro commented May 13, 2014

elad661 commented May 13, 2014

lramati commented May 14, 2014

ari-koivula commented May 16, 2014

saelo commented Jul 6, 2014

ari-koivula commented Jul 7, 2014

saelo commented Jul 7, 2014

ari-koivula commented Jul 7, 2014

ari-koivula commented Jul 8, 2014

saelo commented Jul 8, 2014

ari-koivula commented Jul 8, 2014

ari-koivula commented Jul 8, 2014

Hang on calculating a large number #538

Hang on calculating a large number #538

Comments

LordAro commented May 13, 2014

elad661 commented May 13, 2014

lramati commented May 14, 2014

ari-koivula commented May 16, 2014

saelo commented Jul 6, 2014

ari-koivula commented Jul 7, 2014

saelo commented Jul 7, 2014

ari-koivula commented Jul 7, 2014

ari-koivula commented Jul 8, 2014

saelo commented Jul 8, 2014

ari-koivula commented Jul 8, 2014

ari-koivula commented Jul 8, 2014