Reinstate full protection against slow retrieval attacks #932

lukpueh · 2019-10-10T14:39:04Z

Description of issue or feature request:

#781 updated tuf's downloader module to use requests instead of custom networking code, to fix issues with HTTPS proxies.

This change, however, deprived TUF of a way to prevent a certain kind of slow retrieval attack, where the attacker sends bytes below a minimum average speed.

This attack can not be prevented with the timeout parameters provided by requests, which are:

connect timeout (max # of seconds to establish connection with server)
read timeout (max # of seconds between bytes sent from the server)

Current behavior:
Cannot reliably enforce minimal average download speed and thus is vulnerable to slow retrieval attacks.

Expected behavior:
Reinstate full protection against slow retrieval attacks

Possible ways of fixing are

report issue to requests and ask them to fix it, or
switch to equally well-established and feature-rich requests-alternative that supports non-blocking I/O

Note: The comments in #781 have more ways of addressing this issue and also discuss whey they are not viable. Consider reading up on the discussion when working on this issue.

The text was updated successfully, but these errors were encountered:

joshuagl · 2019-10-11T13:11:02Z

I'd like to grab this, though it may be a a couple of weeks until I get round to it.

sharmaansh21 · 2020-03-18T13:49:28Z

@lukpueh Hi, I don't think anyone working on this, can I pick up this issue ?

govardhangdg · 2020-03-23T04:47:36Z

Hello @lukpueh! I'd like to pick this up as my GSoC project.

lukpueh · 2020-03-23T10:20:03Z

Thanks for your interest, @optimisticanshul and @govardhangdg! Make sure to turn in your GSoC application by the end of March.

Besides, you are welcome to join us on the CNCF TUF slack channel and tell us a little bit about you and why you are interested in working on TUF and this task in particular. :)

govardhangdg · 2020-03-23T11:12:55Z

Thanks @lukpueh. Will do!

Urhengulas · 2020-03-29T13:11:38Z

Hey @lukpueh,
I would also love to work on it as part of GSOC2020. Currently finishing my proposal and going to send a post in Slack.

jcstr · 2020-06-05T12:58:48Z

Hello Everyone,

Following the suggestion that @lukpueh says here, I've look around for a replacement for requests as I'm taking the consideration to the properties that make a good replacement for it.

Properties that fit for a good candidate for `requests` replacement

The library that will be the replacement has:

Equally well-established and feature-rich requests-alternative
Supports non-blocking I/O (asyncio)

The options

I'll mention here and I'd like to everyone give an observation for what I've choose and what other options are out there to take in consideration as well:

The library I've choose:
python-aiohttp: Asynchronous HTTP client/server framework for asyncio and Python.

Why `aiohttp` is a good replacement for `requests`?

If we look to the key features, it has:

Supports both client and server side of HTTP protocol.
Supports both client and server Web-Sockets out-of-the-box and avoids Callback Hell.
Provides Web-server with middlewares and plugable routing.

For me aiohttp is a good fit for TUF because it fills the requirements to become a replacement for requests. specially because it's built to work with asyncio utilities in hand, which will help us to solve the Slow Retrieval Attack by going beyond timeouts.

I need to verify that aiohttp will help us to solve this by doing a test and see how it behaves.

The other aspects that aiohttp is a good replacement is the community that's behind the project.
It seems that the project is following good practices such as code coverage for the codebase. Also the aiohttp community has a dedicated chat room (gitter) to look for help or talk about an specific feature.

Another options to look for

httpx
SOCKS

lukpueh · 2020-06-08T17:45:56Z

Many thanks for putting this together, @jcstr! python-aiohttp indeed seems to be the most popular choice of the 3. It might be worth comparing it to the seemingly similarly popular httpx in regards to code readability. Easy-to-read code is a key requirement for the TUF reference implementation!

Also, I wonder which of the key features you mention above you intend to use for this task. Can you explain why all of these are relevant?

At any rate, I'm looking forward to seeing proof-of-concept example. How is that coming along? Any code to share?

Btw., (and this is more a note to the co-maintainer) I saw that none of the mentioned libraries work on Python 2.7. @SantiagoTorres, @trishankatdatadog, @JustinCappos how long do we still want to support a Python 2.7 client?

SantiagoTorres · 2020-06-08T17:48:59Z

Btw., (and this is more a note to the co-maintainer) I saw that none of the mentioned libraries work on Python 2.7. @SantiagoTorres, @trishankatdatadog, @JustinCappos how long do we still want to support a Python 2.7 client?

I think this is an artifact of async not existing on py2 (iirc). My ideal roadmap was to:

Provide library X as an optdepend
if exists, you get slow retrieval protection
if it doesn't, then you get whatever is in place right now

I find this appealing because we can also allow people to fall back to the default implementation if they find things to be finnicky. Needless to say, this way we can also leave py2 out of the question.

trishankatdatadog · 2020-06-08T18:06:22Z

Agreed with @SantiagoTorres approach. I wouldn't make this a blocking requirement, since slow retrieval attacks are lower priority compared to other, more serious attacks.

jcstr · 2020-06-10T12:12:50Z

Many thanks for putting this together, @jcstr! python-aiohttp indeed seems to be the most popular choice of the 3. It might be worth comparing it to the seemingly similarly popular httpx in regards to code readability. Easy-to-read code is a key requirement for the TUF reference implementation!

No problem @lukpueh!, considering what you're saying in here I'll look closely to httpx and I'll comment here the comparison between aiohttp and httpx.

Also, I wonder which of the key features you mention above you intend to use for this task. Can you explain why all of these are relevant?

As long as I'm on my way to help with this issue, I think that mention the key features of aiohttp are worth it because it shows that fit with the needs to help with this case, specially this one:

Supports both client and server Web-Sockets out-of-the-box and avoids Callback Hell.

And the reason behind of this is, according to the knowledge I'm getting of this topic is web sockets are relevant for this particular case, we're looking for a solution that help us along side timeouts, that's were Asynchronous I/O comes to help.

At any rate, I'm looking forward to seeing proof-of-concept example. How is that coming along? Any code to share?

I need help with this, I've look around the test_slow_retrieval_attack.py test which focus on this issue and as @SantiagoTorres suggested me, it won't be a full library replacement for the first approximation, aiohttp will be added as a case use to specifically prevent the Slow Retrieval Attack and requests will still be used.

I'm looking for a way to refactor test_slow_retrieval_attack.py to include aiohttp with the Adapter structural design pattern but I don't know if I'm following the right direction.

lukpueh · 2020-06-11T15:06:16Z

No problem @lukpueh!, considering what you're saying in here I'll look closely to httpx and I'll comment here the comparison between aiohttp and httpx.

🎉

Supports both client and server Web-Sockets out-of-the-box and avoids Callback Hell.

And the reason behind of this is, according to the knowledge I'm getting of this topic is web sockets are relevant for this particular case, we're looking for a solution that help us along side timeouts, that's were Asynchronous I/O comes to help.

Right now the TUF updater uses HTTP, which has the advantage that we don't really need to think about the server implementation. Why should we instead use the websocket protocol?

I need help with this, I've look around the test_slow_retrieval_attack.py test which focus on this issue and as @SantiagoTorres suggested me, it won't be a full library replacement for the first approximation, aiohttp will be added as a case use to specifically prevent the Slow Retrieval Attack and requests will still be used.
I'm looking for a way to refactor test_slow_retrieval_attack.py to include aiohttp with the Adapter structural design pattern but I don't know if I'm following the right direction.

I agree with Santiago, don't worry too much about fixing the TUF updater just yet. It would be nice to see a minimal working example, where you have a server that slowly serves a file to a client if the download drops below a chosen rate.

This can be a simple stand-alone example and I doesn't necessarily need to follow any sophisticated design patterns unless you want to.

test_slow_retrieval_attack.py should help you to see how the slow retrieval attack prevention used to be tested with the TUF updater.

jku · 2020-07-02T09:57:29Z

I don't claim to be an expert on Requests but I don't see why this is true:

Cannot reliably enforce minimal average download speed and thus is vulnerable to slow retrieval attacks.

Can someone explain this (e.g. with an example)?

The current implementation uses a fairly huge CHUNK_SIZE (400000). Are you sure a more reasonable value would not allow you to choose acceptable MIN_AVERAGE_DOWNLOAD_SPEED value (as well as Requests connect and read timeouts)?

EDIT: partly answering myself with some bits from the linked bug: the concern seems to be that with whatever reasonable chunk size and read timeout the maximum download time for a single chunk would be too long. Would still be interesting to see what actual cases of slow retrieval attacks non-blocking IO will protect from... I don't see too many numbers in the linked PR either.

jku · 2020-07-02T14:35:55Z

I tested my assumptions a bit: I did download tests (very unscientific just to get a feel for it: laptop with many workloads, wifi with other users on it). I downloaded 100MB using a loop like the one in tuf for each chunk size ten times and picked the fastest results.

SIZE(b) TIME(s)
    50    7.80
   100    4.21
   200    4.22
   400    4.26
   800    4.25
  1600    4.26
  4000    4.25
 40000    4.27
400000    4.25

So on my laptop and wifi you can reduce CHUNK_SIZE to 100 bytes (0.025% of its current value) without affecting download speed. I'm not saying we should do that (my test starts to become CPU-bound at lower chunk sizes and there could be other disadvantages like increased network traffic because of overhead). I am saying that 400000 is a huge chunk size -- smaller values can probably be used without affecting performance.

jku · 2020-11-17T13:14:57Z

Linking to #1213 updater: Abstract out the network IO: it proposes creating a client network IO interface that clients can implement and providing a default implementation in TUF.

Effects on this issue would be:

Experimenting with e.g. aiohttp would become quite a bit easier -- if the client network functionality is pluggable, comparing implementations and developing separate ones becomes much more reasonable
The protections that TUF offers for slow retrieval will partially depend on the component that the client selected: this is not great for messaging but arguably not any worse than the current situation (where slow retrieval protection just does not work).

lukpueh mentioned this issue Oct 10, 2019

Add disclaimer about limited slow retrieval attack protection #928

Merged

3 tasks

lukpueh added the up for grabs label Oct 11, 2019

lukpueh mentioned this issue Jun 29, 2020

Updater: No download progress hooks #1057

Closed

joshuagl added this to the 1.0.0 milestone Jul 7, 2020

joshuagl removed this from the 1.0.0 milestone Sep 8, 2020

lukpueh mentioned this issue Nov 20, 2020

updater: abstract out the network IO #1213

Closed

effulgentsia mentioned this issue Apr 12, 2021

Allow PreFileDownloadEvent to carry transport options for metadata composer/composer#9831

Merged

lukpueh mentioned this issue Jan 15, 2024

increase default network timeout #2542

Merged

3 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Reinstate full protection against slow retrieval attacks #932

Reinstate full protection against slow retrieval attacks #932

lukpueh commented Oct 10, 2019

joshuagl commented Oct 11, 2019

sharmaansh21 commented Mar 18, 2020

govardhangdg commented Mar 23, 2020

lukpueh commented Mar 23, 2020

govardhangdg commented Mar 23, 2020

Urhengulas commented Mar 29, 2020

jcstr commented Jun 5, 2020

lukpueh commented Jun 8, 2020

SantiagoTorres commented Jun 8, 2020

trishankatdatadog commented Jun 8, 2020

jcstr commented Jun 10, 2020

lukpueh commented Jun 11, 2020

jku commented Jul 2, 2020 •

edited

Loading

jku commented Jul 2, 2020 •

edited

Loading

jku commented Nov 17, 2020

Reinstate full protection against slow retrieval attacks #932

Reinstate full protection against slow retrieval attacks #932

Comments

lukpueh commented Oct 10, 2019

joshuagl commented Oct 11, 2019

sharmaansh21 commented Mar 18, 2020

govardhangdg commented Mar 23, 2020

lukpueh commented Mar 23, 2020

govardhangdg commented Mar 23, 2020

Urhengulas commented Mar 29, 2020

jcstr commented Jun 5, 2020

Properties that fit for a good candidate for requests replacement

The options

Why aiohttp is a good replacement for requests?

Another options to look for

lukpueh commented Jun 8, 2020

SantiagoTorres commented Jun 8, 2020

trishankatdatadog commented Jun 8, 2020

jcstr commented Jun 10, 2020

lukpueh commented Jun 11, 2020

jku commented Jul 2, 2020 • edited Loading

jku commented Jul 2, 2020 • edited Loading

jku commented Nov 17, 2020

Properties that fit for a good candidate for `requests` replacement

Why `aiohttp` is a good replacement for `requests`?

jku commented Jul 2, 2020 •

edited

Loading

jku commented Jul 2, 2020 •

edited

Loading