Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Confirm X509ChainPolicy.UrlRetrievalTimeout behavior across Windows/Linux #38875

Closed
bartonjs opened this issue Jul 7, 2020 · 10 comments · Fixed by #43586
Closed

Confirm X509ChainPolicy.UrlRetrievalTimeout behavior across Windows/Linux #38875

bartonjs opened this issue Jul 7, 2020 · 10 comments · Fixed by #43586
Assignees
Labels
area-System.Security test-enhancement Improvements of test source code
Milestone

Comments

@bartonjs
Copy link
Member

bartonjs commented Jul 7, 2020

Now that we have better testing capabilities for URL retrieval scenarios, we should ensure that Windows and Linux are behaving similarly.

I think currently Linux uses it cumulative, Windows uses it per fetch, and Linux uses 0 as "forever" but Windows might default to 15 seconds (or might cap at 15 seconds?)

@bartonjs bartonjs added this to the 5.0.0 milestone Jul 7, 2020
@ghost
Copy link

ghost commented Jul 7, 2020

Tagging subscribers to this area: @bartonjs, @vcsjones, @krwq
Notify danmosemsft if you want to be subscribed.

@Dotnet-GitSync-Bot Dotnet-GitSync-Bot added the untriaged New issue has not been triaged by the area owner label Jul 7, 2020
@bartonjs bartonjs removed the untriaged New issue has not been triaged by the area owner label Jul 7, 2020
@vcsjones
Copy link
Member

vcsjones commented Aug 7, 2020

@bartonjs i can grab this one and work on some tests over the weekend.

@bartonjs
Copy link
Member Author

bartonjs commented Aug 7, 2020

My testing thoughts were

  • Add a delay property to RevocationResponder
  • Build a 4-chain instead of a 3-chain
  • Do something like a 2 second delay on a 3 second timeout
  • See if the second intermediate resolution fails 😄 (PartialChain)
    • Could also be done for an untrusted root, I guess, avoiding the need to generate the 4-chain.
  • Similarly, with revocation enabled, and see if it fails between cert fetch and revocation fetch (RevocationStatusUnknown)
    • That requires trusting the root, but since there are two serial fetches the same 2/3 strategy works.

More and less complex things are possible, but that felt like the heart of the problem to me.

@vcsjones
Copy link
Member

vcsjones commented Aug 8, 2020

Some random commentary of findings. I'll keep updating this with things / behaviors since I can't be trusted not to lose a Text Editor document, apparently.

Linux revocation:

  • UrlRetrievalTimeout is cumulative for all network retrievals (AIA, OCSP, CRL) together.

  • Linux checks for revocation "up" the chain for OCSP. First the intermediate is checked, then the leaf. For CRL, it goes the other way. It checks the leaf first, then the intermediate.
    Given a 7 second URT, and each revocation response is delayed 5 seconds (meaning the first will succeed but the second will hit the timeout limit) here is the result for OCSP:

    CN=A Revocation Test Cert, O=LeafRevocation_OcspTimeout:
        RevocationStatusUnknown
        OfflineRevocation
    
    CN=A Revocation Test CA, O=LeafRevocation_OcspTimeout:
        None
    
     CN=A Revocation Test Root, O=LeafRevocation_OcspTimeout:
        None
    

    and for CRL:

    CN=A Revocation Test Cert, O=LeafRevocation_CrlTimeout:
        None
    
    CN=A Revocation Test CA, O=LeafRevocation_CrlTimeout:
        RevocationStatusUnknown
        OfflineRevocation
    
    CN=A Revocation Test Root, O=LeafRevocation_CrlTimeout:
        None
    
  • URT of "0" does indeed mean forever.

  • URT can be set to a negative value and does not throw. It just leads to all network operations to behave as if they immediately timed-out.

Windows Revocation:

  • URT is not cumulative - it is for for each individual operation.
  • Since URT is not cumulative, when an operation times out, it will try another method. e.g. if CRL times out, it will try OCSP. It will do this per-certificate in the chain.
  • URT seems to be capped at 1 minute.
  • A negative URT is treated as an unsigned value and is thus effectively 1 minute due to the cap. I suspect what is going on here is that we convert the milliseconds to an int, however CERT_CHAIN_PARA.dwUrlRetrievalTimeout expects an unsigned value, so by using a negative value, we end up with a huge unsigned value.
  • A URT of "0" seems to be treated as 15 seconds.

I'm still tinkering with how to roll all of this in to a test, but, those are the behaviors I have observed so far.

@vcsjones
Copy link
Member

@bartonjs I guess to better understand what you're looking for, are you wanting a test that just documents current behavior, or a few changes / issues to start unifying the behavior?

As it is, a test is doable but the behaviors are different enough that the test body is basically "If OpenSSL, do this, if Windows, do this entirely different test".

@bartonjs
Copy link
Member Author

Ideally, the test tests Windows, and then the Linux code changes to match it.

@vcsjones
Copy link
Member

To clarify one thing: URT has no affect on AIA fetching in Windows. As far as I can tell, there is no way to control AIA fetching timeouts in Win32.

@vcsjones
Copy link
Member

@bartonjs I think this issue can be closed. I had the PR as "contributes to" since the PR didn't initially cover negatives and caps, but those are in now.

@bartonjs
Copy link
Member Author

Is there anything else for testing... CRL/OCSP timing out and falling back to the other one? Anything related to AIA?

If there's anything left, I'll move it to 6 (since we got the most important parts in), if not, we can close.

@vcsjones
Copy link
Member

CRL/OCSP timing out and falling back to the other one?

Hm, yeah there are at least some tests there that could be written.

Anything related to AIA?

I added tests for Linux for AIA timeouts. They aren't in there for Windows since URT doesn't impact AIA fetches on Windows.

move it to 6

I think that is the most sensible thing.

@bartonjs bartonjs added the test-enhancement Improvements of test source code label Aug 13, 2020
@bartonjs bartonjs modified the milestones: 5.0.0, 6.0.0 Aug 13, 2020
@ghost ghost locked as resolved and limited conversation to collaborators Dec 19, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-System.Security test-enhancement Improvements of test source code
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants