Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rust Semgrep Rules #3151

Open
wants to merge 9 commits into
base: develop
Choose a base branch
from

Conversation

Matthewbialek
Copy link

@Matthewbialek Matthewbialek commented Oct 6, 2023

Hello Semgrep team
We would like to contribute the rules we are using internally for Rust! We would love to see the Rust support for the tool to grow. These rules do include some todorule id because I believe the pattern matching is having some issues and this could be illustrated. I understand the state of the language support of course also!

I will be on vacation for the next 3 weeks! But i will return and handle all the feedback We look forward to a growing relationship with the semgrep community :)

This pull request was a group effort with these great developers:
Rohit Joshi
Chris Jacoby
Josher

@CLAassistant
Copy link

CLAassistant commented Oct 6, 2023

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
3 out of 4 committers have signed the CLA.

✅ Matthewbialek
✅ inkz
✅ 0xDC0DE
❌ sxm588


sxm588 seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you have already a GitHub account, please add the email address used for this commit to your account.
You have signed the CLA already but the status is still pending? Let us recheck it.

@CLAassistant
Copy link

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.


sxm588 seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you have already a GitHub account, please add the email address used for this commit to your account.
You have signed the CLA already but the status is still pending? Let us recheck it.

@0xDC0DE
Copy link
Contributor

0xDC0DE commented Oct 17, 2023

Looks like we are still missing a CLA and some failing lints for our rules. If you have any questions or require any help, let us know! Here is a link to the docs about this as well:

https://semgrep.dev/docs/contributing/contributing-to-semgrep-rules-repository

@0xDC0DE
Copy link
Contributor

0xDC0DE commented Oct 17, 2023

I'll also do a review of the actual rules tomorrow!

Copy link
Contributor

@0xDC0DE 0xDC0DE left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have reviewed some of the rules, but not yet all! I will come back to this.
It seems like you are missing some metadata for these to be accepted as well:

Here is a link to the docs: https://semgrep.dev/docs/contributing/contributing-to-semgrep-rules-repository/#semgrep-registry-rule-requirements

Comment on lines +3 to +5
message: |
This is a Insufficiently Protected Credentials weakness: https://cwe.mitre.org/data/definitions/522.html
Consider using an appropriate security mechanism to protect the credentials (e.g. keeping secrets in environment variables)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
message: |
This is a Insufficiently Protected Credentials weakness: https://cwe.mitre.org/data/definitions/522.html
Consider using an appropriate security mechanism to protect the credentials (e.g. keeping secrets in environment variables)
message: >-
This is an Insufficiently Protected Credentials weakness: https://cwe.mitre.org/data/definitions/522.html
Consider using an appropriate security mechanism to protect the credentials (e.g. keeping secrets in environment variables).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use the >- operator for messages. This together with some other guidelines regarding metadata are available in the docs.

Comment on lines +10 to +11
pattern-either:
- patterns:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some unnecessary operators here

@@ -0,0 +1,5 @@
//ruleid: dangerous-spawn-process
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you also add True Negative tests, so tests indicated with //ok: <ruleid>

Comment on lines +17 to +19
patterns:
- pattern-either:
- patterns:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unnecessary operators

Comment on lines +20 to +26
- pattern: |
fn $FUNC(...) {
...
Command::new(...);
...
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- pattern: |
fn $FUNC(...) {
...
Command::new(...);
...
}
- pattern: |
Command::new(...)
- pattern-not: |
Command::new("...")

Some suggestions:

  1. By removing the function declaration around the actual vulnerable line of code we can reduce the amount of lines being reported or marked. Instead of the entire function declaration, Semgrep will now report the exact line of the Command::new. You will need to update the test syntax accordingly! Additionally, this would also capture calls that do not end in a semicolon.

For example, a construction like this might be missed by your pattern (I didn't actually run it):

let output = if cfg!(target_os = "windows") {
    Command::new(cmd)
            .args(["/C", "echo hello"])
            .output()
            .expect("failed to execute process")
} else {
    Command::new("sh")
            .arg("-c")
            .arg("echo hello")
            .output()
            .expect("failed to execute process")
};

You should look up some uses of the functions and add them to the testcode with appropriate test syntax to verify this!

  1. I have added a pattern-not to exclude very obvious False Positives, like a hardcoded String. This will reduce the noise for this rule.

Comment on lines +14 to +16
fix-regex:
regex: SslVersion::SSL3
replacement: SslVersion::TLS1_3
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there are fixes, it is possible to add fix tests.

Create a new file called deprecated-ssl.fixed.rs with the expected outcome of running the autofix on the deprecated-ssl.rs file.

More info here: https://semgrep.dev/docs/writing-rules/testing-rules/#testing-autofix

fix-regex works but using regular fix will be more powerful, especially when we develop AST-based autofix for Rust. (https://semgrep.dev/blog/2022/autofixing-code-with-semgrep/)

If you use fix your rules will likely be more future-proof. Your rule would like something like this:

patterns:
  - pattern: |
        $VAR.set_min_proto_version(Some($TLS))
  - metavariable-regex:
        metavariable: $TLS
        regex: SslVersion::(SSL3|TLSv1|TLSv1_1)
  - focus-metavariable: $TLS
fix: |
  SslVersion::TLS1_3

This would allow you to combine all three rules into just one rule! You can even use the metavariable in the message if you want to still keep three separate messages:

message: >-
    $TLS is insecure ...

Comment on lines +13 to +14
//todoruleid: math-random-used
let _ = thread_rng();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Semgrep does some symbolic propagation and should be aware enough for this case to be captured with your current version of the rule:

https://semgrep.dev/playground/s/N0oy

Comment on lines +17 to +30
# issues with pattern inside
# - pattern: |
# use rand::thread_rng;
# ...

# - pattern-not-inside: |
# use rand::CryptoRng;
# ...
# - pattern-not-inside: |
# use rand_core::StdRng;
# ...
# - pattern-not-inside: |
# use rand_core::OsRng;
# ...
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem here is that you have nested them under the pattern-either, this operator works like an OR operator.

Your pattern-inside is written correctly and works as expected.
https://semgrep.dev/playground/s/kJPP

Please add testcases for these pattern-not-inside cases to your testcode.


// Use of unsecure sha1 as MessageDigest
fn reversible_hash_2() {
//ruleid: reversible-hash
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add negative test cases for approved hash algortihms not flagged by this rule.

Comment on lines +9 to +11
generates a hash, then compares the hash to another hash that it has stored in its
authentication database, then the ability to create a collision could allow an attacker
to provide an alternate password that produces the same target hash, bypassing authentication.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What would be a recommended solution?

Comment on lines +13 to +15
pattern-either:
- pattern: use des;
- pattern: use des::Des;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
pattern-either:
- pattern: use des;
- pattern: use des::Des;
pattern-either:
- pattern: use des;
- pattern: use des::Des;
fix: |
use aes::Aes128;

Would probably be easy to add a fix here. Can you add a fix test as well?

Comment on lines +11 to +13
pattern-either:
- pattern: |
$VAR.set_ciphersuites("=~/(TLS_RSA_WITH_RC4_128_SHA|TLS_RSA_WITH_AES_128_CBC_SHA256|TLS_RSA_WITH_AES_128_CBC_SHA256|TLS_ECDHE_ECDSA_WITH_RC4_128_SHA|TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256|TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256|TLS_AES_256_GCM_SHA384)/")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
pattern-either:
- pattern: |
$VAR.set_ciphersuites("=~/(TLS_RSA_WITH_RC4_128_SHA|TLS_RSA_WITH_AES_128_CBC_SHA256|TLS_RSA_WITH_AES_128_CBC_SHA256|TLS_ECDHE_ECDSA_WITH_RC4_128_SHA|TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256|TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256|TLS_AES_256_GCM_SHA384)/")
patterns:
- pattern: |
$VAR.set_ciphersuites($CIPHER)
- metavariable-regex:
metavariable: $CIPHER
regex: (TLS_RSA_WITH_RC4_128_SHA|TLS_RSA_WITH_AES_128_CBC_SHA256|TLS_RSA_WITH_AES_128_CBC_SHA256|TLS_ECDHE_ECDSA_WITH_RC4_128_SHA|TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256|TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256|TLS_AES_256_GCM_SHA384)
fix: |
TLS_AES_128_GCM_SHA256

Feels like a rule that could easily have a fix. Mind adding one, as well as a fixtest? I've made a suggestion that might help you in the right direction.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was reading through the docs and i couldnt find anything about fix tests? i was also looking for examples in the other rules.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I saw the higher comment now

rules:
- id: use-of-md2
languages: [rust]
message: avoid cryptographically insecure hashing function(s)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please expand the message with "why" and "what to do instead".

rules:
- id: use-of-md4
languages: [rust]
message: avoid cryptographically insecure hashing function(s)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please expand the message with "why" and "what to do instead"

Comment on lines +6 to +13
- pattern: |
fn $FUNC(...) {
let $VAR = $F_ENV(...);
...
$X.push(...);
...
$F_FS($X);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment on lines +4 to +36
- pattern-either:
- patterns:
- pattern: |
fn $FUNC(...) {
let $VAR = $F_ENV(...)?;
...
$X.push($VAR);
...
$F_FS($X);
}
- pattern: |
fn $FUNC(...) {
let $VAR = $F_ENV(...)?;
...
$F_FS(...);
}
- metavariable-pattern:
metavariable: $F_ENV
pattern-either:
- pattern: std::env::var
- pattern: env::var
- pattern: "var"
- pattern: "var"
- pattern: "std::env::var_os"
- pattern: "env::var_os"
- pattern: "var_os"
- metavariable-pattern:
metavariable: $F_FS
pattern-either:
- pattern: "std::fs::File::open"
- pattern: "fs::File::open"
- pattern: "File::open"
- pattern: "open"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment on lines +13 to +21
pattern-either:
- pattern: |
let $VAL = requests::$METHOD(...).unwrap();
...
headers.set($HEADER($VAL));
- pattern: |
let $VAL = requests::$METHOD(...);
...
headers.set($HEADER($VAL));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@0xDC0DE could you help me understand how to do this properly. It keeps failing to parse

Comment on lines +12 to +13
message: |
Opening temporary files without appropriate measures or controls can leave the file, its contents and any function that it impacts vulnerable to attack.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add remediation guidance and testcases for the appropriate solution?

Comment on lines +2 to +9
- id: cross-site-scripting-templates-detected
patterns:
- pattern-either:
- pattern: |
let $VAR = requests::$METHOD(...).unwrap();
...
return [...,$VAR,...].join("\n");

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Taint mode!

Also think more about how to send HTML back. Commonly we use direct response writers as taint sinks for XSS. I haven't used Rust beyond a few small tests, but what I came up with in the Rocket framework for example was this:

#[get("/xss?<param>")]
fn xss(param: String) -> Html<String> {
    Html(format!("Html {}", param))
}

In this case the Html(...) function could be a taint sink, for example.

@0xDC0DE
Copy link
Contributor

0xDC0DE commented Oct 15, 2024

@Matthewbialek looks like it's been a while since this PR is touched. Would you still like some help getting this merged?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

4 participants