Serious Security Vulnerabilities in GATK #8215

Open
mohitmathew opened this issue Feb 22, 2023 · 18 comments

@mohitmathew

I am looking at using GATK and first checked the Docker image pulled with `docker pull broadinstitute/gatk`.

This container image has 1460 vulnerabilities, and a lot of them are critical.
[Screenshot 2023-02-21 212830]

I then decided not to use this image and instead to build my own image that deploys only the released version 4.2.6.1 from here (https://github.com/broadinstitute/gatk/releases/download/4.2.6.1/gatk-4.2.6.1.zip).

Even this has many vulnerabilities, including ones stemming from log4j 1.2.17. The log4j team fixed these years ago, from version 2.17.1 onwards. I am really stunned that a popular library like GATK is not keeping up with basic security fixes.

[Screenshot 2023-02-21 212751]

The latest version of Docker Desktop has integrated image scanning and can very easily highlight the issues listed above.

Can we start addressing these issues sooner rather than later?

@droazen
Contributor

droazen commented Feb 23, 2023

@mohitmathew Thanks for the report! We are currently in the process of updating GATK to Java 17, which necessarily involves updating many of our dependencies. We are also updating our docker image to be based off of the latest Ubuntu LTS release. This should greatly reduce the number of critical vulnerabilities in our release image. After the Java 17 switchover we can revisit this and see what security issues remain.

@gokalpcelik
Contributor

Java 17! Great news.

@lbergelson
Member

I will say that a lot of the listed vulnerabilities are not actually problematic for us. Many of the scariest ones are only relevant in the context of reading untrusted data from the internet, which is not something that GATK typically does.

@droazen droazen self-assigned this Feb 27, 2023
@mohitmathew
Author

@droazen Sorry for the late response. I agree that moving to Java 17 would help. I do see that GATK itself uses the newer version of log4j, but the transitive dependencies of the libraries it uses bring in the older version of log4j.

As a result, the final compiled jar contains both versions of log4j, which could create problems.

GATK, being a very useful tool, gets integrated into many other tools and pipelines, so it affects the security posture of wherever it is used. The risk might be low for a standalone CLI tool, but it's a very hard conversation with info security :) .

May I ask for a ballpark ETA for the new version? I appreciate the work that's gone into this tool.

@DarioS

DarioS commented Mar 22, 2023

It was released last week.

@droazen
Contributor

droazen commented Apr 4, 2023

The latest GATK release does significantly cut down on the number of critical vulnerabilities (mainly by moving to the latest Ubuntu 18.04 image), but there is definitely more work to be done here, so I'll keep this ticket open.

@superbsky

superbsky commented Apr 18, 2023

I am still receiving security warnings about GATK 4.4.0.0:

Detected by File Paths: gatk-4.4.0.0/gatk-package-4.4.0.0-local.jar
Detected by Library: pkg:java/log4j:log4j
CPE: cpe:/a:apache:log4j:1.2.17
Version End of Life Date: August 4th, 2015 at 7:00 PM
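
A finding like this can be cross-checked by listing the jar's contents, since a jar is a zip archive. The following is a minimal sketch, not part of GATK: the function name `find_log4j_classes` is hypothetical, and it assumes the standard package layout, where log4j 1.x classes live under `org/apache/log4j/` and log4j 2.x classes under `org/apache/logging/log4j/`. Note that log4j 2's own 1.x compatibility bridge also ships classes under `org/apache/log4j/`, so a nonzero count is a prompt for a closer look rather than proof that log4j 1.2.17 is bundled.

```python
import zipfile

# Standard package prefixes inside a jar (assumption: classes are not relocated/shaded).
LOG4J1_PREFIX = "org/apache/log4j/"
LOG4J2_PREFIX = "org/apache/logging/log4j/"


def find_log4j_classes(jar_path):
    """Count .class entries per log4j generation inside a jar (a zip archive)."""
    counts = {"log4j1": 0, "log4j2": 0}
    with zipfile.ZipFile(jar_path) as jar:
        for name in jar.namelist():
            if not name.endswith(".class"):
                continue
            if name.startswith(LOG4J1_PREFIX):
                counts["log4j1"] += 1
            elif name.startswith(LOG4J2_PREFIX):
                counts["log4j2"] += 1
    return counts
```

Calling `find_log4j_classes("gatk-4.4.0.0/gatk-package-4.4.0.0-local.jar")` on the path from the report above would show whether both generations are present in the same jar.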

@mohitmathew
Author

The vulnerabilities have been reduced a bit, but the most serious ones are still there. Dependency upkeep is really needed to iron these out.

@droazen
Contributor

droazen commented May 12, 2023

Yes, this issue is not yet fully resolved. We intend to make additional progress in reducing vulnerabilities in our dependencies in the next GATK release.

@mohitmathew
Author

Hi @droazen, I see you were working on this issue and opened a PR, but could not merge it because of test case failures. I wanted to check whether you were able to make progress on this. Within my organization, infosec reviewed GATK independently and has denied its use :( . Let me know if you have an ETA for a security fix update.

Thank you!

@droazen
Contributor

droazen commented Aug 16, 2023

@mohitmathew Yes, we are still working on this! The PR is not yet in a usable state, but we intend to finish it for the next release.

@mohitmathew
Author

Thanks @droazen! Eagerly waiting for the next release.

@droazen
Contributor

droazen commented Dec 14, 2023

@mohitmathew With the GATK 4.5 release, we've again made significant progress on the known vulnerabilities in our dependencies, as well as in our docker image. There are still a few left, in "dependencies of our dependencies" that will be difficult to update, but we're getting there.

Note that the known vulnerabilities in log4j 1.x reported above are not the same as the infamous (and extremely serious) log4j 2.x vulnerabilities that were discovered a few years back. log4j 1.x completely lacks the feature that was exploited in the log4j 2.x vulnerability, and we patched our version of log4j 2.x in GATK almost as soon as that vulnerability was reported.
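
The distinction between the two log4j generations is visible in the Maven coordinates that scanners report: log4j 1.x is published as `log4j:log4j`, while log4j 2.x artifacts live under the group `org.apache.logging.log4j`. A minimal sketch of telling them apart follows; the helper name `log4j_generation` is hypothetical, and it only classifies a coordinate. It does not assess which vulnerabilities apply.

```python
def log4j_generation(coordinate):
    """Classify a 'group:artifact:version' Maven coordinate as log4j 1.x or 2.x.

    Returns "1.x", "2.x", or None if the coordinate is not a log4j artifact.
    """
    parts = coordinate.split(":")
    if len(parts) != 3:
        raise ValueError(f"expected group:artifact:version, got {coordinate!r}")
    group, artifact, _version = parts
    if group == "log4j" and artifact == "log4j":
        return "1.x"  # the original log4j line, e.g. log4j:log4j:1.2.17
    if group == "org.apache.logging.log4j":
        return "2.x"  # log4j-api, log4j-core, and related 2.x modules
    return None
```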

@mohitmathew
Author

@droazen: Thanks a lot for prioritizing and attending to this. The security posture has greatly improved from where we started. The community greatly benefits from your effort.

I have migrated to the 4.5 release after some regression testing. Below is a list of critical and high findings with the 4.5 release. There are links to Snyk version update recommendations. I know that sometimes it's not easy to just upgrade a library version, as we could end up with runtime errors. I am adding this here so that it's handy whenever you look at this further.

Thanks again.

| packageName | version | severity | language | module_id |
| --- | --- | --- | --- | --- |
| com.google.protobuf:protobuf-java | 3.7.1 | high | java | SNYK-JAVA-COMGOOGLEPROTOBUF-2331703 |
| com.google.protobuf:protobuf-java | 3.7.1 | high | java | SNYK-JAVA-COMGOOGLEPROTOBUF-3167772 |
| io.netty:netty-codec-http2 | 4.1.96.Final | high | java | SNYK-JAVA-IONETTY-5953332 |
| log4j:log4j | 1.2.17 | high | java | SNYK-JAVA-LOG4J-2342645 |
| log4j:log4j | 1.2.17 | high | java | SNYK-JAVA-LOG4J-2342646 |
| log4j:log4j | 1.2.17 | high | java | SNYK-JAVA-LOG4J-2342647 |
| log4j:log4j | 1.2.17 | critical | java | SNYK-JAVA-LOG4J-572732 |
| net.minidev:json-smart | 1.3.2 | high | java | SNYK-JAVA-NETMINIDEV-3369748 |
| org.apache.zookeeper:zookeeper | 3.6.3 | high | java | SNYK-JAVA-ORGAPACHEZOOKEEPER-5961102 |
| org.codehaus.jettison:jettison | 1.1 | high | java | SNYK-JAVA-ORGCODEHAUSJETTISON-3168085 |
| org.codehaus.jettison:jettison | 1.1 | high | java | SNYK-JAVA-ORGCODEHAUSJETTISON-3367610 |
| org.eclipse.jetty:jetty-http | 9.4.52.v20230823 | high | java | SNYK-JAVA-ORGECLIPSEJETTY-5958847 |
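
To see where the findings above concentrate, they can be tallied by severity and by package. This is a minimal sketch: the rows are copied from the table above, while the names `FINDINGS` and `summarize` are hypothetical and not part of Snyk or GATK.

```python
from collections import Counter

# Rows copied from the table above: (package, version, severity, Snyk ID).
FINDINGS = [
    ("com.google.protobuf:protobuf-java", "3.7.1", "high", "SNYK-JAVA-COMGOOGLEPROTOBUF-2331703"),
    ("com.google.protobuf:protobuf-java", "3.7.1", "high", "SNYK-JAVA-COMGOOGLEPROTOBUF-3167772"),
    ("io.netty:netty-codec-http2", "4.1.96.Final", "high", "SNYK-JAVA-IONETTY-5953332"),
    ("log4j:log4j", "1.2.17", "high", "SNYK-JAVA-LOG4J-2342645"),
    ("log4j:log4j", "1.2.17", "high", "SNYK-JAVA-LOG4J-2342646"),
    ("log4j:log4j", "1.2.17", "high", "SNYK-JAVA-LOG4J-2342647"),
    ("log4j:log4j", "1.2.17", "critical", "SNYK-JAVA-LOG4J-572732"),
    ("net.minidev:json-smart", "1.3.2", "high", "SNYK-JAVA-NETMINIDEV-3369748"),
    ("org.apache.zookeeper:zookeeper", "3.6.3", "high", "SNYK-JAVA-ORGAPACHEZOOKEEPER-5961102"),
    ("org.codehaus.jettison:jettison", "1.1", "high", "SNYK-JAVA-ORGCODEHAUSJETTISON-3168085"),
    ("org.codehaus.jettison:jettison", "1.1", "high", "SNYK-JAVA-ORGCODEHAUSJETTISON-3367610"),
    ("org.eclipse.jetty:jetty-http", "9.4.52.v20230823", "high", "SNYK-JAVA-ORGECLIPSEJETTY-5958847"),
]


def summarize(findings):
    """Return (findings per severity, findings per package:version)."""
    by_severity = Counter(severity for _, _, severity, _ in findings)
    by_package = Counter(f"{pkg}:{ver}" for pkg, ver, _, _ in findings)
    return by_severity, by_package


by_severity, by_package = summarize(FINDINGS)
print(by_severity)
for package, count in by_package.most_common():
    print(count, package)
```

For this table the tally is 11 high and 1 critical finding across 7 packages, with log4j 1.2.17 accounting for 4 of the 12.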

@vilay-nference

vilay-nference commented Jul 11, 2024

@droazen, we have made fixes for the vulnerabilities remaining after the Java 17 update, which was released last week.

Can you help integrate these into GATK so that we can have a new release? We have the patched files ready.

Thanks

@droazen
Contributor

droazen commented Jul 11, 2024

@vilay-nference You are always very welcome to submit a pull request on GitHub with any proposed changes to GATK!

Most of the remaining vulnerabilities are in dependencies-of-dependencies, which can be difficult to update, but we are slowly chipping away at them. For example, log4j 1.x is a dependency of the latest release of Apache Spark 3.x, and Spark 4.x is still in preview (and note again that the log4j 1.x vulnerabilities are not the same as the infamous and very serious vulnerability that affected log4j 2.x some years ago). We don't believe that any of the remaining library vulnerabilities pose a real-world threat to GATK in practice, but it would still be good to eliminate them.

@vilay-nference

@droazen,

Apologies for the delay in getting back to you.

Given the nature of our work, it's essential that we address and remove any high and critical vulnerabilities, regardless of their real-world threat level. Ensuring our system remains secure is our top priority.

Here is the pull request with the modifications to address the high and critical vulnerabilities: #8950.

Please review and let me know if you have any feedback.

@lbergelson
Member

@vilay-nference Thank you for your pull request. I've incorporated your suggestions and closed out many vulnerabilities from our transitive dependencies. Hadoop and Spark have finally stopped incorporating log4j 1.x, so that one is closed out for good.

I've also rebuilt our base Docker image to incorporate recent patches from Ubuntu. We've added some additional security scanning to our build process, which will help keep us more up to date going forward.
