fix: filter for hemizygous frequencies does not work (#2100) #2101

Merged
merged 1 commit into from
Nov 7, 2024

Conversation

stolpeo
Contributor

@stolpeo stolpeo commented Nov 7, 2024

Summary by CodeRabbit

  • New Features

    • Enhanced genomic data handling with the addition of hemizygous variant fields in query forms.
  • Bug Fixes

    • Improved error handling for case and quality sample comparisons in the conversion process.
  • Chores

    • Updated test data file, reflecting changes in content and size.

@stolpeo stolpeo linked an issue Nov 7, 2024 that may be closed by this pull request
Contributor

coderabbitai bot commented Nov 7, 2024

📥 Commits

Reviewing files that changed from the base of the PR and between f36aea7 and bfd68bf.

Walkthrough

The pull request introduces modifications to the backend/variants/query_schemas.py file, enhancing the QueryJsonToFormConverter and FormToQueryJsonConverter classes by adding new genomic data fields: exac_hemizygous, thousand_genomes_hemizygous, gnomad_exomes_hemizygous, and gnomad_genomes_hemizygous. These fields are incorporated into the conversion methods of both classes, allowing for improved handling of hemizygous variant data. Additionally, the error handling logic for case and genotype sample comparisons has been updated to ensure consistency. The backend/variants/tests/data/query_settings.py file has also been modified, though specific changes are not detailed.
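
The field pattern described in the walkthrough can be illustrated with a minimal sketch. This is not the code in backend/variants/query_schemas.py; the helper names `frequency_count_fields` and `copy_frequency_counts` and the `{database}_{kind}` key layout are assumptions made for illustration, based only on the four hemizygous field names listed above and the heterozygous/homozygous pattern mentioned in the review.

```python
# Hypothetical sketch: enumerate per-database count fields so that a new
# kind (here "hemizygous") cannot be forgotten for a single database.
DATABASES = ("exac", "thousand_genomes", "gnomad_exomes", "gnomad_genomes")
COUNT_KINDS = ("heterozygous", "homozygous", "hemizygous")


def frequency_count_fields(databases=DATABASES, kinds=COUNT_KINDS):
    """Return field names such as 'gnomad_exomes_hemizygous'."""
    return [f"{db}_{kind}" for db in databases for kind in kinds]


def copy_frequency_counts(source: dict) -> dict:
    """Copy every count threshold from `source`; absent keys become None."""
    return {name: source.get(name) for name in frequency_count_fields()}
```

Generating the names from two tuples, rather than listing each field by hand, is one way to keep both conversion directions in sync when a count kind is added.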

Changes

File | Change Summary
backend/variants/query_schemas.py | Enhanced QueryJsonToFormConverter and FormToQueryJsonConverter to include new genomic fields. Updated error handling in QueryJsonToFormConverter for case and genotype sample comparisons.
backend/variants/tests/data/query_settings.py | File size and hash changed; no specific code changes detailed.

Possibly related PRs

  • fix: annotated sv not listed in case annotations (#2015) #2030: The changes in this PR focus on enhancing the handling of user annotations related to structural variants, which may involve similar data handling concepts as the genomic data fields added in the main PR, although they target different classes and functionalities.

Poem

In the fields of data bright,
New genes dance in the moonlight.
Hemizygous tales now unfold,
With converters bold and gold.
A hop, a skip, a jump we cheer,
For genomic wonders now draw near! 🐇✨


github-actions bot commented Nov 7, 2024

deps-report 🔍

Commit scanned: bfd68bf
ℹ️ Python version 3.10 is used by your project but the latest version is 3.13.

Vulnerable dependencies

4 dependencies have vulnerabilities 😱
Dependency | Advisory | Versions impacted
djangorestframework (transitive) | Affected versions of the package djangorestframework are vulnerable to Cross-site Scripting (XSS) via the break_long_headers template filter due to improper input sanitization before splitting and joining with <br> tags. | <3.15.2
jinja2 (transitive) | In Jinja2, the from_string function is prone to Server Side Template Injection (SSTI) where it takes the source parameter as a template object, renders it, and then returns it. The attacker can exploit it with INJECTION COMMANDS in a URI. NOTE: The maintainer and multiple third parties believe that this vulnerability isn't valid because users shouldn't use untrusted templates without sandboxing. | >=0
setuptools (transitive) | Affected versions of Setuptools allow for remote code execution via its download functions. These functions, which are used to download packages from URLs provided by users or retrieved from package index servers, are susceptible to code injection. If these functions are exposed to user-controlled inputs, such as package URLs, they can execute arbitrary commands on the system. | <70.0.0
sqlalchemy | Sqlalchemy 2.0.0b1 avoids leaking cleartext passwords to the open for careless uses of str(engine.URL()) in logs and prints. sqlalchemy/sqlalchemy#8563 | <2.0.0b1

Outdated dependencies

60 outdated dependencies found (including 22 outdated major versions)😢
Dependency Installed version Latest version
alabaster (transitive) 0.7.16 1.0.0
aldjemy 2.6 3.0
argon2-cffi (transitive) 21.3.0 23.1.0
async-timeout (transitive) 4.0.3 5.0.1
billiard (transitive) 3.6.4.0 4.2.1
crispy-bootstrap4 (transitive) 2022.1 2024.10
django 3.2.25 5.1.3
django-model-utils (transitive) 4.3.1 5.0.0
django-rest-knox (transitive) 4.2.0 5.0.2
django-sodar-core 0.13.4 1.0.2
et-xmlfile (dev,transitive) 1.1.0 2.0.0
markupsafe (transitive) 2.1.5 3.0.2
mistune (transitive) 2.0.5 3.0.2
packaging (transitive) 23.2 24.1
pillow (transitive) 10.4.0 11.0.0
protobuf 3.20.3 5.28.3
setuptools (transitive) 67.6.1 75.3.0
sphinx (transitive) 6.2.1 8.1.3
sphinx-rtd-theme (transitive) 1.2.2 3.0.1
sqlalchemy 1.4.54 2.0.36
unidecode (transitive) 0.4.21 1.3.8
xmlschema (transitive) 2.5.1 3.4.3
Dependency Installed version Latest version
botocore (transitive) 1.35.36 1.35.55
celery (transitive) 5.2.7 5.4.0
charset-normalizer (transitive) 3.3.2 3.4.0
coverage (dev,transitive) 7.6.1 7.6.4
cryptography (transitive) 43.0.1 43.0.3
django-autocomplete-light (transitive) 3.9.4 3.11.0
django-crispy-forms (transitive) 2.0 2.3
django-db-file-storage (transitive) 0.5.5 0.5.6.1
django-debug-toolbar 4.3.0 4.4.6
django-environ (transitive) 0.10.0 0.11.2
django-iconify (transitive) 0.1.1 0.4
django-plugins-bihealth 0.4.0 0.5.2
django-postgres-copy 2.3.7 2.7.6
djangorestframework (transitive) 3.14.0 3.15.2
docutils (transitive) 0.18.1 0.21.2
drf-keyed-list-bihealth 0.1.1 0.2.1
drf-spectacular-sidecar (transitive) 2024.7.1 2024.11.1
drf-writable-nested 0.7.0 0.7.1
elementpath (transitive) 4.5.0 4.6.0
frozenlist (transitive) 1.4.1 1.5.0
ipython (dev,transitive) 8.28.0 8.29.0
markdown (transitive) 3.4.1 3.7
mypy-protobuf (dev) 3.3.0 3.6.0
psycopg2 (transitive) 2.9.9 2.9.10
psycopg2-binary (transitive) 2.9.9 2.9.10
pydantic-core (transitive) 2.23.4 2.26.0
regex (transitive) 2024.9.11 2024.11.6
requests-http-signature 0.2.0 0.7.1
rpds-py (transitive) 0.20.0 0.21.0
rules (transitive) 3.3 3.5
sentry-sdk 2.17.0 2.18.0
tqdm 4.66.5 4.67.0
types-protobuf (dev) 5.28.0.20240924 5.28.3.20241030
types-psycopg2 (dev,transitive) 2.9.21.20240819 2.9.21.20241019
versioneer (transitive) 0.28 0.29
werkzeug (dev) 3.1.1 3.1.2
wheel (transitive) 0.40.0 0.44.0
yarl (transitive) 1.15.5 1.17.1

@stolpeo stolpeo force-pushed the 2100-filter-for-hemizygous-frequencies-does-not-work branch from 3c4dfc0 to f36aea7 Compare November 7, 2024 15:35
@stolpeo stolpeo enabled auto-merge (squash) November 7, 2024 15:37
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (2)
backend/variants/query_schemas.py (2)

Line range hint 450-453: Fix incorrect sample comparison in error handling.

There's a bug in the sample comparison logic. The second condition compares case_samples with quality_samples again instead of genotype_samples, so a mismatch between case and genotype samples is never detected even though the error message refers to genotype samples.

Apply this fix:

-        if case_samples != quality_samples:
+        if case_samples != genotype_samples:
             raise ValueError(
                 f"Case and genotype samples are not equal: {case_samples} vs {genotype_samples}"
             )

Line range hint 444-453: Consider refactoring sample validation logic.

The sample validation logic is duplicated and could be extracted into a helper method to improve maintainability and reduce the chance of errors like the one found above.

Consider refactoring like this:

+    def _validate_samples(self, case_samples: set, target_samples: set, sample_type: str) -> None:
+        if case_samples != target_samples:
+            raise ValueError(
+                f"Case and {sample_type} samples are not equal: {case_samples} vs {target_samples}"
+            )
+
     # In the convert method:
-        if case_samples != quality_samples:
-            raise ValueError(
-                f"Case and quality samples are not equal: {case_samples} vs {quality_samples}"
-            )
-        if case_samples != genotype_samples:
-            raise ValueError(
-                f"Case and genotype samples are not equal: {case_samples} vs {genotype_samples}"
-            )
+        self._validate_samples(case_samples, quality_samples, "quality")
+        self._validate_samples(case_samples, genotype_samples, "genotype")
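
The refactoring suggested above can be tried outside the class as a small, self-contained sketch. The module-level functions `validate_samples` and `check_samples` are hypothetical stand-ins for the proposed `_validate_samples` method and the checks in the convert method; sorting the sets in the message is an added assumption to make the output deterministic.

```python
def validate_samples(case_samples: set, target_samples: set, sample_type: str) -> None:
    """Raise ValueError if the two sample sets differ."""
    if case_samples != target_samples:
        raise ValueError(
            f"Case and {sample_type} samples are not equal: "
            f"{sorted(case_samples)} vs {sorted(target_samples)}"
        )


def check_samples(case_samples: set, quality_samples: set, genotype_samples: set) -> None:
    """Run both comparisons; each call names the set it actually compares."""
    validate_samples(case_samples, quality_samples, "quality")
    validate_samples(case_samples, genotype_samples, "genotype")
```

Because the compared set and the label are passed together in one call, the copy-paste mistake of checking quality_samples twice becomes harder to make.
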
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between b3045b7 and f36aea7.

📒 Files selected for processing (2)
  • backend/variants/query_schemas.py (1 hunks)
  • backend/variants/tests/data/query_settings.py (1 hunks)
🧰 Additional context used
🪛 Ruff
backend/variants/tests/data/query_settings.py

2-2: SyntaxError: Simple statements must be separated by newlines or semicolons


3-3: SyntaxError: Simple statements must be separated by newlines or semicolons

🔇 Additional comments (1)
backend/variants/query_schemas.py (1)

317-317: LGTM: Hemizygous frequency fields added consistently.

The new hemizygous frequency fields have been added following the same pattern as the existing heterozygous and homozygous fields, maintaining consistency across all database types (ExAC, Thousand Genomes, gnomAD exomes, and gnomAD genomes).

Also applies to: 322-322, 327-327, 332-332

@stolpeo stolpeo force-pushed the 2100-filter-for-hemizygous-frequencies-does-not-work branch from f36aea7 to bfd68bf Compare November 7, 2024 16:00
@stolpeo stolpeo merged commit 05c1887 into main Nov 7, 2024
16 checks passed
@stolpeo stolpeo deleted the 2100-filter-for-hemizygous-frequencies-does-not-work branch November 7, 2024 16:14

codecov bot commented Nov 7, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 91%. Comparing base (b3045b7) to head (bfd68bf).
Report is 1 commit behind head on main.

Additional details and impacted files
@@          Coverage Diff          @@
##            main   #2101   +/-   ##
=====================================
  Coverage     91%     91%           
=====================================
  Files        678     678           
  Lines      38537   38537           
=====================================
  Hits       35140   35140           
  Misses      3397    3397           
Files with missing lines Coverage Δ
backend/variants/query_schemas.py 88% <ø> (ø)
backend/variants/tests/data/query_settings.py 100% <ø> (ø)

Development

Successfully merging this pull request may close these issues.

Filter for hemizygous frequencies does not work
1 participant