Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

update bwa-mem2 adding new platform #51942

Closed
wants to merge 13 commits into from

Conversation

pabloaledo
Copy link
Contributor

Describe your pull request here


Please read the guidelines for Bioconda recipes before opening a pull request (PR).

General instructions

  • If this PR adds or updates a recipe, use "Add" or "Update" appropriately as the first word in its title.
  • New recipes not directly relevant to the biological sciences need to be submitted to the conda-forge channel instead of Bioconda.
  • PRs require reviews prior to being merged. Once your PR is passing tests and ready to be merged, please issue the @BiocondaBot please add label command.
  • Please post questions on Gitter or ping @bioconda/core in a comment.

Instructions for avoiding API, ABI, and CLI breakage issues

Conda is able to record and lock (a.k.a. pin) dependency versions used at build time of other recipes.
This way, one can avoid that expectations of a downstream recipe with regards to API, ABI, or CLI are violated by later changes in the recipe.
If not already present in the meta.yaml, make sure to specify run_exports (see here for the rationale and comprehensive explanation).
Add a run_exports section like this:

build:
  run_exports:
    - ...

with ... being one of:

Case run_exports statement
semantic versioning {{ pin_subpackage("myrecipe", max_pin="x") }}
semantic versioning (0.x.x) {{ pin_subpackage("myrecipe", max_pin="x.x") }}
known breakage in minor versions {{ pin_subpackage("myrecipe", max_pin="x.x") }} (in such a case, please add a note that shortly mentions your evidence for that)
known breakage in patch versions {{ pin_subpackage("myrecipe", max_pin="x.x.x") }} (in such a case, please add a note that shortly mentions your evidence for that)
calendar versioning {{ pin_subpackage("myrecipe", max_pin=None) }}

while replacing "myrecipe" with either name if a name|lower variable is defined in your recipe or with the lowercase name of the package in quotes.

Bot commands for PR management

Please use the following BiocondaBot commands:

Everyone has access to the following BiocondaBot commands, which can be given in a comment:

@BiocondaBot please update Merge the master branch into a PR.
@BiocondaBot please add label Add the please review & merge label.
@BiocondaBot please fetch artifacts Post links to CI-built packages/containers.
You can use this to test packages locally.

Note that the @BiocondaBot please merge command is now depreciated. Please just squash and merge instead.

Also, the bot watches for comments from non-members that include @bioconda/<team> and will automatically re-post them to notify the addressed <team>.

Copy link
Contributor

coderabbitai bot commented Nov 5, 2024

📝 Walkthrough
📝 Walkthrough

Walkthrough

The pull request introduces modifications to the bwa-mem2 package to enhance support for the aarch64 architecture while maintaining compatibility with x86_64. The build.sh script is updated to include architecture-specific instructions, retaining the original macOS logic while adding a case statement for handling different architectures. For aarch64, the script creates a directory for the SIMDe library, downloads it, and adjusts the build command accordingly. The meta.yaml file reflects a version update and introduces a new patch specifically for aarch64, alongside a new run_exports section. The simde.patch file incorporates changes to utilize the SIMDe library for SIMD operations, modifying various source files to ensure compatibility across architectures. The overall changes represent a significant shift towards improved architecture support and code portability.

Possibly related PRs

Suggested labels

aarch64, osx-arm64

Suggested reviewers

  • martin-g
  • adamhospital

📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between a211216 and ca889a4.

📒 Files selected for processing (2)
  • recipes/bwa-mem2/build.sh (1 hunks)
  • recipes/bwa-mem2/meta.yaml (3 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • recipes/bwa-mem2/build.sh
  • recipes/bwa-mem2/meta.yaml

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Outside diff range and nitpick comments (1)
recipes/bwa-mem2/meta.yaml (1)

15-17: Good addition of run_exports for ABI stability

The addition of run_exports with appropriate version pinning is crucial for maintaining ABI compatibility. This ensures that downstream packages will use a compatible version of bwa-mem2.

Consider documenting the ABI compatibility guarantees in the package documentation to help downstream maintainers understand the versioning constraints.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 2582a1d and 573c570.

📒 Files selected for processing (3)
  • recipes/bwa-mem2/build.sh (1 hunks)
  • recipes/bwa-mem2/meta.yaml (3 hunks)
  • recipes/bwa-mem2/simde.patch (1 hunks)
🔇 Additional comments (14)
recipes/bwa-mem2/build.sh (1)

15-15: Verify different make targets for x86_64 and aarch64.

The build commands use different make targets:

  • x86_64: make multi
  • aarch64: make all

This could lead to inconsistencies in build outputs across architectures.

Also applies to: 20-20

✅ Verification successful

Based on the comprehensive review of the build.sh, meta.yaml, and patches, I can now provide a final response:

Different make targets for x86_64 and aarch64 are intentional and correct

The different make targets are by design to handle architecture-specific optimizations:

  • x86_64: Uses make multi to build multiple binaries optimized for different x86_64 instruction sets (SSE4.1, SSE4.2, AVX, AVX2, AVX512BW)
  • aarch64: Uses make all with SIMDE (SIMD Everywhere) library to provide SIMD support through emulation, producing a single binary

This is verified by:

  1. meta.yaml showing different binaries tested only for x86_64
  2. simde.patch showing ARM-specific adaptations
  3. build.sh explicitly handling the architecture differences
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check Makefile targets to understand differences between 'multi' and 'all'
echo "Checking Makefile targets..."
if [ -f "Makefile" ]; then
    echo "=== multi target ==="
    grep -A 5 "^multi:" Makefile
    echo "=== all target ==="
    grep -A 5 "^all:" Makefile
else
    echo "Makefile not found in current directory"
fi

Length of output: 177


Script:

#!/bin/bash
# Let's first find the Makefile in the repository
fd Makefile

# Then check the content of any Makefiles found
echo "=== Checking Makefiles content ==="
fd Makefile --exec cat {}

Length of output: 163384


Script:

#!/bin/bash
# Let's find the actual Makefile for bwa-mem2 in the repository
fd -t f Makefile recipes/bwa-mem2/

# And check its content
echo "=== Checking bwa-mem2 Makefile content ==="
fd -t f Makefile recipes/bwa-mem2/ --exec cat {}

Length of output: 967


Script:

#!/bin/bash
# Let's check the build.sh file to understand the make targets
echo "=== Checking build.sh content ==="
cat recipes/bwa-mem2/build.sh

# Also check if there are any other relevant files that might explain the architecture-specific builds
echo -e "\n=== Checking for architecture-specific files ==="
fd . recipes/bwa-mem2/ -x grep -l "aarch64\|x86_64" {}

Length of output: 1390


Script:

#!/bin/bash
# Let's check the meta.yaml and simde.patch to understand the architecture-specific differences
echo "=== Checking meta.yaml content ==="
cat recipes/bwa-mem2/meta.yaml

echo -e "\n=== Checking simde.patch content ==="
cat recipes/bwa-mem2/simde.patch

Length of output: 9459

recipes/bwa-mem2/meta.yaml (3)

11-12: LGTM: Appropriate architecture-specific patch selection

The conditional patch selection using conda selectors is well-implemented. Using SIMDe for aarch64 is a good approach for SIMD instruction portability.


43-44: LGTM: Proper platform declaration

The addition of linux-aarch64 to additional-platforms correctly declares the new platform support.


30-34: 🛠️ Refactor suggestion

Consider adding aarch64-specific test commands

While correctly excluding x86 SIMD variant tests for aarch64, there should be equivalent tests to verify the aarch64 build functionality.

Consider adding aarch64-specific test commands to verify the build works correctly on the target architecture. For example:

 - bwa-mem2.sse41 version # [ not aarch64 ]
 - bwa-mem2.sse42 version # [ not aarch64 ]
 - bwa-mem2.avx version # [ not aarch64 ]
 - bwa-mem2.avx2 version # [ not aarch64 ]
 - bwa-mem2.avx512bw version # [ not aarch64 ]
+- bwa-mem2 mem 2>&1 | grep "Version" # [ aarch64 ]
+- bwa-mem2 index -h 2>&1 | grep "Usage" # [ aarch64 ]
recipes/bwa-mem2/simde.patch (10)

9-10: Update INCLUDES to include SIMDe headers

The addition of -Iext to the INCLUDES variable ensures that the compiler can locate the SIMDe headers in the ext directory. This change is necessary for the successful inclusion of SIMDe library files.


23-25: Replace x86-specific intrinsics with SIMDe equivalents

By defining SIMDE_ENABLE_NATIVE_ALIASES and replacing #include <immintrin.h> with #include <simde/x86/avx2.h>, the code now uses the SIMDe library to provide SIMD functionality. This enhances portability across different architectures, including those that do not support x86-specific intrinsics.


57-69: Add compatibility for non-SSE architectures

Defining _mm_malloc, _mm_free, and _MM_HINT_NTA when __SSE__ is not defined ensures memory allocation functions are available on architectures without SSE support. Including <simde/x86/sse4.1.h> replaces the x86-specific header, promoting cross-platform compatibility.


112-130: Restrict CPUID and hyper-threading checks to x86 architectures

Modifying the conditional compilation to include checks for __x86_64__ or __i386__ ensures that CPUID instructions and hyper-threading status checks are only executed on compatible architectures. This prevents build errors on non-x86 platforms.


155-156: Include SIMDe header for SSE2 intrinsics in ksw.h

By defining SIMDE_ENABLE_NATIVE_ALIASES and including <simde/x86/sse2.h>, the code replaces the direct inclusion of <emmintrin.h>. This change supports the use of SSE2 intrinsics via SIMDe, improving portability.


169-170: Replace intrinsic headers with SIMDe in kswv.h

Including <simde/x86/avx2.h> and defining SIMDE_ENABLE_NATIVE_ALIASES replaces the direct inclusion of <immintrin.h>. This change enables the use of AVX2 intrinsics through SIMDe, aiding cross-platform compatibility.


181-184: Add execution mode messages for additional SIMD levels

The addition of checks for __SSE2__ and a fallback message for scalar execution enhances the diagnostic output, informing users about the execution mode. This is helpful for debugging and performance analysis.


197-201: Ensure compatibility in runsimd.cpp for 32-bit systems

By adding a condition #elif defined(__i386__), the code now correctly handles the case for 32-bit x86 systems, ensuring the __cpuidex function works as intended on these architectures.


215-248: Implement __rdtsc() for ARM architectures carefully

The introduction of __rdtsc() implementations for ARM architectures enhances portability. However, the use of assembly instructions and special registers requires careful consideration to ensure they function correctly across different ARM variants and operating systems.

Please verify the correctness of these implementations across all targeted ARM platforms. You can use the following script to check for potential issues:

#!/bin/bash
# Description: Verify the usage of __rdtsc() implementations for ARM architectures.

# Test: Search for all usages of __rdtsc() in the codebase.
# Expected result: Ensure that __rdtsc() is used appropriately.

rg '__rdtsc\(' src/

Additionally, confirm that the operating environments allow access to the necessary system registers without requiring special permissions.


36-45: Ensure removal of SSE2-specific code does not impact functionality

The removal of the custom _mm_blendv_epi8 implementation for pre-SSE4.1 processors suggests reliance on the SIMDe library for this functionality. Ensure that SIMDe adequately provides support for _mm_blendv_epi8 across all targeted architectures to prevent any regressions.

Please run the following script to confirm that _mm_blendv_epi8 is correctly handled by SIMDe:

✅ Verification successful

Based on the patch content, I can now generate the final response:

Removal of SSE2-specific code is properly handled by SIMDe integration

The patch shows a comprehensive transition to SIMDe with proper safeguards:

  • SIMDe is explicitly enabled with native aliases (SIMDE_ENABLE_NATIVE_ALIASES)
  • SSE4.1 header is replaced with SIMDe equivalent (simde/x86/sse4.1.h)
  • The removed _mm_blendv_epi8 implementation is safely replaced by SIMDe's implementation
  • Fallback paths are added for non-x86 architectures
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify that SIMDe provides `_mm_blendv_epi8` support.

# Test: Search for `_mm_blendv_epi8` in SIMDe headers.
# Expected result: Matching declarations indicating support.

fd -e h -e c -e cpp 'simde' | xargs rg '_mm_blendv_epi8'

Length of output: 160


Script:

#!/bin/bash
# Let's first check if SIMDe headers are present in the repository
fd simde

# Then let's check for any SSE-related headers or implementations
fd -e h -e c -e cpp | xargs rg -l "SSE|simde"

# Finally, let's look at the patch file itself to understand the context better
cat recipes/bwa-mem2/simde.patch

Length of output: 8166

Comment on lines 13 to 21
case ${ARCH} in
x86_64)
LIBS="${LDFLAGS}" make CC="${CC}" CXX="${CXX}" multi ;;
aarch64)
mkdir ext/simde
wget https://github.com/simd-everywhere/simde-no-tests/archive/refs/tags/v0.8.2.tar.gz -O - | tar -xvz
mv simde-no-tests-0.8.2/* ext/simde/
LIBS="${LDFLAGS}" make arch=native EXE=bwa-mem2 CC="${CC}" CXX="${CXX}" all ;;
esac
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Add a default case to handle unsupported architectures.

The case statement should include a default case to fail explicitly when building on unsupported architectures. This prevents silent failures and provides better error messages.

 case ${ARCH} in
   x86_64)
       LIBS="${LDFLAGS}" make CC="${CC}" CXX="${CXX}" multi ;;
   aarch64)
       mkdir ext/simde
       wget https://github.com/simd-everywhere/simde-no-tests/archive/refs/tags/v0.8.2.tar.gz -O - | tar -xvz
       mv simde-no-tests-0.8.2/* ext/simde/
       LIBS="${LDFLAGS}" make arch=native EXE=bwa-mem2 CC="${CC}" CXX="${CXX}" all ;;
+  *)
+      echo "Error: Architecture ${ARCH} is not supported"
+      exit 1 ;;
 esac
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
case ${ARCH} in
x86_64)
LIBS="${LDFLAGS}" make CC="${CC}" CXX="${CXX}" multi ;;
aarch64)
mkdir ext/simde
wget https://github.com/simd-everywhere/simde-no-tests/archive/refs/tags/v0.8.2.tar.gz -O - | tar -xvz
mv simde-no-tests-0.8.2/* ext/simde/
LIBS="${LDFLAGS}" make arch=native EXE=bwa-mem2 CC="${CC}" CXX="${CXX}" all ;;
esac
case ${ARCH} in
x86_64)
LIBS="${LDFLAGS}" make CC="${CC}" CXX="${CXX}" multi ;;
aarch64)
mkdir ext/simde
wget https://github.com/simd-everywhere/simde-no-tests/archive/refs/tags/v0.8.2.tar.gz -O - | tar -xvz
mv simde-no-tests-0.8.2/* ext/simde/
LIBS="${LDFLAGS}" make arch=native EXE=bwa-mem2 CC="${CC}" CXX="${CXX}" all ;;
*)
echo "Error: Architecture ${ARCH} is not supported"
exit 1 ;;
esac

Comment on lines 17 to 19
mkdir ext/simde
wget https://github.com/simd-everywhere/simde-no-tests/archive/refs/tags/v0.8.2.tar.gz -O - | tar -xvz
mv simde-no-tests-0.8.2/* ext/simde/
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Improve robustness of simde library installation.

The current implementation has several potential issues:

  1. No error handling for download and extraction operations
  2. Hard-coded version number
  3. No cleanup of temporary files
  4. No verification of downloaded content
+      SIMDE_VERSION="v0.8.2"
+      SIMDE_TARBALL="simde-${SIMDE_VERSION}.tar.gz"
       mkdir ext/simde
-      wget https://github.com/simd-everywhere/simde-no-tests/archive/refs/tags/v0.8.2.tar.gz -O - | tar -xvz
-      mv simde-no-tests-0.8.2/* ext/simde/
+      wget -O "${SIMDE_TARBALL}" "https://github.com/simd-everywhere/simde-no-tests/archive/refs/tags/${SIMDE_VERSION}.tar.gz" || exit 1
+      echo "Verifying download..."
+      if [ -f "${SIMDE_TARBALL}" ]; then
+          tar xf "${SIMDE_TARBALL}" || exit 1
+          mv "simde-no-tests-${SIMDE_VERSION#v}"/* ext/simde/ || exit 1
+          rm -rf "${SIMDE_TARBALL}" "simde-no-tests-${SIMDE_VERSION#v}"
+      else
+          echo "Failed to download simde library"
+          exit 1
+      fi
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
mkdir ext/simde
wget https://github.com/simd-everywhere/simde-no-tests/archive/refs/tags/v0.8.2.tar.gz -O - | tar -xvz
mv simde-no-tests-0.8.2/* ext/simde/
SIMDE_VERSION="v0.8.2"
SIMDE_TARBALL="simde-${SIMDE_VERSION}.tar.gz"
mkdir ext/simde
wget -O "${SIMDE_TARBALL}" "https://github.com/simd-everywhere/simde-no-tests/archive/refs/tags/${SIMDE_VERSION}.tar.gz" || exit 1
echo "Verifying download..."
if [ -f "${SIMDE_TARBALL}" ]; then
tar xf "${SIMDE_TARBALL}" || exit 1
mv "simde-no-tests-${SIMDE_VERSION#v}"/* ext/simde/ || exit 1
rm -rf "${SIMDE_TARBALL}" "simde-no-tests-${SIMDE_VERSION#v}"
else
echo "Failed to download simde library"
exit 1
fi

Comment on lines +98 to +100
-#if ((!__AVX512BW__) & (__AVX2__ | __SSE2__))
+#if (!__AVX512BW__)
for(i = 0; i <= MAX_SEQ_LEN16; i++) hist[i] = 0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Review the simplified preprocessor condition

The original condition checks for the absence of AVX512BW and the presence of AVX2 or SSE2. Changing it to #if (!__AVX512BW__) removes the checks for AVX2 and SSE2, which may lead to compilation issues or incorrect code execution on architectures lacking these instruction sets.

To maintain the intended functionality, consider revising the condition to include checks for AVX2 or SSE2 support. Here's a suggested change:

-#if (!__AVX512BW__)
+#if (!__AVX512BW__) && (defined(__AVX2__) || defined(__SSE2__))

This ensures that the code within the condition only compiles when AVX512BW is not available but AVX2 or SSE2 is present.

Committable suggestion skipped: line range outside the PR's diff.

Copy link
Contributor

coderabbitai bot commented Nov 5, 2024

📝 Walkthrough

Walkthrough

The pull request introduces significant modifications to the bwa-mem2 package to support architecture-specific builds for x86_64 and aarch64. The build.sh script has been updated to include a case statement that directs the build process based on the architecture. For x86_64, the existing build command remains unchanged, while for aarch64, new steps have been added to create a directory for the SIMDe library and download a specific version of it. The meta.yaml file reflects the updated package version to 2.2.1, increments the build number to 6, and includes a new patch for aarch64. It also updates build requirements and modifies the test section to exclude SIMD commands not applicable to aarch64. Additionally, the simde.patch file incorporates changes to utilize the SIMDe library for SIMD support, adjusting header inclusions and removing architecture-specific inline functions. Overall, these changes enhance compatibility and functionality for different architectures.

Possibly related PRs

Suggested labels

please review & merge, aarch64, osx-arm64

Suggested reviewers

  • druvus
  • martin-g

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Outside diff range and nitpick comments (1)
recipes/bwa-mem2/build.sh (1)

23-24: Improve reliability of the installation step.

The binary installation step could be more robust:

  1. Verify build success before installation
  2. Check for binary existence
  3. Be more specific about which files to copy

Consider this improvement:

-mkdir -p $PREFIX/bin
-cp bwa-mem2* $PREFIX/bin
+mkdir -p "${PREFIX}/bin"
+if [ -f bwa-mem2 ]; then
+    cp bwa-mem2 "${PREFIX}/bin/"
+    # Copy additional binaries if they exist
+    for binary in bwa-mem2.avx bwa-mem2.avx2 bwa-mem2.avx512bw; do
+        if [ -f "${binary}" ]; then
+            cp "${binary}" "${PREFIX}/bin/"
+        fi
+    done
+else
+    echo "Error: bwa-mem2 binary not found"
+    exit 1
+fi
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 2582a1d and dfb241f.

📒 Files selected for processing (3)
  • recipes/bwa-mem2/build.sh (1 hunks)
  • recipes/bwa-mem2/meta.yaml (3 hunks)
  • recipes/bwa-mem2/simde.patch (1 hunks)
🔇 Additional comments (21)
recipes/bwa-mem2/build.sh (1)

13-21: Verify architecture-specific build configurations.

Let's ensure the build configurations are correct for each architecture.

recipes/bwa-mem2/meta.yaml (5)

15-17: LGTM! Build configuration ensures ABI stability.

The addition of run_exports with appropriate pinning strategy helps prevent ABI breakage, which is crucial for bioinformatics tools. The build number increment reflects the significant changes.


31-35: LGTM! Architecture-specific test commands are properly configured.

The test commands correctly exclude x86_64-specific SIMD variants for aarch64 architecture, which is appropriate as these instruction sets are not available on ARM platforms.


44-45: LGTM! Platform support explicitly declared.

The addition of linux-aarch64 to additional-platforms properly declares the new platform support, which is the main objective of this PR.


11-12: LGTM! Architecture-specific patches are correctly configured.

The patch selectors are properly configured to be mutually exclusive based on architecture, which aligns with the goal of adding aarch64 support.

Let's verify the patch files exist:

✅ Verification successful

Patch files are present and correctly configured for architecture-specific builds

Both required patch files exist in the recipe directory:

  • recipes/bwa-mem2/Makefile.patch (for non-aarch64)
  • recipes/bwa-mem2/simde.patch (for aarch64)
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Verify the existence of both patch files
fd -t f "Makefile.patch|simde.patch" recipes/bwa-mem2/

Length of output: 117


22-22: Verify wget requirement usage in build script.

The addition of wget as a build requirement suggests it's used in the build process, likely for downloading the SIMDe library for aarch64 builds.

Let's verify wget usage in the build script:

✅ Verification successful

wget requirement is correctly added and used in build script

The verification confirms that wget is used in the build script to download the SIMDe library (v0.8.2) from GitHub, which is then extracted using tar. This matches the added build requirement.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check how wget is used in build.sh
rg "wget" recipes/bwa-mem2/build.sh

Length of output: 144

recipes/bwa-mem2/simde.patch (15)

9-10: Including the 'ext' directory for SIMDe headers

Adding -Iext to the INCLUDES variable ensures that the compiler can locate the SIMDe headers required for cross-platform SIMD support. This change is appropriate.


23-24: Replacing <immintrin.h> with SIMDe AVX2 headers

Defining SIMDE_ENABLE_NATIVE_ALIASES and including <simde/x86/avx2.h> correctly transition from architecture-specific intrinsics to the SIMDe library, enhancing portability.


39-45: Removing custom _mm_blendv_epi8 implementation

Eliminating the inline function for _mm_blendv_epi8 reduces redundancy, as SIMDe provides this functionality across architectures. This streamlines the codebase.


68-69: Updating SSE4.1 headers to use SIMDe

Replacing <smmintrin.h> with <simde/x86/sse4.1.h> utilizes SIMDe for SSE4.1 instructions, promoting compatibility across different platforms.


80-86: Defining SIMD macros for non-SSE architectures

By providing definitions for _mm_malloc, _mm_free, _MM_HINT_NTA, and _MM_HINT_T0 when __SSE__ is not defined, the code ensures safe memory operations on architectures without SSE support.


99-100: Simplifying preprocessor conditions for AVX512BW

Modifying the condition to #if (!__AVX512BW__) clarifies the compilation logic, ensuring correct code paths when AVX-512BW instructions are unavailable.


112-113: Restricting inline assembly to x86 architectures

Changing to #elif defined(__x86_64__) || defined(__i386__) prevents compilation of x86-specific assembly on unsupported architectures, avoiding potential build errors.


120-121: Adjusting HTStatus function for architecture compatibility

Adding an architecture check ensures that hyper-threading detection is performed only on x86 platforms, enhancing portability.


128-130: Providing default return for non-x86 architectures in HTStatus

Returning 0 for HTStatus on non-x86 architectures prevents undefined behavior, maintaining consistent function output across platforms.


135-142: Removing inclusion of <emmintrin.h> in ksw.cpp

Eliminating the direct inclusion of architecture-specific headers is appropriate when compiling with SIMDe, reducing dependencies on specific instruction sets.


154-156: Incorporating SIMDe SSE2 headers

Defining SIMDE_ENABLE_NATIVE_ALIASES and including <simde/x86/sse2.h> replaces architecture-specific SSE2 headers, enhancing cross-platform support.


168-170: Including SIMDe AVX2 headers in kswv.h

Transitioning to <simde/x86/avx2.h> ensures that AVX2 instructions are handled by SIMDe, increasing portability and simplifying architecture checks.


181-184: Enhancing execution mode output messages

Adding conditions for __SSE2__ and the default case provides clearer execution feedback, aiding in debugging and performance tuning across different platforms.


197-198: Specifying assembly code for 32-bit x86 architectures

Using #elif defined(__i386__) accurately targets 32-bit x86 systems, preventing assembly code from being incorrectly compiled on incompatible architectures.


209-248: Implementing __rdtsc for ARM architectures

Adding ARM-specific implementations of __rdtsc enhances performance measurement capabilities on ARM platforms, improving functionality and portability.

Comment on lines 13 to 21
case ${ARCH} in
x86_64)
LIBS="${LDFLAGS}" make CC="${CC}" CXX="${CXX}" multi ;;
aarch64)
mkdir ext/simde
wget https://github.com/simd-everywhere/simde-no-tests/archive/refs/tags/v0.8.2.tar.gz -O - | tar -xvz
mv simde-no-tests-0.8.2/* ext/simde/
LIBS="${LDFLAGS}" make arch=native EXE=bwa-mem2 CC="${CC}" CXX="${CXX}" all ;;
esac
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Add error handling and improve maintainability of the build process.

The architecture-specific build implementation needs improvements in reliability and maintainability:

  1. Add error handling for the download and extraction process
  2. Verify the integrity of the downloaded tarball
  3. Clean up temporary files
  4. Make the simde version configurable

Consider applying these improvements:

 case ${ARCH} in
   x86_64)
       LIBS="${LDFLAGS}" make CC="${CC}" CXX="${CXX}" multi ;;
   aarch64)
+      SIMDE_VERSION="v0.8.2"
+      SIMDE_SHA256="expected_sha256_hash_here"
       mkdir ext/simde
-      wget https://github.com/simd-everywhere/simde-no-tests/archive/refs/tags/v0.8.2.tar.gz -O - | tar -xvz
-      mv simde-no-tests-0.8.2/* ext/simde/
+      wget -O simde.tar.gz "https://github.com/simd-everywhere/simde-no-tests/archive/refs/tags/${SIMDE_VERSION}.tar.gz" || exit 1
+      echo "${SIMDE_SHA256}  simde.tar.gz" | sha256sum -c || exit 1
+      tar -xzf simde.tar.gz || exit 1
+      mv "simde-no-tests-${SIMDE_VERSION#v}"/* ext/simde/ || exit 1
+      rm -rf simde.tar.gz "simde-no-tests-${SIMDE_VERSION#v}"
       LIBS="${LDFLAGS}" make arch=native EXE=bwa-mem2 CC="${CC}" CXX="${CXX}" all ;;
 esac

Committable suggestion skipped: line range outside the PR's diff.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (1)
recipes/bwa-mem2/build.sh (1)

22-23: Improve installation robustness.

The current installation process could be enhanced:

  1. Verify successful build before installation
  2. Be more specific about which binaries to install instead of using wildcard
 mkdir -p $PREFIX/bin
-cp bwa-mem2* $PREFIX/bin
+# Verify build success
+if [ -f "bwa-mem2${ARCH_SUFFIX}" ]; then
+    cp bwa-mem2${ARCH_SUFFIX} $PREFIX/bin/bwa-mem2
+else
+    echo "Build failed: bwa-mem2 binary not found"
+    exit 1
+fi
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 573c570 and 1517710.

📒 Files selected for processing (2)
  • recipes/bwa-mem2/build.sh (1 hunks)
  • recipes/bwa-mem2/meta.yaml (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • recipes/bwa-mem2/meta.yaml
🔇 Additional comments (3)
recipes/bwa-mem2/build.sh (3)

14-15: LGTM! Proper use of build environment variables.

The x86_64 build correctly uses environment variables for compiler configuration.


12-13: Verify ARCH variable is properly set.

The script relies on the ${ARCH} variable for architecture-specific builds. Let's ensure it's properly set.

#!/bin/bash
# Description: Check if ARCH is properly set in the build environment
# Expected: Should find references to ARCH in conda build configuration

# Look for ARCH variable definition in build configuration
rg -l "ARCH.*=.*" .conda-recipe/ conda-build/ || echo "ARCH variable definition not found in common build config locations"

20-20: Verify build configuration differences between architectures.

The aarch64 build configuration differs from x86_64 in several ways:

  1. Uses arch=native which might limit portability
  2. Different executable name (.native suffix)
  3. Uses all target instead of multi

Please clarify if these differences are intentional and necessary.

@pabloaledo
Copy link
Contributor Author

@BiocondaBot please add label

@BiocondaBot BiocondaBot added the please review & merge set to ask for merge label Nov 12, 2024

build:
number: 5
skip: True # [osx]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to skip OSX? It worked before I think.

@pabloaledo
Copy link
Contributor Author

I've removed most of the changes and the OSX-64 compilation is still failing, so I really don't know how could it be working before. I don't have a machine in which I can debug the osx compiation, so is there anything I can do to move this forward?.

@dslarm
Copy link
Contributor

dslarm commented Dec 2, 2024

x86_64-apple-darwin13.4.0-clang++ -c -march=core2 -mtune=haswell -mssse3 -ftree-vectorize -fPIC -fPIE -fstack-protector-strong -O2 -pipe -stdlib=libc++ -fvisibility-inlines-hidden -fmessage-length=0 -isystem $PREFIX/include -fdebug-prefix-map=$SRC_DIR=/usr/local/src/conda/bwa-mem2-2.2.1 -fdebug-prefix-map=$PREFIX=/usr/local/src/conda-prefix -g -O3 -fpermissive -msse -msse2 -msse3 -mssse3 -msse4.1  -g -O3 -fpermissive -msse -msse2 -msse3 -mssse3 -msse4.1  -D_FORTIFY_SOURCE=2 -isystem $PREFIX/include -mmacosx-version-min=10.13 -DENABLE_PREFETCH -DV17=1 -DMATE_SORT=0 -DSAIS=1  -DENABLE_PREFETCH -DV17=1 -DMATE_SORT=0 -DSAIS=1  -Isrc -Iext/safestringlib/include src/fastmap.cpp -o src/fastmap.o

is failing with:

2024-12-02T06:49:40.3426040Z 06:49:35 �[32mBIOCONDA INFO�[0m (OUT) patching file Makefile�[0m
2024-12-02T06:49:40.3427010Z 06:49:35 �[32mBIOCONDA INFO�[0m (OUT) Hunk #1 succeeded at 43 with fuzz 2.�[0m
2024-12-02T06:49:40.3428070Z 06:49:39 �[32mBIOCONDA INFO�[0m (ERR) src/fastmap.cpp:31:10: fatal error: 'iostream' file not found�[0m
2024-12-02T06:49:40.3429020Z 06:49:39 �[32mBIOCONDA INFO�[0m (ERR) #include <iostream>�[0m
2024-12-02T06:49:40.3429890Z 06:49:39 �[32mBIOCONDA INFO�[0m (ERR)          ^~~~~~~~~~�[0m
2024-12-02T06:49:40.3430540Z 06:49:39 �[32mBIOCONDA INFO�[0m (ERR) 1 error generated.�[0m
2024-12-02T06:49:40.3431300Z 06:49:39 �[32mBIOCONDA INFO�[0m (ERR) make[1]: *** [Makefile:99: src/fastmap.o] Error 1�[0m
2024-12-02T06:49:40.3432240Z 06:49:39 �[32mBIOCONDA INFO�[0m (ERR) make: *** [Makefile:105: multi] Error 2�[0m

@pabloaledo - I've seen this error before. It's related to the -stdlib=libc++ argument, likely due to an Xcode change since the last time this package was built on OSX. I'll poke a bit more.

@martin-g
Copy link
Contributor

martin-g commented Dec 5, 2024

Done with #52551

@martin-g martin-g closed this Dec 5, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
please review & merge set to ask for merge
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants