Minor changes to improve compilation speed #137

Open: 5 commits to be merged into base branch master

Commits on Feb 25, 2024

  1. 4f7d6d8
  2. b63550c
  3. Optimize compilation time of parsing functions on Linux

    Iterator combinators have a noticeable compile-time overhead
    because they must be monomorphized, which passes more code to
    LLVM. That code does get optimized out, but doing so takes time
    and slows down the overall build.
    Aeledfyr committed Feb 25, 2024
    ff5dbc9
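To illustrate the trade-off described in this commit message, here is a minimal sketch of the same lookup written both ways. It is not the code from the PR: the function names and the `key: value` line format are assumptions chosen for illustration. The combinator version instantiates several generic adapter types and closures, while the loop version typically gives LLVM less to process.

```rust
/// Combinator style: `filter_map`, `find`, and `map` are generic adapters,
/// and each closure is its own type, so all of them are monomorphized here.
fn find_field_combinators<'a>(text: &'a str, key: &str) -> Option<&'a str> {
    text.lines()
        .filter_map(|line| line.split_once(':'))
        .find(|(k, _)| k.trim() == key)
        .map(|(_, v)| v.trim())
}

/// Loop style: same behavior, written with plain control flow so that
/// fewer generic instantiations are needed.
fn find_field_loop<'a>(text: &'a str, key: &str) -> Option<&'a str> {
    for line in text.lines() {
        if let Some((k, v)) = line.split_once(':') {
            if k.trim() == key {
                return Some(v.trim());
            }
        }
    }
    None
}
```

Both functions return the trimmed value of the first line whose key matches, so one can replace the other without changing callers. Whether the difference is measurable depends on the crate; a tool such as `cargo llvm-lines` can confirm it.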
  4. Prevent debug macro from generating format_args when not in use

    I'm not sure how much compile-time impact this has, but it
    prevents the compiler from having to generate formatting code when
    the debug macro is not in use.
    Aeledfyr committed Feb 25, 2024
    5c6c743
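One way to get the effect described in this commit message is to make the macro expand to an empty block when logging is off, so that no `format_args!` call is ever emitted. The sketch below is an assumption about the general technique, not the PR's actual change: the `debug-log` feature name and the `parse_core_id` helper are hypothetical.

```rust
// Hypothetical feature name; the real crate may use a different switch.
#[cfg(feature = "debug-log")]
macro_rules! debug {
    // Logging enabled: forward everything to `eprintln!`.
    ($($args:tt)*) => { eprintln!($($args)*) };
}

#[cfg(not(feature = "debug-log"))]
macro_rules! debug {
    // Logging disabled: expand to an empty block, so the arguments are
    // never turned into formatting code.
    ($($args:tt)*) => {{}};
}

/// Hypothetical helper that parses a line such as "core id : 3".
fn parse_core_id(line: &str) -> Option<u32> {
    let (key, value) = line.split_once(':')?;
    debug!("key = {:?}, value = {:?}", key, value);
    if key.trim() != "core id" {
        return None;
    }
    value.trim().parse().ok()
}
```

A side effect of this approach is that the macro arguments are no longer type-checked when logging is disabled, which is the price of not generating the formatting code at all.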
  5. Use Vec instead of HashMap when counting physical cores

    The current implementation uses a HashMap to deduplicate the output
    from each core of the same CPU. This commit instead collects the
    output for each core into a Vec, and then sorts it to deduplicate
    physical CPUs.
    
    This reduces the code size processed by LLVM by 15-20%, as counted
    by `cargo llvm-lines --lib -p num_cpus` on both debug and release.
    
    These implementations have different performance characteristics:
    - the HashMap must hash each key, and SipHash is slow on small keys
    - the number of cores will be small (<1024) so sorting the list
      should be very fast
    - the list will likely already be sorted
    
    I have not formally benchmarked this code, but informal testing
    against randomized lists suggests it is about the same speed or
    slightly faster.
    Aeledfyr committed Feb 25, 2024
    a0fbf43
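The sort-then-deduplicate approach from the last commit message can be sketched as follows. This is an illustration under stated assumptions rather than the PR's code: it assumes each logical CPU reports a `(physical id, core id)` pair, and the function names are hypothetical. A hash-based version (using `HashSet` as a stand-in for the original `HashMap`) is included only for comparison.

```rust
use std::collections::HashSet;

/// Sort-and-dedup version: after sorting, duplicates are adjacent, so
/// `dedup` removes them in a single pass.
fn count_physical_cores_sorted(mut cores: Vec<(u32, u32)>) -> usize {
    cores.sort_unstable();
    cores.dedup();
    cores.len()
}

/// Hash-based version for comparison: every pair is hashed on insertion.
fn count_physical_cores_hashed(cores: &[(u32, u32)]) -> usize {
    cores.iter().collect::<HashSet<_>>().len()
}
```

For example, two sockets with two cores each and two hardware threads per core yield eight input pairs and a count of four from either function. Because the input is usually already in order, the sort has little work to do.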