Optimize resolve_dependencies for large workflows #126

Open
incorvia opened this issue Jan 15, 2025 · 0 comments · May be fixed by #127

incorvia commented Jan 15, 2025

Description

The current implementation of the resolve_dependencies method in the workflow is inefficient for workflows with a large number of jobs and dependencies. Specifically, the method uses find_job, which performs a linear search over the jobs array for every dependency. This results in a time complexity of O(n * d), where n is the number of jobs and d is the number of dependencies.

For workflows with thousands or hundreds of thousands of jobs and dependencies, this becomes a significant bottleneck, causing high execution times during workflow creation.


Steps to Reproduce

  1. Create a workflow with a large number of jobs (e.g., 100,000).
  2. Define dependencies between these jobs.
  3. Measure the time taken for the resolve_dependencies method to complete (a rough benchmark sketch follows this list).
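
For reference, the measurement could look roughly like this (LargeWorkflow, HeavyJob, and the run/after DSL are illustrative placeholders for however jobs and dependencies are declared in your workflow class):

require 'benchmark'

# Placeholder workflow: 100,000 jobs chained into a single dependency line.
class LargeWorkflow < Workflow
  def configure
    previous = run HeavyJob
    99_999.times { previous = run HeavyJob, after: previous }
  end
end

# Workflow creation is where resolve_dependencies runs, so timing creation
# captures the cost described above.
puts Benchmark.realtime { LargeWorkflow.create } # or however workflows are instantiated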

Impact

  • Performance degrades significantly as the number of jobs and dependencies increases.
  • For workflows with 100,000 jobs and a proportional number of dependencies, this can take tens of seconds or longer, impacting user experience and system throughput.
  • In fact, I was unable to get a workflow with 100k dependencies to resolve within the bounds of my patience!

Proposed Solution

To optimize the resolve_dependencies method, we introduce a precomputed lookup table (@job_lookup) stored in the workflow's state. This lookup table allows all calls to find_job to perform O(1) lookups, rather than performing O(n) linear searches through the jobs array. The original regex-based fallback logic is preserved.

Updated Implementation

def resolve_dependencies
  @dependencies.each do |dependency|
    from = find_job(dependency[:from])
    to   = find_job(dependency[:to])

    to.incoming << dependency[:from]
    from.outgoing << dependency[:to]
  end
end

# Precompute the lookup table
def build_job_lookup
  @job_lookup = {
    klass: jobs.each_with_object({}) { |job, lookup| lookup[job.klass.to_s] = job },
    name: jobs.each_with_object({}) { |job, lookup| lookup[job.name.to_s] = job }
  }
end

# Optimized find_job with regex-based fallback
def find_job(identifier)
  # Build the lookup table if it doesn't exist
  build_job_lookup unless @job_lookup

  # Use regex to determine which lookup to use
  match_data = /(?<klass>\w*[^-])-(?<identifier>.*)/.match(identifier.to_s)

  if match_data.nil?
    # Lookup by klass if the pattern doesn't match
    @job_lookup[:klass][identifier.to_s]
  else
    # Lookup by name if the pattern matches
    @job_lookup[:name][identifier.to_s]
  end
end
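
For illustration, with a hypothetical workflow containing a job of class JobA whose generated name is "JobA-123", the two lookup paths behave like this:

workflow.find_job("JobA")           # no dash  -> klass lookup -> the JobA job
workflow.find_job("JobA-123")       # has dash -> name lookup  -> the same job
workflow.find_job("NonexistentJob") # unknown identifier       -> nil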

Changes

  1. Precomputed Lookup Table:

    • The build_job_lookup method creates a hash (@job_lookup) with two sub-hashes:
      • :klass: Maps job.klass.to_s to the job object.
      • :name: Maps job.name.to_s to the job object.
    • The lookup table is built on-demand and ensures O(1) lookups.
  2. Regex-Based Fallback:

    • The logic in find_job uses a regex to determine whether to look up the job by name or klass, preserving the behavior of the original implementation.
  3. Automatic Rebuild:

    • The lookup table is rebuilt if it doesn’t exist (@job_lookup is nil).
    • To ensure consistency, the lookup table should be invalidated whenever jobs is modified (e.g., by setting @job_lookup = nil); a sketch follows this list.
  4. Improved Performance:

    • Eliminates linear searches over the jobs array in find_job.
    • Reduces the time complexity of resolve_dependencies to O(d), plus a one-time O(n) pass to build the lookup table.
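
As a sketch of the invalidation mentioned in point 3 (add_job here is hypothetical; any code path that mutates jobs would need the same treatment):

def add_job(job)
  jobs << job
  @job_lookup = nil # dropped so the next find_job call rebuilds the lookup
end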

Performance Comparison

| Workflow Size | Old Time Complexity | Old Execution Time | New Time Complexity | New Execution Time |
| ------------- | ------------------- | ------------------ | ------------------- | ------------------ |
| 1,000 Jobs    | O(n * d)            | ~1 second          | O(d)                | <0.1 second        |
| 10,000 Jobs   | O(n * d)            | ~10 seconds        | O(d)                | ~1 second          |
| 100,000 Jobs  | O(n * d)            | Couldn’t measure   | O(d)                | ~20 seconds        |

In testing, workflows with 100,000 jobs and dependencies were reduced from unmanageable execution times to approximately 20 seconds.


Testing

  1. Functional Tests:

    • Verify that find_job continues to resolve jobs correctly for both klass and name.
  2. Performance Tests:

    • Compare the execution time of resolve_dependencies before and after the optimization for large workflows.
  3. Edge Cases:

    • Test scenarios where (see the sketch after this list):
      • A dependency matches klass (e.g., from: "JobA").
      • A dependency matches name (e.g., from: "JobA-123").
      • A dependency is invalid (e.g., from: "NonexistentJob").
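
A rough RSpec sketch of these cases (TestWorkflow and its job setup are illustrative, not an existing fixture):

RSpec.describe '#find_job' do
  let(:workflow) { TestWorkflow.create } # assumed to contain a JobA job

  it 'resolves a dependency that matches a klass' do
    expect(workflow.find_job('JobA').klass.to_s).to eq('JobA')
  end

  it 'resolves a dependency that matches a name' do
    job = workflow.jobs.first
    expect(workflow.find_job(job.name)).to eq(job)
  end

  it 'returns nil for an invalid dependency' do
    expect(workflow.find_job('NonexistentJob')).to be_nil
  end
end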

Advantages

  1. Backward Compatibility:

    • The functionality remains identical to the original implementation.
    • No breaking changes are introduced.
  2. Consistent Performance:

    • find_job now consistently performs O(1) lookups, even when called multiple times.
  3. Scalability:

    • Handles workflows with 100,000+ jobs and dependencies efficiently.
  4. Simplified Logic:

    • Centralized lookup logic in find_job ensures all calls benefit from the optimization.

Conclusion

This optimization drastically improves the performance of resolve_dependencies for large workflows, while preserving the exact behavior of the original implementation. It should be a backward-compatible change with no impact on functionality, only improving execution time.

Would you be open to a PR implementing this change? I’d be happy to contribute! 🚀
