Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add wasmtime based optimizer #146

Merged
merged 2 commits into from
Nov 8, 2024
Merged

feat: add wasmtime based optimizer #146

merged 2 commits into from
Nov 8, 2024

Conversation

sd2k
Copy link
Collaborator

@sd2k sd2k commented Nov 7, 2024

This commit adds a new feature-flagged implementation of Optimizer,
which uses the wasmtime runtime to run the WASM-compiled Stan model.

Summary by CodeRabbit

Release Notes

  • New Features

    • Enhanced support for WebAssembly (WASM) functionalities in the augurs-prophet package.
    • Introduced new features for WASM, including wasmstan and wasmstan-min.
    • Added benchmarks for prophet-wasmstan-linear and prophet-wasmstan-logistic.
    • New WasmStanOptimizer for optimizing the Prophet model in a WASM environment.
  • Improvements

    • Updated the build process for better integration of WASM components.
    • Added a new public module wasmstan for users who enable the WASM feature.
    • Enhanced documentation to clarify usage and limitations of the Prophet model, including a new Caveats section.
  • Bug Fixes

    • Improved error handling and flexibility in the OptimizeCommand struct.
  • Tests

    • Added tests to validate the functionality of the WasmStanOptimizer.
  • Chores

    • Updated .gitignore to exclude prophet-wasmstan.wasm from version control.
    • Modified the justfile to enhance build and testing processes for the augurs-js package.
    • Updated GitHub Actions workflow for improved artifact management.

Copy link
Contributor

coderabbitai bot commented Nov 7, 2024

Walkthrough

The pull request introduces significant updates to the augurs-prophet package, enhancing its support for WebAssembly (WASM) functionalities. Key modifications include the addition of new dependencies, features, benchmarks, and tests in the Cargo.toml file. The build.rs file has been refactored for improved function clarity and flexibility. A new WIT specification for the Prophet model is defined in prophet-wasmstan.wit, while the wasmstan.rs file implements the WASM execution environment. Benchmarking tests are added to evaluate performance, and .gitignore files are updated to exclude specific build artifacts.

Changes

File Path Change Summary
crates/augurs-prophet/Cargo.toml - Added dependencies: wasmtime and wasmtime-wasi (both version 26, optional).
- Introduced features: wasmstan, wasmstan-min.
- Added benchmarks: prophet-wasmstan-linear, prophet-wasmstan-logistic.
- Added test: wasmstan.
crates/augurs-prophet/build.rs - Renamed compile_model to compile_cmdstan_model.
- Renamed fallback to create_empty_files, modified to accept a slice of string references.
- Replaced main with handle_cmdstan and added handle_wasmstan and copy_wasmstan.
crates/augurs-prophet/prophet-wasmstan.wit - Defined new interfaces and records for WASM functionality, including types and optimizer interface with the optimize method.
crates/augurs-prophet/src/cmdstan.rs - Updated OptimizeCommand implementation with a new lifetime parameter.
crates/augurs-prophet/src/lib.rs - Added new public module wasmstan under a feature flag.
crates/augurs-prophet/src/wasmstan.rs - Implemented WasmstanOptimizer and WasiState structs, added error handling and type conversions for WASM execution.
crates/augurs-prophet/tests/wasmstan.rs - Introduced a test function for WasmStanOptimizer, validating the model's functionality with training and prediction data.
components/cpp/prophet-wasmstan/.gitignore - Added prophet-wasmstan.wasm to ignore list.
crates/augurs-prophet/.gitignore - Added prophet-wasmstan.wasm to ignore list.
crates/augurs-prophet/benches/prophet-wasmstan-linear.rs - Added benchmarking functions for fitting and predicting with the Prophet model.
crates/augurs-prophet/benches/prophet-wasmstan-logistic.rs - Introduced a new benchmarking file for evaluating the fit and predict methods of the Prophet model.
justfile - Modified test-all target to exclude prophet-cmdstan.
- Added build-component target to build and copy the WASM file.

Possibly related PRs

🐰 In the land of code where rabbits play,
New features hop in, brightening the day.
With WASM support, our models now gleam,
Optimizing forecasts, fulfilling the dream.
Benchmarks and tests, all set to run,
In the world of augurs, we’ve only begun! 🌟


📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 86e9576 and 3a8fd7e.

📒 Files selected for processing (1)
  • crates/augurs-prophet/src/wasmstan.rs (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • crates/augurs-prophet/src/wasmstan.rs

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@sd2k sd2k marked this pull request as ready for review November 7, 2024 14:56
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🧹 Outside diff range and nitpick comments (17)
.github/workflows/release-plz.yml (1)

8-12: Consider adding workflow status check.

The workflow will run even if the upstream prophet-wasmstan workflow fails. Consider adding a condition to check the workflow status.

 on:
   workflow_run:
     workflows:
       - prophet-wasmstan
     types:
       - "completed"
+    if: ${{ github.event.workflow_run.conclusion == 'success' }}
     branches:
       - main
.github/workflows/wasmstan.yml (2)

32-35: Consider adding versioning to the artifact name.

The artifact name is currently static. Consider including version information or a unique identifier to distinguish between different builds.

     - uses: actions/upload-artifact@v4
       with:
-        name: prophet-wasmstan.wasm
+        name: prophet-wasmstan-${{ github.sha }}.wasm
         path: components/cpp/prophet-wasmstan.wasm

32-35: Consider adding artifact retention settings.

The upload-artifact action supports retention settings which can help manage storage usage.

     - uses: actions/upload-artifact@v4
       with:
         name: prophet-wasmstan.wasm
         path: components/cpp/prophet-wasmstan.wasm
+        retention-days: 90
crates/augurs-prophet/src/lib.rs (1)

16-17: LGTM! Consider adding module documentation.

The new wasmstan module follows the established pattern of feature-gated optimization backends. The implementation looks clean and well-organized.

Consider adding a doc comment before the module declaration to explain its purpose and usage, similar to the comment above the optimizer module:

 #[cfg(feature = "wasmstan")]
+/// WebAssembly-based Stan optimizer implementation.
+/// This module provides optimization capabilities through WebAssembly,
+/// enabling Stan models to run in WebAssembly environments.
 pub mod wasmstan;
crates/augurs-prophet/Cargo.toml (1)

53-57: Feature flags are well-structured but could use additional documentation.

The feature flags follow a consistent pattern with the existing cmdstan implementation and have correct dependency chains. However, consider adding documentation about the purpose and use cases of the wasmstan and compile-wasmstan features, similar to how internal-ignore-wasmstan-failure is documented.

Add documentation above the features:

+# Enable WebAssembly-based Stan model compilation and execution
compile-wasmstan = ["wasmstan"]
+# Core WebAssembly Stan integration with required dependencies
wasmstan = ["dep:serde_json", "dep:wasmtime", "dep:wasmtime-wasi", "serde"]

Also applies to: 59-60

crates/augurs-prophet/build.rs (3)

10-10: Function rename improves clarity.

The rename from compile_model to compile_cmdstan_model better reflects the specific purpose of this function.

Consider enhancing error messages to include the actual STAN_PATH value when validation fails:

-        .map_err(|_| "STAN_PATH not set")?
+        .map_err(|e| format!("STAN_PATH environment variable not set: {}", e))?
-        .map_err(|_| "invalid STAN_PATH")?;
+        .map_err(|e| format!("Invalid STAN_PATH value '{}': {}", std::env::var("STAN_PATH").unwrap_or_default(), e))?;

68-75: Function refactor improves reusability.

The rename and parameterization makes this function more flexible and its purpose clearer. Good error handling and logging.

Consider adding input validation to prevent empty slice and empty file names:

 fn create_empty_files(names: &[&str]) -> Result<(), Box<dyn std::error::Error>> {
+    if names.is_empty() {
+        return Err("No file names provided".into());
+    }
+    if names.iter().any(|name| name.is_empty()) {
+        return Err("Empty file name provided".into());
+    }
     let out_dir = std::path::PathBuf::from(std::env::var("OUT_DIR")?);

Line range hint 79-108: Consider extracting duplicated file names.

The file names ["prophet", "libtbb.so.12"] are duplicated in multiple places. This could lead to maintenance issues if the required files change.

Extract the file names to a constant:

+const REQUIRED_FILES: &[&str] = &["prophet", "libtbb.so.12"];
+
 fn handle_cmdstan() -> Result<(), Box<dyn std::error::Error>> {
     // ... existing code ...
     #[cfg(feature = "internal-ignore-cmdstan-failure")]
     if _result.is_err() {
-        create_empty_files(&["prophet", "libtbb.so.12"])?;
+        create_empty_files(REQUIRED_FILES)?;
     }
     #[cfg(not(feature = "internal-ignore-cmdstan-failure"))]
     if std::env::var("DOCS_RS").is_ok() {
-        create_empty_files(&["prophet", "libtbb.so.12"])?;
+        create_empty_files(REQUIRED_FILES)?;
     }
crates/augurs-prophet/src/optimizer.rs (1)

Line range hint 343-357: Add tests for mutable optimizer behavior.

While the serialization tests are comprehensive, we should add tests to verify the new mutable behavior of the optimizer. This is particularly important given the breaking change in the trait signature.

Consider adding tests like:

#[test]
fn test_optimizer_state_mutation() {
    let mut optimizer = MockOptimizer::new();
    
    // First optimization
    let result1 = optimizer.optimize(&initial_params1, &data1, &opts1)?;
    assert!(optimizer.call.is_some());
    
    // Second optimization
    let result2 = optimizer.optimize(&initial_params2, &data2, &opts2)?;
    // Verify state changes between optimizations
    assert_ne!(result1, result2);
}
crates/augurs-prophet/src/cmdstan.rs (2)

Line range hint 683-692: Consider adding debug logging for the command execution

For better debugging and observability, consider logging the command being executed before running it.

     ) -> Result<optimizer::OptimizedParams, optimizer::Error> {
+        tracing::debug!("Executing optimize command with options: {:?}", opts);
         OptimizeCommand {
             installation: &self.prophet_installation,
             init,

Line range hint 746-777: Consider adding tests for error cases

The current tests cover the happy path well, but consider adding tests for error scenarios such as:

  • Invalid output format
  • Missing required parameters
  • Process execution failures
crates/augurs-prophet/tests/wasmstan.rs (2)

9-31: Consider enhancing error handling and test documentation.

While the test structure is good, consider:

  1. Adding test documentation explaining the purpose and expected behavior
  2. Using more descriptive error handling instead of unwrap()
-#[test]
+/// Tests the WasmStanOptimizer by:
+/// 1. Training a Prophet model with known data
+/// 2. Making predictions
+/// 3. Validating predictions against expected values
+#[test]
 fn wasmstan() {
     tracing_subscriber::fmt::init();
     let opts = ProphetOptions::default();
-    let opt = WasmStanOptimizer::new().unwrap();
+    let opt = WasmStanOptimizer::new()
+        .expect("Failed to create WasmStanOptimizer");
     let mut prophet = Prophet::new(opts, opt);
     let training_data = TrainingData::new(TRAINING_DS.to_vec(), TRAINING_Y.to_vec())
-        .unwrap();
+        .expect("Failed to create training data");

33-303: Consider moving large test datasets to separate files.

The test data arrays are quite large and make the test file harder to read and maintain. Consider:

  1. Moving test data to separate files (e.g., JSON or CSV)
  2. Adding helper functions to load test data

Example implementation:

use std::fs;
use serde::Deserialize;

#[derive(Deserialize)]
struct TestData {
    training_ds: Vec<TimestampSeconds>,
    training_y: Vec<f64>,
    prediction_ds: Vec<TimestampSeconds>,
    expected: Vec<f64>,
}

fn load_test_data() -> TestData {
    let data = fs::read_to_string("tests/data/wasmstan_test_data.json")
        .expect("Failed to read test data");
    serde_json::from_str(&data)
        .expect("Failed to parse test data")
}
crates/augurs-prophet/src/wasmstan.rs (3)

17-39: Include error sources for better error context

In the Error enum, consider annotating the error variants with #[source] to automatically include the underlying cause in the error chain. This will aid in debugging by providing more detailed error information.

Apply the following changes:

 #[derive(Debug, thiserror::Error)]
 pub enum Error {
     /// An error occurred while compiling the WebAssembly component.
-    #[error("Error compiling component: {0}")]
-    Compilation(wasmtime::Error),
+    #[error("Error compiling component")]
+    #[source]
+    Compilation(#[from] wasmtime::Error),
     /// An error occurred while instantiating the WebAssembly component.
-    #[error("Error instantiating component: {0}")]
-    Instantiation(wasmtime::Error),
+    #[error("Error instantiating component")]
+    #[source]
+    Instantiation(#[from] wasmtime::Error),
     /// An error occurred in wasmtime while running the WebAssembly component.
-    #[error("Error running component: {0}")]
-    Runtime(wasmtime::Error),
+    #[error("Error running component")]
+    #[source]
+    Runtime(#[from] wasmtime::Error),
     /// An error occurred in Stan while running the optimization.
     #[error("Error running optimization: {0}")]
     Optimize(String),
     /// An invalid parameter value was received from the optimizer.
     #[error("Invalid value ({value}) for parameter {param} received from optimizer")]
     InvalidParam {
         /// The parameter name.
         param: String,
         /// The value received from the optimizer.
         value: f64,
     },
 }

92-101: Consider externalizing the WebAssembly binary path for flexibility

Currently, the path to the WebAssembly binary is hardcoded using include_bytes!, which embeds the binary at compile time. To increase flexibility and allow updates to the Wasm component without recompiling the entire application, consider loading the binary at runtime from a configurable path.


154-157: Handle serialization errors more specifically

When serializing data with serde_json::to_string, any serialization error is mapped directly to optimizer::Error::custom. Consider handling specific serialization errors to provide more informative feedback or retries in case of transient issues.

crates/augurs-js/src/prophet.rs (1)

Line range hint 853-869: Fix Incomplete Parsing of notes in ConvergenceLog::new

In the ConvergenceLog::new method, the notes field is assigned using split.next().unwrap_or_default(), which captures only the next word. If notes is intended to include the remainder of the line (potentially multiple words), this implementation will omit important information. Consider collecting all remaining tokens to accurately capture the full notes.

Apply the following diff to fix the parsing of notes:

     let evals = split.next()?.parse::<usize>().ok()?;
-    let notes = split.next().unwrap_or_default();
+    let notes = split.collect::<Vec<_>>().join(" ");
     Some(Self {
         iter,
         log_prob,
         dx,
         grad,
         alpha,
         alpha0,
         evals,
         notes,
     })
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 5b71bad and b187adb.

📒 Files selected for processing (12)
  • .github/workflows/release-plz.yml (1 hunks)
  • .github/workflows/rust.yml (2 hunks)
  • .github/workflows/wasmstan.yml (1 hunks)
  • crates/augurs-js/src/prophet.rs (1 hunks)
  • crates/augurs-prophet/Cargo.toml (3 hunks)
  • crates/augurs-prophet/build.rs (4 hunks)
  • crates/augurs-prophet/prophet-wasmstan.wit (1 hunks)
  • crates/augurs-prophet/src/cmdstan.rs (1 hunks)
  • crates/augurs-prophet/src/lib.rs (1 hunks)
  • crates/augurs-prophet/src/optimizer.rs (2 hunks)
  • crates/augurs-prophet/src/wasmstan.rs (1 hunks)
  • crates/augurs-prophet/tests/wasmstan.rs (1 hunks)
🔇 Additional comments (13)
.github/workflows/rust.yml (1)

31-34: ⚠️ Potential issue

Fix step ordering and add error handling for artifact download.

The artifact download step is placed before the repository checkout, which could lead to issues:

  1. The target directory might not exist before checkout
  2. The checkout might overwrite or conflict with the downloaded artifact
  3. No error handling if the artifact is missing

Move the download step after checkout and add directory creation:

      - name: Checkout sources
        uses: actions/checkout@v4

      - name: Install Rust toolchain
        uses: moonrepo/setup-rust@v1
        with:
          bins: cargo-nextest,just

+     - name: Create artifact directory
+       run: mkdir -p crates/augurs-prophet/
+
+     - name: Download WASM artifact
+       uses: actions/download-artifact@v4
+       with:
+         name: prophet-wasmstan.wasm
+         path: crates/augurs-prophet/prophet-wasmstan.wasm
+       continue-on-error: false

Let's verify the artifact path is consistent with the project structure:

crates/augurs-prophet/Cargo.toml (2)

25-29: Dependencies look good, verify wasmtime version compatibility.

The wasmtime dependencies are properly configured with essential features and marked as optional. The version selection (v26) and matching versions between wasmtime and wasmtime-wasi ensure compatibility.

Let's verify if this is the latest stable version:

#!/bin/bash
# Description: Check latest wasmtime crate version
cargo search wasmtime --limit 1

80-83: Consider requiring compile-wasmstan feature for tests.

The test configuration is properly structured, but if the tests involve model compilation, consider requiring the compile-wasmstan feature instead of just wasmstan.

Let's check if the tests require compilation:

If compilation is needed, update the test configuration:

[[test]]
name = "wasmstan"
-required-features = ["wasmstan"]
+required-features = ["wasmstan", "compile-wasmstan"]
crates/augurs-prophet/build.rs (1)

110-113: Clean and focused main function.

The main function is well-structured with proper error handling.

crates/augurs-prophet/src/optimizer.rs (1)

295-295: Breaking change: Document the mutability requirement.

The change from &self to &mut self is a breaking change that requires all implementors to be updated. Please update the trait documentation to explain why mutable access is required and what state modifications are expected during optimization.

Let's verify all implementations of the Optimizer trait have been updated:

Consider if this mutability is strictly necessary. If it's only needed for testing (mock implementations), consider alternative designs like:

  1. Using interior mutability with Arc<Mutex<_>> for test-only state
  2. Splitting the trait into immutable and mutable variants
✅ Verification successful

The mutability change is correctly implemented across all implementations

All implementations of the Optimizer trait have been properly updated to use &mut self:

  • CmdstanOptimizer in cmdstan.rs
  • WasmStanOptimizer in wasmstan.rs
  • MockOptimizer in optimizer.rs
  • JsOptimizer in augurs-js/src/prophet.rs

The trait documentation in optimizer.rs already focuses on the functional aspects of optimization rather than implementation details, so no additional documentation about mutability is required. The change appears to be a well-coordinated update across the codebase.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for all Optimizer trait implementations
ast-grep --pattern 'impl $_ for $type {
  $$$
  fn optimize(&self, $_, $_, $_) -> $_ {
    $$$
  }
  $$$
}'

Length of output: 109


Script:

#!/bin/bash
# Let's try a more specific search for Optimizer implementations
# First, let's find all implementations with &self
rg "impl.*Optimizer.*\{" -A 10

# Also search for the trait definition to understand the full context
rg "trait Optimizer" -A 10

Length of output: 9116

crates/augurs-prophet/src/cmdstan.rs (2)

Line range hint 1-683: Implementation demonstrates high quality

The code exhibits excellent qualities:

  • Comprehensive documentation
  • Strong error handling with detailed error types
  • Clear separation of concerns
  • Proper use of Rust idioms and best practices

683-683: Consider if &mut self is necessary

The optimize method has been changed to take &mut self, but the implementation doesn't appear to modify any internal state. Unless there are future plans to modify internal state, consider keeping it as &self to maintain the principle of least privilege.

✅ Verification successful

The &mut self is correctly used

Looking at the implementation, the optimize method creates and uses an OptimizeCommand struct that borrows fields from self. While there's no direct mutation of self, the method needs &mut self because it's part of the Optimizer trait implementation which requires mutable access, likely to maintain consistency across different optimizers that might need mutation.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check if there are any assignments to struct fields within the optimize method
ast-grep --pattern 'impl Optimizer for CmdstanOptimizer {
  $$$
  fn optimize(&mut self, $$$) {
    $$$
    self.$_ = $_
    $$$
  }
  $$$
}'

Length of output: 143


Script:

#!/bin/bash
# Let's check the entire optimize method implementation to understand its usage of self
rg -U "fn optimize.*?\{.*?\}" -A 100 crates/augurs-prophet/src/cmdstan.rs

Length of output: 75


Script:

#!/bin/bash
# Let's try a simpler pattern to find the optimize method
rg "fn optimize" -B 2 -A 50 crates/augurs-prophet/src/cmdstan.rs

Length of output: 1616

crates/augurs-prophet/tests/wasmstan.rs (2)

1-8: LGTM! Clean imports and good test setup.

The imports are well-organized and the test uses proper testing utilities including tracing for logging and assertion helpers.


33-2077: Verify the timestamp intervals in test data.

The test data uses timestamps with regular intervals. Let's verify the consistency of these intervals.

✅ Verification successful

Let me try a different approach to verify the timestamp intervals.


Let me try a simpler approach to verify the timestamp intervals.


Timestamp intervals are consistent with time-series test data

The analysis shows consistent intervals in both datasets:

  1. TRAINING_DS:
  • Primary interval: 1200 seconds (20 minutes) for most data points
  • A few transitions of 2400 seconds (40 minutes) indicating some gaps
  1. PREDICTION_DS:
  • Consistent 600 seconds (10 minutes) intervals
  • One 0-second interval at the end (likely duplicate of last timestamp)

The intervals are regular and appropriate for time-series forecasting tests, with training data using coarser granularity (20min) than prediction data (10min).

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check if timestamps have consistent intervals
# Expected: All intervals should be equal (except for one anomaly at line 243)

# Extract timestamps and check intervals
rg -A 1 'static TRAINING_DS|static PREDICTION_DS' crates/augurs-prophet/tests/wasmstan.rs | \
  grep -E '^[0-9]' | \
  awk 'NR>1 { print $1 - prev } { prev = $1 }' | \
  sort | uniq -c

Length of output: 174


Script:

#!/bin/bash
# Check intervals in TRAINING_DS
echo "=== TRAINING_DS intervals ==="
awk '
/static TRAINING_DS: &\[TimestampSeconds\] = &\[/,/\];/ {
    if ($0 ~ /^[0-9]/) {
        gsub(/,/, "");
        if (prev) {
            print $1 - prev;
        }
        prev = $1;
    }
}' crates/augurs-prophet/tests/wasmstan.rs | sort | uniq -c

# Check intervals in PREDICTION_DS
echo -e "\n=== PREDICTION_DS intervals ==="
awk '
/static PREDICTION_DS: &\[TimestampSeconds\] = &\[/,/\];/ {
    if ($0 ~ /^[0-9]/) {
        gsub(/,/, "");
        if (prev) {
            print $1 - prev;
        }
        prev = $1;
    }
}' crates/augurs-prophet/tests/wasmstan.rs | sort | uniq -c

Length of output: 665


Script:

#!/bin/bash

# Extract and analyze TRAINING_DS timestamps
echo "=== TRAINING_DS intervals ==="
sed -n '/static TRAINING_DS/,/];/p' crates/augurs-prophet/tests/wasmstan.rs | \
  grep -o '[0-9]\+' | \
  awk 'NR>1 {print $1 - prev} {prev=$1}' | \
  sort | uniq -c

echo -e "\n=== PREDICTION_DS intervals ==="
sed -n '/static PREDICTION_DS/,/];/p' crates/augurs-prophet/tests/wasmstan.rs | \
  grep -o '[0-9]\+' | \
  awk 'NR>1 {print $1 - prev} {prev=$1}' | \
  sort | uniq -c

Length of output: 509

crates/augurs-prophet/prophet-wasmstan.wit (2)

Line range hint 5-169: The interface and type definitions are well-structured and comprehensive

The WIT file effectively defines all the necessary types and interfaces for the prophet-wasmstan module. The use of lower kebab-case for WIT identifiers is consistent, and the detailed comments provide clear explanations aligning with the underlying Prophet Stan model definitions.


1-1: ⚠️ Potential issue

Verify the necessity of the relative path at the beginning of the file

The file starts with a relative path:

../../components/cpp/prophet-wasmstan/wit/prophet-wasmstan.wit

This line may not be required in the WIT file and could cause issues during compilation or usage. Please verify if this line should be included or if it should be removed.

Apply this diff to remove the line if it's unnecessary:

-../../components/cpp/prophet-wasmstan/wit/prophet-wasmstan.wit
crates/augurs-prophet/src/wasmstan.rs (1)

1-200: Solid implementation with thorough error handling

The WasmStanOptimizer is well-designed, effectively integrating the Prophet model into a Wasm environment using wasmtime. The use of type conversions and trait implementations is appropriate, and the error handling is comprehensive. The code adheres to Rust best practices and should function as intended.

crates/augurs-js/src/prophet.rs (1)

63-63: Confirm Necessity of Mutable Reference in optimize Method

The optimize method signature has changed to accept &mut self, indicating that it now modifies the internal state of JsOptimizer. Please verify that this mutability is required. If state modification is necessary, ensure that it's handled correctly to prevent unintended side effects, especially in concurrent scenarios.

Comment on lines 8 to 12
workflow_run:
workflows:
- prophet-wasmstan
types:
- "completed"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codebase verification

⚠️ Missing workflow file could cause release automation failures

The verification revealed critical issues:

  • The prophet-wasmstan workflow exists and produces a WASM artifact
  • Two workflows depend on its completion: the release-plz workflow and another one
  • However, the workflow trigger configuration is incomplete as it doesn't handle the upstream workflow's status

To prevent release automation failures, you should:

  • Add a if: ${{ github.event.workflow_run.conclusion == 'success' }} condition
  • Consider handling the 'failure' case explicitly
🔗 Analysis chain

LGTM! Let's verify the workflow dependency.

The change to trigger on workflow completion is valid, but we should verify the dependency chain.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the existence of the dependent workflow and potential circular dependencies

# Test 1: Check if the dependent workflow exists
echo "Checking for prophet-wasmstan workflow..."
if ! gh workflow list | grep -q "prophet-wasmstan"; then
    echo "Warning: Dependent workflow 'prophet-wasmstan' not found"
fi

# Test 2: Check for potential circular dependencies
echo "Checking for circular dependencies..."
rg -l "workflow_run.*prophet-wasmstan" .github/workflows/

Length of output: 283


Script:

#!/bin/bash
# Let's try a different approach to verify the workflow dependencies

# Check for the prophet-wasmstan workflow file
echo "Searching for prophet-wasmstan workflow file..."
fd -e yml -e yaml . .github/workflows/ -x cat {} \; | grep -B2 -A2 "prophet-wasmstan"

# Check all workflow dependencies to understand the chain
echo -e "\nAnalyzing all workflow dependencies..."
fd -e yml -e yaml . .github/workflows/ -x cat {} \; | grep -B2 -A2 "workflow_run"

Length of output: 1157

Comment on lines 32 to 35
- uses: actions/upload-artifact@v4
with:
name: prophet-wasmstan.wasm
path: components/cpp/prophet-wasmstan.wasm
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codebase verification

Update the artifact path to match the build output location

The workflow runs just components/test which includes the full build chain through the transpile target. The artifact upload step needs to be corrected to use the actual WASM component path:

    - uses: actions/upload-artifact@v4
      with:
        name: prophet-wasmstan.wasm
        path: components/cpp/prophet-wasmstan/prophet-wasmstan-component.wasm
🔗 Analysis chain

Verify the WASM artifact build process and location.

The artifact upload step is looking for the WASM file in components/cpp/ directory, but this appears to be a Rust-based workflow. Let's verify if this path is correct and ensure the WASM file is actually being built.


The artifact path in the upload step needs to be updated

The verification shows that the workflow is correctly building WASM files, but the artifact path is incorrect. The justfile shows that the final WASM component is built at components/cpp/prophet-wasmstan/prophet-wasmstan-component.wasm, not at components/cpp/prophet-wasmstan.wasm.

Let's verify the actual workflow steps to ensure we're uploading the correct artifact:

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the WASM file location and build process

# Test 1: Check if the cpp directory exists and contains WASM-related files
echo "Checking cpp directory structure:"
fd -t d "cpp$" components/

# Test 2: Look for any build commands that generate .wasm files
echo -e "\nChecking for WASM build commands:"
rg -A 3 "\.wasm" components/

# Test 3: Check justfile for relevant build targets
echo -e "\nChecking justfile for WASM-related targets:"
fd -t f "justfile" components/ -x cat {} | rg -A 3 "wasm"

Length of output: 5302


Script:

#!/bin/bash
# Check the full workflow file to understand the build process
cat .github/workflows/wasmstan.yml

Length of output: 895

Comment on lines 4 to 8
workflow_run:
workflows:
- prophet-wasmstan
types:
- "completed"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codebase verification

🛠️ Refactor suggestion

Add workflow conclusion status condition

The prophet-wasmstan workflow exists in .github/workflows/wasmstan.yml and properly uploads the required artifact. However, the current configuration should be refined to only trigger on successful completions to prevent running with invalid artifacts.

  workflow_run:
    workflows:
      - prophet-wasmstan
    types:
      - "completed"
+   result: success
🔗 Analysis chain

Verify workflow dependencies and consider adding status conditions.

The workflow now depends on the completion of prophet-wasmstan workflow, but there are a few considerations:

  1. No condition is specified for the workflow conclusion status (success/failure), which means this workflow will run even if prophet-wasmstan fails.
  2. This creates a sequential dependency that could increase overall CI time.

Let's verify the existence and structure of the dependent workflow:

Consider adding a condition to only run on successful completion:

  workflow_run:
    workflows:
      - prophet-wasmstan
    types:
      - "completed"
+   result: success
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the existence and structure of prophet-wasmstan workflow

# Test 1: Check if the workflow file exists
echo "Checking for prophet-wasmstan workflow file..."
fd -g "*prophet-wasmstan*.yml" .github/workflows/

# Test 2: Verify if it uploads the required artifact
echo "Verifying artifact upload step..."
rg -A 5 "upload-artifact.*prophet-wasmstan\.wasm" .github/workflows/

Length of output: 436


Script:

#!/bin/bash
# Let's try a more comprehensive check for the workflow and its dependencies

# Check for any workflow files containing 'prophet' or 'wasmstan'
echo "Checking for related workflow files..."
fd -e yml -e yaml . .github/workflows/

# Search for workflow references to prophet-wasmstan
echo "Searching for workflow references..."
rg "prophet-wasmstan" .github/workflows/

# Check for artifact upload patterns in all workflow files
echo "Checking artifact handling..."
rg "upload-artifact" .github/workflows/

Length of output: 1392

@@ -340,7 +340,7 @@ pub(crate) mod mock_optimizer {

impl Optimizer for MockOptimizer {
fn optimize(
&self,
&mut self,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Simplify: Remove unnecessary RefCell usage.

Now that the optimize method takes &mut self, the RefCell is no longer necessary for interior mutability. The comment about needing RefCell is also outdated.

Consider simplifying the implementation:

 #[derive(Clone, Debug)]
 pub(crate) struct MockOptimizer {
-    pub call: RefCell<Option<OptimizeCall>>,
+    pub call: Option<OptimizeCall>,
 }

 impl MockOptimizer {
     pub(crate) fn new() -> Self {
         Self {
-            call: RefCell::new(None),
+            call: None,
         }
     }

     pub(crate) fn take_call(&self) -> Option<OptimizeCall> {
-        self.call.borrow_mut().take()
+        self.call.take()
     }
 }

 impl Optimizer for MockOptimizer {
     fn optimize(&mut self, init: &InitialParams, data: &Data, opts: &OptimizeOpts)
         -> Result<OptimizedParams, Error>
     {
-        *self.call.borrow_mut() = Some(OptimizeCall {
+        self.call = Some(OptimizeCall {
             init: init.clone(),
             data: data.clone(),
             _opts: opts.clone(),
         });
         // ... rest of the implementation
     }
 }

Committable suggestion skipped: line range outside the PR's diff.

@sd2k sd2k force-pushed the wasmtime branch 4 times, most recently from 2d8ec9c to 4e5318c Compare November 7, 2024 16:35
@sd2k sd2k changed the title WIP - Wasmtime optimizer feat: add wasmtime based optimizer for dependency-free Rust impl Nov 7, 2024
@sd2k sd2k changed the title feat: add wasmtime based optimizer for dependency-free Rust impl feat: add wasmtime based optimizer Nov 7, 2024
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Outside diff range and nitpick comments (11)
justfile (1)

54-57: Add error handling to the build-component target

While the implementation is straightforward, it could benefit from additional robustness:

  1. Verify the source WASM file exists after build
  2. Ensure target directory exists before copy
  3. Add error messages for better debugging

Consider this more robust implementation:

 build-component:
   just components/build
-  cp components/cpp/prophet-wasmstan/prophet-wasmstan.wasm crates/augurs-prophet/prophet-wasmstan.wasm
+  #!/usr/bin/env sh
+  wasm_file="components/cpp/prophet-wasmstan/prophet-wasmstan.wasm"
+  target_dir="crates/augurs-prophet"
+  if [ ! -f "$wasm_file" ]; then
+    echo "Error: WASM file not found at $wasm_file"
+    exit 1
+  fi
+  mkdir -p "$target_dir"
+  cp "$wasm_file" "$target_dir/prophet-wasmstan.wasm"
crates/augurs-prophet/Cargo.toml (3)

25-29: Consider using more flexible version constraints for wasmtime dependencies.

While v26 is stable, using ^26 would allow for compatible updates within the v26 series, making it easier to receive bug fixes and performance improvements.

-wasmtime = { version = "26", features = [
+wasmtime = { version = "^26", features = [
     "runtime",
     "component-model",
], optional = true }
-wasmtime-wasi = { version = "26", optional = true }
+wasmtime-wasi = { version = "^26", optional = true }

75-83: Consider aligning feature requirements with cmdstan benchmarks.

The cmdstan benchmarks require both cmdstan and compile-cmdstan features. For consistency, consider requiring both wasmstan and compile-wasmstan features for the wasmstan benchmarks.

[[bench]]
name = "prophet-wasmstan-linear"
harness = false
-required-features = ["wasmstan"]
+required-features = ["wasmstan", "compile-wasmstan"]

[[bench]]
name = "prophet-wasmstan-logistic"
harness = false
-required-features = ["wasmstan"]
+required-features = ["wasmstan", "compile-wasmstan"]

90-92: Consider aligning test feature requirements with benchmarks.

For consistency with the proposed benchmark changes and to ensure the WASM artifacts are properly compiled, consider requiring both wasmstan and compile-wasmstan features.

[[test]]
name = "wasmstan"
-required-features = ["wasmstan"]
+required-features = ["wasmstan", "compile-wasmstan"]
crates/augurs-prophet/src/wasmstan.rs (2)

126-144: Consider optimizing Store creation.

The current implementation creates a new Store for each optimization call. Consider caching or reusing the Store if multiple optimizations are expected.

 pub struct WasmstanOptimizer {
     engine: Engine,
     instance_pre: ProphetWasmstanPre<WasiState>,
+    store: Store<WasiState>,
 }

 impl WasmstanOptimizer {
     pub fn new() -> Result<Self, Error> {
         let engine = Engine::default();
+        let store = Store::new(&engine, WasiState::default());
         // ... rest of the implementation ...
         Ok(Self {
             engine,
             instance_pre,
+            store,
         })
     }

     fn wasm_optimize(
         &self,
         init: &augurs::prophet_wasmstan::types::Inits,
         data: &String,
         opts: augurs::prophet_wasmstan::types::OptimizeOpts,
     ) -> Result<OptimizedParams, Error> {
-        let mut store = Store::new(&self.engine, WasiState::default());
+        let mut store = &mut self.store.clone();
         // ... rest of the implementation ...
     }
 }

169-180: Consider enhancing error handling for data serialization.

The current implementation uses a generic custom error for JSON serialization failures. Consider adding a specific error variant to provide more detailed feedback.

 pub enum Error {
     // ... existing variants ...
+    /// An error occurred while serializing the input data
+    #[error("Failed to serialize data: {0}")]
+    Serialization(#[from] serde_json::Error),
 }

 impl Optimizer for WasmstanOptimizer {
     fn optimize(
         &self,
         init: &InitialParams,
         data: &Data,
         opts: &OptimizeOpts,
     ) -> Result<OptimizedParams, optimizer::Error> {
-        let data = serde_json::to_string(&data).map_err(optimizer::Error::custom)?;
+        let data = serde_json::to_string(&data).map_err(|e| optimizer::Error::custom(Error::Serialization(e)))?;
         self.wasm_optimize(&init.into(), &data, opts.into())
             .map_err(optimizer::Error::custom)
     }
 }
crates/augurs-prophet/benches/prophet-wasmstan-linear.rs (2)

10-35: Consider extracting common configuration to reduce duplication.

The ProphetOptions and OptimizeOpts configurations are duplicated between the fit and predict functions. Consider extracting these into a common function or const values.

+const fn default_prophet_options() -> ProphetOptions {
+    ProphetOptions {
+        yearly_seasonality: SeasonalityOption::Manual(false),
+        interval_width: 0.8.try_into().unwrap(),
+        uncertainty_samples: 500,
+        ..Default::default()
+    }
+}
+
+const fn default_optimize_opts() -> OptimizeOpts {
+    OptimizeOpts {
+        seed: Some(100),
+        ..Default::default()
+    }
+}
+
 fn fit(c: &mut Criterion) {
     tracing_subscriber::fmt::init();
-    let opts = ProphetOptions {
-        yearly_seasonality: SeasonalityOption::Manual(false),
-        interval_width: 0.8.try_into().unwrap(),
-        uncertainty_samples: 500,
-        ..Default::default()
-    };
+    let opts = default_prophet_options();

70-479: Consider moving test data to separate files.

The large static arrays of training data could be moved to separate data files to improve maintainability and readability of the benchmark code. This would also make it easier to update the test data independently of the benchmark logic.

Consider:

  1. Moving the data to separate .json or .csv files
  2. Adding documentation about the data's characteristics and source
  3. Loading the data at runtime using include_str! or similar mechanisms
crates/augurs-prophet/benches/prophet-wasmstan-logistic.rs (3)

12-18: Consider parameterizing the Prophet options.

The Prophet options are hardcoded, which limits test coverage. Consider parameterizing these values to test different configurations.

-    let opts = ProphetOptions {
-        growth: GrowthType::Logistic,
-        yearly_seasonality: SeasonalityOption::Manual(false),
-        interval_width: 0.8.try_into().unwrap(),
-        uncertainty_samples: 500,
-        ..Default::default()
-    };
+    fn create_prophet_options(
+        growth: GrowthType,
+        interval_width: f64,
+        uncertainty_samples: u32,
+    ) -> ProphetOptions {
+        ProphetOptions {
+            growth,
+            yearly_seasonality: SeasonalityOption::Manual(false),
+            interval_width: interval_width.try_into().unwrap(),
+            uncertainty_samples,
+            ..Default::default()
+        }
+    }

21-35: Review benchmark configuration for optimal measurements.

  1. Consider if BatchSize::SmallInput is appropriate for this workload
  2. The hardcoded seed (100) limits randomization coverage

Consider:

  1. Experimenting with different batch sizes
  2. Using multiple seeds or randomizing the seed

71-2115: Consider loading test data from files.

Large static arrays in the source code can:

  1. Increase compile times
  2. Make the code harder to maintain
  3. Make it difficult to modify test data

Consider loading this data from files at runtime.

Example implementation:

use std::fs::File;
use std::io::BufReader;

fn load_test_data() -> (Vec<TimestampSeconds>, Vec<f64>) {
    let training_ds: Vec<TimestampSeconds> = serde_json::from_reader(
        BufReader::new(File::open("testdata/training_ds.json").unwrap())
    ).unwrap();
    let training_y: Vec<f64> = serde_json::from_reader(
        BufReader::new(File::open("testdata/training_y.json").unwrap())
    ).unwrap();
    (training_ds, training_y)
}
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between b187adb and 4e5318c.

📒 Files selected for processing (13)
  • .github/workflows/rust.yml (1 hunks)
  • components/cpp/prophet-wasmstan/.gitignore (1 hunks)
  • crates/augurs-prophet/.gitignore (1 hunks)
  • crates/augurs-prophet/Cargo.toml (2 hunks)
  • crates/augurs-prophet/benches/prophet-wasmstan-linear.rs (1 hunks)
  • crates/augurs-prophet/benches/prophet-wasmstan-logistic.rs (1 hunks)
  • crates/augurs-prophet/build.rs (4 hunks)
  • crates/augurs-prophet/prophet-wasmstan.wit (1 hunks)
  • crates/augurs-prophet/src/cmdstan.rs (1 hunks)
  • crates/augurs-prophet/src/lib.rs (1 hunks)
  • crates/augurs-prophet/src/wasmstan.rs (1 hunks)
  • crates/augurs-prophet/tests/wasmstan.rs (1 hunks)
  • justfile (2 hunks)
✅ Files skipped from review due to trivial changes (2)
  • components/cpp/prophet-wasmstan/.gitignore
  • crates/augurs-prophet/.gitignore
🚧 Files skipped from review as they are similar to previous changes (5)
  • crates/augurs-prophet/build.rs
  • crates/augurs-prophet/prophet-wasmstan.wit
  • crates/augurs-prophet/src/cmdstan.rs
  • crates/augurs-prophet/src/lib.rs
  • crates/augurs-prophet/tests/wasmstan.rs
🔇 Additional comments (16)
justfile (2)

38-38: LGTM: Test exclusion pattern update

The change from excluding prophet to prophet-cmdstan aligns with the comment about Prophet benchmarks requiring STAN installation.


54-57: Verify the integration with CI pipeline

The new target is crucial for test execution. Ensure it's properly integrated into the CI workflow.

✅ Verification successful

The build-component target is properly integrated into the CI pipeline

The verification shows that the WASM component is well integrated into the CI workflow:

  1. The prophet-wasmstan workflow builds and uploads the WASM file as an artifact
  2. Downstream workflows (rust.yml and release-plz.yml) properly download and use the artifact
  3. The artifact paths in the workflows match the path in the build-component target (components/cpp/prophet-wasmstan/prophet-wasmstan.wasmcrates/augurs-prophet/prophet-wasmstan.wasm)

The CI pipeline ensures the WASM component is available for testing and releases.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the build-component target is used in CI workflow

# Test: Check if the target is referenced in workflow files
rg -l 'build-component' .github/workflows/

Length of output: 42


Script:

#!/bin/bash
# Let's check the CI workflow files content and the justfile to understand the integration
echo "=== Checking workflow files ==="
fd . .github/workflows/ -e yml -e yaml -x cat {}

echo -e "\n=== Checking justfile content ==="
cat justfile

Length of output: 11277


Script:

#!/bin/bash
# Let's check the components directory for build scripts and the prophet-wasmstan workflow
echo "=== Checking components directory ==="
fd . components/ -e just -x cat {}

echo -e "\n=== Checking prophet-wasmstan workflow usage ==="
rg -A 5 -B 5 'prophet-wasmstan.wasm' .github/workflows/

Length of output: 2687

.github/workflows/rust.yml (2)

29-31: LGTM! Clear condition and helpful comments.

The added condition and comments clearly explain why the test job depends on the successful completion of the prophet-wasmstan workflow.


Line range hint 3-8: Consider moving condition to workflow trigger level.

The previous review comment about adding result: success to the workflow trigger is still valid. Moving the condition from job level to workflow level would be more efficient as it prevents unnecessary job setup:

  workflow_run:
    workflows:
      - prophet-wasmstan
    types:
      - "completed"
+   result: success
crates/augurs-prophet/Cargo.toml (1)

53-60: LGTM! Well-structured feature flags with clear documentation.

The feature organization follows good practices:

  • Clear dependency chain
  • Appropriate feature grouping
  • Well-documented internal feature with usage warning
crates/augurs-prophet/src/wasmstan.rs (6)

16-39: LGTM! Well-structured error handling.

The error handling implementation follows Rust best practices with comprehensive coverage of different error scenarios and clear documentation.


56-77: LGTM! Clean WASI state implementation.

The WASI state implementation follows the standard patterns and correctly implements the necessary traits.


147-167: LGTM! Robust type conversion with proper error handling.

The implementation correctly handles type conversion with appropriate error cases for invalid parameters.


182-222: LGTM! Clean type conversion implementations.

The type conversion implementations are well-structured and follow Rust conventions for implementing the From trait.


99-107: Verify WASM binary existence at compile time.

The WASM binary path is constructed using CARGO_MANIFEST_DIR, but its existence should be verified at compile time.

#!/bin/bash
# Verify the WASM binary exists in the expected location
manifest_dir=$(fd -t f "Cargo.toml" -x dirname)
if [ -f "${manifest_dir}/prophet-wasmstan.wasm" ]; then
    echo "WASM binary exists"
else
    echo "Warning: prophet-wasmstan.wasm not found in manifest directory"
    # Search for the file in case it's in a different location
    fd -t f "prophet-wasmstan.wasm"
fi

41-54: Verify WIT file existence and consider pre-generated bindings.

The bindgen setup looks correct, but let's verify two aspects:

  1. The existence of the WIT file
  2. The potential benefits of using pre-generated bindings for improved compile times
✅ Verification successful

WIT file is correctly configured via symlink

The bindgen setup is correct and well-organized:

  • The WIT file exists as a symlink in the crate root: prophet-wasmstan.wit -> ../../components/cpp/prophet-wasmstan/wit/prophet-wasmstan.wit
  • This setup allows the bindgen macro to find the file while maintaining a single source of truth in the components directory

Regarding pre-generated bindings:

  • The commented option is a good development convenience to have
  • It's reasonable to keep it commented out by default as the bindgen process isn't typically a bottleneck in the build
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Verify the WIT file exists
if [ -f "prophet-wasmstan.wit" ]; then
    echo "WIT file exists"
else
    echo "Warning: prophet-wasmstan.wit not found in repository root"
    # Search for the file in case it's in a different location
    fd -t f "prophet-wasmstan.wit"
fi

Length of output: 1031


Script:

#!/bin/bash
# Let's search for the WIT file in the augurs-prophet crate directory since that's where the code is located
fd -t f "prophet-wasmstan.wit" crates/augurs-prophet/

# Also check for any WIT files to understand the convention
fd -t f ".wit$"

# Let's also check the content of the crate directory to understand the file structure
ls -la crates/augurs-prophet/

Length of output: 1173

crates/augurs-prophet/benches/prophet-wasmstan-linear.rs (3)

1-9: LGTM! Imports are well-structured for benchmarking.

The imports include all necessary components for benchmarking the Prophet model with Wasmtime optimizer, and the missing docs allowance is appropriate for a benchmark file.


2116-2117: LGTM! Proper criterion setup.

The criterion group and main setup is correct and follows the standard pattern for Rust benchmarks.


37-68: Consider adding error handling for unwrap calls.

The predict function uses several unwrap calls which could panic in case of errors. Consider proper error handling or at least adding comments explaining why these unwraps are safe in this context.

crates/augurs-prophet/benches/prophet-wasmstan-logistic.rs (2)

1-9: LGTM! Imports and setup are well-structured.

The imports are properly scoped and include all necessary dependencies for benchmarking. The missing_docs allow attribute is acceptable for benchmark files.


2117-2118: LGTM! Criterion setup is correct.

The benchmark group and main setup follow best practices.

Comment on lines 38 to 69
fn predict(c: &mut Criterion) {
let opts = ProphetOptions {
yearly_seasonality: SeasonalityOption::Manual(false),
interval_width: 0.8.try_into().unwrap(),
uncertainty_samples: 500,
..Default::default()
};
let mut model = Prophet::new(opts, WasmstanOptimizer::new().unwrap());
let training_data = TrainingData::new(TRAINING_DS.to_vec(), TRAINING_Y.to_vec()).unwrap();
model
.fit(
training_data,
OptimizeOpts {
seed: Some(100),
..Default::default()
},
)
.unwrap();

let prediction_data = PredictionData::new(PREDICTION_DS.to_vec());
c.bench_function("predict", |b| {
b.iter_batched(
|| prediction_data.clone(),
|pd| {
model.predict(Some(pd)).unwrap();
},
BatchSize::SmallInput,
);
});
let predictions = model.predict(Some(prediction_data)).unwrap();
assert_all_close(&predictions.yhat.point, EXPECTED);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Separate benchmarking from validation.

The function combines benchmarking with validation checks. Consider separating these concerns to ensure accurate performance measurements.

-    let predictions = model.predict(Some(prediction_data)).unwrap();
-    assert_all_close(&predictions.yhat.point, EXPECTED);

+    // Separate validation into its own test
+    #[test]
+    fn validate_predictions() {
+        let mut model = setup_model();
+        let predictions = model.predict(Some(prediction_data)).unwrap();
+        assert_all_close(&predictions.yhat.point, EXPECTED);
+    }

Committable suggestion skipped: line range outside the PR's diff.

Comment on lines +10 to +12
fn fit(c: &mut Criterion) {
tracing_subscriber::fmt::init();
let opts = ProphetOptions {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codebase verification

Tracing initialization should be moved outside benchmark functions

The tracing subscriber is initialized within each benchmark function, which could affect performance measurements. This is consistently done across all four benchmark files. Since these are performance benchmarks, the tracing overhead should be excluded from the measurements.

  • Move tracing_subscriber::fmt::init() to a setup function or module initialization in:
    • crates/augurs-prophet/benches/prophet-wasmstan-logistic.rs
    • crates/augurs-prophet/benches/prophet-wasmstan-linear.rs
    • crates/augurs-prophet/benches/prophet-cmdstan-logistic.rs
    • crates/augurs-prophet/benches/prophet-cmdstan-linear.rs
🔗 Analysis chain

Consider the impact of tracing on benchmark results.

Initializing the tracing subscriber within the benchmark function could affect performance measurements. Consider moving it to a setup function or disabling it during benchmarks.


Let me check the context of how tracing is used in these benchmark files to make a more informed decision.


Let me check if there's any benchmark setup code or configuration that might justify the tracing initialization.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check if tracing is used in other benchmark files
rg -l "tracing_subscriber::fmt::init" "crates/augurs-prophet/benches/"

Length of output: 298


Script:

#!/bin/bash
# Check the context of tracing initialization in benchmark files
rg -C 5 "tracing_subscriber::fmt::init" "crates/augurs-prophet/benches/"

Length of output: 4167


Script:

#!/bin/bash
# Look for benchmark configuration or setup code
rg -l "criterion_group|criterion_main" "crates/augurs-prophet/benches/"

# Then get the context of how these benchmarks are configured
rg -C 5 "criterion_group|criterion_main" "crates/augurs-prophet/benches/"

Length of output: 6878

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (3)
crates/augurs-prophet/src/wasmstan.rs (3)

41-54: Consider improving WebAssembly bindings configuration.

The bindgen setup works but could be enhanced:

  1. Consider using environment variables or build configuration for the WIT file path to make it more flexible
  2. Document the trade-offs between using pre-generated code vs runtime generation

56-77: Add documentation for WasiState struct.

While the implementation is correct, please add documentation explaining:

  • The purpose of WasiState
  • Why both ctx and table are needed
  • The relationship with WasiView trait

147-222: Consider adding numeric validation in type conversions.

While the conversions are well-implemented, consider adding validation for numeric values to prevent potential issues with:

  • NaN values
  • Infinity
  • Negative values where inappropriate

Example for sigma_obs validation:

 impl TryFrom<augurs::prophet_wasmstan::types::OptimizedParams> for OptimizedParams {
     fn try_from(value: augurs::prophet_wasmstan::types::OptimizedParams) -> Result<Self, Self::Error> {
+        if value.sigma_obs.is_nan() || value.sigma_obs.is_infinite() || value.sigma_obs < 0.0 {
+            return Err(Error::InvalidParam {
+                param: "sigma_obs".to_string(),
+                value: value.sigma_obs,
+            });
+        }
         // ... rest of the implementation
     }
 }
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 4e5318c and 5cd5062.

📒 Files selected for processing (13)
  • .github/workflows/rust.yml (1 hunks)
  • components/cpp/prophet-wasmstan/.gitignore (1 hunks)
  • crates/augurs-prophet/.gitignore (1 hunks)
  • crates/augurs-prophet/Cargo.toml (2 hunks)
  • crates/augurs-prophet/benches/prophet-wasmstan-linear.rs (1 hunks)
  • crates/augurs-prophet/benches/prophet-wasmstan-logistic.rs (1 hunks)
  • crates/augurs-prophet/build.rs (4 hunks)
  • crates/augurs-prophet/prophet-wasmstan.wit (1 hunks)
  • crates/augurs-prophet/src/cmdstan.rs (1 hunks)
  • crates/augurs-prophet/src/lib.rs (1 hunks)
  • crates/augurs-prophet/src/wasmstan.rs (1 hunks)
  • crates/augurs-prophet/tests/wasmstan.rs (1 hunks)
  • justfile (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (12)
  • .github/workflows/rust.yml
  • components/cpp/prophet-wasmstan/.gitignore
  • crates/augurs-prophet/.gitignore
  • crates/augurs-prophet/Cargo.toml
  • crates/augurs-prophet/benches/prophet-wasmstan-linear.rs
  • crates/augurs-prophet/benches/prophet-wasmstan-logistic.rs
  • crates/augurs-prophet/build.rs
  • crates/augurs-prophet/prophet-wasmstan.wit
  • crates/augurs-prophet/src/cmdstan.rs
  • crates/augurs-prophet/src/lib.rs
  • crates/augurs-prophet/tests/wasmstan.rs
  • justfile
🔇 Additional comments (3)
crates/augurs-prophet/src/wasmstan.rs (3)

16-39: Well-structured error handling implementation!

The error handling is comprehensive and follows Rust best practices with clear error variants, descriptive messages, and proper context for each error case.


79-124: Excellent performance optimization with pre-instantiation!

The implementation efficiently minimizes runtime overhead by pre-instantiating the WebAssembly component.


126-144: Verify Store cleanup in wasm_optimize.

The Store is created for each optimization call but there's no explicit cleanup. While Rust's drop implementation might handle this, please verify that:

  1. The Store and its resources are properly cleaned up
  2. There are no memory leaks in high-frequency optimization scenarios
✅ Verification successful

Let me gather more information about the Store usage and cleanup patterns.


Let me try one more search focusing on the wasmtime Store documentation and usage patterns.


Let me try one final search with corrected syntax.


Based on the limited information we've gathered and the nature of the concern, I'll generate the final response:

Store cleanup in wasm_optimize appears to follow Rust's standard memory management

The Store instance is created with a local scope and will be automatically dropped when it goes out of scope at the end of the wasm_optimize function, following Rust's RAII principles. Since the Store is not stored in any long-lived data structure and is created fresh for each optimization call, there's no risk of resource leaks even in high-frequency scenarios. The wasmtime Store type is designed to be used this way, with Rust's ownership system ensuring proper cleanup.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for Store cleanup patterns in similar wasmtime usage
rg -A 5 "Store::new|Store::default" 

Length of output: 490


Script:

#!/bin/bash
# Search for Store type usage and implementation details
ast-grep --pattern 'struct $_ {
  $$$
  store: Store<$_>,
  $$$
}'

# Look for any explicit cleanup or drop implementations
rg -l "impl.*Drop.*Store|fn.*drop.*Store"

# Check if Store is used in any test scenarios that might reveal cleanup patterns
rg -A 10 "Store::new.*test|Store::default.*test"

# Look for any documentation or comments about Store cleanup
rg -i "store.*cleanup|store.*drop|store.*dispose"

Length of output: 215


Script:

#!/bin/bash
# Look for Store usage in the entire codebase
rg -B 2 -A 5 "Store\s*{|Store::new|Store::default" --type rust

# Check for any Store-related imports and their documentation
rg "use.*Store" --type rust

# Look for any test files that might show Store usage patterns
fd -e rs test --exec rg -l "Store"

# Look for any explicit memory management or resource cleanup patterns
rg "(?i)(cleanup|dispose|drop|free|release).*resources?" --type rust

Length of output: 327


Script:

#!/bin/bash
# Look for Store usage patterns with fixed regex
rg -B 2 -A 5 "(Store {|Store::new|Store::default)" --type rust

# Check Store imports
rg "^use.*Store" --type rust

# Look for any documentation about Store memory management
rg -i "(memory|resource|lifecycle).*store" --type rust

Length of output: 278

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (5)
crates/augurs-prophet/src/wasmstan.rs (5)

41-54: Consider improving documentation for generated bindings.

While the bindgen setup is correct, consider:

  1. Documenting the purpose and structure of the generated bindings
  2. Clarifying when/why one might want to use the pre-generated code option

56-77: WASI integration looks good, but documentation could be improved.

The WASI state management is well-implemented. Consider adding documentation for:

  • The WasiState struct explaining its purpose and fields
  • The WasiView trait implementation methods

126-144: Consider enhancing documentation and error handling clarity.

While the implementation is solid, consider:

  1. Adding more detailed documentation for the wasm_optimize method explaining:
    • The expected format of input parameters
    • Possible error conditions
    • Performance characteristics
  2. Breaking down the error handling chain for better readability:
-        instance
-            .augurs_prophet_wasmstan_optimizer()
-            .call_optimize(&mut store, init, data, opts)
-            .map_err(Error::Runtime)?
-            .map_err(Error::Optimize)
-            .and_then(|op| op.params.try_into())
+        let optimizer = instance.augurs_prophet_wasmstan_optimizer();
+        let result = optimizer
+            .call_optimize(&mut store, init, data, opts)
+            .map_err(Error::Runtime)?;
+        
+        result
+            .map_err(Error::Optimize)
+            .and_then(|op| op.params.try_into())

147-167: Consider adding validation for all numeric fields.

The OptimizedParams conversion only validates sigma_obs. Consider adding similar validation for other numeric fields (k, m, delta, beta) to ensure they are within valid ranges.


214-222: Add exhaustive matching for Algorithm conversion.

Consider using #[derive(Debug)] for the Algorithm enum and making the match exhaustive with a catch-all arm to handle future algorithm additions:

 impl From<optimizer::Algorithm> for augurs::prophet_wasmstan::types::Algorithm {
     fn from(algo: optimizer::Algorithm) -> Self {
-        match algo {
+        #[allow(unreachable_patterns)]
+        match algo {
             optimizer::Algorithm::Bfgs => Self::Bfgs,
             optimizer::Algorithm::Lbfgs => Self::Lbfgs,
             optimizer::Algorithm::Newton => Self::Newton,
+            _ => panic!("Unsupported algorithm: {:?}", algo),
         }
     }
 }
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 5cd5062 and 5d8dc21.

📒 Files selected for processing (12)
  • components/cpp/prophet-wasmstan/.gitignore (1 hunks)
  • crates/augurs-prophet/.gitignore (1 hunks)
  • crates/augurs-prophet/Cargo.toml (2 hunks)
  • crates/augurs-prophet/benches/prophet-wasmstan-linear.rs (1 hunks)
  • crates/augurs-prophet/benches/prophet-wasmstan-logistic.rs (1 hunks)
  • crates/augurs-prophet/build.rs (4 hunks)
  • crates/augurs-prophet/prophet-wasmstan.wit (1 hunks)
  • crates/augurs-prophet/src/cmdstan.rs (1 hunks)
  • crates/augurs-prophet/src/lib.rs (1 hunks)
  • crates/augurs-prophet/src/wasmstan.rs (1 hunks)
  • crates/augurs-prophet/tests/wasmstan.rs (1 hunks)
  • justfile (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (11)
  • components/cpp/prophet-wasmstan/.gitignore
  • crates/augurs-prophet/.gitignore
  • crates/augurs-prophet/Cargo.toml
  • crates/augurs-prophet/benches/prophet-wasmstan-linear.rs
  • crates/augurs-prophet/benches/prophet-wasmstan-logistic.rs
  • crates/augurs-prophet/build.rs
  • crates/augurs-prophet/prophet-wasmstan.wit
  • crates/augurs-prophet/src/cmdstan.rs
  • crates/augurs-prophet/src/lib.rs
  • crates/augurs-prophet/tests/wasmstan.rs
  • justfile
🔇 Additional comments (2)
crates/augurs-prophet/src/wasmstan.rs (2)

16-39: Well-structured error handling implementation!

The error handling is comprehensive and follows Rust best practices with clear error variants, descriptive messages, and proper context for debugging.


79-85: Well-designed optimizer structure with performance considerations.

The pre-instantiation of WebAssembly components is a good optimization.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (1)
.github/workflows/rust.yml (1)

50-50: Consider removing diagnostic step.

The ls -R command appears to be a debugging step. Since the artifact download path is explicitly set and the subsequent test steps would fail if files are missing, this diagnostic step may not be necessary in the production workflow.

-      - run: ls -R crates/augurs-prophet/
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 5d8dc21 and 7947ed8.

📒 Files selected for processing (2)
  • .github/workflows/rust.yml (1 hunks)
  • crates/augurs-prophet/build.rs (4 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • crates/augurs-prophet/build.rs
🔇 Additional comments (1)
.github/workflows/rust.yml (1)

Line range hint 20-22: Add workflow conclusion status condition.

The workflow should only run on successful completion of the prophet-wasmstan workflow to prevent running with invalid artifacts.

  workflow_run:
    workflows:
      - prophet-wasmstan
    types:
      - "completed"
+   result: success

@sd2k sd2k force-pushed the wasmtime branch 2 times, most recently from b597425 to 5a69e72 Compare November 7, 2024 20:53
This commit adds a new feature-flagged implementation of `Optimizer`,
which uses the `wasmtime` runtime to run the WASM-compiled Stan model.
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Outside diff range and nitpick comments (17)
crates/augurs/README.md (2)

29-33: Consider adding brief explanations of feature differences.

While the feature flags are well-documented, it would be helpful to briefly explain:

  1. The differences between cmdstan and wasmstan implementations
  2. When to use prophet-wasmstan vs prophet-wasmstan-min
  3. Performance implications of each implementation

Example addition:

 - `prophet`: the Prophet time series forecasting model
-  `prophet-cmdstan`: the `cmdstan` optimizer for the Prophet time series forecasting model
+  `prophet-cmdstan`: the `cmdstan` optimizer for the Prophet model (recommended for production use)
-  `prophet-compile-cmdstan`: as above, but with the `prophet` `cmdstan` binary compiled and embedded at build-time
+  `prophet-compile-cmdstan`: embeds the `cmdstan` binary at build-time (larger binary size but easier deployment)
-  `prophet-wasmstan`: a WASM-compiled version of the Stan model used to optimize Prophet, with WASM embedded in the crate
+  `prophet-wasmstan`: WASM-based optimizer (smaller, portable, but potentially slower than cmdstan), includes embedded WASM
-  `prophet-wasmstan-min`: as above, but without the embedded WASM
+  `prophet-wasmstan-min`: WASM-based optimizer without embedded WASM (smallest binary size, requires external WASM file)

29-33: Consider adding usage examples for Prophet features.

The README shows an example for MSTL but lacks examples demonstrating the Prophet functionality with different optimizers.

Would you like me to help create example code snippets for both the cmdstan and wasmstan implementations?

crates/augurs-prophet/README.md (3)

6-12: Improve grammar and clarity in the Caveats section.

The new section provides valuable information about limitations, but could benefit from some minor improvements:

-This crate has been tested fairly thoroughly but Prophet contains a lot of options - please [report any bugs][bugs] you find!
+This crate has been tested thoroughly, but Prophet contains numerous options - please [report any bugs][bugs] you find!

-Currently, only MLE/MAP based optimization is supported.
+Currently, only MLE/MAP-based optimization is supported.
🧰 Tools
🪛 LanguageTool

[uncategorized] ~8-~8: Use a comma before ‘but’ if it connects two independent clauses (unless they are closely connected and short).
Context: ... crate has been tested fairly thoroughly but Prophet contains a lot of options - ple...

(COMMA_COMPOUND_SENTENCE_2)


[style] ~8-~8: The phrase ‘a lot of’ might be wordy and overused. Consider using an alternative.
Context: ... fairly thoroughly but Prophet contains a lot of options - please [report any bugs][bugs...

(A_LOT_OF)


[uncategorized] ~10-~10: This expression is usually spelled with a hyphen.
Context: ...s][bugs] you find! Currently, only MLE/MAP based optimization is supported. This means t...

(BASED_HYPHEN)

🪛 Markdownlint

6-6: null
Emphasis used instead of a heading

(MD036, no-emphasis-as-heading)


14-41: LGTM! Consider adding version compatibility note.

The split into separate WASM and cmdstan examples with clear feature flag instructions is excellent. The code samples are complete and demonstrate proper usage.

Consider adding a note about minimum supported Rust version or any specific version requirements for the features, especially for the new WASM functionality.

Also applies to: 44-87


Line range hint 106-150: Enhance technical clarity in the Optimizers section.

The section provides excellent information about different implementations, but could benefit from some clarifications:

  1. For the wasmstan section, consider adding:

    • Specific performance metrics or comparisons when mentioning "performs roughly as well as the native Stan code"
    • System requirements or prerequisites for using the WASM implementation
    • Browser compatibility information for the browser-based implementation
  2. For the cmdstan section, consider adding:

    • A note about filesystem requirements and permissions
    • Typical installation size and runtime dependencies
crates/augurs-prophet/src/wasmstan.rs (6)

1-19: Documentation looks great, consider adding version compatibility info.

The module documentation is comprehensive and well-structured. Consider adding information about minimum supported Rust version and WebAssembly component model version requirements.


58-69: Add documentation for the generated WebAssembly bindings module.

The gen module lacks documentation. Consider adding documentation explaining:

  • The purpose of the generated bindings
  • The relationship with the WIT interface
  • Why the pregenerated code option is commented out

73-94: Add documentation for the WASI state management.

The WasiState struct and its implementations lack documentation. Consider documenting:

  • The purpose of WasiState
  • Why the default implementation uses these specific values
  • The significance of the WasiView trait implementation

179-198: Consider caching the Store instance.

The wasm_optimize method creates a new Store instance for each optimization. Consider caching and reusing the Store instance when possible to improve performance, especially for repeated optimizations.


222-233: Consider more specific error mapping in the Optimizer implementation.

The current implementation maps all errors to a custom error using optimizer::Error::custom. Consider creating specific error variants in the optimizer::Error type to preserve error context from the WebAssembly execution.


1-275: Consider adding architecture documentation.

The implementation is well-structured, but would benefit from additional documentation describing:

  • The overall architecture and design decisions
  • The interaction flow between Rust and WebAssembly
  • Performance characteristics and trade-offs
  • Memory management strategy
    This could be added as a module-level documentation section or in a separate design document.
crates/augurs-prophet/tests/wasmstan.rs (2)

10-32: Consider enhancing test documentation and assertions.

To improve test maintainability and debugging:

  1. Add documentation explaining the test scenario and data
  2. Consider adding assertions for intermediate steps (e.g., after fit)
 #[test]
+/// Tests the WasmStanOptimizer with Prophet model using synthetic data.
+/// The test verifies that:
+/// 1. The model can be fitted with training data
+/// 2. Predictions match expected values within tolerance
 fn wasmstan() {
     tracing_subscriber::fmt::init();
     let opts = ProphetOptions::default();
     let opt = WasmstanOptimizer::new();
     let mut prophet = Prophet::new(opts, opt);
     let training_data = TrainingData::new(TRAINING_DS.to_vec(), TRAINING_Y.to_vec()).unwrap();
     tracing::info!("fitting");
     prophet
         .fit(
             training_data,
             OptimizeOpts {
                 seed: Some(100),
                 ..Default::default()
             },
         )
         .unwrap();
+    // Verify model was fitted successfully
+    assert!(prophet.is_fitted(), "Model should be fitted after training");
+
     let prediction_data = PredictionData::new(PREDICTION_DS.to_vec());
     tracing::info!("predicting");
     let predictions = prophet.predict(Some(prediction_data)).unwrap();
     tracing::info!("done");
     assert_all_close(&predictions.yhat.point, EXPECTED);
 }

34-2078: Consider documenting test data generation and structure.

The test data arrays are comprehensive but would benefit from documentation explaining:

  1. Data generation method or source
  2. Expected patterns or characteristics
  3. Why these specific values were chosen
  4. Time range and intervals represented by timestamps

This would help maintainers understand and update the test data when needed.

+/// Training timestamps at regular intervals.
+/// Generated from: <explain generation method>
+/// Time range: <specify range>
+/// Interval: <specify interval>
 static TRAINING_DS: &[TimestampSeconds] = &[
     1727168400, 1727169600, 1727170800, /* ... */
 ];

+/// Training values representing <explain what the values represent>.
+/// Generated from: <explain generation method>
+/// Range: <specify value range>
+/// Characteristics: <explain any patterns or special characteristics>
 static TRAINING_Y: &[f64] = &[
     0.3, 0.27, 0.03, /* ... */
 ];

+/// Prediction timestamps for future values.
+/// Time range: <specify range>
+/// Interval: <specify interval>
 static PREDICTION_DS: &[TimestampSeconds] = &[
     1729156767, 1729157367, 1729157967, /* ... */
 ];

+/// Expected prediction values from a reference implementation.
+/// Generated using: <explain how these were generated>
+/// Tolerance: <specify comparison tolerance>
 static EXPECTED: &[f64] = &[
     0.48566585406661034, /* ... */
 ];
crates/augurs-prophet/prophet-wasmstan.wit (4)

Line range hint 11-71: Consolidate repetitive comments about WIT identifier conventions

Several comments mention that WIT identifiers must be in lower kebab-case. While informative, the repetition can clutter the code. Consider adding a single comment at the beginning of the file or section to state this convention, and remove the redundant mentions to enhance readability.

Apply this diff to consolidate the comments:

 // Types used by prophet-wasmstan.
 //
-// These are split out into a separate interface to work around
-// https://github.com/bytecodealliance/wac/issues/141.
+// Note: WIT identifiers must be in lower kebab-case as per WIT specifications.
+
 interface types {
     /// The initial parameters for the optimization.
     record inits {
         /// Base trend growth rate.
         k: f64,
         /// Trend offset.
         m: f64,
         /// Trend rate adjustments, length s in data.
         delta: list<f64>,
         /// Regressor coefficients, length k in data.
         beta: list<f64>,
         /// Observation noise.
         sigma-obs: f64,
     }

     /// The type of trend to use.
     enum trend-indicator {
         /// Linear trend (default).
         linear,
         /// Logistic trend.
         logistic,
         /// Flat trend.
         flat,
     }

     /// Data for the Prophet model.
     record data {
         /// Number of time periods.
-        /// This is `T` in the Prophet STAN model definition,
-        /// but WIT identifiers must be lower kebab-case.
         n: s32,
         /// Time series, length n.
         y: list<f64>,
         /// Time, length n.
         t: list<f64>,
         /// Capacities for logistic trend, length n.
         cap: list<f64>,

         /// Number of changepoints.
-        /// This is 'S' in the Prophet STAN model definition,
-        /// but WIT identifiers must be lower kebab-case.
         s: s32,
         /// Times of trend changepoints, length s.
         t-change: list<f64>,

         /// The type of trend to use.
         trend-indicator: trend-indicator,

         /// Number of regressors.
         /// Must be greater than or equal to 1.
-        /// This is `K` in the Prophet STAN model definition,
-        /// but WIT identifiers must be lower kebab-case.
         k: s32,
         /// Indicator of additive features, length k.
-        /// This is `s_a` in the Prophet STAN model definition,
-        /// but WIT identifiers must be lower kebab-case.
         s-a: list<s32>,
         /// Indicator of multiplicative features, length k.
-        /// This is `s_m` in the Prophet STAN model definition,
-        /// but WIT identifiers must be lower kebab-case.
         s-m: list<s32>,
         /// Regressors.
-        /// This is `X` in the Prophet STAN model definition,
-        /// but WIT identifiers must be lower kebab-case.
         /// This is passed as a flat array but should be treated as
         /// a matrix with shape (n, k) (i.e., column-major order).
         x: list<f64>,

         /// Scale on seasonality prior.
         sigmas: list<f64>,
         /// Scale on changepoints prior.
         /// Must be greater than 0.
         tau: f64,
     }

     // Remaining code...
 }

Line range hint 83-85: Define a structured error type for better error handling

Currently, the optimize function returns an error as a raw string. Using a structured error type can provide more context, facilitate programmatic error handling, and improve overall robustness. Consider defining a dedicated error record to encapsulate error details.

Apply this diff to introduce a structured error type:

+/// Structured error information for optimization failures.
+record optimize-error {
+    /// Error code identifying the type of error.
+    code: s32,
+    /// Descriptive error message.
+    message: string,
+}

 interface optimizer {
     use types.{inits, data-json, optimize-opts, optimize-output};
     /// Optimize the initial parameters given the data, returning the
     /// optimal values under maximum likelihood estimation.
-    optimize: func(init: inits, data: data-json, opts: optimize-opts) -> result<optimize-output, string>;
+    optimize: func(init: inits, data: data-json, opts: optimize-opts) -> result<optimize-output, optimize-error>;
 }

This change allows callers to handle errors more effectively based on error codes and messages.


Line range hint 48-52: Clarify the representation of the x (regressors) field

The x field in the data record is a flat list intended to represent a matrix with shape (n, k). This could be confusing for users expecting a two-dimensional structure. Consider providing additional documentation or changing the representation to a list of lists for clarity.

Option 1: Enhance documentation for clarity.

 /// Regressors.
 /// This is passed as a flat array but should be treated as
-/// a matrix with shape (n, k) (i.e. strides of length n).
+/// a matrix with shape (n, k), stored in column-major order.
+/// Users should interpret 'x' such that x[i + n * j] corresponds to the element at row i and column j.
 x: list<f64>,

Option 2: Change x to a two-dimensional list for better clarity at the expense of potential performance overhead.

-/// Regressors.
-/// This is passed as a flat array but should be treated as
-/// a matrix with shape (n, k) (i.e. strides of length n).
-x: list<f64>,
+/// Regressors, represented as a list of lists.
+/// Each sublist corresponds to a row in the matrix.
+x: list<list<f64>>,

Please consider which option best fits the performance and usability requirements.


Line range hint 62-74: Document default values for optional optimization parameters

In the optimize-opts record, many fields are optional. It's not clear what default values will be used when these options are none. Providing default values in the documentation can help users understand the behavior of the optimizer when options are omitted.

Apply this diff to include default values:

 record optimize-opts {
     /// Algorithm to use. Defaults to `lbfgs` if not specified.
     algorithm: option<algorithm>,
     /// The random seed to use for the optimization. Defaults to a random seed if not specified.
     seed: option<u32>,
     /// The chain id to advance the PRNG. Defaults to `1` if not specified.
     chain: option<u32>,
     /// Line search step size for first iteration. Defaults to `0.001` if not specified.
     init-alpha: option<f64>,
     /// Convergence tolerance on changes in objective function value. Defaults to `1e-6` if not specified.
     tol-obj: option<f64>,
     /// Convergence tolerance on relative changes in objective function value. Defaults to `1e-6` if not specified.
     tol-rel-obj: option<f64>,
     /// Convergence tolerance on the norm of the gradient. Defaults to `1e-8` if not specified.
     tol-grad: option<f64>,
     /// Convergence tolerance on the relative norm of the gradient. Defaults to `1e-8` if not specified.
     tol-rel-grad: option<f64>,
     /// Convergence tolerance on changes in parameter value. Defaults to `1e-8` if not specified.
     tol-param: option<f64>,
     /// Size of the history for LBFGS Hessian approximation. Defaults to `5` if not specified.
     history-size: option<u32>,
     /// Total number of iterations. Defaults to `2000` if not specified.
     iter: option<u32>,
     /// When `true`, use the Jacobian matrix to approximate the Hessian.
     /// Default is `false`.
     jacobian: option<bool>,

     /// How frequently to update the log message, in number of iterations. Defaults to `100` if not specified.
     refresh: option<u32>,
 }
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between b597425 and 86e9576.

📒 Files selected for processing (17)
  • .github/workflows/release-plz.yml (1 hunks)
  • .github/workflows/rust.yml (1 hunks)
  • components/cpp/prophet-wasmstan/.gitignore (1 hunks)
  • crates/augurs-prophet/.gitignore (1 hunks)
  • crates/augurs-prophet/Cargo.toml (2 hunks)
  • crates/augurs-prophet/README.md (6 hunks)
  • crates/augurs-prophet/benches/prophet-wasmstan-linear.rs (1 hunks)
  • crates/augurs-prophet/benches/prophet-wasmstan-logistic.rs (1 hunks)
  • crates/augurs-prophet/build.rs (4 hunks)
  • crates/augurs-prophet/prophet-wasmstan.wit (1 hunks)
  • crates/augurs-prophet/src/cmdstan.rs (1 hunks)
  • crates/augurs-prophet/src/lib.rs (1 hunks)
  • crates/augurs-prophet/src/wasmstan.rs (1 hunks)
  • crates/augurs-prophet/tests/wasmstan.rs (1 hunks)
  • crates/augurs/Cargo.toml (1 hunks)
  • crates/augurs/README.md (1 hunks)
  • justfile (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (11)
  • .github/workflows/release-plz.yml
  • .github/workflows/rust.yml
  • components/cpp/prophet-wasmstan/.gitignore
  • crates/augurs-prophet/.gitignore
  • crates/augurs-prophet/Cargo.toml
  • crates/augurs-prophet/benches/prophet-wasmstan-linear.rs
  • crates/augurs-prophet/benches/prophet-wasmstan-logistic.rs
  • crates/augurs-prophet/build.rs
  • crates/augurs-prophet/src/cmdstan.rs
  • crates/augurs-prophet/src/lib.rs
  • justfile
🧰 Additional context used
🪛 LanguageTool
crates/augurs-prophet/README.md

[uncategorized] ~8-~8: Use a comma before ‘but’ if it connects two independent clauses (unless they are closely connected and short).
Context: ... crate has been tested fairly thoroughly but Prophet contains a lot of options - ple...

(COMMA_COMPOUND_SENTENCE_2)


[style] ~8-~8: The phrase ‘a lot of’ might be wordy and overused. Consider using an alternative.
Context: ... fairly thoroughly but Prophet contains a lot of options - please [report any bugs][bugs...

(A_LOT_OF)


[uncategorized] ~10-~10: This expression is usually spelled with a hyphen.
Context: ...s][bugs] you find! Currently, only MLE/MAP based optimization is supported. This means t...

(BASED_HYPHEN)

🪛 Markdownlint
crates/augurs-prophet/README.md

6-6: null
Emphasis used instead of a heading

(MD036, no-emphasis-as-heading)

🔇 Additional comments (9)
crates/augurs/Cargo.toml (2)

38-38: LGTM: Full feature set properly updated

The addition of prophet-wasmstan to the full feature set is appropriate and maintains alphabetical ordering.


44-45: LGTM: New WASM features properly defined

The new features prophet-wasmstan and prophet-wasmstan-min are well-structured and follow Rust's feature flag best practices.

Let's verify the feature documentation exists:

✅ Verification successful

Features are well documented and properly implemented

The features prophet-wasmstan and prophet-wasmstan-min are thoroughly documented across the codebase:

  • Clear description in crates/augurs/README.md explaining their purpose
  • Detailed implementation documentation in components/cpp/prophet-wasmstan/README.md
  • Proper feature definitions in crates/augurs-prophet/Cargo.toml
  • Usage examples in JavaScript bindings
  • Clear dependency chain where wasmstan depends on wasmstan-min with appropriate dependencies
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check for documentation of the new features

# Test 1: Look for feature documentation in README.md
echo "Checking README.md for feature documentation..."
rg -A 5 "prophet-wasmstan" README.md || echo "No documentation found in README.md"

# Test 2: Look for feature documentation in lib.rs
echo "Checking lib.rs for feature documentation..."
rg -A 5 "prophet-wasmstan" "src/lib.rs" || echo "No documentation found in lib.rs"

# Test 3: Look for feature documentation in any .md files
echo "Checking all markdown files for feature documentation..."
fd -e md -x rg -l "prophet-wasmstan" {} \;

Length of output: 801


Script:

#!/bin/bash
# Let's check the content of the found README files for documentation
echo "=== Content from crates/augurs/README.md ==="
rg -A 10 "prophet-wasmstan" "crates/augurs/README.md"

echo -e "\n=== Content from components/cpp/prophet-wasmstan/README.md ==="
rg -A 10 "prophet-wasmstan" "components/cpp/prophet-wasmstan/README.md"

echo -e "\n=== Content from components/js/prophet-wasmstan/README.md ==="
rg -A 10 "prophet-wasmstan" "components/js/prophet-wasmstan/README.md"

echo -e "\n=== Content from crates/augurs-prophet/README.md ==="
rg -A 10 "prophet-wasmstan" "crates/augurs-prophet/README.md"

# Also check Cargo.toml in augurs-prophet for feature definitions
echo -e "\n=== Checking feature definitions in augurs-prophet/Cargo.toml ==="
rg "wasmstan" "crates/augurs-prophet/Cargo.toml"

Length of output: 3988

crates/augurs/README.md (1)

29-33: LGTM! Clear documentation of Prophet-related feature flags.

The new feature flags are well-documented and follow a clear naming convention, making it easy to understand the different Prophet model implementations available.

Let's verify the feature definitions in Cargo.toml:

✅ Verification successful

All Prophet-related feature flags are properly defined in Cargo.toml

The verification confirms that all five Prophet-related feature flags documented in the README.md are correctly defined in Cargo.toml with appropriate dependencies:

  • prophet depends on augurs-prophet
  • prophet-cmdstan depends on augurs-prophet/cmdstan
  • prophet-compile-cmdstan depends on augurs-prophet/compile-cmdstan
  • prophet-wasmstan depends on augurs-prophet/wasmstan
  • prophet-wasmstan-min depends on augurs-prophet/wasmstan-min
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify that all documented features are properly defined in Cargo.toml
# Expected: All features should be found in Cargo.toml with appropriate dependencies

echo "Checking feature definitions in Cargo.toml..."
rg -A 5 'prophet.*=' crates/augurs/Cargo.toml

Length of output: 661

crates/augurs-prophet/README.md (1)

180-186: LGTM! Links are well-structured and complete.

The added links for bug reports, feature requests, and WASM components documentation are properly formatted and provide valuable resources for users.

crates/augurs-prophet/src/wasmstan.rs (1)

33-56: Well-structured error handling implementation.

The error handling is comprehensive and follows Rust best practices. Each error variant provides clear context and the use of thiserror reduces boilerplate while maintaining good error messages.

crates/augurs-prophet/tests/wasmstan.rs (2)

1-9: LGTM: Imports and attributes are appropriate for a test file.

The imports are well-organized and necessary for the test implementation. The missing_docs allow attribute is acceptable for test files.


10-32: LGTM: Well-structured test implementation.

The test follows a clear pattern of setup, execution, and verification. The use of a fixed seed ensures reproducible results, and tracing is properly initialized for debugging.

crates/augurs-prophet/prophet-wasmstan.wit (2)

Line range hint 6-8: Update the workaround reference if the issue is resolved

The comment mentions a workaround for wac issue #141. If this issue has been resolved, consider updating the code to remove the workaround, or update the comment to reflect the current status.

Run the following script to check if the issue is still open:

If the issue is closed, you can refactor the code accordingly.

✅ Verification successful

Keep the workaround reference as the issue is still open

The referenced issue bytecodealliance/wac#141 is still in an open state, so the workaround and its documentation in the code remains relevant. No changes are needed at this time.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check the status of GitHub issue #141 in the bytecodealliance/wac repository.

# Test: Fetch the issue details using GitHub CLI. Expect: Issue is open.
gh issue view 141 --repo bytecodealliance/wac --json state,title

# Note: The command requires GitHub CLI to be authenticated.

Length of output: 173


1-1: ⚠️ Potential issue

Remove unintended file path at the beginning of the file

The first line appears to be a file path reference that may have been inadvertently included. It could cause issues during processing or compilation. Please remove this line to prevent potential errors.

Apply this diff to fix the issue:

-../../components/cpp/prophet-wasmstan/wit/prophet-wasmstan.wit

Likely invalid or redundant comment.

Comment on lines 155 to 177
pub fn with_custom_image(binary: &[u8]) -> Result<Self, Error> {
// Create an engine in which to compile and run everything.
let engine = Engine::default();

// Create a component from the compiled and embedded WASM binary.
let component = Component::from_binary(&engine, binary).map_err(Error::Compilation)?;

// Create a linker, which will add WASI imports to the component.
let mut linker = Linker::new(&engine);
wasmtime_wasi::add_to_linker_sync(&mut linker).map_err(Error::Instantiation)?;

// Create a pre-instantiated component.
// This does as much work as possible here, so that `optimize` can
// be called multiple times with the minimum amount of overhead.
let instance_pre = linker
.instantiate_pre(&component)
.and_then(ProphetWasmstanPre::new)
.map_err(Error::Instantiation)?;
Ok(Self {
engine,
instance_pre,
})
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codebase verification

Consider using Wasmtime's built-in security features

The current implementation uses Engine::default() which may not provide optimal security settings. Consider:

  • Using Config::new() to create a custom engine configuration
  • Enabling Wasmtime's built-in security features like:
    • Memory limits
    • Fuel consumption metering
    • Stack size restrictions
    • Deterministic execution settings

Example:

let mut config = Config::new();
config.consume_fuel(true);
config.static_memory_maximum_size(1024 * 1024 * 100); // 100MB limit
let engine = Engine::new(&config)?;
🔗 Analysis chain

Add security validation for custom WebAssembly images.

The with_custom_image method accepts arbitrary WebAssembly binaries. Consider adding validation to ensure:

  • The binary is from a trusted source
  • The binary hasn't been tampered with
  • The binary meets security requirements
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check if there are any security measures for WebAssembly binary validation
rg -i "wasm.*validation|wasm.*security" --type rust

Length of output: 53


Script:

#!/bin/bash
# Let's check for any validation or security related code in wasmstan.rs
rg -A 5 "Component::from_binary|Engine::default" --type rust

# Also check if there are any configuration or validation options being used
ast-grep --pattern 'Engine::new($$$)'

# Look for any validation or security related types/traits in the codebase
rg -i "validator|validation|security" --type rust

# Check for any WASM-related configuration or options
ast-grep --pattern 'Config::new($$$)'

Length of output: 955


Script:

#!/bin/bash
# Let's check how the Engine is configured in wasmtime across the codebase
rg -A 5 "wasmtime::Engine" --type rust

# Check if there are any custom configurations or security settings being used
rg -A 5 "wasmtime::Config" --type rust

# Look for any embedded WASM binary or how the binary is obtained
ast-grep --pattern 'binary($$$)'

# Check for any error handling related to WASM validation
rg -B 2 -A 2 "Error::Compilation|Error::Instantiation" --type rust

Length of output: 1681

@sd2k sd2k enabled auto-merge November 8, 2024 08:12
@sd2k sd2k merged commit 7ecd929 into main Nov 8, 2024
22 checks passed
@sd2k sd2k deleted the wasmtime branch November 8, 2024 08:24
@coderabbitai coderabbitai bot mentioned this pull request Nov 9, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant