-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: add cmdstan-based optimizer for augurs-prophet #121
Conversation
320e955
to
2dfa468
Compare
This commit adds an implementation of `Optimizer` to the `augurs-prophet` crate that uses `cmdstan` to optimize or sample from the Prophet model, similar to the Prophet Python package. The `cmdstan` feature is disabled by default since it requires external dependencies and may not work well on all platforms. The `compile-cmdstan` feature can be enabled to build the Prophet model at build time. This requires a working installation of `stan` and setting the `STAN_PATH` environment variable to the path to the `stan` installation. This will embed the Prophet model binary and the libtbb dynamic library into the final executable, which will increase the size of the final executable by about 2MB, but the final binary won't require any additional dependencies. There's an example in the `examples` directory which fits and predicts using cmdstan. It can be runwith something like: ```bash STAN_PATH=/home/ben/.conda/envs/stan cargo run --example prophet_cmdstan ``` (replacing the `STAN_PATH` value with the path to your Stan installation).
WalkthroughThe pull request includes modifications to several files, primarily focusing on enhancing the Rust workflow, adding dependencies, and improving documentation for the Changes
Possibly related PRs
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 18
🧹 Outside diff range and nitpick comments (19)
crates/augurs-prophet/src/lib.rs (1)
20-21
: LGTM! Consider adding documentation for the new module.The addition of the
cmdstan
module is well-placed and correctly feature-gated, aligning with the PR objectives. Making it public allows users to access the new functionality when the feature is enabled.Consider adding a brief doc comment above the module declaration to explain its purpose and when it's available. For example:
/// Provides cmdstan-based optimization for the Prophet model. /// This module is only available when the `cmdstan` feature is enabled. #[cfg(feature = "cmdstan")] pub mod cmdstan;justfile (2)
11-15
: Improved test command structure and exclusions.The changes to the
test
command are well-structured and align with the PR objectives. The use ofcargo nextest
and the exclusion ofaugurs-js
andpyaugurs
packages are good improvements.Consider adding a comment explaining why
augurs-js
andpyaugurs
are excluded to improve maintainability for future contributors.
40-42
: New command added for downloading Prophet Stan model.The addition of the
download-prophet-stan-model
command is a good implementation that aligns with the PR objectives. It provides a convenient way to download the required Stan model for the new cmdstan-based optimization feature.Consider adding a brief comment explaining the purpose and any prerequisites for using this command, similar to the helpful comments provided for other commands in this file.
crates/augurs-prophet/src/positive_float.rs (1)
5-5
: LGTM! Consider adding documentation for the new serde feature.The addition of the
serde
feature for serialization and deserialization is a good improvement. It allows users to opt-in to this functionality when needed, which is consistent with Rust's philosophy of zero-cost abstractions.Consider adding a brief documentation comment above the struct to mention the available features:
/// A positive-only, 64 bit precision floating point number. /// /// # Features /// /// - When the "bytemuck" feature is enabled, this type implements `bytemuck::Pod` and `bytemuck::Zeroable`. /// - When the "serde" feature is enabled, this type can be serialized and deserialized using serde. #[repr(transparent)] #[derive(Debug, Clone, Copy, PartialEq, PartialOrd)] #[cfg_attr(feature = "bytemuck", derive(bytemuck::Pod, bytemuck::Zeroable))] #[cfg_attr(feature = "serde", derive(serde::Serialize, serde::Deserialize))] pub struct PositiveFloat(f64);This documentation will help users understand the available features and their effects on the
PositiveFloat
type..github/workflows/rust.yml (1)
37-40
: Approve the new step with suggestions for improvement.The addition of the step to download the Prophet Stan model is a good improvement to ensure the necessary dependencies are available for testing. However, consider the following enhancements:
- Provide more specific information in the comment about which example requires this model.
- Add error handling or a verification step to ensure the download was successful.
- Consider caching the downloaded model to speed up future workflow runs.
Here's a suggested improvement:
- name: Download Prophet Stan model # Download the Prophet Stan model required for the cmdstan optimizer example run: | just download-prophet-stan-model if [ $? -ne 0 ]; then echo "Failed to download Prophet Stan model" exit 1 fi - name: Cache Prophet Stan model uses: actions/cache@v2 with: path: path/to/prophet/stan/model key: ${{ runner.os }}-prophet-stan-modelReplace
path/to/prophet/stan/model
with the actual path where the model is downloaded.crates/augurs-prophet/Cargo.toml (1)
36-44
: New features are well-defined and align with PR objectives.The addition of
cmdstan
,compile-cmdstan
,download
, andserde
features with their respective dependencies is appropriate and provides flexibility for users.The
internal-ignore-cmdstan-failure
feature is well-documented, but its potential for misuse is concerning.Consider renaming
internal-ignore-cmdstan-failure
tointernal-ignore-cmdstan-failure-dev-only
to further emphasize its intended use case and potential risks.Cargo.toml (1)
48-48
: Consider using a more flexible version specifier for serde_json.The addition of
serde_json
is appropriate and aligns with the PR objectives. However, the current version specifier "1.0.128" is very specific. In Rust, it's common to use more flexible version specifiers to allow for compatible updates.Consider changing the version specifier to
"1.0"
or"^1.0.128"
. This will allow for minor version updates that maintain backwards compatibility:-serde_json = "1.0.128" +serde_json = "1.0"or
-serde_json = "1.0.128" +serde_json = "^1.0.128"examples/forecasting/examples/prophet_cmdstan.rs (4)
1-11
: Enhance documentation with additional context and feature information.The documentation provides clear instructions for running the example. However, consider adding the following improvements:
- Brief explanation of the example's purpose (demonstrating Prophet model with CmdStan optimizer).
- Mention the
compile-cmdstan
feature and its significance.- Note about the increase in executable size when using this feature.
This additional information will provide users with a more comprehensive understanding of the example and its implications.
12-39
: LGTM! Consider extracting data preparation into a separate function.The imports and data preparation look good. The
TrainingData
is correctly initialized with timestamps, values, and additional regressors.To improve readability, consider extracting the data preparation logic into a separate function. This would make the
main
function cleaner and easier to understand at a glance.Here's a suggested refactor:
fn prepare_training_data() -> Result<TrainingData, Box<dyn std::error::Error>> { let ds = vec![ 1704067200, 1704871384, 1705675569, 1706479753, 1707283938, 1708088123, 1708892307, 1709696492, 1710500676, 1711304861, 1712109046, 1712913230, 1713717415, ]; let y = vec![ 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0, 12.0, 13.0, ]; Ok(TrainingData::new(ds, y)? .with_regressors(HashMap::from([ ( "foo".to_string(), vec![1.0, 2.0, 1.0, 2.0, 1.0, 2.0, 1.0, 2.0, 1.0, 2.0, 1.0, 2.0, 1.0], ), ( "bar".to_string(), vec![4.0, 5.0, 6.0, 4.0, 5.0, 6.0, 4.0, 5.0, 6.0, 4.0, 5.0, 6.0, 4.0], ), ]))?) } // In main(): let data = prepare_training_data()?;
41-43
: Enhance comments for CmdstanOptimizer initialization options.The initialization of CmdstanOptimizer looks good. To improve clarity, consider expanding the comment about the embedded version to explain when to use each approach. This will help users understand the trade-offs between the two options.
Here's a suggested improvement for the comments:
// Initialize CmdstanOptimizer with a path to the precompiled Prophet model binary. // Use this approach when you want to manage the Stan model separately or when you need more control over the model version. let cmdstan = CmdstanOptimizer::with_prophet_path("prophet_stan_model/prophet_model.bin")?; // Alternatively, use the embedded version when you want to include the Stan model in your binary. // This increases the binary size but simplifies deployment. Requires the `compile-cmdstan` feature to be enabled. // let cmdstan = CmdstanOptimizer::new_embedded();
45-49
: LGTM! Consider adding error handling for regressor addition.The Prophet model setup and fitting look good. The regressors are added correctly, and the fit method is called with default options.
To improve robustness, consider adding error handling when adding regressors. Although it's unlikely to fail in this context, it's a good practice to handle potential errors, especially in example code.
Here's a suggested improvement:
let mut prophet = Prophet::new(Default::default(), cmdstan); prophet.add_regressor("foo".to_string(), Regressor::additive()).map_err(|e| format!("Failed to add 'foo' regressor: {}", e))?; prophet.add_regressor("bar".to_string(), Regressor::additive()).map_err(|e| format!("Failed to add 'bar' regressor: {}", e))?; prophet.fit(data, Default::default())?;crates/augurs-prophet/README.md (4)
6-44
: Great addition of the "Example" section!The new example provides clear instructions and a comprehensive code snippet that demonstrates key functionalities of the
augurs-prophet
crate. It's well-structured and follows Rust best practices.Consider adding a brief explanation of the output format after line 41. This would help users understand what to expect from the
println!
statement.🧰 Tools
🪛 Markdownlint
11-11: null
Dollar signs used before commands without showing output(MD014, commands-show-output)
12-12: null
Dollar signs used before commands without showing output(MD014, commands-show-output)
58-63
: Improved clarity in the "Optimizers" sectionThe updates to this section provide more detailed information about the
Optimizer
andSampler
traits, as well as clear explanations of thecmdstan
module and its features. This information is crucial for users to understand how to use the library effectively.Consider adding a brief note about the performance implications or trade-offs of using the
compile-cmdstan
feature. This could help users make an informed decision about whether to enable this feature.Also applies to: 78-85
11-13
: Consider addressing Markdownlint warningsThe Markdownlint tool has flagged lines 11 and 12 for using dollar signs before commands without showing output (MD014). While not critical, addressing these could improve the README's adherence to Markdown standards.
You can address this by either:
- Removing the
$
signs from the beginning of the commands, or- Adding example output after each command.
For example:
```sh cargo install --bin download-stan-model --features download augurs-prophet download-stan-model<details> <summary>🧰 Tools</summary> <details> <summary>🪛 Markdownlint</summary><blockquote> 11-11: null Dollar signs used before commands without showing output (MD014, commands-show-output) --- 12-12: null Dollar signs used before commands without showing output (MD014, commands-show-output) </blockquote></details> </details> --- Line range hint `1-85`: **Excellent improvements to the README** The changes to this README significantly enhance the documentation of the `augurs-prophet` crate. The addition of the example section and the clarifications in the Optimizers section provide users with a much better understanding of how to use the crate and its features. Consider adding a "Quick Start" section at the beginning of the README, just after the initial description. This section could include: 1. A brief description of what the crate does 2. Installation instructions (including how to enable the `cmdstan` feature) 3. A minimal example of how to use the crate This would provide an even quicker entry point for users who want to get started immediately. <details> <summary>🧰 Tools</summary> <details> <summary>🪛 Markdownlint</summary><blockquote> 11-11: null Dollar signs used before commands without showing output (MD014, commands-show-output) --- 12-12: null Dollar signs used before commands without showing output (MD014, commands-show-output) </blockquote></details> </details> </blockquote></details> <details> <summary>crates/augurs-prophet/build.rs (1)</summary><blockquote> `75-77`: **Clarify the Usage of `internal-ignore-cmdstan-failure` Feature** The comment suggests that no one should use this feature, yet it affects the build behavior. Consider rephrasing the comment to clarify its intended purpose and usage: ```diff -// This is a complete hack but lets us get away with still using -// the `--all-features` flag of Cargo without everything failing -// if there isn't a Stan installation. -// Basically, if have this feature enabled, skip any failures in -// the build process and just create some empty files. -// This will cause things to fail at runtime if there isn't a Stan -// installation, but that's okay because no-one should ever use this -// feature. +// The `internal-ignore-cmdstan-failure` feature is used to bypass build failures +// when Stan is not installed, allowing us to use `--all-features` during development. +// It creates placeholder files to prevent build errors, but will cause runtime +// failures if Stan is required. This feature should only be used for internal +// development purposes and not in production.
This provides clearer guidance on when and why this feature should be utilized.
crates/augurs-prophet/prophet.stan (1)
117-126
: Ensuretrend_indicator
covers all possible casesThe
transformed parameters
block assignstrend
based ontrend_indicator
. To prevent unintended behavior, consider adding anelse
clause or validatetrend_indicator
to ensure it only takes values 0, 1, or 2.crates/augurs-prophet/src/optimizer.rs (1)
357-442
: Consider adding negative tests for serialization failuresTo strengthen the test coverage, consider adding tests that verify serialization failure cases, such as when mandatory fields are missing or contain invalid data.
crates/augurs-prophet/src/cmdstan.rs (1)
396-435
: Avoid panics by returning errors innew_embedded
In the
new_embedded
method, the code usesexpect
and could panic if creating the temporary directory or writing the Prophet model binary fails. Relying on panics for error handling is not recommended.Consider returning a
Result<Self, Error>
instead of panicking. This allows the calling code to handle errors gracefully.- pub fn new_embedded() -> Self { + pub fn new_embedded() -> Result<Self, Error> { static PROPHET_INSTALLATION: std::sync::LazyLock<ProphetInstallation> = std::sync::LazyLock::new(|| { - let dir = tempfile::tempdir().expect("could not create temporary directory"); + let dir = match tempfile::tempdir() { + Ok(dir) => dir, + Err(e) => return Err(Error::CreateTempDir(e)), + }; // Remaining code... }); - Self { + Ok(Self { prophet_installation: PROPHET_INSTALLATION.clone(), - } + }) }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (18)
- .github/workflows/rust.yml (1 hunks)
- .gitignore (1 hunks)
- Cargo.toml (1 hunks)
- crates/augurs-outlier/Cargo.toml (1 hunks)
- crates/augurs-prophet/.gitignore (1 hunks)
- crates/augurs-prophet/Cargo.toml (1 hunks)
- crates/augurs-prophet/README.md (3 hunks)
- crates/augurs-prophet/build.rs (1 hunks)
- crates/augurs-prophet/prophet.stan (1 hunks)
- crates/augurs-prophet/src/bin/main.rs (1 hunks)
- crates/augurs-prophet/src/cmdstan.rs (1 hunks)
- crates/augurs-prophet/src/lib.rs (1 hunks)
- crates/augurs-prophet/src/optimizer.rs (6 hunks)
- crates/augurs-prophet/src/positive_float.rs (1 hunks)
- crates/augurs/Cargo.toml (1 hunks)
- examples/forecasting/Cargo.toml (1 hunks)
- examples/forecasting/examples/prophet_cmdstan.rs (1 hunks)
- justfile (2 hunks)
✅ Files skipped from review due to trivial changes (2)
- .gitignore
- crates/augurs-prophet/.gitignore
🧰 Additional context used
🪛 Markdownlint
crates/augurs-prophet/README.md
11-11: null
Dollar signs used before commands without showing output(MD014, commands-show-output)
12-12: null
Dollar signs used before commands without showing output(MD014, commands-show-output)
🔇 Additional comments (24)
examples/forecasting/Cargo.toml (1)
14-14
: LGTM! Consider adding explanatory comments.The addition of "prophet" and "prophet-cmdstan" features to the
augurs
dependency aligns with the PR objectives. This change enables the use of the new cmdstan-based optimizer for augurs-prophet in this example.However, please consider the following points:
- Adding these features might increase compilation time and binary size, especially the "prophet-cmdstan" feature which likely involves external dependencies.
- It would be helpful to add a comment explaining the purpose of these new features and any requirements (like setting
STAN_PATH
) for users who might want to run this example.Could you confirm if this example is specifically intended to demonstrate the cmdstan functionality? If not, you might want to create a separate example for cmdstan or make it optional in this one.
To verify the impact of these changes, you can run the following script:
✅ Verification successful
Verification Successful
The
prophet_cmdstan.rs
example in theexamples/forecasting/examples/
directory confirms that this example specifically demonstrates the cmdstan functionality. The review comment is verified.🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Check if there's a new example demonstrating cmdstan functionality # Test: Search for new prophet or cmdstan related examples rg --type rust 'prophet.*cmdstan' examplesLength of output: 382
crates/augurs-outlier/Cargo.toml (1)
29-29
: LGTM! Good practice for workspace dependency management.The change from
serde_json = "1.0.128"
toserde_json.workspace = true
in the[dev-dependencies]
section is a positive improvement. This modification:
- Centralizes version management in the workspace root.
- Ensures consistency of the
serde_json
version across all crates in the workspace.- Simplifies future updates of this dependency.
This change aligns well with Rust's workspace best practices and will make maintenance easier in the long run.
justfile (2)
17-27
: Well-structured comprehensive test command added.The new
test-all
command is a valuable addition, providing a way to run a comprehensive suite of tests while appropriately excluding certain packages and features. The comment explaining the exclusion ofiai
tests is particularly helpful.
Line range hint
29-36
: Consistent updates to documentation commands.The modifications to the
doctest
anddoc
commands, excludingaugurs-js
andpyaugurs
, maintain consistency with the testing commands. The added comment provides valuable context for these exclusions.crates/augurs/Cargo.toml (1)
38-39
: New features for Prophet cmdstan integration look good.The addition of
prophet-cmdstan
andprophet-compile-cmdstan
features aligns well with the PR objectives. These features correctly depend on the corresponding features in theaugurs-prophet
crate, allowing users to opt-in to the cmdstan-based optimizer functionality.A few points to consider:
- The feature names are clear and follow the existing naming convention in the file.
- The dependencies are correctly specified, pointing to the
augurs-prophet
crate.- These optional features provide flexibility for users who need the cmdstan functionality without affecting those who don't.
To ensure these new features are properly integrated, let's verify the
augurs-prophet
crate's Cargo.toml file:✅ Verification successful
Verified the presence of
cmdstan
andcompile-cmdstan
features incrates/augurs-prophet/Cargo.toml
.Both features are correctly defined and align with the modifications in
crates/augurs/Cargo.toml
.🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Verify the presence of cmdstan and compile-cmdstan features in augurs-prophet crate # Test: Check for cmdstan and compile-cmdstan features in augurs-prophet/Cargo.toml # Expect: Both features should be present rg --type toml 'cmdstan|compile-cmdstan' crates/augurs-prophet/Cargo.tomlLength of output: 262
.github/workflows/rust.yml (1)
42-42
: Approve the test command change with a request for clarification.The change from
just test
tojust test-all
suggests a more comprehensive testing approach, which is good for ensuring thorough coverage. This aligns with the PR objectives mentioning the use of--all-features
.Could you please clarify:
- Does
just test-all
internally use the--all-features
flag?- How does this change affect the build time and test coverage?
- Are there any specific reasons why we're moving to test all features by default in CI?
To verify the changes in the testing command, you can run the following script:
This script will help us understand the differences between the
test
andtest-all
commands and verify if--all-features
is being used.crates/augurs-prophet/Cargo.toml (3)
18-20
: New dependencies look good and align with PR objectives.The addition of
serde
,serde_json
,tempfile
,ureq
, andzip
as optional dependencies is appropriate for the cmdstan-based optimization feature. Marking them as optional keeps the base package lightweight, which is a good practice.Also applies to: 22-23
26-26
: Dev-dependency addition is appropriate.Adding the
augurs
crate as a dev-dependency is a good choice for testing and examples. Using the workspace version ensures consistency across the project.
31-32
: Build-dependency addition is appropriate.Adding
tempfile
as an optional build dependency aligns with the PR objectives, likely needed for compiling the cmdstan model during the build process. The explicit version specification is good for reproducibility.Cargo.toml (2)
Line range hint
1-1
: Summary of changes in Cargo.tomlThe changes to this file are minor but important:
- Addition of
serde_json
dependency, which aligns with the PR objectives.- Correction of an author's email address for proper attribution.
These changes improve dependency management and maintain accurate project metadata. The rest of the file structure and configurations appear to be standard for a Rust workspace.
Line range hint
1-1
: Verify the author's email address correction.The correction of the author's email address is a good catch. However, this change is not visible in the provided code snippet.
Please run the following script to verify the email address correction:
✅ Verification successful
Please run the following script to further verify the author's email address correction:
Author's email address has been correctly updated.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Verify the author's email address correction in Cargo.toml # Test: Check if the corrected email address is present grep -n '[email protected]>' Cargo.toml # Expected output: A line number followed by the corrected email address # If no output is produced, the correction may not have been appliedLength of output: 45
Script:
# Description: Verify the author's email address correction in Cargo.toml without trailing '>' # Test: Check if the corrected email address is present without '>' grep -n '[email protected]' Cargo.toml # Expected output: A line number followed by the corrected email address # If no output is produced, the correction may not have been applied or is formatted differentlyLength of output: 83
examples/forecasting/examples/prophet_cmdstan.rs (1)
1-56
: Overall, good example with room for minor improvements.This example effectively demonstrates the usage of the Prophet model with the CmdStan optimizer. The code is well-structured and covers the essential steps: data preparation, model initialization, fitting, and prediction.
Key points for improvement:
- Enhance documentation with more context about the example's purpose and feature implications.
- Extract data preparation into a separate function for better readability.
- Improve comments around CmdstanOptimizer initialization options.
- Add error handling for regressor addition and prediction steps.
- Provide more detailed and informative output of predictions.
Implementing these suggestions will make the example more robust, informative, and easier to understand for users of the
augurs-prophet
crate.crates/augurs-prophet/prophet.stan (6)
7-27
: Implementation ofget_changepoint_matrix
appears correctThe
get_changepoint_matrix
function correctly constructs the changepoint indicator matrixA
. The implementation properly handles the indexing and updatesA
as intended.
31-46
:logistic_gamma
function is correctly calculating adjusted offsetsThe
logistic_gamma
function accurately computes the adjusted offsets for piecewise continuity in the logistic trend model. The use ofappend_row
andcumulative_sum
is appropriate for generatingk_s
.
48-62
:logistic_trend
function is correctly implementedThe
logistic_trend
function effectively calculates the logistic trend using the parameters and the changepoint matrixA
. The dimensionality of operations aligns correctly.
66-75
:linear_trend
function correctly computes the linear trendThe
linear_trend
function correctly models the linear trend component by incorporating trend adjustments throughdelta
and the changepoint matrixA
.
79-84
:flat_trend
function is correctly generating a constant trendThe
flat_trend
function appropriately returns a constant vectorm
of lengthT
, representing a flat trend.
87-101
: Data block variables are properly definedThe variables in the
data
block are correctly declared with appropriate types and constraints. This sets a solid foundation for the model.crates/augurs-prophet/src/optimizer.rs (4)
30-31
: Conditional derivation ofserde::Serialize
andserde::Deserialize
The use of
#[cfg_attr(feature = "serde", derive(serde::Serialize, serde::Deserialize))]
forInitialParams
is correct and ensures that serialization traits are only derived when theserde
feature is enabled.
152-167
: Ensure all fields are serialized forData
structThe custom serialization for
Data
correctly serializes all fields, including the customX
field. Ensure that any new fields added toData
in the future are also included in theserialize_struct
call to maintain consistency.
182-191
: Correct implementation offmt::Display
forAlgorithm
The
fmt::Display
implementation forAlgorithm
accurately maps each variant to its lowercase string representation, which enhances usability when converting the enum to a string.
366-370
: Clarify construction ofsigmas
withPositiveFloat
In the test, the
sigmas
vector is constructed usingtry_into().unwrap()
. While acceptable in tests, usingunwrap()
can cause panics if the conversion fails. Ensure that the values are guaranteed to be positive to prevent panics.crates/augurs-prophet/src/cmdstan.rs (2)
417-423
: Set executable permissions on non-Unix platformsCurrently, the code sets executable permissions for the Prophet binary only on Unix-like systems using
std::os::unix::fs::PermissionsExt
. This means that on Windows, the binary might lack the necessary executable permissions.Check if the Prophet binary requires executable permissions to run on Windows and adjust the code accordingly. You might need to handle permissions using
std::os::windows::fs::PermissionsExt
on Windows platforms.
271-276
: Handle potential parsing errors when reading outputIn the
parse_output
method, if the data row contains unexpected values or formatting issues, parsing withval.parse()
might fail, leading to a runtime error.Ensure that the code gracefully handles parsing errors. You could collect parsing results and report any issues with specific parameters.
let libtbb_path = stan_path.join("lib/libtbb.so.12"); | ||
let dest_libtbb_path = build_dir.join("libtbb.so.12"); | ||
std::fs::copy(&libtbb_path, &dest_libtbb_path)?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Handle Dynamic Library Version Flexibly
Hardcoding the library name libtbb.so.12
may lead to compatibility issues if the version changes.
Consider modifying the code to dynamically find the libtbb
library:
-
Option 1: Use glob patterns to match any version of the shared library.
use glob::glob; let libtbb_pattern = stan_path.join("lib/libtbb.so.*"); let libtbb_path = glob(&libtbb_pattern.to_string_lossy())? .filter_map(Result::ok) .next() .ok_or("libtbb.so not found")?;
-
Option 2: Allow the library name to be configurable or document the required version explicitly.
This ensures that your build script adapts to different versions of the libtbb
library.
if let Ok(ld_library_path) = std::env::var("LD_LIBRARY_PATH") { | ||
let path = format!("{}:{}", ld_library_path, self.lib_dir.display()); | ||
cmd.env("LD_LIBRARY_PATH", path); | ||
} | ||
// Otherwise, just use our lib_dir. | ||
else { | ||
cmd.env("LD_LIBRARY_PATH", &self.lib_dir); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Handle LD_LIBRARY_PATH
in a cross-platform manner
The current implementation sets the LD_LIBRARY_PATH
environment variable, which is specific to Unix-like systems. This might cause issues on non-Unix platforms like Windows, where the equivalent is the PATH
environment variable.
Consider using the dylib
crate or conditional compilation to handle library paths in a platform-independent way. This ensures that the library directories are correctly set regardless of the operating system.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (2)
crates/augurs-prophet/src/optimizer.rs (2)
127-173
: LGTM: Custom serialization for Data structThe custom serialization implementation for the
Data
struct is well-implemented and correctly handles the serialization of theX
field as a nested array. TheXSerializer
is a good approach to flatten theX
vector into the required format.One minor suggestion:
Consider extracting theXSerializer
into a separate struct outside of theserialize
method for better code organization and potential reuse.
365-446
: LGTM with minor suggestion: serialize_data testThe
serialize_data
test is well-implemented and thoroughly verifies the serialization of theData
struct, including the correct nested array serialization of theX
field.Minor suggestion:
There's a small inconsistency in the test data.S
is set to 2, butt_change
has 3 elements. Consider adjusting one of these to match:S: 3, t_change: vec![0.0, 0.0, 0.0],Or:
S: 2, t_change: vec![0.0, 0.0],This will ensure that the test data is internally consistent.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (1)
- crates/augurs-prophet/src/optimizer.rs (6 hunks)
🧰 Additional context used
🔇 Additional comments (3)
crates/augurs-prophet/src/optimizer.rs (3)
30-30
: LGTM: Conditional serde derive for InitialParamsThe addition of conditional serde derive for
InitialParams
is a good practice. It allows the struct to be serialized when the "serde" feature is enabled, providing flexibility for users who need serialization capabilities.
186-195
: LGTM: Display implementation for Algorithm enumThe
Display
implementation for theAlgorithm
enum is well-implemented. It provides a lowercase string representation of the algorithm names, which is useful for generating human-readable output or interfacing with systems that expect lowercase algorithm names.
55-84
: 🛠️ Refactor suggestionConsider using
serde_repr
or string serialization for TrendIndicatorWhile the current implementation is functional, it can be error-prone if new variants are added or the order changes. As suggested in a previous review, consider using the
serde_repr
crate for more maintainable serialization:use serde_repr::*; #[derive(Serialize_repr, Deserialize_repr)] #[repr(u8)] pub enum TrendIndicator { Linear = 0, Logistic = 1, Flat = 2, }Alternatively, you could serialize the enum as strings for better readability:
#[derive(Serialize, Deserialize)] pub enum TrendIndicator { #[serde(rename = "linear")] Linear, #[serde(rename = "logistic")] Logistic, #[serde(rename = "flat")] Flat, }These approaches would be more robust and easier to maintain.
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (2)
crates/augurs-prophet/src/optimizer.rs (2)
127-175
: LGTM: Custom serialization for Data structThe custom serialization for the
Data
struct, including theXSerializer
, is well-implemented and correctly handles the specific format requirements. The division by zero check inXSerializer
is a good safety measure.One minor suggestion:
Consider using
const
for the number of fields in theData
struct:const DATA_FIELD_COUNT: usize = 13; let mut s = serializer.serialize_struct("Data", DATA_FIELD_COUNT)?;This would make it easier to maintain if the number of fields changes in the future.
366-448
: LGTM: Test for Data serializationThe
serialize_data
test is well-structured and correctly verifies the serialization of theData
struct. The use ofpretty_assertions
is a good choice for clear error messages.One suggestion to improve the test:
Consider parameterizing the test with different
Data
instances to cover more cases, especially edge cases. For example:#[test] fn serialize_data_parameterized() { let test_cases = vec![ (Data { /* ... */ }, r#"{ /* expected JSON */ }"#), // Add more test cases here ]; for (data, expected_json) in test_cases { let serialized = serde_json::to_string_pretty(&data).unwrap(); pretty_assertions::assert_eq!(serialized, expected_json); } }This approach would make it easier to add more test cases in the future and ensure comprehensive coverage of the serialization logic.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (1)
- crates/augurs-prophet/src/optimizer.rs (6 hunks)
🧰 Additional context used
🔇 Additional comments (4)
crates/augurs-prophet/src/optimizer.rs (4)
30-30
: LGTM: Conditional serde derive for InitialParamsThe addition of conditional serde derive for
InitialParams
is correct and follows best practices for optional feature support.
188-197
: LGTM: Display implementation for Algorithm enumThe
Display
implementation for theAlgorithm
enum is correct and provides a useful lowercase string representation of each variant.
Line range hint
1-448
: Overall assessment: Well-implemented serialization supportThe changes to
crates/augurs-prophet/src/optimizer.rs
successfully add serialization support to key structures and improve the overall code organization. The custom serialization for theData
struct is particularly well-implemented, addressing the specific requirements for theX
field.Key points:
- Conditional serde derive for
InitialParams
is correct.- Custom serialization for
TrendIndicator
works but could be simplified usingserde_repr
.Data
struct serialization is well-implemented with proper handling of theX
field.Display
implementation forAlgorithm
enum is correct and useful.- The new test for
Data
serialization is comprehensive but could be expanded for more coverage.These changes significantly enhance the interoperability of the
augurs-prophet
crate with external systems that expect specific JSON formats.
55-84
: 🛠️ Refactor suggestionConsider using serde_repr for enum serialization
The custom serialization for
TrendIndicator
is correct, but it can be simplified using theserde_repr
crate. This would reduce boilerplate and make the code more maintainable.Consider replacing the custom implementations with:
use serde_repr::{Serialize_repr, Deserialize_repr}; #[cfg_attr(feature = "serde", derive(Serialize_repr, Deserialize_repr))] #[repr(u8)] pub enum TrendIndicator { Linear = 0, Logistic = 1, Flat = 2, }Don't forget to add
serde_repr
to your dependencies inCargo.toml
.
This almost certainly won't work as-is on Windows but I have no way of testing that right now, and don't plan on using it on Windows any time soon. Contributions to fix that would be welcome! |
This commit adds an implementation of
Optimizer
to theaugurs-prophet
crate that usescmdstan
to optimize or sample from the Prophet model, similar to the Prophet Python package.The
cmdstan
feature is disabled by default since it requires external dependencies and may not work well on all platforms.The
compile-cmdstan
feature can be enabled to build the Prophet model at build time. This requires a working installation ofstan
and setting theSTAN_PATH
environment variable to the path to thestan
installation. This will embed the Prophet model binary and the libtbb dynamic library into the final executable, which will increase the size of the final executable by about 2MB, but the final binary won't require any additional dependencies.There's an example in the
examples
directory which fits and predicts using cmdstan. It can be run with something like:(replacing the
STAN_PATH
value with the path to your Stan installation), or by runningSummary by CodeRabbit
Release Notes
New Features
augurs
package to support Prophet functionalities.augurs-prophet
crate with new dependencies and features.Bug Fixes
Documentation
Chores