[ML] Correct query times for model plot and forecast #327

Merged
merged 3 commits into from
Dec 5, 2018
19 changes: 12 additions & 7 deletions lib/api/CForecastRunner.cc
@@ -173,16 +173,21 @@ void CForecastRunner::forecastWorker() {
 }

 const TForecastModelWrapper& model = series.s_ToForecast.back();
-model_t::TDouble1VecDouble1VecPr support =
-    model_t::support(model.s_Feature);
-bool success = model.s_ForecastModel->forecast(
-    forecastJob.s_StartTime, forecastJob.forecastEnd(),
-    forecastJob.s_BoundsPercentile, support.first, support.second,
+model_t::EFeature feature{model.s_Feature};
+core_t::TTime bucketLength{model.s_ForecastModel->params().bucketLength()};
+core_t::TTime startTime{model_t::sampleTime(
+    feature, forecastJob.s_StartTime, bucketLength)};
+core_t::TTime endTime{model_t::sampleTime(

Would it be possible to fix it in CAnomalyJob::doForecast instead? CForecastRunner is just a dumb worker and should not have any important logic. CAnomalyJob::doForecast calls into the runner and sets startTime to m_LastResultsTime; it seems to me that adjusting it there does the same thing but is a bit cleaner. endTime is just relative to startTime anyway.

Maybe the same can be done for model plots.

Contributor Author

@tveasey tveasey Dec 4, 2018

The problem is this is feature specific. So it is tricky to push it higher up if the forecast is being run over a job with multiple detectors with different features.

I could create a wrapper which implements the logic in the model library. I can't directly push the feature into the forecast function (because it is in the maths library, which can't depend on EFeature). I could supply a callback to compute the offset start and end times and have this use the wrapper from the model library.
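
As a rough illustration of the callback idea, the sketch below keeps the feature enum out of the maths layer by passing a time-adjusting function into it. All names here (`maths_sketch`, `model_sketch`, `forecastInterval`, `makeAdjuster`, the two enum values) are hypothetical, and the offset rule is an assumption chosen for illustration; it is not the real `model_t::sampleTime`, whose body is not shown in this diff.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <utility>

using TTime = std::int64_t;

// "maths" layer: knows nothing about features (all names hypothetical).
namespace maths_sketch {
// Callback mapping a requested time to the time the model is queried at.
using TTimeAdjuster = std::function<TTime(TTime)>;

// Stand-in for the forecast entry point: it only reports the interval it
// would forecast over after applying the caller-supplied adjustment.
inline std::pair<TTime, TTime>
forecastInterval(TTime start, TTime end, const TTimeAdjuster& adjust) {
    return {adjust(start), adjust(end)};
}
}

// "model" layer: owns the feature enum and the offset logic.
namespace model_sketch {
// Hypothetical features, not the real model_t::EFeature values.
enum class EFeature { E_MidBucket, E_BucketStart };

// ASSUMPTION: illustrative rule only. One feature is sampled at the bucket
// midpoint and the other at the bucket start.
inline TTime sampleTime(EFeature feature, TTime time, TTime bucketLength) {
    assert(bucketLength > 0);
    switch (feature) {
    case EFeature::E_MidBucket:
        return time + bucketLength / 2;
    case EFeature::E_BucketStart:
        return time;
    }
    return time;
}

// Binds the feature and bucket length so the maths layer sees only a function.
inline maths_sketch::TTimeAdjuster makeAdjuster(EFeature feature, TTime bucketLength) {
    return [feature, bucketLength](TTime time) {
        return sampleTime(feature, time, bucketLength);
    };
}
}
```

The dependency points one way only: the model layer includes the maths layer's callback type, and the maths layer never names `EFeature`.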

Contributor Author

Alternatively, how about I add a function to model_t that actually runs the forecast and wraps up this detail? Given we only have the maths::CTimeSeriesModel here (for good reason), this seems like it might be the cleanest option.

@hendrikmuhs hendrikmuhs Dec 4, 2018

> The problem is this is feature specific. So it is tricky to push it higher up if the forecast is being run over a job with multiple detectors with different features.

OK, I see, and I agree that's too complicated.

What about inside of model.s_ForecastModel->forecast(...)?

Contributor Author

That hits the library dependency issue mentioned above. However, what if I add a CForecastDataSink::SForecastModelWrapper::forecast function which takes the forecast job? This could wrap all the functionality now in this loop.

Sounds good. I am also OK with keeping the current version, given that the alternatives are too complicated.

Contributor Author

I think I like the idea of wrapping this in SForecastModelWrapper. It seems more natural to me than in this loop, which is really just about scheduling. I'll make it and see how it looks.

Contributor Author

f980f26. Note that none of the members of SForecastModelWrapper are needed outside of the new forecast function, so I converted it to a class.
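
For readers without the commit to hand, here is a minimal sketch of what wrapping the time adjustment in a class with a single `forecast` entry point could look like. It is not the code from f980f26: the class name, the `SForecastJobSketch` fields other than `s_StartTime` and `forecastEnd()`, and the midpoint offset rule are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <utility>

using TTime = std::int64_t;

// Hypothetical features, not the real model_t::EFeature values.
enum class EFeatureSketch { E_MidBucket, E_BucketStart };

// ASSUMPTION: illustrative offset rule, not the real model_t::sampleTime.
inline TTime sampleTimeSketch(EFeatureSketch feature, TTime time, TTime bucketLength) {
    assert(bucketLength > 0);
    return feature == EFeatureSketch::E_MidBucket ? time + bucketLength / 2 : time;
}

// Reduced stand-in for the forecast job; s_Duration is an assumed field.
struct SForecastJobSketch {
    TTime s_StartTime;
    TTime s_Duration;
    TTime forecastEnd() const { return s_StartTime + s_Duration; }
};

// The wrapper keeps the feature and model private and exposes only forecast,
// so the worker loop is left with scheduling and nothing else.
class CForecastModelWrapperSketch {
public:
    // Stand-in for the underlying model's forecast call.
    using TForecastFunc = std::function<bool(TTime, TTime)>;

    CForecastModelWrapperSketch(EFeatureSketch feature, TTime bucketLength, TForecastFunc model)
        : m_Feature{feature}, m_BucketLength{bucketLength}, m_Model{std::move(model)} {}

    // Converts the job's times to feature-specific sample times, then
    // delegates to the wrapped model.
    bool forecast(const SForecastJobSketch& job) const {
        TTime startTime{sampleTimeSketch(m_Feature, job.s_StartTime, m_BucketLength)};
        TTime endTime{sampleTimeSketch(m_Feature, job.forecastEnd(), m_BucketLength)};
        return m_Model(startTime, endTime);
    }

private:
    EFeatureSketch m_Feature;
    TTime m_BucketLength;
    TForecastFunc m_Model;
};
```

With this shape, the caller needs only `wrapper.forecast(job)`; the feature never leaks into the scheduling loop.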

+    feature, forecastJob.forecastEnd(), bucketLength)};
+model_t::TDouble1VecDouble1VecPr support{model_t::support(feature)};
+bool success{model.s_ForecastModel->forecast(
+    startTime, endTime, forecastJob.s_BoundsPercentile,
+    support.first, support.second,
     boost::bind(&model::CForecastDataSink::push, &sink, _1,
-        model_t::print(model.s_Feature), series.s_PartitionFieldName,
+        model_t::print(feature), series.s_PartitionFieldName,
         series.s_PartitionFieldValue, series.s_ByFieldName,
         model.s_ByFieldValue, series.s_DetectorIndex),
-    message);
+    message)};
 series.s_ToForecast.pop_back();

 if (success == false) {
3 changes: 2 additions & 1 deletion lib/model/CModelDetailsView.cc
@@ -77,11 +77,12 @@ void CModelDetailsView::modelPlotForByFieldId(core_t::TTime time,

 if (this->isByFieldIdActive(byFieldId)) {
     const maths::CModel* model = this->model(feature, byFieldId);
-    if (!model) {
+    if (model == nullptr) {
         return;
     }

     std::size_t dimension = model_t::dimension(feature);
+    time = model_t::sampleTime(feature, time, model->params().bucketLength());

     maths_t::TDouble2VecWeightsAry weights(
         maths_t::CUnitWeights::unit<TDouble2Vec>(dimension));