Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CStringStore Unit tests write way too many logs #302

Closed
benwtrent opened this issue Nov 2, 2018 · 5 comments · Fixed by #318
Closed

CStringStore Unit tests write way too many logs #302

benwtrent opened this issue Nov 2, 2018 · 5 comments · Fixed by #318
Assignees

Comments

@benwtrent
Copy link
Member

CStringStore unit tests write thousands of logs like the following:

2018-11-02 15:49:24,988 UTC [3922] ERROR CBucketGatherer.cc@451 Sample start time 166605000 is not bucket aligned
2018-11-02 15:49:24,988 UTC [3922] ERROR CBucketGatherer.cc@451 Sample start time 166605000 is not bucket aligned
2018-11-02 15:49:24,988 UTC [3922] ERROR CBucketGatherer.cc@451 Sample start time 166605000 is not bucket aligned
2018-11-02 15:49:24,988 UTC [3922] ERROR CBucketGatherer.cc@451 Sample start time 166615000 is not bucket aligned
2018-11-02 15:49:24,988 UTC [3922] ERROR CBucketGatherer.cc@451 Sample start time 166615000 is not bucket aligned

We need to either fix the test or mute these particular logs when being ran from a unittest.

@benwtrent
Copy link
Member Author

Found the path that is causing this:

In all of the tests a non-zero bucket delay is added:

modelConfig.bucketResultsDelay(2);

This trips this conditional:

if (m_ModelConfig.bucketResultsDelay()) {
bucketLength /= 2;
}

Which cuts the bucketLength in half. Then the following start/ends are off by half a bucket length and when:

void CAnomalyDetector::sample(core_t::TTime startTime,
core_t::TTime endTime,
CResourceMonitor& resourceMonitor) {

for (core_t::TTime time = startTime; time < endTime; time += bucketLength) {
m_Model->sample(time, time + bucketLength, resourceMonitor);
}

is called, certain models validate the sample times

e.g CCountingModel::sample,

void CCountingModel::sample(core_t::TTime startTime,
core_t::TTime endTime,
CResourceMonitor& resourceMonitor) {
CDataGatherer& gatherer = this->dataGatherer();
m_ScheduledEventDescriptions.clear();
if (!gatherer.validateSampleTimes(startTime, endTime)) {
return;
}

Which calls:

if (!maths::CIntegerTools::aligned(startTime - m_BucketStart, this->bucketLength())) {
LOG_ERROR(<< "Sample start time " << startTime << " is not bucket aligned");
return false;
}

I can make this stop by having CStringStoreTest not set the bucketDelay and let it use the default of 0.

What say y'all @tveasey @droberts195 ?

@droberts195
Copy link
Contributor

bucketDelay corresponds to the config option result_finalization_window, and we still have that in our Java side AnalysisConfig objects. What happens in the test makes me think that if anyone ever set that to a non-zero value then we could get this problem in production.

We need to check the code in more detail, so don't make any changes today, but since we do little/no testing with non-zero result_finalization_windows I'm wondering if we should remove all trace of it from the product.

@droberts195
Copy link
Contributor

result_finalization_window is undocumented in our public analysis config documentation. Now we have our multi-bucket feature it's unlikely we'd ever want to resurrect the old Prelert attempt at multiple bucket lengths, and, if I remember correctly, result_finalization_window was designed to work with that.

Based on this issue there's clearly a bug that's triggered if this option is used. Since it's undocumented my preference would be to change the code to completely remove it in 6.6.

But before this is done, can you think of any reason to keep result_finalization_window aka bucketResultsDelay @tveasey and @dimitris-athanasiou?

@tveasey
Copy link
Contributor

tveasey commented Nov 15, 2018

This was also for implementing out-of-phase analysis to mitigate bucketing partially sampling important features in the time series.

However, I don't think that this is something we are going to prioritise reviving. I'm not sure it is worth the complexity and there is overlap the new multi-bucket functionality.

Removing this also means we don't need result serialisation and various other code. I'd be +1 for cleaning this up now, especially since it appears to not be working correctly.

@droberts195
Copy link
Contributor

Good point. If we get rid of result_finalization_window then we should get rid of overlapping_buckets at the same time.

I agree that they're less likely to be needed (and probably too complex to understand) now we have multi-bucket functionality.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants