-
Notifications
You must be signed in to change notification settings - Fork 21
Update PackageLagMonitor for AzureSearch #798
Conversation
@@ -0,0 +1,75 @@ | |||
using System; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
copyright header
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added
{ | ||
public string Name { get; set; } | ||
|
||
public long DocumentCount { get; set; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: only add the properties you use... the shape has been kind of fluid...
The LastCommitTimestamp
can't change though since it's already used by a runner
@@ -171,7 +170,7 @@ public Task<bool> ProcessPackageDetailsAsync(PackageDetailsCatalogLeaf leaf) | |||
if (shouldRetry) | |||
{ | |||
++retryCount; | |||
_logger.LogInformation("Waiting for {RetryTime} seconds before retrying {PackageId} {PackageVersion} against {SearchBaseUrl}", WaitBetweenPolls.TotalSeconds, packageId, packageVersion, instance.BaseQueryUrl); | |||
_logger.LogInformation("{ServiceType}: Waiting for {RetryTime} seconds before retrying {PackageId} {PackageVersion} against {SearchBaseUrl}", serviceTypeString, WaitBetweenPolls.TotalSeconds, packageId, packageVersion, instance.BaseQueryUrl); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can pass the enum as-is. The ToString()
will be called and result in this came thing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool. I'll do that instead.
case ServiceType.AzureSearch: | ||
return GetInstances(endpointUri: new Uri(regionInformation.BaseUrl), instanceCount: 1, regionInformation: regionInformation, serviceType: ServiceType.AzureSearch); | ||
default: | ||
throw new UnknownServiceTypeException(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can throw an existing exception like NotImplementated
with a message containing the service type to avoid the custom exception
|
||
var instances = GetInstances(cloudService.Uri, cloudService.InstanceCount, regionInformation); | ||
var instances = GetInstances(endpointUri: cloudService.Uri, instanceCount: cloudService.InstanceCount, regionInformation: regionInformation, serviceType: ServiceType.LuceneSearch); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these named params are a bit redundant. the parameters themselves are very descriptive in this case
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
return instances; | ||
return instances; | ||
case ServiceType.AzureSearch: | ||
return GetInstances(endpointUri: new Uri(regionInformation.BaseUrl), instanceCount: 1, regionInformation: regionInformation, serviceType: ServiceType.AzureSearch); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the code would be a little simpler if you made it construct an instance here directly instead of calling GetInstances
. GetInstances
does a lot of work to create multiple instances that's not necessary for the Azure Search case.
I would recommend:
- Moving the code inside the
Enumerable.Range(...).Select(...)
that generates an instance into its own function. - Replacing this call with a call to that function.
- Renaming
GetInstances
toGetLuceneInstances
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure what you mean here by simpler.
The switch for specifying port would still need to exist (as the new function would still need to be able to specify this), and the case for AzureSearch would have to manually wrap the return in a list (note the return type of the method).
I think it makes more sense to treat azure search instance as a "LuceneSearch" type with 1 instance (and admittedly no port) rather than trying to completely separate them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean, it's a little confusing that GetInstances
will always enumerate an enumerable with a single member in the AzureSearch case. The only thing that's really shared logic in that function is the construction of an instance (which, admittedly, has to check service type) so I was thinking you could move that out. Not a big deal though.
_logger.LogInformation("Lag {Timestamp}:{PackageId} {PackageVersion} SearchInstance:{Region}{Instance} Created: {CreatedLag} V3: {V3Lag}", timeStamp, packageId, packageVersion, instance.Region, instance.Index, createdDelay, v3Delay); | ||
_logger.LogInformation("LastReload:{LastReloadTimestamp} LastEdited:{LastEditedTimestamp} Created:{CreatedTimestamp} ", lastReloadTime, lastEdited, created); | ||
_logger.LogInformation("{ServiceType}: Lag {Timestamp}:{PackageId} {PackageVersion} SearchInstance:{Region}{Instance} Created: {CreatedLag} V3: {V3Lag}", serviceTypeString, timeStamp, packageId, packageVersion, instance.Region, instance.Index, createdDelay, v3Delay); | ||
_logger.LogInformation("{ServiceType}: LastReload:{LastReloadTimestamp} LastEdited:{LastEditedTimestamp} Created:{CreatedTimestamp} ", serviceTypeString, lastReloadTime, lastEdited, created); | ||
if (!isListOperation) | ||
{ | ||
_telemetryService.TrackPackageCreationLag(timeStamp, instance, packageId, packageVersion, createdDelay); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't you be piping the instance.ServiceType
to here and the other _telemetryService
call?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note the changes to TelemetryService already do this since we have access to the ServiceType on the instance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I see now!
|
||
namespace NuGet.Jobs.Monitoring.PackageLag | ||
{ | ||
public class UnknownServiceTypeException : Exception |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unused now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed.
@@ -77,19 +77,22 @@ public SearchInstanceRebooterFacts(ITestOutputHelper output) | |||
0, | |||
"http://localhost:801/search/diag", | |||
"http://localhost:801/query", | |||
_region), | |||
_region, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need some unit tests that somehow assert the flow of a) get /search/diag from Lucene -- use last re-open timestamp and b) get /search/diag from Azure Search -- use current time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whew ok. Added test for this.
@@ -48,7 +50,8 @@ public void TrackV3Lag(DateTimeOffset eventTime, Instance instance, string packa | |||
{ PackageVersion, packageVersion }, | |||
{ Region, instance.Region }, | |||
{ Subscription, _subscription }, | |||
{ InstanceIndex, instance.Index.ToString() } | |||
{ InstanceIndex, instance.Index.ToString() }, | |||
{ ServiceType, Enum.GetName(typeof(ServiceType), instance.ServiceType) } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can just do instance.ServiceType.ToString()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updasted
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
⌚️
{ | ||
var result = new SearchDiagnosticResponse() | ||
{ | ||
LastIndexReloadTime = DateTimeOffset.UtcNow, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should link to that issue you filed about reducing the poll frequency. Maybe make a long-ish comment explaining the situation and how this introduces an error of up to poll frequency (if that is indeed the case).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good call. I'll add a link here to the issue at least. And probably add a comment about why we do this.
|
||
private SearchDiagnosticResponse ConvertAzureSearchResponse(AzureSearchDiagnosticResponse azureSearchDiagnosticResponse) | ||
{ | ||
var result = new SearchDiagnosticResponse() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: you can drop the ()
if you use object initializer syntax
var result = new SearchDiagnosticResponse() | |
var result = new SearchDiagnosticResponse |
Same comment for line 240
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add unit tests as @joelverhagen requested and then ship it!
})))); | ||
var searchClient = new SearchServiceClient(_azureApiWrapper, httpClientMock.Object, _options, _logger); | ||
|
||
var luceneResponse = searchClient.GetIndexLastReloadTimeAsync(_luceneInstance, token).Result; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this test can be made async
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
var luceneResponse = searchClient.GetIndexLastReloadTimeAsync(_luceneInstance, token).Result; | ||
var azureResponse = searchClient.GetIndexLastReloadTimeAsync(_azureSearchInstance, token).Result; | ||
|
||
Assert.True(azureResponse - DateTimeOffset.UtcNow < new TimeSpan(hours: 0, minutes: 0, seconds: 10)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would be better to capture datetimeoffset.utcnow before and after GetIndexLastReloadTimeAsync
then use Assert.InRange
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
var azureResponse = searchClient.GetIndexLastReloadTimeAsync(_azureSearchInstance, token).Result; | ||
|
||
Assert.True(azureResponse - DateTimeOffset.UtcNow < new TimeSpan(hours: 0, minutes: 0, seconds: 10)); | ||
Assert.True(luceneResponse.Equals(new DateTimeOffset())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Assert.Equals
Maybe use an explicit timestamp that is not default(DateTimeOffset)
since this could coincidentally be right instead of do to the expected impl
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Address comments then
* Update instance generation to understand azureSearch type. Pipe serviceType through. * Support AzureSearchDiagResponse. Pipe ServiceType through to telemetry. * Fix tests. * Fixed committimestamp to be pessimistic. * Add test for commit time stamp. PR feedback. * Async test.
Adds support for specifying a ServiceType to configuration to allow monitoring of both CloudService based search and WebApp based search.
Updates telemetry dimensions so we can tell them apart.