Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add RecoveryTester feature #341

Merged
merged 3 commits into from
Mar 15, 2024
Merged

Add RecoveryTester feature #341

merged 3 commits into from
Mar 15, 2024

Conversation

sebastianburckhardt
Copy link
Member

This PR adds a new feature for testing the recovery of partitions without modifying any files in the storage account.

The feature is available in the form of a new option for the PartitionManagement setting:

   "PartitionManagement": "RecoveryTester",
   "PartitionManagementParameters": "7",

(specifying the partition is optional, if none is specified, all partitions are tested).

As part of this PR, I also

  • refactored how partition managers are selected based on the PartitionManagement setting. This is in anticipation of adding a static partition manager setting, to replace the 'Scripted' setting.
  • removed the 'Scripted' setting since this was meant for internal development only and I have not been using it and it for years, so it does not make sense to maintain that code.

Copy link
Member

@davidmrdavid davidmrdavid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few first impressions.
Also - did you get a chance to test that read-only mode is in fact leaving a storage account as-is? We may want to try it out on a throwaway account first

Comment on lines +94 to +98
/// <summary>
/// Additional parameters for the partition management, if necessary
/// </summary>
public string PartitionManagementParameters { get; set; } = null;

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From what I gather, this setting is rather dynamic, is it not? It can specify the partition ID if using the RecoverTester, but in the future it could specify other stuff (that's why the type is string), right?

I would prefer a more constrained/type-safe approach here, as I think parsing strings can get hairy over time. Could be create a proper options object?

Copy link
Member Author

@sebastianburckhardt sebastianburckhardt Jan 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, this is anticipating future, alternate partition managers where the parameter can be something much different (e.g. it could be an entire json object).

Could be create a proper options object?

Don't mind the idea but I am totally unsure whether/how that can work in the context of all the stuff the functions host does with host.json (e.g. magic environment variable settings and the like), and the stuff we are doing (copying parameters from AF specified parameters to Netherite parameters). So I was just taking the easy route.
If you have experience with it, let me know.

Copy link
Member

@davidmrdavid davidmrdavid Jan 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gotcha. I do have some experience here (but not extensive) and I wanna say that the parsing will of host.json is an issue for the DF Extension, so I think that's a problem we can solve there instead of here. In general, it's not that magical :) .

In this case, yeah I think the ""right thing to do"" is to create a partitionManagement options class or interface with a subclass/implementation specific for the recoveryTester. That should suffice for us at this layer. I can solve the problem of parsing at the DF Extension layer :) . I can also commit to do a quick parsing test before merging

Copy link
Member Author

@sebastianburckhardt sebastianburckhardt Jan 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it. But for the sake of using our time wisely, I think it is still an o.k. solution to just use a string here. Can be refactored later to provide a more general solution if we really think it is worth the effort.

Comment on lines 26 to 30
/// <summary>
/// Follow a predefined partition management script. This is meant to be used for testing and benchmarking scenarios.
/// This was an internal feature and is no longer supported.
/// </summary>
Scripted,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I'm curious if we can mark this as obsolete to prevent future usage.

Suggested change
/// <summary>
/// Follow a predefined partition management script. This is meant to be used for testing and benchmarking scenarios.
/// This was an internal feature and is no longer supported.
/// </summary>
Scripted,
/// <summary>
/// Follow a predefined partition management script. This is meant to be used for testing and benchmarking scenarios.
/// This was an internal feature and is no longer supported.
/// </summary>
[Obsolete]
Scripted,

Copy link
Member Author

@sebastianburckhardt sebastianburckhardt Jan 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

marking it as obsolete may violates major-minor version change rules. I was trying to stay below the radar of the version-police :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For this one, I feel like it wouldn't trigger the semVer police since this just raises a warning, not a functional failure. But I won't insist / don't feel strongly.

@sebastianburckhardt
Copy link
Member Author

Also - did you get a chance to test that read-only mode is in fact leaving a storage account as-is?

I did some testing but did not actually look at storage afterwards to confirm it is unmodified. Can do that.

# Conflicts:
#	src/DurableTask.Netherite/StorageLayer/Faster/AzureBlobs/BlobManager.cs
@sebastianburckhardt sebastianburckhardt changed the base branch from dev to main February 29, 2024 18:36
@sebastianburckhardt
Copy link
Member Author

@davidmrdavid, are you o.k. with merging this PR now? I think the recovery feature has proved useful and there seems little risk with putting this in the main branch.

@sebastianburckhardt sebastianburckhardt added this to the 1.4.3 milestone Mar 5, 2024
@sebastianburckhardt sebastianburckhardt merged commit 37f49ad into main Mar 15, 2024
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants