Add support for GCS storage to 'solrbackup' #301
Comments
Some questions this raises:
This commit adds first-pass support for exposing Solr's GcsBackupRepository through our operator configuration. This WIP support has a number of caveats and downsides:
- GCS backups eschew the "persistence" step that currently follows normal backups
- GCS backups are only included in Solr 8.9+, but there's no check for this currently
- operator logic currently assumes that exactly 1 type of backup config will be provided on a given solrcloud object (i.e. GCS backups and 'local' PV backups are mutually exclusive for a solrcloud)
- no automated tests have been added
- no documentation has been added, beyond the examples on issue apache#301
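To make the second and third caveats concrete, here is a minimal sketch of the kind of validation that is currently missing. It is written in Python purely for illustration and is not the operator's code; the function names, the `gcs`/`local` arguments, and the error strings are all hypothetical, and it assumes the Solr version is available as a plain "major.minor[.patch]" string.

```python
from typing import List, Optional, Tuple


def parse_solr_version(version: str) -> Tuple[int, int]:
    """Parse a plain 'major.minor[.patch]' string into (major, minor).

    Assumption: no suffix is attached to the minor component.
    """
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def validate_backup_options(
    solr_version: str,
    gcs: Optional[dict],
    local: Optional[dict],
) -> List[str]:
    """Return a list of validation errors (empty if the config is acceptable).

    'gcs' and 'local' are hypothetical stand-ins for the two kinds of
    backup configuration on a solrcloud object; None means "not set".
    """
    errors = []
    # Exactly one backup type may be configured per solrcloud.
    if (gcs is None) == (local is None):
        errors.append("exactly one of 'gcs' or 'local' backup options must be set")
    # GcsBackupRepository only ships with Solr 8.9 and later.
    if gcs is not None and parse_solr_version(solr_version) < (8, 9):
        errors.append("GCS backups require Solr 8.9 or later")
    return errors
```

Comparing `(major, minor)` tuples rather than raw strings keeps a version such as 8.10 from sorting below 8.9.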
I've attached a rough PR that shows how this could be done. Below are an example 'solrcloud' and an example 'solrbackup' that use the proposed functionality:
SolrCloud
The most noteworthy addition in this snippet is SolrBackup
(Note that there's no new configuration in 'solrbackup', just the removal of the 'persistence' section for gcs-backups.) I'm not wedded to these syntaxes by any means - just wanted to get some examples up here as a concrete starting point for discussion.
So I just want to make sure I understand correctly. The only thing the operator should do for these "native" backup options is to call the Solr API, right?
I'll need to think on this a bit more, but to me it sounds like the only real benefit would be to have the operator be able to do this on a schedule (and possibly delete old backups if necessary). So instead of facilitating the backup mechanism, it would just be in charge of managing the backups.
The more I type this out, the more I'm starting to like it. It would also allow the Solr operator, in the future, to do automatic-rollbacks if it detects failures in a Collection.
I think we could change the SolrBackup to do either "managed" or "remote" backups, and in the case of remote, let the user provide the
So in that case your example of the backupRestoreOptions in the SolrCloud object would be spot on, but the SolrBackup object would need the
That's what I'm proposing, yep - the operator wouldn't be doing any of the compression or relocation features for GCS that it currently supports for 'local' backups. It's "just" calling the Solr API. (Which, I'd contend, isn't "nothing". That still saves Ops folks from crafting their own solr.xml, from needing to learn Solr's backup and async-polling APIs, etc.)
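For a sense of what "just calling the Solr API" involves, here is a rough sketch of the two Collections API requests in play: an asynchronous BACKUP and the REQUESTSTATUS call used to poll it. It is in Python for illustration only and is not taken from the operator; the helper names, the base URL, and the repository name "gcs" are assumptions (the repository name is whatever was configured in solr.xml).

```python
from urllib.parse import urlencode


def backup_url(base_url, collection, backup_name, repository, location, async_id):
    """Build a Collections API BACKUP request that runs asynchronously."""
    params = {
        "action": "BACKUP",
        "name": backup_name,
        "collection": collection,
        "repository": repository,  # name of a repository defined in solr.xml
        "location": location,      # path within that repository
        "async": async_id,         # caller-chosen id used for polling later
    }
    return f"{base_url}/solr/admin/collections?{urlencode(params)}"


def request_status_url(base_url, async_id):
    """Build the REQUESTSTATUS request that polls an async operation."""
    params = {"action": "REQUESTSTATUS", "requestid": async_id}
    return f"{base_url}/solr/admin/collections?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(backup_url("http://solr:8983", "techproducts", "nightly", "gcs", "/backups", "backup-1"))
    print(request_status_url("http://solr:8983", "backup-1"))
```

A caller would issue the first request once, then repeat the second until Solr reports that the operation has completed or failed.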
Definitely agree. As I said above, I think there's value in this ticket alone. But GCS-support gets much more appealing as the operator's backup featureset generally gets more robust. I love the idea of a "backupschedule" entity that creates individual solrbackup objects in turn. I'll file an issue for that as a placeholder for discussion.
I think I agree with your suggestions here, but let me restate a few of them to make sure I'm understanding you correctly. There's a point or two I'm unclear on.
So taking those suggestions, our new example CRDs would look something like: solrcloud
solrbackup (gcs)
solrbackup (local)
Those could be off a bit based on what you meant about exposing the "managed" v. "remote" flag. Does this look closer to what you were thinking? @HoustonPutman
Two additional notes:
Currently the 'solrbackup' resource assumes that users want backups stored "locally" (i.e. stored on a PV or mounted drive using Solr's LocalFileSystemRepository). These local backups can then optionally be "persisted" - which involves compressing them and shipping them to a different PV or S3 bucket.
But no support exists for using other backup destinations that Solr supports natively, such as GCS (as of 8.9).
We should add this support. Users can configure their GCS-backup settings under solrcloud's backupRestoreOptions object, leaving the solrbackup object relatively untouched (except that any "persistence" section on 'solrbackup' would now be ignored, as we can only easily compress files that are stored locally).