SOLR-16957: Test user managed cluster with a twist! #1875
Conversation
Cool. I don't see the point in testing replication between two isolated SolrCloud nodes, though; is that even supported? Are you thinking about some kind of use case where you pull the index from a cloud cluster to an outside cluster for hot-standby purposes?
# Not totally sure why this didn't load its data, but it works for our needs!
run curl 'http://localhost:7574/solr/techproducts/select?q=*:*'
assert_output --partial '"numFound":0'
I'd expect this to return 32 as well. Could we perhaps issue a deleteByQuery request to empty the index before replicating?
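A minimal sketch of that, reusing the port and core from the snippet above (the delete-by-query JSON is standard Solr update syntax; this is not code from the PR):

# Empty the index first so '"numFound":0' is a deliberate state, not a startup accident.
run curl 'http://localhost:7574/solr/techproducts/update?commit=true' -H 'Content-Type: application/json' --data-binary '{"delete":{"query":"*:*"}}'
assert_output --partial '"status":0'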
# Wish we could block on fetchindex.. Does checking details help?
sleep 5
run curl 'http://localhost:7574/solr/techproducts/select?q=*:*'
assert_output --partial '"numFound":32'
Could we do a while loop instead of sleep perhaps? And another way to assert replication is to compare index generation/version, but probably not necessary here.
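A polling sketch along those lines (the attempt count and interval are made-up numbers, not values from the test):

# Poll for the replicated doc count instead of a fixed sleep; give up after ~30s.
for i in $(seq 1 30); do
  curl -s 'http://localhost:7574/solr/techproducts/select?q=*:*' | grep -q '"numFound":32' && break
  sleep 1
done
run curl 'http://localhost:7574/solr/techproducts/select?q=*:*'
assert_output --partial '"numFound":32'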
argh, no examples of a while.... I wonder... do we need bin/solr assert to be able to assert a doc count with a query? I like the timeout capability built into the assert..
I am still wishing we had a better bin/solr assert ;-). The curls are all over the place, and the timeout would be nice...
I'm thinking that if having two standalone Solrs, each with its own embedded ZK, works, then we have just eliminated the need for traditional "standalone" Solr, while keeping an easy upgrade path for all the folks who want to continue to have user-managed index replication. We just change "bin/solr start" to do what today you enable with "bin/solr start -c", and everything continues to work. Except now, everywhere we make a SolrCloud-versus-standalone decision, we only have SolrCloud. And all those tickets about "make X work in standalone Solr" are now obsolete...
Fair enough, but that sounds like a new JIRA issue, not part of this test improvement? I'm sceptical of planning a cluster with 6 nodes, each with its own source of truth in ZooKeeper. How would you update the schema of your collection? In standalone mode, the schema.xml file is replicated to the replica, but that will not work here since each Solr reads its schema from its own local ZK. So then you need to do… I'm more in favor of improving SolrCloud with replica modes to the point where there are no benefits to running standalone anymore.
You are quite right; I am probably conflating this with my other experiment to see how it works. If you are running a cluster with six nodes, then you probably SHOULD be using SolrCloud and proper ZK. I'll split this up, and we do need to figure out a path to eliminating the SolrCloud-versus-standalone divide...
@janhoy just to clarify: if I keep the non-ZooKeeper end-to-end test for replication, do you see that as valuable and worth merging? I'll split the ZooKeeper version of the test out into its own PR... I'm interested in playing with it a bit more...
Yea, not sure how much value it gives in addition to the replication handler tests in the test suite though? Can you comment on that?
I think we're going to see a lot more change in this area (adding basic auth, the potential changes around ZK), so I'm thinking this helps build confidence that we didn't break anything.... Does that seem like enough upside?
Not very convinced still 😉
;-) Okay. That's fair. I'll close this, and if we see value in the future we can reopen.
I hate having PRs that just hang out open for years in GitHub ;-)
… its configuration, and then it starts repeating data.
@gerlowskija here is a proof of concept of a user-managed cluster, based on our conversation last week!
…y want is solr cloud!
Change SOLR_PORT to LEADER_PORT, REPEATER_PORT, FOLLOWER_PORT.... Also, we could look up indexversion on the leader and then wait for it on the repeater instead of sleeping...
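A sketch of that indexversion wait, using the renamed port variables (command=indexversion is a real replication-handler API; the loop itself is hypothetical, not code from the PR):

# Read the Leader's index version once, then poll the Repeater until it matches.
LEADER_VERSION=$(curl -s "http://localhost:${LEADER_PORT}/solr/techproducts/replication?command=indexversion" | grep -o '"indexversion":[0-9]*')
for i in $(seq 1 30); do
  REPEATER_VERSION=$(curl -s "http://localhost:${REPEATER_PORT}/solr/techproducts/replication?command=indexversion" | grep -o '"indexversion":[0-9]*')
  [ -n "$LEADER_VERSION" ] && [ "$REPEATER_VERSION" = "$LEADER_VERSION" ] && break
  sleep 1
done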
run curl "http://localhost:${SOLR3_PORT}/solr/techproducts/select?q=*:*&rows=0" | ||
assert_output --partial '"numFound":46' | ||
|
# Now lets stop our replicator
repeater!
This PR had no visible activity in the past 60 days, labeling it as stale. Any new activity will remove the stale label. To attract more reviewers, please tag someone or notify the [email protected] mailing list. Thank you for your contribution!
This PR is now closed due to 60 days of inactivity after being marked as stale. Re-opening this PR is still possible, in which case it will be marked as active again.
Still working on how/when this type of integration test becomes part of Solr!
In PR #2783 we talk about various approaches to deploying Solr, from small to large. It would be good to actually test those deployment scenarios; this tries the first one out.
I am back on the path of wanting to get this in. In SOLR-17492 (and the PR #2783) we talk about how to run Solr. However, how do we actually KNOW that it works? We see a lot of bugs that come from specific combinations of auth, cluster shape, features, etc. While there may be more robust ways of testing these combinations, bats is one way that we have here today. Maybe we have a separate directory of them that gets run less frequently, but that validates the various deployment scenarios?
https://issues.apache.org/jira/browse/SOLR-16957
Description
BATS test for user-managed index replication. This is an end-to-end test, not a unit test for Bash scripts.
Solution
Fire up three independent Solrs, set up replication via APIs, trigger it, and see what happens.
We demonstrate starting up three independent Solr nodes in the Leader/Repeater/Follower pattern.
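Roughly, that startup is three plain (non-cloud) bin/solr invocations on distinct ports; the port numbers here are illustrative, not the test's actual values:

# Three independent user-managed Solr nodes.
bin/solr start -p 8983   # Leader
bin/solr start -p 7575   # Repeater
bin/solr start -p 7574   # Follower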
Then we create three separate 'techproducts' collections, uploading the same configset three separate times to demonstrate that there is no interconnection or shared config between them.
We then index some XML data on the Leader and check that it flows through the Repeater to the Follower.
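Concretely, both the indexing and the pull reduce to curl calls; a sketch assuming LEADER_PORT/REPEATER_PORT variables and the Solr 9 leaderUrl spelling of the fetchindex parameter:

# Index one of the stock example XML files on the Leader and commit.
curl "http://localhost:${LEADER_PORT}/solr/techproducts/update?commit=true" -H 'Content-Type: application/xml' --data-binary @example/exampledocs/hd.xml
# Ask the Repeater to pull the Leader's index immediately.
curl "http://localhost:${REPEATER_PORT}/solr/techproducts/replication?command=fetchindex&leaderUrl=http://localhost:${LEADER_PORT}/solr/techproducts"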
This is repeated for some more documents.
Lastly, we shut down the Repeater and demonstrate that the Follower still has all of its documents available for querying.
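Stopping a single node by port is standard bin/solr usage, roughly (the port matches the illustrative startup above):

# Stop only the Repeater; the Follower keeps serving its local copy of the index.
bin/solr stop -p 7575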
We delete the data on the Leader and subsequently bring the Repeater back up.
After restarting, the Repeater preserves all of the configuration that was done during the setup process, immediately copies over the now-empty 'techproducts' index, and we then see the Follower pick up that empty collection as well.