#3311: Upgrade solr to v7.x #3416

Merged
merged 6 commits into from
May 23, 2024

murny
Contributor

@murny murny commented Mar 17, 2024

Context

I upgraded to 7.0 first to verify everything was working, and encountered the following error:

Can't load schema /opt/solr/server/solr/mycores/development/schema.xml: Setting default operator in schema (solrQueryParser/@defaultOperator) not supported

According to the Solr changelog:

The defaultOperator parameter in the schema is no longer supported. Use the q.op parameter instead. This option had been deprecated for several releases. See the section Standard Query Parser Parameters for more information.

So apparently this has been deprecated for a while.
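
As a rough illustration of the changelog's advice, the default operator can be passed per request through the q.op parameter instead of living in the schema. The sketch below only builds a query URL. The host, the "development" core name (taken from the error path above), and the build_select_url helper are assumptions for illustration, not Jupiter's actual code.

```python
from urllib.parse import urlencode


def build_select_url(base_url, core, query, default_operator="OR"):
    """Build a Solr /select URL that sets the default operator via q.op.

    base_url and core are hypothetical values supplied by the caller.
    """
    if default_operator not in ("AND", "OR"):
        raise ValueError("default_operator must be 'AND' or 'OR'")
    # q.op replaces the removed <solrQueryParser defaultOperator="..."/> setting.
    params = {"q": query, "q.op": default_operator}
    return f"{base_url}/{core}/select?{urlencode(params)}"


url = build_select_url(
    "http://localhost:8983/solr",  # assumed local Solr address
    "development",                 # assumed core name
    "title:solr upgrade",
)
print(url)
```

The same parameter could instead be set once as a request handler default in solrconfig.xml, which avoids touching every caller.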

I peeked at what Hyrax/Hyku/Samvera did, and they basically just removed the one setting that was using it:

samvera/hyku@17d9a63?diff=unified&w=1#diff-c92f6a2e77d4bf367e25e88d6c2c477ea502ca0c819531d927d4c863025e4a9eL337-L342

Where they just removed this configuration:

 <!-- SolrQueryParser configuration: defaultOperator="AND|OR" -->
 <solrQueryParser defaultOperator="OR"/>

This resolved the problems I had: Jupiter ran just fine on Solr v7.0 after this. I then bumped it up to the latest 7.x version (v7.7), and this also worked fine.
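
A quick way to check whether a schema.xml still carries the removed setting is to scan it for solrQueryParser elements that have a defaultOperator attribute. This is a minimal sketch using Python's standard XML parser. The find_default_operator helper and the sample schema are hypothetical, and a real schema.xml will contain far more than this.

```python
import xml.etree.ElementTree as ET


def find_default_operator(schema_xml):
    """Return the defaultOperator values found on <solrQueryParser> elements."""
    root = ET.fromstring(schema_xml)
    return [
        element.get("defaultOperator")
        for element in root.iter("solrQueryParser")
        if element.get("defaultOperator") is not None
    ]


# Hypothetical, heavily trimmed schema used only for this example.
sample_schema = """<schema name="example" version="1.6">
  <!-- SolrQueryParser configuration: defaultOperator="AND|OR" -->
  <solrQueryParser defaultOperator="OR"/>
</schema>"""

found = find_default_operator(sample_schema)
if found:
    print("Remove before upgrading to Solr 7:", found)
```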

So this might be easier than I thought.

According to https://solr.apache.org/guide/solr/latest/upgrade-notes/major-changes-in-solr-7.html#upgrade-planning
It sounds like our current indexed data for solr v6 will also work with v7. But we will eventually want to reindex our data when we try to upgrade to v8.

NOTE: There were 2-3 things that drifted in the hyrax/samvera schema.xml. Not sure if we should investigate these things or not?

Contributor Author

I don't think this is used at all? Only the config directory and its solrconfig.xml are being used, so to avoid confusion I just removed this one.

@@ -14,8 +14,7 @@ services:
      - '5432:5432'

  solr:
    image: solr:6.6
    platform: linux/amd64
Contributor Author

We no longer need the platform tag, as the Solr 7 image ships for multiple platforms.

Contributor Author

Everything in here is just whitespace diffs.

<!-- <defaultSearchField>text</defaultSearchField> -->

<!-- SolrQueryParser configuration: defaultOperator="AND|OR" -->
<solrQueryParser defaultOperator="OR"/>
Contributor Author

The main change. I tried various searches in Jupiter and it's still doing OR between facets and so on, so everything should work fine without this.

@jefferya
Contributor

It sounds like our current indexed data for solr v6 will also work with v7. But we will eventually want to reindex our data when we try to upgrade to v8.

A word of caution on this point from my CWRC experiences: verify that the Solr v6 servers are not running a v5 index before updating production to v7.
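
One way to act on this caution is to look at the Lucene version recorded for each index segment before upgrading. The sketch below assumes you have already obtained a mapping of segment names to version strings (for example, from Solr's segment info output). The stale_segments helper, the input shape, and the sample values are all hypothetical.

```python
def stale_segments(segments, min_major=6):
    """Return segments whose recorded major version is older than min_major.

    segments is assumed to map a segment name to a dict with a "version"
    string such as "6.6.0"; adapt this to whatever your server reports.
    """
    stale = {}
    for name, info in segments.items():
        major = int(info["version"].split(".")[0])
        if major < min_major:
            stale[name] = info["version"]
    return stale


# Made-up sample: one segment left over from a v5 index, one written by v6.
sample_segments = {
    "_0": {"version": "5.5.4"},
    "_1": {"version": "6.6.0"},
}

leftovers = stale_segments(sample_segments)
if leftovers:
    print("Reindex or optimize before moving to Solr 7:", leftovers)
```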

pgwillia previously approved these changes Mar 20, 2024
Member

LGTM for the necessary parts to upgrade in our dev/test environments.

I doubt that we need to look into the 2-3 things that drifted in the hyrax/samvera schema.xml, but what are they?

I agree that the v6 and v7 indexes are compatible and after that, we can reindex to prepare for v8. Would be good to coordinate with @nmacgreg and perhaps @henryzhang87 on how the process to upgrade Solr would look in our environment. They'll probably want to be familiar with

@murny
Contributor Author

murny commented Mar 20, 2024

I doubt that we need to look into the 2-3 things that drifted in the hyrax/samvera schema.xml, but what are they?

There were three things I noticed in their Solr 7 version of schema.xml:

They do not have the multiterm analyzer:
image

They use "string" instead of "alphaSort" for type on *_ssi field:
image

They do not have Hierarchical pathing support:
image

This is based on Hyku (but I also see the same changes in Hyrax).
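
To show why the second difference listed above might matter for sorting, here is a small sketch. It assumes that "alphaSort" normalizes values by trimming and lowercasing them before comparison, while "string" compares the raw values. That is an assumption about the field type definition and should be checked against the actual schema.xml. The titles are made up.

```python
# Made-up titles with mixed case and a stray leading space.
titles = ["banana", "Apple", " cherry"]


def sort_as_string(values):
    """Sort on the raw values, as an untokenized string field would."""
    return sorted(values)


def sort_as_alpha_sort(values):
    """Sort on trimmed, lowercased values (assumed alphaSort behaviour)."""
    return sorted(values, key=lambda value: value.strip().lower())


raw_order = sort_as_string(titles)
normalized_order = sort_as_alpha_sort(titles)
print(raw_order)
print(normalized_order)
```

If the two orders differ for real title data, switching the field type would visibly change sorted listings.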

@murny
Contributor Author

murny commented Mar 20, 2024

It sounds like our current indexed data for solr v6 will also work with v7. But we will eventually want to reindex our data when we try to upgrade to v8.

A word of caution on this point from my CWRC experiences: verify that the Solr v6 servers are not running a v5 index before updating production to v7.

That's a great callout 😄. Will make sure to mention this to Neil 🙏

@pgwillia
Member

I doubt that we need to look into the 2-3 things that drifted in the hyrax/samvera schema.xml, but what are they?

There were three things I noticed in their Solr 7 version of schema.xml:

They do not have the multiterm analyzer: image

I don't think we use this anywhere.

They use "string" instead of "alphaSort" for type on *_ssi field: image

We use *_ssi for our title field. Might need more investigation.

They do not have Hierarchical pathing support: image

We use *_dpsim for our member_of_paths community/collection field. Would also need more investigation.

This is based on Hyku (but I also see the same changes in Hyrax).
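
For the hierarchical path point, the sketch below illustrates what path-hierarchy tokenization does to a community/collection path, assuming the *_dpsim field type splits values on "/" the way a path hierarchy tokenizer does. The path_hierarchy_tokens helper and the identifiers are hypothetical, and the real field type definition should be confirmed in schema.xml.

```python
def path_hierarchy_tokens(path, delimiter="/"):
    """Return every ancestor prefix of path, ending with the full path."""
    parts = path.split(delimiter)
    return [delimiter.join(parts[:count]) for count in range(1, len(parts) + 1)]


# Hypothetical community and collection identifiers.
tokens = path_hierarchy_tokens("community-1/collection-7")
print(tokens)
```

Because each ancestor prefix is indexed as its own token, a query for the community alone would also match items stored under any of its collections. Without that field type, only exact full-path matches would work.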

@murny
Contributor Author

murny commented Mar 22, 2024

Making a note to myself: Upgrade solr images to v7 in demo/production docker-compose files as well

Note: this is done now

@murny murny mentioned this pull request Apr 14, 2024
Member

@pgwillia pgwillia left a comment

👍

@murny murny merged commit c5faad8 into master May 23, 2024
4 checks passed
@murny murny deleted the #3311-upgrade-solr-to-7 branch May 23, 2024 23:09