Harvest: Set export can temporarily be adversely affected by full (clean) reindex. #3437

Closed
kcondon opened this issue Oct 27, 2016 · 3 comments · Fixed by #10222
Labels
Feature: Harvesting NIH OTA DC Grant: The Harvard Dataverse repository: A generalist repository integrated with a Data Commons NIH OTA: 1.4.1 4 | 1.4.1 | Resolve OAI-PMH harvesting issues | 5 prdOwnThis is an item synched from the product ... pm.epic.nih_harvesting pm.GREI-d-1.4.1 NIH, yr1, aim4, task1: Resolve OAI-PMH harvesting issues pm.GREI-d-1.4.2 NIH, yr1, aim4, task2: Create working group on packaging standards pm.GREI-d-2.4.1B NIH AIM:4 YR:2 TASK:1B | 2.4.1B | (started yr1) Resolve OAI-PMH harvesting issues Size: 30 A percentage of a sprint. 21 hours. (formerly size:33) Type: Bug a defect User Role: Superuser Has access to the superuser dashboard and cares about how the system is configured

Comments


kcondon commented Oct 27, 2016

A set's scope is defined by the results of the Solr query in its set definition. When the scope changes, through additions or removals, the set is updated, and clients that harvest afterwards may receive additions or delete notifications.

This can happen accidentally if a clean index is performed and the nightly set exporter runs while the index is still largely empty, that is, before indexing has finished. The net effect is that most of the sets' records are marked as deleted, and any client harvesting before the next nightly set export, or a manual re-export, will update its local harvested content by deleting the records we had not finished indexing at the time of the export. In principle this is temporary if the client is scheduled to run again at a later date: it would get the updated set once our indexing has finished and the export has been rerun.
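
To make the failure mode concrete, here is a minimal sketch of the comparison an exporter has to make between the previous export and the current query results. It is an illustration only, not the Dataverse implementation: the function name `plan_set_update` and the record identifiers are hypothetical.

```python
def plan_set_update(previously_exported, query_results):
    """Compare the last export of a set with the current search results.

    Hypothetical helper: returns which records a harvesting client would
    see as new and which it would be told to delete.
    """
    previous = set(previously_exported)
    current = set(query_results)
    return {
        "added": sorted(current - previous),
        "deleted": sorted(previous - current),
    }


# The last good export of the set contained three records.
last_export = ["ds-A", "ds-B", "ds-C"]

# A clean reindex is running; only one record is back in the index so far.
partial_index = ["ds-A"]
during_reindex = plan_set_update(last_export, partial_index)

# Once indexing has finished, the next export sees all records again
# and re-adds the ones that were wrongly marked as deleted.
full_index = ["ds-A", "ds-B", "ds-C"]
after_reindex = plan_set_update(partial_index, full_index)
```

In this sketch, `during_reindex["deleted"]` lists the two records that are merely missing from the half-built index, while `after_reindex["added"]` lists the same two records coming back, which is why the damage is temporary only for clients that harvest again later.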

@djbrooke djbrooke changed the title Harvest: Set export can temporarily be adversely affected by full (clean) reindex. Harvest: Set export can temporarily be adversely affected by full (clean) reindex. Oct 28, 2016
@pdurbin pdurbin added User Role: Superuser Has access to the superuser dashboard and cares about how the system is configured and removed zPriority 2: Moderate labels Jul 12, 2017
@mreekie mreekie added pm.epic.nih_harvesting NIH OTA DC Grant: The Harvard Dataverse repository: A generalist repository integrated with a Data Commons labels May 9, 2022
@mreekie mreekie added the NIH OTA: 1.4.1 4 | 1.4.1 | Resolve OAI-PMH harvesting issues | 5 prdOwnThis is an item synched from the product ... label Dec 5, 2022

mreekie commented Jan 9, 2023

Review with Leonid

  • Believe this is still a real problem.
  • Get this estimated and prioritized

@mreekie mreekie added pm.GREI-d-1.4.1 NIH, yr1, aim4, task1: Resolve OAI-PMH harvesting issues pm.GREI-d-1.4.2 NIH, yr1, aim4, task2: Create working group on packaging standards labels Mar 20, 2023
@cmbz cmbz added the pm.GREI-d-2.4.1B NIH AIM:4 YR:2 TASK:1B | 2.4.1B | (started yr1) Resolve OAI-PMH harvesting issues label Jun 2, 2023
@cmbz cmbz moved this to SPRINT- NEEDS SIZING in IQSS Dataverse Project Dec 18, 2023

cmbz commented Dec 19, 2023

2023/12/19: Prioritized during meeting on 2023/12/18. Added to Needs Sizing.

@landreev

Sounds like we need some kind of "full reindex in progress" lock? It should help with things other than OAI sets too, I would think; it would prevent starting multiple reindex runs, for example.
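
The lock idea could be sketched roughly as follows. This is only an in-process illustration under stated assumptions, not the fix that was eventually merged: `ReindexLock`, `hold`, `in_progress`, and `export_sets` are hypothetical names, and a real deployment would presumably need to persist the lock (for example in the database) so that it is visible across server instances.

```python
import threading
from contextlib import contextmanager


class ReindexLock:
    """Hypothetical 'full reindex in progress' flag."""

    def __init__(self):
        self._lock = threading.Lock()

    @contextmanager
    def hold(self):
        # Refuse to start a second full reindex while one is running.
        if not self._lock.acquire(blocking=False):
            raise RuntimeError("a full reindex is already in progress")
        try:
            yield
        finally:
            self._lock.release()

    def in_progress(self):
        return self._lock.locked()


def export_sets(lock, run_export):
    """Skip the nightly set export while a full reindex holds the lock."""
    if lock.in_progress():
        return "skipped"
    run_export()
    return "exported"


lock = ReindexLock()
exports_run = []

with lock.hold():  # a full reindex is running
    during = export_sets(lock, lambda: exports_run.append("night 1"))

# The reindex has finished, so the next scheduled export goes ahead.
after = export_sets(lock, lambda: exports_run.append("night 2"))
```

Skipping the export leaves the previous set contents in place, so harvesting clients keep seeing the last good export instead of a wave of spurious deletions.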

@landreev landreev added the Size: 30 A percentage of a sprint. 21 hours. (formerly size:33) label Dec 19, 2023
@cmbz cmbz moved this from SPRINT- NEEDS SIZING to SPRINT READY in IQSS Dataverse Project Dec 19, 2023
@landreev landreev moved this from SPRINT READY to This Sprint 🏃‍♀️ 🏃 in IQSS Dataverse Project Jan 3, 2024
@landreev landreev moved this from This Sprint 🏃‍♀️ 🏃 to In Progress 💻 in IQSS Dataverse Project Jan 5, 2024
@landreev landreev self-assigned this Jan 5, 2024
landreev added a commit that referenced this issue Jan 5, 2024
landreev added a commit that referenced this issue Jan 9, 2024
I want to have it in order to be able to create an API test
for a specific OAI set export case. But I figure it could be useful
otherwise. #3437
landreev added a commit that referenced this issue Jan 9, 2024
landreev added a commit that referenced this issue Jan 9, 2024
landreev added a commit that referenced this issue Jan 9, 2024
landreev added a commit that referenced this issue Jan 10, 2024
landreev added a commit that referenced this issue Jan 11, 2024
@pdurbin pdurbin added this to the 6.2 milestone Feb 14, 2024