[feature request] stop an harvest job in progress #7940

Closed
virgilejarrige opened this issue Jun 11, 2021 · 7 comments · Fixed by #9187
Labels
Feature: Harvesting NIH OTA DC Grant: The Harvard Dataverse repository: A generalist repository integrated with a Data Commons NIH OTA: 1.4.1 4 | 1.4.1 | Resolve OAI-PMH harvesting issues | 5 prdOwnThis is an item synched from the product ... pm.epic.nih_harvesting pm.GREI-d-1.4.1 NIH, yr1, aim4, task1: Resolve OAI-PMH harvesting issues pm.GREI-d-1.4.2 NIH, yr1, aim4, task2: Create working group on packaging standards

Comments

@virgilejarrige

virgilejarrige commented Jun 11, 2021

Hello,

I ran into the issue described here: https://groups.google.com/g/dataverse-community/c/O-NdDtgFrI0/m/os_KjdLxAQAJ

The process worked, but it might be a good idea to have a button on the harvest page that does the same thing, or maybe an API to reset a stuck harvest.

Here is what I did:
# manually look up the harvesting client's id:
select * from harvestingclient;

#fix the issue - where {ID} is the database id of the harvesting client.
UPDATE clientharvestrun SET harvestresult=0 WHERE harvestingclient_id={ID} AND harvestresult = 2;
UPDATE harvestingclient SET harvestingnow = FALSE WHERE id={ID};

#restart payara "just in case"
systemctl restart payara

Take care,

Virgile
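
For anyone who has to repeat the manual workaround above, here is a minimal sketch of the same two UPDATE statements wrapped in one transaction, assuming a Python DB-API 2.0 connection to the Dataverse database. The function name `reset_stuck_harvest` and the `placeholder` argument are hypothetical; the table names, column names, and the values 0 and 2 are copied from the SQL in the comment, not checked against the Dataverse schema.

```python
def reset_stuck_harvest(conn, client_id, placeholder="%s"):
    """Clear the "harvest in progress" state for one harvesting client.

    conn        -- any DB-API 2.0 connection (assumption: e.g. psycopg2)
    client_id   -- database id of the harvesting client
    placeholder -- the driver's parameter marker ("%s" for psycopg2,
                   "?" for sqlite3)
    Returns the number of clientharvestrun rows that were reset.
    """
    cur = conn.cursor()
    try:
        # Same statement as in the comment, with the id passed as a parameter.
        cur.execute(
            "UPDATE clientharvestrun SET harvestresult = 0 "
            f"WHERE harvestingclient_id = {placeholder} AND harvestresult = 2",
            (client_id,),
        )
        runs_reset = cur.rowcount
        cur.execute(
            f"UPDATE harvestingclient SET harvestingnow = {placeholder} "
            f"WHERE id = {placeholder}",
            (False, client_id),
        )
        # Commit both changes together so the tables stay consistent.
        conn.commit()
    except Exception:
        conn.rollback()
        raise
    return runs_reset
```

Running both statements in a single transaction means a failure in the second one rolls back the first, which is not guaranteed when they are typed by hand in separate sessions.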

@djbrooke
Contributor

Thanks @virgilejarrige, and good to see you at some #dataverse2021 sessions this week. I think it's a good idea not to have this rely on a DB update, but we should also examine and fix the failure cases that result in this condition in the first place.

@virgilejarrige
Author

Hey Danny!

We had two cases in which this happened:

1 - The harvest of an entire data repository (Nakala), which worked but was so huge that it made our PostgreSQL database too big for the VM it was in. As it's on our "all-in-one" test server, we had to use a snapshot to restore it.
That was a noob mistake, but it would have been nice to be able to stop it. ;-)

2 - The harvest of an ahp collection, which was ended with the DB update; the dashboard now shows "SUCCESS; 0 harvested, 0 deleted, 0 failed."
For this one, here are the parameters if you want to test it on your side:
URL: https://archives.ahp-numerique.fr/index.php/;oai
OAI Set: oai:archives.ahp-numerique.fr:ahpoai_406
Metadata Format: oai_dc
Schedule: None
Archive Type: Generic OAI archive
Archive URL: https://archives.ahp-numerique.fr

@mreekie

mreekie commented Apr 27, 2022

sprint:

  • this is an older one. Frequent request.
  • Harvesting jobs can take a long time and currently you are stuck waiting for the end once it starts.
  • This does not have an immediate solution.
  • Solution is not likely complex.

Desired behavior is to stop the harvesting that is in progress as opposed to a pause state.
The current workaround is restarting the app server, so providing a stop will be a large improvement.

  • Size: small, e.g. implement a binary "stop" flag that the client checks after every dataset import.
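
The "stop" flag idea in the sizing note can be sketched roughly as follows. This is only an illustration in Python, not the Dataverse implementation: `run_harvest`, `import_dataset`, and `stop_requested` are hypothetical names, and a `threading.Event` stands in for whatever flag the real harvester would check.

```python
import threading


def run_harvest(identifiers, import_dataset, stop_requested):
    """Import datasets one at a time, checking the stop flag after each.

    identifiers    -- iterable of record identifiers to harvest
    import_dataset -- callable that imports a single record
    stop_requested -- threading.Event set by whoever wants to stop the run
    Returns (number_imported, stopped_by_flag).
    """
    imported = 0
    for identifier in identifiers:
        import_dataset(identifier)
        imported += 1
        # The check sits after each import, so a stop request never
        # interrupts a dataset halfway through.
        if stop_requested.is_set():
            return imported, True
    return imported, False


if __name__ == "__main__":
    stop = threading.Event()

    def fake_import(identifier):
        # Simulate an admin requesting a stop while the second record imports.
        if identifier == "record-2":
            stop.set()

    print(run_harvest(["record-1", "record-2", "record-3"], fake_import, stop))
```

Because the flag is only read between imports, the run ends in a clean state, at the cost of waiting for the current dataset to finish.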

@mreekie mreekie added pm.epic.nih_harvesting NIH OTA DC Grant: The Harvard Dataverse repository: A generalist repository integrated with a Data Commons labels May 9, 2022
@mreekie

mreekie commented May 25, 2022

Sprint

  • orphaned in OnDeck in pm.sprint.2022_05_11

@mreekie

mreekie commented Jun 8, 2022

Sprint:

  • pm.sprint.2022_05_25 ended WIP

@mreekie

mreekie commented Aug 3, 2022

Waiting on PR8753 to clear.
Leonid was looking to work on this but this is not important.
Phil - noted that customers have noticed this in the field.
Gustavo - let's push this to the sprint following this.

@pdurbin
Member

pdurbin commented Oct 10, 2022

@mreekie mreekie added the NIH OTA: 1.4.1 4 | 1.4.1 | Resolve OAI-PMH harvesting issues | 5 prdOwnThis is an item synched from the product ... label Oct 25, 2022
@mreekie mreekie added NIH OTA: 1.4.1 4 | 1.4.1 | Resolve OAI-PMH harvesting issues | 5 prdOwnThis is an item synched from the product ... and removed NIH OTA: 1.4.1 4 | 1.4.1 | Resolve OAI-PMH harvesting issues | 5 prdOwnThis is an item synched from the product ... labels Oct 25, 2022
@mreekie mreekie moved this to NIH (Stefano) in IQSS Dataverse Project Nov 2, 2022
landreev added a commit that referenced this issue Nov 21, 2022
(resolved merge conflict in HarvesterServiceBean.java)
(also, it may be easier to abandon this branch and create one from scratch) (#7940)
landreev added a commit that referenced this issue Nov 23, 2022
landreev added a commit that referenced this issue Nov 23, 2022
landreev added a commit that referenced this issue Nov 23, 2022
@pdurbin pdurbin moved this from NIH bklog#000 (Stefano) to 🏁In a Sprint or Completed in IQSS Dataverse Project Nov 29, 2022
@mreekie mreekie moved this from 🏁In a Sprint or Completed to xxx in IQSS Dataverse Project Dec 1, 2022
@mreekie mreekie moved this from xxx to 🏁In a Sprint in IQSS Dataverse Project Dec 2, 2022
landreev added a commit that referenced this issue Dec 6, 2022
landreev added a commit that referenced this issue Dec 7, 2022
@pdurbin pdurbin added this to the 5.13 milestone Dec 15, 2022
@mreekie mreekie moved this to 🚮Clear of the Backlog in IQSS Dataverse Project Jan 28, 2023
@mreekie mreekie added pm.GREI-d-1.4.1 NIH, yr1, aim4, task1: Resolve OAI-PMH harvesting issues pm.GREI-d-1.4.2 NIH, yr1, aim4, task2: Create working group on packaging standards labels Mar 20, 2023
@gwendoux gwendoux moved this to Interested in Cirad Dataverse Jul 10, 2024