Extracted Text derivative generation doesn't work for Valkyrie resources. #5544

Closed
tpendragon opened this issue Mar 15, 2022 · 1 comment · Fixed by #6110
Labels
File Set impacts the File Set part of PCDM Model File impacts the File part of PCDM Model valkyrization

Comments

@tpendragon
Contributor

Descriptive summary

When derivatives are generated, the extracted text is pulled out and placed into a direct container as part of the DerivativeService. Unfortunately, this won't work with generic Valkyrie. We'll have to update it to use a StorageAdapter appropriately, or something similar.

Rationale

So that extracted text generation works for every kind of resource, including generic Valkyrie resources.

Expected behavior

Extracted text is added as an attached FileMetadata node to the FileSet, and the binary content is uploaded to the Wings storage adapter.
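To make the expected behavior concrete, here is a rough sketch in plain Ruby. Everything in it is hypothetical: `FileMetadataSketch`, `FileSetSketch`, `MemoryStorageSketch`, and `attach_extracted_text` are stand-ins invented for illustration, not the Hyrax or Valkyrie classes, and the real fix may look quite different. It only shows the intended shape: upload the text through a storage adapter, then attach a metadata node to the file set.

```ruby
require "securerandom"
require "stringio"

# Hypothetical stand-ins for illustration only. These are NOT the Hyrax or
# Valkyrie classes, just plain Ruby objects that play similar roles.
FileMetadataSketch = Struct.new(:id, :file_identifier, :use, :mime_type, keyword_init: true)
FileSetSketch = Struct.new(:id, :file_ids, keyword_init: true)

# Toy storage adapter: keeps uploaded bytes in a hash and returns an identifier.
class MemoryStorageSketch
  def initialize
    @blobs = {}
  end

  def upload(io:, original_filename:)
    identifier = "memory://#{SecureRandom.uuid}/#{original_filename}"
    @blobs[identifier] = io.read
    identifier
  end

  def read(identifier)
    @blobs.fetch(identifier)
  end
end

# Store the extracted text through the adapter, then attach a metadata node
# to the file set instead of writing into a direct container.
def attach_extracted_text(file_set:, text:, storage:)
  identifier = storage.upload(io: StringIO.new(text), original_filename: "extracted_text.txt")
  metadata = FileMetadataSketch.new(
    id: SecureRandom.uuid,
    file_identifier: identifier,
    use: :extracted_text,
    mime_type: "text/plain"
  )
  file_set.file_ids << metadata.id
  metadata
end

storage = MemoryStorageSketch.new
file_set = FileSetSketch.new(id: "37720c723", file_ids: [])
metadata = attach_extracted_text(file_set: file_set, text: "Hello from a PDF", storage: storage)
puts storage.read(metadata.file_identifier)
```

The point of the sketch is that nothing here depends on the file set ID being a URI, so the same flow could apply to any adapter.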

Actual behavior

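It just errors: the derivatives job raises an ArgumentError from `retrieve_file_set` in persist_directly_contained_output_file_service.rb, complaining that the ID is not an http(s) URI (see the trace below).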

Steps to reproduce the behavior

  1. Upload a PDF
  2. View the error:
sidekiq_1     | 2022-03-15T21:45:09.027Z pid=1 tid=90dt WARN: ArgumentError: 37720c723 is not an http(s) uri
sidekiq_1     | 2022-03-15T21:45:09.027Z pid=1 tid=90dt WARN: /app/samvera/hyrax-engine/app/services/hyrax/persist_directly_contained_output_file_service.rb:27:in `retrieve_file_set'
sidekiq_1     | /app/samvera/hyrax-engine/app/services/hyrax/persist_directly_contained_output_file_service.rb:15:in `call'
sidekiq_1     | /usr/local/bundle/gems/hydra-derivatives-3.6.1/lib/hydra/derivatives/processors/full_text.rb:9:in `process'
sidekiq_1     | /usr/local/bundle/gems/hydra-derivatives-3.6.1/lib/hydra/derivatives/runners/runner.rb:30:in `block (2 levels) in create'
sidekiq_1     | /usr/local/bundle/gems/hydra-derivatives-3.6.1/lib/hydra/derivatives/runners/runner.rb:27:in `each'
sidekiq_1     | /usr/local/bundle/gems/hydra-derivatives-3.6.1/lib/hydra/derivatives/runners/runner.rb:27:in `block in create'
sidekiq_1     | /app/samvera/hyrax-engine/app/services/hyrax/local_file_service.rb:8:in `call'
sidekiq_1     | /usr/local/bundle/gems/hydra-derivatives-3.6.1/lib/hydra/derivatives/runners/runner.rb:41:in `source_file'
sidekiq_1     | /usr/local/bundle/gems/hydra-derivatives-3.6.1/lib/hydra/derivatives/runners/runner.rb:26:in `create'
sidekiq_1     | /app/samvera/hyrax-engine/app/services/hyrax/file_set_derivatives_service.rb:125:in `extract_full_text'
sidekiq_1     | /app/samvera/hyrax-engine/app/services/hyrax/file_set_derivatives_service.rb:78:in `create_pdf_derivatives'
sidekiq_1     | /app/samvera/hyrax-engine/app/services/hyrax/file_set_derivatives_service.rb:34:in `create_derivatives'
sidekiq_1     | /app/samvera/hyrax-engine/app/jobs/valkyrie_create_derivatives_job.rb:11:in `perform'
sidekiq_1     | /usr/local/bundle/gems/activejob-5.2.6.2/lib/active_job/execution.rb:39:in `block in perform_now'
sidekiq_1     | /usr/local/bundle/gems/activesupport-5.2.6.2/lib/active_support/callbacks.rb:109:in `block in run_callbacks'
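The trace suggests that a bare Valkyrie ID (`37720c723`) is being handed to code that expects a Fedora-style http(s) URI. The sketch below reproduces that kind of check with only the Ruby standard library. `http_uri?` and `retrieve_by_uri` are hypothetical helpers written for this illustration; they are an assumption about the failure mode, not the code that actually raises in the trace, and the example URL is made up.

```ruby
require "uri"

# Hypothetical helper: true only for absolute http(s) URIs.
# URI::HTTPS is a subclass of URI::HTTP, so one is_a? check covers both.
def http_uri?(value)
  uri = URI.parse(value.to_s)
  uri.is_a?(URI::HTTP) && !uri.host.nil?
rescue URI::InvalidURIError
  false
end

# Hypothetical lookup that mimics the failure mode seen in the trace.
def retrieve_by_uri(value)
  raise ArgumentError, "#{value} is not an http(s) uri" unless http_uri?(value)

  value
end

# A made-up Fedora-style URI passes the check; a bare Valkyrie ID does not.
puts http_uri?("http://localhost:8080/rest/dev/37720c723")
puts http_uri?("37720c723")

begin
  retrieve_by_uri("37720c723")
rescue ArgumentError => e
  puts e.message
end
```

If this reading is right, the fix is less about validating the ID and more about not assuming a URI at all when persisting the output file.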

Related work

#5504

@elrayle elrayle added valkyrization File Set impacts the File Set part of PCDM Model File impacts the File part of PCDM Model labels Mar 16, 2022
@sephirothkod sephirothkod self-assigned this Jun 23, 2023
@alishaevn
Contributor

Reopening. The related PR was merged, but this still needs to go through QA.

@alishaevn alishaevn reopened this Aug 10, 2023
5 participants