Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CSVImporter file upload bug fix #1541

Merged
merged 5 commits into from
Aug 22, 2018

Conversation

cldambrosio
Copy link
Contributor

@cldambrosio cldambrosio commented Jul 6, 2018

Enables uploading files via the import_from_csv script.

refs #1238 #1164

I included assets which I used to see the CSV importer at work, with the command:
ruby bin/import_from_csv tenant ./lib/assets/csv_test.csv ./lib/assets/batch-upload-test/
(where tenant needs to be replaced by your actual one - for instance foo.localhost)
If you try it you may need to adapt the command to your set up. These assets could be removed if preferred.

  • Notes on tests:

I commented out 4 breaking tests from the codebase because they don't appear useful. The functionality was broken while the tests passed previously.

Thanks @geekscruff for sharing your code in leaf_addons - I used it :)

@samvera/hyrax-code-reviewers

Working solution

Refactor

Tests
@@ -0,0 +1,14 @@
# Called in 'object_factory'
module Importer
class AttachFilesToWork < AttachFilesToWorkJob
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than doing this, would it be possible to coerce the input data to an Array of files (with only one element)? That means that any upstream changes won't have to be ported down to Hyku?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see how overriding upstream changes could become undesirable - I'll work on this thanks!

def update
raise "Object doesn't exist" unless object
run_callbacks(:save) do
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are all the callbacks removed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't understand their use! Everything seems to work without. With them the method's Assignment Branch Condition size throws a rubocop error - would you prefer me to change the rubocop threshold for this too?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, adjust the thresholds as needed. If the callbacks aren't used, then make a separate PR to "remove unused callbacks" It's easer to understand PRs if they constrain themselves to a single change.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok thanks! I just restored the callbacks - that didn't cause a rubocop failure anymore. I think I was getting that on my initial solution before extracting logic in new methods.

msg = "Updated #{klass.model_name.human} #{obj.id}"
Rails.logger.info("#{msg} (#{Array(attributes[system_identifier_field]).first})")
end
## below methods are commented out to pass rubocop inspection ("Class has too many line");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than commenting out the logging feature, how about changing the threshold in .rubocop_todo.yml?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It didn't occur to me to change the enforced practice - will do thanks!

Restore log methods, increase rubocop ClassLength
Copy link
Member

@jcoyne jcoyne left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm unclear why you are moving functionality from the Actors into the ObjectFactory. The ObjectFactory should be using the actors and any changes necessary in the actors should be made there.

def update
raise "Object doesn't exist" unless object
@attr = update_attributes
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why have you chosen to use an instance variable here? Why not a local?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is called in attach_file_to_work (line 137).

end
destroy_existing_file_set if object.file_sets.present?
attach_file_to_work
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

attaching a file to a work is the responsibility of the work actor. Why are you adding it here in duplicate?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The work_actor doesn't do this - the feature is broken which is the reason for this PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is done in the actor stack here: https://github.com/samvera/hyrax/blob/f43e7048bf3961f48812809897b36d84b6577f68/app/actors/hyrax/actors/create_with_files_actor.rb#L10 now. So you may need to upload the file and then set the uploaded file id in the :uploaded_files parameter before calling the actor.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Will work on this.

end
destroy_existing_file_set if object.file_sets.present?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure you want to destroy an existing file set? Why not update it? If you replace one fileset with another, people who have cited an existing URI are going to have their references broken. What's the benefit?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question; I am not sure that is the best way forward. This is how I envision the use of the csv importer:

  • If I want to update a work with an added file, I would add columns to my csv. I could keep the 1st file or reorder file columns to pick a different one (the thumbnail will be rendered from the 1st).
  • If I delete a file from the csv I would expect not to have it appear in the work anymore.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure that's universal though. What if I only want to update titles. Reuploading all the files is going to be very slow. And I don't want to have to do that just to update some metadata.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi! Apologies for a late reply. Good point.
The method actually destroys the file set only if the file itself needs to be changed - the check is line 142 f.destroy if attributes[:file] != f.title. I will rename this method for clarity.
But the file is re-uploaded; which is inefficient for sure. I'll make changes.

@cldambrosio
Copy link
Contributor Author

I see; since the Actors are in Hyrax and we basically only point at them in Hyku the ObjectFactory seemed to be the right place to add code to fix this. I'll try moving the logic in an Actor instead.

@cldambrosio cldambrosio changed the title CSVImporter file upload bug fix [WIP] CSVImporter file upload bug fix Jul 19, 2018
@cldambrosio
Copy link
Contributor Author

The code is currently broken - I am stuck and feedback is welcome. I tried to rely on an existing actor but for now I get an ActiveFedora::AssociationTypeMismatch Error: Hydra::PCDM::File expected, got Hyrax::UploadedFile

@ghost
Copy link

ghost commented Jul 19, 2018

Hi @cldambrosio ... I had that issue with attributes[:files] so I switched to using attributes[:uploaded_files] and attributes[:remote_files], cf. https://github.com/research-technologies/leaf_addons/blob/master/lib/generators/leaf_addons/templates/lib/importer/factory/base_factory.rb#L15

I couldn't figure out a way to get [:files] working at all.

@cldambrosio
Copy link
Contributor Author

Thank you @geekscruff for coming to the rescue :)
I had not tried removing :files from the attributes, now with a few tweaks it works!
I have not included [:remote_files] so far, simply because I don't understand where that comes from - any hint?
I'll run the tests and push a commit later.

@jcoyne
Copy link
Member

jcoyne commented Jul 20, 2018

:remote_files would be used if the files were in box.com or google drive or something like that (not locally uploaded)

@cldambrosio
Copy link
Contributor Author

Makes sense 😅 I just had not come across it in Hyku so far so I was not sure I'd need to include this.

Remove method to replace existing file_set

Restore code where change is no longer necessary
@cldambrosio cldambrosio changed the title [WIP] CSVImporter file upload bug fix CSVImporter file upload bug fix Aug 6, 2018
@cldambrosio
Copy link
Contributor Author

Hi @jcoyne, do you think this is ready to merge?

@orangewolf orangewolf merged commit 7526936 into samvera:master Aug 22, 2018
@cldambrosio
Copy link
Contributor Author

🎉 🙏 😄

@mankind mankind deleted the hyku-csv-importer branch January 18, 2019 09:27
jeremyf added a commit that referenced this pull request Dec 20, 2023
Per https://github.com/samvera/hyku/releases/tag/v5.0.0, these importers
have been deprecated.

The "Commit History of Directories" shows that
there as been little activity, the recent activity being appeasing
Rubocop and working through upgrades.  Further, these files are
conceptually subsumed by Bulkrax.

<details><summary>Commit History of Directories</summary>

```bash
❯ git slog lib/importer
* 6a48652 — Hyrax 5 upgrade rubocop fixes & get specs running in CI (#2065) LaRita Robinson, (2023-12-16)
* 59626ef — 🧹 Rubocop'd lengths and other low hanging fruit Kirk Wang, (2023-12-13)
* fecb4e9 — 🧹 Ran `bundle exec rubocop -a` Kirk Wang, (2023-12-13)
* 0a6d047 — Upgrade to Hyrax 3.4.0 (#1811) Rob Kaufman, (2022-07-07)
* edafd85 — Rubocop fixes (#1640) Rob Kaufman, (2020-08-04)
* 9326387 — update to hyrax 2.2 (#1543) Rob Kaufman, (2018-10-04)
*   7526936 — Merge pull request #1541 from ubiquitypress/hyku-csv-importer Rob Kaufman, (2018-08-21)
|\
| * 904a369 — Remove :files from attributes Remove method to replace existing file_set Claire, (2018-07-20) (tag: v1.0.0)
| * 1cf7f5c — WIP- set uploaded_files param to use existing Actor Claire, (2018-07-19)
| * 9650e54 — Restore callbacks Claire, (2018-07-12)
| * e6a358a — Coerce UploadedFile to an array and use upstream Job Claire, (2018-07-11)
| * 93cb695 — Import files from CSV Claire, (2018-07-06)
* | bc51174 — Hyrax 2.1 upgrade (#1529) Rob Kaufman, (2018-07-26)
|/
* 116d4b9 — Upgrade Hyrax to 2.0.2 and upgrade gems with security vulnerabilities (#1519) Julie Allinson, (2018-03-22)
* d85952b — Changes necessary to get simple CSV import to run Michael J. Giarlo, (2017-08-22)
* a4024f4 — Make ObjectFactory rubocop-clean Joe Atzberger, (2017-08-22)
* 7618c3a — Refactor csv_importer and improve test accuracy Joe Atzberger, (2017-08-22)
* 21eb1e6 — Add a note about a broken method Justin Coyne, (2017-06-15)
* 86eedaa — Update Hyrax and other dependencies Justin Coyne, (2017-04-13)
* bfecea2 — Refactor importers to use Hyrax's actor stack Justin Coyne, (2017-02-23)
* 554b8a8 — Pass ability to the actor Justin Coyne, (2017-02-17)
* a8b25cd — Point at the Hyrax repo Justin Coyne, (2016-12-03)
* 5d005c9 — [WIP] Hyrax! Justin Coyne, (2016-12-02)
* fe229ca — Rename Lerna to Hyku. Fixes #461 Justin Coyne, (2016-11-07)
* 1f5fe23 — Update dependencies, Sufia to 7.0.0.rc1 Justin Coyne, (2016-07-12)
* d2c8fb1 — Rename hybox to lerna Michael J. Giarlo, (2016-06-08)
* 5de0440 — Add a CSV importer Justin Coyne, (2016-05-23)
* 5dcbb9b — Use User.batch_user instead of the deprecated batchuser Justin Coyne, (2016-05-20)
* d2a6c3c — Update to latest rubocop and rubocop-rspec Chris Beer, (2016-05-20)
* 642df79 — MODS/PURL importer Justin Coyne, (2016-05-19)
```

```bash
❯ git slog spec/lib/importer/
* 1c04b9e — 🧹 Hyrax 5 upgrade additional specs (#2067) LaRita Robinson, (2023-12-18)
* 6a48652 — Hyrax 5 upgrade rubocop fixes & get specs running in CI (#2065) LaRita Robinson, (2023-12-16)
* fecb4e9 — 🧹 Ran `bundle exec rubocop -a` Kirk Wang, (2023-12-13)
* 0a6d047 — Upgrade to Hyrax 3.4.0 (#1811) Rob Kaufman, (2022-07-07)
* edafd85 — Rubocop fixes (#1640) Rob Kaufman, (2020-08-04)
* f9a3a64 — Use paths based on Rails.root instead of pure relative paths Chris Colvard, (2020-07-08)
* 1b3c5ed — rubocop fixes nicholas, (2018-10-12)
* aeb9904 — update FactoryGirl to FactoryBot nicholas, (2018-10-09)
* 93cb695 — Import files from CSV Claire, (2018-07-06)
* 116d4b9 — Upgrade Hyrax to 2.0.2 and upgrade gems with security vulnerabilities (#1519) Julie Allinson, (2018-03-22)
* 7618c3a — Refactor csv_importer and improve test accuracy Joe Atzberger, (2017-08-22)
* 86eedaa — Update Hyrax and other dependencies Justin Coyne, (2017-04-13)
* bfecea2 — Refactor importers to use Hyrax's actor stack Justin Coyne, (2017-02-23)
* 8ecb861 — Require rails_helper in one place instead of 100 Joe Atzberger, (2017-02-14)
* 7498ecb — Update to latest hyrax Justin Coyne, (2016-12-20)
* a8b25cd — Point at the Hyrax repo Justin Coyne, (2016-12-03)
* 5d005c9 — [WIP] Hyrax! Justin Coyne, (2016-12-02)
* 5de0440 — Add a CSV importer Justin Coyne, (2016-05-23)
* d2a6c3c — Update to latest rubocop and rubocop-rspec Chris Beer, (2016-05-20)
* 642df79 — MODS/PURL importer Justin Coyne, (2016-05-19)
```

</details>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants