Fully document import by CSV #1164

Open
hannahfrost opened this issue May 28, 2017 · 2 comments

Comments

@hannahfrost

No description provided.

@mjgiarlo changed the title from "Fully document import by CVS" to "Fully document import by CSV" on May 31, 2017
@mjgiarlo modified the milestone: November 2017 on Jun 5, 2017
@mjgiarlo
Member

mjgiarlo commented Aug 8, 2017

@hannahfrost @bbranan cc: @jcoyne (just FYI)

I've tested the CSV importer and here's what I found.

First, I created a simplified version of the CSV with a single image-related object. (It actually maps to the GenericWork type rather than the Image type, which confused me.) You can see that in #1395.

Next, I copied a smallish PNG to tmp/hyku-objects, and then ran the command: ./bin/import_from_csv foo.localhost spec/fixtures/csv/simple.csv tmp/hyku-objects
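The command above takes three positional arguments. As a minimal sketch, assuming they are a tenant hostname, a CSV path, and a directory of files to import (an interpretation of the example invocation, not documented behavior), a pre-flight check could look like the following. `preflight_problems` is a hypothetical helper and is not part of `bin/import_from_csv`.

```ruby
# Hypothetical pre-flight check for the three positional arguments.
# The meaning assigned to each argument is an assumption based on the
# example invocation, not on the script's documentation.
def preflight_problems(tenant, csv_path, files_dir)
  problems = []
  problems << 'tenant hostname is blank' if tenant.to_s.strip.empty?
  problems << "CSV file not found: #{csv_path}" unless File.file?(csv_path.to_s)
  problems << "files directory not found: #{files_dir}" unless File.directory?(files_dir.to_s)
  problems
end
```

Calling `preflight_problems('foo.localhost', 'spec/fixtures/csv/simple.csv', 'tmp/hyku-objects')` from the application root returns an empty array when both paths exist, and a list of readable messages otherwise.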

I needed to make a small number of code changes to get the script to run. AccountElevator expects a global logger object, so I added one to the script. There is also an undocumented dependency between imported objects and collections, so I added a way to ignore it.

Here are some shortcomings that will/may need addressing before we consider this useful in a hosted service context:

  • Files are never attached to the imported works.
  • CSV import should allow a depositor (email) to be specified on the command line. By default, the depositor is set to a system user, so you can only see an imported work if you are an administrator.
  • CSV import could allow an AdminSet ID to be specified on the command line. It currently defaults to the default AdminSet.
  • CSV import should accept a column for visibility.
  • CSV import may need to handle workflow in some way if an item is deposited into an AdminSet that isn't using the self-deposit workflow.
  • It is possible to import data that does not match our vocabularies, e.g., resource type.
  • It is possible to import data that would not pass form validation. For instance, the current CSV format does not include a field for rights statements. (This doesn't cause errors, but it does create data that looks odd, and it may be difficult to edit an imported object without first adding values for these fields.)
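Several of the items above could be sketched roughly as follows. This is an illustration only, not the importer's code: the `--depositor` and `--admin-set` flags, the column names (`title`, `rights_statement`, `visibility`, `resource_type`), and the allowed values are all assumptions chosen for the example, and the real vocabularies would come from the application.

```ruby
require 'csv'
require 'optparse'

# Assumed values for illustration; real vocabularies live in the application.
ALLOWED_VISIBILITIES = %w[open authenticated restricted].freeze
ALLOWED_RESOURCE_TYPES = %w[Image Article Dataset].freeze
REQUIRED_COLUMNS = %w[title rights_statement].freeze

# Parses the hypothetical --depositor and --admin-set flags.
# Missing flags stay nil so the caller can fall back to its defaults.
def parse_import_options(argv)
  options = { depositor: nil, admin_set_id: nil }
  OptionParser.new do |opts|
    opts.on('--depositor EMAIL', 'Depositor email') { |v| options[:depositor] = v }
    opts.on('--admin-set ID', 'AdminSet ID') { |v| options[:admin_set_id] = v }
  end.parse(argv)
  options
end

# Returns a list of problems for a single CSV row.
def row_problems(row)
  problems = []
  REQUIRED_COLUMNS.each do |column|
    problems << "missing #{column}" if row[column].to_s.strip.empty?
  end
  visibility = row['visibility']
  if visibility && !ALLOWED_VISIBILITIES.include?(visibility)
    problems << "unknown visibility: #{visibility}"
  end
  type = row['resource_type']
  if type && !ALLOWED_RESOURCE_TYPES.include?(type)
    problems << "unknown resource type: #{type}"
  end
  problems
end

# Maps row numbers (the header counts as row 1) to their problems.
# Rows without problems are omitted from the report.
def validate_csv(csv_text)
  report = {}
  CSV.parse(csv_text, headers: true).each_with_index do |row, index|
    problems = row_problems(row)
    report[index + 2] = problems unless problems.empty?
  end
  report
end
```

Running a check like `validate_csv(File.read(csv_path))` before creating any objects would let the importer reject or flag rows that would not pass form validation, rather than creating works that are awkward to edit later. Workflow handling and file attachment are not covered by this sketch.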

@atz
Contributor

atz commented Aug 17, 2017

This blocks #1238. The problem is not just the documentation: nobody (not even the authors of this code) can actually run it successfully. There is currently no Hyku feature here. There is the idea and structure where a feature might go, but nothing usable.

3 participants