Publisher Field #6

Open
seanherron opened this issue Aug 20, 2013 · 4 comments

Comments

@seanherron

I'm working on a modification to the extension to parse out data.json files by the organization they belong to in CKAN. One question I have is about the implementation of the publisher field: why does it map to author in CKAN rather than to organization? I was going to change this, but wanted to check on the rationale behind it first. Thanks!

@JoshData
Contributor

Hi, Sean.

A few reasons. The main one to be wary of is that organizations are permissions structures in CKAN. I don't think it would be appropriate to map harvested datasets to organizations based on the publisher field. You risk giving write-permission to something that shouldn't be edited, or losing permission to update it later. If anything, they should all map to a single Harvester organization.

Groups might be more appropriate.

But also, publisher is a string field. Mapping to organizations/groups may be complex and the logic may depend on whose catalog it is. Also, managing the creation/updating/deletion of organizations/groups is a lot more work that I didn't want to get into.

Am definitely not opposed to seeing a way to map datasets to groups though. That'd be very handy.
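
To make the trade-off above concrete, here is a minimal sketch of the mapping being discussed. It is not the extension's actual code: `to_ckan_package`, `slugify`, the `harvester` organization name, and the `use_groups` switch are all hypothetical, and the only things assumed about CKAN are the `author`, `owner_org`, and `groups` keys of a package dict.

```python
import re


def slugify(name):
    """Turn a free-text publisher string into a lowercase, hyphenated name."""
    return re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")


def to_ckan_package(dataset, harvester_org="harvester", use_groups=False):
    """Map one data.json dataset entry to a CKAN-style package dict.

    Assumptions (not taken from the extension itself):
    - publisher is a plain string and is copied to `author`;
    - every harvested dataset is owned by one shared organization;
    - groups are optional and are named after the publisher.
    """
    publisher = dataset.get("publisher", "")
    package = {
        "title": dataset.get("title", ""),
        "notes": dataset.get("description", ""),
        "author": publisher,
        "owner_org": harvester_org,
    }
    if use_groups and publisher:
        package["groups"] = [{"name": slugify(publisher)}]
    return package
```

Keeping `owner_org` fixed means the publisher string never decides who can edit a dataset; it only labels it, and optionally files it under a group. Creating, updating, and deleting those groups is the extra work mentioned above and is not covered here.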

@dwcaraway

"You risk giving write-permission to something that shouldn't be edited, or losing permission to update it later. If anything, they should all map to a single Harvester organization."

I don't understand. Are the permissions that would be gained or lost the ones for editing metadata?

Our CKAN system will use data.json to populate the catalog. We are also offering users the ability to log in to enter their metadata, upload files, etc. That metadata is used to produce data.json files for the organization.

"But also, publisher is a string field. Mapping to organizations/groups may be complex and the logic may depend on whose catalog it is. Also, managing the creation/updating/deletion of organizations/groups is a lot more work that I didn't want to get into."

Yes, mapping may be inherently complex. We'll likely have to use some machine learning if we start seeing significant variations in the entered data. For right now, though, we hope that we won't have too many endpoints to harvest, so we can establish standards and procedures to minimize the technical problem of establishing identity.
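
As a rough illustration of the "standards and procedures" route (no machine learning), identity could be established with a normalization step plus a hand-maintained alias table. Everything here is hypothetical: the table contents, the `hhs` name, and the function names are made up for the example.

```python
import re

# Hypothetical alias table: each expected spelling maps to one canonical name.
PUBLISHER_ALIASES = {
    "department of health and human services": "hhs",
    "dept of health and human services": "hhs",
    "hhs": "hhs",
}


def normalize(publisher):
    """Lowercase, spell out '&', and collapse punctuation and whitespace."""
    text = publisher.lower().replace("&", " and ")
    return re.sub(r"[^a-z0-9]+", " ", text).strip()


def resolve_publisher(publisher, aliases=PUBLISHER_ALIASES):
    """Return the canonical name for a publisher string, or None if unknown."""
    return aliases.get(normalize(publisher))
```

Returning `None` for unknown publishers lets the harvester flag them for manual review instead of guessing at an identity.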

For us, organizations/groups are great, as they're already baked into CKAN.

Given this information, are there other reasons not to use publisher?

@JoshData
Contributor

Well, like I said, I don't think orgs make sense. Groups make sense.

But if you guys submit a patch to do either, I'd be glad to merge it.

@JoshData
Contributor

JoshData commented Sep 8, 2013

Oh, see #5 though --- I merged Fuhu Xia's patch assigning datasets to the org that owns the harvester source.
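
The idea of assigning datasets to the org that owns the harvester source might look roughly like the sketch below. This does not reproduce the merged patch: `assign_owner_org` is a hypothetical helper, and it assumes the harvest source is available as a dict with an `owner_org` key.

```python
def assign_owner_org(package, harvest_source):
    """Return a copy of `package` owned by the harvest source's organization.

    Assumption: `harvest_source` is a dict that may carry an `owner_org` key.
    If it does not, the package is returned without an owner change.
    """
    result = dict(package)
    owner = harvest_source.get("owner_org")
    if owner:
        result["owner_org"] = owner
    return result
```

Because ownership comes from the harvest source rather than from the publisher string, the permissions concern raised earlier in the thread does not depend on what a remote data.json file claims.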
