Publisher Field #6

seanherron · 2013-08-20T14:23:47Z

I'm working on a modification to the extension to parse out data.json files by the organization they belong to in CKAN. One question I have is with the implementation of the publisher field - why does it map to author in CKAN rather than to organization? Was going to change this around but wanted to check on the rationale behind it first. Thanks!

JoshData · 2013-08-20T14:33:21Z

Hi, Sean.

A few reasons. The main one to be wary of is that organizations are permissions structures in CKAN. I don't think it would be appropriate to map harvested datasets to organizations based on the publisher field. You risk giving write-permission to something that shouldn't be edited, or losing permission to update it later. If anything, they should all map to a single Harvester organization.

Groups might be more appropriate.

But also, publisher is a string field. Mapping to organizations/groups may be complex and the logic may depend on whose catalog it is. Also, managing the creation/updating/deletion of organizations/groups is a lot more work that I didn't want to get into.

Am definitely not opposed to seeing a way to map datasets to groups though. That'd be very handy.
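To make the string-mapping concern concrete, here is a minimal sketch of deriving a group name from a free-text publisher value. The helper names (`publisher_to_group_name`, `dataset_to_package`) are hypothetical and not part of this extension, and the slug rule (lowercase ASCII letters, digits, and hyphens, 2 to 100 characters) is an assumption about CKAN's name validation that should be checked against the target instance. The sketch only builds a dict; it does not create or update groups, which is the harder part mentioned above.

```python
import re


def publisher_to_group_name(publisher):
    """Turn a free-text publisher string into a slug-style group name.

    Assumption: group names are limited to lowercase ASCII letters,
    digits and hyphens, with a maximum length of 100 characters.
    """
    slug = re.sub(r"[^a-z0-9]+", "-", publisher.strip().lower()).strip("-")
    return slug[:100].strip("-")


def dataset_to_package(dataset):
    """Build a package-like dict from one data.json dataset entry.

    The publisher is still copied to "author"; a group reference is
    added only when a usable name (2+ characters) can be derived.
    """
    publisher = dataset.get("publisher", "")
    package = {"title": dataset.get("title", ""), "author": publisher}
    group_name = publisher_to_group_name(publisher)
    if len(group_name) >= 2:
        package["groups"] = [{"name": group_name}]
    return package
```

Note that two spellings of the same publisher ("Dept. of Energy" vs. "Department of Energy") still produce different names, so a per-catalog lookup table would likely be needed on top of this.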

dwcaraway · 2013-08-23T18:37:25Z

"You risk giving write-permission to something that shouldn't be edited, or losing permission to update it later. If anything, they should all map to a single Harvester organization."

I don't understand. Are the permissions that would be gained or lost the ones for editing metadata?

Our CKAN system will use data.json to populate the catalog. We are also offering users the ability to log in to enter their metadata, upload files, etc. That metadata is then used to produce data.json files for the organization.

"But also, publisher is a string field. Mapping to organizations/groups may be complex and the logic may depend on whose catalog it is. Also, managing the creation/updating/deletion of organizations/groups is a lot more work that I didn't want to get into."

Yes, mapping may be inherently complex. We'll likely have to use some machine learning if we start seeing significant variations in the entered data. For right now, though, we're hoping that we won't have too many endpoints to harvest, so we can establish standards and procedures to minimize the technical problem of establishing identity.

For us, organizations/groups are great because they're already baked into CKAN.

Given this information, are there other reasons not to use publisher?

JoshData · 2013-08-29T13:27:09Z

Well, like I said, I don't think orgs make sense. Groups make sense.

But if you guys submit a patch to do either, I'd be glad to merge it.

JoshData · 2013-09-08T20:05:47Z

Oh, see #5 though --- I merged Fuhu Xia's patch assigning datasets to the org that owns the harvester source.
