Some additional email addresses from MDPI/JATS could be captured with special handling #116

Open
seasidesparrow opened this issue Jul 22, 2024 · 0 comments
Labels
enhancement New feature or request

Comments

Member

seasidesparrow commented Jul 22, 2024

Is your feature request related to a problem? Please describe.
MDPI JATS files occasionally include email addresses in the affiliation text body, each tagged inline with the initials of the author it belongs to. The current parser strips these addresses rather than parsing them out and adding them as author attributes. The XML does not tag them with a specific author id, so unless the initials are parsed, every address would have to be assigned to every author who has that affiliation string.

Describe the solution you'd like
We want to capture the email addresses as part of the record. Intelligent parsing (e.g. matching on author initials) is probably too complicated, but at a minimum the email addresses could either be kept as part of the affiliation string itself, or be parsed out as one or more addresses and assigned to all authors with that affiliation.
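
As a rough illustration of the second option (parse the addresses out and assign them to every author sharing the affiliation), a minimal sketch could look like the following. The function names, the `"name"`/`"aff"`/`"email"` keys, and the sample affiliation text are all assumptions made up for this example; they are not taken from ADSIngestParser or from the MDPI stub file, and the regular expression is intentionally loose.

```python
import re

# Deliberately loose pattern for illustration; it is not a full RFC 5322 validator.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")


def extract_emails(aff_text):
    """Return the unique email addresses found in an affiliation string, in order."""
    return list(dict.fromkeys(EMAIL_RE.findall(aff_text)))


def attach_emails_by_affiliation(authors):
    """Copy each author dict, adding every email found in that author's affiliations.

    `authors` is a hypothetical list of dicts with "name" and "aff" keys;
    the parser's real author structure may differ.
    """
    result = []
    for author in authors:
        emails = []
        for aff in author.get("aff", []):
            for email in extract_emails(aff):
                if email not in emails:
                    emails.append(email)
        updated = dict(author)  # shallow copy so the input is left untouched
        if emails:
            updated["email"] = emails
        result.append(updated)
    return result


# Made-up affiliation text in the style described above (not copied from the MDPI file).
SAMPLE_AFF = (
    "Department of Physics, Example University, 00000 Example City, Country; "
    "jane.doe@example.edu (J.D.); r.roe@example.edu (R.R.)"
)
```

Because no initials are parsed, every author listing `SAMPLE_AFF` would receive both addresses, which is the over-assignment trade-off described above.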

Additional context
For an example of inputs and (current) outputs, see https://github.com/seasidesparrow/ADSIngestParser/blob/6411b00f01831df4617f08d394347d11dee63bf3/tests/stubdata/input/mdpi_symmetry-15-00939.xml#L78 and https://github.com/seasidesparrow/ADSIngestParser/blob/main/tests/stubdata/output/mdpi_symmetry-15-00939.json

seasidesparrow added the enhancement (New feature or request) label Jul 22, 2024