Issue by aleksandar-devedzic Thu Aug 25 21:36:08 2022 Originally opened as codelucas/newspaper#948
These are names found in SCRIPT or META tags that represent dates. Maybe you will find them helpful:
publishdatepublish-date
prism.publicationDate
coverageEndTime
uploadDate
date
published_date
published_time
pubdate
publish_date
Date
published_at
PublishDate
dcterms.created
rnews:datePublished
article:published_time
czhdev.publicationDate
OriginalPublicationDate
og:published_time
datePublished
article_date_original
czhdev.publicationDate
article.published
published_time_telegram
sailthru.date
DC.date.issued
date
parsely-pub-date
publishtime
publication_date
coverageEndTime,publishdate
publish-datepublishedAtDate
creationDateTime
pub_date
updated_time
dateModified
og:updated_time
last-modified
Last-Modified
DC.date.modified
krn:published_time
article:modified_time
modified_time
modifiedDateTime
dc.modified
Comment by Cornatul Fri Sep 30 07:49:30 2022
This is the source code that handles the publish-date tags:

PUBLISH_DATE_TAGS = [
    {'attribute': 'property', 'value': 'rnews:datePublished', 'content': 'content'},
    {'attribute': 'property', 'value': 'article:published_time', 'content': 'content'},
    {'attribute': 'name', 'value': 'OriginalPublicationDate', 'content': 'content'},
    {'attribute': 'itemprop', 'value': 'datePublished', 'content': 'datetime'},
    {'attribute': 'property', 'value': 'og:published_time', 'content': 'content'},
    {'attribute': 'name', 'value': 'article_date_original', 'content': 'content'},
    {'attribute': 'name', 'value': 'publication_date', 'content': 'content'},
    {'attribute': 'name', 'value': 'sailthru.date', 'content': 'content'},
    {'attribute': 'name', 'value': 'PublishDate', 'content': 'content'},
    {'attribute': 'pubdate', 'value': 'pubdate', 'content': 'datetime'},
    {'attribute': 'name', 'value': 'publish_date', 'content': 'content'},
]
See https://github.com/codelucas/newspaper/blob/master/newspaper/extractors.py, lines 198 to 235. You could add your names to this list of dicts and open a pull request.
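For anyone preparing such a pull request, here is a rough sketch of how entries in that shape could be tried out before touching the library. It is not newspaper's own extraction code: it uses the standard library's `html.parser`, and `EXTRA_DATE_TAGS`, `TagMatcher` and `first_publish_date` are made-up names for illustration. The two extra entries assume the key sits in the `name` attribute, which should be checked against real pages.

```python
from html.parser import HTMLParser

# Two candidate entries in the same shape as PUBLISH_DATE_TAGS.
# Assumption: these keys appear in the `name` attribute with the date in `content`.
EXTRA_DATE_TAGS = [
    {'attribute': 'name', 'value': 'parsely-pub-date', 'content': 'content'},
    {'attribute': 'name', 'value': 'DC.date.issued', 'content': 'content'},
]


class TagMatcher(HTMLParser):
    """Collects the date strings of every start tag that matches a known entry."""

    def __init__(self, known_tags):
        super().__init__()
        self.known_tags = known_tags
        self.matches = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        for known in self.known_tags:
            # html.parser lowercases attribute names, so lowercase the lookup keys.
            if attrs.get(known['attribute'].lower()) == known['value']:
                value = attrs.get(known['content'].lower())
                if value:
                    self.matches.append(value)


def first_publish_date(html, known_tags):
    """Return the raw date string of the first matching tag in document order, or None."""
    matcher = TagMatcher(known_tags)
    matcher.feed(html)
    return matcher.matches[0] if matcher.matches else None


sample = '<head><meta name="parsely-pub-date" content="2022-09-30T07:49:30Z"></head>'
print(first_publish_date(sample, EXTRA_DATE_TAGS))
```

The comparison on `value` is case-sensitive, so `Date` and `date` would need separate entries. The returned string is left unparsed; turning it into a datetime is a separate step.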
added in v0.9.2