Feature request: Add fields to warcinfo #22

Closed
BubuAnabelas opened this issue Nov 23, 2018 · 1 comment

Comments

@BubuAnabelas
Contributor

The spec says:

Allowable fields include, but are not limited to...

Currently it only accepts the isPartOf, user agent and description fields. It would be useful to add software-related fields, to allow changing the robots field, and to add some extra info to the software line.

function warcInfoContent ({ version, isPartOfV, warcInfoDescription, ua }) {
  const base = [
    `software: node-warc/${version}${CRLF}format: WARC File Format ${WARCV}${CRLF}robots: ignore${CRLF}`
  ]
  if (isPartOfV != null) {
    base.push(`isPartOf: ${isPartOfV}${CRLF}`)
  }
  if (warcInfoDescription != null) {
    base.push(`description: ${warcInfoDescription}${CRLF}`)
  }
  if (ua != null) {
    base.push(`http-header-user-agent: ${ua}${CRLF}`)
  }
  return base.join('')
}
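
To make the request concrete, here is a rough sketch of one way the function could accept arbitrary fields. It is not the library's actual implementation: the name `warcInfoContentSketch`, the `fields` parameter, and the value given to `WARCV` are assumptions for illustration only.

```javascript
// Hypothetical sketch, not node-warc's real code.
const CRLF = '\r\n'
const WARCV = '1.1' // assumed value; the library defines its own constant

// `fields` is a hypothetical plain object of extra or overriding
// warcinfo fields, e.g. { robots: 'obey', operator: 'someone' }.
function warcInfoContentSketch ({ version, fields = {} }) {
  // Start from the current defaults, then let the caller override or extend them.
  const merged = Object.assign(
    {
      software: `node-warc/${version}`,
      format: `WARC File Format ${WARCV}`,
      robots: 'ignore'
    },
    fields
  )
  // Emit one "name: value" line per field, skipping null/undefined values.
  return Object.entries(merged)
    .filter(([, value]) => value != null)
    .map(([name, value]) => `${name}: ${value}${CRLF}`)
    .join('')
}
```

With something like this, passing `{ robots: 'obey' }` would replace the hard-coded robots line, and any other key would be appended as an additional field.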

N0taN3rd added a commit that referenced this issue Dec 15, 2018
longer require including the extension in the name for the WARC file to be created, it will be added if it is omitted
allow the default WARC file option can now be supplied as the only argument to the constructor of all writers or set via
prep for v3.2.0 release
@BubuAnabelas
Contributor Author

Tested commit 0de56e6 and everything seems to work fine, feature added.
Closing issue.

N0taN3rd added a commit that referenced this issue Dec 28, 2018
longer require including the extension in the name for the WARC file to be created, it will be added if it is omitted
allow the default WARC file option can now be supplied as the only argument to the constructor of all writers or set via

treat the post data retrieved via `Network.getRequestPostData` as utf-8 strings rather than base64 encoded strings

Added to all writers the ability to write a Webrecorder Player compatible bookmark list (as WARC info record) via
WARCWriterBase.writeWebrecorderBookmarksInfoRecord and the pages property of the genOpts object supplied
to generateWARC fixes #25

update index.d.ts with 3.2.0 changes
N0taN3rd added a commit that referenced this issue Dec 28, 2018
Allow full control over the contents of the WARC info record fixes #22