Releases: webrecorder/browsertrix-crawler
Browsertrix Crawler v1.4.0-beta.0
What's Changed
- Support loading custom behaviors from URLs and/or filepaths by @tw4l in #707
- Support custom CSS selectors for extracting links by @ikreymer in #689
- Dependency Update by @ikreymer in #718
- add disable-lazy-loading flag, should fix #699 by @ikreymer in #720
- Support loading custom behaviors from git repo by @tw4l in #717
- fix indexing of cookie header by @ikreymer in #714
- Ensure partial responses are not written by @ikreymer in #721
Full Changelog: v1.3.4...v1.4.0-beta.0
Browsertrix Crawler v1.3.5
What's Changed
- Quick fix for cookies not being available for replay (a regression from 1.2.x); a more extensive fix is coming in the next version.
- fix cookie not being passed to replay regression: for now, add x-waba… by @ikreymer in #713
Full Changelog: v1.3.4...v1.3.5
Browsertrix Crawler v1.3.4
Browsertrix Crawler v1.3.3
What's Changed
Full Changelog: v1.3.2...v1.3.3
Browsertrix Crawler v1.3.2
What's Changed
- ensure extraHops also apply to maxDepth by @ikreymer in #694
- Tests: disable blockrules test in CI by @ikreymer in #698
- Add documentation for crawl collections by @tw4l in #695
- bump puppeteer core to 23.5.1 by @ikreymer in #700
- fix typo in QA exclude check, which resulted in all URLs being excluded by @ikreymer in #697
Full Changelog: v1.3.1...v1.3.2
Browsertrix Crawler v1.3.1
Browsertrix Crawler v1.3.0
What's Changed
- Use isolated Python venv for dependencies installation by @benoit74 in #591
- Adds warning about crawling with basic auth by @Shrinks99 in #669
- Disable behaviors entirely if --behaviors array is empty by @tw4l in #672
- SOCKS5 over SSH Tunnel Support by @ikreymer in #671
- Streaming in-place WACZ creation + CDXJ indexing by @ikreymer in #673
- fix for direct fetch timeouts by @ikreymer in #677
- WARC writer + incremental indexing fixes by @ikreymer in #679
- Additional direct fetch improvements by @ikreymer in #678
- crawler args typing by @ikreymer in #680
- bump browser to 1.69.162 by @ikreymer in #681
- cleanup: remove old config files from pywb by @ikreymer in #682
- eslint: add strict await checking by @ikreymer in #684
- update current crawl size in redis on each healthcheck call by @ikreymer in #685
- exit codes: exit with error code 10 if interrupt is caused by unexpected browser exit by @ikreymer in #686
Full Changelog: v1.2.8...v1.3.0
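The exit-code entry in the v1.3.0 list (#686) is the kind of change a wrapper script may want to react to. The following is a minimal sketch, not part of the crawler itself: it only assumes what the entry states (code 10 means the interrupt was caused by an unexpected browser exit) plus the usual convention that 0 means success. The function names `describe_exit` and `run_and_describe` are hypothetical, and the command run in the demo is a stand-in child process, not a real crawler invocation.

```python
import subprocess
import sys
from typing import Sequence

# Exit code named in the v1.3.0 notes for an interrupt caused by an
# unexpected browser exit. No other crawler exit codes are interpreted here.
BROWSER_EXIT_CODE = 10


def describe_exit(returncode: int) -> str:
    """Map a process return code to a coarse label (labels are hypothetical)."""
    if returncode == 0:
        # Assumption: 0 follows the usual "success" convention.
        return "ok"
    if returncode == BROWSER_EXIT_CODE:
        return "browser-exited"
    return "other-failure"


def run_and_describe(command: Sequence[str]) -> str:
    """Run a command, wait for it to finish, and label its return code."""
    completed = subprocess.run(list(command), check=False)
    return describe_exit(completed.returncode)


if __name__ == "__main__":
    # Stand-in command: a child Python process that exits with code 10.
    stand_in = [sys.executable, "-c", "raise SystemExit(10)"]
    print(run_and_describe(stand_in))
```

A caller could use the "browser-exited" label to decide, for example, whether to restart the crawl rather than treat the run as a generic failure; that policy is left to the wrapper.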
Browsertrix Crawler v1.3.0-beta.1
What's Changed
- fix for direct fetch timeouts by @ikreymer in #677
- WARC writer + incremental indexing fixes by @ikreymer in #679
- Additional direct fetch improvements by @ikreymer in #678
- crawler args typing by @ikreymer in #680
- bump browser to 1.69.162 by @ikreymer in #681
- cleanup: remove old config files from pywb by @ikreymer in #682
- eslint: add strict await checking by @ikreymer in #684
Full Changelog: v1.3.0-beta.0...v1.3.0-beta.1
Browsertrix Crawler v1.3.0-beta.0
What's Changed
- Use isolated Python venv for dependencies installation by @benoit74 in #591
- Adds warning about crawling with basic auth by @Shrinks99 in #669
- Disable behaviors entirely if --behaviors array is empty by @tw4l in #672
- SOCKS5 over SSH Tunnel Support by @ikreymer in #671
- Streaming in-place WACZ creation + CDXJ indexing by @ikreymer in #673
Full Changelog: v1.2.8...v1.3.0-beta.0