Releases: webrecorder/browsertrix-crawler

Browsertrix Crawler v1.4.0-beta.0

14 Nov 07:29
0b9cd71
Pre-release

What's Changed

Full Changelog: v1.3.4...v1.4.0-beta.0

Browsertrix Crawler v1.3.5

05 Nov 21:47
3187685

What's Changed

  • Quick fix for cookies not being available for replay (regression from 1.2.x); a more extensive fix is coming in the next version.
  • fix cookie not being passed to replay regression: for now, add x-waba… by @ikreymer in #713

Full Changelog: v1.3.4...v1.3.5

Browsertrix Crawler v1.3.4

31 Oct 21:07
e5bab8e

What's Changed

Full Changelog: v1.3.3...v1.3.4

Browsertrix Crawler v1.3.3

11 Oct 07:19

What's Changed

  • Fix for rare crash (link extraction promise cleanup) by @ikreymer in #701

Full Changelog: v1.3.2...v1.3.3

Browsertrix Crawler v1.3.2

08 Oct 00:26
157ac34

What's Changed

  • ensure extraHops also apply to maxDepth by @ikreymer in #694
  • Tests: disable blockrules test in CI by @ikreymer in #698
  • Add documentation for crawl collections by @tw4l in #695
  • bump puppeteer core to 23.5.1 by @ikreymer in #700
  • fix typo in QA exclude check, which resulted in all URLs being excluded by @ikreymer in #697

Full Changelog: v1.3.1...v1.3.2

Browsertrix Crawler v1.3.1

27 Sep 18:32

What's Changed

  • direct fetch: when cancelling due to redirect, read full body by @ikreymer in #688
  • Include depth in pages JSONL files by @tw4l in #691
  • Additional exception safety by @ikreymer in #692

Full Changelog: v1.3.0...v1.3.1

Browsertrix Crawler v1.3.0

12 Sep 16:30
Compare
Choose a tag to compare

What's Changed

  • Use isolated Python venv for dependencies installation by @benoit74 in #591
  • Adds warning about crawling with basic auth by @Shrinks99 in #669
  • Disable behaviors entirely if --behaviors array is empty by @tw4l in #672
  • SOCKS5 over SSH Tunnel Support by @ikreymer in #671
  • Streaming in-place WACZ creation + CDXJ indexing by @ikreymer in #673
  • fix for direct fetch timeouts by @ikreymer in #677
  • WARC writer + incremental indexing fixes by @ikreymer in #679
  • Additional direct fetch improvements by @ikreymer in #678
  • crawler args typing by @ikreymer in #680
  • bump browser to 1.69.162 by @ikreymer in #681
  • cleanup: remove old config files from pywb by @ikreymer in #682
  • eslint: add strict await checking by @ikreymer in #684
  • update current crawl size in redis on each healthcheck call by @ikreymer in #685
  • exit codes: exit with error code 10 if interrupt is caused by unexpected browser exit by @ikreymer in #686
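
For wrapper scripts that launch the crawler, the new exit code from #686 can be checked explicitly. The following is a minimal sketch, not part of the crawler itself: the helper names `classify_exit` and `run_and_classify` are hypothetical, the only crawler-specific value is the code 10 stated in the note above, and treating 0 as success is the usual process convention rather than something these notes confirm.

```python
import subprocess
import sys

# Exit code named in the release note above (unexpected browser exit).
# Any other non-zero code is treated generically here.
BROWSER_EXIT_CODE = 10


def classify_exit(returncode: int) -> str:
    """Map a process return code to a coarse label (hypothetical helper)."""
    if returncode == 0:
        return "ok"
    if returncode == BROWSER_EXIT_CODE:
        return "browser-exited"
    return "other-error"


def run_and_classify(cmd: list[str]) -> str:
    """Run a command and classify its exit status.

    `cmd` is whatever command line starts the crawl in your setup;
    it is deliberately not spelled out here.
    """
    result = subprocess.run(cmd, check=False)
    return classify_exit(result.returncode)


if __name__ == "__main__":
    # Stand-in child process that exits with code 10, so the sketch
    # can be run without a crawler installed.
    demo_cmd = [sys.executable, "-c", "import sys; sys.exit(10)"]
    print(run_and_classify(demo_cmd))
```

What a wrapper does with the "browser-exited" label (retry, alert, or give up) depends on the deployment and is left open here.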

Full Changelog: v1.2.8...v1.3.0

Browsertrix Crawler v1.3.0-beta.1

06 Sep 23:24
b425483
Pre-release

What's Changed

Full Changelog: v1.3.0-beta.0...v1.3.0-beta.1

Browsertrix Crawler v1.3.0-beta.0

29 Aug 22:07
85a07af
Pre-release

What's Changed

  • Use isolated Python venv for dependencies installation by @benoit74 in #591
  • Adds warning about crawling with basic auth by @Shrinks99 in #669
  • Disable behaviors entirely if --behaviors array is empty by @tw4l in #672
  • SOCKS5 over SSH Tunnel Support by @ikreymer in #671
  • Streaming in-place WACZ creation + CDXJ indexing by @ikreymer in #673

Full Changelog: v1.2.8...v1.3.0-beta.0

Browsertrix Crawler v1.2.8

14 Aug 06:41
8d7fb1e

What's Changed

Full Changelog: v1.2.7...v1.2.8