Releases · webrecorder/browsertrix-crawler

14 Nov 07:29

github-actions

v1.4.0-beta.0

0b9cd71

Browsertrix Crawler v1.4.0-beta.0 Pre-release

What's Changed

Support loading custom behaviors from URLs and/or filepaths by @tw4l in #707
support custom css selectors for extracting links by @ikreymer in #689
Dependency Update by @ikreymer in #718
add disable-lazy-loading flag, should fix #699 by @ikreymer in #720
Support loading custom behaviors from git repo by @tw4l in #717
fix indexing of cookie header: by @ikreymer in #714
Ensure partial responses are not written by @ikreymer in #721

Full Changelog: v1.3.4...v1.4.0-beta.0

Contributors

ikreymer and tw4l

05 Nov 21:47

github-actions

v1.3.5

3187685

Browsertrix Crawler v1.3.5 Latest

What's Changed

Quick fix for cookies not being available for replay (a regression from 1.2.x); a more extensive fix is coming in the next version.
fix cookie not being passed to replay regression: for now, add x-waba… by @ikreymer in #713

Full Changelog: v1.3.4...v1.3.5

Contributors

ikreymer

31 Oct 21:07

github-actions

v1.3.4

e5bab8e

Browsertrix Crawler v1.3.4

What's Changed

dep: update to wabac.js 2.20 by @ikreymer in #704
deps: update to latest wabac by @ikreymer in #708
tests: use old.webrecorder.net for testing by @ikreymer in #710
range request streaming + various edge-case range optimizations: by @ikreymer in #709

Full Changelog: v1.3.3...v1.3.4

Contributors

ikreymer

11 Oct 07:19

github-actions

v1.3.3

a45b85d

Browsertrix Crawler v1.3.3

What's Changed

Fix for rare crash (link extraction promise cleanup): by @ikreymer in #701

Full Changelog: v1.3.2...v1.3.3

Contributors

ikreymer

08 Oct 00:26

github-actions

v1.3.2

157ac34

Browsertrix Crawler v1.3.2

What's Changed

ensure extraHops also apply to maxDepth by @ikreymer in #694
Tests: disable blockrules test in CI by @ikreymer in #698
Add documentation for crawl collections by @tw4l in #695
bump puppeteer core to 23.5.1 by @ikreymer in #700
fix typo in QA exclude check, which resulted in all URLs being excluded by @ikreymer in #697

Full Changelog: v1.3.1...v1.3.2

Contributors

ikreymer and tw4l

27 Sep 18:32

github-actions

v1.3.1

9f31090

Browsertrix Crawler v1.3.1

What's Changed

direct fetch: when cancelling due to redirect, read full body by @ikreymer in #688
Include depth in pages JSONL files by @tw4l in #691
Additional exception safety by @ikreymer in #692

Full Changelog: v1.3.0...v1.3.1

Contributors

ikreymer and tw4l

12 Sep 16:30

github-actions

v1.3.0

da44257

Browsertrix Crawler v1.3.0

What's Changed

Use isolated Python venv for dependencies installation by @benoit74 in #591
Adds warning about crawling with basic auth by @Shrinks99 in #669
Disable behaviors entirely if --behaviors array is empty by @tw4l in #672
SOCKS5 over SSH Tunnel Support by @ikreymer in #671
Streaming in-place WACZ creation + CDXJ indexing by @ikreymer in #673
fix for direct fetch timeouts by @ikreymer in #677
WARC writer + incremental indexing fixes by @ikreymer in #679
Additional direct fetch improvements by @ikreymer in #678
crawler args typing by @ikreymer in #680
bump browser to 1.69.162 by @ikreymer in #681
cleanup: remove old config files from pywb by @ikreymer in #682
eslint: add strict await checking: by @ikreymer in #684
update current crawl size in redis on each healthcheck call by @ikreymer in #685
exit codes: exit with error code 10 if interrupt is caused by unexpected browser exit by @ikreymer in #686
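
For anyone wrapping the crawler in a script, the exit-code change in #686 could be handled roughly as follows. This is only a sketch: the only fact taken from these notes is that exit code 10 signals an interrupt caused by an unexpected browser exit. The helper names `classify_exit` and `run_crawl` are hypothetical, and the command to run is left to the caller rather than assumed here.

```python
import subprocess
import sys

# From the v1.3.0 notes (#686): exit code 10 means the crawl was
# interrupted by an unexpected browser exit.
BROWSER_EXIT_CODE = 10


def classify_exit(returncode: int) -> str:
    """Map a process exit code to a coarse label (hypothetical helper)."""
    if returncode == 0:
        # Assumption: 0 means success, per the usual process convention.
        return "ok"
    if returncode == BROWSER_EXIT_CODE:
        return "browser-exit"
    # Other non-zero codes are not described in these notes.
    return "other"


def run_crawl(cmd: list[str]) -> str:
    """Run the given command and classify how it exited (hypothetical helper)."""
    result = subprocess.run(cmd)
    return classify_exit(result.returncode)


if __name__ == "__main__":
    # Stand-in command that simply exits with code 10; replace it with
    # the real crawler invocation for your setup.
    label = run_crawl([sys.executable, "-c", "import sys; sys.exit(10)"])
    print(label)
```

A wrapper like this might choose to restart the crawl on "browser-exit" while treating "other" as a failure to investigate; that policy is a design choice, not something these notes prescribe.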

Full Changelog: v1.2.8...v1.3.0

Contributors

ikreymer, Shrinks99, and 2 other contributors

06 Sep 23:24

github-actions

v1.3.0-beta.1

b425483

Browsertrix Crawler v1.3.0-beta.1 Pre-release

What's Changed

fix for direct fetch timeouts by @ikreymer in #677
WARC writer + incremental indexing fixes by @ikreymer in #679
Additional direct fetch improvements by @ikreymer in #678
crawler args typing by @ikreymer in #680
bump browser to 1.69.162 by @ikreymer in #681
cleanup: remove old config files from pywb by @ikreymer in #682
eslint: add strict await checking: by @ikreymer in #684

Full Changelog: v1.3.0-beta.0...v1.3.0-beta.1

Contributors

ikreymer

29 Aug 22:07

github-actions

v1.3.0-beta.0

85a07af

Browsertrix Crawler v1.3.0-beta.0 Pre-release

What's Changed

Use isolated Python venv for dependencies installation by @benoit74 in #591
Adds warning about crawling with basic auth by @Shrinks99 in #669
Disable behaviors entirely if --behaviors array is empty by @tw4l in #672
SOCKS5 over SSH Tunnel Support by @ikreymer in #671
Streaming in-place WACZ creation + CDXJ indexing by @ikreymer in #673

Full Changelog: v1.2.8...v1.3.0-beta.0

Contributors

ikreymer, Shrinks99, and 2 other contributors

14 Aug 06:41

github-actions

v1.2.8

8d7fb1e

Browsertrix Crawler v1.2.8

What's Changed

1.2.8 updates: by @ikreymer in #668

Full Changelog: v1.2.7...v1.2.8

Contributors

ikreymer
