Releases: Lightning-AI/litdata
v0.2.30
What's Changed
- update tags in pkg metadata by @Borda in #384
- 📝 Update Docs: Merge multiple optimized datasets into one by @bhimrazy in #385
- Fix/large num chunks error by @bhimrazy in #381
- fix: non-deterministic CI test failure by @deependujha in #390
- correct the chunk size by adding header size by @tchaton in #395
- pass storage options to s5cmd by @bhimrazy in #397
- CONTRIBUTING.md for LitData by @deependujha in #391
- Feat: add support for custom cache dir in Streaming Dataset by @bhimrazy in #399
- 📝 docs: specify custom cache directory by @bhimrazy in #405
- Fix broken link for CONTRIBUTING.md by @bhimrazy in #404
- Feat/add support for numpy datatypes in tokensloader by @bhimrazy in #401
- Bump version to 0.2.30 by @bhimrazy in #410
Full Changelog: v0.2.29...v0.2.30
v0.2.29
What's Changed
- Update `PL Data` to LitData by @bhimrazy in #382
- Fix: Chunks deletion issue by @deependujha in #375
- Bump version 0.2.29 by @deependujha in #383
Full Changelog: v0.2.28...v0.2.29
v0.2.28
v0.2.27
What's Changed
- ci: drop dependabot by @Borda in #361
- azure storage options by @mohanreddypmr in #365
- switch `lightning-cloud` to lightning SDK by @Borda in #369
- remove not violated bandit rules from ignore by @Borda in #372
- fixing typos in errors & docs by @Borda in #371
- reduce unnecessary `pass` by @Borda in #373
- fixing docstrings by @Borda in #374
- improve hint readability by @Borda in #376
- Bump version to 0.2.27.dev by @rasbt in #378
- fix import & asignement issue by @Borda in #377
- Feat: Using fsspec to download files by @deependujha in #348
- Bump version to 0.2.27 by @bhimrazy in #379
Full Changelog: v0.2.26...v0.2.27
v0.2.26.dev
What's Changed
- ci: drop dependabot by @Borda in #361
- azure storage options by @mohanreddypmr in #365
- switch `lightning-cloud` to lightning SDK by @Borda in #369
- remove not violated bandit rules from ignore by @Borda in #372
- fixing typos in errors & docs by @Borda in #371
- reduce unnecessary `pass` by @Borda in #373
- fixing docstrings by @Borda in #374
- improve hint readability by @Borda in #376
- Bump version to 0.26.dev by @rasbt in #378
Full Changelog: v0.2.26...v0.2.26.dev
v0.2.26
What's Changed
- Bump mosaicml-streaming from 0.8.0 to 0.8.1 by @dependabot in #346
- Adds check for existence of dataset path before loading index file by @bhimrazy in #350
- Bump pypa/gh-action-pypi-publish from 1.9.0 to 1.10.0 by @dependabot in #352
- Bump coverage from 7.5.3 to 7.6.1 by @dependabot in #345
- bump/ci: update to `0.11.7` by @Borda in #355
- Update README.md by @tchaton in #356
- tchaton patch 1 by @tchaton in #357
- Update README.md by @tchaton in #358
- Update README.md by @tchaton in #359
- Fix: Prevent multiple processes from copying the same file when using… by @dallmann-uniwue in #353
- LitData release 0.2.26 by @tchaton in #360
New Contributors
- @dallmann-uniwue made their first contribution in #353
Full Changelog: v0.2.25...v0.2.26
v0.2.25
What's Changed
- fix(ci): prune duplicated tests/checks by @Borda in #333
- fix(lint): prune invalid configurations by @Borda in #334
- ci: enable testing `py3.10` & prune unused workflows by @Borda in #335
- bump: use the latest/fixed version of `RequirementCache` by @Borda in #336
- Fix: Ensure Compression Algorithm is Installed Before Reading Compressed Data by @bhimrazy in #342
- Bump: release version 0.2.25 by @bhimrazy in #343
Full Changelog: v0.2.24...v0.2.25
v0.2.24
What's Changed
- Update README.md by @tchaton in #319
- Revert "Feat: Add support for reading LitData dataset published to HF" by @bhimrazy in #320
- Expose max download param by @animan42 in #323
- Dummy unit test max download by @animan42 in #325
- Nitpick: random state best practice by @deependujha in #326
- Ref/minor fixes by @bhimrazy in #329
- Bugfix: inconsistent streaming dataloader state (specific to StreamingDataset) by @bhimrazy in #318
- Bump: release version 0.2.24 by @bhimrazy in #332
New Contributors
Full Changelog: v0.2.23...v0.2.24
v0.2.23
What's Changed
- Update README.md by @tchaton in #303
- Fix StreamingDataset.get_len(num_workers=0) by @senarvi in #311
- Feat: Add support for storing and reading dataset from HF by @bhimrazy in #304
- Speed up the search for chunks to skip deletion for by @awaelchli in #312
- Feat: Clear cache if optimized dataset changes by @deependujha in #308
- Added a test for the bug with data loader length with num_workers=0 by @senarvi in #314
- Bump: release version 0.2.23 by @deependujha in #315
Full Changelog: v0.2.22...v0.2.23
Release 0.2.22
What's Changed
- Add support for passing the start_method to optimize by @tchaton in #298
- Add compression example in the readme by @tchaton in #300
- Enforce passing item_loader when customizing underlying storage format by @tchaton in #296
- Optimization when there is no data to download by @tchaton in #301
- Pre release 0.2.22 by @tchaton in #302
Full Changelog: v0.2.21...v0.2.22