Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixed on_disk_cache issues #1942

Merged
merged 3 commits into from
Oct 13, 2022
Merged

Conversation

VitalyFedyunin
Copy link
Contributor

@VitalyFedyunin VitalyFedyunin commented Oct 12, 2022

Fixed issues with cache locks and cache files overwrites. Required to be compatible with pytorch/data#810

Stack from ghstack (oldest at bottom):

[ghstack-poisoned]
VitalyFedyunin added a commit that referenced this pull request Oct 12, 2022
ghstack-source-id: 0a71d244a68e1ae09b47597cd51aa51867f44b4f
Pull Request resolved: #1942
Fixed issues with cache locks and cache files overwrites. Required to be compatible with pytorch/data#810




[ghstack-poisoned]
VitalyFedyunin added a commit that referenced this pull request Oct 12, 2022
ghstack-source-id: b560715160f296b4d5872928992a7a630893914f
Pull Request resolved: #1942
@Nayef211
Copy link
Contributor

@VitalyFedyunin looks like all TestCNNDM is failing on all platforms. Will stamp once those are fixed!

@VitalyFedyunin
Copy link
Contributor Author

Looks like tests havn't picked up recent torchdata changes, lets wait a bit and restart them.

@joecummings
Copy link
Contributor

Looks like tests havn't picked up recent torchdata changes, lets wait a bit and restart them.

These tests are now failing for every new PR. Which specific change is breaking this? @VitalyFedyunin

@VitalyFedyunin
Copy link
Contributor Author

That PR pytorch/data#810 introduced BC breaking change on cache api (as well as major stability fix).
This PR is fixing all failed tests.

Fixed issues with cache locks and cache files overwrites. Required to be compatible with pytorch/data#810




[ghstack-poisoned]
VitalyFedyunin added a commit that referenced this pull request Oct 13, 2022
ghstack-source-id: f6fbef07f1e7c877093f04b7891ac2f5f5c8db10
Pull Request resolved: #1942
@VitalyFedyunin
Copy link
Contributor Author

Not sure why 3.8 is failing. Something related to xml decoding.

@joecummings
Copy link
Contributor

Not sure why 3.8 is failing. Something related to xml decoding.

I think this is a transient error with that particular dataset. I'll approve but @Nayef211 can you confirm?

Copy link
Contributor

@joecummings joecummings left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@Nayef211
Copy link
Contributor

Not sure why 3.8 is failing. Something related to xml decoding.

I think this is a transient error with that particular dataset. I'll approve but @Nayef211 can you confirm?

Yup can confirm. This was also raised in #1926

@Nayef211 Nayef211 merged commit e9604ca into gh/VitalyFedyunin/1/base Oct 13, 2022
@Nayef211 Nayef211 deleted the gh/VitalyFedyunin/1/head branch October 13, 2022 19:04
joecummings added a commit that referenced this pull request Oct 14, 2022
* Fixed on_disk_cache issues

[ghstack-poisoned]

* Update on "Fixed on_disk_cache issues"

Fixed issues with cache locks and cache files overwrites. Required to be compatible with pytorch/data#810




[ghstack-poisoned]

* Update on "Fixed on_disk_cache issues"

Fixed issues with cache locks and cache files overwrites. Required to be compatible with pytorch/data#810




[ghstack-poisoned]

Co-authored-by: Vitaly Fedyunin <[email protected]>
@ejguan
Copy link
Contributor

ejguan commented Oct 20, 2022

@Nayef211 @abhinavarora Can we cherry pick this PR to the release?
We plan to cherry pick the one from torchdata pytorch/data#810. And, since this is the PR fixing a bug #1903, should this be included in the release?

Edit: If you can cherrypick this PR, I will do anther RC cut from torchdata side and upload a new RC release for torchtext to validate the change on release branch.

ejguan pushed a commit to ejguan/text that referenced this pull request Oct 20, 2022
* Fixed on_disk_cache issues

[ghstack-poisoned]

* Update on "Fixed on_disk_cache issues"

Fixed issues with cache locks and cache files overwrites. Required to be compatible with pytorch/data#810




[ghstack-poisoned]

* Update on "Fixed on_disk_cache issues"

Fixed issues with cache locks and cache files overwrites. Required to be compatible with pytorch/data#810




[ghstack-poisoned]

Co-authored-by: Vitaly Fedyunin <[email protected]>
abhinavarora pushed a commit that referenced this pull request Oct 21, 2022
* Fixed on_disk_cache issues

[ghstack-poisoned]

* Update on "Fixed on_disk_cache issues"

Fixed issues with cache locks and cache files overwrites. Required to be compatible with pytorch/data#810




[ghstack-poisoned]

* Update on "Fixed on_disk_cache issues"

Fixed issues with cache locks and cache files overwrites. Required to be compatible with pytorch/data#810




[ghstack-poisoned]

Co-authored-by: Vitaly Fedyunin <[email protected]>

Co-authored-by: Joe Cummings <[email protected]>
Co-authored-by: Vitaly Fedyunin <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants