Database corruption with the shelve module #91228
After adding a few records, the shelve module corrupts the database keys (the database is still readable if an element key is known, but it is no longer iterable):
Traceback (most recent call last):
  File "./shelve-test.py", line 81, in <module>
    _verify_whois_cache()
  File "./shelve-test.py", line 61, in _verify_whois_cache
    for key in db.keys():
  File "/usr/local/lib/python3.8/_collections_abc.py", line 720, in __iter__
    yield from self._mapping
  File "/usr/local/lib/python3.8/shelve.py", line 95, in __iter__
    for k in self.dict.keys():
SystemError: Negative size passed to PyBytes_FromStringAndSize
I provide a short test program and data that systematically reproduce the bug. I also added a script showing the execution messages, and the resulting database in DB and text formats.
Tested with Python 3.8.12 on FreeBSD 13.0-RELEASE-p8.
See also similar issues:
3.8 only gets security patches. If you can, please test with a newer version.
Hello, [...]
Adding 185.220.102.6
Database has 62 records for 442368 bytes. Last record was 640 bytes long
Traceback (most recent call last):
  File "./shelve-test.py", line 84, in <module>
    _verify_whois_cache()
  File "./shelve-test.py", line 63, in _verify_whois_cache
    for key in db.keys():
  File "/usr/local/lib/python3.10/_collections_abc.py", line 881, in __iter__
    yield from self._mapping
  File "/usr/local/lib/python3.10/shelve.py", line 95, in __iter__
    for k in self.dict.keys():
SystemError: Negative size passed to PyBytes_FromStringAndSize
# freebsd-version -uk
13.0-RELEASE-p8
13.0-RELEASE-p10
# python3.10 --version
Python 3.10.4
The point at which the database breaks varies (from 50 to 500+ records), and the size of the database doesn't seem to be relevant (from 400K to 1800K). The size of the record doesn't appear to be relevant either (though I'm not 100% sure I'm measuring the right figure). That said, my other uses of the shelve module, with many more records that were much smaller and less complex, have had no issues.
I modified the test program to better reflect the size of the data structures stored in shelve (sys.getsizeof(), which I had been using, was far off the real size). I saw that the database got corrupted by big records, even though bigger records added earlier had not corrupted it. Records larger than 1K (mentioned in one of the other problem reports) were routinely OK. Records larger than 4K (also mentioned in another PR) were sometimes OK. When I took a problematic record and stored it alone or with a few other records, no corruption occurred. Any idea?
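To illustrate why sys.getsizeof() is a poor proxy here: shelve stores the pickled form of each value, so the length of the pickle is much closer to what ends up in the database. The following is only a sketch; the helper name stored_size and the sample record are made up for illustration, and the pickle protocol shelve actually uses depends on the Python version and on the protocol argument passed to shelve.open().

```python
import pickle
import sys


def stored_size(value, protocol=pickle.DEFAULT_PROTOCOL):
    """Approximate the number of bytes shelve writes for a value.

    Hypothetical helper: shelve pickles each value before handing it
    to the underlying dbm module, so the pickle length is a closer
    estimate than sys.getsizeof().
    """
    return len(pickle.dumps(value, protocol))


# Made-up record with nested data, loosely shaped like a whois entry.
record = {
    "netname": "EXAMPLE-NET",
    "remarks": ["line %d" % i for i in range(200)],
}

# getsizeof() counts only the outer dict object, not what it refers to.
print("sys.getsizeof:", sys.getsizeof(record))
print("pickled size: ", stored_size(record))
```

For nested structures the two numbers can differ by an order of magnitude, because sys.getsizeof() does not follow references to the contained objects.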
Additional note: the test code WORKS under Windows 8.1 / Python 3.9.1 (though the data file is suffixed .dat instead of .db), resulting in a 4 MB database with 1065 records, some of them > 11 KB. So the bug may be system-dependent.
The storage format used under Windows is completely different from the one used under Unix (or *BSD). Apart from the .dat data file, there is a .dir index file with CSV lines such as "'key', (offset, length)". Whereas under Unix (or *BSD), I have:
# file whois_cache.db
I'll run a test on a Linux Raspberry Pi to see if the issue is *BSD-specific...
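The on-disk format can also be checked from Python rather than with file(1), using the standard library's dbm.whichdb(). This is a minimal sketch: the describe_db helper and the temporary file name are illustrative only, and the example creates a dbm.dumb database because that backend is available on every platform.

```python
import dbm
import dbm.dumb
import os
import tempfile


def describe_db(path):
    """Report which dbm backend a database file belongs to.

    Hypothetical helper around dbm.whichdb(), which returns None when
    the file is missing or unreadable, "" when the format cannot be
    guessed, and otherwise a module name such as "dbm.dumb".
    """
    kind = dbm.whichdb(path)
    if kind is None:
        return "missing or unreadable"
    if kind == "":
        return "unrecognized format"
    return kind


with tempfile.TemporaryDirectory() as tmp:
    # Pass the base name without a suffix, as you would to shelve.open().
    path = os.path.join(tmp, "whois_cache")
    with dbm.dumb.open(path, "c") as db:
        db[b"key"] = b"value"
    print(describe_db(path))
```

On the FreeBSD machine, running the same helper against the base name of the affected database should show which backend shelve picked there.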
On 27.03.2022 09:56, Hubert Tournier wrote:
The shelve module uses the dbm module underneath and this will pick
https://docs.python.org/3/library/dbm.html
It's likely that you'll get the dbm.dumb interface on Windows.
dbm.whichdb() will tell you which type of dbm implementation your
More on the differences of DBM style libs:
Aside: You are probably better off using SQLite with a pickle
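As a rough illustration of the SQLite aside, a shelve-like store can be built from the standard sqlite3 and pickle modules. This is a sketch under assumptions, not what the commenter necessarily had in mind: the SqliteShelf class, the table name shelf, and the sample value are all invented for the example.

```python
import pickle
import sqlite3


class SqliteShelf:
    """Hypothetical minimal key/value store: pickled values in SQLite."""

    def __init__(self, path):
        self._conn = sqlite3.connect(path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS shelf "
            "(key TEXT PRIMARY KEY, value BLOB NOT NULL)"
        )
        self._conn.commit()

    def __setitem__(self, key, value):
        blob = pickle.dumps(value)
        # The connection context manager commits on success and
        # rolls back if the statement raises.
        with self._conn:
            self._conn.execute(
                "INSERT OR REPLACE INTO shelf (key, value) VALUES (?, ?)",
                (key, blob),
            )

    def __getitem__(self, key):
        row = self._conn.execute(
            "SELECT value FROM shelf WHERE key = ?", (key,)
        ).fetchone()
        if row is None:
            raise KeyError(key)
        return pickle.loads(row[0])

    def keys(self):
        cursor = self._conn.execute("SELECT key FROM shelf ORDER BY key")
        return [row[0] for row in cursor]

    def close(self):
        self._conn.close()


db = SqliteShelf(":memory:")  # use a file path for a persistent store
db["185.220.102.6"] = {"netname": "EXAMPLE-NET"}  # made-up value
print(db.keys())
db.close()
```

The storage format is then the same on every platform, and iterating over keys is an ordinary SQL query rather than a walk over a dbm hash file.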
#74573 might be related; this is an older, similar issue on macOS.
Mitigates the impact of python/cpython#91228