move importlib functions to within C #408
When _bootstrap_external is undergoing setup(), I replace the functions and classes in its namespace (i.e. its __dict__) with those from _imp.
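A minimal Python-level sketch of this kind of namespace patching. The two modules are hypothetical stand-ins built with types.ModuleType (not the real _bootstrap_external or _imp), and patch_namespace is a helper invented for illustration, not code from this PR.

```python
import types


def patch_namespace(target, source, names):
    """Overwrite entries in target.__dict__ with same-named objects from source."""
    replaced = []
    for name in names:
        if hasattr(source, name):
            target.__dict__[name] = getattr(source, name)
            replaced.append(name)
    return replaced


# Hypothetical stand-in for a pure-Python bootstrap module.
slow = types.ModuleType("slow_bootstrap")
exec(
    "def _pack_uint32(x):\n"
    "    return (int(x) & 0xFFFFFFFF).to_bytes(4, 'little')\n"
    "def cache_key(x):\n"
    "    return _pack_uint32(x).hex()\n",
    slow.__dict__,
)

# Hypothetical stand-in for the accelerated module (plays the role of _imp).
fast = types.ModuleType("fast_impl")
fast._pack_uint32 = lambda x: (int(x) & 0xFFFFFFFF).to_bytes(4, "little")

original = slow._pack_uint32
replaced = patch_namespace(slow, fast, ["_pack_uint32", "not_provided"])
```

Because functions look up module globals at call time, cache_key picks up the replacement without being redefined; names the source module does not provide are simply left alone.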
Testing on |
diff --git a/libc/zipos/read.c b/libc/zipos/read.c
index 2767dd10c..4fc80fa4d 100644
--- a/libc/zipos/read.c
+++ b/libc/zipos/read.c
@@ -43,7 +43,7 @@ ssize_t __zipos_read(struct ZiposHandle *h, const struct iovec *iov,
   x = y = opt_offset != -1 ? opt_offset : h->pos;
   for (i = 0; i < iovlen && y < h->size; ++i, y += b) {
     b = min(iov[i].iov_len, h->size - y);
-    memcpy(iov[i].iov_base, h->mem + y, b);
+    if(iov[i].iov_base) memcpy(iov[i].iov_base, h->mem + y, b);
   }
   if (opt_offset == -1) h->pos = y;
   return y - x;
silences the warnings.
There was a comment there warning of memory faults. asan/ubsan proved it right.
Instead of reading the FILE* in small pieces via PyMarshal_ReadObjectFromFile, I now load the whole file into memory and call PyMarshal_ReadObjectFromString. This prevents the ubsan warnings.
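A rough Python-level analogue of the two approaches, using the standard marshal module rather than the C API: marshal.load stands in for PyMarshal_ReadObjectFromFile and marshal.loads for PyMarshal_ReadObjectFromString. The helper names and the temporary file are assumptions for illustration; this is not the C code from the PR.

```python
import marshal
import os
import tempfile


def load_streaming(path):
    """Let marshal pull data from the open file object as it decodes."""
    with open(path, "rb") as f:
        return marshal.load(f)


def load_whole(path):
    """Read the entire file into memory first, then decode from the buffer."""
    with open(path, "rb") as f:
        data = f.read()
    return marshal.loads(data)


# Write a marshalled code object to a scratch file (no .pyc header here).
code = compile("answer = 6 * 7", "<sketch>", "exec")
fd, path = tempfile.mkstemp(suffix=".marshal")
try:
    with os.fdopen(fd, "wb") as f:
        marshal.dump(code, f)
    streamed = load_streaming(path)
    whole = load_whole(path)
finally:
    os.remove(path)

namespace = {}
exec(whole, namespace)
```

Both paths should yield an equivalent code object; the second trades one larger read for simpler decoding from a single contiguous buffer.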
The following change enabled me to fix my code that was causing the ubsan warning:
diff --git a/libc/intrin/ubsan.c b/libc/intrin/ubsan.c
index ec357de0f..c65eb4a88 100644
--- a/libc/intrin/ubsan.c
+++ b/libc/intrin/ubsan.c
@@ -216,6 +216,7 @@ static void __ubsan_warning(const struct UbsanSourceLocation *loc,
                             const char *description) {
   kprintf("%s:%d: %subsan warning: %s is undefined behavior%s\n", loc->file,
           loc->line, SUBTLE, description, RESET);
+  __die();
 }
 dontdiscard __ubsan_die_f *__ubsan_abort(const struct UbsanSourceLocation *loc,
Now all tests pass in |
Now we know what happens inside, so we just call the actual function, PyEval_EvalCode, with the right setup.
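As a Python-level illustration of what "the right setup" amounts to, the sketch below creates a bare module and runs a code object in its namespace with exec(). The run_module_code helper and the module name are hypothetical; the PR itself does this in C.

```python
import types


def run_module_code(name, code):
    """Create a bare module and execute a code object in its namespace."""
    module = types.ModuleType(name)
    # The module's __dict__ serves as the globals for the executed code,
    # so top-level assignments and defs become module attributes.
    exec(code, module.__dict__)
    return module


code = compile(
    "VALUE = 21 * 2\n"
    "def double(x):\n"
    "    return 2 * x\n",
    "<sketch>",
    "exec",
)
demo = run_module_code("demo_module", code)
```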
spec._initializing is only there for a single check + optimization within import.c that ends up calling _lock_unlock_module. Since we are ignoring locks for the time being, this check can also be ignored.
-    memcpy(iov[i].iov_base, h->mem + y, b);
+    if(iov[i].iov_base) memcpy(iov[i].iov_base, h->mem + y, b);
To avoid undefined behavior there, you'd likely want:
-    memcpy(iov[i].iov_base, h->mem + y, b);
+    if(b) memcpy(iov[i].iov_base, h->mem + y, b);
I'd be interested in seeing the numbers. Startup latency is something I value. It's difficult to do with a change like this, because the system calls themselves usually have a lot of overhead. For instance, a stat() system call can be on the order of a microsecond, whereas for a crossover between Python and C, my best guess (not having measured it) would be around 100ns. There's also some latency when assets are pulled out and inflated from the zip.
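One rough way to put numbers on those two costs from Python is timeit. The sketch below is only an approximation: the lambda wrapper adds call overhead to both measurements, the path is an arbitrary choice, and results depend heavily on the OS and filesystem.

```python
import os
import timeit


def per_call_ns(func, number=20000):
    """Return the best observed average time per call, in nanoseconds."""
    best = min(timeit.repeat(func, number=number, repeat=5))
    return best / number * 1e9


path = os.curdir  # any path that exists; an arbitrary choice

# os.stat() goes through to a stat-style system call on each invocation.
stat_ns = per_call_ns(lambda: os.stat(path))
# len() is implemented in C, so calling it approximates one Python-to-C crossing.
call_ns = per_call_ns(lambda: len(path))

print(f"os.stat: {stat_ns:.0f} ns/call, len(): {call_ns:.0f} ns/call")
```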
So I implemented this because I read a blog post about Python startup times, and the question I had was "How can we use the benefits of APE ZIP store + Cosmopolitan Libc to speed up imports in APE Python?" I tried to do this earlier when submitting #248, but at that time I wasn't as good at writing CPython code with Cosmopolitan Libc's support. Performance is measured in that blog post by just checking
If you like I can run the pyperformance benchmark for startup times. We can also look at
By moving things like this into C, For example, we can add a bool into the |
@jart this moves some of the frequently-used functions from Python code in _bootstrap_external.py to C in import.c. All tests pass in MODE=, MODE=tiny.
The idea is that almost every line of Python code can cause a memory allocation, and so functions that are frequently called end up wasting memory repeatedly. By moving the Python code to C, it becomes clearer how the function calls occur, and we can attempt more optimizations at the lower C level.
For example: during APE startup, _validate_bytecode_header and code_to_bytecode are called back-to-back for every .pyc import. This leads to the same sequence of calls and allocations being repeated for each import. In C we can mostly avoid the BytesObjects allocations and many function calls by just calling PyMarshal_ReadObjectFromFile with a valid FILE *.
It's a small speedup, but it adds up if all tests run slightly faster than before.
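To make the allocation point concrete, here is a hedged Python sketch of loading a .pyc two ways: slicing the data (each slice allocates a new bytes object) versus handing marshal a memoryview so the payload is not copied. It assumes the 16-byte header layout of CPython 3.7+ (PEP 552), checks only the magic number rather than everything _validate_bytecode_header checks, and uses helper names invented for illustration.

```python
import importlib.util
import marshal
import os
import py_compile
import tempfile

HEADER_SIZE = 16  # assumption: CPython 3.7+ .pyc header (PEP 552)


def load_pyc_with_copies(data):
    """Check the magic number, then slice; each slice allocates new bytes."""
    magic = data[:4]
    if magic != importlib.util.MAGIC_NUMBER:
        raise ImportError("bad magic number")
    body = data[HEADER_SIZE:]  # copies the whole payload
    return marshal.loads(body)


def load_pyc_with_view(data):
    """Same check, but slices of a memoryview share the original buffer."""
    view = memoryview(data)
    if view[:4] != importlib.util.MAGIC_NUMBER:
        raise ImportError("bad magic number")
    return marshal.loads(view[HEADER_SIZE:])


# Compile a tiny throwaway module so there is a real .pyc to read.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "mod.py")
    with open(src, "w") as f:
        f.write("GREETING = 'hello'\n")
    pyc = py_compile.compile(src, cfile=os.path.join(tmp, "mod.pyc"), doraise=True)
    with open(pyc, "rb") as f:
        data = f.read()

code_a = load_pyc_with_copies(data)
code_b = load_pyc_with_view(data)
ns = {}
exec(code_b, ns)
```

The C version described above goes further by skipping the intermediate Python objects entirely, but the view-based variant shows the same idea of validating and decoding without re-copying the file contents.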