gh-109653: Improve the import time of email.utils #109824

Merged 12 commits on Oct 12, 2023
14 changes: 8 additions & 6 deletions Lib/email/utils.py
@@ -22,11 +22,8 @@
     'unquote',
     ]

-import os
 import re
 import time
-import random
-import socket
 import datetime
 import urllib.parse

@@ -36,9 +33,6 @@

 from email._parseaddr import parsedate, parsedate_tz, _parsedate_tz

-# Intrapackage imports
-from email.charset import Charset
-
 COMMASPACE = ', '
 EMPTYSTRING = ''
 UEMPTYSTRING = ''
@@ -94,6 +88,8 @@ def formataddr(pair, charset='utf-8'):
             name.encode('ascii')
         except UnicodeEncodeError:
             if isinstance(charset, str):
+                # lazy import to improve module import time
+                from email.charset import Charset
Contributor

Not sure I like this one; I'm only fine with it since formataddr doesn't look too widely used (a lot of the time it's nice to pay these costs upfront, predictable performance is important, e.g. you don't want the first request your webserver serves to be randomly slow).

Member Author

> a lot of the time it's nice to pay these costs upfront, predictable performance is important

Agreed. On the other hand, the email package already relies quite heavily on lazy imports in some other places, so this does seem in keeping with that general philosophy:

# Some convenience routines. Don't import Parser and Message as side-effects
# of importing email since those cascadingly import most of the rest of the
# email package.
def message_from_string(s, *args, **kws):
    """Parse a string into a Message object model.

    Optional _class and strict are passed to the Parser constructor.
    """
    from email.parser import Parser
    return Parser(*args, **kws).parsestr(s)

def message_from_bytes(s, *args, **kws):
    """Parse a bytes string into a Message object model.

    Optional _class and strict are passed to the Parser constructor.
    """
    from email.parser import BytesParser
    return BytesParser(*args, **kws).parsebytes(s)

def message_from_file(fp, *args, **kws):
    """Read a file and parse its contents into a Message object model.

    Optional _class and strict are passed to the Parser constructor.
    """
    from email.parser import Parser
    return Parser(*args, **kws).parse(fp)

def message_from_binary_file(fp, *args, **kws):
    """Read a binary file and parse its contents into a Message object model.

    Optional _class and strict are passed to the Parser constructor.
    """
    from email.parser import BytesParser
    return BytesParser(*args, **kws).parse(fp)

Member Author

I'd be happy to change it so it's imported at the top of the function if you think that'd be better?

Contributor

Nah, that'd be worse: it should be either at module level or here. I'm fine with this as is!

                 charset = Charset(charset)
             encoded_name = charset.header_encode(name)
             return "%s <%s>" % (encoded_name, address)
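
The trade-off discussed in this thread (the first call pays the import cost, later calls hit the `sys.modules` cache) can be observed with a small sketch. The helper names `timed_call` and `lazy_header_encode` are made up for illustration and are not part of `email.utils`; the function-level import simply mirrors the pattern in the hunk above.

```python
import sys
import time


def timed_call(func, *args):
    """Call func(*args) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


def lazy_header_encode(name, charset="utf-8"):
    # Hypothetical helper: the import runs on every call, but only the
    # first call in a process actually loads email.charset; later calls
    # find it already cached in sys.modules.
    from email.charset import Charset
    return Charset(charset).header_encode(name)


first_result, first_elapsed = timed_call(lazy_header_encode, "Jürgen")
second_result, second_elapsed = timed_call(lazy_header_encode, "Jürgen")

print(f"first call:  {first_elapsed * 1e6:.0f} us")
print(f"second call: {second_elapsed * 1e6:.0f} us")
print("email.charset cached:", "email.charset" in sys.modules)
```

In a fresh interpreter the first call would usually be noticeably slower than the second, which is the "randomly slow first request" concern raised above. The exact numbers depend on the machine and on whether something else has already imported `email.charset`.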
@@ -181,6 +177,12 @@ def make_msgid(idstring=None, domain=None):
     portion of the message id after the '@'. It defaults to the locally
     defined hostname.
     """
+    # Lazy imports to speedup module import time
+    # (no other functions in email.utils need these modules)
+    import os
+    import random
+    import socket
+
     timeval = int(time.time()*100)
     pid = os.getpid()
     randint = random.getrandbits(64)
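
One way to check whether a change like this actually keeps modules out of the import graph is to look at `sys.modules` in a fresh interpreter. The following is a rough sketch under that assumption; `modules_loaded_by` is a hypothetical helper written for this note, not something from the PR.

```python
import subprocess
import sys


def modules_loaded_by(module_name, candidates):
    """Return the candidates found in sys.modules after importing
    module_name in a fresh interpreter (hypothetical helper)."""
    code = (
        "import sys\n"
        f"import {module_name}\n"
        f"print(','.join(m for m in {list(candidates)!r} if m in sys.modules))"
    )
    proc = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True, text=True, check=True,
    )
    out = proc.stdout.strip()
    return set(out.split(",")) if out else set()


loaded = modules_loaded_by("email.utils", ["os", "random", "socket"])
print("loaded by 'import email.utils':", sorted(loaded))
```

The result depends on the Python version in use. `os` will typically appear regardless, because interpreter startup imports it, so `random` and `socket` are the more informative entries here.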
@@ -0,0 +1,4 @@
Reduce the import time of :mod:`email.utils` by around 43%. This results in
the import time of :mod:`email.message` falling by around 18%, which in turn
reduces the import time of :mod:`importlib.metadata` by around 6%. Patch by
Alex Waygood.
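
Percentages like these can be approximated locally with CPython's `-X importtime` option, which writes per-module import timings to stderr. The sketch below is only one possible way to measure it; the helper names `cumulative_import_time_us` and `best_of` are invented for this example, and results will vary between machines and Python versions.

```python
import subprocess
import sys


def cumulative_import_time_us(module_name):
    """Cumulative import time in microseconds that `-X importtime`
    reports for module_name in a fresh interpreter, or None if the
    module does not appear in the report."""
    proc = subprocess.run(
        [sys.executable, "-X", "importtime", "-c", f"import {module_name}"],
        capture_output=True, text=True, check=True,
    )
    prefix = "import time:"
    for line in proc.stderr.splitlines():
        if not line.startswith(prefix):
            continue
        # Each report line has the form: self [us] | cumulative | module
        fields = line[len(prefix):].split("|")
        if len(fields) == 3 and fields[2].strip() == module_name:
            return int(fields[1])
    return None


def best_of(module_name, runs=5):
    """Smallest cumulative time over several fresh interpreters."""
    times = [cumulative_import_time_us(module_name) for _ in range(runs)]
    times = [t for t in times if t is not None]
    return min(times) if times else None


for name in ("email.utils", "email.message", "importlib.metadata"):
    print(name, best_of(name, runs=3), "us")
```

Running this on builds from before and after the change and comparing the numbers would give a rough equivalent of the figures quoted in the NEWS entry. Taking the minimum over several runs reduces noise from disk caching and other processes.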