The yaml.load{,_all} functions require Loader= now
ingydotnet committed Sep 23, 2021
1 parent 2f87ac4 commit c274365
Showing 2 changed files with 5 additions and 43 deletions.
47 changes: 5 additions & 42 deletions lib/yaml/__init__.py
@@ -18,41 +18,12 @@
 import io

 #------------------------------------------------------------------------------
-# Warnings control
+# XXX "Warnings control" is now deprecated. Leaving in the API function to not
+# break code that uses it.
 #------------------------------------------------------------------------------
-
-# 'Global' warnings state:
-_warnings_enabled = {
-    'YAMLLoadWarning': True,
-}
-
-# Get or set global warnings' state
 def warnings(settings=None):
     if settings is None:
-        return _warnings_enabled
-
-    if type(settings) is dict:
-        for key in settings:
-            if key in _warnings_enabled:
-                _warnings_enabled[key] = settings[key]
-
-# Warn when load() is called without Loader=...
-class YAMLLoadWarning(RuntimeWarning):
-    pass
-
-def load_warning(method):
-    if _warnings_enabled['YAMLLoadWarning'] is False:
-        return
-
-    import warnings
-
-    message = (
-        "calling yaml.%s() without Loader=... is deprecated, as the "
-        "default Loader is unsafe. Please read "
-        "https://msg.pyyaml.org/load for full details."
-    ) % method
-
-    warnings.warn(message, YAMLLoadWarning, stacklevel=3)
+        return {}

 #------------------------------------------------------------------------------
 def scan(stream, Loader=Loader):
@@ -100,30 +71,22 @@ def compose_all(stream, Loader=Loader):
     finally:
         loader.dispose()

-def load(stream, Loader=None):
+def load(stream, Loader):
     """
     Parse the first YAML document in a stream
     and produce the corresponding Python object.
     """
-    if Loader is None:
-        load_warning('load')
-        Loader = FullLoader
-
     loader = Loader(stream)
     try:
         return loader.get_single_data()
     finally:
         loader.dispose()

-def load_all(stream, Loader=None):
+def load_all(stream, Loader):
     """
     Parse all YAML documents in a stream
     and produce corresponding Python objects.
     """
-    if Loader is None:
-        load_warning('load_all')
-        Loader = FullLoader
-
     loader = Loader(stream)
     try:
         while loader.check_data():
1 change: 0 additions & 1 deletion tests/lib/test_dump_load.py
@@ -10,7 +10,6 @@ def test_load_no_loader(verbose=False):
     except TypeError:
         return True
     assert(False, "load() require Loader=...")

 test_load_no_loader.unittest = True

 def test_load_safeloader(verbose=False):
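
The effect of dropping the `Loader=None` default can be sketched without PyYAML itself. The snippet below is only an illustration: `EchoLoader` is a hypothetical stand-in (not a PyYAML class), and `load` here is a local copy of the shape shown in the diff, not the library function. It shows why `test_load_no_loader` expects a `TypeError` when the argument is omitted.

```python
class EchoLoader:
    """Hypothetical stand-in for a PyYAML loader class (not part of PyYAML)."""

    def __init__(self, stream):
        self.stream = stream

    def get_single_data(self):
        # A real loader would parse YAML here; this one returns the text as-is.
        return self.stream

    def dispose(self):
        # Nothing to release in this stand-in.
        pass


def load(stream, Loader):
    """Local copy of the new signature: Loader has no default value."""
    loader = Loader(stream)
    try:
        return loader.get_single_data()
    finally:
        loader.dispose()


# Passing a loader works, positionally or by keyword.
result = load("a: 1", Loader=EchoLoader)

# Omitting it fails at call time, before the function body runs.
try:
    load("a: 1")
except TypeError as exc:
    error_message = str(exc)
else:
    error_message = None

print(result)
print(error_message)
```

Because the failure comes from Python's own argument checking, no warning machinery is needed any more, which is consistent with the diff deleting `load_warning` and `YAMLLoadWarning`.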

6 comments on commit c274365

@vkottler commented on c274365 Oct 13, 2021

@ingydotnet making an argument required on this interface feels like a pretty pointless thing to do. What's the rationale here? This obviously breaks a lot of things for no apparent reason (not that dependencies can't be pinned, but interface changes to a parsing/emitting library for a stable serialization format are a bit questionable...).

@nitzmahone (Member)

This was a decision we came to as a team. Leaving it using an exploitable loader was looking to be an endless source of CVEs from "well-meaning" individuals, and changing it to a different loader poses a number of problems for the future of the library. For example, did you want a pure-Python loader, a libyaml loader, a libfyaml loader, a 1.2 or 1.3 loader? Once that's decided, which tagset should it use? Then what happens when we add new functionality to the library: do we enable that on the default loader and risk more ire from the breakage it causes? It's been issuing pretty obnoxious deprecation warnings for the past ~3 years on every 5.x release, so it definitely should not have been a surprise, and we waited for a major release boundary to make the change. Just be explicit about what your code needs, and life will be good. 😄
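
As an illustration of "be explicit about what your code needs", here is a minimal sketch of a caller naming its loader. It assumes PyYAML is installed; `CONFIG_TEXT` is made-up sample data. `yaml.SafeLoader` and `yaml.safe_load` are existing PyYAML names, and picking the safe loader is this example's choice, not something the commit mandates.

```python
import yaml  # PyYAML, assumed to be installed (third-party package)

# Made-up sample document for illustration.
CONFIG_TEXT = """
name: demo
ports: [8080, 8443]
"""

# Explicit loader: SafeLoader constructs only plain Python types
# (dict, list, str, int, ...), never arbitrary Python objects.
config = yaml.load(CONFIG_TEXT, Loader=yaml.SafeLoader)

# yaml.safe_load(stream) is shorthand for the same call.
same_config = yaml.safe_load(CONFIG_TEXT)

print(config)
print(config == same_config)
```

Code written this way behaves the same before and after this commit, since it never relied on the default.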

@vkottler

> did you want a pure-Python loader, a libyaml loader, a libfyaml loader, a 1.2 or 1.3 loader?

The spec is at 1.2; I don't see anything in it about 1.3. The "safe" C-based encoder and decoder is the clear candidate, as it covers the 99% use case for the entire ecosystem, I would think. I would rejoice at that kind of change rather than leaving it up to the caller (defeats all safety and correctness arguments against...).

> Then what happens when we add new functionality to the library

Should there be new functionality that's not additive, if the implementation-to-spec isn't feature complete? Topmost interface churn is not generally a good indicator of health in a package...

> It's been issuing pretty obnoxious deprecation warnings for the past ~3 years on every 5.x release

Definitely appreciate clean ups!

@nitzmahone (Member) commented on c274365 Oct 13, 2021

> Should there be new functionality that's not additive, if the implementation-to-spec isn't feature complete?

PyYAML as it stands today is mostly-1.1-with-some-1.2-stuff, and we're working on getting full-fledged 1.2 support in soon; there's also talk of creating new parallel classes as a testbed for 1.3 support. If we made that change today, yaml.load() would probably use a 1.1 SafeLoader (or a Fast variant we've toyed with for best-effort-use-libyaml-if-we-can); when we add full 1.2 support, I'm sure folks writing new code would expect that yaml.load() would use that by default, but that'd be another breaking change, because people again weren't being specific about what they're loading and the level of functionality they expect from the deserializer. If we solve that problem once with a breaking change now (by forcing people to be explicit about what they want), we shouldn't ever need to solve it again, no matter how many new options we add in the future.
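
Until something like the "Fast variant" mentioned above exists in the library, a caller can approximate "use libyaml if we can" on its own side. The sketch below is a common caller-side idiom, not the variant the maintainers describe. It assumes PyYAML is installed, and `load_config` is a hypothetical helper name. `yaml.CSafeLoader` is only present when PyYAML was built with libyaml bindings, hence the `getattr` fallback.

```python
import yaml  # PyYAML, assumed to be installed (third-party package)

# CSafeLoader exists only when PyYAML was built against libyaml;
# otherwise fall back to the pure-Python SafeLoader.
Loader = getattr(yaml, "CSafeLoader", yaml.SafeLoader)


def load_config(text):
    """Hypothetical helper: parse one YAML document with the chosen loader."""
    return yaml.load(text, Loader=Loader)


print(Loader.__name__)
print(load_config("a: 1"))
```

The choice is made once, in the caller's code, so a future change to what the library offers would not silently alter which loader this code uses.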

@vkottler

Understood, thanks for the responses and insight @nitzmahone. Will plan on following along with the upcoming efforts!

@rianhunter

Documentation at https://pyyaml.org/wiki/PyYAMLDocumentation needs to be updated.
