explicitly passed parametrize ids do not get escaped to ascii #1351

davehunt · 2016-02-01T13:10:16Z

If values in parametrize contain unicode bytes, they are escaped when generating the IDs, however if these same objects are provided via the ids argument, they are not escaped. As a result, the resulting IDs can't be used with pylib's XML/HTML generation.

This was discovered and previously discussed in https://github.com/davehunt/pytest-html/pull/30

To reproduce, create a conftest.py with the following contents:

# -*- coding: utf-8 -*-
def pytest_runtest_logreport(report):
    # print report.nodeid  # test_unicode.py::test_unicode[〈]
    print unicode(report.nodeid)  # UnicodeDecodeError: 'ascii' codec can't decode byte 0xe3 in position 30: ordinal not in range(128)
    # print report.nodeid.encode('string-escape')  # test_unicode.py::test_unicode[〈]
    # print unicode(report.nodeid.encode('string-escape'))  # test_unicode[\xe3\x80\x88]

Then, create a test file with the following contents:

# -*- coding: utf-8 -*-
import pytest
@pytest.mark.parametrize('val', ['foo'], ids=['〈'])
def test_unicode(val):
    pass

Run the test, and you will get the following traceback:

test_unicode.py 
INTERNALERROR> Traceback (most recent call last):
INTERNALERROR>   File "/Users/dhunt/.virtualenvs/tmp-19232ea9309e990c/lib/python2.7/site-packages/_pytest/main.py", line 90, in wrap_session
INTERNALERROR>     session.exitstatus = doit(config, session) or 0
INTERNALERROR>   File "/Users/dhunt/.virtualenvs/tmp-19232ea9309e990c/lib/python2.7/site-packages/_pytest/main.py", line 121, in _main
INTERNALERROR>     config.hook.pytest_runtestloop(session=session)
INTERNALERROR>   File "/Users/dhunt/.virtualenvs/tmp-19232ea9309e990c/lib/python2.7/site-packages/_pytest/vendored_packages/pluggy.py", line 724, in __call__
INTERNALERROR>     return self._hookexec(self, self._nonwrappers + self._wrappers, kwargs)
INTERNALERROR>   File "/Users/dhunt/.virtualenvs/tmp-19232ea9309e990c/lib/python2.7/site-packages/_pytest/vendored_packages/pluggy.py", line 338, in _hookexec
INTERNALERROR>     return self._inner_hookexec(hook, methods, kwargs)
INTERNALERROR>   File "/Users/dhunt/.virtualenvs/tmp-19232ea9309e990c/lib/python2.7/site-packages/_pytest/vendored_packages/pluggy.py", line 333, in <lambda>
INTERNALERROR>     _MultiCall(methods, kwargs, hook.spec_opts).execute()
INTERNALERROR>   File "/Users/dhunt/.virtualenvs/tmp-19232ea9309e990c/lib/python2.7/site-packages/_pytest/vendored_packages/pluggy.py", line 596, in execute
INTERNALERROR>     res = hook_impl.function(*args)
INTERNALERROR>   File "/Users/dhunt/.virtualenvs/tmp-19232ea9309e990c/lib/python2.7/site-packages/_pytest/main.py", line 146, in pytest_runtestloop
INTERNALERROR>     item.config.hook.pytest_runtest_protocol(item=item, nextitem=nextitem)
INTERNALERROR>   File "/Users/dhunt/.virtualenvs/tmp-19232ea9309e990c/lib/python2.7/site-packages/_pytest/vendored_packages/pluggy.py", line 724, in __call__
INTERNALERROR>     return self._hookexec(self, self._nonwrappers + self._wrappers, kwargs)
INTERNALERROR>   File "/Users/dhunt/.virtualenvs/tmp-19232ea9309e990c/lib/python2.7/site-packages/_pytest/vendored_packages/pluggy.py", line 338, in _hookexec
INTERNALERROR>     return self._inner_hookexec(hook, methods, kwargs)
INTERNALERROR>   File "/Users/dhunt/.virtualenvs/tmp-19232ea9309e990c/lib/python2.7/site-packages/_pytest/vendored_packages/pluggy.py", line 333, in <lambda>
INTERNALERROR>     _MultiCall(methods, kwargs, hook.spec_opts).execute()
INTERNALERROR>   File "/Users/dhunt/.virtualenvs/tmp-19232ea9309e990c/lib/python2.7/site-packages/_pytest/vendored_packages/pluggy.py", line 595, in execute
INTERNALERROR>     return _wrapped_call(hook_impl.function(*args), self.execute)
INTERNALERROR>   File "/Users/dhunt/.virtualenvs/tmp-19232ea9309e990c/lib/python2.7/site-packages/_pytest/vendored_packages/pluggy.py", line 253, in _wrapped_call
INTERNALERROR>     return call_outcome.get_result()
INTERNALERROR>   File "/Users/dhunt/.virtualenvs/tmp-19232ea9309e990c/lib/python2.7/site-packages/_pytest/vendored_packages/pluggy.py", line 279, in get_result
INTERNALERROR>     _reraise(*ex)  # noqa
INTERNALERROR>   File "/Users/dhunt/.virtualenvs/tmp-19232ea9309e990c/lib/python2.7/site-packages/_pytest/vendored_packages/pluggy.py", line 264, in __init__
INTERNALERROR>     self.result = func()
INTERNALERROR>   File "/Users/dhunt/.virtualenvs/tmp-19232ea9309e990c/lib/python2.7/site-packages/_pytest/vendored_packages/pluggy.py", line 596, in execute
INTERNALERROR>     res = hook_impl.function(*args)
INTERNALERROR>   File "/Users/dhunt/.virtualenvs/tmp-19232ea9309e990c/lib/python2.7/site-packages/_pytest/runner.py", line 65, in pytest_runtest_protocol
INTERNALERROR>     runtestprotocol(item, nextitem=nextitem)
INTERNALERROR>   File "/Users/dhunt/.virtualenvs/tmp-19232ea9309e990c/lib/python2.7/site-packages/_pytest/runner.py", line 72, in runtestprotocol
INTERNALERROR>     rep = call_and_report(item, "setup", log)
INTERNALERROR>   File "/Users/dhunt/.virtualenvs/tmp-19232ea9309e990c/lib/python2.7/site-packages/_pytest/runner.py", line 123, in call_and_report
INTERNALERROR>     hook.pytest_runtest_logreport(report=report)
INTERNALERROR>   File "/Users/dhunt/.virtualenvs/tmp-19232ea9309e990c/lib/python2.7/site-packages/_pytest/vendored_packages/pluggy.py", line 724, in __call__
INTERNALERROR>     return self._hookexec(self, self._nonwrappers + self._wrappers, kwargs)
INTERNALERROR>   File "/Users/dhunt/.virtualenvs/tmp-19232ea9309e990c/lib/python2.7/site-packages/_pytest/vendored_packages/pluggy.py", line 338, in _hookexec
INTERNALERROR>     return self._inner_hookexec(hook, methods, kwargs)
INTERNALERROR>   File "/Users/dhunt/.virtualenvs/tmp-19232ea9309e990c/lib/python2.7/site-packages/_pytest/vendored_packages/pluggy.py", line 333, in <lambda>
INTERNALERROR>     _MultiCall(methods, kwargs, hook.spec_opts).execute()
INTERNALERROR>   File "/Users/dhunt/.virtualenvs/tmp-19232ea9309e990c/lib/python2.7/site-packages/_pytest/vendored_packages/pluggy.py", line 596, in execute
INTERNALERROR>     res = hook_impl.function(*args)
INTERNALERROR>   File "/Users/dhunt/workspace/pytest-scratchpad/conftest.py", line 6, in pytest_runtest_logreport
INTERNALERROR>     print unicode(report.nodeid)  # UnicodeDecodeError: 'ascii' codec can't decode byte 0xe3 in position 30: ordinal not in range(128)
INTERNALERROR> UnicodeDecodeError: 'ascii' codec can't decode byte 0xe3 in position 30: ordinal not in range(128)

The text was updated successfully, but these errors were encountered:

RonnyPfannschmidt · 2016-02-10T19:42:01Z

at second glace, this isbreaking as expected, print on python2 will break on unicode

please verify

The-Compiler · 2016-02-10T20:00:17Z

No, unicode is - in the case of pytest-html, it's when trying to add the element to HTML

The point is if pytest should always make sure that ids are valid unicode objects, by doing _escape_bytes if they are bytes as well.

RonnyPfannschmidt · 2016-02-10T21:55:35Z

my understanding is, that currently node id's are native strings so bytes on python2
if we want to make them unicode in all cases, then we need a major release

The-Compiler · 2016-02-11T06:11:39Z

Sorry, my comment above is wrong - it should ensure they're always ascii-only native strings (i.e. bytes on Python 2), like it does with the autogenerated IDs.

For example, with this conftest.py:

from __future__ import print_function

def pytest_runtest_logreport(report):
    if report.when == 'call':
        print('\n  nodeid: {!r}'.format(report.nodeid))

and this test_foo.py:

# encoding: utf-8

import pytest

@pytest.mark.parametrize('foo', ['büpf'])
def test_args(foo):
    pass

@pytest.mark.parametrize('foo', ['a'], ids=['büpf'])
def test_ids(foo):
    pass

The autogenerated ID gets escaped to be ascii only, but not the ids one:

$ python2 -m py.test  -v
[...]

test_foo.py::test_args[b\xc3\xbcpf] PASSED
  nodeid: 'test_foo.py::test_args[b\\xc3\\xbcpf]'

test_foo.py::test_ids[büpf] PASSED
  nodeid: 'test_foo.py::test_ids[b\xc3\xbcpf]'

RonnyPfannschmidt · 2016-02-11T07:42:47Z

@hpk42 ping?

im under the impression we should either ensure the id's are always ascii,
and warn if the given id's do not match canonical ascii or support unicode

since python is growing more international and identifiers are unicode in 3.x, i think we should support unicode node id's simply because that is what we will get in future
for non-ascii byte string node id's we should issue a pytest-warning and escape them as if they where byte strings

ceridwen · 2016-03-17T18:01:57Z

I'd like to try this bug for my Outreachy application for pytest-html. What's the final conclusion on what parametrize should do with the ids arguments? I'd assume it's probably better to support Unicode ids. Should I call the same _escape_bytes() that's being used in _idval?

The-Compiler · 2016-03-17T18:22:06Z

FWIW, @RonnyPfannschmidt's proposal makes sense from my POV.

RonnyPfannschmidt · 2016-03-17T19:29:36Z

a simplistic implementation needs to ensure a few details

ensue non-ascii byte strings are escaped to something ascii-ish
ensure unicode/utf8 is mapped correctly to text

we should always byte-escape non-aphanumeric id values that are byte strings
we should always unicode-escape unicode id values
the target is, that node id's are always ascii text in the end

nicoddemus · 2016-03-17T19:30:40Z

@RonnyPfannschmidt in your proposal do you mean Python 2 and 3, or only for Python 2? Python 3 should we strive to support full unicode?

RonnyPfannschmidt · 2016-03-17T19:32:14Z

for both - even if we support full unicode id's we still will run into problems on windows, osx and other things

nicoddemus · 2016-03-17T19:35:46Z

Displaying them in the terminal for example?

RonnyPfannschmidt · 2016-03-17T19:39:33Z

yes, also they are sometimes used for filenames/lookup keys - making them consistent ascii text in all cases seems sensible simply to avoid platform issues on unicode boundaries

nicoddemus · 2016-03-17T19:40:40Z

Fair enough, sounds reasonable. Thanks! 😁

ceridwen · 2016-03-18T20:29:33Z

A draft pull request is up at #1463.

ceridwen · 2016-03-22T14:29:12Z

New pull request targeted at features is up at #1470.

The-Compiler · 2016-04-05T19:16:27Z

Closing as the PR was merged

This was referenced Feb 1, 2016

Allow unicode in test names pytest-dev/pytest-html#30

Closed

Add URL tests that follow redirects and allow these to be run from the pipeline mozilla/bedrock#3822

Merged

RonnyPfannschmidt changed the title ~~bytes objects in parametrize ID lists are not encoded~~ explicitly passed parametrize ids do not get escaped to ascii Feb 11, 2016

ceridwen mentioned this issue Apr 5, 2016

Escape both bytes and unicode strings for "ids" in Metafunc.parametrize #1470

Merged

The-Compiler closed this as completed Apr 5, 2016

This was referenced Aug 24, 2016

Update pytest to 3.0.1 pydanny/contributors#24

Closed

Update pytest to 3.0.1 pydanny/cached-property#39

Closed

Update pytest to 3.0.1 DavidHickman/calendary#9

Merged

pyup-bot mentioned this issue Dec 18, 2017

Pin pytest to latest version 3.3.1 gasparka/pyha_demo_project#6

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

explicitly passed parametrize ids do not get escaped to ascii #1351

explicitly passed parametrize ids do not get escaped to ascii #1351

davehunt commented Feb 1, 2016

RonnyPfannschmidt commented Feb 10, 2016

The-Compiler commented Feb 10, 2016

RonnyPfannschmidt commented Feb 10, 2016

The-Compiler commented Feb 11, 2016

RonnyPfannschmidt commented Feb 11, 2016

ceridwen commented Mar 17, 2016

The-Compiler commented Mar 17, 2016

RonnyPfannschmidt commented Mar 17, 2016

nicoddemus commented Mar 17, 2016

RonnyPfannschmidt commented Mar 17, 2016

nicoddemus commented Mar 17, 2016

RonnyPfannschmidt commented Mar 17, 2016

nicoddemus commented Mar 17, 2016

ceridwen commented Mar 18, 2016

ceridwen commented Mar 22, 2016

The-Compiler commented Apr 5, 2016

explicitly passed parametrize ids do not get escaped to ascii #1351

explicitly passed parametrize ids do not get escaped to ascii #1351

Comments

davehunt commented Feb 1, 2016

RonnyPfannschmidt commented Feb 10, 2016

The-Compiler commented Feb 10, 2016

RonnyPfannschmidt commented Feb 10, 2016

The-Compiler commented Feb 11, 2016

RonnyPfannschmidt commented Feb 11, 2016

ceridwen commented Mar 17, 2016

The-Compiler commented Mar 17, 2016

RonnyPfannschmidt commented Mar 17, 2016

nicoddemus commented Mar 17, 2016

RonnyPfannschmidt commented Mar 17, 2016

nicoddemus commented Mar 17, 2016

RonnyPfannschmidt commented Mar 17, 2016

nicoddemus commented Mar 17, 2016

ceridwen commented Mar 18, 2016

ceridwen commented Mar 22, 2016

The-Compiler commented Apr 5, 2016