TypeError: cannot use a string pattern on a bytes-like object #8

AndrewH-Lab49 · 2021-06-11T12:47:22Z

Using a live connection to my client's Workday instance, I get:

tap-workday-raas | File "/src/streams/workday-s3/.meltano/extractors/tap-workday-raas/venv/lib/python3.8/site-packages/tap_workday_raas/client.py", line 46, in stream_report
tap-workday-raas | coro.send(chunk)
tap-workday-raas | File "/src/streams/workday-s3/.meltano/extractors/tap-workday-raas/venv/lib/python3.8/site-packages/ijson/backends/python.py", line 39, in Lexer
tap-workday-raas | match = LEXEME_RE.search(buf, pos)
tap-workday-raas | TypeError: cannot use a string pattern on a bytes-like object

In client.py, on line 46, I replace

coro.send(chunk)

with

coro.send(chunk.decode(resp.encoding))

and I get:

tap-workday-raas | INFO Done syncing.
meltano | Incremental state has been updated at 2021-06-11 12:33:28.720672.
meltano | Extract & load complete!
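For anyone who wants to reproduce the error without a Workday connection, here is a minimal sketch. It assumes only what the traceback shows: a regex compiled from a str pattern is searched against bytes. The pattern `LEXEME` and the helper `search_chunk` are made up for illustration and are not ijson's actual code.

```python
import re

# Hypothetical stand-in for a lexer pattern compiled from a str (not ijson's real one).
LEXEME = re.compile(r"[a-z0-9.+-]+|\S")

raw_chunk = b'{"a": 1}'  # what an HTTP response yields: bytes

# Searching bytes with a str pattern raises the TypeError from the traceback.
try:
    LEXEME.search(raw_chunk)
    error_message = None
except TypeError as exc:
    error_message = str(exc)


def search_chunk(chunk, encoding="utf-8"):
    """Decode bytes to str before handing them to the str-based pattern."""
    if isinstance(chunk, bytes):
        chunk = chunk.decode(encoding)
    return LEXEME.search(chunk)


match = search_chunk(raw_chunk)
```

Decoding before the search is the same idea as the one-line change above, just isolated from the tap.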

I am not convinced that this is the best solution. Perhaps reading the declared encoding from the Content-Type header first, and only falling back to the encoding that requests guesses, would be better? The example above was just me trying to be helpful.
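One possible way to read the charset from a Content-Type value is to let the standard library parse it. This is only a sketch: the function name `charset_from_content_type` and the `utf-8` default are my assumptions, not anything the tap defines.

```python
from email.message import Message


def charset_from_content_type(content_type, default="utf-8"):
    """Return the charset parameter of a Content-Type value, or `default`.

    `default` is an assumed fallback; pick whatever suits the API in question.
    """
    if not content_type:
        return default
    msg = Message()
    msg["Content-Type"] = content_type
    # get_content_charset() lower-cases the value and handles quoted parameters.
    return msg.get_content_charset() or default
```

With a requests response this would be called as `charset_from_content_type(resp.headers.get("Content-Type"))`.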

I am not sure how this affects the existing unit tests.

I had a hard time getting the unit tests to run without spending too much time on them. For instance, I do not know where tap_tester comes from. It didn't pip install and wasn't part of the setup process. I don't believe I have access to the Circle Docker image, S3....

Please excuse me if I missed something.


AndrewH-Lab49 · 2021-06-11T14:57:49Z

Maybe use something like this to get the encoding from the Content-Type header first?

# Split the Content-Type header value into its ';'-separated parameters,
# pick the parameter that starts with 'charset=',
# and slice that prefix off. (str.lstrip() removes a set of characters,
# not a prefix, so it would turn 'charset=ascii' into 'ii'.)

v_encoding_key = 'charset='

try:
    content_headers_list = [p.strip() for p in resp.headers['Content-Type'].split(';')]
    v_encoding = next(p for p in content_headers_list if p.lower().startswith(v_encoding_key))[len(v_encoding_key):]
except (KeyError, StopIteration):
    # No Content-Type header or no charset parameter: fall back to what requests reports.
    v_encoding = resp.encoding

and then

coro.send(chunk.decode(v_encoding))
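One caveat with decoding each chunk on its own: if a multi-byte character happens to be split across two chunks, `chunk.decode(...)` raises `UnicodeDecodeError`. A rough sketch of a way around that, using the standard library's incremental decoder, is below. The generator name `decode_chunks` is hypothetical, and I am assuming the chunks come from something like `resp.iter_content()`.

```python
import codecs


def decode_chunks(chunks, encoding="utf-8"):
    """Yield str pieces from an iterable of bytes chunks.

    The incremental decoder buffers an incomplete trailing byte sequence
    until the next chunk arrives, instead of failing on it.
    """
    decoder = codecs.getincrementaldecoder(encoding)(errors="strict")
    for chunk in chunks:
        text = decoder.decode(chunk)
        if text:
            yield text
    # Flush; raises UnicodeDecodeError if the stream ended mid-character.
    tail = decoder.decode(b"", final=True)
    if tail:
        yield tail


# "é" is two bytes in UTF-8 (0xC3 0xA9); split it across two chunks.
example_chunks = [b'{"n": "\xc3', b'\xa9"}']
decoded = "".join(decode_chunks(example_chunks))
```

Each yielded piece could then be passed to `coro.send(...)` in place of the per-chunk decode.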

This was referenced Jun 11, 2021

decode chunk with info from header / requests #9

Open

decode chunk with info from header / requests lab49/tap-workday-raas#1

Merged

AndrewH-Lab49 changed the title ~~cannot use a string pattern on a bytes-like object~~ TypeError: cannot use a string pattern on a bytes-like object Jun 11, 2021
