This repository has been archived by the owner on Mar 3, 2023. It is now read-only.

Fix log-reader for Python3 #3580

Merged 3 commits on Jul 26, 2020

@thinker0 (Member) commented Jul 23, 2020

Python3

@Code0x58

[I 200722 23:06:34 web:1811] 200 GET /filedata/./log-files/container_36_system-log-to-es_36.log.0?offset=-1&length=-1 (10.25.178.125) 0.64ms
[E 200722 23:06:34 web:1407] Uncaught exception GET /filedata/./log-files/container_36_system-log-to-es_36.log.0?offset=2178106&length=50000 (10.25.178.125)
    HTTPServerRequest(protocol='http', host='LNMESOSS1513:31449', method='GET', uri='/filedata/./log-files/container_36_system-log-to-es_36.log.0?offset=2178106&length=50000', version='HTTP/1.1', remote_ip='10.25.178.125', headers={'Connection': 'close', 'Host': 'LNMESOSS1513:31449', 'Accept-Encoding': 'gzip'})
    Traceback (most recent call last):
      File "/var/lib/mesos/slaves/0114e468-0f07-485c-8673-14ff74e8b46c-S593/frameworks/c663397e-a472-43bd-92dd-d97027fcf6ce-0000/executors/thermos-www-release-heron-system-access-es-others-36-60361105-ae4e-4efc-84df-b70d3c717113/runs/f8aa692a-249d-40f8-a8db-e7e4a8cbdc8f/sandbox/.pex/installed_wheels/a92bf48f8e3ffa83220bafe58403d68e2d170148/tornado-4.0.2-cp36-cp36m-linux_x86_64.whl/tornado/web.py", line 1288, in _stack_context_handle_exception
        raise_exc_info((type, value, traceback))
      File "<string>", line 3, in raise_exc_info
      File "/var/lib/mesos/slaves/0114e468-0f07-485c-8673-14ff74e8b46c-S593/frameworks/c663397e-a472-43bd-92dd-d97027fcf6ce-0000/executors/thermos-www-release-heron-system-access-es-others-36-60361105-ae4e-4efc-84df-b70d3c717113/runs/f8aa692a-249d-40f8-a8db-e7e4a8cbdc8f/sandbox/.pex/installed_wheels/a92bf48f8e3ffa83220bafe58403d68e2d170148/tornado-4.0.2-cp36-cp36m-linux_x86_64.whl/tornado/web.py", line 1475, in wrapper
        result = method(self, *args, **kwargs)
      File "/var/lib/mesos/slaves/0114e468-0f07-485c-8673-14ff74e8b46c-S593/frameworks/c663397e-a472-43bd-92dd-d97027fcf6ce-0000/executors/thermos-www-release-heron-system-access-es-others-36-60361105-ae4e-4efc-84df-b70d3c717113/runs/f8aa692a-249d-40f8-a8db-e7e4a8cbdc8f/sandbox/.pex/code/e789de3949af7faec9b2f90bd1c725ad59bc2557/heron/shell/src/python/handlers/filedatahandler.py", line 50, in get
        data = utils.read_chunk(path, offset=offset, length=length, escape_data=True)
      File "/var/lib/mesos/slaves/0114e468-0f07-485c-8673-14ff74e8b46c-S593/frameworks/c663397e-a472-43bd-92dd-d97027fcf6ce-0000/executors/thermos-www-release-heron-system-access-es-others-36-60361105-ae4e-4efc-84df-b70d3c717113/runs/f8aa692a-249d-40f8-a8db-e7e4a8cbdc8f/sandbox/.pex/code/e789de3949af7faec9b2f90bd1c725ad59bc2557/heron/shell/src/python/utils.py", line 146, in read_chunk
        data = _escape_data(data) if escape_data else data
      File "/var/lib/mesos/slaves/0114e468-0f07-485c-8673-14ff74e8b46c-S593/frameworks/c663397e-a472-43bd-92dd-d97027fcf6ce-0000/executors/thermos-www-release-heron-system-access-es-others-36-60361105-ae4e-4efc-84df-b70d3c717113/runs/f8aa692a-249d-40f8-a8db-e7e4a8cbdc8f/sandbox/.pex/code/e789de3949af7faec9b2f90bd1c725ad59bc2557/heron/shell/src/python/utils.py", line 152, in _escape_data
        return escape(data.decode('utf8', 'replace'))
    AttributeError: 'str' object has no attribute 'decode'
[E 200722 23:06:34 web:1811] 500 GET /filedata/./log-files/container_36_system-log-to-es_36.log.0?offset=2178106&length=50000 (10.25.178.125) 0.98ms
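
The AttributeError in the traceback above comes from calling decode on data read in text mode. A minimal sketch of that root cause follows, assuming Python 3 and a throwaway sample file (the file name and contents are illustrative, not taken from the PR):

```python
import os
import tempfile

# Write a small sample file to stand in for a container log.
with tempfile.NamedTemporaryFile("wb", suffix=".log", delete=False) as tmp:
    tmp.write(b"hello <log>\n")
    sample_path = tmp.name

with open(sample_path, "r") as fp:   # text mode, as in the failing code
    text_data = fp.read()
with open(sample_path, "rb") as fp:  # binary mode
    binary_data = fp.read()
os.remove(sample_path)

# In Python 3, text mode yields str (no .decode); binary mode yields bytes.
print(type(text_data).__name__, hasattr(text_data, "decode"))      # str False
print(type(binary_data).__name__, hasattr(binary_data, "decode"))  # bytes True
```

In Python 2 both modes returned a byte string with a decode method, which is why the same line did not fail there.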

@@ -135,7 +135,7 @@ def read_chunk(filename, offset=-1, length=-1, escape_data=False):
  if length == -1:
    length = fstat.st_size - offset

  with open(filename, "r") as fp:

@Code0x58 (Contributor) commented Jul 23, 2020

To be consistent with the previous behaviour (not sure it still makes sense), I think the whole PR change would be something like:

with open(filename, "rb") as fp:
    ...
    if data:
        # use permissive decoding and escaping if escape_data is set, otherwise use strict decoding
        data = _escape_data(data) if escape_data else data.decode()
        ...
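
The suggestion above could be fleshed out roughly as follows. This is a sketch, not the merged code: only the read_chunk signature, the _escape_data body, and the quoted lines come from this thread. The handling of offset=-1, the seek/read steps, and the demo file are assumptions made so the example runs on its own.

```python
import os
import tempfile
from xml.sax.saxutils import escape


def _escape_data(data):
    # Permissive: undecodable bytes become U+FFFD instead of raising.
    return escape(data.decode("utf8", "replace"))


def read_chunk(filename, offset=-1, length=-1, escape_data=False):
    # Hypothetical reconstruction; only the signature and the quoted lines
    # are taken from the diff in this thread.
    size = os.stat(filename).st_size
    if offset == -1:
        offset = size  # assumption: -1 means "start at the end of the file"
    if length == -1:
        length = size - offset

    with open(filename, "rb") as fp:  # bytes in, so offsets are byte offsets
        fp.seek(offset)
        data = fp.read(length)

    if data:
        # Permissive decoding and escaping if escape_data is set, else strict.
        data = _escape_data(data) if escape_data else data.decode()
        return dict(offset=offset, length=len(data), data=data)
    return dict(offset=offset, length=0)


# Usage on a throwaway file standing in for a log.
with tempfile.NamedTemporaryFile("wb", suffix=".log", delete=False) as tmp:
    tmp.write(b"a<b & c\n")
    demo_path = tmp.name

first = read_chunk(demo_path, offset=0, length=3, escape_data=True)
tail = read_chunk(demo_path)
os.remove(demo_path)
```

Note that length in the returned dict counts characters of the escaped string rather than bytes read; whether callers depend on that is not settled in this thread.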

@Code0x58 (Contributor) left a comment

It looks like this could be made more compatible with the old behaviour.

    return dict(offset=offset, length=len(data), data=data)

  return dict(offset=offset, length=0)

def _escape_data(data):
  return escape(data.decode('utf8', 'replace'))

@Code0x58 (Contributor) commented Jul 23, 2020

Once the change to this is reverted, the PR will be compatible with the Python 2 behaviour. Alternatively, you could do the decoding in the data = ... line above.

@thinker0 (Member, Author)

The problem is that the code does not work in Python 3 at all, not that it lacks Python 2 compatibility. I am modifying it because the error above occurs.

@thinker0 (Member, Author) commented Jul 23, 2020

with open(filename, "rb") as fp:

If you read the file as binary, escape doesn't work, because it is passed bytes instead of str:

[E 200723 10:40:39 web:1407] Uncaught exception GET /filedata/./log-files/container_46_pulsar-prod-4_96.log.0?offset=4191685&length=50000 (10.128.139.55)
    HTTPServerRequest(protocol='http', host='shared-aurora-044-ladp-jp2p-prod:31516', method='GET', uri='/filedata/./log-files/container_46_pulsar-prod-4_96.log.0?offset=4191685&length=50000', version='HTTP/1.1', remote_ip='10.128.139.55', headers={'Connection': 'close', 'Host': 'shared-aurora-044-ladp-jp2p-prod:31516', 'Accept-Encoding': 'gzip'})
    Traceback (most recent call last):
      File "/var/lib/mesos/slaves/9a7b96a9-a670-48e3-bedf-051a91ac5a9b-S251/frameworks/c663397e-a472-43bd-92dd-d97027fcf6ce-0000/executors/thermos-www-release-heron-system-access-es-others-46-02c80d1d-6257-4b42-829c-6fafbed25b8d/runs/8597c5be-55d4-4b8b-9e2e-2b8faddc3372/sandbox/.pex/installed_wheels/ba642ca162d5bf8bbf2fad77d02edcf2a3188eb8/tornado-4.0.2-cp36-cp36m-linux_x86_64.whl/tornado/web.py", line 1288, in _stack_context_handle_exception
        raise_exc_info((type, value, traceback))
      File "<string>", line 3, in raise_exc_info
      File "/var/lib/mesos/slaves/9a7b96a9-a670-48e3-bedf-051a91ac5a9b-S251/frameworks/c663397e-a472-43bd-92dd-d97027fcf6ce-0000/executors/thermos-www-release-heron-system-access-es-others-46-02c80d1d-6257-4b42-829c-6fafbed25b8d/runs/8597c5be-55d4-4b8b-9e2e-2b8faddc3372/sandbox/.pex/installed_wheels/ba642ca162d5bf8bbf2fad77d02edcf2a3188eb8/tornado-4.0.2-cp36-cp36m-linux_x86_64.whl/tornado/web.py", line 1475, in wrapper
        result = method(self, *args, **kwargs)
      File "/var/lib/mesos/slaves/9a7b96a9-a670-48e3-bedf-051a91ac5a9b-S251/frameworks/c663397e-a472-43bd-92dd-d97027fcf6ce-0000/executors/thermos-www-release-heron-system-access-es-others-46-02c80d1d-6257-4b42-829c-6fafbed25b8d/runs/8597c5be-55d4-4b8b-9e2e-2b8faddc3372/sandbox/.pex/code/ca989480a7444b1c46f32b02147ba41568922357/heron/shell/src/python/handlers/filedatahandler.py", line 50, in get
        data = utils.read_chunk(path, offset=offset, length=length, escape_data=True)
      File "/var/lib/mesos/slaves/9a7b96a9-a670-48e3-bedf-051a91ac5a9b-S251/frameworks/c663397e-a472-43bd-92dd-d97027fcf6ce-0000/executors/thermos-www-release-heron-system-access-es-others-46-02c80d1d-6257-4b42-829c-6fafbed25b8d/runs/8597c5be-55d4-4b8b-9e2e-2b8faddc3372/sandbox/.pex/code/ca989480a7444b1c46f32b02147ba41568922357/heron/shell/src/python/utils.py", line 147, in read_chunk
        data = _escape_data(data) if escape_data else data.decode()
      File "/var/lib/mesos/slaves/9a7b96a9-a670-48e3-bedf-051a91ac5a9b-S251/frameworks/c663397e-a472-43bd-92dd-d97027fcf6ce-0000/executors/thermos-www-release-heron-system-access-es-others-46-02c80d1d-6257-4b42-829c-6fafbed25b8d/runs/8597c5be-55d4-4b8b-9e2e-2b8faddc3372/sandbox/.pex/code/ca989480a7444b1c46f32b02147ba41568922357/heron/shell/src/python/utils.py", line 153, in _escape_data
        return escape(data)
      File "/usr/lib64/python3.6/xml/sax/saxutils.py", line 27, in escape
        data = data.replace("&", "&amp;")
    TypeError: a bytes-like object is required, not 'str'
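
The TypeError above can be reproduced in isolation. A minimal sketch, assuming Python 3 and a made-up input value: xml.sax.saxutils.escape calls data.replace with str arguments, so it needs str input, and decoding first avoids the error.

```python
from xml.sax.saxutils import escape

raw = b"a < b & c"  # what fp.read() returns when the file is opened with "rb"

try:
    escape(raw)
    bytes_error = None
except TypeError as exc:
    # bytes.replace() rejects the str arguments that escape() passes in.
    bytes_error = exc

# Decoding to str first lets escape() do its work.
escaped = escape(raw.decode("utf8", "replace"))
print(escaped)  # a &lt; b &amp; c
```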

@thinker0 (Member, Author)

escape(data.decode())

I changed it to the above, and it works now.

Contributor

Yep, although I think that decode will still want errors='replace' to be compatible with the Python 2 behaviour.
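
The point about errors='replace' matters because a chunk read at an arbitrary byte offset can cut a multi-byte UTF-8 character in half. A small sketch under that assumption (the sample string is illustrative):

```python
text = "héllo"
raw = text.encode("utf8")  # 'é' is encoded as two bytes: 0xC3 0xA9
chunk = raw[:2]            # b"h\xc3": the chunk boundary splits 'é'

try:
    chunk.decode()         # strict decoding, the default
    strict_error = None
except UnicodeDecodeError as exc:
    strict_error = exc     # strict mode raises on the dangling lead byte

# Permissive decoding substitutes U+FFFD for the incomplete sequence.
lenient = chunk.decode("utf8", errors="replace")
```

With strict decoding, a request for such a chunk would surface as a 500; with errors='replace' the reader sees a replacement character at the chunk boundary instead.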

Signed-off-by: thinker0 <[email protected]>
@huijunwu huijunwu mentioned this pull request Jul 26, 2020
@huijunwu huijunwu merged commit 0c9b209 into apache:master Jul 26, 2020
@thinker0 thinker0 deleted the feature/fix-log-reader branch July 26, 2020 07:47
nicknezis pushed a commit that referenced this pull request Sep 14, 2020
* Fix log-reader for Python3

Signed-off-by: thinker0 <[email protected]>

* typo

Signed-off-by: thinker0 <[email protected]>

* Revert commit

Co-authored-by: thinker0 <[email protected]>