Fix message parsing for documents containing Content-Length keyword #80

dinvlad · 2019-08-24T02:59:58Z

Description

This PR fixes the logic for parsing a batch of messages in data_received(), where each message body may contain a literal Content-Length keyword.

Previously, if a document contained this keyword, then the server would crash, because it attempted to split messages on this keyword.

Now, instead of splitting the messages on the keyword, it parses content length from the first keyword, and then extracts the message based on the length. After that, the parsing loop is repeated, until no more messages can be read.
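To illustrate the difference, here is a minimal sketch of length-based extraction over a batch of complete messages. It is not the code from this PR: `HEADER`, `frame()` and `split_batch()` are hypothetical names made up for the example, and the pattern is a simplified stand-in for whatever pygls actually uses.

```python
import json
import re

# Hypothetical header pattern for this sketch (not the pattern used in pygls):
# a Content-Length line, optional further header lines, then a blank line.
HEADER = re.compile(rb"Content-Length: (\d+)\r\n(?:[^\r\n]+\r\n)*\r\n")


def frame(payload: dict) -> bytes:
    """Wrap a JSON payload in a Content-Length header (helper for the demo)."""
    body = json.dumps(payload).encode("utf-8")
    return b"Content-Length: %d\r\n\r\n" % len(body) + body


def split_batch(data: bytes) -> list:
    """Extract message bodies by declared length instead of splitting on a keyword."""
    bodies = []
    pos = 0
    while True:
        match = HEADER.match(data, pos)
        if match is None:
            break  # no further complete header block
        start = match.end()
        end = start + int(match.group(1))
        if end > len(data):
            break  # body is incomplete; this sketch simply stops here
        bodies.append(data[start:end])
        pos = end  # the next header, if any, starts right after this body
    return bodies


# The first body contains the literal keyword, which confuses a naive split.
batch = frame({"text": "Content-Length: 5"}) + frame({"id": 2})
messages = [json.loads(body) for body in split_batch(batch)]
naive_parts = [part for part in batch.split(b"Content-Length") if part]
```

Here `split_batch()` recovers the two original messages, while the naive split produces three fragments because the keyword inside the first body is treated as a message boundary.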

Update: I added handling of partial messages, since some messages may arrive in chunks.

Code review checklist (for code reviewer to complete)

Pull request represents a single change (i.e. not fixing disparate/unrelated things in a single PR)
Title summarizes what is changing
Commit messages are meaningful (see this for details)
Tests have been included and/or updated, as appropriate
Docstrings have been included and/or updated, as appropriate
Standalone docs have been updated accordingly
CONTRIBUTORS.md was updated, as appropriate
Changelog has been updated, as needed (see CHANGELOG.md)

This commit fixes the logic for parsing a batch of messages in data_received(), where each message body may contain a literal 'Content-Length' keyword. Previously, if a document contained this keyword, the server would crash, because it attempted to split messages on this keyword. Now, instead of splitting the messages on the keyword, it parses the content length from the first keyword, and then extracts the message based on that length. After that, the parsing loop is repeated until no more messages can be read.

This commit enables receiving partial messages, where a single message may arrive in chunks. We achieve this by maintaining a _message_buf list, which accumulates the parts as they arrive until the entire message (as specified by its Content-Length) can be parsed. Then the loop continues for any other messages that arrived together with the first one. The updated regex allows extra headers in the message. Additionally, we change the aio_readline() logic to always pass the entire message (with headers) to data_received(), which avoids extra logic that would otherwise be required. Finally, we add tests for this functionality.
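The buffering described above could look roughly like the following sketch. It is an illustration under stated assumptions rather than the PR's implementation: the `MessageBuffer` class, its `feed()` method, the `frame()` helper and the `HEADERS` pattern are all hypothetical, and only the `_message_buf` name is borrowed from the commit message.

```python
import re

# Hypothetical pattern for this sketch: other header lines may appear before
# or after Content-Length, and a blank line ends the header block.
HEADERS = re.compile(
    rb"^(?:[^\r\n]+\r\n)*?Content-Length: (\d+)\r\n(?:[^\r\n]+\r\n)*\r\n"
)


class MessageBuffer:
    """Accumulates chunks and yields bodies once they are complete."""

    def __init__(self):
        self._message_buf = []  # chunks received so far (name from the commit)

    def feed(self, chunk: bytes) -> list:
        """Add a chunk and return every complete message body now available."""
        self._message_buf.append(chunk)
        data = b"".join(self._message_buf)
        bodies = []
        while True:
            match = HEADERS.match(data)
            if match is None:
                break  # headers not fully received yet
            end = match.end() + int(match.group(1))
            if len(data) < end:
                break  # body not fully received yet
            bodies.append(data[match.end():end])
            data = data[end:]  # continue with whatever follows this message
        # Keep only the unparsed remainder for the next call.
        self._message_buf = [data] if data else []
        return bodies


def frame(body: bytes, extra_headers: bytes = b"") -> bytes:
    """Build a framed message, optionally with extra headers (demo helper)."""
    return b"Content-Length: %d\r\n" % len(body) + extra_headers + b"\r\n" + body


stream = frame(
    b'{"id": 1}',
    b"Content-Type: application/vscode-jsonrpc; charset=utf-8\r\n",
) + frame(b'{"text": "Content-Length: 3"}')

buffer = MessageBuffer()
first = buffer.feed(stream[:30])  # cut in the middle of the headers
rest = buffer.feed(stream[30:])   # remainder completes both messages
```

In this run the first call returns nothing because the header block is still incomplete, and the second call returns both bodies, including the one that contains the literal keyword.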

danixeee · 2019-08-29T10:54:29Z

@dinvlad Thanks for the PR, great work! Is it ready for review, or do you plan to make more changes?

dinvlad · 2019-08-29T13:22:29Z

I think it's ready, thanks!

danixeee · 2019-08-31T13:40:58Z

CONTRIBUTORS.md

@@ -5,3 +5,4 @@
 - [Max O'Cull](https://github.com/Maxattax97)
 - [Tomoya Tanjo](https://github.com/tom-tan)
 - [yorodm](https://github.com/yorodm)
+- [Denis Loginov](https://github.com/dinvlad)


Can you make this alphabetical?

danixeee · 2019-08-31T13:43:17Z

pygls/protocol.py

@@ -397,18 +397,43 @@ def connection_made(self, transport: asyncio.Transport):
        """Method from base class, called when connection is established"""
        self.transport = transport

+    MESSAGE_PATTERN = re.compile(


Can you move this up where other class constants are (alphabetically)?

danixeee

Please address two very minor (formatting) comments and I will merge it.

This is much better and more robust than before, thank you again for submitting this PR!

Edit:

Could you please update the CHANGELOG?

dinvlad · 2019-09-03T13:48:18Z

Awesome, thanks for reviewing! I've made the changes.

danixeee · 2019-09-03T16:52:00Z

I have tested everything and it works. I needed to test json-extension after installing from .vsix, since it is using IO connection instead of TCP that way.

I merged another PR today, can you just resolve the conflicts and include a link to your PR?

CHANGELOG should look like this:

- Fix parsing of partial messages and those with Content-Length keyword ([#80])
- Fix Full SyncKind for servers accepting Incremental SyncKind ([#78])

[#80]: https://github.com/openlawlibrary/pygls/pull/80
[#78]: https://github.com/openlawlibrary/pygls/pull/78

dinvlad · 2019-09-03T17:24:03Z

Sounds good - resolved, please take a look.

danixeee

Looks good. Thanks for the PR @dinvlad!

danixeee self-requested a review August 24, 2019 16:23

danixeee assigned dinvlad Aug 24, 2019

danixeee added the bug and enhancement labels Aug 24, 2019

dinvlad added 3 commits August 24, 2019 20:55

Update contributors

6a5eab8

Style fixes

3062396

danixeee reviewed Aug 31, 2019

danixeee suggested changes Aug 31, 2019

dinvlad added 3 commits September 3, 2019 09:30

Fix alphabetical order for contributors

2b84099

Relocate MESSAGE_PATTERN declaration in JsonRPCProtocol

96452b2

Add Changelog for openlawlibrary#80

b48c91c

Fix the location of Changelog entry for Unreleased version

74b7291

Merge branch 'master' into master

468561f

danixeee approved these changes Sep 3, 2019

danixeee merged commit 0d97f79 into openlawlibrary:master Sep 3, 2019
