Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve uri.parseQuery to never raise an error #16647

Merged
merged 1 commit into from
Jan 12, 2021
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
10 changes: 10 additions & 0 deletions changelog.md
Original file line number Diff line number Diff line change
Expand Up @@ -96,6 +96,16 @@
with other backends. see #9125. Use `-d:nimLegacyJsRound` for previous behavior.
- Added `socketstream` module that wraps sockets in the stream interface

- Changed the behavior of `uri.decodeQuery` when there are unencoded `=`
Copy link
Member

@timotheecour timotheecour Jan 11, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not convinced that the CI errors comes from my code, errors seems to come from fidget and PackageFileParsed (whatever those are) and I don't really see the problem looking at the CI logs...

https://dev.azure.com/nim-lang/255dfe86-e590-40bb-a8a2-3c0295ebdeb1/_apis/build/builds/12009/logs/84

2021-01-11T23:28:40.1530530Z �[1m�[31mFAIL: �[36mtests/stdlib/turi.nim c�[0m
...
2021-01-11T23:28:40.1541230Z ../../lib/system/fatal.nim(53, 5) Error: unhandled exception: expected raising 'UriParseError', instead nothing was raised by:
2021-01-11T23:28:40.1542060Z discard toSeq(decodeQuery("a=1&b=2c=6")) [AssertionDefect]

it comes from this PR, the fix is trivial:
in tests/stdlib/turi.nim, adapt:

    doAssertRaises(UriParseError):
      discard toSeq(decodeQuery("a=1&b=2c=6"))

accordingly

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, I was looking at the other CI runs where this did not appear, my bad.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no problem; please click "resolve conversation" for any resolved comment (unless it contains pushback)

characters in the decoded values. Prior versions would raise an error. This is
no longer the case to comply with the HTML spec and other languages
implementations. Old behavior can be obtained with
`-d:nimLegacyParseQueryStrict`. `cgi.decodeData` which uses the same
underlying code is also updated the same way.




- Added `math.signbit`.

- Removed the optional `longestMatch` parameter of the `critbits._WithPrefix` iterators (it never worked reliably)
Expand Down
14 changes: 4 additions & 10 deletions lib/pure/cgi.nim
Original file line number Diff line number Diff line change
Expand Up @@ -84,23 +84,17 @@ proc getEncodedData(allowedMethods: set[RequestMethod]): string =
iterator decodeData*(data: string): tuple[key, value: TaintedString] =
## Reads and decodes CGI data and yields the (name, value) pairs the
## data consists of.
Comment on lines 85 to 86
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

decodeData should document nimLegacyParseQueryStrict option too.

And even with -d:nimLegacyParseQueryStrict, it never raises CgiError now. It actually raises UriParseError.

ref #19308 (comment)

try:
for (key, value) in uri.decodeQuery(data):
yield (key, value)
except UriParseError as e:
cgiError(e.msg)
for (key, value) in uri.decodeQuery(data):
yield (key, value)

iterator decodeData*(allowedMethods: set[RequestMethod] =
{methodNone, methodPost, methodGet}): tuple[key, value: TaintedString] =
## Reads and decodes CGI data and yields the (name, value) pairs the
## data consists of. If the client does not use a method listed in the
## `allowedMethods` set, a `CgiError` exception is raised.
let data = getEncodedData(allowedMethods)
try:
for (key, value) in uri.decodeQuery(data):
yield (key, value)
except UriParseError as e:
cgiError(e.msg)
for (key, value) in uri.decodeQuery(data):
yield (key, value)

proc readData*(allowedMethods: set[RequestMethod] =
{methodNone, methodPost, methodGet}): StringTableRef =
Expand Down
38 changes: 23 additions & 15 deletions lib/pure/uri.nim
Original file line number Diff line number Diff line change
Expand Up @@ -161,22 +161,26 @@ func encodeQuery*(query: openArray[(string, string)], usePlus = true,
result.add(encodeUrl(val, usePlus))

iterator decodeQuery*(data: string): tuple[key, value: TaintedString] =
## Reads and decodes query string ``data`` and yields the (key, value) pairs the
## data consists of.
## Reads and decodes query string `data` and yields the `(key, value)` pairs
## the data consists of. If compiled with `-d:nimLegacyParseQueryStrict`, an
## error is raised when there is an unencoded `=` character in a decoded
## value, which was the behavior in Nim < 1.5.1
runnableExamples:
import std/sugar
let s = collect(newSeq):
for k, v in decodeQuery("foo=1&bar=2"): (k, v)
doAssert s == @[("foo", "1"), ("bar", "2")]
import std/sequtils
doAssert toSeq(decodeQuery("foo=1&bar=2=3")) == @[("foo", "1"), ("bar", "2=3")]
doAssert toSeq(decodeQuery("&a&=b&=&&")) == @[("", ""), ("a", ""), ("", "b"), ("", ""), ("", "")]

proc parseData(data: string, i: int, field: var string): int =
proc parseData(data: string, i: int, field: var string, sep: char): int =
result = i
while result < data.len:
case data[result]
let c = data[result]
case c
of '%': add(field, decodePercent(data, result))
of '+': add(field, ' ')
of '=', '&': break
else: add(field, data[result])
of '&': break
else:
if c == sep: break
else: add(field, data[result])
Comment on lines +181 to +183
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
else:
if c == sep: break
else: add(field, data[result])
elif c == sep: break
else: add(field, data[result])

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the else of the case statement, I'm not convinced that elif is a valid option for it.

Copy link
Member

@timotheecour timotheecour Jan 11, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's valid, and it works, i verified

inc(result)

var i = 0
Expand All @@ -185,16 +189,20 @@ iterator decodeQuery*(data: string): tuple[key, value: TaintedString] =
# decode everything in one pass:
while i < data.len:
setLen(name, 0) # reuse memory
i = parseData(data, i, name)
i = parseData(data, i, name, '=')
mildred marked this conversation as resolved.
Show resolved Hide resolved
setLen(value, 0) # reuse memory
if i < data.len and data[i] == '=':
inc(i) # skip '='
i = parseData(data, i, value)
when defined(nimLegacyParseQueryStrict):
i = parseData(data, i, value, '=')
else:
i = parseData(data, i, value, '&')
yield (name.TaintedString, value.TaintedString)
if i < data.len:
if data[i] == '&': inc(i)
else:
uriParseError("'&' expected at index '$#' for '$#'" % [$i, data])
when defined(nimLegacyParseQueryStrict):
if data[i] != '&':
uriParseError("'&' expected at index '$#' for '$#'" % [$i, data])
inc(i)

func parseAuthority(authority: string, result: var Uri) =
var i = 0
Expand Down
3 changes: 1 addition & 2 deletions tests/stdlib/turi.nim
Original file line number Diff line number Diff line change
Expand Up @@ -287,8 +287,7 @@ template main() =

block: # decodeQuery
doAssert toSeq(decodeQuery("a=1&b=0")) == @[("a", "1"), ("b", "0")]
doAssertRaises(UriParseError):
discard toSeq(decodeQuery("a=1&b=2c=6"))
doAssert toSeq(decodeQuery("a=1&b=2c=6")) == @[("a", "1"), ("b", "2c=6")]

static: main()
main()