Commit

doc(decode_bytes): advise on the joint use with itemize()
Closes: #35
mih committed Jul 11, 2024
1 parent c3f311b commit 8290bed
Showing 1 changed file with 8 additions and 0 deletions.
8 changes: 8 additions & 0 deletions datasalad/itertools/decode_bytes.py
@@ -23,6 +23,14 @@ def decode_bytes(
be spread across multiple chunks of heterogeneous sizes, for example
output read from a process or pieces of a download.
There is no guarantee that exactly one output chunk will be yielded for
every input chunk. Input byte strings might be split at error-locations, or
might be joined if a multi-byte encoding is spread over multiple chunks. If
``decode_bytes()`` is used together with ``itemize()``, it is advisable to
wrap ``itemize()`` around ``decode_bytes()`` to avoid an impact on the
number and nature of yielded items with respect to the desired itemization
pattern.
Multi-byte encodings that are spread over multiple byte chunks are
supported, and chunks are joined as necessary. For example, the utf-8
encoding for ö is ``b'\\xc3\\xb6'``. If the encoding is split in the
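The behavior described in the docstring above can be illustrated with a minimal, self-contained sketch. It does not use datasalad itself: `decode_chunks()` and `split_lines()` are hypothetical stand-ins built on the standard library, written only to show why the number of decoded chunks need not match the number of input chunks, and why itemizing the decoded stream (rather than decoding pre-itemized bytes) keeps the items intact. The actual `decode_bytes()` and `itemize()` implementations may differ in detail.

```python
import codecs
from typing import Iterable, Iterator


def decode_chunks(chunks: Iterable[bytes], encoding: str = "utf-8") -> Iterator[str]:
    """Hypothetical stand-in for a chunk-wise decoder (not datasalad's code)."""
    # An incremental decoder buffers incomplete multi-byte sequences
    # until the remaining bytes arrive in a later chunk.
    decoder = codecs.getincrementaldecoder(encoding)(errors="backslashreplace")
    for chunk in chunks:
        text = decoder.decode(chunk)
        if text:  # a chunk holding only a partial character yields nothing yet
            yield text
    tail = decoder.decode(b"", final=True)  # flush whatever is still buffered
    if tail:
        yield tail


def split_lines(chunks: Iterable[str], sep: str = "\n") -> Iterator[str]:
    """Hypothetical stand-in for an itemizer working across chunk boundaries."""
    pending = ""
    for chunk in chunks:
        pending += chunk
        # Everything before the last separator is complete; keep the rest.
        *items, pending = pending.split(sep)
        yield from items
    if pending:
        yield pending


# Four input chunks; the UTF-8 encoding of "ö" (b"\xc3\xb6") is split
# across the second and third chunk.
chunks = [b"gr", b"\xc3", b"\xb6\xc3\x9fe\nzw", b"ei\n"]

decoded = list(decode_chunks(chunks))
print(decoded)  # ['gr', 'öße\nzw', 'ei\n'] : three outputs for four inputs

# Itemization wrapped around decoding: items follow the separator only.
items = list(split_lines(decode_chunks(chunks)))
print(items)  # ['größe', 'zwei']
```

Because the itemizer runs last, the way the decoder splits or joins chunks has no effect on the resulting items; they are determined by the separator alone.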
