Releases: fb55/htmlparser2
v9.1.0
v9.0.0
Breaking Changes
- The tokenizer now uses the `EntityDecoder` from the `entities` module #1480
  - Parsing of entities in attributes is now aligned with the HTML spec, and some inputs will produce different results. E.g. in `<a href='&=boo'>` the attribute value won't be modified any more.
  - The `ontextentity` tokenizer callback now has an `endIndex` argument; if you use the tokenizer directly, make sure indices are still the same.
- Stacks inside the parser have been reversed. #1511
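
To make the attribute change above more concrete, here is a rough sketch of one HTML spec rule that affects attribute values: a named reference without a trailing `;` is left untouched when the next character is `=` or alphanumeric. This is not htmlparser2's or `entities`' code. The `LEGACY_ENTITIES` table and the `decodeLegacyEntities` function are hypothetical names, the table is a tiny subset, and the input differs from the release note's own example.

```typescript
// Hypothetical subset of named references that are accepted without a
// trailing semicolon; the real table is much larger.
const LEGACY_ENTITIES: Record<string, string> = {
    amp: "&",
    lt: "<",
    gt: ">",
    quot: '"',
};

// True for the characters that block decoding of an unterminated
// reference inside an attribute value: "=" or an ASCII alphanumeric.
function blocksAttributeDecoding(char: string): boolean {
    return /^[A-Za-z0-9=]$/.test(char);
}

// Decodes the references listed in LEGACY_ENTITIES.
// `inAttribute` selects the attribute-specific rule from the HTML spec.
function decodeLegacyEntities(value: string, inAttribute: boolean): string {
    return value.replace(
        /&(amp|lt|gt|quot)(;?)/g,
        (match: string, name: string, semicolon: string, offset: number) => {
            // Character right after the matched reference ("" at end of input).
            const next = value.charAt(offset + match.length);
            if (inAttribute && semicolon === "" && blocksAttributeDecoding(next)) {
                return match; // Leave the text as written.
            }
            return LEGACY_ENTITIES[name];
        }
    );
}

// Same input, different context:
const asAttribute = decodeLegacyEntities("&amp=boo", true);
const asText = decodeLegacyEntities("&amp=boo", false);
```

With this sketch, the unterminated reference is kept as-is in attribute context and decoded in text context; a properly terminated `&amp;` is decoded in both.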
Features
- Added a `createDocumentStream` function, analogous to `createDomStream` (which is now deprecated) #1510
Full Changelog: v8.0.2...v9.0.0
v8.0.2
Bug Fixes
Other changes
- Dependency version bumps
- GitHub Workflows security hardening by @sashashura in #1365
- refactor(lint): Add `eslint-plugin-n` and `-unicorn` by @fb55 in #1352
- chore(test): Move from JSON tests to specs by @fb55 in #1354
- docs(readme): Use GitHub Actions CI badge by @fb55 in #1374
New Contributors
- @sashashura made their first contribution in #1365
- @KillyMXI made their first contribution in #1460
Full Changelog: v8.0.1...v8.0.2
v8.0.1
v8.0.0
Breaking
- The deprecated `FeedHandler` class has been removed #1166
  - See #1166 for how to migrate.
- TypeScript >= 4.5 is now required; see #1242
- The types from `domhandler` and `domutils` have changed, the deprecated `normalizeWhitespace` option was removed #1164
- The parser was updated to no longer concatenate strings. This led to several changes of internal interfaces. #1045
  - This reduces the memory overhead when parsing streams, and avoids copying memory.
  - Breaking if you were previously extending internals.
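
The idea of parsing streamed input without concatenating chunks can be sketched as follows. This is only an illustration under stated assumptions, not the parser's actual internals: `ChunkQueue`, `write`, `slice` and `retainedChunks` are hypothetical names. Chunks are kept as separate strings, slices are requested by absolute index, and chunks that lie entirely before the requested start are dropped.

```typescript
// Keeps streamed chunks separately instead of joining them into one string.
class ChunkQueue {
    private chunks: string[] = [];
    // Absolute index (across the whole stream) of the first char of chunks[0].
    private offset = 0;

    write(chunk: string): void {
        this.chunks.push(chunk);
    }

    // Returns the text between absolute indices [start, end).
    // Assumes `start` never moves backwards between calls, because
    // chunks that end before `start` are discarded.
    slice(start: number, end: number): string {
        while (
            this.chunks.length > 0 &&
            start - this.offset >= this.chunks[0].length
        ) {
            this.offset += this.chunks[0].length;
            this.chunks.shift();
        }

        let result = "";
        let base = this.offset; // Absolute index of the current chunk's start.
        for (let i = 0; i < this.chunks.length && base < end; i++) {
            const chunk = this.chunks[i];
            result += chunk.slice(Math.max(start - base, 0), end - base);
            base += chunk.length;
        }
        return result;
    }

    get retainedChunks(): number {
        return this.chunks.length;
    }
}

// A tag name and a text node that both span chunk boundaries.
const queue = new ChunkQueue();
queue.write("<di");
queue.write("v>hel");
queue.write("lo");
const tagName = queue.slice(1, 4);
const text = queue.slice(5, 10);
```

Only the requested slice is assembled, and the first chunk is released once a later slice no longer needs it.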
Features
- `htmlparser2` is now a dual CommonJS & ESM module #1165
Other changes
- Updated for `entities`' updated decoding tree structure #1146
- Highlight special close-implies-open logic by @vassudanagunta in #1047
- Update Events/07 test to clarify interpretation of tag end slashes by @vassudanagunta in #1046
- Suggest `parse5` for HTML compliance by @vassudanagunta in #1147
New Contributors
- @vassudanagunta made their first contribution in #1047
Full Changelog: v7.2.0...v8.0.0
v7.2.0
What's Changed
Fixes:
Docs
- docs(readme): make `parseDocument()` example clearer by @cameronsteele in #998
Refactors:
- Introduce sequences & fast forwarding by @fb55 in #1007
- Emit text before entities once entity is confirmed by @fb55 in #1009
The refactors lead to a combined ~5% speed-up.
New Contributors
- @cameronsteele made their first contribution in #998
Full Changelog: v7.1.2...v7.2.0
v7.1.2
v7.1.1
v7.1.0
Features:
- Added an `isImplied` flag to the `onopentag` / `onclosetag` events (#930) f917004
  - This allows consumers to set start/end indices more correctly. Inspired by posthtml/posthtml-parser#80.
- It is now possible to get indices for attributes (#929) 28c162b
Fixes:
- [email protected] changed how indices were computed. Unfortunately, a lot of edge-cases weren't handled correctly. This version fixes this.
- `.pause` would lead to data being wrongfully discarded (#927) 78af88d
- The tokenizer would still emit some data after an error (#923) 08b2040
- Issue in foreign content: The tag name `foreignObject` will always be lowercased in HTML e852205
Refactors:
- refactor(feeds): Move `getFeed` to `domutils` (#931) f10dc03
- refactor(tokenizer): Use explicit empty buffer if we have reached the end 9c30fe6
- chore(tests): Add test for error without a listener 0eb0067
- chore(tests): Use proxies to collect events (#920) a2b0bf3
- chore(tests): Move `stream` tests into `WritableStream.spec` (#916) da67eba
- refactor(tokenizer): Remove unused branches, improve test coverage (#914) a2eae51
- docs(readme): Update benchmark results d45fc82
v7.0.0
[email protected] changes a lot of internals, resulting in a 20% overall performance improvement in AndreasMadsen's htmlparser-benchmark.
Breaking changes:
- Fixed how start & end index positions are calculated (#910) 5ab080e
  - Some indices, especially end indices, will now have changed. Most importantly, end indices will now always be greater than or equal to start indices (whoops!).
Features:
Refactors:
- Use a trie to decode HTML & XML entities in the tokenizer (#863) 9a47a55
  - Leads to large speed-ups when dealing with entities.
- Iterate over char codes in the tokenizer (#894) f5aed75
  - Improved tokenizer performance by ~40%.
- Use `Map` for `openImpliesClose` in the parser (#911) 39a8109
- Moved logic of `FeedHandler` to a function (#912) 3a672ff
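
As a rough illustration of the first two refactors above, the following sketch decodes entities by walking a trie keyed by char codes. It is a simplified model, not the library's implementation: `TrieNode`, `buildTrie` and `decodeEntities` are hypothetical names, the trie uses nested `Map`s for readability, and only the five XML entities are included.

```typescript
interface TrieNode {
    children: Map<number, TrieNode>; // Keyed by char code, not by string.
    value?: string; // Set when the path from the root spells a full entity.
}

// Builds a trie from entity names (including the trailing ";").
function buildTrie(entities: Record<string, string>): TrieNode {
    const root: TrieNode = { children: new Map() };
    for (const name of Object.keys(entities)) {
        let node = root;
        for (let i = 0; i < name.length; i++) {
            const code = name.charCodeAt(i);
            let next = node.children.get(code);
            if (next === undefined) {
                next = { children: new Map() };
                node.children.set(code, next);
            }
            node = next;
        }
        node.value = entities[name];
    }
    return root;
}

const AMPERSAND = 38; // "&".charCodeAt(0)

// Replaces known entities; unknown or unterminated ones stay as text.
function decodeEntities(input: string, trie: TrieNode): string {
    let out = "";
    let textStart = 0; // Start of the pending, not yet emitted text.
    let i = 0;
    while (i < input.length) {
        if (input.charCodeAt(i) !== AMPERSAND) {
            i++;
            continue;
        }
        // Walk the trie one char code at a time until a value is found
        // or no child matches.
        let node: TrieNode | undefined = trie;
        let j = i + 1;
        while (node !== undefined && node.value === undefined && j < input.length) {
            node = node.children.get(input.charCodeAt(j));
            j++;
        }
        if (node !== undefined && node.value !== undefined) {
            // Entity confirmed: emit the text before it, then the decoded value.
            out += input.slice(textStart, i) + node.value;
            textStart = j;
            i = j;
        } else {
            i++;
        }
    }
    return out + input.slice(textStart);
}

const xmlTrie = buildTrie({
    "amp;": "&",
    "lt;": "<",
    "gt;": ">",
    "quot;": '"',
    "apos;": "'",
});
const decoded = decodeEntities("a &lt; b &amp;&amp; c &unknown; d", xmlTrie);
```

Each input character is inspected via `charCodeAt` and compared as a number, and a failed lookup stops after the first non-matching character instead of scanning ahead for a `;`.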