Commit

Initialize startPos to 0 vs -1
So that nodes that are added without first going through transition() have the correct start pos.

Fixes #2106
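
The effect described in the commit message can be sketched with a toy model. This is not jsoup's `Tokeniser`: the class `StartPosSketch` and its methods are hypothetical, and the only behavior it assumes is what the commit states, namely that the start position is updated on a state transition, reset to an unset marker after an emission, and that the first text can be emitted without any transition having happened.

```java
// Toy model only: NOT jsoup's Tokeniser. All names here are hypothetical.
final class StartPosSketch {
    static final int UNSET = -1;

    private int charStartPos; // where the pending characters started
    private int readerPos = 0; // current position in the imaginary input

    StartPosSketch(int initialStartPos) {
        // -1 models the old initial value, 0 models the new one.
        charStartPos = initialStartPos;
    }

    // A state transition records where the upcoming characters start.
    void transition() {
        charStartPos = readerPos;
    }

    // Consume n characters from the imaginary input.
    void advance(int n) {
        readerPos += n;
    }

    // Emit a character token: report its start, then reset to UNSET.
    int emitStart() {
        int start = charStartPos;
        charStartPos = UNSET;
        return start;
    }

    // Start position reported for leading text that is emitted
    // without transition() having been called first.
    static int firstTextStart(int initialStartPos) {
        StartPosSketch t = new StartPosSketch(initialStartPos);
        t.advance(3); // e.g. the leading "foo" in "foo<p></p>"
        return t.emitStart();
    }

    public static void main(String[] args) {
        System.out.println(firstTextStart(UNSET)); // prints -1
        System.out.println(firstTextStart(0));     // prints 0
    }
}
```

With the initial value at 0, leading text gets a start position of 0 even though no transition ran; later tokens in this model still depend on `transition()` being called after each emission.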
jhy committed Jul 1, 2024
1 parent 8a342fd commit be1301f
Showing 3 changed files with 15 additions and 1 deletion.
5 changes: 5 additions & 0 deletions CHANGES.md
@@ -13,6 +13,11 @@

* Removed previously deprecated internal classes and methods. [2094](https://github.com/jhy/jsoup/pull/2094)

### Bug Fixes

* When tracking source positions, if the first node was a TextNode, its position was incorrectly set
to `-1`. [2106](https://github.com/jhy/jsoup/issues/2106)

---

## 1.17.2 (2023-Dec-29)
2 changes: 1 addition & 1 deletion src/main/java/org/jsoup/parser/Tokeniser.java
@@ -50,7 +50,7 @@ final class Tokeniser {
@Nullable private String lastStartCloseSeq; // "</" + lastStartTag, so we can quickly check for that in RCData

private static final int Unset = -1;
- private int markupStartPos, charStartPos = Unset; // reader pos at the start of markup / characters. updated on state transition
+ private int markupStartPos, charStartPos = 0; // reader pos at the start of markup / characters. updated on state transition. Initialized to start (0), but set to Unset after emissions.

Tokeniser(TreeBuilder treeBuilder) {
tagPending = startPending = new Token.StartTag(treeBuilder);
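
A note on reading the changed declaration in `Tokeniser.java`: in Java, an initializer in a multi-variable declaration applies only to the declarator it is attached to. As the lines are written in this diff, that means only `charStartPos` was ever assigned `Unset`, while `markupStartPos` kept the default `int` field value of 0. The sketch below shows the language rule in isolation; the class and field names are hypothetical and unrelated to jsoup.

```java
import java.util.Arrays;

// Illustrates Java declarator semantics only; names are hypothetical.
final class DeclaratorInitSketch {
    static final int UNSET = -1;

    // Only `b` receives the initializer; `a` keeps the int field default of 0.
    private int a, b = UNSET;

    // Each declarator initialized explicitly.
    private int c = UNSET, d = UNSET;

    static int[] values() {
        DeclaratorInitSketch s = new DeclaratorInitSketch();
        return new int[] { s.a, s.b, s.c, s.d };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(values())); // prints [0, -1, -1, -1]
    }
}
```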
9 changes: 9 additions & 0 deletions src/test/java/org/jsoup/parser/PositionTest.java
@@ -487,6 +487,15 @@ private void printRange(Node node) {
assertEquals("h1:0-9~12-17; id:4-6=7-8; #text:9-12; #text:17-18; h2:18-27~30-35; id:22-24=25-26; #text:27-30; h10:35-40~43-49; #text:40-43; ", track.toString());
}

@Test void tracksFirstTextnode() {
// https://github.com/jhy/jsoup/issues/2106
String html = "foo<p></p>bar<p></p><div><b>baz</b></div>";
Document doc = Jsoup.parse(html, TrackingHtmlParser);
StringBuilder track = new StringBuilder();
doc.body().forEachNode(node -> accumulatePositions(node, track));
assertEquals("body:0-0~41-41; #text:0-3; p:3-6~6-10; #text:10-13; p:13-16~16-20; div:20-25~35-41; b:25-28~31-35; #text:28-31; ", track.toString());
}

@Test void updateKeyMaintainsRangeLc() {
String html = "<p xsi:CLASS=On>One</p>";
Document doc = Jsoup.parse(html, TrackingHtmlParser);