https://github.com/jackdewinter/pymarkdown/issues/1247 #1248

Merged: 13 commits, Oct 28, 2024
20 changes: 18 additions & 2 deletions newdocs/src/changelog.md
Original file line number Diff line number Diff line change
@@ -5,12 +5,28 @@
<!--- pyml disable-next-line no-duplicate-heading-->
### Added

- None
- [Issue 1233](https://github.com/jackdewinter/pymarkdown/issues/1233)
- [Issue 1234](https://github.com/jackdewinter/pymarkdown/issues/1234)
- [Issue 1235](https://github.com/jackdewinter/pymarkdown/issues/1235)
- Adding more comprehensive "drop X" tests where multiple levels of

suggestion (documentation): The term "drop X" could benefit from more context

Consider explaining what "drop X" refers to, as this might not be immediately clear to all readers.

Suggested change
- Adding more comprehensive "drop X" tests where multiple levels of
- Adding more comprehensive database column dropping tests where multiple levels of schema dependencies are involved

containers are involved, and then dropping one or more of those
containers in a single line.

<!--- pyml disable-next-line no-duplicate-heading-->

question (documentation): Issues 1233-1235 appear in both Added and Fixed sections

Consider clarifying why these issues appear in both sections, or consolidate them into the most appropriate section to avoid confusion.

### Fixed

- None
- [Issue 1233](https://github.com/jackdewinter/pymarkdown/issues/1233)
- [Issue 1234](https://github.com/jackdewinter/pymarkdown/issues/1234)
- [Issue 1235](https://github.com/jackdewinter/pymarkdown/issues/1235)
- Adding new "drop_x" tests and resolving any scan issues with them.

nitpick (documentation): Inconsistent terminology between "drop X" and "drop_x"

Consider using consistent terminology throughout the changelog to avoid confusion.

- [Issue 1243](https://github.com/jackdewinter/pymarkdown/issues/1243)
- [Issue 1245](https://github.com/jackdewinter/pymarkdown/issues/1245)
- Handling leading spaces in the `__fix_spacing` function now that data
is present.
- [Issue 1247](https://github.com/jackdewinter/pymarkdown/issues/1247)
- In 3+ drop cases with only lists and no block quotes, indent is
not calculated properly on rehydrate. This in turn causes the
fixed text to be wrong.

<!--- pyml disable-next-line no-duplicate-heading-->
### Changed
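Reading the changelog entries above, the "drop X" scenario appears to mean markdown in which several nested containers (block quotes and lists) all end, i.e. are "dropped", on a single line. A rough Python sketch of how a nested-container prefix could be modeled follows; the `container_prefix` helper and its `("bq", …)`/`("list", …)` tuples are illustrative inventions, not pymarkdown's actual token model:

```python
def container_prefix(stack):
    """Compute the leading-text prefix implied by a stack of containers.

    Each entry is ("bq", None) for a block quote or ("list", indent)
    for a list with the given indent width. Hypothetical model only.
    """
    prefix = ""
    for kind, indent in stack:
        prefix += "> " if kind == "bq" else " " * indent
    return prefix

# Three nested containers: block quote > list > block quote.
stack = [("bq", None), ("list", 2), ("bq", None)]
full_prefix = container_prefix(stack)        # ">   > "
# "Dropping" the two innermost containers in one step means the next
# line is rendered with only the outermost container's prefix:
dropped_prefix = container_prefix(stack[:1])  # "> "
```

The tests the changelog describes exercise exactly this kind of transition, where the prefix for the line after the drop must be recomputed from the surviving containers.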
8 changes: 4 additions & 4 deletions publish/coverage.json
@@ -2,12 +2,12 @@
"projectName": "pymarkdown",
"reportSource": "pytest",
"branchLevel": {
"totalMeasured": 5385,
"totalCovered": 5385
"totalMeasured": 5403,
"totalCovered": 5401
},
"lineLevel": {
"totalMeasured": 21081,
"totalCovered": 21081
"totalMeasured": 21123,
"totalCovered": 21122
}
}

6 changes: 3 additions & 3 deletions publish/test-results.json
@@ -1367,7 +1367,7 @@
"totalTests": 653,
"failedTests": 0,
"errorTests": 0,
"skippedTests": 84,
"skippedTests": 61,
"elapsedTimeInMilliseconds": 0
},
{
@@ -1620,10 +1620,10 @@
},
{
"name": "test.test_markdown_extra",
"totalTests": 203,
"totalTests": 210,
"failedTests": 0,
"errorTests": 0,
"skippedTests": 2,
"skippedTests": 3,
"elapsedTimeInMilliseconds": 0
},
{
4 changes: 3 additions & 1 deletion pymarkdown/block_quotes/block_quote_count_helper.py
@@ -328,7 +328,9 @@ def __xx_part_two(
)
assert parser_state.original_line_to_parse is not None
if (
parser_state.original_line_to_parse[
embedded_list_stack_token.indent_level
>= len(parser_state.original_line_to_parse)
or parser_state.original_line_to_parse[
start_index : embedded_list_stack_token.indent_level
].strip()
or parser_state.original_line_to_parse[
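The guard added in this hunk checks the indent level against the line length before testing the stripped slice, so the `or` short-circuits when the indent points past the end of the line (an empty slice would strip to a falsy `""` and silently change the result). A standalone sketch of that pattern; the function and parameter names here are hypothetical, not pymarkdown's:

```python
def needs_special_handling(line: str, start_index: int, indent_level: int) -> bool:
    # If indent_level runs past the end of the line, the slice below
    # would be empty and .strip() would be falsy, so the length check
    # must come first to treat that case as "handle specially".
    return indent_level >= len(line) or bool(
        line[start_index:indent_level].strip()
    )

needs_special_handling("abc", 0, 10)   # True: indent beyond end of line
needs_special_handling("    x", 0, 4)  # False: slice is all whitespace
needs_special_handling("ab  ", 0, 2)   # True: slice contains text
```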
85 changes: 57 additions & 28 deletions pymarkdown/plugins/rule_md_031.py
@@ -56,6 +56,7 @@ def __init__(self) -> None:
self.__removed_container_adjustments: Optional[
List[PendingContainerAdjustment]
] = None
self.__last_end_container_tokens: List[MarkdownToken] = []

issue (complexity): Consider refactoring container end token handling to use a recursive traversal approach instead of maintaining state

The current implementation maintains ongoing state via __last_end_container_tokens and uses complex index manipulation to process them. This can be simplified while keeping the enhanced multi-container functionality:

def __collect_preceding_container_ends(self, token: MarkdownToken) -> List[MarkdownToken]:
    """Collect container end tokens that immediately precede this token."""
    container_ends = []
    current = self.__last_token
    while current and (current.is_list_end or current.is_block_quote_end):
        container_ends.insert(0, current)
        current = self.__second_last_token
    return container_ends

# In the fenced code block handling section:
if token.is_fenced_code_block and self.__last_token.is_list_end:
    container_ends = self.__collect_preceding_container_ends(token)
    replacement_tokens = [
        BlankLineMarkdownToken(
            extracted_whitespace="",
            position_marker=PositionMarker(new_token.line_number - 1, 0, ""),
            column_delta=1,
        )
    ]
    replacement_tokens.extend(container_ends)
    replacement_tokens.append(new_token)
    self.register_replace_tokens_request(
        context, container_ends[0], token, replacement_tokens
    )

This approach:

  1. Eliminates the need for ongoing token list maintenance
  2. Computes container ends only when needed
  3. Removes complex index manipulation
  4. Keeps the enhanced multi-container functionality


self.__leading_space_index_tracker = LeadingSpaceIndexTracker()

@@ -359,7 +360,8 @@ def __fix_spacing_list_remove_list(self, context: PluginScanContext) -> None:
removed_list_token = cast(
ListStartMarkdownToken, self.__removed_container_stack_token
)
assert removed_list_token.leading_spaces is not None
if removed_list_token.leading_spaces is None:
return
split_spaces = removed_list_token.leading_spaces.split("\n")
split_spaces_length = len(split_spaces)
if split_spaces_length > 1:
@@ -461,6 +463,40 @@ def __calculate_adjust(self, initial_index: int, container_index: int) -> int:
else 1
)

def __fix_spacing_one_container(
self, context: PluginScanContext, token: MarkdownToken
) -> bool:
did_special_list_fix = False
if self.__leading_space_index_tracker.get_container_stack_item(
-1
).is_block_quote_start:
self.__fix_spacing_block_quote(token)
else:
did_special_list_fix = self.__fix_spacing_list(context, token)
return did_special_list_fix

def __fix_spacing_removed_container(
self, context: PluginScanContext, token: MarkdownToken
) -> None:
_ = token
assert self.__removed_container_stack_token is not None
if self.__removed_container_stack_token.is_list_start:
removed_list_token = cast(
ListStartMarkdownToken, self.__removed_container_stack_token
)
if removed_list_token.leading_spaces is not None:
split_spaces = removed_list_token.leading_spaces.split("\n")
split_spaces.append("")
else:
split_spaces = [""]
self.register_fix_token_request(
context,
self.__removed_container_stack_token,
"next_token",
"leading_spaces",
"\n".join(split_spaces),
)

def __fix_spacing(
self, context: PluginScanContext, token: MarkdownToken, special_case: bool
) -> None:
@@ -469,47 +505,35 @@ def __fix_spacing(
self.__fix_spacing_special_case(context, token)
return
if self.__leading_space_index_tracker.in_at_least_one_container():
if self.__leading_space_index_tracker.get_container_stack_item(
-1
).is_block_quote_start:
self.__fix_spacing_block_quote(token)
else:
did_special_list_fix = self.__fix_spacing_list(context, token)
did_special_list_fix = self.__fix_spacing_one_container(context, token)
elif self.__removed_container_stack_token:
if self.__removed_container_stack_token.is_list_start:
removed_list_token = cast(
ListStartMarkdownToken, self.__removed_container_stack_token
)
assert removed_list_token.leading_spaces is None
# if removed_list_token.leading_spaces is not None:
# split_spaces = removed_list_token.leading_spaces.split("\n")
# split_spaces.append("")
# else:
split_spaces = [""]
self.register_fix_token_request(
context,
self.__removed_container_stack_token,
"next_token",
"leading_spaces",
"\n".join(split_spaces),
)
self.__fix_spacing_removed_container(context, token)

new_token = copy.deepcopy(token)
self.__fix_count += 1
new_token.adjust_line_number(context, self.__fix_count)
assert self.__last_token is not None
if token.is_fenced_code_block and self.__last_token.is_list_end:
end_container_index = len(self.__last_end_container_tokens) - 1

issue (code-quality): Extract code out into method (extract-method)

while (
end_container_index >= 0
and self.__last_end_container_tokens[end_container_index].is_list_end
):
end_container_index -= 1
first_token = self.__last_end_container_tokens[end_container_index + 1]
replacement_tokens: List[MarkdownToken] = [
BlankLineMarkdownToken(
extracted_whitespace="",
position_marker=PositionMarker(new_token.line_number - 1, 0, ""),
column_delta=1,
),
self.__last_token,
new_token,
)
]
replacement_tokens.extend(
self.__last_end_container_tokens[end_container_index + 1 :]
)
replacement_tokens.append(new_token)
self.register_replace_tokens_request(
context, self.__last_token, token, replacement_tokens
context, first_token, token, replacement_tokens
)
elif did_special_list_fix:
assert self.__last_token and self.__last_token.is_block_quote_end
@@ -741,6 +765,11 @@ def next_token(self, context: PluginScanContext, token: MarkdownToken) -> None:
):
self.__last_non_end_token = token

if token.is_block_quote_end or token.is_list_end:
self.__last_end_container_tokens.append(token)
else:
self.__last_end_container_tokens.clear()

self.__second_last_token = self.__last_token
self.__last_token = token
self.__removed_container_stack_token = None
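The new bookkeeping in `next_token` keeps a running list of the container end tokens that immediately precede the current token, clearing it as soon as any other token arrives. The same idea in isolation, using a simplified string-based token model rather than the plugin's real token classes:

```python
class EndTokenTracker:
    """Track the run of container end tokens just seen (simplified model)."""

    def __init__(self) -> None:
        self.trailing_ends = []

    def next_token(self, token_kind: str) -> None:
        # Accumulate consecutive list/block-quote end tokens; any other
        # token kind breaks the run, so the list is cleared.
        if token_kind in ("list_end", "block_quote_end"):
            self.trailing_ends.append(token_kind)
        else:
            self.trailing_ends.clear()

tracker = EndTokenTracker()
for kind in ["paragraph", "list_end", "list_end", "block_quote_end"]:
    tracker.next_token(kind)
# trailing_ends now holds the three consecutive end tokens, which is
# what the rule needs when it rebuilds replacement tokens around a
# fenced code block that follows several container ends at once.
```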
98 changes: 70 additions & 28 deletions pymarkdown/transform_markdown/transform_containers.py
@@ -426,9 +426,55 @@ def __abcd(

# pylint: enable=too-many-arguments, too-many-boolean-expressions

@staticmethod

issue (complexity): Consider consolidating the indent calculation logic into a single method without stack manipulation.

The split into __abcd_final_list_second_half has introduced unnecessary complexity through stack manipulation and branching. Consider simplifying by focusing on the indent calculation:

@staticmethod
def __abcd_final_list(
    removed_tokens: List[MarkdownToken],
    removed_token_indices: List[int],
    container_stack: List[MarkdownToken],
) -> Optional[str]:
    # Handle list token prefix
    removed_list_token = cast(ListStartMarkdownToken, removed_tokens[0])
    assert removed_list_token.leading_spaces is not None
    split_leading_spaces = removed_list_token.leading_spaces.split("\n")
    possible_prefix = None
    if removed_token_indices[0] < len(split_leading_spaces):
        possible_prefix = split_leading_spaces[removed_token_indices[0]]
    del removed_tokens[0]
    del removed_token_indices[0]

    # Calculate indent if only list tokens
    if not any(t.is_block_quote_start for t in container_stack + removed_tokens):
        current_indent_level = (container_stack[-1].indent_level 
                              if container_stack else 0)
        return (" " * current_indent_level if possible_prefix is None 
                else possible_prefix)

    # Handle block quote case
    if removed_tokens[0].is_block_quote_start:
        bq_token = cast(BlockQuoteMarkdownToken, removed_tokens[0])
        assert bq_token.bleading_spaces is not None
        split_spaces = bq_token.bleading_spaces.split("\n")
        assert removed_token_indices[0] < len(split_spaces)
        prefix = "" if possible_prefix is None else possible_prefix
        prefix = f"{split_spaces[removed_token_indices[0]]}{prefix}"
        del removed_tokens[0]
        del removed_token_indices[0]
        return prefix

    return None

This approach:

  1. Eliminates the need for stack copying and reversal
  2. Simplifies the branching logic
  3. Keeps the indent calculation separate from block quote handling
  4. Maintains all functionality while reducing state manipulation

def __abcd_final_list_second_half(
removed_tokens: List[MarkdownToken],
removed_token_indices: List[int],
container_stack: List[MarkdownToken],
possible_prefix: Optional[str],
) -> Optional[str]:
container_stack_copy: List[MarkdownToken] = container_stack[:]
container_stack_copy.extend(removed_tokens[::-1])
is_only_list_starts_on_stack = not any(
i.is_block_quote_start for i in container_stack_copy
)

prefix = None
if is_only_list_starts_on_stack and possible_prefix is not None:
prefix = possible_prefix
token_index = len(removed_tokens) - 1
while token_index >= 0:
list_token = cast(ListStartMarkdownToken, removed_tokens[token_index])
assert list_token.leading_spaces is None
assert removed_token_indices[token_index] == 0
token_index -= 1
del removed_tokens[0]
del removed_token_indices[0]

list_token = cast(ListStartMarkdownToken, container_stack[-1])
current_indent_level = list_token.indent_level if container_stack else 0
prefix = " " * current_indent_level
else:
if removed_tokens[0].is_block_quote_start:
removed_block_quote_token = cast(
BlockQuoteMarkdownToken, removed_tokens[0]
)
assert removed_block_quote_token.bleading_spaces is not None
split_bleading_spaces = removed_block_quote_token.bleading_spaces.split(
"\n"
)
assert removed_token_indices[0] < len(split_bleading_spaces)
prefix = "" if possible_prefix is None else possible_prefix
prefix = f"{split_bleading_spaces[removed_token_indices[0]]}{prefix}"
del removed_tokens[0]
del removed_token_indices[0]
return prefix

@staticmethod
def __abcd_final_list(
removed_tokens: List[MarkdownToken], removed_token_indices: List[int]
removed_tokens: List[MarkdownToken],
removed_token_indices: List[int],
container_stack: List[MarkdownToken],
) -> Optional[str]:
removed_list_token = cast(ListStartMarkdownToken, removed_tokens[0])
assert removed_list_token.leading_spaces is not None
@@ -437,26 +483,17 @@ def __abcd_final_list(
# prefix is None for most cases
possible_prefix = None
if (
removed_token_indices[0] < len(split_leading_spaces)
and split_leading_spaces[removed_token_indices[0]]
removed_token_indices[0]
< len(split_leading_spaces)
# and split_leading_spaces[removed_token_indices[0]]
):
possible_prefix = split_leading_spaces[removed_token_indices[0]]
del removed_tokens[0]
del removed_token_indices[0]

prefix = None
if removed_tokens[0].is_block_quote_start:
removed_block_quote_token = cast(BlockQuoteMarkdownToken, removed_tokens[0])
assert removed_block_quote_token.bleading_spaces is not None
split_bleading_spaces = removed_block_quote_token.bleading_spaces.split(
"\n"
)
assert removed_token_indices[0] < len(split_bleading_spaces)
prefix = "" if possible_prefix is None else possible_prefix
prefix = f"{split_bleading_spaces[removed_token_indices[0]]}{prefix}"
del removed_tokens[0]
del removed_token_indices[0]
return prefix
return TransformContainers.__abcd_final_list_second_half(
removed_tokens, removed_token_indices, container_stack, possible_prefix
)

@staticmethod
def __abcd_final(
@@ -506,7 +543,7 @@ def __abcd_final(
del removed_token_indices[0]
else:
prefix = TransformContainers.__abcd_final_list(
removed_tokens, removed_token_indices
removed_tokens, removed_token_indices, container_stack
)
return prefix

@@ -1435,11 +1472,15 @@ def __adjust_for_list_end_part_4(
do_it = True
if token_stack[-1].is_list_start and removed_tokens[-1].is_list_start:
removed_list_token = cast(ListStartMarkdownToken, removed_tokens[-1])
assert removed_list_token.leading_spaces is not None
removed_token_split_spaces = removed_list_token.leading_spaces.split("\n")
removed_token_index = removed_token_indices[-1]
assert removed_token_index < len(removed_token_split_spaces)
removed_token_indices[-1] += 1
if removed_list_token.leading_spaces is None:
assert removed_token_index == 0
else:
removed_token_split_spaces = removed_list_token.leading_spaces.split(
"\n"
)
assert removed_token_index < len(removed_token_split_spaces)
removed_token_indices[-1] += 1
do_it = False
block_me = True
if do_it:
@@ -1858,13 +1899,14 @@ def __apply_primary_transformation_adjust_container_line(
if token_stack[-1].is_new_list_item
else cast(ListStartMarkdownToken, token_stack[-1])
)
assert (
prev_list_token.leading_spaces is not None
), "Leading spaces must be defined by this point."
split_leading_spaces = prev_list_token.leading_spaces.split(
ParserHelper.newline_character
)
if last_container_token_index < len(split_leading_spaces):
split_leading_spaces = None
if prev_list_token.leading_spaces is not None:
split_leading_spaces = prev_list_token.leading_spaces.split(
ParserHelper.newline_character
)
if split_leading_spaces is not None and last_container_token_index < len(
split_leading_spaces
):
POGGER.debug(f" -->{ParserHelper.make_value_visible(split_leading_spaces)}")
POGGER.debug(
f" primary-->container_line>:{ParserHelper.make_value_visible(container_line)}:<"
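This last hunk replaces a hard `assert leading_spaces is not None` with a guarded path that tolerates `None`, skipping the split and the index check instead of crashing. A minimal sketch of that pattern under the same assumption; the `split_leading` name and signature are illustrative, not pymarkdown's:

```python
from typing import List, Optional

def split_leading(leading_spaces: Optional[str], index: int) -> Optional[List[str]]:
    """Return the split leading-space entries if the index is valid, else None.

    Previously this would have asserted leading_spaces is not None;
    now a None value simply falls through to the "no adjustment" path.
    """
    parts = None
    if leading_spaces is not None:
        parts = leading_spaces.split("\n")
    if parts is not None and index < len(parts):
        return parts
    return None

split_leading(None, 0)        # None: tolerated instead of asserting
split_leading("  \n    ", 1)  # ["  ", "    "]: index in range
split_leading("  ", 5)        # None: index out of range
```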