[BUG] text wrapping treats U+00A0 NO-BREAK SPACE as a regular space #3545

Open
2 tasks done
mgedmin opened this issue Oct 31, 2024 · 3 comments · May be fixed by #3571

mgedmin commented Oct 31, 2024

Describe the bug

Here's a contrived example that demonstrates the issue:

from rich.console import Console
from rich.panel import Panel

console = Console()
text = 'A quick brown\N{NO-BREAK SPACE}fox jumps over the lazy dog.'
console.print(Panel(text, width=20))

It prints

╭──────────────────╮
│ A quick brown    │
│ fox jumps over   │
│ the lazy dog.    │
╰──────────────────╯

Observe how there's a line break between 'brown' and 'fox', despite the U+00A0 NO-BREAK SPACE character between them. Apparently rich treats it like any other whitespace.

This is what I wanted to see:

╭──────────────────╮
│ A quick          │
│ brown fox jumps  │
│ over the lazy    │
│ dog.             │
╰──────────────────╯
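For comparison, here is a minimal sketch of the expected line breaks using the standard library's `textwrap` module instead of rich. `textwrap` only recognizes ASCII whitespace as a break opportunity, so it keeps the two words around U+00A0 together. The width of 16 is an assumption: it is the space left inside a 20-column panel after two border columns and one column of padding on each side.

```python
import textwrap

NBSP = "\N{NO-BREAK SPACE}"
text = f"A quick brown{NBSP}fox jumps over the lazy dog."

# Assumption: a 20-column Panel leaves 16 columns for the text itself.
lines = textwrap.wrap(text, width=16)

for line in lines:
    print(line)
```

The second line comes out as "brown fox jumps" with the no-break space intact, matching the desired panel contents above.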

Platform

What platform (Win/Linux/Mac) are you running on?

Ubuntu 24.10

What terminal software are you using?

GNOME Terminal 3.54.0 using VTE 0.78.0 +BIDI +GNUTLS +ICU +SYSTEMD

$ python -m rich.diagnose
╭───────────────────────── <class 'rich.console.Console'> ─────────────────────────╮
│ A high level console interface.                                                  │
│                                                                                  │
│ ╭──────────────────────────────────────────────────────────────────────────────╮ │
│ │ <console width=190 ColorSystem.TRUECOLOR>                                    │ │
│ ╰──────────────────────────────────────────────────────────────────────────────╯ │
│                                                                                  │
│     color_system = 'truecolor'                                                   │
│         encoding = 'utf-8'                                                       │
│             file = <_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'> │
│           height = 43                                                            │
│    is_alt_screen = False                                                         │
│ is_dumb_terminal = False                                                         │
│   is_interactive = True                                                          │
│       is_jupyter = False                                                         │
│      is_terminal = True                                                          │
│   legacy_windows = False                                                         │
│         no_color = False                                                         │
│          options = ConsoleOptions(                                               │
│                        size=ConsoleDimensions(width=190, height=43),             │
│                        legacy_windows=False,                                     │
│                        min_width=1,                                              │
│                        max_width=190,                                            │
│                        is_terminal=True,                                         │
│                        encoding='utf-8',                                         │
│                        max_height=43,                                            │
│                        justify=None,                                             │
│                        overflow=None,                                            │
│                        no_wrap=False,                                            │
│                        highlight=None,                                           │
│                        markup=None,                                              │
│                        height=None                                               │
│                    )                                                             │
│            quiet = False                                                         │
│           record = False                                                         │
│         safe_box = True                                                          │
│             size = ConsoleDimensions(width=190, height=43)                       │
│        soft_wrap = False                                                         │
│           stderr = False                                                         │
│            style = None                                                          │
│         tab_size = 8                                                             │
│            width = 190                                                           │
╰──────────────────────────────────────────────────────────────────────────────────╯
╭─── <class 'rich._windows.WindowsConsoleFeatures'> ────╮
│ Windows features available.                           │
│                                                       │
│ ╭───────────────────────────────────────────────────╮ │
│ │ WindowsConsoleFeatures(vt=False, truecolor=False) │ │
│ ╰───────────────────────────────────────────────────╯ │
│                                                       │
│ truecolor = False                                     │
│        vt = False                                     │
╰───────────────────────────────────────────────────────╯
╭────── Environment Variables ───────╮
│ {                                  │
│     'TERM': 'xterm-256color',      │
│     'COLORTERM': 'truecolor',      │
│     'CLICOLOR': None,              │
│     'NO_COLOR': None,              │
│     'TERM_PROGRAM': None,          │
│     'COLUMNS': None,               │
│     'LINES': None,                 │
│     'JUPYTER_COLUMNS': None,       │
│     'JUPYTER_LINES': None,         │
│     'JPY_PARENT_PID': None,        │
│     'VSCODE_VERBOSE_LOGGING': None │
│ }                                  │
╰────────────────────────────────────╯
platform="Linux"

$ uv tree | grep rich
├── rich v13.8.1

Thank you for your issue. Give us a little time to review it.

PS. You might want to check the FAQ if you haven't done so already.

This is an automated reply, generated by FAQtory

mgedmin added a commit to mgedmin/rich that referenced this issue Nov 22, 2024
@mgedmin mgedmin linked a pull request Nov 22, 2024 that will close this issue

oir commented Nov 28, 2024

Apologies if this is unrelated but I have come across something similar when using tables and justification:

from rich.console import Console
from rich.table import Table, Column

table = Table(Column(justify="right"), show_header=False)
table.add_row("hello\N{NO-BREAK SPACE}")
table.add_row("world")

console = Console()
console.print(table)

which prints

┌────────┐
│  hello │
│  world │
└────────┘

whereas I was expecting something like:

┌────────┐
│ hello  │
│  world │
└────────┘

Curiously, left justification works as I expected

from rich.console import Console
from rich.table import Table, Column

table = Table(Column(justify="left"), show_header=False)
table.add_row("\N{NO-BREAK SPACE}hello")
table.add_row("world")

console = Console()
console.print(table)

which prints

┌────────┐
│  hello │
│ world  │
└────────┘

mgedmin commented Nov 30, 2024

@oir: it's a related, but different issue. Both issues stem from the fact that Python considers U+00A0 to be a space character, and so it gets matched by \s in regular expressions and also stripped by str.rstrip() (which is what Table layout uses).
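
The behaviour described here can be checked with plain Python, without rich. This is only a small sketch; the variable names are illustrative.

```python
import re

NBSP = "\N{NO-BREAK SPACE}"

# Python classifies U+00A0 as whitespace.
is_space = NBSP.isspace()

# For str patterns, \s matches Unicode whitespace, including U+00A0.
matches_s = re.fullmatch(r"\s", NBSP) is not None

# str.rstrip() with no argument strips a trailing U+00A0 ...
stripped = f"hello{NBSP}".rstrip()

# ... while an explicit set of ASCII whitespace leaves it in place.
kept = f"hello{NBSP}".rstrip(" \t\r\n")

print(is_space, matches_s, repr(stripped), repr(kept))
```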

I've monkey-patched both bugs out in my current project with this:

import re

from rich.text import Text


_re_whitespace = re.compile("[^\\S\N{NO-BREAK SPACE}]+$")


def monkeypatch_rich_word_wrapping_bug() -> None:
    # Workaround for https://github.com/Textualize/rich/issues/3545
    import rich._wrap
    import rich.text
    # Treat U+00A0 as part of a word, so wrapping never breaks at it.
    rich._wrap.re_word = re.compile("\\s*[\\S\N{NO-BREAK SPACE}]+\\s*")
    # Trailing whitespace no longer includes U+00A0.
    rich.text._re_whitespace = _re_whitespace

    def rstrip(self: Text) -> None:
        # Strip trailing whitespace but keep any trailing U+00A0.
        self.plain = _re_whitespace.sub('', self.plain)
    Text.rstrip = rstrip  # type: ignore[method-assign]

while waiting for feedback on #3571.
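
The two patterns in the workaround can be tried on their own, without importing rich. In this sketch, `default_word` is an assumption about the pattern rich currently ships as `rich._wrap.re_word`. The other two patterns are copied from the monkey-patch above.

```python
import re

NBSP = "\N{NO-BREAK SPACE}"

# Assumption: this is what rich uses by default for splitting words.
default_word = re.compile(r"\s*\S+\s*")
# Patterns from the workaround: U+00A0 counts as a word character,
# and is excluded from trailing whitespace.
patched_word = re.compile("\\s*[\\S\N{NO-BREAK SPACE}]+\\s*")
patched_trailing = re.compile("[^\\S\N{NO-BREAK SPACE}]+$")

sample = f"A quick brown{NBSP}fox jumps"
default_words = default_word.findall(sample)
patched_words = patched_word.findall(sample)

# Trailing ASCII spaces are removed, the no-break space is kept.
trimmed = patched_trailing.sub("", f"hello{NBSP}  ")

print(default_words)
print(patched_words)
print(repr(trimmed))
```

With the default pattern "brown" and "fox" are separate words, so a line break can fall between them. With the patched pattern they form a single word.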
