Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

🚀 make whitespace changes visible as such #522

Open
Kr1ss-XD opened this issue Feb 12, 2021 · 21 comments
Open

🚀 make whitespace changes visible as such #522

Kr1ss-XD opened this issue Feb 12, 2021 · 21 comments

Comments

@Kr1ss-XD
Copy link
Contributor

Hi, at first glance I couldn't find any open or closed issues on this topic. Please excuse me if I missed one.

Recently, while viewing a particular diff, I noticed that every indented line was highlighted as a difference, even though the code itself was not changed, other than being indented by spaces vs. tabs. (PKGBUILD's are interpreted by bash, so indentation doesn't matter).

delta_whitespace1

Therefore, I think it might be useful to have an option which allows displaying whitespace characters using one or more user-defined glyphs, and/or applying the delta.{plus,minus}-emph-style settings to whitespace changes.

In the case that the characters to display tabs were be defined as '‣·' and tab size as 4, I would imagine something like :

delta_whitespace2

@dandavison
Copy link
Owner

Hi @Kr1ss-XD can I just check -- passing -w (--ignore-all-space) to git diff/show/etc works with delta already. Are you suggesting the new delta functionality would be worth adding, in addition to what we have with -w?

@Kr1ss-XD
Copy link
Contributor Author

Yes, that's correct. I would like to be able to see whitespace changes, but also to distinguish them from other differences on a glance.

@dandavison
Copy link
Owner

I see. It almost feels like there could be a git option for rendering tabs as visible characters doesn't it? But there doesn't seem to be one. Just playing around with possibilities, this seems quite nice:

[core]
    pager = sed 's/\t/├──┤/g' | delta

So is this as simple as allowing the user to supply a string to substitute tabs? Here's a quick branch implementing that, do you want to try it out? #523 E.g.

git -c delta.tab-string='├──┤' diff
image

@Kr1ss-XD
Copy link
Contributor Author

Nice, I guess that's exactly what I was looking for.

I'll check out that branch and build it. Thanks a lot !

@Kr1ss-XD
Copy link
Contributor Author

It's working fine @dandavison 👍

@dandavison
Copy link
Owner

Great. One question is do we consider the breaking change of getting rid of the tabs option, since if we go with the new branch it's redundant since any number of spaces can be provided explicitly. I guess it can be marked as deprecated (but we are not yet at 1.0 so I suppose the point of 0.x is you're allowed to break backwards compatibility while setting down.)

@Kr1ss-XD
Copy link
Contributor Author

Kr1ss-XD commented Feb 12, 2021

I'd say there are - at least - two ways to handle this.

  1. deprecate delta.tabs in favor of delta.tab-string (and break backwards compatibility), or
  2. maybe keep tabs, and let tab-string accept 1, 2 or 3 characters. If one character is given, repeat it tabs times. If there are two chars, print the first one and repeat the 2nd character tabs-1 times. If tab_strings contains three characters, print the 1st one, print the second one tabs-2 times, then print the 3rd character.

I suppose the point of 0.x is you're allowed to break backwards compatibility while setting down.

That's my understanding, too. Though I'd say it's preferable to prevent breaking changes if possible.

@th1000s
Copy link
Collaborator

th1000s commented Feb 15, 2021

Speaking of tabs, when replacing them delta currently uses a fixed number of spaces. However these replacements must stop at a multiple of the tab width, counted from the start of the line:

1\tX                                   1       X                               1‣······X
12\tX         which looks like         12      X     and delta then might      12‣·····X
123\tX        this with a tab-         123     X     render them like this:    123‣····X
1234\tX       width of 8               1234    X                               1234‣···X
1234567\tX                             1234567 X                               1234567‣X

Also maybe use a more muted color than white, e.g. gray or blue as vim does with :set list.

dandavison added a commit that referenced this issue Feb 17, 2021
@robinmoussu
Copy link

If possible, I think it would be great to make clearer that lines that contains only whitespace change don’t introduce other modifications.

current

In this example, it doesn’t seems like the first line is really an addition, because of the whitespace changes on the next lines. I think the easiest change would be to use the light green background for lines that are entirely new. The same logic can be used for lines that where completely removed.

proposed

@th1000s
Copy link
Collaborator

th1000s commented May 6, 2021

@robinmoussu also see PR #13 which among other things discusses that single changed words are emphasized more than an entire added or removed line.

@dandavison
Copy link
Owner

@robinmoussu, @th1000s putting aside temporarily the question of what the defaults should be, what I would like to be true is that delta's style configuration options can achieve any of these desired highlighting styles. (Seeing as there are so many options, it seems like it must be a bug for things like this to be non-achievable!)

I believe that the example above can be achieved by making plus-style the same as plus-emph-style, right? Specifically, I usually use a light background, and delta --show-config shows

image

which gives this diff:

image

So now if we make plus-style be the same as plus-emph-style,

image

it looks like this

image

I also haven't forgotten about @nkouevda's PR #13, which dates from before delta gained its *-style options. As I said in that PR, delta's highlighting style was largely influenced by what Github does, and they reserve the emph color for individual words, not whole lines. But I don't think I could argue that that's clearly correct! So it seems that what we should do, even if we do not change defaults at this stage, it to make a version of the new-lines-get-emph-color behavior available as an option; hopefully the example I gave above proves that that will be easy to do since the option API already supports it.

@dhduvall
Copy link

IMHO, the --tabs option is still useful, since it's how you get raw tabs in the output (if you want that), and with tab alignment (#729, #751), people using tabs for code alignment can see their diffs using the same tabstops their code uses.

Also, I would suggest using a vim-style configuration for this, where you can set the characters replacing the space and the beginning, end, and middle of a tab. What vim doesn't do (and I would like) is to configure what you use for a tab expanding to a single column, and maybe also for two columns.

Also, should all tabs and spaces be replaced with these characters, only those on lines that have changed, or only those that have actually changed? (And should this be configurable?)

I have a very crude change on top of #752, but as you can see, it still needs some work:

Screen Shot of visualized tabs

@alfredbez
Copy link

Just for reference: Git also has an option to show invisible or non-printable characters via --ws-error-highlight=all.

@mentalisttraceur
Copy link

mentalisttraceur commented Mar 20, 2023

I would really appreciate something like bat --show-all, more than the other possibilities being discussed here.

Since delta is using code from bat for its syntax highlighting anyway, I think we can just pass through a boolean and have bat take care of the rest (edge-cases like combining and double-width characters, ambiguities like a literal · appearing in the actual diff'ed text, ...).

I also think having a "show all invisible characters" option, rather than something specific to just tabs, would map a lot better to user expectations, since it's found in so many IDEs/editors/tools - many devs are familiar with it, and already have trained up their minds to visually parse those kinds of visual cues for invisible characters. For example, when I first ran into the problem of delta hiding a spaces-to-tab change (which I'd normally spot as a single-column misalignment in a vanilla git diff), my first through was to look for a "show all/hidden/whitespace/invisible characters" option in the help text, webpage, and then here.

@dandavison
Copy link
Owner

Since delta is using code from bat for its syntax highlighting anyway

Hi @mentalisttraceur, this isn't true, but I'm curious where you got that idea from, since I see it said quite a bit!

@mentalisttraceur
Copy link

mentalisttraceur commented Mar 20, 2023

@dandavison I think the strongest reason was the quote (emphasis added) "delta does not use bat when it is running (it does use some code from the excellent bat project, but users don't need to be aware of this)" on your page about delta's use of pager environment variables, compounded by other mentions of bat in the context of delta themes such as here and here. (And maybe I've seen other people say it too, I can't remember.)

All together, that gave me the impression that there was significant code reuse, and my first thought would be to try to reuse it for any heavy lifting needed to syntax-highlight the text inside the diff markers.

@dandavison
Copy link
Owner

Right, it's funny, I added those sentences to try to dispel the belief that delta used code from bat for syntax highlighting, but I can see it's confusing especially if people come with a prior belief that it does. Delta uses bat code for things like launching the pager process, and it uses the color themes and language definitions that bat uses. But the low-level details of syntax highlighting are implemented independently in bat and delta, both using the same underlying library (syntect).

@mentalisttraceur
Copy link

Gotcha, thanks for clarifying!

If I think of any better wording or place for more precisely/unambiguously mitigating that misunderstanding I'll let you know!

In any case, I would love a --show-all option, and I think bat is a good example of it done well.

@dandavison
Copy link
Owner

dandavison commented Mar 21, 2023

@mentalisttraceur, I agree, I use bat -A a lot.

I haven't thought too hard about how helpful this is but FWIW here's a conversation with GPT4 about this issue.


Can you summarize the following git-delta issue and suggest next steps?
[paste entire text copied from browser including title]

The git-delta issue #522 discusses the need for a feature to make whitespace changes visible and distinguishable from other changes in a diff. The main problem raised is that when viewing a diff, indented lines are highlighted as differences, even though the code is not changed, but only indented with spaces or tabs. The proposed solution is to have an option to display whitespace characters using user-defined glyphs and/or applying specific delta styles to whitespace changes.

Throughout the conversation, there are various suggestions and implementations shared. The initial idea is to allow the user to supply a string to substitute tabs. However, it evolves to include discussions about how to handle the deprecated delta.tabs in favor of delta.tab-string and whether to break backward compatibility.

The conversation also covers topics such as:

  • Tab alignment and the possibility of using a vim-style configuration for tab replacement characters.
  • Showing invisible or non-printable characters with Git's --ws-error-highlight=all option.
  • The possibility of adding a --show-all option similar to the bat tool.

Next steps:

  1. Assess the feasibility and potential impact of the suggested implementations and decide on the most suitable approach.
  2. Implement the chosen solution, ensuring that it addresses the issue raised and is compatible with existing features.
  3. Test the implemented solution thoroughly to ensure it works as expected and does not introduce new issues.
  4. Update documentation and provide examples of how to use the new feature.
    Close the issue after successful implementation and testing.

Can you give a specific suggestion for a next step as follows:

  1. The proposed change to delta's behavior and user experience
  2. The proposed implementation
  1. Proposed change to delta's behavior and user experience:
  • Introduce a --show-all option that, when enabled, will display all invisible characters in the diff, such as spaces and tabs. This option will help users to quickly identify and distinguish between whitespace changes and other changes in the code. The feature should be inspired by the similar functionality in the bat tool, ensuring that it is familiar and intuitive for users.
  1. Proposed implementation:
  • Add a new command-line flag --show-all to the list of options accepted by delta.
  • In the processing stage where delta is handling the diff, check if the --show-all flag is enabled. If so, replace the invisible characters (e.g., spaces and tabs) with their corresponding visible glyphs (e.g., '·' for space and '‣' for tab).
  • Ensure that the new feature is compatible with other delta options such as --ignore-all-space and works seamlessly with delta's styles and themes.
  • Add tests to verify that the new feature works as expected and does not introduce new issues or conflicts with existing functionality.
  • Update the documentation and help messages to include the new --show-all option, providing examples of its usage and explaining its purpose.

Is the idea of the --show-all option to explain to the user differences in whitespace between the old and new version of the file, or is it simply to show which whitespace characters are in use even when they haven't changed?

Apologies for any confusion in my previous response. The idea of the --show-all option is to display all whitespace characters in use, regardless of whether they have changed or not. This will help users visualize the actual structure of the code, including spaces and tabs, making it easier for them to understand the formatting and indentation. It does not specifically highlight the differences in whitespace between the old and new versions of the file. However, by making the whitespace characters visible, users can more easily identify and compare the changes in whitespace themselves.

How do you propose that delta should display differences in whitespace characters between the old and new version?

Delta could use a combination of colors and symbols to indicate the differences in whitespace characters. For example, it could use different colors for changed spaces and tabs, and a separate color for unchanged whitespace characters.

Here is an ASCII representation of how it could look:

Old version:
if (condition) {
····doSomething();
}

New version:
if (condition) {
‣‣‣‣doSomething();
}

In this example:

  • · represents a space character from the old version
  • ‣ represents a tab character from the new version
    Delta could use a light red background color for the removed spaces (represented by ·) and a light green background color for the added tabs (represented by ‣). This would make it clear that spaces were removed and replaced with tabs.

Please note that this is a text-based representation, and the actual implementation in delta would involve colors and custom symbols to create a more visually appealing and informative display.

@mentalisttraceur
Copy link

Sounds like I convinced ChatGPT4 😄.

I appreciate that you publicly shared that - I hadn't thought to use AI as a thinking aid like that, seems useful!

I basically agree with its recommendations - obviously to be precise, we probably don't want a full tab represented by just one single-width character (I'm tempted to say "that's what it meant", in the sense that if a human said this, I'd suppose that's probably what they meant).

As for coloring, I imagine the coloring of the whitespace characters to be like syntax highlighting now: if style includes syntax, the special character stand-ins would have the same foreground/text colors on all lines (maybe just copy how bat -A colors them), and the background color would indicate removed/added/moved. (Of course if someone customizes the style to have a foreground color override the syntax coloring, that overrides.)

@thecaralice
Copy link

As someone who prefers using tabs in code I'd also be happy to see an option for rendering a tab to just a sequence of spaces and rendering spaces as some kind of a middle dot or a fancy underscore or something (should I open a PR for that?)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

8 participants