Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use sys.stdout.flush() in skip and success #2529

Closed
AlttiRi opened this issue Apr 26, 2022 · 14 comments
Closed

Use sys.stdout.flush() in skip and success #2529

AlttiRi opened this issue Apr 26, 2022 · 14 comments

Comments

@AlttiRi
Copy link

AlttiRi commented Apr 26, 2022

On Windows with colored output text enabled ("output": {"mode": "color"}) the buffering is abnormally aggressive.
In some cases I can see the output in the console only after the program end (when all downloads are skipped).

The simple fix is using of

sys.stdout.flush()

in skip (and in success, just in case) methods of ColorOutput class.

def skip(self, path):
sys.stdout.write("\033[2m" + self.shorten(path) + "\033[0m\n")
def success(self, path, tries):
sys.stdout.write("\r\033[1;32m" + self.shorten(path) + "\033[0m\n")


Most likely because of this also Ctrl+C works very bad.

There are no problems without colored text enabled, btw. (When it uses "# " prefix for the skipped downloads instead of gray coloring.)

@AlttiRi
Copy link
Author

AlttiRi commented May 2, 2022

The bug is still exists with Git-Bash without enabled Enable experimental support for pseudo consoles option.
(when MSYS=disable_pcon is instead of MSYS=enable_pcon in /etc/git-bash.config file)

For example, download https://www.hentai-foundry.com/pictures/user/personalami/scraps (3 images), then run it again, 3 lines will be shown only after the program end, while whey should be showed one by one.

With sys.stdout.flush() in skip it works fine.

@mikf
Copy link
Owner

mikf commented May 4, 2022

What's the output of
python -c "import sys; print(f'{sys.stdout.line_buffering = }\n{sys.stdout.isatty() = }')"
for you?

What you are describing here and have been in other issues sounds like Python doesn't recognize your terminal as a regular TTY and does not enable line buffering by default, among other things.

$ python -c "import sys; print(f'{sys.stdout.line_buffering = }\n{sys.stdout.isatty() = }')"
sys.stdout.line_buffering = True
sys.stdout.isatty() = True

@AlttiRi
Copy link
Author

AlttiRi commented May 4, 2022

With the default Git-Bash settings (so without Enable experimental support for pseudo consoles option):

echo "MSYS=disable_pcon" > /etc/git-bash.config

python -c "import sys; print(f'{sys.stdout.line_buffering = }\n{sys.stdout.isatty() = }')" prints:

sys.stdout.line_buffering = False
sys.stdout.isatty() = False

In this case flush is required.

With echo "MSYS=enable_pcon" > /etc/git-bash.config console log works well without flush.
However, this option is not default in Git-Bash yet.

Git-Bash uses MinTTY.

@mikf
Copy link
Owner

mikf commented May 17, 2022

All writes to stdout now get explicitly flushed (eeef9cc)

This would not be necessary if Python would properly detect Git-Bash/MinTTY with its default settings and enable sys.stdout.line_buffering. flush() now gets called twice for TTYs that have line_buffering enabled.

Not the end of the world, but I would've liked to find a solution that calls flush() only once in all cases. Maybe I should just always disable ´line_buffering` and only do explicit flushes, even outside of output.py?

@AlttiRi
Copy link
Author

AlttiRi commented May 17, 2022

Something like Strategy pattern

const isFlushRequired = // ...
function getStdWrite() {
  if (isFlushRequired) {
    return text => {
      process.stdout.write(text);
      process.stdout.flush();
    };
  }
  return text => {
    process.stdout.write(text);
  };
}

Or just

function stdWrite(text) {
  process.stdout.write(text);
  if (isFlushRequired) {
    process.stdout.flush();
  }
}

@God-damnit-all
Copy link
Contributor

God-damnit-all commented May 19, 2022

The ANSI color codes don't seem to work in PowerShell. Maybe this will fix it?

https://stackoverflow.com/a/72292134

If not, maybe this (lines 6-8):

https://gist.github.com/RDCH106/6562cc7136b30a5c59628501d87906f7

@God-damnit-all
Copy link
Contributor

God-damnit-all commented May 19, 2022

Turns out, all you need to fix the color output in PowerShell is to just include these two lines in __init__.py:

import os
os.system('')

Found that out from here: https://stackoverflow.com/a/64222858

@God-damnit-all
Copy link
Contributor

@mikf In addition to the change above, could you add an output setting that would allow for * and # to still be used to indicate success/skip in color mode? I want to start using PowerShell's Start-Transcript command for logging (since I use gdl as part of a script that logs other things) and sadly Start-Transcript doesn't properly log ANSI.

My proposal is output.prepend and it's set to this by default: {"success": "* ", "skip": "# ", "force": false} and to make it appear in color mode, you set force to true, while allowing users to customize what they want for indicating success/skip so that the setting doesn't seem so out-of-place.

@God-damnit-all
Copy link
Contributor

Apparently OSes other than Windows uses a checkmark for success, so maybe this instead: {"success": true, "skip": true, "force": false}

If success/skip is set to a string, it prepends that instead, unless color mode is active and force is set to its default of false. If success/skip is set to false, then it wouldn't prepend anything even if color mode is active. If force is set to true, then prepends would function in color mode.

mikf added a commit that referenced this issue May 27, 2022
work in progress

the same output as produced by "mode": "color" can be achieved with

"output": {
    "mode": {
        "start"  : "{}",
        "success": "\r\u001b[1;32m{}\u001b[0m\n",
        "skip"   : "\u001b[2m{}\u001b[0m\n",
        "progress"      : "\r{0:>7}B {1:>7}B/s ",
        "progress-total": "\r{3:>3}% {0:>7}B {1:>7}B/s "
    }
}

to make 'output.shorten' work correctly, it is necessary to manually
specify the number of extra characters:

        "start"  : [12, "Downloading {}"]
@mikf
Copy link
Owner

mikf commented May 27, 2022

@ImportTaste instead of implementing all sort of extra options, 7990fe8 adds a way to more or less completely customize the standard output and download progress bar.

No explicit documentation for now, since I'm not sure if I want to leave it like this (suggestions welcome), but the commit message example is hopefully good enough to get the general idea across.

For the download progress format strings:

  • progress is for when the total amount is unknown
  • progress-total for when it is known
  • {0} is bytes downloaded
  • {1} is bytes per second
  • {2} is number of total bytes
  • {3} is percent of total bytes

@AlttiRi
Copy link
Author

AlttiRi commented Jun 29, 2022

Something wrong with console width detection now:
Screenshot
image

Hm, the old exe files work the same.
Maybe I did note it before because I use 180 columns console width by default.
However, it worked before, it seems for me, but I can't reproduce it now with the old versions, it's strange.
Anyway, currently it works incorrectly.

    "output": {
        "mode": "color",
        "colors": {
            "success": "1;32",
            "skip"   : "38;5;245"
        },
        "progress": true,
        "shorten": true,
        "private": true,
        "log": { 
            "level": "info",
            "format": {
                "debug"  : "\u001b[0;37m{name}: {message}\u001b[0m",
                "info"   : "\u001b[1;37m{name}: {message}\u001b[0m",
                "warning": "\u001b[1;33m{name}: {message}\u001b[0m",
                "error"  : "\u001b[1;31m{name}: {message}\u001b[0m"
            }
        }
    },

@mikf
Copy link
Owner

mikf commented Jul 1, 2022

The problem is that with "shorten": true, gallery-dl assumes all characters to have a width of 1.
Use "shorten": "eaw" (East Asian Width) to make it work with characters with a width of 2 columns.

This is not the default because it produces wrong results if the appropriate fonts aren't available and all/some characters are displayed as ? or , and the algorithm is at least an order of magnitude slower.

@AlttiRi
Copy link
Author

AlttiRi commented Jul 1, 2022

Hm, it works. I thought I already used monospaced font.

For example, I usually use "" as a separator for keys. In the screenshots above it is not long.

Ah, wait, yeah, the hieroglyphs take 2 chars width. On the first line there are 10 hieroglyphs so 10 chars were moved to the second line.

@Hrxn
Copy link
Contributor

Hrxn commented Jul 1, 2022

Hm, it works. I thought I already used monospaced font.

For example, I usually use "" as a separator for keys. In the screenshots above it is not long.

Ah, wait, yeah, the hieroglyphs take 2 chars width. On the first line there are 10 hieroglyphs so 10 chars were moved to the second line.

nb. this has nothing to do with monospaced fonts. Monospace here means simply the glyphs all have the same horizontal width, this applies only to display/rendering, it's not related to the "technical" width of a character (or a string) of 1 (as in one byte), that is the encoding, not displaying.

mikf added a commit that referenced this issue Jan 1, 2023
@mikf mikf closed this as completed Jan 5, 2023
mikf added a commit that referenced this issue Feb 26, 2023
(#1621, #2152, #2529)

Allow setting custom input/output encodings and options
without having to rely on Python's defaults.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants