use 'v' flag for regex subtitle parsing #556

Merged 1 commit on Oct 28, 2024
2 changes: 1 addition & 1 deletion README.md
@@ -197,7 +197,7 @@ Useful examples of regular expressions:
 - `(.*)\n+(?!-)(.*)` : Some subtitles are split in several lines and this regex forces them into a single line. For this filter to work, you must also put `$1 $2` in the "Subtitle regex filter text replacement" field.
   - **NB**: When using this regex pattern in combination with other patterns (using the `|` operator, see below), place this pattern at the end. This ensures that all other regex transformations are applied first, and then the results are finally combined into a single line.
 - `-?\[.*\]` : Remove indications enclosed by square brackets that sound or music that is playing (e.g. "**\[PLAYFUL MUSIC]**" or "**\-[GASPS]**")
-- `^[-\(\)\.\sA-ZAÂÃÀÇÉÊÍÓÔÕÚÑ]+$` : As an alternative to the above, filter out descriptions written in capital letters, but without the square brackets (e.g. "**PLAYFUL MUSIC**"). If your language has additional letters with diacritics, you feel free to add them to this list.
+- `^[\-\(\)\.\s\p{Lu}]+$` : As an alternative to the above, filter out descriptions written in capital letters, but without the square brackets (e.g. "**PLAYFUL MUSIC**"). If your language has additional letters with diacritics, you feel free to add them to this list.
@artjomsR (Contributor, Author), Oct 26, 2024:

The missing backslash here is an existing "bug": it needs to be `\-` and not `-`, because the dash does not represent a range and should be an escaped literal character instead (to capture lines such as "- GASPS").

- `[♪♬#~〜]+` : Any combination of symbols on their own that represent playing music (e.g. `♪♬♪`)

Regular expressions can be combined with the character `|` (no spaces needed inbetween). E.g., if you want to use the 2 last regexes from this list, you can use `-?\[.*\]|[♪♬#~〜]+`. You can combine as many regexes as you wish this way.
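
The combination rule above can be sketched in a few lines of TypeScript. This is an illustration only: `combinePatterns` and `stripMatches` are hypothetical names, not functions from the repository, and the sketch compiles with plain `'g'` so that it runs on older runtimes, whereas this PR switches the real code to `'gv'`.

```typescript
// Hypothetical helper (not repository code): join individual filter
// patterns into one alternation, with no spaces around the `|`.
function combinePatterns(patterns: string[]): string {
    return patterns.join('|');
}

// The last two example patterns from the README list.
const bracketed = String.raw`-?\[.*\]`;
const musicSymbols = String.raw`[♪♬#~〜]+`;

const combined = combinePatterns([bracketed, musicSymbols]);

// Assumption: plain 'g' is enough for these two patterns; the PR uses 'gv'.
const combinedRegex = new RegExp(combined, 'g');

// Hypothetical helper: remove every match and trim leftover whitespace.
function stripMatches(text: string): string {
    return text.replace(combinedRegex, '').trim();
}

console.log(combined);
console.log(stripMatches('-[GASPS] Hello ♪♬♪'));
```

Because the alternatives are tried left to right at each position, the order only matters when patterns can match at the same place, which is why the README asks for the line-joining pattern to go last.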
2 changes: 1 addition & 1 deletion common/subtitle-reader/subtitle-reader.ts
@@ -38,7 +38,7 @@ export default class SubtitleReader {
         let regex: RegExp | undefined;
 
         try {
-            regex = regexFilter.trim() === '' ? undefined : new RegExp(regexFilter, 'g');
+            regex = regexFilter.trim() === '' ? undefined : new RegExp(regexFilter, 'gv');
         } catch (e) {
             regex = undefined;
         }
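
To illustrate why the flag change matters, here is a minimal standalone sketch of the same try/catch, applied to the new README pattern. The names `compileFilter` and `isFilteredOut` are hypothetical and do not come from the repository; the sketch assumes a runtime that supports the `v` (unicodeSets) flag, and on one that does not, `compileFilter` simply returns `undefined`, as the catch branch in the diff would.

```typescript
// Sketch only: compileFilter and isFilteredOut are hypothetical names,
// not functions from the repository.
function compileFilter(regexFilter: string): RegExp | undefined {
    if (regexFilter.trim() === '') {
        return undefined;
    }
    try {
        // 'v' enables \p{...} property escapes (as 'u' does) and applies
        // stricter escaping rules inside character classes.
        return new RegExp(regexFilter, 'gv');
    } catch (e) {
        // Invalid pattern, or a runtime that does not support the 'v' flag.
        return undefined;
    }
}

// The pattern added to the README in this PR.
const capsOnly = compileFilter(String.raw`^[\-\(\)\.\s\p{Lu}]+$`);

function isFilteredOut(line: string): boolean {
    if (capsOnly === undefined) {
        return false;
    }
    capsOnly.lastIndex = 0; // global regexes keep state between test() calls
    return capsOnly.test(line);
}

console.log(isFilteredOut('- GASPS'));
console.log(isFilteredOut('Hello there'));
```

Without `u` or `v`, `\p{Lu}` is not treated as a Unicode property escape, so the old `'g'`-only flags would not match uppercase letters with this pattern. Under `v`, a literal `-` inside a character class has to be written as `\-`, which is the point of the review comment on the README change.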