Body filtering #156

Open
Caellian opened this issue Oct 10, 2024 · 2 comments
Labels
enhancement New feature or request @type/new

@Caellian
Caellian commented Oct 10, 2024

I'd like to use this for labeling issues in Conky, but as part of issue reports people often paste their configuration files, which contain the keywords I'd like to check for. This means that matching label keywords against the whole body would yield a lot of false positives.

Example:

Description

Some issue that mentions HW sensors not working. The issue should therefore only be labeled `sensors`.

Config

The user pastes their whole config as part of the bug report:

conky.text = [[
  HDD: ${hdd_variable} -- HDD is detected and issue labeled `disk io`
  Volume: ${volume_variable} -- Volume is detected and issue labeled `audio`
]]

Not the best or most accurate example, but I hope it gets the point across: based on the contents of that code block, the body could trigger every label.

I suggest adding a mechanism that excludes parts of the Markdown content from regex matching, either by checking only the content under a specific title (e.g. Details/Description) or by excluding code blocks. The latter might be easier to implement: remove everything that matches "```(\w+)?.+```".
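To make the code-block idea concrete, here is a minimal Go sketch of stripping fenced blocks before matching. The helper name `stripCodeBlocks` is hypothetical and not part of the labeler; the pattern is a slightly adjusted variant of the one above, with the `s` flag so that `.` spans newlines and a non-greedy body so that each block is removed on its own. It assumes fences are written with three backticks and does not handle indented or `~~~` blocks.

```go
package main

import (
	"fmt"
	"regexp"
)

// fence is three backticks, written with escapes so this example can itself
// be placed inside a fenced block.
const fence = "\x60\x60\x60"

// fencedBlock matches an opening fence, the rest of that line (the optional
// language tag), the block contents, and the closing fence.
// (?s) lets "." match newlines; ".*?" stops at the first closing fence.
var fencedBlock = regexp.MustCompile("(?s)" + fence + "[^\n]*\n.*?" + fence)

// stripCodeBlocks is a hypothetical helper that removes fenced code blocks
// from an issue body so that label regexes only see the prose.
func stripCodeBlocks(body string) string {
	return fencedBlock.ReplaceAllString(body, "")
}

func main() {
	body := "HW sensors are not working.\n\n" +
		fence + "lua\nconky.text = [[\n  HDD: ${hdd_variable}\n]]\n" + fence +
		"\n\nThanks!"
	fmt.Println(stripCodeBlocks(body))
}
```

With the config removed, a `disk io` rule looking for "HDD" no longer sees the pasted block, while the sentence about sensors is still available for matching.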

@srvaroa
Owner

srvaroa commented Nov 14, 2024

Hi, sorry, I forgot to reply to this one. This makes sense; I have to experiment a bit with something similar to your suggestion. I can see something like:

- body:
    pre-filter: <remove-all-regex>
    filter: <actual matching regex>
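A rough Go sketch of how such a two-step match could behave is below. It only illustrates the idea: `bodyMatches` and its parameters are hypothetical names, and the assumption is that `pre-filter` deletes every match from the body before `filter` is tested against what remains.

```go
package main

import (
	"fmt"
	"regexp"
)

// bodyMatches is a hypothetical helper: it first deletes everything that
// matches preFilter (if one is given) and then reports whether filter
// matches the remaining text.
func bodyMatches(body, preFilter, filter string) (bool, error) {
	if preFilter != "" {
		pre, err := regexp.Compile(preFilter)
		if err != nil {
			return false, fmt.Errorf("invalid pre-filter: %w", err)
		}
		body = pre.ReplaceAllString(body, "")
	}
	re, err := regexp.Compile(filter)
	if err != nil {
		return false, fmt.Errorf("invalid filter: %w", err)
	}
	return re.MatchString(body), nil
}

func main() {
	body := "Sensors show no readings.\nconky.text = [[\n  HDD: ${hdd_variable}\n]]\n"
	// (?s) makes "." match newlines so the whole [[ ... ]] block is removed.
	pre := `(?s)\[\[.*?\]\]`

	withPre, _ := bodyMatches(body, pre, `(?i)hdd`)
	withoutPre, _ := bodyMatches(body, "", `(?i)hdd`)
	fmt.Println("disk io with pre-filter:", withPre)
	fmt.Println("disk io without pre-filter:", withoutPre)
}
```

Compiling both patterns on every call keeps the sketch short; a real implementation would more likely compile them once when the config is loaded.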

@Caellian
Author

Caellian commented Nov 17, 2024

Thanks for the reply.

I think that simple title-based filtering might be a good addition to pre-filter down the line. It's usually much simpler to exclude or include everything in a section than to write a regex that removes things. This is less clear from my example, because \[\[.*?\]\] works perfectly in that case, but there are cases where the content to remove can only be told apart by its location, because it is structured the same as the content to match:

Example

Description

[...]

I believe the same might be the case with Ubuntu, but I don't have it installed, so that needs confirmation. Also, Void Linux doesn't use systemd, so that's very likely also the case there.

System

OS: Linux
Distro: Arch Linux

The sanest way of dealing with this using regex would be to remove .*## System, but that feels more error-prone than being able to extract sections.

Go has Markdown libraries, but I think using one would be more complicated than simply splitting the input on lines that start with #{1,6} and internally producing something like:

sections:
  - level: 3
    title: <title>
    body: <content>
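A minimal Go sketch of that splitting step, under a few assumptions: the `Section` type and `splitSections` function are hypothetical names, text before the first heading is dropped, and `#` lines inside code blocks are not treated specially.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// Section mirrors the structure proposed above: heading level, heading text,
// and the text that follows until the next heading.
type Section struct {
	Level int
	Title string
	Body  string
}

// heading matches a line that starts with one to six "#" followed by whitespace.
var heading = regexp.MustCompile(`^(#{1,6})\s+(.*)$`)

// splitSections is a hypothetical helper that splits Markdown into sections
// at heading lines. Text before the first heading is discarded.
func splitSections(md string) []Section {
	var sections []Section
	var cur *Section
	var lines []string

	// flush stores the section collected so far and resets the line buffer.
	flush := func() {
		if cur != nil {
			cur.Body = strings.TrimSpace(strings.Join(lines, "\n"))
			sections = append(sections, *cur)
		}
		lines = nil
	}

	for _, line := range strings.Split(md, "\n") {
		if m := heading.FindStringSubmatch(line); m != nil {
			flush()
			cur = &Section{Level: len(m[1]), Title: strings.TrimSpace(m[2])}
			continue
		}
		lines = append(lines, line)
	}
	flush()
	return sections
}

func main() {
	md := "### Description\nSame might be the case with Ubuntu.\n\n" +
		"### System\nOS: Linux\nDistro: Arch Linux\n"
	for _, s := range splitSections(md) {
		fmt.Printf("level=%d title=%q body=%q\n", s.Level, s.Title, s.Body)
	}
}
```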

Filtering can then use:

- body:
  - title:
      - content: <title name regex>
        level: <title level number>
      - content: <title name regex>
        level: <title level number>
    label: <label name>

# Probably requires some additional abstraction:
- body:
  - section:
      - title: <title name regex>
        <any body items except section>
    label: <label name>

If both get added, I suggest that pre-filter runs after the section selection.
