Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Gradio sanitizing links; no longer able to open in new window #4536

Closed
1 task done
decent-engineer-decent-datascientist opened this issue Jun 15, 2023 · 17 comments · Fixed by #4577
Closed
1 task done
Assignees
Labels
bug Something isn't working
Milestone

Comments

@decent-engineer-decent-datascientist

Describe the bug

Currently my chat bot returns verified sources with html links as seen below. It uses the target="_blank"> to open these links in a new tab. Gradio past the version that introduced marked now sanitizes this target parameter out of the processed link.

input:
f'<a href="{link}" class="source" rel="noopener noreferrer" target="_blank"> {name} | {score:.0%}</a>

output in published site:
f'<a href="{link}" class="source" rel="noopener noreferrer"> {name} | {score:.0%}</a>

Is there an existing issue for this?

  • I have searched the existing issues

Reproduction

working on it

Screenshot

working on it

Logs

working on it

System Info

gradio>=3.29.1

Severity

blocking upgrade to latest gradio version

@decent-engineer-decent-datascientist decent-engineer-decent-datascientist added the bug Something isn't working label Jun 15, 2023
@abidlabs
Copy link
Member

Hi @decent-engineer-decent-datascientist can you please provide a way for us to reproduce this issue? Can you also confirm that this is an issue on the latest version of Gradio?

@hannahblair
Copy link
Collaborator

I've been able to repro this in a test space here with 3.35.2. Might be a side effect of #3667. Looking into fixing this!

@decent-engineer-decent-datascientist
Copy link
Author

decent-engineer-decent-datascientist commented Jun 16, 2023

@hannahblair thank you for helping out, completely snowed under with work.

@hannahblair
Copy link
Collaborator

hannahblair commented Jun 18, 2023

Ignore my initial theory, it looks like this was a side effect of #4360 and seems to be a known issue with DOMPurify. Will see if there's a viable workaround!

@decent-engineer-decent-datascientist

Ah cool! is there any way within gradio to add those hooks to my app? Thank you for debugging this @hannahblair

@decent-engineer-decent-datascientist

Hey friend, just checking to see if there was a preferred pattern to add those dompurify hooks to Gradio's JS.

@abidlabs
Copy link
Member

@decent-engineer-decent-datascientist Gradio doesn't expose those hooks (and it feels too low-level for us to expose) so no workaround at the moment short of modifying the Gradio code directly.

I think we should just write the hook in Gradio to ensure that _blank is not stripped out, let's see what @hannahblair thinks as well

@khilnani
Copy link

I encountered the same issue and was able to reproduce it using the code below quite easily. While we wait for a fix, does anyone know the last version of gradio where this worked correctly? Version where links in the chatbot had the target="_blank" correctly inserted / preserved and subsequently opened in a new tab

import gradio as gr

with gr.Blocks() as demo:
    chatbot = gr.Chatbot(height=200)
    msg = gr.Textbox()
    clear = gr.ClearButton([msg, chatbot])

    def respond(message, chat_history):
        bot_message = "<a href='http://google.com'>Google HTML Link</a> \nJust a URL to https://google.com"
        chat_history.append((message, bot_message))
        return "", chat_history

    msg.submit(respond, [msg, chatbot], [msg, chatbot])

demo.launch()

Resulting HTML (via inspect element)

<span class="svelte-15hifvz"><p><a href="http://google.com">Google HTML Link</a><br>Just a URL to <a href="https://google.com">https://google.com</a></p>
</span>

@khilnani
Copy link

Based on my testing, the last version when this worked was 3.31.0 (i checked checked each version from the latest and worked backwards)

@abidlabs abidlabs added this to the 3.x milestone Jun 19, 2023
@pngwn
Copy link
Member

pngwn commented Jun 20, 2023

We should make sure we add the correct rel attributes as well to prevent security issues:

cure53/DOMPurify#317 (comment)

We should also add noreferrer in addition to noopener.

@decent-engineer-decent-datascientist

Based on my testing, the last version when this worked was 3.31.0 (i checked checked each version from the latest and worked backwards)

Heads up, that version leveraged marked's html sanitization; it wasn't the best and often parsed html as text.

@khilnani
Copy link

Based on my testing, the last version when this worked was 3.31.0 (i checked checked each version from the latest and worked backwards)

Heads up, that version leveraged marked's html sanitization; it wasn't the best and often parsed html as text.

Thanks. I noticed that... pros and cons. Sadly, having the web page change and user losing the entire chat history when clicking on a link is is what I am trying to avoid

@lucasalvarezlacasa
Copy link

Is this fixed now? I'm outputting an a tag with target="_blank" and it gets automatically removed for some reason.

I understand it is because the documents being referenced by the links point to files that are part of the same URL where the application is running. Is this the case? How do I solve it?.

@pngwn
Copy link
Member

pngwn commented Apr 10, 2024

Depending on the specific issue it should be fixed. If you are using the latest version of gradio and are still having issues, please open a new issue with a minimal reproduction so we can look into it.

@lucasalvarezlacasa
Copy link

I created an issue here explaining my situation.

@6DammK9
Copy link

6DammK9 commented Sep 17, 2024

Just commenting, not going to re-open the issue.
target="_blank" doesn't work when the hyperlink has the same domain with the gradio host.

	DOMPurify.addHook("afterSanitizeAttributes", function (node) {
		if ("target" in node) {
			if (is_external_url(node.getAttribute("href"))) {
				node.setAttribute("target", "_blank");
				node.setAttribute("rel", "noopener noreferrer");
			}
		}
	});

Therefore when you're working on some on-prem solution under a static domain, pay attention to this situation.
My case is just using nginx to make the exact same PDF file host available in 2 domains.

This work: xxx.company.com/chatbot > yyy.company.com/filehost.pdf
This doesn't work: This work: xxx.company.com/chatbot > xxx.company.com/filehost.pdf

@hannahblair
Copy link
Collaborator

I'll look into that @6DammK9! Thanks for raising this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants