Issues when pasting into post title field #38637

Closed
claudiulodro opened this issue Feb 8, 2022 · 39 comments · Fixed by #42321
Assignees
Labels
[Block] Post Title Affects the Post Title Block [Feature] Paste [Status] In Progress Tracking issues with work in progress [Type] Bug An existing feature does not function as intended

Comments

@claudiulodro
Contributor

claudiulodro commented Feb 8, 2022

Description

While doing some in-depth testing around copy + pasting into the post title field of the editor, I noticed a few issues:

  1. When pasting headings with a lot of formatting from Google Docs into the post title field, a bunch of the original formatting is initially displayed in the editor. This is really just a visual issue, as it doesn't cause issues on the frontend and goes away when a post is re-visited in the editor.
  2. When pasting strings with certain characters into the post title field, it's possible to generate permalinks that can cause redirection issues or crash browsers.
  3. When pasting a full line heading from Google Docs into the post title field in Safari, certain headings crash the editor and the whole browser.

Step-by-step reproduction instructions

To reproduce issue 1 around pasting formatted content from Google Docs into the title field:

  1. Create a Google Doc. Add some headings with a bunch of formatting to the doc.
  2. Copy one of the headings. Paste it into the title field of the editor. Observe much of the styling carries over (color, sometimes size, sometimes font weight, etc.)
  3. Save post. Re-open editor. Observe post title doesn't have all that extra formatting applied and looks normal.
    This screen recording demonstrates this issue well:
    gutenbergpastetitle1

To reproduce issue 2 around permalink generation when certain characters are in the title:

  1. Copy this string, including the character at the end which displays as [obj] or an empty space, depending on your browser:

Orange County Cities Continue Grappling With State Mandated Housing Goals This Week

  2. Paste it into the title field. Publish the post. The generated permalink and title will look like this in different browsers:

Chrome:
Screen Shot 2022-02-08 at 11 36 58 AM

Safari:

Screen Shot 2022-02-08 at 11 38 50 AM

Screen Shot 2022-02-08 at 11 38 57 AM

The presence of this character in URLs seems to cause difficulties for some users' browsers, and can crash the tab when a user visits a link with the character in it. I am unsure how to get that character "naturally" in the clipboard, but it seems to happen when copy + pasting from something (I've encountered it on a number of sites I manage following the WP 5.9 release).

To reproduce issue 3 around the Safari crashing when a heading is pasted into the title field:

  1. This doesn't seem to happen with every Google Doc, but it does happen with many in my testing. In Safari, copy the full heading line to the clipboard and paste it into the post title field. Observe the browser crashes. In the below screen recording, I copy the heading line, paste it, Safari crashes and reloads, I paste it again, Safari crashes and doesn't reload:
    gutenbergtitlepaste2

Screenshots, screen recording, code snippet

See steps to reproduce

Environment info

I tested on a clean site running only WP 5.9 (using the version of Gutenberg that ships with WP 5.9). Theme was Twenty Twenty Two, but I can reproduce with other themes. I tested with Safari and Chrome. Device was a MacBook Pro.

Please confirm that you have searched existing issues in the repo.

Yes

Please confirm that you have tested with all plugins deactivated except Gutenberg.

Yes

@annezazu annezazu added [Block] Post Title Affects the Post Title Block [Feature] Paste [Type] Bug An existing feature does not function as intended labels Feb 10, 2022
@gwwar
Contributor

gwwar commented Feb 17, 2022

@claudiulodro thanks for the report! Do you happen to have any extra plugins running? I'm having trouble reproducing issues 2/3 on 5.9

We use the object replacement character internally in the rich text package, so some havoc can occur if it slips in via paste. Most commonly folks were copy pasting in MS Word.

We previously worked around this in #34851
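
To illustrate the hazard described above, here is a minimal sketch of removing the object replacement character (U+FFFC) from pasted text before it reaches a rich-text model. The helper name `stripObjectReplacement` is hypothetical; it is not the rich text package's API and is not claimed to be the change made in #34851.

```typescript
// U+FFFC OBJECT REPLACEMENT CHARACTER, which some browsers render as [OBJ].
const OBJECT_REPLACEMENT_CHARACTER = "\uFFFC";

// Hypothetical helper: remove every U+FFFC from text arriving via paste.
function stripObjectReplacement(pasted: string): string {
  return pasted.split(OBJECT_REPLACEMENT_CHARACTER).join("");
}

// Example: a pasted title that picked up a trailing U+FFFC.
const cleanedTitle = stripObjectReplacement("Housing Goals This Week\uFFFC");
console.log(JSON.stringify(cleanedTitle));
```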

@claudiulodro
Contributor Author

Thanks for the reply! That's good info. I reproduced these on a clean site with no extra plugins, specifically for testing these issues, but I'm sure it does probably vary depending on browser version, OS, when a Google Doc was written (I bet the internal markup changes occasionally), etc.

For item 2 specifically, I've been receiving a number of reports and noticing it across many of the sites we host following WP 5.9. In talking to the customers, it does appear that the one thing they have in common is that they write the posts in an external editor and then paste them into WP, so MS Word seems like a good theory. I'll continue collecting data as I encounter the issue and follow up if I get more concrete facts. :)

@gwwar
Contributor

gwwar commented Feb 18, 2022

After chatting with @claudiulodro, it looks like this might require Safari 14 to reproduce. I do see some slightly different behavior on Safari 14 vs 15, but I'm still having trouble reproducing the wrong permalink URL / Safari crash.

@claudiulodro if you can consistently reproduce this, would you be interested in proposing a patch? I'm happy to help review.

@Robertght

Robertght commented Feb 23, 2022

I experienced this recently as well, but in my case the text was copied from a page created with Gutenberg and later pasted into a post.

In this case, it only appears in Firefox.

@gwwar
Contributor

gwwar commented Feb 23, 2022

Hmm, I still didn't have luck reproducing on Firefox. I'll maybe try to bulletproof the slug creation logic to remove the object replacement character, and ask y'all to test in a bit.
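
As a rough picture of what hardening the slug creation logic could look like, the sketch below drops U+FFFC before building a slug. `toSlug` is a hypothetical, ASCII-only helper written for illustration; WordPress's real slug handling is more involved and is not reproduced here.

```typescript
// Hypothetical, ASCII-only slug builder (an assumption for illustration).
function toSlug(rawTitle: string): string {
  return rawTitle
    .replace(/\uFFFC/g, "")       // drop object replacement characters first
    .toLowerCase()
    .trim()
    .replace(/[^a-z0-9]+/g, "-")  // collapse every other run into one hyphen
    .replace(/^-+|-+$/g, "");     // trim leading and trailing hyphens
}

console.log(toSlug("The Title\uFFFC"));
```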

@annezazu
Contributor

I can replicate this repeatedly in Chrome when pasting into the Post Title. Most recently could replicate on the Make Network when writing a post for Make Test.

@gwwar
Contributor

gwwar commented Feb 23, 2022

@annezazu do you see the same behavior in this branch? #39033

@fringillas

I have a similar situation on WP 5.9.1. If you paste from MS Word into the post title on a Mac in Chrome, you will get a "?" at the end of the title when the draft is saved. The styling from MS Word is visible in WP until the draft is saved; after saving, the normal post title styling is visible plus the added "?".

The permalink is also affected, and gets a ""-char at the end. I can't reliably replicate this permalink behavior; it only seems to happen sometimes.

@claudiulodro
Contributor Author

Adding a few more data points: I had 3 more reports of the issue yesterday. In all cases, the users were pasting the title from Google Docs or MS Word into the post title field.

@eduardogoncalves

Hello, I'm having the same issue here. I noticed this is happening when I'm blogging on macOS. When I copy text, like the title of an article (https://g1.globo.com/mt/mato-grosso/noticia/2022/03/08/mulher-que-matou-amiga-com-facada-no-peito-em-mt-e-condenada-a-10-anos-de-prisao.ghtml), and paste it into my post title, on macOS it doesn't show any char/space at the end of the string. But when I open it on a Windows machine, it displays an [obj] char at the end.

In the original site title it doesn't show any special char, so it looks like Gutenberg is adding it.
image

@gwwar
Contributor

gwwar commented Mar 11, 2022

@dmsnell would you be available to help keep an eye on this one, since I'll be less available to contribute as often? From the reports, it's highly likely that there's still an issue here, but the tricky part is being able to reproduce the issue. We're likely missing environment/browser details or additional steps.

@dmsnell
Member

dmsnell commented Mar 11, 2022

  1. When pasting headings with a lot of formatting from Google Docs into the post title field, a bunch of the original formatting is initially displayed in the editor

I was able to reproduce this and believe this probably relates to the interaction with the paste handler on the title. The title block I think is stripping away the formatting by design (because the title can't have any formatting), but we still paste the HTML contents of the clipboard in instead of the plaintext contents.

If I can find some time I will confirm this and see if it's an easy fix.
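
The difference between the two clipboard flavors mentioned above can be sketched as follows. The clipboard is modeled as a plain object keyed by MIME type, and `titleTextFromPaste` is a hypothetical helper; this is an assumption-laden illustration, not the editor's actual paste handler.

```typescript
// A copy usually places several representations of the same content on the
// clipboard; here they are modeled as a MIME type to string lookup.
type ClipboardPayload = { [mimeType: string]: string };

// Hypothetical helper: a field that cannot hold formatting reads only the
// text/plain flavor and ignores text/html entirely.
function titleTextFromPaste(payload: ClipboardPayload): string {
  const plain = payload["text/plain"];
  return typeof plain === "string" ? plain : "";
}

const examplePayload: ClipboardPayload = {
  "text/plain": "The title",
  "text/html": '<h1 style="font-size: 72pt;"><span>The title</span></h1>',
};
console.log(titleTextFromPaste(examplePayload));
```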

  2. When pasting strings with certain characters into the post title field, it's possible to generate permalinks that can cause redirection issues or crash browsers.

I was unable to reproduce this but I'm running macOS. I tried in Firefox, Chromium, and Safari.

  3. When pasting a full line heading from Google Docs into the post title field in Safari, certain headings crash the editor and the whole browser.

Was able to reproduce this but had some trouble figuring out where the error is because of how React error boundaries are swallowing them up. It seems like an image might trigger it. The copy contents of the offending HTML follow

Offending HTML
<h1 dir="ltr" id="docs-internal-guid-c1cd5ed7-7fff-3a3f-1919-cc48a9b51d7b" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0); font-style: normal; font-variant-caps: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; text-decoration: none; line-height: 1.38; margin-top: 20pt; margin-bottom: 6pt;"><span style="font-size: 72pt; font-family: Lobster, cursive; color: rgb(0, 0, 0); background-color: transparent; font-weight: 400; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-variant-east-asian: normal; font-variant-position: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">The title</span><span style="border: none; display: inline-block; overflow: hidden; width: 208px; height: 208px;"><img src="https://lh5.googleusercontent.com/t8NWrgIkqzbB-Ai6s94pxgZ8SYwiXg1qQTx0jsWzU1juZ3FNT-JeCWg1L2WGRU7nAd6DjzJF1_7JzsRZMD_vBbHLrZG9qfgzHOF52jui-CIF35pcJyvvsDkk2ttNIZQ8HxWhBG-o" width="208" height="208" style="margin-left: 0px; margin-top: 0px;"></span></h1><p dir="ltr" style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0); font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; text-decoration: none; line-height: 1.38; margin-top: 0pt; margin-bottom: 0pt;"><span style="font-size: 11pt; font-family: Arial; color: rgb(0, 0, 0); background-color: transparent; font-weight: 400; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-variant-east-asian: normal; font-variant-position: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">And a </span><a 
href="https://www.youtube.com/watch?v=dQw4w9WgXcQ" style="text-decoration: none;"><span style="font-size: 11pt; font-family: Arial; color: rgb(17, 85, 204); background-color: transparent; font-weight: 400; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-variant-east-asian: normal; font-variant-position: normal; text-decoration: underline; text-decoration-skip: none; vertical-align: baseline; white-space: pre-wrap;">link</span></a><span style="font-size: 11pt; font-family: Arial; color: rgb(0, 0, 0); background-color: transparent; font-weight: 400; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-variant-east-asian: normal; font-variant-position: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">!</span></p>

Surprisingly it doesn't appear to crash if I'm pasting over an existing title. It only crashes if the title is empty before pasting.


I can try and keep an eye on this but I'm not sure how readily I'll be able to start tackling it. If I don't report back in a week it probably means I'm too occupied with other work to look. Feel free to re-ping me, especially if anyone has ideas for a fix.

@michaelmuniz

We've observed this as well, and typically find it is the result of copy/pasting text from Word into the post title field. Thankfully it is not crashing the editor (likely the result of the previous fixes), but is there a way to suppress a trailing space or other non-displayable characters in the post title field?

Example Post: https://cpj.org/thetorch/2022/03/the-russian-media-is-dead/

We find the object replacement character shows up on Firefox and Android, but not Chrome, Safari, or iOS.

Interestingly enough, we also see this displaying on the Twitter card when sharing out to social media (any browser), e.g.: https://cpj.org/thetorch/2022/03/the-russian-media-is-dead/?share=twitter&nb=1

@eduardogoncalves

@michaelmuniz on Windows it shows up on Firefox and Chrome.

Firefox

image

Chrome 99.0.4844.51

image

@JuanDBB

JuanDBB commented Mar 18, 2022

Hello
This is a big problem for a newspaper, where journalists copy and paste titles from Word and similar tools. We have had this problem since WordPress 5.9. You can see the OBJ even in the permalink.

@ckeeney

ckeeney commented Apr 6, 2022

Adding this filter will replace the [obj] in the post slug with some alphanumeric characters.

In my test cases, post-slug[obj] became post-slugefbfbc, and the URL encoding of [obj] was %ef%bf%bc. This is obviously less than ideal, but far better than having the non-printable characters in the URL.

add_filter("wp_unique_post_slug", function($slug, $post_ID, $post_status, $post_type, $post_parent, $original_slug) {
    return preg_replace('/[^\w-]/', '', $slug);
}, 10, 6);
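
Why the filter leaves `efbfbc` behind can be shown with a small TypeScript sketch. It assumes the slug reaches the filter already percent-encoded, which matches the `%ef%bf%bc` form reported above; `encodeSlug` and `stripNonWordCharacters` are hypothetical names, and the second one simply mirrors the regex in the PHP filter.

```typescript
// Percent-encode a slug and lowercase it, matching the form seen in the URLs.
function encodeSlug(slug: string): string {
  return encodeURIComponent(slug).toLowerCase();
}

// Same pattern as the PHP filter: keep only word characters and hyphens.
function stripNonWordCharacters(slug: string): string {
  return slug.replace(/[^\w-]/g, "");
}

const encodedSlug = encodeSlug("post-slug\uFFFC");          // "post-slug%ef%bf%bc"
const filteredSlug = stripNonWordCharacters(encodedSlug);   // "post-slugefbfbc"
console.log(encodedSlug, filteredSlug);
```

Only the `%` signs are removed, so the hex digits of the three UTF-8 bytes of U+FFFC survive as ordinary letters and digits.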

@digitalsutton

Any update on this? Copy/pasting from Word or Google is incredibly common. Yes, we can try to "train" our content creators, but I think it's fair to expect that title text to be properly filtered to avoid malformed characters appearing to users.

Does anybody have a workaround they can share? Thank you!

@staceypee

We're experiencing this as well since updating to 5.9, and it's driving us a little crazy as it wreaks havoc on our permalink structure. Thanks @ckeeney for the filter which I will try as a stopgap.

@edemir206

Hello,

We are facing the same issue at www.ufsm.br

Examples of our URLs:
https://www.ufsm.br/midias/arco/pesquisa-de-dida-larruscain-aborda-saberes-profissionais-de-musicodocentes-%ef%bf%bc/

https://www.ufsm.br/unidades-universitarias/ct/2022/05/06/depg-divulga-resultado-da-selecao-de-monitores%ef%bf%bc/

image

Our users have been complaining since last week, but I don't know how many URLs could be affected by now.

They reported that they are copying and pasting titles from external editors into Gutenberg.

Our WP version is 5.9.3.

Hoping for a fix soon.

@rfischmann

I don't know if this is related to the "OBJ" issue, nor if the "OBJ" issue has been fixed in WordPress 6.0, but I'm running 6.0 and it still has another bug when copying and pasting using Firefox.

Even if I copy from the title itself, if I paste normally (and not using the special keyboard shortcut to paste in plain text), it breaks the whole editor layout:

title

@dmsnell
Member

dmsnell commented Jun 8, 2022

can someone who is able to reproduce this please copy the offending content from Word and paste it into my clipboard viewer and then paste the full contents of that page here into this issue? you can inspect the source code to verify that nothing nefarious is going on in that code.

Screen.Recording.2022-06-08.at.5.34.42.PM.mov

@dmsnell
Member

dmsnell commented Jul 1, 2022

Finally I've been able to figure out some reproduction steps, and I have to say I'm no longer surprised that this was so hard to figure out. I suspect at this point we only get into this situation after deleting an existing title and then pasting; if you can confirm that you have seen this bug without hitting backspace, delete, or pasting over existing content, I'd like to know.

Some interesting bits:

  • The [OBJ] character isn't coming over from Google Docs. It's generated inside the RichText on the page title.
  • The [OBJ] character doesn't appear to come over when the title is already blank. It comes in after pasting over or incompletely deleting and then pasting into the title.
  • The pasteHandler properly reports the plaintext value of the title back to the RichText instance.
  • If you copy a single line and paste it then you get the [OBJ] but if you select more than one line and paste you don't.
  • If you copy a single line from the left to the right you get it but if you copy from the right to the left then you don't.

Reproduction steps

  • Create a Google Doc with two lines in it.
  • Copy one of the lines and paste into your Gutenberg document as the post title.
  • Save.
  • Everything is fine.
  • Select the title in the editor and paste again. You should see the added vertical space inserted at this point.
  • Save.
  • The [OBJ] character is there.
Screen.Recording.2022-07-01.at.4.18.45.PM.mov

The extra space in the video is <br class="Apple-interchange-newline"> inserted after the title. We can see it in the contentEditable inside the title's RichText.

Screen Shot 2022-07-01 at 4 24 30 PM
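
The stray element described above could be removed from pasted HTML with something like the sketch below. `removeInterchangeNewline` is a hypothetical helper based only on the markup quoted in this comment; it is not claimed to be the fix that later landed in #42321.

```typescript
// Hypothetical cleanup: drop every <br class="Apple-interchange-newline">
// from a pasted HTML string, leaving all other markup untouched.
function removeInterchangeNewline(html: string): string {
  return html.replace(/<br\s+class="Apple-interchange-newline"\s*\/?>/gi, "");
}

const pastedHtml = 'The title<br class="Apple-interchange-newline">';
console.log(removeInterchangeNewline(pastedHtml));
```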

@ironprogrammer
Contributor

ironprogrammer commented Jul 1, 2022

Thanks for the repro notes and video, @dmsnell! To add to this, I've found that the browser used has a big impact on the reproducibility. First, my environment details:

  • OS: macOS 12.4
  • Browsers:
    • Safari 15.5
    • Google Chrome 103.0.5060.53
    • Mozilla Firefox 102.0
  • Server: nginx/1.23.0
  • PHP: 7.4.30
  • WordPress: 5.9.3, 6.0, and 6.1-alpha
  • Theme: twentytwentytwo v1.2
  • Gutenberg plugin NOT active

Note to Readers: Viewing this issue in Firefox or Chrome may more clearly show the character (aka "[OBJ]") being referenced when pasted inline.

Reproduction Test Results

  • With the latest repro steps, I'm able to repro as written in Chrome. Note that the trailing \n, obvious in Clipboard Viewer, must be part of the copied text, as indicated in the video. The font style of the title field appears unaffected in the editor, but the resultant post title contains [OBJ] characters.
  • In Firefox, the copied HTML styling is applied to the title field immediately, upon the first paste. Subsequent pastes do not register as changed content (i.e. the editor doesn’t detect a change to the title). Since no additional line breaks are added, I was unable to "Update" the post. The resultant post does NOT have [OBJ] characters in the title.
  • Over in Safari, the editor title field is styled on first paste, like in Firefox. However, subsequent highlight/pastes DO register a change to the title, like in Chrome. The resultant post title contains [OBJ] characters.

Observations

  1. In both Safari and Chrome it appears that the \n from the paste content correlates to the [OBJ] character being saved to the title. On the resultant post, this character is visible as a blank space in Chrome, and collapsed visually in Safari (hidden). When post titles including this character are viewed in Firefox, the character appears visually similar to [OBJ].
  2. Firefox and Safari reflect the pasted content's HTML styling on the title field when the \n is present (as documented by @claudiulodro), but Chrome does not.
  3. In each test case, if the \n was omitted from the copied text, then the issue was not reproducible.
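
Since the observations above tie the problem to a trailing \n in the copied text, one way to picture a defensive step is to normalize pasted plain text to a single line before it goes into the title. `toSingleLine` is a hypothetical helper, offered as a sketch rather than as what the editor does.

```typescript
// Hypothetical normalizer for a single-line field such as a post title.
function toSingleLine(text: string): string {
  return text
    .replace(/[\r\n]+$/, "")             // drop trailing line breaks entirely
    .replace(/\s*[\r\n]+\s*/g, " ");     // join any remaining lines with a space
}

console.log(JSON.stringify(toSingleLine("The title\n")));
```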

As an aside, I was able to "naturally" reproduce the related slug issue discussed here, as well as over on Trac 55117. This was possible using Chrome and performing a "double paste" of the pasteboard with the sample text containing \n. The additional paste has to occur prior to saving/publishing the post, otherwise the previous good slug is retained. I'll update these findings over in Trac.

Props @dmsnell for the Clipboard Viewer, which proved immensely helpful in understanding what is actually in the clipboard from various sources.

@ironprogrammer
Contributor

I wanted to share additional repro steps from the related Core ticket, Trac 55117#comment:29 "Additional Information".

Key Takeaways: The browser used and how the cursor is placed/moved into the title field matters in how this issue is reproduced.

@github-actions github-actions bot added the [Status] In Progress Tracking issues with work in progress label Jul 11, 2022
@danielcostadev

Solution: Windows (Chrome) CTRL + Shift + V (Paste as plain text)

@Rafiozoo

Rafiozoo commented Aug 6, 2022

Adding this filter will replace the [obj] in the post slug with some alphanumerics characters.

In my test cases, post-slug[obj] became post-slugefbfbc, and in my test case the url encoding of [obj] was %ef%bf%bc. This is obviously less than ideal, but far better than having the non-printable characters in the url.

add_filter("wp_unique_post_slug", function($slug, $post_ID, $post_status, $post_type, $post_parent, $original_slug) {
    return preg_replace('/[^\w-]/', '', $slug);
}, 10, 6);

@ckeeney Thanks! It helps for the slug.
Just checked in WP 6.0.1 that after normal post edit / save the filter cleans the slug well.
But in Quick Edit it replaces [OBJ] with the "efbfbc" string.

The code below works for me:

add_filter("wp_unique_post_slug", function($slug, $post_ID, $post_status, $post_type, $post_parent, $original_slug) {
    return preg_replace('/(%ef%bf%bc)|(efbfbc)|[^\w-]/', '', $slug);
}, 10, 6);

@dmsnell
Member

dmsnell commented Aug 8, 2022

@Rafiozoo this should be fixed since the merge of #42321 - are you still seeing it in new pastes or is this in an old post?

Note too that changing the URL-encoding into efbfbc is probably neither a helpful change nor the easiest. Removing it altogether I think would lead to a better result wouldn't it?
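
Removing the character altogether, as suggested here, could look like the following sketch for slugs that are already percent-encoded. `removeEncodedObjectReplacement` is a hypothetical name; the sample slug is taken from one of the URLs reported earlier in this thread.

```typescript
// Hypothetical cleanup for an already percent-encoded slug: delete the
// encoded U+FFFC sequence instead of turning it into "efbfbc".
function removeEncodedObjectReplacement(slug: string): string {
  return slug.replace(/%ef%bf%bc/gi, "");
}

const reportedSlug = "depg-divulga-resultado-da-selecao-de-monitores%ef%bf%bc";
console.log(removeEncodedObjectReplacement(reportedSlug));
```

A slug where the character followed a hyphen would keep that trailing hyphen, so a real cleanup would probably want to trim it as well.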

@swinggraphics

Why is this issue closed? The problem still exists in 6.0.2 whether you use the block editor or Quick Edit.

Solution: Windows (Chrome) CTRL + Shift + V (Paste as plain text)

This is a good workaround but not a solution.

@ironprogrammer
Contributor

In response to @swinggraphics:

Why is this issue closed?

It was addressed in #42321, which shipped with Gutenberg 13.8. In the timeline, that coincidentally appeared just below the workaround suggestion, but they're unrelated 😂

Today's beta release of WordPress 6.1 includes this fix. Alternatively, the fix is also included in the Gutenberg plugin since 13.8. If you still encounter the issue after updating with either of these options, please share your experience and environment information here.

@swinggraphics

Today's beta release of WordPress 6.1 includes this fix. Alternatively, the fix is also included in the Gutenberg plugin since 13.8. If you still encounter the issue after updating with either of these options, please share your experience and environment information here.

Gotcha, thank you! The line drawn between replies can be misleading…at least to me. :) I am helping out on a site where the authors run into this constantly. After fixing another half dozen for them today, we'll all be very glad when the fix ships.

@coreyworrell
Contributor

I'm still noticing <strong> tags when pasting text into the title field, in 6.1.1. These get saved to the database.

@bozzmedia

bozzmedia commented Jan 23, 2023

I'm running into this issue on 6.1.1 when pasting linked text which is copied from the frontend of the website itself. The only way to see the source in the title is to look at All Posts to see the inserted HTML. I have had the issue with <strong> but now also <A href="/">

IMO these html tags should be stripped out automatically for post titles, or we at least need a code view to easily clean them up. I run into this issue regularly in the block editor.
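
For anyone auditing existing titles for pasted markup like this, a deliberately naive check might look like the sketch below. `containsMarkup` is a hypothetical helper using a simple regex, not an HTML parser, so treat it as a rough detector only.

```typescript
// Hypothetical, regex-based detector for tag-like markup in a title.
// It looks for "<" or "</" followed by a letter and a closing ">".
function containsMarkup(text: string): boolean {
  return /<\/?[a-z][^>]*>/i.test(text);
}

console.log(containsMarkup("<strong>Breaking</strong> news"));
console.log(containsMarkup("Why a < b matters"));
```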

@ironprogrammer
Contributor

This issue relates to the title field being displayed with styling from the original copy source text. The fix to this issue hides the styling that may be present in the post title field while in the editor. But it doesn't prevent post titles from having markup.

For historical reasons, titles can contain markup, as odd as this may seem -- but this is a feature, and not a bug.

Please note that there is an enhancement underway that would hide this added markup from the posts list table, which may improve things where the markup can be visually distracting: https://core.trac.wordpress.org/ticket/57265.

@coreyworrell
Contributor

@ironprogrammer hiding the markup just seems crazy to me. I understand having markup in the title should be allowed, but no markup should come across when pasting. If you type it out, sure, allow it, show it everywhere. Certainly hiding it from the list table and editor would cause more problems because then you would not know your title has markup until down the road seeing it elsewhere (RSS feed, etc).

@bozzmedia

@ironprogrammer good to know this is a feature and not a bug, thanks for clarifying.

Hiding the markup from the post title is an interesting approach but since the title is still output with the markup it just obscures the issue further. If the Post Title needs to support markup there should be a way to toggle the styling on and off so you can actually edit the markup.

@bozzmedia

Today https://core.trac.wordpress.org/ticket/57265 was marked as wontfix due to concerns.

Please consider re-opening this ticket as this issue (markup pasted or written into post titles is not editable in the block editor) persists. Thank you.

@Himshekhar07

Himshekhar07 commented Feb 9, 2023

When I copy and paste bold words from a Google Docs file or any other site, a <strong> tag shows up in the admin backend with the Twenty Twenty-Three, Twenty Twenty-One, and Twenty Twenty-Two themes.

Today I created a core ticket for this issue as well : https://core.trac.wordpress.org/ticket/57682#ticket

For better understanding I am posting a video:
https://share.cleanshot.com/GTcXSfSJyBdM6rwDRTY4

@ironprogrammer
Contributor

Hi, @Himshekhar07 -- as noted in #38637 (comment), this behavior is intentional.

There is a separate issue you might check on, #46823, that requests markup in the title field be made visible/editable. This might help identify unintended titles before they are saved.

@bozzmedia

Related: #38668
