Navigation Block: Properly decode URL-encoded links #46435

Merged
6 changes: 5 additions & 1 deletion packages/block-library/src/navigation-link/edit.js
@@ -435,7 +435,11 @@ export default function NavigationLinkEdit( {
 <TextControl
 	value={ url || '' }
 	onChange={ ( urlValue ) => {
-		setAttributes( { url: urlValue } );
+		updateAttributes(
+			{ url: urlValue },
+			setAttributes,
+			attributes
+		);
 	} }
 	label={ __( 'URL' ) }
 	autoComplete="off"
36 changes: 35 additions & 1 deletion packages/block-library/src/navigation-link/index.php
@@ -120,6 +120,40 @@ function block_core_navigation_link_render_submenu_icon() {
return '<svg xmlns="http://www.w3.org/2000/svg" width="12" height="12" viewBox="0 0 12 12" fill="none" aria-hidden="true" focusable="false"><path d="M1.50002 4L6.00002 8L10.5 4" stroke-width="1.5"></path></svg>';
}

/**
* Checks if the given url is encoded
*
* @param string $url The url to check.
*
* @return boolean Whether or not a url is encoded.
*/
function is_url_encoded($url) {
$query = parse_url($url, PHP_URL_QUERY);
$query_params = wp_parse_args($query);
foreach($query_params as $query_param){
if(rawurldecode($query_param) !== $query_param){
return true;
}
}
return false;
}


/**
* Decodes a url if it's encoded, returning the same url if not.
*
* @param string $url The url to decode.
*
* @return string $url Returns the decoded url.
*/
function urldecode_once($url){
if(is_url_encoded($url)){
return rawurldecode($url);
}
return $url;
}
Member

Can we consolidate these two functions into one function called block_core_navigation_link_maybe_urldecode()?

For better or for worse, maybe_* functions are a common pattern in WordPress core: https://developer.wordpress.org/?s=maybe

It seems like there are some PHPCS issues that need to be fixed too.

Contributor Author @kozer, Jan 3, 2023

I'd prefer to keep them as two separate functions: it's clear what each one does, and we can reuse is_url_encoded elsewhere if we need to test for encoded URLs. Merging them ties the check to a "side effect" (decoding) whenever the URL is encoded.

However, I'll rename urldecode_once to maybe_urldecode to match the common pattern in WordPress core.

Thanks for pointing this out!

> It seems like there are some PHPCS issues that need to be fixed too.

Can you elaborate a bit more on this? I don't see any PHPCS issues locally for those two functions.

Member

@kozer Given that this will eventually be included in WordPress core, I'd like them consolidated into one function to match the pattern already established there.

Contributor Author

@danielbachhuber Doesn't the maybe_urldecode renaming establish that pattern?

Member

@kozer If it's only one block_core_navigation_link_maybe_urldecode() function, then yes 😊

In the context of a large open source project, we shouldn't introduce two functions if we only need one.
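
As a rough illustration of the consolidation requested here, the check-then-decode logic of is_url_encoded and urldecode_once can be folded into a single function. The sketch below is a JavaScript port for illustration only, not the PHP in this PR: maybeUrlDecode and tolerantDecode are hypothetical names, URLSearchParams stands in for wp_parse_args, and a wrapped decodeURIComponent stands in for rawurldecode (which, unlike this stand-in, still decodes the valid sequences in a partially malformed string).

```javascript
// Hypothetical stand-in for PHP's rawurldecode(): malformed input is
// returned unchanged instead of throwing a URIError.
function tolerantDecode( value ) {
	try {
		return decodeURIComponent( value );
	} catch ( e ) {
		return value;
	}
}

// Single "maybe" helper: decode the URL once, but only if a query value
// still changes after the parser has already decoded it once.
function maybeUrlDecode( url ) {
	const queryStart = url.indexOf( '?' );
	if ( queryStart === -1 ) {
		return url;
	}
	const query = url.slice( queryStart + 1 ).split( '#' )[ 0 ];
	// URLSearchParams percent-decodes each value once while parsing.
	for ( const value of new URLSearchParams( query ).values() ) {
		if ( tolerantDecode( value ) !== value ) {
			return tolerantDecode( url );
		}
	}
	return url;
}

const doubleEncoded = maybeUrlDecode(
	'https://example.com/?id=10&data=lzB%252Fzd%252FZA%253D%253D'
);
const malformed = maybeUrlDecode(
	'https://example.com/?id=10&data=lzB%2Fzd%FZA%3D%3D'
);
const plain = maybeUrlDecode( 'https://example.com/?id=10&data=1234' );
```

With the three URLs from this PR's PHPUnit test, only the double-encoded one changes (to https://example.com/?id=10&data=lzB%2Fzd%2FZA%3D%3D); the other two come back as they went in.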

Contributor Author

@danielbachhuber I made the change as requested!
A side note:
I'm not sure whether the PHPCS errors you mentioned are related to tabs vs. spaces. Neither VS Code nor Neovim shows any warnings or errors for me.
Are tabs the way to go in PHP files?

Member

> I'm not sure whether the PHPCS errors you mentioned are related to tabs vs. spaces. Neither VS Code nor Neovim shows any warnings or errors for me.
> Are tabs the way to go in PHP files?

@kozer Here's the failing test: https://github.com/WordPress/gutenberg/actions/runs/3830602130/jobs/6518686126

The failing test runs npm run lint:php.

You can run npm run lint:php in your local environment to see all of the same errors.

I use this VS Code extension to integrate PHPCS:

Name: PHP Sniffer & Beautifier
Id: valeryanm.vscode-phpsab
Description: PHP Sniffer & Beautifier for Visual Studio Code
Version: 0.0.15
Publisher: Samuel Hilson
VS Marketplace Link: https://marketplace.visualstudio.com/items?itemName=ValeryanM.vscode-phpsab

You can see all of WordPress' PHP coding standards here: https://developer.wordpress.org/coding-standards/wordpress-coding-standards/php/



/**
* Renders the `core/navigation-link` block.
*
@@ -171,7 +205,7 @@ function render_block_core_navigation_link( $attributes, $content, $block ) {

 	// Start appending HTML attributes to anchor tag.
 	if ( isset( $attributes['url'] ) ) {
-		$html .= ' href="' . esc_url( $attributes['url'] ) . '"';
+		$html .= ' href="' . esc_url( urldecode_once( $attributes['url'] ) ) . '"';
 	}

 	if ( $is_active ) {
@@ -89,7 +89,7 @@ export const updateAttributes = (

 	setAttributes( {
 		// Passed `url` may already be encoded. To prevent double encoding, decodeURI is executed to revert to the original string.
-		...( newUrl && { url: encodeURI( safeDecodeURI( newUrl ) ) } ),
+		...{ url: newUrl ? encodeURI( safeDecodeURI( newUrl ) ) : newUrl },
Member

I'm still not super clear why it was necessary to encodeURI() here in the first place. It seems like https://example.com?s=<> is JSON-encoded just fine:

>> var test = {};
undefined
>> test.url = 'https://example.com/?s=<>';
"https://example.com/?s=<>"
>> JSON.stringify(test);
"{\"url\":\"https://example.com/?s=<>\"}"

Maybe because the <> gets stripped out by kses? Can you debug and document why #19679 was necessary in the first place? If it's not necessary, maybe we can remove it entirely.
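
For reference while debugging this, here is a small sketch of what the encodeURI( safeDecodeURI( newUrl ) ) round trip does to different inputs. The safeDecodeURI below is a local stand-in written for this example (a try/catch around decodeURI), assumed to approximate the @wordpress/url helper, and roundTrip is a hypothetical name. The notable case is the last one: decodeURI leaves escapes of reserved characters such as %2F and %3D untouched, so re-encoding turns their % into %25.

```javascript
// Local stand-in (assumption): decodeURI that returns its input on failure.
function safeDecodeURI( uri ) {
	try {
		return decodeURI( uri );
	} catch ( e ) {
		return uri;
	}
}

// The expression used in updateAttributes, wrapped for reuse.
const roundTrip = ( url ) => encodeURI( safeDecodeURI( url ) );

// Raw angle brackets get percent-encoded.
const fromRaw = roundTrip( 'https://example.com/?s=<>' );

// Already-encoded angle brackets are decoded, then re-encoded: no change.
const fromEncoded = roundTrip( 'https://example.com/?s=%3C%3E' );

// %2F and %3D escape reserved characters ("/" and "="), which decodeURI
// does not decode, so encodeURI then encodes each "%" as "%25".
const fromReserved = roundTrip(
	'https://example.com/?id=10&data=lzB%2Fzd%2FZA%3D%3D'
);
```

Here fromRaw and fromEncoded both end up as https://example.com/?s=%3C%3E, while fromReserved becomes https://example.com/?id=10&data=lzB%252Fzd%252FZA%253D%253D, which is the double-encoded form used as input in this PR's PHPUnit test.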

Contributor Author

@danielbachhuber You were right. The <> was stripped out by the esc_url function, which uses kses under the hood.
So I think keeping encodeURI is the way to go here.

Member

> The <> was stripped out by the esc_url function, which uses kses under the hood.

@kozer For posterity, can you provide a full step-by-step documentation of what's going on?

Contributor Author @kozer, Jan 5, 2023

For sure @danielbachhuber! So, the process is the following:

  • User enters a new URL via the editor.
  • Before encodeURI was introduced, the URL was stored as is.
  • After saving, when someone navigates to the page with the inserted URL, the URL is passed to the render_block_core_navigation_link function.
  • When rendered, the URL is passed through esc_url. esc_url calls preg_replace with a pattern that discards the < and > characters:
	$url = preg_replace( '|[^a-z0-9-~+_.?#=!&;,/:%@$\|*\'()\[\]\\x80-\\xff]|i', '', $url );
  • Calling encodeURI on a URL that contains those characters (e.g. encodeURI('http://example.com?s=<>')) produces an encoded URL (here, http://example.com?s=%3C%3E), so esc_url no longer strips them out.

This is why encodeURI was originally introduced in #19679, and it is what produces the bug we are now facing.

I also put those steps in the description of this PR. Given that, I assume this is OK to merge.
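
The steps above can be sketched in a few lines. This is a JavaScript approximation of the single preg_replace call quoted above, not esc_url itself (which does more than this one replacement). stripDisallowedUrlChars is a hypothetical name, and only ASCII input is considered, because PHP applies the \x80-\xff range to bytes while JavaScript applies it to UTF-16 code units.

```javascript
// Approximate port of the character filter quoted from esc_url():
// every character outside the allowed set is removed.
const DISALLOWED_URL_CHARS = /[^a-z0-9\-~+_.?#=!&;,\/:%@$|*'()\[\]\x80-\xff]/gi;

function stripDisallowedUrlChars( url ) {
	return url.replace( DISALLOWED_URL_CHARS, '' );
}

const raw = 'https://example.com/?s=<>';

// Stored as is: "<" and ">" are not in the allowed set, so they are dropped.
const storedRaw = stripDisallowedUrlChars( raw );

// Stored via encodeURI: "%", "3", "C" and "E" are all allowed, so the
// encoded search term survives the filter.
const storedEncoded = stripDisallowedUrlChars( encodeURI( raw ) );
```

In this sketch storedRaw is https://example.com/?s= (the search term is lost), while storedEncoded is https://example.com/?s=%3C%3E.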

...( label && { label } ),
...( undefined !== opensInNewTab && { opensInNewTab } ),
...( id && Number.isInteger( id ) && { id } ),
33 changes: 33 additions & 0 deletions phpunit/class-block-library-navigation-link-test.php
@@ -212,6 +212,39 @@ public function test_returns_link_for_plain_link() {
);
}

public function test_returns_link_for_decoded_link() {

$urls_before_render = [
"https://example.com/?id=10&data=lzB%252Fzd%252FZA%253D%253D",
"https://example.com/?id=10&data=lzB%2Fzd%FZA%3D%3D",
"https://example.com/?id=10&data=1234",
];

$urls_after_render = [
'https://example.com/?id=10&#038;data=lzB%2Fzd%2FZA%3D%3D',
"https://example.com/?id=10&#038;data=lzB%2Fzd%FZA%3D%3D",
'https://example.com/?id=10&#038;data=1234',
];

foreach ( $urls_before_render as $idx => $link ) {
$parsed_blocks = parse_blocks('<!-- wp:navigation-link {"label":"test label", "url": "' . $link . '"} /-->');
$this->assertEquals( 1, count( $parsed_blocks ) );
$block = $parsed_blocks[0];
$navigation_link_block = new WP_Block( $block, array() );
$this->assertEquals(
true,
strpos(
gutenberg_render_block_core_navigation_link(
$navigation_link_block->attributes,
array(),
$navigation_link_block
),
$urls_after_render[$idx]
) !== false
);
};
}

public function test_returns_empty_when_custom_post_type_draft() {
$page_id = self::$custom_draft->ID;
