Add URL Details endpoint to REST API to allow retrieval of info about a remote URL #18042

Merged on Feb 1, 2021 (57 commits)

Commits
bc64186
Scaffold out basic new endpoint
getdave Oct 21, 2019
b57b01d
Implement basic retrival of title tag from remote url
getdave Oct 21, 2019
be851b6
Adds validation, sanitization and permissions checks.
getdave Oct 21, 2019
bb289db
i18n fixes and docblocks
getdave Oct 21, 2019
d6a5572
Adds caching of remote request
getdave Oct 21, 2019
01676a9
Update with feedback
obenland Oct 31, 2019
86491dd
Tie up loose ends
obenland Nov 1, 2019
b02d50e
Remove unneed replacements
obenland Nov 1, 2019
3419f2a
Account for Custom Post Types in permissions check
getdave Dec 11, 2020
80b0cd5
Improve error handling
getdave Dec 11, 2020
07e0379
Improve args checking
getdave Dec 11, 2020
786c2f2
Refactor to enable query of more data in future.
getdave Dec 11, 2020
7f15fc5
Use md5
getdave Dec 11, 2020
b3ad435
Extract utility functions
getdave Dec 11, 2020
7ea84e2
Fix lint errors
getdave Dec 11, 2020
a7631a8
Add unit test scaffold
getdave Dec 11, 2020
73f496c
Attempt to mock HTTP requests for testing. Needs work.
getdave Dec 11, 2020
d80802c
Use DIR convention
getdave Dec 11, 2020
992a332
Allow opt out of cache functionality via filter
getdave Dec 14, 2020
08b3e69
Force tests to opt out of cache. Fix broken tests.
getdave Dec 14, 2020
bbc860f
Adjust 404 case to use standard response code
getdave Dec 14, 2020
cbbf5a8
Add dedicated methods to mock success/failure HTTP responses
getdave Dec 14, 2020
6cac32f
Add test for empty body in response.
getdave Dec 14, 2020
7ac1530
Add ability to filter request args. Update tests to account.
getdave Dec 14, 2020
c43021d
Document cache filter
getdave Dec 14, 2020
dc341c1
Iniital attempt at adding schema
getdave Dec 14, 2020
e8880c4
Add test for unautenticated user.
getdave Dec 14, 2020
808f0d8
Adjust schema to use WP convention
getdave Dec 14, 2020
0fb4c04
Adjust schema defaults
getdave Dec 14, 2020
72ab31a
Add endpoint args schema test
getdave Dec 14, 2020
55bcefc
Removes filter to bypass cache.
getdave Dec 16, 2020
ac6638c
Use existing transient filter to test cache and bypass for majority o…
getdave Dec 16, 2020
c3248e8
Use default timeout.
getdave Dec 16, 2020
15fb676
Removes unnecessary default from schema
getdave Dec 16, 2020
387624f
Use predefine constants for response size
getdave Jan 4, 2021
8684c77
Rearrange test code order
getdave Jan 4, 2021
94f2927
Adds ability to filter cache expiration
getdave Jan 4, 2021
598b857
Allow filtering the data retrived for a given URL
getdave Jan 4, 2021
c2dd92a
Remove @access comments
getdave Jan 11, 2021
7b8b366
Prefer custom error message over get_status_header_desc
getdave Jan 11, 2021
7053110
Utilise add_additional_fields_schema
getdave Jan 11, 2021
dc54ece
Utilise add_additional_fields_to_object
getdave Jan 11, 2021
cbce992
Enable filtering of response object
getdave Jan 11, 2021
987c02c
Add test to assert on ability to filter uncached responses
getdave Jan 11, 2021
ec700d4
Rename url placeholder literal
getdave Jan 11, 2021
5f4cc4e
Ensure accurately testing for correctly passed cached variable in fil…
getdave Jan 11, 2021
3d8becc
Update to test filtering of both cached and uncached responses
getdave Jan 11, 2021
fd44a72
Fixing linting
getdave Jan 11, 2021
5d9f314
Use array offset syntax prefered by Core
getdave Jan 12, 2021
3bcd206
Remove from cache and update filter
getdave Jan 12, 2021
802779f
Update filter name to align with standards
getdave Jan 12, 2021
af45585
Update response code to align with Core standards
getdave Jan 12, 2021
99f30f6
Update cache ttl in docs to use variable
getdave Jan 12, 2021
27de858
Use comment to reference canonical version of filter docs
getdave Jan 12, 2021
12f9724
Update to cache entire HTTP response body and remove additional filter
getdave Jan 30, 2021
5947a94
Tweak code comments
getdave Jan 30, 2021
5967e70
Remove need to encode/decode cache value
getdave Jan 30, 2021
264 changes: 264 additions & 0 deletions lib/class-wp-rest-url-details-controller.php
@@ -0,0 +1,264 @@
<?php
/**
* REST API: WP_REST_URL_Details_Controller class
*
* @package Gutenberg
*/

/**
* Controller which provides REST endpoint for retrieving information
* from a remote site's HTML response.
*
* @since 5.?.0
*
* @see WP_REST_Controller
*/
class WP_REST_URL_Details_Controller extends WP_REST_Controller {

/**
* Constructs the controller.
*/
public function __construct() {
$this->namespace = '__experimental';
$this->rest_base = 'url-details';
}

/**
* Registers the necessary REST API routes.
*/
public function register_routes() {
register_rest_route(
$this->namespace,
'/' . $this->rest_base,
array(
array(
'methods' => WP_REST_Server::READABLE,
'callback' => array( $this, 'parse_url_details' ),
'args' => array(
'url' => array(
'required' => true,
'description' => __( 'The URL to process.', 'gutenberg' ),
'validate_callback' => 'wp_http_validate_url',
'sanitize_callback' => 'esc_url_raw',
'type' => 'string',
'format' => 'uri',
),
),
'permission_callback' => array( $this, 'permissions_check' ),
'schema' => array( $this, 'get_public_item_schema' ),
),
)
);
}

/**
* Get the schema for the endpoint.
*
* @return array the schema.
*/
public function get_item_schema() {

if ( $this->schema ) {
return $this->add_additional_fields_schema( $this->schema );
}

$schema = array(
'$schema' => 'http://json-schema.org/draft-04/schema#',
'title' => 'url-details',
'type' => 'object',
'properties' => array(
'title' => array(
'description' => __( 'The contents of the <title> tag from the URL.', 'gutenberg' ),
'type' => 'string',
'context' => array( 'view', 'edit', 'embed' ),
'readonly' => true,
),
),
);

$this->schema = $schema;

return $this->add_additional_fields_schema( $this->schema );
}

/**
* Retrieves the contents of the <title> tag from the HTML
* response.
*
* @param WP_REST_Request $request Full details about the request.
* @return WP_REST_Response|WP_Error The parsed details as a response object or an error.
*/
public function parse_url_details( $request ) {

$url = untrailingslashit( $request['url'] );

if ( empty( $url ) ) {
return new WP_Error( 'rest_invalid_url', __( 'Invalid URL', 'gutenberg' ), array( 'status' => 404 ) );
}

// Transient per URL.
$cache_key = $this->build_cache_key_for_url( $url );

// Attempt to retrieve cached response.
$cached_response = $this->get_cache( $cache_key );

if ( ! empty( $cached_response ) ) {
$remote_url_response = $cached_response;
} else {
$remote_url_response = $this->get_remote_url( $url );

// Exit if we don't have a valid body or it's empty.
if ( is_wp_error( $remote_url_response ) || empty( $remote_url_response ) ) {
return $remote_url_response;
}

// Cache the valid response.
$this->set_cache( $cache_key, $remote_url_response );
}

$data = $this->add_additional_fields_to_object(
array(
'title' => $this->get_title( $remote_url_response ),
),
$request
);

// Wrap the data in a response object.
$response = rest_ensure_response( $data );

/**
* Filters the URL data for the response.
*
* @param WP_REST_Response $response The response object.
* @param string $url The requested URL.
* @param WP_REST_Request $request Request object.
* @param string $remote_url_response HTTP response body from the remote URL.
*/
return apply_filters( 'rest_prepare_url_details', $response, $url, $request, $remote_url_response );
}

/**
* Checks whether a given request has permission to read remote urls.
*
* @return WP_Error|bool True if the request has access, WP_Error object otherwise.
*/
public function permissions_check() {
if ( current_user_can( 'edit_posts' ) ) {
return true;
}

foreach ( get_post_types( array( 'show_in_rest' => true ), 'objects' ) as $post_type ) {
if ( current_user_can( $post_type->cap->edit_posts ) ) {
return true;
}
}

return new WP_Error(
'rest_cannot_view_url_details',
__( 'Sorry, you are not allowed to process remote urls.', 'gutenberg' ),
array( 'status' => rest_authorization_required_code() )
);
}

/**
* Retrieves the HTML body of a remote URL.
*
* @param string $url The website url whose HTML we want to access.
* @return string|WP_Error The HTTP response body from the remote URL, or an error.
*/
private function get_remote_url( $url ) {

$args = array(
'limit_response_size' => 150 * KB_IN_BYTES,
);

/**
* Filters the HTTP request args for URL data retrieval.
*
* Can be used to adjust response size limit and other WP_Http::request args.
*
* @param array $args Arguments used for the HTTP request
* @param string $url The attempted URL.
*/
$args = apply_filters( 'rest_url_details_http_request_args', $args, $url );

$response = wp_safe_remote_get(
$url,
$args
);

if ( WP_Http::OK !== wp_remote_retrieve_response_code( $response ) ) {
// Not saving the error response to cache since the error might be temporary.
return new WP_Error( 'no_response', __( 'URL not found. Response returned a non-200 status code for this URL.', 'gutenberg' ), array( 'status' => WP_Http::NOT_FOUND ) );
}

$remote_body = wp_remote_retrieve_body( $response );

if ( empty( $remote_body ) ) {
return new WP_Error( 'no_content', __( 'Unable to retrieve body from response at this URL.', 'gutenberg' ), array( 'status' => WP_Http::NOT_FOUND ) );
}

return $remote_body;
}

/**
* Parses the <title> contents from the provided HTML
*
* @param string $html the HTML from the remote website at URL.
* @return string the title tag contents (maybe empty).
*/
private function get_title( $html ) {
preg_match( '|<title>([^<]*?)</title>|is', $html, $match_title );

$title = isset( $match_title[1] ) ? trim( $match_title[1] ) : '';

return $title;
}

/**
* Utility function to build cache key for a given URL.
*
* @param string $url the URL for which to build a cache key.
* @return string the cache key.
*/
private function build_cache_key_for_url( $url ) {
return 'g_url_details_response_' . md5( $url );
}

/**
* Utility function to retrieve a value from the cache at a given key.
*
* @param string $key the cache key.
* @return string|false the value from the cache, or false if nothing is cached.
*/
private function get_cache( $key ) {
return get_transient( $key );
}

/**
* Utility function to cache a given data set at a given cache key.
*
* @param string $key the cache key under which to store the value.
* @param string $data the data to be stored at the given cache key.
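* @return bool True if the value was stored, false otherwise.
*/
private function set_cache( $key, $data = '' ) {
// The cached value is the raw HTTP response body, which is a string.
if ( ! is_string( $data ) || '' === $data ) {
return false;
}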

$ttl = HOUR_IN_SECONDS;

/**
* Filters the cache expiration.
*
* Can be used to adjust the time until expiration in seconds for the cache
* of the data retrieved for the given URL.
*
* @param int $ttl the time until cache expiration in seconds.
*/
$cache_expiration = apply_filters( 'rest_url_details_cache_expiration', $ttl );

return set_transient( $key, $data, $cache_expiration );
}
}
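The title parsing in `get_title()` can be tried outside WordPress. The following is a minimal sketch, assuming the same regular expression as the controller; the function name `sketch_extract_title` and the sample HTML strings are hypothetical and exist only for illustration.

```php
<?php
/**
 * Standalone sketch of the <title> extraction used by the controller.
 * The function name is hypothetical; only the pattern is taken from the diff.
 *
 * @param string $html HTML document to inspect.
 * @return string Trimmed title contents, or an empty string when nothing matches.
 */
function sketch_extract_title( $html ) {
	// Same pattern as get_title(): "i" ignores case, "s" lets "." span newlines.
	preg_match( '|<title>([^<]*?)</title>|is', $html, $match_title );

	return isset( $match_title[1] ) ? trim( $match_title[1] ) : '';
}

// Hypothetical sample documents.
$samples = array(
	'plain'      => '<html><head><title> Example Domain </title></head></html>',
	'uppercase'  => '<HTML><HEAD><TITLE>Shouting</TITLE></HEAD></HTML>',
	'attributes' => '<head><title data-rh="true">Has attributes</title></head>',
	'missing'    => '<head></head><body>No title here</body>',
);

foreach ( $samples as $label => $html ) {
	printf( "%s: [%s]\n", $label, sketch_extract_title( $html ) );
}
```

Because the pattern matches the literal `<title>` opening tag, a title element that carries attributes yields an empty string in this sketch, as does a document with no title at all. Callers of the endpoint should therefore treat an empty `title` as a normal result.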
4 changes: 4 additions & 0 deletions lib/load.php
@@ -71,6 +71,10 @@ function gutenberg_is_experiment_enabled( $name ) {
* End: Include for phase 2
*/

if ( ! class_exists( 'WP_REST_URL_Details_Controller' ) ) {
require_once __DIR__ . '/class-wp-rest-url-details-controller.php';
}

require __DIR__ . '/rest-api.php';
}

12 changes: 12 additions & 0 deletions lib/rest-api.php
@@ -10,6 +10,18 @@
die( 'Silence is golden.' );
}


/**
* Registers the REST API routes for URL Details.
*
* @since 5.0.0
*/
function gutenberg_register_url_details_routes() {
$url_details_controller = new WP_REST_URL_Details_Controller();
$url_details_controller->register_routes();
}
add_action( 'rest_api_init', 'gutenberg_register_url_details_routes' );

/**
* Registers the block pattern directory.
*/
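Once the routes are registered, the controller's two tuning filters, `rest_url_details_http_request_args` and `rest_url_details_cache_expiration`, could be used from a plugin roughly as follows. This is a sketch under stated assumptions: the `myplugin_` callbacks and the chosen values are hypothetical, and `1024` stands in for WordPress's `KB_IN_BYTES` so the callbacks can also run outside WordPress.

```php
<?php
/**
 * Hypothetical callback: lower the response size limit and set a short timeout.
 *
 * @param array  $args Arguments used for the HTTP request.
 * @param string $url  The attempted URL.
 * @return array Adjusted request arguments.
 */
function myplugin_url_details_request_args( $args, $url ) {
	$args['limit_response_size'] = 50 * 1024; // 50 KB instead of the default 150 KB.
	$args['timeout']             = 3;         // Seconds; a standard WP_Http request arg.

	return $args;
}

/**
 * Hypothetical callback: cache remote bodies for ten minutes instead of an hour.
 *
 * @param int $ttl The time until cache expiration in seconds.
 * @return int Adjusted expiration in seconds.
 */
function myplugin_url_details_cache_ttl( $ttl ) {
	return 10 * 60;
}

// Register the callbacks only when running inside WordPress.
if ( function_exists( 'add_filter' ) ) {
	// The request-args filter passes two parameters, so accepted_args is 2.
	add_filter( 'rest_url_details_http_request_args', 'myplugin_url_details_request_args', 10, 2 );
	add_filter( 'rest_url_details_cache_expiration', 'myplugin_url_details_cache_ttl' );
}
```

A shorter expiration makes title changes on the remote site visible sooner, at the cost of more outbound requests.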