Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Amp spike experiment #254

Closed
wants to merge 18 commits into from
224 changes: 224 additions & 0 deletions AMP Spike/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,224 @@
# AMP Spike

This experiment is an attempt to merge the functionalites of
[Wordpress AMP plugin][1] and [Lullabot AMP Library][2] by making an assessment
of what could be abstracted and put into a PHP package that could be used by
different platforms.

I work with AMP for Drupal and I am one of the contributors to the [Drupal AMP project][3].
The Drupal module uses the [AMP Library][2] to validate and convert the markup.
The Library comes with many interesting functionalities including:
- The ability to validate the AMP markup
- The ability to convert HTML markup to AMP Markup, fixing and removing invalid tags.
- Reporting changes and errors found
- A command line tool that can be used to convert files manually, useful for testing
- PHPUnit tests
- I works with third party dependencies that are managed with composer

The AMP Library was developed in 2016 by [Lullabot][4] in partnership with Google.
Some of the key people involved:
- [Sidharth Kshatriya][5] @sidkshatriya
- [Jerad Bitner][6] @sirkitree
- [Darren Petersen][7] @dsayswhat
- [Matthew Tift][8] @mtift
- [Marc Drummond][9] @mdrummond

Here is the [Full list of contributors](https://github.com/Lullabot/amp-library/graphs/contributors)

Maintainers of the Drupal AMP Module:
- [Matthew Tift][8] @mtift
- [Marc Drummond][9] @mdrummond
- [Sidharth Kshatriya][5] @sidkshatriya
- [KarenS][11] @karens
- [Jerad Bitner][6] @sirkitree
- [Sascha Grossenbacher][12] @berdir
- [Dave Reid][13] @davereid
- [Alex Pott][14] @alexpott
- [Daniel Wehner][15] @dawehner
- [Marcelo Vani][16] @marcelovani

[1]: https://wordpress.org/plugins/amp
[2]: https://github.com/Lullabot/amp-library
[3]: https://www.drupal.org/project/amp
[4]: https://www.lullabot.com/articles/amping-up-drupal
[5]: https://github.com/sidkshatriya
[6]: https://github.com/sirkitree
[7]: https://github.com/dsayswhat
[8]: https://github.com/mtift
[9]: https://github.com/mdrummond
[11]: https://github.com/karens
[12]: https://github.com/berdir
[13]: https://github.com/davereid
[14]: https://github.com/alexpott
[15]: https://github.com/dawehner
[16]: https://github.com/marcelovani

Here is the [Full list of contributors](https://www.drupal.org/node/2582081/committers)

## The problem
As Google AMP team keep pushing new functionalities into the [AMP Html Project][20]
on a daily basis, we have a situation where the AMP Library no longer works with
the latest AMP Tags and attributes, which are stripped out by the AMP Library.
Here are some of the issues for AMP Library and Drupal AMP module:

- Lullabot AMP library Validate Generator needs update: [#2962305](https://www.drupal.org/project/amp/issues/2962305)
- Move Power User mode to separate module: [#2970810](https://www.drupal.org/project/amp/issues/2970810)
- Incorrectly removing AMP-Auto-Ads code: [#2959782](https://www.drupal.org/project/amp/issues/2959782)
- Validator Generated (validator-generated.php file) needs update: [#231](https://github.com/Lullabot/amp-library/issues/231)
- Question: validator-generated.php: [#181](https://github.com/Lullabot/amp-library/issues/181)
- AMP bind script is being removed: [#226](https://github.com/Lullabot/amp-library/issues/226)
- Amp-sidebar is being removed: [#222](https://github.com/Lullabot/amp-library/issues/222)
- rtc-config attribute is getting stripped from <amp-ad> tags: [#220](https://github.com/Lullabot/amp-library/issues/220)
- validator-generated.php seems to be incomplete: [#208](https://github.com/Lullabot/amp-library/issues/208)

AMP Library uses an auto-generated class to validate and strip invalid AMP tags,
[see more details here](https://github.com/Lullabot/amp-library/tree/master/src/Spec)
This file is generated by a the [Lullabot fork](https://github.com/Lullabot/amphtml)
of the [ampproject/amphtml][20].

The problem is that the generated file no longer works with the latest AMP HTML code.
There is another problem, the Validators [on this folder](https://github.com/Lullabot/amp-library/tree/master/src/Validate)
are basically a port from validate.js to PHP classes. The original validate.js
has changed a lot and has been moved into a [folder called *engine*](https://github.com/ampproject/amphtml/tree/master/validator/engine).

[20]: https://github.com/ampproject/amphtml

So, the fix requires a total re-write of the PHP classes and the Python script
that creates the generated validator class.

# The experiment
#### 1. WordPress
I was curious about the [Wordpress plugin][40]

The Wordpress plugin is mainly maintained by:

- [Weston Ruter][31] @westonruter
- [Ryan Kienstra][32] @kienstra
- [Jacob Schweitzer][33] @jacobschweitzer
- [Alberto A. Medina][34] @amedina

[31]: https://github.com/westonruter
[32]: https://github.com/kienstra
[33]: https://github.com/jacobschweitzer
[34]: https://github.com/amedina
[40]: https://github.com/ampproject/amp-wp

Here is the [Full list of contributors](https://github.com/ampproject/amp-wp/graphs/contributors)


I decided to have a look into the code to see how it works and what bits of
functionalities are common with [AMP Library][2].
I can see that the module has some [Sanitizers](https://github.com/ampproject/amp-wp/tree/develop/includes/sanitizers)
which are called [Passes](https://github.com/Lullabot/amp-library/tree/master/src/Pass) in the AMP library.
This is a good sign that we can probably join some of the code.

Since the AMP Library is already a standalone package, I started by moving some
of the code from the Wordpress plugin into the AMP Library. I also removed the
git submodule. See all comments on this Pull request [Abstracting AMP validators and sanitizers into AMP Library ][50].

So, now we have WordPress plugin using AMP Library!

It does the job and also prints some debugging information, see screenshots:
![Screenshots](https://github.com/dennisinteractive/amp-library/raw/amp-spike/AMP%20Spike/Screenshots.jpg)
On the left, you see the Desktop version on a WordPress site.
In the middle, you see the AMP version of the article using the original AMP WP plugin.
On the right, it's the [Forked WP Plugin][50] using the Lullabot Library with some extra
debugging information (optional).

This is the HTML markup of the article
```html
AMP Converter Test
<p style="text-align: center;">Image
<img class="wp-image-13 size-medium aligncenter" src="https://cdn.shopify.com/s/files/1/2222/0805/products/Stereo-845-PP-e1441807241782_480x480.jpg" alt="" width="300" height="198" />
</p>
<p style="text-align: center;">YouTube Video
<iframe src="https://www.youtube.com/embed/cKgIsOA-G-Q" width="560" height="315" frameborder="0" allowfullscreen="allowfullscreen"></iframe>
</p>
```

This is the code that replaces the class:
see https://github.com/marcelovani/amp-wp/pull/4/files#diff-35292ad861a5ed6690412c3ccfe5b0adR318
```php
private function build_post_content() {
$amp = new \Lullabot\AMP\AMP();
$content = apply_filters('the_content', $this->post->post_content);
$options = array(
'scope' => 'body',
'add_stats_html_comment' => false
);
$amp->loadHtml($content, $options);
$this->add_data_by_key('post_amp_content', $amp->convertToAmpHtml());
$this->merge_data_for_key('amp_component_scripts', $amp->getComponentJs());
$this->add_data_by_key('post_amp_debug', $amp->warningsHumanHtml());
return;
```
I wanted to change the scope to apply the AMP markup on the full HTML that is
rendered by WordPress, but I don't know how to do that. But for the proof of
concept, this should be a good start.

#### How to test this
- Install WordPress as usual
- Check out the forked AMP WP pulgin inside the `wp-content/plugins`
folder buy running: `git clone -b amp-library [email protected]:marcelovani/amp-wp.git amp`
- Go inside the `wp-content/plugins/amp` folder and run `composer install`.
- Enable and configure the AMP plugin
- Create an article and visit the /amp url


#### 2. Lullabot AMP Library
I moved some code from the WP plugin and from the Lullabot fork of AMPHTML into
This fork of [AMP Library](https://github.com/dennisinteractive/amp-library/tree/amphtml-generator)
See this pull request [Amphtml generator update][51]

#### How to regenerate the validator scripts
The AMP Library currently has both [validators (class-amp-allowed-tags-generated.php and validator-generated.php)][57]
side by side, until we do the actual work to update the classes and then get rid of one of them.
- Check out the forked AMP library buy running: `git clone -b amphtml-generator [email protected]:dennisinteractive/amp-library.git amp-library`
- Go inside the `amp-library` and run `Vagrant up --provision` to start the VM (Requires [Vagrant][55] and [VirtualBox][56])
- Get inside the VM `vagrant ssh`
- Go to the bin folder `cd /vagrant/bin`
- Run the build script `sh ./build.sh`

The build script will check out the latest tag of [AMPHtml][20], copy the Python
scripts inside `vendor/amphtml/validator` folder, generate the PHP files and copy
them into `src/Spec` folder.
These files can be committed and the AMP Library can be tagged and released.

#### The command line tool
- Go into `wp-content/plugins/amp/vendor/lullabot/amp` folder and run `composer install`
- Run `vendor/bin/amp-console amp:convert vendor/lullabot/amp/sample-html/sample-html-fragment.html`
Options:
```
--no-orig-and-warn If set, the original HTML and warnings/messages encountered during conversion will not be printed out
--no-lines If set, the line numbers will be not printed alongside the AMPized HTML.
--diff If set, a diff of the input and output HTML will be printed out instead of the AMP html. Note that the original HTML will be formatted before being diffed with output HTML for best results. This is because the output HTML is also formatted automatically.
--js If set, a list of custom amp components and the url to include the js is printed out
--full-document If set, assumes this is a whole document html document and not an html fragment underneath the body (which is the default)
--options=OPTIONS If set, loads options from the file indicated
-h, --help Display this help message
-q, --quiet Do not output any message
-V, --version Display this application version
--ansi Force ANSI output
--no-ansi Disable ANSI output
-n, --no-interaction Do not ask any interactive question
-v|vv|vvv, --verbose Increase the verbosity of messages: 1 for normal output, 2 for more verbose output and 3 for debug
```

### Pull requests:

- [WP Plugin using AMP Library][50] pull/4
- [Amphtml generator update][51] pull/6


[50]: https://github.com/marcelovani/amp-wp/pull/4
[51]: https://github.com/dennisinteractive/amp-library/pull/6
[55]: https://www.vagrantup.com/
[56]: https://www.virtualbox.org/wiki/Downloads
[57]: https://github.com/dennisinteractive/amp-library/tree/amphtml-generator/src/Spec

# Challenges and ideas
- Figure out if this is the best approach and update the [Python scripts](https://github.com/dennisinteractive/amp-library/tree/amphtml-generator/bin)
and the PHP classes in <https://github.com/Lullabot/amp-library/tree/master/src/Validate>
- Figure out how to make the WP AMP plugin work for site builders that don't use
Composer. Maybe as part of the build we include the library?
- Provide a Dockerfile as an alternative for the Vagrantfile for the AMP Library
Binary file added AMP Spike/Screenshots.jpg
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.