
Testing approach #2481

Closed
f1ames opened this issue Aug 6, 2018 · 3 comments · Fixed by ckeditor/ckeditor5-paste-from-office#16
Labels
status:discussion type:task This issue reports a chore (non-production change) and other types of "todos".

f1ames commented Aug 6, 2018

Unit/integration tests are a crucial part of Paste from Word. Even though some things work out of the box (in most cases) when copying from Word (basic styles, links, etc.), they need to be covered with tests to prevent regressions and to have full control over what is going on in the plugin code.

I have created one integration test for basic styles (bold, italic, underline, strikethrough) to see how we may approach creating these tests in the most effective manner.

1. Tests structure

It is important to properly structure the test data so tests can be easily read and modified. Since there can be a lot of input data files, the main idea is to use the following structure:

- tests
  - data/inputs (not sure about dir name ATM)
    - plugin-name
      - test-name
        - test-group-name
          - input.html
          - input.chrome.html
  - plugin-name
    - test1.js
    - test2.js    
  - plugin-name2.js

The nesting depends on how many separate test files there are for a given plugin. If there is only one, there is no point in creating a separate directory (so there will be tests/plugin-name.js). If there is more than one test file for a single plugin, it becomes a directory, like tests/plugin-name/test1.js and tests/plugin-name/test2.js. For example (one test per plugin):

- tests
  - data/inputs
    - basic-styles
      - bold-within-text // Name of the test section - describe( 'bold within text', ... )
        - input.chrome.html
  - basic-styles.js

The important thing to mention is that there can be a separate input file for each browser and for each Word version (so the worst case scenario of Chrome, Firefox, Safari and Edge for both Word 2016 and Word 2013 gives 8 files). The original Word .docx file from which the input files were generated should also be stored (another two files). Since the Word output is pretty long, combining these files into one does not make sense as it would make them completely unreadable.

When it comes to storing the expected output in a separate file, I am not sure. It is important to test pasting one piece of content in different contexts (inline/block context, tables), so for one input there will be many expected outputs depending on the context. As long as it is not necessary, and because the output is usually much shorter, it can simply be placed inside the test file.

The whole point of strongly structured tests is that it is much easier to automate things: loading a specific input file for a specific browser (see the section below), automatically loading input files for another Word version without modifying the test file, etc.
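For instance, resolving a fixture path from the directory layout above could be handled by a small helper. This is only a sketch; the helper name and the fallback behaviour are assumptions, not an existing API:

```javascript
// Hypothetical helper: resolve the fixture path for a given plugin, test and
// browser, assuming the tests/data/inputs layout described above.
function getFixturePath( plugin, testName, browser ) {
	const dir = `tests/data/inputs/${ plugin }/${ testName }`;

	// Prefer a browser-specific file; fall back to the generic input.html.
	return browser ? `${ dir }/input.${ browser }.html` : `${ dir }/input.html`;
}

console.log( getFixturePath( 'basic-styles', 'bold-within-text', 'chrome' ) );
// tests/data/inputs/basic-styles/bold-within-text/input.chrome.html
```

A helper like this is also the piece that would let tests automatically pick up new input files (e.g. for another Word version) without touching the test code itself.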

2. Testing on different browsers

There are situations when the same Word document pasted into different browsers puts different data into the clipboard. In such situations, each browser should have its own input file. The expected output is the same for each browser (you paste the same data, you expect the same result; only the intermediate data differs due to different browser behaviour). So there might be:

  • input1.chrome.html
  • input1.firefox.html
  • input1.safari.html
  • input1.edge.html

My assumption is that each input should only be used for its specific browser. There is no point in running a test with the input1.firefox.html input on e.g. Chrome, because such data will never appear there.
Even if we assume that the normalisation process is browser independent (so a test for a specific browser should not fail on others), which I am not sure about ATM, it is still much more efficient to run each test only on its designated browser (4 browsers × 1 test each instead of 4 browsers × 4 tests).

For such an implementation we will need a mechanism to detect the current browser and to serve the correct input file inside tests based on the detected browser.
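A minimal sketch of such detection via user-agent sniffing (the function name and patterns are illustrative, not a final implementation). Note that the order of checks matters, because the Edge UA string also contains "Chrome" and "Safari" tokens, and the Chrome UA string also contains "Safari":

```javascript
// Illustrative browser detection: check the most specific UA tokens first,
// since Edge UAs contain "Chrome/" and "Safari/", and Chrome UAs contain "Safari/".
function detectBrowser( userAgent ) {
	if ( /Edge\//.test( userAgent ) ) {
		return 'edge';
	}
	if ( /Firefox\//.test( userAgent ) ) {
		return 'firefox';
	}
	if ( /Chrome\//.test( userAgent ) ) {
		return 'chrome';
	}
	if ( /Safari\//.test( userAgent ) ) {
		return 'safari';
	}

	return 'unknown';
}
```

In a test setup this would be called with `window.navigator.userAgent` and the result fed into the fixture-loading logic.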

3. What to test?

Paste from Word will basically be a 2-step process:

  • The pasted HTML is processed/normalised by a special Word filter (not created yet) which will handle all the quirks, invalid structure, etc. to transform it into something understandable by CKEditor.
  • The standard flow for handling pasted data is executed (conversions, etc.), which results in the pasted content being transformed to the view and then applied to the model (view -> model filtering takes place here), then back to the view and finally to the DOM structure.

The results of both steps need to be validated with tests (the normalised Word input and the final editor content). However, when checking the final editor content, tests may check the model, the view or the DOM (or all of them). As the model represents the final data, which is then transformed to the view and the DOM, checking all 3 layers seems redundant. Checking the model only should be sufficient.
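As a purely illustrative example of what the first (normalisation) step might do (the real Word filter does not exist yet, so both the function name and the transformations below are assumptions), such a filter could strip Office-specific markup before the standard pipeline runs:

```javascript
// Toy normalisation sketch: strip two common Word artifacts from pasted HTML.
// The real filter will be far more involved (lists, invalid structure, etc.).
function normalizeWordMarkup( html ) {
	return html
		// Drop Office-specific classes like class="MsoNormal".
		.replace( /\s*class="Mso[^"]*"/g, '' )
		// Drop empty Office namespace tags like <o:p></o:p>.
		.replace( /<o:p>\s*<\/o:p>/g, '' );
}
```

A unit test for this step would feed it a stored browser-specific input file and compare the result against the expected normalised markup, independently of any editor context.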

f1ames commented Aug 8, 2018

Apart from what I have described above, after creating some more tests I started to wonder how many cases/contexts we should cover per input. For example, for pasting linked text like Regular <a href="https://ckeditor.com/">link</a>1 (see in the test file) I have created tests checking pasting such data in 5 contexts:

  • empty editor
  • non-empty paragraph
  • non-default block element (like heading)
  • inline element (like styling)
  • another link element

and the same for another two input files (two links and a multiline link). So it gives 15 tests for 3 inputs. I think it makes sense for integration tests as it gives good coverage of many different cases/usages. The important thing to remember is that the input files should vary, each covering different cases/formatting/structure, so the tests are really not redundant.

As for unit tests, which should cover the normalisation step (which is context independent), there is no need for different contexts per input.

f1ames commented Aug 10, 2018

I started working on lists (ckeditor/ckeditor5-paste-from-office#5), which means writing integration tests first, and I have some more thoughts about testing.

What to test... again

Earlier I mentioned that:

As the model represents the final data, which is then transformed to the view and the DOM, checking all 3 layers seems redundant. Checking the model only should be sufficient.

It seems to be sufficient when working with inline styles or links. It gets trickier when working with content which is not represented the same way in the model and in the DOM. For example, a model like:

<listItem listIndent="0" listType="bulleted">Item1</listItem>
<listItem listIndent="0" listType="numbered">Item2</listItem>

results in HTML like:

<ul>
    <li>Item1</li>
</ul>
<ol>
    <li>Item2</li>
</ol>

so understanding such tests requires knowing exactly how the model is transformed. Also, as Word documents contain and output an HTML-like structure (well, roughly speaking), it is easier to understand and validate a test when there is an input HTML structure which should be transformed into an output HTML structure, so the two can be compared. The model is an intermediate state (important, and it should also be checked during testing) which can sometimes look quite different from both the input and the expected output.

Test generation

Looking at the basic-styles and link tests, there is a lot of repetition: each test sets the initial HTML, pastes some content and expects a specific output. Creating a full test suite for a single plugin requires a lot of typing and I was thinking about how it could be sped up. The idea is to generate test cases to remove as much repetitive code as possible.

There are basically two approaches. Starting with tests like:

describe( 'bold within text', () => {
    it( 'pastes in the empty editor', () => { ... } );
    it( 'pastes in the paragraph', () => { ... } );
    it( 'pastes inside another bold element', () => { ... } );
} );

describe( 'italic starting text', () => {
    it( 'pastes in the empty editor', () => { ... } );
    it( 'pastes in the paragraph', () => { ... } );
    it( 'pastes inside another italic element', () => { ... } );
} );

they can be shortened to something like:

generateTests( {
    'bold within text': {
        'pastes in the empty editor': () => { ... },
        'pastes in the paragraph': () => { ... },
        'pastes inside another bold element': () => { ... }
    },
    'italic starting text': {
        'pastes in the empty editor': () => { ... },
        'pastes in the paragraph': () => { ... },
        'pastes inside another italic element': () => { ... }
    }
} );

which is a little less code, but the structure is basically the same and it does not make test failures any more readable. Instead of whole test functions it could contain only the expected output, but then large chunks of HTML/model data are super hard to format in a readable way. So I am not happy with it.

The more interesting approach would be something like:

generateTests( {
    'contexts': {
        'pastes in the empty editor': 'context html like <p>Foo bar</p>',
        'pastes in the paragraph': 'context html',
        'pastes inside another bold element': 'context html'
    },
    'inputs': {
        'bold within text': [
            'expected output for context 1',
            'expected output for context 2',
            'expected output for context 3'
        ],
        'italic starting text': [
            'expected output for context 1',
            'expected output for context 2',
            'expected output for context 3'
        ]
    }
} );

so that each input (bold within text and italic starting text) will be used in each context. This form is more concise; however, without knowing how exactly generateTests works it is harder to understand, and it may be hard to debug when trying to match which input fails in which context with what output.
It also suffers from the same problem as the previous proposal: for bigger chunks of HTML it creates just an unreadable chunk of markup. And you are unable to use some contexts only for some inputs.
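For reference, the pairing logic behind this second proposal could look roughly like the sketch below (buildTestMatrix is a hypothetical name; it pairs each input's n-th expected output with the n-th context by index, which also makes the last limitation visible, as every input must provide an output for every context):

```javascript
// Sketch of the matrix expansion behind the second generateTests() proposal:
// the n-th expected output of every input is paired with the n-th context.
function buildTestMatrix( config ) {
	const contextNames = Object.keys( config.contexts );
	const cases = [];

	for ( const inputName of Object.keys( config.inputs ) ) {
		config.inputs[ inputName ].forEach( ( expected, index ) => {
			cases.push( {
				input: inputName,
				context: contextNames[ index ],
				contextHtml: config.contexts[ contextNames[ index ] ],
				expected
			} );
		} );
	}

	return cases;
}
```

A generateTests() built on top of this would then emit one describe()/it() pair per generated case.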

That being said, for the time being I will stick to manually creating whole tests, the same as for basic-styles and link, but paying more attention to what each test brings, so there are no redundant tests checking similar inputs in the same contexts.

f1ames commented Aug 20, 2018

After a F2F discussion with @Reinmar we have established that:

  1. Integration tests should not test pasting in different contexts. Testing pasting in different contexts basically tests how the insertContent() function works, which is not directly related to Paste from Word. It will be enough to test pasting in one context (an empty editor or a simple paragraph).
  2. Integration tests should check the model data generated from the normalized Word content. The input of the insertContent() function should be checked. This way we make sure the input is transformed in the proper place, and that correct pasting results are an effect of correct normalization and not a side effect of the clipboard pipeline or of content transformation during insertion. Also, as model data is mapped to the view and then to the DOM in a consistent manner, the same model will always produce the same final output/content, which means checking the model state is enough.
  3. Unit tests should check the result of the normalization step, as this validates whether Paste from Word normalization works correctly on its own.
  4. Unit tests should also check what data the insertContent() function is called with. This checks that the clipboard pipeline does not change or malform the normalized data in any unpredictable manner. Together with the previous step, this makes sure that everything works as expected and all transformations are handled in the proper places in the pasting pipeline.
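Point 4 boils down to spying on insertContent() and asserting its arguments. In practice a library like sinon would likely be used; a framework-free sketch of the idea (all names here are illustrative):

```javascript
// Minimal hand-rolled spy: records every call's arguments so a test can later
// assert what data insertContent() received from the clipboard pipeline.
function createSpy( fn ) {
	const spy = ( ...args ) => {
		spy.calls.push( args );
		return fn( ...args );
	};
	spy.calls = [];

	return spy;
}

// Usage sketch: wrap the function under observation before triggering the
// paste, then inspect spy.calls[ 0 ] to verify the normalized data was passed
// through the pipeline intact.
```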

Reinmar referenced this issue in ckeditor/ckeditor5-paste-from-office Nov 5, 2018
Other: Support for pasting flat list and spacing handling for pasted content in Safari. Closes #1. Closes #12. Closes #15.
@mlewand mlewand transferred this issue from ckeditor/ckeditor5-paste-from-office Oct 9, 2019
@mlewand mlewand added this to the iteration 21 milestone Oct 9, 2019
@mlewand mlewand added status:discussion type:task This issue reports a chore (non-production change) and other types of "todos". labels Oct 9, 2019