
Multi-step measurement implementation #618

Closed
sburnicki opened this issue May 13, 2016 · 14 comments

@sburnicki
Contributor

With a multi-step implementation, we want to get multiple separated measurements by executing a single script. Currently, you can only get one result per executed script.

There is an existing fork with a multi-step implementation (https://github.com/iteratec-hh/webpagetest/tree/multistep), however, the real multi-step implementation is somehow lost in different commits and the fork is not compatible with upstream.

For a new, clean multi-step implementation, I'd like to discuss the design before actually starting on it.

How is multi-step implemented in the fork?
The desired behavior is to be backward-compatible with the current single-step version.
Each new measurement therefore starts with a setEventName command.
Commands executed between two setEventName commands are regarded as a single measurement, which works just like the current implementation.

A script with three measurements would therefore look like this:

setEventName google
navigate http://www.google.com
setEventName aol
navigate http://www.aol.com
setEventName yahoo
navigate http://www.yahoo.com

If we left out the line setEventName yahoo, the second measurement result would be the same as executing both navigates (to aol and yahoo) with the upstream WebPagetest version.

Another upside of this behavior is that it requires very few changes to the current script processing. The main changes are in saving the results, sending them to the server, displaying them, and making them accessible via the REST API.
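The splitting rule described above could be sketched like this (Python is used purely for illustration; the actual agent code is not Python, and the default step name is an assumption):

```python
def split_into_steps(script_lines):
    """Group script commands into measurement steps.

    A `setEventName <label>` line starts a new step; commands before the
    first setEventName fall into a default step (an illustrative
    assumption, not the agent's actual parser).
    """
    steps = []
    current = {"name": "step_1", "commands": []}
    for line in script_lines:
        parts = line.strip().split(None, 1)
        if not parts:
            continue
        if parts[0] == "setEventName":
            if current["commands"]:
                steps.append(current)
            current = {"name": parts[1], "commands": []}
        else:
            current["commands"].append(line.strip())
    if current["commands"]:
        steps.append(current)
    return steps

script = [
    "setEventName google",
    "navigate http://www.google.com",
    "setEventName aol",
    "navigate http://www.aol.com",
    "setEventName yahoo",
    "navigate http://www.yahoo.com",
]
print([s["name"] for s in split_into_steps(script)])
# → ['google', 'aol', 'yahoo']
```

Leaving out `setEventName yahoo` would merge the last two navigates into the `aol` step, matching the backward-compatible behavior described above.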

@pmeenan
Contributor

pmeenan commented May 13, 2016

I don't know that the script modifications are strictly necessary. Even without them, the agent code generates synthetic names and knows the step number it is currently working on. Explicit names are certainly helpful when reporting or presenting to users.

For me, the biggest requirement is for backward compatibility with both the agents and consumers of the API. The structure of the JSON/XML/HAR can't change for the case of a single run and it would be best if the multi-step case was still backward compatible (possibly returning the combined data of all steps as a sequence where a single run would normally report). There is a lot of tooling and automation that uses the existing API and it needs to continue to work.

Something like this might work:

  • The "firstView" and "repeatView" entries under runs, average, standardDeviation and median would be the sum of the metrics from all of the steps.
  • The median run would be selected based on the aggregate timings for a given run
  • In multi-step cases only:
    • The requests would be missing and would need to be pulled from the individual steps
    • The video frames would be missing and would need to be pulled from the individual steps
    • There would be a "steps" array that contained the full "firstView"/"repeatView" entries for each step
    • Each step would include the label as well as the start time relative to the first step
    • The video and request timings in a given step would be relative to the start of that one step
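As a rough sketch of that proposed shape (Python dicts standing in for the JSON; metric names like loadTime and eventName are illustrative assumptions, not the final schema):

```python
# Illustrative sketch of the proposed multi-step result shape.
# Only "firstView" and "steps" come from the discussion; the other
# field names are assumptions for illustration.
steps = [
    {"eventName": "google", "loadTime": 1200},
    {"eventName": "aol",    "loadTime": 1800},
]

first_view = {
    # Top-level metrics are the sum over all steps, as proposed above.
    "loadTime": sum(s["loadTime"] for s in steps),
    # Full per-step entries live in a "steps" array.
    "steps": steps,
}
print(first_view["loadTime"])  # → 3000
```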

As far as the agents reporting the results, the page data, object data, screen shots, traces and other files should get an additional indicator in the file name to report the step number. It should be missing for the first reported step so the files are backwards-compatible with existing servers, and incremented for each reported/recorded step (i.e. don't increment it for steps that have a logData 0 block around them).

The video files also need to be reported in such a way that they get stored into separate directories for each step and so that reporting to a legacy server does not break.
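The file-naming rule for step-indexed files might look like this (an illustrative sketch; the base name and the exact "_&lt;n&gt;" suffix format are assumptions, not the agent's actual convention):

```python
def step_filename(base, ext, reported_step):
    """Build a result file name with a step indicator.

    The first reported step keeps the legacy name so existing servers
    still understand it; later steps get a numeric suffix. The suffix
    format ("_<n>") and base name are assumptions for illustration.
    """
    if reported_step == 1:
        return f"{base}.{ext}"
    return f"{base}_{reported_step}.{ext}"

print(step_filename("1_pagedata", "txt", 1))  # → 1_pagedata.txt
print(step_filename("1_pagedata", "txt", 2))  # → 1_pagedata_2.txt
```

The "reported" step counter is the key detail: steps wrapped in logData 0 don't increment it, so a script with suppressed steps still produces legacy-named files.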

@sburnicki
Contributor Author

About backward compatibility: is API compatibility the only concern, or behavioral compatibility as well?
Both the modifications to the JSON/XML results and to the agent result reports would at least change the behavior of existing scripts that execute multiple steps (where currently only the last step is considered in the result).

This is related to my next point about median, average, standardDeviation:

The "firstView" and "repeatView" entries under runs, average, standardDeviation and median would be the sum of the metrics from all of the steps.
The median run would be selected based on the aggregate timings for a given run

Depending on the use case, the data of the individual steps would also be interesting, not only the aggregated data (we call such a sequence of steps a "journey"). So in a journey with three steps, the median for different steps might come from different runs.
Maybe both should be supported, e.g. by introducing new subelements in the corresponding sections.

For our main interest, the OpenSpeedMonitor, these values do not really matter, so this is not a personal requirement, but a consideration.

The requests would be missing and would need to be pulled from the individual steps
Wouldn't missing requests break API compatibility?

By "pull", do you mean it would be needed to refer to the data in the steps array to avoid too much duplicate data? I would agree with that.

There would be a "steps" array that contained the full "firstView"/"repeatView" entries for each step

I think it makes sense to include a steps array per run, so it should be a subelement of each run in the runs array, right?

@pmeenan
Contributor

pmeenan commented May 19, 2016

Existing multiple-step scripts with multiple measurements are fundamentally broken if you try to do it with the current agent (all steps are crammed together). Behavior for multi-step scripts with only one reported step (logData 0/1 or combinesteps) needs to be maintained though.

For median/average/stddev, if the user wants to do something fancier and calculate the median of each step, they can (and should) calculate that directly off of the data from each run and just ignore the convenience metrics. And yes, by "pull" I mean don't include request data at all in the aggregates because of the large amount of duplicate data.

Agree that the steps would be within each run. To maintain backward compatibility though it probably needs to be one layer lower in the JSON and inside of the "firstView"/"repeatView" entry within each run.

i.e.

"runs" : [
   "1" : {
        "firstView" : {
            "steps": []
       }
    }
]
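Tooling that wants per-step medians could then compute them directly from the per-run step data, as suggested above. A sketch under the assumed structure (metric names are illustrative):

```python
import statistics

# Sketch: computing a per-step median directly from per-run step data,
# ignoring the aggregated convenience metrics. The structure mirrors
# the JSON sketch above; the "loadTime" metric name is an assumption.
runs = {
    "1": {"firstView": {"steps": [{"loadTime": 1000}, {"loadTime": 2000}]}},
    "2": {"firstView": {"steps": [{"loadTime": 1400}, {"loadTime": 1600}]}},
    "3": {"firstView": {"steps": [{"loadTime": 1200}, {"loadTime": 2400}]}},
}

def step_median(runs, step_index, metric="loadTime"):
    """Median of one metric for one step across all runs."""
    values = [run["firstView"]["steps"][step_index][metric]
              for run in runs.values()]
    return statistics.median(values)

print(step_median(runs, 0))  # → 1200
print(step_median(runs, 1))  # → 2000
```

Note that the median of step 1 comes from run 3 while the median of step 2 comes from run 1, illustrating the earlier point that per-step medians may come from different runs.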

@sburnicki
Contributor Author

sburnicki commented May 19, 2016

For multistep measurements you proposed to leave out requests and video frames and include them for the individual steps. I think the same would hold for images, thumbnails, and rawData, since they'd also differ for each step.

However, I'm not convinced of completely leaving them out for multi-step measurements. Of course the current execution of multi-step scripts doesn't return meaningful data, but removing these values might cause existing 3rd party software to crash instead of getting nonsense data.

The same idea holds for general "results" like runs.1.firstView.title and the other values: for multi-step these values do not make sense, but they should be filled with some value to not break other tools (e.g. taken from the first or last measured step).

The steps array with the correct per-step results should only be added in addition.

@sburnicki
Contributor Author

sburnicki commented Jun 8, 2016

Now that I have started digging into the server code, this is my plan for the multi-step implementation:

  • XML result exports
  • JSON result exports
  • HAR results exports
  • CSV exports
  • text export
  • gzip export
  • video creation
  • frame download
  • User Interface
    • generated images
    • summary
    • details
    • performance review
    • mimetype breakdown
    • domain breakdown
    • screenshots
    • filmstrip view
    • custom waterfall
    • page images
  • NodeJS agent
  • NodeJS multistep result handling in server

@sburnicki
Contributor Author

Again about compatibility, especially the output of different runs (taking XML results as an example):

  • What kind of data should exist at the top level of the run-specific output? Results of the first run, or aggregated data?
  • If we use aggregated data, what about the data that cannot be aggregated, or doesn't make sense to aggregate?
  • If xmlResults should contain data about all requests, should it be included at the top level, as now, to stay compatible? If yes, the amount of duplicate data will be quite large. If no, the XML wouldn't be compatible.

Also, I think the data of the first step should then be duplicated:

@pmeenan
Contributor

pmeenan commented Jun 16, 2016

My preference would be that for non-multistep tests nothing changes, but for multistep tests I think it makes sense to do something "smarter".

At the top level of a multi-step test I think it would be great if we could report the combined stats for the flow (like all steps combined into a sequence) and just the stats, not raw request-level stuff.

That way any automation that can trend high-level metrics like page load time, bytes in, etc would still do something reasonable with multi-step tests.

I'm also open to just not including anything at the top level for a multi-step test and require any tooling to handle it explicitly. Given that it hasn't been supported before we don't have to worry about backward compatibility for those (just whatever we can do to make tooling easier/more consistent).

@sburnicki
Contributor Author

Okay, so existing tooling doesn't need to be able to handle result data of multistep tests in any way. That's fine for me.

What I'd like to see is that tooling which supports multistep doesn't need to differentiate between multistep and singlestep results (since singlestep tests would then just be a special case of multistep runs).

However, this would also lead to duplicate data for singlestep runs.

What we could do is to check for a parameter (like &multistep=true) which forces multistep results even for singlestep runs. This way data processing for tooling can stay consistent.
However, a solution without parameters would be even nicer.

@sburnicki
Contributor Author

For documentation purposes:
Both XML and JSON singlestep results can be forced into the multistep format by passing &multistepFormat=1 as a URL parameter to xmlResult.php or jsonResult.php, respectively.

@sburnicki
Contributor Author

With #684 the major UI implementation is finished, so multistep should be usable at least for desktop agents. An investigation of how much effort the NodeJS agent would need follows next.

@pmeenan
Contributor

pmeenan commented Aug 15, 2016

Thanks so much for all of the work you put into this. It was pretty epic and it was great to have everything in manageable chunks.

@sburnicki
Contributor Author

You're welcome. Thank you for checking and merging all this.

@zeman
Contributor

zeman commented Nov 23, 2016

I noticed that HAR exports haven't been ticked off the list above. Can you confirm that they don't yet support multistep?

@sburnicki
Contributor Author

Yes, unfortunately that's right.
