
Debugging observation failures

rcottingham edited this page Dec 12, 2024 · 1 revision

This page provides a detailed guide to help users debug observation failures.

You will need access to the recording files to play back, identify faults and analyse causes of failure.

For detailed analysis you will need a video editor or video playback tool that gives a frame-accurate view of the file, lets you seek to a specific frame number or time, and lets you step forwards and backwards frame-by-frame, e.g. VLC media player on Windows or PiTiVi on Linux.

For audio analysis you will need a tool that can display and analyse the audio wave from the recording. Sonic Visualiser can be used to closely examine the recorded audio.

The following steps will help you to debug a test where the observation framework reports a failure:

Video Tests

  1. Run the observation framework again with --log debug to get more detailed log information in the logs folder. Further information is here.

  2. Find the frame number where the test starts in the "logs/session.log" file, e.g. "Start a New test: cfhd_12.5_25_50-local/sequential-track-playback__t3.html".

  3. Use a video editor or video playback tool to view the recording and locate the start frame or time of the test.

  4. Analyse the recording, as follows.
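Step 2 can be partly automated. The following is a minimal sketch, assuming only that each test start is logged on a single line containing the text "Start a New test:" as in the example above. The function names are hypothetical and are not part of the Observation Framework. The frame number can then be read from the printed log line or the lines around it.

```python
import re

# Assumption: the marker text matches the example quoted in step 2.
# Nothing else about the layout of a session.log line is assumed.
START_MARKER = re.compile(r"Start a New test:\s*(?P<test>\S+)")


def find_test_starts(lines):
    """Return (line_number, test_path, full_line) for every test start marker."""
    starts = []
    for line_number, line in enumerate(lines, start=1):
        match = START_MARKER.search(line)
        if match:
            # Drop a trailing quote or full stop if the log wraps the path in them.
            test_path = match.group("test").rstrip('".')
            starts.append((line_number, test_path, line.rstrip("\n")))
    return starts


def print_test_starts(log_path="logs/session.log"):
    """Print every test start found in the given log file."""
    with open(log_path, encoding="utf-8", errors="replace") as log_file:
        for line_number, test_path, line in find_test_starts(log_file):
            print(f"line {line_number}: {test_path}")
            print(f"    {line}")
```

Calling print_test_starts() from the folder that contains "logs" lists every test in the session, which helps when one recording covers several tests.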

Obvious failures

From the starting frame of the test (identified in Step 2), play the recording until the test finishes (where s:"finished") to check for any obvious failures, such as:

  • Poor clarity of the captured QR codes, which can lead to large runs of frames being reported as missing. See this document for more details on capturing clear recordings.
  • Test streams should normally start with a green frame and finish with a red frame (except specific tests such as random access). A common reason for failure is the first and/or last frames not being visible - this should be obvious given these markers.
  • Video fading in, or being presented from the top down, when playback starts.
  • If a decoder stalls for a few tens or hundreds of milliseconds during playback, the duration being reported will be longer than expected by the length of the stall.
  • Failure to play 12.5Hz/15Hz content or 50Hz/60Hz content at the correct frame rate will appear as both a (much larger) duration failure and a currentTime failure.
  • A long start up delay is indicated by the gap between "s: playing" and the display of the next frame change (normally frame number 2).
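The effect of a stall or of a wrong frame rate on the reported duration can be estimated with simple arithmetic. The sketch below is illustrative only: the helper names are hypothetical, and the frame rate case assumes the device presents every frame of the content but at a different rate, which may not match how a particular device misbehaves.

```python
def duration_with_stall(nominal_duration_s, stall_ms):
    """Duration expected when the decoder stalls once for stall_ms milliseconds."""
    return nominal_duration_s + stall_ms / 1000.0


def duration_at_wrong_rate(nominal_duration_s, content_rate_hz, presented_rate_hz):
    """Duration expected when every frame is shown, but at presented_rate_hz."""
    total_frames = nominal_duration_s * content_rate_hz
    return total_frames / presented_rate_hz


if __name__ == "__main__":
    # A 30 s stream with a single 200 ms stall.
    print(duration_with_stall(30, 200))
    # A 30 s, 50 Hz stream presented at 25 frames per second.
    print(duration_at_wrong_rate(30, 50, 25))
```

A difference of a few hundred milliseconds therefore points towards a stall, while a duration that is a multiple or fraction of the expected value points towards a frame rate problem.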

Further analysis

For any non-obvious failures, locate the specific point in the recording and step forwards and backwards frame-by-frame.

  • When a small number of frames are reported as missing:

    • If, for example, frame 166 is reported missing, go to any previously detected frame and step forwards in the recording frame-by-frame. If the next frame after 165 is 167, frame 166 really is missing. If frame 166 is presented, check that its QR code is clear enough to be detected by the observation framework.
  • When start up delay has failed:

    • Find the starting frame of the test in "logs/session.log", then step forwards in the recording to find the following frame numbers:
    1. Where the start of playback (where s:"playing") is first detected, e.g.: Status=playing Last Action=play
    2. Where the first frame change after play is first detected.
    • Find the first of these frame numbers in the recording.
    • Note: there may be an offset between the frame numbers used in your video tool and those reported by the Observation Framework. You may need to move forwards and backwards in the video tool to determine this offset.
  • When duration match has failed:

    • Find the starting frame of the test in "logs/session.log", then step forwards in the recording to find the following frame numbers:
    1. Where the second decoded frame is first detected
    2. Where the last decoded frame is first detected
    • Find the first of these frame numbers in the recording.
    • Note: many tests are 30s and, if using 25Hz video, 30s at 25Hz is 750 frames so look for the next instance of "Frame Number=750".
    • Note: there may be an offset between the frame numbers used in your video tool and those reported by the Observation Framework. You may need to move forwards and backwards in the video tool to determine this offset.
  • When currentTime match has failed:

    • Detailed currentTime results are logged in the file "logs/ct_diff.csv". You can sort by "Time Difference" to get all currentTime values that exceeded the tolerance.
    • Find the matching currentTime reports in "logs/session.log" to get the detected frame number in the recording. For example, currentTime 23643 in "logs/ct_diff.csv" appears as "Current Time=23.643" in "logs/session.log".
    • Find the first of the frame numbers in the recording.
    • Note: there may be an offset between the frame numbers used in your video tool and those reported by the Observation Framework. You may need to move forwards and backwards in the video tool to determine this offset.
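The look-ups described in the notes above can be sketched in a few lines. This is an illustration rather than part of the Observation Framework: the function names are hypothetical, the only column name taken from this guide is "Time Difference", and the tolerance and its unit must come from your own test configuration. The sample data in the demo is made up, including its "Current Time" column name.

```python
import csv
import io


def expected_last_frame(duration_s, frame_rate_hz):
    """Number of frames in a stream, for example 30 s at 25 Hz."""
    return round(duration_s * frame_rate_hz)


def tool_frame(reported_frame, offset):
    """Map a frame number from the logs to the video tool, given a measured offset."""
    return reported_frame + offset


def current_time_as_logged(current_time_ms):
    """Format a ct_diff.csv value (milliseconds) in the style used in session.log."""
    return f"Current Time={current_time_ms / 1000:.3f}"


def worst_time_differences(csv_file, tolerance, column="Time Difference"):
    """Return rows whose absolute time difference exceeds tolerance, largest first."""
    rows = [
        row for row in csv.DictReader(csv_file)
        if abs(float(row[column])) > tolerance
    ]
    return sorted(rows, key=lambda row: abs(float(row[column])), reverse=True)


if __name__ == "__main__":
    # Made-up sample standing in for logs/ct_diff.csv.
    sample = io.StringIO(
        "Current Time,Time Difference\n23643,45.0\n1000,-80.5\n2000,3.0\n"
    )
    for row in worst_time_differences(sample, tolerance=40):
        print(row)
    print(current_time_as_logged(23643))
    print(expected_last_frame(30, 25))
```

With a real file, pass an open file object instead, for example open("logs/ct_diff.csv", newline=""), and search "logs/session.log" for the string returned by current_time_as_logged.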

Audio Tests

  1. Run the observation framework again with --log debug to get more detailed log information in the logs folder. Further information is here.

  2. Play back the recording to find the start time of the test.

  3. Use a video editor or video playback tool to view the recording and locate the start frame or time of the test. Use Sonic Visualiser to view the audio recording and the start time of the test.

  4. Analyse the recording by viewing the audio wave in Sonic Visualiser and plotted graphs in the logs folder.

e.g.: xxx_subject_data_0.png shows the audio wave for the test, with the start (blue line) and end (green line) of the subject data marked. xxx_audio_segment_data.png shows the offsets of the detected audio segment data.

Obvious failures

From the starting frame of the test (identified in Step 2), play the recording until the test finishes (where s:"finished") to check for any obvious failures, such as:

  • No audio / unexpected audio / obvious gaps in audio recording during playback. White noise should be played from start to end of test, without any glitches or dropouts.
  • The test should normally start where the audio wave rises and finish where the audio wave falls to nearly 0. The subject_data.png file should show the blue and green vertical lines aligned with the audio start and end. One common reason for failure is that the first and/or last audio segment is not presented.
  • Audio dropouts are another common reason for failure. The audio wave shows gaps when viewed in subject_data.png or, more clearly, in Sonic Visualiser.

Further analysis

For any non-obvious failure, analyse by locating the specific point in the segment data.

  • When start up delay has failed:

    • Find the starting frame of the test in "logs/session.log", then step forwards in the recording to find the frame number where the start of playback (where s:"playing") is first detected, e.g.: Status=playing Last Action=play.
    • Find the first of these frame numbers in the recording.
    • Then find the starting time from audio_segment_data.csv, where the beginning of the audio segment is detected. Normally it is Segment(0.0ms) but it may be another segment when the first is missing or in special tests such as random-access.
  • When a small number of audio segments are reported as failed:

    • If, for example, Segment(220.0ms) is reported as failed, check audio_segment_data.png. A segment that is not plotted in line with the other segments indicates that all or part of the audio from 220ms to 240ms is missing. Audio is measured every 20ms.
  • When duration match has failed:

    • Find the starting time of the following audio segments from audio_segment_data.csv:
    1. Where the beginning of the audio segment is detected. Normally it is Segment(0.0ms) but it may be another segment when the first is missing or in special tests such as random-access.
    2. Where the end of the audio segment is detected. Normally it is the last segment, but it may be another when the last is missing.
  • When audio segment has passed, but duration has failed:

    • This fails the duration requirement in 8.2.5.1 (it can be concluded that playback has gaps but no sequence errors).
    • In this case, the total duration of the gaps and the delays is “small”, where “small” means less than 500ms.
    • If the total duration of gaps and delays is not “small”, the segment test will also fail as explained below (both audio segment and duration failed).
    • (For interest only: The "small" number comes from a “neighbourhood” inspection approach used in the Observation Framework audio automation to reduce test time; audio data is only inspected “in the neighbourhood” of where it is expected to be.)
  • When audio segment has failed, but duration has passed:

    • This fails the segment test in 8.2.5.3 (it can be concluded that playback has an invalid data order).
    • In this case, the audio must have out-of-order or incomplete data.
    • This may imply data missing with gaps (silence), but arranged so as not to cause a delay failure or duration failure.
    • Replacing dropped audio with exactly the same amount of silence will cause this condition.
    • In this case the segment will start to fail at the error location but pass after the failure.
  • When both audio segment and duration have failed:

    • This fails both tests (playback is known to have gaps; the validity of the segment order is undetermined).
    • The audio must have gaps (duration failure), but it is not possible to tell whether the data is in the correct sequence. Some data may be missing or out of order.
    • Strictly speaking, when the segment test fails due to “large” gaps, you will see the segments pass at the beginning of the test, then fail from the first gap to the end. In this case the duration will also be longer than expected.
    • When every segment fails (no successful 20 ms data blocks), the presented data is simply wrong compared to expectations. This may be due to a variety of reasons other than gaps.
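The combinations above can be summarised as a small decision table. The sketch below is an illustration of the reasoning in this section, not code from the Observation Framework: the function names are hypothetical, and the per-segment pass/fail list is assumed to have been read by hand from audio_segment_data.csv or audio_segment_data.png.

```python
SEGMENT_MS = 20.0  # this guide states that audio is measured every 20 ms


def segment_window(segment_start_ms):
    """Return the (start, end) time in ms covered by a segment such as Segment(220.0ms)."""
    return (segment_start_ms, segment_start_ms + SEGMENT_MS)


def diagnose(segment_passed, duration_passed):
    """Map the two observation results to the conclusion given in this section."""
    if segment_passed and duration_passed:
        return "no segment or duration failure"
    if segment_passed:
        return "small gaps or delays (under 500 ms in total), no sequence errors"
    if duration_passed:
        return "out-of-order or incomplete data, e.g. dropped audio replaced by silence"
    return "gaps present, segment order cannot be determined"


def failure_pattern(results):
    """Describe a list of per-segment results (True = passed), in presentation order."""
    if all(results):
        return "all segments passed"
    if not any(results):
        return "all segments failed: presented data does not match expectations"
    first_fail = results.index(False)
    if not any(results[first_fail:]):
        return "passes, then fails to the end: large gap at the first failure"
    return "fails, then passes again: local error at the first failure"
```

For example, diagnose(False, True) corresponds to the case where the segment test fails but the duration passes, and failure_pattern shows whether the failures are confined to one location or continue to the end of the test.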

Audio and Video Combined Tests

The following log files can be used for debugging A/V synchronisation failures:

  1. video_data.csv file
  2. audio_data.csv file
  3. av_sync_diff.csv shows A/V sync differences.
  • When A/V sync has failed - common issues:

    • Delays in either the audio or video starting; the start up delay observation can be checked to diagnose such a case.
    • Gaps in audio.
    • Earliest video and audio sample starting from a different presentation time.