Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

First version of DebuggingFramework.md. #708

Open
wants to merge 3 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
216 changes: 216 additions & 0 deletions DebuggingFramework.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,216 @@
#WebAssembly Debugging

This is a design proposal for enabling source-level debugging of WebAssembly
modules.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a bit redundant, given that we're in the design repo ;-)


## Introduction

There are several entities involved in debugging a source program compiled into
WebAssembly:

- the compiled code being executed, called the _debuggee_
- the browser engine executing the debuggee
- the browser devtools
- the source-program listing, usually split into multiple source files
- the description of how the debuggee code maps to the source program and vice
versa (e.g., WebAssembly code locations to source code locations, source
variables to WebAssembly memory, WebAssembly call stack to source functions,
etc.), called the _debug info_

These entities carry information and interact in particular ways to enable a
user to perform debugging actions at the source-code level. For example, the
browser devtools offer a UI allowing the user to step through the source code,
requiring partial execution of the debuggee chunk corresponding to each step,
and displaying the source file containing the stepped-through program.

Specifying the debugging framework means describing how these entities interact,
what they need from each other, and how they access it; all in order to execute
the user's commands.

## Debugger as a Procedural Interface

A crucial element of this design proposal is to omit specifying a particular
format for the debug info. Instead, we introduce a separate entity we call the
debugger; its job is to act as a procedural intermediary between the devtools
and the debug info. Devtools execute user actions by leveraging debugger
methods we will specify here.

Thus any debug-info format is acceptable, as long as the debugger can correctly
interpret it. The format is, in effect, a convention between the front-end
parsing the source and the debugger.

What the design _does_ specify is two interaction protocols:

1. between the devtools and the debugger, and
2. between the debugger and the wasm execution engine

## Action Flow

For reference, the entities are charted in the figure below:

![Entities making up the debug framework](wasm-debug-chart.png)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice chart! I'd put the "Debugger" box inside the browser box since it's executing JS or wasm code.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I felt that the wasm module and the debugger should be shown outside, because they don't come with the browser but are downloaded. Then I try to explain in the text how they are used by the browser when obtained.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC 4. and 5. are what you're specifying here, right? It sounds like @lukewagner wants to drop 3. from the diagram, but you'd rather keep it because it can be separate (not being specified in this doc)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, this aims to specify 3 and 5, while leaving 4 unspecified (other than saying how the debugger can find the debug info and source files). The way the debugger interprets the debug info will be hidden from other actors in this picture, freeing the debugger implementers to choose whatever suits them.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see your point and just noticed that "wasm module" is also, symmetrically, outside the box.


(The figure uses the conventional phrase "wasm module" to refer to a debuggee.
For now, we overlook the complexities of multiple wasm modules or mixing JS and
wasm on the same call stack.)

Black arrows in the figure indicate the action flow and timeline:

1. When the page is first rendered, the debuggee is loaded into the browser and
sent to the wasm execution engine.
2. The user interacts with the browser's devtools UI (eg, requesting a source
listing for a function to set a breakpoint).
3. Devtools call on the debugger to execute operation(s) necessary to fulfilling
the user's request. After going through steps 4 and/or 5, the debugger
returns a result for the devtools to display.
4. The debugger consults the debug info and/or source files to obtain the proper
mapping between the source code and the wasm.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: the chart lacks an arrow from the source files to the debugger.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I felt the extra arrow would clutter the picture without adding much information value. I glommed the debug info and source files rectangles together to suggest that they act in unison.

5. Optionally, the debugger requests information from the browser, such as
examining the wasm memory or call stack in the execution engine.

Note that 3 and 5 are bidirectional arrows: sometimes the flow of action goes
from the browser through the debugger to devtools. For instance, this will
happen when a breakpoint is hit and the wasm execution is paused. The browser
will then notify the debugger, who will in turn notify devtools after
translating the wasm program counter to the corresponding source line.

## About the Debugger

We propose that the debugger implement a JavaScript API (by, for example, being
a JavaScript library) that provides the actions described in the next section.
This allows for easy access to the browser, as well as ease of implementation
because JavaScript is widely known and supported.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm worried about 64-bit integer support, and eventually other types such as SIMD. Would we just split them up into 32-bit values? That's clunky but doesn't see too bad.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you saying it's a problem that some wasm types don't map directly to JS types? That sounds like a minor issue, easily worked around. Or is there more that I'm missing?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, that's exactly what I mean. Agreed it's not a big issue, just icky.


This also makes it possible for the debuggee to contain a URL for the debugger
to be used on it. This URL can point to any server, allowing debugger reuse
among many debuggees.

Because wasm [will](GC.md), in the future, be callable from JavaScript, this
opens up the possibility of a debugger itself being a wasm module. Similarly,
this API can, over time, expand to serve the debugging needs of not just wasm
but also all other scripts on the page (most significantly JavaScript modules).

## UI -> Debugger Protocol Operations

The following operations are available regardless of the debuggee's execution
status:

`init(debug_info_url)` the debugger visits the url, obtaining the debug info
(including how to grab source files).

`sources()` returns a list of handles to all source files. A handle contains
the file path and a unique identifier.

`symbol_location(mangled_name)` returns the source location where the named
symbol is defined. Locations consist of a file handle, a line number, and a
column number.

`typeof(mangled_name)` returns the type of the named symbol; for functions,
returns the entire function type, including the return type and the types of all
parameters. (TODO: define type)

`source(file_handle)` returns the source-file text.

`break(location)` sets a breakpoint, returning a handle to it.

`status()` returns a value indicating whether the debuggee is currently running,
paused, or inactive.

`pause()` if the debuggee is currently running, pauses its execution; otherwise,
does nothing.

`resume()` if the debuggee is currently paused, resumes its execution;
otherwise, does nothing.

The following operations are additionally available when the debuggee is paused:

`callstack()` returns the call stack of the debuggee's current execution state.
The call stack is an array whose elements correspond to stack frames. Each
frame has the location, the mangled function name, and argument values.

`step_into()` steps into debuggee's current source line (entering functions).

`step_over()` steps over the current source line (not entering functions).

`value(mangled_name)` returns the current value of the named symbol; the symbol
is looked up in debuggee's current execution context.

`set(mangled_name, value)` sets the current value of the named symbol; the
symbol is looked up in the current debuggee execution context.

`return(value)` makes the current function return the given value. (TODO: what
if the function's return type is void?)

## Debugger -> UI Notifications

- a breakpoint was hit
- an uncaught exception was thrown
- `abort()` was called
- `exit()` was called

(TODO: specify parameters and results)

## About Debug Info and Source Files

The debugger will need to access the debug info and source files for the current
debuggee. The debuggee can specify URLs for these items. Their format is not
specified -- it can be anything that the debugger knows how to interpret.

## Debugger -> Debuggee Protocol Operations

- pause execution
- set a breakpoint at a given byte offset
- get the current call stack
- execute the current wasm instruction and move on to the next one
- execute the current wasm instruction, but if it's a function call, execute the
whole function
- get the value of a wasm memory location
- set the value of a wasm memory location

(TODO: specify parameters and results)

## Debuggee -> Debugger Notifications

- an uncaught exception was thrown

(TODO: specify parameters and results)

## Example End-To-End Flows

Here are a few examples of how the user's actions can be implemented using the
operations listed above:

### Setting and Triggering a Breakpoint

1. The UI initializes the debugger with the debug-info URL from the wasm module.
2. The UI invokes `sources()` and shows a list of file paths.
3. The user picks a file. The UI invokes `source()` on the file's handle to
obtain the file's text, then displays it to the user.
4. The user sets a breakpoint in the displayed source.
- The UI constructs a location from the file and the line, then invokes
`break()`. It stores the returned handle in the list of existing
breakpoints.
- The debugger translates the source location into a wasm byte offset using
the debug info, then asks the debuggee to set a breakpoint there.
5. When the breakpoint is triggered in the debuggee, the debuggee notifies the
debugger, which in turn notifies the UI. The UI looks up the received handle
in the list of all breakpoints and informs the user which breakpoint has just
triggered. The UI then invokes `callstack()` and displays it to the user for
examining.

### Printing a Paused Program's Variable Value

1. The user selects a symbol from the source listing. The UI mangles the name
and invokes `value()` on it.
2. The debugger looks up the symbol in the debug info and obtains its location
in the wasm memory. It asks the debuggee for the memory contents and returns
the result to the UI, which displays it.

### Stepping over a Source Line

1. The users clicks the UI element for stepping over. The UI invokes
`step_over()`.
2. The debugger looks up the current location in the debug info.
3. The debugger then tells the debuggee to execute the current wasm instruction,
executing whole functions.
4. The debugger repeats steps 2 and 3 until the current location changes.
Binary file added wasm-debug-chart.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.