Flesh out emscripten metadata #8519

Merged 3 commits on May 2, 2019
31 changes: 26 additions & 5 deletions tools/shared.py
@@ -518,7 +518,7 @@ def get_emscripten_version(path):
# For the Emscripten-specific WASM metadata section, follows semver, changes
# whenever metadata section changes structure
# NB: major version 0 implies no compatibility
-(EMSCRIPTEN_METADATA_MAJOR, EMSCRIPTEN_METADATA_MINOR) = (0, 0)
+(EMSCRIPTEN_METADATA_MAJOR, EMSCRIPTEN_METADATA_MINOR) = (0, 1)
# For the JS/WASM ABI, specifies the minimum ABI version required of
# the WASM runtime implementation by the generated WASM binary. It follows
# semver and changes whenever C types change size/signedness or
@@ -2979,8 +2979,20 @@ def delebify(buf, offset):

@staticmethod
def add_emscripten_metadata(js_file, wasm_file):
mem_size = Settings.STATIC_BUMP
WASM_PAGE_SIZE = 65536
Collaborator

so STATIC_BUMP is no longer needed?

Contributor Author

STATIC_BUMP was only stored so that DYNAMICTOP_PTR and tempDoublePtr could be computed from it. Since those are no longer derived from STATIC_BUMP, it's no longer necessary to store it.


mem_size = Settings.TOTAL_MEMORY // WASM_PAGE_SIZE
Collaborator (sbc100, Apr 30, 2019)

Doesn't // do the wrong thing here and round down? I guess TOTAL_MEMORY is already guaranteed to be a multiple of the page size, so it's probably fine.

Contributor Author

Yeah, the intent is to make sure mem_size is an integer. I could have done int(TOTAL_MEMORY / WASM_PAGE_SIZE) but this seemed better.
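
To make the rounding question concrete, here is a minimal sketch (not code from this PR) that compares floor division with a ceiling division when converting a byte count to 64 KiB pages. `pages_floor` and `pages_ceil` are hypothetical helper names; the two agree whenever the byte count is an exact multiple of the page size, which is the assumption made above about TOTAL_MEMORY.

```python
WASM_PAGE_SIZE = 65536  # 64 KiB, the WebAssembly page size


def pages_floor(total_memory):
    """Page count using floor division, as in the diff (rounds down)."""
    return total_memory // WASM_PAGE_SIZE


def pages_ceil(total_memory):
    """Page count rounded up, so a partial page still counts as one."""
    return (total_memory + WASM_PAGE_SIZE - 1) // WASM_PAGE_SIZE


if __name__ == "__main__":
    aligned = 16 * 1024 * 1024      # an exact multiple of the page size
    unaligned = WASM_PAGE_SIZE + 1  # one byte past a page boundary
    for size in (aligned, unaligned):
        print(size, pages_floor(size), pages_ceil(size))
```

If TOTAL_MEMORY could ever be unaligned, the ceiling form would be the safer choice; with the alignment guarantee, both give the same integer.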

table_size = Settings.WASM_TABLE_SIZE
global_base = Settings.GLOBAL_BASE

js = open(js_file).read()
m = re.search(r"tempDoublePtr\s+=\s+(\d+)", js)
tempdouble_ptr = int(m.group(1))
m = re.search(r"DYNAMIC_BASE\s+=\s+(\d+)", js)
dynamic_base = int(m.group(1))
m = re.search(r"DYNAMICTOP_PTR\s+=\s+(\d+)", js)
dynamictop_ptr = int(m.group(1))
Collaborator

It seems strangely convoluted to read back the file we just generated.

Contributor Author

It would require more extensive changes to force the JS compiler to use a value for DYNAMICTOP_PTR that would be chosen by Python. Right now it's automatically allocated by makeStaticAlloc(). I thought the method of parsing the JS was acceptable since it was done in earlier incarnations of this function.
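
For readers following the thread, the JS-parsing approach can be sketched as below. This is an illustration rather than the code in this PR: `read_js_int` is a hypothetical helper, the sample JS string and its numbers are made up, and the `\b` word boundary and explicit error are additions that the diff does not have (the diff calls `m.group(1)` directly, which raises AttributeError if the pattern is missing).

```python
import re


def read_js_int(js, name):
    """Return the integer assigned to `name` in the JS text.

    Raises ValueError with a readable message if no assignment is found.
    """
    pattern = r"\b%s\s+=\s+(\d+)" % re.escape(name)
    m = re.search(pattern, js)
    if m is None:
        raise ValueError("could not find an assignment to %s" % name)
    return int(m.group(1))


# Made-up sample standing in for the generated JS file.
sample_js = (
    "var tempDoublePtr = 5264;\n"
    "var DYNAMIC_BASE = 5248144, DYNAMICTOP_PTR = 5232;\n"
)

if __name__ == "__main__":
    for key in ("tempDoublePtr", "DYNAMIC_BASE", "DYNAMICTOP_PTR"):
        print(key, read_js_int(sample_js, key))
```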


logger.debug('creating wasm emscripten metadata section with mem size %d, table size %d' % (mem_size, table_size,))
name = b'\x13emscripten_metadata' # section name, including prefixed size
contents = (
@@ -2994,13 +3006,22 @@ def add_emscripten_metadata(js_file, wasm_file):
WebAssembly.lebify(EMSCRIPTEN_ABI_MAJOR) +
WebAssembly.lebify(EMSCRIPTEN_ABI_MINOR) +

# static bump
WebAssembly.lebify(mem_size) +

# table size
-WebAssembly.lebify(table_size)
+WebAssembly.lebify(table_size) +

WebAssembly.lebify(global_base) +

WebAssembly.lebify(dynamic_base) +

WebAssembly.lebify(dynamictop_ptr) +

WebAssembly.lebify(tempdouble_ptr) +
Collaborator

Would it make more sense to export these things as globals from the wasm file?

If we are going to make this an ABI, can we document the exact meaning of each of these things?

Do all these make sense with both fastcomp and the llvm backend?

Contributor Author (rianhunter, Apr 30, 2019)

The reason they are in a different metadata section is that they must be used to properly parameterize the Emscripten runtime before the WASM can be successfully instantiated. (There is a circular dependence).

Medium term this will be documented exactly. I still don't consider it stable enough to document, which is partly the reason it isn't emitted automatically and requires "-s EMIT_EMSCRIPTEN_METADATA=1" to opt in. There is some initial documentation for developers here https://github.com/emscripten-core/emscripten/wiki/WebAssembly-Standalone#jswasm-abi

At least as far as emulating the JS runtime goes, tempDoublePtr, DYNAMICTOP_PTR, and DYNAMIC_BASE make sense with the LLVM backend. It wouldn't be possible for the host to implement sbrk() and copyTempDouble() without those parameters. global_base does not seem to be needed by LLVM-generated WASM files (in asmjs/binaryen it's imported as __memory_base). I haven't yet successfully run LLVM-generated WASM binaries, and to be honest I anticipate more changes once I do.

Collaborator

I'm OK with landing this now, as long as we can iterate on it, change it, and hopefully simplify it in the future. I wouldn't want to be locked into the ad-hoc conventions we currently have without a more formal spec.

Contributor Author

+1, Major version on this metadata section is still 0, so it's still explicitly not supported and not guaranteed to be in any specific format.
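
As a rough illustration of how a host might consume these values, the sketch below encodes and decodes a run of unsigned LEB128 integers. It is not Emscripten's `WebAssembly.lebify`/`delebify`; `leb_encode`, `leb_decode`, and `decode_fields` are hypothetical names. The field list covers only the six values visible in this hunk (the version fields that precede them are omitted), the numbers are placeholders, and since the metadata major version is 0 the layout may change.

```python
def leb_encode(value):
    """Encode a non-negative integer as unsigned LEB128 bytes."""
    if value < 0:
        raise ValueError("unsigned LEB128 cannot encode negative values")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)         # last byte, high bit clear
            return bytes(out)


def leb_decode(buf, offset):
    """Decode one unsigned LEB128 value; return (value, next_offset)."""
    buf = bytearray(buf)  # indexing yields ints on Python 2 and 3
    result = 0
    shift = 0
    while True:
        byte = buf[offset]
        offset += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, offset
        shift += 7


def decode_fields(payload, names):
    """Decode consecutive LEB128 values from payload into a dict."""
    fields = {}
    offset = 0
    for name in names:
        fields[name], offset = leb_decode(payload, offset)
    return fields


# Order as it appears in this hunk; an assumption, not a stable format.
FIELD_NAMES = ["mem_size", "table_size", "global_base",
               "dynamic_base", "dynamictop_ptr", "tempdouble_ptr"]

# Placeholder values for illustration only.
example_values = [256, 10, 1024, 5248144, 5232, 5264]
example_payload = b"".join(leb_encode(v) for v in example_values)

# The section name is itself length-prefixed: 0x13 == 19 == len(name).
section_name = leb_encode(len(b"emscripten_metadata")) + b"emscripten_metadata"

if __name__ == "__main__":
    print(decode_fields(example_payload, FIELD_NAMES))
```

Decoding by position like this is why the diff's note about bumping EMSCRIPTEN_METADATA_MINOR matters: a reader that knows only the first N fields can still parse them if new fields are appended at the end.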


# NB: more data can be appended here as long as you increase
# the EMSCRIPTEN_METADATA_MINOR

b''
Collaborator

Is this needed?

Contributor Author

Just for symmetry purposes with the preceding lines (each ending with +). Fine to take out at the small expense of potentially incurring unnecessary line diffs in future edits of this function.

Collaborator

Ah I see. How about using contents = b''.join(... , then you can use trailing commas on each line?

Contributor Author

I could do that, but then I'd have to change more lines of code :) (see this earlier comment on keeping code style consistent: #7815 (comment)). I have no problem doing it if you really think it's worth it.

Collaborator

OK, but if you are going to keep the multiline +, I would drop the extra b''.

Also, is there a reason why each of the lines above is separated by an empty line? This is odd, especially since it's a single expression.
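
The two styles discussed in this thread can be compared side by side in a small sketch. `encode` is a hypothetical stand-in for an encoder such as `WebAssembly.lebify` (here it packs each value into one byte for brevity), and the values are placeholders.

```python
def encode(value):
    """Stand-in encoder: pack a value in 0..255 into a single byte."""
    return bytes(bytearray([value]))


mem_size, table_size, global_base = 16, 10, 8  # placeholder values

# Style in the diff: a chain of +, where the last line has no trailing +
# (or needs a dummy b'' so every real line can end with one).
contents_plus = (
    encode(mem_size) +
    encode(table_size) +
    encode(global_base)
)

# Suggested style: b''.join over a list, so every line ends with a comma
# and appending a field touches only one line.
contents_join = b''.join([
    encode(mem_size),
    encode(table_size),
    encode(global_base),
])

if __name__ == "__main__":
    print(contents_plus == contents_join)
```

Both forms build the same bytes; the difference is only in how much of the diff changes when a field is appended.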

)

orig = open(wasm_file, 'rb').read()