Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Command line use (and testing) of taxsim.js file #1

Open
thomascwells opened this issue Jun 10, 2022 · 25 comments
Open

Command line use (and testing) of taxsim.js file #1

thomascwells opened this issue Jun 10, 2022 · 25 comments

Comments

@thomascwells
Copy link

Context: I'd like to use your compiled .wasm and .js files offline as part of an R package. [1]

Issue: I can't seem to figure out how to interact with the js or wasm you built via CLI.

I tried the test line from your make file, but that returned an error.

>echo "year,mstat\n2020,2" | node -e "require('./taxsim.js')()"
 TAXSIM: ERROR year,mstat\n2020 is not a taxsim35 input variable.
STOP 901

So I tried a different form of the input (this from the examples on NBER), but still no luck.

>echo "tax,mstat,year,ltcg,idtl\n1,2,1970,100000,5" | node -e "require('./taxsim.js')()"
 TAXSIM: ERROR tax,mstat,year,l is not a taxsim35 input variable.
STOP 901

If I switch to Object-style input, I get a different output (STOP 999), but not the results.

>echo "{taxsimid: 1, mstat: 2, year: 1970, ltcg: 100000, idtl: 5}" | node -e "require('./taxsim.js')()"
STOP 999

I've also tried using wasmer but I think it fails because of some of the emscripten compile settings. I don't think wasmer and other WebAssembly runtimes know how to use the .js.

$ wasmer taxsim.wasm
error: failed to run `taxsim.wasm`
╰─▶ 1: Emscripten requires at least one imported table

My ideal situation would be to use a JS or WASM interface already available in R, like the V8 port for R, but my attempts to make that work have hit dead ends as well.

> ctx <- V8::v8()
> ctx$source("taxsim.js")
> ctx$call('taxsim', '{taxsimid: 1, mstat: 2, year: 1970, ltcg: 100000, idtl: 5}')
Aborted(Assertion failed: shell environment detected but not enabled at build time.  Add 'shell' to `-s ENVIRONMENT` to enable.)
named list()

Do you have any suggestions?

I'm mostly an R programmer just trying to fuddle my way through this, so apologies if I've made some obvious beginner mistake.

[1] I'd like to add a feature to @shanejorr's package so calculations can be done without sending data to a remote server.

@tmm1
Copy link
Owner

tmm1 commented Jun 10, 2022

Hi! Glad to see someone playing with this.

Try echo -e "..." | ...

node should work, as should v8. Wasmer likely will not.


[tmm1@tmm1-nuc ~]$ echo "\n\n"
\n\n
[tmm1@tmm1-nuc ~]$ echo -e "\n\n"


@tmm1
Copy link
Owner

tmm1 commented Jun 10, 2022

Add 'shell' to -s ENVIRONMENT to enable.

This might be simple enough to add to the Dockerfile. I'll research the emscripten docs to understand the implications.

@thomascwells
Copy link
Author

Try echo -e "..." | ...

Ah ha! Yes, that works! The -e does seem to be important.

But also, I was trying to run my commands on Windows powershell (facepalm), but the newline is different there.

Powershell:
echo "year,mstat`n2020,2" | node -e "require('./taxsim.js')()"

Bash:
echo "year,mstat\n2020,2" | node -e "require('./taxsim.js')()"

Thanks for that!

Enabling the shell environment would help things a lot. Another (additional? complementary?) option might be standalone WASM for use with wasmer, as discussed here: https://v8.dev/blog/emscripten-standalone-wasm. However, I don't know if that would break integration with your taxsim.app implementation (which is quite nice, by the way!).

@tmm1
Copy link
Owner

tmm1 commented Jun 10, 2022

Thanks for the link.

If you want to run both on the Web and elsewhere, and you want 100% optimal code size and startup times, you should make two separate builds, one with -s STANDALONE and one without.

This isn't ideal but may be required for wasm runtimes.

Enabling the shell environment would help things a lot.

I poked at this and looks like we can pass a list into ENVIRONMENT:

https://github.com/emscripten-core/emscripten/blob/4881eaf6c23f590e5a8f8bd0d013c6c1cc11f42f/emcc.py#L288-L295

I'm not sure what the default is but it must be web,node since both of those work.

@tmm1
Copy link
Owner

tmm1 commented Jun 10, 2022

I made a change in 5530026

The generated js file didn't really change much. You can find it at https://taxsim.app/taxsim.js. Let me know how it works out.

@tmm1
Copy link
Owner

tmm1 commented Jun 10, 2022

I tried a quick STANDALONE_WASM=1 build and ran it with both wasmtime and wasmer. Unfortunately neither worked.

$ wasmer taxsim.wasm
error: failed to run `taxsim.wasm`
╰─▶ 1: Error while importing "wasi_snapshot_preview1"."args_sizes_get": unknown import. Expected Function(FunctionType { params: [I32, I32], results: [I32] })

$ wasmtime taxsim.wasm
Error: failed to run main module `taxsim.wasm`
Caused by:
    0: failed to instantiate "taxsim.wasm"
    1: unknown import: `env::__syscall_dup` has not been defined

My understanding is that WASI is still in early days so I've been waiting for the standard to finalize before digging in further.

There is some investigation happening here as well: StarGate01/Full-Stack-Fortran#7

Also we're behind a few versions of emscripten, which is blocked on investigation of this issue: StarGate01/Full-Stack-Fortran#10 (comment)

@tmm1
Copy link
Owner

tmm1 commented Jun 10, 2022

I did some testing with d8

Before 5530026, I see the same issue as you:

$ d8 -e "load('./taxsim.js'); taxsim()"
Aborted(Assertion failed: shell environment detected but not enabled at build time.  Add 'shell' to `-s ENVIRONMENT` to enable.)

After the change, it gets past that:

$ printf "year,mstat\n2020,2" | d8 -e "load('./taxsim.js'); taxsim()"
STOP 993

Somehow the input requirements are different.. it must not be receiving EOF correctly? If I add an extra newline, it returns the data atleast before erroring:

$ printf "year,mstat\n2020,2\n" | d8 -e "load('./taxsim.js'); taxsim()"

taxsimid,year,state,fiitax,siitax,fica,frate,srate,ficar,tfica
0.,2020,0,-3600.00,0.00,0.00,-7.65,0.00,15.30,0.00
STOP 993

@tmm1
Copy link
Owner

tmm1 commented Jun 10, 2022

I made some more progress in cdd1355

However I think it will take a bit more finagling to get it working inside R/v8.

https://github.com/emscripten-core/emscripten/blob/e1d97f55d12ed63dc30e547c818af9a63be0b976/src/shell.js#L317-L327

@tmm1
Copy link
Owner

tmm1 commented Jun 10, 2022

It works in R!

> download.file("https://taxsim.app/taxsim.wasm", "taxsim.wasm")
> wasm <- readBin("taxsim.wasm", raw(), file.info("taxsim.wasm")$size)

> library(V8)
Using V8 engine 9.6.180.12
> ctx <- v8()
> ctx$assign("wasmBinary", wasm)
> ctx$source("https://taxsim.app/taxsim.js")

> ctx$call("taxsim", JS("{year:2020, mstat:2}"), JS("{wasmBinary}"), await=TRUE)
[1] "taxsimid,year,state,fiitax,siitax,fica,frate,srate,ficar,tfica\r\n0.,2020,0,-3600.00,0.00,0.00,-7.65,0.00,15.30,0.00\r\n"

@feenberg
Copy link
Collaborator

feenberg commented Jun 10, 2022 via email

@feenberg
Copy link
Collaborator

feenberg commented Jun 10, 2022 via email

@feenberg
Copy link
Collaborator

feenberg commented Jun 10, 2022 via email

@tmm1
Copy link
Owner

tmm1 commented Jun 10, 2022

Thanks Dan!

The changes discussed above have all been incorporated into this repository. To use them, you can:

$ cd taxsim.js
$ git pull
$ cp /path/to/latest/taxsim.f .
$ make

This will create a new taxsim.{js,wasm} which is compatible with node and libv8, and can be used in R.


taxsim doesn't use any environment variables that I am aware of. What makes this useful?

This is a flag to the compiler which tells it to make the wasm/js compatible with libv8. It was incorporated in 5530026 so there's nothing you need to do.

two separate builds, one with -s STANDALONE and one without.
Where do I put that? Is the savings significant?

I tried this and it didn't work so there's no reason or benefit to this at the moment.

The new location
will be at taxsim.us, and I will make all of taxsim available there as
soon as I have it set up, including taxsim.wasm.

Thats great news!

If taxsim.f source is going to become public, we should host a copy here on GitHub as well, so it's easier to see how it changes over time. I'm happy to help with this once you get to that point.

I suppose if someone is running taxsim.wasm in the browser, they can be
confident taxsim.wasm doesn't do anything bad to there system. Does node
protect them?

Yes wasm in the browser is extremely sandboxed and very safe. It cannot access the filesystem, environment or really anything on your system.

node is less safe because it has no sandbox. Technically if the js was crafted maliciously it could access anything on your system, since node.js is a full programming language and runtime.

For users concerned about security, having the taxsim.f source available alongside the source in this repo would allow them to create their own copy of taxsim.wasm from scratch and audit every step of the process.

@tmm1
Copy link
Owner

tmm1 commented Jun 10, 2022

@thomascwells Let me know if the R example works in your environment. I only tested on my macOS machine.

If you end up working on adding support to https://github.com/shanejorr/usincometaxes, please cc me on any pull requests. I'd also like to add an entry to the README here once that happens, under projects using taxsim.js.

@thomascwells
Copy link
Author

thomascwells commented Jun 10, 2022

It works in R!

This looks great! Thanks for all the updates Aman! I'll give this a test run today.

Somehow the input requirements are different.. it must not be receiving EOF correctly? If I add an extra newline, it returns the data at least before erroring:

When I was trying to use the HTTP interface via curl as described here*, it would fail if the uploaded file contained an end-of-file new line. I had to create a file without a final blank line or I would get errors each time. Some working R code here, but Dan's recent change for skipping blank lines might solve that issue.

*Note: I think the URL in the curl example on NBER didn't work for me. It should be https://taxsim.nber.org/uptest/webfile.cgi, not http://wwwdev...


I was part way through crafting some responses to Dan's questions and comments, but Aman beat me to it with better and more thorough answers. Suffice to say, I agree with Aman's thoughts about sandboxing and the importance of being able to audit the compilation process from source. I'm also happy to help with GitHub publishing & maintenance if you go that way. Building some CI/CD workflows triggered on updates to the Fortran source would be ideal!

Dan, I know you had some concerns about possible derivative work if you were to broadly publish the source. Do you think those concerns could be addressed with appropriate licensing?

@thomascwells
Copy link
Author

Unfortunately, I could not reproduce your success in R. This is what I got:

> download.file("https://taxsim.app/taxsim.wasm", "taxsim.wasm")
trying URL 'https://taxsim.app/taxsim.wasm'
Content type 'application/wasm' length unknown
downloaded 900 KB

> wasm <- readBin("taxsim.wasm", raw(), file.info("taxsim.wasm")$size)
> library(V8)
> ctx <- v8()
> ctx$assign("wasmBinary", wasm)
> ctx$source("https://taxsim.app/taxsim.js")
> ctx$call("taxsim", JS("{year:2020, mstat:2}"), JS("{wasmBinary}"), await=TRUE)
failed to asynchronously prepare wasm: CompileError: WasmCompile: Wasm decoding failed: unknown section code #0x10 @+1443
Aborted(CompileError: WasmCompile: Wasm decoding failed: unknown section code #0x10 @+1443)
Error in context_eval(join(src), private$context, serialize, await) : 
  RuntimeError: Aborted(CompileError: WasmCompile: Wasm decoding failed: unknown section code #0x10 @+1443). Build with -s ASSERTIONS=1 for more info.

I can continue trying some things, including rebuilding the WASM using the emscripten stack with the copy of the fortran source (Thanks Dan!)

However, it's worth noting that base-R has capabilities for compiling Fortran and running those modules directly, so maybe we're trying to solve the wrong problem.

Should we fork the R-related components of this conversation to a different issue? Or possibly over to the R package (usincometaxes) repo?

@tmm1
Copy link
Owner

tmm1 commented Jun 10, 2022

CompileError: WasmCompile: Wasm decoding failed: unknown section code #0x10 @+1443

My hunch is this might be related to the version of libv8? What did library(V8) output? Mine was:

library(V8)
Using V8 engine 9.6.180.12

Actually, you said you're on windows so there might be some line encoding issue going on. See emscripten-core/emscripten#7987 (comment)

There may be some other variant of download.file that ensures binary mode?

On https://stat.ethz.ch/R-manual/R-devel/library/utils/html/download.file.html it says:

The choice of binary transfer (mode = "wb" or "ab") is important on Windows, since unlike Unix-alikes it does distinguish between text and binary files and for text transfers changes \n line endings to \r\n (aka ‘CRLF’).

So I think the fix is simply:

> download.file("https://taxsim.app/taxsim.wasm", "taxsim.wasm", mode="wb")

@tmm1
Copy link
Owner

tmm1 commented Jun 10, 2022

However, it's worth noting that base-R has capabilities for compiling Fortran and running those modules directly, so maybe we're trying to solve the wrong problem.

I believe Dan has also done some similar work in the past, to build a stata/ado plugin using the fortran code directly.

You're welcome to pursue this, but I imagine it would require changes to the fortran code itself to make it behave correctly in a plugin environment. Then you would also need to compile the plugin for different operating systems separately.

Personally I prefer the wasm approach, since it can treat the fortran source as a complete black box and results in a cross-platform artifact.

@thomascwells
Copy link
Author

Yes, it appears we are using different versions of libv8:

> library(V8)
Using V8 engine 6.2.414.50

... but, the download.file( ... mode="wb" fix worked like a charm!

> download.file("https://taxsim.app/taxsim.wasm", "taxsim.wasm", mode="wb")
trying URL 'https://taxsim.app/taxsim.wasm'
Content type 'application/wasm' length unknown
downloaded 900 KB

> wasm <- readBin("taxsim.wasm", raw(), file.info("taxsim.wasm")$size)
> library(V8)
Using V8 engine 6.2.414.50
> ctx <- v8()
> ctx$assign("wasmBinary", wasm)
> ctx$source("https://taxsim.app/taxsim.js")
> ctx$call("taxsim", JS("{year:2020, mstat:2}"), JS("{wasmBinary}"), await=TRUE)
[1] "taxsimid,year,state,fiitax,siitax,fica,frate,srate,ficar,tfica\r\n0.,2020,0,-3600.00,0.00,0.00,-7.65,0.00,15.30,0.00\r\n"

I'll do some more testing with this. If it works well, I'll try to get this wasm+v8 method added to the R package.

@feenberg
Copy link
Collaborator

feenberg commented Jun 10, 2022 via email

@thomascwells
Copy link
Author

I'll give you a call later today about the current state of and my goals for the R functions!

@feenberg
Copy link
Collaborator

feenberg commented Jun 10, 2022 via email

@thomascwells
Copy link
Author

I'll give you a call right now. We can discuss the git commands and R.

@thomascwells
Copy link
Author

Just a note in case someone else comes across this thread: most of the discussion related to command line use of TAXSIM and interfacing with R has shifted to an internal NBER Gitlab repo. I expect whatever comes of that discussion will eventually work its way back to this repo, but for now I guess this "issue" is just on-hold.

@feenberg
Copy link
Collaborator

feenberg commented Oct 11, 2022 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants