ValueError: invalid literal for int() with base 10 when trying to dump model #18

Closed
travisbrady opened this issue Apr 12, 2018 · 11 comments

Comments

@travisbrady

I'm encountering the following error when trying to dump a model using Python 3.6.5 running on Mac OS X.

Error:

python dump_treelite.py
[main] treelite
[LightGBM] [Info] Finished loading 250 models
[11:49:15] /Users/travis/build/hcho3/treelite-wheels/treelite/src/frontend/lightgbm.cc:104: Warning: input file was not terminated with end-of-line character.
[11:49:15] /Users/travis/build/hcho3/treelite-wheels/treelite/src/compiler/ast_native.cc:22: Using ASTNativeCompiler
[11:49:15] /Users/travis/build/hcho3/treelite-wheels/treelite/src/compiler/ast/split.cc:15: Parallel compilation enabled; member trees will be divided into 4 translation units.
[11:49:15] /Users/travis/build/hcho3/treelite-wheels/treelite/src/c_api/c_api.cc:297: Code generation finished. Writing code to files...
[11:49:15] /Users/travis/build/hcho3/treelite-wheels/treelite/src/c_api/c_api.cc:314: Writing file recipe.json...
[11:49:15] /Users/travis/build/hcho3/treelite-wheels/treelite/src/c_api/c_api.cc:314: Writing file main.c...
[11:49:15] /Users/travis/build/hcho3/treelite-wheels/treelite/src/c_api/c_api.cc:314: Writing file tu1.c...
[11:49:15] /Users/travis/build/hcho3/treelite-wheels/treelite/src/c_api/c_api.cc:314: Writing file tu3.c...
[11:49:15] /Users/travis/build/hcho3/treelite-wheels/treelite/src/c_api/c_api.cc:314: Writing file header.h...
[11:49:15] /Users/travis/build/hcho3/treelite-wheels/treelite/src/c_api/c_api.cc:314: Writing file tu0.c...
[11:49:15] /Users/travis/build/hcho3/treelite-wheels/treelite/src/c_api/c_api.cc:314: Writing file tu2.c...
[11:49:15] /Users/tbrady/.pyenv/versions/3.6.5/envs/venv365/lib/python3.6/site-packages/treelite/contrib/__init__.py:208: WARNING: some of the source files are long. Expect long compilation time. You may want to adjust the parameter parallel_comp.

[11:49:15] /Users/tbrady/.pyenv/versions/3.6.5/envs/venv365/lib/python3.6/site-packages/treelite/contrib/util.py:96: Compiling sources files in directory /var/folders/n1/fsscqfrd5fg8mybxkv2hl5t0m5v75k/T/tmp4n9b4c9t into object files (*.o)...
Traceback (most recent call last):
  File "dump_treelite.py", line 21, in <module>
    main()
  File "dump_treelite.py", line 16, in main
    params={'parallel_comp':4}, nthread=8, verbose=True)
  File "/Users/tbrady/.pyenv/versions/3.6.5/envs/venv365/lib/python3.6/site-packages/treelite/frontend.py", line 99, in export_lib
    verbose, options)
  File "/Users/tbrady/.pyenv/versions/3.6.5/envs/venv365/lib/python3.6/site-packages/treelite/contrib/__init__.py", line 221, in create_shared
    _create_shared(dirpath, toolchain, recipe, nthread, options, verbose)
  File "/Users/tbrady/.pyenv/versions/3.6.5/envs/venv365/lib/python3.6/site-packages/treelite/contrib/gcc.py", line 61, in _create_shared
    return _create_shared_base(dirpath, recipe, nthread, verbose)
  File "/Users/tbrady/.pyenv/versions/3.6.5/envs/venv365/lib/python3.6/site-packages/treelite/contrib/util.py", line 115, in _create_shared_base
    result.append(_wait(proc[tid], workqueue[tid]))
  File "/Users/tbrady/.pyenv/versions/3.6.5/envs/venv365/lib/python3.6/site-packages/treelite/contrib/util.py", line 82, in _wait
    retcode = [int(line) for line in f]
  File "/Users/tbrady/.pyenv/versions/3.6.5/envs/venv365/lib/python3.6/site-packages/treelite/contrib/util.py", line 82, in <listcomp>
    retcode = [int(line) for line in f]
ValueError: invalid literal for int() with base 10: 'clang -c -O3 -o tu2.o tu2.c -fPIC -std=c99 \n'

Script to dump model:

from sklearn.externals import joblib
import treelite

def main():
    out_fn = 'models/brr_v1.1.0a6_tl.dylib'
    lgb_txt_fn = 'models/brr_v1.1.0a6.lgb.txt'
    print('[main] treelite')
    ranker = joblib.load('models/brr_v1.1.0a6.pkl.gz')
    ranker.booster_.save_model(lgb_txt_fn)
    tl = treelite.Model.load(lgb_txt_fn, model_format='lightgbm')
    tl.export_lib('clang', libpath=out_fn,
                             params={'parallel_comp':4}, nthread=8, verbose=True)
    print('wrote {}'.format(out_fn))


if __name__ == '__main__':
    main()

OS Details:

$ uname -a
Darwin HA002727 16.7.0 Darwin Kernel Version 16.7.0: Thu Jun 15 17:36:27 PDT 2017; root:xnu-3789.70.16~2/RELEASE_X86_64 x86_64

Python details:

$ python
Python 3.6.5 (default, Apr 12 2018, 10:53:09)
[GCC 4.2.1 Compatible Apple LLVM 8.0.0 (clang-800.0.38)] on darwin

Compiler:

$ clang --version
Apple LLVM version 8.0.0 (clang-800.0.38)
Target: x86_64-apple-darwin16.7.0
Thread model: posix
@hcho3
Collaborator

hcho3 commented Apr 13, 2018

@travisbrady Thanks for reporting the bug. Would you be able to run the following diagnostic script?

lgb_txt_fn = 'models/brr_v1.1.0a6.lgb.txt'
tl = treelite.Model.load(lgb_txt_fn, model_format='lightgbm')
tl.compile(dirpath='./model_dump', params={}, verbose=True)
treelite.create_shared(toolchain='clang', dirpath='./model_dump', verbose=True)

This should create a directory named model_dump. Please locate the files named retcode_cpu0.txt, retcode_cpu1.txt, etc. in that directory and post their contents here. They should help us diagnose the crash. We appreciate your help!

@travisbrady
Author

Thanks, @hcho3 .

I get the same error (ValueError: invalid literal for int() with base 10: 'clang -c -O3 -o main.o main.c -fPIC -std=c99 \n') when the diagnostic script hits the treelite.create_shared line.

But the tl.compile line did produce a bunch of retcode files and retcode_cpu0.txt contains:

clang -c -O3 -o main.o main.c -fPIC -std=c99
echo $? >> retcode_cpu0.txt

All of the other retcode files are empty.
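
The traceback above shows `_wait` calling `int()` on every line of a retcode file, so any non-numeric line (here, the compile command itself) raises the `ValueError`. A minimal sketch of a more defensive reader follows. `read_retcodes` is a hypothetical helper written for illustration, not part of treelite, and it assumes that each well-formed line holds a single integer exit status.

```python
def read_retcodes(path):
    """Parse a retcode file, expecting one integer exit status per line.

    Hypothetical helper for illustration; not part of treelite.
    """
    retcodes = []
    with open(path) as f:
        for lineno, line in enumerate(f, start=1):
            text = line.strip()
            if not text:
                continue  # ignore blank lines
            try:
                retcodes.append(int(text))
            except ValueError:
                # Report the offending line instead of a bare int() failure.
                raise ValueError(
                    "{}:{}: expected an exit status, got {!r}".format(
                        path, lineno, text)
                ) from None
    return retcodes
```

With the retcode_cpu0.txt contents posted above, this would fail on line 1 and name the stray compile command, which points more directly at a problem in how the file was written.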

@hcho3
Collaborator

hcho3 commented Apr 14, 2018

I will try to reproduce the error on my end. Thanks!

@hcho3
Collaborator

hcho3 commented Apr 16, 2018

So far I have been unable to reproduce the issue on my MacBook Pro. Can you run

echo $SHELL

to see which shell you are using?

Also, do you see the same issue arising on another machine (e.g. Linux)?

@hcho3
Collaborator

hcho3 commented Apr 19, 2018

Another question: are you using the package from PyPI (pip install) or compiling from the source?

@AndrewHannigan

AndrewHannigan commented Apr 19, 2018

Just got the same error, glad you guys are already on this :)

OS:

» uname -a
Darwin Andrews-MacBook-Pro.local 17.5.0 Darwin Kernel Version 17.5.0: Mon Mar  5 22:24:32 PST 2018; root:xnu-4570.51.1~1/RELEASE_X86_64 x86_64

Python:

» python3
Python 3.6.5 (default, Mar 30 2018, 06:41:53) 
[GCC 4.2.1 Compatible Apple LLVM 9.0.0 (clang-900.0.39.2)] on darwin

Compiler:

treelite-demo/model_dump » gcc --version
Configured with: --prefix=/Library/Developer/CommandLineTools/usr --with-gxx-include-dir=/usr/include/c++/4.2.1
Apple LLVM version 9.1.0 (clang-902.0.39.1)
Target: x86_64-apple-darwin17.5.0
Thread model: posix
InstalledDir: /Library/Developer/CommandLineTools/usr/bin

@hcho3 That diagnostic script also produced the same error for me. All of the resulting retcode files are empty except retcode_cpu0.txt contains:

gcc -c -O3 -o main.o main.c -fPIC -std=c99 
echo $? >> retcode_cpu0.txt

main.c and header.h look like they were created successfully, contents of main.c look good.

I'm using zsh:

> echo $SHELL
/bin/zsh

And I did install using pip.

@AndrewHannigan

Also, oddly, if I try to compile main.c myself using the command from retcode_cpu0.txt:

gcc -c -O3 -o main.o main.c -fPIC -std=c99

It compiles just fine.

@hcho3
Collaborator

hcho3 commented Apr 19, 2018

This is interesting: I was able to reproduce the same error by setting SHELL to zsh. More precisely, I launched an interactive Python session with

SHELL=/usr/local/bin/zsh ipython3

and ran the diagnostic script.

I will have to modify the logic in treelite.create_shared to make it work with a variety of shells. I'll let you know when the fix is available.

@AndrewHannigan

AndrewHannigan commented Apr 19, 2018

@hcho3 Ah interesting. Yes, when I switch to bash it works fine. Well, that works as a quick fix for now. Will stay tuned for zsh patch, thanks!
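
For anyone who wants the bash workaround without changing their login shell, here is a rough sketch that overrides the `SHELL` environment variable only around the treelite call. It assumes treelite picks up `SHELL` from the environment at call time, which the reproduction with `SHELL=/usr/local/bin/zsh ipython3` suggests but which has not been checked against the source. `temporary_shell` is a hypothetical helper name, and `/bin/bash` is an assumed path.

```python
import os
from contextlib import contextmanager


@contextmanager
def temporary_shell(shell_path="/bin/bash"):
    """Temporarily set the SHELL environment variable, then restore it.

    Hypothetical helper for illustration; not part of treelite.
    """
    previous = os.environ.get("SHELL")
    os.environ["SHELL"] = shell_path
    try:
        yield
    finally:
        # Restore the caller's environment exactly as it was.
        if previous is None:
            os.environ.pop("SHELL", None)
        else:
            os.environ["SHELL"] = previous


# Intended usage (assumes `tl` is the treelite model loaded earlier):
#
#     with temporary_shell():
#         tl.export_lib('clang', libpath=out_fn, verbose=True)
```

Launching Python as `SHELL=/bin/bash python dump_treelite.py` should have the same effect for a one-off run.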

hcho3 added a commit that referenced this issue Apr 19, 2018
Addresses issue #18.

Diagnosis: The command `> log.txt` is intended to create an empty log
named `log.txt`. This command behaves differently for some shells
such as `zsh`. In `zsh`, the command is interpreted as `cat > log.txt`.
(See https://stackoverflow.com/a/15546937 for more details.)

Fix: Use `: > log.txt` to reliably create an empty log.
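
The fix described in this commit message can be tried in isolation with the sketch below. It runs `: > path` through a shell from Python; `:` is the POSIX null command, so the redirection has an explicit command to attach to and the shell never has to pick a default one. `create_empty_log` is a hypothetical name used for illustration, and this is not the code from the commit.

```python
import os
import subprocess


def create_empty_log(path, shell="/bin/sh"):
    """Create or truncate `path` by running `: > path` in the given shell.

    Hypothetical helper for illustration; not taken from treelite.
    Returns the resulting file size in bytes.
    """
    # The path is passed as "$1" so that it needs no shell quoting;
    # the argument after the script becomes "$0".
    subprocess.run([shell, "-c", ': > "$1"', shell, path], check=True)
    return os.path.getsize(path)
```

Passing `shell="/bin/zsh"` (if zsh is installed at that path) is one way to check that the result is the same under zsh.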
@hcho3
Collaborator

hcho3 commented Apr 19, 2018

@AndrewHannigan @travisbrady The latest commit should fix the issue for zsh. Let me know if the issue arises again.

hcho3 closed this as completed Apr 19, 2018
@travisbrady
Author

Ahh ok, I am indeed a zsh user (on OS X). I'll give this change a go.
Thanks, @hcho3
