Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] nvtabular could fail if ‘PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION’ is set to ‘python’ before importing nvtabular #1546

Closed
vonodiripsa opened this issue May 14, 2022 · 2 comments
Assignees
Labels
bug Something isn't working P0
Milestone

Comments

@vonodiripsa
Copy link

nvtabular could fail if ‘PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION’ is set to ‘python’ before importing nvtabular

The Microsoft engineer investigated another import error involving nvtabular and found that it is triggered if this environment variable PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION is set to ‘python’ prior to the import.
The reason is they found the exception is caused in a file generated by protobuf, and can only be reached when the variable ‘PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION’ set to ‘python’.
This is interesting, because in several places during init in various modules, the env var is set to precisely this value in nvtabular code. If the environment variable is not set to this value prior to import, things work fine. Normally, this wouldn’t be an issue since that environment variable wouldn’t typically be set, however, in this setting, we have horovod involved, which in our code has one step at the beginning that propagates all environment variable settings from the driver to the workers. So if a user happens to import nvtabular in the driver or notebook (without error), the environment variable is set there by nvtabular and is then propagated to the workers by horovod, where subsequent nvtabular imports hit the error as above. There are several work arounds, including not importing nvtabular at all on the driver, as it is not actually needed there, but still would be nice to not have this level of fragileness

Steps/Code to reproduce bug

Set an env variable prior to ‘import nvtabular’ like below:

os.environ['PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION'] = 'python'
import nvtabular

if you set it to:
os.environ['PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION'] = 'cpp'

it should work

Expected behavior
It should depend only on documented environment

Environment details (please complete the following information):

  • Azure NV-WWFO subscription, nvtabular-synapse-demo Synapse workspace, cruise/Criteo-Training notebook
  • Method of NVTabular install: conda and source installation. Part of the installation was by MSFT

Additional context
No additional context about the problem here.

@vonodiripsa vonodiripsa added the bug Something isn't working label May 14, 2022
@viswa-nvidia viswa-nvidia changed the title [BUG] [BUG] nvtabular could fail if ‘PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION’ is set to ‘python’ before importing nvtabular May 16, 2022
@viswa-nvidia viswa-nvidia added this to the Merlin 22.06 milestone Jun 9, 2022
@viswa-nvidia
Copy link

@benfred , please update the status

@benfred
Copy link
Member

benfred commented Jun 13, 2022

We've changed the way NVTabular bundles the triton proto file - so that we longer no require the PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python environment variable to be set in #1566 - and removed the code that set PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION. This shouldn't be an issue with the latest code, or the next release happening later this week.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working P0
Projects
None yet
Development

No branches or pull requests

3 participants