Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tensorflow dependency #27

Closed
JackTemaki opened this issue Sep 29, 2021 · 13 comments
Closed

Tensorflow dependency #27

JackTemaki opened this issue Sep 29, 2021 · 13 comments

Comments

@JackTemaki
Copy link
Contributor

The code uses from tensorflow.python.util import nest. Is there a possibility to include this code standalone into the repository without linking on tensorflow? I think this is an extreme dependency, and causes the following two issues for me:

  • The loading time of the sisyphus manager is increased by 4 seconds just with the tensorflow import, which is not nice for a tool you start often and interactively.
  • I now have a numpy version conflict. I usually keep the RETURNN runtime environment completely separated from anything in Sisyphus, but have one library (for G2P) that needs numpy > 1.20 but Tensorflow needs < 1.20. Of course I could created yet another environment and call this externally, or maybe Tensorflow 2.6 (which would be fine for just loading nest) does not have this dependency issue anymore (2.5 still has this).

Nevertheless, the first issue is already big enough for me to search for a solution.

@albertz
Copy link
Member

albertz commented Sep 29, 2021

What TF version is that? I'm pretty sure you can use a newer Numpy version with not totally outdated TF versions.

@albertz
Copy link
Member

albertz commented Sep 29, 2021

The TensorFlow slow import time is reported here: tensorflow/tensorflow#37729

So this is known. I'm not sure if there are maybe potential workarounds.

@albertz
Copy link
Member

albertz commented Sep 29, 2021

nest in TF is basically exactly what we need for certain things, though. So the alternatives are:

  • Come up with an own simple implementation of nest. Bad for all the common reasons. It will be worse, it will be less well tested, it increases our code size, etc.
  • Use some other external dependencies. Bad because this introduces another external dependency, but otherwise maybe ok.

Do you know about some (well-used, well-known) alternative lib which provides what nest does?
From a quick search, I found iteration-utilities (GitHub 60 stars), more-itertools (GitHub >2k stars). Not sure if there are others. Also not sure if they provide the same functionality as nest does.
From a quick glance, it doesn't look like this would really work for us. So even if we use e.g. more_itertools, we still would need some own custom code to replicate the functionality of nest.

We could also create our own external library. Maybe just copy the TF nest code and export it as a standalone library. Put it to rwth-i6/nest, create a pip package for it. Although the name nest is already taken. nested as well.

@albertz
Copy link
Member

albertz commented Sep 29, 2021

Note that we probably will also have other returnn imports anyway, e.g. Data and DimensionTag, which also can lead to TF imports. (Or it would be very ugly to work around that, basically replication those classes, or other things.)

@albertz
Copy link
Member

albertz commented Sep 29, 2021

Also, when we follow some of the suggestions here, we might also import stuff from tensorflow.python.autograph.pyct. (Again similar discussion, maybe there are other external libs which can replace pyct, but I don't know.)

@JackTemaki
Copy link
Contributor Author

I'm pretty sure you can use a newer Numpy version with not totally outdated TF versions.

No, unfortunately even the latest nightly builds depend on older numpy... But I guess it can be okay to still use a newer numpy version as long it is only the "nest" functions that are imported. I will test this.

We could also create our own external library. Maybe just copy the TF nest code and export it as a standalone library. Put it to rwth-i6/nest, create a pip package for it.

I will check if this is possible, the linked thread about import speed was not helpful so far.

@albertz
Copy link
Member

albertz commented Sep 29, 2021

I'm pretty sure TF will work anyway with a newer Numpy version. I don't really see why not. I have done that quite commonly.

I'm also almost sure that G2P or whatever other library also probably works with Numpy 1.19 or so. It's maybe just not tested.

@JackTemaki
Copy link
Contributor Author

I'm also almost sure that G2P or whatever other library also probably works with Numpy 1.19

This definitely did crash, otherwise I would not have noticed.

@albertz
Copy link
Member

albertz commented Sep 29, 2021

I'm also almost sure that G2P or whatever other library also probably works with Numpy 1.19

This definitely did crash, otherwise I would not have noticed.

Python exception, or segfault? In the latter case, I assume G2P compiles some native code, and must also be compiled with the right Numpy version, so you cannot use a different Numpy than the used during compilation.

@JackTemaki
Copy link
Contributor Author

Python exception, or segfault? In the latter case, I assume G2P compiles some native code, and must also be compiled with the right Numpy version, so you cannot use a different Numpy than the used during compilation.

Segfault, but it does not matter here. I will just use the newest numpy version. I will never use the Tensorflow of that env for any other purpose than for loading returnn_common anyways, and as long that works it is fine.

For the startup time, I guess I will just live with it for now. Moving nest out of TF is not simple, as it uses natively compiled code, and maybe you want to use other TF stuff. So I will close this issue.

@albertz
Copy link
Member

albertz commented Jan 5, 2022

I found another alternative to nest, namely tree (from DeepMind): GitHub, docs, pip install dm-tree

@albertz
Copy link
Member

albertz commented Jan 5, 2022

However, we have the TF dependency in several other places as well now. E.g. #47. It's not really easily possible to remove it.

@albertz
Copy link
Member

albertz commented Feb 15, 2024

Please see rwth-i6/returnn#1314 for a more recent discussion with more alternatives.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants