
[TF 2.0] allow tf.function input_signature to be specified by annotations #31579

Closed
jeffpollock9 opened this issue Aug 13, 2019 · 21 comments
Labels
comp:autograph · stale · stat:contribution welcome · type:feature

Comments

@jeffpollock9

System information

  • TensorFlow version (you are using): 2.0.0-rc0
  • Are you willing to contribute it (Yes/No): Yes

Describe the feature and the current behavior/state.

tf.function has an argument input_signature which I have been using to make my code a bit safer and to ensure I don't keep re-tracing functions. The input_signature specifies the tensor type of each function argument. It would be much nicer (I think) to specify these types using Python (>= 3.5) annotations, where a suitable version of Python is available. A very rough example looks like:

import tensorflow as tf


def function(fn):
    input_signature = list(fn.__annotations__.values())
    return tf.function(fn, autograph=False, input_signature=input_signature)


@function
def foo(
    x: tf.TensorSpec(shape=[None], dtype=tf.float64),
    y: tf.TensorSpec(shape=[None], dtype=tf.float64),
):
    return x + 10.0 + y


vec32 = tf.random.normal([2], dtype=tf.float32)
vec64 = tf.random.normal([2], dtype=tf.float64)


# should pass
foo(vec64, vec64)
foo(y=vec64, x=vec64)

# should fail
foo(vec32, vec64)

Which I think is nicer than the current signature:

@tf.function(
    autograph=False,
    input_signature=[
        tf.TensorSpec(shape=[None], dtype=tf.float64),
        tf.TensorSpec(shape=[None], dtype=tf.float64),
    ],
)
def foo(x, y):
    return x + 10.0 + y

I think the main benefit of the annotation approach is that the argument name and type are beside each other, and this syntax is already widely used in python.

In order to enable using annotations as the input_signature I think there should be an extra boolean argument to tf.function called e.g. use_annotation_input_signature which defaults to False.
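For illustration, the annotation-collecting part of such a flag could be a small helper. This is a hypothetical, TensorFlow-free sketch (the helper name is made up, not a proposed API); it walks parameters in declaration order instead of relying on `fn.__annotations__.values()`, so unannotated parameters are caught:

```python
import inspect

# Hypothetical helper sketching how use_annotation_input_signature could
# collect the signature; the name and behaviour are illustrative only.
def annotation_input_signature(fn):
    signature = []
    for name, param in inspect.signature(fn).parameters.items():
        if param.annotation is inspect.Parameter.empty:
            raise TypeError(f"parameter {name!r} has no annotation")
        signature.append(param.annotation)
    return signature
```

The result could then be passed straight to tf.function as input_signature.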

Also note I have set autograph=False here to avoid a warning:

Cause: name 'foo_scope' is not defined

I am guessing a proper implementation inside of tf.function would not have this problem.

Will this change the current api? How?

It would add an additional argument to tf.function which at the default value would not change anything.

Who will benefit with this feature?

Anyone using python >= 3.5 who would like to specify the tensor types of their functions.

Any Other info.

None

@bionicles

bionicles commented Aug 13, 2019

the concept of "None" == "Anything" is dumb and reflects poorly on Tensorflow and Keras team + users

@oanush oanush self-assigned this Aug 14, 2019
@oanush oanush added TF 2.0 Issues relating to TensorFlow 2.0 comp:autograph Autograph related issues type:support Support issues labels Aug 14, 2019
@oanush oanush assigned ymodak and unassigned oanush Aug 14, 2019
@ymodak ymodak added type:feature Feature requests and removed type:support Support issues labels Aug 14, 2019
@ymodak ymodak assigned mdanatg and unassigned ymodak Aug 14, 2019
@kkimdev
Contributor

kkimdev commented Aug 15, 2019

Hi, I like the general idea of leveraging more standard Python features. However, one property we want to keep for @tf.function is that it preserves the same semantics with or without @tf.function, and since we're not passing a tf.TensorSpec instance as an argument, annotating the arguments as tf.TensorSpec wouldn't be strictly correct.

@jeffpollock9
Author

Hi @kkimdev. Yes I realised that TensorSpec wasn't the right type shortly after writing this and agree it is weird/confusing to annotate the arguments with the wrong type. I can't think of a solution which would work for this now, since the "spec" (e.g. can choose any shape based on runtime arguments) is dynamic and the annotations are really meant to be static so that mypy etc can do static analysis.

Will close this later unless anyone has any ideas on how it could work properly.

@kkimdev
Contributor

kkimdev commented Aug 29, 2019

Closing as it has been 2 weeks.

@kkimdev kkimdev closed this as completed Aug 29, 2019
@danmou

danmou commented Dec 9, 2019

Why not make it possible to use tf.Tensor[shape, dtype] as an annotation? This would also be useful in general to document the input requirements.

@mdanatg

mdanatg commented Dec 9, 2019

Yes, that would be the ideal format. I think you can use tf.Tensor (untemplated) right now in your regular type annotations, but implementing it as a generic type that captures both dtype and shape would take a bit more programming in the definition of the Tensor classes.
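A plain-Python sketch of what "a generic Tensor class" means here, with a stand-in class rather than the real tf.Tensor (names are illustrative): subclassing typing.Generic is what makes Tensor[...] legal to write, and the resulting alias records its parameters.

```python
from typing import Generic, TypeVar

Shape = TypeVar("Shape")
DType = TypeVar("DType")

class ExistingTensor:
    pass  # stand-in for a concrete tensor class such as tf.Tensor

# Mixing in Generic makes the class subscriptable: TemplatedTensor[a, b]
# produces an alias whose __args__ capture the shape/dtype parameters.
class TemplatedTensor(ExistingTensor, Generic[Shape, DType]):
    pass

alias = TemplatedTensor[tuple, float]
```

A decorator could later read `alias.__args__` to build a TensorSpec.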

@mdanatg mdanatg added the stat:contribution welcome Status - Contributions welcome label Dec 9, 2019
@jeffpollock9
Author

jeffpollock9 commented Dec 9, 2019

@danmou yes that's a good idea - not sure why I didn't do that in the first place. I'll re-open this and try to have a go at implementing it but if anyone has any ideas/would like to help that'd be great.

EDIT: I can't re-open this, @kkimdev if you think this would be useful at all can you re-open please?

@jeffpollock9
Author

I've tried to knock up some code to figure out how this might work, but I'm finding it quite hard (mainly due to my lack of experience with typing). Any feedback on this would be great:

import tensorflow as tf

from typing import Generic, TypeVar
from typing_extensions import Literal


ShapeType = TypeVar("ShapeType")
DataType = TypeVar("DataType")

Shape = Literal


class Float32:
    dtype = tf.float32


class Float64:
    dtype = tf.float64


# TODO(jeff): generate all dtypes


class Tensor(Generic[ShapeType, DataType]):
    @classmethod
    def shape(cls):
        return cls.__args__[0].__values__

    @classmethod
    def dtype(cls):
        return cls.__args__[1].dtype

    def __add__(self, other):
        pass  # stub to appease mypy; the real Tensor implements this properly


def function(fn):
    annotation_values = fn.__annotations__.values()
    tensor_specs = [tf.TensorSpec(x.shape(), x.dtype()) for x in annotation_values]
    return tf.function(fn, input_signature=tensor_specs)


@function
def foo(x: Tensor[Shape[None, 2, 3], Float64]):
    return x + 42.0


foo(tf.random.normal([1, 2, 3], dtype=tf.float64))  # OK
foo(tf.random.normal([2, 2, 3], dtype=tf.float64))  # OK
foo(tf.random.normal([1, 2, 3, 4], dtype=tf.float64))  # NOT OK
foo(tf.random.normal([1, 2, 3], dtype=tf.float32))  # NOT OK

which seems to pass the correct input_signature in this very simple example but has some mypy errors that I don't know how to deal with:

$ mypy types_test.py
types_test.py:1: error: No library stub file for module 'tensorflow'
types_test.py:1: note: (Stub files are from https://github.com/python/typeshed)
types_test.py:44: error: Variable "types_test.Shape" is not valid as a type
types_test.py:44: error: Invalid type: try using Literal[2] instead?
types_test.py:44: error: Invalid type: try using Literal[3] instead?
Found 4 errors in 1 file (checked 1 source file)

@danmou

danmou commented Dec 16, 2019

@jeffpollock9 you'll probably need to redefine __class_getitem__ for the Tensor class. (That's the method that's invoked when you write Tensor[...].)

@danmou

danmou commented Dec 16, 2019

Also, I don't think defining Shape as a Literal will work (undefined dimensions wouldn't be possible). I don't think it's possible to make mypy handle shapes correctly anyway without extending mypy itself, so the easiest solution for now would be to simply make __class_getitem__ for Tensor return a Tensor with some attributes set for shape and dtype (which will be ignored by mypy but can be used by tf.function). Possibly mypy could handle dtypes correctly, but I'd say that's not so important anyway until someone starts type-annotating the entire TF library.
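That suggestion can be illustrated in plain Python without TensorFlow; here Tensor and its attributes are stand-ins, not real TF types (requires Python 3.7+ for __class_getitem__):

```python
# Minimal stand-in: __class_getitem__ decides what Tensor[...] evaluates to,
# so it can return an object carrying shape/dtype for tf.function to read.
class Tensor:
    def __init__(self, shape=None, dtype=None):
        self.shape = shape
        self.dtype = dtype

    def __class_getitem__(cls, item):
        # Tensor[a, b] arrives here as the tuple (a, b).
        shape, dtype = item
        return cls(shape=shape, dtype=dtype)


spec = Tensor[(None, 2, 3), "float64"]
```

mypy ignores the runtime return value, but a decorator can inspect spec.shape and spec.dtype.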

@jeffpollock9
Author

Many thanks for the comments, @danmou! As far as I can tell __class_getitem__ was added in Python 3.7, so I have switched to that (I was on 3.6 before).

I'm trying to figure out a better way of making some sort of Shape type since you mentioned Literal is not a good idea - but I am not sure how. If you don't mind - do you have any ideas? This is where I have got to so far:

import tensorflow as tf

from typing import Generic, TypeVar, get_type_hints
from typing_extensions import Literal

ShapeType = TypeVar("ShapeType")
DataType = TypeVar("DataType")

# TODO(jeff): this shouldn't be Literal
Shape = Literal


class Float32:
    dtype = tf.float32


class Float64:
    dtype = tf.float64


# TODO(jeff): generate all dtypes


class Tensor(Generic[ShapeType, DataType]):
    def __class_getitem__(cls, item):
        shape = item[0].__args__
        dtype = item[1].dtype
        return shape, dtype


def function(fn):
    type_hints = get_type_hints(fn)
    input_signature = [
        tf.TensorSpec(shape, dtype, name) for name, (shape, dtype) in type_hints.items()
    ]
    return tf.function(fn, input_signature=input_signature)


@function
def foo(x: Tensor[Shape[None, 2, 3], Float64], y: Tensor[Shape[1, 1, 1], Float64]):
    return x + y


>>> print(foo.input_signature)
(TensorSpec(shape=(None, 2, 3), dtype=tf.float64, name='x'), TensorSpec(shape=(1, 1, 1), dtype=tf.float64, name='y'))

with:

$ mypy types_test.py
types_test.py:1: error: No library stub file for module 'tensorflow'
types_test.py:1: note: (Stub files are from https://github.com/python/typeshed)
types_test.py:40: error: Variable "types_test.Shape" is not valid as a type
types_test.py:40: error: Invalid type: try using Literal[2] instead?
types_test.py:40: error: Invalid type: try using Literal[3] instead?
types_test.py:40: error: Invalid type: try using Literal[1] instead?
types_test.py:41: error: Unsupported left operand type for + ("Tensor[Shape?[None, Any, Any], Float64]")
Found 6 errors in 1 file (checked 1 source file)

So indeed there is a problem with using Shape = Literal but I am not sure how to make a new type which can hold the list of shape data.

Thanks

@mdanatg

mdanatg commented Dec 16, 2019

I think this is going in the right direction. Here's a version of @jeffpollock9's code that I think mypy is happy with (barring the lack of annotations in TensorFlow).

Since in the space of types there are no values (None is just sugar for NoneType), it doesn't seem currently possible to specify something like MyCustomType[1] without the use of Literal. So the type annotations will look a bit awkward.
Perhaps a future PEP could relax that.

It appears that Literal is the only special type that allows a variable number of things (and even in that case the values are packed into a sugared tuple). In other words we can't define a type Shape[*Dim] and we're forced to either use Literal[(None, 2, 3)] or to specialize things by rank, like Shape1D, Shape2D, etc. I'm also a bit afraid of Literal because it has the semantic "the value can be any one of these (their order is irrelevant)", but here we really need to say: "the value is this specific list (in this specific order)". That's why I used Literal[(a, b, c)] and not just Literal[a, b, c].
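The packing behaviour described above can be checked directly (using typing.Literal from Python 3.8+, which behaves the same as typing_extensions.Literal for this purpose):

```python
from typing import Literal  # typing_extensions.Literal on Python < 3.8

# Literal packs a parenthesized tuple and comma-separated parameters into
# the same __args__ tuple, so both spellings below are interchangeable.
packed = Literal[(None, 2, 3)]
unpacked = Literal[None, 2, 3]
```

The order of __args__ is preserved as written, even though Literal's semantics treat the values as an unordered set of alternatives.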

Lastly, I think we can also get by entirely using inspect to examine these type arguments - see the code.

Here's the code:

import tensorflow as tf

import inspect
from typing import Any, Generic, TypeVar
from typing_extensions import Literal

ShapeType = TypeVar("ShapeType")
DataType = TypeVar("DataType")


class Shape(Generic[ShapeType]):
    pass


class Float32(object):
    value = tf.float32


class Float64(object):
    value = tf.float64


# TODO(jeff): generate all dtypes


class Tensor(Generic[ShapeType, DataType]):
    def __rmul__(self, other: Any):
        pass  # Just appeasing mypy here, the real Tensor has a proper implementation.


def function(fn):
    argspec = inspect.getfullargspec(fn)
    if (argspec.varargs is not None or argspec.varkw is not None or argspec.varkw is not None):
        raise NotImplementedError("only positional args for now")

    input_signature = []
    for name in argspec.args:
        if name not in argspec.annotations:
            input_signature.append(None)
            continue
        shape_as_type, dtype = argspec.annotations[name].__args__
        shape = []
        for s in shape_as_type.__args__[0].__values__:
            if s is None:
                shape.append(None)
            else:
                shape.append(int(s))

        ts = tf.TensorSpec(shape=shape, dtype=dtype.value)
        input_signature.append(ts)
    return tf.function(fn, input_signature=input_signature)


@function
def foo(x: Tensor[Shape[Literal[(None, 2, 3)]], Float64]):
    return 2 * x


foo(tf.random.normal([1, 2, 3], dtype=tf.float64))  # OK
foo(tf.random.normal([2, 2, 3], dtype=tf.float64))  # OK
try:
    foo(tf.random.normal([1, 2, 3, 4], dtype=tf.float64))  # NOT OK
    assert False
except ValueError:
    pass
try:
    foo(tf.random.normal([1, 2, 3], dtype=tf.float32))  # NOT OK
    assert False
except ValueError:
    pass

@mdanatg

mdanatg commented Dec 16, 2019

Going the Literal-free path might not be so bad. Here's a version that's very verbose, but the type annotation looks quite nice. I named the dimensions MNISTWidth and MNISTHeight to show that such boilerplate-y types can have an actual intuitive meaning.

## This is what the gigantic file of type defs would contain

Shape3DDim1 = TypeVar("Shape3DDim1")
Shape3DDim2 = TypeVar("Shape3DDim2")
Shape3DDim3 = TypeVar("Shape3DDim3")


class Shape3D(Generic[Shape3DDim1, Shape3DDim2, Shape3DDim3]):
    pass


class Dimension(object):
    value = NotImplemented


class Dynamic(Dimension):
    value = None


## This is what the user would have to define:

class MNISTWidth(Dimension):
    value = 2


class MNISTHeight(Dimension):
    value = 3


@function
def foo(x: Tensor[Shape3D[Dynamic, MNISTWidth, MNISTHeight], Float64]):
    return 2 * x

@jeffpollock9
Author

@mdanatg thanks for this! I really like your Literal-free code - since it doesn't seem possible to define a Shape[*Dim] type I think this is the way to go. The only downside is the big file of typedefs as you mentioned - but I think we could automatically generate a file with up to (say) Shape10D and I can't imagine it ever being a limitation.
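Generating the boilerplate mechanically is straightforward; here is a hypothetical sketch (make_shape_class is a made-up name, not part of any proposal) that builds ShapeND classes up to a given rank:

```python
import types
from typing import Generic, TypeVar

# Hypothetical generator for the ShapeND boilerplate discussed above.
def make_shape_class(rank):
    tvars = tuple(TypeVar(f"Shape{rank}DDim{i}") for i in range(1, rank + 1))
    # types.new_class resolves Generic's __mro_entries__, unlike bare type().
    return types.new_class(f"Shape{rank}D", (Generic[tvars],))


# e.g. Shape1D .. Shape10D
shapes = {rank: make_shape_class(rank) for rank in range(1, 11)}
Shape3D = shapes[3]
```

Each generated class is subscriptable with exactly `rank` parameters, so Shape3D[Dynamic, MNISTWidth, MNISTHeight] works as in the example above.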

@jeffpollock9
Author

I've made a few changes to the code above:

Firstly, I don't think we need to handle a None value in the input_signature list, as tf.function doesn't accept one anyway, i.e. this fails:

@tf.function(input_signature=[None, tf.TensorSpec([1, 2], tf.float32)])
def foo(x, y):
    return x + y

with:

TypeError: Invalid input_signature [None, TensorSpec(shape=(1, 2), dtype=tf.float32, name=None)]; input_signature must be a possibly nested sequence of TensorSpec objects.

so we can remove:

if name not in argspec.annotations:
    input_signature.append(None)
    continue

Secondly, I had to change:

for s in shape_as_type.__args__[0].__values__:

to

for s in shape_as_type.__args__:

Thirdly, for the inner loop over the shapes, should it not be s.value instead of s?

so the full code is:

import tensorflow as tf
import inspect

from typing import Generic, Any, TypeVar

# TODO: generate all dtypes
# TODO: generate all shapes

ShapeType = TypeVar("ShapeType")
DataType = TypeVar("DataType")

Shape3DDim1 = TypeVar("Shape3DDim1")
Shape3DDim2 = TypeVar("Shape3DDim2")
Shape3DDim3 = TypeVar("Shape3DDim3")


class Shape3D(Generic[Shape3DDim1, Shape3DDim2, Shape3DDim3]):
    pass


class Dimension(object):
    value = NotImplemented


class Dynamic(Dimension):
    value = None


class Float32(object):
    value = tf.float32


class Float64(object):
    value = tf.float64


class Tensor(Generic[ShapeType, DataType]):
    def __rmul__(self, other: Any):
        pass  # Just appeasing mypy here, the real Tensor has a proper implementation.


def function(fn):
    argspec = inspect.getfullargspec(fn)
    if argspec.varargs is not None or argspec.varkw is not None:
        raise NotImplementedError("only positional args for now")

    input_signature = []
    for name in argspec.args:
        shape_as_type, dtype = argspec.annotations[name].__args__
        shape = []
        for s in shape_as_type.__args__:
            if s.value is None:
                shape.append(None)
            else:
                shape.append(int(s.value))

        ts = tf.TensorSpec(shape=shape, dtype=dtype.value, name=name)
        input_signature.append(ts)
    return tf.function(fn, input_signature=input_signature)


# User code starts here
class MNISTWidth(Dimension):
    value = 2


class MNISTHeight(Dimension):
    value = 3


@function
def foo(x: Tensor[Shape3D[Dynamic, MNISTWidth, MNISTHeight], Float64]):
    return 2.0 * x


# Some ad hoc testing
print(f"foo signature: {foo.input_signature}")
foo_x_ts = tf.TensorSpec(shape=[None, 2, 3], dtype=tf.float64, name="x")
assert len(foo.input_signature) == 1
assert foo.input_signature[0] == foo_x_ts


@function
def bar():
    return tf.random.normal([1, 2, 3])


print(f"bar signature: {bar.input_signature}")
assert bar.input_signature == ()

$ python types_test.py
foo signature: (TensorSpec(shape=(None, 2, 3), dtype=tf.float64, name='x'),)
bar signature: ()

$ mypy types_test.py
types_test.py:1: error: No library stub file for module 'tensorflow'
types_test.py:1: note: (Stub files are from https://github.com/python/typeshed)
Found 1 error in 1 file (checked 1 source file)

I also removed the extra check in argspec.varkw is not None or argspec.varkw is not None - I guess that was just a typo?

@mdanatg

mdanatg commented Dec 17, 2019

Yep, your edits all look good! The Literal-free version did require them, but I didn't want that to clutter the post.
You're right about input_signature, it only supports None in a change that hasn't been submitted yet, sorry! Probably best to raise an error for now. In a future version we should be able to leave some args without annotations and have their shape/type inferred.
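The "raise an error for now" behaviour could look something like this; a TF-free sketch in which require_annotations is an illustrative name only:

```python
import inspect

# Sketch of rejecting unannotated positional args up front, as suggested
# above; require_annotations is a made-up helper name.
def require_annotations(fn):
    argspec = inspect.getfullargspec(fn)
    missing = [name for name in argspec.args if name not in argspec.annotations]
    if missing:
        raise TypeError(f"arguments without annotations: {missing}")
    return fn
```

The same check could run at the top of the function decorator, before building the TensorSpec list.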

@mdanatg

mdanatg commented Feb 20, 2020

FYI, tensorflow/community#208 aims to establish a home for type definitions such as these. The RFC mentions this ongoing work, but we can include more specific details if ready.

@jeffpollock9
Author

@mdanatg thanks for this - looks really interesting! I had a couple of evenings to try and add some of this to tensorflow but was struggling to even run the existing tests as TF takes days to build on my laptop. I'm hoping to have some time to try again soon but if there is anything in particular I could contribute please let me know.


@mdanatg

mdanatg commented Dec 9, 2020

Quick note, DeepMind has created an implementation similar to the ideas in this thread: https://github.com/deepmind/tensor_annotations

@tilakrayal tilakrayal removed the TF 2.0 Issues relating to TensorFlow 2.0 label Dec 18, 2021
@mdanatg mdanatg removed their assignment Feb 7, 2023
@github-actions

This issue is stale because it has been open for 180 days with no activity. It will be closed if no further activity occurs. Thank you.

@github-actions github-actions bot added the stale This label marks the issue/pr stale - to be closed automatically if no activity label Mar 28, 2023

This issue was closed because it has been inactive for 1 year.
