Additional axis added when using tf.matmult #223

Closed
roblem opened this issue Feb 14, 2020 · 3 comments

roblem commented Feb 14, 2020

Some TensorFlow linear algebra operators require their input tensors to be 2-D. This can lead to problems during model evaluation when random variables are defined as 1-D tensors. Very likely related to #222 and #193:

Suppose you want to use TensorFlow's matrix multiply to supply inputs to your likelihood:

@pm4.model
def toy_pm4():
    beta = yield pm4.Normal("beta", 0, 10, plate=3)
    sigma = yield pm4.Exponential("sigma", rate=.1, plate=1)
    print("shape of beta: ", beta.shape)
    print("shape of data: ", X.shape)

    mu = tf.linalg.matmul(X, beta)
    print("shape of mu: ", mu.shape)

    obs = yield pm4.Normal('obs', loc=mu, scale=sigma, observed=Y)
    return obs

gives this output (some error lines suppressed; the error is thrown at the mu = line):

shape of beta:  (3,)
shape of data:  (1000, 3)
...
InvalidArgumentError: In[1] is not a matrix. Instead it has shape [3] [Op:MatMul]

The calculation of mu would have been fine in PyMC3 with theano.dot, since the shapes are conformable and Theano treats the 1-D tensor as a matrix for the purposes of the multiplication. So I have to do a tf.reshape to make beta a 2-D tensor and use that in the matrix multiply op:

@pm4.model
def toy_pm4():
    beta = yield pm4.Normal("beta", 0, 10, plate=3)
    sigma = yield pm4.Exponential("sigma", rate=.1, plate=1)

    print("shape of beta: ", beta.shape)
    print("shape of data: ", X.shape)
    bmat = tf.reshape(beta, (beta.shape[0], 1))
    print("shape of beta matrix: ", bmat.shape)

    mu = tf.linalg.matmul(X, bmat)
    print("shape of mu: ", mu.shape)

    obs = yield pm4.Normal('obs', loc=mu, scale=sigma, observed=Y)
    return obs

mu is now created, but it has shape (1000, 1) rather than (1000,), and sampling fails (with the error occurring at the obs = line) since the likelihood expected a 1-D tensor:

shape of beta:  (3,)
shape of data:  (1000, 3)
shape of beta matrix:  (3, 1)
shape of mu:  (1000, 1)
...
EvaluationError: The values supplied to the distribution 'toy_pm4/obs' are not consistent with the distribution's shape (dist_shape).
dist_shape = batch_shape + event_shape = TensorShape([1000, 1])
Supplied values shape = TensorShape([1000]).
A values array is considered to have a consistent shape with the distribution if two conditions are met.
1) It has a greater or equal number of dimensions when compared to the distribution (len(value.shape) >= len(dist_shape))
2) The values shape is compatible with the distribution's shape: dist_shape.is_compatible_with(    value_shape[(len(values.shape) - len(dist_shape)):])

To get the model to run, you need an additional reshape of mu (really a squeeze) to make it a 1-D tensor:

@pm4.model
def toy_pm4():
    beta = yield pm4.Normal("beta", 0, 10, plate=3)
    sigma = yield pm4.Exponential("sigma", rate=.1, plate=1)

    bmat = tf.reshape(beta, (beta.shape[0], 1))
    mu = tf.linalg.matmul(X, bmat)
    muvec = tf.squeeze(mu, [1])

    obs = yield pm4.Normal('obs', loc=muvec, scale=sigma, observed=Y)
    return obs

Also, adding a dimension to beta via plate (e.g., beta = yield pm4.Normal("beta", 0, 10, plate=(3,1))) doesn't fix the problem.
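
For completeness, the shape behaviour can be reproduced outside the model with plain tensors. The snippet below is a minimal sketch using dummy stand-ins for X and beta (same shapes as above, not the real data):

import tensorflow as tf

X = tf.ones((1000, 3))   # dummy design matrix, shape (1000, 3)
beta = tf.ones((3,))     # dummy coefficient vector, shape (3,)

# matmul requires both inputs to be at least rank 2, so passing the 1-D beta
# raises: InvalidArgumentError: In[1] is not a matrix. Instead it has shape [3]
# mu = tf.linalg.matmul(X, beta)

# The workaround used above: promote beta to (3, 1), multiply, then
# squeeze the trailing axis to recover a (1000,) vector.
bmat = tf.reshape(beta, (beta.shape[0], 1))        # (3, 1)
mu = tf.squeeze(tf.linalg.matmul(X, bmat), [1])    # (1000,)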

twiecki (Member) commented Feb 14, 2020

I haven't read this carefully yet, but thanks for the report. Just merged the shape changes, does that fix anything?

lucianopaz (Contributor) commented

@roblem, this is a TensorFlow problem. TensorFlow, unlike Theano, has no general-purpose dot operator that decides whether to delegate its calculation to matmul or matvec; TensorFlow forces the choice on the user. In your case you want to multiply a matrix by a vector, so you should just use tf.linalg.matvec instead of tf.linalg.matmul.

On the other hand, when you reshaped beta to be a matrix, your sampling failed because you didn't reshape the observeds to have shape (1000, 1). It actually says this in the error message: you supplied values of shape (1000,) but we expected values with shape (1000, 1). For all intents and purposes, a tensor with shape (1000, 1) is completely different from one with shape (1000,), so you'll have to be careful with shape handling (pymc3 handled this recklessly, and that led to very big problems down the line).
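
To make the suggestion concrete, here is a sketch of the model rewritten with tf.linalg.matvec; matvec multiplies the (1000, 3) matrix by the 1-D beta and returns a 1-D mu of shape (1000,), so it matches the observed Y without any reshape or squeeze. This is an illustration based on the snippets above, not code from the thread:

@pm4.model
def toy_pm4():
    beta = yield pm4.Normal("beta", 0, 10, plate=3)
    sigma = yield pm4.Exponential("sigma", rate=.1, plate=1)

    # matvec: (1000, 3) times (3,) -> (1000,), so mu stays 1-D and
    # matches the shape of the observed Y.
    mu = tf.linalg.matvec(X, beta)

    obs = yield pm4.Normal('obs', loc=mu, scale=sigma, observed=Y)
    return obs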

roblem (Author) commented Feb 14, 2020

Thanks. Makes sense.
