Additional axis added when using tf.matmult #223

Closed
roblem opened this issue Feb 14, 2020 · 3 comments

roblem commented Feb 14, 2020

Some TensorFlow linear algebra operators require their input tensors to be 2-D. This can lead to problems during model evaluation when random variables are defined as 1-D tensors. Very likely related to #222 and #193:

Suppose you want to use TensorFlow's matrix multiply to supply inputs to your likelihood:

@pm4.model
def toy_pm4():
    beta = yield pm4.Normal("beta", 0, 10, plate=3)
    sigma = yield pm4.Exponential("sigma", rate=.1, plate=1)
    print("shape of beta: ", beta.shape)
    print("shape of data: ", X.shape)

    mu = tf.linalg.matmul(X, beta)
    print("shape of mu: ", mu.shape)

    obs = yield pm4.Normal('obs', loc=mu, scale=sigma, observed=Y)
    return obs

gives this output (some error lines suppressed; the error is thrown at the mu = line):

shape of beta:  (3,)
shape of data:  (1000, 3)
...
InvalidArgumentError: In[1] is not a matrix. Instead it has shape [3] [Op:MatMul]

The calculation of mu would have been fine in PyMC3 with theano.dot, since the shapes are conformable and Theano treats the 1-D tensor as a matrix for the purposes of the multiplication. So I have to do a tf.reshape to make beta a 2-D tensor and use that in the matrix multiply op:

@pm4.model
def toy_pm4():
    beta = yield pm4.Normal("beta", 0, 10, plate=3)
    sigma = yield pm4.Exponential("sigma", rate=.1, plate=1)

    print("shape of beta: ", beta.shape)
    print("shape of data: ", X.shape)
    bmat = tf.reshape(beta, (beta.shape[0], 1))
    print("shape of beta matrix: ", bmat.shape)

    mu = tf.linalg.matmul(X, bmat)
    print("shape of mu: ", mu.shape)

    obs = yield pm4.Normal('obs', loc=mu, scale=sigma, observed=Y)
    return obs

mu is now created, but it has shape (1000, 1) rather than (1000,), and sampling fails (with the error occurring at the obs = line) since the likelihood expected a 1-D tensor:

shape of beta:  (3,)
shape of data:  (1000, 3)
shape of beta matrix:  (3, 1)
shape of mu:  (1000, 1)
...
EvaluationError: The values supplied to the distribution 'toy_pm4/obs' are not consistent with the distribution's shape (dist_shape).
dist_shape = batch_shape + event_shape = TensorShape([1000, 1])
Supplied values shape = TensorShape([1000]).
A values array is considered to have a consistent shape with the distribution if two conditions are met.
1) It has a greater or equal number of dimensions when compared to the distribution (len(value.shape) >= len(dist_shape))
2) The values shape is compatible with the distribution's shape: dist_shape.is_compatible_with(    value_shape[(len(values.shape) - len(dist_shape)):])

To get the model to run, you need an additional reshape of mu (really a squeeze) to make it a 1-D tensor:

@pm4.model
def toy_pm4():
    beta = yield pm4.Normal("beta", 0, 10, plate=3)
    sigma = yield pm4.Exponential("sigma", rate=.1, plate=1)

    bmat = tf.reshape(beta, (beta.shape[0], 1))
    mu = tf.linalg.matmul(X, bmat)
    muvec = tf.squeeze(mu, [1])

    obs = yield pm4.Normal('obs', loc=muvec, scale=sigma, observed=Y)
    return obs

Also, adding a dimension to beta via plate (e.g., beta = yield pm4.Normal("beta", 0, 10, plate=(3,1))) doesn't fix the problem.
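
For completeness, the shape behaviour can be reproduced outside the model with plain tensors. The snippet below is a minimal sketch using dummy stand-ins for X and beta (same shapes as above, not the real data):

import tensorflow as tf

X = tf.ones((1000, 3))   # dummy design matrix, shape (1000, 3)
beta = tf.ones((3,))     # dummy coefficient vector, shape (3,)

# matmul requires both inputs to be at least rank 2, so passing the 1-D beta
# raises: InvalidArgumentError: In[1] is not a matrix. Instead it has shape [3]
# mu = tf.linalg.matmul(X, beta)

# The workaround used above: promote beta to (3, 1), multiply, then
# squeeze the trailing axis to recover a (1000,) vector.
bmat = tf.reshape(beta, (beta.shape[0], 1))        # (3, 1)
mu = tf.squeeze(tf.linalg.matmul(X, bmat), [1])    # (1000,)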

twiecki (Member) commented Feb 14, 2020

I haven't read this carefully yet, but thanks for the report. Just merged the shape changes, does that fix anything?

lucianopaz (Contributor) commented

@roblem, this is a TensorFlow problem. TensorFlow, unlike Theano, has no general-purpose dot operator that decides whether to delegate its calculation to matmul or matvec; TensorFlow forces the choice on the user. In your case you want to multiply a matrix by a vector, so you should just use tf.linalg.matvec instead of tf.linalg.matmul.

On the other hand, when you reshaped beta to be a matrix, your sampling failed because you didn't reshape the observeds to have shape (1000, 1). It actually says this in the error message: you supplied values of shape (1000,) but we expected values with shape (1000, 1). For all intents and purposes, a tensor with shape (1000, 1) is completely different from one with shape (1000,), so you'll have to be careful with shape handling (pymc3 handled this recklessly, and that led to very big problems down the line).
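
To make the suggestion concrete, here is a sketch of the model rewritten with tf.linalg.matvec; matvec multiplies the (1000, 3) matrix by the 1-D beta and returns a 1-D mu of shape (1000,), so it matches the observed Y without any reshape or squeeze. This is an illustration based on the snippets above, not code from the thread:

@pm4.model
def toy_pm4():
    beta = yield pm4.Normal("beta", 0, 10, plate=3)
    sigma = yield pm4.Exponential("sigma", rate=.1, plate=1)

    # matvec: (1000, 3) times (3,) -> (1000,), so mu stays 1-D and
    # matches the shape of the observed Y.
    mu = tf.linalg.matvec(X, beta)

    obs = yield pm4.Normal('obs', loc=mu, scale=sigma, observed=Y)
    return obs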

roblem (Author) commented Feb 14, 2020

Thanks. Makes sense.
