Custom Layer output shape isn't working with model.summary() #20746

Open
vinayak19th opened this issue Jan 10, 2025 · 2 comments
@vinayak19th commented Jan 10, 2025

The model.summary() API doesn't seem to properly use the layer's compute_output_shape() method.

Current Behavior

Model: "Q_KAN"
____________________________________________________________________________
 Layer (type)                Output Shape              Param #   Trainable  
============================================================================
 input_68 (InputLayer)       [(None, 2)]               0         Y          
                                                                            
 DenseKAN (DenseQKan)        None                      18        Y          
                                                                            
 RescalePi (Rescale)         None                      0         N          
                                                                            
============================================================================
Total params: 18 (144.00 Byte)
Trainable params: 18 (144.00 Byte)
Non-trainable params: 0 (0.00 Byte)
____________________________________________________________________________

Expected Behavior

Model: "Q_KAN"
____________________________________________________________________________
 Layer (type)                Output Shape              Param #   Trainable  
============================================================================
 input_68 (InputLayer)       [(None, 2)]               0         Y          
                                                                            
 DenseKAN (DenseQKan)        [(None, 10)]              18        Y          
                                                                            
 RescalePi (Rescale)         [(None, 10)]              0         N          
                                                                            
============================================================================
Total params: 18 (144.00 Byte)
Trainable params: 18 (144.00 Byte)
Non-trainable params: 0 (0.00 Byte)
____________________________________________________________________________

Standalone code to reproduce the issue

Custom layer code:

class DenseQKan(tf.keras.layers.Layer):
    def __init__(self,units:int,circuit:qml.QNode,layers:int,**kwargs):
        super().__init__(**kwargs)
        self.circuit = circuit
        self.qubits =  len(circuit.device.wires)
        self.units = units
        self.qbatches = None
        self.layers = layers
        
    def build(self,input_shape):
        if input_shape[-1]> self.qubits:
            self.qbatches = np.ceil(input_shape[-1]/self.qubits).astype(np.int32)
        else:
            self.qbatches = 1
        self.layer_weights = []
        for u in range(self.units):
            self.layer_weights.append(self.add_weight(shape=(self.qbatches,input_shape[-1]//self.qbatches,self.layers),
                                   initializer=tf.keras.initializers.RandomUniform(minval=-np.pi, maxval=np.pi, seed=None),
                                   trainable=True))
        self.built = True

    def compute_output_shape(self,input_shape):
        print("Build Input Shape",input_shape)
        return (input_shape[0],self.units)
        
    def call(self,inputs):
        assert self.qbatches is not None
        splits = tf.split(inputs,self.qbatches,-1) 
        out = []
        for u in range(self.units):
            unit_out = 0
            for qb in range(self.qbatches):
                qb_out = tf.reduce_sum(tf.stack(self.circuit(splits[qb],self.layer_weights[u][qb]),axis=-1),axis=-1)
                unit_out = unit_out+qb_out
            out.append(unit_out)
        out = tf.stack(out,axis=-1)
        return out

Apparent fix:

For some reason, adding the line below at the end of the layer's call() fixes the issue.

out = tf.reshape(out,(tf.shape(inputs)[0],self.units))
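For context, a toy sketch of where that workaround presumably sits: inside call(), inputs is a concrete tensor rather than a symbolic one, so tf.reshape is legal there. The layer below is a simplified stand-in, not the actual DenseQKan (which needs PennyLane to run):

```python
import tensorflow as tf

class ReshapeTailDemo(tf.keras.layers.Layer):
    """Simplified stand-in mimicking the tail of DenseQKan.call()."""
    def __init__(self, units, **kwargs):
        super().__init__(**kwargs)
        self.units = units

    def call(self, inputs):
        # build a (batch, units) output the same way DenseQKan does:
        # stack per-unit scalars along the last axis
        out = tf.stack([tf.reduce_sum(inputs, axis=-1)] * self.units, axis=-1)
        # the reported workaround: pin the static shape to (batch, units)
        out = tf.reshape(out, (tf.shape(inputs)[0], self.units))
        return out

y = ReshapeTailDemo(10)(tf.ones((4, 2)))
print(tuple(y.shape))  # (4, 10)
```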

Model definition code:

# def create_model(units,qubits,layers,circuit,input_shape=2):
inp = Input(shape=input_shape)
out = DenseQKan(units,circuit,layers,name="DenseKAN")(inp)
out = Rescale(name="RescalePi")(out)
model = Model(inputs=inp,outputs=out,name="Q_KAN")
model.summary(show_trainable=True)
@harshaljanjani (Contributor) commented

The results I got when trying to recreate the issue were quite different from what is stated in the description. In fact, it seems to behave in the opposite way.

Here's the code I used to reproduce the issue:

import tensorflow as tf
import pennylane as qml
import numpy as np

dev = qml.device('default.qubit', wires=2)

@qml.qnode(dev)
def circuit(inputs, weights):
    qml.RX(inputs[0], wires=0)
    qml.RY(inputs[1], wires=1)
    qml.CNOT(wires=[0, 1])
    for w in weights:
        qml.RZ(w, wires=0)
    return [qml.expval(qml.PauliZ(0)), qml.expval(qml.PauliZ(1))]

class DenseQKan(tf.keras.layers.Layer):
    def __init__(self, units: int, circuit: qml.QNode, layers: int, **kwargs):
        super().__init__(**kwargs)
        self.circuit = circuit
        self.qubits = len(circuit.device.wires)
        self.units = units
        self.qbatches = None
        self.layers = layers
        
    def build(self, input_shape):
        if input_shape[-1] > self.qubits:
            self.qbatches = np.ceil(input_shape[-1] / self.qubits).astype(np.int32)
        else:
            self.qbatches = 1
        self.layer_weights = []
        for u in range(self.units):
            self.layer_weights.append(
                self.add_weight(
                    shape=(self.qbatches, input_shape[-1] // self.qbatches, self.layers),
                    initializer=tf.keras.initializers.RandomUniform(minval=-np.pi, maxval=np.pi, seed=None),
                    trainable=True
                )
            )
        self.built = True

    def compute_output_shape(self, input_shape):
        print("Build Input Shape", input_shape)
        return (input_shape[0], self.units)
        
    def call(self, inputs):
        assert self.qbatches is not None 
        splits = tf.split(inputs, self.qbatches, -1) 
        out = []
        for u in range(self.units):
            unit_out = 0
            for qb in range(self.qbatches):
                qb_out = tf.reduce_sum(
                    tf.stack(self.circuit(splits[qb], self.layer_weights[u][qb]), axis=-1),
                    axis=-1
                )
                unit_out = unit_out + qb_out
            out.append(unit_out)
        out = tf.stack(out, axis=-1)
        return out

class Rescale(tf.keras.layers.Layer):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
    
    def call(self, inputs):
        return inputs * np.pi

def model_with_issue(input_shape=(2,), units=10, layers=3):
    inp = tf.keras.layers.Input(shape=input_shape)
    out = DenseQKan(units, circuit, layers, name="DenseKAN")(inp)
    out = Rescale(name="RescalePi")(out)
    model = tf.keras.Model(inputs=inp, outputs=out, name="Q_KAN")
    return model

def model_without_issue(input_shape=(2,), units=10, layers=3):
    inp = tf.keras.layers.Input(shape=input_shape)
    out = DenseQKan(units, circuit, layers, name="DenseKAN")(inp)
    # the apparent fix to reshape the output as suggested
    out = tf.reshape(out, (tf.shape(inp)[0], units))
    out = Rescale(name="RescalePi")(out)
    model = tf.keras.Model(inputs=inp, outputs=out, name="Q_KAN")
    return model

print("Model with the issue:")
model_issue = model_with_issue()
model_issue.summary(show_trainable=True)

print("Model without the issue:")
model_issue = model_without_issue()
model_issue.summary(show_trainable=True)

When I run this, the output for model_with_issue matches the expected structure, and the model summary is as follows:

Model: "Q_KAN"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━┓
┃ Layer (type)                          ┃ Output Shape                  ┃        Param # ┃ Traina… ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━┩
│ input_layer (InputLayer)              │ (None, 2)                     │              0 │    -    │
├───────────────────────────────────────┼───────────────────────────────┼────────────────┼─────────┤
│ DenseKAN (DenseQKan)                  │ (None, 10)                    │             60 │    Y    │
├───────────────────────────────────────┼───────────────────────────────┼────────────────┼─────────┤
│ RescalePi (Rescale)                   │ (None, 10)                    │              0 │    -    │
└───────────────────────────────────────┴───────────────────────────────┴────────────────┴─────────┘
 Total params: 60 (240.00 B)
 Trainable params: 60 (240.00 B)
 Non-trainable params: 0 (0.00 B)

However, when I introduce the fix in model_without_issue, I encounter this error:

Build Input Shape (None, 2)
Traceback (most recent call last):
  ...
ValueError: A KerasTensor cannot be used as input to a TensorFlow function. ...

It seems that the fix fails because it passes a symbolic KerasTensor to tf.reshape, which is not allowed at functional-model construction time. Instead, it seems like we'd need to wrap this logic in a custom Keras layer or handle it differently.
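One way to keep the reshape while avoiding the KerasTensor error is to route it through a Keras-native layer instead of a raw TensorFlow op. A minimal sketch, using a plain Dense layer as a stand-in for DenseQKan (the real layer needs PennyLane):

```python
import numpy as np
import keras

units = 10
inp = keras.layers.Input(shape=(2,))
out = keras.layers.Dense(units, name="DenseKAN_standin")(inp)  # stand-in layer
# Reshape is a Keras layer, so it accepts symbolic KerasTensors,
# unlike tf.reshape
out = keras.layers.Reshape((units,))(out)
model = keras.Model(inputs=inp, outputs=out, name="Q_KAN_sketch")

y = model(np.zeros((4, 2), dtype="float32"))
print(tuple(y.shape))  # (4, 10)
```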

Please respond in this thread with any additional details, as I feel there's a piece missing here.

Environment:

  • Keras: 3.8.0
  • TensorFlow: 2.18.0
  • PennyLane: 0.39.0
  • NumPy: 2.0.2

@sonali-kumari1 (Contributor) commented

Hi @vinayak19th @harshaljanjani -
Thanks for reporting this issue. I have tried to replicate the reported behavior with the latest version of Keras (3.8.0), and the model summary (with and without reshape) is as follows:

Model summary without reshape:

Model: "Q_KAN"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━┓
┃ Layer (type)                        ┃ Output Shape                 ┃       Param # ┃ Traina… ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━┩
│ input_layer_24 (InputLayer)         │ (None, 2)                    │             0 │    -    │
├─────────────────────────────────────┼──────────────────────────────┼───────────────┼─────────┤
│ DenseKAN (DenseQKan)                │ (None, 10)                   │            20 │    Y    │
├─────────────────────────────────────┼──────────────────────────────┼───────────────┼─────────┤
│ RescalePi (Rescaling)               │ (None, 10)                   │             0 │    -    │
└─────────────────────────────────────┴──────────────────────────────┴───────────────┴─────────┘
 Total params: 20 (80.00 B)
 Trainable params: 20 (80.00 B)
 Non-trainable params: 0 (0.00 B)

Model summary with reshape:

Model: "Q_KAN"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━┓
┃ Layer (type)                        ┃ Output Shape                 ┃       Param # ┃ Traina… ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━┩
│ input_layer_25 (InputLayer)         │ (None, 2)                    │             0 │    -    │
├─────────────────────────────────────┼──────────────────────────────┼───────────────┼─────────┤
│ DenseKAN (DenseQKan)                │ (None, 10)                   │            20 │    Y    │
├─────────────────────────────────────┼──────────────────────────────┼───────────────┼─────────┤
│ reshape_3 (Reshape)                 │ (None, 10)                   │             0 │    -    │
├─────────────────────────────────────┼──────────────────────────────┼───────────────┼─────────┤
│ RescalePi (Rescaling)               │ (None, 10)                   │             0 │    -    │
└─────────────────────────────────────┴──────────────────────────────┴───────────────┴─────────┘
 Total params: 20 (80.00 B)
 Trainable params: 20 (80.00 B)
 Non-trainable params: 0 (0.00 B)

The error ValueError: A KerasTensor cannot be used as input to a TensorFlow function occurs because a KerasTensor is being passed to a TensorFlow function (tf.reshape in your case). To resolve this, you can use keras.layers.Reshape instead of tf.reshape, like this:
out = keras.layers.Reshape((units,))(out)

And instead of defining a custom Rescale layer, you can use Keras's built-in Rescaling layer (which takes a scalar scale factor) like this:
out = Rescaling(np.pi, name="RescalePi")(out)
Attaching gist for your reference.
