Custom Layer output shape isn't working with model.summary() #20746

Open
vinayak19th opened this issue Jan 10, 2025 · 2 comments
@vinayak19th commented Jan 10, 2025

The model.summary() API doesn't seem to properly use the layer's compute_output_shape() method.

Current Behavior

Model: "Q_KAN"
____________________________________________________________________________
 Layer (type)                Output Shape              Param #   Trainable  
============================================================================
 input_68 (InputLayer)       [(None, 2)]               0         Y          
                                                                            
 DenseKAN (DenseQKan)        None                      18        Y          
                                                                            
 RescalePi (Rescale)         None                      0         N          
                                                                            
============================================================================
Total params: 18 (144.00 Byte)
Trainable params: 18 (144.00 Byte)
Non-trainable params: 0 (0.00 Byte)
____________________________________________________________________________

Expected Behavior

Model: "Q_KAN"
____________________________________________________________________________
 Layer (type)                Output Shape              Param #   Trainable  
============================================================================
 input_68 (InputLayer)       [(None, 2)]               0         Y          
                                                                            
 DenseKAN (DenseQKan)        [(None, 10)]              18        Y          
                                                                            
 RescalePi (Rescale)         [(None, 10)]              0         N          
                                                                            
============================================================================
Total params: 18 (144.00 Byte)
Trainable params: 18 (144.00 Byte)
Non-trainable params: 0 (0.00 Byte)
____________________________________________________________________________

Standalone code to reproduce the issue

Custom layer code:

class DenseQKan(tf.keras.layers.Layer):
    def __init__(self,units:int,circuit:qml.QNode,layers:int,**kwargs):
        super().__init__(**kwargs)
        self.circuit = circuit
        self.qubits =  len(circuit.device.wires)
        self.units = units
        self.qbatches = None
        self.layers = layers
        
    def build(self,input_shape):
        if input_shape[-1]> self.qubits:
            self.qbatches = np.ceil(input_shape[-1]/self.qubits).astype(np.int32)
        else:
            self.qbatches = 1
        self.layer_weights = []
        for u in range(self.units):
            self.layer_weights.append(self.add_weight(shape=(self.qbatches,input_shape[-1]//self.qbatches,self.layers),
                                   initializer=tf.keras.initializers.RandomUniform(minval=-np.pi, maxval=np.pi, seed=None),
                                   trainable=True))
        self.built = True

    def compute_output_shape(self,input_shape):
        print("Build Input Shape",input_shape)
        return (input_shape[0],self.units)
        
    def call(self,inputs):
        assert self.qbatches is not None
        splits = tf.split(inputs,self.qbatches,-1) 
        out = []
        for u in range(self.units):
            unit_out = 0
            for qb in range(self.qbatches):
                qb_out = tf.reduce_sum(tf.stack(self.circuit(splits[qb],self.layer_weights[u][qb]),axis=-1),axis=-1)
                unit_out = unit_out+qb_out
            out.append(unit_out)
        out = tf.stack(out,axis=-1)
        return out

Apparent fix:

For some reason, adding the line below at the end of the layer's call() fixes the issue.

out = tf.reshape(out,(tf.shape(inputs)[0],self.units))
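For context, a toy sketch of where that workaround presumably sits: inside call(), inputs is a concrete tensor rather than a symbolic one, so tf.reshape is legal there. The layer below is a simplified stand-in, not the actual DenseQKan (which needs PennyLane to run):

```python
import tensorflow as tf

class ReshapeTailDemo(tf.keras.layers.Layer):
    """Simplified stand-in mimicking the tail of DenseQKan.call()."""
    def __init__(self, units, **kwargs):
        super().__init__(**kwargs)
        self.units = units

    def call(self, inputs):
        # build a (batch, units) output the same way DenseQKan does:
        # stack per-unit scalars along the last axis
        out = tf.stack([tf.reduce_sum(inputs, axis=-1)] * self.units, axis=-1)
        # the reported workaround: pin the static shape to (batch, units)
        out = tf.reshape(out, (tf.shape(inputs)[0], self.units))
        return out

y = ReshapeTailDemo(10)(tf.ones((4, 2)))
print(tuple(y.shape))  # (4, 10)
```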

Model definition code:

# def create_model(units,qubits,layers,circuit,input_shape=2):
inp = Input(shape=input_shape)
out = DenseQKan(units,circuit,layers,name="DenseKAN")(inp)
out = Rescale(name="RescalePi")(out)
model = Model(inputs=inp,outputs=out,name="Q_KAN")
model.summary(show_trainable=True)
@harshaljanjani (Contributor) commented

The results I got when trying to recreate the issue were quite different from what is stated in the description. In fact, it seems to behave in the opposite way.

Here's the code I used to reproduce the issue:

import tensorflow as tf
import pennylane as qml
import numpy as np

dev = qml.device('default.qubit', wires=2)

@qml.qnode(dev)
def circuit(inputs, weights):
    qml.RX(inputs[0], wires=0)
    qml.RY(inputs[1], wires=1)
    qml.CNOT(wires=[0, 1])
    for w in weights:
        qml.RZ(w, wires=0)
    return [qml.expval(qml.PauliZ(0)), qml.expval(qml.PauliZ(1))]

class DenseQKan(tf.keras.layers.Layer):
    def __init__(self, units: int, circuit: qml.QNode, layers: int, **kwargs):
        super().__init__(**kwargs)
        self.circuit = circuit
        self.qubits = len(circuit.device.wires)
        self.units = units
        self.qbatches = None
        self.layers = layers
        
    def build(self, input_shape):
        if input_shape[-1] > self.qubits:
            self.qbatches = np.ceil(input_shape[-1] / self.qubits).astype(np.int32)
        else:
            self.qbatches = 1
        self.layer_weights = []
        for u in range(self.units):
            self.layer_weights.append(
                self.add_weight(
                    shape=(self.qbatches, input_shape[-1] // self.qbatches, self.layers),
                    initializer=tf.keras.initializers.RandomUniform(minval=-np.pi, maxval=np.pi, seed=None),
                    trainable=True
                )
            )
        self.built = True

    def compute_output_shape(self, input_shape):
        print("Build Input Shape", input_shape)
        return (input_shape[0], self.units)
        
    def call(self, inputs):
        assert self.qbatches is not None 
        splits = tf.split(inputs, self.qbatches, -1) 
        out = []
        for u in range(self.units):
            unit_out = 0
            for qb in range(self.qbatches):
                qb_out = tf.reduce_sum(
                    tf.stack(self.circuit(splits[qb], self.layer_weights[u][qb]), axis=-1),
                    axis=-1
                )
                unit_out = unit_out + qb_out
            out.append(unit_out)
        out = tf.stack(out, axis=-1)
        return out

class Rescale(tf.keras.layers.Layer):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
    
    def call(self, inputs):
        return inputs * np.pi

def model_with_issue(input_shape=(2,), units=10, layers=3):
    inp = tf.keras.layers.Input(shape=input_shape)
    out = DenseQKan(units, circuit, layers, name="DenseKAN")(inp)
    out = Rescale(name="RescalePi")(out)
    model = tf.keras.Model(inputs=inp, outputs=out, name="Q_KAN")
    return model

def model_without_issue(input_shape=(2,), units=10, layers=3):
    inp = tf.keras.layers.Input(shape=input_shape)
    out = DenseQKan(units, circuit, layers, name="DenseKAN")(inp)
    # the apparent fix to reshape the output as suggested
    out = tf.reshape(out, (tf.shape(inp)[0], units))
    out = Rescale(name="RescalePi")(out)
    model = tf.keras.Model(inputs=inp, outputs=out, name="Q_KAN")
    return model

print("Model with the issue:")
model_issue = model_with_issue()
model_issue.summary(show_trainable=True)

print("Model without the issue:")
model_issue = model_without_issue()
model_issue.summary(show_trainable=True)

When I run this, the output for model_with_issue matches the expected structure, and the model summary is as follows:

Model: "Q_KAN"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━┓
┃ Layer (type)                          ┃ Output Shape                  ┃        Param # ┃ Traina… ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━┩
│ input_layer (InputLayer)              │ (None, 2)                     │              0 │    -    │
├───────────────────────────────────────┼───────────────────────────────┼────────────────┼─────────┤
│ DenseKAN (DenseQKan)                  │ (None, 10)                    │             60 │    Y    │
├───────────────────────────────────────┼───────────────────────────────┼────────────────┼─────────┤
│ RescalePi (Rescale)                   │ (None, 10)                    │              0 │    -    │
└───────────────────────────────────────┴───────────────────────────────┴────────────────┴─────────┘
 Total params: 60 (240.00 B)
 Trainable params: 60 (240.00 B)
 Non-trainable params: 0 (0.00 B)

However, when I introduce the fix in model_without_issue, I encounter this error:

Build Input Shape (None, 2)
Traceback (most recent call last):
  ...
ValueError: A KerasTensor cannot be used as input to a TensorFlow function. ...

It seems that the fix fails because it passes a symbolic KerasTensor to tf.reshape, which is not allowed at functional-model construction time. Instead, it seems like we'd need to wrap this logic in a custom Keras layer or handle it differently.
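One way to keep the reshape while avoiding the KerasTensor error is to route it through a Keras-native layer instead of a raw TensorFlow op. A minimal sketch, using a plain Dense layer as a stand-in for DenseQKan (the real layer needs PennyLane):

```python
import numpy as np
import keras

units = 10
inp = keras.layers.Input(shape=(2,))
out = keras.layers.Dense(units, name="DenseKAN_standin")(inp)  # stand-in layer
# Reshape is a Keras layer, so it accepts symbolic KerasTensors,
# unlike tf.reshape
out = keras.layers.Reshape((units,))(out)
model = keras.Model(inputs=inp, outputs=out, name="Q_KAN_sketch")

y = model(np.zeros((4, 2), dtype="float32"))
print(tuple(y.shape))  # (4, 10)
```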

Please respond in this thread with any additional details, as I feel there's a piece missing here.

Environment:

  • Keras: 3.8.0
  • TensorFlow: 2.18.0
  • PennyLane: 0.39.0
  • NumPy: 2.0.2

@sonali-kumari1 (Contributor) commented

Hi @vinayak19th @harshaljanjani -
Thanks for reporting this issue. I have tried to replicate the reported behavior with the latest version of Keras (3.8.0), and the model summary (with and without reshape) is as follows:

Model summary without reshape:

Model: "Q_KAN"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━┓
┃ Layer (type)                        ┃ Output Shape                 ┃       Param # ┃ Traina… ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━┩
│ input_layer_24 (InputLayer)         │ (None, 2)                    │             0 │    -    │
├─────────────────────────────────────┼──────────────────────────────┼───────────────┼─────────┤
│ DenseKAN (DenseQKan)                │ (None, 10)                   │            20 │    Y    │
├─────────────────────────────────────┼──────────────────────────────┼───────────────┼─────────┤
│ RescalePi (Rescaling)               │ (None, 10)                   │             0 │    -    │
└─────────────────────────────────────┴──────────────────────────────┴───────────────┴─────────┘
 Total params: 20 (80.00 B)
 Trainable params: 20 (80.00 B)
 Non-trainable params: 0 (0.00 B)

Model summary with reshape:

Model: "Q_KAN"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━┓
┃ Layer (type)                        ┃ Output Shape                 ┃       Param # ┃ Traina… ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━┩
│ input_layer_25 (InputLayer)         │ (None, 2)                    │             0 │    -    │
├─────────────────────────────────────┼──────────────────────────────┼───────────────┼─────────┤
│ DenseKAN (DenseQKan)                │ (None, 10)                   │            20 │    Y    │
├─────────────────────────────────────┼──────────────────────────────┼───────────────┼─────────┤
│ reshape_3 (Reshape)                 │ (None, 10)                   │             0 │    -    │
├─────────────────────────────────────┼──────────────────────────────┼───────────────┼─────────┤
│ RescalePi (Rescaling)               │ (None, 10)                   │             0 │    -    │
└─────────────────────────────────────┴──────────────────────────────┴───────────────┴─────────┘
 Total params: 20 (80.00 B)
 Trainable params: 20 (80.00 B)
 Non-trainable params: 0 (0.00 B)

The error ValueError: A KerasTensor cannot be used as input to a TensorFlow function occurs because a KerasTensor is being passed to a TensorFlow function (tf.reshape in your case). To resolve this, you can use keras.layers.Reshape instead of tf.reshape, like this:
out = keras.layers.Reshape((units,))(out)

And instead of defining a custom Rescale layer, you can use Keras's built-in Rescaling layer (which takes a scalar scale factor) like this:
out = Rescaling(np.pi, name="RescalePi")(out)
Attaching gist for your reference.
