
C. Package Description


This library is intended to be used as an extension of the PyTorch library. Users are meant to create their own model architectures as children of PyTorch's torch.nn.Module. These custom models can use any standard class defined in torch.nn, but they can also use Processor- or DNPU-based classes, which represent dopant-network devices.

The library is divided into three main parts:

  • Processors: The main package for handling all simulations and hardware measurements of dopant-networks.
  • Algorithms: Provides default algorithms for brains-py that already add several features particular to dopant-networks. For more advanced tasks, however, it is recommended that users develop their own gradient descent algorithms.
  • Utils: A set of classes that support tasks typically required when using the library.

1 Processors

1.1 Processor

The processor is the key unit of this package. It represents a dopant-network device, which can have a varying number of activation electrodes. The main benefit of using a processor is that it can represent the behaviour of a dopant network either as a simulation or directly in hardware, by measuring the device in real time. Seen as a black box with only inputs and outputs, the Processor behaves exactly the same whether it uses a simulation or a hardware circuit. This comes in very handy for designing different types of dopant-network-based circuits: several simulated dopant-network devices can be connected together, and the same code can then be reused for both simulations and hardware.

The processor is initialised with a configuration dictionary that contains the processor type (software or hardware) to be instantiated, the electrode effects (in case they should override those in the info dictionary), and the waveform information. For instantiating a hardware processor, the configs dictionary can also contain data for instantiating the driver. Additionally, the instantiation of a software processor accepts an info dictionary, containing relevant information on how the surrogate model was trained (number of activation/readout electrodes, input ranges of the activation electrodes, clipping, amplification factor, etc.) and on the structure of the surrogate model (number of layers, activation function, etc.). Finally, the software processor also accepts a state dictionary, containing all the trained weights of the surrogate model. More information can be found in the Processor class itself.

As can be observed in the image below, the processor contains either an instance of a SurrogateModel (SoftwareProcessor) or a HardwareProcessor. Since there are several differences between these two classes, the Processor class applies the required modifications so that, from outside the black box, both behave in the same way. The main difference comes from the need to create waveforms, as explained further in the introduction of the Wiki. The waveforms use plateaus and slopes to avoid damaging devices. However, this can slow down the process when using surrogate models, so it can be avoided in those cases.

(Image placeholder: the Processor wraps either a SurrogateModel/SoftwareProcessor or a HardwareProcessor.)

  • Processor: Each DNPU contains a processor. This class makes it possible to seamlessly switch the internal processor from hardware to software or vice versa. With regard to waveforms, it creates the plateaus of the waveform before passing them to either the SurrogateModel or the HardwareProcessor, as this is a common requirement for both: the HardwareProcessor requires plateaued data to avoid undesirable capacitive effects, and the SurrogateModel (SoftwareProcessor) might need it for noise analysis. Using the terminology from the introduction of the Wiki, the Processor only passes input data to activation electrodes. Only the DNPU class differentiates between data-input and control electrodes among the activation electrodes, as the DNPU class contains the learnable parameters for the control voltages assigned to some of the activation electrodes. The DNPU class also arranges the input data and the control voltages into the adequate electrodes before passing the information to the Processor class.

  • Hardware processor: It is composed of a driver instance, which is a connection (for a single or multiple hardware DNPUs) to one of the following National Instruments measurement setups, which are children of the NationalInstrumentsSetup class: CDAQ-to-NiDAQ or CDAQ-to-CDAQ. The HardwareProcessor module expects the input data to have already been plateaued, and it creates (and removes) the ramps required to avoid damaging the device. The class adds slopes to the plateaued data before passing it to the drivers, and it also transforms PyTorch tensors into NumPy arrays, as this is the format the drivers accept. Once it has read information from the hardware, it converts the data back to PyTorch tensors and masks out the points of the output that correspond to the ramps. It has been designed in such a way that other hardware drivers can easily be added if required, while the HardwareProcessor keeps behaving in exactly the same way.

  • Software processor (SurrogateModel): Instead of having a driver, it is composed of a deep neural network model (with frozen weights). The SoftwareProcessor module expects input data that is already plateaued. SoftwareProcessors do not require any ramping of the inputs, as this would only reduce the performance of computing the outputs. However, plateaued data is allowed, as it can help simulate noise effects and therefore provide a more realistic output than one without noise. For faster computations, where noise simulation is not needed, a plateau of length 1 is recommended. Apart from the waveform difference, the SoftwareProcessor also applies several effects to the output, such as the amplification correction of the device, clipping values, or relevant noise simulations. The hardware output is expected to be in nano-amperes; in order to be able to read it, it is amplified. The surrogate model can apply an amplification correction factor so that the original output in nano-amperes is recovered.


1.1.1 Example: When to use a processor

A single Processor or several Processors can be used simultaneously in a custom torch.nn.Module model, and the package still allows for seamlessly validating them on hardware by simply calling the custom implementation of the hw_eval function on the custom class. No further changes in software are required beyond this call. The processor does not have any learnable parameters: in software simulation mode it simply maps the input-output relationship of the device, and in hardware mode it behaves in the same way.

The usage of an independent instance of Processor is recommended for experiments where only hardware measurements are expected, with no learning involved (for example, measuring an IV curve). Also, the declaration of, or a reference to, a Processor instance is required for instantiating more advanced units such as the DNPU, DNPUBatchNorm, or DNPUConv2d (explained below). It is NOT recommended to use the SurrogateModel or HardwareProcessor classes independently.

More information about the usage of a processor in a custom class can be found in the usage examples of the Wiki.
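As an orientation, the sketch below shows what such a custom class could look like. It is a hypothetical, minimal example and not code taken from the library: the configuration dictionaries are the ones defined in sections 1.1.2 and 1.1.3 below, the import path is assumed, and the hw_eval implementation simply swaps the internal processor to hardware.

import torch
from brainspy.processors.processor import Processor   # import path assumed

class CustomProcessorModel(torch.nn.Module):  # hypothetical custom model wrapping a single Processor
    def __init__(self, configs, info=None, state_dict=None):
        super().__init__()
        # Starts in simulation mode when given an info dictionary and a state dictionary
        self.processor = Processor(configs, info, state_dict)

    def forward(self, x):
        return self.processor(x)

    def hw_eval(self, hw_configs):
        # Swap the internal processor from software to hardware; the same forward()
        # call then performs real-time measurements on the device.
        self.eval()
        self.processor.swap(hw_configs)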

1.1.2 Example: Initialise a Processor as Hardware

In order to initialise a processor as hardware, the following code can be used. First, configurations need to be defined:

hw_processor_configs = {
    "processor_type" : 'cdaq_to_cdaq', # There are four different processor types: 'simulation', 'cdaq_to_cdaq', 'cdaq_to_nidaq', 'simulation_debug'. 
    'input_indices': [3,6], # It specifies what indices will be taken as data input. These correspond to the indices of the activation channels inside the instruments' setup configuration. 
    'waveform': { # The waveform determines the number of points that will be used to represent a single point. More information about waveforms can be found in Section 5.1 Waveforms: https://github.com/BraiNEdarwin/brains-py/wiki/A.-Introduction
        'plateau_length' : 30, 
        'slope_length' : 30
    }
}

driver_configs = {
        'inverted_output': True, # Whether the op-amp circuit that amplifies the output of the DNPU applies an inversion.
        'amplification': [41], # Indicates the amplification correction factor that will be applied to obtain the real current measurement from the setup. More information about this can be found in Section 5.3 of the introduction of brains-py wiki: https://github.com/BraiNEdarwin/brains-py/wiki/A.-Introduction
        'instruments_setup':{ 
            'multiple_devices': False, # Indicates whether the setup is using a PCB with multiple devices or not.
            'trigger_source': 'cDAQ1/segment1', # Triggering signal to be sent for reading synchronisation. You can check for this signal for your setup on the NIMax app.
            'average_io_point_difference': True, # If the number of points read from the device is different from the number of points written to the device due to a difference in their
                                                # sampling frequencies, this variable indicates if there should be an averaging so that the input and output have the same length.
            'activation_instrument': 'cDAQ1Mod3', # Main module used for sending voltage signals to the DNPU 
            'activation_sampling_frequency': 1000, # Number of samples that will be written to the DNPU in one second.
            'activation_channels': [6,0,1,5,2,4,3], # Channels of the module that will be used for sending signals to the device. 
            'activation_voltage_ranges': # Minimum and maximum voltage ranges allowed to be sent to the DNPU, per electrode. Dimensions: (electrode_no, 2)
              [
                [-0.7,1.2],[-0.3,0.4],[-0.7,1.2],[-0.8,1.1],[-0.6,1.1],[-0.3,0.5],[-0.7,1.1]
              ],
            'readout_instrument': 'cDAQ1Mod4', # Main module used for receiving voltage signals from the DNPU (after the op-amp) 
            'readout_sampling_frequency': 1000, # Number of samples that will be read from the DNPU in one second
            'readout_channels': [0], # Channels of the module that will be used for reading signals from the device. 
            'activation_channel_mask': [1, 1, 1, 1, 1, 1, 1] # Whether each channel connected to the device electrodes will be used. 1 enables a channel, 0 disables it. When disabled, the channel will not be declared.
        }
    
    }

hw_processor_configs['driver'] = driver_configs

Second, the processor can be directly initialised as:

from brainspy.processors.processor import Processor   # import path assumed

processor = Processor(hw_processor_configs)

1.1.3 Example: Initialise a Processor as Software

In order to initialise a processor as a software simulation, the surrogate model information is required first, as the weights that represent the input-output relationship of the dopant network need to be loaded. This information is produced when training a surrogate model and is typically stored as 'training_data.pt'. To learn how to train a surrogate model, follow the instructions in the Jupyter notebooks in brainspy-tasks. Given the data from training a surrogate model, a processor can be initialised as software with the following instructions.

import torch
from brainspy.utils.pytorch import TorchUtils   # import path assumed

model_data = torch.load('training_data.pt', map_location=TorchUtils.get_device()) # The map_location argument loads the data onto the current device type (cpu or cuda). If models were trained on a GPU and need to be used on a cpu-only computer, this argument is needed.

Then, the processor can be loaded in a similar way to how it is done for hardware.

sw_processor_configs = {
    "processor_type": 'simulation', # There are four different processor types: 'simulation', 'cdaq_to_cdaq', 'cdaq_to_nidaq', 'simulation_debug'.  In this case, we are using a processor for simulation purposes

    'input_indices': [3, 6], # It specifies what indices will be taken as data input. These correspond to the indices of the activation channels inside the instrument's setup configuration.
    
    'waveform': {  # The waveform determines the number of points used to represent a single input point. More information about waveforms can be found in Section 5.1 Waveforms: https://github.com/BraiNEdarwin/brains-py/wiki/A.-Introduction . In this case, since it is only a simulation, a plateau length of 1 and a slope length of 0 are selected. Longer plateaus can be used in simulation if other effects such as noise are to be simulated. The longer the plateaus and slopes, the slower the experiment will be.
        'plateau_length': 1,
        'slope_length': 0
    }
}

Finally, the processor can be loaded as follows:

processor = Processor(sw_processor_configs, model_data['info'], model_data['model_state_dict'])
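As a quick check (not part of the original example), a forward pass through this software processor could look as follows. The input has one column per activation electrode; a 7-electrode surrogate model with a single readout electrode is assumed here, matching the hardware example above.

x = torch.rand(128, 7)   # (batch_size, number of activation electrodes of the surrogate model)
output = processor(x)    # with plateau_length 1 and slope_length 0, the expected shape is (128, 1) for a single readout electrode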

1.1.4 Example: Swapping processors

After a processor is initialised, it can easily be swapped from software to hardware as follows:

processor.swap(hw_processor_configs)

Similarly, it can be brought back to software as follows:

processor.swap(sw_processor_configs, model_data['info'], model_data['model_state_dict'])

1.1.5 Handling differences between plateaus and targets

As part of the measurements in hardware, plateaus and slopes of the waveform need to be created. The slopes are removed within the HardwareProcessor, but the output will still be plateaued. This can cause a mismatch between the output size and the target size. In order to avoid this discrepancy, the targets need to be formatted so that the loss function does not complain about the mismatch. This can be done in two main ways:

  • Declare the average_plateaus flag as True when declaring a Processor instance. This will average the plateaus, making the output batch size the same as the target batch size.

  • Pass the targets through the format_targets function. This should be done before passing the data to the loss function during training or testing, as follows:

 ... 
predictions = processor(inputs)
plateaued_targets = processor.format_targets(targets)
loss = loss_fn(predictions, plateaued_targets)
 ... 

1.2 DNPU

The main aim of the Processor class is to handle the differences between the HardwareProcessor and SurrogateModel (SoftwareProcessor) classes, so that it can seamlessly work with both of them. The DNPU, in addition, declares some of the activation-electrode inputs as learnable parameters, as these are the control voltages that need to be learned during the simulation process. There are three main differences between a DNPU and a Processor:

  • While the processor treats all inputs as part of the activation electrodes, the DNPU differentiates between data input electrodes and control electrodes. The DNPU has as many input dimensions as data input electrodes, while it declares learnable parameters associated with the remaining (control) electrodes. As a user of a DNPU, you only need to worry about passing the correct inputs; the DNPU class passes them to the adequate electrodes for you, and the control voltages applied to the control electrodes are handled automatically.
  • The DNPU class allows the user to declare a linear transformation for the input data. The user specifies the range in which the data lies, and the class automatically calculates the input mapping that leverages the whole input range of the electrodes that have been selected as data input.
  • The DNPU class also allows declaring an array of dopant-network devices in a time-multiplexing fashion (using the same processor). In this way, the behaviour of several dopant-network units in a row can be simulated.

1.2.1 Example: Initialise a DNPU

A DNPU representing a single dopant-network, with data input indices at electrodes 2 and 3 (from 0 to 6), can be declared as follows:

from brainspy.processors.dnpu import DNPU   # DNPU lives in brainspy.processors.dnpu (see section 1.3)

# A processor is declared as explained in 1.1.3
processor = Processor(sw_processor_configs, model_data['info'], model_data['model_state_dict'])

dnpu = DNPU(processor, data_input_indices=[[2,3]])

The input of this device will be of shape (batch_size, data inputs * number of dopant-networks). Therefore, for the previous example the second dimension will have size two:

x = torch.rand(128,2) # With a batch size of 128
result = dnpu(x)

The shape of the result will be (batch_size, readout_electrodes * the number of dopant-networks). In this case (128,1).

1.2.2 Example: Initialise a DNPU with multiple time-multiplexed dopant-networks

A DNPU representing a single dopant-network with data input indices at electrodes 2 and 3 (from 0 to 6), and another dopant-network with data input indices at electrodes 4 and 5, can be declared as follows:

# A processor is declared as explained in 1.1.3
processor = Processor(sw_processor_configs, model_data['info'], model_data['model_state_dict'])

dnpu = DNPU(processor, data_input_indices=[[2,3],[4,5]])

The input of this device will be of shape (batch_size, data inputs * number of dopant-networks). Therefore, for the previous example the second dimension will have size four:

x = torch.rand(128,4) # With a batch size of 128
result = dnpu(x)

The shape of the result will be (batch_size, readout_electrodes * the number of dopant-networks). In this case (128,2).

The same principle can be used to define 3 or more dopant-networks inside the DNPU class:

dnpu = DNPU(processor, data_input_indices=[[2,3,4]]) # A single dopant-network with three data input indices
dnpu = DNPU(processor, data_input_indices=[[2,3],[2,3],[2,3]]) # Three dopant-networks with the same data input indices
dnpu = DNPU(processor, data_input_indices=[[2,3],[4,5],[1,2],[1,2]]) # Four dopant-networks, each with its own data input indices
dnpu = DNPU(processor, data_input_indices=[[2,3]]*100) # A hundred dopant-networks with the same input indices

1.2.3 Adding a linear transformation

Devices are trained with certain activation-electrode voltage ranges. Dopant-network devices often work best when the whole range in which they were trained is used (as this range is typically chosen based on their optimal IV curves). For this reason, it is useful to perform a linear transformation between the inputs and these ranges, but this can be a difficult task, given that each electrode can have its own range and different electrodes can be selected as data input electrodes. In order to solve this problem, the range values are stored in the info dictionary and loaded by the Processor and DNPU classes, so that a linear transformation can be added to a given DNPU layer. The user of the library only needs to provide the range of the inputs to the DNPU, and it will automatically perform a linear transformation to the input voltage ranges of the selected electrodes. In order to activate the linear transformation, a simple function can be called after the instantiation of a DNPU.

dnpu = DNPU(processor, data_input_indices=[[2,3,4]]) # A single dopant-network with three data input indices
dnpu.add_input_transform([-1,1]) # This particular transform will take input values in a range from -1 to 1, and automatically translate them to the corresponding voltage ranges of electrodes 2, 3 and 4.

The linear transformation can also be removed by calling:

dnpu.remove_input_transform() 

1.2.4 Forcing the control voltages to stay within their range

It can also be useful during training to force the control voltages to stay within the ranges on which the surrogate model was trained, as solutions outside those ranges might not be reproducible in hardware, either because of inaccurately extrapolated results or because of hardware setup limitations such as clipping.

There are two main techniques for this. One is to call the regularizer() function, which sums the amount by which the control voltages fall outside their ranges. This term can be added to the loss function, in a similar way to L1 or L2 regularization. Note that this method does not strictly keep values within the ranges. In order to enforce the ranges, the constraint_control_voltages() function can be called after the optimizer step. A minimal sketch of both options is shown below.
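The loop below is only an illustration of where both calls fit in a standard PyTorch training loop; train_loader, loss_fn, optimizer and the alpha weighting factor are assumptions, while regularizer(), constraint_control_voltages() and format_targets() are the DNPU methods described on this page.

for inputs, targets in train_loader:
    optimizer.zero_grad()
    predictions = dnpu(inputs)
    loss = loss_fn(predictions, dnpu.format_targets(targets))
    loss = loss + alpha * dnpu.regularizer()   # Option 1: penalise control voltages outside their ranges
    loss.backward()
    optimizer.step()
    dnpu.constraint_control_voltages()         # Option 2: force control voltages back into their ranges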

**Note that when creating more advanced modules using several DNPU instances, a custom regularizer or constraint_control_voltages function might be needed. Check the usage examples in the wiki for more information.**

1.2.5 Handling differences between plateaus and targets

The handling of plateaus and targets is delegated to the processor instance referenced within the DNPU. It can be used in the same way as for the Processor:

 ... 

predictions = dnpu(inputs)
plateaued_targets = dnpu.format_targets(targets)
loss = loss_fn(predictions, plateaued_targets)
 ... 

1.3 DNPUBatchNorm

This class is a child of the brainspy.processors.dnpu.DNPU class that adds a batch normalisation layer after the output. This is useful for normalising the output current of a DNPU when resolving a particular task. Normalised values can be passed through a sigmoid function, constraining output values to the range between 0 and 1, so that they can be linearly mapped to the input ranges of the data input electrodes of the next dopant-network layer.

The initialisation parameters are the same as those of the DNPU, plus the parameters of the batch norm, which are further explained in PyTorch's official documentation for BatchNorm2d.

dnpu = DNPUBatchNorm(processor, data_input_indices=[[2,3,4]]) # A single dopant-network with three data input indices
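As an illustration of the chaining described above (a sketch, not taken from the library's examples), the normalised output of a DNPUBatchNorm layer can be passed through a sigmoid and fed into a second DNPU layer whose input transform expects values between 0 and 1. Reusing the same processor instance for both layers is an assumption made for brevity.

layer1 = DNPUBatchNorm(processor, data_input_indices=[[2,3]] * 2)  # two dopant-networks -> two output currents
layer2 = DNPU(processor, data_input_indices=[[2,3]])               # one dopant-network with two data inputs
layer2.add_input_transform([0, 1])                                 # sigmoid outputs lie between 0 and 1

x = torch.rand(128, 4)                 # batch of 128, two data inputs for each of the two dopant-networks
h = torch.sigmoid(layer1(x))           # normalised and squashed to (0, 1); shape (128, 2)
y = layer2(h)                          # shape (128, 1)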

1.4 DNPUConv2d

This class imitates the 2d convolution operation using several DNPUs as kernels. Since it is a child of DNPU, it also enables adding a linear transformation, as well as regularisation or constraining of the control voltages. The initialisation takes the same parameters as the DNPU plus those of the convolution class from PyTorch's official documentation, Conv2d.

nr_nodes = 6   # example value: number of time-multiplexed dopant-networks used by the convolution
input_list = [[2, 3, 4]] * nr_nodes
conv = DNPUConv2d(
            processor,
            data_input_indices=input_list,
            in_channels=3,
            out_channels=6,
            kernel_size=3,
            stride=1,
            padding=0,
        )
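Since the class imitates torch.nn.Conv2d, its forward pass is expected to accept image-like tensors of shape (batch_size, in_channels, height, width). The shapes below are illustrative assumptions, not values from the original page.

x = torch.rand(16, 3, 28, 28)   # (batch_size, in_channels, height, width)
out = conv(x)                   # expected to follow the usual convolution output shape, e.g. (16, 6, 26, 26) for stride 1 and no padding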

2 Algorithms

There are two main flavours of algorithms: Genetic Algorithm and Gradient Descent. Both can be executed seamlessly by importing and calling their corresponding 'train' function. For general purposes, the corresponding train function can be loaded from brainspy.utils.manager as follows:

from brainspy.utils import manager   # import assumed, based on the module mentioned above

gd_trainer = manager.get_algorithm('gradient') 
ga_trainer = manager.get_algorithm('genetic')

Both trainers will share similar attributes. They will require:

  • Dataloaders: A list of dataloaders, containing the training and validation dataloaders.
  • Criterion: The loss or fitness function to be used.
  • Optimizer: The type of optimizer (e.g., Adam for gradient descent, or GeneticOptimizer for the genetic algorithm, which is a BLXAB implementation in PyTorch).
  • Configs: A set of configs that will differ slightly depending on whether they are for the genetic algorithm or for gradient descent.

2.1 Genetic Algorithm

This flavour supports training devices directly on-chip or in simulation. However, the current optimizer only supports control voltages as learnable parameters. This means that other parameters, such as those of linear layers or other classes from torch.nn, are not currently supported by the algorithm. The recommendation is to use the genetic algorithm for on-chip training of single dopant-network devices, and off-chip gradient descent for exploring different circuit designs that include classes from torch.nn (such as torch.nn.Linear).

2.1.1 Example: Usage of genetic algorithm

You can create your dataloaders as regular PyTorch dataloaders are created. You can find more information at PyTorch's original data loading tutorial.

train_loader = torch.utils.data.DataLoader(my_dataset,
                                       batch_size=256,
                                       shuffle=True)
val_loader = torch.utils.data.DataLoader(val_dataset,
                                       batch_size=256,
                                       shuffle=False)
dataloaders = [train_loader, val_loader]

The fitness function can be loaded by calling it directly, or using the manager:

criterion = manager.get_criterion('corrsig_fit')

Similarly, the optimizer can also be retrieved with the manager:

configs = {"optimizer" : "genetic",
               "partition": [4,22],
               "epochs": 100}

model = CustomDNPUModel() # If an instance of a model is provided, the optimizer will take the control voltage ranges from it, but the model needs to implement the get_control_ranges function. Alternatively, the key "gene_range", containing the ranges for the control voltages, can be provided in the configs.

optimizer = manager.get_optimizer(model, configs)

Finally, the training function can be called with the manager, and used with the variables that were just declared.

configs = {'stop_threshold': 0.9} # Threshold at which the fitness function will stop, in order to enable faster training

ga_trainer = manager.get_algorithm('genetic')

ga_trainer(model, dataloaders, criterion, optimizer, configs)

2.1.2 Genetic Optimiser

The library also provides a genetic optimiser, which follows a structure similar to that of the optimisers provided by default with PyTorch, in order to facilitate its integration with PyTorch. The 'step' method of the genetic optimizer receives the result of passing the outputs of all genomes (control voltage combinations) through the fitness function. It then sorts the genomes by performance and executes the crossover and mutation steps, providing a new pool of genomes for the next generation. Note that since this optimiser is not a gradient-based method, it should be executed after calling model.eval() and inside a with torch.no_grad(): block.
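The fragment below is a conceptual sketch only, not the library's training loop: the evaluate_pool helper is hypothetical, and the exact signature of the step method may differ. It only illustrates the order of operations described above.

model.eval()                     # the genetic optimiser is not gradient-based
with torch.no_grad():
    for generation in range(100):                                  # number of generations; illustrative
        fitness = evaluate_pool(model, dataloaders[0], criterion)  # hypothetical helper: one fitness value per genome
        optimizer.step(fitness)                                    # sort genomes, then apply crossover and mutation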

2.2 Gradient Descent

This flavour supports training devices off-chip. First, a surrogate model of one or several dopant-networks needs to be created. Second, custom complex models of several dopant-networks can be created and trained with the vanilla gradient descent techniques provided by PyTorch. Finally, the software supports validating the results directly on hardware.

There also exists support for on-chip gradient descent training, but it has not been made public, as it is in the process of provisional patenting. If you are interested in having this flavour, you can contact Unai Alegre-Ibarra and Mark Boon.

2.2.1 Example: Usage of the gradient descent algorithm

You can create your dataloaders as regular PyTorch dataloaders are created. You can find more information at PyTorch's original data loading tutorial.

train_loader = torch.utils.data.DataLoader(my_dataset,
                                       batch_size=256,
                                       shuffle=True)
val_loader = torch.utils.data.DataLoader(val_dataset,
                                       batch_size=256,
                                       shuffle=False)
dataloaders = [train_loader, val_loader]

The loss function can be loaded from any of PyTorch's _Loss child classes.

criterion = torch.nn.MSELoss()

It can also be loaded from the manager, for custom loss functions such as the Fisher one:

criterion = manager.get_criterion('fisher')

For the optimizer, any PyTorch Optimizer can be used, for example:

model = CustomDNPUModel()
optimizer = torch.optim.Adam(model.parameters())

Finally, the training function can be called with the manager, and used with the variables that were just declared.


configs = {'epochs': 100, # Number of times that the training algorithm will go through the whole dataset
           'constraint_control_voltages': 'clip', # It can be either 'regul' or 'clip'. Go to 1.2.4 for more information.
} 

gd_trainer = manager.get_algorithm('gradient')

gd_trainer(model, dataloaders, criterion, optimizer, configs)

3 Utils

3.1 IO

This file enables loading and saving .yaml formatted files. This is useful for loading several configurations from files, instead of having to create dictionaries manually each time.
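A short sketch of how this could be used is shown below; the load_configs and save_configs function names and the file path are assumptions.

from brainspy.utils.io import load_configs, save_configs   # function names assumed

configs = load_configs('configs/hw_processor.yaml')         # returns a configuration dictionary, e.g. for a Processor
save_configs(configs, 'configs/hw_processor_copy.yaml')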

3.2 Manager

This file enables loading loss functions, optimizers, algorithms, and drivers from configuration files. This makes it possible to use the library directly from configurations.

3.3 Pytorch

Handles the usage of the correct device (CPU or GPU) as well as the data type of PyTorch tensors.
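For instance (a minimal sketch; the import path is assumed, and TorchUtils.get_device() is the same call used in the loading example of section 1.1.3):

import torch
from brainspy.utils.pytorch import TorchUtils   # import path assumed

device = TorchUtils.get_device()                # 'cuda' when a GPU is available, otherwise 'cpu'
x = torch.rand(128, 7).to(device)               # move data to the same device before passing it to a model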

3.4 Transforms

Independent functions that facilitate linear transformations in multiple dimensions. They are used for performing the linear transformations before passing the input to the data input electrodes of dopant networks.
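The sketch below is a standalone illustration of the mapping such a transform performs (it is not the library's API): a value in one range is mapped linearly onto the voltage range of an electrode.

def linear_map(x, x_min, x_max, v_min, v_max):
    # Map x from [x_min, x_max] onto [v_min, v_max]
    scale = (v_max - v_min) / (x_max - x_min)
    offset = v_min - x_min * scale
    return x * scale + offset

linear_map(0.0, -1.0, 1.0, -0.7, 1.2)   # 0 is the centre of [-1, 1] and maps to 0.25, the centre of [-0.7, 1.2]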

3.5 Waveform

It contains a class that helps handle waveforms in multiple dimensions independently (e.g., plateaus can be created directly from points, or slopes alone can be created from plateaus). It divides the creation of the waveform into several stages so that it can be handled efficiently within the processor. It can also be used separately for other types of operations.
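As an orientation only (the class name, import path, and method name below are assumptions and may differ between versions), the staged creation could look roughly like this:

import torch
from brainspy.utils.waveform import WaveformManager   # class name and import path assumed

mgr = WaveformManager({'plateau_length': 30, 'slope_length': 30})
points = torch.rand(4, 7)                   # four input points for a 7-electrode device
plateaus = mgr.points_to_plateaus(points)   # each point repeated plateau_length times (method name assumed)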

3.6 Signal

This file contains several custom loss and fitness functions used in the BRAINS center for particular tasks. These include Fisher, which promotes separation between binary classes, and corrsig fit, which promotes both separation between classes and correlation between signals.

3.7 Performance

For some tasks within the BRAINS center, a correlated signal is sought for classification purposes. In order to know the accuracy of an obtained signal on a binary classification task, a perceptron can be trained on the normalised output signal, obtained for a given set of control voltages, together with a set of input/output labels. The files within the performance folder help train this perceptron and measure the accuracy over its threshold.
