BrainScript Network Builder
Custom networks are described in CNTK's custom network description language, BrainScript. To define a custom network, include a section named BrainScriptNetworkBuilder in your training configuration. A detailed description of the network description language can be found on the Basic Concepts page and the corresponding sub-pages.
There are two forms of the BrainScript network builder: one using parentheses (...) and a short-hand form using brackets [...]. To describe your network in an external file, specify a block similar to this:
BrainScriptNetworkBuilder = (new ComputationNetwork [
include "yourNetwork.bs"
])
where yourNetwork.bs contains the network described using BrainScript. In the above form, yourNetwork.bs is searched for first in the same directory as the config file and, if not found, in the directory of the CNTK executable. Both absolute and relative pathnames are accepted here. E.g., bs/yourNetwork.bs means a file located in a directory bs next to your config file (or alternatively in the CNTK executable directory).
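As a small illustration of the relative-path case described above, the block might look like the following sketch. The directory name bs is the one used in the text; nothing else is assumed.

```
BrainScriptNetworkBuilder = (new ComputationNetwork [
    # looked up as <config file directory>/bs/yourNetwork.bs first,
    # then as <CNTK executable directory>/bs/yourNetwork.bs
    include "bs/yourNetwork.bs"
])
```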
Alternatively, you can define your network inline, inside the config file. This can simplify configuration if you don't plan to share the same BrainScript across multiple configurations. Use this form:
BrainScriptNetworkBuilder = [
# insert network description here
]
Note that this is merely a short-hand for:
BrainScriptNetworkBuilder = (new ComputationNetwork [
# insert network description here
])
That is, if you define your network using the bracket form BrainScriptNetworkBuilder = [ /*network definition*/ ], the system automatically prepends new ComputationNetwork, and /*network definition*/ consists of the member assignments of the record that is passed to construct the internal ComputationNetwork C++ object.
If you use the parenthesis form
BrainScriptNetworkBuilder = (
/*full BrainScript expression*/
)
(round parentheses), you must provide a full BrainScript expression that evaluates to a C++ object of type ComputationNetwork, for example BrainScriptNetworkBuilder = (new ComputationNetwork [ /*network definition*/ ]). The parenthesis form must be used if your network is constructed in any other way, such as through a model-editing operation or as the result of a BrainScript function invocation.
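To make the "function invocation" case more concrete, here is a minimal sketch. The helper MakeNetwork, its parameter hiddenDim, and the field name network are hypothetical names chosen for illustration; the network body is left as a placeholder. The expression is a record whose network member is selected with a dot, so the whole parenthesized expression evaluates to a ComputationNetwork.

```
BrainScriptNetworkBuilder = ([
    # hypothetical helper: builds a network for a given hidden dimension
    MakeNetwork (hiddenDim) = new ComputationNetwork [
        # insert network description here, using hiddenDim
    ]
    # invoke the helper; the value 200 is only an example
    network = MakeNetwork (200)
].network)
```

The bracket short-hand cannot express this, because it always wraps its content in new ComputationNetwork.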
Next: BrainScript Basic Concepts.
In older versions of CNTK, the network builder was called NDLNetworkBuilder. Its definition language is a subset of BrainScript. The old parser was less capable, but also more forgiving. There are also other small differences.
NDLNetworkBuilder is now deprecated, but due to the similarity, it is not difficult to upgrade to BrainScriptNetworkBuilder. The following is a guide on how to convert NDLNetworkBuilder network descriptions to BrainScriptNetworkBuilder ones.
Converting an existing network definition for the NDLNetworkBuilder to BrainScriptNetworkBuilder is simple in most cases. The main changes are in the surrounding syntax. The core network description itself is largely upwards compatible and likely identical or near-identical if you don't take advantage of the new language features.
To convert your descriptions, you must switch the network builder, adapt the outer syntax, and possibly make minor adaptations to your network code itself.
Step 1. Switching the network builder. Replace the NDLNetworkBuilder block with the corresponding BrainScriptNetworkBuilder block in the CNTK config file. If your network description is in a separate file:
# change from:
NDLNetworkBuilder = [
ndlMacros = "shared.ndl" # (if any)
networkDescription = "yourNetwork.ndl"
]
# ...to:
BrainScriptNetworkBuilder = (new ComputationNetwork [
include "shared.bs" # (if any)
include "yourNetwork.bs"
])
(The change of filename extension is not strictly necessary but recommended.)
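For orientation, the two included files might then look roughly like this. This is a sketch that reuses the macro and network lines from the inline example in this guide; the split of content between the two files is an assumption, and the elided part (...) stands for your own network code.

```
# shared.bs: macros are ordinary function definitions
SigmoidNetwork (x, W, b) = Sigmoid (W * x + b)

# yourNetwork.bs: member assignments of the network record
feat = Input (13)
...
ce = CrossEntropyWithSoftmax (labels, z, tag="criterion")
```

Because include inserts the file content into the surrounding record, neither file carries its own load or run wrapper.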
If your network description is in the .cntk config file itself:
# change from:
NDLNetworkBuilder = [
# macros
load = [
SigmoidNetwork (x, W, b) = Sigmoid (Plus (Times (W, x), b))
]
# network description
run = [
feat = Input (13)
...
ce = CrossEntropyWithSoftmax (labels, z, tag="criterion")
]
]
# ...to:
BrainScriptNetworkBuilder = [
# macros are just defined inline
SigmoidNetwork (x, W, b) = Sigmoid (Plus (Times (W, x), b)) # or: Sigmoid (W * x + b)
# network description
feat = Input (13)
...
ce = CrossEntropyWithSoftmax (labels, z, tag="criterion")
]
Step 2. Remove load and run blocks. With BrainScriptNetworkBuilder, macro/function definitions and main code are combined. The load and run blocks must simply be removed. For example, this:
load = ndlMnistMacros
run = DNN
ndlMnistMacros = [
featDim = 784
...
labels = InputValue(labelDim)
]
DNN = [
hiddenDim = 200
...
outputNodes = (ol)
]
simply becomes:
featDim = 784
...
labels = InputValue(labelDim)
hiddenDim = 200
...
outputNodes = (ol)
You may have used the run variable to select one of multiple configurations with an external variable, e.g.:
NDLNetworkBuilder = [
run = $whichModel$ # outside parameter selects model, must be either "model1" or "model2"
model1 = [ ... (MODEL 1 DEFINITION) ]
model2 = [ ... (MODEL 2 DEFINITION) ]
]
This pattern was mostly necessary because NDL did not have conditional expressions. In BrainScript, this would now be written with an if expression:
BrainScriptNetworkBuilder = (new ComputationNetwork
if $whichModel$ == "model1" then [ ... (MODEL 1 DEFINITION) ]
else if $whichModel$ == "model2" then [ ... (MODEL 2 DEFINITION) ]
else Fail("Invalid model selector value '$whichModel$'")
)
Often, however, the selected models are very similar, so a better way is to merge their descriptions and use conditionals only where they differ. Here is an example where a parameter is used to choose between a unidirectional and a bidirectional LSTM:
encoderFunction =
if useBidirectionalEncoder
then BS.RNNs.RecurrentBirectionalLSTMPStack
else BS.RNNs.RecurrentLSTMPStack
encoder = encoderFunction (encoderDims, inputEmbedded, inputDim=inputEmbeddingDim)
Step 3. Adjust your network description. Regarding the network description (formulas) itself, BrainScript is largely upwards compatible with NDL. These are the main differences:
-
The return value of macros (functions) is no longer the last variable defined in them, but the entire set of variables. You must explicitly select the output value at the end. For example:
# NDL:
f(x) = [
    x2 = x*x
    y = x2 + 1
]  # <-- return value defaults to last entry, i.e. y
# BrainScript:
f(x) = [
    x2 = x*x
    y = x2 + 1
].y  # <-- return value y must be explicitly dereferenced
Without this change, the function return value would be the entire record, and the typical error you will get is that a ComputationNode was expected where a ComputationNetwork was found.
-
BrainScript does not allow functions with variable numbers of parameters. This matters primarily for the Parameter() function: A vector parameter can no longer be written as Parameter(N); it now has to be explicitly written as a 1-column matrix, Parameter(N, 1). Without this change, you will get an error about a mismatching number of positional parameters. This notation also works with NDL, so you can make this change first and test it with NDL before converting. This is also a good opportunity to rename any uses of the legacy name LearnableParameter() to Parameter().
It also matters for the RowStack() function, which in BrainScript takes a single parameter that is an array of inputs. The inputs must be separated by a colon (:) instead of a comma, e.g. RowStack (a:b:c) instead of RowStack (a, b, c).
-
Some defaults have been updated, primarily the optional imageLayout parameter of Convolution(), the pooling operations, and ImageInput(). For NDL, these defaulted to legacy, whereas now the default is cudnn, which is required to be compatible with the cuDNN convolution primitives. (All code samples explicitly specify this parameter as cudnn already.)
-
BrainScript's parser is more restrictive:
-
Identifiers are now case-sensitive. Built-in functions use PascalCase (e.g. RectifiedLinear), and built-in variables and parameter names use camelCase (e.g. modelPath, criterionNodes), as do option strings (init="fixedValue", tag="criterion"). Note that for names of optional parameters, incorrect spellings are not always caught as an error. Instead, some incorrectly spelled optional parameters are just ignored. An example is the 'special nodes' definitions. Their correct spelling is now:
featureNodes = ...
labelNodes = ...
criterionNodes = ...
evaluationNodes = ...
outputNodes = ...
-
Abbreviated alternative names are no longer allowed: Const() should be Constant(), tag="eval" should be tag="evaluation", and evalNodes is now evaluationNodes.
-
Some mis-spelled names were corrected: criteria is now criterion (likewise criterionNodes), and defaultHiddenActivity is now defaultHiddenActivation.
-
The = sign is no longer optional for function definitions.
-
It is no longer allowed to use curly braces for blocks ({ ... }); only brackets ([ ... ]) are allowed.
-
Option labels must be quoted as strings, e.g. init="uniform" rather than init=uniform (without the quotes, BrainScript would fail with an error message saying that the symbol uniform is unknown).
This more restricted syntax is still accepted by NDLNetworkBuilder, so we recommend first making these syntactical changes and testing them with NDL before actually changing to BrainScript.
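Several of the differences listed above can be seen side by side in a short before/after sketch. The variable names (N, bias, stacked, half, x1, x2, x3) are hypothetical, and the old forms are shown only as comments; the function and option names are the ones named in the list.

```
# NDL forms that the old parser tolerated:
#   bias    = LearnableParameter (N, init=uniform)   # legacy name, vector form, unquoted option
#   stacked = RowStack (x1, x2, x3)                   # comma-separated inputs
#   half    = Const (0.5)                             # abbreviated name

# Equivalent forms that BrainScript accepts:
bias    = Parameter (N, 1, init="uniform")   # explicit 1-column matrix, quoted option string
stacked = RowStack (x1:x2:x3)                # one array argument, colon-separated
half    = Constant (0.5)                     # full function name
```

Since the stricter spellings are also accepted by NDLNetworkBuilder, a conversion like this can be tested before the builder itself is switched.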
Step 4. Remove NDLNetworkBuilder from "write" and "test" sections. Please review your "write" and "test" sections for NDLNetworkBuilder sections, and remove them. Some of our stock examples have extraneous NDLNetworkBuilder sections that should not be there. If your configuration is based on one of these examples, you may have such sections as well. They used to be ignored, but with the BrainScript update, defining a new network in these sections now has a meaning (model editing), so they are no longer ignored and should therefore be removed.
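A "write" section affected by this might look like the following sketch. The section name write and the elided settings (...) are placeholders; the only point illustrated is that the nested NDLNetworkBuilder block is deleted and nothing is put in its place.

```
# change from:
write = [
    action = "write"
    NDLNetworkBuilder = [
        networkDescription = "yourNetwork.ndl"   # extraneous; used to be ignored
    ]
    ...
]
# ...to:
write = [
    action = "write"
    ...
]
```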
The syntax of the deprecated NDLNetworkBuilder is:
NDLNetworkBuilder = [
networkDescription = "yourNetwork.ndl"
]
The NDLNetworkBuilder block has the following parameters:
-
networkDescription: the file path of the network description file. With the deprecated NDLNetworkBuilder, it was customary to use the file extension .ndl. If no networkDescription parameter is specified, the network description is assumed to be inlined in the same NDLNetworkBuilder subblock, specified with the run parameter below. Note that only one file path may be specified via the networkDescription parameter. To load multiple files of macros, use the ndlMacros parameter.
-
run: the block of the NDL that will be executed. If an external NDL file is specified via the networkDescription parameter, the run parameter identifies a block in that file. This parameter overrides any run parameters that may already exist in the file. If no networkDescription file is specified, the run parameter identifies a block in the current configuration file.
-
load: the blocks of NDL scripts to load. Multiple blocks can be specified via a ":" separated list. The blocks specified by the load parameter typically contain macros for use by the run block. Similar to the run parameter, the load parameter identifies blocks in an external NDL file and overrides any load parameters that may already exist in the file, if a file is specified by the networkDescription parameter. If no networkDescription file is specified, load identifies a block in the current configuration file.
-
ndlMacros: the file path from which NDL macros may be loaded. This parameter is usually used to load a default set of NDL macros that can be used by all NDL scripts. Multiple NDL files, each specifying different sets of macros, can be loaded by specifying a "+" separated list of file paths for this ndlMacros parameter. In order to share macros with other command blocks such as NDL's model-editing language (MEL) blocks, you should define it at the root level of the configuration file.
-
randomSeedOffset: a non-negative random seed offset value used in initializing the learnable parameters. The default value is 0. This allows users to run experiments with different random initializations.
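Putting the parameters together, a deprecated NDLNetworkBuilder block that uses all of them might look like this sketch. The file names more.ndl and the block name extraMacros are hypothetical; the other names are taken from the examples earlier in this guide.

```
NDLNetworkBuilder = [
    ndlMacros = "shared.ndl+more.ndl"        # "+"-separated list of macro files
    networkDescription = "yourNetwork.ndl"   # exactly one file path
    load = ndlMnistMacros:extraMacros        # ":"-separated list of blocks in that file
    run = DNN                                # block in that file to execute
    randomSeedOffset = 1                     # non-default offset for a different random initialization
]
```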