We use pipenv to manage dependencies and the virtual environment:
$ pipenv install
$ pipenv shell
(mac-graph-sjOzWQ6Y) $
You can watch the model predict values from the hold-back data:
$ python -m macgraph.predict --name my_dataset --model-version 0ds9f0s
predicted_label: shabby
actual_label: derilict
src: How <space> clean <space> is <space> 3 ? <unk> <eos> <eos>
-------
predicted_label: small
actual_label: medium-sized
src: How <space> big <space> is <space> 4 ? <unk> <eos> <eos>
-------
predicted_label: medium-sized
actual_label: tiny
src: How <space> big <space> is <space> 7 ? <unk> <eos> <eos>
-------
predicted_label: True
actual_label: True
src: Does <space> 1 <space> have <space> rail <space> connections ? <unk>
-------
predicted_label: True
actual_label: False
src: Does <space> 0 <space> have <space> rail <space> connections ? <unk>
-------
predicted_label: victorian
actual_label: victorian
src: What <space> architectural <space> style <space> is <space> 1 ? <unk>
TODO: Get it predicting from your typed input
To train the model, you need training data.
If you want to skip this step, you can download the pre-built data from our public dataset. This repo is a work in progress, so the format is still in flux.
The underlying data (a Graph-Question-Answer YAML from CLEVR-graph) must be pre-processed for training and evaluation. The YAML is transformed into TensorFlow records, and split into train-evaluate-predict tranches.
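Conceptually, the tranche split looks like the sketch below. This is illustrative only, not the actual build code: the real ratios and record format may differ (an 80/10/10 split is assumed here).

```python
import random

def split_tranches(records, ratios=(0.8, 0.1, 0.1), seed=0):
    """Shuffle records deterministically, then slice into three tranches.

    `ratios` is an assumed (train, eval, predict) split, not the repo's
    actual configuration.
    """
    records = list(records)
    random.Random(seed).shuffle(records)
    n = len(records)
    n_train = int(n * ratios[0])
    n_eval = int(n * ratios[1])
    return {
        "train": records[:n_train],
        "eval": records[n_train:n_train + n_eval],
        "predict": records[n_train + n_eval:],
    }
```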
First generate a gqa.yaml with the command:
clevr-graph$ python -m gqa.generate --count 50000 --int-names
Then copy it into mac-graph:
clevr-graph$ cp data/gqa-some-id.yaml ../mac-graph/input_data/raw/my_dataset.yaml
Then build (that is, pre-process into a vocab table and tfrecords) the data:
mac-graph$ python -m macgraph.input.build --name my_dataset
Optional arguments:
- `--limit N` will only read N records from the YAML and only output a total of N TFRecords (split across the three tranches)
- `--type-string-prefix StationProperty` will filter to just the questions whose type string begins with "StationProperty"
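The two flags compose roughly as below. This is a hedged sketch, not the actual build code: the record key `type_string` and the order of operations (filter first, then cap the count) are assumptions.

```python
import itertools

def select_records(records, type_string_prefix=None, limit=None):
    """Mimic --type-string-prefix and --limit: filter, then cap the count."""
    if type_string_prefix is not None:
        # `type_string` is an assumed field name for the question type.
        records = (r for r in records
                   if r["type_string"].startswith(type_string_prefix))
    if limit is not None:
        records = itertools.islice(records, limit)
    return list(records)
```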
Let's build a model. (Note: this requires the training data built in the previous section.)
General advice is to have at least 40,000 training records (e.g. build them from 50,000 GQA triples).
mac-graph$ python -m macgraph.train --name my_dataset
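The 40,000-from-50,000 advice above follows from the train tranche getting roughly 80% of the records (an assumed split, not a figure from the repo). A quick back-of-envelope helper:

```python
import math

def gqa_count_for_train_size(train_records, train_fraction=0.8):
    """How many GQA triples to generate for a target number of training
    records, assuming `train_fraction` of records land in the train tranche."""
    return math.ceil(train_records / train_fraction)
```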