We use a Logger object that stores data as lists of "values" associated with "keys" in a Python dictionary. This dictionary is held in RAM. At the end of a train epoch or eval epoch, Logger creates/flushes a logs.json file in the experiment directory.
Its problems
- If the code crashes before a flush, the data since the last flush is lost; yet we want to use Logger to monitor things such as CPU or GPU memory usage right before a crash!
- We must rewrite the full JSON file each time a new value is added.
- We must reload the full JSON file each time a new value is added in order to visualize anything.
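To make the problem concrete, here is a minimal sketch of such a logger (class and method names are illustrative, not the actual implementation): every flush serializes the entire dictionary, and anything logged since the previous flush is lost on a crash.

```python
import json
import os

class Logger:
    """Sketch of the current approach: everything lives in RAM,
    and every flush rewrites the complete JSON file."""

    def __init__(self, exp_dir):
        self.path = os.path.join(exp_dir, "logs.json")
        self.values = {}  # key -> list of values, all held in RAM

    def log_value(self, key, value):
        self.values.setdefault(key, []).append(value)

    def flush(self):
        # Rewrites the FULL dictionary on every call; anything logged
        # since the last flush is lost if the process crashes first.
        with open(self.path, "w") as f:
            json.dump(self.values, f)
```

Because `flush()` dumps the whole dictionary, the write cost grows with the total number of logged values, and readers likewise have to load the full file to look at a single key.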
Our constraints
- We want to keep our logs in the experiment directory (no external SQL/NoSQL database servers; SQLite maybe?).
- We want to write new values only (for instance, at epoch 10 we only write the values of epoch 10).
- We want concurrent reads and writes (at least on different keys).
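As a sketch of how SQLite could meet these constraints (the `logs(key, step, value)` schema and the WAL setting are assumptions, not a decided design): WAL journal mode lets readers query while a writer appends, each insert touches only the new row, and the database file lives in the experiment directory.

```python
import sqlite3

def open_logs(db_path):
    """Open (or create) a logs database inside the experiment directory.
    WAL mode lets readers run concurrently with a single writer."""
    conn = sqlite3.connect(db_path)
    conn.execute("PRAGMA journal_mode=WAL")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS logs ("
        " key TEXT NOT NULL,"      # e.g. 'train_epoch.loss'
        " step INTEGER NOT NULL,"  # e.g. the epoch number
        " value REAL,"
        " PRIMARY KEY (key, step))"
    )
    return conn

def log_value(conn, key, step, value):
    # Insert only the new value -- no full-file rewrite.
    with conn:  # commits immediately, so a later crash loses nothing
        conn.execute("INSERT OR REPLACE INTO logs VALUES (?, ?, ?)",
                     (key, step, value))
```

Each insert is durable as soon as its transaction commits, so a crash loses at most the row being written, and a reader in another process can plot metrics while training keeps logging.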
Some propositions
The following tools store the data on the file system (not in RAM).
SQLite, for instance, requires a library to read, and the user must know SQL to write custom queries/applications (we could add a wrapper over SQLite in Logger).
Experiments comparison in SQLite

```python
databases = []
for experiment in all_experiments:
    databases.append(open...)

for experiment, database in zip(all_experiments, databases):
    for metric in list_of_metrics:
        min_metric = select...  # may already be in cache
        max_metric = select...  # may already be in cache
        # (use them here to aggregate in Python)
```
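Fleshed out with Python's built-in `sqlite3` module, and assuming a hypothetical `logs(key, step, value)` table in each experiment's database with `all_experiments` being a list of database paths, the sketch might become:

```python
import sqlite3

def compare_experiments(all_experiments, list_of_metrics):
    """Compute the min/max of each metric across experiments,
    letting SQLite do the aggregation instead of Python."""
    databases = [sqlite3.connect(path) for path in all_experiments]
    summary = {}
    for experiment, database in zip(all_experiments, databases):
        for metric in list_of_metrics:
            row = database.execute(
                "SELECT MIN(value), MAX(value) FROM logs WHERE key = ?",
                (metric,),
            ).fetchone()
            summary[(experiment, metric)] = row  # (min, max)
    for database in databases:
        database.close()
    return summary
```

Pushing MIN/MAX into SQL means only two numbers per metric cross the database boundary, instead of loading every logged value into Python.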
Cadene changed the title from "Improve logging with SQLite" to "Improve logging (logs.json) with SQLite" on Mar 29, 2020.
tl;dr: SQLite will replace logs.json
Some propositions
The following tools store the data on the file system (not in RAM).

H5PY (one file)
- Pros: direct access by key and index, e.g. `data['train_epoch.epoch'][10]`
- Cons: …

LMDB
- Pros: …
- Cons: …

netCDF
- …

One CSV per key / or binary file
- Pros: …
- Cons: …

SQLite
- Pros: …
- Cons: requires a library to read; the user must know SQL for custom queries (as noted above)
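The "wrapper over SQLite in Logger" idea mentioned above could hide all SQL behind the existing key/value interface. A sketch, where the class name, schema, and methods are assumptions rather than an agreed design:

```python
import sqlite3

class SQLiteLogger:
    """Hypothetical wrapper so users never write SQL themselves:
    the same key/value interface as the JSON-based Logger."""

    def __init__(self, db_path):
        self.conn = sqlite3.connect(db_path)
        self.conn.execute("PRAGMA journal_mode=WAL")
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS logs (key TEXT, value REAL)"
        )

    def log_value(self, key, value):
        # Appends one row and commits at once: new values only,
        # and nothing already written is lost on a crash.
        with self.conn:
            self.conn.execute("INSERT INTO logs VALUES (?, ?)",
                              (key, value))

    def values(self, key):
        # Read back the list of values for one key, in insertion order.
        rows = self.conn.execute(
            "SELECT value FROM logs WHERE key = ? ORDER BY rowid", (key,)
        ).fetchall()
        return [v for (v,) in rows]
```

Custom queries would still be possible for power users by opening the same file with any SQLite client, while everyone else keeps the familiar `log_value` interface.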