Add (optional at compile-time) G3Frame JSON output #69

cozzyd · 2021-12-17T23:28:01Z

I envision it being useful to access data in .g3 format from web
applications for monitoring purposes. For that reason, it would
be useful to have a way to convert any data to JSON (one possible alternative
would be a .g3 binary file reader in javascript, but that is much more
work).

Fortunately, cereal supports JSON as an archive format, so adding
JSON output is nearly trivial. We mostly just need to make sure
the JSONOutputArchive versions of all the serializations are compiled,
and write a saveJSON() method for G3Frame. An asJSON() (as_json() in
Python) method is also provided that returns a string.

There are a few small differences from binary output:

- Don't bother emitting CRC sums
- Don't FLAC encode, ever
- Output the character instead of the number for the frame type

Currently, this is only enabled by a new compile-time option (the cmake
variable ENABLE_JSON_OUTPUT), though in the future it will probably be
enabled by default if it doesn't break anything. It does add moderately
to binary size and compile time, but hopefully that's not a huge deal.
The asJSON/as_json methods still exist without JSON support, but return
an error in JSON format.
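
A consumer that cannot know whether the library was built with ENABLE_JSON_OUTPUT might handle both outcomes along these lines. This is only a sketch: the shape of the error document is not specified above, so the top-level "error" key, the helper name parse_frame_json, and the sample strings are all assumptions, and the input string stands in for whatever as_json() returns.

```python
import json


def parse_frame_json(frame_json):
    """Parse a string such as the one returned by as_json().

    Assumption: a build without JSON support reports the problem as a JSON
    object with a top-level "error" key. The real key name may differ.
    """
    doc = json.loads(frame_json)
    if isinstance(doc, dict) and "error" in doc:
        raise RuntimeError("JSON output unavailable: %s" % doc["error"])
    return doc


# Hypothetical strings standing in for real as_json() output.
ok_text = '{"type": "T", "values": {"x": 1.5}}'
err_text = '{"error": "compiled without JSON output"}'

print(parse_frame_json(ok_text)["type"])
try:
    parse_frame_json(err_text)
except RuntimeError as exc:
    print(exc)
```

Raising on the error document keeps callers from mistaking it for an empty frame.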

Also included is a new script, spt3g-jsonify, that will read in a
.g3[.gz] file and output a JSON stream as a proof of concept.
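
On the consuming side, a stream like this could be read one document at a time. The sketch below assumes the stream is a plain concatenation of JSON documents separated by optional whitespace, which is one plausible reading of "json stream" and is not confirmed above. The helper name iter_json_documents and the sample text with its keys are made up for illustration.

```python
import json


def iter_json_documents(text):
    """Yield each JSON document from a string of concatenated documents."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        # Skip whitespace between documents; raw_decode does not accept it.
        if text[pos].isspace():
            pos += 1
            continue
        # raw_decode returns the parsed object and the index where it ended.
        doc, pos = decoder.raw_decode(text, pos)
        yield doc


# Made-up sample: two frame-like objects back to back.
sample = '{"type": "T", "n": 1}\n{"type": "S", "n": 2}\n'
frames = list(iter_json_documents(sample))
print([f["type"] for f in frames])
```

If the real output turns out to be a single JSON array instead, a plain json.loads() call would be enough.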

There are a few places where other code had to be modified because
cereal's text and binary archives expose different APIs for binary data.
The affected code is never actually run, but it is still generated and
must compile.

Still TODO:

- docs
- tests
- example HTTP endpoint (likely using boost::beast or equivalent to produce gzipped JSON)

CMakeLists.txt

By using the PUBLIC target, it both affects the compilation of core and anything compiled against core.

arahlin · 2021-12-18T14:31:42Z

By the way, I cherry-picked your cmake action fix onto master. Thanks for fixing that!

arahlin · 2024-02-27T23:19:45Z

I think it probably makes some sense to move the python GIL / threading context machinery to a separate PR, since it's used in a few different places (G3PipelineInfo, G3Reader, G3Writer, G3EventBuilder...) and is not specific to this particular feature.

This PR creates a new class that simplifies initialization of python threads, as well as acquiring / releasing the Python global interpreter lock (GIL) in various contexts. Use cases include:

1. Ensuring that Py_Initialize() is properly called at the beginning of a program that is expected to interact with the python interpreter, and also that Py_Finalize() is called when the program is finished.
2. Ensuring that the current thread state is saved and the GIL released as necessary, e.g. for IO operations, and then the thread state is restored on completion.
3. Ensuring that the GIL is acquired for one-off interaction with the python interpreter, and released when complete.

A G3PythonContext object is used throughout the library code for cases 2 and 3. If the python interpreter has not been initialized (i.e. the compiled program is expected to be purely in C++), then these context objects are essentially no-ops. If the python interpreter is initialized (e.g. inside a python program or command-line interface), then these context objects will handle the GIL appropriately.

See the examples/cppexample.cxx C++ program for a simple implementation of the above behavior.

This PR also adds logic throughout the G3PipelineInfo and G3ModuleConfig class definitions to enable them to serialize appropriately in a pure-C++ program.

Use a G3MapFrameObject storage structure for the module arguments, rather than a map of python objects. Since the serialization process requires a call to repr() for non-G3FrameObjects anyway, do this step in the python shim that creates the config in the first place. Also ensure that simple scalar values are serialized as frame objects. Adds a new ``spt3g.core.to_g3frameobject`` function for converting python objects to G3FrameObjects.
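
The idea of converting arguments once, at config-creation time, can be sketched in plain Python. This is only an illustration under stated assumptions: tagged tuples stand in for G3FrameObjects, encode_arguments and decode_arguments are hypothetical helpers that do not exist in spt3g, and ast.literal_eval is swapped in as a safe way to read a repr() string back. How the library itself evaluates stored strings is not described here.

```python
import ast

# Assumption: these count as the "simple scalar values" stored directly.
SCALAR_TYPES = (bool, int, float, str)


def encode_arguments(kwargs):
    """Keep scalars as values and store everything else as a repr() string."""
    encoded = {}
    for name, value in kwargs.items():
        if isinstance(value, SCALAR_TYPES):
            encoded[name] = ("scalar", value)
        else:
            encoded[name] = ("repr", repr(value))
    return encoded


def decode_arguments(encoded):
    """Recover the original values from the tagged form."""
    decoded = {}
    for name, (kind, payload) in encoded.items():
        if kind == "scalar":
            decoded[name] = payload
        else:
            # literal_eval only accepts Python literals, never arbitrary code.
            decoded[name] = ast.literal_eval(payload)
    return decoded


args = {"n_frames": 3, "label": "cal", "bands": [90, 150], "opts": {"flag": None}}
stored = encode_arguments(args)
print(stored["bands"])
print(decode_arguments(stored) == args)
```

Doing the repr() step up front means the stored form never holds live Python objects, so it can be serialized without touching the interpreter.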

arahlin · 2024-03-01T15:32:38Z

The context PR is merged, but you probably need #148 (also merged, should fix your latest build failure) and #147 (pending review) to make this work.

cozzyd · 2024-03-01T15:47:36Z

Hmm, I wonder why it worked for me without updating the c++ standard...

arahlin · 2024-03-03T18:22:38Z

Ok, you should be able to merge with master, so that just your json changes would be part of this PR now!

cozzyd added 2 commits December 17, 2021 17:00

blindly try to fix the cmake action

1d88104

arahlin requested changes Dec 18, 2021

move ENABLE_JSON_OUTPUT to core/CmakeLists.txt

740ac3c

By using the PUBLIC target, it both affects the compilation of core and anything compiled against core.

arahlin assigned cozzyd Dec 18, 2021

cozzyd and others added 12 commits April 22, 2022 09:34

move ENABLE_JSON_OUTPUT to core/CmakeLists.txt

1fa4c82

By using the PUBLIC target, it both affects the compilation of core and anything compiled against core.

First attempt at abstracting GIL locking / C++ python initialization

f85006e

formatting

be76b17

make cppexample work with python things

8be8d2c

use python3 instead of python in shebang

42e7064

CmakeLists for cppexample

7724616

initial commit of spt3g-json-serve

50b4273

failed attempts at python thread safety

4609008

merge

3abda84

Merge branch 'master' into json_output

0de19ce

fix name collision with CPython symbol

fa0a597

arahlin mentioned this pull request Feb 27, 2024

Consolidate python GIL / threads context handling #145

Closed

arahlin and others added 10 commits February 27, 2024 23:57

more words

8e949a4

terrible merge

1f56430

update to newest httplib

8fb0d62

Hold GIL while clearing config in G3ModuleInfo

0eea5ee

These are python objects, and if we allow them to be deleted otherwise, bad things happen. This fixes at least most of the concurrency problems I have with reading files that have G3PipelineInfo in them?

split interpreter initialization and GIL handling into separate classes

2b09b89

Hold GIL while clearing config in G3ModuleInfo

17e90d5

These are python objects, and if we allow them to be deleted otherwise, bad things happen. This fixes at least most of the concurrency problems I have with reading files that have G3PipelineInfo in them?

remove spurious python context

2c4f205

don't bother with globals

f4654ff

arahlin added 7 commits February 29, 2024 02:31

ocd

865c9f9

limit eval namespace, values method

b6a1ed5

no print

48ba193

Merge remote-tracking branch 'origin/master' into modconfig_refactor

fb04351

refactor to hide .config attribute from python user to avoid confusion

44ad66a

add test

98edc00

cleanup

170fdbc

cozzyd added 3 commits March 1, 2024 13:32

merge in latest updates to modconfig_refactor branch

24d354b

Merge branch 'master' into json_output

6b46b6d

get rid of destructor definition resulting from crappy merge on my part

da75530

cozzyd and others added 17 commits March 4, 2024 11:18

merge

fae872d

remove spurious space

67843cc

missed part of merge

220e605

somehow ended up with a few extra lines in pipelineinfo.py from earlier merges

b4b1b3f

cleanups

3de2109

style

84db403

big rework of server example, including html output for dir listing

c18afae

increment version, fix json output bug recently introduced

7ee4dcf

add some docs

7b9dc86

rst formatting

fc6bb3d

rename to more descriptive name

e0468f6

Merge branch 'master' into json_output

71d1fa4

Fix build errors

6f41707

undo unnecessary whitespace changes

0ff476a

Fix build errors

d057779

Merge branch 'master' into json_output

d9f03e6

Avoid extra boost libraries

4e4cec5
