Remove deprecated EventStore based I/O functionality #485

Merged
merged 37 commits into from
Dec 13, 2023
Changes from 35 commits
c78fc01
Remove EventStore related files and make things compile again
tmadlener Sep 15, 2023
9d5a064
Remove EventStore python bindings
tmadlener Sep 15, 2023
642f80a
Remove unnecessary EventStore
tmadlener Sep 15, 2023
a125ae6
Remove implementation files and python tests
tmadlener Oct 12, 2023
8c22306
Move test case to Frame
tmadlener Oct 11, 2023
211ed0a
Remove test case that no longer applies
tmadlener Oct 11, 2023
295fd03
Remove no longer existing files from being installed
tmadlener Oct 12, 2023
8b185b8
Switch relation_range test to Frame based I/O
tmadlener Oct 12, 2023
ca555f6
Adapt legacy reader test for root
tmadlener Oct 12, 2023
9fa7477
Adapt the legacy tests for sio
tmadlener Oct 12, 2023
8b83519
Switch to downloaded legacy inputs for podio-dump tests
tmadlener Oct 12, 2023
184c151
Switch pyunittests to use downloaded legacy data
tmadlener Oct 12, 2023
fff8dad
Remove more mentions of EventStore
tmadlener Oct 12, 2023
0467a99
Remove now unused files
tmadlener Oct 12, 2023
a0592ed
Remove EventStore remnants from I/O tests
tmadlener Oct 12, 2023
22bd529
Rename store to event
tmadlener Oct 12, 2023
99a5a3c
Remove test case that is covered in unittests
tmadlener Oct 12, 2023
9cb049a
Fix clang-tidy complaints
tmadlener Oct 12, 2023
5932572
Adapt test environment based on availability of test data
tmadlener Oct 12, 2023
9f5f60a
Skip tests if data not available
tmadlener Oct 12, 2023
59afde6
Remove EventStore from UserDataCollection doc
tmadlener Oct 13, 2023
a91e519
Update backend documentation to remove EventStore mentions
tmadlener Oct 13, 2023
412da04
Update main documentation to remove EventStore
tmadlener Oct 13, 2023
bf7672e
Remove spurious whitespace
tmadlener Oct 13, 2023
734c96e
Remove even more spurious whitespace
tmadlener Oct 13, 2023
c602c4a
Remove nonexistant test again after rebase
tmadlener Dec 8, 2023
4fc18c2
[format]: Fix style comments
tmadlener Dec 12, 2023
40577cf
Make python unittests work again
tmadlener Dec 12, 2023
ab337a5
Make SIO legacy tests run
tmadlener Dec 12, 2023
1cfc878
Switch podio-dump tests to use ExternalData
tmadlener Dec 12, 2023
a0fcdfa
Add neessary Frame include
tmadlener Dec 12, 2023
c0423f2
Guard GenericParameter friend-ness for RNTuple support
tmadlener Dec 12, 2023
8d7f26e
Adapt sanitizer ignored test cases after renaming some tests
tmadlener Dec 12, 2023
4004079
Use usual test environment for legacy tests
tmadlener Dec 12, 2023
65fbe65
Remove unnecessary cmake variable
tmadlener Dec 13, 2023
b58098c
Fix documentation typo
tmadlener Dec 13, 2023
9016223
Move status message to more appropriate place
tmadlener Dec 13, 2023
9 changes: 9 additions & 0 deletions CMakeLists.txt
@@ -191,11 +191,20 @@ add_subdirectory(src)
SET(podio_PYTHON_DIR ${PROJECT_SOURCE_DIR}/python CACHE PATH "Path to the podio python directory")

if(BUILD_TESTING)
include(ExternalData)
list(APPEND ExternalData_URL_TEMPLATES
"https://key4hep.web.cern.ch:443/testFiles/podio/%(hash)"
)
include(cmake/podioTest.cmake)
add_subdirectory(tests)
endif()
add_subdirectory(tools)
add_subdirectory(python)


if(BUILD_TESTING)
# Make sure to fetch all data, after all legacy test cases have been added
ExternalData_Add_Target(legacy_test_cases)
endif()
#--- add CMake infrastructure --------------------------------------------------
include(cmake/podioCreateConfig.cmake)
9 changes: 6 additions & 3 deletions cmake/podioTest.cmake
@@ -3,9 +3,8 @@
function(PODIO_SET_TEST_ENV test)
# We need to convert this into a list of arguments that can be used as environment variable
list(JOIN PODIO_IO_HANDLERS " " IO_HANDLERS)
set_property(TEST ${test}
PROPERTY ENVIRONMENT
LD_LIBRARY_PATH=${PROJECT_BINARY_DIR}/tests:${PROJECT_BINARY_DIR}/src:$<TARGET_FILE_DIR:ROOT::Tree>:$<$<TARGET_EXISTS:SIO::sio>:$<TARGET_FILE_DIR:SIO::sio>>:$ENV{LD_LIBRARY_PATH}
set(test_environment
LD_LIBRARY_PATH=${PROJECT_BINARY_DIR}/tests:${PROJECT_BINARY_DIR}/src:$<TARGET_FILE_DIR:ROOT::Tree>:$<$<TARGET_EXISTS:SIO::sio>:$<TARGET_FILE_DIR:SIO::sio>>:$ENV{LD_LIBRARY_PATH}
PYTHONPATH=${PROJECT_SOURCE_DIR}/python:$ENV{PYTHONPATH}
PODIO_SIOBLOCK_PATH=${PROJECT_BINARY_DIR}/tests
ROOT_INCLUDE_PATH=${PROJECT_BINARY_DIR}/tests/datamodel:${PROJECT_SOURCE_DIR}/include
@@ -14,6 +13,10 @@ function(PODIO_SET_TEST_ENV test)
PODIO_USE_CLANG_FORMAT=${PODIO_USE_CLANG_FORMAT}
PODIO_BASE=${PROJECT_SOURCE_DIR}
ENABLE_SIO=${ENABLE_SIO}
PODIO_BUILD_BASE=${PROJECT_BINARY_DIR}
)
set_property(TEST ${test}
PROPERTY ENVIRONMENT "${test_environment}"
)
endfunction()

37 changes: 11 additions & 26 deletions doc/advanced_topics.md
@@ -32,26 +32,14 @@ Before writing out a collection, the data need to be put into the proper structu

### Reading Back-End

There are two possibilities to implement a reading-back end. In case one uses the `podio::EventStore`, one simply has to implement the `IReader` interface.

If not taking advantage of this implementation, the data reader or the event store have to implement the `ICollectionProvider` interface. Reading of a collection happens then similar to:

```cpp
// ...
// your creation of the collection and reading of the PODs from disk
// ...
collection->setBuffer(buffer);
auto refCollections = collection->referenceCollections();
// ...
// your filling of refCollections from disk
// ...
collection->setID( <collection ID read from disk> );
collection->prepareAfterRead();
// ...
collection->setReferences( &collectionProvider );
```

The strong assumption here is that all references are being followed up directly and no later on-demand reading is done.
The main requirement for a reading backend is its capability of reading back all
the necessary data from which a collection can be constructed in the form of
`podio::CollectionReadBuffers`. From these buffers collections can then be
constructed. Each instance has to contain the (type erased) POD buffers (as a
`std::vector`), the (possibly empty) vectors of `podio::ObjectID`s that contain
the relation information as well as the (possibly empty) vectors for the vector
member buffers, which are currently stored as pairs of the type (as a
`std::string`) and (type erased) data buffers in the form of `std::vector`s.
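
The layout described above can be pictured with a small, self-contained sketch. Everything in it is an assumption made for illustration: `ToyReadBuffers`, `ToyObjectID`, `ToyHitData`, `ToyStorage` and the helper functions are hypothetical names, and the sketch does not reproduce the actual definition of `podio::CollectionReadBuffers`. It only shows the three ingredients the text lists: a type erased POD buffer, vectors of object IDs for relations, and (type name, type erased buffer) pairs for vector members.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// All names below are hypothetical stand-ins, not podio types.
struct ToyObjectID {
  int index;
  unsigned collectionID;
};

// A made-up POD as it might be stored for one collection element
struct ToyHitData {
  double x, y, z, energy;
};

// Toy version of the three ingredients listed in the text
struct ToyReadBuffers {
  void* data{nullptr};                                         // type erased std::vector<ToyHitData>
  std::vector<std::vector<ToyObjectID>> references{};          // one vector of IDs per relation
  std::vector<std::pair<std::string, void*>> vectorMembers{};  // (type name, type erased buffer)
};

// Owning storage that a hypothetical backend would fill while reading a file
struct ToyStorage {
  std::vector<ToyHitData> hits;
  std::vector<float> weights;
};

// Bundle non-owning, type erased views of the storage into read buffers
inline ToyReadBuffers makeBuffers(ToyStorage& storage) {
  ToyReadBuffers buffers;
  buffers.data = &storage.hits;
  buffers.references.push_back(std::vector<ToyObjectID>{ToyObjectID{0, 42u}});
  buffers.vectorMembers.emplace_back("float", static_cast<void*>(&storage.weights));
  return buffers;
}

// Code that knows the concrete type can restore it with a cast
inline std::size_t countHits(const ToyReadBuffers& buffers) {
  const auto* hits = static_cast<const std::vector<ToyHitData>*>(buffers.data);
  return hits->size();
}

inline std::size_t demoReadBuffers() {
  ToyStorage storage;
  storage.hits = {{0.0, 0.0, 1.0, 0.5}, {1.0, 2.0, 3.0, 1.5}};
  storage.weights = {0.1f, 0.2f};
  const auto buffers = makeBuffers(storage);
  return countHits(buffers);
}
```

In this toy version the buffers do not own the data they point to. How ownership is handled in podio itself is not covered by the text above, so the sketch makes no claim about it.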

### Dumping JSON

@@ -94,12 +82,9 @@ As explained in the section about mutability of data, thread-safety is only guar
During the calls of `prepareForWriting` and `prepareAfterReading` on collections other operations like object creation or addition will lead to an inconsistent state.

### Not-thread-safe components
The example event store provided with PODIO is as of writing not thread-safe. Neither is the chosen serialization.

## Implementing a transient Event Class

PODIO contains one example `podio::EventStore` class.
To implement your own transient event store, the only requirement is to set the collectionID of each collection to a unique ID on creation.
The Readers and Writers that ship with podio are assumed to run on a single
thread only (more precisely we assume that each Reader or Writer doesn't have to
synchronize with any other for file operations).

## Running pre-commit

66 changes: 18 additions & 48 deletions doc/examples.md
@@ -102,28 +102,30 @@ Passing in a size argument is optional; If no argument is passed all elements wi
if an argument is passed only as many elements as requested will be returned.
If the collection holds fewer elements than are requested, only as many elements as are available will be returned.

### EventStore functionality
### `podio::Frame` container

The event store contained in the package is for *educational* purposes and kept very minimal. It has two main methods:
The `podio::Frame` is the main container for grouping collections together. It
has two main methods:

```cpp
/// create a new collection
/// Store a collection
template<typename T>
T& create(const std::string& name);
const T& put(T&& coll, const std::string& name);

/// access a collection.
/// access a collection
template<typename T>
const T& get(const std::string& name);
const T& get(const std::string& name);
```

Please note that a `put` method for collections is not foreseen.
Note that for `put`ting collections into the Frame an explicit `std::move` is
necessary to highlight the change of ownership that happens in this case.
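
To illustrate why the explicit `std::move` is required, here is a minimal, self-contained sketch of the same ownership pattern. `ToyFrame` and `ToyHitCollection` are hypothetical names introduced only for this example. The sketch is not the `podio::Frame` implementation and handles just one collection type; it only assumes that `put` takes an rvalue reference, as in the declaration above.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for a generated collection type (not a podio class)
struct ToyHitCollection {
  std::vector<double> energies;
};

// Toy container mimicking the put/get pattern; this is NOT podio::Frame
class ToyFrame {
public:
  // The rvalue reference parameter forces callers to write std::move,
  // which makes the transfer of ownership visible at the call site
  const ToyHitCollection& put(ToyHitCollection&& coll, const std::string& name) {
    auto [it, inserted] =
        m_colls.try_emplace(name, std::make_unique<ToyHitCollection>(std::move(coll)));
    if (!inserted) {
      throw std::invalid_argument("collection '" + name + "' already exists");
    }
    return *it->second;
  }

  // Read-only access; throws std::out_of_range for unknown names
  const ToyHitCollection& get(const std::string& name) const {
    return *m_colls.at(name);
  }

private:
  std::unordered_map<std::string, std::unique_ptr<ToyHitCollection>> m_colls;
};

inline std::size_t demoPutGet() {
  ToyFrame frame;
  ToyHitCollection hits;
  hits.energies = {1.5, 2.5, 3.0};
  // frame.put(hits, "hits") would not compile: an lvalue cannot bind to &&
  frame.put(std::move(hits), "hits");
  // From here on the collection should only be accessed through the frame
  return frame.get("hits").energies.size();
}
```

After the call to `put`, the local `hits` variable is in a moved-from state, so the sketch reads the collection back through `get` instead.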

### Object Retrieval

Collections can be retrieved explicitly:

```cpp
auto& hits = store.get<HitCollection>("hits");
auto& hits = frame.get<HitCollection>("hits");
if (hits.isValid()) { ... }
```

@@ -135,51 +137,19 @@ Or implicitly when following an object reference. In both cases the access to da
Sometimes it is necessary or useful to store additional data that is not directly foreseen in the EDM.
This could be configuration parameters of simulation jobs, or parameter descriptions like cell-ID encoding etc. PODIO currently allows storing such meta data in a `GenericParameters` class that
holds an arbitrary number of named parameters of type `int, float, string` or vectors of these.
Meta data can be stored and retrieved from the `EventStore` for runs, collections and events via
the three methods:
```cpp
virtual GenericParameters& EventStore::getRunMetaData(int runID);
Review comment (Collaborator): should we add a small explanation that one just creates a frame if there is a metadata use case? Just to explain a bit more the frame concept
Review comment (Collaborator): for a non-default one where putting a parameter is not enough
Review comment (Collaborator Author): Should we add this in this PR still, or do we push it into the documentation update?
Review comment (Collaborator): That can go to a separate PR

virtual GenericParameters& EventStore::getEventMetaData();
virtual GenericParameters& EventStore::getCollectionMetaData(int colID);
```

- example for writing event data:
```cpp
auto& evtMD = store.getEventMetaData() ;
evtMD.setValue( "UserEventWeight" , (float) 100.*i ) ;
```
- example for reading event data:
```cpp
auto& evtMD = store.getEventMetaData() ;
float evtWeight = evtMD.getFloatVal( "UserEventWeight" ) ;

```

- example for writing collection meta data:

```cpp
auto& hits = store.create<ExampleHitCollection>("hits");
// ...
auto& colMD = store.getCollectionMetaData( hits.getID() );
colMD.setValue("CellIDEncodingString","system:8,barrel:3,layer:6,slice:5,x:-16,y:-16");
```

- example for reading collection meta data

```cpp
auto colMD = store.getCollectionMetaData( hits.getID() );
std::string es = colMD.getStringVal("CellIDEncodingString") ;
```

Meta data can be stored and retrieved from the `Frame` via the templated `putParameter` and `getParameter` methods.
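
The idea of templated `putParameter` and `getParameter` methods can be sketched with a small stand-alone class. `ToyParameters` is a hypothetical name, and the signatures and return types below are assumptions chosen for this example (it needs C++17); they are not taken from the podio headers, where the actual interface may differ. The supported value types follow the list given above for `GenericParameters`.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>
#include <utility>
#include <variant>
#include <vector>

// Hypothetical toy class, not part of podio
class ToyParameters {
public:
  // Store a value under a name; T has to be one of the supported types
  template <typename T>
  void putParameter(const std::string& key, T value) {
    m_values[key] = std::move(value);
  }

  // Returns an empty optional if the key is unknown or holds another type
  template <typename T>
  std::optional<T> getParameter(const std::string& key) const {
    const auto it = m_values.find(key);
    if (it == m_values.end()) {
      return std::nullopt;
    }
    if (const auto* value = std::get_if<T>(&it->second)) {
      return *value;
    }
    return std::nullopt;
  }

private:
  // int, float, string or vectors of these
  using Value = std::variant<int, float, std::string, std::vector<int>,
                             std::vector<float>, std::vector<std::string>>;
  std::map<std::string, Value> m_values;
};

inline float demoParameters() {
  ToyParameters params;
  params.putParameter("UserEventWeight", 100.0f);
  // Asking for the type that was stored yields the value; -1 marks "not found"
  return params.getParameter<float>("UserEventWeight").value_or(-1.0f);
}
```

Returning an optional is a design choice of this sketch: it lets the caller distinguish a missing parameter from a stored default value.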

#### Python Interface

The class `EventStore` provides all the necessary (read) access to event files. It can be used as follows:
The `Reader` and `Writer` classes in the `root_io` and `sio_io` submodules
provide all the necessary functionality to read and write event files. An
example of reading files looks like this:


```python
from EventStore import EventStore
store = EventStore(<list of files>)
for event in store:
from podio.root_io import Reader
reader = Reader("one or many input files")
for event in reader.get("events"):
hits = event.get("hits")
for hit in hits:
# ...
11 changes: 6 additions & 5 deletions doc/userdata.md
@@ -6,17 +6,18 @@ data* via the `podio::UserDataCollection`. It gives the user access to a
the data stored in the EDM classes for each event.

## Example usage
Creating or getting a `UserDataCollection` via the `EventStore` works the same
as with any other collection of the EDM via the `create` or `get` functions:
Creating or getting a `UserDataCollection` via the `Frame` works the same
as with any other collection of the EDM via the `put` or `get` functions:

```cpp
#include "podio/UserDataCollection.h"

// Create a collection
auto& userFloats = store.create<podio::UserDataCollection<float>>("userFloats");
// Create a collection and put it into a Frame
auto userFloats = podio::UserDataCollection<float>();
frame.put(std::move(userFloats), "userFloats");

// get a collection
const auto& userData = store.get<podio::UserDataCollection<float>>("userFloats");
const auto& userData = frame.get<podio::UserDataCollection<float>>("userFloats");
```

The interface of the `UserDataCollection` is similar to a basic version of the
80 changes: 0 additions & 80 deletions include/podio/ASCIIWriter.h

This file was deleted.
