Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Load particles according to edm4hep #36

Merged
merged 39 commits into from
Jun 19, 2024

Conversation

brauliorivas
Copy link
Member

@brauliorivas brauliorivas commented Jun 9, 2024

BEGINRELEASENOTES

  • One of the principal worries of dmx is to keep up to date with the edm4hep data event model. Whenever someone wants to explore their data, this should represent the most recent model of edm4hep.
  • A way to keep up to date with the model, is to (periodically, under triggers, when manually run, etc) execute two scripts that will read the yaml definition file for edm4hep and create the output/datatypes.js file that will be used by dmx to detect which members/relations need to be loaded given some file i.e. reads the json data from the events, and loads everything according to output/datatypes.js.

ENDRELEASENOTES

The idea here is to follow the goal mentioned by @tmadlener, to always show the latest version. The output/datatypes.js file will be used to know which members and relations (name + type) has each type of particle. Then, when loading an event, the data will be shown according to the file. However, two things may happen:

  • The data from edm4hep is different from the json event, so in that case, we could ignore those values from the json.
  • The data from edm4hep is the same as in the json event, so everything is shown as expected.

Then, we could have a "general" particle for each type, that will access members and relations, to later have some methods that will allow to draw that particle, or draw a relation towards a type of particle.

Copy link
Contributor

@tmadlener tmadlener left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose this is a somewhat parallel development for #25 with a slightly more automated approach to generating the loading functionality, right?

I think it goes into a good direction already, I have a few comments inline as well. In general, this solves a few of the problems that we will be facing, and it should allow us to always read the latest format files, but I am not yet sure it will allow us to read also older files that have been written with an older version of EDM4hep. Having said that, maybe we don't need to solve that in all its glory now already and should rather try to continue with getting some of the visualization going for these with the current version of EDM4hep.

As far as I can tell, the loading is almost fully there by now, and we could start to prototyping some of the visualization for the things we can load now.

js/types/dynamic.js Outdated Show resolved Hide resolved
Comment on lines 3 to 37
class Reconstruction extends Particle {
constructor() {
super();
}
}

export class Cluster extends Reconstruction {
constructor() {
super();
}
}

export class ParticleID extends Reconstruction {
constructor() {
super();
}
}

export class ReconstructedParticle extends Reconstruction {
constructor() {
super();
}
}

export class Vertex extends Reconstruction {
constructor() {
super();
}
}

export class Track extends Reconstruction {
constructor() {
super();
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure whether it's necessary to differentiate between Particle and Reconstruction in this case. In the end we want all of the datatypes that we find in the yaml file to be on sort of the same level (at least for reading in). We can then still group them in some other way, but we will probably do that manually. Additionally, since the bodies for all of these are effectively the same, I think they could also go into the file that is generated, since we will do it for (almost) all datatypes in any case.

Copy link
Member Author

@brauliorivas brauliorivas Jun 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. I'm going to create only a general EDMObject (as you mentioned) that will probably contain common features across all the objects. Also, I don't think it's necessary to add it to the generated file, because this EDMObject should contain more "logic" than relevant information when loading.

@@ -0,0 +1,8 @@
export class Particle {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a bit poorly named. In the end not everything in the datamodel will be something that resembles a particle. Maybe EDMObject or something like that would be a better name here.

Comment on lines 1 to 7
export const units = {
"charge": "e",
"mass": "GeV",
"time": "ns",
"momentum": "GeV",
"vertex": "mm",
};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this really apply for all of the occurences? vertex in itself doesn't have a unit, the position of a vertex on the other hand probably will, so maybe position would be better here than vertex?

Comment on lines 90 to 96
it("should only load particles of one type", () => {
const loadersConfig = "ReconstructedParticle";

const particles = loadParticles(data, "0", loadersConfig);

expect(particles.hasOwnProperty(loadersConfig)).toBe(true);
});
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you really want to test this, you also need to check that nothing else has been loaded. Otherwise, you are effectively just testing one half of what you want to test (according to the test title).

"edm4hep::MCParticle": {
"members": [
{
"type": "int32_t",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the type useful in this form (i.e. targetting c++). Do we actually need the type for anything at the moment? Otherwise, we could drop it and only bring it back if we really need it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, then I will drop them.

@brauliorivas
Copy link
Member Author

brauliorivas commented Jun 10, 2024

I suppose this is a somewhat parallel development for #25 with a slightly more automated approach to generating the loading functionality, right?

Indeed, this way we get the latest version.

I am not yet sure it will allow us to read also older files that have been written with an older version of EDM4hep

I'm not either. The option is to try to read the common things between versions, but if some particle changes too much, it isn't very helpful. We could (a bit "ugly solution") include (all, some?) versions of edm4hep when generating the file with definitions. This way, we can use the latest version by default, or the one that best fits an object from edm4hep when reading a JSON file. We may also try to fetch the version according to releases but this would require a server (unless the user wants to clone the repo locally and run some sort of script). Another way is to simply read the common members/relations across versions. So for example if the latest version of MCParticle had members a and b, but the user has a file with an MCParticle from an old version with members a and c, then we only read a, b is blank and c is ignored. I think the first solution is better even though is not so orthodox, it serves well.

Copy link

github-actions bot commented Jun 10, 2024

PR Preview Action v1.4.7
Preview removed because the pull request was closed.
2024-06-19 14:34 UTC

@brauliorivas brauliorivas marked this pull request as ready for review June 10, 2024 22:13
Copy link
Contributor

@tmadlener tmadlener left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think for now we can focus on just getting the loading of the current / latest version work dynamically. In principle we have all the pieces in place to make this dynamic on the version as well, IIUC. Mainly (using current names not the proposed ones),

  • loadParticles and loadParticleType, which would need an additional version argument to then dispatch to another loading function.
  • A way to generate different versions of the objects to load

With this we should be able to add specific loading schemes later.

One thing that I don't yet understand is: When is the generation of the datatypes.js actually run? I assume it's done whenever necessary offline? Is this manually triggered? If that is the case, could you put just a few lines / commands into a markdown file at least minimally describing the process?

I have also left a few smaller comments. The main one I have at this point is about naming. There is still quite a bit about particle in function and variable names. However, IIUC, this could all be object or type (depending on context), as it applies to all datatypes not just particle like ones, right?

Finally (also in one of the inline comments), it should be possible to also fit MCParticle into this scheme, right? Could you do that, just to see if it is compatible with the current approach? If not, you might have to adapt either this dynamic loading or the current implementation for the MCParticle, because in that is no special datatype, it was just the one we started with.

js/types/load.js Outdated
}

const particlesToLoad = [
// subset of datatypes
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am missing the edm4hep::MCParticle here (or I am missing something entirely?). It should be possible to load that through the same mechanism, right? That would also serve as a good "real world" test to see whether the scheme works to get to a useful representation in memory.

model/index.js Outdated
Comment on lines 60 to 64
if (type.includes("edm4hep::")) {
newMembers.push(new DataTypeMember(type, name, unit));
} else {
newMembers.push(new DataTypeMember(null, name, unit));
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I have missed this bit before when I was talking about dropping the types. Do we need the type information for members that are components? (I.e. the things starting with edm4hep::? If yes, than we might as well just keep all the type information instead of selectively dropping stuff. If we don't need the type information for the components either, I think we can drop all of them here.

Checking whether we can fit the MCParticle into this dynamic scheme should give as a good indication of what is needed, I think.

Copy link
Member Author

@brauliorivas brauliorivas Jun 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought to keep only types from objects that contain edm4hep:: because maybe when drawing relations, we may need to know towards what type of object the relation is. But you are right; currently it isn't necessary, and we may continue without them. Also, I've included MCParticle which loads seamlessly. Using the testing json file, I get this
Selection_005
when running the loading function, without error.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this also already used when I load a file in the viewer for displaying?

Copy link
Member Author

@brauliorivas brauliorivas Jun 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently, not.

js/types/load.js Outdated Show resolved Hide resolved
js/types/load.js Outdated Show resolved Hide resolved
model/MODEL.md Outdated
## Further improvements

- We could trigger the update process automatically when a new release of edm4hep is published, but for now, we will keep it simple and run the update process manually.
- To maitain backward compatibility, we could add a versioning system so we can later load different versions of the edm4hep data model into dmx, however we currrently load the latest version.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typos

Suggested change
- To maitain backward compatibility, we could add a versioning system so we can later load different versions of the edm4hep data model into dmx, however we currrently load the latest version.
- To maintain backward compatibility, we could add a versioning system so we can later load different versions of the edm4hep data model into dmx, however we currently load the latest version.

model/MODEL.md Outdated
npm run update
```

to run both processes at once and immediatly update the dmx application.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo

Suggested change
to run both processes at once and immediatly update the dmx application.
to run both processes at once and immediately update the dmx application.

```

will download the latest information from the edm4hep, by using the file called `edm4hep.yaml` under `model` directory, containg up to date information about `edm4hep`. The second command

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo

Suggested change
will download the latest information from the edm4hep, by using the file called `edm4hep.yaml` under `model` directory, containing up to date information about `edm4hep`. The second command

Copy link
Contributor

@tmadlener tmadlener left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me. One thing that we could add as a test is to add a small JSON file that we can load and then make some assertions on to make sure that things work with "real" inputs and that the links are to the correct objects.

import { objectTypes } from "../js/types/objects.js";

test("load a collection of particles", () => {
const type = "edm4hep::Track";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not entirely sure why this test is here? It looks like the dynamic.test.js already does a very similar thing? What makes this test different?

Copy link
Member Author

@brauliorivas brauliorivas Jun 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

load.test.js is like a more general version of dynamic.test.js. In dynamic.test.js, the 3 basic functions to load objects (loadMembers, loadOneToOneRelations and loadOneToManyRelations) are tested, meanwhile in load.test.js the function loadObjectType is tested, which uses the 3 functions. It's not exactly the same, because loadObjectType also makes some extra processing. Maybe we could test loadObjectType only because I think it may change less (the test is more result-prone than implementation-prone) and it's directly used by dmx.

@brauliorivas
Copy link
Member Author

This looks good to me. One thing that we could add as a test is to add a small JSON file that we can load and then make some assertions on to make sure that things work with "real" inputs and that the links are to the correct objects.

Sure, I'll add this 👍🏻.

Copy link
Contributor

@tmadlener tmadlener left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding the loading from json tests. This looks good to me. I have just one minor question, where I miss some bits of the testing framework.

@kjvbrt anything from your side for this? Otherwise, I would propose to merge this and then starting to work on the visualizations on top of this.

"edm4hep::ReconstructedParticle",
]);

expect(objects["edm4hep::MCParticle"]).toBeDefined();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does .toBeDefined() test here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought like an "assertion". So basically, it tests if an object that contains that type of particle was created.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah OK. So effectively with the current setup

expect(objects["edm4hep::Cluster"]).toBeDefined();

would fail, right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but that's an old commit. Currently, that test is not present.

@brauliorivas
Copy link
Member Author

@kjvbrt anything from your side for this? Otherwise, I would propose to merge this and then starting to work on the visualizations on top of this.

Yeah, please 😃

@tmadlener
Copy link
Contributor

tmadlener commented Jun 19, 2024

One final thought of mine. This is able to handle multiple collections of the same type, right? For MCParticle there will usually only be one, but e.g. for Tracks or ReconstructedParticle there might be multiple collections. We can always address that in a separate PR if not yet.

@brauliorivas
Copy link
Member Author

One final thought of mine. This is able to handle multiple collections of the same type, right? For MCParticle there will usually only be one, but e.g. for Tracks or ReconstructedParticle there might be multiple collections. We can always address that in a separate PR if not yet.

Yes, on a single event could exist multiple collections of the same type. So here all the collections are loaded and then "merged".

@tmadlener
Copy link
Contributor

Is it then still possible to only view selected collections?

@brauliorivas
Copy link
Member Author

Is it then still possible to only view selected collections?

Yes. But I think this should be developed when working on visualization.

@kjvbrt
Copy link
Collaborator

kjvbrt commented Jun 19, 2024

This looks good to me, let's continue with the visualizations... :)

@tmadlener tmadlener merged commit fad0d00 into key4hep:main Jun 19, 2024
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants