Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use Vector instead of List in transform methods #62

Merged
merged 1 commit into from
Mar 18, 2024

Conversation

pondzix
Copy link
Contributor

@pondzix pondzix commented Mar 13, 2024

Currently, the transform functions in common-loaders return a List. We construct the list by concatenating two lists: 1 for atomic fields and 1 for the self-describing entity fields (this line).

The ::: operator is relatively expensive, especially for large lists. And this is a “hot” function which we invoke for every single event.

If we switch the implementation to use Vector instead of List then it should be much more efficient to concatenate the two lists.

Currently, the transform functions in common-loaders return a [List](https://github.com/snowplow-incubator/common-streams/blob/0.2.1/modules/loaders-common/src/main/scala/com/snowplowanalytics/snowplow/loaders/transform/Transform.scala#L83).  We construct the list by concatenating two lists: 1 for atomic fields and 1 for the self-describing entity fields ([this line](https://github.com/snowplow-incubator/common-streams/blob/0.2.1/modules/loaders-common/src/main/scala/com/snowplowanalytics/snowplow/loaders/transform/Transform.scala#L86)).

The ::: operator is relatively expensive, especially for large lists.  And this is a “hot” function which we invoke for every single event.

If we switch the implementation to use Vector instead of List then it should be much more efficient to concatenate the two lists.
@pondzix pondzix force-pushed the use_vector_in_transform branch from 7bcc80f to ff178b6 Compare March 18, 2024 09:03
@pondzix pondzix merged commit ff178b6 into develop Mar 18, 2024
1 check passed
@istreeter istreeter deleted the use_vector_in_transform branch March 18, 2024 23:25
istreeter added a commit that referenced this pull request Mar 19, 2024
In #62 the transform method was improved to avoid the expensive `:::`
operator.  But we were still converting a `List` to a `Vector` for every
single event.  That conversion can be eliminated fairly easily.
istreeter added a commit that referenced this pull request Mar 24, 2024
In #62 the transform method was improved to avoid the expensive `:::`
operator.  But we were still converting a `List` to a `Vector` for every
single event.  That conversion can be eliminated fairly easily.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants