Skip to content
This repository has been archived by the owner on Jan 20, 2022. It is now read-only.

Pass the semigroup to create Mergeable online #687

Merged
merged 2 commits into from
Oct 5, 2016
Merged

Conversation

johnynek
Copy link
Collaborator

@johnynek johnynek commented Sep 8, 2016

closes #677

The issue is that it is easy to have the semigroup get out of sync between the implicit semigroup in the Producer graph, and the captured value for storm. This is not totally fixed, since when using a single store as a service and a store, we still need to capture the semigroup, but leftJoin does not have access to that. We could fix that walking the graph and finding the semigroup at the location of the sumByKey. We can do that in a later PR.

new CombinedServiceStoreFactory[K, V] {
def mergeableStore = () => onlineStore
def mergeableStore = _ => online
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inconsistency between _ and ()

Copy link
Collaborator Author

@johnynek johnynek Sep 8, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what do you mean? They mean different things. _: Semigroup[V] is being ignored here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, of course

@jnievelt
Copy link
Contributor

jnievelt commented Sep 8, 2016

Looks good. I suppose we'll need a version bump for the MergeableStoreFactory signature change? At least it should be a trivial modification to make sources compatible.

@johnynek
Copy link
Collaborator Author

johnynek commented Sep 8, 2016

@jnievelt yes, this will be a minor version break.

}

trait MergeableStoreFactory[-K, V] extends java.io.Serializable {
def mergeableStore: () => Mergeable[K, V]
def mergeableStore: Semigroup[V] => Mergeable[K, V]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At this point it seems like this method should be named like a function and not a property. Earlier it was in the spirit of lazily created mergeable store so the name was ok. Should we call it createMergeableStore?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, I'm a little nervous about that because it lead to a lot of serialization problems before (which is why we did this). Calling a method is still lazy, so def mergeableStore = store would have been lazy, but still we got burned enough to add this extra layer).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't mean to suggest we change the structure, just the name. MergeableStoreFactory.mergeableStore reads like a property which would provide a mergeable store but it is a function that takes an input to provide a mergeable store.

e.g. when I read factory.mergeableStore I would think that this is a mergeable store object, finding out that it is a function that needs a semigroup as input to produce a mergeable store would be a surprise.

@pankajroark
Copy link
Contributor

Looks good overall, one minor comment.

@pankajroark
Copy link
Contributor

@johnynek Could you cut a release before merging this? This will for sure require changes in our internal repository since there are custom implementations of MergeableStoreFactory that will need to change.

@johnynek
Copy link
Collaborator Author

johnynek commented Sep 8, 2016

@pankajroark I want to push back on cutting a release. The reason is this:

  1. I don't want to see us raise the bar to "any binary change should have a published release before merging"
  2. this change is:
    a. unlikely to hit many users I assume (who should have used the constructor functions in most cases)
    b. totally mechanical to port (just ignore the semigroup for the old behavior, or actually use it to get a safer operation).
  3. The work would be throw-away, since we would need to publish again (since we publish RCs from develop), and any work on a merge would need to be done against the new jar.
  4. We can always roll-back if there is some reason why this change is too onerous (but fixing code to be safer, seems to me like generally a win, not a cost). In fact, lots of the reason for custom MergeableStoreFactory is because you need to know exactly what semigroup is. With this change, that is no longer needed, so I imagine you can have maybe just one or two implementations throughout Twitter.

To turn it around: should Twitter employees publish new jars on each incompatible change for other users to see what the change will be like before a merge? I think that is too much work.

@jnievelt
Copy link
Contributor

jnievelt commented Sep 8, 2016

Actually I think there's just one place where we call MergeableStoreFactory#mergeableStore and it's in a test. The constructions (using from and apply) should be source compatible at least. I don't think it's a problem to merge before releasing.

@pankajroark
Copy link
Contributor

Yeah, this should be ok. Overriding Mergeable is common, MergeableStoreFactory seems to be created mostly with the help of factory function. By bad in confusing the two.

On cutting a release on breaking changes, it was just an effort on my part to make my life easier. I'm planning to do a summingbird release soon and was hoping to do least amount of code changes in our repo, the last release turned out to be very complicated due to the MemoryPlatform changes. But it's not a big deal. In general I agree I wouldn't expect a release of summingbird on every breaking change.

@johnynek
Copy link
Collaborator Author

johnynek commented Oct 5, 2016

I have a +1 on this now, @pankajroark @jnievelt ?

@johnynek johnynek merged commit 159c8a5 into develop Oct 5, 2016
@johnynek johnynek deleted the oscar/issue-677 branch October 5, 2016 17:22
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Storm Store type should take Semigroup[V]
4 participants