Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added consistent hashing strategy #1087

Merged
merged 12 commits into from
Sep 16, 2022

Conversation

jaronoff97
Copy link
Contributor

@jaronoff97 jaronoff97 commented Sep 12, 2022

This PR adds the consistent hashing strategy. In doing so, I included tests to verify some of the expected behaviors of this strategy method. This PR is one solution to close #1061 as two target allocator replicas will be able to run simultaneously with this strategy because they will always provide the same answer.

spec:
  targetAllocator:
    allocationStrategy: "consistent-hashing"
    replicas: 2
    enabled: true

Screen Shot 2022-09-14 at 11 17 14 AM

Comment on lines +41 to +44
PartitionCount: 1061,
ReplicationFactor: 5,
Load: 1.1,
Hasher: hasher{},
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

question: should this be configurable

return xxhash.Sum64(data)
}

type consistentHashingAllocator struct {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A lot of this code looks very similar to the least weighted strategy. It may be worth investigating in the future if we can combine these as I tried to do in #1068

Copy link
Contributor

@kristinapathak kristinapathak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actual logic looks great! It is extremely similar to the least weighted algorithm...

cmd/otel-allocator/allocation/consistent_hashing.go Outdated Show resolved Hide resolved
cmd/otel-allocator/allocation/consistent_hashing.go Outdated Show resolved Hide resolved
@jaronoff97 jaronoff97 marked this pull request as ready for review September 14, 2022 15:49
@jaronoff97 jaronoff97 requested a review from a team September 14, 2022 15:49
Comment on lines 111 to 112
// Replicas is the number of pod instances for the underlying TargetAllocator, this can only be set to values other
// than 1 if a strategy that allows for high availability is chosen.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Replicas is the number of pod instances for the underlying TargetAllocator, this can only be set to values other
// than 1 if a strategy that allows for high availability is chosen.
// Replicas is the number of pod instances for the underlying TargetAllocator, this should only be set to values other
// than 1 if a strategy that allows for high availability is chosen.

Is there an enforcement mechanism, or is this only advisory?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which are the strategies that allow HA?

AllocationStrategy is a string how does the user know how to configure it?

	// AllocationStrategy determines which strategy the target allocator should use for allocation
	AllocationStrategy string `json:"allocationStrategy,omitempty"`

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is only advisory, the logic for enforcing may be simple now, but will be more complicated in the future when we have things like leader election or maybe even another method like distributed state. I also was thinking if someone wanted to plug in their own TA that the operator would only know the image of, we wouldn't have a way of changing the validated webhook to know to accept the custom TA strategy.

r.Spec.Replicas = &one
}
if r.Spec.TargetAllocator.Enabled && r.Spec.TargetAllocator.Replicas == nil {
r.Spec.TargetAllocator.Replicas = &one
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it safe to share the pointer here? If these values are ever modified is it always by assigning a new pointer or is it ever possible that the underlying value will be changed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's okay to modify because this is just for the webhook. The object passed to the operator wouldn't have a spec with a shared pointer (this worked in my testing)

Comment on lines +41 to +43
PartitionCount: 1061,
ReplicationFactor: 5,
Load: 1.1,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should these configuration options be exposed to the operator?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure... I think i'm going to open a follow up issue to make these configurable as it would require some refactoring of how configuration is passed down to the allocation strategies as right now it's only the string. Is that alright?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that's fine. I think using the string for configuration might be fine, as long as it's well specified. Something like URL query params style might work:

allocationStrategy: "consistentHashing?load=1.5&partitionCount=3000"

That would make it a straightforward addition while leaving plenty of flexibility. Don't need to solve now though, can figure it out on the follow up issue.

Comment on lines +95 to +98
// Do nothing if the item is already there
if _, ok := c.targetItems[k]; ok {
continue
} else {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this necessary? addTargetToTargetItems() has provision for when the new target has an existing collector attached to it, which I think would only happen when the target existed previously as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this is still necessary. the provision in addTargetToTargetItems() is for when we reassign a collector after a new collector is added which shifts the amount of members in the hashring. In this case, we are iterating through the newly added targets and seeing if the target has already been added. Because we're using consistent hashing, the target would get unassigned and then reassigned to the same collector which seems like unnecessary work.

Comment on lines 111 to 112
// Replicas is the number of pod instances for the underlying TargetAllocator, this can only be set to values other
// than 1 if a strategy that allows for high availability is chosen.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which are the strategies that allow HA?

AllocationStrategy is a string how does the user know how to configure it?

	// AllocationStrategy determines which strategy the target allocator should use for allocation
	AllocationStrategy string `json:"allocationStrategy,omitempty"`

cmd/otel-allocator/allocation/consistent_hashing.go Outdated Show resolved Hide resolved
Copy link
Contributor

@kristinapathak kristinapathak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks great! a few grammar suggestions.

Copy link
Member

@pavolloffay pavolloffay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

approving based on the approvals from TA contributors.

ItielOlenick pushed a commit to ItielOlenick/opentelemetry-operator that referenced this pull request May 1, 2024
…#1087)

* Added consistent hashing strategy

* testing fix, metrics fix

* Add configurable replicas

* Fix linting

* Fix test and add a new one

* Rename var to rerun tests

* Comments and things

* Grammar

* docs 🤦
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Implement leader election for the target allocator
4 participants