We currently have multiple methods across the codebase for generating unique identifiers. To improve consistency and flexibility, I propose unifying this functionality under servicelib.identifiers_utils. The main objectives are:
Generate Context-Specific Unique Name Identifiers:
Generate unique identifiers scoped to a given context: globally unique, or unique within a project, process, hostname, or cluster.
The context is passed as discriminators, which define the scope of uniqueness.
Standardized Identifier Formats:
Provide support for generating both standard UUIDs and human-readable identifiers with optional prefixes.
UUIDs should follow a standard format like uuid4 for general uniqueness or uuid5 (namespace-based) for deterministic IDs based on specific discriminators.
Human-readable identifiers should support optional prefixes (e.g., pay_123456124 for payment identifiers).
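To make the uuid4/uuid5 distinction concrete, here is a minimal sketch using only Python's standard uuid module. The namespace constant, the helper names, and the discriminator values are assumptions made for illustration and are not part of the proposal.

```python
import uuid

# Hypothetical namespace for illustration: any fixed UUID can serve as one.
EXAMPLE_NAMESPACE = uuid.NAMESPACE_DNS


def random_id() -> uuid.UUID:
    """uuid4: random, so every call returns a different value."""
    return uuid.uuid4()


def deterministic_id(*discriminators: object) -> uuid.UUID:
    """uuid5: the same namespace and name always give the same UUID."""
    name = "/".join(map(str, discriminators))
    return uuid.uuid5(EXAMPLE_NAMESPACE, name)


a = deterministic_id("my-project", "node-1")
b = deterministic_id("my-project", "node-1")
c = deterministic_id("my-project", "node-2")
```

Here a and b are equal because the namespace and discriminators match, while c differs because one discriminator changed.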
Example Implementation:
    import hashlib
    import socket
    import time
    import uuid

    from models_library.basic_types import IdStr


    def short_sha256(input_string: str, length: int = 8) -> IdStr:
        """Generates a truncated SHA-256 hash of the input string."""
        sha_signature = hashlib.sha256(input_string.encode()).hexdigest()
        return IdStr(sha_signature[:length])


    def generate_name_identifier(*discriminators, prefix: str | None = None, length: int = 8) -> IdStr:
        """
        Generates a unique identifier based on the provided discriminators
        (e.g., project name, hostname). Optionally includes a human-readable
        prefix and truncates the identifier to the desired length.
        """
        idr = short_sha256("/".join(map(str, discriminators)), length=length)
        if prefix:
            idr = f"{prefix}_{idr}"
        return idr


    def generate_uuid(*discriminators, base_uuid: uuid.UUID | None = None) -> uuid.UUID:
        """
        Generates a UUID based on the provided discriminators.
        Uses uuid5 for namespace-based determinism.
        """
        if not base_uuid:
            # NOTE: a random namespace makes the result differ on every call;
            # pass a fixed base_uuid to get reproducible UUIDs.
            base_uuid = uuid.uuid4()
        return uuid.uuid5(base_uuid, "/".join(map(str, discriminators)))


    # Example usage
    def get_rabbitmq_client_unique_name(prefix: str) -> IdStr:
        """
        Generates a unique RabbitMQ client name based on the hostname
        and current time, with an optional prefix.
        """
        hostname = socket.gethostname()
        return generate_name_identifier(time.time(), hostname, prefix=prefix, length=8)
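The behaviour of the name generator can be tried without the rest of the codebase. The sketch below is a dependency-free variant in which IdStr is replaced by plain str (an assumption made only so the snippet runs standalone); the discriminator values are made up.

```python
import hashlib
from typing import Optional


def short_sha256(input_string: str, length: int = 8) -> str:
    """Truncated SHA-256 hex digest of the input string."""
    return hashlib.sha256(input_string.encode()).hexdigest()[:length]


def generate_name_identifier(
    *discriminators: object, prefix: Optional[str] = None, length: int = 8
) -> str:
    """Same logic as the proposal, with str standing in for IdStr."""
    idr = short_sha256("/".join(map(str, discriminators)), length=length)
    return f"{prefix}_{idr}" if prefix else idr


# Identical discriminators give identical identifiers.
first = generate_name_identifier("project-a", "host-1", prefix="pay")
second = generate_name_identifier("project-a", "host-1", prefix="pay")

# Because discriminators are joined with "/", these two calls hash the same string.
joined_1 = generate_name_identifier("a/b", "c")
joined_2 = generate_name_identifier("a", "b/c")
```

One design point this surfaces: since the discriminators are joined with "/", values that themselves contain "/" can map to the same identifier, as joined_1 and joined_2 do. If that matters, escaping the separator or hashing each discriminator separately would be possible alternatives.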
Key Points and Improvements:
Contextual Uniqueness: The generate_name_identifier function allows you to pass any relevant context (e.g., hostname, project, or process) to ensure uniqueness within the intended scope.
Prefix Support: Human-readable prefixes can be added to identifiers for better clarity and debugging, such as pay_ for payment identifiers or user_ for user-related identifiers.
Shortened SHA-256 Identifiers: For identifiers that require truncation, we use a shortened SHA-256 hash, which can be configured via the length parameter to balance between uniqueness and brevity. The default of 8 hex characters carries 32 bits, so by the birthday bound the chance of at least one collision reaches about 50% at roughly 77,000 identifiers; use a longer truncation wherever that volume is plausible.
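UUID Generation: For cases requiring globally unique or deterministic identifiers, the generate_uuid function uses uuid5 to generate namespace-based UUIDs (preferred over uuid3 because it hashes with SHA-1 rather than MD5). The result is reproducible only when the same base_uuid is passed on every call; with the default, a fresh uuid4 namespace is drawn each time, so the output is unique but not deterministic.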
Flexibility: Both generate_name_identifier and generate_uuid functions are flexible, allowing users to define how discriminators affect uniqueness within their system.
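As a rough aid for choosing the length parameter, the collision risk of a truncated hash can be estimated with the standard birthday approximation. This is a sketch that assumes the truncated digests are uniformly distributed; the function name is hypothetical.

```python
import math


def collision_probability(n_ids: int, hex_length: int = 8) -> float:
    """Approximate probability of at least one collision among n_ids
    identifiers drawn uniformly from 16**hex_length possible values."""
    space = 16**hex_length
    # Birthday approximation: p ~= 1 - exp(-n(n-1) / (2 * space)).
    # -expm1(-x) computes 1 - exp(-x) accurately for small x.
    return -math.expm1(-n_ids * (n_ids - 1) / (2.0 * space))


p_default = collision_probability(77_163, hex_length=8)
p_longer = collision_probability(77_163, hex_length=16)
```

With the default of 8 hex characters the estimate is close to one half at about 77,000 identifiers, while doubling the length to 16 characters makes it negligible for the same count.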
Next Steps:
We can further extend this utility by allowing specific discriminators, such as user ID or session ID, for more fine-grained uniqueness when necessary.
Additional formats (e.g., base62 encoding for compact identifiers) can be considered if we find a need to reduce the length of identifiers without sacrificing uniqueness.
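For the base62 idea, one possible shape is sketched below. It is not part of the proposal: the alphabet order, the function names, and the choice of keeping 8 digest bytes are all assumptions made for illustration.

```python
import hashlib
import string

# Assumed alphabet order (digits, uppercase, lowercase); others are equally valid.
BASE62_ALPHABET = string.digits + string.ascii_uppercase + string.ascii_lowercase


def base62_encode(number: int) -> str:
    """Encodes a non-negative integer using the 62-character alphabet."""
    if number < 0:
        raise ValueError("number must be non-negative")
    if number == 0:
        return BASE62_ALPHABET[0]
    digits = []
    while number:
        number, remainder = divmod(number, 62)
        digits.append(BASE62_ALPHABET[remainder])
    return "".join(reversed(digits))


def short_base62(input_string: str, n_bytes: int = 8) -> str:
    """Base62 form of the first n_bytes of the SHA-256 digest."""
    digest = hashlib.sha256(input_string.encode()).digest()[:n_bytes]
    return base62_encode(int.from_bytes(digest, "big"))
```

Keeping 8 bytes (64 bits) of the digest takes 16 characters in hex but at most 11 in base62, which is where the length saving would come from.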
CC: @sanderegg @matusdrobuliak66 @giancarloromeo @GitHK
Originally posted by @pcrespov in #6365 (comment)