tfra.dynamic_embedding.HkvHashTable

View source on GitHub




Class HkvHashTable

A generic mutable hash table implementation.

HkvHashTable is a multi-level cache hash table that can store values in both GPU and CPU memory at the same time. This makes efficient use of training resources while keeping lookups and inserts fast, and it greatly expands the capacity of the hash table, making it suitable for more complex training tasks. For more detailed information, please refer to the HierarchicalKV project.

Environment requirements

  • CUDA version >= 11.2
  • NVIDIA GPU with compute capability 8.0, 8.6, 8.7 or 9.0
  • GCC with support for the C++17 standard or later
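
The requirements above can be written down as a small pre-flight check. This is a minimal sketch in plain Python: `meets_hkv_requirements` is a hypothetical helper, not part of TFRA, and it assumes the caller has already obtained the CUDA version and the GPU compute capability as `(major, minor)` tuples by some other means. It does not check the compiler.

```python
# Supported compute capabilities and minimum CUDA version, copied from
# the requirement list above.
SUPPORTED_COMPUTE_CAPABILITIES = {(8, 0), (8, 6), (8, 7), (9, 0)}
MIN_CUDA_VERSION = (11, 2)


def meets_hkv_requirements(cuda_version, compute_capability):
    """Return True if both stated requirements are satisfied.

    Hypothetical helper: both arguments are (major, minor) tuples that
    the caller has already looked up; nothing is queried from the system.
    """
    return (tuple(cuda_version) >= MIN_CUDA_VERSION
            and tuple(compute_capability) in SUPPORTED_COMPUTE_CAPABILITIES)
```

For example, `meets_hkv_requirements((11, 1), (8, 6))` is rejected because of the CUDA version, and `meets_hkv_requirements((12, 0), (7, 5))` because of the compute capability.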

Example usage:

table = tfra.dynamic_embedding.HkvHashTable(key_dtype=tf.string,
                                            value_dtype=tf.int64,
                                            default_value=-1)
sess.run(table.insert(keys, values))
out = table.lookup(query_keys)
print(out.eval())

__init__

View source

KHkvHashTableInitCapacity = 1024 * 1024
KHkvHashTableMaxCapacity = 1024 * 1024
KHkvHashTableMaxHbmForValuesByBytes = 1024 * 1024 * 1024


__init__(
    key_dtype,
    value_dtype,
    default_value,
    name='HkvHashTable',
    checkpoint=(True),
    init_capacity=KHkvHashTableInitCapacity,
    max_capacity=KHkvHashTableMaxCapacity,
    max_hbm_for_values=KHkvHashTableMaxHbmForValuesByBytes,
    config=None,
    device='',
    evict_strategy=HkvEvictStrategy.LRU,
    step_per_epoch=0,
    gen_scores_fn=None,
    reserved_key_start_bit=0,
)

Creates an empty HkvHashTable object.

Creates a table whose key and value types are specified by key_dtype and value_dtype, respectively.

Args:

  • key_dtype: the type of the key tensors.
  • value_dtype: the type of the value tensors.
  • default_value: The value to use if a key is missing in the table.
  • name: A name for the operation (optional).
  • checkpoint: if True, the contents of the table are saved to and restored from checkpoints. If shared_name is empty for a checkpointed table, it is shared using the table node name.
  • init_capacity: initial capacity of the Variable and of each hash table.
  • max_capacity: maximum capacity of the Variable and of each hash table.
  • max_hbm_for_values: The maximum HBM capacity occupied by the values of the hash table, measured in bytes.
  • config: a HkvHashTableConfig object
  • device: initial size for the Variable; the initial size of each hash table will be int(init_size / N), where N is the number of devices.
  • evict_strategy: The eviction strategy to use.
  • step_per_epoch: How many steps per epoch. This parameter must be set when you select EPOCHLRU or EPOCHLFU evict strategy.
  • gen_scores_fn: Custom method for generating scores. This must be set when you choose to use CUSTOMIZED evict strategy.

Returns:

A HkvHashTable object.

Raises:

  • ValueError: If checkpoint is True and no name was specified.
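
Because max_hbm_for_values is measured in bytes while the capacities are measured in entries, it can help to estimate how many value rows fit within the HBM budget. The sketch below is only back-of-the-envelope arithmetic under a stated assumption: each row is taken to occupy exactly `value_dim * value_itemsize` bytes, with no padding or per-row metadata. `rows_fitting_in_hbm`, `value_dim`, and `value_itemsize` are hypothetical names, and the actual allocation performed by HierarchicalKV may differ.

```python
# Defaults copied from the constants listed above.
KHkvHashTableMaxCapacity = 1024 * 1024
KHkvHashTableMaxHbmForValuesByBytes = 1024 * 1024 * 1024


def rows_fitting_in_hbm(value_dim, value_itemsize,
                        max_capacity=KHkvHashTableMaxCapacity,
                        max_hbm_for_values=KHkvHashTableMaxHbmForValuesByBytes):
    """Estimate how many value rows fit within the HBM budget.

    Assumption: one row occupies value_dim * value_itemsize bytes,
    with no padding or per-row metadata.
    """
    bytes_per_row = value_dim * value_itemsize
    # The table never holds more than max_capacity rows in total.
    return min(max_capacity, max_hbm_for_values // bytes_per_row)
```

Under this assumption and the default settings, 128-dimensional float32 values (512 bytes per row) fit entirely within the HBM budget, whereas 1024-dimensional float32 values (4096 bytes per row) leave room for only 262144 of the 1048576 rows.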

Properties

key_dtype

The table key dtype.

name

The name of the table.

resource_handle

Returns the resource handle associated with this Resource.

value_dtype

The table value dtype.

Methods

__getitem__

__getitem__(keys)

Looks up keys in a table, outputs the corresponding values.

accum

View source

accum(
    keys,
    values_or_deltas,
    exists,
    name=None
)

Associates keys with values.

Args:

  • keys: Keys to accumulate. Can be a tensor of any shape. Must match the table's key type.
  • values_or_deltas: Values to be associated with keys. Must be a tensor of the same shape as keys and match the table's value type.
  • exists: A bool tensor indicating whether each key already exists. Must be a tensor of the same shape as keys.
  • name: A name for the operation (optional).

Returns:

The created Operation.

Raises:

  • TypeError: when keys or values do not match the table data types.
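
The description above does not spell out how exists interacts with values_or_deltas. The sketch below models one plausible reading with a plain Python dict and scalar values. It is an assumption, not the TFRA implementation: a delta is added when the flag is true and the key is present, a value is inserted when the flag is false and the key is absent, and any other combination leaves the table unchanged.

```python
def accum(table, keys, values_or_deltas, exists):
    """Plain-dict model of the ASSUMED accum semantics (not the TFRA op).

    For each position i:
      * exists[i] is True and the key is present  -> add the delta;
      * exists[i] is False and the key is absent  -> insert the value;
      * any other combination                     -> leave the table as is.
    """
    for key, value, flag in zip(keys, values_or_deltas, exists):
        present = key in table
        if flag and present:
            table[key] += value
        elif not flag and not present:
            table[key] = value
    return table
```

Starting from `{1: 10}`, calling `accum(table, [1, 2, 3], [5, 7, 9], [True, False, True])` adds 5 to key 1, inserts key 2, and skips key 3 because it was flagged as existing but is not in the table.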

clear

View source

clear(name=None)

Clears all keys and values in the table.

Args:

  • name: A name for the operation (optional).

Returns:

The created Operation.

export

View source

export(name=None)

Returns tensors of all keys and values in the table.

Args:

  • name: A name for the operation (optional).

Returns:

A pair of tensors, with the first tensor containing all keys and the second tensor containing all values in the table.

insert

View source

insert(
    keys,
    values,
    name=None
)

Associates keys with values.

Args:

  • keys: Keys to insert. Can be a tensor of any shape. Must match the table's key type.
  • values: Values to be associated with keys. Must be a tensor of the same shape as keys and match the table's value type.
  • name: A name for the operation (optional).

Returns:

The created Operation.

Raises:

  • TypeError: when keys or values do not match the table data types.

lookup

View source

lookup(
    keys,
    dynamic_default_values=None,
    return_exists=(False),
    name=None
)

Looks up keys in a table, outputs the corresponding values.

The default_value is used for keys not present in the table.

Args:

  • keys: Keys to look up. Can be a tensor of any shape. Must match the table's key_dtype.
  • dynamic_default_values: The values to use if a key is missing in the table. If None (by default), the static default_value self._default_value will be used.
  • return_exists: if True, an additional Tensor is returned that indicates whether each key exists in the table.
  • name: A name for the operation (optional).

Returns:

A tensor of the table's value type containing the values, with the same shape as keys.

  • exists: A bool Tensor of the same shape as keys that indicates whether each key exists in the table. Only provided if return_exists is True.

Raises:

  • TypeError: when keys do not match the table data types.
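
The interplay of default_value, dynamic_default_values, and return_exists can be illustrated with a plain Python model. This is a sketch over a dict and a flat list of keys, not the TFRA op. It assumes that dynamic_default_values, when given, holds one fallback per key, and `default_value` is passed explicitly here because the model has no table object to hold it.

```python
def lookup(table, keys, default_value, dynamic_default_values=None,
           return_exists=False):
    """Plain-dict model of lookup over a flat list of keys.

    Assumption: dynamic_default_values, if given, has one entry per key.
    """
    values, exists = [], []
    for i, key in enumerate(keys):
        found = key in table
        exists.append(found)
        if found:
            values.append(table[key])
        elif dynamic_default_values is not None:
            # Per-key fallback takes precedence over the static default.
            values.append(dynamic_default_values[i])
        else:
            values.append(default_value)
    return (values, exists) if return_exists else values
```

With `table = {"a": 1, "b": 2}`, looking up `["a", "z"]` with a default of -1 yields the stored value for "a" and -1 for the missing key "z".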

remove

View source

remove(
    keys,
    name=None
)

Removes keys and their associated values from the table.

If a key is not present in the table, it is silently ignored.

Args:

  • keys: Keys to remove. Can be a tensor of any shape. Must match the table's key type.
  • name: A name for the operation (optional).

Returns:

The created Operation.

Raises:

  • TypeError: when keys do not match the table data types.
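
The "silently ignored" behavior for missing keys corresponds to the following plain-dict model. It is a sketch for illustration only, not the TFRA op, and it operates on a flat list of keys.

```python
def remove(table, keys):
    """Plain-dict model of remove: keys that are absent are skipped."""
    for key in keys:
        # pop with a default never raises for a missing key.
        table.pop(key, None)
    return table
```

Removing `[2, 3]` from `{1: "a", 2: "b"}` deletes key 2 and ignores key 3 without raising an error.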

size

View source

size(name=None)

Computes the number of elements in this table.

Args:

  • name: A name for the operation (optional).

Returns:

A scalar tensor containing the number of elements in this table.