msgpack: support datetime extended type
Tarantool supports the datetime type since version 2.10.0 [1]. This patch
introduces support for the Tarantool datetime type in msgpack decoders
and encoders.

Tarantool datetime objects are decoded to `tarantool.Datetime` type.
`tarantool.Datetime` objects may be encoded to Tarantool datetime
objects.

`tarantool.Datetime` is basically a `pandas.Timestamp` wrapper. You can
create `tarantool.Datetime` objects
- from a `pandas.Timestamp` object,
- by using the same API as in `pandas.Timestamp()` [2],
- from another `tarantool.Datetime` object.

To work with datetime data as a `pandas.Timestamp`, convert a
`tarantool.Datetime` object to a `pandas.Timestamp` by calling its
`to_pd_timestamp()` method. You can use this `pandas.Timestamp`
object to build a `tarantool.Datetime` object before sending data to
Tarantool.

To work with data as `numpy.datetime64` or `datetime.datetime`, convert
to a `pandas.Timestamp` and then use its `to_datetime64()` or
`to_pydatetime()` method.

`pandas.Timestamp` was chosen to store data because it can hold both
nanoseconds and timezone information. The built-in Python
`datetime.datetime` supports microsecond precision at most, and
`numpy.datetime64` does not support timezones.
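
The precision argument can be checked with the standard library alone.
The sketch below is illustrative and not part of the patch: `split_ns`
is a hypothetical helper and the sample value is arbitrary. It shows
that a nanosecond count loses its last three digits when it passes
through `datetime.datetime`.

```python
from datetime import datetime, timedelta, timezone

NSEC_IN_SEC = 1_000_000_000


def split_ns(total_nsec):
    """Split integer nanoseconds since the Unix epoch into (seconds, nsec)."""
    return divmod(total_nsec, NSEC_IN_SEC)


# datetime.datetime cannot represent anything finer than one microsecond.
assert datetime.resolution == timedelta(microseconds=1)

total_nsec = 1_661_969_274_308_543_321  # arbitrary sample value
seconds, nsec = split_ns(total_nsec)

# Build a datetime from the same instant; only whole microseconds fit.
as_datetime = datetime.fromtimestamp(seconds, tz=timezone.utc).replace(
    microsecond=nsec // 1000)

# Nanoseconds that survive the trip, and the remainder that is dropped.
restored_nsec = as_datetime.microsecond * 1000
lost_nsec = nsec - restored_nsec
```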

The Tarantool datetime interval type is planned to be stored in a custom
type, tarantool.Interval, and we'll need a way to support arithmetic
between datetime and interval. This is the reason we use a custom class
instead of plain pandas.Timestamp.

This patch does not yet introduce the support of timezones in datetime.

1. tarantool/tarantool#5941
2. https://pandas.pydata.org/docs/reference/api/pandas.Timestamp.html

Part of #204
DifferentialOrange committed Sep 13, 2022
1 parent c70dfa6 commit 39ddae3
Showing 10 changed files with 345 additions and 6 deletions.
21 changes: 21 additions & 0 deletions CHANGELOG.md
@@ -9,6 +9,27 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
### Added
- Decimal type support (#203).
- UUID type support (#202).
- Datetime type support and tarantool.Datetime type (#204).

Tarantool datetime objects are decoded to `tarantool.Datetime`
type. `tarantool.Datetime` may be encoded to Tarantool datetime
objects.

`tarantool.Datetime` is basically a `pandas.Timestamp` wrapper.
You can create `tarantool.Datetime` objects
- from `pandas.Timestamp` object,
- by using the same API as in `pandas.Timestamp()`,
- from another `tarantool.Datetime` object.

To work with datetime data as a `pandas.Timestamp`, convert
`tarantool.Datetime` object to a `pandas.Timestamp` with
`to_pd_timestamp()` method call. You can use this
`pandas.Timestamp` object to build a `tarantool.Datetime`
object before sending data to Tarantool.

To work with data as `numpy.datetime64` or `datetime.datetime`,
convert to a `pandas.Timestamp` and then use its `to_datetime64()`
or `to_pydatetime()` method.

### Changed
- Bump msgpack requirement to 1.0.4 (PR #223).
1 change: 1 addition & 0 deletions requirements.txt
@@ -1 +1,2 @@
msgpack>=1.0.4
pandas
6 changes: 5 additions & 1 deletion tarantool/__init__.py
@@ -32,6 +32,10 @@
    ENCODING_DEFAULT,
)

from tarantool.msgpack_ext.types.datetime import (
    Datetime,
)

__version__ = "0.9.0"


@@ -91,7 +95,7 @@ def connectmesh(addrs=({'host': 'localhost', 'port': 3301},), user=None,

__all__ = ['connect', 'Connection', 'connectmesh', 'MeshConnection', 'Schema',
'Error', 'DatabaseError', 'NetworkError', 'NetworkWarning',
'SchemaError', 'dbapi']
'SchemaError', 'dbapi', 'Datetime']

# ConnectionPool is supported only for Python 3.7 or newer.
if sys.version_info.major >= 3 and sys.version_info.minor >= 7:
9 changes: 9 additions & 0 deletions tarantool/msgpack_ext/datetime.py
@@ -0,0 +1,9 @@
from tarantool.msgpack_ext.types.datetime import Datetime

EXT_ID = 4

def encode(obj):
    return obj.msgpack_encode()

def decode(data):
    return Datetime(data)
8 changes: 6 additions & 2 deletions tarantool/msgpack_ext/packer.py
@@ -2,12 +2,16 @@
from uuid import UUID
from msgpack import ExtType

from tarantool.msgpack_ext.types.datetime import Datetime

import tarantool.msgpack_ext.decimal as ext_decimal
import tarantool.msgpack_ext.uuid as ext_uuid
import tarantool.msgpack_ext.datetime as ext_datetime

encoders = [
{'type': Decimal, 'ext': ext_decimal},
{'type': UUID, 'ext': ext_uuid },
{'type': Decimal, 'ext': ext_decimal },
{'type': UUID, 'ext': ext_uuid },
{'type': Datetime, 'ext': ext_datetime},
]

def default(obj):
126 changes: 126 additions & 0 deletions tarantool/msgpack_ext/types/datetime.py
@@ -0,0 +1,126 @@
from copy import deepcopy

import pandas

from tarantool.error import MsgpackError

# https://www.tarantool.io/en/doc/latest/dev_guide/internals/msgpack_extensions/#the-datetime-type
#
# The datetime MessagePack representation looks like this:
# +---------+----------------+==========+-----------------+
# | MP_EXT | MP_DATETIME | seconds | nsec; tzoffset; |
# | = d7/d8 | = 4 | | tzindex; |
# +---------+----------------+==========+-----------------+
# MessagePack data contains:
#
# * Seconds (8 bytes) as an unencoded 64-bit signed integer stored in the
# little-endian order.
# * The optional fields (8 bytes), if any of them have a non-zero value.
# The fields include nsec (4 bytes), tzoffset (2 bytes), and
# tzindex (2 bytes) packed in the little-endian order.
#
# seconds is seconds since Epoch, where the epoch is the point where the time
# starts, and is platform dependent. For Unix, the epoch is January 1,
# 1970, 00:00:00 (UTC). Tarantool uses a double type, see a structure
# definition in src/lib/core/datetime.h and reasons in
# https://github.com/tarantool/tarantool/wiki/Datetime-internals#intervals-in-c
#
# nsec is nanoseconds, fractional part of seconds. Tarantool uses int32_t, see
# a definition in src/lib/core/datetime.h.
#
# tzoffset is timezone offset in minutes from UTC. Tarantool uses an int16_t type,
# see a structure definition in src/lib/core/datetime.h.
#
# tzindex is Olson timezone id. Tarantool uses an int16_t type, see a structure
# definition in src/lib/core/datetime.h. If both tzoffset and tzindex are
# specified, tzindex has the preference and the tzoffset value is ignored.

SECONDS_SIZE_BYTES = 8
NSEC_SIZE_BYTES = 4
TZOFFSET_SIZE_BYTES = 2
TZINDEX_SIZE_BYTES = 2

BYTEORDER = 'little'

NSEC_IN_SEC = 1000000000


def get_bytes_as_int(data, cursor, size):
    part = data[cursor:cursor + size]
    return int.from_bytes(part, BYTEORDER, signed=True), cursor + size

def get_int_as_bytes(data, size):
    return data.to_bytes(size, byteorder=BYTEORDER, signed=True)

def msgpack_decode(data):
    cursor = 0
    seconds, cursor = get_bytes_as_int(data, cursor, SECONDS_SIZE_BYTES)

    if len(data) > SECONDS_SIZE_BYTES:
        nsec, cursor = get_bytes_as_int(data, cursor, NSEC_SIZE_BYTES)
        tzoffset, cursor = get_bytes_as_int(data, cursor, TZOFFSET_SIZE_BYTES)
        tzindex, cursor = get_bytes_as_int(data, cursor, TZINDEX_SIZE_BYTES)
    elif len(data) == SECONDS_SIZE_BYTES:
        nsec = 0
        tzoffset = 0
        tzindex = 0
    else:
        raise MsgpackError('Unexpected datetime payload length')

    # Timezones are not supported by this patch yet.
    if (tzoffset != 0) or (tzindex != 0):
        raise NotImplementedError

    total_nsec = seconds * NSEC_IN_SEC + nsec

    return pandas.to_datetime(total_nsec, unit='ns')

class Datetime():
    def __init__(self, *args, **kwargs):
        if len(args) > 0:
            data = args[0]
            if isinstance(data, bytes):
                self._timestamp = msgpack_decode(data)
                return

            if isinstance(data, pandas.Timestamp):
                self._timestamp = deepcopy(data)
                return

            if isinstance(data, Datetime):
                self._timestamp = deepcopy(data._timestamp)
                return

        # Any other arguments are passed to pandas.Timestamp unchanged.
        self._timestamp = pandas.Timestamp(*args, **kwargs)
        return

    def __eq__(self, other):
        if isinstance(other, Datetime):
            return self._timestamp == other._timestamp
        elif isinstance(other, pandas.Timestamp):
            return self._timestamp == other
        else:
            return False

    def __str__(self):
        return self._timestamp.__str__()

    def __repr__(self):
        return self._timestamp.__repr__()

    def to_pd_timestamp(self):
        return deepcopy(self._timestamp)

    def msgpack_encode(self):
        # pandas.Timestamp.value is nanoseconds since the Unix epoch.
        ts_value = self._timestamp.value

        seconds = ts_value // NSEC_IN_SEC
        nsec = ts_value % NSEC_IN_SEC
        tzoffset = 0
        tzindex = 0

        buf = get_int_as_bytes(seconds, SECONDS_SIZE_BYTES)

        # The optional 8-byte tail is written only if any field is non-zero.
        if (nsec != 0) or (tzoffset != 0) or (tzindex != 0):
            buf = buf + get_int_as_bytes(nsec, NSEC_SIZE_BYTES)
            buf = buf + get_int_as_bytes(tzoffset, TZOFFSET_SIZE_BYTES)
            buf = buf + get_int_as_bytes(tzindex, TZINDEX_SIZE_BYTES)

        return buf
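
The payload layout described in the comment at the top of this file can
also be sketched with the standard `struct` module. This is an
illustration under assumptions, not the patch's implementation:
`pack_datetime` and `unpack_datetime` are hypothetical names, and the
sketch accepts only payloads of exactly 8 or 16 bytes, while the patch
reads the tail from any payload longer than 8 bytes.

```python
import struct

SECONDS_FMT = '<q'   # 8-byte signed little-endian seconds
FULL_FMT = '<qihh'   # seconds, nsec (int32), tzoffset (int16), tzindex (int16)


def pack_datetime(seconds, nsec=0, tzoffset=0, tzindex=0):
    """Build a datetime payload; the 8-byte tail is added only when needed."""
    if nsec or tzoffset or tzindex:
        return struct.pack(FULL_FMT, seconds, nsec, tzoffset, tzindex)
    return struct.pack(SECONDS_FMT, seconds)


def unpack_datetime(data):
    """Return (seconds, nsec, tzoffset, tzindex) from an 8- or 16-byte payload."""
    if len(data) == struct.calcsize(SECONDS_FMT):
        (seconds,) = struct.unpack(SECONDS_FMT, data)
        return seconds, 0, 0, 0
    if len(data) == struct.calcsize(FULL_FMT):
        return struct.unpack(FULL_FMT, data)
    raise ValueError('Unexpected datetime payload length')


short_payload = pack_datetime(1661969274)
full_payload = pack_datetime(1661969274, nsec=308543321)
```

With the `<` prefix `struct` uses standard sizes without padding, so the
full format is exactly 16 bytes, matching the 8 + 4 + 2 + 2 layout.
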
6 changes: 4 additions & 2 deletions tarantool/msgpack_ext/unpacker.py
@@ -1,9 +1,11 @@
import tarantool.msgpack_ext.decimal as ext_decimal
import tarantool.msgpack_ext.uuid as ext_uuid
import tarantool.msgpack_ext.datetime as ext_datetime

decoders = {
ext_decimal.EXT_ID: ext_decimal.decode,
ext_uuid.EXT_ID : ext_uuid.decode ,
ext_decimal.EXT_ID : ext_decimal.decode ,
ext_uuid.EXT_ID : ext_uuid.decode ,
ext_datetime.EXT_ID: ext_datetime.decode,
}

def ext_hook(code, data):
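
The body of `ext_hook` is not shown in this diff. As a rough sketch of
how a table like `decoders` can drive decoding of extension types, the
example below dispatches on the extension code. Everything here is an
assumption for illustration: `decode_seconds_only` is a hypothetical
stand-in for `ext_datetime.decode`, and the error raised for unknown
codes is a guess rather than the library's actual behavior.

```python
EXT_ID_DATETIME = 4  # matches EXT_ID in tarantool/msgpack_ext/datetime.py


def decode_seconds_only(data):
    """Hypothetical stand-in decoder: read the 8-byte seconds field only."""
    return int.from_bytes(data[:8], 'little', signed=True)


# Map each extension type code to the function that decodes its payload.
decoders = {
    EXT_ID_DATETIME: decode_seconds_only,
}


def ext_hook(code, data):
    """Look up the decoder registered for this extension code and apply it."""
    decoder = decoders.get(code)
    if decoder is None:
        raise NotImplementedError('Unknown msgpack extension type code %d' % code)
    return decoder(data)


payload = (42).to_bytes(8, 'little', signed=True)
decoded = ext_hook(EXT_ID_DATETIME, payload)
```

Keeping the mapping in a dictionary means a new extension type only
needs one extra entry, which is what this patch does for datetime.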
3 changes: 2 additions & 1 deletion test/suites/__init__.py
@@ -17,13 +17,14 @@
from .test_ssl import TestSuite_Ssl
from .test_decimal import TestSuite_Decimal
from .test_uuid import TestSuite_UUID
from .test_datetime import TestSuite_Datetime

test_cases = (TestSuite_Schema_UnicodeConnection,
TestSuite_Schema_BinaryConnection,
TestSuite_Request, TestSuite_Protocol, TestSuite_Reconnect,
TestSuite_Mesh, TestSuite_Execute, TestSuite_DBAPI,
TestSuite_Encoding, TestSuite_Pool, TestSuite_Ssl,
TestSuite_Decimal, TestSuite_UUID)
TestSuite_Decimal, TestSuite_UUID, TestSuite_Datetime)

def load_tests(loader, tests, pattern):
suite = unittest.TestSuite()
11 changes: 11 additions & 0 deletions test/suites/lib/skip.py
@@ -154,3 +154,14 @@ def skip_or_run_UUID_test(func):

    return skip_or_run_test_tarantool(func, '2.4.1',
                                      'does not support UUID type')

def skip_or_run_datetime_test(func):
    """Decorator to skip or run datetime-related tests depending on
    the tarantool version.
    Tarantool supports datetime type only since 2.10.0 version.
    See https://github.com/tarantool/tarantool/issues/5941
    """

    return skip_or_run_test_pcall_require(func, 'datetime',
                                          'does not support datetime type')