Description
I am looking for the correct way to apply MultivariateGrouper to data built with PandasDataset.from_long_dataframe, or for a way to convert my custom dataset into an object like the one returned by get_dataset("electricity_nips", regenerate=False).
Note that train_grouper(train_ds) works, but test_grouper(test_ds) raises an error.
I have studied this example, but it works with a standard dataset, and I found no example showing how to build such a dataset from a long (or any other) dataframe, or how to convert one.
To Reproduce
import pandas as pd

import gluonts
from gluonts.dataset.multivariate_grouper import MultivariateGrouper
from gluonts.dataset.pandas import PandasDataset
from gluonts.dataset.split import OffsetSplitter

print(gluonts.__version__)

# Create a long dataframe with two time series of 100 values each.
df = pd.DataFrame(
    data={
        'target': list(range(100)) + list(range(100)),
        'item_id': ['var_1'] * 100 + ['var_2'] * 100,
    },
    index=list(pd.date_range(start='1970-01-01', periods=100, freq='1D')) * 2,
)
print(df.info())

gluon_ds = PandasDataset.from_long_dataframe(
    dataframe=df,
    target='target',
    item_id='item_id',
    freq='1D',
)

# Split at offset 70, then generate 29 one-step test windows.
splitter = OffsetSplitter(offset=70)
train_ds, test_template = splitter.split(gluon_ds)
test_ds = test_template.generate_instances(
    prediction_length=1,
    windows=29,
    distance=1,
)
print(f"{train_ds=}")
print(f"{test_ds=}")

train_grouper = MultivariateGrouper(max_target_dim=2)
test_grouper = MultivariateGrouper(
    num_test_dates=29,
    max_target_dim=2,
)

train_gr_data = train_grouper(train_ds)  # works
test_gr_data = test_grouper(test_ds)  # raises TypeError
Error message or code output
0.15.1
<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 200 entries, 1970-01-01 to 1970-04-10
Data columns (total 2 columns):
 #   Column   Non-Null Count  Dtype
---  ------   --------------  -----
 0   target   200 non-null    int64
 1   item_id  200 non-null    object
dtypes: int64(1), object(1)
memory usage: 4.7+ KB
None
train_ds=TrainingDataset(dataset=PandasDataset<size=2, freq=1D, num_feat_dynamic_real=0, num_past_feat_dynamic_real=0, num_feat_static_real=0, num_feat_static_cat=0, static_cardinalities=[]>, splitter=OffsetSplitter(offset=70))
test_ds=TestData(dataset=PandasDataset<size=2, freq=1D, num_feat_dynamic_real=0, num_past_feat_dynamic_real=0, num_feat_static_real=0, num_feat_static_cat=0, static_cardinalities=[]>, splitter=OffsetSplitter(offset=70), prediction_length=1, windows=29, distance=1, max_history=None)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-2-7778d6fa7e79> in <cell line: 42>()
     40
     41 train_gr_data = train_grouper(train_ds)
---> 42 test_gr_data = test_grouper(test_ds)

1 frames
/usr/local/lib/python3.10/dist-packages/gluonts/dataset/multivariate_grouper.py in __call__(self, dataset)
     85
     86     def __call__(self, dataset: Dataset) -> Dataset:
---> 87         self._preprocess(dataset)
     88         return self._group_all(dataset)
     89

/usr/local/lib/python3.10/dist-packages/gluonts/dataset/multivariate_grouper.py in _preprocess(self, dataset)
     98         """
     99         for data in dataset:
--> 100             timestamp = data[FieldName.START]
    101
    102             if self.first_timestamp is None:

TypeError: tuple indices must be integers or slices, not str
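One possible reading of the traceback: iterating over the TestData object appears to yield (input, label) tuples rather than single dict entries, so data[FieldName.START] ends up indexing a tuple with a string. The sketch below reproduces that failure with plain Python stand-ins (no GluonTS involved) and shows one way the pairs might be merged back into dict entries. The helper names make_pairs, merge_pair and pairs_to_entries are hypothetical, and the reordering step rests on the unverified assumption that the grouper wants all series for the first test date, then all series for the second, and so on.

```python
# Plain-Python stand-in: lists replace the numpy arrays that real entries hold,
# and "start"/"target"/"item_id" mimic the usual entry field names.

def make_pairs(series, first_cut, num_windows):
    """Mimic the assumed TestData order: series by series, window by window."""
    pairs = []
    for item_id, values in series.items():
        for window in range(num_windows):
            cut = first_cut + window
            input_entry = {"item_id": item_id, "start": "1970-01-01",
                           "target": values[:cut]}
            label_entry = {"item_id": item_id, "target": values[cut:cut + 1]}
            pairs.append((input_entry, label_entry))
    return pairs


def merge_pair(input_entry, label_entry):
    """Join an (input, label) pair back into a single dict-style entry."""
    merged = dict(input_entry)
    merged["target"] = list(input_entry["target"]) + list(label_entry["target"])
    return merged


def pairs_to_entries(pairs, num_series, num_windows):
    """Merge pairs, then reorder from series-major to window-major.

    The window-major order is an assumption about what the grouper expects.
    """
    merged = [merge_pair(inp, lab) for inp, lab in pairs]
    return [
        merged[s * num_windows + w]
        for w in range(num_windows)
        for s in range(num_series)
    ]


series = {"var_1": [0, 1, 2, 3, 4], "var_2": [10, 11, 12, 13, 14]}
pairs = make_pairs(series, first_cut=3, num_windows=2)

# Indexing a pair with a field name fails the same way as in the traceback.
try:
    pairs[0]["start"]
except TypeError as exc:
    error_message = str(exc)

# Merged entries are plain dicts again, so field lookups work.
entries = pairs_to_entries(pairs, num_series=2, num_windows=2)
```

With real data the same idea would use np.concatenate on the target arrays; whether the resulting list is accepted by test_grouper has not been checked here.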
Environment
Operating system: Google Colab
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=22.04
DISTRIB_CODENAME=jammy
DISTRIB_DESCRIPTION="Ubuntu 22.04.3 LTS"
PRETTY_NAME="Ubuntu 22.04.3 LTS"
NAME="Ubuntu"
VERSION_ID="22.04"
VERSION="22.04.3 LTS (Jammy Jellyfish)"
VERSION_CODENAME=jammy
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=jammy
Python version: Python 3.10.12
GluonTS version: 0.15.1
MXNet version: no MXNet