Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

extend DummyDatasetMultipleSequenceLength #492

Draft
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

jotix16
Copy link
Contributor

@jotix16 jotix16 commented Apr 27, 2021

Added get_all_tags, init_seq_order and get_current_seq_order for DummyDatasetMultipleSequenceLength

@jotix16 jotix16 requested review from albertz and a team as code owners April 27, 2021 19:04
@jotix16 jotix16 marked this pull request as draft April 27, 2021 19:28
"""
:rtype: list[str]
"""
return ["seq-%i" % i for i in range(self._num_seqs)]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

self._num_seqs might be inf. It should handle that correctly (fall back to the default behavior).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the default behaviour?

Does it even make sense for a dummy dataset to be initialized with num_seqs=inf?
In GeneratingDataset it is per default inf but here it has to be set by the user.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the default behaviour?

Whatever super of this function does. Just check the code.

Does it even make sense for a dummy dataset to be initialized with num_seqs=inf?

Yes it does. Why not? It's usually only used for testing. In any case, this is explicitly supported.

def get_seq_len(i):
return self.seq_len['data']

self._seq_order = self.get_seq_order_for_epoch(epoch, self._num_seqs, get_seq_len=get_seq_len)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure this makes sense to support for this dataset.
Also you are not using it, i.e. ignoring it (but maybe just because it's incomplete so far).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The self._seq_order you mean?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By support, I mean init_seq_order in general.
By use, I mean _seq_order.

@albertz
Copy link
Member

albertz commented Apr 27, 2021

Why is this specific for DummyDatasetMultipleDataKeys? Why not for the base class DummyDataset? Or even better its base class GeneratingDataset.

@albertz
Copy link
Member

albertz commented Apr 27, 2021

Please also see failing tests.

  returnn/datasets/generating.py:779: WEAK WARNING PyMissingOrEmptyDocstringInspection: Missing docstring
  returnn/datasets/generating.py:779: WEAK WARNING PyUnusedLocalInspection: Parameter 'i' value is not used
  returnn/datasets/generating.py:785: WEAK WARNING PyMissingOrEmptyDocstringInspection: Missing docstring

@jotix16 jotix16 changed the title extend DummyDatasetMultipleDataKeys extend DummyDatasetMultipleSequenceLength Apr 28, 2021
@jotix16
Copy link
Contributor Author

jotix16 commented Apr 28, 2021

Why is this specific for DummyDatasetMultipleDataKeys? Why not for the base class DummyDataset? Or even better its base class GeneratingDataset.

I don't see any reason why not to use it for GeneratingDataset directly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants