use terminology *data reader creator* instead of *data reader*, and r… #1412

Merged
helinwang merged 4 commits into PaddlePaddle:develop on Feb 22, 2017

Conversation

helinwang
Contributor

…emove lambda usage from design doc

def data_reader_bool(t):
    while True:
        yield t
def data_reader_creator_bool(t):
Collaborator

data_reader_bool ==> reader_creator_bool

Contributor Author

Done.

    while True:
        yield t
def data_reader_creator_bool(t):
    def creator:
Collaborator

creator ==> reader

Contributor Author

Done.


For example, we want to use a source of real images (reusing mnist dataset), and a source of fake images as input for [Generative Adversarial Networks](https://arxiv.org/abs/1406.2661).

We can do:

```python
def data_reader_fake_image():
def data_reader_creator_fake_image():
Collaborator

data_reader_creator_fake_image ==> reader_creator_random_image

Contributor Author

Done.


For example, we want to use a source of real images (reusing mnist dataset), and a source of fake images as input for [Generative Adversarial Networks](https://arxiv.org/abs/1406.2661).

We can do:

```python
def data_reader_fake_image():
def data_reader_creator_fake_image():
while True:
Collaborator

reader_creator_random_image should take parameters that specify the image size:

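import numpy

def reader_creator_random_image(width, height):
    def reader():
        while True:
            # One flattened image with values drawn uniformly from [-1, 1).
            yield numpy.random.uniform(-1, 1, size=width*height)
    return reader  # the creator must return the reader it defines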
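
A minimal sketch of how such a parameterized creator might be consumed. It rests on two assumptions: the standard library's `random.uniform` stands in for `numpy.random.uniform` so the example has no third-party dependency, and `take` is a hypothetical helper, not part of Paddle.

```python
import itertools
import random


def reader_creator_random_image(width, height):
    """Return a reader that yields random images of width*height pixels."""
    def reader():
        while True:
            # Stand-in for numpy.random.uniform(-1, 1, size=width*height).
            yield [random.uniform(-1, 1) for _ in range(width * height)]
    return reader


def take(reader, n):
    """Hypothetical helper: collect the first n items from a reader."""
    return list(itertools.islice(reader(), n))


# The reader is infinite, so the consumer decides how many items to pull.
images = take(reader_creator_random_image(28, 28), 3)
print(len(images), len(images[0]))  # prints: 3 784
```

Because the reader never terminates on its own, whatever consumes it has to bound the number of items, as `take` does here.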

Contributor Author

Done.


Paddle reads data from data reader during training. It will be passed into `paddle.train` as a parameter.
Paddle reads data from *data reader* during training. *data reader creator* (or *reader creator*) creates a *data reader* when invoked. *reader creator* will be passed into `paddle.train` as a parameter.
Collaborator

At training and testing time, PaddlePaddle programs need to read data. To make it easier for users to write data-reading code, we define that

  • A reader is a function that reads data (from file, network, random number generator, etc) and yields data items.
  • A reader creator is a function that returns a reader function.
  • A reader decorator is a function, which accepts one or more readers, and returns a reader.

and provide frequently used reader creators and reader decorators.
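
The three definitions above can be sketched in plain Python. This is only an illustration: `reader_creator_range`, `batch_decorator`, and `compose` are hypothetical names chosen for the example and are not taken from the Paddle API.

```python
import itertools


def reader_creator_range(n):
    """Reader creator: returns a reader that yields 0..n-1."""
    def reader():
        for i in range(n):
            yield i
    return reader


def batch_decorator(reader, batch_size):
    """Reader decorator: accepts one reader, returns a reader of lists."""
    def batch_reader():
        it = reader()
        while True:
            batch = list(itertools.islice(it, batch_size))
            if not batch:
                return
            yield batch
    return batch_reader


def compose(*readers):
    """Reader decorator: accepts several readers, returns a reader of tuples."""
    def composed_reader():
        # zip stops as soon as the shortest underlying reader is exhausted.
        for items in zip(*(r() for r in readers)):
            yield items
    return composed_reader


numbers = reader_creator_range(5)
print(list(batch_decorator(numbers, 2)()))  # prints: [[0, 1], [2, 3], [4]]
print(list(compose(numbers, numbers)()))
```

Since a reader is a function rather than an iterator, calling it again starts a fresh pass over the data, which is what lets a decorator or a training loop iterate more than once.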

Contributor Author

Done.


Data reader is a function with no parameter that creates an iterable (anything that can be used in `for x in iterable`):
Data reader creator is a function with no parameter that creates an iterable (anything that can be used in `for x in iterable`):
Collaborator

I think a reader creator can have parameters. For example:

def reader_creator_text(filename):
    def reader():
        ... read from filename ...
    return reader

Here the creator does have one parameter.
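
A runnable sketch of a creator with one parameter, under the assumption that the elided body reads the file line by line and yields one line per item. The temporary file and the variable names are illustrative only.

```python
import os
import tempfile


def reader_creator_text(filename):
    """Return a reader that yields the lines of `filename`, one per item."""
    def reader():
        # Assumed behavior: one data item per line, newline stripped.
        with open(filename) as f:
            for line in f:
                yield line.rstrip("\n")
    return reader


# Write a small temporary file so the example is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("first\nsecond\n")
    path = tmp.name

reader = reader_creator_text(path)
first_pass = list(reader())
second_pass = list(reader())  # calling the reader again re-reads the file
os.remove(path)
print(first_pass)  # prints: ['first', 'second']
```

The parameter is captured by the closure, so the returned reader itself still takes no arguments.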

Contributor Author

Done.

@@ -41,11 +41,11 @@ label_layer = paddle.layer.data("label", ...)
paddle.train(paddle.dataset.mnist, {"image":0, "label":1}, 128, 10, ...)
```

## Data Reader Decorators
## Data Reader Creator Decorator
Collaborator

Reader Creator Decorator ==> Reader Decorator

Contributor Author

Done.

@wangkuiyi wangkuiyi mentioned this pull request Feb 22, 2017
@helinwang helinwang merged commit 5a1d926 into PaddlePaddle:develop Feb 22, 2017
lizexu123 pushed a commit to lizexu123/Paddle that referenced this pull request Feb 23, 2024
2 participants