Improve documentation and examples #30

Merged
merged 2 commits into from
May 4, 2022
70 changes: 35 additions & 35 deletions README.md
@@ -11,7 +11,7 @@ such as a list of user records or log entries. CSV is not exactly a new format
and has been used in a large number of systems for decades. In particular, CSV
is often used for historical reasons and despite its shortcomings, it is still a
very common export format for a large number of tools to interface with
spreadsheet processors (such as Exel, Calc etc.). This library provides a simple
spreadsheet processors (such as Excel, Calc etc.). This library provides a simple
streaming API to process very large CSV files with thousands or even millions of
rows efficiently without having to load the whole file into memory at once.

@@ -40,7 +40,7 @@ rows efficiently without having to load the whole file into memory at once.

## Support us

We invest a lot of time developing, maintaining and updating our awesome
We invest a lot of time developing, maintaining, and updating our awesome
open-source projects. You can help us sustain the high quality of our work by
[becoming a sponsor on GitHub](https://github.com/sponsors/clue). Sponsors get
numerous benefits in return, see our [sponsoring page](https://github.com/sponsors/clue)
@@ -96,9 +96,9 @@ World!"
started using some CSV-variant long before this standard was defined.

Some applications refer to CSV as Character-Separated Values, simply because
using another delimiter (such as semicolon or tab) is a rather common approach
using another delimiter (such as a semicolon or tab) is a rather common approach
to avoid the need to enclose common values in quotes. This is particularly
common for systems in Europe (and elsewhere) that use a comma as decimal separator.
common for systems in Europe (and elsewhere) that use a comma as a decimal separator.

```
name;comment
@@ -115,7 +115,7 @@ consistently.

Despite its shortcomings, CSV is widely used and this is unlikely to change any
time soon. In particular, CSV is a very common export format for a lot of tools
to interface with spreadsheet processors (such as Exel, Calc etc.). This means
to interface with spreadsheet processors (such as Excel, Calc etc.). This means
that CSV is often used for historical reasons and using CSV to store structured
application data is usually not a good idea nowadays – but exporting to CSV for
known applications continues to be a very reasonable approach.
@@ -155,12 +155,12 @@ test,1,24
"hello world",2,48
```
```php
$stdin = new ReadableResourceStream(STDIN);
$stdin = new React\Stream\ReadableResourceStream(STDIN);

$stream = new Decoder($stdin);
$csv = new Clue\React\Csv\Decoder($stdin);

$stream->on('data', function ($data) {
// data is a parsed element from the CSV stream
$csv->on('data', function (array $data) {
// $data is a parsed element from the CSV stream
// line 1: $data = array('test', '1', '24');
// line 2: $data = array('hello world', '2', '48');
var_dump($data);
@@ -179,9 +179,9 @@ use a quote enclosure character (`"`) and a backslash escape character (`\`).
This behavior can be controlled through the optional constructor parameters:

```php
$stream = new Decoder($stdin, ';');
$csv = new Clue\React\Csv\Decoder($stdin, ';');

$stream->on('data', function ($data) {
$csv->on('data', function (array $data) {
// CSV fields will now be delimited by semicolon
});
```
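
As a rough illustration of what a semicolon delimiter changes, the following sketch parses single lines with PHP's built-in `str_getcsv()` instead of the `Decoder` class, so it runs without a stream or event loop. The helper name `parseSemicolonLine()` is hypothetical and the sample input is made up for illustration; the enclosure and escape characters are assumed to match the defaults described above.

```php
<?php

// Sketch only: uses PHP's built-in str_getcsv(), not the Decoder class.
// parseSemicolonLine() is a hypothetical helper name for this illustration.
function parseSemicolonLine($line)
{
    // delimiter ';', enclosure '"', escape '\' (assumed defaults, see above)
    return str_getcsv($line, ';', '"', '\\');
}

// a decimal comma needs no quoting when ';' separates the fields
var_dump(parseSemicolonLine('test;1,5'));

// a field that contains the delimiter itself still has to be enclosed in quotes
var_dump(parseSemicolonLine('"hello; world";2,5'));
```

With `;` as the delimiter, a value such as `1,5` no longer needs to be enclosed in quotes, while a field that contains a semicolon still does.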
@@ -193,15 +193,15 @@ unreasonably long lines. It accepts an additional argument if you want to change
this from the default of 64 KiB:

```php
$stream = new Decoder($stdin, ',', '"', '\\', 64 * 1024);
$csv = new Clue\React\Csv\Decoder($stdin, ',', '"', '\\', 64 * 1024);
```

If the underlying stream emits an `error` event or the plain stream contains
any data that does not represent a valid CSV stream,
it will emit an `error` event and then `close` the input stream:

```php
$stream->on('error', function (Exception $error) {
$csv->on('error', function (Exception $error) {
// an error occurred, stream will close next
});
```
@@ -212,7 +212,7 @@ followed by an `end` event on success or an `error` event for
incomplete/invalid CSV data as above:

```php
$stream->on('end', function () {
$csv->on('end', function () {
// stream successfully ended, stream will close next
});
```
@@ -221,7 +221,7 @@ If either the underlying stream or the `Decoder` is closed, it will forward
the `close` event:

```php
$stream->on('close', function () {
$csv->on('close', function () {
// stream closed
// possibly after an "end" event or due to an "error" event
});
@@ -231,7 +231,7 @@ The `close(): void` method can be used to explicitly close the `Decoder` and
its underlying stream:

```php
$stream->close();
$csv->close();
```

The `pipe(WritableStreamInterface $dest, array $options = array()): WritableStreamInterface`
@@ -240,7 +240,7 @@ Please note that the `Decoder` emits decoded/parsed data events, while many
(most?) writable streams expect only data chunks:

```php
$stream->pipe($logger);
$csv->pipe($logger);
```

For more details, see ReactPHP's
Expand All @@ -261,11 +261,11 @@ test,1
"hello world",2
```
```php
$stdin = new ReadableResourceStream(STDIN);
$stdin = new React\Stream\ReadableResourceStream(STDIN);

$stream = new AssocDecoder($stdin);
$csv = new Clue\React\Csv\AssocDecoder($stdin);

$stream->on('data', function ($data) {
$csv->on('data', function (array $data) {
// $data is a parsed element from the CSV stream
// line 1: $data = array('name' => 'test', 'id' => '1');
// line 2: $data = array('name' => 'hello world', 'id' => '2');
@@ -285,7 +285,7 @@ assoc arrays. After receiving the name of headers, this class will always emit
a `headers` event with a list of header names.

```php
$stream->on('headers', function (array $headers) {
$csv->on('headers', function (array $headers) {
// header line: $headers = array('name', 'id');
var_dump($headers);
});
@@ -312,12 +312,12 @@ and accepts its data through the same interface, but handles any data as complet
CSV elements instead of just chunks of strings:

```php
$stdout = new WritableResourceStream(STDOUT);
$stdout = new React\Stream\WritableResourceStream(STDOUT);

$stream = new Encoder($stdout);
$csv = new Clue\React\Csv\Encoder($stdout);

$stream->write(array('test', true, 24));
$stream->write(array('hello world', 2, 48));
$csv->write(array('test', true, 24));
$csv->write(array('hello world', 2, 48));
```
```
test,1,24
@@ -332,9 +332,9 @@ a Unix-style EOL (`\n` or `LF`).
This behavior can be controlled through the optional constructor parameters:

```php
$stream = new Encoder($stdout, ';');
$csv = new Clue\React\Csv\Encoder($stdout, ';');

$stream->write(array('hello', 'world'));
$csv->write(array('hello', 'world'));
```
```
hello;world
@@ -345,7 +345,7 @@ any data that can not be represented as a valid CSV stream,
it will emit an `error` event and then `close` the input stream:

```php
$stream->on('error', function (Exception $error) {
$csv->on('error', function (Exception $error) {
// an error occurred, stream will close next
});
```
@@ -354,7 +354,7 @@ If either the underlying stream or the `Encoder` is closed, it will forward
the `close` event:

```php
$stream->on('close', function () {
$csv->on('close', function () {
// stream closed
// possibly after an "end" event or due to an "error" event
});
@@ -364,22 +364,22 @@ The `end(mixed $data = null): void` method can be used to optionally emit
any final data and then soft-close the `Encoder` and its underlying stream:

```php
$stream->end();
$csv->end();
```

The `close(): void` method can be used to explicitly close the `Encoder` and
its underlying stream:

```php
$stream->close();
$csv->close();
```

For more details, see ReactPHP's
[`WritableStreamInterface`](https://github.com/reactphp/stream#writablestreaminterface).
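
To make the output format of the examples above easier to check by hand, here is a minimal sketch that produces the same two lines with PHP's built-in `fputcsv()` writing to an in-memory buffer. It does not use the `Encoder` class, and whether the `Encoder` behaves identically in every edge case is an assumption. The helper name `encodeCsvLine()` is hypothetical.

```php
<?php

// Sketch only: uses PHP's built-in fputcsv(), not the Encoder class.
// encodeCsvLine() is a hypothetical helper name for this illustration.
function encodeCsvLine(array $fields, $delimiter = ',')
{
    // write one record into an in-memory buffer and read it back as a string
    $buffer = fopen('php://memory', 'w+');
    fputcsv($buffer, $fields, $delimiter, '"', '\\');
    rewind($buffer);
    $line = stream_get_contents($buffer);
    fclose($buffer);

    return $line;
}

// scalar values are cast to strings, so boolean true becomes "1"
echo encodeCsvLine(array('test', true, 24));      // test,1,24

// a custom delimiter is passed as the second argument
echo encodeCsvLine(array('hello', 'world'), ';'); // hello;world
```

Each returned line ends with a Unix-style EOL (`\n`), which matches the default described for the `Encoder` above.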

## Install

The recommended way to install this library is [through Composer](https://getcomposer.org).
The recommended way to install this library is [through Composer](https://getcomposer.org/).
[New to Composer?](https://getcomposer.org/doc/00-intro.md)

This project follows [SemVer](https://semver.org/).
Expand All @@ -394,12 +394,12 @@ See also the [CHANGELOG](CHANGELOG.md) for details about version upgrades.
This project aims to run on any platform and thus does not require any PHP
extensions and supports running on legacy PHP 5.3 through current PHP 8+ and
HHVM.
It's *highly recommended to use the latest supported PHP version PHP 7+* for this project.
It's *highly recommended to use the latest supported PHP version* for this project.

## Tests

To run the test suite, you first need to clone this repo and then install all
dependencies [through Composer](https://getcomposer.org):
dependencies [through Composer](https://getcomposer.org/):

```bash
$ composer install
@@ -408,7 +408,7 @@ $ composer install
To run the test suite, go to the project root and run:

```bash
$ php vendor/bin/phpunit
$ vendor/bin/phpunit
```

## License
15 changes: 6 additions & 9 deletions examples/01-count.php
@@ -2,31 +2,28 @@

// $ php examples/01-count.php < examples/users.csv

use Clue\React\Csv\AssocDecoder;
use React\EventLoop\Loop;
use React\Stream\ReadableResourceStream;
use React\Stream\WritableResourceStream;

require __DIR__ . '/../vendor/autoload.php';

$exit = 0;
$in = new ReadableResourceStream(STDIN);
$info = new WritableResourceStream(STDERR);
$in = new React\Stream\ReadableResourceStream(STDIN);
$info = new React\Stream\WritableResourceStream(STDERR);

$delimiter = isset($argv[1]) ? $argv[1] : ',';

$decoder = new AssocDecoder($in, $delimiter);
$csv = new Clue\React\Csv\AssocDecoder($in, $delimiter);

$count = 0;
$decoder->on('data', function () use (&$count) {
$csv->on('data', function () use (&$count) {
++$count;
});

$decoder->on('end', function () use (&$count) {
$csv->on('end', function () use (&$count) {
echo $count . PHP_EOL;
});

$decoder->on('error', function (Exception $e) use (&$count, &$exit, $info) {
$csv->on('error', function (Exception $e) use (&$count, &$exit, $info) {
$info->write('ERROR after record ' . $count . ': ' . $e->getMessage() . PHP_EOL);
$exit = 1;
});
18 changes: 7 additions & 11 deletions examples/02-validate.php
@@ -2,26 +2,22 @@

// $ php examples/02-validate.php < examples/users.csv

use Clue\React\Csv\Decoder;
use Clue\React\Csv\Encoder;
use React\EventLoop\Loop;
use React\Stream\ReadableResourceStream;
use React\Stream\WritableResourceStream;

require __DIR__ . '/../vendor/autoload.php';

$exit = 0;
$in = new ReadableResourceStream(STDIN);
$out = new WritableResourceStream(STDOUT);
$info = new WritableResourceStream(STDERR);
$in = new React\Stream\ReadableResourceStream(STDIN);
$out = new React\Stream\WritableResourceStream(STDOUT);
$info = new React\Stream\WritableResourceStream(STDERR);

$delimiter = isset($argv[1]) ? $argv[1] : ',';

$decoder = new Decoder($in, $delimiter);
$encoder = new Encoder($out, $delimiter);
$decoder->pipe($encoder);
$csv = new Clue\React\Csv\Decoder($in, $delimiter);
$encoder = new Clue\React\Csv\Encoder($out, $delimiter);
$csv->pipe($encoder);

$decoder->on('error', function (Exception $e) use ($info, &$exit) {
$csv->on('error', function (Exception $e) use ($info, &$exit) {
$info->write('ERROR: ' . $e->getMessage() . PHP_EOL);
$exit = 1;
});
18 changes: 7 additions & 11 deletions examples/11-csv2ndjson.php
@@ -3,34 +3,30 @@
// $ php examples/11-csv2ndjson.php < examples/users.csv > examples/users.ndjson
// see also https://github.com/clue/reactphp-ndjson

use Clue\React\Csv\AssocDecoder;
use React\EventLoop\Loop;
use React\Stream\ReadableResourceStream;
use React\Stream\WritableResourceStream;
use React\Stream\ThroughStream;

require __DIR__ . '/../vendor/autoload.php';

$exit = 0;
$in = new ReadableResourceStream(STDIN);
$out = new WritableResourceStream(STDOUT);
$info = new WritableResourceStream(STDERR);
$in = new React\Stream\ReadableResourceStream(STDIN);
$out = new React\Stream\WritableResourceStream(STDOUT);
$info = new React\Stream\WritableResourceStream(STDERR);

$delimiter = isset($argv[1]) ? $argv[1] : ',';

$decoder = new AssocDecoder($in, $delimiter);
$csv = new Clue\React\Csv\AssocDecoder($in, $delimiter);

$encoder = new ThroughStream(function ($data) {
$encoder = new React\Stream\ThroughStream(function ($data) {
$data = \array_filter($data, function ($one) {
return ($one !== '');
});

return \json_encode($data, JSON_UNESCAPED_SLASHES | JSON_UNESCAPED_UNICODE) . "\n";
});

$decoder->pipe($encoder)->pipe($out);
$csv->pipe($encoder)->pipe($out);

$decoder->on('error', function (Exception $e) use ($info, &$exit) {
$csv->on('error', function (Exception $e) use ($info, &$exit) {
$info->write('ERROR: ' . $e->getMessage() . PHP_EOL);
$exit = 1;
});