Simplify ResultSet instantiation
nyamsprod committed Jan 19, 2025
1 parent 45c4618 commit 8b13132
Showing 8 changed files with 252 additions and 61 deletions.
7 changes: 5 additions & 2 deletions CHANGELOG.md
@@ -10,13 +10,16 @@ All Notable changes to `Csv` will be documented in this file
- `TabularDataReader::selectAllExcept`
- `Statement::selectAllExcept`
- `ResultSet::createFromTabularData`
- `RdbmsResult`
- `TabularData`
- `ResultSet::createFromRdbms`
- `RdbmsResult` class to allow converting RDBMS result into `ResultSet`
- `TabularData` interface

### Deprecated

- `Writer::relaxEnclosure` use `Writer::necessaryEnclosure`
- `ResultSet::createFromTabularDataReader` use `ResultSet::createFromTabularData`
- `ResultSet::createFromRecords` use `ResultSet::createFromTabularData`
- `ResultSet::__construct` use `ResultSet::createFromTabularData`

### Fixed

143 changes: 143 additions & 0 deletions docs/9.0/interoperability/tabular-data-importer.md
@@ -0,0 +1,143 @@
---
layout: default
title: Tabular Data Importer
---

# Tabular Data

<p class="message-notice">Starting with version <code>9.22.0</code></p>

Since version `9.6` the package provides a common API to work with tabular-data-like structures. Tabular data
is data organized in rows and columns. Although the package mainly aims at interacting with CSV, its usage
is not restricted to CSV documents: if you can provide a tabular data structure to the package,
it should be able to manipulate that data with ease. Hence the introduction of the `TabularData` interface, which
allows the package to interoperate with any tabular structure.

As seen by the package, tabular data is:

- a collection of similar records (preferably consistent in their size);
- an optional header with unique values.

The `TabularData` interface enforces this contract by extending PHP's `IteratorAggregate` interface and by providing the
`getHeader` method, which returns a list of unique strings (the list can be empty if no header is provided).

```php
interface TabularData extends IteratorAggregate
{
    /** @return list<string> */
    public function getHeader(): array;
}
```

## Basic Usage

Once an object implementing `TabularData` is given to the `ResultSet` class, it can be manipulated and inspected as if
it were a CSV document. It effectively gains access to the full reading API provided by the package.

For instance, the `Reader` class implements the `TabularData` interface, so you can directly instantiate
a `ResultSet` instance using the following code:

```php
$resultSet = ResultSet::createFromTabularData(
    Reader::createFromPath('path/to/file.csv')
);
```
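
As a slightly fuller sketch of the same idea, the snippet below builds the `Reader` from an inline string instead of a file and then inspects the resulting `ResultSet`. It assumes the package is installed with Composer (the `vendor/autoload.php` path is an assumption) and that `ResultSet::createFromTabularData` is available as introduced in this commit; the sample data is made up for illustration.

```php
<?php

require 'vendor/autoload.php'; // assumption: league/csv installed with Composer

use League\Csv\Reader;
use League\Csv\ResultSet;

// Illustrative in-memory CSV document; the first row is used as the header.
$csv = <<<CSV
date,temperature,place
2011-01-01,1,Galway
2011-01-02,-1,Galway
2011-01-01,6,Berkeley
CSV;

$reader = Reader::createFromString($csv);
$reader->setHeaderOffset(0);

$resultSet = ResultSet::createFromTabularData($reader);

$header = $resultSet->getHeader(); // column names taken from the reader
$total = count($resultSet);        // number of records, header row excluded
$first = $resultSet->nth(0);       // first record of the result set

foreach ($resultSet as $record) {
    echo implode(' | ', $record), PHP_EOL;
}
```

Because the `Reader` already knows its header, nothing else has to be passed to the `ResultSet`: the named constructor reads both the records and the header through the `TabularData` contract.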

## Database Importer usage

A common source of tabular data is an RDBMS result. From listing the content of a table to returning the result of
a complex query on multiple tables with joins, RDBMS results are always expressed as tabular data. As such, it is possible
to convert them and manipulate them via the package. To ease such manipulation, the `ResultSet` class exposes the
`ResultSet::createFromRdbms` method:

```php
$connection = new SQLite3('/path/to/my/db.sqlite');
$stmt = $connection->query("SELECT * FROM users");
$stmt instanceof SQLite3Result || throw new RuntimeException('SQLite3 results not available');

$user24 = ResultSet::createFromRdbms($stmt)->nth(23);
```

The `createFromRdbms` method can be used with the following database extensions:

- SQLite3 (`SQLite3Result` object)
- MySQL Improved Extension (`mysqli_result` object)
- PostgreSQL (`PgSql\Result` object, as returned by `pg_get_result`)
- PDO (`PDOStatement` object)
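
The list above includes PDO, so here is a minimal sketch of the same call with a `PDOStatement`. It assumes the `pdo_sqlite` driver is available, that the package is installed with Composer (the `vendor/autoload.php` path is an assumption), and that `ResultSet::createFromRdbms` behaves as described in this commit. The in-memory `users` table and its rows are purely illustrative.

```php
<?php

require 'vendor/autoload.php'; // assumption: league/csv installed with Composer

use League\Csv\ResultSet;

// Illustrative in-memory database so the example needs no external server.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE users (id INTEGER PRIMARY KEY, firstname TEXT)');
$pdo->exec("INSERT INTO users (firstname) VALUES ('Jane'), ('John')");

// With ERRMODE_EXCEPTION a failing query throws instead of returning false.
$stmt = $pdo->query('SELECT id, firstname FROM users ORDER BY id');

$resultSet = ResultSet::createFromRdbms($stmt);

$header = $resultSet->getHeader(); // column names reported by the driver
$total = count($resultSet);        // number of rows returned by the query
$second = $resultSet->nth(1);      // second row of the result
```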

Behind the scenes, the named constructor leverages the `League\Csv\RdbmsResult` class, which implements the `TabularData` interface.
This class is responsible for converting RDBMS results into `TabularData` instances. You can also use the class
as a standalone feature to quickly:

- retrieve column names from the listed database extensions as follows:

```php
$connection = pg_connect("dbname=publisher");
$result = pg_query($connection, "SELECT * FROM authors");
$result !== false || throw new RuntimeException('PostgreSQL results not available');

$names = RdbmsResult::columnNames($result);
//will return ['firstname', 'lastname', ...]
```

- convert the result into an `Iterator` using the `records` public static method:

```php
mysqli_report(MYSQLI_REPORT_ERROR | MYSQLI_REPORT_STRICT);
$connection = new mysqli("localhost", "my_user", "my_password", "world");
$result = $connection->query("SELECT * FROM authors");
$result instanceof mysqli_result || throw new RuntimeException('MySQL results not available');
foreach (RdbmsResult::records($result) as $record) {
    // each iteration yields one record matching the processed query.
}
```

<p class="message-warning">The <code>PDOStatement</code> class does not support rewinding the object.
To work around this limitation, <code>RdbmsResult</code> caches the results in an
<code>ArrayIterator</code> instance, which can lead to high memory usage if the
returned <code>PDOStatement</code> result is large.</p>

## Generic Importer Logic

Implementing the `TabularData` interface should be straightforward: you can convert any structure into a `TabularData` instance
using the following logic. Keep in mind that the code needed to generate an instance may vary depending on the source and the
size of your data, but the logic should stay the same.

```php
use League\Csv\ResultSet;
use League\Csv\TabularData;

$payload = <<<JSON
[
    {"id": 1, "firstname": "Jonn", "lastname": "doe", "email": "[email protected]"},
    {"id": 2, "firstname": "Jane", "lastname": "doe", "email": "[email protected]"}
]
JSON;

$tabularData = new class ($payload) implements TabularData {
    private readonly array $header;
    private readonly ArrayIterator $records;

    public function __construct(string $payload)
    {
        try {
            // JSON_THROW_ON_ERROR makes invalid JSON raise an exception instead of returning null.
            $data = json_decode($payload, true, 512, JSON_THROW_ON_ERROR);
            // The keys of the first record are used as the header.
            $this->header = array_keys($data[0] ?? []);
            $this->records = new ArrayIterator($data);
        } catch (Throwable $exception) {
            throw new ValueError('The provided JSON payload could not be converted into a Tabular Data instance.', previous: $exception);
        }
    }

    public function getHeader(): array
    {
        return $this->header;
    }

    public function getIterator(): Iterator
    {
        return $this->records;
    }
};

$resultSet = ResultSet::createFromTabularData($tabularData);
```
31 changes: 1 addition & 30 deletions docs/9.0/reader/resultset.md
@@ -44,36 +44,7 @@ the `createFromRdbms` can be used with the following Database Extensions:
As such using the instance on huge results will trigger high memory usage as all the data will be stored in a
<code>ArrayIterator</code> instance for cache to allow rewinding and inspecting the tabular data.</p>

Behind the scene the named constructor leverages the `RdbmsResult` class which implements the `TabularData` interface.
This class is responsible from converting RDBMS results into TabularData` instances. But you can also use the class
to retrieve column names from the listed Database extensions as follow:

```php
$db = new SQLite3( '/path/to/my/db.sqlite');
$stmt = $db->query("SELECT * FROM users");
$stmt instanceof SQLite3Result || throw new RuntimeException('SQLite3 results not available');

$names = RdbmsResult::columnNames($stmt);
//will return ['firstname', 'lastname', ...]
```

The same class can also convert the Database result into an `Iterator` using the `records` public static method.

```php
$db = new SQLite3( '/path/to/my/db.sqlite');
$stmt = $db->query(
"SELECT *
FROM users
INNER JOIN permissions
ON users.id = permissions.user_id
WHERE users.is_active = 't'
AND permissions.is_active = 't'"
);
$stmt instanceof SQLite3Result || throw new RuntimeException('SQLite3 results not available');
foreach (RdbmsResult::records($stmt) as $record) {
// returns each found record which match the processed query.
}
```
Please refer to the [TabularData Importer](/9.0/interoperability/tabular-data-importer) for more information.

## Selecting records

1 change: 1 addition & 0 deletions docs/_data/menu.yml
@@ -27,6 +27,7 @@ version:
Force Enclosure : '/9.0/interoperability/enclose-field/'
Handling Delimiter : '/9.0/interoperability/swap-delimiter/'
Formula Injection : '/9.0/interoperability/escape-formula-injection/'
Tabular Data: '/9.0/interoperability/tabular-data-importer/'
Converting Records:
Overview: '/9.0/converter/'
Charset Converter: '/9.0/converter/charset/'
26 changes: 24 additions & 2 deletions src/FragmentFinder.php
@@ -13,6 +13,8 @@

namespace League\Csv;

use Iterator;

use function array_filter;
use function array_map;
use function array_reduce;
@@ -117,7 +119,17 @@ private function find(array $parsedExpression, TabularDataReader $tabularDataRea

        $selections = array_filter($selections, fn (array $selection) => -1 !== $selection['start']);
        if ([] === $selections) {
            return [ResultSet::createFromRecords()];
            return [ResultSet::createFromTabularData(new class () implements TabularData {
                public function getHeader(): array
                {
                    return [];
                }

                public function getIterator(): Iterator
                {
                    return MapIterator::toIterator([]);
                }
            })];
        }

        if (self::TYPE_ROW === $type) {
@@ -143,7 +155,17 @@ private function find(array $parsedExpression, TabularDataReader $tabularDataRea
        );

        return [match ([]) {
            $columns => ResultSet::createFromRecords(),
            $columns => ResultSet::createFromTabularData(new class () implements TabularData {
                public function getHeader(): array
                {
                    return [];
                }

                public function getIterator(): Iterator
                {
                    return MapIterator::toIterator([]);
                }
            }),
            default => Statement::create()->select(...$columns)->process($tabularDataReader),
        }];
    }
17 changes: 9 additions & 8 deletions src/ResultSet.php
@@ -98,14 +98,6 @@ public static function createFromTabularData(TabularData $records): self
        return new self($records->getIterator(), $records->getHeader());
    }

    /**
     * Returns a new instance from a collection without header.
     */
    public static function createFromRecords(iterable $records = []): self
    {
        return new self(MapIterator::toIterator($records));
    }

    public function __destruct()
    {
        unset($this->records);
@@ -675,4 +667,13 @@ public static function createFromTabularDataReader(TabularDataReader $reader): s
    {
        return self::createFromTabularData($reader);
    }

    /**
     * Returns a new instance from a collection without header.
     */
    #[Deprecated(message:'use League\Csv\ResultSet::createFromTabularData() instead', since:'league/csv:9.22.0')]
    public static function createFromRecords(iterable $records = []): self
    {
        return new self(MapIterator::toIterator($records));
    }
}
56 changes: 39 additions & 17 deletions src/ResultSetTest.php
@@ -13,6 +13,8 @@

namespace League\Csv;

use ArrayIterator;
use Iterator;
use PHPUnit\Framework\Attributes\DataProvider;
use PHPUnit\Framework\Attributes\Group;
use SplTempFileObject;
@@ -51,27 +53,47 @@ protected function tearDown(): void

    protected function tabularData(): TabularDataReader
    {
        return new ResultSet([
            ['date', 'temperature', 'place'],
            ['2011-01-01', '1', 'Galway'],
            ['2011-01-02', '-1', 'Galway'],
            ['2011-01-03', '0', 'Galway'],
            ['2011-01-01', '6', 'Berkeley'],
            ['2011-01-02', '8', 'Berkeley'],
            ['2011-01-03', '5', 'Berkeley'],
        ]);
        return ResultSet::createFromTabularData(new class () implements TabularData {
            public function getHeader(): array
            {
                return [];
            }

            public function getIterator(): Iterator
            {
                return new ArrayIterator([
                    ['date', 'temperature', 'place'],
                    ['2011-01-01', '1', 'Galway'],
                    ['2011-01-02', '-1', 'Galway'],
                    ['2011-01-03', '0', 'Galway'],
                    ['2011-01-01', '6', 'Berkeley'],
                    ['2011-01-02', '8', 'Berkeley'],
                    ['2011-01-03', '5', 'Berkeley'],
                ]);
            }
        });
    }

    protected function tabularDataWithHeader(): TabularDataReader
    {
        return new ResultSet([
            ['2011-01-01', '1', 'Galway'],
            ['2011-01-02', '-1', 'Galway'],
            ['2011-01-03', '0', 'Galway'],
            ['2011-01-01', '6', 'Berkeley'],
            ['2011-01-02', '8', 'Berkeley'],
            ['2011-01-03', '5', 'Berkeley'],
        ], ['date', 'temperature', 'place']);
        return ResultSet::createFromTabularData(new class () implements TabularData {
            public function getHeader(): array
            {
                return ['date', 'temperature', 'place'];
            }

            public function getIterator(): Iterator
            {
                return new ArrayIterator([
                    ['2011-01-01', '1', 'Galway'],
                    ['2011-01-02', '-1', 'Galway'],
                    ['2011-01-03', '0', 'Galway'],
                    ['2011-01-01', '6', 'Berkeley'],
                    ['2011-01-02', '8', 'Berkeley'],
                    ['2011-01-03', '5', 'Berkeley'],
                ]);
            }
        });
    }

    public function testFilter(): void