Commit

Replace spice.ai dataset examples with taxi_trips from spiceai/quickstart
ewgenius committed Dec 16, 2024
1 parent d70f51b commit 9da1512
Showing 9 changed files with 88 additions and 72 deletions.
26 changes: 20 additions & 6 deletions spiceaidocs/docs/cli/reference/datasets.md
@@ -23,17 +23,31 @@ spice datasets [flags]
```shell
>>> spice datasets

FROM NAME REPLICATION ACCELERATION DEPENDSON STATUS
spice.ai/eth.beacon.recent_slots eth_beacon_recent_slotsss false false Ready
spice.ai/eth.recent_blocks eth_rec_blocks false false Initializing
FROM NAME REPLICATION ACCELERATION STATUS PROPERTIES
spice.ai/spiceai/quickstart/datasets/taxi_trips taxi_trips false false Ready map[]
spice.ai/spiceai/tpch/datasets/tpch.customer tpch.customer false false Ready map[]
spice.ai/spiceai/tpch/datasets/tpch.lineitem tpch.lineitem false false Ready map[]
spice.ai/spiceai/tpch/datasets/tpch.nation tpch.nation false false Ready map[]
spice.ai/spiceai/tpch/datasets/tpch.orders tpch.orders false false Ready map[]
spice.ai/spiceai/tpch/datasets/tpch.part tpch.part false false Ready map[]
spice.ai/spiceai/tpch/datasets/tpch.partsupp tpch.partsupp false false Ready map[]
spice.ai/spiceai/tpch/datasets/tpch.region tpch.region false false Ready map[]
spice.ai/spiceai/tpch/datasets/tpch.supplier tpch.supplier false false Ready map[]
```

### Additional Example

```shell
>>> spice datasets --tls-root-certificate-file /path/to/cert.pem

FROM NAME REPLICATION ACCELERATION DEPENDSON STATUS
spice.ai/eth.beacon.recent_slots eth_beacon_recent_slotsss false false Ready
spice.ai/eth.recent_blocks eth_rec_blocks false false Initializing
FROM NAME REPLICATION ACCELERATION STATUS PROPERTIES
spice.ai/spiceai/quickstart/datasets/taxi_trips taxi_trips false false Ready map[]
spice.ai/spiceai/tpch/datasets/tpch.customer tpch.customer false false Ready map[]
spice.ai/spiceai/tpch/datasets/tpch.lineitem tpch.lineitem false false Ready map[]
spice.ai/spiceai/tpch/datasets/tpch.nation tpch.nation false false Ready map[]
spice.ai/spiceai/tpch/datasets/tpch.orders tpch.orders false false Ready map[]
spice.ai/spiceai/tpch/datasets/tpch.part tpch.part false false Ready map[]
spice.ai/spiceai/tpch/datasets/tpch.partsupp tpch.partsupp false false Ready map[]
spice.ai/spiceai/tpch/datasets/tpch.region tpch.region false false Ready map[]
spice.ai/spiceai/tpch/datasets/tpch.supplier tpch.supplier false false Ready map[]
```
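
The table printed by `spice datasets` can be filtered with standard tools. The following is a minimal sketch, assuming the columns are whitespace-separated in the order shown above (FROM, NAME, REPLICATION, ACCELERATION, STATUS, PROPERTIES). The `not_ready` helper and the sample rows are hypothetical and only illustrate the idea; the `Initializing` status is taken from the earlier version of this example.

```shell
# not_ready (hypothetical helper): read the `spice datasets` table on stdin
# and print NAME and STATUS for every dataset whose STATUS is not "Ready".
not_ready() {
  # Skip blank lines and the header row; column 5 is STATUS, column 2 is NAME.
  awk 'NF >= 5 && $1 != "FROM" && $5 != "Ready" { print $2, $5 }'
}

# Sample input mirroring the table layout; the second row's status is made up.
sample='FROM NAME REPLICATION ACCELERATION STATUS PROPERTIES
spice.ai/spiceai/quickstart/datasets/taxi_trips taxi_trips false false Ready map[]
spice.ai/spiceai/tpch/datasets/tpch.orders tpch.orders false false Initializing map[]'

printf '%s\n' "$sample" | not_ready

# Against a running runtime (assumed usage): spice datasets | not_ready
```

If the real output ever contains an empty column, the field positions shift and the filter would need adjusting.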
18 changes: 9 additions & 9 deletions spiceaidocs/docs/components/data-accelerators/data-refresh.md
@@ -187,15 +187,15 @@ Example:

```yaml
datasets:
- from: spice.ai/eth.recent_blocks
name: eth_recent_blocks
- from: spice.ai/spiceai/quickstart/datasets/taxi_trips
name: taxi_trips
acceleration:
enabled: true
refresh_mode: full
refresh_check_interval: 10s
```

This configuration will refresh `eth.recent_blocks` data every 10 seconds.
This configuration will refresh `taxi_trips` data every 10 seconds.

## Refresh On-Demand

@@ -242,8 +242,8 @@ Example: Disable rertries

```yaml
datasets:
- from: spice.ai/eth.recent_blocks
name: eth_recent_blocks
- from: spice.ai/spiceai/quickstart/datasets/taxi_trips
name: taxi_trips
acceleration:
refresh_retry_enabled: false
refresh_check_interval: 30s
@@ -253,8 +253,8 @@ Example: Limit retries to a maximum of 10 attempts

```yaml
datasets:
- from: spice.ai/eth.recent_blocks
name: eth_recent_blocks
- from: spice.ai/spiceai/quickstart/datasets/taxi_trips
name: taxi_trips
acceleration:
refresh_retry_max_attempts: 10
refresh_check_interval: 30s
@@ -278,8 +278,8 @@ Example:

```yaml
datasets:
- from: spice.ai/eth.recent_blocks
name: eth_recent_blocks
- from: spice.ai/spiceai/quickstart/datasets/taxi_trips
name: taxi_trips
acceleration:
refresh_check_interval: 10s
refresh_jitter_enabled: true
12 changes: 7 additions & 5 deletions spiceaidocs/docs/components/data-connectors/spiceai.md
@@ -18,13 +18,15 @@ Secrets will be written to a `.env` file by using the `spice login` command and

### Parameters

- `from`: The Spice.ai dataset ID. For instance `spice.ai/eth.recent_blocks` or `spice.ai/eth.recent_traces`. To query a dataset in a shared Spicepod, use the format `spice.ai/<org>/<app>/datasets/<dataset_id>`.
#### `from`

The Spice.ai Cloud Platform dataset URI. To query a dataset in a public Spice.ai App, use the format `spice.ai/<org>/<app>/datasets/<dataset_name>`.
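
As an illustration of the URI format, the sketch below assembles a `from` value from its three parts. The `dataset_uri` helper is hypothetical and not part of the Spice CLI; it only shows how `<org>`, `<app>`, and `<dataset_name>` map onto the path.

```shell
# dataset_uri (hypothetical helper): build a Spice.ai dataset URI from
# organization, app, and dataset name, following the documented format
# spice.ai/<org>/<app>/datasets/<dataset_name>.
dataset_uri() {
  printf 'spice.ai/%s/%s/datasets/%s\n' "$1" "$2" "$3"
}

# Prints: spice.ai/spiceai/quickstart/datasets/taxi_trips
dataset_uri spiceai quickstart taxi_trips
```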

## Example

```yaml
- from: spice.ai/eth.recent_blocks
name: eth_recent_blocks
- from: spice.ai/spiceai/quickstart/datasets/taxi_trips
name: taxi_trips
```
```yaml
@@ -35,8 +37,8 @@ Secrets will be written to a `.env` file by using the `spice login` command and
## Full Configuration Example
```yaml
- from: spice.ai/eth.recent_blocks
name: eth_recent_blocks
- from: spice.ai/spiceai/tpch/datasets/customer
name: tpch.customer
params:
spiceai_api_key: ${secrets:spiceai_api_key}
acceleration:
4 changes: 2 additions & 2 deletions spiceaidocs/docs/components/secret-stores/keyring/index.md
@@ -34,8 +34,8 @@ And the secret can be referenced in parameters:
```yaml
datasets:
- from: spice.ai:eth.recent_blocks
name: blocks
- from: spice.ai/spiceai/quickstart/datasets/taxi_trips
name: taxi_trips
params:
spiceai_api_key: ${keyring:spiceai_api_key} # ${secrets:spiceai_api_key} can also be used
```
4 changes: 2 additions & 2 deletions spiceaidocs/docs/components/secret-stores/kubernetes/index.md
@@ -19,8 +19,8 @@ And the secret can be referenced in parameters:
```yaml
datasets:
- from: spice.ai:eth.recent_blocks
name: blocks
- from: spice.ai/spiceai/quickstart/datasets/taxi_trips
name: taxi_trips
params:
spiceai_api_key: ${k8s:spiceai_api_key} # ${secrets:spiceai_api_key} can also be used to fallback to other secret stores
```
12 changes: 6 additions & 6 deletions spiceaidocs/docs/features/data-acceleration/index.md
@@ -26,35 +26,35 @@ Data Security: Assess data sensitivity and secure network connections between th

## Example

### Locally Accelerating eth.recent_blocks
### Locally Accelerating taxi_trips

- Start Spice with the following dataset:

```yaml
datasets:
- from: spice.ai/eth.recent_blocks
name: eth_recent_blocks
- from: spice.ai/spiceai/quickstart/datasets/taxi_trips
name: taxi_trips
acceleration:
enabled: true
refresh_mode: full
refresh_check_interval: 10s
```
- The dataset `eth.recent_blocks` will be accelerated locally by the Spice runtime. The data will be refreshed every 10 seconds.
- The dataset `taxi_trips` will be accelerated locally by the Spice runtime. The data will be refreshed every 10 seconds.

- Compare query times against the Spice platform:

```bash
curl \
--url 'https://data.spiceai.io/v1/sql?api_key=[API_KEY]' \
--data 'select * from eth.recent_blocks'
--data 'select * from taxi_trips'
```

And the locally accelerated dataset:

```bash
spice sql
select * from eth_recent_blocks
select * from taxi_trips;
```

[Learn more about Data Accelerators](/components/data-accelerators) for faster access.
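
To make the comparison above concrete, both queries can be timed with a small wrapper. This is a sketch under stated assumptions: `time_query` is a hypothetical helper, `date +%s%N` requires GNU `date`, `API_KEY` is assumed to be exported, and the local runtime is assumed to expose an HTTP SQL endpoint at `http://localhost:8090/v1/sql`. The network calls are left commented out so the block runs without a runtime or credentials.

```shell
# time_query (hypothetical helper): run a command, discard its output,
# and print the elapsed wall-clock time in whole milliseconds.
time_query() {
  tq_start=$(date +%s%N)   # nanoseconds since the epoch (GNU date only)
  "$@" > /dev/null 2>&1
  tq_end=$(date +%s%N)
  echo $(( (tq_end - tq_start) / 1000000 ))
}

# Sanity check with a command that returns immediately.
time_query true

# Assumed usage against the Spice platform and a local runtime:
# time_query curl --url "https://data.spiceai.io/v1/sql?api_key=${API_KEY}" \
#   --data 'select * from taxi_trips'
# time_query curl --url 'http://localhost:8090/v1/sql' \
#   --data 'select * from taxi_trips'
```

The measurement includes process start-up and network latency, so it is best read as a rough comparison rather than a benchmark.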
64 changes: 32 additions & 32 deletions spiceaidocs/docs/getting-started/spiceai.md
@@ -20,7 +20,7 @@ After logging in, create an app in order to get an API key.

![create_app-1](https://github.com/spiceai/spiceai/assets/112157037/d2446406-1f06-40fb-8373-1b6d692cb5f7)

This quickstart will use the [`eth.recent_blocks`](https://docs.spice.ai/reference/sql-query-tables/ethereum/core-tables) dataset.
This quickstart will use the `taxi_trips` dataset from https://spice.ai/spiceai/quickstart Spice.ai app.

**Step 1.** Initialize a new project:

@@ -58,19 +58,19 @@ spice dataset configure
Enter a dataset name that will be used to reference the dataset in queries. This name does not need to match the name in the dataset source.

```bash
dataset name: (spice_app) eth_recent_blocks
dataset name: (spice_app) taxi_trips
```

Enter the description of the dataset:

```
description: Recent Ethereum blocks
description: Taxi trips dataset
```

Enter the location of the dataset:

```bash
from: spice.ai/eth.recent_blocks
from: spice.ai/spiceai/quickstart/datasets/taxi_trips
```

Select `y` when prompted whether to accelerate the data:
@@ -82,9 +82,9 @@ Locally accelerate (y/n)? y
You should see the following output from your runtime terminal:

```bash
2024-08-05T13:09:08.342450Z INFO runtime: Dataset eth_recent_blocks registered (spice.ai/eth.recent_blocks), acceleration (arrow, 10s refresh), results cache enabled.
2024-08-05T13:09:08.343641Z INFO runtime::accelerated_table::refresh_task: Loading data for dataset eth_recent_blocks
2024-08-05T13:09:09.575822Z INFO runtime::accelerated_table::refresh_task: Loaded 146 rows (6.36 MiB) for dataset eth_recent_blocks in 1s 232ms.
2024-12-16T05:12:45.803694Z INFO runtime::init::dataset: Dataset taxi_trips registered (spice.ai/spiceai/quickstart/datasets/taxi_trips), acceleration (arrow, 10s refresh), results cache enabled.
2024-12-16T05:12:45.805494Z INFO runtime::accelerated_table::refresh_task: Loading data for dataset taxi_trips
2024-12-16T05:13:24.218345Z INFO runtime::accelerated_table::refresh_task: Loaded 2,964,624 rows (8.41 GiB) for dataset taxi_trips in 38s 412ms.
```
**Step 5.** In a new terminal window, use the Spice SQL REPL to query the dataset
@@ -94,47 +94,47 @@ spice sql
```
```bash
SELECT number, size, gas_used from eth_recent_blocks LIMIT 10;
SELECT tpep_pickup_datetime, passenger_count, trip_distance from taxi_trips LIMIT 10;
```
The output displays the results of the query along with the query execution time:
```bash
+----------+--------+----------+
| number | size | gas_used |
+----------+--------+----------+
| 20462425 | 32466 | 6705045 |
| 20462435 | 262114 | 29985196 |
| 20462427 | 138376 | 29989452 |
| 20462444 | 40541 | 9480363 |
| 20462431 | 78505 | 16994166 |
| 20462461 | 110372 | 21987571 |
| 20462441 | 51089 | 11136440 |
| 20462428 | 327660 | 29998593 |
| 20462429 | 133518 | 20159194 |
| 20462422 | 61461 | 13389415 |
+----------+--------+----------+

Time: 0.008562625 seconds. 10 rows.
+----------------------+-----------------+---------------+
| tpep_pickup_datetime | passenger_count | trip_distance |
+----------------------+-----------------+---------------+
| 2024-01-11T12:55:12 | 1 | 0.0 |
| 2024-01-11T12:55:12 | 1 | 0.0 |
| 2024-01-11T12:04:56 | 1 | 0.63 |
| 2024-01-11T12:18:31 | 1 | 1.38 |
| 2024-01-11T12:39:26 | 1 | 1.01 |
| 2024-01-11T12:18:58 | 1 | 5.13 |
| 2024-01-11T12:43:13 | 1 | 2.9 |
| 2024-01-11T12:05:41 | 1 | 1.36 |
| 2024-01-11T12:20:41 | 1 | 1.11 |
| 2024-01-11T12:37:25 | 1 | 2.04 |
+----------------------+-----------------+---------------+

Time: 0.00538925 seconds. 10 rows.
```
You can experiment with the time it takes to generate queries when using non-accelerated datasets. You can change the acceleration setting from `true` to `false` in the datasets.yaml file.
### Additional Example
```bash
# Query to display the average gas used in recent Ethereum blocks
SELECT AVG(gas_used) FROM eth_recent_blocks;
# Query to display the average trip distance
SELECT AVG(trip_distance) FROM taxi_trips;
```
The output displays the average gas used:
```bash
+------------------+
| avg |
+------------------+
| 15000000.1234567 |
+------------------+
+-------------------------------+
| avg(taxi_trips.trip_distance) |
+-------------------------------+
| 3.652169178958276 |
+-------------------------------+

Time: 0.005678123 seconds. 1 row.
Time: 0.031145625 seconds. 1 rows.
```
16 changes: 8 additions & 8 deletions spiceaidocs/docs/reference/spicepod/datasets.md
@@ -12,8 +12,8 @@ Inline example:

```yaml
datasets:
- from: spice.ai/eth.beacon.eigenlayer
name: strategy_manager_deposits
- from: spice.ai/spiceai/quickstart/datasets/taxi_trips
name: taxi_trips
acceleration:
enabled: true
mode: memory # / file
@@ -44,14 +44,14 @@ Relative path example:

```yaml
datasets:
- ref: datasets/eth_recent_transactions
- ref: datasets/taxi_trips
```

`datasets/eth_recent_transactions/dataset.yaml`
`datasets/taxi_trips/dataset.yaml`

```yaml
from: spice.ai/eth.recent_transactions
name: eth_recent_transactions
from: spice.ai/spiceai/quickstart/datasets/taxi_trips
name: taxi_trips
type: overwrite
acceleration:
enabled: true
@@ -105,8 +105,8 @@ An alternative to adding the dataset definition inline in the `spicepod.yaml` fi
**dataset.yaml**

```yaml
from: spice.ai/eth.recent_transactions
name: eth_recent_transactions
from: spice.ai/spiceai/quickstart/datasets/taxi_trips
name: taxi_trips
type: overwrite
acceleration:
enabled: true
4 changes: 2 additions & 2 deletions spiceaidocs/docs/reference/spicepod/index.md
@@ -252,8 +252,8 @@ A dataset defined inline.

```yaml
datasets:
- from: spice.ai/eth.recent_blocks
name: eth_blocks
- from: spice.ai/spiceai/quickstart/datasets/taxi_trips
name: taxi_trips
acceleration:
enabled: true
refresh_mode: full
