From 94b2581ee523ae49ee4e0dea0b1358ceccfadce2 Mon Sep 17 00:00:00 2001
From: Advayp <69655599+Advayp@users.noreply.github.com>
Date: Thu, 9 Jan 2025 14:07:47 -0800
Subject: [PATCH 1/9] Improve `spice dataset` documentation

---
 spiceaidocs/docs/cli/reference/dataset.md | 40 ++++++++++++++++-------
 1 file changed, 28 insertions(+), 12 deletions(-)

diff --git a/spiceaidocs/docs/cli/reference/dataset.md b/spiceaidocs/docs/cli/reference/dataset.md
index 25d7780bc..5c6587147 100644
--- a/spiceaidocs/docs/cli/reference/dataset.md
+++ b/spiceaidocs/docs/cli/reference/dataset.md
@@ -1,22 +1,38 @@
 ---
-title: "dataset"
-sidebar_label: "dataset"
-pagination_prev: null
-pagination_next: null
----
-
-Dataset operations
+ title: 'dataset'
+ sidebar_label: 'dataset'
+ pagination_prev: null
+ pagination_next: null
+ ---
+Perform operations relating to Spice datasets.
 
 ### Usage
-
 ```shell
 spice dataset [command]
 ```
 
-Available `command`s:
+ Available `command`s:
+
+ - `configure`: Create/configure a dataset directly from the command-line, including customizing components such as Data Connector (`from` in a Spicepod) and acceleration along with other metadata.
+
+ #### Flags
+
+ - `-h`, `--help` Print this help message
+
+ ### Sample Output
 
-- `configure`:    Configure a dataset
+ #### Output from Configure
 
-#### Flags
+ ```bash
+> spice dataset configure
 
-- `-h`, `--help`   Print this help message
\ No newline at end of file
+ 2024/12/18 01:06:32 INFO dataset name: sample-project
+ taxi_trips # Input 1: Name of dataset
+ 2024/12/18 01:06:59 WARN Dataset names with hyphens should be quoted in queries:
+ i.e. SELECT * FROM "remote-source"
+ description: Taxi trips in s3 # Input 2: Description
+ from: s3://spiceai-demo-datasets/taxi_trips/2024/  # Input 3: Source
+ 2024/12/18 01:07:25 INFO locally accelerate (y/n)? (y)
+ n # Input 4: Acceleration
+ 2024/12/18 01:07:32 INFO Saved datasets/remote-source/dataset.yaml
+ ```
\ No newline at end of file

From 75dff3f359f495b2d6753200001813492666ed07 Mon Sep 17 00:00:00 2001
From: Advayp <69655599+Advayp@users.noreply.github.com>
Date: Thu, 9 Jan 2025 15:55:19 -0800
Subject: [PATCH 2/9] Add detailed explanation of output and results of `spice
 dataset configure`

---
 spiceaidocs/docs/cli/reference/dataset.md | 58 +++++++++++++++++++----
 1 file changed, 48 insertions(+), 10 deletions(-)
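
Reviewer note: the text added in this patch states that `spice dataset configure` needs a `spicepod.yaml` file in the project root. The sketch below shows one way to check for that file before running the command. It assumes a POSIX shell, and the function name `has_spicepod` is hypothetical; it is not part of the Spice CLI.

```shell
# Hypothetical helper (not part of the Spice CLI): report whether a
# project directory contains the spicepod.yaml file that the docs say
# `spice dataset configure` requires.
has_spicepod() {
  dir="${1:-.}"   # default to the current directory
  if [ -f "$dir/spicepod.yaml" ]; then
    echo "found"
  else
    echo "missing"
    return 1
  fi
}
```

Possible usage: `has_spicepod . && spice dataset configure`. When the file is missing, run `spice init` first, as the added note describes.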

diff --git a/spiceaidocs/docs/cli/reference/dataset.md b/spiceaidocs/docs/cli/reference/dataset.md
index 5c6587147..695f6cc9d 100644
--- a/spiceaidocs/docs/cli/reference/dataset.md
+++ b/spiceaidocs/docs/cli/reference/dataset.md
@@ -4,7 +4,7 @@
  pagination_prev: null
  pagination_next: null
  ---
-Perform operations relating to Spice datasets.
+Configure a Spice dataset.
 
 ### Usage
 ```shell
@@ -15,24 +15,62 @@ spice dataset [command]
 
  - `configure`: Create/configure a dataset directly from the command-line, including customizing components such as Data Connector (`from` in a Spicepod) and acceleration along with other metadata.
 
+ **Note**: In order to run `spice dataset configure`, there *must* be a `spicepod.yaml` file in the root of your project directory. To create this file, see [`spice init`](/cli/reference/init).
+
  #### Flags
 
  - `-h`, `--help` Print this help message
 
- ### Sample Output
+ ### Examples
 
- #### Output from Configure
+ When running `spice dataset configure`, Spice will prompt for four inputs:
+ 1. The name of the dataset, labelled by `(1)` below.
+ 2. The description of the dataset, labelled by `(2)` below.
+ 3. The source of the dataset, labelled by `(3)` below. Consult [Spice's supported data connectors](/components/data-connectors) to see possible values for this field. 
+ 4. Whether or not to enable acceleration for this dataset, labelled by `(4)`. The default value for this input is `y`, enabling acceleration for this dataset. Learn more about acceleration in the [dataset acceleration reference](/components/data-accelerators).
 
- ```bash
+ ```shell
 > spice dataset configure
 
  2024/12/18 01:06:32 INFO dataset name: sample-project
- taxi_trips # Input 1: Name of dataset
+ taxi-trips # (1)
  2024/12/18 01:06:59 WARN Dataset names with hyphens should be quoted in queries:
  i.e. SELECT * FROM "remote-source"
- description: Taxi trips in s3 # Input 2: Description
- from: s3://spiceai-demo-datasets/taxi_trips/2024/  # Input 3: Source
- 2024/12/18 01:07:25 INFO locally accelerate (y/n)? (y)
- n # Input 4: Acceleration
+ description: Taxi trips in s3 # (2)
+ from: s3://spiceai-demo-datasets/taxi_trips/2024/  # (3)
+ 2024/12/18 01:075 INFO locally accelerate (y/n)? (y)
+ n # (4)
  2024/12/18 01:07:32 INFO Saved datasets/remote-source/dataset.yaml
- ```
\ No newline at end of file
+ ```
+
+After execution, the directory structure looks like this for the above example:
+ ```
+ ├── datasets
+ │   ├── taxi-trips
+ │       ├── dataset.yaml
+ ├── spicepod.yaml
+ └── ...
+ ```
+
+ The datasets folder includes the datasets for your project configured by using `spice dataset configure` or added manually.
+
+The `dataset.yaml` file in `./datasets/taxi-trips` is configured as defined by the inputs provided to `spice dataset configure`. For this example, the `datatset.yaml` file looks as follows:
+
+```yaml
+from: s3://spiceai-demo-datasets/taxi_trips/2024/
+name: taxi-trips
+description: Taxi trips in s3
+acceleration:
+    enabled: false
+```
+
+The command additionally updates the root `spicepod.yaml` file to include the configured dataset as a reference (`ref`). For this example, `spicepod.yaml` would include the following:
+```yaml
+version: v1
+kind: Spicepod
+name: Taxi Trips with Spice
+datasets:
+    - ref: datasets/taxi-trips
+```
+
+To learn more about Spice datasets and Spicepods, visit the [Spice dataset reference](/reference/spicepod/datasets) and [Spicepod reference](/reference/spicepod).
\ No newline at end of file

From 8029c382f9bf8d0cb71dfd62bd408d0f71329b90 Mon Sep 17 00:00:00 2001
From: Advayp <69655599+Advayp@users.noreply.github.com>
Date: Thu, 9 Jan 2025 15:56:28 -0800
Subject: [PATCH 3/9] Use singular heading to reflect the single example

---
 spiceaidocs/docs/cli/reference/dataset.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/spiceaidocs/docs/cli/reference/dataset.md b/spiceaidocs/docs/cli/reference/dataset.md
index 695f6cc9d..a6500c35e 100644
--- a/spiceaidocs/docs/cli/reference/dataset.md
+++ b/spiceaidocs/docs/cli/reference/dataset.md
@@ -21,7 +21,7 @@ spice dataset [command]
 
  - `-h`, `--help` Print this help message
 
- ### Examples
+ ### Example
 
  When running `spice dataset configure`, Spice will prompt for four inputs:
  1. The name of the dataset, labelled by `(1)` below.

From e0524492c76b218b6a4388e0e7ead676eb6804ac Mon Sep 17 00:00:00 2001
From: Advayp <69655599+Advayp@users.noreply.github.com>
Date: Thu, 9 Jan 2025 16:11:40 -0800
Subject: [PATCH 4/9] Update spiceaidocs/docs/cli/reference/dataset.md

Co-authored-by: Jack Eadie <jack@spice.ai>
---
 spiceaidocs/docs/cli/reference/dataset.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/spiceaidocs/docs/cli/reference/dataset.md b/spiceaidocs/docs/cli/reference/dataset.md
index a6500c35e..694b42006 100644
--- a/spiceaidocs/docs/cli/reference/dataset.md
+++ b/spiceaidocs/docs/cli/reference/dataset.md
@@ -13,7 +13,7 @@ spice dataset [command]
 
  Available `command`s:
 
- - `configure`: Create/configure a dataset directly from the command-line, including customizing components such as Data Connector (`from` in a Spicepod) and acceleration along with other metadata.
+ - `configure`: Create/configure a dataset directly from the command-line, including customizing components such as whether to add acceleration to the connector.
 
  **Note**: In order to run `spice dataset configure`, there *must* be a `spicepod.yaml` file in the root of your project directory. To create this file, see [`spice init`](/cli/reference/init).
 

From 733c3b361a149b436d48347c5281f65d674f36db Mon Sep 17 00:00:00 2001
From: Advayp <69655599+Advayp@users.noreply.github.com>
Date: Thu, 9 Jan 2025 16:13:20 -0800
Subject: [PATCH 5/9] Update previously incorrect styling

---
 spiceaidocs/docs/cli/reference/dataset.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/spiceaidocs/docs/cli/reference/dataset.md b/spiceaidocs/docs/cli/reference/dataset.md
index 694b42006..3f90f165a 100644
--- a/spiceaidocs/docs/cli/reference/dataset.md
+++ b/spiceaidocs/docs/cli/reference/dataset.md
@@ -1,6 +1,6 @@
 ---
- title: 'dataset'
- sidebar_label: 'dataset'
+ title: "dataset"
+ sidebar_label: "dataset"
  pagination_prev: null
  pagination_next: null
  ---

From 122eadb5525e841896f3dbe2d52604b8421aca61 Mon Sep 17 00:00:00 2001
From: Advayp <69655599+Advayp@users.noreply.github.com>
Date: Fri, 10 Jan 2025 17:39:18 -0800
Subject: [PATCH 6/9] Address feedback

---
 spiceaidocs/docs/cli/reference/dataset.md | 17 +++++++----------
 1 file changed, 7 insertions(+), 10 deletions(-)

diff --git a/spiceaidocs/docs/cli/reference/dataset.md b/spiceaidocs/docs/cli/reference/dataset.md
index 3f90f165a..8a910724b 100644
--- a/spiceaidocs/docs/cli/reference/dataset.md
+++ b/spiceaidocs/docs/cli/reference/dataset.md
@@ -26,21 +26,18 @@ spice dataset [command]
  When running `spice dataset configure`, Spice will prompt for four inputs:
  1. The name of the dataset, labelled by `(1)` below.
  2. The description of the dataset, labelled by `(2)` below.
- 3. The source of the dataset, labelled by `(3)` below. Consult [Spice's supported data connectors](/components/data-connectors) to see possible values for this field. 
+ 3. The source of the dataset, labelled by `(3)` below. Consult [Spice's supported data connectors](/components/data-connectors) to see possible values for this field. Note: Spice may prompt for a file format if necessary, as shown in the example below.
  4. Whether or not to enable acceleration for this dataset, labelled by `(4)`. The default value for this input is `y`, enabling acceleration for this dataset. Learn more about acceleration in the [dataset acceleration reference](/components/data-accelerators).
 
  ```shell
 > spice dataset configure
 
- 2024/12/18 01:06:32 INFO dataset name: sample-project
- taxi-trips # (1)
- 2024/12/18 01:06:59 WARN Dataset names with hyphens should be quoted in queries:
- i.e. SELECT * FROM "remote-source"
- description: Taxi trips in s3 # (2)
- from: s3://spiceai-demo-datasets/taxi_trips/2024/  # (3)
- 2024/12/18 01:075 INFO locally accelerate (y/n)? (y)
- n # (4)
- 2024/12/18 01:07:32 INFO Saved datasets/remote-source/dataset.yaml
+dataset name: (spiceai) taxi-trips # (1)
+description: Taxi Trips in S3 # (2)
+from: s3://spiceai-demo-datasets/taxi_trips/2024/ # (3)
+file_format (parquet/csv) (parquet) parquet
+locally accelerate (y/n)? (y) y # (4)
+2025/01/10 14:07:46 INFO Saved datasets/test/dataset.yaml
  ```
 
 After execution, the directory structure looks like this for the above example:

From 381d1380913632fc4503324326f54624b5eb49f3 Mon Sep 17 00:00:00 2001
From: Advay Patil <advaypatil27@gmail.com>
Date: Fri, 10 Jan 2025 22:32:09 -0800
Subject: [PATCH 7/9] Fix Vercel build issue

---
 spiceaidocs/docs/cli/reference/dataset.md | 65 +++++++++++++----------
 1 file changed, 36 insertions(+), 29 deletions(-)

diff --git a/spiceaidocs/docs/cli/reference/dataset.md b/spiceaidocs/docs/cli/reference/dataset.md
index 8a910724b..21911baa6 100644
--- a/spiceaidocs/docs/cli/reference/dataset.md
+++ b/spiceaidocs/docs/cli/reference/dataset.md
@@ -1,35 +1,40 @@
 ---
- title: "dataset"
- sidebar_label: "dataset"
- pagination_prev: null
- pagination_next: null
- ---
+
+title: "dataset"
+sidebar_label: "dataset"
+pagination_prev: null
+pagination_next: null
+
+---
+
 Configure a Spice dataset.
 
 ### Usage
+
 ```shell
 spice dataset [command]
 ```
 
- Available `command`s:
+Available `command`s:
+
+- `configure`: Create/configure a dataset directly from the command-line, including customizing components such as whether to add acceleration to the connector.
 
- - `configure`: Create/configure a dataset directly from the command-line, including customizing components such as whether to add acceleration to the connector.
+**Note**: In order to run `spice dataset configure`, there _must_ be a `spicepod.yaml` file in the root of your project directory. To create this file, see [`spice init`](/cli/reference/init).
 
- **Note**: In order to run `spice dataset configure`, there *must* be a `spicepod.yaml` file in the root of your project directory. To create this file, see [`spice init`](/cli/reference/init).
+#### Flags
 
- #### Flags
+- `-h`, `--help` Print this help message
 
- - `-h`, `--help` Print this help message
+### Example
 
- ### Example
+When running `spice dataset configure`, Spice will prompt for four inputs:
 
- When running `spice dataset configure`, Spice will prompt for four inputs:
- 1. The name of the dataset, labelled by `(1)` below.
- 2. The description of the dataset, labelled by `(2)` below.
- 3. The source of the dataset, labelled by `(3)` below. Consult [Spice's supported data connectors](/components/data-connectors) to see possible values for this field. Note: Spice may prompt for a file format if necessary, as shown in the example below.
- 4. Whether or not to enable acceleration for this dataset, labelled by `(4)`. The default value for this input is `y`, enabling acceleration for this dataset. Learn more about acceleration in the [dataset acceleration reference](/components/data-accelerators).
+1.  The name of the dataset, labelled by `(1)` below.
+2.  The description of the dataset, labelled by `(2)` below.
+3.  The source of the dataset, labelled by `(3)` below. Consult [Spice's supported data connectors](/components/data-connectors) to see possible values for this field. Note: Spice may prompt for a file format if necessary, as shown in the example below.
+4.  Whether or not to enable acceleration for this dataset, labelled by `(4)`. The default value for this input is `y`, enabling acceleration for this dataset. Learn more about acceleration in the [dataset acceleration reference](/components/data-accelerators).
 
- ```shell
+```shell
 > spice dataset configure
 
 dataset name: (spiceai) taxi-trips # (1)
@@ -38,18 +43,19 @@ from: s3://spiceai-demo-datasets/taxi_trips/2024/ # (3)
 file_format (parquet/csv) (parquet) parquet
 locally accelerate (y/n)? (y) y # (4)
 2025/01/10 14:07:46 INFO Saved datasets/test/dataset.yaml
- ```
+```
 
 After execution, the directory structure looks like this for the above example:
- ```
- ├── datasets
- │   ├── taxi-trips
- │       ├── dataset.yaml
- ├── spicepod.yaml
- └── ...
- ```
 
- The datasets folder includes the datasets for your project configured by using `spice dataset configure` or added manually.
+```
+├── datasets
+│   ├── taxi-trips
+│       ├── dataset.yaml
+├── spicepod.yaml
+└── ...
+```
+
+The datasets folder includes the datasets for your project configured by using `spice dataset configure` or added manually.
 
 The `dataset.yaml` file in `./datasets/taxi-trips` is configured as defined by the inputs provided to `spice dataset configure`. For this example, the `datatset.yaml` file looks as follows:
 
@@ -58,16 +64,17 @@ from: s3://spiceai-demo-datasets/taxi_trips/2024/
 name: taxi-trips
 description: Taxi trips in s3
 acceleration:
-    enabled: false
+  enabled: false
 ```
 
 The command additionally updates the root `spicepod.yaml` file to include the configured dataset as a reference (`ref`). For this example, `spicepod.yaml` would include the following:
+
 ```yaml
 version: v1
 kind: Spicepod
 name: Taxi Trips with Spice
 datasets:
-    - ref: datasets/taxi-trips
+  - ref: datasets/taxi-trips
 ```
 
-To learn more about Spice datasets and Spicepods, visit the [Spice dataset reference](/reference/spicepod/datasets) and [Spicepod reference](/reference/spicepod).
\ No newline at end of file
+To learn more about Spice datasets and Spicepods, visit the [Spice dataset reference](/reference/spicepod/datasets) and [Spicepod reference](/reference/spicepod).

From c4e97f7c3858742c31c7cb017497bb93f6c46a64 Mon Sep 17 00:00:00 2001
From: Advayp <69655599+Advayp@users.noreply.github.com>
Date: Mon, 13 Jan 2025 16:56:14 -0800
Subject: [PATCH 8/9] Fix typo

Co-authored-by: Jack Eadie <jack@spice.ai>
---
 website/docs/cli/reference/dataset.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/website/docs/cli/reference/dataset.md b/website/docs/cli/reference/dataset.md
index 21911baa6..bbe5a1ed8 100644
--- a/website/docs/cli/reference/dataset.md
+++ b/website/docs/cli/reference/dataset.md
@@ -57,7 +57,7 @@ After execution, the directory structure looks like this for the above example:
 
 The datasets folder includes the datasets for your project configured by using `spice dataset configure` or added manually.
 
-The `dataset.yaml` file in `./datasets/taxi-trips` is configured as defined by the inputs provided to `spice dataset configure`. For this example, the `datatset.yaml` file looks as follows:
+The `dataset.yaml` file in `./datasets/taxi-trips` reflects the inputs provided to `spice dataset configure`. For this example, the `dataset.yaml` file looks as follows:
 
 ```yaml
 from: s3://spiceai-demo-datasets/taxi_trips/2024/

From 20652b8bf2ea59cb584d3a28d0239f4611e54262 Mon Sep 17 00:00:00 2001
From: Advay Patil <advaypatil27@gmail.com>
Date: Mon, 13 Jan 2025 17:05:04 -0800
Subject: [PATCH 9/9] Fix build issue by prefixing internal links with /docs

---
 website/docs/cli/reference/dataset.md | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)
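
Reviewer note: this patch fixes the build by rewriting root-relative links such as `/cli/...` to `/docs/cli/...`. The sketch below shows one way to spot any remaining unprefixed links in a Markdown file. It assumes `grep -E` is available, the function name `find_unprefixed_links` is hypothetical, and the three path prefixes are only the ones that appear in this diff.

```shell
# Hypothetical check: print (with line numbers) Markdown links whose
# target starts with /cli, /components, or /reference instead of /docs/.
# Returns a non-zero status when no such link is found (grep's behavior).
find_unprefixed_links() {
  grep -nE ']\(/(cli|components|reference)(/|\))' "$1"
}
```

Possible usage: `find_unprefixed_links website/docs/cli/reference/dataset.md`, which should print nothing once this patch is applied.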

diff --git a/website/docs/cli/reference/dataset.md b/website/docs/cli/reference/dataset.md
index bbe5a1ed8..2e5022993 100644
--- a/website/docs/cli/reference/dataset.md
+++ b/website/docs/cli/reference/dataset.md
@@ -19,7 +19,7 @@ Available `command`s:
 
 - `configure`: Create/configure a dataset directly from the command-line, including customizing components such as whether to add acceleration to the connector.
 
-**Note**: In order to run `spice dataset configure`, there _must_ be a `spicepod.yaml` file in the root of your project directory. To create this file, see [`spice init`](/cli/reference/init).
+**Note**: To run `spice dataset configure`, there _must_ be a `spicepod.yaml` file in the root of your project directory. To create this file, see [`spice init`](/docs/cli/reference/init).
 
 #### Flags
 
@@ -31,8 +31,8 @@ When running `spice dataset configure`, Spice will prompt for four inputs:
 
 1.  The name of the dataset, labelled by `(1)` below.
 2.  The description of the dataset, labelled by `(2)` below.
-3.  The source of the dataset, labelled by `(3)` below. Consult [Spice's supported data connectors](/components/data-connectors) to see possible values for this field. Note: Spice may prompt for a file format if necessary, as shown in the example below.
-4.  Whether or not to enable acceleration for this dataset, labelled by `(4)`. The default value for this input is `y`, enabling acceleration for this dataset. Learn more about acceleration in the [dataset acceleration reference](/components/data-accelerators).
+3.  The source of the dataset, labelled by `(3)` below. Consult [Spice's supported data connectors](/docs/components/data-connectors) for possible values. Note: Spice may also prompt for a file format when the source requires one, as shown in the example below.
+4.  Whether to enable acceleration for this dataset, labelled by `(4)` below. The default is `y`, which enables acceleration. Learn more in the [dataset acceleration reference](/docs/components/data-accelerators).
 
 ```shell
 > spice dataset configure
@@ -77,4 +77,4 @@ datasets:
   - ref: datasets/taxi-trips
 ```
 
-To learn more about Spice datasets and Spicepods, visit the [Spice dataset reference](/reference/spicepod/datasets) and [Spicepod reference](/reference/spicepod).
+To learn more about Spice datasets and Spicepods, visit the [Spice dataset reference](/docs/reference/spicepod/datasets) and [Spicepod reference](/docs/reference/spicepod).