Changed documentation regarding extension properties api #651

Merged
merged 7 commits into from
Aug 15, 2024
@@ -120,7 +120,7 @@ internal interface AccessApi {
*
* For example:
* ```kotlin
* val df = DataFrame.read("titanic.csv")
* val df /* : AnyFrame */ = DataFrame.read("titanic.csv")
* ```
*/
interface ExtensionPropertiesApi
@@ -142,7 +142,7 @@ class ApiLevels {
@TransformDataFrameExpressions
fun extensionProperties1() {
// SampleStart
val df = DataFrame.read("titanic.csv")
val df /* : AnyFrame */ = DataFrame.read("titanic.csv")
// SampleEnd
}
}
@@ -1,19 +1,23 @@
package org.jetbrains.kotlinx.dataframe.samples.api

import org.jetbrains.kotlinx.dataframe.AnyFrame
import org.jetbrains.kotlinx.dataframe.DataFrame
import org.jetbrains.kotlinx.dataframe.api.dataFrameOf
import org.jetbrains.kotlinx.dataframe.api.take
import org.jetbrains.kotlinx.dataframe.explainer.WritersideFooter
import org.jetbrains.kotlinx.dataframe.explainer.WritersideStyle
import org.jetbrains.kotlinx.dataframe.io.read
import org.jetbrains.kotlinx.dataframe.io.toStandaloneHTML
import org.junit.Test
import java.io.File

// To display code together with a table, we can use the TransformDataFrameExpressions annotation together with korro.
// This class provides the ability to save only a table that can be embedded anywhere in the documentation.
class OtherSamples {

@Test
fun extensionPropertiesApi1() {
val df = dataFrameOf("example")(123)
val df = DataFrame.read("../data/titanic.csv", delimiter = ';').take(5)
writeTable(df, "extensionPropertiesApi1")
}

@@ -1,20 +1,23 @@
package org.jetbrains.kotlinx.dataframe.samples.api

import org.jetbrains.kotlinx.dataframe.AnyFrame
import org.jetbrains.kotlinx.dataframe.api.dataFrameOf
import org.jetbrains.kotlinx.dataframe.DataFrame
import org.jetbrains.kotlinx.dataframe.api.take
import org.jetbrains.kotlinx.dataframe.explainer.WritersideFooter
import org.jetbrains.kotlinx.dataframe.explainer.WritersideStyle
import org.jetbrains.kotlinx.dataframe.io.read
import org.jetbrains.kotlinx.dataframe.io.toStandaloneHTML
import org.junit.Test
import java.io.File

// To display code together with a table, we can use the TransformDataFrameExpressions annotation together with korro.
// This class provides the ability to save only a table that can be embedded anywhere in the documentation.
class OtherSamples {

@Test
fun extensionPropertiesApi1() {
val df = dataFrameOf("example")(123)
writeTable(df, "extensionPropertiesApi1")
fun example() {
val df = DataFrame.read("../data/titanic.csv", delimiter = ';').take(5)
// writeTable(df, "exampleName")
}

private fun writeTable(df: AnyFrame, name: String) {
4 changes: 2 additions & 2 deletions docs/StardustDocs/topics/apiLevels.md
@@ -28,7 +28,7 @@ Here's a list of all APIs in order of increasing safety.
Columns accessed by the [`KProperty`](https://kotlinlang.org/docs/reflection.html#property-references) of some class.
The name and type of the column should match the name and type of the property, respectively.

* [**Extension Properties API**](extensionPropertiesApi.md)
* [**Extension Properties API**](extensionPropertiesApi.md) <br/>
Extension access properties are generated based on the dataframe schema. The name and type of properties are inferred
from the name and type of the corresponding columns.

@@ -114,7 +114,7 @@ val passengers = DataFrame.read("titanic.csv")
<!---FUN extensionProperties1-->

```kotlin
val df = DataFrame.read("titanic.csv")
val df /* : AnyFrame */ = DataFrame.read("titanic.csv")
```

<!---END-->
45 changes: 15 additions & 30 deletions docs/StardustDocs/topics/extensionPropertiesApi.md
@@ -1,39 +1,24 @@
[//]: # (title: Extension properties API)
[//]: # (title: Extension Properties API)

<!---IMPORT org.jetbrains.kotlinx.dataframe.samples.api.ApiLevels-->

When [`DataFrame`](DataFrame.md) is used within Jupyter Notebooks or Datalore with the Kotlin Kernel,
all new global variables of type DataFrame are analyzed after every cell execution and replaced
with a typed [`DataFrame`](DataFrame.md) wrapper with auto-generated extension properties for data access:

<!---FUN extensionProperties1-->
Auto-generated extension properties are the safest and easiest way to access columns in a [`DataFrame`](DataFrame.md).
They are generated based on a [dataframe schema](schemas.md),
with the name and type of properties inferred from the name and type of the corresponding columns.

With these properties, you can work with your dataframe like this:
```kotlin
val df = DataFrame.read("titanic.csv")
val peopleDf /* : DataFrame<Person> */ = DataFrame.read("people.csv").cast<Person>()
val nameColumn /* : DataColumn<String> */ = peopleDf.name
val ageColumn /* : DataColumn<Int> */ = peopleDf.personData.age
```

<!---END-->

Now data can be accessed with the `.` member accessor:

<!---FUN extensionProperties2-->

and of course
```kotlin
df.add("lastName") { name.split(",").last() }
.dropNulls { age }
.filter { survived && home.endsWith("NY") && age in 10..20 }
peopleDf.add("lastName") { name.split(",").last() }
.dropNulls { personData.age }
.filter { survived && home.endsWith("NY") && personData.age in 10..20 }
```

<!---END-->

The `titanic.csv` file can be found [here](https://github.com/Kotlin/dataframe/blob/master/data/titanic.csv).

In notebooks, extension properties are generated for the [`DataSchema`](schemas.md) that is extracted from a [`DataFrame`](DataFrame.md)
instance after REPL line execution.
After that, the [`DataFrame`](DataFrame.md) variable is typed with its own [`DataSchema`](schemas.md), so only valid extension properties corresponding to actual columns in the DataFrame will be allowed by the compiler and suggested by completion.

Extension properties can be generated in IntelliJ IDEA using the [Kotlin DataFrame Gradle plugin](schemasGradle.md#configuration).

<warning>
In notebooks, generated properties won't appear or be updated until the cell has been executed. This often means that you have to introduce new variables frequently to keep the extension properties in sync with the actual schema.
</warning>
To find out how to use this API in your environment, check out [Working with Data Schemas](schemas.md)
or jump straight to [Data Schemas in Gradle projects](schemasGradle.md),
or [Data Schemas in Jupyter notebooks](schemasJupyter.md).
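
To make the mechanism concrete, here is a minimal, library-free sketch of how extension properties can give typed access to name-based columns. It does not use the real Kotlin DataFrame classes: `Frame`, the `Person` marker interface, and the hand-written `name` and `age` properties are hypothetical stand-ins for what a generator might emit from a schema, and the actual generated code may look different.

```kotlin
// Toy stand-in for a dataframe: columns are stored by name, untyped.
// NOT the real Kotlin DataFrame API; it only illustrates the idea.
class Frame<T>(private val columns: Map<String, List<Any?>>) {
    operator fun get(name: String): List<Any?> =
        columns[name] ?: error("No column named '$name'")
}

// Hypothetical marker type describing the schema; it is never instantiated.
interface Person

// What a generator could emit for a schema with `name: String` and `age: Int`:
// one extension property per column, available only on Frame<Person>.
@Suppress("UNCHECKED_CAST")
val Frame<Person>.name: List<String> get() = this["name"] as List<String>

@Suppress("UNCHECKED_CAST")
val Frame<Person>.age: List<Int> get() = this["age"] as List<Int>

fun main() {
    val people = Frame<Person>(
        mapOf("name" to listOf("Alice", "Bob"), "age" to listOf(15, 20))
    )
    // Typed access: the compiler knows `name` is List<String> and `age` is List<Int>.
    val adults = people.name.zip(people.age).filter { (_, age) -> age >= 18 }
    println(adults) // prints [(Bob, 20)]
}
```

Because the properties are declared on `Frame<Person>` rather than on `Frame<*>`, a frame typed with a different schema would not expose `name` or `age`, which is the safety property the text above describes: only columns present in the schema are accepted by the compiler and offered by completion.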