Extension properties docs #1246

Merged
merged 6 commits into from
Jun 13, 2025

AndreiKingsley

@Jolanrensen Jolanrensen left a comment

Very nice extension :) hopefully it will make it a lot clearer for users.

See #661 for most of my comments haha

@@ -24,6 +24,9 @@ Explore our structured, in-depth guides to steadily improve your Kotlin DataFram

<img src="quickstart_preview.png" border-effect="rounded" width="705"/>

* [](extensionPropertiesApi.md) — learn about extension properties for `DataFrame`
Collaborator

if you're referring to the library "DataFrame", if you're referring to the type "DataFrame"

Collaborator Author

What? DataFrame is a type/object here, and there are extensions for it.

Collaborator

then please add a link :)

@@ -88,8 +88,8 @@ columns.
Column selectors are widely used across operations — one of the simplest examples is `.select { }`, which returns a new
DataFrame with only the columns chosen in Columns Selection expression.

After executing the cell where a `DataFrame` variable is declared, an extension with properties for its columns is
automatically generated.
After executing the cell where a `DataFrame` variable is declared,
Collaborator

or DataRow

Collaborator Author

unnecessary information IMHO. I mean there's a mention of the Row API below, I think that's enough to begin with.


Some operations use `RowExpression`, i.e., expression that applies for all `DataFrame` rows. For example `.filter { }`
that returns a new `DataFrame` with rows that satisfy a condition given by row expression.
Some operations use [DataRow API](DataRow.md), with expressions and conditions
Collaborator

the

that apply for all `DataFrame` rows.
For example, `.filter { }` that returns a new `DataFrame` with rows \
Collaborator

that

Collaborator

random backslash?

)
```

Read the `DataFrame` from the CSV file and specify the schema with `convertTo`:
Collaborator

hmm maybe also mention cast<>()? convertTo is safer but also heavier

val df = DataFrame.readCsv("example.csv").convertTo<Person>()
```

Extensions for this `DataFrame` will be generated automatically by plugin,
Collaborator

*the plugin


Moreover, new extensions will be generated on-the-fly after each schema change:
by changing any column [name](rename.md)
or [type](convert.md), or [add](add.md) a new one.
Collaborator

oxford comma

For example, rename the "name" column into "firstName" and then we can use `firstName` extensions
Collaborator

`name` column

Collaborator Author

My logic is simple:
name is an extension property (DataColumn/row value).
"name" is a column.

Collaborator

hmm but then "" could both refer to a column and a column name, which is also confusing IMO. I'd be okay with the column and the column accessor being written the same.

To find out how to use this API in your environment, check out [Working with Data Schemas](schemas.md)
or jump straight to [Data Schemas in Gradle projects](schemasGradle.md),
or [Data Schemas in Jupyter notebooks](schemasJupyter.md).
See [Kotlin DataFrame Compiler Plugin Example](https://github.com/Kotlin/dataframe/tree/plugin_example/examples/kotlin-dataframe-plugin-example)
Collaborator

can you make the name a bit shorter here as well?

@AndreiKingsley AndreiKingsley merged commit d8012ed into master Jun 13, 2025
5 checks passed
@AndreiKingsley AndreiKingsley deleted the extension_properties_docs branch June 13, 2025 12:40
Successfully merging this pull request may close these issues.

Add more info about extensions in docs.