[//]: # (title: Extension Properties API)

- When working with a DataFrame, the most convenient and reliable way
+ When working with a [`DataFrame`](DataFrame.md), the most convenient and reliable way
to access its columns — including for operations and retrieving column values
in row expressions — is through auto-generated extension properties.
They are generated based on a [dataframe schema](schemas.md),
with the name and type of properties inferred from the name and type of the corresponding columns.
- It also works for all types of hierarchical dataframes
+ It also works for all types of hierarchical dataframes.
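
To make the idea of schema-derived properties concrete, here is a minimal plain-Kotlin sketch. It does not use or reproduce the Kotlin DataFrame implementation: `Row`, `TypedRow`, `PersonSchema`, `InfoSchema`, and the sample values are hypothetical names chosen for illustration. It only shows how extension properties declared on a schema-tagged type can mirror column names and types, including a nested group.

```kotlin
// Plain-Kotlin sketch of the idea only; NOT the Kotlin DataFrame implementation.
// Row, TypedRow, PersonSchema and InfoSchema are hypothetical names.

// An untyped row: column name -> value. A nested Row stands in for a column group.
class Row(private val values: Map<String, Any?>) {
    operator fun get(column: String): Any? = values[column]
}

// Marker interfaces play the role of a schema.
interface InfoSchema
interface PersonSchema

// A row tagged with a schema marker at compile time.
class TypedRow<T>(val row: Row)

// "Generated" properties: each name and type mirrors a column's name and type.
val TypedRow<PersonSchema>.name: String
    get() = row["name"] as String
val TypedRow<PersonSchema>.info: TypedRow<InfoSchema>
    get() = TypedRow<InfoSchema>(row["info"] as Row)
val TypedRow<InfoSchema>.age: Int
    get() = row["age"] as Int
val TypedRow<InfoSchema>.height: Double
    get() = row["height"] as Double

fun main() {
    // Sample values are made up for this sketch.
    val person = TypedRow<PersonSchema>(
        Row(mapOf("name" to "Alice", "info" to Row(mapOf("age" to 23, "height" to 175.5))))
    )
    // Typed access, including into the nested group.
    println(person.name.startsWith("A") && person.info.age >= 16) // prints "true"
}
```

The nested `info` property returns another schema-tagged value, which is one way hierarchical access such as `info.age` can stay type-safe.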

> The behavior of data schema generation differs between the
> [Compiler Plugin](Compiler-Plugin.md) and [Kotlin Notebook](gettingStartedKotlinNotebook.md).
>
- > * In the **Kotlin Notebook**, a schema is generated *only after cell execution* for
+ > * In **Kotlin Notebook**, a schema is generated *only after cell execution* for
> `DataFrame` variables defined within that cell.
> * With the **Compiler Plugin**, a new schema is generated *after every operation*
> — but support for all operations is still in progress.
@@ -21,10 +21,12 @@ It also works for all types of hierarchical dataframes

## Example

- Consider
+ Consider a simple hierarchical dataframe from
<resource src="example.csv"></resource>.
+
This table consists of two columns: `name`, which is a `String` column, and `info`,
- which is a **column group** containing two nested value columns —
+ which is a [**column group**](DataColumn.md#columngroup) containing two nested
+ [value columns](DataColumn.md#valuecolumn) —
`age` of type `Int`, and `height` of type `Double`.

<table>
@@ -55,7 +57,7 @@ which is a **column group** containing two nested value columns —

<tabs>
<tab title="Kotlin Notebook">
- Read the `DataFrame` from the CSV file:
+ Read the [`DataFrame`](DataFrame.md) from the CSV file:

```kotlin
val df = DataFrame.readCsv("example.csv")
@@ -78,10 +80,10 @@ df.sortBy { name and info.height }
df.filter { name.startsWith("A") && info.age >= 16 }
```

- If you change DataFrame schema by changing any column [name](rename.md)
- or [type](convert.md), or [add](add.md) a new one, you need to
- run a cell with a new DataFrame declaration first.
- For example, rename the "name" column into "firstName":
+ If you change the dataframe's schema by changing any column [name](rename.md),
+ or [type](convert.md) or [add](add.md) a new one, you need to
+ run a cell with a new [`DataFrame`](DataFrame.md) declaration first.
+ For example, rename the `name` column into "firstName":

```kotlin
val dfRenamed = df.rename { name }.into("firstName")
@@ -95,13 +97,13 @@ dfRenamed.rename { firstName }.into("name")
dfRenamed.filter { firstName == "Nikita" }
```
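
One way to picture why a renamed column needs its own declaration is that Kotlin resolves extension properties against the static type of the receiver. The sketch below is plain Kotlin under that assumption and is not the Kotlin DataFrame implementation; `Frame`, `Original`, `Renamed`, and `renameNameToFirstName` are hypothetical names, and the sample values are made up.

```kotlin
// Plain-Kotlin sketch; hypothetical names, NOT the Kotlin DataFrame implementation.

// A frame tagged with a schema marker type T.
class Frame<T>(val columns: Map<String, List<Any?>>)

interface Original // schema marker: has a "name" column
interface Renamed  // schema marker: has a "firstName" column

// Each property exists only for frames tagged with the matching schema marker.
val Frame<Original>.name: List<String>
    get() = columns.getValue("name").map { it as String }
val Frame<Renamed>.firstName: List<String>
    get() = columns.getValue("firstName").map { it as String }

// Renaming returns a frame with a different static type.
fun Frame<Original>.renameNameToFirstName(): Frame<Renamed> =
    Frame<Renamed>(columns.mapKeys { (key, _) -> if (key == "name") "firstName" else key })

fun main() {
    val df = Frame<Original>(mapOf("name" to listOf("Nikita", "Alice")))
    val dfRenamed = df.renameNameToFirstName()
    println(dfRenamed.firstName) // resolves: the receiver type is Frame<Renamed>
    // dfRenamed.name            // would not compile: `name` is declared for Frame<Original> only
}
```

In this sketch the new property becomes usable only through a value whose declared type carries the new schema, which loosely mirrors declaring `dfRenamed` before using `firstName`.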

- See [](quickstart.md) in the Kotlin Notebook with basic Extension Properties API examples.
+ See the [](quickstart.md) in Kotlin Notebook with basic Extension Properties API examples.

</tab>
<tab title="Compiler Plugin">

- For now, if you read `DatFrame` from a file or URL, you need to define its schema manually.
- You can do it fast with [`generate..()` methods](DataSchema-Data-Classes-Generation.md).
+ For now, if you read [`DataFrame`](DataFrame.md) from a file or URL, you need to define its schema manually.
+ You can do it quickly with [`generate..()` methods](DataSchema-Data-Classes-Generation.md).

Define schemas:
```kotlin
@@ -118,13 +120,14 @@ data class Person(
)
```

- Read the `DataFrame` from the CSV file and specify the schema with `convertTo`:
+ Read the `DataFrame` from the CSV file and specify the schema with
+ [`.convertTo()`](convertTo.md) or [`cast()`](cast.md):

```kotlin
val df = DataFrame.readCsv("example.csv").convertTo<Person>()
```

- Extensions for this `DataFrame` will be generated automatically by plugin,
+ Extensions for this `DataFrame` will be generated automatically by the plugin,
so you can use extensions for accessing columns,
using it in operations inside the [Column Selector DSL](ColumnSelectors.md)
and [DataRow API](DataRow.md).
@@ -142,8 +145,8 @@ df.filter { name.startsWith("A") && info.age >= 16 }
```

Moreover, new extensions will be generated on-the-fly after each schema change:
- by changing any column [name](rename.md)
- or [type](convert.md), or [add](add.md) a new one.
+ by changing any column [name](rename.md),
+ or [type](convert.md) or [add](add.md) a new one.
For example, rename the "name" column into "firstName" and then we can use `firstName` extensions
in the following operations:

@@ -155,7 +158,7 @@ df.rename { name }.into("firstName")
.filter { firstName == "Nikita" }
```

- See [Kotlin DataFrame Compiler Plugin Example](https://github.com/Kotlin/dataframe/tree/plugin_example/examples/kotlin-dataframe-plugin-example)
+ See [Compiler Plugin Example](https://github.com/Kotlin/dataframe/tree/plugin_example/examples/kotlin-dataframe-plugin-example)
IDEA project with basic Extension Properties API examples.
</tab>
</tabs>