various docs fixes to lower the number of errors #1238

Merged
merged 2 commits into from
Jun 5, 2025
2 changes: 0 additions & 2 deletions docs/README.md
@@ -62,5 +62,3 @@ Code samples for the documentation website reside in [core/.../test/.../samples/
and [tests/.../samples/api](../tests/src/test/kotlin/org/jetbrains/kotlinx/dataframe/samples/api) (for samples that can depend on other I/O modules)
and they are copied over to Markdown files in [docs/StardustDocs/topics](./StardustDocs/topics)
by [Korro](https://github.com/devcrocod/korro).


6 changes: 3 additions & 3 deletions docs/StardustDocs/topics/ColumnSelectors.md
@@ -49,17 +49,17 @@ or `ColumnSet` that adheres to the optional given condition. If no column adhere
`NoSuchElementException` is thrown.

##### Col {collapsible="true"}
`col(name)`, `col(5)`, `this[5]`
`col(name)`, `col(5)`

Creates a [ColumnAccessor](DataColumn.md#column-accessors) (or `SingleColumn`) for a column with the given
Collaborator:

Should we add a short explanation about ColumnAccessor and its difference with DataColumn?

Collaborator Author:

Maybe... But I'm not sure where. Should it be in DataColumn or here in the selectors?

Collaborator Author:

Then again... it's not really an accessor like we made before. It's just a resolver; maybe we could better rephrase it in the entire file.

Creates a [ColumnAccessor](DataColumn.md) (or `SingleColumn`) for a column with the given
argument from the top-level or specified [column group](DataColumn.md#columngroup). The argument can be either an
index (`Int`) or a reference to a column (`String`, `ColumnPath`, `KProperty`, or `ColumnAccessor`;
any [AccessApi](apiLevels.md)).

##### Value Col, Frame Col, Col Group {collapsible="true"}
`valueCol(name)`, `valueCol(5)`, `frameCol(name)`, `frameCol(5)`, `colGroup(name)`, `colGroup(5)`

Creates a [ColumnAccessor](DataColumn.md#column-accessors) (or `SingleColumn`) for a
Creates a [ColumnAccessor](DataColumn.md) (or `SingleColumn`) for a
[value column](DataColumn.md#valuecolumn) / [frame column](DataColumn.md#framecolumn) /
[column group](DataColumn.md#columngroup) with the given argument from the top-level or
specified [column group](DataColumn.md#columngroup). The argument can be either an index (`Int`) or a reference
4 changes: 3 additions & 1 deletion docs/StardustDocs/topics/Compiler-Plugin.md
@@ -3,7 +3,9 @@
Kotlin DataFrame compiler plugin: available in Gradle projects and coming soon to Kotlin Notebook and Maven projects.

Check out this video that shows how expressions update the schema of a dataframe:
<video src="compiler_plugin.mp4" controls/>


<video src="compiler_plugin.mp4" controls=""/>

## Setup

2 changes: 1 addition & 1 deletion docs/StardustDocs/topics/_shadow_resources.md
@@ -164,4 +164,4 @@
<resource src="notebook_test_generate_docs_1.html"></resource>
<resource src="notebook_test_rename_3.html"></resource>
<resource src="notebook_test_rename_4.html"></resource>
<resource src="notebook_test_rename_5.html"></resource>
<resource src="notebook_test_rename_5.html"></resource>
@@ -224,7 +224,7 @@ tasks.withType(org.jmailen.gradle.kotlinter.tasks.LintTask).all {
</tab>
<tab title=".editorconfig">

```.editorconfig
```editorconfig
[{**/*.Generated.kt,**/*$Extensions.kt}]
ktlint = disabled
```
@@ -22,7 +22,7 @@ Here are the key features:
You can quickly load data into a `DataFrame` in a notebook by simply dragging and dropping a file
(.csv/.json/.xlsx and .geojson/.shp) directly into the notebook editor:

<video src="ktnb_drag_n_drop.mp4" controls/>
<video src="ktnb_drag_n_drop.mp4" controls=""/>

### Visual Data Exploration
**Page through your data**:
@@ -35,18 +35,26 @@ This is a convenient alternative to using `sortBy` in separate cells.
**Go straight to the data you need**:
You can jump directly to a particular row or column if you want something specific.
This makes working with large datasets more straightforward.
<video src="https://github.com/user-attachments/assets/aeae1c79-9755-4558-bac4-420bf1331f39" controls></video>


<video src="https://github.com/user-attachments/assets/aeae1c79-9755-4558-bac4-420bf1331f39" controls=""/>


### Drill down into nested data
When your data has multiple layers, like a table within a table,
you can now click on a cell containing a nested table to view these details directly.
This makes it easy to go deeper into your data and then return to where you were.
<video src="https://github.com/user-attachments/assets/ef9509be-e19b-469c-9bad-0ce81eec36b0" controls></video>


<video src="https://github.com/user-attachments/assets/ef9509be-e19b-469c-9bad-0ce81eec36b0" controls=""/>


### Visualize multiple tables via tabs
You can open and visualize multiple tables in separate tabs.
This feature is tailored to those who need to compare, contrast, or monitor different datasets simultaneously.
<video src="https://github.com/user-attachments/assets/51b7a6e3-0187-49b3-bf5e-0c4d60f8b769" controls></video>


<video src="https://github.com/user-attachments/assets/51b7a6e3-0187-49b3-bf5e-0c4d60f8b769" controls=""/>


### Exporting to files
@@ -55,7 +63,9 @@ You can export data directly from the dataframe into various file formats.
This simplifies sharing and further analysis.
The interface supports exporting data to JSON for web applications,
CSV for spreadsheet tools, and XML for data interchange.
<video src="https://github.com/user-attachments/assets/ec28c59a-1555-44ce-98f6-a60d8feae347" controls></video>


<video src="https://github.com/user-attachments/assets/ec28c59a-1555-44ce-98f6-a60d8feae347" controls=""/>


### Convenient copying of data from tables
@@ -64,7 +74,9 @@ or you can use keyboard shortcuts for quicker selection
and then copy what’s needed with a simple right-click or another shortcut.
It’s designed to feel intuitive,
like copying text from a document, but with the structure and format of your data preserved.
<video src="https://github.com/user-attachments/assets/88e53dfb-361f-40f8-bffb-52a512cdd3cd" controls></video>


<video src="https://github.com/user-attachments/assets/88e53dfb-361f-40f8-bffb-52a512cdd3cd" controls=""/>


To get started, ensure you have the latest version of the Kotlin Notebook Plugin installed in IntelliJ IDEA,
20 changes: 10 additions & 10 deletions docs/StardustDocs/topics/median.md
@@ -58,13 +58,13 @@ See [column selectors](ColumnSelectors.md) for how to select the columns for thi
The following automatic type conversions are performed for the `median` operation.
(Note that `null` only appears in the return type when using `-orNull` overloads).

| Conversion | Result for Empty Input |
|--------------------------------|------------------------|
| T -> T where T : Comparable<T> | null |
| Int -> Double | null |
| Byte -> Double | null |
| Short -> Double | null |
| Long -> Double | null |
| Double -> Double | null |
| Float -> Double | null |
| Nothing -> Nothing | null |
| Conversion | Result for Empty Input |
|----------------------------------|------------------------|
| T -> T where T : Comparable\<T\> | null |
| Int -> Double | null |
| Byte -> Double | null |
| Short -> Double | null |
| Long -> Double | null |
| Double -> Double | null |
| Float -> Double | null |
| Nothing -> Nothing | null |
20 changes: 10 additions & 10 deletions docs/StardustDocs/topics/minmax.md
@@ -49,13 +49,13 @@ See [column selectors](ColumnSelectors.md) for how to select the columns for thi
The following automatic type conversions are performed for the `min` and `max` operations.
(Note that `null` only appears in the return type when using `-orNull` overloads).

| Conversion | Result for Empty Input |
|--------------------------------|------------------------|
| T -> T where T : Comparable<T> | null |
| Int -> Int | null |
| Byte -> Byte | null |
| Short -> Short | null |
| Long -> Long | null |
| Double -> Double | null |
| Float -> Float | null |
| Nothing -> Nothing | null |
| Conversion | Result for Empty Input |
|----------------------------------|------------------------|
| T -> T where T : Comparable\<T\> | null |
| Int -> Int | null |
| Byte -> Byte | null |
| Short -> Short | null |
| Long -> Long | null |
| Double -> Double | null |
| Float -> Float | null |
| Nothing -> Nothing | null |
39 changes: 39 additions & 0 deletions docs/StardustDocs/topics/parse.md
@@ -96,3 +96,42 @@ DataFrame.parser.addDateTimePattern("dd.MM.uuuu HH:mm:ss")
```

<!---END-->

For `locale`, this means that the one being used by the parser is defined as:

↪ The locale given as function argument directly, or in `parserOptions`, if it is not `null`, else

&nbsp;&nbsp;&nbsp;&nbsp;↪ The locale set by `DataFrame.parser.locale = ...`, if it is not `null`, else

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;↪ `Locale.getDefault()`, which is the system's default locale that can be changed with `Locale.setDefault()`.
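
The precedence above can be pictured as a chain of null-coalescing fallbacks. The following is only a minimal sketch using plain `java.util.Locale`; `resolveLocale` and its parameter names are hypothetical and not part of the DataFrame API.

```kotlin
import java.util.Locale

// Hypothetical helper (not a DataFrame function): mirrors the documented precedence.
// argumentLocale: the locale passed directly or via parserOptions (may be null)
// globalLocale:   the locale set through DataFrame.parser.locale (may be null)
fun resolveLocale(argumentLocale: Locale?, globalLocale: Locale?): Locale =
    argumentLocale ?: globalLocale ?: Locale.getDefault()

fun main() {
    // The explicit argument wins over the global setting.
    println(resolveLocale(Locale.GERMANY, Locale.US))
    // Without an argument, the global setting is used.
    println(resolveLocale(null, Locale.US))
    // With neither set, the system default is used.
    println(resolveLocale(null, null))
}
```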

### Parsing Doubles

DataFrame has a new fast and powerful double parser enabled by default.
It is based on [the FastDoubleParser library](https://github.com/wrandelshofer/FastDoubleParser) for its
high performance and configurability
(in the future, we might expand this support to `Float`, `BigDecimal`, and `BigInteger` as well).

The parser is locale-aware; it will use the locale set by the
[(global)](#global-parser-options) [parser options](#parser-options) to parse the doubles.
It also has a fallback mechanism built in, meaning it can recognize characters from
all other locales (and some from [Wikipedia](https://en.wikipedia.org/wiki/Decimal_separator))
and parse them correctly as long as they don't conflict with the current locale.

For example, if your locale uses ',' as decimal separator, it will not recognize ',' as thousands separator, but it will
recognize ''', ' ', '٬', '_', ' ', etc. as such.
The same holds for characters like "e", "inf", "×10^", "NaN", etc. (ignoring case).

This means you can safely parse `"123'456 789,012.345×10^6"` with a US locale but not `"1.234,5"`.

Aside from this, DataFrame also explicitly recognizes "∞", "inf", "infinity", and "infty" as `Double.POSITIVE_INFINITY`
(as well as their negative counterparts), "nan", "na", and "n/a" as `Double.NaN`,
and all forms of whitespace are treated equally.
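
To make the list of special spellings concrete, here is a small illustrative sketch. `parseSpecialDouble` is a hypothetical helper, not the library's implementation; treating the input case-insensitively and trimming surrounding whitespace are assumptions of this sketch.

```kotlin
// Hypothetical helper: maps the special spellings listed above to Double values.
// Returns null for anything that is not one of those spellings.
fun parseSpecialDouble(text: String): Double? =
    when (text.trim().lowercase()) {
        "∞", "inf", "infinity", "infty" -> Double.POSITIVE_INFINITY
        "-∞", "-inf", "-infinity", "-infty" -> Double.NEGATIVE_INFINITY
        "nan", "na", "n/a" -> Double.NaN
        else -> null
    }

fun main() {
    println(parseSpecialDouble("Infinity"))
    println(parseSpecialDouble("-infty"))
    println(parseSpecialDouble("N/A"))
    println(parseSpecialDouble("1.5"))
}
```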

If `FastDoubleParser` fails to parse a `String` as `Double`, DataFrame will try
to parse it using the standard `NumberFormat.parse()` function as a last resort.
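
The "last resort" idea can be sketched with standard JDK calls only. In this sketch, `String.toDoubleOrNull()` is merely a stand-in for the fast path (the library uses FastDoubleParser there), and `parseDoubleWithFallback` is a hypothetical name; requiring the whole input to be consumed is also an assumption of the sketch.

```kotlin
import java.text.NumberFormat
import java.text.ParsePosition
import java.util.Locale

// Hypothetical helper: tries a fast path first, then falls back to NumberFormat.
fun parseDoubleWithFallback(text: String, locale: Locale): Double? {
    // Stand-in for the fast parser; NOT what DataFrame actually calls.
    text.toDoubleOrNull()?.let { return it }

    // Locale-aware fallback. parse(String, ParsePosition) returns null on failure
    // instead of throwing, and reports how far it got via the position.
    val position = ParsePosition(0)
    val number = NumberFormat.getInstance(locale).parse(text, position)

    // Accept the result only if the entire string was consumed.
    return if (number != null && position.index == text.length) number.toDouble() else null
}

fun main() {
    println(parseDoubleWithFallback("12.5", Locale.US))
    println(parseDoubleWithFallback("1.234,5", Locale.GERMANY))
    println(parseDoubleWithFallback("abc", Locale.US))
}
```

Checking `position.index` against the input length avoids silently accepting a number followed by trailing garbage, which `NumberFormat.parse(String)` alone would allow.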

If you experience any issues with the new parser, you can turn it off by setting
`useFastDoubleParser = false`, which will use the old `NumberFormat.parse()` function instead.

Please [report](https://github.com/Kotlin/dataframe/issues) any issues you encounter.
20 changes: 10 additions & 10 deletions docs/StardustDocs/topics/percentile.md
@@ -80,13 +80,13 @@ See [column selectors](ColumnSelectors.md) for how to select the columns for thi
The following automatic type conversions are performed for the `percentile` operation.
(Note that `null` only appears in the return type when using `-orNull` overloads).

| Conversion | Result for Empty Input |
|--------------------------------|------------------------|
| T -> T where T : Comparable<T> | null |
| Int -> Double | null |
| Byte -> Double | null |
| Short -> Double | null |
| Long -> Double | null |
| Double -> Double | null |
| Float -> Double | null |
| Nothing -> Nothing | null |
| Conversion | Result for Empty Input |
|----------------------------------|------------------------|
| T -> T where T : Comparable\<T\> | null |
| Int -> Double | null |
| Byte -> Double | null |
| Short -> Double | null |
| Long -> Double | null |
| Double -> Double | null |
| Float -> Double | null |
| Nothing -> Nothing | null |