Readme update #1251

Open: wants to merge 3 commits into base: master
120 changes: 104 additions & 16 deletions README.md
@@ -20,14 +20,26 @@ Kotlin DataFrame aims to reconcile Kotlin's static typing with the dynamic natur
* **Typesafe** — on-the-fly generation of extension properties for type safe data access with Kotlin-style care for null safety.
* **Polymorphic** — type compatibility derives from column schema compatibility. You can define a function that requires a special subset of columns in a dataframe but doesn't care about other columns.

Integrates with [Kotlin kernel for Jupyter](https://github.com/Kotlin/kotlin-jupyter). Inspired by [krangl](https://github.com/holgerbrandl/krangl), Kotlin Collections and [pandas](https://pandas.pydata.org/)
Integrates with [Kotlin Notebook](https://kotlinlang.org/docs/kotlin-notebook-overview.html).
Inspired by [krangl](https://github.com/holgerbrandl/krangl), Kotlin Collections, and [pandas](https://pandas.pydata.org/).

## 🚀 Quickstart

Looking for a fast and simple way to learn the basics?
Get started in minutes with our [Quickstart Guide](https://kotlin.github.io/dataframe/quickstart.html).

It walks you through the core features of Kotlin DataFrame with minimal setup and clear examples
— perfect for getting up to speed in just a few minutes.

[![quickstart_preview](docs/StardustDocs/images/guides/quickstart_preview.png)](https://kotlin.github.io/dataframe/quickstart.html)

## Documentation

Explore [**documentation**](https://kotlin.github.io/dataframe) for details.

You can find the following articles there:

* [Guides and Examples](https://kotlin.github.io/dataframe/guides-and-examples.html)
* [Get started with Kotlin DataFrame](https://kotlin.github.io/dataframe/gettingstarted.html)
* [Working with Data Schemas](https://kotlin.github.io/dataframe/schemas.html)
* [Setup compiler plugin in Gradle project](https://kotlin.github.io/dataframe/compiler-plugin.html)
@@ -46,31 +58,102 @@ Check out this [notebook with new features](examples/notebooks/feature_overviews

## Setup

```kotlin
implementation("org.jetbrains.kotlinx:dataframe:1.0.0-Beta2")
> For more detailed instructions on how to get started with Kotlin DataFrame, refer to the
Collaborator

maybe put this at the bottom?

Collaborator Author

  1. It's a general recommendation for both KTNB and gradle.
  2. I'd like user to use website setup guidelines, here it is not detailed enough.

> [Getting Started](https://kotlin.github.io/dataframe/gettingstarted.html).

### Kotlin Notebook

You can use Kotlin DataFrame in [Kotlin Notebook](https://kotlinlang.org/docs/kotlin-notebook-overview.html),
or in other interactive environments with [Kotlin Jupyter Kernel](https://github.com/Kotlin/kotlin-jupyter) support,
such as [Datalore](https://datalore.jetbrains.com/)
and [Jupyter Notebook](https://jupyter.org/).

You can include all the necessary dependencies and imports in the notebook using *line magic*:

```
%use dataframe
```

Check out the [custom setup page](https://kotlin.github.io/dataframe/gettingstartedgradleadvanced.html) if you don't need some of the formats as dependencies,
for Groovy, and for configurations specific to Android projects.
You can use `%useLatestDescriptors`
to get the latest stable version without updating the Kotlin kernel:

## Code example
```
%useLatestDescriptors
%use dataframe
```

```kotlin
import org.jetbrains.kotlinx.dataframe.*
import org.jetbrains.kotlinx.dataframe.api.*
import org.jetbrains.kotlinx.dataframe.io.*
Or manually specify the version:

```
%use dataframe($dataframe_version)
```

Refer to the
[Get started with Kotlin DataFrame in Kotlin Notebook](https://kotlin.github.io/dataframe/gettingstartedkotlinnotebook.html)
for details.

### Gradle

Add the dependency to your `build.gradle.kts` script:

```kotlin
val df = DataFrame.read("https://raw.githubusercontent.com/Kotlin/dataframe/master/data/jetbrains_repositories.csv")
df["full_name"][0] // Indexing https://kotlin.github.io/dataframe/access.html
dependencies {
    implementation("org.jetbrains.kotlinx:dataframe:1.0.0-Beta2")
}
```

Make sure that you have `mavenCentral()` in the list of repositories:

df.filter { "stargazers_count"<Int>() > 50 }.print()
```kotlin
repositories {
    mavenCentral()
}
```
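
For orientation, the two fragments above might sit together in one build script roughly as follows. This is only a sketch: the `plugins` block and the Kotlin version string are assumptions that do not appear in this diff, so pick the Kotlin version from the README's version compatibility table rather than copying the placeholder.

```kotlin
// build.gradle.kts: a minimal sketch, not the project's official template.
plugins {
    // Assumption: placeholder Kotlin version; use the one that matches
    // your DataFrame release.
    kotlin("jvm") version "2.0.20"
}

repositories {
    // Kotlin DataFrame artifacts are resolved from Maven Central.
    mavenCentral()
}

dependencies {
    // Same coordinates as in the snippet above.
    implementation("org.jetbrains.kotlinx:dataframe:1.0.0-Beta2")
}
```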

## Getting started in Kotlin Notebook
Refer to the
[Get started with Kotlin DataFrame on Gradle](https://kotlin.github.io/dataframe/gettingstartedgradle.html)
for details.
Also, check out the [custom setup page](https://kotlin.github.io/dataframe/gettingstartedgradleadvanced.html)
if you don't need some formats as dependencies,
for Groovy, and for configurations specific to Android projects.

## Code example

Follow this [guide](https://kotlin.github.io/dataframe/gettingstartedkotlinnotebook.html)
This is an example of Kotlin DataFrame code with
the [Compiler Plugin](https://kotlin.github.io/dataframe/compiler-plugin.html) enabled.
See [the full project](https://github.com/Kotlin/dataframe/tree/master/examples/kotlin-dataframe-plugin-example).
See also
[this example in Kotlin Notebook](https://github.com/Kotlin/dataframe/tree/master/examples/notebooks/readme_example.ipynb).

```kotlin
val repos = DataFrame
    // Read DataFrame from the CSV file.
    .readCsv("https://raw.githubusercontent.com/Kotlin/dataframe/master/data/jetbrains_repositories.csv")
    // And convert it to match the `Repositories` schema.
    .convertTo<Repositories>()

// Update the DataFrame.
val reposUpdated = repos
    // Rename columns to camelCase.
    .renameToCamelCase()
    // Rename "stargazersCount" column to "stars".
    .rename { stargazersCount }.into("stars")
    // Filter by the number of stars:
    .filter { stars > 50 }
    // Convert values in the "topics" column (which were `String` initially)
    // to the list of topics.
    .convert { topics }.with {
        val inner = it.removeSurrounding("[", "]")
        if (inner.isEmpty()) emptyList() else inner.split(',').map(String::trim)
    }
    // Add a new column with the number of topics.
    .add("topicCount") { topics.size }

// Write the updated DataFrame to a CSV file.
reposUpdated.writeCsv("jetbrains_repositories_new.csv")
```
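
The topic-parsing step in the example above uses only the Kotlin standard library, so it can be tried in isolation without any DataFrame dependency. A minimal sketch, assuming the raw cell value is a bracketed, comma-separated string; `parseTopics` is a hypothetical helper name introduced here for illustration and is not part of the README or the library:

```kotlin
// Hypothetical helper mirroring the lambda passed to `convert { topics }.with { ... }`.
fun parseTopics(raw: String): List<String> {
    // Strip the surrounding brackets; the string is left unchanged
    // unless both the prefix and the suffix are present.
    val inner = raw.removeSurrounding("[", "]")
    // An empty bracket pair means "no topics", not a list with one empty string.
    return if (inner.isEmpty()) emptyList() else inner.split(',').map { it.trim() }
}

fun main() {
    println(parseTopics("[kotlin, dataframe , jupyter]")) // prints [kotlin, dataframe, jupyter]
    println(parseTopics("[]"))                            // prints []
}
```

The explicit empty check matters because `"".split(',')` returns a list containing one empty string, which would otherwise give a topic count of 1 for repositories without topics.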

Explore [**more examples here**](https://kotlin.github.io/dataframe/guides-and-examples.html).

## Data model
* `DataFrame` is a list of columns with equal sizes and distinct names.
@@ -79,7 +162,12 @@ Follow this [guide](https://kotlin.github.io/dataframe/gettingstartedkotlinnoteb
* `ColumnGroup` — contains columns
* `FrameColumn` — contains dataframes

Explore [**more examples here**](https://kotlin.github.io/dataframe/guides-and-examples.html).
## Visualizations

The [Kandy](https://kotlin.github.io/kandy/welcome.html) plotting library provides seamless visualizations
Collaborator

visualizations or visualisations? ;P you write it in two ways

Collaborator

if we follow american english, it should be a z

for your dataframes.

![kandy_preview](docs/StardustDocs/images/guides/kandy_gallery_preview.png)

## Kotlin, Kotlin Jupyter, Arrow, and JDK versions
