IDE sample of "unsupported sources"->DataFrame #1231
Open
Jolanrensen wants to merge 13 commits into master from unsupported-data-sources-examples
13 commits
0e16817 added IDE sample of exposed->DataFrame (Jolanrensen)
211412a WIP spark (Jolanrensen)
28866ca added working kotlin spark sample (Jolanrensen)
46128e8 added kotlin-spark-api-less example for spark as well (Jolanrensen)
464393f Merge branch 'master' into unsupported-data-sources-examples (Jolanrensen)
79eac6c wip Multik sample (Jolanrensen)
70c8ee8 updating multik example, 2d (Jolanrensen)
05fd49e added multik n-dim example (Jolanrensen)
7817357 small docs and readme updates regarding main concepts (Jolanrensen)
27fd209 expanded on all comments and made all steps of unsupported data sourc… (Jolanrensen)
2327777 Made mri-like example for Multik inside dataframe (Jolanrensen)
3194e30 Merge branch 'master' into unsupported-data-sources-examples (Jolanrensen)
1520147 Adding links to interop guides examples in docs (Jolanrensen)
73 changes: 73 additions & 0 deletions
examples/idea-examples/unsupported-data-sources/build.gradle.kts
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,73 @@ | ||
plugins { | ||
application | ||
kotlin("jvm") | ||
|
||
id("org.jetbrains.kotlinx.dataframe") | ||
|
||
// only mandatory if `kotlin.dataframe.add.ksp=false` in gradle.properties | ||
id("com.google.devtools.ksp") | ||
} | ||
|
||
repositories { | ||
mavenLocal() // in case of local dataframe development | ||
mavenCentral() | ||
} | ||
|
||
dependencies { | ||
// implementation("org.jetbrains.kotlinx:dataframe:X.Y.Z") | ||
implementation(project(":")) | ||
|
||
// exposed + sqlite database support | ||
implementation(libs.sqlite) | ||
implementation(libs.exposed.core) | ||
implementation(libs.exposed.kotlin.datetime) | ||
implementation(libs.exposed.jdbc) | ||
implementation(libs.exposed.json) | ||
implementation(libs.exposed.money) | ||
|
||
// (kotlin) spark support | ||
implementation(libs.kotlin.spark) | ||
compileOnly(libs.spark) | ||
implementation(libs.log4j.core) | ||
implementation(libs.log4j.api) | ||
|
||
// multik support | ||
implementation(libs.multik.core) | ||
implementation(libs.multik.default) | ||
} | ||
|
||
/** | ||
* Runs the kotlinSpark/typedDataset example with java 11. | ||
*/ | ||
val runKotlinSparkTypedDataset by tasks.registering(JavaExec::class) { | ||
classpath = sourceSets["main"].runtimeClasspath | ||
javaLauncher = javaToolchains.launcherFor { languageVersion = JavaLanguageVersion.of(11) } | ||
mainClass = "org.jetbrains.kotlinx.dataframe.examples.kotlinSpark.TypedDatasetKt" | ||
} | ||
|
||
/** | ||
* Runs the kotlinSpark/untypedDataset example with java 11. | ||
*/ | ||
val runKotlinSparkUntypedDataset by tasks.registering(JavaExec::class) { | ||
classpath = sourceSets["main"].runtimeClasspath | ||
javaLauncher = javaToolchains.launcherFor { languageVersion = JavaLanguageVersion.of(11) } | ||
mainClass = "org.jetbrains.kotlinx.dataframe.examples.kotlinSpark.UntypedDatasetKt" | ||
} | ||
|
||
/** | ||
* Runs the spark/typedDataset example with java 11. | ||
*/ | ||
val runSparkTypedDataset by tasks.registering(JavaExec::class) { | ||
classpath = sourceSets["main"].runtimeClasspath | ||
javaLauncher = javaToolchains.launcherFor { languageVersion = JavaLanguageVersion.of(11) } | ||
mainClass = "org.jetbrains.kotlinx.dataframe.examples.spark.TypedDatasetKt" | ||
} | ||
|
||
/** | ||
* Runs the spark/untypedDataset example with java 11. | ||
*/ | ||
val runSparkUntypedDataset by tasks.registering(JavaExec::class) { | ||
classpath = sourceSets["main"].runtimeClasspath | ||
javaLauncher = javaToolchains.launcherFor { languageVersion = JavaLanguageVersion.of(11) } | ||
mainClass = "org.jetbrains.kotlinx.dataframe.examples.spark.UntypedDatasetKt" | ||
} |
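The four `JavaExec` tasks above differ only in their name and main class. As a sketch of one possible alternative (not what the PR does), they could be registered in a loop. The list `sparkExamples` and the variable names are made up for illustration; `tasks.register<JavaExec>`, `javaToolchains.launcherFor`, and `JavaLanguageVersion.of` are the same Gradle Kotlin DSL calls the build script already relies on.

```kotlin
// build.gradle.kts fragment (assumes the same plugins block as above)

// Hypothetical helper list: task name -> main class suffix under
// org.jetbrains.kotlinx.dataframe.examples
val sparkExamples = listOf(
    "runKotlinSparkTypedDataset" to "kotlinSpark.TypedDatasetKt",
    "runKotlinSparkUntypedDataset" to "kotlinSpark.UntypedDatasetKt",
    "runSparkTypedDataset" to "spark.TypedDatasetKt",
    "runSparkUntypedDataset" to "spark.UntypedDatasetKt",
)

sparkExamples.forEach { (taskName, mainSuffix) ->
    tasks.register<JavaExec>(taskName) {
        classpath = sourceSets["main"].runtimeClasspath
        // the Spark examples are run with Java 11, as in the original tasks
        javaLauncher = javaToolchains.launcherFor { languageVersion = JavaLanguageVersion.of(11) }
        mainClass = "org.jetbrains.kotlinx.dataframe.examples.$mainSuffix"
    }
}
```

The explicit `val ... by tasks.registering` form in the diff has the advantage that each task is also available as a typed property in the script, so the loop is only worth it if that is not needed.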
107 changes: 107 additions & 0 deletions
...es/src/main/kotlin/org/jetbrains/kotlinx/dataframe/examples/exposed/compatibilityLayer.kt
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,107 @@ | ||
package org.jetbrains.kotlinx.dataframe.examples.exposed | ||
|
||
import org.jetbrains.exposed.v1.core.BiCompositeColumn | ||
import org.jetbrains.exposed.v1.core.Column | ||
import org.jetbrains.exposed.v1.core.Expression | ||
import org.jetbrains.exposed.v1.core.ExpressionAlias | ||
import org.jetbrains.exposed.v1.core.ResultRow | ||
import org.jetbrains.exposed.v1.core.Table | ||
import org.jetbrains.exposed.v1.jdbc.Query | ||
import org.jetbrains.kotlinx.dataframe.AnyFrame | ||
import org.jetbrains.kotlinx.dataframe.DataFrame | ||
import org.jetbrains.kotlinx.dataframe.annotations.DataSchema | ||
import org.jetbrains.kotlinx.dataframe.api.convertTo | ||
import org.jetbrains.kotlinx.dataframe.api.toDataFrame | ||
import org.jetbrains.kotlinx.dataframe.codeGen.NameNormalizer | ||
import org.jetbrains.kotlinx.dataframe.impl.schema.DataFrameSchemaImpl | ||
import org.jetbrains.kotlinx.dataframe.schema.ColumnSchema | ||
import org.jetbrains.kotlinx.dataframe.schema.DataFrameSchema | ||
import kotlin.reflect.KProperty1 | ||
import kotlin.reflect.full.isSubtypeOf | ||
import kotlin.reflect.full.memberProperties | ||
import kotlin.reflect.typeOf | ||
|
||
/** | ||
* Retrieves all columns of any [Iterable][Iterable]`<`[ResultRow][ResultRow]`>`, like [Query][Query], | ||
* from Exposed row by row and converts the resulting [Map] into a [DataFrame], cast to type [T]. | ||
* | ||
* In notebooks, the untyped version works just as well due to runtime inference :) | ||
*/ | ||
inline fun <reified T : Any> Iterable<ResultRow>.convertToDataFrame(): DataFrame<T> = | ||
convertToDataFrame().convertTo<T>() | ||
|
||
/** | ||
* Retrieves all columns of an [Iterable][Iterable]`<`[ResultRow][ResultRow]`>` from Exposed, like [Query][Query], | ||
* row by row and converts the resulting [Map] of lists into a [DataFrame] by calling | ||
* [Map.toDataFrame]. | ||
*/ | ||
@JvmName("convertToAnyFrame") | ||
fun Iterable<ResultRow>.convertToDataFrame(): AnyFrame { | ||
val map = mutableMapOf<String, MutableList<Any?>>() | ||
for (row in this) { | ||
for (expression in row.fieldIndex.keys) { | ||
map.getOrPut(expression.readableName) { | ||
mutableListOf() | ||
} += row[expression] | ||
} | ||
} | ||
return map.toDataFrame() | ||
} | ||
|
||
/** | ||
* Retrieves a simple column name from [this] [Expression]. | ||
* | ||
* Might need to be expanded with multiple types of [Expression]. | ||
*/ | ||
val Expression<*>.readableName: String | ||
get() = when (this) { | ||
is Column<*> -> name | ||
is ExpressionAlias<*> -> alias | ||
is BiCompositeColumn<*, *, *> -> getRealColumns().joinToString("_") { it.readableName } | ||
else -> toString() | ||
} | ||
|
||
/** | ||
* Creates a [DataFrameSchema] from the declared [Table] instance. | ||
* | ||
* This is not needed for conversion, but it can be useful to create a DataFrame [@DataSchema][DataSchema] instance. | ||
* | ||
* @param columnNameToAccessor Optional [MutableMap] which will be filled with entries mapping | ||
* the SQL column name to the accessor name from the [Table]. | ||
* This can be used to define a [NameNormalizer] later. | ||
* @see toDataFrameSchemaWithNameNormalizer | ||
*/ | ||
@Suppress("UNCHECKED_CAST") | ||
fun Table.toDataFrameSchema(columnNameToAccessor: MutableMap<String, String> = mutableMapOf()): DataFrameSchema { | ||
// we use reflection to go over all `Column<*>` properties in the Table object | ||
val columns = this::class.memberProperties | ||
.filter { it.returnType.isSubtypeOf(typeOf<Column<*>>()) } | ||
.associate { prop -> | ||
prop as KProperty1<Table, Column<*>> | ||
|
||
// retrieve the SQL column name | ||
val columnName = prop.get(this).name | ||
// store the SQL column name together with the accessor name in the map | ||
columnNameToAccessor[columnName] = prop.name | ||
|
||
// get the column type from `val a: Column<Type>` | ||
val type = prop.returnType.arguments.first().type!! | ||
|
||
// and we add the name and column schema type to the `columns` map :) | ||
columnName to ColumnSchema.Value(type) | ||
} | ||
return DataFrameSchemaImpl(columns) | ||
} | ||
|
||
/** | ||
* Creates a [DataFrameSchema] from the declared [Table] instance with a [NameNormalizer] to | ||
* convert the SQL column names to the corresponding Kotlin property names. | ||
* | ||
* This is not needed for conversion, but it can be useful to create a DataFrame [@DataSchema][DataSchema] instance. | ||
* | ||
* @see toDataFrameSchema | ||
*/ | ||
fun Table.toDataFrameSchemaWithNameNormalizer(): Pair<DataFrameSchema, NameNormalizer> { | ||
val columnNameToAccessor = mutableMapOf<String, String>() | ||
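// pass the map along so it is filled while the schema is built; otherwise the normalizer never renames anything | ||
return Pair(toDataFrameSchema(columnNameToAccessor), NameNormalizer { columnNameToAccessor[it] ?: it }) | ||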
} |
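To make the two core ideas of this file easier to follow without Exposed or DataFrame on the classpath, here is a minimal stdlib-only sketch. It is an illustration under assumptions, not the PR's code: `Row`, `pivot`, and `normalizer` are hypothetical names, a plain `Map<String, Any?>` stands in for `ResultRow`, and a `(String) -> String` lambda stands in for `NameNormalizer`. It shows the row-by-row pivot into a map of column lists (the shape that is handed to `Map.toDataFrame`), and that a lookup-with-fallback normalizer only renames columns present in the map it closes over.

```kotlin
// Hypothetical stand-in for an Exposed ResultRow: column name -> value.
typealias Row = Map<String, Any?>

// Pivots row-oriented data into column-oriented lists, one list per column name.
fun pivot(rows: Iterable<Row>): Map<String, List<Any?>> {
    val columns = mutableMapOf<String, MutableList<Any?>>()
    for (row in rows) {
        for ((name, value) in row) {
            columns.getOrPut(name) { mutableListOf() }.add(value)
        }
    }
    return columns
}

// Hypothetical stand-in for a NameNormalizer: look the SQL name up in the map,
// fall back to the SQL name itself when there is no entry.
fun normalizer(columnNameToAccessor: Map<String, String>): (String) -> String =
    { columnNameToAccessor[it] ?: it }

fun main() {
    val rows: List<Row> = listOf(
        mapOf("customer_id" to 1, "first_name" to "Ada"),
        mapOf("customer_id" to 2, "first_name" to null),
    )
    println(pivot(rows)) // {customer_id=[1, 2], first_name=[Ada, null]}

    val rename = normalizer(mapOf("customer_id" to "customerId"))
    println(rename("customer_id")) // customerId
    println(rename("first_name"))  // first_name (no entry, so unchanged)

    // With an empty map the normalizer is effectively the identity function.
    println(normalizer(emptyMap())("customer_id")) // customer_id
}
```

Note that this simple pivot assumes every row exposes the same set of columns, as rows from a single query do; rows with differing keys would produce lists of unequal length.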
Might be useful to add a link to https://kotlin.github.io/dataframe/extensionpropertiesapi.html#example?