[SPARK-51815] Add `Row` struct #63

dongjoon-hyun · 2025-04-16T06:03:45Z

What changes were proposed in this pull request?

This PR adds a Row struct and uses it in the DataFrame API.

Why are the changes needed?

To make DataFrame APIs return Row instead of [String?] in Swift.

- public func collect() async throws -> [[String?]] {
+ public func collect() async throws -> [Row] {

- public func head(_ n: Int32 = 1) async throws -> [[String?]] {
+ public func head(_ n: Int32 = 1) async throws -> [Row] {

Note that Row is added to support general type fields, but this PR only replaces the existing API's [String?] signature with a Row-based signature. The detailed one-to-one mapping for other types will be handled later.
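For illustration, here is a minimal sketch of what a Row-like value type could look like. It is not the implementation from this PR: the stored property, the initializer, `size`, the subscript, and `toString()` are all assumptions made for the example. It only shows how loosely typed optional fields leave room for non-string values later, even while callers still pass strings today.

```swift
// Hypothetical sketch; not the actual `Row` added by this PR.
struct Row: Sendable {
  // Fields are stored as optional existentials so that typed values
  // (not only strings) could be stored later. nil stands for SQL NULL.
  private let values: [(any Sendable)?]

  init(_ values: [(any Sendable)?]) {
    self.values = values
  }

  // Number of fields in the row.
  var size: Int { values.count }

  // Positional access to a field; nil means the field is NULL.
  subscript(index: Int) -> (any Sendable)? {
    values[index]
  }

  // Renders the row as "[a,b,c]", printing NULL for missing values.
  func toString() -> String {
    let fields = values.map { value -> String in
      guard let value else { return "NULL" }
      return "\(value)"
    }
    return "[" + fields.joined(separator: ",") + "]"
  }
}

// In this first step every field would still be a string or nil.
let row = Row(["Owner", nil, "185"])
print(row.size)        // 3
print(row.toString())  // [Owner,NULL,185]
```

With a type along these lines, `collect()` and `head()` would return `[Row]`, each row wrapping the same nullable string values that `[[String?]]` exposed before.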

Does this PR introduce any user-facing change?

Yes, but this is a change to the unreleased version.

How was this patch tested?

Pass the CIs.

Was this patch authored or co-authored using generative AI tooling?

No.

dongjoon-hyun · 2025-04-16T06:42:23Z

Could you review this Row when you have some time, @viirya? As you pointed out in the following comment on the earlier collect PR, Row had not been added until now. This PR is the first implementation.

[SPARK-51508] Support collect(): [[String?]] for DataFrame #17 (comment)

viirya · 2025-04-16T06:52:09Z

Sources/SparkConnect/DataFrame.swift

@@ -217,7 +217,7 @@ public actor DataFrame: Sendable {
            values.append(str.asString(i))


This still stores the value as a string? So we still cannot insert column values into Row with their own types?

Yes. This is a first-part implementation that introduces the Row struct, as noted in the PR description.

Note that Row is added to support general type fields, but this PR only replaces the existing API's [String?] signature with a Row-based signature. The detailed one-to-one mapping for other types will be handled later.

Got it. Thank you.

viirya · 2025-04-16T06:55:36Z

Tests/SparkConnectTests/Resources/queries/describe_database.sql.answer

+[Location,*]
+[Owner,*]


Hmm, why are the second elements of the two items ["Location","file:/opt/spark/work-dir/spark-warehouse"],["Owner","185"] now *?

The cleaning methods were added earlier, in the Catalog PR. While regenerating the answer files today, I used the cleaned strings because they are better suited to the GitHub repo.

spark-connect-swift/Tests/SparkConnectTests/SQLTests.swift

Lines 36 to 50 in 21724f5

private func cleanUp(_ str: String) -> String {
  return removeOwner(removeID(removeLocation(str)))
}

private func removeID(_ str: String) -> String {
  return str.replacing(regexPlanId, with: "plan_id=").replacing(regexID, with: "#")
}

private func removeLocation(_ str: String) -> String {
  return str.replacing(regexLocation, with: "*")
}

private func removeOwner(_ str: String) -> String {
  return str.replacing(regexOwner, with: "*")
}
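A self-contained sketch of how regex-based cleaning of this kind could turn the raw values into `*`. The two patterns below are hypothetical stand-ins: the actual `regexLocation` and `regexOwner` definitions in SQLTests.swift are not shown in this thread, and the ID clean-up step is left out. It assumes Swift 5.7 or later for regex literals and `replacing(_:with:)`.

```swift
// Hypothetical patterns for illustration; not the ones defined in SQLTests.swift.
func removeLocationSketch(_ str: String) -> String {
  // Match a file: URI up to the next comma, closing bracket, or whitespace.
  let regexLocation = #/file:[^\],\s]+/#
  return str.replacing(regexLocation, with: "*")
}

func removeOwnerSketch(_ str: String) -> String {
  // Match "Owner," followed by everything up to the closing bracket.
  let regexOwner = #/Owner,[^\]]+/#
  return str.replacing(regexOwner, with: "Owner,*")
}

// Applies the location step first, then the owner step.
func cleanUpSketch(_ str: String) -> String {
  return removeOwnerSketch(removeLocationSketch(str))
}

let raw = "[Location,file:/opt/spark/work-dir/spark-warehouse]\n[Owner,185]"
let cleaned = cleanUpSketch(raw)
print(cleaned)
// [Location,*]
// [Owner,*]
```

Masking machine-specific values this way keeps the checked-in answer files stable across environments, which matches the `[Location,*]` and `[Owner,*]` lines in the diff above.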

I see. Thank you.

dongjoon-hyun · 2025-04-16T06:59:11Z

Thank you! Merged to main.

dongjoon-hyun force-pushed the SPARK-51815 branch 2 times, most recently from a538a15 to 1ef9013 on April 16, 2025 06:12

[SPARK-51815] Add Row struct

6d441c3

dongjoon-hyun force-pushed the SPARK-51815 branch from 1ef9013 to 6d441c3 on April 16, 2025 06:17

Re-style

d087b50

viirya reviewed Apr 16, 2025

viirya approved these changes Apr 16, 2025

dongjoon-hyun closed this in 119eeea Apr 16, 2025

dongjoon-hyun deleted the SPARK-51815 branch April 16, 2025 07:00
