`PipelinesHandler` is used to [handle pipeline commands](#handlePipelinesCommand) in [Spark Connect]({{ book.spark_connect }}) ([SparkConnectPlanner]({{ book.spark_connect }}/server/SparkConnectPlanner), precisely).

* `SparkConnectPlanner` is requested to `handlePipelineCommand` (for the `PIPELINE_COMMAND` command)

### Define Dataset Command { #DEFINE_DATASET }
`handlePipelinesCommand` prints out the following INFO message to the logs:

```text
Define pipelines dataset cmd received: [cmd]
```
`handlePipelinesCommand` [defines a dataset](#defineDataset).
### Define Flow Command { #DEFINE_FLOW }
`handlePipelinesCommand` prints out the following INFO message to the logs:

```text
Define pipelines flow cmd received: [cmd]
```
`handlePipelinesCommand` [defines a flow](#defineFlow).
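
In both cases, `handlePipelinesCommand` logs the command and delegates to the matching helper. Below is a minimal, self-contained sketch of that dispatch; `PipelineCmd`, `handle`, and the dataset/flow names are illustrative stand-ins for `proto.PipelineCommand` and `handlePipelinesCommand`, not the actual Spark Connect API:

```scala
// Illustrative model only: real commands are protobuf messages, and the real handler
// delegates to defineDataset/defineFlow against the session's dataflow graph.
object PipelineCommandDispatchSketch {
  sealed trait PipelineCmd
  final case class DefineDataset(name: String) extends PipelineCmd
  final case class DefineFlow(name: String) extends PipelineCmd

  def handle(cmd: PipelineCmd): Unit = cmd match {
    case d: DefineDataset =>
      println(s"Define pipelines dataset cmd received: $d") // INFO message
      // ...defineDataset(d) would register the table or view
    case f: DefineFlow =>
      println(s"Define pipelines flow cmd received: $f") // INFO message
      // ...defineFlow(f) would register the flow
  }

  def main(args: Array[String]): Unit = {
    handle(DefineDataset("sales"))
    handle(DefineFlow("sales_hourly"))
  }
}
```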
### startRun { #startRun }

### createDataflowGraph { #createDataflowGraph }

```scala
createDataflowGraph(
  cmd: proto.PipelineCommand.CreateDataflowGraph,
  spark: SparkSession): String
```
`createDataflowGraph` finds the catalog and the database in the given `cmd` command and [creates a dataflow graph](DataflowGraphRegistry.md#createDataflowGraph).

`createDataflowGraph` returns the ID of the created dataflow graph.
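
A self-contained model of that behaviour is sketched below; the in-memory registry, the command fields, and the default catalog/database values are stand-ins for `DataflowGraphRegistry` and `proto.PipelineCommand.CreateDataflowGraph`, not the actual Spark Connect API:

```scala
import java.util.UUID
import scala.collection.mutable

// Illustrative model only: resolve the catalog and database from the command,
// register a new dataflow graph, and hand its ID back to the caller.
object CreateDataflowGraphSketch {
  final case class CreateDataflowGraph(
      defaultCatalog: Option[String],
      defaultDatabase: Option[String])

  private val graphs = mutable.Map.empty[String, (String, String)]

  def createDataflowGraph(cmd: CreateDataflowGraph): String = {
    val catalog = cmd.defaultCatalog.getOrElse("spark_catalog")
    val database = cmd.defaultDatabase.getOrElse("default")
    val graphId = UUID.randomUUID().toString
    graphs(graphId) = (catalog, database)
    graphId
  }

  def main(args: Array[String]): Unit =
    println(createDataflowGraph(CreateDataflowGraph(None, Some("sales_db"))))
}
```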
### Define Dataset (Table or View) { #defineDataset }
```scala
defineDataset(
  dataset: proto.PipelineCommand.DefineDataset,
  sparkSession: SparkSession): Unit
```
`defineDataset` looks up the [GraphRegistrationContext](DataflowGraphRegistry.md#getDataflowGraphOrThrow) for the given `dataset` (or throws a `SparkException` if not found).

`defineDataset` branches off based on the `dataset` type:

| Dataset Type | Action |
|--------------|--------|
| `MATERIALIZED_VIEW` or `TABLE` | [Registers a table](GraphRegistrationContext.md#registerTable) |
| `TEMPORARY_VIEW` | [Registers a view](GraphRegistrationContext.md#registerView) |

For unknown types, `defineDataset` reports an `IllegalArgumentException`:
```text
Unknown dataset type: [type]
```
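
The type dispatch can be modelled with a small, self-contained sketch; `DatasetType` and the returned `registerTable`/`registerView` strings are stand-ins for the proto enum and the `GraphRegistrationContext` calls, not the actual Spark Connect API:

```scala
// Illustrative model only: tables and materialized views are registered as tables,
// temporary views as views, and anything else is rejected.
object DefineDatasetSketch {
  sealed trait DatasetType
  case object MATERIALIZED_VIEW extends DatasetType
  case object TABLE extends DatasetType
  case object TEMPORARY_VIEW extends DatasetType
  case object DATASET_TYPE_UNSPECIFIED extends DatasetType

  def defineDataset(name: String, datasetType: DatasetType): String = datasetType match {
    case MATERIALIZED_VIEW | TABLE => s"registerTable($name)"
    case TEMPORARY_VIEW            => s"registerView($name)"
    case other                     => throw new IllegalArgumentException(s"Unknown dataset type: $other")
  }

  def main(args: Array[String]): Unit = {
    println(defineDataset("sales", TABLE))           // registerTable(sales)
    println(defineDataset("staged", TEMPORARY_VIEW)) // registerView(staged)
  }
}
```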
### defineFlow { #defineFlow }
```scala
defineFlow(
  flow: proto.PipelineCommand.DefineFlow,
  transformRelationFunc: Relation => LogicalPlan,
  sparkSession: SparkSession): Unit
```
`defineFlow` looks up the [GraphRegistrationContext](DataflowGraphRegistry.md#getDataflowGraphOrThrow) for the given `flow` (or throws a `SparkException` if not found).

!!! note "Implicit Flows"
    An **implicit flow** is a flow with the name of the target dataset (i.e. one defined as part of dataset creation).

`defineFlow` [creates a flow identifier](GraphIdentifierManager.md#parseTableIdentifier) (for the `flow` name).

??? note "AnalysisException"
    `defineFlow` reports an `AnalysisException` if the given `flow` is not an implicit flow, but is defined with a multi-part identifier.

In the end, `defineFlow` [registers a flow](GraphRegistrationContext.md#registerFlow).
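
The identifier rule above can be modelled with a short, self-contained sketch; `DefineFlow`, the single-part check, and the error message are illustrative stand-ins for the proto message, `GraphIdentifierManager.parseTableIdentifier`, and the actual `AnalysisException`, not the Spark Connect API:

```scala
// Illustrative model only: a flow is implicit when its name equals the target dataset's
// name; only implicit flows may be defined with a multi-part identifier.
object DefineFlowSketch {
  final case class DefineFlow(flowName: String, targetDatasetName: String)

  def defineFlow(flow: DefineFlow): String = {
    val identifier = flow.flowName.split('.').toSeq // parseTableIdentifier stand-in
    val isImplicit = flow.flowName == flow.targetDatasetName
    if (!isImplicit && identifier.length > 1) {
      // Spark reports an AnalysisException here; a plain exception keeps the sketch self-contained.
      throw new IllegalArgumentException(s"Flow ${flow.flowName} is not a single-part identifier")
    }
    s"registerFlow(${flow.flowName} -> ${flow.targetDatasetName})"
  }

  def main(args: Array[String]): Unit = {
    println(defineFlow(DefineFlow("sales", "sales")))    // implicit flow
    println(defineFlow(DefineFlow("backfill", "sales"))) // explicit, single-part identifier
    // defineFlow(DefineFlow("db.backfill", "sales"))    // would throw: multi-part and not implicit
  }
}
```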