
Commit 2528bdc

kravets-levko and Jesse authored
Refactoring: PECO-219 followup, PECO-218 (get rid of HiveUtils and simplify API) (#47)
* Prepare DBSQLOperation for decomposition
* Extract checkIfOperationHasMoreRows and getResult
* Extract waitUntilReady
* Finally remove HiveUtils::WaitUntilReady; update tests; restore progress and callback support for DBSQLOperation::waitUntilReady
* Remove HiveUtils class; put all utility modules together
* Update examples
* Update docs
* Update changelog
* Apply suggestions from code review

Co-authored-by: Jesse <jesse.whitehouse@databricks.com>
Co-authored-by: Jesse <jesse.whitehouse@databricks.com>
1 parent dd2c4c0 commit 2528bdc

33 files changed, +416 -652 lines changed

CHANGELOG.md

Lines changed: 18 additions & 0 deletions
@@ -2,6 +2,24 @@

 ## 0.1.x (Unreleased)

+- `DBSQLOperation` interface simplified: `HiveUtils` were removed and replaced with new methods
+  `DBSQLOperation.fetchChunk`/`DBSQLOperation.fetchAll`. New API implements all necessary waiting
+  and data conversion routines internally
+- Better TypeScript support
+- Thrift definitions updated to support additional Databricks features
+- User-agent string updated; a part of user-agent string is configurable through `DBSQLClient`'s `clientId` option
+- Connection now uses keep-alive (not configurable at this moment)
+- `DBSQLClient` now prepends slash to path when needed
+- `DBSQLOperation`: default chunk size for data fetching increased from 100 to 100.000
+
+### Upgrading
+
+`DBSQLClient.utils` was permanently removed. Code which used `utils.waitUntilReady`, `utils.fetchAll`
+and `utils.getResult` to get data should now be replaced with the single `DBSQLOperation.fetchAll` method.
+Progress reporting, previously supported by `utils.waitUntilReady`, is now configurable via
+`DBSQLOperation.fetchChunk`/`DBSQLOperation.fetchAll` options. `DBSQLOperation.setMaxRows` also became
+an option of methods mentioned above.
+
 ## 0.1.8-beta.1 (2022-06-24)

 - Initial release

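The upgrade path described in the changelog above can be sketched roughly as follows. This is a minimal sketch, assuming only the method names that appear in this commit (`executeStatement`, `fetchAll`, `close`). `runQuery` and `makeFakeSession` are hypothetical helpers introduced for illustration; the fake session merely stands in for a real `DBSQLSession` so the snippet runs without a connection.

```javascript
// Hypothetical helper: run a statement and return all rows using the
// post-refactoring API, without DBSQLClient.utils.
async function runQuery(session, statement) {
  const operation = await session.executeStatement(statement, { runAsync: true });
  try {
    // Per the changelog, fetchAll performs the waiting and data conversion internally.
    return await operation.fetchAll();
  } finally {
    // Close the operation even if fetching throws.
    await operation.close();
  }
}

// Hypothetical in-memory stand-in for a session; NOT part of @databricks/sql.
function makeFakeSession(rows) {
  const state = { closed: false };
  return {
    state,
    async executeStatement() {
      return {
        async fetchAll() {
          return rows;
        },
        async close() {
          state.closed = true;
        },
      };
    },
  };
}
```

Wrapping the fetch in `try`/`finally` is a design choice of this sketch rather than something the commit prescribes; the examples in the diff simply call `close()` after `fetchAll()`.
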
README.md

Lines changed: 1 addition & 4 deletions
@@ -33,7 +33,6 @@ npm i @databricks/sql
 const { DBSQLClient } = require('@databricks/sql');

 const client = new DBSQLClient();
-const utils = DBSQLClient.utils;

 client
   .connect({
@@ -45,11 +44,9 @@ client
     const session = await client.openSession();

     const queryOperation = await session.executeStatement('SELECT "Hello, World!"', { runAsync: true });
-    await utils.waitUntilReady(queryOperation, false, () => {});
-    await utils.fetchAll(queryOperation);
+    const result = await queryOperation.fetchAll();
     await queryOperation.close();

-    const result = utils.getResult(queryOperation).getValue();
     console.table(result);

     await session.close();

docs/readme.md

Lines changed: 26 additions & 74 deletions
@@ -5,9 +5,8 @@
 1. [Foreword](#foreword)
 2. [Example](#example) \
    2.1. [Error handling](#error-handling)
-3. [HiveSession](#hivesession)
-4. [HiveOperation](#hiveoperation) \
-   4.1. [HiveUtils](#hiveutils)
+3. [DBSQLSession](#dbsqlsession)
+4. [DBSQLOperation](#dbsqloperation)
 5. [Status](#status)
 6. [Finalize](#finalize)

@@ -23,7 +22,6 @@ If you find any mistakes, misleading or some confusion feel free to create an is
 const { DBSQLClient } = require('@databricks/sql');

 const client = new DBSQLClient();
-const utils = DBSQLClient.utils;

 client
   .connect({
@@ -37,20 +35,17 @@ client
     const createTableOperation = await session.executeStatement(
       'CREATE TABLE IF NOT EXISTS pokes (foo INT, bar STRING)',
     );
-    await utils.waitUntilReady(createTableOperation, false, () => {});
+    await createTableOperation.fetchAll();
     await createTableOperation.close();

     const loadDataOperation = await session.executeStatement('INSERT INTO pokes VALUES(123, "Hello, world!"');
-    await utils.waitUntilReady(loadDataOperation, false, () => {});
+    await loadDataOperation.fetchAll();
     await loadDataOperation.close();

     const selectDataOperation = await session.executeStatement('SELECT * FROM pokes', { runAsync: true });
-    await utils.waitUntilReady(selectDataOperation, false, () => {});
-    await utils.fetchAll(selectDataOperation);
+    const result = await selectDataOperation.fetchAll(selectDataOperation);
     await selectDataOperation.close();

-    const result = utils.getResult(selectDataOperation).getValue();
-
     console.log(JSON.stringify(result, null, '\t'));

     await session.close();
@@ -71,9 +66,9 @@ client.on('error', (error) => {
 });
 ```

-## HiveSession
+## DBSQLSession

-After you connect to the server you should open session to start working with Hive server.
+After you connect to the server you should open session to start working with server.

 ```javascript
 ...
@@ -84,9 +79,9 @@ To open session you must provide [OpenSessionRequest](/lib/hive/Commands/OpenSes

 Into "configuration" you may set any of the configurations that required for the session of your Hive instance.

-After the session is opened you will have the [HiveSession](/lib/HiveSession.ts) instance.
+After the session is opened you will have the [DBSQLSession](/lib/DBSQLSession.ts) instance.

-Class [HiveSession](/lib/HiveSession.ts) is a facade for API that works with [SessionHandle](/lib/hive/Types/index.ts#L77).
+Class [DBSQLSession](/lib/DBSQLSession.ts) is a facade for API that works with [SessionHandle](/lib/hive/Types/index.ts#L77).

 The method you will use the most is `executeStatement`

@@ -100,81 +95,44 @@ const operation = await session.executeStatement(

 - "statement" is DDL/DML statement (CREATE TABLE, INSERT, UPDATE, SELECT, LOAD, etc.)

-- [options](/lib/contracts/IHiveSession.ts#L14)
+- [options](/lib/contracts/IDBSQLSession.ts#L14)

   - runAsync allows executing operation asynchronously.

   - confOverlay overrides session configuration properties.

   - timeout is the maximum time to execute an operation. It has Buffer type because timestamp in Hive has capacity 64. So for such value, you should use [node-int64](https://www.npmjs.com/package/node-int64) npm module.

-To know other methods see [IHiveSession](/lib/contracts/IHiveSession.ts) and [examples/session.js](/examples/session.js).
-
-## HiveOperation
+To know other methods see [IDBSQLSession](/lib/contracts/IDBSQLSession.ts) and [examples/session.js](/examples/session.js).

-In most cases, HiveSession methods return [HiveOperation](/lib/HiveOperation.ts), which helps you to retrieve requested data.
+## DBSQLOperation

-After you fetch the result, the operation will have [TableSchema](/lib/hive/Types/index.ts#L143) and data (Array<[RowSet](/lib/hive/Types/index.ts#L218)>).
+In most cases, DBSQLSession methods return [DBSQLOperation](/lib/DBSQLOperation.ts), which helps you to retrieve requested data.

-### HiveUtils
+After you fetch the result, the operation will have [TableSchema](/lib/hive/Types/index.ts#L143) and data.

-Operation is executed asynchronously, so before retrieving the result, you have to wait until it has finished state.
+Operation is executed asynchronously, but `fetchChunk`/`fetchAll` will wait until it has finished. You can
+get current status of operation any time using a dedicated method:

 ```javascript
 ...
 const response = await operation.status();
 const isReady = response.operationState === TCLIService_types.TOperationState.FINISHED_STATE;
 ```

-Also, the result is fetched by portions, the size of a portion you can set by method [setMaxRows()](/lib/HiveOperation.ts#L115).
+Also, the result is fetched by portions, the size of a portion you can pass as option to `fetchChunk`/`fetchAll`.

 ```javascript
 ...
-operation.setMaxRows(500);
-const status = await operation.fetch();
+const results = await operation.fetchChunk({ maxRows: 500 });
 ```

-After you fetch all data and you have schema and set of data, you can transfrom data in readable format.
+Schema becomes available after you start fetching data.

 ```javascript
 ...
+await operation.fetchChunk();
 const schema = operation.getSchema();
-const data = operation.getData();
-```
-
-To simplify this process, you may use [HiveUtils](/lib/utils/HiveUtils.ts).
-
-```typescript
-/**
- * Executes until operation has status finished or has one of the invalid states.
- *
- * @param operation operation to perform
- * @param progress flag for operation status command. If it sets true, response will include progressUpdateResponse with progress information
- * @param callback if callback specified it will be called each time the operation status response received and it will be passed as first parameter
- */
-waitUntilReady(
-    operation: IOperation,
-    progress?: boolean,
-    callback?: Function
-): Promise<IOperation>
-
-/**
- * Fetches data until operation hasMoreRows.
- *
- * @param operation
- */
-fetchAll(operation: IOperation): Promise<IOperation>
-
-/**
- * Transforms operation result
- *
- * @param operation operation to perform
- * @param resultHandler you may specify your own handler. If not specified the result is transformed to JSON
- */
-getResult(
-    operation: IOperation,
-    resultHandler?: IOperationResult
-): IOperationResult
 ```

 _NOTICE_
@@ -187,19 +145,13 @@ For more details see [IOperation](/lib/contracts/IOperation.ts).
 ### Example

 ```javascript
-const { DBSQLClient } = require('@databricks/sql');
-const utils = DBSQLClient.utils;
 ...
-await utils.waitUntilReady(
-    operation,
-    true,
-    (stateResponse) => {
-        console.log(stateResponse.taskStatus);
-    }
-);
-await utils.fetchAll(operation);
-
-const result = utils.getResult(operation).getValue();
+const result = await operation.fetchAll({
+  progress: true,
+  callback: (stateResponse) => {
+    console.log(stateResponse.taskStatus);
+  },
+});
 ```

 ## Status

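The updated docs above describe fetching results in portions via the `maxRows` option of `fetchChunk`. A chunk-by-chunk loop might look like the sketch below. It assumes the operation exposes a `hasMoreRows()` method; the old docs mention the concept, but the method itself is not shown in this diff, so treat it as an assumption. `fetchInChunks` and `makeFakeOperation` are hypothetical helpers, and the fake operation only imitates the shape of a `DBSQLOperation` so the code can run offline.

```javascript
// Hypothetical helper: process a result set chunk by chunk instead of
// loading everything with fetchAll. Returns the total number of rows seen.
async function fetchInChunks(operation, maxRows, onChunk) {
  let total = 0;
  do {
    // maxRows is the per-call option documented for fetchChunk/fetchAll.
    const chunk = await operation.fetchChunk({ maxRows });
    total += chunk.length;
    await onChunk(chunk);
    // ASSUMPTION: hasMoreRows() exists on the operation; `await` tolerates
    // both a synchronous and a promise-returning implementation.
  } while (await operation.hasMoreRows());
  return total;
}

// Hypothetical in-memory stand-in for an operation; NOT part of @databricks/sql.
function makeFakeOperation(rows) {
  let offset = 0;
  return {
    async fetchChunk({ maxRows }) {
      const chunk = rows.slice(offset, offset + maxRows);
      offset += chunk.length;
      return chunk;
    },
    hasMoreRows() {
      return offset < rows.length;
    },
  };
}
```

A loop like this may be preferable to `fetchAll` when the result set is too large to hold in memory at once, since each chunk can be handled and discarded before the next one is requested.
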
examples/cancel_operation.js

Lines changed: 4 additions & 8 deletions
@@ -7,14 +7,10 @@ const path = '/sql/1.0/endpoints/****';
 const token = 'dapi********************************';

 async function getQueryResult(operation) {
-  const utils = DBSQLClient.utils;
-
-  await utils.waitUntilReady(operation, false, () => {});
   console.log('Fetching data...');
-  await utils.fetchAll(operation);
+  const results = await operation.fetchAll();
   await operation.close();
-
-  return utils.getResult(operation).getValue();
+  return results;
 }

 async function cancelQuery(operation) {
@@ -41,9 +37,9 @@ client
     console.log('Running query...');
     const queryOperation = await session.executeStatement(
       `
-        SELECT id
+        SELECT *
         FROM RANGE(100000000)
-        ORDER BY RANDOM() + 2 asc
+        ORDER BY RANDOM() ASC
       `,
       { runAsync: true },
     );

examples/data_types.js

Lines changed: 15 additions & 10 deletions
@@ -4,7 +4,9 @@ const client = new DBSQLClient();

 const utils = DBSQLClient.utils;

-const [host, path, token] = process.argv.slice(2);
+const host = '****.databricks.com';
+const path = '/sql/1.0/endpoints/****';
+const token = 'dapi********************************';

 client.connect({ host, path, token }).then(async (client) => {
   try {
@@ -175,16 +177,19 @@ const testComplexTypes = async (session) => {
 const execute = async (session, statement) => {
   const operation = await session.executeStatement(statement, { runAsync: true });

-  await utils.waitUntilReady(operation, true, (stateResponse) => {
-    return;
-    if (stateResponse.taskStatus) {
-      console.log(stateResponse.taskStatus);
-    } else {
-      console.log(utils.formatProgress(stateResponse.progressUpdateResponse));
-    }
+  const result = await operation.fetchAll({
+    progress: true,
+    callback: (stateResponse) => {
+      return;
+      if (stateResponse.taskStatus) {
+        console.log(stateResponse.taskStatus);
+      } else {
+        console.log(utils.formatProgress(stateResponse.progressUpdateResponse));
+      }
+    },
   });
-  await utils.fetchAll(operation);
+
   await operation.close();

-  return utils.getResult(operation).getValue();
+  return result;
 };

examples/repl

Lines changed: 2 additions & 6 deletions
@@ -20,14 +20,10 @@ async function initClient({ host, endpointId, token }) {
 }

 async function runQuery(session, query) {
-  const utils = DBSQLClient.utils;
-
   const queryOperation = await session.executeStatement(query, { runAsync: true });
-  await utils.waitUntilReady(queryOperation, false, () => {});
-  await utils.fetchAll(queryOperation);
+  const result = await queryOperation.fetchAll();
   await queryOperation.close();
-
-  return utils.getResult(queryOperation).getValue();
+  return result;
 }

 const format = {

examples/session.js

Lines changed: 9 additions & 5 deletions
@@ -4,7 +4,9 @@ const client = new DBSQLClient();

 const utils = DBSQLClient.utils;

-const [host, path, token] = process.argv.slice(2);
+const host = '****.databricks.com';
+const path = '/sql/1.0/endpoints/****';
+const token = 'dapi********************************';

 client
   .connect({ host, path, token })
@@ -44,12 +46,14 @@ client
   });

 async function handleOperation(operation) {
-  await utils.waitUntilReady(operation, true, (stateResponse) => {
-    console.log(stateResponse.taskStatus);
+  const result = await operation.fetchAll({
+    progress: true,
+    callback: (stateResponse) => {
+      console.log(stateResponse.taskStatus);
+    },
   });
-  await utils.fetchAll(operation);
   await operation.close();
-  return utils.getResult(operation).getValue();
+  return result;
 }

 const createTables = async (session) => {

examples/usage.js

Lines changed: 3 additions & 1 deletion
@@ -2,7 +2,9 @@ const { DBSQLClient, thrift } = require('../');

 const client = new DBSQLClient();

-const [host, path, token] = process.argv.slice(2);
+const host = '****.databricks.com';
+const path = '/sql/1.0/endpoints/****';
+const token = 'dapi********************************';

 client
   .connect({ host, path, token })

lib/DBSQLClient.ts

Lines changed: 0 additions & 3 deletions
@@ -17,11 +17,8 @@ import StatusFactory from './factory/StatusFactory';
 import HiveDriverError from './errors/HiveDriverError';
 import { buildUserAgentString, definedOrError } from './utils';
 import PlainHttpAuthentication from './connection/auth/PlainHttpAuthentication';
-import HiveUtils from './utils/HiveUtils';

 export default class DBSQLClient extends EventEmitter implements IDBSQLClient {
-  static utils = new HiveUtils();
-
   private client: TCLIService.Client | null;
   private connection: IThriftConnection | null;
   private statusFactory: StatusFactory;
