Gluestick TypeScript

A TypeScript library for data processing and ETL operations on the hotglue iPaaS platform, built on Polars for high-performance data manipulation. It supports several export formats: CSV, JSON, JSONL, Parquet, and the Singer specification.

Installation

npm install @hotglue/gluestick-ts

Quick Start

import * as gs from '@hotglue/gluestick-ts';

// Create a Reader to access your data
const reader = new gs.Reader();

// Get available data streams
const streams = reader.keys();
console.log('Available streams:', streams);

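// Read a specific stream; get() may return null, so check before using the result
const dataFrame = reader.get('your_stream_name', { catalogTypes: true });
if (dataFrame === null) {
  throw new Error('Stream "your_stream_name" could not be read');
}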

// Export processed data (defaults to singer)
gs.toExport(dataFrame, 'output_name', './etl-output');

// Export as CSV
gs.toExport(dataFrame, 'output_name', './etl-output', { exportFormat: 'csv' });

Core Components

Reader Class

The Reader class is your main interface for accessing data streams:

const reader = new gs.Reader(inputDir?, rootDir?);

Methods:

get(stream, options) - Read a specific stream as a Polars DataFrame
keys() - Get all available stream names
getPk(stream) - Get primary keys for a stream from catalog

Options:

catalogTypes: boolean - Use catalog for automatic type inference
Any other options are passed through to Polars when reading. See the Polars ReadCSV and ReadParquet options for details.

Export Functions

Export your processed data in multiple formats:

gs.toExport(dataFrame, outputName, outputDir, options?);

Supported formats:

Singer (default) - Singer specification format for data integration
CSV - Comma-separated values
JSON - Single JSON array
JSONL - Newline-delimited JSON
Parquet - Columnar storage format
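The difference between the JSON and JSONL formats listed above can be shown without the library. The sketch below uses plain objects in place of a DataFrame, and the helpers toJsonArray and toJsonLines are hypothetical names, not part of gluestick-ts. Details such as a trailing newline in the library's JSONL output are not stated here and are left out.

```typescript
type Row = Record<string, unknown>;

// JSON: the whole dataset serialized as a single array.
function toJsonArray(rows: Row[]): string {
  return JSON.stringify(rows);
}

// JSONL: one JSON object per line, separated by newlines.
function toJsonLines(rows: Row[]): string {
  return rows.map((row) => JSON.stringify(row)).join('\n');
}

const rows: Row[] = [
  { id: 1, name: 'Ada' },
  { id: 2, name: 'Grace' },
];

const asJson = toJsonArray(rows);
const asJsonl = toJsonLines(rows);

console.log(asJson);
console.log(asJsonl);
```

JSONL can be read line by line, which suits large outputs, while a single JSON array has to be parsed as a whole.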

Development

Build the project:

npm run build

Run examples:

# Run CSV processing example
npm run run:example:csv

# Run Parquet processing example
npm run run:example:parquet

API Reference

Reader Constructor

new Reader(inputDir?: string, rootDir?: string)

inputDir - Custom input directory (default: ${rootDir}/sync-output)
rootDir - Root directory (default: process.env.ROOT_DIR || '.')
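The two defaults above depend on each other, so it may help to see them resolved in order. This is a minimal sketch of the documented defaults, not the library's source: resolveReaderDirs is a hypothetical helper, and the environment is passed in as a plain object (in Node you would pass process.env) so the example has no dependencies.

```typescript
type Env = Record<string, string | undefined>;

// Hypothetical helper mirroring the documented defaults; not part of gluestick-ts.
function resolveReaderDirs(
  env: Env,
  inputDir?: string,
  rootDir?: string
): { inputDir: string; rootDir: string } {
  // rootDir falls back to ROOT_DIR from the environment, then to '.'.
  const root = rootDir ?? (env.ROOT_DIR || '.');
  // inputDir falls back to the sync-output folder under the resolved root.
  const input = inputDir ?? `${root}/sync-output`;
  return { inputDir: input, rootDir: root };
}

const noEnv = resolveReaderDirs({});
const withEnv = resolveReaderDirs({ ROOT_DIR: '/home/hotglue' });
const explicit = resolveReaderDirs({ ROOT_DIR: '/home/hotglue' }, './my-input');

console.log(noEnv.inputDir);    // ./sync-output
console.log(withEnv.inputDir);  // /home/hotglue/sync-output
console.log(explicit.inputDir); // ./my-input
```

How the real constructor treats an empty string argument is not documented here, so that case is an assumption of the sketch.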

Reader Methods

`get(stream: string, options?: ReadOptions): DataFrame | null`

Read a data stream as a Polars DataFrame.

const df = reader.get('users', { catalogTypes: true });

Options:

catalogTypes: boolean - Use catalog for automatic type inference
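Because get() can return null, callers usually need a guard before working with the result. One possible pattern is a small wrapper that throws a descriptive error instead. The StreamSource interface and requireStream helper below are hypothetical and cover only the two documented methods they use; a real Reader is assumed to fit this shape, and a stand-in object is used here so the sketch runs on its own.

```typescript
// Minimal structural type covering only the documented methods used below.
interface StreamSource<T> {
  keys(): string[];
  get(stream: string, options?: { catalogTypes?: boolean }): T | null;
}

// Hypothetical helper: return the stream's data or fail with a useful message.
function requireStream<T>(source: StreamSource<T>, stream: string): T {
  const data = source.get(stream, { catalogTypes: true });
  if (data === null) {
    const available = source.keys().join(', ') || '(none)';
    throw new Error(`Stream "${stream}" is not available; found: ${available}`);
  }
  return data;
}

// Stand-in used only to exercise the helper; a gs.Reader would be passed instead.
const fakeReader: StreamSource<string[]> = {
  keys: () => ['users', 'orders'],
  get: (stream) => (stream === 'users' ? ['row-1', 'row-2'] : null),
};

const users = requireStream(fakeReader, 'users');
console.log(users.length);
```

Listing the available stream names in the error makes a misspelled stream name easier to spot.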

`keys(): string[]`

Get all available stream names.

const streams = reader.keys();
// Returns: ['users', 'orders', 'products']

`getPk(stream: string): string[] | null`

Get primary keys for a stream from the catalog.

const primaryKeys = reader.getPk('users');
// Returns: ['id']
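This README does not describe the catalog layout. Assuming a Singer-style catalog, where primary keys sit under the table-key-properties entry of a stream's top-level metadata, a lookup might resemble the sketch below. The function name and types are hypothetical, and the library's actual logic may differ.

```typescript
interface CatalogStream {
  stream: string;
  metadata?: { breadcrumb: string[]; metadata: Record<string, unknown> }[];
}

interface Catalog {
  streams: CatalogStream[];
}

// Hypothetical lookup over an assumed Singer-style catalog; not the library's code.
function primaryKeysFromCatalog(catalog: Catalog, stream: string): string[] | null {
  const entry = catalog.streams.find((s) => s.stream === stream);
  if (!entry) return null;
  // Stream-level metadata is the entry with an empty breadcrumb.
  const root = entry.metadata?.find((m) => m.breadcrumb.length === 0);
  const keys = root?.metadata['table-key-properties'];
  return Array.isArray(keys) ? keys.map(String) : null;
}

const catalog: Catalog = {
  streams: [
    {
      stream: 'users',
      metadata: [{ breadcrumb: [], metadata: { 'table-key-properties': ['id'] } }],
    },
    { stream: 'orders' },
  ],
};

const userKeys = primaryKeysFromCatalog(catalog, 'users');
const orderKeys = primaryKeysFromCatalog(catalog, 'orders');
console.log(userKeys, orderKeys);
```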

Export Function

toExport(
  dataFrame: DataFrame,
  outputName: string,
  outputDir: string,
  options?: ExportOptions
): void

Parameters:

dataFrame - Polars DataFrame to export
outputName - Name for the output file (without extension)
outputDir - Directory to write the output file
options - Export configuration options

Export Options:

interface ExportOptions {
  exportFormat?: 'csv' | 'json' | 'jsonl' | 'parquet' | 'singer';
  outputFilePrefix?: string;
  keys?: string[];  // Primary keys for the data
  stringifyObjects?: boolean;
  reservedVariables?: Record<string, string>;
  allowObjects?: boolean;  // For Singer format
  schema?: SingerHeaderMap;  // For Singer format
}

Examples:

// Export as CSV with prefix
gs.toExport(dataFrame, 'processed_users', './output', {
  exportFormat: 'csv',
  outputFilePrefix: 'tenant_123_',
  keys: ['user_id']
});

// Export as Singer format
gs.toExport(dataFrame, 'processed_users', './output', {
  exportFormat: 'singer',
  allowObjects: true,
  keys: ['user_id']
});

Singer Format Support

Export data in Singer specification format for data integration pipelines:

// Basic Singer export
gs.toExport(dataFrame, 'users', './output', {
  exportFormat: 'singer',
  keys: ['id']
});

// Singer export with object support
gs.toExport(dataFrame, 'users', './output', {
  exportFormat: 'singer',
  allowObjects: true,
  keys: ['id']
});

The Singer export automatically generates SCHEMA, RECORD, and STATE messages according to the Singer specification.
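To make the three message types concrete, here is a rough sketch of what a Singer message stream looks like, based on the Singer specification rather than on the library's implementation. The helper toSingerLines is hypothetical, the schema is inferred from the first row only, and the STATE value is an empty placeholder because the README does not say what the library writes there.

```typescript
type Row = Record<string, unknown>;

// Map a JavaScript value to a JSON Schema type name (deliberately coarse).
function jsonSchemaType(value: unknown): string {
  if (typeof value === 'number') return Number.isInteger(value) ? 'integer' : 'number';
  if (typeof value === 'boolean') return 'boolean';
  if (Array.isArray(value)) return 'array';
  if (typeof value === 'object' && value !== null) return 'object';
  return 'string';
}

// Hypothetical helper: build one serialized Singer message per line.
function toSingerLines(stream: string, rows: Row[], keys: string[]): string[] {
  // Assumption: infer the schema from the first row only.
  const properties: Record<string, { type: string[] }> = {};
  for (const [name, value] of Object.entries(rows[0] ?? {})) {
    properties[name] = { type: ['null', jsonSchemaType(value)] };
  }
  const schema = {
    type: 'SCHEMA',
    stream,
    schema: { type: 'object', properties },
    key_properties: keys,
  };
  const records = rows.map((record) => ({ type: 'RECORD', stream, record }));
  // Placeholder state value; the real content is not documented in this README.
  const state = { type: 'STATE', value: {} };
  return [schema, ...records, state].map((message) => JSON.stringify(message));
}

const lines = toSingerLines(
  'users',
  [
    { id: 1, name: 'Ada' },
    { id: 2, name: 'Grace' },
  ],
  ['id']
);
console.log(lines.join('\n'));
```

The keys option maps naturally onto key_properties in the SCHEMA message, which is how downstream Singer targets identify a record for upserts.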
