| 1 | +# @deck.gl-community/arrow-layers |
| 2 | + |
| 3 | +The easiest, most efficient way to render large geospatial datasets in [deck.gl](https://deck.gl), via [GeoArrow](https://geoarrow.org). |
| 4 | + |
| 5 | +This is a _glue library_ for deck.gl. It generates the same layer objects as upstream deck.gl, but uses a [low-level binary interface](https://deck.gl/docs/developer-guide/performance#supply-attributes-directly) for best performance. Using the binary interface directly is error-prone. The layer classes exposed by `@deck.gl-community/arrow-layers` instead validate user input and keep the process simple, then pass buffers to deck.gl's binary interface under the hood. |
| 6 | + |
| 7 | + |
| 8 | + |
| 9 | +<p style="text-align:center">3.2 million points rendered with a <code>GeoArrowScatterplotLayer</code>.</p> |
| 10 | + |
| 11 | +## Features |
| 12 | + |
| 13 | +- **Fast**: copies binary buffers directly from an [Arrow JS](https://www.npmjs.com/package/apache-arrow) [`Table`](https://arrow.apache.org/docs/js/classes/Arrow_dom.Table.html) object to the GPU using [deck.gl's binary data interface](https://deck.gl/docs/developer-guide/performance#supply-attributes-directly). |
| 14 | +- **Memory-efficient**: no intermediate data representation and no garbage-collector overhead. |
| 15 | +- **Full layer customization**: Use the same layer properties as in the upstream deck.gl layer documentation. Any _accessor_ (layer property prefixed with `get*`) can be passed an Arrow [`Vector`](https://arrow.apache.org/docs/js/classes/Arrow_dom.Vector.html). |
| 16 | +- **Input validation**. Validation can be turned off via the `_validate` property on most layer types. |
| 17 | +- **Multi-threaded polygon triangulation**. When rendering polygon layers, a process called [polygon triangulation](https://en.wikipedia.org/wiki/Polygon_triangulation) must happen on the CPU before data can be copied to the GPU. Ordinarily, this can block the main thread for several seconds, but the `GeoArrowSolidPolygonLayer` will perform this process off the main thread, on up to 8 web workers. |
| 18 | +- **Progressive rendering support**. For streaming-capable data formats like Arrow IPC and Parquet, you can render a GeoArrow layer per chunk as the data loads. |
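The progressive rendering idea from the list above can be sketched without committing to a specific loader. This is a minimal illustration under stated assumptions, not the library's API: `ChunkedLayerSet`, `ChunkLayerProps`, and `makeLayer` are hypothetical names, and in practice `makeLayer` would wrap a constructor such as `new GeoArrowScatterplotLayer({ id, data, ... })`. Each arriving chunk gets its own layer with a stable `id`, so deck.gl can match earlier chunks' layers by `id` while later chunks are still loading.

```typescript
// Hypothetical helper: keeps one layer per loaded chunk.
// `Chunk` stands in for an Arrow table or record batch, `Layer` for a deck.gl layer.
interface ChunkLayerProps<Chunk> {
  id: string;
  data: Chunk;
}

class ChunkedLayerSet<Chunk, Layer> {
  private readonly layers: Layer[] = [];
  private readonly idPrefix: string;
  private readonly makeLayer: (props: ChunkLayerProps<Chunk>) => Layer;

  constructor(
    idPrefix: string,
    makeLayer: (props: ChunkLayerProps<Chunk>) => Layer,
  ) {
    this.idPrefix = idPrefix;
    this.makeLayer = makeLayer;
  }

  // Call once per chunk as it arrives; returns the full list of layers so far.
  add(chunk: Chunk): Layer[] {
    // The id depends only on the chunk's position, so it stays stable across updates.
    const id = `${this.idPrefix}-${this.layers.length}`;
    this.layers.push(this.makeLayer({ id, data: chunk }));
    return [...this.layers];
  }
}
```

After each call to `add`, the returned array would be handed to deck.gl as the new `layers` value.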
| 19 | + |
| 20 | +## Examples |
| 21 | + |
| 22 | +Standalone examples exist in the [`examples/`](examples/) directory. Create an issue if you have trouble running them. |
| 23 | + |
| 24 | +More hosted examples on Observable are planned. |
| 25 | + |
| 26 | +## Providing accessors |
| 27 | + |
| 28 | +All deck.gl layers have two types of properties: ["Render Options"](https://deck.gl/docs/api-reference/layers/scatterplot-layer#render-options) — constant properties across a layer — and "Data Accessors" — properties that can vary across rows. An accessor is any property prefixed with `get`, like `GeoArrowScatterplotLayer`'s `getFillColor`. |
| 29 | + |
| 30 | +With `@deck.gl-community/arrow-layers` specifically, there are two ways to pass these data accessors, either as pre-computed columns or with function callbacks on Arrow data. |
| 31 | + |
| 32 | +### Pre-computed Arrow columns |
| 33 | + |
| 34 | +If you have an Arrow column ([`Vector`](https://arrow.apache.org/docs/js/classes/Arrow_dom.Vector.html) in Arrow JS terminology), you can pass that directly into a layer: |
| 35 | + |
| 36 | +```ts |
| 37 | +import { Table } from "apache-arrow"; |
| 38 | +import { GeoArrowScatterplotLayer } from "@deck.gl-community/arrow-layers"; |
| 39 | + |
| 40 | +const table = new Table(...); |
| 41 | +const deckLayer = new GeoArrowScatterplotLayer({ |
| 42 | + id: "scatterplot", |
| 43 | + data: table, |
| 44 | + /// Geometry column |
| 45 | + getPosition: table.getChild("geometry")!, |
| 46 | + /// Column of type FixedSizeList[3] or FixedSizeList[4], with child type Uint8 |
| 47 | + getFillColor: table.getChild("colors")!, |
| 48 | +}); |
| 49 | +``` |
| 50 | + |
| 51 | +For example, [lonboard](https://github.com/developmentseed/lonboard) computes Arrow columns on the Python side for all attributes, so that end users have the full capabilities of Python available. Those columns are then serialized to JavaScript, and the resulting `arrow.Vector` is passed into the relevant layer. |
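As a rough illustration of the layout such a color column carries: a `FixedSizeList[4]` of `Uint8` stores its values in one contiguous buffer, four bytes per row. The sketch below packs RGBA tuples into that flat layout using only typed arrays. `packColors` is a hypothetical helper, not part of this library or Arrow JS, and wrapping the resulting buffer in an Arrow `Vector` is deliberately left out.

```typescript
// One RGBA color per row; each channel is expected to be an integer in 0..255.
type RGBA = [number, number, number, number];

// Hypothetical helper: flattens per-row colors into the contiguous byte layout
// that backs a FixedSizeList[4] column with a Uint8 child.
function packColors(colors: RGBA[]): Uint8Array {
  const out = new Uint8Array(colors.length * 4);
  colors.forEach((color, row) => {
    // Row `row` occupies bytes [row * 4, row * 4 + 4).
    out.set(color, row * 4);
  });
  return out;
}
```

For a `FixedSizeList[3]` column (RGB without alpha), the same idea applies with a stride of 3 instead of 4.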
| 52 | + |
| 53 | +### Function accessors |
| 54 | + |
| 55 | +GeoArrow layers also accept a callback that takes an object with `index` and `data`. The record batch (an `arrow.RecordBatch`, a vertical section of the input `Table`) is available as `data.data`, as the example below shows, and `index` is the positional index of the current row within that batch. In TypeScript, the callback arguments are type-checked. |
| 56 | + |
| 57 | +```ts |
| 58 | +const deckLayer = new GeoArrowPathLayer({ |
| 59 | + id: "geoarrow-path", |
| 60 | + data: table, |
| 61 | + getColor: ({ index, data, target }) => { |
| 62 | + const recordBatch = data.data; |
| 63 | + const row = recordBatch.get(index)!; |
| 64 | + return COLORS_LOOKUP[row["scalerank"]]; |
| 65 | + }, |
| 66 | +}); |
| 67 | +``` |
| 68 | + |
| 69 | +The full example is in [`examples/multilinestring/app.tsx`](examples/multilinestring/app.tsx). |
| 70 | + |
| 71 | +You can also assign to the `target` array to reduce garbage-collector overhead, as described in the [deck.gl performance guide](https://deck.gl/docs/developer-guide/performance#supply-binary-blobs-to-the-data-prop). |
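A minimal sketch of what writing into `target` might look like, assuming the accessor receives `{ index, data, target }` as in the example above. The `AccessorContext` type here is a simplified stand-in for illustration only (the real types come from the library, and the record batch is reduced to a plain `scalerank` array), and the `COLORS_LOOKUP` values are made up.

```typescript
type RGBA = [number, number, number, number];

// Simplified stand-in for the accessor argument; not the library's actual type.
interface AccessorContext {
  index: number;
  data: { data: { scalerank: number[] } };
  target: number[];
}

// Hypothetical lookup table and fallback color.
const COLORS_LOOKUP: Record<number, RGBA> = {
  0: [255, 0, 0, 255],
  1: [0, 255, 0, 255],
  2: [0, 0, 255, 255],
};
const FALLBACK_COLOR: RGBA = [128, 128, 128, 255];

function getColor({ index, data, target }: AccessorContext): number[] {
  const rank = data.data.scalerank[index] ?? -1;
  const color = COLORS_LOOKUP[rank] ?? FALLBACK_COLOR;
  // Write into the provided array instead of allocating a new one per row.
  target[0] = color[0];
  target[1] = color[1];
  target[2] = color[2];
  target[3] = color[3];
  return target;
}
```

Because the same `target` array is returned on every call, no per-row array is allocated for the garbage collector to clean up.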
| 72 | + |
| 73 | +## Data Loading |
| 74 | + |
| 75 | +To create deck.gl layers using this library, you first need to get GeoArrow-formatted data into the browser, as discussed below. |
| 76 | + |
| 77 | +[OGR/GDAL](https://gdal.org/) is useful for converting among data formats on the backend, and it includes both [GeoArrow](https://gdal.org/drivers/vector/arrow.html#vector-arrow) and [GeoParquet](https://gdal.org/drivers/vector/parquet.html) drivers. Pass `-lco GEOMETRY_ENCODING=GEOARROW` when converting to Arrow or Parquet files in order to store geometries in a GeoArrow-native geometry column. |
| 78 | + |
| 79 | +### Arrow IPC |
| 80 | + |
| 81 | +If you already have Arrow IPC files (also called Feather files) with a GeoArrow geometry column, you can use [`apache-arrow`](https://www.npmjs.com/package/apache-arrow) to load those files. |
| 82 | + |
| 83 | +```ts |
| 84 | +import { tableFromIPC } from "apache-arrow"; |
| 85 | +import { GeoArrowScatterplotLayer } from "@deck.gl-community/arrow-layers"; |
| 86 | +
|
| 87 | +const resp = await fetch("url/to/file.arrow"); |
| 88 | +const jsTable = await tableFromIPC(resp); |
| 89 | +const deckLayer = new GeoArrowScatterplotLayer({ |
| 90 | + id: "scatterplot", |
| 91 | + data: jsTable, |
| 92 | + /// Replace with the correct geometry column name |
| 93 | + getPosition: jsTable.getChild("geometry")!, |
| 94 | +}); |
| 95 | +``` |
| 96 | + |
| 97 | +Note that these IPC files must be saved **uncompressed** (or at least without internal compression). As of v14, Arrow JS does not support loading IPC files with internal compression. |
| 98 | + |
| 99 | +### Parquet |
| 100 | + |
| 101 | +If you have a Parquet file where the geometry column is stored as _GeoArrow_ encoding (i.e. not as a binary column with WKB-encoded geometries), you can use the stable `parquet-wasm` library to load those files. |
| 102 | + |
| 103 | +```ts |
| 104 | +import { readParquet } from "parquet-wasm"; |
| 105 | +import { tableFromIPC } from "apache-arrow"; |
| 106 | +import { GeoArrowScatterplotLayer } from "@deck.gl-community/arrow-layers"; |
| 107 | + |
| 108 | +const resp = await fetch("url/to/file.parquet"); |
| 109 | +const arrayBuffer = await resp.arrayBuffer(); |
| 110 | +const wasmTable = readParquet(new Uint8Array(arrayBuffer)); |
| 111 | +const jsTable = tableFromIPC(wasmTable.intoIPCStream()); |
| 112 | +const deckLayer = new GeoArrowScatterplotLayer({ |
| 113 | + id: "scatterplot", |
| 114 | + data: jsTable, |
| 115 | + /// Replace with the correct geometry column name |
| 116 | + getPosition: jsTable.getChild("geometry")!, |
| 117 | +}); |
| 118 | +``` |
| 119 | + |
| 120 | +See below for instructions to load GeoParquet 1.0 files, which have WKB-encoded geometries that need to be decoded before they can be used with `@deck.gl-community/arrow-layers`. |
| 121 | + |
| 122 | +### GeoParquet |
| 123 | + |
| 124 | +An initial version of the [`@geoarrow/geoparquet-wasm`](https://www.npmjs.com/package/@geoarrow/geoparquet-wasm) library is published, which reads a GeoParquet file to GeoArrow memory. |
| 125 | + |
| 126 | +```ts |
| 127 | +import { readGeoParquet } from "@geoarrow/geoparquet-wasm"; |
| 128 | +import { tableFromIPC } from "apache-arrow"; |
| 129 | +import { GeoArrowScatterplotLayer } from "@deck.gl-community/arrow-layers"; |
| 130 | + |
| 131 | +const resp = await fetch("url/to/file.parquet"); |
| 132 | +const arrayBuffer = await resp.arrayBuffer(); |
| 133 | +const wasmTable = readGeoParquet(new Uint8Array(arrayBuffer)); |
| 134 | +const jsTable = tableFromIPC(wasmTable.intoTable().intoIPCStream()); |
| 135 | +const deckLayer = new GeoArrowScatterplotLayer({ |
| 136 | + id: "scatterplot", |
| 137 | + data: jsTable, |
| 138 | + /// Replace with the correct geometry column name |
| 139 | + getPosition: jsTable.getChild("geometry")!, |
| 140 | +}); |
| 141 | +``` |
| 142 | + |
| 143 | +If you hit a bug with `@geoarrow/geoparquet-wasm`, please create a reproducible bug report [here](https://github.com/geoarrow/geoarrow-rs/issues/new). |
| 144 | + |
| 145 | +### FlatGeobuf |
| 146 | + |
| 147 | +An initial version of the [`@geoarrow/flatgeobuf-wasm`](https://www.npmjs.com/package/@geoarrow/flatgeobuf-wasm) library is published, which reads a FlatGeobuf file to GeoArrow memory. As of version 0.2.0-beta.1, this library does not yet support remote files, and expects the full FlatGeobuf file to exist in memory. |
| 148 | + |
| 149 | +```ts |
| 150 | +import { readFlatGeobuf } from "@geoarrow/flatgeobuf-wasm"; |
| 151 | +import { tableFromIPC } from "apache-arrow"; |
| 152 | +import { GeoArrowScatterplotLayer } from "@deck.gl-community/arrow-layers"; |
| 153 | + |
| 154 | +const resp = await fetch("url/to/file.fgb"); |
| 155 | +const arrayBuffer = await resp.arrayBuffer(); |
| 156 | +const wasmTable = readFlatGeobuf(new Uint8Array(arrayBuffer)); |
| 157 | +const jsTable = tableFromIPC(wasmTable.intoTable().intoIPCStream()); |
| 158 | +const deckLayer = new GeoArrowScatterplotLayer({ |
| 159 | + id: "scatterplot", |
| 160 | + data: jsTable, |
| 161 | + /// Replace with the correct geometry column name |
| 162 | + getPosition: jsTable.getChild("geometry")!, |
| 163 | +}); |
| 164 | +``` |
| 165 | + |
| 166 | +If you hit a bug with `@geoarrow/flatgeobuf-wasm`, please create a reproducible bug report [here](https://github.com/geoarrow/geoarrow-rs/issues/new). |