diff --git a/api-reference/v2/general/errors.mdx b/api-reference/v2/general/errors.mdx
index 65638b8..00cc789 100644
--- a/api-reference/v2/general/errors.mdx
+++ b/api-reference/v2/general/errors.mdx
@@ -16,7 +16,7 @@ All error responses will follow the following format with a top-level `error` ob
 }
 ```
 
-## Invalid Auth Token
+### Invalid Auth Token
 
 Using an auth token that does not exist or is incorrect will result in a `404` response status.
 
@@ -33,4 +33,23 @@ curl --request GET \
 "message": "API key not found, or duplicate IN****ID"
 }
 }
-```
\ No newline at end of file
+```
+
+### Invalid Stash ID or Serial
+
+Using a stash ID or serial that does not [meet the requirements](../stashing/introduction#stash-ids-and-serials) will result in a `400` response status.
+
+```bash
+curl --request PUT \
+ --url https://api.glideapps.com/stashes/-INVALID-/1 \
+ --header 'Authorization: Bearer VALID-API-KEY'
+```
+
+```json
+{
+ "error": {
+ "type": "request_validation_error",
+ "message": "Invalid request params: Stash ID must be 256 characters max, alphanumeric with dashes and underscores, no leading dash or underscore"
+ }
+}
+ ```
\ No newline at end of file
diff --git a/api-reference/v2/resources/changelog.mdx b/api-reference/v2/resources/changelog.mdx
index eedbba9..f0a0086 100644
--- a/api-reference/v2/resources/changelog.mdx
+++ b/api-reference/v2/resources/changelog.mdx
@@ -1,7 +1,14 @@
+---
 title: Glide API Changelog
 sidebarTitle: Changelog
 ---
 
+### August 28, 2024
+
+- Users should now use the PUT method instead of POST for `/stashes/{stashID}/{serial}` to set the content of a chunk in a stash.
+- Clarified that if a chunk is uploaded with an existing serial, its data will be overwritten.
+- Documented the new format requirements for `stashID` and `serial`.
+
 ### August 23, 2024
 
 - The `POST /tables` endpoint now returns HTTP status 201 on success instead of 200.
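Reviewer sketch, not part of the patch: the `400` response that `errors.mdx` now documents could be anticipated client-side by checking identifiers before a request is sent. The snippet below is a minimal Python illustration. It assumes that the `pattern` value this diff adds to `openapi/swagger.json` is exactly the rule the server enforces; the names `ID_PATTERN` and `is_valid_identifier` are hypothetical and do not come from the Glide API.

```python
import re

# Assumption: this is the same pattern the diff adds to openapi/swagger.json
# for stash IDs and serials (1 leading alphanumeric + up to 255 more characters).
ID_PATTERN = re.compile(r"^[a-zA-Z0-9][a-zA-Z0-9_-]{0,255}$")


def is_valid_identifier(value: str) -> bool:
    """Return True if `value` matches the documented stash ID / serial format."""
    # fullmatch also rejects a trailing newline, which `$` alone would allow.
    return ID_PATTERN.fullmatch(value) is not None


if __name__ == "__main__":
    for candidate in ["20240215-job32", "-INVALID-", "1"]:
        print(candidate, is_valid_identifier(candidate))
```

Under that assumption, the `-INVALID-` stash ID from the `curl` example is rejected because of its leading hyphen, while `20240215-job32` and the serial `1` are accepted.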
diff --git a/api-reference/v2/stashing/delete-stash.mdx b/api-reference/v2/stashing/delete-stash.mdx
index 9202fae..2867d68 100644
--- a/api-reference/v2/stashing/delete-stash.mdx
+++ b/api-reference/v2/stashing/delete-stash.mdx
@@ -3,7 +3,9 @@ title: Delete Stash
 openapi: delete /stashes/{stashID}
 ---
 
-If you no longer need a stash, you can delete it. This will remove the stash and all the data it contains. Stashes are automatically deleted within 48 hours of creation.
+If you no longer need a stash, you can delete it. This will remove the stash and all the data it contains.
+
+Even if you do not call this endpoint, all stashes are automatically deleted within 48 hours after they are created.
 
 To understand what stashing is and how to use it to work with large datasets, please see our [introduction to stashing](/api-reference/v2/stashing/introduction).
diff --git a/api-reference/v2/stashing/introduction.mdx b/api-reference/v2/stashing/introduction.mdx
index 0ee922b..ffecf0b 100644
--- a/api-reference/v2/stashing/introduction.mdx
+++ b/api-reference/v2/stashing/introduction.mdx
@@ -3,11 +3,11 @@ title: Introduction
 description: Stashing large datasets for use with the Glide API
 ---
 
-When working with large datasets it is necessary to break it into smaller chunks for performance and reliability. We call this process "stashing".
+When using large datasets with the Glide API, it may be necessary to break them into smaller chunks for performance and reliability. We call this process "stashing."
 
 ## What is Stashing?
 
-Stashing is the process by which a large dataset is broken into smaller subsets for uploading to Glide. Each subset is uploaded to Glide independently (either sequentially or in parallel) to form the complete dataset.
+Stashing is the process by which a large dataset is broken into smaller chunks for uploading to Glide. Each chunk is uploaded to Glide independently (either sequentially or in parallel) to form the complete dataset.
 
 Once all data has been uploaded to the stash, the stash can then be referenced in other API calls to refer to the full dataset. This eliminates the need to include the entire dataset in the request itself, which may not be feasible due to its size.
@@ -15,22 +15,22 @@ Once all data has been uploaded to the stash, the stash can then be referenced i
 You should use stashing when:
 
-* You have a large dataset that you want to upload to Glide. Anything larger than 5mb should be broken up into smaller subsets and stashed.
+* You have a large dataset that you want to upload to Glide. Anything larger than 5 MB should be broken up into smaller chunks and stashed.
 * You want to perform an atomic operation using a large dataset. For example, you may want to perform an import of data into an existing table but don't want users to see the intermediate state of the import or incremental updates while they're using their application.
 
-## Core Concepts
+## Stash IDs and Serials
 
-The main components of a stash are its ID and the individual chunked data subsets which are identified by a serial. Both the id and serial are values you define.
+The main components of a stash are its ID and the individual chunked data subsets, which are identified by serials. Both the ID and serial are values you define.
 
-The stash ID is a unique identifier for the stash that you define from the relevant information of your domain. This is often a combination of temporal information and a domain identifier. For instance: `20240215-job32` or `2024-07-05T15:17:50Z-customer93ak`.
+The **stash ID** is a unique identifier for the stash that you define. You might use information that's relevant to your domain, such as `20240215-job32` or `2024-07-05T15-17-50Z_customer93ak`, a UUID, or any other unique identifier.
 
-Each subset of data that is uploaded to the stash is identified by a serial. If the order of each data subset is important to the overall datset, you should use the serial to represent the order of loading (e.g., `1`, `2`, etc...).
+Each chunk of data that is uploaded to the stash is identified by a unique **serial**. The sort order of the serials indicates the order of the chunks in the overall dataset. If all serials can be parsed as integers, numerical sort order is used; otherwise, sorting is done lexicographically according to each character's Unicode code point value. If the order of data chunks in the stash is important, using integers as serials (e.g., `1`, `2`) is recommended.
 
-If the order of each data subset is not important, then you can use a random serial for each subset like a UUID: `123e4567-e89b-12d3-a456-426655440000`. The only requirement is that the serial must be unique for each subset.
+The maximum length for both stash IDs and serials is 256 characters. They may only contain letters, numbers, hyphens, and underscores, and must start with a letter or number.
 
 ## Referencing a Stash
 
-Once a stash, and all its parts, has been uploaded you can use the stash ID in other API calls to refer to the full dataset instead of including the entire dataset itself. Think of it as a reference to all the data in the stash.
+Once all chunks have been uploaded to a stash, you can use the stash ID in place of passing the full dataset inline to other Glide API calls. Think of it as a reference to all the data in the stash.
 
 For instance, instead of including all the row data in a request to create a table, you can instead reference the stash ID:
diff --git a/api-reference/v2/stashing/post-stashes-serial.mdx b/api-reference/v2/stashing/post-stashes-serial.mdx
deleted file mode 100644
index f6b1839..0000000
--- a/api-reference/v2/stashing/post-stashes-serial.mdx
+++ /dev/null
@@ -1,14 +0,0 @@
----
-title: Stash Data
-openapi: post /stashes/{stashID}/{serial}
----
-
-When working with large datasets it is necessary to break it into smaller chunks for performance and reliability. We call this process "stashing".
-
-
- To understand what stashing is and how to use it to work with large datasets, please see our [introduction to stashing](/api-reference/v2/stashing/introduction).
-
-
-
- The stashID must be in the format of a UUID, e.g., `"123e4567-e89b-12d3-a456-426655440000"`. This is a known issue and will be fixed in a future release.
-
\ No newline at end of file
diff --git a/api-reference/v2/stashing/put-stashes-serial.mdx b/api-reference/v2/stashing/put-stashes-serial.mdx
new file mode 100644
index 0000000..e1f1077
--- /dev/null
+++ b/api-reference/v2/stashing/put-stashes-serial.mdx
@@ -0,0 +1,10 @@
+---
+title: Stash Data
+openapi: put /stashes/{stashID}/{serial}
+---
+
+When using large datasets with the Glide API, it may be necessary to break them into smaller chunks for performance and reliability. We call this process "stashing."
+
+
+ To learn more about stashing and how to use it to work with large datasets, please see our [introduction to stashing](/api-reference/v2/stashing/introduction).
+
diff --git a/api-reference/v2/tutorials/bulk-import.mdx b/api-reference/v2/tutorials/bulk-import.mdx
index 8d1a8c8..4b5ce82 100644
--- a/api-reference/v2/tutorials/bulk-import.mdx
+++ b/api-reference/v2/tutorials/bulk-import.mdx
@@ -32,10 +32,6 @@ To simplify the coordination, parallelization, and idempotency of the upload pro
 For instance, a daily import process might have a stash ID of `20240501-import`. Or, an import specific to a single customer might have a stash ID of `customer-381-import`.
 
-
- The stashID must be in the format of a UUID, e.g., `"123e4567-e89b-12d3-a456-426655440000"`. This is a known issue and will be fixed in a future release.
-
-
 You are responsible for ensuring that the stash ID is unique and stable across associated uploads.
 
 ## Upload Data
diff --git a/mint.json b/mint.json
index a4f245c..a0b445a 100644
--- a/mint.json
+++ b/mint.json
@@ -43,7 +43,7 @@
 "group": "Stashing",
 "pages": [
 "api-reference/v2/stashing/introduction",
- "api-reference/v2/stashing/post-stashes-serial",
+ "api-reference/v2/stashing/put-stashes-serial",
 "api-reference/v2/stashing/delete-stash"
 ]
 },
diff --git a/openapi/swagger.json b/openapi/swagger.json
index 43ea6aa..b7c57e5 100644
--- a/openapi/swagger.json
+++ b/openapi/swagger.json
@@ -367,7 +367,8 @@
 "properties": {
 "$stashID": {
 "type": "string",
- "description": "ID of the stash whose data should be used",
+ "pattern": "^[a-zA-Z0-9][a-zA-Z0-9_-]{0,255}$",
+ "description": "ID of the stash, e.g., `20240215-job32`",
 "example": "20240215-job32"
 }
 },
@@ -695,7 +696,8 @@
 "properties": {
 "$stashID": {
 "type": "string",
- "description": "ID of the stash whose data should be used",
+ "pattern": "^[a-zA-Z0-9][a-zA-Z0-9_-]{0,255}$",
+ "description": "ID of the stash, e.g., `20240215-job32`",
 "example": "20240215-job32"
 }
 },
@@ -924,7 +926,8 @@
 "properties": {
 "$stashID": {
 "type": "string",
- "description": "ID of the stash whose data should be used",
+ "pattern": "^[a-zA-Z0-9][a-zA-Z0-9_-]{0,255}$",
+ "description": "ID of the stash, e.g., `20240215-job32`",
 "example": "20240215-job32"
 }
 },
@@ -941,87 +944,8 @@
 }
 }
 },
- "/stashes": {
- "post": {
- "responses": {
- "200": {
- "description": "",
- "content": {
- "application/json": {
- "schema": {
- "type": "object",
- "properties": {
- "data": {
- "type": "object",
- "properties": {
- "stashID": {
- "type": "string",
- "description": "The newly created stash"
- }
- },
- "required": [
- "stashID"
- ],
- "additionalProperties": false
- }
- },
- "required": [
- "data"
- ],
- "additionalProperties": false
- }
- }
- }
- },
- "400": {
- "description": "",
- "content": {
- "application/json": {
- "schema": {
- "type": "object",
- "properties": {
- "error": {
- "type": "object",
- "properties": {
- "type": {
- "type": "string"
- },
- "message": {
- "type": "string"
- }
- },
- "required": [
- "type",
- "message"
- ],
- "additionalProperties": false
- }
- },
- "required": [
- "error"
- ],
- "additionalProperties": false
- }
- }
- }
- }
- },
- "operationId": "Create stash",
- "requestBody": {
- "content": {
- "application/json": {
- "schema": {
- "type": "object",
- "properties": {},
- "additionalProperties": false
- }
- }
- }
- }
- }
- },
 "/stashes/{stashID}/{serial}": {
- "post": {
+ "put": {
 "responses": {
 "200": {
 "description": "",
@@ -1069,14 +993,15 @@
 }
 }
 },
- "description": "Adds data to a stash",
+ "description": "Sets the content of a chunk of data inside a stash",
 "parameters": [
 {
 "name": "stashID",
 "in": "path",
 "schema": {
 "type": "string",
- "description": "ID of the stash to add data to, e.g., `20240215-job32`",
+ "pattern": "^[a-zA-Z0-9][a-zA-Z0-9_-]{0,255}$",
+ "description": "ID of the stash. The stash will be created if it doesn't already exist.",
 "example": "20240215-job32"
 },
 "required": true
@@ -1086,7 +1011,9 @@
 "in": "path",
 "schema": {
 "type": "string",
- "description": "Serial identifier of the chunk of data to add to the stash. Chunks will be assembled in the sort order of their serials, so utilize ordered identifiers for each chunk if a specific ordering of data is required, e.g., `1`, `2`, etc...\nIf the order of data is not important, random, but unique, values can be used, e.g., `c2a4567`."
+ "pattern": "^[a-zA-Z0-9][a-zA-Z0-9_-]{0,255}$",
+ "description": "Serial identifier of the chunk of data to set in the stash. If a chunk has already been sent with the same serial, its data will be overwritten. Chunks will be assembled in the sort order of their serials, so utilize ordered identifiers for each chunk if a specific ordering of data in the stash is desired, e.g., `1`, `2`, etc...\nIf the order of data is not important, random, but unique, values can be used, e.g., `c2a4567`.",
+ "example": "1"
 },
 "required": true
 }
@@ -1170,14 +1097,15 @@
 }
 }
 },
- "description": "Deletes a stash and its data",
+ "description": "Deletes a stash and all its data",
 "parameters": [
 {
 "name": "stashID",
 "in": "path",
 "schema": {
 "type": "string",
- "description": "ID of the stash to delete, e.g., `20240215-job32`",
+ "pattern": "^[a-zA-Z0-9][a-zA-Z0-9_-]{0,255}$",
+ "description": "ID of the stash, e.g., `20240215-job32`",
 "example": "20240215-job32"
 },
 "required": true
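
Reviewer sketch, not part of the patch: the ordering rule that `introduction.mdx` now documents for serials can be illustrated in a few lines of Python. This is only an approximation of the described behavior. It assumes that "can be parsed as integers" means a serial made up solely of ASCII digits, which the docs do not spell out, and `sort_serials` is a hypothetical helper rather than anything in the Glide API.

```python
def sort_serials(serials: list[str]) -> list[str]:
    """Order chunk serials the way the updated introduction describes.

    Assumption: a serial counts as an integer only if it consists entirely of
    ASCII digits. (Python's int() would also accept forms like "1_0", so it is
    deliberately not used for the check.)
    """
    all_integers = bool(serials) and all(s.isascii() and s.isdigit() for s in serials)
    if all_integers:
        # Numerical order: "2" sorts before "10".
        return sorted(serials, key=int)
    # Otherwise fall back to lexicographic order by Unicode code point,
    # which is how Python compares str values.
    return sorted(serials)


if __name__ == "__main__":
    print(sort_serials(["10", "2", "1"]))
    print(sort_serials(["10", "2", "chunk-a"]))
```

The second call shows why the docs recommend integer serials when order matters: a single non-numeric serial switches the whole stash to lexicographic ordering, so `10` sorts ahead of `2`.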