DynamoDB Text Search

Disclaimer

This is a proof of concept.

DynamoDB is not a good fit for the job of text search. Adding data to the index is extremely expensive, and fuzzy searching is not possible.

For production full-text search, I recommend Elasticsearch or PostgreSQL full-text search.

Purpose

Full-text search without spinning up a pesky Elasticsearch cluster!

Search text: "say hello"

Search results:
- "Oh dear, 'Say hello John' she instructed him"
- "Deep dish lyrics: say hello, say hello"
- "We clap and say hello. With our friends at storytime"
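
The results above differ from the search text in case and punctuation, so both sides have to be normalized before they are compared. The following is a minimal sketch of that idea, not the library's actual implementation: `normalize` and `matches` are hypothetical helpers, and the ignore list is an assumption modelled on the `ignoreChars` option described under Usage.

```typescript
// Hypothetical helpers for illustration only; not part of the library's API.
// Assumed ignore list, modelled on the `ignoreChars` index option.
const IGNORE_CHARS = ".,:;!@#$%^&*()-+=_'\"";

// Lowercase the text, drop ignored characters and collapse runs of whitespace.
function normalize(text: string, ignoreChars: string = IGNORE_CHARS): string {
  return text
    .toLowerCase()
    .split("")
    .filter((ch) => ignoreChars.indexOf(ch) === -1)
    .join("")
    .replace(/\s+/g, " ")
    .trim();
}

// A candidate matches when its normalized form contains the normalized search text.
function matches(candidate: string, searchText: string): boolean {
  return normalize(candidate).indexOf(normalize(searchText)) !== -1;
}

const candidates = [
  "Oh dear, 'Say hello John' she instructed him",
  "Deep dish lyrics: say hello, say hello",
  "We clap and say hello. With our friends at storytime",
  "Goodbye for now",
];

// Keeps the first three candidates and drops the last one.
const hits = candidates.filter((candidate) => matches(candidate, "say hello"));
console.log(hits);
```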

Usage

Create a DTS instance

export const dts = new DynamoTextSearch({
  tableName: "DTS_example",
  region: "ap-southeast-2",
  // can be used if this table is shared with other entities
  keyPrefix: "",
});

Define an index

export const myIndex: DTSIndex = {
  name: "myIndex",

  // optional, the larger this value the more partitions entities will be distributed across
  // raise to increase read and write throughput, at the cost of more RCUs during searches
  // default: 1
  // numShards: 1,

  // optional, specify custom characters to treat as a word boundary
  // delimiters: " .,;",

  // optional, specify characters to ignore from search input
  // ignoreChars: " .,:;!@#$%^&*()-+=_'",

  // optional, specify maximum number of searchable characters stored in each segment
  // maxSearchableLength: 50
};

Load some entries

const dataItems = await loadMydata();

await Promise.all(
  dataItems.map(dataItem =>
    dts.addEntry({
      index: myIndex,
      entryText: dataItem.text,
      entry: dataItem,
    })
  )
)
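
Passing the whole dataset to `Promise.all` starts every write at once, which may exceed the table's write capacity when the dataset is large. A minimal sketch of one way to bound the number of writes in flight is shown here; `inBatches` is a hypothetical helper, not part of the library, and the batch size is an arbitrary choice to tune against your own table.

```typescript
// Hypothetical helper: runs `worker` over `items` in sequential batches, so that
// at most `batchSize` calls are in flight at any one time.
// Results are returned in the same order as `items`.
async function inBatches<T, R>(
  items: readonly T[],
  batchSize: number,
  worker: (item: T) => Promise<R>
): Promise<R[]> {
  if (!Number.isInteger(batchSize) || batchSize < 1) {
    throw new Error("batchSize must be a positive integer");
  }
  const results: R[] = [];
  for (let start = 0; start < items.length; start += batchSize) {
    const batch = items.slice(start, start + batchSize);
    // Wait for the whole batch to finish before starting the next one.
    const batchResults = await Promise.all(batch.map((item) => worker(item)));
    results.push(...batchResults);
  }
  return results;
}
```

With the instance and index defined above, the call would look like inBatches(dataItems, 25, dataItem => dts.addEntry({ index: myIndex, entryText: dataItem.text, entry: dataItem })).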

Perform a search

const searchResults = await dts.search({
  index: myIndex,
  searchText: "hello world",
  // optional, return data pre-sorted by a key within 'entry'
  sortKey: "dateCreated",
});

console.log(`Found ${searchResults.length} items`);
console.log(searchResults);

See below for examples

Setup

DynamoDB table

Create a table as per the configuration in infra/dynamo-text-search/main.tf.

Or use Terraform to create the table automatically:

cd infra
terraform init
terraform apply
cd ../

Installing & running examples

I've included two test scripts: one to load the index with some data and another to perform a search. The Bible (not sure which version...) is included as test data.

- Install dependencies: yarn install
- Load data: yarn example:load-bible
  1. Load as much data as you would like, then cancel the ingest with Ctrl-C.
  2. If you let this run, the whole Bible will be ingested. With a small number of WCUs provisioned this may take a few minutes.
  3. Ingesting the entire example dataset to an on-demand table costs about $1.50 USD.
- Search data: yarn example:search-bible "Abraham and Isaac"

Performance

Loading data can be slow. With a provisioned table at 10,000 WCUs I was able to ingest the Bible in about 30 seconds. Loading data is also quite expensive: loading this example dataset costs about $1.50 USD.

Querying data is extremely fast. Once the data has been loaded, RCUs can be increased to allow up to numShards * 6,000 queries per second. There is no limit to the amount of data that can be stored in a single index, and query performance remains constant regardless of index size.
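
The throughput figure above can be turned into a quick sizing calculation. This sketch takes the 6,000 queries per second per shard stated in this section at face value; `maxQueriesPerSecond` and `shardsForTarget` are hypothetical helpers, and real limits depend on item size and provisioned RCUs.

```typescript
// Per-shard ceiling quoted in this README; treat it as an estimate, not a guarantee.
const QUERIES_PER_SECOND_PER_SHARD = 6000;

// Upper bound on search throughput for a given shard count.
function maxQueriesPerSecond(numShards: number): number {
  return numShards * QUERIES_PER_SECOND_PER_SHARD;
}

// Smallest shard count whose upper bound covers the target rate (at least 1).
function shardsForTarget(targetQueriesPerSecond: number): number {
  return Math.max(1, Math.ceil(targetQueriesPerSecond / QUERIES_PER_SECOND_PER_SHARD));
}

console.log(maxQueriesPerSecond(4)); // 24000
console.log(shardsForTarget(20000)); // 4
```

Because the shard count is fixed once data is loaded, it is worth running this estimate against the peak rate you expect before the first import.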

Once data has been loaded, the number of shards cannot be changed - the data will have to be re-imported with a new index configuration.
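
One way such a constraint can arise is sketched below. It assumes that the shard number is embedded in each partition key and that a search reads one key per shard; the key layout and the `shardKeys` helper are hypothetical, and the library may work differently.

```typescript
// Hypothetical key layout: "<keyPrefix><indexName>#<shard>".
function shardKeys(keyPrefix: string, indexName: string, numShards: number): string[] {
  const keys: string[] = [];
  for (let shard = 0; shard < numShards; shard++) {
    keys.push(`${keyPrefix}${indexName}#${shard}`);
  }
  return keys;
}

// Data was written while the index was configured with 4 shards...
const writtenKeys = shardKeys("", "myIndex", 4);
// ...but a search configured with 2 shards only reads the first two keys.
const searchedKeys = shardKeys("", "myIndex", 2);

// Keys that hold data but are never read by the 2-shard search.
const unreachableKeys = writtenKeys.filter((key) => searchedKeys.indexOf(key) === -1);
console.log(unreachableKeys); // [ 'myIndex#2', 'myIndex#3' ]
```

Under this assumption, lowering the shard count hides existing entries and raising it leaves the new shards empty, which is why a full re-import is the safe path.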
