angelarw/topic-modeling-visualizations

Understanding a large amount of text by modeling & visualizing topics

Architecture

Setup

  1. In the S3 console, create a bucket in the US East (N. Virginia) region (region code us-east-1). Use a globally unique name, e.g. large-text-understanding-{username}

  2. Enable CORS on the bucket using the configuration below

    <?xml version="1.0" encoding="UTF-8"?>
    <CORSConfiguration xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
    <CORSRule>
        <AllowedOrigin>*</AllowedOrigin>
        <AllowedMethod>GET</AllowedMethod>
        <MaxAgeSeconds>3000</MaxAgeSeconds>
        <ExposeHeader>x-amz-server-side-encryption</ExposeHeader>
        <ExposeHeader>x-amz-request-id</ExposeHeader>
        <ExposeHeader>x-amz-id-2</ExposeHeader>
        <AllowedHeader>*</AllowedHeader>
    </CORSRule>
    </CORSConfiguration>
    
  3. Launch the CloudFormation stack:

    Launch CloudFormation in us-east-1
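
Steps 1 and 2 above can also be scripted instead of done in the console. The following is a minimal sketch, assuming boto3 is installed and AWS credentials are configured. The function names are hypothetical and not part of this repository; the CORS rules are transcribed from the XML configuration in step 2.

```python
def build_cors_configuration():
    """Return the CORS rules from step 2 in the dict form boto3 expects."""
    return {
        "CORSRules": [
            {
                "AllowedOrigins": ["*"],
                "AllowedMethods": ["GET"],
                "AllowedHeaders": ["*"],
                "ExposeHeaders": [
                    "x-amz-server-side-encryption",
                    "x-amz-request-id",
                    "x-amz-id-2",
                ],
                "MaxAgeSeconds": 3000,
            }
        ]
    }


def create_bucket_with_cors(bucket_name, s3_client=None):
    """Create the bucket in us-east-1 and attach the CORS rules (hypothetical helper)."""
    if s3_client is None:
        # Imported lazily so build_cors_configuration() works without boto3.
        import boto3

        s3_client = boto3.client("s3", region_name="us-east-1")

    # us-east-1 is the default location, so no CreateBucketConfiguration is passed.
    s3_client.create_bucket(Bucket=bucket_name)
    s3_client.put_bucket_cors(
        Bucket=bucket_name,
        CORSConfiguration=build_cors_configuration(),
    )
```

Calling `create_bucket_with_cors("large-text-understanding-yourname")` would create the bucket and apply the rules, where the name suffix is a placeholder for your own username. Step 3 still needs to be done through the CloudFormation launch link.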

Follow the tutorial in the Jupyter notebook

  1. Go to the SageMaker console and open the Jupyter notebook instance once it is ready

  2. Navigate into topic-modeling-visualizations/ and open Topic Modeling Tutorial.ipynb

  3. Follow the steps detailed in the notebook

Use the webapp to explore topics and documents

Follow the instructions in the notebook to launch the webapp.

Topic view

Document view

Clean up

  1. If you created a Cloud9 environment, delete it
  2. Delete the large-text-understanding CloudFormation stack
  3. Delete the S3 bucket you created
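
Steps 2 and 3 above can be scripted as well. This is a minimal sketch, assuming boto3 is installed and credentials are configured; `clean_up` is a hypothetical helper, not part of this repository. It does not cover the Cloud9 environment, and it assumes the bucket is not versioned (a versioned bucket would also need its object versions removed before it can be deleted).

```python
def clean_up(stack_name, bucket_name, cfn_client=None, s3_resource=None):
    """Delete the CloudFormation stack, then empty and delete the S3 bucket."""
    if cfn_client is None or s3_resource is None:
        # Imported lazily so the function can be exercised with stand-in clients.
        import boto3

        cfn_client = cfn_client or boto3.client(
            "cloudformation", region_name="us-east-1"
        )
        s3_resource = s3_resource or boto3.resource("s3", region_name="us-east-1")

    # Request stack deletion and block until CloudFormation reports completion.
    cfn_client.delete_stack(StackName=stack_name)
    cfn_client.get_waiter("stack_delete_complete").wait(StackName=stack_name)

    # S3 refuses to delete a non-empty bucket, so remove the objects first.
    bucket = s3_resource.Bucket(bucket_name)
    bucket.objects.all().delete()
    bucket.delete()
```

For example, `clean_up("large-text-understanding", "large-text-understanding-yourname")` would remove the stack named in step 2 and a bucket with the placeholder name shown.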
