| 1 | +--- |
| 2 | +title: Streamlit in Snowflake |
| 3 | +--- |
| 4 | + |
| 5 | +[Streamlit](https://streamlit.io/) is an open-source Python framework for data scientists and AI/ML engineers to |
| 6 | +deliver dynamic data apps with only a few lines of code. |
| 7 | + |
| 8 | +[Streamlit in Snowflake](https://www.snowflake.com/en/product/features/streamlit-in-snowflake/) enables data scientists |
| 9 | +and Python developers to combine Streamlit's component-rich, open-source Python library with the scale, performance and |
| 10 | +security of the Snowflake platform. Streamlit Python scripts can define user interface (UI) components such as |
| 11 | +filters, graphs, sliders, and more to interact with your data. |
| 12 | + |
| 13 | +In this example, you use Snowsight in your Snowflake account to create a simple Streamlit app that uses |
| 14 | +[Snowflake Cortex Search for RAG](https://docs.snowflake.com/user-guide/snowflake-cortex/cortex-search/cortex-search-overview) |
| 15 | +to ask natural-language questions about an existing table in your Snowflake account. This table |
| 16 | +contains data that was generated by Unstructured. Answers are returned in natural-language, |
| 17 | +chatbot-style format. |
| 18 | + |
| 19 | +## Prerequisites |
| 20 | + |
| 21 | +- A table in Snowflake that contains data that was generated by Unstructured. The |
| 22 | + target Snowflake table must have a column named `EMBEDDINGS` that contains vector embeddings for the text in the table's `TEXT` column. |
| 23 | + The following Streamlit example app assumes that the `EMBEDDINGS` column contains 1,024-dimensional vector embeddings and has a data type of `VECTOR(FLOAT, 1024)`. |
| 24 | + |
| 25 | + To create this table, you can [create a custom Unstructured workflow](/ui/workflows#create-a-custom-workflow) |
| 26 | + that uses any supported [source connector](/ui/sources/overview) along with the |
| 27 | + [Snowflake destination connector](/ui/destinations/snowflake). Then |
| 28 | + [run](/ui/workflows#edit%2C-delete%2C-or-run-a-workflow) the workflow to |
| 29 | + generate the data and then insert that generated data into the target Snowflake table. |
| 30 | + |
| 31 | + After the data is inserted into the target Snowflake table, you can run the following Snowflake SQL statement to |
| 32 | + generate the 1,024-dimensional vector embeddings for the text in the table's `TEXT` column and then insert those generated vector |
| 33 | + embeddings into the table's `EMBEDDINGS` column. The model specified here for generating the vector embeddings is the |
| 34 | + same one that is used by the Streamlit example app: |
| 35 | + |
| 36 | + ```sql |
| 37 | + UPDATE ELEMENTS |
| 38 | + SET EMBEDDINGS = SNOWFLAKE.CORTEX.EMBED_TEXT_1024( |
| 39 | + 'snowflake-arctic-embed-l-v2.0', |
| 40 | + TEXT |
| 41 | + ); |
| 42 | + ``` |
| 43 | + |
| 44 | + To learn how to run Snowflake SQL statements, see, for example, |
| 45 | + [Querying data using worksheets](https://docs.snowflake.com/user-guide/ui-snowsight-query). |
| 46 | + |
| 47 | +- You must have the appropriate privileges to create and use a Streamlit app in your Snowflake account. These |
| 48 | + privileges include ones for the target table's parent database and schema as well as the Snowflake warehouse that |
| 49 | + runs the Streamlit app. For details, see |
| 50 | + [Getting started with Streamlit in Snowflake](https://docs.snowflake.com/developer-guide/streamlit/getting-started). |
| 51 | + |
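Before building the app, it can help to confirm that the `EMBEDDINGS` column is fully populated. The following is a minimal sketch, assuming the table is named `ELEMENTS` as in the SQL statement above. `count_missing_embeddings` is a hypothetical helper name, and `session` is assumed to be a Snowpark session, for example the one returned by `get_active_session()` inside a Streamlit in Snowflake app.

```python
def count_missing_embeddings(session, table_name="ELEMENTS"):
    """Return the number of rows whose EMBEDDINGS column is still NULL.

    Assumptions: `session` is a Snowpark Session, and `table_name` is a
    trusted identifier, because it is interpolated directly into the SQL.
    """
    rows = session.sql(
        f"SELECT COUNT(*) FROM {table_name} WHERE EMBEDDINGS IS NULL"
    ).collect()
    # collect() returns a list of rows; the count is the first column
    # of the first (and only) row.
    return rows[0][0]
```

A result of `0` suggests that every row has an embedding and the table is ready for the similarity query used later in this example.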
| 52 | +## Create and run the example app |
| 53 | + |
| 54 | +<Steps> |
| 55 | + <Step title="Create the Streamlit app"> |
| 56 | + 1. In Snowsight for your Snowflake account, on the sidebar, click **Projects > Streamlit**. |
| 57 | + 2. Click **+ Streamlit App**. |
| 58 | + 3. For **App title**, enter a name for your app, such as `Unstructured Demo Streamlit App`. |
| 59 | + 4. For **App location**, choose the target database and schema to store the app in. |
| 60 | + 5. For **App warehouse**, choose the warehouse that you want to use to run your app and execute its queries. |
| 61 | + 6. Click **Create**. |
| 62 | + </Step> |
| 63 | + <Step title="Add code to the Streamlit app"> |
| 64 | + In this step, you add Python code to the Streamlit app that you created in the previous step. |
| 65 | + |
| 66 | + This step explains each part of the code as you add it. If you want to skip past these explanations, add the |
| 67 | + code in the [complete code example](#complete-code-example) all at once, and then skip ahead to |
| 68 | + the next step, "Run the Streamlit app." |
| 69 | + |
| 70 | + 1. Import Python dependencies that get the current connection to the Snowflake database and schema and get Streamlit functions and features. |
| 71 | + |
| 72 | + ```python |
| 73 | + from snowflake.snowpark.context import get_active_session |
| 74 | + import streamlit as st |
| 75 | + ``` |
| 76 | + |
| 77 | + 2. Get the current connection to the Snowflake database and schema. |
| 78 | + |
| 79 | + ```python |
| 80 | + session = get_active_session() |
| 81 | + ``` |
| 82 | + |
| 83 | + 3. Display the title of the app in the Streamlit UI, and get the user's search query from the Streamlit UI. |
| 84 | + |
| 85 | + ```python |
| 86 | + st.title("Snowflake Cortex Search for RAG with Data from Unstructured") |
| 87 | + |
| 88 | + query = st.text_input("Enter your search query:") |
| 89 | + ``` |
| 90 | + |
| 91 | + 4. If the user entered a search query, display a progress indicator in the UI while the app processes the query. |
| 92 | + |
| 93 | + ```python |
| 94 | + if query: |
| 95 | +     with st.spinner("Embedding and retrieving..."): |
| 96 | + ``` |
| 97 | + 5. Use the user's search query to get the top result from the `ELEMENTS` table. |
| 98 | + The `ELEMENTS` table contains the data that was generated by Unstructured. The code uses the |
| 99 | + `SNOWFLAKE.CORTEX.EMBED_TEXT_1024` function to generate vector embeddings for the user's search query and the `VECTOR_COSINE_SIMILARITY` |
| 100 | + function to get the similarity between the vector embeddings for the user's search query and the vector embeddings for the `TEXT` column |
| 101 | + for each row in the `ELEMENTS` table. The code then orders the results by similarity and limits the results to the row with the greatest similarity |
| 102 | + between the search query and the target text. |
| 103 | + |
| 104 | + ```python |
| 105 | + top_result_df = session.sql(f""" |
| 106 | + WITH query_embedding AS ( |
| 107 | + SELECT SNOWFLAKE.CORTEX.EMBED_TEXT_1024( |
| 108 | + 'snowflake-arctic-embed-l-v2.0', '{query}' |
| 109 | + ) AS EMBED |
| 110 | + ) |
| 111 | + SELECT |
| 112 | + e.TEXT, |
| 113 | + VECTOR_COSINE_SIMILARITY(e.EMBEDDINGS, q.EMBED) AS similarity |
| 114 | + FROM ELEMENTS e, query_embedding q |
| 115 | + ORDER BY similarity DESC |
| 116 | + LIMIT 1 |
| 117 | + """).to_pandas() |
| 118 | + ``` |
| 119 | + 6. Get the `TEXT` column from the top result and use it as context for the user's search query. |
| 120 | + |
| 121 | + ```python |
| 122 | + context = top_result_df["TEXT"][0] |
| 123 | + ``` |
| 124 | + |
| 125 | + 7. Use the user's search query and the context from the top result to generate a natural-language response. |
| 126 | + The code uses the `SNOWFLAKE.CORTEX.COMPLETE` function to generate a response to the user's search query based on the context from the top result. |
| 127 | + |
| 128 | + ```python |
| 129 | + completion_df = session.sql(f""" |
| 130 | + SELECT SNOWFLAKE.CORTEX.COMPLETE( |
| 131 | + 'snowflake-arctic', |
| 132 | + CONCAT('Context: ', $$ {context} $$, ' \\n\\nQuestion: {query}\\nAnswer:') |
| 133 | + ) AS RESPONSE |
| 134 | + """).to_pandas() |
| 135 | + ``` |
| 136 | + |
| 137 | + 8. Display the generated response in the Streamlit UI. |
| 138 | + |
| 139 | + ```python |
| 140 | + st.write("Answer:") |
| 141 | + st.write(completion_df["RESPONSE"][0]) |
| 142 | + ``` |
| 143 | + </Step> |
| 144 | + <Step title="Run the Streamlit app"> |
| 145 | + 1. In the upper right corner, click **Run**. |
| 146 | + 2. For **Enter your search query**, enter a natural-language question about the content of the `TEXT` column in the table. |
| 147 | + 3. Press **Enter**. |
| 148 | + |
| 149 | + Snowflake Cortex Search for RAG returns its answer to your question in natural-language, chatbot-style format. |
| 150 | + </Step> |
| 151 | +</Steps> |
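To make the retrieval step more concrete, here is a minimal pure-Python sketch of what the `ORDER BY similarity DESC LIMIT 1` query computes: the cosine similarity between the query embedding and each row's embedding, keeping only the best match. The function names and the tiny three-dimensional vectors are illustrative assumptions. In the app, this work happens inside Snowflake through `VECTOR_COSINE_SIMILARITY` on 1,024-dimensional vectors.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_match(query_embedding, rows):
    """Return the (text, embedding) pair most similar to the query.

    `rows` is a list of (text, embedding) tuples, standing in for the
    TEXT and EMBEDDINGS columns of the ELEMENTS table.
    """
    return max(rows, key=lambda row: cosine_similarity(row[1], query_embedding))


# Illustrative data only; real embeddings have 1,024 dimensions.
rows = [
    ("about invoices", [1.0, 0.0, 0.0]),
    ("about shipping", [0.0, 1.0, 0.0]),
    ("about returns", [0.6, 0.8, 0.0]),
]
best_text, _ = top_match([0.9, 0.1, 0.0], rows)
print(best_text)  # prints "about invoices"
```

Because cosine similarity depends only on the angle between vectors, the row whose embedding points in nearly the same direction as the query embedding wins, regardless of vector length.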
| 152 | + |
| 153 | +## Complete code example |
| 154 | + |
| 155 | +The full code example for the Streamlit app is as follows: |
| 156 | + |
| 157 | +```python |
| 158 | +from snowflake.snowpark.context import get_active_session |
| 159 | +import streamlit as st |
| 160 | + |
| 161 | +session = get_active_session() |
| 162 | + |
| 163 | +st.title("Snowflake Cortex Search for RAG with Data from Unstructured") |
| 164 | + |
| 165 | +query = st.text_input("Enter your search query:") |
| 166 | + |
| 167 | +if query: |
| 168 | +    with st.spinner("Embedding and retrieving..."): |
| 169 | + |
| 170 | +        top_result_df = session.sql(f""" |
| 171 | +            WITH query_embedding AS ( |
| 172 | +                SELECT SNOWFLAKE.CORTEX.EMBED_TEXT_1024( |
| 173 | +                    'snowflake-arctic-embed-l-v2.0', '{query}' |
| 174 | +                ) AS EMBED |
| 175 | +            ) |
| 176 | +            SELECT |
| 177 | +                e.TEXT, |
| 178 | +                VECTOR_COSINE_SIMILARITY(e.EMBEDDINGS, q.EMBED) AS similarity |
| 179 | +            FROM ELEMENTS e, query_embedding q |
| 180 | +            ORDER BY similarity DESC |
| 181 | +            LIMIT 1 |
| 182 | +        """).to_pandas() |
| 183 | + |
| 184 | +        context = top_result_df["TEXT"][0] |
| 185 | + |
| 186 | +        completion_df = session.sql(f""" |
| 187 | +            SELECT SNOWFLAKE.CORTEX.COMPLETE( |
| 188 | +                'snowflake-arctic', |
| 189 | +                CONCAT('Context: ', $$ {context} $$, ' \\n\\nQuestion: {query}\\nAnswer:') |
| 190 | +            ) AS RESPONSE |
| 191 | +        """).to_pandas() |
| 192 | + |
| 193 | +        st.write("Answer:") |
| 194 | +        st.write(completion_df["RESPONSE"][0]) |
| 195 | +``` |
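The example interpolates the user's text directly into the SQL string, so a question that contains a single quote (for example, `What's in the report?`) would break the statement. One possible hardening, sketched below, is to pass user-supplied values as bind parameters through the `params` argument of `session.sql`. This sketch assumes a Snowpark for Python version that supports `params` and that the Cortex functions accept bound values in these positions. The helper names `retrieve_context` and `answer_question` are hypothetical.

```python
EMBED_MODEL = "snowflake-arctic-embed-l-v2.0"
COMPLETE_MODEL = "snowflake-arctic"

# The user's text is never placed in the SQL string; "?" marks a bind parameter.
RETRIEVAL_SQL = f"""
    WITH query_embedding AS (
        SELECT SNOWFLAKE.CORTEX.EMBED_TEXT_1024('{EMBED_MODEL}', ?) AS EMBED
    )
    SELECT
        e.TEXT,
        VECTOR_COSINE_SIMILARITY(e.EMBEDDINGS, q.EMBED) AS similarity
    FROM ELEMENTS e, query_embedding q
    ORDER BY similarity DESC
    LIMIT 1
"""

COMPLETION_SQL = (
    f"SELECT SNOWFLAKE.CORTEX.COMPLETE('{COMPLETE_MODEL}', ?) AS RESPONSE"
)


def retrieve_context(session, query):
    """Return the TEXT of the most similar row, or None if no row matched."""
    rows = session.sql(RETRIEVAL_SQL, params=[query]).collect()
    return rows[0][0] if rows else None


def answer_question(session, query):
    """Retrieve context for the query, then ask the model to answer it."""
    context = retrieve_context(session, query)
    if context is None:
        return None
    # The prompt is assembled in Python and bound as a single value.
    prompt = f"Context: {context}\n\nQuestion: {query}\nAnswer:"
    rows = session.sql(COMPLETION_SQL, params=[prompt]).collect()
    return rows[0][0]
```

In the app, the body of the `if query:` block could then call `answer_question(session, query)` and pass a non-`None` result to `st.write`. Returning `None` for an empty result also avoids the indexing error that `top_result_df["TEXT"][0]` would raise if the table had no rows.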
| 196 | + |
| 197 | +## Additional resources |
| 198 | + |
| 199 | +- [Streamlit in Snowflake documentation](https://docs.snowflake.com/developer-guide/streamlit/about-streamlit) |
| 200 | +- [Create and deploy Streamlit apps using Snowsight](https://docs.snowflake.com/developer-guide/streamlit/create-streamlit-ui) |
| 201 | +- [Snowflake Solutions Developer Center for Streamlit](https://www.snowflake.com/en/developers/solutions-center/?tags=technology%2Fstreamlit) |
| 202 | +- [Streamlit documentation](https://docs.streamlit.io/) |
| 203 | + |