|
2 | 2 | "cells": [
|
3 | 3 | {
|
4 | 4 | "cell_type": "markdown",
|
5 | | - "id": "5ef5f772-f48a-4bb1-bb68-4e8e9236fd2e",
6 | 5 | "metadata": {},
|
7 | 6 | "source": [
|
8 | 7 | "# QuantLSTM - ONNX (QCDQ) representation"
|
9 | 8 | ]
|
10 | 9 | },
|
11 | 10 | {
|
12 | 11 | "cell_type": "markdown",
|
13 | | - "id": "e5a747f9-fd74-4ebc-8d74-17bf06ff2d48",
14 | 12 | "metadata": {},
|
15 | 13 | "source": [
|
16 | | - "This notebook is divided into `five` parts:\n",
| 14 | + "This notebook is divided into `six` parts:\n", |
17 | 15 | "\n",
|
| 16 | + "<br><b>Part 0</b> : Package Installations.\n", |
| 17 | + "<br>\n", |
18 | 18 | "<br><b>Part 1</b> : Introduction to LSTMs.\n",
|
19 | 19 | "<br>\n",
|
20 | 20 | "<br><b>Part 2</b> : Model creation with brevitas QuantLSTM layer. \n",
|
|
28 | 28 | },
|
29 | 29 | {
|
30 | 30 | "cell_type": "markdown",
|
31 | | - "id": "69ae7154-8cf3-4ee7-88c3-3bec0550008a",
| 31 | + "metadata": {}, |
| 32 | + "source": [ |
| 33 | + "# Package Installations" |
| 34 | + ] |
| 35 | + }, |
| 36 | + { |
| 37 | + "cell_type": "code", |
| 38 | + "execution_count": null, |
| 39 | + "metadata": {}, |
| 40 | + "outputs": [], |
| 41 | + "source": [ |
| 42 | + "#Required package installations, This cell only needs to be executed once at the start\n", |
| 43 | + "!pip install torch==1.13.1\n", |
| 44 | + "!pip install brevitas==0.9.1\n", |
| 45 | + "!pip install onnx==1.13.0\n", |
| 46 | + "!pip install onnxoptimizer==0.3.13\n", |
| 47 | + "!pip install onnxruntime==1.11.1\n", |
| 48 | + "!pip install netron==7.2.5\n", |
| 49 | + "!pip install qonnx==0.2.0\n", |
| 50 | + "!pip install IPython\n", |
| 51 | + "!pip install ipykernel\n", |
| 52 | + "!ipython kernel install --user --name=venv\n", |
| 53 | + "\n", |
| 54 | + "#The below location can change depending on your installation of the 'venv' virtual environment\n", |
| 55 | + "!cp ./4_quant_lstm_helper/function.py ../venv/lib/python3.8/site-packages/brevitas/export/onnx/standard/\n", |
| 56 | + "!cp ./4_quant_lstm_helper/handler.py ../venv/lib/python3.8/site-packages/brevitas/export/onnx/standard/qcdq/\n", |
| 57 | + "\n", |
| 58 | + "#NOTE : Make sure to chnage the kernel to from \"Python 3\" to \"venv\" before running the below commands" |
| 59 | + ] |
| 60 | + }, |
| 61 | + { |
| 62 | + "cell_type": "markdown", |
32 | 63 | "metadata": {},
|
33 | 64 | "source": [
|
34 | 65 | "# Introduction to LSTM's "
|
35 | 66 | ]
|
36 | 67 | },
|
37 | 68 | {
|
38 | | - "attachments": {},
39 | 69 | "cell_type": "markdown",
|
40 | | - "id": "e7a903ef-1680-4a20-8c61-267884b76c96",
41 | 70 | "metadata": {},
|
42 | 71 | "source": [
|
43 | 72 | "`LSTM’s (Long Short-Term Memory)` are sequential neural networks that are capable of learning long term dependencies especially in sequence prediction problems. They are deployed in machine translation, speech recognition, image captioning and especially used for time-series analysis applications.\n",
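To make the recurrence concrete, here is a minimal single-time-step LSTM in plain numpy; the gate ordering, weight names and sizes below are illustrative only, not the ones used later in the notebook.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM time step: gates i, f, g (candidate), o stacked along axis 0."""
    gates = W @ x_t + U @ h_prev + b
    i, f, g, o = np.split(gates, 4)
    i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)
    g = np.tanh(g)                     # candidate cell state
    c_t = f * c_prev + i * g           # new cell state
    h_t = o * np.tanh(c_t)             # new hidden state (the LSTM output)
    return h_t, c_t

# Illustrative sizes only
rng = np.random.default_rng(0)
input_size, hidden_size = 5, 8
W = rng.standard_normal((4 * hidden_size, input_size))
U = rng.standard_normal((4 * hidden_size, hidden_size))
b = np.zeros(4 * hidden_size)
h, c = np.zeros(hidden_size), np.zeros(hidden_size)
h, c = lstm_step(rng.standard_normal(input_size), h, c, W, U, b)
```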
|
|
73 | 102 | },
|
74 | 103 | {
|
75 | 104 | "cell_type": "markdown",
|
76 | | - "id": "70d052c8-e5cd-4eb1-89e5-f8ae956cb853",
77 | 105 | "metadata": {},
|
78 | 106 | "source": [
|
79 | 107 | "# QuantLSTM model creation"
|
80 | 108 | ]
|
81 | 109 | },
|
82 | 110 | {
|
83 | 111 | "cell_type": "markdown",
|
84 | | - "id": "6a64be7c",
85 | 112 | "metadata": {},
|
86 | 113 | "source": [
|
87 | 114 | "In the 2nd part of the notebook, we will create a single layer `QuantLSTM` model in brevitas. We will evaluate with a given set of inputs. We then export this model to `QONNX` so that the same parameters (weights/biases/scales) can be extracted and used in the `QCDQ-LSTM` implementation."
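A minimal sketch of what that cell could look like; the sizes are placeholders, and the export helper `export_qonnx` is assumed here and may differ between brevitas versions.

```python
import torch
from brevitas.nn import QuantLSTM
from brevitas.export import export_qonnx  # assumed export helper; name may vary by brevitas version

torch.manual_seed(0)

# Placeholder dimensions; the notebook's actual sizes may differ
quant_lstm = QuantLSTM(input_size=25, hidden_size=20)
quant_lstm.eval()

# nn.LSTM-style input layout: (seq_len, batch, input_size)
test_input = torch.randn(5, 1, 25)
reference_output = quant_lstm(test_input)

# Export to QONNX so the learned weights/biases/scales can be read back later
export_qonnx(quant_lstm, test_input, export_path="quant_lstm_qonnx.onnx")
```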
|
|
90 | 117 | {
|
91 | 118 | "cell_type": "code",
|
92 | 119 | "execution_count": null,
|
93 | | - "id": "84d66548-365d-46a5-9eaa-bb767085f9aa",
94 | 120 | "metadata": {},
|
95 | 121 | "outputs": [],
|
96 | 122 | "source": [
|
|
119 | 145 | {
|
120 | 146 | "cell_type": "code",
|
121 | 147 | "execution_count": null,
|
122 | | - "id": "23a7682c",
123 | 148 | "metadata": {},
|
124 | 149 | "outputs": [],
|
125 | 150 | "source": [
|
|
153 | 178 | },
|
154 | 179 | {
|
155 | 180 | "cell_type": "markdown",
|
156 | | - "id": "347ef1f5-36e8-4103-9b13-efa7fe93eb5e",
157 | 181 | "metadata": {},
|
158 | 182 | "source": [
|
159 | 183 | "`Abbreviations` : Short-forms defined in the next code block can be referenced here for definitions.\n",
|
|
166 | 190 | {
|
167 | 191 | "cell_type": "code",
|
168 | 192 | "execution_count": null,
|
169 | | - "id": "0bfbf5a3-8556-4190-a28f-4fe9859c55a9",
170 | 193 | "metadata": {},
|
171 | 194 | "outputs": [],
|
172 | 195 | "source": [
|
|
210 | 233 | },
|
211 | 234 | {
|
212 | 235 | "cell_type": "markdown",
|
213 | | - "id": "10237589-f84e-423a-829e-3e2c2e806ed7",
214 | 236 | "metadata": {},
|
215 | 237 | "source": [
|
216 | 238 | "# LSTM ONNX model"
|
217 | 239 | ]
|
218 | 240 | },
|
219 | 241 | {
|
220 | 242 | "cell_type": "markdown",
|
221 | | - "id": "367547b8",
222 | 243 | "metadata": {},
|
223 | 244 | "source": [
|
224 | 245 | "In the 3rd part of the notebook, we will construct the `QCDQ-LSTM` model with standard ONNX operators. After loading all the parameters in the above block we can now start building our ONNX model with QCDQ quantization to represent the LSTM computations described in part-1.\n"
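As a flavour of the QCDQ (QuantizeLinear → Clip → DequantizeLinear) pattern the graph is built from, the hypothetical snippet below quantizes a single tensor to signed 8 bits with standard ONNX nodes; the tensor names and scale value are placeholders, not the ones used in the notebook.

```python
import numpy as np
from onnx import helper, numpy_helper

# Placeholder quantization parameters (the real ones come from the brevitas model)
scale = numpy_helper.from_array(np.array(0.02, dtype=np.float32), name="x_scale")
zero_point = numpy_helper.from_array(np.array(0, dtype=np.int8), name="x_zp")
clip_min = numpy_helper.from_array(np.array(-127, dtype=np.int8), name="clip_min")
clip_max = numpy_helper.from_array(np.array(127, dtype=np.int8), name="clip_max")

# Q -> C -> DQ: quantize to int8, clip to the symmetric range, dequantize back to float
quant = helper.make_node("QuantizeLinear", ["x", "x_scale", "x_zp"], ["x_int8"])
clip = helper.make_node("Clip", ["x_int8", "clip_min", "clip_max"], ["x_int8_clipped"])
dequant = helper.make_node("DequantizeLinear", ["x_int8_clipped", "x_scale", "x_zp"], ["x_qcdq"])
```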
|
|
227 | 248 | {
|
228 | 249 | "cell_type": "code",
|
229 | 250 | "execution_count": null,
|
230 | | - "id": "02fe4d94-af24-4d5e-a809-7d8c49e7fd90",
231 | 251 | "metadata": {},
|
232 | 252 | "outputs": [],
|
233 | 253 | "source": [
|
|
249 | 269 | },
|
250 | 270 | {
|
251 | 271 | "cell_type": "markdown",
|
252 | | - "id": "15098a9e-4187-4987-82cc-275eba650923",
253 | 272 | "metadata": {},
|
254 | 273 | "source": [
|
255 | 274 | "`Abbreviations` : These describe different short-forms used in the next two blocks.\n",
|
|
265 | 284 | },
|
266 | 285 | {
|
267 | 286 | "cell_type": "markdown",
|
268 | | - "id": "f2edc0cc",
269 | 287 | "metadata": {},
|
270 | 288 | "source": [
|
271 | 289 | "We start defining the model by defining the `inputs` and `outputs` defined as value_info tensors in ONNX.\n",
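For instance, with placeholder names and sizes (batch 1, 25 input features, 20 hidden units), the value_info tensors might look like this:

```python
from onnx import helper, TensorProto

# Placeholder shapes; the notebook's actual names and dimensions may differ
inp = helper.make_tensor_value_info("X", TensorProto.FLOAT, [1, 25])
h_in = helper.make_tensor_value_info("h_prev", TensorProto.FLOAT, [1, 20])
c_in = helper.make_tensor_value_info("c_prev", TensorProto.FLOAT, [1, 20])
h_out = helper.make_tensor_value_info("h_out", TensorProto.FLOAT, [1, 20])
c_out = helper.make_tensor_value_info("c_out", TensorProto.FLOAT, [1, 20])
```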
|
|
276 | 294 | {
|
277 | 295 | "cell_type": "code",
|
278 | 296 | "execution_count": null,
|
279 | | - "id": "02761646-4c6d-440f-8e90-4935beebab56",
280 | 297 | "metadata": {},
|
281 | 298 | "outputs": [],
|
282 | 299 | "source": [
|
|
294 | 311 | {
|
295 | 312 | "cell_type": "code",
|
296 | 313 | "execution_count": null,
|
297 | | - "id": "c08e5a23-ef2e-4bca-9293-c800350c2c62",
298 | 314 | "metadata": {},
|
299 | 315 | "outputs": [],
|
300 | 316 | "source": [
|
|
412 | 428 | },
|
413 | 429 | {
|
414 | 430 | "cell_type": "markdown",
|
415 | | - "id": "3d10867f",
416 | 431 | "metadata": {},
|
417 | 432 | "source": [
|
418 | 433 | "After defining the above operations we now connect them and create a graph with the help of onnx.helper `make_graph` utility function"
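The general shape of that call, shown here on a deliberately tiny single-node graph rather than the full LSTM node list:

```python
import numpy as np
from onnx import helper, TensorProto, numpy_helper

# Toy one-node graph purely to illustrate make_graph's signature;
# the real call passes the full list of QCDQ/LSTM nodes instead.
x = helper.make_tensor_value_info("X", TensorProto.FLOAT, [1, 4])
y = helper.make_tensor_value_info("Y", TensorProto.FLOAT, [1, 4])
bias = numpy_helper.from_array(np.ones((1, 4), dtype=np.float32), name="bias")
add_node = helper.make_node("Add", ["X", "bias"], ["Y"], name="add0")

graph = helper.make_graph(
    nodes=[add_node],      # nodes in topological order
    name="toy_graph",
    inputs=[x],            # graph inputs (ValueInfoProto)
    outputs=[y],           # graph outputs (ValueInfoProto)
    initializer=[bias],    # constant tensors (weights, scales, ...)
)
```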
|
|
421 | 436 | {
|
422 | 437 | "cell_type": "code",
|
423 | 438 | "execution_count": null,
|
424 | | - "id": "79839558-8752-4fc8-9b0e-8fed47c91701",
425 | 439 | "metadata": {},
|
426 | 440 | "outputs": [],
|
427 | 441 | "source": [
|
|
632 | 646 | },
|
633 | 647 | {
|
634 | 648 | "cell_type": "markdown",
|
635 | | - "id": "b1b16751",
636 | 649 | "metadata": {},
|
637 | 650 | "source": [
|
638 | 651 | "The above created graph can now be converted into a qonnx model with the `qonnx_make_model` utility. We save the model with `onnx.save` utility and then view it in Netron with the help of `showInNetron` utility. \n"
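Roughly, assuming `graph` is the GraphProto built above; `showInNetron` is treated here as a small helper in the style of the FINN utilities (netron serving the file inside an IFrame), so the stand-in below is only an assumption about what it does.

```python
import onnx
import netron
from IPython.display import IFrame
from qonnx.util.basic import qonnx_make_model  # assumed import location

# 'graph' is assumed to be the GraphProto assembled with make_graph above
model = qonnx_make_model(graph, producer_name="qcdq-lstm")
onnx.save(model, "lstm_qcdq.onnx")

def show_in_netron(model_path, port=8081):
    # Hypothetical stand-in for the showInNetron helper used in FINN-style notebooks
    netron.start(model_path, address=("0.0.0.0", port), browse=False)
    return IFrame(src=f"http://localhost:{port}/", width="100%", height=400)

show_in_netron("lstm_qcdq.onnx")
```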
|
|
641 | 654 | {
|
642 | 655 | "cell_type": "code",
|
643 | 656 | "execution_count": null,
|
644 | | - "id": "c6ec7b2a-456d-4452-97ec-df9a471d5391",
645 | 657 | "metadata": {},
|
646 | 658 | "outputs": [],
|
647 | 659 | "source": [
|
|
652 | 664 | },
|
653 | 665 | {
|
654 | 666 | "cell_type": "markdown",
|
655 | | - "id": "40b49257",
656 | 667 | "metadata": {},
|
657 | 668 | "source": [
|
658 | 669 | "In this block of code we execute the onnx graph to check that it can execute without any errors. We perform it's functional verification in the later part of the notebook."
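A sketch of such a smoke test with onnxruntime; the input names and shapes are placeholders and have to match the value_info tensors defined earlier.

```python
import numpy as np
import onnxruntime as ort

sess = ort.InferenceSession("lstm_qcdq.onnx", providers=["CPUExecutionProvider"])

# Placeholder shapes: batch 1, 25 input features, 20 hidden units
feed = {
    "X": np.random.randn(1, 25).astype(np.float32),
    "h_prev": np.zeros((1, 20), dtype=np.float32),
    "c_prev": np.zeros((1, 20), dtype=np.float32),
}
outputs = sess.run(None, feed)   # None -> return every graph output
print([o.shape for o in outputs])
```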
|
|
661 | 672 | {
|
662 | 673 | "cell_type": "code",
|
663 | 674 | "execution_count": null,
|
664 | | - "id": "db5892bc-ac8d-4972-afcf-20bf880f5e86",
665 | 675 | "metadata": {},
|
666 | 676 | "outputs": [],
|
667 | 677 | "source": [
|
|
691 | 701 | },
|
692 | 702 | {
|
693 | 703 | "cell_type": "markdown",
|
694 | | - "id": "5d2b5a1e-654e-46a5-9d4f-8708611a6d1e",
695 | 704 | "metadata": {},
|
696 | 705 | "source": [
|
697 | 706 | "# SCAN Operation Integration"
|
698 | 707 | ]
|
699 | 708 | },
|
700 | 709 | {
|
701 | 710 | "cell_type": "markdown",
|
702 | | - "id": "7365329a-f3d2-4f74-8e2f-9076771e07a7",
703 | 711 | "metadata": {},
|
704 | 712 | "source": [
|
705 | 713 | "### Introduction to ONNX Scan operation\n",
|
|
721 | 729 | },
|
722 | 730 | {
|
723 | 731 | "cell_type": "markdown",
|
724 | | - "id": "17f247f7",
725 | 732 | "metadata": {},
|
726 | 733 | "source": [
|
727 | 734 | "The `Scan` operation is essentially a container operator which will consume the LSTM graph that we created above in it's body.\n",
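Conceptually, the wrapping looks like the hypothetical call below: the LSTM GraphProto becomes Scan's `body`, the hidden and cell states are carried from one iteration to the next, and a single scan input feeds one time step per iteration (the tensor names here are placeholders).

```python
from onnx import helper

# 'lstm_graph' is assumed to be the single-timestep QCDQ-LSTM GraphProto built above,
# with its state inputs/outputs listed before the per-timestep input/output.
scan_node = helper.make_node(
    "Scan",
    inputs=["h_init", "c_init", "X_seq"],     # N state variables, then M scan inputs
    outputs=["h_final", "c_final", "Y_seq"],  # final states, then stacked scan outputs
    body=lstm_graph,                          # subgraph executed once per time step
    num_scan_inputs=1,                        # one scan input: the input sequence
)
```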
|
|
733 | 740 | {
|
734 | 741 | "cell_type": "code",
|
735 | 742 | "execution_count": null,
|
736 | | - "id": "700a93a8-f757-4fa1-88dd-47a3f2a7f171",
737 | 743 | "metadata": {},
|
738 | 744 | "outputs": [],
|
739 | 745 | "source": [
|
|
750 | 756 | },
|
751 | 757 | {
|
752 | 758 | "cell_type": "markdown",
|
753 | | - "id": "572f191e",
754 | 759 | "metadata": {},
|
755 | 760 | "source": [
|
756 | 761 | "We will now create the scan operator here now utilizing the `make_node` utility from ONNX.\n",
|
|
760 | 765 | {
|
761 | 766 | "cell_type": "code",
|
762 | 767 | "execution_count": null,
|
763 | | - "id": "111fdce4-464f-40c1-ac4d-3022b05f153e",
764 | 768 | "metadata": {},
|
765 | 769 | "outputs": [],
|
766 | 770 | "source": [
|
|
775 | 779 | },
|
776 | 780 | {
|
777 | 781 | "cell_type": "markdown",
|
778 | | - "id": "ea8a05d9",
779 | 782 | "metadata": {},
|
780 | 783 | "source": [
|
781 | 784 | "We can now define the graph for the scan operator utilizing the `make_graph` utility."
|
|
784 | 787 | {
|
785 | 788 | "cell_type": "code",
|
786 | 789 | "execution_count": null,
|
787 | | - "id": "4668cf2b-524e-4768-8dc8-9d619f6273da",
788 | 790 | "metadata": {},
|
789 | 791 | "outputs": [],
|
790 | 792 | "source": [
|
|
810 | 812 | },
|
811 | 813 | {
|
812 | 814 | "cell_type": "markdown",
|
813 | | - "id": "0673e335",
814 | 815 | "metadata": {},
|
815 | 816 | "source": [
|
816 | 817 | "Now that we have the SCAN based quantized LSTM model ready, we can now go forward and test it with the same sets of inputs we used for the testing of the brevitas model.\n"
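For example (placeholder file name, tensor names and shapes): the initial hidden and cell states are passed once, and the whole input sequence goes in as the scan input.

```python
import numpy as np
import onnxruntime as ort

sess = ort.InferenceSession("lstm_scan.onnx", providers=["CPUExecutionProvider"])

seq_len, input_size, hidden_size = 5, 25, 20   # placeholder sizes
feed = {
    "h_init": np.zeros((1, hidden_size), dtype=np.float32),
    "c_init": np.zeros((1, hidden_size), dtype=np.float32),
    "X_seq": np.random.randn(seq_len, 1, input_size).astype(np.float32),
}
# Assuming the top-level graph outputs are ordered as final states, then the output sequence
h_final, c_final, y_seq = sess.run(None, feed)
```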
|
|
819 | 820 | {
|
820 | 821 | "cell_type": "code",
|
821 | 822 | "execution_count": null,
|
822 | | - "id": "818d2a81-686f-4a4a-8e78-17dbf75d8451",
823 | 823 | "metadata": {},
|
824 | 824 | "outputs": [],
|
825 | 825 | "source": [
|
|
854 | 854 | },
|
855 | 855 | {
|
856 | 856 | "cell_type": "markdown",
|
857 | | - "id": "907d2ff9-f605-4aec-891e-0c77a1a92346",
858 | 857 | "metadata": {},
|
859 | 858 | "source": [
|
860 | 859 | "# Functional Verification"
|
861 | 860 | ]
|
862 | 861 | },
|
863 | 862 | {
|
864 | 863 | "cell_type": "markdown",
|
865 | | - "id": "b6bb6c60",
866 | 864 | "metadata": {},
|
867 | 865 | "source": [
|
868 | 866 | "In the final part of the notebook, we compare the output of the 8-bit quantized `(QCDQ)-LSTM` implementation with the `QuantLSTM` brevitas model.\n"
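A sketch of that comparison, assuming `brevitas_out` and `qcdq_out` already hold the two output sequences as numpy arrays of the same shape:

```python
import numpy as np

# brevitas_out / qcdq_out: outputs of the QuantLSTM model and of the QCDQ-LSTM (Scan) model
abs_diff = np.abs(brevitas_out - qcdq_out)
print("max abs difference  :", abs_diff.max())
print("mean abs difference :", abs_diff.mean())
print("allclose (atol=1e-2):", np.allclose(brevitas_out, qcdq_out, atol=1e-2))
```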
|
|
871 | 869 | {
|
872 | 870 | "cell_type": "code",
|
873 | 871 | "execution_count": null,
|
874 | | - "id": "2fe07395-6cf9-4c99-a0d3-a27aa6a326b5",
875 | 872 | "metadata": {},
|
876 | 873 | "outputs": [],
|
877 | 874 | "source": [
|
|
900 | 897 | },
|
901 | 898 | {
|
902 | 899 | "cell_type": "markdown",
|
903 | | - "id": "7bcca933",
904 | 900 | "metadata": {},
|
905 | 901 | "source": [
|
906 | 902 | "Note the difference in outputs increases as we progress with processing the inputs. The first two outputs are very close to one another, but as we get the outputs for more inputs we see for some values differ from the brevitas output by a considerable amount.\n",
|
|
909 | 905 | },
|
910 | 906 | {
|
911 | 907 | "cell_type": "markdown",
|
912 | | - "id": "81c6d531",
913 | 908 | "metadata": {},
|
914 | 909 | "source": []
|
915 | 910 | }
|
916 | 911 | ],
|
917 | 912 | "metadata": {
|
918 | 913 | "kernelspec": {
|
919 | | - "display_name": "Python 3 (ipykernel)",
| 914 | + "display_name": "venv", |
920 | 915 | "language": "python",
|
921 | | - "name": "python3"
| 916 | + "name": "venv" |
922 | 917 | },
|
923 | 918 | "language_info": {
|
924 | 919 | "codemirror_mode": {
|
|
930 | 925 | "name": "python",
|
931 | 926 | "nbconvert_exporter": "python",
|
932 | 927 | "pygments_lexer": "ipython3",
|
933 | | - "version": "3.8.10"
| 928 | + "version": "3.8.0" |
934 | 929 | }
|
935 | 930 | },
|
936 | 931 | "nbformat": 4,
|
|