Infosys Generative AI Framework 3.0.0

Infosys Generative AI Framework is a Python library that provides the APIs listed below.

Prerequisites

Python =3.10

APIs

The details of each API and its core functionality are given below. For more details, see the docs.

| S# | API | Description |
|---|---|---|
| 1 | audio to text | Converts audio files to text using the OpenAI API. |
| 2 | code translate | Translates code from one programming language to another using the OpenAI API. |
| 3 | single code documentation | Generates documentation for a given source code file using the OpenAI API, based on the content of the source code. |
| 4 | unit test generation | Generates unit tests for a given source code file using the OpenAI API, based on the content of the source code. |
| 5 | multiple code documentation | Generates documentation for multiple source code files at once using the OpenAI API, based on the content of each source code file. |
| 6 | summarize PDF document | Summarizes the content of PDF files using the OpenAI API. |
| 7 | generate insights | Generates insights from the given data using the OpenAI API. |
| 8 | generate metadata and description | Generates meta descriptions from a text file using the OpenAI API, based on the content of the text file. |
| 9 | add searchable embeddings | Indexes PDF documents into a vector DB (e.g., ChromaDB). It uses OpenAI (text-embedding-ada-002) and open-source (all-MiniLM-L6-v2) models. |
| 10 | retrieve context | Returns the closest matches from the embeddings stored in the vector DB (e.g., ChromaDB) for the input query. It uses OpenAI (text-embedding-ada-002) and open-source (all-MiniLM-L6-v2) models. |
| 11 | generate answer | Generates answers from the embeddings stored in the vector DB (e.g., ChromaDB) for the input query. It uses OpenAI (text-embedding-ada-002 and gpt-4) and open-source (all-MiniLM-L6-v2 and roberta-base-squad2) models. |
| 12 | document reset | Deletes PDF document(s) from the vector DB (e.g., ChromaDB). |
| 13 | retrieve video moments | Generates images and video moments from the input video file based on the input query. It uses the open-source clip-ViT-B-32 model. |
| 14 | build knowledge graph | Builds and filters the knowledge graph based on the input query. It uses the OpenAI gpt-4 model. |
| 15 | search knowledge graph | Searches the knowledge graph based on the input query. It uses the OpenAI gpt-4 model. |

The logical inputs and outputs of each API are given below.

| S# | API | Input | Output |
|---|---|---|---|
| 1 | audio to text | `audio file path`, `mom_required (bool)` | `transcribed text`, `mom text (optional)` |
| 2 | code translate | `source code file path`, `source language`, `target language` | `translated code` |
| 3 | single code documentation | `source code file path`, `source language` | `documentation of the code` |
| 4 | unit test generation | `source code file path`, `source language` | `generated test cases` |
| 5 | multiple code documentation | `source code file paths`, `source language` | `documentation of multiple files` |
| 6 | summarize PDF document | `pdf file path`, `summarization type` | `summary of the pdf file` |
| 7 | generate insights | `csv or xlsx file path`, `number of completions` | `summary of the file` |
| 8 | generate metadata and description | `text file path`, `number of completions` | `keywords`, `meta description` |
| 9 | add searchable embeddings | `file paths`, `embedding type`, `vocab dir path`, `vector db directory` | `embeddings added to the vectordb` |
| 10 | retrieve context | `query`, `embedding type`, `top k value`, `vector db directory` | `list of closest matches` |
| 11 | generate answer | `query`, `embedding type`, `vector db directory`, `top k value`, `rag (bool)` | `generated answer` |
| 12 | document reset | `vector db directory`, `collection name` | `status message` |
| 13 | retrieve video moments | `query`, `display results count`, `video file path`, `clip duration`, `output root path` | `output file paths of images and videos` |
| 14 | build knowledge graph | `query`, `mode`, `root path`, `output file prefix`, `graph json file path (optional)` | `graph output image file path`, `graph output json file path`, `message text` |
| 15 | search knowledge graph | `query`, `root path`, `output file prefix`, `graph json file path (optional)` | `graph output image file path`, `message text` |
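To make the `retrieve context` row more concrete, here is a minimal, framework-independent sketch of what "closest matches for a query, limited by a top k value" usually means: rank stored embedding vectors by cosine similarity to the query vector and keep the best k. This is not the framework's implementation or API. The function names, the toy three-dimensional vectors, and the in-memory list standing in for the vector DB are all assumptions made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # Treat a zero vector as matching nothing.
    return dot / (norm_a * norm_b)


def retrieve_top_k(query_vector, stored, top_k):
    """Return the top_k (text, score) pairs, best match first.

    `stored` is a list of (text, vector) pairs; it is a hypothetical
    stand-in for the embeddings held in a vector DB.
    """
    scored = [(text, cosine_similarity(query_vector, vec)) for text, vec in stored]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy "embeddings"; real embedding models produce much longer vectors.
stored_chunks = [
    ("Invoices are due within 30 days.", [0.9, 0.1, 0.0]),
    ("The office is closed on public holidays.", [0.0, 0.2, 0.9]),
    ("Late payments incur a 2% fee.", [0.8, 0.3, 0.1]),
]
query_embedding = [1.0, 0.2, 0.0]

matches = retrieve_top_k(query_embedding, stored_chunks, top_k=2)
for text, score in matches:
    print(f"{score:.3f}  {text}")
```

With these toy vectors, the two payment-related chunks rank above the unrelated one. In a retrieval-augmented setup, such matches would then be passed as context to the answering model.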

Examples

For code examples, please read docs/notebook.

List of models

| S# | Model Name | Type | Dependent API |
|---|---|---|---|
| 1 | whisper-base | `OpenSource` | `audio to text` |
| 2 | gpt2 | `OpenSource` | `code translate`, `summarize PDF document` |
| 3 | all-MiniLM-L6-v2 | `OpenSource` | `add searchable embeddings`, `retrieve context` |
| 4 | roberta-base-squad2 | `OpenSource` | `generate answer` |
| 5 | clip-ViT-B-32 | `OpenSource` | `retrieve video moments` |
| 6 | text-embedding-ada-002 | `OpenAI` | `add searchable embeddings`, `retrieve context`, `generate answer` |
| 7 | gpt-4 | `OpenAI` | `audio to text`, `code translate`, `single code documentation`, `unit test generation`, `multiple code documentation`, `summarize PDF document`, `generate insights`, `generate metadata and description`, `generate answer`, `build knowledge graph`, `search knowledge graph` |

Steps to download OpenSource models

mkdir C:\MyProgramFiles\AI\models
cd C:\MyProgramFiles\AI\models
git lfs install
# Download the 'whisper-base' model
git clone https://huggingface.co/openai/whisper-base
# Download the 'gpt2' model
git clone https://huggingface.co/openai/gpt2
# Download the 'all-MiniLM-L6-v2' model
git clone https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2
# Download the 'roberta-base-squad2' model
git clone https://huggingface.co/deepset/roberta-base-squad2
# Download the 'clip-ViT-B-32' model
git clone https://huggingface.co/sentence-transformers/clip-ViT-B-32
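After cloning, it can be useful to confirm that every model folder actually exists and is not empty. The sketch below does this with the standard library only. It is a hypothetical helper, not part of the framework: the function name is invented here, and the folder names are assumed to match the repository names in the `git clone` commands above.

```python
from pathlib import Path

# Folder names that the `git clone` commands above are assumed to create.
EXPECTED_MODELS = [
    "whisper-base",
    "gpt2",
    "all-MiniLM-L6-v2",
    "roberta-base-squad2",
    "clip-ViT-B-32",
]


def find_missing_models(models_root):
    """Return the expected model folders that are absent or empty under models_root."""
    root = Path(models_root)
    missing = []
    for name in EXPECTED_MODELS:
        folder = root / name
        # A folder that exists but holds no files counts as missing.
        if not folder.is_dir() or not any(folder.iterdir()):
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Same directory as in the download steps; adjust if you cloned elsewhere.
    missing = find_missing_models(r"C:\MyProgramFiles\AI\models")
    print("Missing models:", missing or "none")
```

This only checks that the folders are present. It does not verify that Git LFS fetched the large weight files rather than pointer stubs.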

Steps required for setting up ChromaDB

Install Microsoft Visual Studio C++ Build Tools >= 14.0

The following combinations of the `generate answer` functionality are working:

| S# | RAG | Embedding | Inference |
|---|---|---|---|
| 1 | True | text-embedding-ada-002 | gpt-4 |
| 2 | True | all-MiniLM-L6-v2 | roberta-base-squad2 |
| 3 | False | NA | gpt-4 |
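Because only these three combinations are listed as working, a caller may want to reject anything else before invoking `generate answer`. The following is a minimal sketch of such a check. The names `SUPPORTED_COMBINATIONS` and `is_supported` are hypothetical and not part of the framework, and treating the embedding as irrelevant when RAG is off is an assumption based on the "NA" entry in the table.

```python
# Working (rag, embedding, inference) combinations, copied from the table above.
# None stands for "NA": no embedding model is involved when RAG is disabled.
SUPPORTED_COMBINATIONS = {
    (True, "text-embedding-ada-002", "gpt-4"),
    (True, "all-MiniLM-L6-v2", "roberta-base-squad2"),
    (False, None, "gpt-4"),
}


def is_supported(rag, embedding, inference):
    """Return True if the combination appears in the table of working combinations."""
    # Assumption: with RAG off the embedding value is ignored, so normalise it to None.
    key = (rag, embedding if rag else None, inference)
    return key in SUPPORTED_COMBINATIONS


print(is_supported(True, "text-embedding-ada-002", "gpt-4"))  # True
print(is_supported(True, "all-MiniLM-L6-v2", "gpt-4"))        # False
print(is_supported(False, None, "gpt-4"))                     # True
```

Keeping the combinations in one set means the check stays in sync with the table by editing a single place.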
