"description": "## Overview\n\nThis is a beta version of **Aimon Rely.** It includes our proprietary hallucination detector. This is an beta-release, so please treat it as such. Check with us (send a note to [info@aimon.ai](https://mailto:info@aimon.ai)) before using this API in a production setting. There are limited uptime guarantees at the moment. Please report any issues to the Aimon team (at [info@aimon.ai](https://mailto:info@aimon.ai)).\n\n> Use the APIs with caution - do not send sensitive or protected data to this API. \n \n\n## Features\n\n#### Hallucination detection\n\nGiven a context and the generated text, this API is able to detect 2 different types of model hallucinations: intrinsic and extrinsic.\n\n- The \"is_hallucinated\" field indicates whether the \"generated_text\" (passed in the input) is hallucinated.\n- A top level passage level \"score\" indicates if the entire set of sentences contain any hallucinations. The score is a probabilty measure of how hallucinated the text is compared to the context. A score >= 0.5 can be classified as a hallucination.\n- We also provide sentence level scores to help with explanability.\n \n\n#### Completeness detection\n\nGiven a context, generated text and optionally a reference text, this API is able to detect if the generated text completely answered the user's question. The context should include the context documents along with the user query as passed in to the LLM.\n\nThe output contains a \"score\" that is between 0.0 and 1.0 which indicates the degree of completeness. If the generated answer is not at all relevant to the user query, a score between 0.0 to 0.2 is possible. If the generated answer is relevant but misses some information, a score between 0.2 and 0.7 is possible. If the generated answer is relevant and fully captures all of the information, a score between 0.7 and 1.0 is possible.\n\nThe API also includes a \"reasoning\" field that is a text based explanation of the score. 
It also points out, on a best-effort basis, the points that were missed from the expected answer.\n\n#### Conciseness detection\n\nGiven a context, the generated text, and optionally a reference text, this API detects whether the generated text is concise or verbose in addressing the user query. The context should include the context documents along with the user query as passed to the LLM.\n\nThe output contains a \"score\" between 0.0 and 1.0 that indicates the degree of conciseness: roughly 0.0 to 0.2 if the generated answer is very verbose and contains a lot of information that is not relevant to the user query, 0.2 to 0.7 if it is mostly relevant to the user query but includes some unnecessary text, and 0.7 to 1.0 if it is concise and properly addresses all important points of the user query.\n\nThe API also includes a \"reasoning\" field: a text-based explanation of the score. It also points out, on a best-effort basis, the unnecessary information that was included in the output.\n\n## Limitations\n\n- Input payloads with context sizes greater than 32,000 tokens are not currently supported.\n- The maximum batch size is currently 25 items.",