
Conversation

@justinhchae justinhchae commented Sep 24, 2025

Hi and thanks for the library, we really like what we see and hope to include this as an optional feature for our other work.

After forking, I conducted an initial review of nano-graphrag and found two major issues: (1) installation on a Windows machine using pip install fails (issue 163), and (2) the cheap model function's Azure OpenAI endpoint is formatted incorrectly (issue 164). A related issue is issue 105.

For the first issue, when following the installation instructions with pip install on a Windows machine, a "'charmap' codec can't decode" error is encountered. Upon further inspection, this can be resolved by passing an explicit value for the encoding parameter to the `open()` calls in setup.py.

For example, instead of `with open("./nano_graphrag/__init__.py") as f:`, the same line can read `with open("./nano_graphrag/__init__.py", "r", encoding="utf-8") as f:`.

The modification is proposed for both places where setup.py opens files: the `__init__.py` dunder file and `readme.md`.
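The failure mode can be reproduced without Windows: a minimal sketch, using a temporary file rather than the actual setup.py, showing that reading UTF-8 content with the cp1252 ("charmap") codec raises the reported error, while an explicit `encoding="utf-8"` does not.

```python
import os
import tempfile

# Write a file containing a non-ASCII character whose UTF-8 bytes include
# 0x81, which is undefined in cp1252 (the common Windows default codec).
path = os.path.join(tempfile.mkdtemp(), "demo.py")
with open(path, "w", encoding="utf-8") as f:
    f.write('__version__ = "0.0.1"  # \u0141\n')  # U+0141 encodes as C5 81

# Reading with the Windows-style default codec fails with a decode error.
try:
    with open(path, "r", encoding="cp1252") as f:
        f.read()
    decode_failed = False
except UnicodeDecodeError:
    decode_failed = True

# The proposed fix: pass encoding="utf-8" explicitly, which always works.
with open(path, "r", encoding="utf-8") as f:
    text = f.read()
```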

The second issue arises when using Azure OpenAI endpoints that are privately hosted and may not conform to public URL patterns. In the GraphRAG class and the `_llm.py` utility functions, the API endpoint can end up malformed, because it is derived from the base_url of the primary ('best') model endpoint given in the .env file.

An example of the malformed endpoint is as follows.

If we have `AZURE_OPENAI_ENDPOINT="https://<path_to_api>/gpt-4o"`, then when utility functions such as `azure_openai_complete_if_cache` attempt to build the mini version of the endpoint, the result can be `https://<path_to_api>/gpt-4o/openai/deployments/gpt-4o-mini/chat/completions?api-version=2024-10-21`. This endpoint is malformed because the actual endpoint reads `https://<path_to_api>/gpt-4o-mini/openai/deployments/gpt-4o-mini/chat/completions?api-version=2024-10-21`. It is a minor issue in formatting the text of the endpoint but a major issue in effect, because the URL must be exactly correct or the request fails.

As a result, when attempting to perform insert operations with the GraphRAG class, the operation fails because the endpoint used for the cheap model function does not exist.
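To make the malformation concrete, here is an illustrative sketch (plain string formatting, not the Azure SDK; the host `example.invalid` and the helper `build_url` are hypothetical) of how the request URL is assembled from the base endpoint plus the deployment name:

```python
def build_url(endpoint: str, deployment: str,
              api_version: str = "2024-10-21") -> str:
    # The deployment-style Azure OpenAI URL pattern: the SDK appends the
    # /openai/deployments/... path to whatever base endpoint it is given.
    return (f"{endpoint}/openai/deployments/{deployment}"
            f"/chat/completions?api-version={api_version}")

# If the .env endpoint already encodes the *primary* deployment's path,
# reusing it for the mini deployment yields a malformed URL:
bad = build_url("https://example.invalid/gpt-4o", "gpt-4o-mini")

# What the privately hosted service actually expects:
good = build_url("https://example.invalid/gpt-4o-mini", "gpt-4o-mini")
```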

The proposed fix, specific to Azure OpenAI, is to allow the user to specify the mini endpoint (meant for the 'cheap' model) in the .env file and, if it is given, to use that endpoint throughout GraphRAG. The changes are primarily contained in `_llm.py`, where a new utility function, `get_azure_openai_mini_client_instance`, reads the environment variable `AZURE_OPENAI_MINI_ENDPOINT` and returns an AzureOpenAI client with the fully formed API endpoint. The default (else) branch returns the expected result from `get_azure_openai_async_client_instance`. In addition, the `azure_openai_complete_if_cache` function is modified to call the new function when the deployment name equals "gpt-4o-mini".
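The endpoint-selection behavior described above can be sketched as follows. This is a minimal sketch, not the actual PR code: `resolve_azure_endpoint` is a hypothetical name, client construction is elided, and the environment is passed in as a dict for testability.

```python
def resolve_azure_endpoint(deployment: str, env: dict) -> str:
    """Pick the Azure OpenAI endpoint for a given deployment.

    The cheap model ("gpt-4o-mini") uses AZURE_OPENAI_MINI_ENDPOINT when
    that variable is set; every other deployment, and the fallback when
    the mini variable is absent, uses AZURE_OPENAI_ENDPOINT.
    """
    if deployment == "gpt-4o-mini" and env.get("AZURE_OPENAI_MINI_ENDPOINT"):
        return env["AZURE_OPENAI_MINI_ENDPOINT"]
    return env["AZURE_OPENAI_ENDPOINT"]
```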

An operational test is provided in the test folder (test_api.py) to functionally validate that GraphRAG can reach and interact with both the primary and mini model endpoints. A docstring is included to show an example of the expected .env parameters.
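For reference, a hypothetical .env layout of the kind the test's docstring describes (the `<path_to_api>` and key placeholders are illustrative, following the examples above):

```shell
AZURE_OPENAI_ENDPOINT="https://<path_to_api>/gpt-4o"
AZURE_OPENAI_MINI_ENDPOINT="https://<path_to_api>/gpt-4o-mini"
AZURE_OPENAI_API_KEY="<your_api_key>"
```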
