Hello maintainers @kartikpersistent @jexp @prakriti-solankey,
Firstly, thank you for building this product. I am running llm-graph-builder locally on one of my servers, and I am having trouble scaling my current deployment beyond 150 research papers.
Beyond 150 papers, the /sources_list API takes a long time (approx. 5 minutes) to respond and load the initial list in the UI. After that, extraction runs for an indefinite period when a new file is added (even for small files of around 50 KB).
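For context, this is roughly how I am timing the endpoint (a minimal sketch; the backend URL, port, and form fields are assumptions based on my local Docker setup and may need adjusting):

```python
import time

import requests

# Minimal latency check for the /sources_list endpoint.
# Host/port and form fields below are assumptions from my local
# Docker setup; adjust to match your deployment.
BACKEND_URL = "http://localhost:8000/sources_list"

payload = {
    "uri": "bolt://localhost:7687",  # Neo4j bolt URI (placeholder)
    "userName": "neo4j",             # placeholder credentials
    "password": "password",
    "database": "neo4j",
}

start = time.perf_counter()
resp = requests.post(BACKEND_URL, data=payload, timeout=600)
elapsed = time.perf_counter() - start

print(f"status={resp.status_code} elapsed={elapsed:.1f}s")
```

With ~150 papers ingested, this consistently reports around 300 seconds of elapsed time for me.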
My configuration is as follows:
- Setup: Docker
- LLM: Llama-4-scout (deployed on another server with 2 x NVIDIA H100); average inference time is 83 seconds per token
- LLM Graph Builder server config:
    Model name: AMD Ryzen Threadripper PRO 7985WX 64-Cores
    CPU family: 25
    Model: 24
    Thread(s) per core: 2
    Core(s) per socket: 64
    Socket(s): 1
    Stepping: 1
    Frequency boost: enabled
    CPU max MHz: 8240.6250
    CPU min MHz: 1500.0000
    BogoMIPS: 6390.47
- Neo4j database: Community Edition instance deployed on
    Model name: AMD EPYC-Milan Processor
    CPU family: 25
    Model: 1
    Thread(s) per core: 1
    Core(s) per socket: 1
    Socket(s): 8
    Stepping: 1
    BogoMIPS: 3992.49
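For scale reference, this is the kind of check I run against the Neo4j instance to see how big the graph has grown (a sketch using the official Python driver; the URI and credentials are placeholders, and the Document/Chunk labels are what I see in the graph my deployment builds):

```python
from neo4j import GraphDatabase

# Quick size check on the graph. URI and credentials are
# placeholders for my Community Edition instance.
driver = GraphDatabase.driver(
    "bolt://localhost:7687", auth=("neo4j", "password")
)

with driver.session(database="neo4j") as session:
    # One Document node per uploaded paper, many Chunk nodes each.
    docs = session.run("MATCH (d:Document) RETURN count(d) AS n").single()["n"]
    chunks = session.run("MATCH (c:Chunk) RETURN count(c) AS n").single()["n"]
    print(f"Document nodes: {docs}, Chunk nodes: {chunks}")

driver.close()
```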
I want to scale the system to nearly 10k papers and would appreciate any tips and tricks from you all. Please help me get past this issue.
Thank you.
Best.