Academic Researcher

An AI-powered tool for writing well-cited, citation-checked academic paragraphs. It combines multiple LLMs with internet search and fetch APIs to gather and analyze sources and synthesize them into scientific text.

Features

Multi-model approach using:
- o3-mini - efficient reasoning LLM
- Claude 3.5 Sonnet - latest Claude 3.5 Sonnet model
- Google Gemini 2.0 Pro - latest Gemini Pro model
- Google Gemini 2.0 Flash with thinking - latest Gemini model with "thinking"
- DeepSeek-R1 - DeepSeek's reasoning LLM, accessed via the Fireworks.ai API (chosen as a more stable, faster and more secure alternative to the original DeepSeek API)
Iterative search with adaptive query generation
Full-text, citation-by-citation validation and metadata enrichment
Comprehensive reference management
Quality review and content revision

Prerequisites

Python 3.8+
Required API keys (see .env.example)
Chrome/Chromium for web scraping capabilities

Installation

Clone the repository:

git clone https://github.com/SahPet/Academic_researcher.git
cd Academic_researcher

Install dependencies:

pip install -r requirements.txt

Rename .env.example to .env and add your API keys (each key is required unless you comment out the part of the code that uses it):

OPENAI_API_KEY: Obtain from OpenAI API Keys
GEMINI_API_KEY: Obtain from Google Gemini API Keys
SERPER_API_KEY: Obtain from Serper API Keys
FIREWORKS_API_KEY: Fast and stable DeepSeekR1 API - obtain from Fireworks API Keys
SCRAPINGBEE_API_KEY: Obtain from ScrapingBee API Keys
PARSEHUB_API_KEY and PARSEHUB_PROJECT_TOKEN: Obtain from ParseHub API Documentation
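Before a long run it can help to confirm that every key is actually visible to Python. The sketch below is not part of the repository: missing_keys is a hypothetical helper, and it assumes the values from .env have already been loaded into the process environment by whatever mechanism the main script uses. Only the key names come from the list above.

```python
import os

# Key names are taken from the list above; the helper itself is hypothetical.
REQUIRED_KEYS = [
    "OPENAI_API_KEY",
    "GEMINI_API_KEY",
    "SERPER_API_KEY",
    "FIREWORKS_API_KEY",
    "SCRAPINGBEE_API_KEY",
    "PARSEHUB_API_KEY",
    "PARSEHUB_PROJECT_TOKEN",
]


def missing_keys(env=None):
    """Return the required key names that are unset or blank in `env`."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_KEYS if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_keys()
    if missing:
        print("Missing API keys:", ", ".join(missing))
    else:
        print("All API keys are set.")
```

If you have commented out the code that uses a given service, remove that key from REQUIRED_KEYS so the check matches your setup.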

Install Chrome/Chromium for Selenium support

Usage

Optional parameters (edit these in the main Python file):

BASE_OUTPUT_FOLDER: Default is C:\research_outputs
MAX_SEARCH_ROUNDS: Default is 7
MAX_REFERENCES: Default is 12
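As a rough illustration, the three settings could look like this as module-level constants. The names and default values are the ones listed above; the comments on what each limit bounds are inferred from the names, and ensure_output_folder is a hypothetical helper, not a function from the repository.

```python
from pathlib import Path

# Defaults as documented above.
BASE_OUTPUT_FOLDER = r"C:\research_outputs"
MAX_SEARCH_ROUNDS = 7   # presumably the upper bound on search iterations
MAX_REFERENCES = 12     # presumably the upper bound on collected references


def ensure_output_folder(base=BASE_OUTPUT_FOLDER):
    """Create the output folder if needed and return it as a Path.

    Hypothetical helper: the real script may create its folders differently.
    """
    folder = Path(base).expanduser()
    folder.mkdir(parents=True, exist_ok=True)
    return folder
```

The default path is Windows-specific. On Linux or macOS you would likely want to change BASE_OUTPUT_FOLDER to something like "~/research_outputs", which the expanduser() call in this sketch resolves to your home directory.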

Running the main script:

python research_crew_NO_CREWAI_simplified_github.py

When prompted, enter your research question. The tool will:

Generate search queries
Gather relevant academic references
Synthesize content with proper citations
Validate each citation against the downloaded full text and add extra references if needed
Produce and iteratively improve the academic text
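The first two steps can be pictured as a bounded loop in which each round's queries depend on what earlier rounds found. The sketch below is an illustration under stated assumptions, not the repository's implementation: gather_references, fake_generate and fake_search are hypothetical names, the two callables stand in for the LLM and search API calls, and results are assumed to be dicts with a "url" field.

```python
from typing import Callable, Dict, List

MAX_SEARCH_ROUNDS = 7
MAX_REFERENCES = 12


def gather_references(
    question: str,
    generate_queries: Callable[[str, List[Dict]], List[str]],
    search: Callable[[str], List[Dict]],
    max_rounds: int = MAX_SEARCH_ROUNDS,
    max_refs: int = MAX_REFERENCES,
) -> List[Dict]:
    """Collect unique references over several search rounds.

    Each round passes the references found so far to `generate_queries`,
    so later queries can adapt to what is still missing.
    """
    refs: List[Dict] = []
    seen_urls = set()
    for _ in range(max_rounds):
        queries = generate_queries(question, refs)
        if not queries:
            break  # the query generator has nothing new to ask
        for query in queries:
            for hit in search(query):
                url = hit.get("url")
                if url and url not in seen_urls:
                    seen_urls.add(url)
                    refs.append(hit)
                    if len(refs) >= max_refs:
                        return refs  # reference budget reached
    return refs


# Stand-ins for the LLM and the search API, for demonstration only.
def fake_generate(question, found):
    return [f"{question} round {len(found)}"] if len(found) < 3 else []


def fake_search(query):
    return [{"url": "https://example.org/" + query.replace(" ", "-"), "title": query}]


refs = gather_references("auto annotation", fake_generate, fake_search)
print(len(refs))
```

Passing the query generator and search function in as arguments keeps the loop testable without network access or API keys.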

Output Structure

Example Input

What are the current best strategies for auto annotation of semantic segmentation?

Example Output

Current best strategies for auto annotation of semantic segmentation datasets center on minimizing the labor-intensive process of dense pixel-level labeling by leveraging a combination of weak and noisy supervision, active learning, and human-in-the-loop correction. For instance, recent frameworks incorporating semi-supervised learning with uncertainty-aware active sampling have demonstrated that segmentation models can achieve competitive Dice scores while significantly reducing the manual annotation workload, with some studies suggesting reductions on the order of 50‒70% compared to fully supervised methods (Wang et al., 2021; Zhang et al., 2025).

Additionally, empirical research indicates that using point-based and image-level annotations can accelerate the labeling process and help mitigate common human errors without compromising segmentation accuracy (Fernández-Moreno, 2023; Zhang et al., 2025). Emerging zero-shot approaches also harness self-supervised techniques—as evidenced by the scalability of masked autoencoders for feature learning—to automatically generate annotations without extensive manual input (He et al., 2022; Xie et al., 2022).

Furthermore, the integration of automated pre-annotation models, such as the Segment Anything Model (SAM), with selective human verification has shown promise for efficiently handling uncertain or complex cases, although further research is needed to validate its generalizability across diverse imaging modalities (Kirillov et al., 2023).

References:

Fernández-Moreno, M. (2023). Exploring the trade-off between performance and annotation in deep learning: An engineering perspective. Engineering Applications of Artificial Intelligence. Retrieved from https://openreview.net/pdf?id=jMiZegbLUe
He, K., Chen, X., Xie, S., Li, Y., Dollár, P., & Girshick, R. (2022). Masked autoencoders are scalable vision learners. In 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (pp. 15979‒15988). https://doi.org/10.1109/CVPR52688.2022.01553
Kirillov, A., Mintun, E., Ravi, N., Mao, H., Rolland, C., Gustafson, L., Xiao, T., Whitehead, S., Berg, A. C., Lo, W.-Y., et al. (2023). Segment Anything. In Proceedings of the IEEE/CVF International Conference on Computer Vision (pp. 4015‒4026).
Wang, S., Li, C., Liu, Z., & Wang, R. (2021). Annotation-efficient deep learning for automatic medical image segmentation. Nature Communications, 12(1). Retrieved from https://www.nature.com/articles/s41467-021-26216-9
Xie, Z., Zhang, Z., Cao, Y., Lin, Y., Bao, J., Yao, Z., Dai, Q., & Hu, H. (2022). Simmim: A simple framework for masked image modeling. In 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (pp. 9643‒9653). https://doi.org/10.1109/CVPR52688.2022.00943
Zhang, Y., Zhao, S., Gu, H., & Mazurowski, M. A. (2025). How to efficiently annotate images for best-performing deep learning-based segmentation models: An empirical study with weak and noisy annotations and Segment Anything Model. Journal of Imaging Informatics in Medicine. Retrieved from https://pubmed.ncbi.nlm.nih.gov/39843720

Results are saved under BASE_OUTPUT_FOLDER (C:\research_outputs by default), in subfolders containing:

Content drafts and final version
Search results and reference data
Citation validations
Quality review feedback

Contributing

Contributions welcome!

License

MIT License - see LICENSE file for details
