If you maintain a master (global) BibTeX file of all your entries, it may be too large to load into an LLM: a file with 600-800 or more entries can exceed the model's context limit. This Emacs package exports subsets of BibTeX entries selected by keyword. I find this faster inside Emacs than with a GUI-driven interface to the master BibTeX file, such as JabRef.
The package streamlines the management of large BibTeX bibliographies by extracting the references that match keyword criteria. The resulting subsets are easier to work with when building a BibTeX collection for a specific manuscript or annotated bibliography. They are also helpful when working with AI-powered writing assistants or large language models (LLMs), which have limited context windows and cannot handle BibTeX files with thousands of entries. Using a subset of entries is also easier than training an LLM on your corpus of files.
Think of the package as a filter for your reference library, much as a librarian pulls relevant books from different sections based on your research topic: it selects the BibTeX entries that match the criteria you specify.
- Extract entries using keyword searches across configurable field sets
- Support for complex multi-word search phrases
- Boolean logic operations (AND/OR) for combining multiple search terms
- Flexible field targeting for precision filtering
- Customizable export destinations of the selected BibTeX entries with user-defined paths
- Real-time feedback displaying the count of extracted entries
- Seamless integration with existing Emacs workflows
- Optimized for integration with modern AI writing assistants
- Context-aware subset creation for improved LLM interactions
- Clone this repository.
- Load the file `bibexport.el` into a new buffer.
- Before using, set the path to your master BibTeX library file and reevaluate the updated buffer:
(defvar bibexport-bibtex-master-file "~/Documents/global.bib")
- Evaluate this buffer:
M-x eval-buffer
- Get the list of interactive functions in the minibuffer by entering `bibexport-` at the `M-x` prompt (presumably, you are using the vertico and orderless packages). You should see one interactive function. Select it and answer the series of prompts.
bibexport-export-bibtex-entries-by-multikeywords
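If `bibexport.el` has already been evaluated in the current session, editing the `defvar` line and running `M-x eval-buffer` again may not change the path, because `defvar` leaves an already-bound variable untouched. A minimal sketch of one way around this, assuming the variable name shown above and an example path that you should replace with your own:

```elisp
;; Assumption: bibexport.el has already been evaluated or loaded.
;; The path is only an example; point it at your own master BibTeX file.
(setq bibexport-bibtex-master-file "~/Documents/global.bib")
```

Evaluating this form in the scratch buffer, or placing it in your init file after the package is loaded, sets the variable regardless of its earlier value.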
Add this to your `init.el` file and reload Emacs or evaluate in the scratch buffer.
Straight will `git clone` this repo and store it in the `repos` subfolder of your `.emacs.d` folder.
You have to run `straight-pull-all` to pull any updates.
(use-package bibexport
  :straight t
  '(:type git
    :repo "https://github.com/MooersLab/export-select-bibtex-entries-el.git"
    :files ("bibexport.el")))
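straight.el also documents a form in which the recipe list is passed directly to `:straight`, with the package name as its first element. The following is a sketch of that form using the same repository URL and file list as above. It has not been tested against this package, so treat it as an assumption rather than the supported configuration:

```elisp
;; Sketch only: the same recipe written in straight.el's list form.
;; The repository URL and :files entry are copied from the configuration above.
(use-package bibexport
  :straight (bibexport
             :type git
             :repo "https://github.com/MooersLab/export-select-bibtex-entries-el.git"
             :files ("bibexport.el")))
```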
The functions will always be available.
Edit the path in this function to suit.
Add it to your `init.el` file or evaluate the function in your scratch buffer for a quick start.
Inspired by https://sachachua.com/dotemacs/index.html#org4dd39d0.
(defun bibexport-functions-load ()
  "Load bibexport.el file."
  (interactive)
  (let ((file-path "~/6112MooersLabGitHubLabRepos/export-select-bibtex-entries-el/bibexport.el"))
    (if (file-exists-p (expand-file-name file-path))
        (load-file file-path)
      (message "Cannot find bibexport.el file"))))
- Launch the selection interface:
M-x bibexport-export-bibtex-entries-by-multikeywords
- Specify your search keywords (separate multiple terms with semicolons)
- Configure field targeting options (title, author, year, keywords, etc.)
- Select Boolean logic for multi-keyword searches (AND/OR operations)
- Choose your preferred export format
- Define the output file path and name
- Review the extraction summary in the minibuffer
Keywords: machine learning; neural networks
Fields: title, abstract, keywords
Logic: OR
Output: ~/research/ml-references.bib
Result: 2222 entries exported
- You can set the path to the master BibTeX file.
- You can change the BibTeX fields to be searched.
- A keyword can be a multi-word phrase.
- If there are multiple keywords, they are separated by semicolons.
- With more than one keyword, the keywords can be joined by simple Boolean AND/OR logic.
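The keyword rules above can be illustrated with a small, self-contained sketch. This is not the code in `bibexport.el`: the function names `my/bib-split-keywords` and `my/bib-entry-matches-p` are hypothetical, and the entry is assumed to be an alist of field names and values rather than raw BibTeX text. It only shows how semicolon-separated phrases and AND/OR logic might be combined:

```elisp
;; Hypothetical helpers for illustration; not part of bibexport.el.
(require 'cl-lib)

(defun my/bib-split-keywords (input)
  "Split INPUT on semicolons and trim whitespace around each keyword."
  (split-string input ";" t "[ \t\n]+"))

(defun my/bib-entry-matches-p (entry keywords fields logic)
  "Return non-nil if ENTRY matches KEYWORDS in FIELDS under LOGIC.
ENTRY is an alist of (FIELD . VALUE) strings, KEYWORDS is a list of
strings, FIELDS is a list of field names, and LOGIC is `and' or `or'."
  (let* ((text (mapconcat (lambda (field)
                            (or (cdr (assoc field entry)) ""))
                          fields "\n"))
         (case-fold-search t)           ; match case-insensitively
         (hit-p (lambda (keyword)
                  ;; Treat each keyword as a literal phrase, not a regexp.
                  (string-match-p (regexp-quote keyword) text))))
    (if (eq logic 'and)
        (cl-every hit-p keywords)
      (cl-some hit-p keywords))))
```

For example, `(my/bib-split-keywords "machine learning; neural networks")` yields two phrases, each of which keeps its internal space, and the resulting list can be passed to `my/bib-entry-matches-p` with `'and` or `'or` to decide whether an entry belongs in the exported subset.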
Target specific bibliographic fields to refine your searches:
- Author names and affiliations
- Publication titles and abstracts
- Journal names and conference proceedings
- Publication years and date ranges
- Keywords and subject classifications
- DOI identifiers and URLs
We encourage community contributions to enhance this tool's functionality:
- Fork the repository to your GitHub account.
- Create a dedicated feature branch for your modifications.
- Implement your improvements with clear, documented code.
- Commit changes with descriptive messages.
- Submit a pull request detailing your enhancements.
- Open an issue under the Issues tab for technical problems or feature suggestions
The code in `bibexport.el` was developed iteratively with the help of GPT 4.1.
The code was tested in Emacs 30.1.
The function works as advertised when installed with straight.
Version | Changes | Date |
---|---|---|
Version 0.1 | Added badges, funding, and update table. Initial commit. | 05/23/2025 |
Version 0.1.1 | Corrected a minor bug. Updated README.md. | 05/24/2025 |
Version 0.1.2 | Added missing t after straight keyword in configuration for bibexport. | 06/14/2025 |
- NIH: R01 CA242845
- NIH: R01 AI088011
- NIH: P30 CA225520 (PI: R. Mannel)
- NIH: P20 GM103640 and P30 GM145423 (PI: A. West)