This repository contains two Python scripts that allow you to convert Web of Science bibliographic data between the following formats:
- TabDelimited.txt → WOS.xlsx and PlainText.txt
- Filtered WOS.xlsx → Filtered TabDelimited.txt and Filtered PlainText.txt
These tools are particularly useful when you:
- Need to extract cited references (which the default WOS Excel export omits).
- Filter documents (for example, during a PRISMA screening process) and need to export the cleaned dataset.
WOS_Converter_TabDelimited_to_xlsx_PlainText.py
Purpose: Converts the WOS TabDelimited.txt export file to:
- A full-featured Excel file (WOS.xlsx)
- A plain-text Web of Science format file (PlainText.txt) that includes full tagging (e.g., AU, CR)
Use this script right after downloading from Web of Science.
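The mapping from a tab-delimited row to a tagged record can be sketched as follows. This is a minimal illustration, not the repository's actual script: it assumes the header row of the export carries the two-letter WOS tags, that multi-value fields are separated by semicolons, and the names `MULTI_VALUE_TAGS`, `row_to_tagged`, and `tab_delimited_to_tagged` are hypothetical. File-level header lines that a real export carries are omitted.

```python
import csv
import io

# Assumption: these fields hold several values separated by ";" in the
# tab-delimited export and are written one value per line in tagged text.
MULTI_VALUE_TAGS = {"AU", "AF", "CR"}


def row_to_tagged(row):
    """Render one record (dict of WOS tag -> value) as tagged plain text."""
    lines = []
    for tag, value in row.items():
        # Skip unnamed columns (e.g. from a trailing tab) and missing cells.
        if not tag or not isinstance(value, str):
            continue
        value = value.strip()
        if tag in MULTI_VALUE_TAGS:
            parts = [p.strip() for p in value.split(";")]
        else:
            parts = [value]
        parts = [p for p in parts if p]
        if not parts:
            continue
        lines.append(f"{tag} {parts[0]}")
        # Continuation lines are indented by three spaces.
        lines.extend(f"   {p}" for p in parts[1:])
    lines.append("ER")  # end-of-record marker
    return "\n".join(lines)


def tab_delimited_to_tagged(text):
    """Convert the full text of a tab-delimited export to tagged records."""
    reader = csv.DictReader(
        io.StringIO(text), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    records = [row_to_tagged(row) for row in reader]
    return "\n\n".join(records) + "\n\nEF\n"  # EF marks end of file
```

When reading a real export from disk, opening it with `encoding="utf-8-sig"` strips a leading byte-order mark if one is present, so the first column name is not corrupted.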
WOS_Converter_Filtered_xlsx_to_TabDelimitedText_PlainText.py
Purpose: After filtering your Excel file (e.g., manually or via PRISMA), use this script to:
- Reconstruct the tab-delimited format (TabDelimited.txt)
- Recreate the tagged plain-text format (PlainText.txt) for further processing
Use this after filtering your Excel file (WOS_Filtered.xlsx) to retain only relevant records.
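Rebuilding the tab-delimited file from filtered rows can be sketched like this. It is an illustration under assumptions, not the script itself: the rows are taken to be already loaded from the filtered workbook as dictionaries keyed by WOS tag (for instance via pandas, with missing cells replaced by empty strings first), and `records_to_tab_delimited` is a hypothetical helper name.

```python
def records_to_tab_delimited(records, columns):
    """Serialize records (dicts of WOS tag -> value) as tab-delimited text.

    `columns` fixes the column order; keys not listed there are ignored.
    """
    lines = ["\t".join(columns)]  # header row of two-letter tags
    for record in records:
        cells = []
        for column in columns:
            value = record.get(column)
            text = "" if value is None else str(value)
            # Collapse embedded tabs and newlines so each record stays
            # on one line with the expected number of columns.
            cells.append(" ".join(text.split()))
        lines.append("\t".join(cells))
    return "\n".join(lines) + "\n"
```

Passing the column list explicitly keeps the output in the same order as the original export even if columns were rearranged while filtering in Excel.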
- Clone this repository:
    git clone https://github.com/yourusername/wos-format-converter.git
    cd wos-format-converter
- Install dependencies:
    pip install pandas
- Run the appropriate script based on your workflow:
    python WOS_Converter_TabDelimited_to_xlsx_PlainText.py
    # OR
    python WOS_Converter_Filtered_xlsx_to_TabDelimitedText_PlainText.py
Initial Conversion:
- Input: TabDelimited.txt
- Output: WOS.xlsx, PlainText.txt
After Filtering:
- Input: WOS_Filtered.xlsx
- Output: TabDelimited_Filtered.txt, PlainText_Filtered.txt
The TabDelimited.txt file generated by this tool is suitable for VOSviewer, a software tool for constructing and visualizing bibliometric networks.
Cite VOSviewer as:
Van Eck, N. J., & Waltman, L. (2010). Software survey: VOSviewer, a computer program for bibliometric mapping. Scientometrics, 84(2), 523–538.
https://www.vosviewer.com
The PlainText.txt output is compatible with the Bibliometrix R package and its web-based interface Biblioshiny for comprehensive science mapping analysis.
Cite Bibliometrix as:
Aria, M., & Cuccurullo, C. (2017). bibliometrix: An R-tool for comprehensive science mapping analysis. Journal of Informetrics, 11(4), 959–975.
https://www.bibliometrix.org
The PlainText.txt output is also compatible with CiteSpace, a Java-based application for visualizing and analyzing trends and patterns in scientific literature.
To ensure compatibility with CiteSpace:
- Export records in Plain Text format with Full Record and Cited References
- Name files as download_*.txt (e.g., download_1.txt)
- Use data from supported sources like Web of Science or Scopus
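The naming requirement above can be met by copying the generated file into a project data folder under a `download_*.txt` name. The sketch below uses only the standard library; the function name `stage_for_citespace` and the folder layout are assumptions for illustration, not part of this repository.

```python
import shutil
from pathlib import Path


def stage_for_citespace(plain_text_file, data_dir, index=1):
    """Copy a plain-text export into data_dir as download_<index>.txt."""
    data_dir = Path(data_dir)
    data_dir.mkdir(parents=True, exist_ok=True)  # create the folder if needed
    target = data_dir / f"download_{index}.txt"
    shutil.copyfile(plain_text_file, target)  # the original file is left in place
    return target
```

For example, `stage_for_citespace("PlainText.txt", "citespace_project/data")` would produce `citespace_project/data/download_1.txt`.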
Cite CiteSpace as:
Chen, C. (2006). CiteSpace II: Detecting and visualizing emerging trends and transient patterns in scientific literature. Journal of the American Society for Information Science and Technology, 57(3), 359–377.
https://doi.org/10.1002/asi.20317
- The conversion preserves essential Web of Science tags (AU, TI, CR, etc.)
- Cited references (CR) are correctly included in the plain-text output, unlike in the native WOS Excel export
- Column mapping is based on the official WOS format standard
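One quick way to confirm that cited references survived the conversion is to count how many records in the plain-text output carry a CR field. The sketch below assumes each record starts with a `PT` line and that a field's first line begins with its tag followed by a space; `count_records_with_tag` is a hypothetical helper, not part of the scripts.

```python
def count_records_with_tag(text, tag="CR"):
    """Return (total_records, records_containing_tag) for tagged plain text."""
    total = 0
    with_tag = 0
    seen_in_current = False
    for line in text.splitlines():
        if line.startswith("PT "):
            # A PT line opens a new record.
            total += 1
            seen_in_current = False
        elif line.startswith(tag + " ") and not seen_in_current:
            # Count each record at most once, even if the tag repeats.
            with_tag += 1
            seen_in_current = True
    return total, with_tag
```

If the second number is far below the first, the input was probably exported without the Cited References option.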
Created by Nasser Khalili
If you use this tool in your research, feel free to give a ⭐ or cite the repository.
This project is licensed under the MIT License - see the LICENSE file for details.