Commit 6a210fe

Merge pull request #9 from pescheckit/feature_added-more-providers
Added more providers
2 parents 572f803 + 8dce90c commit 6a210fe

12 files changed, with 1,821 additions and 219 deletions.

.flake8

Lines changed: 2 additions & 1 deletion
@@ -1,3 +1,4 @@
 [flake8]
 max-line-length = 120
-ignore = W293
+ignore = W293
+exclude = python_gpt_po/tests

.github/workflows/python-ci-package.yml

Lines changed: 4 additions & 1 deletion
@@ -6,6 +6,9 @@ on:
       - main
     tags:
       - '*'
+  pull_request:
+    branches:
+      - main
   release:
     types: [published]
 
@@ -61,7 +64,7 @@ jobs:
         pip install -r requirements.txt
     - name: Run tests
       run: |
-        pytest python_gpt_po/tests/test_po_translator.py
+        pytest
 
   deploy:
     needs: test

.pylintrc

Lines changed: 1 addition & 1 deletion
@@ -6,4 +6,4 @@ logging-format-style = new
 disable = C0103,R0903,E1101,E1205,W0703
 
 [SIMILARITIES]
-ignore-paths=src/customersatisfactionmetrics/migrations
+ignore-paths=python_gpt_po/tests

README.md

Lines changed: 61 additions & 64 deletions
@@ -1,119 +1,116 @@
 # Python GPT-4 PO File Translator
 
-This Python script provides a robust and flexible tool for translating `.po` files using OpenAI's GPT-4 model. It accommodates various translation modes, handles fuzzy entries, and integrates batch processing for larger projects, making it suitable for diverse `.po` file structures and sizes.
+A robust tool for translating gettext (.po) files using AI models from multiple providers (OpenAI, Anthropic / Claude, and DeepSeek). It supports both bulk and individual translations, handles fuzzy entries, and can infer target languages based on folder structures.
 
 ## Features
 
-- **Bulk and Individual Translation Modes**: Allows efficient bulk translation or precise, entry-by-entry translations for nuanced content.
-- **Detailed Language Option (`--detail-lang`)**: Supports using full language names (e.g., "Netherlands, German") alongside shortcodes (e.g., `nl, de`), ensuring clarity in translation prompts.
-- **Configurable Batch Size**: Set the number of entries to translate per batch during bulk translation, optimizing API usage.
-- **Fuzzy Entry Management**: Automatically removes fuzzy flags and entries, ensuring only valid translations are processed.
-- **Language Inference from Folder Structure**: Infers the target language from the folder structure, reducing the need for explicit language specifications.
-- **Translation Validation and Retry Logic**: Built-in mechanisms validate translations and automatically retry to avoid incorrect or verbose translations.
-- **Logging for Transparency**: Detailed logging for monitoring, debugging, and ensuring progress throughout the translation process.
-- **OpenAI API Key Management**: Supports environment variables or command-line arguments for securely providing OpenAI API credentials.
-- **Retry Mechanism for Failed Translations**: Retries failed translations up to three times, reducing incomplete or incorrect outputs.
-- **Post-Processing for Concise Translations**: Automatically reviews translations to ensure they are concise and free of unnecessary explanations or repetitions.
+- **Multi-Provider Support:** Integrates with OpenAI, Anthropic / Claude, and DeepSeek.
+- **Bulk & Individual Modes:** Translate entire files in batches or process entries one by one.
+- **Fuzzy Entry Management:** Automatically removes fuzzy entries to ensure clean translations.
+- **Folder-Based Language Inference:** Detects the target language from directory structure.
+- **Customizable Batch Size:** Configure the number of entries per translation request.
+- **Retry & Validation:** Automatic retries and validation to ensure concise, correct translations.
+- **Detailed Logging:** Transparent logging for progress monitoring and debugging.
+- **Flexible API Key Configuration:** Supply API keys via environment variables or command-line arguments.
+- **Detailed Language Option:** Use full language names (e.g., "German") for clearer prompts alongside language codes (e.g., de).
 
 ## Requirements
 
 - Python 3.x
-- `polib` library (for `.po` file handling)
-- `openai` Python package (for integration with OpenAI GPT models)
-- `tenacity` library (for retry mechanisms)
-- `python-dotenv` (for managing environment variables)
+- [polib](https://pypi.org/project/polib/)
+- [openai](https://pypi.org/project/openai/)
+- [tenacity](https://pypi.org/project/tenacity/)
+- [python-dotenv](https://pypi.org/project/python-dotenv/)
 
 ## Installation
 
 ### Via PyPI
 
-Install the `gpt-po-translator` package directly from PyPI:
-
 ```bash
 pip install gpt-po-translator
 ```
 
 ### Manual Installation
 
-For manual installation or working with the latest code from the repository:
+Clone the repository and install the package:
 
-1. Clone the repository:
-```bash
-git clone [repository URL]
-```
-2. Navigate to the cloned directory and install the package:
-```bash
-pip install .
-```
+```bash
+git clone https://github.com/yourusername/python-gpt-po.git
+cd python-gpt-po
+pip install .
+```
 
 ## API Key Configuration
 
-The `gpt-po-translator` supports two methods for providing OpenAI API credentials:
+You can provide your API key in two ways:
+
+### Environment Variable
 
-1. **Environment Variable**: Set your OpenAI API key as an environment variable named `OPENAI_API_KEY`. This method is recommended for security and ease of API key management.
+Set your OpenAI API key:
 
-```bash
-export OPENAI_API_KEY='your_api_key_here'
-```
+```bash
+export OPENAI_API_KEY='your_api_key_here'
+```
 
-2. **Command-Line Argument**: Pass the API key as a command-line argument using the `--api_key` option.
+### Command-Line Argument
 
-```bash
-gpt-po-translator --folder ./locales --lang de,fr --api_key 'your_api_key_here' --bulk --bulksize 100 --folder-language
-```
+Pass your API key directly when invoking the tool:
 
-Make sure your API key is securely stored and not exposed in public spaces or repositories.
+```bash
+gpt-po-translator --folder ./locales --lang de,fr --api_key 'your_api_key_here' --bulk --bulksize 100 --folder-language
+```
 
 ## Usage
 
-Use `gpt-po-translator` as a command-line tool for translating `.po` files:
+Run the tool from the command line to translate your .po files:
 
 ```bash
-gpt-po-translator --folder [path_to_po_files] --lang [language_codes] [--api_key [your_openai_api_key]] [--fuzzy] [--bulk] [--bulksize [batch_size]] [--folder-language] [--detail-lang [full_language_names]]
+gpt-po-translator --folder <path_to_po_files> --lang <language_codes> [options]
 ```
 
 ### Example
 
+Translate .po files in the `./locales` folder to German and French:
+
 ```bash
 gpt-po-translator --folder ./locales --lang de,fr --api_key 'your_api_key_here' --bulk --bulksize 40 --folder-language --detail-lang "German,French"
 ```
 
-This command translates `.po` files in the `./locales` folder to German and French, using the provided OpenAI API key and processing 40 translations per batch in bulk mode. It also infers the language from the folder structure.
+## Documentation
 
-### Command-Line Options
+For a detailed explanation of all available parameters and a deep dive into the internal workings of the tool, please see our [Advanced Usage Guide](docs/usage.md).
 
-- `--folder`: Specifies the input folder containing `.po` files.
-- `--lang`: Comma-separated language codes to filter `.po` files (e.g., `de,fr`).
-- `--detail-lang`: Optional argument for full language names, matching the order of `--lang` (e.g., "German,French").
-- `--fuzzy`: Removes fuzzy entries before processing.
-- `--bulk`: Enables bulk translation mode for faster processing.
-- `--bulksize`: Sets the batch size for bulk translation (default is 50).
-- `--model`: Specifies the OpenAI model to use for translations (default is `gpt-3.5-turbo-0125`).
-- `--api_key`: OpenAI API key. Can be provided through the command line or as an environment variable.
-- `--folder-language`: Infers the target language from the folder structure.
+## Command-Line Options
 
-## Detailed Language Names and Shortcodes
+- `--folder`: Path to the directory containing .po files.
+- `--lang`: Comma-separated target language codes (e.g., de,fr).
+- `--detail-lang`: Comma-separated full language names corresponding to the codes (e.g., "German,French").
+- `--fuzzy`: Remove fuzzy entries before processing.
+- `--bulk`: Enable bulk translation mode.
+- `--bulksize`: Set the number of entries per bulk translation (default is 50).
+- `--model`: Specify the translation model (defaults are provider-specific).
+- `--api_key`: API key for translation; can also be provided via environment variable.
+- `--folder-language`: Infer the target language from the folder structure.
 
-The `--detail-lang` option complements `--lang` by allowing you to specify full language names (e.g., `Netherlands,German`) instead of language shortcodes. The full names are then used in the context of OpenAI prompts, improving clarity for the GPT model.
+## Logging & Error Handling
 
-Example usage:
+- **Logging:** Detailed logs track the translation process and help with debugging.
+- **Error Handling:** The tool automatically retries failed translations (up to three times) and validates output to prevent overly verbose responses.
 
-```bash
-gpt-po-translator --folder ./locales --lang nl,de --detail-lang "Netherlands,German"
-```
+## Testing
 
-## Logging
+To run all tests:
 
-The script logs detailed information about the files being processed, the number of translations, and batch details in bulk mode. Logs are essential for monitoring progress, debugging issues, and ensuring transparency throughout the translation process.
+```bash
+python -m pytest
+```
 
-## Error Handling and Retries
+## License
 
-The script includes robust error handling and retries to ensure reliable translation:
+This project is licensed under the MIT License. See the [LICENSE](LICENSE) file for details.
 
-- **Failed Translations**: Automatically retries failed translations up to three times.
-- **Empty Translations**: If an empty translation is returned, the script will attempt to translate the text again using an alternative approach.
-- **Lengthy or Incorrect Translations**: Translations that are too long or contain explanations instead of direct translations are flagged and retried.
+## About
 
-## License
+Powered by state-of-the-art AI models (including OpenAI’s GPT-4 and GPT-3.5), this tool is designed to streamline the localization process for .po files. Whether you need to process large batches or handle specific entries, the Python GPT-4 PO File Translator adapts to your translation needs.
 
-This project is licensed under the MIT License. See the [LICENSE](LICENSE) file for details.
+For more details, contributions, or bug reports, please visit our [GitHub repository](https://github.com/yourusername/python-gpt-po).
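
The updated README describes three behaviors in prose only: supplying the API key by environment variable or `--api_key`, splitting entries into batches of `--bulksize` (default 50), and inferring the target language from the folder structure. The Python sketch below illustrates one plausible reading of each. The function names, the choice to let the command-line value win over `OPENAI_API_KEY`, and the rule of matching a folder name against the requested language codes are all assumptions made for illustration; none of them is taken from the project's code in this commit.

```python
import os
from pathlib import Path
from typing import Iterator, Mapping, Optional, Sequence


def resolve_api_key(cli_value: Optional[str],
                    env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the --api_key value if given, else OPENAI_API_KEY from the environment.

    Assumption: the command-line value takes precedence; the README does not say.
    """
    return cli_value or env.get("OPENAI_API_KEY")


def batches(entries: Sequence[str], size: int = 50) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of at most `size` entries (the --bulksize idea)."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(entries), size):
        yield entries[start:start + size]


def infer_language(po_path: str, languages: Sequence[str]) -> Optional[str]:
    """Return the first folder name in the path that matches a requested code.

    Assumption: folders are named exactly after the language code, e.g. locales/de/.
    """
    for part in Path(po_path).parent.parts:
        if part in languages:
            return part
    return None
```

With these hypothetical helpers, `infer_language("locales/de/LC_MESSAGES/django.po", ["de", "fr"])` picks `de` from the path, and `batches(entries, 40)` mirrors the `--bulksize 40` setting used in the README example.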
