Fix #110 formatting in README for consistency and clarity #118

lazzyms · 2025-08-11T12:22:31Z

Fixes: #110

Problem

The Ollama example code in README.md was missing the required language_model_type parameter, causing users to encounter the error: "API key must be provided for cloud-hosted models via the api_key parameter or the LANGEXTRACT_API_KEY environment variable" when trying to run the example.

Solution

Added the missing language_model_type=inference.OllamaLanguageModel parameter to the Ollama integration example in README.md.

Changes

- File modified: README.md
- Change: Added language_model_type=inference.OllamaLanguageModel parameter to Ollama example code
- Impact: Users can now run the Ollama example without API key errors, as the parameter correctly identifies it as a local model

Testing

- Verified the updated example code is valid Python syntax
- Confirmed parameter formatting is consistent with other examples in the README
- Validated the example now runs without the previous API key error

Type of Change

Bug fix (non-breaking change which fixes an issue)
Documentation update
New feature
Breaking change

This documentation fix ensures users can successfully follow the Ollama integration guide by including the required parameter that distinguishes local models from cloud-hosted ones.

- Add GitHub Actions workflow for automated PyPI publishing via OIDC
- Configure trusted publishing environment for verified releases
- Update project metadata with proper URLs and license format
- Prepare for v1.0.0 stable release with production-ready automation

…s: (google#10)

* docs: clarify output_dir behavior in medication_examples.md
* Removed inline comment in medication example

  Deleted an inline comment referencing the output directory in the save_annotated_documents.
* docs: add output_dir="." to all save_annotated_documents examples

  Prevents confusion from default `test_output/...` by explicitly saving to current directory.
* build: add formatting & linting pipeline with pre-commit integration
* style: apply pyink, isort, and pre-commit formatting
* ci: enable format and lint checks in tox
* Add LangExtractError base exception for centralized error handling

  Introduces a common base exception class that all library-specific exceptions inherit from, enabling users to catch all LangExtract errors with a single except clause.
* fix(ui): prevent current highlight border from being obscured

---------

Co-authored-by: Leena Kamran <62442533+kleeena@users.noreply.github.com>
Co-authored-by: Akshay Goel <akshay.k.goel@gmail.com>

- Gemini & OpenAI test suites with retry on transient errors
- CI: Separate job, Python 3.11 only, skips for forks
- Validates char_interval for all extractions
- Multilingual test xfail (issue google#13)

TODO: Remove xfail from multilingual test after tokenizer fix

…e#62)

- Add quickstart example and documentation for local LLM usage
- Include Docker setup with health checks and docker-compose
- Add integration tests and update CI pipeline
- Secure setup: localhost-only binding, containerized deployment

Signed-off-by: Akshay Goel <goelak@google.com>

Bumps the github_actions group with 1 update in the /.github/workflows directory: [tj-actions/changed-files](https://github.com/tj-actions/changed-files).

Updates `tj-actions/changed-files` from 44 to 46
- [Release notes](https://github.com/tj-actions/changed-files/releases)
- [Changelog](https://github.com/tj-actions/changed-files/blob/main/HISTORY.md)
- [Commits](tj-actions/changed-files@v44...v46)

---
updated-dependencies:
- dependency-name: tj-actions/changed-files
  dependency-version: '46'
  dependency-type: direct:production
  dependency-group: github_actions
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

…e#74)

- Add check-linked-issue.yml: Enforces that PRs reference issues with 5+ community reactions
- Add check-pr-size.yml: Labels PRs by size and enforces 1000 line limit
- Update CONTRIBUTING.md: Document new PR requirements and size guidelines
- Include helpful error messages with links to contribution guidelines
- Create a scalable system for maintaining code quality and review efficiency

Enables two ways to run live API tests:
1. workflow_dispatch: Manual trigger via Actions tab
2. Label trigger: Add 'ready-to-merge' label to any PR

The label-based approach uses pull_request_target for security:
- Runs in base repository context with access to secrets
- Safely merges PR into main branch before testing
- Only maintainers can trigger
- Comments test results back to PR

This provides a production-ready solution for testing PRs from forks while maintaining security, following patterns used by major projects.

Bug: Workflows triggered on pull_request_target but checked for pull_request, causing all validations to be skipped.

Fixed:
- Event condition checks now match trigger type
- Add manual revalidation workflow
- Enable workflow_dispatch with PR number input

- revalidate-all-prs.sh: Triggers manual validation for all open PRs
- add-size-labels.sh: Adds size labels (XS/S/M/L/XL) based on change count
- add-new-checks.sh: Adds required status checks to branch protection

These scripts require maintainer permissions and help manage PR workflows.

- Add type ignore comments for IPython imports
- Fix return type annotation (remove unnecessary quotes)
- Add _is_jupyter() to properly detect notebook environments
- Replace lambda with def function for pylint compliance

Fixes google#65

- Fix empty interval bug when newline falls at chunk boundary (issue google#71)
- Add concise comment explaining the fix logic
- Remove excessive/obvious comments from chunking tests
- Improve test docstring to be more descriptive and professional

…le (google#97)

Introduces a provider registry system enabling third-party providers to be dynamically registered and discovered through a plugin architecture. Users can now integrate custom LLM backends (Azure OpenAI, AWS Bedrock, custom inference servers) without modifying core LangExtract code.

Fixes google#80, google#67, google#54, google#49, google#48, google#53

Key Changes:

**Provider Registry** (`langextract/providers/registry.py`)
- Pattern-based registration with priority resolution
- Automatic discovery via Python entry points
- Lazy loading for performance

**Factory Enhancements** (`langextract/factory.py`)
- `ModelConfig` dataclass for structured configuration
- Explicit provider selection when patterns overlap
- Full backward compatibility maintained

**Plugin Example** (`examples/custom_provider_plugin/`)
- Complete working example with entry point configuration
- Shows how to create custom providers for any backend

**Documentation**
- Comprehensive provider system README with architecture diagrams
- Step-by-step plugin creation guide

**Dependencies**
- Move openai to optional dependencies
- Update tox.ini to include openai in test environments

**Lint Fixes**
- Add appropriate pylint suppressions for legitimate patterns
- Fix unused variable warnings in tests
- Address import and global statement warnings

No anticipated breakage - full backward compatibility maintained. Given significant internal changes to provider loading, issues should be reported if unexpected behavior is encountered.
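As a rough illustration of what "pattern-based registration with priority resolution" can look like, here is a minimal, self-contained sketch. It is not the code in `langextract/providers/registry.py`; the `Registry` class, its methods, and the provider names are all hypothetical and chosen only to show the idea that the highest-priority matching pattern wins when patterns overlap.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Entry:
    """One registration: a regex over model IDs, a provider name, a priority."""
    pattern: str
    provider: str
    priority: int


class Registry:
    """Hypothetical registry; not the real LangExtract implementation."""

    def __init__(self) -> None:
        self._entries: List[Entry] = []

    def register(self, pattern: str, provider: str, priority: int = 0) -> None:
        self._entries.append(Entry(pattern, provider, priority))

    def resolve(self, model_id: str) -> str:
        matches = [e for e in self._entries if re.search(e.pattern, model_id)]
        if not matches:
            raise LookupError(f"no provider registered for {model_id!r}")
        # Highest priority wins; on a tie, max() keeps the earliest registration.
        return max(matches, key=lambda e: e.priority).provider


registry = Registry()
registry.register(r"^gemini", "GeminiProvider")              # illustrative names
registry.register(r"^gpt-", "OpenAIProvider")
registry.register(r"^gpt-4", "AzureOpenAIProvider", priority=10)
```

With these registrations, `registry.resolve("gpt-4o")` matches two patterns and the higher-priority entry is returned, while `registry.resolve("gpt-3.5-turbo")` matches only the generic `^gpt-` pattern.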

- Add proper permissions (issues: write for comments)
- Skip draft PRs to avoid noise
- Prevent duplicate comments with hidden marker
- Search both title and body for issue links
- Support all keyword variants and cross-repo references
- Count unique users for reactions, not total count
- Include 'write' permission for maintainer override
- Add concurrency control for rapid edits
- Handle cross-repo issues gracefully
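The "count unique users for reactions, not total count" item can be sketched as follows. This is an assumption-laden illustration in Python rather than the workflow's actual script: the reaction records are simplified to dicts with a hypothetical `user` key, and the default of 5 mirrors the "5+ community reactions" rule mentioned in the check-linked-issue commit.

```python
from typing import Iterable, Mapping


def count_unique_reactors(reactions: Iterable[Mapping[str, str]]) -> int:
    """Count distinct users, so several reactions from one user count once."""
    return len({reaction["user"] for reaction in reactions})


def meets_threshold(reactions: Iterable[Mapping[str, str]], minimum: int = 5) -> bool:
    """True when at least `minimum` different users reacted."""
    return count_unique_reactors(reactions) >= minimum


# Hypothetical sample: three reactions, but only two distinct users.
sample = [
    {"user": "alice", "content": "+1"},
    {"user": "alice", "content": "heart"},
    {"user": "bob", "content": "+1"},
]
```

Counting total reactions would report 3 for `sample`; counting unique users reports 2, which is the behaviour the commit describes.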

github-actions · 2025-08-11T19:57:10Z

⚠️ Branch Update Required

Your branch is 1 commit behind main. Please update your branch to ensure CI checks run with the latest code:

git fetch origin main
git merge origin/main
git push

Note: Enable "Allow edits by maintainers" to allow automatic updates.

github-actions · 2025-08-19T02:32:31Z

⚠️ Branch Update Required

Your branch is 26 commits behind main. Please update your branch to ensure CI checks run with the latest code:

git fetch origin main
git merge origin/main
git push

Note: Enable "Allow edits by maintainers" to allow automatic updates.

aksg87 · 2025-08-21T05:32:21Z

Thanks for the contribution and PR. The language_model_type parameter is going to be deprecated, as noted in recent docstring updates. Using the model config is going to be the recommended way of interacting with models. Closing this PR for now, but please raise the issue again if it comes up. There should be an update soon that handles passing all parameters to model kwargs more gracefully.

aksg87 and others added 30 commits July 22, 2025 01:39

docs(pypi): Improve README display and badge reliability

2ce2399

- Switch from badge.fury.io to shields.io for working PyPI badge
- Convert relative paths to absolute GitHub URLs for PyPI compatibility
- Bump version to 0.1.3

Fix: Resolve libmagic ImportError (google#6)

e696a48

- Add pylibmagic>=0.5.0 dependency for bundled libraries
- Add [full] install option and pre-import handling
- Update README with troubleshooting and Docker sections
- Bump version to 1.0.1

Fixes google#6

docs: clarify output_dir behavior in medication_examples.md

5447637

Merge pull request google#11 from google/fix/libmagic-dependency-issue

9c47b34

Removed inline comment in medication example

175e075

Deleted an inline comment referencing the output directory in the save_annotated_documents.

Merge pull request google#15 from kleeena/docs/update-medication_exam…

9472099

…ples.md docs: clarify output_dir behavior in medication_examples.md

docs: add output_dir="." to all save_annotated_documents examples

e6c3dcd

Prevents confusion from default `test_output/...` by explicitly saving to current directory.

Merge pull request google#17 from google/fix/output-dir-consistency

1fb1f1d

docs: add output_dir="." to all save_annotated_documents examples

build: add formatting & linting pipeline with pre-commit integration

13fbd2c

style: apply pyink, isort, and pre-commit formatting

c8d2027

ci: enable format and lint checks in tox

146a095

Merge pull request google#24 from google/feat/code-formatting-pipeline

aa6da18

feat: add code formatting and linting pipeline

Add LangExtractError base exception for centralized error handling

ed65bca

Introduces a common base exception class that all library-specific exceptions inherit from, enabling users to catch all LangExtract errors with a single except clause.
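A minimal sketch of the pattern this commit describes: one base class that every library-specific exception inherits from, so callers need a single except clause. Only the name `LangExtractError` comes from the commit; the subclasses and the `run` and `safe_run` helpers are hypothetical and exist purely for illustration.

```python
class LangExtractError(Exception):
    """Base class for all library-specific errors (name taken from the commit)."""


# Hypothetical subclasses, invented here only to demonstrate the hierarchy.
class InferenceError(LangExtractError):
    """Raised when a model call fails."""


class SchemaError(LangExtractError):
    """Raised when output does not match the expected schema."""


def run(step: str) -> str:
    """Toy function that raises a different subclass depending on the step."""
    if step == "infer":
        raise InferenceError("model call failed")
    if step == "schema":
        raise SchemaError("unexpected output shape")
    return "ok"


def safe_run(step: str) -> str:
    """Catch every library error with one except clause on the base class."""
    try:
        return run(step)
    except LangExtractError as err:
        return f"{type(err).__name__}: {err}"
```

Because the handler names only the base class, adding further subclasses later does not require touching calling code, while unrelated exceptions such as `KeyError` still propagate.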

Merge pull request google#26 from google/feat/exception-hierarchy

6c4508b

Add LangExtractError base exception for centralized error handling

fix: Remove LangFun and pylibmagic dependencies (v1.0.2)

8b85225

Fixes google#25 - Windows installation failure due to pylibmagic build requirements

Breaking change: LangFunLanguageModel removed. Use GeminiLanguageModel or OllamaLanguageModel instead.

Merge pull request google#28 from google/fix/remove-breaking-dep-langfun

88520cc

fix: Remove LangFun and pylibmagic dependencies to fix Windows installation and OpenAI SDK v1.x compatibility

Fix save_annotated_documents to handle string paths

75a6f12

- Modified save_annotated_documents to accept both pathlib.Path and string paths
- Convert string paths to Path objects before calling mkdir()
- This fixes the error when using output_dir='.' as shown in the README example
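The fix described above can be sketched roughly like this. It is not the body of `save_annotated_documents`; `ensure_output_dir` is a hypothetical helper that isolates the one relevant step, namely normalising the argument to a `pathlib.Path` before calling `mkdir()`, since a plain `str` has no `mkdir` method.

```python
import pathlib
from typing import Union


def ensure_output_dir(output_dir: Union[str, pathlib.Path]) -> pathlib.Path:
    """Return output_dir as a Path, creating the directory if it is missing."""
    # pathlib.Path() accepts both str and Path, so either input type works.
    path = pathlib.Path(output_dir)
    path.mkdir(parents=True, exist_ok=True)
    return path
```

Calling `ensure_output_dir(".")` therefore behaves the same as `ensure_output_dir(pathlib.Path("."))`, which is the case the README example relies on.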

Merge pull request google#29 from google/fix-save-annotated-documents…

a415b94

…-mkdir Fix save_annotated_documents to handle string paths

feat: Add OpenAI language model support

8289b3a

Merge pull request google#31 from google/feature/add-oai-inference

c8ef723

feat: Add OpenAI language model support

Add PR template validation workflow (google#45)

dc61372

fix: Change OllamaLanguageModel parameter from 'model' to 'model_id' (g…

da771e6

…oogle#57) Fixes google#27

feat: Add CITATION.cff file for proper software citation

e83d5cf

chore: Bump version to 1.0.4 for release

a7ef0bd

- Ollama integration with Docker examples
- Fixed OllamaLanguageModel parameter name (model -> model_id)
- Added CI/CD tests for Ollama
- Updated documentation with consistent API examples

aksg87 and others added 22 commits August 6, 2025 09:38

Add base_url to OpenAILanguageModel (google#51)

234081e

* Add base_url to OpenAILanguageModel
* Github action lint is outdated, so adapting
* Adding base_url to parameterized test
* Lint fixes to inference_test.py

Add commit status to revalidation workflow

6fb66cf

- Creates visible PR checks (pass/fail status)
- Shows validation errors in status description (up to 140 chars)
- Links to workflow run for full details
- Maintains backward compatibility with comment reporting

Fix boolean comparison in revalidation workflow

47a251e

The workflow was comparing boolean true to string 'true', causing all validations to incorrectly show as failed even when all checks passed.

Fix CI to validate PR branch formatting directly

e6dcc8e

- Add format-check job that checks actual PR code, not merge commit
- Validate formatting before expensive fork PR tests
- Provide clear error messages when formatting fails

Fixes false positives where incorrectly formatted PRs passed CI

Add PR update automation workflows

1c3c1a2

Auto-updates PRs behind main, handles forks/conflicts gracefully, skips bot/draft PRs, monitors API limits

Fix workflow formatting

b60f0b2

- Apply end-of-file and whitespace fixes to workflows

Bump version to 1.0.5

b3bff86

Remove duplicate exceptions.py from root directory (google#94)

f3c1553

The exceptions.py file existed in both the root directory and langextract/ directory with identical content. This removes the duplicate from the root to avoid confusion and maintain proper package structure.

Fix unicode escaping in example generation (google#98)

845258c

Update provider documentation

c8aa788

Update .gitignore with additional development patterns

f069d6f

Add common development files, tools, and temporary file patterns

Update custom provider example to clarify planned model passing feature

0c08fd1

- Show current approach using factory.create_model()
- Add note that direct model passing to extract() is coming soon
- Keep planned API as commented code for reference

Fix lazy loading for provider pattern registration (google#113)

1a25621

Ensure providers are loaded before pattern matching to prevent API key errors when using local models. Optimize to skip loading when provider is explicitly specified.
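The ordering issue this commit fixes can be illustrated with a small stand-alone sketch. Everything here is hypothetical (the `STATE` dict, the prefix table, and both function names); it only shows the two behaviours the commit message names: patterns are loaded before any matching happens, and loading is skipped entirely when the caller names a provider explicitly.

```python
from typing import Dict, Optional

PATTERNS: Dict[str, str] = {}                 # model-id prefix -> provider name
STATE = {"loaded": False, "load_calls": 0}    # counter only makes behaviour observable


def load_builtin_providers() -> None:
    """Register default patterns once; later calls are no-ops."""
    if STATE["loaded"]:
        return
    STATE["load_calls"] += 1
    # Illustrative mappings, not the library's real pattern table.
    PATTERNS.update({"gemini": "gemini", "gemma": "ollama"})
    STATE["loaded"] = True


def resolve_provider(model_id: str, provider: Optional[str] = None) -> str:
    if provider is not None:
        return provider            # explicit choice: no loading, no matching
    load_builtin_providers()       # must happen before patterns are consulted
    for prefix, name in PATTERNS.items():
        if model_id.startswith(prefix):
            return name
    raise LookupError(f"no provider pattern matches {model_id!r}")
```

If matching ran before loading, a local model ID would find no pattern and fall through to some other default, which is consistent with the API key error reported in #110 for local Ollama setups.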

Add tests for provider plugin system (google#114)

8989620

- 6 tests: plugin discovery, loading, idempotency, error handling
- Smart CI triggers for integration test on provider changes
- New tox environments: plugin-smoke and plugin-integration

Fix google#110 formatting in README for consistency and clarity

0407072

github-actions bot added the size/XS label (Pull request with less than 50 lines changed) Aug 11, 2025

lazzyms marked this pull request as ready for review August 11, 2025 12:24

lazzyms mentioned this pull request Aug 11, 2025

[bug] langextract still asking for API key on local Ollama API setup #110

Closed

aksg87 force-pushed the main branch from e36e455 to 3dff0d3 Compare August 21, 2025 01:43

aksg87 closed this Aug 21, 2025
