6.1.0 #14636
DevinTDHa announced in Announcement
📢 Spark NLP 6.1.0: State-of-the-art LLM Capabilities and Advancing Universal Ingestion
We are excited to announce Spark NLP 6.1.0, another milestone for building scalable, distributed AI pipelines! This major release significantly enhances our capabilities for state-of-the-art multimodal and large language models and universal data ingestion. Upgrade Spark NLP to 6.1.0 to improve both usability and performance across ingestion, inference, and multimodal processing pipelines, all within the native Spark ecosystem.
🔥 Highlights
llama.cpp Integration: We've updated our llama.cpp backend to tag b5932, which supports inference with the latest generation of LLMs.
Reader2Doc: Introducing a new annotator that streamlines loading and integrating diverse file formats (PDFs, Word, Excel, PowerPoint, HTML, Text, Email, Markdown) directly into Spark NLP pipelines with a unified and flexible interface.
🚀 New Features & Enhancements
Large Language Models (LLMs)
llama.cpp Upgrade: Our llama.cpp backend has been upgraded to version b5932. This update enables native inference for the newest LLMs, such as Gemma 3 and Phi-4, ensuring broader model compatibility and improved performance.
Note: The AutoGGUFVisionModel annotator has not yet been upgraded to the latest backend, so it is not available in this version. As a workaround, please use version 6.0.5 of Spark NLP.
Document Ingestion
Reader2Doc Annotator: This new annotator provides a simplified, unified interface for integrating various Spark NLP readers. It supports a wide range of formats, including PDFs, plain text, HTML, Word (.doc/.docx), Excel (.xls/.xlsx), PowerPoint (.ppt/.pptx), email files (.eml, .msg), and Markdown (.md).
Let's use a code example to see how easy it is to use:
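A minimal sketch of what a Reader2Doc pipeline stage might look like. It assumes the Python class is importable as `sparknlp.reader.reader2doc.Reader2Doc` and exposes `setContentType`, `setContentPath`, and `setOutputCol`; please verify these against the API reference for your version. The `CONTENT_TYPES` table and both helper functions are hypothetical names introduced for illustration, and whether Reader2Doc accepts each of these MIME strings is likewise an assumption.

```python
from pathlib import Path

# Hypothetical lookup table: standard MIME types for the extensions listed above.
# Whether Reader2Doc accepts each of these strings is an assumption.
CONTENT_TYPES = {
    ".pdf": "application/pdf",
    ".txt": "text/plain",
    ".html": "text/html",
    ".doc": "application/msword",
    ".docx": "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
    ".xls": "application/vnd.ms-excel",
    ".xlsx": "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
    ".ppt": "application/vnd.ms-powerpoint",
    ".pptx": "application/vnd.openxmlformats-officedocument.presentationml.presentation",
    ".eml": "message/rfc822",
    ".msg": "application/vnd.ms-outlook",
    ".md": "text/markdown",
}


def content_type_for(path: str) -> str:
    """Return the MIME type for a file path, based on its extension."""
    suffix = Path(path).suffix.lower()
    if suffix not in CONTENT_TYPES:
        raise ValueError(f"Unsupported file extension: {suffix!r}")
    return CONTENT_TYPES[suffix]


def build_reader_stage(content_path: str, content_type: str, output_col: str = "document"):
    """Build a Reader2Doc stage (assumed API; requires Spark NLP 6.1.0)."""
    # Imported lazily so the helpers above work without Spark installed.
    from sparknlp.reader.reader2doc import Reader2Doc  # assumed module path

    return (
        Reader2Doc()
        .setContentType(content_type)
        .setContentPath(content_path)
        .setOutputCol(output_col)
    )
```

With a Spark NLP session running, `build_reader_stage("./pdf-files", content_type_for("report.pdf"))` (both paths are placeholders) would return a stage that can go first in a `pyspark.ml.Pipeline`, ahead of the other NLP stages.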
Check out our full example notebook to see it in action.
🐛 Bug Fixes
❤️ Community Support
Installation
Python
Spark Packages
spark-nlp on Apache Spark 3.0.x, 3.1.x, 3.2.x, 3.3.x, and 3.4.x (Scala 2.12):
GPU
Apple Silicon (M1 & M2)
AArch64
Maven
spark-nlp on Apache Spark 3.0.x, 3.1.x, 3.2.x, 3.3.x, and 3.4.x:
spark-nlp-gpu:
spark-nlp-silicon:
spark-nlp-aarch64:
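The package names above correspond to Maven coordinates, which can also be passed to `spark-shell`, `pyspark`, or `spark-submit` through the `--packages` option. The sketch below shows how those coordinates are composed, assuming the `com.johnsnowlabs.nlp` group ID and the Scala 2.12 suffix mentioned under Spark Packages. `maven_coordinate` is a hypothetical helper, not part of Spark NLP.

```python
GROUP_ID = "com.johnsnowlabs.nlp"  # Spark NLP's Maven group ID
SCALA_VERSION = "2.12"             # Scala build named in the Spark Packages section
ARTIFACTS = ("spark-nlp", "spark-nlp-gpu", "spark-nlp-silicon", "spark-nlp-aarch64")


def maven_coordinate(artifact: str, version: str = "6.1.0") -> str:
    """Compose a group:artifact_scala:version coordinate for one of the packages."""
    if artifact not in ARTIFACTS:
        raise ValueError(f"Unknown artifact: {artifact!r}")
    return f"{GROUP_ID}:{artifact}_{SCALA_VERSION}:{version}"


if __name__ == "__main__":
    # Print one coordinate per package variant.
    for name in ARTIFACTS:
        print(maven_coordinate(name))
```

For example, the default CPU package resolves to `com.johnsnowlabs.nlp:spark-nlp_2.12:6.1.0`, so a shell could be started with `spark-shell --packages com.johnsnowlabs.nlp:spark-nlp_2.12:6.1.0`.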
FAT JARs
What's Changed
Full Changelog: 6.0.5...6.1.0
This discussion was created from the release 6.1.0.