Hopsworks Python library installation documentation improvements #431
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,56 +1,115 @@ | ||
--- | ||
description: Documentation on how to install the Hopsworks and HSFS Python libraries, including the specific requirements for Mac OSX and Windows. | ||
description: Documentation on how to install the Hopsworks Python and Java libraries. | ||
--- | ||
# Client Installation Guide | ||
|
||
## Hopsworks (including Feature Store and MLOps) | ||
The Hopsworks client library is required to connect to the Hopsworks Feature Store and MLOps services from your local machine or any other Python environment such as Google Colab or AWS SageMaker. Execute the following command to install the full Hopsworks client library in your Python environment: | ||
## Hopsworks Python library | ||
|
||
The Hopsworks Python client library is required to connect to Hopsworks from your local machine or any other Python environment such as Google Colab or AWS SageMaker. Execute the following command to install the Hopsworks client library in your Python environment: | ||
|
||
!!! note "Virtual environment" | ||
It is recommended to use a virtual Python environment instead of the system environment used by your operating system, in order to avoid conflicts between dependencies. | ||
|
||
```bash | ||
pip install hopsworks | ||
``` | ||
Supported versions of Python: 3.8, 3.9, 3.10, 3.11, 3.12 ([PyPI ↗](https://pypi.org/project/hopsworks/)) | ||
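As a quick sanity check before installing, the supported-version list above can be compared against the running interpreter. The following is a minimal sketch and not part of the Hopsworks library: the helper name `is_supported_python` is hypothetical, and the version set is copied by hand from the list above, so it may go stale as new releases are published on PyPI.

```python
import sys

# Minor versions listed as supported in this guide (copied by hand;
# check PyPI for the current list).
SUPPORTED_VERSIONS = {(3, 8), (3, 9), (3, 10), (3, 11), (3, 12)}


def is_supported_python(version_info=None):
    """Return True if the (major, minor) pair is in the supported set."""
    if version_info is None:
        version_info = sys.version_info
    return (version_info[0], version_info[1]) in SUPPORTED_VERSIONS


if __name__ == "__main__":
    major, minor = sys.version_info[:2]
    status = "supported" if is_supported_python() else "not listed as supported"
    print(f"Python {major}.{minor} is {status}")
```

Running the script inside the virtual environment you plan to use shows whether that interpreter falls within the listed range.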
|
||
!!! attention "OSX Installation" | ||
The latest version of Hopsworks should work on OSX systems without any additional requirements. However, if you are installing an older version of the Hopsworks SDK, you might need to install `librdkafka` manually. Check the documentation for the specific version you are installing. | ||
|
||
!!! attention "Windows/Conda Installation" | ||
|
||
On Windows systems you might need to install twofish manually before installing hopsworks, if you don't have the Microsoft Visual C++ Build Tools installed. In that case, it is recommended to use a conda environment and run the following commands: | ||
|
||
```bash | ||
conda install twofish | ||
pip install hopsworks | ||
pip install hopsworks[python] | ||
``` | ||
|
||
## Feature Store only | ||
To only install the Hopsworks Feature Store client library, execute the following command: | ||
```bash | ||
pip install hopsworks[python] | ||
``` | ||
Supported versions of Python: 3.8, 3.9, 3.10, 3.11, 3.12 ([PyPI ↗](https://pypi.org/project/hopsworks/)) | ||
|
||
### Profiles | ||
|
||
The Hopsworks library has several profiles that bring additional dependencies and enable additional functionalities: | ||
|
||
| Profile Name | Description | | ||
| ------------------ | ------------- | | ||
| No Profile | This is the base installation. It supports interacting with the feature store metadata, model registry and deployments. It also supports reading from and writing to the feature store from PySpark environments. | | ||
| `python` | This profile enables reading from and writing to the feature store from a Python environment. | | ||
| `great-expectations` | This profile installs the [Great Expectations](https://greatexpectations.io/) Python library and enables data validation on feature pipelines | | ||
| `polars` | This profile installs the [Polars](https://pola.rs/) library and enables reading and writing Polars DataFrames | | ||
|
||
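To see which optional profiles are already usable in an environment, one option is to check whether the libraries they bring in can be located, without actually importing them. This is only a sketch: the mapping from profile name to top-level module (`great_expectations`, `polars`) is an assumption based on the library names in the table above, and the `python` profile is left out because its dependencies are not listed here. The helper names are hypothetical.

```python
import importlib.util

# Assumed mapping from profile name to the top-level module it provides.
PROFILE_MODULES = {
    "great-expectations": "great_expectations",
    "polars": "polars",
}


def module_available(module_name):
    """Return True if the module can be located, without importing it."""
    return importlib.util.find_spec(module_name) is not None


def installed_profiles(mapping=None):
    """Return the sorted profile names whose module can be found."""
    if mapping is None:
        mapping = PROFILE_MODULES
    return sorted(
        name for name, module in mapping.items() if module_available(module)
    )


if __name__ == "__main__":
    print("Profiles that appear to be installed:", installed_profiles())
```

`importlib.util.find_spec` only locates a top-level module and does not execute it, so the check itself does not trigger the library's import-time code.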
You can install all the above profiles with the following command: | ||
|
||
```bash | ||
pip install hsfs[python] | ||
# or if using zsh | ||
pip install 'hsfs[python]' | ||
pip install hopsworks[python,great-expectations,polars] | ||
Review comment: Is Polars not just an extension of the python profile? Reply: It doesn't look like it from the project configuration: https://github.com/logicalclocks/hopsworks-api/blob/0c93c2ab91b3962a1bc45633def1e2320eb0f1d7/python/pyproject.toml#L91 - we had to put polars on its own profile because on certain VMs the required instruction set is not available and importing polars was crashing. |
||
``` | ||
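When the install command is generated from a script, the profile list and the shell quoting can be handled in one place. The sketch below assumes only the three profile names from the table above. `pip_install_command` is a hypothetical helper, not something shipped with Hopsworks. Quoting matters because shells such as zsh interpret square brackets as glob patterns.

```python
import shlex

# Profile names taken from the table in this guide.
KNOWN_PROFILES = ("python", "great-expectations", "polars")


def pip_install_command(profiles=()):
    """Build a shell-safe `pip install` command for the given profiles."""
    unknown = [p for p in profiles if p not in KNOWN_PROFILES]
    if unknown:
        raise ValueError(f"unknown profile(s): {', '.join(unknown)}")
    requirement = "hopsworks"
    if profiles:
        requirement += "[" + ",".join(profiles) + "]"
    # shlex.quote wraps the requirement in single quotes when it contains
    # shell-special characters such as square brackets.
    return "pip install " + shlex.quote(requirement)


if __name__ == "__main__":
    print(pip_install_command(["python", "polars"]))
```

With no profiles the helper returns the plain base installation command, and with profiles it returns the quoted form that is safe to paste into zsh or bash.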
Supported versions of Python: 3.8, 3.9, 3.10, 3.11, 3.12 ([PyPI ↗](https://pypi.org/project/hsfs/)) | ||
|
||
!!! attention "OSX Installation" | ||
The latest version of Hopsworks should work on OSX systems without any additional requirements. However, if you are installing an older version of the Hopsworks SDK, you might need to install `librdkafka` manually. Check the documentation for the specific version you are installing. | ||
## HSFS Java Library | ||
|
||
!!! attention "Windows/Conda Installation" | ||
If you want to interact with the Hopsworks Feature Store from environments such as Spark, Flink or Beam, you can use the Hopsworks Feature Store (HSFS) Java library. | ||
|
||
On Windows systems you might need to install twofish manually before installing hsfs, if you don't have the Microsoft Visual C++ Build Tools installed. In that case, it is recommended to use a conda environment and run the following commands: | ||
|
||
```bash | ||
conda install twofish | ||
pip install hsfs[python] | ||
``` | ||
!!!note "Feature Store Only" | ||
|
||
The Java library only allows interaction with the Feature Store component of the Hopsworks platform. Additionally, each environment might restrict the supported API operations. You can see which API operations are supported in each environment [here](../fs/compute_engines). | ||
|
||
The HSFS library is available in the Hopsworks Maven repository. If you are using Maven as your build tool, you can add the following to your `pom.xml` file: | ||
|
||
``` | ||
<repositories> | ||
<repository> | ||
<id>Hops</id> | ||
<name>Hops Repository</name> | ||
<url>https://archiva.hops.works/repository/Hops/</url> | ||
<releases> | ||
<enabled>true</enabled> | ||
</releases> | ||
<snapshots> | ||
<enabled>true</enabled> | ||
</snapshots> | ||
</repository> | ||
</repositories> | ||
``` | ||
|
||
The library has different builds targeting different environments: | ||
|
||
### Spark | ||
|
||
The `artifactId` for the Spark build is `hsfs-spark-spark{spark.version}`. If you are using Maven as your build tool, you can add the following dependency: | ||
|
||
``` | ||
<dependency> | ||
<groupId>com.logicalclocks</groupId> | ||
<artifactId>hsfs-spark-spark3.1</artifactId> | ||
<version>${hsfs.version}</version> | ||
</dependency> | ||
``` | ||
|
||
Hopsworks provides builds for Spark 3.1, 3.3 and 3.5. The builds are also provided as JAR files, which can be downloaded from the [Hopsworks repository](https://repo.hops.works/master/hsfs). | ||
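The naming pattern for the Spark build can be illustrated with a small helper that derives the `artifactId` from a Spark version string. This is a sketch under two assumptions: that `{spark.version}` stands for the major.minor version, as the `hsfs-spark-spark3.1` example suggests, and that only the three Spark versions named above have builds. The function name `spark_artifact_id` is hypothetical.

```python
# Spark versions for which this guide says builds are provided.
SUPPORTED_SPARK_VERSIONS = ("3.1", "3.3", "3.5")


def spark_artifact_id(spark_version):
    """Map a Spark version such as '3.5.1' to the matching artifactId."""
    parts = spark_version.split(".")
    if len(parts) < 2:
        raise ValueError(f"expected a version like '3.5' or '3.5.1', got {spark_version!r}")
    major_minor = ".".join(parts[:2])
    if major_minor not in SUPPORTED_SPARK_VERSIONS:
        raise ValueError(f"no build listed for Spark {major_minor}")
    return f"hsfs-spark-spark{major_minor}"


if __name__ == "__main__":
    print(spark_artifact_id("3.5.1"))
```

A patch-level version is reduced to its major.minor prefix, and a version without a listed build raises an error instead of producing an `artifactId` that may not exist.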
|
||
### Flink | ||
|
||
The `artifactId` for the Flink build is `hsfs-flink`. If you are using Maven as your build tool, you can add the following dependency: | ||
|
||
``` | ||
<dependency> | ||
<groupId>com.logicalclocks</groupId> | ||
<artifactId>hsfs-flink</artifactId> | ||
<version>${hsfs.version}</version> | ||
</dependency> | ||
``` | ||
|
||
### Beam | ||
|
||
The `artifactId` for the Beam build is `hsfs-beam`. If you are using Maven as your build tool, you can add the following dependency: | ||
|
||
``` | ||
<dependency> | ||
<groupId>com.logicalclocks</groupId> | ||
<artifactId>hsfs-beam</artifactId> | ||
<version>${hsfs.version}</version> | ||
</dependency> | ||
``` | ||
|
||
## Next Steps | ||
|
||
If you are using a local Python environment and want to connect to the Hopsworks Feature Store, you can follow the [Python Guide](../integrations/python.md#generate-an-api-key) section to create an API key and get started. | ||
If you are using a local Python environment and want to connect to Hopsworks, you can follow the [Python Guide](../integrations/python.md#generate-an-api-key) section to create an API key and get started. | ||
|
||
## Other environments | ||
|
||
|