Description
Is your feature request related to a problem? Please describe.
Currently, the `org.apache.hadoop:hadoop-aws` package is included in the default Spark configuration for the Python and R libraries. Most users do not need this dependency, and including it can introduce unnecessary complexity or dependency conflicts in environments that do not require AWS S3 support.
Describe the solution you'd like
Remove the `org.apache.hadoop:hadoop-aws` package from the default Spark configuration in both the Python and R libraries. Instead, users who require AWS S3 integration should be directed to follow the instructions provided in the documentation: https://pathling.csiro.au/docs/libraries/installation/spark
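To illustrate the opt-in approach, here is a minimal sketch of how a user might add the package themselves through Spark's `spark.jars.packages` setting, which takes a comma-separated list of Maven coordinates. The helper name `spark_packages`, the base coordinate placeholder, and the `hadoop-aws` version are assumptions for illustration only; the linked documentation is the authoritative source for the actual coordinates, and the `hadoop-aws` version should match the Hadoop version bundled with the Spark distribution in use.

```python
def spark_packages(base_packages, hadoop_aws_version=None):
    """Build a value for Spark's `spark.jars.packages` setting.

    Hypothetical helper: the hadoop-aws coordinate is appended only when
    a version is passed, so S3 support is opt-in rather than a default.
    """
    packages = list(base_packages)
    if hadoop_aws_version is not None:
        packages.append(f"org.apache.hadoop:hadoop-aws:{hadoop_aws_version}")
    return ",".join(packages)


# Placeholder coordinate; take the real one from the installation docs.
base = ["au.csiro.pathling:library-runtime:<version>"]

# Proposed default: no AWS dependency is downloaded.
default_packages = spark_packages(base)

# Opt-in for S3 users. "3.3.4" is an assumed example version.
s3_packages = spark_packages(base, hadoop_aws_version="3.3.4")

# With PySpark installed, the string would be passed to the session builder:
#   from pyspark.sql import SparkSession
#   spark = (
#       SparkSession.builder
#       .config("spark.jars.packages", s3_packages)
#       .getOrCreate()
#   )
```

Keeping the S3 coordinate out of the default list means the extra download and any version conflicts only affect users who explicitly ask for it.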
Describe alternatives you've considered
- Keeping the package by default: This is not ideal, as it adds overhead for users who do not need AWS support.
- Making the dependency optional/documented: This is preferable and can be achieved by providing clear instructions for users who want to add it themselves.
Additional context
Removing this dependency will simplify the default configuration and reduce the initial dependency download for most use cases. Users who require S3 support can still enable it by following the provided documentation link.