Remove hadoop-aws from default Spark configuration in Python and R libraries #2487

@johngrimes

Description

Is your feature request related to a problem? Please describe.
Currently, the org.apache.hadoop:hadoop-aws package is included in the default Spark configuration for the Python and R libraries. Most users do not need AWS S3 support, and bundling this dependency by default can add unnecessary complexity or dependency conflicts in environments that do not use it.

Describe the solution you'd like
Remove the org.apache.hadoop:hadoop-aws package from the default Spark configuration in both the Python and R libraries. Instead, users who require AWS S3 integration should be directed to follow the instructions provided in the documentation: https://pathling.csiro.au/docs/libraries/installation/spark

Describe alternatives you've considered

  • Keeping the package by default: not ideal, as it adds download and configuration overhead for users who do not need AWS support.
  • Making the dependency optional and documented: preferable, and achievable by providing clear instructions for users who want to add it themselves.

Additional context
Removing this dependency will simplify the default configuration and reduce the initial dependency download for most use cases. Users who require S3 support can still enable it by following the provided documentation link.
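For users who do need S3 support after the removal, the opt-in path might look something like the sketch below. This is an illustrative assumption rather than text from the linked documentation: the hadoop-aws version shown is a placeholder, and the commented PathlingContext line assumes the Pathling Python API accepts a pre-built session.

```python
# Hypothetical sketch: re-enabling S3 support by adding hadoop-aws to
# the Spark session yourself, after it is no longer bundled by default.
# The version "3.3.4" is illustrative; it should match the Hadoop
# version shipped with your Spark distribution.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .config("spark.jars.packages", "org.apache.hadoop:hadoop-aws:3.3.4")
    .getOrCreate()
)

# The session can then be handed to Pathling, e.g.:
# pc = PathlingContext.create(spark)
```

The exact package coordinates and any additional S3A settings (credentials provider, endpoint, etc.) are covered by the documentation linked above.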

Metadata

Labels

dependencies (pull requests that update a dependency file)

Status

Done
