Commit ad40148

Prepare pre-release 3.0.0b3 - Modin 0.16, major refactoring, improved partitioning write (#1682)
* Prepare pre-release 3.0.0b3 - Modin 0.16, major refactoring, improved partitioning write
1 parent 2ef2ac8 commit ad40148

26 files changed, with 1103 additions and 327 deletions

.bumpversion.cfg

Lines changed: 3 additions & 3 deletions
@@ -1,5 +1,5 @@
 [bumpversion]
-current_version = 3.0.0b2
+current_version = 3.0.0b3
 commit = False
 tag = False
 tag_name = {new_version}
@@ -61,6 +61,6 @@ first_value = 1
 
 [bumpversion:file:tutorials/023 - Flexible Partitions Filter.ipynb]
 
-[bumpversion:file:tutorials/034 - Distributing Calls using Ray]
+[bumpversion:file:tutorials/034 - Distributing Calls using Ray.ipynb]
 
-[bumpversion:file:tutorials/035 - Distributing Calls on Ray Remote Cluster.ipynb]
+[bumpversion:file:tutorials/035 - Distributing Calls on Ray Remote Cluster.ipynb]
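
For context, here is a minimal sketch of the search-and-replace step that a configuration like this drives. The helper `bump_files` and the in-memory file map are hypothetical stand-ins, not bumpversion's actual implementation. The sketch assumes that each `[bumpversion:file:...]` section names a file in which the current version string is replaced by the new one.

```python
from typing import Dict


def bump_files(files: Dict[str, str], current: str, new: str) -> Dict[str, str]:
    """Return a copy of `files` with `current` replaced by `new` in every file.

    `files` maps a file name to its text content (a stand-in for real files).
    Raises ValueError when a file does not mention the current version, so a
    stale entry is noticed instead of being silently skipped.
    """
    bumped = {}
    for name, text in files.items():
        if current not in text:
            raise ValueError(f"{current!r} not found in {name}")
        bumped[name] = text.replace(current, new)
    return bumped


# Hypothetical file contents modeled on two files touched by this commit.
files = {
    "VERSION": "3.0.0b2\n",
    "awswrangler/__metadata__.py": '__version__: str = "3.0.0b2"\n',
}
bumped = bump_files(files, current="3.0.0b2", new="3.0.0b3")
print(bumped["VERSION"].strip())
```

Under this model, an entry that points at a missing or misnamed file cannot be bumped. That is presumably why the `034` tutorial entry gains its `.ipynb` extension in this commit.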

CONTRIBUTING.md

Lines changed: 1 addition & 1 deletion
@@ -316,7 +316,7 @@ available_node_types:
 SubnetId: {replace with subnet within above AZs}
 
 setup_commands:
-- pip install "awswrangler[modin, ray]==3.0.0b2"
+- pip install "awswrangler[modin, ray]==3.0.0b3"
 - pip install pytest
 
```

CONTRIBUTING_COMMON_ERRORS.md

Lines changed: 3 additions & 3 deletions
@@ -13,9 +13,9 @@ Requirement already satisfied: pbr!=2.1.0,>=2.0.0 in ./.venv/lib/python3.7/site-
 Using legacy 'setup.py install' for python-Levenshtein, since package 'wheel' is not installed.
 Installing collected packages: awswrangler, python-Levenshtein
 Attempting uninstall: awswrangler
-Found existing installation: awswrangler 3.0.0b2
-Uninstalling awswrangler-3.0.0b2:
-Successfully uninstalled awswrangler-3.0.0b2
+Found existing installation: awswrangler 3.0.0b3
+Uninstalling awswrangler-3.0.0b3:
+Successfully uninstalled awswrangler-3.0.0b3
 Running setup.py develop for awswrangler
 Running setup.py install for python-Levenshtein ... error
 ERROR: Command errored out with exit status 1:

README.md

Lines changed: 37 additions & 37 deletions
@@ -11,7 +11,7 @@ Easy integration with Athena, Glue, Redshift, Timestream, OpenSearch, Neptune, Q
 
 > An [AWS Professional Service](https://aws.amazon.com/professional-services/) open source initiative | aws-proserve-opensource@amazon.com
 
-[![Release](https://img.shields.io/badge/release-3.0.0b2-brightgreen.svg)](https://pypi.org/project/awswrangler/)
+[![Release](https://img.shields.io/badge/release-3.0.0b3-brightgreen.svg)](https://pypi.org/project/awswrangler/)
 [![Python Version](https://img.shields.io/badge/python-3.7%20%7C%203.8%20%7C%203.9%20%7C%203.10-brightgreen.svg)](https://anaconda.org/conda-forge/awswrangler)
 [![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)
 [![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)
@@ -26,7 +26,7 @@ Easy integration with Athena, Glue, Redshift, Timestream, OpenSearch, Neptune, Q
 | **[PyPi](https://pypi.org/project/awswrangler/)** | [![PyPI Downloads](https://pepy.tech/badge/awswrangler)](https://pypi.org/project/awswrangler/) | `pip install awswrangler` |
 | **[Conda](https://anaconda.org/conda-forge/awswrangler)** | [![Conda Downloads](https://img.shields.io/conda/dn/conda-forge/awswrangler.svg)](https://anaconda.org/conda-forge/awswrangler) | `conda install -c conda-forge awswrangler` |
 
-> ⚠️ **For platforms without PyArrow 3 support (e.g. [EMR](https://aws-sdk-pandas.readthedocs.io/en/3.0.0b2/install.html#emr-cluster), [Glue PySpark Job](https://aws-sdk-pandas.readthedocs.io/en/3.0.0b2/install.html#aws-glue-pyspark-jobs), MWAA):**<br>
+> ⚠️ **For platforms without PyArrow 3 support (e.g. [EMR](https://aws-sdk-pandas.readthedocs.io/en/3.0.0b3/install.html#emr-cluster), [Glue PySpark Job](https://aws-sdk-pandas.readthedocs.io/en/3.0.0b3/install.html#aws-glue-pyspark-jobs), MWAA):**<br>
 ➡️ `pip install pyarrow==2 awswrangler`
 
 Powered By [<img src="https://arrow.apache.org/img/arrow.png" width="200">](https://arrow.apache.org/powered_by/)
@@ -44,7 +44,7 @@ Powered By [<img src="https://arrow.apache.org/img/arrow.png" width="200">](http
 
 Installation command: `pip install awswrangler`
 
-> ⚠️ **For platforms without PyArrow 3 support (e.g. [EMR](https://aws-sdk-pandas.readthedocs.io/en/3.0.0b2/install.html#emr-cluster), [Glue PySpark Job](https://aws-sdk-pandas.readthedocs.io/en/3.0.0b2/install.html#aws-glue-pyspark-jobs), MWAA):**<br>
+> ⚠️ **For platforms without PyArrow 3 support (e.g. [EMR](https://aws-sdk-pandas.readthedocs.io/en/3.0.0b3/install.html#emr-cluster), [Glue PySpark Job](https://aws-sdk-pandas.readthedocs.io/en/3.0.0b3/install.html#aws-glue-pyspark-jobs), MWAA):**<br>
 ➡️`pip install pyarrow==2 awswrangler`
 
```py3
@@ -98,17 +98,17 @@ FROM "sampleDB"."sampleTable" ORDER BY time DESC LIMIT 3
 
 ## [Read The Docs](https://aws-sdk-pandas.readthedocs.io/)
 
-- [**What is AWS SDK for pandas?**](https://aws-sdk-pandas.readthedocs.io/en/3.0.0b2/what.html)
-- [**Install**](https://aws-sdk-pandas.readthedocs.io/en/3.0.0b2/install.html)
-- [PyPi (pip)](https://aws-sdk-pandas.readthedocs.io/en/3.0.0b2/install.html#pypi-pip)
-- [Conda](https://aws-sdk-pandas.readthedocs.io/en/3.0.0b2/install.html#conda)
-- [AWS Lambda Layer](https://aws-sdk-pandas.readthedocs.io/en/3.0.0b2/install.html#aws-lambda-layer)
-- [AWS Glue Python Shell Jobs](https://aws-sdk-pandas.readthedocs.io/en/3.0.0b2/install.html#aws-glue-python-shell-jobs)
-- [AWS Glue PySpark Jobs](https://aws-sdk-pandas.readthedocs.io/en/3.0.0b2/install.html#aws-glue-pyspark-jobs)
-- [Amazon SageMaker Notebook](https://aws-sdk-pandas.readthedocs.io/en/3.0.0b2/install.html#amazon-sagemaker-notebook)
-- [Amazon SageMaker Notebook Lifecycle](https://aws-sdk-pandas.readthedocs.io/en/3.0.0b2/install.html#amazon-sagemaker-notebook-lifecycle)
-- [EMR](https://aws-sdk-pandas.readthedocs.io/en/3.0.0b2/install.html#emr)
-- [From source](https://aws-sdk-pandas.readthedocs.io/en/3.0.0b2/install.html#from-source)
+- [**What is AWS SDK for pandas?**](https://aws-sdk-pandas.readthedocs.io/en/3.0.0b3/what.html)
+- [**Install**](https://aws-sdk-pandas.readthedocs.io/en/3.0.0b3/install.html)
+- [PyPi (pip)](https://aws-sdk-pandas.readthedocs.io/en/3.0.0b3/install.html#pypi-pip)
+- [Conda](https://aws-sdk-pandas.readthedocs.io/en/3.0.0b3/install.html#conda)
+- [AWS Lambda Layer](https://aws-sdk-pandas.readthedocs.io/en/3.0.0b3/install.html#aws-lambda-layer)
+- [AWS Glue Python Shell Jobs](https://aws-sdk-pandas.readthedocs.io/en/3.0.0b3/install.html#aws-glue-python-shell-jobs)
+- [AWS Glue PySpark Jobs](https://aws-sdk-pandas.readthedocs.io/en/3.0.0b3/install.html#aws-glue-pyspark-jobs)
+- [Amazon SageMaker Notebook](https://aws-sdk-pandas.readthedocs.io/en/3.0.0b3/install.html#amazon-sagemaker-notebook)
+- [Amazon SageMaker Notebook Lifecycle](https://aws-sdk-pandas.readthedocs.io/en/3.0.0b3/install.html#amazon-sagemaker-notebook-lifecycle)
+- [EMR](https://aws-sdk-pandas.readthedocs.io/en/3.0.0b3/install.html#emr)
+- [From source](https://aws-sdk-pandas.readthedocs.io/en/3.0.0b3/install.html#from-source)
 - [**Tutorials**](https://github.com/aws/aws-sdk-pandas/tree/main/tutorials)
 - [001 - Introduction](https://github.com/aws/aws-sdk-pandas/blob/main/tutorials/001%20-%20Introduction.ipynb)
 - [002 - Sessions](https://github.com/aws/aws-sdk-pandas/blob/main/tutorials/002%20-%20Sessions.ipynb)
@@ -145,29 +145,29 @@ FROM "sampleDB"."sampleTable" ORDER BY time DESC LIMIT 3
 - [033 - Amazon Neptune](https://github.com/aws/aws-sdk-pandas/blob/main/tutorials/033%20-%20Amazon%20Neptune.ipynb)
 - [034 - Distributing Calls Using Ray](https://github.com/aws/aws-sdk-pandas/blob/main/tutorials/034%20-%20Distributing%20Calls%20using%20Ray.ipynb)
 - [35 - Distributing Calls on Ray Remote Cluster](https://github.com/aws/aws-sdk-pandas/blob/main/tutorials/035%20-%20Distributing%20Calls%20on%20Ray%20Remote%20Cluster.ipynb)
-- [**API Reference**](https://aws-sdk-pandas.readthedocs.io/en/3.0.0b2/api.html)
-- [Amazon S3](https://aws-sdk-pandas.readthedocs.io/en/3.0.0b2/api.html#amazon-s3)
-- [AWS Glue Catalog](https://aws-sdk-pandas.readthedocs.io/en/3.0.0b2/api.html#aws-glue-catalog)
-- [Amazon Athena](https://aws-sdk-pandas.readthedocs.io/en/3.0.0b2/api.html#amazon-athena)
-- [AWS Lake Formation](https://aws-sdk-pandas.readthedocs.io/en/3.0.0b2/api.html#aws-lake-formation)
-- [Amazon Redshift](https://aws-sdk-pandas.readthedocs.io/en/3.0.0b2/api.html#amazon-redshift)
-- [PostgreSQL](https://aws-sdk-pandas.readthedocs.io/en/3.0.0b2/api.html#postgresql)
-- [MySQL](https://aws-sdk-pandas.readthedocs.io/en/3.0.0b2/api.html#mysql)
-- [SQL Server](https://aws-sdk-pandas.readthedocs.io/en/3.0.0b2/api.html#sqlserver)
-- [Oracle](https://aws-sdk-pandas.readthedocs.io/en/3.0.0b2/api.html#oracle)
-- [Data API Redshift](https://aws-sdk-pandas.readthedocs.io/en/3.0.0b2/api.html#data-api-redshift)
-- [Data API RDS](https://aws-sdk-pandas.readthedocs.io/en/3.0.0b2/api.html#data-api-rds)
-- [OpenSearch](https://aws-sdk-pandas.readthedocs.io/en/3.0.0b2/api.html#opensearch)
-- [Amazon Neptune](https://aws-sdk-pandas.readthedocs.io/en/3.0.0b2/api.html#amazon-neptune)
-- [DynamoDB](https://aws-sdk-pandas.readthedocs.io/en/3.0.0b2/api.html#dynamodb)
-- [Amazon Timestream](https://aws-sdk-pandas.readthedocs.io/en/3.0.0b2/api.html#amazon-timestream)
-- [Amazon EMR](https://aws-sdk-pandas.readthedocs.io/en/3.0.0b2/api.html#amazon-emr)
-- [Amazon CloudWatch Logs](https://aws-sdk-pandas.readthedocs.io/en/3.0.0b2/api.html#amazon-cloudwatch-logs)
-- [Amazon Chime](https://aws-sdk-pandas.readthedocs.io/en/3.0.0b2/api.html#amazon-chime)
-- [Amazon QuickSight](https://aws-sdk-pandas.readthedocs.io/en/3.0.0b2/api.html#amazon-quicksight)
-- [AWS STS](https://aws-sdk-pandas.readthedocs.io/en/3.0.0b2/api.html#aws-sts)
-- [AWS Secrets Manager](https://aws-sdk-pandas.readthedocs.io/en/3.0.0b2/api.html#aws-secrets-manager)
-- [Global Configurations](https://aws-sdk-pandas.readthedocs.io/en/3.0.0b2/api.html#global-configurations)
+- [**API Reference**](https://aws-sdk-pandas.readthedocs.io/en/3.0.0b3/api.html)
+- [Amazon S3](https://aws-sdk-pandas.readthedocs.io/en/3.0.0b3/api.html#amazon-s3)
+- [AWS Glue Catalog](https://aws-sdk-pandas.readthedocs.io/en/3.0.0b3/api.html#aws-glue-catalog)
+- [Amazon Athena](https://aws-sdk-pandas.readthedocs.io/en/3.0.0b3/api.html#amazon-athena)
+- [AWS Lake Formation](https://aws-sdk-pandas.readthedocs.io/en/3.0.0b3/api.html#aws-lake-formation)
+- [Amazon Redshift](https://aws-sdk-pandas.readthedocs.io/en/3.0.0b3/api.html#amazon-redshift)
+- [PostgreSQL](https://aws-sdk-pandas.readthedocs.io/en/3.0.0b3/api.html#postgresql)
+- [MySQL](https://aws-sdk-pandas.readthedocs.io/en/3.0.0b3/api.html#mysql)
+- [SQL Server](https://aws-sdk-pandas.readthedocs.io/en/3.0.0b3/api.html#sqlserver)
+- [Oracle](https://aws-sdk-pandas.readthedocs.io/en/3.0.0b3/api.html#oracle)
+- [Data API Redshift](https://aws-sdk-pandas.readthedocs.io/en/3.0.0b3/api.html#data-api-redshift)
+- [Data API RDS](https://aws-sdk-pandas.readthedocs.io/en/3.0.0b3/api.html#data-api-rds)
+- [OpenSearch](https://aws-sdk-pandas.readthedocs.io/en/3.0.0b3/api.html#opensearch)
+- [Amazon Neptune](https://aws-sdk-pandas.readthedocs.io/en/3.0.0b3/api.html#amazon-neptune)
+- [DynamoDB](https://aws-sdk-pandas.readthedocs.io/en/3.0.0b3/api.html#dynamodb)
+- [Amazon Timestream](https://aws-sdk-pandas.readthedocs.io/en/3.0.0b3/api.html#amazon-timestream)
+- [Amazon EMR](https://aws-sdk-pandas.readthedocs.io/en/3.0.0b3/api.html#amazon-emr)
+- [Amazon CloudWatch Logs](https://aws-sdk-pandas.readthedocs.io/en/3.0.0b3/api.html#amazon-cloudwatch-logs)
+- [Amazon Chime](https://aws-sdk-pandas.readthedocs.io/en/3.0.0b3/api.html#amazon-chime)
+- [Amazon QuickSight](https://aws-sdk-pandas.readthedocs.io/en/3.0.0b3/api.html#amazon-quicksight)
+- [AWS STS](https://aws-sdk-pandas.readthedocs.io/en/3.0.0b3/api.html#aws-sts)
+- [AWS Secrets Manager](https://aws-sdk-pandas.readthedocs.io/en/3.0.0b3/api.html#aws-secrets-manager)
+- [Global Configurations](https://aws-sdk-pandas.readthedocs.io/en/3.0.0b3/api.html#global-configurations)
 - [**License**](https://github.com/aws/aws-sdk-pandas/blob/main/LICENSE.txt)
 - [**Contributing**](https://github.com/aws/aws-sdk-pandas/blob/main/CONTRIBUTING.md)

VERSION

Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
-3.0.0b2
+3.0.0b3

awswrangler/__metadata__.py

Lines changed: 1 addition & 1 deletion
@@ -7,5 +7,5 @@
 
 __title__: str = "awswrangler"
 __description__: str = "Pandas on AWS."
-__version__: str = "3.0.0b2"
+__version__: str = "3.0.0b3"
 __license__: str = "Apache License 2.0"
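
The `b3` suffix marks a beta pre-release, which sorts after `3.0.0b2` and before the final `3.0.0`. The sketch below illustrates that ordering. It is not code from this repository: `version_key` is a hypothetical helper that handles only the `X.Y.Z` and `X.Y.Z{a,b,rc}N` shapes, a small subset of PEP 440. For real use, `packaging.version.Version` is the more robust choice.

```python
import re
from typing import Tuple

# Rank of each pre-release phase; a final release sorts after all of them.
_PHASE_RANK = {"a": 0, "b": 1, "rc": 2}
_FINAL_RANK = 3
_VERSION_RE = re.compile(r"^(\d+)\.(\d+)\.(\d+)(?:(a|b|rc)(\d+))?$")


def version_key(version: str) -> Tuple[Tuple[int, int, int], Tuple[int, int]]:
    """Build a sortable key for versions such as '3.0.0b3' or '3.0.0'."""
    match = _VERSION_RE.match(version)
    if match is None:
        raise ValueError(f"unsupported version string: {version!r}")
    major, minor, patch, phase, number = match.groups()
    release = (int(major), int(minor), int(patch))
    # Pre-releases rank by phase then number; finals use the highest rank.
    pre = (_PHASE_RANK[phase], int(number)) if phase else (_FINAL_RANK, 0)
    return release, pre


versions = ["3.0.0", "3.0.0b3", "3.0.0b2"]
ordered = sorted(versions, key=version_key)
print(ordered)
```

Because pip skips pre-releases by default, installing this build requires either an explicit pin such as `awswrangler==3.0.0b3` or the `--pre` flag.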

awswrangler/athena/_read.py

Lines changed: 8 additions & 8 deletions
@@ -706,11 +706,11 @@ def read_sql_query(
 
 **Related tutorial:**
 
-- `Amazon Athena <https://aws-sdk-pandas.readthedocs.io/en/3.0.0b2/
+- `Amazon Athena <https://aws-sdk-pandas.readthedocs.io/en/3.0.0b3/
 tutorials/006%20-%20Amazon%20Athena.html>`_
-- `Athena Cache <https://aws-sdk-pandas.readthedocs.io/en/3.0.0b2/
+- `Athena Cache <https://aws-sdk-pandas.readthedocs.io/en/3.0.0b3/
 tutorials/019%20-%20Athena%20Cache.html>`_
-- `Global Configurations <https://aws-sdk-pandas.readthedocs.io/en/3.0.0b2/
+- `Global Configurations <https://aws-sdk-pandas.readthedocs.io/en/3.0.0b3/
 tutorials/021%20-%20Global%20Configurations.html>`_
 
 **There are three approaches available through ctas_approach and unload_approach parameters:**
@@ -774,7 +774,7 @@ def read_sql_query(
 /athena.html#Athena.Client.get_query_execution>`_ .
 
 For a practical example check out the
-`related tutorial <https://aws-sdk-pandas.readthedocs.io/en/3.0.0b2/
+`related tutorial <https://aws-sdk-pandas.readthedocs.io/en/3.0.0b3/
 tutorials/024%20-%20Athena%20Query%20Metadata.html>`_!
 
 
@@ -1019,11 +1019,11 @@ def read_sql_table(
 
 **Related tutorial:**
 
-- `Amazon Athena <https://aws-sdk-pandas.readthedocs.io/en/3.0.0b2/
+- `Amazon Athena <https://aws-sdk-pandas.readthedocs.io/en/3.0.0b3/
 tutorials/006%20-%20Amazon%20Athena.html>`_
-- `Athena Cache <https://aws-sdk-pandas.readthedocs.io/en/3.0.0b2/
+- `Athena Cache <https://aws-sdk-pandas.readthedocs.io/en/3.0.0b3/
 tutorials/019%20-%20Athena%20Cache.html>`_
-- `Global Configurations <https://aws-sdk-pandas.readthedocs.io/en/3.0.0b2/
+- `Global Configurations <https://aws-sdk-pandas.readthedocs.io/en/3.0.0b3/
 tutorials/021%20-%20Global%20Configurations.html>`_
 
 **There are two approaches to be defined through ctas_approach parameter:**
@@ -1068,7 +1068,7 @@ def read_sql_table(
 /athena.html#Athena.Client.get_query_execution>`_ .
 
 For a practical example check out the
-`related tutorial <https://aws-sdk-pandas.readthedocs.io/en/3.0.0b2/
+`related tutorial <https://aws-sdk-pandas.readthedocs.io/en/3.0.0b3/
 tutorials/024%20-%20Athena%20Query%20Metadata.html>`_!
 

awswrangler/s3/_read_parquet.py

Lines changed: 2 additions & 2 deletions
@@ -411,7 +411,7 @@ def read_parquet(
 must return a bool, True to read the partition or False to ignore it.
 Ignored if `dataset=False`.
 E.g ``lambda x: True if x["year"] == "2020" and x["month"] == "1" else False``
-https://aws-data-wrangler.readthedocs.io/en/3.0.0b2/tutorials/023%20-%20Flexible%20Partitions%20Filter.html
+https://aws-data-wrangler.readthedocs.io/en/3.0.0b3/tutorials/023%20-%20Flexible%20Partitions%20Filter.html
 columns : List[str], optional
 List of columns to read from the file(s).
 validate_schema : bool, default False
@@ -623,7 +623,7 @@ def read_parquet_table(
 must return a bool, True to read the partition or False to ignore it.
 Ignored if `dataset=False`.
 E.g ``lambda x: True if x["year"] == "2020" and x["month"] == "1" else False``
-https://aws-sdk-pandas.readthedocs.io/en/3.0.0b2/tutorials/023%20-%20Flexible%20Partitions%20Filter.html
+https://aws-sdk-pandas.readthedocs.io/en/3.0.0b3/tutorials/023%20-%20Flexible%20Partitions%20Filter.html
 columns : List[str], optional
 List of columns to read from the file(s).
 validate_schema : bool, default False
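
The docstring lines in this diff describe a partition filter as a callable that returns True to read a partition and False to skip it. The sketch below shows how such a callable selects partitions. It uses plain dictionaries rather than S3, and `select_partitions` is a hypothetical helper, not part of awswrangler. The partition values are kept as strings, matching the `"2020"` and `"1"` comparisons in the docstring example.

```python
from typing import Callable, Dict, List

# A partition filter receives partition names mapped to string values.
PartitionFilter = Callable[[Dict[str, str]], bool]


def select_partitions(
    partitions: List[Dict[str, str]], partition_filter: PartitionFilter
) -> List[Dict[str, str]]:
    """Keep only the partitions for which the filter returns True."""
    return [p for p in partitions if partition_filter(p)]


partitions = [
    {"year": "2020", "month": "1"},
    {"year": "2020", "month": "2"},
    {"year": "2021", "month": "1"},
]

# The filter exactly as written in the docstring example.
docstring_filter = lambda x: True if x["year"] == "2020" and x["month"] == "1" else False

# An equivalent, shorter form: the comparison already yields a bool.
short_filter = lambda x: x["year"] == "2020" and x["month"] == "1"

selected = select_partitions(partitions, docstring_filter)
selected_short = select_partitions(partitions, short_filter)
print(selected)
```

Comparing against the integer `2020` instead of the string `"2020"` would match nothing in this model, which is the main pitfall the string-valued example guards against.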

awswrangler/s3/_read_text.py

Lines changed: 3 additions & 3 deletions
@@ -225,7 +225,7 @@ def read_csv(
 This function MUST return a bool, True to read the partition or False to ignore it.
 Ignored if `dataset=False`.
 E.g ``lambda x: True if x["year"] == "2020" and x["month"] == "1" else False``
-https://aws-sdk-pandas.readthedocs.io/en/3.0.0b2/tutorials/023%20-%20Flexible%20Partitions%20Filter.html
+https://aws-sdk-pandas.readthedocs.io/en/3.0.0b3/tutorials/023%20-%20Flexible%20Partitions%20Filter.html
 parallelism : int, optional
 The requested parallelism of the read. Only used when `distributed` add-on is installed.
 Parallelism may be limited by the number of files of the dataset. 200 by default.
@@ -378,7 +378,7 @@ def read_fwf(
 This function MUST return a bool, True to read the partition or False to ignore it.
 Ignored if `dataset=False`.
 E.g ``lambda x: True if x["year"] == "2020" and x["month"] == "1" else False``
-https://aws-sdk-pandas.readthedocs.io/en/3.0.0b2/tutorials/023%20-%20Flexible%20Partitions%20Filter.html
+https://aws-sdk-pandas.readthedocs.io/en/3.0.0b3/tutorials/023%20-%20Flexible%20Partitions%20Filter.html
 parallelism : int, optional
 The requested parallelism of the read. Only used when `distributed` add-on is installed.
 Parallelism may be limited by the number of files of the dataset. 200 by default.
@@ -535,7 +535,7 @@ def read_json(
 This function MUST return a bool, True to read the partition or False to ignore it.
 Ignored if `dataset=False`.
 E.g ``lambda x: True if x["year"] == "2020" and x["month"] == "1" else False``
-https://aws-sdk-pandas.readthedocs.io/en/3.0.0b2/tutorials/023%20-%20Flexible%20Partitions%20Filter.html
+https://aws-sdk-pandas.readthedocs.io/en/3.0.0b3/tutorials/023%20-%20Flexible%20Partitions%20Filter.html
 parallelism : int, optional
 The requested parallelism of the read. Only used when `distributed` add-on is installed.
 Parallelism may be limited by the number of files of the dataset. 200 by default.

awswrangler/s3/_write_parquet.py

Lines changed: 3 additions & 3 deletions
@@ -319,18 +319,18 @@ def to_parquet( # pylint: disable=too-many-arguments,too-many-locals,too-many-b
 concurrent_partitioning: bool
 If True will increase the parallelism level during the partitions writing. It will decrease the
 writing time and increase the memory usage.
-https://aws-sdk-pandas.readthedocs.io/en/3.0.0b2/tutorials/022%20-%20Writing%20Partitions%20Concurrently.html
+https://aws-sdk-pandas.readthedocs.io/en/3.0.0b3/tutorials/022%20-%20Writing%20Partitions%20Concurrently.html
 mode: str, optional
 ``append`` (Default), ``overwrite``, ``overwrite_partitions``. Only takes effect if dataset=True.
 For details check the related tutorial:
-https://aws-sdk-pandas.readthedocs.io/en/3.0.0b2/tutorials/004%20-%20Parquet%20Datasets.html
+https://aws-sdk-pandas.readthedocs.io/en/3.0.0b3/tutorials/004%20-%20Parquet%20Datasets.html
 catalog_versioning : bool
 If True and `mode="overwrite"`, creates an archived version of the table catalog before updating it.
 schema_evolution : bool
 If True allows schema evolution (new or missing columns), otherwise a exception will be raised. True by default.
 (Only considered if dataset=True and mode in ("append", "overwrite_partitions"))
 Related tutorial:
-https://aws-sdk-pandas.readthedocs.io/en/3.0.0b2/tutorials/014%20-%20Schema%20Evolution.html
+https://aws-sdk-pandas.readthedocs.io/en/3.0.0b3/tutorials/014%20-%20Schema%20Evolution.html
 database : str, optional
 Glue/Athena catalog: Database name.
 table : str, optional
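
The `mode` parameter in the docstring above names three write modes without spelling out how they differ. The toy in-memory model below shows one plausible reading of those names. It is an assumption for illustration and is not confirmed by this diff: `write_dataset` and the `Dataset` mapping are hypothetical, and no S3 or awswrangler call is involved.

```python
from typing import Dict, List

# Hypothetical model of a dataset: partition path -> file names it holds.
Dataset = Dict[str, List[str]]


def write_dataset(existing: Dataset, incoming: Dataset, mode: str = "append") -> Dataset:
    """Return the dataset contents after writing `incoming` with `mode`."""
    if mode == "append":
        # Assumed: keep every existing file and add the new ones.
        result = {part: list(names) for part, names in existing.items()}
    elif mode == "overwrite":
        # Assumed: discard the whole dataset before writing.
        result = {}
    elif mode == "overwrite_partitions":
        # Assumed: discard only the partitions that are being rewritten.
        result = {
            part: list(names)
            for part, names in existing.items()
            if part not in incoming
        }
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    for part, names in incoming.items():
        result.setdefault(part, []).extend(names)
    return result


existing = {"year=2020": ["a.parquet"], "year=2021": ["b.parquet"]}
incoming = {"year=2021": ["c.parquet"]}

appended = write_dataset(existing, incoming, "append")
overwritten = write_dataset(existing, incoming, "overwrite")
replaced = write_dataset(existing, incoming, "overwrite_partitions")
print(appended, overwritten, replaced, sep="\n")
```

The tutorial linked from the docstring remains the authoritative description of the behaviour.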
