Releases: aws/aws-sdk-pandas
AWS SDK for pandas 3.7.0
Breaking changes 💥
Lake Formation Governed tables are being phased out and we are dropping support (#2692).
Features/Enhancements 🚀
Bug fixes 🐛
- Index columns removed on s3.to_parquet by @robert-schmidtke in #2655
- Missing timezone metadata by @kukushking in #2682
- remove enforced openpyxl engine constraint by @jaidisido in #2696
- Iceberg partitioning not working with partition transform functions by @LeonLuttenberger in #2694
- remove awswrangler README from `site-packages` folder by @AlJohri in #2698
- indent categories in pyarrow_additional_kwargs correctly by @jaidisido in #2701
New Contributors
Full Changelog: 3.6.0...3.7.0
AWS SDK for pandas 3.6.0
Features/Enhancements 🚀
- Enable Iceberg row deletion & add `mode` parameter to `to_iceberg` by @LeonLuttenberger in #2632
- Add support for pyarrow type `large_string` by @joakibo in #2663
- Add `max_results` to `athena.list_query_executions` by @LeonLuttenberger in #2665
Bug fixes 🐛
- Pyarrow 15 imports & remove unused code by @kukushking in #2649
New Contributors
Full Changelog: 3.5.2...3.6.0
AWS SDK for pandas 3.5.2
Bug fixes 🐛
- DynamoDB key & filter expressions attribute overwrite by @kukushking in #2615
- Allow PostgreSQL reserved keywords as column names by @LeonLuttenberger in #2619
- Add `to_iceberg` support for filling missing columns in the DataFrame with None by @LeonLuttenberger in #2616
- Forward `ignore_nulls` for container types by @raaidarshad in #2636
Documentation 📚
- Add `s3_additional_kwargs` to docstrings by @malachi-constant in #2627
- Fix outdated hyperlinks in documentation by @LeonLuttenberger in #2634
Other 🤖
- Enable dependabot to upgrade GitHub actions by @LeonLuttenberger in #2618
- Update badges in README by @LeonLuttenberger in #2628
- Add vulnerability label to dependabot PRs with alert state by @jaidisido in #2629
New Contributors
- @raaidarshad made their first contribution in #2636
Full Changelog: 3.5.1...3.5.2
AWS SDK for pandas 3.5.1
Bug fixes 🐛
- Deserialization error when reading from DynamoDB using `KeyConditionExpression` by @LeonLuttenberger in #2607
- Reading of chunked parquet when columns parameter is specified by @rchromik in #2599
Documentation 📚
- Add `show_create_table` to Athena API page by @MikeSchriefer in #2610
Other 🤖
- chore: Replace `bump2version` with `bump-my-version` by @LeonLuttenberger in #2608
- chore(deps-dev): bump jinja2 from 3.1.2 to 3.1.3 by @dependabot in #2609
- chore(deps): bump grpcio from 1.51.3 to 1.53.0 by @dependabot in #2612
New Contributors
- @MikeSchriefer made their first contribution in #2610
- @rchromik made their first contribution in #2599
Full Changelog: 3.5.0...3.5.1
AWS SDK for pandas 3.5.0
Breaking changes 💥
Due to CVEs, Ray is capped to the patched version 2.9.x. As a result, the latest version of the library cannot be used on the Glue for Ray runtime. We have raised the CVE issue with the Glue team.
Features/Enhancements 🚀
- Add `spark_properties` to athena spark by @rajagurunath in #2508
- Add `MERGE INTO` support for Iceberg by @LeonLuttenberger in #2527
- Support partitioning by index cols by @kukushking in #2528
- Add `analysis_template_arn` to `cleanrooms.read_sql_query` by @jaidisido in #2584
- Python 3.12 support by @LeonLuttenberger in #2559
  - Note: Ray currently does not support Python 3.12. As such, distributed operations on data frames will not work yet.
  - Relevant Ray issue
- Upgrade to Ray 2.9.0+ and refactor Ray datasources to the new API by @kukushking in #2570
Bug fixes 🐛
- Athena/Neptune minor fixes by @kukushking in #2526
- Reset index and handle last index by @Antropath in #2531
- Oracle failed import message by @matthewdeanmartin in #2537
- Add parameterized queries where possible to address the risk of SQL injection by @LeonLuttenberger in #2540
- SQL identifiers by @kukushking in #2543
- coerce_timestamps - allow None by @kukushking in #2556
- Add validation for `table` and `schema` params for Redshift by @LeonLuttenberger in #2551
- Redshift VARBYTE support by @kukushking in #2573
Documentation 📚
- Add SSM Public Param usage to docs by @malachi-constant in #2521
Other 🤖
- refactor: Remove usage of boto3 resources by @LeonLuttenberger in #2525
- chore(deps): bump aiohttp from 3.8.5 to 3.8.6 by @dependabot in #2519
- chore(deps): bump aiohttp from 3.8.6 to 3.9.0 by @dependabot in #2535
- chore(deps): bump cryptography from 41.0.4 to 41.0.6 by @dependabot in #2538
- chore(deps-dev): bump jupyter-server from 2.7.2 to 2.11.2 by @dependabot in #2545
- chore: Upgrade test infrastructure dependencies by @LeonLuttenberger in #2562
- chore: Prepare 3.5.0 release by @LeonLuttenberger in #2560
- chore: Upgrade deltalake dependency by @LeonLuttenberger in #2563
- chore: Replace black formatter with ruff format by @LeonLuttenberger in #2568
- chore: ruff improvements by @LeonLuttenberger in #2571
- chore: upgrade `oracledb` to 2.0 by @LeonLuttenberger in #2574
- chore(deps-dev): bump the development-dependencies group with 8 updates by @dependabot in #2577
- chore(deps-dev): bump the development-dependencies group with 5 updates by @dependabot in #2583
- chore(deps-dev): bump the development-dependencies group with 3 updates by @dependabot in #2590
- chore(deps): bump the production-dependencies group with 5 updates by @dependabot in #2591
- chore: type annotations by @LeonLuttenberger in #2585
- chore: Replace PyLint with Ruff by @LeonLuttenberger in #2588
- chore: Update gremlinpython & add aiohttp by @kukushking in #2595
New Contributors
- @rajagurunath made their first contribution in #2508
- @Antropath made their first contribution in #2531
- @matthewdeanmartin made their first contribution in #2537
Full Changelog: 3.4.2...3.5.0
AWS SDK for pandas 3.4.2
Features/Enhancements 🚀
- Update pyarrow to 14.0.1 to fix arbitrary code execution security vulnerability
Full Changelog: 3.4.1...3.4.2
AWS SDK for pandas 3.4.1
Features/Enhancements 🚀
- feat: Add schema evolution to `athena.to_iceberg` by @LeonLuttenberger in #2465
- feat: Athena - add `client_request_token` by @kukushking in #2474
- feat: Redshift data api - allow all auth combinations by @kukushking in #2475
- feat: add columns comments to iceberg by @frenchytheasian in #2482
- feat: Add Python 3.11 layers in `cn-north-1` & `cn-northwest-1` by @kukushking in #2514
Bug fixes 🐛
- fix: Add missing call to `sanitize_column_name` in `create_*_table` by @LeonLuttenberger in #2464
- fix: Hyphenated Iceberg table names by @LeonLuttenberger in #2466
- fix: `requests_aws4auth` not being treated as an optional dependency by @LeonLuttenberger in #2471
- fix: KeyError exception in athena wrangler by @rabingaire in #2483
- fix: column names and apply map by @LumberjackUsingMath in #2492
- fix: Gremlin batch size calc by @kukushking in #2496
Documentation 📚
- docs: Update layers.rst - add cn-north-1 & cn-northwest-1 by @kukushking in #2477
New Contributors
- @rabingaire made their first contribution in #2483
- @frenchytheasian made their first contribution in #2482
- @LumberjackUsingMath made their first contribution in #2492
Full Changelog: 3.4.0...3.4.1
AWS SDK for pandas 3.4.0
Features/Enhancements 🚀
- Geospatial - parse Athena geospatial types via geopandas by @kukushking in #2346
- Allow group identifiers to be used in `wr.cloudwatch` queries by @LeonLuttenberger in #2430
- Add ignore null store parquet metadata by @raaidarshad in #2450
Bug fixes 🐛
- Add missing boto3 session in `athena.to_iceberg` wait_query by @jaidisido in #2428
- Add catalog ID in `athena.to_iceberg` by @jaidisido in #2446
- Return None for missing column and partition key comment by @robert-schmidtke in #2449
- Fix urllib3 error when building AWS Lambda Layers by @LeonLuttenberger in #2447
- Duplicate schema argument in `wr.s3.to_parquet` by @kukushking in #2455
Tests 🧪
- Test dependabot groups feature by @jaidisido in #2426
New Contributors
- @raaidarshad made their first contribution in #2450
Full Changelog: 3.3.0...3.4.0
AWS SDK for pandas 3.3.0
Features/Enhancements 🚀
- Support Athena query prepared statements & Athena parameterized queries by @LeonLuttenberger in #2344
- Add dtype parameter in to_iceberg function by @paulobrunheroto in #2359
- Add CleanRooms read module by @jaidisido in #2366
- Escape and validate table identifiers and literals in PostgreSQL by @kukushking in #2390
- Add Python 3.11 support by @moralesl in #2414
Bug fixes 🐛
- Escape column names in PRIMARY KEY statement in SQL query by @mc51 in #2351
- Remove .lower in dtype sanitize for to_parquet by @jaidisido in #2369
- Enforce use_threads=False when Limit is supplied by @jaidisido in #2372
- Fix Boto3 session not being passed to `cleanrooms.wait_query` by @LeonLuttenberger in #2381
- Allow ANSI-compatible identifiers in RDS Data API by @kukushking in #2391
- Pass schema to chunked parquet reads by @kukushking in #2400
- Support pyarrow schema in DynamoDB read_items #2399 by @jaidisido in #2401
- Upgrade Ray to 2.6 and fix security dependabots by @jaidisido in #2403
- Fix Arrow timezone localization by @kukushking in #2411
- Use from_arrow instead of from_arrow_refs by @jaidisido in #2417
Tests 🧪
- Make minimal tests run on mac and windows by @LeonLuttenberger in #2347
- Add Aurora PostgreSQL Serverless by @kukushking in #2388
New Contributors
- @mc51 made their first contribution in #2351
- @paulobrunheroto made their first contribution in #2359
- @moralesl made their first contribution in #2414
Full Changelog: 3.2.1...3.3.0
AWS SDK for pandas 3.2.1
Fixes 🛠️
- Fix error where library could not be imported on Windows due to `No module named 'pyarrow._orc'` by @LeonLuttenberger in #2341 #2337
- Lower `packaging` version requirement by @LeonLuttenberger in #2340
- Allow Ray 2.5 & downgrade tox by @kukushking in #2338
Full Changelog: 3.2.0...3.2.1