Releases: awslabs/deequ

2.0.12

19 Aug 15:15
ce03bdc

What's Changed

  • Added implementation of DQDL rules and execution:
    • Add implementation of DQDL rule execution by @happy-coral in #620
    • Add implementation of outcome mapping in DeequOutcomeTranslator by @happy-coral in #621
    • Add implementation for DQDL rules: CompletenessRule, IsCompleteRule, UniquenessRule, IsUniqueRule, ColumnCorrelationRule by @happy-coral in #622
    • Add implementation for DQDL rules: DistinctValuesCount, Entropy, Mean, StandardDeviation, Sum, UniqueValueRatio by @happy-coral in #624
    • Update README to describe DQDL support and add Java & Scala DQDL examples by @happy-coral in #634
    • Add support for DQDL IsPrimaryKey rule by @happy-coral in #635
    • Add support for DQDL ColumnLength rule by @eycho-am in #636
  • Modify Histogram to be in descending frequency by @kyraman in #630
  • Introduce HistogramBase for common histogram behavior by @kyraman in #631
  • Modify maven publishing to use central portal by @eycho-am in #633
  • Add support for DQDL CustomSql rule & Deequ CustomSql check by @happy-coral in #632
  • fix(kll): Add SerDe Implementation for KLLSketch by @mdrakiburrahman in #628
  • Updated version in pom.xml to 2.0.12-spark-3.5 by @eycho-am in #637

Full Changelog: 2.0.11...2.0.12

2.0.11

12 Aug 16:22
fe256f5

What's Changed

  • Add AnalyzerOptions to Analyzer serialize / deserialize logic by @kchaturvedi in #597
  • Refine row count retrieval to skip redundant Size() scans by @lawofcycles in #605
  • Updated version in pom.xml to 2.0.11-spark-3.5 by @eycho-am in #615

Full Changelog: 2.0.10...2.0.11

2.0.10

23 Apr 20:52

New Features

Maintenance / Fixes

  • feature/replace-rdd by @shriyavanvari in #586
  • Add a test verifying that Deequ's isContainedIn constraint correctly handles string values containing single quotes by @D-Minor in #602

Full Changelog: 2.0.9...2.0.10

2.0.9

23 Apr 20:48

Maintenance / Fixes

  • Fix row-level bug when composing outcome in #594

Full Changelog: 2.0.8...2.0.9

2.0.8

23 Apr 16:41
7c76c59

New Features

Maintenance / Fixes

Full Changelog: 2.0.7...2.0.8

2.0.7

02 Jul 17:47

What's Changed

Upgrades

New Features

  • New type of MetricsRepository by @VenkataKarthikP:
    • Using Spark tables as the data source in #518
  • Row Level Result Treatment Options by @eycho-am:
    • Uniqueness and Completeness in #532
    • Minimum and Maximum in #535
  • Anomaly Detection Changes by @zeotuan:
    • Add Daily Season with Hourly Interval to HoltWinter in #546
  • New analyzers:

Maintenance / Fixes

  • Fix Breeze dependency conflict in Anomaly Detection Spark 3.4+ by @zeotuan in #545
  • Data Sync / DatasetMatch changes by @VenkataKarthikP:
    • Add data synchronization test to VerificationSuite in #526
    • Support column matching and change to DatasetMatch in #529
  • Row level results fixes:
    • Add analyzerOption to add filteredRowOutcome for isPrimaryKey Check by @eycho-am in #537
    • Fix bug in MinLength and MaxLength when NullBehavior.EmptyString by @eycho-am in #538
    • [Min/Max] Apply filtered row behavior at the row level evaluation by @rdsharma26 in #543
    • [MinLength/MaxLength] Apply filtered row behavior at the row level evaluation by @rdsharma26 in #547
    • Fix for satisfies row level results bug by @rdsharma26 in #553

Full Changelog: 2.0.6...2.0.7

2.0.6

13 Nov 17:16
54c5e48

What's Changed

  • NEW: Exact Quantile Check
  • Data Synchronization/Matching fixes
    • Delegate to Spark for checking existence of columns in the given dataframes by @rdsharma26 in #515
    • Verify that non key columns exist in each dataset by @rdsharma26 in #517
  • Addition of tests
    • Test that exceptions within a check's constraints do not affect other… by @tylermcdaniel0 in #516

Full Changelog: 2.0.5...2.0.6

2.0.5

13 Nov 17:13
94821c2

What's Changed

  • Spark 3.4 Update
  • NEW: Custom SQL analyzer
  • Analyzer Improvements
    • Allow all DQ constraints to be generated from an Analyzer by @mentekid in #508

Full Changelog: 2.0.4...2.0.5

2.0.4

10 Aug 17:18

What's Changed

Full Changelog: 2.0.3...2.0.4

2.0.3

07 Mar 20:10
c9a0eae

What's Changed

  • Adding chi-square distance method for categorical variables by @bevhanno in #444
  • [WIP] Row Level Results by @mentekid in #451
  • [Experimental] Addition of dataset comparison utilities by @rdsharma26 in #449

Full Changelog: 2.0.2...2.0.3