Releases: google/differential-privacy
DP Lib v4.0.0
These release notes cover a series of updates across our differential privacy libraries and tools. A key highlight is the introduction of PipelineDP4j, a new end-to-end differential privacy solution for the JVM.
PipelineDP4j
PipelineDP4j is a new end-to-end differential privacy solution for the JVM that supports various frameworks for distributed data processing, such as Apache Spark and Apache Beam. It is the JVM implementation of PipelineDP (https://pipelinedp.io/), and is conceptually similar to Privacy on Beam.
New Features
- Initial Release: Published PipelineDP4j, the JVM version of PipelineDP.
- New Stable Public API: Released a new stable public API, replacing the previous experimental one.
- The following frameworks (backends) are supported: Apache Spark (both Dataset and DataFrame APIs), Apache Beam, local execution.
- The following metrics are supported: PRIVACY_ID_COUNT, COUNT, SUM, VECTOR_SUM, MEAN, VARIANCE, QUANTILES.
- Private groups (partitions) selection is supported.
- Public groups are supported as well.
- A usage example is provided, with instructions on how to run it locally with Beam and Spark and on Google Cloud with Beam and Spark.
C++ DP Library
Breaking Changes
- Removal of Deprecated Functions: Removed unused, deprecated functions in approx bounds and partition selection pre-thresholding.
New Features
- BoundsProvider Framework: Refactored code and introduced `BoundsProvider`, making it easier to implement new bounding algorithms. Added `ApproxBoundsAsBoundsProvider` wrapper for using `ApproxBounds` with the new interface.
Bug Fixes
- Normalized NaN floating point values in `SetValue<T>` for architecture-independent floating point semantics.
- Fixed float cast overflow in quantile tree.
- Simplified `UniformDouble()` implementation and made it match its description.
Other
- ApproxBounds Deprecation: Refactored `ApproxBoundsAsBoundsProvider`, replacing the wrapped class with `ApproxBoundsProvider` and deprecating `ApproxBounds`.
- Serialization Updates:
  - Updated serialization (still backwards compatible). Added new serialization for `BoundsProvider`. Backwards compatibility will be removed in the next release.
  - Changed serialization for sum, mean, and variance (remains fully backwards compatible for now). Backwards compatibility will be removed in the next release.
- API Enhancements: Changed return types of `NumericalMechanism::AddNoise` from `T` to `double` or `int64_t` to reduce confusion with unsigned types (not a breaking change due to implicit conversion).
- Code Quality & Dependencies:
  - Used `std::optional` and `std::make_unique` in favor of `absl::*`.
  - General code cleanup, refined comments, and updated absl dependency.
  - Moved reconstruction logic for clamped bounds into a separate class.
Java DP Library
New Features
- Added a method to add Gaussian noise with a given L2 sensitivity.
- Added support for noise addition with zCDP in Java Gaussian Noise.
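To make the zCDP item concrete, here is a minimal sketch of the standard relationship between a zCDP parameter and Gaussian noise: Gaussian noise with standard deviation sigma satisfies rho-zCDP with rho = Δ₂² / (2·sigma²), where Δ₂ is the L2 sensitivity. The sketch is written in Go for illustration only; the function name `gaussianSigmaForZCDP` and its parameters are hypothetical and are not the Java library's API.

```go
package main

import (
	"errors"
	"fmt"
	"math"
)

// gaussianSigmaForZCDP is a hypothetical helper (not part of any DP library).
// It returns the standard deviation at which Gaussian noise satisfies
// rho-zCDP for a query with the given L2 sensitivity, by solving
// rho = l2Sensitivity^2 / (2 * sigma^2) for sigma.
func gaussianSigmaForZCDP(l2Sensitivity, rho float64) (float64, error) {
	// The negated comparisons also reject NaN inputs.
	if !(l2Sensitivity > 0) || math.IsInf(l2Sensitivity, 0) {
		return 0, errors.New("l2Sensitivity must be positive and finite")
	}
	if !(rho > 0) || math.IsInf(rho, 0) {
		return 0, errors.New("rho must be positive and finite")
	}
	return l2Sensitivity / math.Sqrt(2*rho), nil
}

func main() {
	sigma, err := gaussianSigmaForZCDP(2.0, 0.125)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("sigma = %.3f\n", sigma)
}
```

A smaller rho (stronger privacy) or a larger L2 sensitivity both lead to a larger sigma, which is the behavior one would expect from any zCDP-calibrated Gaussian noise.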
Bug Fixes
- Changed `maxContributions` from `Integer` to `int` in `ApproximateBounds`.
Other
- Updated dependencies.
- Loaded java rules from `@rules_java`.
Go DP Library
Breaking Changes
- Removed Bazel build support for Go code (library, examples, and Privacy on Beam). Building should now be done with the go tool. Building with Bazel could still work via gazelle, but is not officially supported.
- Standard Deviation Deprecated: Marked StandardDeviation as deprecated and updated README. Please use Variance instead.
Bug Fixes
- Handled unspecified noise & infinite lower/upper bounds for Variance.
Other
- Updated README to reflect that pre-thresholding is available in Go.
- Go examples now depend on the local version of the Go DP Library with the go tool (go.mod).
Privacy on Beam (PoB)
Breaking Changes
- Removed Bazel build support for Go code (library, examples, and Privacy on Beam). Building should now be done with the go tool. Building with Bazel could still work via gazelle, but is not officially supported.
New Features
- Implemented `VariancePerKey`, which returns variance; and `VarianceStatisticsPerKey`, which returns variance, mean, and count together.
- Implemented the `MeanStatisticsPerKey` API which returns count and mean together.
Bug Fixes
- Fixed inconsistent test case parameter in `TestDistinctPrivacyIDThresholdLeavesSomeEntries`.
- Fixed tolerance in `TestMeanPerKeyCrossPartitionContributionBounding`.
Other
- Privacy on Beam now depends on the local version of the Go DP Library with the go tool (go.mod).
- Updated `pre_thresholding.md` to include links to Go Library & PoB.
General/Other
Breaking Changes
- Proto changes: Deprecated `bounds_summary` field in all protos (still populated).
DP Lib v3.0.0
This is a major release with new features, improvements, deprecations, and bug fixes across the libraries. It includes breaking changes, which are listed below.
Note: All features not mentioned in the release notes are considered experimental for this release. In particular, the accounting library, the stochastic testing API in C++, and the PostgreSQL extension are experimental and likely to change or be removed in the future.
Changes since the 2.1.0 release
C++ DP Lib:
Breaking changes:
- Remove default L1 sensitivity for Laplace mechanisms. L1 sensitivity has to be set explicitly.
- Remove `GetOutputConfidenceInterval` from algorithms and clarify semantics of `NoiseConfidenceInterval`.
- Remove `base::Status` and migrate remaining API to `absl::Status`.
- Remove logging soft-fork and use logging from `absl` instead.
- Remove deprecated protocol buffer fields.
New features:
- Add a payload to the `Status` output when approximate bounds could not find appropriate bounds.
Java DP Lib:
Breaking change:
- The `Noise` interface in Java now has 2 additional `addNoise()` methods that accept delta of type `double`. Clients with their own implementation of the interface must implement the new methods to avoid compilation errors.
Deprecated:
- Deprecated the methods that accept delta of type `Double` in the `Noise` interface and aggregation primitives (e.g., `Count`, `BoundedSum`, etc). We will delete the deprecated methods in the next release. Please migrate to their overloaded versions that accept delta of type `double` in the meantime.
New feature: Long bounded sum.
Go DP Lib:
New features:
- Pre-thresholding for `PreAggSelectPartitions` and `Count` aggregation primitives.
- Add `IncrementBy` for the `PreAggSelectPartitions` aggregation primitive.
- Negative counts are now explicitly allowed when incrementing via `IncrementBy` in `Count` & `PreAggSelectPartitions`, meaning it is now possible to decrement privacy ID counts.
Privacy on Beam:
Breaking changes:
- New API for `PrivacySpec` and `AggregationParams` to enable new features.
- Test Mode is now a field in `PrivacySpecParams`, so there is no need to use `pbeamtest` for enabling test mode.
Deprecated:
- Merge `SelectPartitionsParams` & `PartitionSelectionParams`. Both `SelectPartitions` and private partition selection of other aggregations now use `PartitionSelectionParams`. `SelectPartitionsParams` is deprecated and might be deleted in future releases.
New features:
- Pre-thresholding on top of DP thresholding, available for all aggregations. See pre-thresholding documentation for more details.
- Aggregation and partition selection budgets can be specified separately instead of being split automatically. This allows for granular budget allocation.
- Test mode can now also be used in non-test runs, e.g. in order to compare differentially private results with raw results.
- Scalable public partitions for `<K, V>` types. Previously, due to an issue with how we processed public partitions, only in-memory public partitions or small `PCollections` as public partitions worked with KV type aggregations. Now, it should be possible to use arbitrarily large public partitions. Note that this issue did not affect V-type aggregations (i.e. `DistinctPrivacyID`, `Count`).
- Add option to `CountParams` to allow negative outputs, which enables more accurate statistical analysis of the output.
DP Lib v2.1.0
C++ DP Lib:
- New feature: Pre-thresholding Partition Selection
Java DP Lib:
- New feature: Pre-thresholding Partition Selection
- New feature: Truncated Geometric Partition Selection
Go / Privacy on Beam:
- Internal refactorings only
DP Lib v2.0.0
This is a major release with new features, improvements and bug fixes across the libraries. It includes breaking changes, which are listed below.
Overview Table
Algorithm | C++ | Go | Java |
---|---|---|---|
Laplace mechanism | β | β | β |
Gaussian mechanism | β | β | β |
Count | β | β | β |
Sum | β | β | β |
Mean | β | β | β |
Variance | β | β | β |
Quantiles | β | β | β |
Automatic bounds approximation | β | β | β |
Truncated geometric thresholding | β | β | β |
Laplace thresholding | β | β | β |
Gaussian thresholding | β | β | β |
Note: All features not mentioned in the release notes are considered experimental for this release. In particular, the accounting library, the stochastic testing API in C++, and the PostgreSQL extension are experimental and likely to change or be removed in the future.
Changes since the 1.1.2 release
C++
New features:
- Add confidence intervals for mean.
Breaking changes:
- Removed budget fraction for DP algorithms; use absolute privacy budgets during initialization instead.
- The semantics of the `ApproxBounds` budget have changed. A DP algorithm now consumes at most the given epsilon and delta. If the DP algorithm uses `ApproxBounds` internally, the algorithm splits the budget.
New deprecations (not yet removed, but users should migrate):
- Deprecated `BoundedStandardDeviation`; use `BoundedVariance` instead.
- Use `absl::Status` instead of our own soft fork.
Java
New features:
- Implement discrete Laplace noise generator.
- Implement automatic approximate DP bounds calculation.
Go
New features:
- Allow setting equal `lower` and `upper` bounds for `BoundedSum` contributions.
- Improved error reporting.
Breaking changes:
- Return errors instead of `log.Fatal`/`Exit`'ing. This changes the function signatures, so the errors now have to be handled or ignored.
- Rename `BoundedMeanFloat64` to `BoundedMean`.
- Disable defaults for `MaxPartitionsContributed`.
- Disallow using 0 or very small epsilon for Gaussian noise.
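As a rough illustration of what the switch from fatal logging to returned errors means for callers, the sketch below uses a hypothetical `counter` type and `newCounter` constructor. These names are made up for this example and are not the Go DP Library's API; only the pattern (a constructor that returns an error the caller must handle) reflects the change described above.

```go
package main

import "fmt"

// counter is a hypothetical stand-in for an aggregation type.
type counter struct {
	epsilon float64
}

// newCounter is a hypothetical constructor. Instead of terminating the
// process on invalid parameters, it reports the problem to the caller.
func newCounter(epsilon float64) (*counter, error) {
	// The negated comparison also rejects NaN.
	if !(epsilon > 0) {
		return nil, fmt.Errorf("epsilon must be positive, got %v", epsilon)
	}
	return &counter{epsilon: epsilon}, nil
}

func main() {
	// Callers now decide how to react to invalid parameters.
	if _, err := newCounter(0); err != nil {
		fmt.Println("could not create counter:", err)
	}
	c, err := newCounter(1.0)
	if err != nil {
		fmt.Println("could not create counter:", err)
		return
	}
	fmt.Println("created counter with epsilon", c.epsilon)
}
```

Code that previously relied on the process exiting now has to check the returned error explicitly, or discard it deliberately with `_`.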
Privacy on Beam:
New features:
- Public partitions improvements:
  - Support public partitions for `DistinctPerKey`.
  - Support in-memory public partitions for all aggregations.
  - Faster public partitions for `Count` & `DistinctPrivacyID` when public partitions are a `PCollection` (i.e. not in-memory).
- Clamp negative counts to 0 for `DistinctPrivacyID`.
- Improve error reporting.
- Depends on GitHub for the Go DP Library dependency instead of the local version.
Breaking changes:
- Disallow equal bounds for `MeanPerKey`.
DP Lib 1.1.2
This patch release only affects the Go DP Library and Privacy on Beam.
This fixes a privacy-impacting bug that only affects the `ThresholdedResult()` function of `dpagg.Count` and `pbeam.DistinctPrivacyId()` without public partitions in Privacy on Beam, where a conversion of a floating point threshold to an integer threshold caused the mechanism to exhibit a larger delta than specified. For example:
- Calling `ThresholdedResult()` on a Count with `(Epsilon: 1.0, Noise: Laplace, MaxPartitionsContributed=1.0)` and `thresholdDelta=1e-6` should use a threshold of 14.122363 but instead it used 14.0, which increased the `thresholdDelta` to `1.130165e-6`.
- Similarly, calling `ThresholdedResult()` on a Count with `(Epsilon: 2.0, Noise: Laplace, MaxPartitionsContributed=1.0)` and `thresholdDelta=1e-4` should use a threshold of 5.258597 but instead it used 5.0, which increased the `thresholdDelta` to `1.677313e-4`.
See the single commit for more details on the bug & the fix.
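The figures in the two examples above can be reproduced with a short calculation. The sketch below assumes the simplest setting (Laplace noise with scale 1/epsilon added to a count, one partition per privacy ID), where the threshold t satisfying P(1 + Lap(1/epsilon) >= t) = delta is t = 1 - ln(2·delta)/epsilon. The function names are hypothetical and this is not the library's implementation, only a check of the arithmetic.

```go
package main

import (
	"fmt"
	"math"
)

// laplaceThreshold returns the threshold t for which a partition holding a
// single privacy ID is released with probability exactly delta, assuming
// Laplace noise of scale 1/epsilon and delta < 0.5.
func laplaceThreshold(epsilon, delta float64) float64 {
	return 1 - math.Log(2*delta)/epsilon
}

// effectiveDelta returns the delta actually achieved when threshold t >= 1
// is used with Laplace noise of scale 1/epsilon.
func effectiveDelta(epsilon, t float64) float64 {
	return 0.5 * math.Exp(-epsilon*(t-1))
}

// approxEqual reports whether a and b differ by at most tol.
func approxEqual(a, b, tol float64) bool {
	return math.Abs(a-b) <= tol
}

func main() {
	cases := []struct{ epsilon, delta float64 }{
		{1.0, 1e-6},
		{2.0, 1e-4},
	}
	for _, c := range cases {
		exact := laplaceThreshold(c.epsilon, c.delta)
		truncated := math.Floor(exact) // the buggy float-to-integer conversion
		fmt.Printf("epsilon=%g delta=%g: exact threshold %.6f, truncated %.1f, effective delta %.6e\n",
			c.epsilon, c.delta, exact, truncated, effectiveDelta(c.epsilon, truncated))
	}
}
```

Because truncation lowers the threshold, more small partitions pass it, and the effective delta ends up larger than the requested one.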
DP Lib 1.1.1
This patch release only affects the C++ DP Lib.
We fixed the implicit approximate bounds use case for multiple contributions for sum, mean, variance, and stddev, i.e., when neither approximate bounds nor upper/lower limits have been set in the builder. Since the approx bounds algorithm just provides the sensitivity for a subsequently executed algorithm, this might not be catastrophic and still provides differentially private outputs with a slightly increased privacy loss.
DP Lib 1.1.0
This is a minor release with new features, improvements and bug fixes across the libraries. There should be no breaking changes.
Overview Table
Algorithm | C++ | Go | Java |
---|---|---|---|
Laplace mechanism | β | β | β |
Gaussian mechanism | β | β | β |
Count | β | β | β |
Sum | β | β | β |
Mean | β | β | β |
Variance | β | β | β |
Standard deviation | β | β | β |
Quantiles | β | β | β |
Automatic bounds approximation | β | β | β |
Truncated geometric thresholding | β | β | β |
Laplace thresholding | β | β | β |
Gaussian thresholding | β | β | β |
β Β => supported ;Β βΒ => not supported yet
New features since the 1.0.0 release
C++
- Support for Gaussian Partition Selection
- NumericalMechanism supports `GetVariance`
- Users can have the library automatically select the numerical mechanism (Laplace or Gaussian) with the smaller variance
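To illustrate what choosing the mechanism with the smaller variance involves, here is a minimal sketch in Go. It is not the C++ library's selection logic: the function names are hypothetical, and the Gaussian standard deviation uses the textbook calibration sigma = Δ₂·sqrt(2·ln(1.25/delta))/epsilon (valid for epsilon < 1), which is an assumption made for this example rather than the library's calibration. Here `l0` is the maximum number of partitions a privacy ID contributes to and `lInf` is the maximum contribution per partition.

```go
package main

import (
	"fmt"
	"math"
)

// laplaceVariance returns the variance 2*b^2 of Laplace noise with scale
// b = l1/epsilon, where the L1 sensitivity is l1 = l0 * lInf.
func laplaceVariance(epsilon float64, l0 int, lInf float64) float64 {
	b := float64(l0) * lInf / epsilon
	return 2 * b * b
}

// gaussianVariance returns sigma^2 under the textbook calibration
// sigma = l2 * sqrt(2*ln(1.25/delta)) / epsilon, with l2 = sqrt(l0) * lInf.
// This calibration is an assumption of this sketch.
func gaussianVariance(epsilon, delta float64, l0 int, lInf float64) float64 {
	l2 := math.Sqrt(float64(l0)) * lInf
	sigma := l2 * math.Sqrt(2*math.Log(1.25/delta)) / epsilon
	return sigma * sigma
}

// chooseMechanism returns the name of the mechanism with the smaller variance.
func chooseMechanism(epsilon, delta float64, l0 int, lInf float64) string {
	if laplaceVariance(epsilon, l0, lInf) <= gaussianVariance(epsilon, delta, l0, lInf) {
		return "laplace"
	}
	return "gaussian"
}

func main() {
	for _, l0 := range []int{1, 100} {
		fmt.Printf("l0=%d: laplace variance %.1f, gaussian variance %.1f, choice %s\n",
			l0,
			laplaceVariance(0.5, l0, 1),
			gaussianVariance(0.5, 1e-5, l0, 1),
			chooseMechanism(0.5, 1e-5, l0, 1))
	}
}
```

Under these assumptions Laplace noise tends to win when each privacy ID touches few partitions, while Gaussian noise wins as `l0` grows, because the L2 sensitivity scales with the square root of `l0` rather than linearly.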
Java
- Confidence intervals for Quantiles
Go
- Support for variance and standard deviation
Privacy on Beam
- Support for multiple quantiles using quantile trees
Bug Fixes
Privacy on Beam
- Fix a privacy bug in DistinctPerKey where contributions might not be bounded correctly in some rare cases
- Fix a bug in the codelab in sum.go and multiple.go where time spent was summed up instead of revenue
Other
Privacy on Beam
- Refactor error reporting: errors are now propagated up to top-level functions as much as possible
Usage
Java via Maven
<dependency>
<groupId>com.google.privacy.differentialprivacy</groupId>
<artifactId>differentialprivacy</artifactId>
<version>1.1.0</version>
</dependency>
Or use the Java artifact with other build systems.
Via the Go command
For the go building blocks library:
go get github.com/google/differential-privacy/go@v1.1.0
For Privacy on Beam:
go get github.com/google/differential-privacy/privacy-on-beam@v1.1.0
DP Lib 1.0.1
Only affects Privacy-on-Beam.
This is a patch release for v1.0.0 that includes a fix for the rare privacy bug in the `DistinctPerKey` function of Privacy-on-Beam.
The bug occurred when there are outlier users in the input that contribute to many partitions or to many values AND the values contributed are the same as values from other users (the second part is critical, if the contributed values only come from a single user then the bug does not occur). Then, the output might not have been differentially private due to incorrect contribution bounding. See the single commit for more details on the bug & the fix.
DP Lib 1.0.0
This is the initial release of Google's differential privacy libraries. We are using semantic versioning. The initial version number is 1.0.0, as this library is already used for production use cases and we consider our API as stable.
We are supporting C++, Java, and Go. This release also includes Privacy-on-Beam, a framework for differential privacy built on top of Apache Beam Go.
Note: All features not mentioned in the release notes are considered experimental for this release. In particular, the accounting library, the stochastic testing API in C++, and the PostgreSQL extension are experimental and likely to change or be removed in the future.
DP building blocks libraries
Overview table
Algorithm | C++ | Go | Java |
---|---|---|---|
Laplace mechanism | ✓ | ✓ | ✓ |
Gaussian mechanism | ✓ | ✓ | ✓ |
Count | ✓ | ✓ | ✓ |
Sum | ✓ | ✓ | ✓ |
Mean | ✓ | ✓ | ✓ |
Variance | ✓ | ✗ | ✗ |
Standard deviation | ✓ | ✗ | ✗ |
Quantiles | ✓ | ✓ | ✓ |
Automatic bounds approximation | ✓ | ✗ | ✗ |
Truncated geometric thresholding | ✓ | ✓ | ✓ |
Laplace thresholding | ✓ | ✓ | ✓ |
Gaussian thresholding | ✗ | ✓ | ✓ |
✓ => supported ; ✗ => not supported yet
Base features in C++/Java/Go
Aggregations: count, sum, mean, quantiles
Partition selection mechanisms: truncated geometric thresholding, Laplace thresholding
Numerical mechanisms for providing secure Laplace and Gaussian noise
Additional features in C++
Additional aggregations: variance, standard deviation, max, min
Automatic per-partition bounds approximation
Additional features in Java
Additional partition selection mechanisms: Gaussian thresholding
Additional features in Go
Additional partition selection mechanisms: Gaussian thresholding
Privacy-on-Beam (based on Apache Beam Go)
Aggregations: count distinct privacy IDs, count distinct values per key, count per key, sum per key, mean per key, partition selection
Partition selection mechanisms: truncated geometric thresholding, Laplace thresholding, Gaussian thresholding
Numerical mechanisms: Laplace noise, secure Gaussian noise
In-memory public partitions in aggregations when the list of public partitions is small enough to fit in memory
Two test modes: no noise with contribution bounding, no noise without contribution bounding
Support to modify private collections using functional DoFns.