Commit fdd1f6f

Merge pull request #302 from dynatrace-oss/reset-and-hash-method
Reset and hash method
2 parents 5621b0b + 2229413 commit fdd1f6f

6 files changed: +70 -85 lines changed

.gitattributes

Lines changed: 5 additions & 8 deletions
@@ -1,13 +1,10 @@
 * text=auto
-
-*.java text
-*.py text
-*.md text
+*.java text eol=lf
+*.py text eol=lf
+*.md text eol=lf
 *.csv text
-
-*.bat text eol=crlf
-
+gradlew.bat text eol=crlf
 gradlew text eol=lf
+*.gradle eol=lf
 *.sh text eol=lf
-
 *.jar binary

.palantir/revapi.yml

Lines changed: 11 additions & 0 deletions
@@ -55,6 +55,17 @@ acceptedBreaks:
         \ com.dynatrace.hash4j.distinctcount.DistinctCounter.Estimator<T>>>>::reconstructHash(int)\
         \ @ com.dynatrace.hash4j.distinctcount.UltraLogLog"
       justification: "removed non-public method"
+  "0.17.0":
+    com.dynatrace.hash4j:hash4j:
+    - code: "java.method.addedToInterface"
+      new: "method com.dynatrace.hash4j.hashing.HashStream128 com.dynatrace.hash4j.hashing.HashStream128::copy()"
+      justification: "{added copy method to HashStream}"
+    - code: "java.method.addedToInterface"
+      new: "method com.dynatrace.hash4j.hashing.HashStream32 com.dynatrace.hash4j.hashing.HashStream32::copy()"
+      justification: "{added copy method to HashStream}"
+    - code: "java.method.addedToInterface"
+      new: "method com.dynatrace.hash4j.hashing.HashStream64 com.dynatrace.hash4j.hashing.HashStream64::copy()"
+      justification: "{added copy method to HashStream}"
   "0.18.0":
     com.dynatrace.hash4j:hash4j:
     - code: "java.class.visibilityReduced"
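The `java.method.addedToInterface` entries accept a source-incompatible change: once an abstract `copy()` is declared on an interface, every implementor written before the change has to supply it. The following is a minimal sketch of that situation. `ToyHashStream`, `ToyStream`, and the polynomial mixing step are hypothetical stand-ins for illustration, not hash4j's `HashStream64` or any of its algorithms.

```java
public class CopySketch {

  /** Hypothetical streaming-hash interface; NOT hash4j's HashStream64. */
  public interface ToyHashStream {
    ToyHashStream putInt(int v);

    long getAsLong();

    /** Newly added abstract method: implementors that predate it stop compiling. */
    ToyHashStream copy();
  }

  /** Toy implementation with a trivial polynomial state update (assumption, for illustration). */
  public static final class ToyStream implements ToyHashStream {
    private long state;

    public ToyStream() {
      this(17L);
    }

    private ToyStream(long state) {
      this.state = state;
    }

    @Override
    public ToyHashStream putInt(int v) {
      state = state * 31 + v;
      return this;
    }

    @Override
    public long getAsLong() {
      return state;
    }

    @Override
    public ToyHashStream copy() {
      // The copy carries the current state but evolves independently afterwards.
      return new ToyStream(state);
    }
  }

  public static void main(String[] args) {
    ToyHashStream prefix = new ToyStream().putInt(1);
    ToyHashStream branch = prefix.copy();
    branch.putInt(2);
    System.out.println(prefix.getAsLong() + " " + branch.getAsLong());
  }
}
```

A copy of this kind lets a caller hash a shared prefix once and then continue with different suffixes, which is presumably why the break was accepted rather than avoided.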

README.md

Lines changed: 4 additions & 4 deletions
@@ -26,12 +26,12 @@ To add a dependency on hash4j using Maven, use the following:
 <dependency>
     <groupId>com.dynatrace.hash4j</groupId>
     <artifactId>hash4j</artifactId>
-    <version>0.18.0</version>
+    <version>0.19.0</version>
 </dependency>
 ```
 To add a dependency using Gradle:
 ```gradle
-implementation 'com.dynatrace.hash4j:hash4j:0.18.0'
+implementation 'com.dynatrace.hash4j:hash4j:0.19.0'
 ```
 
 ## Hash algorithms
@@ -134,7 +134,7 @@ In case of non-distributed data streams, the [martingale estimator](src/main/jav
 can be used, which gives slightly better estimation results as the asymptotic storage factor is $6\ln 2 = 4.159$.
 This gives a relative standard error of $\sqrt{\frac{6\ln 2}{6m}} = \frac{0.833}{\sqrt{m}}$.
 The theoretically predicted estimation errors have been empirically confirmed by [simulation results](doc/hyperloglog-estimation-error.md).
-* UltraLogLog: This algorithm is described in detail in this [paper](https://arxiv.org/abs/2308.16862).
+* UltraLogLog: This algorithm is described in detail in this [paper](https://doi.org/10.14778/3654621.3654632).
 Like for HyperLogLog, a precision parameter $p$ defines the number of registers $m = 2^p$.
 However, since UltraLogLog uses 8-bit registers to enable fast random accesses and updates of the registers,
 $m$ is also the state size in bytes.
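The context lines of this hunk quote a relative standard error of $\sqrt{\frac{6\ln 2}{6m}} = \frac{0.833}{\sqrt{m}}$ for the HyperLogLog martingale estimator. As a rough arithmetic check, the sketch below evaluates that expression for a few precisions $p$ with $m = 2^p$. The class and method names are made up for illustration, and reading the denominator $6m$ as 6 bits per register is an assumption inferred from the formula.

```java
public class HllErrorSketch {

  /**
   * Relative standard error sqrt(storageFactor / (bitsPerRegister * m)) with m = 2^p registers.
   * The general form is an assumption read off the README formula sqrt(6 ln 2 / (6 m)).
   */
  public static double relativeStandardError(double storageFactor, int bitsPerRegister, int p) {
    long m = 1L << p;
    return Math.sqrt(storageFactor / (bitsPerRegister * (double) m));
  }

  public static void main(String[] args) {
    double storageFactor = 6 * Math.log(2); // quoted in the README as 4.159
    for (int p = 8; p <= 16; p += 4) {
      double error = relativeStandardError(storageFactor, 6, p);
      System.out.printf("p=%d m=%d error=%.4f%n", p, 1L << p, error);
    }
  }
}
```

Multiplying any of the printed errors by $\sqrt{m}$ should give back roughly the constant 0.833 from the README.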
@@ -211,7 +211,7 @@ The following consistent hashing algorithms are available:
 * [Improved Consistent Weighted Sampling](https://doi.org/10.1109/ICDM.2010.80): This algorithm is based on improved
 consistent weighted sampling with a constant computation time independent of the number of buckets. This algorithm is faster than
 JumpHash for a large number of buckets.
-* [JumpBackHash](https://arxiv.org/abs/2403.18682): In contrast to JumpHash, which traverses "active indices" (see [here](https://doi.org/10.1109/ICDM.2010.80) for a definition)
+* [JumpBackHash](https://doi.org/10.1002/spe.3385): In contrast to JumpHash, which traverses "active indices" (see [here](https://doi.org/10.1109/ICDM.2010.80) for a definition)
 in ascending order, JumpBackHash does this in the opposite direction. In this way, floating-point operations can be completely avoided.
 Further optimizations minimize the number of random values that need to be generated to reach
 the largest "active index" within the given bucket range in amortized constant time. The largest "active index",

build.gradle

Lines changed: 48 additions & 70 deletions
@@ -38,8 +38,8 @@ java {
     toolchain {
         languageVersion = JavaLanguageVersion.of(21)
     }
-    withJavadocJar()
     withSourcesJar()
+    withJavadocJar()
 }
 
 
@@ -103,7 +103,7 @@ tasks.test {
 }
 
 tasks.register("java21Test", Test) {
-    // compare https://github.com/melix/mrjar-gradle-plugin/blob/dac99aadd451e3c2176aa6e13af7ad39e20c2cb9/plugin/src/main/java/me/champeau/mrjar/MultiReleaseExtension.java group=LifecycleBasePlugin.VERIFICATION_GROUP
+    group = LifecycleBasePlugin.VERIFICATION_GROUP
     javaLauncher = javaToolchains.launcherFor {
         languageVersion = JavaLanguageVersion.of(21)
     }
@@ -128,7 +128,7 @@ tasks.withType(JavaCompile).configureEach {
 }
 
 group = 'com.dynatrace.hash4j'
-version = '0.18.0'
+version = '0.19.0'
 
 
 static def readJavaLicense(licenseName) {
@@ -172,6 +172,35 @@ spotless {
     def eclipseCdtVersion = '11.6'
     def blackVersion = '24.10.0'
     def greclipseVersion = '4.32'
+    def specialLicenseHeaders = [
+        new Tuple3('javaImohash', 'MIT_IMOHASH', [
+            'src/main/java/com/dynatrace/hash4j/file/Imohash1_0_2.java'
+        ]),
+        new Tuple3('javaKomihash', 'MIT_KOMIHASH' , [
+            'src/main/java/com/dynatrace/hash4j/hashing/Komihash4_3.java',
+            'src/main/java/com/dynatrace/hash4j/hashing/Komihash5_0.java',
+            'src/main/java/com/dynatrace/hash4j/hashing/AbstractKomihash.java'
+        ]),
+        new Tuple3('javaFarmHash', 'MIT_APACHE_2_0_FARMHASH',[
+            'src/main/java/com/dynatrace/hash4j/hashing/FarmHashNa.java',
+            'src/main/java/com/dynatrace/hash4j/hashing/FarmHashUo.java'
+        ]),
+        new Tuple3('javaPolymurHash', 'ZLIB_POLYMURHASH',[
+            'src/main/java/com/dynatrace/hash4j/hashing/PolymurHash2_0.java'
+        ]),
+        new Tuple3('javaSplitMix64', 'CREATIVE_COMMONS_SPLITMIX64',[
+            'src/main/java/com/dynatrace/hash4j/random/SplitMix64V1.java'
+        ]),
+        new Tuple3('javaExponential', 'BOOST_EXPONENTIAL_RANDOM_GENERATION',[
+            'src/main/java/com/dynatrace/hash4j/random/RandomExponentialUtil.java'
+        ]),
+        new Tuple3('javaConsistentJumpHash', 'APACHE_2_0_GUAVA',[
+            'src/main/java/com/dynatrace/hash4j/consistent/ConsistentJumpBucketHasher.java'
+        ]),
+        new Tuple3('javaXXH', 'APACHE_2_0_XXH',[
+            'src/main/java/com/dynatrace/hash4j/hashing/XXH3_64.java'
+        ])
+    ]
 
     ratchetFrom 'origin/main'
     apply plugin: 'groovy'
@@ -192,76 +221,25 @@ spotless {
     java {
         importOrder()
         removeUnusedImports()
+        cleanthat()
         googleJavaFormat(googleJavaFormatVersion)
+        formatAnnotations()
         licenseHeader readJavaLicense('APACHE_2_0_DYNATRACE')
-        targetExclude \
-                'src/main/java/com/dynatrace/hash4j/consistent/ConsistentJumpBucketHasher.java',\
-                'src/main/java/com/dynatrace/hash4j/file/Imohash1_0_2.java',\
-                'src/main/java/com/dynatrace/hash4j/hashing/Komihash4_3.java',\
-                'src/main/java/com/dynatrace/hash4j/hashing/Komihash5_0.java',\
-                'src/main/java/com/dynatrace/hash4j/hashing/PolymurHash2_0.java',\
-                'src/main/java/com/dynatrace/hash4j/hashing/AbstractKomihash.java',\
-                'src/main/java/com/dynatrace/hash4j/hashing/FarmHashNa.java',\
-                'src/main/java/com/dynatrace/hash4j/hashing/FarmHashUo.java',\
-                'src/main/java/com/dynatrace/hash4j/random/SplitMix64V1.java',\
-                'src/main/java/com/dynatrace/hash4j/random/RandomExponentialUtil.java',\
-                'src/main/java/com/dynatrace/hash4j/hashing/XXH3_64.java'
-    }
-    format 'javaImohash', JavaExtension, {
-        importOrder()
-        removeUnusedImports()
-        googleJavaFormat(googleJavaFormatVersion)
-        licenseHeader readJavaLicense('APACHE_2_0_DYNATRACE') + '\n\n' + readJavaLicense('MIT_IMOHASH')
-        target 'src/main/java/com/dynatrace/hash4j/file/Imohash1_0_2.java'
-    }
-    format 'javaKomihash', JavaExtension, {
-        importOrder()
-        removeUnusedImports()
-        googleJavaFormat(googleJavaFormatVersion)
-        licenseHeader readJavaLicense('APACHE_2_0_DYNATRACE') + '\n\n' + readJavaLicense('MIT_KOMIHASH')
-        target 'src/main/java/com/dynatrace/hash4j/hashing/Komihash4_3.java', 'src/main/java/com/dynatrace/hash4j/hashing/Komihash5_0.java', 'src/main/java/com/dynatrace/hash4j/hashing/AbstractKomihash.java'
-    }
-    format 'javaFarmHash', JavaExtension, {
-        importOrder()
-        removeUnusedImports()
-        googleJavaFormat(googleJavaFormatVersion)
-        licenseHeader readJavaLicense('APACHE_2_0_DYNATRACE') + '\n\n' + readJavaLicense('MIT_APACHE_2_0_FARMHASH')
-        target 'src/main/java/com/dynatrace/hash4j/hashing/FarmHashNa.java','src/main/java/com/dynatrace/hash4j/hashing/FarmHashUo.java'
-    }
-    format 'javaPolymurHash', JavaExtension, {
-        importOrder()
-        removeUnusedImports()
-        googleJavaFormat(googleJavaFormatVersion)
-        licenseHeader readJavaLicense('APACHE_2_0_DYNATRACE') + '\n\n' + readJavaLicense('ZLIB_POLYMURHASH')
-        target 'src/main/java/com/dynatrace/hash4j/hashing/PolymurHash2_0.java'
-    }
-    format 'javaSplitMix64', JavaExtension, {
-        importOrder()
-        removeUnusedImports()
-        googleJavaFormat(googleJavaFormatVersion)
-        licenseHeader readJavaLicense('APACHE_2_0_DYNATRACE') + '\n\n' + readJavaLicense('CREATIVE_COMMONS_SPLITMIX64')
-        target 'src/main/java/com/dynatrace/hash4j/random/SplitMix64V1.java'
-    }
-    format 'javaExponential', JavaExtension, {
-        importOrder()
-        removeUnusedImports()
-        googleJavaFormat(googleJavaFormatVersion)
-        licenseHeader readJavaLicense('APACHE_2_0_DYNATRACE') + '\n\n' + readJavaLicense('BOOST_EXPONENTIAL_RANDOM_GENERATION')
-        target 'src/main/java/com/dynatrace/hash4j/random/RandomExponentialUtil.java'
-    }
-    format 'javaConsistentJumpHash', JavaExtension, {
-        importOrder()
-        removeUnusedImports()
-        googleJavaFormat(googleJavaFormatVersion)
-        licenseHeader readJavaLicense('APACHE_2_0_DYNATRACE') + '\n\n' + readJavaLicense('APACHE_2_0_GUAVA')
-        target 'src/main/java/com/dynatrace/hash4j/consistent/ConsistentJumpBucketHasher.java'
+        targetExclude specialLicenseHeaders.collect {it.get(2)}.flatten()
     }
-    format 'javaXXH', JavaExtension, {
-        importOrder()
-        removeUnusedImports()
-        googleJavaFormat(googleJavaFormatVersion)
-        licenseHeader readJavaLicense('APACHE_2_0_DYNATRACE') + '\n\n' + readJavaLicense('APACHE_2_0_XXH')
-        target 'src/main/java/com/dynatrace/hash4j/hashing/XXH3_64.java'
+    specialLicenseHeaders.forEach {
+        def formatName = it.get(0)
+        def licenseName = it.get(1)
+        def files = it.get(2)
+        format formatName, JavaExtension, {
+            importOrder()
+            removeUnusedImports()
+            cleanthat()
+            googleJavaFormat(googleJavaFormatVersion)
+            formatAnnotations()
+            licenseHeader readJavaLicense('APACHE_2_0_DYNATRACE') + '\n\n' + readJavaLicense(licenseName)
+            target files
+        }
     }
 }
 
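The build.gradle change replaces eight near-identical `format` blocks with one table (`specialLicenseHeaders`) that drives both the per-license formats and the `targetExclude` list of the default `java` block. The same idea can be sketched in plain Java. The class below is an illustration only: `Entry`, `excludedFromDefault`, and the shortened file names are assumptions, and it models only the table and the `collect { it.get(2) }.flatten()` step, not Spotless itself.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class LicenseTableSketch {

  /** One table row: format name, license name, files (mirrors a Tuple3 entry). */
  public static final class Entry {
    final String formatName;
    final String licenseName;
    final List<String> files;

    Entry(String formatName, String licenseName, List<String> files) {
      this.formatName = formatName;
      this.licenseName = licenseName;
      this.files = files;
    }
  }

  // Two rows only, with shortened (hypothetical) paths, to keep the sketch small.
  public static final List<Entry> ENTRIES =
      Arrays.asList(
          new Entry(
              "javaKomihash",
              "MIT_KOMIHASH",
              Arrays.asList("Komihash4_3.java", "Komihash5_0.java", "AbstractKomihash.java")),
          new Entry("javaXXH", "APACHE_2_0_XXH", Arrays.asList("XXH3_64.java")));

  /** Counterpart of collect { it.get(2) }.flatten(): all specially licensed files in table order. */
  public static List<String> excludedFromDefault() {
    List<String> all = new ArrayList<>();
    for (Entry e : ENTRIES) {
      all.addAll(e.files);
    }
    return all;
  }

  public static void main(String[] args) {
    for (Entry e : ENTRIES) {
      System.out.println(e.formatName + " applies " + e.licenseName + " to " + e.files);
    }
    System.out.println("default format excludes " + excludedFromDefault());
  }
}
```

Because the exclusion list is derived from the same table, a file added to a row cannot be forgotten in `targetExclude`, which the old hand-maintained list allowed.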

src/main/java/com/dynatrace/hash4j/consistent/ConsistentHashing.java

Lines changed: 2 additions & 1 deletion
@@ -63,7 +63,8 @@ public static ConsistentBucketHasher improvedConsistentWeightedSampling(
    *
    * <p>In contrast to other algorithms, JumpBackHash runs in constant time and does not require
    * floating-point operations. On some machines it may achieve similar performance as a modulo
-   * operation.
+   * operation. See Otmar Ertl, "JumpBackHash: Say Goodbye to the Modulo Operation to Distribute
+   * Keys Uniformly to Buckets", <a href="https://doi.org/10.1002/spe.3385">10.1002/spe.3385.</a>
    *
    * @param pseudoRandomGeneratorProvider a {@link PseudoRandomGeneratorProvider}
    * @return a {@link ConsistentBucketHasher}
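For contrast with the Javadoc's claim that JumpBackHash needs no floating-point operations, here is a sketch of the classic JumpHash by Lamping and Veach, which the README compares it against. This is the published JumpHash algorithm, not JumpBackHash and not hash4j's implementation; the class name `JumpHashSketch` is made up. Note the floating-point division inside the loop and the ascending traversal of candidate buckets.

```java
public class JumpHashSketch {

  /**
   * Jump consistent hash (Lamping and Veach): maps a 64-bit key to a bucket in [0, numBuckets).
   * The loop visits candidate buckets in ascending order and uses a floating-point division
   * per step, which is what JumpBackHash is described as avoiding.
   */
  public static int jumpHash(long key, int numBuckets) {
    if (numBuckets <= 0) {
      throw new IllegalArgumentException("numBuckets must be positive");
    }
    long b = -1;
    long j = 0;
    while (j < numBuckets) {
      b = j;
      // 64-bit linear congruential step that produces the next pseudo-random value.
      key = key * 2862933555777941757L + 1;
      // Jump ahead to the next candidate bucket; the result is always at least b + 1.
      j = (long) ((b + 1) * ((double) (1L << 31) / (double) ((key >>> 33) + 1)));
    }
    return (int) b;
  }

  public static void main(String[] args) {
    long key = 0x123456789ABCDEFL;
    for (int n = 1; n <= 8; n++) {
      System.out.println("buckets=" + n + " -> bucket " + jumpHash(key, n));
    }
  }
}
```

The consistency property shows up when the bucket count grows by one: a key either keeps its bucket or moves to the newly added last bucket.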

src/test/java/com/dynatrace/hash4j/hashing/AbstractHasherTest.java

Lines changed: 0 additions & 2 deletions
@@ -36,7 +36,6 @@
 import java.util.function.Consumer;
 import java.util.function.Function;
 import java.util.function.Supplier;
-import org.jetbrains.annotations.NotNull;
 import org.junit.jupiter.api.Disabled;
 import org.junit.jupiter.api.Test;
 import org.junit.jupiter.api.TestInstance;
@@ -949,7 +948,6 @@ public char charAt(int index) {
       return AbstractHasher.getChar(buffer, (index & (NUM_CHARS_IN_BUFFER - 1)) << 1);
     }
 
-    @NotNull
     @Override
     public CharSequence subSequence(int start, int end) {
       throw new UnsupportedOperationException();
