JA5 fingerprinting computation and filtration #2052

@krizhanovsky

Motivation

JA3 is a popular client identification method, used among other things for DDoS mitigation. It is also required for #1381. JA4 provides hashes for TCP, TLS, HTTP and even destination measures. p0f produces fingerprints at the L2-L4 level.

JA3 uses MD5 for the hash calculation, and JA4 goes even further and uses SHA-256. The purpose of a cryptographically strong hash function is to prevent an attacker from guessing an alternate clear text for a particular signature. A DDoS attack developer has no reason to guess network characteristics that mimic a particular hash, since they can simply reproduce the required stack. Moreover, the original JA3 blog post mentions that fuzzy hashing could be more beneficial, and it also mentions the MD5 performance issues. In our case it would be completely impractical to call a crypto context for the fingerprints.

Let's introduce an alternate multi-layer fingerprinting and name it JA5.

Scope

Compute the JA5 hash on the Tempesta TLS layer and implement a rate limiter for each JA5 hash. This must be an on-the-fly reconfigurable option. The proposed configuration format is:

ja5 max_entries=N {
    <hash value> <connections per second> <TLS records per second>;
}

For example (0 means unlimited, which is the default):

ja5h max_entries=100 {
    e7d705a3286e19ea42f587b344ee6865 0 10;
    6734f37431670b3ab4292b8f60f29984 10 100;
}

ja5n {
    e7d705a3286e19ea42f587b344ee6865 0 10;
}

# JA5t (JA5 TLS) hash only
ja5t {
    e7d705a3286e19ea42f587b344ee6865 0 10;
}

# Full JA5 hash
ja5 {
    e7d705a3286e19ea42f587b344ee6865 0 10;
}
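The per-hash limits above could be enforced roughly as follows. This is only a minimal user-space sketch, assuming a fixed one-second window and treating 0 as unlimited as in the example; `struct ja5_rate` and `ja5_rate_allow()` are hypothetical names, not existing Tempesta FW code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-fingerprint counter for the current one-second window. */
struct ja5_rate {
	uint64_t window_start;	/* second in which the window was opened */
	uint32_t count;		/* events accounted in this window */
};

/*
 * Account one event (a new connection or a TLS record) seen at time @now,
 * given in seconds, and return true while the fingerprint stays within
 * @limit events per second. A @limit of 0 means "unlimited".
 */
static bool
ja5_rate_allow(struct ja5_rate *r, uint32_t limit, uint64_t now)
{
	assert(r);

	if (r->window_start != now) {
		/* A new second has started: open a fresh window. */
		r->window_start = now;
		r->count = 0;
	}
	if (r->count < UINT32_MAX)
		r->count++;

	return limit == 0 || r->count <= limit;
}
```

One such counter would be kept per limit column, i.e. one for connections per second and one for TLS records per second.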

To generate the rate limits we need to account for the hashes, so the hashes must be added to the access log.

The fingerprints and their current rates, along with the rate limits, should be stored in separate shared tables protected with a spin-lock (please write a TODO comment with a reference to #515 to replace this with the new HTrie). Each table should be organized as an intrusive hash table plus an LRU linked list with a maximum size of max_entries (1000 by default). The configured fingerprints are never evicted.
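The table layout described above can be sketched in user-space C as follows. All names (`ja5_entry`, `ja5_table`, `ja5_get()`, `JA5_BUCKETS`) are hypothetical, the fingerprint is shortened to a 64-bit value, and locking is omitted: in the kernel the caller would hold the table spin-lock around `ja5_get()`. The sketch also assumes that configured entries stay off the LRU list and do not count toward max_entries.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define JA5_BUCKETS 64	/* hypothetical bucket count, a power of two */

struct ja5_entry {
	struct ja5_entry *hnext;	/* intrusive hash chain */
	struct ja5_entry *prev, *next;	/* intrusive LRU list */
	uint64_t fp;			/* fingerprint value */
	int configured;			/* pinned entry, never evicted */
};

struct ja5_table {
	struct ja5_entry *buckets[JA5_BUCKETS];
	struct ja5_entry lru;	/* list head: lru.next is MRU, lru.prev is LRU */
	size_t size, max_entries;
};

static void
ja5_table_init(struct ja5_table *t, size_t max_entries)
{
	assert(max_entries > 0);
	memset(t->buckets, 0, sizeof(t->buckets));
	t->lru.prev = t->lru.next = &t->lru;
	t->size = 0;
	t->max_entries = max_entries;
}

static void
lru_del(struct ja5_entry *e)
{
	e->prev->next = e->next;
	e->next->prev = e->prev;
}

static void
lru_add_head(struct ja5_table *t, struct ja5_entry *e)
{
	e->next = t->lru.next;
	e->prev = &t->lru;
	t->lru.next->prev = e;
	t->lru.next = e;
}

/* Return the chain link that points to @fp, or the terminating NULL link. */
static struct ja5_entry **
ja5_slot(struct ja5_table *t, uint64_t fp)
{
	struct ja5_entry **p = &t->buckets[fp & (JA5_BUCKETS - 1)];

	while (*p && (*p)->fp != fp)
		p = &(*p)->hnext;
	return p;
}

static struct ja5_entry *
ja5_find(struct ja5_table *t, uint64_t fp)
{
	return *ja5_slot(t, fp);
}

/* Look up @fp and insert it if absent, evicting the LRU entry when full. */
static struct ja5_entry *
ja5_get(struct ja5_table *t, uint64_t fp, int configured)
{
	struct ja5_entry *e = ja5_find(t, fp);

	if (e) {
		if (!e->configured) {
			/* Mark as the most recently used entry. */
			lru_del(e);
			lru_add_head(t, e);
		}
		return e;
	}
	if (!configured && t->size == t->max_entries) {
		struct ja5_entry *victim = t->lru.prev;

		lru_del(victim);
		*ja5_slot(t, victim->fp) = victim->hnext;
		free(victim);
		t->size--;
	}
	e = calloc(1, sizeof(*e));
	if (!e)
		return NULL;
	e->fp = fp;
	e->configured = configured;
	*ja5_slot(t, fp) = e;	/* append to the end of the hash chain */
	if (!configured) {
		lru_add_head(t, e);
		t->size++;
	}
	return e;
}
```

Keeping pinned entries off the LRU list means the eviction path never has to skip over them.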

We log the fingerprints in #537 on a per-request basis, so the analytics will track all the fingerprints. Here we only need to keep the most aggressive hitters (likely DDoS fingerprints) in order to limit them.

JA5

There is a set of fingerprints on different layers to classify clients and their requests on all the layers.

The fingerprint must be computed inline, i.e. during the TLS, HTTP and TCP processing.

The fingerprint is a binary value consisting of the following fields (from the most significant):

TCP and IP (ja5n, essentially p0f)

(A kernel patch may be required to store the values. In that case it makes sense to move this part to a separate issue, to be done after #1808.)

  1. 1 bit: IPv4 or IPv6
  2. 4 bits: maximum TTL - minimum TTL (15 for difference 15 and above)
  3. alignment to 1 byte
  4. 1 byte: minimum TTL during
  5. 1 byte: length of IPv4 options or IPv6 extension headers
  6. 1 byte: window size in MTUs
  7. 2 bytes: number of network packets per request (255 for 255 and above)
  8. 2 bytes: MTU size
  9. 2 bytes: wscale
  10. 2 bytes: sum * 11 + tcp_option
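The two TTL fields in the list above could be tracked per connection roughly as in the sketch below. `struct ja5n_ttl` and both functions are hypothetical names, and the sketch assumes the TTL of every received packet is available to the caller.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-connection TTL tracker. */
struct ja5n_ttl {
	uint8_t min_ttl;
	uint8_t max_ttl;
	int seen;	/* set once at least one packet was accounted */
};

/* Account the TTL (or IPv6 hop limit) of one received packet. */
static void
ja5n_ttl_update(struct ja5n_ttl *t, uint8_t ttl)
{
	if (!t->seen) {
		t->min_ttl = t->max_ttl = ttl;
		t->seen = 1;
		return;
	}
	if (ttl < t->min_ttl)
		t->min_ttl = ttl;
	if (ttl > t->max_ttl)
		t->max_ttl = ttl;
}

/* The 4-bit field: maximum TTL - minimum TTL, saturated at 15. */
static uint8_t
ja5n_ttl_diff(const struct ja5n_ttl *t)
{
	unsigned int d;

	assert(t->seen);
	d = (unsigned int)t->max_ttl - t->min_ttl;

	return d > 15 ? 15 : (uint8_t)d;
}
```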

TLS (ja5t)

  1. 1 bit: TLS version (1.2 or 1.3; constant for now, the 1.3 value is reserved for TLS 1.3, #1031)
  2. 1 bit: abbreviated handshake
  3. 1 bit: found vhost for SNI
  4. 3 bits: ALPN, one of: "h2", "http/1.1", "http/1.1,h2", "h2,http/1.1", other; zero value for empty
  5. alignment to 1 byte
  6. 2 bytes: sum * 11 + cipher_suite (11 is just a small prime, relatively far from a power of 2). This scheme represents the order of ciphersuites.
  7. 2 bytes: sum * 11 + extension_type
  8. 2 bytes: sum * 11 + elliptic_curve
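The `sum * 11 + value` fields above depend on the order of the values, which the following sketch illustrates. It assumes that the sum starts at 0 and wraps modulo 2^16 to fit the 2-byte field. The bit positions in `ja5t_flags()` follow the "from most significant" ordering of the list, but the numeric ALPN codes and the meaning of a set version bit are assumptions, and both function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Fold a list of 16-bit IDs (cipher suites, extension types or curves)
 * into a 2-byte field: sum = sum * 11 + id, wrapping modulo 2^16.
 */
static uint16_t
ja5_fold16(const uint16_t *ids, size_t n)
{
	uint16_t sum = 0;

	for (size_t i = 0; i < n; i++)
		sum = (uint16_t)(sum * 11u + ids[i]);
	return sum;
}

/*
 * Pack the leading bit fields into one byte, most significant bit first:
 * TLS version, abbreviated handshake, vhost found for SNI, 3 bits of ALPN.
 * The two low bits are the alignment padding.
 */
static uint8_t
ja5t_flags(unsigned tls13, unsigned abbreviated, unsigned vhost_found,
	   unsigned alpn)
{
	assert(alpn < 8);

	return (uint8_t)(((tls13 & 1) << 7) | ((abbreviated & 1) << 6)
			 | ((vhost_found & 1) << 5) | (alpn << 2));
}
```

Because each step multiplies the running sum before adding the next ID, two clients offering the same cipher suites in a different order get different field values.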

HTTP (ja5h)

  1. 1 bit: http version (h1 or h2)
  2. 5 bits: HTTP method (tfw_http_meth_t value)
  3. 5 bits: number of Cookie values (all bits set for 31 and more cookies, within one or several headers)
  4. 6 bits: number of headers (all bits set for 63 and more headers)
  5. 1 bit: has Referer
  6. alignment to 3 bytes
  7. 4 bytes: sum * 11 + header, where header is a value of 4 bytes or fewer (depending on how many bytes are available) taken from tfw_http_parse_req(), or a static table value, or 4 bytes of the decoded value from the dynamic table.
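The HTTP fields above occupy 18 bits padded to 3 bytes, followed by the 4-byte header sum. A minimal packing sketch is shown below. The function names are hypothetical, and taking the first bytes of the header name in big-endian order is only an assumption about how the "4 or fewer bytes" value is formed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Pack the bit fields into the low 24 bits of the result, most significant
 * first: version (1), method (5), cookies (5), headers (6), Referer (1).
 * The six low bits are the alignment padding.
 */
static uint32_t
ja5h_flags(unsigned h2, unsigned method, unsigned cookies, unsigned headers,
	   unsigned has_referer)
{
	assert(method < 32);

	if (cookies > 31)
		cookies = 31;	/* all bits set for 31 and more cookies */
	if (headers > 63)
		headers = 63;	/* all bits set for 63 and more headers */

	return ((uint32_t)(h2 & 1) << 23) | ((uint32_t)method << 18)
	       | ((uint32_t)cookies << 13) | ((uint32_t)headers << 7)
	       | ((uint32_t)(has_referer & 1) << 6);
}

/*
 * Fold one header into the 4-byte sum: sum = sum * 11 + header, wrapping
 * modulo 2^32. The header value is built from up to 4 leading bytes of
 * @name (an assumption made for this sketch).
 */
static uint32_t
ja5h_fold_header(uint32_t sum, const char *name, size_t len)
{
	uint32_t v = 0;

	for (size_t i = 0; i < len && i < 4; i++)
		v = (v << 8) | (unsigned char)name[i];
	return sum * 11u + v;
}
```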

Testing

Corresponding testing issue #2308

Documentation

Update https://tempesta-tech.com/knowledge-base/Handling-clients/ .
