TDBv0.2: Cache background revalidation and eviction #515
Comments
I've made a few rough HTTP/2 benchmarks with caching enabled.
[Benchmark output, Tempesta: 1kb response, 5kb response, 128kb response, 128kb response with HTTP/1]
[Benchmark output, Nginx (nginx/1.23.3): 1kb response, 5kb response, 128kb response]
FYI:
With the latest discussion https://github.com/tempesta-tech/tempesta-test/pull/602/files#r1622305438 and our website purging issue https://github.com/tempesta-tech/tempesta-tech.com/issues/64 , it could make sense to make the eviction thread also send conditional requests for particular resources (typically defined as dynamic, e.g. wiki or blog posts in our case). This causes extra overhead on both the upstream and Tempesta servers and introduces delays. It's much worse than cache purge plugins, but it would solve our problem and maybe similar problems of others. TBD: it doesn't solve the problem in a nice way and requires development effort...
Depends on #1869
Scope
The tfw_cache_mgr thread must traverse the Web cache and evict stale records on memory pressure or revalidate them otherwise. The thread must be accurately scheduled and throttled so that it does not impact system performance while still freeing the required memory efficiently. #500 must be kept in mind as well.
Validation logic is defined by RFC 7234 4.3 and requires implementation of conditional requests.
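The evict-or-revalidate decision and the conditional request headers from RFC 7234 4.3 can be illustrated with a small userspace sketch. Everything here is hypothetical: `struct cache_entry`, `mgr_decide()` and `mgr_cond_headers()` are illustrative names, not Tempesta code, and evicting stale entries that carry no validator is an assumption of the sketch rather than a decision made in this issue.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical, simplified view of a cached response (not a real TDB record). */
struct cache_entry {
    const char *etag;          /* ETag validator, or NULL if absent */
    const char *last_modified; /* Last-Modified value, or NULL if absent */
    time_t expires_at;         /* end of the freshness lifetime */
};

enum mgr_action { MGR_KEEP, MGR_EVICT, MGR_REVALIDATE };

/*
 * Fresh records are kept. Stale records are evicted on memory pressure
 * and revalidated otherwise. Assumption: a stale record without any
 * validator cannot be revalidated conditionally, so it is evicted too.
 */
static enum mgr_action
mgr_decide(const struct cache_entry *e, time_t now, bool mem_pressure)
{
    if (now < e->expires_at)
        return MGR_KEEP;
    if (mem_pressure || (!e->etag && !e->last_modified))
        return MGR_EVICT;
    return MGR_REVALIDATE;
}

/* Append one "Name: value\r\n" header; returns false if buf is too small. */
static bool
append_hdr(char *buf, size_t len, size_t *off, const char *name,
           const char *val)
{
    int r = snprintf(buf + *off, len - *off, "%s: %s\r\n", name, val);

    if (r < 0 || (size_t)r >= len - *off)
        return false;
    *off += (size_t)r;
    return true;
}

/*
 * Write the conditional headers for a revalidation request into buf.
 * Returns the number of bytes written (0 if the entry has no validator)
 * or -1 if buf is too small.
 */
static int
mgr_cond_headers(const struct cache_entry *e, char *buf, size_t len)
{
    size_t off = 0;

    if (len == 0)
        return -1;
    buf[0] = '\0';
    if (e->etag && !append_hdr(buf, len, &off, "If-None-Match", e->etag))
        return -1;
    if (e->last_modified && !append_hdr(buf, len, &off, "If-Modified-Since",
                                        e->last_modified))
        return -1;
    return (int)off;
}
```

A 304 answer to such a request would let the manager refresh the stored entry's freshness instead of storing the body again; scheduling and throttling of the thread are deliberately left out of the sketch.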
Keep in mind the DoS attack from #520. The following items linked with #516 (TDB v0.3) must be implemented:
tfw_cache_mgr thread or just a callback, see TDB eviction and stale response processing #2074 (comment)
reinsert and lookup & insert (tdb_rec_get_alloc()) logic from Temporal client accounting #1115 (temporarily implemented in Temporal client accounting #1178).
__cache_add_node() creates a TDB entry, which immediately becomes visible to other threads, and later tfw_cache_copy_resp() inserts the actual data, so concurrent threads may get incomplete or corrupted data. This can be done in 2 phases (soft updates): (1) allocate space in the TDB data area and (2) do the actual insert (index update) to link the data. The tfw_client_obtain() modifications from Temporal client accounting #1178, as well as the similar HTTP sessions storage (Sticky cookies load balancing #685), and __cache_add_node() must be changed to use the soft updates. This also implies some versioning: while a softirq is sending data for the current cached object (probably very slowly, with Redesign of TCP synchronous sending and data caching #391 .1 in mind), the object may become stale and/or be replaced by a new version, so new scans must fetch only the new version, while the old version must reside in TDB until it is fully transmitted and then be evicted.
chroot isolation).
The current TDB table size maximum is 128GB, which is too small for the web cache on modern hardware. This is the subject of NUMA-aware cache modes #400
__cache_entry_size() call, which introduces an extra response traversal. It seems we can just allocate new TDB data blocks and later reuse them if we have extra space, or just ignore the tail if it's unusable.
The task is required to fix #803.
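The two-phase soft update and the versioning requirement from the list above can be modelled in userspace as follows. This is a minimal sketch under stated assumptions, not the TDB design: `struct rec`, `struct slot` and all `rec_*`/`slot_*` functions are hypothetical, a one-slot "index" stands in for the real index, and a tiny spinlock replaces whatever synchronization TDB would actually use.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record in the data area; not the real TDB layout. */
struct rec {
    atomic_int refcnt; /* the index plus readers still sending this version */
    size_t len;
    char data[];
};

/* Hypothetical one-slot "index": readers see only what is linked here. */
struct slot {
    atomic_flag lock;  /* tiny spinlock protecting cur */
    struct rec *cur;
};

static void slot_init(struct slot *s)
{
    atomic_flag_clear(&s->lock);
    s->cur = NULL;
}

static void slot_lock(struct slot *s)
{
    while (atomic_flag_test_and_set_explicit(&s->lock, memory_order_acquire))
        ; /* spin */
}

static void slot_unlock(struct slot *s)
{
    atomic_flag_clear_explicit(&s->lock, memory_order_release);
}

/* Phase 1: allocate space and copy the data; nobody can see the record yet. */
static struct rec *rec_prepare(const void *data, size_t len)
{
    struct rec *r = malloc(sizeof(*r) + len);

    if (!r)
        return NULL;
    atomic_init(&r->refcnt, 1); /* the reference owned by the index */
    r->len = len;
    memcpy(r->data, data, len);
    return r;
}

/* Drop one reference; the record is freed when the last user is done. */
static void rec_put(struct rec *r)
{
    if (r && atomic_fetch_sub(&r->refcnt, 1) == 1)
        free(r);
}

/*
 * Phase 2: link the fully written record into the index. The old version
 * loses only the index reference, so it survives until in-flight readers
 * release it. Passing NULL unlinks the current record.
 */
static void rec_publish(struct slot *s, struct rec *r)
{
    struct rec *old;

    slot_lock(s);
    old = s->cur;
    s->cur = r;
    slot_unlock(s);
    rec_put(old);
}

/* Reader: new lookups always get the currently linked version. */
static struct rec *rec_get(struct slot *s)
{
    struct rec *r;

    slot_lock(s);
    r = s->cur;
    if (r)
        atomic_fetch_add(&r->refcnt, 1);
    slot_unlock(s);
    return r;
}
```

A slow sender that called `rec_get()` before a new version was published keeps reading the old record, and the old record is released only by that sender's final `rec_put()`, which matches the "fully transmitted and then evicted" requirement.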
UPD. Since filtering (#731) and QoS (#488) also require eviction, the job should be done in the tdb_mgr thread instead.
UPD. TDB was designed to provide zero-copy access to stored data, so that a cached response body can be sent directly to a socket. This property imposed several design limitations and introduced many difficulties. However, with TLS we always have to copy data, so the TDB design can be significantly simplified by copying. So this depends on #634.
Cache eviction
While CART is a well-known, good adaptive replacement algorithm, there are a number of caching algorithms based on machine learning which provide a much better cache hit ratio. See, for example, the survey and Cacheus. Some of the algorithms require access to columnar storage for statistics (a common practice in CDNs).
At least some interface for a user space algorithm is required. Probably just CART with some weights, where the weights are loaded from user space into the kernel, would be enough.
The cache must implement per-vhost eviction strategies and space quotas to provide caching QoS for CDN cases. Probably 2-layer quotas are required to guard against poor configuration, e.g. a bad Vary specification on the application side, which may take too much space (linked with #733). Different eviction strategies are required to handle e.g. chunks of live streams (huge data volume, outdated chunks removed immediately) and rarely updated web content like CSS (stale entries may be served).
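One possible reading of the 2-layer quota idea is a per-vhost space limit plus a cap on the total space taken by all Vary variants of a single resource. The sketch below models only that accounting. All names (`struct vhost_quota`, `quota_check()`, and so on) are hypothetical, and the choice to check the per-resource layer first is an assumption of the sketch.

```c
#include <stddef.h>

/* Hypothetical accounting structures; names and limits are illustrative. */
struct vhost_quota {
    size_t used;      /* bytes currently cached for the vhost */
    size_t limit;     /* layer 1: space quota of the vhost */
    size_t res_limit; /* layer 2: cap for all variants of one resource */
};

struct resource_usage {
    size_t used;      /* bytes used by all Vary variants of the resource */
};

enum quota_verdict {
    QUOTA_OK,             /* the new variant fits */
    QUOTA_EVICT_RESOURCE, /* evict sibling variants of the same resource */
    QUOTA_EVICT_VHOST     /* evict other records of the same vhost */
};

/*
 * Decide what has to happen before new_bytes can be stored. The resource
 * layer is checked first, so a badly chosen Vary header pushes out its own
 * variants rather than unrelated content of the vhost. The comparisons are
 * written so that they cannot overflow.
 */
static enum quota_verdict
quota_check(const struct vhost_quota *vh, const struct resource_usage *res,
            size_t new_bytes)
{
    if (new_bytes > vh->res_limit || res->used > vh->res_limit - new_bytes)
        return QUOTA_EVICT_RESOURCE;
    if (new_bytes > vh->limit || vh->used > vh->limit - new_bytes)
        return QUOTA_EVICT_VHOST;
    return QUOTA_OK;
}

/* Account for a stored variant on both layers. */
static void
quota_charge(struct vhost_quota *vh, struct resource_usage *res, size_t bytes)
{
    vh->used += bytes;
    res->used += bytes;
}
```

The verdict only says which data set to shrink; which records to pick inside it would still be up to the vhost's eviction strategy.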
It must be possible to 'lock' some records in evictable data sets (see #858 and #471).
Purging
Once this feature is implemented, we should be able to update the site content normally, without a Tempesta restart or memory leaks. It's hard to track which new pages appeared and which were deleted during a site content update, so in this task we need:
/foo/*.php or /foo/bar/* Done in TDB eviction and stale response processing #2074
immediate (purge in original [Cache] purging #501) strategy for the purging (we still need the mode to leave stale responses in the cache for Servicing stale cached responses and immediate purging #522);
Documentation
Need to update https://github.com/tempesta-tech/tempesta/wiki/Caching-Responses#manual-cache-purging wiki page.
Testing
invalidate and immediate strategies
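As a reference model for such tests, the difference between the two strategies over wildcard patterns like /foo/*.php from the Purging section can be sketched as below. The sketch assumes shell-style matching via POSIX fnmatch(), which may differ from the pattern semantics actually implemented, and `struct entry`, `enum purge_mode` and `purge()` are hypothetical names.

```c
#include <fnmatch.h>
#include <stdbool.h>
#include <stddef.h>

enum purge_mode { PURGE_INVALIDATE, PURGE_IMMEDIATE };

/* Hypothetical model of a cache entry, enough to check purge behaviour. */
struct entry {
    const char *uri;
    bool present; /* the response is still stored in the cache */
    bool stale;   /* the response must not be served as fresh */
};

/*
 * Apply a purge to every stored entry whose URI matches the pattern.
 * "invalidate" leaves the response in the cache but marks it stale,
 * "immediate" removes it. Returns the number of affected entries.
 * With flags == 0, fnmatch() lets '*' match '/' as well.
 */
static size_t
purge(struct entry *tbl, size_t n, const char *pattern, enum purge_mode mode)
{
    size_t hits = 0;

    for (size_t i = 0; i < n; i++) {
        if (!tbl[i].present || fnmatch(pattern, tbl[i].uri, 0) != 0)
            continue;
        if (mode == PURGE_IMMEDIATE)
            tbl[i].present = false;
        else
            tbl[i].stale = true;
        hits++;
    }
    return hits;
}
```

A test would then purge with each strategy and check that invalidated entries remain available for stale servicing while immediately purged entries are gone.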