redis
diff --git a/src/redis_stack/probabilistic_data_structures.md b/src/redis_stack/probabilistic_data_structures.md
Lines changed: 24 additions & 12 deletions
diff --git a/src/redis_stack/redis_for_time_series.md b/src/redis_stack/redis_for_time_series.md
Lines changed: 13 additions & 15 deletions
@@ -1,27 +1,39 @@
-In very broad terms probabilistic data structures (PDS) allow us to get to a "close enough" result in a much shorter time and by using significantly less memory.
+This tutorial will demonstrate Redis Stack's probabilistic data structure capabilities using the bike shop use case.
 
-Redis Stack supports 4 of the most famous PDS:
-- Bloom filters
-- Cuckoo filters
-- Count-Min Sketch
+Redis Stack supports the following probabilistic data structures:
+
+- Bloom filter
+- Cuckoo filter
+- Count-min sketch
 - Top-K
+- t-digest
+
+Probabilistic data structures, in general, provide results that are "close enough" in a much shorter time and by using significantly less memory than other data types such as sets or hashes. Here, you'll learn how to use a Bloom filter.
+
+A Bloom filter allows you to check if an element is present in a set using a very small, fixed-size amount of memory. A query will return one of two possible answers:
 
-In the rest of this tutorial we'll introduce how you can use a Bloom filter to save many heavy calls to the relational database, or a lot of memory, compared to using sets or hashes.
-A Bloom filter is a probabilistic data structure that enables you to check if an element is present in a set using a very small memory space of a fixed size. **It can guarantee the absence of an element from a set, but it can only give an estimation about its presence**. So when it responds that an element is not present in a set (a negative answer), you can be sure that indeed is the case. However, one out of every N positive answers will be wrong.
-Even though it looks unusual at a first glance, this kind of uncertainty still has its place in computer science. There are many cases out there where a negative answer will prevent very costly operations;
+1. the element *might* be in the set
+2. the element is definitely not in the set
 
-How can a Bloom filter be useful to our bike shop? For starters, we could keep a Bloom filter that stores all usernames of people who've already registered with our service. That way, when someone is creating a new account we can very quickly check if that username is free. If the answer is yes, we'd still have to go and check the main database for the precise result, but if the answer is no, we can skip that call and continue with the registration. 
+In other words, a Bloom filter will guarantee the absence of an element in a set, but it can only give an estimation about its presence. False positives are entirely possible. See [this Wikipedia article](https://en.wikipedia.org/wiki/Bloom_filter) for more detailed information about false positives and their frequency.
 
-Another, perhaps more interesting example is for showing better and more relevant ads to users. We could keep a bloom filter per user with all the products they've bought from the shop, and when we get a list of products from our suggestion engine we could check it against this filter.
+Despite the uncertainty involved when using Bloom filters, they are still valuable for many applications.
 
+How can a Bloom filter be helpful to an online bike shop service? For starters, you can use a Bloom filter to store the usernames of people who've already registered with the shop. When someone creates a new account, the system can very quickly check whether the proposed username is already taken. If the filter answers that the username might be present, confirm using your primary database. If it answers that the username is definitely not present, further checks are not required and registration can proceed.
 
-```redis Add all bought product ids in the Bloom filter
+Another use case is targeting ads to users. A per-user Bloom filter can be created and populated with all the products that user has purchased from the shop. When the shop's ad suggestion engine provides a list of possible ads to show a user, it can check each item against the user's Bloom filter. Each item that is definitely not in the filter is a good target. For each item that might be in the filter, a second query can be made to the primary database to confirm. If that check shows the user has not bought the product, the ad can be added to the target list.
+
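+First, create the Bloom filter and add the purchased product IDs. `BF.MADD` creates the filter with default settings if the key does not already exist.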
+
+```redis Add all bought product IDs to a Bloom filter
 BF.MADD user:778:bought_products  4545667 9026875 3178945 4848754 1242449
 ```
 
-Just before we try to show an ad to a user, we can first check if that product id is already in their "bought products" Bloom filter. If the answer is yes - we might choose to check the main database, or we might skip to the next recommendation from our list. But if the answer is no, then we know for sure that our user hasn't bought that product:
+Next, run a couple of queries.
 
 ```redis Has a user bought this product?
 BF.EXISTS  user:778:bought_products 1234567  // No, the user has not bought this product
 BF.EXISTS  user:778:bought_products 3178945  // The user might have bought this product
 ```
+
+You can read more about Bloom filters and their use cases [here](https://redis.io/docs/data-types/probabilistic/bloom-filter/). See [here](https://redis.io/commands/?group=bf) for the complete list of Bloom filter commands.
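
The "might be in the set" versus "definitely not in the set" behavior described in this file can be illustrated with a small, self-contained Python sketch. This is only a toy model under stated assumptions: the class `ToyBloomFilter`, its parameters (`num_bits`, `num_hashes`), the salted SHA-256 hashing, and the helper `should_query_database` are all hypothetical names chosen for illustration, and none of it reflects how Redis Stack implements the `BF.*` commands.

```python
import hashlib


class ToyBloomFilter:
    """Illustrative Bloom filter; NOT the Redis Stack implementation."""

    def __init__(self, num_bits=1024, num_hashes=3):
        # Both sizes are arbitrary assumptions for this sketch.
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = [False] * num_bits

    def _positions(self, item):
        # Derive num_hashes bit positions by salting SHA-256 with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        # Set every bit position derived from the item.
        for pos in self._positions(item):
            self.bits[pos] = True

    def might_contain(self, item):
        # False means "definitely not present"; True means "might be present".
        return all(self.bits[pos] for pos in self._positions(item))


# Product IDs taken from the BF.MADD example in the tutorial.
bought = ["4545667", "9026875", "3178945", "4848754", "1242449"]
bought_filter = ToyBloomFilter()
for product_id in bought:
    bought_filter.add(product_id)


def should_query_database(product_id):
    # Only a "might be present" answer needs confirmation in the primary database.
    return bought_filter.might_contain(product_id)
```

An item that was added always yields a "might be present" answer, so the filter never produces a false negative. An item that was never added usually yields "definitely not present", but it can occasionally collide with bits set by other items, which is the false-positive case the tutorial text mentions.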
@@ -1,10 +1,11 @@
-Among other things, Redis Stack provides you with a native time series data structure. Let's see how a time series might
-be useful in our bike shop.
+This tutorial will demonstrate Redis Stack's ability to store time series data using the bike shop use case.
 
-As we have multiple physical shops too, alongside our online shop, it could be helpful to have an overview of the sales
-volume. We will create one time series per shop tracking the total amount of all sales. In addition, we will mark the
-time series with the appropriate region label, `east` or `west`. This kind of representation will allow us to easily
-query bike sales performance per certain time periods, per shop, per region or across all shops.
+The bike shop company consists of multiple physical stores and an online presence. It would be helpful to have an aggregate view of sales volume across the physical and online stores.
+
+In the following example, a time series key is created for each of the five shops to track total sales. Each key is marked with an appropriate region label, `east` or `west`. This kind of representation allows you to easily
+query bike sales performance over specific time periods on a per shop, per region, or across all shops basis.
+
+Notice the `DUPLICATE_POLICY SUM` argument in each of the following commands.
 
 ```redis Create time series per shop
 TS.CREATE bike_sales_1 DUPLICATE_POLICY SUM LABELS region east compacted no
@@ -14,12 +15,10 @@ TS.CREATE bike_sales_4 DUPLICATE_POLICY SUM LABELS region west compacted no
 TS.CREATE bike_sales_5 DUPLICATE_POLICY SUM LABELS region west compacted no
 ```
 
-In the above query, we make the shop id (1,2,3,4,5) a part of the time series name. You might also notice
-the `DUPLICATE_POLICY SUM` argument; this describes what should be done when two events in the same time series share
-the same timestamp: In this case, it would mean that two sales happened at exactly the same time, so the resulting value
-should be a sum of the two sales amounts.
+Notice the `DUPLICATE_POLICY SUM` arguments; these describe how Redis should behave when two samples in the same time series (that is, the same shop) have
+the same timestamp. In this case, two sales that happen at exactly the same time in a particular shop are added together.
 
-Since the metrics are collected with a millisecond timestamp, we can compact our time series into sales per hour:
+Time series data is collected using millisecond timestamps. You can compact time series data and make it available in aggregations of various sizes. Here's an example of aggregating data by day:
 
 ```redis Time series compaction
 TS.CREATE bike_sales_1_per_day LABELS region east compacted yes
@@ -714,7 +713,7 @@ TS.RANGE bike_sales_1 - + AGGREGATION avg 3600000
 TS.RANGE bike_sales_2 - + AGGREGATION avg 86400000
 ```
 
-```redis Get sales per day, across west region
+```redis Get sales per day for the west region
 // Get sales per day, across all shops in the west region
 TS.MRANGE - + AGGREGATION sum 86400000 FILTER region=west compacted=no
 ```
@@ -727,9 +726,7 @@ TS.MRANGE - + FILTER region=(east,west) compacted=no GROUPBY region REDUCE sum
 TS.MRANGE - + AGGREGATION avg 86400000 FILTER region=(east,west) compacted=no GROUPBY region REDUCE sum
 ```
 
-Remember that compacted version of the time series we created at the beginning of this section? This last one is exactly the kind
-of queries we might need it for. Instead of doing the aggregation on all the data points, we can simply run this query
-on the compacted time series:
+The next two queries show how compacted time series reduce the amount of data that aggregated queries need to read.
 
 ```redis All sales per region, compacted
 TS.MRANGE - + FILTER region=(east,west) compacted=yes GROUPBY region REDUCE sum
@@ -739,3 +736,4 @@ TS.MRANGE - + FILTER region=(east,west) compacted=yes GROUPBY region REDUCE sum
 TS.RANGE bike_sales_3_per_day - + FILTER_BY_VALUE 3000 5000
 ```
 
+You can read more about time series and use cases [here](https://redis.io/docs/data-types/timeseries/). See [here](https://redis.io/commands/?group=timeseries) for the complete list of time series commands.
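
The two ideas this file relies on, summing samples that share a timestamp (`DUPLICATE_POLICY SUM`) and rolling millisecond samples up into per-day buckets, can be modeled in a few lines of plain Python. This is a rough sketch only: `ToySeries`, `add`, and `sum_per_day` are hypothetical names, the sample values are made up, and the code mimics the described behavior rather than calling Redis or reproducing its internals.

```python
from collections import defaultdict

DAY_MS = 86_400_000  # one day in milliseconds, the bucket size used in the per-day queries


class ToySeries:
    """Toy model of one shop's time series; NOT the Redis implementation."""

    def __init__(self):
        # Maps a millisecond timestamp to the stored value.
        self.samples = defaultdict(float)

    def add(self, timestamp_ms, value):
        # SUM-style duplicate handling: a second sample at the same
        # timestamp is added to the value that is already stored.
        self.samples[timestamp_ms] += value

    def sum_per_day(self):
        # Align each timestamp to the start of its day bucket and sum per bucket.
        buckets = defaultdict(float)
        for ts, value in self.samples.items():
            buckets[ts - ts % DAY_MS] += value
        return dict(sorted(buckets.items()))


shop = ToySeries()
shop.add(1_000, 250.0)       # first sale (illustrative amount)
shop.add(1_000, 100.0)       # same timestamp, so the amounts are summed
shop.add(DAY_MS + 5, 80.0)   # a sale on the following day
daily = shop.sum_per_day()
```

Here `daily` holds one entry per day bucket, which is the same shape of result a per-day compacted series is meant to provide without rescanning every raw sample.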