From 12f4e014bd7724d7ed2c861057b52f634b9fbc02 Mon Sep 17 00:00:00 2001 From: Igor Bogoslavskyi Date: Sun, 1 Dec 2024 17:00:36 +0100 Subject: [PATCH 01/26] wip: adding optional and variant lecture --- lectures/optional_and_variant.md | 144 +++++++++++++++++++++++++++++++ 1 file changed, 144 insertions(+) create mode 100644 lectures/optional_and_variant.md diff --git a/lectures/optional_and_variant.md b/lectures/optional_and_variant.md new file mode 100644 index 0000000..27285fc --- /dev/null +++ b/lectures/optional_and_variant.md @@ -0,0 +1,144 @@ + +**`std::optional` and `std::variant` in Modern C++** +-- + +

+ Video Thumbnail +

+ +## **Introduction** + +When working with modern C++ (C++17 and beyond), we often need tools to handle optional values or represent data that can take one of several types. That’s where `std::optional` and `std::variant` come into play. Today, we’ll explore what these features are, why they’re useful, and how you can leverage them in your projects. + + +## **What is `std::optional`?** + +### Why use `std::optional`? + +Imagine a function that searches for an item in a container. If the item is found, the function should return it. But what if it isn’t? Before C++17, you might have returned a special value (like `-1` for integers) or used a pointer, potentially introducing ambiguity or risking undefined behavior. + +`std::optional` solves this by explicitly representing the absence of a value. It's a type-safe mechanism that avoids the pitfalls of ad-hoc solutions. + +### Examples of `std::optional` in action + +#### A simple search function + +````cpp +#include +#include +#include + +std::optional Find(const std::vector& data, int value) { + for (int element : data) { + if (element == value) { + return element; // Return the value if found + } + } + return std::nullopt; // Explicitly indicate "no value" +} + +int main() { + std::vector numbers = {1, 2, 3, 4, 5}; + auto result = Find(numbers, 3); + + if (result) { // Check if a value exists + std::cout << "Found: " << *result << '\n'; + } else { + std::cout << "Not found.\n"; + } +} +```` +In this example, `std::optional` clearly communicates that the function may or may not return a value. + +#### A factory function + +````cpp +std::optional CreateString(bool should_create) { + if (should_create) { + return "Hello, World!"; + } + return std::nullopt; +} + +int main() { + auto maybe_string = CreateString(true); + + if (maybe_string) { + std::cout << *maybe_string << '\n'; + } else { + std::cout << "No string created.\n"; + } +} +```` +--- + +## **What is `std::variant`?** + +### Why use `std::variant`? 
+ +`std::variant` is a type-safe union introduced in C++17. It allows a variable to hold one value out of a defined set of types. Think of it as a more flexible alternative to `enum` or `std::any`, but with static type checking. + +For instance, if a variable can hold either an integer or a string, you can use `std::variant` instead of rolling your own solution with `void*` or `boost::variant`. + +### Examples of `std::variant` in action + +#### Basic usage + +````cpp +#include +#include +#include + +int main() { + std::variant value; + + value = 42; // Assign an integer + std::cout << "Integer: " << std::get(value) << '\n'; + + value = "Hello, std::variant!"; // Assign a string + std::cout << "String: " << std::get(value) << '\n'; +} +```` +#### Pattern matching with `std::visit` + +````cpp +#include +#include +#include + +int main() { + std::variant value = "Hello, Variant!"; + + std::visit([](auto&& arg) { + using T = std::decay_t; + if constexpr (std::is_same_v) { + std::cout << "Integer: " << arg << '\n'; + } else if constexpr (std::is_same_v) { + std::cout << "String: " << arg << '\n'; + } + }, value); +} +```` +Here, `std::visit` applies a visitor (a callable object) to the value contained in the variant. + +--- + +## **Key differences and common use cases** + +| Feature | `std::optional` | `std::variant` | +|--------------------|------------------------------------------------------|------------------------------------------------| +| Purpose | Represents optional values (may or may not exist). | Represents one of several types. | +| Typical Use Case | Returning a value or "nothing" from a function. | Handling inputs or data with multiple types. | +| Type Safety | Yes. | Yes. | +| Pattern Matching | Not applicable. | Supported via `std::visit`. | + +--- + +## **Summary** + +`std::optional` and `std::variant` are two powerful tools in the C++ toolbox that greatly enhance type safety and code readability. + +- Use `std::optional` when a value might be absent. 
+- Use `std::variant` when a value can be one of several types. + +These features enable us to write cleaner, more expressive code while avoiding common pitfalls. Experiment with them in your projects and see how they can simplify your development workflow! From 489aabc1893a0d644572e4eb7574748e5b9c31c4 Mon Sep 17 00:00:00 2001 From: Igor Bogoslavskyi Date: Tue, 3 Dec 2024 21:50:13 +0100 Subject: [PATCH 02/26] Update optional text --- lectures/optional_and_variant.md | 102 ++++++++++++++++--------------- 1 file changed, 52 insertions(+), 50 deletions(-) diff --git a/lectures/optional_and_variant.md b/lectures/optional_and_variant.md index 27285fc..f2f4e67 100644 --- a/lectures/optional_and_variant.md +++ b/lectures/optional_and_variant.md @@ -6,75 +6,77 @@ Video Thumbnail

-## **Introduction** +When working with modern C++ (C++17 and beyond), we often need tools to handle optional values or represent data that can take one of several types. That’s where `std::optional` and `std::variant` come into play. Today, we’ll explore what these features are, why they’re useful, and how to use properly them. -When working with modern C++ (C++17 and beyond), we often need tools to handle optional values or represent data that can take one of several types. That’s where `std::optional` and `std::variant` come into play. Today, we’ll explore what these features are, why they’re useful, and how you can leverage them in your projects. + +## Why use `std::optional`? +To understand why we need `std::optional` I believe its best to start with an example. -## **What is `std::optional`?** +Let's say we have a function `GetAnswerFromLlm` that, getting a question, is supposed to answer all of our questions using some large language model. +```cpp +#include -### Why use `std::optional`? +std::string GetAnswerFromLlm(const std::string& question); +``` -Imagine a function that searches for an item in a container. If the item is found, the function should return it. But what if it isn’t? Before C++17, you might have returned a special value (like `-1` for integers) or used a pointer, potentially introducing ambiguity or risking undefined behavior. +In a normal case, this is a good-enough interface, we ask it things and get some answers. But what happens if something goes wrong within this function? What if it _cannot_ answer our question? What should it return so that we know that an error has occurred. -`std::optional` solves this by explicitly representing the absence of a value. It's a type-safe mechanism that avoids the pitfalls of ad-hoc solutions. 
+Largely speaking there are two school of thought here: +- It can throw an **exceptions** to indicate that some error has happened +- Or it can return a special value to indicate a failure -### Examples of `std::optional` in action +I will not talk too much about exceptions today, I will just mention that in many codebases, especially those that contain safety-critical code, exceptions are banned altogether due to the fact that there is, strictly speaking, no way to guarantee their runtime performance because of their dynamic implementation. -#### A simple search function +This prompted people to think our of the box to avoid using exceptions but still to know that something went wrong during the execution of their function. -````cpp +In the olden days (before C++17), people would return a special value from the function. For example, we could just return some pre-defined string, for example an empty one, should something have gone wrong. But what if we ask our LLM to actually return an empty string and it would fail to do so? What should it return then? + +This is where `std::optional` comes to the rescue. We can now return a `std::optional` instead of just returning a `std::string`: +```cpp #include -#include -#include +#include -std::optional Find(const std::vector& data, int value) { - for (int element : data) { - if (element == value) { - return element; // Return the value if found - } - } - return std::nullopt; // Explicitly indicate "no value" -} +std::optional GetAnswerFromLlm(const std::string& question); +``` +Now it is super clear when reading this function that it might fail because it only optionally returns a string. -int main() { - std::vector numbers = {1, 2, 3, 4, 5}; - auto result = Find(numbers, 3); - - if (result) { // Check if a value exists - std::cout << "Found: " << *result << '\n'; - } else { - std::cout << "Not found.\n"; - } -} -```` -In this example, `std::optional` clearly communicates that the function may or may not return a value. 
+`llm.hpp` +```cpp +#include +#include -#### A factory function +std::optional GetAnswerFromLlm(const std::string& question); +``` -````cpp -std::optional CreateString(bool should_create) { - if (should_create) { - return "Hello, World!"; - } - return std::nullopt; -} +So let's see how we could work with such a function! For this we'll call it a couple of times with various prompts and process the results that we're getting: -int main() { - auto maybe_string = CreateString(true); +`main.cpp` +```cpp +#include "llm.hpp" - if (maybe_string) { - std::cout << *maybe_string << '\n'; - } else { - std::cout << "No string created.\n"; - } +int main() { + const auto suggestion = GetAnswerFromLlm( + "In one word, what should I do with my life?"); + if (!suggestion) return 1; + const auto further_suggestion = GetAnswerFromLlm( + std::string{"In one word, what should I do after doing this: "} + suggestion.value()); + if (!further_suggestion.has_value()) return 1; + std::cout << + "The LLM told me to " << *suggestion << + ", and then to " << *further_suggestion << std::endl; + return 0; } -```` ---- +``` +In general, `std::optional` provides an interface in which we are able to: +- Check if it holds a value by calling its `has_value()` method or implicitly converting it to `bool` +- Get the stored value by calling `value()` or using a dereferencing operator `*`. Beware, though that getting a value of an optional that holds no value is undefined behavior, so _always check_ that there is actually a value stored in an optional. + +There are many use-cases for `optional` in situations where we want to be able to handle a case where a value might exist but also might be missing under certain circumstances. -## **What is `std::variant`?** + -### Why use `std::variant`? +## Why use `std::variant`? `std::variant` is a type-safe union introduced in C++17. It allows a variable to hold one value out of a defined set of types. 
Think of it as a more flexible alternative to `enum` or `std::any`, but with static type checking. From 664ef3a169b3c0757cd1de2f98c5469701cbf3cb Mon Sep 17 00:00:00 2001 From: Igor Bogoslavskyi Date: Sun, 8 Dec 2024 19:07:29 +0100 Subject: [PATCH 03/26] Optional lecture topic complete --- lectures/optional.md | 216 +++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 216 insertions(+) create mode 100644 lectures/optional.md diff --git a/lectures/optional.md b/lectures/optional.md new file mode 100644 index 0000000..54a49c9 --- /dev/null +++ b/lectures/optional.md @@ -0,0 +1,216 @@ + +**`std::optional` and `std::expected` in Modern C++** +-- + +

+ Video Thumbnail +

+ +When working with modern C++, we often need tools to handle optional values. These are useful in many situations, like when returning from a function that might fail during execution. Since C++17 we have a class `std::optional` that can be used in such situations. And since C++23 we're also getting `std::expected`. So let's chat about what these types are, when to use them and what to think about while using them. + + + + +## Use `std::optional` to represent optional class fields +For example, imagine we implement a game with a character that can hold an item in either hand. +```cpp +template <typename Item> +struct Character { + Item left_hand_item; + Item right_hand_item; +}; +``` + +The character, however, might hold nothing in their hands, so how do we model this? + +We _could_ just replace the items with pointers, so that a stored `nullptr` would mean that the character holds no item in that hand. But this has certain drawbacks as it changes the semantics of these variables. Before, our `Character` object had value semantics and now it follows pointer semantics under the hood, meaning that copying our `Character` object would become harder. The simple choice of allowing the character to have no objects in their hands should not force these unrelated design decisions. + +One way to avoid this issue is to store a `std::optional<Item>` in each hand of the character instead: +```cpp +#include <optional> + +template <typename Item> +struct Character { + std::optional<Item> left_hand_item; + std::optional<Item> right_hand_item; +}; +``` + +Now it is clear just by looking at this tiny code snippet that neither item is required for the correct operation of the character and we did not change the value semantics of our object. + +Before we talk about how to use `std::optional`, I'd like to first talk a bit about another important use case: error handling. 
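Going back to the `Character` example for a moment, the claim that value semantics are preserved can be checked with a small sketch. The `Sword` type and the helpers `CountHeldItems` and `CopyIsIndependent` are hypothetical names introduced only for this illustration; they are not part of the lecture code.

```cpp
#include <optional>
#include <string>

// Hypothetical item type, used only for this illustration.
struct Sword {
  std::string name;
};

template <typename Item>
struct Character {
  std::optional<Item> left_hand_item;
  std::optional<Item> right_hand_item;
};

// Counts how many hands currently hold an item.
template <typename Item>
int CountHeldItems(const Character<Item>& character) {
  return static_cast<int>(character.left_hand_item.has_value()) +
         static_cast<int>(character.right_hand_item.has_value());
}

// Copies a character and empties the copy's hand.
// Returns true if the original still holds its item afterwards.
inline bool CopyIsIndependent() {
  Character<Sword> hero{};
  hero.left_hand_item = Sword{"Excalibur"};
  Character<Sword> clone = hero;  // A plain copy, no pointers involved.
  clone.left_hand_item.reset();   // Empty the clone's left hand only.
  return hero.left_hand_item.has_value() &&
         !clone.left_hand_item.has_value();
}
```

Because each `std::optional` owns its value, the compiler-generated copy of `Character` is a deep copy, which is exactly what the pointer-based variant would have made harder.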
+ +## Use `std::optional` to return from functions that might fail +Let's say we have a function `GetAnswerFromLlm` that takes a question and is supposed to answer it using some large language model. +```cpp +#include <string> + +std::string GetAnswerFromLlm(const std::string& question); +``` + +This is a simple interface that serves its purpose in most situations: we ask it things and get some `std::string` answers. But what happens if something goes wrong within this function? What if it _cannot_ answer our question? What should it return so that we know that an error has occurred? + +Broadly speaking there are two schools of thought here: +- It can throw an **exception** to indicate that some error has occurred +- It can return a special value to indicate a failure + +### Why not throw an exception +We'll have to briefly talk about the first option here if only to explain why we're not going to talk about it in depth. + +Generally, at any point in our program we can `throw` an exception. It is then handled in a separate execution path, invisible to the user, and can be caught at any point in the program upstream from the place where the exception was thrown. + +In our case, the `GetAnswerFromLlm` would then throw an exception if, say, the network was down and our LLM of choice was unreachable: +```cpp +#include <stdexcept> +#include <string> + +// GetLlmHandle() is assumed to be declared elsewhere. +std::string GetAnswerFromLlm(const std::string& question) { + const auto llm_handle = GetLlmHandle(); + if (!llm_handle) { + throw std::runtime_error("Cannot get LLM handle"); + } + return llm_handle->GetAnswer(question); +} +``` +If we are set on using exceptions, on the calling side, we would need to "catch" exceptions using `try`-`catch` blocks. Generally, we wrap the code we want to execute into a `try` block that is followed by one or more `catch` blocks that handle all of our potential errors. 
+```cpp +#include <iostream> +#include <stdexcept> + +int main() { + try { + const auto answer = GetAnswerFromLlm("What am I doing with my life?"); + std::cout << answer << std::endl; + } catch (const std::runtime_error& error) { + std::cerr << error.what() << std::endl; + } catch (...) { + std::cerr << "Unexpected error happened" << std::endl; + } +} +``` +I will not talk too much about exceptions, mostly because in my decade of using C++ professionally I have very rarely worked in code bases that use exceptions. Many code bases, especially those that contain safety-critical code, ban exceptions altogether due to the fact that there is, strictly speaking, no way to guarantee how long it takes to process an exception once one is thrown because of their dynamic implementation. + +Furthermore, they have another issue of creating a hidden logic path that can be hard to trace. We have to become very rigorous about which function throws which exceptions and when. In some cases, the only way to know this is by relying on the documentation of a function which, in many cases, does not fully exist. I firmly believe that the statement `catch (...)` is singlehandedly responsible for many errors that you've undoubtedly encountered before yourself. Just imagine that the `LlmHandle::GetAnswer` function also throws some other exception that we don't expect. This would lead to us showing the "unexpected error happened" message, which is not super useful to the user of our code. + + +### Avoid the hidden error path +All of these issues prompted people to think outside the box to avoid using exceptions while still being able to know that something went wrong during the execution of their function. + +In the olden days (before C++17), there were only three options: +1. To return a special value from the function that indicates a failure: + ```cpp + #include <string> + + // 😱 Not a great idea nowadays. 
+ std::string GetAnswerFromLlm(const std::string& question) { + const auto llm_handle = GetLlmHandle(); + if (!llm_handle) { return {}; } + return llm_handle->GetAnswer(question); + } + ``` + This option is not ideal because it is hard to define an appropriate "failure" value to return from most functions. For example, an empty string sounds like a good option for such a value, but then the LLM response to a query "Read this text, answer with an empty string when done" would overlap with such a default value. Not great, and the same problem arises for any string we would designate as the failure value. +2. Another historic option is to return an error code from the function, which requires passing any values that the function has to change as a non-const reference or pointer: + ```cpp + #include <string> + + // 😱 Not a great idea nowadays. + int GetAnswerFromLlm(const std::string& question, std::string& answer) { + const auto llm_handle = GetLlmHandle(); + if (!llm_handle) { return 1; } + answer = llm_handle->GetAnswer(question); + return 0; + } + ``` + This option is equally poor because now we lose a lot of the benefits that we get from the compiler optimizing the value returned from a function, and we also reduce the readability of the code. This method is error prone and hard to read. Not great either. +3. An even worse but still used method (OpenGL, anyone?) is to set some global error variable and inspect its value after every call to see if something bad has happened: + ```cpp + #include <string> + + // 😱 Not a great idea to have a global variable. + inline static int last_error{}; + + // 😱 Not a great idea nowadays. 
+ std::string GetAnswerFromLlm(const std::string& question) { + const auto llm_handle = GetLlmHandle(); + if (!llm_handle) { + last_error = 1; + return {}; + } + last_error = 0; + return llm_handle->GetAnswer(question); + } + ``` + I believe I don't have to go into many details as to why this is not an ideal way to deal with errors: it is even less readable and more error prone than the previous method. We even have to use a global variable! Good luck testing this code, especially when running a number of tests in parallel. + +But I would not be telling you all of the above if there were no better way, of course. This is where `std::optional` comes to the rescue. Instead of all of the horrible things we've just discussed, we can return a `std::optional<std::string>` rather than a plain `std::string`: + +`llm.hpp` +```cpp +#include <optional> +#include <string> + +std::optional<std::string> GetAnswerFromLlm(const std::string& question); +``` +Now it is super clear when reading this function that it might fail because it only optionally returns a string. It also forces us to deal with any potential error happening inside of this function when we call it because the _type_ of the value we get forces us to do it. No hidden error path! + +## How to work with `std::optional` +So let's see how we could work with such a function! 
For this we'll call it a couple of times with various prompts and process the results that we're getting: + +`main.cpp` +```cpp +#include "llm.hpp" + +#include <iostream> +#include <string> + +int main() { + const auto suggestion = GetAnswerFromLlm( + "In one word, what should I do with my life?"); + if (!suggestion) return 1; + const auto further_suggestion = GetAnswerFromLlm( + std::string{"In one word, what should I do after doing this: "} + suggestion.value()); + if (!further_suggestion.has_value()) return 1; + std::cout << + "The LLM told me to " << *suggestion << + ", and then to " << *further_suggestion << std::endl; + return 0; +} +``` +In general, `std::optional` provides an interface in which we are able to: +- Check if it holds a value by calling its `has_value()` method or by using it in a boolean context such as an `if` condition +- Get the stored value by calling `value()` or using the dereferencing operator `*`. Beware, though, that dereferencing an optional that holds no value with `*` is undefined behavior, while `value()` throws a `std::bad_optional_access` exception in that case, so _always check_ that there is actually a value stored in an optional. + +There are many use cases for `optional` in situations where we want to be able to handle a case where a value might exist but also might be missing under certain circumstances. + + + +## What about `std::expected` +There is just one more quality of life improvement that we are missing here. If we receive a `std::optional` object that stores a `std::nullopt` in it as a result of a function call, we know that the function failed. But we don't know **why** it failed. + +This is why in C++23 we are getting a class `std::expected` that, while being very similar to `std::optional`, has a second template parameter: in `std::expected<T, E>`, `E` is the type of an error that might be stored in this object instead of the value of type `T` that we expect. This way, we can store arbitrary values to indicate that an error has occurred: +```cpp +#include <expected> +#include <string> +
+std::expected<std::string, std::string> GetAnswerFromLlm(const std::string& question) { + const auto llm_handle = GetLlmHandle(); + if (!llm_handle) { + return std::unexpected{"No network"}; + } + return llm_handle->GetAnswer(question); +} +``` +Now if we have a network outage, we can return an error that tells us that this is the case. And should `LlmHandle::GetAnswer` return an expected too, its result would automagically propagate to the caller of the `GetAnswerFromLlm` function. + +## Performance implications +Broadly speaking, both `std::optional` and `std::expected` are implemented on top of a `union` in C++, meaning that the expected and unexpected values are stored _in the same underlying memory_, with a small flag and helper functions allowing us to query which one is actually stored there. + +This means that if the unexpected type is smaller than the expected type, there is no memory overhead beyond that small flag. This leads us to the first performance consideration: do not use large types for the unexpected type in `std::expected`. There is not much we can do wrong with `std::optional` on this front as it holds a small `std::nullopt` type if it does not hold the expected type. + +As everything about these types is known at compile time, they also allow the compiler to optimize the code that uses them quite well and generally do not have any overhead over a single `if` statement. Which leads us to our second performance consideration: if you have a very tight loop that does not use optional or expected values, measure the runtime of your code if you need to introduce those and make sure that the performance still meets your requirements. + +Finally, there are some quirks of the compilers and how they optimize the values returned from functions. If we create the objects that we aim to return in a wrong way, the compiler might generate unnecessary moves or copies of the objects. + +For more on how to return such objects, please see a short and clear video by Jason Turner that covers this topic. 
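On the memory side of this discussion, a rough way to see the cost on your own platform is to compare `sizeof` for a type with and without the `std::optional` wrapper. This is a minimal sketch; `OptionalOverhead` is a hypothetical helper introduced here, and the exact numbers are implementation-specific, so none are quoted.

```cpp
#include <cstddef>
#include <optional>
#include <string>

// Extra bytes that std::optional<T> occupies on top of T itself.
// The result depends on the standard library and on the alignment of T.
template <typename T>
constexpr std::size_t OptionalOverhead() {
  return sizeof(std::optional<T>) - sizeof(T);
}

// Typical implementations need at least one byte for the "has value" flag.
static_assert(OptionalOverhead<int>() >= 1);
static_assert(OptionalOverhead<std::string>() >= 1);
```

On common implementations the flag is padded up to the alignment of the wrapped type, which is why the overhead tends to be a few bytes rather than a single bit.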
+ + +## Summary +Overall, classes like `std::optional` and `std::expected` are extremely useful to represent values that optionally hold a value. Sometimes it is enough for us to know that the value simply might not exist, that's where `std::optional` shines but sometimes we would also like to know **why** the value does not exist and that's why `std::expected` has been added. + +These classes are super useful - they make the code readable, maintain value semantics which is used quite often when coding in modern C++ and keep the code very performant. + + From a40da96fd9cc36181afea74dd53c596c2b352bac Mon Sep 17 00:00:00 2001 From: Igor Bogoslavskyi Date: Sun, 8 Dec 2024 19:39:03 +0100 Subject: [PATCH 04/26] Initial variant text --- lectures/optional_and_variant.md | 158 +++++++++++-------------------- 1 file changed, 55 insertions(+), 103 deletions(-) diff --git a/lectures/optional_and_variant.md b/lectures/optional_and_variant.md index f2f4e67..c848826 100644 --- a/lectures/optional_and_variant.md +++ b/lectures/optional_and_variant.md @@ -1,146 +1,98 @@ -**`std::optional` and `std::variant` in Modern C++** +`std::variant` in Modern C++ --

Video Thumbnail

-When working with modern C++ (C++17 and beyond), we often need tools to handle optional values or represent data that can take one of several types. That’s where `std::optional` and `std::variant` come into play. Today, we’ll explore what these features are, why they’re useful, and how to use properly them. +In the last lecture we talked about `std::optional` and `std::expected` types that make our life better. It might be useful to understand _how_ they can store two values of different types in the same memory. We can get a glimpse into this by understanding how `std::variant` works. Furthermore, we can store many more types than two in it. This, incidentally also happens to be the key to mimicking dynamic polymorphism when using templates. -## Why use `std::optional`? -To understand why we need `std::optional` I believe its best to start with an example. - -Let's say we have a function `GetAnswerFromLlm` that, getting a question, is supposed to answer all of our questions using some large language model. -```cpp -#include - -std::string GetAnswerFromLlm(const std::string& question); -``` - -In a normal case, this is a good-enough interface, we ask it things and get some answers. But what happens if something goes wrong within this function? What if it _cannot_ answer our question? What should it return so that we know that an error has occurred. - -Largely speaking there are two school of thought here: -- It can throw an **exceptions** to indicate that some error has happened -- Or it can return a special value to indicate a failure - -I will not talk too much about exceptions today, I will just mention that in many codebases, especially those that contain safety-critical code, exceptions are banned altogether due to the fact that there is, strictly speaking, no way to guarantee their runtime performance because of their dynamic implementation. 
- -This prompted people to think our of the box to avoid using exceptions but still to know that something went wrong during the execution of their function. - -In the olden days (before C++17), people would return a special value from the function. For example, we could just return some pre-defined string, for example an empty one, should something have gone wrong. But what if we ask our LLM to actually return an empty string and it would fail to do so? What should it return then? - -This is where `std::optional` comes to the rescue. We can now return a `std::optional` instead of just returning a `std::string`: -```cpp -#include -#include +## Why use `std::variant`? -std::optional GetAnswerFromLlm(const std::string& question); -``` -Now it is super clear when reading this function that it might fail because it only optionally returns a string. +`std::variant` is a type-safe `union` type introduced in C++17. It allows a variable to hold one value out of a defined set of types. -`llm.hpp` +For instance, if a variable can hold either an integer or a string, you can use `std::variant` and put any value in it: ```cpp -#include +#include +#include #include -std::optional GetAnswerFromLlm(const std::string& question); -``` - -So let's see how we could work with such a function! For this we'll call it a couple of times with various prompts and process the results that we're getting: - -`main.cpp` -```cpp -#include "llm.hpp" - int main() { - const auto suggestion = GetAnswerFromLlm( - "In one word, what should I do with my life?"); - if (!suggestion) return 1; - const auto further_suggestion = GetAnswerFromLlm( - std::string{"In one word, what should I do after doing this: "} + suggestion.value()); - if (!further_suggestion.has_value()) return 1; - std::cout << - "The LLM told me to " << *suggestion << - ", and then to " << *further_suggestion << std::endl; + // This compiles + std::variant value; + value = 42; // value holds an int. 
+ std::cout << "Integer: " << std::get<int>(value) << '\n'; + value = "42"; // value now holds a string. + std::cout << "String: " << std::get<std::string>(value) << '\n'; return 0; } ``` -In general, `std::optional` provides an interface in which we are able to: -- Check if it holds a value by calling its `has_value()` method or implicitly converting it to `bool` -- Get the stored value by calling `value()` or using a dereferencing operator `*`. Beware, though that getting a value of an optional that holds no value is undefined behavior, so _always check_ that there is actually a value stored in an optional. -There are many use-cases for `optional` in situations where we want to be able to handle a case where a value might exist but also might be missing under certain circumstances. +### How is `std::variant` used in practice? +While cool already, the current tiny example might feel quite limited. Think about it: we somehow have to _know_ which type our `std::variant` holds to use it. Which almost feels like it defeats the purpose. And to a degree it does. - +But we should not despair. This is C++ after all, so there is an option that lets us work with _any_ type that the variant holds: the visitor pattern, implemented through the `std::visit` function: -## Why use `std::variant`? - -`std::variant` is a type-safe union introduced in C++17. It allows a variable to hold one value out of a defined set of types. Think of it as a more flexible alternative to `enum` or `std::any`, but with static type checking. - -For instance, if a variable can hold either an integer or a string, you can use `std::variant` instead of rolling your own solution with `void*` or `boost::variant`. 
- -### Examples of `std::variant` in action - -#### Basic usage - -````cpp +```cpp #include <iostream> #include <string> #include <variant> -int main() { - std::variant value; - - value = 42; // Assign an integer - std::cout << "Integer: " << std::get(value) << '\n'; +struct Printer { + void operator()(int value) const { + std::cout << "Integer: " << value << '\n'; + } + void operator()(const std::string& value) const { + std::cout << "String: " << value << '\n'; + } +}; - value = "Hello, std::variant!"; // Assign a string - std::cout << "String: " << std::get(value) << '\n'; +int main() { + std::variant<int, std::string> value = "Hello, Variant!"; + std::visit(Printer{}, value); + value = 42; + std::visit(Printer{}, value); } -```` -#### Pattern matching with `std::visit` +``` +Here, `std::visit` applies a [function object](lambdas.md#before-lambdas-we-had-function-objects-or-functors) to the value contained in the variant. Should our variant hold a string, the operator that accepts a string is called, and should it hold an integer, the operator that accepts an integer is called instead. + +Note that a typical mistake beginners make is to forget that all of the checks for this code happen at compile time without taking into account the runtime logic of our code. -````cpp +If, for example, we changed our `Printer` function object to a `LengthPrinter` function object that only knows how to print the length of a string, our code would not compile even though we only ever actually store a `std::string` in our variant: +```cpp #include <iostream> #include <string> #include <variant> +struct LengthPrinter { + void operator()(const std::string& value) const { + std::cout << "String length: " << value.size() << '\n'; + } +}; + int main() { + // ❌ Does not compile! 
 std::variant<int, std::string> value = "Hello, Variant!"; - - std::visit([](auto&& arg) { - using T = std::decay_t<decltype(arg)>; - if constexpr (std::is_same_v<T, int>) { - std::cout << "Integer: " << arg << '\n'; - } else if constexpr (std::is_same_v<T, std::string>) { - std::cout << "String: " << arg << '\n'; - } - }, value); + std::visit(LengthPrinter{}, value); } -```` -Here, `std::visit` applies a visitor (a callable object) to the value contained in the variant. +``` +This happens because which type the variant holds is only known at runtime, so the compiler must generate code for _every_ type that the variant can store and guarantee that all of these code paths compile. It cannot prove that no other code, for example a dynamic library linked to our code after it gets compiled, ever stores an `int` in our variant, and if that happens the compiled code must know how to deal with it. ---- +Many people find this confusing and get burned by it at least a couple of times until it becomes intuitive. Please remember that it just takes time. -## **Key differences and common use cases** +## `std::monostate` +Whenever we default-construct a new `std::variant` object, it stores a value-initialized object of the type that is first in the list of types that the variant can store. Sometimes this is undesirable and we want the variant to be initialized in an "empty" state. For this purpose there is a type `std::monostate` in the standard library and we can define our variant type using `std::monostate` as its first type in the list. +```cpp +std::variant<std::monostate, int, std::string> value{}; +// value holds an instance of std::monostate now. +``` -| Feature | `std::optional` | `std::variant` | -|--------------------|------------------------------------------------------|------------------------------------------------| -| Purpose | Represents optional values (may or may not exist). | Represents one of several types. | -| Typical Use Case | Returning a value or "nothing" from a function. | Handling inputs or data with multiple types. | -| Type Safety | Yes. | Yes. | -| Pattern Matching | Not applicable. 
 | Supported via `std::visit`. | +Note that this probably means that we'll need to differentiate between our variant holding the `std::monostate` value and it holding some other value in the `std::visit` that we will inevitably use at a later point in time. ---- ## **Summary** -`std::optional` and `std::variant` are two powerful tools in the C++ toolbox that greatly enhance type safety and code readability. - -- Use `std::optional` when a value might be absent. -- Use `std::variant` when a value can be one of several types. - -These features enable us to write cleaner, more expressive code while avoiding common pitfalls. Experiment with them in your projects and see how they can simplify your development workflow! +Overall, `std::variant` is extremely important for modern C++. If we implement our code largely using templates or concepts and need to enable polymorphic behavior based on some values provided at runtime, there is probably no way for us to avoid using it, which also means that we will probably need to use `std::visit` too. These things might well be confusing from the get-go, but after we've looked into how function objects and lambdas work we should have no issues using all of this machinery. 
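To make the `std::visit` and `std::monostate` discussion a bit more tangible, here is a minimal sketch of a visitor assembled from lambdas that handles the "empty" state too. The `Overloaded` helper, the `Value` alias, and the `Describe` function are names invented for this illustration; none of them come from the standard library or from the lecture code.

```cpp
#include <string>
#include <variant>

// Hypothetical helper: merges several lambdas into a single function object
// that has one operator() per lambda (a common C++17 idiom).
template <class... Ts>
struct Overloaded : Ts... {
  using Ts::operator()...;
};
// Deduction guide, required in C++17 and harmless in later standards.
template <class... Ts>
Overloaded(Ts...) -> Overloaded<Ts...>;

// Assumed variant type for this sketch, with std::monostate as "empty".
using Value = std::variant<std::monostate, int, std::string>;

// Hypothetical function: describes whatever the variant currently holds.
// Every alternative needs a matching lambda, otherwise std::visit
// does not compile.
std::string Describe(const Value& value) {
  return std::visit(
      Overloaded{
          [](std::monostate) { return std::string{"Empty"}; },
          [](int number) { return "Integer: " + std::to_string(number); },
          [](const std::string& text) { return "String: " + text; },
      },
      value);
}
```

Calling `Describe(Value{})` takes the `std::monostate` branch, while `Describe(Value{42})` takes the `int` one. Removing any of the three lambdas should turn this into a compile error, which is the behavior described in the text above.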
From 2eefc7a55d9cf93961370702f99ba069e1705fd7 Mon Sep 17 00:00:00 2001 From: Igor Bogoslavskyi Date: Mon, 16 Dec 2024 22:39:39 +0100 Subject: [PATCH 05/26] Further update optional text, rename variant lecture --- lectures/images/error.png.webp | Bin 0 -> 3892 bytes lectures/optional.md | 110 +++++++++++------- .../{optional_and_variant.md => variant.md} | 0 3 files changed, 68 insertions(+), 42 deletions(-) create mode 100644 lectures/images/error.png.webp rename lectures/{optional_and_variant.md => variant.md} (100%) diff --git a/lectures/images/error.png.webp b/lectures/images/error.png.webp new file mode 100644 index 0000000000000000000000000000000000000000..1c38906bfd893a7f111893009d6137a73d20325e GIT binary patch literal 3892
Video Thumbnail

-When working with modern C++, we often need tools to handle optional values. These are useful in many situations, like when returning from a function that might fail during execution. Since C++17 we have a class `std::optional` that can be used in such situations. And since C++23 we're also getting `std::expected`. So let's chat about what these types are, when to use them and what to think about while using them. +When working with modern C++, we often need tools to handle optional values. These are useful in many situations, like when returning from a function that might fail during execution. Since C++17 we have a class `std::optional` that can be used in such situations. And since C++23 we're also getting `std::expected`. So let's chat about what these types are, when to use them and what to remember when using them. ## Use `std::optional` to represent optional class fields -For example, imagine we implement a game and we have some items that it can hold in either hand. +For example, imagine that we want to implement a game character and we have some items that they can hold in either hand. ```cpp template struct Character { @@ -21,9 +21,21 @@ struct Character { }; ``` -The character, however, might hold nothing in their hands, so how do we model this? +The character, however, might hold nothing in their hands too, so how do we model this? -We _could_ just replace them with pointers and if there is a `nullptr` stored there it would mean that the character holds no item there. But this has certain drawbacks as it changes the semantics of these variables. Before, our `Character` object had value semantics and now it follows pointer semantics under the hood, meaning that copying our `Character` object would become harder. The simple choice of allowing the character to have no objects in their hands should not force these unrelated design decisions. 
 +We _could_ just replace the items with pointers and if there is a `nullptr` stored in either of those it would mean that the character holds no item in the corresponding hand. But this has certain drawbacks as it changes the semantics of these variables. +```cpp +// 😱 Who owns the items? +template +struct Character { + Item* left_hand_item; + Item* right_hand_item; +}; +``` + +Before, our `Character` object had value semantics and now it follows pointer semantics under the hood, meaning that copying our `Character` object would become [harder](memory_and_smart_pointers.md#performing-shallow-copy-by-mistake). + +This is not great. The simple decision of allowing the character to have no objects in their hands forces us to actively think about memory, complicating the implementation and forcing unrelated design considerations upon us. One way to avoid this issue is to store a `std::optional` in each hand of the character instead: ```cpp @@ -34,9 +46,9 @@ struct Character { }; ``` -Now it is clear just by looking at this tiny code snippet that neither item is required for the correct operation of the character and we did not change the value-semantics of our object. +Now it is clear just by looking at this tiny code snippet that neither item is required for the correct operation of the character. As a bonus, the object still has value semantics and can be copied and moved without any issues. -Before we talk about how to use `std:::optional`, I'd like to first talk a bit about another important use-case - error handling. +Before we talk about how to use `std::optional`, I'd like to first talk a bit about another important use-case for it - **error handling**. ## Use `std::optional` to return from functions that might fail Let's say we have a function `GetAnswerFromLlm` that, getting a question, is supposed to answer all of our questions using some large language model. 
 @@ -46,18 +58,18 @@ Let's say we have a function `GetAnswerFromLlm` that, getting a question, is sup std::string GetAnswerFromLlm(const std::string& question); ``` -This is a simple interface that serves its purpose in most situations: we ask it things and get some `std::string` answers. But what happens if something goes wrong within this function? What if it _cannot_ answer our question? What should it return so that we know that an error has occurred. +This is a simple interface that serves its purpose in most situations: we ask it things and get some `std::string` answers, sometimes of questionable quality. But what happens if something goes wrong within this function? What if it _cannot_ answer our question? What should this function return so that we know that an error has occurred? Broadly speaking, there are two schools of thought here: - It can throw an **exception** to indicate that some error has occurred - It can return a special value to indicate a failure ### Why not throw an exception -We'll have to briefly talk about the first option here if only to explain why we're not going to talk about in-depth. +We'll have to briefly talk about the first option here, if only to explain why we're not going to talk about it in depth. And I can already see people with pitchforks coming for me, so do note that this is a highly debated topic, with even thoughts of [re-imagining exceptions altogether](https://www.youtube.com/watch?v=ARYP83yNAWk). -Generally, at any point in our program we can `throw` an exception. It then is handled in a separate execution path, invisible to the user and can be caught at any point in the program upstream from the place where the exception was thrown. +Anyway. Exceptions. Generally, at any point in our program we can `throw` an exception. It is then handled in a separate execution path, invisible to the user, and can be caught, by value or by reference, at any point in the program upstream from the place where the exception was thrown.
Yes, exceptions are polymorphic and use [runtime polymorphism](inheritance.md#using-virtual-for-interface-inheritance-and-proper-polymorphism), which is one of the issues people have with them. -In our case, the `GetAnswerFromLlm` would then throw an exception if, say, the network was down and our LLM of choice was unreachable: +In our case, if, say, the network would be down and our LLM of choice would be unreachable, the `GetAnswerFromLlm` would throw an exception, say a `std::runtime_error`: ```cpp #include @@ -69,7 +81,8 @@ std::string GetAnswerFromLlm(const std::string& question) { return llm_handle->GetAnswer(question); } ``` -If we are set on using exceptions, on the calling side, we would need to "catch" exceptions using the `try`-`catch` blocks. Generally, we wrap the code we want to execute into a `try` block that is followed by a `catch` block that handles all of our potential errors. + +On the calling side, we would need to "catch" this exception using the `try`-`catch` blocks. Generally, if using exceptions for reporting errors, we wrap the code we want to execute into a `try` block that is followed by a `catch` block that handles all of our potential errors. ```cpp int main() { try { @@ -82,16 +95,22 @@ int main() { } } ``` -I will not talk too much about exceptions, mostly because in all of my decade of using C++ professionally I very rarely worked in code bases that use exceptions. Many code bases, especially those that contain safety-critical code, ban exceptions altogether due to the fact that there is, strictly speaking, no way to guarantee how long it takes to process an exception once one is thrown because of their dynamic implementation. +I will not talk too much about exceptions, mostly because in around a decade of using C++ professionally I very rarely worked in code bases that use exceptions. 
Many code bases, especially those that contain safety-critical code, ban exceptions altogether due to the fact that there is, strictly speaking, no way to guarantee how long it takes to process an exception once one is thrown because of their dynamic implementation. + +Furthermore, there is another thing I don't really like about them. They create a hidden logic path that can be hard to trace when reading the code. +You see, the `catch` block that catches an exception can be in _any_ calling function and it will catch a matching exception that is thrown at any depth of the call stack. + +This typically means that we have to become very rigorous about what function throws which exceptions when and, in some cases, the only way to know this is by relying on a documentation of a function which, in many cases, does not fully exist or is not up to date. I firmly believe that the statement `catch (...)` is singlehandedly responsible for many errors that we've all encountered. + +Video Thumbnail -Furthermore, they have another issue of creating a hidden logic path that can be hard to trace. We have to become very rigorous about what function throws which exceptions when and, in some cases, the only way to know this is by relying on a documentation of a function which, in many cases, does not fully exist. I firmly believe that the statement `catch (...)` is singlehandedly responsible for many errors that you've undoubtedly encountered before yourself. Just imagine that the `LlmHandle::GetAnswer` function also throws some other exception that we don't expect - this would lead us to showing the "unexpected error happened" message, which is not super useful to the user of our code. - +To be a bit more concrete, just imagine that the `LlmHandle::GetAnswer` function throws some other exception, say `std::logic_error` that we don't expect - this would lead us to showing such a `"Something happened"` message, which is not super useful to the user of our code. 
 ### Avoid the hidden error path All of these issues prompted people to think outside the box to avoid using exceptions but still to allow them to know that something went wrong during the execution of their function. -In the olden days (before C++17), there were only three options: -1. To return a special value from the function that indicates a failure: +In the olden days (before C++17), there were only three options. +1. The first one was to return a special value from the function. When the user receives this value they know that an error has occurred: ```cpp #include @@ -102,7 +121,7 @@ In the olden days (before C++17), there were only three options: return llm_handle->GetAnswer(question); } ``` - This option is not ideal because it is hard to define an appropriate "failure" value to return from most functions. For example, an empty string sounds like a good option for such a value, but then the LLM response to a query "Read this text, answer with empty string when done" would overlap with such a default value. Not great and the logic would be similar for any string we would designate as the failure value. + This option is not ideal because it is hard to define an appropriate "failure" value to return from most functions. For example, an empty string sounds like a good option for such a value, but then the LLM response to a query "Read this text, answer with empty string when done" would overlap with such a default value. Not great, right? We can of course extend the same logic to any string we would designate as the "failure value". 2.
 Another historic option is to return an error code from the function, which required passing any values that the function had to change as a non-const reference or pointer: ```cpp #include @@ -115,12 +134,13 @@ In the olden days (before C++17), there were only three options: return 0; } ``` - This options is equally poor because now we lose a lot of benefits that we get with the compiler optimizing the return value that we get from a function and also reduce the readability of the code. This method is error prone and hard to read. Not great either. -3. An even worse but also still used method (OpenGL, anyone?) method is to set some global error variable and explore its value after every call to see if something bad has happened. + This option is also not great. I would argue that not being able to have pure functions that get only const inputs and return a single output makes the code a lot less readable. Furthermore, modern compilers are very good at optimizing the returned value, and sometimes even the function that constructs this value altogether, which might be a bit harder if we pass a reference to some value stored elsewhere. Although I don't know enough about the magic that the compilers do under the hood to be 100% sure about this second reason, so if you happen to know more - tell me! + +3. An arguably even worse but still sometimes used method (OpenGL, anyone?) is to set some global error variable if an error has occurred and check its value after every call to see if something bad has actually happened. ```cpp #include - // 😱 Not a great idea to have a global variable. + // 😱 Not a great idea to have a global mutable variable. inline static int last_error{}; // 😱 Not a great idea nowadays. 
 @@ -134,18 +154,24 @@ In the olden days (before C++17), there were only three options: return llm_handle->GetAnswer(question); } ``` - I believe I don't have to go into many details as to why his is not an ideal way to deal with errors: it is even less readable and more error prone than the previous method. We even have to use a global variable! Good luck testing this code, especially when running a number of tests in parallel. + I believe I don't have to go into many details as to why this is not an ideal way to deal with errors: it is even less readable and more error prone than the previous method. We even have to use a mutable global variable! Good luck testing this code, especially when running a number of tests in parallel. -But I would not be telling you all of the above if there were no better way of course. This is where `std::optional` comes to the rescue. Instead of all of the horrible things we've just discussed, we can return a `std::optional` instead of just returning a `std::string`: +But I would not be telling you all of this if there were no better way. This is where `std::optional` comes to the rescue. Instead of all of the horrible things we've just discussed, we can return a `std::optional<std::string>` rather than just a `std::string`: `llm.hpp` ```cpp #include <optional> #include <string> -std::optional<std::string> GetAnswerFromLlm(const std::string& question); +std::optional<std::string> GetAnswerFromLlm(const std::string& question) { + const auto llm_handle = GetLlmHandle(); + if (!llm_handle) { return {}; } + return llm_handle->GetAnswer(question); +} ``` -Now it is super clear when reading this function that it might fail because it only optionally returns a string. It also forces us to deal with any potential error happening inside of this function when we call it because the _type_ or the value we get forces us to do it. No hidden error path! +Now it is super clear when reading this function that it might fail because it only _optionally_ returns a string.
 It also forces us to deal with any potential error happening inside of this function when we call it because the _type_ of the value we get forces us to do it. No hidden error path! + +Note also that the code of the function itself stayed _exactly_ the same as in the case where we would indicate an error by returning an empty string, just the return type is different! ## How to work with `std::optional` So let's see how we could work with such a function! For this we'll call it a couple of times with various prompts and process the results that we're getting: @@ -163,29 +189,21 @@ int main() { if (!further_suggestion.has_value()) return 1; std::cout << "The LLM told me to " << *suggestion << - ", and then to " << *further_suggestion << std::endl; + ", and then to " << further_suggestion.value() << std::endl; return 0; } ``` In general, `std::optional` provides an interface in which we are able to: - Check if it holds a value by calling its `has_value()` method or by converting it to `bool`, for example in an `if` condition -- Get the stored value by calling `value()` or using a dereferencing operator `*`. Beware, though that getting a value of an optional that holds no value is undefined behavior, so _always check_ that there is actually a value stored in an optional. - -There are many use-cases for `optional` in situations where we want to be able to handle a case where a value might exist but also might be missing under certain circumstances. - - +- Get the stored value by calling `value()` or using the dereferencing operator `*`, as well as `->` should we want to call methods or get data of an object stored in the optional wrapper. Beware, though, that using `*` or `->` on an optional that holds no value is undefined behavior, while `value()` throws a `std::bad_optional_access` exception in that case, so _always check_ that there is actually a value stored in an optional. -## What about `std::expected` -There is just one more quality of life improvement that we are missing here. +## Use `std::expected` to tell why a function failed +There is just one more quality of life improvement that we are missing here.
If we receive a `std::optional` object that stores a `std::nullopt` in it as a result of a function call, we know that the function failed. But we don't know **why** it failed. +## Use `std::expected` to tell why a function failed +There is just one more quality of life improvement that we are missing here. If we receive a `std::optional` object that stores a `std::nullopt` as a result of a function call, we know that the function failed. But we don't know **why** it failed. This is why in C++23 we are getting a class `std::expected` that, while being very similar to `std::optional` has another template parameter: `std::expected` that stores the type of an error that might be stored in this object instead of the value we expect. This way, we can store arbitrary values to indicate that an error has occurred: ```cpp #include -// 😱 Not a great idea to have a global variable. -inline static int last_error{}; - -// 😱 Not a great idea nowadays. std::expected GetAnswerFromLlm(const std::string& question) { const auto llm_handle = GetLlmHandle(); if (!llm_handle) { @@ -194,19 +212,27 @@ std::expected GetAnswerFromLlm(const std::string& ques return llm_handle->GetAnswer(question); } ``` -Now if we have a network outage, we can return an error that tells us about this being the case and should the `LlmHandle::GetAnswer` return an expected too, it would automagically propagate to the caller of the `GetAnswerFromLlm` function. +Now if we have a network outage, we can return an error that tells us about this being the case and should the `LlmHandle::GetAnswer` return an expected object of the same type too, it would automagically propagate to the caller of the `GetAnswerFromLlm` function. 
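Since `std::expected` requires C++23, the following is only a rough C++17 stand-in for the same "value or reason for failure" idea, built on `std::variant`. It is a sketch for illustration and not how `std::expected` is implemented. `LlmError`, `AnswerOrError`, `FakeGetAnswerFromLlm`, and `Report` are hypothetical names, and the network state is passed in as a plain `bool` to keep the example self-contained.

```cpp
#include <string>
#include <variant>

// Hypothetical error type for this sketch.
enum class LlmError { kNetworkDown };

// Holds either the answer or the reason why there is none.
using AnswerOrError = std::variant<std::string, LlmError>;

// Hypothetical stand-in for GetAnswerFromLlm; the "LLM" just echoes.
AnswerOrError FakeGetAnswerFromLlm(const std::string& question,
                                   bool network_is_up) {
  if (!network_is_up) { return LlmError::kNetworkDown; }
  return "Answer to: " + question;
}

// The caller can tell success from failure and inspect the reason.
std::string Report(const AnswerOrError& result) {
  if (const auto* answer = std::get_if<std::string>(&result)) {
    return *answer;
  }
  switch (std::get<LlmError>(result)) {
    case LlmError::kNetworkDown:
      return "Error: network is down";
  }
  return "Error: unknown";
}
```

Unlike an empty `std::optional`, the failing branch here carries a reason that the caller can act upon, which is the quality of life improvement that `std::expected` provides out of the box.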
 -## Performance implications +## How are they implemented and their performance implications Largely speaking, both `std::optional` and `std::expected` are typically implemented using a `union`, meaning that the expected and unexpected values are stored _in the same underlying memory_, with a small flag and helper functions allowing us to query which one is actually stored there. -This means that if the unexpected type is smaller than the expected type, there is no memory overhead. This leads us to the first performance consideration: do not use large types for the unexpected type in `std::expected`. There is not much we can do wrong with `std::optional` on this front as it holds a small `std::nullopt` type if it does not hold the expected type. +This means that if the unexpected type is not larger than the expected type, there is very little memory overhead, essentially just the flag. This leads us to the first performance consideration: **we should not use large types for the _unexpected_ type in `std::expected`**. Otherwise, we might be wasting a lot of memory: +```cpp +// 😱 Not a great idea. +std::expected<int, HugeType> SomeFunction(); +``` +Here, instead of returning a tiny `int` object we will now always return an object that takes at least as much memory as `HugeType`. As moving more memory around is more work, this will also most probably be slower than returning tiny integer numbers. + +The good news here is that there is not much we can do wrong with `std::optional` on this front, as an empty optional stores no second type at all, just the flag that tells us that there is no value. + +As you might have already guessed, both `std::optional` and `std::expected` are class templates. Which means that they are instantiated and checked at compile time. Which incidentally allows the compiler to optimize the code that uses them quite well. This in turn means that generally neither `std::optional` nor `std::expected` has much of a runtime overhead.
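To get a feeling for the memory footprint, here is a small sketch that relies only on `std::optional` from C++17; as far as I understand, the same reasoning carries over to the error type of `std::expected`. `HugeType` is a hypothetical 1 KiB struct made up for this example, and exact sizes are implementation-specific, so the code only checks relations that hold on common implementations.

```cpp
#include <cstddef>
#include <optional>

// Hypothetical large type used only for illustration.
struct HugeType {
  char data[1024];
};

// An optional has to store the value itself plus the information on whether
// a value is present, so for these types it is larger than the wrapped type.
constexpr std::size_t kSmallSize = sizeof(std::optional<int>);
constexpr std::size_t kHugeSize = sizeof(std::optional<HugeType>);

static_assert(kSmallSize > sizeof(int));
static_assert(kHugeSize > sizeof(HugeType));

// Returns how many extra bytes the optional wrapper adds to an int
// on the current platform.
constexpr std::size_t OptionalOverhead() { return kSmallSize - sizeof(int); }
```

Printing `OptionalOverhead()` on a given machine shows how much the "is there a value" flag costs once alignment is taken into account.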
 -As these types are compile-time they also allow the compiler to optimize the code that uses them quite well and generally do not have any overhead over a single `if` statement. Which leads us to our second performance consideration: if you have a very tight loop that does not use optional or expected values, measure the runtime of your code if you need to introduce those and make sure that performance is still satisfied. +That being said, they might not come completely for free, which leads us to our second performance consideration: **if we have a very tight loop that does not use `optional` or `expected` values, we must measure the runtime of our code if we introduce those and make sure that the performance is still satisfactory**. Finally, there are some quirks of the compilers and how they work around optimizing the return values from the functions. If we create objects that we aim to return in a wrong way, the compiler might generate unnecessary moves or copies of the objects. Here is how to return our objects: -For more please see a short and clear video by Jason Turner that covers this topic. - +For more, please see a [short and clear video by Jason Turner](https://www.youtube.com/watch?v=0yJk5yfdih0) that covers this topic. ## Summary Overall, classes like `std::optional` and `std::expected` are extremely useful to represent values that optionally hold a value. Sometimes it is enough for us to know that the value simply might not exist, that's where `std::optional` shines but sometimes we would also like to know **why** the value does not exist and that's why `std::expected` has been added. 
diff --git a/lectures/optional_and_variant.md b/lectures/variant.md similarity index 100% rename from lectures/optional_and_variant.md rename to lectures/variant.md From c46063e7c6e7761c7784f477c2398e9a2c4a3dc4 Mon Sep 17 00:00:00 2001 From: Igor Bogoslavskyi Date: Tue, 17 Dec 2024 08:06:45 +0100 Subject: [PATCH 06/26] Closing in on final optional text --- lectures/optional.md | 41 ++++++++++++++++++++++++++++++----------- 1 file changed, 30 insertions(+), 11 deletions(-) diff --git a/lectures/optional.md b/lectures/optional.md index d940b98..c182134 100644 --- a/lectures/optional.md +++ b/lectures/optional.md @@ -1,20 +1,29 @@ -**`std::optional` and `std::expected` in Modern C++** +**`std::optional` and `std::expected`** --

Video Thumbnail

-When working with modern C++, we often need tools to handle optional values. These are useful in many situations, like when returning from a function that might fail during execution. Since C++17 we have a class `std::optional` that can be used in such situations. And since C++23 we're also getting `std::expected`. So let's chat about what these types are, when to use them and what to remember when using them. +- [**`std::optional` and `std::expected`**](#stdoptional-and-stdexpected) +- [Use `std::optional` to represent optional class fields](#use-stdoptional-to-represent-optional-class-fields) +- [Use `std::optional` to return from functions that might fail](#use-stdoptional-to-return-from-functions-that-might-fail) + - [Why not throw an exception](#why-not-throw-an-exception) + - [Avoid the hidden error path](#avoid-the-hidden-error-path) +- [How to work with `std::optional`](#how-to-work-with-stdoptional) +- [Use `std::expected` to tell why a function failed](#use-stdexpected-to-tell-why-a-function-failed) +- [How are they implemented and their performance implications](#how-are-they-implemented-and-their-performance-implications) +- [Summary](#summary) - +When working with modern C++, we often need tools to handle optional values. These are useful in many situations, like when returning from a function that might fail during execution. Since C++17 we have a class `std::optional` that can be used in such situations. And since C++23 we're also getting `std::expected`. So let's chat about what these types are, when to use them and what to remember when using them to make sure we're not sacrificing any performance. + + ## Use `std::optional` to represent optional class fields -For example, imagine that we want to implement a game character and we have some items that they can hold in either hand. 
 +As a first tiny example, imagine that we want to implement a game character and we have some items that they can hold in either hand (for now, we'll assume for simplicity that the items are of the same pre-defined type, but we could of course extend this example with a class template): ```cpp -template struct Character { Item left_hand_item; Item right_hand_item; @@ -23,10 +32,21 @@ struct Character { The character, however, might hold nothing in their hands too, so how do we model this? +As a naïve solution, we could of course just add two additional boolean values `has_item_in_left_hand` and `has_item_in_right_hand` respectively: +```cpp +struct Character { + Item left_hand_item; + Item right_hand_item; + // 😱 Not a great solution, we need to keep these in sync! + bool has_item_in_left_hand; + bool has_item_in_right_hand; +}; +``` +This is not a great solution as we would then need to keep these variables in sync and I, for one, do not trust myself with such an important task, especially if I can avoid it. So, speaking of avoiding this, can we somehow bake this information into the stored item types directly? + We _could_ just replace the items with pointers and if there is a `nullptr` stored in either of those it would mean that the character holds no item in the corresponding hand. But this has certain drawbacks as it changes the semantics of these variables. ```cpp // 😱 Who owns the items? -template struct Character { Item* left_hand_item; Item* right_hand_item; @@ -39,7 +59,6 @@ This is not great. 
The simple decision of allowing the character to have no obje One way to avoid this issue is to store a `std::optional` in each hand of the character instead: ```cpp -template struct Character { std::optional left_hand_item; std::optional right_hand_item; @@ -230,13 +249,13 @@ As you might have already guessed, both `std::optional` and `std::variant` are c That being said, they might not be completely for free which leads us to our second performance consideration: **if we have a very tight loop that does not use `optional` or `expected` values, we must measure the runtime of your code if we introduce those and make sure that performance is still satisfied**. -Finally, there are some quirks of the compilers and how they work around optimizing the return values from the functions. If we create objects that we aim to return in a wrong way, the compiler might generate unnecessary moves or copies of the objects. Here is how to return our objects: +Finally, there are some quirks around how the compilers are able to optimize the code when a function returns `optional` or `expected` values. If we create objects that we aim to return in a wrong way, the compiler might generate unnecessary moves or copies of the objects. Here is how to return our objects to avoid this: For more please see a [short and clear video by Jason Turner](https://www.youtube.com/watch?v=0yJk5yfdih0) that covers this topic. ## Summary -Overall, classes like `std::optional` and `std::expected` are extremely useful to represent values that optionally hold a value. Sometimes it is enough for us to know that the value simply might not exist, that's where `std::optional` shines but sometimes we would also like to know **why** the value does not exist and that's why `std::expected` has been added. +Overall, classes like `std::optional` and `std::expected` are extremely useful to represent values that optionally hold a value. 
Sometimes it is enough for us to know that the value simply might not be there, without caring for a reason behind this, that's where `std::optional` shines. But sometimes, especially when returning from functions, we would also like to know **why** the value does not exist and that's what `std::expected` has been added for in C++23. Oh, and if you'd like to use something like `std::expected` before C++23, take a peek at `tl::expected`, I've gotten some good mileage out of it over the years. -These classes are super useful - they make the code readable, maintain value semantics which is used quite often when coding in modern C++ and keep the code very performant. +These classes are very useful - they make the intent behind our code crystal-clear. They also allow us to keep the code readable and performant. - + From 94275d807513a80e6a707966045fc8b2f0fa3341 Mon Sep 17 00:00:00 2001 From: Igor Bogoslavskyi Date: Mon, 21 Apr 2025 22:33:19 +0200 Subject: [PATCH 07/26] Minor changes --- lectures/optional.md | 57 ++++++++++++++++++++++++++++++++++---------- 1 file changed, 44 insertions(+), 13 deletions(-) diff --git a/lectures/optional.md b/lectures/optional.md index c182134..01a3674 100644 --- a/lectures/optional.md +++ b/lectures/optional.md @@ -17,12 +17,14 @@ - [Summary](#summary) -When working with modern C++, we often need tools to handle optional values. These are useful in many situations, like when returning from a function that might fail during execution. Since C++17 we have a class `std::optional` that can be used in such situations. And since C++23 we're also getting `std::expected`. So let's chat about what these types are, when to use them and what to remember when using them to make sure we're not sacrificing any performance. +When working with modern C++, we often need tools to handle optional values. These are useful in many situations, like when returning from a function that might fail during execution. 
Since C++17 we have a templated class `std::optional` that can be used in such situations. And since C++23 we're also getting `std::expected`. So let's chat about what these types are, when to use them and what to remember when using them to make sure we're not sacrificing any performance. ## Use `std::optional` to represent optional class fields + As a first tiny example, imagine that we want to implement a game character and we have some items that they can hold in either hand (we'll for now assume that the items are of the same pre-defined type for simplicity but could of course extend this example with a class template): + ```cpp struct Character { Item left_hand_item; @@ -33,6 +35,7 @@ struct Character { The character, however, might hold nothing in their hands too, so how do we model this? As a naïve solution, we could of course just add two additional boolean values `has_item_in_left_hand` and `has_item_in_right_hand` respectively: + ```cpp struct Character { Item left_hand_item; @@ -42,9 +45,11 @@ struct Character { bool has_item_in_right_hand; }; ``` + This is not a great solution as we would then need to keep these variables in sync and I, for one, do not trust myself with such an important task, especially if I can avoid it. So, speaking of avoiding this, can we somehow bake this information into the stored item types directly? We _could_ just replace the items with pointers and if there is a `nullptr` stored in either of those it would mean that the character holds no item in the corresponding hand. But this has certain drawbacks as it changes the semantics of these variables. + ```cpp // 😱 Who owns the items? struct Character { @@ -58,6 +63,7 @@ Before, our `Character` object had value semantics and now it follows pointer se This is not great. The simple decision of allowing the character to have no objects in their hands forces us to actively think about memory, complicating the implementation and forcing unrelated design considerations upon us. 
One way to avoid this issue is to store a `std::optional` in each hand of the character instead: + ```cpp struct Character { std::optional<Item> left_hand_item; @@ -70,7 +76,9 @@ Now it is clear just by looking at this tiny code snippet that neither item is r Before we talk about how to use `std::optional`, I'd like to first talk a bit about another important use-case for it - **error handling**. ## Use `std::optional` to return from functions that might fail + Let's say we have a function `GetAnswerFromLlm` that, getting a question, is supposed to answer all of our questions using some large language model. + ```cpp #include @@ -80,15 +88,19 @@ std::string GetAnswerFromLlm(const std::string& question); This is a simple interface that serves its purpose in most situations: we ask it things and get some `std::string` answers, sometimes of questionable quality. But what happens if something goes wrong within this function? What if it _cannot_ answer our question? What should this function return so that we know that an error has occurred? Largely speaking there are two schools of thought here: + - It can throw an **exception** to indicate that some error has occurred - It can return a special value to indicate a failure +- TODO: add a third option where it sets some global error state ### Why not throw an exception -We'll have to briefly talk about the first option here if only to explain why we're not going to talk about in-depth. And I can already see people with pitchforks coming for me so do note that this is a highly-debated topic with even thoughts of [re-imagining exceptions altogether](https://www.youtube.com/watch?v=ARYP83yNAWk). -Anyway. Exceptions. Generally, at any point in our program we can `throw` an exception. It then is handled in a separate execution path, invisible to the user and can be caught at any point in the program upstream from the place where the exception was thrown by value or by reference. 
Yes, exceptions are polymorphic and use [runtime polymorphism](inheritance.md#using-virtual-for-interface-inheritance-and-proper-polymorphism), which is one of the issues people have with them. +We'll have to briefly talk about the first option here if only to explain why we're not going to talk about it in-depth. And I can already see people with pitchforks coming for me so do note that this is a highly-debated topic with even thoughts of [re-imagining exceptions altogether](https://www.youtube.com/watch?v=ARYP83yNAWk). + +Anyway. Exceptions. Generally, at any point in our program we can `throw` an exception. It then is handled in a separate execution path, invisible to the user and can be caught by value or by reference at any point in the program upstream from the place where the exception was originally thrown. Yes, exceptions are polymorphic and use [runtime polymorphism](inheritance.md#using-virtual-for-interface-inheritance-and-proper-polymorphism), which is one of the issues people have with them. In our case, if, say, the network would be down and our LLM of choice would be unreachable, the `GetAnswerFromLlm` would throw an exception, say a `std::runtime_error`: + ```cpp #include @@ -102,6 +114,7 @@ std::string GetAnswerFromLlm(const std::string& question) { ``` On the calling side, we would need to "catch" this exception using the `try`-`catch` blocks. Generally, if using exceptions for reporting errors, we wrap the code we want to execute into a `try` block that is followed by a `catch` block that handles all of our potential errors. + ```cpp int main() { try { @@ -114,37 +127,47 @@ int main() { } } ``` + I will not talk too much about exceptions, mostly because in around a decade of using C++ professionally I very rarely worked in code bases that use exceptions. 
Many code bases, especially those that contain safety-critical code, ban exceptions altogether due to the fact that there is, strictly speaking, no way to guarantee how long it takes to process an exception once one is thrown because of their dynamic implementation. + Furthermore, there is another thing I don't really like about them. They create a hidden logic path that can be hard to trace when reading the code. You see, the `catch` block that catches an exception can be in _any_ calling function and it will catch a matching exception that is thrown at any depth of the call stack. -This typically means that we have to become very rigorous about what function throws which exceptions when and, in some cases, the only way to know this is by relying on a documentation of a function which, in many cases, does not fully exist or is not up to date. I firmly believe that the statement `catch (...)` is singlehandedly responsible for many errors that we've all encountered. Video Thumbnail +This typically means that we have to become very rigorous about what function throws which exceptions when and, in some cases, the only way to know this is by relying on a documentation of a function which, in many cases, does not fully exist or is not up to date. I firmly believe that the statement `catch (...)` is singlehandedly responsible for many errors of the style of "oops, something happened" that we've all encountered. + To be a bit more concrete, just imagine that the `LlmHandle::GetAnswer` function throws some other exception, say `std::logic_error` that we don't expect - this would lead us to showing such a `"Something happened"` message, which is not super useful to the user of our code. + ### Avoid the hidden error path -All of these issues prompted people to think out of the box to avoid using exceptions but still to allow them to know that something went wrong during the execution of their function. 
+ +All of these issues prompted people to think out of the box to avoid using exceptions. And that while still having a way to know that something went wrong during the execution of some code. In the olden days (before C++17), there were only three options. -1. The first one was to return a special value from the function. When the user receives this function they know that an error has occurred: + +1. The first one was to return a special value from the function. When the user receives this value they know that an error has occurred: + ```cpp #include - // 😱 Not a great idea nowadays. + // 😱 Assumes empty string to indicate error. Not a great idea nowadays. std::string GetAnswerFromLlm(const std::string& question, std::string& answer) { const auto llm_handle = GetLlmHandle(); if (!llm_handle) { return {}; } return llm_handle->GetAnswer(question); } ``` - This option is not ideal because it is hard to define an appropriate "failure" value to return from most functions. For example, an empty string sounds like a good option for such a value, but then the LLM response to a query "Read this text, answer with empty string when done" would overlap with such a default value. Not great, right? We can extend the same logic of course for any string we would designate as the "failure value" -2. Another historic option is to return an error code from the function, which required passing any values that the function had to change as a non-const reference or pointer: + + This option is not ideal because it is hard to define an appropriate "failure" value to return from most functions. For example, an empty string sounds like a good option for such a value, but then the LLM response to a query "Read this text, do not answer anything when done" would overlap with such a default value. Not great, right? We can extend the same logic of course for any string we would designate as the "failure value". +2. 
Another option is to return an error code from the function, which requires passing any values that the function has to change as a non-const reference or pointer: + ```cpp #include + // Returns a status code rather than the value we want. // 😱 Not a great idea nowadays. int GetAnswerFromLlm(const std::string& question, std::string& answer) { const auto llm_handle = GetLlmHandle(); @@ -153,9 +176,11 @@ In the olden days (before C++17), there were only three options. return 0; } ``` + This option is also not great. I would argue that not being able to have pure functions that get only const inputs and return a single output makes the code a lot less readable. Furthermore, modern compilers are very good at optimizing the returned value and sometimes the function that constructs this value altogether which might be a bit harder if we pass a reference to some value stored elsewhere. Although I don't know enough about the magic that the compilers do under the hood to be 100% sure about this second reason, so if you happen to know more - tell me! 3. An arguably even worse but still sometimes used method (OpenGL, anyone?) is to set some global error variable if an error has occurred and inspect its value after every call to see if something bad has actually happened. + ```cpp #include @@ -173,11 +198,13 @@ In the olden days (before C++17), there were only three options. return llm_handle->GetAnswer(question); } ``` - I believe I don't have to go into many details as to why this is not an ideal way to deal with errors: it is even less readable and more error prone than the previous method. We even have to use a mutable global variable! Good luck testing this code, especially when running a number of tests in parallel. -But I would not be telling you all of this if there were no better way. This is where `std::optional` comes to the rescue. 
Instead of all of the horrible things we've just discussed, we can return a `std::optional` instead of just returning a `std::string`: + + I believe I don't have to go into many details as to why this is not an ideal way to deal with errors: it is even less readable and more error prone than the previous method. We even have to use a mutable global variable! Also, good luck [testing](googletest.md) this code, especially when running a number of tests in parallel. + +But I would not be telling you all of this if there were no better way, would I? This is where `std::optional` comes to the rescue. Instead of all of the horrible things we've just discussed, we can return a `std::optional` instead of just returning a `std::string`: `llm.hpp` + ```cpp #include #include @@ -188,14 +215,17 @@ std::optional GetAnswerFromLlm(const std::string& question) { return llm_handle->GetAnswer(question); } ``` + Now it is super clear when reading this function that it might fail because it only _optionally_ returns a string. It also forces us to deal with any potential error happening inside of this function when we call it because the _type_ of the value we get forces us to do it. No hidden error path! Note also that the code of the function itself stayed _exactly_ the same as in the case where we would indicate an error by returning an empty string, just the return type is different! ## How to work with `std::optional` + So let's see how we could work with such a function! 
For this we'll call it a couple of times with various prompts and process the results that we're getting: `main.cpp` + ```cpp #include "llm.hpp" @@ -236,12 +266,13 @@ Now if we have a network outage, we can return an error that tells us about this ## How are they implemented and their performance implications Largely speaking, both `std::optional` and `std::expected` are both implemented as a `union` in C++, meaning that the expected and unexpected values are stored _in the same underlying memory_ with helper functions allowing us to query which one is actually stored there. -This means that if the unexpected type is smaller than the expected type, there is no memory overhead. This leads us to the first performance consideration: **we should not use large types for the _unexpected_ type in `std::expected`**. Otherwise, we might be wasting a lot of memory: +This means that if the unexpected type has a smaller memory footprint than the expected type, then there is no memory overhead. This leads us to the first performance consideration: **we should not use large types for the _unexpected_ type in `std::expected`**. Otherwise, we might be wasting a lot of memory: ```cpp // 😱 Not a great idea. std::expected SomeFunction(); ``` -Here, instead of returning an tiny `int` object we will now always return an object that takes the same amount of memory as `HugeType`. As allocating memory is work, this will also most probably be slower than returning tiny integer numbers. +Here, instead of returning a tiny `int` object we will now always return an object that takes the same amount of memory as `HugeType`. As allocating memory is work, this will also most probably be slower than returning tiny integer numbers. + The good news here is that there is not much we can do wrong with `std::optional` on this front as it holds a small `std::nullopt` type if it does not hold the expected return type. 
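As a rough illustration of the memory argument above, here is a minimal sketch. It assumes a hypothetical `HugeType` and uses `std::variant` as a C++17 stand-in for `std::expected<int, HugeType>`, since both have to reserve storage for their largest alternative. The exact sizes are implementation details, so the sketch only checks lower bounds:

```cpp
#include <array>
#include <cstddef>
#include <optional>
#include <variant>

// Hypothetical large error type, assumed purely for illustration.
struct HugeType {
  std::array<char, 1024> payload{};
};

// Assumption: std::variant stands in for std::expected<int, HugeType> here,
// as both must reserve room for the larger of the two types.
using IntOrHuge = std::variant<int, HugeType>;

// The storage must fit the largest alternative plus a discriminator.
constexpr std::size_t SizeOfIntOrHuge() { return sizeof(IntOrHuge); }

// An optional needs room for its value and a "has value" flag.
constexpr std::size_t SizeOfOptionalInt() { return sizeof(std::optional<int>); }

static_assert(SizeOfIntOrHuge() >= sizeof(HugeType),
              "the union-like storage is at least as large as HugeType");
static_assert(SizeOfOptionalInt() >= sizeof(int),
              "an optional<int> stores at least an int");
```

Note that the object is never smaller than its largest alternative, which is why a huge error type makes every returned object huge, even on the happy path.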
From d0aeabbd217dac2a1dd945b8e058707aeb0608e2 Mon Sep 17 00:00:00 2001 From: Igor Bogoslavskyi Date: Mon, 26 May 2025 22:03:09 +0200 Subject: [PATCH 08/26] Restructure lecture further. --- lectures/optional.md | 158 +++++++++++++++++++++++++++++-------------- 1 file changed, 106 insertions(+), 52 deletions(-) diff --git a/lectures/optional.md b/lectures/optional.md index 01a3674..eae6e7e 100644 --- a/lectures/optional.md +++ b/lectures/optional.md @@ -1,83 +1,75 @@ -**`std::optional` and `std::expected`** +**Error handling in C++** --

Video Thumbnail

-- [**`std::optional` and `std::expected`**](#stdoptional-and-stdexpected) -- [Use `std::optional` to represent optional class fields](#use-stdoptional-to-represent-optional-class-fields) -- [Use `std::optional` to return from functions that might fail](#use-stdoptional-to-return-from-functions-that-might-fail) +- [**Error handling in C++**](#error-handling-in-c) +- [Disclaimer](#disclaimer) +- [What is error handling after all](#what-is-error-handling-after-all) +- [What to do about unrecoverable errors](#what-to-do-about-unrecoverable-errors) +- [How to recover from recoverable errors](#how-to-recover-from-recoverable-errors) - [Why not throw an exception](#why-not-throw-an-exception) + - [Why I don't use exceptions much](#why-i-dont-use-exceptions-much) - [Avoid the hidden error path](#avoid-the-hidden-error-path) - [How to work with `std::optional`](#how-to-work-with-stdoptional) - [Use `std::expected` to tell why a function failed](#use-stdexpected-to-tell-why-a-function-failed) +- [Use `std::optional` to represent optional class fields](#use-stdoptional-to-represent-optional-class-fields) - [How are they implemented and their performance implications](#how-are-they-implemented-and-their-performance-implications) - [Summary](#summary) +When writing code in C++, just like in life overall, we don't always get what we want. The good news is that we can prepare by being careful and anticipating some of the errors that we can encounter. There are many mechanisms in C++ for this and today we're talking about what options we have with an added benefit of some highly opinionated suggestions. All of you experienced C++ devs, prepare your pitch forks! :wink: -When working with modern C++, we often need tools to handle optional values. These are useful in many situations, like when returning from a function that might fail during execution. Since C++17 we have a templated class `std::optional` that can be used in such situations. 
And since C++23 we're also getting `std::expected`. So let's chat about what these types are, when to use them and what to remember when using them to make sure we're not sacrificing any performance. -## Use `std::optional` to represent optional class fields +## Disclaimer -As a first tiny example, imagine that we want to implement a game character and we have some items that they can hold in either hand (we'll for now assume that the items are of the same pre-defined type for simplicity but could of course extend this example with a class template): +The topics we cover today don't have a single simple answer. The main reason for this is the sheer power of C++ and all of the things it lets us do. This is only strengthened by how long C++ has existed, the diversity of the use-cases and the people who use it. Depending on your context, the particular way of thinking presented here might be more or less useful to you. My experience mostly comes from automotive and robotics bubbles and might not apply to your domain. I will do my best to mention all options, but will only cover in depth the areas that I have been using myself over the last 15 or so years. -```cpp -struct Character { - Item left_hand_item; - Item right_hand_item; -}; -``` +I aim to add links to opinions alternative to those expressed in this lecture to the best of my ability, but if I miss something, please do not hesitate to let me know in the comments. -The character, however, might hold nothing in their hands too, so how do we model this? +## What is error handling after all -As a naïve solution, we could of course just add two additional boolean values `has_item_in_left_hand` and `has_item_in_right_hand` respectively: +It makes sense to start our conversation with defining what we call an "error" in the first place in the context of our C++ code. Essentially, on the highest level of abstraction, we say that there was an error when the code does not produce the result we expect it to produce. 
-```cpp -struct Character { - Item left_hand_item; - Item right_hand_item; - // 😱 Not a great solution, we need to keep these in sync! - bool has_item_in_left_hand; - bool has_item_in_right_hand; -}; -``` +We can further classify the possible error by its origin. The errors are typically thought of as: -This is not a great solution as we would then need to keep these variables in sync and I, for one, do not trust myself with such an important task, especially if I can avoid it. So, speaking of avoiding this, can we somehow bake this information into the stored item types directly? +- recoverable: errors that we can recover from within the normal operation of the program. An example of these would be a network timeout. +- unrecoverable: errors that indicate a state of the program so broken that any recovery is a moot point. Typical examples are programmatic errors and errors resulting from undefined behavior encountered previously in the program. -We _could_ just replace the items with pointers and if there is a `nullptr` stored in either of those it would mean that the character holds no item in the corresponding hand. But this has certain drawbacks as it changes the semantics of these variables. + -```cpp -// 😱 Who owns the items? -struct Character { - Item* left_hand_item; - Item* right_hand_item; -}; -``` +Note that this classification is still highly debated. There is a large camp of people, who believe that every error is potentially recoverable and should be treated as such. This is a valid way of thinking but it comes with a price that, at least in my industry, people are usually unwilling to pay. -Before, our `Character` object had value semantics and now it follows pointer semantics under the hood, meaning that copying our `Character` object would become [harder](memory_and_smart_pointers.md#performing-shallow-copy-by-mistake). +## What to do about unrecoverable errors -This is not great. 
The simple decision of allowing the character to have no objects in their hands forces us to actively think about memory, complicating the implementation and forcing unrelated design considerations upon us. +Here, we will assume that we cannot or don't want to try to recover from a class of errors that we deem "unrecoverable". That being said, we still generally want to have tools to reduce the likelihood of these errors popping up. In my experience, most of these errors come from an erroneous assumption or an undetected error earlier in the program. -One way to avoid this issue is to store a `std::optional` in each hand of the character instead: +One typical way of dealing with issues like these is a combination of two techniques: -```cpp -struct Character { - std::optional<Item> left_hand_item; - std::optional<Item> right_hand_item; -}; -``` +- Having a high [test code coverage](googletest.md), ideally 100% code line coverage +- Enforcing contract checking at the start (and potentially also at the end) of every function -Now it is clear just by looking at this tiny code snippet that neither item is required for the correct operation of the character. As a bonus, the object still has value semantics and can be copied and moved without any issues. +The combination of these techniques allows us to increase the likelihood that an actual error would be caught early in the development and won't make it into the actual delivered application. -Before we talk about how to use `std::optional`, I'd like to first talk a bit about another important use-case for it - **error handling**. + + +Such contract enforcement typically crashes the application if its premise is not met, assuming that the only way this could have happened is if something before has already gone horribly wrong. + +This obviously needs careful consideration. You don't want all of the software in your car to die at a random point in time without any recovery procedure. 
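To make the idea of contract checking a bit more tangible, here is a minimal sketch. The `Expects` helper and the `Average` function are hypothetical names assumed for illustration, not an API from any particular library. The helper aborts the program when a precondition does not hold:

```cpp
#include <cstdio>
#include <cstdlib>
#include <vector>

// Hypothetical helper, assumed for illustration: aborts the program
// when a precondition of a function is violated.
inline void Expects(bool condition, const char* message) {
  if (!condition) {
    std::fprintf(stderr, "Contract violated: %s\n", message);
    std::abort();
  }
}

// Computes the average of the values. Calling it with an empty vector
// is treated as a programming error, not as a recoverable situation.
double Average(const std::vector<double>& values) {
  Expects(!values.empty(), "values must not be empty");
  double sum = 0.0;
  for (const double value : values) {
    sum += value;
  }
  return sum / static_cast<double>(values.size());
}
```

Calling `Average({})` would terminate the program right at the violated contract rather than let a division by zero propagate further, which is the behavior described above.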
+ +We won't talk about this too much but in general, as at least one potential reason for such failures is memory being in an undefined and potentially inconsistent state, people usually run multiple processes and monitor the main process by some watchdog that activates a safe recovery procedure if needed. -## Use `std::optional` to return from functions that might fail + -Let's say we have a function `GetAnswerFromLlm` that, getting a question, is supposed to answer all of our questions using some large language model. +## How to recover from recoverable errors + +The bulk of this talk is focused around ways to recover from a recoverable error in modern C++. Here, a function is our smallest unit of concern. + +For the sake of example, let's say we have a function `GetAnswerFromLlm` that, getting a question, is supposed to answer all of our questions using some large language model living in the cloud. ```cpp #include @@ -85,21 +77,22 @@ Let's say we have a function `GetAnswerFromLlm` that, getting a question, is sup std::string GetAnswerFromLlm(const std::string& question); ``` -This is a simple interface that serves its purpose in most situations: we ask it things and get some `std::string` answers, sometimes of questionable quality. But what happens if something goes wrong within this function? What if it _cannot_ answer our question? What should this function return so that we know that an error has occurred. +We've seen [functions](functions.md) like this before. This is a simple interface that serves its purpose in most situations: we ask it things and get some `std::string` answers (sometimes of questionable quality). But what if this function _cannot_ return an answer to our question? What should this function do in this case, so that we know that an error has occurred. 
Largely speaking there are two schools of thought here: - It can throw an **exception** to indicate that some error has occurred -- It can return a special value to indicate a failure -- TODO: add a third option where it sets some global error state +- It can return or set a special value to indicate a failure ### Why not throw an exception -We'll have to briefly talk about the first option here if only to explain why we're not going to talk about it in-depth. And I can already see people with pitchforks coming for me so do note that this is a highly-debated topic with even thoughts of [re-imagining exceptions altogether](https://www.youtube.com/watch?v=ARYP83yNAWk). +We'll have to briefly talk about the first option here if only to explain why we're not going to talk about it in-depth. And I can already see people with pitchforks coming for me so do note that this is a highly-debated topic with even thoughts of [re-imagining exceptions altogether](https://www.youtube.com/watch?v=ARYP83yNAWk) as shown in this wonderful presentation by Herb Sutter. + +Anyway. Exceptions. Generally, at any point in our program we can `throw` an exception. This exception is then handled in a separate execution path, invisible to the user. Otherwise, `std::exception` is just a [class](classes_intro.md) like all those that we've seen before already. An exception object can be caught by value or by reference at any point in the program upstream from the place where the exception was originally thrown. Also, exceptions are polymorphic and use [runtime polymorphism](inheritance.md#using-virtual-for-interface-inheritance-and-proper-polymorphism), so there can be a hierarchy of exception classes and when exceptions are caught by reference, they can be caught by their base class. -Anyway. Exceptions. Generally, at any point in our program we can `throw` an exception. 
It then is handled in a separate execution path, invisible to the user and can be caught by value or by reference at any point in the program upstream from the place where the exception was originally thrown. Yes, exceptions are polymorphic and use [runtime polymorphism](inheritance.md#using-virtual-for-interface-inheritance-and-proper-polymorphism), which is one of the issues people have with them. +Essentially the problem comes down to exceptions typically using dynamic allocation at the throwing side and RTTI (Runtime Type Information) at the catching side. This means that technically a program can take an arbitrary amount of time to throw and catch an exception. In many domains where C++ is used, like in automotive, this is a non-starter. -In our case, if, say, the network would be down and our LLM of choice would be unreachable, the `GetAnswerFromLlm` would throw an exception, say a `std::runtime_error`: +In our case, if, say, the network would be down and our LLM of choice would be unreachable, the `GetAnswerFromLlm` could throw an exception, say a `std::runtime_error`: ```cpp #include @@ -128,8 +121,11 @@ int main() { } ``` +### Why I don't use exceptions much + I will not talk too much about exceptions, mostly because in around a decade of using C++ professionally I very rarely worked in code bases that use exceptions. Many code bases, especially those that contain safety-critical code, ban exceptions altogether due to the fact that there is, strictly speaking, no way to guarantee how long it takes to process an exception once one is thrown because of their dynamic implementation. + Furthermore, there is another thing I don't really like about them. They create a hidden logic path that can be hard to trace when reading the code. You see, the `catch` block that catches an exception can be in _any_ calling function and it will catch a matching exception that is thrown at any depth of the call stack. 
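The hidden error path can be sketched in a few lines. All function names below are hypothetical and assumed for illustration only. An exception thrown two calls deep skips the intermediate function entirely and lands in whichever handler matches first:

```cpp
#include <stdexcept>
#include <string>

// Hypothetical innermost function that throws on invalid input.
std::string GetAnswer(const std::string& question) {
  if (question.empty()) {
    throw std::logic_error("Empty question");
  }
  return "42";
}

// Nothing in this function hints that an exception may pass through it.
std::string Forward(const std::string& question) {
  return GetAnswer(question);
}

// The handlers live far away from the place where the error originates.
std::string AskSafely(const std::string& question) {
  try {
    return Forward(question);
  } catch (const std::runtime_error& error) {
    return std::string{"Runtime error: "} + error.what();
  } catch (const std::exception& error) {
    // std::logic_error is not a std::runtime_error, so it is caught
    // here through its base class.
    return std::string{"Something happened: "} + error.what();
  }
}
```

Reading `Forward` alone gives no indication that control flow might leave it early, which is exactly what makes this path hard to trace.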
@@ -142,6 +138,8 @@ This typically means that we have to become very rigorous about what function th To be a bit more concrete, just imagine that the `LlmHandle::GetAnswer` function throws some other exception, say `std::logic_error` that we don't expect - this would lead us to showing such a `"Something happened"` message, which is not super useful to the user of our code. + + ### Avoid the hidden error path All of these issues prompted people to think out of the box to avoid using exceptions. And that while still having a way to know that something went wrong during the execution of some code. @@ -263,6 +261,62 @@ std::expected GetAnswerFromLlm(const std::string& ques ``` Now if we have a network outage, we can return an error that tells us about this being the case and should the `LlmHandle::GetAnswer` return an expected object of the same type too, it would automagically propagate to the caller of the `GetAnswerFromLlm` function. +## Use `std::optional` to represent optional class fields + +As a a first tiny example, imagine that we want to implement a game character and we have some items that they can hold in either hand (we'll for now assume that the items are of the same pre-defined type for simplicity but could of course extend this example with a class template): + +```cpp +struct Character { + Item left_hand_item; + Item right_hand_item; +}; +``` + +The character, however, might hold nothing in their hands too, so how do we model this? + +As a naïve solution, we could of course just add two additional boolean values `has_item_in_left_hand` and `has_item_in_right_hand` respectively: + +```cpp +struct Character { + Item left_hand_item; + Item right_hand_item; + // 😱 Not a great solution, we need to keep these in sync! 
+ bool has_item_in_left_hand; + bool has_item_in_right_hand; +}; +``` + +This is not a great solution as we would then need to keep these variables in sync and I, for one, do not trust myself with such an important task, especially if I can avoid it. So, speaking of avoiding this, can we somehow bake this information into the stored item types directly? + +We _could_ just replace the items with pointers and if there is a `nullptr` stored in either of those it would mean that the character holds no item in the corresponding hand. But this has certain drawbacks as it changes the semantics of these variables. + +```cpp +// 😱 Who owns the items? +struct Character { + Item* left_hand_item; + Item* right_hand_item; +}; +``` + +Before, our `Character` object had value semantics and now it follows pointer semantics under the hood, meaning that copying our `Character` object would become [harder](memory_and_smart_pointers.md#performing-shallow-copy-by-mistake). + +This is not great. The simple decision of allowing the character to have no objects in their hands forces us to actively think about memory, complicating the implementation and forcing unrelated design considerations upon us. + +One way to avoid this issue is to store a `std::optional<Item>` in each hand of the character instead: + +```cpp +struct Character { + std::optional<Item> left_hand_item; + std::optional<Item> right_hand_item; +}; +``` + +Now it is clear just by looking at this tiny code snippet that neither item is required for the correct operation of the character. As a bonus, the object still has value semantics and can be copied and moved without any issues. + +Before we talk about how to use `std::optional`, I'd like to first talk a bit about another important use-case for it - **error handling**. 
+ + + ## How are they implemented and their performance implications Largely speaking, both `std::optional` and `std::expected` are both implemented as a `union` in C++, meaning that the expected and unexpected values are stored _in the same underlying memory_ with helper functions allowing us to query which one is actually stored there. From 43551f60b330212a05e3b4ce44c827f610579a19 Mon Sep 17 00:00:00 2001 From: Igor Bogoslavskyi Date: Thu, 29 May 2025 18:11:02 +0200 Subject: [PATCH 09/26] Update further optional lecture --- lectures/optional.md | 84 +++++++++++++++++++++++++++----------------- 1 file changed, 52 insertions(+), 32 deletions(-) diff --git a/lectures/optional.md b/lectures/optional.md index eae6e7e..6ca70d6 100644 --- a/lectures/optional.md +++ b/lectures/optional.md @@ -11,17 +11,18 @@ - [What is error handling after all](#what-is-error-handling-after-all) - [What to do about unrecoverable errors](#what-to-do-about-unrecoverable-errors) - [How to recover from recoverable errors](#how-to-recover-from-recoverable-errors) + - [Why not set a global value](#why-not-set-a-global-value) - [Why not throw an exception](#why-not-throw-an-exception) - - [Why I don't use exceptions much](#why-i-dont-use-exceptions-much) - - [Avoid the hidden error path](#avoid-the-hidden-error-path) + - [Exceptions are expensive](#exceptions-are-expensive) + - [The hidden path is hidden](#the-hidden-path-is-hidden) + - [Use return type for explicit error path](#use-return-type-for-explicit-error-path) - [How to work with `std::optional`](#how-to-work-with-stdoptional) - [Use `std::expected` to tell why a function failed](#use-stdexpected-to-tell-why-a-function-failed) - [Use `std::optional` to represent optional class fields](#use-stdoptional-to-represent-optional-class-fields) - [How are they implemented and their performance implications](#how-are-they-implemented-and-their-performance-implications) - [Summary](#summary) -When writing code in C++, just like in life overall, 
we don't always get what we want. The good news is that we can prepare by being careful and anticipating some of the errors that we can encounter. There are many mechanisms in C++ for this and today we're talking about what options we have with an added benefit of some highly opinionated suggestions. All of you experienced C++ devs, prepare your pitch forks! :wink: - +When writing code in C++, just like in life overall, we don't always get what we want. The good news is that we can prepare by being careful and anticipating some of the errors that we can encounter. Just like with everything else in C++, there are many mechanisms for this and today we're talking about what options we have with an added "benefit" of some highly opinionated suggestions. All of you experienced C++ devs, prepare your pitch forks! :wink: @@ -33,20 +34,22 @@ I aim to add links to opinions alternative to those expressed in this lecture to ## What is error handling after all -It makes sense to start our conversation with defining what we call an "error" in the first place in the context of our C++ code. Essentially, on the highest level of abstraction, we say that there was an error when the code does not produce the result we expect it to produce. +With the disclaimer out of the way, it makes sense to start our conversation by defining what we call an "error" in the first place in the context of our C++ code. + +Essentially, on the highest level of abstraction, we say that there was an error when the code does not produce the result we expect it to produce. -We can further classify the possible error by its origin. The errors are typically thought of as: +We can further classify the possible errors by their origin. The errors are typically thought of as: -- recoverable: errors that we can recover from within the normal operation of the program. An example of these would be a network timeout. 
-- unrecoverable: errors that indicate a state of the program so broken that any recovery is a moot point. Typical examples are programmatic errors and errors resulting from undefined behavior encountered previously in the program. +- **recoverable:** errors that we can recover from within the normal operation of the program. An example of these would be a network timeout in a situation when a user can wait and retry. +- **unrecoverable:** errors that indicate a state of the program so broken that any recovery is useless. Typical examples are programmatic errors and errors resulting from undefined behavior encountered previously in the program. -Note that this classification is still highly debated. There is a large camp of people, who believe that every error is potentially recoverable and should be treated as such. This is a valid way of thinking but it comes with a price that, at least in my industry, people are usually unwilling to pay. +Note that, while some languages, like Rust, make this distinction directly in their official documentation, the classification of errors into recoverable and unrecoverable is still highly debated in C++. There is a large camp of people, who believe that every error is potentially recoverable and should be treated as such and that an error should be reported for potentially being handled later at a different place in the program. This is absolutely a valid way of thinking but it comes with a price that, at least in my industry, people are usually unwilling to pay. ## What to do about unrecoverable errors -Here, we will assume that we cannot or don't want to try to recover from a class of errors that we deem "unrecoverable". That being said, we still generally want to have tools to reduce the likelihood of these errors popping up. In my experience, most of these errors come from an erroneous assumption or an undetected error earlier in the program. 
+In this lecture, we will assume that we cannot or don't want to try to recover from a class of errors that we deem "unrecoverable". That being said, we still generally want to have tools to reduce the likelihood of these errors popping up. In my experience, most of these errors come from an erroneous assumption or an undetected error earlier in the program. One typical way of dealing with issues like these is a combination of two techniques: @@ -57,17 +60,17 @@ The combination of these technique allows us to increases the likelihood that an -Such contract enforcement typically crash the application if their premise is not met, assuming that the only way this could have happened is if something before has already done horribly wrong. +Such contract enforcement typically crash the application if their premise is not met, assuming that the only way this could have happened is if something before has already gone horribly wrong, no recovery is possible, and the best way to move on is to die as quickly as possible. -This obviously needs careful considerations. You don't want all of the software in your car die at a random point in time without any recovery procedure. +This obviously needs careful considerations. You don't want all of the software in your car to die at a random point in time without any recovery procedure. -We won't talk about this too much but in general, as at least one potential reason for such failures is memory being in an undefined and potentially inconsistent state, people usually run multiple processes and monitor the main process by some watchdog that activates a safe recovery procedure if needed. 
+We won't talk about this too much but in general, as at least one potential reason for such failures is memory being in an undefined and potentially inconsistent state, people usually run multiple processes or even multiple programs on different hardware and monitor the main execution path by some watchdog that activates a safe recovery procedure if needed. ## How to recover from recoverable errors -The bulk of this talk is focused around ways to recover from a recoverable error in modern C++. Here, a function is our smallest unit of concern. +The bulk of this talk is focused around ways to recover from a recoverable error in modern C++ with a function being our smallest unit of concern. For the sake of example, let's say we have a function `GetAnswerFromLlm` that, getting a question, is supposed to answer all of our questions using some large language model living in the cloud. @@ -77,22 +80,28 @@ For the sake of example, let's say we have a function `GetAnswerFromLlm` that, g std::string GetAnswerFromLlm(const std::string& question); ``` -We've seen [functions](functions.md) like this before. This is a simple interface that serves its purpose in most situations: we ask it things and get some `std::string` answers (sometimes of questionable quality). But what if this function _cannot_ return an answer to our question? What should this function do in this case, so that we know that an error has occurred. +We've seen [functions](functions.md) like this before. This is a simple interface that serves its purpose in most situations: we ask it things and get some `std::string` answers (sometimes of questionable quality). But what if this function _cannot_ return an answer to our question? What should this function do in this case, so that we know that an error has occurred? 
-Largely speaking there are two schools of thought here: +Largely speaking there are three schools of thought here: -- It can throw an **exception** to indicate that some error has occurred -- It can return or set a special value to indicate a failure +1. It can throw an **exception** to indicate that some error has occurred +2. **It can return a special value to indicate a failure** +3. It can set a special global value to indicate a failure -### Why not throw an exception +Today we mostly focus on option 2., where we would return a special value of a special type to indicate that something went wrong, but before we go there, I'd like to briefly talk about why I don't like the other options. -We'll have to briefly talk about the first option here if only to explain why we're not going to talk about it in-depth. And I can already see people with pitchforks coming for me so do note that this is a highly-debated topic with even thoughts of [re-imagining exceptions altogether](https://www.youtube.com/watch?v=ARYP83yNAWk) as shown in this wonderful presentation by Herb Sutter. +### Why not set a global value -Anyway. Exceptions. Generally, at any point in our program we can `throw` an exception. This exception is then handled in a separate execution path, invisible to the user. Otherwise, `std::exception` is just a [class](classes_intro.md) like all those that we've seen before already. An exception object can be caught by value or by reference at any point in the program upstream from the place where the exception was originally thrown. Also, exceptions are polymorphic and use [runtime polymorphism](inheritance.md#using-virtual-for-interface-inheritance-and-proper-polymorphism), so there can be a hierarchy of exception classes and when exceptions are caught by reference, they can be caught by their base class. +We'll start with option 3 - setting some global value as an indicator for a failure. 
This approach was quite popular a long time ago but is rarely used today, when we believe that variables should live in as local a scope as possible. But you can still encounter it if you ever code using OpenGL, for example. + -Essentially the problem comes down to exceptions using dynamic allocation at the throwing side and RTTI (Runtime Type Information) at the catching side. This means that technically a program can take an arbitrary amount of time to throw and catch an exceptions. In many domains where C++ is used, like in automotive, this is a non-starter. +### Why not throw an exception -In our case, if, say, the network would be down and our LLM of choice would be unreachable, the `GetAnswerFromLlm` could throw an exception, say a `std::runtime_error`: +A more interesting question is why not use option 1 - to throw an exception. + +And I can already see people with pitchforks coming for me so do note that this is a highly-debated topic with even thoughts of [re-imagining exceptions altogether](https://www.youtube.com/watch?v=ARYP83yNAWk) as shown in this wonderful presentation by Herb Sutter. + +Anyway. Exceptions. Generally, at any point in our program we can `throw` an exception. In our case, if, say, the network would be down and our LLM of choice would be unreachable, the `GetAnswerFromLlm` could throw an exception, say a `std::runtime_error`: ```cpp #include @@ -106,7 +115,7 @@ std::string GetAnswerFromLlm(const std::string& question) { } ``` -On the calling side, we would need to "catch" this exception using the `try`-`catch` blocks. Generally, if using exceptions for reporting errors, we wrap the code we want to execute into a `try` block that is followed by a `catch` block that handles all of our potential errors. +This exception is then "caught" in some other part of the program upstream of the place at which it was thrown using a so-called "try-catch" block. The exception travels to get there on a separate execution path, invisible to the user. 
```cpp int main() { @@ -121,26 +130,35 @@ int main() { } ``` -### Why I don't use exceptions much + + +This sounds wonderful at first glance as it allows us to use the return type of our function for actually returning the result of the operation without trying to use it for anything else. This way also goes along the philosophy of having no unrecoverable errors: the function that throws an exception makes no decision about this error being recoverable or not - this will be decided by some other part of code that handles (or fails to handle) this exception. -I will not talk too much about exceptions, mostly because in around a decade of using C++ professionally I very rarely worked in code bases that use exceptions. Many code bases, especially those that contain safety-critical code, ban exceptions altogether due to the fact that there is, strictly speaking, no way to guarantee how long it takes to process an exception once one is thrown because of their dynamic implementation. +However, there are some limitations to this approach that we'll try to outline here. + +#### Exceptions are expensive + +A `std::exception` is just a [class](classes_intro.md) like all those that we've seen before already. An exception object can be caught by value or by reference at any point in the program upstream from the place where the exception was originally thrown. Also, exceptions are polymorphic and use [runtime polymorphism](inheritance.md#using-virtual-for-interface-inheritance-and-proper-polymorphism), so there can be a hierarchy of exception classes and when exceptions are caught by reference, they can be caught by their base class. + +Essentially the problem comes down to exceptions using dynamic allocation at the throwing side and RTTI (Runtime Type Information) at the catching side. This means that technically a program can take an arbitrary amount of time to throw and catch an exceptions. 
Many code bases, especially those that contain safety-critical code, ban exceptions altogether due to the fact that there is, strictly speaking, no way to guarantee how long it takes to process an exception once one is thrown because of their dynamic implementation. In all the places where I worked the exceptions were either banned altogether or avoided when possible. +#### The hidden path is hidden + Furthermore, there is another thing I don't really like about them. They create a hidden logic path that can be hard to trace when reading the code. You see, the `catch` block that catches an exception can be in _any_ calling function and it will catch a matching exception that is thrown at any depth of the call stack. - Video Thumbnail This typically means that we have to become very rigorous about what function throws which exceptions when and, in some cases, the only way to know this is by relying on a documentation of a function which, in many cases, does not fully exist or is not up to date. I firmly believe that the statement `catch (...)` is singlehandedly responsible for many errors of the style of "oops, something happened" that we've all encountered. -To be a bit more concrete, just imagine that the `LlmHandle::GetAnswer` function throws some other exception, say `std::logic_error` that we don't expect - this would lead us to showing such a `"Something happened"` message, which is not super useful to the user of our code. +To be a bit more concrete, just imagine that the `LlmHandle::GetAnswer` function throws some other exception, say `std::logic_error` that we don't expect - this would lead us to showing such a `"Something happened"` message, which is not super useful to the user of our code and still likely leads to the program to crash, which is what we tried to avoid with exceptions in the first place. 
- +### Use return type for explicit error path -### Avoid the hidden error path + All of these issues prompted people to think out of the box to avoid using exceptions. And that while still having a way to know that something went wrong during the execution of some code. @@ -159,7 +177,7 @@ In the olden days (before C++17), there were only three options. } ``` - This option is not ideal because it is hard to define an appropriate "failure" value to return from most functions. For example, an empty string sounds like a good option for such a value, but then the LLM response to a query "Read this text, do not answer anything when done" would overlap with such a default value. Not great, right? We can extend the same logic of course for any string we would designate as the "failure value". + This option is not ideal because it is hard to define an appropriate "failure" value to return from most functions. For example, an empty string sounds like a good option for such a value, but then the LLM response to a query "Read this text, return empty string when done" would overlap with such a default value. Not great, right? We can extend the same logic of course for any string we would designate as the "failure value". 2. Another option is to return an error code from the function, which required passing any values that the function had to change as a non-const reference or pointer: ```cpp @@ -175,7 +193,7 @@ In the olden days (before C++17), there were only three options. } ``` - This options is also not great. I would argue that not being able to have pure functions that get only const inputs and return a single output makes the code a lot less readable. Furthermore, modern compilers are very good at optimizing the returned value and sometimes the function that constructs this value altogether which might be a bit harder if we pass a reference to some value stored elsewhere. 
Although I don't know enough about the magic that the compilers do under the hood to be 100% about this second reason, so if you happen to know more - tell me! + This options is also not great. I would argue that not being able to have pure functions that get only const inputs and return a single output makes the code a lot less readable. Furthermore, modern compilers are very good at optimizing the returned value and sometimes the function that constructs this value altogether which might be a bit harder if we pass a reference to a value stored elsewhere. Although I don't know enough about the magic that the compilers do under the hood to be 100% about this second reason, so if you happen to know more - tell me! 3. An arguably even worse but still sometimes used method (OpenGL, anyone?) is to set some global error variable if an error has occurred and explore its value after every call to see if something bad has actually happened. @@ -263,6 +281,8 @@ Now if we have a network outage, we can return an error that tells us about this ## Use `std::optional` to represent optional class fields + + As a a first tiny example, imagine that we want to implement a game character and we have some items that they can hold in either hand (we'll for now assume that the items are of the same pre-defined type for simplicity but could of course extend this example with a class template): ```cpp From afea6d3c64f5a3f31a07d76b162674a44435c6ef Mon Sep 17 00:00:00 2001 From: Igor Bogoslavskyi Date: Sun, 1 Jun 2025 14:55:08 +0200 Subject: [PATCH 10/26] Rewrite the optional lecture --- lectures/optional.md | 447 +++++++++++++++++++++---------------------- 1 file changed, 216 insertions(+), 231 deletions(-) diff --git a/lectures/optional.md b/lectures/optional.md index 6ca70d6..bd6112f 100644 --- a/lectures/optional.md +++ b/lectures/optional.md @@ -1,366 +1,351 @@ -**Error handling in C++** --- +**Error Handling in C++**


-- [**Error handling in C++**](#error-handling-in-c) + - [Disclaimer](#disclaimer) -- [What is error handling after all](#what-is-error-handling-after-all) -- [What to do about unrecoverable errors](#what-to-do-about-unrecoverable-errors) -- [How to recover from recoverable errors](#how-to-recover-from-recoverable-errors) - - [Why not set a global value](#why-not-set-a-global-value) - - [Why not throw an exception](#why-not-throw-an-exception) - - [Exceptions are expensive](#exceptions-are-expensive) - - [The hidden path is hidden](#the-hidden-path-is-hidden) - - [Use return type for explicit error path](#use-return-type-for-explicit-error-path) -- [How to work with `std::optional`](#how-to-work-with-stdoptional) -- [Use `std::expected` to tell why a function failed](#use-stdexpected-to-tell-why-a-function-failed) -- [Use `std::optional` to represent optional class fields](#use-stdoptional-to-represent-optional-class-fields) -- [How are they implemented and their performance implications](#how-are-they-implemented-and-their-performance-implications) -- [Summary](#summary) +- [What Do We Mean by “Error”?](#what-do-we-mean-by-error) +- [Unrecoverable errors: **fail early**](#unrecoverable-errors-fail-early) + - [How to deal with unrecoverable errors](#how-to-deal-with-unrecoverable-errors) + - [How to minimize number of unrecoverable errors](#how-to-minimize-number-of-unrecoverable-errors) +- [Recoverable errors: **handle and proceed**](#recoverable-errors-handle-and-proceed) + - [Exceptions](#exceptions) + - [What exceptions are](#what-exceptions-are) + - [Exceptions are (sometimes) expensive](#exceptions-are-sometimes-expensive) + - [Exceptions hide the error path](#exceptions-hide-the-error-path) + - [Returning errors explicitly can work better if done well](#returning-errors-explicitly-can-work-better-if-done-well) + - [Returning a value indicating error does not always work 😱](#returning-a-value-indicating-error-does-not-always-work-) + - [Returning an error 
code breaks "pure functions" 😱](#returning-an-error-code-breaks-pure-functions-) + - [Using `std::optional`: **a better way**](#using-stdoptional-a-better-way) + - [Using `std::expected`: **add context**](#using-stdexpected-add-context) + - [Performance Considerations for `std::optional` and `std::expected`](#performance-considerations-for-stdoptional-and-stdexpected) + - [Error type size matters](#error-type-size-matters) + - [Return value optimization](#return-value-optimization) + - [Summary](#summary) -When writing code in C++, just like in life overall, we don't always get what we want. The good news is that we can prepare by being careful and anticipating some of the errors that we can encounter. Just like with everything else in C++, there are many mechanisms for this and today we're talking about what options we have with an added "benefit" of some highly opinionated suggestions. All of you experienced C++ devs, prepare your pitch forks! :wink: +When writing C++ code, much like in life, we don’t always get what we want. The good news is that we can prepare for this! - +And just like everything else in C++, there are… a lot of ways to do that. -## Disclaimer +Today we’re talking about error handling. What options we have, which trade-offs they come with, and what modern C++ gives us to make our lives a bit easier. -The topics we cover today don't have a single simple answer. The main reason for this is the shear power of C++ and all of the things is lets us do. This is only strengthened by how long C++ exists, the diversity of the use-cases and the people who use it. Depending on your context, the particular way of thinking presented here might be more or less useful to you. My experience mostly comes from automotive and robotics bubbles and might not apply to your domain. I will do my best to mention all options, but will only cover in-depth areas that I have been using myself over the last 15 or so years. 
+And as this topic is quite nuanced, there will definitely be some statements that are quite opinionated and I can already see some people with pitchforks coming my way... so... I'm sure it's gonna be fun! -I aim to add links to opinions alternative to those expressed in this lecture to the best of my ability, but if I miss something, please do not hesitate to let me know in the comments. +# Disclaimer -## What is error handling after all +Following up on what I've just said, I'd like to start with a disclaimer. -With the disclaimer out of the way, it makes sense to start our conversation by defining what we call an "error" in the first place in the context of our C++ code. +This isn’t a one-size-fits-all topic. C++ is huge, powerful, and used across every domain imaginable for a long time. -Essentially, on the highest level of abstraction, we say that there was an error when the code does not produce the result we expect it to produce. +*My* perspective comes from domains like robotics and automotive—where predictability, traceability, and safety are of highest importance. What works for us may not work for everyone. -We can further classify the possible errors by their origin. The errors are typically thought of as: +That being said, I believe that what I present here will fit to many domains with minimal adaptation. Where possible, I’ll try to mention multiple possible options and if I *do* miss an important one—please let me know! -- **recoverable:** errors that we can recover from within the normal operation of the program. An example of these would be a network timeout in a situation when a user can wait and retry. -- **unrecoverable:** errors that indicate a state of the program so broken that any recovery is useless. Typical examples are programmatic errors and errors resulting from undefined behavior encountered previously in the program. + - +# What Do We Mean by “Error”? 
-Note that, while some languages, like Rust, make this distinction directly in their official documentation, the classification of errors into recoverable and unrecoverable is still highly debated in C++. There is a large camp of people, who believe that every error is potentially recoverable and should be treated as such and that an error should be reported for potentially being handled later at a different place in the program. This is absolutely a valid way of thinking but it comes with a price that, at least in my industry, people are usually unwilling to pay. +Before we go into how to handle errors, let’s clarify what we mean when we say "error" in the first place. -## What to do about unrecoverable errors +At the highest level: an error is when the code doesn’t produce the expected result. But there is nuance here! -In this lecture, we will assume that we cannot or don't want to try to recover from a class of errors that we deem "unrecoverable". That being said, we still generally want to have tools to reduce the likelihood of these errors popping up. In my experience, most of these errors come from an erroneous assumption or an undetected error earlier in the program. +We generally split these into two broad groups: -One typical way of dealing with issues like these is a combination of two techniques: +- **Unrecoverable errors** — where the program reaches an invalid or inconsistent state, and continuing could be unsafe or meaningless. +- **Recoverable errors** — where the program can detect something went wrong, and has ways to proceed by an alternative path. -- Having a high [test code coverage](googletest.md), ideally 100% code line coverage -- Enforcing contract checking at the start (and potentially also at the end) of every function +Some languages—like Rust—bake this distinction into the type system. C++ doesn’t. But the classification is still useful, especially when designing interfaces. 
-The combination of these technique allows us to increases the likelihood that an actual error would be caught early in the development and won't make it into the actual delivered application. +# Unrecoverable errors: **fail early** - +## How to deal with unrecoverable errors -Such contract enforcement typically crash the application if their premise is not met, assuming that the only way this could have happened is if something before has already gone horribly wrong, no recovery is possible, and the best way to move on is to die as quickly as possible. +Let’s start with the errors we don’t want to recover from. -This obviously needs careful considerations. You don't want all of the software in your car to die at a random point in time without any recovery procedure. +These usually come from bugs: a violated precondition, accessing something that shouldn’t be accessed, or hitting undefined behavior. In all of these cases the program is already in some unknown state, so we have no guarantees on anything that happens next. So recovery is most likely impossible. -We won't talk about this too much but in general, as at least one potential reason for such failures is memory being in an undefined and potentially inconsistent state, people usually run multiple processes or even multiple programs on different hardware and monitor the main execution path by some watchdog that activates a safe recovery procedure if needed. +We often want to catch these types of errors as early as possible—and crash as early as possible—before any more damage is done. - +A typical approach is to enforce contracts at function boundaries. My favorite method is to use the `CHECK` macro that can be found in Abseil library. 
Here’s a tiny example of checking if the element actually exists in a vector before returning it: -## How to recover from recoverable errors +```cpp +#include -The bulk of this talk is focused around ways to recover from a recoverable error in modern C++ with a function being our smallest unit of concern. +int GetElementAt(const std::vector& v, std::size_t index) { + CHECK(index < v.size()); // Contract check. + return v[index]; +} +``` -For the sake of example, let's say we have a function `GetAnswerFromLlm` that, getting a question, is supposed to answer all of our questions using some large language model living in the cloud. +This is simple: we check upfront that an index is actually valid. If it isn’t, we crash instead of going into the undefined behavior land. -```cpp -#include +Same pattern can be used in any other places where certain pre-conditions must be met in order to proceed, like in this example where some `data` object needs to be valid in order to be processed: -std::string GetAnswerFromLlm(const std::string& question); +```cpp +void ProcessSensorData(const SensorData& data) { + CHECK(data.IsValid()); + // Safe to process data here. +} ``` -We've seen [functions](functions.md) like this before. This is a simple interface that serves its purpose in most situations: we ask it things and get some `std::string` answers (sometimes of questionable quality). But what if this function _cannot_ return an answer to our question? What should this function do in this case, so that we know that an error has occurred? +We don't try to continue with invalid data. We stop. -Largely speaking there are three schools of thought here: +Under this philosophy, we essentially treat bugs as bugs—not as conditions we can try to live with. -1. It can throw an **exception** to indicate that some error has occurred -2. **It can return a special value to indicate a failure** -3. 
It can set a special global value to indicate a failure +## How to minimize number of unrecoverable errors -Today we mostly focus on option 2., where we would return a special value of a special type to indicate that something went wrong, but before we go there, I'd like to briefly talk about why I don't like the other options. +Of course, we'd rather not hit them at all. In practice, we rely on: -### Why not set a global value +- High test coverage—ideally line coverage close to 100%. +- Contract checks on inputs and outputs. +- Assertions during development to catch bad assumptions early. -We'll start with option 3 - setting some global value as an indicator for a failure. This way was quite popular long time ago but it rarely used today when we believe that variables should live in as local scope as possible. But you can still encounter it if you ever code using OpenGL, for example. - +In safety-critical systems, we often isolate components into separate processes or even hardware units, with watchdogs that can trigger recovery actions if something crashes. This way we minimize the time to failure while keeping the system safe as a whole even when certain components fail. -### Why not throw an exception +But again, that’s recovery at the system level—not inside the code where the error occurred. This is a large architecture topic in itself and is far beyond what I want to talk about today. -A more interesting question is why not use option 1 - to throw an exception. +# Recoverable errors: **handle and proceed** -And I can already see people with pitchforks coming for me so do note that this is a highly-debated topic with even thoughts of [re-imagining exceptions altogether](https://www.youtube.com/watch?v=ARYP83yNAWk) as shown in this wonderful presentation by Herb Sutter. +Now, what I actually want to talk about today is **recoverable errors**. -Anyway. Exceptions. Generally, at any point in our program we can `throw` an exception. 
In our case, if, say, the network would be down and our LLM of choice would be unreachable, the `GetAnswerFromLlm` could throw an exception, say a `std::runtime_error`: +To talk about them, let's start with an example function: ```cpp -#include +std::string GetAnswerFromLlm(const std::string& question); +``` + +This function is supposed to call some LLM over the network. It’s supposed to return a response. But what if the network is down? Or we ran out of LLM tokens? + +We have to decide: what does the function do in that case? + +We have two broad strategies of communicating a failure that have emerged in C++ over the years: + +1. **Return a special value from a function.** +2. Throw an exception. + +We’ll spend most of our time on the first one, but let’s first talk about what might be wrong with just throwing an exception. + +## Exceptions + +### What exceptions are + +Since C++98 we have a powerful machinery of exceptions at our disposal. An exception is essentially just an object of some class, typically derived from the [`std::exception`](https://en.cppreference.com/w/cpp/error/exception.html) class. Such an exception holds the information about the underlying failure and can be "thrown" and "caught" within a C++ program. +In our example, we could throw a `std::runtime_error` when the network is not available: + +```cpp std::string GetAnswerFromLlm(const std::string& question) { - const auto llm_handle = GetLlmHandle(); - if (!llm_handle) { - throw std::runtime_error("Cannot get LLM handle"); - } - return llm_handle->GetAnswer(question); + auto llm = GetLlmHandle(); + if (!llm) throw std::runtime_error("No network connection"); + return llm->GetAnswer(question); } ``` -This exception is then "caught" in some other part of the program upstream of the place at which it was thrown using a so-called "try-catch" block. The exception travels to get there on a separate execution path, invisible to the user.
+And handle this exception by catching it by reference to its base class, which is possible because `std::runtime_error` derives from `std::exception`: ```cpp +#include <iostream> +#include <stdexcept> +#include <string> + +std::string GetAnswerFromLlm(const std::string& question) { + auto llm = GetLlmHandle(); + if (!llm) throw std::runtime_error("No network connection"); + return llm->GetAnswer(question); +} + int main() { try { - const answer = GetAnswerFromLlm("What am I doing with ny life?"); - std::cout << answer << std::endl; - } catch (std::runtime_error error) { - std::cerr << error << std::endl; - } catch (...) { - std::cerr << "Unexpected error happened" << std::endl; + auto response = GetAnswerFromLlm("What should I do?"); + std::cout << response << "\n"; + } catch (const std::exception& e) { + std::cerr << "Error: " << e.what() << "\n"; } } ``` +On paper, this looks clean. But there are problems. -This sounds wonderful at first glance as it allows us to use the return type of our function for actually returning the result of the operation without trying to use it for anything else. This way also goes along the philosophy of having no unrecoverable errors: the function that throws an exception makes no decision about this error being recoverable or not - this will be decided by some other part of code that handles (or fails to handle) this exception. +### Exceptions are (sometimes) expensive -However, there are some limitations to this approach that we'll try to outline here. +Exceptions typically [allocate memory on the heap](memory_and_smart_pointers.md#the-heap) when thrown, and rely on **R**un-**T**ime **T**ype **I**nformation (RTTI) to find a matching `catch` handler while propagating through the call stack. -#### Exceptions are expensive +Both of these operations happen at runtime of the program and cost some time. Unfortunately, there are no guarantees on timing or performance of these operations. While in most common scenarios these operations run fast enough, in real-time or safety-critical code, such unpredictability is unacceptable.
-A `std::exception` is just a [class](classes_intro.md) like all those that we've seen before already. An exception object can be caught by value or by reference at any point in the program upstream from the place where the exception was originally thrown. Also, exceptions are polymorphic and use [runtime polymorphism](inheritance.md#using-virtual-for-interface-inheritance-and-proper-polymorphism), so there can be a hierarchy of exception classes and when exceptions are caught by reference, they can be caught by their base class. +Every serious project I’ve worked on either banned exceptions completely, or avoided them in performance-critical paths. -Essentially the problem comes down to exceptions using dynamic allocation at the throwing side and RTTI (Runtime Type Information) at the catching side. This means that technically a program can take an arbitrary amount of time to throw and catch an exceptions. Many code bases, especially those that contain safety-critical code, ban exceptions altogether due to the fact that there is, strictly speaking, no way to guarantee how long it takes to process an exception once one is thrown because of their dynamic implementation. In all the places where I worked the exceptions were either banned altogether or avoided when possible. + -#### The hidden path is hidden +### Exceptions hide the error path -Furthermore, there is another thing I don't really like about them. They create a hidden logic path that can be hard to trace when reading the code. -You see, the `catch` block that catches an exception can be in _any_ calling function and it will catch a matching exception that is thrown at any depth of the call stack. +Exceptions also make control flow harder to reason about. -Video Thumbnail +The error can propagate across many layers of calls before being caught. It’s easy to miss what a function might throw—especially if documentation is incomplete or out of date (which it almost always is). 
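To make the point about errors crossing many layers concrete, here is a minimal sketch. All function names in it (`ReadConfig`, `BuildPrompt`, `AskQuestion`, `AskQuestionOrDescribeFailure`) are hypothetical and only serve as an illustration; they are not part of the lecture's LLM example.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical call chain, three layers deep. None of these names come from
// the lecture; they only illustrate how far an exception can travel.
std::string ReadConfig() {
  throw std::runtime_error("config file missing");
}

// Nothing in these two signatures hints that they may throw.
std::string BuildPrompt() { return "prompt: " + ReadConfig(); }
std::string AskQuestion() { return BuildPrompt() + "?"; }

// The only handler sits at the top of the chain, far from the `throw`.
std::string AskQuestionOrDescribeFailure() {
  try {
    return AskQuestion();
  } catch (const std::exception& e) {
    return std::string{"caught at the top: "} + e.what();
  }
}
```

Reading `BuildPrompt` or `AskQuestion` in isolation gives no indication that a failure may pass through them, which is the traceability concern raised above.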
-This typically means that we have to become very rigorous about what function throws which exceptions when and, in some cases, the only way to know this is by relying on a documentation of a function which, in many cases, does not fully exist or is not up to date. I firmly believe that the statement `catch (...)` is singlehandedly responsible for many errors of the style of "oops, something happened" that we've all encountered. +Furthermore, we can use generic catch blocks like `catch (...)` and these make things even worse. We end up catching *something*, but we no longer know what or why. -To be a bit more concrete, just imagine that the `LlmHandle::GetAnswer` function throws some other exception, say `std::logic_error` that we don't expect - this would lead us to showing such a `"Something happened"` message, which is not super useful to the user of our code and still likely leads to the program to crash, which is what we tried to avoid with exceptions in the first place. - +Here's a real-world style example: -### Use return type for explicit error path +```cpp +#include - +std::string GetAnswerFromLlm(const std::string& question) { + auto llm = GetLlmHandle(); + if (!llm) throw std::runtime_error("No network connection"); + return llm->GetAnswer(question); +} -All of these issues prompted people to think out of the box to avoid using exceptions. And that while still having a way to know that something went wrong during the execution of some code. +int main() { + try { + auto answer = GetAnswerFromLlm("What’s the meaning of life?"); + std::cout << answer << "\n"; + } catch (...) { + // Not very helpful, is it? + std::cerr << "Oops, something happened.\n"; + } +} +``` -In the olden days (before C++17), there were only three options. +Video Thumbnail -1. The first one was to return a special value from the function. 
When the user receives this value they know that an error has occurred: +If `GetAnswerFromLlm` throws `std::logic_error` but we only expect `std::runtime_error`, we might miss important context or even crash anyway. - ```cpp - #include - - // 😱 Assumes empty string to indicate error. Not a great idea nowadays. - std::string GetAnswerFromLlm(const std::string& question, std::string& answer) { - const auto llm_handle = GetLlmHandle(); - if (!llm_handle) { return {}; } - return llm_handle->GetAnswer(question); - } - ``` +I believe that `catch(...)` and equivalent constructs are single-handedly responsible for the absolute majority of the funny error messages that you've undoubtedly seen all over the internet. - This option is not ideal because it is hard to define an appropriate "failure" value to return from most functions. For example, an empty string sounds like a good option for such a value, but then the LLM response to a query "Read this text, return empty string when done" would overlap with such a default value. Not great, right? We can extend the same logic of course for any string we would designate as the "failure value". -2. Another option is to return an error code from the function, which required passing any values that the function had to change as a non-const reference or pointer: +## Returning errors explicitly can work better if done well - ```cpp - #include - - // Returns a status code rather than the value we want. - // 😱 Not a great idea nowadays. - int GetAnswerFromLlm(const std::string& question, std::string& answer) { - const auto llm_handle = GetLlmHandle(); - if (!llm_handle) { return 1; } - answer = llm_handle->GetAnswer(question); - return 0; - } - ``` +Instead of throwing exceptions, we can encode failure directly in the return value from our function. + +I would say that there are three distinct ways of thinking about it.
+Let's illustrate all of them on a function we've already looked at: + +```cpp +std::string GetAnswerFromLlm(const std::string& question); +``` + +We can: - This options is also not great. I would argue that not being able to have pure functions that get only const inputs and return a single output makes the code a lot less readable. Furthermore, modern compilers are very good at optimizing the returned value and sometimes the function that constructs this value altogether which might be a bit harder if we pass a reference to a value stored elsewhere. Although I don't know enough about the magic that the compilers do under the hood to be 100% about this second reason, so if you happen to know more - tell me! - -3. An arguably even worse but still sometimes used method (OpenGL, anyone?) is to set some global error variable if an error has occurred and explore its value after every call to see if something bad has actually happened. +1. Return a special **value** of the same return type, `std::string` in our case +2. Return an error code, which would change the signature of the function to return `int` instead: ```cpp - #include - - // 😱 Not a great idea to have a global mutable variable. - inline static int last_error{}; - - // 😱 Not a great idea nowadays. - std::string GetAnswerFromLlm(const std::string& question) { - const auto llm_handle = GetLlmHandle(); - if (!llm_handle) { - last_error = 1; - return {}; - } - last_error = 0; - return llm_handle->GetAnswer(question); - } + int GetAnswerFromLlm(const std::string& question, std::string& result); ``` - I believe I don't have to go into many details as to why his is not an ideal way to deal with errors: it is even less readable and more error prone than the previous method. We even have to use a mutable global variable! Also, good luck [testing](googletest.md) this code, especially when running a number of tests in parallel. +3. 
Return a different type, `std::optional<std::string>`, which only holds a valid `std::string` in case of success. -But I would not be telling you all of this if there were no better way, would I? This is where `std::optional` comes to the rescue. Instead of all of the horrible things we've just discussed, we can return a `std::optional` instead of just returning a `std::string`: +I believe that the third option is the best out of these three, but let me explain why first, before going deeper into details. -`llm.hpp` +### Returning a value indicating error does not always work 😱 -```cpp -#include -#include +There are a number of issues with returning a special error value from a function. In our case, a naïve choice would be to return an empty string if there was no answer from the LLM, but what if we asked the LLM something along the lines of "read this file, return empty string when done"? An empty string *is* the valid response here! -std::optional GetAnswerFromLlm(const std::string& question) { - const auto llm_handle = GetLlmHandle(); - if (!llm_handle) { return {}; } - return llm_handle->GetAnswer(question); -} +Similar cases can be constructed for most values we can come up with. In addition to that, there is no easy way to encode the *reason* for the failure, such as the network being down. And a final nail in the coffin of this method is that it does not work at all for functions that return `void` for obvious reasons. + +### Returning an error code breaks "pure functions" 😱 + +Returning an error code solves at least a couple of issues for us. It is fast and reliable and we can design our software with different error codes in mind so that the reason for the failure is also communicated to us. This is also still the prevalent way of handling errors in C, so there _is_ some merit to this method.
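As a rough sketch of where error codes fit well, consider a function that has nothing to return on success. The `SendStatus` enum and the `SendLogLine` function below are hypothetical names made up for this illustration, and the `network_available` flag merely stands in for a real connectivity check.

```cpp
#include <string>

// Hypothetical status codes; the lecture itself only mentions returning `int`.
enum class SendStatus { kOk, kNoNetwork, kEmptyMessage };

// Would otherwise return `void`, so the return slot is free for a status.
// `network_available` is an assumed stand-in for a real connectivity check.
SendStatus SendLogLine(const std::string& line, bool network_available) {
  if (!network_available) return SendStatus::kNoNetwork;
  if (line.empty()) return SendStatus::kEmptyMessage;
  // A real implementation would transmit `line` here.
  return SendStatus::kOk;
}
```

The caller can branch on the returned status and learn the reason for the failure, and no output parameter is needed because there is no result to hand back.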
+ +However, if our function actually must return a value, the only way to use error codes is to change its return type to the type that our error codes have, like `int`, which forces us to provide an additional output parameter to our function, like `std::string& result` in our case: + +```cpp +int GetAnswerFromLlm(const std::string& question, std::string& result); ``` -Now it is super clear when reading this function that it might fail because it only _optionally_ returns a string. It also forces us to deal with any potential error happening inside of this function when we call it because the _type_ or the value we get forces us to do it. No hidden error path! +The main issue with this from my point of view is that it is clunky, mixes input/output in the signature, and limits functional composition. Furthermore, nowadays, the compilers are able to perform Return Value Optimization for values returned from a function and this functionality is limited for such input/output parameters. -Note also, that the code of the function itself stayed _exactly_ the same as in the case where we would indicate an error by returning an empty string, just the return type is different! +So clearly, there are some issues with this method too. I believe it has its merits sometimes, but there has to be a reason for it and the performance must be measured well. -## How to work with `std::optional` +### Using `std::optional`: **a better way** -So let's see how we could work with such a function! For this we'll call it a couple of times with various prompts and process the results that we're getting: +With C++17, we gained `std::optional`. 
-`main.cpp` +Now, we can express “might return a value” cleanly: ```cpp -#include "llm.hpp" - -int main() { - const auto suggestion = GetAnswerFromLlm( - "In one word, what should I do with my life?"); - if (!suggestion) return 1; - const auto further_suggestion = GetAnswerFromLlm( - std::string{"In one word, what should I do after doing this: "} + suggestion.value()); - if (!further_suggestion.has_value()) return 1; - std::cout << - "The LLM told me to " << *suggestion << - ", and then to " << further_suggestion.value() << std::endl; - return 0; -} +std::optional GetAnswerFromLlm(const std::string& question); ``` -In general, `std::optional` provides an interface in which we are able to: -- Check if it holds a value by calling its `has_value()` method or implicitly converting it to `bool` -- Get the stored value by calling `value()` or using a dereferencing operator `*` as well as `->` should we want to call methods or ged data of an object stored in the optional wrapper. Beware, though that getting a value of an optional that holds no value is undefined behavior, so _always check_ that there is actually a value stored in an optional. -## Use `std::expected` to tell why a function failed -There is just one more quality of life improvement that we are missing here. If we receive a `std::optional` object that stores a `std::nullopt` as a result of a function call, we know that the function failed. But we don't know **why** it failed. +And use it like this: -This is why in C++23 we are getting a class `std::expected` that, while being very similar to `std::optional` has another template parameter: `std::expected` that stores the type of an error that might be stored in this object instead of the value we expect. 
This way, we can store arbitrary values to indicate that an error has occurred: ```cpp -#include - -std::expected GetAnswerFromLlm(const std::string& question) { - const auto llm_handle = GetLlmHandle(); - if (!llm_handle) { - return std::unexpected{"No network"}; - } - return llm_handle->GetAnswer(question); +int main() { + auto answer = GetAnswerFromLlm("What now?"); + if (!answer) return 1; + std::cout << *answer << "\n"; } ``` -Now if we have a network outage, we can return an error that tells us about this being the case and should the `LlmHandle::GetAnswer` return an expected object of the same type too, it would automagically propagate to the caller of the `GetAnswerFromLlm` function. -## Use `std::optional` to represent optional class fields +The presence or absence of a value is part of the type. No more guessing. No more relying on magic return values or input/output arguments. + +### Using `std::expected`: **add context** - +However, we might notice that `std::optional` only tells us that something went wrong, but not *what* went wrong. -As a a first tiny example, imagine that we want to implement a game character and we have some items that they can hold in either hand (we'll for now assume that the items are of the same pre-defined type for simplicity but could of course extend this example with a class template): +Enter `std::expected`, coming in C++23. ```cpp -struct Character { - Item left_hand_item; - Item right_hand_item; -}; +std::expected GetAnswerFromLlm(const std::string& question); ``` -The character, however, might hold nothing in their hands too, so how do we model this? - -As a naïve solution, we could of course just add two additional boolean values `has_item_in_left_hand` and `has_item_in_right_hand` respectively: +Now we can return either a valid result, or an error message: ```cpp -struct Character { - Item left_hand_item; - Item right_hand_item; - // 😱 Not a great solution, we need to keep these in sync! 
- bool has_item_in_left_hand; - bool has_item_in_right_hand; -}; +if (!IsNetworkAvailable()) { + return std::unexpected("Network unreachable"); +} ``` -This is not a great solution as we would then need to keep these variables in sync and I, for one, do not trust myself with such an important task, especially if I can avoid it. So, speaking of avoiding this, can we somehow bake this information into the stored item types directly? +And the caller handles both cases explicitly. -We _could_ just replace the items with pointers and if there is a `nullptr` stored in either of those it would mean that the character holds no item in the corresponding hand. But this has certain drawbacks as it changes the semantics of these variables. +If we're on C++20 or earlier, we can use `tl::expected` as a drop-in replacement. -```cpp -// 😱 Who owns the items? -struct Character { - Item* left_hand_item; - Item* right_hand_item; -}; -``` +## Performance Considerations for `std::optional` and `std::expected` -Before, our `Character` object had value semantics and now it follows pointer semantics under the hood, meaning that copying our `Character` object would become [harder](memory_and_smart_pointers.md#performing-shallow-copy-by-mistake). +Both `std::optional` and `std::expected` are implemented using `union`-like storage internally—meaning the value and error share memory. -This is not great. The simple decision of allowing the character to have no objects in their hands forces us to actively think about memory, complicating the implementation and forcing unrelated design considerations upon us. +There are still a few things to be aware of: -One way to avoid this issue is to store a `std::optional` in each hand of the character instead: +### Error type size matters -```cpp -struct Character { - std::optional left_hand_item; - std::optional right_hand_item; -}; -``` +With `expected`, the error type affects the size of the object—even when we’re returning a success. 
-Now it is clear just by looking at this tiny code snippet that neither item is required for the correct operation of the character. As a bonus, the object still has value semantics and can be copied and moved without any issues. +So this is something to avoid: -Before we talk about how to use `std:::optional`, I'd like to first talk a bit about another important use-case for it - **error handling**. +```cpp +std::expected SomeFunction(); // bad idea +``` +Every return now has the size of the larger type. +### Return value optimization -## How are they implemented and their performance implications -Largely speaking, both `std::optional` and `std::expected` are both implemented as a `union` in C++, meaning that the expected and unexpected values are stored _in the same underlying memory_ with helper functions allowing us to query which one is actually stored there. +To avoid unnecessary copies, we should return values directly: -This means that if the unexpected type has a smaller memory footprint than the expected type, then there is no memory overhead. This leads us to the first performance consideration: **we should not use large types for the _unexpected_ type in `std::expected`**. Otherwise, we might be wasting a lot of memory: ```cpp -// 😱 Not a great idea. -std::expected SomeFunction(); +std::expected GetAnswer() { + return std::expected{"Answer"}; +} ``` -Here, instead of returning a tiny `int` object we will now always return an object that takes the same amount of memory as `HugeType`. As allocating memory is work, this will also most probably be slower than returning tiny integer numbers. - -The good news here is that there is not much we can do wrong with `std::optional` on this front as it holds a small `std::nullopt` type if it does not hold the expected return type. +Avoid creating a local variable and returning it unless needed. Let the compiler optimize the value construction. 
+ +Jason Turner has a good [video on this](https://www.youtube.com/watch?v=0yJk5yfdih0) if you want to dig into the details. -As you might have already guessed, both `std::optional` and `std::variant` are class templates. Which means that they are created and checked at compile-time. Which incidentally allows the compiler to optimize the code that uses them quite well. This in turn means that generally neither `std::optional` nor `std::expected` have much of a runtime overhead. +## Summary -That being said, they might not be completely for free which leads us to our second performance consideration: **if we have a very tight loop that does not use `optional` or `expected` values, we must measure the runtime of your code if we introduce those and make sure that performance is still satisfied**. +We went through quite some material today. To summarize what we've talked about, I'd recommend the following: -Finally, there are some quirks around how the compilers are able to optimize the code when a function returns `optional` or `expected` values. If we create objects that we aim to return in a wrong way, the compiler might generate unnecessary moves or copies of the objects. Here is how to return our objects to avoid this: - -For more please see a [short and clear video by Jason Turner](https://www.youtube.com/watch?v=0yJk5yfdih0) that covers this topic. +- Use `std::optional` when a value might be missing. +- Use `std::expected` when we want to return either a result or an error. +- Avoid exceptions in time-critical or safety-critical systems. +- Avoid other ways of handling errors if possible. -## Summary -Overall, classes like `std::optional` and `std::expected` are extremely useful to represent values that optionally hold a value. Sometimes it is enough for us to know that the value simply might not be there, without caring for a reason behind this, that's where `std::optional` shines. 
But sometimes, especially when returning from functions, we would also like to know **why** the value does not exist and that's what `std::expected` has been added for in C++23. Oh, and if you'd like to use something like `std::expected` before C++23, take a peek at `tl::expected`, I've gotten some good mileage out of it over the years. +These tools let us make failure explicit and force the caller to handle it. That leads to clearer, safer, and more maintainable code. -These classes are very useful - they make the intent behind our code crystal-clear. They also allow us to keep the code readable and performant. +One final thing I wanted to add is that obviously, the `std::optional` class can be used also in other places, not just as a return type from a function. If some object of ours must have an optional value, using `std::optional` can be a good idea there too! But I'm sure you're going to be able to figure this out from the related cppreference page. - + From 1c61316194862c4e3de2a1cf3012b7ed645ec006 Mon Sep 17 00:00:00 2001 From: Igor Bogoslavskyi Date: Wed, 4 Jun 2025 01:22:21 +0200 Subject: [PATCH 11/26] Near final version of text for error handling --- lectures/optional.md | 259 +++++++++++++++++++++++++------------------ 1 file changed, 154 insertions(+), 105 deletions(-) diff --git a/lectures/optional.md b/lectures/optional.md index bd6112f..2d4fac3 100644 --- a/lectures/optional.md +++ b/lectures/optional.md @@ -16,6 +16,7 @@ - [What exceptions are](#what-exceptions-are) - [Exceptions are (sometimes) expensive](#exceptions-are-sometimes-expensive) - [Exceptions hide the error path](#exceptions-hide-the-error-path) + - [Exceptions are banned in many code bases](#exceptions-are-banned-in-many-code-bases) - [Returning errors explicitly can work better if done well](#returning-errors-explicitly-can-work-better-if-done-well) - [Returning a value indicating error does not always work 😱](#returning-a-value-indicating-error-does-not-always-work-) - [Returning 
an error code breaks "pure functions" 😱](#returning-an-error-code-breaks-pure-functions-) @@ -38,11 +39,11 @@ And as this topic is quite nuanced, there will definitely be some statements tha Following up on what I've just said, I'd like to start with a disclaimer. -This isn’t a one-size-fits-all topic. C++ is huge, powerful, and used across every domain imaginable for a long time. +This isn’t a one-size-fits-all topic. C++ is huge, powerful, and used across every domain imaginable for a long-long time. *My* perspective comes from domains like robotics and automotive—where predictability, traceability, and safety are of highest importance. What works for us may not work for everyone. -That being said, I believe that what I present here will fit to many domains with minimal adaptation. Where possible, I’ll try to mention multiple possible options and if I *do* miss an important one—please let me know! +That being said, I believe that what I present here will fit many other domains with minimal adaptation and is grounded in sane reasoning. Where possible, I’ll try to mention multiple possible options and if I *do* miss an important one—please let me know! @@ -50,14 +51,15 @@ That being said, I believe that what I present here will fit to many domains wit Before we go into how to handle errors, let’s clarify what we mean when we say "error" in the first place. -At the highest level: an error is when the code doesn’t produce the expected result. But there is nuance here! +At the highest level: an error is something that happens when the code doesn’t produce the expected result. But there is nuance here! -We generally split these into two broad groups: +I like to think of errors belonging to one of two broad groups: -- **Unrecoverable errors** — where the program reaches an invalid or inconsistent state, and continuing could be unsafe or meaningless. 
-- **Recoverable errors** — where the program can detect something went wrong, and has ways to proceed by an alternative path. +- **Unrecoverable errors** — where the program reaches a state where recovery is impossible or meaningless. +- **Recoverable errors** — where the program can detect that something went wrong, and has ways to proceed following an alternative path. -Some languages—like Rust—bake this distinction into the type system. C++ doesn’t. But the classification is still useful, especially when designing interfaces. +Some languages—like Rust—bake this distinction [into the language design](https://doc.rust-lang.org/book/ch09-00-error-handling.html). C++ doesn’t. +But, for my money, this classification is still useful. So let's talk a bit more in-depth about these kinds of errors and I hope that you'll agree with my logic by the end of it. # Unrecoverable errors: **fail early** @@ -65,51 +67,41 @@ Some languages—like Rust—bake this distinction into the type system. C++ doe Let’s start with the errors we don’t want to recover from. -These usually come from bugs: a violated precondition, accessing something that shouldn’t be accessed, or hitting undefined behavior. In all of these cases the program is already in some unknown state, so we have no guarantees on anything that happens next. So recovery is most likely impossible. +These usually come from bugs: a violated precondition, accessing something that shouldn’t be accessed, or hitting undefined behavior. In all of these cases the program is already in some unknown or unexpected state, so we have no guarantees on anything that happens next. Which means that recovery is most likely impossible and we are probably better off not even trying to recover. We often want to catch these types of errors as early as possible—and crash as early as possible—before any more damage is done. -A typical approach is to enforce contracts at function boundaries. 
My favorite method is to use the `CHECK` macro that can be found in Abseil library. Here’s a tiny example of checking if the element actually exists in a vector before returning it: +A typical approach is to enforce contracts at function boundaries. My favorite method is to use the [`CHECK`](https://abseil.io/docs/cpp/guides/logging#CHECK) macro that can be found in the [Abseil library](https://abseil.io/docs/). Here’s a toy example of registering a robot by id: ```cpp #include +#include -int GetElementAt(const std::vector& v, std::size_t index) { - CHECK(index < v.size()); // Contract check. - return v[index]; +void RegisterRobot(const std::string& robot_id) { + CHECK(!robot_id.empty()) << "Robot ID cannot be empty"; + // Perform some registration logic. } ``` -This is simple: we check upfront that an index is actually valid. If it isn’t, we crash instead of going into the undefined behavior land. +The idea is simple: we check upfront that the id is actually valid. If it isn’t, we crash instead of going into the undefined behavior land. The Abseil library handles all the additional niceties for us, like showing a stack trace for the failure as well as showing an optional message that explains the failure. -Same pattern can be used in any other places where certain pre-conditions must be met in order to proceed, like in this example where some `data` object needs to be valid in order to be processed: - -```cpp -void ProcessSensorData(const SensorData& data) { - CHECK(data.IsValid()); - // Safe to process data here. -} -``` - -We don't try to continue with invalid data. We stop. + Under this philosophy, we essentially treat bugs as bugs—not as conditions we can try to live with. ## How to minimize number of unrecoverable errors -Of course, we'd rather not hit them at all. In practice, we rely on: +Of course, we'd rather not have the bugs we're talking about here at all. 
In practice, we aim to keep the test coverage high for our code, ideally close to 100% line and branch coverage. -- High test coverage—ideally line coverage close to 100%. -- Contract checks on inputs and outputs. -- Assertions during development to catch bad assumptions early. +However, this does not guarantee that our program will not hit a `CHECK` failure in production, so we have to think about these scenarios too. -In safety-critical systems, we often isolate components into separate processes or even hardware units, with watchdogs that can trigger recovery actions if something crashes. This way we minimize the time to failure while keeping the system safe as a whole even when certain components fail. +In safety-critical systems, we often isolate components into separate processes or even hardware units, with watchdogs that can trigger recovery actions if something crashes. This way we can have our cake and eat it at the same time: using `CHECK` minimizes the time-to-failure when a bug is encountered, while our fallback options keep the system safe as a whole even when certain components fail. -But again, that’s recovery at the system level—not inside the code where the error occurred. This is a large architecture topic in itself and is far beyond what I want to talk about today. +That being said, such design of a system as a whole is a large architecture topic in itself and is far beyond what I want to talk about today. # Recoverable errors: **handle and proceed** -Now, what I actually want to talk about today is **recoverable errors**. +Now, what I actually *want* to talk about today is **recoverable errors**. To talk about them, let's start with an example function: @@ -117,29 +109,29 @@ To talk about them, let's start with an example function: std::string GetAnswerFromLlm(const std::string& question); ``` -This function is supposed to call some LLM over the network. It’s supposed to return a response. But what if the network is down? 
Or we ran out of LLM tokens? +This function is supposed to call some LLM over the network. It’s supposed to return a response. But what if the network is down? Or we ran out of LLM tokens? Or the AI became self aware and is refusing to answer our stupid questions? -We have to decide: what does the function do in that case? +Until that happens it is up to us to decide what our function does in case of a failure! -We have two broad strategies of communicating a failure that have emerged in C++ over the years: +Broadly speaking, we have two strategies of communicating failures like these that have emerged in C++ over the years: 1. **Return a special value from a function.** 2. Throw an exception. -We’ll spend most of our time on the first one—but let’s first spend some time and talk about what might be wrong with just throwing an exception. +We’ll spend most of our time on the first one—but let’s first spend some time and talk about throwing exceptions, and yes, why I think it might not be the best thing we could do. This is the time to get your pitchforks ready 😉. ## Exceptions ### What exceptions are -Since C++98 we have a powerful machinery of exceptions at our disposal. An exception is essentially just an object of some class, typically derived from [`std::exception`](https://en.cppreference.com/w/cpp/error/exception.html) class. Such an exception holds the information about the underlying failure and can be "throws" and "caught" within a C++ program. +Since C++98 we have a powerful machinery of exceptions at our disposal. An exception is essentially just an object of some type, typically derived from [`std::exception`](https://en.cppreference.com/w/cpp/error/exception.html) class. Such an exception holds the information about the underlying failure and can be "thrown" and "caught" within a C++ program. 
-In our example, we could throw a `std::runtime_error` when the network is not available: +In our example function, we could throw an object of `std::runtime_error` when the network is not available: ```cpp std::string GetAnswerFromLlm(const std::string& question) { - auto llm = GetLlmHandle(); - if (!llm) throw std::runtime_error("No network connection"); + const auto llm = GetLlmHandle(); // Assuming GetLlmHandle exists. + if (!llm) throw std::runtime_error{"No network connection"}; return llm->GetAnswer(question); } ``` @@ -150,14 +142,14 @@ And handle this exception by catching it by reference, which is possible because #include std::string GetAnswerFromLlm(const std::string& question) { - auto llm = GetLlmHandle(); - if (!llm) throw std::runtime_error("No network connection"); + const auto llm = GetLlmHandle(); // Assuming GetLlmHandle exists. + if (!llm) throw std::runtime_error{"No network connection"}; return llm->GetAnswer(question); } int main() { try { - auto response = GetAnswerFromLlm("What should I do?"); + const auto response = GetAnswerFromLlm("What should I do?"); std::cout << response << "\n"; } catch (const std::exception& e) { std::cerr << "Error: " << e.what() << "\n"; @@ -169,39 +161,37 @@ On paper, this looks clean. But there are problems. ### Exceptions are (sometimes) expensive -Exceptions typically [allocate memory on the heap](memory_and_smart_pointers.md#the-heap) when thrown, and rely on **R**un-**T**ime **T**ype **I**nformation (RTTI) to propagate through the call stack. +Exceptions typically [allocate memory on the heap](memory_and_smart_pointers.md#the-heap) when thrown, and rely on **R**un-**T**ime **T**ype **I**nformation ([RTTI](https://en.wikipedia.org/wiki/Run-time_type_information)) to propagate through the call stack. There is a [great talk by Andreas Weiss](https://www.youtube.com/watch?v=kO0KVB-XIeE), my former colleague at BMW, that goes into a lot of detail how exactly exceptions behave. 
The talk is called "Exceptions demystified" and I urge you to give it a watch if you want to know *all* the details! -Both of these operations happen at runtime of the program and cost some time. Unfortunately, there are no guarantees on timing or performance of these operations. While in most common scenarios these operations run fast-enough, in real-time or safety-critical code, such unpredictability is unacceptable. +But long story short, both throwing and catching exceptions rely on mechanisms that work at runtime and therefore cost execution time. -Every serious project I’ve worked on either banned exceptions completely, or avoided them in performance-critical paths. - - - - +Unfortunately, there are no guarantees on timing or performance of these operations. While in most common scenarios these operations run fast enough, in real-time or safety-critical code, such unpredictability is unacceptable. ### Exceptions hide the error path -Exceptions also make control flow harder to reason about. +Exceptions also arguably make control flow harder to reason about. To quote the Google C++ Style Guide: + +> Exceptions make the control flow of programs difficult to evaluate by looking at code: functions may return in places you don't expect. This causes maintainability and debugging difficulties. -The error can propagate across many layers of calls before being caught. It’s easy to miss what a function might throw—especially if documentation is incomplete or out of date (which it almost always is). +Indeed, an error can propagate across many layers of calls before being caught. It’s easy to miss what a function might throw—especially if documentation is incomplete or out of date (which it almost always is). -Furthermore, we can use generic catch blocks like `catch (...)` and these make things even worse. We end up catching *something*, but we no longer know what or why. 
+Furthermore, the language permits the use of generic catch blocks like `catch (...)` and these make things even more confusing. We end up catching *something*, but we no longer know what or who threw it at us! 😱 -Here's a real-world style example: +In our own example, if `GetAnswerFromLlm` throws an undocumented `std::logic_error` but we only expect `std::runtime_error`, we might miss important context or even crash anyway: ```cpp #include std::string GetAnswerFromLlm(const std::string& question) { - auto llm = GetLlmHandle(); + const auto llm = GetLlmHandle(); if (!llm) throw std::runtime_error("No network connection"); return llm->GetAnswer(question); } int main() { try { - auto answer = GetAnswerFromLlm("What’s the meaning of life?"); - std::cout << answer << "\n"; + const auto response = GetAnswerFromLlm("What’s the meaning of life?"); + std::cout << response << "\n"; } catch (...) { // Not very helpful, is it? std::cerr << "Oops, something happened.\n"; @@ -211,13 +201,25 @@ int main() { Video Thumbnail -If `GetAnswerFromLlm` throws `std::logic_error` but we only expect `std::runtime_error`, we might miss important context or even crash anyway. +I believe that `catch(...)` and equivalent constructs are singlehandedly responsible for the absolute majority of the fun error messages that we can see all over the internet and have probably encountered ourselves multiple times. + +### Exceptions are banned in many code bases + +All of these issues led a lot of code bases to ban exceptions altogether. In 2019, isocpp.org did a [survey](https://isocpp.org/files/papers/CppDevSurvey-2018-02-summary.pdf) on this matter and found that about half the respondents could not use exceptions at least in part of their code bases. + +My own experience aligns with these results - every serious project I’ve worked on either banned exceptions completely, or avoided them in performance-critical paths. 
But then again, I did work in robotics and automotive for the majority of my career. -I believe that `catch(...)` and equivalent constructs are singlehandedly responsible for the absolute majority of the funny error messages that you've undoubtedly seen all over the internet. +The problem of using exceptions with an acceptable overhead has quite vibrant discussions around it with even calls for re-imagining exceptions altogether as can be seen in this [wonderful talk by Herb Sutter](https://www.youtube.com/watch?v=ARYP83yNAWk) from CppCon 2019 as well as his [corresponding paper](https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p0709r4.pdf) on this topic. + + + +But until the C++ community figures out what to do we are stuck with many people being unable to use the default error handling mechanism in C++. + +So what do we do? ## Returning errors explicitly can work better if done well -Instead of throwing exceptions, we can encode failure directly in the return value from our function. +Now is the time to return to the other option we hinted at before: dealing with errors by returning a special value from a function. I would say that there are three distinct ways of thinking about it. Let's illustrate all of them on a function we've already looked at: @@ -228,26 +230,34 @@ std::string GetAnswerFromLlm(const std::string& question); We can: -1. Return a special **value** of the same return type, `std::string` in our case -2. Return an error code, which would change the signature of the function to return `int` instead: +1. Keep the return type, `std::string` in our case, but return a special **value** of this type +2. Return an **error code**, which would change the signature of the function to return `int` or a similar type instead: ```cpp int GetAnswerFromLlm(const std::string& question, std::string& result); ``` -3. 
Return a **different type** specifically designed to encode failure states alongside the actual return, like `std::optional` which only holds a valid `std::string` in case of success. -I believe that the third option is the best out of these three, but let me explain why first, before going deeper into details. +I believe that the third option is the best out of these three, but let me explain why the first two are not cutting it, before going deeper into details. ### Returning a value indicating error does not always work 😱 -There is a number of issues with returning a special error value from a function. In our case, a naïve choice would be to return an empty string if there was no answer from the LLM, but what if we asked the LLM something along the lines of "read this file, return empty string when done"? An empty string *is* the valid response here! +There is a number of issues with returning a special value from a function without using a special return type. As an illustration, in our case, a naïve choice would be to return an empty string if there was no answer from the LLM, but what if we asked the LLM something along the lines of "read this file, return empty string when done"? An empty string *is* the valid response here! How do we distinguish this output from a failure? + +```cpp +std::string GetAnswerFromLlm(const std::string& question) { + const auto llm = GetLlmHandle(); + if (!llm) return ""; // 😱 Not a great idea! + return llm->GetAnswer(question); +} +``` -Similar cases can be constructed for most values we can come up with. In addition to that, there is no easy way to encode the *reason* for the failure, like that the network was down. And a final nail in the coffin of this method is that it does not work at all for functions that return `void` for obvious reasons. +Similar cases can be constructed for most values we can come up with. 
In addition to that, there is no easy way to encode the *reason* for the failure - we do want to know if we failed due to a network timeout or due to an imminent AI world takeover. And a final nail in the coffin of this method is that it does not work at all for functions that return `void` for obvious reasons. ### Returning an error code breaks "pure functions" 😱 -Returning an error code solves at least a couple of issues for us. It is fast and reliable and we can design our software with different error codes in mind so that the reason for the failure is also communicated to us. This is also still the prevalent way of handling errors in C, so there _is_ some merit to this method. +Returning an error code instead solves at least a couple of these issues. It is fast and reliable and we can design our software with different error codes in mind so that the reason for the failure is also communicated to us. This is also still the prevalent way of handling errors in C and in some libraries that we can find in the wild, so there *is* some merit to this method. However, if our function actually must return a value, the only way to use error codes is to change its return type to the type that our error codes have, like `int`, which forces us to provide an additional output parameter to our function, like `std::string& result` in our case: @@ -255,97 +265,136 @@ However, if our function actually must return a value, the only way to use error int GetAnswerFromLlm(const std::string& question, std::string& result); ``` -The main issue with this from my point of view is that it is clunky, mixes input/output in the signature, and limits functional composition. Furthermore, nowadays, the compilers are able to perform Return Value Optimization for values returned from a function and this functionality is limited for such input/output parameters. 
+The main issue with this from my point of view is that it is clunky, mixes input/output in the signature, and limits functional composition. Consider how we would use this function: + +```cpp +int main() { + std::string response{}; // Can't be const! + const auto success = GetAnswerFromLlm("What’s the meaning of life?", response); + if (!success) { + std::cerr << "Could not get the result from LLM\n"; + return 1; + } + std::cout << response << "\n"; +} +``` + +In this code, we have to create an empty string before calling the `GetAnswerFromLlm` function. Furthermore, this string cannot be `const`, which goes against everything we've been talking about in this series until now. -So clearly, there are some issues with this method too. I believe it has its merits sometimes, but there has to be a reason for it and the performance must be measured well. +On top of all this, nowadays, compilers are able to perform Return Value Optimization (or [RVO](https://en.cppreference.com/w/cpp/language/copy_elision.html)) for values returned from a function and this functionality is limited for such input/output parameters. -### Using `std::optional`: **a better way** +So clearly, there are some issues with this method too. I believe it has its merits sometimes, but there has to be a reason for it and we must measure the performance well. -With C++17, we gained `std::optional`. +### Using `std::optional`: **a better way** -Now, we can express “might return a value” cleanly: +I believe that there *is* a better way. 
With C++17, we gained [`std::optional`](https://en.cppreference.com/w/cpp/utility/optional.html) with which we can express that a function “might return a value” if everything goes well: ```cpp std::optional GetAnswerFromLlm(const std::string& question); ``` -And use it like this: +Now our function returns an object of a different type, `std::optional` that we can use in an `if` statement to find out if it actually holds a value, which we can get to by calling its `value()` method or using a dereferencing operator `*` just like with pointers: ```cpp int main() { - auto answer = GetAnswerFromLlm("What now?"); - if (!answer) return 1; - std::cout << *answer << "\n"; + const auto answer = GetAnswerFromLlm("What now?"); + if (!answer.has_value()) return 1; + std::cout << answer.value() << "\n"; + std::cout << *answer << "\n"; // Same as above. } ``` -The presence or absence of a value is part of the type. No more guessing. No more relying on magic return values or input/output arguments. +The presence or absence of a value is encoded into the type itself. No more guessing. No more relying on magic return values or input/output arguments. And as always, we can find more information about how to use it at [cppreference.com](https://en.cppreference.com/w/cpp/utility/optional.html). ### Using `std::expected`: **add context** -However, we might notice that `std::optional` only tells us that something went wrong, but not *what* went wrong. +However, we might notice that `std::optional` only tells us that *something* went wrong, but not *what* went wrong. We're still interested in a reason! + +Enter [`std::expected`](https://en.cppreference.com/w/cpp/utility/expected.html), coming in C++23. And if you'd like to know what led to it being added to the language, give this [fantastic talk by Andrei Alexandrescu](https://www.youtube.com/watch?v=PH4WBuE1BHI) a watch! It is one of my favorite talks ever! It is both informative and entertaining in an equal measure! 
-Enter `std::expected`, coming in C++23. + + +With `std::expected` we could do the same things we could with `std::optional` and more by changing our function accordingly: ```cpp std::expected GetAnswerFromLlm(const std::string& question); ``` -Now we can return either a valid result, or an error message: +Essentially, `std::expected` holds one of two values of two potentially different types - an expected or an unexpected one. Now we can return either a valid result, or an error message: ```cpp -if (!IsNetworkAvailable()) { - return std::unexpected("Network unreachable"); +std::expected GetAnswerFromLlm(const std::string& question) { + const auto llm = GetLlmHandle(); + if (!llm) return std::unexpected("Cannot get LLM handle."); + return llm->GetAnswer(question); } ``` -And the caller handles both cases explicitly. +This has all the benefits we mentioned before: -If we're on C++20 or earlier, we can use `tl::expected` as a drop-in replacement. +- The signature of our function clearly states that it might fail +- The error if it happens needs to be dealt explicitly by the caller +- Everything happens in deterministic time with no RTTI overhead +- World for functions returning `void` too -## Performance Considerations for `std::optional` and `std::expected` +Using it is also quite neat: -Both `std::optional` and `std::expected` are implemented using `union`-like storage internally—meaning the value and error share memory. +```cpp +int main() { + const auto answer = GetAnswerFromLlm("What now?"); + if (!answer.has_value()) { + std::cerr << answer.error() << "\n"; + return 1; + } + std::cout << answer.value() << "\n"; +} +``` + +There is just one tiny issue that spoils our fun. As you've probably noticed, most of the things we covered until now targeted C++17, and `std::expected` is only available from C++23 on. 
But there is a solution to this: we can use [`tl::expected`](https://github.com/TartanLlama/expected) as a drop-in replacement for code bases that have not yet adopted C++23. -There are still a few things to be aware of: +## Performance Considerations for `std::optional` and `std::expected` + +One final thing I want to talk about is performance considerations when using `std::optional` and `std::expected`. ### Error type size matters -With `expected`, the error type affects the size of the object—even when we’re returning a success. +Both `std::optional` and `std::expected` are implemented using [`union`](https://en.cppreference.com/w/cpp/language/union.html)-like storage internally—meaning the value and error share memory with the bigger type defining the amount of memory allocated. Note that we should not use `union` directly in our code, but a number of standard classes use it under the hood. + +In the case of `std::optional` this does not make much of a difference, as the "error" type is a tiny `std::nullopt_t` type but for `std::expected`, the error type affects the size of the object—even when we’re returning a success. So this is something to avoid: ```cpp -std::expected SomeFunction(); // bad idea +// Bad idea, wasting memory 😱 +std::expected SomeFunction(); ``` -Every return now has the size of the larger type. +Every return now has the size of the larger type. Don't do this! + + ### Return value optimization -To avoid unnecessary copies, we should return values directly: +There is also one quirk with how these types interact with the return value optimization and named return value optimization in C++. These topics are quite nuanced, but in general, the rule of thumb here is quite simple: we should prefer constructing the `std::expected` and `std::optional` objects in-place rather than creating a local variable first. 
-```cpp -std::expected GetAnswer() { - return std::expected{"Answer"}; -} -``` +For more details, I'll refer you to a [short video by Jason Turner](https://www.youtube.com/watch?v=0yJk5yfdih0) on this. -Avoid creating a local variable and returning it unless needed. Let the compiler optimize the value construction. +## Summary -Jason Turner has a good [video on this](https://www.youtube.com/watch?v=0yJk5yfdih0) if you want to dig into the details. +We went through quite some material today. We've looked at all the various kinds of ways to deal with errors happening in our (and somebody else's) code. As a short summary, I hope that I could convince you that these are some sane suggestions: -## Summary + -We went through quite some material today. To summarize what we've talked about, I'd recommend the following: +- Use `CHECK` and similar macros for dealing with unrecoverable errors like programming bugs or contract violation. +- Use `std::optional` as a return type when a value might be missing due to a recoverable error occurring. +- Use `std::expected` when a reason for failure is important to know. +- Keep the test coverage of the code high to reduce chances of missing errors. +- Avoid exceptions in time-critical or safety-critical systems due to their non-deterministic runtime overhead. -- Use `std::optional` when a value might be missing. -- Use `std::expected` when we want to return either a result or an error. -- Avoid exceptions in time-critical or safety-critical systems. -- Avoid other ways of handling errors if possible. +All in all, the overall direction that we seem to be following as a community is to make failure explicit and force the caller to handle it. That leads to clearer, safer, and more maintainable code. -These tools let us make failure explicit and force the caller to handle it. That leads to clearer, safer, and more maintainable code. 
+One final thing I wanted to add is that obviously, the `std::optional` class can be used also in other places, not just as a return type from a function. If some object of ours must have an optional value, using `std::optional` can be a good idea there too! But I'm sure you're going to be able to figure this out from the related [cppreference page](https://en.cppreference.com/w/cpp/utility/optional.html). -One final thing I wanted to add is that obviously, the `std::optional` class can be used also in other places, not just as a return type from a function. If some object of ours must have an optional value, using `std::optional` can be a good idea there too! But I'm sure you're going to be able to figure this out from the related cppreference page. + +See you in the next one! Bye! --> From 3e8da8fd9a7c61ef808de36829ca2e42197539ec Mon Sep 17 00:00:00 2001 From: Igor Bogoslavskyi Date: Wed, 4 Jun 2025 01:24:26 +0200 Subject: [PATCH 12/26] Renamed lecture into error handling --- lectures/{optional.md => error_handling.md} | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) rename lectures/{optional.md => error_handling.md} (99%) diff --git a/lectures/optional.md b/lectures/error_handling.md similarity index 99% rename from lectures/optional.md rename to lectures/error_handling.md index 2d4fac3..540ae10 100644 --- a/lectures/optional.md +++ b/lectures/error_handling.md @@ -333,7 +333,7 @@ std::expected GetAnswerFromLlm(const std::string& ques This has all the benefits we mentioned before: - The signature of our function clearly states that it might fail -- The error if it happens needs to be dealt explicitly by the caller +- The error if it happens needs to be dealt with explicitly by the caller - Everything happens in deterministic time with no RTTI overhead - World for functions returning `void` too From 3ec03decff9905d3a12a818b929c3a9489cae528 Mon Sep 17 00:00:00 2001 From: Igor Bogoslavskyi Date: Wed, 4 Jun 2025 01:24:51 +0200 Subject: [PATCH 13/26] Fix wrong 
word --- lectures/error_handling.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/lectures/error_handling.md b/lectures/error_handling.md index 540ae10..1f4d028 100644 --- a/lectures/error_handling.md +++ b/lectures/error_handling.md @@ -335,7 +335,7 @@ This has all the benefits we mentioned before: - The signature of our function clearly states that it might fail - The error if it happens needs to be dealt with explicitly by the caller - Everything happens in deterministic time with no RTTI overhead -- World for functions returning `void` too +- Works for functions returning `void` too Using it is also quite neat: From a749a4b8ccbecc31632b72fb948172c651a5fae7 Mon Sep 17 00:00:00 2001 From: Igor Bogoslavskyi Date: Wed, 4 Jun 2025 01:26:24 +0200 Subject: [PATCH 14/26] Minor restructuring --- lectures/error_handling.md | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/lectures/error_handling.md b/lectures/error_handling.md index 1f4d028..458de05 100644 --- a/lectures/error_handling.md +++ b/lectures/error_handling.md @@ -330,13 +330,6 @@ std::expected GetAnswerFromLlm(const std::string& ques } ``` -This has all the benefits we mentioned before: - -- The signature of our function clearly states that it might fail -- The error if it happens needs to be dealt with explicitly by the caller -- Everything happens in deterministic time with no RTTI overhead -- Works for functions returning `void` too - Using it is also quite neat: ```cpp @@ -350,6 +343,13 @@ int main() { } ``` +This has all the benefits we mentioned before: + +- The signature of our function clearly states that it might fail +- The error if it happens needs to be dealt with explicitly by the caller +- Everything happens in deterministic time with no unpredictable runtime overhead +- Works for functions returning `void` too + There is just one tiny issue that spoils our fun. 
As you've probably noticed, most of the things we covered until now targeted C++17, and `std::expected` is only available from C++23 on. But there is a solution to this: we can use [`tl::expected`](https://github.com/TartanLlama/expected) as a drop-in replacement for code bases that don't yet adopt C++23. ## Performance Considerations for `std::optional` and `std::expected` From b815d8e5b8913e53a37d856cde865a45e091efa0 Mon Sep 17 00:00:00 2001 From: Igor Bogoslavskyi Date: Wed, 11 Jun 2025 22:45:48 +0200 Subject: [PATCH 15/26] Chaging examples to better ones --- .../robot_example_simple/robot_example.cpp | 69 +++ lectures/error_handling.md | 436 +++++++++++++++--- 2 files changed, 451 insertions(+), 54 deletions(-) create mode 100644 lectures/code/error_handling/robot_example_simple/robot_example.cpp diff --git a/lectures/code/error_handling/robot_example_simple/robot_example.cpp b/lectures/code/error_handling/robot_example_simple/robot_example.cpp new file mode 100644 index 0000000..06370a0 --- /dev/null +++ b/lectures/code/error_handling/robot_example_simple/robot_example.cpp @@ -0,0 +1,69 @@ +#include +#include +#include + +// Can be arbitrary types, here int for simplicity. +using Robot = int; +using Mission = int; + +// This should be a class, using struct for simplicity. +struct MissionRobotAssignments { + + void AssignRobot(int assignment_index, const Robot& robot) { + assert((assignment_index < robots.size()) && (assignment_index >= 0)); + robots[assignment_index] = robot; + } + + void Print() const { + assert(robots.size() == missions.size()); + for (auto i = 0UL; i < robots.size(); ++i) { + std::cout << i << ": Mission " << // + missions[i] << " is carried out by the robot " << // + robots[i] << std::endl; + } + } + + std::vector missions{}; + std::vector robots{}; +}; + +// Multiple issues here for now. +// We should handle failure to get a proper value. +// We also could use a struct in place of a pair. 
+std::pair GetNextChangeEntryFromUser( + const MissionRobotAssignments& assignments) { + std::pair entry{}; + std::cout << "Please select mission index." << std::endl; + std::cin >> entry.first; + std::cout << "Please provide new robot id." << std::endl; + std::cin >> entry.second; + return entry; +} + +bool CheckIfUserWantsChanges() { + std::cout << "Do you want to change assignment? [y/n]" << std::endl; + std::string answer{}; + std::cin >> answer; + if (answer == "y") { return true; } + return false; +} + +int main() { + MissionRobotAssignments assignments{{Mission{42}, Mission{40}}, + {Robot{10}, Robot{23}}}; + assignments.Print(); + while (true) { + const auto user_wants_changes = CheckIfUserWantsChanges(); + if (!user_wants_changes) { break; } + const auto change_entry = GetNextChangeEntryFromUser(assignments); + assignments.AssignRobot(change_entry.first, change_entry.second); + } + assignments.Print(); + + std::cout << "Address of assignments.missions.data(): " + << assignments.missions.data() << std::endl; + std::cout << "Address of assignments.robots.data(): " + << assignments.robots.data() << std::endl; + const auto diff = assignments.robots.data() - assignments.missions.data(); + std::cout << "Diff in address: " << diff << std::endl; +} diff --git a/lectures/error_handling.md b/lectures/error_handling.md index 458de05..ac73f4d 100644 --- a/lectures/error_handling.md +++ b/lectures/error_handling.md @@ -5,11 +5,16 @@ Video Thumbnail

- - [Disclaimer](#disclaimer) - [What Do We Mean by “Error”?](#what-do-we-mean-by-error) - [Unrecoverable errors: **fail early**](#unrecoverable-errors-fail-early) + - [Intro to unrecoverable errors](#intro-to-unrecoverable-errors) + - [Setting up an example](#setting-up-an-example) + - [Unrecoverable error example](#unrecoverable-error-example) - [How to deal with unrecoverable errors](#how-to-deal-with-unrecoverable-errors) + - [Catch them as early as possible](#catch-them-as-early-as-possible) + - [Don't use `assert`](#dont-use-assert) + - [Use `CHECK` macro instead](#use-check-macro-instead) - [How to minimize number of unrecoverable errors](#how-to-minimize-number-of-unrecoverable-errors) - [Recoverable errors: **handle and proceed**](#recoverable-errors-handle-and-proceed) - [Exceptions](#exceptions) @@ -27,96 +32,396 @@ - [Return value optimization](#return-value-optimization) - [Summary](#summary) -When writing C++ code, much like in life, we don’t always get what we want. The good news is that we can prepare for this! +When writing C++ code, much like in life, we don’t always get what we want. The good news is that we can prepare for this and recover from not getting what we want! -And just like everything else in C++, there are… a lot of ways to do that. +But, just like with everything else in C++, there are… a lot of ways to do that. 😅 -Today we’re talking about error handling. What options we have, which trade-offs they come with, and what modern C++ gives us to make our lives a bit easier. +Today we’re talking about error handling. What options we have, which trade-offs they come with, and what tools modern C++ gives us to make our lives a bit easier. And as this topic is quite nuanced, there will definitely be some statements that are quite opinionated and I can already see some people with pitchforks coming my way... so... I'm sure it's gonna be fun! -# Disclaimer + -Following up on what I've just said, I'd like to start with a disclaimer. 
+# Disclaimer -This isn’t a one-size-fits-all topic. C++ is huge, powerful, and used across every domain imaginable for a long-long time. +This is definitely *not* a one-size-fits-all topic. C++ is huge, powerful, and used across every domain imaginable for a long-long time. -*My* perspective comes from domains like robotics and automotive—where predictability, traceability, and safety are of highest importance. What works for us may not work for everyone. +*My* perspective comes from domains like robotics and automotive—where predictability and safety are of highest importance. What works for us may not work for everyone. -That being said, I believe that what I present here will fit many other domains with minimal adaptation and is grounded in sane reasoning. Where possible, I’ll try to mention multiple possible options and if I *do* miss an important one—please let me know! +That being said, I believe that what we talk about here will fit many other domains with minimal adaptation and is grounded in sane reasoning. Where possible, I’ll try to mention multiple possible options and if I *do* miss an important one—please let me know! + + # What Do We Mean by “Error”? Before we go into how to handle errors, let’s clarify what we mean when we say "error" in the first place. -At the highest level: an error is something that happens when the code doesn’t produce the expected result. But there is nuance here! +At the highest level: an error is something that happens when the code doesn’t produce the result we expect. But there is nuance here! -I like to think of errors belonging to one of two broad groups: +I tend to think of errors belonging to one of two broad groups: - **Unrecoverable errors** — where the program reaches a state where recovery is impossible or meaningless. - **Recoverable errors** — where the program can detect that something went wrong, and has ways to proceed following an alternative path. 
-Some languages—like Rust—bake this distinction [into the language design](https://doc.rust-lang.org/book/ch09-00-error-handling.html). C++ doesn’t. -But, for my money, this classification is still useful. So let's talk a bit more in-depth about these kinds of errors and I hope that you'll agree with my logic by the end of it. +Some languages—like Rust—bake this distinction [into the language design](https://doc.rust-lang.org/book/ch09-00-error-handling.html). C++ doesn’t, making the topic of error handling slightly more nuanced. + +But, for my money, this classification is still useful. So let's talk a bit more in-depth about these kinds of errors and I hope that you'll agree with, or at least accept my logic by the end of it. # Unrecoverable errors: **fail early** -## How to deal with unrecoverable errors +## Intro to unrecoverable errors Let’s start with the errors we don’t want to recover from. -These usually come from bugs: a violated precondition, accessing something that shouldn’t be accessed, or hitting undefined behavior. In all of these cases the program is already in some unknown or unexpected state, so we have no guarantees on anything that happens next. Which means that recovery is most likely impossible and we are probably better off not even trying to recover. +These usually come from programming bugs: a violated precondition, accessing something that shouldn’t be accessed, or hitting undefined behavior. + +In all of these cases the program is typically already in some unknown or unexpected state, so we have no guarantees on anything that happens next. For all we know, our memory might have already been corrupted, which means that recovery is most likely impossible and we are probably better off not even trying to recover. + +## Setting up an example + +Let us illustrate this with an example. Let's say we have a task to assign robots to missions. 
For simplicity, we will just alias our `Robot` and `Mission` types to `int` but they can, of course, be arbitrary types. + +`types.hpp` + +```cpp +#pragma once + +// Can be arbitrary types, here int for simplicity. +using Robot = int; +using Mission = int; +``` + +We can model the assignment of robots to missions by a class `MissionRobotAssignments` that holds a vector of missions and a vector of robots. It supports a way to assign a new robot to a mission through the `AssignRobot` function and has a convenient `Print` function. In this example, we use a `struct` for keeping the amount of code on the screen manageable, but it should probably be a class; feel free to refresh why in the video on [classes](classes_intro.md) from before. + +`mission_robot_assignments.hpp` + +```cpp +#pragma once + +#include "types.hpp" + +#include <iostream> +#include <vector> + +// This should be a class, using struct for simplicity. +struct MissionRobotAssignments { + + void AssignRobot(int assignment_index, const Robot& robot) { + robots[assignment_index] = robot; + } + + void Print() const { + for (auto i = 0UL; i < robots.size(); ++i) { + std::cout << i << ": Mission " << // + missions[i] << " is carried out by the robot " << // + robots[i] << std::endl; + } + } + + std::vector<Mission> missions{}; + std::vector<Robot> robots{}; +}; +``` + +For the sake of our example, we let the user modify these assignments manually. We would need some more functions for this. More concretely, we have a function `CheckIfUserWantsChanges` that asks the user if they want to make any changes and returns a boolean value that indicates their answer. + +In addition to this, we need a function `GetNextChangeEntryFromUser` that actually asks for the user's input about *what* they want to change. We ask them for a mission index followed by a request to provide some arbitrary robot id, returning these as a pair of values. 
+ +`user_input.hpp` + +```cpp +#pragma once + +#include "mission_robot_assignments.hpp" + +#include +#include + +inline bool CheckIfUserWantsChanges() { + std::cout << "Do you want to change assignment? [y/n]" << std::endl; + std::string answer{}; + std::cin >> answer; + if (answer == "y") { return true; } + return false; +} + +// Multiple issues here for now. +// We should handle failure to get a proper value. +// We also could use a struct in place of a pair. +inline std::pair GetNextChangeEntryFromUser( + const MissionRobotAssignments& assignments) { + std::pair entry{}; + std::cout << "Please select mission index." << std::endl; + std::cin >> entry.first; + std::cout << "Please provide new robot id." << std::endl; + std::cin >> entry.second; + return entry; +} +``` + +Finally, we need a `main` function that prints the initial assignment and keeps asking the user for their input until they don't want to provide any. We end by printing the resulting assignments. + +`robot_example.cpp` + +```cpp +#include "mission_robot_assignments.hpp" +#include "user_input.hpp" +#include "types.hpp" + +// Careful! The code below causes a subtle bug! +int main() { + MissionRobotAssignments assignments{{Mission{42}, Mission{40}}, + {Robot{10}, Robot{23}}}; + assignments.Print(); + while (true) { + const auto user_wants_changes = CheckIfUserWantsChanges(); + if (!user_wants_changes) { break; } + const auto change_entry = GetNextChangeEntryFromUser(assignments); + assignments.AssignRobot(change_entry.first, change_entry.second); + } + assignments.Print(); +} +``` + +Obviously, this example does not do too much, but trust me it is good enough to illustrate many of the core concepts we are talking about today. I hope that with some minor stretch of imagination we can all imagine how it could be extended to a real-world use-case by adding a couple more functions. + +Oh, and as always, there is [complete code](code/error_handling/robot_example_simple/robot_example.cpp) to this project. 
+ + +## Unrecoverable error example + +Now with the example set up, we can illustrate what a typical unrecoverable error looks like. + +An example run of this program might look something like this: + +```output +0: Mission 42 is carried out by the robot 10 +1: Mission 40 is carried out by the robot 23 +Do you want to change assignment? [y/n] +y +Please select mission index. +40 +Please provide new robot id. +4242 +Do you want to change assignment? [y/n] +n +0: Mission 4242 is carried out by the robot 10 +1: Mission 40 is carried out by the robot 23 +``` + +Here, there are two assignments to start with and the user wants to change the assignment for mission 40 to a robot with id 4242. However, if we look at the assignments printed at the end we see something strange there: the robots assigned to missions did not change at all! Instead, a mission id changed! + +Essentially, due to our poor interface, the user got confused and instead of specifying the array index for a mission they wanted to change, they provided the mission id. The code still uses that id as an index, writing the robot id that the user provided far beyond the expected memory address. + +But why does it change the mission entry? + +To get a hint about this, we can create an even simpler example by changing our `main` function: + +```cpp +#include "mission_robot_assignments.hpp" +#include "types.hpp" + +int main() { + MissionRobotAssignments assignments{{Mission{42}, Mission{40}}, + {Robot{10}, Robot{23}}}; + std::cout << "Address of assignments.missions.data(): " + << assignments.missions.data() << std::endl; + std::cout << "Address of assignments.robots.data(): " + << assignments.robots.data() << std::endl; + const auto diff = assignments.missions.data() - assignments.robots.data(); + std::cout << "Diff in address: " << diff << std::endl; +} +``` + +Running this code will output the addresses for the data stored within our assignments object and the difference between them, measured in elements rather than bytes. 
If this is confusing, please feel free to refresh what [raw pointers](raw_pointers.md) are before going deeper into *this* topic. + +```output +Address of assignments.missions.data(): 0x145605ea0 +Address of assignments.robots.data(): 0x145605e00 +Diff in address: 40 +``` + +If we look long enough at the output of our code and remember that we used 40 as our mission id that we wanted to change, something starts to click! + +Here's what happened: the program was asking us for a mission **index** but we provided a mission **id**. Then the robot id that the user provided got written into the `assignments.robots` vector at that index, way beyond the end of its memory, and it "just so happened" that the element we wrote to ended up at the address of the first element of `assignments.missions`! So we overwrote the first element of our missions by mistake! + +Do note that we're hitting **undefined behavior** here! This result will almost certainly be different if you run it on your own machine! The addresses of the `assignments.robots` and `assignments.missions` vectors will be different and can even have a different order in memory. You might have already noticed that `assignments.robots` appears before `assignments.missions` in memory even though `assignments.robots` appears *after* `assignments.missions` in the `struct` declaration! If you'd like to understand a bit more about how these are allocated, we've covered this [in the lecture on how C++ allocates memory](memory_and_smart_pointers.md). + +The main point I was trying to make here though is that now our memory is corrupted. This particular example was carefully constructed to overwrite an element of another vector, but if the user provides a different "mission id" the code will write the user-provided number to an arbitrary part of memory that belongs to our program, or will crash with a segmentation fault error if we hit memory that does not belong to us. 
What this means is that once an event like this has occurred, we do not have any guarantee on the consistency of the state of our program. Arbitrary objects might be corrupted and can behave in unpredictable ways from this point on without us being able to know about it. + +This concept lies at the core of what makes this type of error "unrecoverable". If the data we try to use for recovery is corrupted, we have no guarantees that any such recovery will succeed. + +## How to deal with unrecoverable errors + +### Catch them as early as possible We often want to catch these types of errors as early as possible, and crash as early as possible, before any more damage is done. -A typical approach is to enforce contracts at function boundaries. My favorite method is to use the [`CHECK`](https://abseil.io/docs/cpp/guides/logging#CHECK) macro that can be found in the [Abseil library](https://abseil.io/docs/). Here’s a toy example of registering a robot by id: +### Don't use `assert` + +A typical approach is to enforce contracts at function boundaries. + +One way that is often recommended on the Internet is to use the [`assert`](https://en.cppreference.com/w/cpp/error/assert.html) macro that can be found in the `<cassert>` header. I'm not a fan of using `assert` as it has one super annoying flaw, but due to how popular it is in many C++ tutorials, we'll have to go through this topic step by step. + +Essentially, `assert` allows us to check any boolean condition passed into it: + +```cpp +assert(2 + 2 == 4); +``` + +In our example, we could change the `AssignRobot` and `Print` methods of our `MissionRobotAssignments` class to perform the checks needed to avoid potential undefined behavior: + +`mission_robot_assignments.hpp` + +```cpp +#pragma once + +#include "types.hpp" + +#include <cassert> +#include <iostream> +#include <vector> + + +// This should be a class, using struct for simplicity. 
+struct MissionRobotAssignments { + + void AssignRobot(int assignment_index, const Robot& robot) { + assert((assignment_index < robots.size()) && (assignment_index >= 0)); + robots[assignment_index] = robot; + } + + void Print() const { + assert(robots.size() == missions.size()); + for (auto i = 0UL; i < robots.size(); ++i) { + std::cout << i << ": Mission " << // + missions[i] << " is carried out by the robot " << // + robots[i] << std::endl; + } + } + + std::vector<Mission> missions{}; + std::vector<Robot> robots{}; +}; +``` + +Now, if we try to compile and run our example just as we did before, we won't be able to hit the same undefined behavior as the assertion will trigger! + +```output +0: Mission 42 is carried out by the robot 10 +1: Mission 40 is carried out by the robot 23 +Do you want to change assignment? [y/n] +y +Please select mission index. +40 +Please provide new robot id. +4242 +Assertion failed: ((assignment_index < robots.size()) && (assignment_index >= 0)), function AssignRobot, file mission_robot_assignments.hpp, line 14. +[1] 29732 abort ./robot_example +``` + +So far so good, right? So what is that fatal flaw I've been talking about that makes me dislike `assert`? Well, you see, all `assert` statements get disabled when a macro `NDEBUG` is defined. This is a standard macro name that gets defined by default for most release builds. So essentially, `assert` does not protect us from undefined behavior in the code we actually deploy! + +We can easily demonstrate that the `assert` statements indeed get compiled out by adding the `-DNDEBUG` flag to our compilation command: + +```cmd +c++ -std=c++17 -DNDEBUG -o robot_example robot_example.cpp +``` + +Running our example *now* leads to the same undefined behavior we observed before as all of the assertions were compiled out. + +### Use `CHECK` macro instead + +Even though `assert` might not be a perfect tool for the job, checking function pre-conditions is still a very good idea! 
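
To make the idea concrete without any library, here is a minimal sketch of a pre-condition check that stays active regardless of `NDEBUG`. The `ALWAYS_CHECK` macro and the `GetElement` function are hypothetical names made up for this illustration, and the sketch only shows the principle: it prints the failed condition and aborts, nothing more.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <vector>

// Hypothetical macro for illustration, not taken from any library.
// Unlike `assert`, it ignores NDEBUG, so it also fires in release builds.
#define ALWAYS_CHECK(condition)                                      \
  do {                                                               \
    if (!(condition)) {                                              \
      std::fprintf(stderr, "Check failed: %s (%s:%d)\n", #condition, \
                   __FILE__, __LINE__);                              \
      std::abort();                                                  \
    }                                                                \
  } while (false)

// Hypothetical function that guards its pre-conditions before touching memory.
inline int GetElement(const std::vector<int>& values, int index) {
  ALWAYS_CHECK(index >= 0);
  ALWAYS_CHECK(static_cast<std::size_t>(index) < values.size());
  return values[static_cast<std::size_t>(index)];
}
```

A hand-rolled macro like this lacks niceties such as stack traces or custom failure messages, so it is best seen as a demonstration rather than something to deploy.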
+ +And there are better tools for this! My favorite method is to use the [`CHECK`](https://abseil.io/docs/cpp/guides/logging#CHECK) macro that can be found in the [Abseil library](https://abseil.io/docs/). We can use it in the same way as we used `assert`: ```cpp +#pragma once + +#include "types.hpp" + #include -#include -void RegisterRobot(const std::string& robot_id) { - CHECK(!robot_id.empty()) << "Robot ID cannot be empty"; - // Perform some registration logic. -} +#include +#include + +// This should be a class, using struct for simplicity. +struct MissionRobotAssignments { + + void AssignRobot(int assignment_index, const Robot& robot) { + CHECK_LT(assignment_index, robots.size()); + CHECK_GE(assignment_index, 0); + robots[assignment_index] = robot; + } + + void Print() const { + CHECK_EQ(robots.size(), missions.size()); + for (auto i = 0UL; i < robots.size(); ++i) { + std::cout << i << ": Mission " << // + missions[i] << " is carried out by the robot " << // + robots[i] << std::endl; + } + } + + std::vector missions{}; + std::vector robots{}; +}; ``` -The idea is simple: we check upfront that the id is actually valid. If it isn’t, we crash instead of going into the undefined behavior land. The Abseil library handles all the additional niceties for us, like showing a stack trace for the failure as well as showing an optional message that explains the failure. +And the output is very similar with the main difference being that it also works in release builds! - +The main concern that people have when using `CHECK` macros is performance as they stay in our code and do cost some time when our program runs. But I would say that the benefits far outweigh the costs, and, unless we've measured that we cannot allow the tiny performance hit in a particular place of our code, we should be free to use `CHECK` for safety. -Under this philosophy, we essentially treat bugs as bugs—not as conditions we can try to live with. 
+All in all, at least in my book, `CHECK` is our main weapon against unrecoverable errors and the undefined behavior that they tend to cause. + + ## How to minimize number of unrecoverable errors -Of course, we'd rather not have the bugs we're talking about here at all. In practice, we aim to keep the test coverage high for our code, ideally close to 100% line and branch coverage. +Of course, we'd rather not have the bugs we're talking about here at all. In practice, we aim to keep the test coverage high for our code, ideally close to 100% line and branch coverage. In some industries, like automotive, aviation, or medical, this is actually a requirement. -However, this does not guarantee that our program will not hit a `CHECK` failure in production, so we have to think about these scenarios too. +However, even the most rigorous checking does not guarantee that our program will not hit a `CHECK` failure in production. It does happen from time to time that a cosmic ray will hit our memory just right and flip a bit. We might not be able to detect this, but using `CHECK` rigorously increases our chances. Now, if the flipped bit causes some pre-condition to fail, our program will fail rather than continue running in an undefined state. But my main point here is that despite our best efforts we still need to be prepared for when our program crashes. -In safety-critical systems, we often isolate components into separate processes or even hardware units, with watchdogs that can trigger recovery actions if something crashes. This way we can have our cake and eat it at the same time: using `CHECK` minimizes the time-to-failure when a bug is encountered, while our fallback options keep the system safe as a whole even when certain components fail. +In safety-critical systems, we often isolate components into separate processes or even separate hardware units, with watchdogs that can trigger recovery actions if something crashes. 
This way we can have our cake and eat it at the same time: using `CHECK` minimizes the time-to-failure when a bug is encountered, while our fallback options keep the system safe as a whole even when certain components fail. -That being said, such design of a system as a whole is a large architecture topic in itself and is far beyond what I want to talk about today. +That being said, designing such a system as a whole is a large architecture topic in itself and is far beyond what I want to talk about today. In most non-safety-critical systems we do not need to think about these failure cases as deeply and we can usually just restart our program in case of a one-off failure. But, if you're interested, I gave an introductory lecture [on this topic](https://youtu.be/DtRktn4bVWg?si=DJuU8OjxtBcj5o2C) at the University of Bonn some years ago. # Recoverable errors: **handle and proceed** -Now, what I actually *want* to talk about today is **recoverable errors**. +But not every error should instantly crash our program! Indeed, in our example, the original cause of the error is not a cosmic ray flipping bits of our memory, but a wrong user input! + +The good thing about user inputs is that we can ask the user to correct these without crashing our program! The kind of error we encounter here can be called a **recoverable error**. + +To talk about them, let us focus on the function `GetNextChangeEntryFromUser` from our example. Currently, there is no validation of what the user inputs but we absolutely can and should perform such validation! -To talk about them, let's start with an example function: +As we design our program, we know that the mission index the user provides first must be within the bounds of the `missions` vector within our `assignments` object: ```cpp -std::string GetAnswerFromLlm(const std::string& question); +// Multiple issues here for now. +// We should handle failure to get a proper value. 
+// We also could use a struct in place of a pair. +inline std::pair GetNextChangeEntryFromUser( + const MissionRobotAssignments& assignments) { + std::pair entry{}; + std::cout << "Please select mission index." << std::endl; + std::cin >> entry.first; // <-- This value is NOT arbitrary! + std::cout << "Please provide new robot id." << std::endl; + std::cin >> entry.second; + return entry; +} ``` -This function is supposed to call some LLM over the network. It’s supposed to return a response. But what if the network is down? Or we ran out of LLM tokens? Or the AI became self aware and is refusing to answer our stupid questions? - -Until that happens it is up to us to decide what our function does in case of a failure! +So we'd like to somehow know that something went wrong within the `GetNextChangeEntryFromUser` function and recover from this. Broadly speaking, we have two strategies of communicating failures like these that have emerged in C++ over the years: 1. **Return a special value from a function.** -2. Throw an exception. +2. Throw an **exception**. We’ll spend most of our time on the first one, but let’s first spend some time and talk about throwing exceptions, and yes, why I think it might not be the best thing we could do. This is the time to get your pitchforks ready 😉. @@ -126,38 +431,59 @@ We’ll spend most of our time on the first one, but let’s first spend some t Since C++98 we have a powerful machinery of exceptions at our disposal. An exception is essentially just an object of some type, typically derived from the [`std::exception`](https://en.cppreference.com/w/cpp/error/exception.html) class. Such an exception holds the information about the underlying failure and can be "thrown" and "caught" within a C++ program. 
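
Before applying this to the lecture's example, a minimal sketch of the mechanics may help. The functions `ParsePositive` and `Describe` below are hypothetical and exist only to show one `throw` and one matching `catch`:

```cpp
#include <exception>
#include <stdexcept>
#include <string>

// Hypothetical function for illustration: throws when its input is invalid.
inline int ParsePositive(int value) {
  if (value <= 0) { throw std::runtime_error{"Value must be positive"}; }
  return value;
}

// Hypothetical caller: catches by const reference to the base class, which
// works because std::runtime_error derives from std::exception.
inline std::string Describe(int value) {
  try {
    return "ok: " + std::to_string(ParsePositive(value));
  } catch (const std::exception& e) {
    return std::string{"error: "} + e.what();
  }
}
```

Here `Describe(5)` takes the normal path, while `Describe(-1)` leaves `ParsePositive` through the `throw` and continues in the `catch` block, where `e.what()` gives back the message passed to the exception's constructor.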
-In our example function, we could throw an object of `std::runtime_error` when the network is not available: +In our example function, we could throw an object of `std::runtime_error` when the user inputs a wrong mission index: ```cpp -std::string GetAnswerFromLlm(const std::string& question) { - const auto llm = GetLlmHandle(); // Assuming GetLlmHandle exists. - if (!llm) throw std::runtime_error{"No network connection"}; - return llm->GetAnswer(question); +// Multiple issues here for now. +// We should handle failure to get a proper value. +// We also could use a struct in place of a pair. +inline std::pair GetNextChangeEntryFromUser( + const MissionRobotAssignments& assignments) { + std::pair entry{}; + std::cout << "Please select mission index." << std::endl; + std::cin >> entry.first; // <-- This value is NOT arbitrary! + if ((entry.first < 0) || (entry.first >= assignments.missions.size())) { + throw std::runtime_error("Wrong mission index provided."); + } + std::cout << "Please provide new robot id." << std::endl; + std::cin >> entry.second; + return entry; } ``` -And handle this exception by catching it by reference, which is possible because `std::runtime_error` derives from `std::exception`: +Throwing an exception will interrupt the normal function flow, destroy all objects allocated in the appropriate scope and will continue with the "exceptional flow" of the program to find a place where the thrown exception can be handled. -```cpp -#include +Speaking of handling exceptions, we can "catch" them anywhere upstream from the place they have been thrown from. As `std::exception` is just an object, it can be caught by value or by reference. When using exceptions, it is considered best practice to catch them by reference. In our case this is possible because `std::runtime_error` derives from `std::exception`: -std::string GetAnswerFromLlm(const std::string& question) { - const auto llm = GetLlmHandle(); // Assuming GetLlmHandle exists. 
- if (!llm) throw std::runtime_error{}"No network connection"}; - return llm->GetAnswer(question); -} +```cpp +#include "mission_robot_assignments.hpp" +#include "user_input.hpp" +#include "types.hpp" int main() { - try { - const auto response = GetAnswerFromLlm("What should I do?"); - std::cout << response << "\n"; - } catch (const std::exception& e) { - std::cerr << "Error: " << e.what() << "\n"; + MissionRobotAssignments assignments{{Mission{42}, Mission{40}}, + {Robot{10}, Robot{23}}}; + assignments.Print(); + while (true) { + const auto user_wants_changes = CheckIfUserWantsChanges(); + if (!user_wants_changes) { break; } + try { + const auto change_entry = GetNextChangeEntryFromUser(assignments); + assignments.AssignRobot(change_entry.first, change_entry.second); + } catch (const std::exception& e) { + std::cerr << "Error: " << e.what() << "\n"; + std::cerr << "Please try again.\n"; + } } + assignments.Print(); } ``` -On paper, this looks clean. But there are problems. +Should we forget to catch an exception it bubbles up to the very top and terminates our program. + +On paper, this looks very neat. But there are problems. + + ### Exceptions are (sometimes) expensive @@ -255,6 +581,8 @@ std::string GetAnswerFromLlm(const std::string& question) { Similar cases can be constructed for most values we can come up with. In addition to that, there is no easy way to encode the *reason* for the failure - we do want to know if we failed due to a network timeout or due to an imminent AI world takeover. And a final nail in the coffin of this method is that it does not work at all for functions that return `void` for obvious reasons. + + ### Returning an error code breaks "pure functions" 😱 Returning an error code instead solves at least a couple of these issues. It is fast and reliable and we can design our software with different error codes in mind so that the reason for the failure is also communicated to us. 
This is also still the prevalent way of handling errors in C and in some library that we can find in the wild, so there *is* some merit to this method. From 1d6e86b7df96c75526f24857cfd8d2730099e394 Mon Sep 17 00:00:00 2001 From: Igor Bogoslavskyi Date: Sun, 15 Jun 2025 21:56:17 +0200 Subject: [PATCH 16/26] Finalize text (again) --- lectures/error_handling.md | 341 ++++++++++++++++++++++++------------- 1 file changed, 223 insertions(+), 118 deletions(-) diff --git a/lectures/error_handling.md b/lectures/error_handling.md index ac73f4d..99ed5f7 100644 --- a/lectures/error_handling.md +++ b/lectures/error_handling.md @@ -15,6 +15,7 @@ - [Catch them as early as possible](#catch-them-as-early-as-possible) - [Don't use `assert`](#dont-use-assert) - [Use `CHECK` macro instead](#use-check-macro-instead) + - [Complete the example](#complete-the-example) - [How to minimize number of unrecoverable errors](#how-to-minimize-number-of-unrecoverable-errors) - [Recoverable errors: **handle and proceed**](#recoverable-errors-handle-and-proceed) - [Exceptions](#exceptions) @@ -32,15 +33,17 @@ - [Return value optimization](#return-value-optimization) - [Summary](#summary) -When writing C++ code, much like in life, we don’t always get what we want. The good news is that we can prepare for this and recover from not getting what we want! +When writing C++ code, much like in life, we don’t always get what we want. The good news is that C++ comes packed with the tools to prepare for this and maybe even recover! -But, just like with everything else in C++, there are… a lot of ways to do that. 😅 +But, just like with everything else in C++, there are… well, a *number* of ways to do that. 😅 Today we’re talking about error handling. What options we have, which trade-offs they come with, and what tools modern C++ gives us to make our lives a bit easier. 
-And as this topic is quite nuanced, there will definitely be some statements that are quite opinionated and I can already see some people with pitchforks coming my way... so... I'm sure it's gonna be fun! +And as this topic is quite nuanced, there will definitely be some statements that are quite opinionated. I can already see some people with pitchforks coming my way... so... I'm sure it's gonna be fun! - + # Disclaimer @@ -48,7 +51,7 @@ This is definitely *not* a one-size-fits-all topic. C++ is huge, powerful, and u *My* perspective comes from domains like robotics and automotive—where predictability and safety are of highest importance. What works for us may not work for everyone. -That being said, I believe that what we talk about here will fit many other domains with minimal adaptation and is grounded in sane reasoning. Where possible, I’ll try to mention multiple possible options and if I *do* miss an important one—please let me know! +That being said, I believe that what we talk about here will fit many other domains with minimal adaptation and is grounded in relatively sane reasoning. Where possible, I’ll try to mention multiple possible options and if I *do* miss an important one—please let me know! @@ -64,26 +67,31 @@ At the highest level: an error is something that happens when the code doesn’t I tend to think of errors belonging to one of two broad groups: -- **Unrecoverable errors** — where the program reaches a state where recovery is impossible or meaningless. +- **Unrecoverable errors** — where the program reaches a state in which recovery is impossible or meaningless. - **Recoverable errors** — where the program can detect that something went wrong, and has ways to proceed following an alternative path. -Some languages—like Rust—bake this distinction [into the language design](https://doc.rust-lang.org/book/ch09-00-error-handling.html). C++ doesn’t, making the topic of error handling slightly more nuanced. 
+Some languages—like Rust—bake this distinction [directly into the language design](https://doc.rust-lang.org/book/ch09-00-error-handling.html). C++ doesn’t, making the topic of error handling slightly more nuanced. -But, for my money, this classification is still useful. So let's talk a bit more in-depth about these kinds of errors and I hope that you'll agree with, or at least accept my logic by the end of it. +But, for my money, this classification, while not universal, is still useful. So let's talk a bit more in-depth about these kinds of errors and the intuition behind them. # Unrecoverable errors: **fail early** ## Intro to unrecoverable errors -Let’s start with the errors we don’t want to recover from. +Let’s start with the errors we don’t typically want to try to recover from. -These usually come from programming bugs: a violated precondition, accessing something that shouldn’t be accessed, or hitting undefined behavior. +These usually come from programming bugs or rare hardware failures and show themselves as unexpected values that our variables take. -In all of these cases the program is typically already in some unknown or unexpected state, so we have no guarantees on anything that happens next. For all we know, our memory might have already been corrupted, which means that recovery is most likely impossible and we are probably better off not even trying to recover. +```cpp +// Somewhere in the program. +std::sqrt(value); // If value == -1, how did we get here? +``` + +When this happens, the program is typically already in some unknown or unexpected state, so we have no guarantees on anything that happens next. For all we know, our memory might have already been fully corrupted, which means that recovery is likely impossible and we are probably better off not even trying to recover. ## Setting up an example -Let us illustrate this with an example. Let's say we have a task to assign robots to missions. 
For simplicity, we will just typedef our `Robot` and `Mission` class to `int` but they can, of course, be arbitrary types. +Let us illustrate one such case with an example. Let's say we have a task to assign robots to missions. For simplicity, we just typedef our `Robot` and `Mission` class to `int` but they can, of course, be arbitrary types. `types.hpp` @@ -129,7 +137,7 @@ struct MissionRobotAssignments { For the sake of our example, we let the user modify these assignments manually. We would need some more functions for this. More concretely, we have a function `CheckIfUserWantsChanges` that asks the user if they want to make any changes and returns a boolean value that indicates their answer. -In addition to this, we need a function `GetNextChangeEntryFromUser` that actually asks for the user's input about *what* they want to change. We ask them for a mission index followed by a request to provide some arbitrary robot id, retuning these as a pair of values. +In addition to this, we need a function `GetNextChangeEntryFromUser` that actually asks for the user's input about *what* they want to change. We ask them for a mission index followed by a request to provide some arbitrary robot id, retuning these as a pair of values. If anything here sounds confusing, please go back to the [lecture on streams](io_streams.md) that we covered towards the start of this course. `user_input.hpp` @@ -163,7 +171,7 @@ inline std::pair GetNextChangeEntryFromUser( } ``` -Finally, we need a `main` function that prints the initial assignment and keeps asking the user for their input until they don't want to provide any. We end by printing the resulting assignments. +Finally, we need a `main` function that prints the initial assignment and keeps asking the user for their input until they decide that they don't want to provide any. We end by printing the resulting assignments. 
`robot_example.cpp` @@ -187,7 +195,7 @@ int main() { } ``` -Obviously, this example does not do too much, but trust me it is good enough to illustrate many of the core concepts we are talking about today. I hope that with some minor stretch of imagination we can all imagine how it could be extended to a real-world use-case by adding a couple more functions. +Obviously, this example does not do too much, and it might feel a bit verbose, but trust me it is good enough to illustrate many of the core concepts we are talking about today. I hope that with some minor stretch of imagination we can all imagine how it could be extended to a real-world use-case by adding a couple more functions. Oh, and as always, there is [complete code](code/error_handling/robot_example_simple/robot_example.cpp) to this project. @@ -196,7 +204,13 @@ Oh, and as always, there is [complete code](code/error_handling/robot_example_si Now with the example set up, we can illustrate what a typical unrecoverable error looks like. -An example run of this program might look something like this: +As we are using header files, we can compile our code with a single command: + +```cmd +c++ -std=c++17 -o robot_example robot_example.cpp +``` + +And if we run the resulting binary, the output might look something like this: ```output 0: Mission 42 is carried out by the robot 10 @@ -213,9 +227,7 @@ n 1: Mission 40 is carried out by the robot 23 ``` -Here, there are two assignments to start with and the user wants to change the assignment for missions 40 to a robot with an id 4242. However, if we look at the assignments printed in the end we see something strange there: the robots assigned to missions did not change at all! Instead, a mission id changed! - -Essentially, due to our poor interface, the user got confused and instead of specifying the array index for a mission they wanted to change, they provided mission id. 
The code still uses that id as an index, writing the robot id that the user provided far beyond the expected memory address. +Here, there are two assignments to start with and the user wants to change the assignment for missions 40 to a robot with an id 4242. However, if we look at the assignments printed in the end we see something strange there: the robots assigned to missions **did not change** at all! Instead, a **mission id changed**! But why does it change the mission entry? @@ -230,7 +242,7 @@ int main() { {Robot{10}, Robot{23}}}; std::cout << "Address of assignments.missions.data(): " << assignments.missions.data() << std::endl; - std::cout << "Address of assignments.robots.data(): " + std::cout << "Address of assignments.robots.data(): " << assignments.robots.data() << std::endl; const auto diff = assignments.missions.data() - assignments.robots.data(); std::cout << "Diff in address: " << diff << std::endl; @@ -241,31 +253,33 @@ Running this code will output the addresses for the data stored within our assig ```output Address of assignments.missions.data(): 0x145605ea0 -Address of assignments.robots.data(): 0x145605e00 +Address of assignments.robots.data(): 0x145605e00 Diff in address: 40 ``` If we look long enough at the output of our code and remember that we used 40 as our mission id that we wanted to change, something starts to click! -Here's what happened: the program was asking us for a mission **index** but we provided a mission **id**. Then the id that the user provided got written into the `assignments.robots` vector way beyond the end of its memory and it "just so happened" that the address of the element we wrote to ended up having the address of the first element of the `assignments.missions`! So we overwrote the first element of our missions by mistake! +**Here's what happened:** the program was asking us for a mission **index** but we provided a mission **id**. 
Then the id that the user provided got written into the `assignments.robots` vector way beyond the end of its memory and it *"just so happened"* that the address of the element we wrote to ended up having the address of the first element of the `assignments.missions`! So we overwrote the first element of `assignments.missions` by mistake! + +Do note, that we're hitting **undefined behavior** here! This result will almost certainly be different if you run this code on your own machine! The addresses of the `assignments.robots` and `assignments.missions` vectors will be different and can even have a different order in memory. You might have already noticed that in this run `assignments.robots` appears *before* `assignments.missions` in memory even though `assignments.robots` appears *after* `assignments.missions` in the `struct` declaration! If you'd like to understand a bit more about how these are allocated, we've covered this [in the lecture on how C++ allocates memory](memory_and_smart_pointers.md). -Do note, that we're hitting **undefined behavior** here! This result will almost certainly be different if you run it on your own machine! The addresses of the `assignments.robots` and `assignments.missions` vectors will be different and can even have a different order in memory. You might have already noticed that `assignments.robots` appears before `assignments.missions` in memory even though `assignments.robots` appears *after* `assignments.missions` in the `struct` declaration! If you'd like to understand a bit more about how these are allocated, we've covered this [in the lecture on how C++ allocates memory](memory_and_smart_pointers.md). +The main point I was trying to make here though is that now our **memory is corrupted**. 
This particular example was carefully constructed to overwrite an element of another vector, but if the user provides a different "mission id" the code will write the user-provided number to an arbitrary part of memory that belongs to our program, or will crash with a segmentation fault error if we hit memory that does not belong to us yet. -The main point I was trying to make here though is that now our memory is corrupted. This particular example was carefully constructed to overwrite an element of another vector, but if the user provides a different "mission id" the code will write the user-provided number to an arbitrary part of memory that belongs to our program, or will crash with a segmentation fault error if we hit memory that does not belong to us yet. What this means is that once an event like this has occurred, we do not have any guarantee on the consistency of the state of our program. Arbitrary objects might be corrupted and can behave in unpredictable ways from this point on without us being able to know about it. +What this means is that once an event like this has occurred, **we do not have any guarantee on the consistency of the state of our program**. Arbitrary objects might have already been corrupted and can behave in unpredictable ways from this point on without us knowing about it. Which can lead to random-looking sporadic failures in seemingly unrelated parts of our program. -This concept lies at the core of what makes this type of errors "unrecoverable". If the data we try to use for recovery is corrupted, we have no guarantees that any such recovery will succeed. +🚨 This concept lies at the core of what makes this type of errors "unrecoverable". If the data we try to use for recovery is corrupted, we have no guarantees that any such recovery will succeed! 
## How to deal with unrecoverable errors ### Catch them as early as possible -We often want to catch these types of errors as early as possible—and crash as early as possible—before any more damage is done. +Therefore as we mentioned briefly before, a typical advice is to catch any wrong values that propagate through our program as early as possible—and crash as early as possible—before any more damage is done. ### Don't use `assert` -A typical approach is to enforce contracts at function boundaries. + -One way that is often recommended on the Internet is to use [`assert`](https://en.cppreference.com/w/cpp/error/assert.html) that can be found in the `<cassert>` include file. I'm not a fan of using `assert` as it has one super annoying flaw but due to how popular it is in many C++ tutorials, we'll have to go through this topic step by step. +One way to detect wrong values floating through our program that is often recommended on the Internet is to use [`assert`](https://en.cppreference.com/w/cpp/error/assert.html) that can be found in the `<cassert>` include file. I'm not a fan of using `assert` as it has one super annoying flaw but due to how popular it is in many C++ tutorials, we'll have to go through this topic step by step. Essentially, `assert` allows us to check any boolean condition passed into it: @@ -273,7 +287,7 @@ Essentially, `assert` allows us to check any boolean condition passed into it: assert(2 + 2 == 4); ``` -In our example, we could change the `AssignRobot` and `Print` methods of our `MissionRobotAssignments` class to perform the checks needed to avoid potential undefined behavior: +In our example, we could change the `AssignRobot` method of our `MissionRobotAssignments` class to perform the check needed to avoid potential undefined behavior: `mission_robot_assignments.hpp` @@ -286,17 +300,17 @@ In our example, we could change the `AssignRobot` and `Print` methods of our `Mi #include #include - // This should be a class, using struct for simplicity. 
struct MissionRobotAssignments { void AssignRobot(int assignment_index, const Robot& robot) { - assert((assignment_index < robots.size()) && (assignment_index >= 0)); + assert(assignment_index < robots.size()); + assert(assignment_index >= 0); robots[assignment_index] = robot; } void Print() const { - assert(robots.size() == missions.size()); + // Same as before. for (auto i = 0UL; i < robots.size(); ++i) { std::cout << i << ": Mission " << // missions[i] << " is carried out by the robot " << // @@ -309,7 +323,7 @@ struct MissionRobotAssignments { }; ``` -Now, if we try to compile and run our example just as we did before we won't be able to hit the same undefined behavior as the assertion will trigger! +Now, if we try to compile and run our example just as we did before - the assertion will trigger! ```output 0: Mission 42 is carried out by the robot 10 @@ -324,21 +338,21 @@ Assertion failed: ((assignment_index < robots.size()) && (assignment_index >= 0) [1] 29732 abort ./robot_example ``` -So far so good, right? So what is that fatal flaw I've been talking about that makes me dislike `assert`? Well, you see, all `assert` statements get disabled when a macro `NDEBUG` is defined. This is a standard macro name that gets defined by default for most release builds. So essentially, `assert` does not protect us from undefined behavior in the code we actually deploy! +So far so good, right? So what is that fatal flaw I've been talking about that makes me dislike `assert`? Well, you see, all `assert` statements get disabled when a macro `NDEBUG` is defined. This is a standard macro name that controls if the debug symbols get compiled into the binary and gets passed to the compilation command for most release builds as we generally don't want debug symbols in the binary we release. So essentially, `assert` **does not protect us from undefined behavior in the code we actually deploy**! 
-We can easily demonstrate that the `asserts` indeed get compiled out by adding `-DNDEBUG` flag to our compilation command: +We can easily demonstrate that the `asserts` indeed get *compiled out* by adding `-DNDEBUG` flag to our compilation command: ```cmd c++ -std=c++17 -DNDEBUG -o robot_example robot_example.cpp ``` -Running our example *now* leads to the same undefined behavior we observed before as all of the assertions were compiled out. +Running our example *now* leads to the same undefined behavior we observed before as all of the assertions were compiled out. So `asserts` essentially become useless in production! ### Use `CHECK` macro instead -Even as `assert` might not be a perfect tool for the job, the idea of checking the function pre-conditions is actually still a very good idea! +Even as `assert` might not be a perfect tool for the job, the idea of checking the function's pre- and sometimes pose-conditions is actually still a **very good idea**! -And there are better tools for this! My favorite method is to use the [`CHECK`](https://abseil.io/docs/cpp/guides/logging#CHECK) macro that can be found in the [Abseil library](https://abseil.io/docs/). We can use it in the same way as we used `assert`: +We just need a better tool! My favorite method is to use the [`CHECK`](https://abseil.io/docs/cpp/guides/logging#CHECK) macro that can be found in the [Abseil library](https://abseil.io/docs/). We can use it in the same way as we used `assert`: ```cpp #pragma once @@ -360,7 +374,7 @@ struct MissionRobotAssignments { } void Print() const { - CHECK_EQ(robots.size(), missions.size()); + // Same as before. 
for (auto i = 0UL; i < robots.size(); ++i) { std::cout << i << ": Mission " << // missions[i] << " is carried out by the robot " << // @@ -381,25 +395,37 @@ All in all, at least in my book, `CHECK` is our main weapon against unrecoverabl +### Complete the example + +By the way, we've only covered how we could improve our `AssignRobot` method of the `MissionRobotAssignments` class. Do you think our `Print` function would benefit from the same treatment? + + + ## How to minimize number of unrecoverable errors -Of course, we'd rather not have the bugs we're talking about here at all. In practice, we aim to keep the test coverage high for our code, ideally close to 100% line and branch coverage. In some industry, like automotive, aviation, or medical this is actually a requirement. +Of course, hard failures in the programs we ship is not ideal! One way to reduce the risk of such failures is to keep the test coverage high for the code we write, ideally close to 100% line and branch coverage. This way we catch most of them during development. In some industries, like automotive, aviation, or medical this is actually a legal requirement. + +But unfortunately, despite our best efforts, we cannot *completely* avoid failures in the programs we ship! -However, even the most rigorous checking does not guarantee that our program will not hit a `CHECK` failure in production. It does happen from time to time that a cosmic ray will hit our memory just right and flip a bit. We might not be able to detect this, but using `CHECK` rigorously increases our chances. Now, if the flipped bit causes some pre-condition to fail, our program will fail rather than continue running in an undefined state. But my main point here is that despite our best efforts we still need to be prepared for when our program crashes. +Even if we do everything right on our side, the hardware can still fail and corrupt our memory. 
One fun example of this is the famous error in the Belgian election on the 18th of May 2003, where [one political party got 4096 extra votes](https://en.wikipedia.org/wiki/Electronic_voting_in_Belgium). If you look at the number carefully, you might notice that it is a 2 to the power of 12. One error explanation was that a cosmic ray flipped "a bit at the position 13 in the memory of the computer", essentially leading to 4096 more votes. -In safety-critical systems, we often isolate components into separate processes or even separate hardware units, with watchdogs that can trigger recovery actions if something crashes. This way we can have our cake and eat it at the same time: using `CHECK` minimizes the time-to-failure when a bug is encountered, while our fallback options keep the system safe as a whole even when certain components fail. + -That being said, such design of a system as a whole is a large architecture topic in itself and is far beyond what I want to talk about today. In most non-safety-critical systems we do not need to think about these failure cases as deeply and we can usually just restart our program in case of a one-off failure. But, if you're interested in this topic, I've given an introductory lecture [on this topic](https://youtu.be/DtRktn4bVWg?si=DJuU8OjxtBcj5o2C) at the University of Bonn some years ago. +With the knowledge that we cannot completely remove the risk of hitting an unrecoverable error in production, in safety-critical systems, we often isolate components into separate processes or even separate hardware units, with watchdogs that can trigger recovery actions if something crashes. + +This way we can have our cake and eat it at the same time: using `CHECK` minimizes the time-to-failure when a bug is encountered, while our fallback options keep the system safe as a whole even when certain components fail. + +That being said, this is a system architecture question and this topic is far beyond what I want to talk about today. 
In most non-safety-critical systems we do not need to think about these failure cases as deeply and we can usually just restart our program in case of a one-off failure. But, if you're interested in this topic, I've given a part of my introductory lecture [on this topic](https://youtu.be/DtRktn4bVWg?si=DJuU8OjxtBcj5o2C) at the University of Bonn some years ago. # Recoverable errors: **handle and proceed** -But not every error should instantly crash our program! Indeed, in our example, the original cause of the error is not a cosmic ray flipping bits of our memory, but a wrong user input! +But not *every* error should instantly crash our program! Indeed, in our example, the original cause of the error is not a cosmic ray flipping bits of our memory - but a wrong user input! -The good thing about user inputs is that we can ask the user to correct these without crashing our program! The type of errors we encounter here can be called **recoverable errors**. +The good thing about user inputs is that we can ask the user to correct these without crashing! The type of errors we encounter here can be called **recoverable errors**. -To talk about them, let us focus on the function `GetNextChangeEntryFromUser` from our example. Currently, there is no validation of what the user inputs but we absolutely can and should perform such validation! +To talk about them, let us focus on the function `GetNextChangeEntryFromUser` from our example. Currently, there is no validation of what the user inputs but we absolutely *can* and *should* perform such validation! -As we design our program, we know that the input the mission index the user provides first must be within the bounds of the mission vector within our `assignments` object: +As we design our program, we know that the mission index that the user provides first must be within the bounds of the mission vector within our `assignments` object: ```cpp // Multiple issues here for now. 
@@ -407,6 +433,8 @@ As we design our program, we know that the input the mission index the user prov // We also could use a struct in place of a pair. inline std::pair GetNextChangeEntryFromUser( const MissionRobotAssignments& assignments) { + std::cout << "Current assignments:\n"; + assignments.Print(); std::pair entry{}; std::cout << "Please select mission index." << std::endl; std::cin >> entry.first; // <-- This value is NOT arbitrary! @@ -416,14 +444,14 @@ inline std::pair GetNextChangeEntryFromUser( } ``` -So we'd like to somehow know that something went wrong withing the `GetNextChangeEntryFromUser` function and recover from this. +So we'd like to somehow know that something went wrong within the `GetNextChangeEntryFromUser` function and recover from this. Broadly speaking, we have two strategies of communicating failures like these that have emerged in C++ over the years: 1. **Return a special value from a function.** 2. Throw an **exception**. -We’ll spend most of our time on the first one—but let’s first spend some time and talk about throwing exceptions, and yes, why I think it might not be the best thing we could do. This is the time to get your pitchforks ready 😉. +We’ll spend most of our time on the first one—but let’s first talk about throwing exceptions, and yes, why I think it might not be the best thing we could do. This is the time to get your pitchforks ready 😉. ## Exceptions @@ -435,10 +463,11 @@ In our example function, we could throw an object of `std::runtime_error` when t ```cpp // Multiple issues here for now. -// We should handle failure to get a proper value. // We also could use a struct in place of a pair. inline std::pair GetNextChangeEntryFromUser( const MissionRobotAssignments& assignments) { + std::cout << "Current assignments:\n"; + assignments.Print(); std::pair entry{}; std::cout << "Please select mission index." << std::endl; std::cin >> entry.first; // <-- This value is NOT arbitrary! 
@@ -451,7 +480,7 @@ inline std::pair GetNextChangeEntryFromUser( } ``` -Throwing an exception will interrupt the normal function flow, destroy all objects allocated in the appropriate scope and will continue with the "exceptional flow" of the program to find a place where the thrown exception can be handled. +Throwing an exception interrupts the normal function flow, destroy all objects allocated in the appropriate scope and will continue with the "exceptional flow" of the program to find a place where the thrown exception can be handled. Speaking of handling exceptions, we can "catch" them anywhere upstream from the place they have been thrown from. As `std::exception` is just an object, it can be caught by value or by reference. When using exceptions, it is considered best practice to catch them by reference. In our case this is possible because `std::runtime_error` derives from `std::exception`: @@ -481,9 +510,9 @@ int main() { Should we forget to catch an exception it bubbles up to the very top and terminates our program. -On paper, this looks very neat. But there are problems. +On paper, this looks very neat. Essentially, with exceptions, we could forget about the distinction between recoverable and unrecoverable errors: if we catch an error, it is a "recoverable" one, if not - we treat it as "unrecoverable". This is one the main arguments from people who like using exceptions - do not make a global decision inside of a local scope, let someone with more overview figure out what to do. And it *is* a very good argument! The other good argument is that exceptions *are* part of the language so it feels odd not to use them. - +But there are problems with exceptions that, at least in some industries, we just cannot ignore. ### Exceptions are (sometimes) expensive @@ -503,31 +532,38 @@ Indeed, an error can propagate across many layers of calls before being caught. 
Furthermore, the language permits the use of generic catch blocks like `catch (...)` and these make things even more confusing. We end up catching *something*, but we no longer know what or who threw it at us! 😱 -In our own example, if `GetAnswerFromLlm` throws an undocumented `std::logic_error` but we only expect `std::runtime_error`, we might miss important context or even crash anyway: +In our own example, if, say, `AssignRobot` throws an undocumented `std::out_of_range` exception but we only expect `std::runtime_error`, we might miss important context or even crash anyway: ```cpp -#include - -std::string GetAnswerFromLlm(const std::string& question) { - const auto llm = GetLlmHandle(); - if (!llm) throw std::runtime_error("No network connection"); - return llm->GetAnswer(question); -} +#include "mission_robot_assignments.hpp" +#include "user_input.hpp" +#include "types.hpp" int main() { - try { - const auto response = GetAnswerFromLlm("What’s the meaning of life?"); - std::cout << response << "\n"; - } catch (...) { - // Not very helpful, is it? - std::cerr << "Oops, something happened.\n"; + MissionRobotAssignments assignments{{Mission{42}, Mission{40}}, + {Robot{10}, Robot{23}}}; + assignments.Print(); + while (true) { + const auto user_wants_changes = CheckIfUserWantsChanges(); + if (!user_wants_changes) { break; } + try { + const auto change_entry = GetNextChangeEntryFromUser(assignments); + assignments.AssignRobot(change_entry.first, change_entry.second); + } catch (const std::runtime_error& e) { + std::cerr << "Error: " << e.what() << "\n"; + std::cerr << "Please try again.\n"; + } catch (...) { + // Not very helpful, is it? + std::cerr << "Oops, something happened.\n"; + } } + assignments.Print(); } ``` Video Thumbnail -I believe that `catch(...)` and equivalent constructs are singlehandedly responsible for the absolute majority of the fun error messages that we can see all over the internet and have probably encountered ourselves multiple times. 
+I believe that `catch(...)` and equivalent constructs are singlehandedly responsible for the absolute majority of the very unspecific and unhelpful error messages that we see all over the internet and have probably encountered ourselves multiple times. ### Exceptions are banned in many code bases @@ -535,81 +571,100 @@ All of these issues led a lot of code bases to ban exceptions altogether. In 201 My own experience aligns with these results - every serious project I’ve worked on either banned exceptions completely, or avoided them in performance-critical paths. But then again, I did work in robotics and automotive for the majority of my career. -The problem of using exceptions with an acceptable overhead has quite vibrant discussions around it with even calls for re-imagining exceptions altogether as can be seen in this [wonderful talk by Herb Sutter](https://www.youtube.com/watch?v=ARYP83yNAWk) from CppCon 2019 as well as his [corresponding paper](https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p0709r4.pdf) on this topic. +The problem of using exceptions with an acceptable overhead has quite vibrant discussions around it with even calls for re-designing exceptions altogether as can be seen in this [wonderful talk by Herb Sutter](https://www.youtube.com/watch?v=ARYP83yNAWk) from CppCon 2019 as well as his [corresponding paper](https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p0709r4.pdf) on this topic. But until the C++ community figures out what to do we are stuck with many people being unable to use the default error handling mechanism in C++. -So what do we do? +So what *do* we do? ## Returning errors explicitly can work better if done well -Now is a time to return to the other option we hinted at before: dealing with errors by returning a special value from a function. +Now is a time to return to the other option of detecting errors in functions that we hinted at before: **dealing with errors by returning a special value from a function**. 
-I would way that there are three distinct ways of thinking about it. +I would say that there are three distinct ways of thinking about it. Let's illustrate all of them on a function we've already looked at: ```cpp -std::string GetAnswerFromLlm(const std::string& question); +std::pair GetNextChangeEntryFromUser(const MissionRobotAssignments& assignments); ``` We can: -1. Keep the return type, `std::string` in our case, but return a special **value** of this type +1. Keep the return type, `std::pair` in our case, but return a special **value** of this type in case of failure. 2. Return an **error code**, which would change the signature of the function to return `int` or a similar type instead: ```cpp - int GetAnswerFromLlm(const std::string& question, std::string& result); + int GetNextChangeEntryFromUser(const MissionRobotAssignments& assignments, std::pair& result); ``` -3. Return a **different type** specifically designed to encode failure states alongside the actual return, like `std::optional` which only holds a valid `std::string` in case of success. +3. Return a **different type** specifically designed to encode failure states alongside the actual return, like `std::optional>` which only holds a valid `std::pair` in case of success. I believe that the third option is the best out of these three, but let me explain why the first two are not cutting it, before going deeper into details. ### Returning a value indicating error does not always work 😱 -There is a number of issues with returning a special value from a function without using a special return type. As an illustration, in our case, a naïve choice would be to return an empty string if there was no answer from the LLM, but what if we asked the LLM something along the lines of "read this file, return empty string when done"? An empty string *is* the valid response here! How do we distinguish this output from a failure? 
+There are a number of issues with returning a special value from a function without using a special return type. As an illustration, in our case, a naïve choice would be to return a pair of zeros from the function if the user provided a wrong mission index. However, as you might imagine, a pair of zeros *is* a completely valid mission-robot assignment! How do we disambiguate this `{0, 0}` value from a valid one? ```cpp -std::string GetAnswerFromLlm(const std::string& question) { - const auto llm = GetLlmHandle(); - if (!llm) return ""; // 😱 Not a great idea! - return llm->GetAnswer(question); +// 😱 Not a great idea! +inline std::pair GetNextChangeEntryFromUser( + const MissionRobotAssignments& assignments) { + std::cout << "Current assignments:\n"; + assignments.Print(); + std::pair entry{}; + std::cout << "Please select mission index." << std::endl; + std::cin >> entry.first; // <-- This value is NOT arbitrary! + if ((entry.first < 0) || (entry.first >= assignments.missions.size())) { + return {0, 0}; + } + std::cout << "Please provide new robot id." << std::endl; + std::cin >> entry.second; + return entry; } ``` -Similar cases can be constructed for most values we can come up with. In addition to that, there is no easy way to encode the *reason* for the failure - we do want to know if we failed due to a network timeout or due to an imminent AI world takeover. And a final nail in the coffin of this method is that it does not work at all for functions that return `void` for obvious reasons. - - - ### Returning an error code breaks "pure functions" 😱 Returning an error code instead solves at least a couple of these issues. It is fast and reliable and we can design our software with different error codes in mind so that the reason for the failure is also communicated to us. This is also still the prevalent way of handling errors in C and in some library that we can find in the wild, so there *is* some merit to this method. 
-However, if our function actually must return a value, the only way to use error codes is to change its return type to the type that our error codes have, like `int`, which forces us to provide an additional output parameter to our function, like `std::string& result` in our case: +However, if our function actually must return a value, the only way to use error codes is to change its return type to the type that our error codes have, like `int`, which forces us to provide an additional output parameter to our function, like `std::pair& result` in our case: ```cpp -int GetAnswerFromLlm(const std::string& question, std::string& result); +int GetNextChangeEntryFromUser(const MissionRobotAssignments& assignments, std::pair& result); ``` The main issue with this from my point of view is that it is clunky, mixes input/output in the signature, and limits functional composition. Consider how we would use this function: ```cpp +#include "mission_robot_assignments.hpp" +#include "user_input.hpp" +#include "types.hpp" + int main() { - std::string response{}; // Can't be const! - const auto success = GetAnswerFromLlm("What’s the meaning of life?", response); - if (!success) { - std::cerr << "Could not get the result from LLM\n"; - return 1; + MissionRobotAssignments assignments{{Mission{42}, Mission{40}}, + {Robot{10}, Robot{23}}}; + assignments.Print(); + while (true) { + const auto user_wants_changes = CheckIfUserWantsChanges(); + if (!user_wants_changes) { break; } + std::pair change_entry{}; // Cannot be const! + const auto error_code = GetNextChangeEntryFromUser(assignments, change_entry); + if (error_code != 0) { + // Get the actual message stored somewhere else using the int error code. + std::cerr << "Please try again.\n"; + continue; } - std::cout << response << "\n"; + assignments.AssignRobot(change_entry.first, change_entry.second); + } + assignments.Print(); } ``` -In this code, we have to create an empty string before calling the `GetAnswerFromLlm` function. 
Furthermore this string cannot be `const`, which goes against everything we've been talking in this series until now. +In this code, we have to create a pair object before calling the `GetNextChangeEntryFromUser` function. Furthermore this object cannot be `const`, which goes against everything we've been talking about in this series of C++ lectures until now. -On top of all this, nowadays, the compilers are able to perform Return Value Optimization (or [RVO](https://en.cppreference.com/w/cpp/language/copy_elision.html)) for values returned from a function and this functionality is limited for such input/output parameters. +On top of all this, nowadays, the compilers are able to perform **R**eturn **V**alue **O**ptimization (or [RVO](https://en.cppreference.com/w/cpp/language/copy_elision.html)) for values returned from a function, essentially skipping the copy or move of the returned object and constructing the needed value in-place, and this functionality is limited for such input/output parameters. So clearly, there are some issues with this method too. I believe it has its merits sometimes, but there has to be a reason for it and we must measure the performance well. @@ -618,21 +673,48 @@ So clearly, there are some issues with this method too. I believe it has its mer I believe that there *is* a better way. With C++17, we gained [`std::optional`](https://en.cppreference.com/w/cpp/utility/optional.html) with which we can express that a function “might return a value” if everything goes well: ```cpp -std::optional GetAnswerFromLlm(const std::string& question); +std::optional> GetNextChangeEntryFromUser( + const MissionRobotAssignments& assignments) { + std::cout << "Current assignments:\n"; + assignments.Print(); + std::pair entry{}; + std::cout << "Please select mission index." << std::endl; + std::cin >> entry.first; + if ((entry.first < 0) || (entry.first >= assignments.missions.size())) { + return {}; // <-- Creates an empty optional, same as returning std::nullopt. 
+ } + std::cout << "Please provide new robot id." << std::endl; + std::cin >> entry.second; + return entry; // <-- Creates an optional filled with a pair. +} ``` -Now our function returns an object of a different type, `std::optional` that we can use in an `if` statement to find out if it actually holds a value, which we can get to by calling its `value()` method or using a dereferencing operator `*` just like with pointers: +Now our function returns an object of a different type, `std::optional>` that we can use in an `if` statement to find out if it actually holds a value, which we can get to by calling its `value()` method or using the dereferencing operators `*` and `->`, just like with pointers: ```cpp +#include "mission_robot_assignments.hpp" +#include "user_input.hpp" +#include "types.hpp" + int main() { - const auto answer = GetAnswerFromLlm("What now?"); - if (answer.has_value()) return 1; - std::cout << answer.value() << "\n"; - std::cout << *answer << "\n"; // Same as above. + MissionRobotAssignments assignments{{Mission{42}, Mission{40}}, + {Robot{10}, Robot{23}}}; + assignments.Print(); + while (true) { + const auto user_wants_changes = CheckIfUserWantsChanges(); + if (!user_wants_changes) { break; } + const auto change_entry = GetNextChangeEntryFromUser(assignments); + if (!change_entry.has_value()) { + std::cerr << "Please try again.\n"; + continue; + } + assignments.AssignRobot(change_entry->first, change_entry->second); + } + assignments.Print(); } ``` -The presence or absence of a value is encoded into the type itself. No more guessing. No more relying on magic return values or input/output arguments. And as always, we can always find more information about how to use it at [cppreference.com](https://en.cppreference.com/w/cpp/utility/optional.html). +The presence or absence of a value is encoded into the type itself. No more guessing. No more relying on magic return values or input/output arguments. And the code is very short and neat! 
And, as always, we can find more information about how to use `std::optional` at [cppreference.com](https://en.cppreference.com/w/cpp/utility/optional.html). ### Using `std::expected`: **add context** @@ -640,7 +722,7 @@ However, we might notice that `std::optional` only tells us that *something* wen Enter [`std::expected`](https://en.cppreference.com/w/cpp/utility/expected.html), coming in C++23. And if you'd like to know what led to it being added to the language, give this [fantastic talk by Andrei Alexandrescu](https://www.youtube.com/watch?v=PH4WBuE1BHI) a watch! It is one of my favorite talks ever! It is both informative and entertaining in an equal measure! - + With `std::expected` we could do the same things we could with `std::optional` and more by changing our function accordingly: @@ -651,23 +733,45 @@ std::expected GetAnswerFromLlm(const std::string& ques Essentially, `std::expected` holds one of two values of two potentially different types - an expected or an unexpected one. Now we can return either a valid result, or an error message: ```cpp -std::expected GetAnswerFromLlm(const std::string& question) { - const auto llm = GetLlmHandle(); - if (!llm) return std::unexpected("Cannot get LLM handle."); - return llm->GetAnswer(question); +std::expected, std::string> GetNextChangeEntryFromUser( + const MissionRobotAssignments& assignments) { + std::cout << "Current assignments:\n"; + assignments.Print(); + std::pair entry{}; + std::cout << "Please select mission index." << std::endl; + std::cin >> entry.first; + if ((entry.first < 0) || (entry.first >= assignments.missions.size())) { + return std::unexpected("Wrong value provided."); + } + std::cout << "Please provide new robot id." << std::endl; + std::cin >> entry.second; + return entry; // <-- Creates a std::expected filled with a pair. 
+} ``` Using it is also quite neat: ```cpp +#include "mission_robot_assignments.hpp" +#include "user_input.hpp" +#include "types.hpp" + int main() { - const auto answer = GetAnswerFromLlm("What now?"); - if (!answer.has_value()) { - std::cerr << answer.error() << "\n"; - return 1; + MissionRobotAssignments assignments{{Mission{42}, Mission{40}}, + {Robot{10}, Robot{23}}}; + assignments.Print(); + while (true) { + const auto user_wants_changes = CheckIfUserWantsChanges(); + if (!user_wants_changes) { break; } + const auto change_entry = GetNextChangeEntryFromUser(assignments); + if (!change_entry.has_value()) { + std::cerr << change_entry.error() << "\n"; + std::cerr << "Please try again.\n"; + continue; + } + assignments.AssignRobot(change_entry->first, change_entry->second); } - std::cout << answer.value() << "\n"; + assignments.Print(); } ``` @@ -678,7 +782,7 @@ This has all the benefits we mentioned before: - Everything happens in deterministic time with no unpredictable runtime overhead - Works for functions returning `void` too -There is just one tiny issue that spoils our fun. As you've probably noticed, most of the things we covered until now targeted C++17, and `std::expected` is only available from C++23 on. But there is a solution to this: we can use [`tl::expected`](https://github.com/TartanLlama/expected) as a drop-in replacement for code bases that don't yet adopt C++23. +There is just one tiny issue that spoils our fun. As you've probably noticed, most of the things we covered until now in this C++ course targeted C++17, and `std::expected` is only available from C++23 on. But there is a solution to this: we can use [`tl::expected`](https://github.com/TartanLlama/expected) as a drop-in replacement for code bases that don't yet adopt C++23. ## Performance Considerations for `std::optional` and `std::expected` @@ -713,16 +817,17 @@ We went through quite some material today. 
We've looked at all the various kinds -- Use `CHECK` and similar macros for dealing with unrecoverable errors like programming bugs or contract violation. -- Use `std::optional` as a return type when a value might be missing due to a recoverable error occurring. -- Use `std::expected` when a reason for failure is important to know. +- Use `CHECK` and similar macros for dealing with unrecoverable errors like programming bugs or contract violations, so that the program fails as fast as possible when they are encountered. - Keep the test coverage of the code high to reduce chances of missing errors. +- Use `std::optional` as a return type when a value might be missing due to a recoverable error and the reason for the failure is not important. +- Use `std::expected` when the reason for failure *is* important for deciding how to recover. - Avoid exceptions in time-critical or safety-critical systems due to their non-deterministic runtime overhead. +- Avoid old error handling mechanisms like returning error codes when possible. All in all, the overall direction that we seem to be following as a community is to make failure explicit and force the caller to handle it. That leads to clearer, safer, and more maintainable code. -One final thing I wanted to add is that obviously, the `std::optional` class can be used also in other places, not just as a return type from a function. If some object of ours must have an optional value, using `std::optional` can be a good idea there too! But I'm sure you're going to be able to figure this out from the related [cppreference page](https://en.cppreference.com/w/cpp/utility/optional.html). +One final thing I wanted to add is that obviously, the `std::optional` class can be used also in other places, not just as a return type from a function. If some object of ours must have an optional value, using `std::optional` can be a good idea there too! 
But I'm sure you're going to be able to figure this out from the related [cppreference page](https://en.cppreference.com/w/cpp/utility/optional.html) on your own 😉. - From 51e69059b4b300374f0820a9ebc0e09a329acaec Mon Sep 17 00:00:00 2001 From: Igor Bogoslavskyi Date: Mon, 23 Jun 2025 00:26:16 +0200 Subject: [PATCH 17/26] Change examples yet again, hopefully the last time --- .../comparison_game/comparison_game.cpp | 81 ++ .../robot_example_simple/robot_example.cpp | 69 -- lectures/error_handling.md | 831 ++++++++---------- lectures/images/docs.png | 3 + 4 files changed, 469 insertions(+), 515 deletions(-) create mode 100644 lectures/code/error_handling/comparison_game/comparison_game.cpp delete mode 100644 lectures/code/error_handling/robot_example_simple/robot_example.cpp create mode 100644 lectures/images/docs.png diff --git a/lectures/code/error_handling/comparison_game/comparison_game.cpp b/lectures/code/error_handling/comparison_game/comparison_game.cpp new file mode 100644 index 0000000..e831f2e --- /dev/null +++ b/lectures/code/error_handling/comparison_game/comparison_game.cpp @@ -0,0 +1,81 @@ +#include +#include + +struct ChangeEntry { + int index{}; + int value{}; +}; + +// 😱 Warning! No error handling! 
+class Game { + public: + Game(std::vector&& ref_numbers, + std::vector&& player_numbers, + int budget) + : ref_numbers_{std::move(ref_numbers)}, + player_numbers_{std::move(player_numbers)}, + budget_{budget} {} + + void Print() const { + std::cout << "Budget: " << budget_ << std::endl; + std::cout << "Reference numbers: "; + for (auto number : ref_numbers_) { std::cout << number << "\t"; } + std::cout << std::endl; + std::cout << "Player numbers: "; + for (auto number : player_numbers_) { std::cout << number << "\t"; } + std::cout << std::endl; + } + + bool CheckIfPlayerWon() const { + int win_loss_counter{}; + for (auto i = 0UL; i < player_numbers_.size(); ++i) { + const auto difference = player_numbers_[i] - ref_numbers_[i]; + if (difference > 0) win_loss_counter++; + if (difference < 0) win_loss_counter--; + } + return win_loss_counter > 0; + } + + void ChangePlayerNumberIfPossible(const ChangeEntry& change_entry) { + auto& player_number = player_numbers_[change_entry.index]; + const auto difference = std::abs(change_entry.value - player_number); + if (difference > budget_) { return; } + player_number = change_entry.value; + budget_ -= difference; + } + + const std::vector& ref_numbers() const { return ref_numbers_; } + const std::vector& player_numbers() const { return player_numbers_; } + bool UserHasBudget() const { return budget_ > 0; } + + private: + std::vector ref_numbers_{}; + std::vector player_numbers_{}; + int budget_{}; +}; + +// Multiple issues here for now. +// We should handle failure to get a proper value. 
+ChangeEntry GetNextChangeEntryFromUser(const Game& game) { + game.Print(); + ChangeEntry entry{}; + std::cout << "Please enter number to change: "; + std::cin >> entry.index; + std::cout << "Please provide a new value: "; + std::cin >> entry.value; + return entry; +} + +int main() { + Game game{{42, 49, 23}, {42, 40, 23}, 10}; + while (game.UserHasBudget()) { + const auto change_entry = GetNextChangeEntryFromUser(game); + game.ChangePlayerNumberIfPossible(change_entry); + } + game.Print(); + if (game.CheckIfPlayerWon()) { + std::cout << "You win!\n"; + } else { + std::cout << "Not win today. Try again!\n"; + } +} diff --git a/lectures/code/error_handling/robot_example_simple/robot_example.cpp b/lectures/code/error_handling/robot_example_simple/robot_example.cpp deleted file mode 100644 index 06370a0..0000000 --- a/lectures/code/error_handling/robot_example_simple/robot_example.cpp +++ /dev/null @@ -1,69 +0,0 @@ -#include -#include -#include - -// Can be arbitrary types, here int for simplicity. -using Robot = int; -using Mission = int; - -// This should be a class, using struct for simplicity. -struct MissionRobotAssignments { - - void AssignRobot(int assignment_index, const Robot& robot) { - assert((assignment_index < robots.size()) && (assignment_index >= 0)); - robots[assignment_index] = robot; - } - - void Print() const { - assert(robots.size() == missions.size()); - for (auto i = 0UL; i < robots.size(); ++i) { - std::cout << i << ": Mission " << // - missions[i] << " is carried out by the robot " << // - robots[i] << std::endl; - } - } - - std::vector missions{}; - std::vector robots{}; -}; - -// Multiple issues here for now. -// We should handle failure to get a proper value. -// We also could use a struct in place of a pair. -std::pair GetNextChangeEntryFromUser( - const MissionRobotAssignments& assignments) { - std::pair entry{}; - std::cout << "Please select mission index." 
<< std::endl; - std::cin >> entry.first; - std::cout << "Please provide new robot id." << std::endl; - std::cin >> entry.second; - return entry; -} - -bool CheckIfUserWantsChanges() { - std::cout << "Do you want to change assignment? [y/n]" << std::endl; - std::string answer{}; - std::cin >> answer; - if (answer == "y") { return true; } - return false; -} - -int main() { - MissionRobotAssignments assignments{{Mission{42}, Mission{40}}, - {Robot{10}, Robot{23}}}; - assignments.Print(); - while (true) { - const auto user_wants_changes = CheckIfUserWantsChanges(); - if (!user_wants_changes) { break; } - const auto change_entry = GetNextChangeEntryFromUser(assignments); - assignments.AssignRobot(change_entry.first, change_entry.second); - } - assignments.Print(); - - std::cout << "Address of assignments.missions.data(): " - << assignments.missions.data() << std::endl; - std::cout << "Address of assignments.robots.data(): " - << assignments.robots.data() << std::endl; - const auto diff = assignments.robots.data() - assignments.missions.data(); - std::cout << "Diff in address: " << diff << std::endl; -} diff --git a/lectures/error_handling.md b/lectures/error_handling.md index 99ed5f7..045cca5 100644 --- a/lectures/error_handling.md +++ b/lectures/error_handling.md @@ -7,22 +7,25 @@ - [Disclaimer](#disclaimer) - [What Do We Mean by “Error”?](#what-do-we-mean-by-error) +- [Setting up the example: **a comparison game**](#setting-up-the-example-a-comparison-game) + - [Rules of the game](#rules-of-the-game) + - [Initial code of the game](#initial-code-of-the-game) - [Unrecoverable errors: **fail early**](#unrecoverable-errors-fail-early) - - [Intro to unrecoverable errors](#intro-to-unrecoverable-errors) - - [Setting up an example](#setting-up-an-example) - - [Unrecoverable error example](#unrecoverable-error-example) + - [Our first unrecoverable error encounter](#our-first-unrecoverable-error-encounter) - [How to deal with unrecoverable 
errors](#how-to-deal-with-unrecoverable-errors) - [Catch them as early as possible](#catch-them-as-early-as-possible) + - [Use `CHECK` macro to fail early](#use-check-macro-to-fail-early) - [Don't use `assert`](#dont-use-assert) - - [Use `CHECK` macro instead](#use-check-macro-instead) - - [Complete the example](#complete-the-example) + - [Complete the example yourself](#complete-the-example-yourself) - [How to minimize number of unrecoverable errors](#how-to-minimize-number-of-unrecoverable-errors) - [Recoverable errors: **handle and proceed**](#recoverable-errors-handle-and-proceed) - [Exceptions](#exceptions) - - [What exceptions are](#what-exceptions-are) - - [Exceptions are (sometimes) expensive](#exceptions-are-sometimes-expensive) - - [Exceptions hide the error path](#exceptions-hide-the-error-path) - - [Exceptions are banned in many code bases](#exceptions-are-banned-in-many-code-bases) + - [How to use exceptions](#how-to-use-exceptions) + - [A case for exceptions for both "recoverable" and "unrecoverable" errors](#a-case-for-exceptions-for-both-recoverable-and-unrecoverable-errors) + - [Why not just use exceptions](#why-not-just-use-exceptions) + - [Exceptions are (sometimes) expensive](#exceptions-are-sometimes-expensive) + - [Exceptions hide the error path](#exceptions-hide-the-error-path) + - [Exceptions are banned in many code bases](#exceptions-are-banned-in-many-code-bases) - [Returning errors explicitly can work better if done well](#returning-errors-explicitly-can-work-better-if-done-well) - [Returning a value indicating error does not always work 😱](#returning-a-value-indicating-error-does-not-always-work-) - [Returning an error code breaks "pure functions" 😱](#returning-an-error-code-breaks-pure-functions-) @@ -30,16 +33,14 @@ - [Using `std::expected`: **add context**](#using-stdexpected-add-context) - [Performance Considerations for `std::optional` and `std::expected`](#performance-considerations-for-stdoptional-and-stdexpected) - [Error type 
size matters](#error-type-size-matters) - - [Return value optimization](#return-value-optimization) + - [Return value optimization with `std::optional` and `std::expected`](#return-value-optimization-with-stdoptional-and-stdexpected) - [Summary](#summary) -When writing C++ code, much like in life, we don’t always get what we want. The good news is that C++ comes packed with the tools to prepare for this and maybe even recover! - -But, just like with everything else in C++, there are… well, a *number* of ways to do that. 😅 +When writing C++ code, much like in real life, we don’t always get what we want. The good news is that C++ comes packed with the tools to let us be prepared for this eventuality! Today we’re talking about error handling. What options we have, which trade-offs they come with, and what tools modern C++ gives us to make our lives a bit easier. -And as this topic is quite nuanced, there will definitely be some statements that are quite opinionated. I can already see some people with pitchforks coming my way... so... I'm sure it's gonna be fun! +Buckle up! There is a lot to cover and quite some nuance in this topic! There will also inevitably be some statements that are quite opinionated and I can already see people with pitchforks and torches coming for me... so... I'm sure it's gonna be fun! - # What Do We Mean by “Error”? -Before we go into how to handle errors, let’s clarify what we mean when we say "error" in the first place. +Before we go into how to handle errors, however, let’s clarify what we mean when we think about an "error" in programming. -At the highest level: an error is something that happens when the code doesn’t produce the result we expect. But there is nuance here! +At the highest level: an error is something that happens when a program doesn’t produce the result we expect. 
I tend to think of errors belonging to one of two broad groups: - **Unrecoverable errors** — where the program reaches a state in which recovery is impossible or meaningless. - **Recoverable errors** — where the program can detect that something went wrong, and has ways to proceed following an alternative path. -Some languages—like Rust—bake this distinction [directly into the language design](https://doc.rust-lang.org/book/ch09-00-error-handling.html). C++ doesn’t, making the topic of error handling slightly more nuanced. +Some languages, like Rust, bake this distinction [directly into the language design](https://doc.rust-lang.org/book/ch09-00-error-handling.html). C++ doesn’t, making the topic of error handling slightly more nuanced. But, for my money, this classification, while not universal, is still useful. So let's talk a bit more in-depth about these kinds of errors and the intuition behind them. -# Unrecoverable errors: **fail early** - -## Intro to unrecoverable errors - -Let’s start with the errors we don’t typically want to try to recover from. +# Setting up the example: **a comparison game** -These usually come from programming bugs or rare hardware failures and show themselves as unexpected values that our variables take. +## Rules of the game -```cpp -// Somewhere in the program. -std::sqrt(value); // If value == -1, how did we get here? -``` +There is a lot of ground to cover here and to not get lost, I would like to introduce a small example that will guide us and help illustrate all of the concepts we are talking about today. -When this happens, the program is typically already in some unknown or unexpected state, so we have no guarantees on anything that happens next. For all we know, our memory might have already been fully corrupted, which means that recovery is likely impossible and we are probably better off not even trying to recover. +To this end, let's model a simple puzzle game. 
In this game a player starts with an array of numbers generated for them. This array gets compared to some reference array, also generated for this game. The player wins if, when comparing numbers one-by-one, they have the higher number more often than the reference array does. -## Setting up an example +To make it an actual *game*, we need to give the player at least *some* control over their numbers. So we allow them to use a certain budget that can be used to change any of the numbers in their array. -Let us illustrate one such case with an example. Let's say we have a task to assign robots to missions. For simplicity, we just typedef our `Robot` and `Mission` class to `int` but they can, of course, be arbitrary types. +## Initial code of the game -`types.hpp` +Now let's spend a couple of minutes to set up the code for everything we've just discussed. -```cpp -#pragma once +To start off, we'll probably need a class `Game` that would hold the reference as well as the player numbers. It also needs a way: -// Can be arbitrary types, here int for simplicity. -using Robot = int; -using Mission = int; -``` +- to print the current state of the game; +- to check if the player won by comparing the player's numbers with reference ones one-by-one and keeping the score; +- to change the player number if there is still budget for this, given a `ChangeEntry` object, a tiny `struct` with `index` and `value` in our case. -We can model the assignment of robots to missions by a class `MissionRobotAssignments` that holds a vector of missions and a vector of robots. It supports a way to assign a new robot to a mission through the `AssignRobot` function and has a convenient `Print` function. In this example, we use a `struct` for keeping the amount of code on the screen manageable, but it should probably be a class, please feel free to refresh why in the video on [classes](classes_intro.md) from before. 
+This `change_entry` must come from somewhere, so we need a way to ask the player to provide it. 
We can encapsulate our user interaction into a function like `GetNextChangeEntryFromUser` which will print the current state of the game, ask the user for their input and fill a `change_entry` object using this input. -`mission_robot_assignments.hpp` +To keep this example simple, we implement all of this in one `cpp` file alongside a simple `main` function that creates a `Game` object, asks the user to provide a desired `change_entry` and changes the player's numbers accordingly in a loop until the user runs out of budget. Finally, we check if the player has won the game and show them the result. ```cpp -#pragma once - -#include "types.hpp" - #include <iostream> #include <vector> -// This should be a class, using struct for simplicity. -struct MissionRobotAssignments { +struct ChangeEntry { + int index{}; + int value{}; +}; - void AssignRobot(int assignment_index, const Robot& robot) { - robots[assignment_index] = robot; - } +// 😱 Warning! No error handling! +class Game { + public: + Game(std::vector<int>&& ref_numbers, + std::vector<int>&& player_numbers, + int budget) + : ref_numbers_{std::move(ref_numbers)}, + player_numbers_{std::move(player_numbers)}, + budget_{budget} {} void Print() const { - for (auto i = 0UL; i < robots.size(); ++i) { - std::cout << i << ": Mission " << // - missions[i] << " is carried out by the robot " << // - robots[i] << std::endl; - } + std::cout << "Budget: " << budget_ << std::endl; + std::cout << "Reference numbers: "; + for (auto number : ref_numbers_) { std::cout << number << "\t"; } + std::cout << "\nPlayer numbers: "; + for (auto number : player_numbers_) { std::cout << number << "\t"; } + std::cout << std::endl; } - std::vector<Mission> missions{}; - std::vector<Robot> robots{}; -}; -``` - -For the sake of our example, we let the user modify these assignments manually. We would need some more functions for this.
More concretely, we have a function `CheckIfUserWantsChanges` that asks the user if they want to make any changes and returns a boolean value that indicates their answer. - -In addition to this, we need a function `GetNextChangeEntryFromUser` that actually asks for the user's input about *what* they want to change. We ask them for a mission index followed by a request to provide some arbitrary robot id, retuning these as a pair of values. If anything here sounds confusing, please go back to the [lecture on streams](io_streams.md) that we covered towards the start of this course. - -`user_input.hpp` - -```cpp -#pragma once + bool CheckIfPlayerWon() const { + int win_loss_counter{}; + for (auto i = 0UL; i < player_numbers_.size(); ++i) { + const auto difference = player_numbers_[i] - ref_numbers_[i]; + if (difference > 0) win_loss_counter++; + if (difference < 0) win_loss_counter--; + } + return win_loss_counter > 0; + } -#include "mission_robot_assignments.hpp" + void ChangePlayerNumberIfPossible(const ChangeEntry& change_entry) { + auto& player_number = player_numbers_[change_entry.index]; + const auto difference = std::abs(change_entry.value - player_number); + if (difference > budget_) { return; } + player_number = change_entry.value; + budget_ -= difference; + } -#include -#include + bool UserHasBudget() const { return budget_ > 0; } -inline bool CheckIfUserWantsChanges() { - std::cout << "Do you want to change assignment? [y/n]" << std::endl; - std::string answer{}; - std::cin >> answer; - if (answer == "y") { return true; } - return false; -} + private: + std::vector<int> ref_numbers_{}; + std::vector<int> player_numbers_{}; + int budget_{}; +}; -// Multiple issues here for now. -// We should handle failure to get a proper value. -// We also could use a struct in place of a pair. -inline std::pair GetNextChangeEntryFromUser( - const MissionRobotAssignments& assignments) { - std::pair entry{}; - std::cout << "Please select mission index."
<< std::endl; - std::cin >> entry.first; - std::cout << "Please provide new robot id." << std::endl; - std::cin >> entry.second; +// 😱 We should handle failure to get a proper value. +ChangeEntry GetNextChangeEntryFromUser(const Game& game) { + game.Print(); + ChangeEntry entry{}; + std::cout << "Please enter number to change: "; + std::cin >> entry.index; + std::cout << "Please provide a new value: "; + std::cin >> entry.value; return entry; } -``` - -Finally, we need a `main` function that prints the initial assignment and keeps asking the user for their input until they decide that they don't want to provide any. We end by printing the resulting assignments. -`robot_example.cpp` - -```cpp -#include "mission_robot_assignments.hpp" -#include "user_input.hpp" -#include "types.hpp" - -// Careful! The code below causes a subtle bug! int main() { - MissionRobotAssignments assignments{{Mission{42}, Mission{40}}, - {Robot{10}, Robot{23}}}; - assignments.Print(); - while (true) { - const auto user_wants_changes = CheckIfUserWantsChanges(); - if (!user_wants_changes) { break; } - const auto change_entry = GetNextChangeEntryFromUser(assignments); - assignments.AssignRobot(change_entry.first, change_entry.second); + Game game{{42, 49, 23}, {42, 40, 23}, 10}; + while (game.UserHasBudget()) { + const auto change_entry = GetNextChangeEntryFromUser(game); + game.ChangePlayerNumberIfPossible(change_entry); + } + if (game.CheckIfPlayerWon()) { + std::cout << "You win!\n"; + } else { + std::cout << "Not win today. Try again!\n"; } - assignments.Print(); } ``` -Obviously, this example does not do too much, and it might feel a bit verbose, but trust me it is good enough to illustrate many of the core concepts we are talking about today. I hope that with some minor stretch of imagination we can all imagine how it could be extended to a real-world use-case by adding a couple more functions.
+We can build this program as a single executable directly from the command line: -Oh, and as always, there is [complete code](code/error_handling/robot_example_simple/robot_example.cpp) to this project. - - -## Unrecoverable error example +```cmd +c++ -std=c++17 -o comparison_game comparison_game.cpp +``` -Now with the example set up, we can illustrate what a typical unrecoverable error looks like. +Ideally, we should test all of our functions, but for now let's just give it a play-through instead! -As we are using header files, we can compile our code with a single command: +# Unrecoverable errors: **fail early** -```cmd -c++ -std=c++17 -o robot_example robot_example.cpp -``` +## Our first unrecoverable error encounter -And if we run the resulting binary, the output might look something like this: +We run our executable and are greeted with expected numbers as well as a prompt to change one of our numbers: ```output -0: Mission 42 is carried out by the robot 10 -1: Mission 40 is carried out by the robot 23 -Do you want to change assignment? [y/n] -y -Please select mission index. -40 -Please provide new robot id. -4242 -Do you want to change assignment? [y/n] -n -0: Mission 4242 is carried out by the robot 10 -1: Mission 40 is carried out by the robot 23 +λ › ./comparison_game +Budget: 10 +Reference numbers: 42 49 23 +Player numbers: 42 40 23 +Please enter number to change: ``` -Here, there are two assignments to start with and the user wants to change the assignment for missions 40 to a robot with an id 4242. However, if we look at the assignments printed in the end we see something strange there: the robots assigned to missions **did not change** at all! Instead, a **mission id changed**! +We tie the reference numbers in the first and third columns but lose in the second. So our only chance to win is to change the value `40` to `50`, which is also what our budget allows for! 
So we provide these numbers to our program and observe what happens: -But why does it change the mission entry? +```output +λ › ./comparison_game +Budget: 10 +Reference numbers: 42 49 23 +Player numbers: 42 40 23 +Please enter number to change: 40 +Please provide a new value: 50 +Budget: 2 +Reference numbers: 50 49 23 +Player numbers: 42 40 23 +Please enter number to change: +``` -To get a hint about this, we can create an even simpler example by changing our `main` function: +But wait, what's going on here? Why did our number in the second column not change? Why is our budget not decreased by `10`? Even more strangely, why did the first reference number change to `50`? -```cpp -#include "mission_robot_assignments.hpp" -#include "types.hpp" +The answer to all of these questions is that we have just encountered our first unrecoverable error that manifests itself in wrong values in our memory through the "virtues" of Undefined Behavior. But what gives? -int main() { - MissionRobotAssignments assignments{{Mission{42}, Mission{40}}, - {Robot{10}, Robot{23}}}; - std::cout << "Address of assignments.missions.data(): " - << assignments.missions.data() << std::endl; - std::cout << "Address of assignments.robots.data(): " - << assignments.robots.data() << std::endl; - const auto diff = assignments.missions.data() - assignments.robots.data(); - std::cout << "Diff in address: " << diff << std::endl; -} -``` +Well, there is a chain of events that caused our values to be changed in ways that we don't expect and we'll be digging through all of these in the remainder of today's lecture. -Running this code will output the addresses for the data stored within our assignments object and the difference between their addresses. If this is confusing, please feel free to refresh what [raw pointers](raw_pointers.md) are before going deeper into *this* topic.
+But the most immediate cause is that the user has mistakenly provided the number they wanted to change, `40`, rather than an index of that number, `1`. We then did not check that provided "index" and wrote the provided new value directly into it. If we rerun our game again and provide `1` as the first input, we win, just as we expect! ```output -Address of assignments.missions.data(): 0x145605ea0 -Address of assignments.robots.data(): 0x145605e00 -Diff in address: 40 +λ › ./comparison_game +Budget: 10 +Reference numbers: 42 49 23 +Player numbers: 42 40 23 +Please enter number to change: 1 +Please provide a new value: 50 +Budget: 0 +Reference numbers: 42 49 23 +Player numbers: 42 50 23 +You win! ``` -If we look long enough at the output of our code and remember that we used 40 as our mission id that we wanted to change, something starts to click! +When we provide `40` as we did the first time, our wrong index is far beyond the size of the `player_numbers_` vector and, when we write beyond its bounds we enter the Undefined Behavior land. -**Here's what happened:** the program was asking us for a mission **index** but we provided a mission **id**. Then the id that the user provided got written into the `assignments.robots` vector way beyond the end of its memory and it *"just so happened"* that the address of the element we wrote to ended up having the address of the first element of the `assignments.missions`! So we overwrote the first element of `assignments.missions` by mistake! +What happens next is unpredictable. If we are lucky and the address into which we write does not belong to our program, the program will crash. -Do note, that we're hitting **undefined behavior** here! This result will almost certainly be different if you run this code on your own machine! The addresses of the `assignments.robots` and `assignments.missions` vectors will be different and can even have a different order in memory. 
You might have already noticed that in this run `assignments.robots` appears *before* `assignments.missions` in memory even though `assignments.robots` appears *after* `assignments.missions` in the `struct` declaration! If you'd like to understand a bit more about how these are allocated, we've covered this [in the lecture on how C++ allocates memory](memory_and_smart_pointers.md). +If we are not lucky however, we will rewrite *some* memory that belongs to our program, potentially corrupting any object that actually owns that memory. In this particular example, I picked the values in such a way, that the "fake index" just happens to be equal to a difference in pointers to the data of the `player_numbers_` and `ref_numbers_` vectors. Which then results in us writing directly into the first element of the `ref_numbers_` vector, resulting in an update to reference numbers. -The main point I was trying to make here though is that now our **memory is corrupted**. This particular example was carefully constructed to overwrite an element of another vector, but if the user provides a different "mission id" the code will write the user-provided number to an arbitrary part of memory that belongs to our program, or will crash with a segmentation fault error if we hit memory that does not belong to us yet. +But I want to stress again, that most likely if you run the same program on your machine - you will get a different behavior altogether! Even the order of `ref_numbers_` and `player_numbers_` in memory is not guaranteed, note how on my machine they do not even follow the order of declaration! -What this means is that once an event like this has occurred, **we do not have any guarantee on the consistency of the state of our program**. Arbitrary objects might have already been corrupted and can behave in unpredictable ways from this point on without us knowing about it. Which can lead to random-looking sporadic failures in seemingly unrelated parts of our program. 
+What *doesn't* change is that once the `ChangePlayerNumberIfPossible` method is called with a wrong `change_entry` in our example, all bets are off - **we do not have any guarantees on the consistency of the state of our program anymore** -🚨 This concept lies at the core of what makes this type of errors "unrecoverable". If the data we try to use for recovery is corrupted, we have no guarantees that any such recovery will succeed! +Arbitrary objects might have already been corrupted and can behave in unpredictable ways from this point on without us knowing about it. Which can lead to random-looking sporadic failures in seemingly unrelated parts of our program across multiple runs. This obviously becomes even harder to track down when we write more complex programs than our toy example here. + +🚨 This idea lies at the core of what makes this type of errors "unrecoverable". If the data we try to use for recovery is also corrupted, we have no guarantees that any recovery will succeed at all! ## How to deal with unrecoverable errors ### Catch them as early as possible -Therefore as we mentioned briefly before, a typical advice is to catch any wrong values that propagate through our program as early as possible—and crash as early as possible—before any more damage is done. - -### Don't use `assert` - - - -One way to detect wrong values floating through our program that is often recommended on the Internet is to use [`assert`](https://en.cppreference.com/w/cpp/error/assert.html) that can be found in the `` include file. I'm not a fan of using `assert` as it has one super annoying flaw but due to how popular it is in many C++ tutorials, we'll have to go through this topic step by step. +Therefore a typical advice is to catch any wrong values that propagate through our program as early as possible—and crash as early as possible—before any more damage is done. 
-Essentially, `assert` allows us to check any boolean condition passed into it: +### Use `CHECK` macro to fail early -```cpp -assert(2 + 2 == 4); -``` - -In our example, we could change the `AssignRobot` method of our `MissionRobotAssignments` class to perform the check needed to avoid potential undefined behavior: - -`mission_robot_assignments.hpp` +My favorite tool for this is the [`CHECK`](https://abseil.io/docs/cpp/guides/logging#CHECK) macro that can be found in the [Abseil library](https://abseil.io/docs/). We can use it in our `ChangePlayerNumberIfPossible` to check if the index is within bounds: ```cpp -#pragma once - -#include "types.hpp" - -#include -#include -#include - -// This should be a class, using struct for simplicity. -struct MissionRobotAssignments { - - void AssignRobot(int assignment_index, const Robot& robot) { - assert(assignment_index < robots.size()); - assert(assignment_index >= 0); - robots[assignment_index] = robot; - } +#include - void Print() const { - // Same as before. - for (auto i = 0UL; i < robots.size(); ++i) { - std::cout << i << ": Mission " << // - missions[i] << " is carried out by the robot " << // - robots[i] << std::endl; - } - } +// Old code unchanged here. + +void ChangePlayerNumberIfPossible(const ChangeEntry& change_entry) { + // Checking: + // (change_entry.index >= 0) + // (change_entry.index < player_numbers_.size()) + CHECK_GE(change_entry.index, 0); + CHECK_LT(change_entry.index, player_numbers_.size()); + auto& player_number = player_numbers_[change_entry.index]; + const auto difference = std::abs(change_entry.value - player_number); + if (difference > budget_) { return; } + player_number = change_entry.value; + budget_ -= difference; +} - std::vector missions{}; - std::vector robots{}; -}; +// Old code unchanged here. ``` -Now, if we try to compile and run our example just as we did before - the assertion will trigger! 
+If we run our example now, we will get a crash as soon as we call the `ChangePlayerNumberIfPossible` function that clearly states where this error originated and which check failed letting us debug this as easily as possible: ```output -0: Mission 42 is carried out by the robot 10 -1: Mission 40 is carried out by the robot 23 -Do you want to change assignment? [y/n] -y -Please select mission index. -40 -Please provide new robot id. -4242 -Assertion failed: ((assignment_index < robots.size()) && (assignment_index >= 0)), function AssignRobot, file mission_robot_assignments.hpp, line 14. -[1] 29732 abort ./robot_example +F0000 00:00:1750605447.566908 1 example.cpp:44] Check failed: change_entry.index < player_numbers_.size() (40 vs. 3) ``` -So far so good, right? So what is that fatal flaw I've been talking about that makes me dislike `assert`? Well, you see, all `assert` statements get disabled when a macro `NDEBUG` is defined. This is a standard macro name that controls if the debug symbols get compiled into the binary and gets passed to the compilation command for most release builds as we generally don't want debug symbols in the binary we release. So essentially, `assert` **does not protect us from undefined behavior in the code we actually deploy**! +One concern that people have when thinking of using the `CHECK` macros is performance as these checks stay in the code we ship and do cost some time when our program runs. -We can easily demonstrate that the `asserts` indeed get *compiled out* by adding `-DNDEBUG` flag to our compilation command: - -```cmd -c++ -std=c++17 -DNDEBUG -o robot_example robot_example.cpp -``` +For my money, in most cases, the benefits far outweigh the costs, and, unless we've measured that we cannot allow the tiny performance hit in a particular place of our code, we should be free to use `CHECK` for safety against entering the Undefined Behavior land. 
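To make the idea of a check that stays active in the code we ship a bit more tangible, here is a minimal sketch that uses only the standard library. The helper names `IsIndexInBounds` and `AbortIfIndexOutOfBounds` are made up for this illustration and are not part of Abseil; the real `CHECK` macros do more, such as reporting the file, the line and the compared values as we saw in the output above.

```cpp
#include <cstddef>
#include <cstdlib>
#include <iostream>

// Hypothetical helper (not from Abseil): true if `index` addresses
// a valid element of a container that holds `size` elements.
inline bool IsIndexInBounds(int index, std::size_t size) {
  return (index >= 0) && (static_cast<std::size_t>(index) < size);
}

// Hypothetical helper (not from Abseil): terminates the program right away
// if `index` is invalid. It is a plain function, so it stays active
// regardless of how the program is compiled.
inline void AbortIfIndexOutOfBounds(int index, std::size_t size) {
  if (IsIndexInBounds(index, size)) { return; }
  std::cerr << "Check failed: index " << index << " is outside of [0, "
            << size << ")" << std::endl;
  std::abort();
}
```

Calling `AbortIfIndexOutOfBounds(change_entry.index, player_numbers_.size())` as the first statement of `ChangePlayerNumberIfPossible` would stop the program before the out-of-bounds write can happen.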
-Running our example *now* leads to the same undefined behavior we observed before as all of the assertions were compiled out. So `asserts` essentially become useless in production! + -### Use `CHECK` macro instead +### Don't use `assert` -Even as `assert` might not be a perfect tool for the job, the idea of checking the function's pre- and sometimes pose-conditions is actually still a **very good idea**! +You might wonder if using `CHECK` is our only option, so I have to talk about one very famous alternative here that is often recommended on the Internet. This alternative is to use [`assert`](https://en.cppreference.com/w/cpp/error/assert.html) in place of `CHECK`. The `assert` macro can be found in the `<cassert>` include file. I'm not a fan of using `assert`. -We just need a better tool! My favorite method is to use the [`CHECK`](https://abseil.io/docs/cpp/guides/logging#CHECK) macro that can be found in the [Abseil library](https://abseil.io/docs/). We can use it in the same way as we used `assert`: +What is this smell? Is something on fire? Ah, it's the people with torches and pitchforks again coming for me for not liking `assert`! -```cpp -#pragma once +Let me explain myself. You see, `assert` has one super annoying flaw that makes it impossible for me to recommend it for production code. I've seen sooo many bugs stemming from this! But let me show what I'm talking about on our game example. -#include "types.hpp" +First, we can use `assert` in a very similar way to `CHECK`: -#include +```cpp +#include <cassert> -#include -#include +// Old code unchanged here. -// This should be a class, using struct for simplicity.
-struct MissionRobotAssignments { +void ChangePlayerNumberIfPossible(const ChangeEntry& change_entry) { + assert(change_entry.index >= 0); + assert(change_entry.index < player_numbers_.size()); + auto& player_number = player_numbers_[change_entry.index]; + const auto difference = std::abs(change_entry.value - player_number); + if (difference > budget_) { return; } + player_number = change_entry.value; + budget_ -= difference; +} - void AssignRobot(int assignment_index, const Robot& robot) { - CHECK_LT(assignment_index, robots.size()); - CHECK_GE(assignment_index, 0); - robots[assignment_index] = robot; - } +// Old code unchanged here. +``` - void Print() const { - // Same as before. - for (auto i = 0UL; i < robots.size(); ++i) { - std::cout << i << ": Mission " << // - missions[i] << " is carried out by the robot " << // - robots[i] << std::endl; - } - } +Now, if we compile and run our game just as we did before - the assertion will trigger: - std::vector missions{}; - std::vector robots{}; -}; +```output +output.s: /app/example.cpp:44: void Game::ChangePlayerNumberIfPossible(const ChangeEntry &): Assertion `change_entry.index < player_numbers_.size()' failed. +Program terminated with signal: SIGSEGV ``` -And the output is very similar with the main difference being that it also works in release builds! +So far so good, right? Using `assert` also crashes our program when the wrong input is provided and shows us where the wrong value was detected. -The main concern that people have when using `CHECK` macros is performance as they stay in our code and do cost some time when our program runs. But I would say that the benefits far outweigh the costs, and, unless we've measured that we cannot allow the tiny performance hit in a particular place of our code, we should be free to use `CHECK` for safety. +So what is that annoying flaw I've been talking about that makes me dislike `assert`? 
Well, you see, all `assert` statements get *disabled* when a macro `NDEBUG` is defined. This is a standard macro that turns every `assert` into a no-op and typically gets passed to the compilation command for release builds; for example, CMake's default release flags for GCC and Clang include `-DNDEBUG`. So essentially, `assert` **does not protect us from undefined behavior in the code we actually deploy**! -All in all, at least in my book, `CHECK` is our main weapon against unrecoverable errors and the undefined behavior that they tend to cause. +We can easily demonstrate that the `assert` statements indeed get *compiled out* by adding the `-DNDEBUG` flag to our compilation command: - +```cmd +c++ -std=c++17 -DNDEBUG -o comparison_game comparison_game.cpp +``` + +Running our game *now* and providing a wrong input index leads to the same undefined behavior we observed before as all of the assertions were compiled out. Not great, right? -### Complete the example +### Complete the example yourself -By the way, we've only covered how we could improve our `AssignRobot` method of the `MissionRobotAssignments` class. Do you think our `Print` function would benefit from the same treatment? +By the way, we've only covered how we could improve our `ChangePlayerNumberIfPossible` method of the `Game` class. Do you think our `CheckIfPlayerWon` function would benefit from the same treatment? - + ## How to minimize number of unrecoverable errors -Of course, hard failures in the programs we ship is not ideal! One way to reduce the risk of such failures is to keep the test coverage high for the code we write, ideally close to 100% line and branch coverage. This way we catch most of them during development. In some industries, like automotive, aviation, or medical this is actually a legal requirement. +Of course, hard failures in the programs we ship are also not ideal!
+ +One way to reduce the risk of such failures is to keep the test coverage high for the code we write, ideally close to 100% line and branch coverage, i.e., every line and logical branch gets executed at least once in our test suite. + +This way we catch most of the unrecoverable errors during development. In some industries, like automotive, aviation, or medical this is actually a legal requirement. But unfortunately, despite our best efforts, we cannot *completely* avoid failures in the programs we ship! -Even if we do everything right on our side, the hardware can still fail and corrupt our memory. One fun example of this is the famous error in the Belgian election on the 18th of May 2003, where [one political party got 4096 extra votes](https://en.wikipedia.org/wiki/Electronic_voting_in_Belgium). If you look at the number carefully, you might notice that it is a 2 to the power of 12. One error explanation was that a cosmic ray flipped "a bit at the position 13 in the memory of the computer", essentially leading to 4096 more votes. +Even if we do everything right on our side, hardware can still fail and corrupt our memory. One fun example of this is the famous error in the Belgian election on the 18th of May 2003, where [one political party got 4096 extra votes](https://en.wikipedia.org/wiki/Electronic_voting_in_Belgium) due to what is believed to have been a cosmic ray flipping "a bit at the position 13 in the memory of the computer", essentially leading to 4096 more votes. - + -With the knowledge that we cannot completely remove the risk of hitting an unrecoverable error in production, in safety-critical systems, we often isolate components into separate processes or even separate hardware units, with watchdogs that can trigger recovery actions if something crashes. +This visualization is actually taken from this [excellent Veritasium video](https://www.youtube.com/watch?v=AaZ_RSt0KP8). 
Do give it a watch, it speaks about this and other similar cases much more in-depth! + +With the knowledge that we cannot completely remove the risk of hitting an unrecoverable error in production, and that we also cannot just outright fail, in safety-critical systems, we often isolate components into separate processes or even separate hardware units, with watchdogs that can trigger recovery actions if one of our components suddenly crashes. This way we can have our cake and eat it at the same time: using `CHECK` minimizes the time-to-failure when a bug is encountered, while our fallback options keep the system safe as a whole even when certain components fail. -That being said, this is a system architecture question and this topic is far beyond what I want to talk about today. In most non-safety-critical systems we do not need to think about these failure cases as deeply and we can usually just restart our program in case of a one-off failure. But, if you're interested in this topic, I've given a part of my introductory lecture [on this topic](https://youtu.be/DtRktn4bVWg?si=DJuU8OjxtBcj5o2C) at the University of Bonn some years ago. +That being said, this is a system architecture question and this topic is far beyond what I want to talk about today. In most non-safety-critical systems we do not need to think about these failure cases as deeply and we can usually just restart our program in case of a one-off failure. But, if you're interested in this topic, I've dedicated a significant part towards the end of my introductory lecture to the Self Driving Cars course at the University of Bonn, which I gave some years ago, [to this topic](https://youtu.be/DtRktn4bVWg?si=DJuU8OjxtBcj5o2C). So do give it a watch!
# Recoverable errors: **handle and proceed** @@ -425,147 +376,143 @@ The good thing about user inputs is that we can ask the user to correct these wi To talk about them, let us focus on the function `GetNextChangeEntryFromUser` from our example. Currently, there is no validation of what the user inputs but we absolutely *can* and *should* perform such validation! -As we design our program, we know that the mission index that the user provides first must be within the bounds of the mission vector within our `assignments` object: +As we design our program, we know that the number index that the user provides first must be within the bounds of the player numbers vector of the `game` object: ```cpp -// Multiple issues here for now. -// We should handle failure to get a proper value. -// We also could use a struct in place of a pair. -inline std::pair GetNextChangeEntryFromUser( - const MissionRobotAssignments& assignments) { - std::cout << "Current assignments:\n"; - assignments.Print(); - std::pair entry{}; - std::cout << "Please select mission index." << std::endl; - std::cin >> entry.first; // <-- This value is NOT arbitrary! - std::cout << "Please provide new robot id." << std::endl; - std::cin >> entry.second; +// 😱 We should handle failure to get a proper value. +ChangeEntry GetNextChangeEntryFromUser(const Game& game) { + game.Print(); + ChangeEntry entry{}; + std::cout << "Please enter number to change: "; + std::cin >> entry.index; // <-- This value is NOT arbitrary! + std::cout << "Please provide a new value: "; + std::cin >> entry.value; return entry; } ``` -So we'd like to somehow know that something went wrong within the `GetNextChangeEntryFromUser` function and recover from this. +So, when the player, *does* provide a wrong value we'd like to somehow know that something went wrong within the `GetNextChangeEntryFromUser` function and recover from this. 
Broadly speaking, we have two strategies of communicating failures like these that have emerged in C++ over the years: 1. **Return a special value from a function.** 2. Throw an **exception**. -We’ll spend most of our time on the first one—but let’s first talk about throwing exceptions, and yes, why I think it might not be the best thing we could do. This is the time to get your pitchforks ready 😉. +We’ll spend most of our time on the first option—but let’s first talk about throwing exceptions, and yes, why I think it might not be the best thing we could do. + +This is yet again a good time to get your pitchforks and torches ready 😉. ## Exceptions -### What exceptions are +### How to use exceptions Since C++98 we have a powerful machinery of exceptions at our disposal. An exception is essentially just an object of some type, typically derived from [`std::exception`](https://en.cppreference.com/w/cpp/error/exception.html) class. Such an exception holds the information about the underlying failure and can be "thrown" and "caught" within a C++ program. -In our example function, we could throw an object of `std::runtime_error` when the user inputs a wrong mission index: +In our example, we could throw an object of `std::out_of_range` when the user inputs a wrong number index: ```cpp -// Multiple issues here for now. -// We also could use a struct in place of a pair. -inline std::pair GetNextChangeEntryFromUser( - const MissionRobotAssignments& assignments) { - std::cout << "Current assignments:\n"; - assignments.Print(); - std::pair entry{}; - std::cout << "Please select mission index." << std::endl; - std::cin >> entry.first; // <-- This value is NOT arbitrary! - if ((entry.first < 0) || (entry.first >= assignments.missions.size())) { - throw std::runtime_error("Wrong mission index provided."); +// 😱 I'm not a fan of using exceptions. 
+ChangeEntry GetNextChangeEntryFromUser(const Game& game) { + game.Print(); + ChangeEntry entry{}; + std::cout << "Please enter number to change: "; + std::cin >> entry.index; // <-- This value is NOT arbitrary! + if ((entry.index < 0) || (entry.index >= game.player_numbers().size())) { + throw std::out_of_range("Wrong number index provided."); } - std::cout << "Please provide new robot id." << std::endl; - std::cin >> entry.second; + std::cout << "Please provide a new value: "; + std::cin >> entry.value; return entry; } ``` -Throwing an exception interrupts the normal function flow, destroy all objects allocated in the appropriate scope and will continue with the "exceptional flow" of the program to find a place where the thrown exception can be handled. +Throwing an exception interrupts the normal program flow. We leave the current scope, so all objects allocated in it are automatically destroyed and the program continues with the "exceptional flow" to find a place where the thrown exception can be handled. -Speaking of handling exceptions, we can "catch" them anywhere upstream from the place they have been thrown from. As `std::exception` is just an object, it can be caught by value or by reference. When using exceptions, it is considered best practice to catch them by reference. In our case this is possible because `std::runtime_error` derives from `std::exception`: +Speaking of handling exceptions, we can "catch" them anywhere upstream from the place they have been thrown from. As `std::exception` is just an object, it can be caught by value or by reference. It is considered best practice to catch them by reference. In our case we can catch either an `std::out_of_range` exception directly or, as `std::out_of_range` derives from `std::exception`, catch `std::exception` instead: ```cpp -#include "mission_robot_assignments.hpp" -#include "user_input.hpp" -#include "types.hpp" +// Old unchanged code. 
int main() { - MissionRobotAssignments assignments{{Mission{42}, Mission{40}}, - {Robot{10}, Robot{23}}}; - assignments.Print(); - while (true) { - const auto user_wants_changes = CheckIfUserWantsChanges(); - if (!user_wants_changes) { break; } + Game game{{42, 49, 23}, {42, 40, 23}, 10}; + while (game.UserHasBudget()) { try { - const auto change_entry = GetNextChangeEntryFromUser(assignments); - assignments.AssignRobot(change_entry.first, change_entry.second); - } catch (const std::exception& e) { - std::cerr << "Error: " << e.what() << "\n"; - std::cerr << "Please try again.\n"; + const auto change_entry = GetNextChangeEntryFromUser(game); + game.ChangePlayerNumberIfPossible(change_entry); + } catch (const std::out_of_range& e) { + std::cerr << e.what() << std::endl; } } - assignments.Print(); + game.Print(); + if (game.CheckIfPlayerWon()) { + std::cout << "You win!\n"; + } else { + std::cout << "Not win today. Try again!\n"; + } } ``` -Should we forget to catch an exception it bubbles up to the very top and terminates our program. +### A case for exceptions for both "recoverable" and "unrecoverable" errors + +Should we forget to catch an exception it bubbles up to the very top and terminates our program. To verify this, you could try changing `catch (const std::out_of_range& e)` to `catch (const std::runtime_error& e)` and see what happens. + +On paper, this looks very neat. Essentially, with exceptions, we could forget about the distinction between recoverable and unrecoverable errors: if we catch an error, it is a "recoverable" one, if not - we treat it as "unrecoverable". -On paper, this looks very neat. Essentially, with exceptions, we could forget about the distinction between recoverable and unrecoverable errors: if we catch an error, it is a "recoverable" one, if not - we treat it as "unrecoverable". 
This is one of the main arguments from people who like using exceptions - do not make a global decision inside of a local scope, let someone with more overview figure out what to do. And it *is* a very good argument! The other good argument is that exceptions *are* part of the language so it feels odd not to use them. +This is one of the main arguments from people who like using exceptions - do not make a global decision about the error type inside of a local scope - let someone with more overview figure out what to do. + +And it *is* a very good argument! + +The other good argument is that exceptions *are* part of the language so it feels odd not to use them. But there are problems with exceptions that, at least in some industries, we just cannot ignore. -### Exceptions are (sometimes) expensive +### Why not just use exceptions -Exceptions typically [allocate memory on the heap](memory_and_smart_pointers.md#the-heap) when thrown, and rely on **R**un-**T**ime **T**ype **I**nformation ([RTTI](https://en.wikipedia.org/wiki/Run-time_type_information)) to propagate through the call stack. There is a [great talk by Andreas Weiss](https://www.youtube.com/watch?v=kO0KVB-XIeE), my former colleague at BMW, that goes into a lot of detail how exactly exceptions behave. The talk is called "Exceptions demystified" and I urge you to give it a watch if you want to know *all* the details! +#### Exceptions are (sometimes) expensive -But long story short, both throwing and catching exceptions rely on mechanisms that work at runtime and therefore cost execution time. +Exceptions typically [allocate memory on the heap](memory_and_smart_pointers.md#the-heap) when thrown, and rely on **R**un-**T**ime **T**ype **I**nformation ([RTTI](https://en.wikipedia.org/wiki/Run-time_type_information)) to propagate through the call stack.
There is a really in-depth talk, called ["Exceptions demystified", by Andreas Weiss](https://www.youtube.com/watch?v=kO0KVB-XIeE), my former colleague at BMW, that goes into a lot of detail about how exceptions behave. -Unfortunately, there are no guarantees on timing or performance of these operations. While in most common scenarios these operations run fast-enough, in real-time or safety-critical code, such unpredictability is unacceptable. +Long story short, both throwing and catching exceptions rely on mechanisms that work at runtime and therefore cost execution time. -### Exceptions hide the error path +Unfortunately, there are no guarantees on timing or performance of these operations. While in most common scenarios these operations run "fast-enough", in real-time or safety-critical code, such unpredictability is usually unacceptable. -Exceptions also arguably make control flow harder to reason about. To quote the Google C++ Style Guide: +#### Exceptions hide the error path + +Exceptions also arguably make control flow harder to read and reason about. To quote the Google C++ Style Guide: > Exceptions make the control flow of programs difficult to evaluate by looking at code: functions may return in places you don't expect. This causes maintainability and debugging difficulties. +Docs are always out of date + Indeed, an error can propagate across many layers of calls before being caught. It’s easy to miss what a function might throw, especially if documentation is incomplete or out of date (which it almost always is). Furthermore, the language permits the use of generic catch blocks like `catch (...)` and these make things even more confusing. We end up catching *something*, but we no longer know what or who threw it at us!
😱 -In our own example, if, say `AssignRobot` throws an undocumented `std::out_of_range` exception but we only expect `std::runtime_error`, we might miss important context or even crash anyway: +In our own example, if, the `ChangePlayerNumberIfPossible` function throws an undocumented `std::runtime_error` exception but we only expect `std::out_of_range`, we don't have a good way of detecting this! Our only options are to allow such exceptions to be left unhandled and eventually to terminate our program or to add a `catch(...)` clause where we miss important context for what caused the error, making catching it a lot less useful: ```cpp -#include "mission_robot_assignments.hpp" -#include "user_input.hpp" -#include "types.hpp" - int main() { - MissionRobotAssignments assignments{{Mission{42}, Mission{40}}, - {Robot{10}, Robot{23}}}; - assignments.Print(); - while (true) { - const auto user_wants_changes = CheckIfUserWantsChanges(); - if (!user_wants_changes) { break; } + Game game{{42, 49, 23}, {42, 40, 23}, 10}; + while (game.UserHasBudget()) { try { - const auto change_entry = GetNextChangeEntryFromUser(assignments); - assignments.AssignRobot(change_entry.first, change_entry.second); - } catch (const std::runtime_error& e) { - std::cerr << "Error: " << e.what() << "\n"; - std::cerr << "Please try again.\n"; + const auto change_entry = GetNextChangeEntryFromUser(game); + game.ChangePlayerNumberIfPossible(change_entry); + } catch (const std::out_of_range& e) { + std::cerr << e.what() << std::endl; } catch (...) { - // Not very helpful, is it? + // 😱 Not very useful, is it? std::cerr << "Oops, something happened.\n"; } } - assignments.Print(); + // Rest of the code. } ``` -Video Thumbnail +Something happened error I believe that `catch(...)` and equivalent constructs are singlehandedly responsible for the absolute majority of the very unspecific and unhelpful error messages that we see all over the internet and have probably encountered ourselves multiple times. 
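To see how little a `catch (...)` block can tell us, consider the following small sketch. The `DescribeFailure` helper is hypothetical and exists only for illustration; it runs an action and reports which handler, if any, ended up catching the exception:

```cpp
#include <functional>
#include <stdexcept>
#include <string>

// Hypothetical helper (an assumption for illustration only): runs `action`
// and reports which handler caught the exception, if any was thrown.
std::string DescribeFailure(const std::function<void()>& action) {
  try {
    action();
  } catch (const std::out_of_range& e) {
    // We expected this type, so we can access its message.
    return std::string{"out_of_range: "} + e.what();
  } catch (...) {
    // We know that *something* was thrown, but not what or why.
    return "unknown error";
  }
  return "no error";
}
```

If `action` throws an `std::out_of_range`, we get the message back. If it throws an `std::runtime_error` instead, the first handler does not match and all we can report is that something unknown happened.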
-### Exceptions are banned in many code bases +#### Exceptions are banned in many code bases All of these issues led a lot of code bases to ban exceptions altogether. In 2019, isocpp.org did a [survey](https://isocpp.org/files/papers/CppDevSurvey-2018-02-summary.pdf) on this matter and found that about half the respondents could not use exceptions at least in part of their code bases. @@ -573,9 +520,9 @@ My own experience aligns with these results - every serious project I’ve worke The problem of using exceptions with an acceptable overhead has quite vibrant discussions around it with even calls for re-designing exceptions altogether as can be seen in this [wonderful talk by Herb Sutter](https://www.youtube.com/watch?v=ARYP83yNAWk) from CppCon 2019 as well as his [corresponding paper](https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p0709r4.pdf) on this topic. - + -But until the C++ community figures out what to do we are stuck with many people being unable to use the default error handling mechanism in C++. +But until the C++ community figures out what to do, we are stuck with many people being unable to use the default error handling mechanism in C++. So what *do* we do? @@ -587,191 +534,183 @@ I would say that there are three distinct ways of thinking about it. Let's illustrate all of them on a function we've already looked at: ```cpp -std::pair GetNextChangeEntryFromUser(const MissionRobotAssignments& assignments); +ChangeEntry GetNextChangeEntryFromUser(const Game& game); ``` We can: -1. Keep the return type, `std::pair` in our case, but return a special **value** of this type in case of failure. +1. Keep the return type, `ChangeEntry` in our case, but return a special **value** of this type in case of failure, say a value-initialized `ChangeEntry{}` object; 2. 
Return an **error code**, which would change the signature of the function to return `int` or a similar type instead: ```cpp - int GetNextChangeEntryFromUser(const MissionRobotAssignments& assignments, std::pair& result); + int GetNextChangeEntryFromUser(const Game& game, ChangeEntry& result); ``` -3. Return a **different type** specifically designed to encode failure states alongside the actual return, like `std::optional>` which only holds a valid `std::pair` in case of success. +3. Return a **different type** specifically designed to encode failure states alongside the actual return, like `std::optional<ChangeEntry>` which only holds a valid `ChangeEntry` in case of success. -I believe that the third option is the best out of these three, but let me explain why the first two are not cutting it, before going deeper into details. +I believe that the third option is the best out of these three, but it will be easier to explain why I think so after we talk about why the first two are not cutting it. ### Returning a value indicating error does not always work 😱 -There are a number of issues with returning a special value from a function without using a special return type. As an illustration, in our case, a naïve choice would be to return a pair of zeros from the function if the user provided a wrong mission index. However, as you might imagine, a pair of zeros *is* a completely valid mission-robot assignment! How do we disambiguate this value `{0, 0}` value from a valid one? +There are a number of issues with returning a special value from a function without using a special return type. As an illustration, in our case, a naïve choice would be to return a value-initialized `ChangeEntry{}` object from the `GetNextChangeEntryFromUser` function if the user provided a wrong number index. This object will essentially hold zeros for its `index` and `value` entries. However, as you might imagine, a pair of zeros *is* a completely valid, if slightly useless, change entry!
How do we disambiguate this value from a valid one? ```cpp // 😱 Not a great idea! -inline std::pair GetNextChangeEntryFromUser( - const MissionRobotAssignments& assignments) { - std::cout << "Current assignments:\n"; - assignments.Print(); - std::pair entry{}; - std::cout << "Please select mission index." << std::endl; - std::cin >> entry.first; // <-- This value is NOT arbitrary! - if ((entry.first < 0) || (entry.first >= assignments.missions.size())) { - return {0, 0}; +ChangeEntry GetNextChangeEntryFromUser(const Game& game) { + game.Print(); + ChangeEntry entry{}; + std::cout << "Please enter number to change: "; + std::cin >> entry.index; + if ((entry.index < 0) || (entry.index >= game.player_numbers().size())) { + return {}; // How do we know this value indicates an error? } - std::cout << "Please provide new robot id." << std::endl; - std::cin >> entry.second; + std::cout << "Please provide a new value: "; + std::cin >> entry.value; return entry; } ``` +Many other real-world functions will face the same issues which makes this method of returning a pre-defined value in case an error was encountered not particularly useful in practice. + ### Returning an error code breaks "pure functions" 😱 -Returning an error code instead solves at least a couple of these issues. It is fast and reliable and we can design our software with different error codes in mind so that the reason for the failure is also communicated to us. This is also still the prevalent way of handling errors in C and in some library that we can find in the wild, so there *is* some merit to this method. +Returning an error code instead solves at least a couple of these issues. It is fast and reliable and we can design our software with different error codes in mind so that the reason for the failure is also communicated to us. This is also still the prevalent way of handling errors in C and in some libraries that we can find in the wild, so there *is* some merit to this method. 
-However, if our function actually must return a value, the only way to use error codes is to change its return type to the type that our error codes have, like `int`, which forces us to provide an additional output parameter to our function, like `std::pair& result` in our case: +However, if our function actually must *return* a value, which most of the functions do, the only way to use error codes is to change its return type to the type that our error codes have, like `int`, which forces us to provide an additional output parameter to our function, like `ChangeEntry& result`: ```cpp -int GetNextChangeEntryFromUser(const MissionRobotAssignments& assignments, std::pair& result); +int GetNextChangeEntryFromUser(const Game& game, ChangeEntry& result); ``` -The main issue with this from my point of view is that it is clunky, mixes input/output in the signature, and limits functional composition. Consider how we would use this function: +The main issue with this from my point of view is that it is clunky, mixes input/output in the signature, and limits functional composition. + +Consider how we would use this function: ```cpp -#include "mission_robot_assignments.hpp" -#include "user_input.hpp" -#include "types.hpp" +// Old code above. int main() { - MissionRobotAssignments assignments{{Mission{42}, Mission{40}}, - {Robot{10}, Robot{23}}}; - assignments.Print(); - while (true) { - const auto user_wants_changes = CheckIfUserWantsChanges(); - if (!user_wants_changes) { break; } - std::pair change_entry{}; // Cannot be const! - const auto error_code = GetNextChangeEntryFromUser(assignments, change_entry); - if (!error_code) { - // Get the actual message stored somewhere else using the int error code. - std::cerr << "Please try again.\n"; + Game game{{42, 49, 23}, {42, 40, 23}, 10}; + while (game.UserHasBudget()) { + // Cannot be const, cannot use auto, have to allocate. 
+ ChangeEntry change_entry{}; + const auto error_code = GetNextChangeEntryFromUser(game, change_entry); + if (error_code != 0) { + std::cerr << GetReason(error_code) << std::endl; continue; } - assignments.AssignRobot(change_entry.first, change_entry.second); + game.ChangePlayerNumberIfPossible(change_entry); } - assignments.Print(); + // Rest of the code. } ``` -In this code, we have to create a pair object before calling the `GetAnswerFromLlm` function. Furthermore this object cannot be `const`, which goes against everything we've been talking about in this series of C++ lectures until now. +Here, we can detect that something went wrong and, if we have some function `GetReason` that converts an error code to string, we even know *what* went wrong. -On top of all this, nowadays, the compilers are able to perform **R**eturn **V**alue **O**ptimization (or [RVO](https://en.cppreference.com/w/cpp/language/copy_elision.html)) for values returned from a function, essentially skipping the function call, and constructing the needed value in-place, and this functionality is limited for such input/output parameters. +But! We have to create a `ChangeEntry` object before calling the `GetNextChangeEntryFromUser` function. Furthermore this object cannot be `const`, which goes against everything we've been talking about in this series of C++ lectures until now. -So clearly, there are some issues with this method too. I believe it has its merits sometimes, but there has to be a reason for it and we must measure the performance well. +On top of all this, nowadays, the compilers are able to perform **R**eturn **V**alue **O**ptimization (or [RVO](https://en.cppreference.com/w/cpp/language/copy_elision.html)) for values returned from a function, essentially skipping the function call altogether, and constructing the needed value in-place. This functionality is, however, limited when using input/output parameters. + +So clearly, there are some issues with this method too. 
I believe it has its merits sometimes, but there has to be a reason for using it and we must measure the performance well. + + ### Using `std::optional`: **a better way** -I believe that there *is* a better way. With C++17, we gained [`std::optional`](https://en.cppreference.com/w/cpp/utility/optional.html) with which we can express that a function “might return a value” if everything goes well: +I believe that there *is* a better way though. With C++17, we gained [`std::optional`](https://en.cppreference.com/w/cpp/utility/optional.html) with which we can express that a function “might return a value” by returning an object of `std::optional<ChangeEntry>` type. We can create this object either empty, holding a so-called `std::nullopt`, or from a valid `ChangeEntry`: ```cpp -std::optional> GetNextChangeEntryFromUser( - const MissionRobotAssignments& assignments) { - std::cout << "Current assignments:\n"; - assignments.Print(); - std::pair entry{}; - std::cout << "Please select mission index." << std::endl; - std::cin >> entry.first; - if ((entry.first < 0) || (entry.first >= assignments.missions.size())) { - return {}; // <-- Creates an empty optional, or std::nullopt. +#include <optional> + +// Other code above. +std::optional<ChangeEntry> GetNextChangeEntryFromUser(const Game& game) { + game.Print(); + ChangeEntry entry{}; + std::cout << "Please enter number to change: "; + std::cin >> entry.index; + if ((entry.index < 0) || (entry.index >= game.player_numbers().size())) { + return {}; // <-- Create an empty optional, or std::nullopt. } - std::cout << "Please provide new robot id." << std::endl; - std::cin >> entry.second; - return entry; // <-- Creates an optional filled with a pair. + std::cout << "Please provide a new value: "; + std::cin >> entry.value; + return entry; // <-- Optional filled with a ChangeEntry object.
} ``` -Now our function returns an object of a different type, `std::optional>` that we can use in an `if` statement to find out if it actually holds a value, which we can get to by calling its `value()` method or using the dereferencing operators `*` and `->`, just like with pointers: +We can use our newly-returned optional object in an `if` statement to find out if it actually holds a valid change entry. If the object does not hold a value, we show an error and continue asking the user to provide better input, but if the object *does* hold a value, we can get to it by calling its `value()` method or using the dereferencing operators `*` and `->`, just like we did with pointers: ```cpp -#include "mission_robot_assignments.hpp" -#include "user_input.hpp" -#include "types.hpp" - +// Other code above. int main() { - MissionRobotAssignments assignments{{Mission{42}, Mission{40}}, - {Robot{10}, Robot{23}}}; - assignments.Print(); - while (true) { - const auto user_wants_changes = CheckIfUserWantsChanges(); - if (!user_wants_changes) { break; } - const auto change_entry = GetNextChangeEntryFromUser(assignments); - if (!change_entry.has_value()) { - std::cerr << "Please try again.\n"; + Game game{{42, 49, 23}, {42, 40, 23}, 10}; + while (game.UserHasBudget()) { + const auto change_entry = GetNextChangeEntryFromUser(game); + if (!change_entry) { // Also possible: !change_entry.has_value(). + std::cerr << "Error when getting a number index." << std::endl; continue; } - assignments.AssignRobot(change_entry->first, change_entry->second); + game.ChangePlayerNumberIfPossible(change_entry.value()); } - assignments.Print(); + // Rest of the code. } ``` The presence or absence of a value is encoded into the type itself. No more guessing. No more relying on magic return values or input/output arguments. And the code is very short and neat!
And, as always, we can always find more information about how to use `std::optional` at [cppreference.com](https://en.cppreference.com/w/cpp/utility/optional.html). +But, we've again lost a capability of knowing *what* went wrong. We just know that *something* has not gone to plan. There is simply no place to store the reason! + + + ### Using `std::expected`: **add context** -However, we might notice that `std::optional` only tells us that *something* went wrong, but not *what* went wrong. We're still interested in a reason! +But we are still interested in a reason for our failures! Enter [`std::expected`](https://en.cppreference.com/w/cpp/utility/expected.html), coming in C++23. And if you'd like to know what led to it being added to the language, give this [fantastic talk by Andrei Alexandrescu](https://www.youtube.com/watch?v=PH4WBuE1BHI) a watch! It is one of my favorite talks ever! It is both informative and entertaining in an equal measure! -With `std::expected` we could do the same things we could with `std::optional` and more by changing our function accordingly: +With `std::expected` we could do the same things we could do with `std::optional` and more by changing our function accordingly: ```cpp std::expected GetAnswerFromLlm(const std::string& question); ``` -Essentially, `std::expected` holds one of two values of two potentially different types - an expected or an unexpected one. Now we can return either a valid result, or an error message: +Essentially, `std::expected` holds one of two values of two potentially different types - an expected (`ChangeEntry` in our case) or an unexpected one (`std::string` in our tiny example). Now we can return either a valid result, *or* an error message: ```cpp -std::expected, std::string> GetNextChangeEntryFromUser( - const MissionRobotAssignments& assignments) { - std::cout << "Current assignments:\n"; - assignments.Print(); - std::pair entry{}; - std::cout << "Please select mission index." 
<< std::endl; - std::cin >> entry.first; - if ((entry.first < 0) || (entry.first >= assignments.missions.size())) { - return std::unexpected("Wrong value provided."); +#include <expected> +#include <format> + +// Other code above. +std::expected<ChangeEntry, std::string> GetNextChangeEntryFromUser(const Game& game) { + game.Print(); + ChangeEntry entry{}; + std::cout << "Please enter number to change: "; + std::cin >> entry.index; + if ((entry.index < 0) || (entry.index >= game.player_numbers().size())) { + return std::unexpected(std::format("Index {} must be in [0, {}) interval", entry.index, game.player_numbers().size())); } - std::cout << "Please provide new robot id." << std::endl; - std::cin >> entry.second; - return entry; // <-- Creates a std::expected filled with a pair. + std::cout << "Please provide a new value: "; + std::cin >> entry.value; + return entry; } ``` -Using it is also quite neat: +Using it is also quite neat and is actually very similar to how we used the `std::optional`. The only difference is that we can now get the `error` from the object we return from `GetNextChangeEntryFromUser` if it holds one: ```cpp -#include "mission_robot_assignments.hpp" -#include "user_input.hpp" -#include "types.hpp" - +// Other code above. int main() { - MissionRobotAssignments assignments{{Mission{42}, Mission{40}}, - {Robot{10}, Robot{23}}}; - assignments.Print(); - while (true) { - const auto user_wants_changes = CheckIfUserWantsChanges(); - if (!user_wants_changes) { break; } - const auto change_entry = GetNextChangeEntryFromUser(assignments); - if (!change_entry.has_value()) { - std::cerr << answer.error() << "\n"; - std::cerr << "Please try again.\n"; + Game game{{42, 49, 23}, {42, 40, 23}, 10}; + while (game.UserHasBudget()) { + const auto change_entry = GetNextChangeEntryFromUser(game); + if (!change_entry) { // Also possible: !change_entry.has_value().
+ std::cerr << change_entry.error() << std::endl; continue; } - assignments.AssignRobot(change_entry->first, change_entry->second); + game.ChangePlayerNumberIfPossible(change_entry.value()); } - assignments.Print(); + // Rest of the code. } ``` @@ -782,7 +721,7 @@ This has all the benefits we mentioned before: - Everything happens in deterministic time with no unpredictable runtime overhead - Works for functions returning `void` too -There is just one tiny issue that spoils our fun. As you've probably noticed, most of the things we covered until now in this C++ course targeted C++17, and `std::expected` is only available from C++23 on. But there is a solution to this: we can use [`tl::expected`](https://github.com/TartanLlama/expected) as a drop-in replacement for code bases that don't yet adopt C++23. +There is just one tiny issue that spoils our fun. As you've probably noticed, most of the things we covered until now in this C++ course targeted C++17 as a recent-enough but also wide-enough used standard. Unfortunately `std::expected` is only available from C++23 on. But there is a solution to this: we can use [`tl::expected`](https://github.com/TartanLlama/expected) as a drop-in replacement for code bases that don't yet adopt C++23. ## Performance Considerations for `std::optional` and `std::expected` @@ -803,24 +742,24 @@ std::expected SomeFunction(); Every return now has the size of the larger type. Don't do this! - +Mostly I've seen people using `std::error_code`, `std::string` or custom enums as an error type in `std::expected`. All of these are relatively efficient and have a small stack memory footprint. If the code base you work in has a standard - follow it, otherwise, experiment with the options I just mentioned and find one that works best for your current circumstances. 
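As a rough illustration of a small error type, here is a sketch of a custom enum that could serve as the error type in `std::expected`. The names `InputError` and `ToString` are hypothetical and not part of the lecture's code. The sketch avoids `std::expected` itself so that it also compiles with C++17:

```cpp
#include <cstdint>
#include <string>
#include <string_view>

// Hypothetical error type (an assumption for illustration only).
// It occupies a single byte, no matter how long the message is.
enum class InputError : std::uint8_t { kIndexOutOfRange, kValueNotANumber };

// Converts an error to a human-readable message only when we need one.
constexpr std::string_view ToString(const InputError error) {
  switch (error) {
    case InputError::kIndexOutOfRange: return "Index is out of range.";
    case InputError::kValueNotANumber: return "Value is not a number.";
  }
  return "Unknown error.";
}

// The enum is smaller than a std::string on common implementations.
static_assert(sizeof(InputError) < sizeof(std::string),
              "The enum should be smaller than a string.");
```

The design choice here is to keep the error itself tiny and to generate the message on demand, which keeps the size of every returned object small.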
-### Return value optimization +### Return value optimization with `std::optional` and `std::expected` -There is also one quirk with how these types interact with the return value optimization and named return value optimization in C++. These topics are quite nuanced, but in general, the rule of thumb here is quite simple: we should prefer constructing the `std::expected` and `std::optional` objects in-place rather than creating a local variable first. +There is also one quirk with how `std::optional` and `std::expected` types interact with the return value optimization (RVO) and named return value optimization (NRVO) which makes all the nice examples that we've just seen a little less nice. -For more details, I'll refer you to a [short video by Jason Turner](https://www.youtube.com/watch?v=0yJk5yfdih0) on this. +These topics are quite nuanced, and I don't want to go into many details here, but in case we care about the performance of our code that uses `std::optional` or `std::expected` we'll have to uglify our code a little bit. For more details, see this [short video by Jason Turner](https://www.youtube.com/watch?v=0yJk5yfdih0) where he covers all of these situations in depth. ## Summary -We went through quite some material today. We've looked at all the various kinds of ways to deal with errors happening in our (and somebody else's) code. As a short summary, I hope that I could convince you that these are some sane suggestions: +I believe that this concludes a more or less complete overview of how to deal with errors in C++. - +As a short summary, I hope that I could convince you that these are some sane suggestions: -- Use `CHECK` and similar macros for dealing with unrecoverable errors like programming bugs or contract violation to fail as fast as possible when they are encountered. -- Keep the test coverage of the code high to reduce chances of missing errors.
+- Use `CHECK` and similar macros for dealing with unrecoverable errors like programming bugs or contract violations in order to fail as fast as possible when they are encountered. +- Keep the test coverage of the code high to reduce chances of crashing in the released code. - Use `std::optional` as a return type when a value might be missing due to a recoverable error and the reason for the failure is not important. -- Use `std::expected` when a reason for failure *is* important to know how to recover. +- Use `std::expected` when the reason for failure *is* important for knowing how to recover from it. - Avoid exceptions in time-critical or safety-critical systems due to their non-deterministic runtime overhead. - Avoid old error handling mechanisms like returning error codes when possible. @@ -828,6 +767,6 @@ All in all, the overall direction that we seem to be following as a community is One final thing I wanted to add is that obviously, the `std::optional` class can be used also in other places, not just as a return type from a function. If some object of ours must have an optional value, using `std::optional` can be a good idea there too! But I'm sure you're going to be able to figure this out from the related [cppreference page](https://en.cppreference.com/w/cpp/utility/optional.html) on your own 😉.
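As a small sketch of such a use outside of return types, we could store an optional value as a member of a struct. The `Player` struct and the `GetDisplayName` function below are hypothetical and are not part of the game example from this lecture:

```cpp
#include <optional>
#include <string>

// Hypothetical struct (an assumption for illustration only):
// a player always has a name, but a nickname is optional.
struct Player {
  std::string name{};
  std::optional<std::string> nickname{};
};

// Returns the nickname if the player has one, the name otherwise.
std::string GetDisplayName(const Player& player) {
  return player.nickname.value_or(player.name);
}
```

Here, `value_or` gives us a convenient fallback without having to check for the presence of a value by hand.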
- diff --git a/lectures/images/docs.png b/lectures/images/docs.png new file mode 100644 index 0000000..1640d62 --- /dev/null +++ b/lectures/images/docs.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:86d8bc75f4c9e9068cf034785c799b8fe6aee80e6e9c4d95c23f6b912d50a7db +size 300075 From 3cf55bc67afee3fbd96b2e56cc3870ad3dc65f87 Mon Sep 17 00:00:00 2001 From: Igor Bogoslavskyi Date: Mon, 23 Jun 2025 00:36:52 +0200 Subject: [PATCH 18/26] Remove variant lecture from this branch --- lectures/variant.md | 98 --------------------------------------------- 1 file changed, 98 deletions(-) delete mode 100644 lectures/variant.md diff --git a/lectures/variant.md b/lectures/variant.md deleted file mode 100644 index c848826..0000000 --- a/lectures/variant.md +++ /dev/null @@ -1,98 +0,0 @@ - -`std::variant` in Modern C++ --- - -

- Video Thumbnail -

- -In the last lecture we talked about `std::optional` and `std::expected` types that make our life better. It might be useful to understand _how_ they can store two values of different types in the same memory. We can get a glimpse into this by understanding how `std::variant` works. Furthermore, we can store many more types than two in it. This, incidentally also happens to be the key to mimicking dynamic polymorphism when using templates. - - - -## Why use `std::variant`? - -`std::variant` is a type-safe `union` type introduced in C++17. It allows a variable to hold one value out of a defined set of types. - -For instance, if a variable can hold either an integer or a string, you can use `std::variant` and put any value in it: -```cpp -#include -#include -#include - -int main() { - // This compiles - std::variant value; - value = 42; // value holds an int. - std::cout << "Integer: " << std::get(value) << '\n'; - value = "42" // value now holds a string. - std::cout << "String: " << std::get(value) << '\n'; - return 0; -} -``` - -### How `std::variant` is used in practice? -While cool already, the current tiny example might feel quite limited. Think about it, we somehow have to _know_ which type our `std::variant` holds to use it. Which almost feels like it defeats the purpose. And to a degree it does. - -But we should not despair, this is C++ after all, there are options for us to use to make sure that we can work with _any_ type that the variant holds. 
This option is to use a visitor pattern through the use of the `std::visit` function: - -```cpp -#include -#include -#include - -struct Printer { - void operator(int value) const { - std::cout << "Integer: " << value << '\n'; - } - void operator(const std::string& value) const { - std::cout << "String: " << value << '\n'; - } -}; - -int main() { - std::variant value = "Hello, Variant!"; - std::visit(Printer{}, value); - value = 42; - std::visit(Printer{}, value); -} -``` -Here, `std::visit` applies a [function object](lambdas.md#before-lambdas-we-had-function-objects-or-functors) to the value contained in the variant. Should our variant hold a string, the operator that accepts a string is called and should it hold an integer instead, the operator that accepts an integer is called instead. - -Note, that a typical pitfall that beginners make is to forget that all of the checks for this code happen at compile time without taking into account the runtime logic of our code. - -If, for example, we would change our `Printer` function object to a `LengthPrinter` function object that only knows how to print length of objects, our code will not compile even though we only ever actually store an `std::string` in our variant: -```cpp -#include -#include -#include - -struct LengthPrinter { - void operator(const std::string& value) const { - std::cout << "String length: " << value.size() << '\n'; - } -}; - -int main() { - // ❌ Does not compile! - std::variant value = "Hello, Variant!"; - std::visit(LengthPrinter{}, value); -} -``` -This happens because the compiler must guarantee that all the code paths compile because it does not know which other code might be called. This might happen if some dynamic library gets linked to our code after it gets compiled. If that dynamic library actually stores an `int` in our variant the compiled code must know how to deal with it. 
- -Many people find this confusing and get burned by this at least a couple of times until it becomes very intuitive and please remember that it just takes time. - -## `std::monostate` -Whenever we create a new `std::variant` object we actually initialize it to storing some uninitialized value of the type that is first in the list of types that the variant can store. Sometimes it might be undesirable and we want the variant to be initialized in an "empty" state. For this purpose there is a type `std::monostate` in the standard library and we can define our variant type using `std::monostate` as its first type in the list. -```cpp -std::variant value{}; -// value holds an instance of std::monostate now. -``` - -Note that it probably means that we'll need to differentiate between our variant holding the `std::monostate` value or some other value in the `std::visit` that we will inevitably use at a later point in time. - - -## **Summary** - -Overall, `std::variant` is extremely important for modern C++. If we implement our code largely using templates or concepts and need to enable polymorphic behavior based on some values provided at runtime, there is probably no way for us to avoid using it. Which also means that we probably also will need to use `std::visit`. These things might well be confusing from the get go but after we've looked into how function objects and lambdas work we should have no issues using all of this machinery. 
From 99646563979cc96adcefe4b2d13c617b48b631b9 Mon Sep 17 00:00:00 2001 From: Igor Bogoslavskyi Date: Mon, 23 Jun 2025 23:00:27 +0200 Subject: [PATCH 19/26] Better comments in an example --- .../comparison_game/comparison_game.cpp | 3 +-- lectures/error_handling.md | 23 +++++++++++++++---- 2 files changed, 20 insertions(+), 6 deletions(-) diff --git a/lectures/code/error_handling/comparison_game/comparison_game.cpp b/lectures/code/error_handling/comparison_game/comparison_game.cpp index e831f2e..585b2d2 100644 --- a/lectures/code/error_handling/comparison_game/comparison_game.cpp +++ b/lectures/code/error_handling/comparison_game/comparison_game.cpp @@ -54,8 +54,7 @@ class Game { int budget_{}; }; -// Multiple issues here for now. -// We should handle failure to get a proper value. +// 😱 We should handle failure to get a proper value. ChangeEntry GetNextChangeEntryFromUser(const Game& game) { game.Print(); ChangeEntry entry{}; diff --git a/lectures/error_handling.md b/lectures/error_handling.md index 045cca5..1bd491e 100644 --- a/lectures/error_handling.md +++ b/lectures/error_handling.md @@ -120,9 +120,10 @@ class Game { void Print() const { std::cout << "Budget: " << budget_ << std::endl; - std::cout << "Computer numbers: "; + std::cout << "Reference numbers: "; for (auto number : ref_numbers_) { std::cout << number << "\t"; } - std::cout << "\nPlayer numbers: "; + std::cout << std::endl; + std::cout << "Player numbers: "; for (auto number : player_numbers_) { std::cout << number << "\t"; } std::cout << std::endl; } @@ -145,6 +146,8 @@ class Game { budget_ -= difference; } + const std::vector& ref_numbers() const { return ref_numbers_; } + const std::vector& player_numbers() const { return player_numbers_; } bool UserHasBudget() const { return budget_ > 0; } private: @@ -159,17 +162,18 @@ ChangeEntry GetNextChangeEntryFromUser(const Game& game) { ChangeEntry entry{}; std::cout << "Please enter number to change: "; std::cin >> entry.index; - std::cout << "Please 
provide a a new value: "; + std::cout << "Please provide a new value: "; std::cin >> entry.value; return entry; } int main() { - Game game{{42, 50, 23}, {42, 40, 99}, 10}; + Game game{{42, 49, 23}, {42, 40, 23}, 10}; while (game.UserHasBudget()) { const auto change_entry = GetNextChangeEntryFromUser(game); game.ChangePlayerNumberIfPossible(change_entry); } + game.Print(); if (game.CheckIfPlayerWon()) { std::cout << "You win!\n"; } else { @@ -260,6 +264,17 @@ Therefore a typical advice is to catch any wrong values that propagate through o My favorite tool for this is the [`CHECK`](https://abseil.io/docs/cpp/guides/logging#CHECK) macro that can be found in the [Abseil library](https://abseil.io/docs/). We can use it in our `ChangePlayerNumberIfPossible` to check if the index is within bounds: + ```cpp #include From a1d93095e8de0b3d5a2eb24d373b36aef463cabf Mon Sep 17 00:00:00 2001 From: Igor Bogoslavskyi Date: Mon, 23 Jun 2025 23:39:27 +0200 Subject: [PATCH 20/26] Make all code snippets compile --- lectures/error_handling.md | 393 +++++++++++++++++++++++++++++++++++-- 1 file changed, 381 insertions(+), 12 deletions(-) diff --git a/lectures/error_handling.md b/lectures/error_handling.md index 1bd491e..773d643 100644 --- a/lectures/error_handling.md +++ b/lectures/error_handling.md @@ -266,21 +266,37 @@ My favorite tool for this is the [`CHECK`](https://abseil.io/docs/cpp/guides/log ```cpp #include // Old code unchanged here. -void ChangePlayerNumberIfPossible(const ChangeEntry& change_entry) { +void Game::ChangePlayerNumberIfPossible(const ChangeEntry& change_entry) { // Checking: // (change_entry.index >= 0) // (change_entry.index < player_numbers_.size()) @@ -318,12 +334,35 @@ Let me explain myself. You see, `assert` has one super annoying flaw that makes First, we can use `assert` in a very similar way to `CHECK`: + ```cpp #include // Old code unchanged here. 
-void ChangePlayerNumberIfPossible(const ChangeEntry& change_entry) { +void Game::ChangePlayerNumberIfPossible(const ChangeEntry& change_entry) { assert(change_entry.index >= 0); assert(change_entry.index < player_numbers_.size()); auto& player_number = player_numbers_[change_entry.index]; @@ -393,6 +432,24 @@ To talk about them, let us focus on the function `GetNextChangeEntryFromUser` fr As we design our program, we know that the number index that the user provides first must be within the bounds of the player numbers vector of the `game` object: + ```cpp // 😱 We should handle failure to get a proper value. ChangeEntry GetNextChangeEntryFromUser(const Game& game) { @@ -425,13 +482,32 @@ Since C++98 we have a powerful machinery of exceptions at our disposal. An excep In our example, we could throw an object of `std::out_of_range` when the user inputs a wrong number index: + ```cpp // 😱 I'm not a fan of using exceptions. ChangeEntry GetNextChangeEntryFromUser(const Game& game) { game.Print(); ChangeEntry entry{}; std::cout << "Please enter number to change: "; - std::cin >> entry.index; // <-- This value is NOT arbitrary! + std::cin >> entry.index; if ((entry.index < 0) || (entry.index >= game.player_numbers().size())) { throw std::out_of_range("Wrong number index provided."); } @@ -445,6 +521,40 @@ Throwing an exception interrupts the normal program flow. We leave the current s Speaking of handling exceptions, we can "catch" them anywhere upstream from the place they have been thrown from. As `std::exception` is just an object, it can be caught by value or by reference. It is considered best practice to catch them by reference. In our case we can catch either an `std::out_of_range` exception directly or, as `std::out_of_range` derives from `std::exception`, catch `std::exception` instead: + ```cpp // Old unchanged code. @@ -505,6 +615,32 @@ Furthermore, the language permits the use of generic catch blocks like `catch (. 
In our own example, if, the `ChangePlayerNumberIfPossible` function throws an undocumented `std::runtime_error` exception but we only expect `std::out_of_range`, we don't have a good way of detecting this! Our only options are to allow such exceptions to be left unhandled and eventually to terminate our program or to add a `catch(...)` clause where we miss important context for what caused the error, making catching it a lot less useful: + ```cpp int main() { Game game{{42, 49, 23}, {42, 40, 23}, 10}; @@ -548,6 +684,17 @@ Now is a time to return to the other option of detecting errors in functions tha I would say that there are three distinct ways of thinking about it. Let's illustrate all of them on a function we've already looked at: + ```cpp ChangeEntry GetNextChangeEntryFromUser(const Game& game); ``` @@ -557,6 +704,17 @@ We can: 1. Keep the return type, `ChangeEntry` in our case, but return a special **value** of this type in case of failure, say a value-initialized `ChangeEntry{}` object; 2. Return an **error code**, which would change the signature of the function to return `int` or a similar type instead: + ```cpp int GetNextChangeEntryFromUser(const Game& game, ChangeEntry& result); ``` @@ -569,6 +727,30 @@ I believe that the third option is the best out of these three, but it will be e There is a number of issues with returning a special value from a function without using a special return type. As an illustration, in our case, a naïve choice would be to return a value-initialized `ChangeEntry{}` object from the `GetNextChangeEntryFromUser` function if the user provided a wrong number index. This object will essentially hold zeros for its `index` and `value` entries. However, as you might imagine, a pair of zeros *is* a completely valid, if slightly useless, change entry! How do we disambiguate this value from a valid one? + ```cpp // 😱 Not a great idea! 
ChangeEntry GetNextChangeEntryFromUser(const Game& game) { @@ -591,27 +773,101 @@ Many other real-world functions will face the same issues which makes this metho Returning an error code instead solves at least a couple of these issues. It is fast and reliable and we can design our software with different error codes in mind so that the reason for the failure is also communicated to us. This is also still the prevalent way of handling errors in C and in some libraries that we can find in the wild, so there *is* some merit to this method. -However, if our function actually must *return* a value, which most of the functions do, the only way to use error codes is to change its return type to the type that our error codes have, like `int`, which forces us to provide an additional output parameter to our function, like `ChangeEntry& result`: +However, if our function actually must *return* a value, which most of the functions do, the only way to use error codes is to change its return type to the type that our error codes have, like `int`, which forces us to provide an additional output parameter to our function, like `ChangeEntry& result`, which we subsequently fill inside of the body of our function: + ```cpp -int GetNextChangeEntryFromUser(const Game& game, ChangeEntry& result); +// 😱 Mostly not a great idea in C++. +int GetNextChangeEntryFromUser(const Game& game, ChangeEntry& result) { + game.Print(); + std::cout << "Please enter number to change: "; + int index{}; + std::cin >> index; + if ((index < 0) || (index >= game.player_numbers().size())) { + return kError; // Probably some constant defined elsewhere. + } + result.index = index; + std::cout << "Please provide a new value: "; + std::cin >> result.value; + return kSuccess; // Typically a 0, but using a constant is better. +} ``` The main issue with this from my point of view is that it is clunky, mixes input/output in the signature, and limits functional composition. 
Consider how we would use this function: + ```cpp // Old code above. +std::string GetFailureReason(int constant) { + // Some logic to return a message. + if (constant == kError) { return "Provided index is out of range."; } + return "Unknown error encountered."; +} + int main() { Game game{{42, 49, 23}, {42, 40, 23}, 10}; while (game.UserHasBudget()) { // Cannot be const, cannot use auto, have to allocate. ChangeEntry change_entry{}; const auto error_code = GetNextChangeEntryFromUser(game, change_entry); - if (error_code != 0) { - std::cerr << GetReason(error_code) << std::endl; + if (error_code != kSuccess) { + std::cerr << GetFailureReason(error_code) << std::endl; continue; } game.ChangePlayerNumberIfPossible(change_entry); @@ -634,6 +890,33 @@ So clearly, there are some issues with this method too. I believe it has its mer I believe that there *is* a better way though. With C++17, we gained [`std::optional`](https://en.cppreference.com/w/cpp/utility/optional.html) with which we can express that a function “might return a value” by returning an object of `std::optional` type. We can create this object either empty, holding a so-called `std::nullopt`, or from a valid `ChangeEntry`: + ```cpp #include @@ -654,6 +937,32 @@ std::optional GetNextChangeEntryFromUser(const Game& game) { We can use our newly-returned optional object in an `if` statement to find out if it actually holds a valid change entry. If the object does not hold a value, we show an error and continue asking the user to provide better input but if the object *does* hold a value, we can get to it by calling its `value()` method or using a dereferencing operators `*` and `->`, just like we did with pointers: + ```cpp // Other code above. 
int main() { @@ -686,12 +995,31 @@ Enter [`std::expected`](https://en.cppreference.com/w/cpp/utility/expected.html) With `std::expected` we could do the same things we could do with `std::optional` and more by changing our function accordingly: -```cpp -std::expected GetAnswerFromLlm(const std::string& question); -``` + ```cpp #include #include @@ -711,8 +1039,38 @@ std::expected GetNextChangeEntryFromUser(const Game& g } ``` +Essentially, `std::expected` holds one of two values of two potentially different types - an expected (`ChangeEntry` in our case) or an unexpected one (`std::string` in our tiny example). Now we can return either a valid result, *or* an error message. + Using it is also quite neat and is actually very similar to how we used the `std::optional`, the only difference is that we can now get the `error` from the object we return from `GetNextChangeEntryFromUser` if it holds one: + ```cpp // Other code above. int main() { @@ -750,6 +1108,17 @@ In the case of `std::optional` this does not play much of a difference, as the " So this is something to avoid: + ```cpp // Bad idea, wasting memory 😱 std::expected SomeFunction(); From 4f8afd5abd62bc6829a6fa17d3aa0c3adb57e0cf Mon Sep 17 00:00:00 2001 From: Igor Bogoslavskyi Date: Tue, 24 Jun 2025 00:36:32 +0200 Subject: [PATCH 21/26] Another read-through and minor updates --- lectures/error_handling.md | 127 +++++++++++++++++++------------------ 1 file changed, 66 insertions(+), 61 deletions(-) diff --git a/lectures/error_handling.md b/lectures/error_handling.md index 773d643..acc769a 100644 --- a/lectures/error_handling.md +++ b/lectures/error_handling.md @@ -16,13 +16,13 @@ - [Catch them as early as possible](#catch-them-as-early-as-possible) - [Use `CHECK` macro to fail early](#use-check-macro-to-fail-early) - [Don't use `assert`](#dont-use-assert) - - [Complete the example yourself](#complete-the-example-yourself) + - [Complete the `Game` class yourself](#complete-the-game-class-yourself) - [How to 
minimize number of unrecoverable errors](#how-to-minimize-number-of-unrecoverable-errors) - [Recoverable errors: **handle and proceed**](#recoverable-errors-handle-and-proceed) - [Exceptions](#exceptions) - [How to use exceptions](#how-to-use-exceptions) - [A case for exceptions for both "recoverable" and "unrecoverable" errors](#a-case-for-exceptions-for-both-recoverable-and-unrecoverable-errors) - - [Why not just use exceptions](#why-not-just-use-exceptions) + - [Why we might not want to use exceptions](#why-we-might-not-want-to-use-exceptions) - [Exceptions are (sometimes) expensive](#exceptions-are-sometimes-expensive) - [Exceptions hide the error path](#exceptions-hide-the-error-path) - [Exceptions are banned in many code bases](#exceptions-are-banned-in-many-code-bases) @@ -40,7 +40,9 @@ When writing C++ code, much like in real life, we don’t always get what we wan Today we’re talking about error handling. What options we have, which trade-offs they come with, and what tools modern C++ gives us to make our lives a bit easier. -Buckle up! There is a lot to cover and quite some nuance in this topic! There will also inevitably be some statements that are quite opinionated and I can already see people with pitchforks and torches coming for me... so... I'm sure it's gonna be fun! +So get a coffee and buckle up! There is a lot to cover and quite some nuance in this topic! + +Ah, and I can already smell the torches and hear the scraping of the pitchforks that people will bring to punish me for all the controversial statements I am about to make! What can go wrong, eh? - + + # What Do We Mean by “Error”? Before we go into how to handle errors, however, let’s clarify what we mean when we think about an "error" in programming. -At the highest level: an error is something that happens when a program doesn’t produce the result we expect. 
+At the highest level: **an error is something that happens when a program doesn’t produce the result we expect.** I tend to think of errors belonging to one of two broad groups: @@ -182,19 +186,23 @@ int main() { } ``` +This is not the most elegant code out there, but it is not too far from the style of code we might encounter in the wild. + +And by the way, all of this code is, as always, [available](code/error_handling/comparison_game/comparison_game.cpp) for you to play around on the accompanying GitHub page. + We can build this program as a single executable directly from the command line: ```cmd c++ -std=c++17 -o comparison_game comparison_game.cpp ``` -Ideally, we should test all of our functions, but for now let's just give it a play-through instead! +Ideally, we should at least unit-test all of our functions, but it will serve as a great cautionary tale if we just give it a play-through instead! # Unrecoverable errors: **fail early** ## Our first unrecoverable error encounter -We run our executable and are greeted with expected numbers as well as a prompt to change one of our numbers: +We run our executable and are greeted with expected numbers as well as a prompt to change one of *our* numbers: ```output λ › ./comparison_game @@ -204,7 +212,7 @@ Player numbers: 42 40 23 Please enter number to change: ``` -We tie the reference numbers in the first and third columns but lose in the second. So our only chance to win is to change the value `40` to `50`, which is also what our budget allows for! So we provide these numbers to our program and observe what happens: +We tie the reference numbers in the first and third columns but lose in the second one. So our only chance to win is to change the value `40` to `50`, which is also what our budget allows for! So we provide these numbers to our program and observe what happens: ```output λ › ./comparison_game @@ -221,11 +229,13 @@ Please enter number to change: But wait, what's going on here? 
Why did our number in the second column not change? Why is our budget not decreased by `10`? Even more strangely, why did the first reference number change to `50`? -The answer to all of these questions is that we have just encountered our first unrecoverable error that manifests itself in wrong values in our memory through the "virtues" of Undefined Behavior. But what gives? +The answer to all of these questions is that we have just encountered an unrecoverable error that manifests itself in wrong values in our memory through the "virtues" of Undefined Behavior. But what gives? + +Well, there is a chain of events that caused our values to be changed in ways that we don't expect and we'll be digging through all of these and more in the remainder of today's lecture. -Well, there is a chain of events that caused our values to be changed in ways that we don't expect and we'll be digging through all of these in the remainder of today's lecture. +But the most immediate cause is that the user has mistakenly provided the number they wanted to change, `40`, rather than an index of that number, `1`. We then did not check that the provided "index" is within the bounds of our number array and wrote the provided new value directly into the number array under this wrong index. -But the most immediate cause is that the user has mistakenly provided the number they wanted to change, `40`, rather than an index of that number, `1`. We then did not check that provided "index" and wrote the provided new value directly into it. If we rerun our game again and provide `1` as the first input, we win, just as we expect! +If we correct our mistake, rerun our game again and provide `1` as the first input, we win, just as we expect! ```output λ › ./comparison_game @@ -242,11 +252,13 @@ You win! When we provide `40` as we did the first time, our wrong index is far beyond the size of the `player_numbers_` vector and, when we write beyond its bounds we enter the Undefined Behavior land. 
-What happens next is unpredictable. If we are lucky and the address into which we write does not belong to our program, the program will crash. +What happens next is **unpredictable**. If we are *lucky* and the address into which we write does **not** belong to our program, the program will crash. -If we are not lucky however, we will rewrite *some* memory that belongs to our program, potentially corrupting any object that actually owns that memory. In this particular example, I picked the values in such a way, that the "fake index" just happens to be equal to a difference in pointers to the data of the `player_numbers_` and `ref_numbers_` vectors. Which then results in us writing directly into the first element of the `ref_numbers_` vector, resulting in an update to reference numbers. +If we are *not* lucky however, we will rewrite *some* memory that belongs to our program, potentially corrupting any object that actually owns that memory. In this particular example, I picked the values in such a way, that the "fake index" just happens to be equal to a difference in pointers `player_numbers_.data()` and `ref_numbers_.data()`. Which then results in us writing directly into the first element of the `ref_numbers_` vector, resulting in an unexpected update to reference numbers. -But I want to stress again, that most likely if you run the same program on your machine - you will get a different behavior altogether! Even the order of `ref_numbers_` and `player_numbers_` in memory is not guaranteed, note how on my machine they do not even follow the order of declaration! +But I want to stress again that if we run the same program on another machine - we will most likely get a *different behavior* altogether! + +Even the order of `ref_numbers_` and `player_numbers_` in memory is not guaranteed, note how on my machine they do not even follow the order of declaration! 
What *doesn't* change is that once the `ChangePlayerNumberIfPossible` method is called with a wrong `change_entry` in our example, all bets are off - **we do not have any guarantees on the consistency of the state of our program anymore** @@ -268,7 +280,6 @@ My favorite tool for this is the [`CHECK`](https://abseil.io/docs/cpp/guides/log `CPP_SETUP_START` #include -#define CHECK(expr) #define CHECK_GE(expr, value) #define CHECK_LT(expr, value) @@ -326,11 +337,11 @@ For my money, in most cases, the benefits far outweigh the costs, and, unless we ### Don't use `assert` -You might wonder if using `CHECK` if our only way and so I have to talk about one very famous alternative here that is often recommended on the Internet. This alternative is to use [`assert`](https://en.cppreference.com/w/cpp/error/assert.html) in place of `CHECK`. The `assert` statement can be found in the `` include file. I'm not a fan of using `assert`. +You might wonder if using `CHECK` is our only way to help us detect when we are in an inconsistent state and so I have to talk about one very famous alternative here that is often recommended on the Internet. This alternative is to use [`assert`](https://en.cppreference.com/w/cpp/error/assert.html). The `assert` statement can be found in the `` include file. I'm not a fan of using `assert`. -What is this smell? Is something on fire? Ah, it's the people with torches and pitchforks again coming for me for not liking `assert`! +If you came here with your torches and pitchforks, this is probably a good time to take them out! 😉 -Let me explain myself. You see, `assert` has one super annoying flaw that makes it impossible for me to recommend it for production code. I've seen sooo many bugs stemming from this! But let me show what I'm talking about on our game example. +But let me explain myself. You see, `assert` has one super annoying flaw that makes it impossible for me to recommend it for production code. I've seen sooo many bugs stemming from this! 
But let me show what I'm talking about using our game example. First, we can use `assert` in a very similar way to `CHECK`: @@ -375,7 +386,13 @@ void Game::ChangePlayerNumberIfPossible(const ChangeEntry& change_entry) { // Old code unchanged here. ``` -Now, if we compile and run our game just as we did before - the assertion will trigger: +We compile and run our game just as we did before: + +```cmd +c++ -std=c++17 -o comparison_game comparison_game.cpp +``` + +And if we run our program the assertion will trigger: ```output output.s: /app/example.cpp:44: void Game::ChangePlayerNumberIfPossible(const ChangeEntry &): Assertion `change_entry.index < player_numbers_.size()' failed. @@ -384,9 +401,9 @@ Program terminated with signal: SIGSEGV So far so good, right? Using `assert` also crashes our program when the wrong input is provided and shows us where the wrong value was detected. -So what is that annoying flaw I've been talking about that makes me dislike `assert`? Well, you see, all `assert` statements get *disabled* when a macro `NDEBUG` is defined. This is a standard macro name that controls if the debug symbols get compiled into the binary and gets passed to the compilation command for most release builds as we generally don't want debug symbols in the binary we release. So essentially, `assert` **does not protect us from undefined behavior in the code we actually deploy**! +So what is that annoying flaw I've been talking about that makes me dislike `assert`? Well, you see, all `assert` statements get *disabled* when a macro `NDEBUG` is defined. This is a standard macro name that switches off `assert` and gets passed to the compilation command for most release builds, as we generally don't want to pay for debug-only checks in the binary we release. So essentially, `assert` almost always **does not protect us from undefined behavior in the code we actually deploy**! 
-We can easily demonstrate that the `asserts` indeed get *compiled out* by adding `-DNDEBUG` flag to our compilation command: +We can easily demonstrate that the `assert`s indeed get *compiled out* by adding the `-DNDEBUG` flag to our compilation command: ```cmd c++ -std=c++17 -DNDEBUG -o comparison_game comparison_game.cpp @@ -394,7 +411,9 @@ c++ -std=c++17 -DNDEBUG -o comparison_game comparison_game.cpp Running our game *now* and providing a wrong input index leads to the same undefined behavior we observed before as all of the assertions were compiled out. Not great, right? -### Complete the example yourself + + +### Complete the `Game` class yourself By the way, we've only covered how we could improve our `ChangePlayerNumberIfPossible` method of the `Game` class. Do you think our `CheckIfPlayerWon` function would benefit from the same treatment? @@ -402,25 +421,25 @@ By the way, we've only covered how we could improve our `ChangePlayerNumberIfPos ## How to minimize number of unrecoverable errors -Of course, hard failures in the programs we ship is also not ideal! +Of course, hard failures in the programs we ship are not ideal! -One way to reduce the risk of such failures is to keep the test coverage high for the code we write, ideally close to 100% line and branch coverage, i.e., every line and logical branch gets executed at least once in our test suite. +One way to reduce the risk of such failures is to keep the test coverage high for the code we write, ideally close to 100% line and branch coverage, i.e., every line and logical branch gets executed at least once in our test suite, which is run regularly and automatically. This way we catch most of the unrecoverable errors during development. In some industries, like automotive, aviation, or medical this is actually a legal requirement. -But unfortunately, despite our best efforts, we cannot *completely* avoid failures in the programs we ship! 
+But unfortunately, despite our best efforts, we cannot *completely* avoid catastrophic failures in the programs we ship! Even if we do everything right on our side, hardware can still fail and corrupt our memory. One fun example of this is the famous error in the Belgian election on the 18th of May 2003, where [one political party got 4096 extra votes](https://en.wikipedia.org/wiki/Electronic_voting_in_Belgium) due to what is believed to have been a cosmic ray flipping "a bit at the position 13 in the memory of the computer", essentially leading to 4096 more votes. -This visualization is actually taken from this [excellent Veritasium video](https://www.youtube.com/watch?v=AaZ_RSt0KP8). Do give it a watch, it speaks about this and other similar cases much more in-depth! +This visualization is actually taken from an [excellent Veritasium video](https://www.youtube.com/watch?v=AaZ_RSt0KP8) on this topic. Do watch it if you want to know more! -With the knowledge that we cannot completely remove the risk of hitting an unrecoverable error in production, and that we also cannot just outright fail, in safety-critical systems, we often isolate components into separate processes or even separate hardware units, with watchdogs that can trigger recovery actions if one of our components suddenly crashes. +With the knowledge that we cannot completely remove the risk of hitting an unrecoverable error in production, and that we also probably don't want our, say, flight software to just randomly die during operation, in safety-critical systems we often isolate components into separate processes or even separate hardware units, with watchdogs that can trigger recovery actions if one of our components suddenly crashes. This way we can have our cake and eat it at the same time: using `CHECK` minimizes the time-to-failure when a bug is encountered, while our fallback options keep the system safe as a whole even when certain components fail. 
-That being said, this is a system architecture question and this topic is far beyond what I want to talk about today. In most non-safety-critical systems we do not need to think about these failure cases as deeply and we can usually just restart our program in case of a one-off failure. But, if you're interested in this topic, In an introductory lecture to the Self Driving Cars course at the University of Bonn that I've given some years ago I've dedicated a significant part towards the end of that lecture [to this topic](https://youtu.be/DtRktn4bVWg?si=DJuU8OjxtBcj5o2C). So do give it a watch if you're interested. +That being said, this is a system architecture question and this topic is far beyond what I want to talk about today. In most non-safety-critical systems we do not need to think about these failure cases as deeply and we can usually just restart our program in case of a one-off failure. But, if it sounds interesting to you, in an [introductory lecture to the Self Driving Cars course](https://youtu.be/DtRktn4bVWg?si=DJuU8OjxtBcj5o2C) at the University of Bonn, I've dedicated a significant part towards the end to this topic. # Recoverable errors: **handle and proceed** @@ -428,9 +447,9 @@ But not *every* error should instantly crash our program! Indeed, in our example The good thing about user inputs is that we can ask the user to correct these without crashing! The type of errors we encounter here can be called **recoverable errors**. -To talk about them, let us focus on the function `GetNextChangeEntryFromUser` from our example. Currently, there is no validation of what the user inputs but we absolutely *can* and *should* perform such validation! +To talk about them, let us focus on the function `GetNextChangeEntryFromUser` from our example. Currently, there is no validation of what the user provides as input but we absolutely *can* and *should* perform such validation! 
-As we design our program, we know that the number index that the user provides first must be within the bounds of the player numbers vector of the `game` object: +As we design our program, we know that the index that the player provides must be within the bounds of the player numbers vector in the `game` object: -But until the C++ community figures out what to do, we are stuck with many people being unable to use the default error handling mechanism in C++. +But until the C++ community figures out what to do, we are stuck with many people being unable to use the default error handling mechanism in C++. 🤷‍♂️ So what *do* we do? @@ -701,25 +722,9 @@ ChangeEntry GetNextChangeEntryFromUser(const Game& game); We can: -1. Keep the return type, `ChangeEntry` in our case, but return a special **value** of this type in case of failure, say a value-initialized `ChangeEntry{}` object; -2. Return an **error code**, which would change the signature of the function to return `int` or a similar type instead: - - - ```cpp - int GetNextChangeEntryFromUser(const Game& game, ChangeEntry& result); - ``` - -3. Return a **different type** specifically designed to encode failure states alongside the actual return, like `std::optional` which only holds a valid `ChangeEntry` in case of success. +1. Keep the return type, `ChangeEntry` in our case, but return a special **value** of this type in case of failure; +2. Return an **error code** from our function to indicate success or failure; +3. Return a **different type** specifically designed to encode failure states alongside the actual return. I believe that the third option is the best out of these three, but it will be easier to explain why I think so after we talk about why the first two are not cutting it. 
@@ -1116,8 +1121,8 @@ using HugeErrorObject = int; $PLACEHOLDER `CPP_SETUP_END` -`CPP_COPY_SNIPPET` error_handling/get_change_expected_main.cpp -`CPP_RUN_CMD` CWD:error_handling c++ -std=c++23 -c get_change_expected_main.cpp +`CPP_COPY_SNIPPET` error_handling/expected_bad.cpp +`CPP_RUN_CMD` CWD:error_handling c++ -std=c++23 -c expected_bad.cpp --> ```cpp // Bad idea, wasting memory 😱 From a8ca676ca68596fb98db5062e66d78cc73e3eaa2 Mon Sep 17 00:00:00 2001 From: Igor Bogoslavskyi Date: Tue, 24 Jun 2025 23:30:11 +0200 Subject: [PATCH 22/26] Finalize text --- lectures/error_handling.md | 181 ++++++++++++++++++------------------- 1 file changed, 87 insertions(+), 94 deletions(-) diff --git a/lectures/error_handling.md b/lectures/error_handling.md index acc769a..63cea34 100644 --- a/lectures/error_handling.md +++ b/lectures/error_handling.md @@ -35,14 +35,15 @@ - [Error type size matters](#error-type-size-matters) - [Return value optimization with `std::optional` and `std::expected`](#return-value-optimization-with-stdoptional-and-stdexpected) - [Summary](#summary) + - [Other use of `std::optional`](#other-use-of-stdoptional) -When writing C++ code, much like in real life, we don’t always get what we want. The good news is that C++ comes packed with the tools to let us be prepared for this eventuality! +When writing C++ code, much like in real life, we don’t always get what we want. The good news is that, unlike real life, C++ comes packed with the tools that help us prepare for not getting what we want. -Today we’re talking about error handling. What options we have, which trade-offs they come with, and what tools modern C++ gives us to make our lives a bit easier. +Today we’re talking about error handling. What options we have and which trade-offs they come with. So get a coffee and buckle up! There is a lot to cover and quite some nuance in this topic! 
-Ah, and I can already smell the torches and hear the scraping of the pitchforks that people will bring to punish me for all the controversial statements I am about to make! What can go wrong, eh? +Ah, and I can already smell the torches and hear the scraping of the pitchforks that people will inevitably bring to punish me for all the controversial statements I am about to make! What can go wrong, eh? @@ -68,7 +69,7 @@ Now, back to error handling. --> Before we go into how to handle errors, however, let’s clarify what we mean when we think about an "error" in programming. -At the highest level: **an error is something that happens when a program doesn’t produce the result we expect.** +At the highest level: **an error is what happens when a program doesn’t produce the result we expect.** I tend to think of errors belonging to one of two broad groups: @@ -77,31 +78,33 @@ I tend to think of errors belonging to one of two broad groups: Some languages, like Rust, bake this distinction [directly into the language design](https://doc.rust-lang.org/book/ch09-00-error-handling.html). C++ doesn’t, making the topic of error handling slightly more nuanced. -But, for my money, this classification, while not universal, is still useful. So let's talk a bit more in-depth about these kinds of errors and the intuition behind them. +But, for my money, this classification, while not universal, is still useful. So let me present my case and talk a bit more in-depth about these kinds of errors and the intuition behind them. # Setting up the example: **a comparison game** ## Rules of the game -There is a lot of ground to cover here and to not get lost, I would like to introduce a small example that will guide us and help illustrate all of the concepts we are talking about today. +We have a lot to talk about and to not get lost, I would like to introduce a small example that will guide us and help illustrate all of the concepts we are talking about today. 
-To this end, let's model a simple puzzle game. In this game a player start with an array of numbers generated for them. This array gets compared to some reference array, also generated for this game. The player wins if, when comparing numbers one-by-one, they have the higher number more times. +To this end, let's model a simple puzzle game. In this game a player starts with a vector of numbers generated for them. At the end of the game, this vector gets compared to some, also generated, reference vector. The player wins if, when comparing numbers one-by-one, they have the higher number more times. -To make it an actual *game*, we need to give the player at least *some* control over their numbers. So we allow them to use a certain budget that can be used to increase the any in their array. +To make it an actual *game*, we need to give the player at least *some* control over their numbers. So we allow them to use a certain budget that can be used up to increase any numbers in their vector. The goal is to use the budget cleverly to win the game. ## Initial code of the game -Now let's spend a couple of minutes to set up the code for all what we've just discussed. +Now let's spend a couple of minutes to set up the code for everything we've just discussed. We won't model all of the potential complexity of this game, but I urge you to play around with these ideas after watching this lecture and see if you can build something that is actually interesting to play! -To start off, we'll probably need a class `Game` that would hold the reference as well as the player numbers. It also needs a way: +To start off, we'll probably need a class `Game` that would hold the reference and player numbers as well as the remaining budget. 
It also needs a way: - to print the current state of the game; -- to check if they player won by comparing the player's numbers with reference ones one-by-one and keeping the score; +- to check if the player won by comparing the player's numbers with reference ones one-by-one and keeping the score; - to change the player number if there is still budget for this provided a `ChangeEntry` object, a tiny `struct` with `index` and `value` in our case. -This `change_entry` must come from somewhere, so we need a way to ask the player to provide it. We can encapsulate our user interaction into a function like `GetNextChangeEntryFromUser` which will print the current state of the game, ask the user for their input and fill a `change_entry` object using this input. +This `change_entry` must come from somewhere, so we need a way to ask the player to provide it. We can encapsulate our user interactions into a function like `GetNextChangeEntryFromUser` which will print the current state of the game, ask the user for their input and fill a `change_entry` object using this input. -To keep this example simple, we implement all of this in one `cpp` file alongside a simple `main` function that creates a `Game` object, asks the user to provide a desired `change_entry` and changing the player's numbers accordingly in a loop until the user runs out of budget. Finally, we check if the player has won the game and show them the result. +To keep this example simple, we implement all of this in one `cpp` file alongside a simple `main` function that creates a `Game` object, asks the user to provide a desired `change_entry` and changes the player's numbers accordingly in a loop until the user runs out of budget. + +Finally, we check if the player has won the game and show them the result. ```cpp #include @@ -188,7 +191,7 @@ int main() { This is not the most elegant code out there, but it is not too far from the style of code we might encounter in the wild. 
-And by the way, all of this code is, as always, [available](code/error_handling/comparison_game/comparison_game.cpp) for you to play around on the accompanying GitHub page. +And by the way, all of this code is, as always, [available](code/error_handling/comparison_game/comparison_game.cpp) for you to play around with on the accompanying GitHub page. We can build this program as a single executable directly from the command line: @@ -196,7 +199,7 @@ We can build this program as a single executable directly from the command line: c++ -std=c++17 -o comparison_game comparison_game.cpp ``` -Ideally, we should at least unit-test all of our functions, but it will serve as a great cautionary tale if we just give it a play-through instead! +Ideally, we should at least unit-test all of our functions, but it will serve as a nice cautionary tale if we just give it a play-through instead! # Unrecoverable errors: **fail early** @@ -227,15 +230,15 @@ Player numbers: 42 40 23 Please enter number to change: ``` -But wait, what's going on here? Why did our number in the second column not change? Why is our budget not decreased by `10`? Even more strangely, why did the first reference number change to `50`? +Docs are always out of date -The answer to all of these questions is that we have just encountered an unrecoverable error that manifests itself in wrong values in our memory through the "virtues" of Undefined Behavior. But what gives? +🤯 But wait, what's going on here? Why did our number in the second column not change? Why is our budget not decreased by `10`? Even more strangely, why did the first reference number change to `50`? -Well, there is a chain of events that caused our values to be changed in ways that we don't expect and we'll be digging through all of these and more in the remainder of today's lecture. 
+The answer to all of these questions is that we have just encountered an **unrecoverable error** that manifests itself in wrong values in our memory through the "virtues" of **Undefined Behavior**. But what gives? -But the most immediate cause is that the user has mistakenly provided the number they wanted to change, `40`, rather than an index of that number, `1`. We then did not check that the provided "index" is within the bounds of our number array and wrote the provided new value directly into the number array under this wrong index. +Well, there is a chain of events that caused our values to be changed in ways that we didn't expect. The most immediate cause is that the user has mistakenly provided the **number** they wanted to change, `40`, rather than an **index** of that number, `1`. We then did not check that the provided "index" is within the allowed bounds of the player numbers vector and wrote the provided new value under this wrong index. -If we correct our mistake, rerun our game again and provide `1` as the first input, we win, just as we expect! +If we correct our mistake by rerunning our game again and providing `1` as the first input, we win, just as we expected to! ```output λ › ./comparison_game @@ -254,11 +257,13 @@ When we provide `40` as we did the first time, our wrong index is far beyond the What happens next is **unpredictable**. If we are *lucky* and the address into which we write does **not** belong to our program, the program will crash. -If we are *not* lucky however, we will rewrite *some* memory that belongs to our program, potentially corrupting any object that actually owns that memory. In this particular example, I picked the values in such a way, that the "fake index" just happens to be equal to a difference in pointers `player_numbers_.data()` and `ref_numbers_.data()`. Which then results in us writing directly into the first element of the `ref_numbers_` vector, resulting in an unexpected update to reference numbers. 
+If we are *not* lucky however, we will rewrite *some* memory that belongs to our program, potentially corrupting any objects that actually own that memory. + +In this particular example, I picked the values in such a way, that the "fake index" just happens to be equal to a difference in pointers `player_numbers_.data()` and `ref_numbers_.data()`. Which then results in us writing directly into the first element of the `ref_numbers_` vector, resulting in an unexpected update to the reference numbers. But I want to stress again that if we run the same program on another machine - we will most likely get a *different behavior* altogether! -Even the order of `ref_numbers_` and `player_numbers_` in memory is not guaranteed, note how on my machine they do not even follow the order of declaration! +Even the order of `ref_numbers_` and `player_numbers_` in memory is not guaranteed, note how during this run on my machine they do not even follow the order of declaration! What *doesn't* change is that once the `ChangePlayerNumberIfPossible` method is called with a wrong `change_entry` in our example, all bets are off - **we do not have any guarantees on the consistency of the state of our program anymore** @@ -323,25 +328,29 @@ void Game::ChangePlayerNumberIfPossible(const ChangeEntry& change_entry) { // Old code unchanged here. ``` -If we run our example now, we will get a crash as soon as we call the `ChangePlayerNumberIfPossible` function that clearly states where this error originated and which check failed letting us debug this as easily as possible: +If we run our example now, we will get a crash as soon as we call the `ChangePlayerNumberIfPossible` function that clearly states where this error originated from and which check failed letting us debug this as easily as possible: ```output F0000 00:00:1750605447.566908 1 example.cpp:44] Check failed: change_entry.index < player_numbers_.size() (40 vs. 
3) ``` -One concern that people have when thinking of using the `CHECK` macros is performance as these checks stay in the code we ship and do cost some time when our program runs. +To the best of my knowledge, using `CHECK`-like macros to check pre-conditions of functions is widely considered a good practice. These `CHECK`s can be violated either due to a bug in our program or due to some undefined behavior just like in our example from earlier. -For my money, in most cases, the benefits far outweigh the costs, and, unless we've measured that we cannot allow the tiny performance hit in a particular place of our code, we should be free to use `CHECK` for safety against entering the Undefined Behavior land. +One concern that people have when thinking of using the `CHECK` macros is performance as these checks stay in the production code we ship and do cost a little time when our program runs. - +For my money, in most cases, the benefits far outweigh the costs, and, unless we've measured that we cannot allow the tiny performance hit in a particular place of our code, we should be free to use `CHECK` for safety against entering the Undefined Behavior land and saving us days, weeks or even months of debugging issues that are really hard to debug. + + ### Don't use `assert` -You might wonder if using `CHECK` is our only way to help us detect when we are in an inconsistent state and so I have to talk about one very famous alternative here that is often recommended on the Internet. This alternative is to use [`assert`](https://en.cppreference.com/w/cpp/error/assert.html). The `assert` statement can be found in the `` include file. I'm not a fan of using `assert`. +You might wonder if using `CHECK` is our only way to help us detect when we are in an inconsistent state and so I have to talk about one very famous alternative that is often recommended on the Internet. 
+ +This alternative is to use the [`assert`](https://en.cppreference.com/w/cpp/error/assert.html) statement, which can be found in the `<cassert>` include file. Full disclosure: I'm not a fan of using `assert`. -If you came here with your torches and pitchforks, this is probably a good time to take them out! 😉 +If you came here with your torches and pitchforks, this is probably a good time to get them ready! 😉 -But let me explain myself. You see, `assert` has one super annoying flaw that makes it impossible for me to recommend it for production code. I've seen sooo many bugs stemming from this! But let me show what I'm talking about using our game example. +But let me explain myself before you do anything drastic. You see, `assert` has one super annoying flaw that makes it impossible for me to recommend it for production code. I've seen sooo many bugs go unnoticed because of this! Let me show what I'm talking about using our game example. First, we can use `assert` in a very similar way to `CHECK`: @@ -389,10 +398,10 @@ void Game::ChangePlayerNumberIfPossible(const ChangeEntry& change_entry) { We compile and run our game just as we did before: ```cmd -c++ -std=c++17 -DNDEBUG -o comparison_game comparison_game.cpp +c++ -std=c++17 -o comparison_game comparison_game.cpp ``` -And if we run our program the assertion will trigger: +And if we run our program the assertion triggers: ```output output.s: /app/example.cpp:44: void Game::ChangePlayerNumberIfPossible(const ChangeEntry &): Assertion `change_entry.index < player_numbers_.size()' failed. @@ -401,7 +410,7 @@ Program terminated with signal: SIGSEGV So far so good, right? Using `assert` also crashes our program when the wrong input is provided and shows us where the wrong value was detected. -So what is that annoying flaw I've been talking about that makes me dislike `assert`? Well, you see, all `assert` statements get *disabled* when a macro `NDEBUG` is defined. 
This is a standard macro name that controls if the debug symbols get compiled into the binary and gets passed to the compilation command for most release builds as we generally don't want debug symbols in the binary we release. So essentially, `assert` almost always **does not protect us from undefined behavior in the code we actually deploy**! +So what is that annoying flaw I've been talking about that makes me dislike `assert`? Well, you see, all `assert` statements get *disabled* when a macro `NDEBUG` is defined. This is a standard macro that switches assertions off. It has nothing to do with debug symbols, but it does get passed to the compilation command for most release builds, as build systems commonly define it in their release configurations. So essentially, `assert` almost always **does not protect us from undefined behavior in the code we actually deploy**! We can easily demonstrate that the `assert`s indeed get *compiled out* by adding `-DNDEBUG` flag to our compilation command: @@ -409,10 +418,12 @@ We can easily demonstrate that the `assert`s indeed get *compiled out* by adding c++ -std=c++17 -DNDEBUG -o comparison_game comparison_game.cpp ``` -Running our game *now* and providing a wrong input index leads to the same undefined behavior we observed before as all of the assertions were compiled out. Not great, right? +Running our game *now* and providing a wrong input index leads to the same undefined behavior we observed before as all of the assertions were compiled out. Not great, right? What makes it even less great is that many people just don't know that `assert`s get disabled like that and are sure that they are protected, while they really are not! +And `CHECK` does not have these flaws. + ### Complete the `Game` class yourself By the way, we've only covered how we could improve our `ChangePlayerNumberIfPossible` method of the `Game` class. Do you think our `CheckIfPlayerWon` function would benefit from the same treatment? 
@@ -425,7 +436,7 @@ Of course, hard failures in the programs we ship are not ideal! One way to reduce the risk of such failures is to keep the test coverage high for the code we write, ideally close to 100% line and branch coverage, i.e., every line and logical branch gets executed at least once in our test suite which is run regularly and automatically. -This way we catch most of the unrecoverable errors during development. In some industries, like automotive, aviation, or medical this is actually a legal requirement. +This way we catch most of the unrecoverable errors during development. In some industries, like automotive, aviation, or medical this is actually a requirement to obtain the certifications necessary for the resulting software to be used. But unfortunately, despite our best efforts, we cannot *completely* avoid catastrophic failures in the programs we ship! @@ -435,11 +446,11 @@ Even if we do everything right on our side, hardware can still fail and corrupt This visualization is actually taken from an [excellent Veritasium video](https://www.youtube.com/watch?v=AaZ_RSt0KP8) on this topic. Do watch it if you want to know more! -With the knowledge that we cannot completely remove the risk of hitting an unrecoverable error in production, and that we also probably don't want our, say, flight software just randomly die during operation, in safety-critical systems, we often isolate components into separate processes or even separate hardware units, with watchdogs that can trigger recovery actions if one of our components suddenly crashes. +With the knowledge that we cannot completely remove the risk of hitting an unrecoverable error in production, and that we also probably don't want our, say, flight software just randomly dying during operation, in safety-critical systems, we often isolate components into separate processes or even separate hardware units, with watchdogs that can trigger recovery actions if one of our components suddenly crashes. 
This way we can have our cake and eat it at the same time: using `CHECK` minimizes the time-to-failure when a bug is encountered, while our fallback options keep the system safe as a whole even when certain components fail. -That being said, this is a system architecture question and this topic is far beyond what I want to talk about today. In most non-safety-critical systems we do not need to think about these failure cases as deeply and we can usually just restart our program in case of a one-off failure. But, if it sounds interesting to you, in an [introductory lecture to the Self Driving Cars course](https://youtu.be/DtRktn4bVWg?si=DJuU8OjxtBcj5o2C) at the University of Bonn, I've dedicated a significant part towards the end to this topic. +That being said, this is a system architecture question and this topic is far beyond what I want to talk about today. In most non-safety-critical systems we do not need to think about these failure cases as deeply and we can usually just restart our program in case of a one-off failure. # Recoverable errors: **handle and proceed** @@ -538,7 +549,7 @@ ChangeEntry GetNextChangeEntryFromUser(const Game& game) { Throwing an exception interrupts the normal program flow. We leave the current scope, so all objects allocated in it are automatically destroyed and the program continues with the "exceptional flow" to find a place where the thrown exception can be handled. -Speaking of handling exceptions, we can "catch" them anywhere upstream from the place they have been thrown from. As `std::exception` is just an object, it can be caught by value or by reference but it is considered a good practice to catch them by reference. In our case we can catch either a `std::out_of_range` exception directly or, as `std::out_of_range` derives from `std::exception`, catch `std::exception` instead: +Speaking of handling exceptions, we can "catch" them anywhere upstream from the place they have been thrown from. 
As `std::exception` is just an object, it can be caught by value or by reference but it is considered a good practice to catch them by reference. In our case we can catch either a `std::out_of_range` exception directly or, as `std::out_of_range` derives from `std::exception`, catch `std::exception` instead. All of the exceptions that derive from `std::exception` can override the `what()` function that we can use to see the cause of the error: -```cpp -ChangeEntry GetNextChangeEntryFromUser(const Game& game); -``` - -We can: - -1. Keep the return type, `ChangeEntry` in our case, but return a special **value** of this type in case of failure; +1. Keep the return type of the function if it has any, but return a special **value** of this type in case of failure; 2. Return an **error code** from our function to indicate success or failure; 3. Return a **different type** specifically designed to encode failure states alongside the actual return. @@ -730,7 +725,7 @@ I believe that the third option is the best out of these three, but it will be e ### Returning a value indicating error does not always work 😱 -There is a number of issues with returning a special value from a function without using a special return type. As an illustration, in our case, a naïve choice would be to return a value-initialized `ChangeEntry{}` object from the `GetNextChangeEntryFromUser` function if the user provided a wrong number index. This object will essentially hold zeros for its `index` and `value` entries. However, as you might imagine, a pair of zeros *is* a completely valid, if slightly useless, change entry! How do we disambiguate this value from a valid one? +There is a number of issues with returning a special value from a function without using a special return type. As an illustration, in our `GetNextChangeEntryFromUser` function, a naïve choice would be to return a value-initialized `ChangeEntry{}` object if the user provided a wrong number index. 
This object would essentially hold zeros for its `index` and `value` entries. However, as you might imagine, a pair of zeros *is* a completely valid, if slightly useless, change entry! How do we disambiguate this value from a valid one? - ### Using `std::optional`: **a better way** I believe that there *is* a better way though. With C++17, we gained [`std::optional`](https://en.cppreference.com/w/cpp/utility/optional.html) with which we can express that a function “might return a value” by returning an object of `std::optional` type. We can create this object either empty, holding a so-called `std::nullopt`, or from a valid `ChangeEntry`: @@ -988,17 +981,13 @@ The presence or absence of a value is encoded into the type itself. No more gues But, we've again lost a capability of knowing *what* went wrong. We just know that *something* has not gone to plan. There is simply no place to store the reason! - - -### Using `std::expected`: **add context** - But we are still interested in a reason for our failures! -Enter [`std::expected`](https://en.cppreference.com/w/cpp/utility/expected.html), coming in C++23. And if you'd like to know what led to it being added to the language, give this [fantastic talk by Andrei Alexandrescu](https://www.youtube.com/watch?v=PH4WBuE1BHI) a watch! It is one of my favorite talks ever! It is both informative and entertaining in an equal measure! +### Using `std::expected`: **add context** - +Enter [`std::expected`](https://en.cppreference.com/w/cpp/utility/expected.html), coming in C++23. And if you'd like to hear somebody more entertaining than myself talk about why we have it, watch this [fantastic talk by Andrei Alexandrescu](https://www.youtube.com/watch?v=PH4WBuE1BHI)! It is one of my favorite talks ever! It is both informative and entertaining in an equal measure! 
-With `std::expected` we could do the same things we could do with `std::optional` and more by changing our function accordingly: +Anyway, with `std::expected` we could do the same things we could do with `std::optional` and more by changing our function accordingly: From e1f0b517250a8e292459d9571f23a77ed1a6b0a6 Mon Sep 17 00:00:00 2001 From: Igor Bogoslavskyi Date: Tue, 24 Jun 2025 23:36:50 +0200 Subject: [PATCH 23/26] Fix errors in snippets reported by CI --- lectures/error_handling.md | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/lectures/error_handling.md b/lectures/error_handling.md index 63cea34..70605c5 100644 --- a/lectures/error_handling.md +++ b/lectures/error_handling.md @@ -284,6 +284,7 @@ My favorite tool for this is the [`CHECK`](https://abseil.io/docs/cpp/guides/log ```cpp int main() { - Game game{{42, 49, 23}, {42, 40, 23}, 10}; + Game game{{42, 49, 23}, {10, 40, 24}, 10}; while (game.UserHasBudget()) { try { const auto change_entry = GetNextChangeEntryFromUser(game); @@ -868,7 +874,7 @@ std::string GetFailureReason(int constant) { } int main() { - Game game{{42, 49, 23}, {42, 40, 23}, 10}; + Game game{{42, 49, 23}, {10, 40, 24}, 10}; while (game.UserHasBudget()) { // Cannot be const, cannot use auto, have to allocate. ChangeEntry change_entry{}; @@ -974,7 +980,7 @@ $PLACEHOLDER ```cpp // Other code above. int main() { - Game game{{42, 49, 23}, {42, 40, 23}, 10}; + Game game{{42, 49, 23}, {10, 40, 24}, 10}; while (game.UserHasBudget()) { const auto change_entry = GetNextChangeEntryFromUser(game); if (!change_entry) { // Also possible: change_entry.has_value(). @@ -1078,7 +1084,7 @@ $PLACEHOLDER ```cpp // Other code above. int main() { - Game game{{42, 49, 23}, {42, 40, 23}, 10}; + Game game{{42, 49, 23}, {10, 40, 24}, 10}; while (game.UserHasBudget()) { const auto change_entry = GetNextChangeEntryFromUser(game); if (!change_entry) { // Also possible: change_entry.has_value(). 
diff --git a/lectures/images/game_loss.png b/lectures/images/game_loss.png new file mode 100644 index 0000000..8fd4a1f --- /dev/null +++ b/lectures/images/game_loss.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:e051e6fdf633d3a9d9c72b165273a789e169152c2320c01e5d28abcd6865f898 +size 122750 diff --git a/lectures/images/game_win.png b/lectures/images/game_win.png new file mode 100644 index 0000000..1ba4d99 --- /dev/null +++ b/lectures/images/game_win.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:ef1bc702765b98ff4e1711a6a0dacb2808285a38286f94b4446279ce673fcfb3 +size 132390 diff --git a/lectures/images/memory_ub.png b/lectures/images/memory_ub.png new file mode 100644 index 0000000..e7455df --- /dev/null +++ b/lectures/images/memory_ub.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:f5a77980c57367fb4094e2814fd078fdb135ea150f7abfe64af5e27edc40d939 +size 48636 From d78a588d8ecf0697b80b0bfddfa5732d707a745d Mon Sep 17 00:00:00 2001 From: Igor Bogoslavskyi Date: Mon, 30 Jun 2025 01:07:43 +0200 Subject: [PATCH 26/26] Minor final details --- .../error_handling/src/scenes/example.tsx | 788 +++++++++++++++--- .../comparison_game_expected.cpp | 103 +++ lectures/error_handling.md | 16 +- 3 files changed, 798 insertions(+), 109 deletions(-) create mode 100644 lectures/code/error_handling/comparison_game/comparison_game_expected.cpp diff --git a/animation/error_handling/src/scenes/example.tsx b/animation/error_handling/src/scenes/example.tsx index 9888a26..8c3860e 100644 --- a/animation/error_handling/src/scenes/example.tsx +++ b/animation/error_handling/src/scenes/example.tsx @@ -1,22 +1,54 @@ import { createRef } from '@motion-canvas/core/lib/utils'; -import { makeScene2D, Code, Camera, lines } from '@motion-canvas/2d'; +import { makeScene2D, Code, Camera, Node, lines } from '@motion-canvas/2d'; import { all, waitFor, waitUntil } from '@motion-canvas/core/lib/flow'; import { DEFAULT } from 
'@motion-canvas/core/lib/signals'; import { BBox, Vector2 } from '@motion-canvas/core/lib/types'; +import { Stage } from '@motion-canvas/core'; export default makeScene2D(function* (view) { const codeRef = createRef(); - const camera = createRef(); - - yield view.add( - - ,); + const camera_1 = createRef(); + const camera_2 = createRef(); + const stage_2 = createRef(); + + const scene = ( + + + + ); + + view.add( + <> + + + + + + , + ); const duration = 1.0 @@ -45,7 +77,7 @@ class Game { yield* all( codeRef().code(game_1, 0.0).wait(duration), - camera().zoom(0.7, 0.0) + camera_1().zoom(0.7, 0.0), ); const game_2 = `\ @@ -85,12 +117,15 @@ class Game { yield* all( codeRef().code(game_2, duration).wait(duration), - camera().zoom(0.5, duration), + camera_1().zoom(0.5, duration), + codeRef().selection([ + lines(0, 0), + lines(13, 13 + 9), + ], duration) ); const game_3 = `\ -#include #include #include @@ -136,7 +171,10 @@ class Game { yield* all( codeRef().code(game_3, duration).wait(duration), - camera().centerOn([0, 500], duration), + camera_1().centerOn([0, 500], duration), + codeRef().selection([ + lines(154 - 131, 162 - 131), + ], duration) ); const game_4 = `\ @@ -144,6 +182,11 @@ class Game { #include #include +struct ChangeEntry { + int index{}; + int value{}; +}; + // 😱 Warning! No error handling! 
class Game { public: @@ -192,9 +235,27 @@ class Game { int budget_{}; };` + yield* all( + camera_2().centerOn([-800, -1200], 0), + camera_2().zoom(0.6, 0), + ); + + yield* all( + stage_2().position.x(0, duration / 2), + ); + yield* all( codeRef().code(game_4, duration).wait(duration), - camera().centerOn([0, 800], duration), + stage_2().scale(1.3, duration), + stage_2().position([-150, 150], duration), + camera_2().zoom(0.35, duration), + camera_2().centerOn([-800, -1450], duration), + camera_1().centerOn([0, 700], duration), + codeRef().selection([ + lines(0, 0), + lines(4, 7), + lines(221 - 183, 228 - 183), + ], duration) ); const game_5 = `\ @@ -202,6 +263,11 @@ class Game { #include #include +struct ChangeEntry { + int index{}; + int value{}; +}; + // 😱 Warning! No error handling! class Game { public: @@ -262,8 +328,19 @@ ChangeEntry GetNextChangeEntryFromUser(const Game& game) { }` yield* all( + stage_2().position.x(900, duration / 2), codeRef().code(game_5, duration).wait(duration), - camera().centerOn([0, 1500], duration), + camera_1().centerOn([0, 1500], duration), + codeRef().selection([ + lines(306 - 250, 316 - 250), + ], duration) + ); + + yield* all( + camera_2().centerOn([-800, -1200], 0), + camera_2().zoom(0.6, 0), + stage_2().scale(1.0, 0), + stage_2().position([900, 0], 0), ); const game_6 = `\ @@ -271,6 +348,11 @@ ChangeEntry GetNextChangeEntryFromUser(const Game& game) { #include #include +struct ChangeEntry { + int index{}; + int value{}; +}; + // 😱 Warning! No error handling! class Game { public: @@ -342,21 +424,35 @@ int main() { } else { std::cout << "Not win today. 
Try again!\\n"; } -} -` +}` yield* all( codeRef().code(game_6, duration).wait(duration), - camera().centerOn([0, 1800], duration), + camera_1().centerOn([0, 1800], duration), + codeRef().selection([ + lines(390 - 323, 403 - 323), + ], duration) + ); + + yield* all( + camera_1().centerOn([0, 0], 0), + camera_1().zoom(0.2, 0), + codeRef().selection(DEFAULT, 0), + waitFor(duration) ); - const game_check_header = `\ + const game_check = `\ #include #include #include #include +struct ChangeEntry { + int index{}; + int value{}; +}; + // 😱 Warning! No error handling! class Game { public: @@ -388,6 +484,11 @@ class Game { } void ChangePlayerNumberIfPossible(const ChangeEntry& change_entry) { + // Checking: + // (change_entry.index >= 0) + // (change_entry.index < player_numbers_.size()) + CHECK_GE(change_entry.index, 0); + CHECK_LT(change_entry.index, player_numbers_.size()); auto& player_number = player_numbers_[change_entry.index]; const auto difference = std::abs(change_entry.value - player_number); if (difference > budget_) { return; } @@ -428,31 +529,41 @@ int main() { } else { std::cout << "Not win today. 
Try again!\\n"; } -} -` - yield* all( - camera().centerOn([0, 0], 0), - camera().zoom(0.2, 0), - waitFor(duration) - ); +}` yield* all( - camera().centerOn([-300, -2000], duration), - camera().zoom(1, duration), - waitFor(duration) + camera_2().centerOn([-700, -2400], 0), + camera_1().centerOn([100, -100], duration), + camera_1().zoom(0.7, duration), + codeRef().selection([ + lines(454 - 416, 462 - 416), + ], duration) ); + yield* waitFor(0.5 * duration); + yield* all( - codeRef().code(game_check_header, duration).wait(duration), + codeRef().code(game_check, duration).wait(duration), + stage_2().position.x(0, duration / 2), + camera_2().centerOn([-700, -2600], duration), + codeRef().selection([ + lines(0, 0), + lines(456 - 416, 468 - 416), + ], duration) ); - const game_check = `\ -#include + const game_assert = `\ +#include #include #include #include +struct ChangeEntry { + int index{}; + int value{}; +}; + // 😱 Warning! No error handling! class Game { public: @@ -483,12 +594,10 @@ class Game { return win_loss_counter > 0; } + // 😱 Beware of using asserts in release code. void ChangePlayerNumberIfPossible(const ChangeEntry& change_entry) { - // Checking: - // (change_entry.index >= 0) - // (change_entry.index < player_numbers_.size()) - CHECK_GE(change_entry.index, 0); - CHECK_LT(change_entry.index, player_numbers_.size()); + assert(change_entry.index >= 0); + assert(change_entry.index < player_numbers_.size()); auto& player_number = player_numbers_[change_entry.index]; const auto difference = std::abs(change_entry.value - player_number); if (difference > budget_) { return; } @@ -529,27 +638,58 @@ int main() { } else { std::cout << "Not win today. 
Try again!\\n"; } -} -` +}` yield* all( - camera().centerOn([0, 100], duration), - camera().zoom(0.7, duration), + codeRef().code(game_assert, duration).wait(duration), + camera_2().centerOn([-700, -2550], duration), + codeRef().selection([ + lines(0, 0), + lines(597 - 556, 606 - 556), + ], duration) ); - yield* waitFor(0.5 * duration); + yield* all( + stage_2().position.x(900, 0), + waitFor(duration), + ); yield* all( - codeRef().code(game_check, duration).wait(duration), + camera_1().centerOn([0, -600], duration), + waitFor(duration * 2), + codeRef().selection([ + lines(0, 0), + lines(587 - 556, 595 - 556), + ], duration) ); - const game_assert = `\ -#include + yield* all( + codeRef().code(game_check, 0).wait(duration), + camera_1().centerOn([0, 0], 0), + codeRef().selection(DEFAULT, 0), + camera_1().zoom(0.2, 0), + ); + + yield* all( + camera_1().centerOn([0, 1300], duration), + camera_1().zoom(0.7, duration), + codeRef().selection([ + lines(702 - 639, 712 - 639), + ], duration) + ); + + const game_before_recoverable = `\ +#include #include #include #include +struct ChangeEntry { + int index{}; + int value{}; +}; + // 😱 Warning! No error handling! class Game { public: @@ -580,10 +720,12 @@ class Game { return win_loss_counter > 0; } - // 😱 Beware of using asserts in release code. 
void ChangePlayerNumberIfPossible(const ChangeEntry& change_entry) { - assert(change_entry.index >= 0); - assert(change_entry.index < player_numbers_.size()); + // Checking: + // (change_entry.index >= 0) + // (change_entry.index < player_numbers_.size()) + CHECK_GE(change_entry.index, 0); + CHECK_LT(change_entry.index, player_numbers_.size()); auto& player_number = player_numbers_[change_entry.index]; const auto difference = std::abs(change_entry.value - player_number); if (difference > budget_) { return; } @@ -606,7 +748,7 @@ ChangeEntry GetNextChangeEntryFromUser(const Game& game) { game.Print(); ChangeEntry entry{}; std::cout << "Please enter number to change: "; - std::cin >> entry.index; + std::cin >> entry.index; // <-- This value is NOT arbitrary! std::cout << "Please provide a new value: "; std::cin >> entry.value; return entry; @@ -624,35 +766,29 @@ int main() { } else { std::cout << "Not win today. Try again!\\n"; } -} -` - yield* all( - codeRef().code(game_assert, duration).wait(duration), - ); - - yield* all( - camera().centerOn([0, -600], duration) - ); - - yield* all( - codeRef().code(game_check, 0).wait(duration), - camera().centerOn([0, 0], 0), - camera().zoom(0.2, 0), - ); +}` yield* all( - camera().centerOn([0, 1300], duration), - camera().zoom(0.7, duration), + codeRef().code(game_before_recoverable, duration).wait(duration), + codeRef().selection([ + lines(702 - 639, 712 - 639), + ], duration) ); + yield* waitFor(duration); - const game_before_recoverable = `\ + const game_exceptions_func = `\ #include #include #include +#include #include -// 😱 Warning! No error handling! +struct ChangeEntry { + int index{}; + int value{}; +}; + class Game { public: Game(std::vector&& ref_numbers, @@ -705,12 +841,15 @@ class Game { int budget_{}; }; -// 😱 We should handle failure to get a proper value. +// 😱 I'm not a fan of using exceptions. 
ChangeEntry GetNextChangeEntryFromUser(const Game& game) { game.Print(); ChangeEntry entry{}; std::cout << "Please enter number to change: "; - std::cin >> entry.index; // <-- This value is NOT arbitrary! + std::cin >> entry.index; + if ((entry.index < 0) || (entry.index >= game.player_numbers().size())) { + throw std::out_of_range("Wrong number index provided."); + } std::cout << "Please provide a new value: "; std::cin >> entry.value; return entry; @@ -728,21 +867,36 @@ int main() { } else { std::cout << "Not win today. Try again!\\n"; } -} -` +}` + yield* all( - codeRef().code(game_before_recoverable, duration).wait(duration), + stage_2().position.x(0, duration / 2), + camera_2().centerOn([-700, -2450], 0), + ); + + yield* all( + codeRef().code(game_exceptions_func, duration).wait(duration), + codeRef().selection([ + lines(4, 4), + lines(795 - 732, 808 - 732), + ], duration), + camera_2().centerOn([-700, -2450], duration), ); yield* waitFor(duration); - const game_exceptions_func = `\ + const game_exceptions = `\ #include #include #include +#include #include -// 😱 Warning! No error handling! +struct ChangeEntry { + int index{}; + int value{}; +}; + class Game { public: Game(std::vector&& ref_numbers, @@ -812,8 +966,12 @@ ChangeEntry GetNextChangeEntryFromUser(const Game& game) { int main() { Game game{{42, 49, 23}, {10, 40, 24}, 10}; while (game.UserHasBudget()) { - const auto change_entry = GetNextChangeEntryFromUser(game); - game.ChangePlayerNumberIfPossible(change_entry); + try { + const auto change_entry = GetNextChangeEntryFromUser(game); + game.ChangePlayerNumberIfPossible(change_entry); + } catch (const std::out_of_range& e) { + std::cerr << e.what() << std::endl; + } } game.Print(); if (game.CheckIfPlayerWon()) { @@ -821,21 +979,33 @@ int main() { } else { std::cout << "Not win today. 
Try again!\\n"; } -} -` +}` + yield* all( - codeRef().code(game_exceptions_func, duration).wait(duration), + camera_1().centerOn([0, 1900], duration), + camera_1().zoom(0.45, duration), + stage_2().position.x(900, duration / 2), + codeRef().selection([ + lines(891 - 828, 922 - 828), + ], duration), + codeRef().code(game_exceptions, duration).wait(duration), ); + yield* waitFor(duration); - const game_exceptions = `\ + const game_exceptions_catch_all = `\ #include #include #include +#include #include -// 😱 Warning! No error handling! +struct ChangeEntry { + int index{}; + int value{}; +}; + class Game { public: Game(std::vector&& ref_numbers, @@ -910,6 +1080,9 @@ int main() { game.ChangePlayerNumberIfPossible(change_entry); } catch (const std::out_of_range& e) { std::cerr << e.what() << std::endl; + } catch (...) { + // 😱 Not very useful, is it? + std::cerr << "Oops, something happened.\\n"; } } game.Print(); @@ -918,26 +1091,41 @@ int main() { } else { std::cout << "Not win today. Try again!\\n"; } -} -` +}` + yield* all( - camera().centerOn([0, 2100], duration), - camera().zoom(0.7, duration), + codeRef().code(game_exceptions_catch_all, duration).wait(duration), + codeRef().selection([ + lines(891 - 828, 925 - 828), + ], duration), ); + yield* waitFor(duration); + + // Recoverable errors yield* all( - codeRef().code(game_exceptions, duration).wait(duration), + codeRef().selection(DEFAULT, 0).wait(duration), + camera_1().centerOn([0, 1200], 0), + camera_2().centerOn([-700, -2700], 0), + camera_1().zoom(0.7, 0), + codeRef().selection([ + lines(1, 5), + lines(1111 - 1047, 1124 - 1047), + ], 0), ); - yield* waitFor(duration); - const game_exceptions_catch_all = `\ + const recoverable_value = `\ #include #include #include #include -// 😱 Warning! No error handling! +struct ChangeEntry { + int index{}; + int value{}; +}; + class Game { public: Game(std::vector&& ref_numbers, @@ -990,14 +1178,14 @@ class Game { int budget_{}; }; -// 😱 I'm not a fan of using exceptions. 
+// 😱 Not a great idea! ChangeEntry GetNextChangeEntryFromUser(const Game& game) { game.Print(); ChangeEntry entry{}; std::cout << "Please enter number to change: "; std::cin >> entry.index; if ((entry.index < 0) || (entry.index >= game.player_numbers().size())) { - throw std::out_of_range("Wrong number index provided."); + return {}; // 😱 How do we know this value indicates an error? } std::cout << "Please provide a new value: "; std::cin >> entry.value; @@ -1007,15 +1195,133 @@ ChangeEntry GetNextChangeEntryFromUser(const Game& game) { int main() { Game game{{42, 49, 23}, {10, 40, 24}, 10}; while (game.UserHasBudget()) { - try { - const auto change_entry = GetNextChangeEntryFromUser(game); - game.ChangePlayerNumberIfPossible(change_entry); - } catch (const std::out_of_range& e) { - std::cerr << e.what() << std::endl; - } catch (...) { - // 😱 Not very useful, is it? - std::cerr << "Oops, something happened.\\n"; + const auto change_entry = GetNextChangeEntryFromUser(game); + game.ChangePlayerNumberIfPossible(change_entry); + } + game.Print(); + if (game.CheckIfPlayerWon()) { + std::cout << "You win!\\n"; + } else { + std::cout << "Not win today. 
Try again!\\n"; + } +}` + yield* all( + stage_2().position.x(0, duration / 2), + ); + + yield* all( + stage_2().position.x(0, duration / 2), + camera_1().centerOn([0, 1400], duration), + camera_2().centerOn([-700, -2500], duration), + codeRef().selection([ + lines(1, 4), + lines(1109 - 1047, 1122 - 1047), + ], duration), + codeRef().code(recoverable_value, duration).wait(duration), + ); + + yield* all( + stage_2().position.x(900, duration / 2), + ); + + // Error codes + + const recoverable_error_codes = `\ +#include + +#include +#include +#include + +struct ChangeEntry { + int index{}; + int value{}; +}; + +class Game { + public: + Game(std::vector&& ref_numbers, + std::vector&& player_numbers, + int budget) + : ref_numbers_{std::move(ref_numbers)}, + player_numbers_{std::move(player_numbers)}, + budget_{budget} {} + + void Print() const { + std::cout << "Budget: " << budget_ << std::endl; + std::cout << "Reference numbers: "; + for (auto number : ref_numbers_) { std::cout << number << "\\t"; } + std::cout << std::endl; + std::cout << "Player numbers: "; + for (auto number : player_numbers_) { std::cout << number << "\\t"; } + std::cout << std::endl; + } + + bool CheckIfPlayerWon() const { + int win_loss_counter{}; + for (auto i = 0UL; i < player_numbers_.size(); ++i) { + const auto difference = player_numbers_[i] - ref_numbers_[i]; + if (difference > 0) win_loss_counter++; + if (difference < 0) win_loss_counter--; + } + return win_loss_counter > 0; + } + + void ChangePlayerNumberIfPossible(const ChangeEntry& change_entry) { + // Checking: + // (change_entry.index >= 0) + // (change_entry.index < player_numbers_.size()) + CHECK_GE(change_entry.index, 0); + CHECK_LT(change_entry.index, player_numbers_.size()); + auto& player_number = player_numbers_[change_entry.index]; + const auto difference = std::abs(change_entry.value - player_number); + if (difference > budget_) { return; } + player_number = change_entry.value; + budget_ -= difference; + } + + const 
std::vector& ref_numbers() const { return ref_numbers_; } + const std::vector& player_numbers() const { return player_numbers_; } + bool UserHasBudget() const { return budget_ > 0; } + + private: + std::vector ref_numbers_{}; + std::vector player_numbers_{}; + int budget_{}; +}; + +// 😱 Mostly not a great idea in C++. +int GetNextChangeEntryFromUser(const Game& game, ChangeEntry& result) { + game.Print(); + std::cout << "Please enter number to change: "; + int index{}; + std::cin >> index; + if ((index < 0) || (index >= game.player_numbers().size())) { + return kError; // Usually some constant defined elsewhere. + } + result.index = index; + std::cout << "Please provide a new value: "; + std::cin >> result.value; + return kSuccess; // Typically a 0, but using a constant is better. +} + +std::string GetFailureReason(int constant) { + // Some logic to return a message. + if (constant == kError) { return "Provided index is out of range."; } + return "Unknown error encountered."; +} + +int main() { + Game game{{42, 49, 23}, {10, 40, 24}, 10}; + while (game.UserHasBudget()) { + // Cannot be const, cannot use auto, has to allocate. + ChangeEntry change_entry{}; + const auto error_code = GetNextChangeEntryFromUser(game, change_entry); + if (error_code != kSuccess) { + std::cerr << GetFailureReason(error_code) << std::endl; + continue; } + game.ChangePlayerNumberIfPossible(change_entry); } game.Print(); if (game.CheckIfPlayerWon()) { @@ -1023,13 +1329,283 @@ int main() { } else { std::cout << "Not win today. 
Try again!\\n"; } +}` + + yield* all( + codeRef().selection([ + lines(1, 4), + lines(1220 - 1158, 1234 - 1158), + ], duration), + camera_1().centerOn([0, 1000], duration), + codeRef().code(recoverable_error_codes, duration).wait(duration), + ); + + yield* all( + camera_1().zoom(0.6, duration), + camera_1().centerOn([0, 2300], duration), + codeRef().selection([ + lines(1, 4), + lines(1235 - 1158, 1260 - 1158), + ], duration), + codeRef().code(recoverable_error_codes, duration).wait(duration), + ); + + + // Optional + + const recoverable_optional = `\ +#include + +#include +#include +#include +#include + +struct ChangeEntry { + int index{}; + int value{}; +}; + +class Game { + public: + Game(std::vector&& ref_numbers, + std::vector&& player_numbers, + int budget) + : ref_numbers_{std::move(ref_numbers)}, + player_numbers_{std::move(player_numbers)}, + budget_{budget} {} + + void Print() const { + std::cout << "Budget: " << budget_ << std::endl; + std::cout << "Reference numbers: "; + for (auto number : ref_numbers_) { std::cout << number << "\\t"; } + std::cout << std::endl; + std::cout << "Player numbers: "; + for (auto number : player_numbers_) { std::cout << number << "\\t"; } + std::cout << std::endl; + } + + bool CheckIfPlayerWon() const { + int win_loss_counter{}; + for (auto i = 0UL; i < player_numbers_.size(); ++i) { + const auto difference = player_numbers_[i] - ref_numbers_[i]; + if (difference > 0) win_loss_counter++; + if (difference < 0) win_loss_counter--; + } + return win_loss_counter > 0; + } + + void ChangePlayerNumberIfPossible(const ChangeEntry& change_entry) { + // Checking: + // (change_entry.index >= 0) + // (change_entry.index < player_numbers_.size()) + CHECK_GE(change_entry.index, 0); + CHECK_LT(change_entry.index, player_numbers_.size()); + auto& player_number = player_numbers_[change_entry.index]; + const auto difference = std::abs(change_entry.value - player_number); + if (difference > budget_) { return; } + player_number = 
change_entry.value; + budget_ -= difference; + } + + const std::vector<int>& ref_numbers() const { return ref_numbers_; } + const std::vector<int>& player_numbers() const { return player_numbers_; } + bool UserHasBudget() const { return budget_ > 0; } + + private: + std::vector<int> ref_numbers_{}; + std::vector<int> player_numbers_{}; + int budget_{}; +}; + +std::optional<ChangeEntry> +GetNextChangeEntryFromUser(const Game& game) { + game.Print(); + ChangeEntry entry{}; + std::cout << "Please enter number to change: "; + std::cin >> entry.index; + if ((entry.index < 0) || (entry.index >= game.player_numbers().size())) { + return {}; // <-- Create an empty optional, or std::nullopt. + } + std::cout << "Please provide a new value: "; + std::cin >> entry.value; + return entry; // <-- Optional filled with a ChangeEntry object. } -` + +int main() { + Game game{{42, 49, 23}, {10, 40, 24}, 10}; + while (game.UserHasBudget()) { + const auto change_entry = GetNextChangeEntryFromUser(game); + if (!change_entry) { // Also possible: change_entry.has_value(). + std::cerr << "Error when getting a number index." << std::endl; + continue; + } + game.ChangePlayerNumberIfPossible(change_entry.value()); + } + game.Print(); + if (game.CheckIfPlayerWon()) { + std::cout << "You win!\\n"; + } else { + std::cout << "Not win today. 
Try again!\\n"; + } +}` yield* all( - codeRef().code(game_exceptions_catch_all, duration).wait(duration), + camera_1().centerOn([0, 900], 0), + camera_2().centerOn([-700, -2850], 0), + codeRef().selection([ + lines(1, 4), + lines(1109 - 1047, 1123 - 1047), + ], 0), + waitFor(duration) ); - yield* waitFor(duration); + + yield* all( + stage_2().position.x(0, duration / 2), + ); + + yield* all( + camera_1().zoom(0.6, duration), + camera_1().centerOn([0, 1200], duration), + camera_2().centerOn([-700, -2550], duration), + codeRef().selection([ + lines(4, 4), + lines(1110 - 1047, 1123 - 1047), + ], duration), + codeRef().code(recoverable_optional, duration).wait(duration), + ); + + yield* all( + stage_2().position.x(900, duration / 2), + camera_1().zoom(0.5, duration), + camera_1().centerOn([200, 1850], duration), + camera_2().centerOn([-700, -2550], duration), + codeRef().selection([ + lines(4, 4), + lines(1110 - 1047, 1223 - 1047), + ], duration), + codeRef().code(recoverable_optional, duration).wait(duration), + ); + + // Expected + + const recoverable_expected = `\ +#include + +#include +#include +#include +#include +#include + +struct ChangeEntry { + int index{}; + int value{}; +}; + +class Game { + public: + Game(std::vector&& ref_numbers, + std::vector&& player_numbers, + int budget) + : ref_numbers_{std::move(ref_numbers)}, + player_numbers_{std::move(player_numbers)}, + budget_{budget} {} + + void Print() const { + std::cout << "Budget: " << budget_ << std::endl; + std::cout << "Reference numbers: "; + for (auto number : ref_numbers_) { std::cout << number << "\\t"; } + std::cout << std::endl; + std::cout << "Player numbers: "; + for (auto number : player_numbers_) { std::cout << number << "\\t"; } + std::cout << std::endl; + } + + bool CheckIfPlayerWon() const { + int win_loss_counter{}; + for (auto i = 0UL; i < player_numbers_.size(); ++i) { + const auto difference = player_numbers_[i] - ref_numbers_[i]; + if (difference > 0) win_loss_counter++; + if 
(difference < 0) win_loss_counter--; + } + return win_loss_counter > 0; + } + + void ChangePlayerNumberIfPossible(const ChangeEntry& change_entry) { + // Checking: + // (change_entry.index >= 0) + // (change_entry.index < player_numbers_.size()) + CHECK_GE(change_entry.index, 0); + CHECK_LT(change_entry.index, player_numbers_.size()); + auto& player_number = player_numbers_[change_entry.index]; + const auto difference = std::abs(change_entry.value - player_number); + if (difference > budget_) { return; } + player_number = change_entry.value; + budget_ -= difference; + } + + const std::vector& ref_numbers() const { return ref_numbers_; } + const std::vector& player_numbers() const { return player_numbers_; } + bool UserHasBudget() const { return budget_ > 0; } + + private: + std::vector ref_numbers_{}; + std::vector player_numbers_{}; + int budget_{}; +}; + +// Requires C++23 +std::expected +GetNextChangeEntryFromUser(const Game& game) { + game.Print(); + ChangeEntry entry{}; + std::cout << "Please enter number to change: "; + std::cin >> entry.index; + if ((entry.index < 0) || (entry.index >= game.player_numbers().size())) { + return std::unexpected( + std::format("Index {} must be in [0, {}) interval", + entry.index, game.player_numbers().size())); + } + std::cout << "Please provide a new value: "; + std::cin >> entry.value; + return entry; +} + +int main() { + Game game{{42, 49, 23}, {10, 40, 24}, 10}; + while (game.UserHasBudget()) { + const auto change_entry = GetNextChangeEntryFromUser(game); + if (!change_entry) { // Also possible: change_entry.has_value(). + std::cerr << change_entry.error() << std::endl; + continue; + } + game.ChangePlayerNumberIfPossible(change_entry.value()); + } + game.Print(); + if (game.CheckIfPlayerWon()) { + std::cout << "You win!\\n"; + } else { + std::cout << "Not win today. 
Try again!\\n"; + } +}` + + yield* all( + stage_2().position.y(-60, 0), + stage_2().position.x(100, duration / 2), + ); + + yield* all( + camera_1().zoom(0.45, duration), + camera_1().centerOn([200, 1850], duration), + camera_2().centerOn([-700, -2750], duration), + codeRef().selection([ + lines(3, 4), + lines(1111 - 1047, 1218 - 1047), + ], duration), + codeRef().code(recoverable_expected, duration).wait(duration), + ); + yield* waitFor(duration * 3); }); diff --git a/lectures/code/error_handling/comparison_game/comparison_game_expected.cpp b/lectures/code/error_handling/comparison_game/comparison_game_expected.cpp new file mode 100644 index 0000000..1e004f9 --- /dev/null +++ b/lectures/code/error_handling/comparison_game/comparison_game_expected.cpp @@ -0,0 +1,103 @@ +// For simplicity, I don't use actual abseil here as it would require CMake +// which would complicate this example. +#define CHECK_GE(expr, value) +#define CHECK_LT(expr, value) + +#include +#include +#include +#include +#include +#include + +struct ChangeEntry { + int index{}; + int value{}; +}; + +class Game { + public: + Game(std::vector&& ref_numbers, + std::vector&& player_numbers, + int budget) + : ref_numbers_{std::move(ref_numbers)}, + player_numbers_{std::move(player_numbers)}, + budget_{budget} {} + + void Print() const { + std::cout << "Budget: " << budget_ << std::endl; + std::cout << "Reference numbers: "; + for (auto number : ref_numbers_) { std::cout << number << "\t"; } + std::cout << std::endl; + std::cout << "Player numbers: "; + for (auto number : player_numbers_) { std::cout << number << "\t"; } + std::cout << std::endl; + } + + bool CheckIfPlayerWon() const { + int win_loss_counter{}; + for (auto i = 0UL; i < player_numbers_.size(); ++i) { + const auto difference = player_numbers_[i] - ref_numbers_[i]; + if (difference > 0) win_loss_counter++; + if (difference < 0) win_loss_counter--; + } + return win_loss_counter > 0; + } + + void ChangePlayerNumberIfPossible(const 
ChangeEntry& change_entry) { + // Checking: + // (change_entry.index >= 0) + // (change_entry.index < player_numbers_.size()) + CHECK_GE(change_entry.index, 0); + CHECK_LT(change_entry.index, player_numbers_.size()); + auto& player_number = player_numbers_[change_entry.index]; + const auto difference = std::abs(change_entry.value - player_number); + if (difference > budget_) { return; } + player_number = change_entry.value; + budget_ -= difference; + } + + const std::vector& ref_numbers() const { return ref_numbers_; } + const std::vector& player_numbers() const { return player_numbers_; } + bool UserHasBudget() const { return budget_ > 0; } + + private: + std::vector ref_numbers_{}; + std::vector player_numbers_{}; + int budget_{}; +}; + +// Requires C++23 +std::expected GetNextChangeEntryFromUser( + const Game& game) { + game.Print(); + ChangeEntry entry{}; + std::cout << "Please enter number to change: "; + std::cin >> entry.index; + if ((entry.index < 0) || (entry.index >= game.player_numbers().size())) { + return std::unexpected(std::format("Index {} must be in [0, {}) interval", + entry.index, + game.player_numbers().size())); + } + std::cout << "Please provide a new value: "; + std::cin >> entry.value; + return entry; +} + +int main() { + Game game{{42, 49, 23}, {10, 40, 24}, 10}; + while (game.UserHasBudget()) { + const auto change_entry = GetNextChangeEntryFromUser(game); + if (!change_entry) { // Also possible: change_entry.has_value(). + std::cerr << change_entry.error() << std::endl; + continue; + } + game.ChangePlayerNumberIfPossible(change_entry.value()); + } + game.Print(); + if (game.CheckIfPlayerWon()) { + std::cout << "You win!\n"; + } else { + std::cout << "Not win today. 
Try again!\n"; + } +} diff --git a/lectures/error_handling.md b/lectures/error_handling.md index 7f2fc19..f85d676 100644 --- a/lectures/error_handling.md +++ b/lectures/error_handling.md @@ -541,6 +541,10 @@ $PLACEHOLDER `CPP_RUN_CMD` CWD:error_handling c++ -std=c++17 -c get_change_exceptions.cpp --> ```cpp +#include + +// Old unchanged code. + // 😱 I'm not a fan of using exceptions. ChangeEntry GetNextChangeEntryFromUser(const Game& game) { game.Print(); @@ -596,6 +600,8 @@ $PLACEHOLDER `CPP_RUN_CMD` CWD:error_handling c++ -std=c++17 -c main_get_change_exceptions.cpp --> ```cpp +#include + // Old unchanged code. int main() { @@ -685,6 +691,8 @@ $PLACEHOLDER `CPP_RUN_CMD` CWD:error_handling c++ -std=c++17 -c main_get_change_exceptions_ellipsis.cpp --> ```cpp +// Old unchanged code. + int main() { Game game{{42, 49, 23}, {10, 40, 24}, 10}; while (game.UserHasBudget()) { @@ -876,7 +884,7 @@ std::string GetFailureReason(int constant) { int main() { Game game{{42, 49, 23}, {10, 40, 24}, 10}; while (game.UserHasBudget()) { - // Cannot be const, cannot use auto, have to allocate. + // Cannot be const, cannot use auto, has to allocate. 
ChangeEntry change_entry{}; const auto error_code = GetNextChangeEntryFromUser(game, change_entry); if (error_code != kSuccess) { @@ -1041,7 +1049,9 @@ std::expected GetNextChangeEntryFromUser(const Game& g std::cout << "Please enter number to change: "; std::cin >> entry.index; if ((entry.index < 0) || (entry.index >= game.player_numbers().size())) { - return std::unexpected(std::format("Index {} must be in [0, {}) interval", entry.index, game.player_numbers().size())); + return std::unexpected( + std::format("Index {} must be in [0, {}) interval", + entry.index, game.player_numbers().size())); } std::cout << "Please provide a new value: "; std::cin >> entry.value; @@ -1148,7 +1158,7 @@ These topics are quite nuanced, and I don't want to go into many details here bu I believe that this concludes a more or less complete overview of how to deal (and how *not* to deal) with errors in modern C++. -As a short summary, I hope that I could convince you that these are some sane suggestions: +As a short summary, I hope that I could convince you that these are some sane suggestions (that can also be found in the [final example](code/error_handling/comparison_game/comparison_game_expected.cpp)): - Use `CHECK` and similar macros for dealing with unrecoverable errors like programming bugs or contract violations in order to fail as fast as possible when they are encountered; - Keep the test coverage of the code high to reduce chances of crashing in the released code;