|
| 1 | +{ |
| 2 | + "cells": [ |
| 3 | + { |
| 4 | + "cell_type": "markdown", |
| 5 | + "metadata": { |
| 6 | + "slideshow": { |
| 7 | + "slide_type": "slide" |
| 8 | + } |
| 9 | + }, |
| 10 | + "source": [ |
| 11 | + "\n", |
| 12 | + "\n", |
| 13 | + "# Sessions 7 & 8: Object-Oriented Programming\n", |
| 14 | + "\n", |
| 15 | + "### Juan Luis Cano Rodríguez <jcano@faculty.ie.edu> - Master in Business Analytics and Big Data (2019-04-08)" |
| 16 | + ] |
| 17 | + }, |
| 18 | + { |
| 19 | + "cell_type": "markdown", |
| 20 | + "metadata": {}, |
| 21 | + "source": [ |
| 22 | + "## What are \"objects\" anyway?\n", |
| 23 | + "\n", |
| 24 | + "So far we have learned how to define variables, functions and modules in Python, and we have been using objects defined in other libraries, for example pandas `DataFrame`s or matplotlib `Figure`s. In very simple terms, an *object* is something that can optionally have:\n", |
| 25 | + "\n", |
| 26 | + "* Object-bound variables, called **properties**\n", |
| 27 | + "* Object-bound functions, called **methods**\n", |
| 28 | + "\n", |
| 29 | + "If the object *properties* can change, we say the object has a **state**, and also that it's **mutable**. Otherwise, it's **stateless** and **immutable**. A typical example of this difference is lists (mutable) versus tuples (immutable):" |
| 30 | + ] |
| 31 | + }, |
| 32 | + { |
| 33 | + "cell_type": "code", |
| 34 | + "execution_count": null, |
| 35 | + "metadata": {}, |
| 36 | + "outputs": [], |
| 37 | + "source": [] |
| 38 | + }, |
| 39 | + { |
| 40 | + "cell_type": "markdown", |
| 41 | + "metadata": {}, |
| 42 | + "source": [ |
| 43 | + "<div class=\"alert alert-info\">Immutable objects have the advantage that they can usually be <strong>hashed</strong>, that is: they can be transformed, using a hash function, into an integer that represents the object and does not change during its lifetime (objects that compare equal have the same hash). Mutable built-in containers such as lists can't, because the hash would have to change every time the state of the object changed. <strong>Dictionary keys have to be hashable objects.</strong></div>" |
| 44 | + ] |
| 45 | + }, |
| 46 | + { |
| 47 | + "cell_type": "code", |
| 48 | + "execution_count": null, |
| 49 | + "metadata": {}, |
| 50 | + "outputs": [], |
| 51 | + "source": [] |
| 52 | + }, |
| 53 | + { |
| 54 | + "cell_type": "markdown", |
| 55 | + "metadata": {}, |
| 56 | + "source": [ |
| 57 | + "## Classes and instances\n", |
| 58 | + "\n", |
| 59 | + "Objects are created by **instantiating a class**. A **class** is a *template* for new objects, where we define their behavior, and an **instance** is a particular realization of that class.\n", |
| 60 | + "\n", |
| 61 | + "\n", |
| 62 | + "\n", |
| 63 | + "### Example\n", |
| 64 | + "\n", |
| 65 | + "We want to model the behavior of the users of our company's product, to later study how much time they spend, what their preferences are and so forth. Let's create a `User` class:" |
| 66 | + ] |
| 67 | + }, |
| 68 | + { |
| 69 | + "cell_type": "code", |
| 70 | + "execution_count": null, |
| 71 | + "metadata": {}, |
| 72 | + "outputs": [], |
| 73 | + "source": [] |
| 74 | + }, |
| 75 | + { |
| 76 | + "cell_type": "markdown", |
| 77 | + "metadata": {}, |
| 78 | + "source": [ |
| 79 | + "Our `User` **class** is of type `type`, which means that it can be used to create new objects. Now, let's create two instances:" |
| 80 | + ] |
| 81 | + }, |
| 82 | + { |
| 83 | + "cell_type": "code", |
| 84 | + "execution_count": null, |
| 85 | + "metadata": {}, |
| 86 | + "outputs": [], |
| 87 | + "source": [] |
| 88 | + }, |
| 89 | + { |
| 90 | + "cell_type": "markdown", |
| 91 | + "metadata": {}, |
| 92 | + "source": [ |
| 93 | + "We have two **instances** of `User`: `user1` and `user2`. With a slight abuse of notation, we would say we have *two `User` objects*, or just *two `User`s*." |
| 94 | + ] |
| 95 | + }, |
| 96 | + { |
| 97 | + "cell_type": "markdown", |
| 98 | + "metadata": {}, |
| 99 | + "source": [ |
| 100 | + "### Using the instance: `self`\n", |
| 101 | + "\n", |
| 102 | + "Let's add a very simple **method** to demonstrate a very important concept in Python: the *explicit `self`*. Remember that a method is like a function that is bound to the object, and can use its properties. Methods are defined like this:" |
| 103 | + ] |
| 104 | + }, |
| 105 | + { |
| 106 | + "cell_type": "code", |
| 107 | + "execution_count": null, |
| 108 | + "metadata": {}, |
| 109 | + "outputs": [], |
| 110 | + "source": [] |
| 111 | + }, |
| 112 | + { |
| 113 | + "cell_type": "markdown", |
| 114 | + "metadata": {}, |
| 115 | + "source": [ |
| 116 | + "Notice how we called `user1.test()` **without passing an extra argument**? This is because Python is automatically passing the instance. It's the equivalent of doing this (**never do this**):" |
| 117 | + ] |
| 118 | + }, |
| 119 | + { |
| 120 | + "cell_type": "code", |
| 121 | + "execution_count": null, |
| 122 | + "metadata": {}, |
| 123 | + "outputs": [], |
| 124 | + "source": [] |
| 125 | + }, |
| 126 | + { |
| 127 | + "cell_type": "markdown", |
| 128 | + "metadata": {}, |
| 129 | + "source": [ |
| 130 | + "In fact, if we define a method without a first parameter, it will fail when we call it:" |
| 131 | + ] |
| 132 | + }, |
| 133 | + { |
| 134 | + "cell_type": "code", |
| 135 | + "execution_count": null, |
| 136 | + "metadata": {}, |
| 137 | + "outputs": [], |
| 138 | + "source": [] |
| 139 | + }, |
| 140 | + { |
| 141 | + "cell_type": "markdown", |
| 142 | + "metadata": {}, |
| 143 | + "source": [ |
| 144 | + "This first parameter can be called anything, but **everybody uses `self`**. Remember, conventions are important to minimize surprise and enhance collaboration!" |
| 145 | + ] |
| 146 | + }, |
| 147 | + { |
| 148 | + "cell_type": "markdown", |
| 149 | + "metadata": {}, |
| 150 | + "source": [ |
| 151 | + "### Initializing our instances\n", |
| 152 | + "\n", |
| 153 | + "Our `User` objects are not very useful yet. We will now add some properties, like their `name` and their `signup_date`. These properties should be specified on creation, so that we cannot have a user without `name` and `signup_date`. For that, Python provides a special method, `__init__`<sup>1</sup>, that **initializes**<sup>2</sup> the object:" |
| 154 | + ] |
| 155 | + }, |
| 156 | + { |
| 157 | + "cell_type": "code", |
| 158 | + "execution_count": null, |
| 159 | + "metadata": {}, |
| 160 | + "outputs": [], |
| 161 | + "source": [] |
| 162 | + }, |
| 163 | + { |
| 164 | + "cell_type": "markdown", |
| 165 | + "metadata": {}, |
| 166 | + "source": [ |
| 167 | + "<div class=\"alert alert-warning\"><sup>1</sup>Not to be confused with the <code>__init__.py</code> file where we put our code!</div>\n", |
| 168 | + "<div class=\"alert alert-warning\"><sup>2</sup>Sometimes this method is called the <em>constructor</em>, but strictly speaking, in Python the constructor is <code>__new__</code>, and you will rarely need to override it. The difference is that the constructor <em>returns an instance</em>, whereas the initializer <em>works with an already created instance and should return <code>None</code></em>.</div>" |
| 169 | + ] |
| 170 | + }, |
| 171 | + { |
| 172 | + "cell_type": "markdown", |
| 173 | + "metadata": {}, |
| 174 | + "source": [ |
| 175 | + "That's something! However, there are several things we can improve:\n", |
| 176 | + "\n", |
| 177 | + "* It can be cumbersome to specify the date every time, and it would be nice to have some default.\n", |
| 178 | + "* The default representation of the instances contains some hexadecimal memory address and nothing else. It would be nice to at least see the user name and the signup date:" |
| 179 | + ] |
| 180 | + }, |
| 181 | + { |
| 182 | + "cell_type": "code", |
| 183 | + "execution_count": null, |
| 184 | + "metadata": {}, |
| 185 | + "outputs": [], |
| 186 | + "source": [] |
| 187 | + }, |
| 188 | + { |
| 189 | + "cell_type": "markdown", |
| 190 | + "metadata": {}, |
| 191 | + "source": [ |
| 192 | + "* Nothing stops me from changing the `name` and `signup_date` of an existing user:" |
| 193 | + ] |
| 194 | + }, |
| 195 | + { |
| 196 | + "cell_type": "code", |
| 197 | + "execution_count": null, |
| 198 | + "metadata": {}, |
| 199 | + "outputs": [], |
| 200 | + "source": [] |
| 201 | + }, |
| 202 | + { |
| 203 | + "cell_type": "markdown", |
| 204 | + "metadata": {}, |
| 205 | + "source": [ |
| 206 | + "### Exercise\n", |
| 207 | + "\n", |
| 208 | + "* Make `signup_date` optional by providing a default value (be careful, there's a trap!)\n", |
| 209 | + "* Make the `__repr__` method return a string containing the `name` and `signup_date`, which will override the default representation" |
| 210 | + ] |
| 211 | + }, |
| 212 | + { |
| 213 | + "cell_type": "markdown", |
| 214 | + "metadata": {}, |
| 215 | + "source": [ |
| 216 | + "### Protecting properties\n", |
| 217 | + "\n", |
| 218 | + "In Python, *there are no private attributes* (neither properties nor methods), and in fact everything can be accessed<sup>1</sup>. However, we can \"hide\" them by default in autocomplete and other environments by using a leading underscore `_`: such attributes are usually called **protected**.\n", |
| 219 | + "\n", |
| 220 | + "There is a common pattern for making a property read-only:\n", |
| 221 | + "\n", |
| 222 | + "1. Make it protected\n", |
| 223 | + "2. Create a \"getter\" using the `@property` decorator, which gets the value of the protected property with a public name\n", |
| 224 | + "\n", |
| 225 | + "<small><sup>1</sup>This philosophy used to be summarized by the sentence \"we are all consenting adults here\", which is used less often nowadays.</small>" |
| 226 | + ] |
| 227 | + }, |
| 228 | + { |
| 229 | + "cell_type": "code", |
| 230 | + "execution_count": null, |
| 231 | + "metadata": {}, |
| 232 | + "outputs": [], |
| 233 | + "source": [] |
| 234 | + }, |
| 235 | + { |
| 236 | + "cell_type": "markdown", |
| 237 | + "metadata": {}, |
| 238 | + "source": [ |
| 239 | + "### Inheritance" |
| 240 | + ] |
| 241 | + }, |
| 242 | + { |
| 243 | + "cell_type": "code", |
| 244 | + "execution_count": null, |
| 245 | + "metadata": {}, |
| 246 | + "outputs": [], |
| 247 | + "source": [] |
| 248 | + }, |
| 249 | + { |
| 250 | + "cell_type": "markdown", |
| 251 | + "metadata": {}, |
| 252 | + "source": [ |
| 253 | + "<div class=\"alert alert-warning\">Python supports multiple inheritance as well, which must be handled with care: see for example the <a href=\"https://www.wikiwand.com/en/Multiple_inheritance#/The_diamond_problem\">Diamond problem</a>.</div>\n", |
| 254 | + "<div class=\"alert alert-warning\">Now that you have discovered inheritance, you might be tempted to use it everywhere. Lots of very subtle mistakes can be introduced by abusing inheritance or using it in the wrong way: see for example <a href=\"https://softwareengineering.stackexchange.com/a/238184/15297\">this amusing story</a>, which explains the <a href=\"https://www.wikiwand.com/en/Liskov_substitution_principle\">(Barbara) Liskov substitution principle</a>, and this article about <a href=\"http://www.thedigitalcatonline.com/blog/2014/08/20/python-3-oop-part-3-delegation-composition-and-inheritance/\">composition and inheritance</a>.</div>" |
| 255 | + ] |
| 256 | + }, |
| 257 | + { |
| 258 | + "cell_type": "markdown", |
| 259 | + "metadata": {}, |
| 260 | + "source": [ |
| 261 | + "### More special methods\n", |
| 262 | + "\n", |
| 263 | + "https://docs.python.org/3/reference/datamodel.html" |
| 264 | + ] |
| 265 | + }, |
| 266 | + { |
| 267 | + "cell_type": "code", |
| 268 | + "execution_count": null, |
| 269 | + "metadata": {}, |
| 270 | + "outputs": [], |
| 271 | + "source": [] |
| 272 | + } |
| 273 | + ], |
| 274 | + "metadata": { |
| 275 | + "kernelspec": { |
| 276 | + "display_name": "Python 3", |
| 277 | + "language": "python", |
| 278 | + "name": "python3" |
| 279 | + }, |
| 280 | + "language_info": { |
| 281 | + "codemirror_mode": { |
| 282 | + "name": "ipython", |
| 283 | + "version": 3 |
| 284 | + }, |
| 285 | + "file_extension": ".py", |
| 286 | + "mimetype": "text/x-python", |
| 287 | + "name": "python", |
| 288 | + "nbconvert_exporter": "python", |
| 289 | + "pygments_lexer": "ipython3", |
| 290 | + "version": "3.7.1" |
| 291 | + } |
| 292 | + }, |
| 293 | + "nbformat": 4, |
| 294 | + "nbformat_minor": 2 |
| 295 | +} |
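
The code cells in this notebook are committed empty, so the `User` class the text describes never appears in the diff. The following is only a sketch of what those cells might contain, not the author's code. It assumes that the "trap" in the exercise is a default argument evaluated once at definition time, and that the read-only pattern applies to both `name` and `signup_date`; the sample values `"Ada"` and `"Grace"` are made up for illustration.

```python
import datetime as dt


class User:
    """Hypothetical model of a product user, following the notebook's prose."""

    def __init__(self, name, signup_date=None):
        # Assumed "trap": writing signup_date=dt.datetime.now() in the
        # signature would evaluate the default once, when the method is
        # defined, so every user would share the same timestamp.
        # Using None and resolving it at call time avoids that.
        if signup_date is None:
            signup_date = dt.datetime.now()
        # Leading underscore: "protected" by convention only.
        self._name = name
        self._signup_date = signup_date

    @property
    def name(self):
        # Getter with a public name; no setter is defined, so it is read-only.
        return self._name

    @property
    def signup_date(self):
        return self._signup_date

    def __repr__(self):
        # Overrides the default representation (class name + memory address).
        return f"User(name={self._name!r}, signup_date={self._signup_date!r})"


user1 = User("Ada", dt.datetime(2019, 4, 8))
user2 = User("Grace")  # signup_date falls back to the current time

try:
    user1.name = "Someone else"  # no setter, so this raises AttributeError
except AttributeError:
    read_only = True
else:
    read_only = False

print(user1)
print(read_only)
```

Note that `user1._name` is still reachable and assignable; the leading underscore and the `@property` getter only signal intent, in line with the statement that Python has no private attributes.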