2. Idioms

Dataflow

1. Chained comparison operators

if x <= y and y <= z:
  print('ok')

Better

if x <= y <= z:
  print('ok')
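One practical difference between the two forms is that the chained comparison evaluates the middle operand only once. A minimal sketch, assuming a hypothetical helper middle() (not part of the original) that records every call:

```python
calls = []

def middle():
    # Hypothetical helper: records each call so evaluations can be counted.
    calls.append('called')
    return 5

x, z = 1, 10

# Chained form: middle() is evaluated once.
chained = x <= middle() <= z
chained_calls = len(calls)

calls.clear()

# Expanded form: middle() is evaluated twice.
expanded = x <= middle() and middle() <= z
expanded_calls = len(calls)

print(chained, chained_calls)    # True 1
print(expanded, expanded_calls)  # True 2
```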

2. Ternary operator

value = 0
if cond:
  value = 1

Better

value = 1 if cond else 0

Intuitively, this reads like a piecewise definition in maths: f(x) = |x| = x if x > 0 else -x

3. `or` operator

if x:
  y = x
else:
  y = 'fallback'

Better: use or

y = x or 'fallback'

or returns its first operand if that operand is truthy, and its second operand otherwise. Examples:

'' or 'default' # 'default'
0 or 1 # 1
None or 0 # 0
[] or [3] # [3]
None or [] # []
False or 0 # 0
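Because or falls back on every falsy value, it can discard a legitimate 0 or empty string. The sketch below contrasts it with an explicit None check; the function names and the default of 50 are hypothetical, chosen only for illustration:

```python
def page_size_or(requested):
    # `or` treats 0 like a missing value, so a real 0 is replaced.
    return requested or 50

def page_size_none_check(requested):
    # Only None triggers the fallback; 0 is kept.
    return 50 if requested is None else requested

print(page_size_or(None), page_size_none_check(None))  # 50 50
print(page_size_or(0), page_size_none_check(0))        # 50 0
```

Use or when every falsy value should trigger the fallback, and the explicit check when only None should.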

Check existence in a collection

if city == 'Nairobi' or city == 'Kampala' or city == 'Lagos':
  found = True

Better: use in keyword

city = 'Nairobi'
found = city in {'Nairobi', 'Kampala', 'Lagos'}

Here we used a set of cities, though we could also have used

a tuple, ('Nairobi', 'Kampala', 'Lagos'), or
a list ['Nairobi', 'Kampala', 'Lagos']

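A set is advantageous when the number of cities is very large, because a membership test on a set takes constant time on average, while a tuple or list is scanned element by element. In summary, use in where possible: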

Contains: if x in items
Iteration: for x in items
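A short sketch of both uses of in, reusing the city names from above. The variable names are illustrative; the dict is built here only to show that in tests keys, not values:

```python
cities = ['Nairobi', 'Kampala', 'Lagos']
city_set = set(cities)  # hashed lookups instead of a linear scan

print('Lagos' in cities, 'Lagos' in city_set)  # True True
print('Accra' in city_set)                     # False

# On a dict, `in` tests the keys, not the values.
index_of = {city: i for i, city in enumerate(cities)}
print('Kampala' in index_of)  # True
print(1 in index_of)          # False

# Iteration uses the same keyword.
upper = [city.upper() for city in cities]
print(upper)  # ['NAIROBI', 'KAMPALA', 'LAGOS']
```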

Concatenating strings

sentence = ['this','is','a','sentence']
sentence_str = ''
for word in sentence:
  sentence_str += word + ' '
sentence_str = sentence_str[:-1]
# 'this is a sentence'

The code above uses Shlemiel the painter's algorithm and is accidentally quadratic 👎, because each += may copy the whole string built so far. Instead, use join:

' '.join(sentence)
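join accepts only strings, so other items need converting first. A small sketch (the numbers list is an illustrative addition):

```python
sentence = ['this', 'is', 'a', 'sentence']
joined = ' '.join(sentence)
print(joined)  # this is a sentence

# Non-string items must be converted before joining.
numbers = [1, 2, 3]
csv_line = ', '.join(str(n) for n in numbers)
print(csv_line)  # 1, 2, 3
```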

Looping

Simple Looping

for i in range(len(my_list)):
  print(my_list[i])

Better 👇

for elem in my_list:
  print(elem)

Looping over a collection with indices

for i in range(len(my_list)):
  print(i, my_list[i])

Better: use enumerate

for idx, element in enumerate(my_list):
  print(idx, element)

enumerate returns an iterator that yields (index, element) pairs lazily.
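Two details worth knowing, sketched with an illustrative list: enumerate takes an optional start argument, and, being an iterator, it can be consumed only once.

```python
my_list = ['a', 'b', 'c']

# Count from 1 instead of 0.
numbered = list(enumerate(my_list, start=1))
print(numbered)  # [(1, 'a'), (2, 'b'), (3, 'c')]

# An enumerate object is exhausted after one pass.
pairs = enumerate(my_list)
first_pass = list(pairs)
second_pass = list(pairs)
print(first_pass)   # [(0, 'a'), (1, 'b'), (2, 'c')]
print(second_pass)  # []
```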

Looping backwards

colors = ['red', 'green', 'blue', 'yellow']

for i in range(len(colors)-1, -1, -1):
  print(colors[i])

Better: use slicing [::-1] (note that this builds a reversed copy of the list)

for color in colors[::-1]:
  print(color)

Even Better: use reversed 👌. It returns an iterator.

for color in reversed(colors):
  print(color)
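The difference between the two approaches can be sketched as follows: the slice builds a new list, while reversed walks the original list backwards without copying it.

```python
colors = ['red', 'green', 'blue', 'yellow']

reversed_copy = colors[::-1]     # a new list holding all four items
lazy_reversed = reversed(colors) # an iterator over the original list

print(isinstance(reversed_copy, list))  # True
print(isinstance(lazy_reversed, list))  # False

result = list(lazy_reversed)
print(result)  # ['yellow', 'blue', 'green', 'red']
print(colors)  # ['red', 'green', 'blue', 'yellow'] (unchanged)
```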

Looping over two collections

names = ['raymond', 'rachel', 'matthew']
colors = ['red', 'green', 'blue', 'yellow']

n = min(len(names), len(colors))
for i in range(n):
  print(names[i], '--->', colors[i])

Better: use zip

for name, color in zip(names, colors):
  print(name, '--->', color)

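zip also returns an iterator. It stops as soon as the shortest input is exhausted, which is why the min(len(...)) bookkeeping is no longer needed.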
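With the lists above, zip silently drops the unmatched 'yellow'. If the leftover items matter, itertools.zip_longest pads the shorter input instead. A sketch, where the fill value '?' is an arbitrary choice for illustration:

```python
from itertools import zip_longest

names = ['raymond', 'rachel', 'matthew']
colors = ['red', 'green', 'blue', 'yellow']

paired = list(zip(names, colors))
print(paired)
# [('raymond', 'red'), ('rachel', 'green'), ('matthew', 'blue')]

# zip_longest keeps going until the longest input is exhausted.
padded = list(zip_longest(names, colors, fillvalue='?'))
print(padded[-1])  # ('?', 'yellow')
```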

Make (an iterable of) bigrams of items in iterable: zip(mylist, mylist[1:])

words = 'A girl has no name'.split()
bigrams = list(zip(words, words[1:]))
# bigrams is [('A', 'girl'), ('girl', 'has'), ('has', 'no'), ('no', 'name')]

Transpose an iterable of tuples: zip(*data)

data = [(1, 2, 3), (4, 5, 6)]
transposed = list(zip(*data))
# transposed is [(1, 4), (2, 5), (3, 6)]

Summary: enumerate, zip and reversed are built-in helpers that return iterators. They cover many common looping cases and make code more readable.

Dict's default value: `get`

1. Default value for item not in dictionary

color_weights = {'blue': 1, 'green': 2, 'red': 3}
yellow_weight = color_weights['yellow'] if 'yellow' in color_weights else -1

Better: use get

yellow_weight = color_weights.get('yellow', -1)
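A brief sketch of how get behaves, using the same dictionary: without a second argument the default is None, and a lookup never adds the missing key.

```python
color_weights = {'blue': 1, 'green': 2, 'red': 3}

with_default = color_weights.get('yellow', -1)
without_default = color_weights.get('yellow')

print(with_default)               # -1
print(without_default)            # None
print('yellow' in color_weights)  # False, get() did not insert the key
```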

2. Counting with dictionaries

colors = ['red', 'green', 'red', 'blue', 'green', 'red']

d = {}
for color in colors:
    if color not in d:
        d[color] = 0
    d[color] += 1

# {'blue': 1, 'green': 2, 'red': 3}

Better

d = {}
for color in colors:
    d[color] = d.get(color, 0) + 1

Use collections 💪

from collections import Counter

Counter(colors)
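A Counter behaves like a dict whose missing keys count as 0, and it can rank items directly. A short sketch with the same colors list:

```python
from collections import Counter

colors = ['red', 'green', 'red', 'blue', 'green', 'red']
counts = Counter(colors)

print(counts['red'])          # 3
print(counts['yellow'])       # 0, missing keys do not raise KeyError
print(counts.most_common(2))  # [('red', 3), ('green', 2)]
```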

Grouping

Use defaultdict from collections to group items by a key without first checking whether the key exists.
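The original gives no example here, so the following is only a sketch under assumed data: it groups an illustrative list of names by their length.

```python
from collections import defaultdict

# Illustrative data, not from the original page.
names = ['raymond', 'rachel', 'matthew', 'roger', 'betty']

by_length = defaultdict(list)  # missing keys start as an empty list
for name in names:
    by_length[len(name)].append(name)

print(dict(by_length))
# {7: ['raymond', 'matthew'], 6: ['rachel'], 5: ['roger', 'betty']}
```

The same loop with a plain dict would need by_length.setdefault(len(name), []).append(name) or an explicit membership check.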

`any` function

Let's simulate an experiment: shuffle n cards, each with a unique label in 0...n-1, and then check whether any card at position k carries the label k.

We will use the sample function from the random module for that. sample draws items without replacement, so sample(range(n), n) is equivalent to shuffling the list 0...n-1.

from random import sample
n = 10  # number of cards; example value
idx_labels = list(enumerate(sample(range(n), n)))  # a list, so it can be iterated more than once

To check the experiment:

for idx, label in idx_labels:
  if idx == label:
    print(True)
    break
else:
  # runs only if the loop finished without a break
  print(False)

Better: use any

if any(idx == label for idx, label in idx_labels):
  print(True)
else:
  print(False)

We could also have passed a list instead of a generator: any([idx == label for idx, label in idx_labels]). The generator expression used above is more memory-efficient, and it lets any stop at the first match instead of evaluating every pair.
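The short-circuiting can be made visible with a sketch. The helper is_fixed_point and the hand-picked labels list are hypothetical, used here instead of a random shuffle so the outcome is deterministic:

```python
checked = []

def is_fixed_point(idx, label):
    # Hypothetical helper: records which positions were inspected.
    checked.append(idx)
    return idx == label

labels = [2, 1, 0, 3]  # hand-picked; position 1 holds label 1

found = any(is_fixed_point(idx, label) for idx, label in enumerate(labels))

print(found)    # True
print(checked)  # [0, 1], positions 2 and 3 were never inspected
```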

The `with` statement

foo = open('/tmp/foo', 'w')
try:
  foo.write('sometext')
finally:
  foo.close()

The code 👆 is equivalent to the code 👇. Use with:

with open('/tmp/foo', 'w') as handle:
  handle.write('sometext')
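Like the try/finally version, with closes the file even when the body raises. A sketch, assuming a temporary directory in place of /tmp/foo and a simulated failure:

```python
import os
import tempfile

# Assumption: a throwaway path instead of /tmp/foo.
path = os.path.join(tempfile.mkdtemp(), 'foo')

try:
    with open(path, 'w') as handle:
        handle.write('sometext')
        raise ValueError('simulated failure')
except ValueError:
    pass

print(handle.closed)  # True, the file was closed despite the exception

with open(path) as handle_in:
    content = handle_in.read()
print(content)  # sometext
```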

Comprehensions

squares = list(map(lambda x: x**2, range(1,10)))
even_squares = list(map(lambda x: x**2, filter(lambda x: x % 2 == 0, range(1,10))))

List comprehensions 👇 are more readable and pythonic! 🤘

squares = [x**2 for x in range(1,10)]
even_squares = [x**2 for x in range(1,10) if x % 2 == 0]
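The same syntax also builds dicts and sets, and without brackets it becomes a generator expression. A brief sketch extending the squares example (the variable names are illustrative):

```python
# Dict comprehension: number -> square
square_of = {x: x**2 for x in range(1, 5)}
print(square_of)  # {1: 1, 2: 4, 3: 9, 4: 16}

# Set comprehension: distinct last digits of the squares
last_digits = {x**2 % 10 for x in range(1, 10)}
print(sorted(last_digits))  # [1, 4, 5, 6, 9]

# Generator expression: no intermediate list is built
total = sum(x**2 for x in range(1, 10))
print(total)  # 285
```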

Specialized tool beats a general purpose tool

Specialized tools are usually faster or more accurate than general-purpose tools:

math.sqrt(x) is more accurate than x ** 0.5

math.log2(x) is exact for powers of two, whereas math.log(x, 2) can be off by a rounding error

from math import log, log2
all(log(2 ** x, 2) == x for x in range(100)) # False
all(log2(2 ** x) == x for x in range(100)) # True
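Another instance of the same principle, not from the original list: math.isqrt is exact for integers of any size, while going through floating point is not. The sketch picks an odd root just above 2**53, which no float can represent exactly:

```python
from math import isqrt, sqrt

root = 2**53 + 1  # odd, and not representable exactly as a float
n = root * root

print(isqrt(n) == root)      # True, integer arithmetic throughout
print(int(sqrt(n)) == root)  # False, the float result cannot equal an odd number this large
```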

In PySpark, key_value_rdd.countByKey() is much faster than key_value_rdd.groupByKey().mapValues(len).collect() because far less data is shuffled.

Direct links

Iterators

Bell Curve
