Writing Better Python Code

Written by Dan Sackett on October 21, 2014

In my opinion, a good programming language is one that has best practices that are well defined and globally followed.

It makes reading and writing code much easier because you know what to expect. I love Python because of this. Starting with the PEP 8 style guide, we have common conventions on how to write and structure our code. I think getting into Python is actually quite simple because of these conventions. The real challenge with Python is mastering the more powerful features that make your code even better.

Today I wanted to show off some of the cool things that can bring your Python code from intermediate level to advanced.

Use Join to Concatenate Lists

When you first learn Python, one thing that you probably pick up is the use of a for loop. Python for loops are actually quite elegant with the for x in y syntax. One thing that you might learn in an early tutorial perhaps is joining a bunch of strings together to form one:

animals = ['dog', 'cat', 'bird', 'fish', 'snake']

result = ''
for animal in animals:
    if animal is animals[-1]:
        result += animal
    else:
        result += animal + ', '

All we're doing here is taking each animal in our list and adding them together to form a new string which results in dog, cat, bird, fish, snake.

While this works, it isn't as concise as we would like. We have a for loop and an if/else statement, and in the end it's 7 lines of code to write. It is also fragile: the last-item check can misfire if the final animal also appears earlier in the list. Let's see how an efficient Python programmer would do this:

animals = ['dog', 'cat', 'bird', 'fish', 'snake']
result = ', '.join(animals)

If you've never seen this before, learn it. We can ditch the loop altogether and use a string method to concatenate our list into our desired result of dog, cat, bird, fish, snake.

How does this work?

Well, we start off with a string. Our string in this case is ', '. While this seems odd, we can think of this string as the glue that holds our list items together. We want a comma and a space between each pair of items. We then call the .join() method on this string. There are tons of string methods we can use, but this is one of the more fun tricks. We pass our list to join and, like magic, it just works.

Notice as well that we don't get a trailing comma and space. The glue only ever goes between items, so nothing is added after the last one.
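To see the glue idea on its own, here is a small sketch. The variable names are mine, and I've written it so it runs the same on Python 2 and 3. It also shows one thing that trips people up: join only accepts strings, so anything else has to be converted first.

```python
animals = ['dog', 'cat', 'bird']

# The string we call join() on is placed between items, never after the last one.
comma_separated = ', '.join(animals)
dashed = ' - '.join(animals)

# join() only accepts strings, so convert other types first.
numbers = [1, 2, 3]
number_string = ', '.join(str(n) for n in numbers)

# Edge cases: a single item gets no glue, an empty list gives an empty string.
single = ', '.join(['dog'])
empty = ', '.join([])

print(comma_separated)
print(dashed)
print(number_string)
```

Passing the list of integers straight to join would raise a TypeError, which is why the str() conversion is there.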

This brings us from 7 lines of code with checks to 2 lines of concise code. This is fun, but what if we want to do something custom with this list when we print it? For instance, what if we want to create a sentence like "Choose: dog, cat, bird, fish, or snake"?

Let's try this:

animals = ['dog', 'cat', 'bird', 'fish', 'snake']
print 'Choose: {0}, or {1}'.format(', '.join(animals[:-1]), animals[-1])

Not too shabby, right? We still have two lines and it's still readable. We take advantage of Python's support for negative numbers in slices and indexes here. The slice animals[:-1] gives us every list member except the last one, which join glues together, and animals[-1] gives us the last item separately.

That's cool too, but what if we want to call a function on each item before we print it? We can't do that in two lines, can we?

Well, actually we can.

animals = ['dog', 'cat', 'bird', 'fish', 'snake']
print 'Choose: {0}, or {1}'.format(', '.join(a.title() for a in animals[:-1]), animals[-1].title())

How is this possible?

Since join takes any iterable as an argument, we can hand it a generator expression, which looks like a list comprehension without the square brackets. That lets us call a method on each item as it is joined, and voila! In this case, we title case each item and get Choose: Dog, Cat, Bird, Fish, or Snake.

If that's too much in one line, you could also do this within the list itself. We could have just as easily done:

animals = [a.title() for a in ['dog', 'cat', 'bird', 'fish', 'snake']]
print 'Choose: {0}, or {1}'.format(', '.join(animals[:-1]), animals[-1])

Both ways work and both are much more concise than writing the loops by hand.
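One caveat with the one-liners above: they assume the list has at least a few items. With a single animal you'd get "Choose: , or dog", and an empty list raises an IndexError. If that matters to you, a small helper can cover those cases. This is only a sketch, and human_join is a name I made up for it, not something built into Python.

```python
def human_join(items, conjunction='or'):
    """Join items into a phrase like 'a, b, or c' (hypothetical helper)."""
    items = list(items)
    if len(items) == 0:
        return ''
    if len(items) == 1:
        return items[0]
    if len(items) == 2:
        # Two items need no comma at all: 'dog or cat'
        return '{0} {1} {2}'.format(items[0], conjunction, items[1])
    # Three or more: glue all but the last, then add the conjunction.
    return '{0}, {1} {2}'.format(', '.join(items[:-1]), conjunction, items[-1])


animals = ['dog', 'cat', 'bird', 'fish', 'snake']
print('Choose: ' + human_join(animals))
```

It is more than two lines, but the join and slicing tricks are the same ones we just covered.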

Get Your List Items Indexes with Enumerate

One thing that's different about Python's for loop is that it really acts as a foreach loop. There's no i = 0 and i++ nonsense. For loops simply iterate over the items of a collection and let you work with each one. This can often throw a wrench into a programmer's workflow, though, when they want the index of each list item while iterating.

How can we do this if we don't set our variable i?

Enumerate the list!

my_list = ['zero', 'one', 'two', 'three', 'four']

for index, value in enumerate(my_list):
    print '{0}: {1}'.format(index, value)

In this example, we're using the enumerate function which takes our list as the sole argument. Notice also that we are identifying an index and a value in our for loop. This results in the following:

0: zero
1: one
2: two
3: three
4: four

As you can see, we have proper indexes. What does the enumerate function give us exactly? Let's check it out:

print enumerate(my_list)
# <enumerate object at 0x2847870>

print list(enumerate(my_list))
# [(0, 'zero'), (1, 'one'), (2, 'two'), (3, 'three'), (4, 'four')]

So the enumerate function returns an enumerate object. This object is lazy, somewhat like a generator, in that it produces one value at a time. When we run list() on it, we see that what we're actually getting back is a series of tuples where the index is grouped with the item.

This is why we can use this in a for loop like we did above. The enumerate object pairs an index with the item and when we loop through it we can access both.

Enumerating lists is the perfect way to keep your for loops clean when you still need an index.
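One extra that's worth knowing: enumerate accepts an optional second argument for the starting number, which is handy when you want human-friendly numbering that begins at 1. Here's a quick sketch with a made-up steps list:

```python
steps = ['mix', 'bake', 'eat']

# The optional second argument sets the starting number (the default is 0).
numbered = ['{0}. {1}'.format(position, step)
            for position, step in enumerate(steps, 1)]

for line in numbered:
    print(line)
```

Keep in mind that the number is then no longer the list index, so don't use it to index back into steps.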

Accessing Values in a Dictionary

Dictionaries are awesome. They let you create maps to your data and access them with an easy API. How many times have you tried this though?

my_dict = {'name': 'Dan', 'gender': 'Male', 'age': 25}

my_dict['height']

Traceback (most recent call last):
  File "<input>", line 1, in <module>
KeyError: 'height'

The KeyError tells us that there is no such key in our dictionary, which is true. Seeing this, novice Python programmers may write a try / except block to avoid it:

my_dict = {'name': 'Dan', 'gender': 'Male', 'age': 25}

try:
    print my_dict['height']
except KeyError:
    print 0

Try / except blocks are great in some regards and can be really useful, but in this case we're only handling a simple KeyError. Luckily, the advanced Python programmer knows a better way to solve this.

my_dict = {'name': 'Dan', 'gender': 'Male', 'age': 25}

my_dict.get('height', 0)

This allows us to forget about catching KeyErrors and focus on what we want and what we get back if it isn't there. The get method takes two arguments, the second being optional. The first is the key we want the value for. In our case, we want "height". The second is the value to return if the key doesn't exist. If you specify this value like I did, get returns it whenever the key is missing. If you don't specify it, get returns None.

I reach for this whenever I can't rely on a key being set, and it saves me every time. Get in the habit of accessing optional dictionary items with this syntax.
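To make the three cases concrete, here is a short sketch using the same dictionary. The 'weight' key and the 'Nobody' default are only there for illustration.

```python
my_dict = {'name': 'Dan', 'gender': 'Male', 'age': 25}

height = my_dict.get('height', 0)      # missing key, so the default comes back
weight = my_dict.get('weight')         # no default given, so we get None
name = my_dict.get('name', 'Nobody')   # key exists, so the default is ignored

# get() only reads; it never adds the missing key to the dictionary.
height_was_added = 'height' in my_dict

print(height)
print(weight)
print(name)
```

One thing to watch for: a default can hide the difference between "the key is missing" and "the key is there and really is 0". If that difference matters, check with in first.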

Build Dictionaries Smarter

Continuing with dictionary related tricks, let's see a common programming pattern:

scores = {}
attempts = [('dan', 87), ('erik', 95), ('jason', 79), ('erik', 97), ('dan', 100)]

for (name, score) in attempts:
    if name in scores:
        scores[name].append(score)
    else:
        scores[name] = [score]

In this example, we have people who took a test a varying number of times. We want to create a dictionary from these attempts with the name as the key and a list of scores as the value. We loop through the attempts and check if the key is already in our dictionary. If it is, we append the score to the list. If not, we create the key and initialize the list with that first score.

This is a very common thing to do in programming and this is a decent solution.

Python allows us to condense our code a little though. Let's see the first way to simplify this:

scores = {}
attempts = [('dan', 87), ('erik', 95), ('jason', 79), ('erik', 97), ('dan', 100)]

for (name, score) in attempts:
    scores.setdefault(name, []).append(score)

Looking at the above, we replaced our if/else statement with a method that manages this for us. The setdefault method takes a key and a default value as arguments. As we go through our loop, if the key is already set then setdefault simply returns the existing list and we move on to the append portion of the expression. If the key isn't set, setdefault creates the entry with an empty list and returns that list, which we then append to.

This is a great way to handle creating new dictionaries or appending to old ones.
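If the chained call feels a bit magical, it helps to look at what setdefault hands back in each case. A minimal sketch, with a scores dictionary I've pre-filled for illustration:

```python
scores = {'dan': [87]}

# Key exists: setdefault leaves it alone and returns the stored list.
existing = scores.setdefault('dan', [])

# Key is missing: setdefault stores the default and returns that same object.
created = scores.setdefault('erik', [])
created.append(95)

print(existing)
print(scores['erik'])
```

Because the returned list is the one stored in the dictionary, appending to it updates the dictionary too. Note that the default (here a new empty list) gets built on every call, even when the key already exists.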

Of course, we don't need to set a list as our default entry. We can just as easily do:

counts = {}
attempts = [('dan', 87), ('erik', 95), ('jason', 79), ('erik', 97), ('dan', 100)]

for (name, score) in attempts:
    counts.setdefault(name, 0)
    counts[name] += 1

In this example, if our key doesn't exist then we set the value to 0. We add one for each entry, which results in a dictionary with the number of times each person attempted the test.

Not bad. We can do one more iteration on this.

from collections import defaultdict

counts = defaultdict(list)
attempts = [('dan', 87), ('erik', 95), ('jason', 79), ('erik', 97), ('dan', 100)]

for (name, score) in attempts:
    counts[name].append(score)

In this example, we import a class called defaultdict from Python's collections module. A defaultdict takes a factory, such as the list type, which it calls with no arguments to build the value for any missing key. Now, instead of starting from a plain empty dictionary, we have an object doing that work for us. Since it defaults to an empty list for new entries, we can append to any key in our loop right away and it just works.

One thing to note, though: we do not get a standard dictionary back when we inspect counts. We get back defaultdict(<type 'list'>, {'dan': [87, 100], 'jason': [79], 'erik': [95, 97]}). We can do the same things we do with a dictionary, but this object also remembers what new keys will be created with.
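There's one gotcha that follows from this, and it's easier to show than to describe. The sketch below uses a small scores example of my own: merely reading a missing key with square brackets creates it.

```python
from collections import defaultdict

scores = defaultdict(list)
scores['dan'].append(87)

# A defaultdict is a dict subclass, so normal dictionary operations work.
is_a_dict = isinstance(scores, dict)

# Gotcha: just reading a missing key with [] creates it.
scores['erik']
keys_after_lookup = sorted(scores)

# get() and "in" do not trigger the default factory.
missing = scores.get('jason')
jason_present = 'jason' in scores

# Convert to a plain dict when you no longer want keys created on access.
plain = dict(scores)

print(keys_after_lookup)
print(plain)
```

So if you only want to peek at a key, use get or in, and convert with dict() before handing the result to code that expects a KeyError on missing keys.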

As we did above, we can also use defaultdict to count with integers.

from collections import defaultdict

counts = defaultdict(int)
attempts = [('dan', 87), ('erik', 95), ('jason', 79), ('erik', 97), ('dan', 100)]

for (name, score) in attempts:
    counts[name] += 1

Our defaultdict initializes each new key with an integer (0, since that is what int() returns), which allows us to add to the value right away.

These are some of the more efficient ways to create dictionaries.

Building Dictionaries from Lists

Another cool thing that we can do with dictionaries is build them based on two lists. For instance, let's say we have a list of animals and a list of animal noises:

animals = ['dog', 'cat', 'bird', 'fish', 'snake']
noises = ['woof', 'purr', 'squawk', 'blub blub', 'hiss']

If we wanted to join these two together then we could do that two ways. First, we could take a naive approach:

joined = defaultdict(str)
for index, animal in enumerate(animals):
    joined[animal] = noises[index]

In this example, we get our index and then as we loop through our first list we take that index and grab the matching item in our second list and pair them. This results in:

print dict(joined)
# {'fish': 'blub blub', 'cat': 'purr', 'bird': 'squawk', 'snake': 'hiss', 'dog': 'woof'}

While this approach works, it's not the best one. Python has a number of awesome built-in functions and one of them is zip(). The zip function takes any number of iterables (lists, in our case) and combines them into a list of tuples, pairing up the items that share an index.

print zip(animals, noises)
# [('dog', 'woof'), ('cat', 'purr'), ('bird', 'squawk'), ('fish', 'blub blub'), ('snake', 'hiss')]

print dict(zip(animals, noises))
# {'bird': 'squawk', 'fish': 'blub blub', 'dog': 'woof', 'snake': 'hiss', 'cat': 'purr'}

As we see from the first run of the zip function, our list of tuples pairs things up based on indexes. We then use the dict() function to transform our list of tuples into a dictionary and we're finished. No loops needed!
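One behavior to be aware of: zip stops as soon as the shortest input runs out. Here's a sketch with the noises list deliberately cut one item short (my change, for illustration):

```python
animals = ['dog', 'cat', 'bird']
noises = ['woof', 'purr']  # one noise short on purpose

# zip stops at the shortest input, so 'bird' is silently dropped.
paired = dict(zip(animals, noises))

# A simple guard if the lists are expected to line up.
lengths_match = len(animals) == len(noises)

print(paired)
print(lengths_match)
```

No error is raised, so if your lists are supposed to be the same length it's worth checking that yourself. Also, on Python 3 zip returns a lazy iterator rather than a list, but wrapping it in dict() or list() as above works on both versions.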

List and Dictionary Unpacking

Take a look at the following code.

def my_function(*args, **kwargs):
    pass

Have you seen something like that before? If you've worked with Django or a bigger Python application, chances are good that you have. Informally, these are known as splats. When used as parameters to a function, they soak up any extra arguments you pass in.

What's the difference between the two?

For starters, the names "args" and "kwargs" can be anything you want them to be. It's best to stick with these names, though, as the rest of the Python community uses them. Args stands for arguments, meaning positional arguments. Kwargs stands for keyword arguments. It's easiest to describe these with an example:

def my_function(*args, **kwargs):
    print 'Args: {0}'.format(args)
    print 'Kwargs: {0}'.format(kwargs)

my_function('hello', 'world', my_keyword='my_value')

# Args: ('hello', 'world')
# Kwargs: {'my_keyword': 'my_value'}

Looking at the output, we can see that args is a tuple that consists of the first two arguments. Kwargs is a dictionary pairing our keyword with its value. What's happening is that the * and ** markers collect the passed-in values. This lets a function accept whatever input it is given. *args catches all positional arguments (the bare ones), while **kwargs catches all keyword arguments (the ones passed as name=value).

Hopefully that makes sense.
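If it helps, here is one more sketch showing how the splats behave next to an ordinary named parameter. The describe function is made up for this example; it just returns what it received so we can look at it.

```python
def describe(first, *args, **kwargs):
    """Hypothetical function: return each group of arguments as received."""
    return first, args, kwargs


# 'a' fills the named parameter, the rest spill into args and kwargs.
everything = describe('a', 'b', 'c', x=1)

# With nothing extra passed, args is an empty tuple and kwargs an empty dict.
just_first = describe('a')

print(everything)
print(just_first)
```

Named parameters are filled first, and the splats only pick up whatever is left over.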

Now the point I was trying to make was we can use this notation outside of a function signature. Let's see *args first.

my_list = ['one', 'two', 'three', 'four']

print '{0}, {1}, {2} GO!'.format(*my_list)

# one, two, three GO!

This is what's known as unpacking. With the format function, we define indexes in our string and then we can pass *my_list to it, and the function will use the list items as the values for those indexes. Notice that we have four items in our list but we only define three indexes in our string. This is fine as long as the string doesn't reference an index the list doesn't have.
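Unpacking works with your own functions too, not just format. The sketch below uses a made-up volume function. It also shows a difference from format: an ordinary function with a fixed signature won't quietly ignore extra items.

```python
def volume(length, width, height):
    """Hypothetical function with exactly three positional parameters."""
    return length * width * height


dimensions = [2, 3, 4]

# Equivalent to calling volume(2, 3, 4)
result = volume(*dimensions)

# Unlike format(), an ordinary function rejects extra items.
try:
    volume(*[2, 3, 4, 5])
    too_many_ok = True
except TypeError:
    too_many_ok = False

print(result)
print(too_many_ok)
```

So the list you unpack needs to match the number of parameters the function expects, unless the function itself takes *args.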

What gets cooler is when we work with dictionaries.

my_dict = {'name': 'Dan', 'gender': 'Male', 'age': 25}

print 'Hello, my name is {name}! I am a {age} year old {gender}.'.format(**my_dict)

# Hello, my name is Dan! I am a 25 year old Male.

First of all, notice that we aren't using indexes anymore and instead we're using actual keys. Now notice that we pass **my_dict into the format function. Since we are dealing with keyword arguments, we use two asterisks. This unpacks our dictionary into keyword arguments, which format matches against the names in our string.

I love this and I think it makes writing format strings much simpler if everything is confined in the same data type. One place where this is especially handy is if we're calling a function that takes a config object.

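video_config = {
    'height': 600,
    'width': 800,
    'url': 'http://my_url.com'
}

# create_video is a hypothetical function that accepts height, width and url
create_video(**video_config)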

This will unpack the dictionary items as keyword arguments to the function and as long as the function is expecting this then we're good to go!
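To make that runnable end to end, here is a sketch with a stand-in create_video function. The function, its default values and the HTML it returns are all assumptions of mine for the sake of the example; the point is only how the dictionary keys line up with the parameter names.

```python
def create_video(url, width=640, height=480):
    """Hypothetical function: build a minimal HTML video tag."""
    return '<video src="{0}" width="{1}" height="{2}"></video>'.format(
        url, width, height)


video_config = {
    'height': 600,
    'width': 800,
    'url': 'http://my_url.com',
}

# Same as create_video(height=600, width=800, url='http://my_url.com')
tag = create_video(**video_config)

print(tag)
```

The order of the keys doesn't matter since they are matched by name. If the dictionary contains a key the function doesn't accept, Python raises a TypeError, so keep the config and the signature in sync.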


Python has a lot of cool tricks up its sleeve and it's good to learn some of them to become more efficient. One of my favorite things ever in Python, list comprehensions, will get its very own post next, so be on the lookout for that.

Have any other fun tricks that save time? I'd love to learn more!

