Differences Between Python 2.7.x and Python 3.x

Written by Dan Sackett on October 30, 2014

Python is a funny language in that there are two communities working with it. There are those that work with Python 2.x and there are others that work with Python 3.x.

If you're new to Python, this can become super confusing. Hopefully today I can shed some light on the main differences between the two Python versions. Specifically, I'll be speaking about Python 2.7 and Python 3.4.

In all honesty, I haven't used Python 3 very much. I learned Python 2.7 and haven't made the full switch to the updated syntax. Luckily, I'm not alone in this. There is a large population of Pythonistas that have yet to move forward for varying reasons. Some people believe Python 3 was a mistake while others have existing projects that would take substantial work to port over to Python 3. Whatever the reason is, it's not a huge concern since Python 2.7 is stable and is just as suitable for the job.

As a bit of history, Python 3 was released in 2008. It is currently in active development and is now up to version 3.4.2. As for Python 2, version 2.7 was released in 2010 as the last major release of the 2.x line. Later versions such as Python 2.7.6 are maintenance releases.

What this means is that Python 2.7 is going to be in the same place for the rest of its life. There won't be new features or new development on it. Python 3 is still moving forward and anything new to Python will be introduced in this version.

So which one should you use?

Well, that depends. As of today, a lot of major packages have been ported to Python 3. As you can see here only a few of the Python super packages have yet to be ported to Python 3. When you're starting a new project, it's smart to know what packages you are going to use and check that website to ensure that you'll have full support.

It's in your best interest to learn Python 3 since it will be the future, but as I said, I still haven't personally made the switch. If anything, take this post as a guide to seeing what's different.

The __future__ Module

Since a number of Python 3 functions are not backported to Python 2, we have a module in Python 2 that allows us to use these new features. This module is __future__. If we want to use a Python 3 feature in our Python 2 code, we can import from this module to use the features like so:

from __future__ import print_function

This will give us the print function which I'll get into next. Some of the other features we can import are division, absolute_import, and unicode_literals.

While these won't make a lot of sense now, as you see more Python 3 these will become more apparent. The important thing to know is that Python 3 functionality is available for Python 2 applications. This is how we can port applications over.

To learn more about the __future__ module, read about it in the Python docs.
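As a rough sketch of what the module offers, we can ask __future__ itself which features it knows about and when each one became standard. The variable names here are just for illustration, and the exact list depends on the interpreter you run it on.

```python
import __future__

# Every feature name that "from __future__ import ..." accepts on this interpreter.
feature_names = list(__future__.all_feature_names)

# Each feature records the release where it became optional
# and the release where it became the default behavior.
division_optional = __future__.division.getOptionalRelease()
division_mandatory = __future__.division.getMandatoryRelease()

print(feature_names)
print(division_optional, division_mandatory)
```

The release tuples are a handy way to check whether a feature still needs to be imported on the version you are targeting.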

Print Function

One of the things you'll run into right away is that printing in Python 3 is different. In Python 2, we are used to this:

# Python 2.7
>>> print 'Hello, world!'

It is understood that the print statement acts as a function and prints the items following it. To some people, this is confusing. Why not make it an actual function like most other features in Python? Python 3 answers this question with the print function.

# Python 3.4
>>> print('Hello, World!')

This makes a lot of sense. We contain what we want to print and it now acts like every other function. What happens when we try to use the old print statement in Python 3 though?

# Python 3.4
>>> print 'Hello, World!'
  File "<stdin>", line 1
    print 'Hello, World!'
                        ^
SyntaxError: Missing parentheses in call to 'print'

That's right, it errors out. Of the Python 3 features, this one makes the most sense to me. It keeps everything uniform. Before, the print statement seemed like an inconsistency.

Remember, we can use this in Python 2 with from __future__ import print_function.
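One practical payoff of print being a real function is that its behavior is controlled through ordinary keyword arguments. Here is a small sketch using the sep, end, and file parameters. The StringIO buffer is only there so the output can be inspected instead of going to the screen.

```python
import io

# Capture printed output in memory instead of writing to stdout.
buf = io.StringIO()

# sep goes between the items, end replaces the default newline,
# and file chooses where the text is written.
print('a', 'b', 'c', sep='-', end='!\n', file=buf)

captured = buf.getvalue()
print(repr(captured))
```

With the old print statement, each of these needed its own special syntax, such as a trailing comma or the >> redirect.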

Integer Division

Another big difference you may notice quickly is that dividing integers in Python 3 will yield different results than it did in Python 2.

# Python 2.7
>>> 5 / 4
1

In Python 2, we see that integer division returns an integer. For most people doing math, they want to see the full decimal answer to this and not the integer version. To do this in Python 2, we have to do one of the following:

# Python 2.7
>>> float(5) / float(4)
1.25
>>> 5 / float(4)
1.25

We have to explicitly cast our integer to a float to get that float back. Python 3 is a little different.

# Python 3.4
>>> 5 / 4
1.25

Python 3 says that if a number is going to return a float through division then it should do so. We shouldn't need to explicitly state it. Whichever side of the argument you are on, you need to be sure when working with Python 3 you know this. It can throw off tests and values if you're not careful.

What happens if we want to get the whole number back in Python 3?

Well, we have the floor division operator, //, that can do this:

# Python 2 and Python 3
>>> 5 // 4
1

Remember, we can get the Python 3 behavior of / in Python 2 with from __future__ import division.
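One detail worth seeing in code is that // floors the result rather than chopping off the decimals, which only matters once negative numbers show up. A quick sketch in Python 3 (the variable names are just for illustration):

```python
true_quotient = 5 / 4          # true division always gives a float
floor_quotient = 5 // 4        # floor division gives an int for int operands

# Floor division rounds toward negative infinity...
negative_floor = -5 // 4
# ...while int() on a float truncates toward zero.
negative_trunc = int(-5 / 4)

# divmod returns the floor quotient and the remainder together.
quotient, remainder = divmod(5, 4)

print(true_quotient, floor_quotient, negative_floor, negative_trunc)
```

If your Python 2 code relied on / between integers, these are the cases to double check when porting.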

Unicode Strings

In Python 2, we see that strings are strings unless we note that they aren't. For instance, if we wanted to use a unicode string in Python 2 then we need to either typecast it with the unicode() function or denote it as u'string'.

# Python 2.7
>>> type('Hello, world!')
<type 'str'>
>>> type(unicode('Hello, world!'))
<type 'unicode'>
>>> type(u'Hello, world!')
<type 'unicode'>

Because of this, simply printing a string with Unicode characters without casting it will result in the following:

# Python 2.7
>>> print 'string \u03BC'
string \u03BC

Not ideal. In Python 3, this is solved. All strings allow for unicode characters and will print properly.

# Python 3.4
>>> print('string \u03BC')
string μ

This is one of those small features that you really may not notice until you forget the unicode casting in Python 3 and things just work. Still, it's nice to see that support straight out of the box because it saves you from type casting now.
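The flip side in Python 3 is that text and raw bytes are separate types, so you convert between them explicitly with encode() and decode(). A minimal sketch, assuming UTF-8 is the encoding we want:

```python
text = 'string \u03BC'          # str: a sequence of Unicode characters

# Encoding turns text into bytes; the Greek mu takes two bytes in UTF-8.
encoded = text.encode('utf-8')

# Decoding turns the bytes back into text.
decoded = encoded.decode('utf-8')

print(type(text), len(text))
print(type(encoded), len(encoded))
```

The lengths differ because len() counts characters for str and bytes for bytes.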

xrange

In Python 2, we have a function called xrange() which is an efficient way to produce a sequence of integers without consuming a lot of memory. This is of course more efficient than the range() function which creates the whole list in memory. While they yield the same values when looped over, it is generally good practice to prefer xrange in this case.

Python 3 believes that's definitely the case as it has removed the xrange function and moved its functionality into the standard range function. For instance:

# Python 2.7
>>> range(10)
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> xrange(10)
xrange(10)
>>> list(xrange(10))
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

# Python 3.4
>>> range(10)
range(0, 10)
>>> list(range(10))
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> xrange(10)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'xrange' is not defined

In Python 3, a NameError will be thrown if you try to use the old xrange function. Notice also that in Python 3, the range function returns a range object. As I mentioned, Python 2 xrange is the same as Python 3 range. This is one of those best practice decisions that makes your code more efficient under the hood without an added step of using a different function.
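To get a feel for what the Python 3 range object gives us, here is a small sketch. The sizes reported by sys.getsizeof vary between interpreters, so treat them as a rough comparison and not as exact numbers.

```python
import sys

big_range = range(10**6)

# A range only stores start, stop, and step, so it stays tiny
# no matter how many values it represents.
range_size = sys.getsizeof(big_range)
list_size = sys.getsizeof(list(range(1000)))

# It still behaves like a sequence: length, indexing, and membership all work.
length = len(big_range)
sixth_item = big_range[5]
has_last = 999999 in big_range

print(range_size, list_size, length, sixth_item, has_last)
```

When you really do need a list, wrapping the call in list() gets you one.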

Exceptions

Python 2 is very lenient about the format of raising an exception. You can do either of the following and it will work as expected:

# Python 2.7
>>> raise NotImplementedError, "Not implemented!"
Traceback (most recent call last):
  File "<input>", line 1, in <module>
NotImplementedError: Not implemented!

>>> raise NotImplementedError("Not implemented!")
Traceback (most recent call last):
  File "<input>", line 1, in <module>
NotImplementedError: Not implemented!

This flexibility is nice in some cases, but doesn't help provide a best practices example. By preference I like to use the parentheses as if it's a function call. Python 3 believes this as well and actually will throw an error if you use the first version:

# Python 3.4
>>> raise NotImplementedError, "Not implemented!"
  File "<stdin>", line 1
    raise NotImplementedError, "Not implemented!"
                             ^
SyntaxError: invalid syntax

>>> raise NotImplementedError("Not implemented!")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NotImplementedError: Not implemented!

I think this is good to keep a consistent model for how things should be done. Extending this, we will see that handling these exceptions changes as well.

# Python 2.7
>>> try:
...     NameError_Yo
... except NameError, e:
...     print e, 'This works!'
...     
name 'NameError_Yo' is not defined This works!

# Python 3.4
>>> try:
...     NameError_Yo
... except NameError, e:
  File "<stdin>", line 3
    except NameError, e:
                    ^
SyntaxError: invalid syntax

Even before I can finish typing in the interpreter, we get a SyntaxError in Python 3. How do we do this then? We use a syntax that is intuitive. Welcome the as keyword.

# Python 3.4
>>> try:
...     NameError_Yo
... except NameError as e:
...     print(e, 'This works!')
... 
name 'NameError_Yo' is not defined This works!

In Python 3, we say Error as e which is how you would say it in plain English. Again, this change makes some sense considering a lot of the other Python human readable constructs.
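The as form also removes an old ambiguity. In Python 2, except NameError, KeyError: did not catch both exceptions; it caught NameError and bound it to the name KeyError. With as, catching several exception types is a plain tuple. Here is a sketch with a hypothetical classify() helper made up for this post:

```python
def classify(action):
    """Run action() and report which exception type, if any, it raised."""
    try:
        action()
    except (ValueError, KeyError) as e:
        # The tuple lists the types to catch; "as e" binds the instance.
        return type(e).__name__
    return 'ok'


print(classify(lambda: int('abc')))      # int() rejects non-numeric text
print(classify(lambda: {}['missing']))   # missing dictionary key
print(classify(lambda: 1))               # nothing raised
```

This syntax also works in Python 2.6 and 2.7, so it's a safe habit to adopt before porting.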

A New Next function for Generators

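When working with generators in Python 2.7, we have always been able to use the .next() method to get the next item. In Python 3, that method is renamed to __next__() and we call the built-in next() function instead. The next() function also exists in Python 2.6 and later, so it works in both.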

# Python 2.7
>>> my_gen = (x for x in range(10))
>>> my_gen.next()
0

# Python 3.4
>>> my_gen = (x for x in range(10))
>>> my_gen.next()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'generator' object has no attribute 'next'

>>> next(my_gen)
0

In Python 3, we get an AttributeError for using the old syntax. This new function I'm guessing matches more closely to a lot of ideas in Python. Personally, I like the .next() syntax and think that it's self explanatory. Either way, this is one that will certainly break your code when migrating to Python 3.
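One thing the built-in gives us that the old method didn't is an optional default, returned when the generator is exhausted instead of raising StopIteration. A short sketch in Python 3:

```python
my_gen = (x for x in range(2))

first = next(my_gen)               # the built-in function
second = my_gen.__next__()         # the renamed method that next() calls

# The generator is now exhausted; the second argument is returned
# instead of raising StopIteration.
after_end = next(my_gen, 'done')

print(first, second, after_end)
```

Calling __next__() directly works, but next() is the form you'll normally see.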

Input and Raw Input

One thing that Python 2 has that really is confusing to newcomers is the input() function and the raw_input() function. Which one should I use? Most tutorials will show the raw_input version which is correct. The input function is dangerous and behaves oddly because it evaluates whatever the user types as a Python expression. For instance:

# Python 2.7
>>> x = input('Prompt: ')
Prompt: Hello, world!
Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "<string>", line 1
    Hello, world!
                ^
SyntaxError: unexpected EOF while parsing

Because the text is evaluated as Python code, the user has to type a valid expression, such as a quoted string, so the input should be:

# Python 2.7
>>> x = input('Prompt: ')
Prompt: 'Hello, world!'
>>> x
'Hello, world!'

This is non-intuitive, especially for collecting data which is why we get the raw_input() function. This will accept input and transform it into a string for us.

# Python 2.7
>>> x = raw_input('Prompt: ')
Prompt: Hello, world!
>>> x
'Hello, world!'

Having two functions like this is just odd. Especially for Python. Python 3 fixes this and replaces the input function with the raw_input functionality.

# Python 3.4
>>> x = input('Prompt: ')
Prompt: Hello, world!

>>> x = raw_input('Prompt: ')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'raw_input' is not defined

With this, we now only have one function to do what we are expecting. When accepting user input, we should have control over it, not them. Casting this input to a string is going to allow us to parse it as we want and this will save us in most cases. This is another thing that Python 3 gets correct.
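Since Python 3's input() always hands back a string, converting it is up to us. Here is a sketch of two ways to do that. The parse_number() and parse_literal() helpers are hypothetical names made up for this post, and they take the text as an argument so the example runs without a prompt.

```python
import ast


def parse_number(text):
    """Convert text such as ' 25 ' to an int, raising ValueError on bad input."""
    return int(text.strip())


def parse_literal(text):
    """Parse a Python literal (string, number, list, ...) without running code."""
    return ast.literal_eval(text)


# In a real script the text would come from input('Prompt: ').
print(parse_number(' 25 '))
print(parse_literal("'Hello, world!'"))
print(parse_literal('[1, 2]'))
```

ast.literal_eval() only accepts literals, so something like a function call raises ValueError instead of being executed the way Python 2's input() would have run it.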

Iterators vs. Lists

In Python 2.7, a lot of built-ins returned lists by default. In Python 3, this has been cleaned up to save some memory. For instance, we will see the following in Python 2:

# Python 2.7
>>> my_dict = {'name': 'Dan', 'age': 25}

>>> my_dict.keys()
['age', 'name']

>>> my_dict.items()
[('age', 25), ('name', 'Dan')]

>>> my_dict.values()
[25, 'Dan']

>>> my_dict.iterkeys()
<dictionary-keyiterator object at 0x207e050>

>>> my_dict.iteritems()
<dictionary-itemiterator object at 0x207e0a8>

>>> my_dict.itervalues()
<dictionary-valueiterator object at 0x207e158>

Unless we specifically ask for an iterator, we will get back a list in Python 2. Python 3 decided that it would be best to return lightweight, iterable view objects instead, for efficiency's sake.

# Python 3.4
>>> my_dict = {'name': 'Dan', 'age': 25}

>>> my_dict.keys()
dict_keys(['age', 'name'])

>>> my_dict.items()
dict_items([('age', 25), ('name', 'Dan')])

>>> my_dict.values()
dict_values([25, 'Dan'])

>>> my_dict.iterkeys()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'dict' object has no attribute 'iterkeys'

>>> my_dict.iteritems()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'dict' object has no attribute 'iteritems'

>>> my_dict.itervalues()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'dict' object has no attribute 'itervalues'

As we can see, our common methods now return iterable view objects rather than lists. Python 3 discards the iter* methods for this reason, making the API more concise again.
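Two behaviors are worth knowing when you port code that expected lists. The objects returned by keys(), values(), and items() stay in sync with the dictionary, and lazy built-ins such as zip() can only be consumed once. A small sketch (the 'city' entry is made up for the example):

```python
my_dict = {'name': 'Dan', 'age': 25}

keys = my_dict.keys()
my_dict['city'] = 'Unknown'      # hypothetical extra entry

# The keys object reflects the change made after it was created.
live_keys = sorted(keys)

# Wrap the call in list() when you need a real, independent list.
snapshot = list(my_dict.values())

# zip() is lazy and is used up after one pass.
pairs = zip([1, 2], [3, 4])
first_pass = list(pairs)
second_pass = list(pairs)

print(live_keys, snapshot, first_pass, second_pass)
```

Code that indexes into the result or loops over it twice is where you'll most often need to add list().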

A few others that you'll notice now return iterators instead of lists are the built-in map(), filter(), and zip() functions.

Keyword-Only Arguments

Something new in Python 3 is the ability to declare keyword-only arguments in a function definition. For instance:

# Python 3.4
>>> def my_func(*x, times=10):
...     return x * times
... 
>>> my_func(1, 2, 3, 4)
(1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4)
>>> my_func(1, 2, 3, 4, times=2)
(1, 2, 3, 4, 1, 2, 3, 4)

Looking at this, we have our first parameter, *x, which will collect all positional arguments. Because times comes after *x, it can only be passed by keyword. In our first call to the function, you see that it uses the default number of times. In our second call though, we specifically mention the number of times we want to do this and it works.

Keyword-only arguments allow us to define functions that take a variable number of arguments while still allowing specific keyword based ones. It's a nice feature when you need it.
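We can also get keyword-only arguments without collecting extra positional ones by using a bare * in the signature. Here is a sketch with a hypothetical repeat() function made up for this post:

```python
def repeat(value, *, times=2):
    """Return a list with value repeated; times must be passed by keyword."""
    return [value] * times


print(repeat('a'))
print(repeat('a', times=3))

# Passing times positionally is rejected with a TypeError.
try:
    repeat('a', 3)
except TypeError as e:
    print('TypeError:', e)
```

This is handy for options like flags, where a bare positional True or 3 at the call site would be hard to read.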

Extended Unpacking

Somewhat tied to the above is a new form of unpacking. We now have full use of splats in Python 3 for variable assignment.

# Python 3.4
>>> a, b, *rest = range(5)
>>> a
0
>>> b
1
>>> rest
[2, 3, 4]

With Python already supporting multiple variable assignment like this, the splat is a good idea. Note that we can use this * syntax anywhere in our assignment too.

# Python 3.4
>>> *start, last = range(5)
>>> start
[0, 1, 2, 3]
>>> last
4

This is a nice feature to give us flexibility as it pertains to variable assignment. If you don't want to keep the rest, the starred target still needs a name, so the usual idiom is *_, last = range(5), which collects the unwanted items into the throwaway name _ and gets us the last item in the range call. Pretty cool.
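A few more shapes this can take, as a quick sketch in Python 3. The star can sit in the middle, and it happily collects an empty list when there is nothing left over.

```python
# The starred name can go in the middle and takes whatever is left.
first, *middle, last = range(5)

# A throwaway name keeps only the item we care about.
*_, final = range(5)

# With nothing left over, the starred name becomes an empty list.
only, *leftover = [42]

print(first, middle, last)
print(final)
print(only, leftover)
```

Only one starred name is allowed per assignment, since otherwise Python couldn't tell how to split the items.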

Conclusion

As you can see, Python 3 isn't the devil. When it comes to most features, they are smart and elegant solutions to old problems. To get a full list of changes, check out the Python Website for a good list. Don't be afraid to test it out and see if you like it better. Just remember the differences between the two when switching back and forth. It'll save you a headache later on.

Do you use Python 3? I'd love to know how you feel about it.

