Porting setuptools to py3k
I recently spent some time porting setuptools to the py3k-struni branch as a means of testing both 2to3 specifically and the porting process generally. What follows are the notes from the experience. Two things to keep in mind: first, the
struni branch, though slated to become the “official” Python 3000 branch, is still very much in flux and currently has 30+ failing tests; needless to say, it’s not an ideal porting target. Secondly, I was attempting this without Python 2.6’s forward compatibility mode, which is still mostly unwritten. As both of these situations change, I’ll keep trying to port more code to test the general readiness of the 2.x -> 3.x migration strategy.
Things to do in your Python 2 code:
- Don’t write code like this:

    class install(_install):
        new_commands = [
            ('install_egg_info', lambda self: True),
            ('install_scripts', lambda self: True),
        ]
        _nc = dict(new_commands)
        sub_commands = [
            cmd for cmd in _install.sub_commands if cmd not in _nc
        ] + new_commands

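That won’t work in Python 3000 because of changes to list comprehensions and class definitions: a list comprehension now gets its own scope, and that scope cannot see names defined in the class body, so the reference to _nc in the if clause raises a NameError. Move the _nc declarations out of the class body.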
- Don’t rely on implicit relative imports. In Python 3000, all imports will be absolute by default; you should write one of

    from setuptools.dist import _get_unpatched
    # or
    from .dist import _get_unpatched

instead of

    from dist import _get_unpatched

Stuff that needs to be easier:
- The fact that __cmp__ methods are going away sucks, plain and simple. This required me to manually implement four additional comparison methods for every class that had previously relied on __cmp__. I hope their removal will be rethought and retracted.
- The struni branch currently has three different string-ish types: bytes, str (previously unicode) and str8. Guido has said that str8 will eventually go away, but its presence in unexpected places (like modules’ __file__ attributes) made for some needlessly frustrating debugging. Ignoring str8, the new bytes type is going to be a serious obstacle for anyone wanting to move their codebase to Python 3. Take the following two lines:

    data = open(some_file).read()  # read in text mode
    # and
    data = open(some_file, "rb").read()  # read in binary mode

The first returns a str, the second a bytes object; these two types have incompatible APIs, and the current state of the struni branch makes it impossible to write code that operates on both. For example, the signatures of the types’ split() methods are different, and the bytes type lacks a splitlines() method. These aren’t hypothetical differences: I’ve run into both problems while trying to fix several of the tests in the standard library.
- On the subject of bytes, I ran into two additional str incompatibilities when porting setuptools. First, when you iterate over str instances, you get single-character strs back; when you iterate over bytes instances, you get integers. Combine this with code that switches based on type, and you end up banging your head against the table when your code starts kicking out errors, complaining that Python can’t iterate over the number 91. Secondly, I am absolutely sick of seeing “cannot concatenate str and bytes types” errors; my general tactic is to start throwing str() calls around until the error goes away, but that kind of shotgun debugging hurts my soul.
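To make the __cmp__ complaint above concrete, here is a rough sketch of what the replacement tends to look like. The Version class and its _key() helper are invented for illustration and are not taken from setuptools.

```python
class Version(object):
    """Hypothetical class that previously defined only __cmp__."""

    def __init__(self, major, minor):
        self.major = major
        self.minor = minor

    def _key(self):
        # Single source of truth for ordering, playing the role
        # that __cmp__ used to play.
        return (self.major, self.minor)

    # Python 3 never calls __cmp__, so each operator gets its own method.
    def __eq__(self, other):
        return self._key() == other._key()

    def __ne__(self, other):
        return self._key() != other._key()

    def __lt__(self, other):
        return self._key() < other._key()

    def __le__(self, other):
        return self._key() <= other._key()

    def __gt__(self, other):
        return self._key() > other._key()

    def __ge__(self, other):
        return self._key() >= other._key()

    # Defining __eq__ disables the inherited __hash__ in Python 3,
    # so restore it if instances need to be hashable.
    def __hash__(self):
        return hash(self._key())
```

Routing every operator through one key function at least keeps the ordering logic in a single place, which is about as close as this gets to the old one-method approach.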
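The iteration and concatenation mismatches described above can be reproduced in a few lines. This sketch reflects how released Python 3 versions behave, which may differ in detail from the struni branch; the helper name first_items and the choice of ASCII as the encoding are assumptions made for the example.

```python
def first_items(value):
    """Return the first two items produced by iterating over value."""
    return list(value)[:2]


text = "[egg]"
raw = b"[egg]"

# Iterating over a str yields one-character strings...
text_items = first_items(text)   # ['[', 'e']
# ...while iterating over bytes yields integers (91 is ord('[')).
raw_items = first_items(raw)     # [91, 101]

# Mixing the two types in a concatenation raises TypeError.
try:
    text + raw
    mixed_ok = True
except TypeError:
    mixed_ok = False

# Decoding explicitly is the deliberate alternative to sprinkling str()
# calls around; the "ascii" encoding here is an assumption.
joined = text + raw.decode("ascii")
```

In released Python 3, calling str() on a bytes object without an encoding returns its printable representation (here "b'[egg]'") rather than the decoded text, so the str() shotgun approach can silence the error while producing the wrong string.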
This needs to be easier. I hope Guido will release any notes he’s been taking while porting the standard library to use the new
On the plus side, setuptools helped turn up a few bugs in 2to3, as well as some places where the translation could have been improved (and has been). I intend to repeat this experiment once the
struni branch settles down and once 2.6’s py3k-compat mode works.