Safe Python code formatting with autopep8


Having consistent code formatting in a team makes it easier to review and read other people’s code. Here I explain how I use autopep8 to format code safely. By safely, I mean that formatting the code must not accidentally change its logic.

Update (01/06/2020): for new projects I now recommend black, but autopep8 can still be useful, as explained below.

There are three popular code formatters for Python.

  • autopep8: it’s the oldest, considered stable and very configurable. However, it is not completely safe, even without the --aggressive option. Some fixes may change the logic of the code, so one must choose carefully which to enable.
  • yapf: it’s not guaranteed to be idempotent, that is, yapf(yapf(code)) may differ from yapf(code). In my opinion, idempotence is vital for a formatter, because otherwise running it a second time can produce yet another diff.
  • black: the newest. It’s not configurable, which is good because it avoids arguing about which options to enable. It has a good reputation but it’s still in beta. It has sanity checks that guarantee the Abstract Syntax Tree (AST) of the code won’t change.

For new projects I recommend using black because it’s the safest thanks to the AST check and is supported by the Python Software Foundation.
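To make the two safety properties above concrete, here is a minimal sketch using only the standard library. It is not black's actual implementation. The helper names and the toy formatter are hypothetical and exist only for illustration.

```python
import ast


def same_ast(before: str, after: str) -> bool:
    """Return True if both sources parse to the same abstract syntax tree."""
    # ast.dump() omits line and column numbers by default, so pure
    # whitespace changes do not affect the comparison.
    return ast.dump(ast.parse(before)) == ast.dump(ast.parse(after))


def is_idempotent(formatter, source: str) -> bool:
    """Return True if formatting twice gives the same result as formatting once."""
    once = formatter(source)
    return formatter(once) == once


def strip_trailing_whitespace(source: str) -> str:
    """Toy formatter (hypothetical): remove trailing spaces on every line."""
    return "\n".join(line.rstrip() for line in source.splitlines()) + "\n"


messy = "x = 1   \nif x == 1:   \n    print(x)\n"
clean = strip_trailing_whitespace(messy)

# Removing trailing whitespace keeps the AST intact and is idempotent.
print(same_ast(messy, clean))                            # True
print(is_idempotent(strip_trailing_whitespace, messy))   # True

# Dedenting a statement out of an "if" block changes the AST.
inside = "if x:\n    y = 1\n    z = 2\n"
outside = "if x:\n    y = 1\nz = 2\n"
print(same_ast(inside, outside))                         # False
```

A check like same_ast can be run on any formatter's output, including autopep8's, as an extra safety net.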

For old projects that have never been formatted with a tool, using autopep8 is still a sensible option. If you need to update a file, running autopep8 on it improves the code with much smaller changes than black would make, and it’s stable and safe enough if the options are chosen carefully.

Here’s how I use autopep8. Below I’ve copied from the autopep8 documentation the list of fixes that are safe because they only modify whitespace. However, I prefer to exclude the E26 fixes as they can mess up comments. In addition, autopep8 ignores E226 and E24 by default.

E101 - Reindent all lines.
E11  - Fix indentation.
E121 - Fix indentation to be a multiple of four.
E122 - Add absent indentation for hanging indentation.
E123 - Align closing bracket to match opening bracket.
E124 - Align closing bracket to match visual indentation.
E125 - Indent to distinguish line from next logical line.
E126 - Fix over-indented hanging indentation.
E127 - Fix visual indentation.
E128 - Fix visual indentation.
E129 - Fix visual indentation.
E131 - Fix hanging indent for unaligned continuation line.
E133 - Fix missing indentation for closing bracket.
E20  - Remove extraneous whitespace.
E211 - Remove extraneous whitespace.
E22  - Fix extraneous whitespace around keywords.
E224 - Remove extraneous whitespace around operator.
E225 - Fix missing whitespace around operator.
E226 - Fix missing whitespace around arithmetic operator.
E227 - Fix missing whitespace around bitwise/shift operator.
E228 - Fix missing whitespace around modulo operator.
E231 - Add missing whitespace.
E241 - Fix extraneous whitespace around keywords.
E242 - Remove extraneous whitespace around operator.
E251 - Remove whitespace around parameter '=' sign.
E252 - Missing whitespace around parameter equals.
E26  - Fix spacing after comment hash for inline comments.
E265 - Fix spacing after comment hash for block comments.
E266 - Fix too many leading '#' for block comments.
E27  - Fix extraneous whitespace around keywords.
E301 - Add missing blank line.
E302 - Add missing 2 blank lines.
E303 - Remove extra blank lines.
E304 - Remove blank line following function decorator.
E305 - Expected 2 blank lines after end of function or class.
E306 - Expected 1 blank line before a nested definition.

W291 - Remove trailing whitespace.
W292 - Add a single newline at the end of the file.
W293 - Remove trailing whitespace on blank line.
W391 - Remove trailing blank lines.

Some of the following changes are potentially dangerous and can change the logic of the program, e.g. E402.

E401 - Put imports on separate lines.
E402 - Fix module level import not at top of file
E501 - Try to make lines fit within --max-line-length characters.
E502 - Remove extraneous escape of newline.
E701 - Put colon-separated compound statement on separate lines.
E70  - Put semicolon-separated compound statement on separate lines.
E711 - Fix comparison with None.
E712 - Fix comparison with boolean.
E713 - Use 'not in' for test for membership.
E714 - Use 'is not' test for object identity.
E721 - Use "isinstance()" instead of comparing types directly.
E722 - Fix bare except.
E731 - Use a def instead of assigning a lambda expression.

W503 - Fix line break before binary operator.
W504 - Fix line break after binary operator.
W601 - Use "in" rather than "has_key()".
W602 - Fix deprecated form of raising exception.
W603 - Use "!=" instead of "<>"
W604 - Use "repr()" instead of backticks.
W605 - Fix invalid escape sequence 'x'.
W690 - Fix various deprecated code (via lib2to3).
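
As an illustration of how one of these fixes can change behaviour, consider E711, which rewrites "== None" as "is None". The two are equivalent for most objects, but not for classes that override __eq__. The class below is hypothetical and deliberately extreme. Real libraries that overload "==" can be affected in the same way.

```python
class AlwaysEqual:
    """Hypothetical class whose __eq__ treats every object as equal."""

    def __eq__(self, other):
        return True


obj = AlwaysEqual()

# Original code: the comparison goes through __eq__.
print(obj == None)  # noqa: E711  -> True

# After the E711 fix: the comparison checks identity instead.
print(obj is None)  # -> False
```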

My strategy for new files is to run autopep8 with its default options, because the whole file will be peer-reviewed anyway. To make old code uniform, and for new additions to old files, I run autopep8 like this:

autopep8 --select E1,E2,E3,E401,W2,W3 --ignore E226,E24,E26 --in-place --recursive --verbose .
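
The same selection can also be applied from Python through autopep8.fix_code, which takes an options dictionary. This is a sketch that assumes the dictionary keys mirror the command-line flags. The names CONSERVATIVE_OPTIONS and format_conservatively are mine, not part of autopep8.

```python
# Assumed equivalent of the --select and --ignore flags used above.
CONSERVATIVE_OPTIONS = {
    "select": ["E1", "E2", "E3", "E401", "W2", "W3"],
    "ignore": ["E226", "E24", "E26"],
}


def format_conservatively(source: str) -> str:
    """Apply only the conservative fixes to a string of source code."""
    # Imported lazily so this module still loads without autopep8 installed.
    import autopep8  # third-party: pip install autopep8

    return autopep8.fix_code(source, options=CONSERVATIVE_OPTIONS)
```

Calling format_conservatively on the contents of a file returns the formatted text without touching the file, which is handy for checking the result before writing anything.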

This is quite conservative but I’m confident it is safe. My goal is to make code style uniform by removing small inconsistencies but without big changes.

To keep the whole codebase uniform, the previous command should be adapted into a pre-commit hook. Likewise, the server’s pre-receive/update hook (or a pipeline) should run it too. If the check reports any pending changes, the code wasn’t formatted before pushing and the push should be rejected by the remote. Enforcing this rule guarantees that unformatted code never reaches the shared repository.
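
A pre-commit hook along these lines could look like the sketch below. It is one possible implementation, not the only one. It uses autopep8’s --diff option instead of --in-place, so it only reports pending changes, and it treats any diff output as a failure. The function names are hypothetical. Note that it inspects the files in the working tree, which may differ from the staged content if a file was only partially staged.

```python
import subprocess

# Same conservative selection as the command above, without --in-place.
AUTOPEP8_ARGS = [
    "--select", "E1,E2,E3,E401,W2,W3",
    "--ignore", "E226,E24,E26",
]


def python_files(paths):
    """Keep only the paths that look like Python source files."""
    return [path for path in paths if path.endswith(".py")]


def staged_files():
    """List files that are added, copied or modified in the index."""
    result = subprocess.run(
        ["git", "diff", "--cached", "--name-only", "--diff-filter=ACM"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.splitlines()


def main() -> int:
    """Return 0 if the staged Python files need no formatting, 1 otherwise."""
    files = python_files(staged_files())
    if not files:
        return 0
    result = subprocess.run(
        ["autopep8", "--diff", *AUTOPEP8_ARGS, *files],
        capture_output=True, text=True,
    )
    if result.stdout.strip():
        # A non-empty diff means autopep8 would still change something.
        print(result.stdout)
        print("Please run autopep8 before committing.")
        return 1
    return 0
```

To use it, save the script as .git/hooks/pre-commit, make it executable, add a Python shebang line at the top and end the file with raise SystemExit(main()). The same main function can be reused in a pipeline step by replacing staged_files with the list of files changed in the push.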
