A Python source code cleanup utility
P. Lutus
Copyright © 2010, P. Lutus
I may have mentioned that I don't take a language seriously unless I can create a beautifier for its source files, preferably written in the language itself. My Ruby beautifier has become quite popular, and writing it helped me learn many of that language's traits. I even wrote a beautifier for Bash scripts when I was writing a lot of those, but I decided against trying to write it as a Bash shell script.
I had resisted taking up Python for a long time because of one of its less desirable characteristics — whitespace is syntactically significant. I regard this as an abomination, but over time I got involved with some projects that relied on Python (Sage and Blender among others). I eventually weakened and started writing in Python, and I have decided it's worth its defects.
I mention the whitespace issue only because PyBeautify has to work around its implications. Unlike beautifiers for other languages, PyBeautify can only change the overall indentation of a program (and a few other things). It can't use the language's block syntax tokens to control the indentation, because Python's block syntax is controlled by indentation, not by tokens.
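To make the point concrete, here is a small illustration (not part of PyBeautify) of two snippets that contain exactly the same tokens and differ only in the indentation of the last line. The names `inside_loop`, `after_loop` and `run` are made up for this example.

```python
# Two snippets with identical tokens; only the indentation of the last line differs.
inside_loop = (
    "total = 0\n"
    "for n in range(3):\n"
    "    total += n\n"
    "    total *= 2\n"   # indented: runs on every pass through the loop
)
after_loop = (
    "total = 0\n"
    "for n in range(3):\n"
    "    total += n\n"
    "total *= 2\n"       # not indented: runs once, after the loop ends
)

def run(source):
    """Execute a snippet in a fresh namespace and return its final `total`."""
    namespace = {}
    exec(source, namespace)
    return namespace["total"]

print(run(inside_loop))  # prints 8
print(run(after_loop))   # prints 6
```

A beautifier that guessed at the indentation of that last line would silently change what the program computes, which is why PyBeautify only rescales the indentation that is already there.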
Here is what PyBeautify does:
In pass one, PyBeautify scans a source file and determines the basic indentation unit the file uses: the smallest non-zero indentation found on any non-blank line.
In pass two, PyBeautify re-indents the program using either its default indentation of two spaces or a user-entered specification. This feature can be used to reliably change a file's overall indentation from one standard to another, and any indentation between 1 and 64 spaces can be specified.
PyBeautify also checks the program's indentation for consistency. The assumption is that every line's indentation is a whole multiple of one basic unit (say, four spaces).
If PyBeautify finds any indentation inconsistencies, it prints a warning with the file name and line number for each one, but it doesn't try to correct the indentation. An inconsistent line is simply rescaled in proportion, like every other line.
PyBeautify also expands all tabs to spaces, using eight-column tab stops. I think it's generally accepted that tabs should be removed from the world of computing. PyBeautify does its little part.
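The two passes described above can be sketched in a few lines. This is a simplified illustration, not the program itself (the full listing appears later on this page). The function names `base_indent` and `reindent` are invented for the example, and it assumes that only spaces remain as leading whitespace once tabs are expanded.

```python
def base_indent(lines):
    """Pass one: return the smallest non-zero indentation, or None if nothing is indented."""
    widths = []
    for line in lines:
        stripped = line.lstrip(" ")
        if stripped:  # skip blank lines
            widths.append(len(line) - len(stripped))
    nonzero = [w for w in widths if w > 0]
    return min(nonzero) if nonzero else None

def reindent(text, new_indent=2):
    """Pass two: rescale each line's indentation from the detected unit to new_indent."""
    # expandtabs() uses eight-column tab stops by default
    lines = [line.expandtabs().rstrip() for line in text.splitlines()]
    unit = base_indent(lines)
    out = []
    for line in lines:
        content = line.lstrip(" ")
        width = len(line) - len(content)
        # integer arithmetic: 8 spaces at unit 4 becomes 2 levels of new_indent
        new_width = width * new_indent // unit if unit else 0
        out.append(" " * new_width + content)
    return "\n".join(out) + "\n"

print(reindent("def f():\n    if x:\n        return 1\n", 2))
```

Because the rescaling is pure arithmetic on the existing indentation, the relative nesting of every line is preserved, which is the only safe thing to do in a language where indentation is the block syntax.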
Here is what PyBeautify won't do:
Make your source files beautiful (the program's name is more a tradition than a description), unless you regard removal of tabs as a move toward beauty (as I do).
Correct the indentation of lines it thinks are errors. It prints a warning message for each one and rescales the line along with the rest of the file, but any fix is up to you.
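The consistency check amounts to a modulo test against the basic indentation unit. The following is a minimal sketch of that idea. The name `inconsistent_lines` is hypothetical, and unlike PyBeautify it returns line numbers instead of printing warnings.

```python
def inconsistent_lines(text):
    """Return 1-based numbers of lines whose indent is not a multiple of the base unit."""
    lines = [line.expandtabs().rstrip() for line in text.splitlines()]
    widths = [len(line) - len(line.lstrip(" ")) for line in lines]
    nonzero = [w for w in widths if w > 0]
    if not nonzero:
        return []  # nothing is indented, so nothing can be inconsistent
    unit = min(nonzero)  # the smallest non-zero indent is taken as the unit
    return [n for n, w in enumerate(widths, start=1) if w % unit != 0]

sample = "if a:\n    b()\n      c()\n    d()\n"
print(inconsistent_lines(sample))  # prints [3]
```

Note that a test like this also flags continuation lines that are deliberately aligned with an open bracket, so a warning is a prompt to look at the line, not proof of an error.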
Here's how to use PyBeautify:
- Use as a stream filter:
  ./pybeautify.py - < input.py > output.py
- Specify an indentation other than two spaces:
  ./pybeautify.py 4 - < input.py > output.py
- Replace a file in place, specifying an indentation of 4 spaces (makes a backup copy):
  ./pybeautify.py 4 input.py
- Process all Python files in a directory in the same way:
  ./pybeautify.py 4 *.py
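The in-place mode follows a backup-then-overwrite pattern: copy the file to a name ending in "~", then rewrite the original. A minimal sketch of that pattern follows. The helper name `rewrite_in_place` and its `transform` parameter are assumptions made for this illustration, not part of PyBeautify's interface.

```python
import shutil

def rewrite_in_place(path, transform):
    """Back up `path` to `path~`, then replace its contents with transform(text)."""
    # If the backup cannot be made this raises OSError and the original is left untouched.
    shutil.copyfile(path, path + "~")
    with open(path) as fh:        # read the whole file first
        text = fh.read()
    with open(path, "w") as fh:   # then overwrite it with the transformed text
        fh.write(transform(text))
```

For example, `rewrite_in_place("input.py", str.expandtabs)` would expand the tabs in a hypothetical input.py and leave the original contents in input.py~.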
Licensing, Source
PyBeautify is released under the GNU General Public License. Here is the plain-text source file without line numbers.
Revision History
- Version 1.0 12/01/2010. Initial Public Release.
Program Listing
    #!/usr/bin/env python
    # -*- coding: utf-8 -*-

    # Version 1.0 12/01/2010

    # ***************************************************************************
    # *   Copyright (C) 2010, Paul Lutus                                        *
    # *                                                                         *
    # *   This program is free software; you can redistribute it and/or modify  *
    # *   it under the terms of the GNU General Public License as published by  *
    # *   the Free Software Foundation; either version 2 of the License, or     *
    # *   (at your option) any later version.                                   *
    # *                                                                         *
    # *   This program is distributed in the hope that it will be useful,       *
    # *   but WITHOUT ANY WARRANTY; without even the implied warranty of        *
    # *   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the          *
    # *   GNU General Public License for more details.                          *
    # *                                                                         *
    # *   You should have received a copy of the GNU General Public License     *
    # *   along with this program; if not, write to the                         *
    # *   Free Software Foundation, Inc.,                                       *
    # *   59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.              *
    # ***************************************************************************

    import re, sys, shutil

    class PyBeautify:

      def __init__(self):
        self.default_indent = 2

      # split line into indent and content
      def parse_line(self,s):
        indent,content = re.search(r'^(\s*)(.*)$',s).groups()
        return indent,len(indent),content

      def parse_stream(self,stream,path,indv):
        # expand tabs to eight-column stops and drop trailing whitespace
        lines = [line.expandtabs().rstrip() for line in stream.readlines()]

        # pass 1: find the minimum indent
        mi = 1000000
        for line in lines:
          if(re.search(r'\S',line)): # only non-blank lines
            indent,li,content = self.parse_line(line)
            if(li > 0 and li < mi): mi = li

        # pass 2: create output string with specified indentation
        output = []
        for n,line in enumerate(lines):
          indent,li,content = self.parse_line(line)
          if(li % mi != 0): # if indentation is not a multiple of mi
            sys.stderr.write("Warning: inconsistent indentation in line %d of file \"%s\".\n" \
              % (n+1,path))
          # integer division, so the result stays an int under Python 3 as well
          iv = li * indv // mi # create indent value
          output.append("%s%s" % (' ' * iv,content))
        return '\n'.join(output) + '\n'

      def parse_file(self,path,indv):
        if (path == '-'): # stdin, stdout
          # the result already ends with a newline, so don't let print() add another
          sys.stdout.write(self.parse_stream(sys.stdin,path,indv))
        else: # it's a file
          try: # making a backup copy
            shutil.copyfile(path,path+"~")
          except (IOError, OSError): # backup failed
            sys.stderr.write("Error: unable to create backup copy of file \"%s\", quitting.\n" \
              % path)
            sys.exit(1)
          with open(path) as fh: # read the file
            output = self.parse_stream(fh,path,indv)
          with open(path,'w') as fh: # write the result
            fh.write(output)

      def process(self):
        sys.argv.pop(0) # drop program name
        if (not sys.argv): # no program arguments
          sys.stderr.write("Usage: [indent default %d] filenames or \"-\" for stream\n" \
            % self.default_indent)
          sys.exit(0)
        else:
          try: # is the first argument a number?
            indent = int(sys.argv[0])
            sys.argv.pop(0) # drop the number
          except ValueError: # not a number, probably a file name
            indent = self.default_indent
          if(indent <= 0 or indent > 64): # test of acceptable indentations
            sys.stderr.write("Error: bad indent entry value: \"%d\", quitting.\n" \
              % indent)
            sys.exit(1)
          for path in sys.argv:
            self.parse_file(path,indent)


    PyBeautify().process()