Many developers often use print statements for debugging their code. Print statements are a good way to trace and troubleshoot potential problems, but they are not great when working on collaborative projects. Prints are also problematic in multi-threaded programs. You have to remove your print statements before shipping your code.
A better way to print debug information is to use the python standard
logging
module. Here are some of the advantages of using logging
:
- The messages are timestamped with the severity level.
- The logging module logs useful information such as module name, function name, line number, process id, thread id, etc, etc.
- Messages can be filtered to only log what you are interested in without changing the program.
- The messages can be sent to standard out, standard err, files, mails, sockets, and more. You can run you program in production without having debugging messages cluttering the user output.
The logging module
The following program illustrates how I usually use the python
logging
module for debugging.
#!/usr/bin/env python
import os
import timeit
import logging
# Configure the logging module to print interesting information
loglevel = logging.getLevelName(os.getenv('LEVEL', 'WARN'))
logging.basicConfig(
format="%(asctime)s %(filename)s.%(funcName)s:%(lineno)d %(message)s",
level=loglevel
)
def bigloop():
logging.info('Start')
for counter in range(100):
val = counter
logging.debug('The value is: %d', val)
logging.info('End')
if __name__ == '__main__':
logging.info('Main starts')
bigloop()
Running this program will produce the following output, with all the information I need to debug it: time to a thousandth of a second, name of the program, name of the function, line number and message.
[~]$ export LEVEL=DEBUG
[~]$ python debug-tip.py
2015-09-26 17:13:12,082 debug-tip.py.<module>:20 Main starts
2015-09-26 17:13:12,082 debug-tip.py.bigloop:13 Start
2015-09-26 17:13:12,082 debug-tip.py.bigloop:16 The value is: 0
2015-09-26 17:13:12,083 debug-tip.py.bigloop:16 The value is: 1
2015-09-26 17:13:12,083 debug-tip.py.bigloop:16 The value is: 2
2015-09-26 17:13:12,083 debug-tip.py.bigloop:16 The value is: 3
2015-09-26 17:13:12,083 debug-tip.py.bigloop:16 The value is: 4
2015-09-26 17:13:12,083 debug-tip.py.bigloop:16 The value is: 5
2015-09-26 17:13:12,083 debug-tip.py.bigloop:16 The value is: 6
2015-09-26 17:13:12,083 debug-tip.py.bigloop:16 The value is: 7
2015-09-26 17:13:12,083 debug-tip.py.bigloop:16 The value is: 8
2015-09-26 17:13:12,083 debug-tip.py.bigloop:16 The value is: 9
. . .
2015-09-26 17:13:12,087 debug-tip.py.bigloop:16 The value is: 98
2015-09-26 17:13:12,087 debug-tip.py.bigloop:16 The value is: 99
2015-09-26 17:13:12,087 debug-tip.py.bigloop:17 End
When you want to turn off the debugging messages, just unset the
environment variable LEVEL
and the messages will disappear.
[~]$ unset LEVEL
[~]$ python debug-tip.py
I can set the environment variable to a different level so only the messages at that level will be printed out.
[~]$ export LEVEL=INFO
[~]$ python debug-tip.py
2015-09-26 17:17:42,026 debug-tip.py.<module>:20 Main starts
2015-09-26 17:17:42,026 debug-tip.py.bigloop:13 Start
2015-09-26 17:17:42,026 debug-tip.py.bigloop:17 End
Everything has a price
The logging module is so great that I use it all over my code to print
code states, variable information, to help me to debug. That
awesomeness comes at a price, especially for the line
logging.debug(...)
. A call to logging.debug
is made for every run
of the loop. In a big program or an event based program, these little
calls can be very expensive and make your program run slower.
To solve that problem some people just call that line if the logging
LEVEL
is set to DEBUG
. This new version of the program includes
an if
to test if the line logging.debug
has to be called.
#!/usr/bin/env python
import os
import timeit
import logging
# Configure the logging module to print interesting information
loglevel = logging.getLevelName(os.getenv('LEVEL', 'WARN'))
logging.basicConfig(
format="%(asctime)s %(filename)s.%(funcName)s:%(lineno)d %(message)s",
level=loglevel
)
def bigloop():
logging.info('Start')
for counter in range(100):
val = counter
if loglevel == logging.getLevelName('DEBUG'):
logging.debug('The value is: %d', val)
logging.info('End')
if __name__ == '__main__':
logging.info('Main starts')
timer = timeit.Timer('bigloop()', 'from __main__ import bigloop')
print timer.timeit(5000)
Timing the execution of this program shows us that the program
runs 3 times faster without the call to logging.debug
.
# without the if
[~]$ python debug-tip.py
0.623032093048
# with the if
[~]$ python debug-tip.py
0.243802070618
A better way to do it
This is a huge improvement in terms of execution time, but not really
an improvement in terms of code readability. Having these 2 lines all
over your code might clutter your program. This is why some people
create a debug
function.
#!/usr/bin/env python
import os
import timeit
import logging
# Configure the logging module to print interesting information
loglevel = logging.getLevelName(os.getenv('LEVEL', 'WARN'))
logging.basicConfig(
format="%(asctime)s %(filename)s.%(funcName)s:%(lineno)d %(message)s",
level=loglevel
)
def debug(msg):
if loglevel == logging.getLevelName('DEBUG'):
logging.debug(msg)
def bigloop():
logging.info('Start')
for counter in range(100):
val = counter
debug('The value is: %d' % val)
logging.info('End')
if __name__ == '__main__':
logging.info('Main starts')
timer = timeit.Timer('bigloop()', 'from __main__ import bigloop')
print timer.timeit(5000)
This new version is better in terms of readability but once again
slower. A call to the debug
function is made every time.
[~]$ python debug-tip.py
0.584649085999
Assertions
To solve that problem I use assertions. An assertion is a sanity-check that you can turn on or turn off when you are done testing the program.
When you ship a program to production or when you create a package,
run python with the option -O
for optimize or set the
environment variable PYTHONOPTIMIZE
. When you do this, python
removes all the assertions.
Here is the final version of the program using assertions:
#!/usr/bin/env python
import os
import timeit
import logging
# Configure the logging module to print interesting information
loglevel = logging.getLevelName(os.getenv('LEVEL', 'WARN'))
logging.basicConfig(
format="%(asctime)s %(filename)s.%(funcName)s:%(lineno)d %(message)s",
level=loglevel
)
def debug(msg):
logging.debug(msg)
return True
def bigloop():
logging.info('Start')
for counter in range(100):
val = counter
assert debug('The value is: %d' % val)
logging.info('End')
if __name__ == '__main__':
logging.info('Main starts')
timer = timeit.Timer('bigloop()', 'from __main__ import bigloop')
print timer.timeit(5000)
As you can see below, the program now runs 34 times faster. All the
debug information is still present in the source code and can be
turned on or off at anytime. Just set the environment variable
LEVEL
and run the program without the optimization flag.
[~]$ python debug-tip.py
0.927535057068
[~]$ python -O debug-tip.py
0.027086019516