Concatenation of strings in Python
Most Python books and blog posts teach us that concatenating
strings using the + sign is a bad idea. This is true: using
+= to grow a string is a really bad idea. In Python, strings are immutable. This means
that every time you assign a new value or want to increase
the size of an existing string, Python has to allocate a new memory
space large enough to receive the new string, copy the old contents into it,
then deallocate the old space. The following example is really bad Python
programming. Python will allocate, copy and deallocate the memory for the
variable s 1000 times:

    s = ""
    for x in range(1000):
        s += str(x) + ', '
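For comparison, here is a minimal sketch of the usual alternative for this kind of loop: collect the pieces first and join them once. The names parts and joined are only illustrative, and note that the loop leaves a trailing ', ' that join does not produce.

```python
# Collect the pieces, then build the final string in a single join call.
parts = [str(x) for x in range(1000)]
joined = ', '.join(parts)

# The += loop from the text, kept here only to compare the results.
s = ""
for x in range(1000):
    s += str(x) + ', '

# The loop output has one extra trailing separator.
print(s == joined + ', ')
```

With join, the final string is built once from the list of pieces instead of being regrown on every iteration.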
The result is that developers everywhere write lines like this:
filename = '.'.join([name, extension])
filename = "%s.%s" % (name, extension)
In the case of simple concatenation like this, it doesn't make any sense to use join or string formatting. A simple plus will work faster and is easier to read.
The variable filename storing the result has to be allocated no
matter what method you use. The variables name and extension are
already allocated and will not be reallocated. In this case, simply writing
var3 = var1 + var2 makes total sense.
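As a small illustration, the three forms discussed here produce the same filename. The values assigned to name and extension are hypothetical and chosen only for the example.

```python
# Hypothetical inputs, used only for illustration.
name = "database"
extension = "dat"

with_join = '.'.join([name, extension])      # join form from the text
with_format = "%s.%s" % (name, extension)    # formatting form from the text
with_plus = name + '.' + extension           # plain concatenation

print(with_plus)
print(with_join == with_format == with_plus)
```

All three build one new string for the result; the plus form simply says so most directly.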
Here is the execution time for each solution:
# That quick benchmark was run on Python 2.7.8
In : timeit("a = fname + ext", setup="fname='database'; ext='.dat'")
Out: 0.06346487998962402
In : timeit("a = ''.join((fname, ext))", setup="fname='database'; ext='.dat'")
Out: 0.1665630340576172
In : timeit("a = '%s%s' % (fname, ext)", setup="fname='database'; ext='.dat'")
Out: 0.19054698944091797
In : timeit("a = '%(fname)s%(ext)s' % (x)", setup="x=dict(fname='database', ext='.dat')")
Out: 0.2898139953613281
As you can see in these benchmarks, the form
var3 = var1 + var2 is
the fastest, by a factor of roughly 2.6 to 4.6 depending on the alternative. It is also obviously easier to read.
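To repeat the comparison on a current interpreter, a minimal sketch using the standard timeit module might look like the following. The labels and the iteration count are assumptions made for this example, and the timings will differ from the Python 2.7.8 figures above depending on the interpreter version and the machine.

```python
from timeit import timeit

# Same inputs as the benchmark in the text, shared by every statement.
setup = "fname = 'database'; ext = '.dat'; x = dict(fname=fname, ext=ext)"

# Labels are illustrative; the statements mirror the ones timed above.
statements = {
    "plus": "a = fname + ext",
    "join": "a = ''.join((fname, ext))",
    "percent": "a = '%s%s' % (fname, ext)",
    "percent_dict": "a = '%(fname)s%(ext)s' % x",
}

# number=100_000 keeps the run short; timeit's default is 1,000,000.
results = {
    label: timeit(stmt, setup=setup, number=100_000)
    for label, stmt in statements.items()
}

# Print the fastest statement first.
for label, seconds in sorted(results.items(), key=lambda item: item[1]):
    print(f"{label:12s} {seconds:.4f} s")
```

Running each statement several times and taking the minimum (for example with timeit.repeat) gives steadier numbers than a single run.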