Most Python books and blog posts teach us that concatenating
strings using the + sign is a bad idea. This is true: using + or +=
to build up a string piece by piece is a really bad idea. In Python,
strings are immutable. This means that every time you assign a new
value or want to increase the size of an existing string, Python has
to allocate a new memory space large enough to receive the new string,
copy the contents into it, then deallocate the old space. The following
example is really bad Python programming. Python will allocate, copy
and deallocate the memory for the variable s 1000 times.
s = ""
for x in range(1000):
    s += str(x) + ', '
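For a loop like this one, the usual advice is to collect the pieces and join them once. The sketch below puts both approaches side by side; the function names build_with_plus and build_with_join are illustrative and do not come from any library.

```python
def build_with_plus(n):
    # Repeated += on an immutable string: each step may create a new string.
    s = ""
    for x in range(n):
        s += str(x) + ", "
    return s


def build_with_join(n):
    # Produce every piece first, then build the final string in one join call.
    return "".join(str(x) + ", " for x in range(n))


# Both functions return the same text; only the way it is built differs.
print(build_with_plus(1000) == build_with_join(1000))
```

Both versions keep the trailing ", " that the original loop produces, so the comparison is like for like.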
The result is that developers everywhere write lines like this:
filename = '.'.join([name, extension])
or
filename = "%s.%s" % (name, extension)
In the case of simple concatenation like this, it doesn't make any sense to use join or string formatting. A simple plus works faster and is easier to read.
The variable filename storing the result has to be allocated no
matter what method you use. The variables name and extension are
already allocated and will not be reallocated. In this case simply
writing var3 = var1 + var2 makes total sense.
Here is the execution time for each solution:
# This quick benchmark was run on Python 2.7.8
# (assumes "from timeit import timeit" was run earlier in the session)
In [1]: timeit("a = fname + ext",
               setup="fname='database'; ext='.dat'")
Out[1]: 0.06346487998962402
In [2]: timeit("a = ''.join((fname, ext))",
               setup="fname='database'; ext='.dat'")
Out[2]: 0.1665630340576172
In [3]: timeit("a = '%s%s' % (fname, ext)",
               setup="fname='database'; ext='.dat'")
Out[3]: 0.19054698944091797
In [4]: timeit("a = '%(fname)s%(ext)s' % (x)",
               setup="x=dict(fname='database', ext='.dat')")
Out[4]: 0.2898139953613281
As you can see in these benchmarks, the form var3 = var1 + var2 is
the fastest: the alternatives take between about 2.6 and 4.6 times
as long. It is also obviously easier to read.
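To repeat the comparison on your own interpreter, a small script along these lines should work. It is only a sketch: the names SETUP, CANDIDATES and run_benchmarks are made up for this example, and the absolute timings will differ from the Python 2.7.8 figures above depending on the Python version and the machine.

```python
from timeit import timeit

# Shared setup so that every statement sees the same variables.
SETUP = "fname = 'database'; ext = '.dat'; x = dict(fname=fname, ext=ext)"

# The four forms compared in the session above.
CANDIDATES = {
    "plus": "a = fname + ext",
    "join": "a = ''.join((fname, ext))",
    "percent": "a = '%s%s' % (fname, ext)",
    "percent_dict": "a = '%(fname)s%(ext)s' % x",
}


def run_benchmarks(number=100000):
    # Return a dict mapping each label to its total time in seconds.
    return {
        label: timeit(stmt, setup=SETUP, number=number)
        for label, stmt in CANDIDATES.items()
    }


if __name__ == "__main__":
    # Print the results from fastest to slowest.
    for label, seconds in sorted(run_benchmarks().items(), key=lambda kv: kv[1]):
        print("%-13s %.4f s" % (label, seconds))
```

The script only reports what it measures; whether the plain + form still comes out ahead on a given setup is something to check rather than assume.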