Tuesday, 13 August 2013

Python buffer copy speed - why is array slower than string?

I have a buffer object in C++ that inherits from std::vector<char>. I want
to convert this buffer into a Python string so that I can send it out over
the network via Twisted's protocol.transport.write.
Two ways I thought of doing this are (1) making a string and filling it
char by char:
    def scpychar(buf, n):
        s = ''
        for i in xrange(0, n):
            s += buf[i]
        return s
and (2) making a char array (since I know how big the buffer is), filling
it, and converting it to a string:
    import array

    def scpyarr(buf, n):
        # Preallocate n chars, then overwrite them in place.
        a = array.array('c', '0' * n)
        for i in xrange(0, n):
            a[i] = buf[i]
        return a.tostring()
I would have thought that (1) has to make a new string object every time
s += buf[i] executes, and copy the contents of the old string each time.
So I was expecting (2) to be quicker than (1). But when I test this using
timeit, I find that (1) is actually about twice as fast as (2).
I was wondering if someone could explain why (1) is faster?
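
For anyone who wants to reproduce the comparison, here is a minimal timing sketch. It is written for Python 3 rather than the Python 2 used above, so it swaps in bytes and the 'B' typecode for str and 'c', and it uses a plain bytes object as a hypothetical stand-in for the C++ buffer. The compare helper and its default sizes are assumptions for illustration, and the timings it produces may not match the roughly 2x difference reported above.

```python
import array
import timeit


def scpychar(buf, n):
    # Approach (1): append one byte at a time to an immutable bytes object.
    s = b''
    for i in range(n):
        s += buf[i:i + 1]  # slicing yields a length-1 bytes object
    return s


def scpyarr(buf, n):
    # Approach (2): preallocate n bytes, fill them in place, convert once.
    a = array.array('B', bytes(n))
    for i in range(n):
        a[i] = buf[i]  # indexing bytes yields an int in Python 3
    return a.tobytes()


def compare(n=10000, repeats=20):
    # Hypothetical stand-in for the wrapped std::vector<char>.
    buf = bytes(bytearray(i % 256 for i in range(n)))
    # Both approaches must produce an identical copy before timing them.
    assert scpychar(buf, n) == scpyarr(buf, n) == buf
    t_char = timeit.timeit(lambda: scpychar(buf, n), number=repeats)
    t_arr = timeit.timeit(lambda: scpyarr(buf, n), number=repeats)
    return t_char, t_arr


if __name__ == '__main__':
    t_char, t_arr = compare()
    print('char by char: %.4fs  array: %.4fs' % (t_char, t_arr))
```

Running the script prints the total time for each approach, so the two can be compared on the same machine and interpreter.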
Bonus points for an even more efficient way to convert from a
std::vector<char> to a Python string.
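
One possible direction for the bonus question, sketched under a big assumption: if the wrapped buffer exposes Python's buffer protocol (the post does not say how the C++ class is bound, so it may not), the copy can be done in a single call instead of a Python-level loop. The bulk_copy name is hypothetical, the code targets Python 3, and a bytearray stands in for the real buffer.

```python
def bulk_copy(buf, n):
    # Assumes buf supports the buffer protocol; memoryview raises
    # TypeError otherwise. The slice is a view, and bytes() then makes
    # one copy of the first n bytes without a per-byte Python loop.
    return bytes(memoryview(buf)[:n])


if __name__ == '__main__':
    stand_in = bytearray(b'abcdef')  # hypothetical stand-in buffer
    print(bulk_copy(stand_in, 4))
```

Whether this helps in practice depends entirely on the binding layer, so it would need to be timed against the two loops above.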
