I have recently added the podcasts on http://blog.stackoverflow.com to my feeds and in one of the episodes (second, I think), someone called in and asked Joel Spolsky whether he should learn C++ vs something else and aside the OOP what differentiated C and C++. Joel explained the benefits of C to Jeff and mentioned an example whose shaky reproduction goes like this, “If you are asked to write a script to replace every “x” in a string with a “y”, then the first solution most high level programmers would think of is to go over all the characters in the string and when they find an “x”, they append a “y” to the new string or they append the currently existing letter. They just don’t know how inefficient this is.”
I don’t know C (but I am learning it now) but I guess he might be right. I wrote a recursive solution for someone on orkut who had problems with it and I employed such a method (but considering that Python strings are immutable, I was forced to use a list and finally do a “”.join(string_list) ). Listening to that podcast made me think about other ways of doing that and obviously slicing a string comes to mind. I cannot plainly replace positions in Python strings (they’re immutable) and converting it to a list, going over replacing and then rejoining it sort of defeats the purpose of setting out to optimize that code. Here is the solution I whipped up for someone called Hari Priya on Orkut:
if index == len(string_name):
return "".join(final_string)
else:
if string_name[index] == "x":
final_string.append("y")
index += 1
else:
final_string.append(string_name[index])
index += 1
return changeXY(string_name, index, final_string)
Clearly, not the best, there’s a load of repetition there and all that but since I am too bored to fix that up, I will proceed with my next solution.
#Author: Shriphani Palakodety
#Filename: changeXY2.py
def changeXY(string_name):
slices_list = []
places = string_name.count("x")
start = 0
for i in xrange(places):
position = string_name.find("x", start)
slices_list.append(string_name[start:position]+"y")
start = position+1
slices_list.append(string_name[start:])
return "".join(slices_list)
That is the solution where I construct portions of the string and join them at the end. Since the only idea I’ve got about efficiency is limited to checking the runtime, I did just that.
Step 1: Pick a string. I picked
boneywasawarriorwayayixkickedasterixandobelixbutt
.
Here is a time based comparision:
time python changeXY1.py boneywasawarriorwayayiykickedasteriyandobeliybutt real 0m0.301s user 0m0.077s sys 0m0.170s
That was the time taken for the recursive solution to replace all “x”s with “y”s.
time python changeXY2.py boneywasawarriorwayayiykickedasteriyandobeliybutt real 0m0.255s user 0m0.077s sys 0m0.124s
That was the time the second solution using slicing took to run. It took less time and hence it was more efficient than the recursive solution.
Did that deserve an entire blog post? Probably not. Still, stackoverflow’s podcasts are pretty good and any advice from Joel Spolsky is good advice. I might drop in a question but I have no idea what to ask him.
Well, Happy Coding ![]()



9 comments ↓
I am a little curious about C and C++ examples where there is potentially a faster operation than iterating over the string looking for ‘x’. That is an order N, 1time operation. A little hard to improve upon unless you have some way of knowing where all the X’s are (besides doing an order N walk over the data).
In python the obvious thing do do is cheat and use “avnxdef”.replace(”x”, “y”).
Still it would have been nice to get timings for that and:
res = “”.join(l if l != “x” else “y” for l in string)
which is the closest approximation to a C implementation.
You’re missing a much better solution:
def changeXY3(string_name):
return ‘y’.join(string_name.split(’x'))
This runs 3 times faster than changeXY2 and about 770 times faster than changeXY1, and it’s much shorter. This is idiomatic Python. I presented a tutorial on the subject at PyCon 2007, and the materials are on the web: http://python.net/~goodger/projects/pycon/2007/idiomatic/handout.html
This solution doesn’t require a list or anything, would also assume its faster.
“boneywasawarriorwayayixkickedasterixandobelixbutt”.replace(’x',’y')
No need to guess. Here’s a comparison of changeXY2() vs “boneywasawarriorwayayixkickedasterixandobelixbutt”.replace(”x”, “y”)-
[e:\test\python]py24 str_replace.py
changeXY
0.0348 secs for 10000 reps
TESTSTR.replace
0.0054 secs for 10000 reps
I’d say it speaks for itself.
@Doug
I’m also curious about that example… But I have an idea.
You assumed the inefficiency reside in “to go over all the characters in the string and when they find an “x””. I think the inefficiancy reside in “they append a”.
To append a char to a string may be inefficient (depending of the internal implementation of the string). If the string is implemented with a chained list, it’s totally efficient, but if it’s a fixed size buffer, appending mean (potential?) reallocation.
append => reallocation => inefficient.
the replacement is not in O(n) but in O(n^2) if you need to reallocate at each iteration.
I am sorry I didn’t mention clearly in my post that I was interested in comparing the recursive solution against the other one. Does anyone know how the replace() method is implemented?
@Doug:
Here is the runtime for the solution you submitted:
I think sage is right. The “appending” seems to be the problem here. Of course, anyone familiar with Python would know what
is inefficient and in general, a bad idea because str1 is being reconstructed everytime. I hence used a list to collect all the characters and used the join() method at the end. I believe Joel was not speaking about this but was thinking of something else (for instance Ruby has mutable strings and ruby programmers can append to strings using something like str1[len(str1)] = char). So, Joel might be hintins that walking and appending is the wrong way to do things. Walking over the data seems to be unavoidable. Even the find() method needs to walk over the input to check where “x” appears.
The first solution (the recursive one) seems to have a complexity of O(n). The second one should have something better than that as it has a lesser runtime for the very same input. Can someone figure out the complexity of the algorithm where I used string slices ?
I think he is trying to differentiate allocating all at once or appending one at a time. In python strings are immutable so it is difficult to demonstrate but maybe something like this:
def changeXY(string_name):
final_string=list(string_name) #allocate all once
for i in xrange(len(final_string)):
….if final_string[i]==”x”:
……..final_string[i]=”y”
return “”.join(final_string)
def naiveChange(string_name):
final_string=”" #start empty and append
for i in xrange(len(string_name)):
….if string_name[i]==”x”:
……..final_string+=”y”
….else:
……..final_string+=string_name[i]
return final_string
The thing is that the discussion is pretty academic anyway because the correct way is to use string.replace().
@David:
I am sorry your comment did not show up earlier as my spam filter marked it as spam. But that is an excellent solution which we all missed. Thanks for the link.
Leave a Comment