<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	>
<channel>
	<title>Comments on: Observations About Strings</title>
	<atom:link href="http://shriphani.com/blog/2008/06/09/observations-about-strings/feed/" rel="self" type="application/rss+xml" />
	<link>http://shriphani.com/blog/2008/06/09/observations-about-strings/</link>
	<description>Weblog of an Aspiring Computer Scientist</description>
	<pubDate>Thu, 04 Dec 2008 18:39:33 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.6.1</generator>
		<item>
		<title>By: Shriphani</title>
		<link>http://shriphani.com/blog/2008/06/09/observations-about-strings/#comment-251</link>
		<dc:creator>Shriphani</dc:creator>
		<pubDate>Wed, 11 Jun 2008 01:10:11 +0000</pubDate>
		<guid isPermaLink="false">http://shriphani.com/blog/?p=152#comment-251</guid>
		<description>@David:

I am sorry your comment did not show up earlier as my spam filter marked it as spam. But that is an excellent solution which we all missed. Thanks for the link.</description>
		<content:encoded><![CDATA[<p>@David:</p>
<p>I am sorry your comment did not show up earlier as my spam filter marked it as spam. But that is an excellent solution which we all missed. Thanks for the link.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: nes</title>
		<link>http://shriphani.com/blog/2008/06/09/observations-about-strings/#comment-250</link>
		<dc:creator>nes</dc:creator>
		<pubDate>Tue, 10 Jun 2008 19:54:22 +0000</pubDate>
		<guid isPermaLink="false">http://shriphani.com/blog/?p=152#comment-250</guid>
		<description>I think he is trying to distinguish between allocating everything at once and appending one character at a time. In Python, strings are immutable, so it is difficult to demonstrate, but maybe something like this:

def changeXY(string_name):
    final_string = list(string_name)  # allocate everything once
    for i in range(len(final_string)):
        if final_string[i] == "x":
            final_string[i] = "y"
    return "".join(final_string)

def naiveChange(string_name):
    final_string = ""  # start empty and append
    for i in range(len(string_name)):
        if string_name[i] == "x":
            final_string += "y"
        else:
            final_string += string_name[i]
    return final_string


The thing is that the discussion is pretty academic anyway because the correct way is to use string.replace().</description>
		<content:encoded><![CDATA[<p>I think he is trying to distinguish between allocating everything at once and appending one character at a time. In Python, strings are immutable, so it is difficult to demonstrate, but maybe something like this:</p>
<pre>def changeXY(string_name):
    final_string = list(string_name)  # allocate everything once
    for i in range(len(final_string)):
        if final_string[i] == "x":
            final_string[i] = "y"
    return "".join(final_string)

def naiveChange(string_name):
    final_string = ""  # start empty and append
    for i in range(len(string_name)):
        if string_name[i] == "x":
            final_string += "y"
        else:
            final_string += string_name[i]
    return final_string</pre>
<p>The thing is that the discussion is pretty academic anyway because the correct way is to use string.replace().</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Shriphani</title>
		<link>http://shriphani.com/blog/2008/06/09/observations-about-strings/#comment-249</link>
		<dc:creator>Shriphani</dc:creator>
		<pubDate>Tue, 10 Jun 2008 06:03:14 +0000</pubDate>
		<guid isPermaLink="false">http://shriphani.com/blog/?p=152#comment-249</guid>
		<description>@Doug:

Here is the runtime for the solution you submitted:
&lt;pre&gt;
time python res.py
boneywasawarriorwayayiykicksasteriyandobeliybutt

real    0m0.268s
user    0m0.031s
sys     0m0.139s&lt;/pre&gt;

I think sage is right. The "appending" seems to be the problem here. Of course, anyone familiar with Python would know that &lt;pre&gt;str1 += str2&lt;/pre&gt; is inefficient and, in general, a bad idea because str1 is reconstructed every time. I therefore used a list to collect all the characters and called the join() method at the end. I believe Joel was not speaking about this but was thinking of something else (for instance, Ruby has mutable strings, and Ruby programmers can append to strings using something like str1[len(str1)] = char). So Joel might be hinting that walking and appending is the wrong way to do things. Walking over the data seems to be unavoidable. Even the find() method needs to walk over the input to check where "x" appears.

The first solution (the recursive one) seems to have a complexity of O(n). The second one should have something better than that, as it has a shorter runtime for the very same input. Can someone figure out the complexity of the algorithm where I used string slices?</description>
		<content:encoded><![CDATA[<p>@Doug:</p>
<p>Here is the runtime for the solution you submitted:</p>
<pre>
time python res.py
boneywasawarriorwayayiykicksasteriyandobeliybutt

real    0m0.268s
user    0m0.031s
sys     0m0.139s</pre>
<p>I think sage is right. The &#8220;appending&#8221; seems to be the problem here. Of course, anyone familiar with Python would know that</p>
<pre>str1 += str2</pre>
<p>is inefficient and, in general, a bad idea because str1 is reconstructed every time. I therefore used a list to collect all the characters and called the join() method at the end. I believe Joel was not speaking about this but was thinking of something else (for instance, Ruby has mutable strings, and Ruby programmers can append to strings using something like str1[len(str1)] = char). So Joel might be hinting that walking and appending is the wrong way to do things. Walking over the data seems to be unavoidable. Even the find() method needs to walk over the input to check where &#8220;x&#8221; appears.</p>
<p>The first solution (the recursive one) seems to have a complexity of O(n). The second one should have something better than that, as it has a shorter runtime for the very same input. Can someone figure out the complexity of the algorithm where I used string slices?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Shriphani</title>
		<link>http://shriphani.com/blog/2008/06/09/observations-about-strings/#comment-246</link>
		<dc:creator>Shriphani</dc:creator>
		<pubDate>Tue, 10 Jun 2008 00:38:53 +0000</pubDate>
		<guid isPermaLink="false">http://shriphani.com/blog/?p=152#comment-246</guid>
		<description>I am sorry I didn't mention clearly in my post that I was interested in comparing the recursive solution against the other one. Does anyone know how the replace() method is implemented?</description>
		<content:encoded><![CDATA[<p>I am sorry I didn&#8217;t mention clearly in my post that I was interested in comparing the recursive solution against the other one. Does anyone know how the replace() method is implemented?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: sage</title>
		<link>http://shriphani.com/blog/2008/06/09/observations-about-strings/#comment-243</link>
		<dc:creator>sage</dc:creator>
		<pubDate>Mon, 09 Jun 2008 18:49:00 +0000</pubDate>
		<guid isPermaLink="false">http://shriphani.com/blog/?p=152#comment-243</guid>
		<description>@Doug

I'm also curious about that example... But I have an idea.
You assumed the inefficiency resides in "to go over all the characters in the string and when they find an “x”". I think the inefficiency resides in "they append a".

Appending a char to a string may be inefficient (depending on the internal implementation of the string). If the string is implemented as a linked list, appending is perfectly efficient, but if it's a fixed-size buffer, appending means (potential?) reallocation.

append =&#62; reallocation =&#62; inefficient.

The replacement is then not O(n) but O(n^2) if you need to reallocate at each iteration.</description>
		<content:encoded><![CDATA[<p>@Doug</p>
<p>I&#8217;m also curious about that example&#8230; But I have an idea.<br />
You assumed the inefficiency resides in &#8220;to go over all the characters in the string and when they find an “x”&#8221;. I think the inefficiency resides in &#8220;they append a&#8221;.</p>
<p>Appending a char to a string may be inefficient (depending on the internal implementation of the string). If the string is implemented as a linked list, appending is perfectly efficient, but if it&#8217;s a fixed-size buffer, appending means (potential?) reallocation.</p>
<p>append =&gt; reallocation =&gt; inefficient.</p>
<p>The replacement is then not O(n) but O(n^2) if you need to reallocate at each iteration.</p>
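<p>A minimal sketch of that argument, assuming the worst case in which every append copies the whole existing buffer into a freshly allocated one. The helper names are hypothetical, and this is only a counting model, not a measurement of how any real string implementation behaves:</p>

```python
def copies_when_reallocating_each_append(n):
    """Characters copied while building an n-character string one append
    at a time, if every append reallocates and copies the old contents."""
    copied = 0
    length = 0
    for _ in range(n):
        copied += length  # copy the existing contents into the new buffer
        length += 1       # then store the appended character
    return copied


def writes_when_preallocated(n):
    """Characters written if the whole buffer is allocated once up front."""
    return n


for n in (10, 100, 1000):
    print(n, copies_when_reallocating_each_append(n), writes_when_preallocated(n))
```

<p>In this model the copy count grows as n(n-1)/2, which is the O(n^2) behaviour, while the preallocated version stays linear in n.</p>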
]]></content:encoded>
	</item>
	<item>
		<title>By: Thomas Passin</title>
		<link>http://shriphani.com/blog/2008/06/09/observations-about-strings/#comment-239</link>
		<dc:creator>Thomas Passin</dc:creator>
		<pubDate>Mon, 09 Jun 2008 18:26:53 +0000</pubDate>
		<guid isPermaLink="false">http://shriphani.com/blog/?p=152#comment-239</guid>
		<description>No need to guess.  Here's a comparison of changeXY2() vs "boneywasawarriorwayayixkickedasterixandobelixbutt".replace("x", "y"):

[e:\test\python]py24 str_replace.py
changeXY
0.0348 secs for 10000 reps

TESTSTR.replace
0.0054 secs for 10000 reps

I'd say it speaks for itself.</description>
		<content:encoded><![CDATA[<p>No need to guess.  Here&#8217;s a comparison of changeXY2() vs "boneywasawarriorwayayixkickedasterixandobelixbutt".replace("x", "y"):</p>
<p>[e:\test\python]py24 str_replace.py<br />
changeXY<br />
0.0348 secs for 10000 reps</p>
<p>TESTSTR.replace<br />
0.0054 secs for 10000 reps</p>
<p>I&#8217;d say it speaks for itself.</p>
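<p>The script behind those numbers is not shown, so the following is only a sketch of how such a comparison could be run with the standard timeit module. The function change_with_list is a hypothetical stand-in for the list-based version, and the timings will differ by machine and Python version:</p>

```python
import timeit

TESTSTR = "boneywasawarriorwayayixkickedasterixandobelixbutt"


def change_with_list(s):
    # Hypothetical stand-in for the list-based version from the post.
    chars = list(s)
    for i, ch in enumerate(chars):
        if ch == "x":
            chars[i] = "y"
    return "".join(chars)


def change_with_replace(s):
    # The built-in string method.
    return s.replace("x", "y")


def time_reps(func, reps=10000):
    """Total seconds taken by `reps` calls of func(TESTSTR)."""
    return timeit.timeit(lambda: func(TESTSTR), number=reps)


for func in (change_with_list, change_with_replace):
    print("%s: %.4f secs for 10000 reps" % (func.__name__, time_reps(func)))
```

<p>Both functions return the same string, so any difference in the printed times comes from how the work is done rather than from what is produced.</p>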
]]></content:encoded>
	</item>
	<item>
		<title>By: ToddB</title>
		<link>http://shriphani.com/blog/2008/06/09/observations-about-strings/#comment-238</link>
		<dc:creator>ToddB</dc:creator>
		<pubDate>Mon, 09 Jun 2008 15:45:48 +0000</pubDate>
		<guid isPermaLink="false">http://shriphani.com/blog/?p=152#comment-238</guid>
		<description>This solution doesn't require a list or anything, and I would assume it's faster too.
"boneywasawarriorwayayixkickedasterixandobelixbutt".replace('x', 'y')</description>
		<content:encoded><![CDATA[<p>This solution doesn&#8217;t require a list or anything, and I would assume it&#8217;s faster too.<br />
"boneywasawarriorwayayixkickedasterixandobelixbutt".replace('x', 'y')</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: David Goodger</title>
		<link>http://shriphani.com/blog/2008/06/09/observations-about-strings/#comment-237</link>
		<dc:creator>David Goodger</dc:creator>
		<pubDate>Mon, 09 Jun 2008 15:11:10 +0000</pubDate>
		<guid isPermaLink="false">http://shriphani.com/blog/?p=152#comment-237</guid>
		<description>You're missing a much better solution:

def changeXY3(string_name):
    return 'y'.join(string_name.split('x'))

This runs 3 times faster than changeXY2 and about 770 times faster than changeXY1, and it's much shorter. This is idiomatic Python. I presented a tutorial on the subject at PyCon 2007, and the materials are on the web: http://python.net/~goodger/projects/pycon/2007/idiomatic/handout.html</description>
		<content:encoded><![CDATA[<p>You&#8217;re missing a much better solution:</p>
<pre>def changeXY3(string_name):
    return 'y'.join(string_name.split('x'))</pre>
<p>This runs 3 times faster than changeXY2 and about 770 times faster than changeXY1, and it&#8217;s much shorter. This is idiomatic Python. I presented a tutorial on the subject at PyCon 2007, and the materials are on the web: <a href="http://python.net/~goodger/projects/pycon/2007/idiomatic/handout.html" rel="nofollow">http://python.net/~goodger/projects/pycon/2007/idiomatic/handout.html</a></p>
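<p>A quick sanity check, as a sketch: the sample strings below are chosen only for illustration, and the loop confirms that the split/join version gives the same result as the built-in replace() on them, including the edge cases of an empty string and adjacent matches:</p>

```python
def changeXY3(string_name):
    return "y".join(string_name.split("x"))


# Edge cases: empty string, no "x", only "x", adjacent and surrounding "x".
samples = [
    "",
    "abc",
    "x",
    "xx",
    "xax",
    "asterix",
    "boneywasawarriorwayayixkickedasterixandobelixbutt",
]

for s in samples:
    assert changeXY3(s) == s.replace("x", "y")

print("split/join matches replace() on all samples")
```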
]]></content:encoded>
	</item>
	<item>
		<title>By: Doug Napoleone</title>
		<link>http://shriphani.com/blog/2008/06/09/observations-about-strings/#comment-236</link>
		<dc:creator>Doug Napoleone</dc:creator>
		<pubDate>Mon, 09 Jun 2008 15:06:34 +0000</pubDate>
		<guid isPermaLink="false">http://shriphani.com/blog/?p=152#comment-236</guid>
		<description>I am a little curious about C and C++ examples where there is potentially a faster operation than iterating over the string looking for 'x'. That is an order-N, one-time operation. It is a little hard to improve upon unless you have some way of knowing where all the x's are (besides doing an order-N walk over the data).

In Python the obvious thing to do is cheat and use "avnxdef".replace("x", "y").
Still it would have been nice to get timings for that and:

res = "".join(l if l != "x" else "y" for l in string)

which is the closest approximation to a C implementation.</description>
		<content:encoded><![CDATA[<p>I am a little curious about C and C++ examples where there is potentially a faster operation than iterating over the string looking for 'x'. That is an order-N, one-time operation. It is a little hard to improve upon unless you have some way of knowing where all the x&#8217;s are (besides doing an order-N walk over the data).</p>
<p>In Python the obvious thing to do is cheat and use "avnxdef".replace("x", "y").<br />
Still it would have been nice to get timings for that and:</p>
<pre>res = "".join(l if l != "x" else "y" for l in string)</pre>
<p>which is the closest approximation to a C implementation.</p>
]]></content:encoded>
	</item>
</channel>
</rss>
