Entries Tagged 'python' ↓

My First Knol

So, I’ve finally managed to write my first Knol. It was about the powerset construction script I once wrote. Enjoy: http://knol.google.com/k/shriphani-palakodety/powerset-construction-using-gray-code/2ktbww4a502kb/2#

Singapore Trip, Mergesort, Collatz

I am back from Singapore and it was a decent trip. I will be putting up pictures I clicked there soon. But first, while in S’pore, I implemented mergesort in Python.

Mergesort works by diving the given list down to groups of two, sorting them and merging these groups to get the sorted list.

So, the script that divides these lists into groups of two should be pretty straightforward:

def makeSubLists(num_list):
    ‘Divide given numbers into groups of two’
    lists = []
    i = 0
    while i < len(num_list):
        lists.append(num_list[i:i+2])
        i += 2
    return lists

def sortTwos(num_list):
    ’sort the lists returned by makeSubLists()’
    try:
        if num_list[0] >= num_list[1]:
            num_list.reverse()
    except IndexError:
            pass
    return num_list

Next, we need to somehow merge these sorted lists back. I decided to take a recursive approach:

  • Suppose that two of the lists we’re merging are [x1, x2] and [y1, y2].
  • If our final list can be called “final_list”, we add x1 or y1 whichever is the smaller value. Remove x1 or y1 from the list.
  • We now have [x2] and [y1, y2] (I’m assuming that x1 is smaller). Repeat step 2 and keep doing it till one of the lists is reduced to [], the remaining part of the other list can be appended at the end of final_list because it is sorted already.
def recurMerge(list1, list2, merged_list=[]):
    ‘recursively merge the two given lists’
    if len(list1) == 0:
        merged_list += list2
        return merged_list
    elif len(list2) == 0:
        merged_list += list1
        return merged_list
    else:
        if list1[0] <= list2[0]:
            merged_list.append(list1[0])
            list1.pop(0)
        else:
            merged_list.append(list2[0])
            list2.pop(0)
        return recurMerge(list1, list2, merged_list)

The next snippet recursively supplies lists to recurMerge() till a single list containing the sorted numbers is obtained.

def merge(singletons):
    ’supply lists to recurMerge and recursively sort’
    final_list = []
    if len(singletons) == 1:
        return singletons[0]
    else:
        i = 0
        while i < len(singletons):
            final_list.append(recurMerge(singletons[i], singletons[i+1], []))
            i += 2
        singletons = final_list
        return merge(singletons)

You can download this mergesort implementation here.

Next comes the Collatz conjecture which goes something like this:

  • Pick a number n.
  • If n is even, n = n / 2.
  • If n is odd, n = 3n + 1
  • Go to step 2.

Do that with any n and you will always end up at 1. So, if n equals 13, the numbers resulting from applying the steps of the conjecture are:

  • 40
  • 20
  • 10
  • 5
  • 16
  • 8
  • 4
  • 2
  • 1

See.

Project Euler has a question based on the Collatz Conjecture which asks you find that number under 1 million which takes the longest number of steps to get to 1.

Here is the code to solve that:

collatz_dict = {}

def collatzDepth(number):
    collatz_hash = {}
    depth = 0
    while number > 1:
        if number % 2 == 0:
            number = number / 2
        else:
            number = 3 * number + 1
        depth += 1
    return depth

for i in xrange(1000000):
    collatz_dict[collatzDepth(i)] = i

def findMaxDepth(collatz_dict):
    val = max(collatz_dict.keys())
    return val, collatz_dict[val]

print findMaxDepth(collatz_dict)

In 10 days, I will be at Purdue…. wow…. I just can’t believe how time flew and how I recovered from the MIT admissions result…. I guess I’ve moved on…..

Greedy Algorithms

In the past week, I have traveled across 4 major pilgrimage centers and of course accumulated lots of brownie points for no stupid comments on the deities. I have decided to add more content to my site, mostly to make it humorous etc.

Right, I’ve had bloody little opportunity to further my education and I could only manage to read about greedy algorithms (I actually went to greedy algorithms after reading about the Traveling Salesman Problem that Scott Aaronson seemed to have used to write a game in high school).

Greedy algorithms operate by picking the extremum to proceed towards the solution without any regard for what might happen at a later stage of the solution.

I then decided to implement a few greedy algorithms then.

The Number of Coins While Making Change:

The Indian currency is the Rupee and 1 Rupee equals 100 Paise. The coins in circulation are:

  • 25 paise (gets you nothing)
  • 50 paise (gets you nothing)
  • 1 Rupee (gets you enough gum to chew for an hour)
  • 2 Rupees (gets you enough gum to chew for 2 hours)
  • 5 Rupees (gets you enough gum for an entire day)

So, here is an recursive implementation of this problem:

change = [25, 50, 100, 200, 500] #All Indian coins in use.
change.sort()
change.reverse()
give = []

def findMax(amount):
        if amount < min(change):
                return give
        else:
                for coin in change:
                        if coin <= amount:
                                give.append(coin)
                                amount -= coin
                                return findMax(amount)
                        else:
                                continue
                       
               

test_cases = [2500, 7200, 1200, 550, 650, 750, 850]

for amount in test_cases:
        print findMax(amount)
        give = []

The output:

vvs rao@vvsrao-PC /cygdrive/c/Users/vvs rao/scripts
$ python greedy_change.py
[500, 500, 500, 500, 500]
[500, 500, 500, 500, 500, 500, 500, 500, 500, 500, 500, 500, 500, 500, 200]
[500, 500, 200]
[500, 50]
[500, 100, 50]
[500, 200, 50]
[500, 200, 100, 50]

When this excessive journeying ends, I will write a few more posts. Goodbye till then

Soundex

I remembered reading about the soundex code used to encode similarly sounding names or names with similar spelling so that they could be retrieved easily.

This is what the soundex code for a word consists of:

  • The first letter of the name.
  • The letters A, E, I, O, U, Y, W and H are never used in encoding (except when they are the first character in a word)
  • The following table lists the character codes:
  • Code Key Letters
    1 B, P, F, V
    2 C, S, K, G, J, Q, X, Z
    3 D, T
    4 L
    5 M, N
    6 R
  • Letters are not encoded if the preceeding letters have the same code
  • Codes are truncated after the third digit.
  • Trailing zeros are appended as needed.

So, here is the snippet that should generate a soundex code for a given word:

code_dict = {‘A’:"0", "E":"0", "I":"0", "O":"0", "U":"0", ‘W’:"0", ‘H’:"0", ‘B’:1, ‘P’:1, ‘F’:1, ‘V’:1, ‘C’:2, ‘S’:2, ‘K’:2, ‘G’:2, ‘J’:2, ‘Q’:2, ‘X’:2, ‘Z’:2, ‘D’:3, ‘T’:3, ‘L’:4, ‘M’:5, ‘N’:5, ‘R’:6}

def codeGen(word):
word = word.upper()
code = [word[0]]
for i in range(1, len(word)):
if code_dict[word[i]] != "0":
if word[i] != word[i-1] and code_dict[word[i]] != code_dict[word[i-1]]:
code.append(str(code_dict[word[i]]))
else:
continue
else:
continue

if len(code) <= 4:
code += ["0"] * (4 - len(code))

else:
code = code[0:4]

return "".join(code)

#Tests
print codeGen("SCHAAK")
print codeGen("Lee")
print codeGen("Kuhne")
print codeGen("EBELL")
print codeGen("ebelson")
print codeGen("SCHAEFER")

The output:

$ python soundex.py
S200
L000
K500
E140
E142
S160

This problem was part of the Berkeley Programming Contest held in 2007 at the University of California Berkeley.

An Anagram Detector

Yesterday, I tackled the last problem in the “Aha! Algorithms” chapter of Jon Bentley’s “Programming Pearls”, an anagram detector. It takes a file consisting of words separated by a line and return lists containing anagrams separated by a line. It does not construct anagrams though and just hunts down a file for them.

The first step in building this includes an element of identity common to anagrams. This is called a signature. All anagrams belonging to the same class have the same signature. I chose my signature to be a string that contains letters followed by the number of times they appear in the word (the letters are in alphabetical order). Example:

"alphabet" - "a2b1e1h1l1p1t1"

The next step involves using a script that will bring together words with the same signature and these obviously are anagrams. Finally, a function prints these anagrams.

This portion of Python code generates the signature:

def sign(word):
        char_list = list(word)
        char_list.sort()
        word_list = []
        for char in char_list:
                if not char in word_list:
                        word_list.extend([char, str(word.count(char))])
                else:
                        continue
        signature = "".join(word_list)
        return signature

Next comes the function that is supposed to bring the anagrams together (so to speak):

def anagramCollect(filename):
        """Obtain Sinatures and Create a Dict.
        key = the signature
        value = a list of words with same signature"
""
        f = open(filename, "r")
        ana_dict = {}
        for word in f:
                signature = sign(word[:-1])
                if ana_dict.has_key(signature):
                        ana_dict[signature].append(word[:-1])
                else:
                        ana_dict[signature] = [word[:-1]]
        f.close()
        return ana_dict

In the above function, I used word[:-1] to avoid appending the newline character that seems to be unavoidable while using “for word in f:”.

Finally, a function to check the anagram dictionary for anagrams (should include those values whose length is more than 1) and to print them out.

def printAnagram(ana_dict):
        ana_list = ana_dict.values()
        for value in ana_list:
                if len(value) > 1:
                        print value
                else:
                        continue

I decided to test this script on a file containing words separated by lines. I used the following words:

pans
pots
opt
snap
stop
tops

When I ran the script, I got:

 python anagram.py
['pans', 'snap']
['pots', 'stop', 'tops']

So, it works! The script finally looks like this.

Well, Happy Coding :D

Binary Search and Vector Rotation - Jon Bentley Inspired

Since I got my copy of “Programming Pearls” by Jon Bentley, I have hardly been able to enjoy the excellent writing skills of Jon Bentley. I finally got some time to read, “Aha! Algorithms” from his book and I encountered a section on “Binary Search”. I read somewhere that during a lecture at CMU, Jon said that most binary search implementations were broken. After reading a description of the problem from the book, I decided to whip up an implementation myself. Binary search involves probing the range to see on which side of the median, the number lies and the new range is that half. This process is repeated until the value is located. So, here is a very short binary search I wrote:

def search(num_list, num, begin, end):
        if end< begin:
                return "Number not found in list"
        pos = (begin+end)/2
        if num_list[pos] > num:
                return search(num_list, num, begin, pos - 1)
        elif num_list[pos] < num:
                return search(num_list, num, pos+1, end)
        else:
                return pos

num_list = [1, 2, 4, 5, 6, 8, 11, 34, 72, 112, 134, 145, 156, 167, 178, 189, 190, 210, 321, 432, 543, 654, 765, 876, 987]
value = 112

print search(num_list, value, 0, len(num_list))

I ran the script and this is what I got:

 python binsearch2.py
9

Another attempt:

num_list = [1, 2, 4, 5, 6, 8, 11, 34, 72, 112, 134, 145, 156, 167, 178, 189, 190, 210, 321, 432, 543, 654, 765, 876, 987]
value = 35
print search(num_list, value, 0, len(num_list))

The output:

python binsearch2.py
Number not found in list

The next problem in the same chapter involved rotating vectors. A beautiful algorithm was described in the book. To rotate “i” elements out of “n” in an array, the best solution I could think of was to throw i elements in a separate list and keep appending them in reverse order to a new list which contains the (n-i) elements. As usual, Jon Bentley had a better algorithm. He considered the list to be two groups “a” and “b”. So our list can be represented by “ab”. We need to obtain “ba”. If we denote a’s reverse as {a}^{r} and b’s reverse as {b}^{r}, then we get, ({{a}^{r} {b}^{r}})^{r} = ba. So, I implemented this algorithm in a few seconds using:

def rotate(list_name, i):
        ‘Rotate i elements of the given list’
        a = list_name[:i]
        b = list_name[i:]
        a.reverse()
        b.reverse()
        final_list = a+b
        final_list.reverse()
        return final_list

This is how it works:

print rotate([‘a’, ‘b’, ‘c’, ‘d’, ‘e’, ‘f’, ‘g’, ‘h’, ‘i’, ‘j’], 3)

The solution:

 python vectorrot.py
['d', 'e', 'f', 'g', 'h', 'i', 'j', 'a', 'b', 'c',]

.

Well, there’s another problem that I shall solve later. The book is really a masterpiece. I had a sneak-peek at what was to follow the next problem and it turns out its an anagram constructor! Wee! I can foresee a fun filled week ahead!

EDIT: The binary search implementation now terminates when it can’t find the value in the list.

Observations About Strings

I have recently added the podcasts on http://blog.stackoverflow.com to my feeds and in one of the episodes (second, I think), someone called in and asked Joel Spolsky whether he should learn C++ vs something else and aside the OOP what differentiated C and C++. Joel explained the benefits of C to Jeff and mentioned an example whose shaky reproduction goes like this, “If you are asked to write a script to replace every “x” in a string with a “y”, then the first solution most high level programmers would think of is to go over all the characters in the string and when they find an “x”, they append a “y” to the new string or they append the currently existing letter. They just don’t know how inefficient this is.”

I don’t know C (but I am learning it now) but I guess he might be right. I wrote a recursive solution for someone on orkut who had problems with it and I employed such a method (but considering that Python strings are immutable, I was forced to use a list and finally do a “”.join(string_list) ). Listening to that podcast made me think about other ways of doing that and obviously slicing a string comes to mind. I cannot plainly replace positions in Python strings (they’re immutable) and converting it to a list, going over replacing and then rejoining it sort of defeats the purpose of setting out to optimize that code. Here is the solution I whipped up for someone called Hari Priya on Orkut:

def changeXY(string_name, index=0, final_string=[]):
        if index == len(string_name):
                return "".join(final_string)
        else:
                if string_name[index] == "x":
                        final_string.append("y")
                        index += 1
                else:
                        final_string.append(string_name[index])
                        index += 1
                return changeXY(string_name, index, final_string)
 

Clearly, not the best, there’s a load of repetition there and all that but since I am too bored to fix that up, I will proceed with my next solution.

#!/usr/bin/env python
#Author: Shriphani Palakodety
#Filename: changeXY2.py

def changeXY(string_name):
        slices_list = []
        places = string_name.count("x")
        start = 0
        for i in xrange(places):
                position = string_name.find("x", start)
                slices_list.append(string_name[start:position]+"y")
                start = position+1
        slices_list.append(string_name[start:])
        return "".join(slices_list)
 

That is the solution where I construct portions of the string and join them at the end. Since the only idea I’ve got about efficiency is limited to checking the runtime, I did just that.

Step 1: Pick a string. I picked

boneywasawarriorwayayixkickedasterixandobelixbutt

.

Here is a time based comparision:

time python changeXY1.py
boneywasawarriorwayayiykickedasteriyandobeliybutt

real    0m0.301s
user    0m0.077s
sys     0m0.170s

That was the time taken for the recursive solution to replace all “x”s with “y”s.

time python changeXY2.py
boneywasawarriorwayayiykickedasteriyandobeliybutt

real    0m0.255s
user    0m0.077s
sys     0m0.124s

That was the time the second solution using slicing took to run. It took less time and hence it was more efficient than the recursive solution.

Did that deserve an entire blog post? Probably not. Still, stackoverflow’s podcasts are pretty good and any advice from Joel Spolsky is good advice. I might drop in a question but I have no idea what to ask him.
Well, Happy Coding :D

220, Anna Salai - Hurdle Cleared

Woohoo, I’ve managed to clear the hurdle at 220 Anna Salai. But let me tell you guys, Chennai can drive anyone mad. Here is what I recollect from my trip (I do not have a camera so no pics):

  • We drop in at a 3 star hotel named Dee Cee Manor ( who rates these hotels by the way? 3 stars is like overrating this establishment’s ridiculous rooms/food ).
  • The autos suck. Oh yes! Spencer’s Plaza is a cool shopping center and it is about 2 kilometers from Dee Cee Manor. The first auto driver I met said 120 rupees. F*ck, 120 ? For 2 kilometers? Right, I ask the bugger to put on the meter. He says no. The next guy I meet charges me 60. I climb in and realize that auto drivers in Hyderabad are better than most others
  • I then purchased two albums I had been dying to get my hands on, Paul McCartney’s Memory Almost Full and the #1 Beatle’s (I spent all my music aware days listening to the Beatles).
  • I then go to the consulate and the doors impress me.
  • The VO looks at my i20 and scribbles on my DS156 and then says in 1 week, your passport will be with you. Wee ! No questions !
  • Then I went to Tirupathi ( no stupid comments on my religious beliefs. My faith is my own ).
  • I took a train to the aforementioned location. A marwari family climbed and make a mess of everything, moving people around to accommodate that king sized, 20 member family of theirs. Bah, they kept shouting, put on music at 100 decibels ( and promptly earned abuse from my mother ). I know a marwari family myself and they are bloody educated and know how to behave in public. I still think people in our nation need to understand the notion of Public property and how all citizens are entitled to silence and deserve as much space as they have paid for ( which is limited to one seat in this case ).
  • I then realized that Tirupathi wasn’t something I could survive. Did nothing there.

Oh by the way, I solved a problem ( albiet an easy one ) from the YSU-ACM High School Programming Contest - 1991 ( The best I could perform when the silence was under siege by this bloody family) on the way to Tirupathi. Here is the question: http://www.cs.berkeley.edu/~hilfingr/programming-contest/ysu_94.pdf (the inverted triangles sum).

Here is the solution:

#!/usr/bin/env python
#Author: Shriphani Palakodety
#YSU-ACM programming contest
#1991 PSet
#Premature optimization is the root of all evil.
#        - Shriphani Palakodety ( 2008 A.D )
level = "To be given"
level_list = []

while True:
        if level != "":
                level = raw_input()
                level_list.append(level)
        else:
                break

level_list.pop(-1)

def printTriangle(level):
        for i in range(int(level), -1, -1):
                print "*" * i
               

for x in level_list:
        printTriangle(x)
 

Done. Expect an increase in post frequency now. Phew, the consulate appoinment gave me nightmares.

A Few More Math Sums; Combinatorics and Triangular Numbers

I solved a couple of sums from the Project Euler archives and feel quite stupid since this is the first time in the last few weeks that a post has some code in it. I really ought to improve my efficiency. Well, the first sum I did was no. 42 and extremely special since it is the ultimate answer to the ultimate question. Well, the question was:

By converting each letter in a word to a number corresponding to its alphabetical position and adding these values we form a word value. For example, the word value for SKY is 19 + 11 + 25 = 55 = t10. If the word value is a triangle number then we shall call the word a triangle word.

Using words.txt (right click and ‘Save Link/Target As…’), a 16K text file containing nearly two-thousand common English words, how many are triangle words?

So, I whipped up a solution in no time. I was only hindered when it came to figuring out whether a number was a perfect square or not and I was trying to explore better algorithms that mine which involved squaring the integral part of the square root of type float obtained by using math.sqrt(num) and squaring this integral part to see if we return to the original number again. I couldn’t really find a decent solution so I just implemented my own answer:

#!/usr/bin/env python
#Author: Shriphani Palakodety
#Example 42, Project Euler Archives.

import string
import math

def makeAlphaDict():
        alpha_dict = {}
        for i in range(len(string.uppercase)):
                alpha_dict[string.uppercase[i]] = i + 1
        return alpha_dict

alpha_dict = makeAlphaDict()

def checkTriangular(word):
        word_sum = 0
        for char in word:
                if char == ‘"’:
                        continue
                else:
                        word_sum += alpha_dict[char]
        discrmnt = 8 * word_sum + 1
        if checkOdd(int(math.sqrt(discrmnt))) == True and checkPerfSquare(discrmnt) == True:
                return True
        else:
                return False
       
def checkPerfSquare(num):
        probable = int(math.sqrt(num))
        if probable * probable == num:
                return True
        else:
                return False

def checkOdd(num):
        if num % 2 == 0:
                return False
        else:
                return True

input = open("words.txt", "r")
words_list = input.readlines()[0].split(",")
input.close()
word_count = 0

for word in words_list:
        print word
        if checkTriangular(word) == True:
                word_count += 1
        else:
                continue

print word_count
 

Thats the first question. The next problem I solved was #53;

How many values of nCr, for 1 ? n ? 100, are greater than one-million?

Since \binom {n} {r} is a symmetric function about the middle term while r increases from 0 to n, I figured out that once we hit upon one (n, r) pair, without computing the rest, we could obtain the other values of r for a fixed value of n that crossed the million mark. Remember, if n is even, the number of terms is odd and if n is odd, number of terms is even. In case the number of terms is even, the middle terms are n / 2 and n / 2   1. In case, the number of terms is odd, the middle term is (n   1) / 2. Using just these three I could get to the solution in no time. Here is the code:

#!/usr/bin/env python
#Author: Shriphani Palakodety
#Project Euler Problem 53

def factorial(number):
        if number == 0 or number == 1:
                return 1
        else:
                return number * factorial(number - 1)

def combination(n, r):
        n_fact = factorial(n)
        r_fact = factorial(r)
        n_r_fact = factorial(n - r)

        return n_fact / (r_fact * n_r_fact)

def midTerm(n):
        if n % 2 == 0:
                mid_term_1 = n / 2
                return mid_term_1
        else:
                mid_term = (n + 1) / 2
                return mid_term

comb_num = 0

def countComb(n, r):
        ‘If a particular value is found to be greater than 0,
           it adds a certain number to comb_sum using properties of
           combinatorics.’

       
        if n % 2 != 0:
                num_to_add = 2 * (n / 2 - r + 1)
                return num_to_add
       
        else:
                num_to_add = 2 * ((n+1)/2 - r) + 1
                return num_to_add

for n in range(101):
        for r in range(midTerm(n) + 1):
                if combination(n, r) > 1000000:
                        comb_num += countComb(n, r)
                        break
                else:
                        continue

print comb_num

Thats the solution. And of course, who’ll forget the time taken to get to the solution.

shriphani@psp-laptop:~/project_euler$ time python ex53.py
4075

real    0m1.116s
user    0m0.140s
sys     0m0.024s
shriphani@psp-laptop:~/project_euler$

Till the next time I put up a post, HAPPY CODING !

Python Variables

For a very long time, I was under the impression that variables in Python were like boxes and I could throw objects in them. So, according to me, if I wrote a piece of code that went:

>>> a = "Shriphani Palakodety"
>>> print a
Shriphani Palakodety

I used to think that there is a new box and I’ve thrown the string “Shriphani Palakodety” in there. If I do:

>>> a = "Shriphani Palakodety"
>>> a = "Doofus"
>>> print a
Doofus

the general idea is that “Doofus” replaces “Shriphani Palakodety” in box “a”.

I was reading a mail on Edu-Sig (Python) and John Zelle ( he wrote an introductory computer science book ) mentioned that variables were not boxes but merely labels ( or tags, or “Sticky Notes” whatever you want to call them ). So, if I do something like:

>>>a = "Shriphani Palakodety"

All I am doing is sticking “Sticky Note” “”a”" on an object ( in this case, “Shriphani Palakodety” ). When I do something like:

>>>a = "Shriphani Palakodety"
>>>a = "Doofus"

all I have done is move the Sticky Note “a” to “Doofus” and not remove “Shriphani Palakodety” from box “a” and put “Doofus” in it. Here is my idea of a verification:

>>>a = "Shriphani Palakodety"
>>>print id(a)
10353360
>>>a = "Doofus"
>>>print id(a)
10386016

So, if a was a box, its location shouldn’t have changed. But we observe that a’s location changes.

But the Sticky Note analogy can’t be entirely applied to variables. In real life, I can stick one sticky note over another and I can have three sticky notes such that one is stuck to the object and the other two are stuck to the sticky note itself. The figure below should illustrate that:

3 Sticky Notes Attached One Above Another

The above picture should depict the following Python code:

>>> x = y = z = 0
>>> x
0
>>> y
0
>>> z
0

So far so good. But let us say, I want to remove the Sticky-Note “y”. In real life that move would be represented by:

\

This means that “z” would be affected too which is not how it works in case of Python. Hence we come to the conclusion that a sticky note should be stuck only on the object and not on another sticky note.

OR

Python’s variables ( Sticky Notes ) come with a special kind of glue that isn’t sticky on material from which the sticky notes are made. So “x”, “y” and “z” should be stuck to the object but not to one another.

This completes what I learned from the Edu-Sig mailing list so far and I hope I joined this list earlier. John Zelle ( who wrote the textbook that MIT now uses for its introductory CS ) is there and I hope to keep learning more.

Well, I would love comments that will improve my knowledge but a comment on my bad GIMP skills will earn you spam and of course, I will call you names as well.