I tried to take a shot at the problem shown above. It turns out that high-school-level math is enough to build a decent classifier for it.
In the Japanese alphabet(s), a character is composed of strokes. These strokes have a fixed order. This restriction is pretty much all you need. I grab the stroke’s end points and centroid and then normalize.
Once a stroke is finished, a nearest-neighbor classifier runs and fetches the 3 best matches. I have had very decent performance with this approach. Right now, no information about the distance between successive strokes is used. Incorporating that should improve performance by a significant amount.
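Here is a rough sketch of the idea in Python. It is not my actual code: the function names, the choice to normalize by canvas size, and the Euclidean distance are all assumptions made for illustration.

```python
import math

def stroke_features(points, width, height):
    """Turn one stroke (a list of (x, y) points) into a 6-number feature:
    start point, end point and centroid, each scaled to the 0..1 range.
    Scaling by the canvas size is an assumption made for this sketch."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    cx = sum(xs) / len(xs)
    cy = sum(ys) / len(ys)
    (x0, y0), (x1, y1) = points[0], points[-1]
    return (x0 / width, y0 / height,
            x1 / width, y1 / height,
            cx / width, cy / height)

def top_matches(query, templates, k=3):
    """Return the labels of the k templates closest to the query feature.
    templates is a list of (label, feature) pairs."""
    ranked = sorted(templates, key=lambda t: math.dist(query, t[1]))
    return [label for label, _ in ranked[:k]]

# Hypothetical reference strokes drawn on a 100 x 100 canvas.
templates = [
    ("horizontal", stroke_features([(10, 50), (50, 50), (90, 50)], 100, 100)),
    ("vertical", stroke_features([(50, 10), (50, 50), (50, 90)], 100, 100)),
    ("diagonal", stroke_features([(10, 10), (50, 50), (90, 90)], 100, 100)),
]

# A slightly wobbly horizontal stroke.
query = stroke_features([(12, 48), (50, 52), (88, 50)], 100, 100)
matches = top_matches(query, templates)
```

Because the stroke order is fixed, the n-th stroke only ever needs to be compared against templates for the n-th stroke of each character, which keeps the search small.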
I did something I have wanted to do for a long time. I submitted a paper to ICIP 2012 as first author. I’ll post a description + explanation of our idea once I hear back from the ICIP folks.
The Messier Catalog: Charles Messier was interested in cataloging comets. Frustrated with his hunt, he compiled a large list of non-comet objects that he saw in the sky. This became one of the top referenced lists of astronomical objects. These objects are given Messier numbers (M1, M2, M3…). So a list made up entirely of objects that the author had no interest in ended up being one of the top referenced lists ever.
It is possible to create a human being that is not aware of the concept of vision. See the book “The Man Who Mistook His Wife for a Hat and Other Clinical Tales” by Oliver Sacks.
Proprioception: This is our awareness of our limbs’ positions. A few case studies of the loss of proprioception are mentioned in the book above by Oliver Sacks.
I spend a good 1 – 2 months in Singapore every year (family lives there). I got a Kindle for my dad and it is a joke that even Project Gutenberg stuff is not available on AMZN’s Singapore site. This is the 21st century and we are talking about a developed country with a very sound PPP. How can the only developed country in the region not merit enough attention for a well-stocked online bookstore?
In Prof. Vishy’s ML class (cs 590 – top notch course, top notch professor), we don’t have a final and instead we are supposed to apply ML to a problem we find interesting. Microsoft gave all of us interns a Kinect this summer so I decided to put it to some use (I don’t have a TV so the XBox is just collecting dust).
My goal was to be able to record finger gestures and then detect them when a user makes these gestures. I had 2 goals in mind – no OpenCV (i.e. I will use just depth data) and no wearing special stuff to guide anything.
So, let us see what I did. Basically, I used the CandescentNUI Hand Tracker to get a collection of fingertip locations and points and then applied two techniques to try and recognize the gestures we make.
First, I tried using the Passive-Aggressive algorithm by Crammer et al. This algorithm uses an online-learning approach to build a hyperplane. In 3 dimensions a hyperplane is a plane, in 2 dimensions it is a line, and so on. Basically, it is what you get when you try to define a flat “surface”-like structure for a space, with 1 dimension less than the space we are operating in. Take 2 non-parallel vectors in 3D space and together they span an entire 2D world. The hyperplane is just that.
The hyperplane is supposed to act like a brick wall (if we’re in 3D – no point visualizing a higher dimension). When we see a new data point come in, we want to inspect on which side of the wall it lies and then we can “detect” or label this point. This is the binary classifier.
The dataset consists of raw point coordinates in the space of the human palm seen by the Kinect. Now it turns out that the online passive-aggressive algorithm fails at constructing a decent hyperplane separating 2 classes (data points for 2 different gestures).
The obvious hack was to deploy a nearest neighbors classifier. The trick I used was that I ran a large cluster k-means on the data and built myself a dataset consisting entirely of cluster centers. So I was able to reduce the neighbors tenfold and still get fantastic performance. A simple technique worked fabulously in this situation and I couldn’t be more pleased.
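Roughly, the trick looks like the sketch below. This is an illustration rather than my project code: the names are invented, the data is a toy stand-in for the fingertip coordinates, and running k-means separately per gesture (so each center inherits that gesture's label) is an assumption.

```python
import math
import random

def kmeans(points, k, iterations=20, seed=0):
    """Plain Lloyd's k-means on a list of equal-length tuples."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centers[i]))
            clusters[nearest].append(p)
        # Move each center to the mean of its cluster (keep it if empty).
        centers = [
            tuple(sum(c) / len(members) for c in zip(*members)) if members else centers[i]
            for i, members in enumerate(clusters)
        ]
    return centers

def compress(dataset, k):
    """Replace every gesture's points with k labeled cluster centers."""
    return [(label, c) for label, pts in dataset.items() for c in kmeans(pts, k)]

def classify(query, prototypes):
    """1-nearest-neighbor over the cluster centers only."""
    return min(prototypes, key=lambda lc: math.dist(query, lc[1]))[0]

# Toy data: two hypothetical gestures, 20 points each.
dataset = {
    "open_hand": [(0.1 * i, 0.05 * i, 0.0) for i in range(20)],
    "fist": [(10 + 0.1 * i, 10 + 0.05 * i, 10.0) for i in range(20)],
}
prototypes = compress(dataset, k=2)   # 40 points shrink to 4 centers
label = classify((10.5, 10.0, 10.0), prototypes)
```

The nearest-neighbor search then only has to look at the centers, which is where the speed-up comes from.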
Here is a video of the gesture-detector in action. The annotations should show you what to look at.
Emil Post actually invented a deterministic computation model. In this model, the machine operated using a FIFO queue.
1. Read the first symbol in the queue.
2. Delete the appropriate number of symbols from the head.
3. Append a string corresponding to the symbol looked up in step #1.
This machine has only 1 state. It is a valid model of computation and it is extremely simple to represent things like the Collatz sequence.
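The three steps translate almost directly into code. The sketch below assumes the well-known 2-tag system for Collatz sequences (attributed to Liesbeth De Mol): delete 2 symbols per step, with the rules a → bc, b → a, c → aaa. The function names are my own, and the machine is taken to halt once fewer than 2 symbols remain.

```python
from collections import deque

# Production rules and deletion number of the assumed Collatz tag system.
RULES = {"a": "bc", "b": "a", "c": "aaa"}
DELETION_NUMBER = 2

def run_tag_system(word, rules=RULES, m=DELETION_NUMBER, max_steps=10_000):
    """Run the tag system and return every word it passes through."""
    queue = deque(word)
    history = [word]
    for _step in range(max_steps):
        if len(queue) < m:            # too short to continue: halt
            break
        head = queue[0]               # step 1: read the first symbol
        for _i in range(m):           # step 2: delete m symbols from the head
            queue.popleft()
        queue.extend(rules[head])     # step 3: append the string for that symbol
        history.append("".join(queue))
    return history

def collatz_values(n):
    """Start from n copies of 'a' and report the length of every
    intermediate word that consists only of 'a's."""
    history = run_tag_system("a" * n)
    return [len(w) for w in history if set(w) == {"a"}]

values = collatz_values(3)
```

With this rule set an odd n goes straight to (3n + 1) / 2, so the values are the Collatz sequence with the even step after each odd number already folded in.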
For more info : http://en.wikipedia.org/wiki/Tag_system
Recently I had the unfortunate experience of using scipy’s optimizers in the scipy.optimize module. I had this issue where the algorithm wouldn’t even start iterating. Hitting Ctrl-C would just bring up some useless info about a few dll files and “Error”. No stack trace, nothing. Needless to say, my assignment took 3 days to complete (time taken to pull hair out, not to compute) and I put myself in a very bad situation (buggy HW and buggy preparation for the mid-term onslaught), neither of which can be fantastic…
Anyway, I promised myself that coursework would be for learning and not for grades but that hasn’t worked wonders for me – maybe grad school will offer some fresh perspective on education (which I desperately need since I have been a bit demotivated by these fuck-ups that are out of my control).
Oh well – my GPA’s ok and 1 dip in performance will not kill it.