thoughts, ideas, code and other things...

Tuesday, March 16, 2010

Failed attempts at tracking a colored object in OpenCV

Recently, I purchased a webcam for fun. I basically wanted to have some augmented reality fun while sitting at home and try to make something interactive, so that I can stand back and maybe control a car with something like a Star Wars projection torch. For now, an empty Old Spice deo can will do; it's totally red.
Time to get some tools of the trade into my bag of tricks. So this lazy pythonista looks out for something Pythonic and dead easy.
A little common sense from past experience (SpaceLock) says that OpenCV is the way to go. Backing from Intel makes it look even shinier. But surprisingly, he finds this many options: PyOpenCV, ctypes-opencv, the default SWIG-based bindings, and the completely rewritten bindings in OpenCV 2.0.
Wow, now you're in a mess where every blog and every other newbie tutorial talks about its own Python bindings for OpenCV. It's a big mess out there, with some folks on Stack Overflow saying "Since the new bindings are incomplete and the old ones are painful to use". What the hell!

Tried out the face detection code (it had to be modified to work with the OpenCV 2.0 bindings) with Haar-like features. It worked well with my face, though it couldn't detect it when I was looking down. I think one needs a bigger data file covering all angles of the human face to make detection more accurate. So much for 1MB of Haar data.

Here is a better-formatted version of the code, in case you have trouble indenting that one:

import sys
import cv

class FaceDetect():
    def __init__(self):
        cv.NamedWindow("CamShiftDemo", 1)
        device = 0
        self.capture = cv.CaptureFromCAM(device)
        capture_size = (320, 200)
        cv.SetCaptureProperty(self.capture, cv.CV_CAP_PROP_FRAME_WIDTH, capture_size[0])
        cv.SetCaptureProperty(self.capture, cv.CV_CAP_PROP_FRAME_HEIGHT, capture_size[1])

    def detect(self):
        # webcam frames arrive in BGR channel order
        cv.CvtColor(self.frame, self.grayscale, cv.CV_BGR2GRAY)

        # equalize histogram
        cv.EqualizeHist(self.grayscale, self.grayscale)

        # detect objects; each result is ((x, y, w, h), neighbors)
        faces = cv.HaarDetectObjects(image=self.grayscale, cascade=self.cascade,
                                     storage=self.storage, scale_factor=1.2,
                                     min_neighbors=2, flags=cv.CV_HAAR_DO_CANNY_PRUNING)

        if faces:
            #print 'face detected!'
            for i in faces:
                if i[1] > 10:
                    # circle centred on the detected face rectangle
                    cv.Circle(self.frame, ((2*i[0][0]+i[0][2])/2, (2*i[0][1]+i[0][3])/2),
                              (i[0][2]+i[0][3])/4, (128, 255, 128), 2, 8, 0)

    def run(self):
        # check if capture device is OK
        if not self.capture:
            print "Error opening capture device"
            sys.exit(1)

        self.frame = cv.QueryFrame(self.capture)
        self.image_size = cv.GetSize(self.frame)

        # create grayscale version
        self.grayscale = cv.CreateImage(self.image_size, 8, 1)

        # create storage
        self.storage = cv.CreateMemStorage(128)
        self.cascade = cv.Load('haarcascade_frontalface_default.xml')

        while 1:
            # do forever
            # capture the current frame
            self.frame = cv.QueryFrame(self.capture)
            if self.frame is None:
                break

            # mirror
            cv.Flip(self.frame, None, 1)

            # face detection
            self.detect()

            # display webcam image
            cv.ShowImage('CamShiftDemo', self.frame)
            # handle events
            k = cv.WaitKey(10)

            if k == 0x1b: # ESC
                print 'ESC pressed. Exiting ...'
                break


if __name__ == "__main__":
    print "Press ESC to exit ..."
    face_detect = FaceDetect()
    face_detect.run()

After some thinking I tried writing a colored object detector. I haven't yet gone into things like filters and thresholding techniques, which I guess are faster than my kiddish approach.
Simply put, what I am doing is:
  1. Grab the image
  2. Look for points whose distance from the target color is within a particular tolerance
  3. Count the number of these points
  4. If this count is greater than a particular density, show a circle at the mean location of these points
  5. Tune the constants for your lighting conditions and target color.
  6. Don't crib much, as again this is a "kid's" approach to object tracking :P
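The steps above can be tried without a webcam at all. Below is a minimal sketch on a tiny synthetic image, assuming the image is just a list of rows of (r, g, b) tuples; the function `find_color` and that image layout are made up for illustration and are not OpenCV structures.

```python
# Minimal sketch of the steps above on a synthetic image; no webcam or OpenCV needed.
# The image layout (rows of (r, g, b) tuples) and all names are assumptions
# made for illustration, not the OpenCV data structures.

def distance2(a, b):
    """Squared Euclidean distance between two (r, g, b) tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def find_color(image, target, tolerance, density, step=1):
    """Return the (x, y) mean position of pixels close to target, or None."""
    sum_x = sum_y = count = 0
    for y in range(0, len(image), step):
        for x in range(0, len(image[y]), step):
            if distance2(image[y][x], target) < tolerance:
                sum_x += x
                sum_y += y
                count += 1
    # Only report a position when enough matching pixels were seen.
    if count > density:
        return (sum_x // count, sum_y // count)
    return None


# 8x8 black image with a 3x3 reddish patch covering x=4..6, y=1..3.
image = [[(0, 0, 0)] * 8 for _ in range(8)]
for y in range(1, 4):
    for x in range(4, 7):
        image[y][x] = (170, 5, 20)

found = find_color(image, target=(172, 0, 16), tolerance=300, density=8)
missing = find_color(image, target=(0, 0, 255), tolerance=300, density=8)
```

Note that the tolerance is compared against the squared distance, so a tolerance of 300 allows colors only about 17 units away from the target in RGB space. That is one reason the constants need so much tuning.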
Here is what the failed attempt looks like:

# -*- coding: utf-8 -*-
# A simple capture and draw code
import sys
import cv

class ColorFinder():
    ''' Finds out red objects on webcam '''

    def __init__(self, colors=None, tolerance=500, density=100, step=1, windowName='ColorFinder'):
        # -- CV settings
        self.device = 0
        self.capture_size = (320, 240)
        self.windowName = windowName

        # -- Recognition settings
        # Maximum squared RGB-space distance to consider close
        self.tolerance = tolerance
        # how many pixels indicate an object
        self.density = density
        # target colors as (r, g, b), e.g. two shades of red, one bright, other dark
        self.colors = colors or []
        # step is opposite of accuracy, you have to tweak density and tolerance accordingly
        self.step = step

        # -- detection vars
        # mean positions
        self.mean_pos = [0, 0]

    def setupCV(self):
        ''' sets up opencv to capture from webcam '''
        cv.NamedWindow(self.windowName, 1)
        self.capture = cv.CaptureFromCAM(self.device)
        if not self.capture:
            print "Error opening capture device"
            sys.exit(1)

        cv.SetCaptureProperty(self.capture, cv.CV_CAP_PROP_FRAME_WIDTH, self.capture_size[0])
        cv.SetCaptureProperty(self.capture, cv.CV_CAP_PROP_FRAME_HEIGHT, self.capture_size[1])

    def distance2(self, source, dest):
        ''' finds square euclidean distance in RGB space '''
        # cv.Get2D gives (b, g, r, ...): keep three channels and reverse them to rgb
        return sum([(x-y)**2 for (x, y) in zip(source[:3][::-1], dest[:3])])

    def find_by_steps(self):
        ''' finds colored object by calculating mean position of such colors;
            returns True when enough matching pixels were found '''
        mean_pos = [0, 0] # reset the mean
        pix_count = 0 # to find density

        # use the real frame size, the camera may ignore the requested one
        width, height = cv.GetSize(self.frame)
        for x in xrange(0, width, self.step):
            for y in xrange(0, height, self.step):
                source = cv.Get2D(self.frame, y, x)
                for color in self.colors:
                    if self.distance2(source, color) < self.tolerance:
                        mean_pos[0] += x
                        mean_pos[1] += y
                        pix_count += 1
                        break # count each pixel only once
        #print pix_count # uncomment to help tune density and tolerance
        if pix_count > self.density:
            # now we have a good bulk under detection, update mean
            self.mean_pos = [t/pix_count for t in mean_pos]
            return True
        return False

    def run(self):
        ''' runs a loop to do color detection '''
        while True:
            self.frame = cv.QueryFrame(self.capture)
            if self.frame is None:
                break

            if self.find_by_steps():
                # show a circle at the mean location; radius and color are arbitrary
                cv.Circle(self.frame, tuple(self.mean_pos), 10, (0, 255, 0), 2)

            cv.ShowImage(self.windowName, self.frame)

            k = cv.WaitKey(10)
            # ESC; some builds set extra high bits and return 1048603 (0x10001b)
            if k != -1 and (k & 0xff) == 0x1b:
                print 'ESC pressed. Exiting ...'
                break

if __name__ == '__main__':
    cf = ColorFinder(colors=[(172, 0, 16)], density=8, tolerance=300, step=3) # find me these shades of red
    cf.setupCV()
    cf.run()

Afterthoughts:
  1. OpenCV is good, given the support and backing it is supposed to have. But it isn't straightforward enough for someone without an understanding of filters, etc. to try out.
  2. There is webcam support in PyGame, which I'm tempted to try out. There is a good introduction by Nirav Patel. Besides, his blog is full of inspiration for anyone else to try things out.
  3. Documentation of the current OpenCV bindings is insufficient. Besides, the presence of just one example for the new bindings and 10+ examples for the old SWIG-based bindings leaves newcomers in a dilemma. So does the presence of many other bindings, tools, etc.
  4. My current approach is very slow, hence I'm leaving a 3 pixel gap. This method is also very sensitive to the lighting conditions around your place, so better tweak it before you try it out.
  5. A better approach would be a divide-and-conquer kind of approach to finding the region (maybe an experienced person would disagree, but I'm speaking from a novice's point of view).
  6. Or even better would be filtering for only the reddish components and thresholding them to a black & white image. Afterwards, finding the white portions is not tough.
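To make the last idea concrete, here is a rough sketch of the thresholding approach in plain Python, again on a synthetic image stored as rows of (r, g, b) tuples. The "redness" measure (red minus the larger of green and blue), the threshold of 100, and the names `red_mask` and `mask_centroid` are all my own assumptions for illustration; OpenCV has its own thresholding routines, which would presumably be much faster than looping in Python.

```python
# Sketch of item 6: keep only "reddish" pixels, threshold them into a
# black & white mask, then locate the white region.
# The redness measure and the threshold value are assumptions, not OpenCV calls.

def red_mask(image, threshold=100):
    """Return a 0/1 mask: 1 where red exceeds both green and blue by more than threshold."""
    return [[1 if r - max(g, b) > threshold else 0 for (r, g, b) in row]
            for row in image]


def mask_centroid(mask):
    """Return the (x, y) mean position of the white (1) pixels, or None if there are none."""
    points = [(x, y) for y, row in enumerate(mask)
              for x, value in enumerate(row) if value]
    if not points:
        return None
    n = len(points)
    return (sum(p[0] for p in points) // n, sum(p[1] for p in points) // n)


# 6x6 grey image with a 2x2 red block covering x=3..4, y=1..2.
image = [[(90, 90, 90)] * 6 for _ in range(6)]
for y in (1, 2):
    for x in (3, 4):
        image[y][x] = (200, 20, 30)

mask = red_mask(image)
centroid = mask_centroid(mask)
```

Comparing red against the other two channels, instead of measuring distance to one fixed shade, should make the mask a little less sensitive to brightness changes, though I have not tested this on real webcam frames.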
T2 starts on Wednesday, so I'd better get back to my books for a while.
