ARCHIVED CONTENT - THIS SITE IS NO LONGER UPDATED

Tutorial on "Recognition in Video"

by Dr. Dmitry O. Gorodnichy

National Research Council Canada

Tutorial

Canadian Conference on Electrical and Computer Engineering (CCECE'06)

May 7 to 10, 2006

Ottawa Congress Centre, Ottawa, Canada.

When: Sunday, May 7, 8:30 AM-12:00 PM

Where: Room SITE E0126, University of Ottawa

For the official CCECE'06 tutorial site, visit http://www.ieee.ca/ccece06/tutorials.html.

For registration information, visit http://www.ieee.ca/ccece06/registration.html

(NB: Early registration deadline - April 7, 2006)

This page is for those who are interested in attending the tutorial and would like to prepare for it in advance.

Below you will find a detailed outline of the tutorial and related additional information.

Preliminaries:

The tutorial is designed to benefit both beginners and researchers already familiar with video processing.

Each tutorial attendee will receive a printed handout of the tutorial materials.

If you have a webcam and Visual Studio C++ installed on your laptop, you may bring the laptop to the class. To be able to write a video recognition program, you will need to download the following free software:

Microsoft DirectX 8 or later SDK [required for camera functionality]:

From www.microsoft.com/directx/, or directly from

http://www.microsoft.com/downloads/details.aspx?FamilyID=1c8dc451-2dbe-4ecc-8c57-c52eea50c20a&DisplayLang=en

Intel OpenCV (beta 4) Library for Windows [required for image capture and processing functionality]:

From http://sourceforge.net/projects/opencvlibrary or directly from

http://prdownloads.sourceforge.net/opencvlibrary/OpenCV_b4a.exe?download

Abstract:

Video processing is no longer the prerogative of a few. With video cameras now affordable, computers powerful enough to process video in real time, and a large pool of Open Source video libraries available to everybody, practically anybody with basic engineering skills can now create a video processing system. The variety of applications for video processing is just as impressive: from surveillance, video coding and annotation, to computer-human interaction and collaborative environments, to multimedia, video conferencing and computer games, to name a few. Building a vision system for a specific application, however, is a big challenge because of the complexity and versatility of the video recognition problem inherent in all video processing tasks, whether it is tracking of objects, recognition of events or person identification. While it is very easy for humans to give meaning to observed visual stimuli, it is not easy for computers, and engineering skills alone may not be sufficient to resolve the problem. This tutorial aims to provide a comprehensive background to the problem of recognition in video and to present an overview of the available solutions to this problem. Both live demonstrations and simple video coding techniques will be shown. By the end of the tutorial, the audience will be able to create their own perceptual vision system using a webcam in a Windows environment.

Keywords: video processing, video analysis, content extraction, intelligent video.

Outline of the tutorial:

Part I. Getting the background. How we (the human visual system) do it.

What makes video processing special:
a) Highly demanded, many open problems - Overview of various applications: from security, to multi-media, to computer-human interaction, to video annotation and video-based search.
b) No longer the prerogative of a few - Anybody can do it now. Introduction to the Open Source Intel OpenCV library.
c) Critical differences from image processing: low resolution and low quality.
d) Real-time nature of video - constraint or advantage?
e) Another critical difference and advantage: cues from biological vision.
From video input to recognition results output
a) Video recognition as intersection of Computer Vision, Machine Learning/Pattern Recognition and Neurobiology.
b) Four fundamental video recognition problems: Detection, Tracking, Memorization & Association
c) Hierarchy of video processing tasks.
d) Four modalities of video information: motion, colour, luminance (orientational and spatial frequency) and disparity. Their suitability for video recognition problems.
Live hands-on examples: Examining the tasks, modalities and the issues.
a) "Hello world - I see you" example using Intel OpenCV video capture and processing library.
b) Processing TV recordings.
c) On video formats, sizes & quality: webcams vs. FireWire cameras, VCD vs. DVD, NTSC vs. PAL.
Main results from Neurobiology. (Video processing and recognition by biological vision systems)
1. Dealing with low resolution/quality, video processing mechanisms:
a) Visual pathway in the brain (from retina, to visual cortex, to hippocampus)
b) Attention-based vision, salience detection mechanisms, accumulation over time.
c) Separation into channels, "where" vs. "what" processing, top-down vs. bottom-up processing
d) Visual illusions
2. Recognition in the brain. Associative recall and thinking: from neurons via synapses to memories
Main results from Associative Learning and Recognition
a) Coding memories using neural networks
b) Learning rules for tuning inter-neuronal synapses.
c) Open source High-Performance Associative Neural Network library.

Part II. How to make computers do it. Building your own Perceptual Vision System.

Using Motion: Main results and applications.
a) Statistics-based background computation (inc. Mixtures of Gaussians)
b) Change vs. motion detection. Illumination invariant change detection (inc. non-linear change detection)
c) Motion history and patterns
Using Colour: Main results and applications.
a) Segmentation (inc. hysteresis- and statistics-based)
b) Histogram-based tracking (inc. MeanShift)
c) Special consideration: Skin detection
Using Intensity: Main results and applications.
a) Detection of salient features: edges, corners (inc. hysteresis-based)
b) Detection of orientational and frequency saliencies (inc. Gabor filters simulating visual cortex processing)
c) Using shape-from-shading information
Using depth/stereo-vision (briefly: references only)
Image processing techniques.
a) Manipulations with blobs: classification, grouping
b) Deformable templates
Accumulation of information over time:
a) Histograms and evidence accumulation (inc. Dempster-Shafer evidence theory)
b) Probabilistic framework
c) Correction learning framework (inc. associative neural network)
Special consideration: faces in video
a) Four basic face processing tasks: detection, tracking, classification & identification.
b) FaceRec from video vs. FaceRec from photos. Nominal face resolution.
c) Overview of solutions
Putting it all together: Building Perceptual Vision Systems (computer vision systems with recognition capability) tailored to your needs.
a) General framework: detection->tracking->classification, motion->colour->intensity
b) Spatio-temporal decisions (accumulation over time and space)
c) Doing it with the Intel OpenCV library
d) Overview of other Open Source Video Processing libraries and datasets
e) On testing and benchmarks, using your film collection.
Examples: NRC-developed technologies:
a) Perceptual Vision Interface Nouse
b) (Multicamera, multi-person, robust, back-) tracking of faces
c) Face annotation, recognition of faces in TV recordings
d) Pianist hands/fingers recognition
Concluding remarks:
a) Areas for future development
b) References

Slides

Acknowledgement of authorship is required when using the slides for this tutorial: slides.

For the full set of slides, contact the author.
