IMU-aided Affine-photometric KLT Feature Tracker in CUDA Framework

by Jun-Sik Kim

Data Set


This library implements the KLT tracking algorithm with an 8-DOF affine-photometric motion model. Tracking can be assisted by a synchronized IMU for better performance. The implementation is written in the NVIDIA CUDA framework for parallel processing.
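As a rough illustration of what an 8-DOF affine-photometric model means, the sketch below splits the eight parameters into six geometric ones (a 2x2 affine deformation plus a translation) and two photometric ones (contrast and brightness). The parameter ordering, the Motion8 and Point2f types, and the function names are assumptions made for this example; they are not taken from the library.

```cpp
#include <array>

// Hypothetical parameter layout (an assumption, not the library's):
//   p[0..3] : affine deformation, A = [[1+p0, p1], [p2, 1+p3]]
//   p[4..5] : translation d = (p4, p5)
//   p[6]    : contrast change (scale = 1 + p6)
//   p[7]    : brightness offset
using Motion8 = std::array<float, 8>;

struct Point2f { float x, y; };

// Geometric part: map a template coordinate (relative to the feature
// centre) into the current image.
Point2f warpPoint(const Motion8& p, Point2f u) {
    return { (1.0f + p[0]) * u.x + p[1] * u.y + p[4],
             p[2] * u.x + (1.0f + p[3]) * u.y + p[5] };
}

// Photometric part: apply the contrast and brightness change to an intensity.
float correctIntensity(const Motion8& p, float intensity) {
    return (1.0f + p[6]) * intensity + p[7];
}
```

With all eight parameters set to zero, both functions return their input unchanged, so a zero vector corresponds to "no motion, no lighting change" under this parameterization.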


Graphics Cards: NVIDIA CUDA-enabled graphics cards

This library has been tested on the following graphics cards.

  • NVIDIA GeForce 8600M (laptop)
  • GeForce 8800 GTX
  • GeForce 8800 Ultra
  • GeForce GTX 280

Operating System & Compiler: Microsoft Windows XP, Visual Studio 2005

The required (tested) libraries are

The CUDA-based tracker is interfaced with the class CFeature2DPool. Although neither the IPP nor OpenCV libraries are used in our CUDA implementation, this interface class requires both of them in order to handle images.

  CUDA implementation

Our KLT-GPU code is implemented in the following two cu files:

  • image smoothing, available in the CUDA samples
  • feature selection, registration and tracking

CUDA functions:

Initializing/Closing a CUDA session

extern "C" int InitCUDA(int imageWidth, int imageHeight, int PyrLevel, int templateSize, int maxIteration, int maxLinesearch);

extern "C" bool ExitCUDA(void);
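To give a feel for how imageWidth, imageHeight and PyrLevel interact, the sketch below lists the image size at each pyramid level. It assumes that each level halves the previous one with integer truncation and that the level count is clamped to 5, the limit mentioned in the notes; the library's actual downsampling scheme may differ. The function name pyramidSizes and the constant kMaxPyrLevel are made up for this example.

```cpp
#include <utility>
#include <vector>

// Assumed limit, mirroring the maximum pyramid level of 5 stated in the notes.
constexpr int kMaxPyrLevel = 5;

// Returns (width, height) for each pyramid level, level 0 being the
// full-resolution image. Halving per level is an assumption.
std::vector<std::pair<int, int>> pyramidSizes(int width, int height, int levels) {
    if (levels > kMaxPyrLevel) levels = kMaxPyrLevel;
    std::vector<std::pair<int, int>> sizes;
    for (int l = 0; l < levels && width > 0 && height > 0; ++l) {
        sizes.emplace_back(width, height);
        width /= 2;
        height /= 2;
    }
    return sizes;
}
```

For a 640x480 input and three levels this yields 640x480, 320x240 and 160x120.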

Feature selection

extern "C" void Caller_eig_show_mag(float* target, float ratio, unsigned int width, unsigned int height);

extern "C" void Caller_corner(cudaArray *Image, size_t pitch, unsigned int width, unsigned int height);
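For orientation, OpenCV's cvGoodFeaturesToTrack scores each pixel by the smaller eigenvalue of the 2x2 gradient matrix accumulated over a window and rejects pixels whose score falls below a fraction of the best score. The host-side sketch below shows only that scoring and thresholding step. Whether the CUDA kernels compute exactly this, and whether the ratio argument of Caller_eig_show_mag plays the same role as the threshold here, are assumptions; minEigenvalue and isCandidate are hypothetical names.

```cpp
#include <cmath>

// Smaller eigenvalue of the symmetric 2x2 gradient matrix
//   G = [ gxx  gxy ]
//       [ gxy  gyy ]
// where gxx, gxy, gyy are sums of Ix*Ix, Ix*Iy, Iy*Iy over a window.
float minEigenvalue(float gxx, float gxy, float gyy) {
    float halfTrace = 0.5f * (gxx + gyy);
    float halfDiff  = 0.5f * (gxx - gyy);
    return halfTrace - std::sqrt(halfDiff * halfDiff + gxy * gxy);
}

// A pixel stays a corner candidate when its score is at least
// `ratio` times the best score found in the image.
bool isCandidate(float score, float bestScore, float ratio) {
    return score >= ratio * bestScore;
}
```

A flat or edge-like window gives a score near zero because at least one eigenvalue vanishes, while a textured corner gives two large eigenvalues and therefore a large score.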

Feature registration

extern "C" void Caller_Tracker_Registration(float* fx, float *fy, int numfeatures, int *indexmap);

Feature tracking

extern "C" void Caller_Align(cudaArray *Image2, float* motion, float *residual, float *vec_invH);

Notes for the CUDA implementation:

  • The CUDA implementation is not fully equivalent to the accompanying CPU implementation of the affine-photometric tracker. The differences are:

    • In the CUDA implementation, there is no re-registration of features. Thus, features have a shorter lifetime than features tracked in the CPU implementation.

    • The maximum number of features is limited to 1024. Even when fewer features are selected, the tracker still processes all 1024 feature slots.

    • The number of iterations is fixed for every feature.

  • The maximum level of pyramids is set to 5. To increase it, change the "#define MAX_PYR_LEVEL" at the top of the cu file.

  • The maximum number of features is fixed to 1024.

  • The Hessian inversion is performed on the GPU by default. To do it on the CPU, edit the function Caller_Tracker_Registration after reading the comments in it.

  • The function cvGoodFeaturesToTrack_GPU() is based on the equivalent implementation in OpenCV. It uses only the maximum eigenvalue method.
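Because the tracker always works on 1024 feature slots, host code that holds fewer features may find it convenient to pad its coordinate arrays to that fixed capacity and remember which slots are real. The sketch below is one way to do that. The FeatureSlots struct, the valid flags and padFeatures are hypothetical helpers for this example, and the layout the library actually expects for fx, fy and indexmap is not documented here.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Capacity stated in the notes above.
constexpr std::size_t kMaxFeatures = 1024;

// Hypothetical host-side container: fixed-size coordinate arrays plus a
// per-slot flag marking which entries hold real features.
struct FeatureSlots {
    std::vector<float> fx, fy;
    std::vector<int> valid;  // 1 = real feature, 0 = padding
};

// Copies up to kMaxFeatures coordinates into fixed-size arrays and
// zero-fills the remaining slots.
FeatureSlots padFeatures(const std::vector<float>& xs,
                         const std::vector<float>& ys) {
    FeatureSlots s{std::vector<float>(kMaxFeatures, 0.0f),
                   std::vector<float>(kMaxFeatures, 0.0f),
                   std::vector<int>(kMaxFeatures, 0)};
    std::size_t n = std::min(std::min(xs.size(), ys.size()), kMaxFeatures);
    for (std::size_t i = 0; i < n; ++i) {
        s.fx[i] = xs[i];
        s.fy[i] = ys[i];
        s.valid[i] = 1;
    }
    return s;
}
```

Results for padded slots can then be skipped on the host by checking the valid flag, since the GPU still spends time on them.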

  Questions?

Send email to kimjs-at-cs-dot-cmu-dot-edu