CMU 15-418 (Spring 2012) Final Project

A Parallel Algorithm For Noisy or Distorted Character Recognition

Samuel Russell, Rob Waaser

Reports

Working Schedule

Week	What We Plan To Do	What We Actually Did
Apr 1-7	Implement the CUDA framework.	Implemented the CUDA framework.
Apr 8-14	Implement distortion filters and version 1 of the matching function.	Implemented v1, v2, and v3 of the image matching function. Using ImageMagick to generate distortions.
Apr 15-22	Create automated accuracy testing scripts. Experiment with different versions of the matching function.	Automated accuracy (and timing) test scripts were created. Character libraries for Captchas.net and PHP Captchas were created. Post-processing code was written.
Apr 23-26	Implement timing code and sequential version. Create character libraries for OCR Failures and Yahoo Mail captchas. Refine post-processing algorithm.	Implemented timing code and sequential version.
Apr 27-29	Work on parallel performance optimizations. Create character library for reCaptcha.	Performance optimizations.
Apr 30-May 3	Work on parallel performance optimizations.	Performance optimizations.
May 4-6	Develop a method to distribute work between multiple cluster machines and collect results.	Added functions for direct BMP reading. Built reCaptcha library.
May 7-10	Tweak parameters to maximize accuracy and performance. Create final presentation. Write demo code to automatically pull down captcha images and track accuracy.	

Working Results

Target	Example	Character Accuracy	Word Accuracy	Sequential Time (per captcha)	Parallel Time (per captcha)	Speedup
OCR Failures		Not Implemented
Captchas.net		89%	50%	7248ms	225ms (151ms kernel)	32x
PHP Captchas		75%	38%	18185ms	504ms (387ms kernel)	36x
reCaptcha		55%	10%		450ms (300ms kernel)
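
The Speedup column can be rechecked from the two timing columns. The snippet below is only a small arithmetic sketch: the helper names `speedup` and `kernel_fraction` are made up for illustration, and the numbers are copied from the table above.

```python
# Timings in milliseconds per captcha, copied from the results table:
# (sequential, parallel total, parallel kernel only).
TIMINGS_MS = {
    "Captchas.net": (7248, 225, 151),
    "PHP Captchas": (18185, 504, 387),
}


def speedup(sequential_ms, parallel_ms):
    """Ratio of sequential time to total parallel time."""
    return sequential_ms / parallel_ms


def kernel_fraction(parallel_ms, kernel_ms):
    """Share of the total parallel time spent inside the kernel."""
    return kernel_ms / parallel_ms


for target, (seq, par, kern) in TIMINGS_MS.items():
    print(f"{target}: {speedup(seq, par):.1f}x speedup, "
          f"{kernel_fraction(par, kern):.0%} of parallel time in kernel")
```

Rounded to whole numbers, the ratios agree with the 32x and 36x reported in the table. The remainder of the parallel time outside the kernel presumably covers transfers and host-side work, which the table does not break down further.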

Working Log

4/2/12: Created Project Website, Wrote Initial Proposal
4/8/12: Primary code structure is up. Working on reading in BMPs. Target captchas / images are chosen.
4/15/12: Experimenting with different matching functions. The two most promising so far are a sum of pixel-by-pixel multiplication and an edge-closeness function. The multiplication will likely work better for captchas that don't need edge detection. For reCaptcha, which uses inversion, we will have to use edge-detected images, and the edge-closeness function may prove better.
4/16/12: It looks like ImageMagick will be used as an out-of-the-box solution for creating letter distortions.
4/17/12: ImageMagick was used to create a small library for PHP Captcha. Mangal was the base font, and a set of distortions was applied, including horizontal resize, vertical resize, and BilinearForward distortions based on control points. Accuracy increased fairly significantly.
A page showing sample maps was created.
4/19/12: Post-processing function was implemented to filter maps and generate a guess. Sample output is available here for one of the successful PHP captchas.
4/26/12: Sequential version is implemented and timed. Speedup is good but can be improved. One of the big areas I want to focus on is reducing the memory transfer required (specifically the results buffer).
4/27/12: Yahoo Mail has changed the format of their captchas. I only have 10 test images I saved of the old format. The new format appears to use a range of fonts and fairly severe distortions, and applies an edge filter to the images. Theoretically still solvable with our algorithm, but the difficulty definitely went up compared to the old format, most notably because of the multiple fonts.
5/1/12: Made a sizeable speed improvement by reducing each column to its maximum in a CUDA kernel and then copying only one row of maximums back from the GPU. This operation was previously done sequentially in post-processing.
Further optimizations by minimizing kernel operations.
5/4/12: Added functions to read BMP files directly instead of converting.
Built initial reCaptcha library.
5/8/12: Fixed bug in BMP reading functions.
Expanded reCaptcha library. Got to ~50% letter accuracy without post-processing implemented.
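
The 4/15 and 5/1 entries describe a matching function based on a sum of pixel-by-pixel multiplication and a later reduction of each column of results to its maximum. The following is a minimal CPU sketch of that idea in plain Python, not the project's CUDA code. It assumes images and character templates are 2D lists of numbers where higher values mean ink, and that the template is tried at every position where it fits; the function names `match_score`, `score_map`, and `column_maxima` are hypothetical.

```python
def match_score(image, template, top, left):
    """Sum of pixel-by-pixel products between the template and the
    image window whose upper-left corner is at (top, left)."""
    total = 0.0
    for r, row in enumerate(template):
        for c, t in enumerate(row):
            total += t * image[top + r][left + c]
    return total


def score_map(image, template):
    """Score every placement where the template fits inside the image."""
    rows = len(image) - len(template) + 1
    cols = len(image[0]) - len(template[0]) + 1
    return [[match_score(image, template, y, x) for x in range(cols)]
            for y in range(rows)]


def column_maxima(scores):
    """Reduce each column of a score map to its maximum, so that only
    one row of values has to be handed on to post-processing."""
    return [max(column) for column in zip(*scores)]


# Tiny example: a 2x2 block of ink matched against a 2x2 all-ones template.
image = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
template = [
    [1, 1],
    [1, 1],
]
scores = score_map(image, template)
print(column_maxima(scores))  # [2.0, 4.0, 2.0]
```

In this sketch the column reduction shrinks a 2x3 score map to a single row of three values. On the GPU the same reduction is what would allow one row, rather than the whole results buffer, to be copied back to the host, which is the saving the 5/1 entry reports.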