msiegler(at)whizbang.com Research Scientist WhizBang! Labs, Inc. +1.412.683.0987 http://www.whizbang.com/
msiegler(at)ureach.com 950 Milton Street Pittsburgh, PA 15218 +1.412.519.2893 [cell] +1.412.243.7164 [home]
I turn complex projects with slipping deadlines, uncertain requirements, sprawling resource usage, and half-implemented technologies into on-time deliverables with solid requirements and reliable schedules.
I interact with customers (account or technical contacts) to build SOWs and schedules, seal requirements, and manage an iterative plan to meet their business needs. I work with marketing, developers, scientists, IT, and production staff to execute projects on time and under budget.
I build teams out of heterogeneous groups of people, accommodating both 20-year senior scientists and juniors right out of college. I work to build collective goals and meet the professional needs of employees.
Ph.D. (4.0/4.0) Electrical and Computer Engineering, Carnegie Mellon University (October 1999)
Thesis: Integration of Continuous Speech Recognition and Information Retrieval for Mutually Optimal Performance
M.S. (3.8/4.0) Electrical and Computer Engineering, Carnegie Mellon University (May 1996)
Report: Measuring and Compensating for the Effects of Speech Rate in Large Vocabulary Continuous Speech Recognition
B.S. honors (3.7/4.0) Virginia Polytechnic Institute and State University (May 1993)
Technical project management for delivery of custom products and services having complex customer requirements and extraordinary technical challenges.
Proficient in translating academic theory and research software for speech, image, and language processing into component tools for product development.
Lucid presentation and charismatic persuasion skills for both technical and lay audiences, written and oral, with individuals or groups.
Expert in all aspects of continuous speech recognition technology: feature extraction, acoustic modeling, language modeling, training, and testing.
Proficient in both UNIX and Win32 environments.
Expert in C, Perl, and Matlab.
Proficient in Java, C++, DHTML, and JavaScript.
Electronics background, with knowledge of hardware for audio, music, and telephony applications.
Retained U.S. Government security clearance: 1990-1995.
Research Scientist, Corporate Information group, WhizBang! Labs, Inc. August 2001-present
Acted as primary technical contact in the development and delivery of a large-scale web-based corporate information data extraction system to a highly respected international data provider. Managed all technical and organizational issues for a group of 10+ people with a highly diverse set of skills and experience levels. Fulfillment yielded $2M of revenue.
Proposed and prototyped an information merger strategy between customer data and automatically extracted data resulting in a contract and a new primary business line for the company. Yield of $250K for the first customer.
Proposed a fast-turnaround data extraction product and managed its prototyping, development, production, and delivery. Reduced fixed costs of production by more than 50% by automating processes, and created quality controls on data and procedures. Created a new business line serving low-quantity, fast-return customers, which widened the customer base. Yield of $500K for the first customer.
Managed production and customer fulfillment for a continuously operating 15-million-URL crawl and data delivery to a high-profile Internet-based business search provider (parent company worth more than $3 billion). Yield of $250K.
Leader, R&D group, MediaSite, Inc. October 1999-August 2001
Maintained strategy, technical development plans, and execution guidance for a $1.5M federally funded project. The project objective was to design and develop an archive and retrieval application for video conferencing, business meetings, and person-to-person memory, with customizable multimedia search, browse, and playback. Technologies used include speech recognition, speaker identification, face detection, face recognition, document text extraction, topic clustering, structured metadata extraction from media, ontology-based metadata sharing, media ingest via devices, and a custom distributed computing environment.
At first as the only R&D member and ultimately the leader of eight, I created a stable technical vision and acted as the primary recruiter to build a team of highly skilled scientists and developers. Maintained a technically and personally diverse culture and created a cohesive team. Managed the technical direction and interactions with product development, external technology customers, diffusion through conferences, and general public relations.
Conducted numerous external presentations to high-profile customers and strategic partners. Acted as technical expert on all elements of the research platform and the underlying vision in the entire software product line.
Prototype Technical Lead, July 2000-January 2001
Orchestrated all technical aspects of a rapid prototype development effort, on time and under budget.
Incorporated Donald Norman and Edward Tufte's GUI design principles in the construction of a multimedia authoring environment for repurposing corporate media for communication using a combination of DHTML email and web feedback.
Used Alan Cooper-style design methodology in the product definition phase, including identification of personas, use scenarios, and selection of a product feature set. The identification of system requirements and their implementation was successfully kept separate from the implementation of the human interactions.
Enforced Kent Beck-style Extreme Programming methodology in the rapid development phase, including a flexible test methodology, pair programming, continuous integration testing, and customer feedback.
Speech Recognition Domain Expert, October 1999 to present
Planned and implemented improvements to the corporate version of the Sphinx-II speech recognition system. Reduced word error rate by 25% through a combination of domain-specific acoustic and language training. Reduced runtime requirements by 25-50% through optimization of the feature set, use of the minimum possible acoustic model complexity, local optimization of the dictionary, and tweaking of the language model.
Provided oversight and technical development for the inclusion of IBM's R&D speech recognition platform in a released product. This required massaging research- and academic-oriented code into a robust installation system for our customers, including customization of the engine for use with searchable broadcast news, and integration with Microsoft's SAPI speech API.
Used Intel performance tuning tools to reduce computation time in the acoustic evaluation core of the Sphinx-III engine. Oversaw open-source maintenance of Sphinx-III improvements by a research programmer.
(pend) An Automatic and Efficient Method for the Segmentation of Continuous Audio Signals into Homogeneous Sections
(pend) An Automatic and Efficient Method for the Measurement of Relative Similarity Between Audio Segments
(pend) Software Architecture to Store, Blend, Navigate and Retrieve Segments of Multimedia Data
As a dissertation topic, performed the first study to rigorously investigate the quality interactions between information retrieval and speech recognition systems for searching through large repositories of spoken material. It was found that optimizations made to each system separately reduced the overall system quality, and that a thorough modeling of how errors are propagated throughout the system yielded a substantial improvement in retrieval precision. Along the way several important discoveries were made:
Created several practical techniques to improve information retrieval in speech databases using preexisting speech recognition data structures. Reduced overall error difference between text and speech documents by approximately 50% in a standard test set. (February 1998-present)
Discovered a probabilistically motivated explanation for the use of TFIDF term weighting, the most common IR relevance equation. This has largely been an unexplained heuristic for the last 30 years. (November 1998)
Led and organized the second CMU effort in the speech information retrieval task. Designed a lattice analysis tool in Perl that uses multiple hypotheses to improve performance on speech documents. (March-November 1998)
Led and organized a group of 6 researchers in the first CMU effort in the speech information retrieval task. Built a retrieval engine using Matlab. (March-November 1997)
Trained and evaluated acoustic and language models for broadcast news. (June 1996)
Designed and implemented CMUseg, a tool for breaking long streams of audio into homogeneous pieces, and managed software development to make it a public development tool. It is available on the NIST website: http://www.nist.gov/speech/software.htm (May 1996-ongoing)
As a Master's thesis topic, investigated the effects of speech rate on acoustic modeling. Studied various automatically derived features for measuring speech rate, and authored the first paper documenting its effect on automatic speech recognition. (January-December 1995)
Developed telephone-bandwidth acoustic models for speech recognition. (August 1994)
Designed and built a digitally controlled ceramic core linear motor. (January-May 1993)
Constructed a C-language implementation of a Fortran-based Expert System application. Designed and built a multiple component Fortran analysis tool to simplify conversion. (Summers 1990, 1991 and 1992)
Built a 3D viewer of orbit simulations for a major Defense Department contract. (Summer 1991)
Designed and built an interface for a foreign policy expert system. (Summer 1990)
M. Siegler, Integration of Continuous Speech Recognition and Information Retrieval for Mutually Optimal Performance, Ph.D. Thesis, Carnegie Mellon University, December 1999.
M. Siegler, M. Witbrock, Improving the Suitability of Imperfect Transcriptions for Information Retrieval of Spoken Documents, Proc. of the IEEE International Conference on Acoustics, Speech, and Signal Processing, Phoenix, Arizona, 1999.
M. Siegler, A. Berger, M. Witbrock, A. Hauptmann, Spoken Document Retrieval at CMU, Proceedings of TREC-7, The Seventh Text Retrieval Conference, 1998.
M. Siegler, M. Witbrock, S. Slattery, K. Seymore, R. Jones, and A. Hauptmann, Experiments in Spoken Document Retrieval at CMU, Proceedings of TREC-6, The Sixth Text Retrieval Conference, 1997.
M. Siegler, U. Jain, B. Raj, and R. Stern, Automatic Segmentation, Classification and Clustering of Broadcast News Audio, Proceedings of the Ninth Spoken Language Systems Technology Workshop, Harriman, New York, 1996.
U. Jain, M. Siegler, S. Doh, E. Gouvea, J. Huerta, P. Moreno, B. Raj, and R. Stern, Recognition of Continuous Broadcast News With Multiple Unknown Speakers and Environments, Proceedings of the Ninth Spoken Language Systems Technology Workshop, Harriman, New York, 1996.
M. Siegler, Measuring and Compensating for the Effects of Speech Rate in Large Vocabulary Continuous Speech Recognition, Master's Report, Carnegie Mellon University, 1995.
Matthew A. Siegler and Richard M. Stern, On the Effects of Speech Rate in Large Vocabulary Speech Recognition Systems, Proc. of the IEEE International Conference on Acoustics, Speech, and Signal Processing, Detroit, Michigan, 1995.
P. Moreno, M. Siegler, U. Jain, and R. Stern, Continuous Speech Recognition of Large Vocabulary Telephone Quality Speech, Proceedings of the Eighth Spoken Language Systems Technology Workshop, Austin, Texas, 1995.
Teaching Assistant: Sensory Perception (Fall 1996)
Lab Mediator: Signal Processing Laboratory (Spring 1994)
Teaching Assistant: Electronics Design (Fall 1993)