Research Statement

Jia-Yu (Tim) Pan

How do we discover useful and meaningful patterns in a large database of multimedia objects (e.g., video clips) that contain data of various modalities (image, audio, text, etc.)? For example, how do news videos differ from commercials, in image and in sound? How do we summarize the differences between biomedical images of retinas in healthy and diseased conditions? Moreover, how do we correlate information across modalities (e.g., between image and text) for content-based annotation or retrieval?

My primary research interest is in developing novel data mining methods for multimedia and biomedical databases. The main goal is to discover patterns that make the information in such databases useful and accessible. My work at Carnegie Mellon University focuses on two topics in this area: first, finding characteristic patterns in uni-modal data (video clips, text, time sequences, and biomedical images); and second, discovering correlations across modalities. I have designed data mining methods that find patterns which are meaningful for summarizing data characteristics and useful in applications such as classification, summarization, and automatic annotation.

\paragraph*{Discovering uni-modal patterns.} I have been working closely with people from various disciplines, including digital video libraries, computer graphics, and the biological sciences. In these projects, we developed methods to extract patterns from uni-modal data such as video clips, time sequences, and biomedical images. By exploiting the ability of \emph{independent component analysis} to capture non-Gaussian patterns, we built a variety of data mining applications for real-world data sets. For example, the \emph{VideoCube} method~\cite{icadl02vcube} extracts meaningful visual and auditory patterns from video clips and classifies news clips and commercials with 81\% accuracy; the \emph{AutoSplit} method~\cite{PAKDD04AutoSplit} finds hidden patterns and detects outliers in time sequences such as stock prices and human motion-capture data; and the \emph{ViVo} method~\cite{icdm05vivo} automatically constructs a visual vocabulary and provides an automated tool for biomedical image analysis. Our papers on AutoSplit~\cite{PAKDD04AutoSplit} and ViVo~\cite{icdm05vivo} received the \emph{best student paper} awards at PAKDD 2004 and ICDM 2005, respectively.

\paragraph*{Discovering cross-modal patterns.} Multimedia objects like video clips contain data of various modalities, such as image, audio, and transcript text. Correlations across these modalities provide information about the multimedia content, and are useful in applications ranging from summarization to semantic captioning. To discover cross-modal correlations, we proposed a graph-based method, \emph{MAGIC}~\cite{magic}, which turns the multimedia problem into a graph problem by representing multimedia data as a graph. Using ``random walks with restarts'' on this graph, MAGIC finds correlations among all modalities, and has been successfully applied to identify relevant video shots and transcript words for summaries of news events~\cite{ICDM04MMSS}. Another successful application of MAGIC is automatic image captioning: by finding robust correlations between text and image, MAGIC achieves a 58\% relative improvement in captioning accuracy over recent machine learning techniques~\cite{KDD04MMG}.
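To make the random-walk intuition concrete, the following is a minimal sketch of the standard random-walk-with-restarts formulation; the symbols $\mathbf{A}$, $c$, $\mathbf{q}$, and $\mathbf{u}$ are illustrative notation and not taken verbatim from the MAGIC paper:
\[
\mathbf{u} \;=\; (1-c)\,\mathbf{A}\,\mathbf{u} \;+\; c\,\mathbf{q},
\]
where $\mathbf{A}$ is the column-normalized adjacency matrix of the mixed-media graph, $\mathbf{q}$ is an indicator vector for the query object (e.g., an uncaptioned image), and $c$ is the restart probability. The fixed point $\mathbf{u}$, typically computed by iterating the update above, scores every node by its affinity to the query; in the captioning setting, for instance, the highest-scoring word nodes can serve as candidate captions.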
\paragraph*{Future research directions.} In general, I am interested in solving data mining problems by combining techniques from machine learning, statistics, databases, and computer vision, as well as by developing new techniques. My dream is to develop automated methods that extract useful information from massive data sets and extend human scientific knowledge. For my next research projects, I plan to focus on data mining in two domains: biomedical data and cyber-security. In biomedical data mining, interesting directions include finding patterns in temporal or 3-D biomedical data and images, microarrays, and biological networks. In cyber-security, possible research topics include anomaly detection and pattern modeling in network traffic or web/user logs, as well as fault analysis in networked systems and services. As stepping stones toward these goals, I plan to first extend my existing tools (AutoSplit, ViVo, MAGIC, etc.) to these domains: AutoSplit for co-evolving biological or network-traffic streams; ViVo for 3-D or temporal biomedical images; and MAGIC for networked systems or biological networks. In the course of this work, I will identify novel problem domains and continuously develop new techniques for new applications. The techniques developed for biomedical data and cyber-security could also generalize to other domains, such as the Web, sensor networks, or motion animation, where time series and graphs are likewise the major data types. I would be delighted to collaborate with other research groups and to contribute my expertise to multi-disciplinary projects.

References:

\bibitem{icdm05vivo} Arnab Bhattacharya, Vebjorn Ljosa, Jia-Yu Pan, Mark~R. Verardo, Hyungjeong Yang, Christos Faloutsos, and Ambuj~K. Singh. ViVo: Visual vocabulary construction for mining biomedical images. In Proceedings of the Fifth IEEE International Conference on Data Mining (ICDM), 2005. [Best student paper award.]

\bibitem{icadl02vcube} Jia-Yu Pan and Christos Faloutsos. VideoCube: A novel tool for video mining and classification. In Proceedings of the Fifth International Conference on Asian Digital Libraries (ICADL), 2002.

\bibitem{PAKDD04AutoSplit} Jia-Yu Pan, Hiroyuki Kitagawa, Christos Faloutsos, and Masafumi Hamamoto. AutoSplit: Fast and scalable discovery of hidden variables in stream and multimedia databases. In Proceedings of the Eighth Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD), 2004. [Best student paper award.]

\bibitem{KDD04MMG} Jia-Yu Pan, Hyung-Jeong Yang, Christos Faloutsos, and Pinar Duygulu. Automatic multimedia cross-modal correlation discovery. In Proceedings of the 10th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2004.

\bibitem{ICDM04MMSS} Jia-Yu Pan, Hyungjeong Yang, and Christos Faloutsos. MMSS: Multi-modal story-oriented video summarization. In Proceedings of the Fourth IEEE International Conference on Data Mining (ICDM), 2004.

\bibitem{magic} Jia-Yu Pan, Hyungjeong Yang, Christos Faloutsos, and Pinar Duygulu. MAGIC: Graph-based multimedia cross-modal correlation spotting. Under submission, 2006.