May 2, 2000
 
Tobun
Dorbin Ng
 
| Wean Hall 3414 Pittsburgh, PA 15213 | Office: 1-412-268-4499 Fax: 1-412-268-5576 | 
Doctor of Philosophy in Management, 2000
The University of Arizona,
Tucson, AZ
Major:
Management Information Systems
Minor:
Computer Science
Major Advisor:
Dr. Hsinchun Chen
Major Committee Members: Dr.
Jay F. Nunamaker Jr., Dr. Olivia R. Liu
Sheng
Minor Committee Members: Dr. Richard T.
Snodgrass, Dr. John H.
Hartman
Dissertation Title: A Concept Space
Approach to Semantic Exchange
Master of Science in Management Information Systems, 1993
The University of Arizona, Tucson, AZ
Bachelor of Science in Business Administration, 1990
The University of Arizona, Tucson, AZ 
Majors:
Management Information Systems, Finance
Systems Scientist for “Informedia-II:
Digital Video Library” project which is one of the Digital Libraries
Initiative Phrase 2 projects funded by National Science Foundation (NSF).  This project aims to transform the paradigm
for accessing digital video libraries through meaningful, manipulable overviews
of video document sets, multimodal queries, and adaptive summarizations of very
large amounts of video from heterogeneous distributed sources.  Video information collages are the key
technology in Informedia-II and will be built by advancing information
visualization research to effectively deal with multiple video documents.  A video information collage is a
presentation of text, images, audio, and video derived from multiple video
sources in order to summarize, provide context, and communicate aspects of the
content for the originating set of sources. 
I am responsible for research and system development system for this
Informedia-II Project.  (Principal
investigators: H. Wactlar)
Project coordinator and technical point-of-contact for “GeoWorlds: Integrated Digital Libraries and Geographic Information Systems” project funded by the Defense Advanced Research Projects Agency (DARPA). This project is to integrate Geographical Information Systems and Digital Library technologies into a single system. This system will support a strategic alliance of technology developers and transfer the resulting technology to multiple military partners designated by DARPA; the partners are concerned with applying the technology in disaster relief operations. The integration is a collaborative effort from multiple sources (including the USC/ISI DASHER Project, the University of Arizona, the University of Illinois Urbana-Champaign Digital Library Initiative and NCSA, the University of California at Berkeley, and the University of California at Santa Barbara). The resulting DASHER services will be tested in military environments using disaster relief scenarios developed jointly with organizations such as SPAWAR System Center. I am responsible for coordinating, designing, and implementing application program interfaces (APIs) of Arizona's technological components. I am also responsible for delivering the APIs to and coordinating technology integration with USC/ISI. (Principal investigators: R. Neches, H. Chen)
System designer and software engineer for “The Interspace
Prototype:  An Analysis Environment
based on Scalable Semantics” project funded by the Defense Advanced Research
Projects Agency (DARPA).  This project
is to build a complete prototype environment for semantic association of
multimedia information and to evaluate its utility on real collections.  The semantic association relies on
statistical clustering for concepts and categories.  Interactive navigation using the semantic association enables
information retrieval at a deeper level than previously possible for diverse
large collections.  I am responsible for
design and develop a three-dimensional prototype demonstrating the semantic
interoperability between places on a three-dimensional geographic landscape and
topics on a three-dimensional semantic landscape.  (Principal investigators: B. Schatz, H. Chen)
System designer and software engineer for “COPLINK: Database
Integration and Access for a Law Enforcement Intranet” project funded by the
National Institute of Justice (NIJ). 
This project is intended to be a beginning and a catalyst to solve law
enforcement information technology problems through a synergy between the
research in Artificial Intelligence (AI) Lab at the University of Arizona and
the application in the Tucson Police Department. Innovative and functional
technologies will expand the uses of COPLINK from database integration, secured
Intranet access, and mobile computing, to criminal intelligence and case
analysis, mug-shot face recognition, and intelligent agents.  I am responsible for designing an algorithm
to correlate crime-related information such as people, locations, incidents,
vehicles, and weapons.  I am also
responsible for designing and developing an Intranet prototype.  (Principal investigator: H. Chen;
Collaborator: D. Smith)
Principal system architect for an NSF CISE-funded “Supplement to Alexandria DLI
Project:  A Semantic Interoperability
Experiment for Spatially-Oriented Multimedia Data” project.  This research aims to examine semantic
interoperability issues related to spatially-oriented, multimedia geographic
information access. Based on the concept space approach developed by the
Illinois Digital Library Initiative (DLI) project and the Alexandria
(University of California at Santa Barbara) geo-referenced collections, this
research proposes to develop knowledge representations and structures to
capture concepts of relevance to spatial and multimedia information (natural
language phrases and geo-related textures). Selected machine learning
techniques and general Artificial Intelligence (AI) graph traversal algorithms
will also be adopted to assist in semantic, concept-based spreading activation
in integrated knowledge networks.  I am
responsible for designing a system architecture to integrate various semantic
components, defining and analyzing interoperable and scalable semantic
components in different media, and developing a web accessible system
prototype.  (Principal investigators: H.
Chen, T. Smith)
Principal system designer and software engineer for “Information Analysis and
Visualization for Cancer Literature” project funded mainly by the National
Cancer Institute (NCI).  This project
aims to develop concept spaces (networks of vocabularies) and category maps for
cancer-related literature and documents. Based on selected automatic indexing,
linguistic parsing, cluster analysis, neural networks, and information
visualization techniques, the system-generated concept spaces and category maps
will be used to enhance cancer-related information retrieval and knowledge
sharing.  I am responsible for designing
and developing scalable algorithms and software to analyze automatically a
million CancerLit records and a common gateway interface (CGI) server for
semantic (concept-based) retrieval.  I
am also responsible for managing both HTML and Java interface development to
provide Internet access to CancerLit server. 
(Principal investigators: B. Schatz, H. Chen, S. Hubbard)
Principal system designer and software engineer for an
NSF/NASA/ARPA-funded “Digital
Library Initiative” project.  The
project goal is to develop a large-scale testbed for building next-generation
digital libraries for the National Information Infrastructure (NII).  The testbed collections are mainly in the
engineering domains, to be contributed by major engineering societies and
publishers (e.g., IEEE, John Wiley \& Sons, etc.).  I am responsible for developing scalable
analysis tools to analyze automatically a gigabyte of textual information for
semantic (concept-based) retrieval.  I
am also responsible for developing both web server and interface for semantic
retrieval.  (Principal investigators: B.
Schatz, H. Chen)
Principal investigator for two and principal software engineer for
eight National Center for Supercomputing
Applications (NCSA) High-performance Computing Resources Grants:
1.     
“High-Performance
Digital Library Classification Systems: 
From Information Retrieval to Knowledge Management” (1999-2000)
2.     
“Semantic
Analysis on Large Image Collection” (1998-1999)
3.     
“Parallel
Computation for a Semantic Interoperability Environment” (1997-1998)
4.     
“Medical
Information Analysis and Knowledge Discovery” (1997)
5.     
“Parallel
Semantic Analysis for Spatially-Oriented Multimedia GIS Data” (1996-1997)
6.     
“Information
Analysis and Knowledge Discovery for Digital Libraries” (1995-1997)
7.     
“Terabyte
Information Analysis and Knowledge Discovery” (1994-1995)
8.     
 “Building the Interspace:  Digital Library Infrastructure for a
University Engineering Enviroment” (1994-1995)
These projects aim to generate concept spaces for domains in
engineering, medical informatics, and Internet.  Concept spaces then will be used to support concept-based
retrieval and cross-domain vocabulary switching in scientific information
retrieval.  I am responsible for scaling
various analysis algorithms to utilize the supercomputing resources - SGI/Cray
Origin2000 (128 nodes), SGI's Power Challenge Array (148 nodes), Convex's
Exemplar (64 nodes), and Thinking Machine's CM-5 (512 nodes) – which have been
provided by NCSA.  (Principal
investigator: H. Chen, T. D. Ng; Collaborators: B. Schatz, L. Smarr)
Software engineer for an NSF-funded “National Collaboratory”
project.  The project goal is to design
concept-based information retrieval and information sharing software for
molecular biologists whose work is related to the Human Genome Initiative.  A (nematode) worm concept space and a fly
(Drosophila) concept space which can assist in cross-domain concept exploration
and term suggestion during information retrieval have been created and are in
use by worm biologists.  I was
responsible for developing a text parser to automatically index term phrases
including scientific names like chemical compound and gene names from fly
abstracts.  In addition, I was
responsible for creating the fly concept space.  (Principal investigators: B. Schatz, H. Chen, S. Ward;
Collaborators: T. Yim, D. Fye, J. Martinez, K. Powell, E. Grossman, T.
Friedman, J. Calley)
System designer and software engineer for an intelligence analysis
and retrieval system, which supports intelligence analysts who study
information technology policy, manufacturing, and proliferation in the (former)
USSR countries.  A content-based
intelligence analysis and retrieval system was developed in ANSI C and runs on
VAX/VMS and a DECStation (UNIX based). 
I was responsible for developing the content-based retrieval system
component using branch-and-bound algorithm and for designing the system user
interface.  (Collaborators: H. Chen, S.
Goodman, W. McHenry, P. Wolcott, K. Lynch, A. Himler, R. Orwig, K. Basu)
Principal system designer and software engineer for an NIH-funded
project investigating a neural network approach to pharmaceutical
applications.  Sample applications
include drug solubility prediction and non-linear pharmacokinetics functions
approximation.  I was responsible for
developing a prediction system, which was based on a neural network model using
a Backpropagation algorithm to estimate solubility of certain chemical
compounds.  I derived a systematic
approach to train the Backpropagation network. 
The trained Backpropagation network out-performed regression model which
was employed by the same application. 
(Principal investigators:  H.
Chen, H. Chow, S. Yakowski; Collaborator: P. Myrdal)
Instructor
(1996), The University of Arizona, “Data Structures and Algorithms,”
(undergraduate course), using Pascal. 
Student instructor-evluations on overall rating of instructor's
effectiveness based on the following 5-point scale:  5 for “almost always,” 4 for “more than 1/2
of the time,” 3 for “about 1/2 of the time,” 2 for “less than 1/2 of the time,”
and 1 for “almost never.”
·        
Spring
1996, rating: 4.43 (out of 5.00) top 10%
·        
Summer
1996 (10-week session), rating: 4.47 (out of 5.00) top 10%
Teaching Assistant (1990), The University of Arizona, “Data Structures and
Algorithms” (Undergraduate course)
Teaching Assistant (1990-1993), The University of Arizona, “Introduction to Federal
Taxation” (Undergraduate course)
World Wide Web (WWW) Common Gateway Interface (CGI) server
designer and engineer for Artificial Intelligence (AI) Lab (http://ai.bpa.arizona.edu) at the
University of Arizona.  Web services
include:
·        
WormSpace
server, http://bpaosf.bpa.arizona.edu:8000/cgi-bin/BioQuest,
automatic indexing, thesaurus generation, file system organization, and WAIS
indexing.  (1994)
·        
CSQuest
server, http://ai.bpa.arizona.edu/html/csquest,
automatic indexing, thesaurus generation, file system organization, and Simple
Web Indexing System for Humans (SWISH) indexing and server.  (1995)
·        
Et-map
server, http://ai.bpa.arizona.edu/ent,
automatic indexing, thesaurus generation, Thesaurus Indexing & Propagation
System (tips) server, HTML interface. 
(1996)
·        
GisSpace
server, http://ai.bpa.arizona.edu/cgi-bin/tng/GisSpace,
automatic indexing, multi-thesaurus generation, Thesaurus Indexing &
Propagation System (tips) server, HTML interface, interface to Java
applets.  (1996)
·        
CancerSpace
server, http://ai.bpa.arizona.edu/go/medical/cancerspace.html,
automatic indexing, thesaurus generation, Thesaurus Indexing & Propagation
System (tips) server, HTML interface, interface to Java applets.  (1997-1999)
·        
Informedia:
Contextual Search Interface, http://processc.inf.cs.cmu.edu/tng/inf,
automatic phrase formation and thesaurus generation from CNN broadcasting news
video.  (1999-present)
Artificial Intelligent (AI) Lab in the Department of Management
Information Systems at the University of Arizona.  Systems include:
·        
Two
high-performance supercomputers:  SGI
Origin2000 (8 R10000 processors, 1 GB Memory) and DEC Alpha 4100 (Dual 466Mhz
Alpha processors, 2 GB Memory) (1998-1999)
·        
Ten
Unix servers:  2 Alpha servers, 4 HP
servers, 3 SGI servers, and 1 Linux server (1992-1999)
·        
Seventeen
Windows NT workstations (1996-1999)
·        
Supercomputing
facilities:  SGI/Cray Origin2000 with
Irix, SGI Power Challenge Array with Irix, Convex Exemplar with HP-UX, and
Thinking Machine CM-5 with Sun OS. 
Parallel programming using C language. 
(Funded by NSF/NASA/ARPA “Digital Library Initiative” project and six
NCSA High-performance Computing Resources Grants) (1994-2000)
·        
UNIX
workstations:  SGI workstation with
Irix, HP workstation with HP-UX, DEC Alpha workstation with OSF1 and Digital
UNIX, DECStation with Ultrix, Sun workstation with Sun OS and Solaris, IBM
workstation with AIX, and Linux.  C,
Java, Perl, shell scripts, and SAS/STAT (1991-present)
·        
DEC
VAX/VMS.  C, Pascal, Minitab, SAS/STAT,
INGRES (1989-1996)
·        
Parallel
programming in C.  Programs are used in
NSF/NASA/ARPA-funded and NCI-funded projects. 
(1994-1999)
·        
C
programming.  Programs are used in
NSF-funded, NIH-funded, and NSF/NASA/ARPA-funded projects.  (1989-present)
·        
Java
programming.  (1996-present)
·        
Perl
programming.  (1997-present)
·        
Pascal
programming.  (1989-1991)
·        
Shell
scripts (C and Bourne shell).  Scripts
are used in concept space generation projects and World Wide Web server
development.  (1994-present)
Intelligent information retrieval (IR), concept space generation,
automatic thesaurus browsing and traversal, machine learning for IR.
Semantic interoperability for information analysis environment,
Internet resource discovery, digital libraries, IR for large-scale multimedia
and scientific databases.
Knowledge discovery in multimedia databases, machine learning,
knowledge-base systems, knowledge management, neural networks computing.
Software engineering, parallel computing, group support systems,
collaborative computing, telecommunication, distributed database systems.
Member of the Institute of Electrical and Electronics Engineers
(IEEE), IEEE Computer Society, the Association of Computing Machinery (ACM).
Reviewer for IEEE Computer, IEEE Expert, Journal of the American
Society for Information Science (JASIS), Hawaii International Conference on
System Sciences.
Recipient of a National Center for Supercomputing Applications
(NCSA) High-performance Computing Resources Grant, “High-Performance Digital
Library Classification Systems:  From
Information Retrieval to Knowledge Management.”  (Principal investigators: H. Chen, T. D. Ng) (1998-1999)
Recipient of a National Center for Supercomputing Applications
(NCSA) High-performance Computing Resources Grant, “Parallel Computation for a
Semantic Interoperability Environment.” 
(Principal investigators: H. Chen, T. D. Ng) (1997-1998)
Recipient of Graduate Research Assistantship, NSF/NASA/ARPA-funded
“Digital Library Initiative” project (IRI9411318), Department of Management
Information Systems, College of Business and Public Administration, the
University of Arizona.  (1994-1999)
Recipient of Graduate Research Assistantship, NSF-funded “National
Collaboratory” project (IRI9211418), Department of Management Information
Systems, College of Business and Public Administration, the University of
Arizona.  (1993-1994)
Recipient of Graduate Tuition Scholarship, Department of
Management Information Systems, College of Business and Public Administration,
the University of Arizona.  (1993-1994)
Recipient of Graduate Research Assistantship, NIH-funded project
(BRSG S07RR07002), Department of Management Information Systems, College of
Business and Public Administration, the University of Arizona.  (1992-1993)
Member of the Honors Center, the University of Arizona.  (1990)
The University of Arizona, College of Business and Public
Administration Dean's Award. 
(1989-1990)
The University of Arizona, College of Business and Public
Administration Dean's List.  (1989-1990)
Arizona Western College, Dean's List.  (1987-1988)
1.     
A. L.
Houston, H. Chen, S. M. Hubbard, B. R. Schatz, T. D. Ng, R. R. Sewell,
and K. M. Tolle, “Medical Data Mining on the Internet:  Research on a Cancer Information System,” Artificial
Intelligence Review, Volume 13, Number 5/6, Pages 437-466, December 1999.
2.     
A. L.
Houston, H. Chen, B. R. Schatz, R. R. Sewell, K. M. Tolle, T. E. Doszkocs, S.
M. Hubbard, and T. D. Ng, “Exploring the Use of Concept Space, Category
Map Techniques, and Natural Language Parsers to Improve Medical Information
Retrieval,” Decision Support Systems, Special Issue on Decision Support
for Health Care in a New Information Age, 1999, forthcoming.
3.     
B.
Zhu, M. Ramsey, T. D. Ng, H. Chen, and B. R. Schatz, “Creating a
Large-Scale Digital Library for Georeferenced Information,” D-Lib Magazine,
Volume 5, Number 7/8, July/August, 1999, http://www.dlib.org/dlib/july99/zhu/07zhu.html.
4.     
B. R.
Schatz, W. Mischo, T. Cole, A. Bishop, S. Harum, E. Johnson, L. Neumann, H.
Chen, T. D. Ng, “Federated Search of Scientific Literature,” IEEE
Computer, Special Issue on Digital Libraries, Volume 32, Number 2, Pages
51-59, February, 1999.
5.     
H.
Chen, J. Martinez, A. Kirchhoff, T. D. Ng, and B. R. Schatz,
“Alleviating Search Uncertainty Through Concept Associations:  Automatic Indexing, Co-occurrence Analysis,
and Parallel Computing,” Journal of the American Society for Information
Science, Special Issue on “Management of Imprecision and Uncertainty in
Information Retrieval and Database Management Systems,” Volume 49, Number 3,
Pages 206-216, 1998.
6.     
H.
Chen, J. Martinez, T. D. Ng, and B. Schatz, “A Concept Space Approach to
Addressing the Vocabulary Problem in Scientific Information Retrieval: An
Experiment on the Worm Community System,” Journal of the American Society
for Information Science, Volume 48, Number 1, Pages 17-31, January, 1997.
7.     
H.
Chen, B, R. Schatz, T. D. Ng, J. Martinez, A. Kirchhoff, and C. Lin, “A
Parallel Computing Approach to Creating Engineering Concept Spaces for Semantic
Retrieval:  The Illinois Digital Library
Initiative Project,” IEEE Transactions on Pattern Analysis and Machine
Intelligence, Special Section on Digital Libraries: Representation and
Retrieval, Volume 18, Number 8, Pages 771-782, August, 1996.
8.     
H.
Chow, H. Chen, T. D. Ng, P. Myrdal, and S. H. Yalkowsky, “Using
Backpropagation Networks for the Estimation of Aqueous Activity Coefficients of
Aromatic Organic Compounds,” Journal of Chemical Information and Computer
Sciences, American Chemical Society, Volume 35, Number 4, Pages 723-728,
July/August, 1995.
9.     
H.
Chen, and T. D. Ng, “An Algorithmic Approach to Concept Exploration in a
Large Knowledge Network (Automatic Thesaurus Consultation):  Symbolic Branch-and-bound Search vs.
Connectionist Hopfield Net Activation,” Journal of the American Society for
Information Science, Volume 46, Number 5, Pages 348-369, June 1995.
10. 
H.
Chen, K. J. Lynch, K. Basu, and T. D. Ng, “Generating, Integrating, and
Activating Thesauri for Concept-Based Document Retrieval,” IEEE Expert,
Special Series on Artificial Intelligence in Text-Based Information Systems,
Volume 8, Number 2, Pages 25-34, April, 1993.
1.     
Demonstration:  H. Chen, B. R. Schatz, and T. D. Ng,
“The Interspace Prototype” for the University of Illinois at Urbana-Champaign,
in the Information Management (IM) and Intelligent Collaboration &
Visualization (IC&V), Principal Investigator Meeting, Kahuku - Oahu,
Hawaii, October 26-29, 1998.  Sponsored
by DARPA Information Technology Office (ITO).
2.     
Presentation
and Demonstration:  T. D. Ng,
“Semantic Interoperability for Geographic Information Systems” for the
University of Illinois at Urbana-Champaign Digital Library Initiative (DLI)
project and the University of California at Santa Barbara DLI project, in the NSF/ARPA/NASA
Digital Library Initiative Winter 1998 All-Project Meeting, Berkeley, CA,
January 5-6, 1998.  Sponsored by
NSF/ARPA/NASA and the University of California at Berkeley, http://ai.bpa.arizona.edu/tng/pub/DLI98_Berkeley/index.htm.
3.     
Demonstration:  H. Chen, B. R. Schatz, and T. D. Ng,
“The Interspace Prototype” for the University of Illinois at Urbana-Champaign,
in the Joint Information Collaboration & Visualization (IC&V) and
Information Management (IM), Principal Investigator Meeting, San Diego, CA,
October 15-17, 1997.  Sponsored by DARPA
Information Technology Office (ITO).
4.     
Poster
Session:  H. Chen, T. R. Smith, and T.
D. Ng, “GeoScience Self-organizing Map and Concept Space,” in the 2nd
ACM International Conference on Digital Libraries '97, Philadelphia, PA,
July 23-26, 1997.
5.     
Poster
Session:  H. Chen, B. R. Schatz, A. L.
Houston, R. R. Sewell, T. D. Ng, and C. Lin, “Internet Browsing and
Searching:  User Evaluation of Category
Map and Concept Space Techniques,” in the 2nd ACM International Conference
on Digital Libraries '97, Philadelphia, PA, July 23-26, 1997.
6.     
Poster
Session:  H. Chen, B. R. Schatz, S. M.
Hubbard, T E. Doszkocs, A. L. Houston, R. R. Sewell, K. M. Tolle, and T. D.
Ng, “Medical Information Retrieval,” in the 2nd ACM International
Conference on Digital Libraries '97, Philadelphia, PA, July 23-26, 1997.
1.     
R. V.
Hauck, R. R. Sewell, T. D. Ng, and H. Chen, “Concept-based Searching and
Browsing: A Geoscience Experiment,” submitted to the IEEE Transactions on
Systems, Man, and Cybernetics Part C:  Applications
and Reviews, 1999.
2.     
K. M.
Tolle, H. Chen, and T. D. Ng, “Improving Concept Extraction from Text
Using Natural language Processing Noun Phrasing Tools:  An Experiment in Medical Information
Retrieval,” submitted to the Journal of the American Medical Informatics
Association, 1999.
3.     
A. L.
Houston, H. Chen, S. M. Hubbard, B. R. Schatz, T. D. Ng, R. R. Sewell,
and K. M. Tolle, “Health Care Information Infrastructures:  A Critical Component of the NII,” submitted
to Journal of the American Society for Information Science, 1999.