Newsgroups: comp.ai.nat-lang
Path: cantaloupe.srv.cs.cmu.edu!bb3.andrew.cmu.edu!nntp.sei.cmu.edu!news.cis.ohio-state.edu!math.ohio-state.edu!howland.erols.net!netcom.com!i17
From: i17@netcom.com (Valucard International)
Subject: Re: Finding the "closest match" between texts.
Message-ID: <i17DzBM0K.ItM@netcom.com>
Organization: NETCOM On-line Communication Services (408 261-4700 guest)
X-Newsreader: TIN [version 1.2 PL1]
References: <Pine.ULT.3.95.961014164638.16974t-100000@shiva2.cac.washington.edu> <5401uj$5s1@lyra.csx.cam.ac.uk>
Date: Tue, 15 Oct 1996 14:09:55 GMT
Lines: 75
Sender: i17@netcom10.netcom.com


This would also form the foundation for a nice compression scheme
or bandwidth-widener...  especially if the exact text wording were less
important than the "generalized" idea.  Kind of like those computer-generated
composite faces averaged from hundreds of movie stars or presidents.

A "holographic generalization" could be the "centralizer" of a "connectioma"
of specific texts and text fragments.

A visual database of images linked in such a way would be very nice.
You could search for a specific face (in a Central Casting database)
by "skullpting-in" from more fuzzy generalized faces to a final set
of specific candidates.

Or you could find photos of nature scenes or buildings which progressively
refine from "Victorian 2-3-story house" to "1839 Elm Street, spooky lighting".

Choices close to the Desired Parameters would dominate the "tween-image":
a semi-ghostly multiple exposure of Choices, each distorted spatially
so as to fit on top of the others.  "More distant selections" in the
database (a structured pointcloud in some space?) would exert less
influence on the "Resultant", the "Obscure Objet-Du-Desir".
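(A minimal sketch of that distance-weighted blending, assuming the database
Choices and the Desired Parameters live as vectors in some common feature
space -- the vectors and the inverse-distance weighting are my invention,
not a spec:)

```python
import numpy as np

def tween(desired, choices, power=2.0, eps=1e-9):
    """Blend database 'choices' into a composite "Resultant",
    weighting each by inverse distance to the desired parameters:
    nearby choices dominate, distant ones exert less influence."""
    desired = np.asarray(desired, dtype=float)
    choices = np.asarray(choices, dtype=float)
    dists = np.linalg.norm(choices - desired, axis=1)
    weights = 1.0 / (dists**power + eps)   # closer -> heavier
    weights /= weights.sum()               # normalize to 1
    return weights @ choices               # weighted multiple exposure

# Toy example: three choices in a 2-D parameter space; the
# composite lands almost on top of the nearest one.
composite = tween([0.0, 0.0], [[0.1, 0.0], [1.0, 1.0], [5.0, 5.0]])
```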

As for mapping this ideation back to the text/language area,
perhaps there are algorithmic means of assigning parameters
in some "tree-dimensional space" (i.e., one navigable by the above visualizer).
Otherwise, my deepest apologies for wasting your time here in AI.NAT-LANG...


SHIVA would be an excellent name for such a system 
used for viewing and interlinking and navigating "overlapping datasets".
(His multi-arm-edness is due to "multiple exposures in time".)

Perhaps MEDUSA would name a similar system.
Imagining the interactive display of such a system is quite interesting.
Even hypnotic!  

                                                       vvvvv
: In article <Pine.ULT.3.95.961014164638.16974t-100000@shiva2.cac.washington.edu>,
: David L Miller  <dlm@cac.washington.edu> wrote:
: >
: >Given a text e-mail message, I would like to find related messages in
: >an archive.  A sub-problem is to determine which of a set of
: >pre-defined categories the message is most related to.  Tolerance for
: >grammatical and spelling errors is highly desirable.  A training phase
: >is acceptable. 
: >
: >Does anyone know of a (preferably free) package that can do this or
: >any suggestions for a reference or starting point? 
: >
: >--DLM
: >
: >-- 
: >|\ |  |\/|  David L. Miller    dlm@cac.washington.edu  (206) 685-6240
: >|/ |_ |  |  Software Engineer, Pine Development Team   (206) 685-4045 (FAX)
: >University of Washington, Networks & Distributed Computing, Box 354841
: >4545 15th Ave NE, Seattle WA 98105, USA


Miles Osborne (mo114@cl.cam.ac.uk) wrote:
: well, this seems like a text categorisation/information retrieval problem to
: me.  There are many ways to solve such a problem;  one way would be
: to construct classifiers for each of your pre-defined categories (eg. using
: ngram-based models, autoclass etc).  You then run each classifier over 
: your 'unknown' message and then assign the message's classification as
: being the most likely classifier.  In the past I've played 
: around with this idea for authorship determination.
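
(A minimal sketch of the ngram-classifier recipe Miles describes above:
one model per pre-defined category, run each over the 'unknown' message,
assign the most likely.  The category names and training texts here are
invented toys, and character trigrams stand in for the ngram models --
they also buy some of the spelling-error tolerance DLM asked for.)

```python
import math
from collections import Counter

def ngrams(text, n=3):
    """Character ngrams; sub-word units tolerate typos better than words."""
    text = text.lower()
    return [text[i:i + n] for i in range(len(text) - n + 1)]

class NgramCategory:
    """Character-trigram language model for one pre-defined category."""
    def __init__(self, training_texts, n=3):
        self.n = n
        self.counts = Counter(g for t in training_texts for g in ngrams(t, n))
        self.total = sum(self.counts.values())

    def log_likelihood(self, text):
        # Add-one smoothed log probability of the message's ngrams.
        vocab = len(self.counts) + 1
        return sum(math.log((self.counts[g] + 1) / (self.total + vocab))
                   for g in ngrams(text, self.n))

def classify(text, models):
    # Run every category's classifier over the unknown message,
    # return the category whose model finds it most likely.
    return max(models, key=lambda c: models[c].log_likelihood(text))

# Invented toy archive with two categories:
models = {
    "bugs":  NgramCategory(["the program crashes with a segfault",
                            "core dump when parsing the header"]),
    "admin": NgramCategory(["please update the mailing list address",
                            "meeting moved to tuesday afternoon"]),
}
```

Something like `classify("it dumps core while parsing", models)` would then
land in the "bugs" category.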


17  7:00 15oct96   (hmmf.  need a word for 'visual overlay-tree navisystem')
-- 

------------------------------------------------------------------------------

                      May the best hallucination win.


          I want a God who takes responsibility for His mistakes.

------------------------------------------------------------------------------
