TRIWG Proposal: Scientific Autonomy

Hi Dimi, this is the latest iteration of the technical proposal. Somewhat garbled, but I hope it will give you the general idea. Can't do any more at this point as I have to be in a field camp shortly. All the best, Mr Harpoon. Liam

Method/milestones:

1) Bayes networks to integrate and interpret sensor data (spectral data, vision and maybe radar in particular), based on a priori knowledge/models of how the data should behave.
- detect certain minerals (olivine, pyroxene, carbonates) from spectral data
- detect meteorites (or a subset thereof)
- sensor/algorithm scheduler to reduce uncertainty: decide which sensor to deploy next so as to most reduce uncertainty at the lowest cost

Note that the networks will not just integrate multiple sensors but also the results of multiple classification and object recognition algorithms, each with its own particular strengths and weaknesses. This will allow us to field-validate the Bayes network formalism as a method of integrating multiple sensor inputs from distinct modalities in a statistically rigorous manner. Furthermore, there exist methods operating on Bayes networks that can be adapted to the problem of sensor scheduling.

The intention is to first focus on using the formalism of Bayes networks to identify key features in spectroscopic and visual data collected from rocks in Antarctica and a Mars analog site in the Canadian arctic. This formalism appears well suited to integrating data from various distinct sensors and/or pattern recognition algorithms and interpreting them in the light of specialist a priori knowledge, particularly when the presence or absence of certain features promotes or inhibits the presence of others. This is important when searching for particular objects (meteorites?).

Furthermore, Bayes networks can be evaluated on incomplete data: a full sensor sweep is not necessary to make deductions. Also, a network can be processed to determine the increase in knowledge gained by taking readings with any particular sensor, or from the output of some algorithm. With this information, and the costs associated with each, it is possible to schedule sensor deployments efficiently, at least for the task of identifying objects or features coded for in the networks. (Minimal code sketches of the fusion and scheduling ideas follow below, after milestone 2.)

The effort will be concentrated on spectroscopy, vision and magnetic sensors, possibly also radar. The first two are rich sensing modalities with abundant interpretations, there is in-house expertise on all of them, and they are suitable candidates for geological fieldwork. However, it is expected that the final system will be easily expanded to include other sensors. I anticipate this part being deployed on the meteobot in the near future, as it is the easiest bit to develop.

2) Model generation and storage:
- batch mode
- incremental update as more results are added

Using position-tagged sensor data from rocks in the Antarctic and arctic, run unsupervised clustering algorithms to determine general groupings or classes. Each class will be associated with a geographical area, and will be represented by mixtures of standard distributions (such as Gaussians) that require relatively few parameters to specify. These clusters will constitute the statistical model of the environment. Note, however, that outliers, or anomalies, should also be stored in addition to the cluster parameters. Given such a model, we can determine the probability of membership of each class for a new data point, even with incomplete sensor data (i.e. not a full sensor sweep); a sketch of this follows below.
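To make the Bayes-network fusion of (1) concrete, here is a minimal sketch combining spectral and visual evidence for a single hypothesis (olivine present or absent). All priors, sensor models and numbers are illustrative placeholders, not real geological knowledge; a fielded network would encode the specialist a priori knowledge discussed above.

```python
# Minimal two-sensor Bayes network: one hypothesis node ("olivine")
# with two conditionally independent evidence nodes. All numbers are
# illustrative placeholders, not real sensor models.

PRIOR_OLIVINE = 0.1  # a priori probability at this site (made up)

# P(sensor reads "positive" | olivine present/absent)
P_SPECTRAL_POS = {True: 0.8, False: 0.15}  # spectral band match
P_VISUAL_POS   = {True: 0.6, False: 0.30}  # green crystalline texture

def posterior_olivine(spectral_pos=None, visual_pos=None):
    """Posterior P(olivine | available evidence).

    Either argument may be None, i.e. that sensor has not been
    deployed yet; the network is simply evaluated on incomplete data.
    """
    likelihood = {True: PRIOR_OLIVINE, False: 1.0 - PRIOR_OLIVINE}
    for reading, model in ((spectral_pos, P_SPECTRAL_POS),
                           (visual_pos, P_VISUAL_POS)):
        if reading is None:
            continue  # missing reading: marginalised out automatically
        for h in (True, False):
            p = model[h]
            likelihood[h] *= p if reading else (1.0 - p)
    total = likelihood[True] + likelihood[False]
    return likelihood[True] / total

print(posterior_olivine())                                # prior only
print(posterior_olivine(spectral_pos=True))               # one sensor
print(posterior_olivine(spectral_pos=True, visual_pos=True))
```

The way missing readings drop out of the product is exactly what lets the network make deductions without a full sensor sweep.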
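The sensor/algorithm scheduler of (1) could then work as follows: for each undeployed sensor, compute the expected reduction in entropy of the hypothesis, divide by that sensor's deployment cost, and pick the best. This sketch builds on posterior_olivine() and the sensor models from the previous one; the costs are made-up assumptions.

```python
import math

# Illustrative deployment costs (energy/time units, made up).
SENSOR_COSTS = {"spectral_pos": 5.0, "visual_pos": 1.0}
SENSOR_MODELS = {"spectral_pos": P_SPECTRAL_POS, "visual_pos": P_VISUAL_POS}

def entropy(p):
    """Binary entropy (bits) of the olivine hypothesis."""
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

def expected_gain(sensor, evidence):
    """Expected entropy reduction from deploying one more sensor."""
    p_now = posterior_olivine(**evidence)
    h_now = entropy(p_now)
    model = SENSOR_MODELS[sensor]
    h_after = 0.0
    for reading in (True, False):
        # Predictive probability of this reading under current belief.
        p_pos = p_now * model[True] + (1 - p_now) * model[False]
        p_reading = p_pos if reading else 1.0 - p_pos
        p_post = posterior_olivine(**dict(evidence, **{sensor: reading}))
        h_after += p_reading * entropy(p_post)
    return h_now - h_after

def next_sensor(evidence):
    """Pick the undeployed sensor with the best gain per unit cost."""
    candidates = [s for s in SENSOR_COSTS if s not in evidence]
    return max(candidates,
               key=lambda s: expected_gain(s, evidence) / SENSOR_COSTS[s])

print(next_sensor({}))  # which sensor to deploy first?
```

The expected gain here is just the mutual information between the hypothesis and the candidate reading, so cheap-but-uninformative sensors and expensive-but-decisive ones can be traded off on one scale.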
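For milestone (2), here is a minimal sketch of the clustering step and of class membership from an incomplete sweep, assuming scikit-learn and diagonal-covariance Gaussian mixtures. With diagonal covariances a missing sensor channel can be marginalised out simply by dropping that dimension. The feature layout, component count and synthetic data are all placeholder assumptions.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Rows: position-tagged samples; columns: e.g. two spectral band
# ratios and one visual texture measure (layout is a placeholder).
# Synthetic stand-in for the Antarctic/arctic field data.
X = np.random.default_rng(0).normal(size=(200, 3))

# Fit the environment model: a mixture of diagonal Gaussians,
# so each class needs relatively few parameters to specify.
gmm = GaussianMixture(n_components=4, covariance_type="diag",
                      random_state=0).fit(X)

def membership(x, observed):
    """P(class | observed dims only), marginalising missing sensors.

    x: full-length feature vector (missing entries ignored)
    observed: boolean mask of which dimensions were actually measured
    """
    obs = np.asarray(observed)
    # Per-component log density over the observed dimensions only;
    # valid because a diagonal covariance factorises over dimensions.
    diff = x[obs] - gmm.means_[:, obs]
    var = gmm.covariances_[:, obs]
    log_pdf = -0.5 * np.sum(diff**2 / var + np.log(2 * np.pi * var),
                            axis=1)
    log_post = np.log(gmm.weights_) + log_pdf
    log_post -= log_post.max()          # for numerical stability
    post = np.exp(log_post)
    return post / post.sum()

# New rock with only the two spectral channels measured:
print(membership(np.array([0.1, -0.3, np.nan]), [True, True, False]))
```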
Research issues include the best way to modify the model incrementally as more data is added, and how to schedule further sensor deployment. Conceivably, Bayes networks can be generated that classify objects according to this model. Thereafter it is a reapplication of the above algorithms to schedule sensor deployment to gain further knowledge of the object (if it is considered interesting enough, an anomaly for example).

This model will likely record the position of each object. Thus the geographical distribution of each class of objects is implicitly stored, and their prior probabilities at a given location can be deduced. Furthermore, if navigation is good enough, objects can be revisited for further examination in the light of new knowledge. Such information can also be used to refine robot position estimates, given current knowledge of the rock types in the vicinity.

This model is of intrinsic scientific value and is the main deliverable from the robot. However, it is also suitable as an aid for further exploration or for searching for specific objects. If new data can quickly be shown to belong to a known class, then no further resources need be deployed to check whether it comes from the object being searched for.

3) Belief/Bayes network synthesis from the model to:
- detect anomalies
- detect and confirm prior classes

This is the fusion of the model information with the a priori knowledge encoded in the networks of (1), into a network capable of a more complete interpretation of new sensor readings. Such a network would classify new objects (or label them as unknown, i.e. anomalies). It will schedule sensor deployment and direct search operations. A sketch of the anomaly-detection idea follows below.
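To illustrate the anomaly branch of (3): continuing from the clustering sketch above, one simple scheme (an assumption for illustration, not a committed design) is to flag a new reading as an anomaly when its likelihood under the learned mixture falls below a threshold set from the training data, and otherwise confirm its best-matching prior class.

```python
# Continuing from the clustering sketch: flag readings the model
# explains poorly as anomalies worth further sensor deployment.
train_loglik = gmm.score_samples(X)           # log p(x) per sample
threshold = np.percentile(train_loglik, 1)    # bottom 1% (tunable)

def interpret(x):
    """Label a complete new reading against the environment model."""
    loglik = gmm.score_samples(x.reshape(1, -1))[0]
    if loglik < threshold:
        return "anomaly"                      # store it; revisit later
    return f"class {gmm.predict(x.reshape(1, -1))[0]}"

print(interpret(X[0]))             # typical sample: a known class
print(interpret(np.full(3, 8.0)))  # far from all clusters: anomaly
```

Anything labelled an anomaly would be stored with its position, per milestone (2), and would be a natural trigger for the sensor scheduler to gather more data.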