Newsgroups: comp.ai.philosophy
From: David@longley.demon.co.uk (David Longley)
Path: cantaloupe.srv.cs.cmu.edu!rochester!udel!gatech!swrinde!pipex!peernews.demon.co.uk!longley.demon.co.uk!David
Subject: Fragments of Behaviour 4
Organization: Myorganisation
Reply-To: David@longley.demon.co.uk
X-Newsreader: Demon Internet Simple News v1.29
Lines: 407
X-Posting-Host: longley.demon.co.uk
Date: Tue, 25 Apr 1995 14:13:28 +0000
Message-ID: <798819208snz@longley.demon.co.uk>
Sender: usenet@demon.co.uk

FRAGMENTS OF BEHAVIOUR:
EXTRACT 4: FROM 'A System Specification for PROfiling BEhaviour'

Because  the  following  has  caused some   ripples  within   my   own  
profession  I have split it into several  pieces in order  to  lighten 
its distribution and increase the likelihood of stimulating comment
and criticism from those who specialise more in one area than another.  
Some  of  the issues  will be more familiar  to  philosophers,  others 
statisticians,  others  still,  applied psychologists  etc.  An  early 
version  was presented at the Spring 1993 meeting of the  Division  of 
Criminological  and  Legal  Psychology of  the  British  Psychological 
Society.

For anyone wishing to read all of the extracts, the following lists
the newsgroups to which each part was posted. I am looking for as many
professional comments as possible via newsgroup or e-mail. The full
text is available on request as a WordPerfect 5.1 file, set up to
print to a Brother HL8V.

comp.ai.philosophy   Extract 4: 25.04.95
sci.cognitive        Extract 3: 25.04.95      Extract 5: 25.04.95
sci.philosophy.tech  Extract 1: 25.04.95
sci.psychology       Extract 2: 25.04.95
sci.stat.edu         Extract 6: 25.04.95      Extract 7: 25.04.95


This ref: Extract 4: comp.ai.philosophy 25.04.95

Connectionist  systems, it is claimed, do not represent  knowledge  as 
production rules, ie as well-formed-formulae represented in the syntax 
of  the  predicate calculus (using conditionals, modus  ponens,  modus 
tollens  and  the  quantifiers), but  as  connection  weights  between 
activated predicates in a parallel distributed network:

    'Lawful  behavior  and judgments may be  produced  by  a 
    mechanism  in which there is no explicit  representation 
    of  the  rule. Instead, we suggest that  the  mechanisms 
    that   process   language   and   make   judgments    of 
    grammaticality are constructed in such a way that  their 
    performance  is characterizable by rules, but  that  the 
    rules  themselves  are  not  written  in  explicit  form 
    anywhere in the mechanism.'

    D E Rumelhart and J L McClelland (1986)
    Parallel Distributed Processing Ch. 18

Such    systems   are   function-approximation   systems,   and    are 
mathematically  a development of Kolmogorov's Mapping  Neural  Network 
Existence  Theorem  (1957). Such networks consist of three  layers  of 
processing  elements. Those of the bottom layer simply distribute  the 
input  vector (a pattern of 1s and 0s) to the processing  elements  of 
the  second  layer. The processing elements of this middle  or  hidden 
layer implement a *'transfer function'* (more on this below). The top
layer consists of output units.
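
As a rough illustration of the three-layer arrangement just described,
here is a minimal sketch of a forward pass. The sigmoid transfer
function, the function names and the weights are my own assumptions for
the example; Kolmogorov's theorem guarantees that suitable transfer
functions exist but does not say what they are, so this is not a
Kolmogorov network, only a network of the same shape.

```python
import math

def sigmoid(x):
    """An assumed transfer function; the theorem itself names none."""
    return 1.0 / (1.0 + math.exp(-x))

def forward(inputs, hidden_weights, hidden_biases, output_weights):
    """Forward pass through a three-layer mapping network.

    Bottom layer: distributes the input vector unchanged.
    Hidden layer: each unit applies the transfer function to a weighted sum.
    Top layer:    each output unit takes a weighted sum of hidden activations.
    """
    hidden = [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(hidden_weights, hidden_biases)
    ]
    return [sum(w * h for w, h in zip(row, hidden)) for row in output_weights]

# Illustrative weights only; nothing here is derived from the theorem.
hidden_weights = [[1.0, -1.0], [-1.0, 1.0]]
hidden_biases = [0.0, 0.0]
output_weights = [[1.0, 1.0]]

outputs = forward([1, 0], hidden_weights, hidden_biases, output_weights)
```

Note that everything the network 'knows' sits in the three lists of
numbers; no rule is written down anywhere in the code.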

An important feature of Kolmogorov's Theorem is that it is not
constructive. That is, it is not algorithmic or 'effective'. Since the 
proof  of  the  theorem is not constructive, we do  not  know  how  to 
determine  the key quantities of the transfer functions.  The  theorem 
simply tells us that such a three layer mapping network must exist. As 
Hecht-Nielsen (1990) remarks:

    'Unfortunately,  there  does not appear to be  too  much 
    hope  that  a method of finding the  Kolmogorov  network 
    will  be developed soon. Thus, the value of this  result 
    is  its  intellectual assurance that  continuous  vector 
    mappings   of  a  vector  variable  on  the  unit   cube 
    (actually,  the theorem can be extended to apply to  any 
    COMPACT, ie, closed and bounded, set) can be implemented 
    EXACTLY with a three-layer neural network.'

    R. Hecht-Nielsen (1990)
    Kolmogorov's Theorem
    Neurocomputing

That is, we may well be able to find weight-matrices which capture  or 
embody certain functions, but we may not be able to say  'effectively' 
what  the  precise equations are which  algorithmically  compute  such 
functions.  This is often summarised by statements to the effect  that 
neural  networks  can model or fit solutions to sample  problems,  and 
generalise to new cases, but they cannot provide a rule as to how
they  make such classifications or inferences. Their ability to do  so 
is  distributed  across the weightings of the whole weight  matrix  of 
connections  between the three layers of the network. The above is  to 
be  contrasted  with the fitting of linear discriminant  functions  to 
partition  or  classify  an  N dimensional space  (N  being  a  direct 
function   of   the  number  of  classes  or   predicates).   Fisher's 
discriminant  analysis  (and  the  closely  related  linear   multiple 
regression   technology)   arrive   at   the   discriminant   function 
coefficients through the Gaussian method of Least Mean Squares, each b 
value  and the constant being arrived at deductively via the  solution 
of simultaneous equations. Function approximation, or the
determination of hidden layer weights or connections, is by contrast
based on recursive feedback. Elsewhere within behaviour science this
is known as 'reinforcement': the differential strengthening or
weakening of connections depending on feedback or knowledge of
results. Kohonen (1988), commenting on "Connectionist Models" in
contrast to conventional, extensionalist relational databases, writes:

    'Let  me make it completely clear that one of  the  most 
    central functions coveted by the "connectionist"  models 
    is the ability to solve *implicitly defined  relational
    structures*.  The latter, as explained in  Sect.  1.4.5, 
    are  defined  by  *partial relations*,  from  which  the 
    structures are determined in a very much similar way  as 
    solutions to systems of algebraic equations are  formed; 
    all  the  values  in the  universe  of  variables  which 
    satisfy  the  conditions  expressed  as  the   equations 
    comprise, by definition, the possible solutions. In  the 
    relational    structures,   the    knowledge    (partial 
    statements,   partial   relations)  stored   in   memory 
    constitutes  the universe of variables, from  which  the 
    solutions  must be sought; and the conditions  expressed 
    by  (eventually incomplete) relations, ie, the  "control 
    structure" [9.20] correspond to the equations.

    Contrary  to  the conventional database  machines  which 
    also  have  been  designed  to  handle  such  relational 
    structures, the "connectionist" models are said to  take 
    the relations, or actually their strengths into  account 
    statistically. In so doing, however they only apply  the 
    Euclidean  metric, or the least square loss function  to 
    optimize   the  solution.  This  is  not  a  very   good 
    assumption for natural data.'

    T. Kohonen (1988)
    Ch. 9 Notes on Neural Computing
    In Self-Organisation and Associative Memory
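
The contrast drawn before the Kohonen quotation, between coefficients
solved from simultaneous equations and weights found by feedback, can
be sketched as follows. This is a toy example under my own assumptions:
a single predictor, invented data, and a simple error-correcting update
(the delta rule) standing in for network training. Both routines
minimise the same least square loss; they differ only in how they get
there.

```python
def fit_closed_form(xs, ys):
    """Solve the two normal equations for the constant a and slope b directly."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    b = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    a = (sy - b * sx) / n
    return a, b

def fit_by_feedback(xs, ys, rate=0.05, epochs=2000):
    """Nudge a and b after every case in proportion to the prediction error."""
    a, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            error = y - (a + b * x)   # knowledge of results
            a += rate * error         # strengthen or weaken the constant
            b += rate * error * x     # strengthen or weaken the connection
    return a, b

# Invented data lying exactly on y = 1 + 2x.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

a_direct, b_direct = fit_closed_form(xs, ys)
a_learned, b_learned = fit_by_feedback(xs, ys)
```

In the first routine the answer is deduced in one step from the data;
in the second it emerges only gradually from repeated correction.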
     
Throughout  the  1970s  Nisbett  and colleagues  studied  the  use  of 
probabilistic   heuristics  in  real  world  human  problem   solving, 
primarily in the context of Attribution Theory (H. Kelley 1967, 1972). 
Such  inductive  as opposed to deductive heuristics  of  inference  do 
indeed  seem  to be influenced by training (Nisbett and  Krantz  1983, 
Nisbett et al. 1987). Statistical heuristics are naturally applied in
everyday  reasoning  if  subjects  are trained in  the  Law  of  Large 
Numbers. This is not surprising, since application of such  heuristics 
is an example of response generalisation - which is how  psychologists 
have  traditionally  studied the vicissitudes of  inductive  inference 
within  Learning  Theory.  As Wagner (1981) has pointed  out,  we  are 
perfectly  at liberty to use the language of Attribution Theory as  an 
alternative,  this  exchangeability  of  reference  system  being   an 
instance of Quinean Ontological Relativity, where what matters is  not 
so  much  the  names in argument positions,  or  even  the  predicates 
themselves,  but  the  *relations*  (themselves  at  least   two-place 
predicates) which emerge from such systems.

Under  most natural circumstances, inductive inference  is  irrational 
(cf.  Popper 1936, Kahneman et al. 1982, Dawes, Faust and Meehl  1989, 
Sutherland   1992).  This  is  because  it  is  generally   based   on 
unrepresentative   sampling   (drawing  on  the   'availability'   and 
'representativeness'  heuristics), and this is so simply because  that 
is  how data in a structured culture often naturally presents  itself. 
Research has therefore demonstrated that human inference is  seriously 
at  odds with formal deductive logical reasoning, and the  algorithmic 
implementation  of  those inferential processes by  computers  (Church 
1936, Post 1936, Turing 1936). One of the main points of this paper is 
that  we  generally  turn  to  the  formal  deductive  technology   of 
mathematico-logical method (science) to compensate for the  heuristics 
and  biases which typically characterise natural inductive  inference. 
Where possible, we turn to *relational databases and 4GLs*  (recursive 
function  theory and mathematical logic) to provide  descriptive,  and 
deductively valid pictures of individuals and collectives.

This large and unexpected body of empirical evidence from decision
theory, cognitive experimental social psychology and Learning Theory
began accumulating in the mid to late 1970s (cf. Kahneman, Tversky and 
Slovic 1982, Putnam 1986, Stich 1990), and began to cast serious doubt 
on  the  viability  of  the  'computational  theory'  of  mind  (Fodor 
1975, 1980) which was basic to functionalism (Putnam 1986). That is,
the  substantial body of empirical evidence which  accumulated  within 
Cognitive  Psychology itself suggested that, contrary to the  doctrine 
of  functionalism,  there exists a system  of  independent,  objective 
knowledge,  and reasoning against which we can judge human, and  other 
animal cognitive processing. However, it gradually became  appreciated 
that  the  digital computer is not a good model of  human  information 
processing, at least not unless this is conceived in terms of  'neural 
computing' (also known as 'connectionism' or 'Parallel Distributed
Processing'). The application of formal rules of logic and mathematics
to  the  analysis of behaviour solely within the  language  of  formal 
logic  is the professional business of Applied  Behaviour  Scientists. 
Outside  of the practice of those professional skills,  the  scientist 
himself is as prone to the irrationality of intensional heuristics  as 
are laymen (Wason 1966). Within the domain of formal logic applied  to 
the  analysis of behaviour, the work undertaken by applied  scientists 
is impersonal. The scientists' professional views are dictated by  the 
laws   of   logic  and  mathematics  rather  than   personal   opinion 
(heuristics).

Applied  psychologists,  particularly  those working in  the  area  of 
Criminological Psychology, are therefore faced with a dilemma.  Whilst 
many  of their academic colleagues are *studying* the  heuristics  and 
biases  of  human cognitive processing, the  applied  psychologist  is 
generally called upon to do something quite different, yet is  largely 
prevented from doing so for lack of relational systems to provide  the 
requisite  distributional  data upon which to use  the  technology  of 
algorithmic decision making. In the main, the applied criminological
psychologist, as a behaviour scientist, is called upon to bring about
behaviour  change, rather than to better understand or  explicate  the 
natural  heuristics of cognitive (clinical) judgement. To the  applied 
psychologist,  the  low  correlation between  self-report  and  actual 
behaviour, the low consistency of behaviour across situations, the low 
efficacy  of prediction of behaviours such as 'dangerousness'  on  the 
basis  of clinical judgment, and the fallibility of assessments  based 
on  interviews,  are  all  testament  to  the  now  *well   documented 
unreliability of intensional heuristics (cognitive processes) as  data 
sources, and we have already pointed to why this is so.* Yet
generally psychologists can rely on no other sources, as the existing
Inmate Information Systems are in fact inadequate. Thus, whilst applied
psychologists know from research that they must rely on distributional 
data  to  establish their professional knowledge base, and  that  they 
must base their work with individuals (whether prisoners, governors or 
managers)  on  extensional  analysis of such  knowledge  bases,  *they 
neither  have  the systems available nor the influence  to  have  such 
systems  established,  despite powerful  scientific  evidence  (Dawes, 
Faust  and Meehl 1989) that their professional services in many  areas 
depend  on  the  existence  and use of  such  systems.*  What  applied 
psychologists   have  learned  therefore  is  to  eschew   intensional 
heuristics  and look instead to the formal technology  of  extensional 
analysis  of  observations  of behaviour. The fact  that  training  in 
formal statistical and deductive logic is difficult, particularly  the 
latter, makes this a challenge, since most of the required skills  are 
only  likely  to  be applicable when sitting in front  of  a  computer 
keyboard (Holland et al. 1986). It is particularly challenging in that
the   information   systems   are  generally   inadequate   to   allow 
professionals to do what they are trained to do.

Over  the past five years (1988-1993), a programme has been  developed 
which is explicitly naturalistic in that it seeks to record
inmate/environment   (regime)   interactions.  This  system   is   the 
PROBE/Sentence Management system. It breaks out of solipsism by making 
all  assessments  of  behaviour, and all inmate  targets *RELATIVE  to 
predetermined  requirements of the routines and structured  activities 
defined under function 17 of the annual Governors Contract*. It is  by 
design  a 'formative profiling system' which is 'criterion  reference' 
based.
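
To give a flavour of what a criterion referenced record might look like
in a relational system, here is a minimal sketch. It is not the PROBE
schema: the table, the column names and the rows are all invented for
illustration, and I have used an in-memory SQLite database only because
it is convenient.

```python
import sqlite3

# Hypothetical table: one row per observation of a person in an activity,
# recording whether a predetermined requirement (criterion) was met.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE observation ("
    " person TEXT, activity TEXT, criterion TEXT, met INTEGER)"
)
conn.executemany(
    "INSERT INTO observation VALUES (?, ?, ?, ?)",
    [
        ("A", "workshop", "attended on time", 1),
        ("A", "workshop", "completed task", 0),
        ("A", "education", "attended on time", 1),
        ("B", "workshop", "attended on time", 1),
        ("B", "workshop", "completed task", 1),
    ],
)

# The profile is a count of criteria met out of criteria observed,
# per person and activity; no impression or rating enters into it.
profile = conn.execute(
    "SELECT person, activity, SUM(met), COUNT(*)"
    " FROM observation GROUP BY person, activity"
    " ORDER BY person, activity"
).fetchall()
```

The point of the sketch is only that each entry is relative to a stated
requirement of the activity, so the resulting profile is a matter of
counting rather than of judgement.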

The alternative, intensional heuristics, which are the mark of natural 
human  judgement  (hence  our rich folk  psychological  vocabulary  of 
metaphor)  have  to  be  contrasted  with  extensional  analysis   and 
judgement  using technology based on the deductive algorithms  of  the 
First Order Predicate Calculus (Relational Database Technology).  This 
is  not  only  coextensive with the 'scope and  language  of  science' 
(Quine  1954) but is also, to the best of our knowledge from  research 
in  Cognitive  Psychology,  an effective compensatory  system  to  the 
biases of natural intensional, inductive heuristics (Agnoli and Krantz 
1989). Whilst a considerable amount of evidence suggests that training 
in formal logic and statistics is not in itself sufficient to suppress 
usage  of  intensional  heuristics  in any  enduring  sense,  ie  that 
generalisation  to  extra-training  contexts  is  limited,  there   is 
evidence  that judgement can be rendered more rational by training  in 
the use of extensional technology. Kahneman and Tversky (1983)
demonstrated that subjects generally fail to apply the extensional
conjunction rule of probability, namely that a conjunction can never
be more probable than either of its elements, and that this failure
too is generally resistant to counter-training. This is another
example, this time within probability theory (a deductive system), of
the failure of extensional rules in applied contexts. Careful use of
I.T. and principles of deductive inference (e.g. semantic tableaux,
Herbrand models, and Resolution methods) promises, within the limits
imposed by Godel's
Theorem,  to  keep us on track if we restrict our  technology  to  the 
extensional.
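
The conjunction rule mentioned above is easy to check extensionally,
that is, by counting cases. The following sketch uses a small invented
set of records, each marked true or false on two predicates A and B;
the data and the helper name are assumptions for illustration only.

```python
from fractions import Fraction

# Invented cases: (has property A, has property B).
cases = [
    (True, True), (True, False), (False, True), (False, False),
    (True, False), (False, False), (False, True), (False, False),
]

def probability(records, predicate):
    """Relative frequency of the records satisfying the predicate."""
    hits = sum(1 for record in records if predicate(record))
    return Fraction(hits, len(records))

p_a = probability(cases, lambda c: c[0])
p_b = probability(cases, lambda c: c[1])
p_a_and_b = probability(cases, lambda c: c[0] and c[1])

# Every case counted for 'A and B' is also counted for A and for B,
# so the conjunction can never come out more probable than either.
rule_holds = p_a_and_b <= min(p_a, p_b)
```

Counted this way the rule cannot fail, whatever the data; it is only
intuitive judgement of 'representative' descriptions that violates it.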

Before leaving the concept of Methodological Solipsism, here's how one 
commentator  reviewed  the  situation in the context of  the  work  of 
perhaps psychology's best known radical behaviourist:

    'Meanings Are Not 'In the Head'
        
    Skinner has developed a case for this claim in the book, 
    VERBAL   BEHAVIOR  (1957),  and  elsewhere,   where   he 
    maintains that meaning, rather than being a property  of 
    an utterance itself, is to be found in the nature of the 
    relationship between occurrence of the utterance and its 
    context. It is important enough to put in his own words.
        
    ..meaning is not properly regarded as a property  either 
    of  a  response  or  a  situation  but  rather  of   the 
    contingencies  responsible  for both the  topography  of 
    behavior  and the control exerted by stimuli. To take  a 
    primitive example, if one rat presses a lever to  obtain 
    food  when hungry while another does so to obtain  water 
    when thirsty, the topographies of their behaviors may be 
    indistinguishable,  but  they may be said to  differ  in 
    meaning: to one rat pressing the lever 'means food';  to 
    the other it 'means' water. But these are aspects of the  
    contingencies  which  have brought  behavior  under  the 
    control of the current occasion. Similarly, if a rat  is 
    reinforced  with food when it presses the lever  in  the 
    presence  of  a flashing light but with water  when  the 
    light is steady, then it could be said that the flashing 
    light  means food and the steady light means water,  but 
    again  these are references not to some property of  the 
    light but to the contingencies of which the lights  have 
    been parts.
          
    The  same  point  may  be  made,  but  with  many   more 
    implications,  in  speaking  of the  meaning  of  verbal 
    behavior.  The  over-all  function of  the  behavior  is 
    crucial.  In  an  archetypal pattern  a  speaker  is  in 
    contact with a situation to which a listener is disposed 
    to respond but with which he is not in contact. A verbal 
    response  on the part of the speaker makes  it  possible 
    for the listener to respond appropriately. For  example, 
    let  us suppose that a person has an appointment,  which 
    he  will keep by consulting a clock or a watch. If  none 
    is  available, he may ask someone to tell him the  time, 
    and the response permits him to respond effectively...
          
    *The meaning of a response for the speaker* includes the 
    stimulus  which controls it (in the example  above,  the 
    setting  on the face of a clock or watch)  and  possibly 
    aversive aspects of the question, from which a  response 
    brings release. *The meaning for the listener* is  close 
    to  the  meaning the clock face would have  if  it  were 
    visible  to him, but it also includes the  contingencies 
    involving the appointment, which make a response to  the 
    clock  face  or the verbal response probable at  such  a 
    time..
          
    One  of  the unfortunate implications  of  communication 
    theory is that the meanings for speaker and listener are 
    the same, that something is made common to both of them, 
    that  the speaker conveys an idea or meaning,  transmits 
    information,  or  imparts knowledge, as  if  his  mental 
    possessions  then become the mental possessions  of  the 
    listener.  There are no meanings which are the  same  in 
    the  speaker and listener. Meanings are not  independent 
    entities...

    Skinner, 1974, pp.90-2
    
    One does not have to take Skinner's word alone, however, 
    for   much current philosophical work also leads to  the 
    conclusion that meanings are not in the head. The  issue 
    extends  beyond  the problem of meaning construed  as  a 
    linguistic property to the problem of intensionality and 
    the   interpretation  of  mentality  itself.  While  the 
    reasoning  behind  this  claim is  varied  and  complex, 
    perhaps an analogy with machine functions can be helpful 
    here.  A computer is a perfect example of a system  that 
    performs    meaningless   syntactic   operations.    The 
    electrical  configuration  of  the  addressable   memory 
    locations  is just formal structures,  without  semantic 
    significance  to  the computer either as numbers  or  as 
    representations  of  numbers. All the computer  does  is 
    change  states automatically as electrical current  runs 
    through its circuits. Despite the pure formality of  its 
    operations,  however,  the  computer  (if  designed  and 
    programmed  correctly) will be  truth-preserving  across 
    computations:  ask  the thing to add 2 + 2 and  it  will 
    give  you a 4 every time. But the numerical meanings  we 
    attach  to the inputs and outputs do not enter into  and 
    emanate  from the computer itself. Rather,  they  remain 
    outside  the system, in the interpretations that  we  as 
    computer  users assign to the inputs and outputs of  the 
    machine's  operations.  Now,  if one is  inclined  to  a 
    computational  view  of mind, then by analogy  much  the 
    same  thing holds for the organic computational  systems 
    we call our brains. Meanings are not in them, but  exist 
    in  the  mode through which they  in  their  functioning 
    stand to the world.
        
    Ironies  begin  to  mount here.  Brentano's  claim  that 
    'Intentionality' is the mark of the mental is now widely 
    accepted.  Intentionality in its technical sense has  to 
    do  with  the meaningfulness, the  semantic  context  of 
    mental  states.  But  the  argument  is  now  made  that 
    cognitive  operations and their objects are  formal  and 
    syntactic  only,  and do not  themselves  have  semantic 
    context (e.g. see Putnam, 1975; Fodor, 1980; and  Stich, 
    1983,  for a range of contributions to this  viewpoint). 
    Semantic   issues   do  not  concern   internal   mental 
    mechanisms  but  concern the mode  of  relation  between 
    individuals and their worlds. Such issues are not really 
    psychological  at all, it is claimed, and are  relegated 
    to other fields of inquiry for whatever elucidation  can 
    be  brought  to  them. For example, while  belief  is  a 
    canonical example of a mental, intentional state,  Stich 
    says,  'believing  that p is an amalgam  of  historical, 
    contextual,     ideological,    and    perhaps     other 
    considerations'  (1983, p.170). The net result of  these 
    recent moves in cognitive psychology and the  philosophy 
    of mind seems to be that the essence of mentality -  its 
    meaningfulness - is in the process of being disowned  by 
    modern mentalism! But Stich's ashbin of intentionality - 
    historical   and contextual considerations - is  exactly 
    what  behaviorism  seeks  to address.  Can  it  be  that 
    BEHAVIORISM  will  be the instrument  called  for  final 
    explication  of Brentano's thesis of the mental?   One's 
    head spins to think it.

    R. Schnaitter (1987)
    Knowledge as Action: The Epistemology of Radical Behaviorism
    In B. F. Skinner Consensus and Controversy
    Eds. S. Modgil and C. Modgil

The reawakening of interest in connectionism in the early to mid 1980s 
can  indeed  be  seen  as a vindication of  the  basic  principles  of 
behaviourism. What is psychological may well be impenetrable, for  any 
serious scientific purposes, not because it is in any way a  different 
kind  of 'stuff', but because structurally it amounts to no more  than 
an n-dimensional weight space, idiosyncratic and context specific,  to 
each and every one of us. 

-- 
David Longley
