www.GetXFactor.com

A Hierarchical Architecture for Software Agents-Part 2
   Science and Technology news... Forum Index -> Cognitive Science Forum  
Author: Guest
Posted: Fri Nov 14, 2003 11:05 pm
Post subject: A Hierarchical Architecture for Software Agents-Part 2

Each classifier in each layer is credited with a utility:
g = c1*(sum of du) + c2*(sum of (vector IN(t) . vector INi(t+1)))
This is a weighted sum of the rewards, du, collected over time, plus the
sequence's success in predicting how the world will evolve next. (c1 and c2
are constants; "." is the vector dot product.)
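A minimal Python sketch of the credit formula above may make the two terms
easier to see. The function name, the default values of c1 and c2, and the
reading of the dot-product term as "stored pattern for the next step versus
what was actually observed" are assumptions made for illustration, not
details given in the post.

```python
def dot(a, b):
    """Vector dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def classifier_utility(du_history, predicted, observed, c1=1.0, c2=0.1):
    """g = c1*(sum of du) + c2*(sum of predicted . observed).

    du_history: rewards (du) collected while the classifier was active.
    predicted / observed: parallel lists of vectors, one pair per time step
    (assumed pairing: prediction made at t, input seen at t+1).
    """
    reward_term = sum(du_history)
    prediction_term = sum(dot(p, o) for p, o in zip(predicted, observed))
    return c1 * reward_term + c2 * prediction_term


# Toy usage: three rewards, two prediction/observation pairs.
g = classifier_utility(
    du_history=[0.5, -0.25, 1.0],
    predicted=[[1, 0], [0, 1]],
    observed=[[1, 0], [1, 0]],
)
```

Here the first prediction matches the observation and the second does not,
so only the first pair adds to the prediction term.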
Each time a classifier is active, its output can be interrupted and the
resulting change in estimated utility (du) recorded at the output of the
value module.
Value Module:
The final element in the hierarchy combines all of the value measures
passed up to it from the sequence classifiers below and produces a single
scalar estimated utility. This subsystem can be implemented as a case-based
reasoner, a Bayesian network, or a neural network. The value module is
retrained after each experimental run, the utility itself being computed by
hard-wired feature detectors.
Since temporal variations of the values may influence utility, we have
experimented with a value module that is itself a "flat file" case-based
reasoning agent, similar to the Asa agents described in our earlier papers.
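As a rough illustration of a case-based value module, the sketch below
stores (value measures, utility) pairs and estimates utility from the
nearest stored cases. The class name, the choice of k nearest neighbours
with inverse-distance weighting, and the retrain method are all assumptions;
the post does not say how the Asa case base computes its estimate.

```python
import math


class CaseBasedValueModule:
    """Hypothetical nearest-neighbour value module (illustration only)."""

    def __init__(self, k=2):
        self.k = k
        self.cases = []  # list of (value_measures, utility) pairs

    def retrain(self, run_cases):
        """Add the cases collected during one experimental run."""
        self.cases.extend(run_cases)

    def estimate(self, measures):
        """Return a single scalar utility estimate for a measure vector."""
        if not self.cases:
            return 0.0
        nearest = sorted(
            self.cases, key=lambda case: math.dist(case[0], measures)
        )[: self.k]
        # Inverse-distance weights; the small constant avoids division by zero.
        weights = [1.0 / (1e-9 + math.dist(m, measures)) for m, _ in nearest]
        total = sum(w * u for w, (_, u) in zip(weights, nearest))
        return total / sum(weights)


vm = CaseBasedValueModule(k=1)
vm.retrain([((0.0, 0.0), -1.0), ((1.0, 1.0), -0.1)])
estimate = vm.estimate((0.9, 0.9))
```

A Bayesian network or neural network could sit behind the same two methods;
the case-based form is used here only because it needs no extra libraries.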
Feature detectors are hard-wired to record such things as collisions,
damage, life span, etc. As classifiers are learned, their occurrence
(activation) can be compared statistically with the established "value
measures" or with the utility itself. If the correlation is significant, the
newly defined category (classifier) can be added to the "value measure" set
of detectors.
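The promotion step described above could be sketched as follows. Using a
Pearson correlation with a fixed cutoff as the test of "significance" is an
assumption, as are the function names and the 0.8 threshold; a proper
significance test would also account for sample size.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either series is flat."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def promote_if_correlated(value_measures, name, activations, utilities,
                          threshold=0.8):
    """Add `name` to the value-measure set if |r| reaches the threshold."""
    r = pearson(activations, utilities)
    if abs(r) >= threshold and name not in value_measures:
        value_measures.append(name)
    return r


# Toy data: a learned "near_wall" classifier fires on the low-utility steps.
measures = ["collision"]
r = promote_if_correlated(
    measures, "near_wall",
    activations=[0, 1, 0, 1],
    utilities=[-0.1, -0.9, -0.2, -0.8],
)
```

A strong negative correlation counts as well as a positive one, since a
category that reliably signals low utility is just as informative.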
Outputs:
Prospective actions are treated the same as (occasional) predictions.
At each time step, the action (or inaction) chosen is the one that maximizes
the predicted ultimate utility given the recent context (inputs).
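In code, this selection rule is a plain argmax over candidate actions. The
sketch below assumes a predict_utility(context, action) callable standing in
for the hierarchy plus value module, and uses None to represent inaction;
the lookup table is invented toy data.

```python
def choose_action(candidate_actions, context, predict_utility):
    """Return the candidate (None means inaction) with the highest prediction."""
    return max(candidate_actions, key=lambda a: predict_utility(context, a))


def toy_predict_utility(context, action):
    """Hypothetical stand-in for the value module's prediction."""
    table = {
        ("wall_ahead", "forward"): -1.0,
        ("wall_ahead", "turn"): -0.2,
        ("wall_ahead", None): -0.5,
    }
    return table.get((context, action), 0.0)


best = choose_action(["forward", "turn", None], "wall_ahead",
                     toy_predict_utility)
```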
Learning Components:
For each categorizer and sequence classifier, if no case match is close
enough, a new prototypical case is defined (recorded). Each case that is
close enough to the current input, vector IN, is modified slightly so as to
match vector IN more nearly. (formula given previously)
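The update formula itself is not repeated in this part, so the sketch below
substitutes a common choice: each matching prototype moves a small fraction
of the way toward the input. The Euclidean match test, the match_radius and
rate parameters, and the function name are all assumptions.

```python
import math


def learn(prototypes, vec, match_radius=0.5, rate=0.1):
    """Adjust every prototype near `vec`, or record `vec` as a new prototype.

    Assumed update rule: p <- p + rate * (IN - p).
    """
    matched = False
    for proto in prototypes:
        if math.dist(proto, vec) <= match_radius:
            for i, x in enumerate(vec):
                proto[i] += rate * (x - proto[i])
            matched = True
    if not matched:
        prototypes.append(list(vec))
    return prototypes


protos = []
learn(protos, [0.0, 0.0])   # nothing matches yet: recorded as a new case
learn(protos, [0.2, 0.0])   # close enough: first prototype shifts slightly
learn(protos, [5.0, 5.0])   # far from everything: second prototype recorded
```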
At each level in the hierarchy the case base consists of INi-gi data
"pairs" (where gi is the cumulative utility gain recorded while INi was
occurring), and extrapolation is possible by varying INi in an attempt to
increase g. A variety of extrapolation schemes have been tried
successfully:
i. One or a small number of attributes (components of the vector INi) can
be perturbed, seeking an increased g as estimated by the value module.
ii. For some cases having similar inputs, vector IN2 ~ vector IN1, if
g2 > g1 it is possible to compose a synthetic case by recombining individual
components of vector IN1 and vector IN2.
iii. For a case IN1,g1 we can find another case IN2 with
vector IN1 ~ vector IN2 and generate:
vector IN3 = vector IN2 - C*(vector IN1 - vector IN2)
so as to increase g for the synthetic case IN3. C is either a constant,
typically < 1, or is scaled as C*|g2-g1| or C*|g2-g1|*(vector IN1 . vector IN2).
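The three extrapolation schemes can be sketched in a few lines each. The
function names, the step size, the random choice of which attribute to
perturb, and the estimate_g callable (standing in for the value module) are
assumptions. For scheme iii the sketch takes g2 > g1, as in scheme ii, so
that stepping past IN2 away from IN1 is the direction expected to raise g.

```python
import random


def perturb(case_in, estimate_g, step=0.1, rng=random):
    """Scheme i: nudge one attribute; keep the change only if estimated g rises."""
    i = rng.randrange(len(case_in))
    trial = list(case_in)
    trial[i] += rng.choice([-step, step])
    return trial if estimate_g(trial) > estimate_g(case_in) else list(case_in)


def recombine(in1, in2, rng=random):
    """Scheme ii: synthetic case taking each component from IN1 or IN2."""
    return [rng.choice(pair) for pair in zip(in1, in2)]


def extrapolate(in1, in2, c=0.5):
    """Scheme iii: IN3 = IN2 - C*(IN1 - IN2)."""
    return [b - c * (a - b) for a, b in zip(in1, in2)]


# Toy usage with a made-up utility estimate that peaks at (1, 1).
def toy_estimate_g(vec):
    return -sum((x - 1.0) ** 2 for x in vec)


rng = random.Random(0)
start = [0.5, 0.5]
nudged = perturb(start, toy_estimate_g, rng=rng)
mixed = recombine([1.0, 2.0], [2.0, 2.5], rng=rng)
in3 = extrapolate([1.0, 2.0], [2.0, 2.0], c=0.5)
```

To use the scaled forms of C, pass c=C*abs(g2 - g1) or multiply that again
by the dot product of IN1 and IN2.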
We are most interested in those cases having the largest values of
(g1-g2)*(vector IN1 . vector IN2), in order to find the difference
vector IN1 - vector IN2 that is assumed to be responsible for g1-g2.
If computational resources are limited, extrapolation can be focused on
cases having the largest g values, those with the most commonly recurring
IN, or cases with the largest |g1-g2|*(vector IN1 . vector IN2).
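One way to focus limited resources is to score every pair of cases and work
on the highest-scoring pairs first. The sketch below ranks pairs by
|g1-g2|*(IN1 . IN2); the function names and the exhaustive pairwise loop are
assumptions, and a large case base would need a cheaper candidate search.

```python
from itertools import combinations


def dot(a, b):
    """Vector dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def rank_pairs(cases, top=1):
    """Return the `top` pairs of (IN, g) cases with the largest score."""
    scored = []
    for (in1, g1), (in2, g2) in combinations(cases, 2):
        score = abs(g1 - g2) * dot(in1, in2)
        scored.append((score, (in1, g1), (in2, g2)))
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:top]


cases = [
    ([1.0, 0.0], 0.0),
    ([1.0, 0.1], 1.0),   # similar to the first case but with higher g
    ([0.0, 1.0], 5.0),   # high g but nearly orthogonal to the others
]
best_score, case_a, case_b = rank_pairs(cases, top=1)[0]
```

The third case has the largest g, yet it scores low against the others
because its input barely overlaps theirs, which is the behaviour the
dot-product factor is meant to give.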
Statistics are kept on the degree of match to each prototype
(a running total and a maximum). If these statistics fall below established
thresholds, that particular prototype is deleted. Thresholds for class
membership and deletion rates can be adjusted by heuristics in order to
optimize speed and memory resource usage.
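The bookkeeping for deletion might look like the sketch below. The class
and function names, the threshold values, and the rule that a prototype is
dropped when either statistic is under its threshold are assumptions; the
post does not say how the two statistics are combined or over what window
the running total is taken.

```python
class PrototypeStats:
    """Running total and maximum of match scores for one prototype."""

    def __init__(self):
        self.total = 0.0
        self.maximum = 0.0

    def record(self, match):
        self.total += match
        self.maximum = max(self.maximum, match)


def prune(stats_by_name, min_total, min_max):
    """Keep only prototypes whose statistics meet both thresholds."""
    return {
        name: stats
        for name, stats in stats_by_name.items()
        if stats.total >= min_total and stats.maximum >= min_max
    }


stats = {"a": PrototypeStats(), "b": PrototypeStats()}
for match in [0.9, 0.8, 0.95]:
    stats["a"].record(match)    # matched often and well
stats["b"].record(0.1)          # matched once, poorly
kept = prune(stats, min_total=1.0, min_max=0.5)
```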
Experiment:
We have employed Asa in a simulated robotic environment. In a world
populated by multiple generations of competing robots, a biologically
inspired utility might be:
u = (N-1)/L
where N is the number of robot offspring produced (disk copies) and L is
the bot's lifespan. In our single robot experiment we use:
u = -1/L
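The two utilities are simple enough to state directly in Python. The
function names are made up for illustration. Note that the single-robot form
is just the general form with N = 0, so a longer lifespan gives a utility
closer to zero, that is, a higher one.

```python
def utility_population(offspring, lifespan):
    """u = (N-1)/L for a robot with N offspring and lifespan L."""
    return (offspring - 1) / lifespan


def utility_single(lifespan):
    """u = -1/L, the single-robot case (no offspring)."""
    return -1.0 / lifespan


short_run = utility_single(50)
long_run = utility_single(200)
```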
The value module is retrained after each run. It accepts as inputs
"damage", integrated energy input to the bot, and "foresight", and predicts
u as output. Damage, energy input, foresight, and u are each extracted
from raw inputs by hand-coded detectors. "Damage" is signaled when an
action is taken at time t but not detected in IN(t+1).
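The damage detector described above reduces to a single membership test.
The sketch below assumes that IN(t+1) can be viewed as a set of detected
feature labels and that an action leaves a recognisable label in the next
input; both the representation and the label names are assumptions.

```python
def damage_signal(action_taken, next_input):
    """True when an action was taken at t but is not detected in IN(t+1).

    action_taken: label of the commanded action, or None for inaction.
    next_input: set of feature labels detected at t+1 (assumed encoding).
    """
    if action_taken is None:
        return False
    return action_taken not in next_input


ok = damage_signal("left_wheel_on", {"left_wheel_on", "bump"})
broken = damage_signal("left_wheel_on", {"bump"})
```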


