Probability Fuzzy Cognitive Map for Decision-making in Soccer Robotics

Hua-Qing Min1,2, Jia-Xing Hui1, Yan-Sheng Lu2, Jia-zhi Jiang1
1 College of Computer Science [...]

[...] the relation-weight-measurement matrix: \(W = [\omega_{ij}]\); the backward influencing value on nodes: \(Y = [y_i]\).

The output value of each node, that is, the new state value, is calculated with a decision output function \(F_i\) (a function of all the influencing values that the preceding nodes exert on a given node), as given by Formula (7) [6].

\[ V_{c_i} = F_i(y_{1i}, y_{2i}, \ldots, y_{11,i}) = F_i\big(\omega_{1i}(V_{c_1}), \omega_{2i}(V_{c_2}), \ldots, \omega_{11,i}(V_{c_{11}})\big) \qquad (7) \]

\(V_{c_i}\) (\(i = 1, 2, \ldots, 11\)) on the left-hand side are the output values of
nodes, that is, the new state values of each node; \(\omega_{1i}, \omega_{2i}, \ldots, \omega_{11,i}\) denote the relation weight functions exerted on node \(C_i\) by the other nodes; \(F_i\) is the decision output function. The iterative updating of the node state values is also the process by which a limited set of input states opens a new path through the virtual space of PFCMSRRM.

4.2 Determination of the Relation-Weight-Measurement Matrix

Learning the fuzzy cognitive map values for robot decision-making is the process of training the cognitive map on a certain number of samples, so as to deduce the weight measurements among conceptual nodes needed by the decision and reasoning process. The learning algorithm adjusts the cognitive map values to minimize an error measure on the training samples. The learning process can be formalized as an optimization search problem in weight space. The classical error measure is the sum of squared
errors. The squared error for a single training sample whose input is x and whose real output is y is given by Formula (8).

\[ E = \frac{1}{2} Err^2 = \frac{1}{2}\big(y - h_w(x)\big)^2 \qquad (8) \]

\(h_w(x)\) represents the actual output of the conceptual nodes on the sample, and y represents the expected result.

4.3 Reasoning Power Learning Arithmetic based on Descent Gradient Methods

We use the gradient descent method to reduce the squared error by calculating the partial derivative of E with respect to each weight, as given by Formula (9).

\[ \frac{\partial E}{\partial \omega_i} = Err \times \frac{\partial Err}{\partial \omega_i} = Err \times \frac{\partial}{\partial \omega_i}\Big(y - F\Big(\sum_{j=0}^{n} \omega_j x_j\Big)\Big) = -Err \times F'(in) \times x_i \qquad (9) \]

\(F'\) is the derivative of the decision output function, and \(in = \sum_{j=0}^{n} \omega_j x_j\) is its weighted input.

In the gradient descent algorithm, in order to reduce E, we update the weight values according to Formula (10).

\[ \omega_i \leftarrow \omega_i + \alpha \times Err \times F'(in) \times x_i \qquad (10) \]

\(\alpha\) represents the learning rate. Intuitively, if the error \(Err = y - h_w(x)\) is positive, then the corresponding network output is too small, so we increase the weights for positive inputs and reduce the weights for negative inputs. If the error is negative, we adjust in the opposite direction.

4.4 The Weight Measurement Learning Arithmetic based on Simulated Annealing

Because the cost of discovering a local minimum depends on the search landscape of the problem, we can adjust the cooling temperature and the initial temperature according to the background of the actual problem. The simplest way to obtain the initial temperature is to carry out a random search at the beginning, estimate the rough difference between the sample mean and the objective function, and then determine the concrete scheme.

Proceedings of the IEEE/WIC/ACM International Conference on Intelligent Agent Technology (IAT'06) 0-7695-2748-5/06 $20.00 2006

The weight measurement learning algorithm of PFCMSRRM based on simulated annealing is as follows:

Initial temperature \(T = T_0\), \(t = 0\);
The actual output of the node values \(V_{c_i}(t)\) (\(i = 1, 2, \ldots, n\));
The anticipated output of the node values \(V_{C_i}(t)\) (\(i = 1, 2, \ldots, n\));
The deviation between the actual output and the anticipated output:
\[ diff(t) = \frac{1}{2n} \sum_{i=1}^{n} \big(V_{C_i}(t) - V_{c_i}(t)\big)^2; \]
WHILE (the number of iterations is less than MAXTIMES AND diff is not less than the preset threshold)
    Choose an \(\omega_{ij}(t)\) at random and draw an initial value for it at random from \([-1, 1]\); \(\omega_{ij}(t) = 0\); \(\omega_{ij}(t+1) = 0\);
    The cognitive map performs the matrix operation to obtain the sample output values of the node states \(V_{c_i}(t+1)\);
    IF (diff(t+1) [...] diff(t))
    IF (\(\exp\big(-\frac{diff(t+1) - diff(t)}{T}\big)\) [...]

[...] 0.8, the system will switch to another state when a [...] 0.8. Here 0.8 is the experts' empirical
value, which is very likely to differ somewhat from the real value. Therefore, decision mistakes are highly likely when a is near 0.8 in the real world. We call the values around 0.8 the critical values. As the validation scope grows, so does the proportion of critical values in the validation data, which lowers the accuracy rate of the finite state machine method. PFCMSRRM overcomes this weakness. The result from PFCMSRRM is continuous, and the state value of each node is continuous, thus avoiding errors caused by the critical value problem. Therefore, the decision-making accuracy rate of the cognitive map model is relatively more stable.

6.2 Comparison of Decision-making Time

Table 1 shows the time needed by the robot system to make 1000 decisions using the two methods. The finite state machine covers 36 system states and has the same