Biological metaphors are deeply rooted in the study of self-organized behaviors
in social insects and other natural biological interaction
processes (Garnier et al., 2007). An emerging
biological metaphor that adopts the characteristics of the immune system is
the Artificial Immune System (AIS). AIS exhibits properties of the human
immune system for performing complicated tasks, ranging from computer network
security (Harmer et al., 2002), scheduling (Ge
et al., 2008), data mining (Freitas and Timmis,
2007), autonomous robotics (Ishiguro et al.,
1995; Whitbrook et al., 2007, 2010) and
medical diagnostic systems (Polat and Gunes, 2007) to
multi-objective optimization (Chen et al., 2010;
Wong et al., 2009), among others.
The highly distributed and adaptive properties of the AIS are adopted here to develop
a control model able to organize, coordinate and schedule
a group of agents in a closed scenario, such as garbage collection, moving supplies
through factories and mail delivery. Hence, an intelligent multi-agent system, self-organizing
and self-learning, that is robust enough to achieve goals independently can be
derived. Lau et al. (2009) designed a cooperative
control model for multi-agent-based material handling systems from the perspective
of the clonal characteristics of AIS and simulated it in Matlab. Ishiguro
et al. (1995) performed dynamic behavior arbitration of autonomous mobile
robots based on the network characteristics of AIS; the results were optimized using
a genetic algorithm (GA), although the majority of its action selections were built on fuzzy logic.
Based on the network characteristics of AIS, Jerne and Cocteau (1984)
proposed a remarkable hypothesis: the idiotypic network hypothesis. Building on Jerne's
immune network theory, a more sophisticated control model, the Immune Agent Network
(IAN) control model, is proposed in this study. The proposed control model addresses
how individual agents with unique behaviors or talents can be exploited through
communication and cooperation with each other in achieving goals. Agents are
able to determine various kinds of responses by perceiving the changing environment
and interact mutually through recognition, stimulation and suppression.
Biological immune system: The natural biological immune system is a
distributed novel-pattern detection system with several functional components
positioned at strategic locations throughout the body. The immune system regulates
the defense mechanism of the body by means of innate and acquired immune responses.
Innate immunity is inborn and unchanging. It provides resistance to a variety
of antigens during their first exposure to the human body. Innate immunity therefore
operates non-specifically during the early phase of an immune response. This
general defense mechanism is known as the primary immune response, which is slower
and less protective. In addition to providing the early defense against infectious
agents, innate immunity enhances acquired immunity against them, which
is known as the secondary immune response (Lau et al.,
2009). On recurrence of the same antigens, a much faster and stronger
secondary immune response results. The ability of adaptive immunity to mount
more rapid and effective responses to repeat encounters with the same antigen
is achieved by the mechanism of immunological memory, whereby immune cells proliferate
and differentiate into memory cells during clonal expansion (Cadavid).
Jerne's immune network theory: On the basis of the network characteristics of AIS, Jerne proposed the famous idiotypic immune network theory, which makes it possible to describe the immune system from a mathematical perspective and helped move immunology from a purely experimental discipline toward a quantitative one. In immune network theory, no cell exists in an isolated state. Cells interact through recognition, stimulation and suppression, and homeostasis is thus achieved via a chemical dynamic network formed by the interaction of a number of cell types. Invasion by pathogens triggers a perturbation of this homeostasis, which results in the classical immune response.
For convenience in the following explanation, we introduce
several terms from immunology. The portion of the antigen recognized by
the antibody is called the epitope (antigen determinant) and the portion of the antibody
that recognizes the corresponding antigen determinant is called the paratope. Recent
studies in immunology have clarified that each type of antibody also has its
own specific antigen determinant, called the idiotope (Ishiguro et
al., 1995). The simplified working procedure of the immune network is schematically
illustrated in Fig. 1. In essence, the immune system is controlled
by the action of a large number of regulatory and effector molecules. Immune cells carry
various cell-surface receptors and soluble molecules, such as interleukins, that
can transmit signals between them to eliminate foreign antigens. An antibody
(mainly generated by a B cell) has a paratope and an idiotope on its surface, which
are used for identifying antigens and for self-recognition. When antigens invade
the body, the equilibrium state of the immune system is perturbed. The epitope of the
antigen is recognized by the paratope (P2) of cell B2, so the antibody generated
by cell B2 is stimulated; at the same time, the idiotope (Id2) of the antibody generated
by cell B2 is recognized by the paratope (P1) of the antibody generated by cell
B1, so the antibody generated by cell B2 is suppressed. On the other hand, antibody
3 generated by cell B3 stimulates antibody 2, since the idiotope Id3 of antibody
3 acts as an antigen viewed from cell B2.
|| A general structure of immune network
In this way, the chains of stimulation and suppression among antibodies form
a large-scale loop that works as a self/non-self recognizer. Again,
the heart of Jerne's idea is that self-nonself recognition in the immune
system is carried out at the system level. Based on the immune network theory
proposed by Jerne and Cocteau (1984), Farmer
et al. (1986) gave a general group of equations that can be used
to calculate antibody concentrations.
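The equation group itself is not reproduced in this text. For reference, the antibody-concentration dynamic of Farmer et al. (1986) is commonly quoted in the AIS literature in the following form (the notation here is the standard one from that literature, not necessarily the exact symbols used in this paper):

```latex
\frac{dx_i}{dt} = c\left[\sum_{j=1}^{N} m_{ji}\,x_i x_j
  \;-\; k_1 \sum_{j=1}^{N} m_{ij}\,x_i x_j
  \;+\; \sum_{j=1}^{n} m_{ji}\,x_i y_j\right] - k_2\,x_i
```

where x_i is the concentration of antibody i, y_j the concentration of antigen j, m the matching specificities, c a rate constant, k_1 the ratio of suppression to stimulation and k_2 the natural death rate.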
AIS-BASED COOPERATIVE CONTROL MODEL
IAN-based control framework: In Jerne's immune network theory, the
antibody produced against a foreign antigen elicits an anti-idiotypic antibody that
acts to control further production of that antibody. The immune system
is therefore kept in balance in the absence of antigens, and the return
to equilibrium after a perturbation constitutes an immune response. This balancing mechanism
leads to an important concept: automatic control of antibody concentration
through stimulation or suppression of an immune response. Based on Jerne's theory,
a control framework that can be used to design multi-agent systems is proposed.
Figure 2 schematically shows the immune agent network system
(IAN). As shown in Fig. 2, each agent contains a set of fundamental
capabilities in the default stage. The basic actions an agent performs include
exploring the surrounding environment and communicating with other agents. When
an agent detects an antigen, i.e., a specified task, it takes action and selects
the corresponding antibody according to the binding affinity. Binding affinity
is determined by the interaction between antibodies and the environment. According
to the complexity of the task, the system will choose one or several agents to
accomplish it, solve problems and keep homeostasis; meanwhile, useful information
is recorded into the memory library via communication and used for the secondary
response that occurs upon second and subsequent exposures to an antigen.
||Structure of the proposed immune multi-agent network system
The secondary response is characterized by more rapid kinetics and greater
magnitude relative to the primary response that occurs upon the first exposure.
Control schedule: The control framework proposed in this paper adopts the biological theory of the human immune system for manipulating agents' internal behaviors. Agents are able to provide different responses based on their perception of the environment. There is no centralized control or initial plan that dictates which tasks the agents should complete first. Agents use the measure of binding affinity to recognize and approach tasks. The binding affinity g is quantified by the distance between an agent and a specified task. When a task is recognized by an agent, the agent manipulates its capabilities to tackle the task. This manipulation of capability allows agents to perform appropriate responses and actions to complete the task with maximum efficiency and in minimal time. Binding affinity is formally defined as follows:
where the star denotes multiplication, g_ik is the binding affinity used by agent i to recognize task k, B_i is agent i's capability to deal with a given task and d_ij is the Euclidean distance measured between a task and an agent, which can be computed as d_ij = sqrt((x_i - x_j)^2 + (y_i - y_j)^2), where (x_i, y_i) and (x_j, y_j) are the Cartesian coordinate pairs of agent i and task j.
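Since Eq. 1 and 2 are not reproduced in this text, the following sketch shows one plausible reading: Eq. 2 is the standard Euclidean distance, while the functional form assumed here for Eq. 1, g_ik = B_i / (1 + d_ik), is an illustration only and not the paper's exact formula.

```python
import math

def euclidean_distance(agent_pos, task_pos):
    """Eq. 2: straight-line distance between an agent and a task."""
    (xi, yi), (xj, yj) = agent_pos, task_pos
    return math.hypot(xi - xj, yi - yj)

def binding_affinity(capability, agent_pos, task_pos):
    """Assumed form of Eq. 1: affinity grows with the agent's capability B_i
    and decays with distance d_ik (the +1 avoids division by zero)."""
    d = euclidean_distance(agent_pos, task_pos)
    return capability / (1.0 + d)
```

Under this reading, a nearer and more capable agent scores a higher affinity for a task, which matches the selection behavior described above.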
In collaboration mode, when an agent is waiting for help with a task, the binding affinity is consolidated to obtain the required help. The waiting agents work as sources of consolidated stimulation signals and attract other agents to come and help, based on self-tuning.
According to Eq. 1, the consolidated stimulation value between antibodies and antigens is calculated as:
where ω is a variable, A_w is the number of agents waiting for the same task, b_j is the capability of agent j and f is an increasing function of b_j.
From Eq. 3, we can see that the more waiting agents there are and the more capability those agents have, the greater the stimulation value agent i has to cope with task k.
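A hedged sketch of Eq. 3, under the assumption that the consolidation simply amplifies the base affinity by an increasing function f of the waiting agents' capabilities (the exact form and the role of ω are not reproduced in the text):

```python
def consolidated_stimulation(g_ik, waiting_capabilities, omega=0.5, f=lambda b: b):
    """Assumed form of Eq. 3: the base affinity g_ik is amplified by the
    waiting agents' capabilities b_j through an increasing function f,
    weighted by omega. More waiters / more capability -> larger value."""
    boost = omega * sum(f(b) for b in waiting_capabilities)
    return g_ik * (1.0 + boost)
```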
When an agent selects a task, the task may also be selected by other agents. Thus, for one task, several agents generate antibodies. According to the IAN framework, stimulation and suppression coexist between antibodies. The interaction between antibodies is calculated as:
where b_i and b_j are the capabilities of agents i and j to cope with tasks and d_ij is the distance between agents i and j.
In Eq. 4, the sign of r_ij varies between stimulation and suppression: if the interaction is stimulatory, r_ij is positive; if suppressive, it is negative.
Based on Farmer's dynamic equations, a simplified equation can be derived:
where i, j, k, g_ik, k_i, α and β all have the same meanings as in Farmer's dynamic equations, and r_ij corresponds to m_ij in Farmer's dynamic equations.
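Eq. 4 and 5 are likewise not reproduced in this text. The sketch below assumes a Farmer-style rate equation in which the signed interaction terms r_ij (Eq. 4) and the affinity g_ik drive agent i's antibody concentration; the precise form and coefficients are assumptions.

```python
def update_concentration(a, i, r, g_ik, k_i, alpha=1.0, beta=1.0, dt=0.1):
    """One Euler step of an assumed simplified dynamic (Eq. 5):
        da_i/dt = (alpha * sum_j r[i][j] * a[j] + beta * g_ik - k_i) * a[i]
    where a holds antibody concentrations and r[i][j] is positive for
    stimulation and negative for suppression (sign convention of Eq. 4)."""
    interaction = sum(r[i][j] * a[j] for j in range(len(a)))
    da = (alpha * interaction + beta * g_ik - k_i) * a[i]
    return a[i] + dt * da
```

With all interactions zero and g_ik balanced by the death term k_i, the concentration stays constant; net stimulation raises it and net suppression lowers it, which is the homeostasis-and-perturbation behavior described earlier.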
Strategic behavioral control for agents: In a multi-agent system, teamwork
is an important and frequently occurring activity. Here, teamwork means
two or more agents working cooperatively to achieve a common goal. When a group
of agents works together, a crucial aspect is to reach understanding and
agreement among agents through communication. Hence, the behavior of agents,
characterized by unique behavioral states, is studied to derive their
correct operation strategies. Through these behavioral states, an agent is able
to determine its behavior in conjunction with the state information of other
cooperating agents obtained via communication; thereby an overall strategic
plan is developed based on the mutual understanding between agents. Agents alter
their behaviors by monitoring the dynamic environment. In different stages of
an operation cycle, agents change their behavioral states to perform different
activities according to the different antibodies they have chosen. Figure
3 shows the behavior model of an agent in the form of a state transition
diagram that defines the change of behavior of an agent in response to help
and requests for help during operation.
According to Jerne's network theory, the group behavior of agents is regulated by the agents' concentration in response to a particular task. A concentration level is given to every task in the default stage. The higher the task concentration level, the larger the number of agents needed to complete the job. Initially, all agents are assumed to be in the wandering state, searching for tasks, when they are deployed in the workplace. Once a task has been found by a particular agent, it changes its behavioral state to the task-locking state, accompanied by a color change from red to yellow, and approaches the task. While the agent is approaching the targeted task, the concentration level of that task is checked. If the task concentration is greater than unity, the agent sends a signal to request help and changes to the idle state until there are enough agents to complete the task, at which point their colors turn green. Agents that receive signals from the requesting agent reply to the request; only agents in the wandering state respond and change their state from wandering to cooperation. Agents that are stimulated by, and have triggered responses towards, other tasks are not able to participate in another teamwork operation. On the other hand, if enough agents have already responded to the same cooperative task, other agents that are approaching that task from farther away will leave it, change their behavioral state back to the wandering state and look for new tasks.
|| The state transition diagram of an agent
|| Strategies corresponding to an agents behavioral states
Six different behavioral states are planned to characterize the different antibodies adopted by an agent under different operating conditions, as listed in Table 1.
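The transitions described above can be sketched as a small state machine. The state names and the transition-trigger parameters below are assumptions reconstructed from the text, not the paper's exact labels from Table 1.

```python
from enum import Enum, auto

class State(Enum):
    WANDERING = auto()    # exploring the workplace, searching for tasks
    LOCKING = auto()      # task found, approaching it (red -> yellow)
    IDLE = auto()         # waiting for helpers after sending a request
    COOPERATION = auto()  # answering another agent's help request
    WORKING = auto()      # enough agents gathered, executing the task (green)

def next_state(state, found_task=False, concentration=0, helpers=0, help_request=False):
    """Transition rules sketched from the text: only wandering agents
    answer help requests; a task with concentration > 1 needs teamwork."""
    if state is State.WANDERING:
        if found_task:
            return State.LOCKING
        if help_request:
            return State.COOPERATION
    elif state is State.LOCKING:
        # A task needing more than one agent sends the finder to IDLE.
        return State.IDLE if concentration > 1 else State.WORKING
    elif state is State.IDLE and helpers + 1 >= concentration:
        return State.WORKING
    return state
```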
Environmental situations were modeled as antigens and responses to them were modeled as antibodies in the simulation. An Antibody class was designed to interface with the controller; each antibody may belong to one or two states, so that multiple antibody objects could be created. The class had public double attributes strength, concentration and activation and a public double array paratope_strength to hold the degree of match (a value between 0 and 1) for each antigen. There was also a public integer array idiotope_match to hold disallowed mappings (a value of 1 for a disallowance, 0 otherwise) between the antibody and each antigen, thus representing the idiotypic suppression and stimulation between antibodies. The behavior of the robot in response to environmental conditions was hence analogous to external matching between antibodies and antigens and internal matching between antibodies.
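A minimal Python rendering of the Antibody class just described (the original implementation language is not stated; the attribute names follow the text):

```python
class Antibody:
    """Mirrors the simulation's Antibody class: public double attributes
    plus per-antigen paratope and idiotope arrays."""

    def __init__(self, num_antigens):
        self.strength = 0.0
        self.concentration = 0.0
        self.activation = 0.0
        # Degree of match (0..1) between this antibody and each antigen.
        self.paratope_strength = [0.0] * num_antigens
        # 1 marks a disallowed mapping, 0 otherwise (idiotypic suppression).
        self.idiotope_match = [0] * num_antigens

    def external_match(self, antigen_index):
        """External matching: how strongly this antibody fits an antigen."""
        return self.paratope_strength[antigen_index]
```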
The degrees of paratope matching were initially hand-designed but were allowed
to change dynamically through reinforcement learning. Table 2
shows the 7 antigens and 12 antibodies that were selected and the match values
that were initially assigned.
|| Paratope match mappings between antibodies and antigens
|L and R denote left and right, respectively
|| Idiotope match mappings between antibodies and antigens
|L and R denote left and right, respectively
The idiotope mappings were also designed by hand, but were not developed in
any way. Table 3 shows the idiotope values used.
It is worth noting that although the initial idiotope matrix was not developed in any way, the idiotypic results were still adaptive. The presence of suppressive and stimulatory forces was based on the static idiotope matrix but the scores awarded or deducted for these effects were taken from the dynamic paratope matrix.
Simulation description: To evaluate the reliability and efficiency
of the AIS-based control framework, a box-collecting case study for a multi-agent system
is performed. We implement this case study using Player/Stage. Player and Stage
run on many UNIX-like platforms and are released as free software under the GNU
General Public License (Gerkey et al., 2003).
Player provides interfaces to the robot's sensors and actuators over the network
and provides communication between the robots of the system. Additionally, Player
provides a layer of hardware abstraction that permits algorithms to be tested
in simulated 2D and 3D environments (Stage and Gazebo, respectively) and on
real hardware.
|| An illustration of the layout of simulated arena
Stage provides a two-dimensional bitmapped environment where robots and sensors
operate. All of these devices can be accessed through Player as if they were real
hardware. Stage aims to be efficient and configurable rather than highly accurate.
In the present simulation, a square workplace is constructed. It is assumed
that the boxes have different weights but a uniform shape (a square
box) and are randomly located in the workspace. All agents are initially placed randomly
within the workplace. Two home depots, where boxes will be stored,
are located at the middle of the right and left edges of the workplace. The
layout of the simulated arena is depicted in Fig. 4.
The AIS-based control paradigm is fully distributed: no supervisors or
leaders are defined. AIS agents are allowed to move freely within the workplace;
they have the ability to obtain information about the environment within
their sensory range while exploring and to exchange information
with other agents in close proximity, as defined by the communication range.
Tasks defined in this study require either a single agent or a group of agents
to handle, depending mainly on an agent's capability and the specific task
it has locked. The main objective of this case study is to demonstrate how coordination
enhances the overall efficiency of a multi-agent system based on the AIS-based
paradigm. In view of this, the explicit handling of tasks is not within the
scope of this study and the execution of these tasks is therefore not considered.
Nonetheless, the complexities of these tasks are crucial for agents in making
their decisions. The corresponding complexity chains, by which specificity matching
and binding affinity between tasks and agents are evaluated, represent the
complexities of the different types of tasks.
|| Parameters of box weight and robot ability
Experiment setup: A simulation is completed when the agents have accomplished all the tasks in the workplace. Since initiating agents become idle while waiting for help, a deadlock can result when the number of tasks is greater than the number of agents and all agents are attached to different tasks, each waiting for help. A condition is therefore imposed to limit the idling time of initiating agents: an initiating agent evokes stimulation signals and waits for assistance for a fixed period of time; after that, if no one has replied to the signals, the initiating agent abandons the cooperative task and searches for other tasks. In this case study, the idling time limit is set to 10 time steps.
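The 10-step idling rule can be captured in a few lines; how helper arrival is signalled is an assumption, since the paper does not specify the mechanism:

```python
IDLE_LIMIT = 10  # time steps, as in the case study

def should_abandon(wait_steps, helpers_arrived):
    """An initiating agent gives up a cooperative task and resumes
    searching once it has idled for the limit without any helper arriving."""
    return wait_steps >= IDLE_LIMIT and not helpers_arrived
```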
During the simulation, a specific box-collecting scenario was considered: many boxes are randomly scattered in the arena, and a swarm of agents (here, five) search for boxes and retrieve them to the nearer of the two home depots. Each box has a different weight and each robot has a different executive ability. The parameters used in the simulation are listed in Table 4. Based on the environmental information obtained by onboard sensors, robots can detect a target and lock onto a task, as can be seen from the simulation results.
The case studies show that the coordination and performance of the AIS-based control framework are satisfactory, as validated by the results.
ANALYSIS AND CONCLUSION
As indicated above, the degrees of paratope matching were initially hand-designed, implying that the simulation results need not be optimal. We update the paratope table according to the testing results obtained during simulation. This process can be replaced by other algorithms, such as the Genetic Algorithm (GA), Particle Swarm Optimization (PSO) or Ant Colony Optimization (ACO).
Figure 5 indicates that Robot (agent) 5 has found a box that it
cannot move by itself, so it sends a signal and waits for help.
||Screenshot 1: Agent 5 finding a target and waiting for help
to move its target
||Screenshot 2: Agent 4 informed and cooperating with agent
5 to take the task
The white color denotes Robot 5 being in the request state. In Fig.
6, Robot 4 receives the signal and successfully joins Robot 5. The grey
color indicates that they have passed the concentration-level test. In Fig.
7, Robots 4 and 5 cooperate to move their object box to homeR, which
is nearer to them. Meanwhile, Robots 2 and 3 have locked the same task
but cannot reach the demanded concentration level; their color means
they are waiting for help.
||Screenshot 3: Agents 4 and 5 cooperate to move a box
to home. Meanwhile, agents 2 and 3 have locked the same task but cannot
reach the demanded concentration level and their white color
means they are waiting for help. For the sake of simplicity, during the
cooperation of agents 4 and 5, their box is removed from the simulation
The authors would like to thank Prof. Guangzhao Cui and Dr. Yanfeng Wang of the Key Lab of Information-based Electrical Appliances of Henan Province for their kind help in providing instruments and workspace. Many of the experiments were executed in the key lab. This work is supported by the Natural Science Foundation of Henan Province of China under grant No. 092300410036.