INTRODUCTION
In many areas of science, system modeling is an important task. A model
is a useful tool for system analysis and gives the designer a clearer
view of the system's behavior. Furthermore, a model enables us to
simulate and predict the behavior of a system.
In engineering applications, models are especially needed for analysis
and design of systems. For instance, advanced technologies for controller
design, optimization, supervision and surveillance, fault detection and
so on are based on models of the systems.
From one viewpoint, models can be divided into three groups: white-box,
black-box and gray-box.
Our idea is related to black-box modeling. Black-box models rely on experimental
data and need no a priori knowledge. The parameters and the structure
of the model may have no relation to the actual structure of the system.
It is well known that a model is always an approximation of the system's
real relationships and that its response does not always match the
real responses of the system. This is even more noticeable in black-box models.
In fact, we cannot test the correctness of the model's responses for
all inputs. All efforts aim to increase the probability that the model
performs properly.
It is well known that the proper performance of a model depends on
the model structure and the data used to tune the model parameters. The
richer the training data is in representing the system, the more accurate
the model.
One of the factors that influence the richness of the data is the number
of data points. However, in real-world problems, there are limits
on how many data points can be collected. Acquiring data from many
systems and processes takes considerable time and cost, which seriously
restricts the number of data points. Examples include modeling the
mechanical characteristics of an alloy in terms of its ingredients
(Datta and Banerjee, 2006), estimating the size of mineral reservoirs
(Tercan and Karayigit, 2001) and modeling the dyeing properties of
materials in the textile industry (Senthilkumar, 2006; Daneshvar
et al., 2006; Keong et al., 2004; Dai et al., 2004;
Hebbar et al., 2006; Khamis et al., 2006; Elkamel, 1998;
Turkoglu et al., 1999).
Sometimes it is necessary to model the system with an already available
set of data; that case is not of interest in this study. Here, the idea
is to obtain a set of data that represents the system appropriately.
Suppose that the limitations permit us to obtain just M data points.
We can obtain the data points with the following methods:
• Selecting M data points randomly in the domain of interest
• Gridding each dimension of the input space
• Gridding each dimension of the input space and selecting a data point
randomly in each interval
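For concreteness, the three batch sampling methods above can be sketched for a one-dimensional domain [a, b] as follows (the function names are illustrative, not taken from the paper):

```python
import random

def sample_random(a, b, m):
    """Method 1: draw M points uniformly at random from [a, b]."""
    return [random.uniform(a, b) for _ in range(m)]

def sample_grid(a, b, m):
    """Method 2: place M points on an evenly spaced grid over [a, b]."""
    step = (b - a) / (m - 1)
    return [a + i * step for i in range(m)]

def sample_grid_random(a, b, m):
    """Method 3: split [a, b] into M intervals and pick one random
    point inside each interval."""
    step = (b - a) / m
    return [random.uniform(a + i * step, a + (i + 1) * step)
            for i in range(m)]
```

In higher dimensions the grid variants are applied per input dimension, so the number of points grows with the grid resolution in each dimension.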
The problem with all three methods is that the data are acquired in
batch, with no intelligence in choosing them. On the other hand, as
data points are being extracted, our knowledge of the system's behavior
increases and we can judge the nature of the system better.
The idea is to use this kind of knowledge to enrich the data-extraction
procedure. To do this, inspired by the honeybee algorithm (Seifipour
and Menhaj, 2001), we have designed a method that analyzes the available
data set and selects the next data point appropriately.
THE PROPOSED ALGORITHM
In the first step, we determine the number of needed data points
(M) according to the existing constraints (for example, cost, time and
so on). Then we obtain a number of data points (N) by the grid
method to initialize the algorithm. Inspired by the honeybee algorithm,
one of the data points is selected as the queen, and the new data are
generated around the queen. To do this, we need a fitness function
to choose the queen. A data point where the gradient is maximum seems
a suitable candidate. On the other hand, those parts of the input space
where the number and density of data points are low should also be considered
for data extraction. Accordingly, we selected the following fitness
function:

Fig. 1: 
Flowchart of the neighbor-selection algorithm 
Here, (x_{i}, f(x_{i})) is the data point whose fitness
is to be evaluated, H is the number of data points in x_{i}'s neighborhood,
f is the unknown function to be modeled and d_{j} is the Euclidean distance
between x_{i} and its j^{th} neighbor x_{j}. The method
of determining a neighbor for each point in the input space is shown in
Fig. 1.
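Since the exact formula is not reproduced here, the following one-dimensional sketch only illustrates the stated intent: reward a point with a large local gradient and a sparse neighborhood. Both the way the two terms are combined and the `fitness` helper itself are assumptions, not the paper's formula:

```python
def fitness(i, xs, fs, neighbors):
    """Illustrative fitness of the data point (xs[i], fs[i]).

    `neighbors` holds the indices of xs[i]'s H neighbors (assumed to be
    chosen by the rule of Fig. 1). The returned score grows with the
    steepest local slope of f and with the mean distance to the
    neighbors (larger mean distance = sparser region)."""
    grad = 0.0
    sparsity = 0.0
    for j in neighbors:
        d = abs(xs[i] - xs[j])  # Euclidean distance in one dimension
        if d > 0:
            # Finite-difference estimate of the local gradient magnitude.
            grad = max(grad, abs(fs[i] - fs[j]) / d)
        sparsity += d
    sparsity /= len(neighbors)  # mean distance to the H neighbors
    return grad + sparsity      # assumed combination of the two criteria
```

A point on a steep, sparsely sampled part of the function thus scores high and becomes a candidate queen.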
It is worth noting that any other criterion or constraint on data
selection could be incorporated into the fitness function. In addition,
although density is already penalized by the fitness function, a tabu
list can keep the data from becoming too dense in some regions of the
input space. After the queen is selected, the best individual in its
neighborhood is chosen as its mate according to the relative fitness
function. Considering that the neighboring data point (x_{i}, f(x_{i}))
whose f(x_{i}) is most distant from the queen's is a suitable candidate
to mate with the queen, we consider:
Here, x_{i} is the point whose relative fitness is to be calculated,
Q is the queen and d_{j} is the Euclidean distance between x_{j} and Q.
The offspring is created by the crossover of the queen and its mate.
The crossover could be performed by many methods; we used the arithmetic
mean as the crossover operator.

Fig. 2: 
Flowchart of the proposed algorithm 
This algorithm continues until the remaining M-N data points are generated.
Figure 2 shows the flowchart of the proposed algorithm.
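The whole loop can be sketched in a few lines of Python. This is a one-dimensional toy version: `slope` is a crude stand-in for the paper's fitness function, the mate is chosen over all points rather than only the queen's neighborhood, and the tabu list is omitted for brevity:

```python
def extract_data(f, xs, m):
    """Grow an initial grid `xs` of N points to M points (1-D sketch).

    `f` is the system being sampled; each appended point costs one
    query of f, which is why the total budget is capped at M."""
    fs = [f(x) for x in xs]
    while len(xs) < m:
        # Queen: the point with the steepest local change in f
        # (a simplified stand-in for the fitness function).
        def slope(i):
            return max(abs(fs[i] - fs[j]) / abs(xs[i] - xs[j])
                       for j in range(len(xs)) if xs[j] != xs[i])
        q = max(range(len(xs)), key=slope)
        # Mate: the point whose f value is farthest from the queen's
        # (the relative fitness, taken here over all points).
        mate = max((j for j in range(len(xs)) if j != q),
                   key=lambda j: abs(fs[j] - fs[q]))
        # Crossover: arithmetic mean of the queen and its mate.
        child = (xs[q] + xs[mate]) / 2.0
        xs.append(child)
        fs.append(f(child))  # query the system at the new point
    return xs, fs
```

In practice a tabu list would reject a child that duplicates or crowds an existing point, keeping the extracted data from clustering.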
SIMULATION RESULTS
To verify the algorithm, we extracted data points to approximate
the functions of Tables 1 and 2, employing four methods: random, grid,
grid-random and the proposed method. In addition, from each function
we obtained a large set of test data points (100 M) to examine the
ability of each method to provide rich data.
To approximate the functions of Table 1, we used the
following methods:
• Training an MLP neural network with 5 neurons in the first layer,
3 neurons in the second layer and 1 neuron in the output layer,
with tangent-sigmoid activation functions in the first and second
layers and a linear activation function in the output layer. This
structure was obtained by trial and error and is not necessarily
optimal, but it is adequate for our purpose, which is comparing
the methods. For each data-extraction method, we trained the network
20 times and calculated the mean of the network's mean square error
on the test data
• Piecewise Cubic Hermite Interpolating Polynomial (PCHIP)
The error criterion is the mean square error (MSE). It can be seen that
the proposed method is usually better than the other methods (Table 3).
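The evaluation protocol described above (averaging the test MSE over repeated trainings on held-out data) can be sketched as follows; the helper names are hypothetical and the trainable model itself is omitted:

```python
def mse(pred, test_x, test_y):
    """Mean square error of a model's predictions on held-out test data."""
    return sum((pred(x) - y) ** 2
               for x, y in zip(test_x, test_y)) / len(test_x)

def average_mse(train_and_predict, runs, test_x, test_y):
    """Average the test MSE over several trainings, as done for the MLP
    (20 runs), since results vary with the random weight initialization.
    `train_and_predict` trains a fresh model and returns its predictor."""
    return sum(mse(train_and_predict(), test_x, test_y)
               for _ in range(runs)) / runs
```

The deterministic interpolation methods (PCHIP, TCI) need only a single run per extracted data set.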
Table 1: 
Single-variable functions used for data extraction 

Table 2: 
A two-variable function used for data extraction 

Table 3: 
Simulation results for the single-variable functions 


Fig. 3: 
(a) The diagram of the function f_{1}, (b)
data extracted from f_{1} by the grid method, (c) data extracted
from f_{1} randomly, (d) data extracted from f_{1}
by the grid-random method, (e) the initial data extracted from f_{1}
by the grid method and used in the proposed method and
(f) data extracted from f_{1} by the proposed method 
Table 4: 
Simulation results for the two-variable function 


Fig. 4: 
(a) The diagram of the function f_{4}, (b)
data extracted from f_{4} by the grid method, (c) data extracted
from f_{4} randomly, (d) data extracted from f_{4}
by the grid-random method, (e) the initial data extracted from f_{4}
by the grid method and used in the proposed method and
(f) data extracted from f_{4} by the proposed method 
To approximate the function shown in Table 2, we used
two methods:
• A three-layer MLP neural network with 10 neurons in the first layer
and 6 neurons in the second layer, both with tangent-sigmoid
activation functions, and 1 neuron in the output layer with a linear
activation function
• Triangle-based Cubic Interpolation (TCI)
It is easily seen that the error of the proposed method is lower than
that of the other methods (Table 4).
As examples, Fig. 3-5 show the data extracted by the proposed method
compared with the data extracted by the other methods.
Figure 6 shows more examples, which demonstrate the efficiency
of the proposed method.

Fig. 5: 
(a) The diagram of the function f_{6}, (b) data
extracted from f_{6} by the grid method, (c) data extracted
from f_{6} randomly, (d) data extracted from f_{6}
by the grid-random method, (e) the initial data extracted from f_{6}
by the grid method and used in the proposed method and (f)
data extracted from f_{6} by the proposed method 

Fig. 6: 
(a) The diagram of the function f_{5}, (b) the initial
data extracted from f_{5} by the grid method and used in the proposed
method, (c) data extracted from f_{5} by the proposed
method, (d) the diagram of the function f_{7}, (e) the initial
data extracted from f_{7} by the grid method and used in the proposed
method and (f) data extracted from f_{7} by the proposed
method 
CONCLUSION
In this study, a novel method for data extraction for function
approximation is presented. The proposed method is explained completely
and its efficiency is demonstrated by simulation results. More efficient
fitness functions and other evolutionary operators should be investigated
in future work. Other intelligent algorithms, such as neural networks,
should also be examined for data extraction.