INTRODUCTION
Biodiesel is a renewable fuel produced from biological oils and fats, which has many characteristics of a promising alternative energy resource. It has properties similar to ordinary diesel fuel made from crude oil and can be used in conventional diesel engines. The most common process for making biodiesel is known as transesterification. This process involves combining any natural oil (vegetable or animal) with virtually any alcohol and a catalyst.
Different vegetable oils can be used to produce biodiesel, including both virgin and waste vegetable oils. Rapeseed, soybean and palm oils are the most commonly used, though other crops such as mustard, hemp, jatropha and even algae show great potential as sources of raw material. As Malaysia is the world’s main palm oil producer and exporter, biodiesel can be produced there from this raw material.
Density data are important in numerous chemical engineering unit operations. Biodiesel density data as a function of temperature are needed to model combustion processes and for other applications. The density of a methyl ester depends on its molecular weight, free fatty acid content, water content and temperature. As palm oil biodiesel becomes more popular as a fuel, its density data will be needed. Measurements and predictions of specific biodiesel properties have been reported recently, but density measurements and predictions for palm oil biodiesel have rarely, if ever, been reported.
The densities and viscosities of the methyl esters of hexanoic, heptanoic, octanoic, decanoic and dodecanoic acids were determined by Liew et al. (1992) at temperatures ranging from 10 to 80°C at 5°C intervals. The densities of the methyl esters were found to vary linearly with temperature.
Tate et al. (2006) obtained the densities of canola, soy and fish oil methyl esters at temperatures up to 300°C. Derived densities were found to be linear with temperature over the measured range.
Noureddini et al. (1992) measured density as a function of temperature for a number of vegetable oils as well as eight fatty acids in the range C9 to C22, at temperatures from above their melting points to 110°C.
Neural networks, or simply neural nets, are computing systems that can be trained to learn a complex relationship between two or more variables or data sets. Basically, they are parallel computing systems composed of interconnected simple processing nodes (Lau, 1991). Neural networks are usually formulated in matrix terms, which can make them mathematically challenging. The neuron model and the architecture of a neural network describe how the network transforms its input into its output; this transformation can be viewed as a computation. Each model and architecture places limits on what a particular neural net can compute. A neuron computes its output by multiplying each of its inputs by the corresponding weight, summing these products together with the neuron’s bias and passing the result through the transfer function. Neurons may be simulated with or without biases.
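The neuron computation described above can be illustrated with a minimal Python sketch. The function name, the default hyperbolic tangent transfer function and the numerical values are assumptions chosen only for illustration; they are not taken from the study.

```python
import math

def neuron_output(inputs, weights, bias=0.0, transfer=math.tanh):
    """Weighted sum of the inputs plus the bias, passed through a transfer function."""
    net_input = sum(w * x for w, x in zip(weights, inputs)) + bias
    return transfer(net_input)

# Hypothetical inputs and weights, used only to show the computation.
y_with_bias = neuron_output([1.0, 2.0], [0.5, -0.25], bias=0.3)
y_without_bias = neuron_output([1.0, 2.0], [0.5, -0.25])  # net input is 0.0
```

Leaving the bias at its default of zero corresponds to simulating the neuron without a bias.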
The feed forward neural network is one of the most important historical developments in neurocomputing. The model was developed by Werbos (1974), Parker (1985) and Rumelhart et al. (1986).
One of the many benefits of this kind of network is its capability to approximate a mathematical function or mapping. Hecht-Nielsen (1987) first used a new version of the Kolmogorov theorem, developed by Sprecher (1965). Subsequently, it was shown that a feed forward neural network with one hidden layer could approximate any continuous function to any degree of accuracy.
Back propagation was created by generalizing the Widrow-Hoff learning rule to multiple-layer networks and nonlinear differentiable transfer functions. Input vectors and the corresponding target vectors are used to train a network until it can approximate a function, associate input vectors with specific output vectors, or classify input vectors in an appropriate, predefined way. Networks with biases, a sigmoid layer and a linear output layer are able to approximate any function with a finite number of discontinuities. Standard back propagation is a gradient descent algorithm, as is the Widrow-Hoff learning rule, in which the network weights are moved along the negative of the gradient of the performance function. The schematic of the back propagation model is shown in Fig. 1. The term back propagation refers to the manner in which the gradient is computed for nonlinear multilayer networks.
There are two different ways in which the gradient descent algorithm can be implemented: incremental mode and batch mode. In the incremental mode, the gradient is computed and the weights are updated after each input is applied to the network. In the batch mode, all of the inputs are applied to the network before the weights are updated.
Since they were first applied to the estimation of chemical and physical properties of materials, Artificial Neural Networks (ANNs) have become established as a dependable method for this task. Neural network models have been used for the prediction of biodiesel characteristics, with very good results.
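The difference between the incremental and batch modes can be seen in a minimal Python sketch. It fits a single linear neuron, y = w·x + b, to synthetic data by gradient descent on the squared error; the model, the data, the learning rate and the function names are all assumptions made for illustration and are not the network used in this study.

```python
def train_incremental(xs, ys, lr=0.1, epochs=500):
    """Incremental mode: update w and b after every single sample."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            err = (w * x + b) - y
            w -= lr * 2.0 * err * x
            b -= lr * 2.0 * err
    return w, b

def train_batch(xs, ys, lr=0.1, epochs=500):
    """Batch mode: average the gradient over all samples, then update once per epoch."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = sum(2.0 * ((w * x + b) - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2.0 * ((w * x + b) - y) for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Synthetic, noise-free data generated from y = 2x + 1 (illustrative only).
xs = [-1.0, -0.5, 0.0, 0.5, 1.0]
ys = [2.0 * x + 1.0 for x in xs]
w_inc, b_inc = train_incremental(xs, ys)
w_bat, b_bat = train_batch(xs, ys)
```

On this small, noise-free problem both modes recover nearly the same weights; they differ only in how often the weights are updated.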
Ramadhas et al. (2006) developed ANNs to predict the cetane number of biodiesel. Multilayer feed forward, radial basis, generalized regression and recurrent network models were used for the prediction. The predicted cetane numbers were found to be in agreement with the experimental data.
Kumar and Bansal (2007) examined seven neural network architectures and three training algorithms, along with ten different sets of weights and biases, to predict the properties of diesel and biodiesel blends. The results showed that the neural network trained with the Levenberg-Marquardt algorithm gave the best estimate of the properties of diesel-biodiesel blends.

Fig. 1: Back propagation learning schematic (Spencer and Danner, 1972)
In another study, Duran et al. (2005) used neural networks to estimate the composition of diesel particulate matter from blends of transesterified waste oils. The simulation results indicated that the amount of palmitic acid methyl ester in the fuel was the main factor affecting the amount of insoluble material emitted, owing to its higher oxygen content and cetane number.
In this study, a new approach based on Artificial Neural Networks (ANNs) has been designed to estimate the density of palm oil based biodiesel. Density data measured at temperatures from 14 to 90°C in our previous study (Baroutian et al., 2007) were used to train the networks and to test their results. The present work applied a three-layer back propagation neural network with seven neurons in the hidden layer. The predicted results were also compared with experimental densities and with the estimates of empirical and theoretical methods.
Model specification: To train and validate the neural network, measured density data for palm oil methyl ester were used (Baroutian et al., 2007). The biodiesel was prepared by transesterification of palm olein in a batch system, using methanol as the alcohol source and potassium hydroxide as the catalyst. The reaction was carried out with 100% excess methanol, i.e., a methanol to oil molar ratio of 6:1.
Methyl ester density was measured at temperatures from 14 to 90°C. Each measurement was repeated three times to obtain a mean value at each temperature.
The present study applies a three-layer feed forward back propagation neural network. The input, hidden and output layers had 1, 7 and 1 neurons, respectively, as shown in Fig. 2.
Deciding the number of neurons in the hidden layer is a very important part of deciding the overall neural network architecture. Although the hidden layer does not interact directly with the external environment, it has a tremendous influence on the final output, so the number of neurons in this layer must be considered carefully.

Fig. 2: Feed forward back propagation network with three layers
Using too few neurons in the hidden layers results in underfitting. Underfitting occurs when there are too few neurons in the hidden layers to adequately detect the signals in a complicated data set.
Using too many neurons in the hidden layers can cause several problems. First, too many neurons may result in overfitting. Overfitting occurs when the neural network has so much information processing capacity that the limited amount of information contained in the training set is not enough to train all of the neurons in the hidden layers. Second, the training time increases.
Obviously, some compromise must be reached between too many and too few neurons in the hidden layers. In practice, the selection of the neural network architecture comes down to trial and error. To organize the trial and error search for the optimum network architecture, the forward selection method was used. This method begins by selecting a small number of hidden neurons (only two).
Subsequently, the neural network is trained and tested. The number of hidden neurons is then increased and the process is repeated as long as the overall results of training and testing improve. The forward selection method is shown in Fig. 3.
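The forward selection loop described above can be sketched in a few lines of Python. The function name, the `evaluate_error` callable (standing in for "train and test a network with n hidden neurons") and the error values are hypothetical and serve only to show the control flow.

```python
def forward_selection(evaluate_error, start=2, max_neurons=20):
    """Grow the hidden layer while the error returned by evaluate_error keeps improving."""
    best_n = start
    best_err = evaluate_error(start)
    for n in range(start + 1, max_neurons + 1):
        err = evaluate_error(n)
        if err >= best_err:  # no improvement: stop and keep the smaller network
            break
        best_n, best_err = n, err
    return best_n, best_err

# Hypothetical error table replacing real training and testing runs.
mock_errors = {2: 0.90, 3: 0.55, 4: 0.31, 5: 0.33, 6: 0.20}
chosen_n, chosen_err = forward_selection(lambda n: mock_errors[n], start=2, max_neurons=6)
```

With this mock table the search stops at four neurons, because five neurons do not improve the error. A greedy search of this kind can miss a better, larger network (six neurons here), which is one reason the selection remains a matter of trial and error.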
Each layer of this network has its own weight matrix, its own bias vector, a net input vector and an output vector. This network can be used for general function approximation.
It has been proven that three-layer networks with sigmoid transfer functions in the hidden and output layers can approximate virtually any function of interest to any degree of accuracy, provided that a sufficient number of hidden units is available.

Fig. 3: Selecting the number of hidden neurons with forward selection
Therefore, the key component of the neuron model, the transfer function, is used to design the network and establish its behavior.
The transfer function may be a linear or nonlinear function. A tansig (hyperbolic tangent sigmoid) transfer function has been chosen to satisfy some specification of the problem that the neurons are attempting to solve. Sigmoid functions are often used in neural networks to introduce nonlinearity in the model and/or to make sure that certain signals remain within a specified range. A popular neural net element computes a linear combination of its input signals and applies a bounded sigmoid function to the result; this model can be seen as a smoothed variant of the classical threshold neuron.
A reason for its popularity in neural networks is that the sigmoid function satisfies this property:
This simple polynomial relationship between the sigmoid function and its derivative is computationally easy to evaluate. It makes back propagation based learning easy because the function is differentiable, has a simple relation between the function and its derivative, is flexible, with an easily changeable slope, and works well with the mean square error convergence criterion.
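The expression for the property mentioned above is not reproduced in this text. For reference, two standard identities of this kind are σ'(t) = σ(t)[1 − σ(t)] for the logistic sigmoid and d/dt tanh(t) = 1 − tanh²(t) for the hyperbolic tangent sigmoid (tansig). Which form was printed in the original is an assumption. The minimal Python sketch below checks both identities against a numerical derivative; all function names are illustrative.

```python
import math

def logistic(t):
    return 1.0 / (1.0 + math.exp(-t))

def logistic_derivative(t):
    s = logistic(t)
    return s * (1.0 - s)      # derivative written in terms of the function value

def tansig_derivative(t):
    y = math.tanh(t)
    return 1.0 - y * y        # derivative written in terms of the function value

def numerical_derivative(f, t, h=1e-6):
    """Central difference approximation of df/dt."""
    return (f(t + h) - f(t - h)) / (2.0 * h)

# Largest disagreement between the identities and the numerical derivative.
max_gap = max(
    max(abs(logistic_derivative(t) - numerical_derivative(logistic, t)),
        abs(tansig_derivative(t) - numerical_derivative(math.tanh, t)))
    for t in (-2.0, -0.5, 0.0, 0.5, 2.0)
)
```

Because the derivative is a polynomial in the neuron output, back propagation can reuse the output already computed in the forward pass instead of evaluating an exponential again.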
MATERIALS AND METHODS
The approach used in this study was implemented with the Matlab Toolbox (2001), which has extensive neural net capabilities. The study was conducted in 2007 at the Department of Chemical Engineering, University Malaya.
Of the 77 density data points, 69 densities measured at different temperatures from 14 to 90°C were chosen to train the network. The remaining 8 were used for simulation, to evaluate the accuracy of the newly trained network on a set of data it had never seen.
The procedure to create and train a network using this toolbox was as follows:
• The input (temperature) and target (density) vectors were entered in a suitable format in the Matlab workspace.
• The vectors were normalized independently so that each element was assigned a number between -1 and 1, because the network is sensitive to inputs in this range when a sigmoid transfer function is used. Normalization also prevents the training domain from being biased toward one input variable or toward larger input values. Furthermore, the sigmoid transfer function produces outputs within the range of -1 to 1, so bias may be generated if the input is not normalized.
• A three-layer feed forward back propagation network was created with the Matlab Neural Network Toolbox. Back propagation was chosen because it uses a gradient descent method, which allows the neural network to train with a higher degree of efficiency.
• Trainlm and Tansig were chosen as the training and transfer functions, respectively.
• The input and target vectors were introduced to the created network and the weights were initialized.
• Training parameters such as the number of epochs and the error goal were adjusted.
• The specified network was trained gradually. This process finished when the defined error goal was reached. During training, the weights and biases of the network were iteratively adjusted to minimize the network performance function.
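The steps listed above can be approximated outside Matlab with a short, self-contained Python sketch. It is not the authors' implementation. It assumes a 1-7-1 network with a tanh (tansig) hidden layer and a linear output neuron, replaces Levenberg-Marquardt training (trainlm) with plain batch gradient descent, and uses synthetic placeholder data instead of the measured densities. All names and hyperparameters are illustrative.

```python
import math
import random

def normalize(values):
    """Linearly map values onto [-1, 1] using their own minimum and maximum."""
    lo, hi = min(values), max(values)
    return [2.0 * (v - lo) / (hi - lo) - 1.0 for v in values]

def forward(x, params):
    """1-H-1 network: tanh hidden layer, linear output neuron (an assumption)."""
    w1, b1, w2, b2 = params
    hidden = [math.tanh(w * x + b) for w, b in zip(w1, b1)]
    return sum(v * h for v, h in zip(w2, hidden)) + b2, hidden

def mse(xs, ys, params):
    """Mean squared error, used here as the performance function."""
    return sum((forward(x, params)[0] - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def init_params(hidden_neurons=7, seed=0):
    """Small random initial weights and biases; the output bias starts at zero."""
    rng = random.Random(seed)
    def draw():
        return [rng.uniform(-0.5, 0.5) for _ in range(hidden_neurons)]
    return draw(), draw(), draw(), 0.0

def train(xs, ys, params, lr=0.05, epochs=2000, goal=1e-5):
    """Batch gradient descent with back propagation (not Levenberg-Marquardt)."""
    w1, b1, w2, b2 = params
    n, h = len(xs), len(w1)
    for _ in range(epochs):
        if mse(xs, ys, (w1, b1, w2, b2)) <= goal:  # error goal reached
            break
        g_w1, g_b1, g_w2, g_b2 = [0.0] * h, [0.0] * h, [0.0] * h, 0.0
        for x, y in zip(xs, ys):
            out, hidden = forward(x, (w1, b1, w2, b2))
            d_out = 2.0 * (out - y) / n  # derivative of the error w.r.t. the output
            g_b2 += d_out
            for j in range(h):
                g_w2[j] += d_out * hidden[j]
                d_hid = d_out * w2[j] * (1.0 - hidden[j] ** 2)  # back through tanh
                g_w1[j] += d_hid * x
                g_b1[j] += d_hid
        w1 = [w - lr * g for w, g in zip(w1, g_w1)]
        b1 = [b - lr * g for b, g in zip(b1, g_b1)]
        w2 = [w - lr * g for w, g in zip(w2, g_w2)]
        b2 -= lr * g_b2
    return w1, b1, w2, b2

# Synthetic placeholder data (NOT the measured densities): a linear trend only.
temperatures = [float(t) for t in range(14, 91, 4)]
densities = [0.90 - 0.0007 * t for t in temperatures]
x_norm, y_norm = normalize(temperatures), normalize(densities)

initial = init_params()
trained = train(x_norm, y_norm, initial)
initial_error = mse(x_norm, y_norm, initial)
final_error = mse(x_norm, y_norm, trained)
```

Predictions from such a network are in normalized units and must be mapped back to the original density range before they are compared with measurements.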
The optimum number of hidden layer neurons was determined during the learning and training processes by trial and error. Different numbers of hidden neurons were tested; however, since the performance did not change significantly with more neurons, the simplest network was chosen. To illustrate the optimization, networks with 1, 2, 3, 5, 7, 10 and 15 hidden neurons were trained.

Fig. 4: Average absolute percent deviation (AAPD) parameter of training results versus number of neurons in hidden layer
As shown in Fig. 4, the best choice is the network with seven neurons in the hidden layer, which is in good agreement with the experimental data. Fig. 4 shows the Average Absolute Percent Deviation (AAPD) of the training results for different numbers of hidden layer neurons, calculated according to the expression:
After training the three-layer feed forward back propagation network, the palm olein biodiesel density at other temperatures was predicted by simulating this network with suitable inputs.
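The AAPD expression is not reproduced in this text. The minimal Python sketch below assumes the commonly used definition, AAPD = (100/N) Σ |ρ_predicted − ρ_measured| / ρ_measured, which may differ in detail from the expression used in the study. The function name and the example numbers are hypothetical.

```python
def aapd(measured, predicted):
    """Average absolute percent deviation between measured and predicted values."""
    n = len(measured)
    return 100.0 / n * sum(abs(p - m) / m for m, p in zip(measured, predicted))

# Hypothetical numbers, used only to exercise the function (1% off in each case).
example_aapd = aapd([100.0, 200.0], [101.0, 198.0])
```

Because each deviation is divided by the measured value, the result is a dimensionless percentage that can be compared across methods.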
RESULTS AND DISCUSSION
As shown in Fig. 5, there is very good agreement between the measured data (normalized density) and the training results.
This illustrates that the network has been trained very well and can be used to simulate the biodiesel density over a wide range of temperatures.
There is also very good agreement between the experimental data and the simulated data. The equations of the form y = f(x) in Fig. 5 and 6 are the equations of the regression lines.
When all the points fall exactly on the 45° line, the regression line is y = x. In this case, the network has been trained or simulated very well. Otherwise, the regression line has the form y = ax+b. The regression coefficients (R^{2} values), which are also shown in these figures, indicate the agreement of the trained and simulated data with the experimental data. In the ideal situation, the two data sets are identical (R^{2} = 1).
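The regression line y = ax+b and the R^{2} value discussed above can be computed with an ordinary least squares fit. The Python sketch below is only an illustration. The function name and the sample values are hypothetical and are not the data plotted in Fig. 5 and 6.

```python
def linear_regression(x, y):
    """Least squares slope a, intercept b and coefficient of determination R^2."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    a = sxy / sxx
    b = mean_y - a * mean_x
    ss_res = sum((yi - (a * xi + b)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return a, b, 1.0 - ss_res / ss_tot

# Hypothetical normalized targets and network outputs.
targets = [-1.0, -0.5, 0.0, 0.5, 1.0]
outputs = [-0.98, -0.52, 0.01, 0.49, 1.0]
slope, intercept, r2 = linear_regression(targets, outputs)
```

A slope close to 1, an intercept close to 0 and an R^{2} close to 1 together indicate that the network outputs track the targets closely.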
In Fig. 7, the measured densities are compared with those predicted by the ANNs. It can be seen that there is good agreement between the results of the ANNs and the measured data; the AAPD for the neural network simulation is 0.29%. Fig. 7 also compares the densities predicted by the ANNs with those predicted by the Spencer and Danner method (Baroutian et al., 2007).

Fig. 5: Comparison between the training results and the measured data

Fig. 6: Comparison between the simulation results and the measured data
The AAPD for the Spencer and Danner method is 0.05% (Baroutian et al., 2007), so the Spencer and Danner method gives a better prediction, with less deviation, than the ANNs. However, at temperatures above 80°C the ANNs are more reliable, and the maximum deviations of the ANN and Spencer and Danner results are 0.09 and 0.15%, respectively. The Spencer and Danner method uses the modified Rackett equation with the critical properties of mixtures to estimate liquid density (Spencer and Danner, 1972).
The Rackett equation, as modified by Spencer and Danner (1972), gives the liquid density ρ (g cm^{-3}) as:

Fig. 7: Density comparison between the experimental and estimation results
Where:
In this equation, Z_{RA} is the Rackett compressibility factor, which can be determined for biodiesel from measured densities by applying Eq. 3. T (K), T_{R} (K), T_{c} (K) and φ are the temperature, reduced temperature, critical temperature and fugacity coefficient, respectively. To determine the mixture critical properties, the Lee-Kesler mixing rules recommended by Knapp et al. (1982) and Plocker et al. (1978) were used. The equations for the mixture critical temperature T_{cm} (K), critical pressure P_{cm} (bar) and critical volume V_{cm} (cm^{3} mol^{-1}) are:
where x is the mole fraction, ω is the acentric factor and R is the gas constant (mL bar mol^{-1} K^{-1}).
Fig. 7 also compares the densities predicted by the ANNs with those of the Clements (1996) empirical method. The AAPD for the Clements method is 0.40% (Baroutian et al., 2007), so the ANNs show less deviation.
CONCLUSIONS
The approach presented in this study speeds up the prediction of the density of palm methyl ester biodiesel. This new approach uses artificial neural networks to estimate the density and is able to predict it at various temperatures. The comparison of the results obtained by Artificial Neural Networks (ANNs) with those predicted by the Spencer and Danner (1972) and Clements (1996) methods shows the reliability of ANNs relative to the theoretical and empirical methods. Finally, the good agreement between the measured data and the results of the artificial neural network shows that ANNs can be a powerful model for predicting the density of palm oil biodiesel.
Furthermore, this approach provides a new way to estimate the density of biodiesel accurately, alongside the available methods.