INTRODUCTION
Fermentation processes are very complex and modelling them is a
challenging task (Assidjo et al., 2006; Coleman et al.,
2003; García-Ruiz et al., 2008). Indeed, dynamic models
of industrial fermentation processes are difficult to identify for
a wide variety of reasons, e.g., the complex dynamics of microorganisms,
variable and ill-defined raw materials and varying inoculum quality (Assidjo
et al., 2006; Coleman et al., 2003; García-Ruiz et
al., 2008). Nevertheless, these processes have been intensively studied and
different models have been proposed for control (Moore et al., 2001; Marin,
1999). A good model must take into account the effects of substrate limitation,
substrate and product inhibition as well as maintenance energy and cell
death on cell growth and metabolism (Moore et al., 2001).
However, it is neither necessary nor desirable to construct comprehensive
mechanistic process models that can describe the system in all possible
situations with high accuracy.
To this end, many mathematical models have been proposed to predict
the influence of fermentation operating parameters on cell growth rate,
cell concentration (biomass) and substrate utilisation rate (Sarkar and Modak,
2005; Trelea et al., 2004). The use of these models may lead to
the development of better strategies for optimising the fermentation
process and ensuring its economic viability. In fact, few fermentation models
have been used for industrial-scale fermentation optimisation (Nandasana
and Kumar, 2008).
Genetic algorithms have been widely used for optimisation and system
identification. They are robust, global and generally more straightforward
to apply in situations where there is little or no a priori knowledge about
the process to be controlled (Nandasana and Kumar, 2008; Sarkar and Modak,
2005; Goldberg, 1989). They are stochastic search techniques for approximating
optimal solutions within complex search spaces. They are based on an
analogy with biological evolution, in which the fitness of an individual
determines its ability to survive and reproduce. Their mechanism, as shown
in Fig. 1, starts by encoding the problem to produce
a list of genes (Yao et al., 2007; Davis, 1991). The genes are
represented by either numeric or alphanumeric characters and are
randomly combined to give a population of chromosomes, each of which represents
a possible solution. Genetic operations are performed on chromosomes that
are randomly selected from the population, thereby producing offspring
(Yao et al., 2007; Polifke et al., 1998). The fitness of
these chromosomes is measured and the probability of their survival is
determined. In this study, a genetic algorithm has been used to identify
the parameters of a batch fermentation.
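The mechanism described above (encoding, random initial population, selection, crossover, mutation and survival of the fittest) can be illustrated with a minimal sketch. It is not the toolbox implementation used in this study: the function name `genetic_minimise`, the binary tournament selection, the one-point crossover, the resampling mutation and all default rates are assumptions chosen only for illustration.

```python
import random


def genetic_minimise(fitness, n_genes, bounds, pop_size=30, generations=60,
                     crossover_rate=0.8, mutation_rate=0.1, seed=0):
    """Minimise `fitness` over real-valued chromosomes (illustrative sketch)."""
    rng = random.Random(seed)
    low, high = bounds
    # Each chromosome is a list of genes drawn at random within the bounds.
    population = [[rng.uniform(low, high) for _ in range(n_genes)]
                  for _ in range(pop_size)]
    best = min(population, key=fitness)

    for _ in range(generations):
        def select():
            # Binary tournament: the fitter of two random chromosomes wins.
            a, b = rng.sample(population, 2)
            return a if fitness(a) < fitness(b) else b

        offspring = []
        while len(offspring) < pop_size:
            p1, p2 = select(), select()
            child = list(p1)
            if n_genes > 1 and rng.random() < crossover_rate:
                # One-point crossover between the two parents.
                cut = rng.randrange(1, n_genes)
                child = p1[:cut] + p2[cut:]
            # Mutation: occasionally resample a gene within the bounds.
            child = [rng.uniform(low, high) if rng.random() < mutation_rate
                     else gene for gene in child]
            offspring.append(child)

        # Elitism: the best chromosome found so far always survives.
        offspring[0] = list(best)
        population = offspring
        best = min(population, key=fitness)
    return best


# Hypothetical test problem: recover a known three-gene target vector.
TARGET = [3.0, 7.0, 11.0]


def squared_error(chromosome):
    return sum((g - t) ** 2 for g, t in zip(chromosome, TARGET))


best_chromosome = genetic_minimise(squared_error, n_genes=3, bounds=(0.0, 20.0))
```

Because of the elitism step, the best fitness can never get worse from one generation to the next. How close `best_chromosome` comes to `TARGET` depends on the random seed and on the settings.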
MATERIALS AND METHODS
Data collection and treatment: The brewing fermentations
were performed in batch mode using a Brunswick microferm fermentor (New
Brunswick Scientific Co. Inc., New Jersey, USA). This microfermentor has
a volume of about 15 L; its vessel has an internal diameter of 10 cm and
a height of 50 cm.

Fig. 1: Flowchart of genetic algorithm process
The wort was produced by crushing the malt into coarse flour, which was
then mixed with water. The resulting porridge-like mash was heated to
a selected temperature that permitted the malt enzymes to partially solubilize
the ground malt. The resulting sugar-rich aqueous extract (wort) was
then separated from the solids and boiled. The wort was then clarified,
cooled and poured into the vessel of the microfermentor for inoculation.
The ferment used for inoculation was produced traditionally by female
brewers, who use it to make a well-known local beer (tchapalo or dolo). This
ferment is in fact a mixture of microorganisms containing different species
(e.g., Saccharomyces cerevisiae and Candida). During the fermentation
process (t = 0 to 18 h), different parameters (pH, biomass, substrate,
ethanol, carbon dioxide) were measured. The responses considered in this
study, biomass, substrate and alcohol (ethanol), were determined by a
gravimetric method, a refractometric method and a refractometric method
after distillation, respectively (Thonart, 2001).
For the purpose of this study, 30 batches were performed.
Statistical analysis: In order to eliminate possible outlier fermentations,
and because of the trilinear form of the data (batches × time × responses), a
PARAFAC analysis was performed.
PARAFAC is a multiway decomposition method originating from psychometrics
(Rutledge and Bouveresse, 2007; Khayamian, 2007). It is gaining more interest
in chemometrics and associated areas for many reasons: simply increased
awareness of the method and its possibilities, the increased complexity
of the data dealt with in science and industry and increased computational
power (Geladi, 1989; Rutledge and Bouveresse, 2007).
PARAFAC decomposes the array into sets of scores and loadings, which
describe the data in a more condensed form than the original data array.
It conceptually can be compared to bilinear PCA, or rather it is one generalization
of bilinear PCA (Smilde, 1992; Bouveresse et al., 2007). This technique
is designed to decompose higher order data tables (e.g., cubes), again
to reveal the underlying, latent phenomena for the purpose of data analysis
and predictions.
The data are decomposed into trilinear components, but
instead of one score vector and one loading vector as in bilinear PCA,
each component consists of one score vector and two loading vectors (Bouveresse
et al., 2007). It is common three-way practice not to distinguish
between scores and loadings, as these are treated equally numerically.
Therefore, a PARAFAC model of a three-way array is given by three loading
matrices, A, B and C, with elements a_{if}, b_{jf} and
c_{kf}. The trilinear model is found by minimising the sum of squares
of the residuals, e_{ijk}.
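In standard PARAFAC notation, the trilinear model built from the loadings a_{if}, b_{jf} and c_{kf} and the residuals e_{ijk} is usually written as shown here. The symbols x_{ijk} (element of the three-way data array) and F (number of components) do not appear in the text and are introduced only for this illustration.

```latex
x_{ijk} = \sum_{f=1}^{F} a_{if}\, b_{jf}\, c_{kf} + e_{ijk},
\qquad
\min_{A,\,B,\,C} \; \sum_{i}\sum_{j}\sum_{k} e_{ijk}^{2}
```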
The advantage of the PARAFAC model is the uniqueness of the solution,
meaning that no restrictions are necessary to estimate the model
apart from trivial variations of scale and column order (Bouveresse et
al., 2007). Therefore, the true and estimated models must coincide
when the right number of components is chosen.
Leurgans et al. (1993) have shown that unique solutions can be
expected if the loading vectors are linearly independent in two of the modes
and if, in the third mode, the less restrictive condition holds that
no two loading vectors are linearly dependent. In PARAFAC, one does not
deflate the array, because the trilinear model calculated simultaneously
for all components can be shown to fit the array better than if the components
were calculated successively, as is possible in PCA (Mortensen and Bro,
2006; Leurgans et al., 1993). As a consequence, extracting too
many components means not only that noise is being increasingly modelled,
but also that the true factors are being modelled by more (correlated)
components.
In this study, calculations were performed using Matlab R2007b (MathWorks
Inc., Massachusetts, USA) software.
Process description: Batch fermentation refers to a partially
closed system in which most of the material required for the growth
and maintenance of the microorganisms is loaded before the process starts.
Conditions during fermentation change continuously with time, so that the
fermentor is an unsteady-state system (Roeva et al., 2004).
In alcoholic brewing fermentation, the microorganism (biomass) concentration
is the central feature affecting the rates of growth, substrate consumption
and product formation (Trelea et al., 2004). Growth and alcohol
formation rates vary with time because they depend on the present state
of the batch, which is characterised by biomass, substrate and product
concentrations, oxygen tension and culture conditions (Roeva et al.,
2004; Thonart, 2001).
The model developed herein is based on the assumptions that:
• The bioreactor is completely mixed.
• The substrate is consumed mainly oxidatively and its consumption
  can be described by Monod kinetics.
• The ethanol production is assumed to be directly linked to the biomass
  formation.
• Variations in the growth rate, ethanol production and substrate
  consumption do not significantly change the elemental composition
  of biomass.
The rates of cell growth, substrate consumption and ethanol formation, as
well as the carbon dioxide concentration, in batch fermentation are commonly
described by mass balances (Thonart, 2001).
The specific growth rate μ is generally found to be a function of
three factors: the concentration of the limiting substrate, the maximum growth
rate μ_{max} and the substrate-specific constant K_{s}.
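The dependence of μ on these three factors is the classical Monod relation, μ = μ_{max} S / (K_{s} + S), which is consistent with the Monod kinetics assumed above. A minimal sketch follows; the function name and the numerical values are hypothetical and are not taken from the study.

```python
def monod_growth_rate(s, mu_max, k_s):
    """Specific growth rate (h^-1) from the Monod relation.

    s      : limiting substrate concentration (g/L)
    mu_max : maximum growth rate (h^-1)
    k_s    : substrate-specific constant (g/L)
    """
    if s < 0 or k_s <= 0:
        raise ValueError("expected s >= 0 and k_s > 0")
    return mu_max * s / (k_s + s)


# Hypothetical values: when S equals K_s the rate is half of mu_max.
half_rate = monod_growth_rate(s=2.0, mu_max=0.5, k_s=2.0)
```

At high substrate concentration the rate tends towards μ_{max}, and it falls to zero as the substrate is exhausted.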
When realistic aspects of the process are taken into account (substrate
limitation, ethanol and substrate inhibition, lag phase…), the model
consists of three groups of equations: those of the lag phase, those of
the fermentation phase and those defining the rate parameters.
This model was developed by Andrés-Toro et al. (2004), who
adjusted it to laboratory-scale experimental data. It takes into account three
components of the biomass (lag, active and dead cells) and considers the
active cells as the only fermentation agent. The model also includes sugar
and ethanol concentrations and two important by-products of the fermentation
that degrade beer quality: ethyl acetate and diacetyl.
The system identification is achieved by determining parameters such as
K_{m}, K_{a}, μ_{x0}, μ_{d0}…
RESULTS AND DISCUSSION
Outlier determination: As described earlier, a PARAFAC decomposition
was made and the scores and loadings for time, batches and responses were
plotted. The loadings (B) concerning batches are shown in Fig. 2 with
the 95% confidence ellipse.
Fig. 2 shows that most of the batches are grouped
together inside the ellipse contour. However, 2 batches
are isolated from the others and lie outside this ellipse. These outlier
(Johnson and Wichern, 1992) batches were removed from the data set prior to
process identification.
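The text does not state how the 95% confidence ellipse was constructed. One common construction, assumed here purely for illustration, flags a point as an outlier when its squared Mahalanobis distance from the mean exceeds the 95% quantile of a chi-square distribution with two degrees of freedom. The function name and the data set below are hypothetical.

```python
import math


def outside_confidence_ellipse(points, level=0.95):
    """Return the indices of 2-D points lying outside the `level` ellipse."""
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    # Sample covariance matrix [[sxx, sxy], [sxy, syy]].
    sxx = sum((p[0] - mean_x) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - mean_y) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mean_x) * (p[1] - mean_y) for p in points) / (n - 1)
    det = sxx * syy - sxy ** 2
    # Chi-square quantile with 2 degrees of freedom has a closed form.
    threshold = -2.0 * math.log(1.0 - level)
    flagged = []
    for index, (x, y) in enumerate(points):
        dx, dy = x - mean_x, y - mean_y
        # Squared Mahalanobis distance using the inverse 2x2 covariance.
        d2 = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
        if d2 > threshold:
            flagged.append(index)
    return flagged


# Hypothetical loadings: a compact group of 20 batches and one distant batch.
loadings = [(1, 1), (1, -1), (-1, 1), (-1, -1)] * 5 + [(10, 10)]
outlier_indices = outside_confidence_ellipse(loadings)
```

Because the outliers themselves inflate the covariance estimate, a robust estimate or an iterative removal may be preferable when several batches are suspect.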
Parameter identification: Identifying the parameters is a
very difficult task. In general, conventional methods, such as the
simplex method, are local optimisation methods, and many of them rely on
gradient determination, which requires the functions to be differentiable.
Moreover, conventional search techniques are often incapable of optimising
nonlinear multimodal functions. In such cases, a random and global search
method may be required. Genetic algorithms do not need much knowledge about
the problem to be optimised. They work with codes, which represent the
problem parameters (i.e., μ_{x0}, μ_{s0}, K_{s}, μ_{a0}, K_{a},
μ_{lag}, K_{m}, μ_{d0}).
The differential equations (6-13), the parameter equations (14-18) and the
initial values were implemented in a script in the Matlab R2007b (MathWorks
Inc., Massachusetts, USA) environment. The differential equations were solved
using the 4th-order Runge-Kutta algorithm at specified times (e.g., 0, 1, 2,
3, 4, 5 … h).
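A minimal sketch of classical 4th-order Runge-Kutta integration at specified times is given here. It is applied to a deliberately simplified two-state batch (Monod growth with a constant yield), not to the model identified in this study. The function names and every numerical value are hypothetical.

```python
def rk4_solve(f, y0, t_points):
    """Integrate dy/dt = f(t, y) with classical RK4, one step per interval."""
    states = [list(y0)]
    for t0, t1 in zip(t_points, t_points[1:]):
        h = t1 - t0
        y = states[-1]
        k1 = f(t0, y)
        k2 = f(t0 + h / 2, [yi + h / 2 * k for yi, k in zip(y, k1)])
        k3 = f(t0 + h / 2, [yi + h / 2 * k for yi, k in zip(y, k2)])
        k4 = f(t1, [yi + h * k for yi, k in zip(y, k3)])
        states.append([yi + h / 6 * (a + 2 * b + 2 * c + d)
                       for yi, a, b, c, d in zip(y, k1, k2, k3, k4)])
    return states


# Hypothetical toy batch: biomass X grows on substrate S with Monod kinetics.
MU_MAX, K_S, YIELD = 0.2, 5.0, 0.5   # h^-1, g/L, g biomass per g substrate


def toy_batch(t, y):
    x, s = y
    mu = MU_MAX * s / (K_S + s)
    return [mu * x, -mu * x / YIELD]


hours = list(range(19))               # 0, 1, 2, ... 18 h
trajectory = rk4_solve(toy_batch, [0.1, 10.0], hours)
```

The step size equals the spacing of the requested times. If the dynamics are fast relative to that spacing, each interval should be subdivided into smaller steps.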

Fig. 2: Batches projection on PARAFAC components
Table 1: Initial conditions of genetic algorithms

In order to optimise the batch fermentation model, another script
containing the necessary instructions for the genetic algorithm toolbox
was developed. In this script, some initial parameters needed by
the genetic algorithm toolbox were defined, e.g., number of individuals,
maximum number of generations, crossover and mutation rates, selection function…
In order to implement the genetic algorithm, the model parameters (e.g.,
K_{s}, K_{m}, μ_{lag}, μ_{d0}…)
have to be represented as chromosomes. Decimal numbers ranging from 0 to
20 were used as parameter values for this purpose.
Each chromosome corresponds to a possible solution of the objective function.
Generally, this function is expressed as the modelling error, i.e., the
mean square deviation between the model output and the corresponding data
obtained during the fermentation. The optimisation criteria are therefore
the mean square deviations between Y_{c}, the calculated value, and
Y_{o}, the corresponding observed one, for biomass, substrate and alcohol,
respectively.
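The three criteria can be sketched as follows, assuming that each one is the plain average of the squared differences between calculated and observed values. The exact expression used in the study is not reproduced in this text, and the function names and data are hypothetical.

```python
def mean_square_deviation(calculated, observed):
    """Average of the squared differences between model and data."""
    if not calculated or len(calculated) != len(observed):
        raise ValueError("expected two non-empty series of equal length")
    return sum((c - o) ** 2 for c, o in zip(calculated, observed)) / len(observed)


def fermentation_criteria(model, data):
    """One criterion per response; `model` and `data` map names to series."""
    return {name: mean_square_deviation(model[name], data[name])
            for name in ("biomass", "substrate", "alcohol")}


# Hypothetical series, only to show the call.
example = fermentation_criteria(
    {"biomass": [1.0, 2.0, 3.0], "substrate": [9.0, 6.0], "alcohol": [0.0, 1.0]},
    {"biomass": [1.0, 2.0, 5.0], "substrate": [9.0, 5.0], "alcohol": [0.0, 1.0]},
)
```

Keeping the three criteria separate, rather than summing them, is what allows a multi-objective search to expose the trade-offs between responses.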
Initial tests were performed using the conditions shown in Table 1.
After several different runs, the conditions retained are as follows:
• Population size = 50
• No. of generations = 100

Fig. 3: Contour plot of Pareto solutions
Table 2: Values of parameters after identification

Release R2007b of Matlab includes, in its genetic algorithm toolbox, the
possibility of optimising multi-objective functions (Mathworks, 2007). The
solutions, called Pareto solutions, for the three functions (J_{Biomass},
J_{Substrate} and J_{Alcohol}) are shown as a contour plot in Fig. 3.
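A Pareto solution is one that no other candidate improves on in every criterion at once. The sketch below filters a list of (J_{Biomass}, J_{Substrate}, J_{Alcohol}) triples down to its non-dominated members. It is not the toolbox routine, and the candidate values are hypothetical.

```python
def dominates(a, b):
    """True if `a` is no worse than `b` everywhere and better somewhere."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def pareto_front(points):
    """Keep only the points that no other point dominates (minimisation)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical criteria triples (J_Biomass, J_Substrate, J_Alcohol).
candidates = [(1, 5, 3), (2, 2, 2), (3, 1, 4), (3, 3, 3), (2, 2, 2.5)]
front = pareto_front(candidates)
```

Choosing a single compromise point among the non-dominated solutions remains a judgement that the filter itself does not make.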
Glancing at Fig. 3, two optimal regions appear
for the J_{Alcohol} criterion. However, when the other criteria
(i.e., J_{Biomass} and J_{Substrate}) are taken into account,
the optimal region to consider is the lower green one, ranging from 0 to
800 for J_{Biomass} and from 0 to 200 for J_{Substrate}.
The retained point (i.e., point A in Fig. 3), which
is a good compromise between the minima of the different functions, is
situated in this zone. This point corresponds to the parameters shown in
Table 2.
In this case, the values of the criteria are:
J_{Biomass} = 44.982, J_{Substrate} = 9.724 and J_{Alcohol}
= 7.604.
Figure 4 shows the experimental and predicted values
of biomass, substrate (i.e., glucose) and ethanol (alcohol), respectively.
The different phases visible there have already been extensively described
in the literature: the lag phase, during which the microorganisms adapt to
the culture medium, and the active or exponential phase, during which the
yeasts multiply exponentially, consuming sugar and producing alcohol.

Fig. 4: Model (○) and experimental (●) data plot, (a) Biomass, (b) Substrate and (c) Alcohol
The decrease in the number of yeast cells in suspension is due to their settling to the
bottom of the fermentation tank, which confirms that this is a bottom fermentation.
The overall analysis of the predictions shown in Fig. 4(a-c)
points out the good ability of the model to predict the variation
of substrate consumption, biomass concentration and alcohol production.
Indeed, the model gives values very close to the observed ones,
whatever the response considered during the fermentation studied.
CONCLUSION
In this research, the model developed by Andrés-Toro et al. (2004)
for beer fermentation was fitted to data from a batch fermentation by a
mixture of microorganisms. The identification problem was formulated
as an optimisation problem and a multi-objective genetic
algorithm was applied to estimate the unknown parameters from the input-output space.
The resulting simulations of substrate consumption,
biomass concentration and alcohol production validate the predictions
of the genetic algorithm formalism with good accuracy. Therefore, the
multi-objective genetic algorithm methodology presented in this paper
offers a cost-effective and relatively simple alternative for process
modelling and optimisation.
NOMENCLATURE
μ : Specific growth rate (h^{-1})
μ_{a} : Alcohol production rate (h^{-1})
μ_{a0} : Specific alcohol production rate (h^{-1})
μ_{d} : Yeast settling down rate (h^{-1})
μ_{d0} : Specific yeast settling down rate (h^{-1})
μ_{lag} : Specific rate of latent formation (h^{-1})
μ_{max} : Maximum growth rate (h^{-1})
μ_{s} : Substrate consumption rate (h^{-1})
μ_{s0} : Specific substrate consumption rate (h^{-1})
μ_{x} : Yeast growth rate (h^{-1})
μ_{x0} : Specific yeast growth rate (h^{-1})
e : Concentration of the alcohol (g L^{-1})
f : Fermentation inhibitor factor
K_{a} : Alcohol inhibition parameter
K_{m} : Yeast growth inhibition parameter
K_{s} : Sugar inhibition parameter
m_{s} : Maintenance constant (g g^{-1} h^{-1})
P : Concentration of the product (g L^{-1})
S : Concentration of the substrate (g L^{-1})
S_{i} : Initial concentration of substrate (g L^{-1})
X : Concentration of biomass (g L^{-1})
x_{act} : Concentration of active biomass (g L^{-1})
x_{bot} : Concentration of bottom biomass (g L^{-1})
x_{i} : Initial concentration of biomass (g L^{-1})
x_{lag} : Concentration of latent biomass (g L^{-1})
Y_{p/x} : Yield coefficient (g g^{-1})
Y_{x/s} : Yield coefficient (g g^{-1})