INTRODUCTION
Power load forecasting is an important operational component of electric power
departments because it plays a vital role in guaranteeing the security and reliability
of a power system. Decision-making in power planning, dispatching and power
market transactions depends on load forecasts. Load change is a complex process
that is influenced by many conditions of uncertainty. From a macroscopic perspective,
however, it also exhibits recognizable regularity over certain time horizons. Many
scholars have conducted research on long-, medium- and short-term power load forecasting.
The induced ordered weighted geometric averaging operator, the weighted Markov chain
model (Mao et al., 2010) and the BP neural network
optimized by Particle Swarm Optimization (PSO) (Cui et
al., 2009) were combined for medium- and long-term load forecasting.
The Grey forecast model-based BP neural network and Markov chain were used to
forecast China's electricity demand (Li and Wang,
2007). Self-organizing neural networks (Zhao and Xu,
2010), least squares Support Vector Machines (SVM) (He
et al., 2011) and combined SVM and rough set models (Niu
et al., 2010; Li et al., 2009; Yang
et al., 2011) were used to forecast short-term power load.
Although good progress has been made in using the above-mentioned algorithms
for short-term load forecasting, neural networks and support vector machines
suffer from shortcomings such as easily falling into local extrema and over-learning.
Some scholars have attempted to establish SVM forecast models using
a Genetic Algorithm (GA) (Wu et al., 2009) or an
ant colony algorithm (Long et al., 2011) to
optimize parameters. However, GAs entail a series of relatively complex operations,
including coding, selection, crossover and mutation, whereas PSO is comparatively
simple. In the current work, therefore, this study established a short-term
load forecast model and employed PSO to optimize the core parameters of SVM.
The proposed model was analyzed and validated using actual data from a region.
MATERIALS AND METHODS
Overview of SVM regression: SVM was put forward by Vapnik on the basis
of small-sample statistical learning theory (Vapnik, 2000).
It is used primarily to study small samples under statistical learning rules
and is commonly adopted in pattern classification and nonlinear regression (Thissen
et al., 2003; Kim, 2003).
The sample data set is given as D = {(x_i, y_i) | i = 1, 2, …, n}, where
x_i ∈ R^n represents the input variables and y_i ∈ R denotes the output
variable.
The SVM algorithm seeks a nonlinear mapping φ from the input space to a
high-dimensional feature space. Through this mapping, the data x are mapped to the
feature space and linear regression is carried out in the feature space with the following
function:
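The original equation image did not survive extraction; Eq. 1 is presumably the standard SVM regression function, consistent with the threshold b and mapping φ defined around it:

```latex
f(x) = w^{T}\varphi(x) + b \qquad (1)
```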
In Eq. 1, b is a threshold value. According to statistical
learning theory, SVM determines the regression function through objective function
minimization:
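The equation image for Eq. 2 did not survive extraction; it is presumably the standard ε-SVR primal problem, consistent with the parameters C, ε, ξ_i and ξ_i* defined immediately below:

```latex
\min_{w,\,b,\,\xi,\,\xi^{*}} \; \frac{1}{2}\|w\|^{2} + C\sum_{i=1}^{n}\left(\xi_{i}+\xi_{i}^{*}\right)
\quad \text{s.t.} \quad
\begin{cases}
y_{i} - w^{T}\varphi(x_{i}) - b \le \varepsilon + \xi_{i} \\
w^{T}\varphi(x_{i}) + b - y_{i} \le \varepsilon + \xi_{i}^{*} \\
\xi_{i},\; \xi_{i}^{*} \ge 0, \quad i = 1, 2, \ldots, n
\end{cases} \qquad (2)
```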
where, C is a weight parameter for balancing model complexity against
training error, also called the penalty factor; ε is the insensitive
loss parameter; and ξ_i and ξ_i* are the relaxation (slack) factors. The
ε-insensitive loss is expressed as follows:
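The equation image for Eq. 3 did not survive extraction; it is presumably the standard ε-insensitive loss function referenced above:

```latex
|\xi|_{\varepsilon} =
\begin{cases}
0, & |\xi| \le \varepsilon \\
|\xi| - \varepsilon, & |\xi| > \varepsilon
\end{cases} \qquad (3)
```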
By solving the dual problem of Eq. 2, the Lagrange multipliers a_i and
a_i* can be obtained, so that the regression equation coefficient is:
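The equation image for Eq. 4 did not survive extraction; by the standard SVR dual solution it is presumably:

```latex
w = \sum_{i=1}^{n}\left(a_{i}-a_{i}^{*}\right)\varphi(x_{i}) \qquad (4)
```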
The SVM regression equation is as follows:
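The equation image for Eq. 5 did not survive extraction; substituting Eq. 4 into Eq. 1 and replacing the inner product with the kernel gives the standard form:

```latex
f(x) = \sum_{i=1}^{n}\left(a_{i}-a_{i}^{*}\right)K(x_{i},x) + b \qquad (5)
```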
where, K(x_i, x) is the SVM kernel function. Kernel function types include
linear kernels, polynomial kernels and radial basis functions.
The penalty factor C, the insensitive loss parameter ε and the kernel function
parameter σ determine SVM performance. Parameter σ reflects the characteristics
of the training data set, determines the complexity of the solution and affects the
generalizability of the learning machine. Parameter C determines the penalty imposed
on large fitting deviations: an excessively large value may cause over-learning,
whereas one that is too small easily results in under-learning. The optimization of
these parameters is therefore important in improving SVM performance.
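To illustrate the role of σ, the following is a minimal sketch of the radial basis function kernel, assuming the common Gaussian form; the function name and sample values are ours, not from the paper:

```python
import math

def rbf_kernel(xi, x, sigma):
    """Gaussian RBF kernel K(xi, x) = exp(-||xi - x||^2 / (2 * sigma^2)).

    A small sigma fits the training data tightly (risking over-learning);
    a large sigma smooths the solution (risking under-learning).
    """
    sq_dist = sum((a - b) ** 2 for a, b in zip(xi, x))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

# Identical points always give K = 1; distant points approach 0.
print(rbf_kernel([1.0, 2.0], [1.0, 2.0], sigma=0.5))  # 1.0
```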
Overview of PSO: PSO is an evolutionary computation technique based on swarm intelligence;
it was proposed by Kennedy and Eberhart (1995). Its
basic concept stems from the study of the foraging behavior of bird flocks.
In PSO, particles identify potential optimal solutions in a solution space.
Three attributes, namely position, speed and fitness value, express the characteristics
of the particles. The fitness value is obtained by the fitness function and
is used to express whether a particle is fit or unfit. Individual positions
are updated by tracking the individual extreme value (denoted as Pbest) and
the group extreme value (denoted as Gbest). The individual extreme value is
the optimal solution of the fitness value in a particle's own experience
and the group extreme value is the optimal solution of the fitness value in
the entire particle population.
Assuming an N-dimension search space, this study defines a population X of
n particles, X = (X_1, X_2, …, X_n).
In X, the ith particle is a position in the N-dimension search space (i.e.,
a potential solution), denoted as an N-dimension vector X_i = (X_{i1},
X_{i2}, …, X_{iN})^T. The speed of the ith particle is denoted
as V_i = (V_{i1}, V_{i2}, …, V_{iN})^T. The individual
extreme value Pbest is denoted as P_i = (P_{i1}, P_{i2}, …,
P_{iN})^T and the group extreme value Gbest is denoted as P_g = (P_{g1},
P_{g2}, …, P_{gN})^T.
In the PSO algorithm iterative process, the particle updates its own speed
and position using Eq. 6 and 7:
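The equation images for Eq. 6 and 7 did not survive extraction; they are presumably the standard inertia-weighted PSO update rules, consistent with the symbols defined immediately below:

```latex
V_{id}^{k+1} = \omega V_{id}^{k}
  + c_{1} r_{1}\left(P_{id}^{k} - X_{id}^{k}\right)
  + c_{2} r_{2}\left(P_{gd}^{k} - X_{id}^{k}\right) \qquad (6)
```

```latex
X_{id}^{k+1} = X_{id}^{k} + V_{id}^{k+1} \qquad (7)
```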
In Eq. 6, ω is the inertia weight; d = 1, 2, …,
N; i = 1, 2, …, n; k denotes the current iteration number; V_{id} is the particle
speed; c_1 and c_2 represent nonnegative constants called
acceleration factors; and r_1 and r_2 are random numbers
uniformly distributed in (0, 1).
The classical PSO algorithm features fast convergence and strong generality but
also suffers from shortcomings such as premature convergence, low search precision
and low efficiency in the late search period. Therefore, drawing on GA, PSO
variants introduce a random mutation factor into the iterative process
(Higashi and Iba, 2003) with a given probability. This
probability is used to reinitialize a particle and expand the search space.
Through this method, the algorithm is prevented from getting caught in local
extrema.
A high inertia weight value is advantageous to global search, while a low value
is beneficial to local search. To balance the global and local search ability
of PSO, this study applied a series of weight selection methods, including linearly
decreasing inertia weight (Shi and Eberhart, 1999; Jin
et al., 2006).
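The iterative process described above can be sketched as follows; this is a minimal, generic PSO with linearly decreasing inertia weight applied to a toy objective, with all function names and parameter values our own illustrative choices:

```python
import random

def pso_minimize(f, dim, n_particles=20, iters=100,
                 w_max=0.9, w_min=0.4, c1=2.0, c2=2.0,
                 lo=-5.0, hi=5.0):
    """Minimal PSO with linearly decreasing inertia weight."""
    X = [[random.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    V = [[0.0] * dim for _ in range(n_particles)]
    pbest = [x[:] for x in X]                 # individual extreme values (Pbest)
    pbest_val = [f(x) for x in X]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]   # group extreme value (Gbest)
    for k in range(iters):
        w = w_max - (w_max - w_min) * k / iters    # linearly decreasing inertia
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = random.random(), random.random()
                V[i][d] = (w * V[i][d]
                           + c1 * r1 * (pbest[i][d] - X[i][d])
                           + c2 * r2 * (gbest[d] - X[i][d]))   # Eq. 6
                X[i][d] += V[i][d]                             # Eq. 7
            val = f(X[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = X[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = X[i][:], val
    return gbest, gbest_val

# Minimize the sphere function; the swarm should approach the origin.
best, best_val = pso_minimize(lambda x: sum(v * v for v in x), dim=2)
```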
SVM penalty factor C and kernel function parameter σ optimized by
PSO: The construction of the SVM forecast model based on PSO-optimized
parameters (PSO-SVM) entails seven steps (Fig. 1).
Step 1: Initialize information, including population size, particle
positions, initial particle speeds, range of speeds, acceleration factors, penalty
factor C, kernel function parameter σ and mutation probability
Step 2: Calculate the fitness value with the input data. In this step, the SVM
kernel function is used to calculate the fitness value in the training process
Step 3: Solve for the optimal solution of the current iteration
Step 4: Evaluate whether the iteration suspension conditions, including the maximum
iteration count or the required data precision, are satisfied
Step 5: When the suspension conditions are not satisfied, Eq. 6
and 7 are used to update the particle information and Step 2 is repeated
Step 6: When the suspension conditions are satisfied, output the optimal solutions
C and σ
Step 7: Use C and σ to construct the SVM forecast model and then execute the
regression forecasting
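The seven steps above can be sketched in code. The real fitness (Step 2) would train an SVM and return its cross-validation error for a candidate (C, σ); here that evaluation is replaced by a hypothetical toy surrogate, and all bounds and constants are illustrative assumptions, not values from the paper:

```python
import random

def fitness(params):
    """Hypothetical stand-in for the SVM training error as a function of
    (C, sigma); a real implementation would train an SVM per evaluation."""
    C, sigma = params
    return (C - 10.0) ** 2 + (sigma - 0.5) ** 2  # toy landscape, minimum at (10, 0.5)

bounds = [(0.1, 100.0), (0.01, 10.0)]            # assumed search ranges for C and sigma
n, iters = 15, 80
X = [[random.uniform(lo, hi) for lo, hi in bounds] for _ in range(n)]  # Step 1
V = [[0.0, 0.0] for _ in range(n)]
pbest = [x[:] for x in X]
pbest_val = [fitness(x) for x in X]                                    # Step 2
g = min(range(n), key=lambda i: pbest_val[i])
gbest, gbest_val = pbest[g][:], pbest_val[g]                           # Step 3
for k in range(iters):                                                 # Step 4: stop after max iterations
    w = 0.9 - 0.5 * k / iters
    for i in range(n):
        for d in range(2):
            r1, r2 = random.random(), random.random()
            V[i][d] = (w * V[i][d] + 2.0 * r1 * (pbest[i][d] - X[i][d])
                       + 2.0 * r2 * (gbest[d] - X[i][d]))              # Step 5 (Eq. 6)
            X[i][d] = min(max(X[i][d] + V[i][d], bounds[d][0]), bounds[d][1])  # Eq. 7, clipped
        val = fitness(X[i])
        if val < pbest_val[i]:
            pbest[i], pbest_val[i] = X[i][:], val
            if val < gbest_val:
                gbest, gbest_val = X[i][:], val
C_opt, sigma_opt = gbest                                               # Step 6
# Step 7: C_opt and sigma_opt would now parameterize the final SVM forecast model.
```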
RESULTS
Forecast data selection and pretreatment of application case: Integral
(hourly) point load data were collected from selected areas in Fujian Province in October
2011. The data were designated as training, testing and forecast data sets.
Data for October 1-25 were used for training and those for October 26-30 were
employed for testing. On the basis of the training and testing, this study constructed
the SVM forecast model. The model was then used to forecast the October 31 data.
Finally, this study analyzed the predicted and actual data.
To obtain better convergence results, this study normalized the training, testing
and forecast data to the interval (0, 1). The entire forecasting process was coded
in MATLAB with the LibSVM (Chang and Lin, 2011) toolbox.
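The normalization step can be sketched as min-max scaling; the function names and the sample load values below are illustrative, not the paper's data:

```python
def minmax_scale(values):
    """Scale a load series into [0, 1] via (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def minmax_restore(scaled, lo, hi):
    """Map normalized forecasts back to the original load units."""
    return [s * (hi - lo) + lo for s in scaled]

loads = [820.0, 910.0, 1005.0, 1180.0, 990.0]  # illustrative hourly loads (MW)
scaled = minmax_scale(loads)
restored = minmax_restore(scaled, min(loads), max(loads))
```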
PSO and GA are heuristic algorithms, so the derived parameter optimization results
differ between runs and the predicted values and precision also fluctuate on a small
scale. Through repeated testing and analysis, however, the operational efficiency
and forecast precision of these algorithms proved generally stable.
Forecast results: Table 1 shows the actual load data, the
PSO-SVM-predicted load data, the GA-SVM-predicted load data, the traditional
SVM-predicted data, each predicted data set's errors and the Root Mean Squared
Relative Error (RMSRE). RMSRE is expressed as Eq. 8, where e_i
is the relative error:
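The equation image for Eq. 8 did not survive extraction; by the definition of RMSRE it is presumably:

```latex
\mathrm{RMSRE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} e_{i}^{2}},
\qquad e_{i} = \frac{\hat{y}_{i} - y_{i}}{y_{i}} \qquad (8)
```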
Errors and RMSRE are important evaluation indicators of forecast results. The
smaller the values obtained, the more accurate the model forecast.
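This evaluation indicator can be computed directly from the definition above; the sample figures here are illustrative, not the paper's data:

```python
import math

def rmsre(actual, predicted):
    """Root mean squared relative error over paired load series."""
    rel_errors = [(p - a) / a for a, p in zip(actual, predicted)]
    return math.sqrt(sum(e * e for e in rel_errors) / len(rel_errors))

actual = [1000.0, 1100.0, 1050.0]     # illustrative loads, not the paper's data
predicted = [990.0, 1122.0, 1029.0]   # each within about 2% of actual
print(round(rmsre(actual, predicted), 4))  # 0.0173
```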
The contrasting results for the actual, PSO-SVM, GA-SVM and SVM data are shown
in Fig. 2a. The contrasts in the errors of the three forecast models
are shown in Fig. 2b.

Fig. 1: Iteration algorithm of PSO to optimize SVM's penalty factor C and kernel function parameter σ
Table 1: Integral point load data and analysis of forecast results, 31 October 2011
Fig. 2(a-b): Contrast of the original data with the forecast results (by PSO-SVM, GA-SVM and SVM) and contrast of the three forecast models' errors
DISCUSSION
According to Li et al. (2009), when the absolute
value of an error is smaller than 3%, the forecast results can be considered
ideal.
As shown in Table 1 and Fig. 2b, of the
24 errors identified for PSO-SVM, 18 were smaller than 3%, with the smallest
error amounting to only 0.178%. The largest error was observed on 31 October
2011 at 18:00, at a value of 5.65%. Furthermore, two errors approached 3% and
three amounted to about 4%. These results are better than those derived via
GA-SVM (16 errors were smaller than 3%, with the smallest at 0.244% and the
largest at 6.22%) and traditional SVM (17 errors were smaller than 3%, with
the smallest and largest being 0.24 and 6.88%, respectively).
The RMSRE values of PSO-SVM, GA-SVM and SVM were 0.0244, 0.0277 and
0.0295, respectively, so all the forecast results are ideal and follow
the ranking PSO-SVM > GA-SVM > SVM. During the testing, the efficiency of
PSO was higher than that of GA: the operation time of PSO-SVM was 92.786364
sec, while that of GA-SVM was 271.097241 sec. The parameter selection for traditional
SVM depends largely on practical experience and an inappropriate parameter causes
a large prediction error. In this study, the traditional SVM forecast yielded better
results only through frequent adjustment of parameters C and σ.
For different training and forecast data, the parameters should not be fixed.
Dynamically adjusting the SVM parameters according to data characteristics yields
results that correspond more closely with actual load behavior. This advantage also
emphasizes the necessity of using intelligent algorithms such as PSO or GA to select
SVM parameters from the sample data.
CONCLUSION
• Parameter selection is important for SVM forecasting. Traditional
SVM suffers from problems such as over-learning or under-learning, which
diminish algorithm performance and affect forecast precision
• Using PSO to optimize the SVM penalty factor and kernel function
parameter yielded good results. PSO-SVM also exhibited more efficient operation
than did GA-SVM
• The proposed model, validated against actual data, enables good forecast results,
making it a valid model. However, because power load is affected by many
external factors, a multifactor forecast model should be explored
ACKNOWLEDGMENTS
This project was supported by the National Natural Science Fund of China (No. 71071054)
and the Fundamental Research Funds for the Central Universities of China (No.
11QR34).