INTRODUCTION
There has been great interest in ultra-wideband (UWB) technology in recent years (Win and Scholtz, 1998; Tingting et al., 2009) because of its low transmit power, high transmission rate and high system capacity. In Impulse Radio UWB (IR-UWB) systems, a sequence of very short pulses, called monocycles, is used to obtain extremely large transmission bandwidths (Nie and Chen, 2008). Because the pulses are so short, synchronization is an important problem to be solved. In realistic conditions, however, UWB signals with low power spectral density are attenuated and dispersed by multipath propagation before they are received, which makes synchronization difficult to achieve. Fast and reliable synchronization is therefore the key to normal communication, as well as a challenge for UWB applications.
There has been considerable research on synchronization algorithms for UWB systems. Synchronization algorithms based on the non-coherent receiver can achieve high performance. However, the template signal in the receiver is hard to design, because the signal undergoes both linear and nonlinear distortion during transmission (He and Tepedelenlioglu, 2006). Cardoso (2003) proposed a synchronization algorithm based on the dirty template. The algorithm has low complexity, at the expense of synchronization performance. In Maravic and Vetterli (2003), frequency-domain subspace decomposition was used for non-coherent signal synchronization. The algorithm achieves synchronization with high speed and low complexity, but its performance is limited because it does not make full use of the statistical information of the received signal.
In this study, a fine symbol synchronization algorithm using training bits is proposed. Based on the features of the UWB signal in the frequency-domain subspace, the algorithm uses Particle Swarm Optimization to achieve signal synchronization.
SYSTEM MODELING
In an IR-UWB system, monocycles of nanosecond scale are used to transmit information over extremely large transmission bandwidths. The research was carried out on a UWB system with DS-PAM modulation (Benedetto and Giancola, 2004).
In the single-user case, the monocycle g(t) is a pulse of duration D_{g} in the time domain. Each transmitted symbol waveform without modulation can be described by:
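The equation is not reproduced in this text. A form consistent with the definitions of N_{f}, T_{f} and c_{i} given next, and with the symbol p_{S}(t) used later for the transmitted symbol waveform, would be the following. It is a reconstruction under those assumptions and may differ from the authors' notation, for example by an amplitude factor:

```latex
p_S(t) = \sum_{i=0}^{N_f-1} c_i \, g(t - iT_f)
```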
Every information symbol is conveyed by N_{f} repeated pulses, with one pulse per frame of duration T_{f}. c_{0}, c_{1}, ..., c_{N_{f}-1} is the user's pseudo-random direct-sequence (DS) code, which enables multiple access, with c_{i} ∈ {-1, 1}. The transmitted binary PAM UWB signal can be expressed as:
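The expression itself is missing here. Assuming the usual binary PAM construction, in which each data bit scales one copy of the symbol waveform p_{S}(t), a consistent form (up to an amplitude scaling that the source may include) would be:

```latex
s(t) = \sum_{j} b_j \, p_S(t - jT_s), \qquad T_s = N_f T_f
```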
where T_{s} is the symbol duration, T_{s} = N_{f}T_{f}, and b_{j} ∈ {-1, 1} is the binary information data of the jth symbol, taking either value with equal probability.
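The DS-PAM construction described above can be sketched in discrete time as follows. This is a minimal illustration, not the authors' simulator: the sampling grid (one sample per frame time unit), the pulse width, the example DS code and the helper names `monocycle`, `symbol_waveform` and `ds_pam_signal` are all assumptions made for the example.

```python
import math

def monocycle(t, sigma):
    """Pulse proportional to the second derivative of a Gaussian, peak 1 at t = 0."""
    x = (t / sigma) ** 2
    return (1.0 - x) * math.exp(-x / 2.0)

def symbol_waveform(code, pulse, frame_len):
    """Unmodulated symbol: one code-weighted pulse at the start of each frame."""
    if len(pulse) > frame_len:
        raise ValueError("pulse must fit inside one frame")
    wave = [0.0] * (len(code) * frame_len)
    for i, c in enumerate(code):
        for n, g in enumerate(pulse):
            wave[i * frame_len + n] += c * g
    return wave

def ds_pam_signal(bits, code, pulse, frame_len):
    """Binary PAM: each bit b_j in {-1, +1} scales one copy of the symbol waveform."""
    p_s = symbol_waveform(code, pulse, frame_len)
    out = []
    for b in bits:
        out.extend(b * v for v in p_s)
    return out

# Hypothetical example values: N_f = 5 frames of 40 samples each.
code = [1, -1, 1, 1, -1]                             # assumed DS code, c_i in {-1, +1}
frame_len = 40                                       # samples per frame (assumed grid)
pulse = [monocycle(k - 4, 1.5) for k in range(9)]    # 9-sample pulse, peak at index 4
bits = [1, -1, 1]
p_s = symbol_waveform(code, pulse, frame_len)
s = ds_pam_signal(bits, code, pulse, frame_len)
```

Each symbol occupies N_{f} × frame_len samples, and the second symbol (b = -1) is simply the negated symbol waveform.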
Based on the Saleh-Valenzuela (S-V) model, the IEEE 802.15.4a group for sensor networks proposed a channel model. The Channel Impulse Response (CIR) can be expressed as follows:
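The CIR expression is not reproduced in this text. A tapped-delay-line form consistent with the path gains and delays defined next would be the following, writing the continuous path delay as σ_{l} (the quantity that the text later samples into τ_{l}). This is a reconstruction and omits any cluster structure the full S-V model may carry:

```latex
h(t) = \sum_{l=0}^{L-1} \alpha_l \, \delta(t - \sigma_l)
```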
where L is the number of paths, δ(t) is the impulse function, and α_{l} and σ_{l} are the gain and the delay of the lth path. σ_{0} = 0, since the transmission delay is not considered here.
In the multi-user case, the signal received after transmission through the multipath channel can be expressed as:
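The received-signal expression is missing from this text. Using the terms defined immediately afterwards, and ignoring any timing offset (which the text introduces later as the synchronous deviation), a consistent form would be the following reconstruction:

```latex
r(t) = \sum_{j} b_j \, p_R(t - jT_s) + w(t) + m(t)
```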
where p_{R}(t) = p_{S}(t) * h(t) is the waveform that p_{S}(t) becomes after passing through the channel, * denotes convolution, w(t) is the multi-user interference and m(t) is the thermal noise.
When the number of users is large enough and the power of each user is of the same order, the multi-user interference is, according to the central limit theorem, a zero-mean Gaussian random process (Win and Scholtz, 1998, 2000).
As a result, Eq. 4 can be written as:
where u(t) = w(t) + m(t) is zero-mean Additive White Gaussian Noise (AWGN) with a two-sided power spectral density of N_{0}/2.

Fig. 1: Schematic diagram of the synchronous deviation
A transmission delay exists when the signal is transmitted. After coarse synchronization, a synchronous deviation ε remains, taking real values in the range 0 ≤ ε < T_{s}, as shown in Fig. 1. The role of the proposed synchronization algorithm is to obtain the synchronization parameter ε.
SYNCHRONIZATION ALGORITHM
Here, a synchronization algorithm based on frequency-domain subspace decomposition (FD-subspace) is first proposed. Then, on the basis of FD-subspace, an optimized synchronization algorithm using Particle Swarm Optimization (PSO-FD-subspace) is proposed.
Frequency domain subspace decomposition realization of UWB synchronization: According to Fig. 1, when K+1 training bits of value 1 are sent, the received waveform of the ith bit can be expressed as r_{i}(t), where (i-1)T_{s} ≤ t < iT_{s}. Sampled with frequency f_{c}, r_{i}(t) becomes r_{i}[n], which can be described by:
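The sampled expression is not reproduced in this text. Because all training bits equal 1, each observation window contains a cyclically shifted copy of the channel-filtered symbol waveform, so a form consistent with the definitions that follow would be the reconstruction below. It assumes the channel delay spread is shorter than one symbol:

```latex
r_i[n] = \sum_{l=0}^{L-1} \alpha_l \, p_S\big[(n - \eta - \tau_l)_N\big] + u[n], \qquad n = 0, 1, \ldots, N-1
```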
where p_{S}[n], h[n] and u[n] are the waveforms p_{S}(t), h(t) and u(t) after sampling with frequency f_{c}, with N sampling points. N = [T_{s}×f_{c}], η = [ε×f_{c}] and τ_{l} = [σ_{l}×f_{c}], where [x] denotes rounding x down. p_{S}[(n-η)_{N}] is the cyclic shift of p_{S}[n] by η samples.
After an N-point DFT, Eq. 6 becomes:
where R_{i}[k], P_{S}[k] and U[k] are the N-point DFTs of r_{i}[n], p_{S}[n] and u[n]. W_{N} can be expressed as:
H_{i} [k] is defined as:
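The DFT step relies on the cyclic-shift property: a shift by η + τ_{l} samples multiplies the spectrum by W_{N}^{k(τ_{l}+η)}, with W_{N} = exp(-j2π/N). The sketch below checks this numerically in the noise-free case, assuming H[k] is the ratio R[k]/P_{S}[k] (the definition is not reproduced in this text, so that ratio, the example waveform and the path values are all assumptions).

```python
import cmath
import math

def dft(x):
    """Direct N-point DFT: X[k] = sum_n x[n] * W_N^(k*n), W_N = exp(-2j*pi/N)."""
    n_pts = len(x)
    w = cmath.exp(-2j * math.pi / n_pts)
    return [sum(x[n] * w ** (k * n) for n in range(n_pts)) for k in range(n_pts)]

N = 16
eta = 3                              # hypothetical synchronous deviation in samples
paths = [(1.0, 0), (0.5, 4)]         # hypothetical (alpha_l, tau_l) pairs
p_s = [1.0, 0.5, 0.25] + [0.0] * 13  # toy waveform whose DFT has no zeros

# Noise-free received window: sum of cyclically shifted, scaled copies of p_s.
r = [sum(a * p_s[(n - eta - tau) % N] for a, tau in paths) for n in range(N)]

R = dft(r)
P = dft(p_s)
H = [R[k] / P[k] for k in range(N)]  # assumed definition of H[k]

# Prediction from the shift property: H[k] = sum_l alpha_l * W_N^(k*(tau_l+eta)).
w_n = cmath.exp(-2j * math.pi / N)
H_expected = [sum(a * w_n ** (k * (tau + eta)) for a, tau in paths) for k in range(N)]
max_err = max(abs(H[k] - H_expected[k]) for k in range(N))
```

The two sequences agree to numerical precision, which is why H[k] is a sum of L complex exponentials whose phases carry the deviation η.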
Using the information in the ith training bit, define a P×Q matrix J_{i} as:
where P+Q-1 ≤ N and P, Q ≥ L. J_{i} can be written as follows:
where (•)^{H} denotes the conjugate transpose. Considering the noise-free case only, define w_{l} = W_{N}^{τ_{l}+η}; U_{i}, Λ_{i} and V_{i} can then be described as:
From the analysis above, U_{i}, Λ_{i} and V_{i} have rank L, and so does J_{i}. U_{i} and V_{i} are both Vandermonde matrices. Matrix Φ is defined as Φ = diag(w_{0}, w_{1}, ..., w_{L-1}). According to the properties of Vandermonde matrices, the following relations hold:
where (•)^{+} and (•)^{-} denote the operations of omitting the first and the last row of matrix (•), respectively.
According to Eq. 11, U_{i} and V_{i} satisfy the shift-invariant subspace property. Furthermore, the submatrices obtained from U_{i} and V_{i} by omitting their first m rows or last m rows have full column rank and satisfy the shift-invariant subspace property if P-m ≥ L and Q-m ≥ L. The corresponding operators omit the first m rows or the last m rows of matrix (•). The synchronous deviation can then be obtained by calculating the phase angle of the smallest eigenvalue of the matrix that relates these two row-reduced submatrices.
In the noise-free case, the estimate η̂_{i} of the synchronous deviation η obtained from the ith training bit can be described as:
where (•)^{†} is the pseudo-inverse of matrix (•) and eig(•) denotes the eigenvalues of matrix (•). The remaining operator denotes calculation of the phase angle.
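The shift-invariance idea can be illustrated with a small noise-free sketch for L = 2 paths, P = 3 and Q = 2. With these sizes the row-reduced matrices are 2×2, so the pseudo-inverse reduces to an ordinary inverse, and the columns of the Hankel matrix J itself serve as a basis of the signal subspace. The Hankel layout J[p][q] = H[p+q], the path values and the reading of "smallest eigenvalue" as the eigenvalue with the smallest delay (all eigenvalues have unit modulus here) are assumptions of this example, not details confirmed by the text.

```python
import cmath
import math

def inv2(m):
    """Inverse of a 2x2 complex matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def matmul2(x, y):
    """Product of two 2x2 matrices."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def eig2(m):
    """Eigenvalues of a 2x2 matrix from its characteristic polynomial."""
    (a, b), (c, d) = m
    tr, det = a + d, a * d - b * c
    root = cmath.sqrt(tr * tr - 4 * det)
    return [(tr + root) / 2, (tr - root) / 2]

N = 64
eta_true = 5                      # hypothetical deviation in samples
paths = [(1.0, 0), (0.6, 7)]      # hypothetical (alpha_l, tau_l), with tau_0 = 0
w_n = cmath.exp(-2j * math.pi / N)

# Noise-free H[k] = sum_l alpha_l * w_l^k with w_l = W_N^(tau_l + eta).
H = [sum(a * w_n ** (k * (tau + eta_true)) for a, tau in paths) for k in range(4)]

J = [[H[p + q] for q in range(2)] for p in range(3)]   # 3x2 Hankel matrix
J_minus = J[:-1]                  # last row omitted
J_plus = J[1:]                    # first row omitted

# Shift invariance: J_plus = J_minus * Psi, and eig(Psi) = {w_0, w_1}.
psi = matmul2(inv2(J_minus), J_plus)
delays = [(-N * cmath.phase(lam) / (2 * math.pi)) % N for lam in eig2(psi)]
eta_hat = min(delays)             # the path with tau_0 = 0 gives the deviation
```

In this setting the two recovered delays are η and η + τ_{1}, and the smaller one is taken as the deviation estimate.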
Taking noise into consideration, after the singular value decomposition, J_{i} can be expressed as:
where U_{Ri} and V_{Ri} are the matrices of left and right singular vectors of J_{i}. Λ_{Ri} is a P×Q matrix, which can be expressed as:
where Λ_{Si} is an L×L diagonal matrix composed of the L largest singular values. U_{Si} is defined as the first L columns of U_{Ri}. The signal subspace of matrix J_{i} can then be estimated by U_{Si}.
According to the analysis above, U_{i} in Eq. 10 can be replaced by U_{Si} when synchronization is performed. When noise is present, the estimate η̂_{i} of the synchronous deviation η obtained from the ith training bit can be described as:
Taking the expectation of the per-bit estimates η̂_{i}, the final estimate of the synchronous deviation can be expressed as:
where, K+1 is the length of the training bits.
Optimized frequency domain subspace decomposition UWB synchronization based on PSO: Particle Swarm Optimization (PSO) is an optimization method originally inspired by the social behavior of bird flocking (Kennedy and Eberhart, 1995). In PSO, individuals called particles fly through the solution space along certain trajectories. Guided by its own experience and that of its neighbors, each particle gradually flies into the region of the global optimum.
In n-dimensional space, X_{i} = (x_{i1}, x_{i2}, ..., x_{in}) is the position vector of the ith particle and V_{i} = (v_{i1}, v_{i2}, ..., v_{in}) is its velocity vector. P_{i} = (p_{i1}, p_{i2}, ..., p_{in}) and P_{g} = (p_{g1}, p_{g2}, ..., p_{gn}) represent the individual best position of the ith particle and the global best position discovered by the whole swarm.

Fig. 2: Flow chart of PSO
Using the fitness function fitness = f(X_{i}), P_{i} and P_{g} can be calculated. PSO can be described by the following update equations:
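The update equations are not reproduced in this text. The classic form due to Kennedy and Eberhart (1995), written with the symbols defined here, is shown below as a reconstruction. The source may additionally include an inertia weight or a constriction factor on the velocity update, which this form omits:

```latex
v_{ij}(t+1) = v_{ij}(t) + c_1 r_1 \big(p_{ij}(t) - x_{ij}(t)\big) + c_2 r_2 \big(p_{gj}(t) - x_{ij}(t)\big)
x_{ij}(t+1) = x_{ij}(t) + v_{ij}(t+1)
```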
where i and j index the ith particle and the jth dimension, and t is the iteration number. c_{1} and c_{2} are constants called the personal and global acceleration coefficients. r_{1} and r_{2} are uniform random numbers in the range [0, 1]. Bratton and Kennedy (2007) proposed an improved set of PSO parameter settings, called Standard Particle Swarm Optimization (SPSO). In this study, the parameters are set as in SPSO. The flow chart of PSO is shown in Fig. 2.
Based on the analysis above, an optimized synchronization algorithm using Particle Swarm Optimization (PSO-FD-subspace) is proposed. PSO is used to optimize the estimates η̂_{i} obtained from the training bits, with a fitness function based on the least mean square criterion. The final estimate of the synchronous deviation can be obtained by the following equation:
where the candidate value of the deviation is the intermediate variable in the iteration. The fitness function can be written as:
As a result, the synchronization problem is transformed into finding the minimum fitness value with PSO.
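A one-dimensional PSO search of this kind can be sketched as follows. Several elements are assumptions made for illustration: the fitness is taken to be the mean squared deviation of a candidate from the per-bit estimates (the exact fitness expression is not reproduced in this text), the per-bit estimates are invented numbers, the constriction settings χ = 0.72984 and c_{1} = c_{2} = 2.05 follow the values commonly associated with Bratton and Kennedy (2007), and a global-best topology is used for brevity.

```python
import random

def pso_minimize(fitness, lo, hi, n_particles=20, n_iter=200, seed=0):
    """Minimise a 1-D fitness function with constriction-factor PSO (global best)."""
    rng = random.Random(seed)
    chi, c1, c2 = 0.72984, 2.05, 2.05          # assumed SPSO-style settings
    x = [rng.uniform(lo, hi) for _ in range(n_particles)]
    v = [0.0] * n_particles
    p_best = x[:]                               # personal best positions
    p_val = [fitness(xi) for xi in x]
    g = min(range(n_particles), key=lambda i: p_val[i])
    g_best, g_val = p_best[g], p_val[g]         # global best position and value
    for _ in range(n_iter):
        for i in range(n_particles):
            r1, r2 = rng.random(), rng.random()
            v[i] = chi * (v[i]
                          + c1 * r1 * (p_best[i] - x[i])
                          + c2 * r2 * (g_best - x[i]))
            x[i] = min(max(x[i] + v[i], lo), hi)    # keep inside the search range
            val = fitness(x[i])
            if val < p_val[i]:
                p_best[i], p_val[i] = x[i], val
                if val < g_val:
                    g_best, g_val = x[i], val
    return g_best

# Hypothetical per-bit deviation estimates (in samples), invented for the example.
per_bit = [4.8, 5.3, 5.1, 4.9, 5.2, 4.7]

def fitness(candidate):
    """Assumed least-mean-square fitness: mean squared distance to the estimates."""
    return sum((candidate - e) ** 2 for e in per_bit) / len(per_bit)

eta_pso = pso_minimize(fitness, 0.0, 64.0)
```

Under this assumed fitness the minimiser coincides with the sample mean of the per-bit estimates, so the sketch illustrates only the mechanics of the search, not the performance gain reported for PSO-FD-subspace.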
SIMULATION RESULTS
Simulations are carried out using the CM1 channel model of the IEEE 802.15.3a channel modeling subcommittee (Foerster, 2003). The system parameters are set as T_{s} = 200 nsec, T_{f} = 40 nsec and N_{f} = 5. The second derivative of a Gaussian function is chosen as the monocycle pulse, with duration T_{m} = 0.6 nsec. The performance of the proposed algorithm is shown through two experiments.
The first experiment shows the performance of FD-subspace for different lengths of training bits. In the figure, the normalized mean square error (NMSE) is the ordinate and the signal-to-noise ratio (SNR), expressed as E_{b}/N_{0}, is the abscissa. Compared with the SVD synchronization algorithm proposed by Maravic and Vetterli (2003), FD-subspace achieves better synchronization. Fig. 3 shows that the performance of FD-subspace improves as the length of the training bits increases. However, the improvement is slow once the number of training bits exceeds 60. As a result, 60 is taken as the optimum number of training bits for FD-subspace.
The second experiment shows the performance of PSO-FD-subspace for different lengths of training bits, as shown in Fig. 4. The figure shows that PSO-FD-subspace achieves better synchronization than the SVD synchronization algorithm. The performance of PSO-FD-subspace also improves as the length of the training bits increases, and the improvement is again slow once the number of training bits exceeds 60, so 60 is also taken as the optimum number of training bits for PSO-FD-subspace.

Fig. 3: Performance of FD-subspace for different lengths of training bits

Fig. 4: Performance of PSO-FD-subspace for different lengths of training bits
The performance curves of FD-subspace and PSO-FD-subspace in Fig. 3 and Fig. 4 follow a similar trend. Representative differences between the two synchronization algorithms are shown in detail in Fig. 5. For the same length of training bits, the NMSE performance of PSO-FD-subspace is better than that of FD-subspace. The comparison shows that PSO-FD-subspace can perform synchronization with fewer training bits and therefore outperforms FD-subspace.

Fig. 5: NMSE versus E_{b}/N_{0} for different synchronization algorithms
CONCLUSIONS
In this study, a fine symbol synchronization algorithm based on frequency-domain subspace decomposition (FD-subspace) is introduced. With the help of PSO, an optimized synchronization algorithm built on FD-subspace (PSO-FD-subspace) is proposed. Both achieve good synchronization performance with low complexity. Compared with FD-subspace, PSO-FD-subspace can perform synchronization with fewer training bits.
ACKNOWLEDGMENTS
This study is sponsored by the National Natural Science Foundation of China under grant number 60772129 and the Special Foundation Project of Harbin technological innovation under grant 2010RFQXG030.