Abstract: This study presents a novel fire detection method based on vector quantization (VQ). Before online fire detection, we generate a fire codebook and a non-fire codebook with the LBG algorithm, using training vectors selected from 10 video clips covering different scenes and conditions. For encoding convenience, the two codebooks are merged into one and the codewords are sorted in ascending order of their mean values for the subsequent Equal-average Equal-variance Equal-norm Nearest Neighbor Search (EEENNS) based fast encoding process. In the online detection process, the video to be examined is first segmented into successive frames and VQ encoding is performed to find fire-colored frames and record the grade of each fire-colored area. A moving pixel detection process is then applied to each fire-colored frame to find candidate fire frames. Finally, we verify whether a fire has occurred and grade its severity by analyzing the change in the number of blocks belonging to each grade between consecutive frames. Experimental results demonstrate the effectiveness of the proposed scheme: a 93.3% detection rate was obtained on 25 test video clips.
INTRODUCTION
A fire accident brings heavy damage to property and can even take many people's lives. In the event of a fire, certain features are suitable for fire detection. The most common are heat and smoke, depending on whether the fire is smoldering or flaming and on the type of fuel. In addition to these features, there are a number of other traits such as gases and light. Fire detection systems based on various sensors have been deployed. The use of smoke detectors has increased steadily since the mid-1970s; however, smoke detectors still need to discriminate between fires and nuisance sources. It has been verified that a system consisting of multiple gas and Taguchi sensors combined with thermocouples is capable of detecting and discriminating between flaming and smoldering fires (Hagen and Milke, 2000). More recent research has focused on detecting smoldering fires, mainly caused by cigarettes in ordinary houses, using tin oxide gas sensors (Sawada et al., 2008).
Besides sensor-based fire detection, vision-based fire detection is also a useful way to raise a fire alarm automatically. This study focuses on fire detection based on video processing, in which the camera can be viewed as a light sensor. Fire video exhibits many significant static and dynamic characteristics. Static characteristics include color, shape, spectral texture, edge irregularity and so on. Dynamic characteristics include the difference between two consecutive frames, the change in the number of fire-colored pixels, the movement of the fire-colored area, flame flickering and so on. To detect a fire in a video, many schemes must first obtain the fire area in each frame, usually by thresholding in the RGB color space or some other color space such as YCbCr, HSI and I1I2I3 (Yu et al., 2008). Alternatively, a Gaussian distribution based method can be used to obtain the general fire area (Ko et al., 2009). After detecting candidate fire areas, the next step is to remove unlikely pixels or unwanted areas. It should be noted that the intensity range of fire is similar to many natural phenomena, such as static and moving light from the sun, cars or other bright moving objects. To eliminate unwanted light, static light can be removed by comparing the fire image with the background image, moving light can be removed by applying the AND operation to the first and last frames and the remaining noise can be removed by erosion and dilation operations (Han and Lee, 2009). A modified hybrid background estimation method has been used to detect the moving area, with non-fire pixels removed based on temporal luminance variation (Ko and Nam, 2006). Recently, some researchers have studied flame flickering and the influence of the surroundings on the main flame pulsation frequency, which permits a flame to be distinguished from potential sources of false alarms (Thuillard, 2002).
Quasi-periodic behavior in flame boundaries is detected by performing temporal wavelet transform, while color variations in flame regions are detected by computing the spatial wavelet transform of moving fire-colored regions (Toreyin et al., 2006). In addition, some researchers focus on the fire that occurs in some special place such as automatic fire detection in road traffic tunnels (Aralt and Nilsen, 2009) and early fire detection for vessels (Wang et al., 2009).
Most existing schemes cannot achieve both a low false alarm rate and a high detection speed, and they do not grade the fire area. To improve these aspects, we propose a novel fire detection method based on Vector Quantization (VQ). VQ is an efficient signal compression and pattern clustering technique. In this study, we use VQ to generate a fire codebook with three grades of codewords and a non-fire codebook. Based on these codewords, we can identify whether an input pattern is a fire pattern and then grade it.
VECTOR QUANTIZATION
Because VQ is the key technique used in this study, we first give a brief introduction in this section. Vector Quantization (VQ) is a popular and efficient technique for signal compression and pattern clustering (Gray, 1984). The k-dimensional N-level vector quantizer is defined as a mapping from the k-dimensional Euclidean space Rk into a certain finite set C = {c1, c2, ..., cN}, where ci = (ci1, ci2, ..., cik) is a codeword in the codebook C and N is the codebook size. The quantizer is completely described by the codebook C together with the partitioned set S = {S1, S2, ..., SN} consisting of subspaces of Rk and the mapping function Q(•): Q(x) = ci if x∈Si. The partitioned sets Si (1≤i≤N) satisfy S1∪S2∪...∪SN = Rk and Si∩Sj = Φ if i≠j. In the encoding phase, each input vector x = (x1, x2, ..., xk) finds its best matching codeword ci in the codebook C and the index i is assigned to x, satisfying d(x, ci) ≤ d(x, cj) for all 1≤j≤N.
Only the index i is sent over the channel to the receiver. In the decoding process, for each index i, the decoder merely performs a simple table look-up operation to obtain ci and then uses ci to reconstruct the input vector x.
The two key techniques in the basic VQ system are codebook design and codeword search techniques. The first issue in the VQ system is to encode each input vector with distortion as small as possible. Obviously, the encoding performance is determined by the representativeness of the codebook. In general, VQ generates a representative codebook from a number of training vectors using the well-known iterative clustering algorithm (Linde et al., 1980) that is often referred to as the Generalized Lloyd Algorithm (GLA).
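The GLA/LBG iteration alternates a nearest-codeword partition of the training set with a centroid update of each codeword. The following is a minimal sketch; the function name and the random initialization strategy are illustrative and not specified by the original scheme:

```python
import random

def lbg_codebook(training, codebook_size, iters=20):
    """Minimal LBG/GLA sketch: alternate nearest-codeword partition
    and centroid update of the codebook."""
    codebook = random.sample(training, codebook_size)  # initial codewords
    for _ in range(iters):
        # Partition: assign each training vector to its nearest codeword.
        clusters = [[] for _ in range(codebook_size)]
        for x in training:
            best = min(range(codebook_size),
                       key=lambda i: sum((a - b) ** 2
                                         for a, b in zip(x, codebook[i])))
            clusters[best].append(x)
        # Update: replace each codeword by the centroid of its cluster
        # (empty clusters keep their previous codeword).
        for i, cluster in enumerate(clusters):
            if cluster:
                codebook[i] = tuple(sum(col) / len(cluster)
                                    for col in zip(*cluster))
    return codebook
```

In practice the iteration is stopped when the overall distortion no longer decreases significantly rather than after a fixed number of passes.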
The second issue in the VQ system is to encode each input vector as fast as possible. If the squared Euclidean distance:

d(x, ci) = (x1 - ci1)² + (x2 - ci2)² + ... + (xk - cik)²

is used to describe the distortion between x and ci, then the Full Search (FS) of the nearest codeword for each input vector requires kN multiplications, (2k-1)N additions and N-1 comparisons. For a VQ system with a large codebook size and high dimension, the computational load of the codeword search is very high. To relieve this burden, many fast codeword search algorithms have been proposed to speed up the search while keeping the reconstruction performance identical to that of the full search. Among them, the method based on three characteristic values, i.e., the mean, variance and norm of a vector, is the most representative one; it is called the Equal-average Equal-variance Equal-norm Nearest Neighbor Search (EEENNS) algorithm (Lu and Sun, 2003).
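The full search baseline can be sketched as follows; for each input vector it simply evaluates the squared Euclidean distance to every codeword (the function name is illustrative):

```python
def full_search(x, codebook):
    """Exhaustive nearest-codeword search under squared Euclidean
    distance; returns the best index and its distortion."""
    best_i, best_d = 0, float("inf")
    for i, c in enumerate(codebook):
        d = sum((xj - cj) ** 2 for xj, cj in zip(x, c))
        if d < best_d:
            best_i, best_d = i, d
    return best_i, best_d
```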
PROPOSED AUTOMATIC FIRE DETECTION SYSTEM BASED ON VQ
The flow chart of the proposed fire detection scheme is shown in Fig. 1. VQ is the central step in our scheme, thus a representative codebook must be generated from a training set before online fire detection. Here, we generated a fire codebook with 100 codewords by the LBG algorithm based on a training set of 50000 27-dimensional vectors (each vector is a 3x3 block with 3 color channels) selected from 300 frames in 10 fire video clips under different scenes and conditions. Similarly, a non-fire codebook with 200 codewords was generated from non-fire frames. The fire codebook was classified into three grades denoting different degrees of severity. For encoding convenience, we merged the two codebooks into one codebook and sorted the codewords in ascending order of their mean values for the subsequent EEENNS fast encoding process (Lu and Sun, 2003). The online fire detection process can be described briefly as follows: First, the video to be detected is segmented into successive frames and each frame is segmented into 3x3 non-overlapping blocks.
Fig. 1: The flow chart of the proposed fire detection scheme
For each block in each frame, we perform the VQ encoding process with the sorted codebook to determine whether the input block is a fire-colored block and record the grade of each fire-colored block. If the number of fire-colored blocks in a frame is larger than a certain threshold, we call this frame a fire-colored frame. If there are no fire-colored frames, the algorithm terminates without a fire alarm. Otherwise, the moving pixel detection process is performed on each fire-colored frame to determine whether it is a candidate fire frame according to the number of moving pixels in the frame. Finally, we verify whether a fire has occurred and grade the fire by analyzing the change in the number of blocks belonging to each grade between consecutive frames.
Codebook design: For fire area detection, we need a uniform codebook for various scenes and conditions. Because this codebook is designed offline, its generation time is not a concern. In many pattern recognition systems, a classification rule is necessary to detect the target area, such as in vision-based road detection (Yanqing et al., 2010) and in K-Means clustering used to improve the accuracy of decision tree response classification (Ali et al., 2009). Thus, before codebook generation, we also require a rule to choose the fire areas in video clips. Based on many experiments on various video clips, we derive the following simple rule to determine whether a pixel p = (pR, pG, pB) is a fire-colored pixel:
(1)
From Eq. 1, we can see that the rule is deliberately loose, so the selected fire areas cover all possible fire conditions, although some non-fire areas may also be included. In fact, non-fire areas can be excluded by the subsequent steps.
Based on the above rule, we can construct two training sets to generate the fire and non-fire codebooks, CF = {cF1, cF2, ..., cF100} and CNF = {cNF1, cNF2, ..., cNF200}, based on the LBG algorithm, respectively. The first set consists of 50000 27-dimensional vectors, each vector being a 3x3 fire-colored block with 3 color channels. Similarly, the second set consists of 50000 3x3 non-fire blocks. In order to use the EEENNS algorithm in the VQ encoding process, we merge the two codebooks into one codebook C = {c1, c2, ..., c300} and sort the codewords in the ascending order of their mean values.
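The merge-and-sort step can be sketched as follows. Each codeword is tagged with its origin so that, after sorting by mean value, the encoder can still tell whether a best-match codeword is a fire or non-fire codeword (the function name and labels are illustrative):

```python
def build_merged_codebook(fire_cb, nonfire_cb):
    """Merge fire and non-fire codebooks, tag each codeword with its
    origin and sort by mean value for EEENNS-style encoding."""
    tagged = [(c, "fire") for c in fire_cb] + \
             [(c, "non-fire") for c in nonfire_cb]
    # Ascending order of codeword mean value.
    return sorted(tagged, key=lambda t: sum(t[0]) / len(t[0]))
```

In the actual scheme each tagged fire codeword would additionally carry its severity grade from Eq. 2.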
In addition, to grade the severity of a fire, we need to classify the codewords in the fire codebook CF into three grades. We derive the classification rule from the following statistical analysis. First, we take 15 fire video clips and generate their RGB histograms individually according to the rule in Eq. 1. To obtain the general RGB distribution feature, we choose 10 fire video clips with both fire and non-fire areas according to Eq. 1; the results are shown in Fig. 2a-c. To obtain the RGB distribution of the fire area alone, we choose only fire areas in five fire video clips; the histograms are shown in Fig. 2d-f.
Fig. 2: The RGB distributions for different conditions. (a, b and c) are the RGB histograms for 300 frames from both fire and non-fire video clips. (d, e and f) are the RGB histograms for 100 fire frames only from fire video clips. (g, h and i) are the RGB histograms of 100 frames only with non-fire areas. (a, d, g) Red, (b, e, h) Green and (c, f, i) Blue
For comparison, we choose five video clips without fire areas, some regions of which are falsely detected as fire areas by Eq. 1; the histograms for these areas are depicted in Fig. 2g-i. As we can see, there is a dramatic difference in the red color distribution among the three cases. In the second case, shown in Fig. 2d, the red components mainly have values greater than 230. On the contrary, in the third case they mainly have values lower than 200. Therefore, the value 230 can be taken as a dividing point for the red channel. In contrast, there is no significant difference among the blue histograms, all of which have a peak around the value 80, although their smoothness differs to a certain extent. For the green channel, as shown in Fig. 2b, d and h, there are some differences among the distributions. Based on Fig. 2h, we select the value 100 as a dividing point for the green channel and choose the value 88 as another dividing point to assign pixels with small green components to another grade. Consequently, we classify the severity of each fire codeword cFj whose mean value is p = (pR, pG, pB) by the following rule:
(2)
VQ encoding based fire-colored frame decision: Once we have the sorted codebook with graded codewords, all 3x3 non-overlapping blocks in each frame of the video to be detected can be classified by searching for the best-match codeword of each input block; the block's grade is set to that of its best-match codeword. Here, the 9 pixels in each block are assigned the same grade. Then, for each frame, the number of pixels belonging to each grade is counted and recorded for the subsequent fire pixel verification.
As mentioned in the last section, the VQ encoding time should be reduced as much as possible to make the fire detection real-time, thus a fast nearest neighbor search algorithm should be adopted. To obtain the best-match codeword for each input vector as quickly as possible, we apply the EEENNS algorithm (Lu and Sun, 2003). The main idea is as follows: For an input k-dimensional vector x, its variance and norm values are given as:

vx = sqrt(((x1 - mx)² + (x2 - mx)² + ... + (xk - mx)²)/k)

and

||x|| = sqrt(x1² + x2² + ... + xk²)

respectively, where mx is the mean value of the components in x. It can be shown that the full distance calculation is necessary only for those codewords whose mean values lie in [mx - sqrt(dmin/k), mx + sqrt(dmin/k)] and whose variances lie in [vmin, vmax] = [vx - sqrt(dmin/k), vx + sqrt(dmin/k)], where dmin is the current minimum distortion; all other codewords can be rejected without computing their distances.
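The mean and variance rejection tests can be sketched as follows. This simplified illustration applies the two bounds while scanning the mean-sorted codebook linearly; a full implementation would also use the norm test and start the scan at the codeword whose mean is closest to mx. All names are illustrative:

```python
import math

def eeenns_search(x, sorted_codebook):
    """EEENNS-style search sketch over a codebook sorted by mean value.
    Codewords are rejected by mean and variance lower bounds before
    the full distance computation (norm rejection omitted)."""
    k = len(x)
    mean = lambda v: sum(v) / len(v)
    var = lambda v: math.sqrt(sum((vj - mean(v)) ** 2 for vj in v) / len(v))
    m_x, v_x = mean(x), var(x)
    best_i, d_min = -1, float("inf")
    for i, c in enumerate(sorted_codebook):
        # Mean rejection: k*(m_x - m_c)^2 is a lower bound on d(x, c).
        if k * (m_x - mean(c)) ** 2 >= d_min:
            continue
        # Variance rejection: k*(v_x - v_c)^2 is also a lower bound.
        if k * (v_x - var(c)) ** 2 >= d_min:
            continue
        d = sum((xj - cj) ** 2 for xj, cj in zip(x, c))
        if d < d_min:
            best_i, d_min = i, d
    return best_i, d_min
```

Because the rejection tests are true lower bounds on the distortion, the result is identical to that of the full search.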
Now, we turn to the decision of fire-colored frames. Assume the total number of pixels in each frame is M, the current frame to be processed is Fn (n = 1, 2, 3, ...) and the number of pixels belonging to Grade i in this frame is Mi(n) (i = 1, 2, 3). The decision rule can be expressed as follows:
Fn is a fire-colored frame if Mi(n) > τ1M for some grade i ∈ {1, 2, 3}    (3)
where, τ1 is a small threshold. In this study, we set τ1 = 0.001.
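Assuming the per-grade pixel counts have been recorded during encoding, the decision of Eq. 3 can be sketched as (the function name is illustrative):

```python
def is_fire_colored_frame(grade_counts, total_pixels, tau1=0.001):
    """Eq. 3 sketch: a frame is fire-colored when the pixel count of
    at least one grade exceeds the fraction tau1 of the frame.
    grade_counts maps grade (1, 2, 3) to its pixel count."""
    return any(count > tau1 * total_pixels
               for count in grade_counts.values())
```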
Moving pixel detection: This step checks whether the pixels in the fire-colored frames are moving. If they are not moving, we do not consider them candidate fire pixels, since some static objects, such as the sun and road lights, may have the same color features as fire. The aim of this step is to identify the general moving area rather than the accurate one, because we only need to know whether the fire-colored area obtained in the first step is a real flickering area. In this study, we detect the moving pixels by the following rule (Ko et al., 2009):
|xn(k, l) - yn(k, l)| > τ, with yn+1(k, l) = α·yn(k, l) + (1 - α)·xn(k, l)    (4)
where, xn(k, l) is the luminance value of the pixel located at (k, l) in the frame Fn (n = 1, 2, 3, ...), yn+1(k, l) is the estimated value for the pixel located at (k, l) in the next frame Fn+1, α is the weighting factor of the estimated value and τ is the threshold (we set τ = 7 in this study). Then, we binarize each fire-colored frame Fn as follows:
bn(k, l) = 1 if |xn(k, l) - yn(k, l)| > τ; bn(k, l) = 0 otherwise    (5)
where, τ is the same threshold as in Eq. 4. Obviously, a pixel with the value 1 is a moving pixel. In our experiments, we found that the binary frame obtained by Eq. 5 is not as robust as desired, so we smooth it as follows. First, we divide the frame into 4x4 blocks, each having 16 pixels. For each block, if there are more than 8 pixels with the value 1, we set all values in this block to 1; otherwise, we set them to 0. Figure 3 gives a concrete example, where Fig. 3a is the original image, Fig. 3b is the result obtained by Eq. 5 and Fig. 3c is the smoothed result of Fig. 3b.
Fig. 3: A concrete example to show the smoothing effect. (a) is the original image, (b) is the result obtained by Eq. 5 and (c) is the smoothed result of (b)
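The moving-pixel binarization and 4x4 block smoothing described above can be sketched as follows; the running-estimate update follows the style of Eq. 4, while the data layout (2-D lists of luminance values) and all names are illustrative:

```python
def detect_and_smooth(prev_est, frame, alpha=0.9, tau=7):
    """Binarize moving pixels against a running luminance estimate,
    then smooth the binary map over 4x4 blocks by majority vote.
    Returns the smoothed binary map and the updated estimate."""
    h, w = len(frame), len(frame[0])
    # Moving-pixel map: 1 where the frame deviates from the estimate.
    binary = [[1 if abs(frame[r][c] - prev_est[r][c]) > tau else 0
               for c in range(w)] for r in range(h)]
    # Running estimate for the next frame (weighted average).
    next_est = [[alpha * prev_est[r][c] + (1 - alpha) * frame[r][c]
                 for c in range(w)] for r in range(h)]
    # 4x4 block smoothing: more than 8 ones sets the whole block to 1.
    for r0 in range(0, h, 4):
        for c0 in range(0, w, 4):
            ones = sum(binary[r][c]
                       for r in range(r0, min(r0 + 4, h))
                       for c in range(c0, min(c0 + 4, w)))
            v = 1 if ones > 8 else 0
            for r in range(r0, min(r0 + 4, h)):
                for c in range(c0, min(c0 + 4, w)):
                    binary[r][c] = v
    return binary, next_est
```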
Obviously, after moving pixel detection, some of the fire-colored pixels are eliminated and the number of graded pixels will be reduced.
Now, we turn to the decision of candidate fire frames. For the current fire-colored frame Fn, assume the number of moving pixels belonging to Grade i in this frame is Mi(n)_mov (i = 1, 2, 3), the decision rule can be expressed as follows:
Fn is a candidate fire frame if Mi(n)_mov > τ2M for some grade i ∈ {1, 2, 3}    (6)
where, τ2≥τ1 is a small threshold. In this study, we set τ2 = τ1 = 0.001. Here, if Mi(n)_mov < τ2M for some Grade i, we set all the moving pixels with that grade as non-candidate fire pixels. If we use Mi(n)_con to represent the number of candidate fire pixels belonging to Grade i in the n-th frame, then we have:
Mi(n)_con = Mi(n)_mov if Mi(n)_mov ≥ τ2M; Mi(n)_con = 0 otherwise    (7)
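Equations 6-7 amount to a per-grade filter on the moving pixel counts, which can be sketched as (names are illustrative):

```python
def candidate_fire_counts(mov_counts, total_pixels, tau2=0.001):
    """Eq. 6-7 sketch: grades whose moving pixel count falls below
    tau2 * total_pixels contribute no candidate fire pixels.
    mov_counts maps grade (1, 2, 3) to its moving pixel count."""
    return {g: (n if n >= tau2 * total_pixels else 0)
            for g, n in mov_counts.items()}
```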
Fire pixel verification: After the above two phases, we have extracted candidate fire areas with different grades. Now we verify them to determine whether a fire occurs in the video and to assign the fire grade. As we know, a fire usually spreads right after it starts, so the number of fire pixels increases. After a fire has burned for some time, the spreading slows down and the number of fire pixels no longer increases monotonically; it fluctuates, decreasing at times rather than staying at a steady value. After numerous experiments, we found that if the number of frames in which the candidate fire pixels of some grade increase is greater than one third of the total number of candidate frames, a fire has definitely occurred; otherwise, no fire occurs in the video. Specifically, if the number of candidate fire pixels with Grade 1 increases as the fire spreads, we set the severity of the fire to Grade 1. Otherwise, if the number of candidate fire pixels with Grade 2 increases, the severity is set to Grade 2. Otherwise, if the number of candidate fire pixels with Grade 3 increases, the severity is set to Grade 3. Otherwise, we consider that there is no fire in the video. The process can be expressed by Eq. 8-10:
fi(n) = 1 if Mi(n)_con > Mi(n-1)_con; fi(n) = 0 otherwise    (8)

Si = Σn=1..K fi(n)    (9)

a fire of Grade i is declared if Si > K/3 (checking i = 1, 2, 3 in turn)    (10)
where, i = 1, 2, 3 represent the three severity grades of a fire, fi(n) is the binary flag denoting whether the number of candidate fire pixels with Grade i increases at the n-th frame or not and Si denotes the number of candidate fire frames with increasing candidate fire pixels belonging to Grade i. In this study, we consider 15 consecutive frames each time, thus K = 15. In Eq. 10, K/3 is selected based on our experience.
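The verification of Eq. 8-10 over a window of K = 15 frames can be sketched as follows; grades are checked from most to least severe and all names are illustrative:

```python
def verify_fire(con_counts, K=15):
    """Eq. 8-10 sketch over K consecutive frames: for each grade,
    count the frames where the candidate pixel count increased
    (f_i(n)) and alarm with the first grade whose count S_i exceeds
    K/3. con_counts[grade] is a list of per-frame candidate counts."""
    for grade in (1, 2, 3):          # Grade 1 (most severe) first
        counts = con_counts[grade][:K]
        s = sum(1 for a, b in zip(counts, counts[1:]) if b > a)
        if s > K / 3:
            return grade             # fire of this severity
    return 0                         # no fire detected
```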
EXPERIMENTAL RESULTS AND DISCUSSION
In general, there are various fire detection systems, but there is no standard fire video database with which to compare results across systems. In our verification process, we adopted 19 video clips, including fire and non-fire clips, from the web-page http://signal.ee.bilkent.edu.tr/VisiFire/ and six fire video clips from our own database. We then trimmed each video clip to its first 30 frames so that every video is treated equally. Below, we give a concrete example to illustrate the proposed algorithm and compare our result with Ko's scheme (Ko et al., 2009) on the same database. Finally, we discuss the computational complexity of the proposed approach.
To evaluate the effectiveness of the proposed scheme, we first give a concrete example showing the intermediate results for a test fire video clip, a typical frame of which is shown in Fig. 4a. First, we find the fire-colored pixels in the frame based on VQ encoding, as shown in Fig. 4b. Second, we detect the moving pixels to get the candidate fire pixels, as shown in Fig. 4c, and synthesize Fig. 4b and c to produce the accurate fire area depicted in Fig. 4d. Third, we verify the fire pixels to raise a fire alarm with severity degrees, as shown in Fig. 4e. We can see that the numbers of pixels with Grade 1 and Grade 2 both increase sharply while the fire spreads, whereas the number of pixels with Grade 3 increases slowly, so we set the severity degree to Grade 1 for this example fire video clip.
Fig. 4: The experimental results for a concrete example. (a) the original frame; (b) the fire-colored pixels in the frame; (c) the moving pixels in the frame; (d) the AND result of (b) and (c) and (e) the number of fire pixels belonging to each grade
Table 1: The experimental results based on 25 test video clips
Next, we selected 25 video clips, each having 30 frames, to prove the efficiency of the proposed method; the results are shown in Table 1. As shown in Table 1, 15 video clips with fire and 10 video clips without fire were input into the system one by one, and all videos in which the fire grows sharply over time are detected as fires of Grade 1 or Grade 2. All of the non-fire video clips are detected as having no fire and one video clip with a stable fire is not classified as a fire. From these results, we can see that our algorithm correctly detects all video clips with a growing fire. For the non-growing fire video clip, our algorithm regards it as a non-fire video clip, because a fire that does not grow over time, such as a cooking or stove fire, does not endanger our lives.
Because the result of Ko's algorithm does not include grade information, we give a comparison that only considers fire and non-fire video clips without gradation. The comparison result is shown in Table 2, where the detection rate (Rd) represents the ratio of detected fire frames to overall fire frames and includes the true positive and false positive rates. The true positive rate (Rtp) is the rate of detecting a real fire as a fire, the false positive rate (Rfp) is the rate of recognizing a non-fire as a fire and the missing rate (Rm) is the rate of recognizing a real fire as a non-fire.
Table 2: The comparison results between our algorithm and Ko's algorithm
Let N be the total number of frames, Nf and Nnf be the numbers of frames with and without fire, respectively and Nd be the number of frames detected as having fire. The relationship among these parameters is as follows:
(11)
Table 2 shows that our algorithm achieves a higher detection rate of 93.3% compared to 86.5% and a lower missing rate of 6.7% compared to 13.2%. The true positive rate of the proposed algorithm is 100%, higher than that of Ko's algorithm, while the false positive rate is 0 in contrast to 0.4% for Ko's algorithm.
Here, all experiments are performed on an Intel(R) core(TM)2 Duo PC with 3.16 GHz CPU and 4 GB memory. In the proposed algorithm, the computational time is short enough to process video with the capture rate up to 58 frames sec-1 as shown in Table 3.
Table 3: The computational time in each phase for processing 100 consecutive frames (the size of each frame: 320x240)
From the table we can see that, given 100 consecutive frames of size 320x240, the system needs only about 17 ms to process a frame, that is, it can process about 58 frames per second. The proposed algorithm can thus be used for real-time applications under general conditions, but it should be improved for HD video.
CONCLUSIONS
This study proposes a novel fire detection method based on vector quantization. Conventional fire detection algorithms are usually suitable only for fires occurring under fixed conditions; in contrast, our scheme considers various conditions and generates a combined codebook to detect fire areas with different severity grades. As an efficient pattern clustering and classification technique, VQ encoding was introduced in the fire-colored frame decision stage, which was performed not pixel-by-pixel but block-by-block, and the fast EEENNS algorithm was used to speed up the VQ encoding process. Moving pixel detection was performed to reduce possible disturbance from the sun, moving car lights or other non-fire lights. In addition, the fact that the number of pixels belonging to each grade increases when a fire occurs was used as an extra rule to identify a fire. Experimental results proved that the proposed method is effective in detecting various kinds of fire areas and is robust against moving objects. Future work will focus on implementing the fire alarm system in hardware and on improving the detection speed.