Today video surveillance systems show two main drawbacks, namely: they are
not flexible in adapting to different operative scenarios (they only work in
a well known and structured world); they generally need the assistance of a
human operator in order to recognize and to tag specific visual events (Wong
et al., 2009). Because humans being can not serve as fully trusted
observer, this model itself may lead to omission phenomenon (Hapsari
and Prabuwono, 2010). The need of people to judge against threat will cause
a delay of security alerts and too long response time of security threats. Mass
storage of data also increases the cost of video surveillance.
For these shortcomings of video surveillance systems, many new technologies
are proposed. Mecocci et al. (2003) devised an
automatic real-time video surveillance system, capable of autonomously detecting
anomalous behavioral events. The proposed system is capable of automatically
adapting to different scenarios without any human intervention and uses robust
self-learning techniques to automatically learn the typical behaviour of the
targets in each specific operative environment. A wireless video surveillance
system supporting wide area access based on 3G network is proposed (Ruichao
et al., 2009). Guo et al. (2010) implements
and analyzes several commonly used methods for moving object detection in video
surveillance system, such as temporal difference, median filtering and Single
Gaussian model. Wong et al. (2009) proposed a
surveillance system, the Omni directional scenes within a digital home are first
captured using a web camera attached to a hyperbolic optical mirror. The captured
scenes are then fed into a laptop computer for image processing and alarm purposes.
Log-polar mapping is proposed to map the captured Omni directional image into
panoramic image, hence providing the observer or image processing tools a complete
wide angle of view. It can cover a wide angle of view (360deg Omni directional),
small in size, low cost, lightweight, output images are with higher data compression
and accurate trespasser detection. Moon and Pan (2009)
analyses conventional methods of privacy protection in surveillance camera systems
and applied scrambling and RFID system to existing surveillance systems to prevent
privacy exposure in monitoring simultaneously for both privacy protection and
surveillance. A new approach is proposed for a smart video surveillance system
by Xu and Wu (2010). The proposed surveillance system
uses canonical stereo configuration to set up a pair of statical cameras to
support a salient map to control a Pan-Tilt-Zoom (PTZ) camera/cameras, which
captures high definition image/video of the interesting moving object in the
surveillance area for further forensic investigations. The other contribution
of the proposed approach is that, 3D information generated by the set of static
cameras is used to support reliable spatial-temporal based image segmentation
and object detection.
The system which has been developed by us is characterized by its smart, convenience and power, which can process images in real time with its motion detection, thus eliminate the need to monitor staff. Its auto-recording of scene and efficient image data backup provides users with clear and reliable picture.
Software architecture of intelligent video surveillance alarm system: The hardware platform of the intelligent video surveillance alarm systems involves ARM9 2410 processor and OV7620 + OV511 image acquisition card. Linux operating system is its software platform.
System software modules are divided into digital image acquisition module, image control detection module, image compression module, memory module and WEB modules. The overall framework of the system software is shown in Fig. 1. Image control detection, image compression, WEB pretreatment is the key to implementation of the system. Essential parts of the overall framework of the system which is shown in Fig. 1, include image control detection, image compression and WEB pretreatment.
The improved algorithm of differential image motion detection: The purpose
of motion detection is to detect the video image sequence and to extract the
moving target (Collins et al., 2000). Currently,
there are some methods for motion detection such as optical flow which is poor
at real-time control and with high time expense, noise-Gaussian-distribution-based
methods with a large amount of calculation and the method of difference image,
which is relatively simple and with strong real-time control (Seferidis
and Ghanbari, 1993; Kim et al., 1999; Hongjian,
Generally there are two kinds of difference image method one is the difference between the current continuous two images (continuous image difference method) another is the difference between the current image and the fixed background image (the fixed background motion detection). The disadvantage of the continuous image difference method is that the detected location of objects is inaccurate and the detected target is much larger than the real one. The fixed background motion detection is performed as follows: firstly the background image is stored, then the difference image by the current image and the background image is read, finally the value of each pixel and a pre-sets threshold is compared. If a pixel value of background is greater than the pre-sets threshold, it will be considered as the pixel of the moving objects and if not, it will be background point Since this technology aims to get current image and there only exists the moving object in the difference image, which is the subtract of current frame and fixed background frame.
The new motion detection algorithm based on the difference image method introduces
the block thought, determines the detection threshold with the method based
on noise Gaussian distribution method. Compared with traditional rule of thumb
threshold method, it is more accurate and more reliable since this method joins
the brightness adjustment mechanism and reduces the light changes. Specific
methods are as follows: firstly, the image is reasonably divided to blocks which
are the basic units of motion detection; secondly, the mean gray value of the
current frame image block and the mean gray value of fixed background image
block are compared; thirdly, the method based on noise Gaussian distribution
method to get the detection threshold and the threshold to determine whether
there is moving objects in the image are applied which aid to extract the blocks
where the moving objects are in.
|| Intelligent video surveillance alarm system software architecture
All the operations are prepared for the follow-up image compression. A brightness
adjustment mechanism for the current frame is added to alleviate the impact
of ambient light in motion detection which can effectively reduce or even eliminate
the impact of illumination change. With the image data of a new frame, it can
adaptively change the average brightness of the reference frame pixel.
The establishment of the fixed background model: At the beginning of motion detection, a pre-stored background image with no moving target constitutes the initial model. The following method to extract background model was conducted the first two consecutive frames with the difference method to determine whether there is moving objects. If there is no, the average gray values of N consecutive frames would be calculated out and one frame as the reference frame would be directed.
Since real time of motion detection algorithm is of great significance, the low cost computation algorithm is taken as the primary consideration. All moving objects have their size and the pixel brightness of each object has a certain similarity too. So there is no need to compare the image brightness of each pixel value. The image is divided into blocks whose size is similar to the objects to be detected. This method will improve the real time of motion detection.
In this system, scene images are divided into blocks with sizes about 1/4 of the body (i.e., 320x240, image is divided into 8x8, a total of 64), in order to avoid misjudgment of small objects. Selecting about 1/4 size of the human body can distinguish the movement of large objects and the motion of small size objects.
Specific identification process is as follow: when the average brightness value of a block changes was detected and the threshold was exceeded, the average brightness value of the adjacent new blocks was detected to determine whether their average brightness values exceed the threshold. Provided that the average brightness value of the new block exceeds the threshold, adjacent block based on this new block would be tested. Only when a total of four or more blocks that their average brightness exceeds the threshold were found, moving object would be identified as large-size objects.
The mechanism and experimental analysis of brightness adjustment: Changes
of ambient light exert impact on fixed background motion detection. In the determination
of conditions, a light-sensitive item was added. The improved conditions are
and ai is a piece of the overall image A, the total pixel number
is the average brightness of all pixels in t+Δt time for the ai
is the average brightness of all pixels in t time for the ai block:
is the differences brightness of image A in time t+Δt and time t and the value can be positive or negative. N is the number of all pixels, the value of inhibition coefficient λ can be 1.
Two pictures are taken in the same scene but under different light intensity in experiment. The 2nd and 3rd column of Table 1, respectively showed the average brightness value of each divided blocks of two scenes. Data in column 2 is obviously lower than it in column 3 because of light difference. The 4th column indicated the difference value of each block between the 2nd column and the 3rd column.
In addition, the average brightness value of the whole dark light scene is 119.365257 and that of the whole light brightness scene is 146.048981. In 4th column of Table 1, the value of the 48th-block is 7, the value of the 56th-block is -5, so moving objects are in these blocks, we can not consider these blocks. Taking dark light scene as the reference frame, the light brightness scene as the detected frame, according to algorithm, the difference value between 119.365257 and 146.048981, takes -27 must be added to the average brightness of light brightness scene. The adjusted difference value of average brightness was showed on the 5th column of Table 1.
Thus, the adjusted difference values of average brightness are between -9 and 10. In other words, provided that the threshold value T is greater than 10, the changes of light intensity will not affect the results of motion detection. It shows that this improved algorithm has certain adaptability to brightness variations.
|| The average brightness value test
Threshold determination and experimental analysis: The motion detection algorithm based on CCD noise characteristics can be viewed as a Gaussian distribution of deviations, its background noise can be regarded as Gaussian and the average value of various random electronic noises can be regarded as 0.Therefore, the problem of determining the moving object is transformed to find whether the difference image are wholly in the rule of Gaussian distribution.
According to the "3σ" feature of Gaussian distribution, for Gaussian data, most of them distribute within the 3σ (where, σ is the standard deviation). We can set the detection threshold according to this feature and use it to determine whether the image blocks have moving objects.
In the initial motion detection, we estimate the average value of the difference image block and the initial value of variance, according to the difference image between two consecutive images and in the two consecutive images there are not any moving object. According to multiple difference images, we calculate the error between the brightness value of difference image and the estimated value. Finally, we use the iterative weighting method. By the way, the value of the standard deviation σ is determined by the physical properties of CCD, so when motion detection begins, we can determine the initial value of the standard deviation and set 3σ to be the threshold of motion detection.
Next, according to the determined threshold, we look at the algorithms effect of moving object detection. The standard deviation is stably at about 4; take 12 as its threshold value. Consider dark light image to be the reference frame. We take a moving object picture in experiment scene. And the 6th column of Table 1 shows the average brightness value of the picture. The average brightness value of the whole image is 121.260445.
According to the reference frame, adjust the average brightness value of each block in moving object picture, that is every block in moving object picture adds the difference value of the whole average brightness value between dark light scene and light brightness scene, i.e., 119.365257 -121.260445 = -1.895188, take -1.90, as shown in the 7th column of Table 1.
It can be seen from the 8th column of Table 1: the 21st, 22nd, 28th, 29th, 32nd, 40th, 41st, 56th blocks change very apparently. Comparing moving object picture and dark light scene, obviously, there is a person, he caused a change in the first block of 21st, 22nd, 28th, 29th and the amplitude is relatively large. Look at the more obvious changes in the 32nd, 40th, 41st, 56th blocks. We find the regional block 32nd, 40th, 41st have a case, the first 56 regional block has a cloth.
According to the threshold, we can detect a moving object in the 21st, 22nd, 28th, 29th, 32nd, 40th, 41st, 48th, 56th, 57th blocks and it is consistent with the actual situation.
According to the experimental results, the 3σ threshold for motion detection can detect the movement of large objects but also can detect the movement of small objects. And 3σ threshold should be the minimum threshold; the movement of small objects is also very sensitive. The system is mainly for large objects, so the appropriate threshold value can be increased to 3.5 or 4σ.
THE OPTIMIZATION OF JPEG COMPRESSION ALGORITHM
Storage capacity and transfer rate is an important fact of cost and performance
of the video surveillance system. In order to reduce the demand for network
bandwidth and save storage space, the data video system captured need to compress
(Logashanmugam and Ramachandran, 2008). Compression
is the most expensive part of system performance, we need design carefully.
Compression module system is DCT-based. The compression process is irreversible.
System uses less number of bits to get better quality restored image. The image
is inputted into the encoder, through the DCT transforms, through quantization,
through entropy encoded, at last compressed image is outputted.
The introduction of adaptive quantization table based on motion images:
The final image compression quality and file size depend on the quantization
table. Quantification is a one-to-many mapping. It is in accordance with the
JPEG standard and uses a linear uniform quantizer (Han and
Jie, 2007). The introduction of adaptive quantization table is to implement
that: If no movement is detected, system uses the standard quantization table
to loss more details. But once movement is detected, system uses optimized quantization
table to retain more specific information.
Consider the complexity of the implementation and the restriction of the environment conditions, we use two set of brightness and color quantization tables to separately quantify the images with no motion and the image with motion. It not only effectively improves the quality of the detected movement image but also effectively reduces the size of the image with no movement. It also increases the number of stored images, and an implement simply, has low complexity.
The introduction of adaptive quantization table based on moving images: To the moving target and its background, we can also use the method of adaptive quantization tables. The basic idea is: we mark the results of block motion detection, according to the magnitude of each motion, adaptively change the number of quantization levels, quantify the independent part of the movement crudely and quantify the dependent part of the movement carefully. Ensuring the clarity of moving target, we also allow a small amount of movement independent blocks to have mosaic effect, thus we can speed up the compression rate and reduce the image size.
The implementation of this method based on motion detection, after the block motion detection of a picture; we refine the motion blocks and quantify the independent part of the movement crudely, desert the useless information, allow these blocks have mosaic appearance. The adaptive quantization table adopts two different quantization tables. It is contrary to the optimization of quantization table. It uses multiplying quantization table (the standard JPEG quantization table multiplied by 2) to quantify the independent part of the motion picture and makes more details to be lost, so as to reduce the size of the entire picture to improve compression ratio.
The comparison of two methods: The core of the two optimization methods is that they all use adaptive quantization table. The difference is the using level of the adaptive quantization table. Motion detection to the image with movement and the image with no movement use different quantization tables. It can effectively reduce the size of the picture which people is not interested and assure the quality of interested image. The adaptive quantization for different regions of each picture improves the efficient of image compression, highlights the goals and blurs unrelated areas. But the whole process of quantification is complex and for the performance of current development platform, the efficient will be greatly affected.
Image filtering: JPEG can compress "continuous" image well, if the image
has many sudden changes in color brightness, JPEG can not work well (Haseeb
and Khalifa, 2006). In view of this principle, the noise filtering of picture
to a certain degree is necessary.
Median filters use nonlinear processing methods to remove noise which can be
done to remove noise and also protect the edge of the image (Gonzalez
and Woods, 2002). Median filtering algorithm in the system still uses a
sliding window but modifies the image pixel value of the windows center
to the value in the middle of all the windows pixel values. As a result,
the noise can be removed; system can better keep the edge information.
Currently, the median value of median filter is the middle value of all the elements after sorting. But sorting needs to move a large number of elements and its efficient is low. We just try to find the middle value, so we do not really have to sort. Use the quick sort to sort, when sorting fulcrum is in the middle of the original array, we can end the cycle. The average time complexity of quick sort is O (n log2 n). Its space complexity is O (log2 n).
After filtering, the image compression ratio has increased significantly, compared to the average, the JPEG compression of 24-bit BMP images increased by about 1%.
The DCT fast algorithm: The same JPEG image compression test to a 320x240
image, in the case of Yh: Yv: Uh: Uv:
Vh: Vv: = 2: 2: 1: 1: 1: 1, has six 8x8 information groups,
corresponding to the sub-block of MCU is 16x16, the operation number of DCT
and IDCT are 320x240x6 /(16x16) = 1800 (Abboud, 2006).
Therefore, it is necessary to use fast DCT algorithm to do some other improvements:
(1) Similar to the YUV and RGB conversion, convert floating-point operations
to fixed-point operations and pre-calculate the cos() values to prepare for
calls. (2) 8x8 2-D DCT are converted to two 8-point 1-dimensional DCT. The formula
One-dimensional fast DCT algorithms are many (Feig, 1990;
Vetterli, 1985), we use a smaller computational Loeffler
algorithm (Loeffler et al., 1989). The DCT algorithm
for 16x16 blocks needs to do 31 times multiplications and 81 times additions.
But the DCT algorithm for 8x8 blocks only needs to do 11 times multiplications
and 29 times additions. The direct calculation requires 56 times multiplications,
56 times additions. Visibly fast algorithms save a lot of computation.
THE SERVER PRETREATMENT
The WEB server performance of monitoring system directly affects the speed of customer reading. To improve server performance, the system adds server page prediction modules.
The current algorithms are: referrer field method, PPM method, the prefetching
method based on semantic understanding, Wcol method (Umapathi
and Raja, 2006).
|| The adjacent table of web site
|| The corresponding Fig. of the pages and the vertex table
Wcol method is very simple: if the URL user requested is a HTML document, the
server will analyze the link above the embedded images and other HTML documents
and then pre-fetch them and put them out to the cache.
In this system pages and links are simple. Therefore, we use Wcol intelligent prediction algorithm and do some improvements. The method considers the inter-linked pages as a graph and use adjacency list to store the graph.
Store the relevant pages of the site with a vertex table and then store the related links in the side table. When a user requests a URL, the server receives the request, parses the request URL, then pre-fetch module tries to find out from the adjacent list whether the page is in memory. If the page is not in memory, module goes to the disk to read the page and put it to memory. If the page is in memory, module directly read the page to user. At the same time, according to the information of this page which is found out from the adjacency table, the pre-fetch module finds the relevant links from side table and predicts the pages, images and other information that users may visit next, at last puts the information into memory to improve access speed.
Data structure: Pre-fetch works start around the adjacency list. All
pages of the website constitute a vertex table. The reference links of a page
include page information and picture information and all of them constitute
a side table. Adjacency list is shown in Fig. 2. The data
structure is defined as follow:
The development of web site is divided into static pages module and CGI program module. The web site totally has 12 pages. The page structure is shown in Fig. 3. This structure can be viewed as a tree. Therefore, the vertex table array page  stores the information of the total 12 pages. The storage order of these pages is under the level priority shown in Fig. 3.
Workflow: When server starts, system builds the data structure of adjacency
list. Page  and page  store image-related pages. Associated with the operation
of the system, the pictures have been increasing.
|| The test results of the server with no pre-fetch module
|| The test results of the server with pre-fetch module
The structure of side table needs to constantly update. System inserts the
latest generated image information into the side tables first node. When
users access page  and page , pre-fetch procedure reads pictures from
this side table into cache, waiting for the access of users, so as to achieve
predication. For other common pages, their visit numbers always change, system
uses count value to record visit numbers. When visit numbers plus one, system
travels the whole adjacency list, the count values of all nodes of this page
plus 1. At the same time, system re-links the nodes at this side table by the
order of count values. Next time when user visits some node of the vertex list,
the predicted page which will be read next into the cache is the first node
of the vertex and also is the first node of the side table whose count value
is the largest.
Performance test: In order to test the performance of this server, we compare the server with pre-fetch module to the server with no pre-fetch module, totally make four times test and each is divided into three groups.
First of all, let 12 computers on the LAN access alarmphoto.cgi the alarm image page in the server with no pre-fetch module at the first time and test its first loading time. The loading time is at about 5, 6, 7, 8 and 9 sec. In the first group we find that there is one computer whose loading time is 5 sec, three computers whose loading time is 6 sec, five computers whose loading time is 7 sec, two computers whose loading time is 8 sec, one computer whose loading time is 9 sec and the average loading time of the first group is 6.91 sec. The average loading time of the second group is 6.83 sec. The average loading time of the third group is 6.83 sec. Secondly, let 12 computers on the LAN access the next photo of alarm image page in the server with no pre-fetch module and test its second loading time. The average loading time of the first group is 6.91 sec. The average loading time of the second group is 6.91sec. The average loading time of the third group is 6.83 sec. It is shown in Table 2.
Thirdly, let 12 computers on the LAN access alarmphoto.cgi the alarm image page in the server with pre-fetch module and test its first loading time. The average loading time of the first group is 7 sec. The average loading time of the second group is 6.91 sec. The average loading time of the third group is 6.83 sec. Fourthly, let 12 computers on the LAN access the next photo of alarm image page in the server with pre-fetch module and test its second loading time. The average loading time of the first group is 4.67 sec. The average loading time of the second group is 5.08 sec. The average loading time of the third group is 4.75 sec. It is shown in Table 3.
The time of the first page download shows that: the first picture download time of the server with no pre-fetch module is between 5 and 9 sec, the average time is 6.85 sec and the first download time of the server with pre-fetch module is between 5 and 9 sec too, the average time is 6.91 sec. Experiment results show that the performances that the two servers visit the same page at the first time are the same.
The time of the second picture download shows that: the second picture download time of the server with no pre-fetch module is still between 5 and 9 sec, the average time is 6.88 sec and the second download time of the server with pre-fetch module is between 3 and 7 sec, the average time is 4.83 sec. Experiment results show that the performance that the server with pre-fetch module visits the next page at the second time is better than the performance that the server with no pre-fetch module visits the next page at the second time.
The study introduces a number of key technologies of the software architecture for video surveillance system, proposes a new motion detection algorithm, an improved image compression algorithm and an improved web server pre-fetch algorithm. The block-based fixed background difference image motion detection algorithm has a good effect. It greatly reduces the false alarm rate, avoids filtering, reduces the amount of computation, has certain adaptability to light changes, reduces or even eliminates the interference of the CCD jitter, can accurately extract the moving object location and in the course of each test only need take one frame. It is an ideal method for motion detection. The proposed improved image compression algorithm implements the adaptive quantization table compression which is based on motion detection. It converts floating-point calculations into a fixed-point calculation and uses sub-sampling and algorithm optimization to improve the performance. Test proves: the total time of compressing 100 BMP images and completing the USB storage is lower than 48 sec. This compression module in the whole system can complete the emergency compression storage function and does not affect the performance of the real-time image acquisition of the whole system. Compared with the no pre-fetch module web system, the system with improved pre-fetch algorithm has better real-time and efficiency. Experiments show that: compared with other monitoring systems, this system is lower cost, faster response and so on.
The work was supported by major project fund of Ministry of Science (ECK2150A), Hunan Science and Technology Project Fund (2008GK3134) and Science Research Project Fund of Hunan Provincial Education Department (10C0202).