The objective of this research is to study and classify human heart abnormalities using information obtained from time-frequency spectrograms and image processing techniques, through the application of an artificial neural network.
It is widely accepted that, by measuring and observing the time-domain electrocardiogram (ECG) of a human heart, a qualified medical practitioner can determine the condition of the heart (whether it is normal or abnormal). The same process can be automated, for example on a PC-based system: given the same ECG, the program would be able to determine the condition of the heart.
The automated classification procedure has some advantages. Firstly, as it is implemented as a computer program, it can be duplicated exactly on other computers. This makes the same expertise available to a larger number of patients. Secondly, a computer program does not deteriorate with time, unlike human capability. The automated system therefore performs consistently, whereas a human doctor may become less competent with age.
The ECG of a normal heartbeat consists of the P wave, QRS complex and T wave,
and repeats as shown in Fig. 1.
Time and frequency domain features: However, time-domain ECG plots do not
display the signal intensity of the frequency-domain components.
Fig. 1: [...] complex of an ECG pulse (Enderle et al., 2000)
This is a great loss, as frequency-domain components contribute significantly
to determining the unique features of most engineering and scientific signals (Dripps,
The frequency analysis of such signals via Fourier techniques is fundamentally unsatisfactory, since these techniques model the signal as a linear combination of sinusoids extending throughout the duration of the signal. Fourier analysis is good at determining which frequencies are present (i.e., it provides good frequency discrimination), but poor at pinpointing when these frequencies occur (i.e., it has poor time localization) (Crowe, 1997; Hlawatsch et al., 1992; Haghighi-Mood and Torry, 1997).
Fig. 2: Plots in Time Domain (Top), Frequency Domain (Middle), Time-Frequency [...]
Popular simultaneous time-frequency analysis techniques include the Wavelet Transform (Ikeda et al., 1997) and the Short Time Fourier Transform. Time-domain, frequency-domain and simultaneous time-frequency domain displays of ECG signals are shown in Fig. 2.
Short-time Fourier transform (STFT): The short-time Fourier transform
(STFT) is a linear time-frequency representation (TFR) used to present changes
in the signal that vary with time. The Fourier transform does not explicitly
show the time location of the frequency components, but some form of time location
can be obtained by using a suitable pre-windowing (Hlawatsch and Boudreaux-Bartels,
1992). The STFT approach is to perform a Fourier Transform on only a small section
(window) of data at a time, thus mapping the signal into a two-dimensional (2-D)
function of time and frequency. The transform is described mathematically as:
where g(t) is the window function, which may be defined as a simple box or pulse function.
In this study, a Blackman window of 256 discrete data points with 50% overlap was applied to records of 512 discrete data points. The Blackman window was chosen because it gives minimum variation in spectrum shape and colour compared to other techniques. The discrete version of the Blackman window can be described by the equation below:
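The windowed transform described above can be sketched in a few lines of standard-library Python. This is only an illustration under stated assumptions: the window uses the commonly quoted symmetric Blackman form w(n) = 0.42 - 0.5 cos(2πn/(N-1)) + 0.08 cos(4πn/(N-1)), the transform is a plain (slow) discrete Fourier transform, and the function names `blackman` and `stft_magnitude` are hypothetical rather than taken from the study.

```python
import cmath
import math


def blackman(n_points):
    """Symmetric Blackman window (assumed form, coefficients 0.42 / 0.5 / 0.08)."""
    last = n_points - 1
    return [0.42
            - 0.5 * math.cos(2 * math.pi * n / last)
            + 0.08 * math.cos(4 * math.pi * n / last)
            for n in range(n_points)]


def stft_magnitude(signal, window_size=256, overlap=0.5):
    """Return one magnitude spectrum (bins 0..window_size/2) per windowed frame."""
    window = blackman(window_size)
    hop = int(window_size * (1 - overlap))  # 128 samples for 50% overlap
    # Precomputed complex exponentials exp(-2*pi*j*m/N) for the direct DFT.
    twiddle = [cmath.exp(-2j * math.pi * m / window_size)
               for m in range(window_size)]
    frames = []
    for start in range(0, len(signal) - window_size + 1, hop):
        segment = [signal[start + n] * window[n] for n in range(window_size)]
        spectrum = []
        for k in range(window_size // 2 + 1):
            acc = sum(segment[n] * twiddle[(k * n) % window_size]
                      for n in range(window_size))
            spectrum.append(abs(acc))
        frames.append(spectrum)
    return frames
```

With a 512-point record, a 256-point window and 50% overlap, this yields three frames (starting at points 0, 128 and 256); stacking the frames side by side gives the spectrogram image. The mapping from bin index to Hertz depends on the sampling rate, which is not stated here.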
Selection of frequency band: The useful frequency band for the ECG signal was chosen to be 0.5 Hertz (Hz) to 59 Hz. Detailed signal analysis by prior researchers indicates that the P-wave and T-wave mainly contain frequency components far below 60 Hz. The R-wave also mainly contains frequency components below 60 Hz, but it contains some components beyond 60 Hz as well. A band-pass filter was therefore used to limit the signal bandwidth to 0.5 to 59 Hz. Such a filter effectively reduces 60 Hz noise (normally picked up from the power line), has little effect on the P-wave and T-wave, and produces an acceptable amount of distortion in the R-wave. The lower cut-off frequency of 0.5 Hz was chosen to avoid the low-frequency noise due to respiration and electrode movement that occurs below 0.03 Hz.
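The effect of such a band limit can be illustrated with a small sketch. The filter design used in the study is not specified, so the code below substitutes an idealised "brick-wall" filter that zeroes every discrete Fourier transform bin outside 0.5 to 59 Hz; the function name `bandpass_dft` and the 256 Hz sampling rate in the usage note are assumptions made purely for illustration.

```python
import cmath
import math


def bandpass_dft(signal, fs, low_hz=0.5, high_hz=59.0):
    """Zero every DFT bin outside [low_hz, high_hz] and transform back."""
    n = len(signal)
    twiddle = [cmath.exp(-2j * math.pi * m / n) for m in range(n)]
    # Forward DFT (direct O(n^2) form, adequate for short records).
    spectrum = [sum(signal[t] * twiddle[(k * t) % n] for t in range(n))
                for k in range(n)]
    for k in range(n):
        # Bins above n/2 mirror the negative frequencies.
        freq = min(k, n - k) * fs / n
        if freq < low_hz or freq > high_hz:
            spectrum[k] = 0.0
    # Inverse DFT: conjugate exponentials and a 1/n scale factor.
    return [sum(spectrum[k] * twiddle[(k * t) % n].conjugate()
                for k in range(n)).real / n
            for t in range(n)]
```

For example, with an assumed sampling rate of 256 Hz, a record made of a constant offset, a 10 Hz sinusoid and a 60 Hz sinusoid comes back as the 10 Hz sinusoid alone. A practical filter would have a gradual roll-off rather than this abrupt cut, which is why some distortion of the R-wave is to be expected.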
Linear spatial domain filter: One of the simplest linear, spatial image processing techniques used in machine vision is the convolution operation. The operation can be described as follows:
For any given planar image P and a 3x3-element mask G, described by (3) and (4) below,
the convolved image P* is given as:
Different coefficient values give different filter behavior (Gonzales and Woods, 2002).
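As a rough sketch of the mask operation described above, the following standard-library Python applies a 3x3 mask to an image stored as a list of rows. It assumes true convolution (the mask is flipped before it is laid over the neighbourhood) and returns only the interior pixels where the mask fits entirely; the function name `convolve3x3` is illustrative, and many image processing texts apply the mask without flipping, which gives the same result for symmetric masks.

```python
def convolve3x3(image, mask):
    """'Valid' 2-D convolution of a list-of-lists image with a 3x3 mask."""
    rows, cols = len(image), len(image[0])
    result = []
    for r in range(1, rows - 1):
        out_row = []
        for c in range(1, cols - 1):
            acc = 0
            for i in range(3):
                for j in range(3):
                    # Convolution flips the mask: mask[i][j] meets
                    # the pixel at (r + 1 - i, c + 1 - j).
                    acc += mask[i][j] * image[r + 1 - i][c + 1 - j]
            out_row.append(acc)
        result.append(out_row)
    return result
```

A mask whose only non-zero coefficient is a 1 at the centre returns the interior of the image unchanged, which is a convenient sanity check before trying smoothing or edge-detection coefficients.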
Gaussian filter: The Gaussian filter smooths a given image P. It uses the same structure as (3), (4) and (5), with specialized values for the filter coefficients. The values vary according to requirements and specifications. One such set of values is shown in Eq. 6 below:
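A minimal sketch of Gaussian smoothing with a 3x3 mask follows. The coefficients used here, (1/16)[1 2 1; 2 4 2; 1 2 1], are a widely used approximation and are an assumption for illustration; they are not necessarily the values used in the study. Border pixels are simply copied unchanged, which is one of several possible border policies.

```python
# Assumed 3x3 Gaussian approximation; the coefficients sum to 16.
GAUSSIAN_3X3 = [[1, 2, 1],
                [2, 4, 2],
                [1, 2, 1]]
GAUSSIAN_SCALE = 16


def gaussian_smooth(image):
    """Smooth interior pixels with the 3x3 mask; border pixels are copied."""
    rows, cols = len(image), len(image[0])
    smoothed = [row[:] for row in image]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            acc = sum(GAUSSIAN_3X3[i][j] * image[r - 1 + i][c - 1 + j]
                      for i in range(3) for j in range(3))
            smoothed[r][c] = acc / GAUSSIAN_SCALE
    return smoothed
```

Because the coefficients sum to the scale factor, a region of constant brightness is left unchanged, while an isolated bright pixel is reduced to a quarter of its value.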
Sobel edge detection: In Sobel edge detection, two operators, Gx and Gy, are used to calculate approximations of the derivatives, one for horizontal changes and one for vertical changes. The Sobel operators calculate the gradient of the image intensity at each point, giving the direction of the largest possible increase from dark to light and the rate of change in that direction. The result therefore shows how abruptly or smoothly the image changes at that point, and therefore how likely it is that that part of the image represents an edge, as well as how that edge is likely to be oriented.
The operators follow the same structure as (5). If P is defined as the source image and Gx and Gy are the two operators described above, the latter are computed as:
At each point in the image, the resulting gradient approximations can be combined to give the gradient magnitude, using:
Using this information, the gradient's direction is calculated as below:
where, for example, Θ is 0 for a vertical edge which is darker on the left side.
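The gradient computation can be sketched as follows, assuming the commonly used Sobel coefficients applied directly to the neighbourhood (without flipping), so that Gx is positive when brightness increases to the right and Gy is positive when it increases downwards. Sign conventions differ between references, and the names `SOBEL_X`, `SOBEL_Y` and `sobel_at` are illustrative.

```python
import math

# Commonly used Sobel masks (assumed sign convention, applied without flipping).
SOBEL_X = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1],
           [0, 0, 0],
           [1, 2, 1]]


def sobel_at(image, r, c):
    """Return (gx, gy, magnitude, direction in radians) at interior pixel (r, c)."""
    gx = sum(SOBEL_X[i][j] * image[r - 1 + i][c - 1 + j]
             for i in range(3) for j in range(3))
    gy = sum(SOBEL_Y[i][j] * image[r - 1 + i][c - 1 + j]
             for i in range(3) for j in range(3))
    magnitude = math.hypot(gx, gy)   # sqrt(gx^2 + gy^2)
    direction = math.atan2(gy, gx)   # arctan(gy / gx), safe when gx is 0
    return gx, gy, magnitude, direction
```

For a patch that is dark on the left and bright on the right, such as three identical rows `[0, 5, 10]`, the vertical response is zero and the direction is 0, matching the vertical-edge case described above.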
Non-linear spatial domain filter: median filter: A median filter is a non-linear filter. In a median filter, the median value of the pixels in the processed window replaces the pixel currently being processed. For example, consider a pixel (with a value of 150) that is surrounded by 8 other pixels, as shown in equation (11) below. The pixel values are sorted in ascending order and the original pixel value in the middle is replaced with the median of the sorted values (Gonzales and Woods, 2002; Fisher et al., 1994). Equation (11) illustrates how the median filter is implemented.
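A minimal sketch of a 3x3 median filter is given below. Border pixels are copied unchanged, which is an assumption about border handling, and the function name is illustrative. The neighbour values in the usage note are hypothetical; only the centre value of 150 comes from the text.

```python
def median_filter3x3(image):
    """Replace each interior pixel with the median of its 3x3 neighbourhood."""
    rows, cols = len(image), len(image[0])
    filtered = [row[:] for row in image]  # border pixels are copied unchanged
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            window = sorted(image[r + dr][c + dc]
                            for dr in (-1, 0, 1) for dc in (-1, 0, 1))
            filtered[r][c] = window[4]  # middle of the nine sorted values
    return filtered
```

With a hypothetical neighbourhood `[[124, 126, 127], [120, 150, 125], [115, 119, 123]]`, the sorted values are 115, 119, 120, 123, 124, 125, 126, 127, 150, so the centre value of 150 is replaced by 124. An isolated outlier is removed without averaging it into its neighbours, which is the main difference from a linear smoothing mask.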
Pulse counting from binary image: The spectrogram obtained from the STFT operation is converted into a grayscale image and then thresholded to produce a black-and-white representation of the original STFT spectrogram.
At a selected frequency (fselected = 45 Hz), the number of transitions in pixel value is counted. For example, if a line of pixels at any height is extracted, the following representation might be obtained:
From the example above, 4 transitions (NT) can be obtained. The pixels located in a pulse area are indicated by 1. Since the transitions include both 0 to 1 and 1 to 0, the total number of pulses is NP = NT/2. In the above example, 4/2 = 2 pulses are detected. This is far from the most robust or precise pulse counting method, but the results it produced were observed to be acceptable throughout the study.
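The transition count can be sketched in standard-library Python as follows. The threshold level of 128 and the function names are assumptions for illustration; the row of pixels in the usage note is made up to contain two pulses.

```python
def threshold_row(gray_row, level=128):
    """Binarise one row of grayscale pixels (the level of 128 is an assumption)."""
    return [1 if value >= level else 0 for value in gray_row]


def count_pulses(binary_row):
    """Return (NT, NP): the number of 0/1 transitions and NP = NT // 2."""
    transitions = sum(1 for left, right in zip(binary_row, binary_row[1:])
                      if left != right)
    return transitions, transitions // 2
```

For the row `[0, 0, 1, 1, 1, 0, 0, 1, 1, 0]`, `count_pulses` returns `(4, 2)`: four transitions and two pulses. If a pulse is cut off by the left or right edge of the image, the number of transitions is odd and the integer division drops the partial pulse, which is one reason the method is only approximate.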
Euler number: The Euler number is defined as the number of connected components
minus the number of holes. It is a topological property that is useful for region
description. The number of holes, H, and connected components, C, in an image can
be used to define the Euler number, E:
E = C - H
The regions shown in Fig. 3, for example, have Euler numbers
equal to 0 and -1, respectively, because the left region has one connected component
and one hole and the right region has one connected component but two holes
(Gonzales and Woods, 2002).
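One way to compute the Euler number of a small binary image is sketched below. It assumes 8-connectivity for foreground pixels (value 1) and 4-connectivity for background pixels (value 0), and treats any background region that does not touch the image border as a hole. These connectivity choices and the function names are assumptions for illustration.

```python
N4 = [(-1, 0), (1, 0), (0, -1), (0, 1)]
N8 = N4 + [(-1, -1), (-1, 1), (1, -1), (1, 1)]


def count_regions(grid, target, neighbours, skip_border=False):
    """Count connected regions of `target`; optionally ignore regions on the border."""
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    count = 0
    for r0 in range(rows):
        for c0 in range(cols):
            if grid[r0][c0] != target or seen[r0][c0]:
                continue
            stack = [(r0, c0)]
            seen[r0][c0] = True
            touches_border = False
            while stack:  # iterative flood fill of one region
                r, c = stack.pop()
                if r in (0, rows - 1) or c in (0, cols - 1):
                    touches_border = True
                for dr, dc in neighbours:
                    nr, nc = r + dr, c + dc
                    if (0 <= nr < rows and 0 <= nc < cols
                            and not seen[nr][nc] and grid[nr][nc] == target):
                        seen[nr][nc] = True
                        stack.append((nr, nc))
            if not (skip_border and touches_border):
                count += 1
    return count


def euler_number(binary):
    """E = C - H for a binary image given as a list of rows of 0s and 1s."""
    components = count_regions(binary, 1, N8)
    holes = count_regions(binary, 0, N4, skip_border=True)
    return components - holes
```

A single ring of foreground pixels gives E = 1 - 1 = 0, and a shape enclosing two separate holes gives E = 1 - 2 = -1, in line with the two regions described above.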
Types of heart abnormalities investigated: The types of heart abnormalities
which were studied in this research are described in Table 1.
The data were obtained online from http://www.physionet.org
(Goldberger et al., 2000).
Features observed: The features extracted from the processed spectrogram image and used for classification are described in Table 2.
Identification method: A multi-layered perceptron Artificial Neural Network (ANN), trained by back-propagation, was used to identify the different types of heart abnormalities.
Generally, an ANN consists of a number of simple and highly interconnected processors called neurons. The neurons are connected by weighted links that pass signals from one neuron to another (Negnevitsky, 2002).
Fig. 3: [...] with Euler number equal to 0 (Left) and -1 (Right)
Fig. 4: [...] Forward, Back Propagation Trained Neural Network
Multilayer neural networks are feed-forward neural networks with one or more
hidden layers (Fig. 4). A network consists of an input layer, hidden
layers and an output layer. Networks are adjusted, or trained, so that a particular
input leads to a specific target output. Hidden layers are important for extracting
useful information from the inputs and using it to predict the output.
The input vector X is transmitted through connections, each of which multiplies its signal by a weight W1,j,i, and the result passes through an activation function f1(...) to give the output y1,i.
In general, the weighted outputs of layer k are fed into the activation function fk+1(...) to produce the output of node i at layer (k+1), yk+1,i. The process is repeated until the final layer is reached, and is described by the equation below:
Table 1: Type of heart abnormalities and their description
Table 3: [...] classification test results
Table 4: [...] classification test results
Output at node i of layer (k+1)
Activation function at layer (k+1)
Weight of the connection between node i of layer (k+1) and node j of layer k
Bias at node i of layer (k+1)
Number of nodes from layer k that are connected to node i of layer (k+1)
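The layer-by-layer computation defined by these terms can be sketched as follows: each node sums its weighted inputs, adds its bias, and applies the activation function. The sigmoid activation, the list-of-rows weight layout and the function names are assumptions made for illustration; the activation function actually used is not stated.

```python
import math


def sigmoid(z):
    """Logistic activation (an assumed choice of activation function)."""
    return 1.0 / (1.0 + math.exp(-z))


def layer_forward(inputs, weights, biases, activation=sigmoid):
    """Outputs of layer k+1 from the outputs of layer k.

    weights[i][j] is the weight between node i of layer k+1 and
    node j of layer k; biases[i] is the bias at node i of layer k+1.
    """
    return [activation(sum(w * y for w, y in zip(row, inputs)) + bias)
            for row, bias in zip(weights, biases)]


def network_forward(inputs, layers):
    """Propagate an input vector through a list of (weights, biases) pairs."""
    outputs = inputs
    for weights, biases in layers:
        outputs = layer_forward(outputs, weights, biases)
    return outputs
```

A network with 9 inputs, one hidden layer and 6 outputs is then just a list of two `(weights, biases)` pairs passed to `network_forward`. With all weights and biases set to zero, every output is 0.5, the midpoint of the sigmoid.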
The weights are adjusted during training to reduce the error between the actual and desired output patterns. Training continues until this error reaches a specified target value.
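The idea of adjusting weights until the error reaches a target can be illustrated with a deliberately reduced sketch: a single sigmoid neuron trained by gradient descent on a squared error, rather than the full multi-layer back-propagation used in the study. The learning rate, error goal, iteration cap and function names are all assumptions.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_step(weights, bias, inputs, target, rate=0.5):
    """One gradient-descent update; returns new weights, new bias, error before update."""
    output = sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)
    error = 0.5 * (target - output) ** 2
    # Negative gradient of the error with respect to the weighted sum.
    delta = (target - output) * output * (1.0 - output)
    new_weights = [w + rate * delta * x for w, x in zip(weights, inputs)]
    new_bias = bias + rate * delta
    return new_weights, new_bias, error


def train_until(weights, bias, inputs, target, goal=0.01, max_steps=10000):
    """Repeat updates until the error falls to `goal` or the step limit is hit."""
    errors = []
    for _ in range(max_steps):
        weights, bias, error = train_step(weights, bias, inputs, target)
        errors.append(error)
        if error <= goal:
            break
    return weights, bias, errors
```

In a multi-layer network the same kind of update is applied to every weight, with the error terms of the hidden nodes obtained by propagating the output errors backwards through the connections.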
In this study, a three-layer, back-propagation trained ANN was used to identify the abnormality. The input layer has 9 nodes, corresponding to the features (Table 2) that were extracted from the heart beat. The output layer has 6 nodes denoting the different types of abnormalities (Table 1). Each output node gives the certainty factor (cf) of the classification process. The node whose cf is closest or equal to one indicates the abnormality most likely to be the correct classification. Thirty original and sixty interpolated data records were used to train the ANN. Fifteen original and fifteen interpolated data records were used to validate the ANN.
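Reading the classification off the six output nodes amounts to picking the node with the largest cf. A minimal sketch follows; the label names and cf values are placeholders invented for illustration, since the actual abnormality names and network outputs are not reproduced here.

```python
def classify(cf_values, labels):
    """Return the label and cf of the output node with the highest cf."""
    best = max(range(len(cf_values)), key=lambda i: cf_values[i])
    return labels[best], cf_values[best]


# Placeholder labels and cf values, for illustration only.
labels = ["type_1", "type_2", "type_3", "type_4", "type_5", "type_6"]
cf_values = [0.02, 0.01, 0.97, 0.00, 0.03, 0.01]
winner, confidence = classify(cf_values, labels)
```

Here `winner` is `"type_3"` and `confidence` is `0.97`. If two nodes share the same highest cf, this sketch returns the first of them.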
The overall classification result is shown in Table 3. Table 4 shows 10 numerical examples of the classification results.
The first five rows show the cf of the classified abnormality for the data
set which was used in training the ANN. The last five rows show the cf of
the classified abnormality for the data set which was not used in the training
process. In both sets of data, the system was able to correctly classify the
investigated abnormalities with 100% accuracy. The cf for the classified abnormality
varies, but the values were either close to or equal to 1.00, which indicates
strong belief in the result.
DISCUSSION AND CONCLUSION
The detection results show that the system performs very well, achieving 100% accuracy on the data tested. It is therefore concluded that the objective of this research has been achieved. Future enhancements include adding more data to the knowledge base and to cf extraction. Further testing with a significantly larger test set is also planned, to assess the robustness and accuracy of the system more thoroughly.
This research was supported by the Ministry of Science, Technology and Environment,
Malaysia under the 8th Malaysia Plans IRPA 03-02-02-0016-SR0003/07-02