Abstract: Kolb's experiential learning theory outlines that individuals possess unique learning preferences, which comprise the diverging, converging, assimilating and accommodating styles. The conventional approach to assessing learning styles, however, is susceptible to reliability issues that arise from cultural and language variations. To overcome this limitation, a new learning style assessment technique is proposed using EEG sub-band spectral centroid frequencies and an artificial neural network. Sixty-eight participants volunteered for the study. Subjects were clustered into the respective learning style groups using Kolb's learning style inventory. Subsequently, resting EEG was recorded from the antero-frontal cortex and pre-processed for noise elimination. Alpha and theta spectral centroid frequencies were then extracted and analyzed. Dataset enrichment was then performed using synthetic EEG. In general, the artificial neural network was successful in classifying learning styles from the resting EEG. Network training and testing attained accuracies of 85.1 and 91.3%, respectively. Despite the satisfactory performance, the findings also suggest that further research is needed to enhance its capability for learning style discrimination.
INTRODUCTION
Kolb's experiential learning theory defines that knowledge is shaped by the individual's ability to absorb and transform experience. The model is composed of four learning modes arranged in a cyclic manner: Concrete Experience (CE), Reflective Observation (RO), Abstract Conceptualization (AC) and Active Experimentation (AE). The absorption dimension is constructed by the opposing modes of CE and AC. Conversely, the dialectic modes of RO and AE form the transformation dimension. Knowledge is then created as a response to contextual demands, through an ingenious process that requires interaction between the two learning dimensions. Hence, the learning process can be portrayed as a recursive cycle in which individuals experience, reflect, think and act [1].
Learning style variations arise from the unique individual preferences in resolving the conflict between being concrete or abstract and between being reflective or active [1]. These are influenced by educational specialization, past experience, context and gender [2]. Hence, as individuals mature, the construct represents a stable personality trait [3]. Conventionally, individual learning style is assessed via Kolb's Learning Style Inventory (LSI). Technically, the method evaluates the dominant modes on the absorption and transformation dimensions and maps the individual into the diverging, converging, assimilating or accommodating style [1].
The EEG is a non-invasive recording of electrical activity in the brain [4]. Apart from sleep studies, the technique has been widely used to study various psychological conditions such as bipolar disorder, schizophrenia [5] and autism [6]. Separately, the use of EEG has also extended to bio-behavioral research, encompassing areas such as cognition and development, emotional function and intelligence [7]. In general, the frontal region of the brain is responsible for cognition-related abilities [8]. Hemispheric lateralization of the frontal region has also revealed that the left side is dedicated to sequential and logical processes, while the right side specializes in emotion and social interaction capabilities [9].
From an analytical perspective, the EEG can be segregated into four major frequency ranges: the delta (0.5-4 Hz), theta (4-8 Hz), alpha (8-13 Hz) and beta (13-30 Hz) waves [6]. Each of these EEG bands bears exclusive information relating to different neurological behavior [4]. Studies have shown that delta and theta waves are inherently associated with deep and light sleep, respectively [10]. Alpha rhythms, however, become evident when the mind is in a conscious but resting state. As the brain participates in intense mental activity, the slower alpha rhythms are replaced by beta waves [4].
Advanced signal processing methods are often implemented to quantify spectral information within each frequency band. Evaluation of spectral features can be conducted via parametric and non-parametric techniques. Technically, the parametric method estimates a model-based power spectrum using the auto-regressive, moving average or auto-regressive moving average technique. Conversely, the non-parametric approach includes Welch's method for estimating the power spectrum from the time series. Despite its limitations, the method has been extensively used in various EEG studies [11]. Subsequently, for analysis purposes, the spectral information is usually processed into quantifiable features such as the well-known band power [12] and power ratio [13] descriptors.
Meanwhile, the Spectral Centroid Frequency (SCF) is an alternative form of feature, defined as the center of gravity of the frequency spectrum. Its inherent advantages are attributed to its robustness against white Gaussian noise and its reduced computational requirements. The feature has been successfully implemented in speech recognition [14]. Being relatively new, the SCF has recently been used to characterize stress [15] and intelligence [16] from the resting EEG.
The selected features are commonly used for pattern recognition purposes via techniques such as the Artificial Neural Network (ANN). The ANN is a supervised machine learning classifier that mimics the functioning of neurons in the brain [17]. The technique, which integrates the training dataset with the back-propagation learning algorithm, allows the network to iteratively update the weights between the nodes until the error is minimized [18]. Meanwhile, a separate testing dataset is used to gauge the generalization ability of the trained network. As an advanced technique, the ANN has been utilized for a myriad of biomedical applications, which include physiological analysis [19] and modelling [13], as well as disease recognition [20].
Current methods of evaluating learning styles are susceptible to inconsistency issues that arise from cultural and language variations [2]. To eliminate this drawback, an innovative learning style assessment technique based on the resting EEG has been proposed. Although it has yielded excellent performance with the k-nearest neighbor classifier [21], the potential of EEG sub-band SCF features has yet to be verified using other established methods. Hence, this paper extends that study by classifying Kolb's learning styles using the ANN. The study, however, is limited to the alpha and theta bands, as the inherent traits pertaining to working memory organization and attentional requirements exist at these frequency ranges [22].
MATERIALS AND METHODS
This section describes the methods used in the study. As shown in Fig. 1, the tasks comprise subject clustering and EEG acquisition, signal pre-processing and extraction of sub-band SCF features, pattern observation and removal of extreme outliers, dataset enrichment, network optimization, training and testing of the ANN, as well as correlation tests. It is important to note, however, that the initial phase of the study has been reproduced from a previous study [21].
EEG acquisition and sample clustering: A total of 68 university students (male, right-handed, age range = 18-37 years, mean age/standard deviation = 23.9/3.1 years) from various education specializations participated in the study. All procedures related to the experimental protocol and EEG recording had earlier been endorsed by the university's research ethics committee (600-RMI (5/1/6)). Initially, subjects were briefed regarding the experimental procedure and gave written consent. Subjects were then asked to relax in a seated position with both eyes closed. The EEG was then recorded from positions AF3 and AF4 of the prefrontal cortex via the Emotiv neuroheadset. For each session, the resting EEG was recorded for three minutes. Subsequently, subjects were required to complete the online Kolb's LSI. This allowed the samples to be clustered into the respective learning style groups.
Signal pre-processing, extraction of sub-band SCF features and data enrichment: The EEG pre-processing was performed offline in MATLAB 2014a using a high-pass filter and an automatic electrooculogram rejection method. Subsequently, only a 2 min 30 sec EEG segment was considered for further analysis. Samples were then filtered into the respective alpha and theta waves using band-pass filters [23]. The power spectral density for each frequency band was estimated through the Welch technique. As expressed in Eq. 1, each sub-band SCF feature is computed as the sum of amplitude-weighted frequencies divided by the sum of amplitudes:
$$\mathrm{SCF}_i = \frac{\sum_{f=1}^{N} f \, S[f] \, w_i[f]}{\sum_{f=1}^{N} S[f] \, w_i[f]} \tag{1}$$
where i represents the respective EEG band, N is the number of frequency bins and S[f]w_i[f] is the power of the spectrum at frequency component f within band i. Based on the results from Kolb's LSI, the obtained alpha and theta SCF features were then clustered into the diverger, converger, assimilator and accommodator groups.
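As an illustration of Eq. 1, the following minimal Python sketch computes a band-limited spectral centroid from a power spectral density that is assumed to be already available (in practice it could come from a Welch estimate). The function name `spectral_centroid`, the half-open band edges and the toy spectrum are assumptions made for illustration; the band window w_i[f] is taken here to be a simple rectangular selection, which the study does not specify.

```python
def spectral_centroid(freqs, psd, band):
    """Power-weighted mean frequency of `psd` inside `band` (in Hz)."""
    low, high = band
    weighted_sum = 0.0
    total_power = 0.0
    for f, p in zip(freqs, psd):
        if low <= f < high:  # assumed half-open band edges (rectangular window)
            weighted_sum += f * p
            total_power += p
    if total_power == 0.0:
        raise ValueError("no spectral power inside the requested band")
    return weighted_sum / total_power


# Made-up PSD on a 1 Hz grid from 0 to 30 Hz (illustrative values only,
# not data from the study).
freqs = [float(f) for f in range(31)]
psd = [0.0] * 31
psd[5] = 2.0   # theta-range components
psd[7] = 2.0
psd[9] = 1.0   # alpha-range components
psd[11] = 3.0

theta_scf = spectral_centroid(freqs, psd, (4.0, 8.0))
alpha_scf = spectral_centroid(freqs, psd, (8.0, 13.0))
```

With one such value per band and per channel (AF3 and AF4), each subject would be described by four SCF features.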
As previously reported, the converger and accommodator groups each comprise 14 samples, while the assimilator and diverger groups each comprise 20 samples. Two extreme outliers have been identified and removed, one each from the assimilator and accommodator groups. Hence, the total number of samples used prior to dataset enrichment is 66 [21].
Past studies have revealed that the performance of intelligent classifiers deteriorates with small class separation and unbalanced sample sizes between the control groups. To compensate for this limitation, the use of synthetic EEG has been proposed. The technique is realized by adding white Gaussian noise with a sufficiently conditioned signal-to-noise ratio (SNR) to the original EEG. In this study, an SNR of 30 dB has been adopted. A more detailed elaboration on the synthetic EEG can be obtained elsewhere [13]. The sample size has been increased to 40 per learning style group. It has been observed that the mean and feature distribution of the original (N = 66) and enriched (N = 160) datasets yielded a similar pattern [21]. Hence, the ensuing results focus only on the enriched dataset, which is then used for ANN classification.
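The noise-addition step can be sketched as follows. This is a minimal illustration under stated assumptions rather than the procedure actually used: the helper names `add_white_gaussian_noise` and `measured_snr_db`, the 128 Hz sampling rate, the 10 Hz sine standing in for an EEG trace and the fixed random seed are all hypothetical. Only the 30 dB target comes from the text.

```python
import math
import random


def add_white_gaussian_noise(signal, snr_db, rng):
    """Return a noisy copy of `signal` whose SNR is approximately `snr_db`."""
    signal_power = sum(x * x for x in signal) / len(signal)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    sigma = math.sqrt(noise_power)
    return [x + rng.gauss(0.0, sigma) for x in signal]


def measured_snr_db(clean, noisy):
    """Empirical SNR in dB between a clean signal and its noisy copy."""
    signal_power = sum(c * c for c in clean) / len(clean)
    noise_power = sum((n - c) ** 2 for c, n in zip(clean, noisy)) / len(clean)
    return 10.0 * math.log10(signal_power / noise_power)


# Stand-in for a recorded trace: 30 s of a 10 Hz sine at an assumed 128 Hz rate.
fs = 128
clean = [math.sin(2.0 * math.pi * 10.0 * t / fs) for t in range(fs * 30)]

rng = random.Random(0)  # fixed seed so the sketch is repeatable
synthetic = add_white_gaussian_noise(clean, snr_db=30.0, rng=rng)
achieved = measured_snr_db(clean, synthetic)
```

Each call with a different seed would yield a distinct synthetic trace from the same original recording, which is how a group could be topped up to the target size.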
Artificial neural network: The ANN comprises an input layer, several hidden layers and one output layer [18]. However, studies have revealed that a network with a single hidden layer is adequate to approximate an arbitrary function up to an acceptable level of accuracy [24]. In this study, alpha and theta SCF features from both sides of the antero-frontal region are used as inputs to the neural network, while a single output node is adopted to represent the corresponding learning style indexes.
Theoretically, an input vector, x_i, is transformed into a vector of hidden variables, u_j, through an activation function, Γ_1. The procedure can be mathematically expressed by Eq. 2:
$$u_j = \Gamma_1\left(\sum_{i=1}^{M} w_{ij} x_i + \theta_j\right) \tag{2}$$
where M represents the number of input nodes, w_ij are the weights between the ith input node and the jth hidden node, θ_j are the biases and Γ_1 is the hyperbolic tangent function.
A similar transformation is performed at the output node, in which the vector of hidden variables is mapped to the resultant output, y_k, via an activation function, Γ_2, as expressed in Eq. 3:
$$y_k = \Gamma_2\left(\sum_{j=1}^{N} w_{jk} u_j + \theta_k\right) \tag{3}$$
where N represents the number of hidden nodes, w_jk are the weights between the jth hidden node and the kth output node, θ_k is the bias and Γ_2 is the pure linear function.
Fig. 1: Overview of research methods
As expressed by Eq. 4, the output error, e, is then obtained as the difference between the desired output, y_d, and the computed output, y_k:
$$e = y_d - y_k \tag{4}$$
The computed error is then integrated into the back-propagation weight update procedure via the Levenberg-Marquardt algorithm. During network training, forward propagation and back-propagation are repeated with different sets of data until the error sufficiently converges. Convergence is evaluated in terms of the mean square error (MSE), which is given by Eq. 5:
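$$\mathrm{MSE} = \frac{1}{K} \sum_{n=1}^{K} e_n^{2} \tag{5}$$
where e_n is the output error at iteration n and K represents the total number of iterations [25].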
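The forward computation and error measures of Eqs. 2 to 5 can be sketched in a few lines of Python. This is a minimal illustration and not the trained network: the weights, biases, input values and desired output below are made up, the function names are hypothetical, and the sign convention for the error (desired minus computed) is an assumption.

```python
import math


def forward(x, w_hidden, theta_hidden, w_out, theta_out):
    """Tanh hidden layer (Eq. 2) followed by one linear output node (Eq. 3)."""
    u = [
        math.tanh(sum(w * x_i for w, x_i in zip(weights_j, x)) + theta_j)
        for weights_j, theta_j in zip(w_hidden, theta_hidden)
    ]
    y = sum(w * u_j for w, u_j in zip(w_out, u)) + theta_out
    return u, y


def mean_square_error(errors):
    """Average of the squared errors (Eq. 5)."""
    return sum(e * e for e in errors) / len(errors)


# Made-up example: 2 inputs, 2 hidden nodes, 1 output node.
x = [0.5, -1.0]
w_hidden = [[1.0, 0.0], [0.0, 1.0]]  # one weight list per hidden node
theta_hidden = [0.0, 0.0]
w_out = [2.0, 1.0]
theta_out = 0.1

u, y = forward(x, w_hidden, theta_hidden, w_out, theta_out)
y_d = 1.0            # hypothetical desired output
e = y_d - y          # Eq. 4, assumed sign convention
mse = mean_square_error([1.0, -1.0, 2.0])
```

In the configuration described in the text, `x` would hold the four SCF features and `w_hidden` would contain one weight list per hidden node.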
To avoid over-fitting, an early-stopping criterion has been adopted. In this approach, a separate validation dataset is used to intermittently assess the generalization ability of the network during training. Should the validation error increase, network training is stopped. Separately, the testing dataset plays no part in monitoring error convergence and is used only to assess the performance of the trained network [26]. For the purpose of this study, the 160-sample dataset has been randomly segregated into training, validation and testing subsets in a 70:15:15 ratio [27].
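The data partitioning and the stopping rule can be sketched as below. Both helpers are hypothetical illustrations: `split_indices` shows one way to obtain a random 70:15:15 partition of 160 samples, and `early_stop_epoch` shows a simple stopping rule in which the `patience` parameter (the number of consecutive rises tolerated) is an assumption that the text does not specify. The validation errors are made up.

```python
import random


def split_indices(n_samples, ratios=(0.70, 0.15, 0.15), seed=0):
    """Shuffle sample indices and cut them into train/validation/test lists."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = round(n_samples * ratios[0])
    n_val = round(n_samples * ratios[1])
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # remainder goes to testing
    return train, val, test


def early_stop_epoch(validation_errors, patience=1):
    """Epoch at which training stops: the first epoch where the validation
    error has risen `patience` times in a row, else the final epoch."""
    rises = 0
    for epoch in range(1, len(validation_errors)):
        if validation_errors[epoch] > validation_errors[epoch - 1]:
            rises += 1
            if rises >= patience:
                return epoch
        else:
            rises = 0
    return len(validation_errors) - 1


train, val, test = split_indices(160)
stop_at = early_stop_epoch([0.50, 0.40, 0.30, 0.35, 0.20])
```

For 160 samples the partition holds 112, 24 and 24 samples, and the made-up error sequence stops at epoch index 3, where the validation error first rises.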
Separately, the number of nodes in the hidden layer is determined by combining the constructive and pruning methods [28]. The optimization procedure utilizes the constructive algorithm while considering the rules of thumb for selecting boundary conditions; thus, the minimum and maximum limits have been set at 3 and 7, respectively. As illustrated in Fig. 2, network training starts with the minimum number of hidden nodes.
For each configuration, the process is repeated for 40 epochs. This approach is based on the notion that network training will restart at varying initial weights and biases. Thus, an ideal number of hidden nodes would induce the best average performance, regardless of the Mersenne Twister pseudorandom generator settings. The procedure is then repeated with an increasing number of hidden nodes until the maximum limit of 7 is reached. Consequently, the optimum selection is based on the highest average training accuracy with the lowest MSE.
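The selection loop described above can be sketched as follows. This is an illustrative outline only: `select_hidden_nodes` is a hypothetical helper, and `fake_train_once` is a stand-in that returns made-up accuracy and MSE values instead of training a real network. Python's `random` module happens to use the Mersenne Twister generator, so varying the seed plays the role of varying the initial weights and biases.

```python
import random


def select_hidden_nodes(train_once, n_min=3, n_max=7, repeats=40):
    """Average accuracy and MSE over `repeats` restarts for each hidden-layer
    size, then pick the size with the highest mean accuracy (lowest mean MSE
    breaks ties)."""
    summary = {}
    for n_hidden in range(n_min, n_max + 1):
        runs = [train_once(n_hidden, seed) for seed in range(repeats)]
        mean_accuracy = sum(acc for acc, _ in runs) / repeats
        mean_mse = sum(mse for _, mse in runs) / repeats
        summary[n_hidden] = (mean_accuracy, mean_mse)
    best = max(summary, key=lambda n: (summary[n][0], -summary[n][1]))
    return best, summary


def fake_train_once(n_hidden, seed):
    """Stand-in for one training run; returns made-up (accuracy, MSE)."""
    rng = random.Random(1000 * n_hidden + seed)
    accuracy = 0.60 + 0.04 * n_hidden + rng.uniform(-0.01, 0.01)
    mse = 0.30 - 0.03 * n_hidden + rng.uniform(-0.005, 0.005)
    return accuracy, mse


best, summary = select_hidden_nodes(fake_train_once)
```

Replacing `fake_train_once` with a function that trains and scores an actual network would leave the selection logic unchanged.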
Correlation tests have been selected to validate the unbiasedness of the ANN model. The ANN is considered to be unbiased if the residuals are uncorrelated with all linear and non-linear combinations of past outputs and inputs. In this study, the Auto Correlation Function (ACF) test computes the correlation between the residuals and their lagged versions. The Cross Correlation Function (CCF) test, in turn, computes the correlation between the residuals and the outputs. Ideally, the model is assumed to be representative of the modeled relationship when the correlations at all lags lie within the 95% confidence limit, with the exception of lag 0 for the autocorrelation [29].
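A minimal sketch of the two correlation tests is given below. The helper names are hypothetical, only non-negative lags are computed, and the 95% limit is taken as the common large-sample approximation of ±1.96/√N, which is an assumption since the text does not state how the limit was obtained. The residual sequence is made up.

```python
import math


def cross_correlation(a, b, max_lag):
    """Normalized correlation between a[t] and b[t + lag], lag = 0..max_lag."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    ca = [v - mean_a for v in a]
    cb = [v - mean_b for v in b]
    scale = math.sqrt(sum(v * v for v in ca) * sum(v * v for v in cb))
    return [
        sum(ca[t] * cb[t + lag] for t in range(n - lag)) / scale
        for lag in range(max_lag + 1)
    ]


def autocorrelation(residuals, max_lag):
    """ACF is the cross-correlation of the residuals with themselves."""
    return cross_correlation(residuals, residuals, max_lag)


def confidence_limit(n):
    """Assumed large-sample 95% bound for a white residual sequence."""
    return 1.96 / math.sqrt(n)


# Made-up residuals that alternate in sign, hence strongly autocorrelated.
residuals = [1.0, -1.0] * 4
acf = autocorrelation(residuals, max_lag=2)
limit = confidence_limit(len(residuals))
outside = [lag for lag, r in enumerate(acf) if lag > 0 and abs(r) > limit]
```

For an adequate model, `outside` would be empty or nearly so; the alternating toy residuals deliberately violate this to show what a failed test looks like.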
RESULTS AND DISCUSSION
The results first briefly cover the replicated sub-band SCF feature pattern with synthetic EEG. This is followed by an analysis of the optimization of the ANN structure. Finally, results pertaining to the classification of learning styles via EEG sub-band SCF features and the optimized ANN are elaborated. ACF and CCF tests are also included to verify that the model sufficiently represents the relationship between the SCF features and learning styles.
Pattern of alpha and theta SCF features: Results in Fig. 3 indicate a distinguishable pattern in the mean alpha and theta SCF among the learning style groups. However, the feature distributions within the 95% confidence interval reveal a significant overlap between the learning style groups [21].
Optimization of ANN structure: Figure 4 shows the effects of hidden node variations on network training. The highest average training accuracy and the lowest MSE have been attained with 7 hidden nodes.
The resulting network structure is summarized in Table 1. The ensuing work adopts the optimized ANN for classification of learning styles from the EEG sub-band SCF features.
Classification of Kolb's learning styles: The ANN has been successfully trained to classify Kolb's learning styles from the resting EEG. Table 2 shows the classification accuracy and MSE during both training and testing. Results have shown satisfactory performance, with accuracies of 85.1 and 91.3% for training and testing, respectively.
Fig. 2: Modified constructive algorithm for optimal number of hidden nodes
Fig. 3(a-b): Mean (a) alpha and (b) theta SCF with 95% confidence interval (N = 160)
Table 1: Optimized ANN structure
Fig. 4(a-b): Effects of hidden node variations on average training (a) accuracy and (b) MSE
Fig. 5(a-b): ACF tests for network (a) training and (b) testing
Meanwhile, error was minimal for both training and testing, with an MSE of less than 0.1. Table 3 shows the positive predictivity and sensitivity for all learning style groups. Comparatively, results have revealed that during training and testing, the classifier attained the highest positive predictivity and sensitivity toward the accommodator group, followed by the convergers and then the divergers and assimilators.
The classification performance for each learning style group can be related to the pattern of alpha and theta SCF, in which the overlapping of features in the accommodator group is relatively minimal, hence indicating good class separation. For the assimilators, however, the distributions of features in both sub-bands have a comparatively higher degree of correlation, particularly with both the convergers and divergers. This explains the classifier behavior toward each of the learning style groups.
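For reference, positive predictivity and sensitivity can be derived per class from a confusion matrix, as in the minimal sketch below. The helper `per_class_metrics` is hypothetical, and the counts are made up for illustration; they are not the values reported in Table 3. Rows are assumed to hold the true class and columns the predicted class.

```python
def per_class_metrics(confusion, labels):
    """Return {label: (positive_predictivity, sensitivity)} for each class.

    confusion[i][j] counts samples of true class i predicted as class j.
    """
    metrics = {}
    for k, label in enumerate(labels):
        true_positive = confusion[k][k]
        predicted_as_k = sum(row[k] for row in confusion)  # column total
        actually_k = sum(confusion[k])                      # row total
        ppv = true_positive / predicted_as_k if predicted_as_k else float("nan")
        sens = true_positive / actually_k if actually_k else float("nan")
        metrics[label] = (ppv, sens)
    return metrics


labels = ["diverger", "converger", "assimilator", "accommodator"]
# Made-up counts for illustration only (not results from the study).
confusion = [
    [8, 1, 1, 0],
    [1, 9, 0, 0],
    [2, 1, 7, 0],
    [0, 0, 0, 10],
]
metrics = per_class_metrics(confusion, labels)
```

Positive predictivity answers how often a predicted label is correct, while sensitivity answers how many members of a group are recovered, so a group with well-separated features scores highly on both.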
Meanwhile, Fig. 5 shows the results of the ACF tests for network training and testing. With the exception of lag 0, the majority of the other lags are within the 95% confidence limit. Hence, this indicates that the correlation between the original and lagged residuals is almost non-existent.
Table 2: Classification performance
Table 3: Positive predictivity and sensitivity measures during training and testing
Subsequently, Fig. 6 shows the results of the CCF tests for network training and testing. Similar to the ACF tests, the majority of the correlations are also within the 95% confidence limit. It is important to note that for network testing, moderate correlation has been attained at lag 0. Although this is negligible, further investigation is still needed to model the relationship more completely.
Fig. 6(a-b): CCF tests for network (a) training and (b) testing
CONCLUSION
The ANN has been successfully implemented to classify Kolb's learning styles from the sub-band SCF features of the resting EEG. Despite the satisfactory performance, further investigation is required since the proposed SCF features have yet to fully represent their relationship to the learning styles. Future studies may consider the EEG sub-band spectral centroid amplitude to complement the SCF features for an improved ANN model. For comparative purposes, implementation of alternative classification techniques is also recommended.
ACKNOWLEDGMENTS
The authors extend their gratitude to Universiti Teknologi MARA and the Ministry of Higher Education, Malaysia for providing financial support through the Fundamental Research Grant Scheme (Phase 1/2015).