Abstract: Background and Objective: Existing work on human activity recognition mainly focuses on recognizing the activities of a single resident. However, in real life, activities are often performed by multiple users. This study aimed to recognize multiple-resident activities inside the home using deep neural networks and an ontological approach for feature selection. Materials and Methods: The model comprised an ontological approach for robust feature extraction and selection and a Deep Belief Network (DBN) algorithm for recognising three categories of multiple-resident activities inside the home. A simulated experiment was conducted using two publicly available multiple-resident CASAS databases collected at Washington State University (WSU), and the proposed approach was compared with traditional recognition approaches such as Support Vector Machine (SVM) and Artificial Neural Network (ANN). Results: The proposed approach based on DBN and ontology produced better accuracy than SVM and ANN. Conclusion: In this research, a deep neural network algorithm was successfully developed to recognize daily-life human activities using manually extracted features.
INTRODUCTION
The key point in the development of smart homes is the recognition of the normal, daily routine activities of their residents. This recognition can reduce the costs of health and elderly care, which exceed $7 trillion annually worldwide and are rising1,2. It helps ensure comfort, homecare3 and safety, and reduces energy consumption. For these reasons, human activity recognition has been the focus of much research for nearly two decades. In fact, a large amount of research addresses the recognition of Activities of Daily Living4 (ADLs), that is, activities performed in a resident’s daily routine, such as eating, cooking, sleeping and toileting. There are various reasons why the literature mostly covers ADLs. First, ADLs are general activities common to both young and old people. Second, ADLs are the most used in standard tests of resident autonomy; disability with ADLs is the most common reason that older people live in nursing facilities5. Finally, ADLs are the best suited as inputs to different home applications. For these reasons, this research focuses on recognizing ADLs.
Most existing works on activity recognition focus on recognizing the activities of one resident in a smart home. However, in real life, multiple inhabitants often live in the same house and perform ADLs together or concurrently6. Recognizing multiple residents’ activities is challenging for several reasons: it requires an appropriate number of sensors, suitable methods to model the interactions of multiple residents, and filtering of noise from the raw data. Carrying out sensor data fusion in such settings to achieve sufficient accuracy for multiple residents’ activity recognition is still an open research issue7.
In this study, multiple residents’ activities are classified into three broad categories:
| Single resident performs activities one by one: A single resident performs activities sequentially and independently (e.g., personal hygiene or bed to toilet transition). Most of the literature has focused on this type of ADLs |
| Multiple residents perform the same activity together: Two or more residents do an activity in a cooperative or participatory manner (e.g., two or more residents are eating a meal or watching TV together) |
| Multiple residents perform different activities independently8: In this category, two or more residents perform different activities simultaneously (e.g., one resident watches TV while another prepares a meal) |
In this study, the recognition of these three categories of multiple residents’ activities has been addressed. As far as multi-resident activity recognition is concerned, few articles have been published on the subject and few experiments have been made in real conditions. Many of the studies are done on simple scenarios in which multiple residents perform different activities independently9,10, although parallel exclusive and cooperative activities are the most frequent in nature. To the best of our knowledge, no work has addressed all types of activities. There are, however, some existing works on recognizing multi-resident or group activities in pervasive computing9-12. These works and others are still in a development phase because of the complexity of multi-resident states and activities.
Due to their low cost, low power consumption and respect for privacy, emerging sensor-based approaches have become a centre of interest over the last decade. Researchers have commonly tested machine learning for recognizing activities based on sensor readings. Typical static approaches include Naive Bayes (NB)13, decision trees14 and Support Vector Machine (SVM)15. Temporal approaches include Hidden Markov Model (HMM)16, knowledge-driven approach (KDA)17, Conditional Random Field (CRF)18 and Evolutionary Ensembles Model (EEM)19. In addition, researchers have trained neural network algorithms inspired by the architecture of the brain. Neural networks have been used in various studies in recent years to recognize human activities and actions20-23 and appear successful and more efficient than other machine learning algorithms.
Deep Belief Networks (DBN) have attracted many activity recognition researchers because of two major advantages: they can learn many more parameters than discriminative models without overfitting, and it is easy to see what the network has learned by generating from its model24. For instance, Fang and Hu25 used a DBN with 4 hidden layers to recognize human activities; the results were compared with hidden Markov model and Naïve Bayes, and the highest accuracy obtained was 79.32%. Hassan et al.26 used a DBN with two hidden layers and 100 inputs for activity recognition and compared it with Support Vector Machine (SVM) and Artificial Neural Network (ANN), both of which it outperformed. The highest accuracy obtained was 89.61%.
This study proposes an ontological approach for feature extraction, combined with a deep belief network in supervised learning, in order to recognise the three types of multiple-resident activities described above. The combination of these two approaches not only increases the accuracy of the results compared to the literature, but also reduces the training complexity27. In Oukrich et al.28, multi-resident activity recognition was explored using the back-propagation algorithm and auto-encoder feature selection. The method described in this study outperformed that previous work and gave pertinent results.
MATERIALS AND METHODS
The proposed system is composed of three main parts: sensing, feature extraction and recognition. The first part was data collection, used as input to the Human Activity Recognition (HAR) system. For this study, two types of emerging sensors in the two smart homes were selected for data collection: motion sensors and door sensors. The sensors provided continuous data over a long period. The second part was feature extraction, in which an ontological approach was used to extract the features most relevant and adequate for the learning part. The third part of the system modelled activities from the features via deep learning, where a DBN was adopted.
This study took more than six months of tests and research to carry out the proposed system, and all research related to it was done in the IT laboratory of Mohammed V University in Rabat, Morocco.
Data collection: Two multiple-resident datasets from the Centre for Advanced Studies in Adaptive Systems (CASAS)29 were used to evaluate the approach:
| The Tulum data set was collected from April to July 2009. The apartment housed two married residents, who performed 10 normal daily activities. This data set contained two categories of activities: a single resident performing activities one by one and multiple residents performing the same activity together. This data set contained 1513 samples |
| The Twor data set was collected in the WSU smart apartment test bed during the 2009-2010 academic year. The apartment housed two residents, R1 and R2, at this time, and they performed 26 normal daily activities. The multi-resident activity category covered in this database represented activities performed by several residents independently but not concurrently. This data set contained 3896 samples |
Activity details are given in Table 1.
Fig. 1: | Ontological representation of activity |
Table 1: | ADLs activities of Tulum and Twor datasets |
Proposed approach to extract features: To achieve a better representation of ADLs, extracting as many relevant features as possible is essential. To this end, an ontology approach was used as the feature space to represent the training dataset and extract information from raw data. As shown in Fig. 1, a relationship was established between an activity and other entities. Based on this ontological approach, 17 features were extracted, as detailed below:
| $S_i = \frac{1}{n_i}\sum_{k=1}^{n_i} S_{ik}$ |
| where $S_i$ is the mean of the sensor IDs of activity $i$, $n_i$ is the number of motion and door sensors noted in the dataset between the beginning and end of the activity and $S_{ik}$ is the $k$th sensor ID |
| The logical value of the first Sensor ID triggered by the current activity |
| The logical value of the second Sensor ID triggered by the current activity |
| The logical value of the last Sensor ID triggered by the current activity |
| The logical value of the second-to-last Sensor ID triggered by the current activity |
| The name of the first sensor triggered by the current activity |
| The name of the last sensor triggered by the current activity |
| The variance of all Sensor IDs triggered by the current activity |
| The beginning time of the current activity |
| The ending time of the current activity |
| The duration of the current activity |
| Day of week, which is converted into a value in the range of 0-6 |
| Previous activity, which represents the activity that occurred before the current activity |
| Activity length, which is the number of instances between the beginning and the end of current activity |
| The name of the dominant sensor during the current activity |
| Location of the dominant sensor |
| Frequency of the dominant sensor |
Based on the above-mentioned features, an algorithm was implemented in C++ to extract the features from raw data.
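As an illustration only, the following Python sketch computes a subset of the listed features for one annotated activity segment. The `Event` record, its field names and the dictionary keys are hypothetical, numeric sensor IDs are assumed, and the population variance and the Monday = 0 weekday convention are assumptions. It is not the C++ implementation used in this work.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime
from statistics import mean, pvariance


@dataclass
class Event:
    """One sensor reading inside an activity segment (hypothetical layout)."""
    timestamp: datetime
    sensor_id: int      # numeric sensor ID (assumed encoding)
    sensor_name: str    # sensor label as written in the raw data


def extract_features(events):
    """Compute a subset of the activity features from time-ordered events."""
    ids = [e.sensor_id for e in events]
    begin, end = events[0].timestamp, events[-1].timestamp
    # Dominant sensor: the sensor that fired most often during the activity.
    dominant, frequency = Counter(e.sensor_name for e in events).most_common(1)[0]
    return {
        "mean_sensor_id": mean(ids),             # S_i, mean of the sensor IDs
        "sensor_id_variance": pvariance(ids),    # population variance (assumption)
        "first_sensor": events[0].sensor_name,
        "last_sensor": events[-1].sensor_name,
        "begin_time": begin,
        "end_time": end,
        "duration_s": (end - begin).total_seconds(),
        "day_of_week": begin.weekday(),          # 0-6, Monday = 0 (assumption)
        "activity_length": len(events),          # number of instances
        "dominant_sensor": dominant,
        "dominant_sensor_frequency": frequency,
    }
```

The remaining features (for example the previous activity and the location of the dominant sensor) would need the activity annotations and a sensor-to-room map, which are not modelled here.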
DBN used in the proposed work: A Deep Belief Network (DBN) is built by stacking Restricted Boltzmann Machines (RBM)30 and training them in a greedy, layer-wise manner. Training a DBN has two basic parts: pre-training and fine-tuning. Once the network was pre-trained based on RBMs, fine-tuning was performed using supervised gradient descent. Specifically, a logistic regression classifier was used to classify the input based on the output of the last hidden layer of the DBN.
Fig. 2: | Structure of a DBN used in this work with 17 neurons in input layer and 3 hidden layers |
That is, once the RBM of the first hidden layer was trained, its outputs (hidden activations) were used as inputs to the RBM of the second hidden layer. Figure 2 shows the three hidden layers used in this study. Following the proposal of Hinton et al.24, this work used the contrastive divergence (CD) algorithm to train the RBMs in a supervised scenario.
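The greedy layer-wise pre-training described above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in this work: inputs are assumed to be scaled to [0, 1], a single contrastive divergence step (CD-1) is used, the hidden layer sizes, learning rate and epoch count are hypothetical, and the supervised fine-tuning with the logistic regression classifier is omitted.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class RBM:
    """Bernoulli restricted Boltzmann machine trained with CD-1."""

    def __init__(self, n_visible, n_hidden, rng):
        self.rng = rng
        self.W = [[rng.gauss(0.0, 0.01) for _ in range(n_hidden)]
                  for _ in range(n_visible)]
        self.b = [0.0] * n_visible   # visible biases
        self.c = [0.0] * n_hidden    # hidden biases

    def hidden_probs(self, v):
        return [sigmoid(self.c[j] + sum(v[i] * self.W[i][j] for i in range(len(v))))
                for j in range(len(self.c))]

    def visible_probs(self, h):
        return [sigmoid(self.b[i] + sum(h[j] * self.W[i][j] for j in range(len(h))))
                for i in range(len(self.b))]

    def cd1_update(self, v0, lr):
        """One contrastive divergence step on a single training vector."""
        h0 = self.hidden_probs(v0)                        # positive phase
        h0_sample = [1.0 if self.rng.random() < p else 0.0 for p in h0]
        v1 = self.visible_probs(h0_sample)                # reconstruction
        h1 = self.hidden_probs(v1)                        # negative phase
        for i in range(len(v0)):
            for j in range(len(h0)):
                self.W[i][j] += lr * (v0[i] * h0[j] - v1[i] * h1[j])
            self.b[i] += lr * (v0[i] - v1[i])
        for j in range(len(h0)):
            self.c[j] += lr * (h0[j] - h1[j])


def pretrain_dbn(data, hidden_sizes, epochs=5, lr=0.1, seed=0):
    """Train one RBM per hidden layer; each layer's output feeds the next."""
    rng = random.Random(seed)
    rbms, layer_input, n_in = [], data, len(data[0])
    for n_hidden in hidden_sizes:
        rbm = RBM(n_in, n_hidden, rng)
        for _ in range(epochs):
            for v in layer_input:
                rbm.cd1_update(v, lr)
        rbms.append(rbm)
        layer_input = [rbm.hidden_probs(v) for v in layer_input]
        n_in = n_hidden
    return rbms


def propagate(rbms, v):
    """Map an input vector to the activations of the last hidden layer."""
    for rbm in rbms:
        v = rbm.hidden_probs(v)
    return v
```

For a network with 17 inputs and three hidden layers, a call such as `pretrain_dbn(data, [12, 8, 6])` would produce the stacked RBMs, where the sizes 12, 8 and 6 are placeholders; the output of `propagate` is what a classifier on top of the last hidden layer would receive.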
RESULTS
For the experiments, as described above, two databases were used to validate the proposed approach. The Twor database had 3896 events and the Tulum database had 1367 events, of which 70% were used for training and 30% for testing. It should be noted that the number of training and testing samples was not evenly distributed across activities: some activities contained a huge number of samples whereas others had very few. The number of inputs was fixed for the three algorithms to ensure a fair comparison. Several numbers of hidden layers were tested for both the BPA and DBN algorithms and the best results were kept.
The experiments started with the Back-Propagation Algorithm (BPA). This algorithm was run several times using different numbers of hidden layers and a fixed number of inputs and outputs according to the dataset. The best mean recognition rate was 88.75% on the Twor dataset and 76.79% on the Tulum dataset. The back-propagation-based experimental results are shown in Table 2. After that, Support Vector Machines (SVMs) were applied, yielding a best mean recognition rate of 87.42% on the Twor dataset and 73.52% on the Tulum dataset. The SVM-based experimental results are reported in Table 3. Finally, the proposed approach was tested and yielded the highest recognition rate: 90.23% on the Twor dataset and 78.49% on the Tulum dataset.
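One possible way to obtain a 70/30 split when activities are unevenly represented is to split each activity separately, so that rare activities still appear in the training set. The sketch below illustrates this idea; stratification, the function name `stratified_split` and the fixed seed are assumptions, since the splitting procedure is not detailed here.

```python
import random
from collections import defaultdict


def stratified_split(labels, train_fraction=0.7, seed=0):
    """Return (train_indices, test_indices), split per activity label."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    train, test = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        # Keep at least one sample of every activity in the training set.
        n_train = max(1, round(train_fraction * len(indices)))
        train.extend(indices[:n_train])
        test.extend(indices[n_train:])
    return sorted(train), sorted(test)
```

With this scheme, an activity that has only a handful of samples contributes at least one of them to training, at the cost of leaving very few (possibly zero) test samples for that activity.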
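To make the metric concrete, the sketch below computes a recognition rate per activity and then averages these rates. Treating the mean recognition rate as the unweighted average over activities is an assumption, and the function name `recognition_rates` is hypothetical.

```python
from collections import Counter


def recognition_rates(y_true, y_pred):
    """Return (per-activity rates, their unweighted mean, overall accuracy)."""
    total = Counter(y_true)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    per_activity = {activity: correct[activity] / n for activity, n in total.items()}
    mean_rate = sum(per_activity.values()) / len(per_activity)
    overall = sum(correct.values()) / len(y_true)
    return per_activity, mean_rate, overall
```

Under this definition, an activity with very few samples weighs as much as a frequent one, so the mean rate and the overall accuracy can differ noticeably on imbalanced data.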
Table 2: | HAR-experiment results using back-propagation based approach |
Table 3: | HAR-experiment results using traditional SVM-based approach |
Thus, the proposed approach showed superiority over the others. Table 4 presents the experimental results using the proposed approach. Figures 3 and 4 compare the accuracies of the three models (BPA, SVM and DBN) on the different activities of the Twor and Tulum datasets.
DISCUSSION
The experimental results confirmed that DBN achieved higher accuracy than the other algorithms. This finding agrees with other recent research in the field of human activity recognition25,26,31.
Fig. 3: | Comparison of BPA, SVM and DBN of Twor dataset |
Table 4: | HAR-experiment results using DBN-based approach |
However, this result does not mean that DBN is superior to other deep learning algorithms in the activity recognition field. Technically, no model outperforms all the others in all situations32, so it is recommended to choose models based on several criteria explained in detail in the survey by Wang32.
Because the number of samples differed across the tested activities, a weak mean recognition rate does not necessarily indicate poor accuracy26,33,34. For instance, the activity Group_Meeting, in which the two married residents performed the same activity together, was difficult to recognize because it occurred in very few instances. Moreover, the meeting place was not fixed and, based on emerging sensors, the system cannot detect the presence of two residents in the same place.
Recognition of activities when multiple residents performed different activities independently was comparatively easy. In general, some activities were specific to the woman and others were generally carried out by the man, and the way the woman performed an activity differed from the man’s. From this difference, the proposed algorithm learned to detect both the activity and the person who carried it out.
Fig. 4: | Comparison of BPA, SVM and DBN of Tulum dataset |
CONCLUSION
This study applied three machine learning algorithms to represent and recognize human activities. From the results, it can be concluded that the DBN algorithm performed better than SVM and BPA. The main reasons are that DBN was the most suitable for ADLs and had a strong ability to learn to interpret complex sensor events in smart home environments. Furthermore, the robust feature set, manually extracted one by one, yielded higher human activity recognition accuracy since it takes the specificity of the database into account.
SIGNIFICANCE STATEMENT
This research paper studies the efficacy of using a deep neural network in human activity recognition based on efficient features manually extracted one by one and compares it with traditional recognition approaches such as Support Vector Machine (SVM) and Back-Propagation Algorithm (BPA). Additionally, it highlights the recognition of multiple residents’ activities inside the home so as to come closer to real life.