
Deep learning models have achieved significant success in human activity recognition, particularly in assisted living and telemonitoring. However, training these models requires substantial amounts of labeled data, which are time-consuming and costly to acquire in real-world environments. Contrastive self-supervised learning has recently garnered attention in sensor-based activity recognition as a way to mitigate the need for expensive large-scale data collection and annotation. Despite numerous published papers on the topic, there remains a lack of literature reviews highlighting recent advances in contrastive self-supervised learning for sensor-based activity recognition. This paper extensively reviews 43 recent papers on contrastive self-supervised learning methods for sensor-based human activity recognition, excluding those based on video or audio sensors due to privacy concerns. First, we summarize the taxonomy of contrastive self-supervised learning, followed by a detailed description of the contrastive learning models used for activity recognition and their main components. Next, we comprehensively review data augmentation methods for sensor data and commonly used benchmark datasets for activity recognition. We then present empirical performance comparisons of different methods on benchmark datasets under linear evaluation, semi-supervised learning, and transfer learning scenarios. From these comparisons, we derive significant insights into the selection of contrastive self-supervised models for sensor-based activity recognition. Finally, we discuss the limitations of current research and outline promising directions for future exploration.
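The contrastive objective shared by many of the reviewed methods (e.g., SimCLR's NT-Xent loss) can be sketched as follows. This is a minimal illustration, not code from any reviewed paper; the batch size, embedding dimension, and temperature value are illustrative assumptions.

```python
import numpy as np

def nt_xent_loss(z1, z2, temperature=0.5):
    """NT-Xent (normalized temperature-scaled cross-entropy) loss.

    z1, z2: (N, D) embeddings of two augmented views of the same N sensor
    windows. Positive pairs are (z1[i], z2[i]); every other sample in the
    batch serves as a negative.
    """
    z = np.concatenate([z1, z2], axis=0)              # (2N, D)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)  # L2-normalize embeddings
    sim = z @ z.T / temperature                       # scaled cosine similarities
    np.fill_diagonal(sim, -np.inf)                    # exclude self-similarity
    n = len(z1)
    pos = np.concatenate([np.arange(n, 2 * n), np.arange(n)])  # positive index of each row
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    return -log_prob[np.arange(2 * n), pos].mean()

# Toy usage: perfectly aligned views yield a lower loss than unrelated ones.
rng = np.random.default_rng(0)
a = rng.normal(size=(8, 16))
loss_aligned = nt_xent_loss(a, a)
loss_random = nt_xent_loss(a, rng.normal(size=(8, 16)))
print(f"aligned: {loss_aligned:.3f}  random: {loss_random:.3f}")
```

Minimizing this loss pulls the two views of the same window together in embedding space while pushing apart all other windows in the batch.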
Deep learning models have significantly contributed to recognizing older adults’ daily activities for telemonitoring and assistance. However, recognizing human activities in real-world smart homes over the long term presents substantial challenges: obtaining the ground truth is time-consuming and costly, yet it is crucial for training and improving deep learning models. Inspired by the impressive performance of self-supervised learning models, this paper utilizes a model based on the SimCLR framework and a self-attention mechanism for downstream human activity recognition. The model leverages the limited and intermittent activity labels collected by the Label Older Adults’ Daily Activities (LOADA) application, which was deployed to acquire activity labels in the real-world, uncontrolled smart homes of three young people and two older adults for over one month. The experimental results demonstrate strong recognition performance, both in semi-supervised learning with limited labels and in transfer learning scenarios where representations learned in one smart home are transferred to another. This research could inspire other researchers in the human activity recognition community to overcome labeling challenges when monitoring older adults in real-world scenarios.
Deep learning models have gained prominence in human activity recognition using ambient sensors, particularly for telemonitoring older adults’ daily activities in real-world scenarios. However, collecting large volumes of annotated sensor data presents a formidable challenge, given the time-consuming and costly nature of traditional manual annotation, especially for extensive projects. In response, we propose AttCLHAR, a novel model rooted in the self-supervised learning framework SimCLR and augmented with a self-attention mechanism. The model is designed for human activity recognition from ambient sensor data, explicitly tailored to scenarios with limited or no annotations. AttCLHAR comprises unsupervised pre-training and fine-tuning phases that share a common encoder module with two convolutional layers and a long short-term memory (LSTM) layer. The encoder output is further connected to a self-attention layer, allowing the model to selectively focus on different segments of the input sequence. Sharpness-aware minimization (SAM) is incorporated to enhance generalization by penalizing loss sharpness. The pre-training phase focuses on learning representative features from abundant unlabeled data, capturing both spatial and temporal dependencies in the sensor data and thereby extracting informative features for subsequent fine-tuning. We extensively evaluated AttCLHAR on three CASAS smart home datasets (Aruba-1, Aruba-2, and Milan), comparing its performance against the SimCLR framework, SimCLR with SAM, and SimCLR with a self-attention layer. The experimental results demonstrate the superior performance of our approach, especially in semi-supervised and transfer learning scenarios: it outperforms existing models, marking a significant advancement in using self-supervised learning to extract valuable insights from unlabeled ambient sensor data in real-world environments.
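The encoder described in this abstract (two convolutional layers, an LSTM layer, and a self-attention layer that pools over time) can be sketched as follows. All layer sizes, the kernel width, the additive form of the attention, and the window length are illustrative assumptions, since the abstract does not fix them.

```python
import numpy as np

rng = np.random.default_rng(0)

def conv1d_relu(x, w, b):
    """Valid 1-D convolution over time; x: (T, C_in), w: (k, C_in, C_out)."""
    k = w.shape[0]
    out = np.stack([np.einsum('kc,kco->o', x[t:t + k], w) + b
                    for t in range(x.shape[0] - k + 1)])
    return np.maximum(out, 0.0)              # ReLU activation

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm(x, Wx, Wh, b):
    """Minimal LSTM; gate weights stacked as [input, forget, cell, output]."""
    H = Wh.shape[0]
    h, c = np.zeros(H), np.zeros(H)
    hs = np.empty((x.shape[0], H))
    for t in range(x.shape[0]):
        g = x[t] @ Wx + h @ Wh + b
        i, f, u, o = g[:H], g[H:2*H], g[2*H:3*H], g[3*H:]
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(u)
        h = sigmoid(o) * np.tanh(c)
        hs[t] = h
    return hs                                # hidden state at every time step

def encoder(x, p):
    h = conv1d_relu(x, p['w1'], p['b1'])     # spatial features, layer 1
    h = conv1d_relu(h, p['w2'], p['b2'])     # spatial features, layer 2
    hs = lstm(h, p['Wx'], p['Wh'], p['bl'])  # temporal dependencies
    scores = np.tanh(hs @ p['Wa']) @ p['v']  # additive self-attention scores
    alpha = np.exp(scores - scores.max())
    alpha /= alpha.sum()                     # attention weights over time steps
    return alpha @ hs                        # fixed-size window representation

def init_params(c_in=8, c1=16, c2=16, h=32, k=3, a=16):
    s = lambda *shape: rng.normal(scale=0.1, size=shape)
    return dict(w1=s(k, c_in, c1), b1=np.zeros(c1),
                w2=s(k, c1, c2), b2=np.zeros(c2),
                Wx=s(c2, 4*h), Wh=s(h, 4*h), bl=np.zeros(4*h),
                Wa=s(h, a), v=s(a))

params = init_params()
window = rng.normal(size=(50, 8))  # 50 time steps of 8 ambient-sensor channels
rep = encoder(window, params)
print(rep.shape)  # → (32,)
```

The attention weights let the representation emphasize informative segments of the sensor window rather than treating all time steps equally, which is the motivation given for the self-attention layer above.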
Human activity recognition (HAR) using ambient sensors has emerged as a promising approach to telemonitoring daily activities and enhancing older adults’ quality of life. Deep learning models have demonstrated competitive HAR performance on real-world datasets. However, acquiring large amounts of annotated sensor data for extracting robust features is costly and time-consuming. To overcome this limitation, we propose a novel model based on the self-supervised learning framework SimCLR for daily activity recognition using ambient sensor data. The core component of the model is the encoder module, which consists of two convolutional layers followed by a long short-term memory (LSTM) layer. This architecture allows the model to capture both spatial and temporal dependencies in the sensor data, enabling the extraction of informative features for downstream tasks. Through extensive experiments on three CASAS smart home datasets (Aruba-1, Aruba-2, and Milan), we showcase the superior performance of the model in semi-supervised learning and transfer learning scenarios, surpassing state-of-the-art approaches. The findings highlight the potential of self-supervised learning for extracting valuable information from unlabeled sensor data, reducing costly annotation efforts for real-world HAR applications.
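The semi-supervised and transfer learning evaluations mentioned above typically follow a linear-probe protocol: the pre-trained encoder is frozen, and only a softmax classifier is trained on the few labeled windows. A minimal sketch of that protocol is given below; the features and labels are synthetic placeholders standing in for frozen-encoder outputs, not data from the CASAS datasets.

```python
import numpy as np

rng = np.random.default_rng(0)

def train_linear_probe(feats, labels, n_classes, lr=0.1, epochs=500):
    """Multinomial logistic regression on frozen encoder features."""
    n, d = feats.shape
    W, b = np.zeros((d, n_classes)), np.zeros(n_classes)
    onehot = np.eye(n_classes)[labels]
    for _ in range(epochs):
        logits = feats @ W + b
        logits -= logits.max(axis=1, keepdims=True)       # numerical stability
        probs = np.exp(logits)
        probs /= probs.sum(axis=1, keepdims=True)
        grad = (probs - onehot) / n                       # softmax cross-entropy gradient
        W -= lr * feats.T @ grad
        b -= lr * grad.sum(axis=0)
    return W, b

# Synthetic stand-in for frozen-encoder features of 3 activity classes.
centers = rng.normal(size=(3, 16))
labels = rng.integers(0, 3, size=300)
feats = centers[labels] + 0.1 * rng.normal(size=(300, 16))

W, b = train_linear_probe(feats, labels, n_classes=3)
acc = np.mean(np.argmax(feats @ W + b, axis=1) == labels)
print(f"linear-probe accuracy: {acc:.2f}")
```

Because only the small linear head is trained, this protocol measures how much activity-relevant structure the self-supervised representations already contain, which is the claim the semi-supervised experiments are designed to test.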
Copyright © 2023, Laboratoire Domus