Contrastive Self-Supervised Learning for Sensor-Based Human Activity Recognition: A Review
Abstract
Deep learning models have achieved significant success in human activity recognition, particularly in assisted living and telemonitoring. However, training these models requires substantial amounts of labeled data, which is time-consuming and costly to acquire in real-world environments. Contrastive self-supervised learning has recently garnered attention in sensor-based activity recognition as a way to mitigate the need for expensive large-scale data collection and annotation. Despite the numerous papers published on this topic, few literature reviews highlight recent advances in contrastive self-supervised learning for sensor-based activity recognition. This paper extensively reviews 43 papers on recent contrastive self-supervised learning methods for sensor-based human activity recognition, excluding those based on video or audio sensors due to privacy concerns. First, we summarize the taxonomy of contrastive self-supervised learning, followed by a detailed description of contrastive learning models used for activity recognition and their main components. Next, we comprehensively review data augmentation methods for sensor data and commonly used benchmark datasets for activity recognition. Empirical performance comparisons of different methods on benchmark datasets are presented under linear evaluation, semi-supervised learning, and transfer learning scenarios. Through these comparisons, we derive significant insights into the selection of contrastive self-supervised models for sensor-based activity recognition. Finally, we discuss the limitations of current research and outline promising directions for future exploration.
Citation
H. Chen et al., "Contrastive Self-Supervised Learning for Sensor-Based Human Activity Recognition: A Review," in IEEE Access, doi: 10.1109/ACCESS.2024.3480814.