Abstract
Modern smartphones are ideal devices for pervasive sensing of human behavior. Microphones have the potential to reveal key information about a person's behavior, yet they have been utilized to a significantly lesser extent than other smartphone sensors in behavior sensing. We postulate that, for microphones to be useful in behavior sensing applications, the analysis techniques must be flexible and allow easy modification of the types of sounds to be sensed. Simplifying the training data collection process could enable a more flexible sound classification framework. We hypothesize that detailed training, a prerequisite for the majority of sound sensing techniques, is not necessary, and that a significantly less detailed and less time-consuming data collection process can be carried out, allowing even a nonexpert to conduct the collection, labeling, and training process. To test this hypothesis, we implement a diverse density-based multiple instance learning framework to identify a target sound, together with a bag trimming algorithm that uses the target sound to automatically segment weakly labeled sound clips into an accurate training set. Experiments confirm the hypothesis: classifiers trained on the automatically segmented training sets accurately classified unseen sound samples, achieving accuracies comparable to supervised classifiers, with average F-measures of 0.969 and 0.87 on two weakly supervised datasets.
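The core idea behind diverse density-based multiple instance learning, as referred to in the abstract, can be sketched in a few lines: a candidate target concept scores highly if every positive bag has at least one instance close to it (noisy-OR) and no negative bag does, and "bag trimming" then keeps only the instances near that learned target. This is a generic, minimal illustration under standard textbook assumptions (Gaussian-like instance probabilities, candidate search restricted to positive-bag instances, and a hypothetical `trim_bag` radius parameter), not the paper's actual implementation.

```python
import math

def sq_dist(a, b):
    # squared Euclidean distance between two feature vectors
    return sum((x - y) ** 2 for x, y in zip(a, b))

def instance_prob(t, x):
    # Gaussian-like probability that instance x matches concept t
    return math.exp(-sq_dist(t, x))

def bag_prob(t, bag):
    # noisy-OR: a bag matches t if ANY of its instances does
    return 1.0 - math.prod(1.0 - instance_prob(t, x) for x in bag)

def diverse_density(t, pos_bags, neg_bags):
    # high only when every positive bag matches t and no negative bag does
    dd = 1.0
    for bag in pos_bags:
        dd *= bag_prob(t, bag)
    for bag in neg_bags:
        dd *= 1.0 - bag_prob(t, bag)
    return dd

def find_target(pos_bags, neg_bags):
    # common simplification: evaluate DD only at instances of positive bags
    # and return the maximizer (a full method would refine via gradient ascent)
    candidates = [x for bag in pos_bags for x in bag]
    return max(candidates, key=lambda t: diverse_density(t, pos_bags, neg_bags))

def trim_bag(bag, target, radius=1.0):
    # "bag trimming": keep only instances near the learned target concept,
    # discarding the weakly labeled clutter; `radius` is a hypothetical knob
    return [x for x in bag if sq_dist(x, target) <= radius ** 2]
```

For example, with positive bags that each contain one instance near a common point plus unrelated noise, `find_target` recovers an instance near that common point, and `trim_bag` strips the noise instances from each weakly labeled bag, yielding the cleaner training set the abstract describes.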
Original language | English |
---|---|
Pages (from-to) | 123-135 |
Number of pages | 14 |
Journal | IEEE Transactions on Cybernetics |
Volume | 46 |
Issue number | 1 |
Early online date | 6 Feb 2015 |
DOIs | |
Publication status | Published (in print/issue) - 14 Dec 2015 |
Keywords
- Diverse density (DD)
- pattern recognition
- pervasive sensing
- sound classification
- weak supervision