Abstract
In recent years, sex education has drawn considerable attention, as the lack of comprehensive sex education has been linked to various societal issues. While micro-video platforms offer new opportunities for disseminating sex education content, they have also contributed to the proliferation of sexually suggestive videos. Existing video classification methods face significant challenges in this context, including the difficulty of modeling abstract concepts, cross-domain variation, and training bias due to class imbalance. To address these challenges, we propose a method for classifying sexually suggestive videos. Our approach introduces a consensus-aware visual encoder that helps the model focus on the features shared by videos within the same category, at both the distribution and feature levels, while filtering out irrelevant visual distractions. This improves the model’s ability to capture abstract and complex features. Additionally, we employ a label distribution-aware training strategy that allocates more learning capacity to tail classes, promoting balanced learning across all categories. Experimental results on the SexTok dataset demonstrate that our method performs well in classifying sexually suggestive videos, offering improved handling of abstract and imbalanced video content.
| Original language | English |
|---|---|
| Title of host publication | 13th International Conference on Computational Visual Media (CVM 2025) |
| Volume | 15663 |
| ISBN (Electronic) | 978-981-96-5809-1 |
| DOIs | |
| Publication status | Published online - 26 Apr 2025 |
| Event | CVM 2025 Computational Visual Media Conference, Hong Kong SAR, China (19 Apr 2025 → 21 Apr 2025) |
Publication series
| Name | Lecture Notes in Computer Science |
|---|---|
| ISSN (Print) | 0302-9743 |
| ISSN (Electronic) | 1611-3349 |
Conference
| Conference | CVM 2025 Computational Visual Media Conference |
|---|---|
| Country/Territory | China |
| City | Hong Kong SAR |
| Period | 19/04/25 → 21/04/25 |