Chinese martial arts movies have long been a source of fascination for audiences around the world. Their intricate fight scenes, iconic props, and rich cultural heritage also make them a distinctive subject for researchers working on semantic segmentation and scene understanding in computer vision. Semantic segmentation, which assigns a class label to every pixel in an image, is crucial for understanding complex scenes like those found in Chinese martial arts movies.

While several datasets exist for semantic segmentation of urban or natural scenes, there is a lack of benchmark datasets focused on prop segmentation in movie scenes, especially in Chinese martial arts films. To address this gap, a new dataset called ChineseMPD has been introduced. It provides pixel-level annotations for six categories of props commonly found in classic Chinese martial arts movies: Gun, Sword, Stick, Knife, Hook, and Arrow. The dataset contains annotations for over 32,000 objects in 8 action movie segments, covering scenes such as fight sequences, training scenes, and market scenes.

The process of creating the ChineseMPD dataset involved manual labeling with AI assistance, ensuring high quality and authenticity. A team of 21 individuals, including undergraduates, auditors, and reviewers, participated in the data labeling and review process. Film clips were selected from reputable archives, and careful attention was paid to copyright considerations. The dataset was meticulously organized into folders containing labeled images, JSON files with annotation points, and other relevant information.
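To illustrate how annotation points stored in JSON can be turned into a pixel-level label map, here is a minimal Python sketch. The JSON layout (`width`, `height`, `shapes`, `label`, `points`) and the use of index 0 for background are assumptions made for this example; the actual ChineseMPD file schema is not described here and may differ.

```python
import json
from typing import List, Sequence

# Assumed class order: index 0 for background is a convention chosen for this
# sketch; the six prop names come from the dataset description.
CLASSES = ["Background", "Gun", "Sword", "Stick", "Knife", "Hook", "Arrow"]


def point_in_polygon(x: float, y: float, poly: Sequence[Sequence[float]]) -> bool:
    """Even-odd ray casting: count polygon edges crossed by a ray going right."""
    inside = False
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def polygons_to_mask(annotation: dict) -> List[List[int]]:
    """Rasterize polygons into a label map by testing each pixel center."""
    height, width = annotation["height"], annotation["width"]
    mask = [[0] * width for _ in range(height)]
    for shape in annotation["shapes"]:
        class_index = CLASSES.index(shape["label"])
        for row in range(height):
            for col in range(width):
                if point_in_polygon(col + 0.5, row + 0.5, shape["points"]):
                    mask[row][col] = class_index
    return mask


# Hypothetical annotation: one rectangular "Sword" region in a 4x3 image.
example = json.loads(
    '{"width": 4, "height": 3, "shapes": '
    '[{"label": "Sword", "points": [[1, 0], [3, 0], [3, 2], [1, 2]]}]}'
)
mask = polygons_to_mask(example)
for row in mask:
    print(row)
```

Testing every pixel against every polygon is slow for full-resolution frames; it is used here only because it keeps the idea visible without any imaging library.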

To validate the technical accuracy of the dataset, a team of experts conducted manual checks and sampled labeled props for inspection. An interactive annotation tool called EISeg was used to annotate the images, allowing annotators to adjust the results and improve precision. The dataset was evaluated using four popular semantic segmentation models, whose performance varied across four metrics: aAcc (overall pixel accuracy), mIoU (mean intersection over union), mAcc (mean per-class accuracy), and mDice (mean Dice coefficient).
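For readers unfamiliar with these metrics, the following Python sketch shows how the four scores are commonly computed from a confusion matrix of pixel labels. It follows the usual definitions of the abbreviations and is not the evaluation code used for ChineseMPD; the function names and the tiny six-pixel example are made up for illustration.

```python
from typing import Dict, List, Sequence


def confusion_matrix(pred: Sequence[int], target: Sequence[int],
                     num_classes: int) -> List[List[int]]:
    """Rows are ground-truth classes, columns are predicted classes."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, target):
        cm[t][p] += 1
    return cm


def segmentation_metrics(cm: List[List[int]]) -> Dict[str, float]:
    """Compute aAcc, mIoU, mAcc and mDice from a confusion matrix."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(n))
    ious, accs, dices = [], [], []
    for i in range(n):
        tp = cm[i][i]
        gt = sum(cm[i])                        # pixels labelled as class i
        pr = sum(cm[j][i] for j in range(n))   # pixels predicted as class i
        if gt == 0:
            continue  # skip classes that never occur in the ground truth
        ious.append(tp / (gt + pr - tp))       # intersection over union
        accs.append(tp / gt)                   # per-class accuracy
        dices.append(2 * tp / (gt + pr))       # Dice coefficient
    return {
        "aAcc": correct / total,
        "mIoU": sum(ious) / len(ious),
        "mAcc": sum(accs) / len(accs),
        "mDice": sum(dices) / len(dices),
    }


# Made-up example: six pixels, three classes, flattened label maps.
target = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]
metrics = segmentation_metrics(confusion_matrix(pred, target, num_classes=3))
print({name: round(value, 4) for name, value in metrics.items()})
```

Because mIoU, mAcc, and mDice average over classes rather than pixels, a rare prop category counts as much as a frequent one, which is one reason models can rank differently depending on the metric.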

The ChineseMPD dataset offers a valuable resource for researchers in the field of computer vision, particularly those interested in semantic segmentation of cultural objects in film scenes. By providing detailed labeling of traditional Chinese martial arts props, the dataset presents new challenges for developing precise segmentation models. The dataset’s cultural relevance also opens avenues for research in style-specific generation models and object recognition in culturally specific contexts.

In conclusion, the ChineseMPD dataset is a significant contribution to semantic segmentation research for film and television. It is available to researchers under a CC BY 4.0 license, which permits use, distribution, and remixing with appropriate attribution. Researchers are encouraged to explore the dataset, contribute optimized code and models, and advance research in semantic segmentation for film and television applications.