This paper is published in Volume-4, Issue-2, 2018
Area
Computer Science
Author
Anshul Ganvir, Kunal Pal, Sanket Jagtap, Pranita Katole, Mayur Bhalavi
Org/Univ
Datta Meghe Institute of Engineering Technology and Research, Wardha, Maharashtra, India
Pub. Date
21 March, 2018
Paper ID
V4I2-1309
Publisher
International Journal of Advance Research, Ideas and Innovations in Technology (IJARIIT)
Keywords
Audio Extraction, Java Media Framework, Speech Recognition, Acoustic Model, Subtitle Generation, FFMPEG

Citations

IEEE
Anshul Ganvir, Kunal Pal, Sanket Jagtap, Pranita Katole, Mayur Bhalavi, "Automatic Subtitle Generation for Sound in Videos," International Journal of Advance Research, Ideas and Innovations in Technology, vol. 4, no. 2, 2018, www.IJARIIT.com.

APA
Anshul Ganvir, Kunal Pal, Sanket Jagtap, Pranita Katole, Mayur Bhalavi (2018). Automatic Subtitle Generation for Sound in Videos. International Journal of Advance Research, Ideas and Innovations in Technology, 4(2). www.IJARIIT.com.

MLA
Anshul Ganvir, Kunal Pal, Sanket Jagtap, Pranita Katole, Mayur Bhalavi. "Automatic Subtitle Generation for Sound in Videos." International Journal of Advance Research, Ideas and Innovations in Technology 4.2 (2018). www.IJARIIT.com.

Abstract

The last ten years have witnessed the emergence of video content of every kind, and the appearance of websites dedicated to this content has increased the importance the public gives to it. At the same time, some individuals are deaf or hard of hearing and cannot understand such videos when no text transcription is available. It is therefore necessary to find solutions that make these media accessible to as many people as possible. Several software tools offer utilities for creating subtitles for videos, but all require extensive participation from the user. Hence, a more automated approach is envisaged. This report describes a way to generate standards-compliant subtitles using speech recognition. Three stages are distinguished. The first separates the audio from the video and converts the audio to a suitable format if necessary. The second recognizes the speech contained in the audio. The final stage generates a subtitle file from the recognition results of the previous step. Implementation directions have been proposed for the three distinct modules. The experimental results are not yet satisfactory, and adjustments must be made in further work. Decoding parallelization, use of well-trained models, and punctuation insertion are among the improvements to be made.
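For illustration, below is a minimal Java sketch of the first and third stages (audio extraction and subtitle-file generation), assuming the ffmpeg binary named in the keywords is available on the PATH. The class name SubtitlePipelineSketch, the file names input.mp4, audio.wav, and output.srt, and the choice of the SubRip (SRT) subtitle format are assumptions made for the example, not details taken from the paper; the speech-recognition stage is left as a placeholder because the abstract does not name a specific engine.

import java.io.IOException;
import java.io.PrintWriter;
import java.nio.file.Files;
import java.nio.file.Paths;

public class SubtitlePipelineSketch {

    // Stage 1: separate the audio track from the video and convert it to
    // 16 kHz mono 16-bit PCM WAV, a format most recognizers accept.
    // Assumes the ffmpeg binary is on the PATH.
    static void extractAudio(String videoPath, String wavPath)
            throws IOException, InterruptedException {
        Process p = new ProcessBuilder(
                "ffmpeg", "-y", "-i", videoPath,
                "-vn",                  // drop the video stream
                "-acodec", "pcm_s16le", // 16-bit PCM audio codec
                "-ar", "16000",         // 16 kHz sample rate
                "-ac", "1",             // single (mono) channel
                wavPath)
                .inheritIO()
                .start();
        if (p.waitFor() != 0) {
            throw new IOException("ffmpeg failed to extract audio");
        }
    }

    // Stage 3: format one subtitle cue in SubRip (SRT) style.
    // Timestamps are milliseconds from the start of the video.
    static String srtCue(int index, long startMs, long endMs, String text) {
        return index + "\n"
                + timestamp(startMs) + " --> " + timestamp(endMs) + "\n"
                + text + "\n\n";
    }

    // Convert milliseconds to the SRT timestamp form HH:MM:SS,mmm.
    static String timestamp(long ms) {
        long h = ms / 3_600_000, m = (ms / 60_000) % 60,
             s = (ms / 1000) % 60, frac = ms % 1000;
        return String.format("%02d:%02d:%02d,%03d", h, m, s, frac);
    }

    public static void main(String[] args) throws Exception {
        extractAudio("input.mp4", "audio.wav");
        // Stage 2 (speech recognition) would run here on audio.wav and
        // yield timed word hypotheses; it is omitted because the paper's
        // abstract does not specify a recognition engine.
        try (PrintWriter out = new PrintWriter(
                Files.newBufferedWriter(Paths.get("output.srt")))) {
            // Hypothetical cue standing in for real recognition output.
            out.print(srtCue(1, 0, 2_500, "Hello, world."));
        }
    }
}

The sketch keeps the three stages decoupled, matching the modular structure the abstract describes: the extraction step only produces a WAV file, and the subtitle writer only consumes timed text, so a recognition engine can be swapped in between without changing either end.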