Audio Classification

DESCRIPTION

• In this model, audio is classified using a Convolutional Neural Network (CNN).
• PUBG game audio was used as the dataset. Each sound is classified by running object detection on its mel-spectrogram image with the TensorFlow COCO model ssd_mobilenet_v2_fpnlite_640x640_coco17_tpu-8.

METHODOLOGY

PRE-PROCESSING

• PUBG game audio was recorded in MP3 format and then converted to WAV format so that it could be turned into mel-spectrogram images.
• Librosa and SciPy are open-source libraries that read and write WAV files, but working with MP3 files directly was not feasible.
• Using the Librosa, Matplotlib, and SciPy libraries, we extracted the mel-spectrogram of each clip and saved it as an image.

MEL-SPECTROGRAM OUTPUT

Gun Change Audio

Gun Shot Audio

M416 Gun Load Audio

M762 Gun Load Audio

Pistol Gun Load Audio

DATA ANNOTATION

• Each mel-spectrogram image was labeled with its sound class for classification.
• The LabelImg tool was used to label the images; it generates an XML file containing the bounding-box annotation for each image.
• The labeled images and their XML files were divided into two parts: one for training and the other for testing.
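
The labeling output and the train/test split described above can be illustrated with a short sketch. It assumes LabelImg was saving in its Pascal VOC XML format; the helper names `read_boxes` and `split_train_test`, the 80/20 ratio, and the fixed seed are assumptions for the example, not values taken from this project.

```python
import random
import xml.etree.ElementTree as ET


def read_boxes(xml_text):
    """Return (image filename, [(label, xmin, ymin, xmax, ymax), ...]) from one XML file."""
    root = ET.fromstring(xml_text)
    filename = root.findtext("filename")
    boxes = []
    for obj in root.iter("object"):
        bndbox = obj.find("bndbox")
        coords = [int(bndbox.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")]
        boxes.append((obj.findtext("name"), *coords))
    return filename, boxes


def split_train_test(items, test_fraction=0.2, seed=0):
    """Shuffle reproducibly and return (train, test); the ratio is an assumption."""
    items = sorted(items)
    random.Random(seed).shuffle(items)
    n_test = max(1, round(len(items) * test_fraction))
    return items[n_test:], items[:n_test]
```

Splitting by file name rather than by bounding box keeps each image and its XML file together in the same part.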

MODEL TRAINING AND TESTING

• The model was trained using the TensorFlow ssd_mobilenet_v2_fpnlite_640x640_coco17_tpu-8 model (SSD MobileNet V2 FPNLite, pre-trained on COCO 2017).
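
Training with the TensorFlow Object Detection API normally needs a label map that assigns an integer id to each class. The sketch below generates one for the five sound classes shown in this README. The class identifiers and the helper name `build_label_map` are hypothetical; the actual names must match the labels used in LabelImg.

```python
# Hypothetical class identifiers, derived from the section titles in this README.
CLASSES = [
    "gun_change",
    "gun_shot",
    "m416_gun_load",
    "m762_gun_load",
    "pistol_gun_load",
]


def build_label_map(class_names):
    """Return label map text in the item/id/name layout; ids start at 1."""
    items = []
    for class_id, name in enumerate(class_names, start=1):
        items.append(f"item {{\n  id: {class_id}\n  name: '{name}'\n}}\n")
    return "\n".join(items)


if __name__ == "__main__":
    print(build_label_map(CLASSES))
```

Ids start at 1 because id 0 is reserved for the background class in this label map convention.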

MODEL OUTPUT

Gun Change Output

Gun Shot Output

M416 Gun Load Output

M762 Gun Load Output

Pistol Gun Load Output