Audio Classification

DESCRIPTION


• In this project, audio is classified using a Convolutional Neural Network (CNN).
• PUBG game audio was used as the dataset. Each clip is converted to a mel-spectrogram image, and the sound is classified by running object detection on that image with the TensorFlow COCO model (ssd_mobilenet_v2_fpnlite_640x640_coco17_tpu-8).


METHODOLOGY



PRE-PROCESSING

• PUBG game audio was recorded in MP3 format and then converted to WAV format so that it could be turned into mel-spectrogram images.
• Librosa and SciPy are open-source libraries that work readily with WAV files, whereas working with MP3 files directly was not feasible in this setup.
• Using the Librosa, Matplotlib, and SciPy libraries, we extracted the mel-spectrogram of each clip and saved it as an image.
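
The extraction step above can be sketched roughly as follows. This is a minimal illustration, not the project's actual script: the function names `spectrogram_path` and `save_mel_spectrogram`, the `n_mels=128` setting, the figure size, and the choice to hide the axes are all assumptions. Only Librosa and Matplotlib are used here; SciPy is omitted from the sketch.

```python
from pathlib import Path


def spectrogram_path(wav_path, out_dir):
    """Return the PNG path for a WAV file, e.g. audio/shot.wav -> out_dir/shot.png."""
    return Path(out_dir) / (Path(wav_path).stem + ".png")


def save_mel_spectrogram(wav_path, out_dir, n_mels=128):
    """Compute a mel-spectrogram for one WAV file and save it as a PNG image."""
    # Third-party imports are kept local so the path helper works without them.
    import matplotlib
    matplotlib.use("Agg")  # render off-screen; no display needed
    import matplotlib.pyplot as plt
    import numpy as np
    import librosa
    import librosa.display

    # sr=None keeps the file's native sample rate instead of resampling.
    y, sr = librosa.load(wav_path, sr=None)
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=n_mels)
    mel_db = librosa.power_to_db(mel, ref=np.max)  # power scale -> decibels

    fig, ax = plt.subplots(figsize=(6.4, 6.4))
    librosa.display.specshow(mel_db, sr=sr, x_axis="time", y_axis="mel", ax=ax)
    ax.set_axis_off()  # assumption: keep only the spectrogram itself in the image

    out_path = spectrogram_path(wav_path, out_dir)
    out_path.parent.mkdir(parents=True, exist_ok=True)
    fig.savefig(out_path, dpi=100, bbox_inches="tight", pad_inches=0)
    plt.close(fig)  # free the figure when processing many clips
    return out_path
```

Calling `save_mel_spectrogram("audio/gun_shot.wav", "spectrograms")` would write `spectrograms/gun_shot.png`; both paths are hypothetical.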


MEL-SPECTROGRAM OUTPUT

Gun Change Audio
Gun Shot Audio
M416 Gun Load Audio
M762 Gun Load Audio
Pistol Gun Load Audio

DATA AUGMENTATION

• Here, data augmentation refers to labeling the images for classification.
• The LabelImg tool was used to label the images; it generates an XML file containing the bounding-box annotation for each image.
• The XML files generated by LabelImg were divided into two parts: one for training and the other for testing.
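
As a rough sketch of what happens to the LabelImg output, the snippet below reads one annotation and splits a list of annotation files into training and testing sets. It assumes LabelImg's default Pascal VOC XML layout; the helper names, the sample XML, the file names, and the 80/20 ratio are illustrative assumptions, since the ratio actually used is not stated here.

```python
import random
import xml.etree.ElementTree as ET


def parse_voc_annotation(xml_text):
    """Return (filename, [(label, xmin, ymin, xmax, ymax), ...]) from Pascal VOC XML."""
    root = ET.fromstring(xml_text)
    filename = root.findtext("filename")
    boxes = []
    for obj in root.iter("object"):
        box = obj.find("bndbox")
        coords = tuple(
            int(float(box.findtext(tag))) for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        boxes.append((obj.findtext("name"), *coords))
    return filename, boxes


def split_train_test(items, test_fraction=0.2, seed=0):
    """Shuffle items reproducibly and split them into (train, test) lists."""
    items = sorted(items)
    random.Random(seed).shuffle(items)
    n_test = max(1, round(len(items) * test_fraction)) if items else 0
    return items[n_test:], items[:n_test]


# Hypothetical annotation for one spectrogram image.
SAMPLE_XML = """<annotation>
  <filename>gun_shot_001.png</filename>
  <size><width>640</width><height>640</height><depth>3</depth></size>
  <object>
    <name>gun_shot</name>
    <bndbox><xmin>35</xmin><ymin>60</ymin><xmax>600</xmax><ymax>580</ymax></bndbox>
  </object>
</annotation>"""

filename, boxes = parse_voc_annotation(SAMPLE_XML)
train, test = split_train_test([f"clip_{i:02d}.xml" for i in range(10)])
```

Fixing the seed keeps the split reproducible, so the same files land in the test set on every run.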


MODEL TRAINING AND TESTING

• The model was trained using the TensorFlow COCO model ssd_mobilenet_v2_fpnlite_640x640_coco17_tpu-8.
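
If the training was run through the TensorFlow Object Detection API (an assumption; the tooling is not named here), a label map file listing the classes would be needed. The sketch below generates such a file's text for the five sound classes shown in this document. The label strings and the helper name `build_label_map` are hypothetical and would have to match the names used during labeling.

```python
# Hypothetical label strings for the five sound classes.
CLASS_NAMES = [
    "gun_change",
    "gun_shot",
    "m416_gun_load",
    "m762_gun_load",
    "pistol_gun_load",
]


def build_label_map(class_names):
    """Render class names as label-map text; ids start at 1 (0 is reserved for background)."""
    entries = []
    for class_id, name in enumerate(class_names, start=1):
        entries.append(f"item {{\n  id: {class_id}\n  name: '{name}'\n}}\n")
    return "\n".join(entries)


label_map_text = build_label_map(CLASS_NAMES)
```

The resulting text could be written to a file such as `label_map.pbtxt` (a hypothetical path) and referenced from the training configuration.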


MODEL OUTPUT


Gun Change Output
Gun Shot Output
M416 Gun Load Output
M762 Gun Load Output
Pistol Gun Load Output

