𧩠Sound Sentiment Puzzle: Identity sentiments from audio clips of reviews
What Is the Sound Sentiment Puzzle About?
We humans rely on our community's feedback and reviews for so many things. When our friends tell us about their visit to the new restaurant, we can gauge whether they had a positive or a negative experience. When our family talks about the new movie, we know whether they enjoyed it or not. But can machines identify sentiment from the sound clips of reviews?
In this puzzle, you will combine audio processing with sentiment classification to build a model that can identify sentiment from sound clips.
What You'll Learn
- How to work with sound data
- How to perform sentiment classification
Let's get started!
The Task
Given an audio clip of a review, classify its sentiment as positive, negative, or neutral.
Explore Dataset
The dataset contains 24,000 audio samples, divided into training, validation, and test sets. The label for each training and validation clip is listed in train.csv and val.csv, keyed by the clip's wav_id.
- Training Set: 15000 samples
- Validation Set: 2000 samples
- Test Set: 7000 samples
Here are some details about the dataset:
The training and validation sets each include a folder of WAV audio files and a CSV file that maps each wav file id to its sentiment label.
- label: 2 = Positive, 1 = Neutral, 0 = Negative
- wav_id: this refers to the name of the audio file in the respective folder.
| label | wav_id |
|---|---|
| 2 | 16 |
| 1 | 17 |
| 0 | 21 |
| 2 | 23 |
Dataset Files
- train.csv : (15000 samples) Contains the column wav_id, which corresponds with the audio id in train.zip, and the label column containing the sentiment labels: 2 = Positive, 1 = Neutral, 0 = Negative
- train.zip : (15000 samples) Contains the review sound bites corresponding to the training dataset.
- val.csv : (2000 samples) Contains the column wav_id, which corresponds with the audio id in val.zip, and the label column containing the sentiment labels.
- val.zip : (2000 samples) Contains the review sound bites corresponding to the validation dataset.
- test.zip : (7000 samples) This is your test dataset. Contains sound bites in wav format.
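As a minimal sketch of how the CSV files and audio folders fit together, the snippet below pairs each `wav_id` with a file path and a readable label. It assumes each clip is stored as `<folder>/<wav_id>.wav` after extraction, which the description implies but does not state outright; the helper `read_labels` and the in-memory CSV are illustrative, not part of the starter kit.

```python
import csv
import io
from pathlib import Path

# Label encoding as stated in the dataset description.
LABEL_NAMES = {0: "Negative", 1: "Neutral", 2: "Positive"}


def read_labels(csv_file, audio_dir):
    """Return a list of (wav_path, label) pairs from a labels CSV.

    Assumption: each clip is stored as <audio_dir>/<wav_id>.wav. Adjust the
    pattern if the extracted archive is laid out differently.
    """
    rows = []
    for row in csv.DictReader(csv_file):
        wav_path = Path(audio_dir) / f"{row['wav_id']}.wav"
        rows.append((wav_path, int(row["label"])))
    return rows


# Stand-in for train.csv, using the sample rows from the table above.
sample_csv = io.StringIO("label,wav_id\n2,16\n1,17\n0,21\n2,23\n")
samples = read_labels(sample_csv, "train")

for wav_path, label in samples:
    print(wav_path.as_posix(), LABEL_NAMES[label])
```

With the real files, you would pass `open("train.csv", newline="")` in place of the in-memory CSV.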
Let's Solve This Puzzle
The starter kit walks through every step: downloading the dataset, loading the libraries, processing the data, and creating, training, and testing the model.
Click here to access the basic starter kit. It contains in-depth instructions to:
- Download the necessary files
- Set up the AIcrowd CLI environment, which lets you make a submission directly from a notebook
- Download the dataset and import libraries
- Preprocess the dataset
- Create the model
- Set up the model
- Train the model
- Submit the result
- Upload the results
Make your first submission using the starter kit.
Evaluation Criteria
The evaluation metrics for this puzzle are F1 score (primary score) and accuracy (secondary score).
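To get a feel for the two metrics before submitting, here is a small pure-Python sketch. The puzzle description does not say how F1 is averaged across the three classes, so macro averaging (the unweighted mean of per-class F1) is an assumption here, and the official score may be computed differently.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true, y_pred, labels=(0, 1, 2)):
    """Unweighted mean of per-class F1 scores (macro averaging is an assumption)."""
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # A class with no true or predicted samples contributes 0 here.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Made-up labels for illustration: 2 = Positive, 1 = Neutral, 0 = Negative.
y_true = [2, 1, 0, 2, 0, 1]
y_pred = [2, 1, 0, 1, 0, 2]
print(f"accuracy = {accuracy(y_true, y_pred):.3f}")
print(f"macro F1 = {macro_f1(y_true, y_pred):.3f}")
```

Checking both numbers on the validation set is a cheap way to catch a model that scores well on accuracy while ignoring one class.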
Hint to get started
- To solve this puzzle, you can convert audio signals to images and use those images to train a convolutional neural network. Note: this is a simple approach; you can find the code for this approach here.
- You can also convert the sound to text and classify the text among the respective sentiments.
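The first hint, turning audio into an image, can be sketched with a log-magnitude spectrogram. This is not the linked starter code, only one common way to do it, assuming NumPy is available. The frame length, hop size, and the synthetic tone standing in for a real clip are all illustrative choices.

```python
import numpy as np


def log_spectrogram(signal, frame_len=256, hop=128):
    """Return a (freq_bins, frames) log-magnitude spectrogram as a 2-D array."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    magnitude = np.abs(np.fft.rfft(frames, axis=1))  # (frames, frame_len // 2 + 1)
    return np.log1p(magnitude).T


def to_uint8_image(spec):
    """Scale a spectrogram to 0..255 so it can be saved or fed to a CNN as an image."""
    span = spec.max() - spec.min()
    scaled = (spec - spec.min()) / span if span > 0 else np.zeros_like(spec)
    return (scaled * 255).astype(np.uint8)


# Synthetic stand-in for a review clip: one second of a 440 Hz tone at 8 kHz.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
clip = np.sin(2 * np.pi * 440 * t)

image = to_uint8_image(log_spectrogram(clip))
print(image.shape, image.dtype)
```

For real data, you would read the samples from each WAV file instead of generating a tone, then resize or crop the resulting arrays to one fixed shape before training the network.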
Resource Circle
Check out this blog post, which extracts features from sound and then classifies them by sentiment.
Get Help From the Community
Hop over to the AIcrowd Blitz Discord server to see ongoing discussions about this puzzle.
Subscription Queries
This is one of the many free Blitz puzzles you can access forever. To access more puzzles from various domains from the Blitz Library and receive a special new puzzle in your inbox every two weeks, you can subscribe to AIcrowd Blitz here.