modify README and add run_dl.sh
parisa-zahedi committed Jun 26, 2024
1 parent 4d8d770 commit 14da727
Showing 2 changed files with 52 additions and 7 deletions.
43 changes: 36 additions & 7 deletions bioacoustics/feature_extraction/README.md
# Feature extraction

The scripts in this directory extract acoustic and deep learning features from '.wav' files.
The output is two types of features that serve as input for the [classifiers](../classifier): SVM and CNN.

## Instructions

[Installation instructions](https://github.com/UtrechtUniversity/animal-sounds/tree/documenation_svm#getting-started)

## Feature extraction for Support Vector Machines
We extract several feature sets using:
- a [python version](https://github.com/mystlee/rasta_py) of the [rasta-mat](https://www.ee.columbia.edu/~dpwe/resources/matlab/rastamat/) library.
- an [Automatic Analysis Architecture](https://doi.org/10.5281/zenodo.1216028)

We extend the feature set with the features from an [Automatic Analysis Architecture](https://doi.org/10.5281/zenodo.1216028).

The script results in a feature set of 1140 features per audio frame.
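The per-frame layout can be illustrated with a small framing sketch. This is an illustration only, not the project's code; `frame_signal` is a hypothetical helper, and the 1140 features would be computed from each resulting row:

```python
import numpy as np

def frame_signal(signal, frame_length, hop_length):
    """Split a 1-D signal into frames of frame_length samples (one per row)."""
    n_frames = 1 + max(0, (len(signal) - frame_length) // hop_length)
    return np.stack([
        signal[i * hop_length : i * hop_length + frame_length]
        for i in range(n_frames)
    ])

# 1 second of silence at 48 kHz, 0.5-second frames with no overlap
frames = frame_signal(np.zeros(48000), frame_length=24000, hop_length=24000)
print(frames.shape)  # (2, 24000)
```

Each of the two rows here would yield one 1140-dimensional feature vector.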

### Running the script
Use shell script `run_svm.sh` to start `extract_features_svm.py` from the command line. The following arguments should be specified:
- `--input_dir`; directory where the '.wav' files are located.
- `--output_dir`; directory where the feature files ('.csv') should be stored.
- `--frame_length`; subdivide '.wav' files into frames of this length, in number of samples (at a sample rate of 48000 samples per second, choose e.g. 24000 for 0.5-second frames)

In `./config` the user can specify which features to extract.

### sndfile library
If you get an error about the 'snd_file' dependency on an Ubuntu machine, install the following C library:
```
sudo apt-get install libsndfile-dev
```
## Feature extraction for Convolutional Neural Network (CNN)
To extract audio features for the CNN classifier, '.wav' files are converted to log-mel spectrograms using the [librosa](https://zenodo.org/badge/latestdoi/6309729) library.
Log-mel spectrograms gave the best results in [[1]](#ref); future work could explore alternatives such as log spectrograms and gammatone spectrograms.

In this process, we first apply a Butterworth band-pass filter to keep frequencies between 100 and 2000 Hz. The short-time Fourier transform (STFT) is then applied to the time-domain waveforms to compute spectrograms.
Finally, mel filter banks are applied to the spectrograms, followed by a logarithmic operation, to obtain the log-mel spectrograms.
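The band-pass, STFT, mel filter bank, and log steps described above can be sketched as follows. This is a minimal illustration using `scipy` with a textbook triangular mel filter bank; it is not the project's `extract_features_dl.py`, and the helper names are our own:

```python
import numpy as np
from scipy.signal import butter, sosfilt, stft

def bandpass(signal, sr, low=100.0, high=2000.0, order=5):
    """Butterworth band-pass filter keeping roughly 100-2000 Hz."""
    sos = butter(order, [low, high], btype="bandpass", fs=sr, output="sos")
    return sosfilt(sos, signal)

def mel_filter_bank(sr, n_fft, n_mel):
    """Triangular mel filter bank, shape (n_mel, n_fft // 2 + 1)."""
    def hz_to_mel(f):
        return 2595.0 * np.log10(1.0 + f / 700.0)
    def mel_to_hz(m):
        return 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    fft_freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    hz_pts = mel_to_hz(np.linspace(0.0, hz_to_mel(sr / 2.0), n_mel + 2))
    fb = np.zeros((n_mel, len(fft_freqs)))
    for i in range(n_mel):
        left, center, right = hz_pts[i], hz_pts[i + 1], hz_pts[i + 2]
        rising = (fft_freqs - left) / (center - left)
        falling = (right - fft_freqs) / (right - center)
        fb[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return fb

def log_mel(signal, sr, window_length=750, hop_length=376, n_mel=64):
    """Band-pass -> STFT power spectrogram -> mel filter bank -> log."""
    filtered = bandpass(signal, sr)
    _, _, Z = stft(filtered, fs=sr, nperseg=window_length,
                   noverlap=window_length - hop_length)
    power = np.abs(Z) ** 2                 # (window_length // 2 + 1, n_frames)
    return np.log(mel_filter_bank(sr, window_length, n_mel) @ power + 1e-10)

sr = 48000
tone = np.sin(2 * np.pi * 440 * np.arange(sr) / sr)  # 1 s of a 440 Hz tone
spec = log_mel(tone, sr)
print(spec.shape[0])  # 64 mel bands
```

The `window_length`, `hop_length`, and `n_mel` defaults mirror the values used in `run_dl.sh` below.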

<img src="../../img/melspectrogram.png" width="400" />

### Running the script
Open a command line and run the following command:
```
sh run_dl.sh
```

This command applies `extract_features_dl.py` to the whole dataset. The following arguments should be specified:
- `--input_dir`; directory where the '.wav' files are located.
- `--output_dir`; directory where the feature files ('.pkl') should be stored.
- `--label`; the label of the '.wav' files, i.e. 'chimpanze' or 'background'.
- `--window_length`; subdivide '.wav' files into frames of this length, in number of samples (at our sample rate of 48000 samples per second, we chose 750, i.e. frames of roughly 15 milliseconds).
- `--hop_length`; hop between successive frames, in number of samples (we chose 376).
- `--n_mel`; number of mel bands, i.e. horizontal bars in the spectrogram; in our case 64.
- `--new_img_size`; the number of rows and columns of the log-mel spectrogram that is fed to the CNN as an image; in our case 64 × 64.
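The `--new_img_size` argument can be pictured as resizing each log-mel spectrogram to a square array before it reaches the CNN. The nearest-neighbour helper below is a hypothetical illustration, not the project's resizing code:

```python
import numpy as np

def resize_nearest(img, rows=64, cols=64):
    """Nearest-neighbour resize of a 2-D array to (rows, cols)."""
    r_idx = (np.arange(rows) * img.shape[0] / rows).astype(int)
    c_idx = (np.arange(cols) * img.shape[1] / cols).astype(int)
    return img[np.ix_(r_idx, c_idx)]

spec = np.random.rand(64, 120)   # n_mel x n_frames log-mel spectrogram
img = resize_nearest(spec)
print(img.shape)  # (64, 64)
```

In practice an interpolating resize (e.g. from an image library) would likely be used; the point is only that a variable-width spectrogram becomes a fixed 64 × 64 input.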

## <a name="ref"></a>References
1. K. Palanisamy, D. Singhania, and A. Yao, "Rethinking CNN Models for Audio Classification", 2020. [arXiv preprint](https://arxiv.org/abs/2007.11154), [github](https://github.com/kamalesh0406/Audio-Classification)
16 changes: 16 additions & 0 deletions bioacoustics/feature_extraction/run_dl.sh
#!/bin/bash

# Extract log-mel features for every recorder, once for the chimpanzee
# chunks and once for the background chunks.
DATADIR='/Volumes/science.data.uu.nl/research-zwerts/data/sanaga_test_chunks/'
RECORDERS='A1 A3 A4 A5 A21 A22 A26 A38'
OUTPUTDIR='../../output/features/'

for RECORDER in $RECORDERS
do
    echo "Processing $RECORDER: $DATADIR -> $OUTPUTDIR"
    python3 extract_features_dl.py --input_dir $DATADIR'chimps/'$RECORDER'/*/*.wav' --output_dir $OUTPUTDIR$RECORDER'/'$RECORDER'_chimpanze.pkl' --label 'chimpanze' --window_length 750 --hop_length 376 --n_mel 64 --new_img_size 64 64
    python3 extract_features_dl.py --input_dir $DATADIR'background/'$RECORDER'/*/*.wav' --output_dir $OUTPUTDIR$RECORDER'/'$RECORDER'_background.pkl' --label 'background' --window_length 750 --hop_length 376 --n_mel 64 --new_img_size 64 64
done
