-
Notifications
You must be signed in to change notification settings - Fork 4
/
README
166 lines (145 loc) · 7.12 KB
/
README
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
-------------------------------------------------
Name: Mitchell Adam, Ryan Shukla
ID: 1528592, 1537980
CMPUT 275, Winter 2018
Final Project: Pitcher
-------------------------------------------------
An audio processor using a phase vocoder. See the samples folder for some examples
of signal manupulation
Overview:
We created a phase vocoder to be used for audio processing. The way this works
is as follows
1. We fist read in an auido file in the .wav form
2. We then use the short-time fourier transform to transform the signals into
the frequencey domain. The short time fourier transform works by using the Hann
window function over a signal to make many windows. We then use the fast fourier
transfrom on each window.
3. Once we have each window in the frequencey domain we then utilize the fact that
we have the fourier transform of many windows. We do this by comparing the phase
of each bin (a specific frequency band) between multiple windows. This then allows us
the more accurately determine the freqency of each component by using the phase difference
to correct the frequency. This is the benefit of a pahse vocoder.
4. Once we have accurate frequencey and magnitude of the signal we can do
then audio processing. We have four methods of audio processing
a. High pass filter: This dampens all low freqencey components making the signal sound
"brighter"
b. Low pass filter: This dampens all high freqencey components making the
signal sound more mellow.
c. Pitch shifting: With this option you are able to scale the signal up or down.
Scaling up makes it sound more like a chipmunk while scaling down makes it much
deeper. This is nontrival becuase we are able to shift the pitch without
changing the duration of the signal by resampling the signal.We are able to do this
as a result of using a phase vocoder
d. Autotune: In this method we are able to use more algorithms such as auto correctelation
to detect the fundamental frequcney of the signal. This allows us to calcualte the ratio
between the fundamental freqency and the closest note in the scale. The closest note in
the scale is determined algorithmically using modular arithmetic. Once we have this ratio
we are able to once again scale the signal by resampling similar to a pitch shifting.
Algorithms and data structres:
Algorithms
1. Fast Fourier Transfrom
Runtime O(nlog(n))
where n is number of samples in original signal
2. Short-time Fourier Transfrom
Runtime O(n^2log(n))
where n is number of samples in original signal
3. Autocorrelation
Runtime O(nlogn)
where n the number of samples in the original signal
This is due to the use of the FFT
Autocorrelation is used in fundamental()
4. Musical note detection by modular arithmetic
Runtime O(1)
This is due to the use of modular arithmetic which allows us to work in a single octave
and use a set to decide if a given note is in the key.
This algorithm is used in getTargetFreq()
Data Structures
Sets and Maps both used extensiveley. Also used STL complex numbers and valarray
Included files:
* README
* Makefile
* src/fft.cpp
* src/filter.cpp
* src/fourier.h
* src/fundamental.cpp
* src/fundamental.h
* src/ifft.cpp
* src/istft.cpp
* src/main.cpp
* src/processbuffer.cpp
* src/processbuffer.h
* src/processstft.cpp
* src/processstft.h
* src/reader.cpp
* src/reader.h
* src/sfprocess.cpp
* src/sftf.cpp
* src/targetfreq.cpp
* src/targetfreq.h
* src/writer.cpp
* src/writer.h
* hallelujah.wav
* samples/
Running instructions:
0. First install libsndfile in order to read and write .wav files. Use the
following commands
sudo apt-get install libsndfile1
sudo apt-get install libsndfile1-dev
1. Then in the main directory run 'make'
2. The program is then run using the following formate
./pitcher [mode] [options] [filename]
ex command: ./pitcher scale 2.0 hallelujah.wav
a. Mode:
Mode determines what audio processing is done on the file it can be
scale , tune, hpf, lpf
scale - pitch shift the audio file
tune - autotune the file with autocorrelation
hpf - use a high pass filter
lpf - use a low pass filter
b. Options:
Options determines how the mode is used. Each mode has a corresponding option
[scale option]: Is the scale factor as a double
[tune option]: Is the key the original file is in. Valid keys are: c, d, db, eb, e, f
gb, g, ab, a, bb, b
[hpf option]: The cutoff frequency in Hz
[lpf option]: The cutoff frequency in Hz
c. filename:
This is the name of the input file. Must be a .wav file
3. The program will then run the process. Since there is a lot of processing
to do it may take a while.
4. The output will then be saved in the same directory as output.wav
Recommended Commands:
1. To test high pass filter use:
./pitcher hpf 10000 hallelujah.wav
2. To test low pass filter use:
./pitcher lpf 100 hallelujah.wav
3. To test scaling use:
./pitcher scale 2 hallelujah.wav
4. To test tune use:
./pitcher tune c hallelujah.wav
NOTE: In the samples folder we have already run a wide range of process on
the files. Here you can see how the program changes the audio
Notes and Assumptions:
* Since we do a manupulation on the signal in the frequency domain there is
some noise created. This is heard most in the tune option. While the can
make the output sound poor, the desired effect is still heard. Audio processing
is a tedious task and fine tuning the output can be very difficult.
The tune option is used as demonstration of the possible applications of a phase
vocoder and is in no way a trivial task.
* Since we have the signal inforamtion in the frequecney domain there are many other
signal processing possibilites that could be done.
References:
* Used following websites to learn and implement Cooley-Tukey FFT algorithm:
https://en.wikipedia.org/wiki/Cooley%E2%80%93Tukey_FFT_algorithm
http://www.drdobbs.com/cpp/a-simple-and-efficient-fft-implementatio/199500857
https://rosettacode.org/wiki/Fast_Fourier_transform
* Using the fft to implement stft and a phase vocoder
http://blogs.zynaptiq.com/bernsee/pitch-shifting-using-the-ft/
* Converting frequency to a musical note
https://www.johndcook.com/blog/2016/02/10/musical-pitch-notation/
* Phase vocoder
http://sethares.engr.wisc.edu/vocoders/matlabphasevocoder.html
* Pitch detection algorithms
https://sound.eti.pg.gda.pl/student/eim/synteza/leszczyna/index_ang.htm
https://ccrma.stanford.edu/~pdelac/154/m154paper.htm
https://en.wikipedia.org/wiki/Cepstrum