
Initial istft implementation. #85

Open
wants to merge 1 commit into master from jfs/istft
Conversation

@jfsantos (Member)

I wrote an istft function to bring a real signal from its per-frame STFT representation back to a time-domain signal using overlap-add. This version is unoptimized, as I am having some trouble creating an FFTW.Plan for a backward RFFT. I can add my non-working code to this branch in case you want to take a look.
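
Roughly, the idea looks like this (a sketch with illustrative names only: winc for the hop, win for the window, nfft for the frame length; not the actual PR code):

function istft_sketch(S, nfft, winc, win)
    nframes = size(S, 2)
    out  = zeros(nfft + (nframes - 1) * winc)
    wsum = zeros(length(out))
    for k = 1:nframes
        frame = irfft(S[:, k], nfft)          # per-frame inverse real FFT (unoptimized)
        ix = (k - 1) * winc
        out[ix+1:ix+nfft]  += frame .* win    # weighted overlap-add
        wsum[ix+1:ix+nfft] += win .^ 2
    end
    for i = 1:length(wsum)                    # normalize by accumulated window energy
        wsum[i] != 0 && (out[i] /= wsum[i])
    end
    out
end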

I understand the PR is in rough shape right now, but I need some help moving forward, and I figured it would be better to submit a PR than to try to solve this via e-mail.

Any suggestions for improvement or tips on how I can use the FFTW.Plan interface to do what I want here?

@coveralls

Coverage Status

Coverage remained the same when pulling a29dbbb on jfs/istft into 4019eac on master.

@gummif (Member) commented Nov 27, 2014

Here is a working example

backward_plan{T<:Union(Float32, Float64)}(X::AbstractArray{Complex{T}}, Y::AbstractArray{T}) =
    FFTW.Plan(X, Y, 1, FFTW.ESTIMATE, FFTW.NO_TIMELIMIT).plan

x=randn(8)
y=rfft(x)  # works also for fft(x)
tmp1=similar(y)
tmp2=similar(x)

plan = backward_plan(tmp1,tmp2)

copy!(tmp1,y)
copy!(tmp2,x)
FFTW.execute(plan, tmp1, tmp2)  # warning: destroys tmp1
scale!(tmp2, FFTW.normalization(tmp2))
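
A quick round-trip check after the normalization (assuming the snippet above was run as-is):

maximum(abs(tmp2 - x))  # should be ~0, up to floating-point error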

I suggest you copy into tmp1 from S directly instead of from S[:,k]. Also, this is probably better:

for i = 1:length(wsum)
    @inbounds wsum[i] != 0 && (out[i] /= wsum[i])
end
# instead of
pos = wsum .!= 0
out[pos] ./= wsum[pos]

@jfsantos (Member Author)

Thanks! How did you implement backward_plan? I tried the following implementation:

backward_plan{T<:Union(Float32, Float64)}(X::AbstractArray{Complex{T}}, Y::AbstractArray{T}) =
    FFTW.Plan(Y, X, 1, FFTW.BACKWARD, FFTW.ESTIMATE, FFTW.NO_TIMELIMIT).plan

but then if I try to call it I get:

x=randn(8)
y=rfft(x)
tmp1=similar(y)
tmp2=similar(x)

plan = backward_plan(tmp2,tmp1)

ERROR: `Plan{T<:Union(Float64,Float32,Complex{Float32},Complex{Float64})}` has no method matching Plan{T<:Union(Float64,Float32,Complex{Float32},Complex{Float64})}(::Array{Float64,1}, ::Array{Complex{Float64},1}, ::Int64, ::Int32, ::Uint32, ::Float64)
 in backward_plan at none:1

EDIT: I see you posted your backward plan now. So you do not need to use FFTW.BACKWARD in the arguments?

@gummif (Member) commented Nov 27, 2014

The plan is exactly how it is done in the FFTW module, so FFTW.BACKWARD is apparently not needed.

@jfsantos (Member Author)

Right, it seems to work perfectly. I'll just test this with different windows and lengths and add tests and documentation, but the function seems to be working well now. Thanks a lot!

@gummif (Member) commented Nov 27, 2014

To avoid allocation (if you have not done so already):

copy!(tmp1, 1, S, 1+(k-1)*size(S,1), length(tmp1))

@jfsantos (Member Author)

That's exactly what I did in my last commit. However, execution time and allocation seem to be the same for both implementations, at least for the size of slices I'm using (length 128).

@gummif (Member) commented Nov 27, 2014

I meant the copy!(tmp1, S[:,k]) line. There is also an allocation for tmp2.*win, which could be avoided with a loop.
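
For example, something like this (assuming tmp2 holds the inverse-FFT output and win is the window, as in the snippets above):

for i = 1:length(tmp2)
    @inbounds tmp2[i] *= win[i]
end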

@jfsantos (Member Author)

You are right. I fixed both lines and memory allocation was reduced to 75% of the previous implementation. Execution times are more or less the same (again, maybe because I am using small window sizes).

@gummif (Member) commented Nov 27, 2014

OK good. A small bug:

1+(k-1)*winc:((k-1)*winc+n)
# should be
ix + n
# with
ix = (k-1)*winc

@r9y9 (Contributor) commented Nov 28, 2014

Using the PRESERVE_INPUT flag in FFTW.Plan and removing the copy! in the istft loop makes it much faster. So the backward plan would be:

backward_plan{T<:Union(Float32, Float64)}(X::AbstractArray{Complex{T}}, Y::AbstractArray{T}) =
    FFTW.Plan(X, Y, 1, FFTW.ESTIMATE | FFTW.PRESERVE_INPUT, FFTW.NO_TIMELIMIT).plan

@gummif (Member) commented Nov 28, 2014

@r9y9 That's a good point. It would however require a bit of pointer hacking like

tmp1 = pointer_to_array(pointer(S, 1+(k-1)*size(S,1)), size(S,1), false)
# replaces
copy!(tmp1, 1, S, 1+(k-1)*size(S,1), length(tmp1))

@r9y9 (Contributor) commented Nov 28, 2014

@gummif Thanks for your comment. I was thinking about replacing:

copy!(tmp1, 1, S, 1+(k-1)*size(S,1), length(tmp1))
FFTW.execute(p, tmp1, tmp2)

with

FFTW.execute(p, sub(S,1:size(S,1), k), tmp2)

Though this requires small allocations in sub, I've confirmed that my implementation is faster for relatively short signals. Unfortunately, it may be slower for long signals (depending on window length, overlap, window type, etc.). Below is the testing code:

using DSP, Base.Test

wlen = 2048
overlap = div(wlen, 2)

x1 = rand(2^12)
println("testing with short signal (length: $(length(x1)))")
X1 = stft(x1, wlen, overlap)

# force JIT-compilation
istft(X1, wlen, overlap)
istft2(X1, wlen, overlap)

@time y1 = istft(X1, wlen, overlap)
@time y2 = istft2(X1, wlen, overlap)
@test_approx_eq x1 y1
@test_approx_eq y1 y2

x1 = rand(2^22)
println("testing with long signal (length: $(length(x1)))")

X1 = stft(x1, wlen, overlap)
@time y1 = istft(X1, wlen, overlap)
@time y2 = istft2(X1, wlen, overlap)
@test_approx_eq x1 y1
@test_approx_eq y1 y2

The results using Julia v0.3.2:

testing with short signal (length: 4096)
elapsed time: 0.000865683 seconds (131456 bytes allocated)
elapsed time: 0.000283067 seconds (102768 bytes allocated)
testing with long signal (length: 4194304)
elapsed time: 0.106112303 seconds (67160936 bytes allocated)
elapsed time: 0.167759533 seconds (69568720 bytes allocated)

complete code for istft2: https://gist.github.com/r9y9/93d1645275f7bea9569a

I agree with keeping the current implementation. I think we can add benchmarks and improve performance later.

@jfsantos (Member Author)

I tested both the suggested version with pointer_to_array and the one with copy!. Surprisingly, the one using copy! uses slightly less memory (maybe because a new pointer wrapper is allocated at each iteration?). I only averaged over 100 executions, but the version using copy! was also slightly faster.

Since the @time macro measures the time of a single execution of the function, it may not be accurate. For this reason, I usually run the same expression 100-1000 times and average the result of the @elapsed expression instead:

t = zeros(100)
for i = 1:100
    t[i] = @elapsed y1 = istft(X1, wlen, overlap)
end
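
and then look at, e.g.,

mean(t)  # average elapsed time over the 100 runs, more stable than a single @time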

When trying both @r9y9's approach and the pointer_to_array approach, I get segfaults from an FFTW function call on my machine:

signal (11): Segmentation fault: 11
r2cb_16 at /usr/local/Cellar/fftw/3.3.4/lib/libfftw3.3.dylib (unknown line)
apply at /usr/local/Cellar/fftw/3.3.4/lib/libfftw3.3.dylib (unknown line)
apply at /usr/local/Cellar/fftw/3.3.4/lib/libfftw3.3.dylib (unknown line)
apply_dif_dft at /usr/local/Cellar/fftw/3.3.4/lib/libfftw3.3.dylib (unknown line)
apply_hc2r at /usr/local/Cellar/fftw/3.3.4/lib/libfftw3.3.dylib (unknown line)
istft2 at /Users/jfsantos/.julia/v0.3/DSP/src/util.jl:150
julia_istft2;20180 at  (unknown line)
jlcall_istft2;20180 at  (unknown line)
jl_apply at /private/tmp/julia-MRinr5/src/./julia.h:980
jl_apply at /private/tmp/julia-MRinr5/src/interpreter.c:59
eval at /private/tmp/julia-MRinr5/src/interpreter.c:207
eval at /private/tmp/julia-MRinr5/src/interpreter.c:215
eval_body at /private/tmp/julia-MRinr5/src/interpreter.c:541
jl_interpret_toplevel_thunk_with at /private/tmp/julia-MRinr5/src/interpreter.c:569
jl_toplevel_eval_flex at /private/tmp/julia-MRinr5/src/toplevel.c:511
jl_f_top_eval at /private/tmp/julia-MRinr5/src/builtins.c:399
eval_user_input at REPL.jl:54
jlcall_eval_user_input;19917 at  (unknown line)
jl_apply at /private/tmp/julia-MRinr5/src/./julia.h:980
anonymous at task.jl:96
jl_apply at /private/tmp/julia-MRinr5/src/task.c:427
julia_trampoline at /private/tmp/julia-MRinr5/src/init.c:1007
Segmentation fault: 11

With the istft2 implementation, it segfaults on the first call. Using pointer_to_array, it segfaults after some calls. I am not sure what's causing this behavior.

@r9y9 (Contributor) commented Nov 28, 2014

Oh, I had the same segfaults. Could you modify backward_plan as follows and try again?

backward_plan2{T<:Union(Float32, Float64)}(X::AbstractArray{Complex{T}}, Y::AbstractArray{T}) =
    FFTW.Plan(X, Y, 1, FFTW.ESTIMATE | FFTW.PRESERVE_INPUT, FFTW.NO_TIMELIMIT).plan

to

backward_plan2{T<:Union(Float32, Float64)}(X::AbstractArray{Complex{T}}, Y::AbstractArray{T}) =
    FFTW.Plan(X, Y, 1, FFTW.ESTIMATE | FFTW.PRESERVE_INPUT, FFTW.NO_TIMELIMIT)

and also

FFTW.execute(p, tmp1, tmp2)

to

FFTW.execute(p.plan, tmp1, tmp2)

I am not sure why the segfaults happen, but this change seems to fix them in my environment. It's weird...

@gummif (Member) commented Nov 28, 2014

I'm not able to produce a significant difference, so the version with copy! might be safer.

julia> @time for i = 1:nn
           y2 = istft2(X1, wlen, overlap)
       end
elapsed time: 0.148178763 seconds (110615752 bytes allocated, 34.14% gc time)

julia> @time for i = 1:nn
           y1 = istft(X1, wlen, overlap)
       end
elapsed time: 0.159816237 seconds (110062040 bytes allocated, 46.83% gc time)

julia> @time for i = 1:nn
           y2 = istft2(X1, wlen, overlap)
       end
elapsed time: 0.148254777 seconds (110615736 bytes allocated, 32.35% gc time)

julia> @time for i = 1:nn
           y1 = istft(X1, wlen, overlap)
       end
elapsed time: 0.13468912 seconds (110061368 bytes allocated, 37.36% gc time)

@r9y9 (Contributor) commented Nov 29, 2014

Sorry for my insufficient benchmarks. I agree with @gummif

@simonster (Member)

This is a late response, but the segfaults may come from applying a plan that expects an array with a certain alignment to an array with a different alignment, which would be the case for a SubArray with an odd offset. You can use the UNALIGNED flag to tell FFTW not to assume aligned input, but the last time I looked at this, an unaligned FFT was measurably slower than an aligned one, and like @gummif I found no substantial overhead to using copy!.
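
If you did want to go the unaligned route, the plan would presumably be constructed like the ones above, just with the extra flag (an untested sketch):

backward_plan_unaligned{T<:Union(Float32, Float64)}(X::AbstractArray{Complex{T}}, Y::AbstractArray{T}) =
    FFTW.Plan(X, Y, 1, FFTW.ESTIMATE | FFTW.UNALIGNED, FFTW.NO_TIMELIMIT).plan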

@jfsantos (Member Author) commented Dec 8, 2014

Alright, so there is no reason not to use copy! then. I just need to add some more tests and docs to this.

@jfsantos (Member Author)

Oh, it seems I did something terrible while rebasing. I thought I had squashed everything, but instead I made this mess. Any tips on how to fix this?

@coveralls

Coverage Status

Coverage decreased (-0.35%) to 97.06% when pulling e5a0d01 on jfs/istft into bcca68b on master.

nframes = size(S,2)-1
outlen = nfft + nframes*winc
out = zeros(T, outlen)
tmp1 = similar(S[:,1])
Member

This should be Array(eltype(S), size(S, 1)) to avoid making an unnecessary copy and to ensure that the array is actually an Array (which is not necessarily the case if S is an AbstractMatrix).

Member Author

Thanks, will fix!

@coveralls

Coverage Status

Coverage increased (+0.07%) to 97.48% when pulling 3349186 on jfs/istft into bcca68b on master.

@simonster (Member)

Not entirely sure what's going on with the history here, but the diff looks right. I guess the nuclear option is:

git diff master > patch.diff
git reset --hard master
patch -p1 < patch.diff
rm patch.diff

and then commit and force-push to this branch.

@jfsantos (Member Author) commented Jun 6, 2015

This should do the trick, thanks @simonster! Let's wait until the tests pass and then finally merge this?

@jfsantos (Member Author) commented Jul 2, 2015

Any more comments on this? @simonster @gummif @r9y9

@standarddeviant commented Oct 31, 2017

Whatever happened to this? It came up in a Discourse thread re: istft in DSP.jl:
https://discourse.julialang.org/t/should-there-be-an-inverse-stft-in-dsp-jl/6661/2

@jfsantos (Member Author)

@standarddeviant I never managed to finish this PR; in the meantime, Julia changed a lot and I was no longer using it on a daily basis. Please feel free to do whatever is needed to get this working with newer versions of Julia and submit another PR.

@davidavdav (Contributor) commented Jan 19, 2018

I started to write an istft as well today, until I found this PR and the MusicProcessing reference mentioned above. The latter turned out to have an alternative implementation of MFCC as well.

It would be useful, indeed, to have this istft functionality added to DSP.

While implementing my own little istft, I noticed that DSP.arraysplit() seems to only keep "whole" segments, dropping the last samples of the source array. If we want istft to allow perfect reconstruction, arraysplit should probably have an option to augment (pad) the input.

While we're at it, it would also be great to have an option for arraysplit such that the overlap is symmetric on the left and right of each frame, even for the first and last frame. It may be best to illustrate this with an example. Suppose I have 1000 samples, and I want frame shifts of 100 samples and an analysis window of 300 samples. Then the first frame would be samples -100..200, the second frame would be samples 0..300, etc., until the last frame 800..1100 (give or take a Julia base-1 index offset). Negative and >size indexes would refer to augmented data, e.g., zeros.

The advantage is that such a scheme would produce 1000/100 = 10 frames, so the frames are easier to align with the original samples (the first frame aligned with samples 0..100, the middle of its analysis window, etc.). The analysis would be symmetric w.r.t. the start and end of the signal, and of the frames themselves.

In Kaldi, a modern speech recognition toolkit, this is known as the analysis option snip-edges=false.
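
Concretely, the index arithmetic for the example above would be something like this (an illustrative sketch with 0-based sample indices, not an existing arraysplit option):

nsamples, shift, wlen = 1000, 100, 300
nframes = div(nsamples, shift)            # 10 frames
for k = 0:nframes-1
    center = k*shift + div(shift, 2)      # frame k is aligned with samples k*shift .. (k+1)*shift
    istart = center - div(wlen, 2)        # may be negative: refers to augmented data, e.g. zeros
    istop  = istart + wlen - 1            # may run past nsamples-1 at the end
    # frame k covers samples istart..istop: -100..199 for k = 0, ..., 800..1099 for k = 9
end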

@galenlynch (Member)

It seems a little bit unnatural to me to circularly wrap a signal by default. When I apply STFTs to data, it's usually because the data is non-stationary, and the first and last frames end up being quite different from each other.

@davidavdav, would you have any interest in continuing this PR?

@bafonso commented Aug 17, 2020

gist here
Hi, I'd like to help get istft implemented in DSP. I am trying to bring back MusicProcessing, I need a working istft, and I see others have shown interest in having this in the DSP library. I am trying to familiarise myself with the DSP.jl code, and I am sharing code inspired by this PR with some changes so it plays nicely with the current DSP.jl code. I tried to follow the same approach, but I am sure I am missing several details.. help please? :-)

Right now it appears to work OK using a hanning window, but it is wrongly scaled at the beginning when the window is "nothing". I am not sure how to properly address this; perhaps the scaling I see in older code?

@test x1 ≈ Y1

fails even with a hanning window, I believe because of discrepancies at the first sample. Once we have the algorithm working, I'd be happy to work on the tests and documentation and do a proper PR, but I'm happy to hear your thoughts.

@davidavdav (Contributor)

It seems a little bit unnatural to me to circularly wrap a signal by default. When I apply STFTs to data, it's usually because the data is non-stationary, and the first and last frames end up being quite different from each other.

I don't think signals under an STFT should be circularly wrapped, either. Where is that suggested and/or happening?

@davidavdav, would you have any interest in continuing this PR?

I think that if we have an stft, there should be an istft that at least potentially allows perfect reconstruction.
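
For the weighted overlap-add scheme discussed above, a convenient sufficient condition is that the squared windows sum to a constant across frames (otherwise you need the per-sample wsum normalization from earlier in the thread). A quick check of that condition, as a sketch using hanning from DSP and illustrative sizes:

using DSP
wlen, hop = 256, 128
win = hanning(wlen)
nframes = 8
acc = zeros(wlen + (nframes - 1) * hop)
for k = 0:nframes-1
    acc[k*hop+1 : k*hop+wlen] += win .^ 2
end
# In the fully overlapped middle region, acc should be (nearly) constant.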

@galenlynch (Member) commented Aug 17, 2020

I don't think signals under an STFT should be circularly wrapped, either. Where is that suggested and/or happening?

Good question, I don't know where I got that from.
