proposal: audio package #18497
Comments
The first "here" link is dead.
@nhooyr thanks, fixed.
To paraphrase someone from the Python community:
Why does this need to be added to the standard library? Why can't this be something like github.com/splice/audio?
@davecheney you are right, and it already exists: https://github.com/go-audio/audio. But here is the thing: if we want to unify the audio libraries out there, we need to make the common interface standard and official. Note that this package does not provide codecs or audio tools; it's a package used by third-party package developers implementing a shared interface. Of course, all packages in the std lib could have been external and could still be moved outside, but that greatly increases the friction and reduces the chances of consolidation. If you go through the proposal, I hope it's clear that my goal is to provide common ground for all libraries. As you can imagine, if a new developer coming from company X wants to implement time stretching in Go, she might start by looking at what's available in the standard library and go from there. If nothing is mentioned, or she doesn't do the right Google search, she might end up building a solution that wouldn't work well with others. What I am trying to say is that the main value of having this package in the standard library is to have a shared entry point for all libraries. I fully agree that audio libraries implementing codecs, effects and other things should live on their own. That's also why I set up the go-audio organization :)
But why does it need to live in the standard library? All interface declarations with the same method sets are equal and interchangeable.
It's more than an interface; there are concrete implementations.
As I understand it, the majority of the standard library exists to support building Go and the go tool. @bradfitz would be the best person to answer this, he has done the archeology.
From my point of view, adding a package to the standard library is the kiss of death. The package's API is condemned to grow via accretion in unnatural ways, which we are seeing with the database/sql package.
Note: I am specifically not taking aim at the work of those adding features to the database/sql package, simply noting that if that package lived independently from the stdlib it would have been a straightforward matter to bump the major revision of the package to add support for the context arguments added in 1.8.
…On Tue, 3 Jan 2017, 18:38 Matt Aimonetti ***@***.***> wrote:
It's more than an interface, there are concrete implementations.
I'm not sure how to answer your question in a way that would satisfy you. Can you maybe help me understand why the packages we currently have are part of the standard library? Since anything that doesn't define the language itself can be external, there must be reasons that made us choose to have them as part of the standard library.
I'm worried about the Buffer implementation having to allocate. Maybe there's some way to wire multiple buffers together in a way that doesn't involve allocation, but I'm not sure how. Maybe you have some example of a real-time audio synth working on 1 ms or smaller buffers? The list of 6 conversion functions should probably work on slices rather than on single values.
I am underqualified to assess whether this should live in the standard library. That said, tangentially, do you think the "multi-dimensional slices" proposal could help reconcile the various buffer types? Also, having a "buffer protocol"-like interface would perhaps be more stdlib-worthy than concrete implementations.
@egonelbre good idea, I will post examples showing how to reuse the slice. The short version is that you have a slice and you don't have to reallocate. In what context are you concerned? In other words, how are you getting the input data that makes you think such an approach would require reallocation? Regarding the conversion functions, can you say more about your suggestion?
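As a preview of the reuse pattern described above, here is a minimal sketch; readBlock and process are hypothetical stand-ins for a real input source and effect, not anything from the proposal. The block slice is allocated once and then refilled and processed in place every iteration, so the steady-state loop performs no allocations.

```go
package main

// readBlock fills dst from the input source (stand-in for a real reader).
func readBlock(dst []float64) { /* fill dst from the input source */ }

// process applies a simple gain in place, reusing the same backing array.
func process(block []float64) {
	for i := range block {
		block[i] *= 0.5
	}
}

func main() {
	block := make([]float64, 512) // allocated once, outside the hot loop
	for i := 0; i < 1000; i++ {
		readBlock(block)
		process(block)
	}
}
```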
@sbinet yes, matrices would improve things since frames would be nicer to write/read and more performant. Regarding the bytes-like buffer protocol, I am not quite sure what that means concretely. When doing DSP, you have to convert the bytes into sample values, process them, and convert them back into bytes at the end. But you'd want to keep the samples in a processable format until the end. The audio buffer approach is used in CoreAudio, the Cocoa framework for macOS/iOS. The hard part here is to find the balance between performance and convenience. We want to avoid reallocations but we also want a nice API.
I'm mainly concerned about real-time audio. A good example would be something like:
Bonus points for:
Basically, I'm missing the big picture of how the Buffer should be used in practice.
No problem, I have a generator package and effect libraries I developed against a slightly different version of the proposal; I will port them over shortly. I don't have an output-to-device example though, but I guess output to UI would do it.
I mean that when you are using IntToIeeeFloat, I assume most of the time you are working on an []int rather than a single int. When people write the for loops manually for a particular conversion, it makes it more difficult to "upgrade" everyone to optimized versions of the conversions (e.g. asm-optimized loops). (Alternatively, I misunderstood the purpose of the conversion functions.)
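A sketch of what slice-based conversion helpers could look like; the convert package and function names here are illustrative assumptions, not the proposal's API. Centralizing the loop in one place is what lets an optimized (e.g. assembly) implementation be swapped in later without touching callers.

```go
package convert

// IntsToFloats converts integer samples into float64 samples.
// dst must be at least len(src) long; the filled prefix is returned.
func IntsToFloats(dst []float64, src []int) []float64 {
	dst = dst[:len(src)]
	for i, v := range src {
		dst[i] = float64(v)
	}
	return dst
}

// FloatsToInts performs the inverse conversion, truncating toward zero.
func FloatsToInts(dst []int, src []float64) []int {
	dst = dst[:len(src)]
	for i, v := range src {
		dst[i] = int(v)
	}
	return dst
}
```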
So far nothing has been decided for #17244, which would impact where this lives. I suggest reviewing that issue and chiming in there. Outside of determining the import path for the audio packages, I think it would be great to have a set of go-to audio packages in the Go community. Also related: #13432. (Also, I really like @rakyll's suggestion of special interest groups.)
I think adding a new experimental package to the std is a non-starter. The first step should be adding the proposed package as x/media/audio.
I'll be focusing on the technical details of the proposal and answering the technical concerns that were and will be brought up. The decision of where this package should live is a separate issue; my last comment on the topic is that we have generic packages such as
In the meantime, I would very much appreciate feedback on the technical proposal more than on where the package should live. For comments on the std lib and x packages, please comment on Andrew's proposal mentioned above.
x/whatever is just a path. "eXtra" maybe? "x/exp" is experimental. But for the rest, yeah, probably best to keep this related to just "what a standard audio package should look like" regardless of where it lives.
@mattetti wrt the buffer protocol, I had played a bit with the persistency aspects here:
I think having interfaces in the standard library has some advantages over defining interfaces in third-party packages. People are much more likely to go out of their way to use them and maintain compatibility than they are when it's an interface in some third-party library. (In fact, unless there's a clear third-party market leader in a domain, users are unlikely to even find the interface if it's not part of the standard library. As a Go user, I've implemented image.Image and io.Reader on types, but I've never gone out of my way to find other libraries' interfaces to implement and don't even know where I'd get started.) The idea behind audio.Buffer being in the standard library seems reasonable to me, in order to improve interoperability of packages that do audio processing. That said, I don't really know much about audio processing, but having read your proposal I have some questions:
FWIW, the standard library contributors have neither the bandwidth nor the technical skills to maintain an audio package. I would wait for #17244 to be resolved to see if x/ might be a good domain. Otherwise, a special workgroup would be the best option for media/audio projects. A standard interface might be good to have, if possible, for PCM buffers. I naively believe that a community can converge on a standard interface without contributing it to the standard library, though. If the work is owned by a specialist group under the Go ecosystem, there is less chance that it will be limited by the resources of a company or the Go project. Not every idea in the standard library wins hearts automatically; the sql package is an example. The standard library is the worst place to contribute big interfaces, and audio requires at least one or two big interfaces :( Also, the burden of having to maintain an API forever once it is contributed to the standard library is a big contract.
@sbinet correct me if I'm wrong, but in your ndim implementation you can't reuse the slice since it's not exported, which in my case would mean a big overhead when dealing with gigabytes of data. @kardianos good point about the meaning of x; I wrongly associated it with being experimental. @driusan I obviously agree and appreciate your support. I leave it to the core team to define where it should live.
I think this is a different issue and one that can be addressed separately (your suggestion for specialist groups is a potential solution). But I agree that this is at the heart of #17244 and it has to be addressed.
He's talking about 'providing an official package for interoperating about (but not necessarily doing) X', which is exactly what this proposal is all about.
@egonelbre I use
@egonelbre here is a real-time synth example using stdin to change the oscillator frequency and portaudio to output the audio to the sound card: https://github.com/go-audio/generator/blob/master/examples/realtime/main.go Press keys to change the played note (don't forget to press
I believe the float64 to float32 truncation/copy is very cheap and not a performance concern (especially compared to allocating and populating a new slice), but I might be wrong. If that's your concern, would you mind benchmarking that for us?
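For anyone who wants to measure that cost, a minimal benchmark sketch along these lines would do; it is a plain testing benchmark (file name and package are assumptions), not part of the proposal.

```go
// bench_test.go
package audio

import "testing"

// BenchmarkFloat64ToFloat32Copy measures a straight float64 -> float32 copy
// for a typical 512-sample block, reusing a preallocated destination slice.
func BenchmarkFloat64ToFloat32Copy(b *testing.B) {
	src := make([]float64, 512)
	dst := make([]float32, len(src))
	b.ReportAllocs()
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		for j, v := range src {
			dst[j] = float32(v)
		}
	}
}
```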
Maybe I'm missing something, but I don't see the purpose of carrying around the BitDepth (and to some degree Endianness) on the Format field of the buffers, particularly the FloatBuffer. Once you're working with the data in floats, neither of those is really relevant except when converting back to an integer format again. That seems like a separate concern rather than a property of the buffer.
@kisielk this is a fair critique. The reason those values were added was for codecs, so we could round-trip without having to check or store metadata anywhere else. I just checked.
Let me try to play with changing that and see what it breaks and how to work around it. I'll report back in a couple of days.
Edit: @kisielk after playing with the suggestion, I updated the proposal and implementations. Thanks! Let me know if you're coming to NAMM, I owe you one!
I only skimmed https://github.com/go-audio/generator/blob/master/examples/realtime/main.go but, AFAICT, it (and its imports) doesn't actually use the audio.Buffer interface type, only the audio.FloatBuffer struct type. Similarly, the AsFloat32Buffer method is never called anywhere, even though you convert float64 samples to float32. As @egonelbre said, I'm missing the big picture of how the proposed standard type (the Buffer interface type) and its methods should be used in practice.
This is drifting off-topic, but I also left some quick code review comments on go-audio/generator@aed9f58 based on my skim.
@nigeltao the example I linked to was a real-time example for @egonelbre, who was worried about performance. The proposal is for the interface and the implementations. I will add more examples with decoders and encoders and why the interface is useful. I do however expect that the concrete types might be used often to avoid conversion, but the buffer can always be converted if needed.
@mattetti, @egonelbre can correct me if I'm misunderstanding his concerns, but I think I have the same concern: that the AsXxxBuffer methods have to allocate on every call. You posted an example that doesn't allocate, but it also doesn't use the AsXxxBuffer methods, so I don't think it directly addresses the concern, and I'm still a little worried about performance. In any case, you said you'll add more examples later for why the interface is useful. I look forward to those examples (and there's no need to hurry on that).
@nigeltao it might be interesting to look at the previous iteration, where I only had one concrete type: https://github.com/mattetti/audio/blob/master/pcm_buffer.go#L43, but talking with @campoy I came to the conclusion that having separate concrete types was a better approach. Maybe the
Yes, i.e. AsXxx() would only be usable for offline processing. I feel like the whole discussion will eventually converge on something like this:
It might even make sense to drop float32 or float64, but I'm not feeling confident enough to do so at this moment. But that API is pretty much the minimum you can get away with for general-purpose audio processing.
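The list referred to above did not survive in this thread, but based on the later remarks about a Node interface handling only float32/float64, a minimal API of that shape might look like the following sketch; the names Node, Process32, Process64 and Gain are illustrative assumptions, not the actual proposal.

```go
package audio

// Node is a processing stage that works on interleaved sample blocks
// in place, in either float32 or float64 precision.
type Node interface {
	Process32(samples []float32)
	Process64(samples []float64)
}

// Gain is a trivial Node implementation that scales every sample.
type Gain struct{ Amount float64 }

func (g Gain) Process32(samples []float32) {
	a := float32(g.Amount)
	for i := range samples {
		samples[i] *= a
	}
}

func (g Gain) Process64(samples []float64) {
	for i := range samples {
		samples[i] *= g.Amount
	}
}
```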
The issue is unpredictable timing, not necessarily slowness, although you still want things to be as fast as possible. If you have a lot of things with unpredictable timing, it's impossible to ensure that the "bad timings" don't happen together, which means you start to get intermittent "glitches" and no easy way of debugging what happened. For more, see http://www.rossbencina.com/code/real-time-audio-programming-101-time-waits-for-nothing Obviously, at the end of audio graphs, doing some conversion is unavoidable if you want to support different types; or you have to do a lot of work to avoid it.
Commenting on the technical side of the proposal:
The names
Exposing 16-bit data without copying would be quite important for processing audio input in real time, as some devices only provide that. Thus having
Is the idea to handle endianness in the libraries and not expose it to the user (e.g. typically samples are native-endian, WAV is little-endian, AIFF is big-endian)?
Looking at the code, it converts floating-point and integer samples to each other by simply casting them. But floating-point samples are in [-1, 1] and integer samples (e.g. 16-bit) are in [-32768, 32767], so the conversions seem quite wrong at first glance. E.g. http://blog.bjornroche.com/2013/05/the-abcs-of-pcm-uncompressed-digital.html has an introduction to uncompressed audio formats for those interested in such topics.
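To make the scaling point concrete, here is one common convention for converting between int16 and float64 samples. This is an illustrative sketch (the scale package and function names are made up), not code from the proposal, and other scaling conventions exist.

```go
package scale

// Int16ToFloat64 maps int16 samples in [-32768, 32767] onto roughly [-1, 1).
// dst must be at least len(src) long.
func Int16ToFloat64(dst []float64, src []int16) {
	const toFloat = 1.0 / 32768.0
	for i, v := range src {
		dst[i] = float64(v) * toFloat
	}
}

// Float64ToInt16 maps [-1, 1] back to int16, clamping out-of-range values.
func Float64ToInt16(dst []int16, src []float64) {
	for i, v := range src {
		s := v * 32767.0
		switch {
		case s > 32767:
			s = 32767
		case s < -32768:
			s = -32768
		}
		dst[i] = int16(s)
	}
}
```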
From the discussions on different forums, I gather that only using
So, there doesn't seem to be a consensus in the audio world. I think we can easily support both if there are compatibility wrappers for automatically handling both 32/64, given that you have one implementation. But it also creates more API surface.
With regards to the previous
This is in some ways a dup of #13423 and #16353. If we're not willing to add a place in the subrepos then we're likely not willing to add a place in the standard library. The right way to start is to create a package somewhere else (github.com/go-audio is great) and get people to use it. Once you have experience with the API being good, then it might make sense to promote it to a subrepo or eventually the standard library (the same basic path that context followed). More generally, all of these are on hold for #17244, which in turn is essentially blocked on understanding the long-term plan for package management. I intend to focus on that for the next six months. I'll mark this as Proposal-Hold to match the others.
@egonelbre @taruti don't forget that the std lib is extremely biased towards float64 (https://golang.org/pkg/math/) and int in general. While I agree that we should support float32 and int16, I think it does make things way more complicated for internal processing (by that I mean within the Go world). I need to wrap my head around Egon's interface suggestion; for some reason some aspects make me feel quite uncomfortable, which means I need to dig deeper. @rsc that's fair; as a small side note, I think #13432 should be revoked (right, @rakyll?). But in either case that shouldn't prevent the people interested from continuing the discussion about a unified audio interface and the development of core libraries. This discussion is certainly a catalyst and we might be able to put together a special interest group leading the way.
I agree that most of the Go stdlib is biased towards float64, but I think it would be a mistake to not make float32 support first class in the API. Memory is at a premium in most audio devices and you're halving your available storage for little to no benefit by restricting yourself to float64.
@mattetti to get a concrete idea, this is what I'm proposing: https://github.com/egonelbre/exp/tree/master/audio. Note, I'm unable to compile portaudio on Windows at the moment, so I'm not sure whether I've made some terrible mistakes somewhere. I tried to make the code as realistic as possible.
The support for int16 would be more of a convenience implementation, i.e. the Node interface wouldn't contain the API for handling int16, only float32 and float64. Most audio should work as a
With regards to float32/float64 support, yeah, I'm not sure what the correct approach is either. I think it's fine to say that "some particular situation needs some other implementation" and always pick float32/float64. But we should at least be aware of who we are/aren't targeting and whether we match their needs.
PS: my proposed design deals only with processing buffers, not necessarily with how you actually load/hold/stream your PCM data. For all intents and purposes you could store it in 4-bit integers if you wanted to; but when it comes to processing, you handle float32/float64.
@egonelbre sure, but even in the case of processing buffers this is a concern. Say you are designing a sample-based synthesizer (e.g. an Akai MPC) and your project has an audio pool it is working with. You'll want to store those samples in memory in the native format of your DSP path so you don't have to waste time doing conversions every time you read from your audio pool. Also remember that in-memory size also affects caching.
@egonelbre @kisielk @nigeltao @taruti should we continue this discussion at go-audio/audio#3? We can then circle back and offer an updated proposal once the core team figures out how to deal with #17244.
The
I think for processing, there are arguably smaller units than the proposed Buffer, and that's the discrete signal, represented simply as
I see the proposed buffer type as an extension of the discrete signal, and the metadata it can provide will be important in syncing multiple sources, but anything beyond that is probably doing too much (all the
Being Go as it is, and much like the math lib, you're simply going to have to commit to float64 with the occasional utility method. The extra complexity otherwise may see no use and may become an unnecessary burden. I agree a working group might be more fitting to hash out these details. Then, language such as "this might be critical for real-time processing" can be written less ambiguously. But this has happened before with various input going into (for example) https://github.com/azul3d/engine/tree/master/audio
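Assuming the comment above means the discrete signal is just a plain slice of float64 samples (the exact wording was lost), here is a minimal sketch of operating directly on such a signal; the signal package and rms helper are made up for illustration.

```go
package signal

import "math"

// rms returns the root-mean-square level of a block of samples,
// treating the discrete signal as nothing more than a []float64.
func rms(samples []float64) float64 {
	if len(samples) == 0 {
		return 0
	}
	var sum float64
	for _, s := range samples {
		sum += s * s
	}
	return math.Sqrt(sum / float64(len(samples)))
}
```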
As for a math32 library, I'm not sure if it's necessary. It's slow to call (64-bit) math.Sin inside your inner loop. Instead, I'd expect to pre-compute a global sine table, such as "var sineTable = [4096]float32{ etc }". Compute that table at "go generate" time, and you don't need the math package (or a math32 package) at run time.
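A rough illustration of the lookup-table idea: the suggestion above is to generate the table at "go generate" time, while this sketch fills it at init time for brevity; the inner-loop lookup is the same either way. The osc package name and sineAt helper are assumptions.

```go
package osc

import "math"

const tableSize = 4096 // power of two so the index can be masked

var sineTable [tableSize]float32

func init() {
	// In the go-generate variant this loop would emit Go source instead.
	for i := range sineTable {
		sineTable[i] = float32(math.Sin(2 * math.Pi * float64(i) / tableSize))
	}
}

// sineAt returns an approximation of sin(2*pi*phase) for phase in [0, 1),
// using only a table lookup in the hot path.
func sineAt(phase float64) float32 {
	return sineTable[int(phase*tableSize)&(tableSize-1)]
}
```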
Hi all, just wanted to mention http://zikichombo.org/blog/launch/ I think we've got a lot of work ahead of us w.r.t. audio and Go, but also a lot of opportunities.
The first linked repo is completely down and the second one hasn't had a commit in 5 years. Something something "the standard library is where packages go to die" @davecheney
This proposal suggests a new standard audio package. The goal of this package is not to offer a library to consume audio data, but instead to define a common interface that could be used by all audio libraries. This common interface would be audio.Buffer, and concrete implementations would be provided. According to the guidelines, I wrote a design document explaining things in detail; it can be found here.
The proposal implementation in code is available here.
Note that this proposal is a follow-up to an earlier proposal: #13432
I do realize that adding a new standard package is a big deal, and that's why I kept the scope as small as possible and also used the implementation against non-trivial real code. The added value of such a package might not be immediately obvious to programmers unfamiliar with audio digital signal processing, because end users would not see a direct benefit. However, this would be the cornerstone of an entire ecosystem. Making audio libraries compatible with each other and offering a happy path means that Go developers would be able to chain a bunch of libraries and create their own audio processing chain without much effort. It's somewhat comparable to the benefits we all see in io.Reader.
Audio is everywhere, and most available audio source code is written in C, but that doesn't have to be the case. With the rise of machine learning and language processing, developers wanting to play with audio data have the choice between hard-to-use C code and terribly inefficient Python code. Go is a great alternative, and we have an opportunity to consolidate and optimize future efforts by defining a common interface.
My goal is to start the conversation now and in the case of acceptance, I would submit the code for review during the 1.9 cycle.
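For readers who have not opened the linked design document, the following is a rough sketch of the general shape discussed in this thread: an audio.Buffer interface with AsXxxBuffer conversions and concrete buffer types. The exact method set and fields here are reconstructed from the discussion and should not be taken as the authoritative proposal.

```go
package audio

// Format describes the PCM layout shared by all buffer implementations.
type Format struct {
	NumChannels int
	SampleRate  int
}

// Buffer is the common currency that audio packages would exchange.
type Buffer interface {
	PCMFormat() *Format
	NumFrames() int
	AsFloatBuffer() *FloatBuffer
	AsFloat32Buffer() *Float32Buffer
	AsIntBuffer() *IntBuffer
}

// FloatBuffer holds float64 samples and implements Buffer.
type FloatBuffer struct {
	Format *Format
	Data   []float64
}

// Float32Buffer and IntBuffer are the other concrete implementations;
// only their fields are shown here.
type Float32Buffer struct {
	Format *Format
	Data   []float32
}

type IntBuffer struct {
	Format *Format
	Data   []int
}

func (b *FloatBuffer) PCMFormat() *Format { return b.Format }

func (b *FloatBuffer) NumFrames() int {
	if b.Format == nil || b.Format.NumChannels == 0 {
		return 0
	}
	return len(b.Data) / b.Format.NumChannels
}

func (b *FloatBuffer) AsFloatBuffer() *FloatBuffer { return b }

// AsFloat32Buffer allocates a new buffer on every call; this is the
// allocation cost debated in the comments above.
func (b *FloatBuffer) AsFloat32Buffer() *Float32Buffer {
	out := &Float32Buffer{Format: b.Format, Data: make([]float32, len(b.Data))}
	for i, v := range b.Data {
		out.Data[i] = float32(v)
	}
	return out
}

// AsIntBuffer casts without rescaling; whether conversions should rescale
// (see the int16/float discussion above) is one of the open questions.
func (b *FloatBuffer) AsIntBuffer() *IntBuffer {
	out := &IntBuffer{Format: b.Format, Data: make([]int, len(b.Data))}
	for i, v := range b.Data {
		out.Data[i] = int(v)
	}
	return out
}
```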