Audio docks for ESP32 mini (ESP32, ESP32C3, ESP32S2 and ESP8266 mini modules from Wemos)

sonocotta/esp32-audio-dock

ESP32 Audio Docks and Louder ESP


ESP32 Audio Docks is a range of development boards (earlier, docks) that let you develop audio solutions based on ESP32 chips. They were created to make getting started with audio development as easy and inexpensive as possible.

First generation docks

image

HiFi-ESP32 Loud-ESP32 Louder-ESP32
DSC_0009 DSC_0017 DSC_0019

Motivation

I spent the last few years developing different solutions based on ESP devices. It all started with ESP8266, where CPU power is not really sufficient to do real-time decoding, so you're limited to a rather simple ding-dong business. Then ESP32 came, bringing two much more capable cores, so you have a powerhouse to handle communication and decoding at the same time. Perhaps most importantly it also came with SPIRAM, so you can do decent buffering (essential for streamed content). Now new ESP32 C-Series and S-Series chips are entering the market, and their potential is mostly unrealized as of today.

I created these docks, and subsequently the development boards, to be able to prototype quickly across the whole range of ESP8266 and ESP32 chips, starting with the simplest finger-sized toys and going all the way up to full-sized speakers.

Features

First generation docks
| | ESP Audio Solo | ESP Audio Duo | HiFi ESP | Louder ESP |
|---|---|---|---|---|
| MCU | ESP8266, ESP32C3, ESP32S2 Mini modules | ESP32 Mini Module | ESP32 Mini Module | ESP32 Mini Module |
| DAC | Single I2S DAC (MAX98357) with built-in Class-D amp | Dual I2S DAC (MAX98357) with built-in Class-D amp | PCM5100A 32bit Stereo DAC, -100 dB typical noise level | Stereo I2S DAC (TAS5805M) with built-in Class-D amp |
| Output (4Ω) | 3W | 2x 3W | Non-amplified stereo output | 2x 32W (4Ω, 1% THD+N) |
| Output (8Ω) | 1.5W | 2x 1.5W | Non-amplified stereo output | 2x 22W (8Ω, 1% THD+N) |
| PSRAM | - | 8MB PSRAM (4MB usable) | 8MB PSRAM (4MB usable) | 8MB PSRAM (4MB usable) |
| Connectivity | WiFi (ESP8266, ESP32S2); WiFi + BT5.0 (ESP32C3) | WiFi + BT4.2 + BLE | WiFi + BT4.2 + BLE | WiFi + BT4.2 + BLE; Ethernet |
| | HiFi-ESP32 | HiFi-ESP32S3 | Loud-ESP32 | Loud-ESP32S3 | Louder-ESP32 | Louder-ESP32S3 |
|---|---|---|---|---|---|---|
| MCU | ESP32-WROVER-N16R8 | ESP32-S3-WROOM-N16R8 | ESP32-WROVER-N16R8 | ESP32-S3-WROOM-N16R8 | ESP32-WROVER-N16R8 | ESP32-S3-WROOM-N16R8 |
| DAC | PCM5100A 32bit Stereo DAC, -100 dB typical noise level | PCM5100A 32bit Stereo DAC, -100 dB typical noise level | Dual I2S DAC (MAX98357) with built-in Class-D amp | Dual I2S DAC (MAX98357) with built-in Class-D amp | Stereo I2S DAC (TAS5805M) with built-in Class-D amp | Stereo I2S DAC (TAS5805M) with built-in Class-D amp |
| Output (4Ω) | Non-amplified stereo output, 2.1V RMS | Non-amplified stereo output, 2.1V RMS | 2x 5W | 2x 5W | 2x 32W (4Ω, 1% THD+N) | 2x 32W (4Ω, 1% THD+N) |
| Output (8Ω) | Non-amplified stereo output | Non-amplified stereo output | 2x 3W | 2x 3W | 2x 22W (8Ω, 1% THD+N) | 2x 22W (8Ω, 1% THD+N) |
| PSRAM | 8MB PSRAM (4MB usable) over 40MHz SPI | 8MB PSRAM over 80MHz QSPI | 8MB PSRAM (4MB usable) over 40MHz SPI | 8MB PSRAM over 80MHz QSPI | 8MB PSRAM (4MB usable) over 40MHz SPI | 8MB PSRAM over 80MHz QSPI |
| Power | 5V over USB-C, 2x LP5907 3.3 V Ultra-Low-Noise LDO for analog section | 5V over USB-C, 2x LP5907 3.3 V Ultra-Low-Noise LDO for analog section | 5V (up to 2.5A) from USB-C | 5V (up to 2.5A) from USB-C | Up to 26V from external PSU; 5V over USB-C with power limited to 2x5W | Up to 26V from external PSU; 5V over USB-C with power limited to 2x5W |
| Connectivity | WiFi + BT4.2 + BLE; W5500 Ethernet (optional module) | WiFi + BLE; W5500 Ethernet (optional module) | WiFi + BT4.2 + BLE; W5500 Ethernet (optional module) | WiFi + BLE; W5500 Ethernet (optional module) | WiFi + BT4.2 + BLE; W5500 Ethernet (optional module) | WiFi + BLE; W5500 Ethernet (optional module) |

Onboard PSRAM

Audio streaming requires proper buffering to work, and even with the ESP32's roughly 500 KB of RAM this is a challenging task. For that reason, most projects will require WROVER modules that have onboard PSRAM chips. All ESP32 Audio boards have an 8MB PSRAM chip onboard, connected via a high-speed interface. Any code using PSRAM will just work out of the box.
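To get a feel for why PSRAM matters, here is a rough back-of-the-envelope sketch in Python. It assumes uncompressed 16-bit stereo PCM at 44.1 kHz, which is an illustrative assumption rather than something the boards require, and it ignores the fact that internal RAM is shared with the rest of the firmware. The function name is made up for this example.

```python
def buffer_seconds(buffer_bytes: int, sample_rate_hz: int = 44_100,
                   channels: int = 2, bytes_per_sample: int = 2) -> float:
    """Seconds of uncompressed PCM audio that fit in a buffer of the given size."""
    bytes_per_second = sample_rate_hz * channels * bytes_per_sample
    return buffer_bytes / bytes_per_second


internal_ram = 500 * 1024        # roughly 500 KB of internal RAM (all of it, optimistically)
usable_psram = 4 * 1024 * 1024   # the 4 MB of PSRAM that is usable on ESP32

print(f"internal RAM: {buffer_seconds(internal_ram):.1f} s")
print(f"PSRAM:        {buffer_seconds(usable_psram):.1f} s")
```

Under these assumptions the internal RAM holds under 3 seconds of audio even if nothing else used it, while the usable PSRAM holds well over 20 seconds. Compressed streams stretch both figures, but the ratio stays the same.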

Boards Pinout

First generation docks

ESP Audio Solo

| | I2S CLK | I2S DATA | I2S WS |
|---|---|---|---|
| ESP8266 | 15 | 3 | 2 |
| ESP32-C3 | 5 | 20 | 6 |
| ESP32-S2 | 12 | 37 | 16 |

ESP Audio Duo

| | I2S CLK | I2S DATA | I2S WS | PSRAM CE | PSRAM CLK |
|---|---|---|---|---|---|
| ESP32 | 26 | 22 | 25 | 16 | 17 |

HiFi-ESP

| | I2S CLK | I2S DATA | I2S WS | PSRAM CE | PSRAM CLK |
|---|---|---|---|---|---|
| ESP32 | 26 | 22 | 25 | 16 | 17 |

Louder ESP

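| | I2S CLK | I2S DATA | I2S WS | PSRAM CE | PSRAM CLK | TAS5805 SDA | TAS5805 SCL | TAS5805 PWDN | TAS5805 FAULT |
|---|---|---|---|---|---|---|---|---|---|
| ESP32 | 26 | 22 | 25 | 16 | 17 | 21 | 27 | 33 | 34 |
| ESP32-S3 | 14 | 16 | 15 | - | - | 8 | 9 | 17 | 18 |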

HiFi-ESP32

| | I2S CLK | I2S DATA | I2S WS | PSRAM RESERVED |
|---|---|---|---|---|
| ESP32 | 26 | 22 | 25 | 16, 17 |
| ESP32-S3 | 14 | 16 | 15 | 35, 36, 37 |

Loud-ESP32

| | I2S CLK | I2S DATA | I2S WS | DAC EN | PSRAM RESERVED |
|---|---|---|---|---|---|
| ESP32 | 26 | 22 | 25 | 13 | 16, 17 |
| ESP32-S3 | 14 | 16 | 15 | 8 | 35, 36, 37 |

Louder-ESP32

| | I2S CLK | I2S DATA | I2S WS | PSRAM RESERVED | TAS5805 SDA | TAS5805 SCL | TAS5805 PWDN | TAS5805 FAULT |
|---|---|---|---|---|---|---|---|---|
| ESP32 | 26 | 22 | 25 | 16, 17 | 21 | 27 | 33 | 34 |
| ESP32-S3 | 14 | 16 | 15 | 35, 36, 37 | 8 | 9 | 17 | 18 |

Ethernet (all boards)

| | SPI CLK | SPI MOSI | SPI MISO | SPI CS | SPI HOST/SPEED | ETH INT | ETH RST |
|---|---|---|---|---|---|---|---|
| ESP32 | 18 | 23 | 19 | 5 | 2/20MHz | 35 | 14 |
| ESP32-S3 | 12 | 11 | 13 | 10 | SPI2/20MHz | 6 | 5 |

Optional peripheral (all boards)

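| | IR IN | RGB OUT | OLED SPI HOST/SPEED | OLED SPI CLK | OLED SPI MOSI | OLED SPI MISO | OLED SPI CS | OLED SPI DC | OLED RST |
|---|---|---|---|---|---|---|---|---|---|
| ESP32 | 39 | 12 | 2/20MHz | 18 | 23 | 19 | 15 | 4 | 32 |
| ESP32-S3 | 7 | 9 | SPI2/20MHz | 12 | 11 | 13 | 39 | (37) | 38 |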

Software samples

In the software section two firmware examples are provided.

PlatformIO IDE

All samples are provided as PlatformIO IDE projects. After installing PlatformIO, open the sample project and select the proper environment based on your dock. Run the Build and Upload commands to install the necessary tools and libraries, then build and upload the project to the board. Communication and selection of the proper upload method are handled by the IDE automatically.

Arduino IDE

Follow the ESP8266Audio library guide. Default settings will work out of the box with ESP8266 and ESP32 boards. For ESP32C3 and ESP32S2 boards, please adjust the pinout according to the Boards Pinout section above.

ESPHome and Home Assistant

Since this is an ESP32-based device, you can easily integrate it into Home Assistant using ESPHome. Start with the ESPHome web installer, which will give you a base ESPHome install and WiFi configuration in minutes. Some S2/S3 boards have issues with the web installer; you may need to use the Adafruit flasher instead, with binaries pulled from HA.

Install instructions

image image

Next, navigate to your Home Assistant (assuming you have your ESPHome integration installed), and adopt the newly created node

image

The ESPHome section of the repository provides ESPHome configs for the Solo board running with ESP32-S2/S3, as well as for Duo/HiFi-ESP and Louder ESP working with ESP32.

A few words of explanation:

  • media_player publishes the media player into Home Assistant, so you can use it together with the native player or Music Assistant. You have a volume knob in HA as well.
  • image
  • Volume is set to 50% on player start. Especially for Louder-ESP32, this is helpful :)

Bonus - automation example

The true power of a native speaker in HA is its use in automations. Here is one example that I find useful: this simple automation makes a spoken announcement every hour between 8 AM and 9 PM. Another one announces bedtime; you get the point...

image

Squeezelite-ESP32

Squeezelite-ESP32 is a multimedia software suite that started as a renderer (or player) for LMS (Logitech Media Server). It has since been extended with:

  • Spotify over-the-air player using SpotifyConnect (thanks to cspot)
  • AirPlay controller (iPhone, iTunes ...) with multi-room synchronization as well (although it's AirPlay 1 only)
  • Traditional Bluetooth device (iPhone, Android)

And LMS itself

  • Streams your local music and connects to all major online music providers (Spotify, Deezer, Tidal, Qobuz) using Logitech Media Server - a.k.a LMS with multi-room audio synchronization.
  • LMS can be extended by numerous plugins and can be controlled using a Web browser or dedicated applications (iPhone, Android).
  • It can also send audio to UPnP, Sonos, Chromecast, and AirPlay speakers/devices.

All ESP32-based boards are tested with Squeezelite-ESP32 software, which can be flashed using nothing but a web browser. You can use Squeezelite-ESP32 installer for that purpose.

How to flash and configure ("ESP Audio Duo", "HiFi-ESP" and "Louder ESP")

Use Installer for ESP Audio Dock to flash firmware first. It has been preconfigured to work with ESP Audio boards and will configure all hardware automatically.

Install instructions
Select the correct device first image
Connect the device to the USB port and select it from the list image
Press Flash and wait around 2 minutes image
(Optional) You may enter the serial console to get more information image
Device is in recovery mode. Connect to squeezelite-299fac wifi network with squeezelite password (your network name suffix will be different) image
When redirected to the captive portal let the device scan wifi network and provide valid credentials
You can use provided IP address (http://192.168.1.99/ on the screenshot) to access settings page image
(Optional) You may change device names to something close to your heart image
Exit recovery image

You can use it now

Bluetooth Spotify Connect AirPlay LMS Renderer
image image image image

Ethernet configuration

If you have the optional Ethernet module on the board, please put this config in the NVS settings:

ESP32
eth_config = model=w5500,cs=5,speed=20000000,intr=35,rst=14
spi_config = mosi=23,clk=18,host=2,miso=19
ESP32S3
eth_config = model=w5500,cs=10,speed=20000000,intr=6,rst=5
spi_config = mosi=11,clk=12,host=2,miso=13
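The two NVS values above follow a simple comma-separated key=value pattern. As an illustration only, the Python sketch below rebuilds them from the pin numbers in the Ethernet pinout table. The ETH_PINS dictionary and the nvs_settings helper are hypothetical names made up for this example; they are not part of squeezelite-esp32.

```python
# Pin numbers copied from the "Ethernet (all boards)" pinout table.
ETH_PINS = {
    "ESP32":   {"clk": 18, "mosi": 23, "miso": 19, "cs": 5,  "intr": 35, "rst": 14},
    "ESP32S3": {"clk": 12, "mosi": 11, "miso": 13, "cs": 10, "intr": 6,  "rst": 5},
}


def nvs_settings(chip: str, host: int = 2, speed_hz: int = 20_000_000) -> dict:
    """Build the eth_config and spi_config strings for the given chip."""
    p = ETH_PINS[chip]
    return {
        "eth_config": (
            f"model=w5500,cs={p['cs']},speed={speed_hz},"
            f"intr={p['intr']},rst={p['rst']}"
        ),
        "spi_config": f"mosi={p['mosi']},clk={p['clk']},host={host},miso={p['miso']}",
    }


for chip in ETH_PINS:
    for key, value in nvs_settings(chip).items():
        print(f"{chip}: {key} = {value}")
```

Generating the strings from one table makes it harder for the pinout and the NVS values to drift apart when a pin changes.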

Hardware

Please visit the hardware section for board schematics and PCB designs. Note that PCBs are shared as multi-layer PDFs.

First generation docks

image

ESP Audio Solo

Image Legend
image image MAX98357 DAC
image Speaker Terminal

ESP Audio Duo

Image Legend
image image MAX98357 DAC
image Speaker Terminals
image 8MB PSRAM IC

HiFi-ESP

Image Legend
image image PCM5100A DAC
image Speaker Terminals
image 8MB PSRAM IC
image Ultra-Low noise LDO 3V3 Voltage regulator

Louder ESP

Image Legend
image image TAS5805M DAC
image Speaker Terminals
image 8MB PSRAM IC
image 3V3 Drop-Down voltage regulator (powers ESP32)
image Input Voltage terminal
image (REV B, C, D) image TAS5805M DAC
image Speaker Terminals
- 8MB PSRAM IC (Hidden under ESP32 module)
- 3V3 Drop-Down voltage regulator (powers ESP32, hidden under ESP32 module)
image Input Voltage terminal

HiFi-ESP32

Image
image
image

Loud-ESP32

Image
image
image

Louder-ESP32 and Louder-ESP32S3

Image
DSC_0013_small JPG-mh
DSC_0012_small JPG-mh

Optional SPI Ethernet module

Every board has a header that allows you to solder in a W5500 SPI Ethernet module, which is very easy to find. The only downside is that with the module installed the board will not fit the case, unless the case is cut to accommodate the extra height.

HiFi-ESP32(S3) Loud-ESP32(S3) Louder-ESP32(S3)
DSC_0015 DSC_0026

These are the squeezelite-esp32 NVS settings that you need to apply to enable it:

ESP32
eth_config = model=w5500,cs=5,speed=20000000,intr=35,rst=14
spi_config = mosi=23,clk=18,host=2,miso=19
ESP32S3
eth_config = model=w5500,cs=10,speed=20000000,intr=6,rst=5
spi_config = mosi=11,clk=12,host=2,miso=13

BTL and PBTL mode (TAS5805M DAC)

The TAS5805M DAC allows two modes of operation: BTL (stereo) and PBTL (parallel, or mono). In mono mode the amp uses a completely different modulation scheme and fully synchronizes the output drivers. Jumpers on the board allow both output drivers to connect to the same speaker. The most important step is to tell the amp to change modulation in the first place via an I2C command. In the case of squeezelite, the dac_controlset value is the following:

dac_controlset: `{"init":[{"reg":3,"val":2},{"reg":3,"val":3},{"reg":2,"val":4}],"poweron":[{"reg":3,"val":3}],"poweroff":[{"reg":3,"val":0}]}`

compared to default:

dac_controlset: `{"init":[{"reg":3,"val":2},{"reg":3,"val":3}],"poweron":[{"reg":3,"val":3}],"poweroff":[{"reg":3,"val":0}]}`
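The only difference between the two values is one extra register write in the init sequence. The Python sketch below, given purely as an illustration, parses both JSON strings copied from above and reports the writes that appear only in the PBTL variant. The helper name is made up for this example, and what register 2 and value 4 mean inside the TAS5805M should be checked against the datasheet.

```python
import json

# Both strings are copied verbatim from the dac_controlset values above.
PBTL = (
    '{"init":[{"reg":3,"val":2},{"reg":3,"val":3},{"reg":2,"val":4}],'
    '"poweron":[{"reg":3,"val":3}],"poweroff":[{"reg":3,"val":0}]}'
)
DEFAULT = (
    '{"init":[{"reg":3,"val":2},{"reg":3,"val":3}],'
    '"poweron":[{"reg":3,"val":3}],"poweroff":[{"reg":3,"val":0}]}'
)


def extra_writes(variant: str, baseline: str) -> dict:
    """Per section, list register writes present in variant but not in baseline."""
    v, b = json.loads(variant), json.loads(baseline)
    diff = {}
    for section, writes in v.items():
        added = [w for w in writes if w not in b.get(section, [])]
        if added:
            diff[section] = added
    return diff


print(extra_writes(PBTL, DEFAULT))  # {'init': [{'reg': 2, 'val': 4}]}
```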

One can test audio with a single speaker connected between the L and R terminals (plus on one side and minus on the other). Optionally, jumpers on the board will effectively connect the second driver in parallel, doubling the current capability.

An important point: this will send only one channel to the output; that’s just how the DAC works. True mono as (L+R)/2 is possible via more in-depth configuration (very poorly documented), but I haven’t managed to configure that on my test stand yet. I’m still working on that, along with a few more really cool DSP features that this DAC has, like EQ, subwoofer mode and tone compensation settings.

| | BTL | PBTL |
|---|---|---|
| Description | Bridge Tied Load, Stereo | Parallel Bridge Tied Load, Mono |
| Rated Power | 2×23W (8-Ω, 21 V, THD+N=1%) | 45W (4-Ω, 21 V, THD+N=1%) |
| Schematics | image | image |
| Speaker Connection | image | image |

Starting from Rev E, an additional header is exposed to allow datasheet-specced connectivity

Image Legend
Stereo Mode - leave open image
Mono (PBTL) Mode, close horizontally image

TAS5805M DSP capabilities

The TAS5805M DAC has a very powerful DSP that allows a lot of data processing to be done on the silicon, which would otherwise take a considerable part of your CPU time. At the time of writing it is mostly an undiscovered part of the DAC since, unfortunately, TI is not making it very easy for developers. (A minute of complaint.) To be more specific: (A) you need to be a proven hardware manufacturer to get access to the configuration software, namely PurePath; (B) you need to apply for a personal license and go through an approval process, and after a few weeks of waiting you get access to the one DAC configuration you asked for; (C) you find out that it works with TI's own evaluation board, which will set you back $250 if you are able to find one. Otherwise, all you have is a list of I2C commands that you need to transfer to the device on your own. No wonder no one knows how to use it.

But moaning aside, here is what you get:

  • Flexible input mixer with gain corrections
  • 15 EQ with numerous filter configurations
  • 3-band Dynamic Range Compression with flexible curve configuration
  • Automatic Gain Limiter with flexible configuration
  • Soft clipper
  • and a few other things

At this moment it is very experimental. In a perfect world, you should be able to adjust all of those settings to make your speaker-enclosure setup work the best it can, and even factor your room into the equation. But with the above disclaimer, I can only deliver a limited set of configurations corresponding to the most common use cases:

  • Stereo mode with enabled DRC (Loudness) and AGL settings
  • Full range Mono mode with DRC (Loudness) and AGL settings
  • Subwoofer Mono mode with a few filter frequency options
  • Bi-Amp configuration with a few crossover frequency options

All of the above are available right now for experimentation. I'm keen to hear your feedback while I move forward with porting this to other software options.

Louder ESP power considerations

The barrel jack used has a 6mm hole and a 2mm pin, which typically matches a 5.5/2.5mm plug on the male side.

image

The screw terminal is connected in parallel with the barrel jack, so you can use either interchangeably.

The power adapter specs depend on the speakers you're planning to use. DAC efficiency is close to 100%, so just take the power rating of your speakers (say 2x10W) and their impedance (say 8 ohms): you'd need at least 9 volts rated at 1.2 amps per channel, rounded up to 3 amps total.
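The rule of thumb above can be written out as a small calculation. The Python sketch below is a rough estimate that follows the same simplification as the text: efficiency is treated as 100%, voltage is taken as sqrt(P x R) and current as sqrt(P / R). The function name is made up for this example, and a real supply needs headroom on top of these numbers.

```python
import math


def supply_estimate(power_per_channel_w: float, impedance_ohm: float,
                    channels: int = 2) -> tuple:
    """Return (volts, amps per channel, total amps), assuming a lossless amp."""
    volts = math.sqrt(power_per_channel_w * impedance_ohm)             # V = sqrt(P * R)
    amps_per_channel = math.sqrt(power_per_channel_w / impedance_ohm)  # I = sqrt(P / R)
    return volts, amps_per_channel, amps_per_channel * channels


volts, amps, total = supply_estimate(10, 8)
print(f"{volts:.1f} V, {amps:.2f} A per channel, {total:.2f} A total")
# prints "8.9 V, 1.12 A per channel, 2.24 A total"
```

For 2x10W into 8 ohms this gives about 8.9 V and 1.12 A per channel, which matches the 9 V and 1.2 A figures above before rounding up to 3 A total.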

It is not recommended to go beyond the voltage your speakers can take, otherwise, the amp will blow your speakers in no time.

Case

HiFi-ESP32(S3), Loud-ESP32(S3) and Louder-ESP32(S3) are mechanically compatible with Raspberry Pi 3/4 cases, tested with transparent ones. Also, community members created a few 3-D printable designs that can be found here and here

Hifi-ESP32 Loud-ESP32 Louder-ESP32
DSC_0013 DSC_0019 DSC_0001

Where to buy

You may support my work by ordering these products at Tindie and Elecrow