This article uses the ESP32 microcontroller's DAC and MicroPython to open WAV audio files and send their samples to the DAC, which is connected to a speaker as shown in Figure 1. The files used are uncompressed 8-bit mono PCM audio. The sample program targets sampling rates of about 50 kHz or 44,100 Hz.
WAV file structure
A WAV file starts with a header that identifies the file type and size, followed by a 'fmt ' chunk that describes the audio format and a 'data' chunk that holds the samples. The layout is as follows:
Bytes | Description |
---|---|
4 | 'RIFF' |
4 | file size |
4 | 'WAVE' |
4 | 'fmt ' |
4 | fmt chunk size |
2 | audio format |
2 | number of channels |
4 | sample rate |
4 | byte rate |
2 | block align |
2 | bits per sample |
4 | 'data' |
4 | size of data |
n | data |
As the table shows, reading starts with the first 4 bytes, which must be 'RIFF'. If they are, the next 4 bytes give the file size, and the 4 bytes after that must be 'WAVE'. When both markers match, the file has a valid WAV header. Numeric fields are stored in little-endian byte order, which is why the example code reads them with int.from_bytes(..., "little").
The next 4 bytes are the text 'fmt ', which marks the start of the audio format description. The final section, 'data', holds the audio samples that are read and sent to the DAC.
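The field layout in the table can also be decoded in one pass with the struct module. The following is a minimal sketch, not the article's method: parse_wav_header is a hypothetical helper name, and it assumes the simplest layout, a 16-byte 'fmt ' chunk immediately followed by the 'data' chunk, so that the header is exactly 44 bytes. Files with extra chunks would need more handling.

```python
import struct

def parse_wav_header(header):
    """Decode a 44-byte canonical PCM WAV header into a dict (hypothetical helper)."""
    if len(header) < 44:
        raise ValueError("header too short")
    # Bytes 0..11: 'RIFF', file size, 'WAVE'
    riff, file_size, wave = struct.unpack("<4sI4s", header[0:12])
    if riff != b"RIFF" or wave != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    # Bytes 12..35: 'fmt ' chunk with the six format fields
    (fmt_id, fmt_size, audio_format, channels, sample_rate,
     byte_rate, block_align, bits_per_sample) = struct.unpack(
        "<4sIHHIIHH", header[12:36])
    if fmt_id != b"fmt ":
        raise ValueError("missing 'fmt ' chunk")
    # Bytes 36..43: 'data' marker and size of the sample data
    data_id, data_size = struct.unpack("<4sI", header[36:44])
    if data_id != b"data":
        raise ValueError("missing 'data' chunk")
    return {
        "file_size": file_size,
        "fmt_size": fmt_size,
        "audio_format": audio_format,
        "channels": channels,
        "sample_rate": sample_rate,
        "byte_rate": byte_rate,
        "block_align": block_align,
        "bits_per_sample": bits_per_sample,
        "data_size": data_size,
    }

# Build a tiny 8-bit mono header in memory so the sketch runs without a file.
samples = bytes([0, 128, 255, 128])
header = struct.pack("<4sI4s4sIHHIIHH4sI",
                     b"RIFF", 36 + len(samples), b"WAVE",
                     b"fmt ", 16, 1, 1, 8000, 8000, 1, 8,
                     b"data", len(samples))
info = parse_wav_header(header)
print(info["sample_rate"], info["channels"], info["bits_per_sample"])
```

With a real file, the first 44 bytes returned by read(44) would be passed in place of the hand-built header.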
Example code
The example program plays two audio files, mono.wav and mono2.wav. The reader must first upload both files to the microcontroller board, as shown in Figure 2.
The example MicroPython code is as follows.
import time
import sys
import gc
from machine import DAC, Pin, freq

gc.enable()
gc.collect()
freq(240000000)  # run the CPU at 240 MHz

dacPin1 = Pin(25)  # connected to the speaker
dacPin2 = Pin(26)  # connected to adcPin1
dac1 = DAC(dacPin1)
dac2 = DAC(dacPin2)

def playWavFile(fName):
    monoFile = open(fName, "rb")

    # RIFF header: 'RIFF', file size, 'WAVE'
    mark = monoFile.read(4)
    if mark != b'RIFF':
        print("Not a WAV file!")
        monoFile.close()
        sys.exit(1)
    fileSize = int.from_bytes(monoFile.read(4), "little")
    print("File size = {} bytes".format(fileSize))
    fileType = monoFile.read(4)
    if fileType != b'WAVE':
        print("Not a WAV file!!")
        monoFile.close()
        sys.exit(2)

    # 'fmt ' chunk: description of the audio format
    chunk = monoFile.read(4)
    if chunk != b'fmt ':
        print("Missing 'fmt ' chunk!")
        monoFile.close()
        sys.exit(4)
    lengthFormat = int.from_bytes(monoFile.read(4), "little")
    audioFormat = int.from_bytes(monoFile.read(2), "little")
    numChannels = int.from_bytes(monoFile.read(2), "little")
    sampleRate = int.from_bytes(monoFile.read(4), "little")
    byteRate = int.from_bytes(monoFile.read(4), "little")
    blockAlign = int.from_bytes(monoFile.read(2), "little")
    bitsPerSample = int.from_bytes(monoFile.read(2), "little")
    print("Length of format data = {}".format(lengthFormat))
    print("Audio's format = {}".format(audioFormat))
    print("Number of channel(s) = {}".format(numChannels))
    print("Sample rate = {}".format(sampleRate))
    print("Byte rate = {}".format(byteRate))
    print("Block align = {}".format(blockAlign))
    print("Bits per sample = {}".format(bitsPerSample))

    # Reject non-PCM files before any audio is played (format 1 = PCM)
    if audioFormat != 1:
        print("Non-PCM audio is not supported!!!")
        monoFile.close()
        sys.exit(3)

    # 'data' chunk: the audio samples
    chunk = monoFile.read(4)
    if chunk != b'data':
        print("Not a WAV file!!!!")
        monoFile.close()
        sys.exit(5)
    dataSize = int.from_bytes(monoFile.read(4), "little")
    print("Data size = {}".format(dataSize))
    if bitsPerSample > 8:
        print("Data wider than 8 bits is not supported")
        monoFile.close()
        sys.exit(6)
    buffer = monoFile.read(dataSize)

    # Find the minimum and maximum sample values
    minValue = 255
    maxValue = 0
    for sample in buffer:
        if sample > maxValue:
            maxValue = sample
        if sample < minValue:
            minValue = sample

    # Normalize: map minValue..maxValue onto 0..255.
    # A constant or empty buffer would otherwise divide by zero.
    span = maxValue - minValue
    xScale = 255.0 / span if span > 0 else 0.0

    # Play: write one sample, then wait about one sample period
    tm = 1000000 // sampleRate  # delay in microseconds
    for sample in buffer:
        dac1.write(int((sample - minValue) * xScale))
        time.sleep_us(tm)
    print("---------------------------")

    monoFile.close()
    dac1.write(0)

############### main program
playWavFile("/mono.wav")
time.sleep_ms(1000)
playWavFile("/mono2.wav")
time.sleep_ms(1000)
While reading the audio data, the code finds the minimum and maximum sample values and uses them to scale the data so that the minimum maps to 0 and the maximum maps to 255. The minimum is subtracted from each sample, and the result is multiplied by xScale, which brings the value into the range 0 to 255.
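The same scaling can be precomputed once per file. The sketch below is a variation and not the article's code: it replaces the floating-point xScale with integer arithmetic, so the maximum sample maps to exactly 255, and stores the result in a 256-entry lookup table so that the playback loop only needs an index operation. The name build_scale_table is hypothetical.

```python
def build_scale_table(min_value, max_value):
    """Return a 256-entry table mapping each raw 8-bit sample to 0..255."""
    table = bytearray(256)
    span = max_value - min_value
    if span <= 0:
        return table  # constant signal: every entry stays 0
    for raw in range(256):
        # Clamp values outside the measured range, then scale with integers.
        clamped = min(max(raw, min_value), max_value)
        table[raw] = (clamped - min_value) * 255 // span
    return table

samples = bytes([100, 125, 150, 200])
table = build_scale_table(min(samples), max(samples))
scaled = [table[s] for s in samples]
print(scaled)  # [0, 63, 127, 255]
```

In the playback loop, the value written to the DAC would then be table[sample] instead of a multiplication per sample.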
In addition, the variable tm holds the approximate delay between samples. It is computed by dividing 1,000,000 by the sample rate, which gives the sample period in microseconds, so the samples are written at close to the original playback speed.
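As a worked example of the tm calculation, the sketch below computes the whole-microsecond delay for a few sample rates and the playback rate that the truncated delay alone would imply. The function names are hypothetical. The figures ignore the time taken by the loop body itself, which is added to every period, so the real rate on the board will be lower.

```python
def sample_period_us(sample_rate):
    """Whole-microsecond delay between samples, like tm in the article."""
    return 1000000 // sample_rate

def implied_rate(sample_rate):
    """Rate implied by the truncated delay alone, ignoring loop overhead."""
    return 1000000 / sample_period_us(sample_rate)

for rate in (8000, 44100, 50000):
    tm = sample_period_us(rate)
    print(rate, "Hz ->", tm, "us, implied rate", round(implied_rate(rate)), "Hz")
```

At 44,100 Hz the exact period is about 22.68 microseconds, which truncates to 22, so the delay by itself corresponds to roughly 45,455 Hz. At 8,000 Hz and 50,000 Hz the division is exact (125 and 20 microseconds).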
When the program runs, it reports the header information of the opened file, as shown in Figure 3, and then sends the audio data to the DAC.
Conclusion
After this article, readers should be able to open WAV files and read their data correctly. However, only files that have been converted to mono, uncompressed 8-bit PCM can be sent to the DAC with this program. We hope readers will continue to improve it. Finally, have fun programming.
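For readers who want to prepare such files themselves, the conversion to 8-bit mono can be sketched as follows. This is an illustration under assumptions, not part of the article's program: it assumes the source samples are signed 16-bit little-endian PCM, mixes channels by simple averaging, and uses the hypothetical helper name pcm16_to_u8_mono. It converts raw sample data only and does not write a WAV header.

```python
import struct

def pcm16_to_u8_mono(data, channels=1):
    """Convert signed 16-bit little-endian PCM frames to unsigned 8-bit mono."""
    frame_size = 2 * channels
    fmt = "<%dh" % channels
    out = bytearray()
    for offset in range(0, len(data) - frame_size + 1, frame_size):
        frame = struct.unpack_from(fmt, data, offset)
        mixed = sum(frame) // channels    # average the channels into one
        out.append((mixed + 32768) >> 8)  # shift to 0..65535, keep top 8 bits
    return bytes(out)

mono16 = struct.pack("<3h", -32768, 0, 32767)
print(list(pcm16_to_u8_mono(mono16)))  # [0, 128, 255]

stereo16 = struct.pack("<4h", 0, 0, -32768, 32767)
print(list(pcm16_to_u8_mono(stereo16, channels=2)))  # [128, 127]
```

A conversion like this is usually run on a PC before the file is uploaded to the board.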
(C) 2020-2022, By Jarut Busarathid and Danai Jedsadathitikul
Updated 2022-02-05