Thursday, 8 March 2018

Higher Speed ADC Part 2

An Overview of the Solution

Options

There are a couple of different ways we can run the software on the microprocessor to capture samples and push them across the SPI.

Manually Pushing The Peripherals

The simplest solution would be to grab samples directly from the ADC using ST's HAL_ADC_PollForConversion(), then, once we have "enough" samples, send them with HAL_SPI_TransmitReceive().

This is simple enough to do, and a good initial wiring test, but the ADC samples collected will be too irregular to be useful for anything but very low sample rates.
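As a rough sketch of the polling loop, assuming a block size of 64 samples: adc_read_blocking() and spi_send_blocking() are hypothetical wrappers, stubbed here so the code builds on a host machine. On the target they would wrap HAL_ADC_Start(), HAL_ADC_PollForConversion() and HAL_ADC_GetValue(), and HAL_SPI_TransmitReceive() respectively.

```c
#include <stddef.h>
#include <stdint.h>

#define BLOCK_SAMPLES 64u   /* illustrative block size, not from the project */

/* Host-side stand-ins so the sketch compiles off-target. */
static uint16_t fake_adc_value;
static size_t   bytes_sent;

static uint16_t adc_read_blocking(void)
{
    return fake_adc_value++;   /* pretend each conversion returns a new value */
}

static void spi_send_blocking(const uint8_t *data, size_t len)
{
    (void)data;
    bytes_sent += len;
}

/* Collect one block by polling, then push it out in a single SPI transfer. */
static void capture_and_send_block(void)
{
    static uint16_t block[BLOCK_SAMPLES];

    for (size_t i = 0; i < BLOCK_SAMPLES; i++) {
        block[i] = adc_read_blocking();
    }
    /* No samples are taken while this call blocks, hence the gaps. */
    spi_send_blocking((const uint8_t *)block, sizeof block);
}
```

The structure makes the problem visible: sampling stops for the whole duration of every SPI transfer, so the sample spacing is uneven.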

Interrupt Driven

The next simplest thing to do is use the ADC and SPI in interrupt mode. In this mode the ADC generates an interrupt when it finishes a conversion. Similarly, we can kick off SPI transfers for large blocks of data and get an interrupt when each transfer completes.

The ADC can be run in continuous conversion mode: we start it running once, get an interrupt each time a conversion completes, and implement a simple handler to pull the value out of the register.

The mechanics of this are fairly simple to set up: HAL_SPI_TransmitReceive_IT() and HAL_ADC_Start_IT() provide the front ends to start the whole process.

This approach almost works for this case. If we limited the sample rate to low values (audio rates, below about 50kHz) then it would be good enough. However, as we wind up the sample rate on the ADC and also start handling SPI interrupts, we start dropping samples. The processor simply can't get around to handling all the interrupts in time.
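A quick back-of-envelope calculation shows why. The helper below is only an illustration, and the 216 MHz core clock used in the numbers afterwards is an assumption (it is the top speed of the STM32F7 parts), not a figure measured on this project.

```c
#include <stdint.h>

/* CPU cycles available between consecutive ADC interrupts. */
static uint32_t cycles_per_sample(uint32_t cpu_hz, uint32_t sample_rate_hz)
{
    return cpu_hz / sample_rate_hz;
}
```

At 48 kHz this gives 4500 cycles per sample, which is plenty. At an illustrative 1 MHz sample rate it gives 216 cycles, and that has to cover interrupt entry and exit, the HAL's dispatch code and our handler, with the SPI interrupts competing for the same time.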

Depending on the application, the odd dropped sample might be an acceptable price for the simplicity of the implementation. However, to get reliability at higher rates we have to do something else.

Interrupt Handlers and Weak Bindings

One thing to be aware of is that the ST HAL likes to use weak bindings for the interrupt callback handlers, and expects the application to provide "known" function names for the ISR handlers.

Complicating this is that the sample code uses #define statements to substitute in the "correct" name for a given handler based on the channel definitions, and this can get confusing when trying to build outside of the examples tree.

So, for example, in the STM32F7 reference tree the example Examples/SPI/SPI_FullDuplex_ComDMA has the header ./Inc/main.h, which contains the substitution:

#define SPIx_IRQHandler                  SPI2_IRQHandler

And then both ./Inc/stm32f7xx_it.h and ./Src/stm32f7xx_it.c have references to

void SPIx_IRQHandler(void)

which they expect to be SPI2_IRQHandler(). When porting or re-implementing, it's important to make sure the handler name resolves correctly, otherwise the interrupt handlers won't fire. I tend to remove the define to keep things clearer and prevent surprises when hacking around.
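To make the substitution concrete, here is a minimal host-buildable sketch of what the preprocessor does with those two lines. The counter spi_irq_count is hypothetical and is only there so the effect can be observed. A real handler would typically call HAL_SPI_IRQHandler() instead.

```c
#include <stdint.h>

/* As in the example's main.h: the generic name is replaced by the real one. */
#define SPIx_IRQHandler SPI2_IRQHandler

/* Hypothetical counter, only here so the effect can be observed. */
volatile uint32_t spi_irq_count;

void SPIx_IRQHandler(void);   /* prototype, as in stm32f7xx_it.h */

/* After preprocessing this *defines* SPI2_IRQHandler(); no function called
 * SPIx_IRQHandler ever reaches the linker. */
void SPIx_IRQHandler(void)
{
    spi_irq_count++;
}
```

In ST's startup files the vector table entries are, as far as I know, weak aliases of a default handler. That means a handler with the wrong name still links cleanly, and the default handler runs in its place, which is why a mismatch shows up as interrupts that never seem to fire instead of as a build error.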

Using the DMA Engines

The STM32 chips have DMA engines, which can be set up to transfer blocks of data between peripherals and memory. The processor only has to set up the transfer, and is informed when it completes.

There are DMA engine bindings in the HAL which can be used to transfer ADC data to memory, and to move data between memory and the SPI interface.

DMA Based ADC

Specifically in the case of the ADC we can set up the DMA engine to recover the values from the conversion and transfer them to a block of memory, and then when the memory block is full, return to the start of the memory block and continue converting.

When operating in this mode we can have the DMA engine generate interrupts when it is halfway through the buffer and when it's at the end. This means we can run a simple double buffered conversion approach.

We set the DMA/ADC running, putting data into a large block of memory. When we pick up the "halfway" interrupt we send the first half of the block to the SPI, and when we get the "end" interrupt we send the second half.
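The half/end handling can be sketched on a host machine as below. fake_dma_write() is a stand-in for the ADC plus circular DMA, send_half() is a stand-in for the SPI transfer, and the 8-sample buffer is purely illustrative. On the target the two event functions would be the bodies of HAL_ADC_ConvHalfCpltCallback() and HAL_ADC_ConvCpltCallback(), with the conversion started by HAL_ADC_Start_DMA().

```c
#include <stddef.h>
#include <stdint.h>

#define ADC_BUF_LEN 8u                 /* total samples; illustrative size */
#define HALF_LEN    (ADC_BUF_LEN / 2u)

static uint16_t adc_buf[ADC_BUF_LEN];  /* the DMA target, written circularly */

/* Record of what was handed to the SPI side (stand-in for a real transfer). */
static const uint16_t *last_sent;
static size_t          halves_sent;

static void send_half(const uint16_t *half)
{
    last_sent = half;
    halves_sent++;
}

/* "Halfway" event: the first half is complete, the DMA is filling the second. */
static void on_half_complete(void) { send_half(&adc_buf[0]); }

/* "End" event: the second half is complete, the DMA has wrapped to the first. */
static void on_complete(void)      { send_half(&adc_buf[HALF_LEN]); }

/* Host-side stand-in for the ADC plus circular DMA: stores one sample and
 * raises the same two events the hardware would. */
static void fake_dma_write(uint16_t sample)
{
    static size_t pos;

    adc_buf[pos++] = sample;
    if (pos == HALF_LEN) {
        on_half_complete();
    } else if (pos == ADC_BUF_LEN) {
        on_complete();
        pos = 0;                       /* wrap, as circular mode does */
    }
}
```

The key property is that each handler only ever touches the half the DMA is not currently writing.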

Provided we can send the "half block" of memory across the SPI faster than the ADC fills the other "half block" we can leave the ADC to run continuously.
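That condition is easy to check up front. A half block of N/2 samples takes (N/2)/fs seconds to fill and (N/2) x bits / f_sck seconds to send, so the buffer size cancels out and the SPI clock simply has to exceed fs x bits, plus some margin. The sketch below assumes each sample is stored and sent as a fixed number of bits. The headroom_pct parameter is a hypothetical allowance for gaps between transfers and delays on the receiving side.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if the SPI clock can carry the raw sample stream with some headroom. */
static bool spi_keeps_up(uint32_t sample_rate_hz, uint32_t bits_per_sample,
                         uint32_t spi_clock_hz, uint32_t headroom_pct)
{
    uint64_t needed = (uint64_t)sample_rate_hz * bits_per_sample;

    needed += needed * headroom_pct / 100u;
    return needed <= spi_clock_hz;
}
```

For example, 1 MHz sampling with 16-bit samples needs 16 Mbit/s of raw throughput, or 20 Mbit/s with 25% headroom, so a 16 MHz SPI clock would not be enough but a 32 MHz one would.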

Representing this graphically we allocate a large block of memory and pass it to the ADC:

And then when we receive the "Half Complete" interrupt we know the first half of this buffer is ready to send:

And when we receive the Complete interrupt then we know the second half of the buffer is ready to send, and the DMA engine has looped around and is back to writing the first half:

If we want to get better performance out of the system this is the way to go. Note that the reference tree uses the same weak bindings approach for the handlers here too.

DMA Based SPI

There's a similar setup on the SPI DMA side. This allows us to substitute in HAL_SPI_Transmit_DMA() to push the data buffers across the SPI link.

This is a slightly more complex setup when we configure the SPI, but otherwise the transmit process is largely similar.
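One detail worth sketching is that the DMA transmit call returns straight away, so something has to track whether the previous block is still in flight. The code below is a minimal host-side illustration and not the project's actual code: spi_start_dma() is a stand-in for HAL_SPI_Transmit_DMA(), on_spi_tx_complete() would be the body of HAL_SPI_TxCpltCallback() on the target, and the overrun counter is a hypothetical addition.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

static volatile bool spi_busy;   /* set while a DMA transfer is in flight */
static uint32_t      overruns;   /* blocks skipped because SPI was busy */
static uint32_t      started;    /* transfers actually started */

/* Stand-in for HAL_SPI_Transmit_DMA(): starts the transfer and returns. */
static void spi_start_dma(const uint8_t *data, size_t len)
{
    (void)data;
    (void)len;
    started++;
}

/* Called from the ADC half/end handlers with the half that is ready. */
static void queue_block(const uint8_t *data, size_t len)
{
    if (spi_busy) {
        overruns++;              /* previous block has not finished yet */
        return;
    }
    spi_busy = true;
    spi_start_dma(data, len);
}

/* Transfer-complete event: the SPI is free for the next block. */
static void on_spi_tx_complete(void)
{
    spi_busy = false;
}
```

Counting overruns instead of silently dropping blocks gives an easy way to see whether the SPI link is keeping up with the ADC.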

Another DMA

If this were something I was being paid to do, then I'd likely look at using at least one more DMA. I'd want a DMA to copy the ADC results from memory to another memory location, and then DMA from that memory to the SPI.

Using the extra memory-to-memory copy would allow us to run with more than a single buffer's worth of ADC results queued. That would be a very useful safeguard for the cases where the Pi side of things comes under load (always a problem with Linux) and is late picking up ADC buffers as a result. We could also add headers to the buffers and improve the error checking on the transfer.
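As a sketch of what such a header might look like (entirely hypothetical, none of this exists in the project): a sequence number lets the Pi spot a missing block, and a simple 16-bit sum catches gross corruption. A CRC would be a stronger check than the sum used here.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical header placed in front of each block of samples. */
typedef struct {
    uint16_t sequence;   /* increments per block; a gap means a lost block */
    uint16_t count;      /* number of samples that follow */
    uint16_t checksum;   /* 16-bit sum of the samples, wrapping on overflow */
} block_header_t;

static uint16_t sum16(const uint16_t *samples, size_t count)
{
    uint16_t sum = 0;

    for (size_t i = 0; i < count; i++) {
        sum = (uint16_t)(sum + samples[i]);
    }
    return sum;
}

/* Sender side: build the header for one block. */
static block_header_t make_header(uint16_t sequence,
                                  const uint16_t *samples, uint16_t count)
{
    block_header_t h = { sequence, count, sum16(samples, count) };
    return h;
}

/* Receiver side: returns 1 if the samples match the header's checksum. */
static int block_is_valid(const block_header_t *h, const uint16_t *samples)
{
    return sum16(samples, h->count) == h->checksum;
}
```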

However this is just a weekend thing, so for now I won't bother with that.