STM32 uart interrupt handler missing rx byte

Question

I am writing a small application on a STM F446:

freertos (latest from git)
lwip (pppos) (latest from git)
LTE modem connected to uart2 (both rx and tx on interrupt, prio is 5)
PC connected to uart3 (for logging) (only tx is used, also on interrupt prio 5)

The amount of bytes that are received varies. So every received byte gets stored in the ring buffer on the interrupt. A dedicated lwip rx task is reading data from that task on highest prio and consumes the data from the ring buffer.

Occasionally I ran into the problem that lwip is dropping a packet. When I started to compare received bytes vs logic analyzer I finally noticed the problem. I missed 1 byte in the cases where lwip drops the packet (due to bad fcs, which makes perfect sense then).

I am rather new to this microcontroller world, so I am sure I am doing something wrong. I was hoping that somebody can give me some pointers.

Are my interrupt handlers too bloated?
Must I use different priorities for each peripheral?

The problem doesn't appear when I set uart3 to prio 6 (so one priority level lower than the uart connected to the modem). That's where I started to worry. Is it really a bad idea to have the same priority used for both uarts? Or is this a clear sign that something else is wrong in my code (specifically the interrupt handlers) which I should fix/improve?

The interrupt handlers :

extern "C" void HAL_UART_RxCpltCallback(UART_HandleTypeDef *uartHandle)
{
    if (uartHandle == &uart2Handle)
    {
        uart2.RxHalInterruptCallback();
    }

    if (uartHandle == &uart3Handle)
    {
        uart3.RxHalInterruptCallback();
    }
}

extern "C" void HAL_UART_TxCpltCallback(UART_HandleTypeDef *uartHandle)
{
    if (uartHandle == &uart2Handle)
    {
        uart2.TxHalInterruptCallback();
    }

    if (uartHandle == &uart3Handle)
    {
        uart3.TxHalInterruptCallback();
    }
}

And the implementation in uart class:

void RxHalInterruptCallback()
{
    BaseType_t xHigherPriorityTaskWoken = pdFALSE;

    _rxRingBuffer.Store(_receivedByte);

    // Re-enable interrupt in HAL
    HAL_UART_Receive_IT(_handle, &_receivedByte, 1);

    // Allow blocking read to continue, there is new data available
    xSemaphoreGiveFromISR(_rxSemaphore, &xHigherPriorityTaskWoken);
}

void TxHalInterruptCallback()
{
    uint16_t readBytes = 0;
    _txRingBuffer.ReadAll(256, _txBuffer, &readBytes);

    if (readBytes)
    {
        HAL_UART_Transmit_IT(_handle, (uint8_t*)_txBuffer, readBytes*sizeof(uint8_t));
    }
}

And finally, the ring buffer implementation:

class RingBuffer
{
    public:
    RingBuffer(uint16_t size) : _size(size)
    {
        _head = 0;
        _tail = 0;

        _buffer = new uint8_t[size];        
    }

    virtual ~RingBuffer() 
    {
        delete [] _buffer;
    }

    virtual void Store(uint8_t byte)
    {
        // Store head and tail in atomic action to local variables
        volatile uint16_t head = _head;
        volatile uint16_t tail = _tail;

        _buffer[head++] = byte;
        head %= _size;

        // If head is equal to tail after store, we no longer know where our data is
        if (tail == head)
        {
            __disable_irq();
            while (1) 
            {
                GPIOB->ODR |= LED_RED;
            }
        }

        // Restore head back to member
        _head = head;
    }

    virtual void Store(uint8_t *data, uint16_t length)
    {
        volatile uint16_t head = _head;
        volatile uint16_t tail = _tail;

        for (volatile uint16_t i = 0; i < length; i++)
        {
            _buffer[head++] = data[i];
            head %= _size;

            // If head is equal to tail after store, we no longer know where our data is
            if (tail == head)
            {
                __disable_irq();
                while (1) 
                {
                    GPIOB->ODR |= LED_RED;
                }
            }
        }

        // Restore head back to member
        _head = head;

    }

    virtual void ReadAll(size_t maxLength, uint8_t *data, uint16_t *actualReadBytes)
    {
        // Store head and tail in atomic local variable
        volatile uint16_t tail = _tail;
        volatile uint16_t head = _head;
        
        // Keep grabbing bytes until we have all bytes or until we read the maximum amount of desired bytes
        while (tail != head && (*actualReadBytes) < maxLength)
        {
            data[(*actualReadBytes)++] = _buffer[tail++];
            tail %= _size;
        }

        // Restore tail back to member
        _tail = tail;
    }

    private:

    volatile uint16_t _head;
    volatile uint16_t _tail;
    volatile uint16_t _size;
    uint8_t *_buffer;
};

PS: As experienced programmers will notice, I am still struggling when to use volatile. I don't know if that can rough up performance so hard that it will contribute to this problem. I am reading more on that in parallel. Again, guidance is appreciated.

new and delete should be your banned keywords when programming uCs. — 0___________
– 0___________, Commented Jun 9, 2021 at 22:17
Why do you need dynamic memory for your ring buffer? You already have the size at compile time. Also, to make the ring buffer more efficient, the capacity should be a power of 2. — Thomas Matthews
– Thomas Matthews, Commented Jun 9, 2021 at 22:35
There is no need for the ring buffer members to be volatile. The head and tail indices should only be modified by the ring buffer class. The keyword volatile is usually used to denote variables that the hardware changes or variables that are changed outside of the program's control (maybe threading?). — Thomas Matthews
– Thomas Matthews, Commented Jun 9, 2021 at 22:40
Your Store is wrong. The statement _buffer[head++] doesn't account for the ring or circular nature of the buffer. You may want to do: buffer[head] = value; head = (head + 1) % capacity;. — Thomas Matthews
– Thomas Matthews, Commented Jun 9, 2021 at 22:43

Tagli · Accepted Answer · 2021-06-09 21:33:43Z

2

HAL_UART_Receive_IT(_handle, &_receivedByte, 1); is probably the cause of your problem. It disables the interrupt after it gets 1 byte. While the interrupt is disabled, you may miss some bytes before you call HAL_UART_Receive_IT again. Use DMA in circular mode instead.

answered Jun 9, 2021 at 21:33

Tagli

2,6622 gold badges13 silver badges17 bronze badges

Sign up to request clarification or add additional context in comments.

7 Comments

Thomas Matthews Over a year ago

I've never seen DMA that has a circular mode. I have always read one byte (at a time) and inserted into the ring buffer.

HS2 Over a year ago

And add portYIELD_FROM_ISR( xHigherPriorityTaskWoken); after xSemaphoreGiveFromISR.

Tagli Over a year ago

@ThomasMatthews , I don't know about other uCs, but all STM32 parts I've worked with have circular mode DMA. Some of them also support double buffered mode which allows them to alternate between two buffers on different memory locations. Normally, it's not a problem to receive 1 byte at a time using interrupts. But the way HAL_UART_Receive_IT() does it is wrong and it's a well known limitation (or design error) of the STM32 HAL.

bas Over a year ago

@HS2 ah that is something I haven't heard of. Will look into that too! Thx

Tagli Over a year ago

@bas, if you can somehow bypass the HAL and write your own USART interrupt handler, you can implement it using interrupts, without needing DMA. I've used to do it that way on some old PIC16 & dsPIC30 uCs which lack DMA. The key point is, USART receive interrupt must be always enabled (this is where ST's HAL fails.)

|

bas · Accepted Answer · 2021-06-14 14:18:43Z

Found the problem. I enabled some digital pins for debugging purpose, and toggle them in the interrupt handlers in the hope I could see that something was eating CPU cycles during interrupt.

rx line is the actual incoming message
rx int shows my 'debug markers' how long the interrupt takes for rx
tx int shows my 'debug markers' how long the interrupt takes for tx

In the image above it became quite obvious. The gap of >300us while receiving a message is the root cause for my missing bytes. So I used these "debug markers" to find the part in code which was consuming CPU during the transmit interrupt. The guilty part was (of course... ;-)) my own ring buffer.

virtual void ReadAll(size_t maxLength, uint8_t *data, uint16_t *actualReadBytes)
{
    // Store head and tail in atomic local variable
    uint16_t tail = _tail;
    uint16_t head = _head;
    
    // For debugging only!! Port 3 is logging uart
    if (_portId == 3)
    {
    GPIOF->ODR ^= GPIO_PIN_2;
    }
    // Keep grabbing bytes until we have all bytes or until we read the maximum amount of desired bytes
    while (tail != head && (*actualReadBytes) < maxLength)
    {
        data[(*actualReadBytes)++] = _buffer[tail++];
        tail %= _size;
    }

    // For debugging only!! Port 3 is logging uart
    if (_portId == 3)
    {
    GPIOF->ODR ^= GPIO_PIN_2;
    }

    // Restore tail back to member
    _tail = tail;
}

The while loop which copies the bytes over from ringbuffer to the actual send buffer which takes more than 300us. I didn't expect that this would be so slow. I will move the copying part out of the interrupt handler and prepare the next send buffer in a non-ISR-thread.

Since I gave both uart 2 and uart 3 the same interrupt priority, I am stalling my receive interrupt handler which ultimately results in my missing bytes.

Maybe this will give some useful insights for the next person who is learning micro controllers.

Collectives™ on Stack Overflow

STM32 uart interrupt handler missing rx byte

2 Answers 2

7 Comments

Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

2 Answers 2

7 Comments

Comments

Your Answer

Sign up or log in

Post as a guest

Related