Microcontrollers
pixik, 2017-03-10 13:23:10

How can you catch a random bug in an embedded project?

Good afternoon! The project is built in Keil for an STM32F4 controller, with FreeRTOS as the OS, and is written in C++/C.
For the third week now I've been fighting a bug: something in the business logic is corrupting my memory. It shows up as a single 0xFF byte appearing "at random" in the ring buffer used for UART reception (so far this behavior has been observed only in that area). At first I thought the DMA was somehow corrupting memory, so I tested reception into the ring buffer on its own. The result: the ring buffer works correctly, and the DMA reception mechanism does not corrupt memory in any way. But as soon as I enable the business-logic task, a 0xFF byte appears at random times, at a random index, in the UART receive ring buffer. In other words, it is the business logic that corrupts the buffer memory. It is written in C++ and there is quite a lot of code, and it is not clear what to even suspect.
I have tried many ways of finding the error, but every time a lead appeared it went nowhere and I had to start over.
If you need more information, ask and I will answer. At the moment I have no idea how to catch this bug. I'd welcome any new suggestions :)


2 answer(s)
pixik, 2017-03-20
@pixik

Thanks for the advice, I found the bug!
In the ring buffer, the head index was being computed without volatile. I wrote it like this and everything started working as it should:

volatile uint32_t head = REC_BUFF_SZ - this->huart.hdmarx->Instance->NDTR;
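For context, here is a minimal host-side sketch of how a head index derived from the DMA counter is typically consumed. It is not the original project's code: `FakeDmaStream`, `RxRing`, `fake_dma_receive` and the buffer size of 16 are hypothetical names and values chosen for illustration, and the register is simulated with a plain volatile variable. The sketch snapshots the head once per call so that all later arithmetic works on one value.

```cpp
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical stand-in for the DMA stream registers. On real hardware NDTR
// counts down as bytes arrive; here it is just a volatile variable.
struct FakeDmaStream {
    volatile uint32_t NDTR;
};

const uint32_t REC_BUFF_SZ = 16;  // assumed size, for illustration only

struct RxRing {
    uint8_t buf[REC_BUFF_SZ];
    uint32_t tail;        // next index the consumer will read
    FakeDmaStream* dma;   // source of the write position

    RxRing() : buf(), tail(0), dma(0) {}

    // Take a single snapshot of the write position (head).
    uint32_t head() const {
        volatile uint32_t h = REC_BUFF_SZ - dma->NDTR;
        return h % REC_BUFF_SZ;  // defensive: keeps the index in range
    }

    // Number of unread bytes between tail and the current head.
    uint32_t available() const {
        const uint32_t h = head();
        return (h + REC_BUFF_SZ - tail) % REC_BUFF_SZ;
    }

    // Copy up to max_len unread bytes into out; returns the number copied.
    size_t read(uint8_t* out, size_t max_len) {
        const uint32_t h = head();  // one snapshot for the whole loop
        size_t n = 0;
        while (tail != h && n < max_len) {
            out[n++] = buf[tail];
            tail = (tail + 1) % REC_BUFF_SZ;
        }
        return n;
    }
};

// Host-only helper that imitates the DMA writing one byte in circular mode.
inline void fake_dma_receive(RxRing& ring, uint8_t byte) {
    const uint32_t pos = REC_BUFF_SZ - ring.dma->NDTR;
    assert(pos < REC_BUFF_SZ);
    ring.buf[pos] = byte;
    const uint32_t left = ring.dma->NDTR - 1;
    ring.dma->NDTR = (left == 0) ? REC_BUFF_SZ : left;  // counter reloads
}
```

On the target, the simulated register would be replaced by the real counter read shown above, and `fake_dma_receive` would not exist, since the hardware does the writing.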

Mirn, 2017-03-10
@Mirn

1. Perhaps the memory bus is being blocked.
This happens when something occupies RAM or flash for a long time, or when the UART is very fast (500 kBaud and above).
A classic case is writing parameters to flash or reprogramming the flash itself, or another DMA channel running at maximum speed.
2. There may be UART reception errors. I advise checking with an oscilloscope that the voltage levels are stable and that the timing of the edges is stable.
3. A bug in the code that corrupts RAM. I advise changing the memory layout. If you use LD files this is easier; in more closed environments such as Keil it is not clear how to do it. The method is this: if you move the UART buffer to another place in RAM and the problem goes away, then this is the cause. You can also change the stack size and play with the sizes of other buffers, arrays, and so on.
4. Try disabling the business code piece by piece.
5. Try running without the RTOS. Yes, it sounds far-fetched, but very often the cause is there. It is not perfect, and even if it were, you can still make mistakes in how you use it.
6. Other peripherals may be configured incorrectly; disable everything that is not needed.
7. Think about when the error first appeared and the circumstances around it. It often happens, for example, that the error is in a cell at offset 0x13, and then you remember that a structure was added and that the third byte of an array in that structure, sitting exactly at offset 0x13, changes after the UART call ... oops!
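The offset reasoning in point 7 can be made mechanical. The sketch below is only an illustration under invented assumptions: the `Packet` structure, its members, and the helpers `offset_in` and `member_at` are all hypothetical, with the layout chosen so that the third byte of `payload` lands at offset 0x13. Given the address of a corrupted byte and the start of an object, it reports which member covers that byte.

```cpp
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical structure; only byte-sized members, so there is no padding.
struct Packet {
    uint8_t header[16];  // offsets 0x00..0x0F
    uint8_t flags;       // offset  0x10
    uint8_t payload[8];  // offsets 0x11..0x18
};

// Compile-time check that payload[2] really sits at offset 0x13.
static_assert(offsetof(Packet, payload) + 2 == 0x13, "unexpected layout");

struct MemberInfo {
    const char* name;
    size_t offset;
    size_t size;
};

// Layout table built from the compiler's own view of the structure.
static const MemberInfo kPacketMembers[] = {
    {"header",  offsetof(Packet, header),  sizeof(Packet::header)},
    {"flags",   offsetof(Packet, flags),   sizeof(Packet::flags)},
    {"payload", offsetof(Packet, payload), sizeof(Packet::payload)},
};

// Byte offset of a corrupted address relative to the start of an object.
inline size_t offset_in(const void* base, const void* corrupted) {
    const uint8_t* b = static_cast<const uint8_t*>(base);
    const uint8_t* c = static_cast<const uint8_t*>(corrupted);
    assert(c >= b);  // the corrupted byte must not lie before the object
    return static_cast<size_t>(c - b);
}

// Returns the member covering the offset, or a null pointer if none does.
inline const MemberInfo* member_at(size_t offset) {
    for (const MemberInfo& m : kPacketMembers) {
        if (offset >= m.offset && offset < m.offset + m.size) {
            return &m;
        }
    }
    return nullptr;
}
```

In practice the same lookup is often done by hand from the linker map file and the debugger's view of the structure; the table form only helps when one suspect structure has to be checked repeatedly.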
