-
Notifications
You must be signed in to change notification settings - Fork 7.7k
Description
Describe the bug
Under enough USB traffic load, the implementation of usb_dc_ep_write()
in the stm32 USB driver can cause data corruption. This is because the data pointer it accepts is stored away until a TX FIFO empty interrupt occurs, yet usb_dc_ep_write()
returns immediately. If the pointer passed in to usb_write()
is stack-allocated, it will point to gargage by the time the FIFO empty interrupt occurs.
The exact sequence of events is as follows:
-
usb_dc_ep_write()
is called (e.g. by mass storage class) with a stack data pointer0x200561ac
(gdb) bt #0 usb_dc_ep_write (ep=ep@entry=129 '\201', data=data@entry=0x200561ac <z_interrupt_stacks+1836> "", data_len=data_len@entry=8, ret_bytes=ret_bytes@entry=0x0) at $PRJDIR/zephyr/drivers/usb/device/usb_dc_stm32.c:783 #1 0x0801092c in usb_write (ep=<optimized out>, data=data@entry=0x200561ac <z_interrupt_stacks+1836> "", data_len=data_len@entry=8, bytes_ret=bytes_ret@entry=0x0) at $PRJDIR/zephyr/subsys/usb/device/usb_device.c:1392 #2 0x080122fe in write (buf=buf@entry=0x200561ac <z_interrupt_stacks+1836> "", size=size@entry=8) at $PRJDIR/zephyr/subsys/usb/device/class/msc.c:262 #3 0x08012510 in readCapacity () at $PRJDIR/zephyr/subsys/usb/device/class/msc.c:391 #4 0x08013270 in CBWDecode (buf=buf@entry=0x20056234 <z_interrupt_stacks+1972> "", size=<optimized out>) at $PRJDIR/zephyr/subsys/usb/device/class/msc.c:555 #5 0x08013e76 in mass_storage_bulk_out (ep=<optimized out>, ep_status=<optimized out>) at $PRJDIR/zephyr/subsys/usb/device/class/msc.c:719 #6 0x0806549a in HAL_PCD_DataOutStageCallback (hpcd=<optimized out>, epnum=<optimized out>) at $PRJDIR/zephyr/drivers/usb/device/usb_dc_stm32.c:1096 #7 0x080967f6 in PCD_EP_OutXfrComplete_int (hpcd=hpcd@entry=0x20049160 <usb_dc_stm32_state>, epnum=epnum@entry=1) at $PRJDIR/modules/hal/stm32/stm32cube/stm32f7xx/drivers/src/stm32f7xx_hal_pcd.c:2217 #8 0x08097036 in HAL_PCD_IRQHandler (hpcd=hpcd@entry=0x20049160 <usb_dc_stm32_state>) at $PRJDIR/modules/hal/stm32/stm32cube/stm32f7xx/drivers/src/stm32f7xx_hal_pcd.c:1039 #9 0x08064028 in usb_dc_stm32_isr (arg=<optimized out>) at $PRJDIR/zephyr/drivers/usb/device/usb_dc_stm32.c:197 #10 0x08020cfe in _isr_wrapper () at $PRJDIR/zephyr/arch/arm/core/aarch32/isr_wrapper.S:259 #11 <signal handler called> #12 arch_cpu_idle () at $PRJDIR/zephyr/arch/arm/core/aarch32/cpu_idle.S:127 #13 0x080a3d22 in k_cpu_idle () at $PRJDIR/zephyr/include/zephyr/kernel.h:5749 #14 idle (unused1=<optimized out>, unused2=<optimized out>, unused3=<optimized out>) at $PRJDIR/zephyr/kernel/idle.c:88 #15 0x08009b02 in z_thread_entry (entry=0x80a3cf5 <idle>, p1=0x2002d1e0 <_kernel>, p2=0x0, p3=0x0) at $PRJDIR/zephyr/lib/os/thread_entry.c:36 #16 0xaaaaaaaa in ?? () (gdb) x/8xb 0x200561ac 0x200561ac <z_interrupt_stacks+1836>: 0x00 0x00 0x00 0xbf 0x00 0x00 0x02 0x00
-
usb_dc_ep_write()
returns successfully without actually having written anything, the data pointer is merely stored in the HAL struct:(gdb) fin Run till exit from #0 usb_dc_ep_write (ep=ep@entry=129 '\201', data=data@entry=0x200561ac <z_interrupt_stacks+1836> "", data_len=data_len@entry=8, ret_bytes=ret_bytes@entry=0x0) at $PRJDIR/zephyr/drivers/usb/device/usb_dc_stm32.c:783 0x0801092c in usb_write (ep=<optimized out>, data=data@entry=0x200561ac <z_interrupt_stacks+1836> "", data_len=data_len@entry=8, bytes_ret=bytes_ret@entry=0x0) at $PRJDIR/zephyr/subsys/usb/device/usb_device.c:1392 1392 ret = usb_dc_ep_write(ep, data, data_len, bytes_ret); Value returned is $12 = 0 (gdb) p usb_dc_stm32_state.pcd.IN_ep[1].xfer_buff $13 = (uint8_t *) 0x200561ac <z_interrupt_stacks+1836> ""
-
Data on stack is overwritten (by some arbitrary thread, this is just for illustration)
(gdb) watch *0x200561ac Hardware watchpoint 10: *0x200561ac (gdb) c Continuing. Thread 29 hit Hardware watchpoint 10: *0x200561ac Old value = -1090519040 New value = 134627885 0x08096b6c in HAL_PCD_EP_Receive (hpcd=hpcd@entry=0x20049160 <usb_dc_stm32_state>, ep_addr=ep_addr@entry=1 '\001', pBuf=pBuf@entry=0x200498bc <usb_dc_stm32_state+1884> "USBC\003", len=64) at $PRJDIR/modules/hal/stm32/stm32cube/stm32f7xx/drivers/src/stm32f7xx_hal_pcd.c:1759 1759 {
-
TX FIFO Empty interrupt occurs, reads corrupted data and sends it to host
(gdb) b USB_WritePacket Breakpoint 11 at 0x8099be8: file $PRJDIR/modules/hal/stm32/stm32cube/stm32f7xx/drivers/src/stm32f7xx_ll_usb.c, line 1100. (gdb) c Continuing. Thread 29 hit Breakpoint 11, USB_WritePacket (USBx=USBx@entry=0x50000000, src=0x200561ac <z_interrupt_stacks+1836> "-B\006\b", ch_ep_num=ch_ep_num@entry=1 '\001', len=len@entry=8, dma=0 '\000') at $PRJDIR/modules/hal/stm32/stm32cube/stm32f7xx/drivers/src/stm32f7xx_ll_usb.c:1100 1100 { (gdb) bt #0 USB_WritePacket (USBx=USBx@entry=0x50000000, src=0x200561ac <z_interrupt_stacks+1836> "-B\006\b", ch_ep_num=ch_ep_num@entry=1 '\001', len=len@entry=8, dma=0 '\000') at $PRJDIR/modules/hal/stm32/stm32cube/stm32f7xx/drivers/src/stm32f7xx_ll_usb.c:1100 #1 0x08096474 in PCD_WriteEmptyTxFifo (hpcd=hpcd@entry=0x20049160 <usb_dc_stm32_state>, epnum=epnum@entry=1) at $PRJDIR/modules/hal/stm32/stm32cube/stm32f7xx/drivers/src/stm32f7xx_hal_pcd.c:2097 #2 0x0809720c in HAL_PCD_IRQHandler (hpcd=hpcd@entry=0x20049160 <usb_dc_stm32_state>) at $PRJDIR/modules/hal/stm32/stm32cube/stm32f7xx/drivers/src/stm32f7xx_hal_pcd.c:1166 #3 0x08064028 in usb_dc_stm32_isr (arg=<optimized out>) at $PRJDIR/zephyr/drivers/usb/device/usb_dc_stm32.c:197 #4 0x08020cfe in _isr_wrapper () at $PRJDIR/zephyr/arch/arm/core/aarch32/isr_wrapper.S:259 #5 <signal handler called> #6 arch_cpu_idle () at $PRJDIR/zephyr/arch/arm/core/aarch32/cpu_idle.S:127 #7 0x080a3d22 in k_cpu_idle () at $PRJDIR/zephyr/include/zephyr/kernel.h:5749 #8 idle (unused1=<optimized out>, unused2=<optimized out>, unused3=<optimized out>) at $PRJDIR/zephyr/kernel/idle.c:88 #9 0x08009b02 in z_thread_entry (entry=0x80a3cf5 <idle>, p1=0x2002d1e0 <_kernel>, p2=0x0, p3=0x0) at $PRJDIR/zephyr/lib/os/thread_entry.c:36 #10 0xaaaaaaaa in ?? () (gdb) x/8xb src 0x200561ac <z_interrupt_stacks+1836>: 0x2d 0x42 0x06 0x08 0x00 0x00 0x02 0x00
This makes it impossible for me to use mass storage on STM32 USB_OTG_FS, since the SCSI packets are consistently corrupted.
To Reproduce
The subsys/usb/mass
sample might work, but since this is related to USB host timing, I wouldn't bet on reproduction being easily possible. I think the cause of the issue should be clear enough, I can of course test proposed fixes on my setup.
Expected behavior
USB data is not corrupted.
Impact
USB mass storage does not work at all, other USB protocols might have undetected integrity problems.
Logs and console output
This Wireshark capture shows the corrupted data in packet 83 (should be 000000bf 00000200
, is 2d420608 00000200
, mirroring debugger output above): zephyr_usb_corruption.pcapng.gz
Environment (please complete the following information):
- OS: Ubuntu 22.04 LTS
- Toolchain: Zephyr SDK 0.16.1
- Commit SHA or Version used: 3.4.0