Hi,

I am implementing the USB video class for host on an RZ/A2M MCU, starting from the MSC host sample code. I have finished the device configuration part, and I must now implement the isochronous requests to retrieve video data.

I would like to use endpoints with multiple transactions per microframe, but I am not sure that the r_usbh0 basic driver handles them. I noticed that the driver function R_USBH0_HstdSetPipe sets the maximum packet size without fully parsing the wMaxPacketSize field of the endpoint descriptor. Indeed, the 5 highest bits do not describe the packet size: bits 12..11 give the number of additional transactions per microframe, and bits 15..13 are reserved. If I set the right MPS in the st_usbh0_pipe_t structure, which can reach up to 3072 bytes, will the USB driver handle it?

I am also wondering how asynchronous requests work with this driver. They are not described in the driver documentation and the sample codes do not use them. The official USB 2.0 and UVC documentation did not help me either.

When I use a buffer length of 3060 bytes in the request structure st_usbh0_utr_t, I get two different results from the EHCI: sometimes USBH0_DATA_ERR, with nothing written to the buffer, and sometimes USBH0_DATA_READING, which I believe indicates a successful request, but only the 12 header bytes are written to zeroes. I do not know if the problem comes from the camera not accepting my request, or from a wrong use of the USB driver.

I noticed that asking for more than 3072 bytes makes the response callback never trigger. I suppose that I cannot just queue a big empty buffer, large enough to hold an entire video frame? I presume the buffer is limited to the maximum theoretical packet size, 3072 bytes?

Do I need to fill the buffer? I noticed my PC does it. Here is a truncated isochronous request my PC sent:
Frame 537: 1575 bytes on wire (12600 bits), 1575 bytes captured (12600 bits) on interface \\.\USBPcap2, id 0
USB URB
    [Source: host]
    [Destination: 2.7.1]
    USBPcap pseudoheader length: 1575
    IRP ID: 0xffffd4825b3c78a0
    IRP USBD_STATUS: USBD_STATUS_SUCCESS (0x00000000)
    URB Function: URB_FUNCTION_ISOCH_TRANSFER (0x000a)
    IRP information: 0x00, Direction: FDO -> PDO
    URB bus id: 2
    Device address: 7
    Endpoint: 0x81, Direction: IN
    URB transfer type: URB_ISOCHRONOUS (0x00)
    Packet Data Length: 0
    [Response in: 553]
    Isochronous transfer start frame: 0
    Isochronous transfer number of packets: 128
    Isochronous transfer error count: 0
    USB isochronous packet
        ISO Data offset: 0x00000000
        ISO Data length: 0x00000000 (irrelevant)
        ISO USBD status: USBD_STATUS_SUCCESS (0x00000000) (irrelevant)
    USB isochronous packet
        ISO Data offset: 0x00000bf4
        ISO Data length: 0x00000000 (irrelevant)
        ISO USBD status: USBD_STATUS_SUCCESS (0x00000000) (irrelevant)
    ...
    USB isochronous packet
        ISO Data offset: 0x0005ee0c
        ISO Data length: 0x00000000 (irrelevant)
        ISO USBD status: USBD_STATUS_SUCCESS (0x00000000) (irrelevant)
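As an illustration of the wMaxPacketSize layout discussed in this post, here is a minimal sketch of how the field could be split for a high-speed isochronous endpoint. The type uvc_mps_info_t and the function uvc_decode_mps() are hypothetical names, not part of the r_usbh0 driver; the bit positions follow the USB 2.0 endpoint descriptor definition.

```c
#include <stdint.h>

/* Decoded view of wMaxPacketSize for a high-speed endpoint.
 * Hypothetical type, not part of the r_usbh0 driver. */
typedef struct
{
    uint16_t packet_size;      /* bits 10..0: bytes per transaction (up to 1024) */
    uint8_t  transactions;     /* 1 + bits 12..11: transactions per microframe   */
    uint32_t bytes_per_uframe; /* packet_size * transactions (up to 3072)        */
} uvc_mps_info_t;

/* Split the raw little-endian descriptor bytes into their parts.
 * Bits 15..13 are reserved and ignored here. */
static uvc_mps_info_t uvc_decode_mps(uint8_t mps_l, uint8_t mps_h)
{
    uint16_t       raw = (uint16_t)(((uint16_t)mps_h << 8) | mps_l);
    uvc_mps_info_t info;

    info.packet_size      = (uint16_t)(raw & 0x07FFu);
    info.transactions     = (uint8_t)(1u + ((raw >> 11) & 0x3u));
    info.bytes_per_uframe = (uint32_t)info.packet_size * info.transactions;
    return info;
}
```

With this split, a raw value of 0x1400 decodes to 1024 bytes per transaction and 3 transactions per microframe, which is where the 3072-byte figure comes from. Only packet_size would go into the mps field if the host controller driver handles a single transaction per microframe.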
Hi AlexR,

I think it's better to start with renesas-rz/rza2_gcc_azure_rtos_bsp: RZA2 BSP and Driver support and the demo_usbx_host_uvc.

Kind Regards.
Hi,

It seems my first response did not get posted.

I forgot to mention that I am using FreeRTOS, which is why I started from the MSC sample code. I could not find in the Azure example the use of the r_usbh0 functions I am interested in.

I managed to make successful isochronous requests, but only by waiting for one to finish before sending the next. I am using usbh0_hstd_transfer_start() to make those requests. I looked at the underlying HCD and EHCI code and it seems that it should support queueing transactions. However, when I try to queue multiple requests, i.e. calling usbh0_hstd_transfer_start() multiple times in succession using different st_usbh0_utr_t message structures, only one request gets a callback answer. I noticed the same iTD gets reused for every request. What is the proper way to queue requests?

I modified these user-defined constants. Are there any others that I need to change?

#define USBH0_EHCI_PFL_SIZE (1024u)
#define USBH0_EHCI_NUM_QTD (9180u)
#define USBH0_EHCI_NUM_ITD (32u)
#define USBH0_EHCI_NUM_SITD (2u)
#define USBH0_EHCI_ITD_DATA_SIZE (1024u)

Is the EHCI driver capable of handling multiple transactions per microframe? As described in the first post, how must I set the mps field of the st_usbh0_pipe_t structure?

Regards,
AlexR
Hi AlexR,
For isochronous transfers, it’s important to allocate a new iTD (Isochronous Transfer Descriptor) for each transfer. Reusing the same iTD across transfers can lead to data loss or unexpected errors, since EHCI expects fresh descriptors for every microframe transaction.
Also, reusing iTDs without ensuring the buffer is flushed and ready can lead to data integrity issues. The USB hardware manual actually covers this—please refer to section 33.9.9.5 (Transmit buffer flush for isochronous transfer). That might give you a better understanding of what’s expected before submitting new transfer descriptors.
To help further, could you please share the code where you're configuring the endpoint and setting up the transfer? I'd like to review how the buffer and iTDs are being managed, and see whether any driver-level tweaks might be needed.
Looking forward to your response.
GN_Renesas
Hi,

Thanks for the response.

Apparently my answer did not go through again, maybe because of the code insert I used. Here is my answer with the code pasted as text then.

Here is the function I call to set the pipe. It is very similar to R_USBH0_HstdSetPipe. I found in the RZ/A2M group documentation that high-bandwidth isochronous requests are not supported, so the maximum packet size is up to 1024.
/***********************************************************************************************************************
Function Name   : uvc_set_pipe
Description     : Set up the pipe
Argument        : uint8_t *p_descriptor : Endpoint descriptor address
                : uint8_t ifnum         : Interface number
                : uint16_t pipe_id      : Pipe ID global variable, to fill
Return value    : usbh0_er_t            : Error value
***********************************************************************************************************************/
usbh0_er_t uvc_set_pipe(uint8_t *p_descriptor, uint8_t ifnum, uint16_t *pipe_id) // Based off R_USBH0_HstdSetPipe.
{
    st_usbh0_pipe_t *p_pipe = &g_usbh0_hstd_pipe[1];
    uint16_t temp_pipe_id = 1;

    /* Search Empty Pipe */
    while (0 != p_pipe->epnum)
    {
        p_pipe++;
        temp_pipe_id++;
        if (temp_pipe_id >= USBH0_MAXPIPE)
        {
            USBH0_PRINTF0("PIPE_FULL ERROR\n");
            return USBH0_ERROR;
        }
    }

    /* Get value from Endpoint Descriptor */
    p_pipe->epnum = (p_descriptor[USBH0_EP_B_EPADDRESS] & USBH0_EP_NUMMASK);
    p_pipe->direction = (p_descriptor[USBH0_EP_B_EPADDRESS] & USBH0_EP_DIRMASK);
    p_pipe->type = (p_descriptor[USBH0_EP_B_ATTRIBUTES] & USBH0_EP_TRNSMASK);
    p_pipe->mps = ((uint16_t)(p_descriptor[USBH0_EP_W_MPS_H] << 8) | (uint16_t)(p_descriptor[USBH0_EP_W_MPS_L])); // max 1024 supported

    /* Get value from Attach Info */
    p_pipe->devaddr = g_uvc_dev_info.addr;
    p_pipe->ifnum = ifnum;
    *pipe_id = temp_pipe_id;
    return USBH0_OK;
} /* End of function uvc_set_pipe() */
/***********************************************************************************************************************/

Here is the function where I start the transfer. I call it several times to queue multiple transfers with different buffers. The buffers are circular to allow storing requests which await parsing while sending the next.
/******************************************************************************
Function Name   : usbh0_huvc_get_data
Description     : Get video data
Argument        : uint16_t buff_id : Output buffer ID
Return value    : usbh0_er_t       : Error Code
******************************************************************************/
usbh0_er_t usbh0_huvc_get_data(uint16_t buff_id)
{
    uint16_t circ_buff_id = 0xFFFF;

    /* Look for a free slot in the circular buffer */
    for (int i = 0; i < UVC_CIRC_BUFF_SIZE; i++)
    {
        if (UVC_WRITE == g_uvc_data_buff_state[buff_id][i])
        {
            circ_buff_id = i;
            break;
        }
    }

    if (0xFFFF == circ_buff_id) //circular buffer full
    {
        printf("Circular buffer of buff %d full\n", buff_id);
        return USBH0_ERROR; /* the function returns usbh0_er_t, so a bare return is not valid here */
    }

    uint8_t *buff = g_uvc_data_buffer[buff_id][circ_buff_id];
    uint16_t msg_info = USBH0_MSG_HUVC_DATA_RQST + buff_id*100 + circ_buff_id;
    g_uvc_data_buff_state[buff_id][circ_buff_id] = UVC_WAIT;

    memset(buff, 0, UVC_DATA_BUFFER_SIZE*sizeof(uint8_t));

    /* transfer request packet */
    g_usbh0_uvc_transfer[buff_id][circ_buff_id].keyword = g_uvc_data_pipe;
    g_usbh0_uvc_transfer[buff_id][circ_buff_id].p_tranadr = buff;
    g_usbh0_uvc_transfer[buff_id][circ_buff_id].tranlen = UVC_DATA_BUFFER_SIZE-1;
    g_usbh0_uvc_transfer[buff_id][circ_buff_id].complete = &usbh0_uvc_rqst_callback;
    g_usbh0_uvc_transfer[buff_id][circ_buff_id].msginfo = msg_info;

    /* err = R_USBH0_HstdTransferStart(&g_usbh0_uvc_transfer); */
    usbh0_er_t err = usbh0_hstd_transfer_start(&g_usbh0_uvc_transfer[buff_id][circ_buff_id]);
    return err;
} /* eof usbh0_huvc_get_data() */
/******************************************************************************/
Only one transfer gets a callback answer. I think the problem comes from the driver function usbh0_hstd_ehci_search_itd(), called inside usbh0_hstd_ehci_make_isochronous_request(). I noticed this search function returns the same iTD when I make several simultaneous requests, even when I ask for 20 requests, although the iTD obviously cannot store that many requests.

I crudely fixed it by skipping usbh0_hstd_ehci_search_itd() and always initializing a new iTD in usbh0_hstd_ehci_make_isochronous_request(). I had to clear these iTDs in usbh0_hstd_ehci_transfer_end_itd(), which I implemented by looking at how usbh0_hstd_ehci_clear_device_itd did it. These changes allow me to make multiple requests which do get callback answers.

I ask for 8 times the maximum packet size of the endpoint at each request, so that each iTD will consist of 8 transactions. Is it the right size to ask for?

The output buffers I receive look strange: I get zeroes, then a normal UVC packet (header + payload, so one USB transaction), then zeroes again. The offsets at which the packets start always change. I sometimes get another UVC packet in the buffer as well, but the non-zero data does not exceed 2000 bytes. Sometimes I get payload data without any header, as if I had received half a packet. I do not get more than 12 packets with the same UVC Frame ID before it changes, which is very weird.

All this makes me think I am missing some packets by not sending requests at a high enough frequency. I tried to limit the latency between a request finishing and the new one starting, but I faced another problem: I am using the same code architecture as the MSC sample, with a main task and a driver task that just handles request responses, woken using the mailbox. However, the mailbox seems to take a long time to wake the driver task up after a mail is sent. After a mail is sent, the main loop can run several dozen times before the driver task wakes up. I tried setting the task priority higher, to no avail.

Is there a FreeRTOS parameter I can change to speed this up?

Regards,
AlexR
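To check observations like the ones above (zero-filled regions, FID changes, missing end-of-frame flags), it can help to validate each candidate packet against the UVC payload header layout before trusting it. The sketch below is only an illustration: uvc_payload_info_t and uvc_parse_payload() are hypothetical names, and only the first two header bytes (bHeaderLength and bmHeaderInfo) are interpreted.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* bmHeaderInfo bits (second byte of the UVC payload header). */
#define UVC_HDR_FID (0x01u) /* frame ID, toggles at each new video frame */
#define UVC_HDR_EOF (0x02u) /* last payload of the current video frame   */
#define UVC_HDR_ERR (0x40u) /* device reported an error for this payload */

/* Hypothetical result type, not part of the r_usbh0 driver. */
typedef struct
{
    uint8_t header_len;  /* bHeaderLength, header bytes including itself */
    bool    fid;
    bool    eof;
    bool    err;
    size_t  payload_len; /* bytes of video data following the header     */
} uvc_payload_info_t;

/* Parse one received transaction. Returns false when the bytes cannot be
 * a UVC payload, e.g. a zero-filled region (bHeaderLength == 0). */
static bool uvc_parse_payload(const uint8_t *p_pkt, size_t pkt_len,
                              uvc_payload_info_t *p_out)
{
    if ((NULL == p_pkt) || (NULL == p_out) || (pkt_len < 2u))
    {
        return false;
    }

    uint8_t hlen = p_pkt[0];
    if ((hlen < 2u) || ((size_t)hlen > pkt_len))
    {
        return false;
    }

    p_out->header_len  = hlen;
    p_out->fid         = (0u != (p_pkt[1] & UVC_HDR_FID));
    p_out->eof         = (0u != (p_pkt[1] & UVC_HDR_EOF));
    p_out->err         = (0u != (p_pkt[1] & UVC_HDR_ERR));
    p_out->payload_len = pkt_len - hlen;
    return true;
}
```

A frame assembler would then start a new frame whenever fid differs from the previous packet's value, and treat a frame as complete only when a packet with eof set has been seen.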
Hi AlexR,
Can you try the following:
Reduce each tranlen to 1024 bytes, queue many single‑transaction iTDs.
Insert the buffer‑flush sequence per section 33.9.9.5.
Switch your mailbox wakeup to direct‑to‑task notifications.
Please try these adjustments and share any updated code snippets or logs. I’m confident this will stabilize your stream and eliminate the zero‑padding artifacts.
Best Regards,
Hi,

Thanks for the response.

If I use tranlen 1024, most packets are 12 or 944 bytes long, which is the expected behaviour. However, some packets are bigger, around 2500 bytes. I even received packets with lengths up to 25000 bytes, and by looking at the iTDs, I noticed that the transactions were indeed reaching lengths above 4000 bytes, which should be impossible even with 3 transactions per microframe. It looks like different packets get aggregated together upon arrival.

I am not sure I understand how to implement the buffer flush. Do I just need to set IFIS to 1 and IITV to 0 in the PIPEPERI register, once? And in order to do this, would I have to follow the sequence described in 33.9.3.3 (Pipe control register switching procedure), then just set PID to BUF again?

Regards,
AlexR
Hi,

I managed to get a promising packet stream but there are still some problems. The frames I ask for should be 345k bytes long, but I only receive between 30k and 40k bytes before the FID changes and another frame gets sent. Sometimes there is not even an end of frame flag before the FID changes. This makes me think I am lacking bandwidth by not performing a transaction in each microframe.

To avoid the heavy transfer handling, I am thinking about modifying the EHCI driver to queue multiple iTDs using a single request, instead of just queueing requests. Is it a good idea? Currently I relaunch the transfer request directly in the complete callback, but parse the buffers in the main task.

Using a request tranlen of 1 Maximum Packet Size yields empty packets. With 4 MPS I receive one good packet per transfer, located at a varying offset. With 8 MPS, I receive two good packets per transfer.

I looked into the buffer-flush sequence in section 33.9.9.5, and did succeed in setting IFIS and IITV, by configuring them before initializing the host module. However, when I read the registers after the HCD pipe initialization, only IFIS and IITV have changed. Even if I read the pipe registers after a successful isochronous transfer, the registers are not initialized. PIPEMAXP->MXPS is set to *** and the following fields are set to 0: PIPEBUF->BUFSIZE, PIPEBUF->BUFNMB, PIPECFG->TYPE, PIPECFG->DBLB, PIPECFG->CNTMD, PIPECFG->DIR, PIPECFG->EPNUM.

Here is the sequence I follow to be able to access the pipe registers. I wait 10 ms after each write.
USB01.LPSTS.BIT.SUSPM = 1;
USB00.COMMCTRL.BIT.OTG_PERI = 1;
USB01.SYSCFG0.BIT.DRPD = 0;
USB01.SYSCFG0.BIT.USBE = 1;

// USBCTR->DIRPD, DVSTCTR0->RHST, PIPE1CTR->PID and PIPE2CTR->PID are all 0, no need to set them then
USB01.PIPESEL.BIT.PIPESEL = 1; //or 2

I suppose that the EHCI does not use those registers to make requests at all? How do I implement this buffer-flushing then? I do clear my buffers after each HCD transfer.

EDIT: I am clearing the buffer I set as p_tranadr in the st_usbh0_utr_t structure. Is it the tmp_buffer of the iTD that I need to clear as well?

Regards,
AlexR
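The bandwidth concern raised in this post can be put into numbers. The sketch below is a rough estimate under stated assumptions: one isochronous packet per 125 us microframe, a 12-byte UVC header in every packet, and a video frame of 345000 bytes (the post only says "345k"). The function names are hypothetical.

```c
#include <stdint.h>

#define UFRAME_PERIOD_US (125u) /* high-speed microframe period */

/* Number of microframes needed to carry one video frame when every
 * microframe delivers one packet of pkt_len bytes, hdr_len of which are
 * UVC header. Returns 0 if the packet cannot carry any payload. */
static uint32_t uvc_uframes_per_frame(uint32_t frame_bytes, uint32_t pkt_len,
                                      uint32_t hdr_len)
{
    if (pkt_len <= hdr_len)
    {
        return 0u;
    }

    uint32_t payload = pkt_len - hdr_len;
    return (frame_bytes + payload - 1u) / payload; /* round up */
}

/* Minimum time to receive one video frame, in microseconds. */
static uint32_t uvc_frame_time_us(uint32_t uframes)
{
    return uframes * UFRAME_PERIOD_US;
}
```

With 944-byte packets, a 345000-byte frame needs 371 microframes, i.e. at least 46375 us with no microframe missed. Receiving only 30k to 40k bytes per frame would correspond to roughly 32 to 43 of those packets, which is consistent with most microframes not being serviced.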
Thanks for the update and for digging into the flush sequence—your observations make sense. Because the RZ/A2M only supports one 1024-byte isochronous transaction per microframe, asking for 345 kB in one go will inevitably starve the bus and trigger early FID changes. Instead of batching multiple transactions into one URB, I recommend you allocate a fresh iTD for each 1 kB slice (minus header) and submit them back-to-back on successive microframes: this ensures full bandwidth without confusing the EHCI scheduler. Before each iTD, perform the “pipe flush” from section 33.9.9.5 by selecting your UVC pipe in PIPESEL, setting CNTMD in PIPECFxCTR until it clears, and then flushing your CPU D-cache for both the buffer and the iTD structure—this guarantees the hardware sees only clean, new data. If you really need to speed up queuing, patch usbh0_hstd_ehci_make_isochronous_request() so that it appends new iTDs directly into the periodic schedule instead of relying on search_itd(), but keep each descriptor to a single transaction. Finally, since your mailbox latency remains high, consider switching to direct-to-task notifications (xTaskNotifyFromISR) in your EHCI ISR to wake your driver task instantly. Give that a try and let me know whether the frame lengths stabilize or if you still see early FID flips.
Hi,

Thanks for the response.

I do not ask for 345 kB in one go; it is the size of the video frame I am expecting to receive, spread across multiple transfer requests.

I still do not understand the buffer flush. The description of CNTMD in the PIPECFG register states:
"This bit is valid when the selected pipe is PIPE1 to PIPE5 or PIPE9 to PIPE15, and the transfer type of the selected pipe is bulk transfer."
I am using an isochronous pipe (so either PIPE1 or PIPE2). Does the clearing of CNTMD just indicate that the pipe is no longer in use?

Both my transfer buffer and the iTD are located in uncached sections of the RAM. Why would I need to clear the CPU data cache?

Skipping search_itd() is indeed a bit faster.

EDIT: the endpoint I am using, although isochronous, uses Asynchronous synchronization (5.12.4.1.1 in the USB 2.0 doc). This seems a likely cause for my varying offsets.

I noticed that data is written to the wrong p_itd->tmp_buffer, i.e. for an iTD with one transaction, it will not be p_itd->tmp_buffer[0] which holds the data but another one. This breaks usbh0_hstd_ehci_transfer_end_itd(). I fixed that by skipping the p_itd->transaction[i].bit.length check and copying 1024 bytes every time instead of p_itd->transaction[i].bit.length bytes. I now do receive UVC packets when using tranlen 1024, but the FID still changes too early.

Regards,
AlexR
Hi,

I identified the issue: the buffer offsets of the 8 iTD transactions were badly initialized. The problem is located in the function usbh0_hstd_ehci_init_itd(), inside r_usbh0_hehci_transfer.c. The offsets were initialized with this line:

    R_MMU_VAtoPA((uint32_t)(tmp_bufferadrs & 0x00000FFF),p_itd->transaction[n].bit.offset); /* Offset */

However, R_MMU_VAtoPA() takes a uint32_t* as its second argument, while p_itd->transaction[n].bit.offset is a 12-bit field of a uint32_t union structure. Data is thus written to the wrong address and the offset value stays at zero. The pages were set properly though, which is why I could still find some packets in the transfer buffer, but at strange offsets of course.

Furthermore, I use 8 transactions per iTD to maximise the bandwidth, but the driver only outputs the total length of received data and the output buffer is contiguous. This makes separating the different transactions quite hard. I thus modified the EHCI driver to add the length of the transaction directly in the buffer, just before each transaction, as a kind of transaction header. Obviously I still use tranlen = 8*MPS, but I have to allocate a buffer of 8*(MPS+2) bytes, as I encode the length with two bytes. I think that adding a way to properly access the transaction length would be a great addition to this EHCI driver.

Re-using the same iTD seems to sometimes corrupt the data, even though I wait for the transfer to finish before starting the next. I thus use a new iTD each time.

The video stream I receive looks good, so it seems I'm done with the UVC implementation.

Thanks for all the help attempts.

Regards,
AlexR
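The offset bug described above can be illustrated with a small stand-alone sketch. Everything here is hypothetical: itd_transaction_t only mirrors the EHCI transaction word layout (offset 11..0, page 14..12, IOC 15, length 27..16, status 31..28), demo_va_to_pa() is an identity stand-in for R_MMU_VAtoPA(), and the bit-field order assumes a compiler that allocates from the least significant bit, as GCC does on little-endian targets. The point is only that the translated address goes through a full uint32_t temporary before being stored in the 12-bit field.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of one iTD transaction status/control word. */
typedef union
{
    uint32_t dword;
    struct
    {
        uint32_t offset : 12; /* byte offset inside the selected 4 KB page */
        uint32_t pg     : 3;  /* buffer page selector                      */
        uint32_t ioc    : 1;  /* interrupt on complete                     */
        uint32_t length : 12; /* transaction length                        */
        uint32_t status : 4;  /* active / error bits                       */
    } bit;
} itd_transaction_t;

/* Stand-in for R_MMU_VAtoPA(): writes the physical address through a
 * uint32_t pointer. Identity mapping here, for demonstration only. */
static int32_t demo_va_to_pa(uint32_t vaddr, uint32_t *p_paddr)
{
    if (NULL == p_paddr)
    {
        return -1;
    }
    *p_paddr = vaddr;
    return 0;
}

/* Translate into a temporary first, then assign the low 12 bits to the
 * bit-field. A bit-field has no address, so it cannot be the output
 * argument of the translation function. */
static int32_t itd_set_offset(itd_transaction_t *p_trans, uint32_t buf_vaddr)
{
    uint32_t paddr = 0u;

    if ((NULL == p_trans) || (0 != demo_va_to_pa(buf_vaddr, &paddr)))
    {
        return -1;
    }
    p_trans->bit.offset = paddr & 0x00000FFFu;
    return 0;
}
```

This sketch translates the full address and masks afterwards, which assumes the MMU maps memory in pages of at least 4 KB; under that assumption the low 12 bits are the same before and after translation.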