There's a bug in the Rusticl implementation of clEnqueueMapBuffer/
clEnqueueUnmapMemObject: it appears to mishandle buffers created
with CL_MEM_USE_HOST_PTR.
When mapping the collationBuff we only need to supply CL_MAP_WRITE
and not CL_MAP_WRITE_INVALIDATE_REGION, since we don't need to
preserve the contents of the collationBuff as input to the
collation kernel.
It seems that whenever you have a CL_MEM_USE_HOST_PTR input buffer
to be "transferred" from the host to the GPU, whose contents must be
preserved, you must map it with CL_MAP_WRITE_INVALIDATE_REGION on
the RPi5.
This makes little sense, since that flag tells the runtime it may
discard the mapped region's contents, but we'll have to let it be
for now.
At least the code works now and we don't have to abandon using
OpenCL.
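
The rule described above can be written down as a small flag-selection
helper. This is only a sketch: HostBufferRole, selectMapFlags and the
needsRusticlWorkaround parameter are hypothetical names, and the two
constants are redeclared with the same values as CL_MAP_WRITE and
CL_MAP_WRITE_INVALIDATE_REGION in <CL/cl.h> so that the snippet
compiles without the OpenCL headers.

```cpp
#include <cstdint>

// cl_map_flags is a 64-bit bitfield; the values below mirror
// CL_MAP_WRITE and CL_MAP_WRITE_INVALIDATE_REGION from <CL/cl.h>.
using MapFlags = std::uint64_t;
constexpr MapFlags kMapWrite = 1u << 1;
constexpr MapFlags kMapWriteInvalidateRegion = 1u << 2;

// Hypothetical classification of a CL_MEM_USE_HOST_PTR buffer.
enum class HostBufferRole {
    Scratch,     // e.g. collationBuff: prior contents are irrelevant
    KernelInput  // host-written data that the kernel must see intact
};

// Hypothetical helper: picks the flags to pass to clEnqueueMapBuffer.
// needsRusticlWorkaround is an assumption standing in for "we are on
// the affected RPi5/Rusticl stack".
constexpr MapFlags selectMapFlags(HostBufferRole role,
                                  bool needsRusticlWorkaround)
{
    if (role == HostBufferRole::KernelInput && needsRusticlWorkaround) {
        // Observed workaround: input data only survives the
        // map/unmap round trip with the invalidate flag.
        return kMapWriteInvalidateRegion;
    }
    // Plain write mapping is enough everywhere else.
    return kMapWrite;
}
```

Keeping the decision in one place should make the workaround easy to
remove once the driver behaviour changes.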
We previously unintentionally allowed multiple production operations
to occur in the same timeslice because we were calling for production
even when deferring timeslices.
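
A minimal sketch of the corrected control flow, assuming a per-tick
handler: TimesliceDriver and its members are hypothetical names, not
the real classes. The point is only that a deferred timeslice returns
before production is requested.

```cpp
// Hypothetical driver that is ticked once per timeslice.
class TimesliceDriver {
public:
    // Returns true if production was requested for this timeslice.
    bool onTimeslice(bool deferThisSlice)
    {
        if (deferThisSlice) {
            // Deferring: return before calling for production, so the
            // production meant for this slice cannot pile up on top
            // of a later one.
            ++deferredCount_;
            return false;
        }
        ++productionCount_;  // stand-in for the real production call
        return true;
    }

    int productionCount() const { return productionCount_; }
    int deferredCount() const { return deferredCount_; }

private:
    int productionCount_ = 0;
    int deferredCount_ = 0;
};
```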
Big waves.
This function wraps the operation of getting a stimulus frame from
the SpMcRingBuffer and eventually assigning it a
SimultaneityStamp. For now we always pass in the first
stimulus frame and we don't get any SimultaneityStamps.
Its callOriginalCallback() automatically calls
allowNextStimulusFrame() to ensure that it doesn't deadlock future
timeslices.
This method takes an input assembly buffer and selects which
OpenCL kernels need to be executed on that buffer to transform
the input data into the eventual output StimulusFrame for the
current timeslice period.
We implemented the feature to fill unassembled slots with dummy
header values for the Livox point cloud header.
We also fixed a bug where io_uring was writing into the last slot
only because we were using the same iovec for every SQE.
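
The iovec fix amounts to giving every slot its own iovec. The sketch
below assumes a hypothetical Slot type and makePerSlotIovecs helper.
Each SQE would then be prepared with its own entry (for example
&iovecs[i] passed to liburing's io_uring_prep_readv, if liburing is
what is in use), and the vector has to stay alive until the requests
have been submitted.

```cpp
#include <sys/uio.h>

#include <cstddef>
#include <vector>

// Hypothetical slot: one receive buffer per assembly slot.
struct Slot {
    std::vector<unsigned char> bytes;
};

// Builds one iovec per slot, each pointing at that slot's own
// storage. Reusing a single iovec for every SQE means every request
// ends up describing whichever buffer was written into it last.
std::vector<iovec> makePerSlotIovecs(std::vector<Slot>& slots)
{
    std::vector<iovec> iovecs;
    iovecs.reserve(slots.size());
    for (Slot& slot : slots) {
        iovec iov{};
        iov.iov_base = slot.bytes.data();
        iov.iov_len = slot.bytes.size();
        iovecs.push_back(iov);
    }
    return iovecs;
}
```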
We call stop() inside assembleFrameReq3, so when it returns,
the eventfdDesc should be destroyed. Don't allow a possibly stale
eventfdDesc object to permit us to re-arm the eventfdDesc read_some
call.
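
One way to express that guard, sketched with a std::weak_ptr: the
EventFdDesc struct and rearmIfStillValid are hypothetical stand-ins
for the real descriptor wrapper, and armRead() stands in for
re-issuing the read_some call. This is not necessarily how the actual
code tracks lifetime.

```cpp
#include <memory>

// Hypothetical stand-in for the eventfd descriptor wrapper.
struct EventFdDesc {
    bool open = true;
    int armCount = 0;

    void armRead() { ++armCount; }  // stand-in for re-arming read_some
    void close() { open = false; }  // stand-in for what stop() does
};

// Re-arms the read only if the descriptor still exists and is still
// open. Returns false when stop() has already torn it down.
bool rearmIfStillValid(const std::weak_ptr<EventFdDesc>& weakDesc)
{
    std::shared_ptr<EventFdDesc> desc = weakDesc.lock();
    if (!desc || !desc->open) {
        return false;  // stale or destroyed: do not re-arm
    }
    desc->armRead();
    return true;
}
```

The completion handler would hold only the weak_ptr, so a stop()
issued from inside the request cannot be undone by a late re-arm.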