OClCollMeshEngn: clean up and get it working on RPi5+Rusticl+V3D GPU

It seems that on the RPi5, whenever a HOST_PTR input buffer is to be
"transferred" from the host to the GPU and its contents must be
preserved, it has to be mapped with CL_MAP_WRITE_INVALIDATE_REGION.

This makes little sense, since that flag tells the runtime that the
previous contents of the mapped region may be discarded, but we'll
have to let it be for now. At least the code works now and we don't
have to abandon OpenCL.
2025-11-12 12:35:44 -04:00
parent d687ca0164
commit 1dc74065fb
2 changed files with 211 additions and 18 deletions
@@ -92,6 +92,9 @@ private:
size_t assemblyBufferSize;
void* collationBufferPtr;
size_t collationBufferSize;
// Mapped buffer pointers (for zero-copy synchronization)
void* mappedAssemblyBuffer;
void* mappedCollationBuffer;
// Frame descriptor (cached from setup)
std::shared_ptr<FrameAssemblyDesc> frameAssemblyDesc;
@@ -115,6 +118,17 @@ private:
StagingBuffer& assemblyBuff, uint32_t nSucceeded);
bool setupCollateDgramsArgs(StagingBuffer& assemblyBuff);
// Generic buffer mapping/unmapping for zero-copy synchronization
bool mapBuffer(
cl_mem buffer, size_t size, cl_map_flags mapFlags, void*& mappedPtr);
bool unmapBuffer(cl_mem buffer, void*& mappedPtr);
// Wrapper functions for specific buffers
bool mapAssemblyBuffer(cl_map_flags mapFlags = CL_MAP_READ);
bool unmapAssemblyBuffer();
bool mapCollationBuffer(cl_map_flags mapFlags = CL_MAP_READ);
bool unmapCollationBuffer();
// Forward declaration for continuation class
class CompactCollateAndMeshFrameReq;
@@ -177,6 +191,17 @@ private:
        return false;
    }

    // Force queue flush to ensure event processing and callback invocation
    err = clFlush(commandQueue);
    if (err != CL_SUCCESS)
    {
        std::cerr << __func__ << ": failed to flush queue: " << err
                  << std::endl;
        clReleaseEvent(*eventPtr);
        *eventPtr = nullptr;
        return false;
    }

    isRunning = true;
    return true;
}
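
The header hunk above declares a generic mapBuffer/unmapBuffer pair that tracks the mapping through a `void*&` argument, but their bodies are not shown in this excerpt. The following is only a sketch of how such pointer tracking might behave, not the implementation from this commit. `MapBackend` is a hypothetical stand-in for the OpenCL calls (in real code these would presumably wrap `clEnqueueMapBuffer` and `clEnqueueUnmapMemObject`) so that the sketch builds without an OpenCL SDK. Treating a repeated map or unmap as a no-op is also an assumption.

```cpp
#include <cstddef>
#include <functional>

// Hypothetical stand-in for the OpenCL map/unmap calls. Not part of the
// commit; it exists only so this sketch compiles without OpenCL headers.
struct MapBackend
{
    std::function<void*(std::size_t size)> map;  // returns nullptr on failure
    std::function<bool(void* ptr)> unmap;        // returns false on failure
};

// Map the buffer unless it is already mapped (assumed behaviour).
// On success mappedPtr holds the host-visible pointer.
inline bool mapBuffer(
    const MapBackend& backend, std::size_t size, void*& mappedPtr)
{
    if (mappedPtr != nullptr)
    {
        return true;  // already mapped: nothing to do
    }
    mappedPtr = backend.map(size);
    return mappedPtr != nullptr;
}

// Unmap the buffer if it is mapped. mappedPtr is cleared only after the
// backend reports success, so a failed unmap can be retried.
inline bool unmapBuffer(const MapBackend& backend, void*& mappedPtr)
{
    if (mappedPtr == nullptr)
    {
        return true;  // not mapped: nothing to do
    }
    if (!backend.unmap(mappedPtr))
    {
        return false;
    }
    mappedPtr = nullptr;
    return true;
}
```

Keeping the mapped pointer as the single source of truth means that per-buffer wrappers such as mapAssemblyBuffer() could forward their own member (mappedAssemblyBuffer) to the generic helper without any extra "is mapped" flag.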