161 Commits

Author SHA1 Message Date
hayodea a17072c8d9 IoUringEngn:assembleFrameReq: Implement and use callOriginalCallback 2025-11-13 20:53:53 -04:00
hayodea 5c3debecf4 OClCollMeshEngn: fix mem leak in [un]mapBuffer() 2025-11-13 01:41:59 -04:00
hayodea 5031b22a31 OClCollMeshEngn: use helper fns for parsing version numbers 2025-11-12 20:43:48 -04:00
hayodea df58f324a9 CMake:LivoxGen1: Require OpenCL 1.2+, printf & WRITE_INVALIDATE_REGION 2025-11-12 20:26:29 -04:00
hayodea d87c71b794 OClCollMeshEngn: perf profile and print kernel exec durations 2025-11-12 13:05:13 -04:00
hayodea 33b534355a OpenCL minimum version is 1.2
We use CL_MAP_WRITE_INVALIDATE_REGION, and I think one other feature,
both of which require v1.2 at minimum.
2025-11-12 13:05:13 -04:00
latentprion 96e64e24b8 OClCollMeshEngn: collBuff only needs MAP_WRITE; silence dbg prints
When mapping in the collationBuff we only need to supply CL_MAP_WRITE
and not CL_MAP_WRITE_INVALIDATE_REGION since we don't care to
preserve the contents of the collation buff as input to the
collation kernel.
2025-11-12 12:49:54 -04:00
hayodea 1dc74065fb OClCollMeshEngn: cleanup and get it working on RPi5+Rusticl+V3D GPU
It seems that whenever you have a HOST_PTR input buffer to be
"transferred" from the host to the GPU, whose contents must be
preserved, you must map it with WRITE_INVALIDATE_REGION on the
RPi5.

This makes little sense, but we'll have to let it be for now.
At least the code works now and we don't have to abandon using
OpenCL.
2025-11-12 12:36:41 -04:00
hayodea d687ca0164 PcloudStimBuff: remove printf clutter 2025-11-12 12:34:30 -04:00
hayodea 91e0fd0f8e IoUringAssmEngn: Disable debugging for compact kernel results 2025-11-12 12:33:38 -04:00
hayodea b55e7a8b19 livoxGen1:OpenCL kernels: add debug printfs 2025-11-12 12:30:41 -04:00
hayodea 5bb9c9e90b Dbg: Useful printfs for the raspi5 2025-11-10 01:05:20 -04:00
hayodea 401c844fcc PcloudStimBuff: add skeleton produceFrameReq :)
Big waves.
This function wraps the operation of getting a stimframe from
the SpMcRingBuffer, and then eventually assigning it a
SimultaneityStamp. For now we just always pass in the first
stim frame and we don't get any simulstamps.

Its callOriginalCallback() automatically calls
allowNextStimulusFrame() to ensure that it doesn't deadlock future
timeslices.
2025-11-10 01:04:07 -04:00
hayodea eedeb4b803 OClCollMeshEngn: Add method compactCollateAndMeshFrameReq
This method takes an input assembly buffer and selects which
OpenCL kernels need to be executed on that buffer to transform
the input data into the eventual output StimulusFrame for the
current timeslice period.
2025-11-10 00:58:48 -04:00
hayodea 19a79faabe OClCollMeshEngn: stop now just calls stop*Kernel 2025-11-10 00:54:41 -04:00
hayodea 1ac6fa4a16 Rename StimFrame=>StimulusFrame 2025-11-09 22:09:19 -04:00
hayodea 7cae3452fc OClMeshCollEngn: temporarily call stop in CL cbs 2025-11-09 20:23:14 -04:00
hayodea 582aefb02c OClEngn: Split isSetup/Running into collate+compact 2025-11-09 19:58:45 -04:00
hayodea aef251b7e5 IoUringEngn: add random dummy slot generator for debugging 2025-11-09 19:34:02 -04:00
hayodea ad0b8058a4 ClCollMeshEngn: big reworks to clean up. 2025-11-09 19:28:55 -04:00
hayodea b331af4f03 ClCollMeshEngn: Split start into start[Collate|Compact]Kernel()
These prepare each kernel separately. We'll unify them further.
2025-11-09 16:12:10 -04:00
hayodea 683e107b04 livoxG1:OClCollMeshEngn: Wrestling and massaging 2025-11-09 15:18:53 -04:00
hayodea c8cbaed3b1 OClCollAndMeshEngn: formatting 2025-11-09 12:37:30 -04:00
hayodea 5f03e4c392 livoxG1:collateDgrams.cl: Clarify collation offsetting 2025-11-09 12:12:08 -04:00
hayodea 55116b1d41 livoxG1:collateDgrams.cl: Fix unaligned reads 2025-11-09 11:48:53 -04:00
hayodea 7977f0bcc9 OClCollatingMeshingEngn: Compile both kernels side by side 2025-11-09 04:49:37 -04:00
hayodea 6264a128a8 livoxG1: Add point cloud frame collator OpenCL kernel 2025-11-09 04:48:15 -04:00
hayodea 01ba68f2b5 livoxG1:OCLEngine: compile compactor program 2025-11-09 03:44:56 -04:00
hayodea 511f1796e8 livoxG1:slotCompactor.cl: mental-validate and refactor 2025-11-09 03:40:46 -04:00
hayodea a0a5aa49ad livoxG1: Add new OpenCL kernel to compact dgrams before collation 2025-11-09 02:39:09 -04:00
hayodea d2e2d9bc3b StagingBuffer: Prefer mlock to io_uring_register_buffers 2025-11-09 01:16:17 -04:00
hayodea 010ba9c7bd Bugfix,IoUringEngn: fill unassembled slots w/dummy; use separate iovecs
We implemented the feature to fill unassembled slots w/dummy header
values for the livox pcloud header.

We also fixed a bug where io_uring was writing into only the last
slot because we were using the same iovec for every SQE.
2025-11-09 00:55:58 -04:00
hayodea 72a3415553 Bugfix: Don't use eventfdDesc after stop()
We call stop() inside the assembleFrameReq3, so when it returns,
the eventfdDesc should be destroyed. Don't allow a possibly stale
eventfdDesc obj to permit us to re-arm the eventfdDesc read_some
call.
2025-11-08 23:09:14 -04:00
hayodea a0ab5538df StimBuff: Add mnemonic wrapper for unlocking frameAssmLimiter 2025-11-08 22:07:52 -04:00
hayodea 5b7b4f215a IoUringAssmEngine: Acquire spinlock in stall timeout branch 2025-11-08 21:54:11 -04:00
hayodea d8a3999ad5 PcloudStimBuff: call OClCollMeshEngn::setup/finalize in start/stop 2025-11-08 12:23:13 -04:00
hayodea 5ff6a4ee0b OClCollMeshEngn: implement start/stop/setup/finalize 2025-11-08 12:23:13 -04:00
hayodea 6a5bb47e0e PcloudStimBuff: Add OpenClCollatingAndMeshingEngine instance 2025-11-08 12:23:10 -04:00
hayodea 073cdde08f livoxG1: StagingBuff: add getClEngineIovec 2025-11-08 12:18:55 -04:00
hayodea e1042724fc livoxGen1: nitpicking: use .-prefixed symbol for end 2025-11-08 11:11:05 -04:00
hayodea 28e56653ea livoxGen1: unmangle symbols, add .sizes 2025-11-08 11:09:09 -04:00
hayodea 5dbed56e38 livoxG1: Make collateKernelNBytes a uint32_t for 32bit portability 2025-11-08 10:59:08 -04:00
hayodea 9233f7fdc8 livoxG1: Add OpenCL kernels for collation 2025-11-08 10:26:17 -04:00
hayodea bc56c83fad Rename: OpenGlSplittingEngine=>OpenGlCollatingAndMeshingEngine 2025-11-08 01:48:56 -04:00
hayodea cb493d7598 StagingBuff: set OpenCL constraints 2025-11-08 01:45:47 -04:00
hayodea 1c50fc0e29 StagingBuff: Move constructor into .cpp file 2025-11-08 00:21:24 -04:00
hayodea 7497f2fd95 StagingBuff: Enhance IoConstraints with frame constraints
Now StagingBuff instances must meet both frame and slot
constraints.
2025-11-08 00:15:29 -04:00
hayodea 0b21cdd2ba OClSplitEngn: fix build warnings 2025-11-07 22:20:44 -04:00
hayodea f5146738e1 PcloudStimBuff: Add collationBuffer 2025-11-07 22:07:27 -04:00
hayodea 479219db2d StagingBuff: Unify constraints into IOEngineConstraints 2025-11-07 22:05:01 -04:00
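
Note on commit 010ba9c7bd (separate iovecs): the bug described there, where every write landed in the last slot because one iovec was reused for every SQE, can be reproduced in miniature without io_uring. The sketch below is not the project's code. FakeSqe, SlotArray, kSlots, kSlotBytes and both function names are hypothetical, and it assumes only that a queued entry keeps a pointer to its iovec and that the iovec is read after all entries have been prepared.

```cpp
#include <sys/uio.h>  // struct iovec (POSIX)

#include <array>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an SQE prepared for a vectored read: it keeps a
// pointer to the iovec rather than a copy of it.
struct FakeSqe {
    const iovec* iov;
};

constexpr std::size_t kSlots = 4;      // assumed slot count, for illustration
constexpr std::size_t kSlotBytes = 64; // assumed slot size, for illustration
using SlotArray = std::array<std::array<char, kSlotBytes>, kSlots>;

// Resolve each queued entry to the buffer it would write into. This models
// the iovecs being read only once the whole queue is consumed.
static std::vector<void*> resolveTargets(const std::vector<FakeSqe>& queue) {
    std::vector<void*> targets;
    for (const FakeSqe& sqe : queue) {
        targets.push_back(sqe.iov->iov_base);
    }
    return targets;
}

// Buggy pattern: one iovec is overwritten for each slot, so every queued
// entry ends up pointing at whatever was stored last.
std::vector<void*> targetsWithSharedIovec(SlotArray& slots) {
    iovec shared{};
    std::vector<FakeSqe> queue;
    for (auto& slot : slots) {
        shared.iov_base = slot.data();
        shared.iov_len = slot.size();
        queue.push_back(FakeSqe{&shared});
    }
    return resolveTargets(queue);
}

// Fixed pattern: one iovec per slot, each kept alive until the queue has
// been consumed.
std::vector<void*> targetsWithSeparateIovecs(SlotArray& slots) {
    std::array<iovec, kSlots> iovecs{};
    std::vector<FakeSqe> queue;
    for (std::size_t i = 0; i < kSlots; ++i) {
        iovecs[i].iov_base = slots[i].data();
        iovecs[i].iov_len = slots[i].size();
        queue.push_back(FakeSqe{&iovecs[i]});
    }
    return resolveTargets(queue);
}
```

With the shared iovec, every resolved target is the address of the last slot. With one iovec per entry, target i is the address of slot i, which matches the behaviour the commit message describes after the fix.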