hayodea
d87c71b794
OClCollMeshEngn: perf profile and print kernel exec durations
2025-11-12 13:05:13 -04:00
hayodea
33b534355a
OpenCL minimum version is 1.2
...
We use CL_MAP_WRITE_INVALIDATE_REGION, and I think one other
feature, both of which require v1.2 at minimum.
2025-11-12 13:05:13 -04:00
latentprion
96e64e24b8
OClCollMeshEngn: collBuff only needs MAP_WRITE; silence dbg prints
...
When mapping in the collationBuff we only need to supply CL_MAP_WRITE
and not CL_MAP_WRITE_INVALIDATE_REGION since we don't care to
preserve the contents of the collation buff as input to the
collation kernel.
2025-11-12 12:49:54 -04:00
hayodea
1dc74065fb
OClCollMeshEngn: cleanup and get it working on RPi5+Rusticl+V3D GPU
...
It seems that on the RPi5, whenever a HOST_PTR input buffer whose
contents must be preserved is to be "transferred" from the host to
the GPU, you must map it with WRITE_INVALIDATE_REGION.
This makes little sense, but we'll have to let it be for now.
At least the code works now and we don't have to abandon using
OpenCL.
2025-11-12 12:36:41 -04:00
hayodea
d687ca0164
PcloudStimBuff: remove printf clutter
2025-11-12 12:34:30 -04:00
hayodea
91e0fd0f8e
IoUringAssmEngn: Disable debugging for compact kernel results
2025-11-12 12:33:38 -04:00
hayodea
b55e7a8b19
livoxGen1:OpenCL kernels: add debug printfs
2025-11-12 12:30:41 -04:00
hayodea
5bb9c9e90b
Dbg: Useful printfs for the raspi5
2025-11-10 01:05:20 -04:00
hayodea
401c844fcc
PcloudStimBuff: add skeleton produceFrameReq :)
...
Big waves.
This function wraps the operation of getting a stimframe from
the SpMcRingBuffer, and then eventually assigning it a
SimultaneityStamp. For now we just always pass in the first
stim frame and we don't get any simulstamps.
Its callOriginalCallback() automatically calls
allowNextStimulusFrame() to ensure that it doesn't deadlock future
timeslices.
2025-11-10 01:04:07 -04:00
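The completion behaviour described in this commit could look roughly like the sketch below. It is only an illustration under stated assumptions: `FrameLimiter`, `tryAcquire()`, the `bool` callback argument, and the choice to release the limiter before invoking the callback are all hypothetical and not taken from the project. Only the names `callOriginalCallback()` and `allowNextStimulusFrame()` come from the commit message.

```cpp
#include <atomic>
#include <functional>
#include <utility>

// Hypothetical limiter that admits one in-flight frame request at a time.
class FrameLimiter {
public:
    // Returns true if the caller obtained the right to produce a frame.
    bool tryAcquire() { return !busy_.exchange(true); }
    // Lets the next timeslice start a new frame request.
    void allowNextStimulusFrame() { busy_.store(false); }

private:
    std::atomic<bool> busy_{false};
};

// Hypothetical request object: completing the request always releases the
// limiter, so callers cannot forget to do it and stall later timeslices.
class ProduceFrameReq {
public:
    ProduceFrameReq(FrameLimiter &limiter, std::function<void(bool)> cb)
        : limiter_(limiter), cb_(std::move(cb)) {}

    void callOriginalCallback(bool ok)
    {
        // Assumption: release first, so the callback may start a new request.
        limiter_.allowNextStimulusFrame();
        if (cb_) {
            cb_(ok);
        }
    }

private:
    FrameLimiter &limiter_;
    std::function<void(bool)> cb_;
};
```

Tying the release to the completion path means every exit route from the request reopens the limiter, which is presumably what the commit means by not deadlocking future timeslices.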
hayodea
eedeb4b803
OClCollMeshEngn: Add method compactCollateAndMeshFrameReq
...
This method takes an input assembly buffer and selects which
OpenCL kernels need to be executed on that buffer to transform
the input data into the eventual output StimulusFrame for the
current timeslice period.
2025-11-10 00:58:48 -04:00
hayodea
19a79faabe
OClCollMeshEngn: stop now just calls stop*Kernel
2025-11-10 00:54:41 -04:00
hayodea
1ac6fa4a16
Rename StimFrame=>StimulusFrame
2025-11-09 22:09:19 -04:00
hayodea
7cae3452fc
OClMeshCollEngn: temporarily call stop in CL cbs
2025-11-09 20:23:14 -04:00
hayodea
582aefb02c
OClEngn: Split isSetup/Running into collate+compact
2025-11-09 19:58:45 -04:00
hayodea
aef251b7e5
IoUringEngn: add random dummy slot generator for debugging
2025-11-09 19:34:02 -04:00
hayodea
ad0b8058a4
ClCollMeshEngn: big reworks to clean up.
2025-11-09 19:28:55 -04:00
hayodea
b331af4f03
ClCollMeshEngn: Split start into start[Collate|Compact]Kernel()
...
These prepare each kernel separately. We'll unify them further.
2025-11-09 16:12:10 -04:00
hayodea
683e107b04
livoxG1:OClCollMeshEngn: Wrestling and massaging
2025-11-09 15:18:53 -04:00
hayodea
c8cbaed3b1
OClCollAndMeshEngn: formatting
2025-11-09 12:37:30 -04:00
hayodea
5f03e4c392
livoxG1:collateDgrams.cl: Clarify collation offsetting
2025-11-09 12:12:08 -04:00
hayodea
55116b1d41
livoxG1:collateDgrams.cl: Fix unaligned reads
2025-11-09 11:48:53 -04:00
hayodea
7977f0bcc9
OClCollatingMeshingEngn: Compile both kernels side by side
2025-11-09 04:49:37 -04:00
hayodea
6264a128a8
livoxG1: Add point cloud frame collator OpenCL kernel
2025-11-09 04:48:15 -04:00
hayodea
01ba68f2b5
livoxG1:OCLEngine: compile compactor program
2025-11-09 03:44:56 -04:00
hayodea
511f1796e8
livoxG1:slotCompactor.cl: mental-validate and refactor
2025-11-09 03:40:46 -04:00
hayodea
a0a5aa49ad
livoxG1: Add new OpenCl kernel to compact dgrams before collation
2025-11-09 02:39:09 -04:00
hayodea
d2e2d9bc3b
StagingBuffer: Prefer mlock to io_uring_register_buffers
2025-11-09 01:16:17 -04:00
hayodea
010ba9c7bd
Bugfix,IoUringEngn: fill unassembled slots w/dummy; use separate iovecs
...
We implemented the feature to fill unassembled slots with dummy
header values for the livox pcloud header.
We also fixed a bug where io_uring wrote only into the last slot
because we were using the same iovec for every SQE.
2025-11-09 00:55:58 -04:00
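The shared-iovec bug in this commit can be illustrated without io_uring itself. A readv-style SQE stores only a pointer to its `iovec`, so overwriting one `iovec` object for each slot leaves every entry describing the last slot. The sketch below uses a hypothetical `FakeSqe` stand-in and hypothetical helper names, and is not the project's code.

```cpp
#include <sys/uio.h>  // struct iovec (POSIX)
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a submission queue entry: like a readv-style SQE,
// it keeps a pointer to the iovec rather than a copy of it.
struct FakeSqe {
    const iovec *iov;
};

// Buggy pattern: a single iovec is overwritten for every slot, so all entries
// point at the same descriptor, which ends up describing the last slot.
std::vector<FakeSqe> prepSharedIovec(std::uint8_t *buf, std::size_t slotSize,
                                     std::size_t nSlots, iovec &shared)
{
    std::vector<FakeSqe> sqes;
    for (std::size_t i = 0; i < nSlots; ++i) {
        shared.iov_base = buf + i * slotSize;
        shared.iov_len = slotSize;
        sqes.push_back(FakeSqe{&shared});
    }
    return sqes;
}

// Fixed pattern: one iovec per slot. The caller owns `iovecs` and must keep
// it alive (and unresized) while the entries are in use.
std::vector<FakeSqe> prepSeparateIovecs(std::uint8_t *buf, std::size_t slotSize,
                                        std::size_t nSlots,
                                        std::vector<iovec> &iovecs)
{
    iovecs.resize(nSlots);
    std::vector<FakeSqe> sqes;
    for (std::size_t i = 0; i < nSlots; ++i) {
        iovecs[i].iov_base = buf + i * slotSize;
        iovecs[i].iov_len = slotSize;
        sqes.push_back(FakeSqe{&iovecs[i]});
    }
    return sqes;
}
```

With four 16-byte slots, every entry from `prepSharedIovec` refers to offset 48, while the entries from `prepSeparateIovecs` refer to offsets 0, 16, 32 and 48.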
hayodea
72a3415553
Bugfix: Don't use eventfdDesc after stop()
...
We call stop() inside assembleFrameReq3, so when it returns, the
eventfdDesc should be destroyed. Don't allow a possibly stale
eventfdDesc obj to permit us to re-arm the eventfdDesc read_some
call.
2025-11-08 23:09:14 -04:00
hayodea
a0ab5538df
StimBuff: Add mnemonic wrapper for unlocking frameAssmLimiter
2025-11-08 22:07:52 -04:00
hayodea
5b7b4f215a
IoUringAssmEngine: Acquire spinlock in stall timeout branch
2025-11-08 21:54:11 -04:00
hayodea
d8a3999ad5
PcloudStimBuff: call OClCollMeshEngn::setup/finalize in start/stop
2025-11-08 12:23:13 -04:00
hayodea
5ff6a4ee0b
OClCollMeshEngn: implement start/stop/setup/finalize
2025-11-08 12:23:13 -04:00
hayodea
6a5bb47e0e
PcloudStimBuff: Add OpenClCollatingAndMeshingEngine instance
2025-11-08 12:23:10 -04:00
hayodea
073cdde08f
livoxG1: StagingBuff: add getClEngineIovec
2025-11-08 12:18:55 -04:00
hayodea
e1042724fc
livoxGen1: nitpicking: use .-prefixed symbol for end
2025-11-08 11:11:05 -04:00
hayodea
28e56653ea
livoxGen1: unmangle symbols, add .sizes
2025-11-08 11:09:09 -04:00
hayodea
5dbed56e38
livoxG1: Make collateKernelNBytes a uint32_t for 32bit portability
2025-11-08 10:59:08 -04:00
hayodea
9233f7fdc8
livoxG1: Add OpenCl kernels for collation
2025-11-08 10:26:17 -04:00
hayodea
bc56c83fad
Rename: OpenGlSplittingEngine=>OpenGlCollatingAndMeshingEngine
2025-11-08 01:48:56 -04:00
hayodea
cb493d7598
StagingBuff: set OpenCL constraints
2025-11-08 01:45:47 -04:00
hayodea
1c50fc0e29
StagingBuff: Move constructor into .cpp file
2025-11-08 00:21:24 -04:00
hayodea
7497f2fd95
StagingBuff: Enhance IoConstraints with frame constraints
...
Now StagingBuff instances must meet both frame and slot
constraints.
2025-11-08 00:15:29 -04:00
hayodea
0b21cdd2ba
OClSplitEngn: fix build warnings
2025-11-07 22:20:44 -04:00
hayodea
f5146738e1
PcloudStimBuff: Add collationBuffer
2025-11-07 22:07:27 -04:00
hayodea
479219db2d
StagingBuff: Unify constraints into IOEngineConstraints
2025-11-07 22:05:01 -04:00
hayodea
b598ca8594
libs: Add smohook for getting cmdline opts
2025-11-07 14:59:28 -04:00
hayodea
457d0f9345
Dbg:Add CallableTracer for callables post()ed to boost.asio
...
This class and its macro allow us to trace the invocation of
callbacks as they're invoked by Boost.asio.
2025-11-06 21:45:16 -04:00
hayodea
af57c4dfd1
Boost: move top_ link fixer to top of files
2025-11-06 15:03:26 -04:00
hayodea
db30001140
livoxG1: Rename stagingBuffer=>assemblyBuffer
...
This is in preparation for re-using StagingBuffer to also serve
as the collation buffer that we'll use as the intermediate stage
for producing the final output mesh.
2025-11-06 14:09:10 -04:00