Commit Graph

134 Commits

Author SHA1 Message Date
hayodea 2632917c63 livoxGen1: Execute delays on lib's assigned CompThread 2025-11-15 15:59:34 -04:00
hayodea 7a51f02d97 livoxGen1: Implement StimBuff add/del from StimProducers
There seems to be a bug where two or more stimProducers
or stimBuffs get initialized at once but we can deal with that
tomorrow.
2025-11-15 04:02:25 -04:00
hayodea e215e78aa5 StimulusBuffer should take ref to parent; not sh_ptr to common instance 2025-11-15 01:15:57 -04:00
hayodea 188b09319c livoxProto1: Rn Device::nAttachedStimBuffs=>nAttachedStimulusProducers
More semantically precise name.
2025-11-15 00:56:20 -04:00
hayodea 16b51a3b66 Rename PcloudDataProducer=>PcloudStimulusProducer 2025-11-14 23:50:31 -04:00
hayodea 7d86ecadc4 livoxGen1: Rn attachedDataProducers=>attachedStimulusProducers
Also compare producers only by device selector and not by the rest
of their stored DASpec.
2025-11-14 23:26:13 -04:00
hayodea 98a493a8a1 livoxGen1: Add StimBuffs to PcloudStimProd
* PcloudStimulusProducer now has member sh_ptr<StimulusBuffer>s.
* StimulusProducer now has a vector<sh_ptr<StimulusBuffer>s.

Created new stimbuff-type-specific
Pcloud[Xyz|I|Ambience]StimulusBuffer classes for representing each
stim feature exposed by livoxGen1's PcloudStimulusProducer.
2025-11-14 23:19:32 -04:00
hayodea 8a7dc10892 Split StimulusProducer=>StimulusBuffer+StimulusProducer
We're getting ready for the last mile of the StimulusBuffer API
and the proto-completion of the LivoxGen1 StimBuffApi.
2025-11-14 20:44:37 -04:00
hayodea 70c0175a8b Rename StimulusBuffer=>StimulusProducer
Next we'll split the StimulusBuffer-related stuff into a new class
StimulusBuffer.
2025-11-14 19:50:51 -04:00
hayodea 74e3896ae4 Rename PcloudStimulusBuffer=>PcloudDataProducer
This prepares us for the split up of classes. We're going to split
StimulusBuffer into two base classes: StimulusBuffer and
StimulusProducer.
2025-11-14 19:44:18 -04:00
hayodea 7b7ff06219 PcloudStimBuff:start: check engine setup()s for error 2025-11-14 18:07:20 -04:00
hayodea 51a2858214 OClCollMeshEngn:*KernelComplete: use WRITE_INVLDT during finalize()
Doing it this way enables us to get the mapBuffer() call working
during finalize. But we couldn't get the unmap call working. That
has to do with a bug in the Rusticl event waiting code.
2025-11-14 18:04:12 -04:00
hayodea 2e75dd40aa OClCollMeshEngn: Rearrange steps in startCollateKernel
Just to make it match startCompactKernel. No other reason.
2025-11-14 18:01:48 -04:00
hayodea c08e075763 Bug:Rusticl: segfault on waitForEvents(clEnqueueUnmapMemObject) in finalize
For some reason, waiting on the event object returned by
clEnqueueUnmapMemObject, but only when called from within finalize().
Under normal operating conditions when we map and then unmap our
HOST_PTR buffers, everything works just fine.

I can't discern any relevant difference.
Adding a bridged 300ms delay in setup() doesn't help either so it
doesn't seem to be solved by allowing the rusticl worker threads
to finish initializing.

GDB output from the segfault appended. Sadly, no debug symbols for
the ubuntu rusticl package.

```
[New Thread 0xffffdd2ce140 (LWP 1056313)]
validateOpenClVersion: OpenCL platform version: OpenCL 3.0
validateOpenClVersion: OpenCL device version: OpenCL 3.0
[New Thread 0xffffdcabe140 (LWP 1056314)]
[New Thread 0xffffc9a8f140 (LWP 1056315)]
start: Starting stimulus buffer for device 3JEDK380010Z39
attachDeviceReq3: Got return mode (0) for device: 3JEDK380010Z39
discardHeartbeatAck: Lidar not ready for operation: work_state: 0x0 (Initializing), ack_msg: 0x1b
discardHeartbeatAck: Lidar not ready for operation: work_state: 0x0 (Initializing), ack_msg: 0x45
discardHeartbeatAck: Lidar not ready for operation: work_state: 0x0 (Initializing), ack_msg: 0x45
discardHeartbeatAck: Lidar not ready for operation: work_state: 0x0 (Initializing), ack_msg: 0x45
attachDeviceReq5: Failed to enable pcloud data for dev 3JEDK380010Z39
newDeviceAttachmentSpecInd2: Attach failed for device spec Device Identifier: avia0, Sensor Type: e, QualeIface API: structural-qualeiface, QualeIface API Params: (), StimBuff API: livoxGen1, StimBuff API Params: (), Provider: livoxProto1, Provider Params: (smo-ip=10.42.0.2 ), Device Selector: 3JEDK380010Z39

attachAllUnattachedDevicesFromReq2: Failed to attach device: avia0
Mrntt: attached 0 of 2 sense devices.
Mrntt: Body component initialized.
initializeReq2: Failed to initialize globalMind
marionetteInitializeReqCb: Failed to initialize Marionette. Shutting down.
Mrntt: About to detach all sense devices.
Mrntt: Successfully detached 0 of 0 sense devices.
Mrntt: About to finalize all stim buff api libs.
compactKernelCompletecalling w/mapFlags=4. INV=4; READ=1.
mapBuffer 1
mapBuffer 2
mapBuffer 3: cmdQ: 0xffffec013d68, buffer: 0xffffec07b6b8, mapflags: 4
mapBuffer 4
mapBuffer 5
unmapBuffer 1
unmapBuffer 2
unmapBuffer 3
unmapBuffer 4
unmapBuffer 4.1
unmapBuffer 5

Thread 9 "rusticl queue t" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0xffffdcabe140 (LWP 1056314)]
Download failed: Invalid argument.  Continuing without source file ./string/../sysdeps/aarch64/multiarch/../memcpy.S.
__memcpy_generic () at ../sysdeps/aarch64/multiarch/../memcpy.S:155
warning: 155    ../sysdeps/aarch64/multiarch/../memcpy.S: No such file or directory
(gdb)
(gdb) bt
(gdb)
```
2025-11-14 17:45:34 -04:00
hayodea 3995f57489 OClCollMeshEngn: call clFlush+clFinish after setup()
This ensures that all operations enqueued during setup() get fully
executed before any requests come in.
2025-11-14 17:42:46 -04:00
hayodea 0720ed9c76 StimBuff: Make produceFrameReq responsive to stop() 2025-11-14 02:25:51 -04:00
hayodea c268414b0d Fix comment 2025-11-14 02:08:03 -04:00
hayodea a1625eb562 OClCollMeshEngn: Use shouldAcceptRequests stop/finalize() pattern
This makes the stop() method capable of synchronously stopping all
engine/server-type async services which don't act in a self-moved
fashion but instead wait for a request.
2025-11-14 01:41:03 -04:00
hayodea 39691175e7 Formatting 2025-11-14 01:03:58 -04:00
hayodea 1df43665c3 IoUringAssmEngn: Implement shouldAcceptRequests daemon/async control
We've reworked the synchronous control functions that govern the
async daemon and in-flight requests for this class. The
shouldAcceptRequests flag represents the readiness state of the
whole engine class. The in-flight async operations consult the
shouldAcceptRequests flag to determine whether they should return
early.

Now the stop() method is solely for setting the locked flag
shouldAcceptRequests=false.

The pair resetAndAssembleFrame()/assemblyCycleComplete manage the
per-assembly cycle state machine, and they don't need to set or
interfere with the shouldAcceptRequests flag.
2025-11-13 23:53:31 -04:00
hayodea 501effe6d5 IoUringAssmEngn: assemFrameReq: exit responsively on stop() 2025-11-13 21:00:26 -04:00
hayodea d01f06904a assembleFrameReq: fix bug where we don't CB before ret 2025-11-13 20:57:10 -04:00
hayodea 16a74a3eb0 IoUringAssmEngn,OClCollMeshEngn: start/stop aren't public iface
Placing these functions in the public section kind of conceptually
confuses the reader since start/stop are indeed public interface
members in StimulusBuffer -- but they're not in the member objects.
2025-11-13 20:54:54 -04:00
hayodea a17072c8d9 IoUringEngn:assembleFrameReq: Implement and use callOriginalCallback 2025-11-13 20:53:53 -04:00
hayodea 5c3debecf4 OClCollMeshEngn: fix mem leak in [un]mapBuffer() 2025-11-13 01:41:59 -04:00
hayodea 5031b22a31 OClCollMeshEngn: use helper fns for parsing version numbers 2025-11-12 20:43:48 -04:00
hayodea df58f324a9 CMake:LivoxGen1: Require OpenCL 1.2+, printf & WRITE_INVALIDATE_REGION 2025-11-12 20:26:29 -04:00
hayodea d87c71b794 OClCollMeshEngn: perf profile and print kernel exec durations 2025-11-12 13:05:13 -04:00
hayodea 33b534355a OpenCL minimum version is 1.2
We use CL_MAP_WRITE_INVALIDATE, and I think one other feature which
both require v1.2 minimum
2025-11-12 13:05:13 -04:00
latentprion 96e64e24b8 OClCollMeshEngn: collBuff only needs MAP_WRITE; silence dbg prints
When mapping in the collationBuff we only need to supply CL_MAP_WRITE
and not CL_MAP_WRITE_INVALIDATE_REGION since we don't care to
preserve the contents of the collation buff as input to the
collation kernel.
2025-11-12 12:49:54 -04:00
hayodea 1dc74065fb OClCollMeshEngn: cleanup and get it working on RPi5+Rusticl+V3D GPU
It seems that whenever you have an HOST_PTR input buffer to be
"transferred" from the host to the GPU, whose contents must be
preserved, you must map it with WRITE_INVALIDATE_REGION on the
RPi5.

This makes little sense, but we'll have to let it be for now.
At least the code works now and we don't have to abandon using
OpenCL.
2025-11-12 12:36:41 -04:00
hayodea d687ca0164 PcloudStimBuff: remove printf clutter 2025-11-12 12:34:30 -04:00
hayodea 91e0fd0f8e IoUringAssmEngn: Disable debugging for compact kernel results 2025-11-12 12:33:38 -04:00
hayodea b55e7a8b19 livoxGen1:OpenCL kernels: add debug printfs 2025-11-12 12:30:41 -04:00
hayodea 5bb9c9e90b Dbg: Useful printfs for the raspi5 2025-11-10 01:05:20 -04:00
hayodea 401c844fcc PcloudStimBuff: add skeleton produceFrameReq :)
Big waves.
This function wraps the operation of getting a stimframe from
the SpMcRingBuffer, and then eventually assigning it a
SimultaneityStamp. For now we just always pass in the first
stim frame and we don't get any simulstamps.

Its callOriginalCallback() automatically calls
allowNextStimulusFrame() to ensure that it doesn't deadlock future
timeslices.
2025-11-10 01:04:07 -04:00
hayodea eedeb4b803 OClCollMeshEngn: Add method compactCollateAndMeshFrameReq
This method takes an input assembly buffer and selects which
OpenCL kernels need to be executed on that buffer to transform
the input data into the eventual output StimulusFrame for the
current timeslice period.
2025-11-10 00:58:48 -04:00
hayodea 19a79faabe OClCollMeshEngn: stop now just calls stop*Kernel 2025-11-10 00:54:41 -04:00
hayodea 1ac6fa4a16 Rename StimFrame=>StimulusFrame 2025-11-09 22:09:19 -04:00
hayodea 7cae3452fc OClMeshCollEngn: temporarily call stop in CL cbs 2025-11-09 20:23:14 -04:00
hayodea 582aefb02c OClEngn: Split isSetup/Running into collate+compact 2025-11-09 19:58:45 -04:00
hayodea aef251b7e5 IoUringEngn: add random dummy slot generator for debugging 2025-11-09 19:34:02 -04:00
hayodea ad0b8058a4 ClCollMeshEngn: big reworks to clean up. 2025-11-09 19:28:55 -04:00
hayodea b331af4f03 ClCollMeshEngn: Split start into start[Collate|Compact]Kernel()
These prepare each kernel separately. We'll unify them further.
2025-11-09 16:12:10 -04:00
hayodea 683e107b04 livoxG1:OClCollMeshEngn: Wrestling and massaging 2025-11-09 15:18:53 -04:00
hayodea c8cbaed3b1 OClCollAndMeshEngn: formatting 2025-11-09 12:37:30 -04:00
hayodea 5f03e4c392 livoxG1:collateDgrams.cl: Clarify collation offsetting 2025-11-09 12:12:08 -04:00
hayodea 55116b1d41 livoxG1:collateDgrams.cl: Fix unaligned reads 2025-11-09 11:48:53 -04:00
hayodea 7977f0bcc9 OClCollatingMeshingEngn: Compile both kernels side by side 2025-11-09 04:49:37 -04:00
hayodea 6264a128a8 livoxG1: Add point cloud frame collator OpenCL kernel 2025-11-09 04:48:15 -04:00