We use io_uring_register_buffers() for IoUringAssemblyEngine instead
of using mlock(). This __appears__ to have reduced CPU utilization on
the Dell laptop. Could also be that we recently upgraded total RAM
from 8GiB to 32GiB.
We directly use an instance of RangeDescriptor to avoid incurring
the memory cost of using a StagingBuffer here. It's unnecessary
since these stencils will always be 32bits large.
We modify the semantics/meaning of the ambience stim feature.
It now represents the number of frames whose average intensity
is below the ambienceLowVal.
We can now implement the postrin as the event wherein the number
of frames whose intensity <= ambienceLowVal exceeds
postrin-interest-threshold.
See the diff of the todo file within this commit for more details.
In short, we do this to prevent the possibility of an in-flight async
contin accessing metadata that we've already destroyed after finalize()
has been called.
See the diff of the todo file within this patch for more
details.
This is to eliminate the possibility of having an in-flight async
contin access metadata that we destroyed in finalize().
These two classes represent our first foray into stencil
construction. One of them standardizes PcloudAmbience stencils
across all stimbuffs, and the other specifies the internal
memory constraints and requirements for a LivoxGen1 device's
stencils.
This change is a bit pedantic, but since these vars aren't accessed
in any hotpath, it's fine to be pedantic. We made these sh_ptrs
atomic so we can use acquire and release side effects when loading
and storing them. This doesn't eliminate the problem of seeing
inconsistent state across microcontrollers, but it helps with simple
accesses like these ones we already do.
Reduces code duplication, centralizes checking and enforces consistent
behaviour across producers.
Also reordered the writes to the sh_ptr<StimulusBuffer>s such that
the pointers are written last.
PcloudStimulusBuffer::produceFrameReq():
Now correctly produces into the stim frames for the
PcloudIntensityStimulusBuffer object that's attached to the
PcloudStimulusProducer. If there's no attached I stimbuff, then
the OpenCL kernel will simply not write out the intensity data.
This is the first moment when we actually use the SP-MC ringbuffer
properly and actually cycle through the frames, producing into
them one by one.
This ensures that we can avoid races when adding and removing
stimbuffs to a stimproducer.
At least in theory. I can think of some ways in which this current
design may result in races or other bad conditions.
We added a new centralized OpenCL Compute manager. This can later
be extended to support CUDA, SyCL, etc. SMO can be configured at
build time to choose which API it will use for compute.
Moreover, the ComputeMgr allows us to register buffers which are
available to all cl_contexts.
We now allocate all the stimFrames for a StimBuffer using a
single StagingBuffer. This gives us all the benefits we're
looking for (pinning, alignment, etc).