The OFM algorithm runs in fractions of a millisecond, while GP3 runs
in fractions of a second. I think if we can get more input data to
the OFM, or something akin to it, we will have a winner.
We directly use an instance of RangeDescriptor to avoid incurring
the memory cost of using a StagingBuffer here. A StagingBuffer is
unnecessary since these stencils will always be 32 bits in size.
These two classes represent our first foray into stencil
construction. One of them standardizes PcloudAmbience stencils
across all stimbuffs, and the other specifies the internal
memory constraints and requirements for a LivoxGen1 device's
stencils.
We added a new centralized OpenCL compute manager. This can later
be extended to support CUDA, SYCL, etc. SMO can be configured at
build time to choose which API it will use for compute.
Moreover, the ComputeMgr allows us to register buffers which are
available to all cl_contexts.
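
A minimal sketch of the two ideas above: build-time backend selection
and a central buffer registry. Every name here (ComputeMgrSketch, the
SMO_SKETCH_USE_* macros, registerBuffer, findBuffer) is hypothetical
and is not the real ComputeMgr interface. The sketch models only the
bookkeeping with plain host memory and makes no OpenCL calls.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical build-time switch: define one of these macros on the
// compiler command line to pick the compute API. OpenCL is the default.
#if defined(SMO_SKETCH_USE_CUDA)
constexpr const char* kComputeBackend = "cuda";
#elif defined(SMO_SKETCH_USE_SYCL)
constexpr const char* kComputeBackend = "sycl";
#else
constexpr const char* kComputeBackend = "opencl";
#endif

// Hypothetical registry: one place that owns named buffers so that
// every consumer looks them up through the same manager.
class ComputeMgrSketch {
public:
    using Buffer = std::vector<unsigned char>;

    // Registers a zero-filled buffer of nBytes under `name`.
    // Returns nullptr if the name is already registered.
    Buffer* registerBuffer(const std::string& name, std::size_t nBytes) {
        auto result = buffers_.emplace(name, Buffer(nBytes));
        return result.second ? &result.first->second : nullptr;
    }

    // Returns the buffer registered under `name`, or nullptr if none.
    Buffer* findBuffer(const std::string& name) {
        auto it = buffers_.find(name);
        return it == buffers_.end() ? nullptr : &it->second;
    }

    std::size_t bufferCount() const { return buffers_.size(); }

private:
    // std::map keeps element addresses stable across insertions, so
    // pointers handed out earlier remain valid.
    std::map<std::string, Buffer> buffers_;
};
```

A caller would register a buffer once and have later consumers fetch
it by name with findBuffer() instead of allocating their own copy.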
This symbol is defined as a static member object inside a Boost
detail header. When a project uses Boost headers in both the main
binary and dlopen()'d shlibs, the top_ symbol gets duplicated and
the metadata gets partitioned across the copies.
We use the Boost shlib so that the main binary and the shlibs all
resolve top_ to the same memory address. This involves marking the
templated object call_stack::top_ as "extern" and then declaring to
Boost that we intend to use the shlibs.
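
The "extern" marking can be illustrated with the standard C++
explicit instantiation mechanism. This is a sketch under assumptions:
call_stack_demo and demo_key are hypothetical stand-ins, not the Boost
class, and both halves sit in one file only so that it compiles on its
own. In a real build the definition would live in exactly one shared
library.

```cpp
// Hypothetical stand-in for a header-only class template with a
// static data member, similar in shape to call_stack::top_.
template <typename Key>
class call_stack_demo {
public:
    static int* top() { return &top_; }

private:
    static int top_;
};

// Member definition in the header: without further steps, every
// module that uses the template instantiates its own copy of top_.
template <typename Key>
int call_stack_demo<Key>::top_ = 0;

struct demo_key {};

// Seen by every consumer: an explicit instantiation declaration.
// It tells the compiler not to instantiate top_ in this module and
// to expect the definition from elsewhere.
extern template class call_stack_demo<demo_key>;

// Seen by exactly one translation unit (the shared library in a
// real build): the explicit instantiation definition that provides
// the single copy of top_.
template class call_stack_demo<demo_key>;

// Two hypothetical call sites, standing in for the main binary and
// a dlopen()'d plugin.
int* address_seen_by_main() { return call_stack_demo<demo_key>::top(); }
int* address_seen_by_plugin() { return call_stack_demo<demo_key>::top(); }
```

With a single definition, both call sites refer to the same object, so
a value written through one is visible through the other.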
We move the methods in StimulusBuffer whose addresses are taken during
program execution into a separate static lib. This guarantees that
they'll have their own, single vaddr at runtime, at least within
each independent code module.
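
A minimal sketch of why the out-of-line definition matters, assuming
a hypothetical class StimulusBufferDemo and method onFrameReady (not
the real StimulusBuffer interface). The header only declares the
method, one source file defines it, and every place that takes its
address within the module gets the same pointer.

```cpp
// --- hypothetical header, included by every user ---
class StimulusBufferDemo {
public:
    // Declared only. Because the body is not inline, including this
    // header never emits another copy of the function.
    void onFrameReady();

    int framesSeen() const { return framesSeen_; }

private:
    int framesSeen_ = 0;
};

using FrameCallback = void (StimulusBufferDemo::*)();

// --- hypothetical source file that would go into the static lib ---
// The one and only definition of the method.
void StimulusBufferDemo::onFrameReady() { ++framesSeen_; }

// --- two unrelated places that take the method's address ---
FrameCallback callbackTakenByProducer() {
    return &StimulusBufferDemo::onFrameReady;
}

FrameCallback callbackTakenByConsumer() {
    return &StimulusBufferDemo::onFrameReady;
}
```

Code that compares stored callbacks for equality, for example to
unregister one, relies on both sites yielding the same pointer.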