5cce473e01
This change is a bit pedantic, but since these vars aren't accessed in any hotpath, it's fine to be pedantic. We made these sh_ptrs atomic so we can use acquire and release side effects when loading and storing them. This doesn't eliminate the problem of seeing inconsistent state across microcontrollers, but it helps with simple accesses like these ones we already do.
102 lines
5.1 KiB
Plaintext
* Check through all managed objects and properly refcount them
  using shared_ptr.
* Comb through the current code and enforce the distinction
  between user errors and program exceptions.
* Investigate using UMONITOR/UMWAIT for spinlocks to reduce busy-waiting
  stress/power consumption. Look for a parallel on ARM.
* Investigate WFE/SEV to reduce busy-waiting in spinlocks on ARM.
* The input arg `requiredLocks` to LockSet::LockSet() should be
  passed by ref and not by value. Propagate this upward into
  SerializedAsyncContin and into all derived classes'
  constructors.
* In livoxProto1/device.cpp, migrate the registerUdpCommandHandler() calls
  from using the inProgress collection to the per-device collections.
* In cases where we use boost deadline_timers and pass in an async
  contin to preserve context across the delay, but the timers aren't
  part of a branch pattern, we may still need to call cancel() on them
  after they expire, just in case boost doesn't clean up the internal
  callable that we passed it. Otherwise we'll have circular sh_ptr
  references in our continuations.
* UdpCommandDemuxer::registerUdpCommandHandler should accept a pointer
  to the io_context of the thread it should post its callbacks to, and
  then post callbacks to those io_contexts when UDP cmd responses
  come in.
* Consider using huge pages (mmap() with MAP_HUGETLB, or madvise() with
  MADV_HUGEPAGE) for both PcloudStimBuff::StagingBuffer and the
  PcloudStimulusBuffer's ringbuff.
* We should probably call stream_descriptor::reset() after release()
  whenever we wish to release a desc without closing the underlying
  fd, because we've discovered that release() doesn't fully clean up
  internal metadata.
* There's a bug where deferred production timeslices can result in
  freezing. Explore this and figure out why. When we examined it,
  it didn't appear to be a spinlock-deadlock.
  It seems to be reliably reproducible when we use the NVIDIA GTX
  card as our OpenCL ComputeDevice, since the GTX card doesn't
  have unified memory with the host cpu complex. This causes the
  kernels to overrun their timeslices and triggers repeated
  timeslice deferrals.

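The circular sh_ptr concern in the deadline_timer item above can be reproduced without boost. The following is a minimal sketch under stated assumptions: `FakeTimer` and `Contin` are hypothetical stand-ins, and `FakeTimer` deliberately models the pessimistic case where the timer keeps its callable after expiry. It is not a claim about what boost actually does internally.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <utility>

// Hypothetical stand-in for a timer that keeps the callable it was given.
// This is NOT boost's API; it models the worst case the note worries about.
struct FakeTimer {
    std::function<void()> handler;

    void asyncWait(std::function<void()> h) {
        assert(!handler && "timer is already armed");
        handler = std::move(h);
    }

    // Simulate expiry: run the handler but keep the stored copy alive.
    void fire() {
        if (handler) handler();
    }

    // Drop the stored callable, releasing everything it captured.
    void cancel() { handler = nullptr; }
};

// Hypothetical continuation that owns its timer and hands that timer a
// callable capturing a shared_ptr back to itself: contin -> timer -> contin.
struct Contin : std::enable_shared_from_this<Contin> {
    FakeTimer timer;
    bool resumed = false;

    void arm() {
        std::shared_ptr<Contin> self = shared_from_this();
        timer.asyncWait([self] { self->resumed = true; });
    }
};
```

If a `Contin` is armed and fired and then every external sh_ptr to it is dropped, it stays alive through the callable stored in its own timer. Calling `timer.cancel()` after expiry drops that callable and lets the refcount reach zero.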
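For the two spinlock items above (UMONITOR/UMWAIT and WFE/SEV), one way to prepare is to route all busy-waiting through a single hint function, so the wait instruction can be swapped later. This is a sketch only: `cpuRelax` and `RelaxedSpinLock` are hypothetical names, and the hint used here is the plain PAUSE/YIELD instruction, not UMWAIT or WFE. Those would replace the body of `cpuRelax()` once investigated.

```cpp
#include <atomic>
#include <cassert>
#if defined(__x86_64__) || defined(__i386__)
#include <immintrin.h>
#endif

// Single place where the busy-wait hint lives (hypothetical helper).
inline void cpuRelax() noexcept {
#if defined(__x86_64__) || defined(__i386__)
    _mm_pause();                            // x86 PAUSE
#elif defined(__aarch64__)
    __asm__ volatile("yield" ::: "memory"); // ARM YIELD hint
#endif
    // Other targets: no hint, plain spinning.
}

// Test-and-test-and-set spinlock that spins on a relaxed load between
// acquisition attempts, calling cpuRelax() on each iteration.
class RelaxedSpinLock {
public:
    void lock() noexcept {
        for (;;) {
            if (!locked_.exchange(true, std::memory_order_acquire)) return;
            while (locked_.load(std::memory_order_relaxed)) cpuRelax();
        }
    }

    bool try_lock() noexcept {
        return !locked_.exchange(true, std::memory_order_acquire);
    }

    void unlock() noexcept {
        assert(locked_.load(std::memory_order_relaxed));
        locked_.store(false, std::memory_order_release);
    }

private:
    std::atomic<bool> locked_{false};
};
```

Because it exposes lock(), try_lock() and unlock(), the class works with std::lock_guard. Keeping the hint in one function means the spin loop itself does not have to change when a different wait instruction is tried.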
PcloudStimProducer::stop=>start() sequence:
  IoUringAssemblyEngine::finalize():
    I'm worried that calling PcloudStimProducer::stop() will leave
    in-flight sequences running, which will remain alive even after
    the PcloudStimProducer object itself has been destroyed. This may
    be possible for IoUringAssmEngn because it has a running timer
    which may well just time out.
    * There's no reason to think that an in-flight IoUringAssmEngn
      assembly operation won't actually run until it times out. In
      fact, that's the standard case if you configure
      nDgramsPerFrame to be large enough.
    * This means that when we call IoUringAssmEngn::finalize(), an
      in-flight assembly could be going on, which isn't receiving
      any CQE notifications on the eventFd. Thus, that in-flight
      assembly op could plausibly time out and resume execution
      after IoUringAssmEngn::finalize() has completed.
    * We ought to do a bridged async wait lasting the std::max()
      of all timeouts used by IoUringAssmEngn.

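The "wait for the std::max() of all timeouts" idea can be sketched as follows. Everything here is an assumption for illustration: `worstCaseTimeout` and `InFlightTracker` are hypothetical names, and the wait is a blocking condition-variable wait rather than the bridged async wait the note calls for. It only shows how the bound is computed and used.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <initializer_list>
#include <mutex>

using Millis = std::chrono::milliseconds;

// Longest of the engine's per-operation timeouts. Once this much time has
// passed, every in-flight op has either completed or hit its own timeout.
inline Millis worstCaseTimeout(std::initializer_list<Millis> timeouts) {
    assert(timeouts.size() > 0);
    return std::max(timeouts);
}

// Hypothetical tracker for in-flight operations.
class InFlightTracker {
public:
    void opStarted() {
        std::lock_guard<std::mutex> g(m_);
        ++count_;
    }

    void opFinished() {
        std::lock_guard<std::mutex> g(m_);
        assert(count_ > 0);
        if (--count_ == 0) cv_.notify_all();
    }

    // Blocks until nothing is in flight or until `limit` elapses.
    // Returns true only if everything drained within the limit.
    bool drain(Millis limit) {
        std::unique_lock<std::mutex> lk(m_);
        return cv_.wait_for(lk, limit, [this] { return count_ == 0; });
    }

private:
    std::mutex m_;
    std::condition_variable cv_;
    int count_ = 0;
};
```

A finalize() built on this would call `drain(worstCaseTimeout({...}))` with the engine's real timeout values and treat a `false` return as a sign that an op outlived its own timeout.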
  OpenClCollatingAndMeshingEngine::finalize():
    I'm also worried, though less so, about the OClCollMeshEngn: it's
    a lot less likely to have an in-flight op run past the point where
    the OClCollMeshEngn object has expired.
    * But there's still a chance that a long-running OCl kernel could
      cause an in-flight async contin to resume executing after its
      OClCollMeshEngn has expired.
    * We should do a bridged async wait for the std::max() of all
      timeouts used by OClCollMeshEngn to pass before leaving
      PcloudStimProducer::stop().

Attaching and detaching StimBuffs from StimProducers:
  We've written code recently to attach and detach stimBuffs from a
  stimProducer. The code is quite nice, but there's an omen hanging
  over it: we put no thought into ensuring that detachment doesn't
  cause an in-flight async production op to access invalid data.

  The in-flight async production ops use the SpMcRingbuffs that
  inhabit the stimbuffs. If we don't ensure that all in-flight
  async ops are retired before we detach a stimbuff from a
  producer, we could end up with the producer writing data into
  memory which has been reclaimed and repurposed.
  Similarly, if we're not careful about the order in which we
  assign the stimBuff pointers during attachment, we could
  potentially cause producers to see a partially initialized
  StimBuff object.

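The ordering concern during attachment (finish initializing, then assign the pointer) can be made explicit with a release store paired with an acquire load. This is a minimal sketch, assuming a raw atomic pointer and hypothetical `StimBuffSketch` and `ProducerSlot` types. It covers only the partially-initialized case; it does not by itself make detachment safe while ops are in flight.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a StimBuff and the ring buffer inside it.
struct StimBuffSketch {
    std::vector<float> ring;
    std::size_t capacity = 0;
};

// Hypothetical slot through which a producer finds its attached buffer.
class ProducerSlot {
public:
    // Attach: finish all initialization first, publish the pointer last.
    // The release store orders the writes above it before the publication.
    void attach(StimBuffSketch* buff, std::size_t capacity) {
        assert(buff != nullptr);
        buff->ring.assign(capacity, 0.0f);
        buff->capacity = capacity;
        current_.store(buff, std::memory_order_release);
    }

    // Detach: unpublish and hand the old pointer back. The caller must
    // still make sure no in-flight op is using it before reclaiming it.
    StimBuffSketch* detach() {
        return current_.exchange(nullptr, std::memory_order_acq_rel);
    }

    // Producer side: an acquire load that observes the pointer also
    // observes the initialization done before the release store.
    StimBuffSketch* current() const {
        return current_.load(std::memory_order_acquire);
    }

private:
    std::atomic<StimBuffSketch*> current_{nullptr};
};
```

A producer that reads a non-null pointer from `current()` is guaranteed to see the fully initialized ring. Whether that pointer is still valid later is the separate in-flight problem.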
  I think this can be solved without locking/synchronization
  by being very careful to ensure that by the time
  StimProducer::stop() exits, all in-flight production
  operations are reasonably sure to be halted. If all
  in-flight operations are halted, and if production ops
  cannot be launched while a StimBuff is being attached/
  detached, then we don't have to worry about accesses
  to stale StimBuff instance state or to partially
  initialized StimBuff instance state.

  So this problem is solved by dealing with the in-flight
  cancelation problem described above, concerning
  [IoUringAssmEngn|OClCollMeshEngn]::start/stop() and
  StimulusBuffer::start/stop(), and ensuring that after
  stop() has returned, we can be reasonably sure that all
  in-flight ops have exited.
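The two conditions above (no new production ops may launch during attach/detach, and stop() must know when in-flight ops have exited) can be sketched with two atomics and no mutex. `LaunchGate` and its method names are hypothetical, and the sketch relies on the default sequentially consistent ordering. It is one possible shape, not the project's mechanism.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical gate that every production op passes through.
class LaunchGate {
public:
    // Called at the start of each op. Increment first, then check the
    // flag, so a concurrent close() either refuses this op or sees it
    // counted as in flight. Returns false if the launch was refused.
    bool tryEnter() noexcept {
        inFlight_.fetch_add(1);
        if (closed_.load()) {
            inFlight_.fetch_sub(1);
            return false;
        }
        return true;
    }

    // Called when an op that entered successfully has fully exited.
    void leave() noexcept {
        int prev = inFlight_.fetch_sub(1);
        assert(prev > 0);
        (void)prev;
    }

    // stop() side: refuse new launches from now on.
    void close() noexcept { closed_.store(true); }

    // start() side: allow launches again.
    void open() noexcept { closed_.store(false); }

    // True once no op is in flight. After close(), a true result means
    // no op that entered earlier is still running.
    bool drained() const noexcept { return inFlight_.load() == 0; }

private:
    std::atomic<int> inFlight_{0};
    std::atomic<bool> closed_{false};
};
```

With this shape, stop() would call close() and then wait, bounded by the largest engine timeout, until drained() is true before any StimBuff is attached or detached. A refused tryEnter() briefly bumps the counter, so drained() can report false for a moment even when nothing is really running.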