Todo: Update
By solving the issues in finalize() for IoUringAssmEngn and OClCollMeshEngn, we've solved this as a side effect.
@@ -35,39 +35,3 @@
have unified memory with the host CPU complex. This causes the
kernels to overrun their timeslices and triggers repeated
timeslice deferrals.

PcloudStimProducer::stop=>start() sequence:

Attaching and detaching StimBuffs from StimProducers:
We've recently written code to attach and detach StimBuffs from a
StimProducer. The code is quite nice, but there's a cloud hanging
over it: we put no thought into ensuring that detachment doesn't
cause an in-flight async production op to access invalid data.

The in-flight async production ops use the SpMcRingbuffs that
inhabit the StimBuffs. If we don't ensure that all in-flight
async ops are retired before we detach a StimBuff from a
producer, we could end up with the producer writing data into
memory which has been reclaimed and repurposed.
Similarly, if we're not careful about the order in which we
assign the StimBuff pointers during attachment, we could
potentially cause producers to see a partially initialized
StimBuff object.

I think this can be solved without locking/synchronization
by being very careful to ensure that by the time
StimProducer::stop() exits, all in-flight production
operations are reasonably sure to be halted. If all
in-flight operations are halted, and if production ops
cannot be launched while a StimBuff is being attached or
detached, then we don't have to worry about accesses
to stale StimBuff instance state or to partially
initialized StimBuff instance state.

So this problem is solved by dealing with the in-flight
cancelation problem described above, concerning
[IoUringAssmEngn|OClCollMeshEngn]::start/stop() and
StimulusBuffer::start/stop(), and ensuring that after
stop() has returned, we can be reasonably sure that all
in-flight ops have exited.
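
A rough sketch of the invariant argued for above: once stop() has returned, nothing is in flight, and attach/detach are only allowed in that state. Everything here is hypothetical and for illustration only. SketchProducer, produceOnce() and the cut-down StimBuff struct are made-up names, not the project's real classes, and the atomic in-flight counter is an assumption about how "reasonably sure to be halted" could be enforced; the note itself only states the goal. The sketch also assumes one control thread calling attach/detach/start/stop and a single worker thread calling produceOnce().

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical stand-in for the real StimBuff; only a counter here.
struct StimBuff { int samples = 0; };

class SketchProducer {
public:
    // Attach is refused unless the producer is stopped and drained, so no
    // production op can ever observe a half-assigned buffer pointer.
    bool attach(StimBuff* b) {
        if (!quiescent()) return false;
        buff_ = b;
        return true;
    }

    // Detach is refused under the same condition, so no op can still be
    // writing into the buffer once it has been handed back to the caller.
    StimBuff* detach() {
        if (!quiescent()) return nullptr;
        StimBuff* old = buff_;
        buff_ = nullptr;
        return old;
    }

    void start() { running_.store(true); }

    // Returns only after every op that was admitted has finished.
    void stop() {
        running_.store(false);
        while (inFlight_.load() != 0) std::this_thread::yield();
    }

    // One production op. Returns false if the producer is stopped.
    bool produceOnce() {
        inFlight_.fetch_add(1);          // announce the op first...
        if (!running_.load()) {          // ...then check for a stop request
            inFlight_.fetch_sub(1);
            return false;
        }
        if (buff_ != nullptr) ++buff_->samples;
        int prev = inFlight_.fetch_sub(1);
        assert(prev > 0);                // counter must never underflow
        (void)prev;
        return true;
    }

private:
    bool quiescent() const { return !running_.load() && inFlight_.load() == 0; }

    std::atomic<bool> running_{false};
    std::atomic<int> inFlight_{0};
    StimBuff* buff_ = nullptr;           // only written while quiescent
};
```

The op increments the counter before it checks the running flag, and stop() clears the flag before it reads the counter. With the default sequentially consistent atomics, that ordering means an op is either seen by stop() and waited for, or it sees the cleared flag and backs out without touching the buffer.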