* Audit all managed objects and make sure each one is properly
  refcounted using shared_ptr.
* Comb through the current code and enforce the distinction between
  user errors and program exceptions.
* Investigate using UMONITOR/UMWAIT in spinlocks on x86 to reduce
  busy-waiting stress and power consumption.
* Investigate WFE/SEV as the ARM parallel, to reduce busy-waiting in
  spinlocks there.
* The input arg `requiredLocks` to LockSet::LockSet() should be passed
  by reference rather than by value. Propagate this change upward into
  SerializedAsyncContin and into the constructors of all derived
  classes.
* Try changing the type of LockerAndInvokerBase::serializedContinuationVaddr
  to be a reference instead of a pointer.
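
For the two spinlock items above, a minimal sketch of where an
architecture-specific wait would slot in. Everything here is
hypothetical and not taken from this codebase: the `SpinLock` class and
the `cpuRelax()` hook are illustrative names. The hook uses the widely
available PAUSE (x86) and YIELD (AArch64) hints as stand-ins; it does
not use UMONITOR/UMWAIT or WFE/SEV, which need CPU feature checks and,
for WFE, a matching wake-up on the unlock path.

```cpp
#include <atomic>
#include <thread>
#if defined(__x86_64__) || defined(__i386__)
#include <immintrin.h>
#endif

// Hypothetical hook: the single place where a smarter wait could be
// swapped in. On x86 with WAITPKG this is where UMONITOR/UMWAIT would be
// tried; on ARM, WFE (with SEV or an equivalent wake-up in unlock()).
// Both are assumptions to investigate, not something this sketch does.
inline void cpuRelax() noexcept
{
#if defined(__x86_64__) || defined(__i386__)
    _mm_pause();                          // spin-wait hint (PAUSE)
#elif defined(__aarch64__)
    __asm__ volatile("yield" ::: "memory"); // spin-wait hint (YIELD)
#else
    std::this_thread::yield();            // portable fallback
#endif
}

// Test-and-test-and-set spinlock: attempt the atomic exchange, and on
// failure spin on a plain load (relaxing the CPU) until the lock looks
// free before trying the exchange again.
class SpinLock
{
public:
    void lock() noexcept
    {
        for (;;) {
            if (!locked_.exchange(true, std::memory_order_acquire)) {
                return;
            }
            while (locked_.load(std::memory_order_relaxed)) {
                cpuRelax();
            }
        }
    }

    bool try_lock() noexcept
    {
        // Cheap load first so a held lock is not hit with a write.
        return !locked_.load(std::memory_order_relaxed)
            && !locked_.exchange(true, std::memory_order_acquire);
    }

    void unlock() noexcept
    {
        locked_.store(false, std::memory_order_release);
    }

private:
    std::atomic<bool> locked_{false};
};
```

Because the class exposes lock(), try_lock() and unlock(), it can be
used with std::lock_guard. Keeping the wait behind one hook means the
UMWAIT and WFE experiments can be benchmarked without touching callers.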
