We haven't ported everything. Just the top-level methods. We'll
dig in to the leaf stuff later. Surprisingly, this all went without
any real difficulties.
Runs like a charm on first try.
We added a new centralized OpenCL Compute manager. This can later
be extended to support CUDA, SyCL, etc. SMO can be configured at
build time to choose which API it will use for compute.
Moreover, the ComputeMgr allows us to register buffers which are
available to all cl_contexts.