Anything that lets us squeeze more performance out of our computers while writing more elegant code at the same time deserves more attention.
I wonder if any of the techniques used in Halide can be (somewhat) generalized to domains outside of image processing?
EDIT: Ah, I see that the paper basically answers my question, as it sees applications in ML contexts. Also, that new demosaicking algorithm looks great; I wonder if it will make its way into darktable any time soon?
The paper's page also has presentation slides which (to me at least) feel a bit more accessible. Hope the accompanying presentation will be on YouTube soon.
- TOPI, a library of optimized routines for common deep learning operations
- Importers for TensorFlow, ONNX (PyTorch/Caffe2), Keras, MXNet, DarkNet and other frameworks.
- AutoTVM for automatically finding fast schedules for arbitrary devices, much faster than random search or the automatic schedules generated by e.g. Halide.
- Various mobile GPU runtimes (OpenGL, OpenCL, Vulkan, etc.), compared to Halide
- Large community contributing optimized runtimes (e.g. Intel + Amazon contributing CPU improvements, improved schedules/declarations, etc)
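On the AutoTVM point: at its core, schedule autotuning is a search over a space of schedule parameters (tile sizes, unroll factors, thread bindings), scored by measured or predicted runtime. The sketch below is my own toy illustration of that idea, not the real AutoTVM API; the cost function and tile-size search space are made up for the example, whereas AutoTVM guides the search with a learned cost model over actual measurements.

```python
import random

# Toy, hypothetical cost model for a single tile-size knob: reward cache
# reuse up to a pretend 64-element cache, penalize loop overhead for tiny
# tiles and "spill" for tiles past the cache capacity. (Invented numbers,
# purely for illustration.)
def cost(tile):
    reuse = min(tile, 64) / 64        # fraction of cache usefully filled
    overhead = 1 / tile               # per-iteration loop overhead
    spill = max(0, tile - 64) * 0.05  # penalty past cache capacity
    return 1 - reuse + overhead + spill

def random_search(candidates, trials, rng):
    # Baseline strategy: sample a few configs and keep the cheapest.
    return min(rng.sample(candidates, trials), key=cost)

candidates = [2 ** k for k in range(1, 10)]  # tile sizes 2..512
rng = random.Random(0)
best = random_search(candidates, trials=5, rng=rng)
print(best, round(cost(best), 4))
```

Under this toy model the optimum is a tile of 64 (fills the pretend cache exactly, no spill); a guided search like AutoTVM's aims to find such points with far fewer measured trials than random sampling.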
But TVM indeed includes more components than just its language and compiler—in particular, the authors built a library of standard deep learning ops (TOPI) and bridges to other deep learning frameworks (TF, ONNX, etc.). These were TVM's big additions at the start, but they could also be built as libraries on top of Halide.
All the remaining points are true for both systems, and have been true for Halide longer than TVM has existed:
- All of the autoschedulers are still highly imperfect, but the Halide autoscheduler does a reasonable job optimizing a different and wider range of operations than AutoTVM, which focuses on tensor contraction-type operations and small local fusions of those with surrounding elementwise computations. Neither is magic, and both are major areas of future work for their respective systems.
- Halide has GPU backends for every target mentioned, as well as Metal, D3D12, CUDA. (And a huge pile of CPU/SIMD and DSP targets.)
- Halide has many full-time developers across Google, Facebook, Adobe, Intel, Qualcomm, and elsewhere, and hundreds of production users who don't just use it under the hood of an ML framework, but actually write code directly in the language.
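To make the fusion point above concrete: the "small local fusions" AutoTVM focuses on are things like folding an elementwise op into the loop that produces a tensor contraction, so the intermediate is never materialized. This is my own plain-Python illustration of that transformation (not Halide or TVM code), using a matrix-vector product followed by ReLU:

```python
# Unfused: two passes over the data, materializing the intermediate y.
def matvec_relu_unfused(A, x):
    y = [sum(a * b for a, b in zip(row, x)) for row in A]
    return [max(0.0, v) for v in y]

# Fused: ReLU is applied as each output element is produced, so the
# intermediate vector never exists. Same result, one pass.
def matvec_relu_fused(A, x):
    return [max(0.0, sum(a * b for a, b in zip(row, x))) for row in A]

A = [[1.0, -2.0], [3.0, 4.0]]
x = [1.0, 1.0]
print(matvec_relu_unfused(A, x))
```

In Halide or TVM terms, the second version is what you get by scheduling the elementwise stage to compute inline with (or inside the loop nest of) the contraction; the autoschedulers differ in how broadly they search beyond patterns like this one.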
(And the qualities of the IRs are clearly a subjective opinion.)
(BTW, based on your user-name: do you happen to be Andrew Adams?)
There are no good or bad choices of IR, and both Halide and TVM make reasonable technical decisions to fit their use cases.
Halide’s new differentiable programming support is a great feature that TVM has not yet supported. We anticipate it will have awesome use cases.
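For readers unfamiliar with what "differentiable programming support" means here: the compiler derives gradient code for a pipeline automatically via reverse-mode autodiff. The sketch below is a minimal scalar illustration of reverse mode in the micrograd style — my own toy, nothing like Halide's actual implementation, which differentiates whole image pipelines and schedules the gradient code too:

```python
class Value:
    """A scalar that records how it was computed, for reverse-mode AD."""

    def __init__(self, data, parents=()):
        self.data = data
        self.grad = 0.0
        self._parents = parents  # (parent_node, local_gradient) pairs

    def __add__(self, other):
        return Value(self.data + other.data, ((self, 1.0), (other, 1.0)))

    def __mul__(self, other):
        return Value(self.data * other.data,
                     ((self, other.data), (other, self.data)))

    def backward(self):
        # Topologically sort so each node's grad is fully accumulated
        # before being propagated to its parents.
        order, seen = [], set()
        def visit(node):
            if id(node) not in seen:
                seen.add(id(node))
                for parent, _ in node._parents:
                    visit(parent)
                order.append(node)
        visit(self)
        self.grad = 1.0
        for node in reversed(order):
            for parent, local in node._parents:
                parent.grad += local * node.grad

x = Value(3.0)
y = Value(4.0)
z = x * y + x      # dz/dx = y + 1 = 5, dz/dy = x = 3
z.backward()
print(x.grad, y.grad)
```

The value for image processing is that hand-deriving and hand-optimizing gradients of a pipeline (e.g. for tuning filter parameters against a loss) is tedious and error-prone; having the compiler do it is what makes the feature exciting.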