What are some exciting things that are happening in the #ML #DataScience world that we are not able to hear over the din of LLMs?
I notice that Cynthia Rudin is continuing to produce great stuff on explainable AI.
What else is going on that is not GPT/Diffusion/MultiModal?
- 3D scene reconstruction from a few images: https://dust3r.europe.naverlabs.com/
- Gaussian avatars: https://shenhanqian.github.io/gaussian-avatars
- relightable Gaussian codec: https://shunsukesaito.github.io/rgca/
- track anything: https://co-tracker.github.io/ https://omnimotion.github.io/
- segment anything: https://github.com/facebookresearch/segment-anything
- good human pose estimation models (YOLOv8, Google's MediaPipe models)
- realistic TTS: https://huggingface.co/coqui/XTTS-v2, Bark TTS (hit or miss)
- great open STT (mostly Whisper-based)
- machine translation (e.g. SeamlessM4T from Meta)
It's crazy to see how much is coming out of Meta's R&D alone.