Hi HN!
I've built TRELLIS 3D AI, an open tool that converts 2D images into 3D assets.
Technical Implementation:
- Browser-based processing using WebGL
- Structured LATents (SLAT) for maintaining geometric integrity
- Dual output pipeline generating both GLB and 3D Gaussian formats
- Real-time 3D preview using Three.js
- Optimized for processing images up to 2048x2048px
The core technology combines:
- Pretrained vision encoders for image understanding
- Rectified flow transformers for 3D geometry generation
- Advanced neural networks for texture mapping
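For readers new to rectified flow, here is a minimal sketch of the sampling idea only, not the code behind this tool. It assumes the common convention where t=0 is noise and t=1 is data, and it swaps the trained transformer for a hypothetical toy velocity field (`make_toy_velocity`) that is exact when the "dataset" is a single target point. All names here are illustrative.

```python
import random
from typing import Callable, List

Vector = List[float]


def euler_sample(
    velocity: Callable[[Vector, float], Vector],
    x0: Vector,
    steps: int = 25,
) -> Vector:
    """Integrate dx/dt = velocity(x, t) from t=0 to t=1 with fixed-step Euler."""
    x = list(x0)
    dt = 1.0 / steps
    for k in range(steps):
        t = k * dt  # t stays strictly below 1 inside the loop
        v = velocity(x, t)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


def make_toy_velocity(target: Vector) -> Callable[[Vector, float], Vector]:
    """Hypothetical stand-in for a trained network.

    When every straight path ends at the same point, the velocity
    at (x, t) is (target - x) / (1 - t).
    """
    def velocity(x: Vector, t: float) -> Vector:
        scale = 1.0 / (1.0 - t)
        return [(ti - xi) * scale for ti, xi in zip(target, x)]

    return velocity


# Start from Gaussian noise and follow the flow toward the target.
target = [1.0, -2.0, 0.5]
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in target]
sample = euler_sample(make_toy_velocity(target), noise, steps=25)
```

In a real model the velocity would come from a network conditioned on the image features, and the state would be a latent rather than a 3-vector. The appeal of rectified flow is that the learned paths are close to straight, so a small number of Euler steps can be enough.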
Current Specs:
- Processing time: ~30s per image
- Output formats: GLB + 3D Gaussian
- Zero external dependencies
- Runs entirely client-side
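If you want to sanity-check a downloaded GLB file, the sketch below reads the binary glTF 2.0 container layout: a 12-byte header (magic, version, total length) followed by a JSON chunk. It is not part of the tool; `read_glb_json` and `build_minimal_glb` are hypothetical helper names, and the second one exists only to produce a tiny test file.

```python
import json
import struct

GLB_MAGIC = 0x46546C67   # ASCII "glTF" read as a little-endian uint32
CHUNK_JSON = 0x4E4F534A  # ASCII "JSON" read as a little-endian uint32


def read_glb_json(data: bytes) -> dict:
    """Validate the GLB header and return the parsed JSON chunk."""
    if len(data) < 20:
        raise ValueError("too short to be a GLB file")
    magic, version, length = struct.unpack_from("<III", data, 0)
    if magic != GLB_MAGIC:
        raise ValueError("missing glTF magic")
    if version != 2:
        raise ValueError(f"unsupported GLB version: {version}")
    if length != len(data):
        raise ValueError("declared length does not match file size")
    # The first chunk must be the JSON scene description.
    chunk_len, chunk_type = struct.unpack_from("<II", data, 12)
    if chunk_type != CHUNK_JSON:
        raise ValueError("first chunk is not JSON")
    if 20 + chunk_len > len(data):
        raise ValueError("JSON chunk runs past end of file")
    return json.loads(data[20:20 + chunk_len].decode("utf-8"))


def build_minimal_glb(gltf: dict) -> bytes:
    """Pack a glTF JSON document into a GLB with no binary chunk."""
    payload = json.dumps(gltf).encode("utf-8")
    payload += b" " * (-len(payload) % 4)  # JSON chunk is space-padded to 4 bytes
    header = struct.pack("<III", GLB_MAGIC, 2, 12 + 8 + len(payload))
    chunk = struct.pack("<II", len(payload), CHUNK_JSON) + payload
    return header + chunk


glb = build_minimal_glb({"asset": {"version": "2.0"}})
info = read_glb_json(glb)
```

For an exported asset you would pass the file's bytes to `read_glb_json` and inspect keys such as `meshes` and `materials` in the result.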
Processing currently runs on Hugging Face; I plan to add API integration later to enable more advanced features.
Live demo: https://trellis3d.co/
Would love your feedback and suggestions for improvements.
Can you share more about your process? Did you train the generation models yourself? If so, can you talk a bit about how you did it?