Going by DALL-E 3, it won't be anywhere near competitive while they have the superior ground. Generating a high-quality 1024x1024 image costs around $0.002 elsewhere, versus $0.08 on DALL-E 3 (40x more expensive per image). Video has much higher computational needs (every frame has to be temporally consistent, so you need huge GPUs to serve it), so I'm expecting this to be much more expensive than its competitors (Pika or SVD1.1).
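
For anyone who wants to sanity-check the per-image gap, here is a quick sketch using only the two prices quoted above. The variable and function names are mine, and the 1,000-image batch is just an illustrative size, not a figure from any pricing page.

```python
# Per-image prices in USD, taken from the comment above (quoted figures, not measurements).
cheap_price = 0.002   # the ~$0.002 per 1024x1024 image figure
dalle3_price = 0.08   # the DALL-E 3 per-image figure


def price_ratio(expensive: float, cheap: float) -> float:
    """Return how many times more expensive one per-image price is than the other."""
    return expensive / cheap


def batch_cost(n_images: int, price_per_image: float) -> float:
    """Return the total cost of generating n_images at a flat per-image price."""
    return n_images * price_per_image


ratio = price_ratio(dalle3_price, cheap_price)
print(f"DALL-E 3 per-image price ratio: about {ratio:.0f}x")

# Hypothetical batch size, chosen only to make the gap concrete.
n = 1000
print(f"{n} images: ${batch_cost(n, cheap_price):.2f} vs ${batch_cost(n, dalle3_price):.2f}")
```

This only covers still images; it says nothing about video, where per-frame cost and temporal consistency change the math.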