In this blog post I’ll show how to run the Phi-3-vision model, a multimodal model that supports text and image inputs, in a similar fashion with .NET, using version 0.3.0-rc2 of ONNX Runtime GenAI. The code is based on the Phi-3 vision tutorial and phi3v.py.