It's impressive how the MCP template-search example (https://docs.vlm.run/mcp/examples/template-search) retains visual context across multiple images and tool calls. Unlike most chat interfaces, it enables seamless multi-step reasoning—like finding a logo in one image and tracking it in another—without losing state. This makes it ideal for building stateful, iterative visual workflows.
Basically, there's no model-and-schema combination that works out of the box. If you prompt an open-source model with the schema, it doesn't produce results in the expected format. The main contribution is making these models conform to your specific needs and return output in a structured format.
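To make the failure mode concrete, here's a minimal sketch of the naive "schema in the prompt" approach. It assumes an open-source VLM (Qwen2.5-VL is used as a placeholder) served behind an OpenAI-compatible endpoint; the server URL, model name, image URL, and `Invoice` schema are all illustrative, not from the docs:

```python
# Naive approach: paste the JSON schema into the prompt and hope the model complies.
# With many open-source VLMs the reply comes back as prose or malformed JSON,
# so the parse step below can fail.
import json

from openai import OpenAI
from pydantic import BaseModel, ValidationError


class Invoice(BaseModel):
    vendor: str
    total: float


# Hypothetical local OpenAI-compatible server (e.g. vLLM).
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

schema = json.dumps(Invoice.model_json_schema())
response = client.chat.completions.create(
    model="Qwen/Qwen2.5-VL-7B-Instruct",
    messages=[{
        "role": "user",
        "content": [
            {"type": "image_url", "image_url": {"url": "https://example.com/invoice.png"}},
            {"type": "text", "text": f"Extract the invoice as JSON matching this schema:\n{schema}"},
        ],
    }],
)

try:
    invoice = Invoice.model_validate_json(response.choices[0].message.content)
except ValidationError:
    # The failure mode described above: the reply doesn't conform to the schema.
    print("Model reply did not match the schema:", response.choices[0].message.content)
```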
Wait, but we're doing that already, and it works well (Qwen 2.5 VL)? If need be, you can always resort to structured generation to enforce schema conformity?
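For example, here's a rough sketch of the structured-generation fallback, assuming Qwen 2.5 VL is served through vLLM's OpenAI-compatible endpoint (e.g. `vllm serve Qwen/Qwen2.5-VL-7B-Instruct`) and that the vLLM version supports the `guided_json` extra parameter; the schema, image URL, and prompt are placeholders:

```python
# Sketch: constrained decoding forces the output tokens to match the schema,
# so the reply is guaranteed to parse.
from openai import OpenAI
from pydantic import BaseModel


class LogoDetection(BaseModel):
    brand: str
    present: bool
    bounding_box: list[int]  # [x_min, y_min, x_max, y_max]


client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="Qwen/Qwen2.5-VL-7B-Instruct",
    messages=[{
        "role": "user",
        "content": [
            {"type": "image_url", "image_url": {"url": "https://example.com/store.jpg"}},
            {"type": "text", "text": "Find the Acme logo and return its bounding box."},
        ],
    }],
    # vLLM's guided decoding constrains generation to the JSON schema.
    extra_body={"guided_json": LogoDetection.model_json_schema()},
)

result = LogoDetection.model_validate_json(response.choices[0].message.content)
print(result)
```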