Hacker News
Tell HN: OpenAI still has a moat, it's called function calling and its API
2 points by behnamoh 70 days ago | 4 comments
Gemini 1.0 Ultra and Gemini 1.5 Advanced are all the rage right now, but I wanted to say that in production, you're not going to use vanilla free-form-text LLMs; you're going to use LLMs that can reliably generate JSON. OpenAI knew this months ago and built this feature into their models (not as an afterthought like grammars in local models). I still haven't seen any other model get close to GPT-4-turbo's ability to call functions and return JSON output. If I'm building an app using LLMs, right now it's an obvious decision to choose GPT models, not Gemini or local models (I love local models and I'm a contributor to some OSS projects, but function calling is just not there yet).
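For readers unfamiliar with the feature: OpenAI-style function calling works by passing JSON Schema tool definitions alongside the chat messages, and the model replies with a structured arguments object instead of free text. A minimal sketch of the payload shape (the `get_weather` function here is a hypothetical example, not something from this thread):

```python
import json

# OpenAI-style tool definition: a JSON Schema describing the arguments
# the model must produce when it decides to call the function.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical example function
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string"},
                "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
            },
            "required": ["city"],
        },
    },
}]

# Request body sent to the chat completions endpoint.
payload = {
    "model": "gpt-4-turbo",
    "messages": [{"role": "user", "content": "Weather in Paris?"}],
    "tools": tools,
}

# The model's tool call comes back as a JSON string of arguments, e.g.:
raw_arguments = '{"city": "Paris", "unit": "celsius"}'
args = json.loads(raw_arguments)  # reliably parseable, unlike free-form prose
```

The value for production use is that `args` is machine-readable every time, rather than JSON-ish text you have to regex out of a paragraph.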

Another thing people often neglect is that OpenAI basically owns the API design right now. The fact that there are several drop-in replacements for their API shows how much apps depend on the specific structure of their API. For comparison, this is Gemini's REST API usage:

```
echo '{
  "contents": [{
    "parts": [
      {"text": "What is this picture?"},
      {"inline_data": {
        "mime_type": "image/jpeg",
        "data": "'$(base64 -w0 image.jpg)'"
      }}
    ]
  }]
}' > request.json

curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-pro-vision:generateContent?key=${API_KEY}" \
  -H 'Content-Type: application/json' \
  -d @request.json 2> /dev/null | grep "text"
```

And here's another one:

```
curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-pro:generateContent?key=$API_KEY" \
  -H 'Content-Type: application/json' \
  -X POST \
  -d '{
    "contents": [{
      "parts": [{"text": "Write a story about a magic backpack."}]
    }],
    "safetySettings": [{
      "category": "HARM_CATEGORY_DANGEROUS_CONTENT",
      "threshold": "BLOCK_ONLY_HIGH"
    }],
    "generationConfig": {
      "stopSequences": ["Title"],
      "temperature": 1.0,
      "maxOutputTokens": 800,
      "topP": 0.8,
      "topK": 10
    }
  }' 2> /dev/null | grep "text"
```
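For contrast, the OpenAI-style body that the many drop-in-compatible servers imitate is flatter: a `messages` list with roles, rather than Gemini's nested `contents` → `parts` structure. A side-by-side sketch of the two request shapes (illustrative only, trimmed to the essentials):

```python
# Gemini-style body (nested contents -> parts), as in the curl above.
gemini_body = {
    "contents": [{"parts": [{"text": "Write a story about a magic backpack."}]}],
    "generationConfig": {"temperature": 1.0, "maxOutputTokens": 800},
}

# OpenAI-style body that compatible servers reproduce field-for-field.
openai_body = {
    "model": "gpt-4-turbo",
    "messages": [{"role": "user", "content": "Write a story about a magic backpack."}],
    "temperature": 1.0,
    "max_tokens": 800,
}

# Client code hard-codes these access paths, which is why switching
# providers means rewriting every call site (or using a shim).
gemini_text = gemini_body["contents"][0]["parts"][0]["text"]
openai_text = openai_body["messages"][0]["content"]
```

The divergent access paths are the point: any app written against one shape needs a translation layer to talk to the other.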




> Another thing people often neglect is that OpenAI basically owns the API design right now.

An API schema is not a moat if it can be easily copied, which is exactly what has happened. Companies also cannot copyright/patent an API schema (in most cases, anyway). The proliferation of the OpenAI API schema is the exact opposite of a moat.

It's trivial to come up with another REST API implementation for LLMs if necessary; there's just no reason to right now, given how LLM I/O works.
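The comment's point is that LLM I/O reduces to "messages in, message out", so any REST schema around it is interchangeable plumbing. A sketch of how little a minimal alternative schema would need (hypothetical, not any real API):

```python
from dataclasses import dataclass


@dataclass
class Message:
    role: str  # "user", "assistant", or "system"
    text: str


@dataclass
class ChatRequest:
    model: str
    history: list  # list[Message]


@dataclass
class ChatResponse:
    reply: Message


# Every LLM server ultimately implements a function of this shape;
# the stub below just echoes the last user message for illustration.
def serve(req: ChatRequest) -> ChatResponse:
    last = req.history[-1].text
    return ChatResponse(reply=Message(role="assistant", text=f"echo: {last}"))


resp = serve(ChatRequest(model="any", history=[Message("user", "hi")]))
```

Three tiny types cover the whole request/response cycle, which is why a new provider can either invent its own schema or clone OpenAI's in an afternoon.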


Gemini Pro appears to support function calling: https://cloud.google.com/vertex-ai/docs/generative-ai/multim...



@dang Can we please have decent code block support on HN?



