Hacker News | mdagostino's comments

I built a POC of the openai-realtime-console (https://github.com/openai/openai-realtime-console) in Streamlit for python folks like me who are terrible at javascript.

It's useful for poking around the events and the event payloads.

Next I'll integrate audio playback, then audio capture. Decoding the audio from the events is easy; making streaming audio playback work with Streamlit is not.
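For reference, decoding the audio is roughly this: the realtime API's audio delta events carry base64-encoded 16-bit little-endian PCM. A minimal sketch (the `fake` event below is a made-up stand-in for a real event payload):

```python
import base64
import numpy as np

def decode_audio_delta(event: dict) -> np.ndarray:
    """Decode a base64 PCM16 payload from a realtime audio event.

    Assumes the event carries base64-encoded 16-bit little-endian PCM
    in a "delta" field, as the realtime API's audio delta events do.
    """
    raw = base64.b64decode(event["delta"])
    # Interpret the bytes as signed 16-bit samples, scale to [-1.0, 1.0]
    return np.frombuffer(raw, dtype="<i2").astype(np.float32) / 32768.0

# Fake event carrying two samples: 0 and the max positive int16
fake = {"delta": base64.b64encode(
    np.array([0, 32767], dtype="<i2").tobytes()).decode()}
print(decode_audio_delta(fake))  # array of floats near [0.0, 1.0]
```

The resulting float array can be handed to any audio sink that accepts raw PCM; the hard part with Streamlit is feeding it incrementally rather than as one finished clip.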


LLaMA is non-commercial


This doesn’t make any sense to me: how does fine-tuning the embeddings save money? It seemed like the problem was having to make too many API calls to generate the embeddings in the first place.


Embeddings are often used as features alongside these LLMs, so before, they were paying to generate embeddings and to run inference with a large model. Now they pay to generate embeddings, fine-tune them, and do semantic search (probably approximate k-nearest neighbors). The hardware requirements of most LLMs make them much more expensive than approximate KNN against a vector database.
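To make the cost comparison concrete, the search step is just a similarity lookup over stored vectors. A toy exact-KNN sketch with NumPy (real systems would swap in an approximate index like FAISS or HNSW behind the same interface; the data here is random):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy index: 1000 document embeddings, 64 dims, L2-normalized
docs = rng.normal(size=(1000, 64)).astype(np.float32)
docs /= np.linalg.norm(docs, axis=1, keepdims=True)

def knn(query: np.ndarray, index: np.ndarray, k: int = 5) -> np.ndarray:
    """Exact cosine-similarity k-NN over unit-norm rows."""
    q = query / np.linalg.norm(query)
    scores = index @ q          # cosine similarity, since rows are unit norm
    return np.argsort(-scores)[:k]

query = rng.normal(size=64).astype(np.float32)
top = knn(query, docs)
print(top)  # indices of the 5 most similar documents
```

A matrix-vector product over a million stored vectors is trivially cheap next to a forward pass through a multi-billion-parameter model, which is the whole point of the switch.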


They went from indexing with embeddings + LLM to just using a biased embedding for their use case. This should save them most of their costs.


Maybe this helps people understand what they are doing at index time.

* Version 1. Ask the LLM to describe the code snippet. Create an embedding of the description. LLM generation + embeddings required.

* Version 2. Run the code snippet directly through the embedding API, skipping the LLM text-generation step. Then run the resulting embedding through the bias matrix and index it.
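The two index-time pipelines above can be sketched roughly like this. The `embed` and `llm_describe` functions are hypothetical stand-ins for whatever API calls they actually make, and `W` is a placeholder for the fine-tuned bias matrix:

```python
import numpy as np

DIM = 64
W = np.eye(DIM, dtype=np.float32)  # placeholder for the fine-tuned bias matrix

def embed(text: str) -> np.ndarray:
    """Stand-in for an embedding API call (deterministic fake)."""
    rng = np.random.default_rng(sum(text.encode()) % 2**32)
    v = rng.normal(size=DIM).astype(np.float32)
    return v / np.linalg.norm(v)

def llm_describe(code: str) -> str:
    """Stand-in for an expensive LLM call that summarizes code."""
    return f"description of: {code}"

def index_v1(code: str) -> np.ndarray:
    # Version 1: LLM generation + embedding per snippet
    return embed(llm_describe(code))

def index_v2(code: str) -> np.ndarray:
    # Version 2: embed the code directly, then apply the bias matrix
    return W @ embed(code)

vec = index_v2("def add(a, b): return a + b")
print(vec.shape)  # (64,)
```

V2 drops the per-snippet generation call entirely; the only remaining per-snippet cost is one embedding call plus a cheap matrix multiply.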

I assume this only works because they fine-tuned a bias matrix on code-snippet/text pairs. Feels more like a light version of transfer learning to me.
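Purely as speculation about how such a matrix could be fit (the article doesn't say; a contrastive objective is just as plausible), one simple version is a closed-form least-squares map from code embeddings to their paired text embeddings, shown here on random stand-in data:

```python
import numpy as np

rng = np.random.default_rng(1)
DIM, N = 64, 500

# Hypothetical paired data: code_emb[i] should land near text_emb[i]
code_emb = rng.normal(size=(N, DIM)).astype(np.float32)
text_emb = rng.normal(size=(N, DIM)).astype(np.float32)

# Least squares: find X minimizing ||code_emb @ X - text_emb||^2
X, *_ = np.linalg.lstsq(code_emb, text_emb, rcond=None)
W = X.T  # so that W @ code_vector approximates the paired text embedding

mapped = code_emb @ W.T
err = np.mean((mapped - text_emb) ** 2)
print(err)  # training MSE; with random data this mainly checks the shapes
```

At query time you'd embed the search text as usual and compare it against the `W`-mapped code embeddings, which is what makes "light transfer learning" a fair description: only the linear map is trained, not the embedding model.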

The article was a little unclear in the actual approach for V2 so if I have anything wrong please correct me.


I wouldn't say most; maybe a factor of 2. Getting the embedding is still an API call to an LLM.


I'm pretty sure they were using a high-cost LLM to summarize, and for embeddings you only need Ada, which is orders of magnitude cheaper.


Civis Analytics - Chicago, IL. We spun out of the analytics shop of the Obama 2012 re-election campaign to tackle really hard data science problems for campaigns, non-profits, and companies. We're looking for both junior and senior software engineers and data scientists: http://www.civisanalytics.com/apply


