Show HN: Wyvern – Real-time machine learning platform for marketplaces

Hello HN. I’m Shu, here with my best friend Suchintan. We’re the creators of Wyvern (https://github.com/Wyvern-AI/wyvern), a real-time machine learning framework to help marketplaces adopt ML earlier in their life.

Wyvern is an end-to-end platform that helps marketplaces build and scale custom machine learning pipelines, with the goal of enabling data scientists to own the full machine learning stack. It bundles:

1. A feature store to store and serve features
2. A model service to serve ML models
3. A feature and model observability platform
4. An orchestration framework for ML pipelines

We built it while being mindful of the following constraints:

1. Pipeline evaluation in < 200ms to optimize user experience [1]
2. Minimizing train / test skew for model training

Suchintan built the ML Platform at Faire and Gopuff to improve their Search and Discovery experience. At both places, the platform became an engine that empowered the data team to independently deliver new models to production, generating over $100M of impact.

Small marketplaces usually buy solutions like Algolia and AWS Personalize for Search and Discovery. Once a marketplace grows to a certain size (usually >$100M GMV), it hits limitations around how many custom (“long-tail”) insights it can incorporate into these solutions.

Long-tail insights may come in the form of “features” that improve model accuracy, or changes in model objectives that better optimize for underlying business objectives:

- B2B marketplaces like Faire can segment users by what they sell through their storefront, by asking them to categorize themselves at sign-up, and feed that into machine learning models to cold-start personalization for new users.
- Margin-sensitive companies like Gopuff can build a function to predict the expected revenue of showing a product to a user, and order the results accordingly. At Gopuff this was composed of several ML models working together to predict the probability that the user would purchase a product if we showed it to them, and the expected margin of that transaction. Cart state is also a very helpful signal when ranking complementary products, e.g. different types of chips would be favored if you had Coke in your cart versus Sprite.
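To make the expected-revenue idea concrete, here is a minimal Python sketch. It is not Wyvern's API or Gopuff's actual implementation: the `Candidate` class, its field names, and the numbers are all hypothetical, and it assumes the two model outputs (purchase probability and expected margin) are already available and simply multiplied together.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """A product that could be shown to the user (hypothetical structure)."""
    product_id: str
    purchase_probability: float  # assumed output of a conversion model, in [0, 1]
    expected_margin: float       # assumed output of a margin model, in dollars


def expected_revenue(candidate: Candidate) -> float:
    # Expected value of showing this product:
    # P(purchase | shown) * expected margin of that purchase.
    return candidate.purchase_probability * candidate.expected_margin


def rank_by_expected_revenue(candidates: list[Candidate]) -> list[Candidate]:
    # Highest expected revenue first.
    return sorted(candidates, key=expected_revenue, reverse=True)


# Made-up example values, for illustration only.
candidates = [
    Candidate("chips", purchase_probability=0.30, expected_margin=1.00),
    Candidate("soda", purchase_probability=0.10, expected_margin=4.00),
    Candidate("gum", purchase_probability=0.50, expected_margin=0.25),
]
ranked = rank_by_expected_revenue(candidates)
ranked_ids = [c.product_id for c in ranked]
```

Note that the most likely purchase ("gum" here) does not rank first; a lower-probability, higher-margin product can win once margin is part of the objective.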

We’ve talked to a bunch of other marketplaces and learned that they have their own long-tail insights that may improve the user experience within their own models:

- Recipe marketplaces like Cookpad may associate each recipe with a flavour profile, and use that data to match recipe flavours to predicted user profiles (e.g. weight Savoury recipes higher for users who enjoy Savoury food).
- Device reselling marketplaces like Valyuu may use the brand of the device a user is browsing the website on to predict which type of product they are most likely to purchase (e.g. iPhone users buying another iPhone).
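As a rough illustration of the flavour-profile idea, the Python sketch below scores each recipe by how well its flavour weights line up with a user's predicted preferences. Everything here is an assumption made for the example: the dictionary layout, the flavour names, the weights, and the choice of a simple weighted sum as the affinity score. It is not how Cookpad or Wyvern does it.

```python
def flavour_affinity(recipe_flavours: dict[str, float],
                     user_profile: dict[str, float]) -> float:
    # Weighted sum: each flavour in the recipe counts in proportion to
    # how much the user is predicted to like that flavour.
    return sum(
        weight * user_profile.get(flavour, 0.0)
        for flavour, weight in recipe_flavours.items()
    )


# Hypothetical flavour profiles per recipe (weights are made up).
recipes = {
    "ramen": {"savoury": 0.9, "sweet": 0.1},
    "pancakes": {"savoury": 0.1, "sweet": 0.9},
}

# Hypothetical predicted profile for a user who prefers savoury food.
user_profile = {"savoury": 0.8, "sweet": 0.2}

# Rank recipes from highest to lowest affinity for this user.
ranked_recipes = sorted(
    recipes,
    key=lambda name: flavour_affinity(recipes[name], user_profile),
    reverse=True,
)
```

In practice a score like this would more likely be fed into a ranking model as one feature among many, rather than used as the final ordering.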

The question we asked ourselves was: how much engineering involvement is actually required here, and how much can we generalize? We built Wyvern to abstract the ML engineering work described above away from data scientists. As a result, they can focus on defining the request/response of the API, the model, the features for the model, and the business logic, and finally on training the models with the feedback data generated by the ML pipeline.

We would love your opinions and hot takes. Please follow our Quickstart (https://github.com/Wyvern-AI/wyvern#quickstart) to give it a try.

[1] https://iarapakis.github.io/papers/TOIS17.pdf