Launch HN: Lumona (YC W24) – Product search based on Reddit and YouTube reviews
148 points by philena 8 months ago | 128 comments
Hey HN! We are Lumona (https://lumona.ai), a product search engine that recommends products based on what people on social media—Reddit and YouTube, for now—are saying about them.

Rather than going through SEO-filled Google results or adding site:reddit.com to your search, we explain what makes a good product, show you the best products, and back it up with Reddit and YouTube reviews about the product. We’re starting with skincare products (more on that below) and plan to expand from there.

Here’s a demo: https://www.youtube.com/watch?v=C4kKjW2YkZ4&lc=Ugzl94GP9SDBO...

We started off with skincare because, growing up, we struggled with acne but had no clue what skincare products could actually help us. Going down the rabbit hole of endlessly scrolling r/SkincareAddiction and watching countless hours of videos about cystic acne was not fun.

Lumona’s skincare search index was built by first scraping the internet for listings of skincare products, along with their ingredient lists, through a combination of SERP, Amazon’s API, and web page crawling. We then use a fine-tuned Mistral LLM to parse through a large number of Reddit threads and YouTube transcripts to extract opinions made by users, along with the context in which the opinions were made. These opinions are then matched with relevant products by another fine-tuned LLM, which looks at an opinion together with any products that have a high cosine similarity to the opinion’s subject and decides whether the opinion is relevant to any of those products. Using a Mistral-7B FT trained on GPT-4 outputs allowed us to parse through hundreds of thousands of Reddit threads in a simple way with just hundreds of dollars of compute.
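The matching step above can be pictured as two stages: a cheap cosine-similarity shortlist, then a relevance decision per candidate. The sketch below is illustrative only and is not Lumona's code. The toy vectors, the product names, the `shortlist` function, the 0.8 threshold, and the `is_relevant` stub (standing in for the second fine-tuned LLM) are all assumptions.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def shortlist(subject_vec, product_vecs, threshold=0.8, top_k=5):
    """Return up to top_k product names whose vector is close to the opinion's subject."""
    scored = [(cosine(subject_vec, vec), name) for name, vec in product_vecs.items()]
    scored = [pair for pair in scored if pair[0] >= threshold]
    scored.sort(reverse=True)  # highest similarity first
    return [name for _, name in scored[:top_k]]

def is_relevant(opinion_text, product_name, min_shared=2):
    """Hypothetical stand-in for the second-stage LLM: count shared words."""
    opinion_words = set(opinion_text.lower().split())
    name_words = set(product_name.lower().split())
    return len(opinion_words & name_words) >= min_shared

# Toy 3-dimensional "embeddings"; real ones would come from an embedding model.
products = {
    "Acme Foaming Cleanser": [0.9, 0.1, 0.0],
    "Acme Hydrating Cleanser": [0.8, 0.3, 0.1],
    "Zeta Sunscreen SPF 50": [0.0, 0.2, 0.9],
}
opinion = "The acme foaming one cleared my forehead breakouts"
subject_vec = [0.95, 0.1, 0.0]  # assumed embedding of the opinion's subject

candidates = shortlist(subject_vec, products)
matches = [name for name in candidates if is_relevant(opinion, name)]
print(candidates)
print(matches)
```

The shortlist keeps the second stage cheap: the expensive check only runs on the handful of products that are already close to the opinion's subject.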

If your query relates to a specific situation (e.g. “cleansers for my son who has inflamed acne on his forehead”), we search semantically through the opinions of Redditors and YouTubers to retrieve the products recommended by those who have dealt with a similar situation. If your query relates to a specific product (e.g. “iunik centella gel”), we instead go through the product listings themselves to return you the relevant products.
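The post does not say how a query is classified as situation-based or product-based. One hypothetical way to sketch the routing is a word-overlap check against known product names: `route_query`, the `min_overlap` threshold, and the small catalog are assumptions made for illustration.

```python
def route_query(query, product_names, min_overlap=0.6):
    """Return ("product", best_name) when most query words match one product name,
    otherwise ("situation", None) so the query goes to the opinion search."""
    q_tokens = set(query.lower().split())
    best_name, best_score = None, 0.0
    for name in product_names:
        n_tokens = set(name.lower().split())
        # Fraction of the query's words that appear in this product name.
        score = len(q_tokens & n_tokens) / len(q_tokens) if q_tokens else 0.0
        if score > best_score:
            best_name, best_score = name, score
    if best_score >= min_overlap:
        return ("product", best_name)
    return ("situation", None)

# Illustrative catalog entries, not taken from Lumona's index.
catalog = ["iUNIK Centella Calming Gel Cream", "Acme Gentle Foaming Cleanser"]

product_route = route_query("iunik centella gel", catalog)
situation_route = route_query(
    "cleansers for my son who has inflamed acne on his forehead", catalog
)
print(product_route)
print(situation_route)
```

A real system would more likely use an LLM or a trained classifier for this decision; the overlap heuristic only shows the shape of the two paths.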

We also use an LLM to analyze your search query to tell you what ingredients or effects are preferable for your skin concern. For example, if you searched for “inflamed forehead acne”, properties like “Oil-Control” and “Azelaic Acid”, which are good for dealing with inflamed acne, would be explained to you, and results containing those properties would be boosted and tagged in our results. You can also try out searches like “korean cleansers under $20 with Cica” to filter for certain ingredients and price points.
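As a rough illustration of "boosted and tagged", the sketch below adds a fixed bonus to a result's score for each preferred property it contains and records those properties as tags. The `Result` class, the additive `boost` of 0.2, and the sample scores are assumptions, not Lumona's actual ranking formula.

```python
from dataclasses import dataclass, field

@dataclass
class Result:
    name: str
    base_score: float          # relevance score before boosting (assumed scale)
    properties: set            # ingredients/effects known for the product
    tags: list = field(default_factory=list)

def boost_and_tag(results, preferred, boost=0.2):
    """Rank results, adding `boost` per matching preferred property and tagging matches."""
    ranked = []
    for r in results:
        hits = sorted(r.properties & preferred)
        r.tags = hits  # shown to the user next to the result
        ranked.append((r.base_score + boost * len(hits), r))
    ranked.sort(key=lambda pair: pair[0], reverse=True)
    return [r for _, r in ranked]

preferred = {"Oil-Control", "Azelaic Acid"}  # e.g. inferred from the query by an LLM
results = [
    Result("Product A", 0.70, {"Oil-Control", "Azelaic Acid"}),
    Result("Product B", 0.80, {"Fragrance"}),
    Result("Product C", 0.75, {"Oil-Control"}),
]

ranked = boost_and_tag(results, preferred)
print([(r.name, r.tags) for r in ranked])
```

With these numbers, Product A moves ahead of Product B even though its base score is lower, because it matches both preferred properties.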

While we think we’ve built a product search that would be pretty helpful for our teenage (and current!) selves, there are many improvements we’d like to make, such as getting opinions from Tiktok and other social media platforms and making our opinion extraction process more robust for edge cases (e.g. by using OCR, video transcription tools). We’re also planning on allowing our users to upload their own reviews and content and to expand our search across more products.

The long-term potential is to be a go-to product for anyone looking for what other people think about anything subjective (products, restaurants, b2b products, vacation planning, etc.). We believe that the entire discovery experience can be revolutionized by making it as easy as searching on Google to find out what the people you care about think about something. On the individual level, we want to make sharing your opinions with your friends and the world as easy as posting a picture on Instagram.

For now, if you have any skincare needs, whether it be to solve a skin concern, get rid of an annoying pimple, or just to find a good sunscreen, please give us a try: https://lumona.ai (We are an Amazon and Stylevana affiliate.)

We’d love to hear your feedback on our search engine, whether that be how the skincare search performs, what you think is missing, what products you want to see there, or any technical suggestions!




This is so cool. I already do this in a very ad-hoc way. Will definitely try it!

My only concern is that once Reddit reviews get used at scale for product discovery, we will see an inflow of fake and paid reviews in the comments. This will further pollute Reddit and probably drive discussions to forums closed from the public eye, e.g. Discord.

Obviously, this is not your fault at all, it's just the market dynamics at hand.

Anyway, let me try it!


> My only concern is that once Reddit reviews get used at scale for product discovery, we will see an inflow of fake and paid reviews in the comments

This is already happening.

A lot of product-related posts on Reddit are made by marketing agencies, PR firms, SEO consultants, etc. There's also a thriving secondary market for "high karma" Reddit accounts, which are bought and sold with ease. Unlike old-fashioned forums, which were difficult for outsiders to crack, Reddit is easy to game and basically it's already the most astroturfed place on the internet. Making it the basis of a product search system can only make it worse.


Likewise on YouTube, which OP's service is also pulling from, several times now I've gone looking for reviews of a specific product and one of the top results has been a TTS voice reading a probably ChatGPT-generated "review" which invariably recommends the product because the point is to get you to click the affiliate link in the description. The channels I saw were posting "reviews" so frequently and consistently, and for such random products that I suspect the entire operation is completely automated.


Hmm, that's interesting. I've been getting a lot of Instagram and Tiktok reels with the robotic TTS voice nowadays, I've just been assuming that it's a funny thing that people do so that they don't have to record their own voice.

Wondering how / if we should be filtering out this content now that you can make TTS voices that sound like they're completely real.


Even with a perfectly convincing TTS voice it's still given away by the fact that they don't show themselves on video interacting with the product, they usually just show a slideshow of official product images. At least some people must already be falling for this crudely generated content for it to be worth their while to produce it though.


There's an interesting phenomenon where a certain type of rapid-to-experience, entertaining content, often with an enjoyable twist, has become synonymous with the glaringly imperfect TikTok voice... and thus, conversely, creators use TTS to signal that their content is similarly entertaining. And as more and more traditional creators start to use TTS, real voices become devalued as a quality signal. Avoiding recording is only a part of the phenomenon!

https://gesserit.co/ (formerly tiktoktts) is one of the most popular ways to generate a TikTok-esque TTS voice outside of that platform. I don't think they could have chosen a better name!


Wow, that voice on the page sounds exactly like what I hear on TikTok / Instagram all the time. Definitely evokes the feeling that I'm about to be entertained by something.

Thanks for sharing that, we'll have to think hard about how to measure the quality / realness of content online beyond the simple things like upvotes and subscriber count


Interesting, haven't seen one of those yet. With public video+audio models getting much better that will only get way worse over time. Excited to see what YT/Google decides to do about it.


Example: search for "MSI G27C4X" on YouTube, for me at least both the first and second results are fake robot reviews. There are a couple of real impressions videos by real people but for some reason YouTube sorts them below the AI spam. One of those spam channels is posting multiple reviews per hour, with >11,000 videos and counting.


That's crazy, going to be really interesting once this ramps up/if it starts getting good traffic.


I've seen AI generated content about dog breeds, the content was absolutely horrible to watch and listen to.

In the near future we will have YouTube videos that pride themselves on being organically made, no GMOs and built by humans.


lmao true, already seen a few companies gunning for the YouTube for AI generated videos mantle so we'll see how it goes


Already happening? Here's a clip from an astroturfing firm in the year 2000 for Sprite and other clients. They called it 'under the radar marketing' back then. I'd wager 95% of all product recommendations on Reddit, and HN for that matter, are placed by people with agendas.

[0] https://www.youtube.com/watch?v=F0z0a4SLIsM


That's what I do, I think I provide a great product/service but also still want to get the word out.

A marketing agency who will sell a bag of shit as long as they get paid is definitely a net negative.

Overall reddit has been going downhill for a decade at this point and it only makes sense that it will be/has been captured by companies trying to profit off it.


Do you have any pointers for learning more about this space? I've personally been pretty skeptical of sponsored products on YouTube, though I find myself getting tempted to / actually trying them out anyway, but haven't thought too hard about small Redditors or Tiktokers getting paid to shill products.

Curious what companies do this / how companies are going about conducting these unpublicized marketing campaigns?


The top comment on that clip drives home that point. While there's no way to verify that, I wouldn't be surprised if it's a lot more than we realize. That being said and to play devil's advocate - if all of them are already astroturfed with some people discovering that, then how much do most people care about it?


Very true. It's interesting how Reddit has maintained a relatively high trust within most people though. It's also worth noting that while this is happening, there aren't many other places that most people go to where the site is mainly text-based and there is a higher level of trust that I know of. Personally, I'd trust Reddit over a random blog from a Google search, but that isn't a high bar.

All of that being said, I think this will be a much bigger problem with misinformation generally on all of the internet as AI gets better, especially considering the election later this year.


> It's interesting how Reddit has maintained a relatively high trust within most people though

Maybe the PR companies moved up a level in the meta-game - Don't talk up your product, talk up Reddit itself, _then_ go on Reddit to talk up your product.


It's not even that "meta": they would just be dogfooding. I would be surprised if this wasn't the case.


They're getting smarter for sure, maybe some shadow marketing going on with the IPO too lol


As with anything on reddit, "the real ______ is always in the comments." You get the best advice in the place you least expect it.


Great point. While reddit may be astroturfed, all it takes is one good comment.


Thing is, there aren't many marketing firms that don't have (or can't buy on a work-for-hire basis) upvote/downvote networks. It's trivial to promote comments in ways that look organic. It's equally trivial to downvote commercially harmful comments into oblivion.

What's more, Reddit posts are usually actively discussed only for a day or two. But votes can be cast, and new comments can be added, for months. One strategy is to wait until the conversation has completely died down, then hijack it with new comments that somehow seem to get as many votes as they need to rise to the top. When, a year later, somebody digs up that thread on Google, they'll see the promoted comments first.

Reddit has severe structural flaws that are, I think, unfixable. In making the upvote/downvote thing a kind of game, and in enabling easy throwaway accounts whose votes are weighed in the same way as those of the longstanding accounts of regular commenters, they've naturally made their forum easy for commercial interests to game.


Great points, not sure those are fixable either. It definitely has some advantages from those same things in some respects, but monetizing reddit outside of ads (i.e. ecommerce like tiktok/instagram) because of these things is going to be a challenge for sure.


I wouldn't blindly trust comments.


Can you give me some examples of old-fashioned forums? I just want to see some of them for the sake of it ;)



Where can you buy and sell high karma Reddit accounts?


There are tons of clearweb markets, e.g.: https://openmarketingstudio.mysellix.io/


Thanks, we really appreciate it! This is something we've been thinking about too. One of the things that we've noticed is that video reviews, which are harder to fake (for now), have a lot more effect on us than almost all text reviews. We're thinking that letting people upload their own video reviews will help solve this problem as long as we can detect video deepfakes, but that's definitely not a complete solution (like you said though, not sure anything is).


tbf it's common for YouTube uploaders to be paid to advertise products that barely work


lol true, the state of youtube ads is pretty bad nowadays (although the raid shadow legends spam is gone from my feed now). Just places more emphasis on how much people trust the individual channel.


A centralized curator could actually help by drawing a "shill" graph and excluding those signals.


Cool idea, but I don't see how it can ever possibly work with the amount of astroturfing and frequency/ubiquity of undeclared paid advertising.

You're ingesting highly biased, sponsored, astroturfed content. What measures have you taken to filter Youtube reviews down to the ones that haven't been sponsored, and likewise for Reddit? Otherwise it's just garbage in, garbage out but wrapped in fancy, legitimate-looking packaging.


It's the eternal September problem. Reddit was a great source of honest reviews until everyone figured it out and started to take advantage. Services trying to capitalize on it may be well-intentioned but it's only going to accelerate the enshittification.


Strange that you chose acne as your demo topic but none of your results mention one of, if not the most, powerful treatments, Tretinoin/Retinol, which comes up in the first search results on Google.

Problem is that some of the best skincare is not available over the counter, and surfacing prescription treatments dips into medical care, which is a whole other can of worms.

In the end, you are missing valuable treatments but presenting a summary of poorly researched (by Reddit users) or anecdotal information.

I love the concept though and would love to see it catch on!


You're right that there are some very effective prescription treatments that aren't shown, but it doesn't seem like prescription acne treatments are usually the appropriate / doctor-prescribed choice for most people facing mild to moderate acne.

Personally, my pediatrician told me that acne is just something that happens to teens and recommended that I go try some acne washes from the drugstore instead of prescribing something like Tretinoin which could have some pretty intense side effects.

Reading r/SkincareAddiction has been really helpful for me, especially seeing the range of experiences that people have had, and that's why we made Lumona summarize these results.


>my pediatrician told me that acne is just something that happens to teens

Certainly not... https://www.yalemedicine.org/conditions/acne

> Clinical trial data revealed that approximately 50% of women in their 20s, 33% of women in their 30s, and 25% of women in their 40s suffer from acne

>which could have some pretty intense side effects

Your site recommends benzoyl peroxide which has similar or worse side effects compared to tretinoin.

It's also a lauded product on both r/SkincareAddiction and r/30PlusSkincare. Not something recommended for kids, but for adults with persistent acne it is worth trying, especially over antibiotics and alongside BP.


Lately topical Tretinoin has been shown in numerous studies to cause Idiopathic Intracranial Hypertension aka Pseudotumor Cerebri which is quite intense. My post about this on sca from years ago receives many comments to this day from other sufferers. I'll never touch it again despite my adult acne. I wonder how many other people have debilitating potentially blinding brain pressure headaches and don't realize it is caused by this medicine commonly accepted as safe.


This is a neat product, and I plan on trying out some of the recommendations for sunscreen.

During my journey using the app there were a few things I noticed

1) It seems like the intermediate page is generating text from the LLM as well, which makes the whole process quite slow on my machine. It took maybe 10 seconds before the loader finished displaying the text. If I try and perform the same query again on the same browser, the results are somewhat quicker, maybe 700-800ms of wait time, but this still seems too slow. Once I ran the query five or so times, it was as quick as the demo queries on the front page.

2) Consistent results: If I use the same query on separate browsers, I'm given different products as the "Top Recommended Product", which seems odd. I know LLMs are stochastic, but the feed starting with the "Top Recommended Product" probably shouldn't have stochasticity. This problem opens up some interesting ML cans of worms, but I believe these issues could be overcome.

3) Another issue was if I wanted to scroll in the left column while the right column was still loading, the scrolling was very janky. This was an issue on firefox, but it took quite a long time for the app to be functional (> 10s)

4) Perhaps you could move the search bar and the logo to the top, so the logo is on the top left corner and the search bar takes space to the right of it. This way there aren't overlapping elements, I'm sure there's some annoying edge cases there which would frustrate users

5) For negative ingredients (and maybe any of the ingredients) it would be nice if you kept track of an ingredient database with references. I want to know why some ingredient is bad for my skin, and what I could expect.

6) If a product has many distributors, my first thought was that the arrow scrolling through products was a slider for the distributor list. I wonder if there's a nice way to differentiate the arrow further, so its functionality is more apparent.

Anyway, this is an excellent proof of concept, I'm excited to see how this product develops.


Thanks for trying it out!

As for the performance issues, we're looking into several things that could speed things up:

- Fine-tuning a small LLM for the results on the intermediate page and deploying on a provider with higher throughput and a lower time to first token

- Admittedly, there are quite a few SQL query / index optimizations we need to make on the backend, along with making parts of our pipeline async

- The frontend itself is also not very performant right now, but we're working on it.

We cache previous calls to the API, so that's why the demo queries or queries others have tried before you are faster. I'll ship a change that makes the results more consistent but not fully consistent later today.
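The reply does not describe how the cache works. A minimal sketch, assuming an in-memory cache keyed on a normalized query string, shows why a repeated query skips the slow path. `SearchCache`, `slow_backend`, and the normalization rule are hypothetical.

```python
import hashlib

class SearchCache:
    """In-memory cache in front of an expensive search backend (illustrative only)."""

    def __init__(self, backend):
        self._backend = backend
        self._store = {}
        self.misses = 0  # how many times the backend was actually called

    @staticmethod
    def _key(query):
        # Collapse case and whitespace so trivially different queries share an entry.
        normalized = " ".join(query.lower().split())
        return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

    def search(self, query):
        key = self._key(query)
        if key not in self._store:
            self.misses += 1
            self._store[key] = self._backend(query)
        return self._store[key]

def slow_backend(query):
    """Stand-in for the LLM + database pipeline."""
    return [f"result for {query.lower().strip()}"]

cache = SearchCache(slow_backend)
first = cache.search("Best  Sunscreen")
second = cache.search("best sunscreen")  # same normalized key, served from cache
print(cache.misses)
```

A cache like this also returns identical results for repeated queries, though it does nothing for consistency on the first, uncached call.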

As for the ingredients, citing sources is definitely a next step. In the meantime, I recommend looking up the ingredients that catch your eye on a place like EWG Skin Deep if it's a huge concern for you (I used to do this to make sure my ingredients weren't comedogenic for acne).

Great point about the distributor list UI, we'll think about a better way to show it!


This is great. You can then start seeding products which give you a high cut and then proclaim them as the "best". Basically what Wired and all do now but without the whole article bit and you can claim "knowledge of the public".


that's an interesting idea -- we have been seeing this play out successfully as well (like you mentioned, Wired + sponsored youtube videos + etc). though that would be useful for profitability, we're afraid that may compromise our reputability as being the knowledge of the public. instead we're looking for ways where, when we expand to more opinions and reviews, we can robustly filter out those that seem disingenuous / are sponsored. curious as to what you think about this "filtering" out + if you have any ideas of going about this :)


As long as you don't show blatantly horrible products, your reputation will be fine.


we'll do our best :)


I understand why you went for a product search engine (gotta monetize) but I think one of the reasons mining reddit for intel is so helpful is you aren't always being sold a product.

For example: I recently turned to reddit because I was looking for a foam roller to resolve some IT band issues from running, and ended up finding a stretching routine that has fixed my problem without buying anything.

Either way, I think this is really cool and bypassing the nonsense that google is becoming is a winning path.


I know this is wildly off topic, but can you please share any info about your stretching routine? I have persistent IT band issues from running…


likewise


thank you! i completely agree -- i often go to reddit when looking for tv show recommendations because of its honest advice from the community (maybe it's because of its anonymity?)

we definitely want to expand this to outside product search and be more of a general recommendation/opinion search (e.g. in your case, finding out what people are saying about how to fix band issues from running), interested as to what you would think about this :)


you'd be surprised but reddit is mostly shills.

besides your anecdote, all the games reviews, computer part reviews, etc are all paid by drop shippers. and some reddits like mattress review ones are exclusively shills talking among themselves.


It’s a neat idea, but I think using the affiliate model will ultimately be corrupting.

Maybe once you get larger you can pivot to being a paid service like Consumer Reports. To me, they still feel more trustworthy than other services similar to yours (like Wirecutter).


Nice work! I kind of do this with google and reddit already sometimes, as a well written explanation for why someone likes a particular item plus the upvotes do help me make decisions. The format looks pretty good, would just like to have a view of all the products at once in a comparison if possible.

The concept of a search that is multi layered is something I see The Browser Company and others doing to make your one search a bit more impactful, so kudos for going in that direction as well. I would do restaurants and search availability as well.

More thoughts: https://www.youtube.com/watch?v=xKFDuZsdXrc


Thanks! We just watched your review together in the living room, and we really appreciate your thoughts+detailed feedback. The list of items is an interesting idea that we'll think about how to fit into the ux. Comparisons is definitely something we want to add later down the line as well.

The idea of restaurants, like you mentioned, would be really great to have. It's not an immediate priority, but once we get Tiktok/short form videos on the site and integrate it well, it'd be really exciting to make and use.


> Using a Mistral-7B FT trained on GPT-4 outputs allowed us to parse through hundreds of thousands of Reddit threads in a simple way with just hundreds of dollars of compute.

Great idea. These sorts of clever approaches are needed to build the sorts of products that benefit from scale. When the cost of inference goes down, it enables new experiences. And finding clever ways to reduce cost before the big providers do is a massive competitive advantage that makes it tough for those who wait to compete with you.

Anyone building AI products should take note.


The missing part of the story is when we made an early prototype using GPT-4, left it on overnight, and realized that we'd spent several thousand dollars of OpenAI credits...


Aaah, I can imagine the panic I'd be in.

Yet, such pain is where the innovation comes from :). Wishing you all the best! And plan to try this out once it covers more product categories.


Interesting idea! Well done. The interaction design on the site is a bit weird IMO. I can swipe up or sideways through results. I’m not sure what’s the difference in information architecture. Also there’s no indication of how many results there are. You could display the results as a stack of cards and show a counter for the number of results. I’m happy to help you with the UI design if you’d like some help.


How do you deal with bot posts to push products on either platform skewing the reviews?


For now, we're excluding Reddit posts that are clearly automated and making sure the YouTube content is not sponsored, which you are required to disclose by the YouTube ToS.

We'll have to dig deeper into how to filter out spammy reviews. I can imagine analyzing a user's post history or detecting if content was clearly GPT-written, but it's hard to really tell. I know there are things like Amazon review analyzers out there, but we'll have to learn more about this. I wonder if the people of HN have any suggestions on this front.

There'll probably be a lot of AI generated reels that look like they're from real people online soon too. I wonder what platforms like Tiktok and YouTube will do about this. If this ends up being a huge problem, we can probably try to use ML methods to check if the video was filmed in the real world.


what does clearly automated mean


For now, it's just removing AutoModerators and things labeled as bots. Now that I'm reading this again, I realize that doesn't really help, since bots pretending to be people recommending products don't get filtered out.
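To make the limitation concrete, here is a rough sketch of what a "labeled bots only" filter might look like. The comment dictionaries, the `is_bot` field, and the name-suffix rule are assumptions for illustration, not Reddit's API or Lumona's code.

```python
def drop_labeled_bots(comments):
    """Keep comments whose author is not AutoModerator and not labeled as a bot."""
    kept = []
    for c in comments:
        author = c["author"].lower()
        labeled = (
            author == "automoderator"
            or author.endswith("bot")   # crude: would also drop a human named "talbot"
            or c.get("is_bot", False)   # hypothetical flag set upstream
        )
        if not labeled:
            kept.append(c)
    return kept

comments = [
    {"author": "AutoModerator", "body": "Please read the rules."},
    {"author": "RemindMeBot", "body": "I will message you in 2 days."},
    # An undisclosed paid account looks like any other user to this filter.
    {"author": "skincare_fan_92", "body": "This cleanser changed my life, link in bio!"},
]

kept = drop_labeled_bots(comments)
print([c["author"] for c in kept])
```

The third comment passes straight through, which is exactly the gap described above: a filter based on labels can only catch accounts that announce themselves.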


That strikes me as very naive. Reddit bots are never marked as bots, that's the whole point of astroturfing. Youtubers aren't diligent about disclosing sponsorships either, regardless of what their ToS say.

Slightly outdated (2018), but they found that only 10% of Youtube videos disclosed sponsorship: https://www.engadget.com/2018-03-28-youtube-influencers-spon...

A recent report by the European Commission found that only 20% of overall "influencer" posts disclosed sponsorships: https://ec.europa.eu/commission/presscorner/detail/en/ip_24_...


I think this is a very good point. We've focused a lot on correctly matching reviews with products / brands, but haven't taken hard enough of a look at astroturfing.


We've tried to build this in the past with Looria.com, where we aggregated and summarized reviews from the most trusted sources, e.g. Reddit: https://www.looria.com/reddit

Couple of challenges:

- Astroturfing is everywhere

- The data sources, especially social media, become more protective with their data

- Monetizing this is super hard. As an aggregator, you're always just the intermediate. The glory times of ads and affiliate marketing are over.

Vetted.ai is working on something similar and they raised $14M in 2022. For all consumers, I really hope one of you will succeed!


Thanks! Super interesting how many different approaches there are to this problem. Definitely encountered these challenges, and we think there's solutions to them eventually that we have to build towards. I'll drop a message sometime, would love to chat :)


Curious - what is your bearish case for profitability of affiliate marketing?


Over time affiliate programs like the Amazon one have become a lot less generous. On the other hand though from running an ecommerce site I'd be happy to work with an affiliate like this that isn't just a coupon website that basically adds no value.


That's what I've heard as well. Also, the lag to when you actually get paid is super painful.

Also curious - how do you think about affiliates as someone who runs an ecommerce site? Are there any reservations about whether services like us take search traffic or ads revenue?


people can find multiple ways to get to the same product. once your way starts charging they will find another way.


Is the implication here that you need to charge and users will leave you once you do? If you can make a product that's significantly better, then you should be able to charge. The thing I'd note for affiliate marketing as a business model is that for it to generate significant revenue, you need to have a lot of traffic while other business models can generate that much faster (subscriptions) or make you money based off of that traffic (ads) instead of how many products are purchased.


Your note on affiliate marketing is what makes your first statement potentially unachievable. How does a consumer "know" that a product is significantly better to the point of "worth paying for"? There's always another free (potentially ad supported) affiliate marketer (or 5) around the corner. (Also considering the "worst" version of this "product" is an unskippable ad.)

I don't know the solution


Fair. The best solution we've seen is building the product in some way where it's somewhat defensible, either through data, features that bigger players won't build, etc. and then using a subscription based model if users are willing to pay for that and value the searches high enough or using an ads based model if you're optimizing for traffic rather than pure value on each search.


I suppose two case studies worth exploring are:

- Consumer Reports (subscription magazine recurring revenue)

- NY Times Wirecutter (a potential add-on service to boost apparent value for subscribers)


We've looked a bit into CR and Wirecutter, not that deeply into CR yet though. I definitely used Wirecutter for a bit of things in the past, and they have a high level of trust that we'll need to seek to replicate.


Love the idea, my only concern is how to trust that at some point you're not going to include sponsored products?


As a user of our product, I'd really hate it if we were recommending crappy products. I suspect users will also feel the same and this thought will hold us accountable.


That's a non-answer to the question. You're saying that you would like to recommend good products only, but the question was about sponsored products.

And on another comment about adding high margin results as "best" you seem very interested in that concept.

I'm not trying to be hateful, but if you're thinking about how to seed results with high margin products for yourself, instead of the actual best results already in the life cycle of your product, I think it's just a matter of (not much) time before any quality you have will immediately plummet. This makes me very cautious about your product. Very.


This is so cool! The way the search engine has been built up also seems very smart. I'm honestly surprised too at the same time, that this kind of idea hasn't been worked on before (couldn't find anything similar; I could be very wrong)

I'm not sure if this type of problem is even a considerable one, but how does the search engine handle reviews from subreddits which are focussed only on a particular brand, and may potentially form a bias around such products? Does the LLM's awareness of each review's context handle that?


That's a really good point. I think in our current iteration of the system, if we applied it to subreddits focused specifically on a particular brand, it would not be able to account for the bias there, even if it knows which subreddit the content is from. That's probably too much to ask of the LLM.

We'll have to think about good ways to handle this. Curious about your or others thoughts on these subreddits, how do you process content on these subreddits differently?


Brand-focussed subreddits generally have detailed reviews of the product (which is really, really helpful), whereas in general-topic subreddits you more often encounter comparisons and "X versus Y" threads.

Likewise, if I want opinions on a specific product before buying it, I would definitely go to brand focussed subreddits, but if I'm unsure and have a generic problem in mind like "acne on forehead", then probably going to general topic subreddits would be a better choice.

I agree, that is probably too much to ask of the LLM. If you could analyse the variety of brands discussed on a subreddit and assign it a score according to how diverse the discussions are, maybe that could help.
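A rough sketch of what such a score could look like, assuming you already have a list of the brand names mentioned in a subreddit (extracting those names is the hard part and is not shown). The `brand_diversity` name and the choice of normalized Shannon entropy are illustrative assumptions, not something Lumona has described:

```python
import math
from collections import Counter


def brand_diversity(brand_mentions):
    """Normalized Shannon entropy of brand mentions, in the range [0, 1].

    0.0 means every mention is of the same brand (or there are no mentions);
    1.0 means mentions are spread evenly across all brands seen.
    """
    counts = Counter(brand_mentions)
    total = sum(counts.values())
    if total == 0 or len(counts) < 2:
        return 0.0
    entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
    # Divide by the maximum possible entropy for this many distinct brands.
    return entropy / math.log(len(counts))
```

A brand-specific subreddit should score near 0 and a general-topic one closer to 1. One caveat: because the score is normalized, two brands mentioned equally often score the same as fifty brands mentioned equally often, so you may also want to look at the raw number of distinct brands.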

Anyway, that's probably overcomplicating things, and it's something that should only be a concern when the product grows more mature (or if a brand bloats up a certain subreddit :P).

Cheers, congrats on getting on YC!


I just want to know what corded stick vacuum to buy. Where can I access something that a human has written? It’s become impossible for me. I’m on Kagi, I wonder if Google or Bing are better at this.


https://www.reddit.com/r/BuyItForLife/comments/15b4iks/best_... This one seems to be a good thread. Hopefully we'll have it on Lumona soon.


Results don't finish loading for me. I will try again in a few hours. I am really curious to see how it compares to generalized search engines like Perplexity and You.com.


sorry about the loading issue -- we'll look into that right now!


Skincare is an interesting beachhead. How do you test if your results are good? What's the baseline?

I feel like something like movies or video games would be a great way to validate the approach since there's generally agreed upon sentiments regarding these products.

Skin care I'd imagine is fairly complicated. Goal, lifestyle, budget, habit and individual based needs and preferences can lead to different sentiments. How do you calculate say, your loss function?


I think that's what makes skincare interesting. We want our system to be able to understand your goal, lifestyle, budget, ... and pick out which product is the best for you, given what others who have used the product before said.

With less of this information, the ground truth would probably be related to how popular the product is or the average sentiment of people reviewing the product. With this information though, you can compare each one to see which best fits the user's needs. Having compared enough products, you'll eventually figure out which one is the best.


it's a recommendation system with a timeline of months and the data to back it up IS your market hypothesis.

you can have whatever risk profile you want, some people are gamblers. Personally I'd like to have greater confidence in the fitness of the implementation before going all in.

Although I say this with the advice adage: unless the business advice comes from someone who is on their own private yacht, it's more opinion than advice. (I've got no yachts)


How does this differ from https://www.looria.com/?


A couple of ways, from my understanding. We have different focuses in our UX and UI: for example, we feature reviews directly next to the product and show products one at a time instead of in a listing view. We also place more emphasis on semantic search, where you learn about the products being offered and how they're relevant to your specific situation, instead of keyword-based search. From a business standpoint, we're also affiliates of Amazon and Stylevana while Looria isn't.


If I had to guess, I'd say the top words on Reddit would be "actually" or "because" and probably 69


r/SkincareAddiction, out of all posts and comments from 2023:

and: #1, skin: #23, acne: #55, because: #97, actually: #263
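
For anyone curious how a ranking like the one above could be computed, here is a minimal sketch. The tokenizer regex and the `word_ranks` name are illustrative assumptions, not necessarily how these numbers were produced:

```python
import re
from collections import Counter


def word_ranks(texts):
    """Map each word to its frequency rank (1 = most common) across texts."""
    counts = Counter()
    for text in texts:
        # Lowercase and split into simple word tokens.
        counts.update(re.findall(r"[a-z0-9']+", text.lower()))
    ordered = counts.most_common()
    return {word: rank for rank, (word, _count) in enumerate(ordered, start=1)}
```

Calling `word_ranks(comments).get("actually")` on a list of comment bodies would then give the rank of "actually", or `None` if it never appears.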


Reddit and YouTube are so astroturfed that I have trouble believing there’s much signal in the marketing noise.


Cool concept. Not relevant to me in its current state being limited to skin care products, but would love to use something like this for things like supplements or other products where I otherwise have to sift through Amazon reviews & reddit threads.


Thanks and makes sense. Supplements+general health and beauty will probably be one of the first things that get added outside of skincare. Would be interested in seeing the reviews as well for those considering how supplements are sold+regulated.


I sat on the "before we begin" page for a long time waiting for something to happen before I realized nothing would:

https://imgur.com/a/cvT1iF8


Sorry that happened :( What were you searching for? We'll look into it.


Seems to happen whenever I don't specify a specific product?

https://www.lumona.ai/search/results?q=vegan+european+leathe...

^ That search does the forever loading boxes for me but if I add 'cream' to the end of it then it seems to work.


What's happening here is that we put your search query into a model that tries to figure out what skincare product characteristics (e.g. Vitamin C, Retinol, etc) would be good for you, but when it sees something that isn't a skincare product or a skincare concern, it gets confused and doesn't return anything.

I'll put in an error handler for this soon.
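
A minimal sketch of what such a handler might look like, assuming the model step is a function that returns a list of characteristics and comes back empty for non-skincare queries. `extract_traits` and `search` are hypothetical names for illustration, not Lumona's actual code:

```python
def search(query, extract_traits):
    """Run the characteristic-extraction step and fail loudly if it finds nothing.

    extract_traits is a hypothetical callable standing in for the model; it is
    assumed to return a list of characteristics such as ["Vitamin C"].
    """
    traits = extract_traits(query)
    if not traits:
        # Nothing skincare-related was recognized: return a message the UI
        # can show instead of leaving the loading placeholders up forever.
        return {
            "ok": False,
            "traits": [],
            "message": "We only support skincare searches for now.",
        }
    return {"ok": True, "traits": list(traits), "message": ""}
```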


This is something I ran into as well, mainly bc of your title and description. I tried it before reading your whole post, so I didn't know it's only skincare for now.

> Launch HN: Lumona (YC W24) – Product search based on Reddit and YouTube reviews

> Hey HN! We are Lumona (https://lumona.ai), a product search engine


Sorry for the confusion, we'll try to have general products soon.

I just pushed a change to give an error message explaining what's happening for non-skincare related searches.


No worries, I'm excited to use it when that happens!


Are you paying for Reddit's API or did y'all find a way around it?


We are not paying for Reddit's API to get our data. There are some really good, complete, and publicly available dumps of Reddit data online. We are in contact with the folks at Reddit, which is of course a YC company, so they're aware of what we're doing.


they're either paying or it was a gift from sama


would love to say that it was a gift from sama, but he hasn't blessed us :(


Similar thing, but for electronic gadgets - https://shoppalui.vercel.app/


How does this service deal with a coordinated advertising campaign, most likely also driven by LLMs, over a period of, say, X months? Moderators on subs can be bought out or marginalized, while YouTube reviews can also be bought out. In other words, how is an aggregated source a better and more trustworthy source of information than a single blogger who people can ascribe some amount of trustworthiness to over a period of time?


Great question. This would be a bigger issue if we were only aggregating results and summarizing them, but because we both aggregate and show (in our opinion) the highest credibility reviews from YouTubers (and other sources like blogs once we add them), our idea is that while the general mass opinion can be shifted through campaigns like that, the top end of the spectrum should hopefully still remain pure.

If on the other hand the top end of the spectrum is corrupted, then hopefully the masses can compensate for that. If both are corrupted and all of the data sources available are, then it really comes down to our ability to filter out LLM or promoted content which comes down to how well they can hide it. AI detection tools have been scaling alongside models, so it's also a question if that will continue over time. We'll think of some more advanced things if that becomes a bigger issue for us :)

At the end of the day, if a company can run a coordinated advertising campaign across the internet over months to block out any negative opinion, it's a big deal for both us and the social media/data sources we pull from, and it's going to be a challenge we have to deal with.


Very cool! It reminds me of https://chord.pub/.


Wow, thanks for sharing this. I find it interesting that they chose to make it something that I have to wait 1-2 minutes for before I get my AI generated article.

Seems to do a good job for various types of research. Will give it a try next time I'm curious about something and need it researched.


Doesn't fine tuning models on GPT-4 output violate OpenAI's terms of service?


OpenAI says in their terms that you can't "use output from the Services to develop models that compete with OpenAI" [0], and it seems that people are interpreting it as training a model that directly competes with them, which we aren't doing. There are many companies out there built on using GPT-4 outputs to do task-specific fine-tuning, so it doesn't seem like it would be a problem unless we were trying to make a competing foundation model from GPT outputs.

(I'm not a lawyer, so this needs to be taken with a large grain of salt)

0: https://openai.com/policies/terms-of-use


It seems like everyone is doing it. Does anyone care? Should anyone care?


Yo, can I take a picture (of my skin) and you can suggest some solutions? Multi modal plz!


For sure! We'll work on that in the next couple of updates, it's been on our minds for a while.


this idea is awesome, I hope you get into software products


thank you! we've been wanting this as well while building this out haha, will do :)


Really cool use of open source models


What I find interesting here is how far a well working QA application with LLMs (such as this one) is away from anything that can be generalized to other topics.

That's probably where we are right now: I have seen quite a few purpose-built and tuned AI systems for one specific use case or topic which work really well. By contrast, I have yet to see any general AI bot that does this with arbitrary data for any reasonable definition of good.

I mean, take any of these Chat-with-data bots, load up a huge document and ask it for information that is spread on many pages (like make a list of prices for every product in a catalogue). Then see it fail.

Exciting times.


Definitely feel this way too. Sometimes I think it'd be really great to have an LLM give me a well-researched report on, say, recent trends in undisclosed marketing online. I've wished we supported that on Lumona and realize we'll have to do it eventually, but it's pretty tough with the current infrastructure.


how do you index reddit cost effectively without breaking their tos


We're working with dumps of Reddit data, which means we don't have to use their API or do any scraping on Reddit itself for now. The data is only updated monthly though, so we'll have to figure out how to get higher quality data for things that are more time sensitive. We're in contact with the folks at Reddit, so we'll try to see if there are ways to get better data later on.
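
As a rough illustration of working from a dump rather than the API, here is a sketch that filters records by subreddit. It assumes the dump has already been decompressed into newline-delimited JSON with a `subreddit` field on each record; that layout and the `iter_subreddit_comments` name are assumptions for the example, not a description of our pipeline:

```python
import json


def iter_subreddit_comments(lines, subreddit):
    """Yield parsed records from NDJSON lines that belong to one subreddit."""
    target = subreddit.lower()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip malformed lines rather than abort a long run
        if not isinstance(record, dict):
            continue  # ignore anything that is not a JSON object
        if str(record.get("subreddit", "")).lower() == target:
            yield record
```

Because it takes any iterable of lines, you can pass an open file handle and stream through a large dump without loading it all into memory.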


searched for best yoga mat, got strange video about sunscreen...


Sorry about that. We didn't make it clear initially that Lumona only has skincare products for now, and the message about skincare was probably not clear enough in our post. We'll be working to scale beyond these products soon.


Putting dermatologists out of business… lol.


Just noting that the best skin care routine is:

No alcohol or caffeine
Lots of water
Vegan diet
Using baking soda
Adequate sleep and time in nature


True, I think I would agree with most of this sentiment. Unfortunately I'm not doing many of these things and am still using my skincare products.

Perhaps we should be surfacing opinions like this beyond just products.


My brother, a famous Broadway actor, once gifted me a jar of $300 face cleaner scrub.

It was great.

When it was empty I read the ingredients and discovered it was basically a jar of wet baking soda.

Using 50-cent baking soda was just as great.

:)



