10 years ago I was making that kind of money developing something like a recommendation engine for a startup and then a search engine for patents powered by a neural network.
My YOShInOn RSS reader uses BERT-family embeddings to capture the "gist" of text the way the neural network for that search engine did; I then treat content-based recommendation as a classification problem (i.e. "Will the user like this?").
My system is quite a bit different from others because, like TikTok but even more so, it demands a thumbs up/thumbs down judgement for every article so I get a set of negative samples that are really reliable.
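To make the "recommendation as classification" idea concrete, here is a minimal sketch, not the actual YOShInOn code. It assumes each article has already been turned into an embedding vector (the tiny 3-dimensional vectors below are stand-ins for real BERT-family embeddings) and that every article carries a thumbs up (1) or thumbs down (0) label. Logistic regression is used purely for illustration; the comment above does not say which classifier is used, and `train_logistic` and `predict_like` are hypothetical names.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(X, y, lr=0.5, epochs=200):
    """Fit weights and bias with plain stochastic gradient descent."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            err = p - yi  # gradient of the log loss w.r.t. the logit
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b

def predict_like(w, b, x):
    """Estimated probability that the user gives this article a thumbs up."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

# Stand-in embeddings (assumption): real ones would have hundreds of dimensions.
liked = [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1]]
disliked = [[0.1, 0.9, 0.2], [0.0, 0.8, 0.3]]
X = liked + disliked
y = [1, 1, 0, 0]  # explicit thumbs up / thumbs down for every article

w, b = train_logistic(X, y)
p_new_liked = predict_like(w, b, [0.85, 0.15, 0.05])
p_new_disliked = predict_like(w, b, [0.05, 0.85, 0.25])
```

Because every article gets an explicit judgement, the 0 labels are real rejections rather than "the user never clicked", which is what makes the negative class trustworthy.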
There are numerous frontiers for improving this system. One is that certain things, like "roundup" articles that cover a wide range of topics (say https://www.theguardian.com/world/live/2024/jun/03/russia-uk...), aren't captured well by the embedding. Adding some new features could clear out maybe 10% of articles I'd rather not see, but I am in no rush: overall the system is very satisfying to me, and I am already blending in more random articles than that, both to get samples that keep it calibrated and to sometimes discover new topic areas I find interesting.
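The blending of random articles could look roughly like the sketch below. The function name `build_batch`, the 15% `explore_fraction` (chosen only because it is "more than 10%") and the `(article, score)` input shape are all assumptions made for illustration.

```python
import random

def build_batch(scored, batch_size, explore_fraction=0.15, rng=None):
    """Split a batch into top-scored picks plus a random sample of the rest.

    scored: list of (article_id, predicted_like_probability) pairs.
    Returns (top_picks, random_picks) as two lists of article ids.
    """
    rng = rng or random.Random()
    n_explore = int(round(batch_size * explore_fraction))
    n_exploit = batch_size - n_explore
    ranked = sorted(scored, key=lambda pair: pair[1], reverse=True)
    top = ranked[:n_exploit]
    rest = ranked[n_exploit:]
    # Uniform sample from everything the model did NOT rank highly, so the
    # classifier keeps getting labels for articles it would otherwise hide.
    explore = rng.sample(rest, min(n_explore, len(rest)))
    return [a for a, _ in top], [a for a, _ in explore]

# Hypothetical scores for 100 candidate articles.
scored = [(f"a{i}", i / 100) for i in range(100)]
top, explore = build_batch(scored, 20, 0.15, random.Random(0))
```

The random picks still get a thumbs up or thumbs down like everything else, so they double as an unbiased check on how well calibrated the scores are.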
Another interesting frontier is sequential recommendation, but I'm not sure I really want to take an ML approach to it because there may not be enough training data for one person's content-based recommendation. I don't know exactly how I want to do it, but when I post to a place like HN I do not want to post a stick of five articles from phys.org; I want there to be some diversity in my feed, not just over the course of a 300 article batch but on the scale of individual posts. Items can be hung up in queues for several days in this process. Most "news" on HN is fairly evergreen and it doesn't matter if it is delayed a week, but articles about sports have a short shelf life: you look like a fool if you post an article about what happened in week 3 of the NFL after the games of week 4 have been played. So I need some way for sports articles to "jump the line" ahead of other articles, but I don't want that to privilege sports over everything else.
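One non-ML way to get both properties is a small rule-based scheduler. The sketch below is only one possible design under stated assumptions: every queued item has already been approved for posting, perishable items carry a hypothetical `expires_day`, and "diversity" means not repeating a source within the last `window` posts. All names (`Item`, `next_post`, `schedule`, `urgent_within`) are made up for this example.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Item:
    title: str
    source: str
    queued_day: int
    expires_day: Optional[int] = None  # None means evergreen

def next_post(queue, today, recent_sources, urgent_within=2):
    """Pick the next item: soon-to-expire items first, otherwise oldest first."""
    live = [it for it in queue if it.expires_day is None or it.expires_day >= today]
    # Prefer sources not posted recently; fall back to anything if none qualify.
    fresh = [it for it in live if it.source not in recent_sources] or live
    def key(it):
        urgent = it.expires_day is not None and it.expires_day - today <= urgent_within
        return (0 if urgent else 1, it.expires_day if urgent else it.queued_day, it.queued_day)
    return min(fresh, key=key) if fresh else None

def schedule(queue, today, n_posts, window=1):
    """Greedily build a posting order with no source repeated within `window` posts."""
    remaining, recent, out = list(queue), [], []
    for _ in range(n_posts):
        blocked = set(recent[-window:]) if window > 0 else set()
        pick = next_post(remaining, today, blocked)
        if pick is None:
            break
        out.append(pick)
        remaining.remove(pick)
        recent.append(pick.source)
    return out

queue = [
    Item("p0", "phys.org", 0), Item("p1", "phys.org", 1), Item("p2", "phys.org", 2),
    Item("nfl", "espn.com", 3, expires_day=6), Item("ax", "arxiv.org", 4),
]
order = [it.title for it in schedule(queue, today=5, n_posts=4)]
```

The deadline only moves an item forward once it is within `urgent_within` days of expiring, so a sports article that is not about to go stale waits its turn like everything else. That is how the sketch tries to let perishable items jump the line without giving sports a general boost.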
Similarly, there is "the probability that article A is relevant" but there is also "Is A or B a higher quality article?" One Google innovation was using a document quality score (PageRank) alongside a normal document-query ranking, which is tricky because now you're not optimizing for one thing but trying to optimize for two things that could compete with each other. I am thinking about switching that system from a batch to a streaming mode and need some answers for that.
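Two common ways to handle a pair of competing scores are a weighted blend and a Pareto front. Neither is claimed to be what Google or YOShInOn does; the sketch below just shows the two options side by side, and the `weight` of 0.7 and the example scores are arbitrary assumptions.

```python
def blended_score(relevance, quality, weight=0.7):
    """Collapse two objectives into one number; `weight` encodes the trade-off."""
    return weight * relevance + (1 - weight) * quality

def pareto_front(items):
    """Keep items that no other item beats on both relevance and quality.

    items: list of (name, relevance, quality) tuples.
    """
    front = []
    for name, r, q in items:
        dominated = any(
            r2 >= r and q2 >= q and (r2 > r or q2 > q)
            for _, r2, q2 in items
        )
        if not dominated:
            front.append(name)
    return front

# Hypothetical (name, relevance, quality) scores.
items = [("A", 0.9, 0.3), ("B", 0.6, 0.8), ("C", 0.5, 0.5), ("D", 0.9, 0.2)]
front = pareto_front(items)
score_a = blended_score(0.9, 0.3)
```

The blend is simpler and works one item at a time, which suits streaming, but it forces a fixed exchange rate between the two scores. The Pareto front avoids picking a weight but needs a set of candidates to compare, which fits a batch better.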
I think what you mean is OP is going to answer their own question with something selling resources to help others "make $10k/month with AI apps"? Whether or not that's the intention, this type of question does attract snake oil sales pitches.
We do general AI/LLM consulting in the data space - not so much generative text, more along the lines of analysis, indexing, and search. Our path to customer happiness is simply to produce systems that actually work!