They give examples of LinkedIn (people you may know) and Amazon (presumably other people who bought this, so-and-so's list of such-a-subject books).
That makes sense, though the segment of businesses that may actually benefit seems limited. Social stuff, sure. Most of us? What's the minimum recommendable-entity/category-or-user threshold that this makes sense for? Is success with these sorts of engines merely a reflection of poor UI design in your normal UX? (Of the above examples, the first seems very unidimensional, in that it's basically a simple graph distance, and the latter also rather rudimentary and often irrelevant.)
So what exactly is this thing providing? Graph analysis? I think not. It reads more like processing of raw, timestamped user behavioural event data to infer relationships between users or the products they interact with. Reading through the docs, it seems this is a layer on top of Apache Pig (https://pig.apache.org/) - a high-level language for expressing data analysis programs, coupled with infrastructure for evaluating these programs. I think the explanation of this thing could be clearer, particularly in selling what a recommendation is and when it's useful. Using phrases like "award winning" doesn't help.
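To make the "infer relationships from timestamped event data" idea concrete, here is a minimal sketch in plain Python. It is not Mortar's code or algorithm, just a generic item co-occurrence count; the event log, the names `item_cooccurrence` and `recommend`, and the scoring are all assumptions made up for illustration.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical event log: (timestamp, user, item). All values are made up.
EVENTS = [
    (1, "ann", "book_a"),
    (2, "ann", "book_b"),
    (3, "bob", "book_a"),
    (4, "bob", "book_b"),
    (5, "bob", "book_c"),
    (6, "cat", "book_b"),
    (7, "cat", "book_c"),
]


def item_cooccurrence(events):
    """Count, for each ordered pair of items, how many users touched both."""
    items_by_user = defaultdict(set)
    for _timestamp, user, item in events:
        items_by_user[user].add(item)

    counts = defaultdict(int)
    for items in items_by_user.values():
        # Every pair of items seen by the same user counts as one link.
        for a, b in combinations(sorted(items), 2):
            counts[(a, b)] += 1
            counts[(b, a)] += 1
    return counts


def recommend(item, counts, top_n=3):
    """Rank other items by how often they co-occur with `item`."""
    scored = [(other, n) for (src, other), n in counts.items() if src == item]
    # Highest count first; ties broken alphabetically for stable output.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:top_n]


if __name__ == "__main__":
    counts = item_cooccurrence(EVENTS)
    print(recommend("book_a", counts))  # [('book_b', 2), ('book_c', 1)]
```

The timestamps are ignored here; a real engine would presumably weight recent events more heavily and normalise for very popular items, which this sketch does not attempt.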
PS. Why all the downvotes? Sheesh.
Imagine opening an advanced textbook on a subject you don't understand, reading two paragraphs of it, and throwing up your hands in disgust because what does this even mean?
Could anyone summarize the difference between Pig and LensKit when applied to recommendation systems?
Any business where you have a large catalog that users are going to want to filter through. This gives you the ability to offer a shortcut to things they might find interesting. Other examples would be Netflix, Spotify, app stores, or Coursera.
A Mortar account. You can sign up for a free Public account with Mortar here. If you want to keep your customized recommendation engine code private, you will need a Solo-level account ($99/month). Beyond that, you'll only pay for your actual usage of AWS cloud services (we never add an upcharge).
Kudos for the open source, but it looks like to actually use this for business you'll still need to pay. Unless I'm misreading it, "Open source but you'll still have to go through our platform" is pretty disingenuous.
It's just that making a big press release and blog post that brags about open sourcing, vs the reality that you can't actually do anything substantial with the code without paying for it... it seems off to me.
I get what they're trying to do, but to me the whole point of OS code is that you can self-host, and/or modify it for business use if you so choose.
To me this would be better served by advertising "We like you so much, we're giving away access to our service for free for noncommercial and test use, and opening up the code to the library so you can see how it works", but that's less interesting as click bait.
Maybe I'm just misreading the whole thing and you can self-host.
1. Everything in this GitHub repository (https://github.com/mortardata/mortar-recsys) appears to be truly open - it's just a bunch of Pig scripts, some Java UDF definitions, and some Python management code. There don't appear to be any dependencies on anything proprietary to MortarData. All the code is licensed under the Apache 2.0 license.
2. The blog post states: "You can run this code anywhere. It’s built on widely-adopted open source technologies—Hadoop, Pig, and Python. But we think you’ll want to use our platform."
I think it's a nice model actually.
So either Mortar open-sourced some feature-crippled fragment of their platform that relies heavily on features of the proprietary platform, or the stated requirement of a Mortar account is a property of the tutorial's approach, not of the open-sourced code itself.
Those who know what Hadoop, Pig, and the whole "Data Science Stack" are will surely find this useful.