Everyone who works at Academia would love it if we were able to make advanced search free.
When we first decided to build a premium account, we also made the decision to not take anything out of the free account. Strange as it may seem, the free account never had full-text search because we couldn't justify the cost of building it (full-text search of 20MM PDFs at our traffic levels is expensive to operate). We built it for the premium account because people asked for it in our initial research - and we would love to be able to eventually move it into the free account.
On the team we all agree that we want to keep building premium features in order to make the platform sustainable. The author of this article takes the view that advanced search is not a feature that should be paid-for. My view is that we intend to keep building features until we have something that is worthy of his support. The support of the academics who use and enjoy the platform, both in free and paid accounts, is what will keep it around and growing for the long term.
How much money do you need? Publish your revenue model and stuff so people can help you with it.
Look, what this boils down to is that Academia.edu is a private venture; people are investing in it in the hope of being able to make a profit later. Here the firm says they want to bring as much information as possible to as wide an audience as possible, but need to raise additional money to add this (rather obvious) functionality. How much? That's a secret, because revealing it might reduce the profit the investors are hoping to get out of it.
I'm very much in favor of the stated goal of disrupting the existing academic publishing/cataloging oligopolies, but if that's really the priority, then stop being so secretive. And if profitability and becoming the new monopolistic incumbent is really the priority, then stop bullshitting me with feel-good mission statements and just cold-call more rich people until someone writes a check.
I know I'm pressing things very bluntly, but I don't think the habit of corporate doublespeak that has become the norm in society is actually doing anyone much good, including the people engaging in it. Nobody really wants to get up in the morning and spend their day bullshitting people with cliches, that doesn't create value for anyone.
Charities do that all the time. Some are well funded exactly for this reason. Most people understand that good things cost money when thinking long term.
HN is not an audience of that kind of people.
I've gotta say I'm getting real tired of entrepreneurs that want to be everybody's friend when they're getting their exciting new venture off the ground but are too cool to discuss the nuances of business with anyone outside the VC bubble whenever they run into a PR problem.
That's a general statement, not directed at the academia.edu team. It's a sad reality that a lot of what passes for entrepreneurship today involves telling users, employees, and investors that they're each the most important group so as to make as many people as possible happy, right up to the point where a conflict of interest emerges and then trying to obscure the fact of its existence with platitudes.
> full-text search of 20MM PDFs at our traffic levels is expensive to operate
Assuming you convert them to text once, index them, and put them in a standard FTS engine, I'd guess it is on the order of 100GB-1TB of text (max), plus some more for the index (basing these estimates on my experience text mining PubMed Central and MEDLINE). So it can all fit on a pretty standard server. Maybe at 100 req/s it would take a few. Yes, you'd want replication.
The number of servers required to get good latency FTS is the part of this that I'm least familiar with. Anyone have a ballpark, given these estimates, on what kind of hardware would be required? (I could easily be wrong, and this may indeed be very expensive. If so, I'd be curious about ballpark numbers.)
One 8c server (running about 10 workers) can handle 8 to 10qps, depending on the depth required. This is on an index of 20 million documents. If the number of workers is constant, doubling the index will halve the qps. 2 million docs take about 50GB of disk space (20 million = 500GB, 1TB with redundancy).
It's better to go with SSD arrays here, since random IOPs are much higher than for other workloads. This can skyrocket cost.
So for this (our) system, it could be as cheap as $1k for the hardware, e.g. using the Foxconn Purus cloud server: http://www.bargainhardware.co.uk/cheap-e5-2600-lga2011-sixte...
Just to make sure I understand correctly, that's about $1-2K of one time cost per 10qps (w/o SSD, and not counting power and maintenance, etc)? When I first saw "cloud server", I thought that was a per-month rental cost, but the link is for actual in-house hardware. If this is even close to correct, my suspicions seem confirmed.
Except for one thing. I have no idea how many qps a site like Academia would have. 100qps was completely out of my ass, but it seemed hard to imagine it being any more than 1-2 orders of magnitude higher, at most. Any guess on that?
You can also do this with virtual public clouds (AWS, linode, GCP et al), but you'll of course pay a premium for the infrastructure. This might be worth it though, because you can now scale within seconds to handle qps bursts. Usually, latency can be lowered by going baremetal (see e.g. Algolia).
Academia should be able to handle more qps than our system, because the queries are really trivial in comparison. With decent caching, an 8c should be able to do 50 to 80qps. That's what I get from a few experiments when I switch my test cluster into restricted mode (basically just substring search).
Of course I can only speak from my experience, not how this can be applied to Academia's existing infrastructure. Testing large search engine deployments can be really, really frustrating.
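To make the back-of-the-envelope numbers in this subthread easier to play with, here is a rough sketch in Python. It only restates the figures quoted above (about 50GB per 2 million docs, 8 to 10qps per uncached 8c server, 50 to 80qps with caching, roughly $1-2K of one-time hardware cost per server). The function names are made up for illustration, the qps targets are guesses, and SSDs, power, and maintenance are ignored.

```python
import math

# Figures quoted in the thread; rough estimates, not measurements.
GB_PER_MILLION_DOCS = 25.0            # 2 million docs ~ 50 GB of index
COST_PER_SERVER_USD = (1_000, 2_000)  # assumed one-time cost range per 8-core server


def index_size_gb(million_docs, replicas=1):
    """Disk needed for the index, multiplied by the number of copies kept."""
    return million_docs * GB_PER_MILLION_DOCS * replicas


def servers_needed(target_qps, qps_per_server):
    """Servers required to sustain target_qps, rounding up."""
    return math.ceil(target_qps / qps_per_server)


def hardware_cost_range(target_qps, qps_per_server):
    """(low, high) one-time hardware cost in USD for the required servers."""
    n = servers_needed(target_qps, qps_per_server)
    low, high = COST_PER_SERVER_USD
    return n * low, n * high


if __name__ == "__main__":
    print("index:", index_size_gb(20), "GB,",
          index_size_gb(20, replicas=2), "GB with one redundant copy")
    # 100 qps is the guess from the thread; the larger values are
    # the "1-2 orders of magnitude higher" cases.
    for qps in (100, 1_000, 10_000):
        for label, rate in (("uncached", 8), ("cached", 50)):
            n = servers_needed(qps, rate)
            low, high = hardware_cost_range(qps, rate)
            print(f"{qps:>6} qps, {label} at {rate} qps/server: "
                  f"{n} servers, ${low:,}-${high:,}")
```

With the pessimistic uncached rate of 8qps per server, the 100qps guess works out to 13 servers. Whether the real load is anywhere near that is exactly the open question above.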
Full-text search is freely provided by Google. Judging by how many people caught on to using a similar trick with WSJ, it will start to become a very popular way of interacting with the site.
Also, that sounds like an assumption. Do you have anything concrete to back that up? Assumptions are often broken in painful ways for businesses. You might be surprised how many academics are motivated to learn about Google's advanced search and tell their friends about it if it saves them money.
…which means that you wrote them and he, ah, "co-authored" them?
Will Academia ever stop impersonating people in e-mail? I receive e-mails all the time purporting to be actually sent by academic colleagues, which are instead form messages sent by Academia.edu.
I know that you're not sending these messages with permission. I know that my dead co-author is not giving you permission to send unsettling e-mails from him.
We only ever send emails from users in response to a request from them to do so (e.g. they send a message or invite a co-author). Please send me the email and I will look into it.
Academia's e-mails have appropriate From: lines, and don't appear to be sent from beyond the grave, and you deserve credit for that.
Why not do a free full-text search on the abstract or first couple of pages (or few hundred words)?
This might make the premium search upgrade more subtle but I wonder if it could help the majority (?) of legitimate searches get the hit they expected/hoped for.
... Perhaps this would appease the haters?
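For what it's worth, the "index only the first few hundred words" idea is cheap to prototype. Below is a toy sketch in Python of an in-memory inverted index that only looks at the first `max_words` words of each document. Everything here (the function names, the 300-word default, the AND-only query semantics) is an assumption for illustration; it says nothing about how Academia's search actually works.

```python
import re
from collections import defaultdict

WORD_RE = re.compile(r"[a-z0-9]+")


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return WORD_RE.findall(text.lower())


def build_truncated_index(docs, max_words=300):
    """Map each word to the ids of docs whose first max_words words contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in tokenize(text)[:max_words]:
            index[word].add(doc_id)
    return index


def search(index, query):
    """Return ids of docs that contain every query word (AND semantics)."""
    words = tokenize(query)
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


if __name__ == "__main__":
    docs = {
        "p1": "Graph theory abstract. " + "filler " * 400 + "rare keyword zeta",
        "p2": "Zeta functions in number theory",
    }
    index = build_truncated_index(docs, max_words=10)
    print(search(index, "graph theory"))  # matches p1 via its opening words
    print(search(index, "zeta"))          # only p2; p1 mentions it too deep in the text
```

The trade-off is visible in the second query: anything that appears only deep in the body is not found, which is where a paid full-text tier would still differ.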
As Richard mentions in another thread, search is not actually the primary discovery mechanism on Academia. The social features (the news feed, bookmarks, sharing, recommendations) are the primary discovery mechanism. These are all free and we want to keep them that way.
Is there any concern that paywalling features will reduce the number of users on the site?
I also found it weird that they had a .edu address (apparently they bought it before the rules were put in place and so they are grandfathered in)
Whenever I searched for a paper and google found an academia.edu link my process was, click on the link, oh crap it wants me to log in, what's my password again? fuck it, I'll just go to scihub.
What I sent EDUCAUSE:
The owner of the ‘academia.edu’ domain name appears to be operating in violation of the EDU TLD requirements.
The WHOIS record:
251 Kearny St
San Francisco, CA 94108
Is not an accredited institution according to the US Dept. of Education.
If so, I imagine that involved a "transfer" from his personal ownership to that of the company and that may fall under that 2006 rule, but it still isn't the same situation as someone "squatting" on an edu name and then auctioning it off to startups looking for a cool name, which is what I think EDUCAUSE is concerned with.
If a professor or somebody bought in May 1999 when it was registered initially (10-May-1999 Create Date), then it existed prior to the restriction. But in 2006, 2 years before Academia the company existed, those rules were put into place prohibiting transfer to non-accredited institutions. So, they're in one of two situations that I can see:
1) The domain was transferred, against the ToS for the TLD, and is invalidly registered, or,
2) The professor still owns it, and the WHOIS information is invalid (it lists the Company as the registrant), which is a violation of ICANN rules for WHOIS information accuracy.
Either way, I don't see how they can be legally using it.
I don't think this applies to places like academia.edu or Researchgate, but IANAL.
The academic world is suffering from the same tendency to keep within a walled garden as the rest of society. It's hard these days for anyone to get people to regularly visit their blog even if the quality of the posts is high, instead people only click on links shown to them on Facebook. Similarly, my colleagues in my field generally don't visit my own website even though they are aware it's there and has links to my papers and other relevant content, but if I post a new paper on Academia.edu, I immediately get a number of views from there. It’s sad.
I try to follow a certain academic field from the outside, but there are too many blogs, old online journal archives, websites, etc. out there for me to keep track of or even find, so I end up reading the few blogs that update somewhat regularly and academia.edu and that's it.
The author of the article objects to the introduction of a freemium model on Academia.edu, and specifically to the idea of advanced search being in the premium set of features.
Academia’s mission is to get every academic paper ever written on the internet, available for free, and to develop a more rigorous and efficient peer review system. Free access to academic research makes the world a more equitable place, and rigorous and efficient peer review accelerates the pace of scientific and scholarly research.
Academia launched in 2008, and now around 35 million people use the site a month. Around 19 million papers have been uploaded, and are freely available on the platform. Forty percent of Academia’s users are from developing countries and would otherwise have limited access to academic research.
In order to achieve Academia’s goals, Academia is working to become a sustainable operation. In order to do that, we introduced Academia Premium, which includes extra features such as Readers, Mentions, Expanded Analytics and Advanced Search.
- Readers tells you who is reading your papers
- Mentions alerts you when papers are uploaded mentioning your name
- Expanded Analytics provides a more detailed look at what kinds of people visit your profile and how people find your papers
- Advanced Search allows you to find exact keyword matches in the full text of every paper on Academia
In considering what features to offer in the Premium account, we decided that features related to the mission of the company should be free. This means that features related to open access (free access to uploading and downloading papers), and features related to peer review (sessions, recommendations) are free. As we add new features to Premium, we will continue to ensure that features related to the mission (open access, sessions, recommendations) are free.
We are grateful to all the academics who have contributed to Academia, and we will continue to serve academics from around the world.
> In considering what features to offer in the Premium account, we decided that features related to the mission of the company should be free
That's absurd. Open access to an ocean of articles without the ability to search through them is meaningless.
While you do allow search in titles, thus providing some ability to locate relevant material, presenting content search as some premium feature in 2017 is ridiculous.
As an academic mathematician, I disagree. The ability to search through articles' content is great, but being able to access an article given its authors and title would already be quite valuable.
I hope that academic researchers eventually converge on a completely open-access model, involving commercial enterprises to the minimum extent possible.
But researchers are too preoccupied with other things to make this a priority. For the moment, I am happy to see anyone taking on entrenched interests (e.g. Elsevier), from whatever angle. The status quo is bad enough that I welcome "creative disruption" of essentially any sort.
- less creative titles
- longer titles to cram keywords in (i.e., an academic form of clickbait)
The point is artificial limitations will produce unknown effects on the system being limited. Offering premium search is an uninspired approach that will have an unpredictable impact should Academia.edu become the status quo.
I'm neither condemning nor condoning premium search. Merely warning against accepting arbitrary limitations simply because it's better than what you've currently got.
The whole reason that Elsevier et al. can make so much money is that researchers can keep doing exactly what they have always been doing. Researchers themselves don't see the publishers' invoices, let alone pay them, and they are under essentially no pressure to cut costs. In particular, Elsevier has no influence whatsoever in academia -- they just make shitloads of money.
Managing academics is often compared to "herding cats"; it is uncommon for academics to be very willing to realign themselves according to external factors. (Exception: grant funding agencies.) Anyone trying to make money in this industry should keep this in mind.
I'm pretty sure this statement is debunked by the fact that this feature is a new feature and didn't exist before for the site, unless you are suggesting that prior to this, Academia has always been meaningless.
It's a solved problem in technical terms so the only question worth asking is what resources are needed to implement it.
No. It was a piss-poor statement that deserved to be called out.
> we're used to being able to do full-text searches
Incorrect. This is a new feature on an existing site. The users of that site were not used to doing full-text searches on that site.
> the only question worth asking is what resources are needed to implement it.
Money, because doing this and doing it right costs money. And they need to get money somewhere. So they can do what Google does, and that is sell advertising. Or they can do something else, and charge for the feature.
But no, you say it's a solved problem, and you claim it can be done for free. So put up. Let's see this free PDF search solution that will scale to their needs.
C'mon, I'm waiting.
Even though advanced search was requested quite a bit by our users during our research around what features to build for our premium account, other premium features have been more heavily used than advanced search. The social mechanisms have remained the primary way that people discover research on Academia.
Regardless of whether this kind of search was available before or not, many people, like the author of the linked article, will encounter this paywall during their research.
I think you could benefit, and especially grow, a lot if academia.edu becomes a place where students actually start their research instead of landing there through an external search engine.
In 2017 fulltext search should be considered a basic feature, which according to your business model should be free. Even the online library of my university offers it, even though on a much smaller data set.
If it is true that other premium features are more heavily used than "advanced" search, then you still have enough reasons to legitimate your premium accounts without it.
I think you could find other ways to generate income and become sustainable. For academia.edu it might be premium accounts offering additional services and maybe sponsored content like placed search results, job offers etc. Use the information you generate (but please don't just sell it to a 3rd party).
And what happens when your decision/view changes because of the inability to make profit or enough profit?
Somehow I doubt the bylaws of the company say you won't ever make that stuff for-pay as well.
"Academia.edu is not a university or institution for higher learning and so under current standards it would not qualify for the '.edu' top-level domain. However, the domain name 'Academia.edu' was registered in 1999, prior to the regulations requiring .edu domain names to be held solely by accredited post-secondary institutions. All .edu domain names registered prior to 2001 were grandfathered in, even if not an accredited post-secondary institution."
"They didn't buy it before the rules were put in place. They launched in 2008 as a company, and the .edu was restricted to accredited institutions in 2001."
For example, http://science.sciencemag.org/content/339/6121/819 becomes http://science.sciencemag.org.sci-hub.io/content/339/6121/81...
It was so spammy.
I didn't get why the professor's website pointed all his research towards academia.edu.
He could have posted it on his website or github.
The website demands your info from the get-go, and you have to create an account to even get stuff. It's similar to Quora.
Quora survived with its quality like posts/answers but I still disagree with this type of model.
Today, if I open my feed, most of it is questions on personal experiences (from just now): 'What is the craziest thing you ever did when you were a teenager?', 'What surprised you most about attending graduate school in the US?', 'What is the most brutal death?'. I never specified interest in any of these topics.
What is worse to me is that I do not see any way to disable topics quickly so I have to perpetually mute high-impact posters who have attracted a large enough audience to be asked about their personal lives and seemingly enjoy answering the same things about themselves over and over. Like any web forum, the majority of replies comes from a relatively small amount of posters who keep retelling their personal story about their admission to MIT/their high IQ.
I suppose Quora is paying the price of growth and I realise my interests are not aligned with Quora's in attracting a large audience. It just means I am not personally interested in writing any content for it any more and I think many early users feel the same.
People do like to answer questions and be acknowledged for their know-how, but what people love is to talk about themselves and have their experiences validated.
Sure, Snapchat, Instagram, Facebook give a way to have your social existence acknowledged, but Quora offers anyone to have their individual life experiences validated, no matter if they have the lifestyle or looks typically associated with social media fame. That is a very powerful attractor but unfortunately brings out the result described above - users beginning to talk incessantly about themselves as a topic, the more one answers, the more one has the chance to convert to a topic oneself and have even more explicit opportunity to tell one's story.
Combine this with low traffic in topics of maybe more serious interest and Quora will suggest any popular content ('topics you might like'). This is how one ends up having these stories in one's feed without ever expressing interest in them.
If you take a look here: http://stackoverflow.com/help/dont-ask you can find more of the type of questions you can not ask.
Take a question like this: http://stackoverflow.com/questions/16318868/books-or-other-r...
And a question like this: https://www.quora.com/What-are-the-best-development-tools-fo...
Quora's utility is a tiny subset of, for example, Reddit's /r/AskReddit. Except it does away with the community vibe and what is at least somewhat a meritocracy on Reddit and replaces it with being an already-well-known figure or someone who can sell their reputation well enough. Half the questions are directed at specific people anyway, which is quite bizarre: 'What was Elon Musk's GPA at UPenn?'. If you want to have a Q&A session with a famous person, /r/AMA is much better. In general, there's nothing stopping you from asking and collecting questions and answers anywhere else on the internet, so why do we need a specialist? Quora's Q&A implementation has nothing to set it apart from SO, or even Twitter threads.
The one topic that seems to do inordinately well on Quora is questions about startups, because respondents can tout how successful they are when they give you an answer and be upvoted according to how well they do that. This is the rare case that people are actually looking for reputation rather than content. If you're looking for people with life experience, AskReddit does a great job of promoting answers where people take the time to tell a story. Subreddits (like /r/malefashionadvice, for example) do a great job of organising communities around common interest with lots of Q&As and beginner guides thrown in. If you want what Quora purports to provide, just find any old community online with the people you want to hear from and ask it there instead.
- Signing up is an exercise in spam. Most of the people who are willing to endure the whole thing are the least desirable customers (i.e., folks with more style than substance).
- Their onboarding process sucks. Most popular researchers will not jump through the hoops necessary to get their stuff on board. There is no compelling reason to be on there at this point. Within the first 30-60 secs, there is no "hell yeah!" moment.
- The issue of what constitutes "previously published" for elite journals is potentially problematic (although it shouldn't be in this case).
- The obvious gift of a research "family tree" has not been offered. Being able to identify distant relatives on this tree has ridiculously high value that has not been manifested.
- The recommendation engine sucks. For someone writing a paper on a certain topic, a.e should be able to crank out a fairly accurate list of papers you will want to read or cite before you write. This would be very high value add if done well. A.e whiffed on this one, too.
There is a huge need for what a.e is offering or could offer, but their execution has been pretty terrible. It seems like they are optimizing metrics for their funders rather than their customers.
If that site does die, I will gladly take it off of their hands, give them 10% equity to leave, and turn it into a ubiquitous resource for academics and researchers.
They really need to stop drinking the Bay Area kool aid and figure out what their core customers want... and then deliver that.
1) log in
2) search your favorite author
3) verify there are zero results in normal search and n>0 in advanced search (as the normal search is on paper titles only)
4) proceed to advanced search, find monthly subscription offer.
It doesn't feel too good having uploaded some of my articles there, and I hope other similar services (ResearchGate) won't do the same.
I want that for social (without ads/tracking/vc money/invasive monetization, and with strong privacy, author's rights and archival properties).
The cost per user of operating these social network sites must be plummeting with hardware costs. Clearly diaspora didn't quite get the formula right. Soon someone will, I hope.
Would that it were so. I think you'll find some rather strenuous objections to your moderate proposal, though that's not so much partisanship as a very loud minority of a minority of the population stridently pursuing their ideology.
Not to mention, you can still find articles for free if the keyword is in the title or keyword abstract so the $8 a month isn't necessary.
That said, I wouldn't mind seeing a rev-share model (like old platforms like Squidoo) for popular article authors who are sharing their work for free. With a growing amount of works coming from poorly paid adjuncts, while journals, publishers and institutions continue to rake in huge money, it'd be good to see some of this flowing back to the researchers who work to support the open access of knowledge.
Squidoo was created by Seth Godin (among others), IIRC. Would be interesting to know whether their biz model worked, even if for a while.
Oh well. I logged in, verified that it shows the same broken behavior, then deleted my account. I wasn't really using it anyway. There are both better search engines and better ways of self-hosting my publications.
Open access systems have a hard time getting ongoing funding; arxiv did not have the beginning of a sustainable plan before it was 20 yrs old, and it had near death experiences in the meantime.
"The problem with Academia.edu is that it is a commercial enterprise. It is not created to serve the common good – diffusing knowledge. It is also not created to serve democratic ideals, but to make money."
I wonder if this guy has a mortgage or kids, and how many unpaid hours he puts in per week "for the common good". I could understand if they were gouging people, but as you say 8 euro per month hardly seems unreasonable. The site owners are under no obligation to run a charity simply to appease Mr. Maly's morality.
Knowing that this "idealist" system appears to work, one would expect that a not-for-profit service analogous to Academia.edu would not be a completely unrealistic dream... This is why Academia.edu is disliked by the author and other academics: they are building a for-profit service on top of the not-for-profit system of academic research.
It is usually their job and career, however: they do make indirect monetary profit.
As for building for-profit services on top of not-for-profit research, I don't see the issue. Would it be equally immoral to sell or even lease Bibles to churches since churches are financially reliant on donations?
The beauty of Capitalism is it tells you what people are willing to put money/time towards. If enough of the scientific community feels so strongly that academic research should be accessible for free via some sort of charity, they are welcome to start and manage said charity and/or refuse to buy subscriptions to Academia.edu. I honestly wish them luck, a free, open service for such knowledge would be awesome! However, proclaiming that others are morally obligated to provide that charity, simply because they market their service to "charities", sounds rather arrogant to me.
If they are just going to turn into something much like the old journals, and lock our data behind a paywall, then what advantage do they provide? In particular, they themselves don't provide open access to the database they are building, so if they ever decide to take their ball and go home, we've lost everything they collected.
Usually when I want to read something academic it's about 30 eur per article. 8 eur per month seems reasonable if they provide a good service.
It's a price that readers (people who wish to access scientific articles) have to pay: this is unacceptable no matter the amount.
> [arXiv] had near death experiences in the meantime
I have never heard about this before, do you have any specific reference?
It's really, really bad. Potentially suicidally bad.
In this case the nominal price is reasonable - but the cost in lost trust and goodwill certainly isn't, and may turn out to be more than a.e can afford to pay.
Google scholar paired with sci-hub has become the fastest way to access papers given that institutional subscriptions are scatter gun.
1. Find paper on scholar.google.com
2. Find PDF or pay-wall
3. Click the sci-hub bookmarklet
If I want to discover papers from my field, I talk to my colleagues. Another walled garden I do not need.
Paid monthly subscription and analytics charges to the end user. I guess the author thinks it is the end of the platform, which is possible. I'm just hesitant to immediately conclude the business is done because of this choice. If the paid upgrades don't provide enough value, people won't sign up and maybe they will focus on another way. ResearchGate is a very similar site that sells ad space and job postings, so maybe they will head in that direction.