
Ask HN: Are Wikipedia's costs higher for page views or page edits? - 19eightyfour
I am trying to understand the economics of hosting a large-scale wiki system. Is it more expensive for Wikipedia to serve page views, or to serve page edits?

I don't know enough about this to calculate it myself. My guess is that edits are more expensive (diffs, history, few caching opportunities) but less numerous, while views are more numerous but less expensive. I don't have the experience or knowledge to know where these numbers sit, though.

I'm hoping someone can speak to this or suggest some information that would make it possible to estimate.
======
theprotocol
• Wikimedia keeps detailed statistics that look like what you want, but I
haven't dug deep, so they may be lacking in some way.

[https://en.wikipedia.org/wiki/Wikipedia:Web_statistics_tool](https://en.wikipedia.org/wiki/Wikipedia:Web_statistics_tool)

[https://stats.wikimedia.org/EN/](https://stats.wikimedia.org/EN/)

• There is also an API:
[https://en.wikipedia.org/w/api.php](https://en.wikipedia.org/w/api.php)

You could probably get an estimate of edits per day by requesting versioning
metadata (i.e. each version being an edit) on a sample of documents.
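A minimal sketch of that sampling idea, assuming the standard MediaWiki Action API parameters (`action=query`, `prop=revisions`, `rvprop=timestamp`). The helper names `revisions_url`, `extract_timestamps` and `edits_per_day` are made up for illustration, and the code only builds the request URL and processes an already-fetched JSON response, so fetching and paging through results is left out:

```python
from datetime import datetime
from urllib.parse import urlencode

API_URL = "https://en.wikipedia.org/w/api.php"


def revisions_url(title, limit=500):
    """Build a query URL asking for revision timestamps of one page."""
    params = {
        "action": "query",
        "prop": "revisions",
        "titles": title,
        "rvprop": "timestamp",
        "rvlimit": limit,  # the API caps this per request
        "format": "json",
    }
    return API_URL + "?" + urlencode(params)


def extract_timestamps(response):
    """Collect revision timestamps from a decoded JSON response."""
    stamps = []
    for page in response.get("query", {}).get("pages", {}).values():
        for rev in page.get("revisions", []):
            stamps.append(rev["timestamp"])
    return stamps


def edits_per_day(timestamps):
    """Average edits per day over the time span the timestamps cover."""
    times = sorted(
        datetime.strptime(t, "%Y-%m-%dT%H:%M:%SZ") for t in timestamps
    )
    if len(times) < 2:
        return 0.0  # not enough data to estimate a rate
    span_days = (times[-1] - times[0]).total_seconds() / 86400
    if span_days == 0:
        return 0.0
    # n timestamps delimit n - 1 intervals between edits
    return (len(times) - 1) / span_days
```

Averaging `edits_per_day` over a random sample of titles gives a rough per-page edit rate. It will be skewed if the sample only covers popular pages.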

• I think it will be essential for you to run tests using the actual
configuration that you plan to have. Server hardware and the software stack
and their respective configurations are a significant factor in the cost of
running something like this and are nearly impossible to estimate. There are
cases where modifying a single configuration flag can multiply the requests
per second you're able to serve.

• Additionally, you should probably factor in business-level projections based
on the market your project is aimed at. Ask yourself: will your project be as
popular as Wikipedia? Will your users have the same ratio of pure consumers to
editors as Wikipedia?

Consider identifying user characteristics in your market segments which
correlate with whether someone is a pure consumer vs. an editor.

------
tinus_hn
The most expensive are logged-in views, because these are custom-rendered
pages that can't be completely cached, and they happen far more often than
edits.
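
A back-of-the-envelope sketch of that reasoning: backend cost per request class is roughly request count times cache miss rate times cost per uncached request. Every number below is a made-up placeholder, not a Wikipedia measurement, and `backend_cost` and `most_expensive` are hypothetical helper names:

```python
def backend_cost(requests, cache_hit_rate, cost_per_miss):
    """Backend work for one request class; only cache misses reach the app servers."""
    return requests * (1.0 - cache_hit_rate) * cost_per_miss


def most_expensive(costs):
    """Return the name of the request class with the highest backend cost."""
    return max(costs, key=costs.get)


# Placeholder workload (assumed figures, in arbitrary cost units):
# anonymous views are mostly served from cache, logged-in views and
# edits are not cached at all, and an edit costs more per request.
workload = {
    "anonymous views": backend_cost(1_000_000, 0.95, 1.0),
    "logged-in views": backend_cost(100_000, 0.0, 1.0),
    "edits": backend_cost(1_000, 0.0, 5.0),
}

if __name__ == "__main__":
    for name, cost in workload.items():
        print(f"{name}: {cost:.0f}")
    print("most expensive:", most_expensive(workload))
```

With these assumed inputs the uncached logged-in views dominate, but the ranking depends entirely on the volumes, hit rates and per-request costs you plug in, so measure your own.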

Anyway, MediaWiki runs pretty well on basic hardware, but if you expect it to
get really big you will have to think about scaling horizontally.

