I don't know about other fields of physics, but in astro, most of the data is free access as well. I personally work only with public data and I'm paid to do it. One of the strings attached to government funding from the EU or NSF is usually a mandated free-access database.
Sometimes I take for granted the fact that my morning ritual involves reading every publication in my field from the day before, without a license. And then I download some free data, program in my free languages, write in my free LaTeX editor, and then publish my work for free in a place where anyone can read it. It's utopian.
edit: two archives with data from a lot of different missions, for example:
True story. He had a hobby going in a public storage unit with a surplus military linear accelerator. Smallish. About 30 feet long. Of course it required huge amounts of power, so he cut a hole in the unit, ran a line to the nearest pole, and siphoned 480 volts off the mains. And the gamma radiation was very dangerous, so he hauled in several tons of lead destined for EPA long-term sequestering. We worked one summer building shielding walls and measuring the operational radiation. After the unit was 'safely' running, we would take various pieces of thrown-away Lucite from the physics machine shop and turn them into polished beam trees (Google it). We then gave them away as Christmas gifts. What fun for a 10-year-old kid!
PS. I'm not sure which paper in the arXiv has the greatest number of citations. I don't think either of these papers is there.
Also, absolute limits on page count are really common in CS, and the page counts tend to be pretty low. In other areas, journals might allow for more citations, or citations might not count toward the page count.
Also, arxiv is physics-biased.
Have there been any significant CS papers published in the last ~5 years that aren't on Arxiv?
The last thing I remember not being there was a set of papers on IBM Watson that were published in an IBM Systems Engineering Journal.
I have a feeling that some papers out of Microsoft tend not to end up there too, but I can't think of a specific example.
Even if not, there might be insignificant CS papers not indexed by Arxiv which cite significant papers which are indexed ;) This makes the citation counts comparatively lower if most insignificant physics papers are in Arxiv.
That said, it doesn't surprise me much that worldwide there are still more people working in physics, biology or mathematics than in CS.
I found 38k citations for The Nature of Statistical Learning Theory: https://scholar.google.com/scholar?cluster=86085598803682809...
And 25k for Statistical Learning Theory: https://scholar.google.com/scholar?cluster=86748554971781655...
My initial estimation of 10k was from a CiteSeer list that I didn't realize was limited to only documents in the CiteSeer database: http://citeseer.ist.psu.edu/stats/articles
I was thinking while navigating this that, if I were researching something related to physics, etc., this would be much better than using a search engine, because you might not know exactly what you want to look for until you see it.
But what does (x,y) position mean? If two papers are close on the map are they also close in some other aspect?
I mean, what gave this map this particular shape?
Basically, it's like having a spring between each node (paper) and letting the equilibrium do the rest.
> In laying out the map, an N-body algorithm is run to determine positions based on references between the papers. There are two “forces” involved in the N-body calculation: each paper is repelled from all other papers using an anti-gravity inverse-distance force, and each paper is attracted to all of its references using a spring modelled by Hooke’s law.
However, it must have taken a while to converge for 10^6 particles.
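For anyone who wants to play with the idea, here is a minimal sketch of the two forces described in the quote. It is not the map's actual code: the function name `layout`, the parameter values, the zero rest length of the springs, and the simple overdamped update (each step moves a node by `dt` times the net force) are all my own assumptions.

```python
import math
import random


def layout(n_nodes, references, steps=500, repulsion=1.0, spring_k=0.1,
           dt=0.05, seed=0):
    """Toy 2D layout: inverse-distance repulsion plus Hooke springs.

    references is a list of (paper, cited_paper) index pairs.
    All names and default values here are illustrative assumptions.
    """
    rng = random.Random(seed)
    pos = [[rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)]
           for _ in range(n_nodes)]
    for _ in range(steps):
        force = [[0.0, 0.0] for _ in range(n_nodes)]
        # Repulsion between every pair, magnitude repulsion / distance.
        # This is the O(n^2) part that makes large graphs expensive.
        for i in range(n_nodes):
            for j in range(i + 1, n_nodes):
                dx = pos[i][0] - pos[j][0]
                dy = pos[i][1] - pos[j][1]
                d2 = dx * dx + dy * dy + 1e-9  # avoid division by zero
                fx = repulsion * dx / d2
                fy = repulsion * dy / d2
                force[i][0] += fx
                force[i][1] += fy
                force[j][0] -= fx
                force[j][1] -= fy
        # Attraction along each reference: Hooke's law, F = k * separation
        # (zero rest length is an assumption, not taken from the quote).
        for a, b in references:
            dx = pos[b][0] - pos[a][0]
            dy = pos[b][1] - pos[a][1]
            force[a][0] += spring_k * dx
            force[a][1] += spring_k * dy
            force[b][0] -= spring_k * dx
            force[b][1] -= spring_k * dy
        # Overdamped update: move each node a small step along its net force.
        for i in range(n_nodes):
            pos[i][0] += dt * force[i][0]
            pos[i][1] += dt * force[i][1]
    return pos


if __name__ == "__main__":
    # Two papers, one citing the other: they settle where the two forces
    # balance, i.e. at a separation of sqrt(repulsion / spring_k).
    p = layout(2, [(0, 1)], steps=2000)
    print(math.hypot(p[0][0] - p[1][0], p[0][1] - p[1][1]))
```

With every pair repelling, each step costs O(n^2), so this toy version is only usable for a few thousand nodes. At 10^6 papers you would presumably need a tree-based approximation such as Barnes-Hut, though the quote doesn't say what was actually used.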
Maybe not to the same extent as physics people, but there is still a lot of CS on arXiv. More so in some subfields than others, but there's a pretty steady stream of CS papers showing up there. Enough that one person can't keep up with reading and digesting all of them as they appear.
That said, I don't disagree that there's a lot more physics than CS on arXiv. :-) I'm just not sure if that's because CS people don't upload to arXiv, or because CS people publish fewer papers in general, or "other".
Can you please enlighten us about the technical details behind the scenes, all the way from collecting the data to processing it?
I'm also working with a large graph entity and would love to read about your process.
EDIT: Site is probably getting hammered, I just needed to wait a minute for everything to load.
EDIT: Clicking a paper and then "(citations)" will show you the one-level graph of citations, and under the search bar you can see how many results there were.