It's possible to run this query against the full github dataset but I couldn't figure out how to pay for it, so if somebody wants to do that it would be excellent.
Just a note: it's bizarre that I absolutely cannot find a way to determine (a) how much it would cost to run or (b) how I would pay for it if I wanted to run it.
I changed it to query from [bigquery-public-data:github_repos.contents] instead, and before I execute the query it says "Valid: This query will process 1.68 TB when run.".
Since the Java source is open, it's all there to be peer-reviewed. If a paper it's based on isn't the best, you can make some noise about it. This is a good situation for Java.
Some more found by a quick grep for "et al.", "Proceedings", "Proc. ", "Symposium", "Conference", "Conf. ", "PPoPP" (a conference with an easy-to-grep-for name), and "acm.org":
hotspot/src/cpu/ppc/vm/ppc.ad: See J.M.Tendler et al. "Power4 system microarchitecture", IBM J. Res. & Dev., No. 1, Jan. 2002.
hotspot/src/cpu/x86/vm/crc32c.h: V. Gopal et al. / Fast CRC Computation for iSCSI Polynomial Using CRC32 Instruction April 2011 8
hotspot/src/share/vm/gc/shared/taskqueue.hpp: Le, N. M., Pop, A., Cohen A., and Nardell, F. Z.: Correct and efficient work-stealing for weak memory models Proceedings of the 18th ACM SIGPLAN symposium on Principles and practice of parallel programming (PPoPP 2013), 69-80
jdk/src/java.base/share/classes/java/util/Arrays.java: Peter McIlroy's "Optimistic Sorting and Information Theoretic Complexity", in Proceedings of the Fourth Annual ACM-SIAM Symposium on Discrete Algorithms, pp 467-474, January 1993
jdk/src/jdk.crypto.ec/share/native/libsunec/impl/mpmontg.c: "A Cryptogrpahic Library for the Motorola DSP56000" by Stephen R. Dusse' and Burton S. Kaliski Jr. published in "Advances in Cryptology: Proceedings of EUROCRYPT '90, LNCS volume 473, 1991, pg 230-244
hotspot/src/share/vm/opto/superword.hpp: "Exploiting SuperWord Level Parallelism with Multimedia Instruction Sets" by Samuel Larsen and Saman Amarasinghe [...] published in ACM SIGPLAN Notices, Proceedings of ACM PLDI '00, Volume 35 Issue 5
jdk/src/java.base/share/classes/java/util/SplittableRandom.java: Leiserson, Schardl, and Sukha "Deterministic Parallel Random-Number Generation for Dynamic-Multithreading Platforms", PPoPP 2012
jdk/src/java.base/share/classes/java/util/SplittableRandom.java: "Parallel random numbers: as easy as 1, 2, 3" by Salmon, Morae, Dror, and Shaw, SC 2011
jdk/src/java.base/share/classes/java/util/concurrent/ForkJoinPool.java: "Dynamic Circular Work-Stealing Deque" by Chase and Lev, SPAA 2005
jdk/src/java.base/share/classes/java/util/concurrent/ForkJoinPool.java: "Idempotent work stealing" by Michael, Saraswat, and Vechev, PPoPP 2009
jdk/src/java.base/share/classes/java/util/concurrent/ForkJoinPool.java: "Leapfrogging: a portable technique for implementing efficient futures" by D.B. Wagner and B.G. Calder, PPoPP '93, http://dl.acm.org/citation.cfm?id=155354
jdk/src/java.base/share/classes/java/util/concurrent/LinkedTransferQueue.java: Using elimination to implement scalable and lock-free FIFO queues, Moir et al, http://portal.acm.org/citation.cfm?id=1074013
jdk/src/java.base/share/classes/java/util/concurrent/LinkedTransferQueue.java: "Bounding space usage of conservative garbage collectors", HJ Boehm, http://portal.acm.org/citation.cfm?doid=503272.503282 (this is the Boehm GC paper)
jdk/src/java.base/share/classes/java/util/concurrent/locks/StampedLock.java: Design, verification and applications of a new read-write lock algorithm, Shirako et al, SPAA 2012
hotspot/src/share/vm/opto/escape.hpp: Jong-Deok Shoi, Manish Gupta, Mauricio Seffano, Vugranam C. Sreedhar, Sam Midkiff: "Escape Analysis for Java", Procedings of ACM SIGPLAN OOPSLA Conference, November 1, 1999
hotspot/src/share/vm/runtime/os.cpp: Gilad Bracha and David Ungar: "Mirrors: Design Principles for Meta-level Facilities of Object-Oriented Programming Languages", in Proc. of the ACM Conf. on Object-Oriented Programming, Systems, Languages and Applications, October 2004
jdk/src/jdk.crypto.ec/share/native/libsunec/impl/ec_naf.c: D. Hankerson, J. Hernandez and A. Menezes, "Software implementation of elliptic curve cryptography over binary fields", Proc. CHES 2000
jdk/src/java.base/share/classes/java/util/concurrent/SynchronousQueue.java: "Nonblocking Concurrent Objects with Condition Synchronization", by W. N. Scherer III and M. L. Scott. 18th Annual Conf. on Distributed Computing, Oct. 2004
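For anyone who wants to repeat the search above on another source tree, here is a rough Python sketch of the same idea. It treats the listed search strings as literal text and reports every line containing one of them. The function names and the directory walk are my own assumptions, not what the commenter actually ran (they used grep), and it does nothing to de-duplicate hits or stitch together multi-line citations.

```python
import os
import re

# The search strings listed in the comment above, matched as literal text.
PATTERNS = ["et al.", "Proceedings", "Proc. ", "Symposium",
            "Conference", "Conf. ", "PPoPP", "acm.org"]
CITATION_RE = re.compile("|".join(re.escape(p) for p in PATTERNS))


def find_citation_lines(text):
    """Return (line_number, stripped_line) for lines containing any pattern."""
    return [(n, line.strip())
            for n, line in enumerate(text.splitlines(), start=1)
            if CITATION_RE.search(line)]


def scan_tree(root):
    """Yield (path, line_number, line) for every hit under root.

    Hypothetical helper: walks every file, so you may want to filter by
    extension on a large checkout.
    """
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            try:
                with open(path, encoding="utf-8", errors="replace") as fh:
                    text = fh.read()
            except OSError:
                continue  # unreadable file: skip it
            for n, line in find_citation_lines(text):
                yield path, n, line
```

Something like `for hit in scan_tree("hotspot/src"): print(*hit)` would then print the candidate lines, which still need a human to pick out the real references.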
Ah, sorry, I didn't really check for dupes---I just skipped the ones with a pdf link in the vicinity. I'm just glad that sometimes the clever things that academics churn out are actually used in practice. Far too rarely if you ask me, but I'm biased of course ;)
I had to cite sources while implementing an artificial immune system (real valued negative selection and clonal selection algorithms). I read through a few papers for each algorithm and cited the clearest one as a source.
Seconded! I really like this compilation. Very interesting to see the algorithms and data structures behind the implementation of a language, especially one of the more popular ones.
To be fair, sometimes code and comments get moved around, and any of us can use grep (or whatever other search tool you prefer) to find a specific link in the source.
About 99% of Linux (or even more) is drivers. But indeed there should be useful references in the scheduler, locking primitives, memory management and core networking code.
Asking as someone not as familiar with the research community as I'd like to be, what are DOI files, what advantages do they have over PDF/PostScript, and are they common?
It's not a file format, it's a digital identifier. The APA can explain it better than I can:
"A digital object identifier (DOI) is a unique alphanumeric string assigned by a registration agency (the International DOI Foundation) to identify content and provide a persistent link to its location on the Internet. The publisher assigns a DOI when your article is published and made available electronically."
So you can access a journal article by going to http://dx.doi.org/DOI-GOES-HERE. doi.org doesn't host files; it just resolves a DOI to the current and correct location. For example, the DOI for the first paper in TFA is 10.1007/11427186_42, so it can be accessed at http://dx.doi.org/10.1007/11427186_42
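To make the "identifier, not a file" point concrete, here is a minimal Python sketch that turns a DOI string into the resolver URL described above. The function name and the handling of a leading "doi:" prefix are my own assumptions; the resolver prefix is the one from this comment. It only builds the URL and makes no network request.

```python
from urllib.parse import quote

# Resolver prefix as given in the comment above.
RESOLVER = "http://dx.doi.org/"


def doi_to_url(doi):
    """Build the resolver URL for a DOI (hypothetical helper)."""
    doi = doi.strip()
    # Assumption: tolerate the common "doi:" prefix some citations carry.
    if doi.lower().startswith("doi:"):
        doi = doi[4:].strip()
    # Keep "/" as-is; percent-encode anything else that is unsafe in a URL.
    return RESOLVER + quote(doi, safe="/")


print(doi_to_url("10.1007/11427186_42"))
```

Opening the resulting URL in a browser should redirect to wherever the publisher currently hosts the article.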