Having grad students help grade on paper is a consistency nightmare: it's look once, never look back. Instead, after each of several provisional passes I recreate the PDF "book" for that problem, with a chapter for each score and students randomized within each chapter. In the same spirit as "checking your work lets you work three times faster," this is actually both more consistent and faster than a single pass over paper. Almost all of my attention is on the math, which I'm good at, rather than on locating problems and re-finding the ones I know I misgraded, which I'm not good at.
Then each student's exam needs to be extracted from these problem PDFs, scores recorded, and annotations frozen.
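To make the "book" idea concrete, here is a minimal sketch of just the ordering step: one chapter per provisional score, students shuffled within each chapter, plus a lookup for pulling each student back out later. The names (`book_order`, `page_index`, the student IDs) are made up for illustration, and it assumes one page per student per problem; the actual PDF assembly is left to whatever tool you use.

```python
import random
from collections import defaultdict

def book_order(scores, seed=None):
    """Return [(score, [student, ...]), ...]: one chapter per score, shuffled within."""
    rng = random.Random(seed)
    chapters = defaultdict(list)
    for student, score in scores.items():
        chapters[score].append(student)
    book = []
    for score in sorted(chapters):
        students = sorted(chapters[score])  # fixed starting order so a seed is reproducible
        rng.shuffle(students)               # randomize students within the chapter
        book.append((score, students))
    return book

def page_index(book):
    """Map student -> 0-based position in the book (assumes one page per student)."""
    flat = [s for _, students in book for s in students]
    return {s: i for i, s in enumerate(flat)}

# Hypothetical provisional scores for one problem after a grading pass.
provisional = {"s01": 7, "s02": 10, "s03": 7, "s04": 3, "s05": 10, "s06": 7}
book = book_order(provisional, seed=1)
for score, students in book:
    print(f"Chapter (score {score}): {students}")
```

Rerunning this after each pass with the updated scores gives the next provisional book; the index is what lets each student's page be found again when the exams are reassembled.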
There are cloud services for grading. They're hopelessly primitive, with cloud lag. Like a gamer, I used to reject wireless mice because of the lag. I reject these services. I can grade everything myself faster than using a team of grad students, with the right local tools.
The PDF format is a morass. My hat's off to anyone who will work with it. There are many evolutionary layers and no formal specification or verification; one tests a PDF by seeing if most programs accept it.
It's time for me to rewrite my grading system in a modern scripting language, so others could use it. I prefer Ruby, but that's mainly to stave off boredom when I'm not using Haskell. I can use Python. This would permit a more robust workflow, such as adding late exams in mid-grading without losing grading in progress.
I can't find documentation for Borb, to check off the list of features I'd need. I suspect from this being a one-person project that I might need to continue to patch together external tools.
There is a specification, but it's very complicated.
I've also heard that some of the embedded font formats have features that are Turing complete, but I don't know the details on that.
Very roughly speaking (this is a semantic debate where everyone is wrong from someone else's perspective), a PDF file is a restricted subset of PostScript, with added indexes so one can render pages in the middle without having to process the code from the beginning.
The hardship in generating PDFs from scratch is getting those indexes right. It's far easier to convert a PostScript file using standard tools.
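For anyone curious what "getting those indexes right" involves: the index is the cross-reference (xref) table near the end of the file, which lists the byte offset of every object, followed by a `startxref` pointer to the table itself. A rough sketch of writing a one-page blank PDF by hand, using only the standard library (the function name is mine, and this only covers the classic uncompressed xref table, not xref streams):

```python
def build_minimal_pdf():
    """Assemble a blank one-page PDF, recording byte offsets for the xref table."""
    objects = [
        b"<< /Type /Catalog /Pages 2 0 R >>",
        b"<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
        b"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 612 792] /Resources << >> >>",
    ]
    out = bytearray(b"%PDF-1.4\n")
    offsets = []
    for num, body in enumerate(objects, start=1):
        offsets.append(len(out))  # byte offset where "N 0 obj" begins
        out += b"%d 0 obj\n" % num + body + b"\nendobj\n"
    xref_pos = len(out)           # byte offset of the xref table itself
    out += b"xref\n0 %d\n" % (len(objects) + 1)
    out += b"0000000000 65535 f \n"       # entry 0: head of the free list
    for off in offsets:
        out += b"%010d 00000 n \n" % off  # every entry is exactly 20 bytes
    out += b"trailer\n<< /Size %d /Root 1 0 R >>\n" % (len(objects) + 1)
    out += b"startxref\n%d\n%%%%EOF\n" % xref_pos
    return bytes(out), offsets, xref_pos

pdf, offsets, xref_pos = build_minimal_pdf()
with open("blank.pdf", "wb") as fh:
    fh.write(pdf)
```

Any edit that changes the length of an object shifts every later offset, which is why hand-editing a PDF in a text editor tends to break it.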
It sounds like it's still mostly a prototype?
I was briefly involved as a developer several years ago (as part of my bachelor's thesis). At that time, it was mostly beta-quality, but it was already in use by multiple professors for grading. I haven't been involved with the project since, so I'm not sure about the current status.
I think the homepage, which you linked to and where it mentions that it's still a prototype, is at least somewhat outdated; it has a screenshot of a very old version of the software. At least the 'support' section still looks accurate, though.
If you're interested in using it, I would advise getting in touch via the Mattermost channel or mailing list (both linked to from the homepage) and asking about the current state of the project. Tell them Jamy sent you :)
Captcha-ize them, with several of them grading the same result, and with checking their responses against each other?
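A rough sketch of what that cross-checking could look like, assuming each answer is scored independently by several graders and anything without a clear majority goes back for review. The function name, the answer IDs, and the agreement threshold are all made up for illustration:

```python
from collections import Counter

def reconcile(grades, min_agree=2):
    """grades: {answer_id: [score from each grader]} -> (accepted, disputed)."""
    accepted, disputed = {}, {}
    for answer_id, scores in grades.items():
        score, votes = Counter(scores).most_common(1)[0]
        # Accept only when enough graders agree AND they form a strict majority.
        if votes >= min_agree and votes > len(scores) / 2:
            accepted[answer_id] = score
        else:
            disputed[answer_id] = scores  # send back for a second look
    return accepted, disputed

grades = {"q1-a17": [4, 4, 4], "q1-a18": [2, 3, 2], "q1-a19": [1, 3, 5]}
accepted, disputed = reconcile(grades)
print(accepted)   # {'q1-a17': 4, 'q1-a18': 2}
print(disputed)   # {'q1-a19': [1, 3, 5]}
```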
It is licensed AGPL+Commercial, but if you just use it for yourself this does not matter, as you can use it under the AGPL.
On the other hand, the reportlab PDF generation library (which is what I actually use) offers a permissive license in its open source version (and a commercial ReportLab PLUS version), so it can be included in all kinds of projects.
But the plot thickens! It seems the top-level LICENSE file was actually changed 13 days ago _away_ from AGPL https://github.com/jorisschellekens/borb/blame/master/LICENS...
So, yeah, confusing for sure.
Was the issue raised with the author?
I’ve seen some PDFs have the first few pages counted in Roman numerals and then “normal” numbers for the main content.
How do you edit an existing pdf to do that?
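As far as the file format goes, this is a `/PageLabels` entry in the document catalog: a number tree mapping the first page index of each range to a numbering style (`/S /r` for lowercase Roman, `/S /D` for decimal) and a start value (`/St`). Here is a small sketch of that mapping in plain Python, so you can see which labels a given set of ranges produces. The helper names are mine, and actually writing the entry into an existing file still needs a PDF library; I don't know whether borb exposes this.

```python
def to_roman(n):
    """Lowercase Roman numeral for n >= 1."""
    table = [(1000, "m"), (900, "cm"), (500, "d"), (400, "cd"), (100, "c"),
             (90, "xc"), (50, "l"), (40, "xl"), (10, "x"), (9, "ix"),
             (5, "v"), (4, "iv"), (1, "i")]
    out = []
    for value, glyph in table:
        while n >= value:
            out.append(glyph)
            n -= value
    return "".join(out)

def page_labels(page_count, ranges):
    """ranges: sorted [(first_page_index, style, start_number)], style 'r' or 'D'."""
    labels = []
    for i, (first, style, start) in enumerate(ranges):
        # A range runs until the next range begins, or to the end of the document.
        last = ranges[i + 1][0] if i + 1 < len(ranges) else page_count
        for page in range(first, last):
            n = start + page - first
            labels.append(to_roman(n) if style == "r" else str(n))
    return labels

def page_labels_entry(ranges):
    """Render the ranges as the /PageLabels value that goes in the catalog."""
    nums = " ".join(f"{first} << /S /{style} /St {start} >>"
                    for first, style, start in ranges)
    return f"/PageLabels << /Nums [ {nums} ] >>"

# Four front-matter pages in Roman numerals, then decimal numbering from 1.
ranges = [(0, "r", 1), (4, "D", 1)]
print(page_labels(8, ranges))   # ['i', 'ii', 'iii', 'iv', '1', '2', '3', '4']
print(page_labels_entry(ranges))
```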
There might be some Latin script fonts that cause problems, but I haven't looked into that very much--I do recall we had problems with an italic font.
But the point that I was not so clearly trying to make was that sometimes the messed up encoding is intentional and not a bug.
I couldn’t see any support for PDF/A (the good version of PDF) in borb though.
It's a really amazing project. One of those that makes you go: "Wait, we didn't have this before?"
Is that sort of thing going to be in scope for this library’s editing capabilities? (“Editing PDFs” is such a broad, open-ended thing.)
I use reportlab combined with PyPDF2 and pdf-redactor. It would be nice to see a comparison with the existing tools.
As for myself, I've not had to automate work on PDFs, luckily; for manual manipulation and annotation I've found Xournal++ sort of useful (https://xournalpp.github.io/). Inkscape can also be used with some questionable PDFs.
> with borb, a pure python library