Hacker News | bazzargh's comments

Back in... 2006ish? I got annoyed with being unable to copy text from multicolumn scientific papers on my iRex (an early ereader that was somewhat hackable) so dug a bit into why that was. Under the hood, the pdf reader used poppler, so I modified poppler to infer reading order in multicolumn documents using algorithms that Tesseract's author (Thomas Breuel) had published for OCR.

It was a bit of a heuristic hack; it was 20 years ago but as I recall poppler's ancient API didn't really represent text runs in a way you'd want for an accessibility API. A version of the multicolumn select made it in but it was a pain to try to persuade poppler's maintainer that subsequent suggestions to improve performance were ok - because they used slightly different heuristics so had different text selections in some circumstances. There was no 'right' answer, so wanting the results to match didn't make sense.

And that's how kpdf got multicolumn select, of a sort.

Using Tesseract directly for this has probably made more sense for some years now.


I too went down that rabbit hole, haha. Around that time I'd try anything to get an edge in a fantasy football league. I found a bunch of historical NFL stats PDFs and it took forever to make usable data out of them.


When I saw this I thought... "The Turing Institute? Does that still exist?"

https://en.wikipedia.org/wiki/Turing_Institute

There was a previous Turing Institute in Glasgow doing AI research (meaning, back then rules-based systems, but IIRC my professor was doing some work with them on neural networks), which hit the end of the road in 1994. There was some interesting stuff spun out of there, but it's a whole different institute.


The Turing had an interesting approach to naming, not only stealing the Glasgow group's name, but also choosing the initials 'ATI' (in 2015...).

It has recently been struggling for relevance.

https://www.ft.com/content/6bfea441-e16c-499a-a887-69f735c29... (https://archive.ph/ujfhb)

I hope they turn it around because the UK's need for AI academic coordination/leadership is so high.


it's a bit odd at the end referring to "a German high school teacher and amateur decipherer who, in 1802, looked at Aramaic-related inscriptions..." without naming him like everyone else mentioned - that is Georg Friedrich Grotefend. The explanation of how he managed to translate some cuneiform (but was then ignored) is interesting too https://en.wikipedia.org/wiki/Georg_Friedrich_Grotefend#Deci...


I got better results just dithering the rgb channels separately (so effectively an 8 colour palette, black, white, rgb, yellow, cyan, magenta). In p5js:

    var img
    var pixel
    var threshold
    var error = [0, 0, 0]
    var a0

    function preload() {
      img = loadImage("https://upload.wikimedia.org/wikipedia/commons/thumb/4/44/Albrecht_D%C3%BCrer_-_Hare%2C_1502_-_Google_Art_Project.jpg/1920px-Albrecht_D%C3%BCrer_-_Hare%2C_1502_-_Google_Art_Project.jpg")
    }

    function setup() {
      // I'm just using a low discrepancy sequence for a quasirandom
      // dither and diffusing the error to the right, because it's
      // trivial to implement
      a0 = 1/sqrt(5)
      pixelDensity(2)
      createCanvas(400, 400);
      image(img, 0, 0, 400, 400)
      loadPixels()
      pixel = 0
      threshold = 0
    }

    function draw() {
      // stop once every subpixel has been visited (800x800 pixels x 4 channels at density 2)
      if (pixel >= pixels.length) {
        return
      }
      for (var i = 0; i < 2000; i++) {
        threshold = (threshold + a0)%1
        for(var j=0; j< 3; j++) {
          var c = pixels[pixel + j]
          pixels[pixel + j] = c + error[j] > threshold * 255 ? 255 : 0
          error[j] += c - pixels[pixel + j]
        }
        pixel += 4
      }
      updatePixels()
    }

Of course this isn't trying to pick the closest colour in the palette as you're doing - it's just trying to end up with the same intensity of rgb as the original image. It does make me wonder if you should be using the Manhattan distance instead of Euclidean, to get the errors to add correctly.
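
For comparison, here is a rough sketch of the closest-colour approach with the distance metric left pluggable, so Euclidean and Manhattan can be tried side by side. It is plain JavaScript rather than p5js, it works on one row of `[r, g, b]` triples, and it carries the error to the right as in the sketch above. The names `PALETTE`, `nearest` and `ditherRow` are made up for illustration.

```javascript
// The 8-colour palette: every corner of the RGB cube.
const PALETTE = [
  [0, 0, 0], [255, 255, 255],
  [255, 0, 0], [0, 255, 0], [0, 0, 255],
  [255, 255, 0], [0, 255, 255], [255, 0, 255],
];

// Two candidate distance metrics between [r, g, b] triples.
const euclidean = (a, b) =>
  Math.hypot(a[0] - b[0], a[1] - b[1], a[2] - b[2]);
const manhattan = (a, b) =>
  Math.abs(a[0] - b[0]) + Math.abs(a[1] - b[1]) + Math.abs(a[2] - b[2]);

// Return the palette entry closest to `colour` under `distance`.
// Ties go to whichever entry appears first in the palette.
function nearest(colour, palette, distance) {
  let best = palette[0];
  let bestDistance = Infinity;
  for (const candidate of palette) {
    const d = distance(colour, candidate);
    if (d < bestDistance) {
      best = candidate;
      bestDistance = d;
    }
  }
  return best;
}

// Dither one row of [r, g, b] pixels, pushing each pixel's
// quantisation error onto the next pixel to the right.
function ditherRow(row, palette, distance) {
  let error = [0, 0, 0];
  return row.map((pixel) => {
    const wanted = pixel.map((c, j) => c + error[j]);
    const chosen = nearest(wanted, palette, distance);
    error = wanted.map((c, j) => c - chosen[j]);
    return chosen;
  });
}
```

With the cube-corner palette both metrics pick the same colour, because the nearest corner can be chosen one channel at a time; the choice of metric only starts to matter for palettes that are not the full cube.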


I made a terminal based presentation tool some years back and like sibling comments said, it was neat for switching back and forth to code samples and output.

Mine wasn't markdown tho: I used ttyrec to record a terminal session to a file per slide and the tool just played it back. I set it up so pressing most keys would advance the playback hackertyper style, advancing 200ms per keypress IIRC. When you reach the end of a slide, press return for the next one. The back and forward arrows were used to jump between slides quickly, and title text was done with figlet.
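
The playback idea can be sketched roughly as below. This is not the code from the original tool, just an illustration in Node.js, and `parseTtyrec` and `createPlayer` are hypothetical names. It relies on the ttyrec file layout, where each record is a 12-byte header of three little-endian 32-bit integers (seconds, microseconds, payload length) followed by the payload. The 200ms step is taken from the description above; how long pauses in the recording should be handled is an assumption.

```javascript
// Split a ttyrec recording (a Node.js Buffer) into timestamped frames.
function parseTtyrec(buffer) {
  const frames = [];
  let offset = 0;
  while (offset + 12 <= buffer.length) {
    const sec = buffer.readUInt32LE(offset);
    const usec = buffer.readUInt32LE(offset + 4);
    const len = buffer.readUInt32LE(offset + 8);
    const start = offset + 12;
    if (start + len > buffer.length) break; // truncated final record
    frames.push({
      time: sec * 1000 + usec / 1000, // milliseconds
      data: buffer.subarray(start, start + len),
    });
    offset = start + len;
  }
  return frames;
}

// Keypress-driven player: each call to step() moves the playback clock
// forward by stepMs and returns the terminal output that is now due.
function createPlayer(frames, stepMs = 200) {
  let next = 0;
  let clock = frames.length > 0 ? frames[0].time : 0;
  return {
    step() {
      clock += stepMs;
      const due = [];
      while (next < frames.length && frames[next].time <= clock) {
        due.push(frames[next].data);
        next++;
      }
      return Buffer.concat(due);
    },
    done() {
      return next >= frames.length;
    },
  };
}
```

In a real tool each keypress read from a raw-mode stdin would call `step()` and write the result to stdout, with return moving on to the next slide's recording once `done()` is true.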

I only used it for a couple of in house presentations and meetups where the hacker styling was appropriate; there wasn't much to it so the code wasn't released, it'd be easy to recreate.

edited to add: I forgot, I did put it in a gist. https://gist.github.com/bazzargh/a267b97a52f7a1f70c46 YMMV. I recall the playback struggled with things like vim; I always meant to try integrating asciinema since it seems to work better.


The comment about world events not being what matters in our lives is accurate (world events don't generally happen to us personally, fortunately) but it got me thinking: it might be nice to see some of these done for historical figures who were integral to those events happening. I mean showing what people were part of in that week of their life, as opposed to things that happened in that week of _your_ life.

For example, Anne Boleyn - born 1501 _or_ 1507 (the earlier date is more accepted), died 1536; Henry VIII first married in 1509 when Anne was just a small child; her first promise of marriage post 1515 was to avert a civil war in Ireland, and Henry meeting her in 1526 led to the creation of the Church of England, her marriage, and then of course, her own execution. She had a full diary. Or on a lighter note, Mozart, who possibly died at the same age, had a life full of compositions and courts.

Those (and many others) would be interesting in their own right, or as timeline comparisons for people who use the site.


A similar thing from many years back: the junkyard jumbotron let you assemble a random collection of displays to display their portions of a much larger image

https://github.com/mitmedialab/Junkyard-Jumbotron

Video https://youtu.be/cAUtSVSTbzU?feature=shared


The Media Lab makes so much random fun stuff. I feel like it would be fun to remake this with modern web tech. (doing the alignment photo over email does sound like fun too hehe)


In the text of the act, schedule 1 part 1 paragraph 10 https://www.legislation.gov.uk/ukpga/2023/50/schedule/1/para...

... unlike the issue of what size of service is covered, this isn't a pinky swear by Ofcom.


Super ... many thanks.


Note that "Services provided by persons providing education or childcare." is defined in law, and narrowly. I'm working with a charity providing online lessons for use in schools, where school children can post stuff others in their class can see. As far as I can discern they don’t fall into this exemption.


Worth mentioning that the lawyer who runs onlinesafetyact.co.uk, Neil Brown, has its onion address in his profile.

https://mastodon.neilzone.co.uk/@neil

http://3kj5hg5j2qxm7hgwrymerh7xerzn3bowmfflfjovm6hycbyfuhe6l...


this is something you should read https://extremelearning.com.au/unreasonable-effectiveness-of... (it's effectively the fermat prng described, but goes into more depth)

