Ask HN: Why is WolframAlpha serving rendered text images instead of plain text?
15 points by esonica on May 17, 2009 | 35 comments
I am after some thoughts on why they would be doing this. I am a developer/designer and cannot think of any good reason for doing what can be done in CSS. Any ideas?

Sample : http://www4d.wolframalpha.com/Calculate/MSP/MSP6202195gdebi63ebi5b100001e27378a70be39cd?MSPStoreType=image/gif&s=46

* Sorry for the unclickable link.




I asked Stephen Wolfram about this and he explained that it was because sometimes the answers contain complex mathematical formulae that are difficult to render in HTML and that since he wanted a consistent display, his preferred solution was to just use images all the time.


Any idea if they use TeX or MathML for this?


Wolfram|Alpha is written in Mathematica (several million lines of it!), so I presume they're just using Mathematica to render everything.

You can actually download a Mathematica worksheet for any query response and open it up in Mathematica. Pretty neat.


Wikipedia occasionally has complex mathematical formulae & it seems to be working fine. This isn't a hard problem.


Those formulae are rendered as images.


... whereas text is rendered as text, is his point. But I understand the original explanation. It's all Mathematica, so it makes sense to let Mathematica format it and then spit it out as images.


How strange... from their FAQ (under Web and other Practicalities - http://www.wolframalpha.com/faqs.html):

"Do I need images enabled in my browser to use Wolfram|Alpha?

Yes. All its output content is rendered as images, for consistency."

Perhaps they didn't want to make sure the data was styled & displayed well in all browsers (ahem... ie6)?

clickable: http://www4d.wolframalpha.com/Calculate/MSP/MSP6202195gdebi6...


While consistency is a good thing, it comes at a relatively high price in this case. Many blocks could be rendered just fine using only HTML and I'd be a much happier user if I could copy text from the search results.


Indeed. FYI, if you click on any text in the search results, you're given a text field with its text within it. It's a few extra clicks, unfortunately, but at least they considered that.


Did they just add that now? I don't remember seeing the hand cursor over text fields a couple of hours ago.


It was there yesterday


I'm not too sure. I just discovered it a few minutes ago. :)


Perhaps they don't want other search engines 'learning' the questions (via phrases hyperlinked to wolfram) and displaying the answers as summary text?


Again, the text of each image can be found in the image element's alt attribute. Since WA already uses images to make formulas easier to print out, I guess Wolfram simply wanted to reuse the code.


Turns out, Google is indexing the questions:

http://www.google.com/search?q=site:wolframalpha.com+inurl:i...

The "meaning of life" and "the ultimate answer" questions feature highly. But even more people seem curious about "unemployment rate in USA".

The answers are not in Google's summary.


In addition to what Scriptor had to say - I don't think those images would be a challenge for Google's OCR (Optical Character Recognition).


I don't know for sure, but my guess is that Mathematica outputs images?


This guess makes the most sense: to correctly display formulae. An example from another thread:

http://imgur.com/gcymm.png


It can, and it looks like they're making Mathematica just render things as images, instead of having it output text and then reformatting that as HTML.


I'd assume that not everything that WA puts out will be text or tables. They've got graphs, complex mathematical formulas, general purpose image manipulation and who knows what else.

It may have made sense to "unify" the rendering engine. This makes the simple case you point out look silly of course.


Pretty pathetic as it breaks the site for people with less than perfect vision. I mean they obviously take advantage of machine readable data to populate their database, pretty lame to not spit it back out. Not to mention a huge server overhead--this was a calculated decision.


Recent versions of at least Safari and Firefox have full content zooming, so it zooms images along with the text.

Screen readers will read the text in the "alt" attribute of the image, which matches what the image says usually.


Don't most screen readers read out the alt text of an image? (Yes, the hint here is to check the alt text of the image).


There are tons of people who simply increase the text size in their browser. Can't do that with WA. They aren't blind, just older.


Most modern browsers can also increase the size of the image with the text.


I don't know if I buy the consistency answer. There are ways to do that (even with complicated formulae), but it is an interesting way to avoid cross-browser CSS issues.

My own hunch is they did it to avoid scraping/botting. Hopefully they switch to text at some point in the future.


> My own hunch is they did it to avoid scraping/botting

The text is in the alt attribute of the image, so it's easily scraped.


We actually do the same for math equations in LearnHub lessons (via texhub.com). Images are pretty much the only way to present complex math equations to a wide audience.

MathML would be a much better solution, if only it were natively supported in IE.


Our free MathPlayer plugin for IE enables it to display MathML. Except for having to be installed, it is as "native" as it can get. It even works with screen readers to speak the math for those that need that. See http://www.dessci.com/mathplayer.


Thanks Paul. I'm aware of Math Player, but I don't like the tradeoff of forcing our users to install a plugin for superior display quality/customizability vs. "it just works". The nature of our user base just doesn't make it feasible IMO.

I would be more willing to take that tradeoff in an LCMS environment though.


The special font may not be on every user's computer. Which font is that?


To make you pay for the API.


The text is in the alt attribute of each image's HTML element. So it should be a breeze to scrape it.
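
For anyone who wants to try it, here is a minimal sketch of that idea using only Python's standard library. The markup in SAMPLE_HTML is hypothetical (it is not copied from a real Wolfram|Alpha page), and the names AltTextCollector and extract_alt_texts are made up for illustration; the only assumption taken from the thread is that each result image carries its text in the alt attribute.

```python
from html.parser import HTMLParser


class AltTextCollector(HTMLParser):
    """Collects the non-empty alt attribute of every <img> tag it sees."""

    def __init__(self):
        super().__init__()
        self.alt_texts = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names, so "img" also matches "<IMG ...>".
        if tag != "img":
            return
        alt = dict(attrs).get("alt")
        if alt:  # skip images with a missing or empty alt attribute
            self.alt_texts.append(alt)


def extract_alt_texts(html):
    """Return the alt texts of all images in an HTML string, in document order."""
    parser = AltTextCollector()
    parser.feed(html)
    parser.close()
    return parser.alt_texts


# Hypothetical markup for illustration only; real pages will differ.
SAMPLE_HTML = """
<div class="pod">
  <img src="/Calculate/MSP/example1.gif" alt="Brisbane, Queensland">
  <img src="/Calculate/MSP/example2.gif" alt="2 + 2 = 4" />
  <img src="/spacer.gif" alt="">
</div>
"""

print(extract_alt_texts(SAMPLE_HTML))
```

Fetching the page itself is left out on purpose; the point is only that once you have the HTML, pulling the text back out of the images takes a few lines.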


It's not like they hide the text that much, though:

jsonArray.popups.i_0100_1 = {"stringified": "Brisbane,Queensland","mInput": "","mOutput": "", "popLinks": {"Brisbane, Queensland":"Brisbane"} };
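
A rough sketch of how that line could be read back in Python, assuming each popup is assigned on a single line and the right-hand side is valid JSON (true for the line quoted above, not verified for the rest of the page). The function name parse_popups and the regular expression are illustrative, not anything Wolfram|Alpha provides.

```python
import json
import re

# The assignment quoted above, as a single line of page JavaScript.
SNIPPET = (
    'jsonArray.popups.i_0100_1 = {"stringified": "Brisbane,Queensland",'
    '"mInput": "","mOutput": "", "popLinks": {"Brisbane, Queensland":"Brisbane"} };'
)

# Captures the popup key and the object literal up to the last "}" before ";".
ASSIGNMENT = re.compile(r"jsonArray\.popups\.(\w+)\s*=\s*(\{.*\})\s*;")


def parse_popups(script_text):
    """Map each popup key to its decoded object, one assignment per line."""
    popups = {}
    for line in script_text.splitlines():
        match = ASSIGNMENT.search(line)
        if match:
            popups[match.group(1)] = json.loads(match.group(2))
    return popups


popups = parse_popups(SNIPPET)
print(popups["i_0100_1"]["stringified"])
```

If the object literal ever used JavaScript-only syntax (unquoted keys, single quotes), json.loads would raise an error and a real JavaScript parser would be needed instead.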


Each image tag also has its source data embedded in its alt attribute.



