Show HN: Diffs of Word documents, designed for humans
69 points by bentoner on May 6, 2013 | 57 comments
We're building version control software for Microsoft Office documents that works as well as Git and GitHub do for code.

Today, we’re releasing the first component, Draftable for Word, an Office add-in that lets you generate side-by-side diffs, and makes it way less painful to work with Track Changes.

Our diffs are designed to look right to humans — instead of machines — showing changes as they might actually have been made by an editor.

We'd love to hear your feedback. We’d also love to fix what you find painful about authoring and collaborating on Microsoft Office documents, so get in touch!



There have been a few discussions about Excel on HN recently. It's such a major tool in so many organizations, and keeping track of changes is a nightmare.

Here's a classic example:

- Budgeting model for company with ~10 employees and 8 departments. Model has 30 or so sheets. Tracks all assumptions on the revenue and expense sides. Great model, built well, works excellently.

- CFO goes in and changes the conversion assumptions for the year.

- I go into the model after the CFO, notice that net income increased by $500,000 since the last time I looked at it. WHAT THE HELL CHANGED?

Now take this example and imagine you're on a late night call with the CEO, CFO, and CMO. Ideas and thoughts are going back and forth. "Hey, what does it do to revenue if we change assumption A from X to Y?" ... "OK, I like that. What about changing assumption B from W to Z?" ... "Nah, let's not do that."

At the end of the call I have our updated model that everyone is happy with... but I have no way to easily and reliably see exactly what was changed.

The only thing I currently do is set up a reconciliation sheet for the income statement. For each version of the model, I create a new sheet and copy/paste values from the income sheet; then I can do a diff and see what numbers changed. This works well, but there's got to be a better way.


You know what? I'm gonna start doing this.

This is a tool that could help improve journalism in some circumstances. It would help society as well as biz if someone made this.

Cool. We're big fans of what you're doing at DocumentCloud.

I originally got into this because I wanted a similar tool for LaTeX (my background is in physics), but after understanding how bad the existing version control tools for business documents were, we just had to start there.

Thanks, you've picked an interesting niche.

Versioning text is an interesting subject, and there appear to be a bunch of tools that are teasing at the problem (many of which get discussed on the Versioned Writing Google Group), but none of them have really gotten things right yet.

A lot of folks have been taking GitHub as an inspiration, and it's been cool to see what people pick and choose out of the experience.

Hey, can you point me to this Google Group? This is a topic I've been very interested in for many years now. I actually wrote this library (https://github.com/zencephalon/Tactful_Tokenizer) for sentence tokenization so that I could get Git to work across sentences instead of lines.

If you use LaTeX and Git, but mandate that each sentence gets its own line, it works quite well for versioning, even if it's not transparent to the end user.
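The one-sentence-per-line convention can be applied mechanically before committing. A minimal sketch, assuming a naive regex splitter: it is not the Tactful Tokenizer linked above, the name one_sentence_per_line is hypothetical, and it will mis-split abbreviations such as "e.g." that a real sentence tokenizer would handle.

```python
import re

# Assumption: a sentence ends at '.', '!' or '?' followed by whitespace.
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def one_sentence_per_line(paragraph):
    """Reflow a paragraph so every sentence sits on its own line."""
    sentences = _SENTENCE_END.split(paragraph.strip())
    return "\n".join(s for s in sentences if s)


print(one_sentence_per_line("Git diffs lines. Prose has sentences! Does this help?"))
```

With text stored this way, editing one sentence touches one line, so a line-based diff points at the sentence that changed rather than the whole paragraph.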

There was a script for LaTeX which took two versions of a LaTeX file and compiled them to a PDF with change bars.

Aptly named latexdiff: http://www.ctan.org/tex-archive/support/latexdiff/

Oops, already mentioned below.

Latexdiff has often worked pretty well for me.

Yep, this is the direction we're going in.

One person I talked to diffs two Excel spreadsheets by arranging the windows to be the same size and then rapidly pressing Alt-Tab!

The latest versions of Excel (starting with 2010, I think) come with Track Changes: maybe that would help.

Actually, diffing in Excel is quite easy to fix yourself with a macro. Google and you will find a few. It's maybe a screen's worth of code.

Essentially you just foreach sheet, foreach cell, if diff then highlight the cell. You are working with strictly defined data after all, not some hidden magic markup language like Word documents.
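The "foreach sheet, foreach cell" loop can be sketched outside Excel as well. This is only an illustration under stated assumptions: each workbook is modeled as a plain dict mapping sheet names to lists of rows (loading a real .xlsx file is not shown), the function name diff_workbooks is hypothetical, and differences are returned rather than highlighted.

```python
def diff_workbooks(old, new):
    """Return (sheet, row, col, old_value, new_value) for every cell that differs.

    Assumption: a workbook is a dict of sheet name -> list of rows,
    and each row is a list of cell values. Missing cells compare as None.
    """
    changes = []
    for sheet in sorted(set(old) | set(new)):
        old_rows = old.get(sheet, [])
        new_rows = new.get(sheet, [])
        for r in range(max(len(old_rows), len(new_rows))):
            old_row = old_rows[r] if r < len(old_rows) else []
            new_row = new_rows[r] if r < len(new_rows) else []
            for c in range(max(len(old_row), len(new_row))):
                a = old_row[c] if c < len(old_row) else None
                b = new_row[c] if c < len(new_row) else None
                if a != b:
                    changes.append((sheet, r, c, a, b))
    return changes


old = {"Assumptions": [["conversion", 0.02], ["price", 49]]}
new = {"Assumptions": [["conversion", 0.03], ["price", 49]]}
print(diff_workbooks(old, new))
```

Because cells are compared strictly by position, this only answers "which coordinates hold a different value"; it has no notion of rows or columns that moved.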

That would be pretty poor for inserted rows or columns, right? (Essentially showing everything after that as changed)
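One way to soften the inserted-row problem is to align rows before comparing them, the way a text diff aligns lines. A minimal sketch, assuming rows are plain lists of hashable values and using Python's difflib.SequenceMatcher as the aligner; the name diff_rows is hypothetical, and this says nothing about how any of the tools in this thread actually work.

```python
from difflib import SequenceMatcher


def diff_rows(old_rows, new_rows):
    """Align two lists of rows and report only removed ('-') and added ('+') rows."""
    old_keys = [tuple(row) for row in old_rows]  # tuples are hashable, lists are not
    new_keys = [tuple(row) for row in new_rows]
    matcher = SequenceMatcher(a=old_keys, b=new_keys, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue  # rows matched up by the aligner are unchanged
        changes.extend(("-", row) for row in old_keys[i1:i2])
        changes.extend(("+", row) for row in new_keys[j1:j2])
    return changes


old_rows = [["rent", 100], ["salaries", 500], ["travel", 50]]
new_rows = [["rent", 100], ["marketing", 75], ["salaries", 500], ["travel", 50]]
print(diff_rows(old_rows, new_rows))
```

Here the single inserted row is reported on its own, whereas a purely positional comparison would flag every row after it. Inserted columns would need the same treatment along the other axis, which this sketch does not attempt.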

TortoiseSVN includes the ability to show a side-by-side comparison of Excel files.

This is done through a simple script that uses Excel to do so. Same for Word where it uses Word's native comparison feature.

A quick search for "excel diff" returns:

http://excel-diff.florencesoft.com/ http://www.exceldiff.com/ http://www.suntrap-systems.com/ExcelDiff/

Are you sure none of these would work for you?

This, (and much more) is available in Excel 2013 http://office.microsoft.com/en-us/excel-help/what-you-can-do...

Are you using track changes?

Yes, it's been a while but I tried that initially. I forget why I didn't like it... maybe I'll give it another shot.

I publish a magazine, and we've been wishing for essentially this exact tool for years and years. As soon as I looked at the demo images on the site, I called two of our editors over excitedly and showed it to them. They both literally shouted with joy.

Then we saw that it was Windows-only.

The publishing industry runs on Macs. Please, develop for OS X. I can't think of a single person in publishing who wouldn't happily pay for this right now.

I agree. Windows only? Are you insane? Mac OS may only have a minority of the consumer desktop market, but it dominates publishing and graphic design. For the life of me, I don't understand why developers go out of their way to ignore a platform that is used by professionals who actually PAY for software.

Would the number of professionals who only use a Mac really be at all significant compared to the number of workers in almost every business and field who use MS Office running on Windows machines?

There are already dozens of tools for Windows that do side-by-side diff comparisons, not to mention the fact that Word itself has a side-by-side comparison tool in Windows, but not Mac.

The reason this tool is worth $99/year to someone like me is that I need it for publishing, not coding or simple business document comparison. It's a nice thing to have if you're an office worker, but not vital, and probably not something you'll be able to get your boss to pay for.

In the publishing world, on the other hand, it's a critical function that would save untold hours of paid editors' time.

For anything other than publishing, there are "good enough" tools already out there. It's very surprising to me that it wasn't developed as a Mac solution first.

This reminds me: it blows my mind how far behind the publishing industry is in terms of technology. I'm currently editing a tech book, and I get the assets from them as a Word doc and images for the figures in a zip file. I then have to use "track changes" in Word, save the file, and send it back. For the images, if there is an error, I have to take a new screenshot, write in the doc that there was an error, and then send back a zip file with the new image(s).

This would be SO much easier with GitHub. I could just make my edits and issue a pull request, which would work with the images too. I could even use the commit messages to explain what the error was if need be. GitHub even has tools to make this process easier for non-tech people (which honestly shouldn't be necessary since it is a tech book!).

This will be made easier by this tool, but sadly not for me.

You're 2 years late, I've finished writing my book!

Joking aside, I think that's awesome. While writing a book and getting feedback from lots of people (tech review, copy review, etc.), it slowly became a nightmare to decipher, more than actually read, Word documents full of colorful revisions. I would have seriously loved a tool like this.

Word already has this capability through their track changes feature. You can even use it to open two documents and see a third document showing what has changed.

How does Draftable improve on what Word has been offering for years?

A few ways.

1. People seem to find side-by-side diffs easier to read than the Track Changes compare functionality built into Word, because the old and new versions don’t get mashed together.

2. Usability. Draftable is easier to use than the inbuilt functionality. (You’d be surprised how many people don’t even realize that Word can compare documents.)

3. You get more control over Track Changes and the comparisons. For example, Word can't create a comparison that preserves the existing Track Changes. Another example: we have a button “Track Changes Since Open” which makes it as if you’d turned on Track Changes when the document was opened.

You actually can open two Word files side by side with advanced functionality as part of the compare or combine documents feature, at least in Word 2010.

But you're right, it's really hard to use, and almost no one knows it exists.

Here's a link to the site: https://draftable.com

I could have used this in the past three months! I suppose after summer I might put it to use again during school. I haven't tried it yet, but the title sounds very promising. Nice work!

PS. Heads up: Mind the guidelines:

> "Don't abuse the text field in the submission form to add commentary to links. The text field is for starting discussions. If you're submitting a link, put it in the url field. If you want to add initial commentary on the link, write a blog post about it and submit that instead."

Oh, whoops; I won't do that again. It's the first time I've done a Show HN and we're still setting up our blog. I can't seem to edit it to change it into a link with URL.

Well I don't mind, in fact I personally disagree with this particular guideline. It was just a heads up :)

Thank you. I regularly need to compare Word document versions side by side. I still have no idea why MS doesn't just offer this functionality.

It's definitely built-in. You just haven't found it yet. ;)

Not in 2011 for Mac. I spent over an hour looking for it recently (based on lots of googling), because had that hour paid off with results, it would have saved me many more.

Do you mean the "View Side by Side" feature? It's not particularly useful: it just arranges two document windows so that they're next to each other and then scrolls them at the same rate. There's no diff computed, so there's no markup of changes, and the documents get out of sync if they have insertions or deletions.

I think he may have been joking about the fact that Office is bloatware, and that somewhere in that bloat a half-assed diff tool may be hiding.

Ah... yes. Well at least we discovered that I can do a good impression of an over-anxious startup founder :-)

I would need it for PowerPoint. Crazily enough, it's what my company uses for reporting. I guess we'd buy a couple thousand licenses of it.

Right now it's only available for the Windows version of Office 2010. Any plans to release it for Mac or for older Office versions?

It works on Office 2007 too.

We have no current plans to do 2003 (too many hacks required!) but we could potentially do a Mac version if it turns out that we've made something people want.

Office 2013 brings some powerful compare options for Word and Excel:

Spreadsheet Inquire (need to enable an included COM add-in which requires .NET 4) http://office.microsoft.com/en-us/excel-help/what-you-can-do...

Compare (included in Word by default) http://office.microsoft.com/en-us/word-help/compare-is-under...

Really nice work, you guys.

Also, nice timing. I recently posted a hobby project to HN that is very similar :)



The ribbon support within Word itself is great. The ability to link to specific sections of the diff is also a handy feature.

One other use-case I could suggest: I once got caught in the middle of a contract negotiation between our in-house lawyers, and one at a service provider. This negotiation mostly consisted of sending back and forth Word documents with "Track Changes" turned on.

See also: Softinterface Diff Doc [1]. I have never tried it though.

[1] http://www.softinterface.com/MD/Document-Comparison-Software...

you might (or might not) wanna look at what i did recently.

> http://zenmagiclove.com/phrase-change-sample.html

> http://zenmagiclove.com/phrase-change-display.html

you likely want to stick with your side-by-side display.

(as it seems that you feel this is your primary advantage.)

but still, isolating the changes so that they occur on a phrase which is presented coherently on its own line is something i believe you would find improves results.

i've got lots more to come, and am actively working on this.


That's awesome! Exactly what I needed!

Pictures added or removed are not shown. Changed formatting is not shown. Text modified in text boxes is not shown.

Doesn't provide many more features than piping antiword to your favorite text-based diff tool. The only difference I've managed to find is that with Draftable you get formatted headers.

I'm also a bit confused: is the diff running on your server or on my computer? Your EULA keeps talking about "hosted services", but the only thing available on the website is this tool I install on my computer; no hosted services as far as I can see.

I'm sorry to hear that you didn't find it useful. Thanks for trying it out though!

I'd like to understand better what you're looking for; my email's in my profile if you'd be willing to talk further.

If you're using the Word Add-In, the diff runs on your computer. If you're using the online demo on the bottom right of our homepage, it runs on our server. I'll make sure that we spell this out in our privacy policy.

The demo on our homepage is the only hosted service right now. The EULA includes "hosted services" partly to cover use of that tool but mainly because I didn't want to have to redo it once we launched more things.

By the way, did you try the Retroactive Track Changes features, which do compare all those things you mention?

How is this different from what "git diff" provides out of the box for .docx documents?

Draftable for Word lives right in the Word ribbon, with native Word rendering of the documents. You get a side-by-side view of the documents with deleted and inserted text highlighted, and synchronised scrolling. By not having to switch programs, it doesn’t break up your workflow.

It also works great with Track Changes, and has a handy “Changes Since Opened” feature, which saves a snapshot every time you start working.

Thanks for the question, and we hope you give it a try!

*Disclaimer: I work on Draftable :)

The overlap of Word users and git users is small enough that it makes sense to create a simple to use tool for what is essentially the same task.

I don't think git can show Word documents. Sure, they might be XML under the hood, but getting a diff between blobs of XML isn't quite the same as a side-by-side comparison in a nice UI.

It's even a ZIP file with multiple files inside, only one of which is the actual document. So naïvely you'd just get back a binary diff. But I found some ways of crudely extracting text and feeding that to git for diffing, so maybe those are already built into Windows versions. But that still just gives you a normal text diff, which is nowhere near as user-friendly as having something directly in the tool you're working in.
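The "crude text extraction" idea can be sketched with the standard library alone. A minimal sketch, assuming the main body lives at word/document.xml inside the archive and that only paragraph text matters: formatting, images, headers and footnotes are ignored, and text in nested paragraphs (for example inside text boxes) may be repeated. The names docx_to_text and make_sample_docx are hypothetical, and the sample builder produces a stand-in archive for the demo, not a file Word would open.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by word/document.xml inside a .docx archive.
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_to_text(source):
    """Return paragraph text, one paragraph per line.

    `source` is a file path or a binary file-like object.
    """
    with zipfile.ZipFile(source) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    lines = []
    for para in root.iter(W_NS + "p"):
        # A paragraph's text is split across runs; join the <w:t> pieces back together.
        lines.append("".join(t.text or "" for t in para.iter(W_NS + "t")))
    return "\n".join(lines)


def make_sample_docx(paragraphs):
    """Build a tiny in-memory stand-in for a .docx (demo only, not a valid Word file)."""
    body = "".join(f"<w:p><w:r><w:t>{p}</w:t></w:r></w:p>" for p in paragraphs)
    xml = f'<w:document xmlns:w="{W_NS[1:-1]}"><w:body>{body}</w:body></w:document>'
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as archive:
        archive.writestr("word/document.xml", xml)
    buf.seek(0)
    return buf


print(docx_to_text(make_sample_docx(["First paragraph.", "Second paragraph."])))
```

Git can be pointed at a converter like this through a textconv diff driver assigned to *.docx in .gitattributes; the wiring is left out here, and the result is still an ordinary text diff with the limitations described above.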

Do you plan to start a version for Office for Mac?

Congratulations. You're going to be rich. :-)
