As someone (a software engineer) who has been trying (struggling) to reproduce biology research lately, I say amen. Hallelujah.
But. It's time to accept coding as a core skill. Science has more to learn from software engineering than it realizes. Software engineering (aka coding) eats reproducibility for breakfast, even when hundreds or thousands of "collaborators" are involved. These days, it's rare for a single biology researcher to produce (publish) code that is easily reproducible by an external researcher.
The "reproducibility" mantra is at odds with a lot of real world science and serious computing. You don't/can't in general reproduce the sort of physics experiments and large-scale calculations with which I'm most familiar the way people are suggesting, and software engineering can't address bad science or lack of information. Revision control and "notebook" interfaces seem to have become the equivalent of waving XML metadata at any problem from the days when e-science was preventing useful work and research. Experience from a non-trivial research record and some decades doing and supporting research computing will be ignored, though.
And as for "doing real science" vs. trying to make it more reproducible, there is an excellent analogy with "doing real programming" (aka adding features) vs. refactoring and architectural adjustments. Saying that you consider the second a waste of time says more about you than about the subject.
> ignore the last two bits for many things to improve the user experience.
A successful alternative to Microsoft Office has to start from a level of compatibility that no current developer in existing solutions has expressed interest in attaining.
What about Gnumeric?
My understanding is that the dev team (mainly Jody Goldberg really) have focused on accuracy. The gnumeric.org site has links to studies related to its accuracy.
Your bit-level accuracy approach isn't as simple when it comes to user experience and Excel compatibility.
=RAWSUBTRACT(0.1,-0.2,1/3) => -0.033333333333333
=RAWSUBTRACT(0.1,-0.2,2/7) => 0.014285714285714
So which one is closer to 0.3?
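For anyone following the bit-level accuracy argument: 0.1, 0.2 and 0.3 have no exact binary floating-point representation, so the raw result and the "friendly" result differ. Below is a minimal Python sketch using plain IEEE 754 doubles. The `raw_subtract` helper is a hypothetical stand-in that only mimics subtracting without any rounding cleanup; it is not Calc's or Excel's actual implementation, and the tolerance is an arbitrary choice for illustration.

```python
import math


def raw_subtract(minuend, *subtrahends):
    """Hypothetical helper: subtract left to right, no rounding cleanup."""
    result = minuend
    for value in subtrahends:
        result -= value
    return result


raw = raw_subtract(0.1, -0.2)   # same as 0.1 + 0.2
print(repr(raw))                # 0.30000000000000004
print(raw == 0.3)               # False
print(raw - 0.3)                # 5.551115123125783e-17

# A tolerance-based comparison is one way a UI could hide the noise.
print(math.isclose(raw, 0.3, rel_tol=1e-15))  # True

# The two expressions quoted above, evaluated without cleanup.
print(raw_subtract(0.1, -0.2, 1 / 3))
print(raw_subtract(0.1, -0.2, 2 / 7))
```

Whether a spreadsheet should show the raw value or the cleaned-up one is exactly the user experience trade-off being argued here.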
What do you define as "heavy duty office work"? What would make OpenOffice more suitable to your needs? Asking as an AOO contributor. I'm genuinely curious to know what you think the project is lacking or how it needs to be improved.
Since joining the world of work, I've struggled with the following:
- Random crashes every couple of hours when files become large. Especially in Impress and Calc.
- I do a lot of diagrams in OODraw. Sometimes I'll reopen the file to find arrows and connectors have moved.
- The interface could be more intuitive. For example, the colour selector is limited and clunky, as is the gradients widget. Why can't we add any colour we like without going to the options menu?
- Application speed is slow when files get large, especially Calc. I've even tried increasing memory per object.
- Filtering and sorting in Calc not as fully featured or easy to use as Excel.
- Conditional formatting not as easy to use.
- Calc shortcuts not as easy to use as Excel (eg, I don't think there's an easy way to select a column without a mouse, or transpose a selection, or remove blanks etc etc). With Excel, I can pretty much achieve most things without touching a mouse (I'm an ex investment banker, and we get pretty familiar with the keys). OO seems to lack these critical shortcuts.
- Poor documentation of OO scripting environment. It's tough to figure out how to automate simple things in Calc.
This is not an exhaustive list, but it's all I can think of currently.
Again, I am very supportive of the AOO/LO effort, but I wished it would start giving Microsoft more competition in the power user category.
As someone who was a software engineer in a brokerage and had to deal with clients' Excel sheets: Open/LibreOffice not letting you do that is a feature.
We would be given Excel sheets that depended on a specific version of Excel. We had sheets in the hundreds of megabytes. We had sheets that would take overnight to run. We had sheets with sheet dependencies. We had sheets that needed to be run in a specific order manually. We had sheets with 13,000 manually entered rows, each with 53 columns.
These things, and I use the term "thing" since "eldritch abomination" does not convey the horror of using them, were responsible for investing hundreds of millions of dollars.
The day Excel dies is a day I will celebrate.
I re-wrote the process in AWK, and was able to complete the data-reduction task in about 60 seconds on a DEC 5000 workstation. But my numbers did not match the spreadsheet results, and finding out why was an interesting process.
I then encountered small, remote offices of big international firms -- usually real-estate, insurance companies -- that had limited local IT support. It was Excel macros all the way down.
And then a large biotech firm, where it took KPMG analysts a few months to determine that our entire business relied upon this one guy's spreadsheet.
Excel dies, let us suppose... What will these people come up with? Alexa queries?
I have preferences other than Python, but mostly I can read the code that scientists write in it. It may contain abominations, like global state variables and magic numbers, but it is readable.
Is Python any reason to hope for a better world?
(Hmm. Off-topic, perhaps. Your story woke some memories of tough IT experiences.)
Ideally, something like Excel but with its actual use cases separated, which the original app merges into one thing:
- data entry into tables
- auto-generated tables over some ranges (dates, counts)
- constants / described values (they end up on a side, sometimes in a named cell if you're lucky)
- presentation/report layer
I think Access did some good things, but it is too close to being a database to be comfortable. A BI tool is great for the last part, but it's a separate/expensive product.
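To make that split concrete, here is a rough Python sketch of the four layers kept apart. Everything in it is hypothetical and for illustration only (the `Entry` type, the `vat_rate` constant, the sample amounts); it is one possible shape under those assumptions, not a description of any existing product.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Constants / described values: named and documented in one place.
CONSTANTS = {"vat_rate": 0.20}  # hypothetical example value


@dataclass(frozen=True)
class Entry:
    """Data-entry layer: one manually entered row."""
    day: date
    amount: float


def date_range(start: date, days: int) -> list[date]:
    """Auto-generated table: consecutive dates starting at `start`."""
    return [start + timedelta(days=i) for i in range(days)]


def report(entries: list[Entry], days: list[date]) -> list[tuple[date, float]]:
    """Presentation layer: gross total per generated day."""
    rate = CONSTANTS["vat_rate"]
    totals = {d: 0.0 for d in days}
    for entry in entries:
        if entry.day in totals:
            totals[entry.day] += entry.amount * (1 + rate)
    return [(d, round(totals[d], 2)) for d in days]


entries = [
    Entry(date(2024, 1, 1), 100.0),
    Entry(date(2024, 1, 2), 50.0),
    Entry(date(2024, 1, 1), 25.0),
]
for day, total in report(entries, date_range(date(2024, 1, 1), 3)):
    print(day.isoformat(), total)
```

The point of the separation is that entered data, generated ranges and constants can each be checked on their own, and the report is just a function of the three.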
Interesting. I use Draw a lot and haven't had that hit me yet. Any chance I could convince you to post something about this to the AOO mailing list, or open an issue in Jira about it? (that is, assuming there isn't already an issue for this).
> Poor documentation of OO scripting environment. It's tough to figure out how to automate simple things in Calc.
Agreed. I just went through that exercise myself, when I was working on a thing at work to use the Jira API to query stories, and pull stuff into a spreadsheet for analysis. I was able to figure it out, but it wasn't easy. From what I've seen, the information is all there, but it's not necessarily organized / accessible enough. There's also not enough tutorial format stuff. Hopefully I can write up some stuff based on my recent experience and get it out there.
> but I wished it would start giving Microsoft more competition in the power user category.
Same here. I have a laundry list of things I'd like to try and add to Calc to make it more powerful for complex analytics and what-not, but I am so busy on other projects I haven't had time to pursue those ideas much.
And it crashes. It crashes and loses your work.
I love that such a product exists. I donate to LibreOffice.
But it's nowhere near as reliable as MS Office. Any complex enough document will make it crash at some point.
It's not even reproducible. You can't pinpoint the action that made it crash. Sometimes you do the same thing over and over and it's fine. Then it destroys your work and crushes your soul.
Microsoft products crash inexplicably on our systems too, and in fact, we have much weirder problems with corrupted instances and things than we've ever had when using LibreOffice. I also have serious problems with the Office UI, basically that the ribbon UI as it's implemented is inconsistent and incoherent, which is frustrating as hell. If you're going to have a ribbon/tab UI, make it consistent throughout, and don't have special exceptions for some functions.
Having said that, I do think LibreOffice loses to Microsoft in some areas of polish, like implementing equations and drawing figures. Some of the UI for drawing actions is much more intuitive in Office.
Does Stencila offer any kind of author collaboration? Publications are usually not one-person efforts, and research teams often do not work in the same location.
In my experience the GDocs or Word comments and revise mode are heavily used in collaborations.
From their FAQ:
> Stencila allows you to collaborate with colleagues who use other tools than RMarkdown and Jupyter Notebook, without you having to give up your favourite tool. Stencila Converters make it possible to open documents in various formats (R Markdown - Rmd, Jupyter Notebook - ipynb and so on) in Stencila. The conversion is lossless for all interactive parts (such as code cells).
Nice to see dat part of the conversation.
No one's fudging the formulae, they're fudging data. And Stencila will digest whatever data you give it, real or fake.
Is this ready for daily use? And can we output PDF and use LaTeX for publications?
> Reproducible research depends on reproducible execution, which depends on a reproducible environment, which depends on a reproducible set of libraries and frameworks.
Completely agree. We are trying to make it easy for people to use reproducible libraries and environments. To this end, we are developing Nix environments (a highly reproducible way of defining computing environments) which include Stencila "execution contexts" for R, Python etc with standard libraries included. These environments can be connected to the user interface. See https://github.com/stencila/images/ for more details.
Either way, wish the project luck! Reproducibility of scientific studies is paramount.
That it's popular means nothing. Windows is the most popular operating system, pop is the most popular genre of music, oil is the most popular fuel, and all have been for decades. This doesn't invalidate their utility or our appreciation of them, but it does not mean they're good for everything.