Something that I'm particularly interested in with LibreOffice is helping decouple the core from the UI, both to help with portability and to speed up the rate of UI innovation. I had thought that I might have to start by myself, but based on this article it seems that the current LibreOffice developers are thinking along similar lines.
If any LibreOffice developers are reading this: are there any sub-teams set up that might be a good fit? To give some idea of my background, I would describe myself as a tester who knows how to code (not in C++, alas, but I'm familiar enough with debugging at a low level).
"It's challenging to fix as one needs to tunnel certain data through 20 layers of abstraction, but hey, that's improved UX for you!"
If you want to get into triaging bugs, I welcome you with open arms. I am desperate to get us more helping hands, as after nearly 10,000 analysed reports I am anxious to diversify.
Here is the index of our QA tech articles: https://wiki.documentfoundation.org/QA
Basic debugging info for triagers: https://wiki.documentfoundation.org/QA/BugReport/Debug_Infor...
Debugging for devs: https://wiki.documentfoundation.org/Development/How_to_debug
Binary bisecting: https://wiki.documentfoundation.org/QA/Bibisect
If you were thinking of working on the UI, please get in touch with the design team, which includes some devs (but could really use 10 more!): https://wiki.documentfoundation.org/Design
You will find me in the QA IRC channel (and most others): https://wiki.documentfoundation.org/QA/IRC
I'm heading to sleep now, but I hope to see you around!
If it's truly 20 layers of abstraction, I'd love to read a blog post that spelunks into all those layers and elucidates what they do.
The layers work like this:
1. An old cross-platform layer for both operating-system-specific functionality (OSL) and runtime functionality like string handling and bootstrapping (RTL).
2. The UI code base is the Visual Class Library (VCL): window handling, the event loop, controls (buttons, etc.). This is divided into a platform-specific part and a platform-independent part (which allows for easier porting).
3. The "StarView Library" (svl) and the toolkit library, with extra widgets, etc.
4. SFX2 (StarView Framework v2) does things like dispatching messages and creates a "shell" for each application.
After that I get a bit lost, but it's mostly app-specific stuff.
There's also a component model baked in called UNO (Universal Network Objects), a network-transparent component hosting environment. It's what allows the use of Python and Java alongside the native C++. It's kind of cool, but many folks don't want to use it any more than necessary.
I think it helps to understand that it was a German company that created StarView; they really thought in terms of layering :-)
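To make the layering concrete, here's a toy sketch in C++ of the pattern being described: a single string tunnelled down through shell, toolkit, and platform layers before it reaches the native resource. None of these names are real LibreOffice classes; it's just the shape of the thing:

    #include <iostream>
    #include <string>

    // Layer 1: "OSL"-style platform abstraction - owns the raw resource.
    class PlatformWindow {
    public:
        void setTitle(const std::string& rTitle)
        { std::cout << "native title: " << rTitle << "\n"; }
    };

    // Layer 2: "VCL"-style toolkit widget wrapping the platform object.
    class ToolkitWindow {
        PlatformWindow m_aImpl;
    public:
        void SetText(const std::string& rText) { m_aImpl.setTitle(rText); }
    };

    // Layer 3: "SFX2"-style shell that dispatches requests downwards.
    class DocumentShell {
        ToolkitWindow m_aWindow;
    public:
        void Dispatch(const std::string& rArg) { m_aWindow.SetText(rArg); }
    };

    int main() {
        DocumentShell aShell;
        // The caller never touches the native window; the string is
        // tunnelled through every layer on the way down.
        aShell.Dispatch("Untitled 1 - Writer");
    }

Now scale that to 20 layers and you can see why fixing a bug sometimes means threading data through all of them.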
1. I've been wondering this for a while but never knew who/where to ask, but where's the original StarOffice code dump? What actually got dumped? I'm fascinated by that whole transition timeline, both in terms of its own history and also as one of many code dump events. (One of these days I'll go learn about Mozilla.)
2. Are there any interesting tidbits floating around? Any old DOS-era or early-Windows-era code, for example? I wonder if (with great effort) I could build something for Win3.1. (Of course I realize I'm not describing LO at all here)
2000 is pretty old, as tech goes, surely something fun got captured. The only reason I can see for super-old code not floating around somewhere is that it was explicitly excluded.
There was a whole bunch of code in SAL that worked for OS/2, but it came with maintenance overhead; both uwinapi and the OS/2 port have since been removed from LibreOffice.
As for a code dump before 2000 - man, I would love that. But sadly, it doesn't exist. The Star/Sun/Oracle guys changed VCSes a few times. They had an insane procedure for merging patches into the trunk, and we have lost the history of those branches, so between 2001 and 2009 the code becomes harder to follow.
Unfortunately, there are decisions that were made before 2000, but the reasoning behind them is lost to us now due to the lack of version history from before that point.
It might take more than a few hours (e.g. you'll need to set up a dev environment - but the builds are quite robust nowadays unlike some projects), and the codebase is complicated - but contributing to LibreOffice was fun and helped me gain real world software experience. I'd only worked on small personal projects before LO.
You definitely won't cause harm, and there are devs who will try to help you get your patch submitted. (They even employ someone to help new contributors get up to speed and to help get their patches in.) As chris_wot said, get on irc and people will generally be able to help you get set up. (IRC presence tends to vary by time and day, jfyi.)
The articles on building are found in the General Programming Guidelines box in this index: https://wiki.documentfoundation.org/Development
Then you will want to pick an easy hack: https://wiki.documentfoundation.org/Development/EasyHacks Note that there are more challenging easy hacks marked with the difficultyInteresting keyword. This step also implies creating an account on Bugzilla.
If the easy hack is not an ongoing thing, you should comment on the Bugzilla report to say you want to work on it, assign it to yourself, and change the status to ASSIGNED.
If the easy hack is an ongoing task, you might want to check https://gerrit.libreoffice.org/ for unmerged patches related to it by searching for message: plus the bug number. This way you can be confident you are not duplicating work.
To start working on your patch, you typically create a new branch:
git checkout -b my-branch-name
as advised in the "Sending a Patch to Gerrit" section of this:
The topic of amending patches is dealt with in this additional article: https://wiki.documentfoundation.org/Development/gerrit/Submi...
Easy hacks come with mentors attached to them, so the patch reviews should not get stuck in limbo. "Normal" patches might sit there even for some weeks, but I hope the situation improves when TDF hires the new developer mentor(s): https://blog.documentfoundation.org/blog/2017/11/07/job-sear...
Btw, there is a job search for feature implementation going on: https://blog.documentfoundation.org/blog/2018/01/16/tender-c...
In my experience, and this goes doubly so for small FOSS projects, you have to remember that it's someone else's baby. Maybe they were the person who started it, maybe they've been there since the early days, or maybe they have an unusually strong attachment or sense of ownership over whatever it is you're changing. At any rate, the moment you submit a patch, there's always the risk that it may be met with harsh criticism (sometimes deserved; sometimes not), and at least part of that may be the nature of the human psyche. So, the best defense is to do things by the book, if there's an established process, or failing that, persistence. Sometimes, just the existence of a patch in the wild--even if it isn't accepted by the developer--can be enough to motivate them toward a fix. I've seen plenty of relatively minor fixes get rejected only to be fixed hours later by a maintainer because the patch got them thinking.
Let's be honest, even if it bruises the ego a little: As a user, I don't care how it gets fixed. I'm not in it for the glory. I just want it to work.
There's of course the usual advice : Keep your pull requests/patches short and targeted. Don't try to do too much, because larger, more widely reaching patches are likely to be ignored or turned down. Document what you can, why you did what you did, and what you intended to fix; but don't be too verbose (guilty, as you can see here!). Asking follow-up questions is instrumental if it's rejected, because there's probably something that was missed or overlooked. Of course, this doesn't obviate issues with projects that have a distinct lack of communication skills or toxic personalities. Ironically, I've had some luck with non-committal/non-responsive maintainers by forking small projects, applying my changes and moving on, only to find some weeks later that they want to pull my changes. In those cases, I never expected a fix--I just needed it to work yesterday--and I needed somewhere to publicly source the fixed copy from. That upstream wants the resolution as well is merely an added bonus.
But the key is (polite) persistence. It's also the hardest to do if you don't want to seem rude or pushy. I also think that rejection and criticism play on our fear of failure (whether or not we admit it), which may kill participation. Just don't give up: Think of the other users who don't have your skill set to fix the underlying problem and consider that you're advocating for them!
The best thing that could happen at the moment is for OutputDevice to be gotten rid of entirely - it's a leaky abstraction.
It also speaks to a significant problem: large codebases are inherently difficult to understand, which forms a barrier to entry. This is despite the fact that most software makes use of various abstractions to break problems into smaller parts.
In the general sense, I believe it's worth thinking about how the particular abstraction mechanisms used in software impact a new contributor's productivity, vs. the productivity of someone more familiar with the codebase, especially for projects that are developed in the open, or with high developer turnover.
His main point is that we focus too much on the software tools, and not enough on how we think about the software we are modifying.
I can say from experience that writing things out on paper without looking at the code works, and more people should do it. Like he states in the slide deck: it lets you off-load mental burdens and avoid forgetting the details of a complex problem partway through unravelling it. It lets you get back into the work weeks or months later.
Writing messy pseudo-blog-posts in github issues and very long code comments is how I kept my sanity on a project where I was the sole programmer for two years.
My favorite is tracking down a member function and being completely confused until I realize that I missed one of the layers and it was overridden in that instantiation.
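Here's a minimal C++ illustration of that gotcha (hypothetical class names, not actual LibreOffice code): the call site looks like it runs one function, but a subclass a few layers down has quietly overridden it:

    #include <iostream>

    class BaseDevice {
    public:
        virtual ~BaseDevice() = default;
        // The function you think you're stepping into...
        virtual void DrawRect() { std::cout << "base DrawRect\n"; }
    };

    class PrinterDevice : public BaseDevice {
    public:
        // ...is overridden here, in a layer you forgot existed.
        void DrawRect() override { std::cout << "printer DrawRect\n"; }
    };

    int main() {
        PrinterDevice aPrinter;
        BaseDevice& rDev = aPrinter;
        rDev.DrawRect();  // prints "printer DrawRect", not the base version
    }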
ProTip: Manager or Controller is not a good name. To me a good name is e.g. "PostDetailSerializerResolver" - it's a component that will resolve serializers for post detail objects. And according to the single-responsibility principle, it should only do that. If you need it to do more, you must change the name.
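A sketch of what that implies in practice (C++ here, with made-up types, since the original example looks like it comes from a web framework):

    #include <string>

    struct PostDetail { int id; std::string body; };
    struct Serializer { std::string format; };

    // One responsibility, and the name says exactly what it is: resolve
    // the serializer for a post detail object. Nothing else lives here.
    class PostDetailSerializerResolver {
    public:
        Serializer Resolve(const PostDetail& rPost) const {
            // Toy rule: long posts get a paginated serializer.
            return Serializer{ rPost.body.size() > 1000 ? "paginated-json"
                                                        : "json" };
        }
    };

If this class ever starts caching serializers too, the honest move is to rename it or split it.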
Some of the best refactoring I ever did for long-term code maintenance was nothing more than untangling a web of abstraction layers and renaming them to fit the actual context, changing nothing about the functionality of the code.
Sometimes you start out with an idea for what the architecture will be, but then things change and the naming isn't updated.
Oh yes. I work on this, and require very little inducement to talk about it.
For example: https://news.ycombinator.com/item?id=13565743#13570092
Needless to say, LibreOffice works very nicely in an offline environment, and it's very rare that I find a file it does not handle well. (The only example I have — a Word document with Track Changes enabled, heavily revised by multiple people, multiple times over many days. L.O. seemed to lose track of the changes and showed a wrong document. I imagine for most people this is pretty rare.)
And you don't have to worry about malware (for those tempted to pirate Microsoft Office).
This has been a huge PITA for me since I often need Word and Excel on a computer that's restricted from accessing the internet.
> You need to be connected to the internet to download this installer file, but once that's done, you can then install Office offline on a PC at your convenience.
> After your Office installation is complete, you need to activate Office. Be sure you're connected to the Internet and then open any Office application, such as Word or Excel.
I mean the lede is "problems with slow speed or unreliable connections" but at the same time ... offline is offline.
I assume they have just decided to make this functionality a special case for enterprises of a certain size. I emphasized to them that this could be bought through a business channel if needed, but I assume I don't have the right scale.
Here is our meta report for MSO formats: https://bugs.documentfoundation.org/showdependencytree.cgi?i...
"depends on 1394 open bugs" - this includes all the further meta reports.
The article mentions EMF/EMF+. This format is basically a list of GDI calls, and it's a bitch to map onto other graphics stacks when you are not on Windows.
The specification of the format is public (kudos to Microsoft), which helps a lot. But there are a lot of corner cases, and the spec can be quite hard to understand at times. Computing the origin correctly is tricky, for example (VIEWPORTORGEX, WINDOWORGEX, etc.).
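For the curious, the logical-to-device mapping those records feed is roughly the documented GDI transform: scale each axis by viewport extent over window extent, shifting between the two origins. A sketch (not code from any particular library):

    // Logical -> device mapping, per the documented GDI mapping modes.
    struct Point { long x, y; };
    struct MapState {
        Point windowOrg, viewportOrg;  // origins in logical / device space
        Point windowExt, viewportExt;  // extents in logical / device space
    };

    Point LogicalToDevice(const MapState& m, Point p) {
        return {
            (p.x - m.windowOrg.x) * m.viewportExt.x / m.windowExt.x + m.viewportOrg.x,
            (p.y - m.windowOrg.y) * m.viewportExt.y / m.windowExt.y + m.viewportOrg.y,
        };
    }

Get one origin or extent wrong and everything renders shifted or scaled, which is why it's such a common source of bugs.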
For some stuff, it's impossible to get it right.
One example that comes to my mind is a text encoding bug I got a year ago (I maintain an emf to svg conversion library). It took me one week to track it down. In most cases the strings are UTF-16. But in weird cases, when the ETO_GLYPH_INDEX flag is set, the "encoding" is directly the index of the glyph inside the selected font.
It's not the most portable way to handle text... If you don't have the exact same font on your computer, there is a good chance the text will not be displayed correctly.
And converting back to a well-known encoding is tricky (using the cmap tables of the TTF file, you have to build a reverse cmap, pray there is a 1-1 mapping between glyph and Unicode code point, and convert back to UTF-8 or something).
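A sketch of that reverse-cmap step in C++, assuming the font's cmap has already been parsed into a code-point-to-glyph map (hypothetical types, no real font library involved):

    #include <cstdint>
    #include <map>

    // cmap: Unicode code point -> glyph index, as parsed from the font.
    using Cmap = std::map<uint32_t, uint16_t>;
    // Reverse cmap: glyph index -> code point, kept only when unambiguous.
    using ReverseCmap = std::map<uint16_t, uint32_t>;

    ReverseCmap BuildReverseCmap(const Cmap& cmap) {
        ReverseCmap rev;
        std::map<uint16_t, int> uses;  // how many code points hit each glyph
        for (const auto& [codepoint, glyph] : cmap) {
            if (++uses[glyph] == 1)
                rev[glyph] = codepoint;  // first mapping: tentatively keep
            else
                rev.erase(glyph);        // not 1-1: no safe answer, drop it
        }
        return rev;
    }

Any glyph that falls out of the map is exactly the "pray" case: for those, an ETO_GLYPH_INDEX string gives you no way to recover the original character from the record alone.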
Even Microsoft itself, in the Mac OS version, got it wrong.
For reference, these are the bits of code that handle "font index encoding":
What people generally don't understand is that the save file formats do not describe what the output will look like. Back in the old, old, old days you would write a word processor and to "save" the file, you would just write a binary image of your data structures. That was what the doc file format was originally.
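To make that concrete, the old-school "save" amounted to something like this toy C++ sketch (not the actual .doc writer): the in-memory structs go to disk byte for byte, so the file only makes sense to code that shares the same layout and the same logic around it:

    #include <cstdio>
    #include <cstring>

    // An in-memory paragraph record, exactly as the editor lays it out.
    struct ParaRecord {
        char text[64];
        int  styleId;      // meaningless without the app's style table
        int  cachedWidth;  // an implementation detail leaks into the file
    };

    int main() {
        ParaRecord rec{};
        std::strcpy(rec.text, "Hello, 1989!");
        rec.styleId = 3;
        rec.cachedWidth = 417;

        // "Saving": dump the raw bytes of the struct.
        FILE* f = std::fopen("doc.bin", "wb");
        if (!f) return 1;
        std::fwrite(&rec, sizeof rec, 1, f);
        std::fclose(f);
        // Any other program reading doc.bin must replicate this exact
        // layout *and* the logic that gives styleId 3 its meaning.
    }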
Although people didn't tend to program this way back then, imagine a Rails server as an analogy. Now imagine that you handed over your database (with tables that correspond to each model object). That's kind of analogous to a "save" file. In order to render the data, you need the business logic (which, in Rails, is usually housed in the controllers) and you need the rendering logic (which is usually housed in the views). Just having a description of the database tables doesn't really help me all that much in rendering the data, as you can see.
Now imagine that we take a different Rails app that happens to do the same kind of thing. Some people say, "Oh it would be swell if you could import the data from the first Rails app and render it in yours. You are basically doing the same app, so it should be easy". But it's obviously not -- I wrote my controllers and views completely differently than the other Rails app. My model objects contain completely different data and are structured completely differently. Even worse -- I don't even render things the same way. If I want to support this functionality, I basically need to completely rewrite my rails app to be exactly the same as the previous Rails app.
And this is essentially the problem. If you want to render Word documents the way Word does, you basically have to rewrite Word in your word processor. If you also want to support your own format, then you have to maintain 2 word processors. If you want to support some other file formats? It just gets worse and worse.
So what you could do is completely rewrite the rendering engine so that it is more flexible. But the problem is that this is a massive undertaking and nobody will believe you if you tell them that it is necessary. Even worse, there is a surprising amount of variation in rendering. For example, how do you deal with page breaks in footnotes? Most people don't write footnotes that are longer than a page (if you look at my posting history, you might suspect that I'm one of those people, but I digress). However, it's very common in the legal field. There is a specific correct way to break pages in footnotes and Word historically has not done it correctly. They do it differently. This is actually what kept Corel/Word Perfect in business for a long time -- in order to print the document, you would have to use Word Perfect and since the file conversion was crap, you pretty much had to keep it as a Word Perfect file forever.
I will say that while I worked on Word Perfect, Microsoft was very helpful in explaining how their formatter worked. They even regularly sent us bug reports when our import filters had regressions, etc. I've never believed that they intentionally obfuscated the process. It's just a difficult problem.
The former is the more likely scenario.
I've also been, on the quiet, going through the commits and summarising them better. Before 2009, there was this insane way of grouping commits, and it makes it hard to know who did what to the code!
Then again, Win10 is going down the same route so maybe it will all even out (...).
For complete novices, yes the UI is not very intuitive. But Word isn’t much better either.
I recently had to start using Word again at work and even though the ribbons seem intuitive and user-friendly, actually finding the feature I needed has not always been easy. I’ve had to Google quite a bit (the help is excellent though).
At least in LibreOffice everything is always in the menus and does not suddenly disappear if your window is not wide enough.
UX doesn't enter into my criticism here.
You mean Gnome 3? There are other desktops.
Different desktops have different goals.
In all honesty, my biggest issue is that Calc can slow to a crawl when a sheet gets even moderately large, especially when filtering or sorting.