Building stuff like a text editor, simple game, database etc. from scratch which will never be used by anyone other than yourself can be immensely challenging and satisfying.
This isn’t to say that beating programming challenges doesn’t improve anyone’s day, but the demand for the solution isn’t made public and clear, which many people see as a necessary requirement for meaningful work.
More like the benefit of people being ungrateful for the effort you put in and then being hostile about your lack of continued motivation to solve their problems. All for absolutely zero financial gain, so the difference between that and a personal project is that you feel good for solving something, then bad because people hate you.
Don't get me wrong, helping people is great and I do in fact find it one of my greatest motivators, especially for things I would otherwise consider boring (or indeed, mundane).
But something that is in and of itself more interesting? If you wouldn't rather work on that, doesn't that mean it's just not interesting enough? :)
I guess everybody is different. Even though I have found that helping people is, for me, probably the strongest motivator there is, if there is one thing that can make me procrastinate on that, it's problems that are in and of themselves more interesting :)
So for me the solution was simply moving away from an everyday developer job at work and more into a product management/CTO-like role.
Working with a customer will make you a better businessman, not really a better programmer.
Doing a project with customers is extremely valuable if you intend to start a business, but if you hate that and just want to be a good coder and make a living out of it, there is a solution: find a good boss/manager. He will take care of the business stuff and allow you to put all the skills you got doing toy projects to good use.
Otherwise, if you want to do something that looks more like a real-world project without the hassle of dealing with customers, make software that you really use, and share it. A good candidate would be a video game that you enjoy playing, or some tool for a hobby of yours, or a utility. That way, no one will stop you from experimenting, but you still need to think about usability.
Wow, everything wrong with the tech industry right here.
Actually I'm not even sure. I have a little experience with freelance web development around 2007 (browser war veteran), and I found navigating between customers expectations and "good code" to not be hard at all. I did read a lot of blogs on webdev (including the business side of it, contracts and expectation management, etc) so I probably got some great advice somewhere. Reading about it actually got me more enthusiastic about tackling the problem as well as I could. But usually the details that made it good code or not were technical details that I didn't bother them with. Or they were things I could sell; browser compatibility and accessibility (back then I could use the line "Google is your most important blind user" -- now no longer true).
Of course I am aware that as a freelancer, I enjoyed a lot of freedoms that you don't always get as a programmer-for-customers in different business environments. But if that's the case, that really proves it's not the customers that get in the way, but the institution.
 I kinda miss the ... blogosphere.
This comment really hit home for me. Making something good is often at odds with
making something that's profitable/popular. I've often found that the most popular
software is far more bloated/limited than the best software. This might be because
people like me tend to favor smaller programs that are FOSS and work well with other
FOSS; this view is held by a minority, and is thus a less profitable niche to cater to.
One example is mpd+ncmpcpp versus Spotify or iTunes. The former is far more
well-designed, performant, featureful, and flexible; however, it will never be
popular/profitable because it's FOSS targeting end-users, it runs in the terminal,
it lacks CPU-intensive pretty animations, and requires users to understand that it
involves a client and a daemon.
From IRC/Matrix versus Discord to Linux/BSD versus macOS/Windows (an
opinion that's a bit more controversial here on HN, but held by many nonetheless),
the list goes on. I've defaulted to assuming that if a program intended to be used
directly by end-users is mainstream and/or quite profitable, it's likely not for me.
The few exceptions (e.g., massive web browsers like Firefox) exist only because
there is no alternative I can get away with.
[EDIT] To elaborate: I recently thought about coding, and I realized that at some point all code "touches" reality in some way, and that place is what matters most. Either it moves something physical, provides an answer to a person, or makes someone happy. When coding, it's a good thing to ask yourself: how does my code "touch" reality?
The distinction here is probably between being an "engineer" vs. being a "programmer". The former's concerns include all this stuff that's for the business. The latter is just interested in the craft of writing computer programs. And that's all the article was about: skill as a programmer. Which might help you be more effective as an engineer, depending on the projects you face. But the point is self-edification; not everything has to be in service of money.
What an odd thing to say on a site named after the Y-combinator ... ;-)
The author's suggested projects home in on programming mastery, whereas your suggested project homes in on a plethora of skills. We can't really say that one substitutes for the other, as they achieve two different things. I wouldn't necessarily start a startup just to learn WebAssembly, but I would learn WebAssembly if I needed to for a startup. In the former case, my goal is to learn WebAssembly, whereas in the latter, it's to start a startup.
My projects have grossed a lot of revenue, but several of these projects would be challenging for me and push the limits of my skills.
What I am saying is that there is room for more nuanced language to describe all these matters in a way that is detached and clinical.
Moreover, learning how to deliver value builds empathy for non-programmers who also deliver value and also the realization that programming ability is not where a company lives and dies. These are the number one and number two things most often lacking in programmers.
I have to fight with "seasoned pros" on the regular to get them to stop doing things like sending passwords in email or worse, put them in text files in git. Because A) what the hell are you thinking and B) holy shit, we're a public company WHAT THE HELL ARE YOU THINKING!? You have to explain to them, repeatedly, why this is bad and also things like why a production-ready database isn't the same as the single-instance point and click AMI they spun up...
All of this because the only virtue that they know is "I shouldn't be blocked by anything." Unfortunately some of them are such skilled programmers that they'll drag entire IT and GRC organizations screaming in their wake trying to make sense of the mess.
This doesn't sound like the activity of someone with a high level of 'raw programming ability'.
Not so much. First, great programmers are very well compensated, and second, most tech companies are organized to keep great programmers from even considering business issues: That's the land of managers, product owners, and scrum leads. Programmers are supposed to implement the requirements they're given quickly. Not how it should be maybe, but how it generally is.
At least in my 20 years of experience, the most effective companies have developers (sometimes in those roles you listed) involved in the requirements gathering or at least work planning phases. Does anyone really enjoy just being a cog in the "feature factory" you described? I doubt it.
Given the number of broken Google SDKs or Cloud features I have to deal with in my day to day though (and game of whackamole that we have to play with them), this seems accurate.
Developers not caring about the customer explains most of Google’s major failures outside of advertising.
Desired for who? Your employer? Programmers are human beings that can do things for their own personal enjoyment, not just to increase shareholder value.
“programming” is just a tool to me - not an end goal. I’m just as proud of the code that I was smart enough not to write and use an existing product/service/module for as I am for the code that I did write.
Also see: imposter syndrome :-)
I agree with you! This list was about projects for learning. A few months ago I had a blog post that was more product focused, regarding the lessons I learned from releasing games (I think they're generalizable though). Definitely a different world when you have customers to please and motivate you!
Basically I found that the fewer customers you have and the further you are from the business side, the happier you are, i.e. assuming you are still programming, of course.
For programmers focusing on businesses, the only happy solution, imo, is to be a consultant. But this basically requires picking up business skills. Of course this is a biased view, as I work as a BA, who deals with business every minute.
Others are technically challenging. That difficulty is precisely why others don't tackle the problems, but generally customers are looking for a solution and you have a ton of flexibility when working with customers.
To put it in different terms, in some cases "if you build it, they will come" is true. The hard part is determining when it applies.
However I also used to do some freelance web development back in the day, and sure enough I did learn things. Mostly things about customers, not about programming. They don't know what they want, they don't know what is good and they need help with that. This is called requirements engineering, and IMVHO starting a project that has customers is the literal worst moment to start learning that. Fortunately I had a class about it in uni and I happen to be great at explaining technical things to non-technical people. I also learned that I'm actually really good at what I did (looks great + works well + happy customers), they gave loads of recommendations and I got requests for at least a year after. Too bad about the burnout that happened soon after, for no particular reason, that turned my life upside down.
Either way, yes it was educational but in no way did it make me a better programmer. Maybe a better business person? (And I'm not a very good business person)
All in all, I think I would recommend "a project that has customers" only to people who are young. It has less stakes when you're young and it's chock full of wonderful learning opportunities for young people. If you're older, I think that most of the lessons you would also have gotten through general life experience. Which again demonstrates how little this exercise has to do with programming. But yeah, it can definitely be a valuable experience.
It was one of the most stressful periods of my life. I was taking them on with another programmer, and I completely misjudged the amount and quality of work we would have to deliver. I solved it by working day and night for a year or two, or so.
I'd say, do try a project that has customers.
Think twice about a project for a client.
I thought "client" was just the fancier term, therefore you have customers at a book shop and clients at a law firm. Does a hairdresser or tattoo artist have customers or clients?
I meant the following: customers -> group of people who buy or subscribe to your application, preferably via the App Store, Google Play or some other in-between party. Usually no contract involved.
Paying client -> a single person or business that pays you to develop software, probably with a contract that is signed and specifies what you're going to build, on which timeline, etc.
It’s pretty hard for one person to solve enough of a problem to have users in the general public. It obviously happens, but it’s not a sure thing.
It’s hard to get users who are not in the general public, too. I had internships early in my career working on IT stuff, at a large business and a university. I found those jobs were actually terrible for getting feedback from users because there was so much bureaucracy. The management would actually insulate you too much from complaints or feedback.
Dev tools are a good learning experience because programmers will give honest feedback if they don’t like something :)
The counter-argument to that is that processors are ridiculously fast in human timescales --- copying memory at gigabytes per second --- so unless you're focusing your use-case on editing correspondingly huge files, there's no real need to make your implementation more complicated than a single array. Even when DOS machines with <640K of memory and memcpy() speeds in the low MB/s were the norm, people edited text files of similar sizes, with editors that used a single array buffer, and for that purpose they weren't noticeably slower than ones today.
My ideas for challenging projects are a little less open-ended, so they exercise a different set of skills: being able to implement a specification correctly and efficiently.
- TTF renderer
- GIF or JPEG en/decoder
- Video decoder (start with H.261)
>> A rope, or cord, is a data structure composed of smaller strings that is used to efficiently store and manipulate a very long string. For example, a text editing program may use a rope to represent the text being edited, so that operations such as insertion, deletion, and random access can be done efficiently.
Yeah, good luck enabling line numbers in such an editor.
In Emacs, which uses a gap buffer for storing text, line numbers have been notoriously slow. It's gotten a bit better lately, but suffice it to say, a naïve flat array / gap-buffer approach is not good enough for some relatively common scenarios even on modern hardware.
I've written code to do word wrapping, and it was surprising how fast it was. Line numbers are similarly complex.
Speaking as someone who's gone all the way from implementing a Red-black tree to making a rope data structure using the RB tree, to making a text editor that can edit almost arbitrarily large text files (dozens of gigabytes) without user-perceivable latency ;-)
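For anyone who hasn't built one, the rope idea quoted above fits in a short sketch. This is a deliberately minimal, unbalanced Python version (the class names are mine; real implementations balance the tree, e.g. with an RB tree as above, and store byte buffers in the leaves):

```python
class Leaf:
    def __init__(self, text):
        self.text = text
        self.weight = len(text)  # a leaf's weight is its length

    def index(self, i):
        return self.text[i]

class Node:
    def __init__(self, left, right):
        self.left = left
        self.right = right
        # an internal node's weight is the total length of its left subtree
        self.weight = total_length(left)

    def index(self, i):
        # descend left if i falls in the left subtree, else right (adjusted)
        if i < self.weight:
            return self.left.index(i)
        return self.right.index(i - self.weight)

def total_length(node):
    if isinstance(node, Leaf):
        return node.weight
    return node.weight + total_length(node.right)

# concatenation is O(1): just wrap both pieces in a new node
rope = Node(Node(Leaf("Hello, "), Leaf("world")), Leaf("!"))
print("".join(rope.index(i) for i in range(13)))  # Hello, world!
```

Insertion and deletion then become split-and-reconcatenate on subtrees, which is where the "edit gigabytes without copying everything" property comes from.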
A list of strings is more elegant, of course, where only the line being edited becomes a gap buffer. It taxes the allocator a bit more, though, which might be a concern on computers of the time when Emacs was born.
I agree strongly with designing your code so it's easily changeable into whatever new features are needed. This is much easier said than done, and I don't know if anyone has written well about the tricks of that trade.
But anyway, if you have that kind of code, swapping out whatever you need to make line numbers happen is no more work later than sooner.
Code bases with features implemented that are never used, but you still have to keep working through all changes, because someone imagines it will be a real requirement someday, are what my nightmares are made of.
If your application has grown as long as it could with the simple implementation, and now it is all too slow, chances are there's a lot of code depending on the interface. If your interface (and the implementation) is too simplistic, then all of that code will need rearchitecting, too.
You are not going to write a really great text editor as a learning exercise. It has been done by better programmers who had a better overview of the problems, over thousands of man-hours.
This automatically means the task is as useless as a Game Boy emulator or a BASIC compiler. The underlying "Things to learn" points are good, but the tasks themselves are not.
It's not really clear what your point is. You say the task is "useless"—what does that mean? Personally I can say that you are categorically wrong, because the skills I gained building things that are not completely new ideas fueled my passion for programming and opened up doors for me that otherwise would have remained closed. Even if I didn't still use a lot of these projects myself (because I built them to fit me), the value I derived from them would still be significant in the "grand" scheme of my life.
If a programmer is excited about the idea of writing her own text editor, what would you suggest she build instead that will sustain that same excitement and offer exploration into the same diverse subject matter but also satisfy your nebulous criterion of not being "useless"?
I think you are misunderstanding the concept of an array. An array has 1) an interface that is easy to use. On the other hand, by definition, an array is 2) contiguous in memory. Property 1 is good, but 2 can cause problems. I think you want only 1.
The solution is to create a data type that has the interface of an array but a different implementation under the hood. You can have a linked-list of arrays, a tree of strings, etc.
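To make that concrete, here's a rough sketch of the "linked list of arrays" flavor (the names and the tiny chunk size are just for illustration; a real buffer would use kilobyte-sized chunks and an index to find the right chunk faster than a linear scan):

```python
CHUNK = 4  # tiny for demonstration; a real buffer might use ~4KB chunks

class ChunkedText:
    def __init__(self, text=""):
        # store the text as a list of small strings ("chunks")
        self.chunks = [text[i:i + CHUNK] for i in range(0, len(text), CHUNK)] or [""]

    def insert(self, pos, s):
        # find the chunk containing pos, then split and splice locally
        for idx, chunk in enumerate(self.chunks):
            if pos <= len(chunk):
                merged = chunk[:pos] + s + chunk[pos:]
                # re-split only the one affected chunk; the rest are untouched
                self.chunks[idx:idx + 1] = [merged[i:i + CHUNK]
                                            for i in range(0, len(merged), CHUNK)]
                return
            pos -= len(chunk)

    def text(self):
        return "".join(self.chunks)

buf = ChunkedText("hello world")
buf.insert(5, ",")
print(buf.text())  # hello, world
```

The interface stays array-like (insert at a position, read the whole text), but a mid-buffer insert only copies one small chunk instead of everything after the cursor.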
Vague justifications like "can cause problems" are probably exactly what he's referring to, in fact - people who know that inserting elements into an array is "slow" and end up writing large and complex code as a result. Yes, it's O(N) on the length of your text, but the point is that for a couple of megs of text, O(N) is perfectly acceptable.
At least on a desktop, that'll fit in L3 cache which these days is around 175GB/sec. Or to put it another way, inserting that single char can probably be done at around 40,000 times per second. Which is faster than I can type, at any rate.
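If you want to check that claim on your own machine, here's a quick back-of-the-envelope benchmark (Python, so the absolute numbers only suggest the memmove cost, not match hand-written C):

```python
import time

# Rough microbenchmark of the naive approach: one-character inserts into
# the middle of a ~2MB bytearray, each of which memmoves the ~1MB tail.
buf = bytearray(b"x" * 2_000_000)
N = 1000
start = time.perf_counter()
for _ in range(N):
    mid = len(buf) // 2
    buf[mid:mid] = b"y"          # slice-assign insert: O(n) copy of the tail
elapsed = time.perf_counter() - start
print(f"{N / elapsed:,.0f} single-char inserts/sec")
```

Even through an interpreter, mid-buffer inserts into a couple of megabytes typically land in the tens of thousands per second on a modern desktop, which supports the "faster than you can type" point.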
The other problem with your comment is support for the undo operation. Even if you use a flat array, you need a more sophisticated data structure for storing previous changes. Storing a separate array for every single change is not an option.
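Storing deltas instead of snapshots is enough for that, though, even with a flat array. A sketch (hypothetical class; a real editor would also group keystrokes into undo units):

```python
class SimpleBuffer:
    """Flat-string buffer with delta-based undo (a sketch, not production code)."""

    def __init__(self, text=""):
        self.text = text
        self.undo_stack = []  # each entry: (position, inserted, deleted)

    def insert(self, pos, s):
        self.text = self.text[:pos] + s + self.text[pos:]
        self.undo_stack.append((pos, s, ""))

    def delete(self, pos, n):
        removed = self.text[pos:pos + n]
        self.text = self.text[:pos] + self.text[pos + n:]
        self.undo_stack.append((pos, "", removed))

    def undo(self):
        if not self.undo_stack:
            return
        pos, inserted, deleted = self.undo_stack.pop()
        # reverse the recorded operation: remove what was inserted,
        # restore what was deleted
        self.text = self.text[:pos] + deleted + self.text[pos + len(inserted):]

buf = SimpleBuffer("hello world")
buf.delete(0, 5)
buf.insert(0, "goodbye")
buf.undo()
buf.undo()
print(buf.text)  # hello world
```

Each undo entry costs only as much memory as the edit it records, so the history stays small even when the buffer itself is a single flat array.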
By different you mean wrong. PHP calling an ordered hash map an array doesn't make it one.
from the article:
> Luckily, there are some nice data structures to learn to solve this.
You could have also learned a new data structure!
I mean, it should be obvious that "this thing that the cursor does when moving lines" isn't the big takeaway from this challenge. It's almost cute that the author never noticed it (as a programmer), because I actually use that behaviour to navigate code sometimes. Who hasn't done a quick arrow-left/right to make the cursor lose its memory of which column it used to be on?
> Even when DOS machines with <640K of memory and memcpy() speeds in the low MB/s were the norm, people edited text files of similar sizes, with editors that used a single array buffer, and for that purpose they weren't noticeably slower than ones today.
No way. Every reasonably performant text editor in those days used special data structures and not just an array. Imagine having to copy the entire buffer on each key press (so, when inserting at the start of the file). Believe me, on a 640K DOS machine you'll feel that.
This isn't new stuff, I learned about these data structures in uni -- except I don't remember them because back then I was young and arrogant and didn't think you'd need these fancy data structures for something as simple as an editor :) :)
... but if you never tried to write one, it's hard to see in what ways these editors are not as easy as you think.
But, OFC, these new "programmers" can't even figure out basic Unix tools. Or performance.
That's hardly fast, yet still a lot snappier than most modern editors' UI or the web, where apparently achieving 60 fps for a few hundred dynamic DOM nodes is some kind of an achievement.
This approach really starts to suck when you implement macros that are going to perform a lot of one-char inserts quickly. Or when you're editing multi-gigabyte files.
> This approach really starts to suck when you implement macros that are going to perform a lot of one-char inserts quickly. Or when you're editing multi-gigabyte files.
I'm working on an editor that I've optimized for such cases. In a test it made random edits to a 4GB file in < 50 microseconds. But, it cost a load of sweat and blood to get that rope data structure right. And it loads files only at about 100MB/s (should optimize for bulk inserts). https://github.com/jstimpfle/astedit
It's been around a decade since that line was crossed. The peak bandwidth of DDR3-1333 is just a bit over 10GB/s.
What operation is that? Search and replace might have that effect but could be done by copying the entire buffer with replacement happening along the way.
Until you actually have to implement your algorithm on a mobile device that is both memory- and power-constrained and that doesn’t have a swap file. The OS will either kill your program for being too memory- or power-inefficient, kill another program running in the background (not a great user experience), and/or force the use of the high-power cores, burning unnecessary battery life when a more efficient algorithm could have used the lower-power cores.
Attitudes like this also explains why developers don’t think twice about delivering battery consuming Electron apps.
Today you have fancy rendering, and users expect an instantaneous editing experience, which again suggests a more sophisticated data structure for the editor.
Which all text editors have, when you look inside vi/emacs/nano/whatever...
I think the other major corner case is when you need concurrent, distributed editing (although that's not popular or anything these days), in which case an array is a very poor data structure.
On the other hand, TIFF writers can (very conveniently!) be almost as simple as you want, including no compression at all, just blobs of raw pixel values, and a smattering of tags for width, height, pixel format, and that's it. The only thing simpler to output IMO would be uncompressed ASCII formats like XPM.
So in that sense you're correct- the simplest possible JPEG writer is much more complicated than the simplest possible TIFF writer, but TIFF in general is extensible to a fault (arguably), in the sense that the number of possible combinations of pixel and metadata encodings you have to prepare yourself for when opening arbitrary .tif files are far greater than when opening arbitrary .jpg files, including JPEGs within TIFFs.
The initial format is older than GIF87a (no animation, which is what people associate GIF with nowadays). It had a header, but that was pretty much it. Of course the format developed over time and even added LZW once the patent expired. Currently TIFF is all kinds of things, so writing a fully featured reader is a proper challenge (perhaps not coding-wise, but understanding and implementing the myriad types/extensions, etc.)
my 2 cents
I'll also link Cristi Cuturicu's "A note about the JPEG decoding algorithm", which is where I started my decoder implementation from, and it was indeed a ton of fun.
That's our industry in a nutshell. Our computers, instead of becoming more capable over time, can barely keep pace with the increasing naivety of our programmers.
The GP offered a valid decision point to consider based upon what an engineer is solving for. I don’t think he said that an array was the solution he’d ship in a production text editor to millions of end-users.
Engineering is hardly naive. :)
One example? Notepad.
In some terrible, dark dimension, Notepad has a ribbon interface and supports PDFs.
I believe an older version of Notepad even had a (fairly low) limit on file size it would open.
I mean that's the reverse argument, computers have gigabytes of memory today, and are super fast, so you should be able to load a multi gigabyte text file and edit it, on a single line, with word wrapping.
Outside of webdev, Unity springs to mind, as another great example of this: The stuff you can do as a single game developer is mind boggling, or at least used to be, until indie devs everywhere started boggling our minds on a daily basis and thus raising the standard of what consumers expect an indie game to be.
This is, of course, not because humans evolved within 50 years to be a lot better or smarter or faster than their predecessors. It is made possible through more flexible, higher-level tooling that you don't have to understand the inner workings of to take advantage of, and through more abundant computing resources; in tandem, these enable work that will be in the "good enough" territory for most use cases.
This is also not a choice that programmers as individuals or even a group make. It's a choice that the market makes.
There is nothing naive about it. Naive is assuming it would be any other way.
What do you mean, increasingly often? This was the case 15 years ago already and I see only examples that it has gotten less, because of all the frameworks that exist.
Also it's exactly what I liked about webdev. When your existing talents for graphics design and explainer-of-technical-things shine in a tech context, that feels good. A lot of programmers have no feel for this, and a lot of designers write awful code. Which could have, but historically did NOT improve at all with higher level tooling, mainly because of this "good enough" attitude. Feel free to prove me otherwise, but what did happen: Thanks to things like Bootstrap, now programmers can avoid the worst design mistakes without having to learn design. Graphics Designers, however, well .. I don't know? Are there tools that allow them to write or generate code that doesn't suck? (Without programming skills, like the coders without design skills).
> This is also not a choice that programmers as individuals or even a group make. It's a choice that the market makes.
> There is nothing naive about it. Naive is assuming, it would be any other way.
I don't know ... Do you believe there no longer exist people that deliver quality over this entire skill set? Or that they somehow exist outside of the market?
For products where deadlines are constantly unrealistic, work is underfunded and underscoped, and demands are ever changing, I'm a fan of providing the simplest conceptual solution to the task at hand and not developing complex abstractions and optimizations too early.
From my experience, that time is typically wasted until the functionality is zeroed in and real money is available to pay for the work, as early complex abstractions typically fail to keep pace with demands, and the optimizations break when the ever-changing requirements... change. That's just my experience, YMMV.
Personally I’m in the latter camp. There’s so many layers of abstraction nowadays which each in theory make programming better/safer/easier which in practice end up creating an incredibly inefficient mess.
Today's software suffers from too many layers of complexity that are each pretty dumb and serve mostly bookkeeping. The result looks like an overinflated bureaucracy. In the example above, using a more efficient data structure for text representation will add at most one layer of abstraction (but there's a good chance you'd create that layer to hide the array anyway), but offer significant benefits in terms of performance, at a cost of little and well-isolated complexity.
This is the best kind of abstraction: complex, deep behavior hidden behind simple interface.
I wish others understood that, because the things I work on are losing performance (and a massive amount of developer time, which could be used for optimization or other useful work) to excess complexity, not too simplistic code.
This is a perfect example of when it’s stupid to keep optimizing.
I've opened files that big in a text editor before, but it was definitely the wrong tool for the job.
Let's take the text editor example. Let's say we use it to edit a large document. Is Moby Dick large enough? It's around a megabyte of (pure) text. Let's figure out a persistence solution. How about "we save the entire text to disk"? So a megabyte to disk. My laptop's SSD does (large) writes at 2GB/s. So the ultra simple solution could save the entire text around 2000 times per second.
That's a lot faster than I can type.
Now the user is either queuing up a bunch of background saves leading to overload or is forced to wait 1s per keystroke.
I guess the simple solution then is to tell the user to buy a $3000 laptop just so it's capable of running notepad.
Anyway, even laptop drives are well over 40-50 MB/s these days, and any disk scheduler worth its salt will schedule this kind of write (one contiguous chunk) near optimally, so still 40-50 writes/s.
And of course, you queue these writes asynchronously, dropping as needed, so if you actually manage to out-type your disk, all that happens is that your save-rate drops to every couple of characters. Big whoop.
Also remember that this is Moby Dick we're talking about. 700+ pages, so something that vastly exceeds the size of the kinds of documents people are likely to attempt with a Notepad class application.
Last but not least, this is a thought experiment to demonstrate just how incredibly fast today's machines are, and that if something is slow, it is almost certainly because someone did something stupid, often in the name of "optimization" that turned into pessimization, because Doing the Simplest Thing that Could Possibly Work™, i.e. brute-forcing, would have been not only simpler but significantly faster.
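The "queue these writes asynchronously, dropping as needed" part can be sketched like this (hypothetical class; `write_fn` stands in for the actual disk write):

```python
import threading
import time

class CoalescingSaver:
    """A sketch of asynchronous saving with dropping: at most one save
    runs at a time, and stale intermediate snapshots are simply
    overwritten before they ever hit the disk."""

    def __init__(self, write_fn):
        self.write_fn = write_fn        # the actual (slow) disk write
        self.pending = None
        self.running = False
        self.lock = threading.Lock()

    def request_save(self, snapshot):
        with self.lock:
            # a newer snapshot replaces any not-yet-written one
            self.pending = snapshot
            if not self.running:
                self.running = True
                threading.Thread(target=self._drain).start()

    def _drain(self):
        while True:
            with self.lock:
                snapshot, self.pending = self.pending, None
                if snapshot is None:
                    self.running = False
                    return
            self.write_fn(snapshot)     # happens off the UI thread

saved = []
saver = CoalescingSaver(saved.append)
for state in ("h", "he", "hello"):
    saver.request_save(state)
while saver.running:                    # wait for the background save to finish
    time.sleep(0.01)
print(saved[-1])  # hello
```

If you out-type the disk, all that happens is that intermediate states get skipped; the latest state is always the one that lands on disk.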
At the same time we ought to be aware of that scaffolding and how it works (or could work), and how to build such abstractions ourselves. Not just because all abstractions leak and potentially introduce bloat, but also because it means I don't have to pull in another dependency to save me a page (or three lines) of trivial code. Or maybe because the "standard" solution doesn't quite support your use case (I can't count the number of times that I've rewritten python's lru_cache because of it not accepting lists and dicts).
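For what it's worth, one common workaround for the lru_cache complaint is to freeze the arguments into something hashable rather than rewrite the cache. A sketch, assuming JSON-serializable arguments (the decorator name is made up):

```python
import functools
import json

def freezing_cache(maxsize=128):
    """Hypothetical workaround: functools.lru_cache rejects unhashable
    arguments like lists and dicts, so serialize them into a hashable
    string key first. Only works for JSON-serializable arguments."""
    def decorator(fn):
        @functools.lru_cache(maxsize=maxsize)
        def cached(key):
            args, kwargs = json.loads(key)
            return fn(*args, **kwargs)

        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            # sort_keys makes equivalent dicts produce the same key
            return cached(json.dumps([args, kwargs], sort_keys=True))
        return wrapper
    return decorator

calls = []

@freezing_cache()
def total(xs):
    calls.append(xs)
    return sum(xs)

print(total([1, 2, 3]), total([1, 2, 3]), len(calls))  # 6 6 1
```

It's not a general replacement (tuples come back as lists, and non-JSON types fail), which is exactly the kind of edge case that makes rolling your own sometimes worth it.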
Now if you open a large text file in something other than text mode, it could bring it to its knees depending on the mode. As an example, opening an XML file in the nXML mode is quite expensive, because nXML mode is powerful and utilizes your XML structure. I just tried a 12 MB XML file and told it to go to the end of the file. It's taking Emacs forever to do it (easily over 30s). But if I switch to text mode for that same file, it handles it just fine.
I just tried an 800 MB text file. It handled it fine.
The one thing where you can easily get in trouble: Long lines. Emacs cannot handle long lines well. Kinda sad.
As an example, I have anzu minor mode selected. So if I try to search in the 800MB file, it hangs until I cancel.
Ugh. That's just offensive.
It's really an array with a gap at the cursor location. Used by emacs and others for decades.
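A miniature version, for anyone who hasn't seen the trick (a sketch on a Python list; real gap buffers work on byte arrays and grow the gap in larger amortized chunks):

```python
class GapBuffer:
    def __init__(self, text="", gap=8):
        # buffer = left text + gap of unused cells + right text
        self.buf = list(text) + [None] * gap
        self.gap_start = len(text)
        self.gap_end = len(self.buf)

    def move_gap(self, pos):
        # moving the cursor shifts characters across the gap
        while self.gap_start > pos:
            self.gap_start -= 1
            self.gap_end -= 1
            self.buf[self.gap_end] = self.buf[self.gap_start]
        while self.gap_start < pos:
            self.buf[self.gap_start] = self.buf[self.gap_end]
            self.gap_start += 1
            self.gap_end += 1

    def insert(self, pos, s):
        self.move_gap(pos)
        for ch in s:
            if self.gap_start == self.gap_end:          # gap full: grow it
                self.buf[self.gap_start:self.gap_start] = [None] * 8
                self.gap_end += 8
            self.buf[self.gap_start] = ch
            self.gap_start += 1

    def text(self):
        return "".join(self.buf[:self.gap_start] + self.buf[self.gap_end:])

b = GapBuffer("hello world")
b.insert(5, ",")
print(b.text())  # hello, world
```

Typing at the cursor only fills the gap (O(1) per character); the O(n) shuffle happens only when the cursor jumps far, which matches how people actually edit.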
*or at least competitive with. I’ve measured it to be faster but I’ve heard others have had different experiences
- Build Your Own Text Editor
- Build Your Own Shell
- Build Your Own Git (!)
Comes complete with the Feynman quote "What I cannot create, I do not understand".
The repo has an adaptation of the quote, then. :)
Can we step away from the hyperbole here and just start saying (in this case) "Interesting project ideas" or somesuch?
This of course leads to people piping up with their own "must dos", like "write a compiler" (a huge undertaking).
Interestingly I see a comment here like "do something with actual customers" and the replies are interesting, essentially dismissing this as a business rather than technical problem.
I find this interesting because software exists to solve problems for people so this is probably the most useful advice I've seen. The ability to identify a pain point and use software to solve it is arguably the most useful ability a software engineer can have.
You will of course learn things by writing a compiler or a text editor or a ray tracer, and if it scratches an itch for you, by all means go for it.
I thought it was a valid point. 1000 books I should read? If I devoted all my free time, for the rest of my life, I might make it. But that leaves me exactly zero time left for the next person who's got some fine-sounding "should" for me, or the next, or the one after that. It's all my time for the rest of my life - for one person's "should".
I don't think it changes the problem to say "should" instead of "must". It's just an opinion. It doesn't create any more of a "should" for me than it does of a "must".
Writing a Pratt parser as part of this forced me to understand how it works. Figuring out how to process a sheet led me into algorithms and structures like directed acyclic graphs (as mentioned in the article). I found myself referencing Introduction to Algorithms and really studying it.
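For readers unfamiliar with the technique, here is a minimal Pratt parser for arithmetic (an illustration of the idea, not the code from the talk): each operator carries a binding power, and the parser loops, recursing with that power so precedence and associativity fall out naturally.

```python
import re

BINDING_POWER = {"+": 10, "-": 10, "*": 20, "/": 20}

def tokenize(src):
    return re.findall(r"\d+|[-+*/()]", src)

def parse(tokens, min_bp=0):
    """Parse tokens (a list consumed left to right) into a nested-tuple AST."""
    tok = tokens.pop(0)
    if tok == "(":
        left = parse(tokens, 0)
        tokens.pop(0)            # discard the matching ")"
    else:
        left = int(tok)
    # Keep folding in operators that bind more tightly than our caller's power.
    while tokens and tokens[0] in BINDING_POWER and BINDING_POWER[tokens[0]] > min_bp:
        op = tokens.pop(0)
        right = parse(tokens, BINDING_POWER[op])
        left = (op, left, right)
    return left
```

With `BINDING_POWER["*"]` higher than `BINDING_POWER["+"]`, `1+2*3` parses as `("+", 1, ("*", 2, 3))` with no grammar rewriting at all.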
In the end I turned it into a talk at Big Sky Dev Con in Montana. The whole thing was a good experience - from researching how to do it, to sticking it out through the implementation, to distilling it into a 45-minute talk. Be sure to check out the recording and code if you're interested.
Any of these suggestions will lead you down a rabbit hole of learning with a clear objective in sight to keep you motivated to dig deeper.
I'm really enjoying watching the recording of your talk, and in addition to learning one way to build a spreadsheet, I'm also learning lots of good software development practices orthogonal to the specific project, which is great.
Again, thank you for sharing!
After gaining traction in the market it was possible to raise from angels and then VCs.
I believe you can build and sell software from anywhere if you have the drive and find a way to solve problems that people are willing to pay money to solve.
Here's an article that talks more about it: https://missoulacurrent.com/business/2018/08/missoula-tech-s...
And then there's the whole mental model of relational algebra and stream processing of queries.
It'll give new appreciation for existing databases and what they can and can't do for you.
No offence if you enjoy this sort of thing though, of course. Just not my bag.
Give me someone who needs me to build something I don't already know how to build, though, and I will figure that shit out.
I've been lucky that this has worked out well for me so far, but it means I need to always try to get on projects with unfamiliar things or take new jobs involving unfamiliar things, or I'll have a really hard time expanding my skillset.
Oh, and school sucked.
It’s one of the most complex systems one can develop, and you end up learning about multiple areas, such as OSes, compilers, distributed systems, data structures, parallelism, etc.
AKA, a toy database might be enough to handle some simple storage/retrieval problems but be full of hidden O(n^2) or higher logic which would fall down hard with even fairly simple usage in the "real world".
Reminds me of my own text editor, written in Applesoft BASIC when I was in middle school. It worked for its intended purpose (editing small assembly files), but was really quite terrible all things considered. I remember it being quite slow to save/restore, and it was only really capable of editing files of a few hundred lines before it started breaking BASIC's memory allocation schemes. AKA, I didn't really learn any of the data structure finesse needed to implement a "real" text editor with line wrap/etc.
Worse, I remember trying to read the code a few years later, and while it fit on two printed pages, it was 100% unreadable.
(For those that don't know, Applesoft's speed was influenced by "formatting", if you will. It encouraged line number usage only really for control flow, plus the long list of CALL/PEEK/POKE magic numbers required a handy cheat sheet of what each address did.)
The same could probably be said about the internal combustion engine, but it might soon be replaced by electric batteries, which provide a much more elegant solution.
I believe that "unbundled" databases, such as Crux, can become the electric batteries of the database world by making a lot of the current complexity irrelevant.
After reading https://www.confluent.io/blog/turning-the-database-inside-ou... I wondered about that. I think that making the log first-class and "plugging in" relational tables (optionally) would make an amazing database engine. In short, you PERSIST your commands:
POST /City ..
PUT /City ..
DELETE /City ..
POST /SendMail (to:...)
and have the flexibility to bundle the domain logic on top of the data logic in a single language. This is my long-term goal.
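A minimal sketch of that log-first idea (hypothetical names; a plain list stands in for durable storage): every command is appended to the log, and the relational "table" is just a view derived by replaying it.

```python
# Toy log-first store: the append-only command log is the source of truth;
# any table is derived state, rebuildable from the log at any time.
log = []

def persist(command, payload):
    log.append((command, payload))   # real system: append to fsync'd storage

def materialize(entity):
    """Replay the log to derive the current state of one 'table'."""
    table = {}
    for command, payload in log:
        if payload.get("type") != entity:
            continue
        if command in ("POST", "PUT"):
            table[payload["id"]] = payload
        elif command == "DELETE":
            table.pop(payload["id"], None)
    return table
```

Because the log is primary, you can bolt on new derived views (search indexes, caches, mail queues) later and backfill them by replaying from the start.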
2) the separation of reads and writes allows for elegant horizontal read-scaling without coordination/consensus
3) pluggable storage backends implemented as simple Clojure protocols (as the sibling comment mentions), which eliminates a large number of performance and durability concerns
4) combining schema-on-read with entity-attribute-value indexing means there's no need to interpret and support a user-defined schema
5) Datalog is simpler to implement and use than the full SQL standard or alternative graph query languages
...I work on Crux :)
SQL certainly provides a lot of bells and whistles but Crux has the advantage of consistent in-process queries (i.e. the "database as a value") which means you can combine custom code with multiple queries efficiently to achieve a much larger range of possibilities, such as graph algorithms like bidirectional BFS.
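For reference, bidirectional BFS searches from both endpoints and stops when the frontiers meet, exploring roughly O(b^(d/2)) nodes instead of O(b^d). A generic sketch over an adjacency-dict graph (not Crux's implementation):

```python
from collections import deque

def bidirectional_bfs(graph, start, goal):
    """Shortest path length in an undirected graph (adjacency dict),
    expanding level by level from both ends until the frontiers meet."""
    if start == goal:
        return 0
    dist_s, dist_g = {start: 0}, {goal: 0}
    q_s, q_g = deque([start]), deque([goal])
    while q_s and q_g:
        # Expand the smaller frontier first to keep the two sides balanced.
        if len(q_s) <= len(q_g):
            q, dist, other = q_s, dist_s, dist_g
        else:
            q, dist, other = q_g, dist_g, dist_s
        for _ in range(len(q)):          # one full level of this frontier
            node = q.popleft()
            for nb in graph.get(node, ()):
                if nb in other:          # frontiers met: combine both halves
                    return dist[node] + 1 + other[nb]
                if nb not in dist:
                    dist[nb] = dist[node] + 1
                    q.append(nb)
    return None  # no path
```

Note the graph must be undirected (or you need reverse edges) for the backward search to be valid.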
This is a rabbit hole that goes as deep as you want it to, which has both positives and negatives, of course.
It is certainly challenging.
Just look at joins. You have (at least) two nested-loop join algorithms, then sort-merge and hash joins, and then you have cross, left, right, inner, and outer joins. All of them with small, subtle tricks to make them performant. (In theory you can build everything on top of CROSS. But that becomes very wasteful very fast!)
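To make the contrast concrete, here are the two simplest inner-join strategies over lists of dicts (illustrative code, not from any real engine): the nested loop compares every pair of rows, while the hash join builds an index on one side and probes it.

```python
def nested_loop_join(left, right, key):
    """O(len(left) * len(right)): compare every pair of rows."""
    return [{**l, **r} for l in left for r in right if l[key] == r[key]]

def hash_join(left, right, key):
    """Roughly O(len(left) + len(right)): index one side, probe with the other."""
    index = {}
    for r in right:
        index.setdefault(r[key], []).append(r)   # build phase
    return [{**l, **r} for l in left for r in index.get(l[key], [])]  # probe phase
```

Both produce the same rows; the hash join just avoids the quadratic scan, which is why real engines prefer it whenever one side fits in memory.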
Though if you add some constraints, like requiring JDBC compatibility, then it becomes interesting.
You could also do worse than learn Erlang, Haskell, Julia, or some other interesting-but-not-really-mainstream language.
I'm surprised the author appears to be a millennial. I would have expected a list like this from someone my age, who started coding when TinyBASIC and Space Invaders were the new shiny.
Except that with a DSL, you have to worry about gnarly but not-so-interesting problems, like: what happens to my syntax (given by a CFG) when I add another syntax (another CFG)? Are the combined CFGs ambiguous, etc.?
Keep things simple!
* Make a Lisp https://github.com/kanaka/mal — tutorial on how to make a lisp language interpreter.
* Make an 8 bit computer https://eater.net/8bit — tutorial on making a simple 8 bit computer out of a bunch of AND, OR, NAND chips. Really helps you understand how computers work, what microcode is, etc. Honestly, building it wasn't quite as much fun as I thought it would be, but just learning how to build one was very useful.
Then at work I become an ambitious user of a relatively immature open source platform. As a power user, I finally had real motivation to fix specific defects, and a good enough mental map to get it done. It came naturally.
* Non-trivial application with a very restricted amount of memory and crappy networking that has to be very reliable. I did an entire payment POS application (EMV, contactless) on a very ancient device. It had a total of 2MB of memory (SRAM + flash; it had execute-in-place). I needed to do a lot of research to optimize the application for memory usage -- for example, I had to implement a transactional database that worked within a constant amount of memory. I also needed to research a lot of techniques to make my application reliable regardless of what happened. I learned A LOT on that project.
How would these clearly niche projects be something "every programmer" should try? When in enterprise engineering is 2MB a normal amount of memory for something to need? I've worked in video games, fintech, and e-commerce, on many kinds of systems, and never run into anything with such requirements.
Not to mention, the sorts of "lessons learned" on such projects could actually hurt an engineer when working on more realistic enterprise systems. At my current job (where I manage many teams of engineers), if I saw engineers micro-optimizing to save 2 MB of memory instead of getting features shipped I would ask their tech-lead wtf was going on.
I've seen far more damage done by premature optimization in my career than I've seen it actually help matters. Engineers trying to be "extra smart" and show off their extreme memory/CPU optimization skills so often leads to wasted time, hard-to-debug code, or worse, annoying bugs which are much more difficult to diagnose later.
This assumes you will understand where to use and when not to use those techniques.
The list gives examples of projects each with unique requirements. Each requiring you to think about the problem in a different way.
I just gave two more examples that I personally found very illuminating, that also have unique requirements, different from other projects on the list.
I may be working on some more mundane software atm, but, for example, I know how specifying the size of every buffer and data structure matters for the reliability of the service I am working on.
In fact, the service had a bad track record of reliability, which was quickly fixed by specifying what it means to do stream processing (no unbounded data structures in memory, etc.) and quickly verifying each component against this spec.
I learned this and many other techniques from MISRA, which is a standard to help write reliable software for the automotive industry. I adopted it for my embedded app to help me work on a complex app that had to be very reliable.
I’m not sure if I’m just thick or something, but any project that involves low-ish level graphics I shy away from because I’ve had so much trouble in the past.
openGL is only a specification. and there are many openGL versions. and drivers have different implementations. and modern openGL uses shaders. which use another language, GLSL, which also has different versions. and openGL needs a context to draw on. each OS has different display servers to create those contexts and windows. here we also have to differentiate protocols and actual implementations. and then you might want to write openGL in a certain language different from C/C++. now choose a library or two or five, where each one will deal with one or two or twenty of those issues, plus other things like keyboard and mouse input, etc.
so, either you follow config instructions very closely and repeat until you find one that works, or you try to start understanding all this, get very annoyed and throw your computer through the real window.
PyGame seems to work if you install the official Python 3 binary from python.org and install PyGame with the pip package manager included with the Python installation. (Last time I checked, Conda Python and the version from Homebrew had some problems installing PyGame.)
Used it for my first graphics related programming project. Was lots of fun and I learned a lot.
Eventually I ended up learning via the HTML5 canvas element and WebGL, which gets rid of the problem of having a different setup to the author's.
The factory pattern is about adding logic and state to the creation of objects. What's wrong with creating enemies using their usual constructors and dumping them into a resizable array?
Object pooling is a good point actually.
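For reference, a minimal object pool looks something like this (hypothetical `Pool`/`Enemy` names): dead objects go onto a free list and get reused instead of being reallocated every frame.

```python
class Pool:
    """Reuse objects instead of reallocating them each frame."""

    def __init__(self, factory):
        self.factory = factory
        self.free = []           # dead objects waiting to be recycled

    def acquire(self):
        # Recycle a dead object if one exists, otherwise construct a new one.
        return self.free.pop() if self.free else self.factory()

    def release(self, obj):
        self.free.append(obj)

class Enemy:
    def __init__(self):
        self.hp = 100

pool = Pool(Enemy)
e1 = pool.acquire()      # freshly constructed
pool.release(e1)         # "dies", goes back on the free list
e2 = pool.acquire()      # the very same object, recycled
```

In a real game you would also reset the object's state on acquire; the point of the pattern is avoiding allocation churn (and, in managed languages, GC pauses) in the hot loop.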
Implement a basic widget toolkit. Making something that essentially works is simple (regardless of whether you implement it on top of a dumb framebuffer, an existing display server like X11/WinAPI, or even on top of HTML). But there are lots of nuanced details that you will probably learn along the way. Some of them overlap with the text editor problem; then there are things like not opening windows/popup menus outside the display area (there are widely used toolkits that get this wrong even for the simplest "select box at the bottom of the screen" case), clipping, handling scrolling (both detecting that something is too large and should be scrollable, and actually implementing scrolling efficiently), and last but not least designing the API to be both powerful and simple to use.
Simple transactional database that really has ACID semantics: locking, B-trees, journaling, isolation levels, how to actually store all that in files.
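The journaling piece of that can be sketched with a toy write-ahead log (purely illustrative; lists and dicts stand in for fsync'd files): intent is recorded in the journal before the main store is touched, so committed transactions can be replayed after a crash.

```python
class WAL:
    """Toy write-ahead log: the journal is written before the data store,
    so committed transactions survive a crash mid-apply."""

    def __init__(self):
        self.journal = []   # stands in for an fsync'd append-only file
        self.store = {}     # stands in for the main data file

    def commit(self, writes):
        # 1. Durably record intent first...
        self.journal.append(("commit", dict(writes)))
        # 2. ...then apply to the store. A crash between 1 and 2 is safe,
        #    because recovery replays the journal.
        self.store.update(writes)

    def recover(self):
        """Rebuild the store by replaying every committed transaction."""
        self.store = {}
        for tag, writes in self.journal:
            if tag == "commit":
                self.store.update(writes)
        return self.store
```

A real implementation would also fsync the journal, checksum entries, and truncate the log at checkpoints, but the ordering invariant above is the heart of it.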
Toy Unix block filesystem (there can be some overlap with the previous point, and for journaling design it might be better to start with the FS case). Recommended reading: https://news.ycombinator.com/item?id=12309686
And, which is maybe somehow motivated by doing this mostly pointless thing as a long-abandoned pet project, and thus maybe slightly nonsensical: implement an AFS-like "distributed" filesystem. The interesting part lies in the RPC and authentication mechanisms, not in doing some kind of distributed consensus (that's why "AFS-like", as AFS has a single master server for each volume).
And at last, although the theme is somewhat different: implement a tool that, given a path to a Unix file, prints a list of users who have access to the file and what kind of access they have. This sounds like a simple tool, but it is surprisingly non-trivial even without taking ACLs into account. (This is not from me; I read it in some similar list ~15 years ago, IIRC by Andrew Tridgell.)
- A service that can take real-time timestamped data and generate an interval-based time series. For example, returning the first, last, high, low, and average value at fifteen-second intervals.
- A mechanism for serializing structured data. Something like Google Protobuf.
- A chart widget optimized to display and navigate date based content.
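The first service idea above — bucketing timestamped values into fifteen-second first/last/high/low/average summaries — might be sketched like this (illustrative function and names):

```python
def ohlc(samples, interval=15):
    """Aggregate (timestamp_seconds, value) pairs into per-interval
    first/last/high/low/average summaries."""
    buckets = {}
    for ts, value in sorted(samples):        # sort so first/last are by time
        start = ts - ts % interval           # start of the interval this falls in
        buckets.setdefault(start, []).append(value)
    return {
        start: {
            "first": vals[0],
            "last": vals[-1],
            "high": max(vals),
            "low": min(vals),
            "avg": sum(vals) / len(vals),
        }
        for start, vals in buckets.items()
    }
```

A streaming version would update each bucket incrementally instead of keeping all samples, which is where the design gets interesting.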
1. Design a new programming language. Prove it Turing complete (or go back to the design phase). Maybe do a compiler for it.
2. Design some hardware. There are cheap FPGAs and nice walkthroughs for architectures like the 6502, but my next project will probably be a custom architecture because I've never done a proper compiler before.
4. Break your language paradigms! Learn Lisp, APL, Prolog, Haskell... This is a meta-challenge: do another challenging project in an unfamiliar language, and get to know powerful idioms in the language and how they're handled behind the scenes.
To be honest, the very first professional project I ever did (in 1987) was a complete embedded OS.
In 1983, I wrote a Space Invaders game on my VIC-20, in Machine Code. In those days, you had to use characters to represent aliens, so I made up my own font.
I've done a number of text editors over the last 30-some years. Nowadays, most operating system frameworks have 99% of a full-featured text editor built in, and you can do it with a few calls to system resources.
I haven't written any real compilers (haven't ever felt the need), but I have done a ton of parsers and whatnot.
In AppKit, you can do a pretty full-featured text editor as a "Hello World" project.
This is just not true, but worse still it confuses people who are just starting. You can be an exceptional front end developer with HTML/CSS/JS scripts and not necessarily have a mini OS in your past projects.
Also it is kind of sad when a fellow developer is stuck with a problem that can easily be debugged with some OS knowledge.
Currently the only two I have left to do on the list are the Basic compiler and the spreadsheet. I plan to do the compiler quite soon, once the emulator IDE is nice enough. If I decide to do a Basic without line numbers I'll probably be doing another text editor as well.
Not sure if I could bring myself to write a spreadsheet. I feel queasy just looking at them.
It’s relatively straightforward and the satisfaction of seeing even a basic shaded sphere that you literally created from scratch in code by writing pixels to a bitmap file is awesome. I did this about a year ago and it was still one of the most fun/educational projects I’ve ever done.
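For a taste of how little code that takes, here is a sketch of a single diffuse-shaded sphere, rendered via ray-sphere intersection and written out as a plain-text PGM image (all names are illustrative):

```python
def ray_sphere(origin, direction, center, radius):
    """Distance along the ray to the nearest sphere hit, or None on a miss."""
    oc = [o - c for o, c in zip(origin, center)]
    a = sum(d * d for d in direction)
    b = 2 * sum(d, * (), ) if False else 2 * sum(d * o for d, o in zip(direction, oc))
    c = sum(o * o for o in oc) - radius * radius
    disc = b * b - 4 * a * c
    if disc < 0:
        return None
    t = (-b - disc ** 0.5) / (2 * a)
    return t if t > 0 else None

def render(width=64, height=64):
    center, radius, light = (0.0, 0.0, -3.0), 1.0, (1.0, 1.0, -1.0)
    rows = []
    for y in range(height):
        row = []
        for x in range(width):
            # Camera at the origin, image plane at z = -1.
            d = (2 * x / width - 1, 1 - 2 * y / height, -1.0)
            t = ray_sphere((0, 0, 0), d, center, radius)
            if t is None:
                row.append(0)                      # background: black
            else:
                hit = tuple(t * di for di in d)
                n = tuple((h - c) / radius for h, c in zip(hit, center))
                l = tuple(li - h for li, h in zip(light, hit))
                norm = sum(li * li for li in l) ** 0.5
                lam = max(0.0, sum(ni * li / norm for ni, li in zip(n, l)))
                row.append(int(255 * lam))         # Lambertian shading
        rows.append(row)
    return rows

def write_pgm(path, rows):
    with open(path, "w") as f:
        f.write(f"P2\n{len(rows[0])} {len(rows)}\n255\n")   # plain grayscale PGM
        for row in rows:
            f.write(" ".join(map(str, row)) + "\n")
```

Opening the resulting file in any image viewer shows a shaded sphere on a black background; adding reflections and shadows from there is the classic rabbit hole.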
Or just invent your own private Emacs.
Of course, one's motivations may be as diverse as their perceived rewards. Still, solving problems which need a solution always has room for programming/design challenges, while working with others lets one connect with and learn from the collective knowledge.
Eager minds with free time are precious!
For me, though, working on my day job and my "for fun" side project take enough of my time and brainpower. I never lack for problems to solve.
I can come up with decent solutions to problems, but the best way to learn is to be able to see alternatives you hadn't thought of so you learn something new.