
How big is a Million Lines of Code? - getp
http://www.embedded.com/columns/technicalinsights/205604461?printable=true
======
DaniFong
I read somewhere that the blockbuster title 'Gears of War' had only about half
a million lines of C++ code: ~250,000 specific to the game, and ~250,000 for
the entire Unreal 3 engine.

I was surprised. Steve Yegge seems to be pretty well known as a programmer, or
at least, as a programming blogger, but his 'modest' web game was more than a
million lines of Java.

Apparently, if you're disciplined enough (they coded in layers, much of which
was in a strictly functional style) you can do a lot with a little.

~~~
palish
Ah.. Gears of War is also a single player game. Steve's game is closer to an
MMO. Once functionality needs to happen across 60 different computers in a
synchronized fashion, code can get pretty complicated. And sometimes a
multiplayer setting can actually force you to write functionality that only
makes sense in a multiplayer setting.

For example, Steve's game levels are tile-based, but I doubt he restarts the
server whenever he makes a small change to a level. That means the level needs
to be editable at runtime, which means changes need to be streamed dynamically
to all clients that might be watching. The changes also need to be serialized
to the database periodically, so that they aren't lost when the server _does_
reset. And you need code to make sure two players' views of that level don't
get out of sync if they are both editing it at the same time. That tileset
loader code you wrote now needs to be able to support additional tiles at
runtime, which means you need a collision tree that can be rebalanced quickly
without hogging valuable server resources. And what happens if a player was
standing in a valid tile, but you just removed that tile? You get the idea.
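
To make the runtime-editing point concrete, here is a rough C++ sketch of the bookkeeping involved. None of this is from Steve's game or from the game at my work: `TileLevel`, `TileEdit`, and the "tile id 0 means no tile" convention are made-up names and assumptions. It only shows one edit being applied, queued for the next save, and pushed to watchers, and it assumes a single thread applies edits in order.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical edit record; tile id 0 means "no tile here" (an assumption).
struct TileEdit {
    int x;
    int y;
    std::uint8_t tile;
};

class TileLevel {
public:
    using Listener = std::function<void(const TileEdit&)>;

    // Every cell starts out holding tile id 1.
    TileLevel(int width, int height)
        : width_(width), height_(height),
          tiles_(static_cast<std::size_t>(width) * static_cast<std::size_t>(height), 1) {}

    // Register a watcher, e.g. the code that streams edits to one client.
    void subscribe(Listener listener) { listeners_.push_back(std::move(listener)); }

    // Apply one edit while the server keeps running. Returns false if out of bounds.
    bool apply(const TileEdit& edit) {
        if (!inBounds(edit.x, edit.y)) return false;
        tiles_[index(edit.x, edit.y)] = edit.tile;
        pending_.push_back(edit);                             // remember it for the next save
        for (const auto& notify : listeners_) notify(edit);   // tell everyone watching
        return true;
    }

    // True if the cell is inside the level and currently holds a tile.
    bool hasTile(int x, int y) const {
        return inBounds(x, y) && tiles_[index(x, y)] != 0;
    }

    // Hand unsaved edits to the caller (say, a periodic database writer)
    // and clear the queue.
    std::vector<TileEdit> drainPending() {
        std::vector<TileEdit> out;
        out.swap(pending_);
        return out;
    }

private:
    bool inBounds(int x, int y) const {
        return x >= 0 && y >= 0 && x < width_ && y < height_;
    }
    std::size_t index(int x, int y) const {
        return static_cast<std::size_t>(y) * static_cast<std::size_t>(width_) +
               static_cast<std::size_t>(x);
    }

    int width_;
    int height_;
    std::vector<std::uint8_t> tiles_;
    std::vector<TileEdit> pending_;
    std::vector<Listener> listeners_;
};
```

A caller could check `hasTile` at each player's position after every edit and move anyone left standing on a removed tile. Conflict resolution between simultaneous editors and the collision tree are left out entirely, and those are where most of the real code would go.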

My employer's game is easily over a million lines of code at this point, and
the client alone is about 700,000. Not only that, but the client is completely
recompiled for every change - so it takes about 10 minutes from the time
you've made a change until you've tested it. People avoid compiling the engine
like a genetically-engineered flesh-eating virus, so you might be coding for
hours before you test all your changes. And since debugging requires changing
code, each mistake costs you another 10 minutes. The whole process is like
putting together a 10,000-piece puzzle upside-down.

Some bits of advice for large C++ projects:

* You need to have split your functionality into about 20 different DLLs _before_ you get to the point of being an oh-god-why-is-it-so-massive project.

* You must write most of your objects using the interface pattern (a pure virtual interface in the header file, and your implementation exclusively in the source file). No private function or variable declarations should be written in the header at all. This facilitates extremely fast compile times. Here's an example of what I mean: <http://dl.getdropbox.com/u/315/example.cpp>

* Sometimes you need to ignore what senior programmers think is best to get your goals accomplished. Remember, they got you into this situation in the first place.
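
For anyone who hasn't seen the interface pattern from the second bullet, here is a minimal sketch of the idea. It is not the contents of the linked example.cpp; `IInventory`, `Inventory`, and `makeInventory` are hypothetical names, and the two "files" are shown in one block only so it compiles as a single unit.

```cpp
// ---- inventory.h (hypothetical header): pure virtual interface only ----
#include <memory>
#include <string>

class IInventory {
public:
    virtual ~IInventory() = default;
    virtual void add(const std::string& item) = 0;
    virtual int count() const = 0;
};

// Factory function: the only way client code gets hold of an implementation.
std::unique_ptr<IInventory> makeInventory();

// ---- inventory.cpp (hypothetical source file): everything private lives here ----
#include <vector>

namespace {

class Inventory final : public IInventory {
public:
    void add(const std::string& item) override { items_.push_back(item); }
    int count() const override { return static_cast<int>(items_.size()); }

private:
    // Private state never appears in the header, so changing it
    // does not force files that include inventory.h to recompile.
    std::vector<std::string> items_;
};

}  // namespace

std::unique_ptr<IInventory> makeInventory() {
    return std::make_unique<Inventory>();
}
```

Client code includes only the header and calls `makeInventory()`, so edits to `Inventory` touch one translation unit. The trade-off is a virtual call per method and a heap allocation per object, which may or may not matter for a given class.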

~~~
cperciva
_You need to have split your functionality into about 20 different DLLs before
you get to the point of being an oh-god-why-is-it-so-massive project._

How does separating out code into DLLs help? Linking is fast.

 _You must write most of your objects using the interface pattern (a pure
virtual interface in the header file, and your implementation exclusively in
the source file)._

This is something which you should always do -- not just for large C++
projects. Separating interfaces from implementations is ALWAYS good design.

~~~
palish
_Linking is fast._

Unfortunately, no. The linking step at work is the most time-consuming. Visual
Studio takes four minutes (and growing) to link the client, no matter how
small the change.

By the way, do you have any samples of C code from your project I could read?
I'm writing my game engine in C, so I try to learn as many different C styles
as I can.

------
Xichekolas
I enjoyed the following at the end of the article (with context added by me):

 _Think about that last analogy: A million [lines of code costs a million]
times the cost of the flash chips [it is stored on]. Yet accounting screams
over each added penny in recurring costs, while chanting the dual mantras
"software is free," and "hey, it's only a software change."_

I've worked at a place like that before. They would pay a full team of
developers for several months to roll a custom solution to something we
could have bought off the shelf for $1000. People just don't see software as
expensive to develop... and unless it's open source, it is.

------
aswanson
They need to come up with a new metric. LOC is so wrong for so many reasons.
Running tested features seems like a good way to measure project progress but
I don't know how it would relate to overall project complexity.
[http://www.xprogramming.com/Blog/Page.aspx?display=RunningTe...](http://www.xprogramming.com/Blog/Page.aspx?display=RunningTestedFeatures)

~~~
dfranke
LOC is a terrible measure of productivity, but as a long-term measure of work
it's not so bad.

------
joeguilmette
A million lines of code? About one 50th of Vista.

------
tlrobinson
In most cases: too big.

