
Ask HN: How to learn web dev for real? - Murkin
Hello everyone

After reading all I could find on Python/Django/Ajax/JS/etc, I find myself able to write simple sites (blogs/etc). But I still don't know how to do a real project:

- Methods and tools for updating/patching a working server (with thousands of users)
- Creating scalable sites
- Setting up a testing environment

And much more.

Can anyone recommend articles/books on programming that go beyond the "blog app"?
======
nethergoat
<http://highscalability.com/> is a great resource for building scalable sites
- it has consolidated architectural details (culled from presentations, blog
posts, interviews, and more) for many large-scale sites (YouTube, Facebook,
Amazon, etc).

As for testing environments, take a look into the principles of continuous
integration (many resources available) and continuous deployment (especially
Eric Ries's blog: <http://www.startuplessonslearned.com/>). Some tools
commonly used to this effect: Git/SVN for source control; Hudson/Cruise
Control/Team City for builds; Selenium for testing.

Finally, applying patches and updates to a server falls into the domain of
systems administration, so I'd suggest looking for articles on
patching/updating high-availability sites. Note, however, that cloud computing
is a huge game-changer here. Whereas at many enterprises, applying patches is
an arduous (and tedious) quarterly effort culminating in a late-night
maintenance window with many hands on deck (all crossing their fingers),
clouds afford their adopters the ability to simply fire up new, updated
servers alongside the old ones, allowing them to be thoroughly tested in place
before any traffic is redirected.
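
The "new servers alongside the old ones" approach is often called a blue-green deployment. A minimal sketch of the cut-over step in Python, assuming a hypothetical `Router` class that stands in for whatever load balancer or DNS entry actually directs traffic, and a caller-supplied `is_healthy` check (both names are illustrative, not from any real tool):

```python
from typing import Callable, Iterable, List


class Router:
    """Hypothetical stand-in for a load balancer: holds the live server pool."""

    def __init__(self, live: Iterable[str]) -> None:
        self.live: List[str] = list(live)


def cut_over(router: Router, candidates: Iterable[str],
             is_healthy: Callable[[str], bool]) -> bool:
    """Point traffic at the new pool only if every new server passes its check.

    Returns True when traffic was switched, False when the old pool was kept.
    """
    new_pool = list(candidates)
    if not new_pool or not all(is_healthy(server) for server in new_pool):
        return False  # old servers keep serving; nothing changed
    router.live = new_pool  # one assignment, so the switch is all-or-nothing
    return True
```

Because the old pool is untouched until every check passes, a failed update costs nothing more than the discarded new servers.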

-Mike (@mikebabineau)

~~~
Murkin
Thanks, exactly the sort of resources I was looking for.

Perhaps I failed to introduce my questions correctly. I have been a system
admin for years on military server farms (100+ servers) and have been an
embedded developer for a few years more.

Currently I am expanding my area to the web. After writing a few basic web sites
that have all the vanilla stuff (Users/Messaging/Pretty JS&AJAX&JQuery stuff),
I am trying to learn how to build large-scale sites.

* High scalability [more than one server, both web & db]
* Deployment [how to upgrade/patch software on a working site with thousands of existing users with minimal disruption]
* Testing [both automatic and non-automatic, including different web browsers and OSes]
* Any other thing in this category

Thanks for your info !

~~~
mncaudill
It seems you know how to program and how the web works, but want to know how
it all fits together (testing, deployment, scaling etc). I think you will find
Jacob Kaplan-Moss's presentation on all the pieces of building a web
application useful. It is geared towards Django, but it should give a good overview
of what goes into a webapp. And if you are using Django or Python, all the
better!

<http://www.slideshare.net/jacobian/django-in-the-real-world>

------
dan_sim
Hmmm... I'd say that you'd better start coding and stop reading. It will be a
mess at first but you have to gain some experience.

I don't want to discourage you but you probably won't have thousands of users
on your apps at the same time at the beginning so don't bother with creating
scalable sites yet. One day your app will get slow, and that is the moment
you'll learn it.

------
Dmunro
Having recently started as a "professional" developer, coming from the realm
of the hobbyist, I can say that the best thing you can do is get a job.

Start with an internship if you are not yet confident in your skills. The best
way to hone your abilities is to set up a situation where you get unexpected
problems from code you did not write or an environment you aren't entirely
familiar with, and fixing it with the mindset that failure is not an option.

It also helps to get paid while you're doing it. Really. It shows that someone
has put an incredible trust and investment into you, and they value your
intellectual output.

Start a personal website to host your projects, make it available to the
public. Take a look at the tools you use. Try submitting code contributions to
the open source projects. Start a blog and document your hardships and
discoveries for other budding hackers.

Hopefully this advice helps you some, best of luck!

------
marram
I would suggest starting with Google App Engine (the Python version). For the
following reasons:

1\. Python is a good language to learn. You can use it for other things besides
building a web app. GAE is similar to Django.

2\. The GAE datastore is easier to fathom than a traditional database. For
some reason, I always found databases hard to understand, and it seems that
designing schemas for non trivial projects is a separate and distinct skill
set from software engineering.

3\. GAE solves the updating/patching of working code by providing a sane
environment and tools to update and version your app. You can easily switch
back to a previous version. Patching datastore schemas is slightly harder.

4\. You get scalability out of the box. You don't have to worry about
hardware, or even configuring "virtual" instances. None of that. But you
probably won't have to worry about scalability for a while.

5\. You can easily deploy code to multiple environments. You just need to
signup for more apps, and update your app.yaml file to point to a new app
name. You can do this in a simple build script.

I hope this helps.

Cheers.

~~~
blasdel
Because of GAE's odd architecture, a lot of things that you could get away
with elsewhere (especially at non-mega-scale) will give you obvious latency
hits.

The upside is that the issues are easy to find, and your code doesn't get any
slower if you jack up the load!

It leads you to think about scaling in a most pleasant way.

------
jrockway
As the author of a web framework book, I have to say that the example
applications are not actually much less complicated than real applications in
any significant way. The real applications are often bigger, but it's just
more of the same stuff. The bulk of your application is going to be a web-
independent model (read any software engineering book for advice in developing
this). Then you add some thin glue to make that model look nice over HTTP.
(This includes things like REST APIs or AJAX or whatever.)

That's it. The glue to the book's Address Book example (or whatever) is not
going to be any different than the glue of a complex app. All the good stuff
in your program happens outside the domain of the web.
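
To make the model/glue split concrete, here is a minimal sketch using only the Python standard library. The `AddressBook` class and `make_app` function are illustrative names made up for this example, not code from any book or framework:

```python
import json
from typing import Dict, List


class AddressBook:
    """Web-independent model: plain Python, no HTTP anywhere."""

    def __init__(self) -> None:
        self._entries: Dict[str, str] = {}

    def add(self, name: str, email: str) -> None:
        if "@" not in email:
            raise ValueError("invalid email: %r" % email)
        self._entries[name] = email

    def names(self) -> List[str]:
        return sorted(self._entries)


def make_app(book: AddressBook):
    """Thin WSGI glue: translate one HTTP request into one model call."""

    def app(environ, start_response):
        if environ.get("PATH_INFO") == "/names":
            status = "200 OK"
            body = json.dumps(book.names()).encode("utf-8")
        else:
            status = "404 Not Found"
            body = json.dumps({"error": "not found"}).encode("utf-8")
        start_response(status, [("Content-Type", "application/json"),
                                ("Content-Length", str(len(body)))])
        return [body]

    return app
```

The model can be tested and reused without a web server; to serve it, something like `wsgiref.simple_server.make_server("", 8000, make_app(book)).serve_forever()` would do for local experiments.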

------
bbq
It's been hinted at in dan_sim's post, but if you have a "real project" in
mind just go do it. Learn what you're missing when you need to. You probably
know more than you think.

~~~
DanielStraight
Also, in my experience, I haven't been able to learn anything "for real"
without using it on a real project.

~~~
Murkin
I have already created a few small projects (the biggest being 2 weeks of
work).

But there is nothing to show me if I correctly handle my prod-server updates
(Now I upload, move, restart)

My testing is very basic.

And there is a general feeling of lack of infrastructure.

I would love to learn how to do a few-man-year project (from infrastructure's
point of view)

~~~
Nekojoe
A real project is an excellent way to pick up how to do web development.

Have you got any hobbies or interests? Why not create a website dedicated to
your hobby/interest? That way it'll be something that you'll be able to run
over the course of a few years.

Start with a small website, and slowly build it up. Add more features, then
tweak them. Find out why it's loading slowly. Read a few books on websites that
scale too.

------
peterhi
Well there is nothing stopping you from getting a small host somewhere and
putting up a site of some sort. This can then be used to practice the skills you will
require to manage a host. Make sure that the site is actually being used for
something so that you will need to maintain it and do upgrades.

Then you only have to learn how to...

1) Install and configure a machine
2) Configure a web server
3) Set up ssh access securely
4) Set up your database securely
5) Do backups (and as a test nuke the box and rebuild it from backups)
6) Automate the deployment of your application
7) Keep the machine up to date with security patches
8) Keep the hackers at bay
9) Monitor the general health of the machine and services
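
Item 5 is the one most often skipped: a backup only counts once a restore has been tried. A minimal sketch of that check with Python's standard library, assuming the site lives in one directory (the function names are made up for illustration, and a real backup would also need a database dump):

```python
import filecmp
import tarfile
import tempfile
from pathlib import Path


def back_up(source: Path, archive: Path) -> None:
    """Write a gzip-compressed tar archive of the `source` directory."""
    with tarfile.open(archive, "w:gz") as tar:
        tar.add(source, arcname=source.name)


def _same_tree(diff: filecmp.dircmp) -> bool:
    """True when two directory trees match, checked recursively."""
    if diff.left_only or diff.right_only or diff.diff_files or diff.funny_files:
        return False
    return all(_same_tree(sub) for sub in diff.subdirs.values())


def restore_matches(source: Path, archive: Path) -> bool:
    """Restore into a scratch directory and compare it with the original."""
    with tempfile.TemporaryDirectory() as scratch:
        with tarfile.open(archive, "r:gz") as tar:
            tar.extractall(scratch)  # our own archive, so assumed trustworthy
        return _same_tree(filecmp.dircmp(source, Path(scratch) / source.name))
```

Running `restore_matches` right after `back_up` is the cheap version of "nuke the box and rebuild it"; doing the rebuild for real on a fresh machine is still the stronger test.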

Basically you will start to learn the art of the sysadmin. It is a useful
skill if you end up at a small company without a dedicated sysadmin, and even
if you have one you will have a better idea as to what is required to develop
and host a site. Empathy with the sysadmin is always a good thing.

You will learn a lot of new skills, most of which do not involve coding. I
have a little hobby site that hosts fan art for a web comic, and I have learnt
much about automatic deployment by keeping the site up with minimal
disruption. I have learnt much more about the running of databases (as opposed
to just writing sql) and loads about unix security (users, groups,
permissions, jails etc) that programmers just tend to ignore.

All for around $9 a month. Money well spent if you ask me.

~~~
dan_sim
I would suggest <http://prgmr.com/xen/> for $6 a month.

~~~
Murkin
I have a few sites on Webfaction already.

While I figured out how to do most things you mentioned, I feel like I am
reinventing the wheel with most of them.

Surely there are advanced, open-source deployment / backup / version /etc
systems in use in big projects.

Thus I am looking for guidance on where to start learning that part of the
web-development process

~~~
peterwwillis
You will be reinventing the wheel; that's part of working with open source on
big sites. There just isn't one open-source deployment/backup/versioning/etc
system out there. Nearly every large deployment/backup/versioning/etc system
I've seen has been custom. Most of these details will be specific to the group
you're working with and the project's requirements.

Don't worry so much about getting the process right or finding a pre-packaged
solution. The most common solutions will be the same old tools re-worked and
customized to get the job done on that site. I'd say the best thing you can do
is put your fingers into every single open-source pie you can and get a feel
for it all. The rest is a natural evolution of developing for your target
platform.

------
moron4hire
"just do." I know that sounds rather cliche, but it's something that I've
found a lot of beginners are missing. They spend a lot of time trying to find
the "best" way to do something before they start doing it. Really, just do
something. You'll learn from the experience either way. You'll probably learn
more than if you _had_ found the maximal solution first.

I probably learned more about programming by hacking together a simple role
playing game in Javascript+DOM in 2001 than my next 3 years of college and 5
years of work. That 8 year old code still worked up to the point GeoCities
died, and then it was gone permanently. Not a big loss, but it was nice to see
that it survived a myriad of browser updates. I learned tons about what made
good, readable code just from having to live with it. I learned a lot of the
art of optimization because of how much Javascript sucked at the time. But
more importantly, I learned that it was more important to get-it-done than to
be working in what was popular (you would not believe the flak I caught for
not using C for it, even though I was also learning C as well and just wanted
to do something in Javascript).

~~~
wallflower
OT: Did you write your own cross-browser DHTML library or use one?

I did the same thing (considerable browser-based app in the pre jQuery dark
ages) and it was painful.

Have you checked the Geocities archive <http://reocities.com> to see if it's
there?

~~~
moron4hire
I don't think such things existed at the time. At any rate, the code was 100%
new and performed no browser checking, but still worked exactly the same on
all browsers. It took me a month to do what I could do in a day now, but it
was a great learning experience.

I've started a new project recently, maybe about 10 hours of work into it,
that is a chess game, 100% new Javascript, so far 100% cross browser
compatible (with no features planned that would break that), with no browser
checking. It's actually not that hard if you're smart about your debugging
(don't use alerts, use try/catch and write out to a div).

------
silentbicycle
I'm not a web developer, but I posted a similar question on Ask.Metafilter
last summer
(<http://ask.metafilter.com/124165/Web-development-big-picture-for-a-nonweb-programmer>)
and got some helpful suggestions.

For me, the most useful response in the long run has been delmoi's suggestion
to just scrutinize the HTTP 1.1 spec and write a webserver from the ground up.
(I'll release the webserver, an embedded coroutining/select-multiplexing
server written in Lua, when I release the project driving its development.) My
interest was specifically in using HTTP 1.1 / REST as a generic interface for
servers, rather than creating dynamic web sites per se, though.

Either way, it definitely helps to have a real project in mind.

------
roachsocal
For frontend stuff:

High Performance Web Sites: Essential Knowledge for Front-End Engineers (By
Steve Souders) <http://oreilly.com/catalog/9780596529307>

And for an intro to scalability:

Building Scalable Web Sites: Building, Scaling, and Optimizing the Next
Generation of Web Applications (By Cal Henderson)
<http://www.amazon.com/Building-Scalable-Web-Sites-Applications/dp/0596102356>

------
iamelgringo
Updating and patching a server running a web app without down time is an art.
And, there's not a ton of stuff out there about django deployment best
practices. Here's a few thoughts from a fellow developer:

Python has a decent way to isolate deployed apps using virtualenv. If you
google virtualenv django, you should find some interesting reading.

For automated python deployment on unix boxen, fabric seems to be the state of
the art. You might also want to take a look at Capistrano on the Ruby on Rails
side. You'll probably find a lot more interesting reading about deployment
tips and tricks by reading stuff from ruby/rails land. (Don't flame me) The
Rails guys have historically had more problems with speed and scaling rails,
so there's a lot more information out there. They've also built some really
cool tools to deal with those problems.
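
One idea worth borrowing from Capistrano is its directory layout: every deploy goes into its own directory under `releases/`, and a `current` symlink points at the live one. Here is a minimal standard-library sketch of the switch step. It is not Fabric's or Capistrano's API; the `activate` function is made up for illustration, and it assumes a POSIX filesystem:

```python
import os
import tempfile
from pathlib import Path


def activate(releases: Path, release_name: str, current: Path) -> None:
    """Atomically repoint the `current` symlink at releases/<release_name>."""
    target = releases / release_name
    if not target.is_dir():
        raise FileNotFoundError(target)
    tmp_link = current.with_name(current.name + ".tmp")
    if tmp_link.is_symlink():
        tmp_link.unlink()  # leftover from an interrupted run
    os.symlink(target, tmp_link)
    # Renaming over the old link replaces it in one step on POSIX, so a
    # request never sees a half-updated `current`.
    os.replace(tmp_link, current)
```

Rolling back is the same call with the previous release name. The application server usually still needs a reload after the switch so that it picks up the new code.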

Django has some really cool caching features. I was just playing around with
caching and load testing <http://newsley.com> by throwing ghetto localmem
caching on top of my main view, I got an order of magnitude more pages served
per second. I didn't even try memcached as a backend yet, but I feel a lot
better knowing the basics of how to do that if the need arises. Regardless,
you'll want to spend a bit of time learning the caching tools and reading
about memcached if you're interested in scaling django.
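
For anyone who hasn't seen why caching a view helps so much, here is a minimal sketch of the idea in plain Python. This is not Django's cache framework (in a real project you would reach for Django's own tools, such as the `cache_page` decorator); `cache_for` is a made-up name for illustration:

```python
import functools
import time
from typing import Any, Callable, Dict, Tuple


def cache_for(seconds: float, clock: Callable[[], float] = time.monotonic):
    """Cache a function's result per argument tuple for `seconds`."""

    def decorator(view: Callable[..., Any]) -> Callable[..., Any]:
        store: Dict[Tuple[Any, ...], Tuple[float, Any]] = {}

        @functools.wraps(view)
        def wrapper(*args: Any) -> Any:
            now = clock()
            hit = store.get(args)
            if hit is not None and now - hit[0] < seconds:
                return hit[1]  # still fresh: skip the expensive work
            result = view(*args)
            store[args] = (now, result)
            return result

        return wrapper

    return decorator
```

Like local-memory caching, the `store` dict lives inside one process, so every server process keeps its own copy. A shared cache such as memcached removes that limitation, which is why it matters once there is more than one web server.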

A lot of scaling web apps is about scaling a database. It's routinely the
slowest part of your web app. You should be able to find a lot of reading
about that. The High Scalability blog is a great resource for that stuff.

Cal Henderson (of Flickr) wrote a great book called "Building Scalable Web
Sites" if I'm not mistaken.

~~~
Murkin
Excellent advice, thank you !

~~~
iamelgringo
Any time. Django has been improving their test suite recently, and their
documentation on testing has been improving a lot in the last couple of
months. There's also projects like Twill and Selenium for building automated
tests that I like a lot.

Another thought on practicing deployment. If you really want to learn how to
deploy to a multi-server environment, I'd move off of webfaction hosting and
onto something like EC2. EC2 is great for being able to spin up and then spin
down server instances at will. You could set up a DNS round robin pointing at
3 or 4 django servers which in turn point to a separate db server. Write the
build scripts to automate set up and take down of all the separate servers.
You could also practice setting up database replication on separate EC2 nodes
if you wanted to. And, you're only paying a couple of bucks a day to
practice setting up multiple servers.
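
Round robin itself is simple: each new request goes to the next server in the list, wrapping around at the end. A tiny Python sketch of that rotation (the `round_robin` helper is made up for illustration; real DNS round robin happens in the name server, not in application code):

```python
import itertools
from typing import Iterator, List


def round_robin(servers: List[str]) -> Iterator[str]:
    """Yield servers in rotation, the way round-robin DNS spreads requests."""
    if not servers:
        raise ValueError("need at least one server")
    return itertools.cycle(servers)
```

Note what the rotation does not do: it never checks whether a server is up, so a dead node keeps receiving its share of traffic until it is removed from the list. That is a good failure to provoke on purpose while practicing.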

You might also be interested in puppet:
<http://reductivelabs.com/trac/puppet/wiki/DocumentationStart>

------
flooha
If you are young and just starting, try to find a position with a web dev
company who uses the technologies you want to learn. Pick an agile company who
doesn't need a load of sign-offs to get something done. Don't worry about your
$/hr. Worry about what you can learn.

Basically I'm trying to get you to find a good mentor. You can learn more in 1
week from a good mentor than you can by reading PDFs and blog posts for two
years.

~~~
daniel-cussen
Basically I'm trying to get you to find a good mentor. You can learn more in 1
week from a good mentor than you can by reading PDFs and blog posts for two
years.

So very true. I got stuck with the latter, and there was a brief period in
there where I learned loads from a tutor. World of a difference.

------
cullenking
Create an idea that requires you to do several things for which you would like
to learn. This idea should ideally include offline processing of work, so that
you are forced to read and understand many issues with executing jobs outside
of the client/server request cycle (maybe this is image creation, video/audio
transcoding, log file analysis etc etc). Then make sure your project can
leverage some sort of key/value storage system like redis or memcached, or
maybe a document store like mongodb or couchdb. Finally, make it something
that people like so you'll get enough traffic to need zero downtime
deployments.
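
The core of "offline processing" is that the request handler only enqueues the slow work and answers immediately, while a separate worker does the work later. A minimal in-process sketch with Python's standard library; the function names are made up for illustration, and a real setup would more likely use separate worker processes and a persistent queue:

```python
import queue
import threading
from typing import List


def start_worker(jobs: "queue.Queue", results: List[str]) -> threading.Thread:
    """Run queued callables on a background thread until a None sentinel arrives."""

    def loop() -> None:
        while True:
            job = jobs.get()
            try:
                if job is None:
                    return  # sentinel: shut the worker down
                results.append(job())
            finally:
                jobs.task_done()

    worker = threading.Thread(target=loop, daemon=True)
    worker.start()
    return worker


def handle_upload(jobs: "queue.Queue", filename: str) -> str:
    """Hypothetical request handler: enqueue the slow work and reply at once."""
    jobs.put(lambda: "thumbnail for " + filename)  # stand-in for real transcoding
    return "202 Accepted"
```

The questions this raises are the ones worth learning from: what happens to queued jobs when the process restarts, how a failed job is retried, and how the user finds out the work is done.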

If you make a project like this, you will learn most of the high value
systems/concepts being explored and demanded right now. Additionally, if you
satisfy the last constraint, you may have a cool successful startup at the end
of it.

------
nicara
(warning: going a little off topic here, sorry) Might I ask how you got
started in the first place? I've got a fairly reasonable background regarding
the theory of it [programming], I know the principles of OO and basic
algorithms, etc., but in school we don't cover the actual writing of code.
(And even if we did, it's probably safe to assume it'd go nowhere near as far
as I'd like it to.) Anyway, I picked up some Ruby lately, worked through a
bunch of tutorials, and it's been going decently - at first much too easy for
someone like me, then challenging, but then there's stuff I just don't know
how to do. On the one hand, I've repeatedly had big problems with blocks in
Ruby - I can't seem to grasp why to use such a weird format when you could
just use regular loops instead. On the other hand, and much more importantly,
I don't really know where to go from there. I can't write any real programs,
and I'd like to get into Rails eventually (as a gateway to Web developing as a
whole).

Edit: Out of the tutorials that I did, this one[1] was the one I liked most,
as it had a lot of cool tasks that you could just try and solve for yourself,
it really helped me get the basics down. However, none of the tutorials have
gone any deeper than that one, and as I'm sure you'll agree, I'm not exactly a
programmer yet after that tutorial :) Additionally, I've started to read
this[2] book, but it appears to follow a really strange direction and is
generally not very pleasurable to read (IMO). And, again, the moment it tries
to explain blocks to me I just stand there puzzled.. dropped it after I hit
that point, as I did with all the other materials I've tried out so far.

Again sorry for hijacking the thread and apologies for being unable to offer
any advice on your situation. Regards

[1] <http://pine.fm/LearnToProgram/>

[2] <http://www.ruby-doc.org/docs/ProgrammingRuby/>

Edit2: Alright, just saw you've been a developer for longer than I've even
used a computer :) So I suppose you can't answer this question either, bah.
Wish there were more people that didn't get into programming either 1950 or at
age 5. Really, where does someone start nowadays when they're 20 and have no
clue.

~~~
Murkin
Since I am coming from a long background of embedded programming, I guess my
starting point is different.

But, since my last company loved hiring bright college kids with 0 experience
(and I was usually in charge of training them), I have to say: work with
someone experienced.

You can re-invent the wheel a thousand times, but you will get much better if
you learn from someone else.

And until you do that, I have found a few blogs/videos that show how
professional developers write a simple site, start to finish. Since they
annotate the process, you really get to understand not just the how but also
the why.

So IMHO: best, internships; second, the blogs above; third, programming yourself
(and if you ever decide to switch to embedded programming and move to Israel,
I can hook you up ;)

~~~
Murkin
Try to get into a very large software company. Many of them are willing to
take newbies and pay the teaching price.

Many people I know started by working a year or two at Checkpoint. Go in
green, go out a super-star :)

------
almost
I wouldn't get too hung up on being "scalable" if I were you. Just do it and
worry about that later if it becomes an issue (which it probably won't; plenty
of sites are successful at what they're doing without having too many troubles
in this area).

Oh, and learn to use source control (Git or Mercurial or similar) if you
haven't already. When you break something on your site and don't notice for a
bit it's pretty important to be able to look back through the code history to
find out where it all went wrong :)

------
jmonegro
The best way for you to learn is to do so as you go. Plan an idea, a "big
project" without worrying about your skills. Make what you can, learn what you
cannot. When you come up with a question Google cannot answer, go ahead and
ask it at StackOverflow. They are very nice and do not mind "noob" questions.

That way, you will learn by doing, which is much more pragmatic. However, that
does not mean that you should not learn programming theory! Embrace what you
can.

------
etherealG
the best way to learn is to try. all these things that only apply to "real"
sites can be tested in smaller doses. try setting up a "scalable" blog site.

set up an artificial system that can't stretch very far by limiting the memory
/ instances of your web server e.g. then hit it with more than you know it can
handle. deal with that how you think it should be done. if it works for 1000
-> 10000 requests the same might work for 10000 -> 100000 as long as your
scale technique doesn't depend on small numbers.

same goes for all the other things you mentioned, they can be tested and coded
for without having more than a "blog site" as a test platform. just do it :)
code away!

the same applied to a "testing environment" would be to make yourself a 2nd
copy of your blog site. push to the copy, make sure things work, then push to
the main site. possibly even automate that 2 tier push by writing tests that
can be run without user intervention :)
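
a minimal sketch of such an automated check in Python, assuming the copy of the site is reachable over HTTP. the function names and any URL you pass in are your own; nothing here is a standard tool:

```python
import urllib.request
from typing import Callable, Iterable, List


def http_status(url: str) -> int:
    """Default fetcher: return the HTTP status code for a GET request."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return response.status


def failing_pages(base_url: str, paths: Iterable[str],
                  fetch: Callable[[str], int] = http_status) -> List[str]:
    """Return the paths that did not answer 200 on the given site."""
    failures = []
    for path in paths:
        try:
            ok = fetch(base_url + path) == 200
        except OSError:  # URLError and HTTPError are both OSError subclasses
            ok = False
        if not ok:
            failures.append(path)
    return failures
```

push to the copy, call `failing_pages` with the copy's address and a handful of important paths, and only push to the main site when the returned list is empty.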

------
JoelSutherland
Scalability solutions are best found when you have a scalability problem.

I don't know that I have heard of a project failing because it wasn't scalable
enough. The types of things that create scalability problems also tend to
bring the resources (and motivation) to solve them.

------
prakash
Software Engineering for Internet Applications by Philip Greenspun

<http://philip.greenspun.com/teaching/one-term-web>

------
sanj
Spend some time at a company that does this stuff daily.

(Yes, I'm hiring.)

