
Scaling PHP Book: I will teach you to scale PHP to millions of users - stevencorona
http://www.scalingphpbook.com/
======
snorkel
Here's a short list:

1. Cache the output at the edges: Use Varnish or another reverse proxy cache.

2. Cache byte code: Use a PHP opcode cache such as APC or XCache.

3. Cache and minimize database I/O: reduce database touches using memcached,
redis, file caches, and application-level caches (e.g. global vars)

4. Do event logging in local files, not to the database: Make all write
operations as simple and fast as possible; any data that is not needed in
realtime can be written to a plain old file and processed later.

5. Use a CDN, especially for delivering static assets

6. Server tuning: Apache, MySQL, and Linux have lots of settings that affect
performance; the timeout settings in particular ought to be turned down.

7. Identify bottlenecks: At the system level use tools such as strace, top,
iostat, vmstat, and query logging to see which layer is using the most time
and resources. There's also an excellent PHP code profiling service called New
Relic that drops you right into the function and db query that's eating up
most of the time in slow requests.

8. Load testing: DoS yourself. Stress test your stack to find bottlenecks and
tune them out

9. Remove unused modules: For each component in the stack unload any default
modules that are not needed to deliver your service.

10. Don't use ORMs and other dummy abstractions: Take off the training wheels
and write your own queries.

11. Make the entry pages fast, simple, and cacheable. Nobody is reading that
silly news feed in the bottom corner of your front page and it's killing your
database, so take it out.
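
The local-file event logging in item 4 could look roughly like the sketch below. It is written in Python only to keep it short; the same append-then-process-later pattern applies in PHP. The function names, the JSON-lines format, and the record fields are assumptions made for illustration, not something the list prescribes.

```python
import json
import time


def log_event(path, event_type, payload):
    """Append one event as a JSON line; no database round trip on the request path."""
    record = {"ts": time.time(), "type": event_type, "data": payload}
    # Append mode creates the file if it is missing and adds one line per event.
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")


def read_events(path):
    """Batch step, meant to run later and off the request path: parse the log."""
    with open(path, encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]
```

A separate job can call `read_events` on a schedule and bulk-load the results into the database, so the request itself only pays for a local append.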

Most of the time a PHP site slows down because each PHP process is blocked
waiting for I/O from some other layer: a slow disk, an overloaded database, a
hung memcached process, or a slow REST API call to a 3rd party service. Often
just strace'ing a live PHP process will show you what it's waiting for. In
short, blocking I/O slows down everything. The key to going faster is:

* keep it simple

* cache as much as possible in local memory

* do as few blocking I/O operations as possible per request
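
As a rough illustration of the last two points, here is a minimal cache-aside sketch: look in local memory first and only make the blocking call on a miss. It is in Python for brevity, and `LocalCache`, `get_or_load`, and the TTL default are hypothetical names and values chosen for the example. In a PHP deployment the store would more likely be a shared cache such as memcached (item 3 above), because a request's own variables do not outlive the request.

```python
import time


class LocalCache:
    """Tiny in-process cache with per-entry expiry (illustrative only)."""

    def __init__(self, ttl_seconds=60.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._store = {}  # key -> (expires_at, value)

    def get_or_load(self, key, loader):
        """Return the cached value; call loader(key) only on a miss or after expiry."""
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]
        value = loader(key)  # the slow, blocking call (database, REST API, ...)
        self._store[key] = (now + self._ttl, value)
        return value
```

Every hit served from `_store` is one blocking I/O operation the request no longer has to wait for. The trade-off is staleness, bounded here by the TTL.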

~~~
DanielN
I'm curious why you suggest avoiding ORMs. I've been debating this for a while
and while I like the accessibility and abstraction that ORMs provide in Django
and Rails I'm not familiar enough with Doctrine, DB_DataObject or the other
PHP options to really have formed my own opinions yet.

Is this a language specific suggestion or something broader? I would love to
hear a deeper debate on the subject.

~~~
notJim
I think the advice to not use an ORM* is very misguided. Use the ORM for the
80-90% of queries that are "SELECT * FROM table WHERE id=5" and "UPDATE table
SET name='jim' WHERE id=5". For the rest of the queries, use your ORM's
ability to run custom queries. People who have problems with ORMs are probably
not using them correctly.

There was a debate on HN about ORMs a while ago, and one of the most
significant comments I read was something along the lines of "Every project
I've ever worked on that 'didn't use an ORM' implemented the same
functionality of one in a less-maintainable way."

* Or other object-oriented abstraction layer for database access, ORMs aren't the only option.

~~~
Muzza
> * Or other object-oriented abstraction layer for database access, ORMs
> aren't the only option.

What do you mean by this? That you should consider a non-relational database?
Clearly any mapping from an OOP language to a relational database is an
object-relational mapping.

~~~
notJim
What I meant to say is that ActiveRecord isn't the only pattern.

------
rlander
This fetishism with scaling... to me it's just procrastination. It feels like
work because you're doing something technical but, in the end, you're adding
very little to your product.

Yesterday I had a meeting with a potential customer and I hated it. I hate to
try to explain my SaaS software to non-technical people who treat me like some
17-year-old webmaster. I'd much rather be refactoring Clojure code. But I got
out of my comfort zone and this client will probably add hundreds of thousands
to my bottom line. And I'm glad I was at that meeting while my competitors
were fetishizing about non-existent scaling issues.

It's 2012 for god's sake, you can rent a 32GB server for less than $100.

~~~
debaserab2
Performance isn't a problem until it is, and then when it is a problem, it's
bigger than any other problem in the world because you have nothing to sell if
it does not operate.

I always thought I was being wise by not doing any premature optimization, but
after a few lessons learned the hard way I certainly factor in performance to
the design of software before I build now.

Scalability is not a "feature" tacked on at the end of development.

~~~
rlander
Scaling will only become a problem when/if you achieve product/market fit.

What percentage of startups on HN have achieved product/market fit, are past a
128GB commodity box AND have no dedicated engineering team for scaling issues?

~~~
invisible
As an engineer, not knowing how to scale when working on something that may
need to scale could spell the downfall of whatever product you're working on.
Yes, all is well and fine until you hit a natural growth cycle and can't
commission new boxes fast enough because each request is taking 300ms. Your
database isn't accepting enough requests (plus, some of your bad queries that
aren't using indexes are blocking too long). Then your web servers run out of
memory because you are using an ORM for large lists of items that you're just
returning as an array...
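
The memory problem described above comes from building one full object per row for a large list before returning anything. One common way around it, sketched here in Python with the standard `sqlite3` module, is to read rows in batches and yield plain values. The `items` table, its columns, and the function name are assumptions made for the example.

```python
import sqlite3


def iter_item_names(conn, batch_size=1000):
    """Yield names in fixed-size batches instead of hydrating every row up front."""
    cursor = conn.execute("SELECT name FROM items ORDER BY id")
    while True:
        rows = cursor.fetchmany(batch_size)
        if not rows:
            break
        for (name,) in rows:
            yield name
```

Memory use then depends on `batch_size` rather than on the total number of rows. Most ORMs expose some form of raw or custom query that allows the same approach for the few large listings that need it.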

Once you get to the point of no return you have to know what to do or you'll
suffer. Learning that when fire is falling from the skies is the worst way in
retrospect.

------
Mikushi
Good to know other people want to share about successful scaling with PHP.

On the downside, I have a book cooking on the same exact subject, with release
planned for September (self-published, about 80% done), but now I'm not sure if
it's worth continuing. [edit] Slight moment of "panic", as seeing someone else
releasing a book on the same subject made me sad, but you're right, no reason
not to continue.

~~~
orthecreedence
Just because one person releases a book doesn't mean you don't have something
valuable to share as well. I say go for it. There's bound to be a lot of stuff
in your book this one misses, and probably vice versa. People who are facing
large-scale situations generally want all the information they can get their
hands on.

------
fiatmoney
My understanding is that no one finds it hard in a purely technical sense to
scale the pure page-serving portion of your website, whether it's PHP, Rails,
Lift, etc., because you can always throw up another caching layer or another
box to serve pages. The hard part is scaling access to your underlying data,
which heavily depends on your exact use case.

------
nascro
In what formats will this book be available? When I see a new self-published
book, I assume it will be available digitally. Your site (which is really
well-designed) mentions nothing of the format.

Either way, I would like to see images of what the product will look like. For
someone like me who doesn't need this book but is still interested in its
topic, images of a well-designed book might make the difference in whether or
not I purchase it.

~~~
stevencorona
Thanks for the feedback, all really good points. As the month goes on, I'll
have some pictures of the cover-art and chapter list available.

It will be self-published, DRM-free in PDF, mobi, epub. Looking into what it
takes to publish on the Amazon Store/Kindle/iBooks, but hopefully that's
something I can figure out after launching.

------
darkmethod
Thanks! This is perfectly timed. I'm looking forward to it. I signed up
immediately.

However, it looks like your mailchimp account is set up to link to
phpscalingbook.com (which is the wrong domain). I clicked on the "continue to
website" link after confirming.

~~~
stevencorona
Thanks for letting me know, I fixed it!

~~~
darkmethod
Much appreciated. Good luck to you! Looking forward to the book.

------
stevencorona
I have a pretty comprehensive list of chapters, but I'm still adding content
to the book, so if you guys have any ideas or suggestions for topics you'd
like to see covered, feel free to email me or post them here. Thanks so much!

~~~
JOfferijns
I'm very interested in what difference micro-optimizations (like using
while(list() = each()) instead of foreach) make when scaling.

Also, I noticed that when I clicked the "return to website" link after clicking
the subscribe button on your website, it went to google.com instead, and when I
clicked the "continue to our website" button after clicking the email
confirmation link, it went to <http://www.phpscalingbook.com/> (404 error).

And if you need any proofreaders, I'd be more than happy to help!

~~~
snorkel
On a typical LAMP web site the hottest bottleneck is usually PHP waiting for
database I/O, so slight code optimizations rarely produce a dramatic
improvement.
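
The arithmetic behind this is Amdahl's law: speeding up one part of a request helps only in proportion to that part's share of the total time. A small Python sketch follows. The 10%/90% split in the usage note is an assumed figure for illustration, not a measurement from any real site.

```python
def overall_speedup(fraction_improved, local_speedup):
    """Amdahl's law: whole-request speedup when only part of the time gets faster."""
    return 1.0 / ((1.0 - fraction_improved) + fraction_improved / local_speedup)
```

Suppose PHP code accounts for 10% of request time and database waits for the other 90%. `overall_speedup(0.10, 2.0)` says that doubling the speed of the PHP code makes the whole request only about 5% faster, while `overall_speedup(0.90, 2.0)` says that halving the database wait makes it about 1.8 times faster.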

------
Udo
Sounds interesting, especially since your stack matches mine pretty well, so
count me in. However, it should be pointed out there is nothing there at your
site to see yet (apart from the discount and announcement thing).

------
whatorm
"Don't use ORMs"? I'm sorry, that's just plain wrong, or you were exposed to
the wrong ORMs. Doctrine2 has great scalability - it has all sorts of caching
built into it - result caching, query caching, and so on and so forth. It's
actually -way- more scalable, and easier to develop for, than writing raw SQL
(which, by the way, is a portability nightmare). Also, if you want to use a
decent MVC framework, not using an ORM would be quite dumb. And if you're not
using a good, modern, scalable MVC framework in this day and age, well, I pray
for your soul.

So, USE AN ORM!!!

~~~
maratd
Every layer of abstraction you add not only adds a layer of complexity and
extra points of failure, but also adds restrictions.

By definition, an abstraction is more restrictive ... otherwise, you're not
really abstracting anything.

If you're making a simple web app, an ORM is just fine. If you're building
enterprise-level software, it is a really really bad idea. Half your code will
be using it, the other half will be forced to use half-assed SQL queries that
try to fit into your ORM. Because, quite simply, you will need all that SQL
has to offer to make things work right. You can't afford to abstract SQL away.
Trust me, I tried. In the end, the best you can do is go with something like
LINQ.

I built a LINQ-like system for PHP a while before Microsoft did it for .NET =)

On a side note, I would also stay away from all the frameworks and build one
yourself. If you're on a long-term project, it's worth it. You'll understand
what is happening and what each call really costs you. You can also refactor
an existing framework. Either way works.

------
rexreed
What is the expected price point of the book? Since you're in the middle of
writing, do you have any opportunities for "beta" testers who can provide
feedback in exchange for complimentary copies?

~~~
stevencorona
Probably $39.99, similar to bootstrappingdesign. Shoot me an email using the
contact form on my website and we can coordinate some kind of beta testing
once I push it to leanpub.

~~~
rexreed
Will do - I see the subscribe link, but no contact info, just twitter. I've
signed up for the list.

------
giberti
Nice to see someone who actually created a large volume site discuss their
learnings vs. theoretical works discussing how stuff (c|s)hould be done
without real experience.

Would love to see a chapter list!

~~~
stevencorona
I'll have one posted in the next day or two.

------
TheRevoltingX
You don't scale a language, you scale an architecture.

------
Fizzadar
Sounds awesome; using my favorite php stack too :)

Would be nice to see a preview of different bits of the book when written as
well.

~~~
stevencorona
I'll be pushing it to leanpub soon, so there will be an opportunity to check
it out and get beta editions.

------
rmATinnovafy
Good timing on the pre-release promotion. Now keep us updated on a weekly
basis and you will sell a lot of copies here.

I'm glad you are writing the book. It seems like a worthy addition to my
bookshelf. Will you be blogging about specifics mentioned in the book?

------
jameswyse
Seems your site doesn't scale very well..

404 Not Found

Code: NoSuchBucket Message: The specified bucket does not exist BucketName:
www.scalingphpbook.com RequestId: 403C0590E064E19F HostId:
a+UggS1lMBgPJrT5X/kbdzsRK1kx+iKBQw6u4dZxieNkspwHbLZBWzXMa9CiHEAu

~~~
phaemon
404 errors are not due to lack of scaling. I can't think of a case where you'd
get that.

Scaling problems will usually return 5xx errors, most commonly 502, 503 and
504.

~~~
jameswyse
It was a joke :p

------
voidfiles
Is it just me, or is the site 404ing? Good joke, whoever put this all
together.

~~~
cheeze
My thoughts exactly. "I can teach you to scale to millions." Site dies with
large amounts of traffic...

------
HyprMusic
Sounds very interesting, will be keeping a close eye on this.

On a side note: What is the icon for "Happy Users" supposed to be? It looks
like Facebook's like icon but without a thumb.

------
xsc
How long has this been in the works? Domain was registered yesterday... Great
way to measure interest :)

------
esbwhat
Very exciting! I hope this will be released on amazon or have affordable
shipping to Germany.

~~~
nodata
<http://www.cheapriver.com/>

~~~
esbwhat
Neat! I'll definitely be making use of this, thanks.

