

Introducing SymPullCDN: Getting Free Bandwidth From Google - symkat
http://symkat.com/118/introducing-sympullcdn/

======
piotrSikora
Sorry, but this has nothing to do with a CDN.

There were _a lot_ of similar projects back in the day when AppEngine was
still new, when people thought that content from AppEngine was served from
multiple locations because of how Google load balances/routes their
traffic... But this simply isn't true; the data is served from a single US
location.

On top of that, you're using the datastore, which by itself adds 30ms+ of
latency (on a good day).

~~~
stwe
The additional datastore call and the CPU time spent in the script (even
though it might be minimal) are totally unnecessary. Just upload the files
yourself and serve them statically. Even though it's not a true CDN (as
discussed in many places many times), it works well enough. And why do you
call it Google Application Engine? Google itself calls it just App Engine.

~~~
symkat
I'm going to reply to this with a copy-paste of what I said in response to a
similar question on reddit:

That’s an excellent question. While I thought of this solution as well, I
chose not to use it. Here are three key reasons I chose to do it this way
instead:

 _Speed_

While it can be argued that serving the files statically would increase
speed, since the script itself would not be loaded, this is questionable.
Google does not do 304 negotiations, and uses Cache-Control: no-cache when
serving static files. These default settings increase the number of requests
sent to GAE and make the browser do more work and transfer more bandwidth
for any user who loads a page more than once. My solution supports 304
negotiations on both sides of the cache.
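
For illustration, here is a minimal sketch of the browser-facing side of
such a cache on the old webapp framework. This is not SymPullCDN's actual
code; CachedResource is a hypothetical model standing in for whatever the
app really stores:

    from google.appengine.ext import db, webapp

    class CachedResource(db.Model):
        # Hypothetical model, not SymPullCDN's actual schema.
        etag = db.StringProperty()
        content_type = db.StringProperty()
        body = db.BlobProperty()

    class ServeHandler(webapp.RequestHandler):
        def get(self):
            resource = CachedResource.get_by_key_name(self.request.path)
            if resource is None:
                self.error(404)
                return
            # If the browser already holds this version, answer 304
            # with no body instead of resending the whole file.
            if self.request.headers.get('If-None-Match') == resource.etag:
                self.response.set_status(304)
                return
            self.response.headers['ETag'] = resource.etag
            self.response.headers['Content-Type'] = resource.content_type
            self.response.out.write(resource.body)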

 _Control_

Serving the files statically gives up control of the HTTP headers. SymPullCDN
forwards all headers from the origin. By using the static file handlers you
give up the ability to control some important headers, such as Cache-Control
and Expires.
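
As a rough sketch of what that forwarding can look like with App Engine's
urlfetch API (ORIGIN is a hypothetical setting and fetch_from_origin an
illustrative helper, not SymPullCDN's code):

    from google.appengine.api import urlfetch

    ORIGIN = 'http://origin.example.org'  # hypothetical origin host

    def fetch_from_origin(path):
        # urlfetch hands back the origin's response headers as a dict,
        # so Cache-Control, Expires, ETag, etc. can be stored alongside
        # the cached copy and replayed to browsers verbatim.
        result = urlfetch.fetch(ORIGIN + path)
        return result.status_code, result.headers, result.content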

 _Usability_

I’m using SymPullCDN side-by-side with WordPress. It’s caching files from my
origin, and WordPress is sending all requests for /wp-include/, /wp-content/,
and /s/ (static files) to SymPullCDN. If I change the template, add static
files to the site, or do anything else (installing plugins, for instance), I
don’t want to have to work out which files on my origin are missing on the
cache side, download them to my laptop, and upload them to Google. Having the
cache pull from my origin is a crucial usability feature for me.

At the end of the day, this is something I wrote for myself, to meet the
needs I have; however, I feel others might find it useful, so I released it
on GitHub and did a little write-up about it.

------
al_james
If only the author would implement the proper Google App Engine image
caching (via the get_serving_url method). This is a true CDN (in that
content is held close to the user) and allows dynamic image resizing by
changing the URL!
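
For reference, a rough sketch of that API, assuming the image bytes have
already been stored in the Blobstore (blob_key stands in for that stored
blob's key):

    from google.appengine.api import images

    # get_serving_url returns a URL on Google's image-serving
    # infrastructure; size options are appended to the URL itself.
    serving_url = images.get_serving_url(blob_key)
    thumbnail_url = serving_url + '=s200'  # longest side scaled to 200px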

~~~
symkat
Thanks for the tip, I'll look into implementing it within the next 1-2 weeks.

Right now there are a few other conditions that need to be handled (the
datastore filling up, adding a purge-least-used policy, handling the fetch
API failing, etc.) before I make any changes to how the data is stored.
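
For the fetch failure in particular, one approach I'm considering (sketched
here with hypothetical fetch_and_cache and serve_stale helpers) is to catch
urlfetch errors and fall back to the stale copy rather than erroring out:

    from google.appengine.api import urlfetch

    def refresh(path):
        try:
            return fetch_and_cache(path)   # hypothetical: pull and store
        except urlfetch.Error:
            # Origin unreachable or the fetch deadline was exceeded;
            # fall back to whatever copy is already in the datastore.
            return serve_stale(path)       # hypothetical fallback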

~~~
al_james
Cool. It's a promising tool.

I know from experience that image delivery times are _significantly_ reduced
using the get_serving_url method, so it could be a killer feature.

The hard bit will be that the CDN serving URLs are not guessable, so you
would need to communicate the CDN URL back to the app.
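
One possible shape for that, as a rough sketch: persist the unguessable
serving URL on a hypothetical cached-entry model when the image is first
cached, then redirect image requests to it:

    from google.appengine.ext import webapp

    class ImageHandler(webapp.RequestHandler):
        def get(self):
            # CachedResource is a hypothetical model, here assumed to
            # carry a serving_url property populated at cache time.
            entry = CachedResource.get_by_key_name(self.request.path)
            if entry and entry.serving_url:
                # Hand the browser off to Google's image infrastructure.
                self.redirect(entry.serving_url)
            else:
                self.error(404)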

------
js4all
Oh no,

      1. A CDN is a network with several edge locations
      2. A CDN is optimized for static content
      3. A CDN uses geographical routing

This solution does none of these things and probably worsens the response
time. It does offload some traffic, though. If that's what you want, go for
this solution; otherwise don't.

------
spahl
There is another similar project called cirruxcache
(<http://code.google.com/p/cirruxcache/>).

