Awesome. I think that is exactly the bug I noticed on my CloudFlare-powered website as well, but I did not have the experience to track it down like you have. Well done, and thank you for the nice, understandable write-up.
An insightful investigation, and a lesson to be learned about 'smart' CDNs. I appreciate the effort you put into this, and your skill in diving in and understanding the cause of the issue.
Hello Saurik, I'm the author of much of the code related to the problem you were experiencing on CloudFlare. We met a while ago at Velocity. I'm a huge fan of your work on Cydia, I really enjoyed reading your write-up, and I admire the penetrating approach you took to debugging this particular issue. Your diagnosis is accurate, and we are preparing a solution for the problem even as I write this.
There are a few points that I would like to respond to:
While [preloader] may be a nifty feature in some contexts, in the case of ModMyi the list of files seems to be a bunch of content for ModMyi's main website... the result being that millions of users of Cydia are wasting bandwidth and limited cache space downloading a number of files from ModMyi that are both a) large (as they are designed for desktop browsers) and b) useless.
I'm afraid that this is a case of a one-size-fits-all solution not fitting anyone perfectly. Our current preloader implementation is most effective on small- to medium-scale sites, and many medium- to large-scale sites experience the same problem you describe. We are working to update the feature to be "more intelligent" about what assets it loads, and where. Currently the list of preloadable assets is a global one, but we hope to make it a list that is consumed on a per-page basis, where the assets loaded are the ones relevant to the pages most often visited from the current page.
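As a rough illustration of the per-page idea (a sketch only, not CloudFlare's actual implementation), the list for a page could be derived from navigation data along these lines. The function name, the `transitions` log of (from, to) page views, the `page_assets` mapping, and the example paths are all hypothetical.

```python
from collections import Counter

def per_page_preload(transitions, page_assets, current, top_pages=2):
    """Pick assets to preload on `current` (hypothetical helper).

    transitions: iterable of (from_page, to_page) navigation events.
    page_assets: dict mapping a page to the list of assets it uses.
    """
    # Count which pages visitors most often go to from the current page.
    next_counts = Counter(dst for src, dst in transitions if src == current)
    # Assets the current page already uses do not need preloading.
    already_loaded = set(page_assets.get(current, []))
    preload = []
    for page, _ in next_counts.most_common(top_pages):
        for asset in page_assets.get(page, []):
            if asset not in already_loaded and asset not in preload:
                preload.append(asset)
    return preload

# Made-up navigation log and asset lists, for illustration only.
transitions = [
    ("/", "/forums"), ("/", "/forums"), ("/", "/news"),
    ("/repo", "/repo/packages"),
]
page_assets = {
    "/": ["site.css"],
    "/forums": ["site.css", "forums.js"],
    "/news": ["site.css", "news.js"],
    "/repo/packages": [],
}
home_list = per_page_preload(transitions, page_assets, "/")
repo_list = per_page_preload(transitions, page_assets, "/repo")
```

With a global list, a visitor to `/repo` would be handed the main site's assets; with a per-page list like this one, that visitor is handed nothing, because the pages reachable from `/repo` need nothing extra.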
While you can also get around this using standard HTML5 messages, these would affect the normal operation of the website CloudFlare is serving (something CloudFlare should be going to lengths to avoid).
This is an aside, but I should point out that some of our features (many of which were requested by customers) create significant, intentional side effects. Our image lazy loading feature, for instance, has layout implications. Many of our partner applications embed widgets and manipulate the page in other ways.
However, as we can see here, CloudFlare is making dynamic code changes to websites, and is using new features such as HTML5 message channels and local storage: features that might not even be implemented correctly in the browser, and which you may be actively avoiding for your own website due to the results of your testing process.
The surface area for failure related to the features that we work on and deploy is tremendous, particularly in the arena of document manipulation. One bad code push, and suddenly hundreds of thousands of websites could quite literally stop functioning, even if you can still technically get to them.
It is certainly true that many of our users are not comfortable with this idea, and those users often disable most of our "smart" optimizations. At the end of the day, CloudFlare will always strive to be a great CDN, and we do not require any "smart" optimizations to be enabled in exchange for usage of our service.
If I may offer another perspective on the topic, one of the things that I love about working at CloudFlare is that we are actively exploring many areas of web optimization that have previously been avoided because they were considered harmful, invasive, and/or overly ambitious. And to be completely frank, many of the features we are exploring ARE negative in one or more of those ways. However, that doesn't stop us from trying to come up with ways to make the features in question work on arbitrary websites across arbitrary devices and network connections. We aspire to break away from the preconceived notions of what a CDN can be, and we have found that our users appreciate us more for taking that approach. As someone who is well known for poking around outside the boundaries of conventional hackery, I hope you can appreciate how sometimes things can break in the process :) I don't personally think it means we should stop trying (nor do I think it precludes us from implementing processes to avoid breakage wherever possible).
you are the man.