

Why is the mobile web slow? - fishyfishy
http://www.codenameone.com/3/post/2013/07/why-mobile-web-is-slow.html

======
kybernetyk
> Objective-C doesn't use methods like Java/C++/C#, it uses messages like
> Smalltalk. This effectively means it always performs late binding and
> invoking a message is REALLY slow in Objective-C.

Well, "always" is a little wrong. The first time a message is sent and the
runtime looks up the bound method, the call is significantly slower (~4x
slower than a virtual method call in C++). But then this message/method pair
gets cached, and every subsequent call uses the cache.

Then there's some really neat trickery performed in objc_msgSend(): after the
method to "call" has been found, the code jumps directly into that method
without creating a new stack frame. (All needed arguments have been passed to
objc_msgSend() and are already in place, in registers or on the stack.)
objc_msgSend() in essence is a trampoline.

So when sending a message to an object multiple times you only pay for the
cache lookup - which is pretty fast by itself and a cached call is faster than
a C++ virtual method call.
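
To make the caching concrete, here is a toy model in TypeScript, not the real
runtime. ClassInfo, lookupSlow and msgSend are hypothetical names invented for
this sketch; the actual objc_msgSend() is hand-written assembly with a
different cache layout. It only shows the shape of the idea: one slow lookup,
then a per-class cache hit.

```typescript
// Toy model of selector caching; every name here is hypothetical.
type Imp = (self: object, ...args: number[]) => number;

class ClassInfo {
  readonly methods = new Map<string, Imp>(); // methods defined on this class
  readonly cache = new Map<string, Imp>();   // selectors already resolved
  readonly parent: ClassInfo | null;

  constructor(parent: ClassInfo | null = null) {
    this.parent = parent;
  }
}

let slowLookups = 0; // counts how often the slow path runs

// Slow path: walk the class chain until some class defines the selector.
function lookupSlow(cls: ClassInfo, selector: string): Imp {
  slowLookups++;
  for (let c: ClassInfo | null = cls; c !== null; c = c.parent) {
    const imp = c.methods.get(selector);
    if (imp !== undefined) return imp;
  }
  throw new Error(`unrecognized selector: ${selector}`);
}

// Fast path first; on a miss, do the slow lookup and remember the result.
function msgSend(cls: ClassInfo, self: object, selector: string, ...args: number[]): number {
  let imp = cls.cache.get(selector);
  if (imp === undefined) {
    imp = lookupSlow(cls, selector);
    cls.cache.set(selector, imp);
  }
  return imp(self, ...args); // hand off straight to the implementation
}

const base = new ClassInfo();
base.methods.set("add:", (_self, a, b) => a + b);
const derived = new ClassInfo(base);
```

Sending "add:" to derived twice runs lookupSlow only once; the second send is
served from derived's cache.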

Performance Numbers: [http://www.mikeash.com/pyblog/performance-comparisons-of-com...](http://www.mikeash.com/pyblog/performance-comparisons-of-common-operations-leopard-edition.html)

More about objc_msgSend(): [http://www.mikeash.com/pyblog/friday-qa-2012-11-16-lets-buil...](http://www.mikeash.com/pyblog/friday-qa-2012-11-16-lets-build-objc_msgsend.html)

Now I don't know enough about Java but I guess it isn't much faster than
that. So calling Obj-C slow may be a little too bold.

~~~
pkolaczk
Java can inline many virtual calls, making them essentially as fast as
static methods (= much faster than C++ virtual methods).

~~~
rsynnott
Though, while Java can, it looks like Dalvik generally does not (though most
of the info available seems to be 2.3-era; possibly things have improved
since). It looks like 2.3 Dalvik can inline getters and setters, but it's
certainly no Hotspot.

~~~
Zigurd
Dalvik is designed around criteria that are nearly diametrically opposed to
those for Hotspot. Hotspot goes for maximum performance. Dalvik's JIT compiler
is designed for maximum impact on performance with minimum computation. In
other words, Dalvik's JIT is designed for battery powered devices.

~~~
rsynnott
Oh, sure, I absolutely realise that a very aggressive JIT wouldn't be ideal
for Dalvik. However, given that the article is talking about how wonderful JIT
is, it perhaps concentrates a little too much on all the wonderful JIT things
that Dalvik (the only mobile JIT of serious interest to most developers) does
not do.

~~~
invalidname
Since the article was written by a Sun guy, Dalvik is probably not the area of
expertise there...

------
FelixH
It feels like the author just hijacked the topic to rant about Objective-C. No
big insights into why web apps are slow other than: JavaScript is not the
bottleneck, rendering the DOM (which is a complicated process) is.

------
mr_luc
The takeaway I got from this:

DOM reflows are the major unsolvable source of perceived slowness.

Cool.

So, in an app where you avoid reflows, is mobile web fast? Why or why not? How
avoidable are reflows?

Can anyone point me to any articles that talk about this specifically, or
benchmarks that make this concrete? (I vaguely recall benchmarks that covered
reflow, but it's a vague memory).

~~~
ohwp
Maybe this will help:
[https://developers.google.com/speed/articles/reflow](https://developers.google.com/speed/articles/reflow)

Edit: ah, posting the same link at the same time... But I think the best tip
is: be shallow. A lot of things trigger reflow, but DOM depth is what makes
reflows slow.

------
chrisdevereux
> Don't allocate when you need fast performance. This is good practice
> regardless of whether you are using a GC since allocation/deallocation of
> memory are slow operations (in fact game programmers NEVER allocate during
> game level execution).

> This isn't really hard, you just make sure that while you are performing an
> animation or within a game level you don't make any allocations. The GC is
> unlikely to kick in and your performance will be predictable and fast. ARC
> on the other hand doesn't allow you to do that since ARC instantly
> deallocates an object you finished working with.

I don't understand this. All the strategies one might use to avoid allocations
in performance-critical code under GC (pooling resources, or whatever) are
also available under ARC.

The difference is that if the way I use memory causes performance issues
without GC, it will be much more reliably reproducible than with GC.
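
A minimal sketch of what pooling looks like, assuming a fixed upper bound on
live objects. Particle and ParticlePool are made-up names for illustration.
Nothing is allocated or freed after setup, so the same structure applies
whether memory is reclaimed by a GC or by reference counting.

```typescript
// Hypothetical fixed-size pool: every object is allocated up front.
class Particle {
  x = 0;
  y = 0;
  alive = false;
}

class ParticlePool {
  private readonly free: Particle[] = [];

  constructor(capacity: number) {
    for (let i = 0; i < capacity; i++) this.free.push(new Particle());
  }

  // Hands out a recycled object, or undefined when the pool is exhausted.
  acquire(x: number, y: number): Particle | undefined {
    const p = this.free.pop();
    if (p === undefined) return undefined;
    p.x = x;
    p.y = y;
    p.alive = true;
    return p;
  }

  // Puts the object back for reuse instead of dropping the reference.
  release(p: Particle): void {
    p.alive = false;
    this.free.push(p);
  }

  // Number of objects currently available to acquire.
  get available(): number {
    return this.free.length;
  }
}
```

During an animation or game level you would only call acquire and release;
running out of objects shows up as an undefined result rather than as a
surprise allocation.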

~~~
Roboprog
Great, I gotta write (as if) in FORTRAN :-)

~~~
chrisdevereux
Hey, I said "might"!

------
crazygringo
> _In fact JavaScript can't technically perform slowly since it is for most
> intents and purposes single threaded... so long running JavaScript code that
> will take 50 seconds just won't happen..._

It's a valid point that almost nobody's writing 50-second raytracing routines
in JavaScript, but it's trivial to run code that's executed in the background
without freezing your app, by repeatedly calling it with setTimeout(). Just
make sure that any "tick" of your code runs in under 30ms or so (although,
depending on your task, that may not be so trivial).
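
As a rough sketch of that approach: sumInChunks, chunk and schedule are
hypothetical names, and the summing loop just stands in for real work. The
scheduler is a parameter (defaulting to setTimeout with a 0 ms delay) only so
the slicing logic can be exercised without a real event loop.

```typescript
// How the next slice gets deferred; the default below uses setTimeout.
type Schedule = (fn: () => void) => void;

// Sums 0..n-1 in slices of `chunk` iterations, yielding between slices.
function sumInChunks(
  n: number,
  chunk: number,
  done: (total: number) => void,
  schedule: Schedule = (fn) => { setTimeout(fn, 0); },
): void {
  let i = 0;
  let total = 0;

  function tick(): void {
    const end = Math.min(i + chunk, n);
    for (; i < end; i++) total += i; // one short slice of the real work
    if (i < n) {
      schedule(tick); // give the event loop a turn before continuing
    } else {
      done(total);
    }
  }

  tick();
}
```

The chunk size is the knob: pick it so one slice stays inside the time budget
mentioned above, and measure on the slowest device you care about.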

~~~
tg3
This is where Web Workers [1] come in. You shouldn't be performing background
processing in a way that blocks the UI. Heavy lifting like that should be
pushed into the background, if it's done on the front-end at all.

[1]
[https://en.wikipedia.org/wiki/Web_Workers](https://en.wikipedia.org/wiki/Web_Workers)

------
joeblau
So I don't know much about anything, but I'm confused on this point. You say
"This effectively means it always performs late binding and invoking a message
is REALLY slow in Objective-C." But I can't find any evidence of Objective-C
performing late binding in the documentation [1][2][3][4].

[1] - [http://stackoverflow.com/questions/5943949/late-binding-vs-d...](http://stackoverflow.com/questions/5943949/late-binding-vs-dynamic-binding)
[2] - [http://www.gnu.org/software/gnustep/resources/ObjCFun.html](http://www.gnu.org/software/gnustep/resources/ObjCFun.html)
[3] - [http://developer.apple.com/library/ios/#documentation/genera...](http://developer.apple.com/library/ios/#documentation/general/conceptual/DevPedia-CocoaCore/DynamicBinding.html)
[4] - [http://stackoverflow.com/questions/9470824/dynamic-binding-l...](http://stackoverflow.com/questions/9470824/dynamic-binding-late-binding-in-java-or-not)

~~~
Roboprog
"Late binding" is a generalized term for associating a symbol with a
value/code at run time, instead of compile/link time. (even if the term is not
literally used in the obj-C docs)

------
dougk16
"They can also reallocate elements into the stack frame rather than heap when
they detect specific allocation usage."

I've always been interested in whether this is done in various managed
languages. Since I never know the answer, one pattern I've taken to is to
allocate memory in static const/final fields that I would normally put on the
stack in C, instead of doing a heap allocation to a variable whose scope is
completely within one function. You have to watch out for some gotchas like
recursing into the same function, but overall it's a pretty painless and clear
pattern for me. Multiple functions can all share the same static memory too.

Not that I'm religious about this design pattern, but it's something I try to
do in performance-critical code.
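
Roughly what this looks like, sketched in TypeScript with static readonly
fields standing in for static final ones. Vec3 and TriangleMath are
hypothetical names, and the triangle-area routine is just an example of a
function that needs a couple of temporaries.

```typescript
interface Vec3 {
  x: number;
  y: number;
  z: number;
}

class TriangleMath {
  // Allocated once when the class is defined; these play the role of
  // temporaries that would live on the stack in C.
  private static readonly e1: Vec3 = { x: 0, y: 0, z: 0 };
  private static readonly e2: Vec3 = { x: 0, y: 0, z: 0 };

  // Writes a - b into out without allocating.
  private static sub(a: Vec3, b: Vec3, out: Vec3): void {
    out.x = a.x - b.x;
    out.y = a.y - b.y;
    out.z = a.z - b.z;
  }

  // Area of triangle pqr. Not re-entrant: every call shares e1 and e2.
  static area(p: Vec3, q: Vec3, r: Vec3): number {
    const e1 = TriangleMath.e1;
    const e2 = TriangleMath.e2;
    TriangleMath.sub(q, p, e1);
    TriangleMath.sub(r, p, e2);
    // Half the length of the cross product e1 x e2.
    const cx = e1.y * e2.z - e1.z * e2.y;
    const cy = e1.z * e2.x - e1.x * e2.z;
    const cz = e1.x * e2.y - e1.y * e2.x;
    return 0.5 * Math.sqrt(cx * cx + cy * cy + cz * cz);
  }
}
```

The recursion gotcha shows up here as the "not re-entrant" comment: the shared
scratch fields are overwritten on every call, so nothing may rely on their
contents across calls.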

~~~
rsynnott
Good Java JIT environments (HotSpot etc) do this. It looks like Dalvik does
not.

~~~
pjmlp
From the few times I tried to read Android code, I think the JIT has not been
touched since Android 2.3.

I am still curious what Google is going to do with Java and Dalvik, as the
whole thing seems to have been frozen since the legal dispute with Oracle, and
they only add new APIs.

The last two Google I/Os did not have any Dalvik-related talks.

~~~
Zigurd
It's unclear if Dalvik's JIT compiler needs updating. For non-mobile uses like
GoogleTV, it might be worth making a much more aggressive JIT compiler, since
battery use would not be an issue. Other than that, it seems like changes
would have a small benefit, a large risk, and a very large testing burden.

Android is still a very lean project inside a lean corporate structure. If
it's a third of the way down the priority list, it probably ain't happening.

~~~
pjmlp
> It's unclear if Dalvik's JIT compiler needs updating.

Well, the people doing games have already given up on it since the NDK became
available, but given that Google gives second-class treatment to the folks
using the NDK, it would be nice if they cared to improve the JIT.

~~~
Zigurd
Android contains a variety of performance strategies: Dalvik's JIT is
transparent to the developer and provides a performance boost for most Java
apps; Renderscript is for compute-intensive operations; the NDK enables native
code modules and makes it fairly convenient to support multiple architectures.
Somewhere in there you should be able to find what you need. But I really
don't see why anyone developing action games for a multi-threaded,
multi-tasking Java OS isn't aware they are going to find it difficult to get a
consistently performing game loop. Android just isn't designed for that.
Perhaps if Google is really thinking of making a console, they will provide
game-oriented scheduling.

~~~
pjmlp
I see you are well informed about Android. :)

Sure, we can find our way around what is provided, but the performance story
could be made better, especially when compared to what is given to us in the
iOS and Windows Phone 8 environments.

------
rsynnott
> This isn't really hard, you just make sure that while you are performing an
> animation or within a game level you don't make any allocations.

That seems like a big 'just', at least in a multi-threaded application...

~~~
fpgeek
Ah, but if you're going to talk about multi-threaded applications, we're going
to have to start talking about things like thread-safe reference counts...

~~~
rsynnott
You certainly have to be careful passing data between threads in ObjC (though,
of course, you do in Java, too); the real problem with managed memory in
multi-threaded, highly latency-sensitive UI applications, though, is that a
thread other than the UI thread can happily trigger a stop-the-world
collection through allocation, and it's very difficult for the UI thread to
say "I'm doing something important for now; please don't allocate for a bit"
(it's possible, but it's a nightmare).

~~~
Dylan16807
A nightmare? Is it hard/impossible to wrap the allocation function with a
boolean check and a spin in ObjC?

------
st3fan
_message is REALLY slow in Objective-C_

Some numbers to back this up would be nice. Apple has been optimizing the hell
out of Objective-C and I think it has a method invocation down to like 10
instructions or so.

------
est
This is why

[http://widgetsandshit.com/teddziuba/2008/09/a-web-os-are-you...](http://widgetsandshit.com/teddziuba/2008/09/a-web-os-are-you-dense.html)

