They stopped the one child policy nine years ago; they had it in the first place because demographic projections had them severely overpopulated if they didn't.
During the period in which it was active, their GDP increased by a factor of about 62. Not percentage, multiple.
They reversed the policy too late. Their rapid growth came from seizing the low-hanging fruit: upgrading a country of almost entirely peasants into one of slightly fewer peasants, all funded by Western greed. Those gains will never happen again.
Entrenched, well-connected players tend to exit industries before the gavel falls, and enforcement from the CCDI isn't impartial; there's plenty of bribery to stay off its radar.
Truly Schumpeterian creative destruction is good in a vacuum, but reality isn't a vacuum.
There is no such thing as a "best possible model, full stop". Models are always context-dependent: they carry implicit or explicit assumptions about what is signal and what is noise, and they have different performance characteristics in training and at inference. Choosing the "best" model for your task is itself a form of hyperparameter optimization.
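A minimal sketch of that idea: treat the model family itself as a hyperparameter and pick it on held-out data. Everything here (the synthetic linear data, the two toy candidates) is made up for illustration.

```python
import random

random.seed(0)

# Synthetic data: a noisy linear trend (illustrative only).
xs = [i / 10 for i in range(100)]
ys = [2.0 * x + random.gauss(0, 0.5) for x in xs]

# Holdout split.
train_x, train_y = xs[::2], ys[::2]
val_x, val_y = xs[1::2], ys[1::2]

def fit_mean(x, y):
    # Baseline: always predict the training mean.
    m = sum(y) / len(y)
    return lambda _: m

def fit_linear(x, y):
    # Ordinary least squares for y = a*x + b.
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    a = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / \
        sum((xi - mx) ** 2 for xi in x)
    return lambda xi: a * xi + (my - a * mx)

def mse(model, x, y):
    return sum((model(xi) - yi) ** 2 for xi, yi in zip(x, y)) / len(x)

# "Which model family?" is searched on validation data,
# exactly like any other hyperparameter.
candidates = {"mean": fit_mean, "linear": fit_linear}
scores = {name: mse(fit(train_x, train_y), val_x, val_y)
          for name, fit in candidates.items()}
best = min(scores, key=scores.get)
print(best, scores)
```

Swap in real model families (GBTs, NNs, ...) and a cross-validation loop and the structure is the same.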
Plenty of places use DL models, even if only as one component of their stack. I would guess that gradient-boosted trees are more common in applications, though.
Still mostly NLP and image stuff. Most actual data in the wild is tabular, where GBTs are usually some combination of better-performing and easier to use. In some circumstances, NNs can still work well on tabular problems with the right feature engineering or model stacking.
NNs are also more attractive for streaming data, since tree ensembles can't easily learn incrementally; they have to be retrained from scratch each time.
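To make the contrast concrete, here is a toy incremental learner: a linear model doing one SGD step per incoming event, so each update is constant work. The stream and its 3x+1 relationship are made-up assumptions; a tree ensemble absorbing the same stream would refit over the accumulated history.

```python
import random

random.seed(1)

class OnlineLinear:
    """Incremental linear model: one SGD step per sample, no retraining."""
    def __init__(self, lr=0.05):
        self.w, self.b, self.lr = 0.0, 0.0, lr

    def predict(self, x):
        return self.w * x + self.b

    def update(self, x, y):
        err = self.predict(x) - y     # squared-error gradient
        self.w -= self.lr * err * x
        self.b -= self.lr * err

model = OnlineLinear()
for _ in range(5000):                 # simulated stream: y = 3x + 1 + noise
    x = random.uniform(0, 1)
    y = 3.0 * x + 1.0 + random.gauss(0, 0.1)
    model.update(x, y)                # constant work per event

print(round(model.w, 1), round(model.b, 1))
```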
ML is very good at figuring out patterns like: every day at 22:00 this asset goes up, if another asset is not at a daily maximum and market volatility is low.
You might call this overfitting/noise/.... but if you do it carefully it's profitable.
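"Doing it carefully" mostly means checking mined rules out-of-sample. A toy sketch: hourly returns with a small real edge deliberately injected at hour 22 (all numbers are invented), mine the best hour in-sample, then insist it survives on fresh data before trusting it.

```python
import random

random.seed(2)

def make_returns(days):
    # Hourly returns; a genuine small edge is injected at hour 22.
    data = []
    for _ in range(days):
        for h in range(24):
            edge = 0.05 if h == 22 else 0.0
            data.append((h, edge + random.gauss(0, 0.02)))
    return data

train = make_returns(60)
test = make_returns(60)

def mean_by_hour(data):
    sums, counts = [0.0] * 24, [0] * 24
    for h, r in data:
        sums[h] += r
        counts[h] += 1
    return [s / c for s, c in zip(sums, counts)]

# Mine the rule in-sample...
train_means = mean_by_hour(train)
best_hour = max(range(24), key=lambda h: train_means[h])

# ...then validate it on data the search never saw.
oos_mean = mean_by_hour(test)[best_hour]
print(best_hour, round(oos_mean, 3))
```

With 24 candidate hours the in-sample winner can easily be noise; the out-of-sample check is what separates a real pattern from overfitting.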
Real-time parsing of incoming news events and live scanning of internet news sites - coupled with sentiment analysis. Latency is an interesting challenge in that space.
Multiple parts of the iPhone stack run DL models locally on your phone. They even added hardware acceleration to the camera pipeline because most of the picture-quality gains come from software rather than hardware.
I guess the OP may be envisioning an end-to-end solution that can train a model in the context of an external document store.
I.e., one day we want to be able to backprop through the database.
Search systems face equivalent problems. The stages in the hierarchy of an ML retrieval system are separately optimized (trained). Maybe this helps regularize things, but, given enough compute and complexity, it is theoretically possible to differentiate through more of the stack.
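A toy illustration of differentiating through a retrieval step: a retriever scores documents, a softmax relaxes the discrete top-1 selection so the whole stack is differentiable, and the downstream loss sends gradient back into the retriever weight. Finite differences stand in for real autodiff here, and the one-feature "documents" and target are made up.

```python
import math

docs = [0.2, 0.9, 0.4]   # each "document" reduced to a single feature
target = 0.9             # what the full stack should output

def loss(w):
    scores = [w * d for d in docs]                 # retriever scores
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    probs = [x / z for x in exps]                  # soft top-1 over docs
    out = sum(p * d for p, d in zip(probs, docs))  # soft-retrieved value
    return (out - target) ** 2                     # downstream task loss

def grad(w, eps=1e-5):
    # Numerical gradient; a stand-in for backprop through the stack.
    return (loss(w + eps) - loss(w - eps)) / (2 * eps)

w = 0.0
for _ in range(300):
    w -= 20.0 * grad(w)   # end-to-end update of the retriever parameter
```

The point is only that the selection step, once relaxed, passes useful gradient to the retriever; real systems use learned rerankers and hard top-k with tricks like straight-through estimators instead.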
Not at all. The authority figure simply could not live in a world where the question was wrong. A better response, in his role as a school teacher, would have been to tackle the question in depth as a class, or otherwise show me how I was mistaken. But since I was immediately dismissed, I'll never know for sure.
The Fourier coefficient F(w) is the (complex) dot product of f with an exponential basis function, `e_w • f`, and is in that sense a projection. The inverse Fourier transform writes the original function as a sum of the projected components: `f = sum_w (e_w • f) e_w = sum_w F(w) e_w`. This is exactly how writing an "arrow"-style 2- or 3-D vector as a sum of orthogonal projections works.
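This is easy to check numerically for the discrete case. A sketch with N = 8 made-up samples: the DFT coefficients are complex dot products with orthonormal exponentials, and summing the projections recovers f exactly (sign/normalization conventions vary; this one makes each `e_w` unit length).

```python
import cmath

N = 8
f = [0.0, 1.0, 2.0, 1.0, 0.0, -1.0, -2.0, -1.0]

def e(w):
    # Orthonormal basis vector e_w (unit length under the complex dot product).
    return [cmath.exp(2j * cmath.pi * w * n / N) / cmath.sqrt(N)
            for n in range(N)]

def dot(a, b):
    # Complex dot product <a, b> = sum_n a_n * conj(b_n).
    return sum(x * y.conjugate() for x, y in zip(a, b))

# F(w) = e_w . f : the projection coefficients.
F = [dot(f, e(w)) for w in range(N)]

# f = sum_w F(w) e_w : the sum of orthogonal projections recovers f.
recon = [sum(F[w] * e(w)[n] for w in range(N)) for n in range(N)]
```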
http://proceedings.mlr.press/v37/sohl-dickstein15.pdf