Hacker News | ethikal's comments

Full MLPerf results can be found here: https://mlperf.org/results/


Box IPOed at $14/share but spiked to $23/share on the first day of trading. Assuming 50K shares (and ignoring the strike price / any upfront cost he might have paid), total additional compensation was ~$700K to ~$1.15M (50K × $14 at the IPO price; 50K × $23 at the first-day peak).


You can make a pretty good guess at the strike price from Box's financing history. They raised a Series F in 2013 at $2B post. There are currently 141M shares outstanding, of which 12.5M were offered in the IPO, and roughly 10% of the company (13M shares) was sold in the Series F & Series G, so figure 120M shares outstanding at the time of his option grant. That implies a strike price between about $10 (at the $1.225B Jul 2012 Series E valuation) and $17 (at the Dec 2013 Series F valuation).

At the end of the 6-month lockup Box's shares were worth $17, so that would imply his options were anywhere from +$175K (he'd only have vested half of them, with a standard 4-year grant) to just slightly underwater. If he held to the $28 peak this May, he'd be anywhere from +$500K to +$900K, but if he kept holding, he could potentially be underwater today. It's a nice additional bonus (particularly if he got in at the Series E valuation), but not enough to make one rich.
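To make the back-of-the-envelope math above concrete, here is a sketch. All inputs (50K shares, 120M shares outstanding, half vested at lockup) are the rough estimates from this thread, not figures from any filing:

```javascript
// Hypothetical option-value math using the thread's estimates.
const shares = 50000;             // assumed grant size
const sharesOutstanding = 120e6;  // estimated at time of grant
const seriesEPost = 1.225e9;      // Jul 2012 Series E post-money
const seriesFPost = 2.0e9;        // Dec 2013 Series F post-money

// Implied strike-price bounds: company valuation / shares outstanding.
const strikeLow = seriesEPost / sharesOutstanding;   // ~$10.2
const strikeHigh = seriesFPost / sharesOutstanding;  // ~$16.7

// Value at the end of the 6-month lockup ($17/share), assuming
// half of a standard 4-year grant had vested by then.
const lockupPrice = 17;
const vested = shares / 2;
const bestCase = vested * (lockupPrice - strikeLow);   // ~$170K
const worstCase = vested * (lockupPrice - strikeHigh); // barely above water

console.log(strikeLow.toFixed(2), strikeHigh.toFixed(2));
console.log(Math.round(bestCase), Math.round(worstCase));
```

Changing `shares` or the vesting assumption shifts the outcome linearly, which is why the "why assume 50K shares?" question below matters so much.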


The first day of trading makes no difference whatsoever. Employees are usually locked up for about 6 months, by which point the stock may well have fallen below IPO levels.


Week of January 23, 2015: $23.23 (IPO Week)

Week of June 26, 2015: $19.10


Why assume 50K shares?


For some benchmarks, take a look here: http://tech.marksblogg.com/benchmarks.html


It's not just about accelerating ML (specifically deep learning). There are many other enterprise technologies that can benefit from GPUs. One example: OLAP-focused databases (such as MapD - https://www.mapd.com/). For some benchmarks, check out this blog: http://tech.marksblogg.com/benchmarks.html.

The DL "training" use-case is well-known at this point, but there are many others which are emerging.


A GPU database isn't that useful, because the arithmetic intensity (ops/byte) of typical database workloads is relatively low. Cross-sectional memory bandwidth is what really matters; you can get similar effects with an appropriately provisioned cluster of CPU machines, with a shard or a replica of the database on each machine. I say this as someone who has written a GPU in-memory database of sorts that is used at Facebook (Faiss). What is interesting is when you can tie the lookup to something with higher arithmetic intensity before or after it on the GPU.

GPUs are only really being used for machine learning due to the sequential dependence of SGD and the relatively high arithmetic intensity (flops/byte) of convolutions and certain GEMMs. The faster you can take a gradient descent step, the faster the wall-clock time to convergence, and you would lose by limiting memory reuse (for conv/GEMM), or on communication overhead or latency, if you attempted to split a single computation across multiple nodes. The Volta "tensor cores" (fp16 units) make the GPU less arithmetic-bound for operations such as convolution that require a GEMM-like operation, but the fact that memory bandwidth did not increase by a similar factor means that Volta is fairly unbalanced.
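The flops/byte contrast the two comments above are debating can be made concrete. A sketch with illustrative sizes (fp32, and counting only compulsory off-chip traffic, i.e. assuming ideal cache reuse):

```javascript
// Arithmetic intensity (flops per byte) of a dense GEMM versus a
// simple columnar scan, at 4 bytes per fp32 element.
const bytesPerElem = 4;

// C = A * B, with A being M x K and B being K x N.
function gemmIntensity(M, N, K) {
  const flops = 2 * M * N * K;                          // one multiply + one add per term
  const bytes = (M * K + K * N + M * N) * bytesPerElem; // read A and B, write C
  return flops / bytes;
}

// e.g. SELECT SUM(x) over a column: one add per element read.
function scanIntensity() {
  return 1 / bytesPerElem;
}

console.log(gemmIntensity(4096, 4096, 4096).toFixed(1)); // ~682.7 flops/byte
console.log(scanIntensity());                            // 0.25 flops/byte
```

A large GEMM does hundreds of flops per byte moved, so a GPU's compute throughput pays off; a scan does a fraction of a flop per byte, so it is bounded by memory bandwidth no matter how many ALUs you have, which is the parent comment's point.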

The point about Intel not increasing their headline performance by as much as GPUs is also misleading. Intel CPUs are very good at branchy code and are latency-optimized, not throughput-optimized (as far as a general-purpose computer can be). Not everything we want to do, even in deep learning, will necessarily run well on a throughput-optimized machine.


Actually, in columnar databases the ops/byte intensity is significantly greater, and the GPU helps here.

If you think about how a database CAN be built, instead of how they were built until now, you will find that there are very interesting ideas that can and do make use of the GPU.

The research into these has been around since 2006, with a lot of interesting papers published around 2008-2010. There are also at least 5 different GPU databases around, each with its own strengths and suitable use cases [1]...

[1] https://hackernoon.com/which-gpu-database-is-right-for-me-6c...


Bullshit on what? Your argument makes no sense.

He's not arguing that a 50K-hour salesperson doesn't close more numerous and more lucrative deals than a 500-hour salesperson. The point is that you can invest a small(ish) amount of time to learn the basics of skills that will help you avoid critical pitfalls when starting a company - an extremely worthy investment when you consider the diminishing returns of mastery and the fact that you can't be a master of everything.


I think the author is writing under the assumption that there are diminishing returns as one gets better at a skill, when really the opposite is true. The number of topics you can absorb per unit time obviously increases with your pre-existing competency with the skill.

There absolutely are diminishing returns in mastering a skill. For example, if you spend a year learning how to play the piano, you might be at a level where you can play some pop songs and impress your friends. To reach the level where you can play some Rachmaninov will probably require an order of magnitude more time (10 years), even though you know all the basics. In the process of mastery from years 1-10, you won't unlock an order of magnitude more songs as a result. In fact, with each year that you march towards mastery, there'll probably be fewer and fewer songs that you unlock. Your returns are diminishing.

This isn't to say that you shouldn't seek to master something (or several things), only that if you are trying to accomplish something which requires mastery of a lot of skills (like starting a company) you should eliminate blind spots by learning a bit about what you don't know. Even reading a book about sales is better than not knowing anything at all about it. Even if you have a sales expert on your team, it helps to empathize and know what they'll be thinking about as you build the product (if you are an engineer). Being an entrepreneur is all about being well-rounded.



Excellent, thanks!


There are ways around this. With Ember it is possible to use the browser's history API to mask the hash... http://emberjs.com/guides/routing/specifying-the-location-ap...
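For Ember of that era, the switch described in the linked guide looked roughly like the sketch below (a minimal example assuming a globals-style app named `App`; exact syntax varies by Ember version):

```javascript
// Tell Ember's router to use the HTML5 History API instead of the
// default hash-based location, so routes render as /posts/1 rather
// than /#/posts/1. Note this needs server-side support: the server
// must serve the app's index page for any deep-linked URL.
App.Router.reopen({
  location: 'history'
});
```

The trade-off is exactly the one discussed above: hash URLs work everywhere with no server changes, while history-based URLs look clean but require the server to cooperate on deep links.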


Great post! You mention that it is important to "be skeptical" - I concur and would add that it's helpful to approach the analysis from a non-biased standpoint. Even if you are going into your analysis with certain goals in mind, it is not only more ethical, but also more persuasive, to indicate any inconsistencies in your findings.


Doesn't the article specifically point out that software engineering isn't encountering a shortage?...

"It is true that high-skilled professional occupations almost always experience unemployment rates far lower than those for the rest of the U.S. workforce, but unemployment among scientists and engineers is higher than in other professions such as physicians, dentists, lawyers, and registered nurses, and surprisingly high unemployment rates prevail for recent graduates even in fields with alleged serious “shortages” such as engineering (7.0 percent), computer science (7.8 percent) and information systems (11.7 percent)."

