
Supercomputer quietly puts U.S. weather resources back on top - jonbaer
http://www.usatoday.com/story/weather/2016/02/22/supercomputer-reston-noaa-cray-ibm/80290546/
======
cryptoz
It's a very exciting time in weather forecasting. We're getting all kinds of
new observations available, from cheap LEO satellites to smartphone sensors to
smart umbrellas and everything else. And faster computers! And more parallel
computers! Think about every single iPhone and every single Galaxy S in the
world as a useful sensor array _and_ a parallel processor...

Weather forecasting accuracy is still constrained by both the available
observations and the ability to run the forecast models. This new
supercomputer won't be nearly enough to process all the new data that will
arrive in the next few years, so the pendulum will swing back to the
bottleneck being computing power soon enough. I think the most exciting things
will happen over the next 5-10 years, as we end up with another 2-3 orders of
magnitude more observations available to the models and we build
new supercomputers to process them. I bet we can get incredible improvements,
things like reliable 2-week weather forecasts.

~~~
kuya
just curious, do you have references to smartphone sensors being utilized in
numerical weather prediction?

i'd been browsing through some WMO reports and other associated NWP lit
recently and i'd seen stuff on GPS-RO, but nothing on anyone assimilating
"smart" devices.

~~~
cryptoz
Sure. I've been working on this problem for about 5 years now, starting with
the collection of barometric pressure from Android devices. I'm currently
working on this with iPhones at Sunshine [1], where we collect pressure data
(along with other metrics), but pressure is the most valuable.

There are researchers who use this data - we have collected about 4 billion
atmospheric pressure measurements that we have distributed for academic and
government research. The primary researchers are Cliff Mass and his lab at the
University of Washington. There are also groups in Canada and the US that are
using the data. IBM is now also collecting and using smartphone pressure data
through their mobile apps. [2]

Generally speaking, the current trend is to take the live data stream, run it
through a quality-control algorithm, and then use Kalman filters in the WRF
data assimilation package.

There are some papers published, but it is still early. I will find some links
to papers if you'd like to read them. [3]

[1] [https://thesunshine.co/](https://thesunshine.co/)

[2] [http://www.nytimes.com/2015/10/29/technology/ibm-to-
acquire-...](http://www.nytimes.com/2015/10/29/technology/ibm-to-acquire-the-
weather-company.html)

[3] Utility of Dense Pressure Observations for Improving Mesoscale Analyses
and Forecasts:
[http://www.atmos.washington.edu/~hakim/papers/madaus_hakim_m...](http://www.atmos.washington.edu/~hakim/papers/madaus_hakim_mass_2013_pressure_assimilation.pdf)

~~~
kuya
ah, thanks, that's fascinating.

"It also learns every time you actively report to the community on sky
conditions and hazards, translating this information into weather
predictions."

does sunshine actively run a numerical weather prediction model like WRF?

it'd be great to assimilate all these extra sensors into the NWP centre's
models, but i imagine things like cal/val and WMO agreements to share data
might make things difficult for commercial companies?

~~~
cryptoz
Yes, Sunshine is running WRF!

And yes it would be great to have all these sensor readings available to NOAA,
Environment Canada, ECMWF, and everywhere. I have made lots of progress in
getting them to talk about it, but it's a long road before any government
starts using this data in its own NWP models.

~~~
kuya
you're probably already aware of this, but there is some US legislation trying
to open up commercial data to gov't entities:

[https://www.congress.gov/bill/114th-congress/house-
bill/1561...](https://www.congress.gov/bill/114th-congress/house-
bill/1561/text)

there's some stuff being written this year as well for the side that i work on
(space). with all the upcoming sensor gaps, they're looking at alternative
ways to cover them.

------
Sniffnoy
I'm bothered by how the article says "three years ago European models
delivered a blow to the U.S. weather apparatus". Better models by Europeans do
not make the models by Americans any worse. Improved models help everyone.

~~~
akie
I talked to a (very) senior scientist at the World Meteorological Organization
not too long ago. He was of the opinion that the U.S. weather models were
about 5 to 10 years behind on the European models, and that this was caused by
structural underfunding over longer periods of time (i.e. past decades). As I
understood it these weather models are basically humongous software programs
that are developed in house, but based on publicly available science and fed
by data from weather stations.

Does your software become better if you run it on a faster computer? It will
certainly help, but he left me with the impression that the US had a bit more
catching up to do than just buying a new supercomputer.

~~~
barney54
He is correct. The European model still outperforms the US (GFS) model:
[https://twitter.com/RyanMaue/status/700216724067586049](https://twitter.com/RyanMaue/status/700216724067586049)

------
acd
Great that there is a more powerful computer helping predict the weather! One
should know that weather is a chaotic system, and beyond a few days it's very
hard to predict the weather correctly. This is demonstrated by the Lorenz
equations.

[http://www.uvm.edu/~cdanfort/research/danforth-bates-
thesis....](http://www.uvm.edu/~cdanfort/research/danforth-bates-thesis.pdf)

Here is the UK met office. "Most of the time the atmosphere behaves rather
like the lower-left picture where we can predict with confidence for a few
days and have to use probabilities thereafter. "
[http://research.metoffice.gov.uk/research/nwp/ensemble/conce...](http://research.metoffice.gov.uk/research/nwp/ensemble/concept.html)
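To get a feel for what "chaotic" means here, a few lines of Python show sensitive dependence on initial conditions in the classic Lorenz '63 system (forward Euler and the step count are chosen for brevity, not accuracy):

```python
import numpy as np

def lorenz_step(state, dt=0.01, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """One forward-Euler step of the Lorenz '63 system."""
    x, y, z = state
    return state + dt * np.array([sigma * (y - x),
                                  x * (rho - z) - y,
                                  x * y - beta * z])

a = np.array([1.0, 1.0, 1.0])
b = a + np.array([1e-8, 0.0, 0.0])   # perturb by one part in 10^8
for _ in range(3000):                # integrate ~30 model time units
    a, b = lorenz_step(a), lorenz_step(b)
print(np.linalg.norm(a - b))         # the two trajectories are now far apart
```

The perturbation grows by many orders of magnitude: a measurement error far below any instrument's precision eventually dominates the forecast.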

~~~
kkylin
Indeed! Though the constraints that chaos imposes on predictability are a
subtle question. See, e.g.,
ftp://mana.soest.hawaii.edu/pub/rlukas/LSASI/ENSO/Prediction/Predicability_a_Problem_2006.pdf

------
nxzero
Maybe it's me, but having the backup in Orlando sounds like a bad idea. Anyone
able to comment on the logic behind this?

~~~
chm
Could they have built it underground?

~~~
munificent
"Underground" isn't very feasible in Florida. Most of the state is effectively
a sandbar with a very high water table. Underground construction is uncommon.

~~~
CWuestefeld
_"Underground" isn't very feasible in Florida._

Here in central Texas, we have the same thing, but for the opposite reasons.
Our water table is pretty deep - typical wells in my town are 900ft deep. But
rather than sand, we've got limestone. Typical land has _inches_ of topsoil
sitting on top of solid limestone, so excavating a basement is prohibitively
expensive.

~~~
sithadmin
>Typical land has inches of topsoil sitting on top of solid limestone, so
excavating a basement is prohibitively expensive.

So you're telling me one could blast out a _cave_ in the backyard? Spare no
expense!

------
matt2000
Is there anyone here who works on weather models who would be able to explain
why custom supercomputers are still needed vs. large commodity clusters?

Thanks!

~~~
paulmd
Modern supercomputers essentially _are_ large commodity clusters. Nobody
builds a single fast machine à la the Cray-1 anymore. It's just not scalable
past a certain point, and we're long since past that. However, they still like
to
give them fancy names as marketing, because they cost tens of millions of
dollars.

The main differentiation vs a standard datacenter is low-latency high-
throughput interconnects, often with a specific or reconfigurable topology.
Something like InfiniBand, where you can ensure very consistent, very
favorable bounds on your latency and throughput. So think of it as being a
"custom installation" or "custom cluster" instead.

Nowadays there are only a few real choices left in high-performance computing
hardware: CPU vs GPU, processor architecture (e.g. POWER vs x86), and
interconnection. Everything else is more or less irrelevant.

I have to admit though, I've always wanted to see what you'd get if you built
a Cray-2 with GPU chips (eg Pascal) or Xeon Phi as processing modules. The
sheer density of chips Cray managed to cram in that case is impressive.

~~~
thrownaway2424
What kind of latency do they expect from these things? Less than 50
microseconds for a small RPC or small RDMA access? I'm wondering how far these
things really are from commercial cloud clusters.

~~~
semi-extrinsic
For a relatively modern cluster the standard is FDR InfiniBand, which gives
you 56 Gbit/s throughput. The latency for a small (< 32 kB) packet is less
than 10 microseconds, down towards 1 microsecond for an 8-byte packet.
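As a back-of-envelope check on those numbers, assuming the simple "latency plus size over bandwidth" model of a message transfer:

```python
# Simple model of message transfer: time = latency + size / bandwidth.
# Numbers match the comment above: FDR InfiniBand, 56 Gbit/s, ~1 us latency.
def transfer_time_us(size_bytes, latency_us=1.0, gbit_per_s=56.0):
    bytes_per_us = gbit_per_s * 1e9 / 8 / 1e6  # Gbit/s -> bytes per microsecond
    return latency_us + size_bytes / bytes_per_us

print(transfer_time_us(8))          # 8-byte packet: dominated by latency, ~1 us
print(transfer_time_us(32 * 1024))  # 32 kB packet: ~5.7 us, under the 10 us figure
```

The crossover matters for simulations: small halo exchanges are latency-bound, which is exactly why supercomputer interconnects optimize latency rather than just raw bandwidth.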

Glenn Lockwood has a nice blogpost with much more detail:
[http://glennklockwood.blogspot.com/2013/05/fdr-infiniband-
vs...](http://glennklockwood.blogspot.com/2013/05/fdr-infiniband-vs-dual-rail-
qdr.html)

------
ziedaniel1
What's the biggest bottleneck for better forecasting: better observations,
better models, or better supercomputers?

------
ansonhoyt
Cliff Mass also blogged [1] about the NOAA supercomputer. Though it came a day
after the USA Today article, many folks appreciate his perspective. He's
posted on this several times in the past year [2].

[1] [http://cliffmass.blogspot.com/2016/02/the-national-
weather-s...](http://cliffmass.blogspot.com/2016/02/the-national-weather-
services-new.html)

[2]
[https://www.google.com/webhp?#q=supercomputer+site:+cliffmas...](https://www.google.com/webhp?#q=supercomputer+site:+cliffmass.blogspot.com)

------
barney54
Even though the US now has more computing power, the Euro model is still doing
a better job at forecasting:
[https://twitter.com/RyanMaue/status/700216724067586049](https://twitter.com/RyanMaue/status/700216724067586049)

------
dhsjhdjs
For a newbie in this domain, how do we start doing our own toy forecasts
(let's say for the Bay Area)?

~~~
goodbyegti
A really nice intro is this classic paper on numerical integration of the
equation for barotropic vorticity:

[http://mathsci.ucd.ie/~plynch/eniac/CFvN-1950.pdf](http://mathsci.ucd.ie/~plynch/eniac/CFvN-1950.pdf)

It's something you can solve in a few lines of python and run with minimal
compute resources. A nice thing to try is to feed in some data from somewhere
like below and watch what happens as you integrate it forwards in time.

[http://www.esrl.noaa.gov/psd/data/gridded/data.ncep.reanalys...](http://www.esrl.noaa.gov/psd/data/gridded/data.ncep.reanalysis.html)
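If you want to jump straight to code, here's a rough pseudospectral sketch of the same barotropic vorticity dynamics on a doubly periodic square. This is only a toy: the 1950 paper used finite differences on a map projection, and the numerical choices below (forward Euler, a random smoothed initial field) are purely for brevity.

```python
import numpy as np

# d(zeta)/dt = -J(psi, zeta), with zeta = laplacian(psi), solved with FFTs.
N, dt = 64, 0.01
k = np.fft.fftfreq(N, d=1.0 / N)    # integer wavenumbers on a [0, 2*pi) box
kx, ky = np.meshgrid(k, k)
k2 = kx**2 + ky**2
k2[0, 0] = 1.0                      # avoid divide-by-zero for the mean mode

def tendency(zeta_hat):
    """Spectral vorticity tendency -J(psi, zeta), computed pseudospectrally."""
    psi_hat = -zeta_hat / k2        # invert Laplacian: zeta_hat = -k2 * psi_hat
    u = np.fft.ifft2(-1j * ky * psi_hat).real   # u = -d(psi)/dy
    v = np.fft.ifft2(1j * kx * psi_hat).real    # v =  d(psi)/dx
    zx = np.fft.ifft2(1j * kx * zeta_hat).real
    zy = np.fft.ifft2(1j * ky * zeta_hat).real
    return -np.fft.fft2(u * zx + v * zy)        # advection of vorticity

rng = np.random.default_rng(0)
zeta_hat = np.fft.fft2(rng.standard_normal((N, N))) * np.exp(-k2)  # smooth field
for _ in range(100):                # a few forward-Euler steps
    zeta_hat = zeta_hat + dt * tendency(zeta_hat)
print(np.abs(np.fft.ifft2(zeta_hat).real).max())
```

Swapping the random field for real 500 hPa geopotential data from the reanalysis link above (regridded and periodized) gives you a crude version of the 1950 forecast experiment.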

------
jackreichert
Well, this is happening just in time for all the weather patterns -- as we
know them -- to change due to global warming. I guess it'll be not unlike
understanding the universe according to Douglas Adams...

------
kuya
for anyone interested, this paper has a nice background on numerical weather
prediction

[http://www.elsevierscitech.com/emails/physics/climate/the_or...](http://www.elsevierscitech.com/emails/physics/climate/the_origins_of_computer_weather_prediction.pdf)

~~~
mediocrejoker
I got about 3 pages into this and felt like I was following along until the
discussion of an incorrect prediction result:
> In fact, the spurious tendencies are due to an imbalance between the
> pressure and wind fields resulting in large amplitude high frequency gravity
> wave oscillations.

Since this paper was published within the last 20 years, I can't imagine what
they were referring to by 'gravity waves'.

Do you know what is meant by this statement?

~~~
cos2pi
Gravity waves are waves in a fluid that obtain their restoring force from
their buoyancy relative to the surrounding fluid. See [0] for the gory
details.

Some examples of gravity waves in the atmosphere: [1] [2]

[0]
[http://glossary.ametsoc.org/wiki/Gravity_wave](http://glossary.ametsoc.org/wiki/Gravity_wave)
[1]
[http://cimss.ssec.wisc.edu/goes/blog/archives/2051](http://cimss.ssec.wisc.edu/goes/blog/archives/2051)
[2]
[https://www.youtube.com/watch?v=yXnkzeCU3bE](https://www.youtube.com/watch?v=yXnkzeCU3bE)
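A quick way to get a feel for the timescale of these buoyancy oscillations: their restoring frequency is the Brunt-Väisälä frequency, N = sqrt((g/θ) dθ/dz). The values below are just textbook-typical numbers for a stably stratified troposphere, not from any particular sounding.

```python
import math

g = 9.81            # gravitational acceleration, m/s^2
theta = 290.0       # potential temperature, K
dtheta_dz = 0.004   # vertical potential-temperature gradient, K/m

N = math.sqrt(g / theta * dtheta_dz)   # Brunt-Vaisala frequency, 1/s
period_min = 2 * math.pi / N / 60      # buoyancy oscillation period, minutes
print(f"N = {N:.4f} 1/s, buoyancy period ~ {period_min:.1f} minutes")
```

That several-minute period is much faster than the day-scale weather signal, which is why unbalanced initial conditions show up as spurious high-frequency noise in a forecast.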

------
SeanDav
> _"Mass was highly critical of the federal government's lagging computing
> capacity in recent years..."_

Perhaps he should have been having a chat with the NSA boys...

------
jordache
Wish they would stop using the Library of Congress's collection as a
reference...

I know it's big... no idea how big... it's not tangible to me... never
visited... I'm sure most Americans have not visited.

~~~
superuser2
You do not get an idea of how big it is by visiting; the stacks are off limits
and much of the material must be ordered in from suburban warehouses.

~~~
semi-extrinsic
They probably also have most of the bookshelves on rails? The local uni
library keeps their old books on a system like this, it's really quite cool.
The room can fit, say, 54 bookshelves tight-packed with no room to enter
anywhere. So they put 52 shelves in on rails, and between each pair there is a
wire going across. Disconnect a wire between two shelves, and all the shelves
slide apart so you can enter where you opened the wire.

~~~
superuser2
I'd estimate most university libraries have a system like this, though the
"disconnect a wire" part is strange. Usually it's just a button on the shelf.

~~~
semi-extrinsic
I think the wire is a safety feature to avoid accidentally crushing people. If
you try to disconnect a second wire, nothing happens, and you have to go find
the first disconnected wire and reconnect it. Then you will see if there is
still someone in there. (When you're finished in a row, you're supposed to
reconnect behind you when you leave.)

