NOAA upgrades the U.S. global weather forecast model (noaa.gov)
215 points by dataflow on June 13, 2019 | 80 comments



If anyone is interested in doing their own thing with weather data, check out MADIS [0]. There are various levels of access, some of which require NOAA approval. But if you're serious about making weather predictions, it's a good thread to pull on. I once set up a MADIS node, and our server was quickly shut down by Amazon for "suspicious traffic", so beware of that - there's a lot of data that gets pushed through the system. If I remember correctly, it was kind of a pain in the ass to get set up/configured, but it was pretty cool.

[0] https://madis.noaa.gov/index.shtml


For those interested in more background on NOAA and making money from it, I highly recommend reading The Fifth Risk[0] by Moneyball author Michael Lewis[1]. It details how a couple of private companies make a lot of money using NOAA data in interesting ways (e.g. a crop insurance company, since acquired by Monsanto). Another of those companies is AccuWeather, whose CEO was tapped to head NOAA by Trump[2].

P.S.: Anyone notice that monitoring "Climate" was absent in the government announcement?

[0] https://www.amazon.com/dp/1324002646

[1] https://en.wikipedia.org/wiki/Michael_Lewis

[2] https://oceanleadership.org/trump-taps-accuweather-ceo-head-...


Can you go into more detail about how to get this set up?


I honestly don't remember anymore. At the time, we were working with NOAA, and I remember a problem that was solved by talking to an admin at NOAA (our IP needed to be on some official whitelist or something), but that may have been for a restricted data set. We didn't end up using it for long, at the client's request.

But I dug around for some information to maybe get you started.

Installation: https://madis.ncep.noaa.gov/doc/INSTALL.unix

API: https://madis.ncep.noaa.gov/madis_api.shtml

Data restrictions: https://madis.ncep.noaa.gov/madis_restrictions.shtml

Another resource that may help: https://press3.mcs.anl.gov/forest/regional-models/global-dat...

When I was working on this stuff, I found that a depth-first search through various government subdomains (like MADIS) was the best way to find information. It was tedious, but it worked.

It's also helpful to put on your fortran hat. For example, I once attended a Haskell meetup where someone wrote a parser to deal with parsing binary files from NOAA. I also was in a meeting (with some NOAA folks) once where I was asked if I "would prefer an ASCII file, or a binary one". This is not a world that operates on JSON or XML. Expect binary blobs with flags (bits) that change the meaning of other flags in fun and exotic ways. The binary nature of the data can help with data throughput limits, but boy is it a pain to deal with.
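To give a flavor of what "flags that change the meaning of other flags" looks like in practice, here's a toy Python parser for an invented record layout. The field names and layout are made up for illustration, not an actual NOAA format:

```python
import struct

def parse_record(blob: bytes) -> dict:
    """Parse a hypothetical binary record.

    Invented layout: a 1-byte flag field followed by a 4-byte
    big-endian value. Bit 0 of the flags decides whether the value
    is an integer station ID or an IEEE-754 float reading; bit 1
    marks the reading as missing and overrides everything else.
    """
    (flags,) = struct.unpack_from(">B", blob, 0)
    if flags & 0x02:                      # "missing" bit wins
        return {"missing": True}
    if flags & 0x01:                      # value is a float observation
        (value,) = struct.unpack_from(">f", blob, 1)
        return {"missing": False, "obs": value}
    (value,) = struct.unpack_from(">I", blob, 1)  # value is a station ID
    return {"missing": False, "station": value}

# Example: flag byte 0x00 + station id 42
print(parse_record(b"\x00" + (42).to_bytes(4, "big")))
# → {'missing': False, 'station': 42}
```

The real formats nest this sort of thing several levels deep, but the shape of the code is the same: read a few bits, branch, repeat.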


> This is not a world that operates on JSON or XML. Expect binary blobs with flags (bits) that change the meaning of other flags in fun and exotic ways.

That brings back memories... As a government contractor I've had to work with sensor data (seismic, radar, etc.) in various formats that were developed well before the rise of XML and JSON :(

My favorite was a mixed ASCII and binary format, where each data record in a file had an ASCII header that described the format of the following block of binary, and pretty much anything could be different between records, even within the same data file (Time units? Integers? Floats? 16 bit integers? 64 bit? Big/Little Endian?).

I had to write a parser for that :'(
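For illustration, a sketch of what a reader for that kind of format ends up looking like. The header grammar here is invented (each record starts with an ASCII line describing the binary block that follows); the real formats were surely worse:

```python
import io
import struct

# Map the (type, endianness) declared in an ASCII header to a struct format.
FMT = {("i2", "big"): ">h", ("i2", "little"): "<h",
       ("i8", "big"): ">q", ("f8", "big"): ">d"}

def read_records(stream):
    """Yield one list of values per record from a mixed ASCII/binary stream."""
    while True:
        header = stream.readline()
        if not header:
            return
        # Header looks like: b"TYPE=i2 ENDIAN=big COUNT=3\n"
        fields = dict(kv.split("=") for kv in header.decode().split())
        fmt = FMT[(fields["TYPE"], fields["ENDIAN"])]
        size = struct.calcsize(fmt)
        count = int(fields["COUNT"])
        data = stream.read(size * count)
        yield [struct.unpack_from(fmt, data, i * size)[0] for i in range(count)]

buf = io.BytesIO(b"TYPE=i2 ENDIAN=big COUNT=2\n" + struct.pack(">hh", 1, -2))
print(list(read_records(buf)))   # [[1, -2]]
```

Every record can declare its own type, width, and endianness, which is exactly what made these files such a joy to parse.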


The most "fun" I've ever had was decoding command and telemetry data from a piece of equipment for a ground station. The box would spit out a massive frame of data: a very long ASCII string that you would turn into binary and break into 6-bit BCD values (no clue why they didn't use 4-bit...). There were random flags of odd bit lengths (sometimes a single bit, sometimes 5 bits) thrown in between numbers for arbitrary reasons, rather than having all the binary flags up front. My Python script was an ugly mess of slicing up the frame to turn it all into a very nice struct I could pass to the rest of the system.

The manual for this piece of hardware was some old scan that must have been xeroxed a million times over, so some portions of the document were just unreadable and you had to guess what those bits did. Other parts of the frame were simply undocumented. Commands were sent one by one as a single letter plus the ASCII representation of the numerical command parameter.

When I started the project, I looked online to see if anyone had done any previous work on this thing. A vendor was selling a GUI for it for $2000; I scoffed at the price and started working on it myself. By the time I was done, it had probably cost my employer more than that, but at least we had our own code that could connect to whatever you wanted, rather than a GUI with no API.
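A toy version of that frame-slicing approach, with an invented field layout (the real frame of course followed the barely legible vendor manual instead):

```python
# Invented frame layout: a 1-bit status flag, one 6-bit BCD digit, a
# 5-bit mode field, then two more 6-bit BCD digits.
FIELDS = [("status", 1), ("d0", 6), ("mode", 5), ("d1", 6), ("d2", 6)]

def decode_frame(bits: str) -> dict:
    """Slice a frame (given as a string of '0'/'1') into named fields."""
    out, pos = {}, 0
    for name, width in FIELDS:
        out[name] = int(bits[pos:pos + width], 2)
        pos += width
    # The three BCD digits form one decimal number, most significant first.
    out["value"] = out["d0"] * 100 + out["d1"] * 10 + out["d2"]
    return out

# status=1, mode=0b10101, digits 1, 2, 3
frame = "1" + "000001" + "10101" + "000010" + "000011"
print(decode_frame(frame)["value"])   # 123
```

The ugly part in real life is that FIELDS was pages long and partially guessed from an unreadable scan.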


Did you try to sell it too?


It was an internal project for a large company so no.


Rather than trying to set up MADIS, look into something like the siphon [0] Python module.

MADIS is a bit dated, data dissemination that uses OPeNDAP [1] and ERDDAP [2] is much friendlier.

[0] https://github.com/Unidata/siphon

[1] https://opendap.github.io/documentation/

[2] https://coastwatch.pfeg.noaa.gov/erddap/information.html


How long ago was this? Surprising that AWS would shut down your node, were you trying to run this on the free tier or something?


They probably thought he was mining cryptos


The fact sheet showcases the improvements in the forecast: https://www.noaa.gov/sites/default/files/atoms/files/DOCUMEN...


This is probably in response to the Euro weather model, which has been producing better forecasts. Especially for named storms.


The Europeans are putting big money into weather forecasting: "The goal is to be able to provide, by 2025, reliable forecasts up to two weeks in advance."

https://www.zdnet.com/article/europes-big-weather-supercompu...

As far as I know, other than this model upgrade, there are no major investments being made in American weather forecasting.


How far from that goal are they at the moment?


Judging by last week's forecasts for my region: pretty far off. Sudden rain on the order of 10 mm and a lot of clouds when sunny weather was predicted. The exact opposite happened as well. Temperatures were off by 5-8°C, too.

All of this for same day forecasts, not even 2 days in advance.


rain is pretty hard to predict though. I guess what matters the most is if you can reliably predict major events like heavy storms rather than if you will get a bit wet later today.


You might be onto something... met.no has been more accurate than the usual domestic weather forecasts for as long as I can remember. Literally, it can be hailing outside, and thundering, and NOAA says like 20% chance of rain and sunny, where met.no might say 70% chance of storms, cloudy, etc...

https://www.met.no/en/free-meteorological-data


User facing interface: https://www.yr.no/?spr=eng


They have a very cool iOS app too: https://apps.apple.com/us/app/yr-no/id490989206


Might be, but ECMWF just has many more observations so it will be hard to compete.


I use ECMWF but never even knew what was different - def more reliable where I am for cloud / sun. Haven't paid enough attention to see if its wind predictions are better.


Some differences are covered here: https://confluence.ecmwf.int/download/attachments/53517138/P... (2016)


I once had lunch with a senior meteorologist at the WMO, who was (5 years ago) convinced that the European model was 5 to 10 years ahead of the US model and that it would be hard for them to catch up. Not sure if that's still the case.


Somewhat related, what are some good weather sites for storm monitoring? I've been using ventusky[0] for rain forecasts and mrms[1] for storm and hail conditions. Is there anything better?

[0]https://www.ventusky.com/

[1]https://mrms.nssl.noaa.gov/qvs/product_viewer/


The Storm prediction center[1] is always my goto. After that I check the Mesoscale Precipitation discussions[2].

[1] https://www.spc.noaa.gov/

[2] https://www.wpc.ncep.noaa.gov/metwatch/metwatch_mpd.php



Windy.com allows you to compare models; the available models vary based on the region you have zoomed in to.

This is really handy as national models can be much better for short-term and higher resolution predictions.

It has the widest range of models freely available that I'm aware of, including the commercial ECMWF model.


This is also my go-to, for storms and just general forecasts for sailing.


My favorite is Flowx[0]. Disclaimer: I'm biased since I'm the developer.

Globally, we have GFS (FV3) and GDPS in the free version, and DWD ICON in pro.

For North America, we have NAM 12km, NAM 3km, HRRR, RDPS, and HRDPS in Pro.

For Europe, we have DWD ICON EU, DWD COSMO-D2, ARPEGE, AROME and HIRLAM in Pro.

If you double-tap a graph, you can compare the forecasts quite easily.

Sorry there is no iOS or web version, nor ECMWF (it's too costly).

[0] https://flowx.io


https://wxdisco.com/forum/3-united-states/

The forums really light up for big storms with lots of discussion and insight.


I just wish we had a weather radar here in central Oregon

https://cliffmass.blogspot.com/2014/11/the-other-radar-gap-e...


How interested are you in getting this done? Below is a link to a phased array demonstrator (SPY-1A) that was dismantled and replaced with a newer version in 2016. Might find out where SPY-1A is sitting (the phased array may have been returned to the US Navy), and since it'll perform both weather and aircraft surveillance, might be easier to sell to stakeholders for the coverage gap.

https://www.nssl.noaa.gov/tools/radar/mpar/

Alternatively, Roberts Field appears to be a major commercial air hub in central Oregon. You might argue from a safety perspective to your Congressional representatives (perhaps in concert with local air carriers and AOPA) that the airport needs a TDWR station (cost will be ~$4MM-8MM), which could also provide NOAA with the necessary weather surveillance data. Thunderstorms aren’t common on the West Coast though, hence the lack of TDWR stations in West Coast states. If you pursue this route, you'd want to get funds for this into some sort of federal transportation bill, as part of enhancing the safety of the air transportation system.

https://en.wikipedia.org/wiki/Terminal_Doppler_Weather_Radar


I've mentioned it to our representative, Greg Walden, in the past, but he doesn't seem interested, which is a pity, because this is the kind of non-partisan stuff that they ought to be getting done for their constituents.


Maybe you could use microwave links in the cellular network instead? There's research and development going on in that area.

https://www.smhi.se/en/services/professional-services/microw...


That's cool! I can imagine that could be an excellent secondary data source for rainfall monitoring.


Having better weather insights for Smith Rock would be great. Saying hi from the Monkey!


Some meteorologists are not in love with the new model. My local forecaster/weather blogger suggests that the v3 model tends to overestimate cold snaps and move storms too fast in the mid-latitudes: https://blogs.mprnews.org/updraft/2019/06/milder-with-spotty...


And it's extremely dubious whether there is actually any improvement... https://cliffmass.blogspot.com/2019/04/us-numerical-weather-...


I was researching weather prediction not long ago. From my naive perspective it seems that despite all the increased GPU computational power and advances in machine learning, there have not been any great advances in weather prediction. Is this true?

Edit: Downvotes for simply asking a question. sigh.


I worked as a forecaster for a bit but never made it to the research world (studied theoretical pde instead of computational)... however at the time huge gains had been made in data assimilation. One fact that has stuck with me was that ~1/3 of the computation time for the UK Met global model run was consumed by data assimilation. I don't remember statistics anymore but data assimilation schemes were a big driver of improved forecast skill.

I also recall the ECMWF had surprisingly accurate long range forecasts based on ensembles. It could predict 500mb heights out two weeks, no sweat.

Re: your comments... My guess is that a GPU isn't suited for use in an operational model due to data access patterns (and possibly not even helpful with the solver). But again, I'm not a computational pde guy. Also, perhaps machine learning would be useful, but that would be post-processing or perhaps parameterizing sub-grid phenomena. There's already a process called model output statistics (MOS) for adjusting raw fields from a weather model.
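For the curious, the core idea of MOS is just a statistical correction fitted against past verification data. A one-predictor toy with synthetic numbers (real MOS uses many predictors per station and season):

```python
import numpy as np

rng = np.random.default_rng(0)
raw = rng.uniform(0, 30, 200)                       # raw model temperature forecasts
truth = 0.9 * raw + 1.5 + rng.normal(0, 0.5, 200)   # synthetic "observed" temps

# Least-squares fit of truth = a*raw + b, i.e. the MOS-style correction.
a, b = np.polyfit(raw, truth, 1)
corrected = a * raw + b

print(round(a, 2), round(b, 2))   # recovers roughly 0.9 and 1.5
# The corrected forecast has smaller mean absolute error than the raw one:
print(np.mean(np.abs(corrected - truth)) < np.mean(np.abs(raw - truth)))
```

The real operational product works the same way in spirit: learn a per-station mapping from raw model fields to what actually verified.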


The physics is pretty well known at this point, and there's only so much you can gain by increasing from second to third order approximation. The errors in the initial conditions are just larger. Most of the action has been on data assimilation and better parameterizations because of that.

I've been out of the field for ten years now, but it's really nice to see improvements to the core physics to this degree.

I'm still skeptical of your supposed two week 500 heights forecast from the ECMWF model. I live near the western Pacific (i.e. the data hole) and it's really easy to find crazy model solutions after 7 days. And I'm pretty sure you weren't looking at the Southern Hemisphere.


> I'm still skeptical of your supposed two week 500 heights forecast from the ECMWF model.

You're probably right to be skeptical, for the record I was only a forecaster for a short period of time over ten years ago... didn't even serve my full four year commitment as I volunteered to get out under the Air Force "force shaping" at the time. I was stationed near Ramstein and we created forecasts for Europe. I was referring to the ECMWF ensemble products, specifically.


You are correct that for this type of pde solver, you don't have enough computation per byte of data for a GPU to be fast.


I've done computational physics at the grad level. There, PDEs are converted to finite-difference equations, which basically leads to giant sparse linear systems. These are solved using SOR or even more advanced numerical techniques. These techniques tend to be quite GPU friendly.
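As a concrete (if tiny) example of that pipeline: the 1D Poisson equation u'' = f with zero boundary values, discretized by finite differences and solved with SOR. All the sizes and the relaxation factor here are just illustrative:

```python
import numpy as np

n = 50
h = 1.0 / (n + 1)
x = np.linspace(h, 1 - h, n)
f = -np.pi**2 * np.sin(np.pi * x)    # choose f so the exact solution is sin(pi*x)

# Finite differences: (u[i-1] - 2u[i] + u[i+1]) / h^2 = f[i]
u = np.zeros(n)
omega = 1.8                          # over-relaxation factor
for _ in range(2000):
    for i in range(n):
        left = u[i - 1] if i > 0 else 0.0
        right = u[i + 1] if i < n - 1 else 0.0
        gs = 0.5 * (left + right - h**2 * f[i])   # Gauss-Seidel update
        u[i] = (1 - omega) * u[i] + omega * gs    # SOR: over-relax toward it

# Converges to the exact solution up to discretization error:
print(np.max(np.abs(u - np.sin(np.pi * x))) < 1e-3)   # True
```

The sweep over i is the serial bottleneck; whether a GPU helps depends on how much independent work each node has, as discussed downthread.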


Well, if you're just doing a standard finite difference method, and you have to keep shuffling your matrices between CPU and GPU because other operations don't work well on GPUs, you actually won't have any speedup.

Where GPUs shine for PDEs is if you have a lot of extra work for each node, for instance if you have complex chemical reactions or thermodynamics, or if you have a high-order method that requires lots of intermediate computations.

If you don't believe me, you can download the PETSc code and test the ViennaCL solvers versus the regular ones.


This brings up a point not exactly spelled out in the article: the old GFS used a spectral method, not finite difference.
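For contrast with finite differences, a 1D toy of the spectral idea: represent the field in Fourier modes (the real GFS used spherical harmonics on the sphere) and differentiate in coefficient space, where the derivative is just a multiplication:

```python
import numpy as np

n = 64
x = np.linspace(0, 2 * np.pi, n, endpoint=False)
u = np.sin(3 * x)

# Differentiation in spectral space: d/dx of e^{ikx} is ik * e^{ikx}.
k = np.fft.fftfreq(n, d=1.0 / n) * 1j        # i * wavenumber for each mode
du_spectral = np.fft.ifft(k * np.fft.fft(u)).real

# Exact to machine precision for a band-limited field:
print(np.max(np.abs(du_spectral - 3 * np.cos(3 * x))) < 1e-10)   # True
```

The payoff is spectral accuracy for smooth fields; the cost is global transforms every step, which is part of why finite-volume schemes like FV3 have become attractive on modern parallel hardware.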


According to this article there have been significant improvements in weather prediction in the past few decades: https://science.sciencemag.org/content/363/6425/342

> A modern 5-day forecast is as accurate as a 1-day forecast was in 1980, and useful forecasts now reach 9 to 10 days into the future (1). Predictions have improved for a wide range of hazardous weather conditions, including hurricanes, blizzards, flash floods, hail, and tornadoes, with skill emerging in predictions of seasonal conditions.

> ... Data from the NOAA National Hurricane Center (NHC) (13) show that forecast errors for tropical storms and hurricanes in the Atlantic basin have fallen rapidly in recent decades.


The last day of the 10-day forecast is now more accurate than the 3rd day of the 3-day forecast was when I was in college.

This covers some decades, to be sure, but it's a pretty big improvement.


Would you happen to have a citation for that? Truly interested in knowing if the diff is significant or marginal.


I don't know where I picked up that particular factoid, but if you look at the trend lines in [1] you can see the improvement claimed.

In [2] there is a slightly different claim, "A modern 5-day forecast is as accurate as a 1-day forecast was in 1980, and useful forecasts now reach 9 to 10 days into the future."

Chart 3.2 of [3] shows this; by 2001 the 5th day forecast improved to be as good as the 3rd day of 1980, establishing the trend line.

Googling about yields a few other studies and articles in a similar vein.

It is important to note that forecast improvement is not linear in effort. It takes more complete and accurate sensor data and far more computation to extend the forecast on the out days due to the chaotic nature of the mechanisms modeled.

[1] https://rmets.onlinelibrary.wiley.com/doi/full/10.1002/qj.25...

[2] https://science.sciencemag.org/content/363/6425/342

[3] https://www.nap.edu/read/10658/chapter/5#26


This was on here a while ago: https://news.ycombinator.com/item?id=19765700 and says that "[m]odern 72-hour predictions of hurricane tracks are more accurate than 24-hour forecasts were 40 years ago"

edit: I see that neuronexmachina found the same article. It's a good read if you want an overview of how weather prediction has changed.


I have heard the opposite - predictions have been significantly increasing in accuracy over time.

On the other hand, I suspect it might not be noticeable if the forecast you always read just says "40% chance rain, high 80, low 50". It might be more noticable if you look at the hour-by-hour forecast for a specific location and see when the rain is predicted to start and end.


Forecasts have improved dramatically. A 7-day forecast is now somewhat useful when deciding to have your party indoors or outdoors, whereas in the 90s next-day forecasts could hardly compete with the naive assumption of "the weather will stay as it is".
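That persistence baseline ("the weather will stay as it is") is easy to make concrete with synthetic data; any real forecast has to beat it to be worth anything:

```python
import numpy as np

# Synthetic daily temperatures: a seasonal cycle plus day-to-day noise.
rng = np.random.default_rng(1)
days = np.arange(365)
temps = 15 + 10 * np.sin(2 * np.pi * days / 365) + rng.normal(0, 3, 365)

# Persistence: predict day t+1 with day t.
persistence = temps[:-1]
truth = temps[1:]
mae_persistence = np.mean(np.abs(persistence - truth))

# A hypothetical forecast that knows the seasonal cycle but not the noise:
seasonal = 15 + 10 * np.sin(2 * np.pi * days[1:] / 365)
mae_seasonal = np.mean(np.abs(seasonal - truth))

print(mae_seasonal < mae_persistence)   # True: it clears the persistence bar
```

Forecast "skill" is usually reported relative to a reference like persistence or climatology for exactly this reason.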

But by mentioning machine learning, I'm guessing you are looking at a different timescale, i. e. "within the last two years". And any progress in the short term will be slow compared to what we have seen in other domains such as image recognition etc.

I'm no expert on weather forecasting, but I believe the explanation may be that forecasts have long been (among the) best financed "big data" problems out there. That means they incorporate lots and lots of domain-specific work. As a result, naive machine learning models currently still lag all the specialised work, which in turn isn't structured in a way to easily take advantage of progress in, say, GPUs.


For hurricane forecasts issued by the NHC, you can see the official error statistics here [0]. Note that 96 and 120-hour forecasts were so poor prior to year 2003 that they were not issued.

Note these error statistics do not represent true model error as the official track and intensity forecast -- while informed by model output -- are determined by human forecasters.

[0] https://www.nhc.noaa.gov/verification/verify5.shtml


A while back, I found a NOAA search that showed historic chance of rain by month:

https://www.wrh.noaa.gov/images/mtr/sjc_pcpn_prob.gif

Since then, I've spent hours trying to find this page again, unsuccessfully. All I have is this URL for San Jose. (Replacing "sjc" with other airport codes doesn't always work, since "mtr" is a region code?)

Anyone know where this precipitation data lives?


I made this: https://www.wmo.int/cpdb/united-states-of-america - click on “climate normals”


Clicking on the "SEATTLE" link,

    Missing Controller
    Error: Climate.DashboardController could not be found.


That sucks, I’m sorry. I made this about 5 years ago and am no longer involved in the project.


Thanks for the reply!

Was it accessing/presenting raw NOAA data? Or a different source? I notice the data is thru 1990; different from NOAA.

Is the php source available for reverse engineering?


The data comes from weather stations all across the US, which I assume are managed or operated by NOAA. This project is at the World Meteorological Organization, an international organization of weather organizations - of which the NOAA is a member. Presumably this means it's official NOAA data.

With some minor effort you could extract the raw data, in JSON form, from https://www.wmo.int/cpdb/climate/climate_normal/per_country/...

If I remember correctly the data came from a CD-ROM with historical data. When we put it online the data was already 20 to 25 years old. It was nevertheless the most recent data that we had available, I don't remember the reason why we didn't have anything more recent. The PHP source is (or was) a CakePHP application, and honestly isn't that interesting. There was not more data in the PHP application than what is presented here.

EDIT: It was already a messy application when I got there, I cleaned it up as well as I could, but after I left it seems to have gone downhill again. Ah well. Not my problem anymore.


Thanks again -- interesting to learn about this space! I wasn't aware of WMO.


This site has a technical overview of some of the models and algorithms used by the new FV3 system: https://www.gfdl.noaa.gov/fv3/


> Working with other scientists, Lin developed a model to represent how flowing air carries these substances. The new model divided the atmosphere into cells or boxes and used computer code based on the laws of physics to simulate how air and chemical substances move through each cell and around the globe.

> The model paid close attention to conserving energy, mass and momentum in the atmosphere in each box. This precision resulted in dramatic improvements in the accuracy and realism of the atmospheric chemistry.

Global Forecast System > Future https://en.wikipedia.org/wiki/Global_Forecast_System#Future


Wouldn't this kind of problem be a perfect match for machine learning? It would have a huge dataset to learn from. Why isn't it happening or what prevents AI tech from forecasting the weather?


It is because there is an understanding, from first principles, of the dynamics that drive weather (e.g. conservation of mass, momentum and energy). The current models are built upon these principles to make predictions, and conform to expectations of how physics operates. The method these models are based on (finite volume) is efficient and adaptable if modifications need to be made.

Using AI and ML to make predictions about weather will likely not account for the conservation principles and might lead to ridiculous results (in some sense). Creating an accurate AI/ML model of a complex and chaotic system might lead to wrong predictions under extreme circumstances (e.g. predicting the weather >5 days out for an extreme hurricane) or under conditions where some implicit assumption has changed. One can at least attempt to grapple with these issues when using finite volume. With AI/ML you just have to hope your model is properly trained.


I could recommend this paper: Schneider, Tapio, et al. "Earth system modeling 2.0: A blueprint for models that learn from observations and targeted high‐resolution simulations." Geophysical Research Letters 44.24 (2017).


It is, and the research applying ML in this area is starting to ramp up. For example, last year I worked on a project using ML to identify tornado vortex signatures in Doppler weather radar scans. It also turned out that a couple other groups published similar research at the same time. I would expect much more growth of ML in meteorology, and I hope it will all eventually be applied in the field.


I like the cautious approach to this stuff

> The retiring version of the model will no longer be used in operations but will continue to run in parallel through September 2019 to provide model users with data access and additional time to compare performance.


This is huge. Good work to the folks at NOAA and the American taxpayer for footing the bill.


I'm sure it wasn't the biggest part of our "bill". :)


Man, that sidebar is one hell of an aggravating UI element.


what kind of hardware is the GFS model run on?


As far as I understand, these are still hand designed algorithms using a tiny fraction of possible weather data. Impressive for old school methods. Would be even more awesome to see how far ML could take the state of the art.


Weather data is a system where we have a really good understanding of the underlying physics but can’t do enough computation to simulate them in a way that’s detailed enough to make truly accurate predictions.

Machine learning is all about finding an unknown function that underlies known data. This is sort of the opposite issue: we know the underlying function but can’t compute it.

That said, ML models are being applied to situations where the weather data doesn't translate directly into known physical quantities, like satellite images (see https://developmentseed.org/projects/hurricane-intensity/)


One fundamental problem we have with weather forecasts is that our input data for the starting point is fairly sparse. GFS calculates the forecast on a grid with 13 km horizontal resolution and 64 vertical layers. We don't have accurate weather information at that resolution from all over the globe, so the starting point for the forecasts is a combination of previous simulations and interpolated observational data.

So even if we had a forecast engine that would perfectly simulate everything given some start state, we wouldn't have enough input data to have an accurate start state.
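A toy version of that blending step: combine a background forecast with a few point observations using inverse-distance weighting. Real data assimilation (3D-Var, 4D-Var, EnKF) is far more sophisticated; all numbers and the localization scheme here are invented for illustration:

```python
import numpy as np

grid = np.linspace(0, 100, 101)           # 1D "grid", e.g. along a latitude
background = np.full_like(grid, 20.0)     # previous forecast: 20 degrees everywhere

obs_x = np.array([10.0, 50.0, 90.0])      # three stations
obs_t = np.array([22.0, 18.0, 21.0])      # their readings

analysis = background.copy()
for i, xi in enumerate(grid):
    w = 1.0 / (np.abs(obs_x - xi) + 1.0)  # nearby observations get more weight
    w /= w.sum()
    increment = np.sum(w * (obs_t - background[i]))
    # Localize: far from all stations, fall back to the background.
    influence = np.exp(-np.min(np.abs(obs_x - xi)) / 20.0)
    analysis[i] = background[i] + influence * increment

# Near the x=50 station the analysis trusts its 18-degree reading:
print(abs(analysis[50] - 18.0) < 1.5)   # True
```

Everything between stations is essentially an educated guess, which is the point the parent comment makes about start states.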


You can also use ML to learn a metamodel: a model trained on an accurate (but too costly to run in real time) simulation.


Almost like super resolution techniques?


Pretty much. Maybe OpenAI will try to throw transformers at it.


We can compute the radiative transfer processes to infer the physical processes from the satellite images.



