Open-Meteo started as an exercise to process weather model data from the German weather service with up to 2 km resolution. Their forecasts are great, but hard to use for anyone who is not a data scientist regularly working with NetCDF and GRIB files. Using this data in simple apps, websites, home-automation software, or a robot lawn mower is complex.
The Open-Meteo API makes using this data easier. The APIs accept standard WGS84 coordinates and return a 7-day weather forecast in hourly resolution.
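For example, a minimal forecast request can be built like this (endpoint and parameter names as shown in the Open-Meteo documentation; the exact variables you request are up to you):

```python
from urllib.parse import urlencode

# Build a forecast request URL for Berlin (WGS84 coordinates).
# Endpoint and parameter names follow https://open-meteo.com/en/docs.
base = "https://api.open-meteo.com/v1/forecast"
params = {
    "latitude": 52.52,
    "longitude": 13.41,
    "hourly": "temperature_2m,precipitation",
}
url = f"{base}?{urlencode(params)}"
# Fetching this URL returns a JSON document with hourly arrays.
```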
The forecast quality is surprisingly good. Open-Meteo includes global and regional weather models. Global models use 11 km resolution with up to 180 hours of forecast; local models vary between 2 and 7 km resolution and 48 to 72 hours, updating every 3 hours. The best models are automatically selected and combined to produce a single 7-day hourly forecast. Currently the best forecast model coverage is in Europe. Models for North America will be integrated next.
Under the hood, all data is stored in binary files using Float16 and updated in-place after a new weather model run arrives. The API is very efficient: returning a weather forecast usually takes less than 5 milliseconds. Internet latency is usually much higher.
All data is offered for non-commercial use. With such a speedy API, everything can be served by just a couple of virtual machines for less than a coffee a day.
What’s next? Some important features are still missing like daily aggregations, additional weather models, ocean, and air quality forecasts. Additionally, I would like to deploy some servers in North America and Asia to improve latency.
The project went live 2 weeks ago and is slowly being used. I would be grateful for feedback, suggestions, ideas, and questions.
All documentation can be found at https://open-meteo.com/en/docs
Open-Meteo looks pretty good! The only thing that seems to be missing for me is precipitation probability. The weathercode is sort of a proxy for this... I'm also interested in sunrise/sunset times; direct_radiation is kind of a proxy for this.
Kudos for providing optional historical data with the same API call as the forecast. Many weather APIs don't provide historical data, and even if they do, it requires extra calls. My weather app charts the previous two days of weather alongside the forecast for comparison. I feel this gives a more intuitive sense of the weather than raw numbers, because weather is very relative. ("Warm" vs. "cool" depends on your location and season.)
In addition, I am in the process of adding AQI forecasts, which requires even more network calls. It seems like this is on the roadmap for open-meteo. I was surprised to find there are so many different standards for AQI. Curious to know which one you plan to use.
One possible suggestion for optimizing the output format: sending seconds since the Unix epoch would save a few bytes per timestamp. I'm not sure if this would make any noticeable difference with gzip compression. The current datetime format is much more human-readable and may save a conversion before displaying.
These were the best (free) weather APIs I could find. It's interesting how the three different weather forecasts can disagree so much:
- https://darksky.net/dev (deprecated)
When I find the time, I will add open-meteo to my app! I'll probably have more feedback then.
I will add precipitation probability as soon as I include 14-day ensemble weather models. Deterministic weather models usually do not offer this directly.
I also really like the 2 days of historical access :) At some point I would like to add more, but the storage requirements skyrocket quickly. Not only in space, but also in disk IO to perform in-place updates.
For air quality data I want to first add all variables individually. AQI can then be calculated on-the-fly. Some AQI indices require data from the previous hours, but that should work well with 2 days of past data. For sure I will directly integrate the European and US AQI.
I considered unix timestamps instead of ISO8601 string timestamps. Working with unix timestamps is a bit trickier if you also want to use timezones. For the start, I settled on ISO8601 timestamps. I might consider adding unix timestamps as an `&timeformat=unix` parameter to the API. Regarding saving bytes in the output: I would like to implement protobuf as a format and also use it to build some SDKs. Especially with large floating point arrays, this saves a lot of parsing time and preserves precision.
All your suggestions are now kept in the Open-Meteo backlog https://github.com/open-meteo/open-meteo/issues
I currently build ensemble based modelling solutions for bushfire/wildfire. Weather is a significant component in how fire behaves and is forecast. Whilst I have access commercially to several models and meteorologists through work I'd be keen on chatting with you a little further about the possibility of including fire weather metrics in your forecasts, or at least using the data in bulk if possible to include in my own public facing project.
For now I would refrain from including data that is not open or has restrictive licences.
You can find my email on the open-meteo site and drop me some information
Yes, the basic fire weather metrics are derived from the factors you mentioned. Different places globally have some different measures, but I can speak to Australia, Greece, Portugal, California style setups. Northern US and Canadian pine forests are a little more outside my area of knowledge, but could locate more info easily enough.
I wouldn't wish to pursue a project using commercial IP or introduce additional costs where none should exist.
What does "starts at 0:00 today" mean? Ideally, it is 0:00 of the local timezone. (Some other weather API's mess this up!)
Many weather APIs follow weather model runs. Therefore, in the evening, data suddenly starts at 12:00 UTC regardless of the timezone. It is quite a pain if you want to build an app displaying today's data, because in the evening you no longer get data from the morning.
With Open-Meteo, multiple weather model runs are merged together and partly overwritten. At around 4:00 UTC, the first weather model with data starting at 0:00 UTC arrives (usually called the 0z run). 3 hours later, the 3z run arrives with data starting at 3:00 UTC. After a couple of model runs arrive, it is quite a mix of different weather model runs. If done right, you notice nothing of this behaviour ;)
< 0z run >
   < 3z run >
      < 6z run >
         < 9z run >
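The merging sketched above can be modelled as a toy example: each incoming run overwrites the hours it covers, so the merged timeline always reaches back to 0:00 (run names and values here are made up):

```python
# Toy sketch of merging staggered model runs by in-place overwrite.
timeline = {}  # hour (UTC) -> (value, name of the run that produced it)

def ingest(run_start_hour, values):
    """Overwrite the merged timeline with a fresher model run."""
    for i, value in enumerate(values):
        timeline[run_start_hour + i] = (value, f"{run_start_hour}z")

ingest(0, [10, 11, 12, 13, 14, 15])  # 0z run covers hours 0..5
ingest(3, [12.5, 13.5, 14.5, 15.5])  # 3z run overwrites hours 3..6

# Hours 0-2 still come from the 0z run; hours 3-6 from the fresher 3z run.
```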
Yes, unix-timestamps must be in UTC+0. If a timezone is set, data still starts at "2021-09-13T00:00", but this is now local-time. With a 4 hour UTC offset, I would have to set the unix-timestamp to "2021-09-12T20:00" and the developer has to correctly reapply the UTC offset again to get local time. This can be done, but is prone to errors.
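The round trip described above can be sketched like this (hypothetical 4-hour offset, Python standard library only):

```python
from datetime import datetime, timedelta, timezone

# A Unix timestamp is always an instant in UTC. To show local midnight
# for a UTC+4 location, the server would send the instant 2021-09-12T20:00Z,
# and the client must reapply the offset to recover local time.
ts = int(datetime(2021, 9, 12, 20, 0, tzinfo=timezone.utc).timestamp())
local = datetime.fromtimestamp(ts, tz=timezone(timedelta(hours=4)))
# local is 2021-09-13 00:00 at UTC+4, i.e. the intended local midnight
```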
And, more critically, since I like your API: will you offer public access to ensemble forecasts? That would make my life better.
This morning I decided to change the provider because the forecast is really unstable.
I will be glad to try yours and give feedback if you are interested. I am near Paris.
A) Using Numerical Weather Prediction (NWP) model data as 'historical' data is highly dependent on the use case. For some applications this is totally fine and the best available data, but with more rapid observation satellites becoming quite low latency, using a +2 hour NWP forecast as 'historical' could be quite wrong, and the source of the data should be made clear. For a lot of parameters this is the only choice, but certain models have different biases for different areas of the world, limiting the use case further unless your backend is willing to apply corrections based on these known biases.
B) A lot of use cases really are only interested in future predictions, and they can also store forecasts at +0 and keep their own record for performance tracking. Yes, it is handy for some; it's wasteful for others.
C) This separation allows you to separate your infrastructure, as keeping even a relatively short historical record at high geospatial/temporal resolution with good query time gets expensive quickly. E.g., satellite data (I dealt with cloud detection) is commonly at 0.5-2 km resolution, updating every 5-15 minutes; you quickly have to sub-sample locations if you want to store even only weeks of low-latency data. If you track 1-2M points at 30-minute intervals, even 2 weeks is 1.3B rows, ignoring the number of parameters you might be storing. Whereas forecast data can be moved to 'cold' (high-latency), cheaper storage more quickly, since the latest forecast is the most valuable, quickly losing its value over time. This means you can use storage like Redis for the latest forecasts, for rapid ingestion and fetching. Store historical forecast data per model at +N hours/minutes ahead, and you start multiplying your historical data costs again.
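The row-count arithmetic above checks out: 2M points sampled every 30 minutes over 2 weeks comes to roughly 1.3B rows.

```python
# Back-of-the-envelope check of the row count mentioned above.
points = 2_000_000
samples_per_day = 24 * 2      # one sample every 30 minutes
days = 14
rows = points * samples_per_day * days
# rows is 1_344_000_000, i.e. roughly the 1.3B rows quoted above
```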
Massive kudos meteo-jeff for building this service, accessibility of public data can be extremely frustrating and while GRIB/NetCDF make looking at data over an area through time somewhat more manageable, a lot of use cases are just interested in a single point. My contact is in my bio if you ever want to chat, great public service you are doing, well done!
For wind speeds, all typical units are available as well. Let me know if I missed one.
The app defaults to F because it gives the more "human" temperature range.
I've never heard of F as the more 'human' temperature range. You either use C or F, don't you? Since I've never used F, it doesn't seem more human to me.
So I found some other articles that articulate why F is better for weather: https://news.ycombinator.com/item?id=28524505
Sorry I'm on mobile and can't link directly to the relevant part of the video: https://youtu.be/BMGrsOawKac
Basically F gives more precise info/digit for the temperature ranges humans experience. 99F is near the top of the range, while 99C is unheard of for weather on Earth. So F gives approximately twice the precision. To get the same precision in C you have to add decimals
The same number of bits is used either way; i.e., you can also use deci-Celsius, e.g. 22.5 °C = 225 d°C. While not commonly used when displaying temperatures, it gives just the same info and is often used on microcontrollers; so that's really not an argument for Fahrenheit.
The top of the range being 100 °F is just nonsense; there are lots of places with 110 °F (~43 °C), some with 120 °F (~50 °C), and some places that top out at 70 °F (~21 °C) in summer.
Also, that would imply the bottom is 0 °F (~ -17 °C) and mid-range is 50 °F (~10 °C); neither is true for observed temperatures in most parts of the world, nor would 50 °F be a good level for human comfort, which is subjective anyhow.
The single thing that could make one think a temperature scale is a better fit for human consumption is being used to that scale. If one grows up with °F, then naturally °F is the scale one can better relate to; similarly with Celsius.
The actual benefits of Celsius are its relation to the freezing and boiling points (combined with barometric pressure) of water, something a lot of people use daily (cook pasta, make ice cubes, know not to lick metal poles at <0 °C, ...) and, more importantly, something that can actually be objectively related to.
In addition to that it scales 1:1 with Kelvin, a scale that actually has a defined lower end that matters a lot in our universe.
I was speaking from personal experience. One of the reasons I made it easy to toggle between F and C in my app was so I could adjust from F to C (in Korea). I think it worked and I am comfortable using both C and F now (although I admit I am more accustomed to F).
However, one thing I noticed when I switch to C in my app is I get less information. I have seen systems that double the C reading to get a similar precision as F, but then it is no longer really C.
My point was not that temperatures don't go over 100F, but that F uses the entire range of 2-digit readings. C only uses half of the positive 2-digit range.
Also because freezing in F is above 0, you can show a useful range of temperatures below freezing before having to add a negative sign.
So I still argue F is more information-dense than C given a 2-digit temperature reading for weather.
It's because you are American. For Europeans using Celsius this is more like hieroglyphs (but I understand it works the same the other way around).
More detailed arguments for F for weather here: https://news.ycombinator.com/item?id=28524505
Merits of Metric vs. Imperial generalized:
- Metric: science-based, regular (powers of 10)
- Imperial: human-scaled, divisible (many divisors, including 3) An inch is about the breadth of a human thumb, and a foot is 12 inches. (Also about the length of a human foot.)
For example, a surfer says "if you measure wave height in metres or cm it is not as accurate compared to the human body (a 6ft wave is under 2 m and we would not call a wave 182.88 cm high)"
edit: length -> breadth
And I am using it fine on desktop by toggling with the Windows snap feature.
But yes, a toggle on desktop would be nice and more user-friendly...
... and I'll add, awesome work. I love simple sites like this that are built well and are actually useful. Well done.
This API still works: https://uw.leftium.com/?api=openweather
The app's main use case was for mobile, where you're likely to start in portrait orientation.
I'll have to make the landscape mode more usable, especially on desktop.
I kept DarkSky the default API because it tends to be the fastest. (And it used to be the most accurate? These days openweather seems to predict the rain better.)
However, soon Open-Meteo will be the default API!
The ensemble forecast would -- I imagine -- give me an idea of the internal variability of the forecast for a single hour/day, which I could feed into my personal loss function and determine more accurately how to prepare to minimise discomfort.
As a concrete example, a single forecast of 17 °C can correspond to many ensemble forecasts, including
- 90 % range 11--20 °C
- 90 % range 18--22 °C
and these two mean very different things for how I'd dress and what extra clothes I'd bring.
It's even more important e.g. when planning a long drive. If the 90 % range includes heavy downpour, it might be worth rescheduling or adding buffer time to the trip, even if the most likely single forecast only predicts drizzle.
What would it take to get access to this data?
They have an endpoint that gives you weather per hour, now and going forward, for any location in the world.
Forecast variability is difficult to express. For example, precipitation probability describes how likely rain is given the forecasted atmospheric conditions. Instead, some people assume a 10% probability means it is raining in 10% of an area.
Making use of probabilities is also tricky. If you run an ice cream shop with high operating costs and there is a 50% probability of rain, do you open with the chance that barely anyone comes, or stay closed to save costs?
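One standard way to frame that decision is expected value; all numbers below are made up for illustration:

```python
# Hypothetical expected-profit calculation for the ice cream shop.
p_rain = 0.5
revenue_if_dry = 800.0    # assumed daily revenue on a dry day
revenue_if_rain = 100.0   # assumed daily revenue if it rains
operating_cost = 300.0    # assumed cost of opening for the day

expected_profit_open = (
    (1 - p_rain) * revenue_if_dry + p_rain * revenue_if_rain - operating_cost
)
profit_if_closed = 0.0
# expected_profit_open is 150.0, so under these (made-up) numbers, opening
# is the better bet on average -- but a risk-averse owner might still weigh
# the 50% chance of a 200.0 loss differently.
```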
Ensemble forecasts also take up a lot of disk space, because each model ensemble is stored individually.
I'm updating the code now to parse your weathercode for predicting umbrella necessity; it has been an accurate gauge in my area for the past week.
Yes, please use the WMO weather code for that. I will also add precipitation probability in the next weeks. At >50% precipitation probability the umbrella hook light could also switch on.
I hope to improve the UX a bit further with a city-search for example. Unfortunately I am lacking a good geocoding API for that.
It works well and with zero maintenance if you use their auto-updating Docker image.
Pretty generous free tier, and paid options seem pretty reasonable.
What wikidata objects are you referring to?
As others pointed out, blocking IP addresses is an option. Blocking access based on the HTTP Origin header is also possible.
I hope I can go on for some time without being too restrictive. With a fairly quick API that can scale with cheap VMs, it should be feasible to keep it running at low expense even at higher utilisation.
HN user: a_square_peg
I've been meaning to do some analysis on historical weather data to try and understand the impact of climate change, and all the open APIs I have tried have inaccurate data.
Data for Manali in India:
When the actual low is 12 Celsius and the high is 20 Celsius, it shows the low as 4 and the high as 18.
We have https://dsp.imdpune.gov.in/ but they charge thousands of dollars for their records and require a lot of paperwork.
I would be interested in seeing the implementation of the service; interesting choice going with Swift. I'm guessing you're using something like Vapor for hosting the API?
How are the files designed? I'm guessing you have some cheap way of mapping (lat, long) into a file location? Maybe just by truncating the coordinates to your precision and mapping them onto the file? Or some fancy-pants Uber hexagons? How is the forecast delivered?
Hmm! Many questions :-). I've been thinking lately of similar (but not for weather) "geographical" APIs, and how to store/query data cheaply, so this interested me :-)
I use simple 2D lat/lon grids for different regions and resolutions. E.g. the global grid is 2879x1441 pixels. The third dimension is time. All data is stored directly as floating points on disk. Because I know the dimensions of the grid, I can read data at exactly the correct position on disk. I use Float16 as well, which saves me 50% disk space compared to Float32.
Fancy hexagons like H3 are not necessary. They could reduce storage by ~30%, but require much more effort and I have to "modify" the forecast. I keep forecast data from national weather services as untouched as possible.
Currently I am using fixed size files, because I update all weather data "in-place". I can calculate which bytes to read for a given geographic location. If data is compressed, this would be very hard to do.
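A minimal sketch of that offset calculation, using the grid dimensions from above (the (time, lat, lon) ordering is my assumption; the real layout may differ):

```python
# Byte offset of one Float16 value in a flat (time, lat, lon) grid file.
NLON, NLAT = 2879, 1441   # global grid dimensions mentioned above
BYTES_PER_VALUE = 2       # Float16

def byte_offset(time_step: int, lat_index: int, lon_index: int) -> int:
    """Position of one value, assuming row-major (time, lat, lon) order."""
    index = (time_step * NLAT + lat_index) * NLON + lon_index
    return index * BYTES_PER_VALUE

# With f.seek(byte_offset(t, i, j)) you can read exactly the 2 bytes needed,
# without parsing or decompressing anything else in the file.
```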
As far as I interpret the terms correctly, I have to indicate that I am using open-data from DWD, but for value-added services, this attribution is not required for all applications using it. I will do some additional research on whether applications using Open-Meteo should also display a disclaimer.
I would probably do a request per day for each user of this feature, which would amount to about 300-400 requests/day. Would that fit in the non-commercial use tier?
I would be happy to start paying like $5/month if I'll find a way to improve the brightness calculation logic with your API. If you don't yet have a payment processor integrated, maybe we can work it out through something simpler like Patreon or Buy me a coffee recurring payments.
What I'm not sure about is what data would best fit my needs. I was thinking of using Cloud Cover first but I don't know which one would affect the daylight perceived brightness:
Regular "shortwave_radiation" should fit your needs. This is basically "direct_radiation + diffuse_radiation".
I am not entirely sure if this affects your users that much, because many will work indoors and only a fraction of sunlight will get in.
You can contact me directly via mail for a non-commercial use waiver.
Most users of this "brightness by sun position" feature work in rooms with lots of natural light, so they complain about their screens being too bright on cloudy days. I think some improvements will be noticed.
I'll contact you through the email on the Contact Us page when I have something working properly, thanks for this!
The challenge I've had is the same, the data format that a lot of these sources provide is in "academic" formats like GRIB, NetCDF etc.. and I've spent a not-insignificant amount of time trying to work through all these formats and what they represent.
I want to integrate regular wave forecasts in the next month, but I am unaware of anything that can predict dive visibility.
Anecdotally I find it's best when the wind is offshore and there is low tidal movement. But there are other factors at play, such as rainfall, etc..
This is one source I found (for wave height): https://www.aviso.altimetry.fr/en/data/products.html, however it was not clear on the website whether it was paid or not.
Another one I had a poke around with: https://www.ecmwf.int/en/forecasts/datasets/wmo-essential (trying to read Grib2)
I think the titles for the charts might have the lat/long compass directions swapped: it's showing the correct latitude for my location I selected (from preset - Wellington), but it has a 'E' after it, when it should be 'N'. Similarly, the longitude has 'N' after it, when it should have 'E'.
My goal is to keep Open-Meteo free for non-commercial use and not start introducing API keys with different tiers of payments! The uptake in the recent days is great and motivates me to keep working on it!
I used VisualCrossing, which seems quite good and crucially allows 2 weeks of historical data, which I think is quite rare. It looks like you offer 2 days?
I don't think DarkSky offered this feature.
For my use the accuracy is not critical.
Most likely I can store some basic weather variables like temperature, humidity, wind, precipitation and weather code for a couple of weeks. I will keep your request in my backlog: https://github.com/open-meteo/open-meteo/issues/27 Thanks!
National weather services collect all kinds of measurements and observations from weather stations, airplanes, balloons, satellites and radar to "assimilate" the global state of the atmosphere. Usually this is full of gaps, because only a fraction of the world is covered by measurement stations.
Once hour 0 is assimilated, weather models start to calculate how different processes like solar radiation, clouds and winds change this state.
As Open-Meteo only uses the end-product of this chain, personal weather stations cannot be included anymore.
If you want, you could try to use machine learning and Open-Meteo weather forecasts to build your own tailored forecasts. Hopefully this idea will be picked up by someone with instructions on GitHub.
Off-topic, but I'm looking for datasets of tests for blue-green algae in still water/rivers.
- Date of test
- Location of the test (or some info so that I can link it together)
- Status (measurements that detect blue-green algae)
This is usually measured by a lab, and as an outdoor swimmer, I've encountered a lot of issues when there is blue-green algae.
I'm hoping to create a (free) ML app that predicts the results.
Contact me at nico-algues [at] sapico.me
(I've already got test results back from Belgium, not much response from the Netherlands and the UK)
It's free for a reasonable number of requests and has a solid API and a good variety of functionality.
Yours looks well designed! I will give it a shot if I ever need to revisit my local weather script.
Did you choose Swift for any reason besides existing knowledge? Python can be compiled down to a binary pretty easily, along with many other languages. Just curious. Cheers.
With Python I always struggled to get performance to a very high level. I am sure it is possible, but Swift with some C code was more natural for me.
Very impressive! What does your backend stack look like? Are you using any caching or does every API call hit the binary file?
> The project went live 2 weeks ago and is slowly being used.
What kind of traffic are you getting?
Looking forward to future updates!
There is no frontend cache and I want to keep it this way. Next steps include point-of-presence API virtual machines in different countries. Via GeoDNS this is even faster than a CDN.
Peak API usage was around 50,000 API calls per hour, but on average it is quite low.
Mainly I also intend to reach open-source developers, to make access to weather data easier without the need to instruct users to sign up for an API key somewhere.
I run a cronjob which sends me the forecast for the next day based on german weather service MOSMIX forecasts. I am using https://brightsky.dev for this, which makes the clunky xml files of MOSMIX available as a http api.
I’ll definitely check out open-meteo
One goal of mine would be to provide historical data for the past months, to enable users to run their own statistical forecasts combined with weather models. It is actually quite easy to use measurement data and correct forecasts using simple machine learning models like random forest. If you compare wind speed measurements and forecasts, there usually is a simple statistical error that can be reduced easily. Maybe in the next month I will do some tests and example code.
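As a simpler stand-in for the random-forest idea, even a linear fit on synthetic data shows how such a systematic error can be reduced (all numbers below are made up):

```python
import numpy as np

# Synthetic example: measured wind speed has a systematic bias relative
# to the forecast (slope 0.9, offset 1.0) plus noise. A simple linear
# fit recovers the bias; a random forest could capture non-linear errors.
rng = np.random.default_rng(0)
forecast = rng.uniform(0.0, 20.0, 500)                        # m/s
measured = 0.9 * forecast + 1.0 + rng.normal(0.0, 0.3, 500)   # m/s

slope, intercept = np.polyfit(forecast, measured, 1)
corrected = slope * forecast + intercept

raw_rmse = float(np.sqrt(np.mean((forecast - measured) ** 2)))
corrected_rmse = float(np.sqrt(np.mean((corrected - measured) ** 2)))
# corrected_rmse is clearly below raw_rmse on this synthetic data
```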
• What is the source of the soil moisture data?
• Also can you describe which ET model is being applied and which of its parameter values have to be assumed?
• ET is based on latent heat flux. This is not the potential evapotranspiration or ET0 reference evapotranspiration. I did not dive deep into the actual radiative transfer model for latent heat flux, but you can find it in the DWD ICON description.
I've seen pages like Weather Spark that give you historical averages, but I would like to see how the weather has (or not) changed over a long period.
Frontend: Bootstrap, jQuery
GitLab CI/CD pipelines with deploys
The legal way to remove the non-commercial limitation, is to "ask" if your company can use it.
To keep it simple, I also do not want to have any income with it. In case larger API consumers are on the horizon, I will contact them and ask to sponsor a couple more VMs.
A cheaper/scalable approach instead would be to re-process your data into an appropriately chunked, cloud-optimized storage format like Zarr and save it in object storage. Then your scaling bottleneck would just be the VMs or compute you use to query from object storage, as a function of traffic/load.
Yes, the amount of storage can be an issue, but I want to stay below 500GB of hot-data. One bottleneck is network traffic to copy updated files after each weather model update.
My binary files are just plain Float16 files without any metadata. Logically they are similar to 3D Zarr files. I know exactly which bytes I have to read, and the kernel page cache helps a lot to keep it fast.
In theory this data does not have to be a file on SSD. I could also use block storage directly or request data from S3 via range requests.
One hopelessly over-engineered approach would be to use an `AWS EBS io2 Block Express` volume with `Multi-Attach` and spawn up to 16 API instances to serve data from it.
Otherwise it takes around 40 GB of disk space for every day of data. For a history of 100 days, 4000 GB are required. With compression I could save 50%, but would have to invest a couple of days of development time to make it work. You could calculate the AWS bill now ;-)
Data on cold storage is an option, but it is also super slow...
Currently, I did not yet integrate all the high-resolution models that I want to. Coverage for Europe is great. In North America I will add high resolution NOAA models next.
Most likely I will keep only a limited subset of data as history, but on fast storage to make it accessible quickly.
Interesting problem to solve.
Even if you add an account and API key system, as long as it's free to sign up, apps can also automatically register accounts to get API keys.
Also, regarding API keys: you can just search for keys from basically any weather API and you will find credentials on GitHub. People also sign up 10x for free API services to circumvent typical per-day request limits.
At some point I may have to block certain IP addresses or HTTP Origin referrers because of abuse.
For visualization, the UI there is great for a trained eye only; yours seems more accessible. Congrats on that point!
It looks like Kachelmann is also integrating weather forecasts from different national weather services and the paid version from ECMWF. I am sure the "HD" forecasts mainly use DWD ICON models as well. In this case, data quality would be the same.
I also spent some time carefully integrating local and global weather models. For solar radiation forecasts, a clear-sky radiation model is used to correctly interpolate data to 1h resolution. In the end it is a tradeoff between a simple API with low operation cost and data quality/amount.