
Largest multi-city traffic dataset available - ddechamb
https://utd19.ethz.ch/
======
kitatsuab
Hi guys, I am the creator of this dataset. Together with Allister, I am happy
to help you out if you have any questions.

Yes, it is cars only and mostly European cities, and yes that's a limitation.
Nonetheless, it has advantages compared to GPS/Bluetooth data, as it captures
ALL cars (or trucks). If you're interested in car capacity of a network, macro
or micro congestion, or other applications that require rigorous detection of
ALL vehicles, this dataset should get you started! We're happy to hear from
you!

Lukas

~~~
flower1143
Very interesting data! It would be even more so if you were able to get data
from a city with a large population like Lagos or Delhi.

~~~
kitatsuab
Thanks for your interest. Tokyo, Taipei, Los Angeles, Paris, and London are
covered :)

------
paganel
No Rome, no Athens, no Istanbul, no Bucharest. In part I understand that, but
collecting data only from the “traffic-civilized” part of the continent will
tell a partial story. Of the cities on the included list, maybe Paris comes
close to those, though I doubt it.

~~~
throwaway0a5e
When you're poor you buy cheap ground beef. When you're rich, buying $8/lb
grass-fed fancy beef seems like a reasonable use of money. Governments are the
same way. Only "rich Europe" can afford/justify carpet bombing their
infrastructure with sensors so they're the only ones the data covers.

You see this incidental data bias in lots of fields. "Corrupt seaside Europe",
"dashcam videos and good vodka Europe" and "I Can't Believe It's Not The
Middle East(TM) Europe" have bigger, more immediate fish to spend their dollars
frying than collecting data for the sake of maybe finding some utility in it
later on.

~~~
paganel
I agree with you, the issue is that the technocratic powers from those cities
I mentioned (I for myself live in Bucharest) will most definitely use the
studies made up using this type of data and will apply them as they are, using
“the Germans/Swiss/Dutch are doing it so we should, too” as their main
reasoning. By the time someone comes in and says that most probably the
situations between Bucharest and Amsterdam (to give a random example) are very
different it is too late, the measures have already been taken.

~~~
throwaway0a5e
If the US is any indication it will be worse than that. There are all sorts of
topics in which "what works in Europe" factors into the discourse, yet despite
the topic being something on which most of the world has kept good records for
50+ years, north/central/west Europe is still the only part that is paid any
attention.

------
philshem
cool open data! (I used it as an answer here:
[https://opendata.stackexchange.com/q/1767/1511](https://opendata.stackexchange.com/q/1767/1511))

for other traffic related datasets, check out this list:
[https://github.com/graphhopper/open-traffic-collection](https://github.com/graphhopper/open-traffic-collection)

and for "borrowing" the live data from TomTom, check out this mini tweet
thread (from me):
[https://twitter.com/philshem/status/1241739025624567813](https://twitter.com/philshem/status/1241739025624567813)

~~~
kitatsuab
Thanks for the efforts! I have added a link to your GitHub on our GitHub site.
Cheers, Lukas

------
ddechamb
In total, almost 5 billion vehicles covering a combined time span of 3.8 years
were detected. The UTD19 traffic data is free for all research use. You only
have to sign up, agree with the conditions, and you are all set.

~~~
dasloop
We are now in the age of multimodal mobility and this only covers the mode
that is easiest to capture (vehicles using data from loop detectors). Having a
dataset with the same spatial and temporal coverage but multimodal would be
amazing.

~~~
tgv
TBH, this is already quite a step. There's much more data available. I worked
on systems that also had LPRs (license plate readers), and they generate more
detailed information: you'd get 1M vehicle passings per day in a medium-sized
city. But they're privacy sensitive. Same goes for Bluetooth detection or face
recognition. Data owners aren't going to expose that kind of fine-grained data
to the general public easily.

~~~
pc86
Do you know if there are datasets with pseudonymized LPR data available? I'm
not interested in the actual license plate obviously, but being able to say
that two data points are the same car would be extremely interesting data
especially for these larger and larger datasets spanning longer periods of
time.

I'm also aware there is still a bit of a privacy concern with that type of
data but honestly don't really have the math background to know exactly how
that occurs or the extent to which you can minimize it.
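
For what it's worth, here is a minimal sketch of what "pseudonymized" could
mean in practice. It is an assumption for illustration, not how any data owner
mentioned in this thread actually does it: each plate is replaced by a keyed
hash (HMAC-SHA256), so the same plate always maps to the same token but the
token cannot be reversed without the secret key. The function name, the
normalization rule, and the example plates are all made up.

```python
import hashlib
import hmac
import secrets


def pseudonymize(plate: str, key: bytes) -> str:
    """Map a license plate to a stable token using a keyed hash.

    Hypothetical helper: the normalization (strip spaces and dashes,
    uppercase) is an assumption, not a standard.
    """
    normalized = plate.replace(" ", "").replace("-", "").upper()
    digest = hmac.new(key, normalized.encode("utf-8"), hashlib.sha256)
    # Truncate to 16 hex characters (64 bits) to keep tokens short.
    return digest.hexdigest()[:16]


# The key must stay with the data owner; rotating it breaks linkability.
key = secrets.token_bytes(32)

a = pseudonymize("ZH 123 456", key)  # made-up plate
b = pseudonymize("zh-123456", key)   # same plate, different formatting
c = pseudonymize("ZH 654 321", key)  # a different made-up plate

print(a == b)  # same car, same token
print(a == c)  # different car, different token
```

A plain unkeyed hash would not be enough, because the space of valid plates is
small enough to enumerate. Even with a keyed hash, the residual privacy risk is
in the linked trajectory itself: a token that shows up near the same two
locations every weekday can often be tied to a person without ever recovering
the plate.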

------
brzozowski
I recently learned there are a few groups at ETH Zurich working on traffic
simulation and fleet planning in transportation networks. Are you affiliated
with the Autonomous Mobility on Demand / IDSC folks?

[https://www.amodeus.science/](https://www.amodeus.science/)

[https://idsc.ethz.ch/education/lectures/duckietown.html](https://idsc.ethz.ch/education/lectures/duckietown.html)

~~~
kitatsuab
Yes, they're amazing! We have collaborations with them, but we are from:
[https://www.ivt.ethz.ch/](https://www.ivt.ethz.ch/) focusing more on the
transportation/traffic part of research. Thanks for the interest!

------
kitatsuab
the site is currently down, as ETH is experiencing issues with their
webhosting. Will keep you posted.

~~~
kitatsuab
EDIT: Up again!

