1) Most people aren't using bridge nodes, and are instead connecting to Tor nodes publicly listed in the consensus. No DPI necessary.
2) Tor clients/servers used to signal their intent by including "special" cipher suite combinations in the TLS handshake. IIRC, they later switched to doing a "normal" looking TLS handshake with an immediate TLS renegotiation once the outer handshake was complete. That's a very distinctive traffic pattern.
3) All writes were translated into cells that were padded to 512 bytes. So by design, all Tor traffic looks the same.
4) The circuits were much longer lived than a standard TLS connection.
My sense was that Tor was originally designed to use TLS for innocuous egress filtering compatibility, rather than for explicit censorship resistance.
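To make point 3 concrete, here is a minimal sketch of fixed-size cell framing. It is a toy model, not Tor's actual cell format: the function name `to_cells` and the zero-byte padding are assumptions for illustration, and real cells also carry header fields that are ignored here. The only thing it shows is that every write, whatever its size, ends up as a whole number of identical 512-byte units.

```python
CELL_SIZE = 512  # fixed cell size mentioned in point 3 above


def to_cells(payload: bytes, cell_size: int = CELL_SIZE) -> list[bytes]:
    """Split payload into fixed-size cells, zero-padding the final one.

    Hypothetical helper: real Tor cells also include header fields,
    which this toy framing leaves out.
    """
    cells = []
    for offset in range(0, len(payload), cell_size):
        chunk = payload[offset:offset + cell_size]
        # Pad short chunks so every cell has exactly cell_size bytes.
        cells.append(chunk.ljust(cell_size, b"\x00"))
    return cells


if __name__ == "__main__":
    for n in (1, 100, 513, 2000):
        cells = to_cells(b"x" * n)
        # Every cell has the same length regardless of the write size.
        print(n, len(cells), {len(c) for c in cells})
```

An observer who sees only record lengths would therefore see sizes clustered around multiples of the cell size, which is one reason the traffic "looks the same".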
Note: even though it originally came from an acronym, Tor is not spelled "TOR". Only the first letter is capitalized. In fact, we can usually spot people who haven't read any of our website (and have instead learned everything they know about Tor from news articles) by the fact that they spell it wrong.
Most of the latter types of users are probably much less interested in whether the destination website can identify them, and potentially much more interested in Tor traffic going undetected by censors.
Anonymity can be used for good. It can be used for evil. A botnet has been seen utilizing Tor.
There's a program called "Caploader" with a checkbox labeled "Identify protocols". Checking the box can identify traffic speaking the current vanilla Tor protocol.
Makes me feel like reading the article was a waste of time. I want technical details.
Tor's TLS handshake exhibits a number of peculiarities which distinguish it from HTTPS. The cipher list inside the TLS client hello used to be (almost?) unique (see http://www.cs.kau.se/philwint/static/gfc/ ), and the SNI contains a random bogus domain.
They even open sourced their code at http://crysp.uwaterloo.ca/software/CodeTalkerTunnel.html
For 900 EUR, however, you can buy yourself a copy of their tool.
If I were to take a stab in the dark about how the tool is doing it, though - based on their "statistical" analysis comment, my guess is they're measuring sustained traffic levels / TCP connection duration. Your average encrypted web session won't look anything similar to a command-and-control bot calling home over Tor to some irc server (which is their example usage for the tool). Possibly including "known" Tor node IP addresses, as well.
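To illustrate the kind of heuristic guessed at above, here is a rough sketch that scores a flow on two of the signals mentioned: whether the destination is a publicly listed relay, and whether the TCP connection is unusually long-lived. Everything in it is an assumption for illustration: the `Flow` type, the `suspicion_score` function, the weights, the 600-second threshold, and the relay addresses (which are documentation-range placeholders, not real relays). It is not how the commercial tool works, which is unknown here.

```python
from dataclasses import dataclass


@dataclass
class Flow:
    dst_ip: str        # destination address of the TCP connection
    duration_s: float  # how long the connection stayed open, in seconds


# Placeholder addresses from the documentation range; a real check would
# use relay addresses taken from the public consensus.
KNOWN_RELAYS = {"192.0.2.10", "192.0.2.11"}


def suspicion_score(flow: Flow,
                    known_relays: set = KNOWN_RELAYS,
                    long_lived_s: float = 600.0) -> int:
    """Return a crude score; higher means more Tor-like under these guesses."""
    score = 0
    if flow.dst_ip in known_relays:
        score += 2  # destination is a publicly listed relay
    if flow.duration_s >= long_lived_s:
        score += 1  # connection outlives a typical web session
    return score


if __name__ == "__main__":
    flows = [
        Flow("192.0.2.10", 3600.0),   # listed relay, long-lived
        Flow("198.51.100.7", 12.0),   # unlisted, short web-like session
    ]
    for f in flows:
        print(f.dst_ip, suspicion_score(f))
```

A score like this would produce false positives (long-lived VPN or SSH sessions, for example), so it is only a starting point for the statistical approach speculated about above.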
In addition, there was that Ethiopian DPI filtering project against Tor that happened last summer (https://blog.torproject.org/blog/update-censorship-ethiopia), with the Tor Project thinking the censors had somehow fingerprinted some aspect of Tor's TLS handshake. Maybe this knowledge is spreading.
The discovery of methods to identify Tor traffic in the pursuit of reining in malicious software should encourage the Tor network to become less easily detectable before authoritarian governments manage to shut it down more effectively.
How Governments have tried to Block Tor
In other words, my first reaction was that it is harmful to attack the technology, but then I realized that is a silly argument for obscurity. Publishing a vulnerability, and having more people publicly searching for vulnerabilities, is a good thing, since authoritarian actors will just exploit what they find without any disclosure.