
How come PHP seems so much faster than Rust? - fortran77
https://www.reddit.com/r/rust/comments/d9zfa6/how_come_php_seems_so_much_faster_than_rust/
======
osrec
While this discussion boils down to the efficiency of the underlying regex
engines for PHP and Rust, it does highlight (to me) why PHP is so popular: in
the hands of average programmers, PHP is quick to write and fast (enough) to
execute.

I personally believe its ubiquity is well-deserved, and I actually like it as
a language despite all its quirks (which exist in almost every language I've
come across)!

~~~
biggestdecision
PHP isn't popular because it's easy, it's popular because for a long while it
was the only viable language for developing web applications.

~~~
javajosh
_> for a long while [PHP] was the only viable language for developing web
applications._

That is not what I remember. CGI existed before PHP (1994), and was used
extensively for web apps (ebay still appears to use it to this day). Also,
mod_perl and Java Servlets were introduced around the same time, circa 1996,
so I don't think it's correct to say PHP had a monopoly on web apps "for a
long while".

~~~
dimator
The friction involved in getting those other technologies up and running was
not trivial though. That played a role in php's initial success, because it
was trivial to get up and running.

~~~
winrid
How so? As someone who's set them all up: with Apache, the setup for mod_perl,
PHP, and CGI is pretty much the same, right? They are all essentially CGI.

~~~
cutler
No, they are not the same as CGI. You can run PHP as a CGI, as many cheap
hosts do, but for performance PHP is run as either Apache mod_php or php-fpm
with Nginx. Mod_perl is not CGI and requires much more discipline to prevent
shared memory leaks. Mod_perl allows content negotiation which CGI does not.

------
kelnos
This is a little silly. The two programs are so simple that it's probably just
testing the speed of the regex engines used by the two languages. Since PHP
uses PCRE (a C library that has been around for quite a long time), you'd
expect it to be fast. I'm impressed that marshaling back and forth across the
FFI boundary isn't causing more slowness, but perhaps it is, and the real
issue is that rust's regex engine is just painfully slow.

The only way to know for sure would be to profile each bit to see what
dominates: the file I/O? The regex matching? The FFI? The printing to stdout?
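
A rough way to get that breakdown without a profiler is to time each phase by hand. The sketch below is only an illustration: `timed_filter` and `PhaseTimes` are hypothetical names, the input is a stand-in string rather than the OP's log file, and a plain `starts_with` check stands in for the regex so that nothing outside the standard library is needed.

```rust
use std::io::{self, BufRead, Cursor, Write};
use std::time::{Duration, Instant};

/// Accumulated wall-clock time per phase (hypothetical breakdown).
#[derive(Debug, Default)]
struct PhaseTimes {
    read: Duration,
    matching: Duration,
    write: Duration,
}

/// Reads lines from `input`, writes those accepted by `is_match` to `output`,
/// and records how long reading, matching, and writing took in total.
fn timed_filter<R: BufRead, W: Write>(
    mut input: R,
    mut output: W,
    is_match: impl Fn(&str) -> bool,
) -> io::Result<(usize, PhaseTimes)> {
    let mut times = PhaseTimes::default();
    let mut line = String::new();
    let mut matched = 0;

    loop {
        line.clear();
        let t = Instant::now();
        let n = input.read_line(&mut line)?;
        times.read += t.elapsed();
        if n == 0 {
            break; // end of input
        }

        let t = Instant::now();
        let hit = is_match(&line);
        times.matching += t.elapsed();

        if hit {
            let t = Instant::now();
            output.write_all(line.as_bytes())?;
            times.write += t.elapsed();
            matched += 1;
        }
    }
    Ok((matched, times))
}

fn main() -> io::Result<()> {
    // Stand-in data; a real run would read the log file instead.
    let input = Cursor::new("GET /a 200\nPOST /b 500\nGET /c 200\n");
    let mut out = Vec::new();
    // Stand-in for the regex match, so the sketch needs no external crate.
    let (matched, times) = timed_filter(input, &mut out, |l| l.starts_with("GET"))?;
    eprintln!("matched {matched} lines: {times:?}");
    Ok(())
}
```

Calling `Instant::now()` per line adds overhead of its own, so the numbers only show which phase dominates; a real profiler or flamegraph is the better tool for anything finer.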

Also, looks like there is a rust wrapper for PCRE[0]; might be interesting to
try it to get a more apples-to-apples comparison.

[0] [https://github.com/cadencemarseille/rust-pcre](https://github.com/cadencemarseille/rust-pcre)

~~~
aksx
> This is a little silly.

Is it though? I interpreted the question the following way.

"Rust is a tool in my tool box, when the reason for picking it over other
tools isn't demonstrably true, what am I doing wrong?"

I agree with the rest of your comment, just not with calling the OP or their
question silly.

~~~
beatgammit
The reason for picking it is for memory safety as well as performance. If your
problem involves a lot of regex and memory safety isn't as critical, then feel
free to pick a language with good regular expression libraries. If regular
expressions are a smaller part of the problem you're solving, looking at
overall performance is better than benchmarks like this.

~~~
gtirloni
I'd never know that if this post didn't exist, so I don't think the
investigation is silly at all.

~~~
smaccona
Your conclusion is correct, but the reason I take issue is that that’s not the
thesis of OP. OP’s original subject/headline suggests that, or questions why,
Rust is generally slow vs PHP, and uses a regex example as the “proof”. A
person who didn’t read the Reddit comments, or who is less versed in critical
thinking, may accept the example as a valid proof of the thesis, instead of
simply noting the fact that Rust may not be as good at regex as it is at other
things. If the Reddit subject was “Rust is worse than PHP at regex”, nobody
would bat an eyelid and the resulting conversational threads would have a
different tenor (and your conclusion would and should be the same).

------
fireattack
The top reply says "Don't underestimate PHP when regexes are involved." and
linked a benchmark [1].

But in that benchmark apparently Rust is faster than PHP (The slowest "Rust
#2" is 2.45 sec. while PHP is 2.80 sec. He also mentioned "PCRE", but the ones
with Perl in the name are much slower still [the fastest being 14.95 sec.]).

Did I miss his point by linking that benchmark, or did I interpret the
benchmark wrong?

[1]: [https://benchmarksgame-team.pages.debian.net/benchmarksgame/...](https://benchmarksgame-team.pages.debian.net/benchmarksgame/performance/regexredux.html)

~~~
o_p
I guess the poster didn't notice Rust being faster, only that PHP is pretty
fast, or at least not slow (so it shouldn't be underestimated).

I'd take that slowdown any day just to not have to touch Rust though :^)

~~~
1_player
To be fair, the benchmarks game PHP script does quite a bit more work than just
matching a regexp, like forking into multiple interpreters (which can't be
that fast) and communicating through a message queue.

So any speed difference between Rust's regex engine and PCRE/PHP gets lost in
the noise.

------
jononor
I believe the Rust Regex Unicode support is opt-out, whereas in PHP it is
opt-in. Might make a significant difference.

~~~
timvisee
Cool. Never really noticed the non-Unicode variant would be faster. Here's
some more info:

[https://docs.rs/regex/1.3.1/regex/#opt-out-of-unicode-suppor...](https://docs.rs/regex/1.3.1/regex/#opt-out-of-unicode-support)

------
chubot
Random guess:

There will be some difference in the regex engine speed, because PCRE is very
fast in common cases.

But doing a single regex match is not that much work, so it probably boils
down to I/O.

And PHP might use extra buffering to get better throughput for the common
"stream a file" scenario.

Similar question with Python being faster than C++:

[https://stackoverflow.com/questions/9371238/why-is-reading-l...](https://stackoverflow.com/questions/9371238/why-is-reading-lines-from-stdin-much-slower-in-c-than-python)

~~~
igouy
> And PHP _might_ …

The program source code is shown.

~~~
TheCoelacanth
Not the source code for PHP itself, though.

------
nicklauri
Obviously, because the PHP regex engine is written in C, not in PHP. It's more
a comparison of Rust's regex crate vs. C's PCRE than of Rust vs. PHP, or of who
makes system calls faster. On top of that, Rust's IO is locked[1] and doesn't
buffer (which is slower than the buffered, unsynchronized IO of other
languages). And the regex crate is slow when capturing, which the "benchmark"
does.

Reference:
[https://www.reddit.com/r/rust/comments/5zit0e/regex_captures...](https://www.reddit.com/r/rust/comments/5zit0e/regex_captures_slow_compared_to_python/)
That was 2 years ago, but it still seems valid to me: I've just tested it and
it's much slower, as I expected. Feel free to correct me if I'm wrong :)

[1]: [https://doc.rust-lang.org/std/io/fn.stdout.html](https://doc.rust-lang.org/std/io/fn.stdout.html)
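
For the locking and buffering point, a common workaround is to take the stdout lock once and wrap it in a `BufWriter`, instead of calling `println!` per line (which acquires the lock on every call). This is a minimal sketch using only the standard library; `write_lines` is a hypothetical helper and the two strings are stand-in output, not the OP's program.

```rust
use std::io::{self, BufWriter, Write};

/// Writes every line to `out` through one `BufWriter`, so the underlying
/// writer sees a few large writes instead of one small write per line.
fn write_lines<W: Write>(out: W, lines: &[&str]) -> io::Result<()> {
    let mut buffered = BufWriter::new(out);
    for line in lines {
        writeln!(buffered, "{line}")?;
    }
    buffered.flush() // surface write errors instead of losing them on drop
}

fn main() -> io::Result<()> {
    let stdout = io::stdout();
    // Take the lock once up front rather than implicitly on every println!.
    let handle = stdout.lock();
    write_lines(handle, &["first match", "second match"])
}
```

Whether this closes the gap in the OP's case would still need measuring; it only removes the per-line locking and small-write overhead.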

------
aussieguy1234
When you use any functions or classes from the PHP standard library, the
underlying implementation is usually in C.

------
UK-Al05
I may be wrong, but Rust uses a DFA regex engine, while PCRE uses backtracking.
So in theory rust's regex engine should be faster...

~~~
roskilli
Seems like the flamegraph he took of his Rust program shows it using
backtracking, maybe for the specific regexp he is using it's unable to use a
DFA: [https://i.imgur.com/9lx42Tu.png](https://i.imgur.com/9lx42Tu.png)

------
ericmcer
This code is mostly just testing the regex engine?

~~~
fortran77
Is the Rust regex engine slow because it's written in Rust?

~~~
mlindner
Not really, it's mainly because it hasn't been optimized over many many years
yet. Also, everything in Rust is implemented in Rust. Unlike interpreted
languages like PHP, it's self-hosted.

------
contingencies
Theory.

The PHP script and the log in question are probably in the kernel's aggressive
block layer cache. The files created for the Rust version are potentially new
and uncached, meaning that they must be loaded from disk. Therefore the Rust
version is probably just losing time on disk access.

Quick test: review whether or not the rust version runs faster on the second
execution.

Also or alternatively: use a profiler or copy the rust program to an in-memory
block device (ramdisk) prior to access.
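
The quick test can also be done from inside one process by reading the same file several times and comparing the timings. A minimal sketch, assuming the file path is passed as the first command-line argument; `time_reads` is a hypothetical helper, not part of the OP's program.

```rust
use std::fs;
use std::io;
use std::path::Path;
use std::time::{Duration, Instant};

/// Reads `path` in full `runs` times and returns each run's byte count and
/// elapsed wall-clock time.
fn time_reads(path: &Path, runs: usize) -> io::Result<Vec<(usize, Duration)>> {
    let mut results = Vec::with_capacity(runs);
    for _ in 0..runs {
        let start = Instant::now();
        let bytes = fs::read(path)?;
        results.push((bytes.len(), start.elapsed()));
    }
    Ok(results)
}

fn main() -> io::Result<()> {
    // The file to time is taken from the first command-line argument.
    let path = match std::env::args().nth(1) {
        Some(p) => p,
        None => {
            eprintln!("usage: pass the path of the file to time");
            return Ok(());
        }
    };
    let results = time_reads(Path::new(&path), 2)?;
    for (i, (len, elapsed)) in results.iter().enumerate() {
        println!("run {}: {} bytes in {:?}", i + 1, len, elapsed);
    }
    Ok(())
}
```

The first run warms the cache for the second, so a large gap between the two suggests disk access was the cost. If the file was already cached before the first run, both timings will look similar and the test says nothing either way.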

------
fortran77
This is an interesting discussion. It seems that it's possible for i/o to be
slow in Rust, and the PHP regex engine may be very good....

------
Thaxll
Rust doesn't make an average code implementation fast. Even naive mistakes
like buffering / allocation make things really slow.
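
As an illustration of the buffering and allocation point, here is a minimal sketch of a line scan that reads through a `BufReader` and reuses a single `String`. `count_matching_lines` is a hypothetical helper, the input is an in-memory stand-in for a log file, and a substring check stands in for the regex.

```rust
use std::io::{self, BufRead, BufReader, Read};

/// Counts lines containing `needle`, reading through a `BufReader` and
/// reusing one `String` so no new allocation is needed per line.
fn count_matching_lines<R: Read>(source: R, needle: &str) -> io::Result<usize> {
    let mut reader = BufReader::new(source);
    let mut line = String::new(); // allocated once, reused for every line
    let mut count = 0;
    while reader.read_line(&mut line)? != 0 {
        if line.contains(needle) {
            count += 1;
        }
        line.clear(); // keeps the capacity, drops the contents
    }
    Ok(count)
}

fn main() -> io::Result<()> {
    // In-memory stand-in for a log file; an opened `File` works the same way.
    let log = "GET /a 200\nPOST /b 500\nGET /c 404\n";
    let hits = count_matching_lines(log.as_bytes(), "GET")?;
    println!("{hits} matching lines");
    Ok(())
}
```

The naive alternatives are reading an unbuffered `File` directly (a system call per small read) or building a fresh `String` per line; both are easy to write by accident.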

------
stunt
Regarding all the harsh criticism PHP receives, I believe the language and
ecosystem influence the way developers think. A bad language/ecosystem can
dictate bad practices and bad design choices because that’s what the language
encourages them to do. I remember every PHP-4/5 training course starting by
mixing PHP and HTML.

The majority of developers write crappy code even in a decent environment. But
there is so much old and bad PHP code out there, the result of many years of
teaching people how to write bad PHP code, that nothing can put a stop to
these criticisms for good.

I don't write PHP at this moment but I know PHP7 made major improvements. PHP
is still a very good choice for rapid web development and a good fit for many
startups and small and medium-sized businesses, and there are many successful
examples to back it up.

------
alephnan
Regarding the slowness of Go in the benchmarks linked in the article,
[https://medium.com/@dgryski/speeding-up-regexp-matching-with...](https://medium.com/@dgryski/speeding-up-regexp-matching-with-ragel-4727f1c16027).

There are tradeoffs between PCRE and RE2 beyond just raw speed.

