I think the point was just to test each string against the query for equality. The benchmark could be slightly tweaked to test each string against the query for substring inclusion or regular expression match, but I suspect the results would only change by the cost of the test predicate.
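To make that concrete, here is a rough sketch in C of how the scan could take the test as a parameter. The names `predicate_fn`, `is_equal`, `contains` and `count_matches` are made up for illustration and are not from the original benchmark; the idea is only that the loop stays the same and the predicate is swapped.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical predicate type: returns nonzero when candidate matches query. */
typedef int (*predicate_fn)(const char *candidate, const char *query);

/* Equality test, as in the benchmark being discussed. */
static int is_equal(const char *candidate, const char *query) {
    return strcmp(candidate, query) == 0;
}

/* Substring inclusion test, the proposed tweak. */
static int contains(const char *candidate, const char *query) {
    return strstr(candidate, query) != NULL;
}

/* Linear scan over n strings; only the predicate changes between variants. */
static size_t count_matches(const char *const *strings, size_t n,
                            const char *query, predicate_fn pred) {
    size_t hits = 0;
    for (size_t i = 0; i < n; i++) {
        if (pred(strings[i], query)) {
            hits++;
        }
    }
    return hits;
}
```

Calling `count_matches(strings, n, query, is_equal)` versus `count_matches(strings, n, query, contains)` runs the same loop, so any difference in timing should come from the predicate itself.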
Not necessarily when comparing different languages, since they may allow different optimizations. For example, in C you could compare several bytes at a time if you pad the strings, while node.js' V8 is unlikely to optimize that far.
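A minimal sketch of what I mean by padding, assuming every string fits in a fixed 16-byte slot. `PADDED_LEN`, `pad` and `equal_padded` are hypothetical names, and the width is an arbitrary choice for illustration. With zero padding, equality becomes two 8-byte word comparisons instead of a byte-by-byte loop.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed fixed slot width; must be a multiple of sizeof(uint64_t). */
#define PADDED_LEN 16

/* Copy s into a zero-filled fixed-width buffer.
   Assumes strlen(s) < PADDED_LEN; the assert checks that precondition. */
static void pad(char out[PADDED_LEN], const char *s) {
    size_t len = strlen(s);
    assert(len < PADDED_LEN);
    memset(out, 0, PADDED_LEN);
    memcpy(out, s, len);
}

/* Compare two padded buffers 8 bytes at a time.
   memcpy into a uint64_t avoids alignment and aliasing problems;
   compilers typically turn it into a plain 64-bit load. */
static int equal_padded(const char a[PADDED_LEN], const char b[PADDED_LEN]) {
    for (size_t i = 0; i < PADDED_LEN; i += sizeof(uint64_t)) {
        uint64_t wa, wb;
        memcpy(&wa, a + i, sizeof wa);
        memcpy(&wb, b + i, sizeof wb);
        if (wa != wb) {
            return 0;
        }
    }
    return 1;
}
```

Because the padding bytes are all zero, two buffers compare equal exactly when the original strings were equal. Whether this actually beats `strcmp` or `memcmp` depends on the compiler and libc, so it would need measuring.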