It's a neat trick, but it still depends on uv being installed and network connectivity.
What's the advantage that makes this worth those constraints, compared to e.g. using pyinstaller [1] to build and distribute a single executable with the Python interpreter and all of your project's dependencies bundled in, at the exact versions you chose in your development virtual environment?
Yet it still participates and encourages the bibliometrics game, which benefits the big publishers.
A simple step away from encouraging bibliometrics (which would be a step in the right direction) would be to list publications by date (most recent first) on author pages rather than by citation count, or at least to let users and/or authors choose the default sorting (users for the pages they visit, authors for their own page).
Bibliometrics, in use for over 150 years now, is not a game. That's like arguing there is no value in the PageRank algorithm, and no validity in using evidence to find out which journals, researchers, or research teams publish better content.
> which benefits the big publishers
Ignoring that it helps small researchers seems short-sighted.
> A simple way to make a step ... would be to list publications by date
It's really that hard to click "year" and have that sorted?
It's almost a certainty that when someone is looking up a scholar, they are looking for more highly cited work rather than less, so the default is probably the best use of readers' time. I absolutely know that when I look up an author, I am interested in what other highly regarded work they did more than any other factor. Once in a while I look to see what they did recently, which is exactly one click away.
To be fair, you did hedge and say "almost a certainty" and maybe that's true. But speaking for myself, I generally couldn't care less about citation count. If anything, my interest in a document may be inversely proportional to the citation count. And that's because I'm often looking for either a. "lost gems" - things that are actually great/useful research but got overlooked for whatever reason, or b. historical references to obscure topics that I'm deep-diving into.
BUT... I'm not in formal academia, I care very little about publishing research myself (at least not from a bibliometric perspective. For me "publishing" might be writing a blog post or maybe submitting a pre-print somewhere) so I'm just not part of that whole (racket|game|whatever-you-want-to-call-it).
> If anything, my interest in a document may be inversely proportional to the citation count.
This makes little sense to me. The citation count gives you an idea of what others are looking at and building upon. As far as I've seen, a low citation count isn't an uncommon phenomenon, but a high citation count is. In terms of information gained while triaging papers to read, a low citation count gives you almost no information.
Don't over-interpret what I'm saying here. I'm not on some "mission from God" to ignore all high citation count papers. My point is only that sometimes I want to pointedly look through things that aren't "what others are looking at and building on", on the basis that sometimes things get "lost" for whatever reason. Ideas show up, are maybe ahead of their time, or get published in the wrong journal, or get overshadowed by a "hot" contemporaneous item, etc., and then stay hidden due to path dependence. My goal is to make an active effort to break that path dependent flow and maybe dredge up something that is actually useful but that has remained "below the radar".
The entire point is that experts are doing this triage for you, and building upon fruitful lanes of research.
To think that as an outsider to a field you are qualified to discover 'gems' (and between the lines here is a bit of an assumption that one is more qualified than researchers in the field, who are of course trying to discover 'gems') seems misguided.
Since it trivially does what you want with one click, and you’re not the audience, why the bizarre hatred of something you don’t understand?
It works great for its audience, likely better than any other product. Do you think your desire for rare outweighs the masses that don’t? If you want rare, why even use a tool designed for relevant? Go dig through the stacks at your favorite old library, bookstore, cellar, wherever.
I’d suspect if you were handed random low citation count articles you’d soon find they are not gems. They’re not cited for a reason.
Heck, want low citation count items? Go find a list of journal rankings (well crap, more rankings…) in the field you’re interested in, take the lowest rated ones, and go mine those crap journals for gems. Voila! Problem solved.
And I bet you find why they’re low ranked searching for gems in slop.
I have no idea where you got anything about hatred, or any idea that there's anything here I don't understand. I just wanted to make the point that there are, in fact, people out there who are not singularly focused on citation count.
That said, I personally don't have any problem with Google Scholar since you can, as you say, trivially sort by date.
But Google remains focused on popularity because that is optimal for advertising, where large audiences are the only ones that matter and there is this insidious competition for top ranking (no expectation that anyone would ever want to dig deep into search results). That sort of focus is not ideal for non-commercial research, IMHO.
I wish GScholar wouldn't embrace bibliometrics so much. Sort papers by date (most recent first) by default on an author's page rather than by citation count, or at least give authors the choice to individually opt in to sorting by date by default.
Something I'd really like is for PHP to somehow be stricter on the number of arguments passed to a function.
As of now, PHP emits an error if arguments are missing but not if there are too many.
A way to bake that in without breaking old code would be to allow function definition to put an explicit stop to the argument list, for example using the void type keyword:
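Something along these lines (purely illustrative; this syntax does not exist in PHP today):

    // Hypothetical: "void" closes the argument list, so any extra argument
    // would raise an ArgumentCountError instead of being silently ignored.
    function add(int $a, int $b, void): int
    {
        return $a + $b;
    }

    add(1, 2);    // fine
    add(1, 2, 3); // would throw ArgumentCountError under the proposal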
A few months ago I discussed this on the development mailing list and people seemed to agree, and some even suggested that this would be a good default without the keyword thing I suggested. But I never got the time to properly write an RFC. There is already an old one from years ago that was voted down, but I was told it predates the time when anything strict- and typing-related was considered important in PHP. If anyone's up to it, please write this RFC :) !
If I understood your proposal correctly, you add an explicit stop to a function to get the new behavior, whereas my proposal adds an attribute to keep the old behavior.
So if I'm understanding this right, yours would require updating every function signature to get the new behavior, rather than marking a few functions to keep the old behavior and getting the new, better behavior for every other function for free.
Oh okay, you would do it in reverse. I strongly agree with that. But that means the new feature is opt-out rather than opt-in, and it may break some old code. Maybe it should be done in two steps (opt-in + deprecation, then opt-out).
Typically you emit E_DEPRECATED for a full major version, then in the next major version you throw an error, e.g. if it landed in PHP 9.0, non-compliant calls would emit E_DEPRECATED, and PHP 10.0 would start throwing errors.
One of the main advantages of actually allowing more arguments is forward compatibility:
You can, within a library, provide an additional argument to a callback without actually introducing a BC break for all users.
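For instance (a minimal sketch, with made-up names):

    <?php
    // Library code, version 2: now passes an extra argument to the callback.
    function each_item(array $items, callable $callback): void
    {
        foreach ($items as $index => $item) {
            $callback($item, $index); // the second argument is new in v2
        }
    }

    // User code written against version 1 keeps working, because PHP
    // silently drops the extra argument passed to the closure.
    each_item(['a', 'b'], function ($item) {
        echo $item, "\n";
    });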
My favorite approach would be allowing too many args on dynamic calls (closures, and function calls with dynamic name, not method calls in general) and otherwise rejecting it.
There's no need to have this in the language just to solve the case you describe.
"function current_thing(id: Int, callback: Fn(Int)) {}" and, when you decide you need more you have a myriad of options to add these. From the despised "function current_real_thing(id: Ind, success_callback: Fn(Int), error_callback: Fn(Error)) {}" to "some_namespace:current_thing(...)" via "OtherClass::current_thing(...)" to "load current_thing from NewImplementationOfThing" and so on.
Being strict and explicit isn't opposed to being flexible. And strictness and explicitness is most often a predicament to allow for future change, rather than hampering that future change.
It's far easier to refactor, maintain, test and reason about strict and limited implementations than to do so with dynamic, runtime-magically-changing implementations.
I've found this approach works best in languages with method overloading. In PHP it felt quite limiting, and it also adds complexity and overhead from wrapping.
But I have no hard evidence at hand, only how I experienced that in PHP.
I'd be curious to read about what percentage of active PHP devs use the recent features. The last time I worked in a PHP codebase (2020?) it was half PHP 5 (bad) and half PHP 7 (much nicer). Curious if there's any real info out there on this.
PHP 5 is as close to phased out as it gets at this point. No doubt it's still in a lot of legacy enterprise codebases (lots of breaking changes going from 5 to 7 or 8), but outside of that no one is using it.
Yeah, and I just finished porting an enormous amount of production code from PHP 5 to 7.x before fully moving it to 8. There are so many breaking changes in each major version that when you have a lot of live projects and clients don't have the budget to pay you to upgrade them, they can lie stagnant for years, way past EOL. It would have been nice to know, for instance, that future versions of PHP would throw warnings about undeclared variables or inaccessible named properties of "arrays", which could previously be relied upon to be false-ish.

That's a major pain point in codebases that treated arrays as simply dynamic objects that could be checked or defined at will. Lots of isset() and !empty() and other BS. Fine, but it takes time to sit down and check it all. I really preferred it when it let you just screw up, or try to access a null property, or define a variable inside a block and access it later, without throwing any errors at all. Nothing about its actual functionality has changed in that regard; it's just errors you have to suppress or work around more verbosely. In PHP 8 you can still do this:
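(Something along these lines, assuming $a comes from somewhere above:)

    if ($a) {
        $previouslyUndefined = true;
    }

    if ($previouslyUndefined) {
        // ...
    }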
PHP still knows what $previouslyUndefined is or isn't at the second if statement, but it'll throw an error now in the first statement if you hadn't declared it outside the block. Why? Who cares? Scope in PHP is still understood to be inline, not in block; there is no equivalent to let vs var in JS. Stop telling me where I can check a variable if you're not enforcing various kinds of scope.
Your $previouslyUndefined thing as something that's changed, as far as I know, isn't true? Unless I've missed some very recent change.
If $a is true, that snippet will just execute with no errors. If $a is false you'll get a warning trying to check $previouslyUndefined in the second if. That behavior's been the same for a very long time. The blocks don't matter for scope but the fact that you never executed the line that would have defined the variable does.
Similarly, warnings on accessing array keys that don't exist, that's been a thing forever too. Pretty sure both go back with the same behavior to PHP 4, and probably earlier.
Yes, and this is incredibly annoying. Many packages add them as a dependency, and then you get subtle bugs because of it. Or worse, they add a dependency for the polyfill that is related to an extension and suffer performance issues when the extensions are not installed; yet no warning is output.
I think this would be nice ergonomically, from a coding perspective, but I'm curious as to how it would be a security threat to pass too many arguments. What's the potential exploit here?
An exploit, I don't know, but like any stricter type verification it would catch some bugs for sure. Note that builtin functions already throw an ArgumentCountError when passed fewer OR more parameters than the signature allows. My proposal is to (optionally, at first) make this behavior consistent for user-defined functions.
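For example:

    <?php
    // Builtin function (PHP 8): too many arguments throws ArgumentCountError.
    try {
        strlen("abc", "extra");
    } catch (ArgumentCountError $e) {
        echo $e->getMessage(), "\n";
    }

    // User-defined function: the extra argument is silently dropped today.
    function greet(string $name): string
    {
        return "Hello " . $name;
    }
    echo greet("world", "ignored"), "\n";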
The trouble for me, where the rubber meets the road, is external API calls that spread their arguments into a PHP function that takes a bunch of args. So I would love a way to detect if they're sending too many, which I don't think currently exists (?) but not at the expense of breaking the API if they actually do send too many.
It might be too big a change on a language level given this has been in since forever, but it might be picked up by static analysis / a linter. I'd argue it's always better to have additional protections like this in a linter as the process of adding linter rules is easier and less impactful than making a language change.
It's also always preferred to not add anything to the language imo; in this case, I'd opt to have the interpreter emit a warning or info message. It's not broken, it's a developer error.
Indeed on the PHP internals mailing list some people were saying that it would be better to entirely deprecate passing extra arguments to a non-variadic function, without adding syntax/keyword to the language.
I don’t really understand the issue. Already if you have a mismatch, the only way you’d ever know is through static analysis. It will run and maybe crash during run time. I always joke that changing a function signature is the single most risky thing you can do in php (especially if you have any dynamic dispatch). Making it even more risky isn’t the right answer, IMHO.
Oh, and doing this would literally break class autoloading in symfony, and even the engine itself, which relies on this feature.
> the only way you’d ever know is through static analysis
Not for builtin PHP functions which already throw errors on arity mismatch.
> this would literally break class autoloading in symfony, and even the engine itself, which relies on this feature
I don't understand. Could you point to where in the Symfony code it relies on being able to wrongly call a function with more arguments than it expects and will use?
For variadic functions there is the ... operator, already in the language since version 5.6, and my proposal wouldn't break that. Also note that builtin functions already emit a deprecation warning in PHP 8 when called with too many arguments.
Of course! And I strongly agree that static analysis is a good thing. But PHP still is a dynamically typed language and most of its development and usage is done with this dynamic approach in mind. It's not a XOR either, we can have better dynamic error reporting AND develop better static analysis tools at the same time.
Also, due to the nature and usage of PHP, some things cannot be statically analyzed because they're inherently dynamic. A simple example would be MVC frameworks where the routing is done like /controller/action/param1/param2/param3, where "controller" references a class and "action" references a method that receives the "paramN" values as arguments through array unpacking: `$ctrl->$actn(...$args);`. In such situations it would be nice to have errors/exceptions raised automatically when the URL is wrong (not enough OR too many arguments), without having to manually check everything inside each method, since with PHP 7 and 8 we've been moving away from long lines of isset() and !empty() (and checks such as is_numeric(), thanks to argument typing).
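A stripped-down sketch of that kind of router (all names made up):

    <?php
    // e.g. the URL /user/show/42/garbage
    $segments = explode('/', trim('/user/show/42/garbage', '/'));
    [$controller, $action] = [$segments[0], $segments[1]];
    $args = array_slice($segments, 2);

    class UserController
    {
        public function show(int $id): void
        {
            echo "User ", $id, "\n";
        }
    }

    $class = ucfirst($controller) . 'Controller';
    $ctrl = new $class();
    // The trailing "garbage" segment is silently ignored today; stricter
    // arity checks would surface the malformed URL automatically.
    $ctrl->$action(...$args);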
You should look at `func_get_args()` usage in the wild. This is sometimes used for (mostly outdated) good-enough reasons and doing this might break it?
I know of func_get_args, but proper variadic functions have been a thing since PHP 5.6 (released more than 10 years ago) using the ... operator. Also, my initial proposal doesn't break existing code :).
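For example, the old runtime pattern versus the explicit variadic signature:

    <?php
    // Old pattern: the signature says nothing, arguments are fetched at runtime.
    function sum_legacy()
    {
        return array_sum(func_get_args());
    }

    // Since PHP 5.6: the signature itself declares that extra arguments are wanted.
    function sum(int ...$numbers): int
    {
        return array_sum($numbers);
    }

    echo sum_legacy(1, 2, 3), "\n"; // 6
    echo sum(1, 2, 3), "\n";        // 6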
To be specific about static analysis: Lots of tools catch this. Sure, making some checks native would be nice, but for instance PHPStan always catches this, and more.
Regardless of the 'improve the language' angle: if somebody isn't running PHPStan (or Psalm, Sonar, etc.), they're missing out.
PHPStan is currently so good that using it should be non-negotiable. So the question would then even be: "I'd like rule 123 of the tool to be native; who helps with the RFC?"
I find these tools not to be too useful. The things they catch feel like trying to force it to be an entirely different language, and they don't actually help with maintaining code. Then again, I'm usually writing low-level libraries that take advantage of quirks in the language for performance:
$a = &$arr[]
Gives you a reference to null and appends it to the array. Then you can pass $a to something to mutate the tail “from a distance”. Most people never need this, nor should they use it. But when you are writing a streaming parser, it is quite handy as a one-liner instead of writing
$a = null
$arr[] = &$a
or keeping track of the current index and dealing with off-by-one errors.
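For example (a toy illustration, not the actual parser code):

    <?php
    $arr = [];

    // Appends a new null element and binds $a to it by reference...
    $a = &$arr[];

    // ...so the tail of the array can be filled in later, "from a distance".
    $a = ['type' => 'chunk', 'bytes' => 128];

    var_dump($arr); // array(1) { [0] => array(2) { ... } }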
For applications, these static analysis tools are great. For libraries, not so much.
This is what I've always done naturally for the 20+ years I've been writing CSS. The only difference is that I put animation at the end, probably because it came much later than the rest.
People interested in this will probably also like reading this friendly introduction to differential privacy: https://desfontain.es/blog/friendly-intro-to-differential-pr..., which is friendly yet goes into a lot of details and techniques in a long series of blog posts.
No, but if you have some sort of static content hosting set up (like a S3 bucket), it shouldn't be difficult to set up publishing to that with actions. It's also got project wikis built in.
I don't get where the `-` comes from in the `key-value` result lines after the "refactoring" heading. I feel like it should stay a `,` as at the beginning. Can someone more knowledgeable in Prolog explain that? Is it because of a hidden use of the `to_list` predicate that comes later in the post?
(The initial version of this comment missed the point of your question; sorry.) The author says:
> We also store pairs as the pair type (Key-Value), instead of two separate values. This makes easy to serialize a dictionary into a list of pairs, which are sortable using the builtin keysort/2.
`Key, Value` is two values, not one. I suspect something like `kv(Key, Value)` would work as well.
By the way, I disagree that the refactored version doesn't cut; `-> ;` is syntactic sugar over a cut.
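For what it's worth, the Key-Value pairs slot straight into the builtin keysort/2 the author mentions:

    ?- keysort([banana-3, apple-1, cherry-2], Sorted).
    Sorted = [apple-1, banana-3, cherry-2].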
It's purely a convention to use terms of the form "key - value" for 2-tuples here (or '-'(key, value) in functional/canonical notation, which can be used as well). Minus is just used because it's already predeclared as an infix operator; indeed ','(key, value) could be used as well, but comma is also used as the argument separator and for conjunctions in clause bodies, and thus tends to be avoided. You can also see '=' being used for the same thing, e.g.
[ key = value, ...]
(for example, as used for representing attributes by SGML/XML parsing libs for SWI, Quintus/SICStus, and others), not to be confused with '=' being interpreted as unification operation in goals/clause bodies.
If you think about it, the simplest convention in Prolog to represent "assignment" of a value to a symbol (not a variable) would be
key(value).
That is, to use the "key" atom as functor itself, rather than use functors/operators in ad-hoc ways. This is exactly what Quantum Prolog can do (optionally, and in addition to ISO Prolog's conventions).
Specifically, if you have a list of fact-like terms
L = [ p(1), q(2), r(what(ever)) ]
then Quantum Prolog can answer queries against such term list, just like answering against the global default database eg.
call(L ?- q(X))
binds
X = 2
and would also bind additional values for q(X) on backtracking if the term list contained any. This is a natural extension to regular querying in Prolog because a term list [a, b] in Prolog's square bracket notation is just syntactic sugar for using the dot operator
'.'(a, '.'(b, []))
and a Prolog program is syntactically just a list of clause terms.
In the container planning demo on the Quantum Prolog site [1], this feature is used for backtracking over (un)loading and travelling actions which would normally change state via destructive assert and retract calls and hence not allow backtracking to search for optimal sequences of actions.
You can also see that in the first call to lookup/3 where there's no -/2.
If I understand correctly, that's what the OP is asking: Where did the -/2 come from, not what it's for.
The call with the -/2 is under the heading "Refactoring the dictionary" so it's possible the author mixed up the implementations while writing the article and listed the output of an implementation that represents key-value pairs as -/2 terms.
The refactored version makes more sense btw and indeed I see the author switches to K-V later on in the article.
[1] https://pyinstaller.org/