
Analyzing Django requirement files on GitHub - jayfk
https://pyup.io/posts/analyzing-django-requirement-files-on-github/
======
Alex3917
> Among all projects, more than 60% use a Django release with one or more
> known security vulnerabilities. Only 2% are using a secure Django release.

Probably because 95% of projects on GitHub are homework assignments for job
interviews that never get updated after they're submitted.

~~~
rbanffy
I tend not to pin versions. When I generate requirements files, I use
pip-chill (disclaimer: I made it) to avoid listing redundant dependencies.

~~~
Alex3917
Interesting. To me pip-tools seems like a better fit for this use case, since
it lets you specify a requirements.in file that gets compiled to a
requirements.txt file. This allows you to selectively pin dependencies of other
packages if you have some use case for doing so.

Something related that would be nice, though, is having pip list -o output
only the dependencies listed in the requirements.in file if it exists, or
otherwise only the primary dependencies.
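
A rough sketch of that filtering idea, assuming the outdated list comes from
`pip list --outdated --format=json` (a list of objects with a `name` key) and
that requirements.in holds one requirement per line. The helpers
`primary_names`, `normalize` and `filter_outdated` are made up for
illustration; they are not part of pip or pip-tools.

```python
import re


def normalize(name):
    """Normalize a package name PEP 503 style, so Django and django match."""
    return re.sub(r"[-_.]+", "-", name).lower()


def primary_names(requirements_in_text):
    """Collect normalized package names from requirements.in-style text."""
    names = set()
    for line in requirements_in_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blanks and options (-r, -e)
            continue
        # The name is everything before the first specifier, extra, or marker.
        name = re.split(r"[\s=<>!~\[;]", line, maxsplit=1)[0]
        names.add(normalize(name))
    return names


def filter_outdated(outdated, requirements_in_text):
    """Keep only the outdated entries that requirements.in lists directly."""
    wanted = primary_names(requirements_in_text)
    return [pkg for pkg in outdated if normalize(pkg["name"]) in wanted]
```

Feeding it the parsed JSON from pip and the contents of requirements.in would
leave only the directly listed packages in the report.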

~~~
rbanffy
You can tell it to generate the list without version numbers so you can get a
clean requirements.in for pip-tools to compile.

------
anentropic
Most Django sites probably aren't public GitHub projects, though.

These are more likely Django apps... it'd be interesting to consider how many
of them shouldn't even be mentioning Django at all in their requirements.txt
files to avoid clashing with the Django version of the project you're
importing their app into.

~~~
ubernostrum
This gets to the setup.py/requirements.txt confusion.

setup.py is for specifying dependencies of a piece of code you're
distributing, while requirements.txt is for specifying a known-good
environment to run a piece of code in.

For example, my own personal site has a requirements.txt specifying Django
1.11.2 because that's the version I deploy and test on. But it uses
several applications I've written and distribute, and those use setup.py to
specify a dependency on Django 1.8, 1.10 or 1.11 (and those all use tox/Travis
to test on the full matrix of Python and Django versions they support).
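
To make the distinction concrete, here is a minimal sketch rather than the
actual configuration of the projects mentioned. The set of supported series
mirrors the versions named above, and `release_series` and `pin_is_supported`
are hypothetical helpers. In a real package the supported range would be
declared in setup.py's `install_requires`, while the exact pin would be a line
in requirements.txt.

```python
# Series a reusable app might declare support for (assumed, taken from the
# versions named in the comment above).
SUPPORTED_DJANGO_SERIES = {"1.8", "1.10", "1.11"}


def release_series(version):
    """Reduce a full version such as '1.11.2' to its series, '1.11'."""
    major, minor = version.split(".")[:2]
    return f"{major}.{minor}"


def pin_is_supported(requirement_line, supported=SUPPORTED_DJANGO_SERIES):
    """Check an exact 'Django==X.Y.Z' pin against the supported series."""
    name, sep, version = requirement_line.strip().partition("==")
    if not sep or name.strip().lower() != "django":
        raise ValueError(f"not an exact Django pin: {requirement_line!r}")
    return release_series(version.strip()) in supported


# The site's pin (requirements.txt style) falls inside what the apps accept.
print(pin_is_supported("Django==1.11.2"))
# A pin on a series the apps do not list would be flagged.
print(pin_is_supported("Django==1.9.13"))
```

The point of the split is that the app states what it can work with, and the
deployment states exactly what it runs.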

------
minimaxir
A note about the use of BigQuery here: this problem is one of the very few
cases where there is _so much data_ that you'll actually have to pay money to
run the query. (The query processes 2.21TB of data; you get 1TB free, then
$5/TB.)
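
As a back-of-the-envelope check on those numbers, assuming the free tier and
per-TB price quoted above apply to this single query and nothing else is
billed (the function name and defaults are illustrative, and actual billing
may differ):

```python
def estimated_query_cost(tb_processed, free_tb=1.0, usd_per_tb=5.0):
    """Estimate cost in USD: only data beyond the free allowance is billed."""
    billable_tb = max(0.0, tb_processed - free_tb)
    return billable_tb * usd_per_tb


# 2.21TB processed, 1TB free, $5 per TB after that.
print(f"${estimated_query_cost(2.21):.2f}")  # prints $6.05
```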

~~~
jayfk
Yep. If I remember correctly I paid ~$8 to run the query. That's one of the
reasons I've made the raw result public on
[https://github.com/pyupio/github-requirements](https://github.com/pyupio/github-requirements)

------
neonkiwi
Oh my, X%! Did the upvoters see the many placeholders in the article or were
they asked to vote by someone?

~~~
samcheng
It's still pretty interesting; you can eyeball those other percentages. The
"Most projects are still on Django 1.8" insight is still a good one.

Not sure how this short article could have rocketed to the very top of the
front page so quickly, though...

~~~
neonkiwi
There might be insight to be gleaned from a better dataset than public repos
on GitHub. It's a big leap to make inferences about the use of Django in the
wild from this.

~~~
jayfk
I'd love to get my hands on a better dataset than public repos on GitHub. Any
ideas?

------
metaphorm
I'm happy to see that people are using the LTS release as intended. Not
surprised at all that the newest releases are the least used ones. More than a
little surprised that version 1.6 still has any users at all, let alone as
many as it does.

For those not familiar with Django's release history, the 1.6 -> 1.7 major
release was a very large change in how database migrations are handled. In
1.6 (and earlier) there was no built-in tool for it, but a very popular
Django extension library called South was the standard. In version 1.7
the creator of South (Andrew Godwin) wrote a migration tool for Django core
that was based on his previous work with South. There is a migration path from
South to Django core migrations, and it's not that scary to do, but it's a
little work. That was several years ago at this point, though. I wonder if some
projects just abandoned upgrading at 1.6 because of this.

~~~
ubernostrum
Any piece of software that doesn't essentially force an upgrade on you will
end up used in a lot of deployments that never upgrade. Not much we can do
about that.

Though you're right that people tend to "stick" on a version where the next
step up is a trickier upgrade. That's happened a couple times in Django's
history -- a lot of early deployments (pre-1.0) pinned for a long time on 0.91
since the Django ORM got completely rewritten for 0.95.

And post-1.0, the switch to class-based generic views also left some people
behind. 1.11 is likely to be "sticky" too since it's the last version that
will support Python 2 (the next release -- Django 2.0 -- will be
Python-3-only, and 1.11's LTS support cycle is timed to expire when upstream
support for Python 2.7 does).

------
ftxrcc
Team 2% unite. 1.11 rocks.

~~~
Finnucane
Since 1.11 is an LTS release, presumably its share will increase. It'll be the
default for new projects now.

------
arthurk
Any reason why you didn't exclude forks from this data?

~~~
minimaxir
It doesn't seem possible to reliably identify forks from the BigQuery GitHub
dataset (you could _estimate_ by deduping repos of the same name), which seems
like a weakness.
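
A minimal sketch of that name-based estimate, assuming the input is a list of
"owner/name" strings pulled from the query result. `dedupe_by_repo_name` is a
hypothetical helper, and which copy counts as the original is arbitrary here
(the first one seen wins).

```python
def dedupe_by_repo_name(full_names):
    """Keep the first 'owner/name' seen for each repository name."""
    seen = set()
    kept = []
    for full_name in full_names:
        # Compare on the part after the owner, ignoring case.
        repo = full_name.split("/", 1)[-1].lower()
        if repo not in seen:
            seen.add(repo)
            kept.append(full_name)
    return kept


repos = ["django/django", "alice/django", "bob/blog", "carol/Blog", "dave/shop"]
print(dedupe_by_repo_name(repos))
# prints ['django/django', 'bob/blog', 'dave/shop']
```

Note that unrelated projects sharing a common name (such as "blog") get
collapsed too, which is why this only gives an estimate.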

