
The Django Project Debates User Tracking - miiiiiike
https://lwn.net/SubscriberLink/707443/5710ef3733710462/
======
czep
Why does it not suffice to examine 'pip install Django' metrics from PyPI?
That would be a reliable indicator of the relative popularity of the package
against other packages on a level playing field.

While it would overcount the number of true installations of projects using
Django, judging by the number of times I spin up a VM for testing, I would
still argue that would be a better metric than a custom GA integration for
which you'd have no relevant point of comparison. Even if they were to make
this opt-out, what would they compare it to?

A: "Based on our custom GA developer tracking, we count 400,000 new Django
projects this month."

B: "Django is the 4th most frequently installed third party Python package,
based on the Python package index."

Personally I'd trust statement B more than A. No one can independently verify
statement A.

~~~
folz
I've seen many CI environments take forever because they download Django from
PyPI and forget to cache it. Downloads != usage.

~~~
icebraining
But if that CI server creates a VM or container and uses runserver, you have
the same problem.

~~~
JshWright
How so? The analytics service would dedup the IP address.

~~~
quesera
Very large businesses with dozens of Django installs could be behind a single
IP address.

~~~
JshWright
That may not be a problem if all you care about is the number of
organizations.
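
To make the exchange above concrete, here is a minimal sketch of the two ways
of counting. The `Ping` record, its `install_id` field and the sample values
are hypothetical and only for illustration; nothing here comes from the actual
proposal or from how any analytics service works internally.

```python
from collections import namedtuple

# Hypothetical ping record: the source IP plus a per-install identifier.
# Both fields are assumptions for illustration only.
Ping = namedtuple("Ping", ["ip", "install_id"])


def count_by_ip(pings):
    """Deduplicate on IP address: roughly 'number of networks/organizations'."""
    return len({p.ip for p in pings})


def count_by_install(pings):
    """Deduplicate on the per-install identifier: 'number of installs'."""
    return len({p.install_id for p in pings})


# One large business with three installs behind a single address,
# plus one solo developer who runs runserver twice.
pings = [
    Ping("203.0.113.7", "big-corp-app-1"),
    Ping("203.0.113.7", "big-corp-app-2"),
    Ping("203.0.113.7", "big-corp-app-2"),  # repeated runserver, same install
    Ping("203.0.113.7", "big-corp-app-3"),
    Ping("198.51.100.4", "solo-dev"),
    Ping("198.51.100.4", "solo-dev"),
]

print(count_by_ip(pings))
print(count_by_install(pings))
```

Deduplicating by IP collapses the three installs into one, which is fine if
organizations are what you want to count. A per-install identifier has its own
bias: a CI job that starts from a fresh VM or container each time would
presumably generate a new one on every run and be overcounted.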

------
yladiz
Before reading the article I was very much against it. After reading it, the
idea seems a little more reasonable, but I would strongly prefer two things if
this were ever implemented: 1) don't use Google. There must be services that
will provide analytics pro-bono/really cheaply for open source/non-profits
that aren't tied to a company with terrible privacy track records like Google.
2) make it abundantly clear that this will happen and explicitly give the
opt-out instructions the first time (if it is indeed opt-out). As in, "We are
enabling user tracking for better usage statistics. If you would like to
opt out, please type <...>." I know that I rarely read changelogs, and things
that aren't presented to me at installation time would probably sneak in,
whether or not that is the Django team's fault. I'd worry, though, that
notifying users directly isn't easily possible through pip
installations/setup.py.

However, would it be statistically very useful compared to the PyPI
installation numbers? Sure, Python is different from NPM, because NPM almost
always installs packages locally whereas Python installs globally by default,
but the numbers must still be high, as Django is likely one of the most
installed packages from PyPI and in Python-land in general. And as _czep_
points out, because they would only be tracking themselves, it would be hard
to compare the numbers to anything. They would be useful as a total count, but
they wouldn't be any use for comparing against other packages because the kind
of data would be different: Django would have usage statistics whereas PyPI
has installation/download statistics.

I'm also surprised this is even necessary, since the main purpose is
supposedly to be able to talk to potential investors for the DSF with concrete
numbers. Is Django being familiar to basically every Python developer not
enough? Before I'm more open to the proposal, I'd really want to know whether
investors have explicitly said they want usage data, rather than relying on
the nebulous idea that it may make it easier to raise money.

~~~
forgotpwtomain
> 1) don't use Google. There must be services that will provide analytics
> pro-bono/really cheaply for open source/non-profits that aren't tied to a
> company with terrible privacy track records like Google.

As an occasional Django user -- 100% on this. It's not difficult to store and
persist some key-value pairs from a POST request; that certainly doesn't
require Google Analytics.

~~~
delroth
Storing and persisting is the easy part. Generating useful aggregations, doing
decimation on the data to avoid having an unbounded set of data while keeping
some historical values, and all this kind of work is why people use GA and
other analytics software. And all this is far from trivial.

~~~
forgotpwtomain
Depends on what your visualization requirements are; probably most of what the
Django project is looking for can be done in a few hours with matplotlib.

------
was_boring
I use Django professionally, and if tracking usage helps guide development or
attract sponsors to achieve higher quality -- I'm all for it.

There is a problem to be solved (how to make OSS sustainable), and I'm both
interested in solving that problem and trying different approaches to solve
it.

(edited for less use of the phrase "I'm all for it")

~~~
ubernostrum
Personally I'm not opposed to a popcon-style thing that just lets us estimate
"X million people use Django". But it's increasingly looking like it's
impossible to put together such a thing in a way that's both A) useful and B)
not going to cause privacy issues.

------
msane
Someone proposed tracking Django developers using the Django command line?
What a ludicrous and creepy idea.

edit: why downvote? that's what it says:

> the developer commands: startproject, startapp, runserver

------
yeukhon
So even if we do have an accurate usage count, say 10 million, so what?
What's the Foundation's plan to get funding?

I think they should run an annual campaign like Mozilla and Wikipedia. The
spending of the money should be 100% transparent. I am not really sure why we
need a Foundation. I get the hosting cost, and rewarding people to work on
very difficult features and enhancements, but what else? Conference costs and
scholarships? What else?

------
rokosbasilisk
I do not support user tracking. I'd fork it at that point.

------
cyberpanther
I use Django and don't mind being tracked if it helps development. However,
the proposed tracking sounds like hit tracking, which doesn't give you any
meaningful numbers, only trends. So I think tracking pip installs would give
you the same trends.

------
Walkman
The best part:

"It is encouraging to see that a community can discuss such issues without
heating up too much and shows great maturity for the Django project."

~~~
daenney
I agree though. The Django (and Python) community in my experience has been
good at actually debating issues on their merits, and at trying to keep their
own feelings/opinions with no facts to back them up out of it. Of course this
doesn't always work, and there are always going to be some comments that don't
follow those principles, especially with more controversial topics.

------
icebraining
jezdez' proposal seems to be rather reasonable: just force the user to
explicitly select yes or no. That gets over the objection that people will be
too lazy to opt in, since the effort is the same. And it removes another
source of bias, which is the disabling of the tracking by redistributors like
Debian, since the user does provide explicit permission.

~~~
toyg
If forced on screen with an honest message, people will just opt out in droves
and make the numbers as useless as the PyPI-download ones.

This seems such a huge waste of time and effort. If they can't get funding by
showing massive PyPI numbers, they won't get funding by showing massive
startapp numbers.

------
twsted
The threat to add user tracking could be the best incentive for me to donate
more to the project.

------
Lazare
I think this is a strong idea, and I don't see any issues with the proposed
implementation using Google Analytics.

Certainly it seems more practical than any of the proposed alternatives
suggested here. (E.g., micropayments. Come on, that's not even plausible...)

------
smoyer
I allow both Eclipse and Firefox DE to collect usage and bug information
during my use of those systems ... I feel there are a few keys to making this
decision for both platforms:

\- I can opt out if I want to

\- I can see what's sent if I want to

\- The information is anonymized and aggregated

I would assume that Django developers would feel the same way as I do if there
were these guarantees - that it's also in my interest for the software to
improve.

------
Rondom
I think they did a very good job in discussing it openly instead of going the
homebrew-way.

------
toyg
What if, instead of tracking, they added micropayments? Have a very simple way
to donate $1 every time you run startapp or something like that, and boom,
profit.

~~~
cauterized
Would you experiment with an unfamiliar framework for the first time if it
cost $10?

How many $10 frameworks would you be willing to pay for if you didn't know you
were going to use them?

Would you pay $10 to install django to spin up a new env to build a pluggable
library for it that you intend to open source?

What about $10 to populate your environment each time you run a build on
circleci?

~~~
toyg
It doesn't have to be forced, just a little nag (and $10 is too much, I'm
thinking max $5). You could show it the 10th time you use runserver, or after
5 hours of uptime when you log in to the admin panel, and so on. People who
don't want to pay wouldn't pay, but you need only a relatively small number of
people to raise enough to pay a few salaries. In a lot of cases the cost could
just be charged to the final client anyway, not the developer.

------
pryelluw
Have they tested charging for Django? I'd pay a reasonable fee to use it. I
mean, it's the least I could do (aside from donating sporadically).

~~~
rantanplan
But no one else would.

The very next minute a fork of a free version would ensue.

In an era when almost all similar frameworks are free, charging for it seems
like a really bad idea for its future.

~~~
pryelluw
Yeah, good point. It's the shitty side of open source. People expect
everything for free as in free beer.

------
JupiterMoon
Oh well Django had a good run for me but I don't use spyware. I guess
something similar can be built up using Flask.

------
ris
I'm really not sure posting to HN is what LWN subscriber links are for.

~~~
DanBC
[https://news.ycombinator.com/item?id=5688151#5688887](https://news.ycombinator.com/item?id=5688151#5688887)

> FWIW I (as the editor of LWN and the author of the article) do not mind the
> posting of this link. It has brought in 16,000 people (at last count), many
> of whom are probably unfamiliar with LWN. Some subscriptions have been sold
> in the process.

> Certainly I don't want large amounts of our content to be distributed this
> way, but an occasional posting that puts an LWN article at #1 on HN is going
> to do us far more good than harm.

> (That said, I do appreciate your concern!)

[https://news.ycombinator.com/item?id=3793183#3793448](https://news.ycombinator.com/item?id=3793183#3793448)

~~~
ris
Fair enough.

------
rcarmo
I use Django quite a bit, and would immediately disable any such tracking
mechanism, even going to the extent of maintaining my own fork if necessary.

Having this on tools (like brew) is sort of OK because you can disable it and
not risk having it deployed to production. Having it on a library is
senseless, risky in many regards and likely to get it banned from, say, public
contracts.

It is also a likely hook for exploitation, but I'll need to see an
implementation first. Which I sure hope won't happen.

------
myf01d
The problem is that Django itself as a framework, and Python as a slow
infrastructure for it, are getting too old. I love Django, but it grows too
restrictive as projects get more complicated (the ORM and template rendering,
for example), not to mention the slow performance compared to new languages
like Go and Elixir, which is actually Python's responsibility, not Django's.

Django is a monolithic framework that wants to do everything while there are
good and even superior alternatives (SQLAlchemy, Jinja2, WTForms), which makes
things harder for its developers.

~~~
kirkdouglas
You can easily replace the template engine and ORM in Django when your project
becomes large. I've personally done this several times.

~~~
nsomaru
You easily ripped out the ORM and templating engine on a large project?

Would love to know how.

