
How The New York Times Uses Software to Recognize Members of Congress - beriboy
https://open.nytimes.com/how-the-new-york-times-uses-software-to-recognize-members-of-congress-29b46dd426c7
======
davidkuhta
The example made me lol: "Mitch McConnell (red, almost certainly - confidence
= 100.0)"

On that note, they could utilize the box color to match the party affiliation.

~~~
danschumann
No no no! Snapchat filters with a donkey or an elephant!! XD

~~~
TomK32
as long as you don't use that to retrain the software...

------
nkassis
I'm waiting for journalists to walk around with a Google Glass-type device to
do this on the fly. Bonus: it could record what they see and hear for later use.

~~~
SurrealSoul
I can't wait to live in that future, glasses that let you know if you bumped
into anyone famous

~~~
darkkindness
I get the excitement, but this sounds like a terrifying future for famous
people.

~~~
mygo
also sounds like a terrifying future for child kidnappers or fugitives or
anyone on some wanted list.

whether they deserve to be on that wanted list or not

I mean right now we have cameras everywhere but they’re not all HD so the govt
can’t run facial recognition reliably plus they need access to all the
cameras. But when people voluntarily put facial recognition devices on their
own selves, en masse? Wow.

Sounds like something Google or Amazon could benefit from practically giving
away.

~~~
J-dawg
Am I right that sex offender registers are public in the US? And they
sometimes include people convicted of very minor crimes?

Yep, definitely a Black Mirror episode in the making.

~~~
mygo
you are correct

------
rhacker
I was hoping to read an article about NYTimes setting up video cameras outside
of popular restaurants in DC and using ML to perform facial recognition on
everyone to try to find members of congress and well known lobbyists. oh
well... it would be like TMZ-4-DC

~~~
Maxious
Why set up video cameras when people bring their own...

> Rachel Shorey found members of Congress at an event hosted by a SuperPAC by
> trawling through images found on social media and finding matches.

------
ericsoderstrom
The author says that training their own model would have been too hard due to
lack of training data, but evidently Rekognition had sufficient training data
to make it work? Why can't NYT use the same training set Rekognition uses?
Does Amazon somehow have a secret non-public collection of celebrity photos?

~~~
kevin_thibedeau
It shouldn't take an intern too long to collect a representative set of
Congress people and other high officials for training. Maintaining it would
not be an undue burden. That would eliminate the false positive matches for
all the unwanted celebs. Clearly Amazon's models aren't that great to begin
with so there's little reason to stick with them.

Wrap it up into a simple native app and you can bypass the MMS BS. Even
better, a sufficiently capable dev could integrate an opensource recognition
library [1] to have it entirely implemented on the device.

[1] [https://github.com/rudybrian/tuFace](https://github.com/rudybrian/tuFace)

~~~
jeremyjbowers
Hi! I'm Jeremy, one of the developers.

We'll probably work on something like this for the next version. One reason
it's harder than you think: We would have to buy / own rights to the
photographs before we could use them to train -- most of those photos are
owned by Getty or the AP. And our own photographs are perfectly lit and
square, which made them awful for training face recognition.

The other hangup (which I didn't get to in the article) is having to add /
remove people. New members are constantly being added and that's a maintenance
burden for us. Amazon usually has the new member within a day or two. (Our
team is very small and we have a lot of other responsibilities!)

But good points, definitely.

~~~
hooloovoo_zoo
"We would have to buy / own rights to the photographs before we could use them
to train..."

Is this actually true?

~~~
djrogers
Really doesn’t matter - the legal team at the NYT thinks it might be, and
lawyers exist to tell people “you’d better not”.

~~~
acct1771
And it's our job, as people who know what a computer is, to move forward
with common sense if they're overreaching, which is their job.

They have every incentive to be as conservative in their advice as possible,
and no incentives to "allow" risks. Doesn't increase their compensation any.

------
AdmiralAsshat
I can't wait to see how long it takes Congress to pass a law making it illegal
to use facial recognition software on members of Congress.

(And no one else)

~~~
2RTZZSro
Thankfully, only high capacity assault facial recognition software is likely
to be banned as a result.

------
Isamu
So you should be able to send a selfie to this api and it will tell you which
member of congress you look most like

~~~
reaperducer
Except in Illinois, where sending the data off device is illegal.

(See previous HN discussion)

~~~
Something1234
If you're going to state something is illegal, you need to provide a source.
What previous discussion?

~~~
mehrdadn
It was 6 days ago:
[https://news.ycombinator.com/item?id=17177663](https://news.ycombinator.com/item?id=17177663)

------
jonknee
It would be fun to see which members are the most requested by NYT reporters.

~~~
jeremyjbowers
Oh, that is interesting. Also, hi Jon!

------
otakucode
I have wanted for a while to build a site which trained a machine learning
system on the various data made available surrounding Congresspeople and
information on members which were eventually found to be guilty of adultery or
other similar crimes - then produce a score for every member of Congress
rating how likely it is that they are cheating on their spouse, or taking
bribes, or similar. Give them a sneak preview into the types of systems they
are aiding and abetting in the creation of. I am uncertain of whether it could
be considered defamation to have a brainless machine learning system decide
there's an 85% chance some random member of Congress is an adulterer. I don't
actually believe that any such system could ever reach any reasonable level of
actual effectiveness due to the fundamental complexities of human behavior and
circumstance, but that's not stopping the law enforcement side of things from
moving forward so I don't see why it ought to stop the side trying to point
out fundamental flaws in the strategy.

~~~
smacktoward
_> adultery or other similar crimes_

Adultery is not a crime.

(You can argue that it's an indicator of a person's character, or lack
thereof, sure. But that's something different.)

~~~
panarky
Lying about it is an impeachable offense.

~~~
smt88
Only under oath

------
mlthoughts2018
This is an embarrassingly bad approach to face recognition for a small set of
frequently photographed people.

Several comments from the article give me concern

\- They seem to think Rekognition is a panacea for their problem, but there
are many known issues with Rekognition celebrity detection. Not to mention
that the cost-per-request is often highly unfavorable compared with building a
higher-accuracy, situation-specific solution with extensions to pre-trained
models.

\- They say some interns took a “novel approach” by creating a hard coded
look-up table for disambiguating similar politician-celebrity pairs. This
creates awful tech debt and failure cases. I’m not knocking it too hard
because it’s pragmatic, which is a good sign about those interns, but this
should be seen as a necessary wart to be improved, not a point of pride.

\- As others have pointed out, even considering turnover in Congress, it seems
like people who report on Congress for their full time job should recognize
them. It truly seems like a silly, wasteful use of resources to solve this
with computer vision.
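
A minimal sketch of what such a hard-coded look-up table might look like when
layered over a celebrity-recognition response. The response shape (a
"CelebrityFaces" list whose entries carry "Name" and "MatchConfidence") follows
Rekognition's RecognizeCelebrities output; the names LOOKALIKE_OVERRIDES,
CONGRESS_ROSTER, label_for and members_in, the sample entries, and the
confidence cut-offs are all assumptions for illustration, not the NYT's code.

```python
# Hypothetical override table: name returned by the service -> the member
# of Congress it tends to be confused with. Entries are placeholders.
LOOKALIKE_OVERRIDES = {
    "Example Celebrity": "Example Senator",
}

# Stand-in for the current membership list; only these names count as hits.
CONGRESS_ROSTER = {"Example Senator", "Example Representative"}


def label_for(confidence):
    """Turn a 0-100 confidence score into a hedged phrase (cut-offs assumed)."""
    if confidence >= 99.0:
        return "almost certainly"
    if confidence >= 90.0:
        return "probably"
    return "possibly"


def members_in(response, min_confidence=50.0):
    """Return (name, phrase, confidence) for each face that maps to the roster."""
    results = []
    for face in response.get("CelebrityFaces", []):
        # Apply the hard-coded override first, then check the roster.
        name = LOOKALIKE_OVERRIDES.get(face["Name"], face["Name"])
        confidence = face["MatchConfidence"]
        if name in CONGRESS_ROSTER and confidence >= min_confidence:
            results.append((name, label_for(confidence), confidence))
    return results


# Hand-written sample response; no network call is made here.
sample = {
    "CelebrityFaces": [
        {"Name": "Example Celebrity", "MatchConfidence": 99.5},
        {"Name": "Unrelated Actor", "MatchConfidence": 100.0},
    ]
}
for name, phrase, confidence in members_in(sample):
    print(f"{name} ({phrase} - confidence = {confidence:.1f})")
```

The tech-debt concern is visible even at this size: every new look-alike pair
or change in membership means editing the two constants by hand.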

This is all consistent with what I’ve heard from colleagues at NYT data
science. As well as people I’ve known in data science bootcamps around New
York, like Insight, who heard recruiting pitches.

Their department seems self-aggrandizing, using highly overwrought
personalization models and seeming to have 538-envy for how they want their
data science work to come off despite 538 exiting, among other important
figures like Mike Bostock.

It just comes off as a place that wants to do status signalling to _seem_ like
a machine learning or data science thought-leader, but they don’t pay
competitively or do what’s needed to retain good people and would rather do
patchwork stuff like this with interns than to take the work a little more
seriously.

I don’t get the impression it’s a place serious ML practitioners would want to
go.

------
smsm42
Isn't this the same technology that would allow surveillance on every private
citizen?

> Most recently, Rachel Shorey found members of Congress at an event hosted by
> a SuperPAC by trawling through images found on social media and finding
> matches.

I bet nothing in the technology says "member of Congress" or depends on the
target being member of Congress. So anybody can mine social media and collect
surveillance data on people. And that is probably already happening.

------
asdsa5325
TL;DR: They use an API from Amazon that's already trained for Congressmen.

------
djhworld
If anything this article doesn't reflect well on Rekognition

------
DINKDINK
>Nope, it’s too hard! Computer vision and face recognition are legitimately
difficult computer science problems.

Someone is woefully ignorant of how good facial-recognition surveillance is.

~~~
SmooL
There's a difference between "difficult" and "can't be done". Yes, facial
recognition has come a long way, but it's still non-trivial to set up a custom
facial recognition service for your particular needs.

------
evan_
the obvious next step to this would be to build a mobile app with a built-in
model to recognize everyone deemed important using live video from the camera.

------
dqpb
Cool. Maybe next they can tackle subscriptions without ads.

------
rootsudo
This reminds me of Casino Royale. Wow.

------
forapurpose
Hmmm ... your job is to cover the actions of 540 people elected to DC, many of
whom you already recognize, and you can't remember what they look like? I'm
not a journalist, but that seems like an essential thing to memorize, along
with some minor metadata (locale, party, a bit of bio). Spend a weekend and do
it.

Every profession has things you can look up and things you just have to
memorize. 540 people isn't much - can sports journalists recognize 540
athletes? Otherwise you'll be in situations where you don't have an
opportunity to look them up (e.g., can't get a photo, no time, etc.), and
you'll have many false negatives: If you don't know what they look like, you
won't realize it's a member of Congress at the party with the coke.

~~~
danso
As the article states up top, there's decent churn in Congress, making this
more than a one-time or annual thing. Also, it's not just members of Congress
who are important to cover in a beat, but their senior staff members and
aides.

Spending a significant amount of time developing a process for face
memorization and undertaking it would be an example of needless/premature
optimization, especially for people who may be covering Congress tangentially.
Most of a Congress reporter's job does not depend on having random encounters
with members of Congress.

~~~
forapurpose
> Most of a Congress reporter's job does not depend on having random
> encounters with members of Congress.

So much for my fantasy of a reporter's life; press conferences and hearings
sound boring. But I will nitpick a minor point:

> there's decent churn in Congress, making this more than a one-time or annual
> thing

I don't remember the rate at which incumbents are re-elected, but it's pretty
damn high. Unfortunately, after you memorize them once, you'd only have to
learn a few more at a time.

~~~
Spooky23
The House turns over a lot. The Senate is a different story, those guys
fossilize.

~~~
forapurpose
> The House turns over a lot. The Senate is a different story

I respectfully refer the gentleperson from Spooky to the following:

[https://www.opensecrets.org/overview/reelect.php](https://www.opensecrets.org/overview/reelect.php)

 _Few things in life are more predictable than the chances of an incumbent
member of the U.S. House of Representatives winning reelection._

They don't provide a number but eyeballing the chart, I think that number
starts with a "9" over several decades, and is increasing. Here's an article
that says it was around 96.6% in 2014;[0] it must be embarrassing to find
yourself in the bottom 3.4 percent of any group.

(It also says House members are reelected more often than Senate members.)

[0] [http://www.politifact.com/truth-o-meter/statements/2014/nov/...](http://www.politifact.com/truth-o-meter/statements/2014/nov/11/facebook-posts/congress-has-11-approval-ratings-96-incumbent-re-e/)

~~~
Spooky23
You’re totally right, but being a rank and file congressman is kinda
miserable... many transition to other offices, federal/state appointments,
etc.

------
laser
_A text-based interface is easiest for reporters to use, so while texting is
slow, it’s superior to a web service in the low-bandwidth environment of the
Capitol._

This is disturbing to hear. How can our congress make the best decisions
possible if it can't access and communicate relevant information quickly? The
ROI to the United States of simply having a high-bandwidth network at this
global powerspot is so obvious that I had just assumed it was the case—so to
hear that reporters can't even use a web interface to quickly send images is
frightening if true, and perhaps even indicative of a broader issue of our
government's inability to effectively execute, partially rooted in its
inability to empower itself with the tools necessary to effectively execute.

* _Edited at burkaman's prompt to be less sensationalist_

~~~
tomatotomato37
It's the 3G connection used by the public in one of the basement floors of the
Capitol building that has low bandwidth. The Capitol has its own internal
network for congressmen and their staffers, they just don't let random
reporters connect to it

~~~
jeremyjbowers
Even worse (weirder), the Senate bans electronic devices on the floor. If a
Senator wants info, they have to sprint out one of the doors to the lobby
where they have an aide waiting with an iPad (usually).

~~~
jeremyjbowers
The House allows iPads on the floor, and reporters are allowed to bring
laptops into the gallery. It's how we get our live votes transcribed!
[https://www.nytimes.com/2017/05/08/insider/how-we-beat-the-h...](https://www.nytimes.com/2017/05/08/insider/how-we-beat-the-house-in-tallying-the-health-care-vote.html)

~~~
walshemj
Interesting, quite different to the HOC, where the result of a division (vote)
is read out quite soon after.

I have worked at large 500+ delegate conferences using parliamentary
procedures and now they often use electronic systems for both teller and card
votes which is much faster

