
Fighting Corruption in Brazil with Machine Learning - irio
https://github.com/datasciencebr/serenata-de-amor/blob/master/README-en.md
======
rmsaksida
I was a bit disappointed by this because it feels more like people trying to
get press and make a name for themselves than an actual serious endeavor at
digging through open government data with ML. The first thing you see in the
README is links to numerous social media profiles, followed by a Kickstarter
link and a BTC address. Zero technical info. Looking into the Kickstarter
(well, Catarse, the Brazilian copycat), apparently there's a young company
behind this - "Data Science Brigade". So I guess this is partly a publicity
stunt.

I'm sorry, but if you guys are serious about building this _open_ you should
make it entirely technical and keep any press/social media/financial bullshit
out of it. Entirely separated. In addition, if the project is sponsored by a
company, make that clear in a notice somewhere. It just doesn't feel honest to
do it any other way. I'm not going to contribute time or money to a
project that claims to be about "fighting corruption" but might have all sorts
of unknown interests behind it. I'd be pleased to contribute otherwise.

~~~
MichaelGG
If they are serious, they might want to do it full time. Or even hire people.
They'll require funding. Even recruiting people to work on OSS requires
marketing of sorts. You gotta convince people not only that the project is
cool, but that it's on a good trajectory. And finally, after all this, you want people
talking/following so you can push your results out.

There isn't anything inherently wrong with self promotion and often it can be
critical.

------
asimov42
I've thought about this a lot for India as well. To be realistic we would need
unprecedented levels of transparency to get the amount of data needed to get
usable results. With the amount of nepotism around, even constructing a simple
network of party heads of each state and related companies and contracts
awarded for public work would be valuable.

At this stage we really should think of it more in terms of _documenting_
corruption rather than _stopping_ corruption. When (and if) the system is
ready to change, the data would be extremely useful to see why things are
happening the way they are and work out if solutions would just move the
corruption-bottleneck rather than eliminate it.

~~~
tree_of_item
> At this stage we really should think of it more in terms of documenting
> corruption rather than stopping corruption.

This is a very good point; often I see people block this sort of discussion by
asking "Well, what are you gonna do about it??" Taking this point of view
sidesteps that question.

~~~
NhanH
Tell them that documenting will allow a better history to be recorded. Then
you can redirect the argument of "is history as a subject of any use" to an
external endpoint.

------
betolink
This is the same as "fighting massive surveillance with X"; the root of the
problem is how the system works. These kinds of problems cannot be solved by
anything but a massive change in the status quo. As long as corrupt
politicians are in power they could not care less about evidence of potential
wrongdoing. In Mexico a renowned writer once said about that: "when the
impunity is absolute the appearances are for losers."

~~~
visarga
> As long as corrupt politicians are in power

There is no way to filter the good politicians from the bad, when interests
are at stake. If you want to lessen corruption, then you need to balance the
distribution of power and encourage competition.

~~~
sebastianconcpt
I used to think your idea could be a solution, but it is too slow, and there
is also an "infinite" input of new corrupt politicians. Doing some numbers will
show that trying to vote them out every 4 years doesn't sound like a win.
Since we need defence against self-corruption, the source of all evil, the
only thing I can think of is the Non-Aggression Principle and enforcing
respect for private property:
[https://en.wikipedia.org/wiki/Non-aggression_principle](https://en.wikipedia.org/wiki/Non-aggression_principle)

------
dmix
The maze of shell companies and indirect associations would make it relatively
easy to bypass automated scrutiny. But it's an interesting idea I've thought
a lot about.

It would be great to automatically analyze things such as government contracts
and the politicians who were involved in getting them voted in. But you'd need a
data source of the social network graphs of those who are behind organizations
- which could be a requirement for bidding on government contracts. Anyone who
doesn't submit an accurate network of human contacts who will ultimately make
financial gain from a contract will be barred from all future contracts.

That would serve as an excellent deterrent for politicians who plan to use
their position of power for personal gain and promote the usage of companies
who actually deserve to be getting contracts (improving the quality of
government output).
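
To make the disclosure idea concrete, here is a minimal sketch of how a submitted ownership network could be searched for a link between a contract winner and an official. Everything in it is an assumption for illustration: the `find_link` helper, the "is owned or controlled by" edge meaning, and the entity names are hypothetical, and no such data source exists in the project discussed here.

```python
from collections import deque


def find_link(graph, start, target):
    """Return the shortest chain of relationships from start to target.

    graph maps an entity to the entities that own or control it
    (a hypothetical disclosure format). Returns None if no chain exists.
    """
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == target:
            return path
        for neighbour in graph.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(path + [neighbour])
    return None


# Hypothetical disclosures: each key is owned or controlled by its values.
ownership = {
    "Contractor Ltd": ["Shell A"],
    "Shell A": ["Shell B"],
    "Shell B": ["Politician X"],
}

chain = find_link(ownership, "Contractor Ltd", "Politician X")
print(" -> ".join(chain) if chain else "no link found")
```

A breadth-first search like this only finds links that were actually declared, which is why the proposal depends on penalties for inaccurate submissions.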

~~~
joe_the_user
To get a conviction or sanctions, one generally needs something definite. But
rules-of-thumb and appearances are useful as a way to know where to begin
investigating.

However, the problem with a public system of rules of thumb is that it can be
used by the wrong-doers to hide themselves.

The US bank system scans the cash activity of businesses to spot money
laundering and other activity. If a money launderer had the rules for what
transaction pattern was going to be red-flagged, they could structure their
transaction patterns to not be flagged while continuing to launder money.
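
As a toy illustration of why a published rule is easy to sidestep, consider the sketch below. The threshold value, the function names, and the single-rule design are all assumptions made up for this example; real bank monitoring is more involved and is not described in this thread.

```python
REPORTING_THRESHOLD = 10_000  # hypothetical per-deposit limit


def flagged_deposits(deposits, threshold=REPORTING_THRESHOLD):
    """Naive public rule: flag any single deposit at or above the threshold."""
    return [d for d in deposits if d >= threshold]


def split_below_threshold(total, threshold=REPORTING_THRESHOLD):
    """Split a total into deposits that each stay just under the threshold."""
    chunk = threshold - 1
    parts = [chunk] * (total // chunk)
    if total % chunk:
        parts.append(total % chunk)
    return parts


def flagged_by_total(deposits, threshold=REPORTING_THRESHOLD):
    """A second rule that looks at the sum instead of individual deposits."""
    return sum(deposits) >= threshold


single = [50_000]
structured = split_below_threshold(50_000)

print(flagged_deposits(single))      # the one large deposit is caught
print(flagged_deposits(structured))  # the same money, split up, is not
print(flagged_by_total(structured))  # an aggregate rule still notices
```

The aggregate rule catches this particular split, but once that rule is public too it can be worked around in turn, which is the point being made above.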

------
rafinha
Doomed to fail. There won't be enough non-corruption data for discrimination.

~~~
jkkorn
This. Machine learning and data science are only as good as the quality of the
data fed into the machine.

Although the Brazilian federal police has been working nonstop, we'd be
hard-pressed to expect a .csv journaling political shenanigans.

This project has my support nonetheless :)

~~~
arcticfox
This scene from Tropa de Elite always makes me laugh and cry (paraphrased)...

\---

Honest Policeman: I've been doing the stats, and with the additional bodies
found in our district today...

Police Captain to Dishonest Policeman: What do you mean, bodies in our
district?! I thought I told you to take care of them.

Dishonest Policeman: I swear I moved the bodies to the neighboring district
like you said, boss. Those bastards must have moved them back!

------
josephneil
You can see a lot of things in this repository, except Machine Learning. Why
is this on the front page?

~~~
dsacco
> You can see a lot of things in this repository, except Machine Learning. Why
> is this on the front page?

Perhaps because the majority of HN readers have no concrete understanding of
what machine learning is, yet they upvote most ML stories in pursuit of
comprehension.

------
marcosdumay
What is the project about?

Yeah, the read-me talks about machine learning, and fighting corruption, but
it makes no effort to tell how one would help the other.

~~~
gtirloni
This seems to explain the project in much greater detail:

[https://www.catarse.me/serenata](https://www.catarse.me/serenata) (Portuguese
only)

Curiously, Google Translate doesn't seem to understand this page's content and
fails to translate it here.

~~~
fredguth
The idea seems to be fighting a specific kind of corruption. Congressmen have
an office budget and many use invalid or suspicious receipts to justify
expenses. The idea is to use bots to analyze and red-flag suspicious receipts
and then use the evidence as proof in court.
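
As a rough idea of what "red-flagging" could look like, here is a minimal sketch that marks receipts whose amount is far above the typical value for their expense category. This is an assumption-laden illustration and not the project's actual method: the `flag_suspicious` function, the receipt fields, the sample data, and the 3.5 cutoff are all hypothetical.

```python
from statistics import median


def flag_suspicious(receipts, threshold=3.5):
    """Return receipts whose amount is unusually high within their category.

    Uses the modified z-score based on the median absolute deviation (MAD),
    which is less distorted by the outliers themselves than mean and stdev.
    """
    by_category = {}
    for r in receipts:
        by_category.setdefault(r["category"], []).append(r)

    flagged = []
    for items in by_category.values():
        amounts = [r["amount"] for r in items]
        med = median(amounts)
        mad = median(abs(a - med) for a in amounts)
        if mad == 0:
            continue  # no spread in this category, nothing to compare against
        for r in items:
            score = 0.6745 * (r["amount"] - med) / mad
            if score > threshold:
                flagged.append(r)
    return flagged


# Hypothetical receipts, invented for the example.
sample = [
    {"id": 1, "category": "meal", "amount": 40},
    {"id": 2, "category": "meal", "amount": 55},
    {"id": 3, "category": "meal", "amount": 48},
    {"id": 4, "category": "meal", "amount": 52},
    {"id": 5, "category": "meal", "amount": 60},
    {"id": 6, "category": "meal", "amount": 45},
    {"id": 7, "category": "meal", "amount": 900},
    {"id": 8, "category": "fuel", "amount": 100},
    {"id": 9, "category": "fuel", "amount": 110},
]

for r in flag_suspicious(sample):
    print("suspicious receipt:", r["id"], r["amount"])
```

A flag like this is only a starting point for a human to look closer; it says nothing by itself about whether a receipt is actually invalid.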

~~~
vkou
This seems to be an almost entirely pointless endeavor. The problem with
corruption isn't Billy expensing $5,000 of wine, and a visit to the strip
club.

------
personlurking
I've lived in both US and Brazil and I would say that the level of corruption
is more or less the same no matter where one lives in the world, but in Brazil
even the street cleaner knows about the political scandals while in the US
it's all smoke & mirrors, to the point that only the higher-ups are in on it.

Brazil: "I know about it but what can I do to change it? Let's focus on some
talking point instead."

US: "I don't really know about any scandals so let's focus on some talking
point."

The problem also isn't exactly political but social, as in what might be
called "small/everyday corruption" (running a red light, cutting in line, not
mentioning that the cashier just gave you more change than you deserved, and a
million other things). There's no ML for that, only education [1].

As an aside: when one lives in Brazil (or any country, really), "when in
Rome..." seems to apply, at least in some cases. At one point, I lived in the
slums and was using internet that was very likely from a stolen connection
(known as a "gato", or in this case, "gatonet"), but this was what everyone
had and there was nothing else on offer. I also used types of public
transportation that were very efficient for me but, afaik, technically illegal
yet no one was really policing it. To give one more Brazil example, at a movie
theater, I once used a recently expired student card of mine to get half-
price. With these types of things taken into consideration, I'd say that the
US (my country) is like concrete while Brazil is like water. To each their
own. There are good and bad aspects to both.

1 - In Brazil, there's sometimes a mix-up in the usage of the terms education
(educação), which includes concepts of right vs wrong and manners, and
schooling (escolaridade), thus in my mention of education I mean to include
both senses.

------
guard-of-terra
It's weird that we don't have a profile page for every official on a web site,
detailing this official's performance, and for their supervisor, also a list
of spending overseen by this official and a button 'open a case for
prosecution'.

I mean, seriously? We have such things in commercial enterprises, don't we?

------
daveloyall
This use case makes clear the need for what darpa.mil is calling "Explainable
AI". (Not that they invented the concept!)

~~~
PavlovsCat
The definition of "intelligence" in "AI" will probably always have an implicit
clause saying if it calls the emperor naked, it can't be intelligent.

------
i_live_there
> The idea of automating the analysis of public accounts is unprecedented:
> engaging the population on training our platform in the analysis and audit.

I was expecting something live.

~~~
ralmeida
To be fair, this is an ambitious goal, and the project is not that old.

------
crimsonalucard
Big Data and Machine Learning can also help cure cancer.

