

Automatically spotting interesting sentences in parliamentary debates - DanBC
https://fullfact.org/blog/getting_closer_automated_factchecking

======
tuckermi
Very cool! I would be interested in seeing more detail on Full Fact's
implementation. Is there a technical implementation of the Decision Tree that
is shown in the diagram? If so, it would be cool to learn a bit more about how
it was encoded/represented.

~~~
idw
We (I work there) don't have one yet but we wanted to get this line of work in
the open asap.

The next steps are (casually glossing over enormous amounts of work):

1- Test that untrained coders show a high level of agreement when they code
the same debates
2- Do the same test with people who use Hansard (the record of Parliamentary
debates) professionally
3- Iterate the set of categories if necessary to make sure they work as well
as possible, given our two tests of (a) distinct and (b) useful categories
4- Code a set of training debates
5- Start developing a machine learning approach to automated coding
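
For anyone curious how the agreement test in steps 1 and 2 could be scored,
here is a minimal sketch using Cohen's kappa for two coders. This is only an
illustration, not something Full Fact has said it uses; the function name and
the category labels ("claim", "other") are made up for the example.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two coders on the same sentences.

    labels_a and labels_b are equal-length lists holding the category each
    coder assigned to each sentence (hypothetical category names).
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")

    n = len(labels_a)
    # Fraction of sentences where both coders chose the same category.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Agreement expected by chance, from each coder's category frequencies.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)

    if expected == 1.0:
        # Both coders used a single identical category throughout.
        return 1.0
    return (observed - expected) / (1 - expected)


coder_1 = ["claim", "claim", "other", "other"]
coder_2 = ["claim", "other", "other", "other"]
print(cohens_kappa(coder_1, coder_2))
```

A value near 1 means the coders agree far more often than chance would
predict; a value near 0 suggests the categories are not yet distinct enough.
With more than two coders, a measure such as Fleiss' kappa would be the usual
substitute.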

That said, I think there might be a shortcut available for at least some
categories: applying some simpler heuristics.

So we're working to push this into a useful public system but it's likely to
take money and/or expertise we don't have in house and so it will move slowly.
In the meantime, we hope doing it openly might encourage others to get on
board or explore this approach independently.

Our skillset will help a lot with getting to stage 4, and as we will be able
to release the methodology and the training data, hopefully that will
encourage others to build on it.

------
sytelus
This is a great topic. Some ideas to consider:

-Identify sentences where strong sentiments are expressed

-Identify first person sentences (I, me, my)

-Identify second person sentences (you, your)

-Identify sentences that indicate action that _would_ be taken by a person/group.
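
A rough sketch of how these ideas could be tried out with plain regexes. The
word lists, the flag names, and the `flag_sentence` helper are all
assumptions made up for illustration; a real system would want a proper
sentiment lexicon and far better handling of tense.

```python
import re

# Hypothetical word lists; deliberately tiny and easy to extend.
FIRST_PERSON = re.compile(r"\b(i|me|my)\b", re.IGNORECASE)
SECOND_PERSON = re.compile(r"\b(you|your)\b", re.IGNORECASE)
FUTURE_ACTION = re.compile(r"\b(will|shall|would|going to)\b", re.IGNORECASE)
STRONG_WORDS = {"outrageous", "disgraceful", "appalling", "shameful", "excellent"}


def flag_sentence(sentence):
    """Return the set of heuristic flags that fire for one sentence."""
    words = set(re.findall(r"[a-z']+", sentence.lower()))
    flags = set()
    if words & STRONG_WORDS:
        flags.add("strong_sentiment")
    if FIRST_PERSON.search(sentence):
        flags.add("first_person")
    if SECOND_PERSON.search(sentence):
        flags.add("second_person")
    if FUTURE_ACTION.search(sentence):
        flags.add("future_action")
    return flags


for s in [
    "I will raise this with the minister.",
    "Your policy is disgraceful.",
    "The house met at noon.",
]:
    print(sorted(flag_sentence(s)), "-", s)
```

Sentences that trigger several flags at once might be a reasonable first
cut at "interesting", though that ranking is itself a guess.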

