

Show HN: SeeMeSpeak, a crowd-sourced dictionary for sign languages - kaeff
http://kaeff.net/posts/seemespeak-sign-language-dictionary-railsrumble-2013.html

======
Argorak
I am one of the authors: Ask me anything!

~~~
Osmium
[In the far future] any plans to hook people up with e.g. a Kinect so you can
capture point clouds and allow people to rotate the signs / see them from
different perspectives? Or is this completely unnecessary?

As someone who hasn't learnt sign language myself, I'm not sure, but I think
I'd find it easier to mirror a sign by seeing it from the person's point of
view (i.e. "over the shoulder") rather than looking from a camera's point-of-
view in front.

Also, what's the current state of the art like for detecting sign language?
Could you use similar tech to test people and make sure they're performing
signs correctly?

~~~
Argorak
Kinect was thrown around. I am not sure how helpful it is for learners. Our
first attempt would be to implement a "mirror" view next to a given video where
you can see yourself doing what the person on screen does. The biggest problem
with Kinect capture is that everyone recording would need a Kinect. I'll come
back to that at the end.

Over the shoulder is problematic, as the face is very important. What the
inside of your hands shows you is negligible anyway. Also, we assume that
this problem will go away when you see yourself side-by-side with a given
person. If you hit refresh a few times, you will find some videos that show the
same person from the side and from the front. I think that is very good, but it
has a huge production overhead.

Detecting sign language: there is a problem there, and that is the lack of a
documented corpus. Be aware that sign languages are just as subject to dialects
as other languages. Some signs are completely different between northern
Germany and southern Germany.

Which brings me to the problem of corpora: we only included DGS here, because
it was the only language we spoke for which we found a well-tagged corpus under
a permissive license (by scraping a wiki...). Even that corpus is only 800
videos. There are some corpora which basically amount to a bunch of videos
that don't necessarily depict a single phrase and are not transcribed. Others
are owned by the universities that created them - they could give us a
license, but that defeats the purpose of our project: we make this corpus
freely available.

We attribute the lack of user-generated corpora to the fact that producing
very short videos (3-5 seconds) is still a huge hassle. You need a camera or a
capture tool, you have to encode the resulting videos, upload them to a wiki,
transcribe them properly there... At a time when browsers are gaining recording
capabilities and every laptop comes with a camera, we wanted to make this
easier for users. Our recording workflow is certainly not finished yet, but
this is a POC that is already much quicker.
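In today's browsers, the in-page recording step described above can be sketched with the MediaRecorder API (which was standardized after this thread). This is a minimal illustration, not the project's actual code; the mime-type list, the `recordClip` helper, and the clip duration are all assumptions:

```javascript
// Minimal sketch of an in-browser recording workflow. The `isSupported`
// predicate is injected into the helper so the selection logic can be
// exercised outside a browser.

// Pure helper: pick the first mime type the recorder supports,
// or '' to let the browser choose its default.
function pickSupportedMimeType(candidates, isSupported) {
  for (const type of candidates) {
    if (isSupported(type)) return type;
  }
  return '';
}

// Record a short clip from the webcam and resolve with a Blob
// that is ready for upload (e.g. via FormData + fetch).
async function recordClip(durationMs) {
  const stream = await navigator.mediaDevices.getUserMedia({
    video: true,
    audio: false, // sign-language clips need no audio
  });
  const mimeType = pickSupportedMimeType(
    ['video/webm;codecs=vp9', 'video/webm;codecs=vp8', 'video/webm'],
    (t) => MediaRecorder.isTypeSupported(t)
  );
  const recorder = new MediaRecorder(stream, mimeType ? { mimeType } : {});
  const chunks = [];
  recorder.ondataavailable = (e) => chunks.push(e.data);

  const done = new Promise((resolve) => {
    recorder.onstop = () => {
      stream.getTracks().forEach((track) => track.stop()); // release the camera
      resolve(new Blob(chunks, { type: recorder.mimeType }));
    };
  });
  recorder.start();
  setTimeout(() => recorder.stop(), durationMs);
  return done;
}
```

The point of doing this client-side is exactly what the comment describes: the user never touches a capture tool, an encoder, or a manual upload form.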

That's also the reason why we won't look at the Kinect soon: it's far less
widespread than ordinary cameras.

~~~
jbrooksuk
In regards to handling video upload and encoding, have you checked out
Transloadit[1]?

[1] [https://transloadit.com/](https://transloadit.com/)

~~~
bitboxer
We thought of this, but using that service would have been against the Rails
Rumble rules.

~~~
jbrooksuk
Oh sorry, I missed that part. What about now? Is improving it with third-party
services acceptable after the competition?

~~~
bitboxer
I don't think this is needed anymore. We already have an avconv pipeline
working right now :)

