There is a problem with this kind of ultra-rapid iteration when you start doing certain kinds of low-level stuff that can crash your development environment. I suspect that Jarvis avoids much of this by running in a different process, but there are certain things that might still cause a reboot.
Still, it just occurred to me that one could use integration with an SCM like git or mercurial to get the same kind of benefit as the Smalltalk change log + image.
What if every change to a function was snapshotted, and the developer could easily scroll/flip through these snapshots? Merging the change back to a main branch would then be like image saving in Smalltalk. This would give developers the ability to try really risky development, since they would always be able to quickly get back to a previous state.
(This can already be done, but by making the integration really seamless, it has a synergistic effect with development.)
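The snapshot idea above can be sketched with plain git plumbing. This is a minimal sketch, not a real editor integration: the `SCRATCH_BRANCH` name is a hypothetical convention, and a real tool would call `snapshot()` from an on-save hook and record the commits on a separate ref so the main branch stays clean.

```python
import subprocess

SCRATCH_BRANCH = "scratch/autosave"  # hypothetical naming convention

def snapshot(repo_dir, message="autosave"):
    """Commit the current working tree so every edit is recoverable.

    An editor's on-save hook would call this; merging back to the main
    branch later plays the role of 'image saving' in Smalltalk.
    """
    subprocess.run(["git", "-C", repo_dir, "add", "-A"], check=True)
    # --allow-empty keeps the timeline continuous even when nothing changed
    subprocess.run(
        ["git", "-C", repo_dir, "commit", "--allow-empty", "-m", message],
        check=True,
    )

def history(repo_dir, path):
    """Snapshot ids touching `path`, newest first, for flipping through."""
    out = subprocess.run(
        ["git", "-C", repo_dir, "log", "--format=%h", "--", path],
        check=True, capture_output=True, text=True,
    )
    return out.stdout.split()
```

The developer UI would then just be a viewer over `history()`, checking out any snapshot into a scratch buffer.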
I love YouTube's hilarious subtitles. For instance (3:34):
"without a student to conduct a kidney advocate"
or (3:19): "registers configure troop went upstairs don't just goes up"
But you should disable them by default; they are confusing!
I find it curious that Google has very good translation software, yet they don't pass the generated subtitles through their language models to detect when they are gibberish.
It's not your fault, but YouTube's. I just find computer generated gibberish funny.
Their language models might perform acceptably with text, but speech is much looser with structure and grammar. They may apply a bit of local expectation-maximisation but anything more strict or long-range wouldn't work.
Speech may be much looser, but a classifier to (try to) detect absolute gibberish from real content doesn't look too far-fetched to me. Its only action would be to disable the subtitles by default if gibberish was detected.
The problem with that is that the language model's power is already used to fix things up locally (because this is transcribed from audio). As a result it can't be used again to decide if the transcription fits the model; it's the case by design. There must be some kind of confidence metric at the end of the process, but I don't think it's possible to tell how much of the ambiguity comes from inadequacies in the phoneme model, or the audio environment, or the language model. They'd have to throw out good transcriptions in noisy environments along with bad transcriptions, and probably wouldn't keep much at all. As it is it seems they prefer to publish crappy results and hopefully have some feedback channel involving the video uploader.
I'm thinking of the following: audio track of the video ---[process (speech recognition)]---> transcription ---[process (classifier based on a textual language model)]---> answer to "gibberish?"
So good transcriptions would pass, regardless of the noise environment. It might give some false positives, but I expect that's a good price to pay to avoid the kind of mess they create now.
What does your "transcription" step mean? Either it would transcribe each word "in divy two Ellie, buy it's Elf", which produces garbage without context, or it would additionally have to use the language model to patch things up.
For example, the difference between "its" and "it's", "red" and "read", "know" and "no" is irreconcilable without understanding the rest of the sentence these trouble words appear in.
Transcription is the textual output of the speech recognition process, be it phonetic, LVCSR or direct. All current applications of them do take some context into consideration, usually via a transition matrix and lots of training data.
What I'm proposing is to pass the output of speech recognition through a binary classifier that answers the question "is this text gibberish?", which is trained with the help of a textual language model, unrelated to the speech recognition pass.
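As an illustration of the proposal, here is a toy version of such a classifier: a word-bigram language model scores the transcription's average log-probability, and a threshold decides "gibberish or not". Everything here is a stand-in for scale: a real system would train on a large corpus and tune the threshold on labelled data; the `-4.0` below is an illustrative guess.

```python
import math
from collections import Counter

class BigramScorer:
    """Toy word-bigram language model with add-one smoothing.

    A stand-in for the 'textual language model' in the proposal; a real
    classifier would be trained on a large corpus, not a few sentences.
    """
    def __init__(self, corpus):
        words = corpus.lower().split()
        self.unigrams = Counter(words)
        self.bigrams = Counter(zip(words, words[1:]))
        self.vocab = len(self.unigrams) + 1  # +1 for unseen words

    def avg_logprob(self, text):
        words = text.lower().split()
        if len(words) < 2:
            return 0.0
        total = 0.0
        for a, b in zip(words, words[1:]):
            # add-one smoothed P(b | a)
            p = (self.bigrams[(a, b)] + 1) / (self.unigrams[a] + self.vocab)
            total += math.log(p)
        return total / (len(words) - 1)

def is_gibberish(scorer, transcription, threshold=-4.0):
    """Binary decision: disable subtitles by default when the average
    log-probability falls below a tuned threshold (illustrative value)."""
    return scorer.avg_logprob(transcription) < threshold
```

Note that this is deliberately independent of the speech-recognition pass: it only sees the final text, which is the point of the proposal.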
As someone who's excited to be learning Python, seeing tools like this makes me even happier. Thanks! I have a feeling going back to PHP is going to be very difficult for me :)
Yes, I was thinking about people learning Python too when developing this. My daughter is 6 years old, and I suspect she will learn Python in a few years or even months ;-)
Yeah, except without all the bad parts like images that can get corrupted
Sorry, I've been a professional in Smalltalk since 1998, and I've never encountered a corrupted image. The closest thing I've seen to that is de-sync with source code. (Which you don't care much about if you're using SCM, for the same reasons everyone else uses SCM.)
stateful (wtf were they thinking?) web-apps.
Same things others were thinking at the time -- you're just seeing a slice of history. People still rave over the debugging in Seaside like it's magic, though.
All the benefits of a Lisp REPL and a Smalltalk debugging environment in a language people actually use. Tab-completion, dumps, breakpoints, testing assumptions/data, running arbitrary code inside a specific environ...the works.
I can use it in my virtualenvs without having to explain to an IDE what a virtualenv, a running test in Nose, a normal script, or a running web server is. It's normal code, so I can wrap the trace in a condition or whatever else I want.
No, I'm wondering if the promulgators of Smalltalk as a superior debugging environment actually have anything superior to what I just described so that I can steal it for Python.
Sometimes I toss stuff out there in the hopes of being shown something better. It teaches me new things.
Let's take a moment to think of our poor brethren still tossing print statements all over the place, though.
(I've often said around here that things would've worked out better if there wasn't such a barrier between stuff happening in Smalltalk and outside in the OS -- community wise. But if you're willing to play just in the Smalltalk sandbox, it's a pretty awesome experience.)
I've only used Smalltalk/V during university, back in 1995, if memory serves me right.
I've seen integration problems migrating code between implementations, lack of proper support for source control (especially in distributed teams), and lack of integration with the operating system for desktop applications.
I know that many of these problems have been solved in newer implementations, but maybe now it is too late.
I don't often see things on HN and think, "wow, I absolutely must have that," but this is one of those times. Thank you for releasing this; if you do a kickstarter page or put up a donation link somewhere I'll throw some money at you. :)
It would help if you explained in better detail how to get the dependencies. I have been finding packages for 20 minutes; I would love to try this out but I can't seem to get it running. Consider listing the project dependencies in your 'setup.py'.
Fantastic tool, can't wait to try it. I agree this is Kickstarter-worthy. Looks like the OSG bindings and pane make it easy for devs to hack motion design; I wonder what will come out of that.
"WARNING: Usage of public pools during hot weather is deprecated. Misuse may lead to a degradation of your insurance policy. At short-term-alpha memory, synapse group #23AE18-123, phase shift 3"