What a meaningless parade of speculative bullshit. There's a science for discussions like this -- it's called corpus linguistics. There are people who have spent years of their lives trying to understand observed patterns in behavior using real data and statistical analysis. This is a handful of anecdotes that even the most generative of armchair linguists would roll their eyes at.

-----


Oh cool, another change to fig/docker-compose that doesn't include the most requested feature: https://github.com/docker/compose/issues/495

As the person who wrote the PR for its solution and has been waiting months for it to get merged, I find this super, super frustrating.

If you guys don't want to put the logic in, reject the PR and close the issue as won't fix. Quit stringing the community along.

-----


Did you really need to grind your axe in two separate comments on the same story? The first was sufficient.

-----


Whoops, sent it when HN went down and didn't think it had posted.

-----


I've got a bone to pick with the new Docker Compose.

Take a look at this issue: https://github.com/docker/compose/issues/495

Back in January I went to the effort of outlining a solution and getting it approved by the maintainers before implementing it. I submitted a PR, responded to the maintainers' revision requests, and still haven't seen the change go into the project.

It's a simple change. If this feature doesn't fit the architectural direction Docker wants, they need to close the issue and reject the pull request, instead of changing the project over and over so that I'm stuck maintaining a PR that's more than three months old.

Very uncool.

-----


The fact that it's still open probably means it's under some level of serious consideration. It was opened before they released Machine (and Swarm?), so maybe they just won't know how or when it should fit in until the dust settles. Agreed that they could have said something to that effect, though.

-----


Something about this seems patently unpythonic. Also, the way the code is included reminds me of monkey-patching, which is a Ruby behavior I like to leave at the door when I'm coding in Python.
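
For anyone who hasn't run into the term, here's a rough sketch of what monkey-patching looks like when it leaks into Python (all the names here are made up):

    # A rough sketch of monkey-patching: replacing a method on an
    # existing class at runtime. Greeter and shout() are made-up names.
    class Greeter:
        def greet(self):
            return "hello"

    def shout(self):
        # Call the saved original, then change the behavior.
        return _original_greet(self).upper()

    # Patch the class after the fact; every caller now gets the new
    # behavior without any explicit subclass or override in sight.
    _original_greet = Greeter.greet
    Greeter.greet = shout

    print(Greeter().greet())  # HELLO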

-----


Getting this in before the gratuitous negativity brigade starts hammering down:

This is an implementation of a stupid joke looking for a problem. If someone actually needed this in their tool belt, they could use Perl or even (amazingly) Bash to handle it.

I would personally not name anything I worked on after a song from one of Weezer's shittiest albums, but that's just me.

-----


Many execution contexts don't have Perl. I have some <10 MB VMs in mind.

-----


Yeah, okay, totally unappreciated.

Give me a break, it's not 1995.

That's like saying Oakland is the new Brooklyn.

-----


Hey, can we not just make up stacks that spell things?

Believe it or not, each of these components is a choice you should make an independent, educated decision about, and then you should make sure those educated decisions integrate well. Integration with best-in-class solutions is a criterion, but not the only one.

Like, can you imagine someone using a stack like this to build a to-do list app, just because?

Also, what I'm seeing is someone lumping one of the best graph databases I've seen in with some utter schlock technologies just to force a catchy acronym.

-----


What the hell is this doing on Hacker News?

Saying that as someone who was vegan for five years and who is presently on a high-fat, low-carb diet, by the way. I don't come here for stuff like this.

-----


I posted it because it was an interesting kind of reverse engineering of a food substance, which was done in a way that I figured HN would appreciate.

-----


PCA is a pretty okay method for dimensionality reduction. Latent Dirichlet allocation is pretty good too. It depends on what you're trying to do and how the data is distributed in N-dimensional space.
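
To make that concrete, here's a toy sketch with scikit-learn (the corpus and parameters are made up, just to show the shape of each approach):

    # A rough sketch: two ways to squash a document-term matrix down to
    # a couple of dimensions. The toy corpus here is illustrative only.
    from sklearn.decomposition import PCA, LatentDirichletAllocation
    from sklearn.feature_extraction.text import CountVectorizer

    docs = ["the cat sat on the mat",
            "dogs and cats make good pets",
            "stock prices fell sharply today"]
    X = CountVectorizer().fit_transform(docs)

    # PCA wants dense, centered data; it finds directions of max variance.
    X_pca = PCA(n_components=2).fit_transform(X.toarray())

    # LDA treats each document as a mixture of topics; built for counts.
    X_lda = LatentDirichletAllocation(n_components=2,
                                      random_state=0).fit_transform(X)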

-----


Don't forget latent semantic indexing :)

-----


Holy shit! Stop the presses! Enriching source data by filtering out low-signal information and selecting for high-signal information improves unsupervised clustering?

It's great to see physicists getting into other disciplines to show everyone else how wrong they are. As we all know, when you study physics and math, you're at the top of the reductionism pyramid, so you don't even really need a background or experience in the thing you're studying. You just throw data at a formula, and if it doesn't work, you get to go to town berating the professionals in the field who use that formula. That's how it works, right?

Pre-processing your topic-modeling data is a prerequisite to getting good results. This is fairly common knowledge. Not pre-processing is certainly the more naive approach. Computational linguists have backgrounds in things like syntax, semantics, and discourse so that they know which components of language to select for and how to format them without misrepresenting the source data.
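
For a sense of what "pre-processing" means here, a minimal sketch (the stopword list is deliberately tiny; real pipelines add lemmatization, n-gram detection, POS filtering, and so on):

    # A rough sketch of the kind of pre-processing meant here: lowercase,
    # strip punctuation, drop stopwords. The stopword set is illustrative.
    import re

    STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "it"}

    def preprocess(doc):
        tokens = re.findall(r"[a-z]+", doc.lower())
        return [t for t in tokens if t not in STOPWORDS and len(t) > 2]

    print(preprocess("The cat sat on the mat, and it was happy."))
    # ['cat', 'sat', 'mat', 'was', 'happy']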

This article sets up a straw man to tear down in the service of advertising a proprietary solution. It's basically an advertisement with a scientific reference thrown in for good measure. We call these "white papers" -- not serious journalism.

I can make these assertions from personal experience. I used latent Dirichlet allocation on high-fidelity natural language data to provide topic modeling for hundreds of thousands of wikis. We used the data for ad optimization as well as recommendations -- both of which provided statistically significant improvements in engagement. The approach worked. The recommendations were reproducible. I used all open-source software.
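
For the curious, the open-source version of that is only a few lines these days -- a rough sketch with gensim, not the actual production pipeline:

    # A rough sketch of open-source LDA topic modeling with gensim;
    # the tiny pre-tokenized corpus is made up for illustration.
    from gensim import corpora, models

    texts = [["cat", "mat", "pet"],
             ["dog", "pet", "leash"],
             ["stock", "price", "market"]]
    dictionary = corpora.Dictionary(texts)
    corpus = [dictionary.doc2bow(text) for text in texts]

    lda = models.LdaModel(corpus, num_topics=2, id2word=dictionary, passes=10)
    for topic_id, words in lda.print_topics():
        print(topic_id, words)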

I guess I should have gotten into physics.

-----
