
The main thing I'd say is that the quality of "creativity" is either something that effectively duplicates (or even extends) the human notion of creativity, or it's something arbitrary that there's no reason for us to care about.

Even if someone formulated an apparently human-independent mathematical statement of "interesting", you would still need human judgment to deem it credible.

"Abusive" might not convey the situation exactly.

But if Home Depot and every other hardware store suddenly developed a technology that let them sell wood that worked great for many standard construction practices but demanded more money from artists or anyone doing anything unusual with the material, one could easily feel that one's world had become truncated.

I'd rather call that abuse than not-abuse.

The ability to buy a thing, an object, that you can do whatever the heck you want with (by yourself, in the privacy of your home), is the antidote to being a serf to pay-what-we-think-it's-worth-to-you services.

Even if you choose to purchase services instead of things, the existence of things keeps the service provider honest; conversely, the non-existence of things makes the serfdom more onerous.

I've heard Naive Bayes described as "nearly good enough" for many uses. Can anyone quantify how much worse it is for various applications?


In Kevin Murphy's book "Machine Learning: A Probabilistic Perspective", he cites the following paper [1], which shows that the simple Bayesian classifier "performs surprisingly well in many domains containing clear attribute dependencies" and contains numerous experimental trials (in different domains).

I haven't fully read the paper myself (just glanced through it), but Murphy makes the same point as the OP, which leads me to believe it may have the answer to your question.

[1] http://web.cs.ucdavis.edu/~vemuri/classes/ecs271/Bayesian.pd...


Great book, by the way.


Naive Bayes assumes variable independence, for one thing. If the features aren't independent from each other, then you'd be better off using probabilistic graphical models.


> Naive Bayes assumes variable independence, for one thing

That's true, but every model makes assumptions that are wrong. The puzzling thing about Naive Bayes is how well it performs in practice in spite of its assumptions being so wrong. I believe there have been papers explaining this; I would look at Russell and Norvig's book for a start.

One thing that Naive Bayes sucks at is providing good probabilistic estimates. It is nearly always overconfident in its predictions (e.g. P(rain) = 0.99999999), even though its classification accuracy can be pretty good (relative to its simplicity). Logistic regression fares a lot better for probabilities.
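
To make the overconfidence concrete, here's a minimal sketch (scikit-learn on synthetic data; the setup is purely illustrative): duplicating a feature makes Naive Bayes count the same evidence several times over, so its probabilities saturate even though the underlying signal hasn't changed.

    import numpy as np
    from sklearn.naive_bayes import GaussianNB

    # One informative feature, then the same feature duplicated 10x.
    # Duplication violates independence: NB re-counts the same evidence.
    rng = np.random.default_rng(0)
    n = 2000
    y = rng.integers(0, 2, n)
    x = rng.normal(loc=y, scale=1.0, size=n)

    X1 = x.reshape(-1, 1)              # independent case
    X10 = np.repeat(X1, 10, axis=1)    # same feature repeated 10 times

    for name, X in [("1 copy", X1), ("10 copies", X10)]:
        p = GaussianNB().fit(X, y).predict_proba(X)[:, 1]
        print(name, "mean confidence:", np.maximum(p, 1 - p).mean())
    # The 10-copy model reports near-certain probabilities from the same
    # underlying evidence: classic NB overconfidence.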


> One thing that Naive Bayes sucks at is providing good probabilistic estimates.

This is true of Naive Bayes as described in textbooks, but in practice it should always be combined with a calibration algorithm (which adds a trivial O(n) cost to the process). The common choices here are Platt scaling [0] and isotonic regression [1] (the latter should itself be combined with some regularization, because it can easily overfit to outliers).

Given a calibration algorithm, Naive Bayes produces probability estimates every bit as reasonable as any other algorithm's.

[0] https://en.wikipedia.org/wiki/Platt_scaling

[1] https://en.wikipedia.org/wiki/Isotonic_regression
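
For what it's worth, here's roughly what that recalibration step looks like with scikit-learn's CalibratedClassifierCV, which wraps both choices; the dataset and parameters below are just illustrative, not a recommendation.

    from sklearn.calibration import CalibratedClassifierCV
    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from sklearn.naive_bayes import GaussianNB

    # Synthetic data; "sigmoid" is Platt scaling, "isotonic" is isotonic
    # regression (which, as noted above, can overfit small samples).
    X, y = make_classification(n_samples=5000, n_features=20, random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

    raw = GaussianNB().fit(X_tr, y_tr)
    platt = CalibratedClassifierCV(GaussianNB(), method="sigmoid", cv=3).fit(X_tr, y_tr)
    iso = CalibratedClassifierCV(GaussianNB(), method="isotonic", cv=3).fit(X_tr, y_tr)

    for name, model in [("raw NB", raw), ("Platt", platt), ("isotonic", iso)]:
        # The raw model's probability estimates tend to be the most extreme.
        print(name, model.predict_proba(X_te)[:5, 1].round(4))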


> but in practice it should always be combined with a calibration algorithm (which adds a trivial O(n) cost to the process).

Why not just use logistic regression at this point? The only benefit of Naive Bayes over logistic regression is that Naive Bayes is simpler to code.


The calibration cost is trivial compared to the coefficient learning cost. Very roughly, calibration is O(records) whereas coefficient learning is O(records * features). So the tiny add-on cost of calibration shouldn't affect anyone's evaluation of the relative merits of algorithms. NB still retains its computational advantage.

One thing that is often discounted in theoretical discussions is that NB takes much less I/O than something like LR, typically in the range of 5-100x (depending on how many iterations you spend updating your LR coefficients). If you're doing, for example, a MapReduce implementation, then NB has huge computational advantages: in LR each coefficient update costs you another map/reduce pass across your entire data set, whereas NB is always done in exactly one pass.
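
To see why NB is one pass: its sufficient statistics are just counts, so the "map" side is accumulating counts and the "reduce" side is summing them. A toy single-machine sketch of the idea (illustrative only, obviously not a real MapReduce job):

    import math
    from collections import Counter, defaultdict

    def train_nb(records):
        """One pass over (label, tokens) pairs; only counts are kept."""
        class_counts = Counter()
        token_counts = defaultdict(Counter)   # label -> token frequencies
        for label, tokens in records:
            class_counts[label] += 1
            token_counts[label].update(tokens)
        return class_counts, token_counts

    def log_posterior(label, tokens, class_counts, token_counts, alpha=1.0):
        """Unnormalized log P(label | tokens), with Laplace smoothing."""
        vocab = {t for c in token_counts.values() for t in c}
        denom = sum(token_counts[label].values()) + alpha * len(vocab)
        lp = math.log(class_counts[label] / sum(class_counts.values()))
        for t in tokens:
            lp += math.log((token_counts[label][t] + alpha) / denom)
        return lp

    counts = train_nb([("spam", ["buy", "now"]), ("ham", ["meeting", "now"])])
    print(log_posterior("spam", ["buy"], *counts))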

So if NB + calibration gets you something close to LR for vastly less computation and I/O, why wouldn't you use it?

Having said that, if you're talking about small amounts of data that fit into RAM and you can "just load into R", then sure use LR over NB. For that matter use a Random Forest [0]. The reason NB is still around is because it offers a point in the design space where you spend almost no resources and still get something surprisingly useful (and recalibration narrows the utility gap between NB and better methods even more).

[0] And you should still consider calibrating your random forest's output.


Yeah, Naive Bayes isn't very good for probabilistic estimates, but it's often quite good at classification. I've had great luck using it to classify poorly coded medical data. I maintain this Ruby gem [0] that implements Naive Bayes as a text classifier, and for a lot of tasks it's good enough.



Have you tried a Perceptron or MaxEnt model for your medical data? They're also fairly simple (MaxEnt slightly less so), but will give you solid results.


I looked into the perceptron, but it seemed like overkill and I didn't want to write a classifier from scratch in Ruby or Java. I wasn't previously aware of MaxEnt, which looks cool. Bayes is nice for the data I'm looking at because my features are pretty independent.


Geoff Hinton has a theory that Naive Bayes is just logistic regression with dropout: https://www.youtube.com/watch?v=DleXA5ADG78 He suggests that if you divide all the parameters by a certain amount, you get accurate probability estimates.
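
I haven't seen code for this, but one way to read the "divide the parameters" remark is as temperature scaling: divide the NB log-odds by some constant T > 1 (picked on held-out data) before applying the sigmoid. A tiny sketch with made-up log-odds values; the T used here is arbitrary:

    import numpy as np

    def tempered_proba(log_odds, T=4.0):
        # Divide NB log-odds by a temperature T > 1 before the sigmoid;
        # T would be tuned on held-out data (the value here is made up).
        return 1.0 / (1.0 + np.exp(-np.asarray(log_odds) / T))

    log_odds = np.array([12.0, -9.0, 2.0])   # typically extreme NB log-odds
    print(tempered_proba(log_odds, T=1.0))   # raw: ~[1.000, 0.000, 0.881]
    print(tempered_proba(log_odds, T=4.0))   # tempered: ~[0.953, 0.095, 0.622]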


This is an amazing talk. He has many other very interesting ideas.


This is true, but in a lot of instances ignoring variable dependence still produces meaningful results. For example, Naive Bayes on text classification usually performs really well even though word occurrences are dependent on each other. In my experience, if you are classifying on a large number of observations (e.g. a large number of words), the algorithm performs well.
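
As a minimal illustration (toy documents, scikit-learn), the bag-of-words features below are obviously dependent - "buy" and "now" co-occur - yet multinomial NB separates the classes without trouble:

    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.naive_bayes import MultinomialNB
    from sklearn.pipeline import make_pipeline

    # Toy corpus: word occurrences are clearly not independent.
    docs = ["cheap pills buy now", "meeting at noon tomorrow",
            "buy cheap watches now", "agenda for the noon meeting"]
    labels = ["spam", "ham", "spam", "ham"]

    clf = make_pipeline(CountVectorizer(), MultinomialNB())
    clf.fit(docs, labels)
    print(clf.predict(["buy now"]))   # -> ['spam']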


The problem is that the dominant model of development today seems to involve increased wealth inequality rather than the creation of a stable middle class.

For example, China's development has involved considerable wealth inequality.



Most income inequality (taken on a worldwide basis) is between countries, not within countries. Overall global income inequality is dropping considerably, with China playing a large part.

> https://twitter.com/bill_easterly/status/621684418558734337

> https://twitter.com/bill_easterly/status/654709684985688064


World scale inequality seems like a bit of a canard.

If inequality between nations drops, a measure of inequality on a world scale could drop even while inequality within each particular nation increases. China's or Africa's development means the average very poor person is closer to the average Westerner, dropping total inequality. But that doesn't stop recent models of development from involving considerable inequality, indeed from being models of inequality (just as inequality has increased within the US). Moreover, it's the inequality within each area that one would expect to have harmful or beneficial effects on democracy, civil society and so forth.

Edit: Of course, it is hard to reply to a link to a tweet of a bitmap of a graph whose resolution is too poor to read the references, hence my general comment.


I think you are right that there are still disadvantages to within-country inequality, but I strongly disagree with the dismissal you seem to be making of the massive reduction in country-level inequality and the accompanying massive reduction in extreme poverty, which has plunged from ~40% of the world population in the 1980s to <20% now.

> it's the inequality within each area that one would expect to have harmful or beneficial effects on democracy, civil society and so forth.

You seem to be implying that extreme poverty does not also have adverse effects on democracy, civil society, and so forth. My guess is that having massive populations of people living at or below subsistence is also harmful to democracy and civil society. If you compare the kinds of problems they have in relatively equal [1] Sudan (mass malnutrition, militias literally raping and pillaging) with those in the relatively unequal [1] United States (police violence, poor health care provision), I hope we can acknowledge the problems in the US without denying that they are orders of magnitude less severe. The difference is in the amount of absolute extreme poverty, not the inequality.

BTW if you click on the images in the links you can access the high res versions.

[1] https://en.wikipedia.org/wiki/List_of_countries_by_income_eq...


I agree. The vast majority of people in the world have access to housing, water, the internet, toilets, and transportation. Yes, they might have their own motorbike instead of a car, but really, the qualitative difference is not the 10x suggested by the difference in prices of those two things. Sure, the sidewalk is sometimes very rough, the drainage on the side of the road carries heavily polluted water, and the outsides of houses might not be painted. The insides of most homes around the world, however, are quite pleasant.

Give this house a fresh coat of paint, and it'd fit just fine as a home in a developed country: http://i.dailymail.co.uk/i/pix/2010/03/05/article-1255622-08...



I think we can all agree that programmers almost never begin a project with the intention of causing harm. All the examples cited involve situations where immense pressure applied by higher-ups impels a programmer to knuckle under and agree to such approaches.

The solution that is offered seems very specific to now: we take the fall guy who caved in to one sort of incentive and put an opposite incentive on him to force a different behavior (btw, without removing the first incentive). How fucked-up is that?

Obviously, the better, saner solution is removing the existing perverse incentives, giving software engineers more leverage in decisions, and punishing higher-ups if they don't give software engineers the autonomy to make decisions. And heck, execute the CEOs when the bridges collapse. The buck should stop with them, right?


The problem becomes... what if the CEO is unaware of it? Volkswagen, for example. It is highly likely in my mind that some engineer somewhere in the trenches decided, in desperation, to add in a few lines of code to make sure the engine passed emissions tests. But do I think the CEO of Volkswagen signed off on adding the code? No.

There's a diffusion of responsibility that makes it hard to point to any given person after the fact. How can we arm people - software engineers, but also product designers, project managers, QA engineers, CEOs, and the welders on the assembly line - to say, "Hold on, stop, this isn't right!"?


I think that shows the limits of the punishment-after-the-fact approach. The point is to demand that companies create an atmosphere where professionals have some autonomy to make decisions based on their expertise rather than the good of the company. Unfortunately, things are trending the other way for virtually all professionals, not just almost-professionals like programmers.


Theft benefits the thief while costing the person stolen from.

If one backs up and uses a fuzzy enough lens, one can just ask "how should society calculate the total benefits here?", but the problem is that this begs the question of whether it's proper for the state to even ask such questions.

My impression is that none of the insider-trading proponents are also proponents of the view that it's not theft to take money that falls off the back of an armored car because "it's hard to tell who it belongs to" and "we might well benefit from this money more than whoever it really belongs to", but the arguments seem the same. Further, the average person finding money on the street probably really does need it more than your average insider trader.


> Theft benefits the thief while costing the person stolen from.

> If one backs up and uses a fuzzy enough lens, one can just ask "how should society calculate the total benefits here?", but the problem is that this begs the question of whether it's proper for the state to even ask such questions.

But in the case of insider trading, there is no such person. Lambdapie identifies that there is a cost (yes!), but that cost only exists at a fuzzier level than "the person stolen from". Trying to examine it at a finer level doesn't make sense.


What if you considered the lost "profit" the non-insiders would have made from buying at 100 rather than 105, or 103, or whatever price results from normal speculative activity?


Can you sketch a more concrete example of what you're talking about?


You completely ignore the benefit of the more accurate price.

The GP does a good job of describing how insider trading actually takes money from particular people. Are you saying that a certain number of people should have their money taken in order that prices come closer to predicting otherwise unknown results? Something like, "by eminent domain, we are taking your investment profits for the greater good of accuracy in stock prices".

Moreover, the other people who benefit from price jumps from invisible sources are those who don't know anything but are willing to gamble that these price jumps represent a real increase in value. The existence of such gambling would seem to increase the overall volatility of the market, and given that such gamblers would tend to magnify random jumps in the market as well, it seems like society broadly would not experience any benefit.


It seems like some constituency likes the idea of legalizing insider trading so much that they grasp at a variety of straws in an effort to find some analogous place where it would be legal.

But these are very thin reeds.

"Suppose I have a business where I get stuff to sell on a 'don't ask, don't tell basis', that would be totally legal and no one would complain..." sure

"It's just information, information shouldn't be property"

--> Any employer has information they want kept private. If there were no overt law against spilling one's employer's private information, not doing so would certainly be a condition of employment. And intentionally doing something that is against your conditions of employment and costs your employer millions of dollars is going to land you in serious civil trouble. That the state has decided to make this criminal as well is entirely logical.

If a shop foreman takes money to shut down a factory for a day to benefit a rival, he would be guilty of theft, without "shutting down a factory" being an otherwise criminal activity.

The fundamental thing is taking money from someone, which insider trading certainly does. Those who buy a stock on an insider tip gain money - meaning those who sold the stock lose it.


What about a journalist publishing leaked information? In such a case, it is up to the company to stop leaks. People outside the company have no obligation to help them.


> Those who buy a stock on an insider tip gain money - meaning those who sold the stock lose it.

This is definitely not true; those who buy a stock on inside information are buying from people who independently decided it was a good time to sell their stock. Those people are likely, in the counterfactual universe, to sell their stock anyway.

And while a shop foreman might be guilty of theft for shutting down a shop in response to a bribe (I find this a little unlikely, but won't express further opinion than that), the money which changes hands can't be relevant to that, since the theft would be from the factory owner or operator, while the money comes from a completely different source.


This definitely is true. If the other party had had the same information, they would have sold or bought at a different price point (or maybe not at all+). That has to be the case, because that's the attraction of insider trading: I know something that will cause you to misprice the trade in a way that benefits the insider trader.

+ Consider the person trading on insider information, namely that Cogswell's Cosmic Cogs has agreed to buy Spacely Space Sprockets at $12/share. Someone with stock in Spacely Space Sprockets would likely choose not to sell at the current market rate of $9.25/share if they knew that.


> If the other party had had the same information, they would have sold or bought at a different price point (or maybe not at all+). That has to be the case, because that's the attraction of insider trading

This doesn't follow at all. Trades happen all the time between people with the same information. Under your theory, that's impossible.

It's also not particularly relevant to the main point. The idea of banning insider trading isn't to reflect inside knowledge in the price instantly -- that would be accomplished by encouraging insider trading. The idea is to prevent people with inside knowledge from trading. People who sell to someone trading on inside knowledge get a little bit more money (due to higher demand) than they otherwise would have. They miss out on gains that the person with inside knowledge predicted, but that they also miss in the relevant counterfactual (when they sell to someone else with no inside knowledge).

A few end up selling when, in the world without the insider, they wouldn't have. Those people can't be identified and quite plausibly realize the same gain under either scenario, for example, if they have a standing order to sell at X price.


I'm getting that your argument is that because the victims can't be singled out, insider trading is thus not theft.



Look at what https://news.ycombinator.com/item?id=10488948 wrote. The laws against insider trading aren't there to protect one market participant from another, they are there to protect a company from its employees.


Certainly you can make money at integration; vast amounts of money are made. But as one friend in the business really did tell me once: "your job is taking one piece of crap and another piece of crap, get them to talk to each other and so turn them into a third larger piece of crap". Integration "works" but overall often makes the problem worse, or at best allows you to run in place. Imagine when something else has to talk to your "talks to ten things" thing.

For good or ill, the start-up ideal is generally to solve problems broadly: to disrupt an entire industry rather than build one more extension to it.

The problem of integration - of conforming to byzantine standards and to the mess that is existing health IT overall - is itself still just a detail in the overall problem of health and technology.

The fundamental problem is the messiness and unpredictability of physical human bodies/human health.

The reason that health records are a mess is that there is no easy way to universally classify "what is going on" with a given person in a cut-and-dried fashion. For any field of a health record, there are significant gray areas once you get to a large scale (including "basic" things like "male" and "female").

Just as much, health procedures often don't benefit from automation because what one person does to care for another person is subtle and not easily codified.


If you integrate the crappy thing everyone's using with something less crappy, gradually displacing the crappiest parts, then the world has gotten better. Maybe not fast by Silicon Valley standards, but still an improvement.

As for health records, perhaps the expectation of a neat database decomposition—data perfectionism—is itself the enemy. For just about any given application, such complete information isn't necessary to improve on the status quo.

From my perspective, too much effort is spent creating One More Standard to unlock data nirvana, and not enough on breaking down the organizational walls and misaligned incentives that brought us to this point.

The post author alluded to this, in mentioning the EHR cash incentives that tragically lacked viable interoperability requirements, and so just served to lock in the major vendors' existing systems.


> For good or ill, the start-up ideal is generally to solve problems broadly: to disrupt an entire industry rather than build one more extension to it.

But which start-up has ever done that?

Not even Google or Facebook disrupted an entire industry – they just built one more extension to an existing industry.



Come on, look at my context. Of course disruption is never a fundamental up-ending, just a streamlining towards a simpler and more productive approach - Uber and AirBnB being the classic examples (whatever their social value or lack thereof).

The start-up approach (or disruption, the most archetypal part of it) aims for more streamlining at the end of the process - rather than aiming to be one more orifice slurping up money from one of the vast streams of money that already exist. Contrast AirBnB with a company that might "let you reserve rooms with ten of the largest hotel chains in the world" - sure, both change things, and neither absolutely changes the world, but AirBnB is still a more fundamental change (again, in values-free terms).


Well, many successful startups are just that, though.

"let you reserve rooms with the 1000 largest hotel chains in the world" is quite a fundamental change, as you get direct price comparison, info in which weeks it will be cheapest, etc. (And, unsurprisingly, there are dozens of startups doing exactly that).


If Hoffman's argument is simply that humans don't see reality "as it is", it's clearly true but not very interesting - of course perception is flawed, and a simple imperfection can be systematically corrected. Each moment, our visual system receives a distorted image of a given object, and the system combines these images to make a more "exact" one. We have created artificial aids to extend this process, again correcting immediate inaccuracies. The difference between humans and the beetles that mate with brown bottles [1] is that humans can (sometimes) correct our sensory perceptions over time (plus, Hoffman's use of "evolution's equations" in his video seems meaningless, since in the abstract one can assign fitness to whatever quality one wants).

The serious problems come in with systematic errors - especially for philosophy, which has often relied on internal reflection. We know about optical illusions and mirages. But there may be higher-order things - ideas/approaches/beliefs - that we not only "don't know that we don't know" but that we tend not to be aware of even when they are called to our attention. And we in fact know about many biases humans have [2]. It would have been more interesting if Hoffman had focused his attention on these. Then again, his own biases might have prevented this.

[1] https://www.ted.com/talks/donald_hoffman_do_we_see_reality_a...

[2] https://en.wikipedia.org/wiki/Bias


