
    We considered a GitHub profile as gender neutral if all of the following conditions were met:
    - an identicon (rather than a profile image) was used,
    - the gender inference tool output 'unknown' for the user's login name and display name, and
    - none of the panelists indicated that they could identify the user's gender.
    Across both panels, panelists inspected 3000 profiles of roughly equal numbers of women and men.
Given that acceptance rates dropped for both men and women when this method determined that a user was not gender-neutral, I wonder if use of a profile picture or real name is the real killer here. Perhaps the kind of person to include identifying information in their Github profile is less likely to have good pull requests? More likely, perhaps having any identifying information at all gives people an excuse to reject your pull requests? That would explain why there was a drop for both sexes but a slightly stronger one for women. If you are identifiable from your profile image or name, people are more likely to reject your PR on the basis of your ethnicity, appearance, religion, or gender, with gender being only a part of the effect.

> Perhaps the kind of person to include identifying information in their Github profile is less likely to have good pull requests?

Probably commercial employees?


That's incorrect. The study identified the gender of the Github users twice. They got their 'canonical' gender for users using your method (and removed from their sample the users who could not be identified with this method), but later, to figure out whether a user's gender was identifiable, they used the Github user's own profile.

Non-Google+ users were not analyzed at all. That could possibly still influence the results, but not the way you are suggesting.


Only 35% of the accounts have their gender listed in a linked Google+ account. Checking someone's social media profile is a relatively sure way of automatically determining the gender of a lot of people. The authors did use another automated tool to see if they could figure out the gender of users from their Github profiles as well, which is something they needed for the second part of their analysis. They don't specify how accurate that procedure was, so it's possible that it's more accurate than you think.

Of course, there is still the issue that they have effectively limited their sample to people with Google+ accounts, which may affect the results of the study. That men's acceptance rate also dropped when their gender was identifiable (though not by as much) lends credence to the idea that there might be a flaw in their Github profile analyzer.


"You have no control over what queries it writes"

Huh? Just write a view or a stored procedure and point Tableau at that. There's no need to compose a query in the Tableau interface.
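
A minimal sketch of that approach (table and column names hypothetical, Postgres-flavored SQL):

    -- Wrap the exact query you want in a view, then connect Tableau
    -- to the view rather than to the underlying tables.
    CREATE VIEW monthly_sales AS
    SELECT region,
           date_trunc('month', sold_at) AS month,
           SUM(amount) AS total
    FROM sales
    GROUP BY region, date_trunc('month', sold_at);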


Run a trace on your database and watch the queries Tableau actually runs while building your worksheets and dashboards. It's doing a lot more than you think.

It's interesting that Yegge places Clojure into the 'conservative' camp whereas the author of this post places it into his equivalent of the 'liberal' camp. Personally, I think that says more about Yegge than about the author of this piece, but it may also mean that the distinctions each is making are subtly different.

-----


I think it simply reflects that the distinctions each is making are, to a great degree, subjective and up for debate. See, for example, http://blog.ezyang.com/2012/08/is-haskell-liberal-or-conserv..., which argues that Haskell of all languages is, in a sense, liberal.

-----


When I write Clojure I tend not to worry too much about this. Most of my variable names will be declared with `let` and will have a very limited scope. I can go ahead and shadow some builtin function name and not worry about it unless I happen to need that builtin within the `let` block. It's usually the case that I don't.
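
A toy illustration (binding name mine):

    ;; Shadowing a builtin inside a `let` is harmless as long as the
    ;; builtin isn't needed anywhere in that scope.
    (let [count 10]            ; shadows clojure.core/count
      (str "total: " count))   ; => "total: 10"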

-----


Locally overriding builtin/global functions seems like a bug waiting to happen. There's no way to tell whether you actually intended to use the local variable inside the scope, or the function it shadowed. It seems to me that in a Lisp-1 all local variables should be prefixed with some sort of symbol so that they can never override globals.
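
A minimal sketch of that failure mode (names mine):

    ;; The local silently wins over the builtin; the error surfaces at
    ;; the call site, not at the binding that caused it.
    (let [map {:a 1 :b 2}]     ; shadows clojure.core/map
      (map inc [1 2 3]))       ; ArityException: the hash-map gets called
                               ; with three args instead of clojure.core/map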

-----


"All of the western political ideology (Colonialism, Communism, Fascism, Democracy) that came to Korea (and China) came by way of Japan."

This is demonstrably false. Chinese communism especially was not just influenced but guided by Russia and the Comintern. Mao certainly didn't get his ideas from Japan, and the contrasting liberal and fascist influences largely came from the West by way of Shanghai rather than Japan.

Similarly with Korea. Japan may have forced Korea to open up and dominated it for nearly half a century, but Korean communism came from Russia.

-----


That was a bit later though. Leninism came to China and Korea in the late 1910s and 20s. I am talking about the ideologies and philosophies (Marxism among them) that came from the West by way of Japan several decades earlier than that.

You could argue that Leninism and Stalinism are more directly responsible for North Korean Communism today than Marxism was, and you wouldn't be wrong in my opinion, but the concepts of socialism, fascism, democracy, etc. first came to Korea and China by way of Japanese scholarship.

-----


I've had people seriously claim to me that using Bayes' theorem to evaluate the beliefs one deals with in everyday life, using the evidence one comes across in everyday life, was likely a good idea and would reduce bias. I wish I'd had the presence of mind to point out that that did nothing to eliminate the selection bias of one's own experience. No mathematical formula can draw meaning out of weak or flawed evidence.

Trying to do so is like trying to 'enhance' a blurry photo so that you can see details in the photo that didn't exist.

-----


Even if you only get weak or flawed evidence, you do what you can to make the best decisions given that evidence. No one seriously suggests you do actual Bayesian calculations on everything you know, for the simple reason that it's not computationally viable.

But if your beliefs directly contradict Bayes then you're doing something wrong: there's an inconsistency that's likely worth investigating, unless the matter is really minor. It's a sanity check for your decision making, not a constructive algorithm for the best decision.
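
A toy version of that sanity check, with every number invented for illustration:

    ;; Prior belief 10%; evidence three times likelier if the hypothesis
    ;; is true than if it's false.
    (let [prior 0.10
          p-e-given-h 0.60
          p-e-given-not-h 0.20]
      (/ (* p-e-given-h prior)
         (+ (* p-e-given-h prior)
            (* p-e-given-not-h (- 1 prior)))))
    ;; => 0.25

If your gut treats that evidence as settling the matter, that's exactly the inconsistency worth investigating.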

-----


I think you're missing the broader argument, which is about using 'mathy' concepts to dress up poor reasoning. Obviously priors matter, but what matters most of all is how good and complete your evidence is. Using a mathematical formula to lend credence to weak evidence (through liberal use of assumptions) is a hallmark of pseudoscience. The same could be said of many abuses of statistics; Bayes' theorem is merely one good example.

-----


Is using mathy concepts to dress up poor reasoning worse than not using anything to back up your reasoning? At least you can point out exactly what's wrong with the mathy reasoning.

A colleague of mine says 'Sometimes pulling numbers out of your arse and using them to make a decision is better than pulling a decision out of your arse'

-----


> 'Sometimes pulling numbers out of your arse and using them to make a decision is better than pulling a decision out of your arse'

Agreed! Leaving the pseudoscience example aside - since there are strong emotions involved - we can clearly see that it is indeed useful and necessary to make decisions under uncertain or incomplete information. This is advantageous whenever the cost of inaction is expected to exceed the cost of backtracking a less-than-perfect decision, which is often the case.

Let's say... project management. If you take the time to find out that your project requires 100 tasks, 30 of which lie on your critical path, you can argue about whether each task will take one day or one week to complete, and you can debate whether adding a third or fourth member to the team will significantly speed up the completion date. But you will definitely be in better shape than if your PM just cooked up some 5-page spec overnight and committed to having it running in beta test by the end of the month, before even announcing it to the team...

Which itself will be better than having all your potential contracts snatched by competitors that never do any estimation at all but are very good at pulling themselves out of tarpits of their own making.
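
Even the crudest bounds from the estimate above are useful. A back-of-envelope sketch (assuming 5 working days per week):

    ;; 30 critical-path tasks, each between 1 day and 1 week:
    (let [tasks 30, low 1, high 5]
      [(* tasks low) (* tasks high)])  ; => [30 150] working days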

-----


"Is using mathy concepts to dress up poor reasoning worse than not using anything to back up your reasoning?"

I believe so. If your belief is baseless, or based on flimsy evidence or simple bias, it's best if that's obvious. Dressing up weak reasoning to seem stronger is a form of lying; it's what we call sophistry. A big part of the problem is that a lot of people don't understand the math well enough to point out what's wrong with it, or have a bias towards explanations that seem complex or sophisticated but really aren't.

It's true that sometimes we have to make a decision based on poor or no evidence, but when that's the case, it should be clear that it is. Dressing up the argument only obfuscates that.

-----


Honesty is the ultimate issue here. If my reasoning is shoddy, but I plug it into some math apparatus, then it'll likely make my problems obviously wrong. If my reasoning is very inaccurate and the data uncertain, being precise about it can at least make the results salvageable. Scott Alexander argues for this position quite well in [0].

Humans can lie with statistics well. But they can lie with plain language even better.

[0] - http://slatestarcodex.com/2013/05/02/if-its-worth-doing-its-...

-----


"If my reasoning is shoddy, but I plug it into some math apparatus, then it'll likely make my problems obviously wrong."

That's pretty clearly untrue. I remember reading a study recently where the p-value was less than .01 or something like that, but where the experimental design was clearly flawed. The correlation wasn't the correlation they thought they had. But because the math looked good and it was easier than actually reviewing the experiment, it was tempting to take the study at face value.

I've read Scott's essay before and I understand his argument, but I don't think it works. While you might be able to avoid some bad reasoning simply by being more systematic, you can also strengthen bad arguments with a faulty application of statistics. What Scott doesn't do is provide an analysis of how often each of these things happens. I'd argue that for each time a quick application of statistics saves someone from a bad intuitive judgment, a misapplication of statistics is used to encourage a bad judgment at least once, if not more often.

Understand that my argument here is not that one should never use statistics or even Bayes theorem, but that a naive or lazy application can be worse than no application.

-----


I see your point and I agree.

For myself, I try to limit myself to the mathematical apparatus I feel comfortable with. I know that if I were to open a statistics textbook, I could find something to plug my estimates into and reach a conclusion, and I'm pretty sure the conclusion would be bullshit. I learned this the hard way in high school - I remember the poor results of trying to solve math and physics homework assignments on topics I didn't yet understand. The mistakes were often subtle, but devastating.

-----


This is a general argument against statistics. Or math, in general. Yes, dressing your bullshit in math can make people believe you more, but it doesn't change the fact that you're lying. Are we supposed to stop using math for good because evil people are using it for evil?

-----


No, it's an argument against using statistics without first considering the strength of your data.

-----


Then you should take the Bayesian side, because Bayesians look at the data first, and they take their data as given rather than taking a null hypothesis as given. They don't just blindly go off and run a test (which implicitly assumes a particular prior that may be wildly inappropriate) and see what it says about the likelihood of their already-observed data being generated by the test's assumed data generator.

-----


But being a good Bayesian makes you do exactly this. The process of describing your priors makes it obvious that you need a sensitivity analysis to check how much the prior is influencing the conclusions...
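
A minimal sketch of that check (the likelihood ratio is invented for illustration):

    ;; Sweep the prior while holding the evidence fixed (likelihood ratio 3).
    ;; If the posterior mostly tracks the prior, the data is doing little work.
    (doseq [prior [0.01 0.1 0.5 0.9]]
      (let [odds (* (/ prior (- 1 prior)) 3.0)]
        (println prior "->" (/ odds (+ 1 odds)))))
    ;; 0.01 -> ~0.03, 0.1 -> 0.25, 0.5 -> 0.75, 0.9 -> ~0.96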

-----


> being a good bayesian

This is exactly what weirds people out about LessWrong folk. They talk about a tool as if it's a religion.

-----


It's a running joke there.

People need to get over it. The LW crowd is a group of people studying a pretty specific set of subjects, centered on a single website. It's typical for such a group to develop its own jargon and insider jokes, which may look weird from the outside. It's normal.

-----


"Good Bayesian" in that context just means being an able user of Bayesian statistics, not necessarily holding any particular philosophical belief about what they mean.

-----


How can you evaluate the strength of your data without using statistics? You've created a catch-22.

I'll speculate that you have some sort of meta-heuristic and only apply this catch-22 under certain circumstances? E.g., this catch-22 only applies to weird and socially disapproved topics?

-----


Do people not know what an administrative center or military compound is? Are analogies from neurology of all things really more accessible?

-----


My guess is it's a translation of an idiomatic expression in Spanish. I'm not sure, but 'nerve center' is a pretty common idiom in Italian for a hub, and a cursory web search seems to confirm this phrase exists in Spanish as well.

-----


It's a very common idiom in English as well. http://www.merriam-webster.com/dictionary/nerve%20center

-----


A nerve centre could be translated to Spanish as 'centro neurálgico', which basically means administrative center. It's widely used in Spanish :)

-----


Thanks. I don't speak Spanish. I had no idea that this was a common idiom.

-----
