When it started I thought Yegge sounded nervous and jittery and seemed a little intense, like he had a chip on his shoulder and I thought "Oh boy, I hope he doesn't do a melt-down".
Then as the talk progressed, I realized he was just excited/nervous and that's how he talks.
He gradually hit his stride and the message that his talk was meant to convey slowly started to take shape for me... and it's a hell of a positive message.
It is a call-to-arms to give a damn and use our powers for the advancement of everyone. To stop spending our free time working on icanhascheezeburger SMS alert apps, pick up a book on mathematics, bioinformatics, data mining and other hard topics, and start learning.
It is a call to arms to send yourself back to school (in a sense): don't be afraid to start learning about other topics that have always seemed interesting to you but that you figured were outside of your area of effect, e.g. "I'm a server guy, I'll never do anything interesting in 3D visualization!"
It is also a call to arms to make money and effect change with principle; like a Google or an Amazon.
You don't need to scrape every last piece of skin off of your customers' hides in order to post big quarterly profits to be successful. You can develop positive relationships with your customers, employees and the world around you and STILL make the money necessary to continue growing and innovating.
The "quitting" part of the talk is unimportant, it was just his way of illuminating his point. The value is in his message.
If you have questions about what it's like to be a hacker in this type of environment, post them here and I'll share what my experience is like.
BTW, I completely wish I knew more stats and bioinformatics, so I probably should purchase the Yegge book collection myself . . .
Here's my pet peeve in bioinformatics - If there's one thing that's poorly suited to science, it's the building of computational infrastructure. We're talking basic stuff like databases, tools etc. Sure, anyone can knock out a bit of code for a basic database, but the big problem is that there's no incentive to have decent code, or maintain it so that it lasts any longer than the person is in the lab, or has funding. So, what will be great is if existing resources are cleaned up - data is normalised and pulled out so that it is actually accessible for doing some kind of analysis on it.
If you want to do bigger work, do something actually novel, or something that has any biological relevance, there's no getting around collecting your own data (e.g. sequencing the crap out of a bunch of things). I'm in the process of trying to get funding for a project of mine to make that very leap now.
I'm sure someone working on next gen sequencing (the new hotness) can pipe up with the big problems to be solved there.
He even has money for a dev position for 4 years, I'm just worried that he gets someone who slaps together a proprietary and incompatible site when this would be a perfect chance to experiment with implementing some standard data access APIs.
I've heard bioinformatics people complain about the lack of standards and fragmented nature that comes from various small groups of scientists doing it on their own.
If anyone is interested pm me and I'll put you in touch, he is in South Carolina so he can't offer the salary and other perks of the Bay, but it's a real chance to put good development energy into science.
They are doing meta-analysis of the neuroscience literature.
So basically I had no clue what Neuroimaging meant or did other than people get shoved in scanner, huge magnets turn and they see inside you ;p
But what I found is there are plenty of computer science problems in the field of Neuroimaging (and neuroscience) that a programmer can help with (processing, image analysis, data mining, storage, visualization, whatever). Most labs don't really have people whose primary job is programming. Thus there are lots of tools that are just hack jobs, long forgotten, that no one is maintaining but everyone depends on. Those things can be helped out by good programming practice and with real programmers behind them. What sucks is getting funding for these people, but Open Source can help here by pooling multiple people from many labs into common projects.
Also, if you do work with scientists, most of them will talk to you for hours about the science of what they do. You can usually ask the most stupid question and they will be happy to answer it. I've found that most people I work with are open, and even more so if the work you do helps them achieve their scientific goals. So in the end if you need to learn some science stuff, they will usually be helpful.
(BTW my project is in my profile, will be open source soon, waiting for some political approval process).
http://openconnectomeproject.org has some introductory material. (This is not my site, it is by some people at JHU who picked up our data set and are working on it.) Massive image data sets, lots of need to develop workflow. You can browse the image data here: http://openconnectomeproject.org/catmaid/?pid=4&zp=40635...
Concretely, look at the plugins being developed by the Fiji project and pitch in, especially on the electron-microscopy-centric plugins:
edit: also think about contributing to the CATMAID project, which is the software serving the browsable data set above, and perhaps will someday enable crowd-sourced markup: http://fly.mpi-cbg.de/~saalfeld/catmaid/
For visualization, check out:
Also, a huge list of projects is at:
In the genomics analysis space, the three tools I hear mentioned most for sequence alignment are TopHat, BWA, and MapSplice.
These are actively maintained projects that I think are mostly developed inside of various academic research groups.
There is also The Cancer Genome Atlas project at:
You can probably find research groups via TCGA that might appreciate some one-off development or support, but it might not be exciting from a tech viewpoint.
There is a ton of EMR (electronic medical record) data out there in free text. If you have skills or interest in things like Lucene/Solr, I would bet that almost any research hospital might appreciate your time and skills. And, if you talk to the right group, want to hire you . . .
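For anyone wondering what the Lucene/Solr angle boils down to, here is a minimal sketch of the inverted-index idea that search engines of that kind are built around. It is plain Python, not the Lucene or Solr API, and the notes, function names and query are all made up for illustration.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(notes):
    """Map each token to the set of note ids that contain it."""
    index = defaultdict(set)
    for note_id, text in notes.items():
        for token in tokenize(text):
            index[token].add(note_id)
    return index


def search(index, query):
    """Return ids of notes containing every token in the query (AND search)."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result


# Hypothetical notes; real EMR text is protected data and needs proper handling.
notes = {
    1: "Patient reports chest pain and shortness of breath.",
    2: "No chest pain. Follow-up for diabetes management.",
    3: "Shortness of breath after exercise; history of asthma.",
}
index = build_index(notes)
hits = search(index, "chest pain")
```

Note that a naive keyword match like this also returns the note that says "No chest pain", which is a small taste of why free-text clinical data needs more than an index.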
Also, the dilemma in my mind is whether I can stand going back to grad school in my late 20s for a career I don't really know much about. I'm not sure if you can speak to that experience, but would you say it's been rewarding?
I remember getting a C program from one of the biggest names in both undergrad CS education and machine learning and finding it wouldn't run on 32 bit machines because it had a static array that consumed 4 GB.
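The arithmetic behind that failure is short enough to sketch. This assumes "4 GB" means 4 GiB and that a 32-bit process gets roughly 3 GiB of user address space (a common Linux default, used here only as an illustrative figure); neither number comes from the original program.

```python
# A 32-bit pointer can address 2**32 distinct bytes.
ADDRESS_BITS = 32
address_space_bytes = 2 ** ADDRESS_BITS  # 4294967296 bytes = 4 GiB

# Assumed size of the static array (the post only says "4 GB").
array_bytes = 4 * 1024 ** 3

# Assumed user-space limit for a 32-bit process (illustrative, OS dependent).
user_space_bytes = 3 * 1024 ** 3


def fits(array_size, usable_space):
    """True only if the array leaves room for code, stack and everything else."""
    return array_size < usable_space


print(fits(array_bytes, address_space_bytes))  # array alone fills the whole space
print(fits(array_bytes, user_space_bytes))     # and exceeds the assumed user limit
```

Either way the array cannot be placed: it is as large as the entire address space before the program's own code and stack are counted.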
Even in computer science, the product is papers, not working software, and the situation is worse in other fields.
As someone who had an academic background, I think going from "pictures of cats" to "math and science" is like going from the frying pan to the fire. Entry level positions in the math and science Juggernaut pay 2-5x less than what a junior or senior person in the social media Juggernaut gets. You can sink anywhere from 5-10 years into getting a PhD, and then you'll find that there are just enough new jobs for the children of yesterday's professors who weren't totally destroyed by their upbringing and that they've got an insurmountable advantage in the game of musical chairs.
Science and Math is a system that uses up young people, especially men, the same way that the racing industry uses thoroughbred horses. There's no realistic career path for 95% of the people who get involved... other than working on "pictures of cats" or whatever it is that pays.
"How should we make it attractive for them [young people] to spend 5,6,7 years in our field, be satisfied, learn about excitement, but finally be qualified to find other possibilities?" -- H. Schopper
What a euphemism. Almost like a scam.
In general, you should interview a team when you're trying to get a job and trust your gut on how much those practices you mention will be accepted. It's a little harder here, because for years the model was to assign a dev to a project, and that dev would own the entire project from start to finish. We still don't do a great job working as a team.
Craftsmanship can take a back-seat to schedule. Lots of operational stuff (for grants and administrative purposes) gets pushed off to the last minute. And for research, the focus here is on the results much more than the process. The researchers don't care if you do it in a bash shell script or a Clojure jewel as long as it's done - kind of like a startup. So if you can do it fast in a maintainable and cool way, all the better.
We are suffering the effects right now of some bad QC code. A not insignificant amount of data had to be re-analyzed because of some code bugs. I would say interest is currently much higher in improvement. :-)
I did grad school (MS in CS), went to industry, and then came back. Returning was mainly lucky timing and personal network effects. But I was late 20's when I did grad school, so we're not too different. It was completely worth it for me.
Some PIs are like typical academics - they have MD/PhD (usually both) and can be somewhat dictatorial. But some are great and want your thoughts and expertise.
You could consider simply getting a job doing typical IT work in an organization like this. You're probably right that eventually you'll want some grad school (we are walking around in credential heaven here), but you could always try it out first.
oh, and forget about stuff being normalized or making any sense . . .
What's it like to be a hacker in that sort of environment? :P
Really, anything you could tell me about what you do now, the state of the industry, and your experience would be really valuable.
I came in to do traditional data warehouse work, loading data from separate databases into one spot to make reports across those different datasets easier. That's most of my days now - simple data integration and reporting. Oddly enough, we are an Oracle shop because our university has a site license. But I do a little Ruby/Rails/Sinatra for a project that is reporting + some app-like enhancements. Tons of SQL for manipulating data, but it is usually simple SQL, especially compared to stuff I wrote previously in a financial corp.
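To make "loading data from separate databases into one spot" concrete, here is a toy sketch of the pattern. It uses in-memory SQLite as a stand-in (not Oracle), and every table name, column and row is invented for illustration.

```python
import sqlite3

# Two hypothetical source systems, modeled as separate in-memory databases.
clinic = sqlite3.connect(":memory:")
clinic.execute("CREATE TABLE visits (patient_id INTEGER, visit_date TEXT)")
clinic.executemany(
    "INSERT INTO visits VALUES (?, ?)",
    [(1, "2011-07-01"), (2, "2011-07-03"), (1, "2011-07-10")],
)

lab = sqlite3.connect(":memory:")
lab.execute("CREATE TABLE samples (patient_id INTEGER, assay TEXT)")
lab.executemany(
    "INSERT INTO samples VALUES (?, ?)",
    [(1, "RNA-seq"), (3, "exome")],
)

# The "warehouse": one database holding copies of both datasets.
warehouse = sqlite3.connect(":memory:")
warehouse.execute("CREATE TABLE visits (patient_id INTEGER, visit_date TEXT)")
warehouse.execute("CREATE TABLE samples (patient_id INTEGER, assay TEXT)")
warehouse.executemany(
    "INSERT INTO visits VALUES (?, ?)",
    clinic.execute("SELECT patient_id, visit_date FROM visits"),
)
warehouse.executemany(
    "INSERT INTO samples VALUES (?, ?)",
    lab.execute("SELECT patient_id, assay FROM samples"),
)

# A report across both datasets: visits and distinct assays per patient seen.
report = warehouse.execute(
    """
    SELECT v.patient_id,
           COUNT(DISTINCT v.visit_date) AS visits,
           COUNT(DISTINCT s.assay) AS assays
    FROM visits v
    LEFT JOIN samples s ON s.patient_id = v.patient_id
    GROUP BY v.patient_id
    ORDER BY v.patient_id
    """
).fetchall()
```

Once the copies sit side by side, the report itself is a plain join, which is about the level of SQL described above.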
A significant amount of my day is ferreting requirements out of users. I wish I was better at this.
There is a much larger amount of grunt work happening in the data management space than I expected. We deal with a huge amount of protected health information (PHI), which can be frustrating since you're usually working around data policies that require somewhat careful interpretation. Obviously, the genomic datasets are huge, so dealing with storage and clusters for data processing is often a discussion point around here.
It is an academic/research place, so it is a little more relaxed than industry. Expectations (for software) are a little low - we're talking about users who have, for the most part, kept data in spreadsheets or little Access style databases for years. And they often haven't had the budget to hire dedicated software guys, so sometimes tools are suffering from bit-rot.
Good hackers seem to be appreciated. It is below your standard industry salary, but not terribly so. I try to avoid being sucked into low-value work since "the reward for work well-done is more work."
Just like grad school, if you want to work on a specific topic or research area, make sure you get a job in a group doing that work. Once you're established, it seems easy to get some space to do some exploratory hacking.
And the environment will try to shoehorn "bioinformatics/medical informatics/etc" into "mostly a core services group" of technicians, like surface technicians, kitchen etc. Though in discussions, it will be admitted, that informatics/maths is one of the most promising keys to advance the medical research field.
J. Quackenbush called these groups of people intellectual peasants. Unless these groups are accepted as peers, I can hardly see any true systems approach succeed.
They all seem to be hackers. So I would say, if you're already a hacker, you can probably find a way in somewhere given enough time regardless of your current degree level.
If you know you are going to head off to grad school, don't do another undergrad before. Many CS programs take on students who need a little remediation in math or CS background. I would guess that would be much more difficult for a MS program in biology.
You might even find that a MS in bioinformatics is the appropriate choice. There seem to be several of these programs floating around, they are multi-disciplinary to begin with, and if you can crank out a good GRE score, you can probably get in one without too much effort.
You will probably find it easier to break into the academic research world with at least a MS. But, I hear lots of buzz about commercial companies wanting to get into the sequencing business. Maybe you could just give yourself a crash course in python or ruby and a little biostatistics and sneak into one of those? Especially if you can find someone in your network already working in the field.
I recruited a friend to work with me who had a BS in history and a MBA (I teased him). But he was working in the software field already and I knew he would pick up the simple tech we were using to get work done.
IMHO, his boss deserves the professional courtesy of being told personally (be it face to face, or by letter of resignation), rather than finding out via a YouTube clip...
If you have enough life experience, or read enough of Matt Groening, worst case watch enough Office movies/series, then you could suspect that bosses can be manipulative sociopaths, who do not deserve any professional courtesy.
BTW: most bosses don't fire f2f (Office Space), but with an indirection, through Human Resources Services (or hire scumbags, like Up in the Air) [and then good luck, with your file].
I guess Steve had a reason to do this, and I respect him for having the guts to stand up in such a public way against dirty careerist office background politics, management decision support and calendaring theory.
Maybe this case doesn't fit, I have no clue (in my experience, it can be the "right thing to do" in large organisations detached from true ethics). But if Steve felt it this way, then it was this way (see Schopenhauer's most influential work, The World as Will and Representation)
Basically Steve Yegge said that his boss doesn't deserve the respect to know this ahead of time. In fact it's quite possible that 1000 people will know this before his boss does, including his boss's boss.
In many regards the only person this was disrespectful to was his boss -- and that makes it personal when you single out someone in this manner.
- We will work on projects that make the world better, not to get rich.
- We will study different subjects to better tackle existing, important problems.
- We will work as a community, sharing our findings, learning from each other
Pretty sure Yegge's 'auteurs' are getting rich.
Stevey gives up being part of the chase for the superfluous (in money), and calls for us to do something about the necessary.
No one needs a million dollars. No one. Why are we chasing it and dying of heart disease -- heart disease we can cure if we start chasing that instead?
Our priorities are absolutely messed up and it's time we start realigning them. This isn't a speech; this isn't a funny resignation. This is a clarion call to join in. We can do so much better. We can achieve something valuable, if we start to realize where true value lies.
Steve's in. I'm in. Who else around here is in?
food, roof (with internet connection reaching under it), freedom. In Bay Area, if you have a mortgage, it is half a mil bare minimum and even a million may not take you far enough. There have been experiments to build societies with decreased degree of connection between food, roof, freedom and money - somehow it always went like this one http://www.globalsecurity.org/military/world/dprk/dprk-dark....
And you can't build SpaceX without a bucket of millions (Copenhagen Orbitals are beyond wonderful, yet they are in Virgin Galactic league at best, even in their furthest plans. They show the possibilities for the future and at the same time how far yet that future is)
Yeah, but it helps.
I think it should be noted that Steve is probably fairly wealthy by most any standard from GOOG & AMZN stock.
And with that wealth comes more freedom...to work on 'big' projects.
I think it's a bit disingenuous to ask people to work on non-lolcat projects when you are wealthy and set for the rest of your life. It's admirable to devote your life to working on big ideas, but I would argue most people who are doing so (even those mentioned in his talk - Gates, O'Reilly, etc) aren't really worried about money.
Most people are simply worrying about how to pay next month's rent or handle their family expenses. Sorry, but these people are not 'in' when it comes to working on big ideas.
In my case, if it came down to working for Facebook and making $10+ Million on the IPO or working to cure cancer - Facebook wins. Maybe after that I'll work to cure cancer.
- Everyone else is just making ends meet. (Huh? Everyone?)
- If you have the chance, join FB early so you can be a millionaire and work on important things.
Look at the startup Color. It has brilliant people and they are all working on ways to share pictures with people you don't know.
We're saying, "Stop It!" Maybe you'll get a hit and get rich but that's a stupid goal.
No one needs better ways to share pictures of your cats. There are important things to do instead, and our brightest minds, instead of engaging in ways to move humanity forward, are writing PHP to get themselves forward.
I am very clear on my chances of making big changes to the world: Almost nil. Instead, I decided long ago that I'd do my best to make money and improve my own life.
I'm not saying I don't do little things to help the environment, but there's no chance that I'm going to be on the team that cures cancer. There are too many people out there that are both smarter than me and more learned in the topics needed. The best I could do would be to get in their way.
You don't have to cure cancer; that's absurdly binary and based on fame. You can do less sensational things like invent a device that improves patients' lives using modern robotics and sensors. You can design a system for Alzheimer's patients that incorporates your knowledge of big data sets. I can't dream of all the things to improve, but your domain knowledge is probably tremendously useful in all kinds of fields.
Accidents can always happen but if you're writing PHP to tag cat pictures, no, it's not going to do anything. But keep telling yourself that if it helps you sleep.
Having worked in academia, you rapidly appreciate that the people who keep the lights on - administrators, lab techs, librarians - are about the most important people there.
And that system? Depends on taxpayer funding. If you're doing things which are ethical, and you're doing them well, then you're helping cure cancer and all the rest, even if only indirectly. Maybe you could do something more direct; that's your call to make. But don't minimize the impact of doing good work.
He's talking about working toward your potential and possibly making short-term sacrifices in exchange for the greater good.
Because if you have a unique perspective on a problem, you may be the only one in the world right now with the vision to solve it. Don't waste your time working on mundane shit when you know you could be doing something more.
What I was really aiming for was "don't do yourself down" - and "don't underestimate the impact a well-placed tool can have".
One of the biggest reasons I decided not to do further research is that I have noticed there is an unfathomable number of people hobbling about on ancient and primitive tools when there are existing implementations and papers available that would make their lives orders of magnitude more efficient. So I've made a personal choice not to pursue the latest, fastest and most cutting edge algorithms and rather to bring my knowledge, experience and problem solving abilities to people who are trying to solve real problems right now.
My point is that even though you think you can't compete with these guys who read, study and do 'real' science, they could use a lot of help from the likes of you. Sure, learning some linear algebra and Bayesian statistics can help in directly implementing the algorithm, but usually the biggest problem I've seen is a complete lack of software engineering and hard coding specific to certain data sets.
I think the scientists can gain a lot from the software engineering field, especially open source practices. They will be resistant to it, as others have pointed out the incentives don't always line up, on the other hand there is a lot of low-hanging fruit in terms of improvements a decent programmer can pluck.
Yup, you said it exactly right. More programmers to assist the scientists. The scientists know the science part but they need help with the informatics part (lots of it). If not, they do it themselves and you end up with some software that becomes critical but with no one being able to maintain it.
Plus, working with the scientists means you have access to experts in their field, and usually they like to talk about it, so over time you'll learn the science and the whys behind the stuff you work on.
Good luck with the startup.
This is a rather obnoxious strawman.
Google, "where I work right now", they are doing great work to attempt to change the world. At least more than other companies.
I work on compilers. I like to work on big data, learn about data mining. Because even for compilers work I need to do large data analysis, and face tremendous scaling problems.
Hollywood blockbusters summer 2011: why is this slide here? These summer movies are all crap, because corporates are greedy, they are incremental, not trying to shoot for real quality, real game changers. They chase money.
Except for "auteurs": people making money while keeping principle. E.g, Pixar. They show their passion, make every one look bad, but make money as well. Apple is also a great example.
Social networks; this is what I work on at Google :-( (lolcatz pictures on the slide). Is this principled? This is fun, and making money. But not principled? Is there anyone in this crowd not working on social networks? This is a hype. Why is everyone working on this? This is money chasing.
You are interested in social networks, but when you are 60 you will be interested in your health. But then it is too late. You will wish we had solved these fundamental health problems when you are 60. These are hard problems that require math, statistics and big data.
Human genome project: This will be an inflection point in human history. It is also a data-mining project. Reverse engineer the source code (genome) with respect to how treatments work/are-effective. The people who can solve it, data mining people, are working on crap problems, lolcatz social networks :-(
Let's effect a culture change.
short-term: infrastructure and scaling
medium-term: math, data mining, bioinformatics
long-term: important problems
I had a midlife crisis instantly after rehearsing this speech once. I am not following my own advice. I have started working on math every evening. And I am officially quitting that social network job at Google. (Is he also quitting Google?)
This way I will be ready when we are in a position to face those important problems in five years.
Is there a hacker scene in Bioinformatics?
Also, #bioinformatics on freenode is a good place to meet people.
With respect to open source work - the field is wide open. For example, as "data mining/big data" was mentioned, the interesting recent development synthesizing advances in general availability of cheap server farms (hardware and software -wise, in particular cloud/hadoop) and a new approach - "meta/shotgun sequencing" - enabled by such advances can be found here http://bowtie-bio.sourceforge.net/crossbow/index.shtml
Also may be interesting to look at http://www.jcvi.org/cms/research/projects/jcvi-cloud-biolinu...
Put it another way, anybody can throw money at the problem (and a lot of people do). But not many can write algorithms that efficiently work on terabyte data sets.
But what to do about it? Hmmmm.....time to go shopping for books, or search for online courses.
It is literally burning electricity (oil, coal, Fukushima) for no real reason (MS wants to patent computational heating?)
We have no clue how proteins work/interact (this is why epigenetics is hyped these days). We have some "educated guess", but it is mostly data harvesting from public databases and then doing some simple Bayes or correlation analysis, without much scrutiny on the harvested data itself (ie. was it made up using multiple imputation?)
Also I'm not sure if he quit Google, or if he just quit the project he was working on.
Would you trust the output of this chip, as it has been presented/used in the paper ?
1. Katsanis, S.H., Javitt, G. & Hudson, K. A Case Study of Personalized Medicine. Science 320, 53-54 (2008).
2. Konstantinopoulos, P.A. et al. Integrated Analysis of Multiple Microarray Datasets Identifies a Reproducible Survival Predictor in Ovarian Cancer. PLoS ONE 6, e18202 (2011).
Off-handed comments (in a thread about Google) calling 23andMe's scientific research "misleading" seem a bit snarky. Educated discussion has a place -- eg. in HN threads related to the company or (especially) in the peer-review process.
http://news.ycombinator.com/item?id=2813270 (currently on HN's front page)
[Edit: This was in response to someone asking why it wasn't on the front page.]
Plus publicity stunt of quitting so you'll listen [at 14:15, sounded like quitting a cat-photo-sharing project, rather than quitting Google].
Plus implication that bio is the only domain that is world-changing.
People thought we were "unlocking the genome" ten years ago with the Human Genome Project. If that taught us anything, it is that biology is a lot more complicated than simply parsing data. Biology is messy, complicated and breaks every single one of its own rules. Repeatedly.
I'm not saying bioinformatics is unimportant or unnecessary, because it truly is important. I'm simply tired of people (particularly famous people who are grandstanding) boasting that XYZ will "cure cancer".
What are the good starting points for learning about bioinformatics?
Regardless, best of luck to him wherever he winds up (within Google or elsewhere)! And I hope he keeps writing!
I used to have a philosophy of intentionally choosing easy, "overlooked" problems. I figured I wasn't that smart, so I should just stick to the simple stuff. The software I built was good and useful, but a lot of it is already becoming obsolete. I want to make software that will last 50 years, not just 5.
This talk came at a great time for me, and it's strengthened my resolve. I'm going to keep learning about speech synthesis and acoustics (which means a lot of math and physics that I slept through in school), and hopefully I can push the field forward a little bit.
"What are the most important problems in your field? Are you working on one of them? Why not?"
It also seems to me that the math you would learn in an O'Reilly book isn't in depth enough to contribute to research.
What would your list be for Steve?
He starts by talking about how he joined Google because he believed they really wanted to change the world. They are basically the only ones fighting for Net Neutrality, etc. He said Amazon had a similar culture when he was there. He goes on to talk about scaling and how it might be the biggest problem for a lot of companies. Basically, everyone is working on some kind of scaling problem and it's usually for a stupid "cat picture project" (social networking). Later in life, you realize there are more important things (specifically, health related in the video), but it's too late because these tasks require math, stats and domain-specific knowledge. We are mostly lacking the domain-specific knowledge. He talks about how you may wish you could go back in time to tell your younger self to do something more meaningful. He challenges everyone to learn something new and make a difference. With about a minute to go he puts his money where his mouth is and quits his job.
Over the past 20 years, we've gone from writing software that runs on a single desktop with a very limited set of data to systems like Amazon and Google that accumulate large sets of data and need to solve scalability problems. Now that we have all this data available, and the scalability problems are FAR more in hand than they were previously, the question is what we do with this data.
There are lots of ways we can use big data to solve real-world problems, but solving real-world problems requires a degree of fluency in the language of the problem you're trying to solve. Most data-mining knowledge is being used to sell ads or to make it easier to share and find pictures of cats on social networking sites, when with a little bit of domain knowledge you could literally change the world by solving a big, data-driven problem. Go do that. Learn on your own time and do something worth doing instead of finding new ways to earn a buck by sharing pictures of cats.
I'm Steve Yegge and I quit.
7. MS-DOS is 30 years old today (extremetech.com)
44 points by ukdm 3 hours ago | flag | 11 comments
8. Bring your half-baked idea to the Half-Bakery (halfbakery.com)
54 points by rfreytag 4 hours ago | flag | 11 comments
22. Steve Yegge quits Google in the middle of his speech [OSCON Data 2011] (youtube.com)
162 points by kodisha 3 hours ago | flag | 35 comments
EDIT: 323 votes in 5 hours... highest I've seen it as a #21, currently #22.
Is it because the YC portfolio is heavy on cat picture startups and light on startups that do something important?
It's an inspiring notion if you're into that sort of thing, though a bit unfair to his would-be colleagues.
Personally, I believe we've done enough for humanity already.