Hacker Newsnew | past | comments | ask | show | jobs | submitlogin
[dupe] Microsoft fixes the Excel feature that was wrecking scientific data (theverge.com)
45 points by rbanffy on Oct 23, 2023 | hide | past | favorite | 27 comments




It's too late, the genetics community already caved :-(

[Scientists rename human genes to stop Microsoft Excel from misreading them as dates](https://www.theverge.com/2020/8/6/21355674/human-genes-renam...)


I just stop working with people who request or send me data as .xlsx


Perversely, this is almost the exact opposite of how to solve the problem.

Speaking from experience, the real problem with scientific data (genomics, in particular) is when it is exchanged in plain-text format and then loaded IN to Excel. Automatic type inference is what fails.

Data already in Excel format can be created/modified/exchanged without unintentional conversion if the column type is correctly specified as "TEXT".


True, but how do you trust something that started as a CSV and was digested into Excel?


Unless you've done it yourself carefully, you can't


Speaking from experience, that would be most people in life sciences laboratories.


Look, I get the impulse, and might have agreed with you 10 years ago. At the end of the day, though, we have to work with non-computationally savvy people who (reasonably) want to look at their data sometimes. Not every lab tech or PI can or should learn Python/R/your favorite scripting language, and frameworks like Galaxy take time to set up and maintain, etc etc. Our job is to meet the users where they are and push the biology forward. Any time that I can push them onto a better path, I'll do that, but sometimes the right move is to tell them: "Yeah, go ahead and play with the data in excel, then tell me what you find", and I'll code something proper up afterwards to verify it, get solid stats and a make a pretty visualization.


> Not every lab tech or PI can or should learn Python/R/your favorite scripting language

The amount of learning required is minimal, even more when compared to the contortionist required to process data in Excel.


Working in the enterprise tech world, I'd go completely broke.


You arent in management, nor a very flexible person?

Businesses use Excel for a reason. Nice that MS finally gave that setting, but renaming genes to be able to use a popular tool also works.


I'm confused by this fix, which is a global setting. The formatting and conversion rules should be enabled at the column (or cell) level. I should be able to make a selection and declare the conversion/format ruleset for that selection. Why is this so hard to grasp from a feature perspective?


You have always been able to fix this on a column/cell level. Hit CTRL+1 -> category -> text (or whatever conversion/format you want):

https://i.imgur.com/1mjEF2N.png

The reason this needs to be a global setting is because people want to copy-paste or import plaintext tab-separated or comma-separated data and have it not munge that text by default. Plaintext carries no formatting information.


Doesn’t converting it to Text AFTER it’s being treated as Date just convert it to a text representation of the date that is not the same as what you initially specified?


Yes, if it's already been converted, it's too late - the original full text that was in the paste buffer or the automatic import from the .csv file that your user just blindly double-clicked is gone.

But no, if you first set the formatting of the destination cell to text and subsequently paste in your data, it won't do the conversion. If you're using a Data->From Text/CSV Power Query import that establishes a persistent data source, well, fix your import so that it sets the proper formatting on the destination. But that requires changing the defaults, and given the stickiness of defaults and the number of Excel installs anything that's not the default is going to result in data loss.

This does establish global preferences that a system administrator at your college or workplace can change to protect users from themselves. I wish it was the default for new Excel installations, but that's not how they did it.


Because they are trying to design software that is "easy to use" for the average person just learning to use the software. So they write software that tries to automatically catagorize things, to jump in and autocomplete small tasks it thinks the user wants. They don't seem to understand that nobody is average. We all have specific needs.

Clippy never died. He just stopped announcing himself.


maybe excel is just not the tool for the job?


It isn't. But in many/most circumstances one is not allowed to just run whatever software they want on a work machine. Those days are long gone. Start running your own software in a secure/regulated environment and you will be detected and locked out. So while Excel isn't ever the best tool, for many of us it is the only tool available.


It's easy for techies to say that because we are more likely than most to have the programming skills necessary to use the bulk of the alternatives.


Maybe a lot of HN users aren't in the same demographics as the people who use Excel in their day to day jobs. I know, surprising revelation.


That's obviously the case, but what would you have Microsoft do about it?


Do you have a better spreadsheet to recommend?


I think the suggestion is that a spreadsheet is the not the right tool for the job.

There is very likely value in dedicated genetics-specific software, even in the form of an excel add-on.

[edit] Here's one, and it's free: https://biology-assets.anu.edu.au/GenAlEx/Welcome.html - so maybe this change from microsoft was necessary


maybe some statistical analysis software is more useful - for e.g. minitab


You can already designate a cell a “text” which would block conversion to date or other formats. This problem applies to the default cell format and particularly to data imported from CSV and TXT files.


Although yeah, this is a very dumb problem in Excel, you can select a block of cells, or an entire row/column and tell Excel "this is text, don't touch it". But if renaming the entire human genome is easier than dealing eith Microsoft's BS, I completely understand.




Consider applying for YC's Winter 2026 batch! Applications are open till Nov 10

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: