Hacker News new | past | comments | ask | show | jobs | submit login
Microsoft is bringing Python to Excel (theverge.com)
858 points by Xeophon on Aug 22, 2023 | hide | past | favorite | 445 comments



This was a nice surprise to see today as an Ex-Excel developer who worked on trying to bring Python to Excel (and, I guess, failing ;)).

7+ years ago I had the option of leaving the Excel team. My then boss’s boss knew I had an interest in bringing Python to Excel and offered me a chance to tackle it if I chose to stay. What was meant to be a 6 month project turned into a ~3 year project, the Python part faded away and we ended up enabling JavaScript Custom Functions in Excel instead.

For Python we were also running ‘in the cloud’ (AzureML v1), although there was some back-and-forth on if we should run locally. I think what made the Python part disappear was our partner AzureML team re-orged, re-released, re-hired, we lost a PM and our work caught the attention of another partner team who realised they could use our code to execute their JavaScript out-of-process. And so I spent a lot of time ensuring that feature was successfully shipped at, I guess, the detriment of Python.

I had a lot of help from some strong engineers and learnt a lot. The core of the work was modifying the calculation engine of Excel to allow functions to compute asynchronously, allowing the user to continue working on other parts of their spreadsheet while the remote endpoint (be it JavaScript, Python or something else) was computing. Previously the spreadsheet would lock up while calculations were running, and that wouldn’t be cool for long-running unbounded calculations. Have to wonder if any of the stuff we built made it into this new feature.

Super great to see this and look forward to trying it out.

Recalc or die


Thanks for your work! The async JS functions are amazing and allowed me to ship rock-solid Custom functions with xlwings and Python. The problem with Office.js is that it’s just simply too hard to get started with all the npm bloat for the average Excel user (even more professional devs struggle). Well, I am pretty happy how I am saving users from having to use Node.js: https://docs.xlwings.org/en/latest/pro/server/officejs_custo...


Oh this must be Felix! I saw and played with xlwings many years ago and was impressed :)

Agreed with the friction around getting started. I would like to see it reduced. It’s cool to see you using them to wrap Python functions - I have done something similar before myself, and still need to get around to shimming Haskell into excel at some point.

Another thing I don’t like is how JavaScript Custom Function cell formulas have a horrible string after the _xldudf marker, which becomes noticeable if your workbook doesn’t/can’t load the custom functions addin (or you unload the addin)

Thanks for your comment


It's me, yes, are you Michael then ;)? Yeah, I love the Office.js platform, I believe it's the most versatile extension platform there is, it just came out during an unfortunate time (when people believed that you can't use JS without React anymore ;) Anyhow, the magic of Office.js is that you can use it like a JS framework with total freedom what you use on the backend (it's just a fetch call away), so whether that is .NET or Node or Python/R/Julia doesn't matter at all, it's just that the official docs don't really bother to explain that or show an example other than Node & C#. What's a game changer (and only available on Office.js) is the integrated authentication via SSO/Azure AD, which is the first thing any IT department in a corporation is interested in.


I’m not Michael - presuming you are referring to Michael the PM. I was a dev. I’ll add you on LinkedIn.

Great to here you praise auth. I almost forgot getting auth ‘right’ inside Excel was a real time burglar.


> The problem with Office.js is that it’s just simply too hard to get started with all the npm bloat for the average Excel user (even more professional devs struggle)

IMO deployment is an even harder problem. The Office team makes it incredibly difficult to deploy web add-ins (presumably for business reasons) and accordingly most companies I'm familiar with are still using the ancient, barely-supported COM+VSTO add-in models.


Which part do you find difficult? The web backend or the deployment of the Manifest.xml via the app store / office admin?


The part that makes it a non-starter is that it has to be in some kind of a store (whether it's on prem via SharePoint) or the Internet. What is needed is that you can install something on the user's computer (xcopy install, msi, w/e), but without requiring any other Microsoft end-point. And then, the bits installed either run completely on-node (similar to COM/VSTO) or it would require that access to remote web-server be required. If a remote web-server is required (which is a bit of a limitation), it should be something that can be pointed to without Microsoft's involvement. Just like I can create a web-site, give you a URL to it, and you can go it using your WebBrowser. No app-stores.

That doesn't mean app-store shouldn't be an option. I would love to also publish my add-in in the Microsoft AppStore, but it should be my choice, not a hard-requirement.

That's why people are still using VSTO, not because we have a problem with JavaScript vs. C# or COM.


Totally agree on the part of the add-in: Just give us a dead-simple method to load the manifest locally and be done with it!


The app store / Office admin stuff. When I was dealing with this stuff 2 years ago, it was insanely difficult to deploy a web add-in internally (something involving the "Microsoft 365 admin center"), let alone deploy it to customers (gotta get approval to deploy it in the special Office app store, and hope+pray that your customers' IT departments don't block that).

Compare to VSTO, where IIRC installation just involves editing a registry value to let Office know where your add-in can be found on disk.


You just reminded me I tried to make an office addin a few years back and distribute it on the office store. Terrible experience trying to get Microsoft’s admin/business center (whatever it is/was called) to verify I was actually a real human/owner of the domain as a sole proprietor. Terrible.


This is so weird. I built a working "python within excel" a few years back. (it also did sql and matlab&r remotely)

Abandoned now: https://alphawolfxl.com/

I abandoned this when some people I asked to beta test it said "their company wouldn't pay for it" and gave me the impression I'd wasted my time....

....Not sure how to feel about this.....


Don’t feel too bad because if you would have created a product from it, today this product would be deprecated..


I don't think that's necessarily true. They built VSTO for C#, but everybody was using ExcelDNA instead, an Open Source solution that was much better!


By chance I built quite an interesting addin for excel using VSTO. It was great, nothing to complain about. In the corporate world Microsoft tech is widely common.


You were just ahead of your time!


Do you know why Microsoft never made proper examples of the new technology? The few examples provided are for programmers only.

In general there is a scarcity of working code that you can inspect to see how javascript in Excel works.

Now it is mostly "how to draw an owl"


I don’t unfortunately. Given the more technical nature of the feature I’m inclined to agree Microsoft should make a bountiful supply of working examples.


Glad I am not the only one with this impression :)


Thanks for your work on that! I use custom JavaScript functions to hit our internal analytics API. It was really nice to work with.


Thanks for your kind words. Did you ever use streaming function to update your analytics periodically or when new data become available?


That's really an interesting story; thanks for sharing it.


Great user name.

I have a private github repo called NVBC, the "NecroVisualBasiCon", with all of my favorite VBA-isms.

When I code it, I indent all of the `End *` lines to column 80, to make the VBA look more like python.


Appropriate username


An aside - I wanted the username xl but someone registered it the day before (the day before!) me on hn.


> the Python part faded away and we ended up enabling JavaScript Custom Functions in Excel instead.

Nobody could have anticipated that GPT4 would author Python better than anything else.

Nonetheless, I kind of just want Google Drive except Excel, without adopting anything more than the Office ecosystem, that just works with simultaneous editing.


I wish it wasn't solely powered only by Microsoft Cloud and could support python running locally, but regardless I think this will still be huge and single-handedly modernizes Excel by a large margin.

This alone could eliminate the need for websites that just want this sort of data. I can think of a past project at a previous job I did building an analytics website used only by a handful of people internally that could have been just as well served with something like this, had it existed at the time.


> This alone could eliminate the need for websites that just want this sort of data

I catch flak for this all the time! Our users just want to be able to use Excel to free-form analyze their data. IT, of which I'm a part of, insists on building web apps to do that for them. They're never happy because all they wanted was Excel.

I keep explaining all we have to do is build a data mart built using SQL Server, and use our ETL tools to keep it populated with the data they need to analyze. Don't let them anywhere near the source data. They keep looking at me like I have ten heads. Instead they want to launch multi-million dollar projects to bring Tableau to our users - and they don't want it! It's asinine!


Our company uses Power BI for this very reason. Most of our users are content with what's available in the reports we've built in Power BI. However, a few power users directly connect to the datasets we've created (that also feed the Power BI reports) and can mix and match that data in Excel.

Excel will import datasets from the Power BI client, and you can refresh the data at any time. The nice part of this arrangement is that the whole model is imported. All the relationships between tables, all the added calculation columns, all the measures are there for use by the end user in Excel.

Power BI has been a great solution for our company (~1000 employees).


There’s also the Power Query add-in for Excel that’s super steezy.

It’s like a light Power BI.

Honestly even thought I can make interactive report webapps, that stuff needs to be built and maintained so I rather have people use Excel or a BI tool. Even I rather use Excel or a BI tool than write code.


To be fair, Tableau Desktop, Tableau Prep, and Tableau Cloud are great if expensive products. With Tableau Prep, you get a visual ETL tool. Being technical, I prefer SQL, but for business folks Tableau Prep is more of a REPL so they see what their actions are going to do. They can schedule these Prep flows in Tableau Cloud which will ETL and produce daily datasets which users can then interact via the dashboards on Tableau Cloud.

IMO, Excel is on the right path with out of the box connectors to Salesforce, Azure databases, etc. What it really needs is full blown SQL with REPL output so the users can see the effects of their SQL. (The way Tableau Prep does it is that it defaults to a sampling of rows to output immediate results). There used to be a Microsoft service called Excel online but it was another thing you had to buy and was super confusing. Excel needs to let users schedule their ETL/datasets in SharePoint online and then let other users subscribe to it. They can call it Excel Super+ and charge a small fortune. I am positive there are people with their wallets open.


That last part is almost exactly what we do with Power BI. We import tables from a SQL data warehouse into Power BI. We then build a data model (creating relationships between tables, adding calculated columns to those tables where needed, and creating measures to summarize the data). We then create reports based on those datasets. Most of our users are content with what they see in the Power BI web client.

This is the killer feature, however: power users can access those same datasets through Excel and import the entire model. All the relationships, all the calculated columns, and all the measures are there for the power users to mix and match and filter however they want in Excel. It's been a phenomenal success for our business.


>This is the killer feature, however: power users can access those same datasets through Excel and import the entire model.

I had this once. Then someone decided it was 'dangerous' and locked us out. I went back to exporting the data from the ERP and loading it into Excel...because I was allowed to do that


> I am positive there are people with their wallets open.

Take our money! The entirety of the Fortune 1000 would license it on day one!


> What it really needs is full blown SQL with REPL output so the users can see the effects of their SQL.

FWIW, I built a version of this at $DAYJOB. You can get REPL-like access and export to Excel using a tool like this: https://azure.microsoft.com/en-us/products/data-studio

Non-software engineers looked me like I had ten heads for suggesting that people would need to learn a modicum T-SQL to make use of it. I even volunteered to write all the views for them! The lesson learned for me is that horrible expensive ERPs/ETLs tools are never going to be replaced with Excel + a database because even the (non-software) engineers at a large company are not interested to learn how to write basic SQL statements.


FWIW I thought Azure Data Studio was very cool.


DataIku


Google’s ecosystem makes this incredibly simple.. all you need is to populate BigQuery with the datasets of interest, and Sheets can pull it in via a connector that’s configureable in a GUI. I sincerely hope my org never migrates to MS for this reason alone.


Came here to say the same thing.

Another good reason to not migrate to MS is that they are a security disaster.


Google's Data Studio is pretty good as well—shame about the recent rename to "Looker Studio."


> I keep explaining all we have to do is build a data mart built using SQL Server, and use our ETL tools to keep it populated with the data they need to analyze.

I can relate and have experienced this. The result was that they would end up exporting from a series of Tableau/Looker workspaces, and importing into excel and doing their analysis there.

Then continually making enhancement requests on the Tableau/Looker workspaces to get any additional data they need for their excel work.

Then of course, the excel document and its multitude of versions would get emailed around until it hit the attachment size limit, or if there were too many version of it to figure out which one was "actionable".

We finally figured out what they were doing when the enhancement requests for the reporting workspaces got too bizarre.

Insanity.


How do you lock down Excel - that is the issue we have where I work and the main reason excel is strongly discouraged.

There are two main issues my org has battled with:

1.People basically turn excel into their own mini databases which causes a support nightmare.

2. This is strongly related to 1) Data provenance is a nightmare because so many people were passing around excel spreadsheets it became very difficult to track down the source of the figures people were using. User A passed their spreadsheet to user B who modified it passed it to User C who further modified it, passed it to User d from there some figures ended up in front of a manager and now we have a problem...

It's not so much a problem with Power users those are the types of people using python (or R) it's the tier of users below that level that cause the damage with excel.


3rd_quarter_draft_final_krf_v3_edits.xslx


We have all been there.


Are you using Power BI? If it is easy to find and use ground truth data I find that people will start using that. Encourage sharing reports, but make it "official" by sharing them through Power BI.


You're probably asking for something impossible. It's common for the raw data to not fit in excel. You have to do some sort of aggregation or filtering, which means having that app where you aggregate and filter.

Almost all these tools support exporting to Excel or CSV, so they can do whatever they want in Excel after that.


In multiple orgs now I've had raw access to data warehouses (DB2 and SQL Server based) that had aggregates of petabytes of data from ETL that I was free to access via Excel PowerQuery, direct SQL tools, Python or R, or even PowerBI.


If you're getting aggregates, it's not the raw data by definition. Someone set up aggregation for you to use, much like I described.


I didn't mean aggregate as in summarized data, but aggregate as in a common repository of multiple different data sources as a single massive data warehouse. If I needed to group or summarize the data, I had to do it in my own queries.


My point wasn't that you should never give people access to raw data, but that the raw data won't fit in excel rows. You have to do aggregation or filtering somewhere to get the data within the row limit.


> I keep explaining all we have to do is build a data mart built using SQL Server, and use our ETL tools to keep it populated with the data they need to analyze.

I mean, that's like >90% of building a CRUD web app. If you truly only need the R in CRUD, your approach sounds reasonable. Otherwise, just go the extra distance and build a CRUD app with csv export / direct SQL access.


I’m physically uncomfortable with the idea of a CRUD app for a data mart.


Why?


Because then you can’t trust it. The only changes to a data mart should be replicated from authoritative upstream systems.


That is quite a generalization, not every data warehouse / mart / workflow is architected in the exact same way.


I honestly can’t imagine the value of a user-editable data mart. Maybe you can educate me?


Hi I work on Excel, can you write a summary of why that's important to you and send it to bgruenbaum@microsoft.com ? I'll get it in front of a PM.

Alternatively PMs read all the feedback sent via the feedback UI on Excel web and desktop so you can use that instead if you prefer.


We use Excel for a bunch of "critical" tasks as defined by the financial regulator, which means any of their dependencies also have to be assessed to ensure they meet the required level of stability.

What this means is we'd need to go from just requiring a laptop and a copy of the spreadsheet to requiring redundant internet connections and validating Microsoft's backend meet the regulator's requirements. I'm pretty sure that would unfortunately rule out using it.


Having seen the quality of “critical” macros, excel spreadsheets, and “dev/prod” SMB drives anyone can mess with it’s absolutely clear no regulator gives a fuck.


Or their concern and that of an engineer don't completely agree, or the reasons they want concerns addressed are different. Both happens a lot.


I think it's similar to the current generative A.I. trend.

The companies I work for while consulting (mainly financial and health organizations) have been very cautious about any company or PII data going to these A.I. programs, and the cloud in general, and are looking to make their own local run generative A.I. LLMs for their employees to use.

By it's very nature these Excel documents will be filled with company data, so I can imagine these companies would also be a bit cautious sending those calculations to the cloud, especially if they can't control the flow of data from their I.T. departments via web applications.

But I'm just a software engineer at the end of the day, and don't professionally use Excel all that much right now (if anything I use it much more for my hobby of board game design). So I can't speak with too much authority on the matter. Just my observations.


> By it's very nature these Excel documents will be filled with company data, so I can imagine these companies would also be a bit cautious sending those calculations to the cloud, especially if they can't control the flow of data from their I.T. departments via web applications.

This is exactly what I see when talking to users about xlwings: it needs to run on their end, behind their firewall, in exactly the cloud they have picked (AWS, GCP, Azure).


I do a lot of work with US state governments, it's the same story mostly. It's possible to get the approvals required for data to be shipped off to some "cloud" but it has to be FedRAMP certified and no data can move outside the contiguous 48 states.


It won't work offline, for one. I use Office exclusively without internet as otherwise who knows what private stuff it's sending back to Microsoft, which then becomes eligible for warrantless surveillance by the federal police.

I know it's old fashioned to want local software to only change things on my local file system, but if I didn't want that paradigm, I'd just use Google Docs.


Yes, this is important where you're online but not on a high throughput connection or any part of the connection has intermittent faults.

The disruption from even a few outages is enough to prefer to run things locally. We had issues with people unable to sign into Excel and everyone went mad (rightly so, as Excel was running locally not 365 but it blocked usage)

And on top of that, there are scenarios where you'd want to process substantial amounts of data that would be prohibitive to upload, process and return the results, where locally handling it would take way less time.


I eventually gave up on it because of the locals vs cloud differences. The documentation for the whole thing was terrible as well. I think they’re in transition (assuming cloud only). In my case, I just went and used Calc from LibreOffice.


The Cloud, and therefore the integration of as much as possible in it, is core Microsoft strategy.

It's about the recurring payments, the lock-in, and ultimately literally having the knowledge and business processes of the world economy on their own computers. If this is true, the reason why it is important to run data and scripts locally should be clear, and so is the motivation of your PMs to actively combat any such things.

But if you get a summary in your inbox and show it to your PMs let us know how that goes down.


Yeah, but they also make money on the office subscriptions. So you also want current working data scientists who would not touch excel to give it a chance because they can use their existing python back-end. Eventually, there would be a service that is Microsoft exclusive, but they have to compete on services.

Idk, it's really up to Microsoft's business units. Office could be the loss leader to get cloud business, but it could be the other way too. Either way, I would hazard that they want people to actually use this integration feature.


It all sounds a little too "Software services as a software service."

Python -> Excel sounds great.

Cloud Python -> Cloud Excel sounds too much like a privacy and security disaster waiting to happen.


Is there any technical reason excel couldn’t ship with Python? The license allows it [1] and it would probably solve the issue of all users being able to open any document like the cloud solution, but still run locally.

[1] https://wiki.python.org/moin/PythonSoftwareFoundationLicense...


Ever used Excel inside a SCIF? There's a lot of important work that happens on computers that aren't connected to the Internet.


Can you just please fix the issue with date strings (such as "20230822090811") being converted into scientific notation? I've received literally hundreds of these documents over the past decade. And as long as this problem exists, I will keep receiving them.

Can you please just add a button - "unmess this document"? The number of man-hours I've wasted re-creating documents is countless. My job is to do other things not fix excel documents.


Excel makes a total mess of many kinds of imports, assuming they are dates when they are not or assuming US vs UK formats dates etc. https://www.theverge.com/2020/8/6/21355674/human-genes-renam...

However, to my eyes, and probably most people, 20230822090811 does not look like a "date string", it's a number. It's correct to display it as scientific notation if it's too many digits to fit into the width of a cell. Please let's not encourage them to interpret more things as dates and randomly break people's workflows.

Excel should preserve all the digits you enter in the cell even if it doesn't display them, though. That has nothing to do with dates. It's a problem with things like product numbers or keys that can end up with many digits.


Are you saying that excel does preserve digits? I must be doing something very wrong (entirely possible), cause once the document is saved, the digits are gone.


That reads as a normative "should" to me. It doesn't preserve digits, but it ought to.


Yes, sorry, that's what I meant. Excel doesn't preserve the digits but it should.


Optimist: The glass is half full.

Pessimist: The glass is half empty.

Excel: The glass is January 2nd.


The way you fix this is using the text import tool or better yet power query and specifying the column type to be text. With power query you can repeat this for as many files as you need to import.


If he's recieving excel files with this issue from other people, the text import tool isn’t a solution and Power Query is a bit circuitous to solve what can be addressed by formatting the range with this kind of data.


You may want to try a CSV editor: https://www.moderncsv.com - it's helped me already.


+1. I just discovered this, and it's happily eliminated at least half my use case for Excel, and does a better job at it too. That is, visualize CSV files without messing with the data.


The scientific notation ("2.02308E+13") is just how Excel displays a large enough numeric value like your example; it's not doing any conversion, as you can see if you paste that number into Excel, confirm that it's showing the scientific notation, and then click into the cell to confirm that the formula bar still shows the original number (20230822090811).

If you paste your date string in a more standard date format like 2023-08-220 9:08:11, then Excel will correctly infer that it is a date. However, you seem to be saying that Excel should handle ANY numerical strings that way if they are long enough to express a date or datetime, which, while it might make your particular workflow easier (or it might not), it would really mess with people's ability to use Excel with large numerical values...


Excel only maintains 15 digits of precision for numbers. Everything else gets truncated. This truncation is done at import so it is unrecoverable.

This burned me frequently when I was generating tax audit reports for accounting. The billing system IDs were all BIGINTs but had over 20 digits.


Do you _really_ think I've spent the past 10 years getting files like this and I still don't understand the reason. I didn't ask for an explanation. I asked for a fix. What are you doing here??

I can provide instructions on how to do this until the cows come home. I will still get files with these issues. Is that not obvious?

Do you really think I want to examine _tens of thousands of rows_ in documents that are 100 cells wide looking for a single exponent value!?!?!?! Is that really what you're suggesting as a fix here?


My my, why so angry? You're expecting something that Excel doesn't support for good reason. The string you used as an example might be a date string to YOU, but to Excel and many of its users it's just a large integer. Lots of regular use cases would break if Excel would interpret that string as a datwle

If you want Excel to interpret strings as dates, use a more common date representation such as iso8601


It’s one of the most infuriating and dangerous behaviors I have ever encountered. Excel silently truncates data. The anger is warranted. The problem isn’t assuming a display format for data, it is that a poor assumption actually mangles data. Excel doesn’t fail safe.

Numbers in Excel only maintain 15 digits of precision. Leading zeros are also truncated. If you haven’t experienced this you are lucky (or just didn’t notice) but it is real.

> If you want Excel to interpret strings as dates, use a more common date representation such as iso8601

You don’t always control the upstream system. Your response can be considered rude because it implies you think you know more than the person expressing pain. This is especially insulting with such a widely known problem in Excel.

This behavior burned me with billing system IDs. Logically they are strings, we would never do math on them. For performance they were implemented as BIGINTs in the DB. Excel truncates everything after the 15th digit. Our IDs were all over 20. So accounting couldn’t reconcile our reports with the billing system.


> It’s one of the most infuriating and dangerous behaviors I have ever encountered. Excel silently truncates data.

That's a problem, sure, but not the one sixothree is complaining about.

> The problem isn’t assuming a display format for data

Your problem might not be, but that’s exactly the problem sixothree is complaining about that the post you are responding to addresses, so while your post references a valid complaint, it is a complete non-sequitur as a response.


My my, why so angry?

Because they have been dealing with the same bug for 10 years, I thought that was obvious.


> Do you _really_ think I've spent the past 10 years getting files like this and I still don't understand the reason. I didn't ask for an explanation. I asked for a fix.

I don’t believe you’ve been doing it that long and haven't found the fix.

Apply appropriate formatting if you have long strings of numeric digits [as long as they can’t have leading zeroes] that you want stored as numbers but displayed without scientific notation. (Not the best choice for what is semantically a date, in your case, but...that’s a whole bigger issue and, well, babysteps.) You don't have to look at anything, just do it for the ranges where that is the kind of data:

https://help.godatafeed.com/hc/en-us/articles/360049916591-H...


The text import wizard does work. The problem with it is that everyone has to use it and if anyone doesn’t then your data is silently mangled.

In the past I have added an apostrophe to the front of any integer that should be displayed as a string. Excel will not display that leading single apostrophe. It of course taints that data forever but I consider “Exported to Excel” a terminal state anyway.


I cannot remember the last time I had some long number and thought to myself, "You know what will be helpful, convert this number into scientific notation!"...

It's asinine Excel has this behavior with any large number. Try to work with a list of EAN/UPC codes... it'll wreck every single one every single time, unless you take significant care to guard against it via formatting, special characters that trick it into using a string, etc. Try working with a spreadsheet full of database id's... same thing. It's asinine.

By the very nature of Excel, large numbers belong as-is... ie. you're studying something and need the values. Approximations (what scientific notation gives you in the best case) are an exception to what people need nearly every time.


> I cannot remember the last time I had some long number and thought to myself, "You know what will be helpful, convert this number into scientific notation!"...

Okay, so set formats appropriate to what is useful for your data.

Defaults don't cover all cases, otherwise there would be no non-default options.

> It's asinine Excel has this behavior with any large number. Try to work with a list of EAN/UPC codes...

Codes aren’t numbers they are strings, even if the code is a sequence of digits. They should be formatted as text. For actual numbers that shouldn't be displayed in scientific missions, the appropriate numeric format does that.

> it'll wreck every single one every single time

If its any code that can be safely stored as a number, it won't wreck it. It’ll just look bad until its fixed. If it, say, has significant leading zerores, yes, it’ll wreck it, because excel defaults to dealing with numbers, not text, and there is a difference between text made up of digits and actual numbers.

> unless you take significant care to guard against it via formatting, special characters that trick it into using a string, etc

Knowing the shape of data you are entering and selecting a column and applying an appropriate format (including “text” as an option) before you enter any data (usually, to a whole column) isn’t that arduous.


The defaults should be sane - that's the point. What percentage of Excel users actually want data automatically converted into Scientific Notation? Close to zero I'd wager.

Your offered work-arounds only apply in a very narrow workflow. I work with Excel nearly every day, all kinds of files (CSV, TXT, XLS/X, etc). There's so many ways this issue can bite you unexpectedly - causing you to save and permanently destroy data.

If I type/paste/import/download/whatever `01234567890987654321` into a cell, 100% of the time I do not want `1.23+E18`. I don't want to even see that, I want to see the original value I entered.

It's asinine.

Excel may be the most used Office Suite app, but it's by far the most painful to use. The Excel Team has a long history of justifying downright perplexing behavior. Remember when Excel was the only Office app that didn't support Snap-To back in the Win Vista/7 days? Joel had a long-winded explanation that boiled down to "just cause"... it was perplexing even at the time. It really makes one wonder how bad the rat nest of a codebase Excel is.

"There's ways around our astonishing default behavior so just deal with it". That's not a great answer.


Except Excel doesn't recognize ISO timestamps either.


Are you sure? It seems to for me, which is annoying because by recognizing them it changes them to a localized date format which then breaks when sent overseas.

If I type "2023-08-22" into a cell in Excel I see "8/22/23" in the cell and "8/22/2023" as the value.


Definitely broken for me on Excel version 2302 (build 16130.20714) on an En/US locale.

None of the ISO8601 formats I've tried are recognized. "2023-08-22", "2023-08-22T01:38:22Z", "20230822T013822Z" are all valid ISO8601, but Excel treats them as unformatted text. Inputs with slashes are instantly recognized as a date.

If it works for you, it may be a locale thing. But getting away from that is the whole point of ISO8601.


I am En/US but on MacOS.


All the crusty broken shit in Excel have become features over the last thirty years.


Not OP but I would never want to use this if I was running a security minded business. Why would it be okay to send out my entire dataset to the Microsoft cloud to run a few simple queries. How long is the data stored on your servers? Since it’s operating on the data directly, it’s not possible to anonymize or redact the contents of the data.


Practically speaking, if you are running recent versions of Windows (I don't remember when exactly this trend started, with 2000? XP?), you have basically no control over what data you send to them and when. You can try to block traffic but it will interfere with normal Windows operations. You can try to investigate and play cat and mouse game with Microsoft, but they will always be one step ahead of you - unless you decide to turn off automatic updates and make your system less secure.


That is some tinfoil concoction, not reality.

There’s a world of difference between having a file stored locally in Windows and one purposefully uploaded to Office 365 for data analysis.


What if I’m running windows in a VM in Ubuntu and only feeding it exactly the relevant stuff?


It makes little difference if it's a VM or bare metal - what matters is network connectivity. If it's on, you lose control over the data leaving your computer. In Linux, BSD and others, you can control it in a very precise way.


You still have no control over the data you place into the Windows VM, which would include the data in the Excel sheet.


does he have control over whether he transmits any data he places in the mysterious windows VM?


I’d wager that most of that data is anonymized and redacted usage and error stats. Big difference between that and entire data sets


I can put this in an email if you want, but one good reason would be that my laptop is probably faster than the cloud compute that they are willing to give away to 365 subscribers. I can do much more inefficient things in a single cell if I know my laptop can pick up the slack :)

(Although, I'm guessing the reason they did it this way is so they could keep the execution model the same for Excel on the web? It's always running on the external service vs sometimes running locally and sometimes running externally.)


Part of the reason I still like Excel over Google Sheets is that I can use it completely offline.


ITAR and other regulatory issues - being the primary. Not all users are on Office 365 GCC High.


How about an actual API that lets you drive Excel from Python rather than the other way around?

With that API, it won't matter whether you want Python, Ruby, Lua, whatever, for your scripting language.

This is nice, but it welds a specific version of Python inside Excel and makes it a gigantic PITA to use your normal development tools on your Python code.


Completely agree. MS should just do Google Apps Script but better.


>> Hi I work on Excel, can you write a summary of why that's important to you and send it to

How out of touch with the real world is the Excel team?


Customer verbatims are like, an incredibly valuable currency. It's all fine to say "I know the product in and out and I think this is important". But it's another thing entirely to say "I know this is important because _this customer right here asked for literally this_".


But how do they not realize that not everyone wants to have the cloud required for a feature in Excel?


"It is difficult to get a man to implement something when his stock options depend on it remaining unimplemented." -- how a modern-day Sinclair Lewis might answer


Wrong Sinclair: the quote was Upton Sinclair (who, according to Wikipedia, is “not to be confused with his contemporary, Sinclair Lewis”).


You have an opportunity to expand their perspective and this is the best response you can come up with?


A hacker news comment chain isn’t a great place to follow up on feedback. Email’s alright! I don’t think that’s too out of touch.


Will Excel fix linkages between files on sharepoint?

It is horrible..

The amount of lost productivity because "report file" cannot easily connect to "source 1, 2, 3" is very high.

In theory you can do it, but it plain does not work in shared sheets.


I've possibly seen some of your breakage with Sharepoint, while supporting an integration product between Excel and some enterprise systems.

Does the problem persist if you ensure the other source files are open before opening "report file"?

If opening the files in the right order solves it, then it may be possible to script this in Excel or in an Excel add-in (optionally, signed).


There are few key reasons for running in a hypervisor isolated container on Azure with no access to the Internet:

- We can guarantee a consistent experience for all users. Imagine having to maintain your own local distribution of Python and guaranteeing that it works with Excel as versions diverge over time? Yikes.

- We make it possible to share your Excel workbook with other users and have the calculation just work. That wouldn't work with random local installs of Python and users would be super frustrated by this.

- Security. Imagine opening an Excel workbook that can execute Python code running locally as you.

Also, I totally agree with your second point. Trying to write an internal app in Python that integrates with all of your existing IT infrastructure is an exercise in frustration at best. Excel is already part of the IT infrastructure virtually everywhere and is a programmable reactive canvas.

Disclosure: I work on the design team for the feature.


I agree with point 3, but points one and two don't make any sense because games have been skipping with Python scripting for decades and this is never a problem. The python interpreter is embedded into your program as a library and it doesn't have any dependency on the whatever python version the user installed.


It's the dependency hell problem as packages are added to the sandbox environment. If they were just shipping python (standard library), it would be fine, but all of the useful libraries have large numbers of dependencies.


doesn't this dependency hell still exist, but just in the cloud?


It's not hell because they carefully curate the set of libraries rather than having end-users attempt to have every combination of packages (and then blame microsoft for the dependency hell / lack of support for their exact configuration.).


You say this as a good thing... but as a data scientist that uses python every day I can assure you their "carefully curated set of third party libraries" definitely will not include most of the libraries I rely upon for my work.


> games have been skipping with Python scripting for decades

That only works if there is a forever fixed version of Python embedded in your game. The value of Python in this context is its ecosystem and folks will need to install additional packages and libraries into the execution environment. Now you're managing a local distribution of Python.


Yeah, but if the big value is supposed to be in the libraries you sort of have to accept dependency hell? Otherwise you're in the world where a small set of libraries can deliver all the functionality ever needed.

Of the currently available solutions, "cloud environment made by Microsoft" doesn't sound that different from "software update made by Microsoft" in terms of how it solves the dependency hell: basically, the critical library you need is either in or out, but there is nothing you can do about it.

Admittedly, here I'm assuming Microsoft won't let you install things into the environment yourself since that basically means hosting virtual machines running arbitrary code and putting the Excel brand in front.


Dependency hell may be an issue if you install arbitrary libraries into the container at runtime. Note that this feature isn't currently available in the preview.

On versioning: we freeze the container image that your Workbook was authored against. You need to manually accept (and validate things continue to work) updates to that container image as we roll forward.


I don't really understand what stops this from happening locally?


Could you not achieve the same by shipping the required tools with Excel and running them inside a Windows Sandbox? Isolation isn't as good, but it probably protects users against the threats you/they care about?


Another half-baked Microsoft cloud thing. Get wrapped around that axle? No thanks.


I think this could be just a toggle in settings:

A. run python in the cloud (default)

B. specify path to your local python interpreter

When you open a workbook that is using B show a warning before running anything. Then behind the scenes spin up a new venv that installs everything from requirements.txt and executes the code in workbook locally.


C. run code in same environment in a local sandbox.

Linux can do this with containers (pick your implementation), I'm certain Windows has similar mechanics.


D. Just ship Python with Excel


It running in Microsoft Cloud is precisely what I think will actually allow this to be useful functionality for my company in ways that macros aren't, because macros are seen here to present a vector for running malware on client machines, whose utility doesn't justify the risk.


Just wait for someone to run malware on your mystery cloud instance of Excel. Have you seen the cross-instance and cross-domain security holes Microsoft has been running into? The latest and greatest was on their Power Platform ("Excel-like"). Nothing like silently shipping your data off to a remote location for inspection by blackhats. Microsoft cloud security is torn apart and laying all over the floor right now...nevermind all the half baked features that let you get kneck deep in a project before you see the grande middle finger from project hellscape.


It sort of makes sense to limit it to the cloud, because making Python an integral part of the Excel file format would reduce its portability and introduce all sorts of compatibility headaches going forward.


If you've ever dealt with AWS Lambda you will know what a versioning and upgrading shit show this is going to be wherever you stick it, be that client or cloud.

I worked on a very early python integration used as a workflow DSL back in 1998 and they had to dump it and write their own DSL in the end.


Whether it runs in the cloud or locally, the included python source will be written for a specific runtime version and environment. If excel were to run python locally, it would presumably bundle an interpreter (+ any needed pip packages) rather than relying on whatever was installed locally.

The cloud execution model does still have an advantage in that it can retain old runtimes in perpetuity for backwards compatibility, though


Making software available for download perpetually is less complex than making software available for cloud execution perpetually.


Under the model you're proposing, would I potentially need to download an old copy of Excel because the sheet I'm trying to load relies on an older python runtime?


Computers are computers. Potentially new Excel would be incompatible with old cloud Python environments. Potentially new Excel would be compatible with old local Python environments.


Isn't that already an issue? I've definitely needed specific versions of excel to open particular workbooks before.


Excel already has versioned file formats with significantly different affordances, and hasn't appeared to have all sorts of compatibility headaches as a result.


Integrating a whole separate language ecosystem is one or two orders of magnitude more complex though. It’s more like HTML5/WHATWG suddenly adding Python as a first-class scripting language besides JavaScript, and web browsers having to integrate that.


Isn't this what people are trying to do with pyscript? Although in that case it uses WASM?


It’s different because WASM is much more strictly defined and has a more limited scope than Python. Thanks to WASM, browsers don’t have be able to understand Python/PyScript. The trade-off is that you can’t, for example, view/edit the Python code in the browser’s devtools.

A WASM-like approach wouldn’t work very well for Excel, because you’re supposed to be able to edit the source code/formulas within the Excel application.


> It sort of makes sense to limit it to the cloud, because making Python an integral part of the Excel file format

Moving it from the cloud to local doesn't make it an integral part of the Excel file format. Whatever python service is running in the cloud could just run locally.


As a red teamer, I also wish it supported running python locally :)


Oh the cloud part is going to be a huge deal breaker in so many industries. This will automatically be blocked by default at my work place for certain.


> I wish it wasn't solely powered only by Microsoft Cloud and could support python running locally,

There are existing solutions for that, but it would be more complicated for MS to support and secure.


Adding python support for local use would be a great way for me never to open an excel file ever again. That is so dangerous.

Wasm maybe, full-blown runtime without any sandboxing? Nope.


What's the difference between Python and VBA in that respect? Do you open Excel files now? Pretty sure that wild Office macros were once one of the main sources for computer herpes.


There is no difference except decades making vba in excel less horrible.


> I wish it wasn't solely powered only by Microsoft Cloud and could support python running locally

Have a look at xlwings, which runs on any cloud and also locally.


> This alone could eliminate the need for websites that just want this sort of data.

But Microsoft Cloud is a website!


Is the idea that you will not be able to use these scripts without a subscription?


Being powered in the Cloud they can protect you from malicious code


I too would love this, but I can tel you, even in a sandbox python would be very easy to abuse from a security perspective. Consider how much havoc on humanity (no exaggeration!) VBA Macros have caused.


I have to see some red flags here. The very nature and power of Excel comes from the fact that workbooks can stand alone. This means that the process is eternally tied to the cloud now. So when you, 3 years down the line, have to open a workbook from your deceased colleague, you are at the whim of the vendor supporting the existence of this integration.

Step one of any workload will always be "work out how to make the software work again" and that's terrible.

I know this because I have been in that situation several times in the last couple of decades with random Excel, outlook and word add-ins. And those don't even do the processing remotely!


> This means that the process is eternally tied to the cloud now.

"You will own nothing, and you will be happy" has followed us to... spreadsheets? We can't even own the things we're wageslaving away at our desk for.


> We can't even own the things we're wageslaving away at our desk for.

But we never owned those things we're wageslaving away at our desk for, that's why its called wageslaving. Whether our employers own them or rent them seems pretty immaterial.


Yeah I didn't phrase it well. My point was even our work is becoming intangible now "for the greater good." Just funny to see the progression of things.


the 'cool' thing

is that not even the purported 'business owner' owns their own data, what do they actually own? what is even their business?

on the modern trend, all businesses really are: is just a collection of subscriptions to service providers like MS.... where's the actual business???


I think this is easier to manage to be honest. They are using Anaconda distributions so if you tie each workbook to a distribution with possibility to update/rollback it makes things a lot easier than trying to manage a local installation.


Try explaining that to the dude who inherited the workbook in 9 months who doesn't know what Anaconda is, barely knows what Python is and was told that his job was Excel which he's really good at because he did a Udemy course on it 3 years ago.

That's the problem.


I'm not sure why that is a problem. He didn't have to know anything about Anaconda. And for stability, you can still install 10-year old python 2.7 packages from Anaconda.

9 months is no problem at all for stability.


I would assume that workbook would be tied to a version that it was created in and the if the user does not want to update they can just continue using it with the older Anaconda distribution. They don't really need to know what Anaconda is in this scenario. It only becomes necessary if you want to update a certain workbook to a newer version.


> This means that the process is eternally tied to the cloud now.

Not only the cloud, also the subscription. If you cancel the subscription (note that the announcement mentioned it might be a paid module), don't have the subscription (because you got the Excel from a supplier), etc. you already have a problem using this Excel sheet.


I haven't used a non-cloud spreadsheet in probably 7 years. I also wouldn't use excel for anything that has a combination of high importance and long lifetime though. It's mostly for some brainstorming or thrown in a presentation as a screenshot.


> I haven't used a non-cloud spreadsheet in probably 7 years.

Last time I use a cloud spreasheet was a month ago, to download the content into Excel

> I also wouldn't use excel for anything that has a combination of high importance and long lifetime though. It's mostly for some brainstorming or thrown in a presentation as a screenshot.

It's funny but that's exactly how I see online spreadsheet: for unimportant things that may not work at any point in the future if the account is closed. However, the collaborative features make them nice for brainstorming.


If I had to download content locally I would tell them to resend a csv. I hope you are doing this on a VM


There is probably a trillion dollars flowing through Excel spreadsheets.


Likely a gross underestimate, given all the financial (both investment and risk) modeling done with excel.


I was trying to be conservative.


This will be true whether it uses the cloud or not. Excel worksheets can already have external links (both on the Web and to other files). If those web resources or other files aren't available, you're out of luck! And the older ones may only open on software that no longer runs on modern machines.

This is why it's important to restore your backups once in a while, or in this case, validate your critical files.


External links in Excel are copied and cached though.


Yeah I’ve received so many broken Excel documents and it’s just constant frustration.

Half the things don’t work, customer is sure it worked in X ( now broken and invisible way ) and they really don’t remember that correctly.


It's an exacerbation of an existing problem - who has not been sent an Excel file and opened it to discover some Sharepoint (or whatever) link is missing.


The new competition to this market have gained market share by making workbooks not stand alone, it's a whole growing product segment. So they either have to evolve or die.

It's not a red flag, it's evolution


I was so excited about Python in Excel, until it said "run in the cloud." There just doesn't seem to be any reason for this except to tie you into their cloud service. It feels so much like protecting me from unsafe printer ink.

I'm generally a fan of MS Office products and try to give them the benefit of the doubt.

Please help me think of a plausibly good reason for requiring Python to run in the cloud, that is not just about lock-in.


These are red flags. M$ is forcing people into their cloud. Their hijacking of your local file system with onedrive is a great example that many people already ran into.

I think we are about 5 years overdue from bailing from M$. They need to be avoided. Every product from Windows to Office already has anti-consumer features.


Not asking customers to manage a local python installation is one thing, although I totally understand how excel + python power users would likely be comfortable bringing and maintaining their own python runtime.


You can just ship a copy of Python with Excel, no need to ask anybody to maintain their own installation.


This. Why you would leverage system python is beyond me.


Anaconda distributions aren't exactly small. I'd also assume there's some sort of environment isolation, so you're getting multiple copies. Maybe one per workbook?


The embeddable package from python.org [1] is 20 MB unpacked. To install different sets of packages per workbook, just have Excel automatically set the module path as appropriate.

[1] https://www.python.org/downloads/release/python-3114/


No need to use Anaconda or any general purpose venv solution. Microsoft screwed this up.


Agreed. They could have easily shipped it with something like MicroPython.


Fun fact: Microsoft shipped Python code in 1996. The could do it again if they really wanted to.

http://python-history.blogspot.com/2009/01/microsoft-ships-p...


Does Excel run in a browser today?

There are a lot, I mean a lot apps that support scripting languages and don't need you to manage the language's runtime yourself. Blender's UI is entirely in Python. Civilization games have most game logic in Lua.


Inkscape (vector graphics) also has a built-in Python install, same goes for Cura (3D printing), Fusion 360 (3D CAD), and I'm sure many more


Probably because most users of excel don't have any experience installing and running Python which is even a pain in the ass for people that know Python lol.

You also have people using Excel on iOS, and Android devices. This allows them to use and read Python in their spreadsheets.


>Probably because most users of excel don't have any experience installing and running Python which is even a pain in the ass for people that know Python lol.

An embedded Python is about 10 MiB. I just checked. Excel could easily have its own separate copy.


But don't you need to install lot of packages for it to be useful?


Based on the announcement it looks like the standard data-science libraries (numpy, pandas, matplotlib, seaborne) will be available. The covers a lot of use cases. However I would really appreciate the ability to set the python interpreter to whatever I want. Then I could access the niche libraries I need (shout-out to JSBSim for open source flight dynamic modelling). Or even better, hooking up my organizations custom libraries we built for our own specific data analysis applications. Huge value add. Shame this likely won't be possible. Unless I've misunderstood some part of this.


there's always Grist. its not a clone of excel but its a spreadsheet/hybrid database so it does some similar things, but most importantly let's you use python for formulas and also has a self hosted option if you want to run it locally


Back in 2005, some friends and I started a company called Resolver Systems to build a Python-enabled spreadsheet, which we called Resolver One. Sadly, it never took off in the marketplace, and we had to pivot; we would up creating PythonAnywhere, an online coding and hosting environment, which worked out pretty well and was acquired by Anaconda last year. And now, the circle is closed :-)

(Just for clarity: the team working on this inside Anaconda is entirely separate from the PythonAnywhere team. It would have made a perfect Hollywood-ready story if it had been the same people...)


It was a great product - I remember your pitch and indeed we met a few times at the London python for finance meetup (remember the "Enthought Python Distribution"?). The only issue is that Resolver was IronPython-based so there were a bunch of libs that kinda didn't work.


I have to second that, it seemed to be a great tool. I've often wondered why it couldn't get traction¹. I can recall people gushing about it at what may have been the same Python meetup in London, and also people being impressed enough by it at a Haskell meetup that they were lamenting an imagined lost opportunity to shove a "real language" in to a spreadsheet ;)

I'd always kind of assumed that the target audience would have appreciated the IronPython use, as the .net ecosystem would likely have been more valuable to them. Having just looked I see that numpy wasn't available on IronPython until 2010, and I'm sure that would have been useful to have a little earlier.

¹ Pretty sure I've referenced it here a few times too.


Right, the lack of support for C extensions in IronPython was a problem; we kicked off the IronClad project [1] to fix that -- looks like someone else picked it up later and updated it for Python 2.7 [2]. Not sure where it went after that, or whether MS fixed the underlying issue later on.

[1] https://code.google.com/archive/p/ironclad/ [2] https://github.com/IronLanguages/ironclad


I built my first app on PythonAnywhere! I've had stuff running on there for almost a decade now. I loved being able to make small tweaks to the app directly in the PythonAnywhere web app. Such a great product. Really missed stuff like that when I started using GCP/AWS.


That's awesome, glad you like it :-D


PythonAnywhere is awesome :) like the other commenter it was my first foray into web programming too!


Resolver Systems came to my mind right away when I heard this news.


> Python calculations run in the Microsoft Cloud

A colleague was joking, "I guess even Microsoft itself isn't able to manage local Python installations on Windows…"

But imagine if Microsoft actually managed to finally solve the "setup a local Python environment" situation once and for all and Excel became the standard Python package manager across operating systems!

    $ excelpip install fastapi==9.11.23
    Searching for fastapi version 11.09.2023…


> A colleague was joking, "I guess even Microsoft itself isn't able to manage local Python installations on Windows…"

I've been waiting for Microsoft to solve package management on Windows since I played around with debian's apt 20 years. That's not snark, it seemed like Microsoft's product complexity and dedication to backwards compatibility of software would make them the ones to move software management forward. Really dig into the problem space and reinvent the package manager beyond anything that's come out of the open source space.

Instead we've got windows update, windows features, microsoft store, software center and winget (you know, that Microsoft-built package manager for Windows you install from the other Microsoft-built package manager for Windows).


> It seemed like Microsoft's product complexity and dedication to backwards compatibility of software would make them the ones to move software management forward.

That's precisely why they _didn't_ go with the package manager model of software distribution.

Package managers in most distros assume that everything you download via apt/dnf is a part of the operating system and integrates with whatever dependencies the OS ships with. It's a great system if you're an operating system maintainer, but terrible if you're a third-party software developer who has no interest in dealing with 5-6 different OS's repo formats and dependency versions. On Windows and macOS, you vendor most everything, provide runtime installers for the rest, and that's usually enough to be resilient to bitrot.

Package formats like Flatpak, Snap and AppImage are the only projects who seem to be taking the problem of software distribution on Linux seriously, and of those only Flatpak seems to have the mature tooling and lack of myopia necessary to pull it off. For the times that I do miss apt on Windows, chocolatey is pretty good at installing software and vcpkg works great for linkable libraries.


The state space doesn't have the conveniences that linux and language tools leverage, that's why I wanted to see them try to advance the field beyond what we have now.


Which conveniences are you talking about? Chocolatey gives me the "automatically updating my programs" experience, and vendoring dependencies with vcpkg is about as turnkey as it gets - add a git submodule to your project, reference the vcpkg toolchain in your CMakeLists.txt file, and you're done.


My mind can't help think of the following problem... would Microsoft have to freeze an LTSC set of packages? How would $GOV_AGENCY using a version of Excel one generation behind collaborate with the world?


I wish Libreoffice would have smelled the coffee long time ago. In principle scripting using Python is possible but exceedingly cumbersome/ugly and there seems to be no roadmap to improve on this.

The same warning applies to the linux desktop and all its apps more generally. Increasingly proprietary platforms will be rolling out advanced "AI" functionality extensions to classic apps that are cloud based, many of them via a Python API.

Its a pity that the various strands of the open source universe are so siloed. The potential of the sum being more than the parts is squandered.


I think in a lot of cases its not even the advanced stuff that keeps people on Excel and other proprietary things. It's the seemingly little things.

My biggest beef with LibreOffice's spreadsheet (and with Apple's Numbers) is how they handle copy/paste of discontiguous selections.

Consider this 5 row by 3 column set of data:

  1 2 3
  4 5 6
  7 8 9
  a b c
  d e f
Suppose you select the 1 2 3 row, the 7 8 9 row, and the a b c row, copy them, and paste just below the d e f row. In Excel you get this (first 4 rows omitted for brevity):

  d e f
  1 2 3
  7 8 9
  a b c
In LibreOffice and Numbers you get this (empty cells denoted by "-"):

  d e f
  1 2 3
  - - -
  7 8 9
  a b c
I haven't found a setting to change this. Excel doesn't have a setting for this either as far as I know, so people who do want discontiguous copy/paste to preserve spacing would probably be as irked by Excel as I am by LibreOffice and Numbers. But I don't think I've ever actually wanted the spacing preserved so it is LibreOffice and Numbers that irk me.


I get the same result in LibreOffice 7.5.5 as your Excel example. Which LibreOffice version did you use?


7.5.5.2 on MacOS.


The only thing I can think of is macOS clipboard manager somehow interfering with the copied data. Was testing on Linux kf5 UI.

Also, I can preserve the empty row, if I paste with Ctrl+Shift+Alt+V (Edit - Paste Special - Paste Unformatted Text). This opens the Text Import dialog.


You've solved it!

I turned off the clipboard history app I use and now LibreOffice pastes like Excel.

The history app I use is 10 years old and I see that the company that made it apparently is no longer around (or at least their domain no longer works) so I guess I should get a newer clipboard history app.


Great to hear :)

Re: a new clipboard app, I often see people recommending CopyQ, which is cross-platform: https://github.com/hluk/CopyQ


To me Excel's behaviour is way more problematic here. I carefully select the data I want, and it silently includes irrelevant data as well. I don't want the spacing preserved, but I really don't want other random data included.


It doesn't include other random data. You select N rows, contiguous or discontiguous, and those are copied to N consecutive rows starting at the paste point.


It includes things that aren't selected, which is other random data. If I select, copy, and paste, the only desirable outcome is that I get a copy of my selection.


I don't understand what you mean. In my example I select the cells that contain 1, 2, 3, 7, 8, 9, a, b, c, then copy, then go somewhere else and paste, and Excel pastes 1, 2, 3, 7, 8, 9, a, b, c. I don't see an other data.


Try pasting into another app, like a text editor.


Umm... Libreoffice has shipped with Python scripting since almost forever.

Not many people care because the most popular use case of any office suite is to interoperate with Microsoft file formats.


To be honest, I don't even fully understand what are the use-cases for integrating python with spreadsheets right now. I use both of them separately for different things but cannot see what it can be good at doing in an integrated environment.


I use spreadsheets for all kinds of bookkeeping, and very often there's some data you need to fetch once when a new entry is created, be it stock prices from a webpage or calendar entries from Outlook. LibreOffice's python scripts are great for those cases, because the tooling is just plain better than VB. Alas, VB is still necessary for formulae.


Isn't it simply a more mature language replacement for VBA scripting?


Check out ScriptForge: https://help.libreoffice.org/latest/en-US/text/sbasic/shared...

APSO script organiser is recommended for running scripts from inside LibreOffice: https://extensions.libreoffice.org/en/extensions/show/apso-a...


that's interesting, i see the opposite.

python is so easy to script that imo excel is nailing its own coffin. with an aging and retiring demographic of people that have never used a programming language, anyone who learns to incorporate python into excel will end up preferring python because its both more flexible and scalable.

i have no reason to use excel except for basic drawing board math, if someone demands a spreadsheet out me i automate its generation in python so i never have to deal with it again


I see spreedsheets more as an UI. An extremely intuitive (and reactive!) way of spatially decomposing computation. Yes, this creates lots of problems (you can not easily validate it etc.) but it is also very enabling. For sure more people are now exposed to programming, but how many are actually comfortable doing even simple stuff directly using code is a question mark.

A jupyter notebook is maybe somewhere in between. But even if other interfaces manage to bridge the gap further, I doubt the spreadsheet metaphor is going to go away in the visible future. Pivot tables and the like are like COBOL and mainframes :-) The corporate world would die a sudden death if you were to eliminate it (another reason to always make sure you can run things locally).


> with an aging and retiring demographic of people that have never used a programming language

This simply isn't true in any way and is completely divorced from modern businesses. Everybody I know who works in an office setting works with Excel. None of them know Python and none of them have touched a programming language. Almost all of them are younger than 30. And if anybody who's 20 starts at one of these companies, they'd better know Excel because that's what the business is built on. Not like that 20 year old is gonna know anything about programming either. 99% of people just aren't programmers and aren't going to learn programming when they just want to futz around with some numbers and the easiest (yet also the most powerful) tool they know for it is a spreadsheet.

If somebody in any business, anywhere, at any point wants to do something with numbers, they're going to use Excel.


Not a chance. There's nothing scary about VBA (except its inflexibility and lack of a lot of modern features), and people aren't learning that either. Of course, the last decade of vaguely threatening to remove it from Office completely (in favor of js and now python?) hasn't helped people think VBA would be a good investment of their time to learn. A required internet connection and having Python only work on Windows doesn't make it look like much better of an investment.

Also, imo, Python and whatever random environments and libraries that you'll have to use with it to get your work done, will be in almost every way worse than just using VBA. And if you use VBA, your work will be effortlessly portable. The only problem I have with it is that I have to come up with ways to get the code out of the embedded IDE into source control, but there are scripts that help.

> i have no reason to use excel except for basic drawing board math, if someone demands a spreadsheet out me i automate its generation in python so i never have to deal with it again

I, on the other hand, do it in VBA, and expose every important calculation within clearly labeled spreadsheets so programming-illiterate people can audit them and feel confident about the output.


This comment seems out of touch with why people use Excel. The fact is, Excel is popular because it manages to be pretty easy to use, while also being extraordinarily powerful and flexible. For most users, they'll ask "why would I program something in Python when Excel has a built in function to handle that?"


> i have no reason to use excel except for basic drawing board math

I think you underestimate the amount of work, globally, done in Excel. If Excel vanished it would be a business/economic extinction level event, It's not going anywhere.


LibreOffice has plenty of other problems to knock out first.

Its so slow.


Disappointing:

> Python in Excel is currently available to users running Beta Channel on Windows. This feature will roll out to Excel for Windows first, starting with build 16.0.16818.2000, and then to the other platforms at a later date. [1]

> Python in Excel is available in Excel for Windows. The feature is not available in Excel for Mac, Excel on the web, Excel for iPad, Excel for iPhone, or Excel for Android. On unsupported platforms, workbooks containing Python can be viewed but Python cells display an error when recalculated. [2]

[1] https://techcommunity.microsoft.com/t5/excel-blog/announcing...

[2] https://support.microsoft.com/en-us/office/introduction-to-p...


》Python calculations run in Microsoft’s Cloud No thanks!


That caught my attention too! I'm curious why they would choose this model.


Probably because Excel runs in a web browser nowadays and they want to avoid creating a separate way of running Python locally for a locally installed copy of Excel.

Maybe they will in the future, but it makes sense to start with the approach that will work in both web and local installs.


Looking at what they're doing with Outlook I'd bet money that they intend to replace the desktop version of Excel with the web version sooner or later.


I agree but expect that Excel will be the very last app of the suite that they make web-only. Excel heads like my wife absolutely hate the web version.


A lot of people in finance are Excel heads and hate the web version. And they have a lot of money to throw around, surely Microsoft cares about them.

I really don't see them killing the desktop version.


They are killing the deskrop version with sharepoint.

Sharepoint and web-shared files make it very difficult to link spreadsheets between each other, even one way around.

In theory it works, in practice it doesnt. (PowerQuery is an inconvenient stopgap to simple links)


I could see the prices going up to Bloomberg-terminal level, though, and everybody normal having to settle for Excel 365.


> Excel heads like my wife absolutely hate the web version.

Have her try Excel 2010 or 2007 running on Wine (whether using WSL or natively)

For people who want their spreadsheet offline, it's wonderful: fast and stable.


This is good news. I did not know this.


That's got to be the death-knell for Excel. It might be better in every conceivable way but going web-only only makes Sheets look more attractive to the people doing the buying. And you can run JS in your Sheets right now.


I dread this day greatly, but it's definitely coming.


In my corporate experience, casual usage of excel is in the cloud. Any large, complicated, or db-connected spreadsheet is done on local installations.

The list of what snacks people are bringing to the potluck is in excel online (or google sheets).

The excel sheet that is pulling information from a database, mashing it up, displaying it in a dozen different pivots... that is going to be local.


I'm positive they've at least toyed with the idea of running a Python interpreter in a WebAssembly module, which should work in both environments _and_ provide some isolation. The telemetry settings, I guess, wouldn't need to be changed to call home about performance/errors.

I can't attest to the performance of such technique for now, though, or guess why they could have deemed it not worth pursuing at the time being.


Easier to debug a backend than clients code, single implementation, faster updates, and performance not limited by device.

At the cost of privacy. Consider msft could sell this to you but they chose to let you borrow it instead.


A lot of people who use excel heavily are not really “programmers” and don’t have strong fundamentals or an appetite to manage their own software stack. This also simplifies development on MS end by only needing to validate a single python version.


Then they should embed the supported version of python in Excel and manage it automatically for them. This can be validated with Excel before release.


This seems like the only solution. Do not copy all of Python’s warts, but offer a curated experience. Microsoft Experience (TM) with fixed APIs, no package management or dealing with ecosystem churn.


It will never work because Python is the ecosystem. Having Python without the ability to tap the huge catalogue of existing software makes the integration worthless. The first thing someone will want to do is use Pandas on a live spreadsheet.


If you read TFA, you'd see that the python env comes with a bunch of the favorites (pandas, matplotlib) baked in.


The use case is someone in Excel who does not want to write VBA. You do not need to offer the entire ecosystem, just a slightly better language to handle more sophisticated modeling needs.


Sandboxing python enough that millions of users could safely pass around excel workbooks is a super hard problem, especially when you allow third party libraries.

Forcing it to run in an isolated environment they control simplifies the problem greatly.


Windows has it's own implementation of containers, and with a docker-like structure, and xlsx being a zip file, excel could simply have a dockerfile like structure "FROM excel-python:2023" "COPY xlsx:scripts/" etc. new versions of excel could keep the last few 'base' containers, or if they are in love with the cloud, the signed, verified base images could be pulled on request.


The latest Windows does, but Excel is used across tons and tons of devices, and you'd want compatibility across them all.


My guess is that it makes it easier to work on large files which is common without downloading the whole file and it makes CoPilot related efforts much easier.

I work on Excel but not this feature and am happy to ask around if people are interested.


The last thing in the universe I'm interested when editing Excel is making CoPilot related efforts any easier.


Because Python, with the required numerical analysis libraries is a pain to setup, and there are a ton of ways for it to get messed up.

Doing everything in the cloud massively simplifies deployment and support.


Microsoft could ship its own Python distro within Excel. The biggest problem is that historically Guido / the Steering Council have avoided being involved in how py libs are distribuited, so there are many ways to do everything. But if you do have an opinion and means to enforce it, mantaining a Python environment can be a quite smooth experience.


A complication here is how large the Python distro this appears to be using is. I don't know if you've ever set up a container with all of anaconda, scikit, statsmodels, pandas, Matplotlib, seaborn, and more but it gets pretty big. A lot of people would complain about the bloat if Excel installed it by default.

Another complication is they claim the container isn't just "a docker container", but for increased security isolation (they don't want a repeat of VBA malware) it's a mini-VM focused to run on top of Hyper-V (the Windows Hypervisor) itself. That's a really complicated install process on the average machine (like installing WSL2) that sometimes involves flipping entire Windows Features on, so also something unlikely to be a smooth experience out of the box for Excel.

It might be neat if they made that an optional install and let users have offline support, but it sounded like they wanted to focus on online and collaborative UX first.


Yeah my day job involves deploying Python containers to K8s so I get where you are coming from. GB sized images are not uncommon. However:

1 - That could be an optional component, behind a "Install Python for Excel " button.

2 - You need to install Python to code in Python anyway, with or without Excel.

3 - Bloat is the norm nowadays and I'm not sure whether users care. A clean Visual Studio install takes 10 GB of disk or something. Office itself takes several GB as well.

4 - Not sure why Docker would be needed. Using Python in Windows is fine nowadays. There are caveats with libs that are very reliant on POSIX (Airflow comes to mind,) but, again, if you control your distro you can limit the libs users can install.


1) I agree, which is I why I did end my comment that I think it would be nice to see an optional install offered for offline usage.

4) I don't think they are using Docker, but they are using containers. (They mention running directly on top of Hyper-V so they might not even be considered by some developers as containers in the Docker sense but closer to "mini" VMs.) The container they are using for security/sandboxing reasons. The Excel team likely still has nightmares from the worst years of VBA macro malware and this Python system seems built specifically to sandbox all the running Python code to an execution environment (container/VM) that is easy to refresh/rebuild/recycle/tear-down. I agree that Python on Windows is usually great and most of the Python ecosystem is great at cross-platform support, but I also understand why Excel would want to sandbox any Python it runs to containers that are isolated from the host machine/user and it can easily reboot.


How large is pretty big? And who said it must be installed by default?

Hyper-V is enabled by default on supported hardware.


> How large is pretty big?

Python data science containers are on the order of GBs.

> And who said it must be installed by default?

I did end my comment with thoughts that I think it would be nice to have as an optional install. Mentioning why it isn't likely a default install is relevant because they want this Python feature to be accessible to everyone and they want people to collaborate and the UX for an optional install isn't what they were looking for and would complicate things. (That's not an argument against also supporting the optional install, but a reasoning for why the optional install is a harder feature to support and doesn't make the "version 1" cut.)


> Python data science containers are on the order of GBs.

Yes. And SSDs are on the order of 100s and 1000s of GBs.

> Mentioning why it isn't likely a default install is relevant because they want this Python feature to be accessible to everyone and they want people to collaborate and the UX for an optional install isn't what they were looking for and would complicate things.

How do you know these things?

It won't be accessible to everyone until it's installable. Collaboration is not cloud exclusive. The UX isn't what they were looking for is circular.


Doesn't Guido work for Microsoft now?


He does


The answer is in the question. They chose this model so that Python calculations would run in the Microsoft cloud, operating on data that resides in the Microsoft cloud. When your code runs on the cloud, they get paid. When your data lives in their cloud, they get paid.


As soons as nonsense like this comes about you can tell straight away that it's not in user's favor. It's probably to tie you into subscription, data mining, selling mined data to ad companies and squeez every single dollar they can out of you


Looking at all the packages available, I'm sure that's why. You make it a much simpler thing to just develop that runtime for a known environment that you control vs a browser/local that requires a lot more engineering effort to make sure it works correctly for everyone.


You could ship that known environment as an optional download.



Who's the CEO of Microsoft again? Satya would love this approach.


Yup, as somebody who desperately wants great python integration in my suite of spreadsheets I'm not even going to consider trying this.


You can try Grist. Spreadsheet with Python formula support, and open source. https://www.getgrist.com/ https://github.com/gristlabs/grist-core/


Wait just a moment. Libreoffice had python for years, but nobody cared and instead complained about needing 100% compatible VBA for "real work". Where are those people now?


Right here. It's also not about 100% compatible VBA, it's about learning an entirely new API in order to write VBA-backed Excel spreadsheets that won't run in Excel. If Libreoffice VBA were compatible with Office VBA, I'd have no reason to always keep an outdated Windows system in the closet somewhere.

The reason we're doing this is to make money, and nobody wants to buy Excel spreadsheets that don't run in Excel.


Not offended that Excel can do more, but potentially still inconvenienced by Libreoffice not having the thing they are familiar with or are forced to work with?


I would happily use Libreoffice if any of my clients did. But nobody does. Compatibility would be nice so I can write stuff in Libreoffice Calc on Linux (Excel has 0 Linux support) and ship it to clients without issues, but this even close to be possible.


If it sucks, people won't use it.

If people don't know about it, people won't use it.

If people are accustomed to something different, people will ask for it.


The problem with Libreoffice was that the python support was purely documented.


I wrote some new docs on this recently:

* https://wiki.documentfoundation.org/Macros/Python_Guide/Call...

By having Python in Basic, it becomes available to cells too, since Basic macros are callable from cells.

Note that you may need to reduce the security level to 'medium' in the drop down Tools->Options->Libre Office->Security->Macro Security. When LibreOffice opens, it will ask you whether you want to enable macros or not.


finding another straw-man?


Many comments are criticizing the fact that this only runs in the cloud. And if Microsoft instead announced this would only run locally, they would instead be deploring the massive remote code execution vulnerabilities that this opens.

For this feature, there's no question that providing it in the cloud is much, much better for security. Microsoft can now execute the Python code in Firecracker VMs which provide much better sandboxing than your local PC could.


that would be a different set of people though. Some people are fine with cloud everything some of the rest of us aren't and like to work local. Especially for something as basic as a spreadsheet or word doc.


My local PC can run code in Firecracker VMs.


xlwings creator and author of the O'Reilly book "Python for Excel" here! First of all, big congrats to the team! I've been in contact with the Excel team on and off over the years and I remember when an Excel project manager once described the task of adding Python to Excel as "turning a fully loaded ship". Well, I am happy that the ship has now turned and is ready to ship into more exciting waters! There are a few question marks I have given my decade long experience with the topic (although everything is still beta, so it will certainly change/improve on its way to GA):

(1) I have hardly seen any company that can do with an off-the-shelf Anaconda distro. Companies usually have an internal Python package that they will need to access. (2) When running Python on the backend, the first question is always "how can we authenticate the user"? Office.js is currently the only platform I know of that allows you to leverage Azure AD identities via SSO or use any other provider (as you have complete freedom to use any JS library/redirect the user to a login form). (3) IT policies: Usually, companies have made their cloud decision: “We're an AWS/GCP/Azure shop, so Python has to run on precisely AWS Lambda/GCP Cloud Run, etc.” Yes, many are on Azure, but even in Azure they may have preference of let's say Azure Container Apps or AKS instead of Azure functions. (4) The other thing that businesses are obsessed about is to securely protect their source code, again, not something that Python in Excel seems to support. (5) And finally, what I see users most excited about in the context of xlwings is being able to run standard User-defined functions (aka "Custom functions) on the server (like the ones we wrote in the good old VBA times or like the new Lambda functions), not sure if that's possible or on the roadmap for the official version.

With the modern xlwings Server, I have taken a different approach: Let users build a 100% standard Python web app using their favorite framework (Django, Flask, FastAPI, ...) including all the standard tools (logging, auth, etc.) while using Excel as the frontend. Users have complete freedom in choosing their tech stack, they can version-control the source code on GitHub, use GitHub actions to run unit tests and deploy the code automatically to their favorite cloud, etc.

So I am probably targeting more professional developers than data scientists, but in my experience, it’s often a professional developer who write the Excel add-in that is then deployed to business users/data scientists.


Finally the Resolver One dream is coming true, just 15 or 16 years late.

https://www.youtube.com/watch?v=u6EV2jiKRfc


Interestingly enough, when Resolver One didn't work out in the marketplace, we pivoted to PythonAnywhere. That took off nicely, and was acquired by Anaconda last year. And now Anaconda is realising those old dreams :-)

BTW for clarity: the team working on this inside Anaconda is entirely separate from the PythonAnywhere team. It would have made a perfect Hollywood-ready story if it had been the same people...


Wow, thank you for replying. I'm so glad it worked out in the end.

I remember being blown away by the ResolverOne demonstration video, and if I recall correctly, some others you posted showing some of the other features and functionality.

The fact VisiData (my current 'spreadsheet' of choice) has acquired a sizeable following in that niche, and now Excel is following suit suggests you might have just been a little too far ahead of the curve.


We were discussing about Resolver One 6 months ago at: https://news.ycombinator.com/item?id=34819682

One mystery we did not resolve (pun intended) was why it did not find its audience. Was it too early? Was data science not established enough as a field? Was it because it was not Excel (i.e. the dominant tool)?


I think it was a combination of those, plus terrible market timing -- we released it, thinking we could sell to finance types, in 2008. In our sales pipeline spreadsheet, the "notes" column started saying things like "customer has gone bankrupt".


This sounds like it provides a way for Excel to hand a bunch of data to Python code for processing and get back results, and vice versa.

That's probably extremely valuable to many people, but it does not seem to be what I hope for whenever I see something about bringing Python to Excel.

What I hope for is something that is like VBA but without the VB. I.e., a Python (or JavaScript) interpreter/JIT/compiler built into Excel exposing all the internal Excel objects and methods that VBA has access to. By "Python" I just mean the language and whatever subset of the standard library makes sense in Excel.


Install PyWin32 module and `import win32com.client` and you can access "all the internal Excel objects and methods that VBA has access to".

    C:\>python
    Python 3.10.1 (tags/v3.10.1:2cd268a, Dec  6 2021, 19:10:37) [MSC v.1929 64 bit (AMD64)] on win32
    Type "help", "copyright", "credits" or "license" for more information.

    >>> from win32com.client import Dispatch
    >>> xl = Dispatch("Excel.Application")
    >>> xl.Visible = True
    >>> wb = xl.WorkBooks.Add()
    >>> sh = wb.WorkSheets[0]
    >>> sh.Range("A1").Value = "Hello World"
    >>>


The same with xlwings (works on macOS, too):

  >>> import xlwings as xw
  >>> wb = xw.Book()
  >>> sh = wb.sheets[0]
  >>> sh["A1"].value = "Hello World"


win32com.client uses Windows' COM machinery, which supports event-driven feedback from Excel to the script. Switching to PowerShell because I'm not familiar with that in Python and from a quick Google, Python needs some more complex multithreadding handling:

    $xl = new-object -ComObject excel.application

    $xl.Visible = $true
    
    $wb = $xl.WorkBooks.Add()
    $sh = $wb.Worksheets[1]
    
    
    Register-ObjectEvent -InputObject $xl -EventName SheetChange -Action {
        Write-Host "Hi"
    }
    
Now when you change a cell in the spreadsheet GUI and press enter, the PowerShell "write-host" runs and prints the message. Another comment mentioned "using Excel as a GUI for Python" and the COM interfaces have the machinery to do that. It looks like xlWings has something like this, but may need an add-in installed in Excel to trigger Python code from Excel? https://docs.xlwings.org/en/latest/quickstart.html and https://docs.xlwings.org/en/latest/addin.html#xlwings-addin


That would be very cool. Python with Excel as the GUI.


I am using Acceλerate for MS 365, which brings Visual Scheme for Applications (VSATM) to the MS 365 suite [1]. It can be used in the other MS 365 apps as well and ties in nicely to .NET. That and I unashamedly love Lisp over python even though I know python has the data science corner. I was very disappointed when MIT replaced Scheme with Python as their CS intro language, but such is life. For kicks, I am a big APL fan and found April (Array Programming Re-Imagined in Lisp) which has APL and Lisp united into a super PL! [2]. Maybe an APL for Excel?

[1] https://code-magazine.com/Article/2207071/The-Excellent-Sche...

[2] https://github.com/phantomics/april


I'm a developer. People have often come to me with Excel sheets that were really slow because it was just too much for Excel. Almost all of the times IMHO a specific solution should have been developed for it. I'm curious to see if this makes it better or actually worse.


Conway's law. Unless those people have devs or data engineers embedded in their group, they're going to keep on with less than ideal solutions. Management will accept those inefficient solutions because they don't come with dev pricetags.


I don't doubt that you have, but in my experience the reason they're slow is because updates haven't been turned off while the macro was running, or naïve algorithmic choices (that a query planner might fix in the realm of normal programming, but python wouldn't help with.)


Meh, as an advanced user on a trading floor we don't need Python for Excel to run on the cloud, we need it to run locally so we can tap into our in-house C++ library and KDB databases. Third-party solutions have so much overhead.


Sounds like you're doing something too sophisticated for Excel.


You don't know what Excel does.

This is normal Excel use.

Most large organisations use an Excel front end to their enternal services and the uses uses VBA to put them together. - Basically Excel is the finaciers web browser.

Unfortunately they also have large Excel spread sheets that are passed around between users with no control. An examplke was in UK reporting of some COVID information, it was underreported as they were using Excel which only allows 64K (number chnages per version I think it was 256 originally) rows and so when exporting data rows were just lost.


We are essentially using it as a UI for the traders.


C++ FFI from Python from Excel. Now this is podracing.


Probably not needed as if you need that you already doing Excel C FFI.


My main dependence on Excel is being able to interact with pivot tables. I also have some VBA macros to help with formatting which I would welcome to be in python - alas it looks like what they're bringing is for the cloud only?

I haven't found any free, open source solutions around this including any for Jupyter notebooks (my preferred medium).


python pandas has pivot tables https://pandas.pydata.org/docs/reference/api/pandas.pivot_ta...

I don't know if it fits your needs


So you can click and drag the columns around in the pivot table to get a better view?


imho using pandas for pivot tables is like using a cannon to shoot sparrows



If this is true then it will be amazing and very widely adopted. Pandas in Excel would be cool. Fingers crossed.


Why? Libreoffice calc had python for years now and nobody cared:

https://help.libreoffice.org/6.3/en-US/text/sbasic/python/py...


Beacuse anyone with a budget had Excel already - and had to use Excel to be able to use other peoples' spreadsheets. LibreOffice is not bug compatible with Excel.

Look what Excel had tro do to replace Lotus 1-2-3. Including making 1900 a leap year beacuse Lotus had the error.


I love free software as much as anyone, but when I'm forced to use a spreadsheet I'm glad we get office at work. The UI/UX of LibreOffice could use a lot of improvements


Because the world runs Excel, not LibreOffice Calc.


For someone working outside IT but who has coding skills, it's great. My previous work computers have been tightly controlled by IT and installing libreoffice on them wouldn't have been permitted.


I'd be happy if I could just represent integers with more than 15 digits without rounding (e.g. 64-bit primary keys)


64bit can hold 20 digits (decimal), 20 (sortof) 19 completely


64 bit integers can. Excel uses 64 bit floating point. Try the formula

  =1 = (1+1e-16)
and you'll get TRUE.



Here you go: https://blog.adaptiverisk.com/post/2023-08-22-dataslicer/ Calculations are local, not in the cloud.


We're working on this at op (https://opapp.io). Pandas in a spreadsheet, just not an excel spreadsheet if you're able to jump ship.


> it will be amazing and very widely adopted.

The calculations in the cloud?


Why would the vast majority of Excel users even notice or care about this?


The ones that use python would. It being cloud-based makes it a pretty useless feature for anything that I was using Python & Pandas for.

If it was on the client side, I'd be rolling it out to users the moment it was available.


Because they have financial data on it and putting financial information outside computers your company controls is a massive risk.


There is an interesting looking competitor in this space that is built with Python integration from the ground up.

https://www.neptyne.com/

I haven't tried it yet, nor am I affiliated with Neptyne, but it looks intriguing.


Hmm, I would have gone with "Nyptune".

But it looks nice. Looks fun to work with.


It's interesting how aggresively enthusiastic HN are about Excel when it comes to discussions about Excel vs. open-source alternatives. Any mention of alternatives are quickly and aggresively shot down, ocassionally to the point of personal attacks (if you're satisfied with LibreOffice you don't have real job, etc.).

On the meanwhile, just a few threads ago there was a (quite fair) browser benchmark where Chrome comes up on top on almost every test. People in the comments instantly rushes in proclaiming how proud they are for using Firefox even if it is technically inferior, solely because it is not owned by Google.

What's up with this divide? Firefox user, btw.


I think it comes from users being deceived by the product UX. My impression is that using Firefox, I often don’t miss other browsers. Using Libre Office makes me miss Excel, Keynote et al. I can’t pinpoint it to a single thing, it just feels dated and inferior. Open source in general seems to work better for infrastructure than applications, even if I have my own favorites. It partly has to do with the lack of product teams like designers, but also that an app is a single endpoint that makes a hard business to sell multiple flavors.


I imagine this is a stage in a larger plan to incorporate an AI assistant to excel? Being able to talk to your spreadsheet in natural language and have it do a bunch of analysis and visualizations would be a huge productivity booster.


Aw, shoot. I thought I would be able to have fun sending coworkers:

  import os
  os.rmdir("C:\Windows\System32")


shutil.rmtree, unless their System32 is empty


While I can see the advantages of Python over VBA, I feel like running the code in the cloud instead of locally will be a big no from a lot of people, especially enterprises running offline Office installations.


When I worked at a hedge fund being introduced to Python after years of working in spreadsheets was a revelation. ChatGPT/Copilot have dramatically reduced the cost of switching to the IDE for bankers/investors/consultants who make up Microsoft 365's core user base.

Surprising that Microsoft would give users this gateway drug, but I suppose the reasoning is that you can keep your users by making Excel a quasi-IDE.


Excel was already a quasi IDE 30 years ago even without VBA. Excel macros and the C FFI interface worked.

I suspect HN users in general are different to finance users and so have not seen how Excel is used.


I hope this is the start of a wholesale python replacement for VBA. Python has turned into what VBA always tried to be: A ubiquitous language for data and scripting (among other things).

Excel would be well served by replacing VBA with either JS or python because of their ecosystems and high levels of adoption. Python is the better choice because data analysts are more likely to be familiar with it.


TBH I'd rather have Excel sheets in Jupyter than have Python in Excel.


Now if they could only fix the issue where dates are converted into scientific notation. For the past 10 years I've been getting a steady stream of excel files with sometimes thousands of dates that are formatted incorrectly. And there is literally no "easy" way to fix this.

I'm tired of it Microsoft. Just fix the darn issue please.


From my skim they are incorporating a new-ish release of python. I wonder how long until (and if) MS forks python to add some functionality (extends it) and what happens next. Is it that farfetched to picture corporates running MS python with dot net integration or something and the ecosystem getting all screwed up.



Why does Excel need to run Python calculations in the Cloud?

I think I'll stick to VBA.

Javascript in Excel would be nice.

Even better would be native SQL.


Maybe this is not so much about bringing Python to Excel, more about bringing Excel to Python. But only if you do it the way Microsoft wants you to.

As with BASIC morphing into VisualBasic and VBA you lure (beginning) python programmers into Excel. Ultimately those developers will understand and appreciate less and less of vanilla python REPL, jupyter, scripts, etc. And... Welcome to Microsoft's world.



Excel has javascript… Google “Officescript in Excel”.

Also, https://learn.microsoft.com/en-us/office/dev/scripts/resourc...


> Javascript in Excel would be nice.

JS has been one of the supported add-in languages for a while[0], but that's not quite the same thing as the new python offering (which sounds more like jupyter within excel).

Since excel is often (mostly?) used for working with financial calculations, I for one am glad they went with python for this feature instead of a language that insists that an IEEE 754 double is the only kind of number anybody actually needs


JS has BigInt now, and it almost works like you'd expect. Need to put 'n' after all of your numbers though.


True, but financial calculations tend to involve non-integral numbers. Python has arbitrary-precision decimals via: https://docs.python.org/3.11/library/decimal.html


Financial calculations are also sometimes already used to doing math in integral pennies (or eighths of a penny) for performance/storage reasons. That works just fine with one BigInt. Or you can build arbitrary-precision rationals from just two BigInts. (There are some JS libraries out there already for that.)


"You can build arbitrary-precision rationals" is a far cry from saying, "this is a language designed for precision arithmetic." You can technically build arbitrary-precision rationals without BigInts using arrays, but it's not exactly ergonomic.

Even if we stipulate that users will be cognizant of floating point limitations and convert everything to integers, JS still requires all literals to be suffixed with n to avoid silently losing precision when integral values exceed 2*52.


Many languages that are considered "precision" languages have some form of suffix to denote large/long/arbitrary-sized literals. (C# there's a big useful difference if a literal ends with `m` or `l`.) Most languages you have to be aware of your data types and use the appropriate data types.

No one is saying that "JS is a language designed for precision arithmetic", everyone is saying "JS is a language with the tools today to do precision arithmetic (whether you like it or not)".


> Many languages that are considered "precision" languages have some form of suffix to denote large/long/arbitrary-sized literals. (C# there's a big useful difference if a literal ends with `m` or `l`.)

The failure mode is significantly different when comparing JS to other mainstream languages. Most languages have a "checked" mode (either opt-in or mandatory) so that expressions like `4611686018427387904 + 4611686018427387904` can raise an error, whereas JS will just give you the wrong answer. Python is the most ergonomic for beginners or non-developers because the runtime will convert an integral type to an arbitrary precision integer on overflow.

Yes, there are alternative primitives that can be used, but you need to know they exist. We're talking about a tool that is explicitly designed for people who are not professional programmers, and JS's behavior here is surprising unless you understand the underlying data model. I would personally much rather give beginners a tool incorporating as few footguns as possible.

> JS is a language with the tools today to do precision arithmetic (whether you like it or not)

Arbitrary precision arithmetic has been possible in JS since the language was invented, but it has always been a pain. The bignum NPM package[0] predates the BigInt numeric primitive by 9 years. The addition of `BigInt` to ES2020 was important for performance (since implementing an arithmetic primitive in JS makes calculations dog slow), but it didn't add any fundamental affordances to the language.

[0]: https://github.com/justmoon/node-bignum


SQL is easy to use from Excel what more integration do you want.


How will this work in practice? Will this just be python version 3.XX forever? How will packages be maintained? Do you get to specify any of this so when a future version of a package is introduced without backwards compatibility it doesn't break everything immediately?


I could swear I had seen this headline before, so much so that when I was recently asked to help out in some old vba code, I figured I’d port it over. Imagine my horror when I couldn’t find it. Imagine my horror seeing this now that I just finished.


This is so weird. I built a "python within excel" a few years back.

Abandoned now: https://alphawolfxl.com/

It allowed python functions to be run on cells and ranges, and also allowed ranges to be filled with sql queries on databases.

I had managed to get also a matlab runner working (via matlab/java connector) and was just getting the R connector finished.

Basically, I made a website showing it working and gave a few people the plugin, and their feedback was "nah, my company won't pay for it, you're wasting your time" - so I shelved the project.

I have no idea what my emotions are doing right now :)


> “I’m excited that this excellent, tight integration of Python and Excel is now seeing the light of day,” says Guido van Rossum, Python’s creator and now a Microsoft distinguished engineer.

It's depressing seeing someone who has done so much work for the free software community now working to enrich the spyware-laden proprietary product offerings of a megacorp.

Nothing about this is "good for Python" or hackers - it's just good for Microsoft to sell more Excel (and Microsoft accounts). That makes the world worse.

I thought the B stood for benevolent.


Even installing Python seems to a hurdle for many employees. They need to get approval from IT, and it seems not to get done. They use Excel, and if they can use Python within Excel without installing anything extra, it will increase the user base of Python. After getting a taste of Python they may install Python locally and use it outside Excel. So I think this will be "good for Python". Certainly integration of VBA into Excel has been good for VBA.


Here I was, hoping he worked on performance, typing, dependency management or other hard, meaningful issues in the Python ecosystem.


Off topic but why can’t we all go back to office 97?

It basically does all the same stuff and it would be lightning fast on modern machines. You don’t even need internet to use it.


> why can’t we all go back to office 97?

For one, because people want more than 65,536 rows in their spreadsheets.

But, I'd largely agree that '97 was the pinnacle from a UI perspective. There are changes under the hood that are quite nice, though, especially with the move to xml persistence.


None's stopping you to use MSOffice '97. I still use it from time to time and current MSOffice is happily recognizing the format and converts it on the fly. Also there is a tool that you can install and see current formats to be able to open them in MSOffice '97.

As for why the rest of the world why would do that or not, is their choice, no?


Thank you. Yes, I could get ELT to accept a “RAD” Access app built and shipped during the time it takes to pitch to choose just the CRUD library or JS Framework alone.


Yeah strangely if you turn the ribbon off it doesn't look much different and doesn't work much differently!


You say 65k rows is enough for anyone?


XLOOKUP.


It’s interesting but if you are good enough at Python, can’t you already write to and read from spreadsheets? Maybe moving up a Python excel library like xlrd and openpyxl and making it reactive would be more worthwhile for Microsoft. Writing Python code in that little formula bar as shown doesn’t seem attractive. Writing Python in VSCode and seeing the results appear in the spreadsheet in real time would be great though.


No, you use Excel when you need a data manipulation program quickly and you want to ship it to clients without any installation or training. You just send the workbook to your client and it's done!

It's very self-contained, what works on your machine is almost guaranteed to work on the client's computer, given that they have the same Excel version. It's also very reliable, VBA apps can work for years without any maintenance or support.

Openpyxl is fine if you want to generate a report and send the results but for anything dynamic you would need to ship an interpreter, plus generate an exe, plus ship a workbook with the data, plus you probably need to write a GUI (users freak out from the cmd). This is much less reliable and requires constant support to users, VBA/excel apps rarely do.

This Python integration is a bit disappointing though as the "killer feature" of Excel is that it's self-contained and now it will rely on MS's cloud. This is yet another integration that will fail to kill VBA (we had JS and .net already but had similar issues), it seems that MS doesn't really understand how people use Excel :)


Great points, since it’s in the cloud it’s no longer self contained. We are already in some hybrid works with extra dependencies.

And if that’s the case, why can’t the .py files live in the cloud too? The need for an interpreter didn’t go away, Microsoft is just hosting it for you.


I've worked at places where we were not allowed to install Python, nor were we allowed to run executables that weren't greenlighted by IT beforehand.


A lot of people are taught Excel early on in school. I wonder if this would result in more people learning or being introduced to Python at an early age?


People aren't taught VBA in school, as far as I know, and that was already there.


worth checking out Quadratic for an open source approach to Python + Spreadsheets: https://news.ycombinator.com/item?id=35456509

theres going to be multiple people tackling this problem but I see it as a worthy one, theres no reason we should be needing VBA in the future when we can run Python in browsers.


We're building Python (mainly python pandas) + Spreadsheets at op (https://opapp.io). Would love to know what you think. The code editor is more like a built in code notebook.


Does anyone yet have a full list of the supported python libraries?

There's a partial list here: https://support.microsoft.com/en-gb/office/open-source-libra...



I look forward to seeing this get abused. Having worked with a psychometric analysis program entirely implemented in VBA, I genuinely wonder how Excel/Python will live alongside R in academia - at least for the people who are really conservative about the tools they use.


You can always count on out of touch Microsoft PMs to over salt a good idea.

Always one step forward, two steps back, and one to the left.

I remember interviewing in Redmond once and the PM couldn’t understand why Windows Vista wasn’t doing well because it seemed to run excellent on his overpowered PC.


If you're interested in something like this with a bigger code notebook-style code editor + AI code generation, we're working on it at op (https://opapp.io).


I am the sole Python developer in a company where everyone uses spreadsheets. With this project, will my coworkers be able to run my Python functions? Or only functions that are built into Python and packages like pandas and NumPy?


If your coworker doesn't have the feature enabled, they will see the last cached values. If they try to run them without the feature enabled, they will get an error.


It is very likely that your coworkers will be able to use your Python functions.


> Microsoft says Python in Excel will be included in a Microsoft 365 subscription during the preview, but “some functionality will be restricted without a paid license”

Microsoft 365 subscriptions are paid licenses. This sentence is nonsense.


It seems it took 6 years from "consideration" to an actual release: https://news.ycombinator.com/item?id=15927132


It'll certainly be a lot easier and more accessible than pyspread:

https://pyspread.gitlab.io

(which I wish was more widely known/used)


Seems cool, and a bit awful. Writing elaborated python-code in that little text-field must be a nightmare of its own. Where can I connect a VS Code or vim to the sheet for coding in comfort?


My thoughts too. Not having to install outside software is treated as a feature, but it is a lost opportunity: Excel with tight VSCode integration sounds like a killer app for decades to come. Two best-of-class (respectively) programs, each in realms dominated by Microsoft. Combine the two and unique possibilities unlock, bridging the gap between business and development.


Excel has an editor. Try hitting alt+f11. "Open in vscode" would be an excellent feature though.


No one seems to know that an VBA IDE is part of Office. It's important to remember when you're working on an extremely locked-down system and you want to do something complicated.


Curious if we can use Pandas like framework to combine excel sheets into 1 (with common columns or specify rule for column merge) once this is launch in an easy way all in Excel?


Grist is a spreadsheet with Python support (I am a founder). Python does make some formulas far easier. Nice to see Excel has data science libraries included from the start, that's something we've had our eyes on for a while. On the other hand, Grist is open source and can be run locally.


Since a few month i use grist extensively.

For example: As part of a monitoring system. HR / crm Database. Self serving vacation planner and approving system. Tamperproof GMO(genetically modified organisms) database. As my go to source as a quick and dirty data dump while hacking...

Grist is a great tool!


does it make you nervous to basically have microsoft as your primary competition now? :)


It's a spreadsheet, Excel was always our primary competition! :) The more Excel follows us, the better for everyone. Python is a start, but they have a way to go. Imagine proper relational data, self-hosting for cloud Excel, open-source, access rules....


This seems like pretty bad news for all the excel but programatic type startups I've seen floating around. Being able to work locally might be their only hope.




Does this mean you can get an infected Excel file that use python macros? This is a serious question, I know a lot of attacks use Excel macros.


Excel single-handedly keeps VB alive; it has a huge amount of sway over what langauges people use and learn - so why settle for Python?


Is this feature enables because of Webassembly? Considering that we assembly let’s people run python code natively in the browser?


I've been saying that Python is the modern day's Visual Basic and now it has become the actual "Visual Basic".


Is this feature primarily a function of web assembly? I.e because if we assembly you can run python natively in the browser?


Anyone know what format they are serializing the data in to move it between Excel and Python? Are they using Apache Arrow?


I describe Excel as the browser for data apps. Now Python would be the scripting engine for Excel, the “data browser”.


So what happens when you're trying to work on your laptop and you don't have a reliable Internet connection?


Wait, didn't they already do this? I'm pretty sure I downloaded an official add-on for this a few years ago.


AFAIK, as a third party add-on:

https://www.pyxll.com/docs/introduction.html


There was a YC company that did it, maybe a decade ago. I bet someone will remember.


DataNitro! I knew very little about programming and was trying to create examination schedules for the faculty I was working at. I managed to convince the dean to buy a license and it pretty much changed my life (granted it was mostly Python that changed it but DataNitro made it even more accessible it to me back then).


DataNitro was my inspiration for the Open Source version of xlwings back in 2014. They have been out of business for many years though.


I work for a big bank. There is no way infosec guys would allow this, its no use to us. Even ChatGPT is blocked.


This is a nightmare. Python devs are going to be asked to support the half-baked python macros of business users.


very smart move!

clearly aimed at big business/enterprise customers, python's ecosystem of scientific & statistical libraries is massive

not so long ago they introduced Office Scripts and Power Automate, in essence coding macros in TypeScript (rather than VBA), again only for business/enterprise license users though.


Does this allow Microsoft to work around Apple app store restrictions on running interpreted code?


DIY:

-> excel cell: rest api call

-> send request to your own python web app: run python function. send back the output.


Bold prediction: this will have a greater economic impact than LLMs over the next 10 years.


I'd say they are doing this because of LLMs, so that normal users can tell Excel what it should be doing, and it then generates a script which can execute it.


Is this the end for internal javascript frontends?

Should I learn python for Excel instead of Django?


I wonder who'll be the first to ship python in a major browser


There's https://pyscript.net/ -- or do you mean for scripting the browser's behaviour rather than replacing JS in website code?


Yeah, making python a first-class browser scripting language to displace JS. I think we all know it's time. Even if it takes a decade for ubiquity, it'll be worth it.


Definitely check out PyScript if you haven't already. It works pretty well and is being actively developed. (Full disclosure: I work for Anaconda, which is the primary developer.)


That's cool, thanks for sharing. I've only ever seen things that required transpilation and had complex toolchains. I hope it catches on


This would be a good moment to ask for good free Python courses.


Soon Python will rule it all as the most widely adopted language


Until Excel can run Python locally, I doubt it.


What's wrong with openpyxl?

Heck the .xslx file format is fancy XML.


Nice, waiting for the new machine on HackTheBox :-D


Half joking: Now we know why Microsoft hired Guido.



HELL YEAH!


Why not starlark?


Cool a novel malware vector for the uninitiated


definitely_not_malware.xls.py


game changers for the competitive scene, looking forward to MAKRO's commentary


Now that Microsoft embraced Python, what are the odds it will go all the way to extend and extinguish it (or at least CPython as reference implementation)? There are much more excel users than programmers. If excel runs Python, it has the potential to become the "de facto" standard interpreter, used by most people worldwide, displacing CPython.


Between nil and zero. Just because excel is "bigger" by various metrics, does not give MS any more ability to quash Python. The software world is way larger than it was in the heyday of EEE.

Excel still has a fraction of the power Python does, it's not becoming "de facto" anything that it isn't already de facto for (which is highly interactive data wrangling and making reports by folks who cant/won't use a python env for that task).




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: