Hacker News new | past | comments | ask | show | jobs | submit login
XLOOKUP for Excel (microsoft.com)
229 points by kbumsik on Aug 31, 2019 | hide | past | favorite | 80 comments



Finally! Every. Single. Time. I have been forced to use VLOOKUPs or INDEX/MATCH I wonder why nobody on the Excel team had still not built a new easier interface. I wish I could only use Pandas but excel is the lingua franca of business.


It's almost impressive how long it's taken for Excel to comfortably mimic "SELECT Prefix FROM Whatever WHERE Country = 'Brazil'" I have no doubts why it's the de facto data tool, but this is laughably overdue.


You can use Pandas via the xlwings package (https://www.xlwings.org)


pandas is inconsistent and verbose. Isn't SQL a better alternative? Or even dplyr? Heck, I would learn one of those array languages such as J or APL rather than use pandas.


I haven't used pandas much yet. What is inconsistent about it?


> pandas is inconsistent and verbose. Isn't SQL a better alternative?

Wait, there's a thing that's more inconsistent and verbose than SQL?


How is SQL inconsistent?


The syntax and grammar of SQL commands do not have an elegant symmetry. Examples:

The ordering is not always name of table and then names of columns...

Write data:

  UPDATE customer SET membership_plan='PLATINUM'
Read data (incorrect syntax if specifying table name then column name):

  SELECT customer GET membership_plan  -- incorrect
Read data (correct syntax of columns then table):

  SELECT membership_plan FROM customer 
For UPDATE, the column names are adjacent to the value with the '=' equals sign. However the INSERT statement splits column names away from the values; all column names as one delimited list in parentheses and then all the values are another delimited list:

  UPDATE customer SET membership_plan='PLATINUM', timezone='UTC'

  INSERT INTO customer (membership_plan,timezone) values ('PLATINUM','UTC')
Prepositions...

Prepositions before table name: SELECT FROM t, INSERT INTO t, DELETE FROM t

No preposition before table name: UPDATE t

For syntax consistency, the UPDATE would also have had a preposition such as "UPDATE TO t", or "UPDATE ON t", or "UPDATE ONTO t" ... or ... the other 3 commands would have removed the need for "FROM/INTO". Either way, all 4 SUDI commands could have looked more alike.

Yes, one eventually gets used to the cosmetic inconsistencies but it's annoying for beginners because memorizing 1 of the 4 commands doesn't really reward you with the ability to predict the grammar of the other 3 commands. You have to learn the different quirks of the 4 commands.

Maybe we should be grateful that SELECT/UPDATE/DELETE at least all happen to share the same "WHERE" clause. It seems like SQL's inconsistent syntax design should have tortured us with memorizing "WHERE" for one command and totally different synonyms such as "FILTER" and "CONDITION" for the others!

[To downvoters, I'm willing to be persuaded that I'm wrong and that SQL syntax is actually "consistent" so please reply with your counterargument.]


I agree with your overall theme: SQL is not composable. On the UPDATE front, some implementations (e.g., PostgreSQL) do allow tuple/row assignment, which does make things a bit nicer. (Small conveniences can make things better, while not perfect.)

   UPDATE customer
     SET (membership_plan, timezone) = ('PLATINUM', 'UTC") -- ...
I tend to use row comparison and assignment where possible (in JOIN clauses, UPDATE statements, and WHERE clauses) to reinforce thinking in tuples, where a single column is the degenerate case (though I do then omit the surrounding parens).


Mysql certainly has some syntactic warts, but you picked what I consider rather odd complaints.

> The ordering is not always name of table and then names of columns

I would point out that only SELECT (not INSERT, UPDATE, DELETE) can be run without providing a table reference. Also, the select doesn't take a simple list of column names, but is a more complicated tool for organizing and formatting data. (This section would not by syntactically or semantically similar to anything in the INSERT or UPDATE that is not itself part of a SELECT statement)

> For UPDATE, the column names are adjacent to the value with the '=' equals sign. However the INSERT statement splits column names away from the values; all column names as one delimited list in parentheses and then all the values are another delimited list:

UPDATE and INSERT both support the "assignment_list" syntax. So they are perfectly consistent. UPDATE also supports other formats (such as VALUES or SELECT).

The real inconsistency I would have called our here is that while INSERT can be used with subqueries (via VALUES or with a SELECT clause), UPDATE must use joins to achieve a similar functionality.

> For syntax consistency, the UPDATE would also have had a preposition such as "UPDATE TO t", or "UPDATE ON t", or "UPDATE ONTO t" ... or ... the other 3 commands would have removed the need for "FROM/INTO". Either way, all 4 SUDI commands could have looked more alike.

Well "INTO" is an odd case, it is completely optional and serves no syntactic purpose. The FROM portion of SELECT statements does serve a syntactic purp0se but isn't actually required if it is not needed (E.G `SELECT "hello world"`). The inconsistent thing here is that the FROM portion of the DELETE syntax (which serves no syntactic purpose) is mandatory and not optional.

> Maybe we should be grateful that SELECT/UPDATE/DELETE at least all happen to share the same "WHERE" clause. It seems like SQL's inconsistent syntax design should have tortured us with memorizing "WHERE" for one command and totally different synonyms such as "FILTER" and "CONDITION" for the others!

If you look at the spec, SQL is pretty good at reusing syntactic units. You can see this in the documentation for both MySQL and Postgres.

The worst thing in terms of SQL consistency is the syntactic and semantic differences between different flavors of SQL. This greatly reduces the value that SQL could have as a plug and play query format.


>Mysql certainly has some syntactic warts, [...] UPDATE and INSERT both support the "assignment_list" syntax. So they are perfectly consistent.

(I was making generic comments about SQL so not sure why you interpreted my post as criticizing MySQL.)

Anyways an FYI for any readers not aware... MySQL has non-standard extensions[0] to the reference SQL language. Other databases like Oracle, MS SQL Server, and SQLite do not have INSERT with optional "assignment_list" syntax.

>Well "INTO" is an odd case, it is completely optional

Another example where MySQL is a special case because its proprietary extensions makes "INTO" optional. In other databases like SQLite, the "INTO" is absolutely required because omitting it in "INSERT customer" returns a syntax error.

>I would point out that only SELECT (not INSERT, UPDATE, DELETE) can be run without providing a table reference.

Yes, the ability for SELECT to leave out the table name is handy for functions like "SELECT date();" but its optionality would not have violated consistency with table-then-column ordering of INSERT & UPDATE. In cases where SELECT needs to specify table(s), it would have been more consistent grammar by putting the table names before the columns like INSERT & UPDATE. (This would have also greatly simplified editors with intellisense since putting the table first would let the parser know which columns to list as autocomplete suggestions. E.g. in Navicat SQL text editor, to get a list of autocomplete column names, I have to first type out "SELECT FROM customer" then move the cursor backwards to the left of "FROM" and then start typing "mem" to see "membership_plan" as an autocomplete suggestion.)

[0] https://dev.mysql.com/doc/refman/8.0/en/extensions-to-ansi.h...


Hmm, I obviously failed at researching my post properly. When looking for a page listing the differences and found this:

https://dev.mysql.com/doc/refman/8.0/en/differences-from-ans...

I completely missed the extensions page you listed.


Because nobody in the Excel team uses Excel and it shows.


No, they just don't use it for your specific scenario.

Disclaimer: Work at MS. We use Excel plenty, we just tend to do things differently than many other businesses as things that shouldn't be done in Excel we don't do in Excel (since we have other tooling available).

The challenge is that in reality, people are diverse and will use Excel as long as they can as they don't want to buy other software, lack skills to implement it differently, or for some other reason need to keep data in Excel.

Software like Excel has been around for so long and has so many edge cases in how it is used that the dev teams could spend 100% of their time adding features and still probably not touch some potentially large user base ones.


I'd like to believe you but some of the beefs I have are not edge case at all.

Top of my head, a few examples (in addition of the index/match):

1. the fx "Insert function" window, with a search feature that just doesn't work, a listbox that shows only 7 results and is not resizable (when the built-in functions alone is a huge list, plus all custom udf). Same thing for the Function Arguments window.

2. excel incapable of preserving the default for calculation mode (automatic/manual) between sessions

3. excel recalculating the whole sessions when opening a csv file even when calcs are on manual

4. terrible default formats (serious people don't want thick lines, 3d charts and all that crap)

5. Recent regression: can't copy a table from excel to powerpoint while preserving the format, formatting a table in powerpoint is a pain in the butt, default format are just bad

6. No way to link a powerpoint to a spreadsheet. Think all the bankers, consultants, engineers, who must present the result of some analysis, the data is in excel, the deliverable is a powerpoint deck, how do you link the two. Only ways I know are awefully manual and/or unstable and/or result in tables displayed as images in powerpoint

7. VBA not developped or improved since the late 90s (there was an interesting attempt at one point, VSTA, but it was canned)

I could go into more edge case things but these are mass usage features.


For MS, an edge case is something that probably has under 100k users unless someone on the dev / feature team decides to champion that item.

So no - some of your examples are very much edge cases, preference, or lack of understanding how MS works.

Example(s) off your list: item #4 - that's subjective and I am certain there are users that do want that feature. This same feedback probably applies to #3 also. Or not enough people care / complain about #2 and #3 to get them prioritized.

#5 and 6 is probably because people tend to think that MS is one unified company; it's more like little ones that group together to produce a product. The Excel and PowerPoint teams pre-Satya wouldn't have even talked on features outside of integration testing. That item also might be a PowerPoint issue, a Windows OS issue (for how the formatting gets preserved on copy) and not related to Excel at all.

The VBA item someone else has already addressed.

MS dogfoods heavily; it's just that MS doesn't represent their user base (hence the push for insiders - as insiders may represent the user base). I think most large companies (Google, Amazon, etc) that make software have that same issue (lack of user base representation internally).


I am ready to bet the number of users who need these features dwarf the number of users of Cortana and Paint 3D together!


> VBA not developped or improved since the late 90s (there was an interesting attempt at one point, VSTA, but it was canned)

VSTA was not canned [0], but core Office apps never supported it. IIRC, the only Office app that ever used (uses?) VSTA instead of VBA is InfoPath. VSTA was more of a replacement for VBA as an offering for other people (VBA, as well as being used in office, was available for others to use in their apps), who didn't have Office’s backward compatibility constraints.

[0] VSTA 2019 SDK is available here: https://www.microsoft.com/en-us/download/details.aspx?id=583...


I believe at that time there were talks about VSTA being made available in other parts of office, living alongside VBA. It never happened but I thought that was a good way forward to phase out VBA.


Why does LINEST in Excel return regression coefficients in the OPPOSITE order of the columns of data? Could you at least add an optional boolean argument that returns them in the same order as the columns?


It’s basically just index match with a sensible default for match’s matching argument.

Good stuff. Took too long to make an official implementation of this but glad it’s here


Agree. Looks like I can finally get more people using the functionality provided by INDEX([returninfo],MATCH()). Previously, folks would get too confused initially to commit it to memory. MrExcel has this exact comparison: Youtube.com/watch?v=8dwmvTka3gs


I, for one, await YLOOKUP, so that we can go vertically, horizontally, and tab-wise through our workbooks, for that true 3D feel.


Shouldn't that be ZLOOKUP then?


That's a good name if you're a techie, but you know it'll be "PowerLookup Live" by the time the marketing people are through with it.


Shhhhhh: the multi-file lookup is the next version.


The new function behaves very similar to =SUMIF(), with two additional optional parameters for how to search and match.

Easier to understand and solves some issues with =VLOOKUP().


So, instead of using SUMIF, I always use SUM or SUMPRODUCT using the trick that in Excel True and False are 1 and 0 when used in arithmetical operations. This is unbelievably flexible. For example, if you want to sum all the elements of column A if the element in B is between 1 and 10 then you have SUM(A1:A100 * AND(B1:B100 >=1, B1:B100 <=10)). You need to press Ctrl+Shift+Enter, as this is an array formula, but otherwise it works like a charm.


A small but important difference between this new function and SUMIF: SUMIF will sum all values in the array that match your search criteria, but this function will only return a single value.


Bill Jelen ("Mr. Excel") has posted an in-depth look at XLOOKUP with more examples:

https://www.mrexcel.com/excel-tips/the-vlookup-slayer-xlooku...

and a video:

https://www.youtube.com/watch?v=E5JxX_3Qb7A


And an Index Match comparison: youtube.com/watch?v=8dwmvTka3gs


My thought abstraction on this: Joins are what make database technologies... If you can call Excel a database technology... So powerful.


A decade ago as an intern at Microsoft, I attended a talk by <distinguished engineer whose name I forget> about data systems. The thing I remember from it was: "In terms of total data stored, what data system do you think is the biggest in the world today? ... Probably Excel. Probably Excel by a lot."

Excel has a simple mental model combined with powerful tooling. It's the most beautiful means to get non-technical people to start to think like a programmer. An acquaintance of mine tells a great story of how Excel and VBA got him into software: https://news.ycombinator.com/item?id=7950736.


This is amazing. I also found a love for programming after i learned some spaghetti VBA to create automated sales reporting at my finance job. Finally starting up CS degree in Jan 2020.


Excel is to me basically a sequel database broken in a few key ways. And some people (a lot of...?) like it that way.


For a lot of individuals and small businesses true databases have a tough learning curve. I commend the Excel developers for recognizing this and including things like Power Query and Power Pivot.

They aren't there to completely replace databases, but they are a decent middle-ground for meeting people where they are.


As an aside, this is the very first time I've ever seen "SQL" written out as "sequel."


Not deliberate; I just write things out as I say them and didn't review.


Excel is an hybrid ball of things already. Reactive dataflow is one of them. So Join is not that much more to add.


Anyone here on the GSuite team who can get this into Sheets stat?


If you're using Google Sheets, the QUERY function [1] can do all kinds of lookups. For complex logic, I find its SQLish syntax a lot more readable than mixtures of VLOOKUP and IF and other functions (but that might just be me). So, um, anyone here on the Office team who can get QUERY into Excel?

[1]: https://support.google.com/docs/answer/3093343?hl=en


My office is split between Excel and Sheets users. I was once trying to mash-up my Sheets knowledge with an Excel colleague. I told him it might be easier to accomplish his task with Sheets because query() uses a SQL-like syntax. I told him Excel doesn't have that. He responded, "lmao you can do that with vb". I looked up instructions on how to that and realized he had no idea what he was missing.


Anything that requires VB is not friendly. Of course there's a sheets equivalent using javascript, which you can say the same thing about


Not identical, but this fuzzy match is a life changer.

https://gsuite.google.com/marketplace/app/flookup/5938060149...


+1 works well but long calculate times


So...no patch for older versions of Excel? Or did I miss that part of the announcement?

If this isn't available as a patch for older versions of Excel, good luck popularizing the function.


It’s a breakthrough for the consulting world!

Joke appart, it’s been long overdue to have such a lookup functions - even more so with sensible parameters as default!


I know the IDE has tooltips, but why don’t they ‘invent’ enums instead of those magic constants for match_mode and search_mode? Does that complicate the parser that much?

For VLOOKUP and HLOOKUP, they can’t for backwards compatibility, but this breaks new ground.


they likely either made utility studies for this, or all of Excel already works that way and they don't want to break the expectations of the users


Oh man, I would have killed for this when I was in structured products trading at Goldman, the spreadsheets that we were using to keep tabs of positions were beautiful and insane in their complexity and in their raw, insane abuse of VLOOKUP/INDEX/MATCH.

I moved from Technology into Trading and all my code access was cut off, and the switch from coding in Slang/SecDb to only being able to use Excel was absolutely brutal. If someone can figure out how to break the Microsoft stranglehold on "business" then there will be a boom in productivity unseen since... Microsoft's suite of business software.


I wonder when we can start using this function without running the risk that our bosses/clients Excel doesn't support this function. Lots of people still use Office 2007 and 2010, so It'll be a while I guess.


I wonder if this function can be emulated on older versions of Excel.


I was thinking about that too. Although any form of delivering automatically executing code is just another attack vector. It's macro's all over again.


I almost cried when I saw the word return_array. Then I realized my mistake. Return_array is the name of an input parameter.

How I long for the day I can return multiple values from an excel lookup function. Few users know of such a thing. But it is amazing. So far I have gotten it to work using the most complex formula string I ever made.

Link: https://www.get-digital-help.com/2009/10/25/how-to-return-mu...


Excel has allowed array formulas for literally decades. It’s not as complicated as the link you posted makes it seem. You just need:

1. A formula that supports array returns (like MMULT) 2. Know how big the returned array dimensions are 3. Highlight that array size, type in the formula, and press ctrl+shift+enter

Returning multiple matches from a single criteria isn’t what VLOOKUP was designed for, so rather than complain that a hammer does a bad job at screwing in nails, search for a screwdriver. Excel has tons of data filtering and table features, and allows DB connections for SQL queries.


Never knew of mmult. Will try to look at that technique to simplify the formula.

Aside from that, another user posted excel campus example of filter function. That also returns an array. And pretty much does what I need. However, it is only for office 365 insider and not to be released anytime soon.


Fot what I know, Microsoft is working on an improvement of arrays in excel: https://www.excelcampus.com/functions/dynamic-array-formulas...


Thanks for the link. I see filter function as helping me do the same thing as the link I posted.


Time to start crying. You can return an array. Just provide a 2D array to the return_array parameter.


From what I understand about xlookup, it will only return back the first value and not multiple values.


Or do a lookup over multiple columns without having to do some ugly concatanations


I saw a preview of XLOOKUP at the MVP Summit in March and was excited about it (but couldn't say anything about it until it was released).

Here's my brief video on XLOOKUP along with a Peace Summit where a staunch INDEX/MATCH user and I formally buried the hatchet n Perth, Australia.

https://youtu.be/0KEhR66btUs


This is one of the features that I was wondering why it wasn't there from the beginning (according to article Excel 1 with VLOOKUP shipped 1985).


Killing most reasons for the index() + match() combo!


It depends which is faster. The video shown is not really the real world use case.


But not the 2 dimensional index(match, match) use case, unfortunately


It doesn't look like either xmatch or xlookup can lookup based on multiple search criteria? That's my main use case for index(match()).


So, is this on the roadmap for LibreOffice yet?


Now give me count distinct in pivot tables, and limit rows in a book only by the ram in my machine.


If you add your table to the data model you can do count distinct with a pivot table


TL;DR:

vlookup: (value to lookup, column to search, numbered column from the left, yes/no use range)

xlookup: (value to lookup, array to search, value to return array)

You can also replace hlookup with xlookup. Two optional arguments are 4) exact match or not and 5) go up or down or use binary search. I guess binary search would require a sorted array and improve speed?

xlookup seems obviously better in retrospect. I wonder if they had any reason not to implement it earlier.


Will this show up in Office 2019?


I should get this around 2026 when my company finally updates from office 2016.


Nope. New features appear only in Office 365 (the subscription version). If you have the one-time purchase version (Office 2019), you'll need to wait for and buy Office 2022.



Have you asked Joel Spolsky about this?


Curious why it wasn't named HLOOKUP since v=vertical and h=horizontal


XLOOKUP is a replacement for both VLOOKUP and HLOOKUP


No joke: I spent the better part of an hour a few weeks ago trying to remember how to use VLOOKUP properly to join two lists in Excel before a coworker chatted me that he had done it in 10 seconds by running the Linux `join` command.


Your co-worker couldn't tell you about doing a join in Excel's Power Query?




Join us for AI Startup School this June 16-17 in San Francisco!

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: