Ask HN: Literature for mathematical optimization?
95 points by samuel2 10 months ago | hide | past | favorite | 38 comments
Hi! Do you know any books on math optimization that are essential for anybody getting into this field? Is there any classical literature for optimization related to ML? Thanks!

My current list includes:

1. Numerical Optimization by Jorge Nocedal and Stephen J. Wright

2. Algorithms for Optimization - introduction to optimization with a focus on practical algorithms

3. Algorithms for Decision Making - a broad introduction to algorithms for decision making under uncertainty

[2] https://algorithmsbook.com/optimization/

[3] https://algorithmsbook.com/

Here's an excerpt from a comment I previously made on Hacker News:

I'm a Ph.D. student in operations research (OR). My suggestion would be to first build a strong foundation in linear programming. This will introduce you to the notion of duality, which is heavily emphasized in many mathematical programming courses. Here's a good open-source book on linear programming written by Jon Lee, the current editor of Mathematical Programming A: https://github.com/jon77lee/JLee_LinearOptimizationBook
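The duality mentioned above can be seen concretely on a tiny example. Here is a hedged sketch (my own illustration, not from Lee's book) using scipy's `linprog`: the optimal values of a small primal LP and its dual coincide, which is the strong duality theorem.

```python
# Strong duality on a toy LP, illustrated with scipy.optimize.linprog.
import numpy as np
from scipy.optimize import linprog

# Primal: maximize 3*x1 + 2*x2  s.t.  x1 + x2 <= 4,  x1 + 3*x2 <= 6,  x >= 0.
c = np.array([3.0, 2.0])
A = np.array([[1.0, 1.0], [1.0, 3.0]])
b = np.array([4.0, 6.0])

# linprog minimizes, so negate the objective for the primal max problem.
primal = linprog(-c, A_ub=A, b_ub=b, bounds=[(0, None)] * 2)

# Dual: minimize 4*y1 + 6*y2  s.t.  A^T y >= c,  y >= 0
# (rewritten as -A^T y <= -c for linprog's <= convention).
dual = linprog(b, A_ub=-A.T, b_ub=-c, bounds=[(0, None)] * 2)

print(-primal.fun, dual.fun)  # strong duality: both equal 12.0
```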

Then I'd suggest studying more general methods for continuous and convex optimization. The book I see mentioned a lot is Convex Optimization by Boyd and Vandenberghe, although we didn't use this in our coursework. Instead, we used a lot of the material presented here: http://mitmgmtfaculty.mit.edu/rfreund/educationalactivities/

If you read the above (or any other two books on linear programming and convex optimization), you'll probably have a better idea of what you want to study next and how you want to go about it. The next natural step would be to study combinatorial (i.e., integer or mixed-integer) optimization. (Jon Lee has another book on this subject; I've also heard good things about the Schrijver book.)

For linear programming I much preferred Bertsimas/Tsitsiklis (Introduction to Linear Optimization). Not free, but if you look at the first couple of Google results you may get a nice surprise.

In addition to Boyd and Vandenberghe, I like "Lectures on Modern Convex Optimization" by Ben-Tal and Nemirovski. Particularly the section comparing linear and conic optimization problems.

For theoretical continuous/nonlinear/convex optimization, your #1 is the bible, together with

"Convex Optimization" by Boyd & Vandenberghe.

However, beware that both are grad textbooks. They can be tough going at times. Unfortunately, I never found undergrad textbooks I liked much, for theory.

If you're interested in discrete optimization too (the other half of math optimization), the classics are:

"Optimization Over Integers" by Bertsimas & Weismantel

"Integer and Combinatorial Optimization" by Nemhauser & Wolsey

Not sure what you mean by the bible?

Sorry, I meant that those two books seem to be "classics" that are often used as references in graduate courses.

I'm confused, did you maybe not list one of the ones you were thinking of? In your original post you just list Convex Optimization and then two books on discrete optimization. I'd be very curious to know what the other one you're referring to alongside Convex Optimization is.

He said "your #1", referring to the first book in OP's list: "Numerical Optimization" by Jorge Nocedal and Stephen J. Wright.

It's pretty fun to read.

Thank you a lot!

You can grab a copy of "Convex Optimization" by Boyd & Vandenberghe over here: https://web.stanford.edu/~boyd/cvxbook/

This is more optimal control, but I really enjoyed reading through these notes: https://math.berkeley.edu/~evans/control.course.pdf

Two fascicles of Knuth's Art of Computer Programming, Volume 4B, on optimization are out. https://www-cs-faculty.stanford.edu/~knuth/news.html

I started from zero, and my approach was to read Nocedal/Wright cover-to-cover, and then the same with "Numerical Linear Algebra" by Trefethen/Bau. Usually it goes the other way, but I found the linear algebra primer in N/W to be good enough to get started.

I also read "Practical Optimization" by Gill, Murray, and Wright, which is interesting because it has a lot of conversational coverage of e.g. corner cases, stuff that most textbooks won't cover.

That will cover the expected baseline of almost everything you'll encounter in the convex smooth continuous domain. I don't have great answers for moving past that.


This is the canonical introduction book recommended by Gurobi. I've found it to be great for those getting into the subject. It has math, of course, but the focus is on going over the basics of LP, integer, nonlinear, mixed-integer programming, etc., followed by lots of examples. I think it's the best book to start with to get a feel for the subject of OR before diving into the harder books.

I used the Williams book in grad school and it provided context and motivation for MIP (mixed-integer) modeling, but for real-world MIP modeling, this guide from FICO is actually much better for practitioners.

This is the cheat sheet for MIP that I wished I had when I started out. It lays out all of the most useful MIP formulations in a simple and compact way.


Thanks for the tip!

Just throwing another tip out there: if you have to model anything mildly nonlinear but find yourself shackled by the linearity constraints of a Mixed Integer Program (MIP), you can use a statistical/ML technique called MARS [1] to regress a piecewise-linear model from your nonlinear data to approximate said nonlinear function. MARS models are essentially models with constant or max functions as basis functions, e.g.

y = a₀ + a₁ max(0, x − b₁) + a₂ max(0, x − b₂) + ...

max(0,x) functions are very easy to model in MIPs.

[1] https://en.wikipedia.org/wiki/Multivariate_adaptive_regressi...
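A minimal sketch of the idea in the comment above: approximate a nonlinear function with the hinge basis y ≈ a₀ + Σₖ aₖ max(0, x − bₖ) via least squares. Real MARS selects the knots bₖ adaptively; here they are fixed by hand for illustration, and the resulting piecewise-linear model is the MIP-friendly object.

```python
# Fit a hinge-basis (MARS-style) piecewise-linear approximation of sqrt(x).
import numpy as np

x = np.linspace(0.0, 10.0, 200)
y = np.sqrt(x)  # the nonlinear function to approximate

knots = [0.0, 2.0, 5.0, 8.0]  # fixed knots; real MARS would choose these
# Design matrix: [1, max(0, x - b1), max(0, x - b2), ...]
X = np.column_stack([np.ones_like(x)] + [np.maximum(0.0, x - b) for b in knots])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

y_hat = X @ coef  # piecewise-linear, so easy to embed in a MIP
max_err = np.max(np.abs(y_hat - y))
print(max_err)
```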

I mostly just do LP & MIP (MILP), but will write this down in case I find a nonlinear need. Thank you!

I recommend Bayesian Methods for hackers [1]. It doesn’t go too deep on theory but I feel like it’s well written and has really good coded up examples of theory being applied to problems. It has a relatively narrow scope, but I find myself reaching for methods I learned from the author frequently.

[1] https://github.com/CamDavidsonPilon/Probabilistic-Programmin...

Nocedal and Wright is an excellent overview and a good starting point. Also consider:

- R. Schneider, Convex bodies: the Brunn-Minkowski theory. The first two chapters are an excellent introduction to convex geometry (plus a little bit extra!) if you have some undergrad-level analysis.

- Hiriart-Urruty and Lemarechal, Fundamentals of convex analysis. This book has been highly recommended to me but I've never used it. Might be an easier go than Schneider for convex geometry.

- Golub and van Loan, Matrix computations. Excellent book on numerical linear algebra.

- Bonnans, Gilbert, Lemarechal, Sagastizabal, Numerical optimization: theoretical and practical aspects. This book has a detailed description of bundle methods, which are important and in my opinion underutilised.

- I. Maros, Computational techniques of the simplex method. This is the only book I'm aware of that discusses how to build a working implementation of the simplex method for linear optimisation.

I'm not aware of any books that cover line search algorithms in detail. These are important in implementations but, beyond discussing the Goldstein and Wolfe conditions, generally glossed over in prose. Even in the absence of stalling and numerical difficulties, you can see an order of magnitude speedup from replacing a bad line search with a good one. One line search algorithm I've had success with is described in More and Thuente, Line search algorithms with guaranteed sufficient decrease.
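For contrast with the More-Thuente algorithm mentioned above, here is the textbook starting point: a backtracking line search enforcing only the Armijo (sufficient decrease) condition. This is my own minimal sketch, not More and Thuente's method, which additionally enforces a curvature condition and brackets the step far more carefully.

```python
# Backtracking line search with the Armijo sufficient-decrease condition.
import numpy as np

def backtracking_line_search(f, grad, x, p, alpha0=1.0, c1=1e-4, rho=0.5, max_iter=50):
    """Shrink the step until f(x + alpha*p) <= f(x) + c1*alpha*grad(x)@p."""
    fx = f(x)
    slope = grad(x) @ p  # directional derivative; negative for a descent direction
    alpha = alpha0
    for _ in range(max_iter):
        if f(x + alpha * p) <= fx + c1 * alpha * slope:
            return alpha
        alpha *= rho  # step too long: backtrack
    return alpha

# Usage: one steepest-descent step on f(x) = x1^2 + 10*x2^2.
f = lambda x: x[0] ** 2 + 10 * x[1] ** 2
grad = lambda x: np.array([2 * x[0], 20 * x[1]])
x = np.array([1.0, 1.0])
p = -grad(x)
alpha = backtracking_line_search(f, grad, x, p)
print(alpha, f(x + alpha * p))  # step accepted with f reduced below f(x) = 11
```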

Lots of tacit engineering knowledge goes into building a fast and robust optimisation code. Some of that knowledge gets forgotten when code is rewritten or ported from one language to another.

Mercifully, a lot of that engineering knowledge has been encoded into freely-available optimisation code. Quite a bit of that code is pretty readable. Off the top of my head, I've learnt things from:

- Liu and Nocedal's Fortran L-BFGS implementation,

- The CUTEst problem collection,

- Chih-Jen Lin's LIBLINEAR and LIBSVM,

- Lin and More's TRON,

- Csaba Meszaros's BPMPD,

- Jacek Gondzio's HOPDM,

- The GNU Linear Programming Kit,

- and probably quite a few other sources!

thanks for your insights!

Without a doubt you should read the following.

It is the clearest, most in-depth introduction, from "zero to hero", for optimization. It's mostly from a math perspective, and useful for many things outside of ML too!


Nocedal and Wright is good. +1 also to the suggestions for Boyd and Vandenberghe. I really like Boyd's writing in general; he has coauthored some good review articles on proximal algorithms and ADMM.

A couple of other suggestions:

Nesterov's Introductory Lectures on Convex Optimization. This one is pretty tough sledding, but I found the perspectives in the first chapter particularly to be enlightening. It seems like there's a newer Springer book which is probably an expansion on this.

Bertsekas's Nonlinear Programming. Bertsekas has written a lot of books, and there's a fair amount of overlap among them. This one seemed to be the one with the most nuts and bolts about the basics of optimization.

EDIT: If you want more understanding of convexity beyond what's presented in these books, Rockafellar's Convex Analysis is helpful.

Convex Analysis by Rockafellar is pretty hard for a beginner; it's a research monograph. I would recommend "Fundamentals of Convex Analysis" by Hiriart-Urruty & Lemaréchal instead.

I recently bought "Introduction to Nonlinear Optimization" by Amir Beck. It's published by SIAM. It covers the field from beginning to intermediate level and includes examples written in MATLAB. I found the book very readable and illuminating. It's also not overly long or verbose, which is a plus in my mind. I would recommend it -- I used some of its ideas as fodder to teach the optimization section of my class in Numerical Analysis.


Sébastien Bubeck's book is an excellent mathematical introduction to modern convex optimization: https://arxiv.org/abs/1405.4980

So for the moderately technical to non-technical people reading this: I took a course in grad school that showed me how to do this in Excel with Solver.

It was easily one of the top 3 courses I took, and it was heavily based on this textbook:


There are many programming libraries that would be better than Excel; e.g., scipy implements quite a few algorithms very accessibly. But that's not what OP is looking for.
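To illustrate "accessibly": scipy's `minimize` solves a classic unconstrained problem in a few lines. This uses scipy's own built-in Rosenbrock test function (`rosen`) and its gradient (`rosen_der`); the starting point is the conventional one, not anything from this thread.

```python
# Minimize the Rosenbrock function with BFGS via scipy.optimize.minimize.
import numpy as np
from scipy.optimize import minimize, rosen, rosen_der

result = minimize(rosen, x0=np.array([-1.2, 1.0]), jac=rosen_der, method="BFGS")
print(result.x)  # converges to the known minimizer [1.0, 1.0]
```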

The book covered the theory well and introduced it in a way that was easy for less technical people to use. Excel was used and helped to visualize problems with two dimensions.

I would recommend that text for anyone interested in getting into optimization. Excel quickly hits limits, but the concepts translate to other tools.

Sounds like you're looking more for optimization theory, but if you want a gentle introduction to applications with approachable math and lots of examples, I highly recommend "Operations Research: Applications and Algorithms (4E)" by Wayne Winston. It's a solid undergrad level text covering basic linear optimization, mixed integer linear programs, and non-linear optimization.

Mathematical optimization is a huge field that splits into different branches. In my opinion, there are two demographics that approach optimization in slightly different ways and are interested in slightly different aspects of it: the OR (operations research) camp and the engineering camp.

(all generalizations are wrong to some extent and the delineations are not strict, but I have noticed they have different cultures -- analogous to how Breiman observed that there were two cultures of statistical modeling [1])

In the OR camp, the focus is primarily on linear programming/mixed integer linear programming. The types of problems solved include transportation, assignment, scheduling type problems. OR folks tend to go very deep into the theory of linear programs (matroids, Benders decompositions, etc.) and the literature is absolutely rich with advances in linear optimization. OR folks tend to be linear optimization specialists and mathematicians. A good practice-oriented intro book is Winston's "OR: Applications and Algorithms". Chvatal's Linear Programming is also good. But this is not my space, so I'll leave OR book recommendations to others.

In the engineering camp, while folks do use linear programs, they also tend to go more in the direction of convex programs and general nonlinear/nonconvex programs (including mixed-integer nonconvex nonlinear programs). There are some strong theoretical results for convex programs, but the results (naturally) aren't as strong for nonconvex problems -- global optimality is never guaranteed. In the nonconvex camp, practitioners tend to concern themselves with heuristics/techniques like finding good initializations, understanding second-order sufficient conditions (SSOCs), etc. The types of problems solved range from ML problems to dynamic plant/machine optimization. Most folks in this camp tend to be engineers rather than optimization specialists (though some do become specialists eventually). Nocedal and Wright is a classic for general nonlinear programming, but also look into Biegler's Nonlinear Programming. Boyd and Vandenberghe is a classic for convex optimization. Gill, Murray, and Wright's Practical Optimization is a bit outdated (so don't rely on it for state-of-the-art knowledge), but it has tidbits of insight about optimization that aren't found in other books and that continue to be useful.

[1] http://www2.math.uu.se/~thulin/mm/breiman.pdf


I'd also add C. T. Kelley's "Iterative Methods for Optimization" for more non convex theory. Nemirovski also has a variety of books and course notes that are available, but I haven't spent as much time with them.

I agree with thxg, there are few undergraduate textbooks that I've liked.

I'm planning on taking a course in optimization this fall, the textbook for it is "An introduction to optimization" by Chong and Zak. Dunno if it's any good, but I just ordered it so we'll see.

"The Design of Approximation Algorithms" by Williamson and Shmoys


cool, thanks

Mathematical optimization still has many subfields that can be of interest. I guess that nonlinear mathematical optimization is most typical for machine learning applications. Many practical applications in scheduling, logistics, planning, etc. use linear (integer) programming and combinatorial optimization. The following are some pointers toward that body of literature.

Alexander Schrijver [1] has lecture notes on Combinatorial Optimization on his website [2]. He also has an affordable 1800 page three volume set of books "Combinatorial Optimization - Polyhedra and Efficiency" [3], although I would say it is better suited as reference material because it is quite densely written.

There is also the classic book "Combinatorial Optimization - Algorithms and Complexity" [4] by Papadimitriou (Bill Gates' MSc thesis supervisor) and Steiglitz that is a nice introduction to the topic as well.

"In Pursuit of the Traveling Salesman - Mathematics at the Limits of Computation" by William J. Cook [5] is a more popular-science book on the history of the Traveling Salesman Problem that also explains how linear programming is used in state-of-the-art solvers, though it of course focuses on a very specific problem. There is also a book that contains all the scientific and mathematical details, by Applegate, Bixby, Chvátal and Cook [6], if that is preferred.

In recent years, there has been a trend of mathematical optimization researchers working more with ML. In particular, Dimitris Bertsimas has done some work at the intersection of those areas in recent years [7] and apparently has a book on the topic as well [8] (but I am not familiar with it).

[1] https://homepages.cwi.nl/~lex/
[2] https://homepages.cwi.nl/~lex/files/dict.pdf
[3] https://www.springer.com/us/book/9783540443896
[4] https://www.amazon.com/Combinatorial-Optimization-Algorithms...
[5] https://press.princeton.edu/books/paperback/9780691163529/in...
[6] http://www.math.uwaterloo.ca/tsp/book/index.html
[7] https://dbertsim.mit.edu/papers/
[8] https://www.dynamic-ideas.com/books/machine-learning-under-a...

Papadimitriou was not Gates's MSc thesis supervisor. Gates never did an MSc (nor indeed completed his BSc -- he dropped out of college to found some computer company or other).

What did happen is that Papadimitriou and Gates were co-authors of a paper about pancake sorting. (https://en.wikipedia.org/wiki/Pancake_sorting)
