Or did I overlook it? Is it in there?
First, the purpose of a class like this is to prepare students for further coursework in algorithms and automata/complexity. You need logic, graphs and combinatorics for those a lot more than you need linear algebra.
Second, the math department teaches a class in linear algebra. This course, on the other hand, is a collection of topics that you wouldn't ordinarily find in undergraduate math classes.
Third, there's a limit on how much linear algebra you can cover in part of a course like this. Better to give it its own semester so you can treat the subject in some reasonable breadth.
>Second, the math department teaches a class in linear algebra. This course, on the other hand, is a collection of topics that you wouldn't ordinarily find in undergraduate math classes.
So the math department doesn't have classes on graph theory, combinatorics, or prob & stats?
Discrete mathematics is a grab bag of topics, and it's more fair to say that a discrete math course pulls in selected topics from other areas than to say it provides a focused coverage of any of them. At the depth each topic is taught, you couldn't pull it out into its own semester's worth of lectures. On the other hand, it would be very difficult to compact linear algebra into a discrete math course. Learning linear algebra already requires you to introduce a bunch of new concepts, like fields and vector spaces, linear equations, linear transformations, matrix representation and matrix operations, determinants, inner products...
Equally importantly, linear algebra doesn't really "fit" with the topics of discrete mathematics, pedagogically speaking. It fits well with a treatment of multivariable calculus, but that's only a small part of what's taught in discrete math courses. If you wanted to teach linear algebra in a course covering a bunch of topics, you'd probably want to do a "topics in abstract algebra" course that works through groups, rings, modules, linear algebra and multilinear algebra. But then you're going well beyond what a non-math major should encounter in an undergraduate setting.
Really, linear algebra stands on its own very well. In my opinion it should receive focused coverage and not be mixed with anything else. While graph theory and combinatorics can easily receive the same coverage, their treatment is usually pushed back to advanced undergraduate or graduate level courses because engineers don't usually work with them beyond what's covered in discrete math (which is exactly why we have discrete math as a standalone course). In fact, MIT actually lists linear algebra and multivariable calculus as prerequisites for the focused courses on combinatorics. The same applies for probability theory: you can't really learn probability theory (beyond discrete) without first learning analysis.
These were typically our freshman/sophomore-year CS classes (although I took a couple in my junior year), so I guess every school defines its requirements differently.
Much of this course is covered in undergraduate math classes, except at much greater depth (especially algebra and probability theory), and over the span of several courses.
For linear algebra, see https://ocw.mit.edu/courses/mathematics/18-06-linear-algebra...
I'll admit I don't know how the MIT CS curriculum is structured, but I suppose it is in there, just not in this particular course.
Source: https://www.eecs.mit.edu/curriculum2016 and I'm a 6-3
edit: originally had posted, "This is a discrete math textbook."
One might argue that linear algebra is slightly more niche than stuff like graph theoretic algorithms. There are applications to broadly useful things like gradient descent, but that seems to fall under numerical linear algebra which probably deserves its own course in the Applied Math department.
N.b. just playing devil's advocate, I don't have a strong opinion either way.
There are also the CMU intro to proofs/reasoning lecture notes here, https://infinitedescent.xyz/, though you would learn much of this material from the introduction of Apostol's Calculus. The advantage of the CMU lecture notes is that the author based them on his findings in evidence-based learning, IIRC.
Compared to most math texts, this one is very accessible and mind-expanding. Even if you don't get everything on the first pass, you'll come out with a lot of interesting insight into the field.
I don't know your level, but I don't recall anything too insurmountable for someone with a decent base in math.
For the context of my opinion: my favorite math textbook for computer science is Chapter 1, Section 1.2 of The Art of Computer Programming ("Mathematical Preliminaries"). That's pretty dense though, so I usually recommend Concrete Mathematics, which was also (co)authored by Knuth. The benefit of Concrete Mathematics is that it spreads the information out with more examples and better exposition. It also includes an expanded section on probability. But in my opinion, the failing (if you could call it that) of both is that there isn't a dedicated section on set theory. I really think that should be covered alongside the introduction to various types of proof.
Which brings me back to this textbook...I personally think it spends too much time building logic and proof machinery in the first chapter. Sets and functions have sections that are only pages long, but there is a large focus on logic. I understand it's a mathematics for computer science treatment and not a pure mathematics treatment, but that's actually pretty dry in my opinion - I think a treatment that focuses more on sets is more intuitive. One of the things I like about Knuth is that he covers these topics in a slightly irreverent and humorous way. He also engages the reader by jumping more directly into arithmetic operations, functions, basic number theory, the binomial theorem and combinatorics, etc.
Some of this is admittedly my own bias. I've never enjoyed staring at things like e.g. truth tables. It's important to know, but this actually seems excessively formal for a discrete mathematics course. Reading through the algebra of propositions isn't exactly riveting. I think that, for self-study at least, exposition and organization of topics are just as important as complete coverage of topics (probably more so). All of the material you'd want to know is there, it's just that the treatment might be a bit suboptimal for self-study. It probably depends on what you find interesting and how quickly you'd like to do interesting things with what you're learning. It would probably also be beneficial if some people who have actually learned from this textbook as their first encounter with the material could chime in, because that might be more relevant to your question.
It's heavy on sets and functions, turning sets into types to manipulate in Standard ML so the reader can understand what's going on intuitively. I also liked Knuth's crash course in The Art of Computer Programming, and it is hard to read other books when you're used to Knuth's clear and precise explanations mixed with dry humor keeping you engaged.
Concrete Mathematics requires MUCH more background than this book. Besides, the real gems and clever tricks of Mathematics for Computer Science appear after the fourth chapter.
It offers so many 'hacks' for working with numbers, which I imagine would make code more efficient and avoid unnecessary computations when dealing with very large numbers.
Separately, my increased mental efficiency with problems involving numbers is really surprising. There have been interview questions I've read about in the past where I thought, "someone must be a genius to be able to answer that through mental math," and it all makes so much more sense now! It's taught me dozens of visualizations that can break these problems down in much more manageable ways.
As an example of a mental maths problem on my midterm: How many numbers between 1 and 99,999 have digits that add up to 10? (23,500 would work whereas 23,510 would not). While initially I would have no clue how to answer this, learning about the balls-in-bins method solves it in an incredibly simple manner.
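If you want to convince yourself of that balls-in-bins (stars and bars) answer, here's a quick Python sanity check I put together; the brute-force count and the formula are my own check, not something from the course:

    # Digits d1..d5 with d1+...+d5 = 10, each between 0 and 9.
    # Stars and bars gives C(10 + 5 - 1, 4) arrangements, minus the 5
    # cases where a single "bin" would hold all 10 (a digit of 10).
    from math import comb

    brute = sum(1 for n in range(1, 100_000)
                if sum(int(d) for d in str(n)) == 10)
    formula = comb(14, 4) - 5

    print(brute, formula)  # both give the same count

The point is that numbers from 1 to 99,999 are exactly the five-digit strings with leading zeros allowed (0 itself has digit sum 0, so excluding it changes nothing), and the only stars-and-bars arrangements to throw away are the five where a single digit would have to be 10.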
"The analogy between good proofs and good programs extends beyond structure.
The same rigorous thinking needed for proofs is essential in the design of criti-
cal computer systems. When algorithms and protocols only “mostly work” due
to reliance on hand-waving arguments, the results can range from problematic to
catastrophic. An early example was the
Therac 25, a machine that provided radia-
tion therapy to cancer victims, but occasionally killed them with massive overdoses
due to a software race condition. A more recent (August 2004) example involved a
single faulty command to a computer system used by United and American Airlines
that grounded the entire fleet of both companies—and all their passengers!
It is a certainty that we’ll all one day be at the mercy of critical computer systems
designed by you and your classmates. So we really hope that you’ll develop the
ability to formulate rock-solid logical arguments that a system actually does what
you think it does!"
My hope is to gather enough feedback on the good and bad points of cheap or free resources on this topic that I can determine some default recommendations for curious people. Any problems with those resources can be addressed with supplementary material. Eventually, the defaults would be good enough that the process would happen faster. A larger number of people centering on the same resources might become a community for helping new people. That might increase the inflow. Simultaneously, some percentage of these people are working on tools, especially automated ones, to make the proof process more productive. Over time, we might have a much larger segment of people working on formal verification than we do now. Maybe even large enough to sustain more tool development by contributions or licensing.
So, that's why I'm looking at books like this with what I think they might help us achieve. I'm way too out of my depth to know if it will work. I just know accessibility, better learning materials, and network effects always help.
Example tools below for sequential programs, concurrency, and hardware:
Something I find very useful as a developer is to see how tools work on concrete examples. For formal verification, this is well exemplified by the document "ACSL by Example" about Frama-C:
This has inspired other researchers to do the same for SPARK, but currently they have done only the code/proofs, not yet the doc around it:
As I like that approach a lot, that's something I have adopted for the static analysis tools we develop around Ada and SPARK:
This shares with the O'Reilly Cookbook series (https://ssearch.oreilly.com/?q=cookbook) the goal of getting very concrete immediately, on code that the reader might have to write or may have already written.
The stories section of that link has examples including recent ones on floating point and interior point. The interior point paper you might find interesting since they have a specific specification and coding style to achieve full automation.
You might also find Lea Wittie's work on Clay interesting. I was looking at her dissertation in case one could use SPARK in a similar way to model I/O or concurrency.
re your concrete examples
I like that you're all doing that. That does help a lot. I plan to do my part to some degree in the near-ish future: I bought the High-Integrity Applications book for SPARK and Stavely's Toward Zero-Defect Programming for Cleanroom.
The SPARK book looks well-written but kind of packed with info, especially on the language. It might take some time for me to learn. The Cleanroom book is highly approachable, with the method being mostly algebraic, focused on human understanding, with symbolic-style verification of some control-flow mechanisms. That was from a skim, though, with the full method maybe being harder. My plan is to get back into the feel of doing this stuff by trying Cleanroom first, followed by using what the SPARK book teaches me on all the same stuff. Both during and after learning SPARK, I'll use it on little functions like those in the ACSL by Example book. Note that this assumes some other research doesn't take higher priority for me, like my Brute-Force Assurance concept that mixes languages with their verification tooling.
I took the first steps of trying to get the SPARK GPL environment downloaded and installed. That wasn't working for some reason. It's harder than necessary to install: it should be a well-packaged installer, like most things on major Linux distros (especially Ubuntu), if they want maximum adoption/experimentation. Since the newest stuff from the download link failed, I pulled older stuff through Synaptic, which at least got GNAT working on doublec's Hello World example below. SPARK 2014 refused to run on it, though.
So, I gotta get SPARK working on my machine before trying to learn it. Sucks that it's been this difficult. I'll try again later this week.
Re learning SPARK, we'll have a brand new learning website during the year for Ada and SPARK, to replace the e-learning website u.adacore.com. This should make it far easier to learn Ada/SPARK, hopefully with online tweakable examples as on https://cloudchecker.r53.adacore.com/
With the SPARK Discovery GPL 2017 edition (from https://www.adacore.com/download), note that you'll get only Alt-Ergo installed by default. You need to follow these other instructions to add CVC4/Z3: http://docs.adacore.com/spark2014-docs/html/ug/en/appendix/a...
If you have any problems, let's discuss on firstname.lastname@example.org, or report it on email@example.com
I'm curious about your Brute-Force Assurance concept, can you say more?
That makes sense. The reason I kept it was that I saw some conference... talking about doing ERA as what looks like a replacement for floats. If it gets built, then the code for reals will either be verified directly, or an equivalence checker between a real and a float implementation will be the next move. So, my thinking is that any papers verifying reals might have work that comes in handy later.
"we'll have a brand new learning website during the year for Ada and SPARK, to replace the e-learning website u.adacore.com. This should make it far easier to learn Ada/SPARK"
Thank you! That and the list will probably be very helpful.
"I'm curious about your Brute-Force Assurance concept, can you say more?"
I was describing it a lot, but I noticed a lot of verification tech is getting patented in the U.S. for questionable reasons. I'm a bit concerned about the possibility that a party in the U.S. will patent it, lock it up, or troll implementers if I describe it in detail in a way that doesn't count as prior art. From what I've been told, putting it in a peer-reviewed journal would establish prior art to block that. Decent shot, anyway. So, I think I'm going to try to do a detailed write-up first to make sure as many people as possible can adopt it. I'm also thinking about an appendix consisting of a brainstorming session on every way I can think of to apply it, to block some mix-and-match type of bogus patents.
I can tell you the concept involves translating a program into an equivalent representation in multiple languages to run their associated tooling on it. The results are assessed manually or automatically to see if they apply to original form. Original is fixed repeatedly until the tools each say Ok. Each tool is picked for being good at something others aren't. Rust for temporal safety, SPARK for automated proving, a tool-specific language for termination checking, (more properties/tools here), and finally C for its tooling and final release. I've seen people attempt to integrate different logics or use multiple tools. I've never seen people do that cross-language using everything from test generators to static analysis to proofs. It seems my hackish idea is pretty original.
Note: the original inspiration was twofold. C had the best speed along with the most tools, while most concurrency-analysis tools were for Java. I thought I could get safe concurrency in C for free by working within a C-based mockup of the Java model that Java's tools analyzed. Additionally, a lot of system code was done in C++, which is seemingly impossible to analyze as thoroughly as C. My concept was a C++-to-C compiler: apply all static analysis available for C, figure out which problems apply to the C++ code, fix them, and repeat. Thinking on these for more languages and problems, as I watched high-assurance certifications take years, led to my hackish idea that's intended to replace rare experts with a pile of cheap, well-trained amateurs letting tools do most of the work.
re verification and WhyML
Whatever work I do, whether proprietary or FOSS, will be targeted in tech and price at maximizing uptake more than profit. It might be dual-licensed, with the intent that SPARK, Frama-C, and Rust folks benefit whether building commercial stuff on it or FOSS. I noticed most build on Why3, which has its own programming language (WhyML). I've also seen researchers working directly in WhyML to prove properties. It seems my tool should target it so both SPARK and C can use the results. The correctness of the assembly is still a problem, with existing verified compilers being for CompCert C, C0, Simpl/C, and SML.
My next cheat of an idea is compiling WhyML programs directly to assembly. So, I ask:
(a) Do you think that's possible to begin with? I've only glanced at WhyML. Since I don't use it, I don't know if it's enough of a programming language that it could be efficiently compiled or too abstract for that.
(b) If that's possible, then have you seen any research already trying to build a WhyML-to-assembly compiler or equivalence checker?
I also always mentally retarget every assurance method I know onto a new stack, since high-assurance security demands we use every approach. So, when looking at this, I'm also thinking of contract-based test generation for properties expressed in Why3 on WhyML code, fuzzing the WhyML-to-x86 result with the Why3 properties in as runtime checks, retargeting a static analyzer to work on it directly, and so on. If it can be verified and executed, then maybe someone can port lightweight verification tools straight to Why3/WhyML to knock out issues before proofs, or just to more easily check code that would otherwise require manual proof.
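To make the "properties in as runtime checks" idea concrete, here's a rough sketch in Python, with the hypothesis library standing in for a contract-based test generator; the routine and its postcondition are made up for illustration and aren't taken from Why3 itself:

    from hypothesis import given, strategies as st

    def integer_sqrt(n: int) -> int:
        # Hypothetical routine under test.
        r = 0
        while (r + 1) * (r + 1) <= n:
            r += 1
        return r

    # The contract (postcondition) expressed as an executable check,
    # fuzzed over generated inputs.
    @given(st.integers(min_value=0, max_value=10_000))
    def check_contract(n):
        r = integer_sqrt(n)
        assert r * r <= n < (r + 1) * (r + 1)

    check_contract()  # hypothesis runs many generated cases

The same shape works whether the checks come from hand-written assertions or from contracts compiled into runtime checks by the toolchain.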
Regarding your Brute-Force Assurance Concept, it connects to the concept we're pushing at AdaCore of System to Software Integrity, in which we're looking at ways to preserve assurance from higher level descriptions at system level down to code. I must admit we're still in the early stages, exploring with customers how to go from SysML or Simulink to code in SPARK while preserving properties.
Your concept of using the best available tools is certainly attractive, if indeed they can be made to collaborate, and if the translations can be made trustworthy. That's something we're quite familiar with, as this is the bread-and-butter for tool building really. :-) One of the challenges in connecting languages and tools is that all the pieces are constantly moving. This is also what makes C so attractive as an intermediate language: it does not move.
WhyML has some advantages as an intermediate language. But note that the way we use it in our SPARK toolchain would not allow compiling through WhyML. Our translation only preserves the axiomatic semantics of the program, not its operational semantics. So we try to get equi-provability, but not an executable WhyML code.
If you start with WhyML though, you can either extract it to OCaml (see http://why3.lri.fr/doc-0.83/manual009.html) or to C (see https://hal.inria.fr/hal-01519732v2/document). One of the authors of this last paper, Raphael Rieu-Helft, is currently doing his PhD on getting this extraction to C formally proved, so that it can be used for efficient security components in WhyML. So maybe what you're looking at could be done with formally verified translation from WhyML to C and then formally verified compilation to assembly. We've also looked at having a SPARK frontend to CompCert in the past, but so far we've seen no commercial interest in that approach.
Thanks for your thoughts on that topic!
That is the trick. I think, at the least, one can, aside from using subsets close to each other, use equivalence tests that run through all paths and key values of ranges. In high-assurance systems, one should already be using such tools, given they'll catch lots of errors. So, the same tests can be re-run on the new variant in the new language to see if it gives the same output. This might produce a reasonable, if not perfect, level of confidence. If going for semantic checks, I'm eyeballing the K Framework that was used in the C Semantics by Runtime Verification Inc. Connecting that semantics to C++, SPARK, or Rust has its own advantages. Alternatively, I could work in Abstract State Machines annotated with the properties of execution, which would let me just do equivalence checks on the ASM's. There's a compiler that does that.
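A tiny illustration of re-running the same key-value tests on two variants (in Python, with two made-up versions of a saturating add standing in for code in the original and translated languages):

    import itertools

    def sat_add_original(a, b, lo=-128, hi=127):
        return max(lo, min(hi, a + b))

    def sat_add_translated(a, b, lo=-128, hi=127):
        # Stand-in for the same routine re-expressed in another language.
        s = a + b
        return hi if s > hi else lo if s < lo else s

    # Key values of the range: bounds, neighbors of bounds, zero.
    keys = [-128, -127, -1, 0, 1, 126, 127]
    for a, b in itertools.product(keys, repeat=2):
        assert sat_add_original(a, b) == sat_add_translated(a, b), (a, b)
    print("variants agree on all key-value pairs")

The real versions would feed both variants from the same generated test suite rather than a hand-picked list, but the check itself is just this kind of output comparison.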
"This is also what makes C so attractive as an intermediate language: it does not move."
That's one reason to value it. I recommend C compatibility by default these days to take advantage of the ecosystem of tooling, especially compilers and verification, without performance penalties or abstraction-gap failures. Although I used to recommend alternatives, the bandwagon effect showed itself to be unbeatably powerful. Some new languages getting rapid adoption are doing so partly due to seamless integration with C or otherwise legacy code. So, if I were designing a SPARK-like language today, it would use C's data types and calling conventions, with a compile-to-C option available from day 1. I'd possibly even build it on top of C itself with metaprogramming.
I don't like having to do such things. The momentum behind C deployment from embedded to PC to servers is just too huge. The shooting BB at freight train metaphor comes to mind. ;)
"But note that the way we use it in our SPARK toolchain would not allow compiling through WhyML."
Appreciate the tip. That means an executable version of SPARK in Why might even make a nice side project for some student. Thanks for the links. The Why-to-C work I had submitted to Lobste.rs previously. It was great to see someone tackle GMP. I didn't know he was formally proving Why-to-C, though. If he does, your WhyML-to-C-to-assembly idea is a neat possibility. Here's one in return, in case you haven't seen it, that extracts Coq to C with a several-fold reduction in TCB. It might allow proven C to be integrated into SPARK work, or be modifiable to extract to SPARK for SPARK+Coq work.
"We've also looked at having a SPARK frontend to CompCert in the past, but so far we've seen no commercial interest in that approach."
I have a love/hate perception of CompCert: love the work, hate that it went proprietary. I imagine AbsInt charges a fortune for commercial licenses. If so, that could explain the lack of commercial interest, given customers would be paying two companies. Have you all thought about using one of the other toolchains that's FOSS to do that stuff? Aside from CompCert, there are these: Flint's C0 used in Verisoft; Simpl with Simpl/C used for seL4; KCC's K Framework + C Semantics; CakeML. The first three could be modified for SPARK using any SPARK-to-C mapping and/or tools you already have, with some portion of the work, especially on the low-level stuff, already done for you. Those are the most practical. I'm sure you raised an eyebrow at CakeML. My idea, which a proof engineer independently came up with, was using the imperative, low-level IR's as a target for stuff verified Why3-style or for certified compilers. He was using StackLang, with me looking at WordLang. Either way, the starting point is an imperative language with formal semantics verified down to machine code for x86, PPC, and ARM. Again, this is only sensible if C-based or more SPARK-like toolchains look impractical but these IR's somehow don't.
But that's not the same as having a full certified (in the sense of Coq) compiler. If we were to use a CompCert-like approach, we'd have to redo it in Coq. Not a small feat.
Regarding approaches that build safety/security by developing a language on top of C, that has been tried so many times that I don't think it can work. The use of the same syntax is deceptive, as you end up doing a lot of redesign/rewrite (as anyone who has done program proof would expect), so none of these stuck. Even those trying to enforce properties by design and runtime checking instead of proof (CCured, Cyclone) didn't. Better to stick with C then and use an appropriate toolset like Frama-C.
Start from page 1 and iterate from there.
So to learn those 1000 pages is to, to some extent, learn a core set of techniques across many different contexts. It compresses the required mental load. For the exceptions to that rule, well, you can skim over them and know enough about them to recognize their applications later.
I disagree. Or rather, I think that's unsustainable. Any given three consecutive pages of Spivak's Calculus are probably doable on a daily basis. But it would be legitimately hard for most people to go through three pages of Rudin's Principles of Mathematical Analysis each day and consistently retain that information. Axler's Linear Algebra Done Right is very readable, but Halmos' Finite-Dimensional Vector Spaces will start getting just as dense as Rudin. These are difficult textbooks even when students are well prepared for them by prerequisite courses. Terence Tao wrote two books to cover (with better exposition) what Rudin did in one. I think it would be pretty hard to consistently read three pages of Tao's Analysis I each day, before he even gets to limits.
I think you're underestimating the intellectual effort here. In my opinion, even if you're reading a math book targeted to your level, committing to reading and understanding three pages of material each day would become exhausting. A typical semester is 15-16 weeks, with lectures 1-3 times a week, and most undergraduate courses do not actually work through the entirety of a 300-page textbook. Even at that slower pace, it's not typical for most people to ace the course. If you read three pages a day and had a solid understanding of them, you'd be absolutely breezing through math courses.
In my experience students need to really step away from the material and let it percolate for a bit every so often in order to solidify their understanding. I really don't think you can partition the material into equal, bite-sized amounts each day. The learning progression doesn't tend to be that consistent or predictable.
So we shall now explain how to read the book. The right way is to put it on your desk in the day, below your pillow in the night, devoting yourself to the reading, and solving the exercises till you know it by heart.
Then when someone mentions a term from the course, you know roughly what to expect and where to find details. Also, you know how big a bite you're taking. "Quantum Mechanics" might be several books. "Vector Calculus" maybe a few chapters. "De Morgan's laws" maybe a few pages.
It's just like any other math you do. Does anyone memorize all the trig relations, Laplace transforms, and geometric relations? Of course not. But you've seen at least one large example of every topic and you can reconstruct it from there.
It could be an entire book, though :)