The times I have used a CAS to help with algebraic simplification, I've run into the same problem, whether using SymPy or the more feature-rich Mathematica: the package's rules for simplification often obscure useful forms that lead to insight. The CAS gives you the correct result, of course, but I end up having to apply significant post-processing by forcing substitutions, which can take almost as much time as doing the algebra by hand.
As a simple example, consider the transfer function of an RLC circuit from basic electronics. It might contain an expression of the form s^2 + (R/L) * s + 1/(LC). What's often helpful is to express this in the unitless form 1 + s/(Q * w0) + (s/w0)^2, where w0 is the resonance frequency and Q is the quality factor. In a complicated example with more components, the value of w0 might be related to the simple form 1/sqrt(LC) by some unitless factor, or there's some other helpful way to write things that neatly relates it to the basic case, but SymPy/Mathematica will drag its feet showing you that. I wish more were done on improving the insight and meaningfulness of the results a CAS gives the experimenter.
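For what it's worth, you can force the normalized form in SymPy by hand. A minimal sketch (the symbols w0 and Q, and the substitutions L = Q*R/w0 and C = 1/(w0*Q*R), are my own choices, implied by w0 = 1/sqrt(LC) and Q = w0*L/R):

```python
import sympy as sp

s, R, L, C = sp.symbols('s R L C', positive=True)
w0, Q = sp.symbols('w0 Q', positive=True)

expr = s**2 + (R/L)*s + 1/(L*C)

# Eliminate L and C in favor of w0 and Q, then divide through by w0**2
normalized = expr.subs({L: Q*R/w0, C: 1/(w0*Q*R)})
normalized = sp.expand(normalized / w0**2)
print(normalized)  # s**2/w0**2 + s/(Q*w0) + 1
```

Which is exactly the point: you have to supply the target form yourself; simplify() alone will never find it.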
You may want to look into symbolic regression. I often find that by applying downward pressure on the fitness of less parsimonious individuals, along with one or more substitution operators (where applicable), you can get very elegant results. Add constraints or other fitness bonuses for the specific types of relationships you are looking for, and you might be surprised by what cool things you can find. It's also a great excuse to learn Common Lisp if you haven't already.
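For a flavor of the parsimony-pressure idea without a full GP setup, here's a toy sketch in Python (not Lisp, sorry): a deterministic beam search over SymPy expressions, with a size penalty standing in for the fitness pressure. The mutation operators, penalty weight, and toy dataset are all arbitrary choices of mine:

```python
import sympy as sp

x = sp.symbols('x')

# Toy target data sampled from y = x**2 + x
data = [(xi, xi**2 + xi) for xi in range(-3, 4)]

def fitness(expr, lam=0.01):
    """Squared error on the data plus a parsimony penalty on expression size."""
    err = sum((expr.subs(x, xi) - yi)**2 for xi, yi in data)
    return float(err) + lam * sp.count_ops(expr)

# Crude "mutation" operators standing in for real GP variation
OPS = [lambda e: e + x, lambda e: e * x, lambda e: e + 1, lambda e: e - 1]

def beam_search(width=8, generations=4):
    """Keep the `width` fittest expressions each generation."""
    beam = [sp.Integer(0)]
    for _ in range(generations):
        pool = {sp.expand(op(e)) for e in beam for op in OPS} | set(beam)
        beam = sorted(pool, key=fitness)[:width]
    return beam[0]

best = beam_search()
print(best)  # x**2 + x
```

A real GP run would use random crossover and mutation over a large population, but the fitness = error + lambda * size structure is the same.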
A good starting point is John Koza's first book on Genetic Programming.
I agree. I often have to wrestle with a CAS to get an expression in the form I want. Another example would be trig expressions, where an expression can be put in terms of A sin(x+B) or a sin(x) + b cos(x). Both are equivalent, but sometimes one is much more useful than the other. There's rarely a builtin routine for "convert to exactly the form I want", so I have to code up the conversion myself. But at least SymPy makes that as easy as any other Python coding challenge!
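For instance, a sin(x) + b cos(x) → A sin(x+B) isn't built in, but the conversion is only a few lines (to_phase_form is my own hypothetical helper, not a SymPy API):

```python
import sympy as sp

x = sp.symbols('x')

def to_phase_form(expr, x):
    """Rewrite a*sin(x) + b*cos(x) as R*sin(x + phi). Hypothetical helper."""
    a = expr.coeff(sp.sin(x))
    b = expr.coeff(sp.cos(x))
    R = sp.sqrt(a**2 + b**2)
    phi = sp.atan2(b, a)
    return R * sp.sin(x + phi)

expr = 3*sp.sin(x) + 4*sp.cos(x)
phase = to_phase_form(expr, x)
print(phase)  # 5*sin(x + atan(4/3))
```

Going the other way is just sp.expand_trig(phase).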
You very much have to learn the ins and outs of any CAS.
That said, they can be very useful dealing with large multi-term expressions, or double-checking your work by hand, playing around with transforms, etc.
It's a tool, though, and it's more about the insights you bring to it than the insights it brings to you, if that makes sense.
Those exist and are in active use by physicists (in fact one of the first CASes, Schoonschip, was developed by Nobel Prize winner-to-be Veltman to carry out Feynman diagram calculations). However, there remains a huge body of high energy physics to be implemented in one common CAS (one of the things I sometimes dream about starting). The culture of theoretical physics is such that only the end results of calculations are ever shared, with intermediate steps left as exercises to the reader. Graduate students are expected to develop a stash of identities and routines to go back to, which is their “competitive advantage”, especially in fields like model building.
Aside from the symbolic calculations, I think the dynamic (i.e. with widgets for playing with input values) visualizations in Mathematica can be very helpful for building intuition.
SymPy turns "maths" into "programming", using Python syntax instead of maths syntax, so if you already like programming, you might just like SymPy. But as someone who likes maths, and likes programming: the programming-maths part is also the mind-numbing part, and every line fewer I can write, I will take.
So to me, this article just confronts me with the soul-destroying parts of modern maths, while claiming that those are apparently the fun bits. I have no idea how to reconcile those two things so I can pass that learning on to someone who might be getting interested in maths...
Another open source CAS one might want to look at is Maxima, which is based on Macsyma, a project started in the 1960s. Not many open source projects have a history that long.
I have been working on a different frontend, that leverages CLIM, which happens to be a perfect toolkit for something like Maxima. I made a screencast showing the current state of development:
I would hardly call wxMaxima a “nice frontend” for Maxima. It doesn’t even render equations nicely (via LaTeX, say). The Emacs or LyX frontends are much nicer.
I believe my renderer does a decent job, though. You can see it in the video I posted. The equations in the documentation are rendered by the same code (which is not related to LaTeX, although I've tried to make my renderer as similar as possible to the reference implementation).
I used to use SymPy a lot, but finally switched to Sage. This is Python as well, but it has some syntax extensions and sensible defaults that smooth over many quirks of Python. For example, you can write 1/2 instead of Rational(1,2) and a^b instead of a**b.
Moreover, Sage integrates lots of other libraries seamlessly. I remember doing lots of polynomial calculations and Gröbner basis stuff, and generating Singular scripts from Python seemed very elegant in the beginning. However, this quickly showed severe drawbacks. With Sage this all became easy and seamless again.
There was a post about Sage not long ago noting that it is commonly used among cryptographers.
As an aside, the statistical libraries and conveniences of R ("missing value" is a data type; every object is an array, meaning you can call functions on arbitrary arrays) are the only reason I still use R for some niche applications. I wonder whether pandas or something like Sage will actually take over this functionality. My impression is that the packages in R are so isolated (in a somewhat good sense), each serving its niche application, that they are both difficult to emulate in Python and tend to keep working long after they were written (and not necessarily maintained).
I find that mathematics gets less and less fun as I do more of it on a computer. Symbolic Algebra packages are very useful but nothing beats a pen, paper and time for fun.
I agree, but I'm not sure if it's a psychological thing or not. For example, I became known in my QFT classes for doing homework in TeXmacs. I think it might very well have to do with where you learned it first; for example, I never felt comfortable coding with pen and paper (although I did once in high school, since I couldn't bring a laptop on the bus).
I cannot work directly in TeX, but I love fiddling around with it once I've done the work. When I'm slamming my head against a QFT textbook, I feel like I need to work on paper so I don't lie to myself with the symbols (when I don't understand it yet).
I plan my code on paper quite a lot, and actually quite enjoy talking through what I'm doing (whiteboard or not). [I think some whiteboarding at interviews is a good thing.]
Oh, don't misunderstand, I'm not that leet. TeXmacs is a TeX-like WYSIWYG editor with shortcuts inspired by TeX, so it feels natural and is rather quick. One of the things I'm arguing is important, apart from the familiarity, is that the symbols on the page look more like what you'd see in a paper or book, so it was more immediately obvious when something looked "right" or didn't.
SymPy is pretty great, but some of its basic operations (e.g. subs() and evalf(), which substitute numeric values for symbolic variable names) can be exceedingly slow. There are workarounds like autowrap, which generates C code to do this orders of magnitude faster. There are also reimplementation efforts in C++ like symengine to address the performance issues.
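For numeric loops, lambdify is the usual first workaround before reaching for autowrap or symengine. A quick sketch:

```python
import sympy as sp

x = sp.symbols('x')
expr = sp.sin(x)**2 + sp.cos(x)**2 + x

# subs/evalf: exact, but slow if called thousands of times
slow = float(expr.subs(x, 0.5).evalf())

# lambdify compiles the expression into a plain Python function (math backend)
f = sp.lambdify(x, expr, 'math')
fast = f(0.5)

print(slow, fast)  # both approximately 1.5
```

With the 'numpy' backend instead of 'math', f also accepts whole arrays, which is where the real speedup over subs() in a loop comes from.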
`subs` and `evalf` are simply not the right tools for float/double precision arithmetic. `subs` in particular is a general tool for symbolic substitutions, and it is not fair to expect it to be fast for numerics (although improvements would always be great). `evalf` provides arbitrary-precision arithmetic (your RAM is the limit), so again, it is inherently slower than float math. One can argue that this should be documented better, or that the tool should be able to guess the intent of the user, but otherwise I would say of course `subs` and `evalf` are slow when you use them for float math.
Well, the online version might fail on heavy computations to be honest, so it might be best to have it installed locally if you're planning to do some serious stuff. But for the first acquaintance, the online SymPy should be fine.
I'm watching your Harvard talk currently. Very cool. Is there any offline support? E.g. if I want to edit a cocalc document on a plane and merge with collaborators afterwards?
I suspect that school and university math exams emphasize meticulous and flawless computation over intuition, understanding and problem solving.
So to get good grades, it's not good enough to understand math, you also have to be able to focus enough to avoid mistakes. Between ADHD and the prefrontal cortex developing/maturing well past the age of 20, that's quite an ask.
My grandfather was a mathematician and he once admonished me that understanding math is as useful as understanding swimming. If you're drowning, you need to be able to DO swimming, not just understand it. Understanding will come from practice, but not the other way around.
That's a good point; my observation is limited to constructive proofs. Proofs by contradiction or enumeration might not offer any great insight, unless one can prove that there exists no better proof, in which case the insight is that no great understanding is to be had. Gödel's incompleteness theorem teaches us that sometimes things are true simply because they are not false. I don't think that contradicts my point; it just limits its scope.
What's constructive in one system is nonconstructive in another. The only people who take constructive proofs to be more meaningful than nonconstructive ones are software engineers, usually with some mumbo jumbo about how type systems are a superior foundation for maths compared to sets.
Estimation should be taught as a respectable discipline and one worthy of serious study. Even aside from the fact no intelligent educator can claim with a straight face that most of their students will need to do arithmetic by hand in a real scenario, estimation is vital for getting an intuition about orders of magnitude and how quickly addition and multiplication increase a value.
Nitpick: reusing a name and shadowing an existing binding because the name is equivalent isn't "overloading". But I don't know if it should be called shadowing either, because there is no function scoping in effect... Hmm.
I really like SymPy and support the hard-working developers constantly improving it. This software is incredibly complex and really needed. One day it will be a viable open source alternative to Matlab and Mathematica.
That said, for my specific use case (modular arithmetic / cyclic groups) I didn't get SymPy to work properly. While I was able to find guidelines on how to hack it together, I have not found a real solution. After two hours of searching, I retreated in shame to Mathematica. If anyone knows a solution or works with these structures, let me know!
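In case it helps anyone with the same problem: for plain computations in the cyclic group (Z/nZ)*, sympy.ntheory does cover the basics. It doesn't give you group objects the way Mathematica or GAP do, just number-theoretic functions, so this may well fall short of what's needed:

```python
from sympy.ntheory import n_order, primitive_root, discrete_log

# Multiplicative order of 2 in (Z/7Z)*: powers are 2, 4, 1 -> order 3
order = n_order(2, 7)

# A generator of the cyclic group (Z/7Z)*
g = primitive_root(7)

# Discrete log: find k with g**k = 4 (mod 7)
k = discrete_log(7, 4, g)
print(order, g, k)  # 3 3 4
```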
1) If you use it, I'd recommend installing it locally (pretty easy with pip install mathics)
2) When doing something complex, it can be a bit slow (it relies on SymPy, which isn't a speed demon itself, plus it has to parse and interpret the Mathematica syntax, which adds overhead)
3) The subset of Mathematica that it supports is pretty much what SymPy does -- so pretty much classic mathematics and graphing rather than the knowledge based stuff like populations of countries and their GDPs or whatever that Mathematica has been adding over the last few versions.
How good are things like its simplification algorithm? I've used Sage, which uses SymPy from what I understand, and it didn't quite cut it. Is Mathics just an interface, or does it tweak the algorithms?
The Mathematica/Wolfram language support of Mathics is good enough that some of the functionality is implemented in itself rather than a straight SymPy wrapper, but if Sage (a much bigger system than Mathics) can't simplify the expressions you want, probably Mathics can't either.
For regular mortals the cheapest way to get Mathematica is to get Raspberry Pi. Raspbian has Mathematica in its repo (sudo apt install wolfram-engine). The performance is not great but maybe the latest RPi 4 will help.
If you don't care about the UI and enjoy a command line version (or scripting with your favorite editor), you can now install the Wolfram Engine for free: https://www.wolfram.com/engine/
I must be missing something... arXiv was already popular more than a decade before the copyright date of the paper. Nonetheless, it is exciting to see how useful SymPy already is to researchers.
SymPy can expand series too, one line of code - and you're done. But that would be no fun.
The idea here is: every differentiation or integration of a polynomial results in another polynomial, right? And a polynomial evaluated at a point is just a linear equation in its coefficients. You can shove these equations into a system and make SymPy solve it.
So you can take differential and integral properties of some arbitrary function at some points, turn them into a linear system - and get a model of the function as a polynomial. That's it!
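A minimal sketch of the recipe (my own choice of constraints, not the article's: a degree-4 fit to sine using five conditions on its value, derivative, and integral):

```python
import sympy as sp

x = sp.symbols('x')
coeffs = sp.symbols('a0:5')                       # degree-4 polynomial, 5 unknowns
p = sum(c * x**i for i, c in enumerate(coeffs))

# Each condition on p, its derivative, or its integral is linear in the a_i
eqs = [
    sp.Eq(p.subs(x, 0), 0),                       # sin(0) = 0
    sp.Eq(sp.diff(p, x).subs(x, 0), 1),           # sin'(0) = cos(0) = 1
    sp.Eq(p.subs(x, sp.pi/2), 1),                 # sin(pi/2) = 1
    sp.Eq(sp.diff(p, x).subs(x, sp.pi/2), 0),     # sin'(pi/2) = 0
    sp.Eq(sp.integrate(p, (x, 0, sp.pi/2)), 1),   # integral of sin on [0, pi/2]
]
model = p.subs(sp.solve(eqs, coeffs))
print(sp.N(model, 4))
```

At x = pi/4, this already lands within about 1e-3 of sin(pi/4), even though that point was never used as a constraint.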
Nope - the Taylor series depends on a specific choice of point. Around that point, the truncated Taylor series is the "best" approximation of the function using a polynomial.
In this case, the author has picked seven points on the sine curve (or its integral or its derivative), and used those seven pieces of information to find a unique polynomial.
If I were to guess, I'd say that if the seven points were on the curve itself (not on its integral or derivative), and they were all very close to some point P, then the resulting polynomial would look a lot like the Taylor series expansion around P.
In fact, for modelling sine, the Taylor series beats this approach with lower error. But with this you get to choose your constraints, so you might get a more useful model instead.
Say, if you want to join two sine models obtained from Taylor series to cover the full [0, 2pi], you will get a derivative discontinuity - a bump - at the joint. But with this model you wouldn't, since the required values at 0 and at pi/2 are imposed explicitly.