Happy to see new tools for debugging. Yet it's the good old print that I most often end up using. Debugging is very context-sensitive: adding prints makes sense because I know exactly what I'm interested in, and I can drill down on that without being disturbed by anything else. I can print low-level stuff or high-level indicators, depending on where my debugging takes me. There's no ready-made recipe for that.
Some codebases have built-in logging or tracing functionality: you just flick a switch in a module and begin to get ready-made prints from all the interesting parts. But I've found myself never using those: they are not my prints, they do not come from the context I'm in and they don't have a meaning.
Use what you want but please don't underestimate prints.
I've definitely had the experience of using custom_logger.log("something"), then checking the system log. Nope, no there, must be in /var/log/app ... nope. Hmm oh I know! I'll turn up the logging in the config file, wherever that is. Still nothing. Did I need to compile in some flag?
>>> import logging
>>> logging.info('Hello, world!')
>>> # right, my log message went nowhere
Because by default, logging goes nowhere. And if you configure logging - using a most unintuitive config format (it's so weird that even the documentation about it can't be bothered to use it, and falls back to YAML to explain what it means!) - there's a good chance that loggers created before you got around to configuring it (for instance if you, God forbid, made the mistake of adhering to PEP-8 and sticking your import statements at the top) won't use your configuration - and will thus send their log messages, again, nowhere.
You know, you could start by reading the documentation and stick one logging.basicConfig() call in your entrypoint, instead of spreading that kind of misinformation.
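Something like this at the top of your entrypoint is all it takes (the level and format here are just example choices):

    import logging

    # one call in the entrypoint; loggers created afterwards inherit this
    logging.basicConfig(
        level=logging.INFO,
        format="%(asctime)s %(levelname)s %(name)s: %(message)s",
    )

    logging.getLogger(__name__).info("Hello, world!")  # now it actually shows up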
Python's logging infrastructure is pretty bad but you fail to give any good, factual reason why it is. Instead you just vent your frustration on HN, making that platform all the more depressing to read.
I for one am glad to hear this aired in a public forum, even though it took several pointless replies to get to the "hey, check out this bad default setting"... I'm learning Python as a distant third priority, I had only heard of pdb once, and I would have probably tripped over this logger that logs by default to nowhere at least once before I resorted to giving up and reading the documentation in anger.
Why is it this way, do you think? (Is it a reasoned stance? I would have expected the logger to send messages to stdout by default, so at the risk of getting a "Read the docs!" am I going to be equally surprised at the behavior of basicConfig?)
It gives you contextual debugging - so you can put fancy prints throughout a function but they are silent unless some context is true. It's useful for when you have a hot path that is executed a lot but you only want debug prints for one of those invocations.
I pretty exclusively rely on prints and intuition. Honestly I've implemented logging frameworks in different production situations and never gotten use out of them. Like you say - they don't answer the questions you need answered.
If you're going to use print debugging I highly recommend the package "q". It's print debugging on steroids.
Always logs to /tmp/q no matter what stdout redirects the app has set up. Syntax highlighting. Context dumping. Dumps large values to separate files. Etc.
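If you haven't used it, the interface is roughly this (going from memory, so check the README for the exact details):

    import q

    @q                          # trace this function: logs its arguments and return value
    def fetch(url, retries=3):
        result = do_request(url)   # do_request() is just a stand-in here
        q(result)                  # log a value; output lands in /tmp/q, not stdout
        return result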
Log points are awesome. Like breakpoints, but instead of stopping they print/log. I first saw them for Node.js in VS Code; not sure if Python has anything like it. You can even attach to a remote running process and add them.
In general, something like this has been around in IDEs for a very long time - e.g. Visual Studio proper added them back in 2005, except it calls them "tracepoints".
For my Django projects the development server is werkzeug [1] and anywhere a break point is needed I'll add 1/0 in the code then hit refresh in the browser which pulls up an interactive debugger [2].
The visual of the entire traceback in the browser and the ability to drop into an interactive shell within a specific frame is (for me) a way better experience than pdb.
Thanks for showing me this! The more I learn about django-extensions, the more impressed I am with its features. Being able to interact with the Django shell from a Jupyter notebook is awesome too.
This is great, thanks! I've been wanting this from the q library for ages, but the author hasn't responded in a few years. I'll switch to this right away, thanks again!
This is a thoughtful hack but the real solution here is a) make debuggers easier to set up and b) make your project easily debuggable with a debugger from the beginning.
Like, debugging should be considered part of programming, and a local dev environment that can’t be debugged should be viewed to be as broken as a codebase without e.g. a way to run the server in watch mode.
Also someone should make a debugger that supports the equivalent of print statements, e.g. set print breakpoint on a variable to print its value every time it’s run, instead of typing print everywhere.
Debuggers do support something like this, but usually with many limitations. The problem with these is that it's hard to implement them in such a way that there's zero overhead when not debugging. If I remember correctly, native data breakpoints have some hardware support on Intel, but for high-level languages it can be difficult to map their data model onto something like that.
> Also someone should make a debugger that supports the equivalent of print statements, e.g. set print breakpoint on a variable to print its value every time it's run, instead of typing print everywhere.
Traditional debuggers like gdb for compiled languages support this (breakpoint actions, memory write breakpoints, and variable displays). If something similar isn't already in your language's debugger, that might be a source of ideas for adding it.
Until such a day comes, `printf()` debugging will still be useful. Heck, even after that day comes, it may still find uses.
The other day I was writing LPEG [1]. I had a rule that wasn't firing for some reason and I wanted to know why LPEG was skipping that part of the input. It was not a simple matter to fire up a debugger and put a breakpoint on the rule:
-- look for 1 or more ASCII characters minus the colon
local hdr_name = (R"\0\127" - P":")^1 -- [2]
I mean, I could put a breakpoint there, but that would trigger when the expression is being compiled, not run. LPEG has its own VM geared for parsing (it really is its own language) so yes, it is sometimes difficult to debug (especially when you end up with a large parsing expression to parse a document---I only found the bug when I was creating another document similar to my initial testing document that was just different enough).
Fortunately, there is a way to hook into the parsing done by LPEG and I was able to insert literal print statements during parsing that showed what exactly was going on (my rule was a bit too promiscuous).
[2] Yes, a regular expression for that is smaller than what I typed, but with LPEG, I can reuse that rule in other expressions. Also, that wasn't the actual rule used, but the effective rule I was using.
One situation where I would have wanted a real debugger and couldn't use one was in crash reporting on remote systems (like, machine learning code running in some container somewhere, or on a headless box far from the network).
Since Python's built-in tracebacks are pretty minimal, the default crash logs don't offer much help other than a line number. I ended up writing a tool that prints tracebacks along with code context and local variables [i], sort of a souped up version of the built-in crash message. It's surprising how much that already helped in a few situations, makes me wonder why it isn't the default.
So, yes, more debuggers, but also abundant logging everywhere!
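A bare-bones sketch of the same idea (not the actual tool, just the gist): register an excepthook that prints the usual traceback and then dumps each frame's locals.

    import sys
    import traceback

    def verbose_excepthook(exc_type, exc, tb):
        # normal traceback first, then the local variables of every frame
        traceback.print_exception(exc_type, exc, tb)
        frame_tb = tb
        while frame_tb is not None:
            frame = frame_tb.tb_frame
            print(f"  locals in {frame.f_code.co_name}:", file=sys.stderr)
            for name, value in frame.f_locals.items():
                print(f"    {name} = {value!r}", file=sys.stderr)
            frame_tb = frame_tb.tb_next

    sys.excepthook = verbose_excepthook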
a) Crashes dump the program state to a file and then you load that in the debugger.
b) If the process is still running but broken, attach the debugger.
c) In my firmware I log bus fault addresses and restart. That allows me to see what code or memory access caused the error. 75% of the time it's fairly obvious what happened.
Crash dump debugging is tricky to implement in Python, because its normal debugging APIs - sys.settrace, frame objects etc. - require being inside the process with a running Python interpreter.
You can still do it, but you basically have to redo the whole thing from scratch. For example, instead of asking for a repr() of an object to display its value, you have to access the internal representation of it directly - and then accommodate all the differences in that between various Python versions. Something like this (note several different dicts): https://github.com/Microsoft/PTVS/tree/bcdfec4f211488e373fa2...
Python's standard lib has had something like this for years, but it's been criminally under-promoted (probably because of the terrible name): https://docs.python.org/3/library/cgitb.html
However, yours provides much nicer output. Thank you!
If you don’t want to go CLI-debugging both VSCode and Emacs have more integrated, in-editor options.
Many bugs can be isolated and reproduced in unit-tests. That’s just a click away from being debuggable inside a real debugger. Why use anything but that?
To me, not using a debugger to debug seems kinda crazy.
I love using PDB, but even still this has a different use case. Say you have a bug in a function and it depends on timing or it's a race condition, or it has a lot of statements. Being able to see the state throughout the function in your output may be able to catch things or speed up debugging.
Agreed. My go-to for debugging is `import epdb;epdb.st()` which is a quick and dirty breakpoint. There may be more elegant solutions but that has served me very well.
True, though pysnooper (and print statements for that matter) will change the execution speed, so won't necessarily catch timing bugs (or introduce new ones).
With python being a language that tries to only have one way of doing each thing, maybe they should have figured that out at the beginning stages of the language's design phase.
Is there an equivalent of Common Lisp's TRACE in the Python world?
IMO it is the best value-for-money (i.e. time and convenience) debugging tool I've used so far. Simply write (TRACE function1 function2 ...) in the REPL and you will get nicely formatted output of the arguments passed and value(s) returned for each invocation of the given functions. Another nice feature is that the deeper an invocation is in the stack, the more it is indented -- so recursive functions are fairly easy to debug too.
You can't use it for everything, but it's sufficient most of the time.
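Something that behaved roughly like this decorator sketch would cover most of what I use TRACE for (indentation by call depth included; the names here are just for illustration):

    import functools

    _depth = 0  # current call depth, used for TRACE-style indentation

    def trace(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            global _depth
            pad = "  " * _depth
            print(f"{pad}-> {func.__name__} {args} {kwargs}")
            _depth += 1
            try:
                result = func(*args, **kwargs)
            finally:
                _depth -= 1
            print(f"{pad}<- {func.__name__} returned {result!r}")
            return result
        return wrapper

    @trace
    def fib(n):
        return n if n < 2 else fib(n - 1) + fib(n - 2)

    fib(3)  # prints nested, indented entry/exit lines for every call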
PySnooper looks good, but it is inconvenient in a couple of ways:
1. It prints every line of the traced function -- most of the time this is overkill and not what one needs.
2. To snoop on a function you need to modify the source file. Not a deal breaker but you still have to remember to revert this.
I've done something similar (but not exact) for Lua [1]. I was testing code to trace Lua execution and found some code that was executed more than I expected [2]. Adding a trace like you described should be easy as well in Lua.
It drops me into IPython, an interactive Python shell, with the interactive Python debugger. You can type variables and use `pdb` primitives like (c)ontinue, (u)p call stack, (n)ext line, etc. I really like it, but ofc YMMV.
Try pdb++ (pdbpp) instead — it's like a much improved ipdb. It monkeypatches itself into pdb rather than defining a new package, so it works from any context that drops into the debugger. Its “sticky mode” alone is worth the switch.
Yeah but the idea is that, with TRACE, most of the time you shouldn't need to drop into the debugger (all CLs come with an interactive debugger too). If TRACE is not sufficient, only then do I go down the debugger route (or strategic print statements).
Interesting project, though I'd suggest losing the line about "can't be bothered to set one up right now" regarding a full debugger. (i)pdb is built in and is simple to use. Perhaps focus on what this can add rather than framing the project as something like a lazy alternative (especially when this may actually be harder to set up than throwing in "import pdb; pdb.set_trace()")?
pdb and its variants (my favorite is PuDB) are generally difficult to use in complex, corporate projects. If you've got a multi-process, multi-thread Python project running on a remote host, you'll need a really full-featured debugger to work with it effectively. I recommend Wing IDE or PyCharm for that.
Certainly not a counter, as I'm less familiar with Wing and Pycharm's debugging features, but both of these have been helpful to me in multi-process, multi-thread python environments.
It isn't always that easy. For example, where I work, bringing up a full stack locally involves multiple Python processes (servers) with the frontend accessed via the browser.
Using this tool will work immediately, while using pdb would be a bother (comparatively).
> It isn't always that easy. For example, where I work, bringing up a full stack locally involves multiple Python processes (servers) with the frontend accessed via the browser.
I'd add that discovering where to put the debugger and how to gate it when e.g. the issue needs warmup isn't trivial in large projects. Even if you know where to put the debugger (possibly a quest in and of itself) the exact callsite might get hit tens or hundreds of times before the issue shows itself.
And then, Python doesn't have a reverse / time traveling debugger, so hitting the callsite isn't always sufficient to understand the issue.
In all honesty, I think a more useful way to do this would be defining high-level tooling based on eBPF or DTrace, so that you can printf or snoop from the outside without having to edit any source.
And possibly combine that with a method to attach a debugger from the outside to a running program, using either PyDev.Debugger or a signal handler triggering a set_trace as a poor man's PDD.
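The signal-handler variant is only a few lines; a sketch (SIGUSR1 is just an example choice, and the process sits in pdb once it fires):

    import pdb
    import signal

    def _debug_handler(signum, frame):
        # drop into pdb in whatever frame the signal interrupted
        pdb.Pdb().set_trace(frame)

    signal.signal(signal.SIGUSR1, _debug_handler)
    # later, from outside the process:  kill -USR1 <pid>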
Sometimes using a debugger isn’t always the best way. Sometimes you want to print out and examine lots of data taken from multiple runs/loop iterations and it’s simply easier to consume that information when it’s sitting in front of you all at once.
Maybe it’s just me but I’ve never been happy with what pdb provides. My primary mode of debugging is to jump into an IPython console and go line by line. If it’s Django, as mentioned above, I have to say I have learned so much about Django through the console that I don’t think I’d be anywhere at this point if I’d used a debugger.
I'm currently learning Python and a little bit of numerical analysis with Jupyter notebooks. I currently don't know what I'm doing. PySnooper looks like it could drop in nicely to give some much needed help.
See also: the unofficial Jupyter extension variable inspector. It shows current values of variables in your scope integrated in a notebook [1]. Pretty slick!
> You can use it in your shitty, sprawling enterprise codebase without having to do any setup.
Debugging is a sensitive subject, particularly given how frustrating it can be. There’s a place for vulgarity somewhere, but I’d rather see your README provide authoritative info than crack jokes.
I appreciated the README as written. I immediately understood what kind of environment the author intended this to be used in. A more formal explanation would have degraded the message. Vulgarity doesn't have to be taken as judgemental; it can also be a good way to express empathy. The way I read it, the author's intent was the latter.
"Any time somebody tells you that you shouldn’t do something because it’s “unprofessional,” you know that they’ve run out of real arguments." -- Joel Spolsky
Or: there is a whole suite of behavior and conduct, with well-thought-out reasons that have been comprehensively debated over the years, that has been put under the rubric of professionalism, and there is zero reason to rehash the same arguments over and over and over ad infinitum.
One-word quips sound good on paper, but this is an open source project with engineers as the target audience. It's not seeking (at this time) to bring revenue or make sales, so I'd argue that speaking truth to the problem is more likely to drive up adoption.
> May I ask how the word "shitty" is more truthful than "poorly written" or "complex"?
It evokes emotions that people affected by a situation can sympathize with more effectively; by establishing a shared emotional bond over the topic, it helps the developer convey not just the situation but the frustrations of the situation more effectively than "poorly written" or "complex" would alone.
> Citing professionalism isn't a quip, it's shorthand for a code of conduct and long accepted practices of interaction with others.
The code of conduct isn't uniform, so it can't be used effectively as shorthand for such. But at this point we're in the weeds.
Again, there are no short-term revenue prospects here, just a tool OP wants to socialize to make a few lives easier. If you have an objection over verbiage, that's fine, but it would be an exhibition of professionalism from yourself toward the OP to build a sound defense of your position: how would it help the engineer to self-censor the description of a tool whose audience, by and large, may not care?
Up to you. My point is the engineer doesn't need to suppress who they are in this specific context, and my point to you is it shouldn't impact your usage of what looks to be an effective short-term debugging tool.
I have to say sometimes I do find it much easier to read logs than to muck about in an interactive interpreter. That said I did just add `export PYTHONBREAKPOINT=pudb.set_trace` to my bashrc and am slowly going through and removing all my old `from IPython import embed` lines. A much simpler workflow that doesn't incur major runtime costs.
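The nice part is that a plain breakpoint() call then picks up whichever debugger the env var names (PEP 553); for example (the surrounding code is made up):

    def process(items):
        for item in items:
            if item.total < 0:   # hypothetical condition worth inspecting
                breakpoint()     # opens pudb with PYTHONBREAKPOINT=pudb.set_trace,
                                 # pdb by default, and is a no-op when set to 0
            ship(item)           # ship() is a stand-in for the real work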
Using print is sometimes superior to pdb. If you want to quickly see how the program runs without stepping through each line print is justified. This looks like an evolved print. Nice tool!
While I sometimes use one debug tool or another, I have never understood the aversion to print statements reflected in the title. Sometimes, often even, a print statement is just fine, and anything else is overabstracting it. Not saying other options aren't nice to have available, just that there is nothing wrong with using a simple print statement in many situations.
Because the observability you get with a print statement is limited. You can't do further investigation without modifying the code and running it again. With a debugger you can explore the program state interactively.
My experience with python is limited but in my day job it's not uncommon to be tracking down stuff that happens infrequently. Debug cycles get brutally long.
You also can't be confident that a debugged program is in a natural state without restarting it, at which point the re-setting of the breakpoints and scripting you did in the last invocation - or saving and loading what you've already done - is a pain point. If all of this is quicker than recompiling/relaunching, then IMO there is a second bug in a lack of effective logging.
Most bugs I create are for simple reasons and can be found by scanning the first error logs. If I add print statement debugging because I couldn't then they'll often be adapted into additional logging. If I use a debugger for this as my first tool and don't add logs, I'll have to do it again next time, too[1].
If the bug is not a simple one and is not a structural bug, there's a decent chance it's something debuggers deal with poorly: data races, program boundaries, non-determinism, memory errors. If it's something that can be found by calling a function with certain parameters, it's a missing test case.
So the times I find debuggers to be worth it are after I've already decided it's a difficult yet uncommon bug. So I use them with despair.
[1] If I fix it with a debugger and then add the logs, I still have to prove it gives the right output when it fails.
Once you go down the path, it's easy to start adding more and more, and it's easy to forget about them. They sometimes blend in well with patches you generate, since it's just a 'print()' and nothing obvious like 'import pdb; pdb.set_trace()'
That's a matter of being organized and competent, not a fault of print().
People create bugs by forgetting things all the time. If you believe their spin, FB snarfed millions of contact lists by accident because of that. I've troubleshot countless things that happened because of code that fell between the cracks. And on the other side of that, I could tell you about the time we went a month without logs in production because someone didn't properly test a library change, so everyone's carefully manicured logging broke.
There's nothing wrong with print(), at least that isn't wrong with a lot of other things.
I agree. Print is the most fundamental tool available to you: debuggers are kernel and hardware spanning monsters that introduce an external logic dependency. We don't advocate for more complexity than is necessary in other places.
As a very mediocre ruby programmer, is there an equivalent in ruby to this? If you have a little pattern that is more clever than 'puts' everywhere, please share, it would be well received by at least one person out there. Thanks!
Something that prints out the state after running each line of code? No, I don't think so, but I believe it wouldn't be too hard to rig up something similar with TracePoint - https://ruby-doc.org/core-2.2.0/TracePoint.html - which is a built in Ruby way to trace code execution.
For example, dump this into a file and run it:
    TracePoint.trace(:line) do |tp|
      STDERR.puts "Ran: #{File.readlines(tp.path)[tp.lineno - 1]}"
    end

    a = 10
    b = 5
    puts a
    a += b
    puts a
    puts b
It'll print out each line of code as it runs it. You could then parse each line for identifiers and print them out, etc. It'd be a project, but it's doable.
Never use print for debugging again... If you're creating a reasonably complex project, spend some time to set up a nice and robust logging facility and always use it instead of print. That's the first thing you should do. You will not regret that decision.
I sometimes use a debugger to tackle unfamiliar code, but I always prefer using trace/logging whenever possible, because 1) you can see the context and the whole process that reached that point, and 2) the history of debugging can be checked into a VCS. I'd write a one-liner to scan the log file rather than setting up a conditional breakpoint. I particularly like doing this for a GUI application. Regression testing can be done by comparing logs.
Super neat project. I am a print debugger myself and will definitely use this at some point in the future. For scenarios where PySnooper might be overkill and you just want to see the value of specific variables, I wrote ports of the Rust `dbg` macro for both Python and Go that are pretty nifty for looking at values really quickly:
This is great! I'm currently working on a large Django project that has itself and all of its services running in Docker containers via docker-compose. In order to use a traditional debugger, I would need to set up remote debugging, and the integration with VS Code for that is really not great. Not to mention that getting a remote debugger to work with Django's monkey-patched imports is a little wonky as well.
With this package, it seems like I can just get my debugging via stderr.
IMO the real advantage of print() over a full-fledged debugger comes when you're testing. Just replace print() calls with assert(). (Also debuggers, especially on the front end, always seem incredibly laggy and slow.)
Using debuggers tends to encourage people to fix problems without writing regression tests.
Honestly, I'd prefer "better support for print-line debugging" to "better debugger that you can set up" in most cases.
I missed this on HN last week but I just found it on Google. THANK YOU! I've been debugging weird serial errors for the last day, and this is helping a ton!
This looks great, good job! It strikes a great balance between PuDB (my favorite, but can't easily run in Docker/remotely) and q (very simple to use but you need too many print() calls everywhere). PySnooper seems great to just drop in and get the full context of what I want to debug.
Can it be used as a context manager (`with pysnooper.snoop():`) for even finer targeting?
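Something like this is what I'm picturing (untested, so treat it as a sketch; the helper functions are just placeholders):

    import pysnooper

    def big_function(data):
        # ... lots of uninteresting setup ...
        with pysnooper.snoop():            # trace only the lines inside this block
            cleaned = normalize(data)      # normalize()/score() are stand-ins
            result = score(cleaned)
        return result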
Before clicking I was like: why would one not use pdb? With vim, python-mode, and 'find . -name "*.py" | entr python testcases.py', setting breakpoints and re-running is painless.
I was wrong. Upon skimming, it seems a huge plus PySnooper has over pdb is automatically inspecting state, sparing a whole lot of manual typing.
Really love this. Seems like it captures a whole bunch of interesting information when a function is invoked.
I love auto-loggers like this where you can selectively capture interesting bits.
This is the basis of reverse debugging, i.e. capturing chronological snapshots. Would love to see a VS Code extension that lets you step forward/backward through time when an interesting thing happens.
This is super neat, and definitely a great tool for early debugging -- but for anything more in-depth, there's built-in pdb and third-party ipdb, which gives PDB an IPython frontend.
Both use gdb semantics which is great if that's what you're used to.
I come from the php world now coding in python. There used to be a very nice library from Symfony called VarDumper: https://symfony.com/doc/current/components/var_dumper.html that would just pretty print variables you dumped to the browser in an easy to consume form. Is there anything like this in python?
There is pretty-print [0], which you could dump inside pre tags, or print to stdout. But honestly, depending on the framework, quicker options exist than (basically) printf debugging.
Pretty much the stuff already mentioned elsewhere on HN for this article. I often find Werkzeug's excellent debugger is enough: https://news.ycombinator.com/item?id=19718869 . pdb/ipdb/pudb et al (pick your fav) can help for really tricky stuff. And sufficient logging, so you know what's going on at all times even without a debugger attached.
(occasionally, the low effort of print-debugging works, but if you keep having to print in more/different locations... it's a blunt tool IMO)
I try to have "debug calls" that I can call from the browser or Postman or Swagger, etc. and return all sorts of debug information, etc.
I have the impression that local program-specific debugging tools quickly evolve into something like functional tests that uncover issues that functional tests proper might not cover. For example, I might be serving an ML model and have a debug call that runs sanity checks on the data that are too expensive to run on every request.
I think there is a difference because Python doesn't require a web browser to run. People are just running programs with no web UI or webserver, etc. If you really wanted, you could pipe your debug prints to a file and load it in a browser if that is what you want :p
We have some fairly intense functions that need to be profiled with realistic data and I/O conditions. I've been doing that by hand, line by line in the shell.
I just tried wrapping this around one of these functions in the django shell (actually shell_plus, but same thing). Just imported pysnooper, created a new function using `new_f = pysnooper.snoop()(original_f)` and called new_f on the realistic data and got a nice printout that included values and times.
Don't think it works; I tried a few examples and it throws NotImplementedError at me all the time. Any hint?
~/anaconda3/lib/python3.7/site-packages/pysnooper/tracer.py in get_source_from_frame(frame)
75 pass
76 if source is None:
---> 77 raise NotImplementedError
78
79 # If we just read the source from a file, or if the loader did not
That looks really nice! I normally use pudb which is a curses-based debugger but pudb falls short when the debugged program runs with several threads or if it uses an asynchronous event loop. Having a non-interactive debugger like PySnooper could definitely help in such situations!
Can someone explain to me why to use this instead of `import pdb; pdb.set_trace()`? I am new to Python and am confused; someone told me not to use print and to use pdb instead, so how is this different?
No, because by default it only logs what's happening directly in your function, not what happens deeper in the stack. (i.e. functions that your function calls.)
You can set depth=2 or depth=3 and then you'll get a huge chunk of log. The deeper levels will be indented.
If the function is launched in a spawned process, it'll work, though you might have trouble getting the stderr, so you better include a log file as the first argument, like this `@snoop('/var/log/snoop.log')`
If the function launches new processes internally... I'm not sure.
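For the spawned-process case, that means something along these lines (a sketch, untested; handle() is a placeholder):

    import pysnooper

    # write the trace to a file rather than hoping the worker's stderr is visible,
    # and follow calls one level deeper than the decorated function itself
    @pysnooper.snoop('/tmp/snoop.log', depth=2)
    def worker(job):
        return handle(job)   # with depth=2, the lines inside handle() show up too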
There is a race condition when writing to the same log file from several processes at the same time, which is a typical use case for WSGI frameworks such as Django or Flask.
Haha. I don't know if this is satire or not, but in case it's not: You can pass any writable stream as the first argument and PySnooper will use it. So it should be easy to integrate with Slack or anything else.
Yes, I was serious. I'm curious why you thought I was joking? cool-RR thought I was joking, as well, so I must be missing something... ¯\_(ツ)_/¯ Thanks for the tip on Sentry -- I'll take a look!
Oh, just that "chat ops" is one of those goofy fads that seemed to flare up everywhere and burn itself out really quickly, so any mention of it immediately triggers my satire meter.
Quite apart from that, the tool described looks like something you'd use in an intensive debugging session, so it's hard for me to imagine how it would fit in with an alerting workflow.