I recommend reading this as it's right - the Sudoku grid is a beautifully constrained environment in which to indulge in various elements of math and logic without becoming overwhelmed. But since I know many read the comments and dash, check out the Cracking the Cryptic YouTube channel mentioned if you ever get the chance: it's become a highlight of my day to watch their solves :-)
Cracking the Cryptic quickly became my favorite pandemic past time. So much so that I've started making variant Sudokus - it's relaxing, creative, meditative, and I find it quite fun. I definitely second the suggestion to check out the CtC youtube channel!
And I encourage anyone with even a slight interest to check out the videos they post where constructors explain how they come up with some of the puzzles. I never imagined I'd be able to do it well or find it fun but I was pleasantly surprised after trying it out and having a blast.
I just bumped into The Phistomefel Ring last week watching one of his videos, but in the one I watched he claimed there was a second variant on the PR, which he used to solve a Sudoku that couldn't be solved by a computer without brute force.
I'm about to start with a new programming language. I think I might want to go back to Peter Norvig's Sudoku solver and layer these data structures on top.
The initial board is the starting axiom, and you want to prove (derive) a completely filled board. It's easier to think model-theoretically about games like this, but I've found that as I learn a logic puzzle over time, I build up a repertoire of rules I can apply. (In fact, for puzzles-masyu.com, I have a sound and mostly complete set of rules in my mind for the Medium difficulty and periodicals.)
The application of these rules typically produces a linear proof, because we augment the board state sequentially. But that's a human limitation, not a formal one. Since any rule that could be applied always can be applied (so long as there's a unique solution!), if you notice you can do two things from the same proof state, you can imagine a "rule of concurrency" that joins the results from multiple prior rules.
This kind of structure is even clearer in Picross 3D: Round Two, which has two different colors that interact very loosely. You can often work purely in one color, then switch to working on the other color, and go back and forth -- it's like you're interleaving two concurrent actors both collaborating on the same object. A purely semantic perspective doesn't really lend itself to this observation -- but when you've built up a corpus of rules, it's very obvious which rules are color-local and which require a more global view to progress.
My kids and I worked through most of "The puzzle ninjas" by Alex Bellos, and some of the "Pencil puzzles" books from Nikoli. Working on them with others, it's a lot of making little conjectures, testing (proving) the conjectures, and then applying them to various parts of the puzzles. Occasionally we've even used the fact that there is a unique solution to eliminate various possibilities.
They provide a bit of the satisfaction that finishing a good graduate math course homework set does.