* x87 floating point is generally unused (if you have SSE2, which is guaranteed for x86-64)
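* BCD/ASCII adjustment instructions (AAA, DAA, and friends) are unused, and aren't even valid in 64-bit mode
* BTC/BTS/related instructions. These are basically a & (1 << b) operations (test, set, clear, or flip one bit), but since the regular shift and logic instructions already cover them, it's generally faster to use those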
* MMX instructions are obsoleted by SSE
* There's some legacy cruft (e.g., segment management) that's generally unused by anyone not in 16-bit mode.
* There are a few odd instructions that are basically no-ops (LFENCE, branch predictor hints)
* Several instructions are used in hand-written assembly, but won't be emitted by a compiler except perhaps by intrinsics. The AES/SHA1 instructions, system-level instructions, and several vector instructions fall into this category.
* Compilers usually target relatively old instruction sets, so while they can emit vector instructions for AVX or AVX2, most shipped binaries won't by default. When you see people list minimum processor versions, what they're really listing is which minimum instruction set is being targeted (largely boiling down to "do we require SSE, SSE2, SSE3, SSSE3, SSE4.1, or SSE4.2?").
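To make the BTC/BTS bullet above concrete, here is a rough sketch in C of the "regular operations" those instructions correspond to. The function names are made up for illustration, and the sketch assumes 32-bit operands with b < 32; the real instructions also copy the selected bit's old value into the carry flag, which plain C can't express directly.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper names; each mirrors one bit-test instruction.
 * Assumes b < 32 (shifting a 32-bit value by 32 or more is undefined in C). */

/* BT: report whether bit b of a is set. */
bool bit_test(uint32_t a, unsigned b) {
    return ((a >> b) & 1u) != 0;
}

/* BTS: return a with bit b set. */
uint32_t bit_set(uint32_t a, unsigned b) {
    return a | (UINT32_C(1) << b);
}

/* BTR: return a with bit b cleared. */
uint32_t bit_reset(uint32_t a, unsigned b) {
    return a & ~(UINT32_C(1) << b);
}

/* BTC: return a with bit b flipped. */
uint32_t bit_complement(uint32_t a, unsigned b) {
    return a ^ (UINT32_C(1) << b);
}
```

A compiler is free to turn any of these back into BT/BTS/BTR/BTC if it decides that's profitable for the target, so writing the plain shift-and-mask form in source costs nothing.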
As for how many x86 instructions there are: 981 unique mnemonics and 3,684 variants (per https://stefanheule.com/papers/pldi16-strata.pdf). Note that some mnemonics mask several instructions--mov is particularly bad about that. I don't know if those counts only go up to AVX2 or if they extend to the AVX-512 instruction set as well.
OpenBSD uses segments (while in protected mode!) to implement a line-in-the-sand W^X implementation on i386 systems that don't support anything better. The code segment limit is set just high enough in a process's address space to cover the text and libraries but leave the heap and stack unexecutable.
This post mentions that implementation: http://www.tedunangst.com/flak/post/now-or-never-exec
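The line-in-the-sand idea described above can be modelled in a few lines of C. This is only a toy sketch, not OpenBSD's code: it assumes a flat code segment with base 0, the function name is made up, and any addresses you feed it are hypothetical. The point is just that the only thing the hardware checks is whether an instruction fetch falls at or below the segment limit.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model of the code-segment limit check (hypothetical name).
 * Assumes a flat segment with base 0: an instruction fetch is allowed
 * only if the address is at or below the limit. Everything above the
 * limit (heap, stack) stays readable and writable through the data
 * segments, but cannot be executed. */
bool fetch_allowed(uint32_t addr, uint32_t cs_limit) {
    return addr <= cs_limit;
}
```

With the limit placed just above the text and libraries, a fetch from a low code address passes while a fetch from a high stack address fails, which is why the layout has to keep all executable mappings below a single boundary.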
Nowadays it is all Vanderpool/Pacifica, aka VT-x/AMD-V.
x86 NaCl uses segments for sandboxing.
Would Intel be able to meet the smaller die sizes they're currently having trouble with? Would it make the processors any less expensive to produce at scale (all other things equal)?