I used to do greylisting with spamd and ended up silently losing quite a lot of email. (Many mail hosts do not re-send from the same IP, meaning messages essentially get stuck forever.) Doing spam checks at DATA time and rejecting obviously bad emails outright seemed much more effective and less dangerous. I never managed to get SpamAssassin to do this, but auto-rejecting languages you don't read also cuts down on spam a lot.
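To illustrate why retries from a different IP get stuck, here is a minimal sketch of greylisting in Python. It assumes a simplified (IP, sender, recipient) triplet and a 300-second delay; the `Greylist` class and its parameters are hypothetical, and real implementations such as spamd differ in detail.

```python
import time


class Greylist:
    """Toy greylist keyed on (client IP, envelope sender, recipient)."""

    def __init__(self, delay=300, now=time.time):
        self.delay = delay      # seconds a new triplet must wait (assumed value)
        self.now = now          # injectable clock, handy for testing
        self.first_seen = {}    # triplet -> time of first delivery attempt

    def check(self, ip, sender, recipient):
        key = (ip, sender, recipient)
        t = self.now()
        # Remember the first attempt; later attempts reuse that timestamp.
        first = self.first_seen.setdefault(key, t)
        if t - first >= self.delay:
            return "accept"
        # Not seen long enough ago: ask the sender to retry later (SMTP 4xx).
        return "tempfail"
```

A sender that retries from the same IP after the delay is accepted. A sender that retries from a different IP in its outbound pool creates a fresh triplet each time, so it keeps getting a temporary failure.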
(I missed out on a consulting opportunity because the client's host of choice seemed to be a known spammer and my mail server's filtering was too aggressive about trusting blacklists. I chose to silently reject those types of messages, so nobody got a bounce. Fortunately, someone was nice enough to ask me about it out-of-band, so at least I was able to turn off greylisting and blacklisting before losing much more mail.)
I never found an imapd that scaled to having a lot of messages in a folder, so I ran a cron job to move mail offline after 2 weeks (for mailing lists) and 1 year (for INBOX). Similarly, I never found a good client to use; Gnus had a very cryptic configuration that I could never believe worked, and mutt was not Emacs-y enough. Reading my email mostly consisted of waiting for Gnus and deleting spam. (I never figured out a good way to get Gnus to move messages marked as spam somewhere so that I could run a cron job to automatically train SpamAssassin on the known-bad emails.)
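A cron job like the one described could look roughly like the Python sketch below. It assumes Maildir storage and uses each file's modification time as a stand-in for the delivery date; the function name, the paths, and the age thresholds are all hypothetical.

```python
import shutil
import time
from pathlib import Path


def archive_old_messages(maildir, archive, max_age_days, now=None):
    """Move messages older than max_age_days from maildir into archive.

    Age is judged by file mtime, which is an assumption: it normally
    matches delivery time in a Maildir but is not guaranteed to.
    Returns the list of file names that were moved.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    moved = []
    for sub in ("cur", "new"):
        src_dir = Path(maildir) / sub
        if not src_dir.is_dir():
            continue
        dst_dir = Path(archive) / sub
        dst_dir.mkdir(parents=True, exist_ok=True)
        for msg in sorted(src_dir.iterdir()):
            if msg.is_file() and msg.stat().st_mtime < cutoff:
                # Keep the file name so Maildir flags in it are preserved.
                shutil.move(str(msg), str(dst_dir / msg.name))
                moved.append(msg.name)
    return moved
```

Run from cron, this would be called with something like `max_age_days=14` for mailing-list folders and `max_age_days=365` for INBOX, with the archive directory living outside what the IMAP server indexes.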
For search, I used HyperEstraier: http://sourceforge.net/projects/hyperestraier/
You set it up to run the ingestion program as a cron job every 10 minutes, and then you get a CGI that will show you matching emails for your query. Of course, you can't actually click the links and go anywhere unless you set up some sort of web-based email viewer. I never found anything I liked so I lived without webmail. (There are lots of options. All difficult to configure and probably riddled with security holes.)
Once your server is up and running, you need a secondary MX and a backup plan for your email. (I used Dyn.com's secondary MX hosting service. A lot of spam comes in through the secondary MX, so you can't just implicitly trust it. This involves more configuration.)
Finally, spam filtering uses a lot of CPU and RAM, so you have to pay for a rather expensive virtual machine. Linode's $40/month plan seemed mostly adequate.
I don't really like composing email in the web browser, but I've gotten used to it and $5/month for Gmail and $0/month of my time screwing around with spam filters seemed like a good tradeoff.
Then it goes through amavisd, which runs the SpamAssassin checks and verifies DKIM and the like, and then it gets delivered to Dovecot.
Now, Dovecot as an IMAP server is fantastic. With dovecot-pigeonhole, I can sort messages into different folders server-side.
Dovecot currently handles one mailbox for me that I archive a mailing list into ... 150k messages and counting, and no issues. Uses Maildir on the backend. Although, I do think at those sizes it comes down to good file system caches, and a good file system that doesn't have a problem caching the entire directory.
It works well for me, so well in fact that I am moving all my stuff from Google Apps for Domains back in-house, for two reasons: 1. more control, and 2. I have had issues with Google Apps in the past, and even with a paid account the support has been lackluster. I'd like to know more about my email and be able to check logs if need be to see what is going on.