There's a nice distributed systems class at MIT (http://nil.csail.mit.edu/6.824/2015/) where one of the labs consists of implementing Paxos in Go. They provide evaluation scripts to see if you got it right.
I'd like to highlight a link in that post to "Paxos Made Live", a paper from Google about a real-world implementation of Paxos. I found the paper very enlightening, and it gives much credence to simpler algorithms like Raft.
Paxos is really a family of similar algorithms. You could have classic Paxos, full Byzantine Paxos (http://research.microsoft.com/en-us/um/people/lamport/tla/by...), multi-paxos, or fast-paxos (http://research.microsoft.com/apps/pubs/default.aspx?id=6462...). So in a sense you could argue that it is more general than Raft. The issues that arise when building production systems were not as well understood 20 years ago. Raft is similar to multi-paxos but more fully specifies things that earlier academic work left out, like transaction log snapshotting, adding new members to the quorum, and transaction log truncation. I don't agree with Raft's authors that Raft is easier to understand, but I do agree that it is a more complete spec for building a subset of real-world systems.
Here is my understanding. Please correct me if I am wrong.
LATENCY
multi-paxos and raft are equal: each takes 4 hops to reach consensus.
fast-paxos gives two-hop consensus when there is no race, which is optimal.
generalized-paxos gives two-hop consensus for non-conflicting, commutative operations.
genuine-generalized-consensus gives two-hop consensus for non-conflicting, commutative operations and, I think, three-hop consensus for the rest, which is optimal.
BANDWIDTH
All paxos variants (except classic-paxos of course) allow multiple consensus operations in parallel. Not sure about raft.
STORAGE
cheap-paxos (without reconfiguration) only needs two replicas to support one-node failure; in general, K+1 replicas for K failures.
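For comparison, a small Go sketch of the replica arithmetic. It contrasts the K+1 figure above with the 2K+1 that a plain majority-quorum protocol (classic paxos, multi-paxos, raft) needs for K crash failures. The function names are made up for illustration. One caveat, as I understand the Cheap Paxos paper: the K+1 counts only the main replicas, and the protocol also relies on K cheap auxiliary nodes that take part only while a main replica is down.

```go
package main

import "fmt"

// majorityReplicas is the replica count a majority-quorum protocol
// needs to tolerate k crash failures.
func majorityReplicas(k int) int {
	return 2*k + 1
}

// majorityQuorum is the smallest majority of n replicas.
func majorityQuorum(n int) int {
	return n/2 + 1
}

// cheapPaxosMainReplicas is the K+1 figure quoted above: full replicas only,
// not counting auxiliary nodes (see the caveat in the lead-in).
func cheapPaxosMainReplicas(k int) int {
	return k + 1
}

func main() {
	for k := 1; k <= 3; k++ {
		n := majorityReplicas(k)
		fmt.Printf("K=%d: majority quorum needs %d replicas (quorum %d), cheap-paxos needs %d main replicas\n",
			k, n, majorityQuorum(n), cheapPaxosMainReplicas(k))
	}
}
```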