It's almost certainly either a Thales nCipher or a SafeNet. There's a very slight chance it could be IBM or one of a couple of other manufacturers. All of them can fail in weird ways.
This is more a failure of implementation than a failure of the device. You need some way to protect key backups from a failure of the HSM itself: even if two HSMs are paired online for HA, there should be a secret-shared backup held outside HSM storage. (People usually deploy both HSMs of an HA pair in the same datacenter, so you need an outside backup for DR in case of a fire or something.)
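To make "secret-shared backup" concrete, here's a rough sketch of k-of-n splitting using Shamir's scheme. This is purely illustrative and not what any particular vendor does: the function names, the choice of prime, and the 3-of-5 parameters are all my own assumptions, and in a real deployment the split would happen inside the HSM or its tooling rather than on raw key material in Python.

```python
import secrets

# Assumption: 2**521 - 1 (a Mersenne prime) as the field modulus,
# comfortably larger than a 256-bit key.
PRIME = 2**521 - 1


def split_secret(secret: int, threshold: int, shares: int) -> list[tuple[int, int]]:
    """Split `secret` into `shares` points; any `threshold` of them recover it."""
    if not 0 <= secret < PRIME:
        raise ValueError("secret must be a non-negative integer below PRIME")
    if not 1 <= threshold <= shares:
        raise ValueError("need 1 <= threshold <= shares")
    # Random polynomial of degree threshold-1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]

    def evaluate(x: int) -> int:
        # Horner's rule, all arithmetic mod PRIME.
        acc = 0
        for c in reversed(coeffs):
            acc = (acc * x + c) % PRIME
        return acc

    return [(x, evaluate(x)) for x in range(1, shares + 1)]


def recover_secret(points: list[tuple[int, int]]) -> int:
    """Lagrange-interpolate the polynomial at x = 0 from the given points."""
    secret = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        # pow(den, -1, PRIME) is the modular inverse (Python 3.8+).
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret


# Hypothetical usage: a 256-bit key split 3-of-5 across custodians.
key = secrets.randbits(256)
custodian_shares = split_secret(key, threshold=3, shares=5)
recovered = recover_secret(custodian_shares[:3])
```

The point is that no single custodian (or single site) holds the key, but any three shares stored outside the HSMs are enough to rebuild it after both units are lost.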