If you partition off a node, you can start a rejoin to ANY node and it will proceed as far as it can before giving you an unhelpful error message.
This was reproduced in an n5k4 scenario where three nodes were killed, so the two remaining nodes would shut themselves down due to partition detection. But they won't do that until after the timeout expires. In the meantime, the three killed nodes can be restarted and can try to rejoin to the two living nodes. The rejoin actually starts to work until the living nodes realize the timeout is up and shut themselves down.
Feels like we should push rejoin requests through ZooKeeper.
We can't prevent simultaneous failures during a rejoin, so even if we check that every node we think is up is actually up before the rejoin starts, that status could change AT ANY TIME. So we still need to resolve failures that happen during a rejoin (or even simultaneously with its start), but we should be able to deny (or delay) rejoin starts while timeouts to nodes are going through fault resolution.
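A rough sketch of the deny-or-delay idea, to make the proposal concrete. Everything here is hypothetical: `RejoinGate` and its method names are invented for illustration, and the state lives in process memory as a stand-in for whatever shared coordination point (ZooKeeper, for instance) would actually hold it. It only covers admission; failures that happen after a rejoin has been admitted still need their own resolution path.

```python
import threading


class RejoinGate:
    """Hypothetical admission gate: rejoin starts are refused while any
    node timeout is still going through fault resolution."""

    def __init__(self):
        self._lock = threading.Lock()
        self._suspect = set()    # nodes with an unresolved timeout
        self._rejoining = set()  # nodes whose rejoin was admitted and is still running

    def timeout_started(self, node_id):
        # Called when a timeout to node_id begins fault resolution.
        with self._lock:
            self._suspect.add(node_id)

    def fault_resolved(self, node_id):
        # Called once the cluster has agreed on the outcome for node_id.
        with self._lock:
            self._suspect.discard(node_id)

    def try_start_rejoin(self, node_id):
        # Returns True if the rejoin may start now. False means "denied for
        # now"; the caller can fail fast or retry later (the delay option).
        with self._lock:
            if self._suspect:
                return False
            self._rejoining.add(node_id)
            return True

    def finish_rejoin(self, node_id):
        # Called when the rejoin completes or is abandoned.
        with self._lock:
            self._rejoining.discard(node_id)


gate = RejoinGate()
gate.timeout_started("node-1")              # a living node has timed out on a peer
denied = gate.try_start_rejoin("node-3")    # refused: fault resolution in progress
gate.fault_resolved("node-1")
admitted = gate.try_start_rejoin("node-3")  # admitted: nothing pending
```

In the n5k4 case above, the restarted nodes would be refused up front with a clear reason instead of getting partway through a rejoin before the living nodes shut down.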
Let's try this in system tests. Probably localcluster unit tests too.