[Tour de SOL] Stage 1 - Week 2 Recap

Welcome back to everyone who’s been following our progress during Tour de SOL. It’s been another eventful week. We’ve shipped fixes for several of the issues identified over the past few days, while others remain outstanding because we expect them to take longer to fix.

PROGRESS ON CRITICAL BUGS

Certus One’s DoS Attack

For the first several days of last week, we kept the network offline while we attempted to quickly fix the bug exposed by Certus One’s successful DoS attack. After working on it for several days, we concluded that the fix would require additional time and effort. We therefore decided to bring the network back up first, for the following reasons:

  • We had fixes for several other issues ready to be deployed
  • A running network allows us to continue identifying more bugs
  • We agreed with the Validator community that the DoS attack would be off-limits until advised otherwise

Other High-Priority Stability Issues

Aside from the attack mentioned above, there are still several other issues that we’ll be focusing on over the coming weeks:

NETWORK UPGRADES

Tofino v0.23.4

The cluster was restarted on the 12th of February with version v0.23.4, which includes fixes in the following areas:

  • Snapshots
  • Repair
  • Gossip Network

We successfully upgraded the cluster to the new version the following day, and our bootstrap Validator node finally managed to distribute its stake to the rest of the cluster, so that it represented less than 33% of the active stake.
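
To make the 33% figure concrete, here is a minimal sketch of the arithmetic. It assumes the threshold is roughly one third of active stake, on the usual reasoning that a single node at or above that share could stall a consensus that needs a two-thirds supermajority. The node names and stake amounts are purely hypothetical and are not taken from the actual cluster.

```python
from fractions import Fraction

# Assumed threshold: one third of active stake (the post rounds this to 33%).
ONE_THIRD = Fraction(1, 3)


def stake_share(stakes, node):
    """Return the fraction of total active stake held by `node`."""
    total = sum(stakes.values())
    if total <= 0:
        raise ValueError("total active stake must be positive")
    return Fraction(stakes[node], total)


def is_below_threshold(stakes, node, threshold=ONE_THIRD):
    """True if `node` holds strictly less than `threshold` of the active stake."""
    return stake_share(stakes, node) < threshold


# Hypothetical stakes before the bootstrap node delegates to other Validators.
before = {"bootstrap": 1000, "v1": 25, "v2": 25, "v3": 25, "v4": 25}

# Hypothetical stakes after distribution: same total, spread more evenly.
after = {"bootstrap": 300, "v1": 200, "v2": 200, "v3": 200, "v4": 200}

print(is_below_threshold(before, "bootstrap"))  # bootstrap holds 10/11 of stake
print(is_below_threshold(after, "bootstrap"))   # bootstrap holds 3/11 of stake
```

Using exact fractions rather than floats avoids rounding surprises when a share sits right at the threshold.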

Tofino v0.23.5

We followed up on the 14th of February with another release, v0.23.5, to rectify a long-standing out-of-memory issue that was affecting some Validators.

FINAL COMMENTS

As of today, the network is still down because of an issue in which Validators accidentally fetch snapshots from delinquent nodes. We won’t be restarting the network until this is resolved, as a restart would likely cause the network to crash again after a few days.