Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions SUMMARY.md
Original file line number Diff line number Diff line change
Expand Up @@ -26,6 +26,7 @@
* [Unjailing a jailed validator](docs/validator-guide/unjail.md)
* [Move validator to a different machine](docs/validator-guide/move-validator.md)
* [Backup and restore node keys with Hashicorp Vault](docs/validator-guide/backup-and-restore.md)
* [Re-enable pruning and recovering node db](docs/validator-guide/reenable-pruning.md)
* [Network-wide Software Upgrades](docs/upgrades/README.md)
* [Upgrade Guides](docs/upgrades/upgrade-guides/README.md)
* [Upgrade to v0.6.x](docs/upgrades/upgrade-guides/v0.6-upgrade.md)
Expand Down
104 changes: 104 additions & 0 deletions docs/validator-guide/reenable-pruning.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,104 @@
# Guide for re-enabling pruning and recovering node db

This guide is specifically made for the the validators/node operators affected by the pruning issue encountered following our v4.x upgrade. This issue required some validators/node operators having to disable pruning entirely.

Re-enabling pruning (to `default`/`custom` from `nothing`) on the affected nodes will cause the nodes to halt again as the database pruning resumes its operation. To avoid this it is recommended to reset your node db and recover it using state sync or a db snapshot. The nodes that weren't affected by the pruning issue or that had recovered by the time the pruning fix was rolled out do not have to undergo this procedure.

Following this procedure will significantly reduce the disk space required for your node’s regular operations, thereby lowering operational costs (on the nodes we manage, we observed storage usage drop from 700+ GB to under 10 GB). Additionally, running a node with less disk usage will likely improve performance.

You have two options here:

1) Reset via State Sync
2) Reset by using DB snapshot

## State Sync

Stop the systemd service of the node.
`sudo systemctl stop cheqd-cosmovisor.service`

Take a backup of the priv_validator_state.json. **This step is very Important for validator nodes:**

```bash
cp ~/.cheqdnode/data/priv_validator_state.json ~/priv_validator_state.json
```

Turn the pruning strategy to default or custom based on your preference in the `~/.cheqdnode/config/app.toml` file.

Reset the database:

```bash
cheqd-noded tendermint unsafe-reset-all --home ~/.cheqdnode/ --keep-addr-book
```

Restore the priv_validator_state.json. **This step is very Important for validator nodes:**

```bash
cp ~/priv_validator_state.json ~/.cheqdnode/data/priv_validator_state.json
```

Enable statesync on the node and provide the required variables:

```bash
STATESYNC_RPC="https://rpc.cheqd.net:443"
LATEST_HEIGHT=$(curl -s $STATESYNC_RPC/block | jq -r .result.block.header.height)
BLOCK_HEIGHT=$((LATEST_HEIGHT - 2000))
TRUST_HASH=$(curl -s "$STATESYNC_RPC/block?height=$BLOCK_HEIGHT" | jq -r .result.block_id.hash)
sed -i.bak -E "s|^(enable[[:space:]]+=[[:space:]]+).*$|\1true| ; \
s|^(rpc_servers[[:space:]]+=[[:space:]]+).*$|\1\"$STATESYNC_RPC,$STATESYNC_RPC\"| ; \
s|^(trust_height[[:space:]]+=[[:space:]]+).*$|\1$BLOCK_HEIGHT| ; \
s|^(trust_hash[[:space:]]+=[[:space:]]+).*$|\1\"$TRUST_HASH\"|" ~/.cheqdnode/config/config.toml
```

Start the node:

```bash
sudo systemctl restart cheqd-cosmovisor.service
```

The node should start looking for statesync chunks from it's peers and begin the restoration process in a few minutes. After some time, it should catch up with the network and continue signing blocks.

## Snapshot

Stop the systemd service of the node:

```bash
sudo systemctl stop cheqd-cosmovisor.service`
```

Take a backup of the priv_validator_state.json. **This step is very Important for validator nodes:**

```bash
cp ~/.cheqdnode/data/priv_validator_state.json ~/priv_validator_state.json
```

Turn the pruning strategy to default or custom based on your preference in the `~/.cheqdnode/config/app.toml` file.

Reset the database:

```bash
cheqd-noded tendermint unsafe-reset-all --home ~/.cheqdnode/ --keep-addr-book`
```

Restore the priv_validator_state.json. **This step is very Important for validator nodes:**

```bash
cp ~/priv_validator_state.json ~/.cheqdnode/data/priv_validator_state.json
```

Download the latest lz4 tar archive for mainnet from [our snapshots page](https://snapshots.cheqd.net/#mainnet/)

```bash
wget https://cheqd-node-backups.ams3.digitaloceanspaces.com/mainnet/<timestamp>/cheqd-mainnet-1_<timestamp>.tar.lz4
```

Unpack the tar archive and restore the db:

```bash
lz4 -c -d cheqd-mainnet-1_<timestamp>.tar.lz4 | tar -x -C ~/.cheqdnode`
```

Start the node

```bash
sudo systemctl restart cheqd-cosmovisor.service
```