-
Notifications
You must be signed in to change notification settings - Fork 20
[#704] Add script to automatically fix node corruption #766
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
22fe622 to
9499993
Compare
9499993 to
10ae4fd
Compare
10ae4fd to
d74780d
Compare
DMozhevitin
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The restoration logic looks ok overall, so it seems that you're moving in the right direction, although I have some comments
| print("Could not delete node data dir. Manual restoration is required") | ||
|
|
||
| snapshot_array = None | ||
| config = {"network": os.environ["NETWORK"], "history_mode": history_mode} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd suggest modifying extract_relevant_snapshot to accept network and history_mode parameters separately rather than config dict, since the config is an attribute of wizard object, but we don't have it there
| return False | ||
|
|
||
|
|
||
| def fetch_snapshot(url, sha256=None): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's better to extract all snapshot-related stuff to tezos_baking/snapshot.py rather than keeping it in util
| for json_url in default_providers.values(): | ||
| with urllib.request.urlopen(json_url) as url: | ||
| snapshot_array = json.load(url)["data"] | ||
| if snapshot_array is not None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd suggest to pick not the first suitable provider, but the one which have a snapshot with higher block_height
Unlike tezos-setup where we ask user to chose provider to download from (if there is no such a snapshot in provider, we use other ones as fallback), here we capable to chose the snapshot with most recent block by ourselves
| os.remove(snapshot_path) | ||
|
|
||
| if not reinstallation_result.returncode: | ||
| print("Recovery from corruption was successfull") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if output of these prints will be redirected to the systemd service logs, since we're going to use this script as a part of the service
Did you check it?
| if not reinstallation_result.returncode: | ||
| print("Recovery from corruption was successfull") | ||
| else: | ||
| print("Recovery from corruption failed. Manual restoration is required") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suppose that the script should fail with non-zero exit code at this point (and in other places when the script failed to automatically restore node storage)
| def main(): | ||
| is_corrupted = check_node_corruption() | ||
| is_baking_installed = ( | ||
| b"tezos-baking" in get_proc_output("which octez-baking").stdout |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-
There is no
octez-bakingpackage, buttezos-baking.octez-*is about binaries from tezos/tezos that we build and provide viatezos-*packages (i.e. installingtezos-nodepackage viasudo apt-get install tezos-nodeinstallsoctez-nodebinary). But apart from that,tezos-bakingis a special case, since there is no such binary asoctez-baking, it's rather set ofsystemdservices and other things that are used together withoctezbinaries. -
whichcommand searches the path of executable. Considering words above, it can't be used for checking the presence oftezos-bakingpackage in the system, since this package isn't an executable.
So we need to use another approach to checking whether tezos-baking is installed. The first thing that comes to mind is apt list --installed | grep tezos-baking, but I'm pretty sure that there are better alternatives
Description
Add script that performs the following:
Related issue(s)
Resolves #704
Related changes (conditional)
I checked whether I should update the README
I checked whether native packaging works, i.e. native binary packages
can be successfully built.
Stylistic guide (mandatory)