
Alarming and Auto Recovering Shell Script v1.1 for Lisk v2.0.0

Posted: Mon Apr 18, 2016 10:36 pm
by sgdias
Hi all,

metal494 and I (sgdias) have developed and tested an alarm and auto-recovery script, and we are sharing it with you here.

Any suggestions for enhancements or new features are welcome.

Here it goes!

Code:

#
#                         ALARM & RECOVER v1.1 for Lisk 2.0.0
#
#           VOTE FOR ACTIVE DELEGATES SGDIAS and METAL494 !
#
#           Delegate Links:
#
#           sgdias:   https://forum.lisk.io/viewtopic.php?f=6&t=121
#           metal494: https://forum.lisk.io/viewtopic.php?f=6&t=134
#
#                   Many thanks for your support !
#


# Follow the Lisk log (even if the file is re-created) and react to each new line.
tail -Fn0 /opt/lisk/lisk-Linux-x86_64/logs.log |
while IFS= read -r line ; do

    # Any fork: send an alert only.
    if printf '%s\n' "$line" | grep "Fork"; then
        echo "Fork found: $line" | mail -s "Alarm!: Fork found" user@mail.com
    fi

    # Connection timeout: send an alert, then reboot the server.
    if printf '%s\n' "$line" | grep "ETIMEDOUT"; then
        echo "Timeout found: $line, rebooting now" | mail -s "Alarm!: Connection Timeout Found" user@mail.com
        reboot
    fi

    # Address already in use: send an alert, then reboot the server.
    if printf '%s\n' "$line" | grep "EADDRINUSE"; then
        echo "Address in use found: $line, rebooting now" | mail -s "Alarm!: Address in use Found" user@mail.com
        reboot
    fi

    # Fork with cause code 3: send an alert, then rebuild the blockchain.
    if printf '%s\n' "$line" | grep '"cause":3'; then
        echo "Fork with root cause code 3 found. Rebuilding the blockchain." | mail -s "Alarm!: Fork with root cause code 3 found" user@mail.com
        echo "Auto Rebuilding lisk..."
        bash /opt/lisk/lisk-Linux-x86_64/lisk.sh rebuild
        echo "Auto Rebuilding Done"
    fi
done

#
# Install Notes:
#
#               - The file has to be copied to your lisk folder.
#
#               - Replace the lisk path in the script with your own lisk path.
#
#               - To start the script, run it with the -c option of the sh command, as follows:
#                 sh -c alarmRecover.sh & > alarmRecover.log
#
#               - You can make the script start at reboot by adding the following line to your crontab:
#                 @reboot sh -c /opt/lisk/lisk-Linux-x86_64/alarmRecover.sh > /opt/lisk/lisk-Linux-x86_64/alarmRecover.log 2>&1
#
#               - You can customize this script by replacing the log patterns, changing the command executed for each event,
#                 or changing the alarm message sent by email.
#
#
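
For anyone customizing the patterns as described in the notes above, one possible approach is to keep the pattern-to-event mapping in a single function. This is only a sketch: `classify_line` and the event names are hypothetical and not part of the posted script. Note one difference from the posted script: this returns a single event per line, with the cause 3 check first, whereas the posted script would run both the "Fork" and the cause 3 handlers for such a line.

```shell
#!/bin/sh
# Hypothetical helper: map one log line to an event name.
# The patterns are the ones used in the posted script; the event
# names (fork3, fork, timeout, addrinuse, none) are made up here.
classify_line() {
    case "$1" in
        *'"cause":3'*) echo "fork3" ;;      # checked first: more specific than "Fork"
        *Fork*)        echo "fork" ;;
        *ETIMEDOUT*)   echo "timeout" ;;
        *EADDRINUSE*)  echo "addrinuse" ;;
        *)             echo "none" ;;       # nothing of interest on this line
    esac
}
```

Inside the while loop this could be used as `event=$(classify_line "$line")`, followed by a `case "$event" in ...` that runs the mail, reboot or rebuild command for each event.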



Best Regards,

Santiago.

Re: Alarming and Auto Recovering Shell Script

Posted: Mon Apr 18, 2016 10:52 pm
by Gr33nDrag0n
Hey, nice script, will definitely check it out sooner rather than later.

Re: Alarming and Auto Recovering Shell Script

Posted: Mon Apr 18, 2016 11:14 pm
by sgdias
Gr33nDrag0n wrote: Hey, nice script, will definitely check it out sooner rather than later.


Thanks Gr33nDrag0n, I'm glad you like it. :)

Re: Alarming and Auto Recovering Shell Script

Posted: Mon Apr 18, 2016 11:39 pm
by ViperTKD
Great! I had started something along the same lines. Scratching it off my to-do list now, I'll use yours! Awesome job!

Re: Alarming and Auto Recovering Shell Script

Posted: Wed May 04, 2016 7:06 am
by sgdias
Updated script and notes for Lisk version 2.0.0

Re: Alarming and Auto Recovering Shell Script v1.1 for Lisk v2.0.0

Posted: Sat May 21, 2016 3:39 am
by FredDag
Hola sgdias. Are you able to help me troubleshoot the install of this?
I get the message below when running this script (which I have altered slightly)

----------------------------------------------------------------------------------------------------------
ubuntu@ip-172-30-0-66:~/lisk$ sh -c alarmRecover.sh & > alarmRecover.log
[1] 27996
ubuntu@ip-172-30-0-66:~/lisk$ sh: 1: alarmRecover.sh: not found
^C
[1]+ Exit 127 sh -c alarmRecover.sh
----------------------------------------------------------------------------------------------------------


alarmRecover.sh
Now using Amazon PUSH for notifications. These have been tested as working before adding them to this script.
Now using ~/lisk/ as the install directory

Code:

tail -Fn0 ~/lisk/logs.log |
while read line ; do

    echo "$line" | grep "Fork"
    if [ $? = 0 ]; then
        echo "Fork found: $line"
        sns-publish arn:aws:sns:ap-southeast-2:732843027381:Error-Fork-3 --message "Alert Fork" --subject "Alert Fork" --region ap-southeast-2
    fi

    echo "$line" | grep "ETIMEDOUT"
    if [ $? = 0 ]; then
        echo "Timeout found: $line, rebooting now"
        sns-publish arn:aws:sns:ap-southeast-2:732843027381:Error-Fork-3 --message "Alert ETIMEDOUT Rebooting" --subject "Alert ETIMEDOUT" --region ap-southeast-2
        reboot
    fi

    echo "$line" | grep "EADDRINUSE"
    if [ $? = 0 ]; then
        echo "Address in use found: $line, rebooting now"
        sns-publish arn:aws:sns:ap-southeast-2:732843027381:Error-Fork-3 --message "Alert EADDRINUSE Rebooting" --subject "Alert EADDRINUSE" --region ap-southeast-2
        reboot
    fi

    echo "$line" | grep "\"cause\":3"
    if [ $? = 0 ]; then
        echo "Fork with root cause code 3 found. Rebuilding the blockchain."
        echo "Auto Rebuilding lisk..."
        sns-publish arn:aws:sns:ap-southeast-2:732843027381:Error-Fork-3 --message "Alert Fork3 Rebuilding" --subject "Alert Fork3 Rebuilding" --region ap-southeast$
        bash ~/lisk/lisk.sh rebuild
        echo "Auto Rebuilding Done"
    fi
done





SOLVED: by starting the script with full paths:

sh -c /home/ubuntu/lisk/alarmRecover.sh & > /home/ubuntu/lisk/alarmRecover.log
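
A note on the commands above, as far as standard sh behaviour goes: `sh -c alarmRecover.sh` treats its argument as a command to look up, so without a path the shell searches PATH and reports "not found" when the current directory is not on it. Also, `&` ends the command, so the `> alarmRecover.log` that follows is parsed as a separate empty command that only truncates the file, and the script's output is not captured. A minimal sketch of a launcher with the redirections placed before the `&` (the function name `start_in_background` is hypothetical):

```shell
#!/bin/sh
# Hypothetical helper: start a script in the background with stdout and
# stderr captured in a log file. The redirections come BEFORE the "&".
start_in_background() {
    script=$1     # path to the script to run
    logfile=$2    # file that receives both stdout and stderr
    sh "$script" > "$logfile" 2>&1 &
    echo $!       # print the PID of the background job
}
```

With the paths from the post above, this would be called as `start_in_background /home/ubuntu/lisk/alarmRecover.sh /home/ubuntu/lisk/alarmRecover.log`.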

Re: Alarming and Auto Recovering Shell Script v1.1 for Lisk v2.0.0

Posted: Sat May 21, 2016 8:18 am
by FredDag
Hi sgdias & metal494. Will you keep on developing this script?

I tested the Fork3 alert today.... received 200+ email alerts in the space of 1-2 minutes :)
Very funny as they triggered an alarm/siren on my phone.

Is it possible to pause the monitoring/alerting process until the rebuild is completed?

Re: Alarming and Auto Recovering Shell Script v1.1 for Lisk v2.0.0

Posted: Tue Jun 07, 2016 7:58 pm
by sgdias
FredDag wrote:Hi sgdias & metal494. Will you keep on developing this script?

I tested the Fork3 alert today.... received 200+ email alerts in the space of 1-2 minutes :)
Very funny as they triggered an alarm/siren on my phone.

Is it possible to pause the monitoring/alerting process until the rebuild is completed?


Hi FredDag, the idea is to at least maintain it over time and across Lisk versions. Regarding pausing the monitoring: if I'm not mistaken, the rebuild deletes the logs.log file and creates it again, so the monitoring should pause automatically. Either way, the rebuild blocks the while loop while it is running, so the script should continue reading from the log once it's done.
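
If bursts of alerts like the 200+ reported above remain a problem, one option would be a cooldown per event type. The following is only a sketch under stated assumptions and is not part of the posted script: `should_alert`, the `COOLDOWN` value of 600 seconds and the `STAMP_DIR` location are all hypothetical, and it relies on `date +%s`, which GNU and BSD `date` support.

```shell
#!/bin/sh
# Hypothetical cooldown: allow at most one alert per event name every
# COOLDOWN seconds. State is kept as one timestamp file per event.
COOLDOWN=${COOLDOWN:-600}
STAMP_DIR=${STAMP_DIR:-/tmp/alarmRecover-stamps}

should_alert() {
    event=$1
    now=$(date +%s)               # current time in seconds since the epoch
    stamp="$STAMP_DIR/$event"
    mkdir -p "$STAMP_DIR"
    last=0
    [ -f "$stamp" ] && last=$(cat "$stamp")
    if [ $((now - last)) -ge "$COOLDOWN" ]; then
        echo "$now" > "$stamp"    # remember when this event last alerted
        return 0                  # enough time has passed: send the alert
    fi
    return 1                      # still cooling down: stay quiet
}
```

In the loop, the mail or sns-publish call for the cause 3 case could then be wrapped as `if should_alert fork3; then ...; fi`, so repeated matching lines within the cooldown window produce a single alert.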