
Attach OVH VIP to the CARP MASTER MAC
Needs Revision, Public

Authored by yousra on Tue, Mar 31, 14:17.

Details

Summary

Add a Python script to automatically manage OVH failover IP (VIP) assignment based on CARP state changes.
The script runs on router nodes and ensures the VIP is attached to the MAC of the MASTER node only.

Ref T2276

Test Plan
  • Triggered failover between routers
  • Confirmed that the VIP is correctly assigned to the MASTER MAC in OVH
  • Confirmed that it is removed from the previous MAC
  • Verified that everything works as expected

Diff Detail

Repository
rOPS Nasqueron Operations
Lint
Lint Skipped
Unit
No Test Coverage
Branch
script-ovh-carp
Build Status
Buildable 6550
Build 6834: arc lint + arc unit

Event Timeline

yousra requested review of this revision. Tue, Mar 31, 14:17
yousra created this revision.
This revision is now accepted and ready to land. Tue, Mar 31, 14:18

Fix the path of the script: /usr/local/scripts/carp/carp-ovh.py

dereckson requested changes to this revision. Tue, Mar 31, 21:16

If we're going to flood /var/log/messages with CARP debug information, perhaps we should create a separate log topic, but that can be another change. I've created T2292.

roles/router/carp/files/carp-ovh-switch.sh
1

This file can be skipped, as we can call the Python script directly:

import sys


#   -------------------------------------------------------------
#   Application entry point
#   - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -


def run(interface, state):
    # here the logic part (lines 149-188)
    pass


if __name__ == "__main__":
    argc = len(sys.argv)

    if argc < 3:
        print(f"Usage: {sys.argv[0]} <subsystem> <state>", file=sys.stderr)
        sys.exit(1)

    # The interface name is the part of the subsystem after the @
    subsystem = sys.argv[1]

    try:
        interface = subsystem.split("@", 1)[1]
    except IndexError:
        print("Subsystem doesn't contain @", file=sys.stderr)
        sys.exit(2)

    run(interface, sys.argv[2])
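
The subsystem parsing above can also be isolated in a small helper, which makes it easy to check without triggering a real failover. This is only a sketch: the helper name parse_subsystem and the sample value "1@vmx0" are hypothetical, and it assumes the subsystem has the form <vhid>@<interface>.

```python
def parse_subsystem(subsystem):
    """Split a "<vhid>@<interface>" string into (vhid, interface).

    Hypothetical helper: the name and the tuple return value are
    illustration choices, not part of the reviewed script.
    """
    vhid, separator, interface = subsystem.partition("@")

    # partition() returns an empty separator when "@" is absent
    if not separator or not interface:
        raise ValueError(f"Expected <vhid>@<interface>, got {subsystem!r}")

    return vhid, interface
```

With this helper, parse_subsystem("1@vmx0") returns ("1", "vmx0"), and a value without an interface part raises ValueError, which the entry point can turn into exit code 2.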
roles/router/carp/files/carp-ovh.py
1

For executables, best practice is to avoid hardcoding the interpreter path, for example by using an env-based shebang such as #!/usr/bin/env python3.

25

Follow https://github.com/nasqueron/snippets/blob/main/python/command.py

We need to organize the script in three parts:

(1) the config
(2) the functions and helper functions applying the changes
(3) the entry point with if __name__ == "__main__"

168

At the last iteration of the loop, the delay is 16 seconds.

By then, we'll have waited 31 seconds + API response time.

If we abort here, we break the router, as the MAC address doesn't go anywhere.

We need to retry *a lot* more times.

Don't break out of the loop before 1 month has elapsed.
We already break with the exception if there is any issue with the OVH request.

Also, we need to find a way to warn about failures when we reach 128 seconds.
That's normally T771's job.
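
A minimal sketch of what such a retry loop could look like, not the script's actual code. It assumes the delay starts at 1 second and doubles (1 + 2 + 4 + 8 + 16 = 31 seconds, matching the figures above), that the delay is capped at 128 seconds, and that "1 month" means 30 days. The names backoff_delays, retry, MAX_DELAY, WARN_DELAY and ONE_MONTH are hypothetical, and the warn callback stands in for whatever alerting T771 provides.

```python
import time

ONE_MONTH = 30 * 24 * 3600  # give-up deadline in seconds (assumed 30 days)
MAX_DELAY = 128             # assumed cap on the delay between attempts
WARN_DELAY = 128            # delay at which a failure warning is emitted


def backoff_delays(initial=1, max_delay=MAX_DELAY, deadline=ONE_MONTH):
    """Yield delays doubling up to max_delay until deadline seconds are spent."""
    delay, waited = initial, 0
    while waited + delay <= deadline:
        yield delay
        waited += delay
        delay = min(delay * 2, max_delay)


def retry(operation, sleep=time.sleep, warn=print, delays=None):
    """Call operation() until it returns True, waiting between attempts.

    Returns True on success, False if every attempt failed.
    Exceptions raised by operation() are not caught and still abort.
    """
    warned = False
    for delay in backoff_delays() if delays is None else delays:
        if operation():
            return True
        # Warn only once, the first time the delay reaches the threshold
        if delay >= WARN_DELAY and not warned:
            warn(f"Still failing, retry delay reached {delay} seconds")
            warned = True
        sleep(delay)

    # One last attempt after the final wait
    return bool(operation())
```

Capping the delay keeps the router retrying every couple of minutes for the whole month instead of waiting hours between attempts. Passing sleep and warn as parameters lets the loop be exercised without actually waiting.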

roles/router/carp/init.sls
44

There are dedicated folders for executables: /usr/local/bin or /usr/local/libexec

Executables don't have an extension.

45
This revision now requires changes to proceed. Tue, Mar 31, 21:17