Feed

Aug 3 2024

dereckson added a comment to T1994: Upgrade Salt repository on Debian.
Eglide
$ salt-minion --versions
Salt Version:
          Salt: 3007.1
Aug 3 2024, 13:01 · Eglide, Servers, Salt
dereckson added a subtask for T1950: Deploy PHP 8.3: T1995: PHP 8.2 and PHP 8.3 seems both to be installed on Eglide.
Aug 3 2024, 12:59 · Servers, PHP 8.x support
dereckson edited projects for T1994: Upgrade Salt repository on Debian, added: Servers, Eglide; removed Nasqueron Operations Squad, discussion.
Aug 3 2024, 12:56 · Eglide, Servers, Salt
dereckson added a revision to T1982: Upgrade from Python 3.9 to Python 3.11+: D3391: Bump Sphinx package name to use Python3.11.
Aug 3 2024, 12:18 · Servers
dereckson claimed T1991: Context has again been lost on /var/log/www.

We can actually provide P352 as hotfix.

Aug 3 2024, 12:13 · upstream, Regression, Servers, Salt
dereckson added a revision to T1992: Install patch on redhat family as part of core: D3390: Install patch on RHEL-family servers.
Aug 3 2024, 12:08 · Servers
dereckson added a comment to T1992: Install patch on redhat family as part of core.

patch is available on Eglide as part of build-essential, so presumed OK for Debian

Aug 3 2024, 12:06 · Servers
dereckson added a parent task for T1992: Install patch on redhat family as part of core: T1991: Context has again been lost on /var/log/www.
Aug 3 2024, 12:04 · Servers
dereckson added a subtask for T1991: Context has again been lost on /var/log/www: T1992: Install patch on redhat family as part of core.
Aug 3 2024, 12:04 · upstream, Regression, Servers, Salt
dereckson added a comment to T1982: Upgrade from Python 3.9 to Python 3.11+.

certbot against Python 3.11 should be checked on dwellers and docker-002

Aug 3 2024, 11:45 · Servers
dereckson moved T1992: Install patch on redhat family as part of core from Backlog to Working on on the Servers board.
Aug 3 2024, 11:30 · Servers
dereckson lowered the priority of T1991: Context has again been lost on /var/log/www from High to Normal.
Aug 3 2024, 10:11 · upstream, Regression, Servers, Salt
dereckson added a comment to T1991: Context has again been lost on /var/log/www.

I've applied P352 to replace egrep with grep -E on dwellers and docker-002.

Aug 3 2024, 10:11 · upstream, Regression, Servers, Salt
dereckson added a comment to T1992: Install patch on redhat family as part of core.

I wanted to apply P354 to fix the Salt SELinux issue with patch -p1 < ~/egrep.patch on docker-002.

Aug 3 2024, 10:09 · Servers
dereckson updated the task description for T1992: Install patch on redhat family as part of core.
Aug 3 2024, 10:08 · Servers
dereckson triaged T1992: Install patch on redhat family as part of core as Normal priority.
Aug 3 2024, 10:08 · Servers

Jul 31 2024

dereckson moved T1991: Context has again been lost on /var/log/www from Backlog to Bug and issues on the Salt board.
Jul 31 2024, 16:30 · upstream, Regression, Servers, Salt
dereckson added a project to T1991: Context has again been lost on /var/log/www: upstream.
Jul 31 2024, 16:30 · upstream, Regression, Servers, Salt
dereckson added a comment to T1991: Context has again been lost on /var/log/www.

Already reported upstream: https://github.com/saltstack/salt/issues/65608

Jul 31 2024, 16:30 · upstream, Regression, Servers, Salt
dereckson added a comment to T1991: Context has again been lost on /var/log/www.
Complector
$ cd /opt/salt/nasqueron-operations
$ salt dwellers state.apply roles/webserver-core/nginx/config
[…]
----------
          ID: selinux_context_nginx_logs
    Function: selinux.fcontext_policy_present
        Name: /var/log/www
      Result: False
     Comment: An exception occurred in this state: Traceback (most recent call last):
                File "/opt/saltstack/salt/lib/python3.10/site-packages/salt/state.py", line 2428, in call
                  ret = self.states[cdata["full"]](
                File "/opt/saltstack/salt/lib/python3.10/site-packages/salt/loader/lazy.py", line 160, in __call__
                  ret = self.loader.run(run_func, *args, **kwargs)
                File "/opt/saltstack/salt/lib/python3.10/site-packages/salt/loader/lazy.py", line 1269, in run
                  return self._last_context.run(self._run_as, _func_or_method, *args, **kwargs)
                File "/opt/saltstack/salt/lib/python3.10/site-packages/salt/loader/lazy.py", line 1284, in _run_as
                  return _func_or_method(*args, **kwargs)
                File "/opt/saltstack/salt/lib/python3.10/site-packages/salt/loader/lazy.py", line 1317, in wrapper
                  return f(*args, **kwargs)
                File "/opt/saltstack/salt/lib/python3.10/site-packages/salt/states/selinux.py", line 326, in fcontext_policy_present
                  current_state = __salt__["selinux.fcontext_get_policy"](
                File "/opt/saltstack/salt/lib/python3.10/site-packages/salt/loader/lazy.py", line 160, in __call__
                  ret = self.loader.run(run_func, *args, **kwargs)
                File "/opt/saltstack/salt/lib/python3.10/site-packages/salt/loader/lazy.py", line 1269, in run
                  return self._last_context.run(self._run_as, _func_or_method, *args, **kwargs)
                File "/opt/saltstack/salt/lib/python3.10/site-packages/salt/loader/lazy.py", line 1284, in _run_as
                  return _func_or_method(*args, **kwargs)
                File "/opt/saltstack/salt/lib/python3.10/site-packages/salt/modules/selinux.py", line 507, in fcontext_get_policy
                  "filespec": parts.group(1).strip(),
              AttributeError: 'NoneType' object has no attribute 'group'
     Started: 16:25:51.413301
    Duration: 391.186 ms
     Changes:
----------
          ID: selinux_context_nginx_logs_applied
    Function: selinux.fcontext_policy_applied
        Name: /var/log/www
      Result: True
     Comment: SElinux policies are already applied for filespec "/var/log/www"
     Started: 16:25:51.804764
    Duration: 6.322 ms
     Changes:
----------
[…]
Jul 31 2024, 16:27 · upstream, Regression, Servers, Salt
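
The traceback above ends with `parts.group(1)` being called on `None`, which is what happens when a regular expression does not match the line it is given. Below is a minimal Python sketch of that failure mode with a defensive guard. The pattern, the function name `parse_fcontext_line` and the sample lines are hypothetical illustrations, not Salt's actual code. The warning line is an assumption about what recent GNU grep releases print when `egrep` is invoked.

```python
import re

# Hypothetical pattern, NOT the one used by Salt: a file specification,
# a file type and an SELinux context separated by whitespace.
LINE_RE = re.compile(
    r"^(?P<filespec>\S+)\s+"
    r"(?P<filetype>all files|directory|regular file)\s+"
    r"(?P<context>\S+)$"
)


def parse_fcontext_line(line):
    """Parse one policy line; return None instead of raising when it does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        # Without this guard, match.group(...) raises
        # AttributeError: 'NoneType' object has no attribute 'group'
        return None
    return {
        "filespec": match.group("filespec"),
        "filetype": match.group("filetype"),
        "context": match.group("context"),
    }


# Sample lines written for this sketch, not captured from a server.
policy_line = "/var/log/www(/.*)?    all files    system_u:object_r:httpd_log_t:s0"
warning_line = "egrep: warning: egrep is obsolescent; using grep -E"

parsed = parse_fcontext_line(policy_line)
skipped = parse_fcontext_line(warning_line)
```

With the guard in place, an unexpected line such as a deprecation warning is skipped rather than aborting the whole state run.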
DorianWinty added a comment to T1505: Automate Let's Encrypt TLS certificates management for every server.

The devcentral.nasqueron.org certificate expired on 2024-07-31 at 12:00.

Jul 31 2024, 16:24 · Servers
dereckson added a comment to T1991: Context has again been lost on /var/log/www.

Issue can be reproduced on Dwellers:

Jul 31 2024, 16:23 · upstream, Regression, Servers, Salt
dereckson added a project to T1991: Context has again been lost on /var/log/www: Regression.
Jul 31 2024, 16:22 · upstream, Regression, Servers, Salt
dereckson triaged T1991: Context has again been lost on /var/log/www as High priority.
Jul 31 2024, 16:21 · upstream, Regression, Servers, Salt

Jul 30 2024

DorianWinty added a subtask for T1930: Postfix Provisioning: T1990: Export metrics for Postfix.
Jul 30 2024, 20:59 · Mail, Restricted Project, Servers

Jul 29 2024

dereckson moved T1989: Merge Nasqueron infrastructure reference into ops grimoire from Backlog - On hold pending T1475 to Checks after T1475 on the Mail board.
Jul 29 2024, 18:58 · Mail, Nasqueron Operations Squad, Servers, documentation
dereckson triaged T1989: Merge Nasqueron infrastructure reference into ops grimoire as Normal priority.
Jul 29 2024, 18:58 · Mail, Nasqueron Operations Squad, Servers, documentation

Jul 27 2024

dereckson added a revision to T1762: Deploy NetBox: D3385: Switch from fixes to flags in node pillar.
Jul 27 2024, 23:19 · Restricted Project, Servers, Drake network
dereckson raised the priority of T1939: Implement blue/green deployment or immutable artefacts for router-001 from Low to Normal.

Increasing priority as FreeBSD 13.2 has now been EOL for one month (since 2024-06-30).

Jul 27 2024, 20:38 · Servers, Drake network
dereckson moved T1757: docker-001 routing for drake doesn't work on boot from IntraNought to IntraNought / GRE tunnels on the Drake network board.
Jul 27 2024, 20:33 · Operations sprints (Ignite Alkane Propulsion), Salt, Drake network, Servers, Nasqueron Docker deployment squad
dereckson added a revision to T1762: Deploy NetBox: D3383: Drop network:ipv6_native from node pillar.
Jul 27 2024, 19:24 · Restricted Project, Servers, Drake network
dereckson added a revision to T1623: Deploy Prometheus to gain observability: D3381: Configure Docker metrics service in firewalld.
Jul 27 2024, 17:07 · Monitoring and reporting, Operations sprints (Consolidate them all), Servers
dereckson added a revision to T651: Deploy Grafana: D3380: Set correct Grafana URL.
Jul 27 2024, 13:57 · Monitoring and reporting, Operations sprints (Ignite Alkane Propulsion), Servers
dereckson added a revision to T651: Deploy Grafana: D3379: Move Grafana plugins directory to default location.
Jul 27 2024, 13:51 · Monitoring and reporting, Operations sprints (Ignite Alkane Propulsion), Servers
dereckson added a comment to T1633: Collect metrics from RabbitMQ.

Dashboard: https://grafana.nasqueron.org/d/Kn5xm-gZk/rabbitmq-overview?orgId=1&refresh=15s

Jul 27 2024, 12:14 · Operations sprints (Consolidate them all), Servers

Jul 26 2024

dereckson removed a project from T651: Deploy Grafana: Nasqueron Docker deployment squad.

Not deployed to Docker but bare-metal.

Jul 26 2024, 23:09 · Monitoring and reporting, Operations sprints (Ignite Alkane Propulsion), Servers
dereckson moved T651: Deploy Grafana from Backlog - Monitoring / misc to Working on on the Operations sprints (Ignite Alkane Propulsion) board.
Jul 26 2024, 23:09 · Monitoring and reporting, Operations sprints (Ignite Alkane Propulsion), Servers
dereckson moved T651: Deploy Grafana from Backlog to Working on on the Servers board.
Jul 26 2024, 23:09 · Monitoring and reporting, Operations sprints (Ignite Alkane Propulsion), Servers
dereckson added a comment to T650: Deploy PCP on Docker engines.

T651 now provides a Grafana instance, so there's a green light if we wish to retest this on Dwellers.

Jul 26 2024, 19:39 · Monitoring and reporting, Servers
dereckson added a comment to T651: Deploy Grafana.

Deployed at https://grafana.nasqueron.org/

Jul 26 2024, 19:38 · Monitoring and reporting, Operations sprints (Ignite Alkane Propulsion), Servers
dereckson added a revision to T651: Deploy Grafana: D3377: Deploy Grafana.
Jul 26 2024, 19:38 · Monitoring and reporting, Operations sprints (Ignite Alkane Propulsion), Servers
dereckson added a revision to T1633: Collect metrics from RabbitMQ: D3376: Scrape RabbitMQ metrics into Prometheus.
Jul 26 2024, 19:34 · Operations sprints (Consolidate them all), Servers

Jul 25 2024

dereckson added a comment to T1505: Automate Let's Encrypt TLS certificates management for every server.

rOPS1e9a54c10365 has worked like a charm on WindRiver to generate grafana.nasqueron.org through DNS.

Jul 25 2024, 20:43 · Servers
dereckson triaged T1505: Automate Let's Encrypt TLS certificates management for every server as Normal priority.
Jul 25 2024, 20:42 · Servers
dereckson added a comment to T651: Deploy Grafana.

DNS: grafana. CNAME www-dev.nasqueron.org

Jul 25 2024, 18:49 · Monitoring and reporting, Operations sprints (Ignite Alkane Propulsion), Servers
dereckson added a comment to T651: Deploy Grafana.

Deployment can use sqlite3 as long as it's still performant,
as we want our monitoring tools to be resilient.

Jul 25 2024, 18:49 · Monitoring and reporting, Operations sprints (Ignite Alkane Propulsion), Servers
dereckson added a comment to T650: Deploy PCP on Docker engines.

Probably a good part of roles/core/monitoring when grains["os_family"] == "RedHat". Eglide has "Debian" for that grain, but not sure if we've enough RAM there.

Jul 25 2024, 18:17 · Monitoring and reporting, Servers
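
The comment above ties the PCP deployment to the `os_family` grain and raises a memory concern for the Debian host. Here is a minimal Python sketch of one possible reading of that rule. The function name `should_deploy_pcp` and the 2048 MB threshold are hypothetical. `os_family` and `mem_total` (in MB) are standard Salt grains, but the sample grain values below are invented for illustration.

```python
# Hypothetical threshold in MB, chosen for illustration only.
MIN_MEMORY_MB = 2048


def should_deploy_pcp(grains, min_memory_mb=MIN_MEMORY_MB):
    """Decide whether a minion gets PCP, given a dict of its grains.

    Assumed rule: RedHat-family servers always get it; other families
    only when they report at least min_memory_mb of RAM.
    """
    if grains.get("os_family") == "RedHat":
        return True
    return grains.get("mem_total", 0) >= min_memory_mb


# Invented grain values, not read from any real server.
redhat_small = {"os_family": "RedHat", "mem_total": 512}
debian_small = {"os_family": "Debian", "mem_total": 512}
debian_large = {"os_family": "Debian", "mem_total": 4096}

decisions = [
    should_deploy_pcp(redhat_small),
    should_deploy_pcp(debian_small),
    should_deploy_pcp(debian_large),
]
```

In an actual Salt state the same test would more likely live in a Jinja condition or a pillar flag. The function form is only meant to make the rule explicit.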
dereckson added a comment to T652: Install PCP on Dwellers.

Just for reference, this was a test deployment. This is not currently installed on Dwellers, and needs to be in Salt as part of T650.

Jul 25 2024, 18:13 · Servers
dereckson renamed T650: Deploy PCP on Docker engines from Give access to Dwellers key statistics to Deploy PCP on Docker engines.
Jul 25 2024, 18:12 · Monitoring and reporting, Servers
dereckson raised the priority of T651: Deploy Grafana from Low to Normal.
Jul 25 2024, 18:12 · Monitoring and reporting, Operations sprints (Ignite Alkane Propulsion), Servers
dereckson claimed T651: Deploy Grafana.

This task was created in 2016 to publish metrics from PCP (Performance Co-Pilot) on RHEL-like servers, especially our Docker engines.

Jul 25 2024, 18:12 · Monitoring and reporting, Operations sprints (Ignite Alkane Propulsion), Servers
dereckson added a comment to T1623: Deploy Prometheus to gain observability.

RabbitMQ exporters have been added to NetBox under the tag observability -> https://netbox.nasqueron.org/ipam/services/?tag=observability 🔒

Jul 25 2024, 18:04 · Monitoring and reporting, Operations sprints (Consolidate them all), Servers
dereckson added a subtask for T1623: Deploy Prometheus to gain observability: T1987: Dovecot Metrics.
Jul 25 2024, 18:03 · Monitoring and reporting, Operations sprints (Consolidate them all), Servers
dereckson added a subtask for T1931: Dovecot Provisioning: T1987: Dovecot Metrics.
Jul 25 2024, 18:03 · Mail, Restricted Project, Servers
dereckson added a comment to T1932: ViMbAdmin Provisioning.

Next: memcached

Jul 25 2024, 17:39 · Mail, Restricted Project, Servers
dereckson added a comment to T1633: Collect metrics from RabbitMQ.

Pending container redeployment with D3374, we can reach metrics set in D3373 with socat:

Jul 25 2024, 17:36 · Operations sprints (Consolidate them all), Servers

Jul 24 2024

dereckson added a subtask for T1950: Deploy PHP 8.3: Unknown Object (Maniphest Task).
Jul 24 2024, 19:36 · Servers, PHP 8.x support
DorianWinty added a revision to T1931: Dovecot Provisioning: D3375: Configure pg_HBA for dovecot user.
Jul 24 2024, 17:27 · Mail, Restricted Project, Servers

Jul 23 2024

dereckson added a revision to T1633: Collect metrics from RabbitMQ: D3374: Expose RabbitMQ metrics on port 15692.
Jul 23 2024, 23:34 · Operations sprints (Consolidate them all), Servers
dereckson updated the task description for T1633: Collect metrics from RabbitMQ.
Jul 23 2024, 23:27 · Operations sprints (Consolidate them all), Servers
dereckson added a revision to T1633: Collect metrics from RabbitMQ: D3373: Enable rabbitmq_prometheus plugin.
Jul 23 2024, 23:24 · Operations sprints (Consolidate them all), Servers
dereckson added a revision to T1623: Deploy Prometheus to gain observability: D3373: Enable rabbitmq_prometheus plugin.
Jul 23 2024, 23:15 · Monitoring and reporting, Operations sprints (Consolidate them all), Servers
dereckson added a revision to T1623: Deploy Prometheus to gain observability: D3372: Collect Docker metrics with Prometheus.
Jul 23 2024, 23:05 · Monitoring and reporting, Operations sprints (Consolidate them all), Servers
dereckson added a revision to T1982: Upgrade from Python 3.9 to Python 3.11+: D3371: Revert "Install Python 3.9 on CentOS/Rocky 8.5 machines".
Jul 23 2024, 22:41 · Servers
dereckson updated the task description for T1982: Upgrade from Python 3.9 to Python 3.11+.
Jul 23 2024, 22:40 · Servers
dereckson added a revision to T1982: Upgrade from Python 3.9 to Python 3.11+: D3366: Upgrade certbot to Python 3.11.
Jul 23 2024, 22:38 · Servers
dereckson added a revision to T1982: Upgrade from Python 3.9 to Python 3.11+: D3368: Bump default versions to build ports.
Jul 23 2024, 22:38 · Servers
dereckson triaged T1982: Upgrade from Python 3.9 to Python 3.11+ as Normal priority.
Jul 23 2024, 22:37 · Servers
dereckson added a revision to T1623: Deploy Prometheus to gain observability: D3370: Deploy Prometheus on WindRiver.
Jul 23 2024, 22:28 · Monitoring and reporting, Operations sprints (Consolidate them all), Servers
dereckson added a comment to T1931: Dovecot Provisioning.

Also, we need to declare Dovecot ports in the services table at https://netbox.nasqueron.org/virtualization/virtual-machines/10/ (on the public IP).

Jul 23 2024, 21:36 · Mail, Restricted Project, Servers
dereckson removed a parent task for T1981: Upgrade to FreeBSD 14.1: T1980: ZFS collector doesn't work everywhere.
Jul 23 2024, 21:15 · Servers
dereckson removed a subtask for T1980: ZFS collector doesn't work everywhere: T1981: Upgrade to FreeBSD 14.1.
Jul 23 2024, 21:15 · Monitoring and reporting, Servers
dereckson closed T1980: ZFS collector doesn't work everywhere as Resolved.

I suspect version 1.6.1 (currently in packages) is compatible with FreeBSD 13, while version 1.8.2 is compatible with FreeBSD 14.

Jul 23 2024, 21:15 · Monitoring and reporting, Servers
dereckson moved T1623: Deploy Prometheus to gain observability from Backlog to Prometheus on the Monitoring and reporting board.
Jul 23 2024, 20:57 · Monitoring and reporting, Operations sprints (Consolidate them all), Servers
dereckson moved T1980: ZFS collector doesn't work everywhere from Backlog to Prometheus on the Monitoring and reporting board.
Jul 23 2024, 20:57 · Monitoring and reporting, Servers
dereckson added a subtask for T1981: Upgrade to FreeBSD 14.1: T1972: Update WindRiver to FreeBSD 14.1.
Jul 23 2024, 20:57 · Servers
dereckson added a parent task for T1972: Update WindRiver to FreeBSD 14.1: T1981: Upgrade to FreeBSD 14.1.
Jul 23 2024, 20:57 · Servers
dereckson triaged T1981: Upgrade to FreeBSD 14.1 as High priority.
Jul 23 2024, 20:57 · Servers
dereckson triaged T1980: ZFS collector doesn't work everywhere as Low priority.
Jul 23 2024, 20:55 · Monitoring and reporting, Servers
dereckson added a revision to T1623: Deploy Prometheus to gain observability: D3369: Deploy Prometheus Node Exporter.
Jul 23 2024, 18:30 · Monitoring and reporting, Operations sprints (Consolidate them all), Servers
dereckson added a comment to T1623: Deploy Prometheus to gain observability.

Independently of the 2020 plan for a service mesh, we're going to deploy Prometheus right now to gain observability on currently deployed services.

Jul 23 2024, 18:05 · Monitoring and reporting, Operations sprints (Consolidate them all), Servers
dereckson added a comment to T1877: Evaluate Alcali - Salt front-end.

It could be easier to deploy https://github.com/kpetremann/salt-exporter

Jul 23 2024, 17:56 · security, Salt, Servers, Product evaluation
dereckson added a revision to T1950: Deploy PHP 8.3: D3368: Bump default versions to build ports.
Jul 23 2024, 06:04 · Servers, PHP 8.x support

Jul 18 2024

DorianWinty added a revision to T1931: Dovecot Provisioning: D3365: Create PostgreSQL credentials for Dovecot.
Jul 18 2024, 18:44 · Mail, Restricted Project, Servers

Jul 15 2024

DorianWinty added a revision to T1931: Dovecot Provisioning: D3364: Provisioning Dovecot Config.
Jul 15 2024, 20:55 · Mail, Restricted Project, Servers
dereckson added a comment to T1475: Provision a mail server.

DNS change

Jul 15 2024, 20:17 · Mail, Restricted Project, Servers
DorianWinty added a comment to T1930: Postfix Provisioning.

policy-spf package should be installed

Jul 15 2024, 19:16 · Mail, Restricted Project, Servers

Jul 12 2024

dereckson added a revision to T1475: Provision a mail server: D3363: Install mail clients on devserver and shellserver roles.
Jul 12 2024, 20:41 · Mail, Restricted Project, Servers

Jul 10 2024

dereckson closed T1974: Update windu SSH key as Resolved.

Key confirmed to work.

Jul 10 2024, 19:17 · security, Servers

Jul 9 2024

dereckson closed T1936: NetBox outage on WindRiver restart as Resolved by committing rOPS23c4d4f80a1c: Deploy Redis.
Jul 9 2024, 22:28 · Servers
dereckson added a revision to T1974: Update windu SSH key: D3362: Add SSH key for windu account.
Jul 9 2024, 22:17 · security, Servers
dereckson reopened T1974: Update windu SSH key as "Open".

Still an issue connecting: the SSH2 RSA key is not recognized.

Jul 9 2024, 22:17 · security, Servers

Jul 8 2024

dereckson added a revision to T1762: Deploy NetBox: D3359: Deploy Redis.
Jul 8 2024, 23:21 · Restricted Project, Servers, Drake network
dereckson added a revision to T1762: Deploy NetBox: D3360: Deploy NetBox service.
Jul 8 2024, 23:10 · Restricted Project, Servers, Drake network
dereckson added a comment to T1936: NetBox outage on WindRiver restart.

It also needs PostgreSQL launched at startup, but that's already defined in the role dbserver-pgsql.

Jul 8 2024, 22:14 · Servers
dereckson moved T1936: NetBox outage on WindRiver restart from Working on to Pending review on the Servers board.
Jul 8 2024, 22:01 · Servers
dereckson added a revision to T1936: NetBox outage on WindRiver restart: D3359: Deploy Redis.
Jul 8 2024, 22:00 · Servers
dereckson claimed T1936: NetBox outage on WindRiver restart.
Jul 8 2024, 21:56 · Servers
dereckson moved T1936: NetBox outage on WindRiver restart from Backlog to Working on on the Servers board.
Jul 8 2024, 21:56 · Servers
dereckson added a project to T1936: NetBox outage on WindRiver restart: Servers.
Jul 8 2024, 21:56 · Servers