Prometheus is available, regardless of the initial goal to offer a service mesh on Kubernetes.
Aug 4 2024
We already have statistics with Prometheus Node Exporter.
Documentation added in https://agora.nasqueron.org/Operations_grimoire/Grafana and links to other dashboards added to relevant places.
Deployed D3248 to docker-002.
Just a small note: this product is becoming more and more open core, and we're less in favour of that one "specifically".
Aug 3 2024
From router-001, the network looks good:
Stopped salt and node-exporter on router-001, as they're currently not needed, to see if that helps.
Could be at hypervisor level. SSH failed until 13:22, when it worked immediately.
As of 13:18 UTC, SSH access works.
Also, at the same time, DevCentral is slow for arc diff or to publish this task. This delay is similar to what happens when DNS resolution times out.
$ salt-minion --versions
Salt Version:
          Salt: 3007.1
We can actually provide P352 as a hotfix.
patch is available on Eglide as part of build-essential, so presumed OK for Debian
certbot against Python 3.11 should be checked on dwellers and docker-002
I've applied P352 to replace egrep with grep -E on dwellers and docker-002.
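For reference, the kind of substitution described here can be sketched in Python. This is only an illustration of replacing whole-word `egrep` with `grep -E` in a piece of text; it is not the content of P352, and the `replace_egrep` helper is a hypothetical name.

```python
import re

# Match "egrep" only as a whole word, so names such as "zegrep" are left alone.
EGREP_RE = re.compile(r"\begrep\b")


def replace_egrep(text):
    """Return text with every whole-word egrep replaced by grep -E.

    Hypothetical helper: a plain text substitution also touches comments
    and strings, so the resulting diff should still be reviewed by hand.
    """
    return EGREP_RE.sub("grep -E", text)


sample = "egrep -q '^foo|bar' /etc/hosts"
converted = replace_egrep(sample)
```

The word boundary keeps the change narrow: a path such as `/usr/bin/egrep` is rewritten, while `zegrep` is not.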
I wanted to apply P354 to fix Salt SELinux issue with patch -p1 < ~/egrep.patch on docker-002.
Jul 31 2024
Already reported upstream: https://github.com/saltstack/salt/issues/65608
$ cd /opt/salt/nasqueron-operations
$ salt dwellers state.apply roles/webserver-core/nginx/config
[…]
----------
[3/295]
          ID: selinux_context_nginx_logs
    Function: selinux.fcontext_policy_present
        Name: /var/log/www
      Result: False
     Comment: An exception occurred in this state: Traceback (most recent call last):
                File "/opt/saltstack/salt/lib/python3.10/site-packages/salt/state.py", line 2428, in call
                  ret = self.states[cdata["full"]](
                File "/opt/saltstack/salt/lib/python3.10/site-packages/salt/loader/lazy.py", line 160, in __call__
                  ret = self.loader.run(run_func, *args, **kwargs)
                File "/opt/saltstack/salt/lib/python3.10/site-packages/salt/loader/lazy.py", line 1269, in run
                  return self._last_context.run(self._run_as, _func_or_method, *args, **kwargs)
                File "/opt/saltstack/salt/lib/python3.10/site-packages/salt/loader/lazy.py", line 1284, in _run_as
                  return _func_or_method(*args, **kwargs)
                File "/opt/saltstack/salt/lib/python3.10/site-packages/salt/loader/lazy.py", line 1317, in wrapper
                  return f(*args, **kwargs)
                File "/opt/saltstack/salt/lib/python3.10/site-packages/salt/states/selinux.py", line 326, in fcontext_policy_present
                  current_state = __salt__["selinux.fcontext_get_policy"](
                File "/opt/saltstack/salt/lib/python3.10/site-packages/salt/loader/lazy.py", line 160, in __call__
                  ret = self.loader.run(run_func, *args, **kwargs)
                File "/opt/saltstack/salt/lib/python3.10/site-packages/salt/loader/lazy.py", line 1269, in run
                  return self._last_context.run(self._run_as, _func_or_method, *args, **kwargs)
                File "/opt/saltstack/salt/lib/python3.10/site-packages/salt/loader/lazy.py", line 1284, in _run_as
                  return _func_or_method(*args, **kwargs)
                File "/opt/saltstack/salt/lib/python3.10/site-packages/salt/modules/selinux.py", line 507, in fcontext_get_policy
                  "filespec": parts.group(1).strip(),
              AttributeError: 'NoneType' object has no attribute 'group'
     Started: 16:25:51.413301
    Duration: 391.186 ms
     Changes:
----------
          ID: selinux_context_nginx_logs_applied
    Function: selinux.fcontext_policy_applied
        Name: /var/log/www
      Result: True
     Comment: SElinux policies are already applied for filespec "/var/log/www"
     Started: 16:25:51.804764
    Duration: 6.322 ms
     Changes:
----------
[…]
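The traceback ends with `parts.group(1)` being called on `None`, which is the usual symptom of a regular expression that did not match the line it was given. The sketch below illustrates that failure mode and a defensive guard. The pattern, the sample lines and the `parse_fcontext_line` helper are hypothetical; this is neither Salt's actual code nor the upstream fix.

```python
import re

# Hypothetical pattern: filespec, file type, SELinux context.
# This is NOT the regular expression used in salt/modules/selinux.py.
FCONTEXT_RE = re.compile(
    r"^(?P<filespec>\S+)\s+"
    r"(?P<filetype>all files|directory|regular file)\s+"
    r"(?P<context>\S+)$"
)


def parse_fcontext_line(line):
    """Return the parsed fields as a dict, or None if the line does not match."""
    parts = FCONTEXT_RE.match(line.strip())
    if parts is None:
        # Without this guard, parts.group(...) raises
        # AttributeError: 'NoneType' object has no attribute 'group'
        return None
    return {
        "filespec": parts.group("filespec"),
        "filetype": parts.group("filetype"),
        "context": parts.group("context"),
    }


# Sample lines invented for the illustration.
matching = parse_fcontext_line(
    "/var/log/www(/.*)?    all files    system_u:object_r:httpd_log_t:s0"
)
unmatched = parse_fcontext_line("unexpected output format")
```

Returning `None` lets the caller decide how to report an unparseable line instead of failing the whole state with an exception.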
On 2024-07-31 at 12:00, the devcentral.nasqueron.org certificate expired.
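An expiry like this one can be caught ahead of time by reading the certificate's `notAfter` field and computing the remaining days. A minimal sketch with the Python standard library follows. The helper names are hypothetical, and the sample timestamp assumes GMT purely for illustration, as the time zone of the expiry is not stated above.

```python
import socket
import ssl
from datetime import datetime, timezone


def fetch_not_after(host, port=443, timeout=5.0):
    """Return the notAfter field of the certificate served by host:port.

    The default context verifies the certificate, so an already expired one
    raises ssl.SSLCertVerificationError instead of returning a value.
    """
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


def days_until_expiry(not_after, now):
    """Return the number of days between now and notAfter (negative once expired)."""
    expires = datetime.fromtimestamp(
        ssl.cert_time_to_seconds(not_after), tz=timezone.utc
    )
    return (expires - now).total_seconds() / 86400


# Hypothetical values; no network access is needed for this part.
not_after = "Jul 31 12:00:00 2024 GMT"
now = datetime(2024, 7, 24, 12, 0, tzinfo=timezone.utc)
remaining = days_until_expiry(not_after, now)
```

In a monitoring job, `fetch_not_after("devcentral.nasqueron.org")` would supply the string, and an alert could fire when the result drops below a chosen threshold.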
Issue can be reproduced on Dwellers:
Jul 30 2024
Jul 29 2024
Jul 27 2024
Increasing priority as FreeBSD 13.2 has now been EOL for one month (since 2024-06-30).
Jul 26 2024
Not deployed to Docker but bare-metal.
T651 has a Grafana ready if we wish to retest this on Dwellers, green light.
Deployed at https://grafana.nasqueron.org/
Jul 25 2024
rOPS1e9a54c10365 has worked like a charm on WindRiver to generate grafana.nasqueron.org through DNS.
DNS: grafana. CNAME www-dev.nasqueron.org
Deployment can use sqlite3 as long as it's still performant
as we want our monitoring tools to be resilient.
Probably a good part of roles/core/monitoring when grains["os_family"] == "RedHat". Eglide has "Debian" for that grain, but not sure if we have enough RAM there.