
Deploy Airflow
Closed, Wontfix · Public

Description

To run pipeline DAGs for the datasources workflow, we want to evaluate Airflow.

The plan is to:

  • run containers for Airflow in our Docker PaaS
  • deploy DAG content, like the one in D2754, through Jenkins to /srv/airflow/nasqueron/dags
  • monitor errors through Sentry

Fallback plan if evaluation isn't successful

In June, evaluate whether we keep Airflow or not.

If not, as a fallback, alternative plans are:

  • fantoir-datasource workflow deploy.yaml: a YAML document tells which commands to run, with which options
  • Jenkins pipeline to run each fantoir-datasource command
  • Both: Jenkins is responsible for running fantoir-datasource workflow deploy.yaml and fantoir-datasource workflow wikidata.yaml

This fallback plan needs to be put into action before decommissioning Airflow.
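
To make the workflow-document idea concrete, here is a minimal sketch of how a list of steps could be turned into fantoir-datasource command lines. It is an illustration only: the step layout, the `build_argv` and `plan` helpers, and the option names are hypothetical, and the workflow is written as a Python dict standing in for the parsed YAML so the sketch needs nothing beyond the standard library.

```python
import shlex

# Hypothetical in-memory equivalent of a parsed deploy.yaml document.
# The step layout and option names are assumptions for illustration.
WORKFLOW = {
    "steps": [
        {"command": "fetch", "options": {"overwrite": True}},
        {"command": "import", "options": {"create-table": True},
         "args": ["fantoir.txt", "fantoir_table"]},
    ]
}


def build_argv(step, binary="fantoir-datasource"):
    """Turn one workflow step into an argument vector."""
    argv = [binary, step["command"]]
    for name, value in step.get("options", {}).items():
        if value is True:
            # Boolean flag: present only when enabled.
            argv.append(f"--{name}")
        elif value is False or value is None:
            # Disabled or unset option: emit nothing.
            continue
        else:
            # Option with a value.
            argv.extend([f"--{name}", str(value)])
    argv.extend(step.get("args", []))
    return argv


def plan(workflow):
    """Return the argument vectors for every step, in order."""
    return [build_argv(step) for step in workflow["steps"]]


if __name__ == "__main__":
    # Print the commands instead of running them, so the plan can be reviewed.
    for argv in plan(WORKFLOW):
        print(shlex.join(argv))
```

Printing the plan rather than executing it keeps the sketch side-effect free; a Jenkins job could pass each argument vector to `subprocess.run(argv, check=True)` so that a failing step stops the workflow.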

Event Timeline

DNS -> airflow.nasqueron.org. 172800 IN CNAME app2.nasqueron.org.

Previous status from 2023 installation:

  • Airflow 2.5.2 deployed to Dwellers
  • Sentry correctly configured and receiving events

Current status:

  • Airflow 2.8.0 deployed to Dwellers
  • Vault configured as secrets back-end
  • Sentry still works
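
For reference, Airflow can take its secrets back-end from environment variables, which fits a container deployment. The sketch below builds those variables for the HashiCorp Vault provider. The URL, mount point and paths are placeholder assumptions, not the values used on Dwellers, and authentication settings are deliberately left out.

```python
import json


def vault_secrets_env(url, mount_point="airflow"):
    """Build environment variables selecting Vault as Airflow secrets back-end.

    The mount point and paths are illustrative assumptions; authentication
    (token, AppRole, ...) is intentionally omitted from this sketch.
    """
    backend_kwargs = {
        "url": url,
        "mount_point": mount_point,
        "connections_path": "connections",
        "variables_path": "variables",
    }
    return {
        # Class provided by the apache-airflow-providers-hashicorp package.
        "AIRFLOW__SECRETS__BACKEND":
            "airflow.providers.hashicorp.secrets.vault.VaultBackend",
        # Airflow expects the keyword arguments as a JSON object.
        "AIRFLOW__SECRETS__BACKEND_KWARGS": json.dumps(backend_kwargs),
    }


if __name__ == "__main__":
    for key, value in vault_secrets_env("https://vault.example.org:8200").items():
        print(f"{key}={value}")
```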

Next steps:

  • Create a shared working directory so DAG tasks have somewhere to work when they need a file
  • Deploy D2754 and see what's missing
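
A possible shape for the shared working directory, as a rough sketch: each DAG run gets its own subdirectory under a common root, so tasks of the same run can exchange files without colliding with other runs. The `task_workdir` helper and the directory layout are hypothetical, and the root would be whatever volume ends up mounted into the Airflow containers.

```python
import re
import tempfile
from pathlib import Path


def _safe(name):
    """Replace characters that are awkward in directory names (':', '+', '.', '/')."""
    return re.sub(r"[^A-Za-z0-9_-]", "_", name)


def task_workdir(root, dag_id, run_id):
    """Create (if needed) and return the working directory for one DAG run."""
    path = Path(root) / _safe(dag_id) / _safe(run_id)
    path.mkdir(parents=True, exist_ok=True)
    return path


if __name__ == "__main__":
    # Demonstrate with a temporary root instead of a real shared volume.
    with tempfile.TemporaryDirectory() as root:
        workdir = task_workdir(root, "fantoir", "manual__2024-01-01T00:00:00+00:00")
        (workdir / "fantoir.txt").write_text("placeholder\n")
        print(workdir)
```

Sanitizing the identifiers matters because Airflow run IDs typically embed a timestamp with colons and a plus sign, and because it prevents a crafted identifier from escaping the root directory.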

Well, FANTOIR isn't the format used by French administrations anymore. The TOPO database is now used.

No other candidate for Airflow pipelines exists.

So, we'll revisit this when/if we need such a complex DAG to transform data.