We can run queries for analytics or report purpose.
Could we create a service to store and run those queries easily?
What kind of service?
Wikimedia build Quarry, https://quarry.wmcloud.org/query/runs/all
It allows to freely query a safe replication of database without credentials or sensitive information
They're also switching to https://superset.apache.org/ as a front-end to use those replica.
As we don't have data interesting to general public outside Nasqueron Databases,
we could only have a repository to store queries, and a service to run them publicly or through CLI on a devserver. Such service would only have access to SELECT right on the tables it needs.
Candidate databases servers
Those queries can be run:
- on db-A cluster for PostgreSQL services
- on db-B cluster for MySQL services using MariaDB 10
- on acquisitariat for MySQL services using MySQL 8 on Docker PaaS
Output
Output can be:
- MySQL or PostgreSQL raw output
- a CSV variant, so we can use tools like http://mlei.net/shared/tool/csv-wiki.htm
- or directly the output we want like:
- MediaWiki table
- MarkDown table
- HTML table
Examples
An example of query to include a report as a MediaWiki table:
SELECT CONCAT("{{D|", rev.id, "}}") as revision, title, DATE_FORMAT(FROM_UNIXTIME(rev.dateCreated), '%Y-%m-%d') as `date`, userName, CONCAT("{{Repo|", repositorySlug, "}}") as repository FROM devcentral_differential.differential_revision rev LEFT JOIN devcentral_repository.repository repo ON repo.phid = rev.repositoryPHID LEFT JOIN devcentral_user.user ON user.phid = authorPHID WHERE rev.phid IN (SELECT DISTINCT objectPHID FROM devcentral_token.token_given WHERE tokenPHID = "PHID-TOKN-emoji-3");