Page MenuHomeDevCentral

Allow to run queries for reporting
Open, WishlistPublic

Description

We can run queries for analytics or report purpose.

Could we create a service to store and run those queries easily?

What kind of service?

Wikimedia build Quarry, https://quarry.wmcloud.org/query/runs/all

It allows to freely query a safe replication of database without credentials or sensitive information

They're also switching to https://superset.apache.org/ as a front-end to use those replica.

As we don't have data interesting to general public outside Nasqueron Databases,
we could only have a repository to store queries, and a service to run them publicly or through CLI on a devserver. Such service would only have access to SELECT right on the tables it needs.

Candidate databases servers

Those queries can be run:

  • on db-A cluster for PostgreSQL services
  • on db-B cluster for MySQL services using MariaDB 10
  • on acquisitariat for MySQL services using MySQL 8 on Docker PaaS

Output

Output can be:

Examples

An example of query to include a report as a MediaWiki table:

DevCentral :: Phabricator revisions with token to mark them as AI-assisted
SELECT
    CONCAT("{{D|", rev.id, "}}") as revision,
    title,
    DATE_FORMAT(FROM_UNIXTIME(rev.dateCreated), '%Y-%m-%d') as `date`,
    userName,
    CONCAT("{{Repo|", repositorySlug, "}}") as repository
FROM devcentral_differential.differential_revision rev
    LEFT JOIN devcentral_repository.repository repo ON repo.phid = rev.repositoryPHID
    LEFT JOIN devcentral_user.user ON user.phid = authorPHID
WHERE rev.phid IN (SELECT DISTINCT objectPHID FROM devcentral_token.token_given WHERE tokenPHID = "PHID-TOKN-emoji-3");

Event Timeline

As a minimum, to have somewhere (a reports repository?) where we can write those report queries could already be useful, so we don't lose them.

dereckson triaged this task as Wishlist priority.May 25 2023, 04:23
dereckson moved this task from Backlog to Services / Features on the DBA board.