Federated timeline isn't updated anymore
Open, Normal, Public

Description

Sidekiq has 117k jobs in its queue and is currently not processing any new events.

Screenshot from 2018-12-08 12-33-42.png (672×1 px, 183 KB)

Event Timeline

dereckson triaged this task as High priority. Dec 8 2018, 11:33
dereckson created this task.

I restarted Sidekiq, and it is now processing new jobs. I'm watching how the queues evolve: the processed count is increasing, and the enqueued count is currently stable.

There are large ffmpeg video rendering tasks currently running:

[AV] Running command: ffmpeg -i "/tmp/8d777f385d3dfec8815d20f7496026dc20181208-7-1nlh2k3" -acodec aac -strict experimental -movflags faststart -pix_fmt yuv420p -vf scale='trunc(iw/2)*2:trunc(ih/2)*2' -vsync cfr -c:v h264 -b:v 500K -maxrate 1300K -bufsize 1300K -crf 18 -y "/tmp/8d777f385d3dfec8815d20f7496026dc20181208-7-1nlh2k320181208-7-idgh2h.mp4"

That's pretty bad news, judging from experience with Wikimedia Commons: a video flood can keep task runners stuck for days.

The issue is that Mastodon has no dedicated video conversion queue that we could pause until the other tasks complete.

To detect this situation easily in the future, we can use this Nagios check:
https://github.com/wanelo/nagios-checks/blob/master/check_sidekiq_queue
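For illustration, the core of such a check can be sketched in a few lines of shell. This is not the linked script, only a minimal sketch of the usual Nagios plugin convention (exit code 0 for OK, 1 for WARNING, 2 for CRITICAL, 3 for UNKNOWN). The function name check_queue_size and the thresholds are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper, not taken from the linked check.
# check_queue_size SIZE WARN CRIT
# Prints a Nagios-style status line and returns the matching exit code:
# 0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN.
check_queue_size() {
    size="$1"; warn="$2"; crit="$3"

    # Anything that is not a non-negative integer is reported as UNKNOWN.
    case "$size" in
        ''|*[!0-9]*)
            echo "UNKNOWN - queue size is not a number: '$size'"
            return 3
            ;;
    esac

    if [ "$size" -ge "$crit" ]; then
        echo "CRITICAL - $size jobs enqueued"
        return 2
    elif [ "$size" -ge "$warn" ]; then
        echo "WARNING - $size jobs enqueued"
        return 1
    fi

    echo "OK - $size jobs enqueued"
    return 0
}
```

Assuming Sidekiq stores its default queue as the Redis list queue:default with no namespace, it could be called as check_queue_size "$(redis-cli llen queue:default)" 10000 50000. The thresholds here are placeholders to be tuned to the instance's normal load.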

dereckson lowered the priority of this task from High to Normal. Dec 8 2018, 21:10

The backlog has been cleared, but I'm leaving this task open, as there are still some action items to follow up on, such as writing a postmortem.

dereckson updated the task description.

So, to prevent the queue from being clogged again: while sleep 30; do clear-video-queue; done &
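As a stopgap until a proper runner exists, the one-liner above could be wrapped so that it stops and reports when the command fails instead of looping silently. This is only a sketch: run_every is a hypothetical helper, and clear-video-queue is the command named in this task, whose behavior is not shown here.

```shell
#!/bin/sh
# Hypothetical helper, a sketch only.
# run_every INTERVAL COUNT COMMAND [ARGS...]
# Runs COMMAND, then sleeps INTERVAL seconds, and repeats.
# COUNT is the number of runs; 0 means loop forever.
# Returns 1 at the first failing run instead of continuing.
run_every() {
    interval="$1"; count="$2"; shift 2
    i=0
    while [ "$count" -eq 0 ] || [ "$i" -lt "$count" ]; do
        "$@" || { echo "command failed: $*" >&2; return 1; }
        i=$((i + 1))
        sleep "$interval"
    done
}
```

The equivalent of the original loop would be run_every 30 0 clear-video-queue &. Unlike the original, this variant runs the command once before the first sleep.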

An automated task runner still needs to be set up.