Issue affecting some Raspberry Pi 3 devices running UC16
Incident Report for Screenly
Postmortem

On 2023-06-20, we started receiving support tickets from a few clients in Asia describing an issue with screens showing a purple Screenly logo rather than the expected content. A few hours later, we started receiving similar reports from customers in the U.S.

We’ve now narrowed the problem down to a memory issue in apparmor_parser (used by snapd), which in turn causes an Out Of Memory (OOM) situation in which the Out Of Memory Killer (OOM Killer) shuts down processes. The Ubuntu Core / Snap Team is actively working on resolving the issue, and progress is tracked here.
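
For readers who want to check their own devices for this kind of OOM event, the following is a minimal sketch of scanning kernel log output for OOM Killer messages. It assumes the usual Linux kernel wording ("Out of memory: Kill process ..." on older kernels, "Out of memory: Killed process ..." on newer ones). The function name find_oom_kills and the sample log lines are hypothetical and are not taken from the affected devices.

```python
import re
from typing import Iterable, List, Tuple

# Matches both the older ("Kill process") and newer ("Killed process")
# kernel wording. The exact wording varies by kernel version, so treat
# this pattern as an assumption rather than a complete parser.
OOM_KILL_RE = re.compile(r"Out of memory: Kill(?:ed)? process (\d+) \(([^)]+)\)")


def find_oom_kills(log_lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (pid, process name) for each OOM Killer event found."""
    kills = []
    for line in log_lines:
        match = OOM_KILL_RE.search(line)
        if match:
            kills.append((int(match.group(1)), match.group(2)))
    return kills


if __name__ == "__main__":
    # Illustrative lines only; not real logs from this incident.
    sample = [
        "[  812.101] Out of memory: Kill process 1234 (apparmor_parser) score 901 or sacrifice child",
        "[  812.345] usb 1-1.4: new high-speed USB device number 5 using dwc_otg",
        "[ 1290.777] Out of memory: Killed process 2201 (snapd) total-vm:912340kB",
    ]
    for pid, name in find_oom_kills(sample):
        print(f"OOM Killer terminated pid {pid} ({name})")
```

The function accepts any iterable of lines, so it could be fed saved kernel log output (for example from dmesg) line by line.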

In the meantime, we’ve instructed all UC16 devices that are online and reachable to hold any future updates.

What we are doing next

  • We are expanding our new Quality Control setup to include combinations of older hardware/software not previously covered (e.g. Raspberry Pi 3 with UC16).
  • We are working closely with Canonical/Ubuntu to enhance their upstream testing of software updates.
  • We are working on adding manual approval processes for new upstream updates.
Posted Jun 22, 2023 - 10:56 UTC

Resolved
Many of the affected screens have now recovered. If your screens are still affected and you have not yet been in touch with us, please do reach out to support@screenly.io for next steps.
Posted Jun 22, 2023 - 10:52 UTC
Update
We've spent the day working with Canonical's engineers and are fairly confident that the issue was caused by an update to the "core" snap (which is essentially the operating system) going from version 16-2.59.3 to 16-2.59.4. It is still unclear exactly how the upgrade caused the breakage, but we believe that we are getting closer to the root cause.

If you are affected, please get in touch with support@screenly.io.
Posted Jun 21, 2023 - 17:17 UTC
Update
We are working closely with Canonical's engineers on trying to narrow down the root cause and are exploring various hypotheses.

In the meantime, we're seeing that a number of the affected screens have recovered by themselves.
Posted Jun 21, 2023 - 11:14 UTC
Update
The investigation is still ongoing. We'll provide an update as soon as we hear back from Canonical.
Posted Jun 21, 2023 - 07:46 UTC
Update
While this is still under active investigation by both our engineers and Canonical/Ubuntu's, our initial investigation indicates that this was caused by a system update in the operating system. Some of these devices have been reported to recover after being power cycled.

To minimize the risk of the issue spreading to additional devices, we've issued a freeze on further updates to the affected cohort of devices that are online.

Note that this issue doesn't appear to affect any newer Screenly Player (these are all Pi4-based) or Screenly Player Max, as these all use Ubuntu Core 20 (UC20).
Posted Jun 20, 2023 - 18:55 UTC
Update
We're investigating an issue reported by some customers. The issue appears to be limited to Raspberry Pi 3s running Ubuntu Core 16 (UC16). We are working with Canonical (Ubuntu) to triage the issue further.
Posted Jun 20, 2023 - 16:02 UTC
Investigating
We are currently investigating this issue.
Posted Jun 20, 2023 - 15:59 UTC