All Systems Operational
SignOnSite Field Platform Operational
90 days ago
100.0 % uptime
Today
API Operational
90 days ago
100.0 % uptime
Today
Website Operational
90 days ago
100.0 % uptime
Today
iOS App Operational
90 days ago
100.0 % uptime
Today
Android App Operational
90 days ago
100.0 % uptime
Today
Reporting & Dashboards Operational
90 days ago
100.0 % uptime
Today
Dashboards Operational
90 days ago
100.0 % uptime
Today
Managed Data Warehouse Operational
90 days ago
100.0 % uptime
Today
3rd Party Services Operational
Google BigQuery Operational
AWS s3-ap-southeast-2 Operational
AWS eks-ap-southeast-2 Operational
AWS route53-ap-southeast-2 Operational
AWS route53 Operational
AWS ec2-ap-southeast-2 Operational
AWS ecr-ap-southeast-2 Operational
AWS rds-ap-southeast-2 Operational
Past Incidents
Mar 19, 2024

No incidents reported today.

Mar 18, 2024
Resolved - A fix has been implemented and deployed.

Sites that reported unresponsiveness or extreme slowness of the desktop web attendance screen through our Support & CSM channels were tested after the fix was released. Testing indicates that service has been restored to normal.

The issue only impacted the site attendance screen on desktop web. Our mobile applications & other areas of the web portal were operating as normal. If any further issues on the site attendance screen occur, please report them as soon as possible so we can further investigate.

This was a tricky one to resolve. We've been working on it continuously since the first reports of the issue came in early this morning. Our first response was to double the capacity of our database cluster, which bought us some database head-room during this morning's peak-load period and prevented the issue from spreading to more people or degrading the SignOnSite service overall.

For the more technical observers among you, the cause of the issue was the following:
- Over the weekend, we deployed some new composite indexes into our production database which gave us some great performance improvements across a number of common query use-cases.
- Unfortunately, once these new indexes were in our production environment, the SQL query optimiser selected one of the new composite indexes for an extremely important, extremely hot-path query. We weren't expecting this at all, because this particular query _overall_ would not be improved by the new index.
- The optimiser was speeding up one part of the query at the expense of then needing some very expensive looped full-table scans to compute the result in another part of the query.
- The optimiser detected that the new composite index would reduce the cost function in one part of the query by enough to select it, but this was a very bad choice for the query as a whole: the alternative index could have been re-used in multiple places throughout the query, and the new one couldn't.
- We aren't sure yet exactly why the optimiser was mis-computing the cost of moving the second part of the query to full-table scans, but we think it was because of a small amount of dynamic logic that the optimiser wasn't able to penetrate.
- The end result was that a change designed to speed up our application ended up significantly degrading performance on sites which have had high numbers of workers on them _over their lifetime_. The profile of these sites varied: some had high activity levels and many active workers each day, while others had less foot traffic but had been active for a long time.

Thank you, everyone, for your patience today while we worked through safely performing an online schema roll-back.

SignOnSite Engineering

Mar 18, 13:17 UTC
Identified - We are still working on a fix.
Mar 18, 11:04 UTC
Update - We've identified the issue and are working on a fix.
Mar 18, 02:42 UTC
Update - We are continuing to investigate this issue.
Mar 18, 01:56 UTC
Investigating - We are currently investigating the cause of slow loading of the attendance register for sites with more than a few attendees.
Mar 18, 00:46 UTC
Mar 17, 2024

No incidents reported.

Mar 16, 2024

No incidents reported.

Mar 15, 2024

No incidents reported.

Mar 14, 2024

No incidents reported.

Mar 13, 2024

No incidents reported.

Mar 12, 2024

No incidents reported.

Mar 11, 2024

No incidents reported.

Mar 10, 2024

No incidents reported.

Mar 9, 2024

No incidents reported.

Mar 8, 2024

No incidents reported.

Mar 7, 2024

No incidents reported.

Mar 6, 2024

No incidents reported.

Mar 5, 2024

No incidents reported.