Issue on VUSION Cloud - data input by file
Incident Report for VUSION Cloud
Postmortem

What happened?

During our Cloud release on April 16th, 2024, at 07:00 AM UTC for the Americas and 12:00 PM UTC for Europe, we deployed a new version of VusionCloud that included an update of the file processing software. Following this deployment, a subset of customers reported that their item data files were received and flagged as successfully integrated but did not trigger any item updates, and consequently no label updates.

Once the issue was recognized, our engineers investigated and identified the faulty behavior, and we decided to roll back to the previous version. On April 17th, 2024, at 10:00 AM UTC, the problem was solved: all new file integrations were processed correctly and the resulting transmissions were completed.

Unfortunately, the faulty integrations could not be replayed safely and in a timely manner. Impacted customers have been contacted and asked to replay those files.

What went wrong, and why?

To increase performance when processing smaller files (less than 100 KB), we implemented an improvement in our data processing software. This improvement was applied to the data processing units of our entire installed base, regardless of their technical configuration. Unfortunately, one specific configuration (using a specific database system) was missing from our non-regression testing and was overlooked. On that configuration, files were flagged as successful without triggering any actual database updates and without raising any exceptions or errors, so the issue was not detected by our monitoring.
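The failure mode described above, a success flag with no underlying database change and no error, can be illustrated with a minimal sketch. It is not VusionCloud's actual code: the `IntegrationResult` type, its fields, and the `SUSPECT` status are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass
class IntegrationResult:
    """Hypothetical summary of one file integration (all fields are assumptions)."""
    records_in_file: int  # item records parsed from the submitted file
    rows_written: int     # rows the database reported as changed
    errors: int           # exceptions or rejected records


def integration_status(result: IntegrationResult) -> str:
    """Derive the status from what actually reached the database."""
    if result.errors > 0:
        return "FAILED"
    if result.records_in_file > 0 and result.rows_written == 0:
        # No error was raised, yet nothing was written: do not report success.
        return "SUSPECT"
    return "SUCCESS"
```

A file whose content is identical to the current item data can legitimately produce zero writes, so a status like `SUSPECT` would most likely call for review rather than an automatic failure.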

How are we making incidents like this less likely or less impactful?

We are fully aware of the critical impact this unique incident had on your daily operations. Consequently, we have initiated major projects:

  1. Complete review of our testing environment and its mirroring of our production environments.
  2. Review of the development process and completion of our unit tests to ensure that all production environments are covered, so that this kind of change does not create a similar situation again.
  3. Complete review of various configurations to streamline them and avoid overlooked cases.
  4. Implementation of more complete API/file/matching replay features to help recover in extreme cases.
  5. Continuous improvement of our processes and playbooks to tackle or mitigate similar issues as soon as possible.
  6. Adding a new monitoring trigger to cover any similar situation.
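As an illustration of what a monitoring trigger for this class of issue could look like, the sketch below raises an alert when too many recent integrations are flagged successful yet produced no item updates. The function name, the input shape, and both thresholds are assumptions made for this example, not the trigger actually deployed.

```python
from typing import Iterable, Tuple


def silent_success_alert(
    integrations: Iterable[Tuple[str, int]],
    max_ratio: float = 0.5,  # hypothetical threshold
    min_samples: int = 5,    # hypothetical minimum before alerting
) -> bool:
    """Return True when "successful" integrations with zero updates dominate.

    `integrations` holds (status, item_updates) pairs from a recent time window.
    """
    successes = [updates for status, updates in integrations if status == "SUCCESS"]
    if len(successes) < min_samples:
        # Too few data points to draw a conclusion.
        return False
    empty = sum(1 for updates in successes if updates == 0)
    return empty / len(successes) > max_ratio
```

Using a ratio over a time window rather than a per-file check is a design choice that tolerates the occasional file with no real changes while still catching a systematic drop to zero updates.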

We sincerely apologize for any inconvenience caused by this service disruption. We understand the impact on your operations and are dedicated to continuously improving our services to help prevent similar incidents in the future.

On behalf of VusionGroup, we would like to thank you for your support and understanding.

Posted May 01, 2024 - 22:26 UTC

Resolved
Dear valued Customers,

Following yesterday's cloud release, file-based data integrations encountered an issue for a subset of customers in both Europe and the Americas. Our engineers started investigating as soon as the issue was raised. A workaround was implemented at 10 AM UTC to mitigate the impact. We are closely monitoring this incident.

If you have submitted a file within the last 24 hours or if you have any doubts regarding your file-based data integration, we highly recommend you resubmit your file to ensure smooth processing.

Rest assured, we will continue to keep you updated on any follow-up.

We would like to thank you for your patience and understanding.
Posted Apr 17, 2024 - 14:11 UTC
Identified
We have identified a possible cause for the issue. Corrective actions have been taken, and systems should be fully operational.
Posted Apr 17, 2024 - 10:14 UTC
Investigating
Dear customers,

Since 10:00pm UTC on Tuesday April 16th, a subset of customers has been experiencing issues with data integration using files. API integrations with JSON or XML bodies are not impacted.

Our operational teams are actively working on identifying and fixing the issue.

On behalf of the VusionCloud teams, we would like to apologize for the inconvenience.
Posted Apr 17, 2024 - 09:35 UTC
This incident affected: Europe (VUSION Manager - Europe, VUSION Cloud API - Europe).