...

Database Chief Architect | Trilogy

Database Chief Architect | Trilogy

Problem

Aurea manages the IT infrastructure not only for its own operations but also for its parent company and affiliated entities. At the heart of this infrastructure was a Central Platform responsible for hosting and managing databases for over 5,000 products, spanning both RDBMS and NoSQL technologies. It was owned by Database Chief Architects.

With such scale came complexity. Ensuring high availability, maintaining SLA commitments, and staying compliant with regulatory requirements (like GDPR) became increasingly difficult. Routine maintenance, incident resolution, and compliance enforcement placed a heavy load on the core team, risking operational bottlenecks and unplanned outages.

Action

To improve performance, reliability, and compliance, as a Database Chief Architect, I initiated and executed several strategic and technical initiatives:

  • Developed an internal knowledge base and cheat sheet to empower L1 engineers to handle routine and repetitive incidents autonomously, reducing unnecessary escalations.
  • Led root cause analysis (RCA) for major incidents, implementing long-term fixes and reducing recurring outages.
  • Migrated EU and UK customer data to dedicated environments based on database technology (e.g., Oracle, SQL Server) to enhance security, performance, and regulatory alignment.
  • Oversaw cloud migration efforts, moving SQL Server, Oracle, and Aurora (MySQL/PostgreSQL) databases primarily to AWS, with occasional use of Azure, OCI, and GCP depending on the client stack.
  • Deployed third-party load balancers and utilized native high-availability (HA) features to ensure continuous database availability across environments.
  • Periodically tested disaster recovery and performed in-place software upgrades using sandbox environments cloned from production, ensuring business continuity without affecting live workloads.

Result

  • Delivered 99.999% uptime across the full database stack, meeting and exceeding availability targets.
  • Ensured full compliance with GDPR and other regulatory standards.
  • Successfully maintained the infrastructure for over 5,000 hosted products, enabling scalability and centralized governance.
  • Consistently met RTO (Recovery Time Objective) and RPO (Recovery Point Objective) for critical services.
  • Delegated first-line troubleshooting to L1 and L2 engineers, allowing the core team to focus on architecture, automation, and complex problem-solving.

Tags

#SoftwareDevelopment
#SQLServer #PostgreSQL #Oracle #MySQL
#AWS #Azure

Scroll to Top
Seraphinite AcceleratorOptimized by Seraphinite Accelerator
Turns on site high speed to be attractive for people and search engines.