IAS Technology Insider is dedicated to discussing computer science, software engineering, and emerging technologies. From deep dives into machine learning algorithms and cloud computing architectures to discussions on cybersecurity trends and data analytics methodologies, our tech experts offer insights and analyses that resonate with enthusiasts and professionals alike.
By Senior Software Engineer, IAS
A monolithic application, a common feature in many companies, has been in use since Integral Ad Science (IAS) was founded. New features are added regularly, but the application rarely gets the attention needed to pay down its growing tech debt.
Its data must be refreshed frequently throughout the day, a significant challenge for a global company managing 12 thousand ad requests per minute.
The current monolith addresses data reloads by restarting itself every few hours, a process that can take up to 15 minutes to reload 4 GB of data consisting of approximately 13 million records. This approach is far from ideal, particularly within the Kubernetes model, which emphasizes scalability, visibility, and efficiency.
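To put those figures in perspective, the short calculation below derives the average record size and the effective reload throughput. It is only a back-of-the-envelope sketch: it assumes "4 GB" means 4 GiB and uses the worst-case 15-minute reload, which works out to roughly 330 bytes per record and about 14,000 records per second.

```python
# Back-of-the-envelope figures derived from the numbers quoted above.
DATA_BYTES = 4 * 1024**3   # 4 GB, read here as 4 GiB (an assumption)
RECORDS = 13_000_000       # approximately 13 million records
RELOAD_SECONDS = 15 * 60   # worst-case reload time of 15 minutes

avg_record_bytes = DATA_BYTES / RECORDS
records_per_second = RECORDS / RELOAD_SECONDS
mebibytes_per_second = DATA_BYTES / RELOAD_SECONDS / 1024**2

print(f"average record size: {avg_record_bytes:.0f} bytes")
print(f"reload throughput:   {records_per_second:,.0f} records/s")
print(f"reload throughput:   {mebibytes_per_second:.1f} MiB/s")
```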
The problem
Frequent full restarts are incompatible with Kubernetes’ architecture, which is designed to efficiently handle unpredictable user request peaks globally. IAS sees traffic surges during major events: traffic broke into the range of 1 million impressions per second during the Spain vs. Croatia and France vs. Switzerland soccer games, as well as the 2022 NCAA Basketball Tournament.
The options
Several alternatives were considered for improving data reloads:
- Cron Jobs with External Schedulers: This method aimed to stagger data reloads to avoid simultaneous restarts. However, the unpredictable nature of node starts and stops in Kubernetes made this approach unsustainable.
- Centralized Reloads with Apache Gossip, Zookeeper, or In-house Applications: These solutions were deemed too radical and complex for IAS’s scale.
- Direct Database Access for Read/Write Operations: This option was dismissed due to security and scalability concerns.
The solution: AWS DynamoDB Lock Client
IAS ultimately implemented the AWS DynamoDB Lock Client, a distributed locking library built on AWS’s DynamoDB. This solution provides distributed capabilities, scalability, and global availability, crucial for handling frequent pod terminations in Kubernetes without deadlocking records.
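The reason a lease-based lock tolerates pod terminations can be illustrated with a small in-memory sketch. This is not the AWS library's API: `LeaseLockTable`, `Lease`, `FakeClock`, and the key and pod names are hypothetical, and a real distributed table would perform the check-and-write atomically. The idea shown is that every lock carries an expiry, so a pod that dies without releasing its lock cannot block the others indefinitely.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Lease:
    owner: str         # pod that currently holds the lock
    expires_at: float  # clock reading after which the lease may be taken over


class LeaseLockTable:
    """In-memory stand-in for a lock table (hypothetical, not the AWS library)."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._leases: Dict[str, Lease] = {}

    def try_acquire(self, key: str, owner: str, lease_seconds: float) -> bool:
        """Take or renew the lock unless another owner holds a live lease."""
        now = self._clock()
        current = self._leases.get(key)
        held_by_other = (
            current is not None
            and current.owner != owner
            and current.expires_at > now
        )
        if held_by_other:
            return False
        # A real distributed table would do this check-and-write atomically.
        self._leases[key] = Lease(owner, now + lease_seconds)
        return True

    def release(self, key: str, owner: str) -> bool:
        """Release the lock only if the caller still owns it."""
        current = self._leases.get(key)
        if current is None or current.owner != owner:
            return False
        del self._leases[key]
        return True


class FakeClock:
    """Manually advanced clock so the example runs without sleeping."""

    def __init__(self) -> None:
        self.now = 0.0

    def __call__(self) -> float:
        return self.now

    def advance(self, seconds: float) -> None:
        self.now += seconds


clock = FakeClock()
table = LeaseLockTable(clock)
first = table.try_acquire("data-reload", "pod-a", lease_seconds=30)
blocked = table.try_acquire("data-reload", "pod-b", lease_seconds=30)
clock.advance(31)  # pod-a is terminated and never releases or renews
taken_over = table.try_acquire("data-reload", "pod-b", lease_seconds=30)
print(first, blocked, taken_over)  # True False True
```

In this sketch a terminated pod simply stops renewing, and the next pod takes over once the lease has lapsed.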
Implementation
A wrapper application was developed to handle business logic before and after invoking the Lock Service. This included a callback function for active locks and error handling for timing issues. Integration tests ran against localstack/localstack, which creates a local DynamoDB table in Docker, enabling safe and cost-effective testing.
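One possible shape for such a wrapper is sketched below. It rests on assumptions rather than the actual IAS code: the callback is interpreted as "called when another pod already holds the lock", the timing error as "the reload ran longer than the lease", and `InMemoryLockService`, `reload_with_lock`, and `ReloadOverranLease` are hypothetical names standing in for the real Lock Service.

```python
import time
from typing import Callable, Dict


class ReloadOverranLease(Exception):
    """Raised when a reload took longer than the lock lease allows."""


class InMemoryLockService:
    """Hypothetical stand-in for the lock service used by the wrapper."""

    def __init__(self) -> None:
        self._owners: Dict[str, str] = {}

    def try_acquire(self, key: str, owner: str) -> bool:
        # Succeeds if the key is free or already held by the same owner.
        return self._owners.setdefault(key, owner) == owner

    def release(self, key: str, owner: str) -> None:
        if self._owners.get(key) == owner:
            del self._owners[key]


def reload_with_lock(
    locks: InMemoryLockService,
    key: str,
    owner: str,
    reload_data: Callable[[], None],
    on_lock_active: Callable[[str], None],
    lease_seconds: float,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Run reload_data under the lock; return True if the reload ran."""
    if not locks.try_acquire(key, owner):
        on_lock_active(key)  # someone else is reloading; skip this round
        return False
    started = clock()
    try:
        reload_data()
        if clock() - started > lease_seconds:
            # The lease may have lapsed mid-reload, so surface it to the caller.
            raise ReloadOverranLease(f"{key} reload exceeded {lease_seconds}s")
        return True
    finally:
        locks.release(key, owner)  # always release, even on failure


locks = InMemoryLockService()
skipped = []
locks.try_acquire("data-reload", "pod-a")  # pod-a is mid-reload
ran_while_busy = reload_with_lock(
    locks, "data-reload", "pod-b", lambda: None, skipped.append, 60
)
locks.release("data-reload", "pod-a")
ran_when_free = reload_with_lock(
    locks, "data-reload", "pod-b", lambda: None, skipped.append, 60
)
print(ran_while_busy, ran_when_free, skipped)
```

Releasing in a `finally` block keeps a failed reload from holding the lock until its lease expires.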
Learning points
Several challenges were encountered and addressed:
- Implemented connection retry logic to handle simultaneous lock acquisition attempts by multiple pods.
- Shortened internal leaseWaitTime to prevent indefinite locks.
- Required AWS SDK for Java version 1.11.704 or later for Service Account access to DynamoDB.
- Addressed WebIdentityTokenCredentialsProvider dependency issues in Kubernetes clusters.
- Developed a system to allow 2-3 concurrent data reloads to avoid deployment blockages by Flagger, the progressive delivery tool.
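Two of the points above, retrying when several pods contend for a lock and allowing a small number of concurrent reloads, can be combined in one sketch. The approach shown (one lock key per "slot", with exponential backoff and jitter between rounds) is an assumption about how this could be done, not a description of the IAS implementation. `SlotLocks`, `acquire_reload_slot`, and the `data-reload-N` key names are hypothetical.

```python
import random
import time
from typing import Callable, Optional, Set


class SlotLocks:
    """In-memory stand-in: each slot name acts as a separate lock key."""

    def __init__(self) -> None:
        self._held: Set[str] = set()

    def try_acquire(self, key: str) -> bool:
        if key in self._held:
            return False
        self._held.add(key)
        return True

    def release(self, key: str) -> None:
        self._held.discard(key)


def acquire_reload_slot(
    locks: SlotLocks,
    max_concurrent: int = 3,
    attempts: int = 5,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
    rng: Callable[[], float] = random.random,
) -> Optional[str]:
    """Return the key of a free reload slot, or None if all stay busy."""
    for attempt in range(attempts):
        for slot in range(max_concurrent):
            key = f"data-reload-{slot}"
            if locks.try_acquire(key):
                return key
        if attempt < attempts - 1:
            # Exponential backoff with jitter so pods do not retry in lockstep.
            sleep(rng() * base_delay * 2 ** attempt)
    return None


locks = SlotLocks()
held = [acquire_reload_slot(locks) for _ in range(3)]
print(held)  # the three slots are handed out in order
```

With three slots, up to three pods can reload at once while the rest keep serving traffic, so a rollout is not stalled behind a single reload.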
While custom solutions are sometimes necessary, inspiration can often be drawn from existing cloud services. IAS’s queued data reloads now operate effectively within the Kubernetes environment, ensuring data updates occur safely during periods of low traffic.
Article originally published on July 20, 2022.