7 steps to improve API performance
Finding performance problems just a couple of weeks before an API's production release date is common, because most organizations run performance tests only after all development and integration testing is complete. The application owner is then left worrying about how to break the news to the leadership team and how customers will react to slow response times. According to a study by Akamai, every additional second an app takes to respond reduces the conversion rate by seven percent. And if an application takes longer than three seconds, 48% of users will uninstall or stop using the mobile application.
Here is an example from a project where the development team pulled me in to improve the performance of an API for a mobile application. The API could not scale above 40 transactions per second and was going to be a customer experience nightmare. It needed to support 100 transactions per second, but the performance requirements had not been stated explicitly, and the technical team had not probed further, so performance was never considered during design and development.
The testing team ran an end-to-end performance test just two weeks before the production date and found that the API could not handle 100 transactions per second. We first tried to solve the problem by adding resources such as CPU, memory, and additional servers. That helped up to a point, but because the API had not been designed and developed with performance in mind, it still did not scale to meet the requirement. We could have avoided this by addressing performance early in the SDLC: designing for it, testing the individual components of the API, and measuring the network latency and benchmarking the performance of the dependent systems.
Performance bottlenecks that keep you up at night
The next step was to identify the bottlenecks that prevented us from reaching 100 transactions per second. After analyzing the slower transactions and looking at each activity in the flow, we found several problems.
- The response payload for some requests was much larger than for others, because those customers had a long order history with detailed line items for every order, and those transactions took comparatively longer.
- We were retrieving multiple types of information, such as customer demographics and order history, from different databases and then consolidating them synchronously in one monolithic operation.
- The customer demographic database had a ceiling of 200 transactions per second, determined by its server size, and it was shared among multiple applications. It could only be scaled vertically by increasing the size of the server, which required downtime and was not feasible within the timeframe.
- The API was deployed on three workers/servers, which could handle the regular load but could not scale to peak loads.
- Failed transactions took much longer than successful ones.
Designing for performance: The consumer’s dream scenario
Looking at these problems, we made the following design changes to improve performance:
1. Added pagination to large payloads
To solve the problem of relatively large response payloads, we limited the order history returned per request by adding pagination, and we used hypermedia (HATEOAS) links in the JSON response so that clients could fetch the line item details of an order with one additional call, using the order ID as the key. This significantly reduced the payload size for customers with a long order history.
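The pagination and linking idea can be sketched roughly as follows. This is a minimal illustration, not the project's actual implementation: the function name `paginate_orders`, the URL shapes, and field names such as `orderId` and `lineItems` are all assumptions made for the example.

```python
from math import ceil


def paginate_orders(orders, page=1, page_size=10, base_url="/customers/42/orders"):
    """Return one page of order summaries plus hypermedia links.

    The URL layout and field names are hypothetical, chosen for illustration.
    """
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")

    total_pages = max(1, ceil(len(orders) / page_size))
    start = (page - 1) * page_size

    items = [
        {
            "orderId": order["orderId"],
            "total": order["total"],
            # Line items are left out of the summary; the client follows
            # this link only when it actually needs the details.
            "links": {"lineItems": f"/orders/{order['orderId']}/line-items"},
        }
        for order in orders[start:start + page_size]
    ]

    links = {"self": f"{base_url}?page={page}&pageSize={page_size}"}
    if page > 1:
        links["prev"] = f"{base_url}?page={page - 1}&pageSize={page_size}"
    if page < total_pages:
        links["next"] = f"{base_url}?page={page + 1}&pageSize={page_size}"

    return {"items": items, "page": page, "totalPages": total_pages, "links": links}
```

With this shape, the size of each response is bounded by `page_size` rather than by the length of the customer's order history, and the cost of loading line items is paid only for the orders a user opens.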
2. Broke down APIs into microservices
We broke the single monolithic API into multiple microservices, following the MuleSoft API-led connectivity approach. A customer system API connects to and accesses the customer demographic data, and an order system API connects to and accesses the order database. A process API orchestrates and transforms the data from the customer and order APIs. The experience API calls the process API and shapes the data for the mobile application. Because the process and system APIs were designed as microservices, they can be reused in the future and scaled separately according to demand.
3. Created asynchronous APIs
With the customer demographic and order lookups split into two separate system APIs, we were able to call them in parallel. This reduced the response time, because previously the two lookups had been done synchronously, one after the other.
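A minimal sketch of the parallel call pattern, assuming two hypothetical coroutines, `fetch_customer` and `fetch_orders`, that stand in for the two system APIs. The 0.2-second sleeps simulate network and database latency and are not measurements from the project.

```python
import asyncio
import time


async def fetch_customer(customer_id):
    """Hypothetical stand-in for the customer demographic system API."""
    await asyncio.sleep(0.2)  # simulated latency
    return {"customerId": customer_id, "name": "Example Customer"}


async def fetch_orders(customer_id):
    """Hypothetical stand-in for the order system API."""
    await asyncio.sleep(0.2)  # simulated latency
    return [{"orderId": 1, "total": 25.0}]


async def get_customer_view(customer_id):
    # gather() runs both coroutines concurrently, so the total wait is
    # roughly the slower of the two calls rather than their sum.
    customer, orders = await asyncio.gather(
        fetch_customer(customer_id),
        fetch_orders(customer_id),
    )
    return {**customer, "orders": orders}


if __name__ == "__main__":
    started = time.perf_counter()
    view = asyncio.run(get_customer_view(42))
    print(view, f"took {time.perf_counter() - started:.2f}s")
```

Awaiting the two coroutines one after the other would take about the sum of both latencies, while the concurrent version takes about as long as the slower call.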
4. Used connection pooling
We used connection pooling for the database connections, because a significant amount of time was being spent creating and closing a connection for each request.
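To make the idea concrete, here is a small hand-rolled pool built on Python's standard library. It is only a sketch: the `ConnectionPool` class is hypothetical, SQLite stands in for the real database, and in practice you would normally rely on the pooling built into your database driver or integration platform.

```python
import queue
import sqlite3
from contextlib import contextmanager


class ConnectionPool:
    """Keeps a fixed set of open connections and lends them out on demand."""

    def __init__(self, size, database=":memory:"):
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            # The connection cost is paid once here, not on every request.
            # check_same_thread=False lets a pooled connection be used by
            # whichever worker thread borrows it.
            self._idle.put(sqlite3.connect(database, check_same_thread=False))

    @contextmanager
    def connection(self, timeout=5.0):
        # Blocks until a connection is free; raises queue.Empty on timeout.
        conn = self._idle.get(timeout=timeout)
        try:
            yield conn
        finally:
            # Return the connection to the pool instead of closing it.
            self._idle.put(conn)

    def close(self):
        """Close all idle connections, for example at shutdown."""
        while not self._idle.empty():
            self._idle.get_nowait().close()
```

A request handler would then wrap its queries in `with pool.connection() as conn:` and never open or close a connection itself.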
5. Added caching
We noticed that 30% of the requests for customer demographics were repeated within a 15-minute period. Customer demographics don't change frequently, so we added a caching policy using the API Manager. This relieved the capacity bottleneck of the customer database by reducing the number of calls made to it.
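The effect of a time-based caching policy can be illustrated with a small in-process cache. Note that this is a swapped-in technique for illustration: the project configured caching as a policy in the API Manager, whereas the hypothetical `TTLCache` class below only shows the underlying idea, with a 15-minute time-to-live taken from the observation above.

```python
import time


class TTLCache:
    """Caches loaded values for a fixed time-to-live (default 15 minutes)."""

    def __init__(self, ttl_seconds=15 * 60, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without waiting
        self._entries = {}   # key -> (expires_at, value)

    def get_or_load(self, key, loader):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # fresh hit: the backend is not called

        # Miss or expired entry: call the backend once and remember the result.
        value = loader(key)
        self._entries[key] = (now + self._ttl, value)
        return value
```

Here `loader` would be whatever function calls the customer database. Repeated requests for the same customer within the time-to-live are served from memory, and this sketch does not evict expired entries, which a production cache would need to do.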
6. Deployed auto-scaling
We deployed the APIs within auto-scaling groups so that they could scale between two and five instances, depending on the normal and peak load at different times of the day.
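The scaling decision itself is normally made by the platform, but the rule it applies can be sketched in a few lines. The bounds of two and five instances come from the text. The `desired_instances` function and the per-instance capacity passed to it are assumptions for illustration only.

```python
from math import ceil

MIN_INSTANCES = 2  # lower bound from the deployment described above
MAX_INSTANCES = 5  # upper bound from the deployment described above


def desired_instances(current_tps, tps_per_instance):
    """Return how many instances to run for the observed load.

    tps_per_instance is a hypothetical capacity figure that would have to
    be measured in a load test; it is not a number from the project.
    """
    if tps_per_instance <= 0:
        raise ValueError("tps_per_instance must be positive")
    needed = ceil(current_tps / tps_per_instance)
    # Clamp to the configured bounds of the auto-scaling group.
    return max(MIN_INSTANCES, min(MAX_INSTANCES, needed))
```

For example, with an assumed capacity of 25 transactions per second per instance, a load of 100 transactions per second yields four instances, while a quiet period still keeps the minimum of two running.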
7. Switched to asynchronous error-logging
We changed error logging to be asynchronous, so that a request no longer had to wait for its log entry to be written.
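In Python, one way to sketch this pattern is with the standard library's `QueueHandler` and `QueueListener`: the request thread only puts the record on an in-memory queue, and a background thread performs the slow write. The helper name `build_async_logger` is hypothetical, and this is not how the original project's logging was configured.

```python
import logging
import logging.handlers
import queue


def build_async_logger(name, target_handler):
    """Return a logger whose records are written by a background thread.

    target_handler is the slow destination (file, network, and so on).
    """
    log_queue = queue.Queue(-1)  # unbounded, so logging calls never block

    logger = logging.getLogger(name)
    logger.setLevel(logging.ERROR)
    logger.propagate = False
    # The request thread only enqueues the record, which is fast.
    logger.addHandler(logging.handlers.QueueHandler(log_queue))

    # The listener drains the queue on its own thread and does the slow write.
    listener = logging.handlers.QueueListener(log_queue, target_handler)
    listener.start()
    return logger, listener
```

The caller should invoke `listener.stop()` at shutdown so that any records still in the queue are flushed before the process exits.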
With all these changes, we reached the target performance of 100 transactions per second. Had we kept the performance requirements in mind from the start, this late rework could have been avoided. Going forward, treat performance as a critical requirement when designing an API and keep these best practices in mind.
Monitoring: the key to enhancing consumer experience
As a next step, take a proactive approach to performance testing and monitoring rather than working on them after performance issues appear. It is very important to continuously monitor the performance of your APIs with MuleSoft's Anypoint Monitoring, which bridges the divide between application performance monitoring and log management and is the de facto monitoring tool for enterprise-grade visibility. Anypoint Monitoring gives granular visibility into APIs, integrations, and dependencies. These insights are extracted automatically, with very low performance overhead and no required application modifications. More than 40 metrics are available through built-in dashboards, covering everything from inbound requests, outbound calls, and dependencies to performance, failures, JVM, and infrastructure metrics.