The National Informatics Centre rolled out the Aarogya Setu app in just 15 days, supporting 50 million downloads within 13 days of launch, by relying on a dedicated team of volunteers (and the right tools). One of those volunteers, Vikalp Sahni, co-founder and chief technology officer of Goibibo, tells us how he and his team created one of the first contact tracing apps to be rolled out globally, one that has since been used to identify over 3,000 COVID-19 hotspots.
Vikalp also shares with us the steps that the Indian tech community and the Government of India took to reach the scale of 140 million+ downloads.
Here are some edited excerpts from the interview:
What were some of the design and architecture principles that were kept in mind before the team started working on the Aarogya Setu App?
We kept the backend architecture deliberately simple, built on microservices. We leveraged a high-performance, highly distributed NoSQL database to absorb the workload of millions of requests hitting us at scale. The operations themselves are kept very simple, which is how we sustain a 99th-percentile response time of under 100 ms. At the frontend, we took a hybrid approach, using native components alongside WebViews, to build a solution that is both highly scalable and quick to iterate on.
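To make that concrete, here is a minimal sketch of what such a simple, stateless write path might look like. It is an illustration only: the framework, endpoint, table name, and the choice of DynamoDB as the distributed NoSQL store are assumptions, not details confirmed in the interview.

```python
# A minimal sketch of a simple, stateless microservice write path:
# one validated write into a distributed NoSQL store, with no joins
# or multi-step transactions, so p99 latency stays low under load.
# The endpoint, table name, and DynamoDB itself are illustrative
# assumptions, not details from the Aarogya Setu team.
import time

import boto3
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()
table = boto3.resource("dynamodb").Table("contact_events")  # hypothetical table


class ContactEvent(BaseModel):
    device_id: str  # anonymised identifier of the reporting device
    peer_id: str    # anonymised identifier of the device seen nearby
    rssi: int       # Bluetooth signal strength, a rough proxy for distance


@app.post("/v1/contact")
def record_contact(event: ContactEvent):
    # A single put_item call: constant work per request is what keeps
    # the 99th-percentile response time predictable as traffic grows.
    table.put_item(
        Item={
            "device_id": event.device_id,
            "peer_id": event.peer_id,
            "rssi": event.rssi,
            "seen_at": int(time.time()),
        }
    )
    return {"status": "ok"}
```

The point is the shape of the operation: one validated write, no joins, no fan-out on the request path, so latency stays flat no matter how many devices are reporting.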
What were the biggest challenges to this project?
Scalability and security were the two biggest technical challenges. We reached 50 million citizens in 13 days, and on a single day close to 14 million citizens downloaded the Aarogya Setu app. Scaling services to that level was a serious engineering challenge, and the teams were constantly working to pre-empt the load and innovate so that we never faced downtime or performance glitches. Fortunately, we passed the scale challenge quite effectively.
Security was the other big challenge we faced. Many hackers tried to breach the application's security, and its privacy features were constantly being questioned. We saw many attempts, ethical and unethical, to break into the services or reverse engineer the app. We withstood all of them because the team took the right measures and thought through these vulnerabilities while developing.
Can you elaborate on the key steps that the Indian tech community and the Government of India took to reach the scale of 140 million+ downloads?
The Government of India helped massively by running a large-scale campaign across India. Even our Honorable Prime Minister Narendra Modi urged citizens to download the app. The government also reached out to various private and public establishments for help in increasing the reach of contact tracing through Aarogya Setu. Without the marketing efforts of the Government of India, this kind of growth and success would not have been possible.
How did using the right technology enable the team to meet surges in demand? (At one point, the contact tracing app recorded seven million server requests in a single minute.)
As mentioned, the choice of technology and the right architecture proved pivotal. We stuck to our philosophy of keeping services simple and leveraging the mobile device itself to store private contact tracing data locally. We also ensured we had the right DevOps practices to scale automatically up and down as resource needs changed, and to facilitate this we leveraged a serverless design and a containerization approach. Partners such as AWS and New Relic helped us pre-empt bottlenecks at an early stage so we could scale.
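The on-device storage idea is worth sketching: contact events stay in a local database on the phone, and nothing is uploaded unless a specific trigger fires. The snippet below illustrates the pattern in Python with SQLite; the real app is a native mobile client, so the schema, retention window, and upload trigger here are illustrative assumptions.

```python
# Sketch of the "store contact data locally" pattern: every Bluetooth
# sighting is written to an on-device database, and nothing is sent to
# the server unless an upload is explicitly triggered (e.g. the user
# tests positive and consents). Schema and trigger are illustrative
# assumptions; the real app is a native mobile client.
import sqlite3
import time

db = sqlite3.connect("contacts.db")
db.execute(
    """CREATE TABLE IF NOT EXISTS contact_events (
           peer_id TEXT NOT NULL,
           rssi    INTEGER NOT NULL,
           seen_at INTEGER NOT NULL
       )"""
)


def record_sighting(peer_id: str, rssi: int) -> None:
    """Store a nearby-device sighting locally; no network call is made."""
    db.execute(
        "INSERT INTO contact_events VALUES (?, ?, ?)",
        (peer_id, rssi, int(time.time())),
    )
    db.commit()


def export_for_upload(window_secs: int = 14 * 24 * 3600) -> list:
    """Return only the recent window of events, to be shared only when
    the user has explicitly consented (e.g. after a positive test)."""
    cutoff = int(time.time()) - window_secs
    rows = db.execute(
        "SELECT peer_id, rssi, seen_at FROM contact_events WHERE seen_at >= ?",
        (cutoff,),
    )
    return rows.fetchall()
```

Keeping the data on the device serves both goals at once: private contact histories never sit on a central server by default, and the servers are spared the write traffic of every sighting from every phone.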
What are some of the lessons for developers from building an app that is massively scalable?
1. Keep the user-facing services extremely simple and performant, and move all the complexity into backend async processes (see the sketch after this list).
2. Choose an architecture that is quick to scale up and scale down; hence a container-first and serverless deployment approach.
3. The frontend is a major bottleneck when it comes to quick rollouts. Strike the right balance between native and web components within the app, and make the frontend configurable.
4. Mobile is a fat client and apps can do a lot, so leverage client-side technology efficiently to distribute some of the load that would otherwise land on the servers.
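Here is a minimal sketch of the first lesson, assuming an AWS SQS queue sits between the user-facing handler and the backend workers (the queue URL, message shape, and processing step are hypothetical): the hot path only validates and enqueues, while the expensive work happens asynchronously.

```python
# Sketch of lesson 1: the user-facing handler does the bare minimum
# (validate, enqueue, acknowledge), while all heavy processing happens
# asynchronously in a backend worker that drains the queue. The queue
# URL and message shape are hypothetical, for illustration only.
import json

import boto3

sqs = boto3.client("sqs")
QUEUE_URL = "https://sqs.ap-south-1.amazonaws.com/123456789012/contact-ingest"  # hypothetical


def handle_request(payload: dict) -> dict:
    """Fast, user-facing path: validate and enqueue, then return."""
    if "device_id" not in payload:
        return {"status": "error", "reason": "missing device_id"}
    sqs.send_message(QueueUrl=QUEUE_URL, MessageBody=json.dumps(payload))
    return {"status": "accepted"}  # heavy lifting happens off the hot path


def process(event: dict) -> None:
    """Placeholder for the expensive work (risk scoring, hotspot
    aggregation, notifications) that stays off the request path."""
    print("processing", event)


def worker_loop() -> None:
    """Backend async process: drain the queue and do the heavy work."""
    while True:
        resp = sqs.receive_message(
            QueueUrl=QUEUE_URL, MaxNumberOfMessages=10, WaitTimeSeconds=20
        )
        for msg in resp.get("Messages", []):
            process(json.loads(msg["Body"]))
            sqs.delete_message(
                QueueUrl=QUEUE_URL, ReceiptHandle=msg["ReceiptHandle"]
            )
```

The design payoff is that the user-facing call does constant work no matter how expensive the downstream processing becomes, so request latency stays flat even as features are added behind the queue.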