On Thursday 23 November 2017, we held the inaugural Prometheus Meetup in Singapore at honestbee, in our Delta House office. It was an introductory session on Prometheus, an open source systems monitoring and alerting toolkit, and the first session for the Prometheus meetup community in Singapore, which will focus on monitoring modern cloud native infrastructure. About thirty people from the software engineering sector attended. The event was hosted by Cloudflare Site Reliability Engineers together with honestbee DevOps Engineer Vincent De Smet.

The first speaker for the evening was Arseny Chekhov, from Standard Chartered Bank’s Cloud Team. He introduced the history of Prometheus to the attendees, reviewed select monitoring chapters of Google’s Site Reliability Engineering (SRE) book, gave a high-level architecture overview of Prometheus, and covered some of the latest release highlights.

You can watch the video here.

The second speaker was Binh Le, Site Reliability Engineer at Cloudflare, who gave a more practical introduction to Prometheus and its related tools for the next generation of monitoring systems. He also shared how Prometheus is used to monitor global infrastructure at Cloudflare.

You can watch the video here.

Finally, Antonio Cocera, Site Reliability Engineering Team Lead at Cloudflare, presented a case study of migrating from Nagios to Prometheus, sharing the challenges he faced and the lessons he learned from the experience.

You can watch the video here.

This session provided a great opportunity to learn from and interact with core members of the team that built the monitoring solution at a company of Cloudflare’s scale. The presentations added deep technical insights to Matt Bostock’s PromCon 2017 Berlin presentation, Monitoring Cloudflare’s Planet-Scale Edge Network with Prometheus.