Saturday, 15 October 2011

MaxHeap's very own BI

With this post, we plan to take you through a brief tour of all the cool stuff that has happened in our tools team in recent times.

Every company eventually needs business intelligence, and business intelligence means large volumes of data. Like any company, we generate hundreds of thousands of lines of logs a day. And since we call it a log, the data we record there remains trapped for all eternity unless someone decides to do something about it.

So the tools team rose to the occasion. At the suggestion of our catalyst and founder, Lalit, we decided to do something cool here. Lalit did an excellent job of analysing what a typical BI system would need. He was spot on about having room to twist malleable data: clean it, record it and use it as we'd want to. So we needed something that could do a little more than a normal storage backend like a typical SQL database. He suggested MongoDB, and that is when I fell in love with this wonderful database.
We had been aiming at a simpler system earlier, but as I dived deeper into Mongo and NoSQL territory, the horizons started widening instantly and expansively. I knew that a lot more was possible with this database – and that was the need of the hour: a system we could scale massively.
So I decided to go much further than a business analytics system. I decided to deliver a framework itself.

The initial phases of development were not taxing, and the transition from SQL to NoSQL was seamless. Within a couple of weeks, we were dealing with humongous amounts of data: parsing it, cleaning it and feeding it to the DB. Just like that.
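The parse-clean-store pipeline can be sketched roughly as follows. The log format, field names and regex here are hypothetical (our actual logs differ), and the `collection.insert` call stands in for pymongo's batch insert of that era; treat this as an illustration of the flow, not our actual code.

```python
import re
from datetime import datetime

# Hypothetical log line format: "2011-10-15 12:00:00 ERROR web01 message..."
LOG_PATTERN = re.compile(
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"(?P<level>[A-Z]+) (?P<host>\S+) (?P<message>.*)"
)

def parse_line(line):
    """Turn one raw log line into a document ready for MongoDB."""
    match = LOG_PATTERN.match(line.strip())
    if match is None:
        return None  # cleaning step: drop lines we cannot parse
    doc = match.groupdict()
    # Store the timestamp as a real datetime so Mongo can range-query it
    doc["ts"] = datetime.strptime(doc["ts"], "%Y-%m-%d %H:%M:%S")
    return doc

def load(lines, collection):
    """Parse a batch of lines and insert the clean ones into a collection."""
    docs = [d for d in (parse_line(l) for l in lines) if d is not None]
    if docs:
        collection.insert(docs)  # batch insert; a pymongo collection fits here
    return len(docs)
```

Because each parsed line becomes a plain document, there is no schema migration to fight when a new field shows up in the logs – which is much of the appeal over a fixed SQL table.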

We then proceeded to build a report-scheduling system that, again, worked on top of MongoDB. Not just that: we also implemented a caching system that can now be used not only by the BI framework but by any team in need of a really fast cache on the web.
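The contract behind that cache is simple: set a value, read it back fast, and have it expire after a time-to-live. The toy in-process sketch below only illustrates that get/set-with-expiry contract; the real system is backed by MongoDB and shared across teams, and the class and parameter names here are made up for illustration.

```python
import time

class SimpleTTLCache:
    """Toy sketch of a cache whose entries expire after a fixed TTL."""

    def __init__(self, ttl_seconds=60):
        self.ttl = ttl_seconds
        self._store = {}  # key -> (expires_at, value)

    def set(self, key, value):
        self._store[key] = (time.time() + self.ttl, value)

    def get(self, key, default=None):
        entry = self._store.get(key)
        if entry is None:
            return default
        expires_at, value = entry
        if time.time() >= expires_at:
            del self._store[key]  # lazy eviction: expired entries die on read
            return default
        return value
```

Lazy eviction keeps the sketch short; a shared, Mongo-backed cache would instead lean on the database to reap expired entries so every client sees the same view.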

All of this didn't need genius coding. It only needed the tools team and mongodb.


  1. I am not sure of the details you are divulging here, but given the logs, you could implement something like a failure alarm, or a tool that lets you select the hosts whose logs you want to search and spits out the errors that have occurred. That covers the OLTP data.
    If you already have a reporting system in place, you can set up an OLAP environment that accumulates events over time and generates dashboards you can view for analysis.

  2. Hello Harshit. What you've pointed out, we already have in place. A business intelligence framework is very different, and the idea of using MongoDB to store and mine the data satisfies all the requirements an OLAP solution would.
    In any case, you need to be able to answer 'questions' about the past, which is exactly what we've aimed to solve – and to a huge extent, we've been successful at that.