Thursday, 8 November 2012

Anti spam framework - making sense of Spam

Anti Spam System used in CF - Real Estate

Hello Everybody.

In this post, I will explain the framework we have implemented to handle spam on our site. As you know, CommonFloor allows owners, seekers, and real estate agents to find and communicate with each other. The side effect of this wonderful idea is spam. We needed a system to monitor and detect spam and to protect genuine users from spam messages.

We have four categories of users:

  • Seeker - looking to buy/rent a property 
  • Owner - looking to sell/rent
  • Real Estate Agent
  • Builder

At CommonFloor, we allow only registered users (those with verified mobile numbers) to communicate with each other. When a seeker is interested in a project or property, he or she sends a message to the owner of that project or property via SMS and email.

All communications are processed by our anti-spam system before being sent. When the system receives a message to be checked, it passes the message to the algorithm, which returns a 'spam score' (the probability that the message is spam). We have internally set a threshold for this spam score. If the score is above the threshold, the message is categorized as 'spam' and is not sent. If the score is below the threshold (the algorithm is not sure whether the message is spam), the message goes to a moderation console, where our team decides whether it is spam or not. Based on that decision, the message is either sent or discarded.

The moderators' decision is also sent back to the algorithm as feedback for self-learning, so the algorithm gets better as it processes more messages and receives more feedback. All spam messages are stored in a separate database, where they are used to train the algorithm and for future analysis.
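The flow described above can be sketched roughly as follows. This is only an illustration, not our production code: the class `AntiSpamPipeline`, the `toy_score` function, the threshold value, and the in-memory lists standing in for the spam database and moderation console are all assumptions made for the example. The real scoring algorithm and threshold are internal.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class AntiSpamPipeline:
    """Hypothetical sketch of the score -> threshold -> moderation flow."""
    score_fn: Callable[[str], float]   # returns a spam probability in [0, 1]
    threshold: float = 0.9             # assumed value, for illustration only
    spam_store: List[str] = field(default_factory=list)        # stand-in for the spam db
    moderation_queue: List[str] = field(default_factory=list)  # stand-in for the console
    feedback: List[Tuple[str, bool]] = field(default_factory=list)  # (message, is_spam)

    def check(self, message: str) -> str:
        """Score a message and either block it or queue it for moderation."""
        score = self.score_fn(message)
        if score > self.threshold:
            # Confident spam: store it and do not send.
            self.spam_store.append(message)
            return "blocked"
        # Algorithm is not sure: a human decides.
        self.moderation_queue.append(message)
        return "moderation"

    def moderate(self, message: str, is_spam: bool) -> str:
        """Record the moderator's decision and act on it."""
        self.moderation_queue.remove(message)
        # The decision becomes a labelled example for later retraining.
        self.feedback.append((message, is_spam))
        if is_spam:
            self.spam_store.append(message)
            return "discarded"
        return "sent"


def toy_score(message: str) -> float:
    """Placeholder scorer: fraction of words found in a tiny spam word list."""
    spam_words = {"lottery", "winner", "free"}
    words = message.lower().split()
    if not words:
        return 0.0
    return sum(1 for w in words if w in spam_words) / len(words)


pipeline = AntiSpamPipeline(score_fn=toy_score, threshold=0.5)
first = pipeline.check("free lottery winner")
second = pipeline.check("Is the flat still available")
decision = pipeline.moderate("Is the flat still available", is_spam=False)
```

In this sketch the first message is blocked outright, while the second goes to the moderation queue and is sent once a moderator marks it as genuine. The `feedback` list is where a real system would collect labelled examples to retrain the algorithm.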

With this system in place, I can assure users that the messages they receive come from genuine users and that they are protected from spam.