Friday, 9 December 2011

Apache Solr: DisMax and multiValued

At maxHeap, we try our best to speed up data delivery to the end user. MySQL just isn't fast enough for search, so we use Apache Solr. Solr uses Lucene at its core (both are part of the Apache project) to deliver lightning-fast results. So where does Solr fit into CommonFloor? Well, the auto-suggest and all searching on CommonFloor are powered by Solr.

Recent versions of Solr include a different search handler called Disjunction Maximum, or simply DisMax. The default search handler is pretty stupid, pardon my strong language, but that's really the case. There is no way to run one query across multiple fields! Instead, you have to repeat the same query for each field that you want to search. Developers worked around the problem with the copyField directive, which appends the contents of a source field to a destination field. With this method, all queries were run against that one catch-all field.

But then data grew complex, and there was a new requirement! Not all fields carry the same weight: some are very important and some are less so. For example, the title would be very important, while the URL is not so much. The copyField method cannot express this, because once the values are merged into one field there is no telling which source field a match came from. Developers needed something smarter and more robust, and that's where DisMax and eDisMax (extended DisMax) come in.

DisMax allows you to execute one query across multiple fields while giving each field a different weight (they call this a boost). This allows complex searches with a robust method of boosting and selecting results. DisMax parameters like mm (Minimum Should Match), pf and ps (Phrase Fields and Phrase Slop) and of course qf (Query Fields, where the fields and their boosts are specified) allow advanced matching criteria and a great way of ranking the results exactly how you want them. I could go on forever about DisMax and its uses, but I'll leave it to you to explore!
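To make the parameters above a little more concrete, here is a minimal sketch of how a DisMax request URL could be put together. The host, the handler path, the field names (title, description, url) and every boost value are assumptions made up for illustration; they are not our actual CommonFloor configuration.

```python
from urllib.parse import urlencode


def build_dismax_url(base_url, user_query):
    """Build a Solr select URL that runs one query across several boosted fields.

    All field names and boost values here are hypothetical examples.
    """
    params = {
        "defType": "dismax",                      # use the DisMax query parser
        "q": user_query,                          # the raw user input
        "qf": "title^10 description^2 url^0.5",   # Query Fields with per-field boosts
        "pf": "title^20",                         # Phrase Fields: extra boost for phrase matches
        "ps": "2",                                # Phrase Slop allowed for pf matches
        "mm": "2<75%",                            # Minimum Should Match rule
        "wt": "json",                             # response format
    }
    return base_url + "?" + urlencode(params)


# Hypothetical local Solr instance.
url = build_dismax_url("http://localhost:8983/solr/select", "apartments in bangalore")
print(url)
```

The point of the sketch is that the weighting lives in the request (or in the handler defaults), not in the index: title matches count for more than url matches without any copyField trickery.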

The second, and I'd say the more important, thing is upgrading the schema (Solr has schemas just like databases, though they differ significantly). While upgrading from Solr version 1.3 to the latest version we ran into a lot of trouble, and due to the lack of full documentation we were on our own to solve the issues.

The problem: Solr can have multi-valued fields (they are like arrays), which behave differently from normal single-valued fields. The main practical difference is that Solr cannot sort on a multi-valued field. The Solr schema specifies all the details of the fields: how they should be processed, what filters to apply, how to parse the values and so on, along with one very important attribute, the schema version number. Since our Solr was pretty outdated, the schema version was set to '0.1', which directed Solr to use the rules of the oldest schema version, '1.0'. Version '1.0' treats all fields as multiValued by nature, so even after we specified multiValued as false for each field, Solr still treated every field as multiValued!

We spent a lot of time and hit a lot of errors without figuring it out. Then it clicked! A very simple change, setting the schema version to '1.1', solved the problem, because version '1.1' directs Solr to treat fields as multiValued="false" by default. As far as we could tell this was not documented anywhere, which is why we had such a hard time finding such a small fix!
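The version rule is easier to see in a small model. The sketch below is not Solr's code; it simply encodes the behaviour as we observed and described it above (any schema version below 1.1 makes every field multiValued, from 1.1 onward the attribute is honoured and defaults to false). The schema snippet and its field names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, stripped-down schema.xml; {version} is filled in below.
SCHEMA_XML = """
<schema name="example" version="{version}">
  <fields>
    <field name="id" type="string" indexed="true" stored="true"/>
    <field name="title" type="text" indexed="true" stored="true" multiValued="false"/>
    <field name="tags" type="string" indexed="true" stored="true" multiValued="true"/>
  </fields>
</schema>
"""


def effective_multivalued(schema_xml):
    """Return {field name: treated as multiValued?} under the rule described in this post.

    Assumption: this mirrors the behaviour we observed, not Solr's actual implementation.
    """
    root = ET.fromstring(schema_xml.strip())
    version = float(root.get("version", "1.0"))
    result = {}
    for field in root.iter("field"):
        if version < 1.1:
            # Old schema versions: everything is multiValued, whatever the attribute says.
            result[field.get("name")] = True
        else:
            # Version 1.1 and later: honour the attribute, defaulting to false.
            result[field.get("name")] = field.get("multiValued", "false") == "true"
    return result


print(effective_multivalued(SCHEMA_XML.format(version="0.1")))
print(effective_multivalued(SCHEMA_XML.format(version="1.1")))
```

With version "0.1" every field comes back as multi-valued, including title with its explicit multiValued="false"; with "1.1" only tags does, which is exactly the one-attribute fix that unblocked sorting for us.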

