Sunday, October 24, 2010

Testing Solr Indexing Software

Some of the sharpest MARC -> Solr coders have scanned some of my indexing tests and have scratched their heads, wondering why I have "gone overboard" and why I am "testing Solr itself."   Others have positively panted in anticipation when they heard that Bess Sadler (and I, sort of) came up with a way to do relevancy testing. And I know a few of you out there remember the 2010 Code4Lib presentation Jessie Keck, Willy Mene and I did on testing (http://www.stanford.edu/~ndushay/code4lib2010/I_am_not_your_mother.pdf).

So ... what tests should be written by those of us saddled with converting (MARC) records into a Solr index?  And how can we write these tests so they can be used by continuous integration?

Building on previous conversations with Bob Haschart, Jonathan Rochkind and Bill Dueber,  Bill and I had an illuminating conversation last Friday about testing code that writes data to Solr. We teased out four different levels of tests.
  1. Mapping Tests: from raw data to what will be written to Solr.  This is the code that takes the raw data and turns it into field values that will be written to Solr.  "Mapping Tests" is a good name for these (see the first sketch after this list).
    1. "Am I assigning the right format to a MARC record with this leader, this 008 and this 006?"
    2. "Am I pulling out exactly the OCLC numbers we want from this record with 8 OCLC numbers?"
    3. "Is the sortable version of the title correctly removing non-filing characters for Hebrew titles?"
  2. Searchable Value Tests: from (raw data) to (searchable) value stored in Solr.  Bill points out that Solr's schema.xml could transform a field's searchable value as the document is added to Solr.  The field's type definition in schema.xml might strip out whitespace, punctuation, stopwords, etc.  It might substitute characters (maybe ü becomes ue, or maybe it becomes u).  Note that the stored field value will not be changed by schema.xml -- whatever was in the Solr document written to the index will be the value retrieved from a stored field.  (See the second sketch after this list.)
    1. "I know I left the letter 'a' prefix in that field, because I wrote a test to ensure the 'a' prefix is present in the field value to be written to Solr.  So why aren't searches with the 'a' prefix working?"
  2. Search Acceptance Tests: from (raw data) to Solr search results.  What if you're not sure of the best way to define a field type and/or to transform the raw data in the indexing code and/or to set up a RequestHandler to achieve the appropriate search results?  (See the third sketch after this list.)
    1. In a call number, some periods are vital to searches (P3562.8) and some are not (A1 .B2).  Some spaces are vital (A1 .B2 1999) and some are not (A1.B2).  What is the best way to index the raw data to get the desired results?  What is the best way to configure the request handler to get the desired results?
    2. Do author search results contain stemmed versions of names (michaels vs. michael)?  Stopwords (earl of sandford vs. earl sandford)?  What about the default request handler?  Are results correct for hyphenated names?
    3. Will a query containing a stopword match raw data with the same stopword?  Will it match raw data without a stopword?  With a different stopword?  (dogs in literature vs. dogs of literature vs. dog literature).  How should the query be analyzed? 
    4. "I am not seeing that field in the search results, but I know the field is present because I wrote a test to check the field value is correct in the document to be written to Solr."  (Oops - is stored="true" on the field definition?) 
  4. Full Stack Tests: from a query in your UI to Solr search results (see the fourth sketch after this list).  There are simple things that might be done by your UI code, such as translating the displayed value of a location facet to a different value for the actual query.  There are also situations when the transformations must be more complex -- converting query strings from multiple form fields into a complex Solr query with local params, as we do with our Advanced Search form (see http://www.stanford.edu/~ndushay/code4lib2010/advSearchSolrQueries.pdf or, for a more complete explanation, see http://www.stanford.edu/~ndushay/code4lib2010/advanced_search.pdf).
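Here is a minimal sketch of a mapping test, in JUnit.  FormatMapper.getFormat is a hypothetical stand-in for whatever method in your indexing code derives the format from the leader (and 008 and 006); the point is that tests at this level never touch Solr at all:

    import static org.junit.Assert.assertEquals;
    import org.junit.Test;

    public class FormatMappingTest {

        // FormatMapper.getFormat(leader, field008, field006) is a hypothetical
        // stand-in for your own format-mapping code.
        @Test
        public void bookLeaderMapsToBook() {
            // leader/06 = 'a' (language material), leader/07 = 'm' (monograph)
            assertEquals("Book", FormatMapper.getFormat("01952cam a2200457 a 4500", null, null));
        }

        @Test
        public void serialLeaderMapsToJournal() {
            // leader/07 = 's' (serial); real tests would also exercise the 008 and 006
            assertEquals("Journal", FormatMapper.getFormat("01952cas a2200457 a 4500", null, null));
        }
    }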
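Next, a sketch of a searchable value test.  It assumes a SolrJ client named solr wired up to a test core (SolrTestBase is a hypothetical base class that does that wiring, embedded or over HTTP), and a hypothetical stored-and-indexed title_search field whose analysis chain lowercases and folds diacritics:

    import static org.junit.Assert.assertEquals;

    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.common.SolrDocument;
    import org.apache.solr.common.SolrInputDocument;
    import org.junit.Test;

    public class SearchableValueTest extends SolrTestBase {

        @Test
        public void umlautIsFoldedForSearchingButNotForDisplay() throws Exception {
            SolrInputDocument doc = new SolrInputDocument();
            doc.addField("id", "test-1");
            doc.addField("title_search", "Über die Sprache");
            solr.add(doc);
            solr.commit();

            // the analysis chain in schema.xml should fold ü to u at index and query time ...
            assertEquals(1, solr.query(new SolrQuery("title_search:uber")).getResults().getNumFound());

            // ... but analysis never touches the stored value
            SolrDocument hit = solr.query(new SolrQuery("id:test-1")).getResults().get(0);
            assertEquals("Über die Sprache", hit.getFieldValue("title_search"));
        }
    }

Whether ü should fold to u or to ue is exactly the kind of decision a test like this pins down.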
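Third, a sketch of a search acceptance test for the call number example.  indexDocWithCallnum is a hypothetical helper that pushes a minimal record through your real indexing code, and callnum_search is a hypothetical field; the assertions encode the behavior you decided you want, whatever field type and request handler you end up with:

    import static org.junit.Assert.assertEquals;

    import org.apache.solr.client.solrj.SolrQuery;
    import org.junit.Test;

    public class CallNumberSearchTest extends SolrTestBase {

        @Test
        public void periodsAndSpacesInCallNumbers() throws Exception {
            indexDocWithCallnum("rec1", "P3562.8 .L3 1972");
            indexDocWithCallnum("rec2", "A1 .B2 1999");
            solr.commit();

            // the period in P3562.8 is vital: dropping it should not match
            assertEquals(1, numFound("P3562.8"));
            assertEquals(0, numFound("P35628"));

            // the period before .B2 is not vital: both forms should match
            assertEquals(1, numFound("A1 .B2 1999"));
            assertEquals(1, numFound("A1 B2 1999"));
        }

        private long numFound(String callnum) throws Exception {
            return solr.query(new SolrQuery("callnum_search:(" + callnum + ")"))
                       .getResults().getNumFound();
        }
    }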
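Finally, a sketch of one slice of a full stack test for the advanced search case.  AdvancedSearchQueryBuilder is a hypothetical class that turns form fields into a single Solr query string; the _query_ nested-query syntax with local params is real Solr, but the qf parameter names here are made up for illustration:

    import static org.junit.Assert.assertEquals;

    import java.util.LinkedHashMap;
    import java.util.Map;
    import org.junit.Test;

    public class AdvancedSearchQueryTest {

        @Test
        public void titleAndAuthorBecomeNestedDismaxQueries() {
            Map<String, String> form = new LinkedHashMap<String, String>();
            form.put("title", "dogs in literature");
            form.put("author", "michael");

            String q = AdvancedSearchQueryBuilder.build(form);

            assertEquals("_query_:\"{!dismax qf=$qf_title}dogs in literature\""
                    + " AND _query_:\"{!dismax qf=$qf_author}michael\"", q);
        }
    }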
Recall the testing mantra:  if you had to test it manually, then it's worth automating the test. 


A few principles of good test code:
  • test code should live close to the code it is testing
  • tests should be automate-able via a script for continuous integration
  • tests should exercise as much of the code you write as is practical.
  • tests are useless if they give false positives:  you should be certain that the test code fails when it is supposed to - this is the "test the test code" principle.
  • tests should assert that the code behaves well with bad data ... the data is always dirty.
Yes, it takes you longer to do the initial coding ... but you get all that time back, and then some, as the system grows and needs maintenance.  It may be hard to believe, but when you start writing tests, your code improves.  Not only because you are testing for error conditions and the like, but also because thinking about tests subtly changes the way you write your code - it becomes more modular, clearer.  Plus it allows you to refactor with confidence as a better approach presents itself.
Continuous Integration for these tests

Automate-able scripts that run your tests will differ depending on the language and tools you have used.  I tend to have ant targets for tests written in java and rake tasks in other contexts.  I will provide more specifics in future posts as I step through examples of the different types of tests.
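For example, a bare-bones ant target for java tests might look like the following (the target and property names are placeholders for whatever your build file defines):

    <target name="test" depends="compile-tests">
      <junit fork="true" haltonfailure="true">
        <classpath refid="test.classpath"/>
        <formatter type="xml"/>
        <batchtest todir="${test.results.dir}">
          <fileset dir="${test.classes.dir}" includes="**/*Test.class"/>
        </batchtest>
      </junit>
    </target>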

If you need to sell any of this to your management, try this:

Our staff are delighted to hear that when we fix a reported searching problem, we use the examples from their bug reports to ensure we won't unknowingly unfix things in the future.  It's a big win to tell them they won't be asked to repeat the tests manually when we upgrade Solr or make any other changes.  Also, over time, the maintenance costs of the software will be lower.

1 comment:

  1. Great post, Naomi. I liked this point in particular: "it takes you longer to do the initial coding ... but you get all that time back, and then some, as the system grows and needs maintenance. It may be hard to believe, but when you start writing tests, your code improves". Really nicely put. / Daniel
