Discovery Grindstone: Testing Solr Indexing Software - Search Acceptance and Searchable Value Tests

In a previous post, I talked about the different levels of testing for code that writes to a Solr index. This post will go into detail about Search Acceptance and Searchable Value tests, which are essentially acceptance tests for search results.

Recall the testing mantra: if you had to test it manually, then it's worth automating the test. Solr is a complex black box for most of us - you need to know that twiddling any knobs on that box won't affect your search results in ways you didn't expect.

Many people do these sorts of tests ad hoc, which is fine if the raw data and the desired searching behavior are simple, you're certain you have met all the search requirements, and you'll never have to touch the Solr configuration files again. (... what reality are you living in? I want to join you.)

For most of us, there are times when we're not sure how to achieve the appropriate search results.

Search Acceptance Tests

Some of the questions we confront when we configure Solr:

How to define the Solr field(s) and field type(s) to be searched:

  How is the text analyzed/tokenized?
  Should it be indexed? Stored? Multivalued?

How to transform the raw data into Solr searched values (this part on its own could be a Mapping test, if you have non-Solr code transforming the particular raw data in question)
How to set up a RequestHandler to achieve the appropriate search results

Sometimes the raw data is inconsistent in strange ways, or the desired search behavior is complex enough to beg for test code. Here are some examples:

In a call number, some periods are vital to searches (P35.8 vs. P358) and some are not (A1 .B2 vs. A1 B2). Some spaces are vital (A1 .B2 1999 vs. A1 .B21999) and some are not (A1 .B2 vs. A1.B2). For call numbers A1234 1999 .B2 and A1234 .B2, the desired search behavior may be for both to match. Plus, the end user queries will be inconsistent and the data is definitely dirty.
Will a query containing a stopword match raw data with the same stopword? Will it match raw data without a stopword? With a different stopword (dogs in literature vs. dogs of literature vs. dog literature)?
How should author names deal with stemming (Michaels vs. Michael)? With stopwords (Earl of Sandford vs. Earl Sandford)? Are results correct for hyphenated names? Are results reasonable for specialized author searches as well as for unspecialized searches?
"Why are (hyphens, ampersands, colons, semicolons) in query strings causing empty results? How do I fix this?"
"I am not seeing the publisher in the search results, but I know publisher is written to the index because I wrote a mapping test for that field."

(Just checking if you're paying attention: this one is a matter of setting stored="true" on the field definition.)

I have to experiment to meet acceptance criteria like the above. And once I figure out a solution, I want assurance that future twiddles won't break what I've already solved. And if they do get broken, I want to know which twiddle is at fault. And if I can't meet all the criteria, I like being able to know what I could get working, and what I had to chuck.

In a chat with Jonathan Rochkind, he said that you twiddle and you twiddle search configurations and eventually you hit a point where meeting acceptance test J will break acceptance test H. From his perspective, this is why these sorts of tests are fickle - he feels a conflict is inevitable and shows the folly of these tests. But from my perspective, this is EXACTLY why these tests are necessary. They let me pinpoint exactly which behaviors conflict, so I can pursue a new solution, or choose which behavior will be addressed. (In my world, this generally means informing Librarians what the trade-off is and letting them make the decision.)
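The "pinpoint which behaviors conflict" idea can be sketched as a check that runs every criterion and reports all failures, rather than stopping at the first one. This is only a minimal sketch: `AcceptanceReport`, `failures` and `fakeSearch` are hypothetical names, and the search function is a stand-in for "send this query to Solr and collect the matching document ids".

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

/** Hypothetical helper: checks every acceptance criterion and reports all failures. */
class AcceptanceReport {

    /**
     * @param search   stand-in for "send this query to Solr, return matching doc ids"
     * @param expected query string mapped to the doc ids that query must return
     * @return one description per failed criterion (empty list means all passed)
     */
    static List<String> failures(Function<String, Set<String>> search,
                                 Map<String, Set<String>> expected) {
        List<String> failed = new ArrayList<>();
        for (Map.Entry<String, Set<String>> criterion : expected.entrySet()) {
            Set<String> actual = search.apply(criterion.getKey());
            if (!actual.equals(criterion.getValue())) {
                failed.add("query [" + criterion.getKey() + "] expected "
                        + criterion.getValue() + " but got " + actual);
            }
        }
        return failed;
    }

    /** Fake search (assumption for the demo): ignores spaces but treats periods as vital. */
    static Set<String> fakeSearch(String query) {
        if (query.replace(" ", "").equals("A1.B2")) {
            return Set.of("doc1");
        }
        return Set.of();
    }

    public static void main(String[] args) {
        Map<String, Set<String>> expected = new LinkedHashMap<>();
        expected.put("A1 .B2", Set.of("doc1"));
        expected.put("A1.B2", Set.of("doc1"));
        expected.put("A1 B2", Set.of("doc1")); // the period should not be vital here

        // Prints one line per criterion the fake configuration fails to meet
        failures(AcceptanceReport::fakeSearch, expected).forEach(System.out::println);
    }
}
```

Collecting every failure in one pass is a design choice: when a configuration change fixes one criterion and breaks another, both show up in the same report.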

Searchable Value Tests

Searchable Value Tests are rarely needed, because Search Acceptance tests tend to address nearly all the searchable value issues.

Searchable value tests answer questions like this:

"Why aren't searches with the 'a' prefix on the ckey working? I know I left the letter 'a' prefix in that field, because I wrote a mapping test to ensure the 'a' prefix is present in the field value to be written to Solr."

Bill Dueber points out that Solr's schema.xml could transform a field's searchable value as the document is added to Solr. The field's type definition in schema.xml might strip out whitespace, punctuation, stopwords, etc. It might substitute characters (maybe ü becomes ue, or maybe it becomes u). Note that the stored field value will not be changed by schema.xml -- whatever was in the Solr document written to the index will be the value retrieved from a stored field.
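The difference between the searchable (indexed) value and the stored value can be illustrated with a toy example. This is not Solr's analysis code: `indexedTokens` is a hypothetical stand-in for whatever analyzers a field type might declare, and the particular rules (ü folded to ue, lowercasing, punctuation stripped) are assumptions picked to mirror the examples above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Toy stand-in for an index-time analysis chain; NOT Solr's actual analyzers. */
class IndexedVersusStored {

    /** Hypothetical analysis: fold u-umlaut to "ue", lowercase, drop punctuation, split on whitespace. */
    static List<String> indexedTokens(String rawValue) {
        String folded = rawValue.replace("\u00fc", "ue").replace("\u00dc", "Ue");
        String lowered = folded.toLowerCase(Locale.ROOT);
        // keep only letters, digits and whitespace
        String noPunctuation = lowered.replaceAll("[^\\p{L}\\p{N}\\s]", "");
        List<String> tokens = new ArrayList<>();
        for (String token : noPunctuation.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    /** The stored value is whatever was in the document: no transformation at all. */
    static String storedValue(String rawValue) {
        return rawValue;
    }

    public static void main(String[] args) {
        String raw = "M\u00fcller, J.-P.";
        System.out.println("stored:  " + storedValue(raw));
        System.out.println("indexed: " + indexedTokens(raw));
    }
}
```

A mapping test can only see the raw value that goes into the Solr document; the tokens a query actually has to match are produced later, which is the gap a searchable value test covers.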

How to Test

These tests must occur AFTER the Solr document is written to the index. We have to check search results given known data - we send a search query to a Solr populated with our test data and check if we get the desired results back from Solr.

Note that this does NOT include your web application code. In our case, Blacklight may filter the user query or the results, and we don't want or need that layered into the testing stack. Recall that test code should live close to the code it is testing; these tests should be sending queries directly to Solr and examining the Solr results.
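Examining the Solr results directly can be as simple as pulling the document ids out of Solr's XML response. The sketch below assumes the unique key field is a string field named `id` and that the response body was fetched separately (for example with an HTTP GET to the select handler, asking only for the id field); `SolrXmlIds` and `idsFrom` are hypothetical names.

```java
import java.io.StringReader;
import java.util.LinkedHashSet;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical helper: extracts document ids from a Solr XML response body. */
class SolrXmlIds {

    /** Assumes each doc carries its unique key as <str name="id">...</str>. */
    static Set<String> idsFrom(String solrXmlResponse) {
        try {
            Document dom = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(solrXmlResponse)));
            NodeList idNodes = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate("//result/doc/str[@name='id']", dom, XPathConstants.NODESET);
            Set<String> ids = new LinkedHashSet<>();
            for (int i = 0; i < idNodes.getLength(); i++) {
                ids.add(idNodes.item(i).getTextContent());
            }
            return ids;
        } catch (Exception e) {
            // wrap parser/XPath failures so test code does not need checked exceptions
            throw new IllegalStateException("could not parse Solr response", e);
        }
    }

    public static void main(String[] args) {
        // Hand-written sample in the shape of a Solr XML response; the ids are made up
        String sample = "<response>"
                + "<lst name=\"responseHeader\"><int name=\"status\">0</int></lst>"
                + "<result name=\"response\" numFound=\"2\" start=\"0\">"
                + "<doc><str name=\"id\">doc1</str></doc>"
                + "<doc><str name=\"id\">doc2</str></doc>"
                + "</result></response>";
        System.out.println(idsFrom(sample));
    }
}
```

A test can then compare the returned set against the ids the acceptance criterion expects, with no web application code in between.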

As it happens, you can test both of these cases more or less the same way.

The Right Way is to create a script to do the following:

1. Get your latest, greatest Solr configuration files from source control.

   1.1 Prepare a Solr instance for testing: copy in the latest configuration files, clear the test index, etc.

2. Pull down the indexing software, your test data, and your test code from source control.

   2.1 Build the indexing software if necessary. SolrMarc is built from an ant task.

3. Start the Solr test instance. Generally this means starting or restarting the web server you are using for Solr testing (Jetty, Tomcat ...).
4. Run your tests. Your tests should create very small Solr indexes - a fresh index for a group of related tests, or sometimes for a single test.

   4.1 Send (MARC) records through the indexing software to create Solr documents, and add the Solr documents to the testing index.
   4.2 Submit test queries against the testing index.
   4.3 Programmatically test for acceptance criteria in the Solr results.
   4.4 Repeat as necessary.

5. Stop the Solr test instance.
6. Clean up.
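The order of the steps above can be sketched as a small driver. Every step here is a hypothetical placeholder (a real script would shell out to source control, the build tool and the servlet container), and stopping Solr in a `finally` block is a design choice of this sketch, made so that a failing test does not leave a test server running.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch of the run order only; every step is a hypothetical placeholder. */
class SearchTestRun {

    interface Steps {
        void prepareSolr();          // fetch config files, set up a clean test instance
        void fetchAndBuildIndexer(); // pull indexing software, test data, test code; build
        void startSolr();
        void runTests();             // index test data, query, check criteria, repeat
        void stopSolr();
        void cleanUp();
    }

    static void run(Steps steps) {
        steps.prepareSolr();
        steps.fetchAndBuildIndexer();
        steps.startSolr();
        try {
            steps.runTests();
        } finally {
            // always stop the test instance and clean up, even if a test threw
            steps.stopSolr();
            steps.cleanUp();
        }
    }

    /** Demo implementation that only records which steps ran, in order. */
    static class RecordingSteps implements Steps {
        final List<String> log = new ArrayList<>();
        final boolean failTests;

        RecordingSteps(boolean failTests) {
            this.failTests = failTests;
        }

        public void prepareSolr() { log.add("prepare"); }
        public void fetchAndBuildIndexer() { log.add("fetch+build"); }
        public void startSolr() { log.add("start"); }
        public void runTests() {
            log.add("tests");
            if (failTests) {
                throw new IllegalStateException("a test failed");
            }
        }
        public void stopSolr() { log.add("stop"); }
        public void cleanUp() { log.add("cleanup"); }
    }

    public static void main(String[] args) {
        RecordingSteps steps = new RecordingSteps(false);
        run(steps);
        System.out.println(steps.log);
    }
}
```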

I am currently working on a Ruby Rake task with Cucumber (but not with Rails) to do the above. I would like to be able to plug in different indexing software (SolrMarc, some Ruby MARC -> Solr code being developed ...). The tests themselves will not be MARC-centric - just the data and the indexing software. And it will be easy to have continuous integration execute the script, because it's a Rake task.

I already have a chunk of this working, thanks in great part to the Ruby code that I stole from the Hydrangea and Blacklight projects. This allowed me to figure out Solr configurations for a call number search field that satisfied the most important searching criteria as defined by my most excellent group of advising librarians.

I have a Rake task that:

Spins up a Solr instance on Jetty (3. above).
Clears the existing index and commits, then indexes a file of Solr docs that exercise what I'm testing and commits again (the "add the Solr documents to the testing index" part of 4.1 above).
Runs a Cucumber scenario that submits queries against Solr with the test data (4.2 above).
Has the same Cucumber scenario compare the Solr results with the acceptance criteria (4.3 above).
Keeps running Cucumber scenarios (4.4 above).
Stops the Solr test instance (5. above).

For more details, see my post on Call Number Searching in Solr.

For Searchable Value Tests:

To some extent, you can examine searchable values of Solr fields by doing facet queries for those fields. NOTE: if the fields are tokenized (and most searchable fields will be) then you will see each token as a facet value. So you may need to create a tiny test index with very few records when doing this sort of test.

http://your.solr.server/solr/select?facet.field=your_indexed_field&facet=true&rows=0

To be honest, I'm not sure how this will work if the Solr field is both stored and indexed - will you see the stored value?
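Faceting reads the indexed terms of a field rather than its stored value, so a facet query is a reasonable window onto what was actually indexed. A facet URL like the one above can be built as in the sketch below. The base URL and field name are placeholders, `FacetQueryUrl` is a hypothetical name, and the explicit match-all query plus the `facet.limit` and `facet.mincount` settings are additions of this sketch, not part of the original URL.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper: builds a facet-only query URL for inspecting indexed terms. */
class FacetQueryUrl {

    static String facetUrl(String solrBaseUrl, String fieldName) {
        return solrBaseUrl + "/select"
                + "?q=" + URLEncoder.encode("*:*", StandardCharsets.UTF_8) // match all docs
                + "&rows=0"          // no documents needed, only facet counts
                + "&facet=true"
                + "&facet.field=" + URLEncoder.encode(fieldName, StandardCharsets.UTF_8)
                + "&facet.limit=-1"  // return every term, not just the most frequent
                + "&facet.mincount=1"; // skip terms that occur in no matching document
    }

    public static void main(String[] args) {
        // Placeholder server and field name; substitute your own
        System.out.println(facetUrl("http://localhost:8983/solr", "your_indexed_field"));
    }
}
```

With a tiny test index, the terms listed in the response are short enough to compare by hand or in an assertion.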

Another way I currently test values written to Solr:

I wrote JUnit tests for my SolrMarc instance that build a Solr index from test data and then search the index via the Solr API. This works okay when searching against a single field, but it doesn't tackle searching via a Solr Request Handler.

Here is an example test:

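    /**
     * isbn_search should be case insensitive
     */
    @Test
    public final void testISBNCaseInsensitive()
        throws IOException, ParserConfigurationException, SAXException
    {
        String fldName = "isbn_search";
        // createIxInitVars and assertSearchResults are helpers defined elsewhere
        // in the test code (not shown); this call builds the test index
        createIxInitVars("isbnTests.mrc");

        // ids of the documents expected to match the ISBN
        Set<String> docIds = new HashSet<String>();
        docIds.add("020suba10trailingText");
        docIds.add("020SubaAndz");

        // upper-case and lower-case X must return the same documents
        assertSearchResults(fldName, "052185668X", docIds);
        assertSearchResults(fldName, "052185668x", docIds);
    }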

Continuous Integration runs an ant target just like the one for the mapping tests to execute these tests; here is an ant task to run all the test code in a directory and its children:

Sunday, October 24, 2010
