It's pretty simple: once you have a ruby Solr response, you wrap it in RSpecSolr:
resp = RSpecSolr::SolrResponseHash.new(yer_solr_resp)
and then you can make useful assertions for acceptance testing:
resp.should include({'id'=>'81234'})
resp.should include({'title'=>'Harry Potter'}).in_first(3).results
resp.should include('111').before('222')
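To make the ordering assertions concrete, here is a rough plain-Ruby sketch of the questions they ask of a response. It is only an illustration, not how rspec-solr implements its matchers, and the sample response and ids are made up.

```ruby
# Plain-Ruby illustration of the ordering ideas; NOT rspec-solr's implementation.
# A trimmed Solr response in the shape returned with wt=ruby (made-up sample data).
solr_resp = {
  'response' => {
    'numFound' => 3,
    'start'    => 0,
    'docs'     => [{ 'id' => '111' }, { 'id' => '333' }, { 'id' => '222' }]
  }
}

# Document ids in ranked order.
ids = solr_resp['response']['docs'].map { |doc| doc['id'] }

# Roughly the question behind include('333').in_first(2).results
in_first_two = ids.first(2).include?('333')

# Roughly the question behind include('111').before('222').
# Assumes both ids are present; Array#index returns nil otherwise.
comes_before = ids.index('111') < ids.index('222')
```

The matchers let you say the same thing in one readable line and give a useful failure message when it isn't true.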
So you might write specs like this:
it "q of 'Buddhism' should get 8,500-10,500 results" do
  resp = solr_resp_doc_ids_only({'q'=>'Buddhism'})
  resp.should have_at_least(8500).documents
  resp.should have_at_most(10500).documents
end
it "q of 'Two3' should have excellent results", :jira => 'VUF-386' do
  resp = solr_resp_doc_ids_only({'q'=>'Two3'})
  resp.should have_at_most(10).documents
  resp.should include("5732752").as_first_result
  resp.should include("8564713").in_first(2).results
  resp.should include("5732752").before("8564713")
  resp.should_not include("5727394")
  resp.should have_the_same_number_of_results_as(solr_resp_doc_ids_only({'q'=>'two3'}))
  resp.should have_fewer_results_than(solr_resp_doc_ids_only({'q'=>'two 3'}))
end
it "Traditional Chinese chars 三國誌 should get the same results as simplified chars 三国志" do
  resp = solr_resp_doc_ids_only({'q'=>'三國誌'})
  resp.should have_at_least(240).documents
  resp.should have_the_same_number_of_results_as(solr_resp_doc_ids_only({'q'=>'三国志'}))
end
Note that these examples use a couple of helper methods, such as solr_resp_doc_ids_only. See the README for more details.
The gem is only at release 0.1.0, but I'm finding it useful already. You'll see some FIXME and TODO comments, and I suspect there's plenty that can be improved. I'm happy to take your pull requests.
If it looks too much like code ...
If you can get non-coding colleagues to write your tests, then making the testing syntax easier for them is probably worthwhile. You could certainly use Cucumber on top of rspec-solr to write your Solr acceptance tests in more natural language.
Tip:
For most of my tests, I realized I don't check anything but the Solr document id in the results. RSpec error messages are a lot easier to read when the Solr response doesn't carry extraneous fields or facet counts, and the HTTP response is much smaller too. So I rigged up a method that adds {'fl'=>'id', 'facet'=>'false'} to the request params I send to Solr. My spec errors now read like this:
expected {"responseHeader"=>{"status"=>0, "QTime"=>10, "params"=>{"facet"=>"false", "fl"=>"id", "qt"=>"search_author", "wt"=>"ruby", "q"=>"契沖"}}, "response"=>{"numFound"=>3, "start"=>0, "docs"=>[{"id"=>"6675613"}, {"id"=>"6675393"}, {"id"=>"6274534"}]}} to include ["6675613", "6675393", "7191966", "6274534", "4783602"]
Diff:
@@ -1,2 +1,14 @@
-[["6675613", "6675393", "7191966", "6274534", "4783602"]]
+{"responseHeader"=>
+ {"status"=>0,
+ "QTime"=>10,
+ "params"=>
+ {"facet"=>"false",
+ "fl"=>"id",
+ "qt"=>"search_author",
+ "wt"=>"ruby",
+ "q"=>"契沖"}},
+ "response"=>
+ {"numFound"=>3,
+ "start"=>0,
+ "docs"=>[{"id"=>"6675613"}, {"id"=>"6675393"}, {"id"=>"6274534"}]}}
and they could have even less output, if I turned off "diffable" -- but I am currently finding it helpful.
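The params-merging method from the tip can be tiny. Here is a minimal sketch of the idea; the constant and method names are hypothetical, and the real helper (see the README) also sends the request to Solr, which is left out here.

```ruby
# Hypothetical names, for illustration only.
# Ask Solr for nothing but document ids, and skip facet counts.
DOC_IDS_ONLY_PARAMS = { 'fl' => 'id', 'facet' => 'false' }.freeze

# Returns a new hash; the caller's params are left untouched.
# The ids-only settings win if the caller passed its own 'fl' or 'facet'.
def doc_ids_only_params(solr_params)
  solr_params.merge(DOC_IDS_ONLY_PARAMS)
end

params = doc_ids_only_params({ 'q' => '契沖', 'qt' => 'search_author' })
```

The merged hash is what would then be handed to whatever sends the request to Solr.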
Okay, but what good is this, really?
My current project is to improve search results for CJK (Chinese, Japanese and Korean) queries in SearchWorks. It's nearly impossible for a CJK-ignorant coder such as myself to write good tests. It's pretty darn hard for our non-coder CJK experts to write good tests, too. So we have to iterate to figure out a set of acceptance tests. Doing this without coding repeatable, automatable tests is ludicrous.**
We already have search tests, but they are slow. They use Cucumber to mimic a user doing a search on the web page: the request goes to Solr, then the SearchWorks Blacklight Rails stack prepares the HTML that the application would serve to present the search results from Solr. The assertions are made against that HTML. Since search acceptance testing doesn't care about the Rails stack, this is a lot of extra processing.
So it's time to take Rails out of the picture. With some help from my colleague Chris Beer, we conceived a way to make it really simple -- let's write rspec style language on Solr response objects! That spawned rspec-solr.
I am already using rspec-solr for our CJK acceptance tests. All I needed was the rsolr gem, a spec_helper file, and some simple configuration stuff - 4 very small files. (See rspec-solr README)
I've got CJK tests like this:
it "should parse out 中国 (china) 经济 (economic) 政策 (policy)" do
  resp = solr_resp_doc_ids_only({'q'=>'中国经济政策'})
  resp.should have_at_least(85).documents
  resp.size.should be_within(5).of(solr_resp_doc_ids_only({'q'=>'中国 经济 政策'}).size)
end
it "Traditional chars 三國誌 should get the same results as simplified chars 三国志" do
  resp = solr_resp_doc_ids_only({'q'=>'三國誌'})
  resp.should have_at_least(240).documents
  resp.should have_the_same_number_of_results_as(solr_resp_doc_ids_only({'q'=>'三国志'}))
end
it "hangul 광주 should get results for hancha 光州" do
  resp = solr_resp_doc_ids_only({'q'=>'광주'})
  resp.should include(["7763372", "7773313"]) # hancha 光州
  resp.should have_at_least(110).documents
end
I'm also migrating our Cucumber search regression tests to the rspec-solr approach -- obviously, I want a full suite of regression tests as I make changes for CJK searching.
A sample regression test:
it "q of 'Two3' should have excellent results", :jira => 'VUF-386' do
  resp = solr_resp_doc_ids_only({'q'=>'Two3'})
  resp.should have_at_most(10).documents
  resp.should include("5732752").as_first_result
  resp.should include("8564713").in_first(2).results
  resp.should_not include("5727394")
  resp.should have_the_same_number_of_results_as(solr_resp_doc_ids_only({'q'=>'two3'}))
  resp.should have_fewer_results_than(solr_resp_doc_ids_only({'q'=>'two 3'}))
end
Both types are very much works in progress, but I've deliberately put the tests up on GitHub as the sw_index_tests repository so you can use them however you see fit.
I think it's pretty slick.
** In fact, they already DID do this for our ILS without repeatable, automatable tests ... and without records of their manual tests ... so we're starting from scratch. How annoying and wasteful!