Testing a web scraper with RSpec and Nokogiri fragments
A widely used approach to testing web scrapers is to record the requests made to the target website with the VCR gem and then replay the recorded responses when testing the parsers.
While building the esaj scraper, I had to test a class that depends on Nokogiri, but this procedure did not apply because the class under test is initialized with a Nokogiri::XML::Element.
No request would be performed in this scenario, so VCR would have nothing to record.
Nokogiri::HTML::fragment
The solution I chose was to use the Nokogiri::HTML::fragment method to build the element under test from an HTML fragment stored in a fixture file.
The fixture was created in spec/fixture/result_item_fragment, and the call to the Nokogiri fragment method was wrapped in a module created in spec/helpers.
The spec_helper was updated to make the helper available to all specs.
Clean code
This approach lets the spec exercise the class without doubles or stubs, which keeps the test code lean and clean.