Run Nutch 1.8 or 1.9 as a Hadoop job


If I understand correctly, I cannot run Nutch 1.8 or 1.9 as a Hadoop job, because these versions do not have a Crawl class that serves as a wrapper for the crawl steps. That means there is no single class I can specify in the Hadoop call to run the whole job. In Nutch 1.7, I used the org.apache.nutch.crawl.Crawl class.

Am I missing something? Has anyone figured out a way around this?

Your understanding is wrong. You should use the bin/crawl script. In each step of that script, you can see the corresponding class to call (in case you want to use the steps outside the crawl script). In addition, as far as I know, the class you quoted is deprecated.
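As a rough illustration of calling the steps individually, here is a minimal dry-run sketch that only prints one `hadoop jar` call per crawl step instead of running anything. The class names are the ones behind the usual `bin/nutch` commands (inject, generate, fetch, parse, updatedb, invertlinks). The job file name, the `crawl` and `urls` directories, the `<segment>` placeholder, and the helper names `step` and `print_steps` are assumptions made for this example, not something taken from the thread. Check bin/crawl in your own release for the exact steps and options it uses.

```shell
#!/bin/sh
# Dry-run sketch: prints the per-step Hadoop commands, does not execute them.
# NUTCH_JOB, CRAWL_DIR and SEED_DIR are hypothetical placeholders; adjust them
# to match your own deployment.
NUTCH_JOB="${NUTCH_JOB:-apache-nutch-1.9.job}"
CRAWL_DIR="${CRAWL_DIR:-crawl}"
SEED_DIR="${SEED_DIR:-urls}"

# Print one "hadoop jar" invocation for the given class and its arguments.
step() {
  class="$1"
  shift
  echo "hadoop jar $NUTCH_JOB $class $*"
}

# One line per crawl step. "<segment>" stands for the segment directory
# created by the generate step; it is left as a placeholder on purpose.
print_steps() {
  step org.apache.nutch.crawl.Injector "$CRAWL_DIR/crawldb" "$SEED_DIR"
  step org.apache.nutch.crawl.Generator "$CRAWL_DIR/crawldb" "$CRAWL_DIR/segments"
  step org.apache.nutch.fetcher.Fetcher "$CRAWL_DIR/segments/<segment>"
  step org.apache.nutch.parse.ParseSegment "$CRAWL_DIR/segments/<segment>"
  step org.apache.nutch.crawl.CrawlDb "$CRAWL_DIR/crawldb" "$CRAWL_DIR/segments/<segment>"
  step org.apache.nutch.crawl.LinkDb "$CRAWL_DIR/linkdb" "$CRAWL_DIR/segments/<segment>"
}

print_steps
```

Steps two through five would normally be repeated once per crawl round, which is roughly what the loop inside bin/crawl does for you.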

