
Web Crawler

Installing and Running Apache Nutch and Apache Solr for Crawling and Indexing Web Content

May 14, 2013

In our work, we needed an open source web crawler for gathering unstructured data. We used (A) Apache Nutch for web crawling and (B) Apache Solr for indexing the unstructured web data. The steps we followed to set up the complete environment are: 1. Downloaded Apache Solr (3.X) 2. Downloaded Apache […]
