amazon web services - Nutch crawler not scaling for large urls -

September 15, 2010

I am trying to set up a Nutch crawler on an Amazon EMR cluster with 2 master nodes, so that it is scalable. With a seed list of 10,000 URLs, the crawler gets stuck in the fetch phase of the MapReduce job at around 90 percent. It ran fine with 5,000 URLs. Is there a configuration setting I might be missing?

Go to the MapReduce UI and check the logs for the fetch phase. They should contain a clue as to what went wrong.
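If the fetch logs are long, a small script can help narrow them down. The following is a minimal sketch in Python, assuming a task log has already been saved as a local text file. The function name `scan_log` and the list of suspect markers are illustrative assumptions, not anything produced by Nutch or EMR, so adjust them to whatever the logs actually contain.

```python
import re
from collections import Counter

# Assumed markers of trouble; these are guesses at what is worth looking for,
# not a list taken from Nutch or Hadoop documentation.
SUSPECT = re.compile(r"ERROR|FATAL|WARN|Exception|Timeout")


def scan_log(lines):
    """Return (counts per marker, list of (line number, text)) for suspect lines."""
    counts = Counter()
    hits = []
    for number, line in enumerate(lines, start=1):
        match = SUSPECT.search(line)
        if match:
            # Count only the first marker found on each line.
            counts[match.group(0)] += 1
            hits.append((number, line.rstrip("\n")))
    return counts, hits
```

To use it, open the saved log file and pass the file object to `scan_log`, then print the counts and the matching lines. A marker that appears many times near the point where the job stalls is a reasonable place to start reading.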
