hadoop - How to filter multiple source data using Apache Flume?


I am using Flume to handle data from multiple sources and store it in HDFS, but I do not understand how to filter the data before storing it in HDFS.

You have two options:

  • Use a Flume interceptor; check the answer here.
  • Use a streaming-based solution (Apache Spark, Apache Heron/Storm) to filter the records and then store them in HDFS.
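For the first option, Flume ships with a regex filtering interceptor that can drop events before they reach the sink. A minimal sketch of an agent configuration, assuming an agent `a1` with source `r1`, channel `c1`, and an HDFS sink `k1` (all component names and the DEBUG pattern are illustrative):

```properties
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Attach a regex filtering interceptor to the source
a1.sources.r1.interceptors = i1
a1.sources.r1.interceptors.i1.type = regex_filter
# Drop events whose body matches the pattern (here: DEBUG log lines)
a1.sources.r1.interceptors.i1.regex = .*DEBUG.*
a1.sources.r1.interceptors.i1.excludeEvents = true

# Deliver the surviving events to HDFS
a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = hdfs://namenode/flume/events/%Y-%m-%d
a1.sinks.k1.channel = c1
a1.sources.r1.channels = c1
```

With `excludeEvents = true` the interceptor discards matching events; set it to `false` to keep only the matches instead.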

The second option gives you more flexibility to write different types of streaming patterns. Add a comment if you have more queries.
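Whichever engine you pick for the second option, the filtering step is just a predicate applied to each record before the write to HDFS. A plain-Python sketch of that logic (the sample records and the DEBUG pattern are made up for illustration; in a real Spark or Storm job the same predicate would go into the stream's filter step):

```python
import re

# Hypothetical sample records standing in for events arriving
# from multiple sources
records = [
    "2023-01-01 INFO  app started",
    "2023-01-01 DEBUG cache miss",
    "2023-01-01 ERROR disk full",
]

# Drop records matching the pattern, mirroring what a regex_filter
# interceptor or a streaming filter() transformation would do
drop = re.compile(r"DEBUG")
filtered = [r for r in records if not drop.search(r)]

# Only the filtered records would then be written to HDFS
print(filtered)
```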

