scala - How can I queue REST requests using Apache Livy and Apache Spark? -


I have a remote process that sends thousands of requests to a humble Spark standalone cluster:

  • 3 worker nodes, each with 4 cores and 8 GB of RAM
  • an identical master node, where the driver runs

The cluster hosts a simple data-processing service developed in Scala. Requests are sent, with the parameters for the .jar as arguments, via a curl command through Apache Livy's REST interface, like this:

curl -s -X POST -H "Content-Type: application/json" remote_ip:8998/batches -d '{"file":"file://path_to_jar/target/my_jar-1.0.0-jar-with-dependencies.jar","className":"project.update.my_jar","args":["c1:1","c2:2","c3:3"]}'
  • each request triggers a Spark job,
  • with the cluster's dynamic resource scheduling it can serve at most 3 requests at a time,
  • when a worker goes idle, a queued request is served.

At some point the volume of requests exhausts the master node's memory, even though most of them are only waiting (because Spark registers every submitted job, served or not); the master node hangs and the workers lose their connection to it.

Is there a way to queue the requests so that Spark does not hold RAM for them, and to process a request from the queue only when a worker becomes free?
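One client-side workaround (a sketch, not built-in Livy functionality) is to keep the queue outside the cluster: poll Livy's `GET /batches` endpoint and only `POST` a new batch when fewer than N batches are in a non-terminal state. The helper below assumes Livy's documented batch-listing JSON shape (a `sessions` array whose entries carry a `state` field); the function names and the concurrency cap of 3 are illustrative assumptions, not part of the original setup.

```python
import json

# Livy batch states that no longer occupy cluster resources.
# Assumption: anything not terminal counts toward the concurrency cap.
TERMINAL_STATES = {"success", "dead", "killed"}

def active_batches(batches_json):
    """Count batches in a GET /batches response that are still pending or running."""
    return sum(
        1 for s in batches_json.get("sessions", [])
        if s.get("state") not in TERMINAL_STATES
    )

def may_submit(batches_json, max_concurrent=3):
    """True when one more POST /batches would stay under the cap."""
    return active_batches(batches_json) < max_concurrent

# Example payload mimicking Livy's batch listing:
payload = json.loads(
    '{"from": 0, "total": 3, "sessions": ['
    '{"id": 1, "state": "running"},'
    '{"id": 2, "state": "success"},'
    '{"id": 3, "state": "starting"}]}'
)
print(active_batches(payload))   # 2 batches still occupy the cluster
print(may_submit(payload, 3))    # True: one slot free
```

The remote process (or a small gateway in front of Livy) would hold requests in its own FIFO and call `may_submit` in a polling loop before each `POST /batches`, so the Spark master never sees the backlog.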

This question is similar; it says that yarn.scheduler.capacity.max-applications caps the number of running applications at N, but I can't figure out whether that is the solution I need. Apache Livy doesn't have this functionality, as far as I'm aware.
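For reference, that property belongs to YARN's CapacityScheduler (set in capacity-scheduler.xml), so it only applies if Spark runs on YARN, not in standalone mode as here. The Hadoop docs name it `yarn.scheduler.capacity.maximum-applications`; a fragment would look roughly like this (the value 3 is illustrative):

```xml
<!-- capacity-scheduler.xml: caps apps in PENDING + RUNNING state (YARN only) -->
<property>
  <name>yarn.scheduler.capacity.maximum-applications</name>
  <value>3</value>
</property>
```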

