scala - How can I queue REST requests using Apache Livy and Apache Spark?


I have a remote process that sends thousands of requests to a humble Spark standalone cluster:

  • 3 worker nodes with 4 cores and 8 GB of RAM each
  • an identical master node where the driver runs

The cluster hosts a simple data processing service developed in Scala. The requests are sent via a curl command, with some parameters for the .jar, through the Apache Livy REST interface like this:

curl -s -X POST -H "Content-Type: application/json" remote_ip:8998/batches -d '{"file":"file://path_to_jar/target/my_jar-1.0.0-jar-with-dependencies.jar","className":"project.update.my_jar","args":["c1:1","c2:2","c3:3"]}'
  • This triggers a Spark job each time.
  • The resource scheduling for the cluster is dynamic, so it can serve at most 3 requests at a time.
  • When a worker goes idle, another queued request is served.

At some point, the requests exhaust the master node's memory even while they are in the waiting state (because Spark registers every job that is waiting to be served). This hangs the master node, and the workers lose their connection to it.

Is there a way I can queue these requests so that Spark does not hold RAM for them, and then, when a worker is free, process another request from the queue?
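To make the desired behaviour concrete, here is a minimal sketch of a client-side queue that holds the waiting requests outside Spark and only submits a new batch when a slot is free. It is an illustration under assumptions, not a built-in Livy feature: the names `drain`, `livy_submit`, `livy_state`, and `FINISHED_STATES` are hypothetical, the limit of 3 mirrors the cluster described above, and the set of terminal states is assumed from Livy's session states and may need adjusting for a given Livy version.

```python
import json
import time
import urllib.request
from collections import deque

# Assumption: states Livy reports once a batch no longer occupies the cluster.
FINISHED_STATES = {"success", "dead", "killed", "error"}


def livy_submit(base_url, payload):
    """POST a batch definition to Livy's /batches endpoint; return the batch id."""
    req = urllib.request.Request(
        base_url + "/batches",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["id"]


def livy_state(base_url, batch_id):
    """GET /batches/{id}/state from Livy; return the state string."""
    with urllib.request.urlopen(f"{base_url}/batches/{batch_id}/state") as resp:
        return json.load(resp)["state"]


def drain(pending, submit, get_state, max_active=3, poll_seconds=5.0,
          sleep=time.sleep):
    """Submit payloads in order, keeping at most max_active batches in flight.

    Waiting payloads stay in this process's memory, so Spark only ever
    registers max_active applications at a time.
    """
    queue = deque(pending)
    active = set()
    submitted = []
    while queue or active:
        # Forget batches that Livy reports as finished.
        active = {b for b in active if get_state(b) not in FINISHED_STATES}
        # Fill the free slots from the local queue.
        while queue and len(active) < max_active:
            batch_id = submit(queue.popleft())
            active.add(batch_id)
            submitted.append(batch_id)
        if active:
            sleep(poll_seconds)
    return submitted
```

With a real cluster, the two callbacks could be wired up as `drain(payloads, lambda p: livy_submit("http://remote_ip:8998", p), lambda b: livy_state("http://remote_ip:8998", b))`, where each payload is the same JSON object passed to curl above. Passing the callbacks in as arguments keeps the queueing logic independent of Livy, so it can be exercised without a cluster.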

A similar question says that yarn.scheduler.capacity.max-applications allows only N running applications, but I can't figure out whether that is the solution I need. As far as I'm aware, Apache Livy doesn't have this functionality.

