scala - How can I queue REST requests using Apache Livy and Apache Spark?
I have a remote process that sends thousands of requests to a humble Spark standalone cluster:
- 3 worker nodes with 4 cores and 8 GB each
- an identical master node, where the driver runs
The cluster hosts a simple data-processing service developed in Scala. Requests are sent via a curl command that passes parameters to the .jar, through the Apache Livy REST interface, like this:
curl -s -X POST -H "Content-Type: application/json" remote_ip:8998/batches -d '{"file":"file://path_to_jar/target/my_jar-1.0.0-jar-with-dependencies.jar","className":"project.update.my_jar","args":["c1:1","c2:2","c3:3"]}'
- This triggers a Spark job each time.
- The cluster's resource scheduling is dynamic and can serve at most 3 requests at a time.
- When a worker goes idle, the next queued request is served.
At some point, the waiting requests exhaust the master node's memory (because Spark registers every job to be served), the master node hangs, and the workers lose their connection to it.
Is there a way to queue the requests so that Spark doesn't hold RAM for them, and then, when a worker is free, process a request from the queue?
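One workaround, assuming Livy can't queue beyond what the cluster can run, is to throttle on the client side: keep pending requests in a local queue and only submit a batch to Livy when one of the 3 slots is free. A minimal sketch in Python, where `submit_batch` is a hypothetical stand-in for the curl POST to `remote_ip:8998/batches` shown above:

```python
import time
from concurrent.futures import ThreadPoolExecutor

# At most this many batches are in flight against Livy at once; the rest
# wait in the executor's local queue instead of occupying cluster RAM.
MAX_CONCURRENT = 3

def submit_batch(args):
    # Hypothetical placeholder for the real HTTP POST to Livy's /batches
    # endpoint (and for polling the batch until it finishes).
    time.sleep(0.01)  # simulate the job running on the cluster
    return f"done:{args}"

def run_queued(requests, max_concurrent=MAX_CONCURRENT):
    """Process `requests` with at most `max_concurrent` in flight."""
    with ThreadPoolExecutor(max_workers=max_concurrent) as pool:
        return list(pool.map(submit_batch, requests))

if __name__ == "__main__":
    results = run_queued([f"c1:{i}" for i in range(10)])
    print(len(results))
```

With this shape, the remote process (or a small proxy in front of Livy) never has more than 3 Spark applications registered at a time, so the master node never has to hold the thousands of waiting jobs in memory.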
This question is similar; it says that yarn.scheduler.capacity.max-applications limits the number of running applications, but I can't figure out whether that's the solution I need (and it targets YARN, while my cluster is standalone). Apache Livy doesn't have this functionality, not that I'm aware of.