Package

org.apache.spark.datafu

deploy

package deploy

Type Members

  1. case class SparkPythonRunner(pyPaths: String, otherArgs: Array[String] = Array()) extends Product with Serializable

    Internal class - should not be used by user

    Background: We had to "override" Spark's PythonRunner because our jobs failed when the Python process closed prematurely. In PythonRunner, the Python process exits as soon as it finishes reading the script file, which caused accumulator exceptions when the driver tried to fetch accumulator data from the Python gateway. Instead, as Zeppelin does, we create an "interactive" Python process, feed it the Python script, and do not close the gateway.
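The idea described above (start an interactive interpreter, feed it the script, and keep the process alive) can be illustrated with a minimal sketch. This is not the actual SparkPythonRunner code: the object name `InteractivePythonSketch`, its helper methods, and the choice of the `-i` and `-u` interpreter flags are assumptions made for illustration, and the Py4J gateway setup is omitted entirely.

```scala
import java.io.{OutputStreamWriter, PrintWriter}
import java.nio.charset.StandardCharsets

// Hypothetical illustration only; names here are not part of the datafu API.
object InteractivePythonSketch {

  // Builds a one-line Python statement that runs a whole script file.
  // A single line sidesteps the blank-line rule for compound statements
  // in interactive mode. Assumes the path contains no quotes or backslashes.
  def execStatement(scriptPath: String): String =
    s"exec(open('$scriptPath').read())"

  // Command line for an interactive (-i), unbuffered (-u) interpreter.
  def interpreterCommand(pythonExec: String): java.util.List[String] =
    java.util.Arrays.asList(pythonExec, "-i", "-u")

  // Starts the interpreter, feeds it the script, and returns the
  // still-running process to the caller.
  def startAndFeed(pythonExec: String, scriptPath: String): Process = {
    val builder = new ProcessBuilder(interpreterCommand(pythonExec))
    builder.redirectErrorStream(true)                       // merge stderr into stdout
    builder.redirectOutput(ProcessBuilder.Redirect.INHERIT) // show output on our console
    val process = builder.start()

    // autoFlush = true, so println pushes the statement to the interpreter at once.
    val stdin = new PrintWriter(
      new OutputStreamWriter(process.getOutputStream, StandardCharsets.UTF_8), true)
    stdin.println(execStatement(scriptPath))

    // stdin is deliberately left open: the interpreter keeps waiting for input
    // after the script finishes instead of exiting, so anything it still serves
    // (for example a gateway connection) stays reachable.
    process
  }
}
```

In this sketch the caller decides when the interpreter ends, for example by calling `destroy()` on the returned `Process` once the driver has collected everything it needs.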