Spark Interview Questions, Part 5
1. What is Immutable?
Ans: Once a value is created and assigned, it cannot be changed; this property is called immutability. Spark RDDs are immutable by default: they do not allow updates or modifications in place, and a transformation produces a new RDD instead. Please note that the data collection is not immutable, but the data values are.

2. What is Distributed?
Ans: An RDD automatically distributes its data across different parallel computing nodes.

3. What is Lazy evaluated?
Ans: When you run a sequence of operations, Spark does not have to evaluate them immediately. Transformations in particular are lazy: they are only recorded, and evaluation is triggered when an action is called.

4. What is Spark engine's responsibility?
Ans: The Spark engine is responsible for scheduling, distributing, and monitoring the application across the cluster.

5. What are the common Spark ecosystem components?
Ans: Spark SQL (the successor to Shark) for SQL developers, Spark Streaming for streaming data, MLlib for machine learning algorithms, GraphX for graph computation, SparkR to run R on the Spark engine, BlinkDB enabling interactive queries ...
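
The immutability and lazy evaluation described in questions 1 and 3 can be sketched with a small toy model in plain Python. This is not Spark code and does not use the Spark API: the class LazyDataset and the helper double are hypothetical names invented for illustration. The sketch only assumes the general behaviour described above, namely that a transformation returns a new dataset and records the step, and that nothing runs until an action such as collect is called.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass(frozen=True)  # frozen: attributes cannot be reassigned after creation
class LazyDataset:
    """Hypothetical toy stand-in for an RDD (NOT the Spark API)."""
    source: Tuple[int, ...]
    steps: Tuple[Tuple[str, Callable], ...] = ()

    def map(self, fn: Callable) -> "LazyDataset":
        # Transformation: record the step and return a NEW dataset.
        return LazyDataset(self.source, self.steps + (("map", fn),))

    def filter(self, pred: Callable) -> "LazyDataset":
        # Transformation: also only recorded, not executed.
        return LazyDataset(self.source, self.steps + (("filter", pred),))

    def collect(self) -> list:
        # Action: only now are the recorded steps applied, in order.
        # Inside this method, map and filter are the Python built-ins.
        items = iter(self.source)
        for kind, fn in self.steps:
            items = map(fn, items) if kind == "map" else filter(fn, items)
        return list(items)


calls = []  # records every element that double() actually processes


def double(x: int) -> int:
    calls.append(x)
    return x * 2


numbers = LazyDataset((1, 2, 3, 4))
doubled = numbers.map(double)              # nothing computed yet
large = doubled.filter(lambda x: x > 4)    # still nothing computed

calls_before_action = len(calls)           # double() has not run so far
result = large.collect()                   # the action triggers evaluation

print(calls_before_action)
print(result)
print(numbers.steps)                       # the original dataset is unchanged
```

In this sketch, each call to map or filter leaves the original object untouched and returns a new one carrying a longer list of recorded steps, which loosely mirrors how an RDD keeps track of the transformations that produced it. Real Spark adds partitioning, scheduling, and fault tolerance on top of this idea, none of which is modelled here.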