How to do SQL aggregation on DataFrames

Here is an example of how to perform sum(), count(), and groupBy() operations on DataFrames in Spark. First, read the CSV files from HDFS into a DataFrame and define a schema for them (the `sqlContext` used below is assumed to be the one provided by the Spark shell):

```scala
import org.apache.spark.sql.types.{StructType, StructField, StringType}

// Read the files from HDFS and create a DataFrame
val filePath = "hdfs://user/Test/*.csv"
val User_Dataframe = sqlContext.read
  .format("com.databricks.spark.csv")
  .option("header", "false")
  .option("mode", "permissive")
  .load(filePath)

// Define the schema
val User_schema = StructType(Array(
  StructField("USER_ID", StringType, true),
  StructField("APP_ID", StringType, true),
  StructField("TIMESTAMP", StringType, true),
  // [...] remainder of the schema is truncated in this excerpt
```
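The excerpt above cuts off before the aggregation step itself, so the following is only a minimal sketch of what a groupBy/sum/count over this DataFrame could look like. It reuses `User_Dataframe` and the `USER_ID`/`APP_ID` columns from the partial schema; `DURATION` and `user_events` are invented names for illustration and are not from the original post.

```scala
import org.apache.spark.sql.functions.{col, count, sum}

// Hypothetical aggregation: count records and sum a numeric column per user.
// "DURATION" is an invented column name; substitute a real numeric column
// from the full schema, which is truncated in the excerpt above.
val perUserStats = User_Dataframe
  .groupBy("USER_ID")
  .agg(
    count("APP_ID").alias("app_count"),
    sum(col("DURATION").cast("double")).alias("total_duration")
  )

perUserStats.show()

// Equivalent SQL form (Spark 1.x style, matching the sqlContext usage above);
// "user_events" is an arbitrary temp-table name chosen for this sketch.
User_Dataframe.registerTempTable("user_events")
val sqlStats = sqlContext.sql(
  "SELECT USER_ID, COUNT(APP_ID) AS app_count FROM user_events GROUP BY USER_ID")
```

Both forms produce one row per `USER_ID`; whether you prefer the DataFrame API or the SQL string is mostly a matter of taste, since they compile to the same execution plan.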
