Prerequisites
This article assumes that the installed OS is Ubuntu 20.04 and that the installation is performed either by the root user or by an account with sudo privileges.
Process
First, let's update the packages installed on the current system.
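On Ubuntu 20.04 a minimal version of this step uses apt:

# Refresh the package index and upgrade any outdated packages
sudo apt update
sudo apt -y upgrade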
Next, we will install other packages that are prerequisites for Apache Spark.
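Spark itself only requires a Java runtime; the exact package set below is an assumption, but it is a common choice for this kind of setup. On Ubuntu 20.04, default-jdk pulls in OpenJDK 11, which matches the Java 11.0.10 reported in the spark-shell banner further down.

# Java is required; Scala and Git are convenient extras (assumed package list)
sudo apt -y install default-jdk scala git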
Once the above step is complete, download the Apache Spark distribution with the following commands.
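Based on the version banner shown later (Spark 2.4.6 built against Scala 2.11), a matching release can be fetched and unpacked roughly as follows; the mirror URL and the /opt/spark install path are assumptions:

# Fetch the Spark 2.4.6 release with pre-built Hadoop 2.7 binaries (assumed URL)
wget https://archive.apache.org/dist/spark/spark-2.4.6/spark-2.4.6-bin-hadoop2.7.tgz
# Unpack the archive and move it to a conventional location (assumed path)
tar xvf spark-2.4.6-bin-hadoop2.7.tgz
sudo mv spark-2.4.6-bin-hadoop2.7 /opt/spark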
Now you are ready to start Apache Spark. Let's bring up Spark's Scala CLI, the spark-shell.
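Assuming the /opt/spark location from the previous step, the shell can be started directly from the distribution's bin directory:

# Launch the interactive Scala shell (path assumes the install location above)
/opt/spark/bin/spark-shell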
Output
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
Spark context Web UI available at http://AppChassis5B1S4:4040
Spark context available as 'sc' (master = local[*], app id = local-1615563643141).
Spark session available as 'spark'.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/ '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.4.6
      /_/
Using Scala version 2.11.12 (OpenJDK 64-Bit Server VM, Java 11.0.10)
Type in expressions to have them evaluated.
Type :help for more information.
scala>
Verifying the Apache Spark installation
We can verify that Spark is working by giving it a task, e.g. estimating the value of π by throwing darts at a circular board. Here is the Scala code for that.
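The idea is the standard Monte Carlo estimate: darts land uniformly in the unit square, the fraction that falls inside the quarter circle approaches π/4, and multiplying by 4 recovers π. Below is a minimal sketch to paste into the scala> prompt; the sample count is an arbitrary choice, so your estimate will differ slightly from the output shown afterwards:

// Monte Carlo estimate of Pi: sample random points in the unit square
// and count how many fall inside the quarter circle of radius 1.
val NUM_SAMPLES = 100000  // arbitrary sample count; more samples give a better estimate
val count = sc.parallelize(1 to NUM_SAMPLES).filter { _ =>
  val x = math.random
  val y = math.random
  x * x + y * y < 1  // did the dart land inside the quarter circle?
}.count()
println(s"Pi is roughly ${4.0 * count / NUM_SAMPLES}")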
You should see something similar to the output shown below:
Output
Pi is roughly 3.1406314063140632