Prerequisites

To follow this installation, you will need:

  • An Ubuntu 20.04 server set up with a non-root sudo user and a firewall configured.


Installing the Default JRE/JDK

The easiest option for installing Java is to use the version packaged with Ubuntu. By default, Ubuntu 20.04 includes OpenJDK 11, an open-source implementation of both the JRE and the JDK.

To install this version, first update the package index:

  • sudo apt update

Next, check if Java is already installed:

  • java -version

If Java is not currently installed, you’ll see the following output:

Output
Command 'java' not found, but can be installed with:

sudo apt install default-jre # version 2:1.11-72, or
sudo apt install openjdk-11-jre-headless # version 11.0.7+10-3ubuntu1
sudo apt install openjdk-13-jre-headless # version 13.0.3+3-1ubuntu2
sudo apt install openjdk-14-jre-headless # version 14.0.1+7-1ubuntu1
sudo apt install openjdk-8-jre-headless # version 8u252-b09-1ubuntu1

Execute the following command to install the default Java Runtime Environment (JRE), which will install the JRE from OpenJDK 11:

  • sudo apt install default-jre

Verify the installation with:

  • java -version

You should see output similar to the following (the exact version numbers may differ):

Output
openjdk version "11.0.7" 2020-04-14
OpenJDK Runtime Environment (build 11.0.7+10-post-Ubuntu-3ubuntu1)
OpenJDK 64-Bit Server VM (build 11.0.7+10-post-Ubuntu-3ubuntu1, mixed mode, sharing)


You may need the Java Development Kit (JDK) in addition to the JRE in order to compile and run some specific Java-based software. To install the JDK, run the following command, which will also install the JRE:

  • sudo apt install default-jdk

Verify that the JDK is installed by checking the version of javac, the Java compiler:

  • javac -version

You should see the following output:

Output
javac 11.0.7
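
As an optional sanity check, you can compile and run a minimal program to confirm that the compiler works end to end. The file name HelloWorld.java used here is just an example. Create the file with the following contents:

public class HelloWorld {
    public static void main(String[] args) {
        System.out.println("The JDK is working!");
    }
}

Then compile and run it:

  • javac HelloWorld.java
  • java HelloWorld

If everything is set up correctly, the program prints the message and exits.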


Creating a User for Kafka

Because Kafka can handle requests over a network, the first step is to create a dedicated user for the service. In this step, you will create that kafka user.


While logged in as your non-root sudo user, create a user called kafka:

  • sudo adduser kafka
 

Follow the prompts to set a password and create the kafka user.

Next, add the kafka user to the sudo group with the adduser command. You need these privileges to install Kafka’s dependencies:

  • sudo adduser kafka sudo
 

Your kafka user is now ready. Log into the account using su:

  • su -l kafka
 



Download and Extract the Kafka Binaries


In this step, you will download and extract the Kafka binaries into dedicated folders in the kafka user’s home directory.


Create a directory called Downloads in /home/kafka to store your downloads:

  • mkdir ~/Downloads
 

Use wget to download the Kafka binaries:

  • wget "https://downloads.apache.org/kafka/2.6.1/kafka_2.13-2.6.1.tgz" -O ~/Downloads/kafka.tgz
 

Create a directory called kafka and change to this directory. This will be the base directory of the Kafka installation:

  • mkdir ~/kafka && cd ~/kafka
 

Extract the archive you downloaded using the tar command:

  • tar -xvzf ~/Downloads/kafka.tgz --strip 1
 

Specify the --strip 1 flag to ensure that the archive’s contents are extracted in ~/kafka/ itself and not in another directory.
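
To confirm that the extraction worked, you can list the contents of the directory; you should see directories such as bin, config, and libs:

  • ls ~/kafka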


Configuring the Kafka Server

Kafka’s default behavior will not allow you to delete a topic. A Kafka topic is the category, group, or feed name to which messages can be published. To modify this behavior, you must edit the configuration file.

Kafka’s configuration options are specified in the server.properties file. Open this file with vim or your favorite editor:

  • vim ~/kafka/config/server.properties
 


First, add a setting at the end of the file that will allow you to delete Kafka topics:

delete.topic.enable = true
 


Next, change the directory where the Kafka logs are stored by modifying the log.dirs property:

log.dirs=/home/kafka/logs
 

Save and close the file.
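
Kafka should create the /home/kafka/logs directory on startup if it does not exist, but you can also create it ahead of time (a harmless extra step):

  • mkdir /home/kafka/logs

Now that you’ve configured Kafka, the next step is to create systemd unit files for running and enabling the Kafka server on startup.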


Creating Systemd Unit Files and Starting the Kafka Server

In this section, we will create systemd unit files for the Kafka service. This will help you perform common service actions such as starting, stopping, and restarting Kafka in a manner consistent with other Linux services.

ZooKeeper is a service that Kafka uses to manage its cluster state and configurations. It is used in many distributed systems. If you would like to know more about it, visit the official ZooKeeper docs.

Create the unit file for zookeeper:

  • sudo vim /etc/systemd/system/zookeeper.service
 


Enter the following unit definition into the file: /etc/systemd/system/zookeeper.service

[Unit]
Requires=network.target remote-fs.target
After=network.target remote-fs.target

[Service]
Type=simple
User=kafka
ExecStart=/home/kafka/kafka/bin/zookeeper-server-start.sh /home/kafka/kafka/config/zookeeper.properties
ExecStop=/home/kafka/kafka/bin/zookeeper-server-stop.sh
Restart=on-abnormal

[Install]
WantedBy=multi-user.target
 

The [Unit] section specifies that Zookeeper requires networking and the filesystem to be ready before it can start.

The [Service] section specifies that systemd should use the zookeeper-server-start.sh and zookeeper-server-stop.sh shell files for starting and stopping the service. It also specifies that Zookeeper should be restarted if it exits abnormally.

After adding this content, save and close the file.

Next, create the systemd service file for kafka:

  • sudo vim /etc/systemd/system/kafka.service
 

Enter the following unit definition into the file: /etc/systemd/system/kafka.service

[Unit]
Requires=zookeeper.service
After=zookeeper.service

[Service]
Type=simple
User=kafka
ExecStart=/bin/sh -c '/home/kafka/kafka/bin/kafka-server-start.sh /home/kafka/kafka/config/server.properties > /home/kafka/kafka/kafka.log 2>&1'
ExecStop=/home/kafka/kafka/bin/kafka-server-stop.sh
Restart=on-abnormal

[Install]
WantedBy=multi-user.target
 

The [Unit] section specifies that this unit file depends on zookeeper.service. This will ensure that zookeeper gets started automatically when the kafka service starts.

The [Service] section specifies that systemd should use the kafka-server-start.sh and kafka-server-stop.sh shell files for starting and stopping the service. It also specifies that Kafka should be restarted if it exits abnormally. 
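
After saving and closing the file, reload systemd so that it registers the new unit files. systemd usually picks up brand-new units on its own, but reloading after adding or editing unit files is good practice:

  • sudo systemctl daemon-reload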


Now that the units have been defined, start Kafka with the following command:

  • sudo systemctl start kafka
 

To ensure that the server has started successfully, check the status of the kafka unit:

  • sudo systemctl status kafka
 


You should see output similar to this:

Output
* kafka.service
     Loaded: loaded (/etc/systemd/system/kafka.service; disabled; vendor preset>
     Active: active (running) since Mon 2021-03-29 16:29:50 UTC; 17s ago
   Main PID: 21309 (sh)
      Tasks: 81 (limit: 37702)
     Memory: 325.2M
     CGroup: /system.slice/kafka.service
             |-21309 /bin/sh -c /home/kafka/kafka/bin/kafka-server-start.sh /ho>
             `-21311 java -Xmx1G -Xms1G -server -XX:+UseG1GC -XX:MaxGCPauseMill>

Mar 29 16:29:50 AppChassis5B1S3 systemd[1]: Started kafka.service.


Likewise, ensure that the zookeeper service has started successfully by checking the status of the zookeeper unit:

  • sudo systemctl status zookeeper


Output
* zookeeper.service
     Loaded: loaded (/etc/systemd/system/zookeeper.service; enabled; vendor preset: enabled)
     Active: active (running) since Mon 2021-03-29 16:29:50 UTC; 16h ago
   Main PID: 21308 (java)
      Tasks: 69 (limit: 37702)
     Memory: 101.4M
     CGroup: /system.slice/zookeeper.service
             `-21308 java -Xmx512M -Xms512M -server -XX:+UseG1GC -XX:MaxGCPauseMillis=20 -XX:InitiatingHeapOccupancyPercent=35 -XX:+ExplicitGCInvokesConcurrent -XX:MaxInlineLevel=15 -Djava.awt.headless=t>

Mar 29 16:29:52 AppChassis5B1S3 zookeeper-server-start.sh[21308]: [2021-03-29 16:29:52,334] INFO maxSessionTimeout set to 60000 (org.apache.zookeeper.server.ZooKeeperServer)
Mar 29 16:29:52 AppChassis5B1S3 zookeeper-server-start.sh[21308]: [2021-03-29 16:29:52,335] INFO Created server with tickTime 3000 minSessionTimeout 6000 maxSessionTimeout 60000 datadir /tmp/zookeeper/ve>
Mar 29 16:29:52 AppChassis5B1S3 zookeeper-server-start.sh[21308]: [2021-03-29 16:29:52,350] INFO Using org.apache.zookeeper.server.NIOServerCnxnFactory as server connection factory (org.apache.zookeeper.>
Mar 29 16:29:52 AppChassis5B1S3 zookeeper-server-start.sh[21308]: [2021-03-29 16:29:52,356] INFO Configuring NIO connection handler with 10s sessionless connection timeout, 2 selector thread(s), 32 worke>
Mar 29 16:29:52 AppChassis5B1S3 zookeeper-server-start.sh[21308]: [2021-03-29 16:29:52,367] INFO binding to port 0.0.0.0/0.0.0.0:2181 (org.apache.zookeeper.server.NIOServerCnxnFactory)
Mar 29 16:29:52 AppChassis5B1S3 zookeeper-server-start.sh[21308]: [2021-03-29 16:29:52,396] INFO zookeeper.snapshotSizeFactor = 0.33 (org.apache.zookeeper.server.ZKDatabase)
Mar 29 16:29:52 AppChassis5B1S3 zookeeper-server-start.sh[21308]: [2021-03-29 16:29:52,403] INFO Snapshotting: 0x0 to /tmp/zookeeper/version-2/snapshot.0 (org.apache.zookeeper.server.persistence.FileTxnS>
Mar 29 16:29:52 AppChassis5B1S3 zookeeper-server-start.sh[21308]: [2021-03-29 16:29:52,409] INFO Snapshotting: 0x0 to /tmp/zookeeper/version-2/snapshot.0 (org.apache.zookeeper.server.persistence.FileTxnS>
Mar 29 16:29:52 AppChassis5B1S3 zookeeper-server-start.sh[21308]: [2021-03-29 16:29:52,443] INFO Using checkIntervalMs=60000 maxPerMinute=10000 (org.apache.zookeeper.server.ContainerManager)
Mar 29 16:29:53 AppChassis5B1S3 zookeeper-server-start.sh[21308]: [2021-03-29 16:29:53,167] INFO Creating new log file: log.1 (org.apache.zookeeper.server.persistence.FileTxnLog)


The Kafka server is now listening on port 9092. However, if you rebooted the server, Kafka would not restart automatically. To enable the kafka service on server boot, run the following commands:

  • sudo systemctl enable zookeeper
  • sudo systemctl enable kafka
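
You can verify that both services are now set to start at boot:

  • sudo systemctl is-enabled zookeeper
  • sudo systemctl is-enabled kafka

Each command should print enabled.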
 


Testing the Kafka Installation

In this step, we will test the Kafka installation by publishing and consuming a “Hello, World” message.


Publishing messages in Kafka requires:

  • A producer, which publishes records and data to topics.
  • A consumer, which reads messages and data from topics.

To begin, create a topic named SaveOurPlanet:

  • ~/kafka/bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic SaveOurPlanet
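
You can confirm that the topic was created by listing all topics on the cluster:

  • ~/kafka/bin/kafka-topics.sh --list --zookeeper localhost:2181

The output should include SaveOurPlanet.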
 

You can create a producer from the command line using the kafka-console-producer.sh script. It expects the Kafka server’s hostname and port, along with a topic name, as arguments.

Now publish the string "Hello, World" to the SaveOurPlanet topic:

  • echo "Hello, World" | ~/kafka/bin/kafka-console-producer.sh --broker-list localhost:9092 --topic SaveOurPlanet > /dev/null
 

Next, create a Kafka consumer using the kafka-console-consumer.sh script. It expects the Kafka server’s hostname and port, along with a topic name, as arguments.

The following command consumes messages from the SaveOurPlanet topic. Note the use of the --from-beginning flag, which allows the consumption of messages that were published before the consumer was started:

  • ~/kafka/bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic SaveOurPlanet --from-beginning
 

If there are no configuration issues, you will see Hello, World in your terminal:

Output
Hello, World
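
kafka-console-consumer.sh will keep running, waiting for more messages. To test this, open a new terminal, log in as the kafka user, and publish another message (any string will do):

  • echo "Hello again" | ~/kafka/bin/kafka-console-producer.sh --broker-list localhost:9092 --topic SaveOurPlanet > /dev/null

The new message will appear in the consumer’s output. When you are done testing, press CTRL+C to stop the consumer.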