How To Install Apache HBase on Docker (Complete Step-by-Step Guide)

Introduction

Apache HBase is a distributed, scalable NoSQL database built on top of the Hadoop ecosystem. It is designed to handle very large tables, with billions of rows and millions of columns, and is a common choice when you need real-time read/write access to big data.

Running HBase directly on a server requires multiple components such as ZooKeeper and Hadoop. Docker simplifies this by letting you run HBase in isolated, easy-to-manage containers.

In this guide, you will learn how to install Apache HBase on Docker, run its services, and verify your setup, all without a full Hadoop cluster.

This tutorial is perfect for developers, DevOps engineers, and students who want a quick, reliable environment for testing HBase.

Table of Contents

  1. What Is Apache HBase?
  2. Why Run HBase on Docker?
  3. Prerequisites
  4. Step 1: Install Docker
  5. Step 2: Create a Docker Network
  6. Step 3: Run ZooKeeper Container
  7. Step 4: Run HBase Master and RegionServer
  8. Step 5: Using Docker Compose (Recommended)
  9. Step 6: Access the HBase Web UI
  10. Step 7: Connect to HBase Shell
  11. Basic HBase Commands to Test
  12. Troubleshooting Tips
  13. Conclusion

1. What Is Apache HBase?

Apache HBase is a distributed, NoSQL column-oriented database modeled after Google Bigtable. It is designed for:

  • Real-time read and write operations on massive datasets
  • Horizontally scalable storage
  • High throughput and low latency
  • Fault-tolerant data distribution

HBase is commonly used in big data environments such as IoT platforms, log analytics, machine learning pipelines, and real-time data warehousing. For a broader overview, see the article "What is Apache HBase? A Scalable NoSQL Database for Big Data".

2. Why Run HBase on Docker?

Running HBase manually can be complex, as it requires:

  • ZooKeeper
  • HBase Master
  • HBase RegionServer
  • Optional Hadoop components

Docker simplifies all of this by offering:

  • Fast setup
  • Easy cleanup
  • Isolated testing environments
  • No need for manual dependency installation
  • Cross-platform compatibility

For development and testing, Docker is one of the quickest ways to spin up HBase.

3. Prerequisites

Make sure you have:

  • Docker installed on your machine
  • Minimum 4 GB RAM
  • Linux, macOS, or Windows with WSL2
  • Internet connection
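
A rough preflight check for the list above could look like the sketch below. The helper names (`have_cmd`, `enough_ram`, `preflight`) and the 3.5 GB threshold are assumptions made for illustration; the threshold sits slightly under 4 GB because `MemTotal` in `/proc/meminfo` reports a little less than the installed RAM. The RAM check only runs where `/proc/meminfo` exists (Linux and WSL2).

```shell
#!/bin/sh
# Preflight sketch (hypothetical helper names, not part of Docker or HBase).

# ~3.5 GB in kB: slack below 4 GB because MemTotal excludes reserved memory.
MIN_RAM_KB=$((3500 * 1024))

# Succeeds if the named command is on PATH.
have_cmd() { command -v "$1" >/dev/null 2>&1; }

# Succeeds if the given kB figure meets the minimum.
enough_ram() { [ "$1" -ge "$MIN_RAM_KB" ]; }

preflight() {
  have_cmd docker || { echo "docker not found on PATH" >&2; return 1; }
  if [ -r /proc/meminfo ]; then
    total_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
    enough_ram "$total_kb" || { echo "less than 4 GB RAM detected" >&2; return 1; }
  fi
  echo "prerequisites look OK"
}
```

After sourcing the file, call `preflight`; it returns a non-zero status on the first failed check.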

4. Step 1: Install Docker

If you don’t have Docker installed, you can install it using:

Fedora (dnf-based Linux)

sudo dnf install docker -y
sudo systemctl enable --now docker

Ubuntu

sudo apt update
sudo apt install docker.io -y
sudo systemctl enable --now docker

macOS & Windows

Download Docker Desktop from the official website.

Verify installation:

docker --version
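
`docker --version` only confirms that the client binary is installed. A slightly stronger check, sketched below, asks the daemon to respond through `docker info`. The function name `docker_ready` is a placeholder chosen for this example.

```shell
# Succeeds only when the Docker CLI can reach a running daemon.
docker_ready() {
  if docker info >/dev/null 2>&1; then
    echo "Docker daemon is reachable"
  else
    echo "Docker CLI found, but the daemon did not respond" >&2
    echo "On Linux, try: sudo systemctl start docker" >&2
    return 1
  fi
}
```

Run `docker_ready` before the later steps; on macOS and Windows a failure usually means Docker Desktop is not started.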

5. Step 2: Create a Docker Network

HBase and ZooKeeper need to reach each other. On a user-defined network, Docker resolves container names through its built-in DNS, so create a dedicated network:

docker network create hbase-net
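
`docker network create` fails if the network already exists, which is easy to hit when re-running the steps. One way to make this step repeatable is sketched below; `ensure_network` is a hypothetical helper, not a Docker command.

```shell
# Create the network only if `docker network inspect` cannot find it.
ensure_network() {
  net="$1"
  if docker network inspect "$net" >/dev/null 2>&1; then
    echo "network $net already exists"
  else
    docker network create "$net" >/dev/null && echo "network $net created"
  fi
}
```

Usage: `ensure_network hbase-net`.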

6. Step 3: Run ZooKeeper Container

HBase relies on ZooKeeper for cluster coordination.

Run the container:

docker run -d \
  --name zookeeper \
  --network hbase-net \
  -p 2181:2181 \
  zookeeper:3.9

Verify:

docker logs zookeeper
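
Reading the log works, but a script usually wants a yes/no answer. The sketch below asks Docker whether the container is in the running state, using `docker inspect` with a Go template. `container_running` is an illustrative name. A running container does not prove that ZooKeeper is already accepting client connections, so the log remains the place to look when something fails.

```shell
# Succeeds when Docker reports the container as running.
container_running() {
  state=$(docker inspect -f '{{.State.Running}}' "$1" 2>/dev/null)
  [ "$state" = "true" ]
}
```

Usage: `container_running zookeeper && echo "zookeeper is up"`.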

7. Step 4: Run HBase Master and RegionServer

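Most HBase Docker images combine the Master and RegionServer in one container for simplicity. Some all-in-one images also start their own embedded ZooKeeper, so check the image's documentation to confirm that it reads the HBASE_MANAGES_ZK and ZOOKEEPER_QUORUM variables used below.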

Run HBase:

docker run -d \
  --name hbase \
  --network hbase-net \
  -p 16000:16000 \
  -p 16010:16010 \
  -p 16020:16020 \
  -p 16030:16030 \
  -e HBASE_MANAGES_ZK=false \
  -e ZOOKEEPER_QUORUM=zookeeper \
  harisekhon/hbase

Exposed ports:

  • 16000 → HBase Master RPC
  • 16010 → HBase Master Web UI
  • 16020 → RegionServer RPC
  • 16030 → RegionServer Web UI
  • 2181 → ZooKeeper (published by the zookeeper container)

Check logs:

docker logs -f hbase

You should see:
Master has completed initialization
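
Instead of watching the log by hand, a script can poll for that line. The following is a minimal sketch; the `wait_for_log` name, the 2-second polling interval, and the 120-second default timeout are assumptions, and the pattern is the message quoted above.

```shell
# Poll a container's log until a pattern appears or the timeout (seconds) expires.
wait_for_log() {
  name="$1"; pattern="$2"; timeout="${3:-120}"
  elapsed=0
  while [ "$elapsed" -lt "$timeout" ]; do
    # docker logs replays the container's stderr on stderr, so merge both streams.
    if docker logs "$name" 2>&1 | grep -q "$pattern"; then
      echo "found: $pattern"
      return 0
    fi
    sleep 2
    elapsed=$((elapsed + 2))
  done
  echo "timed out after ${timeout}s waiting for: $pattern" >&2
  return 1
}
```

Usage: `wait_for_log hbase "Master has completed initialization" 180`.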

8. Step 5: Using Docker Compose (Recommended)

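Docker Compose captures the same setup in a single file and is an alternative to Steps 2 to 4, not an extra step. If you already started the containers manually, remove them first with docker rm -f hbase zookeeper, because the Compose file reuses the same container names.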

Create a file:

nano docker-compose.yml

Paste:

services:
  zookeeper:
    image: zookeeper:3.9
    container_name: zookeeper
    ports:
      - "2181:2181"
    networks:
      - hbase-net

  hbase:
    image: harisekhon/hbase
    container_name: hbase
    environment:
      HBASE_MANAGES_ZK: "false"
      ZOOKEEPER_QUORUM: zookeeper
    ports:
      - "16000:16000"
      - "16010:16010"
      - "16020:16020"
      - "16030:16030"
    networks:
      - hbase-net

networks:
  hbase-net:
    driver: bridge

Start:

docker compose up -d

View logs:

docker compose logs -f
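
When you are finished, the stack can be removed again. The sketch below covers both start-up paths: `docker compose down` for the Compose file, and explicit removal for containers started with `docker run`. The `teardown` function and its `compose`/`manual` argument are illustrative, not part of Docker.

```shell
# Remove the HBase test environment, depending on how it was started.
teardown() {
  case "$1" in
    compose)
      # Stops and removes the containers and network defined in docker-compose.yml.
      docker compose down
      ;;
    manual)
      # -f stops running containers before removing them.
      docker rm -f hbase zookeeper
      docker network rm hbase-net
      ;;
    *)
      echo "usage: teardown compose|manual" >&2
      return 2
      ;;
  esac
}
```

Run `teardown compose` from the directory that contains docker-compose.yml, or `teardown manual` if you followed the individual `docker run` steps.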

9. Step 6: Access the HBase Web UI

Once running, open your browser and go to:

http://localhost:16010

You should see:

  • Cluster status
  • Active Master
  • RegionServer info
  • Metrics
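
The same check can be made from a terminal, which is handy on a headless server. This sketch assumes `curl` is installed on the host; `ui_up` is a hypothetical helper name, and `-L` is included in case the root path answers with a redirect.

```shell
# Succeeds when the Master Web UI answers with a non-error HTTP status.
# -f: fail on HTTP errors, -s: silent, -L: follow redirects.
ui_up() {
  url="${1:-http://localhost:16010}"
  curl -fsL -o /dev/null "$url"
}
```

Usage: `ui_up && echo "Master UI reachable"`.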

10. Step 7: Connect to HBase Shell

Enter the HBase container:

docker exec -it hbase bash

Start the HBase shell:

hbase shell

11. Basic HBase Commands to Test

Create a table

create 'users', 'info'

Insert data

put 'users', 'row1', 'info:name', 'Alice'

Retrieve data

get 'users', 'row1'

Scan the table

scan 'users'

Delete the table

disable 'users'
drop 'users'
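
The commands above can also be run as one non-interactive smoke test from the host. The sketch below assumes the container is named `hbase`, that `hbase` is on the PATH inside the image (as the `hbase shell` step implies), and that the bundled HBase version supports the shell's `-n` non-interactive flag. `hbase_smoke_test` is an illustrative name.

```shell
# Run the create/put/get/scan/drop sequence without an interactive session.
# -i keeps stdin open so the heredoc reaches the shell; -n is non-interactive mode.
hbase_smoke_test() {
  docker exec -i "${1:-hbase}" hbase shell -n <<'EOF'
create 'users', 'info'
put 'users', 'row1', 'info:name', 'Alice'
get 'users', 'row1'
scan 'users'
disable 'users'
drop 'users'
EOF
}
```

Usage: `hbase_smoke_test`, or `hbase_smoke_test other-container` for a differently named container.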

12. Troubleshooting Tips

Port already in use

Find the process:

sudo lsof -i :16010

Kill or stop the conflicting service.
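
To check every published port in one pass, the `lsof` call can be wrapped in a loop. This is a sketch with a hypothetical `busy_ports` helper; run it with sudo if you need to see sockets owned by other users.

```shell
# Print each given port that already has a TCP listener on the host.
busy_ports() {
  for port in "$@"; do
    # -n/-P skip name lookups; -sTCP:LISTEN limits matches to listening sockets.
    if lsof -nP -iTCP:"$port" -sTCP:LISTEN >/dev/null 2>&1; then
      echo "$port"
    fi
  done
}
```

Usage: `busy_ports 2181 16000 16010 16020 16030`. If a port cannot be freed, you can publish the container port on a different host port instead, for example `-p 16011:16010`.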

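HBase cannot find ZooKeeper

Ensure both containers are attached to the same network. If you started the stack with Docker Compose, the network name is prefixed with the project name (by default the directory name), so run docker network ls to find it. For the manually created network: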

docker network inspect hbase-net
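
The full inspect output is verbose. A Go template can reduce it to the attached container names, as in this sketch; `network_members` and `all_on_network` are hypothetical helper names.

```shell
# List the containers attached to a network, one name per line.
network_members() {
  docker network inspect -f '{{range .Containers}}{{.Name}}{{"\n"}}{{end}}' "$1"
}

# Succeeds when every named container is attached to the network.
all_on_network() {
  net="$1"; shift
  members=$(network_members "$net") || return 1
  for name in "$@"; do
    echo "$members" | grep -qx "$name" || { echo "$name is not on $net" >&2; return 1; }
  done
}
```

Usage: `all_on_network hbase-net zookeeper hbase`.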

Web UI not loading

Restart containers:

docker compose restart

Container exits immediately

Check logs:

docker logs hbase

Common issues include missing environment variables or low memory.
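
To tell a memory problem apart from a configuration problem, Docker's recorded container state can help. The sketch below reads the exit code and the OOM flag with `docker inspect`; the helper names and the wording of the messages are assumptions for illustration.

```shell
# Print the exit code and whether the kernel OOM killer ended the container.
exit_reason() {
  docker inspect -f 'exit={{.State.ExitCode}} oom={{.State.OOMKilled}}' "$1"
}

# Turn the line printed by exit_reason into a short hint.
explain_exit() {
  case "$1" in
    *oom=true*) echo "killed for exceeding available memory" ;;
    exit=0*)    echo "exited cleanly" ;;
    *)          echo "failed; inspect docker logs for the cause" ;;
  esac
}
```

Usage: `explain_exit "$(exit_reason hbase)"`. An exit code of 137 means the process received SIGKILL, which is what an out-of-memory kill produces.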

13. Conclusion

Running Apache HBase on Docker is one of the easiest ways to develop, test, and experiment with big data workloads without deploying a full Hadoop cluster. Using Docker or Docker Compose, you get an isolated, disposable HBase environment that can be up and running within minutes.

Whether you’re learning HBase, testing schema design, or building microservices that require real-time, column-oriented storage, Docker provides a flexible foundation with minimal overhead.
