2 MIN READ · INTERMEDIATE · Published December 14, 2024

Local AI Powerhouse: Setting Up Ollama and OpenWebUI on Ubuntu Server

Take control of your data and performance. A step-by-step guide to building a private AI server with NVIDIA GPU acceleration, Docker, and a sleek web interface.

#OLLAMA #OPENWEBUI #UBUNTU-SERVER #LOCAL-AI #DOCKER #SELF-HOSTING

Experimenting with AI models through a local runtime like Ollama requires a solid setup to get the most out of your hardware. In this guide, I’ll show you how to prepare an Ubuntu server, install Ollama, and set up OpenWebUI for an interactive web interface—a stack that works whether or not you have deep Linux experience.

The Foundation: Setting Up Ubuntu Server

A solid foundation begins with Ubuntu Server installation. Let’s dive into the setup process.

Step 1: Installing Ubuntu Server

First, download and install Ubuntu Server on your PC. Here's how:

  1. Visit the Ubuntu Server download page and get the ISO file.

  2. Create a bootable USB drive using tools like Rufus (for Windows) or dd (for Linux/Mac):

    Shell
    sudo dd if=/path/to/ubuntu-server.iso of=/dev/sdX bs=4M status=progress

    (Replace /dev/sdX with your USB device; double-check the device name with lsblk first, since dd overwrites its target without confirmation. Verifying the ISO checksum before flashing is also wise, as shown after these steps.)

  3. Boot from the USB and follow the on-screen instructions to install Ubuntu. Remember to set up a username, password, and hostname.
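
A corrupted download can surface as confusing installer errors, so it’s worth verifying the ISO before flashing. A minimal check, assuming the SHA256SUMS file from the same Ubuntu download page sits next to the ISO:

Shell
# Compare the ISO against Ubuntu's published checksums
sha256sum -c SHA256SUMS --ignore-missing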


Building Blocks: Essential Updates and GPU Configuration

Keeping Your System Updated

Once Ubuntu is installed, bring the system up to date and install the build tools the NVIDIA driver setup will need:

Shell
sudo apt update && sudo apt upgrade -y
sudo apt install build-essential dkms linux-headers-$(uname -r) software-properties-common -y

Configuring NVIDIA Drivers

If your server uses an NVIDIA GPU, install the appropriate drivers:

  1. Add the NVIDIA PPA:

    Shell
    sudo add-apt-repository ppa:graphics-drivers/ppa -y
    sudo apt update
  2. Detect the recommended driver, then install it (nvidia-driver-560 is shown as an example; use whichever version ubuntu-drivers recommends for your card):

    Shell
    ubuntu-drivers devices
    sudo apt install nvidia-driver-560 -y
    sudo reboot
  3. Verify the installation:

    Shell
    nvidia-smi
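
If nvidia-smi fails after the reboot, checking whether the kernel module actually loaded usually narrows the problem down:

Shell
# Confirm the NVIDIA kernel module is loaded
lsmod | grep nvidia
# Check that DKMS built the module for the running kernel
dkms status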

Getting Started with Ollama

Ollama lets you work with advanced AI models locally. Here’s how to get started:

  1. Install Ollama:

    Shell
    curl -fsSL https://ollama.com/install.sh | sh
  2. Add models (e.g., llama3):

    Shell
    ollama pull llama3
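
Once the pull finishes, a quick sanity check confirms the service is running and the model responds. This assumes Ollama’s default API port, 11434:

Shell
# List the models available locally
ollama list
# Send a one-off prompt through the HTTP API
curl http://127.0.0.1:11434/api/generate \
    -d '{"model": "llama3", "prompt": "Say hello in one sentence.", "stream": false}'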

Enhancing Usability with OpenWebUI

OpenWebUI provides a seamless interface for interacting with your models:

  1. Set up OpenWebUI using Docker (host networking lets the container reach Ollama on 127.0.0.1:11434):

    Shell
    sudo docker run -d --network=host -v open-webui:/app/backend/data \
        -e OLLAMA_BASE_URL=http://127.0.0.1:11434 \
        --name open-webui --restart always \
        ghcr.io/open-webui/open-webui:main
  2. Access the WebUI at http://<server-ip>:8080 (with --network=host, OpenWebUI listens on port 8080 by default). If the page doesn’t load, the checks below will help.
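
To diagnose a WebUI that won’t come up, start with the container’s state and logs:

Shell
# Confirm the container is running
sudo docker ps --filter name=open-webui
# Tail the logs for startup errors (Ctrl+C to stop)
sudo docker logs -f open-webui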


Testing and Troubleshooting

  • GPU Verification: Use nvidia-smi to confirm GPU functionality.

  • Common Errors:

    aplay command not found: Install alsa-utils:

    Shell
    sudo apt install alsa-utils -y

    Deprecated hwdb errors: Update packages:

    Shell
    sudo apt update && sudo apt full-upgrade -y
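
  • Ollama Service Logs: When a model fails to load or responses stall, the systemd journal usually explains why (the official install script registers Ollama as a systemd service):

    Shell
    journalctl -u ollama -f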

Optional: CUDA for Compute Workloads

To maximize GPU compute capabilities:

  1. Install the CUDA toolkit (note the Ubuntu-packaged version can lag behind NVIDIA’s releases; use NVIDIA’s own repository if you need the newest CUDA):

    Shell
    sudo apt install nvidia-cuda-toolkit -y
  2. Verify:

    Shell
    nvcc --version
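
For an end-to-end smoke test, compile and run a trivial kernel. This is a minimal sketch, assuming both the driver and the toolkit installed cleanly:

Shell
# Write a tiny CUDA program that prints from the GPU
cat > hello.cu <<'EOF'
#include <cstdio>

__global__ void hello() {
    printf("Hello from the GPU\n");
}

int main() {
    hello<<<1, 1>>>();         // launch a single device thread
    cudaDeviceSynchronize();   // wait for the kernel (and its printf) to finish
    return 0;
}
EOF

# Compile and run it
nvcc hello.cu -o hello && ./hello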

With this setup, your Ubuntu server is ready to host Ollama models behind a clean OpenWebUI front end, whether for experimentation or day-to-day use. Happy experimenting!

Key takeaways

  • Your Hardware, Your Models, Your Privacy. The real power of AI in 2026 isn't just in the cloud; it's in the ability to run Llama 3 and beyond on your own terms. By moving through the "pain" of driver configuration and terminal commands, you gain a system that is free of monthly subscriptions and secure from data mining. A local AI server isn't just a fun weekend project; it's an investment in owning your own data, costs, and tooling.
