Hosting Postgres with GeoLite2: a practical guide to IP geolocation, data loading, and updates

Mahmoud Abdelwahab

Dec 16, 2025

When should you load GeoLite2 into Postgres instead of using the binary format?
What's the easiest way to deploy Postgres with GeoLite2 preloaded?
Option 1: Initialization script
Option 2: Custom Docker image
Option 3: Separate data-loading service
How do you query GeoLite2 data in Postgres?
How often should you update GeoLite2 data?
How much storage does GeoLite2 require?
What about query performance at scale?
Postgres with GeoLite2 on Railway
What you get
What you'd add yourself
Getting started
Conclusion

IP geolocation maps IP addresses to physical locations: countries, cities, coordinates. MaxMind's GeoLite2 is the standard free database for this, used by analytics platforms, content localization systems, fraud detection tools, and compliance workflows.

You can query GeoLite2 through MaxMind's binary format (MMDB) or load it into Postgres for SQL access. This guide covers the Postgres approach: when it makes sense, how to deploy it, and how to keep the data fresh.

When should you load GeoLite2 into Postgres instead of using the binary format?

Both approaches work. The right choice depends on how your application uses geolocation data.

The binary format (MMDB) is optimized for fast, single-IP lookups. MaxMind provides client libraries for most languages that read the binary file directly. Lookups are fast (sub-millisecond) and the integration is straightforward: download the file, point your code at it, call a function.

Loading into Postgres makes sense when you need to:

Join geolocation data with other tables. If you're enriching user records, log entries, or analytics data with location information, doing it in SQL is often simpler than fetching each IP individually in application code.

Run batch operations. Geolocating thousands or millions of IPs is more efficient as a single SQL query than thousands of individual library calls.

Query the geolocation data itself. If you need to answer questions like "which IP ranges are in Germany?" or "how many networks map to this city?", SQL queries are the natural tool.

Share data across services. Multiple applications can query the same Postgres database without each needing its own copy of the MMDB file.

The tradeoff is operational complexity. You need to load the data, keep it updated, and manage Postgres. For simple use cases where you just need to look up individual IPs, the binary format is easier.

What's the easiest way to deploy Postgres with GeoLite2 preloaded?

Container-based deployments give you several options. The right one depends on whether you want the data loaded on first boot, baked into your image, or managed by a separate process.

Option 1: Initialization script

Postgres containers run scripts in /docker-entrypoint-initdb.d/ when the database initializes with an empty data directory. This is the simplest approach for getting started.

Your init script downloads the GeoLite2 CSV files, creates tables, loads the data, and builds indexes:

#!/bin/bash
set -e

# Download GeoLite2 City data (requires MaxMind license key)
curl -L -o /tmp/geolite2-city.zip \
  "https://download.maxmind.com/app/geoip_download?edition_id=GeoLite2-City-CSV&license_key=${MAXMIND_LICENSE_KEY}&suffix=zip"
unzip /tmp/geolite2-city.zip -d /tmp/

# Create tables and load data
psql -v ON_ERROR_STOP=1 --username "$POSTGRES_USER" --dbname "$POSTGRES_DB" <<-EOSQL
    CREATE TABLE geoip_network (
        network cidr NOT NULL,
        geoname_id int,
        registered_country_geoname_id int,
        represented_country_geoname_id int,
        is_anonymous_proxy bool,
        is_satellite_provider bool,
        postal_code text,
        latitude numeric,
        longitude numeric,
        accuracy_radius int,
        is_anycast bool
    );

    CREATE TABLE geoip_location (
        geoname_id int NOT NULL,
        locale_code text NOT NULL,
        continent_code text,
        continent_name text,
        country_iso_code text,
        country_name text,
        subdivision_1_iso_code text,
        subdivision_1_name text,
        subdivision_2_iso_code text,
        subdivision_2_name text,
        city_name text,
        metro_code int,
        time_zone text,
        is_in_european_union bool,
        PRIMARY KEY (geoname_id, locale_code)
    );

    \copy geoip_location FROM PROGRAM 'cat /tmp/GeoLite2-City-CSV_*/GeoLite2-City-Locations-en.csv' WITH (FORMAT CSV, HEADER);
    \copy geoip_network FROM PROGRAM 'cat /tmp/GeoLite2-City-CSV_*/GeoLite2-City-Blocks-IPv4.csv' WITH (FORMAT CSV, HEADER);
    \copy geoip_network FROM PROGRAM 'cat /tmp/GeoLite2-City-CSV_*/GeoLite2-City-Blocks-IPv6.csv' WITH (FORMAT CSV, HEADER);

    CREATE INDEX idx_geoip_network ON geoip_network USING gist (network inet_ops);
EOSQL

# Cleanup
rm -rf /tmp/geolite2-city.zip /tmp/GeoLite2-City-CSV_*

This runs once when the database initializes. Subsequent container restarts skip initialization because the data directory isn't empty.

To use this with Railway, add the script to a custom Postgres template. The database initializes with GeoLite2 data on first deploy.

Option 2: Custom Docker image

If you want the data baked into your image for reproducible deployments, build a custom Postgres image:

FROM postgres:16

COPY geolite2-city-blocks-ipv4.csv /docker-entrypoint-initdb.d/data/
COPY geolite2-city-blocks-ipv6.csv /docker-entrypoint-initdb.d/data/
COPY geolite2-city-locations-en.csv /docker-entrypoint-initdb.d/data/
COPY init-geolite2.sql /docker-entrypoint-initdb.d/

The init SQL script loads from local files instead of downloading. This approach guarantees the same data every time you deploy, but requires rebuilding the image whenever you want fresh data.

Option 3: Separate data-loading service

For production workloads where you need regular updates, run data loading as a separate service. This decouples the database from the update pipeline.

Deploy a service that runs on a schedule, downloads the latest data, and refreshes the database tables. Railway supports cron jobs for this pattern.

import { execSync } from "child_process";
import { createReadStream } from "fs";
import { pipeline } from "stream/promises";
import postgres from "postgres";
import { from as copyFrom } from "pg-copy-streams";

const DATABASE_URL = process.env.DATABASE_URL!;
const MAXMIND_LICENSE_KEY = process.env.MAXMIND_LICENSE_KEY!;

async function updateGeoLite2() {
  // Download latest data
  const downloadUrl = `https://download.maxmind.com/app/geoip_download?edition_id=GeoLite2-City-CSV&license_key=${MAXMIND_LICENSE_KEY}&suffix=zip`;
  execSync(`curl -L -o /tmp/geolite2.zip "${downloadUrl}"`);
  execSync("unzip -o /tmp/geolite2.zip -d /tmp/");

  const sql = postgres(DATABASE_URL);

  // Load into temporary tables, then swap
  await sql`TRUNCATE geoip_network, geoip_location`;

  // Use COPY for fast bulk loading
  await sql`
    COPY geoip_location FROM PROGRAM 
    'cat /tmp/GeoLite2-City-CSV_*/GeoLite2-City-Locations-en.csv' 
    WITH (FORMAT CSV, HEADER)
  `;

  await sql`
    COPY geoip_network FROM PROGRAM 
    'cat /tmp/GeoLite2-City-CSV_*/GeoLite2-City-Blocks-IPv4.csv' 
    WITH (FORMAT CSV, HEADER)
  `;

  await sql`
    COPY geoip_network FROM PROGRAM 
    'cat /tmp/GeoLite2-City-CSV_*/GeoLite2-City-Blocks-IPv6.csv' 
    WITH (FORMAT CSV, HEADER)
  `;

  await sql.end();
  console.log("GeoLite2 update complete");
}

updateGeoLite2();

This keeps your main Postgres deployment simple while ensuring data stays current.

How do you query GeoLite2 data in Postgres?

Once the data is loaded, IP lookups use Postgres's built-in network operators. The >>= operator checks if a network contains an IP address:

SELECT
    l.country_name,
    l.city_name,
    n.latitude,
    n.longitude,
    n.accuracy_radius
FROM geoip_network n
JOIN geoip_location l ON n.geoname_id = l.geoname_id
WHERE n.network >>= '8.8.8.8'::inet
  AND l.locale_code = 'en';

With the GiST index on the network column, this query runs in under 10ms even with millions of network ranges.

For batch operations, join your data directly:

SELECT
    logs.ip_address,
    logs.timestamp,
    l.country_iso_code,
    l.city_name
FROM access_logs logs
JOIN geoip_network n ON n.network >>= logs.ip_address::inet
JOIN geoip_location l ON n.geoname_id = l.geoname_id
WHERE l.locale_code = 'en'
  AND logs.timestamp > NOW() - INTERVAL '1 day';

This geolocates all IPs in a single query rather than making individual lookups.

How often should you update GeoLite2 data?

MaxMind updates the GeoLite2 databases weekly, every Tuesday. IP allocations change as ISPs acquire new ranges and reassign existing ones. Stale data means incorrect geolocation for some percentage of lookups.

How much accuracy matters depends on your use case:

Analytics and reporting can tolerate weekly or even monthly updates. A small percentage of IPs geolocating incorrectly doesn't significantly affect aggregate statistics.

Content localization benefits from fresher data but isn't critically dependent on it. Showing the wrong language or currency to a small percentage of users is suboptimal but not catastrophic.

Compliance and fraud detection may require more frequent updates. If you're blocking traffic from specific countries or flagging suspicious locations, stale data creates both false positives and false negatives.

For most applications, updating weekly (matching MaxMind's release cadence) is sufficient. Set up a cron job or scheduled service that runs every Tuesday or Wednesday.

A simple update strategy:

Download the new CSV files
Load into temporary tables
Swap the tables in a transaction
Drop the old tables

BEGIN;

-- Rename current tables
ALTER TABLE geoip_network RENAME TO geoip_network_old;
ALTER TABLE geoip_location RENAME TO geoip_location_old;

-- Rename new tables
ALTER TABLE geoip_network_new RENAME TO geoip_network;
ALTER TABLE geoip_location_new RENAME TO geoip_location;

COMMIT;

-- Clean up outside transaction
DROP TABLE geoip_network_old;
DROP TABLE geoip_location_old;

This approach minimizes downtime. The swap happens in a transaction, so queries see either the old data or the new data, never a partial state.

How much storage does GeoLite2 require?

The storage requirements are modest:

Database	Uncompressed CSV	In Postgres (with indexes)
GeoLite2 Country	~10 MB	~50 MB
GeoLite2 City	~150 MB	~400 MB
GeoLite2 ASN	~15 MB	~80 MB

If you're loading all three databases with both IPv4 and IPv6 data, expect roughly 500-600 MB of storage.

This is small enough that storage scaling isn't a significant concern. The more relevant operational question is how your hosting provider handles storage growth and pricing.

Provisioned storage requires choosing a disk size upfront. You're paying for the full amount whether you use it or not, and resizing may require downtime.

Usage-based storage charges for what you actually use. If your database uses 1 GB including GeoLite2 data, you pay for 1 GB.

Railway uses the usage-based model. Volumes grow as your data grows, and you pay for actual consumption. For GeoLite2 specifically, this means you don't need to guess how much space to provision.

What about query performance at scale?

The GiST index on the network column is essential for performance. Without it, every lookup scans the entire table (millions of rows for IPv4 + IPv6). With it, lookups are sub-10ms.

CREATE INDEX idx_geoip_network ON geoip_network USING gist (network inet_ops);

A few things affect performance at scale:

Index must fit in memory. The GiST index for GeoLite2 City is roughly 200 MB. If this exceeds your available shared_buffers, queries slow down. For most deployments, this isn't an issue.

Batch queries are more efficient than individual lookups. If you need to geolocate 10,000 IPs, do it in one query with a JOIN rather than 10,000 individual queries.

Consider caching for hot paths. If the same IPs are looked up repeatedly (common in web applications), cache the results in Redis or application memory. GeoLite2 data changes weekly, so cached results stay valid for days.

For most applications, a single Postgres instance handles GeoLite2 queries without performance issues. The workload is read-heavy and the data fits comfortably in memory.

Postgres with GeoLite2 on Railway

Railway runs Postgres as a containerized service with persistent storage. For GeoLite2 specifically, this means:

What you get

Flexible deployment options. Use initialization scripts, custom Docker images, or separate data-loading services. Railway supports all three approaches.

Usage-based storage. Pay for what you use. GeoLite2 data adds roughly 500 MB to your database, and you pay for that incrementally.

Cron support for updates. Deploy a service with a cron trigger that runs weekly to refresh your GeoLite2 data. Railway handles the scheduling.

Private networking. Your application connects to Postgres over a private network. The database isn't exposed to the public internet.

What you'd add yourself

Automated update pipeline. Railway provides the infrastructure (cron triggers, private networking), but you write the update script that downloads new data and refreshes the tables.

Monitoring for data freshness. Set up alerts if your update job fails. Stale GeoLite2 data degrades accuracy silently.

Backup verification. Railway provides scheduled backups, but verify your backups include the GeoLite2 tables and can be restored successfully.

Getting started

Deploying Postgres with GeoLite2 on Railway:

Create a Railway account at railway.com
Add a Postgres database to your project
Add your initialization script or deploy a custom template
Set MAXMIND_LICENSE_KEY as an environment variable (get a free key at maxmind.com)
Deploy and wait for initialization to complete
Connect your application and start querying

For ongoing updates, add a separate service with a cron trigger that runs your update script weekly.

Conclusion

Loading GeoLite2 into Postgres makes sense when you need to join geolocation data with other tables, run batch operations, or query the data itself. The storage requirements are modest, and query performance is excellent with proper indexing.

The key operational considerations are: choosing a deployment strategy that fits your needs (initialization scripts for simplicity, separate services for ongoing updates), updating data weekly to match MaxMind's release cadence, and monitoring to catch update failures before they affect accuracy.

推荐订阅源

Railway Blog