Sunday, March 27, 2016


PostgreSql tablespaces

This is my 1nst post after more than two years, I can’t believe how much time has passed since my last article. In this period I was still programming using FPC/Lazarus and Delphi (only from time to time, fixing bugs/adding features to old apps), also doing some stuff in C++ (using the DCMTK dicom library) and of course JavaScript/CSS/HTML5/AngularJs, but the common denominator of all the work I’ve done is the wonderful, marvellous, spectacular PostgreSql database, yes it’s awesome, I agree with those that calls it a developer’s friendly database, it really is.
Now, I’ve decided to start writing about solutions to not so common problems involving PostgreSql, but don’t worry, I’ll still write about FPC ;).

Reaching free disk space limit

Imagine this situation, you installed PostgreSql in an Ubuntu Server by using sudo apt-get install postgresql and populated your database, then after some time, you find it is growing more than you initially expected. So, the solution is easy, add more disk space…, yes, but how you can add more disk space depends on the architecture of the underlying hardware and operating system. The easiest solution is to add an extra hard disk with more capacity, then move the PGDATA directory, but some times you can’t do that.
Let’s continue with an example. The server has one 1Tb disk, with two partitions, first is 800Gb and second is 200Gb. You install PostgreSql in the first partition, using all the defaults, so your PGDATA directory is in /var/lib/postgresql/9.x/main. Everything is ok, until you find the space limit on the first partition is near.

Tablespaces to the rescue

According to the PostgreSql manual, Tablespaces allow database administrators to define locations in the file system where the files representing database objects can be stored. Once created, a tablespace can be referred to by name when creating database objects.
So, one solution to our problem is move one or more tables from one partition to another. To do that, first determine the current in-disk size of your biggest tables, using this command:

SELECT nspname || '.' || relname AS "relation",
  pg_size_pretty(pg_relation_size(C.oid)) AS "size"
FROM pg_class C
LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace)
WHERE nspname NOT IN ('pg_catalog', 'information_schema')
ORDER BY pg_relation_size(C.oid) DESC

For example, imagine we determine to move a 100Gb table called “documents” to a new tablespace, in the partition mounted on /mnt/partition2. The process is very easy:
  1. Create a directory in /mnt/partition2 where our data will reside:

    sudo mkdir /mnt/partition2/pgdata

  2. Assign postgres owner:

    sudo chown postgres:postgres /mnt/partition2/pgdata

  3. Create the new tablespace (from now on inside psql or PGAdminIII):

    create tablespace part2 location '/mnt/partition2/pgdata';

  4. Change the tablespace on documents table (this can take a while):

    alter table documents set tablespace part2;

That’s it, the documents table now is stored in a different location and our main partition has 100Gb more!.
As you can imagine, the tablespaces can be used in a lot more situations than this, for example, imagine you have two servers, one with screaming fast SSD storage and other with SATA, but bigger than the main server, you can store the most recent and needed data in the fast server and older data in the slow one, in my next article I’ll show you how.

Labels: , ,

Actually, that has been a best practice supported by "true" RDBMS and DBAs since the dawn of times, or little after. Having your database files mixed into other data just slowed down things and complicated backups. Moreover, it is pretty common to have a one or more dedicated set of disks for database files to improve reliability and/or I/O speed (using different tablespaces you can direct I/O to different disks) - and those disks may be even some remote ones in a SAN.
It has always been a sort of bug that databases like MySQL and Postgres in an effort to simplify things a bit too much used default locations for data files that are usually the worst one. Good maybe for the average little site run on the same machine, dreadful for any serious application.
Gud Article. I am following this blog and another website

Both are really good.
Post a Comment

<< Home

This page is powered by Blogger. Isn't yours?