Monthly Archives

12 Articles

Greenplum plugin accepted

Posted by skahler on

My Greenplum plugin was accepted; it can be found here. It’s not anything all that special yet. Written in Perl, it relies on DBD::Pg to do most of the heavy lifting. Currently it has four basic functions:

1) DB login check. This is a very simple check that just sees if it can make an actual connection to the GPDB. There are instances where a TCP port check would show that it’s up, but the login for your user is prevented. A good example of this would be removing remote access from pg_hba.conf in order to do maintenance and then forgetting to re-enable remote users and/or access from specific network locations.

2) SELECT test. This test goes in and does a SELECT COUNT(1), gp_segment_id FROM schema.table GROUP BY gp_segment_id. The idea behind this check is to make sure a table is responding on all segments. This could also be used as an SLA check to make sure you aren’t surpassing certain time constraints when pulling results from tables. Currently I do this against a small 1000-row table I generated in our systems.

3) WRITE test. Here the plugin logs into GPDB and attempts to create a temp table (a rough SQL sketch follows this list). We’ve had instances where GPDB has been up and you are able to log in and do select queries, yet any query that requires a write just hangs. This check is there to catch that situation. I have yet to test on a 3.x system whether this check will fail should the system go into “read-only” mode.

4) A very simple segment status check. This is the base for a more extensive check that I will build up. Currently it goes into gp_configuration or gp_segment_configuration, depending upon your choice of 3.x or 4.x, and looks to make sure all the segments are online. It sends back a crit status if any segments show offline. I plan to do a lot of tweaking to this test in the near future, such as being able to specify warn and crit thresholds for the number of segments online, as well as checking to see if any two segments containing the same content are down.
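For reference, in SQL terms the write test and the segment status check boil down to roughly the following. The temp table columns match what the help text further down describes; the char length, the explicit DROP and the exact status queries are illustrative sketches (the 3.x column names are from memory), not lifted verbatim from the plugin.

-- write test: create and drop a throwaway temp table (the name comes from --create-table)
CREATE TEMPORARY TABLE nagios_write_check (id int, vals char(20)) DISTRIBUTED BY (id);
DROP TABLE nagios_write_check;

-- segment status on 4.x: list anything the master currently marks down
SELECT dbid, content, role, status
FROM gp_segment_configuration
WHERE status = 'd';

-- segment status on 3.x: gp_configuration uses a boolean valid column instead
SELECT dbid, content, isprimary
FROM gp_configuration
WHERE NOT valid;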

The timeouts on all the tests are configurable and default to 300 seconds. I wouldn’t suggest setting up any of these tests to repeat more often than every 5 minutes, except for possibly the login test. Doing something like forcing your GP cluster to do a select test against a multi-TB table every minute would probably be a bad idea.

On our system our current check setup looks like this:

This being the first plugin I’ve submitted, feedback on code cleanup or different additions that should be incorporated into a Greenplum plugin would be appreciated.


A nagios check to go in and check various Greenplum availability pieces


-t, --timeout Plugin timeout in seconds [default=300]
-U, --username Username to connect with (mandatory)
-P, --password Password to connect with (mandatory)
-H, --dbhost Database Hostname to connect to (mandatory)
-D, --db Database to connect to (mandatory)

Check to see if Greenplum accepts a connection
This is the default check


Check to see if getting data from a table works
The check executes a SELECT count by segment_id query
for the specified table. This could also be used to
set up SLA checks for getting data from the db

--select-schema Schema for Select check (mandatory)
--select-table Table for Select check (mandatory)

Check to see if table creation works
The check creates a temp table with id(int) and
vals(char) columns and sets the distributed by to id.
This helps to monitor if the catalog queries and table
creation are happening in a reasonable amount of time

--create-table Table for Create check (mandatory)

Check to see if GP considers any segments offline
Query the gp_configuration (3.x) and gp_segment_configuration (4.x)
tables to see if any segments are marked down at the master level.
Currently this will crit if any show down.


plugin action

Posted by skahler on

Just uploaded a Greenplum plugin to the Nagios plugin exchange. It’s my first go at an official Nagios plugin. I’ve created checks for a simple login, reading from a table, writing to a temp table and checking for segment status in 3.x and 4.x installs. Once it’s accepted I’ll publish links here.

move data from db to db

Posted by skahler on

We had two Greenplum instances running and we needed to copy one big table to the other. There are a variety of ways to dump and import, but we were moving a large amount of data, so the thought of dropping it to a local file was not a very good option.

One of the ways to move a table from one instance to the other is using COPY commands. If you are on the destination database you would use:

psql -h remotegreenplumtopulldatafrom -U me -c "COPY myschema.mytable to STDOUT" | psql -c "COPY myschema.mytable FROM STDIN"

We were going to move a lot of data though, so I came up with a quick script to do it a partition at a time:

# database_dump
# Simple program to go out and copy dated partitions of
# a greenplum db to another db where the table exists
# it starts at start_time and goes incrementally backwards
# a day at a time until it gets to end_time
# 2010-03-16 SHK

use strict;
use warnings;
use POSIX;
use Time::Local;

# Fields are S,M,H,D,M,Y - Note month is 0-11
my $start_time = timelocal(0,0,0,1,2,2011);
my $end_time = timelocal(0,0,0,1,2,2010);
my $cmd;

my $running_date = $start_time;
until ( $running_date < $end_time ) {
    my $part_date = strftime("%Y%m%d", localtime($running_date));
    $running_date -= 86400;
    $cmd = qq~psql -h remotegreenplumtopullfrom -U skahler -c "COPY myschema.mytable_1_prt_$part_date to STDOUT" | psql -c "COPY myschema.mytable FROM STDIN"~;
    print "$cmd\n";
    system($cmd);
}
exit;

In our case the new db didn't have much action going on while I was moving the data in. I think if it was an active table that I was moving things into, I'd create a table on the destination target, load the data into it and then exchange that partition into my target table.
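That last approach would look something like this; the staging table, the distribution column and the partition date are made up for illustration:

-- build a standalone staging table that matches the target
CREATE TABLE myschema.mytable_stage (LIKE myschema.mytable) DISTRIBUTED BY (id);
-- COPY the remote partition's rows into myschema.mytable_stage, then swap it in
ALTER TABLE myschema.mytable
    EXCHANGE PARTITION FOR (DATE '2011-03-01')
    WITH TABLE myschema.mytable_stage;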

gp_segment_id helps find skew

Posted by skahler on

Finding table skew in Greenplum is very important. If you end up with a bad distribution of records across segments, you end up with one node doing much more work than the others. Unfortunately, Greenplum firmly supports No Child Left Behind: it will only be as fast as its slowest member. Thus it is extremely important to have an even distribution across all the segments. A good way to check this is to use the hidden column gp_segment_id. A simple count query on this column will let you know how well your data is spread across nodes.

So I’m pulling a sample of 500 tuples from an example advertising impression data set.
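For reference, the distribution key is just declared as part of the table DDL; something along these lines, with made-up column names. If I remember right, 4.x also lets you change the key in place with ALTER TABLE.

-- declare the distribution key at create time (column names are illustrative)
CREATE TABLE skahler.demo (
    business_unit int,
    campaign_id   int,
    ip            inet,
    impression_ts timestamp
) DISTRIBUTED BY (business_unit);

-- 4.x: change the key in place and redistribute the existing rows
ALTER TABLE skahler.demo SET DISTRIBUTED BY (campaign_id);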

First I think I’ll distribute it on the business unit. The results are:

db=# select gp_segment_id, count(1) from skahler.demo group by gp_segment_id;
 gp_segment_id | count
---------------+-------
             5 |   500
(1 row)

Ouch, that was not the right choice. Obviously the set of data I got is all from one unit. So let’s try something different. How about distributing by the campaign they are associated with?

db=# select gp_segment_id, count(1) from skahler.demo group by gp_segment_id;
 gp_segment_id | count
---------------+-------
             1 |    43
             7 |    33
             0 |    58
             5 |    54
             4 |    39
             6 |    46
             3 |    50
             2 |   177
(8 rows)

A little bit better, but that doesn’t work so well either. The data is spread across the segments, but segment 2 is holding much more data than everybody else. This will make for some hot spotting when I query the data. Next up, maybe I can try by IP.

db=# select gp_segment_id, count(1) from skahler.demo group by gp_segment_id;
 gp_segment_id | count
---------------+-------
             7 |    62
             4 |    68
             1 |    56
             6 |    63
             3 |    64
             0 |    66
             5 |    58
             2 |    63
(8 rows)

Ah, that looks much better. Of course, this could still end up skewed if our traffic were heavily from a certain country or data came much more often from a specific network segment. So I would need to continue to watch it and see if skew develops over time. It looks like this will work for now.

Game over try again

Posted by skahler on

Tonight’s Greenplum error was:

ERROR: could not find segment file to use for inserting into relation table (64749). (appendonlywriter.c:569) SQL state: XX000

Which essentially means game over: dump your table and recreate it, because you won’t be able to put any more data into it. Luckily you can still pull it out. This seems to only happen to append-only compressed tables, in both 3.x and 4.x. It’s supposed to be fixed in an upcoming patch release. It’s still enough to make George Bush sad.
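Until the fix shows up, the recovery is basically a rebuild. Something along these lines works; the storage options and distribution column here are illustrative, so match them to the real table definition:

-- copy everything out into a fresh table with the same structure
CREATE TABLE myschema.mytable_rebuilt (LIKE myschema.mytable)
    WITH (appendonly=true, compresstype=zlib)
    DISTRIBUTED BY (id);
INSERT INTO myschema.mytable_rebuilt SELECT * FROM myschema.mytable;
-- then drop the broken table and rename the rebuilt one into place
DROP TABLE myschema.mytable;
ALTER TABLE myschema.mytable_rebuilt RENAME TO mytable;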

Getting Distinct counts on keys

Posted by skahler on

It seems that someone always wants to know “how much of X do I have in the db?”

Good Idea:

SELECT COUNT(1) from ( SELECT distribkeyvalue FROM db.really_big_table GROUP BY distribkeyvalue) a

Bad Idea:

SELECT COUNT(DISTINCT distribkeyvalue ) FROM db.really_big_table

In the first case the Greenplum optimizer will realize that it can do all of the work on the nodes and just forward the final counts to be aggregated, whereas in the second case it’s going to try to bring all the data back to a central location in order to attempt to establish uniqueness in the dataset. Ouch.
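If you want to convince yourself, compare the two plans rather than running the full queries; the GROUP BY version should show the heavy aggregation happening down on the segments, while the DISTINCT version gathers the column to one place first.

-- compare the plans without paying for the full queries
EXPLAIN SELECT COUNT(1) FROM (
    SELECT distribkeyvalue FROM db.really_big_table GROUP BY distribkeyvalue
) a;

EXPLAIN SELECT COUNT(DISTINCT distribkeyvalue) FROM db.really_big_table;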

Disk performance and disk fragmentation

Posted by skahler on

My last post had some statistics for a C2100 cluster we were running. Last night I did maintenance on a cluster that is running on R710s attached via PERC6/E controllers to an MD1120 array filled with 24 300GB disks (10k 2.5″). These are split into 4 arrays with 6 disks each, set up as RAID5. The gpcheckperf at the start of my recent maintenance:

gpadmin@mdw:~> gpcheckperf -f hosts.seg -d /data/vol1 -d /data/vol2 -d /data/vol3 -d /data/vol4 -r d -D

disk write min bandwidth (MB/s): 888.01 [sdw14-1]
disk write max bandwidth (MB/s): 968.73 [ sdw4-1]

disk read min bandwidth (MB/s): 1592.66 [ sdw7-1]
disk read max bandwidth (MB/s): 1941.55 [sdw13-1]

One of the next things I do is take a look at disk fragmentation using “xfs_db -c frag -r /dev/X”, where X is one of my four arrays. In this case I came up with about 35% fragmentation across all of our arrays.

To clean this up I do a run of xfs_fsr across the disks, which got them all down to less than 1% fragmentation.

The next disk test produced similar write speeds but increased read speeds:

disk write min bandwidth (MB/s): 872.72 [ sdw8-1]
disk write max bandwidth (MB/s): 960.32 [sdw15-1]

disk read min bandwidth (MB/s): 1975.79 [ sdw8-1]
disk read max bandwidth (MB/s): 2052.40 [ sdw2-1]

Up until the last couple of months it was not uncommon for us to hit 80%+ fragmentation on all of our nodes in the Greenplum cluster. Our recent switch from SUSE to Red Hat should help fix this; there was apparently a bug fix in a recent RHEL kernel release to clean this up. I’ve noticed that in this cluster fragmentation can have a significant impact on our reported speeds. Oddly, on clusters with a single controller running 12 600GB disks (15k 3.5″) split into two arrays, I see very little change in these I/O reports, even when stepping down from 95% fragmentation to 1%.

What kind of disk performance does your GP see?

Posted by skahler on

During our regular maintenance windows I run a gpcheckperf to see where our disk speeds in the Greenplum cluster are coming in. This is a result from a C2100 with a single LSI 9260-8i controller. There are two virtual disks composed of 6 disks each, arranged in a RAID5. For the file system I’m using xfs with the mount options logbufs=8, logbsize=256k, noatime, attr2, nobarrier, and seeing these results:

/usr/local/greenplum-db/./bin/gpcheckperf -f /data/gpadmin/hosts.seg -d /data/gpdb_p1 -d /data/gpdb_p2 -r d -D

disk write min bandwidth (MB/s): 945.25 [sdw15]
disk write max bandwidth (MB/s): 1007.74 [sdw13]

disk read min bandwidth (MB/s): 1239.10 [sdw15]
disk read max bandwidth (MB/s): 1691.65 [sdw12]

Are these similar numbers to what you are getting in your clusters?

Adding plperlu Language to Greenplum on RHEL5

Posted by skahler on

In order to get plperlu added as a language on our Greenplum 4.0 RHEL5 cluster I had to take a couple of additional steps. My first unhappy message was:

db=# CREATE LANGUAGE plperlu;
ERROR:  could not load library “/usr/local/greenplum-db-”: cannot open shared object file: No such file or directory

So Greenplum can’t find libperl.so:

[gpadmin@mdw ~]$ locate libperl.so
/usr/lib64/perl5/5.8.8/x86_64-linux-thread-multi/CORE/libperl.so
Looks like it’s on the system, so I just need to make it available:

[root@mdw ~]# echo “/usr/lib64/perl5/5.8.8/x86_64-linux-thread-multi/CORE/” > /etc/

[root@mdw ~]# ldconfig

Next I went out and did the same thing on all the nodes using gpssh. Then it’s back to the master and we try to create the language again:

db=# CREATE LANGUAGE plperlu;
CREATE LANGUAGE
Time: 892.420 ms
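A quick throwaway function to prove the language is actually usable (not part of the Greenplum install, just a smoke test):

-- trivial plperlu function that returns the perl version the backend loaded
CREATE OR REPLACE FUNCTION perl_backend_version() RETURNS text AS $$
    return "perl $]";
$$ LANGUAGE plperlu;

SELECT perl_backend_version();
DROP FUNCTION perl_backend_version();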

I should note that I tried to install plperl on the SLES11 servers we had and ran into an issue because the version of perl on SLES11 is 5.10, and unfortunately the plperl library within Greenplum’s distro is looking for 5.8. I didn’t look for a fix because I knew we were going to jump to Red Hat in the near future and it would work there.

Alpine Miner First Look

Posted by skahler on

Downloaded the new Alpine Miner off of the Greenplum Community site. Unfortunately for me there are only Mac and Windows versions, so I had to fire up a VM to try it out.