Data Science Summit 2012

Posted by skahler on

I’m attending the 2012 Data Science Summit, and I am happy to report it has been well worth my time. It isn’t a nuts-and-bolts conference on which technologies to use or how to use them, which processes you should follow, or which machine learning algos to apply in a given situation. Instead, there are presentations and panels on topics around working with data that apply directly to much of the work I do.

Dashboarding Greenplum

Posted by skahler on

When Greenplum first landed in our shop, they wanted us to use gpperfmon. It quickly became obvious that it wasn’t stable at that time and that it created way too much overhead. So a couple of years ago I came up with my own dashboarding tools that rely on the database as little as possible and exist outside of the cluster. My thought was that if the cluster is down, it’s pretty hard to troubleshoot what’s wrong with it when the stats are kept in the cluster itself. The tool I came up with blends some Greenplum query checks with sar data and uses MegaCLI to pull disk health. Here’s a quick glimpse so you can get an idea of what I’ve got going.
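
To make the sar piece a bit more concrete, here is a minimal Python sketch of how CPU samples might be pulled out of `sar -u` style text on a box outside the cluster. This is not the code behind the dashboard. The function name `parse_cpu_busy` and the sample text are made up for illustration, and the sketch assumes the usual layout where a header row names the columns and `%idle` is one of them.

```python
# Hypothetical sample of `sar -u` style output, used only for illustration.
SAMPLE_SAR_OUTPUT = """\
12:00:01 AM     CPU     %user     %nice   %system   %iowait    %steal     %idle
12:10:01 AM     all      1.50      0.00      0.50      0.25      0.00     97.75
12:20:01 AM     all     40.00      0.00     10.00      5.00      0.00     45.00
Average:        all     20.75      0.00      5.25      2.63      0.00     71.38
"""


def parse_cpu_busy(sar_text):
    """Return a list of (timestamp, busy_percent) tuples from sar -u style text."""
    samples = []
    idle_col = None  # position of %idle, counted from the end of the row
    for line in sar_text.splitlines():
        fields = line.split()
        if "%idle" in fields:
            # Header row: remember where %idle sits relative to the row's end,
            # so a leading "AM"/"PM" field does not shift the index.
            idle_col = fields.index("%idle") - len(fields)
            continue
        if idle_col is None or not fields or fields[0] == "Average:":
            continue  # skip banner lines, blank lines, and the summary row
        if "all" not in fields:
            continue  # only keep the all-CPU rows
        timestamp = " ".join(fields[:fields.index("all")])
        busy = round(100.0 - float(fields[idle_col]), 2)
        samples.append((timestamp, busy))
    return samples


if __name__ == "__main__":
    for timestamp, busy in parse_cpu_busy(SAMPLE_SAR_OUTPUT):
        print(timestamp, busy)
```

Because the parsing works on plain text that each host already collects, nothing here needs the database to be up, which is the whole point of keeping the stats outside the cluster.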