Loading, listing, and using R Modules and Functions in PL/R
- by Dave Jarvis
I am having difficulty with:
Listing the R packages and functions available to PostgreSQL.
Installing a package (such as Kendall) for use with PL/R.
Calling an R function within PostgreSQL.
Listing Available R Packages
Q.1. How do you find out what R modules have been loaded?
SELECT * FROM r_typenames();
That shows the types that are available, but what about checking if Kendall( X, Y ) is loaded? For example, the documentation shows:
CREATE TABLE plr_modules (
modseq int4,
modsrc text
);
That table seems to let me insert records that dictate which R code gets loaded, but the documented example below doesn't show, syntactically, how to make sure Kendall is loaded:
INSERT INTO plr_modules
VALUES (0, 'pg.test.module.load <-function(msg) {print(msg)}');
Q.2. What would the above line look like if you were trying to load Kendall?
Q.3. Is plr_modules even applicable to loading a package such as Kendall?
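To make Q.1 and Q.2 concrete, here is a rough sketch of what I imagine the answers might look like. The function name plr_loaded_packages is hypothetical (it is not part of PL/R). The plr_modules row assumes that the modsrc text is simply evaluated as R code when the interpreter initializes, so that a library() call there would attach the package. I have not confirmed that this is the intended use, and it also assumes Kendall is already installed.

```sql
-- Hypothetical helper (not part of PL/R): returns the names of the R
-- namespaces currently loaded in this session's embedded interpreter.
CREATE OR REPLACE FUNCTION plr_loaded_packages() RETURNS SETOF text AS $$
  loadedNamespaces()
$$ LANGUAGE plr;

-- Check whether Kendall is among the loaded namespaces.
SELECT * FROM plr_loaded_packages() AS pkg WHERE pkg = 'Kendall';

-- Assumption: modsrc is evaluated as plain R code at interpreter start-up,
-- so a library() call would attach the (already installed) package.
INSERT INTO plr_modules
VALUES (1, 'library(Kendall)');

-- Re-run the code stored in plr_modules without reconnecting.
SELECT reload_plr_modules();
```

If that assumption is wrong, the alternative I can think of is calling library(Kendall) at the top of each PL/R function that needs it.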
Installing R Packages
Using the Synaptic package manager, I have installed the following packages:
r-base
r-base-core
r-base-dev
r-base-html
r-base-latex
r-cran-acepack
r-cran-boot
r-cran-car
r-cran-chron
r-cran-cluster
r-cran-codetools
r-cran-design
r-cran-foreign
r-cran-hmisc
r-cran-kernsmooth
r-cran-lattice
r-cran-matrix
r-cran-mgcv
r-cran-nlme
r-cran-quadprog
r-cran-robustbase
r-cran-rpart
r-cran-survival
r-cran-vr
r-recommended
Q.4. How do I know if Kendall is in there?
Q.5. If it isn't, how do I find out what package it is in?
Q.6. If it isn't in a package suitable for installing with apt-get (aptitude, synaptic, dpkg, what have you), how do I go about installing it on Ubuntu?
Q.7. Where are the installation steps documented?
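For Q.4, the closest I can come up with is asking the embedded R interpreter itself. The sketch below uses a hypothetical helper named plr_has_package (my own name, not a PL/R built-in) and assumes the R library paths seen by the PostgreSQL server process are the ones that matter, which may differ from those of an interactive R session.

```sql
-- Hypothetical helper (not part of PL/R): TRUE if the named package is
-- installed in any library path visible to the embedded R interpreter.
CREATE OR REPLACE FUNCTION plr_has_package(text) RETURNS boolean AS $$
  # arg1 is PL/R's default name for the first unnamed argument
  arg1 %in% rownames(installed.packages())
$$ LANGUAGE plr;

SELECT plr_has_package('Kendall');
```

If this returns false, I presume the package has to be installed from within R itself, for example with install.packages("Kendall") run as a user who can write to the site library, but that is exactly what I am hoping someone can confirm.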
Calling R Functions
I have the following code:
EXECUTE 'SELECT '
'regr_slope( amount, year_taken ),'
'regr_intercept( amount, year_taken ),'
'corr( amount, year_taken ),'
'sum( measurements ) AS total_measurements '
'FROM temp_regression'
INTO STRICT slope, intercept, correlation, total_measurements;
This code calls the PostgreSQL aggregate function corr to calculate Pearson's correlation coefficient over the data. Ideally, I'd like to calculate Kendall's tau instead, by swapping corr for plr_kendall:
EXECUTE 'SELECT '
'regr_slope( amount, year_taken ),'
'regr_intercept( amount, year_taken ),'
'plr_kendall( amount, year_taken ),'
'sum( measurements ) AS total_measurements '
'FROM temp_regression'
INTO STRICT slope, intercept, correlation, total_measurements;
Q.8. Do I have to write plr_kendall myself?
Q.9. Where can I find a simple example that walks through:
Loading an R module into PG.
Writing a PG wrapper for the desired R function.
Calling the PG wrapper from a SELECT.
For example, would the last two steps look like:
create or replace function plr_kendall( _float8, _float8 ) returns float as '
agg_kendall(arg1, arg2)
' language 'plr';
CREATE AGGREGATE agg_kendall (
sfunc = plr_array_accum,
basetype = float8, -- ???
stype = _float8, -- ???
finalfunc = plr_kendall
);
And then the SELECT as above?
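An alternative shape I have been considering avoids the custom aggregate altogether. This is only a sketch under several assumptions: it uses base R's cor with method = "kendall" rather than the Kendall package (so nothing extra has to be loaded), it relies on PostgreSQL's built-in array_agg (available from 8.4), and it assumes both array_agg calls see the rows in the same order so that the pairs stay aligned.

```sql
-- Sketch of a wrapper: takes two equal-length arrays and returns Kendall's tau.
-- Uses base R's cor(); the Kendall package's Kendall(x, y) could presumably be
-- swapped in once that package is installed and loaded.
CREATE OR REPLACE FUNCTION plr_kendall(float8[], float8[]) RETURNS float8 AS $$
  # arg1 and arg2 are PL/R's default names for unnamed arguments
  cor(arg1, arg2, method = "kendall")
$$ LANGUAGE plr;

-- Collect each column into an array, then hand both arrays to R in one call.
SELECT
  regr_slope( amount, year_taken ),
  regr_intercept( amount, year_taken ),
  plr_kendall( array_agg( amount ), array_agg( year_taken ) ),
  sum( measurements ) AS total_measurements
FROM temp_regression;
```

I chose arrays because plr_array_accum accumulates a single column, so I do not see how a two-column aggregate built on it would work. I would welcome corrections if there is a cleaner way.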
Thank you!