Is this postgres function cost efficient or still have to clean

Posted by kiranking on Stack Overflow See other posts from Stack Overflow or by kiranking
Published on 2010-12-30T12:59:16Z Indexed on 2010/12/30 19:53 UTC
Read the original article Hit count: 248

There are two tables in postgres db. english_all and english_glob

First table contains words like international,confidential,booting,cooler ...etc

I have written the function to get the words from english_all then perform for loop for each word to get word list which are not inserted in anglish_glob table. Word list is like

I
In
Int
Inte
Inter
..
b
bo
boo
boot
..
c
co
coo
cool
etc..

for some reason zwnj(zero-width non-joiner) is added during insertion to english_all table. But in function I am removing that character with regexp_replace.

Postgres function for_loop_test is taking two parameter min and max based on that I am selecting words from english_all table.

function code is like

DECLARE
inMinLength ALIAS FOR $1;
inMaxLength ALIAS FOR $2;
mviews RECORD;
outenglishListRow english_word_list;--custom data type eng_id,english_text
BEGIN
FOR mviews IN SELECT id,english_all_text FROM english_all where wlength between inMinLength and inMaxLength 
ORDER BY english_all_text limit 30 LOOP

FOR i IN 1..char_length(regexp_replace(mviews.english_all_text,'(?)$','')) LOOP
    FOR outenglishListRow IN
        SELECT distinct on (regexp_replace((substring(mviews.english_all_text from 1 for i)),'(?)$','')) mviews.id,
        regexp_replace((substring(mviews.english_all_text from 1 for i)),'(?)$','') where 
        regexp_replace((substring(mviews.english_all_text from 1 for i)),'(?)$','') not 
        in(select english_glob.english_text from english_glob where i=english_glob.wlength)
        order by regexp_replace((substring(mviews.english_all_text from 1 for i)),'(?)$','')
    LOOP
    RETURN NEXT outenglishListRow;
    END LOOP;


END LOOP;
END LOOP;
END;

Once I get the word list I will insert that into another table english_glob. My question is is there any thing I can add to or remove from function to make it more efficient.

edit Let assume english_all table have words like

  • footer,settle,question,overflow,database,kingdom

If inMinLength = 5 and inmaxLength=7 then in the outer loop

  • footer,settle,kingdom

will be selected. For above 3 words inner two loop will apply to get words like

  • f,fo,foo,foot,foote,footer,s,se,set,sett,settl .... etc.

In the final process words which are bold will be entered into english_glob with another parameter like 1 to denote it is a proper word and stored in the another filed of english_glob table. Remaining word will be stored with another parameter 0 because in the next call words which are saved in database should not be fetched again.

edit2: This is a complete code

CREATE TABLE english_all
(
id serial NOT NULL,
english_all_text  text NOT NULL,
wlength integer NOT NULL,
CONSTRAINT english_all PRIMARY KEY (id),
CONSTRAINT english_all_kan_text_uq_id UNIQUE (english_all_text)
)

CREATE TABLE english_glob
(
id serial NOT NULL,
english_text  text NOT NULL,
is_prop integer default 1,
CONSTRAINT english_glob PRIMARY KEY (id),
CONSTRAINT english_glob_kan_text_uq_id UNIQUE (english_text)
)

insert into english_all(english_text) values ('ant'),('forget'),('forgive');

on function call with parameter 3 and 6 fallowing rows should fetched

a
an
ant
f
fo
for
forg
forge
forget

next is insert to another table based on above row

insert into english_glob(english_text,is_prop) 
values 
('a',1),('an',1),
('ant',1),('f',0),
('fo',0),('for',1),
('forg',0),('forge',1),
('forget',1),

on function call next time with parameter 3 and 7 fallowing rows should fetched.(because f,fo,for,forg are all entered in english_glob table)

forgi
forgiv
forgive

© Stack Overflow or respective owner

Related posts about function

Related posts about postgresql