Search Results

Search found 4919 results on 197 pages for 'integer'.


  • How to optimize my PostgreSQL DB for prefix search?

    - by asmaier
    I have a table called "nodes" with roughly 1.7 million rows in my PostgreSQL database:

        =# \d nodes
                     Table "public.nodes"
         Column |          Type          | Modifiers
        --------+------------------------+-----------
         id     | integer                | not null
         title  | character varying(256) |
         score  | double precision       |
        Indexes:
            "nodes_pkey" PRIMARY KEY, btree (id)

    I want to use information from that table to autocomplete a search field, showing the user a list of the ten highest-scoring titles that match their input. So I used this query (here searching for all titles starting with "s"):

        =# explain analyze select title,score from nodes where title ilike 's%' order by score desc;
                                        QUERY PLAN
        -----------------------------------------------------------------------------------------------------------------------
         Sort  (cost=64177.92..64581.38 rows=161385 width=25) (actual time=4930.334..5047.321 rows=161264 loops=1)
           Sort Key: score
           Sort Method:  external merge  Disk: 5712kB
           ->  Seq Scan on nodes  (cost=0.00..46630.50 rows=161385 width=25) (actual time=0.611..4464.413 rows=161264 loops=1)
                 Filter: ((title)::text ~~* 's%'::text)
         Total runtime: 5260.791 ms
        (6 rows)

    This was much too slow for autocomplete.

    With some information from "Using PostgreSQL in Web 2.0 Applications" I was able to improve that with a special index:

        =# create index title_idx on nodes using btree(lower(title) text_pattern_ops);
        =# explain analyze select title,score from nodes where lower(title) like lower('s%') order by score desc limit 10;
                                        QUERY PLAN
        ------------------------------------------------------------------------------------------------------------------------------------------
         Limit  (cost=18122.41..18122.43 rows=10 width=25) (actual time=1324.703..1324.708 rows=10 loops=1)
           ->  Sort  (cost=18122.41..18144.60 rows=8876 width=25) (actual time=1324.700..1324.702 rows=10 loops=1)
                 Sort Key: score
                 Sort Method:  top-N heapsort  Memory: 17kB
                 ->  Bitmap Heap Scan on nodes  (cost=243.53..17930.60 rows=8876 width=25) (actual time=96.124..1227.203 rows=161264 loops=1)
                       Filter: (lower((title)::text) ~~ 's%'::text)
                       ->  Bitmap Index Scan on title_idx  (cost=0.00..241.31 rows=8876 width=0) (actual time=90.059..90.059 rows=161264 loops=1)
                             Index Cond: ((lower((title)::text) ~>=~ 's'::text) AND (lower((title)::text) ~<~ 't'::text))
         Total runtime: 1325.085 ms
        (9 rows)

    So this gave me a speedup of roughly a factor of 4. But can this be improved further? What if I want to use '%s%' instead of 's%'? Do I have any chance of getting decent performance with PostgreSQL in that case, too? Or should I try a different solution (Lucene? Sphinx?) for implementing my autocomplete feature?
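The Index Cond in the second plan shows why the prefix query can use a btree: 's%' is turned into the range 's' <= lower(title) < 't'. A minimal Python sketch of that idea follows, using a sorted in-memory list as a stand-in for the index. The function name top_titles and the sample rows are assumptions made for illustration, not part of the question.

```python
import bisect
import heapq


def top_titles(rows, prefix, limit=10):
    """Return up to `limit` (title, score) pairs whose title starts with `prefix`.

    `rows` is a list of (title, score) tuples, a hypothetical stand-in for the
    nodes table. The sorted list plays the role of the lower(title) index.
    """
    if not prefix:
        raise ValueError("prefix must be non-empty")
    prefix = prefix.lower()
    index = sorted((title.lower(), title, score) for title, score in rows)
    keys = [entry[0] for entry in index]
    # Upper bound of the range: bump the last character, so 's' becomes 't'.
    upper = prefix[:-1] + chr(ord(prefix[-1]) + 1)
    lo = bisect.bisect_left(keys, prefix)
    hi = bisect.bisect_left(keys, upper)
    # Top-N selection over the matching range, like the top-N heapsort in the plan.
    best = heapq.nlargest(limit, index[lo:hi], key=lambda entry: entry[2])
    return [(title, score) for _, title, score in best]


rows = [("Sun", 2.0), ("sand", 5.0), ("Tree", 9.0), ("rock", 1.0), ("sz", 3.0)]
result = top_titles(rows, "s", limit=2)
```

The range trick only works when the pattern is anchored at the start. A pattern such as '%s%' has no such lower and upper bound, which is why it cannot be served by this kind of index.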


  • Error at lapack cgesv when matrix is not singular

    - by Jan Malec
    This is my first post. I usually ask classmates for help, but they have a lot of work now and I'm too desperate to figure this out on my own :). I am working on a project for school and I have come to a point where I need to solve a system of linear equations with complex numbers. I have decided to call the LAPACK routine "cgesv" from C++. I use the C++ complex library to work with complex numbers. The problem is that when I call the routine, I get error code "2". From the LAPACK documentation:

        INFO is INTEGER
        = 0: successful exit
        < 0: if INFO = -i, the i-th argument had an illegal value
        > 0: if INFO = i, U(i,i) is exactly zero. The factorization has been completed, but the factor U is exactly singular, so the solution could not be computed.

    Therefore, the element U(2, 2) should be zero, but it is not. This is how I declare the function:

        void cgesv_( int* N, int* NRHS, std::complex* A, int* lda, int* ipiv, std::complex* B, int* ldb, int* INFO );

    This is how I use it:

        int *IPIV = new int[NA];
        int INFO, NRHS = 1;
        std::complex<double> *aMatrix = new std::complex<double>[NA*NA];
        for(int i=0; i<NA; i++){
            for(int j=0; j<NA; j++){
                aMatrix[j*NA+i] = A[i][j];
            }
        }
        cgesv_( &NA, &NRHS, aMatrix, &NA, IPIV, B, &NB, &INFO );

    And this is what the matrix looks like:

        (1,-160.85) (0,0.000306796) (0,-0) (0,-0) (0,-0)
        (0,0.000306796) (1,-40.213) (0,0.000306796) (0,-0) (0,-0)
        (0,-0) (0,0.000306796) (1,-0.000613592) (0,0.000306796) (0,-0)
        (0,-0) (0,-0) (0,0.000306796) (1,-40.213) (0,0.000306796)
        (0,-0) (0,-0) (0,-0) (0,0.000306796) (1,-160.85)

    I had to split the matrix columns, otherwise it did not format correctly. My first suspicion was that complex is not parsed correctly, but I have used LAPACK functions with complex numbers this way before. Any ideas?
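One thing that may be worth checking here, offered as an assumption rather than a confirmed diagnosis: in LAPACK naming the "c" prefix means single-precision complex and "z" means double-precision complex, so cgesv expects 8-byte elements while std::complex<double> occupies 16 bytes (zgesv is the double-precision counterpart). The short Python sketch below only illustrates what such a width mismatch does to the first matrix entry. It does not call LAPACK.

```python
import struct

# First matrix entry from the question, stored as a double-precision complex
# number: two 8-byte doubles (real part, imaginary part), little-endian.
value = complex(1.0, -160.85)
raw = struct.pack("<2d", value.real, value.imag)

# A single-precision routine would read the same 16 bytes as four 4-byte
# floats, i.e. as two complex numbers instead of one.
misread = struct.unpack("<4f", raw)
first_seen = complex(misread[0], misread[1])
```

Here the real part 1.0 alone is reinterpreted as the complex number (0, 1.875), so the routine would factor a matrix that has little to do with the intended one.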


  • How can I make Swig correctly wrap a char* buffer that is modified in C as a Java Something-or-other

    - by Ukko
    I am trying to wrap some legacy code for use in Java, and I was quite happy to see that Swig was able to handle the header file and generate a great wrapper that almost works. Now I am looking for the deep magic that will make it really work. In C I have a function that looks like this:

        DLL_IMPORT int DustyVoodoo(char *buff, int len, char *curse);

    The integer returned by this function is an error code in case it fails. The arguments are:

        buff is a character buffer
        len is the length of the data in the buffer
        curse is another character buffer that contains the result of calling DustyVoodoo

    So, you can see where this is going: the result actually comes back via the third argument. Also, len is confusing since it may be the length of both buffers. They are always allocated with the same size in the calling code, but given what DustyVoodoo does I don't think that they need to be the same. To be safe, both buffers should be the same size in practice, say 512 chars. The C code generated for the binding is as follows:

        SWIGEXPORT jint JNICALL Java_pemapiJNI_DustyVoodoo(JNIEnv *jenv, jclass jcls, jstring jarg1, jint jarg2, jstring jarg3) {
          jint jresult = 0 ;
          char *arg1 = (char *) 0 ;
          int arg2 ;
          char *arg3 = (char *) 0 ;
          int result;

          (void)jenv;
          (void)jcls;
          arg1 = 0;
          if (jarg1) {
            arg1 = (char *)(*jenv)->GetStringUTFChars(jenv, jarg1, 0);
            if (!arg1) return 0;
          }
          arg2 = (int)jarg2;
          arg3 = 0;
          if (jarg3) {
            arg3 = (char *)(*jenv)->GetStringUTFChars(jenv, jarg3, 0);
            if (!arg3) return 0;
          }
          result = (int)PemnEncrypt(arg1,arg2,arg3);
          jresult = (jint)result;
          if (arg1) (*jenv)->ReleaseStringUTFChars(jenv, jarg1, (const char *)arg1);
          if (arg3) (*jenv)->ReleaseStringUTFChars(jenv, jarg3, (const char *)arg3);
          return jresult;
        }

    It is correct for what it does; however, it misses the fact that curse is not just an input: it is altered by the function and should be returned as an output. It also does not know that the Java Strings are really buffers and should be backed by a suitably sized array. I think that Swig can do the right thing here, I just can't figure out from the documentation how to tell Swig what it needs to know. Any typemap masters in the house?


  • "Simple" sort a nested array using array_multisort or native PHP functions instead of my own foreach loop

    - by Ana Ban
    I have the following array of days of the week, with each day having hours of the day (the whole array represents the schedule of a part-time employee):

        Array
        (
            [7] => Array ( [0] => 15 [1] => 14 [2] => 13 [3] => 11 [4] => 12 [5] => 10 )
            [1] => Array ( [0] => 10 [1] => 13 [2] => 12 )
            [6] => Array ( [0] => 14 )
            [3] => Array ( [0] => 4 [1] => 5 [2] => 6 )
        )

    and I simply need to:

        - sort asc each sub-array (2nd dimension) - no need to maintain the numeric keys, values are integers
        - sort asc the 1st dimension and maintain the numeric, integer keys

    ie:

        Array
        (
            [1] => Array ( [0] => 10 [1] => 12 [2] => 13 )
            [3] => Array ( [0] => 4 [1] => 5 [2] => 6 )
            [6] => Array ( [0] => 14 )
            [7] => Array ( [0] => 10 [1] => 11 [2] => 12 [3] => 13 [4] => 14 [5] => 15 )
        )

    Additional info:

        - only the keys of the 1st dimension and the values of the 2nd dimension (and of course their association) are meaningful to my use-case
        - the 1st dimension can have at most 7 values, ranging from 1-7 (days of the week), and will have at least 1 value (1 day)
        - the 2nd dimension can have at most 24 values, ranging from 0-23 (hours of each day), and will have at least 1 value (1 hour per day)

    I know I can do this with a foreach on the whole ksorted array and sort each 2nd dimension array:

        ksort($sched);
        foreach ($sched as &$array) sort($array);
        unset($array);

    but I was hoping I could achieve this with native PHP array function(s) instead. My search led me to try array_multisort(array_values($array), array_keys($array), $array) but I just can't make it work.
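For comparison, the two requirements (sort each day's hours, then order the days by key while keeping the keys) can be sketched in a few lines of Python. This is only an illustration of the target transformation under the assumption that the schedule is a mapping from day number to a list of hours. It is not a PHP answer, and the name normalize_schedule is made up here.

```python
def normalize_schedule(sched):
    """Return a new dict with days in ascending order and each day's hours sorted."""
    # sorted(sched.items()) orders by day key; dicts preserve insertion order.
    return {day: sorted(hours) for day, hours in sorted(sched.items())}


sched = {7: [15, 14, 13, 11, 12, 10], 1: [10, 13, 12], 6: [14], 3: [4, 5, 6]}
result = normalize_schedule(sched)
```

The structure mirrors the ksort plus per-day sort loop from the question: one pass over the outer level, one sort per inner list.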


  • Would someone mind giving suggestions for this new assembly language?

    - by Noctis Skytower
    Greetings! Last semester in college, my teacher in the Computer Languages class taught us the esoteric language named Whitespace. In the interest of learning the language better with a very busy schedule (midterms), I wrote an interpreter and assembler in Python. An assembly language was designed to facilitate writing programs easily, and a sample program was written with the given assembly mnemonics. Now that it is summer, a new project has begun with the objective being to rewrite the interpreter and assembler for Whitespace 0.3, with further developments coming afterwards. Since there is much more time than before to work on its design, you are presented here with an outline that provides a revised set of mnemonics for the assembly language. This post is marked as a wiki for their discussion. Have you had any experience with assembly languages in the past? Were there some instructions that you thought should have been renamed to something different? Did you find yourself thinking outside the box, in a different paradigm from the one in which the mnemonics were named? If you can answer yes to any of those questions, you are most welcome here. Subjective answers are appreciated!

        hold N      Push the number onto the stack
        copy        Duplicate the top item on the stack
        copy N      Copy the nth item on the stack (given by the argument) onto the top of the stack
        swap        Swap the top two items on the stack
        drop        Discard the top item on the stack
        drop N      Slide n items off the stack, keeping the top item
        add         Addition
        sub         Subtraction
        mul         Multiplication
        div         Integer Division
        mod         Modulo
        save        Store
        load        Retrieve
        L:          Mark a location in the program
        call L      Call a subroutine
        goto L      Jump unconditionally to a label
        if=0 L      Jump to a label if the top of the stack is zero
        if<0 L      Jump to a label if the top of the stack is negative
        return      End a subroutine and transfer control back to the caller
        exit        End the program
        print chr   Output the character at the top of the stack
        print int   Output the number at the top of the stack
        input chr   Read a character and place it in the location given by the top of the stack
        input int   Read a number and place it in the location given by the top of the stack

    Question: How would you redesign, rewrite, or rename the previous mnemonics and for what reasons?
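To get a feel for how the mnemonics read in practice, here is a minimal Python sketch of a stack machine that understands a subset of them. It is not the author's interpreter. The tuple-based program format, the "label" pseudo-op standing in for "L:", the 0-based reading of "copy N", and the choice that conditional jumps and "print int" pop their operand are all assumptions made for this illustration.

```python
def run(program):
    """Execute a list of (mnemonic, argument) pairs and return the printed values."""
    labels = {arg: i for i, (op, arg) in enumerate(program) if op == "label"}
    binary = {
        "add": lambda a, b: a + b,
        "sub": lambda a, b: a - b,
        "mul": lambda a, b: a * b,
        "div": lambda a, b: a // b,  # Python floor division, an assumption
        "mod": lambda a, b: a % b,
    }
    stack, output, pc = [], [], 0
    while pc < len(program):
        op, arg = program[pc]
        pc += 1
        if op == "hold":
            stack.append(arg)
        elif op == "copy":
            # Without an argument duplicate the top; otherwise copy the nth item (0 = top).
            stack.append(stack[-1] if arg is None else stack[-1 - arg])
        elif op == "swap":
            stack[-1], stack[-2] = stack[-2], stack[-1]
        elif op == "drop":
            stack.pop()
        elif op in binary:
            right, left = stack.pop(), stack.pop()
            stack.append(binary[op](left, right))
        elif op == "goto":
            pc = labels[arg]
        elif op == "if=0":
            if stack.pop() == 0:
                pc = labels[arg]
        elif op == "if<0":
            if stack.pop() < 0:
                pc = labels[arg]
        elif op == "print int":
            output.append(stack.pop())
        elif op == "exit":
            break
        # "label" entries are markers only and fall through.
    return output


# Count down from 3 to 1.
program = [
    ("hold", 3),
    ("label", "loop"),
    ("copy", None),
    ("print int", None),
    ("hold", 1),
    ("sub", None),
    ("copy", None),
    ("if=0", "done"),
    ("goto", "loop"),
    ("label", "done"),
    ("exit", None),
]
printed = run(program)
```

Writing even a tiny program like this countdown makes naming questions concrete, for example whether "hold" reads as naturally as "push" would in the same position.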


  • Bullet indents in PowerPoint 2007 compatibility mode via .NET interop issue

    - by L. Shaydariv
    Hello. I've got a really difficult bug and I can't see the fix. It has been driving me insane for a long time. Let's consider the following scenario:

        1) There is a PowerPoint 2003 presentation. It contains a single slide with a single shape, and the shape contains a text frame including a bulleted list with an arbitrary structure.
        2) There is a requirement to get the bullet indents for every bulleted paragraph using PowerPoint 2007.

    I can satisfy the requirement by opening the presentation in compatibility mode and applying the following VBA script:

        With ActivePresentation
            Dim sl As Slide: Set sl = .Slides(1)
            Dim sh As Shape: Set sh = sl.Shapes(1)
            Dim i As Integer
            For i = 1 To sh.TextFrame.TextRange.Paragraphs.Count
                Dim para As TextRange: Set para = sh.TextFrame.TextRange.Paragraphs(i, 1)
                Debug.Print para.Text; para.indentLevel, sh.TextFrame.Ruler.Levels(para.indentLevel).FirstMargin
            Next i
        End With

    that produces the following output:

        A 1 0
        B 1 0
        C 2 24
        D 3 60
        E 5 132

    Obviously, everything is perfect: it shows the proper list item text, the list item level and its bullet indent. But I can't see how to reach the same result using C#. Let's add a COM reference to Microsoft.Office.Interop.PowerPoint 2.9.0.0 (taken from MSPPT.OLB, MS Office 12):

        // presentation = ...("presentation.ppt")... // a PowerPoint 2003 presentation
        Slide slide = presentation.Slides[1];
        Shape shape = slide.Shapes[1];
        for (int i = 1; i<=shape.TextFrame.TextRange.Paragraphs(-1, -1).Count; i++) {
            TextRange paragraph = shape.TextFrame.TextRange.Paragraphs(i, 1);
            Console.WriteLine("{0} {1} {2}", paragraph.Text, paragraph.IndentLevel, shape.TextFrame.Ruler.Levels[paragraph.IndentLevel].FirstMargin);
        }

    Oh, man... What is this? I've got problems here. First, the paragraph.Text value is trimmed until the '\r' character is found (however, paragraph.Text[0] really returns the first character O_o). But it's ok, I can shut my eyes to this. But, second, I can't understand why the first margins are always zero, no matter which level they belong to. They are always zero in compatibility mode... It's hard to believe... :) So is there any way to fix it, or at least to find a workaround? I'd appreciate any help with this. I can't even find any article related to the issue. :( Perhaps you have run into it before... Or is it just a bug with no fix that must be reported to Microsoft? Thank you.


  • How do I add a column that displays the number of distinct rows to this query?

    - by Fake Code Monkey Rashid
    Hello good people! I don't know how to ask my question clearly so I'll just show you the money. To start with, here's a sample table:

        CREATE TABLE sandbox (
            id integer NOT NULL,
            callsign text NOT NULL,
            this text NOT NULL,
            that text NOT NULL,
            "timestamp" timestamp with time zone DEFAULT now() NOT NULL
        );
        CREATE SEQUENCE sandbox_id_seq START WITH 1 INCREMENT BY 1 NO MINVALUE NO MAXVALUE CACHE 1;
        ALTER SEQUENCE sandbox_id_seq OWNED BY sandbox.id;
        SELECT pg_catalog.setval('sandbox_id_seq', 14, true);
        ALTER TABLE sandbox ALTER COLUMN id SET DEFAULT nextval('sandbox_id_seq'::regclass);
        INSERT INTO sandbox VALUES (1, 'alpha', 'foo', 'qux', '2010-12-29 16:51:09.897579+00');
        INSERT INTO sandbox VALUES (2, 'alpha', 'foo', 'qux', '2010-12-29 16:51:36.108867+00');
        INSERT INTO sandbox VALUES (3, 'bravo', 'bar', 'quxx', '2010-12-29 16:52:36.370507+00');
        INSERT INTO sandbox VALUES (4, 'bravo', 'foo', 'quxx', '2010-12-29 16:52:47.584663+00');
        INSERT INTO sandbox VALUES (5, 'charlie', 'foo', 'corge', '2010-12-29 16:53:00.742356+00');
        INSERT INTO sandbox VALUES (6, 'delta', 'foo', 'qux', '2010-12-29 16:53:10.884721+00');
        INSERT INTO sandbox VALUES (7, 'alpha', 'foo', 'corge', '2010-12-29 16:53:21.242904+00');
        INSERT INTO sandbox VALUES (8, 'alpha', 'bar', 'corge', '2010-12-29 16:54:33.318907+00');
        INSERT INTO sandbox VALUES (9, 'alpha', 'baz', 'quxx', '2010-12-29 16:54:38.727095+00');
        INSERT INTO sandbox VALUES (10, 'alpha', 'bar', 'qux', '2010-12-29 16:54:46.237294+00');
        INSERT INTO sandbox VALUES (11, 'alpha', 'baz', 'qux', '2010-12-29 16:54:53.891606+00');
        INSERT INTO sandbox VALUES (12, 'alpha', 'baz', 'corge', '2010-12-29 16:55:39.596076+00');
        INSERT INTO sandbox VALUES (13, 'alpha', 'baz', 'corge', '2010-12-29 16:55:44.834019+00');
        INSERT INTO sandbox VALUES (14, 'alpha', 'foo', 'qux', '2010-12-29 16:55:52.848792+00');
        ALTER TABLE ONLY sandbox ADD CONSTRAINT sandbox_pkey PRIMARY KEY (id);

    Here's the current SQL query I have:

        SELECT * FROM (
            SELECT DISTINCT ON (this, that) id, this, that, timestamp
            FROM sandbox
            WHERE callsign = 'alpha' AND CAST(timestamp AS date) = '2010-12-29'
        ) playground
        ORDER BY timestamp DESC

    This is the result it gives me:

        id   this   that    timestamp
        -----------------------------------------------------
        14   foo    qux     2010-12-29 16:55:52.848792+00
        13   baz    corge   2010-12-29 16:55:44.834019+00
        11   baz    qux     2010-12-29 16:54:53.891606+00
        10   bar    qux     2010-12-29 16:54:46.237294+00
        9    baz    quxx    2010-12-29 16:54:38.727095+00
        8    bar    corge   2010-12-29 16:54:33.318907+00
        7    foo    corge   2010-12-29 16:53:21.242904+00

    This is what I want to see:

        id   this   that    timestamp                       count
        -------------------------------------------------------------
        14   foo    qux     2010-12-29 16:55:52.848792+00   3
        13   baz    corge   2010-12-29 16:55:44.834019+00   2
        11   baz    qux     2010-12-29 16:54:53.891606+00   1
        10   bar    qux     2010-12-29 16:54:46.237294+00   1
        9    baz    quxx    2010-12-29 16:54:38.727095+00   1
        8    bar    corge   2010-12-29 16:54:33.318907+00   1
        7    foo    corge   2010-12-29 16:53:21.242904+00   1

    EDIT: I'm using PostgreSQL 9.0.* (if that helps any).
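The desired output amounts to "for each (this, that) pair keep the latest row and attach the group size". The Python sketch below reproduces that logic on the sample rows from the question, as a way to pin down the expected semantics. It is not a SQL answer, and the helper name latest_with_count is made up. In SQL the same idea is commonly expressed with a window function such as count(*) OVER (PARTITION BY this, that).

```python
ROWS = [
    (1, "alpha", "foo", "qux", "2010-12-29 16:51:09.897579+00"),
    (2, "alpha", "foo", "qux", "2010-12-29 16:51:36.108867+00"),
    (3, "bravo", "bar", "quxx", "2010-12-29 16:52:36.370507+00"),
    (4, "bravo", "foo", "quxx", "2010-12-29 16:52:47.584663+00"),
    (5, "charlie", "foo", "corge", "2010-12-29 16:53:00.742356+00"),
    (6, "delta", "foo", "qux", "2010-12-29 16:53:10.884721+00"),
    (7, "alpha", "foo", "corge", "2010-12-29 16:53:21.242904+00"),
    (8, "alpha", "bar", "corge", "2010-12-29 16:54:33.318907+00"),
    (9, "alpha", "baz", "quxx", "2010-12-29 16:54:38.727095+00"),
    (10, "alpha", "bar", "qux", "2010-12-29 16:54:46.237294+00"),
    (11, "alpha", "baz", "qux", "2010-12-29 16:54:53.891606+00"),
    (12, "alpha", "baz", "corge", "2010-12-29 16:55:39.596076+00"),
    (13, "alpha", "baz", "corge", "2010-12-29 16:55:44.834019+00"),
    (14, "alpha", "foo", "qux", "2010-12-29 16:55:52.848792+00"),
]


def latest_with_count(rows, callsign, day):
    """Return (id, this, that, timestamp, count) per (this, that), newest first."""
    groups = {}
    for row_id, sign, this, that, stamp in rows:
        if sign != callsign or not stamp.startswith(day):
            continue
        latest, count = groups.get((this, that), (None, 0))
        # The timestamps share one format and offset, so string comparison orders them.
        if latest is None or stamp > latest[3]:
            latest = (row_id, this, that, stamp)
        groups[(this, that)] = (latest, count + 1)
    result = [latest + (count,) for latest, count in groups.values()]
    return sorted(result, key=lambda r: r[3], reverse=True)


result = latest_with_count(ROWS, "alpha", "2010-12-29")
```

Note that the sketch picks the newest row per group explicitly, whereas DISTINCT ON without a matching ORDER BY leaves the choice of row unspecified.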


  • Structs, strtok, segmentation fault

    - by FILIaS
    I'm trying to make a programme with structs and files. The following is just a part of my code (it's not all of it). What I'm trying to do is ask the user to write a command, e.g. "delete John" or "enter John James 5000 ipad purchase". The problem is that I want to split the command in order to save its 'args' in a struct element. That's why I used strtok. BUT I'm facing another problem in how to 'put' these into the struct.

        #include <stdio.h>
        #include <stdlib.h>
        #include <string.h>
        #define MAX 100

        char command[1500];

        struct catalogue {
            char short_name[50];
            char surname[50];
            signed int amount;
            char description[1000];
        }*catalog[MAX];

        int main ( int argc, char *argv[] ) {
            int i,n;
            char choice[3];
            printf(">sort1: Print savings sorted by surname\n");
            printf(">sort2: Print savings sorted by amount\n");
            printf(">search+name:Print savings of each name searched\n");
            printf(">delete+full_name+amount: Erase saving\n");
            printf(">enter+full_name+amount+description: Enter saving \n");
            printf(">quit: Update + EXIT program.\n");
            printf("Choose your selection:\n>");
            gets(command); //it saves the whole command
            /*in choice it's saved only the first 2 letters(needed for menu choice again)*/
            strncpy(choice,command,2);
            choice[2]='\0';
            char** args = (char**)malloc(strlen(command)*sizeof(char*));
            memset(args, 0, sizeof(char*)*strlen(command));
            char* curToken = strtok(command, " \t");
            for (n = 0; curToken != NULL; ++n) {
                args[n] = strdup(curToken);
                curToken = strtok(NULL, " \t");
                *catalog[n]->short_name=*args[1];
                *catalog[n]->surname=args[2];
                catalog[n]->amount=atoi(args[3]);
                *catalog[n]->description=args[4];
            }
            return 0;
        }

    I get a warning (warning: assignment makes integer from pointer without a cast) for the lines:

        *catalog[n]->short_name=*args[1];
        *catalog[n]->surname=args[2];
        *catalog[n]->description=args[4];

    As a result, after running the program I get a Segmentation Fault... Any help? Any ideas?
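As a reference for what the parsing is meant to achieve, here is a minimal Python sketch of the intended field mapping for the "enter" command. The class name Saving and the function parse_enter are hypothetical, and the sketch assumes the description is everything after the amount. In the C version the equivalent steps would need storage allocated for each catalog entry, string copies (for example with strncpy) instead of pointer assignments, and all tokens collected before any of args[1] to args[4] is read.

```python
from dataclasses import dataclass


@dataclass
class Saving:
    short_name: str
    surname: str
    amount: int
    description: str


def parse_enter(command):
    """Parse 'enter <name> <surname> <amount> <description...>' into a Saving."""
    # Split on whitespace at most 4 times so the description keeps its spaces.
    parts = command.split(None, 4)
    if len(parts) != 5 or parts[0] != "enter":
        raise ValueError("expected: enter <name> <surname> <amount> <description>")
    _, short_name, surname, amount, description = parts
    return Saving(short_name, surname, int(amount), description)


saving = parse_enter("enter John James 5000 ipad purchase")
```

Validating the number of tokens before touching any of them is the step that the posted loop skips, since it reads args[1] through args[4] while only args[0] has been filled.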


  • Problem with WiX major upgrade!

    - by Joshua
    Okay, my last question on this journey of WiX upgrades managed to get my settings file preserved! However, there is another component that is being preserved that I don't want preserved! I need it to be overwritten, and it's not. The component "Settings" now works, with NeverOverwrite="yes" and KeyPath="yes". However, the component immediately below it does not work! It needs to overwrite both the MDF and the LDF with new ones from the install! I've tried lots of stuff, and am stumped. Please and thank you! Here are the components:

        <DirectoryRef Id="CommonAppDataPathways">
          <Component Id="CommonAppDataPathwaysFolderComponent" Guid="087C6F14-E87E-4B57-A7FA-C03FC8488E0D">
            <CreateFolder>
              <Permission User="Everyone" GenericAll="yes" />
            </CreateFolder>
            <RemoveFolder Id="CommonAppDataPathways" On="uninstall" />
            <!-- <RegistryValue Root="HKCU" Key="Software\TDR\Pathways" Name="installed" Type="integer" Value="1" KeyPath="yes" />-->
          </Component>
          <Component Id="Settings" Guid="A3513208-4F12-4496-B609-197812B4A953" NeverOverwrite="yes">
            <File Id="settingsXml" KeyPath="yes" ShortName="SETTINGS.XML" Name="Settings.xml" DiskId="1" Source="\\fileserver\Release\Pathways\Dependencies\Settings\settings.xml" Vital="yes" />
          </Component>
          <Component Id="Database" Guid="1D8756EF-FD6C-49BC-8400-299492E8C65D">
            <File KeyPath="yes" Id="pathwaysMdf" Name="Pathways.mdf" DiskId="1" Source="\\fileserver\Shared\Databases\Pathways\SystemDBs\Pathways.mdf" />
            <File Id="pathwaysLdf" Name="Pathways_log.ldf" DiskId="1" Source="\\fileserver\Shared\Databases\Pathways\SystemDBs\Pathways.ldf" />
            <RemoveFile Id="pathwaysMdf" Name="Pathways.mdf" On="uninstall" />
            <RemoveFile Id="pathwaysLdf" Name="Pathways_log.ldf" On="uninstall" />
          </Component>
        </DirectoryRef>

    And here are the features:

        <Feature Id="App" Title="Pathways Application" Level="1" Description="Pathways software" Display="expand" ConfigurableDirectory="INSTALLDIR" Absent="disallow" AllowAdvertise="no" InstallDefault="local">
          <ComponentRef Id="Application" />
          <ComponentRef Id="CommonAppDataPathwaysFolderComponent" />
          <ComponentRef Id="Settings"/>
          <ComponentRef Id="ProgramsMenuShortcutComponent" />
          <Feature Id="Shortcuts" Title="Desktop Shortcut" Level="1" Absent="allow" AllowAdvertise="no" InstallDefault="local">
            <ComponentRef Id="DesktopShortcutComponent" />
          </Feature>
        </Feature>
        <Feature Id="Data" Title="Database" Level="1" Absent="allow" AllowAdvertise="no" InstallDefault="local">
          <ComponentRef Id="Database" />
        </Feature>

    And here is the InstallExecuteSequence:

        <InstallExecuteSequence>
          <RemoveExistingProducts After="InstallFinalize"/>
        </InstallExecuteSequence>

    What am I doing wrong?


  • How do I de-duplicate a list of nodes in XSLT - and return the last node encountered?

    - by Broam
    I've seen lots of "de-duplicate this xml" questions but everyone wants the first node or the nodes are identical. I have a bit of a bigger puzzle. I have a list of articles in XML, a relevant snippet is shown:

        <item><key>Article1</key><stamp>100</stamp></item>
        <item><key>Article1</key><stamp>130</stamp></item>
        <item><key>Article2</key><stamp>800</stamp></item>
        <item><key>Article1</key><stamp>180</stamp></item>
        <item><key>Article3</key><stamp>900</stamp></item>
        <item><key>Article3</key><stamp>950</stamp></item>
        <item><key>Article4</key><stamp>990</stamp></item>
        <item><key>Article5</key><stamp>999</stamp></item>

    I'd like a list of nodes where the keys are unique and where the last instance is returned, not the first. Stamp (integer) is always increasing for elements of a particular key. Ideally I'd like "largest stamp" but they're always in order so the shortcut is ok. Desired result (order doesn't really matter):

        <item><key>Article2</key><stamp>800</stamp></item>
        <item><key>Article1</key><stamp>180</stamp></item>
        <item><key>Article3</key><stamp>950</stamp></item>
        <item><key>Article4</key><stamp>990</stamp></item>
        <item><key>Article5</key><stamp>999</stamp></item>

    I'm somewhat confused on how to get this list. Any ideas? I'm using the Saxon processor if it matters.
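The selection rule ("keep an item only if no later item has the same key") can be sketched in Python with the standard library, which may help when checking an XSLT solution against the expected result. The wrapping root element named items is an assumption, since the question only shows a snippet. In XPath terms the same rule is often written as item[not(key = following-sibling::item/key)].

```python
import xml.etree.ElementTree as ET

# The <items> wrapper is assumed; the question shows only the <item> elements.
SNIPPET = """<items>
<item><key>Article1</key><stamp>100</stamp></item>
<item><key>Article1</key><stamp>130</stamp></item>
<item><key>Article2</key><stamp>800</stamp></item>
<item><key>Article1</key><stamp>180</stamp></item>
<item><key>Article3</key><stamp>900</stamp></item>
<item><key>Article3</key><stamp>950</stamp></item>
<item><key>Article4</key><stamp>990</stamp></item>
<item><key>Article5</key><stamp>999</stamp></item>
</items>"""


def last_per_key(root):
    """Return the last <item> seen for each key, ordered by that last occurrence."""
    latest = {}
    for item in root.findall("item"):
        key = item.findtext("key")
        # Remove any earlier entry first so the key moves to the end of the dict.
        latest.pop(key, None)
        latest[key] = item
    return list(latest.values())


kept = last_per_key(ET.fromstring(SNIPPET))
pairs = [(item.findtext("key"), item.findtext("stamp")) for item in kept]
```

Because later items overwrite earlier ones, this relies on the same shortcut the question mentions: stamps for a given key only increase in document order.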


  • PHP MVC Principles

    - by George
    I'm not using an off-the-shelf framework and don't particularly want to (nor do I want to go into the reasons why...). Anyway, on to my question(s); I hope they make sense. I'm trying to get my head around what should go in the model and what should go in the controller. Originally I had the impression that a model class should represent an actual object (eg - a car from the cars table of a database) and model properties should mirror the database fields. However, I'm now getting the feeling that I've got the wrong idea: should an instance of a model class represent an actual item, or should it contain a number of methods for doing stuff, sometimes to one car and sometimes to multiple cars, based on my earlier example? For example, I want to get all the cars from the database and show them in the view. Am I right in thinking it should be along the lines of this?

    Controller File

        function list() {
            $cars = $this->model->get_all();
            $this->view->add($cars);
            $this->view->render('cars-list');
        }

    Model File

        function get_all() {
            // Use a database interaction class that I've written
            $cars = Database::select();
            return $cars;
        }

    Now, if the car had a "status" field that was stored as an integer in the database and I wanted to change that to a string, where should that be done? By looping over the SQL results array in the get_all() method in the model? Also, where should form validation live? I have written a validation class that works a little like this:

        $validator = new Validator();
        $validator->check('field_name', 'required');

    If the check fails, it adds an error message to the array in the Validator. This array of error messages would then get passed to the view. Should the use of my validator class go in the model or the controller? Thanks in advance for any help anyone can offer. If you know of any links to a simple MVC example / open source application that deals with basic CRUD, they would be much appreciated.


  • R glm standard error estimate differences to SAS PROC GENMOD

    - by Michelle
    I am converting a SAS PROC GENMOD example into R, using glm in R. The SAS code was:

        proc genmod data=data0 namelen=30;
        model boxcoxy=boxcoxxy ~ AGEGRP4 + AGEGRP5 + AGEGRP6 + AGEGRP7 + AGEGRP8 + RACE1 + RACE3 + WEEKEND + SEQ/dist=normal;
        FREQ REPLICATE_VAR;
        run;

    My R code is:

        parmsg2 <- glm(boxcoxxy ~ AGEGRP4 + AGEGRP5 + AGEGRP6 + AGEGRP7 + AGEGRP8 + RACE1 + RACE3 + WEEKEND + SEQ , data=data0, family=gaussian, weights = REPLICATE_VAR)

    When I use summary(parmsg2) I get the same coefficient estimates as in SAS, but my standard errors are wildly different. The summary output from SAS is:

        Name       df  Estimate    StdErr     LowerWaldCL  UpperWaldCL  ChiSq      ProbChiSq
        Intercept  1   6.5007436   .00078884  6.4991975    6.5022897    67911982   0
        agegrp4    1   .64607262   .00105425  .64400633    .64813891    375556.79  0
        agegrp5    1   .4191395    .00089722  .41738099    .42089802    218233.76  0
        agegrp6    1   -.22518765  .00083118  -.22681672   -.22355857   73401.113  0
        agegrp7    1   -1.7445189  .00087569  -1.7462352   -1.7428026   3968762.2  0
        agegrp8    1   -2.2908855  .00109766  -2.2930369   -2.2887342   4355849.4  0
        race1      1   -.13454883  .00080672  -.13612997   -.13296769   27817.29   0
        race3      1   -.20607036  .00070966  -.20746127   -.20467944   84319.131  0
        weekend    1   .0327884    .00044731  .0319117     .03366511    5373.1931  0
        seq2       1   -.47509583  .00047337  -.47602363   -.47416804   1007291.3  0
        Scale      1   2.9328613   .00015586  2.9325559    2.9331668    -127

    The summary output from R is:

        Coefficients:
                     Estimate  Std. Error  t value  Pr(>|t|)
        (Intercept)   6.50074     0.10354   62.785   < 2e-16
        AGEGRP4       0.64607     0.13838    4.669  3.07e-06
        AGEGRP5       0.41914     0.11776    3.559  0.000374
        AGEGRP6      -0.22519     0.10910   -2.064  0.039031
        AGEGRP7      -1.74452     0.11494  -15.178   < 2e-16
        AGEGRP8      -2.29089     0.14407  -15.901   < 2e-16
        RACE1        -0.13455     0.10589   -1.271  0.203865
        RACE3        -0.20607     0.09315   -2.212  0.026967
        WEEKEND       0.03279     0.05871    0.558  0.576535
        SEQ          -0.47510     0.06213   -7.646  2.25e-14

    The difference in the standard errors matters because the SAS coefficients are all statistically significant, but the RACE1 and WEEKEND coefficients in the R output are not. I have found a formula to calculate the Wald confidence intervals in R, but this is pointless given the difference in the standard errors, as I will not get the same results. Apparently SAS uses a ridge-stabilized Newton-Raphson algorithm for its estimates, which are ML. The information I read about the glm function in R is that the results should be equivalent to ML. What can I do to change my estimation procedure in R so that I get the same coefficients and standard error estimates that were produced in SAS?

    To update, thanks to Spacedman's answer: I used weights because the data are from individuals in a dietary survey, and REPLICATE_VAR is a balanced repeated replication weight, which is an integer (and quite large, in the order of 1000s or 10000s). The website that describes the weight is here. I don't know why the FREQ rather than the WEIGHT command was used in SAS. I will now test by expanding the number of observations using REPLICATE_VAR and rerunning the analysis.
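The gap between the two sets of standard errors is consistent with the two programs reading the same numbers differently: a frequency says "this row stands for f identical observations", while a weight in a Gaussian glm only rescales each row's contribution and leaves the number of observations at the row count. The Python sketch below shows the effect for the simplest possible model, a weighted mean. The data and the helper weighted_mean_se are invented for illustration, and the sketch uses an n - 1 divisor throughout, which is an assumption and not necessarily the exact estimator either program uses.

```python
import math


def weighted_mean_se(y, w, n_obs):
    """Weighted mean and its standard error, treating the sample size as n_obs."""
    total = sum(w)
    mean = sum(wi * yi for wi, yi in zip(w, y)) / total
    rss = sum(wi * (yi - mean) ** 2 for wi, yi in zip(w, y))
    sigma2 = rss / (n_obs - 1)  # residual variance on n_obs - 1 degrees of freedom
    return mean, math.sqrt(sigma2 / total)


y = [6.1, 7.3, 5.8, 6.9]
freq = [1000, 3000, 2000, 4000]

# Weights read as precision weights: still only 4 observations.
mean_rows, se_rows = weighted_mean_se(y, freq, n_obs=len(y))
# Weights read as frequencies: 10000 observations.
mean_freq, se_freq = weighted_mean_se(y, freq, n_obs=sum(freq))

# Cross-check the frequency reading by literally expanding the rows.
expanded = [yi for yi, f in zip(y, freq) for _ in range(f)]
mean_exp, se_exp = weighted_mean_se(expanded, [1] * len(expanded), n_obs=len(expanded))

ratio = se_rows / se_freq
```

The point estimate is identical in all three cases, while the standard error shrinks by roughly the square root of the average weight under the frequency reading. In the output above the ratio of the intercept standard errors is about 131, which is in line with weights in the thousands or tens of thousands.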


  • How to Improve my php image resizer to support alpha png and transparent GIFs

    - by David
    Hi, I use this function to resize images, but I end up with an ugly image with a black background if the source is a transparent GIF or a PNG with alpha. It works perfectly for JPG and normal PNG, however.

        function cropImage($nw, $nh, $source, $stype, $dest) {
            $size = getimagesize($source);
            $w = $size[0];
            $h = $size[1];
            switch($stype) {
                case 'gif': $simg = imagecreatefromgif($source); break;
                case 'jpg': $simg = imagecreatefromjpeg($source); break;
                case 'png': $simg = imagecreatefrompng($source); break;
            }
            $dimg = imagecreatetruecolor($nw, $nh);
            switch ($stype) {
                case "png":
                    imagealphablending( $dimg, false );
                    imagesavealpha( $dimg, true );
                    $transparent = imagecolorallocatealpha($dimg, 255, 255, 255, 127);
                    imagefilledrectangle($dimg, 0, 0, $nw, $nh, $transparent);
                    break;
                case "gif":
                    // integer representation of the color black (rgb: 0,0,0)
                    $background = imagecolorallocate($simg, 0, 0, 0);
                    // removing the black from the placeholder
                    imagecolortransparent($simg, $background);
                    break;
            }
            $wm = $w/$nw;
            $hm = $h/$nh;
            $h_height = $nh/2;
            $w_height = $nw/2;
            if($w> $h) {
                $adjusted_width = $w / $hm;
                $half_width = $adjusted_width / 2;
                $int_width = $half_width - $w_height;
                imagecopyresampled($dimg,$simg,-$int_width,0,0,0,$adjusted_width,$nh,$w,$h);
            } elseif(($w <$h) || ($w == $h)) {
                $adjusted_height = $h / $wm;
                $half_height = $adjusted_height / 2;
                $int_height = $half_height - $h_height;
                imagecopyresampled($dimg,$simg,0,-$int_height,0,0,$nw,$adjusted_height,$w,$h);
            } else {
                imagecopyresampled($dimg,$simg,0,0,0,0,$nw,$nh,$w,$h);
            }
            imagejpeg($dimg,$dest,100);
        }

    Example:

        cropImage("300","200","original.png","png","new.png");

    I use PHP 5.3.2 and the bundled GD library (2.0.34 compatible). How can I make it support transparency? I've added imagealphablending() and imagesavealpha() but it didn't work. Or, failing that, are there any good classes that do something similar? Thanks


  • Creating a spam list with a web crawler in python

    - by user313623
    Hey guys, I'm not trying to do anything malicious here, I just need to do some homework. I'm a fairly new programmer, I'm using Python 3.0, and I'm having difficulty using recursion for problem-solving. I've been stuck on this question for quite a while. Here's the assignment:

    Write a recursive method spam(url, n) that takes the url of a web page and a non-negative integer n as input, collects all the email addresses contained in the web page and adds them to a global dictionary variable spam_dict, and then recursively calls itself on every http hyperlink contained in the web page. You will use a dictionary so only one copy of every email address is saved; your dictionary will store (key,value) pairs (email, email). The recursive call should use the parameter n-1 instead of n. If n = 0, you should collect the email addresses but no recursive calls should be made. The parameter n is used to limit the recursion to at most depth n. You will need to use the solutions of the two above problems; your method spam() will call the methods links2() and emails() and possibly other functions as well.

    Notes:
    1. Running spam() directly will produce no output on the screen; to find your spam_dict, you will need to read the value of spam_dict, and you will also need to reset it to the empty dictionary before every run of spam.
    2. Recall how global variables are used.

    Usage:

        spam_dict = {}
        spam('http://reed.cs.depaul.edu/lperkovic/csc242/test1.html',0)
        spam_dict.keys()
        dict_keys([])
        spam_dict = {}
        spam('http://reed.cs.depaul.edu/lperkovic/csc242/test1.html',1)
        spam_dict.keys()
        dict_keys(['[email protected]', '[email protected]'])

    So far, I've written a function that traverses web pages and puts all the links in a nice little list, and what I wanted to do was call that function. And why would I use recursion on a dictionary? And how? I don't understand how n ties into all of this.

        def links2(url):
            content = str(urlopen(url).read())
            myparser = MyHTMLParser()
            myparser.feed(content)
            lst = myparser.get()
            mergelst = []
            for link in lst:
                mergelst.append(urljoin(lst[0],link))
            print(mergelst)

    Any input (except why spam is bad) would be greatly appreciated. Also, I realize that the above function could probably look better; if you have a way to do it, I'm all ears. However, all I really need is for the program to produce the proper output.
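
The role of n in the assignment can be sketched roughly as follows. This is only an illustration under stated assumptions: `fetch_emails` and `fetch_links` are hypothetical stand-ins for the assignment's emails() and links2(), they read from a made-up in-memory `PAGES` dictionary instead of the network, and links2() is assumed to return its list rather than print it.

```python
# Hypothetical in-memory "web": url -> (emails on the page, links on the page).
PAGES = {
    "a": (["x@example.com"], ["b", "c"]),
    "b": (["y@example.com"], ["a"]),
    "c": (["x@example.com", "z@example.com"], []),
}

spam_dict = {}


def fetch_emails(url):
    """Stand-in for emails(url): addresses found on the page."""
    return PAGES[url][0]


def fetch_links(url):
    """Stand-in for links2(url): links found on the page."""
    return PAGES[url][1]


def spam(url, n):
    """Collect emails from url, then recurse into its links down to depth n."""
    for email in fetch_emails(url):
        spam_dict[email] = email   # dictionary keys are unique, so no duplicates
    if n == 0:                     # base case: collect, but follow no links
        return
    for link in fetch_links(url):
        spam(link, n - 1)          # each hop uses up one level of depth
```

The recursion is over pages, not over the dictionary: the dictionary is just the shared place where results accumulate. Because every call passes n - 1, even the a -> b -> a cycle in the sample data stops once the depth budget is spent.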

    Read the article

  • Rails / ActiveRecord Modeling Help

    - by JM
    I’m trying to model a relationship in ActiveRecord and I think it’s a little beyond my skill level. Here’s the background. This is a horse racing project and I’m trying to model a horse’s Connections over time. Connections are defined as the horse’s current Owner, Trainer and Jockey. Over time, a horse’s connections can change for a lot of different reasons:

    * The owner sells the horse in a private sale
    * The horse is claimed (purchased in a public sale)
    * The trainer switches jockeys
    * The owner switches trainers

    In my first attempt at modeling this, I created the following tables: Horses, Owners, Trainers, Jockeys and Connections. Essentially, the Connections table was the has-many-through join table and was structured as follows:

    Connections Table 1

        Id | Horse_id | Owner_id | Trainer_id | Jockey_id | Status_Code | Status_Date | Change_Code

    The Horse, Owner, Trainer and Jockey foreign keys are self-explanatory. The status code is 1 or 0 (1 active, 0 inactive) and the status date is the date the status changed. Change_code is an integer or string value that represents the reason for the change (private sale, claim, jockey change, etc.).

    The key benefit of this approach is that the Connection is represented as one record in the connections table. The downside is that I have to have a separate table each for Owner (1), Trainer (2) and Jockey (3) when one table could do.

    In my second attempt at modeling this I created the following tables: Horses, Connections, Entities. The Entities table has the following structure:

    Entities Table

        id | First_name | Last_name | Role

    where Role represents whether the entity is an Owner, Trainer or Jockey. Under this approach, my Connections table has the following structure:

    Connections Table 2

        id | Horse_id | Entity_id | Role | Status_Code | Status_Date | Change_Code
        1  | 1        | 1         | 1    | 1           | 1/1/2010    |
        2  | 1        | 4         | 2    | 1           | 1/1/2010    |
        3  | 1        | 10        | 3    | 1           | 1/1/2010    |

    This approach has the benefit of eliminating two tables, but on the other hand the Connection is now composed of three different records as opposed to one in the first approach.

    What I believe I’m looking for is an approach that allows me to capture the Connection in one record, but also uses an Entities table with roles instead of the Owner, Trainer and Jockey tables. I’m new to ActiveRecord and Rails, so any and all input would be greatly appreciated. Perhaps there are other ways that would be even better. Thanks!

    Read the article

  • Opinion on "loop invariants", and are these frequently used in the industry?

    - by Michael Aaron Safyan
    I was thinking back to my freshman year at college (five years ago) when I took an exam to place out of intro-level computer science. There was a question about loop invariants, and I was wondering if loop invariants are really necessary in this case or if the question was simply a bad example... the question was to write an iterative definition for a factorial function, and then to prove that the function was correct. The code that I provided for the factorial function was as follows:

        public static int factorial(int x) {
            if ( x < 0 ){
                throw new IllegalArgumentException("Parameter must be >= 0");
            }else if ( x == 0 ){
                return 1;
            }else{
                int result = 1;
                for ( int i = 1; i <= x; i++ ){
                    result*=i;
                }
                return result;
            }
        }

    My own proof of correctness was a proof by cases, and in each I asserted that it was correct by definition (x! is undefined for negative values, 0! is 1, and x! is 1*2*3...*x for a positive value of x). The professor wanted me to prove the loop using a loop invariant; however, my argument was that it was correct "by definition", because the definition of "x!" for a positive integer x is "the product of the integers from 1... x", and the for-loop in the else clause is simply a literal translation of this definition.

    Is a loop invariant really needed as a proof of correctness in this case? How complicated must a loop be before a loop invariant (and proper initialization and termination conditions) become necessary for a proof of correctness?

    Additionally, I was wondering... how often are such formal proofs used in the industry? I have found that about half of my courses are very theoretical and proof-heavy and about half are very implementation and coding-heavy, without any formal or theoretical material. How much do these overlap in practice? If you do use proofs in the industry, when do you apply them (always, only if it's complicated, rarely, never)?
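
For reference, the kind of invariant such a proof usually rests on can be written down and checked mechanically. The sketch below is in Python rather than the Java of the question, and it uses `math.factorial` purely as an oracle for the run-time checks; both choices are assumptions made for illustration, not part of the original exam answer.

```python
import math


def factorial(x):
    """Iterative factorial that checks its loop invariant at run time."""
    if x < 0:
        raise ValueError("Parameter must be >= 0")
    result = 1
    i = 1
    # Invariant: at the top of every iteration, result == (i - 1)!
    while i <= x:
        assert result == math.factorial(i - 1)   # invariant holds on entry
        result *= i                              # now result == i!
        i += 1                                   # invariant restored for the new i
    # The loop exits with i == x + 1, so the invariant gives result == x!
    assert result == math.factorial(x)
    return result
```

Initialization (result = 1 = 0! when i = 1), maintenance (multiplying by i turns (i - 1)! into i!), and termination (i = x + 1) are the three steps a written proof would spell out; the "by definition" argument in the question is essentially the same reasoning left implicit.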

    Read the article

  • Linking button to jQuery through service

    - by Ruddy
    I have a small problem that should be very easy to overcome, but for some reason I can't work it out. The problem is that I cannot get a button to link to some jQuery. My set-up is as follows (showing the relevant code):

    Default.aspx

    jQuery:

        function getContent() {
            var data = { numberID: 1 };
            $.jsonAspNet("ContentService.asmx", "GetContent", data, function (result) {
                $('#content').html(result);
            });
        }

        jQuery(document).ready(function () {
            getContent();
        });

    HTML:

        <div id="content"></div>

    ContentService.vb

        Public Function GetContent(number As Integer) As String
            Dim sb = New StringBuilder
            sb.AppendLine("<table>")
            sb.AppendLine("<tr>")
            sb.AppendLine("<td class='ui-widget-header ui-corner-all'>Number</td>")
            sb.AppendLine("</tr>")
            sb.AppendLine("<tr>")
            sb.AppendLine("<td>" & number & "</td>")
            sb.AppendLine("<td><a href='#' id='test' class='fg-button ui-state-default ui-corner-all'><img src='" & Context.Request.ApplicationPath & "/images/spacer.gif' class='ui-icon ui-icon-pencil' /></a></td>")
            sb.AppendLine("</tr>")
            sb.AppendLine("</table>")
            Return sb.ToString
        End Function

    So that's the basics of what I have. Everything works, but I'm not sure how to get the button (id='test') linked to some jQuery. I want it to bring up a popup when pressed. I have tried to put the jQuery on default.aspx, but this doesn't seem to work unless the button is placed in the HTML on that page.

        $('#test').unbind('click').click(function () {
            alert('Working');
        });

    I'm sure this is easy to do, but I have been trying for a while and cannot seem to get it to work.

    Read the article

  • casting, converting, and input from textbox controls

    - by Matt
    I'm working on some .aspx.cs code and seem to have forgotten how to turn a textbox value into a usable integer or decimal. Be warned, I'm pretty new to ASP.NET. Wish I could say the same for C#. The value going into my textbox (strawberryp_textbox) is "1", which I presume I can access with the .Text property and then parse into an int. The error reads "FormatException was unhandled by user code". My other question is: can I do operations on a session variable?

        protected void submit_order_button_Click(object sender, EventArgs e)
        {
            int strawberryp;
            int strawberrys;
            decimal money1 = decimal.Parse(moneybox1.Text);
            decimal money2 = decimal.Parse(moneybox2.Text);
            decimal money3 = decimal.Parse(moneybox3.Text);
            decimal money4 = decimal.Parse(moneybox4.Text);
            decimal money5 = decimal.Parse(moneybox5.Text);
            strawberryp = int.Parse(strawberryp_Textbox.Text); //THE PROBLEM RIGHT HERE!
            strawberrys = int.Parse(strawberrys_Textbox.Text); // Needs fixed
            int strawberryc = int.Parse(strawberryc_Textbox.Text); //fix
            int berryp = int.Parse(berryp_Textbox.Text); //fix
            int raspberryp = int.Parse(raspberryp_Textbox.Text); //fix
            decimal subtotal = (money1 * strawberryp) + (money2 * strawberrys) + (money3 * strawberryc) + (money4 * berryp) + (money5 * raspberryp);
            //check to see if you can multiply decimal and int to get a decimal!!
            Session["passmysubtotal"] = subtotal;
            //TextBox2.Text; (strawberryp_Textbox.Text);//TextBox4.Text;
            add_my_order_button.Enabled = true;
            add_my_order_button.Visible = true;
            submit_order_button.Enabled = false;
            submit_order_button.Visible = false;
            strawberryp_Textbox.ReadOnly = false;
            strawberrys_Textbox.ReadOnly = false;
            strawberryc_Textbox.ReadOnly = false;
            berryp_Textbox.ReadOnly = false;
            raspberryp_Textbox.ReadOnly = false;
            Response.Redirect("reciept.aspx");
        }

    Thanks for the help

    Read the article

  • Calculating next date in Turbo Pascal

    - by Chaima Chaimouta
        program date;
        uses wincrt;
        var
          m,ch,ch1,ch2,ch3: string ;
          mois,j,a,b: integer ;
        begin
          write('a');read(a);
          write('j');read(j);
          write('mois');read(mois);
          case mois of
            1,3,5,7,8,10:
              if j<31 then begin
                b:=j+1;
                m:=str(b,ch)+'/'+str(mois,ch2)+'/'+str(a,ch3);
              else if j=31then
                b:=1;
                s:=mois+1;
                m:=concat(str(b,ch),'/',str(s,ch2),'/',str(a,ch3));
              end
              else m:='erreur';
            4,6,9,11:
              if j<30 then begin
                b:=j+1;
                m:=concat(str(b,ch),'/',str(mois,ch2),'/',str(a,ch3));
              end
              else j=30 then begin
                b:=1;
                s:=mois+1;
                m:=concat(str(b,ch),'/',str(mois,ch2),'/',str(a,ch3));
              end
              else m:='erreur';
            2:
              if j<28 then begin
                b:=j+1;
                m:=concat(str(b,ch),'/',str(mois,ch2),'/',str(a,ch3));
              end
              else if j=28 then begin
                b:=1;
                m:=concat(str(b,ch),'/',str(mois,ch2,'/',str(a,ch3));
              end
              else if((a mod 4=0)AND (a mod 100<>0)) or ((a mod 100=0)and(a mod 400=0)) then
                if j<29 then begin
                  b:=j+1;
                  m:=concat(str(b,ch),'/',str(mois,ch2,'/',str(a,ch3));
                end
                else if j=29 then begin
                  b:=1;
                  m:=concat(str(b,ch),'/',str(mois,ch2,'/',str(a,ch3));
                end
                else m:='erreur';
            12:
              if j<31 then begin
                b:=j+1;
                m:=concat(str(b,ch),'/',str(mois,ch2,'/',str(a,ch3));
              end
              else if j=31 then begin
                b:=1;
                s:=a+1;
                m:=concat(str(b,ch),'/',str(mois,ch2,'/',str(s,ch3));
              end;
          writeln(m);
        end.

    This is my program; I hope you will be able to help me.
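
The intended logic (advance the day, roll over the month, roll over the year, and treat February according to the leap-year rule) may be easier to see in a compact form. The following is a sketch in Python rather than Turbo Pascal; the function names are made up for illustration, and the day/month/year output format and the 'erreur' result for invalid input are assumptions based on the program above.

```python
def is_leap(year):
    """Gregorian rule: divisible by 4, except centuries not divisible by 400."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


def days_in_month(month, year):
    """Number of days in the given month (1-12) of the given year."""
    if month == 2:
        return 29 if is_leap(year) else 28
    if month in (4, 6, 9, 11):
        return 30
    return 31


def next_date(day, month, year):
    """Return the following date as 'd/m/y', or 'erreur' for an invalid date."""
    if not 1 <= month <= 12 or not 1 <= day <= days_in_month(month, year):
        return "erreur"
    if day < days_in_month(month, year):
        return f"{day + 1}/{month}/{year}"   # ordinary case: next day, same month
    if month < 12:
        return f"1/{month + 1}/{year}"       # last day of a month
    return f"1/1/{year + 1}"                 # 31 December
```

Computing the month length once removes the need for a separate branch per group of months, which is where the original program repeats itself (and where months 7, 8 and 10 end up treated differently from 12).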

    Read the article

  • Custom activity designers in Workflow Foundation 3.5: How do they work?

    - by stakx
    Intent of this post: I realise that Workflow Foundation is not extremely popular on StackOverflow and that there will probably be not many answers, or none at all. This post is intended as a resource to people trying to customise workflow activities' appearance through custom designer classes.

    Goals: I am attempting to create a custom designer class for Workflow activities to achieve the following:

    * Make activities look less technical. For example, I don't necessarily want to see the internal object name as the activity's "title"; instead, I'd like to see something more descriptive.
    * Display the values of certain properties beneath the title text. I would like to see some properties' values directly underneath the title so that I don't need to look somewhere else (namely, at the Properties window).
    * Provide custom drop areas and draw custom internal arrows. As an example, I would like to be able to have custom drop areas in very specific places.

    What I found out so far: I created a custom designer class deriving from SequentialActivityDesigner as follows:

        [Designer(typeof(SomeDesigner))]
        public partial class SomeActivity: CompositeActivity { ... }

        class PlainDesigner : SequentialActivityDesigner { ... }

    Through overriding some properties and the OnPaint method, I found out about the following correspondences between the properties and how the activity will be displayed:

    Figure 1. Relationship between some properties of a SequentialActivityDesigner and the displayed activity.

    Possible solutions for goal #1 (make activities look less technical) and goal #2 (display values of properties beneath title text):

    * The displayed title can be changed through the Title property.
    * If more room is required to display additional information beneath the title, the TitleHeight property can be increased (i.e., override the property and make it return base.TitleHeight + n, where n is some positive integer).
    * Override the OnPaint method and draw additional text in the area reserved through TitleHeight.

    Open questions:

    * What are the connectors, connections, and connection points used for? They seem to be necessary, but for what purpose?
    * While the drop targets can be got through the GetDropTargets method, it seems that this is not necessarily where the designer will actually place dropped activities. When an activity is dragged across a workflow, the designer displays little green plus signs where activities can be dropped; how does it figure out the locations of these plus signs?
    * How does the designer figure out where to draw connector lines and arrows?

    Read the article

  • How to save file and read

    - by Jessy
    Hello everyone, I created a program that places random images in a grid layout. The size of the grid layout is 6 x 6 = 36. Only 10 cells were filled with images (each image was different) and the rest were empty. freeimagehosting.net/uploads/bfb7e85f63.jpg How can I save it to a file and read it again, so it will display the same images with the same placement on the grid? Here is the code that I used to save the images:

        //image file
        String []arrPic = {"pic1.jpg","pic2.jpg","pic3.jpg","pic4.jpg","pic5.jpg","pic6.jpg","pic7.jpg","pic8.jpg","pic9.jpg","pic10.jpg","pic11.jpg","pic12.jpg","pic13.jpg"};
        ArrayList<String> pictures = new ArrayList<String>(Arrays.asList(arrPic));
        ArrayList<String> file = new ArrayList<String>();
        JPanel pDraw = new JPanel(new GridLayout(6,6,2,2));
        ...
        //fill all grids with empty label
        for (int i =0; i<(6*6); i++){
            JLabel lbl = new JLabel("");
            pDraw.add(lbl);
        }
        ...
        //Choose random box to be filled with images
        for(int i=0; i<10; i++){
            Boolean number = true;
            while(number){
                int n = rand.nextInt(35);
                if(!(arraylist.contains(n)))
                    number = false;
                arraylist.add(n);
            }
        }
        //fill the grids with images
        for(int i=0; i<arraylist.size(); i++){
            //select random image from arraylist
            int index = rand.nextInt (pictures.size());
            String fileName = (String) pictures.get(index );
            //find the image file
            icon = createImageIcon(fileName);
            //save the file in a new file
            file.add(fileName);
            //rescaled the image
            int x = rand.nextInt(50)+50;
            int y = rand.nextInt(50)+50;
            Image image = icon.getImage().getScaledInstance(x,y,Image.SCALE_SMOOTH);
            icon.setImage(image);
            //remove empty label and replace it with an image
            int one = (Integer) arraylist.get(i);
            pDraw.remove(one);
            final JLabel label;
            pDraw.add(label,one);
        }

    Read the article

  • AlarmManager triggers PendingIntent too soon

    - by Wezelkrozum
    I've searched for 3 days now but didn't find a solution or a similar problem/question anywhere else. Here is the deal:

    * Trigger in 1 hour - works correctly
    * Trigger in 2 hours - goes off in 1:23
    * Trigger in 1 day - goes off in ~11:00

    So why is the AlarmManager so unpredictable and always too soon? Or what am I doing wrong? And is there another way so that it could work correctly? This is the way I register my PendingIntent in the AlarmManager (stripped down):

        AlarmManager alarmManager = (AlarmManager)parent.getSystemService(ALARM_SERVICE);
        Intent myIntent = new Intent(parent, UpdateKlasRoostersService.class);
        PendingIntent pendingIntent = PendingIntent.getService(parent, 0, myIntent, PendingIntent.FLAG_UPDATE_CURRENT);

        //Set startdate of PendingIntent so it triggers in 10 minutes
        Calendar start = Calendar.getInstance();
        start.setTimeInMillis(SystemClock.elapsedRealtime());
        start.add(Calendar.MINUTE, 10);

        //Set interval of PendingIntent so it triggers every day
        Integer interval = 1*24*60*60*1000;

        //Cancel any similar instances of this PendingIntent if already scheduled
        alarmManager.cancel(pendingIntent);

        //Schedule PendingIntent
        alarmManager.setRepeating(AlarmManager.ELAPSED_REALTIME_WAKEUP, start.getTimeInMillis(), interval, pendingIntent);

        //Old way I used to schedule a PendingIntent, didn't seem to work either
        //alarmManager.set(AlarmManager.RTC_WAKEUP, start.getTimeInMillis(), pendingIntent);

    It would be awesome if anyone has a solution. Thanks for any help!

    Update: 2 hours ago it worked to trigger it with an interval of 2 hours, but after that it triggered after 1:20 hours. It's getting really weird. I'll track the triggers down with a logfile and post it here tomorrow.

    Update: The PendingIntent is scheduled to run every 3 hours. From the log's second line it seems like an old scheduled PendingIntent is still running:

        [2012-5-3 2:15:42 519] Updating Klasroosters
        [2012-5-3 4:15:15 562] Updating Klasroosters
        [2012-5-3 5:15:42 749] Updating Klasroosters
        [2012-5-3 8:15:42 754] Updating Klasroosters
        [2012-5-3 11:15:42 522] Updating Klasroosters

    But I'm sure I cancelled the scheduled PendingIntents before I scheduled a new one. And every PendingIntent is recreated in the same way, so it should be exactly the same. If not, this thread's question isn't relevant anymore.

    Read the article

  • Replace HTML entities in a string avoiding <img> tags

    - by Xeos
    I have the following input:

        Hi! How are you? <script>//NOT EVIL!</script> Wassup? :P LOOOL!!! :D :D :D

    It is then run through an emoticon library and becomes this:

        Hi! How are you? <script>//NOT EVIL!</script> Wassup? <img class="smiley" alt="" title="tongue, :P" src="ui/emoticons/15.gif"> LOOOL!!! <img class="smiley" alt="" title="big grin, :D" src="ui/emoticons/5.gif"> <img class="smiley" alt="" title="big grin, :P" src="ui/emoticons/5.gif"> <img class="smiley" alt="" title="big grin, :P" src="ui/emoticons/5.gif">

    I have a function that escapes HTML entities to prevent XSS. Running it on the raw input for the first line would produce:

        Hi! How are you? &lt;script&gt;//NOT EVIL!&lt;/script&gt;

    Now I need to escape all the input, but at the same time I need to preserve emoticons in their initial state. So when there is a <:-P emoticon, it stays like that and does not become &lt;:-P.

    I was thinking of running a regex split on the emotified text, then processing each part on its own and concatenating the string back together, but I am not sure how easily the regex could be bypassed. I know the format will always be this:

        [<img class="smiley" alt="] [empty string] [" title="] [one of the values from a big list] [, ] [another value from the list (may be matching original emoticon)] [" src="ui/emoticons/] [integer from Y to X] [.gif">]

    Using the list MAY be slow, since I need to run that regex on text that may have 20-30-40 emoticons. Plus there may be 5-10-15 text messages to process. What could be an elegant solution to this? I am ready to use a third-party library or jQuery for this. PHP preprocessing is possible as well.
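
The split-then-escape idea described above could look roughly like this. It is a sketch in Python rather than JavaScript or PHP, and the tag pattern is an assumption copied from the example output; it does not check the title against the emoticon list, so a user who types a tag of exactly this shape would get it through unescaped (it would still only render as an image from ui/emoticons/).

```python
import html
import re

# Assumed shape of the emoticon library's output, taken from the example above.
# The capturing group makes re.split keep the matched tags in its result.
SMILEY = re.compile(
    r'(<img class="smiley" alt="" title="[^"]*" src="ui/emoticons/\d+\.gif">)'
)


def escape_keeping_smileys(text):
    """Escape everything except <img> tags that match the exact smiley pattern."""
    parts = SMILEY.split(text)
    # With one capturing group, the matched tags sit at the odd indexes.
    return "".join(
        part if index % 2 else html.escape(part, quote=True)
        for index, part in enumerate(parts)
    )
```

An alternative that avoids pattern-matching generated markup altogether is to escape the raw input first and only then insert the emoticon images, provided the emoticon library can be taught to recognise escaped forms such as &lt;:-P.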

    Read the article

  • Strange results - I obtain same value for all keys

    - by Pietro Luciani
    I have a problem with MapReduce. Given as input a list of songs ("Songname"#"UserID"#"boolean"), I must produce a song list that specifies how many different users listened to each song, i.e. an output of ("Songname","timelistening"). I used a Hashtable to keep only one pair per user ID. With short files it works well, but when I give it a list of about 1,000,000 records as input it returns the same value (20) for all records. This is my mapper:

        public static class CanzoniMapper extends Mapper<Object, Text, Text, IntWritable>{
            private IntWritable userID = new IntWritable(0);
            private Text song = new Text();

            public void map(Object key, Text value, Context context) throws IOException, InterruptedException {
                /*StringTokenizer itr = new StringTokenizer(value.toString());
                while (itr.hasMoreTokens()) {
                    word.set(itr.nextToken());
                    context.write(word, one);
                }*/
                String[] caratteri = value.toString().split("#");
                if(caratteri[2].equals("1")){
                    song.set(caratteri[0]);
                    userID.set(Integer.parseInt(caratteri[1]));
                    context.write(song,userID);
                }
            }
        }

    This is my reducer:

        public static class CanzoniReducer extends Reducer<Text,IntWritable,Text,IntWritable> {
            private IntWritable result = new IntWritable();

            public void reduce(Text key, Iterable<IntWritable> values, Context context) throws IOException, InterruptedException {
                Hashtable<IntWritable,Text> doppioni = new Hashtable<IntWritable,Text>();
                for (IntWritable val : values) {
                    doppioni.put(val,key);
                }
                result.set(doppioni.size());
                //doppioni.clear();
                context.write(key,result);
            }
        }

    and main:

        Configuration conf = new Configuration();
        Job job = new Job(conf, "word count");
        job.setJarByClass(Canzoni.class);
        job.setMapperClass(CanzoniMapper.class);
        //job.setCombinerClass(CanzoniReducer.class);
        //job.setNumReduceTasks(2);
        job.setReducerClass(CanzoniReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);

    Any idea?
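
The computation the job is meant to perform, counting distinct users per song, can be sketched outside Hadoop as follows. This is plain Python rather than the Java MapReduce API; `map_record`, `reduce_song` and `run` are hypothetical names, and the in-memory grouping merely stands in for the shuffle phase. The point it illustrates is that the reducer collects plain integer values into a set. In the Java version, one commonly cited pitfall is that Hadoop may reuse the same IntWritable instance across iterations of `values`, so storing the object itself as a key, rather than its `get()` value, can give misleading counts; whether that is the cause here is an assumption.

```python
from collections import defaultdict


def map_record(line):
    """Mapper: emit (song, user_id) only for records whose flag is '1'."""
    song, user_id, flag = line.split("#")
    if flag == "1":
        yield song, int(user_id)


def reduce_song(song, user_ids):
    """Reducer: count distinct users; the set holds plain ints, not shared objects."""
    return song, len(set(user_ids))


def run(lines):
    """Tiny driver: group mapper output by key, then reduce each group."""
    grouped = defaultdict(list)          # stands in for the shuffle phase
    for line in lines:
        for song, user_id in map_record(line):
            grouped[song].append(user_id)
    return dict(reduce_song(song, ids) for song, ids in grouped.items())
```

For example, `run(["A#1#1", "A#1#1", "A#2#1", "B#3#0", "B#4#1"])` counts user 1 only once for song A and ignores the record whose flag is 0.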

    Read the article

  • Rails. Putting update logic in your migrations

    - by Daniel Abrahamsson
    A couple of times I've been in the situation where I've wanted to refactor the design of some model and have ended up putting update logic in migrations. However, as far as I've understood, this is not good practice (especially since you are encouraged to use your schema file for deployment, and not your migrations). How do you deal with this kind of problem?

    To clarify what I mean, say I have a User model. Since I thought there would only be two kinds of users, namely a "normal" user and an administrator, I chose to use a simple boolean field telling whether the user was an administrator or not. However, after a while I figured I needed some third kind of user, perhaps a moderator or something similar. In this case I add a UserType model (and the corresponding migration), and a second migration for removing the "admin" flag from the user table.

    And here comes the problem. In the "add_user_type_to_users" migration I have to map the admin flag value to a user type. Additionally, in order to do this, the user types have to exist, meaning I cannot use the seeds file, but rather have to create the user types in the migration (also considered bad practice). Here comes some fictional code representing the situation:

        class CreateUserTypes < ActiveRecord::Migration
          def self.up
            create_table :user_types do |t|
              t.string :name, :null => false, :unique => true
            end
            #Create basic types (can not put in seed, because of future migration dependency)
            UserType.create!(:name => "BASIC")
            UserType.create!(:name => "MODERATOR")
            UserType.create!(:name => "ADMINISTRATOR")
          end

          def self.down
            drop_table :user_types
          end
        end

        class AddTypeIdToUsers < ActiveRecord::Migration
          def self.up
            add_column :users, :type_id, :integer
            #Determine type via the admin flag
            basic = UserType.find_by_name("BASIC")
            admin = UserType.find_by_name("ADMINISTRATOR")
            User.all.each {|u| u.update_attribute(:type_id, (u.admin?) ? admin.id : basic.id)}
            #Remove the admin flag
            remove_column :users, :admin
            #Add foreign key
            execute "alter table users add constraint fk_user_type_id foreign key (type_id) references user_types (id)"
          end

          def self.down
            #Re-add the admin flag
            add_column :users, :admin, :boolean, :default => false
            #Reset the admin flag (this is the problematic update code)
            admin = UserType.find_by_name("ADMINISTRATOR")
            execute "update users set admin=true where type_id=#{admin.id}"
            #Remove foreign key constraint
            execute "alter table users drop foreign key fk_user_type_id"
            #Drop the type_id column
            remove_column :users, :type_id
          end
        end

    As you can see there are two problematic parts. First the row creation part in the first migration, which is necessary if I would like to run all migrations in a row, then the "update" part in the second migration that maps the "admin" column to the "type_id" column. Any advice?

    Read the article

< Previous Page | 178 179 180 181 182 183 184 185 186 187 188 189  | Next Page >