Search Results

Search found 2562 results on 103 pages for 'morphological analysis'.

  • sharing build artifacts between jobs in hudson

    - by programming panda
    Hi, I'm trying to set up our build process in Hudson. Job 1 will be a super fast (hopefully) continuous integration build job that will be built frequently. Job 2 will be responsible for running a comprehensive test suite, at a regular interval or triggered manually. Job 3 will be responsible for running analysis tools across the codebase (much like Job 2). I tried using the "use custom workspace" feature under "Advanced Project Options" so that code compiled in Job 1 can be used in Jobs 2 and 3. However, it seems that all build artifacts remain inside the Job 1 workspace. Am I doing this right? Is there a better way of doing this? I guess I'm looking for something similar to a build pipeline setup, so that things can be shared and the appropriate jobs can be executed in stages. (I also considered using 'batch tasks', but it seems like those can't be scheduled, only triggered manually?) Any suggestions are welcome. Thanks!

    Read the article

  • Useful training courses that aren't specific to a single technology

    - by Dave Turvey
    I have possibly the best problem in the world. I have about £1600 left in a training budget and I need to find something to spend it on. I can spend it on anything that could be considered training: books, courses, conferences, etc. I would like to find a course that would benefit a software developer but is not about learning a specific programming technology. I don't really want to spend it on a technical training course; those topics are usually best learned with a good book and some trial and error. I have also already been on a general business/management training course and a PRINCE2 project management course. I am currently working on a project on my own, so I am responsible for communicating with the client, requirements gathering, project management, etc., as well as the coding. What training have you found useful outside the usual technical stuff? Has anyone done any business analysis courses? What were they like? Are there any courses on some of the practicalities of working with software, e.g. automated test and deployment strategies, or handling technical support? I would prefer a course in the UK but I can travel if necessary.

    Read the article

  • Can I make a "TCP packet modifier" using tun/tap and raw sockets?

    - by benhoyt
    I have a Linux application that talks TCP, and to help with analysis and statistics, I'd like to modify the data in some of the TCP packets that it sends out. I'd prefer to do this without hacking the Linux TCP stack. The idea I have so far is to make a bridge which acts as a "TCP packet modifier". My idea is to connect to the application via a tun/tap device on one side of the bridge, and to the network card via raw sockets on the other side of the bridge. My concern is that when you open a raw socket it still sends packets up to Linux's TCP stack, and so I couldn't modify them and send them on even if I wanted to. Is this correct? A pseudo-C-code sketch of the bridge looks like: tap_fd = open_tap_device("/dev/net/tun"); raw_fd = open_raw_socket(); for (;;) { select(fds = [tap_fd, raw_fd]); if (FD_ISSET(tap_fd, &fds)) { read_packet(tap_fd); modify_packet_if_needed(); write_packet(raw_fd); } if (FD_ISSET(raw_fd, &fds)) { read_packet(raw_fd); modify_packet_if_needed(); write_packet(tap_fd); } } Does this look possible, or are there other better ways of achieving the same thing? (TCP packet bridging and modification.)

    Read the article

  • Database warehouse design: fact tables and dimension tables

    - by morpheous
    I am building a poor man's data warehouse using an RDBMS. I have identified the key 'attributes' to be recorded as: sex (true/false); demographic classification (A, B, C etc.); place of birth; date of birth; and weight (recorded daily), which is the fact being recorded. My requirements are to be able to run 'OLAP' queries that allow me to 'slice and dice' and 'drill up/down' the data and, generally, to view the data from different perspectives. After reading up on this topic area, the general consensus seems to be that this is best implemented using dimension tables rather than normalized tables. Assuming that this assertion is true (i.e. the solution is best implemented using fact and dimension tables), I would like to seek some help in the design of these tables. The 'natural' (or obvious) dimensions are the date dimension and geographical location, both of which have hierarchical attributes. However, I am struggling with how to model the following fields: sex (true/false) and demographic classification (A, B, C etc.). The reason I am struggling with these fields is that (1) they have no obvious hierarchical attributes which would aid aggregation (AFAIA), which suggests they should be in a fact table, and (2) they are mostly static or very rarely change, which suggests they should be in a dimension table. Maybe the heuristic I am using above is too crude? I will give some examples of the type of analysis I would like to carry out on the data warehouse; hopefully that will clarify things further. I would like to aggregate and analyze the data by sex and demographic classification, e.g. to answer questions like: How do male and female weights compare across different demographic classifications? Which demographic classification (male AND female) shows the greatest increase in weight this quarter? Can anyone clarify whether sex and demographic classification are part of the fact table, or whether they are (as I suspect) dimension tables? Also, assuming they are dimension tables, could someone elaborate on the table structures (i.e. the fields)? The 'obvious' schema, CREATE TABLE sex_type (is_male int); CREATE TABLE demographic_category (id int, name varchar(4)); may not be the correct one.
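
    One way to picture the "dimension table" reading of this question is a small star schema in which sex and demographic classification are tiny lookup tables referenced from the weight fact table. The sketch below uses Python's built-in sqlite3 module purely for illustration; every table and column name (dim_sex, dim_demographic, fact_weight, weight_kg and so on) is an assumption rather than something taken from the question, and the date is kept as plain text where a real design would probably reference a date dimension.

```python
import sqlite3

# Hypothetical star schema: all table and column names are illustrative only.
SCHEMA = """
CREATE TABLE dim_sex (sex_id INTEGER PRIMARY KEY, label TEXT NOT NULL);
CREATE TABLE dim_demographic (demographic_id INTEGER PRIMARY KEY, code TEXT NOT NULL);
CREATE TABLE fact_weight (
    subject_id INTEGER NOT NULL,
    sex_id INTEGER NOT NULL REFERENCES dim_sex(sex_id),
    demographic_id INTEGER NOT NULL REFERENCES dim_demographic(demographic_id),
    measured_on TEXT NOT NULL,  -- stands in for a key into a date dimension
    weight_kg REAL NOT NULL
);
"""


def build_demo():
    """Create an in-memory database with a handful of made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO dim_sex VALUES (?, ?)", [(1, "male"), (2, "female")])
    conn.executemany("INSERT INTO dim_demographic VALUES (?, ?)", [(1, "A"), (2, "B")])
    conn.executemany(
        "INSERT INTO fact_weight VALUES (?, ?, ?, ?, ?)",
        [
            (1, 1, 1, "2010-01-01", 80.0),
            (2, 1, 1, "2010-01-01", 90.0),
            (3, 2, 1, "2010-01-01", 60.0),
            (4, 2, 2, "2010-01-01", 70.0),
        ],
    )
    return conn


def average_weight_by_sex_and_demographic(conn):
    """Slice the fact table by both small dimensions at once."""
    query = """
        SELECT s.label, d.code, AVG(f.weight_kg)
        FROM fact_weight AS f
        JOIN dim_sex AS s ON s.sex_id = f.sex_id
        JOIN dim_demographic AS d ON d.demographic_id = f.demographic_id
        GROUP BY s.label, d.code
        ORDER BY s.label, d.code
    """
    return conn.execute(query).fetchall()
```

    Calling average_weight_by_sex_and_demographic(build_demo()) groups the made-up facts by both attributes, which is the kind of "compare male and female weights across classifications" query described above. Whether such small attributes deserve their own tables is a design judgment this sketch does not settle.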

    Read the article

  • Starting out NLP - Python + large data set

    - by pencilNero
    Hi, I've been wanting to learn Python and do some NLP, so I have finally gotten round to starting. I downloaded the English Wikipedia mirror for a nice chunky dataset to start on, and have been playing around a bit, at this stage just getting some of it into a SQLite db (I haven't worked with dbs in the past, unfortunately). But I'm guessing SQLite is not the way to go for a full-blown NLP project (/experiment :) - what sort of things should I look at? HBase (and Hadoop) seem interesting; I guess I could run them in Java, prototype in Python and maybe migrate the really slow bits to Java. Alternatively, I could just run MySQL, but the dataset is 12 GB and I wonder if that will be a problem. I also looked at Lucene, but I'm not sure how (other than breaking the wiki articles into chunks) I'd get that to work. What comes to mind for a really flexible NLP platform? (I don't really know at this stage WHAT I want to do; I just want to learn large-scale language analysis, to be honest.) Many thanks.

    Read the article

  • Are PyArg_ParseTuple() "s" format specifiers useful in Python 3.x C API?

    - by Craig McQueen
    I'm trying to write a Python C extension that processes byte strings, and I have something basically working for Python 2.x and Python 3.x. For the Python 2.x code, near the start of my function, I currently have a line: if (!PyArg_ParseTuple(args, "s#:in_bytes", &src_ptr, &src_len)) ... I notice that the s# format specifier accepts both Unicode strings and byte strings. I really just want it to accept byte strings and reject Unicode. For Python 2.x, this might be "good enough"--the standard hashlib seems to do the same, accepting Unicode as well as byte strings. However, Python 3.x is meant to clean up the Unicode/byte string mess and not let the two be interchangeable. So, I'm surprised to find that in Python 3.x, the s format specifiers for PyArg_ParseTuple() still seem to accept Unicode and provide a "default encoded string version" of the Unicode. This seems to go against the principles of Python 3.x, making the s format specifiers unusable in practice. Is my analysis correct, or am I missing something? Looking at the implementation for hashlib for Python 3.x (e.g. see md5module.c, function MD5_update() and its use of GET_BUFFER_VIEW_OR_ERROUT() macro) I see that it avoids the s format specifiers, and just takes a generic object (O specifier) and then does various explicit type checks using the GET_BUFFER_VIEW_OR_ERROUT() macro. Is this what we have to do?
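
    The behaviour described for hashlib can be checked from the Python side without writing any C. The sketch below is only a probe of what Python 3's hashlib accepts: the helper name accepts_without_encoding is made up for illustration, and sha256 is used in place of md5 only because md5 can be disabled on some builds. For the C side, the Python 3 C API documentation also lists y# and y* format units that take bytes-like objects and reject str, which may be closer to what the question is after than the s family.

```python
import hashlib


def accepts_without_encoding(value):
    """Return True if a hashlib object's update() takes `value` as-is."""
    try:
        hashlib.sha256().update(value)
    except TypeError:
        # Python 3 raises TypeError for str: text must be encoded first.
        return False
    return True
```

    On Python 3, a bytes or bytearray argument is accepted while a str argument is rejected, which matches the strict bytes/Unicode separation the question expects.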

    Read the article

  • Boost Mersenne Twister: how to seed with more than one value?

    - by Eamon Nerbonne
    I'm using the boost mt19937 implementation for a simulation. The simulation needs to be reproducible, and that means storing and potentially reusing the RNG seeds later. I'm using the windows crypto api to generate the seed values because I need an external source for the seeds and not because of any particular guarantees of randomness. The output of any simulation run will have a note including the RNG seed - so the seed needs to be reasonably short. On the other hand, as part of the analysis of the simulation, I'll be comparing several runs - but to be sure that these runs are actually different, I'll need to use different seeds - so the seed needs to be long enough to avoid accidental collisions. I've determined that 64-bits of seeding should suffice; the chance of a collision will reach 50% after about 2^32 runs - that probability is low enough that the average error caused by it is negligible to me. Using just 32-bits of seed is tricky; the chance of a collision reaches 50% already after 2^16 runs; and that's a little too likely for my tastes. Unfortunately, the boost implementation either seeds with a full state vector - which is far, far too long - or a single 32-bit unsigned long - which isn't ideal. How can I seed the generator with more than 32-bits but less than a full state vector? I tried just padding the vector or repeating the seeds to fill the state vector, but even a cursory glance at the results shows that that generates poor results.
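
    The collision estimates in this question follow from the birthday bound, and a few lines of arithmetic make them easy to re-check. The sketch below is in Python rather than C++/Boost, so it only illustrates the reasoning: collision_probability and make_rng are hypothetical helper names, and the seeding shown is that of CPython's random module (also an MT19937 generator, which accepts integer seeds wider than 32 bits), not Boost's.

```python
import math
import random


def collision_probability(runs, seed_bits):
    """Birthday-bound approximation: p = 1 - exp(-n(n-1) / (2N))."""
    space = 2 ** seed_bits
    exponent = runs * (runs - 1) / (2.0 * space)
    # -expm1(-x) equals 1 - exp(-x) but stays accurate for tiny x.
    return -math.expm1(-exponent)


def make_rng(seed64):
    """Build a reproducible generator from a 64-bit integer seed.

    This is Python's generator, used as a stand-in; it does not show
    how to seed boost::mt19937.
    """
    return random.Random(seed64)
```

    With this approximation, 2^16 runs with 32-bit seeds give a collision probability of roughly 39%, and the 50% point is reached a little later, at about 1.18 * 2^16 runs; the same ratio holds for 2^32 runs with 64-bit seeds. The rule of thumb in the question is therefore the right order of magnitude.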

    Read the article

  • Oracle T4CPreparedStatement memory leaks?

    - by Jay
    A little background on the application that I am going to talk about in the next few lines: XYZ is a data masking workbench, an Eclipse RCP application. You give it a source table column and a target table column, and it applies a transformation (encryption/shuffling/etc.) and copies the row data from the source table to the target table. When I mask n tables at a time, n threads are launched by this app. Here is the issue: I ran into a production issue on the first rollout of the app. Unfortunately, I don't have any logs to get to the root cause. However, I tried to run the app in the test region and do a stress test. When I collected .hprof files and ran them through an analyzer (YourKit), I noticed that objects of oracle.jdbc.driver.T4CPreparedStatement were retaining heap. The analysis also tells me that one of my classes is holding a reference to this prepared statement object, and thereby n threads have n such objects. T4CPreparedStatement seemed to have two character arrays, lastBoundChars and bindChars, each of size char[300000]. So, I researched a bit (Google!), obtained ojdbc6.jar and tried decompiling T4CPreparedStatement. I see that T4CPreparedStatement extends OraclePreparedStatement, which dynamically manages the array size of lastBoundChars and bindChars. So, my questions here are: Have you ever run into an issue like this? Do you know the significance of lastBoundChars / bindChars? I am new to profiling, so do you think I am not doing it correctly? (I also ran the hprofs through MAT, and this was the main identified issue, so I don't really think I could be wrong?) I have found something similar on the web here: http://forums.oracle.com/forums/thread.jspa?messageID=2860681 I'd appreciate your suggestions / advice.

    Read the article

  • Keyboard hook returns different symbols from card reader depending on whether my app is in focus or not

    - by user363868
    I am writing a WinForms application in which one of the inputs is a magnetic stripe card reader (CR). I am using the code from George Mamaladze's article Processing Global Mouse and Keyboard Hooks in C# on codeproject.com to listen to the keyboard (a USB card reader acts the same way as a keyboard), and I have a weird situation. One card reader, CR1 (Unitech MS240-2UG), produces keystrokes which I intercept in the KeyPress event; I analyze them, and when I see a certain pattern like %ABCD-6EFJHI? I trigger some logic. The analysis is required because the user can type something else into my application, or into another application, while my app is open. When I use another card reader, CR2 (IdTech IDBM-334133), the keystrokes intercepted by the hook start with the number 5 instead of % (it is actually the same key on the keyboard). Since % is the starting sentinel, it is very important for me to be able to recognize input from the card reader. Moreover, if my app is running in the background and the focus is on Notepad when I swipe a card, the string %ABCD-6EFJHI? appears in Notepad and is intercepted the same way (with the proper starting character) by the keyboard hook. If the card is swiped while the focus is on my form, it is 5ABCD-6EFJHI? instead. A user who tried the app with another card reader got the same result as I do with CR2. Only CR1 works for me as expected. I looked in the Windows Device Manager, and both devices use the same HID driver supplied by Microsoft. I checked the devices through the respective software from the CR makers, and the starting and ending sentinels are set to % and ? respectively on both. I would appreciate any ideas and thoughts, as I have hit a wall myself. Thank you.

    Read the article

  • Access SSAS cube from across domains without direct database connection

    - by SuperKing
    Hello, I'm working with SQL Server Analysis Services for the first time and have the dilemma of working on a project in which users must be able to access SSAS Cubes (via a custom web dashboard) that live across different servers and domains, but without having access to the other server's SSAS database connection strings. So Organization A and Organization B will have their own cubes on their own servers, but Organization A users must be able to view Organization B's cubes, and Organization B users must be able to view Organization A's cubes, but neither organization should have access to the connection string. I've read about allowing HTTP access to the SSAS server and cube from the link below, but that requires setting up users for authentication or allowing anonymous access to one organization's server for users of another organization, and I'm not sure this would be acceptable for this situation, or if this is the preferred way to do this. Is performance acceptable here? http://technet.microsoft.com/en-us/library/cc917711.aspx I also wonder if perhaps it makes sense to run a nightly/weekly process that accesses the other organization's SSAS database via a web service or something, and pull that data into a database on the organization's server, and then rebuild the cube. Then that cube would be queried without having to go and connect to the other organization server when viewing the cube. Has anyone else attempted to accomplish something similar? Is HTTP access the standard way to go for this? Or any other possible options? Thanks, and please let me know if you need more info, still unclear on how some of this works.

    Read the article

  • Secure Webservice (WCF) without storing credentials on consumer application

    - by Pai Gaudêncio
    Howdy folks, I have a customer that sells a lottery analysis application. In this application, he consumes a webservice (my service, I mean, belongs to the company I work for now) to get statistical data about lottery results, bets made, amounts, etc., from all across the globe. The access to this webservice is paid, and each consult costs X credits. Some people have disassembled this lottery application and found the api key/auth key used to access the paid webservice, and started to use it. I would like to prevent this from happening again, but I can't find a way to authenticate on the webservice without storing the auth. keys on the application. Does anyone have any ideas on how to accomplish such task? ps1.Can't ask for the users to input any kind of credentials. Has to be transparent for them (they shouldn't know what is happening). ps2. Can't use digital certificates for the same reason above, not to mention it's easy to retrieve them and we would fall into the original problem. Thanks in advance.

    Read the article

  • Figuring out the Nyquist performance limitation of an ADC on an example PIC microcontroller

    - by AKE
    I'm spec-ing the suitability of a dsPIC microcontroller for an analog-to-digital application. This would be preferable to using dedicated A/D chips and a separate dedicated DSP chip. To do that, I've had to run through some computations, pulling the relevant parameters from the datasheets. I'm not sure I've got it right, so I would appreciate a check! (EDITED NOTE: The PIC10F220 in the example below was selected ONLY to walk through a simple example to check that I'm interpreting Tacq, Fosc, TAD, and divisor correctly in working through this sort of Nyquist analysis. The actual chips I'm considering for the design are the dsPIC33FJ128MC804 (with 16b A/D) or dsPIC30F3014 (with 12b A/D).) A simple example: the PIC10F220 is the simplest possible PIC with an ADC. It runs at a clock speed of 8 MHz and has an instruction cycle of 0.5 us (4 clock steps per instruction). So: taking Tacq = 6.06 us (acquisition time for the ADC, assuming chip temp. = 50 °C) [datasheet p34], Fosc = 8 MHz (= clock speed), and divisor = 4 (4 clock steps per CPU instruction), this gives TAD = 0.5 us (TAD = 1/(Fosc/divisor)). Conversion time is 13*TAD [datasheet p31], which gives a conversion time of 6.5 us. ADC duration is then 12.56 us [= Tacq + 13*TAD]. Assuming at least 2 instructions for load/store, this is another 1 us [0.5 us per instruction], which would give a max sampling rate of 73.7 ksps (1/13.56 us). Supposing 8 more instructions for real-time processing, this is another 4 us. Thus, total ADC/handling time = 17.56 us (12.56 us + 1 us + 4 us), so the expected upper sampling rate is 56.9 ksps. The Nyquist frequency for this sampling rate is therefore about 28 kHz. If this is right, it suggests the (theoretical) performance suitability of this chip's A/D is for signals that are bandlimited to 28 kHz. Is this a correct interpretation of the information given in the data sheet in obtaining the Nyquist performance limit? Any opinions on the noise susceptibility of ADCs in PIC / dsPIC chips would be much appreciated! AKE
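
    The chain of arithmetic in this question is easy to re-run mechanically. The sketch below simply encodes the figures quoted in the post (Tacq, Fosc, the divisor of 4 and the 13*TAD conversion time) so that each intermediate value can be checked; it does not verify the datasheet interpretation itself, and the function and constant names are made up for illustration.

```python
# Figures quoted in the question (PIC10F220 worked example).
T_ACQ_US = 6.06      # ADC acquisition time, microseconds
F_OSC_MHZ = 8.0      # oscillator frequency, MHz
DIVISOR = 4          # clock steps per instruction / per TAD
TAD_PER_CONVERSION = 13


def sampling_budget(extra_instructions):
    """Return (total time in us, sampling rate in ksps, Nyquist in kHz)."""
    tad_us = DIVISOR / F_OSC_MHZ                 # 4 / 8 MHz = 0.5 us
    conversion_us = TAD_PER_CONVERSION * tad_us  # 13 * 0.5 = 6.5 us
    adc_us = T_ACQ_US + conversion_us            # 6.06 + 6.5 = 12.56 us
    instruction_us = DIVISOR / F_OSC_MHZ         # 0.5 us per instruction
    total_us = adc_us + extra_instructions * instruction_us
    rate_ksps = 1000.0 / total_us                # 1/us expressed in ksps
    return total_us, rate_ksps, rate_ksps / 2.0
```

    With 2 extra instructions this reproduces the 13.56 us and roughly 73.7 ksps quoted above; with 10 extra instructions (2 for load/store plus 8 for processing) it gives 17.56 us, about 56.9 ksps, and a Nyquist frequency of about 28.5 kHz.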

    Read the article

  • How to sort the file names in bash in this circumstance?

    - by Nicolas
    I have run a program to generate some results with different parameters (i.e. R, C and RP). These results are saved in files named results.txt, which I now need to parse in order to analyze them. In a directory name such as params_R_7_C_16_RP_0, the 7 is the value of the parameter R, the 16 is the value of the parameter C and the 0 is the value of the parameter RP. Now, I want to find these results.txt files under the current directory and sort their paths by the values of R, C and RP. I first use the following command to get the results.txt files that I want to parse: find ./ -name "results.txt" and the output is: ./params_R_11_C_9_RP_0/results.txt ./params_R_7_C_9_RP_0/results.txt ./params_R_7_C_4_RP_0/results.txt ./params_R_11_C_16_RP_0/results.txt ./params_R_9_C_4_RP_0/results.txt ./params_R_5_C_9_RP_0/results.txt ./params_R_9_C_25_RP_0/results.txt ./params_R_7_C_16_RP_0/results.txt ./params_R_5_C_25_RP_0/results.txt ./params_R_5_C_16_RP_0/results.txt ./params_R_11_C_4_RP_0/results.txt ./params_R_9_C_16_RP_0/results.txt ./params_R_7_C_25_RP_0/results.txt ./params_R_15_C_4_RP_0/results.txt ./params_R_5_C_4_RP_0/results.txt ./params_R_9_C_9_RP_0/results.txt Then I change the command as follows: find ./ -name "results.txt" | sort and the output is: ./params_R_11_C_16_RP_0/results.txt ./params_R_11_C_25_RP_0/results.txt ./params_R_11_C_4_RP_0/results.txt ./params_R_11_C_9_RP_0/results.txt ./params_R_5_C_16_RP_0/results.txt ./params_R_5_C_25_RP_0/results.txt ./params_R_5_C_4_RP_0/results.txt ./params_R_5_C_9_RP_0/results.txt ./params_R_7_C_16_RP_0/results.txt ./params_R_7_C_25_RP_0/results.txt ./params_R_7_C_4_RP_0/results.txt ./params_R_7_C_9_RP_0/results.txt ./params_R_9_C_16_RP_0/results.txt ./params_R_9_C_25_RP_0/results.txt ./params_R_9_C_4_RP_0/results.txt ./params_R_9_C_9_RP_0/results.txt But I want the output to be as follows: ./params_R_5_C_4_RP_0/results.txt ./params_R_5_C_9_RP_0/results.txt ./params_R_5_C_16_RP_0/results.txt ./params_R_5_C_25_RP_0/results.txt ./params_R_7_C_4_RP_0/results.txt ./params_R_7_C_9_RP_0/results.txt ./params_R_7_C_16_RP_0/results.txt ./params_R_7_C_25_RP_0/results.txt ./params_R_9_C_4_RP_0/results.txt ./params_R_9_C_9_RP_0/results.txt ./params_R_9_C_16_RP_0/results.txt ./params_R_9_C_25_RP_0/results.txt ... I should have named the directories like params_R_005_C_004_RP_0 when generating the results, but it would take a long time to rerun the program. So I wonder if there is any way to achieve this ordering with a bash command.
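
    The ordering asked for here is a numeric sort on the three values embedded in each directory name, rather than the plain text sort that the sort command applies by default. The sketch below shows that idea in Python rather than bash, so it is an illustration of the technique and not the shell one-liner requested; the names PATTERN, param_key and sort_by_params are hypothetical.

```python
import re

# Matches directory names of the form params_R_<n>_C_<n>_RP_<n>.
PATTERN = re.compile(r"params_R_(\d+)_C_(\d+)_RP_(\d+)")


def param_key(path):
    """Extract (R, C, RP) from a path as a tuple of integers."""
    match = PATTERN.search(path)
    if match is None:
        raise ValueError("no parameter block in path: " + path)
    return tuple(int(group) for group in match.groups())


def sort_by_params(paths):
    """Sort paths numerically by R, then C, then RP."""
    return sorted(paths, key=param_key)
```

    Because the key is a tuple of integers, R_5 sorts before R_11 and C_4 before C_16, which is the order shown in the desired output.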

    Read the article

  • R + Bioconductor : combining probesets in an ExpressionSet

    - by Mike Dewar
    Hi, First off, this may be the wrong Forum for this question, as it's pretty darn R+Bioconductor specific. Here's what I have: library('GEOquery') GDS = getGEO('GDS785') cd4T = GDS2eSet(GDS) cd4T <- cd4T[!fData(cd4T)$symbol == "",] Now cd4T is an ExpressionSet object which wraps a big matrix with 19794 rows (probesets) and 15 columns (samples). The final line gets rid of all probesets that do not have corresponding gene symbols. Now the trouble is that most genes in this set are assigned to more than one probeset. You can see this by doing gene_symbols = factor(fData(cd4T)$Gene.symbol) length(gene_symbols)-length(levels(gene_symbols)) [1] 6897 So only 6897 of my 19794 probesets have unique probeset - gene mappings. I'd like to somehow combine the expression levels of each probeset associated with each gene. I don't care much about the actual probe id for each probe. I'd like very much to end up with an ExpressionSet containing the merged information as all of my downstream analysis is designed to work with this class. I think I can write some code that will do this by hand, and make a new expression set from scratch. However, I'm assuming this can't be a new problem and that code exists to do it, using a statistically sound method to combine the gene expression levels. I'm guessing there's a proper name for this also but my googles aren't showing up much of use. Can anyone help?
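
    The "combine by hand" approach mentioned above amounts to grouping the rows of the expression matrix by gene symbol and reducing each group to a single row. The sketch below shows that shape in plain Python rather than R/Bioconductor. Taking the per-sample median is purely an assumption made for illustration, not a claim about which summary is statistically appropriate, and the function name collapse_by_gene is hypothetical.

```python
from collections import defaultdict
from statistics import median


def collapse_by_gene(expression_rows, gene_symbols):
    """Merge probeset rows that share a gene symbol.

    expression_rows: one list of per-sample values per probeset.
    gene_symbols: the gene symbol for each probeset, in the same order.
    Rows with an empty symbol are dropped, as in the question.
    """
    grouped = defaultdict(list)
    for row, symbol in zip(expression_rows, gene_symbols):
        if symbol:
            grouped[symbol].append(row)
    # zip(*rows) walks the group column by column, i.e. sample by sample.
    return {
        symbol: [median(column) for column in zip(*rows)]
        for symbol, rows in grouped.items()
    }
```

    The result has one row per gene and the same number of samples as the input, which is the layout that would then have to be wrapped back into an ExpressionSet on the R side.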

    Read the article

  • Core Data + Core Animation/CALayer together??

    - by ivanTheTerrible
    I am making a Cocoa app with custom interfaces. So far I have implemented one version of the app using CALayer for the rendering, which has been great given the hierarchical structure of CALayers and its [hitTest:] function for handling mouse events. In this early version, the model of the app consists of my custom classes. However, as the program grows I feel the urge to use Core Data for the model, not just for the ease of binding/undo management, but also because I want to try out the new technology. My method so far: in Core Data, creating a Block entity with attributes xPos, yPos, width, height, etc. Then, creating a BlockView : CALayer class for drawing, which uses methods such as self.position.x = [self valueForKey:@"xPos"] to fetch the values from the model. In this case, every BlockView object also has to keep a local copy of xPos, which is NOT good. Do any of you have better suggestions? Edit: This app is an information visualization tool, so the positions and dimensions of the blocks are important and should be persisted for later analysis.

    Read the article

  • How best to use XPath with very large XML files in .NET?

    - by glenatron
    I need to do some processing on fairly large XML files (large here being potentially upwards of a gigabyte) in C#, including performing some complex XPath queries. The problem I have is that the standard way I would normally do this, through the System.Xml libraries, likes to load the whole file into memory before it does anything with it, which can cause memory problems with files of this size. I don't need to update the files at all, just read them and query the data contained in them. Some of the XPath queries are quite involved and go across several levels of parent-child type relationships; I'm not sure whether this will affect the ability to use a stream reader rather than loading the data into memory as a block. One way I can see of making it work is to perform the simple analysis using a stream-based approach and perhaps wrap the XPath statements in XSLT transformations that I could run across the files afterward, although it seems a little convoluted. Alternatively, I know that there are some elements that the XPath queries will not run across, so I guess I could break the document up into a series of smaller fragments based on its original tree structure, which could perhaps be small enough to process in memory without causing too much havoc. I've tried to explain my objective here, so if I'm barking up totally the wrong tree in terms of general approach I'm sure you folks can set me right...
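
    The "break the document into fragments" idea can be combined with streaming: read the file incrementally, and each time a complete fragment has been parsed, run the query against just that fragment and then discard it. The sketch below shows the pattern with Python's xml.etree.ElementTree.iterparse rather than .NET (where XmlReader would presumably play the streaming role). The function name stream_matches and the record/item element names in the usage note are hypothetical, and ElementTree supports only a limited XPath subset.

```python
import io
import xml.etree.ElementTree as ET


def stream_matches(source, record_tag, xpath):
    """Yield the text of nodes matching `xpath` inside each record.

    source: a filename or binary file object.
    record_tag: tag of the self-contained fragments the queries stay within.
    xpath: a query relative to one fragment (ElementTree's XPath subset).
    """
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == record_tag:
            # The fragment is complete here, so it can be queried in memory.
            for hit in elem.findall(xpath):
                yield hit.text
            # Drop the fragment's children so memory use stays bounded.
            elem.clear()
```

    For example, list(stream_matches(io.BytesIO(data), "record", "./item[@kind='a']")) walks the document one record at a time. This only works for queries that never need to look outside a single fragment, which is the constraint already noted in the question; the emptied record elements also remain attached to the root, so extremely large record counts would need further pruning.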

    Read the article

  • Is it possible that a single-threaded program is executed simultaneously on more than one CPU core?

    - by Wolfgang Plaschg
    When I run a single-threaded program that I have written on my quad-core Intel machine, I can see in the Windows Task Manager that all four cores of my CPU are more or less active. One core is more active than the other three, but there is also activity on those. There's no other program running (besides the OS kernel, of course) that could plausibly account for that activity. And when I close my program, all activity on all cores drops down to nearly zero. All that is left is a little "noise" on the cores, so I'm pretty sure all the visible activity comes directly or indirectly (e.g. through invoking system routines) from my program. Is it possible that the OS or the cores themselves try to balance some code or execution across all four cores, even if it's not a multithreaded program? Do you have any links that document this technique? Some info about the program: it's a console app written in Qt, and the Task Manager states that only one thread is running. Maybe Qt uses threads, but I don't use signals or slots, nor any GUI. Link to Task Manager screenshot: http://img97.imageshack.us/img97/6403/taskmanager.png This question is language-agnostic and not tied to Qt/C++; I just want to know if Windows or Intel CPUs also balance single-threaded code across all cores. If they do, how does this technique work? All I can think of is that kernel routines like reading from disk etc. are scheduled on all cores, but this won't improve performance significantly since the code still has to run synchronously with the kernel API calls. EDIT: Do you know of any tools that do a better analysis of single- and/or multi-threaded programs than the poor Windows Task Manager?

    Read the article

  • Sweave/R - Automatically generating an appendix that contains all the model summaries/plots/data pro

    - by John Horton
    I like the idea of making research available at multiple levels of detail, i.e. abstract for the casually curious, full text for the more interested, and finally the data and code for those working in the same area or trying to reproduce your results. In between the actual text and the data/code level, I'd like to insert another layer. Namely, I'd like to create a kind of automatically generated appendix that contains the full regression output, diagnostic plots, exploratory graphs, data profiles, etc. from the analysis, regardless of whether those plots/regressions etc. made it into the final paper. One idea I had was to write a script that would examine the .Rnw file and automatically: profile all data sets that are loaded (sort of like the Hmisc(?) package); summarize all regressions, i.e. run summary(model) for all models; and present all plots (regardless of whether they made it into the final version). The idea is to make this a low-effort, push-button sort of thing, as opposed to a formal appendix written like the rest of a paper. What I'm looking for is some ideas on how to do this in R in a relatively simple way. My hunch is that there is some way of going through the namespace, figuring out what each object is and then dumping it into a PDF. Thoughts? Does something like this already exist?

    Read the article

  • Natural language grammar and user-entered names

    - by Owen Blacker
    Some languages, particularly Slavic languages, change the endings of people's names according to the grammatical context. (For those of you who know grammar or studied languages that do this to words, such as German or Russian, and to help with search keywords, I'm talking about noun declension.)

    This is probably easiest with a set of examples (in Polish, to save the whole different-alphabet problem):

      1. Dorothy saw the cat: Dorota zobaczyła kota
      2. The cat saw Dorothy: Kot zobaczył Dorotę
      3. It is Dorothy's cat: To jest kot Doroty
      4. I gave the cat to Dorothy: Dałam kota Dorocie
      5. I went for a walk with Dorothy: Poszłam na spacer z Dorotą
      6. "Hello, Dorothy!": "Witam, Doroto!"

    Now, if the name in these examples were user-entered, that introduces a world of grammar nightmares. Importantly, if I went for Katie (Kasia), the examples are not directly comparable (3 and 4 are both Kasi, rather than *Kasy and *Kasie), and male names will be wholly different again.

    I'm guessing someone has dealt with this situation before, but my Google-fu appears to be weak today. I can find a lot of links about natural-language processing, but I don't think that's quite what I want. To be clear: I'm only ever gonna have one user-entered name per user and I'm gonna need to decline it into known configurations. I'll have a localised text with placeholders something like {name nominative} and {name dative}, for the sake of argument. I really don't want to have to do lexical analysis of text to work stuff out; I'll only ever need to decline that one user-entered name.

    Anyone have any recommendations on how to do this, or do I need to start calling round localisation agencies? ;o)

    Further reading (all on Wikipedia) for the interested:

      * Declension
      * Grammatical case
      * Declension in Polish
      * Declension in Russian
      * Declension in Czech nouns and pronouns

    Disclaimer: I know this happens in many other languages; I'm highlighting Slavic languages merely because I have a project that is going to be localised into some Slavic languages.
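
    One way to picture the placeholder idea from the question is the minimal Python sketch below. It assumes the declined forms of each name are stored in a lookup table (filled in by a translator, a dictionary, or the user) rather than derived by grammar rules. The names DECLENSIONS, decline and render are hypothetical; the Dorota forms are standard Polish, and the Kasia entry only carries the two forms the question mentions.

```python
import re

# Hypothetical table of declined forms, keyed by the nominative form of the name.
# In a real system this would come from a translator, a dictionary, or the user.
DECLENSIONS = {
    "Dorota": {
        "nominative": "Dorota",
        "genitive": "Doroty",
        "dative": "Dorocie",
        "accusative": "Dorotę",
        "instrumental": "Dorotą",
        "vocative": "Doroto",
    },
    # Only the forms stated in the question are filled in here.
    "Kasia": {"nominative": "Kasia", "genitive": "Kasi", "dative": "Kasi"},
}

# Matches placeholders of the form {name <case>}, e.g. {name dative}.
PLACEHOLDER = re.compile(r"\{name (\w+)\}")


def decline(name, case):
    """Return the stored form of `name` for `case`, or the name unchanged if unknown."""
    return DECLENSIONS.get(name, {}).get(case, name)


def render(template, name):
    """Replace every {name <case>} placeholder in a localised template."""
    return PLACEHOLDER.sub(lambda m: decline(name, m.group(1)), template)


print(render("Dałam kota {name dative}", "Dorota"))
print(render("To jest kot {name genitive}", "Kasia"))
```

    Falling back to the nominative for unknown names is a design choice of this sketch, not a linguistic rule; it keeps the text readable when no declined form is available.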

    Read the article

  • How can I improve this design?

    - by klausbyskov
    Let's assume that our system can perform actions, and that an action requires some parameters to do its work. I have defined the following base class for all actions (simplified for your reading pleasure):

        public abstract class BaseBusinessAction<TActionParameters>
            where TActionParameters : IActionParameters
        {
            protected BaseBusinessAction(TActionParameters actionParameters)
            {
                if (actionParameters == null)
                    throw new ArgumentNullException("actionParameters");

                this.Parameters = actionParameters;

                if (!ParametersAreValid())
                    throw new ArgumentException("Valid parameters must be supplied", "actionParameters");
            }

            protected TActionParameters Parameters { get; private set; }

            protected abstract bool ParametersAreValid();

            public void CommonMethod() { ... }
        }

    Only a concrete implementation of BaseBusinessAction knows how to validate the parameters passed to it, and therefore ParametersAreValid is an abstract function. However, I want the base class constructor to enforce that the parameters passed are always valid, so I've added a call to ParametersAreValid to the constructor and I throw an exception when the function returns false.

    So far so good, right? Well, no. Code analysis is telling me to "not call overridable methods in constructors", which actually makes a lot of sense: when the base class's constructor is called, the child class's constructor has not yet run, and therefore the ParametersAreValid method may not have access to some critical member variable that the child class's constructor would set.

    So the question is this: How do I improve this design? Do I add a Func<TActionParameters, bool> parameter to the base class constructor? If I did:

        public class MyAction : BaseBusinessAction<MyParameters>
        {
            public MyAction(MyParameters actionParameters, bool something)
                : base(actionParameters, ValidateIt)
            {
                this.something = something;
            }

            private bool something;

            // Static, so it can be passed to base(...) before the instance exists;
            // for the same reason it can only inspect its argument, not 'something'.
            public static bool ValidateIt(MyParameters parameters)
            {
                return parameters != null; // placeholder check
            }
        }

    This would work because ValidateIt is static, but I don't know... Is there a better way? Comments are very welcome.

    Read the article

  • Getting content from PHP: Trouble with POST and query.

    - by vgm64
    Apologies for my longest question on SO ever. I'm trying to interface with a PHP frontend for a MySQL database from ROOT (a CERN framework in C++ for high energy physics analysis). To start off with, I tried to get this PHP interface to play nice with wget and curl first because I'm more familiar with them. The following command works:

        wget --post-data "hostname=localhost:3306&un=joeuser&pw=psswd&myquery=show_spazio_databases;" http://some.host.edu/log/log_query_matlab.php

    The results are:

        database1
        database2

    That's good. If I leave out the --post-data then I get the result:

        Warning: mysql_connect() [function.mysql-connect]: Access denied for user 'admin'@'localhost' (using password: NO) in /log/log_query_matlab.php on line 6
        i'm dead! Access denied for user 'admin'@'localhost' (using password: NO)
        Warning: mysql_query() [function.mysql-query]: Access denied for user 'admin'@'localhost' (using password: NO) in /log/log_query_matlab.php on line 29
        Warning: mysql_query() [function.mysql-query]: A link to the server could not be established in /log/log_query_matlab.php on line 29

    I have access to the PHP script (read only), but the error itself isn't too important. What matters is that in ROOT I call socket.SendRaw(message, message.Length()) (socket is a TSocket), and this gives me the same "error" as wget without the post data switch if my "message" is:

        "POST http://some.host.edu/log/log_query_matlab.php?hostname=localhost:3306&un=joeuser&pw=psswd&myquery=show_spazio_databases"

    This may be in vain, but does anyone know how I should format the "message" so that it includes something equivalent to the --post-data switch? Or is there a standard way to format POST requests in a single line? (I've seen multi-line stuff. Is that right?) Sorry I'm clueless!

    PS. The MySQL query is show databases, but the space has been replaced with _spazio_, Italian for space. The author of the db and PHP interface requires it (and various replacements for symbols), but has anyone seen this before? Trying to troubleshoot that was terrible!
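
    For reference, an HTTP POST request is not a single line: it is a request line, a set of headers, a blank line, and then the body, with each line ended by CRLF. Form fields sent the way wget --post-data sends them go in the body, not in the URL; fields placed in the query string reach PHP as GET parameters instead. The Python sketch below only builds such a message as a string. The function name build_post_request is hypothetical, and using HTTP/1.0 (to avoid chunked responses) is an assumption of this sketch, not something the server is known to require.

```python
from urllib.parse import urlencode


def build_post_request(host, path, fields):
    """Build a raw form-encoded HTTP POST request as one CRLF-delimited string."""
    # Percent-encode the form fields; this is the request body.
    body = urlencode(fields)
    lines = [
        f"POST {path} HTTP/1.0",  # request line: method, path, protocol version
        f"Host: {host}",
        "Content-Type: application/x-www-form-urlencoded",
        f"Content-Length: {len(body.encode('ascii'))}",
        "",  # blank line separates headers from body
        body,
    ]
    return "\r\n".join(lines)


message = build_post_request(
    "some.host.edu",
    "/log/log_query_matlab.php",
    {
        "hostname": "localhost:3306",
        "un": "joeuser",
        "pw": "psswd",
        "myquery": "show_spazio_databases;",
    },
)
print(message)
```

    The resulting string is the kind of "message" that could be handed to a raw socket send such as the socket.SendRaw call in the question; whether the PHP script accepts it still depends on that script.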

    Read the article

  • Why use Django on Google App Engine?

    - by Travis Bradshaw
    When researching Google App Engine (GAE), it's clear that using Django is wildly popular for developing in Python on GAE. I've been scouring the web to find information on the costs and benefits of using Django, to find out why it's so popular. While I've been able to find a wide variety of sources on how to run Django on GAE and the various methods of doing so, I haven't found any comparative analysis on why Django is preferable to using the webapp framework provided by Google.

    To be clear, it's immediately apparent why using Django on GAE is useful for developers with an existing skillset in Django (a majority of Python web developers, no doubt) or existing code in Django (where using GAE is more of a porting exercise). My team, however, is evaluating GAE for use on an all-new project and our existing experience is with TurboGears, not Django.

    It's been quite difficult to determine why Django is beneficial to a development team when the BigTable libraries have replaced Django's ORM, sessions and authentication are necessarily changed, and Django's templating (if desirable) is available without using the entire Django stack. Finally, it's clear that using Django does have the advantage of providing an "exit strategy" if we later wanted to move away from GAE and needed a platform to target for the exodus.

    I'd be extremely grateful for help in pointing out why using Django is better than using webapp on GAE. I'm also completely inexperienced with Django, so elaboration on smaller features and/or conveniences that work on GAE is also valuable to me. Thanks in advance for your time!

    Read the article

  • Pass enum value to method which is called by dynamic object

    - by user329588
    Hello. I'm working on a program which dynamically (at runtime) loads DLLs, for example Microsoft.AnalysisServices.dll. In this DLL we have this enum:

        namespace Microsoft.AnalysisServices
        {
            [Flags]
            public enum UpdateOptions
            {
                Default = 0,
                ExpandFull = 1,
                AlterDependents = 2,
            }
        }

    and we also have this class Cube:

        namespace Microsoft.AnalysisServices
        {
            public sealed class Cube : ...
            {
                public Cube(string name);
                public Cube(string name, string id);
                .. .. ..
            }
        }

    I dynamically load this DLL and create a Cube object. Then I call the method Cube.Update(), which deploys the cube to the SQL Analysis server. But if I want to call this method with a parameter, Cube.Update(UpdateOptions.ExpandFull), I get an error, because the method doesn't get an appropriate parameter. I have already tried this, but it doesn't work:

        dynamic updateOptions = AssemblyLoader.LoadStaticAssembly("Microsoft.AnalysisServices", "Microsoft.AnalysisServices.UpdateOptions"); // my class for loading assembly
        Array s = Enum.GetNames(updateOptions);
        dynamic myEnumValue = s.GetValue(1); // 1 = ExpandFull
        dynamicCube.Update(myEnumValue); // == Cube.Update(UpdateOptions.ExpandFull)

    I know that the error is in the parameter myEnumValue, but I don't know how to dynamically get the enum type from the assembly and pass it to the method. Does anybody know the solution? Thank you very much for your answers and help!

    Read the article

  • MySQL table data transformation -- how can I dis-aggregate MySQL time data?

    - by lighthouse65
    We are coding for a MySQL data warehousing application that stores descriptive data (User ID, Work ID, Machine ID, Start and End Time columns in the first table below) associated with time and production quantity data (Output and Time columns in the first table below) upon which aggregate (SUM, COUNT, AVG) functions are applied. We now wish to dis-aggregate time data for another type of analysis.

    Our current data table design:

    +---------+---------+------------+---------------------+---------------------+--------+------+
    | User ID | Work ID | Machine ID | Event Start Time    | Event End Time      | Output | Time |
    +---------+---------+------------+---------------------+---------------------+--------+------+
    | 080025  | ABC123  | M01        | 2008-01-24 16:19:15 | 2008-01-24 16:34:45 | 2120   | 930  |
    +---------+---------+------------+---------------------+---------------------+--------+------+

    The dis-aggregation we would like to do would transform the table content to a granularity of minutes, rather than the current production event ("Event Start Time" and "Event End Time") granularity. The result of reprocessing the existing table row would look like:

    +---------+---------+------------+---------------------+--------+
    | User ID | Work ID | Machine ID | Production Minute   | Output |
    +---------+---------+------------+---------------------+--------+
    | 080025  | ABC123  | M01        | 2008-01-24 16:19    | 133    |
    | 080025  | ABC123  | M01        | 2008-01-24 16:20    | 133    |
    | 080025  | ABC123  | M01        | 2008-01-24 16:21    | 133    |
    | 080025  | ABC123  | M01        | 2008-01-24 16:22    | 133    |
    | 080025  | ABC123  | M01        | 2008-01-24 16:23    | 133    |
    | 080025  | ABC123  | M01        | 2008-01-24 16:24    | 133    |
    | 080025  | ABC123  | M01        | 2008-01-24 16:25    | 133    |
    | 080025  | ABC123  | M01        | 2008-01-24 16:26    | 133    |
    | 080025  | ABC123  | M01        | 2008-01-24 16:27    | 133    |
    | 080025  | ABC123  | M01        | 2008-01-24 16:28    | 133    |
    | 080025  | ABC123  | M01        | 2008-01-24 16:29    | 133    |
    | 080025  | ABC123  | M01        | 2008-01-24 16:30    | 133    |
    | 080025  | ABC123  | M01        | 2008-01-24 16:31    | 133    |
    | 080025  | ABC123  | M01        | 2008-01-24 16:32    | 133    |
    | 080025  | ABC123  | M01        | 2008-01-24 16:33    | 133    |
    | 080025  | ABC123  | M01        | 2008-01-24 16:34    | 133    |
    +---------+---------+------------+---------------------+--------+

    So the reprocessing would take an existing row of data created at the granularity of a production event and change the granularity to minutes, eliminating the redundant (Event End Time, Time) columns while doing so. It assumes a constant rate of production and divides Output by the difference in minutes plus one (here 2120 / 16, rounded to 133) to populate the new table's Output column.

    I know this can be done in code... but can it be done entirely in a MySQL insert statement (or otherwise entirely in MySQL)? I am thinking of an INSERT ... INTO construction but keep getting stuck. An additional complexity is that there are hundreds of machines to include in the operation, so there will be multiple rows (one for each machine) for each minute of the day. Any ideas would be much appreciated. Thanks.
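
    The "in code" route mentioned above can be sketched in a few lines of Python. This is not the pure-MySQL answer the question asks for; it only illustrates the arithmetic, assuming a constant production rate and half-up rounding. The function name disaggregate and the tuple layout are hypothetical.

```python
from datetime import datetime, timedelta


def disaggregate(user_id, work_id, machine_id, start, end, output):
    """Expand one production-event row into one row per clock minute."""
    # Truncate both timestamps to the minute they fall in.
    first = start.replace(second=0, microsecond=0)
    last = end.replace(second=0, microsecond=0)

    # Difference in minutes plus one, so both end minutes are included.
    n_minutes = int((last - first).total_seconds() // 60) + 1

    # Integer division with half-up rounding (round() would round 132.5 down).
    per_minute = (2 * output + n_minutes) // (2 * n_minutes)

    return [
        (user_id, work_id, machine_id, first + timedelta(minutes=i), per_minute)
        for i in range(n_minutes)
    ]


rows = disaggregate(
    "080025", "ABC123", "M01",
    datetime(2008, 1, 24, 16, 19, 15),
    datetime(2008, 1, 24, 16, 34, 45),
    2120,
)
for row in rows:
    print(row)
```

    Note that the per-minute values are rounded, so they will not always sum back to the original Output (here 16 x 133 = 2128 rather than 2120); whether that matters depends on the analysis.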

    Read the article
