Search Results

Search found 2683 results on 108 pages for 'statistical analysis'.

Page 24/108 | < Previous Page | 20 21 22 23 24 25 26 27 28 29 30 31  | Next Page >

  • Splitting string on probable English word boundaries

    - by Sean
    I recently used Adobe Acrobat Pro's OCR feature to process a Japanese kanji dictionary. The overall quality of the output is quite a bit better than I'd hoped, but word boundaries in the English portions of the text have often been lost. For example, here's one line from my file:

        softening;weakening(ofthemarket)8 CHANGE [transform] oneselfINTO,takethe form of; disguise oneself

    I could go through and insert the missing word boundaries by hand, but this would add to what is already a substantial task. I'm hoping that there might exist software which can analyze text like this, where some of the words run together, and split the text on probable word boundaries. Is there such a package?

    I'm using Emacs, so it'd be extra-sweet if the package in question were already an Emacs package or could be readily integrated into Emacs, so that I could simply put my cursor on a line like the above and repeatedly invoke some command that splits the line on word boundaries in decreasing order of probable correctness.
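
One common way to split run-together text is a dictionary-based dynamic program that picks the word sequence with the highest total probability. The sketch below is only an illustration: the vocabulary and its counts are made up, `segment` and `word_cost` are hypothetical names, and it handles one run of letters at a time (punctuation such as `;` or `(` would need to be split off first).

```python
import math

# Hypothetical toy vocabulary with made-up counts; a real run would load
# word frequencies from a large English word list or corpus.
WORD_COUNTS = {
    "take": 50, "the": 500, "form": 40, "of": 400,
    "market": 30, "oneself": 10, "into": 60,
}
TOTAL = sum(WORD_COUNTS.values())
MAX_WORD_LEN = max(len(w) for w in WORD_COUNTS)


def word_cost(word):
    """Negative log probability; unknown strings pay a steep per-character penalty."""
    count = WORD_COUNTS.get(word.lower())
    if count is None:
        return 20.0 * len(word)
    return -math.log(count / TOTAL)


def segment(text):
    """Split a run-together string into its lowest-cost word sequence."""
    # best[i] = (cost, start): cheapest way to segment text[:i],
    # with text[start:i] as the last word of that segmentation.
    best = [(0.0, 0)]
    for end in range(1, len(text) + 1):
        candidates = []
        for start in range(max(0, end - MAX_WORD_LEN), end):
            cost = best[start][0] + word_cost(text[start:end])
            candidates.append((cost, start))
        best.append(min(candidates))
    # Follow the stored start positions backwards to recover the words.
    words = []
    end = len(text)
    while end > 0:
        start = best[end][1]
        words.append(text[start:end])
        end = start
    return list(reversed(words))
```

With this toy vocabulary, `segment("takethe")` yields `["take", "the"]`. Ranking alternative splits by cost would be one route to the "decreasing order of probable correctness" behaviour asked for, though that part is not shown here.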


  • breakdown c++ code size

    - by Evan Rogers
    I'm looking for a nice stackoverflow-style answer to the first question in this old blog post, which I'll repeat below: "I’d really like some tool (ideally, g++ based) that shows me what parts of compiled/linked code are generated from what parts of C++ source code. For instance, to see whether a particular template is being instantiated for hundreds of different types (fixable via a template specialization) or whether code is being inlined excessively, or whether particular functions are larger than expected."


  • Using Pylint with Django

    - by rcreswick
    I would very much like to integrate pylint into the build process for my Python projects, but I have run into one show-stopper: one of the error types that I find extremely useful, E1101 (*%s %r has no %r member*), constantly reports errors when using common Django fields, for example:

        E1101:125:get_user_tags: Class 'Tag' has no 'objects' member

    which is caused by this code:

        def get_user_tags(username):
            """
            Gets all the tags that username has used.
            Returns a query set.
            """
            return Tag.objects.filter(  ## This line triggers the error.
                tagownership__users__username__exact=username).distinct()

        # Here is the Tag class, models.Model is provided by Django:
        class Tag(models.Model):
            """
            Model for user-defined strings that help categorize Events on
            a per-user basis.
            """
            name = models.CharField(max_length=500, null=False, unique=True)

            def __unicode__(self):
                return self.name

    How can I tune Pylint to properly take fields such as objects into account? (I've also looked into the Django source, and I have been unable to find the implementation of objects, so I suspect it is not "just" a class field. On the other hand, I'm fairly new to Python, so I may very well have overlooked something.)

    Edit: The only way I've found to stop pylint from reporting these errors is to block all errors of this type (E1101), which is not an acceptable solution, since that is (in my opinion) an extremely useful error. If there is another way, without modifying the pylint source, please point me to specifics :)

    See here for a summary of the problems I've had with pychecker and pyflakes -- they've proven to be far too unstable for general use. (In pychecker's case, the crashes originated in the pychecker code, not in the source it was loading/invoking.)


  • Where do you start your design - code, UI or workflow?

    - by Mmarquee
    Hi, I was discussing this at work and was wondering where people start their designs. We tend to start with designing code to solve the problem presented to us, but that is probably because all of us are (or were) programmers. I was wondering where other people and organisations start their design. Do they start by treating the problem as a coding problem, sit down and design what UI to use, or map out the data or workflow? Thanks


  • Preventing multiple reporting of the same rule violation in FxCop -- What is Id?

    - by Dave
    FxCop is currently reporting the same rule violation twice for a particular method. The method has two out parameters, because I want to return two values to the caller without creating a struct for it. I wonder if anonymous types would solve my problem, but I didn't know about them at the time I wrote the method. Anyhow, I'm getting CheckId CA1021 reported once for each parameter. I copied the SuppressMessage text from FxCop, and then realized that the Id for each message is different! To me, it seems like you only need the CheckId, so...

      - What is the Id used for? I haven't been able to find information about it online.
      - Will the Id remain the same? I assume so, or SuppressMessage wouldn't work the way one would want it to.
      - Is there a way to specify the SuppressMessage attribute so that it suppresses for all Ids?


  • How do I write an analyzable thread dump format?

    - by gamue
    I'm creating a global exception handler which, in some cases, collects some information before shutting down. One piece of this information is the current thread dump. I do this with the following code:

        ManagementFactory.getThreadMXBean().dumpAllThreads(true, true);

    The problem is writing the information in a format that TDA can analyze. Is there a "simple" way to format the information instead of writing the formatter on my own?


  • .NET Library to Identify Pitches

    - by Antoni
    I'd like to write a simple program (preferably in C#) to which I sing a pitch using a mic, and the program identifies which musical note that pitch corresponds to.

    Thank you very much for your prompt responses. To clarify: I'd like a (preferably .NET) library that would identify the notes I sing. I'd like such a library to:

      - Identify a note when I sing (a note from the chromatic scale).
      - Tell me how far off I am from the closest note.

    I intend to use such a library to sing one note at a time.
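
Once a fundamental frequency has been estimated (the hard part, not shown here), mapping it to the closest chromatic note and measuring the offset is plain arithmetic. The sketch below assumes twelve-tone equal temperament with A4 = 440 Hz and uses the MIDI numbering convention (A4 = 69); `nearest_note` is a hypothetical helper name, and the code is Python rather than C# purely for illustration.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def nearest_note(freq_hz, a4_hz=440.0):
    """Return (note name, offset in cents) for a frequency in hertz.

    Assumes equal temperament; a positive offset means the pitch is sharp.
    """
    # Fractional MIDI note number: 12 semitones per doubling of frequency.
    midi = 69 + 12 * math.log2(freq_hz / a4_hz)
    nearest = round(midi)
    cents = 100 * (midi - nearest)  # 100 cents per semitone
    name = NOTE_NAMES[nearest % 12]
    octave = nearest // 12 - 1      # MIDI note 60 is C4 in this convention
    return f"{name}{octave}", cents
```

For example, `nearest_note(450.0)` reports A4 with an offset of roughly +39 cents, i.e. a little sharp.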


  • What data structures and algorithms are applied within data warehouse cubes?

    - by Jeff Meatball Yang
    I understand that cubes are optimized data structures for aggregating and "slicing" large amounts of data. I just don't know how they are implemented. I can imagine a lot of this technology is proprietary, but are there any resources that I could use to start implementing my own cube technology? Set theory and lots of math are probably involved (and welcome as suggestions!), but I'm primarily interested in implementations: the data structures and query algorithms. Thanks!


  • .NET VS2008 compilation analyzer tool?

    - by Matthieu
    Hi, I'm looking for a tool that lets me analyze the compilation of a VS solution (about 30 VS projects inside). After the whole solution has been compiled, I would like to know which projects failed and forward the errors to the developers. Of course, I could analyze the compilation report myself, but I'm pretty sure great tools are available! Which one is the best, in your opinion? Thanks a lot!


  • Lack of IsNumeric function in C#

    - by Michael Kniskern
    One thing that has bothered me about C# since its release is the lack of a generic IsNumeric function. I know it is difficult to come up with a one-stop solution to determine whether a value is numeric. I have used the following solution in the past, but it is not best practice because I am generating an exception to determine whether the value is numeric:

        public bool IsNumeric(string input)
        {
            try
            {
                int.Parse(input);
                return true;
            }
            catch
            {
                return false;
            }
        }

    Is this still the best way to approach this problem, or is there a more efficient way to determine whether a value is numeric in C#?


  • Rounding of calculated measure in MDX

    - by Espo
    How can I round a calculated MDX measure up to the nearest integer without having Excel on the server? The Excel function is CEILING(number, significance), but it is not possible to install Excel on the production SSAS server.


  • Optimal two variable linear regression SQL statement (censoring outliers)

    - by Dave Jarvis
    Problem

    Am looking to apply the y = mx + b equation (where m is SLOPE, b is INTERCEPT) to a data set, which is retrieved as shown in the SQL code. The values from the (MySQL) query are:

        SLOPE = 0.0276653965651912
        INTERCEPT = -57.2338357550468

    SQL Code

        SELECT
          ((sum(t.YEAR) * sum(t.AMOUNT)) - (count(1) * sum(t.YEAR * t.AMOUNT))) /
          (power(sum(t.YEAR), 2) - count(1) * sum(power(t.YEAR, 2))) as SLOPE,
          ((sum(t.YEAR) * sum(t.YEAR * t.AMOUNT)) - (sum(t.AMOUNT) * sum(power(t.YEAR, 2)))) /
          (power(sum(t.YEAR), 2) - count(1) * sum(power(t.YEAR, 2))) as INTERCEPT
        FROM
          (SELECT
             D.AMOUNT,
             Y.YEAR
           FROM
             CITY C, STATION S, YEAR_REF Y, MONTH_REF M, DAILY D
           WHERE
             -- For a specific city ...
             --
             C.ID = 8590 AND
             -- Find all the stations within a 15 unit radius ...
             --
             SQRT( POW( C.LATITUDE - S.LATITUDE, 2 ) + POW( C.LONGITUDE - S.LONGITUDE, 2 ) ) < 15 AND
             -- Gather all known years for that station ...
             --
             S.STATION_DISTRICT_ID = Y.STATION_DISTRICT_ID AND
             -- The data before 1900 is shaky; insufficient after 2009.
             --
             Y.YEAR BETWEEN 1900 AND 2009 AND
             -- Filtered by all known months ...
             --
             M.YEAR_REF_ID = Y.ID AND
             -- Whittled down by category ...
             --
             M.CATEGORY_ID = '001' AND
             -- Into the valid daily climate data.
             --
             M.ID = D.MONTH_REF_ID AND
             D.DAILY_FLAG_ID <> 'M'
           GROUP BY Y.YEAR
           ORDER BY Y.YEAR
          ) t

    Data

    The data is visualized here (with five outliers highlighted):

    Questions

      - How do I return the y value against all rows without repeating the same query to collect and collate the data? That is, how do I "reuse" the list of t values?
      - How would you change the query to eliminate outliers (at an 85% confidence interval)?
      - The following results (to calculate the start and end points of the line) appear incorrect. Why are the results off by ~10 degrees (e.g., outliers skewing the data)?

        (1900 * 0.0276653965651912) + (-57.2338357550468) = -4.66958228
        (2009 * 0.0276653965651912) + (-57.2338357550468) = -1.65405406

    I would have expected the 1900 result to be around 10 (not -4.67) and the 2009 result to be around 11.50 (not -1.65).

    Thank you!
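
The slope and intercept expressions in the query are the standard least-squares formulas with numerator and denominator both negated. A small Python sketch can be used to check them outside the database. The names `fit_line` and `drop_outliers` are hypothetical, and the outlier rule (drop points whose residual exceeds about 1.44 standard deviations, which corresponds to a two-sided 85% interval only if residuals are roughly normal) is an assumption, not something taken from the post.

```python
import statistics


def fit_line(xs, ys):
    """Least-squares slope and intercept, written the same way as the SQL."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    denom = sx ** 2 - n * sxx
    slope = (sx * sy - n * sxy) / denom
    intercept = (sx * sxy - sy * sxx) / denom
    return slope, intercept


def drop_outliers(xs, ys, z=1.44):
    """Keep points whose residual lies within z standard deviations.

    z=1.44 is an assumed cut-off (about 85% two-sided for normal residuals).
    """
    slope, intercept = fit_line(xs, ys)
    residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
    spread = statistics.pstdev(residuals)
    kept = [(x, y) for x, y, r in zip(xs, ys, residuals) if abs(r) <= z * spread]
    return [x for x, _ in kept], [y for _, y in kept]


# Coefficients reported in the question; evaluating the line at both
# ends reproduces the posted values.
slope, intercept = 0.0276653965651912, -57.2338357550468
y_1900 = slope * 1900 + intercept  # about -4.67
y_2009 = slope * 2009 + intercept  # about -1.65
```

Since the endpoint arithmetic agrees with the reported coefficients, the unexpected values would have to come from the data fed into the fit rather than from evaluating the line.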


  • Can someone recommend a resource/site/book to improve problem solving skills

    - by kjm
    I am a reasonably experienced developer (.NET, C#, ASP.NET etc.) but I'd like to hone my problem-solving skills. I find that when I come up against a complex problem, I sometimes implement a solution that I feel could have been better had I analyzed the problem in a different way. Ideally, what I am looking for is a resource of some type that has 'practice problems and solutions', as I think my skills will only get better by practicing this more and adopting better practices. I hope my question is not too vague, and I won't get upset with people answering with opinions etc. Thanks


  • What do the square brackets in LaTeX logs mean?

    - by stefan-majewsky
    I'm currently working on a parser that reads complete LaTeX logs. Most of the log format is, though weird, easy to figure out, but these square brackets are puzzling me. Here's an example from near the end of one of my logs:

        Overfull \hbox (10.88788pt too wide) in paragraph at lines 40--40
        []$[]$
         []
        [102]) [103]
        Kapitel 14.
        (./Thermo-141-GrenzenFundamentalpostulat.tex [104
        ]) (./Thermo-142-Mastergleichung.tex [105]) (./Thermo-143-HTheorem.tex [106pdfTeX warning (ext4): destination with the same identifier (name{equation.14.3.3}) ha
        s been already used, duplicate ignored

    Can anybody give me a hint as to what these square brackets mean? I can't see any structure in them. I suspect that lines 2/3 above are some kind of ASCII art representing the box layout, though I know too little about badboxes to justify this or to identify the meaning of the individual characters. Then, the "[104" etc. seem to correspond to the page numbers, but I still don't see why there is sometimes something in between the square brackets (like the pdfTeX warning above), and sometimes not.


  • Locating multiple nested If statements using regular expressions

    - by TERACytE
    Is there a way to search for multiple nested if statements in code using a regular expression? For example, an expression that would locate an instance of if statements three or more layers deep with different styles (if, if/else, if/elseif/else):

        if (...) {
            <code>
            if (...) {
                <code>
                if (...)
                    <code>
            } else if (...) {
                <code>
            } else {
                <code>
            }
        } else {
            <code>
        }
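
Counting nesting depth generally needs some state beyond a single pattern match, so one alternative is a small scanner that uses a regex only to find tokens. The sketch below is a rough illustration for C-like code: `max_if_depth` is a hypothetical name, and it ignores string literals, comments, and preprocessor lines, all of which a real tool would need to handle.

```python
import re

# Only the tokens that affect nesting: the keyword `if`, braces, semicolons.
TOKEN = re.compile(r"\bif\b|[{};]")


def max_if_depth(source):
    """Return the deepest level of nested `if` statements found in source."""
    stack = []    # for each open brace: how many `if`s that brace belongs to
    pending = 0   # `if`s seen since the last `{`, `}` or `;`
    deepest = 0
    for match in TOKEN.finditer(source):
        token = match.group()
        if token == "if":
            pending += 1
            deepest = max(deepest, sum(stack) + pending)
        elif token == "{":
            stack.append(pending)
            pending = 0
        else:  # "}" or ";" ends whatever statement was in progress
            if token == "}" and stack:
                stack.pop()
            pending = 0
    return deepest
```

Running `max_if_depth` on the example above gives 3, so a check such as `max_if_depth(text) >= 3` would flag it. An `else if` is counted at the same level as the `if` it follows, because the closing brace pops the stack before the next `if` is seen.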


  • Optimal two variable linear regression SQL statement

    - by Dave Jarvis
    Problem

    Am looking to apply the y = mx + b equation (where m is SLOPE, b is INTERCEPT) to a data set, which is retrieved as shown in the SQL code. The values from the (MySQL) query are:

        SLOPE = 0.0276653965651912
        INTERCEPT = -57.2338357550468

    SQL Code

        SELECT
          ((sum(t.YEAR) * sum(t.AMOUNT)) - (count(1) * sum(t.YEAR * t.AMOUNT))) /
          (power(sum(t.YEAR), 2) - count(1) * sum(power(t.YEAR, 2))) as SLOPE,
          ((sum(t.YEAR) * sum(t.YEAR * t.AMOUNT)) - (sum(t.AMOUNT) * sum(power(t.YEAR, 2)))) /
          (power(sum(t.YEAR), 2) - count(1) * sum(power(t.YEAR, 2))) as INTERCEPT
        FROM
          (SELECT
             D.AMOUNT,
             Y.YEAR
           FROM
             CITY C, STATION S, YEAR_REF Y, MONTH_REF M, DAILY D
           WHERE
             -- For a specific city ...
             --
             C.ID = 8590 AND
             -- Find all the stations within a 5 unit radius ...
             --
             SQRT( POW( C.LATITUDE - S.LATITUDE, 2 ) + POW( C.LONGITUDE - S.LONGITUDE, 2 ) ) < 15 AND
             -- Gather all known years for that station ...
             --
             S.STATION_DISTRICT_ID = Y.STATION_DISTRICT_ID AND
             -- The data before 1900 is shaky; and insufficient after 2009.
             --
             Y.YEAR BETWEEN 1900 AND 2009 AND
             -- Filtered by all known months ...
             --
             M.YEAR_REF_ID = Y.ID AND
             -- Whittled down by category ...
             --
             M.CATEGORY_ID = '001' AND
             -- Into the valid daily climate data.
             --
             M.ID = D.MONTH_REF_ID AND
             D.DAILY_FLAG_ID <> 'M'
           GROUP BY Y.YEAR
           ORDER BY Y.YEAR
          ) t

    Data

    The data is visualized here:

    Questions

      - How do I return the y value against all rows without repeating the same query to collect and collate the data? That is, how do I "reuse" the list of t values?
      - How would you change the query to eliminate outliers (at an 85% confidence interval)?
      - The following results (to calculate the start and end points of the line) appear incorrect. Why are the results off by ~10 degrees (e.g., outliers skewing the data)?

        (1900 * 0.0276653965651912) + (-57.2338357550468) = -4.66958228
        (2009 * 0.0276653965651912) + (-57.2338357550468) = -1.65405406

    I would have expected the 1900 result to be around 10 (not -4.67) and the 2009 result to be around 11.50 (not -1.65).

    Thank you!


  • Resources to learn about engineering aspects of data analytics (OLAP, warehousing, ETL, etc.)

    - by JT
    I'm a math/stats guy, interested in learning more about the engineering aspects of "data analytics" (this may be an overly broad term, this is a case of "I don't know what I don't know", so I'm not sure how to be more specific). I'm fine with manipulating and analyzing the data once it's already stored somewhere and I can access it, and I'm fine with writing scripts and SQL queries (and have a general knowledge of things like normalization). What I don't know is the whole engineering process of capturing and storing the data. For example, terms I've heard thrown about that I only vaguely understand the meaning of include:

      - OLAP, OLTP
      - Data warehousing
      - ETL
      - ???

    What's a good book (or any other resource) to learn about these kinds of things? What are things I should know about database design (normalization seems kinda "obvious" to me, something I would have done even before I knew the term -- is there anything else?)? In other words, for jobs falling under the umbrella term of "analytics engineer", what kinds of things should I know?


  • A PHP regex to extract php functions from code files

    - by user298593
    I'm trying to make a PHP regex to extract functions from PHP source code. Until now I used a recursive regex to extract everything between {}, but then it also matches stuff like if statements. When I use something like:

        preg_match_all("/(function .(.))({([^{}]+|(?R))*})/",$this->data,$matches2);

    it doesn't work when there is more than one function in the file (probably because it uses the 'function' part in the recursion too). Is there any way to do this?

    Example file:

        <?php
        if($useless)
        {
            echo "i don't want this";
        }

        function bla($wut)
        {
            echo "i do want this";
        }
        ?>

    Thanks


  • Where do you start your design - code, UI, workflow or whatever?

    - by Mmarquee
    Hi, I was discussing this at work and was wondering where people start their designs. We tend to start with designing code to solve the problem presented to us, but that is probably because all of us are (or were) programmers. I was wondering where other people and organisations start their design. Do they start by treating the problem as a coding problem, sit down and design what UI to use, or map out the data or workflow? Thanks


  • Can PMD be customized to fully support a new language?

    - by tinny
    Can PMD be customized to fully support a new language in a reasonable amount of time? I know that technically almost anything can be done, but I'm wondering whether this can be done in a reasonable amount of time, e.g. under 2 weeks. This page explains how to write a CPD parser: http://pmd.sourceforge.net/cpd-parser-howto.html But is this just for copy/paste detection? Does writing a CPD parser give me full PMD support in terms of rule sets?

