Search Results

Search found 12705 results on 509 pages for 'random sample'.

Page 56/509 | < Previous Page | 52 53 54 55 56 57 58 59 60 61 62 63  | Next Page >

  • Solving Big Problems with Oracle R Enterprise, Part I

    - by dbayard
    Abstract: This blog post shows how we used Oracle R Enterprise to tackle a customer’s big calculation problem across a big data set.

    Overview: Databases are great for managing large amounts of data in a central place with rigorous enterprise-level controls. R is great for doing advanced computations. Sometimes you need to do advanced computations on large amounts of data, subject to rigorous enterprise-level concerns. This blog post shows how Oracle R Enterprise (R plus the Oracle Database) enabled us to do some pretty sophisticated calculations across 1 million accounts (each with many detailed records) in minutes.

    The problem: A financial services customer of mine needs to calculate the historical internal rate of return (IRR) for its customers’ portfolios. This information is needed for customer statements and the online web application. In the past, they had solved this with a home-grown application that pulled trade and account data out of their data warehouse and ran the calculations. But this home-grown application was not fast enough, and it was a challenge for them to write and maintain the code that did the IRR calculation.

    IRR – a problem that R is good at solving: Internal Rate of Return is an interesting calculation in that in most real-world scenarios it is impractical to calculate exactly. Instead, approximation techniques need to be used. In this blog post, we will discuss calculating the “money weighted rate of return”, but in the actual customer proof of concept we used R to calculate both money weighted and time weighted rates of return. You can learn more about the money weighted rate of return here: http://www.wikinvest.com/wiki/Money-weighted_return

    First Steps: Calculating IRR in R

    We will start with calculating the IRR in standalone/desktop R. In our second post, we will show how to take this desktop R function, deploy it to an Oracle Database, and make it work at real-world scale. The first step was to get some sample data. For a historical IRR calculation, you have balances and cash flows. In our case, the customer provided us with several accounts’ worth of sample data in Microsoft Excel.

    The above figure shows part of the spreadsheet of sample data. The data provides balances and cash flows for a sample account (BMV=beginning market value. FLOW=cash flow in/out of account. EMV=ending market value).

    Once we had the sample spreadsheet, the next step was to read the Excel data into R. This is something that R does well. R offers multiple ways to work with spreadsheet data. For instance, one could save the spreadsheet as a .csv file. In our case, the customer provided a spreadsheet file containing multiple sheets, where each sheet provided data for a different sample account. To handle this easily, we took advantage of the RODBC package, which allowed us to read the Excel data sheet-by-sheet without having to create individual .csv files. We wrote ourselves a little helper function called getsheet() around the RODBC package. Then we loaded all of the sample accounts into a data.frame called SimpleMWRRData.

    Writing the IRR function

    At this point, it was time to write the money weighted rate of return (MWRR) function itself. The definition of MWRR is easily found on the internet, or if you are old school you can look in an investment performance textbook. In the customer proof, we based our calculations on the ones defined in The Handbook of Investment Performance: A User’s Guide by David Spaulding, since this is the reference book used by the customer. (One of the nice things we found during the course of this proof-of-concept is that by using R to write our IRR functions we could easily incorporate the specific variations and business rules of the customer into the calculation.)

    The key thing with calculating IRR is the need to solve a complex equation with a numerical approximation technique. For IRR, you need to find the value of the rate of return (r) that sets the Net Present Value of all the flows in and out of the account to zero. With R, we solve this by defining our NPV function:

    where bmv is the beginning market value, cf is a vector of cash flows, t is a vector of time (relative to the beginning), emv is the ending market value, and tend is the ending time.

    Since solving for r is a one-dimensional optimization problem, we decided to take advantage of R’s optimize method (http://stat.ethz.ch/R-manual/R-patched/library/stats/html/optimize.html). The optimize method can be used to find a minimum or maximum; to find the value of r where our npv function is closest to zero, we wrapped our npv function inside the abs function and asked optimize to find the minimum. Here is an example of using optimize:

    where low and high are scalars that indicate the range to search for an answer.

    To test this out, we need to set values for bmv, cf, t, emv, tend, low, and high. We will set low and high to some reasonable defaults. For example, this account had a negative 2.2% money weighted rate of return.

    Enhancing and Packaging the IRR function

    With numerical approximation methods like optimize, sometimes you will not be able to find an answer with your initial set of inputs. To account for this, our approach was to first try to find an answer for r within a narrow range; then, if we did not find an answer, we tried calling optimize() again with a broader range. See the R help page on optimize() for more details about the search range and its algorithm.

    At this point, we can write a simplified version of our MWRR function. (Our real-world version is more sophisticated in that it calculates rates of return for 5 different time periods [since inception, last quarter, year-to-date, last year, year before last year] in a single invocation. In our actual customer proof, we also defined time-weighted rate of return calculations. The beauty of R is that it was very easy to add these enhancements and additional calculations to our IRR package.)

    To simplify code deployment, we then created a new package of our IRR functions and sample data. For this blog post, we only need to include our SimpleMWRR function and our SimpleMWRRData sample data. We created the shell of the package by calling:

    To turn this package skeleton into something usable, at a minimum you need to edit the SimpleMWRR.Rd and SimpleMWRRData.Rd files in the \man subdirectory. In those files, you need to at least provide a value for the “title” section. Once that is done, you can change directory to the IRR directory and type at the command-line:

    The myIRR package for this blog post (which has both SimpleMWRR source and SimpleMWRRData sample data) is downloadable from here: myIRR package

    Testing the myIRR package

    Here is an example of testing our IRR function once it was converted to an installable package:

    Calculating IRR for All the Accounts

    So far, we have shown how to calculate IRR for a single account. The real-world issue is how to calculate IRR for all of the accounts. This is the kind of situation where we can leverage the “Split-Apply-Combine” approach (see http://www.cscs.umich.edu/~crshalizi/weblog/815.html). Given that our sample data can fit in memory, one easy approach is to use R’s “by” function. (Other approaches to Split-Apply-Combine such as plyr can also be used. See http://4dpiecharts.com/2011/12/16/a-quick-primer-on-split-apply-combine-problems/).

    Here is an example showing the use of “by” to calculate the money weighted rate of return for each account in our sample data set.

    Recap and Next Steps

    At this point, you’ve seen the power of R being used to calculate IRR. There were several good things:

    * R could easily work with the spreadsheets of sample data we were given.
    * R’s optimize() function provided a nice way to solve for IRR: it was fast, and it allowed us to avoid having to code our own iterative approximation algorithm.
    * R was a convenient language to express the customer-specific variations, business rules, and exceptions that often occur in real-world calculations; these could be easily added to our IRR functions.
    * The Split-Apply-Combine technique can be used to perform calculations of IRR for multiple accounts at once.

    However, there are several challenges yet to be conquered at this point in our story:

    * The actual data that needs to be used lives in a database, not in a spreadsheet.
    * The actual data is much, much bigger: too big to fit into the normal R memory space and too big to want to move across the network.
    * The overall process needs to run fast, much faster than a single processor allows.
    * The actual data needs to be kept secured, which is another reason not to move it from the database and across the network.
    * The process of calculating the IRR needs to be integrated with other database ETL activities, so that IRRs can be calculated as part of the data warehouse refresh processes.

    In our next blog post in this series, we will show you how Oracle R Enterprise solved these challenges.
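
    The post's R listings are not reproduced in this excerpt, so the following is only a rough sketch of the idea in Python rather than the author's code. It assumes one common form of the NPV equation (beginning value plus discounted flows minus discounted ending value) and swaps R's optimize() on abs(npv) for plain bisection, retrying with a broader range as the post describes. The function names and the default search ranges are hypothetical.

```python
def npv(r, bmv, cf, t, emv, tend):
    """Net present value at rate r (assumed form, not taken from the post):
    beginning value + discounted cash flows - discounted ending value."""
    flows = sum(c / (1.0 + r) ** ti for c, ti in zip(cf, t))
    return bmv + flows - emv / (1.0 + r) ** tend


def solve_rate(bmv, cf, t, emv, tend, low, high, tol=1e-10):
    """Bisection search for the rate in [low, high] where npv crosses zero.
    Returns None when the range does not bracket a sign change."""
    f_low = npv(low, bmv, cf, t, emv, tend)
    f_high = npv(high, bmv, cf, t, emv, tend)
    if f_low * f_high > 0:
        return None
    while high - low > tol:
        mid = (low + high) / 2.0
        f_mid = npv(mid, bmv, cf, t, emv, tend)
        if f_low * f_mid <= 0:
            high = mid              # root lies in the lower half
        else:
            low, f_low = mid, f_mid  # root lies in the upper half
    return (low + high) / 2.0


def simple_mwrr(bmv, cf, t, emv, tend):
    """Try a narrow range first, then a broader one (ranges are illustrative)."""
    for low, high in ((-0.5, 0.5), (-0.99, 10.0)):
        rate = solve_rate(bmv, cf, t, emv, tend, low, high)
        if rate is not None:
            return rate
    return None
```

    With no cash flows, a beginning value of 100 and an ending value of 110 after one period, simple_mwrr(100, [], [], 110, 1) converges to a rate of about 0.10.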

    Read the article

  • Sentiment analysis with NLTK python for sentences using sample data or webservice?

    - by Ke
    I am embarking on an NLP project for sentiment analysis. I have successfully installed NLTK for Python (it seems like a great piece of software for this). However, I am having trouble understanding how it can be used to accomplish my task. Here is my task: I start with one long piece of data (let's say several hundred tweets on the subject of the UK election, from their webservice). I would like to break this up into sentences (or pieces no longer than 100 or so characters); I guess I can just do this in Python? Then I want to search through all the sentences for specific instances within each sentence, e.g. "David Cameron". Then I would like to check for positive/negative sentiment in each sentence and count them accordingly. NB: I am not too worried about accuracy, because my data sets are large, and I am also not too worried about sarcasm. Here are the troubles I am having: All the data sets I can find, e.g. the movie review corpus that comes with NLTK, aren't in webservice format. It looks like this data has had some processing done already. As far as I can see, the processing (by Stanford) was done with WEKA. Is it not possible for NLTK to do all this on its own? Here all the data sets have already been organised into positive/negative, e.g. the polarity dataset at http://www.cs.cornell.edu/People/pabo/movie-review-data/ How is this done? (To organise the sentences by sentiment, is it definitely WEKA, or something else?) I am not sure I understand why WEKA and NLTK would be used together. They seem to do much the same thing. If I'm processing the data with WEKA first to find sentiment, why would I need NLTK? Is it possible to explain why this might be necessary? I have found a few scripts that get somewhat near this task, but all are using the same pre-processed data. Is it not possible to process this data myself to find sentiment in sentences, rather than using the data samples given in the link? Any help is much appreciated and will save me much hair! Cheers Ke
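
    The pipeline described in the question (split into sentences, keep those that mention a target, count positive and negative ones) can be sketched with the Python standard library alone. This is only an illustration of the flow, not NLTK's API: the tiny word lists are made-up placeholders standing in for a trained classifier or a real sentiment lexicon, and all function names are hypothetical.

```python
import re
from collections import Counter

# Placeholder word lists (assumption): a real project would use a trained
# classifier or a proper sentiment lexicon instead.
POSITIVE = {"good", "great", "win", "strong"}
NEGATIVE = {"bad", "poor", "lose", "weak"}


def split_sentences(text):
    """Naive split on ., ! or ? followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def sentence_polarity(sentence):
    """Label a sentence by counting placeholder positive and negative words."""
    words = re.findall(r"[a-z']+", sentence.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def count_sentiment(text, target):
    """Count polarity labels over the sentences that mention the target."""
    counts = Counter()
    for sentence in split_sentences(text):
        if target.lower() in sentence.lower():
            counts[sentence_polarity(sentence)] += 1
    return counts
```

    Swapping sentence_polarity for a classifier trained on labelled data would leave the rest of the flow unchanged.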

    Read the article

  • What differences should i make in the MPMoviePlayer sample code(for iphone) to make it work in Ipad?

    - by wolverine
    It's working perfectly in the iPhone simulator, but not in the iPad simulator. I am only trying to get the movie to load when the application launches. I copied and pasted the same code into an iPad window application, but it loads and only gives a white screen, and nothing happens. Can anyone tell me what changes I should make so it works in the iPad simulator just as it does in the iPhone simulator?

    Read the article

  • Linq to SQL NullReferenceException's: A random needle in a haystack!

    - by Shane
    I'm getting NullReferenceExceptions at seemingly random times in my application and can't track down what could be causing the error. I'll do my best to describe the scenario and setup. Any and all suggestions greatly appreciated! C# .net 3.5 Forms Application, but I use the WebFormRouting library built by Phil Haack (http://haacked.com/archive/2008/03/11/using-routing-with-webforms.aspx) to leverage the Routing libraries of .net (usually used in conjunction with MVC) instead of using url rewriting for my urls. My database has 60 tables. All normalized. It's just a massive application. (SQL Server 2008) All queries are built with Linq to SQL in code (no SP's). Each time, a new instance of my data context is created. I use only one data context with all relationships defined in 4 relationship diagrams in SQL Server. The data context gets created a lot. I let the closing of the data context be handled automatically. I've heard arguments both sides about whether you should leave it to be closed automatically or do it yourself. In this case I do it myself. It doesn't seem to matter if I'm creating a lot of instances of the data context or just one. For example, I've got a vote-up button with the following code, and it errors probably 1 in 10-20 times. protected void VoteUpLinkButton_Click(object sender, EventArgs e) { DatabaseDataContext db = new DatabaseDataContext(); StoryVote storyVote = new StoryVote(); storyVote.StoryId = storyId; storyVote.UserId = Utility.GetUserId(Context); storyVote.IPAddress = Utility.GetUserIPAddress(); storyVote.CreatedDate = DateTime.Now; storyVote.IsDeleted = false; db.StoryVotes.InsertOnSubmit(storyVote); db.SubmitChanges(); // If this story is not yet published, check to see if we should publish it. Make sure that // it is already approved. if (story.PublishedDate == null && story.ApprovedDate != null) { Utility.MakeUpcommingNewsPopular(storyId); } // Refresh our page. 
Response.Redirect("/news/" + category.UniqueName + "/" + RouteData.Values["year"].ToString() + "/" + RouteData.Values["month"].ToString() + "/" + RouteData.Values["day"].ToString() + "/" + RouteData.Values["uniquename"].ToString()); } The last thing I tried was the "Auto Close" flag setting on SQL Server. This was set to true and I changed it to false. It doesn't seem to have done the trick, although it has had a good overall effect. Here's a detailed error that wasn't caught. I also get slightly different errors when they are caught by my try/catch blocks. System.Web.HttpUnhandledException: Exception of type 'System.Web.HttpUnhandledException' was thrown. --- System.NullReferenceException: Object reference not set to an instance of an object. at System.Web.Util.StringUtil.GetStringHashCode(String s) at System.Web.UI.ClientScriptManager.EnsureEventValidationFieldLoaded() at System.Web.UI.ClientScriptManager.ValidateEvent(String uniqueId, String argument) at System.Web.UI.WebControls.TextBox.LoadPostData(String postDataKey, NameValueCollection postCollection) at System.Web.UI.Page.ProcessPostData(NameValueCollection postData, Boolean fBeforeLoad) at System.Web.UI.Page.ProcessRequestMain(Boolean includeStagesBeforeAsyncPoint, Boolean includeStagesAfterAsyncPoint) --- End of inner exception stack trace --- at System.Web.UI.Page.HandleError(Exception e) at System.Web.UI.Page.ProcessRequestMain(Boolean includeStagesBeforeAsyncPoint, Boolean includeStagesAfterAsyncPoint) at System.Web.UI.Page.ProcessRequest(Boolean includeStagesBeforeAsyncPoint, Boolean includeStagesAfterAsyncPoint) at System.Web.UI.Page.ProcessRequest() at System.Web.UI.Page.ProcessRequest(HttpContext context) at ASP.forms_news_detail_aspx.ProcessRequest(HttpContext context) at System.Web.HttpApplication.CallHandlerExecutionStep.System.Web.HttpApplication.IExecutionStep.Execute() at System.Web.HttpApplication.ExecuteStep(IExecutionStep step, Boolean& completedSynchronously) HELP!!!

    Read the article

  • Java IO (javase 6)- Help me understand the effects of my sample use of Streams and Writers...

    - by Daddy Warbox
    BufferedWriter out = new BufferedWriter( new OutputStreamWriter( new BufferedOutputStream( new FileOutputStream("out.txt") ) ) ); So let me see if I understand this: A byte output stream is opened for file "out.txt". It is then fed to a buffered output stream to make file operations faster. The buffered stream is fed to an output stream writer to bridge from bytes to characters. Finally, this writer is fed to a buffered writer... which adds another layer of buffering? Hmm...

    Read the article

  • How to take advantage of an auto-property when refactoring this .Net 1.1 sample?

    - by Hamish Grubijan
    I see a lot of legacy .Net 1.1-style code at work like the example below, which I would like to shrink with the help of an auto-property. This will help many classes shrink by 30-40%, which I think would be good. public int MyIntThingy { get { return _myIntThingy; } set { _myIntThingy = value; } } private int _myIntThingy = -1; This would become: public int MyIntThingy { get; set; } And the only question is: where do I set MyIntThingy = -1;? If I had written the class from the start, then I would have a better idea, but I did not. An obvious answer would be: put it in the constructor. Trouble is, there are many constructors in this class. Watching the initialization to -1 in the debugger, I see it happen (I believe) before the constructor gets called. It is almost as if I need to use a static constructor as described here: http://www.c-sharpcorner.com/uploadfile/cupadhyay/staticconstructors11092005061428am/staticconstructors.aspx except that my variables are not static. Java's static initializer comes to mind, but again, my variables are not static. http://www.glenmccl.com/tip_003.htm I want to make stylistic but not functional changes to this class. As crappy as it is, it has been tested and working for a few years now. Breaking the functionality would be bad. So ... I am looking for shorter, sweeter, cuter, and yet EQUIVALENT code. Let me know if you have questions.

    Read the article

  • Login code sample which has been hacked via SQL Injection, although mysql_real_escape_string...

    - by artmania
    Hi friends, I use CodeIgniter and am having trouble with hacking :( Is it possible to perform SQL injection against the login code below? function process_login() { $username = mysql_real_escape_string($this->input->post('username')); $password = mysql_real_escape_string(MD5($this->input->post('password'))); //Check user table $query = $this->db->getwhere('users', array('username'=>$username, 'password'=>$password)); if ($query->num_rows() > 0) { // success login data Am I using mysql_real_escape_string wrong, or what? I'd appreciate any help!
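
    As general background to the question, the usual alternative to escaping strings by hand is to bind user input as query parameters, so the driver treats it purely as data. The sketch below shows that idea in Python with the standard sqlite3 module; it is not CodeIgniter code, the table layout and function name are hypothetical, and the unsalted SHA-256 call merely stands in for the question's MD5 call (neither is a recommendation for storing passwords).

```python
import hashlib
import sqlite3


def check_login(conn, username, password):
    """Return True when a row matches; values are bound, never concatenated."""
    digest = hashlib.sha256(password.encode("utf-8")).hexdigest()
    row = conn.execute(
        "SELECT 1 FROM users WHERE username = ? AND password = ?",
        (username, digest),  # bound parameters: quotes in the input stay data
    ).fetchone()
    return row is not None


# Hypothetical in-memory table used only to exercise the sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (username TEXT, password TEXT)")
conn.execute(
    "INSERT INTO users VALUES (?, ?)",
    ("alice", hashlib.sha256(b"s3cret").hexdigest()),
)
```

    With bound parameters, an input such as `alice' OR '1'='1` is compared literally against the username column and matches nothing.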

    Read the article

  • How would you sample a real-time stream of coordinates to create a Speed Graph?

    - by Andrew Johnson
    I have a GPS device, and I am receiving continuous points, which I store in an array. These points are time stamped. I would like to graph distance/time (speed) vs. distance in real-time; however, I can only plot 50 of the points because of hardware constraints. How would you select points from the array to graph? For example, one algorithm might be to select every Nth point from the array, where N results in 50 points total. Code: float indexModifier = 1; if (MIN(50,track.lastPointIndex) == 50) { indexModifier = track.lastPointIndex/50.0f; } index = ceil(index*indexModifier); Another algorithm might be to keep an array of 50 points, and throw out the point with the least speed change each time you get a new point.
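
    The first idea in the question, picking evenly spaced points so that at most 50 are plotted, can be sketched as follows. This is one possible reading of "every Nth point", written in Python rather than the question's language; the function name is hypothetical, and the choice to always keep the first and last point is an assumption.

```python
def downsample(points, max_points=50):
    """Pick at most max_points evenly spaced items, keeping first and last."""
    if max_points < 1:
        raise ValueError("max_points must be at least 1")
    n = len(points)
    if n <= max_points:
        return list(points)
    if max_points == 1:
        return [points[-1]]
    # Fractional stride so the picks span the whole array.
    step = (n - 1) / (max_points - 1)
    return [points[round(i * step)] for i in range(max_points)]
```

    Because the stride is greater than 1 whenever the array holds more than max_points items, no index is picked twice, and the newest point is always part of the plot.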

    Read the article

  • Please take a stab at this VB.Net Oracle-related sample and help me with String.Format.

    - by Hamish Grubijan
    If the database is not Oracle, it is MS SQL 2008. My task: if Oracle, add two more parameters when calling a stored proc. Oracle and MSFT stored procs are generated; Oracle ones have 3 extra parameters: Vret_val out number, Vparam2 in out number, Vparam3 in out number, ... the rest (They are not actually named Vparam2 and Vparam3, but this should not matter). So, the code for a helper VB.Net class that calls a stored proc: Imports System.Data.Odbc Imports System.Configuration Dim objCon As OdbcConnection = Nothing Dim objAdapter As OdbcDataAdapter Dim cmdCommand As New OdbcCommand Dim objDataTable As DataTable Dim sconnection As String Try sconnection = mConnectionString objAdapter = New OdbcDataAdapter objCon = New OdbcConnection(sconnection) objCon.Open() objAdapter.SelectCommand = cmdCommand objAdapter.SelectCommand.Connection = objCon objAdapter.SelectCommand.CommandType = CommandType.StoredProcedure objAdapter.SelectCommand.CommandTimeout = Globals.mReportTimeOut If Not mIsOracle Then objAdapter.SelectCommand.CommandText = String.Format("{{call {0}}}", spName) Else Dim returnValue As New OdbcParameter returnValue.Direction = ParameterDirection.Output returnValue.ParameterName = "@Vret_val" returnValue.OdbcType = OdbcType.Numeric objAdapter.SelectCommand.Parameters.Add(returnValue) objAdapter.SelectCommand.CommandText = String.Format("{{call {0}(?)}}", spName) End If Try objDataTable = New DataTable(spName) objAdapter.Fill(objDataTable) Catch ex As Exception ... Question: I am puzzled as to what String.Format("{{call {0}(?)}}", spName) does, in particular the (?) part. My understanding of String.Format is that it will simply replace {0} with spName. The {{, }}, and (?) do throw me off because { reminds me of formatting, and (?) hints at some advanced regex use. Unfortunately I am getting little help from a key person who is on vacation without a leash [smart]phone. 
I am guessing that I simply add 5 more lines for each additional parameter, and change String.Format("{{call {0}(?)}}", spName) to String.Format("{{call {0}(?,?,?)}}", spName). I forgot to mention that I am coding this "blindly" - I have a compiler to help me, but no environment set up to test this. This will be over in a few days, but I need to do my best to try finishing it on time :) Thanks.
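
    Python's str.format happens to use the same doubled-brace escape as .NET's String.Format, so the effect of the format string in the question can be tried out in a small Python sketch. The doubled braces come out as literal braces, {0} is replaced by the procedure name, and each ? passes through untouched as an ODBC parameter marker. The helper name and the procedure name are made up for the illustration.

```python
def odbc_call_text(proc_name, param_count=0):
    """Build ODBC call-escape text such as {call my_proc(?,?,?)}."""
    if param_count == 0:
        return "{{call {0}}}".format(proc_name)
    markers = ",".join("?" * param_count)  # one marker per bound parameter
    return "{{call {0}({1})}}".format(proc_name, markers)


print(odbc_call_text("my_proc"))     # {call my_proc}
print(odbc_call_text("my_proc", 1))  # {call my_proc(?)}
print(odbc_call_text("my_proc", 3))  # {call my_proc(?,?,?)}
```

    The number of ? markers has to match the number of parameters added to the command, in the same order.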

    Read the article

  • Is there a sample set of web log data available for testing analysis against?

    - by Peter
    Sorry if this isn't strictly speaking a programming question, but I figure my best chance of success would be to ask here. I'm developing some web log file analysis algorithms, but to date I only have access to a fairly small amount of web log data to process. One algorithm I want to use makes some assumptions about 'the shape' of typical web log data, and so I'd like to test it against a larger 'exemplar' - perhaps the logs of a busy site with a good distribution of traffic from different sources etc. Is there a set of such data available somewhere? Thanks for any help.

    Read the article

  • Does somebody have a working sample of PHP session_set_save_handler ?

    - by susana
    PHP Version 5.1.6. I've been trying all day and I'm breaking my head now... I understand it but can't make it work. I need to store the session data in a MySQL database, because we're balancing requests between 2 servers, so regular sessions won't work... I need to use this function ... session_set_save_handler ... and I can't make it work. Any help is greatly appreciated. Thank you!

    Read the article

  • Is there a best practice for concatenating MP3 Files, adjusting sample rates to match, while preserving original files?

    - by Scott
    Hello overflow community! Does anyone know if there is a "best practice" for concatenating mp3 files to create new files, while preserving the original files? I am working on a CentOS Linux machine, on the command line. I will eventually call the command line from a PHP script. I have been doing research and I have come up with a process that I think could work. It combines general advice from different forums, blogs, and sources like this one. So here I go: (1) Create a temporary folder. (2) Loop through the files to create a new, converted copy of each file in a "raw" format (which one, I don't know. I didn't know "raw" files existed until recently. I could use some suggestions on this). (3) Store the paths to the temporary files, in the temporary folder, and then loop through the files to concatenate them and put the new merged file in the final "processed" directory. (4) Delete the contents of the temporary folder holding the temporary raw files. (5) Convert the final file from "raw" to mp3 and enjoy the finished result. I'm thinking that this course of action might be best because I can't necessarily control the quality of the original "source" mp3s. The only other option I could think of would be to create a script that performs a similar process when files are added to the system, leaving only the files with the "proper" format and removing the original "erroneous" file. Hopefully you can see that I have put some thought into this and that I'm trying to leverage the collective knowledge of this community to choose the best direction. Perhaps there is a better path that I could take? By concatenate, I mean to join together in sequence to create a new audio file from the "concatenated files."

    Read the article

  • PHP - Code Sample - Polymorphism Implementation - How to allow for expansion?

    - by darga33
    I've read numerous SO posts about Polymorphism, and also the other really good one at http://net.tutsplus.com/tutorials/php/understanding-and-applying-polymorphism-in-php/ Good stuff!!! I'm trying to figure out how a seasoned PHP developer that follows all the best practices would accomplish the following. Please be as specific and detailed as possible. I'm sure your answer is going to help a lot of people!!! :-) While learning Polymorphism, I came across a little stumbling block. Inside of the PDFFormatter class, I had to use (instanceof) in order to figure out if some code should be included in the returned data. I am trying to be able to pass in two different kinds of profiles to the formatter. (needs to be able to handle multiple kinds of formatters but display the data specific to the Profile class that is being passed to it). It doesn't look bad now, but imagine 10 more kinds of Profiles!! How would you do this? The best answer would also include the changes you would make. Thanks sooooooo much in advance!!!!! Please PHP only! Thx!!! File 1. FormatterInterface.php interface FormatterInterface { public function format(Profile $Profile); } File 2. PDFFormatter.php class PDFFormatter implements FormatterInterface { public function format(Profile $Profile) { $format = "PDF Format<br /><br />"; $format .= "This is a profile formatted as a PDF.<br />"; $format .= 'Name: ' . $Profile->name . '<br />'; if ($Profile instanceof StudentProfile) { $format .= "Graduation Date: " . $Profile->graduationDate . "<br />"; } $format .= "<br />End of PDF file"; return $format; } } File 3. Profile.php class Profile { public $name; public function __construct($name) { $this->name = $name; } public function format(FormatterInterface $Formatter) { return $Formatter->format($this); } } File 4. 
StudentProfile.php class StudentProfile extends Profile { public $graduationDate; public function __construct($name, $graduationDate) { $this->name = $name; $this->graduationDate = $graduationDate; } } File 5. index.php //Assuming all files are included...... $StudentProfile = new StudentProfile('Michael Conner', 55, 'Unknown, FL', 'Graduate', '1975', 'Business Management'); $Profile = new Profile('Brandy Smith', 44, 'Houston, TX'); $PDFFormatter = new PDFFormatter(); echo '<hr />'; echo $StudentProfile->format($PDFFormatter); echo '<hr />'; echo $Profile->format($PDFFormatter);
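
    One common way to avoid the instanceof check described in the question is to let each profile class report its own displayable fields, so the formatter only loops over whatever it is given. The sketch below shows that idea in Python rather than PHP (the question asks for PHP, so treat it as a language-neutral illustration); the fields() method is a hypothetical addition, not part of the original classes.

```python
class Profile:
    def __init__(self, name):
        self.name = name

    def fields(self):
        """Label/value pairs this profile wants displayed."""
        return [("Name", self.name)]


class StudentProfile(Profile):
    def __init__(self, name, graduation_date):
        super().__init__(name)
        self.graduation_date = graduation_date

    def fields(self):
        # Extend the parent's fields instead of having the formatter type-check.
        return super().fields() + [("Graduation Date", self.graduation_date)]


class PDFFormatter:
    def format(self, profile):
        """Render any profile without knowing its concrete class."""
        lines = ["PDF Format", ""]
        lines += [f"{label}: {value}" for label, value in profile.fields()]
        lines += ["", "End of PDF file"]
        return "\n".join(lines)
```

    Adding ten more profile types then means adding ten fields() overrides, while every formatter stays unchanged.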

    Read the article

  • How to generate sample XML documents from their DTD or XSD?

    - by lindelof
    We are developing an application that involves a substantial amount of XML transformations. We do not have any proper input test data per se, only DTD or XSD files. We'd like to generate our test data ourselves from these files. Is there an easy/free way to do that? Edit There are apparently no free tools for this, and I agree that OxygenXML is one of the best tools for this.

    Read the article

  • PHP script sample for iPhone hi-score/survey uploading?

    - by Horace Ho
    I am looking for a PHP example/tutorial which can accept hi-score/survey uploads from an iPhone. Hopefully, the PHP script accepts POST in addition to GET, works over SSL (https), and connects to MySQL. In addition, it'd be best if the iPhone could get a session from the server and submit the session value along with the hi-score. Thanks

    Read the article
