Search Results

Search found 4373 results on 175 pages for 'historical debugging'.

Page 72/175 | < Previous Page | 68 69 70 71 72 73 74 75 76 77 78 79  | Next Page >

  • Solving Kaggle’s Bike Sharing Demand Machine Learning Problem

    - by Gopinath
    Kaggle.com hosts a lot of interesting machine learning problems online and thousands of its members compete to solve them for a bounty. Problems hosted on Kaggle has varying complexity to accomodate newbies to rock star developers – few problems are good enough for  newbies to learn basics of machine learning and few of them challenge the best of machine learning developers. I’m learning basics of machine learning for the past few weeks and had an opportunity to solve Kaggel’s Bike Sharing Demand problem. Bike Sharing systems allows customers to rent a bike (or a cycle as it is called in many part of the world) for several hours and return them back . The problem provides historical information about the demand for bike sharing business and we need to forecast the demand. For more information on the problem, visit Kaggle.com website. Here is the solution I written using random forests algorithm using R programming language and you can download the source code from github.  With this solution I was able to score RMSLE of 0.70117, which placed me somewhere in the mid of the leader board.  This is the best score I could get by spending 4 hours of my time. Please feel free to fork the code and improve it.   Get Kaggle Bike Sharing Demand solution code from GitHub

    Read the article

  • DATE function does not support all the dates in DAX by design #powerpivot #tabular #dax

    - by Marco Russo (SQLBI)
    The DATE function in DAX has this simple syntax: DATE( <year>, <month>, <day> ) If you are like me, you never read the BOL notes that says in a clear way that it supports dates beginning with March 1, 1900. In fact, I was wrongly assuming that it would have supported any date that can be represented in a Date data type in Data Models, so all the dates beginning with January 1, 1900. The funny thing is that in some of the BOL documentation you will find that Date data type supports dates after March 1, 1900 (which seems not including that date, but this is a detail…). But we should not digress. The real issue is that if you try to call the DATE function passing values between January 1 and February 28, 1900, you will see a different day as a result. evaluate row ( "x", DATE( 1900, 1, 1 ) ) -- return WRONG result -- [x] 12/31/1899 12:00:00 AM   evaluate row ( "x", DATE( 1901, 2, 29 ) ) -- return WRONG result -- [x] 2/28/1900 12:00:00 AM   evaluate row ( "x", DATE( 1900, 3, 1 ) ) -- return CORRECT result -- [x] 3/1/1900 12:00:00 AM As usual, this is not a bug. It is “by design”. The DATE function works in this way in Excel. And also in Excel it was “by design”. In this case the design is having the same bug of Lotus 1-2-3 that handled 1900 a leap year, even though it isn’t. The first release of Lotus 1-2-3 is dated 1983. I hope many of my readers are younger than that. I tried to open a bug in Connect. Please vote it. I would like if Microsoft changed this type of items from “by design” (as we can expect) to “by genetic disease”. Or by “historical respect”, in order to be more politically correct.

    Read the article

  • Is there an alternative to Google Code Search?

    - by blunders
    Per the Official Google Blog: Code Search, which was designed to help people search for open source code all over the web, will be shut down along with the Code Search API on January 15, 2012. Google Code Search is now gone, and since that makes it much harder to understand the features it presented, here's my attempt to render them via information I gathered from a cache of the page for the Search Options: The "In Search Box" just notes the syntax to type the command directly in the main search box instead of using the advance search interface. Package (In Search Box: "package:linux-2.6") Language (In Search Box: "lang:c++") (OPTIONS: any language, actionscript, ada, applescript, asp, assembly, autoconf, automake, awk, basic, bat, c, c#, c++, caja, cobol, coldfusion, configure, css, d, eiffel, erlang, fortran, go, haskell, inform, java, java, javascript, jsp, lex, limbo, lisp, lolcode, lua, m4, makefile, maple, mathematica, matlab, messagecatalog, modula2, modula3, objectivec, ocaml, pascal, perl, php, pod, prolog, proto, python, python, r, rebol, ruby, sas, scheme, scilab, sgml, shell, smalltalk, sml, sql, svg, tcl, tex, texinfo, troff, verilog, vhdl, vim, xslt, xul, yacc) File (In Search Box: "file:^.*.java$") Class (In Search Box: "class:HashMap") Function (In Search Box: "function:toString") License (In Search Box: "license:mozilla") (OPTIONS: null/any-license, aladdin/Aladdin-Public-License, artistic/Artistic-License, apache/Apache-License, apple/Apple-Public-Source-License, bsd/BSD-License, cpl/Common-Public-License, epl/Eclipse-Public-License, agpl/GNU-Affero-General-Public-License, gpl/GNU-General-Public-License, lgpl/GNU-Lesser-General-Public-License, disclaimer/Historical-Permission-Notice-and-Disclaimer, ibm/IBM-Public-License, lucent/Lucent-Public-License, mit/MIT-License, mozilla/Mozilla-Public-License, nasa/NASA-Open-Source-Agreement, python/Python-Software-Foundation-License, qpl/Q-Public-License, sleepycat/Sleepycat-License, zope/Zope-Public-License) Case Sensitive (In Search Box: "case:no") (OPTIONS: yes, no) Also of use in understanding the search tool would be the still live FAQs page for Google Code Search. Is there any code search engine that would fully replace Google Code Search's features?

    Read the article

  • Grid Infrastructure Management Repository (GIMR) database now mandatory in Oracle GI 12.1.0.2

    - by Mike Dietrich
    During the installation of Oracle Grid Infrastructure 12.1.0.1 you've had the following option to choose YES/NO to install the Grid Infrastructure Management Repository (GIMR) database MGMTDB: With Oracle Grid Infrastructure 12.1.0.2 this choice has become obsolete and the above screen does not appear anymore. The GIMR database has become mandatory.  What gets stored in the GIMR? See the documentation here See the changes in Oracle Clusterware 12.1.0.2 here: Automatic Installation of Grid Infrastructure Management Repository The Grid Infrastructure Management Repository is automatically installed with Oracle Grid Infrastructure 12c release 1 (12.1.0.2). The Grid Infrastructure Management Repository enables such features as Cluster Health Monitor, Oracle Database QoS Management, and Rapid Home Provisioning, and provides a historical metric repository that simplifies viewing of past performance and diagnosis of issues. This capability is fully integrated into Oracle Enterprise Manager Cloud Control for seamless management. Furthermore what the doc doesn't say explicitly: The -MGMTDB has now become a single-tenant deployment having a CDB with one PDB This will allow the use of a Utility Cluster that can hold the CDB for a collection of GIMR PDBs When you've had already an Oracle 12.1.0.1 GIMR this database will be destroyed and recreated Preserving the CHM/OS data can be acchieved with OCULMON to dump it out into node view The data files associated with it will be created within the same disk group as OCR and VOTING  In a future release there may be an option offered to put in into a separate disk group Some important MOS Notes: MOS Note 1568402.1FAQ: 12c Grid Infrastructure Management Repository, states there's no supported procedure to enable Management Database once the GI stack is configured MOS Note 1589394.1How to Move GI Management Repository to Different Shared Storage(shows how to delete and recreate the MGMTDB) MOS Note 1631336.1Cannot delete Management Database (MGMTDB) in 12.1 -Mike

    Read the article

  • Get to Know a Candidate (15 of 25): Jerry White&ndash;Socialist Equality Party

    - by Brian Lanham
    DISCLAIMER: This is not a post about “Romney” or “Obama”. This is not a post for whom I am voting. Information sourced for Wikipedia. White (born Jerome White) is an American politician and journalist, reporting for the World Socialist Web Site.  White's Presidential campaign keeps four core components: * International unity in the working class * Social equality * Opposition to imperialist militarism and assault on democratic rights * Opposition to the political subordination of the working class to the Democrats and Republicans The White-Scherrer ticket is currently undergoing a review by the Wisconsin election committee concerning the ballot listing of the party for the 2012 Presidential elections. White has visited Canada, Germany, and Sri Lanka to campaign for socialism and an international working class movement. The Socialist Equality Party (SEP) is a Trotskyist political party in the United States, one of several Socialist Equality Parties around the world affiliated to the International Committee of the Fourth International (ICFI). The ICFI publishes daily news articles, perspectives and commentaries on the World Socialist Web Site. The party held public conferences in 2009 and 2010. It led an inquiry into utility shutoffs in Detroit, Michigan earlier in 2010, after which it launched a Committee Against Utility Shutoffs. Recently it sent reporters to West Virginia to report on the Upper Big Branch Mine disaster and the way that Massey Energy has treated its workers. It also sent reporters to the Gulf Coast to report on the Deepwater Horizon oil spill. In addition, it has participated in elections with the aim of opposing the American occupation of Iraq and building a mass socialist party with an international perspective. Despite having been active for over a decade, the Socialist Equality Party held its founding congress in 2008, where it adopted a statement of principles and a historical document. White has Ballot Access in: CO, LA, WI Learn more about Jerry White and Socialist Equality Party on Wikipedia

    Read the article

  • Email Alias [email protected] Replaced with New Oracle Certification Support Tool

    - by Paul Sorensen
    All Oracle Certification customer service issues previously sent to [email protected], [email protected], [email protected], or [email protected], should now be submitted as service requests via the new request tool. Support via these email aliases ends today. Managing candidate communications via this tool will enable better issue tracking capabilities and ensure that all issues are handled quickly and efficiently. The integrated tool will also help us to more easily research historical and related issues to enable improved certification communications and business processes. For now, questions related to Java, Oracle Solaris (Cluster), MySQL, NetBeans or OpenOffice.org exam or certification, will still be sent to [email protected] and resolved via email. Questions related to the status of an Oracle Certification Success Kit, will still be sent to [email protected] and resolved via email. ?We are excited about this new offering and ?c?o?n?t?i?n?u?e? ??t?o??????? ?w?o?r?k? ?t?o?w?a?r?d ?improve?d customer ?s?e?r?v?i?c?e?? for our OCP community. Thank you for your cooperation! Quick View of Oracle Certification Customer Support Oracle Certification Support: All issues that previously would have been sent to [email protected] [email protected]: All questions on Java, Oracle Solaris (Cluster), MySQL, NetBeans, OpenOffice.org exams and certifications [email protected]: All questions on the status of your Oracle Certification Success Kit

    Read the article

  • Can I trust the Basic schedule equation?

    - by Steve Campbell
    I've been reading Steve McConnell's demystifying the black art of estimating book, and he gives an equation for estimating nominal schedule based on Person-months of effort: ScheduleInMonths = 3.0 x EffortInMonths ^ (1/3) Per the book, this is very accurate (within 25%), although the 3.0 factor above varies depending on your organization (typically between 2 and 4). It is supposedly easy to use historical projects in your organization to derive an appropriate factor for your use. I am trying to reconcile the equation against Agile methods, using 2-6 week cycles which are often mini-projects that have a working deliverable at the end. If I have a team of 5 developers over 4 weeks (1 month), then EffortInMonths = 5 Person Months. The algorithm then outputs a schedule of 3.0 x 5^(1/3) = 5 months. 5 months is much more than 25% different than 1 month. If I lower the 3.0 factor to 0.6, then the algorthim works (outputs a schedule of approx 1 month). The lowest possible factor mentioned in the book through is 2.0. Whats going on here? I want to trust this equation for estimating a "traditional" non-agile project, but I cannot trust it when it does not reconcile with my (agile) experience. Can someone help me understand?

    Read the article

  • How to Recover that Photo, Picture or File You Deleted Accidentally

    - by The Geek
    Have you ever accidentally deleted a photo on your camera, computer, USB drive, or anywhere else? What you might not know is that you can usually restore those pictures—even from your camera’s memory stick. Windows tries to prevent you from making a big mistake by providing the Recycle Bin, where deleted files hang around for a while—but unfortunately it doesn’t work for external USB drives, USB flash drives, memory sticks, or mapped drives. Luckily there’s another way to recover deleted files. Note: we originally wrote this article a year ago, but we’ve received this question so many times from readers, friends, and families that we’ve polished it up and are republishing it for everybody. So far, everybody has reported success! Latest Features How-To Geek ETC How to Recover that Photo, Picture or File You Deleted Accidentally How To Colorize Black and White Vintage Photographs in Photoshop How To Get SSH Command-Line Access to Windows 7 Using Cygwin The How-To Geek Video Guide to Using Windows 7 Speech Recognition How To Create Your Own Custom ASCII Art from Any Image How To Process Camera Raw Without Paying for Adobe Photoshop What is the Internet? From the Today Show January 1994 [Historical Video] Take Screenshots and Edit Them in Chrome and Iron Using Aviary Screen Capture Run Android 3.0 on a Hacked Nook Google Art Project Takes You Inside World Famous Museums Emerald Waves and Moody Skies Wallpaper Change Your MAC Address to Avoid Free Internet Restrictions

    Read the article

  • SSIS Dashboard v0.4

    - by Davide Mauri
    Following the post on SSISDB script on Gist, I’ve been working on a HTML5 SSIS Dashboard, in order to have a nice looking, user friendly and, most of all, useful, SSIS Dashboard. Since this is a “spare-time” project, I’ve decided to develop it using Python since it’s THE data language (R aside), it’s a beautiful & powerful, well established and well documented and with a rich ecosystem around. Plus it has full support in Visual Studio, through the amazing Python Tools For Visual Studio plugin, I decided also to use Flask, a very good micro-framework to create websites, and use the SB Admin 2.0 Bootstrap admin template, since I’m all but a Web Designer. The result is here: https://github.com/yorek/ssis-dashboard and I can say I’m pretty satisfied with the work done so far (I’ve worked on it for probably less than 24 hours). Though there’s some features I’d like to add in t future (historical execution time, some charts, connection with AzureML to do prediction on expected execution times) it’s already usable. Of course I’ve tested it only on my development machine, so check twice before putting it in production but, give the fact that, virtually, there is not installation needed (you only need to install Python), and that all queries are based on standard SSISDB objects, I expect no big problems. If you want to test, contribute and/or give feedback please fell free to do it…I would really love to see this little project become a community project! Enjoy!

    Read the article

  • Is it bad idea to use flag variable to search MAX element in array?

    - by Boris Treukhov
    Over my programming career I formed a habit to introduce a flag variable that indicates that the first comparison has occured, just like Msft does in its linq Max() extension method implementation public static int Max(this IEnumerable<int> source) { if (source == null) { throw Error.ArgumentNull("source"); } int num = 0; bool flag = false; foreach (int num2 in source) { if (flag) { if (num2 > num) { num = num2; } } else { num = num2; flag = true; } } if (!flag) { throw Error.NoElements(); } return num; } However I have met some heretics lately, who implement this by just starting with the first element and assigning it to result, and oh no - it turned out that STL and Java authors have preferred the latter method. Java: public static <T extends Object & Comparable<? super T>> T max(Collection<? extends T> coll) { Iterator<? extends T> i = coll.iterator(); T candidate = i.next(); while (i.hasNext()) { T next = i.next(); if (next.compareTo(candidate) > 0) candidate = next; } return candidate; } STL: template<class _FwdIt> inline _FwdIt _Max_element(_FwdIt _First, _FwdIt _Last) { // find largest element, using operator< _FwdIt _Found = _First; if (_First != _Last) for (; ++_First != _Last; ) if (_DEBUG_LT(*_Found, *_First)) _Found = _First; return (_Found); } Are there any preferences between one method or another? Are there any historical reasons for this? Is one method more dangerous than another?

    Read the article

  • What are the pros and cons about developing under MAC OS? [closed]

    - by user827992
    Sometimes i get the chance to program under MAC OS, i knew about this OS since Apple shipped its computers with a PowerPC by Motorola ( since Panther, more or less ), these days they are all X86 and i see no particular advantages about adopting this platform, also i see only downsides for the main part, i do not want to cause flames, please reply if you have a good answer or you can contribute in some constructive way. I'm trying to write a list of the natively supported languages, or the languages that comes only under MAC OS with some particular technology, my list is this: Objective C with Cocoa/Carbon I'm not considering personal preferences here, if a person X likes to code under Xcode it's probably ok to have a MAC, if a person Y likes to code under Visual Studio it's probably ok to not having a MAC, my purpose is to clarify what MAC OS is good for. I also do not get why people glorify the MAC for historical reasons, I mean a language like Java just comes for MAC only in the 7th edition of its JDK, things like GCC are just a porting and many technologies are out of the question like C# ( I'm sorry, i do not consider MonoDevelop like a serious alternative ) , .Net, ASP, DirectX, and many others are just, again, porting or free software, like PHP, MySQL, Javascript, XML, CSS, OpenGL, etc etc. My question is: what is so special about being a programmer under MAC OS? There is something that I have not seen? I also noticed that a significant portion of MAC users end up using their MAC like a normal Windows PC with Parallels or something like that. I can afford to buy a MAC, show me why this machine is so unique.

    Read the article

  • SQL Rally Pre-Con: Data Warehouse Modeling – Making the Right Choices

    - by Davide Mauri
    As you may have already learned from my old post or Adam’s or Kalen’s posts, there will be two SQL Rally in North Europe. In the Stockholm SQL Rally, with my friend Thomas Kejser, I’ll be delivering a pre-con on Data Warehouse Modeling: Data warehouses play a central role in any BI solution. It's the back end upon which everything in years to come will be created. For this reason, it must be rock solid and yet flexible at the same time. To develop such a data warehouse, you must have a clear idea of its architecture, a thorough understanding of the concepts of Measures and Dimensions, and a proven engineered way to build it so that quality and stability can go hand-in-hand with cost reduction and scalability. In this workshop, Thomas Kejser and Davide Mauri will share all the information they learned since they started working with data warehouses, giving you the guidance and tips you need to start your BI project in the best way possible?avoiding errors, making implementation effective and efficient, paving the way for a winning Agile approach, and helping you define how your team should work so that your BI solution will stand the test of time. You'll learn: Data warehouse architecture and justification Agile methodology Dimensional modeling, including Kimball vs. Inmon, SCD1/SCD2/SCD3, Junk and Degenerate Dimensions, and Huge Dimensions Best practices, naming conventions, and lessons learned Loading the data warehouse, including loading Dimensions, loading Facts (Full Load, Incremental Load, Partitioned Load) Data warehouses and Big Data (Hadoop) Unit testing Tracking historical changes and managing large sizes With all the Self-Service BI hype, Data Warehouse is become more and more central every day, since if everyone will be able to analyze data using self-service tools, it’s better for him/her to rely on correct, uniform and coherent data. Already 50 people registered from the workshop and seats are limited so don’t miss this unique opportunity to attend to this workshop that is really a unique combination of years and years of experience! http://www.sqlpass.org/sqlrally/2013/nordic/Agenda/PreconferenceSeminars.aspx See you there!

    Read the article

  • RGB? CMYK? Alpha? What Are Image Channels and What Do They Mean?

    - by Eric Z Goodnight
    They’re there, lurking in your image files. But have you ever wondered what are image channels are? And what do they have to do with RGB and CMYK? Here’s the answer. The channels panel in Photoshop is one of the most disused and misunderstood parts of the program. But images have color channels with or without Photoshop. Read on to find out what color channels are, what RGB and CMYK are, and learn a little bit more about how image files work Latest Features How-To Geek ETC How to Recover that Photo, Picture or File You Deleted Accidentally How To Colorize Black and White Vintage Photographs in Photoshop How To Get SSH Command-Line Access to Windows 7 Using Cygwin The How-To Geek Video Guide to Using Windows 7 Speech Recognition How To Create Your Own Custom ASCII Art from Any Image How To Process Camera Raw Without Paying for Adobe Photoshop What is the Internet? From the Today Show January 1994 [Historical Video] Take Screenshots and Edit Them in Chrome and Iron Using Aviary Screen Capture Run Android 3.0 on a Hacked Nook Google Art Project Takes You Inside World Famous Museums Emerald Waves and Moody Skies Wallpaper Change Your MAC Address to Avoid Free Internet Restrictions

    Read the article

  • How to Change the Default Application for a File Type in Mac OS X

    - by The Geek
    If you’re a recent Mac OS X convert, you might be wondering how to force a particular file type to open in a different application than the default. No? Well, we’re going to explain it anyway. This is most useful when you’ve installed something like VLC and want to open your video files in that instead of the default, which is QuickTime Player. Latest Features How-To Geek ETC RGB? CMYK? Alpha? What Are Image Channels and What Do They Mean? How to Recover that Photo, Picture or File You Deleted Accidentally How To Colorize Black and White Vintage Photographs in Photoshop How To Get SSH Command-Line Access to Windows 7 Using Cygwin The How-To Geek Video Guide to Using Windows 7 Speech Recognition How To Create Your Own Custom ASCII Art from Any Image Vintage Posters Showcase the History of Tech Advertising Google Cloud Print Extension Lets You Print Doc/PDF/Txt Files from Web Sites Hack a $10 Flashlight into an Ultra-bright Premium One Firefox Personas Arrive on Firefox Mobile Focus Booster Is a Sleek and Free Productivity Timer What is the Internet? From the Today Show January 1994 [Historical Video]

    Read the article

  • Basket Analysis with #dax in #powerpivot and #ssas #tabular

    - by Marco Russo (SQLBI)
    A few days ago I published a new article on DAX Patterns web site describing how to implement Basket Analysis in DAX. This topic is a very classical one and is also covered in the many-to-many revolution white paper. It has been also discussed in several blog posts, listed here in historical order: Simple Basket Analysis in DAX by Chris Webb PowerPivot, basket analysis and the hidden many to many by Alberto Ferrari Applied Basket Analysis in Power Pivot using DAX by Gerhard Brueckl As usual, in DAX Patterns we try to present the required DAX formulas in a way that is easy to adapt to specific models. We also try to show a good implementation from a performance point of view. Further optimizations are always possible in DAX. However, in order to keep the model simple to adapt in different scenarios, we avoid presenting optimizations that would require particular assumptions or restrictions on the data model. I hope you will find the Basket Analysis pattern useful. Even if you do not need it today, reading the DAX formula is a good exercise to check your knowledge of evaluation contexts in DAX. For example, describing how does it work the following expression is not a trivial task! [Orders with Both Products] := CALCULATE (     DISTINCTCOUNT ( Sales[SalesOrderNumber] ),     CALCULATETABLE (         SUMMARIZE ( Sales, Sales[SalesOrderNumber] ),         ALL ( Product ),         USERELATIONSHIP ( Sales[ProductCode], 'Filter Product'[Filter ProductCode] )     ) ) The good news is that you can use the patterns even if you do not really understand all the details of the DAX formulas you are using! Any feedback on this new pattern is very welcome.

    Read the article

  • Solving Big Problems with Oracle R Enterprise, Part I

    - by dbayard
    Abstract: This blog post will show how we used Oracle R Enterprise to tackle a customer’s big calculation problem across a big data set. Overview: Databases are great for managing large amounts of data in a central place with rigorous enterprise-level controls.  R is great for doing advanced computations.  Sometimes you need to do advanced computations on large amounts of data, subject to rigorous enterprise-level concerns.  This blog post shows how Oracle R Enterprise enables R plus the Oracle Database enabled us to do some pretty sophisticated calculations across 1 million accounts (each with many detailed records) in minutes. The problem: A financial services customer of mine has a need to calculate the historical internal rate of return (IRR) for its customers’ portfolios.  This information is needed for customer statements and the online web application.  In the past, they had solved this with a home-grown application that pulled trade and account data out of their data warehouse and ran the calculations.  But this home-grown application was not able to do this fast enough, plus it was a challenge for them to write and maintain the code that did the IRR calculation. IRR – a problem that R is good at solving: Internal Rate of Return is an interesting calculation in that in most real-world scenarios it is impractical to calculate exactly.  Rather, IRR is a calculation where approximation techniques need to be used.  In this blog post, we will discuss calculating the “money weighted rate of return” but in the actual customer proof of concept we used R to calculate both money weighted rate of returns and time weighted rate of returns.  You can learn more about the money weighted rate of returns here: http://www.wikinvest.com/wiki/Money-weighted_return First Steps- Calculating IRR in R We will start with calculating the IRR in standalone/desktop R.  In our second post, we will show how to take this desktop R function, deploy it to an Oracle Database, and make it work at real-world scale.  The first step we did was to get some sample data.  For a historical IRR calculation, you have a balances and cash flows.  In our case, the customer provided us with several accounts worth of sample data in Microsoft Excel.      The above figure shows part of the spreadsheet of sample data.  The data provides balances and cash flows for a sample account (BMV=beginning market value. FLOW=cash flow in/out of account. EMV=ending market value). Once we had the sample spreadsheet, the next step we did was to read the Excel data into R.  This is something that R does well.  R offers multiple ways to work with spreadsheet data.  For instance, one could save the spreadsheet as a .csv file.  In our case, the customer provided a spreadsheet file containing multiple sheets where each sheet provided data for a different sample account.  To handle this easily, we took advantage of the RODBC package which allowed us to read the Excel data sheet-by-sheet without having to create individual .csv files.  We wrote ourselves a little helper function called getsheet() around the RODBC package.  Then we loaded all of the sample accounts into a data.frame called SimpleMWRRData. Writing the IRR function At this point, it was time to write the money weighted rate of return (MWRR) function itself.  The definition of MWRR is easily found on the internet or if you are old school you can look in an investment performance text book.  In the customer proof, we based our calculations off the ones defined in the The Handbook of Investment Performance: A User’s Guide by David Spaulding since this is the reference book used by the customer.  (One of the nice things we found during the course of this proof-of-concept is that by using R to write our IRR functions we could easily incorporate the specific variations and business rules of the customer into the calculation.) The key thing with calculating IRR is the need to solve a complex equation with a numerical approximation technique.  For IRR, you need to find the value of the rate of return (r) that sets the Net Present Value of all the flows in and out of the account to zero.  With R, we solve this by defining our NPV function: where bmv is the beginning market value, cf is a vector of cash flows, t is a vector of time (relative to the beginning), emv is the ending market value, and tend is the ending time. Since solving for r is a one-dimensional optimization problem, we decided to take advantage of R’s optimize method (http://stat.ethz.ch/R-manual/R-patched/library/stats/html/optimize.html). The optimize method can be used to find a minimum or maximum; to find the value of r where our npv function is closest to zero, we wrapped our npv function inside the abs function and asked optimize to find the minimum.  Here is an example of using optimize: where low and high are scalars that indicate the range to search for an answer.   To test this out, we need to set values for bmv, cf, t, emv, tend, low, and high.  We will set low and high to some reasonable defaults. For example, this account had a negative 2.2% money weighted rate of return. Enhancing and Packaging the IRR function With numerical approximation methods like optimize, sometimes you will not be able to find an answer with your initial set of inputs.  To account for this, our approach was to first try to find an answer for r within a narrow range, then if we did not find an answer, try calling optimize() again with a broader range.  See the R help page on optimize()  for more details about the search range and its algorithm. At this point, we can now write a simplified version of our MWRR function.  (Our real-world version is  more sophisticated in that it calculates rate of returns for 5 different time periods [since inception, last quarter, year-to-date, last year, year before last year] in a single invocation.  In our actual customer proof, we also defined time-weighted rate of return calculations.  The beauty of R is that it was very easy to add these enhancements and additional calculations to our IRR package.)To simplify code deployment, we then created a new package of our IRR functions and sample data.  For this blog post, we only need to include our SimpleMWRR function and our SimpleMWRRData sample data.  We created the shell of the package by calling: To turn this package skeleton into something usable, at a minimum you need to edit the SimpleMWRR.Rd and SimpleMWRRData.Rd files in the \man subdirectory.  In those files, you need to at least provide a value for the “title” section. Once that is done, you can change directory to the IRR directory and type at the command-line: The myIRR package for this blog post (which has both SimpleMWRR source and SimpleMWRRData sample data) is downloadable from here: myIRR package Testing the myIRR package Here is an example of testing our IRR function once it was converted to an installable package: Calculating IRR for All the Accounts So far, we have shown how to calculate IRR for a single account.  The real-world issue is how do you calculate IRR for all of the accounts?This is the kind of situation where we can leverage the “Split-Apply-Combine” approach (see http://www.cscs.umich.edu/~crshalizi/weblog/815.html).  Given that our sample data can fit in memory, one easy approach is to use R’s “by” function.  (Other approaches to Split-Apply-Combine such as plyr can also be used.  See http://4dpiecharts.com/2011/12/16/a-quick-primer-on-split-apply-combine-problems/). Here is an example showing the use of “by” to calculate the money weighted rate of return for each account in our sample data set.  Recap and Next Steps At this point, you’ve seen the power of R being used to calculate IRR.  There were several good things: R could easily work with the spreadsheets of sample data we were given R’s optimize() function provided a nice way to solve for IRR- it was both fast and allowed us to avoid having to code our own iterative approximation algorithm R was a convenient language to express the customer-specific variations, business-rules, and exceptions that often occur in real-world calculations- these could be easily added to our IRR functions The Split-Apply-Combine technique can be used to perform calculations of IRR for multiple accounts at once. However, there are several challenges yet to be conquered at this point in our story: The actual data that needs to be used lives in a database, not in a spreadsheet The actual data is much, much bigger- too big to fit into the normal R memory space and too big to want to move across the network The overall process needs to run fast- much faster than a single processor The actual data needs to be kept secured- another reason to not want to move it from the database and across the network And the process of calculating the IRR needs to be integrated together with other database ETL activities, so that IRR’s can be calculated as part of the data warehouse refresh processes In our next blog post in this series, we will show you how Oracle R Enterprise solved these challenges.

    Read the article

  • How to Archive, Search, and View Your Tweet Statistics with ThinkUp

    - by YatriTrivedi
    Worried about archiving your tweets? Want a more powerful search? Want to see your tweet statistics? You can do all of that and more by installing ThinkUp on your home server. ThinkUp is a brilliant application (currently in beta) that will archive all of your tweets, your replies, responses, etc. so that you can search through them and find out some helpful usage statistics. It has quite a few plugins, including one that adds full Facebook support, too. It’s designed to be installed on a LAMP server; that is, Linux, Apache, MySQL, and PHP is what will provide the backbone for it. While it’s possible to install it on a Windows- or Mac-based machine, it’s most easily handled in Linux, so we’ll be using Ubuntu to show you how to get it up and running. It’s in very active development by the founder, Gina Trapani, and by many users in the community Latest Features How-To Geek ETC How to Recover that Photo, Picture or File You Deleted Accidentally How To Colorize Black and White Vintage Photographs in Photoshop How To Get SSH Command-Line Access to Windows 7 Using Cygwin The How-To Geek Video Guide to Using Windows 7 Speech Recognition How To Create Your Own Custom ASCII Art from Any Image How To Process Camera Raw Without Paying for Adobe Photoshop What is the Internet? From the Today Show January 1994 [Historical Video] Take Screenshots and Edit Them in Chrome and Iron Using Aviary Screen Capture Run Android 3.0 on a Hacked Nook Google Art Project Takes You Inside World Famous Museums Emerald Waves and Moody Skies Wallpaper Change Your MAC Address to Avoid Free Internet Restrictions

    Read the article

  • "Siebel2FusionCRM Integration" solution by ec4u (D)

    - by Richard Lefebvre
    ec4u, a CRM System Integration leader based in Germany and Switzerland, and an historical Oracle/Siebel partner, offers a complete "Siebel2FusionCRM Integration" solution, based on tools methodology and services. ec4u Siebel2FusionCRM Integration solution's main objectives are: Integration between Siebel (on-premise) and Fusion CRM / Marketing (“in the cloud”) Accounts, Contacts and Addresses are maintained by Sales in Siebel CRM and synchronized in real-time into Fusion CRM / Marketing CDM Processing ensures clean data for marketing campaigns (validation and deduplication) Create E-Mail marketing campaigns and newsletters in Fusion The solution features: Upsert processes figure out what information needs to be updated, inserted or terminated (deleted). However, as Siebel is the data master, it is still a one-way synchronization. Handle deleted or nullified information by terminating them in Fusion CRM (set start and end date to define the validity period) Initial load and real-time synchronization use the same processes Invocations/Operations can be repeated due to no transactional support from Fusion web services Tagging sub entries in case of 1 to N mapping (Example: Telephone number is one simple field in Siebel but in Fusion you can have multiple telephone numbers in a sub table) E-Mail-Notification in case of any error (containing error message, instance number, detailed payload) Schematron Validation Interested? Looking for more details or a partnership with ec4u for a "Siebel2FusionCRM Integration" project? Contact: Gregor Bublitz, Director Expert Services ([email protected])

    Read the article

  • A quick hello to the Western Kentucky .NET User Group

    - by Muljadi Budiman
    A few days back, I got a chance to speak at the Western Kentucky .NET User Group meeting in Murray, Kentucky.  The opportunity came up because the original speaker, Jeff Blankenburg, had another obligation and was thus unable to come to this meeting.  I volunteered to deliver his presentation, which is an overview of MIX10 conference. It was a great experience for me; got to drive around and do a little bit of sight-seeing – can’t say I’ve ever been to Kentucky before, so first trip ever there.  I got to meet the user group’s current lead, Tom Turner and got to chat and discuss about all kinds of stuff with the other members.  Cheers to Matt Gawarecki and Brandon Sharp! The presentation itself mostly covers new features in Visual Studio 2010, which was recently released on April 12 – got to demonstrate Historical Debugging in IntelliTrace, Parallel Stacks, View Call Hierarchy and show some Extensions.  We also covered some of the new functionalities in Silverlight 4 (using webcams, drag & drop support among others) and I got to show off Scott Guthrie’s Windows Phone 7 Twitter app.  Altogether, it was quite a bit to cover in 70 minutes or so, but I think everyone enjoyed it. Jeff provided me with the presentation slides (which I modify a bit) and demo applications; so I’m putting it up here for those that may be interested in downloading them.  Please keep in mind that all the demos were made with VS2010 RC, so there may be slight tweaks to get it to work on the RTM version.

    Read the article

  • pros-cons of separate hosting accounts versus using addon domain

    - by hen3ry
    Folks: For historical reasons, I have "Site A" on "Hosting Account A", and "Site B" on "Account B", totally independent accounts with the same vendor, Bluehost. Both are primary domains. Now that Hosting Account B is just about to expire, I'm considering letting it disappear and moving Site B to an Addon domain on "Account A". Both sites are non-commercial, narrow-interest, very-low-traffic, hundreds of page views per month. The file weights for the sites are non-trivial, especially as I like to install specialized CMSs in subdomains. Since Bluehost allows unlimited hosting space there should be no issue with the file load, except I've seen hints of an issue with total file count, maybe 50k files -- which I'm not currently close to hitting, but might eventually. My question: what are the pros and cons of using separate accounts versus hosting Site B as an addon domain? Obviously, using a single account is cheaper by half, and I know that my authoring environment (DreamWeaver CS5) complains when it detects nested source trees, telling me "Synchronization" might fail in such cases, but I don't depend on this feature. What other factors should I consider? TIA

    Read the article

  • Source code versioning with comments (organizational practice) - leave or remove?

    - by ADTC
    Before you start admonishing me with "DON'T DO IT," "BAD PRACTICE!" and "Learn to use proper source code control", please hear me out first. I am fully aware that the practice of commenting out old code and leaving it there forever is very bad and I hate such practice myself. But here's the situation I'm in. A few months ago I joined a company as software developer. I had worked in the company for few months as an intern, about a year before joining recently. Our company uses source code version control (CVS) but not properly. Here's what happened both in my internship and my current permanent position. Each time I was assigned to work on a project (legacy, about 8-10 years old). Instead of creating a CVS account and letting me check out code and check in changes, a senior colleague exported the code from CVS, zipped it up and passed it to me. While this colleague checks in all changes in bulk every few weeks, our usual practice is to do fine-grained versioning in the actual source code itself (each file increments in versions independent from the rest). Whenever a change is made to a file, old code is commented out, new code entered below it, and this whole section is marked with a version number. Finally a note about the changes is placed at the top of the file in a section called Modification History. Finally the changed files are placed in a shared folder, ready and waiting for the bulk check-in. /* * Copyright notice blah blah * Some details about file (project name, file name etc) * Modification History: * Date Version Modified By Description * 2012-10-15 1.0 Joey Initial creation * 2012-10-22 1.1 Chandler Replaced old code with new code */ code .... //v1.1 start //old code new code //v1.1 end code .... Now the problem is this. In the project I'm working on, I needed to copy some new source code files from another project (new in the sense that they didn't exist in destination project before). These files have a lot of historical commented out code and comment-based versioning including usually long or very long Modification History section. Since the files are new to this project I decided to clean them up and remove unnecessary code including historical code, and start fresh at version 1.0. (I still have to continue the practice of comment-based versioning despite hating it. And don't ask why not start at version 0.1...) I have done similar something during my internship and no one said anything. My supervisor has seen the work a few times and didn't say I shouldn't do such clean-up (if at all it was noticed). But a same-level colleague saw this and said it's not recommended as it may cause downtime in the future and increase maintenance costs. An example is when changes are made in another project on the original files and these changes need to be propagated to this project. With code files drastically different, it could cause confusion to an employee doing the propagation. It makes sense to me, and is a valid point. I couldn't find any reason to do my clean-up other than the inconvenience of a ridiculously messy code. So, long story short: Given the practice in our company, should I not do such clean-up when copying new files from project to project? Is it better to make changes on the (copy of) original code with full history in comments? Or what justification can I give for doing the clean-up? PS to mods: Hope you allow this question some time even if for any reason you determine it to be unfit in SO. I apologize in advance if anything is inappropriate including tags.

    Read the article

  • A terminal emulator for ex-Windows users

    - by Dan
    There are several things I would like to be better in Ubuntu Terminal Emulator. coloring, like in the source code Copy and paste keyboard shortcuts that I used all the time in Windows: Ctrl-C and Ctrl-V (Most of people here in Ubuntu use Ctrl+C and Ctrl+V keyboard shortcuts to copy and paste everywhere except the terminal! I think it's annoying for newcomers, and I don't worry about historical reasons) A feature to save all the output to log file UPDATE: Can the terminal be a powerful feature-full user-friendly tool like a modern IDE? The Linux user can spend 30% of time in the terminal. Programmers no longer code in a notepad. Can I see the history pane? Suggestions? Directory pane? Commands list? Search for words in an output? Contextual behavior? "Search in Google" for a mouse right-click. Tips and tricks learning? Time is money! Please, people, give me a link to the 21st - century terminal.

    Read the article

  • OpenOffice Calc: How can I count the number of different items with data pilot?

    - by manu
    Hi all, I have a rather long spreadsheet with historical information of issues solved by some user on a colaborative environment. The spreadsheet have the following (relevant) columns date, week no., project, author id, etc... The week no. is calculated from the date, is basically the year concatenated with the week number within that year; for instance, both 2009-02-18 and 2009-02-20 yield the week number 200908 - the 8th week of year 2009; and 2009-02-23 yields 200909 - the 9th week of year 2009. I need to count how many different users (given by author id) contributed to some project, on a weekly basis. I have setup a data pilot with the week as Row Field, the project as the Column Field, and count-author as the Data Field. However, this counts the author id as different instances. This is not what I need. I need to count how many different users contributed to each project on a weekly basis. I expect to get something like: projects week Project1 Project2 Project3 200901 10 2 200902 2 7 Each inner cell containing how many different users contributed. With the count-author configuration, what I get is how many contributions (total) got the project on that week. Is there a way to tell OpenOffice Calc to do what I want?

    Read the article

  • Is it recommended to use more than one language at a startup?

    - by GoofyBall
    I work for a mobile startup where, for historical reasons, our chosen language was C#. I was recently assigned to a small project to build a tool that would be used by us internally. When I explained my intention to use Python to build this tool I was heavily criticized for this because introducing new languages, and technologies (Debian, Apache, Python and Django) into our ecosystem would make it harder for other developers to maintain (because only two other people know more than one language besides C#). I countered that this project would take far longer to develop in C# (which I think is an inherent problem with the language/.NET framework) and that the project was small and designed to solve a very particular problem. Of course it is necessary that the ecosystem be as a homogeneous as possible but if your are developing tooling, infrastructure, and internal systems when there are better things to build them with than C# then you should consider using them. By using one language you exclude a lot of other great libraries and frameworks out there, and this case it was the difference between taking one week to build in Python as opposed to a month in C#. Do you think it is acceptable to understand and use only only one language at a startup or even a larger company? Am I perhaps being naive??

    Read the article

  • I need an approach to the problem of preventing inserting duplicate records into the database

    - by Maurice
    Apologies is this question is asked on the incorrect "stack" A webservice that I call returns a list of data. The data from the webservice is updated periodically, so a call to the webservice done in one hour could return the same data as a call done in an hour. Also, the data is returned based on a start and end date. We have multiple users that can run the webservice search, and duplicate data is most likely to be returned (especially for historical data). However I don't want to insert this duplicate data in the database. I've created a db table in which the data is stored (most important columns are) Id int autoincrement PK Date date not null --The date to which the data set belongs. LastUpdate date not null --The date the data set was last updated. UserName varchar(50) --The name of the user doing the search. I use sql server 2008 express with c# 4.0 and visual studio 2010. Entity Framework is used as the ORM. If stored procedures could be avoided in the proposed solution, then that will be a plus. Another way of looking interpreting what I'm asking a solution for is as follows: I have a million unique records in my table. A user does a new search. The search results from the user contains around 300k of the data that is already in the db. An efficient solution to finding an inserting only the unique records is needed.

    Read the article

< Previous Page | 68 69 70 71 72 73 74 75 76 77 78 79  | Next Page >