Search Results

Search found 148 results on 6 pages for 'michel grootjans'.

Page 1/6 | 1 2 3 4 5 6  | Next Page >

  • Casse-têtes en C#, un article de Jon Skeet sur les pièges du langage, traduit par Jean-Michel Ormes

    Cette discussion est destinée à recueillir vos commentaires sur l'article Casse-têtes en C# (traduction de l'article C# Brainteasers de Jon Skeet) Citation: Régulièrement, je tombe sur une situation intéressante en C# qui donne des résultats surprenants. Cette page contient une liste d'exemples. Dans les exemples où il n'y a qu'un bout de code, nous supposerons que celui-ci est dans la m...

    Read the article

  • The theory of evolution applied to software

    - by Michel Grootjans
    I recently realized the many parallels you can draw between the theory of evolution and evolving software. Evolution is not the proverbial million monkeys typing on a million typewriters, where one of them comes up with the complete works of Shakespeare. We would have noticed by now, since the proverbial monkeys are now blogging on the Internet ;-) One of the main ideas of the theory of evolution is the balance between random mutations and natural selection. Random mutations happen all the time: millions of mutations over millions of years. Most of them are totally useless. Some of them are beneficial to the evolved species. Natural selection favors the beneficially mutated species. Less beneficial mutations die off. The mutated rabbit doesn't have to be faster than the fox. It just has to be faster than the other rabbits.   Theory of evolution Evolving software Random mutations happen all the time. Most of these mutations are so bad, the new species dies off, or cannot reproduce. Developers write new code all the time. New ideas come up during the act of writing software. The really bad ones don't get past the stage of idea. The bad ones don't get committed to source control. Natural selection favors the beneficial mutated species Good ideas and new code gets discussed in group during informal peer review. Less than good code gets refactored. Enhanced code makes it more readable, maintainable... A good set of traits makes the species superior to others. It becomes widespread A good design tends to make it easier to add new features, easier to understand the current implementations, easier to optimize for performance...thus superior. The best designs get carried over from project to project. They appear in blogs, articles and books about principles, patterns and practices.   Of course the act of writing software is deliberate. This can hardly be called random mutations. Though it sometimes might seem that code evolves through a will of its own ;-) Does this mean that evolving software (evolution) is better than a big design up front (creationism)? Not necessarily. It's a false idea to think that a project starts from scratch and everything evolves from there. Everyone carries his experience of what works and what doesn't. Up front design is necessary, but is best kept simple and minimal, just enough to get you started. Let the good experiences and ideas help to drive the process, whether they come from you or from others, from past experience or from the most junior developer on your team. Once again, balance is the keyword. Balance design up front with evolution on a daily basis. How do you know what balance is right? Through your own experience of what worked and what didn't (here's evolution again). Notes: The evolution of software can quickly degenerate without discipline. TDD is a discipline that leaves little to chance on that part. Write your test to describe the new behavior. Write just enough code to make it behave as specified. Refactor to evolve the code to a higher standard. The responsibility of good design rests continuously on each developers' shoulders. Promiscuous pair programming helps quickly spreading the design to the whole team.

    Read the article

  • Disable pasting in a textbox using jQuery

    - by Michel Grootjans
    I had fun writing this one My current client asked me to allow users to paste text into textboxes/textareas, but that the pasted text should be cleaned from '<...>' tags. Here's what we came up with: $(":input").bind('paste', function(e) { var el = $(this); setTimeout(function() { var text = $(el).val(); $(el).val(text.replace(/<(.*?)>/gi, '')); }, 100); }) ; This is so simple, I'm amazed. The first part just binds a function to the paste operation applied to any input  declared on the page. $(":input").bind('paste', function(e) {...}); In the first line, I just capture the element. Then wait for 100ms setTimeout(function() {....}, 100); then get the actual value from the textbox, and replace it with a regular expression that basically means replace everything that looks like '<{0}>' with ''. gi at the end are regex arguments in javascript. /<(.*?)>/gi

    Read the article

  • Cross domain javascript form filling, reverse proxy

    - by Michel van Engelen
    I need a javascript form filler that can bypass the 'same origin policy' most modern browsers implement. I made a script that opens the desired website/form in a new browser. With the handler, returned by the window.open method, I want to retrieve the inputs with theWindowHandler.document.getElementById('inputx') and fill them (access denied). Is it possible to solve this problem by using Isapi Rewrite (official site) in IIS 6 acting like a reverse proxy? If so, how would I configure the reverse proxy? This is how far I got: RewriteEngine on RewriteLogLevel 9 LogLevel debug RewriteRule CarChecker https://the.actualcarchecker.com/CheckCar.aspx$1 [NC,P] The rewrite works, http://ourcompany.com/ourapplication/CarChecker, as evident in the logging. From within our companysite I can run the carchecker as if it was in our own domain. Except, the 'same origin policy' is still in force. Regards, Michel

    Read the article

  • Allow and restrict remote sql server access

    - by Michel
    Hi, I want to expose my sql server instance via the internet. I've been programming asp.net to sql server for a long time, but for the first time i'm hosting the sql server myself instead of the clients server. So what i want to do is move my sql server from my dev machine at home to a virtual server (yet to hire). But of course i don't want anyone to just enter my sql server but just a few persons. So what i was thinking was to allow only a few ip addresses to the sql server instance. Can anyone tell me how i can expose my sql server to the internet and limit the access to the instance to only a few ip addresses? And ehm, if you know even better ways to secure it, i'd be happy, because this is the first time for me :) Michel

    Read the article

  • Trace redirect loop

    - by Michel Krämer
    I have a large PHP application. After I changed some settings I get a redirection loop (i.e. the browser is redirected to the same page over and over again). The problem is that I don't know which command (which line in which PHP file) in this application causes the redirect. Is there a way to trace calls to the header() function? Or - even better - is there a way to trace redirects in PHP? Thanks in advance, Michel

    Read the article

  • speed: XmlTextReader vs LinqtoXml

    - by Michel
    Hi, i'm about to read some xml (who isn't :-)) This time however it's a lot of data: about 30.000 records with 5 properties, all in one file. Till now i've always read that the XmlTextReader is the fastest way to read xml data, but now there also is the (nice sytax of) linqtoXml. Does anybody know any performance issues, or that there aren't any, with LinqToXml? Michel

    Read the article

  • how to implement IOC without a global static service?

    - by Michel
    Hi, we want to use Unity for IOC. All i've seen is the implementation that there is one global static service which holds a reference to the Unity container, which registers all interface/class combinations and every class asks that object: give me an implementation for Ithis or IThat. Frequently i see a response that this pattern is not good because it leads to a dependency from ALL classes to this service. But what i don't see often, is: what is the alternative way? Michel

    Read the article

  • How to keep my functions (objects/methods) 'lean and mean'

    - by Michel
    Hi, in all (Agile) articles i read about this: keep your code and functions small and easy to test. How should i do this with the 'controller' or 'coordinator' class? In my situation, i have to import data. In the end i have one object who coordinates this, and i was wondering if there is a way of keeping the coordinator lean(er) and mean(er). My coordinator now does the followling (pseudocode) //Write to the log that the import has started Log.StartImport() //Get the data in Excel sheet format result = new Downloader().GetExcelFile() //Log this step Log.LogStep(result ) //convert the data to intern objects result = new Converter().Convertdata(result); //Log this step Log.LogStep(result ) //write the data result = Repository.SaveData(result); //Log this step Log.LogStep(result ) Imho, this is one of those 'know all' classes or at least one being 'not lean and mean'? Or, am i taking this lean and mean thing to far and is it impossible to program an import without some kind of 'fat' importer/coordinator? Michel

    Read the article

  • How about the Asp.net processes and threads and apppools?

    - by Michel
    Hi, as i understand, when i load a asp.net .aspx page on the (iis)server, it's processed via the w3p.exe process. But when iis gets multiple requests, are they all processed by the same w3p process? And does this process automaticly use all my processors and cores? And after that: when i start i new thread in my page, this thread still works when the pages is already served to the client. Where does this thread live? also in the w3p.exe process? And what if i assign another apppool to my site, what does that do? Michel

    Read the article

  • Entity framework (1): implement 1 foreign key to multiple tables

    - by Michel
    Hi, i've modeled this: i have an import table, and an import steps table import 1 .. N importsteps Now i have a table importparams, which hold key/value pairs to register all kind of info about the import or the importsteps. So i have modeled a FK in SqlServer which points to the PK of the import table and to the PK of the importsteps table (the ID's for both the import as the importsteps table are guids, so i can query the importparams with either the id from import or from importsteps and get the right importparams). Makes sense a bit? But how can i model this in the EF? I can see it's a bit hard for the EF to model this, because one realtion can point to multiple classes, but is there a way? The workaround normally is just to get all importparams where FK is the ID, but as you know the FK is not available in the EF version 1. I hope you can help me out, michel

    Read the article

  • how to implement IOC without a global static service (non-service locator solution)?

    - by Michel
    Hi, we want to use Unity for IOC. All i've seen is the implementation that there is one global static service which holds a reference to the Unity container, which registers all interface/class combinations and every class asks that object: give me an implementation for Ithis or IThat. Frequently i see a response that this pattern is not good because it leads to a dependency from ALL classes to this service. But what i don't see often, is: what is the alternative way? Michel EDIT: found out that the global static service is called the service locator, added that to the title.

    Read the article

  • automatic enter login credentials while testing my app (from visual studio 2008)

    - by Michel
    Hi all, this is something that i don't want to program, but i was looking for a handy way of logging on to my web app. i'm building (and testing and running) my webapp over and over again, and here i've been provided a strong password. needless to say it's not so nice to enter my full user name + strong password 30 times a day. is there a nifty tool which lives in the background and when i open page localhost/mytestpage.aspx, it will say: "hey, let me type in michel and sdfs%^%gfhg in these two textboxes"?

    Read the article

  • Predicting Likelihood of Click with Multiple Presentations

    - by Michel Adar
    When using predictive models to predict the likelihood of an ad or a banner to be clicked on it is common to ignore the fact that the same content may have been presented in the past to the same visitor. While the error may be small if the visitors do not often see repeated content, it may be very significant for sites where visitors come repeatedly. This is a well recognized problem that usually gets handled with presentation thresholds – do not present the same content more than 6 times. Observations and measurements of visitor behavior provide evidence that something better is needed. Observations For a specific visitor, during a single session, for a banner in a not too prominent space, the second presentation of the same content is more likely to be clicked on than the first presentation. The difference can be 30% to 100% higher likelihood for the second presentation when compared to the first. That is, for example, if the first presentation has an average click rate of 1%, the second presentation may have an average CTR of between 1.3% and 2%. After the second presentation the CTR stays more or less the same for a few more presentations. The number of presentations in this plateau seems to vary by the location of the content in the page and by the visual attraction of the content. After these few presentations the CTR starts decaying with a curve that is very well approximated by an exponential decay. For example, the 13th presentation may have 90% the likelihood of the 12th, and the 14th has 90% the likelihood of the 13th. The decay constant seems also to depend on the visibility of the content. Modeling Options Now that we know the empirical data, we can propose modeling techniques that will correctly predict the likelihood of a click. Use presentation number as an input to the predictive model Probably the most straight forward approach is to add the presentation number as an input to the predictive model. While this is certainly a simple solution, it carries with it several problems, among them: If the model learns on each case, repeated non-clicks for the same content will reinforce the belief of the model on the non-clicker disproportionately. That is, the weight of a person that does not click for 200 presentations of an offer may be the same as 100 other people that on average click on the second presentation. The effect of the presentation number is not a customer characteristic or a piece of contextual data about the interaction with the customer, but it is contextual data about the content presented. Models tend to underestimate the effect of the presentation number. For these reasons it is not advisable to use this approach when the average number of presentations of the same content to the same person is above 3, or when there are cases of having the presentation number be very large, in the tens or hundreds. Use presentation number as a partitioning attribute to the predictive model In this approach we essentially build a separate predictive model for each presentation number. This approach overcomes all of the problems in the previous approach, nevertheless, it can be applied only when the volume of data is large enough to have these very specific sub-models converge.

    Read the article

  • How to test a localized WPF application in visual studio 2012

    - by Michel Keijzers
    I am trying to create a localized application in C# / WPF in Visual Studio 2012. For that I created two resource files and changed one string in a (XAML) window to use the resource files (instead of a hardcoded string). I see the English text from the resource file, which is correct. However, I want to check if the other resource file (fr-FR) also works but I cannot find a setting or procedure how to change my 'project' to run in French. Thanks in advance.

    Read the article

  • The softer side of BPM

    - by [email protected]
    BPM and RTD are great complementary technologies that together provide a much higher benefit than each of them separately. BPM covers the need for automating processes, making sure that there is uniformity, that rules and regulations are complied with and that the process runs smoothly and quickly processes the units flowing through it. By nature, this automation and unification can lead to a stricter, less flexible process. To avoid this problem it is common to encounter process definition that include multiple conditional branches and human input to help direct processing in the direction that best applies to the current situation. This is where RTD comes into play. The selection of branches and conditions and the optimization of decisions is better left in the hands of a system that can measure the results of its decisions in a closed loop fashion and make decisions based on the empirical knowledge accumulated through observing the running of the process.When designing a business process there are key places in which it may be beneficial to introduce RTD decisions. These are:Thresholds - whenever a threshold is used to determine the processing of a unit, there may be an opportunity to make the threshold "softer" by introducing an RTD decision based on predicted results. For example an insurance company process may have a total claim threshold to initiate an investigation. Instead of having that threshold, RTD could be used to help determine what claims to investigate based on the likelihood they are fraudulent, cost of investigation and effect on processing time.Human decisions - sometimes a process will let the human participants make decisions of flow. For example, a call center process may leave the escalation decision to the agent. While this has flexibility, it may produce undesired results and asymetry in customer treatment that is not based on objective functions but subjective reasoning by the agent. Instead, an RTD decision may be introduced to recommend escalation or other kinds of treatments.Content Selection - a process may include the use of messaging with customers. The selection of the most appropriate message to the customer given the content can be optimized with RTD.A/B Testing - a process may have optional paths for which it is not clear what populations they work better for. Rather than making the arbitrary selection or selection by committee of the option deeped the best, RTD can be introduced to dynamically determine the best path for each unit.In summary, RTD can be used to make BPM based process automation more dynamic and adaptable to the different situations encountered in processing. Effectively making the automation softer, less rigid in its processing.

    Read the article

  • Logparser and Powershell

    - by Michel Klomp
    Logparser in powershell One of the few examples how to use logparser in powershell is from the Microsoft.com Operations blog. This script is a good base to create more advanced logparser scripts: $myQuery = new-object -com MSUtil.LogQuery $szQuery = “Select top 10 * from r:\ex07011210.log”; $recordSet = $myQuery.Execute($szQuery) for(; !$recordSet.atEnd(); $recordSet.moveNext()) {             $record=$recordSet.getRecord();             write-host ($record.GetValue(0) + “,”+ $record.GetValue(1)); } $recordSet.Close(); Logparser input formats The previous example uses the default logparser object, you can extent this with the logparser input formats. with this formats get information from the event-log, different types of logfiles, the Active Directory, the registry and XML files. Here are the different ProgId’s you can use. Input Format ProgId ADS MSUtil.LogQuery.ADSInputFormat BIN MSUtil.LogQuery.IISBINInputFormat CSV MSUtil.LogQuery.CSVInputFormat ETW MSUtil.LogQuery.ETWInputFormat EVT MSUtil.LogQuery.EventLogInputFormat FS MSUtil.LogQuery.FileSystemInputFormat HTTPERR MSUtil.LogQuery.HttpErrorInputFormat IIS MSUtil.LogQuery.IISIISInputFormat IISODBC MSUtil.LogQuery.IISODBCInputFormat IISW3C MSUtil.LogQuery.IISW3CInputFormat NCSA MSUtil.LogQuery.IISNCSAInputFormat NETMON MSUtil.LogQuery.NetMonInputFormat REG MSUtil.LogQuery.RegistryInputFormat TEXTLINE MSUtil.LogQuery.TextLineInputFormat TEXTWORD MSUtil.LogQuery.TextWordInputFormat TSV MSUtil.LogQuery.TSVInputFormat URLSCAN MSUtil.LogQuery.URLScanLogInputFormat W3C MSUtil.LogQuery.W3CInputFormat XML MSUtil.LogQuery.XMLInputFormat Using logparser to parse IIS logs if you use the IISW3CinputFormat you can use the field names instead of de row number to get the information from an IIS logfile, it also skips the comment rows in the logfile. $ObjLogparser = new-object -com MSUtil.LogQuery $objInputFormat = new-object -com MSUtil.LogQuery.IISW3CInputFormat $Query = “Select top 10 * from c:\temp\hb\ex071002.log”; $recordSet = $ObjLogparser.Execute($Query, $objInputFormat) for(; !$recordSet.atEnd(); $recordSet.moveNext()) {     $record=$recordSet.getRecord();     write-host ($record.GetValue(“s-ip”) + “,”+ $record.GetValue(“cs-uri-query”)); } $recordSet.Close();

    Read the article

  • Ignoring Robots - Or Better Yet, Counting Them Separately

    - by [email protected]
    It is quite common to have web sessions that are undesirable from the point of view of analytics. For example, when there are either internal or external robots that check the site's health, index it or just extract information from it. These robotic session do not behave like humans and if their volume is high enough they can sway the statistics and models.One easy way to deal with these sessions is to define a partitioning variable for all the models that is a flag indicating whether the session is "Normal" or "Robot". Then all the reports and the predictions can use the "Normal" partition, while the counts and statistics for Robots are still available.In order for this to work, though, it is necessary to have two conditions:1. It is possible to identify the Robotic sessions.2. No learning happens before the identification of the session as a robot.The first point is obvious, but the second may require some explanation. While the default in RTD is to learn at the end of the session, it is possible to learn in any entry point. This is a setting for each model. There are various reasons to learn in a specific entry point, for example if there is a desire to capture exactly and precisely the data in the session at the time the event happened as opposed to including changes to the end of the session.In any case, if RTD has already learned on the session before the identification of a robot was done there is no way to retract this learning.Identifying the robotic sessions can be done through the use of rules and heuristics. For example we may use some of the following:Maintain a list of known robotic IPs or domainsDetect very long sessions, lasting more than a few hours or visiting more than 500 pagesDetect "robotic" behaviors like a methodic click on all the link of every pageDetect a session with 10 pages clicked at exactly 20 second intervalsDetect extensive non-linear navigationNow, an interesting experiment would be to use the flag above as an output of a model to see if there are more subtle characteristics of robots such that a model can be used to detect robots, even if they fall through the cracks of rules and heuristics.In any case, the basic and simple technique of partitioning the models by the type of session is simple to implement and provides a lot of advantages.

    Read the article

  • Is RTD Stateless or Stateful?

    - by [email protected]
    Yes.   A stateless service is one where each request is an independent transaction that can be processed by any of the servers in a cluster.  A stateful service is one where state is kept in a server's memory from transaction to transaction, thus necessitating the proper routing of requests to the right server. The main advantage of stateless systems is simplicity of design. The main advantage of stateful systems is performance. I'm often asked whether RTD is a stateless or stateful service, so I wanted to clarify this issue in depth so that RTD's architecture will be properly understood. The short answer is: "RTD can be configured as a stateless or stateful service." The performance difference between stateless and stateful systems can be very significant, and while in a call center implementation it may be reasonable to use a pure stateless configuration, a web implementation that produces thousands of requests per second is practically impossible with a stateless configuration. RTD's performance is orders of magnitude better than most competing systems. RTD was architected from the ground up to achieve this performance. Features like automatic and dynamic compression of prediction models, automatic translation of metadata to machine code, lack of interpreted languages, and separation of model building from decisioning contribute to achieving this performance level. Because  of this focus on performance we decided to have RTD's default configuration work in a stateful manner. By being stateful RTD requests are typically handled in a few milliseconds when repeated requests come to the same session. Now, those readers that have participated in implementations of RTD know that RTD's architecture is also focused on reducing Total Cost of Ownership (TCO) with features like automatic model building, automatic time windows, automatic maintenance of database tables, automatic evaluation of data mining models, automatic management of models partitioned by channel, geography, etcetera, and hot swapping of configurations. How do you reconcile the need for a low TCO and the need for performance? How do you get the performance of a stateful system with the simplicity of a stateless system? The answer is that you make the system behave like a stateless system to the exterior, but you let it automatically take advantage of situations where being stateful is better. For example, one of the advantages of stateless systems is that you can route a message to any server in a cluster, without worrying about sending it to the same server that was handling the session in previous messages. With an RTD stateful configuration you can still route the message to any server in the cluster, so from the point of view of the configuration of other systems, it is the same as a stateless service. The difference though comes in performance, because if the message arrives to the right server, RTD can serve it without any external access to the session's state, thus tremendously reducing processing time. In typical implementations it is not rare to have high percentages of messages routed directly to the right server, while those that are not, are easily handled by forwarding the messages to the right server. This architecture usually provides the best of both worlds with performance and simplicity of configuration.   Configuring RTD as a pure stateless service A pure stateless configuration requires session data to be persisted at the end of handling each and every message and reloading that data at the beginning of handling any new message. This is of course, the root of the inefficiency of these configurations. This is also the reason why many "stateless" implementations actually do keep state to take advantage of a request coming back to the same server. Nevertheless, if the implementation requires a pure stateless decision service, this is easy to configure in RTD. The way to do it is: Mark every Integration Point to Close the session at the end of processing the message In the Session entity persist the session data on closing the session In the session entity check if a persisted version exists and load it An excellent solution for persisting the session data is Oracle Coherence, which provides a high performance, distributed cache that minimizes the performance impact of persisting and reloading the session. Alternatively, the session can be persisted to a local database. An interesting feature of the RTD stateless configuration is that it can cope with serializing concurrent requests for the same session. For example, if a web page produces two requests to the decision service, these requests could come concurrently to the decision services and be handled by different servers. Most stateless implementation would have the two requests step onto each other when saving the state, or fail one of the messages. When properly configured, RTD will make one message wait for the other before processing.   A Word on Context Using the context of a customer interaction typically significantly increases lift. For example, offer success in a call center could double if the context of the call is taken into account. For this reason, it is important to utilize the contextual information in decision making. To make the contextual information available throughout a session it needs to be persisted. When there is a well defined owner for the information then there is no problem because in case of a session restart, the information can be easily retrieved. If there is no official owner of the information, then RTD can be configured to persist this information.   Once again, RTD provides flexibility to ensure high performance when it is adequate to allow for some loss of state in the rare cases of server failure. For example, in a heavy use web site that serves 1000 pages per second the navigation history may be stored in the in memory session. In such sites it is typical that there is no OLTP that stores all the navigation events, therefore if an RTD server were to fail, it would be possible for the navigation to that point to be lost (note that a new session would be immediately established in one of the other servers). In most cases the loss of this navigation information would be acceptable as it would happen rarely. If it is desired to save this information, RTD would persist it every time the visitor navigates to a new page. Note that this practice is preferred whether RTD is configured in a stateless or stateful manner.  

    Read the article

  • Ubuntu, Gnome, PAM and ecryptfs

    - by Michel
    I would like to have a directory accessible to a couple of users, and not readable by maintenance types ... I can do what I want using ecryptfs and a password known only to the "couple of users" in question, who then can mount the directory and use as they see fit. I would love to be able to automate that process and unlock the directory at login - again, only for the "couple users" in question, without asking a password. Gnome-keyring is able to store passphrases/passwords encrypted; and, apparently, if I could get a key identity to ecryptfs, Gnome PAM modules would allow the key with that identity to be unlocked, and the directory could be mounted. Alas, I have found no way to go from point A (Gnome PAM keyring module) to point B (use the unlocked key in ecryptfs). Another use of the same mechanism would allow to build a "key escrow" mechanism, where keys to encrypted volumes are safekept with, e.g., HR; so that company information in encrypted directories can be recovered if you pass under the proverbial bus.

    Read the article

  • Creating a two-way Forest trust with Powershell

    - by Michel Klomp
    Here is a small Powershell script for creating a two-way forest trust. $localforest = [System.DirectoryServices.ActiveDirectory.Forest]::getCurrentForest() $strRemoteForest = ‘domain.local’ $strRemoteUser = ‘administrator’ $strRemotePassword = ‘P@ssw0rd’ $remoteContext = New-Object System.DirectoryServices.ActiveDirectory.DirectoryContext(‘Forest’, $strRemoteForest,$strRemoteUser,$strRemotePassword) $remoteForest = [System.DirectoryServices.ActiveDirectory.Forest]::getForest($remoteContext) $localForest.CreateTrustRelationship($remoteForest,’Bidirectional’)

    Read the article

  • Tips on ensuring Model Quality

    - by [email protected]
    Given enough data that represents well the domain and models that reflect exactly the decision being optimized, models usually provide good predictions that ensure lift. Nevertheless, sometimes the modeling situation is less than ideal. In this blog entry we explore the problems found in a few such situations and how to avoid them.1 - The Model does not reflect the problem you are trying to solveFor example, you may be trying to solve the problem: "What product should I recommend to this customer" but your model learns on the problem: "Given that a customer has acquired our products, what is the likelihood for each product". In this case the model you built may be too far of a proxy for the problem you are really trying to solve. What you could do in this case is try to build a model based on the result from recommendations of products to customers. If there is not enough data from actual recommendations, you could use a hybrid approach in which you would use the [bad] proxy model until the recommendation model converges.2 - Data is not predictive enoughIf the inputs are not correlated with the output then the models may be unable to provide good predictions. For example, if the input is the phase of the moon and the weather and the output is what car did the customer buy, there may be no correlations found. In this case you should see a low quality model.The solution in this case is to include more relevant inputs.3 - Not enough cases seenIf the data learned does not include enough cases, at least 200 positive examples for each output, then the quality of recommendations may be low. The obvious solution is to include more data records. If this is not possible, then it may be possible to build a model based on the characteristics of the output choices rather than the choices themselves. For example, instead of using products as output, use the product category, price and brand name, and then combine these models.4 - Output leaking into input giving the false impression of good quality modelsIf the input data in the training includes values that have changed or are available only because the output happened, then you will find some strong correlations between the input and the output, but these strong correlations do not reflect the data that you will have available at decision (prediction) time. For example, if you are building a model to predict whether a web site visitor will succeed in registering, and the input includes the variable DaysSinceRegistration, and you learn when this variable has already been set, you will probably see a big correlation between having a Zero (or one) in this variable and the fact that registration was successful.The solution is to remove these variables from the input or make sure they reflect the value as of the time of decision and not after the result is known. 

    Read the article

  • Update fails with unrecoverable dpkg fatal error

    - by Jonthue Michel
    Hello I keep receiving this error every-time i try to do some updates, I tried sudo dpkg --configure, sudo apt-get update and sudo apt-get install -f but they failed on me. installArchives() failed: (Reading database ... (Reading database ... 5% (Reading database ... 10% (Reading database ... 15% (Reading database ... 20% (Reading database ... 25% (Reading database ... 30% (Reading database ... 35% (Reading database ... 40% (Reading database ... 45% (Reading database ... 50% (Reading database ... 55%dpkg: unrecoverable fatal error, aborting: failed to read on buffer copy for files list for package `libc6-i386': Is a directory

    Read the article

1 2 3 4 5 6  | Next Page >