Search Results

Search found 666 results on 27 pages for 'disadvantages'.

Page 26/27 | < Previous Page | 22 23 24 25 26 27 | Next Page >

Alternative Grid Layout for Silverlight suggestion

- by brainbox

I've proposed a suggestion to create alternative grid layout for Silverlight. Please vote for it if also faced the same problems. As i write before current Silverlight Grid Layout breakes best practices of HTML and Adobe Flex Grid layouts. Current defention based approach have following disadvantages that makes xaml coding very hard: 1. It is very hard to create new row. In that case you should rewriteall Grid.Row and Grid.Columns for all rows inserted below.2. Defenitions are static by their nature and because of it, it isimpossible to use grid for dynamic forms. Currently even in toolkit DataFormMicrosoft is using StackPanel. But StackPanel is not designed for multicolumn layout that have dataform. Here is a sample code of AdobeFlex datagrid, which incorporates bestpractices of HTML. <mx:Grid id="myGrid">  <mx:GridRow id="row1">  <mx:GridItem> <mx:Button label="Button 1"/> </mx:GridItem>  <mx:GridItem> <mx:Button label="2"/> </mx:GridItem>  <mx:GridItem> <mx:Button label="Button 3"/> </mx:GridItem> </mx:GridRow>  <mx:GridRow id="row2">  <mx:GridItem colSpan="3" horizontalAlign="center"> <mx:Button label="Long-Named Button 4"/> </mx:GridItem> </mx:GridRow>  <mx:GridRow id="row3">  <mx:GridItem/>  <mx:GridItem colSpan="2" horizontalAlign="center"> <mx:Button label="Button 5"/> </mx:GridItem> </mx:GridRow> </mx:Grid>

Read the article
Data Source Security Part 2

- by Steve Felts

In Part 1, I introduced the default security behavior and listed the various options available to change that behavior. One of the key topics to understand is the difference between directly using database user and password values versus mapping from WLS user and password to the associated database values. The direct use of database credentials is relatively new to WLS, based on customer feedback. Some of the trade-offs are covered in this article. Credential Mapping vs. Database Credentials Each WLS data source has a credential map that is a mechanism used to map a key, in this case a WLS user, to security credentials (user and password). By default, when a user and password are specified when getting a connection, they are treated as credentials for a WLS user, validated, and are converted to a database user and password using a credential map associated with the data source. If a matching entry is not found in the credential map for the data source, then the user and password associated with the data source definition are used. Because of this defaulting mechanism, you should be careful what permissions are granted to the default user. Alternatively, you can define an invalid default user to ensure that no one can accidentally get through (in this case, you would need to set the initial capacity for the pool to zero so that the pool is populated only by valid users). To create an entry in the credential map: 1) First create a WLS user. In the administration console, go to Security realms, select your realm (e.g., myrealm), select Users, and select New. 2) Second, create the mapping. In the administration console, go to Services, select Data sources, select your data source name, select Security, select Credentials, and select New. See http://docs.oracle.com/cd/E24329_01/apirefs.1211/e24401/taskhelp/jdbc/jdbc_datasources/ConfigureCredentialMappingForADataSource.html for more information. The advantages of using the credential mapping are that: 1) You don’t hard-code the database user/password into a program or need to prompt for it in addition to the WLS user/password and 2) It provides a layer of abstraction between WLS security and database settings such that many WLS identities can be mapped to a smaller set of DB identities, thereby only requiring middle-tier configuration updates when WLS users are added/removed. You can cut down the number of users that have access to a data source to reduce the user maintenance overhead. For example, suppose that a servlet has the one pre-defined, special WLS user/password for data source access, hard-wired in its code in a getConnection(user, password) call. Every WebLogic user can reap the specific DBMS access coded into the servlet, but none has to have general access to the data source. For instance, there may be a ‘Sales’ DBMS which needs to be protected from unauthorized eyes, but it contains some day-to-day data that everyone needs. The Sales data source is configured with restricted access and a servlet is built that hard-wires the specific data source access credentials in its connection request. It uses that connection to deliver only the generally needed day-to-day information to any caller. The servlet cannot reveal any other data, and no WebLogic user can get any other access to the data source. This is the approach that many large applications take and is the reasoning behind the default mapping behavior in WLS. The disadvantages of using the credential map are that: 1) It is difficult to manage (create, update, delete) with a large number of users; it is possible to use WLST scripts or a custom JMX client utility to manage credential map entries. 2) You can’t share a credential map between data sources so they must be duplicated. Some applications prefer not to use the credential map. Instead, the credentials passed to getConnection(user, password) should be treated as database credentials and used to authenticate with the database for the connection, avoiding going through the credential map. This is enabled by setting the “use-database-credentials” to true. See http://docs.oracle.com/cd/E24329_01/apirefs.1211/e24401/taskhelp/jdbc/jdbc_datasources/ConfigureOracleParameters.html "Configure Oracle parameters" in Oracle WebLogic Server Administration Console Help. Use Database Credentials is not currently supported for Multi Data Source configurations. When enabled, it turns off credential mapping on Generic and Active GridLink data sources for the following attributes: 1. identity-based-connection-pooling-enabled (this interaction is available by patch in 10.3.6.0). 2. oracle-proxy-session (this interaction is first available in 10.3.6.0). 3. set client identifier (this interaction is available by patch in 10.3.6.0). Note that in the data source schema, the set client identifier feature is poorly named “credential-mapping-enabled”. The documentation and the console refer to it as Set Client Identifier. To review the behavior of credential mapping and using database credentials: - If using the credential map, there needs to be a mapping for each WLS user to database user for those users that will have access to the database; otherwise the default user for the data source will be used. If you always specify a user/password when getting a connection, you only need credential map entries for those specific users. - If using database credentials without specifying a user/password, the default user and password in the data source descriptor are always used. If you specify a user/password when getting a connection, that user will be used for the credentials. WLS users are not involved at all in the data source connection process.

Read the article
Qt static build with static mysql plugin confusion

- by bdiloreto

I have built a Qt application which uses the MySQL library, but I am confused by the documentation on static versus shared builds. From the Qt documentation at http://doc.qt.nokia.com/4.7/deployment-windows.html it says: To deploy plugin-based applications we should use the shared library approach. And on http://doc.qt.nokia.com/4.7/deployment.html, it says: Static linking results in a stand-alone executable. The advantage is that you will only have a few files to deploy. The disadvantages are that the executables are large and with no flexibility and that you cannot deploy plugins. To deploy plugin-based applications, you can use the shared library approach. But on http://doc.qt.nokia.com/latest/plugins-howto.html, it seems to say the opposite, giving directions on how to use static plugins: Plugins can be linked statically against your application. If you build the static version of Qt, this is the only option for including Qt's predefined plugins. Using static plugins makes the deployment less error-prone, but has the disadvantage that no functionality from plugins can be added without a complete rebuild and redistribution of the application. ... To link statically against those plugins, you need to use the Q_IMPORT_PLUGIN() macro in your application and you need to add the required plugins to your build using QTPLUGIN. I want to build the Qt libraries statically (for easy deployment) and then use the static MySQL plugin. To do this, I did NOT use the binary distrubtion for Windows. Instead, I've started with the source qt-everywhere-opensource-src-4.7.4 Is the following the correct way to do a static build so that i can use the static MySql plugin? configure -static -debug-and-release -opensource -platform win32-msvc2010 -no-qt3support -no-webkit -no-script -plugin-sql-mysql -I C:\MySQL\include -L C:\MySQL\lib This should build the Qt libraries statically AND the static plugin to be linked at run-time, correct? I would NOT need to build the Mysql Plugin from source separately, correct? If I was to subtitute "-qt-sql-mysql" for "-plugin-sql-mysql" in above, it would include the MySQL driver directly in the QT static libraries, in which case I would NOT need to use the plugin at all, correct? Thanks for making me unconfused!

Read the article
Objective-C wrapper API design methodology

- by Wade Williams

I know there's no one answer to this question, but I'd like to get people's thoughts on how they would approach the situation. I'm writing an Objective-C wrapper to a C library. My goals are: 1) The wrapper use Objective-C objects. For example, if the C API defines a parameter such as char *name, the Objective-C API should use name:(NSString *). 2) The client using the Objective-C wrapper should not have to have knowledge of the inner-workings of the C library. Speed is not really any issue. That's all easy with simple parameters. It's certainly no problem to take in an NSString and convert it to a C string to pass it to the C library. My indecision comes in when complex structures are involved. Let's say you have: struct flow { long direction; long speed; long disruption; long start; long stop; } flow_t; And then your C API call is: void setFlows(flow_t inFlows[4]); So, some of the choices are: 1) expose the flow_t structure to the client and have the Objective-C API take an array of those structures 2) build an NSArray of four NSDictionaries containing the properties and pass that as a parameter 3) create an NSArray of four "Flow" objects containing the structure's properties and pass that as a parameter My analysis of the approaches: Approach 1: Easiest. However, it doesn't meet the design goals Approach 2: For some reason, this seems to me to be the most "Objective-C" way of doing it. However, each element of the NSDictionary would have to be wrapped in an NSNumber. Now it seems like we're doing an awful lot just to pass the equivalent of a struct. Approach 3: Seems the cleanest to me from an object-oriented standpoint and the extra encapsulation could come in handy later. However, like #2, it now seems like we're doing an awful lot (creating an array, creating and initializing objects) just to pass a struct. So, the question is, how would you approach this situation? Are there other choices I'm not considering? Are there additional advantages or disadvantages to the approaches I've presented that I'm not considering?

Read the article
How to store wiki sites (vcs)

- by Eugen

Hello, as a personal project I am trying to write a wiki with the help of django. I'm a beginner when it comes to web development. I am at the (early) point where I need to decide how to store the wiki sites. I have three approaches in mind and would like to know your suggestion. Flat files I considered a flat file approach with a version control system like git or mercurial. Firstly, I would have some example wikis to look at like http://hatta.sheep.art.pl/. Secondly, the vcs would probably deal with editing conflicts and keeping the edit history, so I would not have to reinvent the wheel. And thirdly, I could probably easily clone the wiki repository, so I (or for that matter others) can have an offline copy of the wiki. On the other hand, as far as I know, I can not use django models with flat files. Then, if I wanted to add fields to a wiki site, like a category, I would need to somehow keep a reference to that flat file in order to associate the fields in the database with the flat file. Besides, I don't know if it is a good idea to have all the wiki sites in one repository. I imagine it is more natural to have kind of like a repository per wiki site resp. file. Last but not least, I'm not sure, but I think using flat files would limit my deploying capabilities because web hosts maybe don't allow creating files (I'm thinking, for example, of Google App Engine) Storing in a database By storing the wiki sites in the database I can utilize django models and associate arbitrary fields with the wiki site. I probably would also have an easier life deploying the wiki. But I would not get vcs features like history and conflict resolving per se. I searched for django-extensions to help me and I found django-reversion. However, I do not fully understand if it fit my needs. Does it track model changes like for example if I change the django model file, or does it track the content of the models (which would fit my need). Plus, I do not see if django reversion would help me with edit conflicts. Storing a vcs repository in a database field This would be my ideal solution. It would combine the advantages of both previous approaches without the disadvantages. That is; I would have vcs features but I would save the wiki sites in a database. The problem is: I have no idea how feasible that is. I just imagine saving a wiki site/source together with a git/mercurial repository in a database field. Yet, I somehow doubt database fields work like that. So, I'm open for any other approaches but this is what I came up with. Also, if you're interested, you can find the crappy early test I'm working on here http://github.com/eugenkiss/instantwiki-test

Read the article
Do you still limit line length in code?

- by Noldorin

This is a matter on which I would like to gauge the opinion of the community: Do you still limit the length of lines of code to a fixed maximum? This was certainly a convention of the past for many languages; one would typically cap the number of characters per line to a value such as 80 (and more recnetly 100 or 120 I believe). As far as I understand, the primary reasons for limiting line length are: Readability - You don't have to scroll over horizontally when you want to see the end of some lines. Printing - Admittedly (at least in my experience), most code that you are working on does not get printed out on paper, but by limiting the number of characters you can insure that formatting doesn't get messed up when printed. Past editors (?) - Not sure about this one, but I suspect that at some point in the distant past of programming, (at least some) text editors may have been based on a fixed-width buffer. I'm sure there are points that I am still missing out, so feel free to add to these... Now, when I tend to observe C or C# code nowadays, I often see a number of different styles, the main ones being: Line length capped to 80, 100, or even 120 characters. As far as I understand, 80 is the traditional length, but the longer ones of 100 and 120 have appeared because of the widespread use of high resolutions and widescreen monitors nowadays. No line length capping at all. This tends to be pretty horrible to read, and I don't see it too often, though it's certainly not too rare either. Inconsistent capping of line length. The length of some lines are limited to a fixed maximum (or even a maximum that changes depending on the file/location in code), while others (possibly comments) are not at all. My personal preference here (at least recently) has been to cap the line length to 100 in the Visual Studio editor. This means that in a decently sized window (on a non-widescreen monitor), the ends of lines are still fully visible. I can however see a few disadvantages in this, especially when you end up writing code that's indented 3 or 4 levels and then having to include a long string literal - though I often take this as a sign to refactor my code! In particular, I am curious what the C and C# coders (or anyone who uses Visual Studio for that matter) think about this point, though I would be interested in hearing anyone's thoughts on the subject. Edit Thanks for the all answers - I appreciate the variety of opinions here, all presenting sound reasons. Consensus does seem to be tipping in the direction of always (or almost always) limit the line length. Interestingly, it seems to be in various coding standards to limit the line length. Judging by some of the answers, both the Python and Google CPP guidelines set the limit at 80 chars. I haven't seen anything similar regarding C# or VB.NET, but I would be curious to see if there are ones anywhere.

Read the article
Mercurial for Beginners: The Definitive Practical Guide

- by Laz

Inspired by Git for beginners: The definitive practical guide. This is a compilation of information on using Mercurial for beginners for practical use. Beginner - a programmer who has touched source control without understanding it very well. Practical - covering situations that the majority of users often encounter - creating a repository, branching, merging, pulling/pushing from/to a remote repository, etc. Notes: Explain how to get something done rather than how something is implemented. Deal with one question per answer. Answer clearly and as concisely as possible. Edit/extend an existing answer rather than create a new answer on the same topic. Please provide a link to the the Mercurial wiki or the HG Book for people who want to learn more. Questions: Installation/Setup How to install Mercurial? How to set up Mercurial? How do you create a new project/repository? How do you configure it to ignore files? Working with the code How do you get the latest code? How do you check out code? How do you commit changes? How do you see what's uncommitted, or the status of your current codebase? How do you destroy unwanted commits? How do you compare two revisions of a file, or your current file and a previous revision? How do you see the history of revisions to a file? How do you handle binary files (visio docs, for instance, or compiler environments)? How do you merge files changed at the "same time"? Tagging, branching, releases, baselines How do you 'mark' 'tag' or 'release' a particular set of revisions for a particular set of files so you can always pull that one later? How do you pull a particular 'release'? How do you branch? How do you merge branches? How do you merge parts of one branch into another branch? Other Good GUI/IDE plugin for Mercurial? Advantages/disadvantages? Any other common tasks a beginner should know? How do I interface with Subversion? Other Mercurial references Mercurial: The Definitive Guide Mercurial Wiki Meet Mercurial | Peepcode Screencast

Read the article
Good working habits to observe in project development?

- by Will Marcouiller

As my development experience grows, I see fit to stick to best practices from here and there to build somehow my own working practices while observing the conventions, etc. I'm currently working on a project which my goals is to graduate the security access model from an environment's Active Directory to another environment's automatically. I don't know for any of you, but as far as I'm concerned, I meet some real difficulties sticking to only one way, then develop. I mean, I learn something new everyday while visiting SO, and recently wanted to get acquainted with generics. On the other hand, I better know the Façade pattern which proved to be very practical in transactional programming in process systems. This seems to be less practical for desktop application as there are plenty of variables to consider in a desktop application that you don't have to care in transactional programming, as you're playing only with information data. As for my current project, I have: Groups; Organizational Units; Users. Which are all considered an entry in the Active Directory. This points out to be a good candidate for generics, as also approached this way by Bart de Smett's Linq to AD on CodePlex. He has a DirectorySource<T>, and to manage let's say groups, then he instantiate a source with the proper type: var groups = new DirectorySource<Group>(); This seems to be very a good way of doing. Despite, I seem to go from one pattern to another and I don't seem to be able to strictly stick to one. While I'm aware that one must not stay with only one way of doing, since each pattern statisfies certain advantages, while also illustrating disadvantages under some usage conditions, I seem to want to develop with both patterns having a singleton Façade class with the underlying factories which represent the sub systems: GroupsFactory; UsersFactory; OrganizationalUnitsFactory. Each of the factories offers the possible operations for their respective entity (group, user, OU). To make a very long story short, I often have plenty of ideas while developping and this causes me some trouble, as I go from an idea to another feeling completely lost after a while. Yet I understand the advantages and disavantages, I have no trouble choosing from one pattern to another depending on the situation. Nevertheless, when it comes to programming itself, if I'm not part of a team, I feel sometimes like I can't do anything good. That is, because I can't stand not doing something "perfect" the first time. The role I play within the project is both: the project manager and the programmer. I am more comfortable in the project manager role, architectural role, analytical role than the developer's. Has any of you some good habbits to observe in project development? Thanks to you all! =)

Read the article
Adding a mini admin to a webpage.

- by DADU

Hello Picture this: you are creating a little module that people can incorporate into their website easily, for example, a little contact form. It would consist of a PHP file that outputs some HTML, a Javascript file (ajax etc.), a CSS file and a CSS skin. Now the person who doesn't know much about coding wants to integrate it on a webpage (website/index.php). We could do this with three rules of code: <link rel="stylesheet" href="module/css/module.css" /> <script src="module/js/module.js"></script> <?php require_once 'module/module.php'; ?> There's no doubt this part is questionable, right? Now when we want to add an admin for this little module, there are two options: Accessing the admin via an extra URL like website/module/admin.php and after authentication, displaying a page where the person can do all the settings. The person then goes back to index.php to see the results. Enabling the admin via an extra URL like website/module/admin.php and after authentication, redirecting back to index.php. The person can now edit the module directly (HTML5 contenteditable) and see changes live, on the webpage where everybody else will see it when the person saves the changes. Option 2 has a couple of advantages: The person doesn't have to toggle between admin and index.php. The person can see directly how it's looking at the webpage it's integrated in. The person probably feels like the module is more part of the webpage/website. Of course option 2 has some disadvantages too: Not everything works well editing it inline. The person would need to have an HTML5 compliant browser. Probably some more I can't think of right now. Now I have a few concerns that's I can't seem to see a clear answer to. How would we let the person integrate the admin on their webpage? The admin files only need to be included in index.php if the person has choosen to edit the module via the url (website/module/admin.php). But how can we do this if we have a admin.css file that belongs in the head section, an admin.php file that goes into the body, and another admin.js file that's included at the end of the body? How would we know the file that admin.php needs to redirect back to, after authentication? index.php could be any webpage with any name. Any real life website/web apps examples using this principle are welcome too. If there's something unclear, I am glad to add additional info.

Read the article
Move options between multiple lists

- by Martha

We currently have a form with the standard multi-select functionality of "here are the available options, here are the selected options, here are some buttons to move stuff back and forth." However, the client now wants the ability to not just select certain items, but to also categorize them. For example, given a list of books, they want to not just select the ones they own, but also the ones they've read, the ones they would like to read, and the ones they've heard about. (All examples fictional.) Thankfully, a selected item can only be in one category at a time. I can find many examples of moving items between listboxes, but not a single one for moving items between multiple listboxes. To add to the complication, the form needs to have two sets of list+categories, e.g. a list of movies that need to be categorized in addition to the aforementioned books. EDIT: Having now actually sat down to try to code the non-javascripty bits, I need to revise my question, because I realized that multiple select lists won't really work from the "how do I inform the server about all this lovely new information" standpoint. So the html code is now a pseudo-listbox, i.e. an unordered list (ul) displayed in a box with a scrollbar, and each list item (<li>) has a set of five radio buttons (unselected/own/read/like/heard). My task is still roughly the same: how to take this one list and make it easy to categorize the items, in such a way that the user can tell at a glance what is in what category. (The pseudo-listbox has some of the same disadvantages as a multi-select listbox, namely it's hard to tell what's selected if the list is long enough to scroll.) The dream solution would be a drag-and-drop type thing, but at this point even buttons would be OK. Another modification (a good one) is that the client has revised the lists, so the longest is now "only" 62 items long (instead of the many hundreds they had before). The categories will still mostly contain zero, one, or two selected items, possibly a couple more if the user was overzealous. As far as OS and stuff, the site is in classic asp (quit snickering!), the server-side code is VBScript, and so far we've avoided the various Javascript libraries by the simple expedient of almost never using client-side scripting. This one form for this one client is currently the big exception. Give 'em an inch and they want a mile... Oh, and I have to add: I suck at Javascript, or really at any C-descendant language. Curly braces give me hives. I'd really, really like something I can just copy & paste into my page, maybe tweak some variable names, and never look at it again. A girl can dream, can't she? :) [existing code deleted because it's largely irrelevant.]

Read the article
Supporting multiple instances of a plugin DLL with global data

- by Bruno De Fraine

Context: I converted a legacy standalone engine into a plugin component for a composition tool. Technically, this means that I compiled the engine code base to a C DLL which I invoke from a .NET wrapper using P/Invoke; the wrapper implements an interface defined by the composition tool. This works quite well, but now I receive the request to load multiple instances of the engine, for different projects. Since the engine keeps the project data in a set of global variables, and since the DLL with the engine code base is loaded only once, loading multiple projects means that the project data is overwritten. I can see a number of solutions, but they all have some disadvantages: You can create multiple DLLs with the same code, which are seen as different DLLs by Windows, so their code is not shared. Probably this already works if you have multiple copies of the engine DLL with different names. However, the engine is invoked from the wrapper using DllImport attributes and I think the name of the engine DLL needs to be known when compiling the wrapper. Obviously, if I have to compile different versions of the wrapper for each project, this is quite cumbersome. The engine could run as a separate process. This means that the wrapper would launch a separate process for the engine when it loads a project, and it would use some form of IPC to communicate with this process. While this is a relatively clean solution, it requires some effort to get working, I don't now which IPC technology would be best to set-up this kind of construction. There may also be a significant overhead of the communication: the engine needs to frequently exchange arrays of floating-point numbers. The engine could be adapted to support multiple projects. This means that the global variables should be put into a project structure, and every reference to the globals should be converted to a corresponding reference that is relative to a particular project. There are about 20-30 global variables, but as you can imagine, these global variables are referenced from all over the code base, so this conversion would need to be done in some automatic manner. A related problem is that you should be able to reference the "current" project structure in all places, but passing this along as an extra argument in each and every function signature is also cumbersome. Does there exist a technique (in C) to consider the current call stack and find the nearest enclosing instance of a relevant data value there? Can the stackoverflow community give some advice on these (or other) solutions?

Read the article
Is the recent trend toward widescreen (16:9) computer monitors a plus or minus for programmers?

- by DanM

It's almost gotten to the point where you can't buy a conventional (4:3) monitor anymore. Pretty much everything is widescreen. This is fine for watching movies or TV, but is it good or bad for programming? My initial thoughts on the issue are that widescreens are a net negative for programmers. Here are some of the disadvantages I see: Poor space utiliziation One disadvantage of widescreens you can't argue with is that they offer poor space utilization for the amount of total pixels you get. For example, my Thinkpad, which I bought just before the widescreen craze, has a 15" monitor with a native resolution of 1600 x 1200. The newer 15.4" Thinkpads run at most 1680 x 1050. So (if you do the math) you get fewer pixels in a wider (but not shorter) package. With desktop monitors, you pay a price in terms of desk space used. Two 1680 x 1050 monitors will simply take up more of your desk than two 1600 x 1200 monitors (assuming equal dot pitch). More scrolling If you compare a 1680 x 1050 monitor to a 1600 x 1200 monitor, you get 80 extra pixels of width but 150 fewer pixels of height. The height reduction means you lose approximately 11 lines of code. That's less you can see on the screen at one time and more scrolling you have to do. This harms productivity, maybe not dramatically, but insidiously. Less room for wide panels Widescreens also mean you lose space for wide but short panels common in programming environments. If you use Visual Studio, for example, your code window will be that much shorter when viewing the Find Results, Task List, or Error List (all of which I use frequently). This isn't to say the 80 pixels of extra width you get with widescreen would never be useful, but I tend to keep my lines of code short, so seeing more lines would be more valuable to me than seeing fewer, longer lines. What do you think? Do you agree/disagree? Are you now using one or more widescreen monitors for development? What resolution are you running on each? Do you ever miss the height of the traditional 4:3 monitor? Would you complain if your monitors were one inch narrower but two inches taller?

Read the article
Is there a programming language with be semantics close to English ?

- by ivo s

Most languages allow to 'tweek' to certain extend parts of the syntax (C++,C#) and/or semantics that you will be using in your code (Katahdin, lua). But I have not heard of a language that can just completely define how your code will look like. So isn't there some language which already exists that has such capabilities to override all syntax & define semantics ? Example of what I want to do is basically from the C# code below: foreach(Fruit fruit in Fruits) { if(fruit is Apple) { fruit.Price = fruit.Price/2; } } I want do be able to to write the above code in my perfect language like this: Check if any fruits are Macintosh apples and discount the price by 50%. The advantages that come to my mind looking from a coder's perspective in this "imaginary" language are: It's very clear what is going on (self descriptive) - it's plain English after all even kid would understand my program Hides all complexities which I have to write in C#. But why should I care to learn that if statements, arithmetic operators etc since there are already implemented The disadvantages that I see for a coder who will maintain this program are: Maybe you would express this program differently from me so you may not get all the information that I've expressed in my sentence Programs can be quite verbose and hard to debug but if possible to even proximate this type of syntax above maybe more people would start programming right? That would be amazing I think. I can go to work and just write an essay to draw a square on a winform like this: Create a form called MyGreetingForm. Draw a square with in the middle of MyGreetingFormwith a side of 100 points. In the middle of the square write "Hello! Click here to continue" in Arial font. In the above code the parser must basically guess that I want to use the unnamed square from the previous sentence, it'd be hard to write such a smart parser I guess, yet it's so simple what I want to do. If the user clicks on square in the middle of MyGreetingForm show MyMainForm. In the above code 'basically' the compiler must: 1)generate an event handler 2) check if there is any square in the middle of the form and if there is - 3) hide the form and show another form It looks very hard to do but it doesn't look impossible IMO to me at least approximate this (I can personally generate a parser to perform the 3 steps above np & it's basically the same that it has to do any way when you add even in c# a.MyEvent=+handler; so I don't see a problem here) so I'm thinking maybe somebody already did something like this ? Or is there some practical burden of complexity to create such a 'essay style' programming language which I can't see ? I mean what's the worse that can happen if the parser is not that good? - your program will crash so you have to re-word it:)

Read the article
Is it any loose coupling mechanism in Objective-C + Cocoa like C# delegates or C++Qt signals+slots?

- by Eye of Hell

Hello. For a large programs, the standard way to chalenge a complexity is to divide a program code into small objects. Most of the actual programming languages offer this functionality via classes, so is Objective-C. But after source code is separated into small object, the second challenge is to somehow connect them with each over. Standard approaches, supported by most languages are compositon (one object is a member field of another), inheritance, templates (generics) and callbacks. More cryptic techniques include method-level delagates (C#) and signals+slots (C++Qt). I like the delegates / signals idea, since while connecting two objects i can connect individual methods with each over, without objects knowing anything of each over. For C#, it will look like this: var object1 = new CObject1(); var object2 = new CObject2(); object1.SomethingHappened += object2.HandleSomething; In this code, is object1 calls it's SomethingHappened delegate (like a normal method call) the HandleSomething method of object2 will be called. For C++Qt, it will look like this: var object1 = new CObject1(); var object2 = new CObject2(); connect( object1, SIGNAL(SomethingHappened()), object2, SLOT(HandleSomething()) ); The result will be exactly the same. This technique has some advantages and disadvantages, but generally i like it more than interfaces since if program code base grows i can change connections and add new ones without creating tons of interfaces. After examination of Objective-C i havn't found any way to use this technique i like :(. It seems that Objective-C supports message passing perfectly well, but it requres for object1 to have a pointer to object2 in order to pass it a message. If some object needs to be connected to lots of other objects, in Objective-C i will be forced to give him pointers to each of the objects it must be connected. So, the question :). Is it any approach in Objective-C programming that will closely resemble delegate / signal+slot types of connection, not a 'give first object an entire pointer to second object so it can pass a message to it'. Method-level connections are a bit more preferable to me than object-level connection ^_^.

Read the article
Is it a good or bad practice to call instance methods from a java constructor?

- by Steve

There are several different ways I can initialize complex objects (with injected dependencies and required set-up of injected members), are all seem reasonable, but have various advantages and disadvantages. I'll give a concrete example: final class MyClass { private final Dependency dependency; @Inject public MyClass(Dependency dependency) { this.dependency = dependency; dependency.addHandler(new Handler() { @Override void handle(int foo) { MyClass.this.doSomething(foo); } }); doSomething(0); } private void doSomething(int foo) { dependency.doSomethingElse(foo+1); } } As you can see, the constructor does 3 things, including calling an instance method. I've been told that calling instance methods from a constructor is unsafe because it circumvents the compiler's checks for uninitialized members. I.e. I could have called doSomething(0) before setting this.dependency, which would have compiled but not worked. What is the best way to refactor this? Make doSomething static and pass in the dependency explicitly? In my actual case I have three instance methods and three member fields that all depend on one another, so this seems like a lot of extra boilerplate to make all three of these static. Move the addHandler and doSomething into an @Inject public void init() method. While use with Guice will be transparent, it requires any manual construction to be sure to call init() or else the object won't be fully-functional if someone forgets. Also, this exposes more of the API, both of which seem like bad ideas. Wrap a nested class to keep the dependency to make sure it behaves properly without exposing additional API:class DependencyManager { private final Dependency dependency; public DependecyManager(Dependency dependency) { ... } public doSomething(int foo) { ... } } @Inject public MyClass(Dependency dependency) { DependencyManager manager = new DependencyManager(dependency); manager.doSomething(0); } This pulls instance methods out of all constructors, but generates an extra layer of classes, and when I already had inner and anonymous classes (e.g. that handler) it can become confusing - when I tried this I was told to move the DependencyManager to a separate file, which is also distasteful because it's now multiple files to do a single thing. So what is the preferred way to deal with this sort of situation?

Read the article
IPv6: Should I have private addresses?

- by AlReece45

Right now, we have a rack of servers. Every server right now has at least 2 IP addresses, one for the public interface, another for the private. The servers that have SSL websites on them have more IP addresses. We also have virtual servers, that are configured similarly. Private Network The private range is currently just used for backups and monitoring. Its a gigabit port, the interface usage does not usually get very high. There are other technologies we're considering using that would use this port: iSCSI (implementations usually recommends dedicating an interface to it, which would be yet another IP network), VPN to get access to the private range (something I'd rather avoid) dedicated database servers LDAP centralized configuration (like puppet) centralized logging We don't have any private addresses in our DNS records (only public addresses). For our servers to utilize the correct IP address for the right interface (and not hard code the IP address) probably requires setting up a private DNS server (So now we add 2 different dns entries to 2 different systems). Public Network Our public range has a variety of services include web, email, and ftp. There is a hardware firewall between our network and the "public" network. We have (relatively secure) method to instruct the firewall to open and close administrative access (web interfaces, ssh, etc) for our current IP address. With either solution discussed, the host-based firewalls will be configured as well. The public network currently runs at a dedicated 20Mbps link. There are a couple of legacy servers with fast-ethernet ports, but they are scheduled for decommissioning. All of the other production boxes have at least 2 Gigabit Ethernet ports. The more traffic-heavy servers have 4-6 available (none is using more than the 2 Gigabit ports right now). IPv6 I want to get an IPv6 prefix from our ISP. So at least every "server" has at least one IPv6 interface. We'll still need to keep the IPv4 addressees up and available for legacy clients (web servers and email at the very least). We have two IP networks right now. Adding the public IPv6 address would make it three. Just use IPv6? I'm thinking about just dumping the private IPv4 range and using the IPv6 range as the primary means of all communications. If an interface starts reaching its capacity, utilize the newly free interfaces to create a trunk. It has the advantage that if either the public or private traffic needs to exceed 1Gbps. The traffic for each interface is already analyzed on a regular basis to predict future bandwidth use. In the rare instances where bandwidth unexpected peaks: utilize QoS to ensure traffic (like our limited SSH access) is prioritized correctly so the problem can be corrected (if possible, our WAN is the bottleneck right now). It also has the advantage of not needing to make an entry for every private address. We may have private DNS (or just LDAP), but it'll be much more limited in scope with less entries to duplicate. Summary I'm trying to make this network as "simple" as possible. At the same time, I want to make sure its reliable, upgradeable, scalable, and (eventually) redundant. Having one IPv6 network, and a legacy IPv4 network seems to be the best solution to me. Regarding using assigned IPv6 addresses for both networks, sharing the available bandwidth on one (more trunked if needed): Are there any technical disadvantages (limitations, buffers, scalability)? Are there any other security considerations (asides from firewalls mentioned above) to consider? Are there regulations or other security requirements (like PCI-DSS) that this doesn't meet? Is there typical software for setting up a Linux network that doesn't have IPv6 support yet? (logging, ldap, puppet) Some other thing I didn't consider?

Read the article
AdvancedFormatProvider: Making string.format do more

- by plblum

When I have an integer that I want to format within the String.Format() and ToString(format) methods, I’m always forgetting the format symbol to use with it. That’s probably because its not very intuitive. Use {0:N0} if you want it with group (thousands) separators. text = String.Format("{0:N0}", 1000); // returns "1,000" int value1 = 1000; text = value1.ToString("N0"); Use {0:D} or {0:G} if you want it without group separators. text = String.Format("{0:D}", 1000); // returns "1000" int value2 = 1000; text2 = value2.ToString("D"); The {0:D} is especially confusing because Microsoft gives the token the name “Decimal”. I thought it reasonable to have a new format symbol for String.Format, "I" for integer, and the ability to tell it whether it shows the group separators. Along the same lines, why not expand the format symbols for currency ({0:C}) and percent ({0:P}) to let you omit the currency or percent symbol, omit the group separator, and even to drop the decimal part when the value is equal to the whole number? My solution is an open source project called AdvancedFormatProvider, a group of classes that provide the new format symbols, continue to support the rest of the native symbols and makes it easy to plug in additional format symbols. Please visit https://github.com/plblum/AdvancedFormatProvider to learn about it in detail and explore how its implemented. The rest of this post will explore some of the concepts it takes to expand String.Format() and ToString(format). AdvancedFormatProvider benefits: Supports {0:I} token for integers. It offers the {0:I-,} option to omit the group separator. Supports {0:C} token with several options. {0:C-$} omits the currency symbol. {0:C-,} omits group separators, and {0:C-0} hides the decimal part when the value would show “.00”. For example, 1000.0 becomes “$1000” while 1000.12 becomes “$1000.12”. Supports {0:P} token with several options. {0:P-%} omits the percent symbol. {0:P-,} omits group separators, and {0:P-0} hides the decimal part when the value would show “.00”. For example, 1 becomes “100 %” while 1.1223 becomes “112.23 %”. Provides a plug in framework that lets you create new formatters to handle specific format symbols. You register them globally so you can just pass the AdvancedFormatProvider object into String.Format and ToString(format) without having to figure out which plug ins to add. text = String.Format(AdvancedFormatProvider.Current, "{0:I}", 1000); // returns "1,000" text2 = String.Format(AdvancedFormatProvider.Current, "{0:I-,}", 1000); // returns "1000" text3 = String.Format(AdvancedFormatProvider.Current, "{0:C-$-,}", 1000.0); // returns "1000.00" The IFormatProvider parameter Microsoft has made String.Format() and ToString(format) format expandable. They each take an additional parameter that takes an object that implements System.IFormatProvider. This interface has a single member, the GetFormat() method, which returns an object that knows how to convert the format symbol and value into the desired string. There are already a number of web-based resources to teach you about IFormatProvider and the companion interface ICustomFormatter. I’ll defer to them if you want to dig more into the topic. The only thing I want to point out is what I think are implementation considerations. Why GetFormat() always tests for ICustomFormatter When you see examples of implementing IFormatProviders, the GetFormat() method always tests the parameter against the ICustomFormatter type. Why is that? public object GetFormat(Type formatType) { if (formatType == typeof(ICustomFormatter)) return this; else return null; } The value of formatType is already predetermined by the .net framework. String.Format() uses the StringBuilder.AppendFormat() method to parse the string, extracting the tokens and calling GetFormat() with the ICustomFormatter type. (The .net framework also calls GetFormat() with the types of System.Globalization.NumberFormatInfo and System.Globalization.DateTimeFormatInfo but these are exclusive to how the System.Globalization.CultureInfo class handles its implementation of IFormatProvider.) Your code replaces instead of expands I would have expected the caller to pass in the format string to GetFormat() to allow your code to determine if it handles the request. My vision would be to return null when the format string is not supported. The caller would iterate through IFormatProviders until it finds one that handles the format string. Unfortunatley that is not the case. The reason you write GetFormat() as above is because the caller is expecting an object that handles all formatting cases. You are effectively supposed to write enough code in your formatter to handle your new cases and call .net functions (like String.Format() and ToString(format)) to handle the original cases. Its not hard to support the native functions from within your ICustomFormatter.Format function. Just test the format string to see if it applies to you. If not, call String.Format() with a token using the format passed in. public string Format(string format, object arg, IFormatProvider formatProvider) { if (format.StartsWith("I")) { // handle "I" formatter } else return String.Format(formatProvider, "{0:" + format + "}", arg); } Formatters are only used by explicit request Each time you write a custom formatter (implementer of ICustomFormatter), it is not used unless you explicitly passed an IFormatProvider object that supports your formatter into String.Format() or ToString(). This has several disadvantages: Suppose you have several ICustomFormatters. In order to have all available to String.Format() and ToString(format), you have to merge their code and create an IFormatProvider to return an instance of your new class. You have to remember to utilize the IFormatProvider parameter. Its easy to overlook, especially when you have existing code that calls String.Format() without using it. Some APIs may call String.Format() themselves. If those APIs do not offer an IFormatProvider parameter, your ICustomFormatter will not be available to them. The AdvancedFormatProvider solves the first two of these problems by providing a plug-in architecture.

Read the article
Replication Services as ETL extraction tool

- by jorg

In my last blog post I explained the principles of Replication Services and the possibilities it offers in a BI environment. One of the possibilities I described was the use of snapshot replication as an ETL extraction tool: “Snapshot Replication can also be useful in BI environments, if you don’t need a near real-time copy of the database, you can choose to use this form of replication. Next to an alternative for Transactional Replication it can be used to stage data so it can be transformed and moved into the data warehousing environment afterwards. In many solutions I have seen developers create multiple SSIS packages that simply copies data from one or more source systems to a staging database that figures as source for the ETL process. The creation of these packages takes a lot of (boring) time, while Replication Services can do the same in minutes. It is possible to filter out columns and/or records and it can even apply schema changes automatically so I think it offers enough features here. I don’t know how the performance will be and if it really works as good for this purpose as I expect, but I want to try this out soon!” Well I have tried it out and I must say it worked well. I was able to let replication services do work in a fraction of the time it would cost me to do the same in SSIS. What I did was the following: Configure snapshot replication for some Adventure Works tables, this was quite simple and straightforward. Create an SSIS package that executes the snapshot replication on demand and waits for its completion. This is something that you can’t do with out of the box functionality. While configuring the snapshot replication two SQL Agent Jobs are created, one for the creation of the snapshot and one for the distribution of the snapshot. Unfortunately these jobs are asynchronous which means that if you execute them they immediately report back if the job started successfully or not, they do not wait for completion and report its result afterwards. So I had to create an SSIS package that executes the jobs and waits for their completion before the rest of the ETL process continues. Fortunately I was able to create the SSIS package with the desired functionality. I have made a step-by-step guide that will help you configure the snapshot replication and I have uploaded the SSIS package you need to execute it. Configure snapshot replication The first step is to create a publication on the database you want to replicate. Connect to SQL Server Management Studio and right-click Replication, choose for New.. Publication… The New Publication Wizard appears, click Next Choose your “source” database and click Next Choose Snapshot publication and click Next You can now select tables and other objects that you want to publish Expand Tables and select the tables that are needed in your ETL process In the next screen you can add filters on the selected tables which can be very useful. Think about selecting only the last x days of data for example. Its possible to filter out rows and/or columns. In this example I did not apply any filters. Schedule the Snapshot Agent to run at a desired time, by doing this a SQL Agent Job is created which we need to execute from a SSIS package later on. Next you need to set the Security Settings for the Snapshot Agent. Click on the Security Settings button. In this example I ran the Agent under the SQL Server Agent service account. This is not recommended as a security best practice. Fortunately there is an excellent article on TechNet which tells you exactly how to set up the security for replication services. Read it here and make sure you follow the guidelines! On the next screen choose to create the publication at the end of the wizard Give the publication a name (SnapshotTest) and complete the wizard The publication is created and the articles (tables in this case) are added Now the publication is created successfully its time to create a new subscription for this publication. Expand the Replication folder in SSMS and right click Local Subscriptions, choose New Subscriptions The New Subscription Wizard appears Select the publisher on which you just created your publication and select the database and publication (SnapshotTest) You can now choose where the Distribution Agent should run. If it runs at the distributor (push subscriptions) it causes extra processing overhead. If you use a separate server for your ETL process and databases choose to run each agent at its subscriber (pull subscriptions) to reduce the processing overhead at the distributor. Of course we need a database for the subscription and fortunately the Wizard can create it for you. Choose for New database Give the database the desired name, set the desired options and click OK You can now add multiple SQL Server Subscribers which is not necessary in this case but can be very useful. You now need to set the security settings for the Distribution Agent. Click on the …. button Again, in this example I ran the Agent under the SQL Server Agent service account. Read the security best practices here Click Next Make sure you create a synchronization job schedule again. This job is also necessary in the SSIS package later on. Initialize the subscription at first synchronization Select the first box to create the subscription when finishing this wizard Complete the wizard by clicking Finish The subscription will be created In SSMS you see a new database is created, the subscriber. There are no tables or other objects in the database available yet because the replication jobs did not ran yet. Now expand the SQL Server Agent, go to Jobs and search for the job that creates the snapshot: Rename this job to “CreateSnapshot” Now search for the job that distributes the snapshot: Rename this job to “DistributeSnapshot” Create an SSIS package that executes the snapshot replication We now need an SSIS package that will take care of the execution of both jobs. The CreateSnapshot job needs to execute and finish before the DistributeSnapshot job runs. After the DistributeSnapshot job has started the package needs to wait until its finished before the package execution finishes. The Execute SQL Server Agent Job Task is designed to execute SQL Agent Jobs from SSIS. Unfortunately this SSIS task only executes the job and reports back if the job started succesfully or not, it does not report if the job actually completed with success or failure. This is because these jobs are asynchronous. The SSIS package I’ve created does the following: It runs the CreateSnapshot job It checks every 5 seconds if the job is completed with a for loop When the CreateSnapshot job is completed it starts the DistributeSnapshot job And again it waits until the snapshot is delivered before the package will finish successfully Quite simple and the package is ready to use as standalone extract mechanism. After executing the package the replicated tables are added to the subscriber database and are filled with data: Download the SSIS package here (SSIS 2008) Conclusion In this example I only replicated 5 tables, I could create a SSIS package that does the same in approximately the same amount of time. But if I replicated all the 70+ AdventureWorks tables I would save a lot of time and boring work! With replication services you also benefit from the feature that schema changes are applied automatically which means your entire extract phase wont break. Because a snapshot is created using the bcp utility (bulk copy) it’s also quite fast, so the performance will be quite good. Disadvantages of using snapshot replication as extraction tool is the limitation on source systems. You can only choose SQL Server or Oracle databases to act as a publisher. So if you plan to build an extract phase for your ETL process that will invoke a lot of tables think about replication services, it would save you a lot of time and thanks to the Extract SSIS package I’ve created you can perfectly fit it in your usual SSIS ETL process.

Read the article
I see no LOBs!

- by Paul White

Is it possible to see LOB (large object) logical reads from STATISTICS IO output on a table with no LOB columns? I was asked this question today by someone who had spent a good fraction of their afternoon trying to work out why this was occurring – even going so far as to re-run DBCC CHECKDB to see if any corruption had taken place. The table in question wasn’t particularly pretty – it had grown somewhat organically over time, with new columns being added every so often as the need arose. Nevertheless, it remained a simple structure with no LOB columns – no TEXT or IMAGE, no XML, no MAX types – nothing aside from ordinary INT, MONEY, VARCHAR, and DATETIME types. To add to the air of mystery, not every query that ran against the table would report LOB logical reads – just sometimes – but when it did, the query often took much longer to execute. Ok, enough of the pre-amble. I can’t reproduce the exact structure here, but the following script creates a table that will serve to demonstrate the effect: IF OBJECT_ID(N'dbo.Test', N'U') IS NOT NULL DROP TABLE dbo.Test GO CREATE TABLE dbo.Test ( row_id NUMERIC IDENTITY NOT NULL, col01 NVARCHAR(450) NOT NULL, col02 NVARCHAR(450) NOT NULL, col03 NVARCHAR(450) NOT NULL, col04 NVARCHAR(450) NOT NULL, col05 NVARCHAR(450) NOT NULL, col06 NVARCHAR(450) NOT NULL, col07 NVARCHAR(450) NOT NULL, col08 NVARCHAR(450) NOT NULL, col09 NVARCHAR(450) NOT NULL, col10 NVARCHAR(450) NOT NULL, CONSTRAINT [PK dbo.Test row_id] PRIMARY KEY CLUSTERED (row_id) ) ; The next script loads the ten variable-length character columns with one-character strings in the first row, two-character strings in the second row, and so on down to the 450th row: WITH Numbers AS ( -- Generates numbers 1 - 450 inclusive SELECT TOP (450) n = ROW_NUMBER() OVER (ORDER BY (SELECT 0)) FROM master.sys.columns C1, master.sys.columns C2, master.sys.columns C3 ORDER BY n ASC ) INSERT dbo.Test WITH (TABLOCKX) SELECT REPLICATE(N'A', N.n), REPLICATE(N'B', N.n), REPLICATE(N'C', N.n), REPLICATE(N'D', N.n), REPLICATE(N'E', N.n), REPLICATE(N'F', N.n), REPLICATE(N'G', N.n), REPLICATE(N'H', N.n), REPLICATE(N'I', N.n), REPLICATE(N'J', N.n) FROM Numbers AS N ORDER BY N.n ASC ; Once those two scripts have run, the table contains 450 rows and 10 columns of data like this: Most of the time, when we query data from this table, we don’t see any LOB logical reads, for example: -- Find the maximum length of the data in -- column 5 for a range of rows SELECT result = MAX(DATALENGTH(T.col05)) FROM dbo.Test AS T WHERE row_id BETWEEN 50 AND 100 ; But with a different query… -- Read all the data in column 1 SELECT result = MAX(DATALENGTH(T.col01)) FROM dbo.Test AS T ; …suddenly we have 49 LOB logical reads, as well as the ‘normal’ logical reads we would expect. The Explanation If we had tried to create this table in SQL Server 2000, we would have received a warning message to say that future INSERT or UPDATE operations on the table might fail if the resulting row exceeded the in-row storage limit of 8060 bytes. If we needed to store more data than would fit in an 8060 byte row (including internal overhead) we had to use a LOB column – TEXT, NTEXT, or IMAGE. These special data types store the large data values in a separate structure, with just a small pointer left in the original row. Row Overflow SQL Server 2005 introduced a feature called row overflow, which allows one or more variable-length columns in a row to move to off-row storage if the data in a particular row would otherwise exceed 8060 bytes. You no longer receive a warning when creating (or altering) a table that might need more than 8060 bytes of in-row storage; if SQL Server finds that it can no longer fit a variable-length column in a particular row, it will silently move one or more of these columns off the row into a separate allocation unit. Only variable-length columns can be moved in this way (for example the (N)VARCHAR, VARBINARY, and SQL_VARIANT types). Fixed-length columns (like INTEGER and DATETIME for example) never move into ‘row overflow’ storage. The decision to move a column off-row is done on a row-by-row basis – so data in a particular column might be stored in-row for some table records, and off-row for others. In general, if SQL Server finds that it needs to move a column into row-overflow storage, it moves the largest variable-length column record for that row. Note that in the case of an UPDATE statement that results in the 8060 byte limit being exceeded, it might not be the column that grew that is moved! Sneaky LOBs Anyway, that’s all very interesting but I don’t want to get too carried away with the intricacies of row-overflow storage internals. The point is that it is now possible to define a table with non-LOB columns that will silently exceed the old row-size limit and result in ordinary variable-length columns being moved to off-row storage. Adding new columns to a table, expanding an existing column definition, or simply storing more data in a column than you used to – all these things can result in one or more variable-length columns being moved off the row. Note that row-overflow storage is logically quite different from old-style LOB and new-style MAX data type storage – individual variable-length columns are still limited to 8000 bytes each – you can just have more of them now. Having said that, the physical mechanisms involved are very similar to full LOB storage – a column moved to row-overflow leaves a 24-byte pointer record in the row, and the ‘separate storage’ I have been talking about is structured very similarly to both old-style LOBs and new-style MAX types. The disadvantages are also the same: when SQL Server needs a row-overflow column value it needs to follow the in-row pointer a navigate another chain of pages, just like retrieving a traditional LOB. And Finally… In the example script presented above, the rows with row_id values from 402 to 450 inclusive all exceed the total in-row storage limit of 8060 bytes. A SELECT that references a column in one of those rows that has moved to off-row storage will incur one or more lob logical reads as the storage engine locates the data. The results on your system might vary slightly depending on your settings, of course; but in my tests only column 1 in rows 402-450 moved off-row. You might like to play around with the script – updating columns, changing data type lengths, and so on – to see the effect on lob logical reads and which columns get moved when. You might even see row-overflow columns moving back in-row if they are updated to be smaller (hint: reduce the size of a column entry by at least 1000 bytes if you hope to see this). Be aware that SQL Server will not warn you when it moves ‘ordinary’ variable-length columns into overflow storage, and it can have dramatic effects on performance. It makes more sense than ever to choose column data types sensibly. If you make every column a VARCHAR(8000) or NVARCHAR(4000), and someone stores data that results in a row needing more than 8060 bytes, SQL Server might turn some of your column data into pseudo-LOBs – all without saying a word. Finally, some people make a distinction between ordinary LOBs (those that can hold up to 2GB of data) and the LOB-like structures created by row-overflow (where columns are still limited to 8000 bytes) by referring to row-overflow LOBs as SLOBs. I find that quite appealing, but the ‘S’ stands for ‘small’, which makes expanding the whole acronym a little daft-sounding…small large objects anyone? © Paul White 2011 email: [email protected] twitter: @SQL_Kiwi

Read the article
Inheritance Mapping Strategies with Entity Framework Code First CTP5: Part 2 – Table per Type (TPT)

- by mortezam

In the previous blog post you saw that there are three different approaches to representing an inheritance hierarchy and I explained Table per Hierarchy (TPH) as the default mapping strategy in EF Code First. We argued that the disadvantages of TPH may be too serious for our design since it results in denormalized schemas that can become a major burden in the long run. In today’s blog post we are going to learn about Table per Type (TPT) as another inheritance mapping strategy and we'll see that TPT doesn’t expose us to this problem. Table per Type (TPT)Table per Type is about representing inheritance relationships as relational foreign key associations. Every class/subclass that declares persistent properties—including abstract classes—has its own table. The table for subclasses contains columns only for each noninherited property (each property declared by the subclass itself) along with a primary key that is also a foreign key of the base class table. This approach is shown in the following figure: For example, if an instance of the CreditCard subclass is made persistent, the values of properties declared by the BillingDetail base class are persisted to a new row of the BillingDetails table. Only the values of properties declared by the subclass (i.e. CreditCard) are persisted to a new row of the CreditCards table. The two rows are linked together by their shared primary key value. Later, the subclass instance may be retrieved from the database by joining the subclass table with the base class table. TPT Advantages The primary advantage of this strategy is that the SQL schema is normalized. In addition, schema evolution is straightforward (modifying the base class or adding a new subclass is just a matter of modify/add one table). Integrity constraint definition are also straightforward (note how CardType in CreditCards table is now a non-nullable column). Another much more important advantage is the ability to handle polymorphic associations (a polymorphic association is an association to a base class, hence to all classes in the hierarchy with dynamic resolution of the concrete class at runtime). A polymorphic association to a particular subclass may be represented as a foreign key referencing the table of that particular subclass. Implement TPT in EF Code First We can create a TPT mapping simply by placing Table attribute on the subclasses to specify the mapped table name (Table attribute is a new data annotation and has been added to System.ComponentModel.DataAnnotations namespace in CTP5): public abstract class BillingDetail { public int BillingDetailId { get; set; } public string Owner { get; set; } public string Number { get; set; } } [Table("BankAccounts")] public class BankAccount : BillingDetail { public string BankName { get; set; } public string Swift { get; set; } } [Table("CreditCards")] public class CreditCard : BillingDetail { public int CardType { get; set; } public string ExpiryMonth { get; set; } public string ExpiryYear { get; set; } } public class InheritanceMappingContext : DbContext { public DbSet<BillingDetail> BillingDetails { get; set; } } If you prefer fluent API, then you can create a TPT mapping by using ToTable() method: protected override void OnModelCreating(ModelBuilder modelBuilder) { modelBuilder.Entity<BankAccount>().ToTable("BankAccounts"); modelBuilder.Entity<CreditCard>().ToTable("CreditCards"); } Generated SQL For QueriesLet’s take an example of a simple non-polymorphic query that returns a list of all the BankAccounts: var query = from b in context.BillingDetails.OfType<BankAccount>() select b; Executing this query (by invoking ToList() method) results in the following SQL statements being sent to the database (on the bottom, you can also see the result of executing the generated query in SQL Server Management Studio): Now, let’s take an example of a very simple polymorphic query that requests all the BillingDetails which includes both BankAccount and CreditCard types: projects some properties out of the base class BillingDetail, without querying for anything from any of the subclasses: var query = from b in context.BillingDetails select new { b.BillingDetailId, b.Number, b.Owner }; -- var query = from b in context.BillingDetails select b; This LINQ query seems even more simple than the previous one but the resulting SQL query is not as simple as you might expect: -- As you can see, EF Code First relies on an INNER JOIN to detect the existence (or absence) of rows in the subclass tables CreditCards and BankAccounts so it can determine the concrete subclass for a particular row of the BillingDetails table. Also the SQL CASE statements that you see in the beginning of the query is just to ensure columns that are irrelevant for a particular row have NULL values in the returning flattened table. (e.g. BankName for a row that represents a CreditCard type) TPT ConsiderationsEven though this mapping strategy is deceptively simple, the experience shows that performance can be unacceptable for complex class hierarchies because queries always require a join across many tables. In addition, this mapping strategy is more difficult to implement by hand— even ad-hoc reporting is more complex. This is an important consideration if you plan to use handwritten SQL in your application (For ad hoc reporting, database views provide a way to offset the complexity of the TPT strategy. A view may be used to transform the table-per-type model into the much simpler table-per-hierarchy model.) SummaryIn this post we learned about Table per Type as the second inheritance mapping in our series. So far, the strategies we’ve discussed require extra consideration with regard to the SQL schema (e.g. in TPT, foreign keys are needed). This situation changes with the Table per Concrete Type (TPC) that we will discuss in the next post. References ADO.NET team blog Java Persistence with Hibernate book a { text-decoration: none; } a:visited { color: Blue; } .title { padding-bottom: 5px; font-family: Segoe UI; font-size: 11pt; font-weight: bold; padding-top: 15px; } .code, .typeName { font-family: consolas; } .typeName { color: #2b91af; } .padTop5 { padding-top: 5px; } .padTop10 { padding-top: 10px; } p.MsoNormal { margin-top: 0in; margin-right: 0in; margin-bottom: 10.0pt; margin-left: 0in; line-height: 115%; font-size: 11.0pt; font-family: "Calibri" , "sans-serif"; }

Read the article
Computer Networks UNISA - Chap 10 – In Depth TCP/IP Networking

- by MarkPearl

After reading this section you should be able to Understand methods of network design unique to TCP/IP networks, including subnetting, CIDR, and address translation Explain the differences between public and private TCP/IP networks Describe protocols used between mail clients and mail servers, including SMTP, POP3, and IMAP4 Employ multiple TCP/IP utilities for network discovery and troubleshooting Designing TCP/IP-Based Networks The following sections explain how network and host information in an IPv4 address can be manipulated to subdivide networks into smaller segments. Subnetting Subnetting separates a network into multiple logically defined segments, or subnets. Networks are commonly subnetted according to geographic locations, departmental boundaries, or technology types. A network administrator might separate traffic to accomplish the following… Enhance security Improve performance Simplify troubleshooting The challenges of Classful Addressing in IPv4 (No subnetting) The simplest type of IPv4 is known as classful addressing (which was the Class A, Class B & Class C network addresses). Classful addressing has the following limitations. Restriction in the number of usable IPv4 addresses (class C would be limited to 254 addresses) Difficult to separate traffic from various parts of a network Because of the above reasons, subnetting was introduced. IPv4 Subnet Masks Subnetting depends on the use of subnet masks to identify how a network is subdivided. A subnet mask indicates where network information is located in an IPv4 address. The 1 in a subnet mask indicates that corresponding bits in the IPv4 address contain network information (likewise 0 indicates the opposite) Each network class is associated with a default subnet mask… Class A = 255.0.0.0 Class B = 255.255.0.0 Class C = 255.255.255.0 An example of calculating the network ID for a particular device with a subnet mask is shown below.. IP Address = 199.34.89.127 Subnet Mask = 255.255.255.0 Resultant Network ID = 199.34.89.0 IPv4 Subnetting Techniques Subnetting breaks the rules of classful IPv4 addressing. Read page 490 for a detailed explanation Calculating IPv4 Subnets Read page 491 – 494 for an explanation Important… Subnetting only applies to the devices internal to your network. Everything external looks at the class of the IP address instead of the subnet network ID. This way, traffic directed to your network externally still knows where to go, and once it has entered your internal network it can then be prioritized and segmented. CIDR (classless Interdomain Routing) CIDR is also known as classless routing or supernetting. In CIDR conventional network class distinctions do not exist, a subnet boundary can move to the left, therefore generating more usable IP addresses on your network. A subnet created by moving the subnet boundary to the left is known as a supernet. With CIDR also came new shorthand for denoting the position of subnet boundaries known as CIDR notation or slash notation. CIDR notation takes the form of the network ID followed by a forward slash (/) followed by the number of bits that are used for the extended network prefix. To take advantage of classless routing, your networks routers must be able to interpret IP addresses that don;t adhere to conventional network class parameters. Routers that rely on older routing protocols (i.e. RIP) are not capable of interpreting classless IP addresses. Internet Gateways Gateways are a combination of software and hardware that enable two different network segments to exchange data. A gateway facilitates communication between different networks or subnets. Because on device cannot send data directly to a device on another subnet, a gateway must intercede and hand off the information. Every device on a TCP/IP based network has a default gateway (a gateway that first interprets its outbound requests to other subnets, and then interprets its inbound requests from other subnets). The internet contains a vast number of routers and gateways. If each gateway had to track addressing information for every other gateway on the Internet, it would be overtaxed. Instead, each handles only a relatively small amount of addressing information, which it uses to forward data to another gateway that knows more about the data’s destination. The gateways that make up the internet backbone are called core gateways. Address Translation An organizations default gateway can also be used to “hide” the organizations internal IP addresses and keep them from being recognized on a public network. A public network is one that any user may access with little or no restrictions. On private networks, hiding IP addresses allows network managers more flexibility in assigning addresses. Clients behind a gateway may use any IP addressing scheme, regardless of whether it is recognized as legitimate by the Internet authorities but as soon as those devices need to go on the internet, they must have legitimate IP addresses to exchange data. When a clients transmission reaches the default gateway, the gateway opens the IP datagram and replaces the client’s private IP address with an Internet recognized IP address. This process is known as NAT (Network Address Translation). TCP/IP Mail Services All Internet mail services rely on the same principles of mail delivery, storage, and pickup, though they may use different types of software to accomplish these functions. Email servers and clients communicate through special TCP/IP application layer protocols. These protocols, all of which operate on a variety of operating systems are discussed below… SMTP (Simple Mail transfer Protocol) The protocol responsible for moving messages from one mail server to another over TCP/IP based networks. SMTP belongs to the application layer of the ODI model and relies on TCP as its transport protocol. Operates from port 25 on the SMTP server Simple sub-protocol, incapable of doing anything more than transporting mail or holding it in a queue MIME (Multipurpose Internet Mail Extensions) The standard message format specified by SMTP allows for lines that contain no more than 1000 ascii characters meaning if you relied solely on SMTP you would have very short messages and nothing like pictures included in an email. MIME us a standard for encoding and interpreting binary files, images, video, and non-ascii character sets within an email message. MIME identifies each element of a mail message according to content type. MIME does not replace SMTP but works in conjunction with it. Most modern email clients and servers support MIME POP (Post Office Protocol) POP is an application layer protocol used to retrieve messages from a mail server POP3 relies on TCP and operates over port 110 With POP3 mail is delivered and stored on a mail server until it is downloaded by a user Disadvantage of POP3 is that it typically does not allow users to save their messages on the server because of this IMAP is sometimes used IMAP (Internet Message Access Protocol) IMAP is a retrieval protocol that was developed as a more sophisticated alternative to POP3 The single biggest advantage IMAP4 has over POP3 is that users can store messages on the mail server, rather than having to continually download them Users can retrieve all or only a portion of any mail message Users can review their messages and delete them while the messages remain on the server Users can create sophisticated methods of organizing messages on the server Users can share a mailbox in a central location Disadvantages of IMAP are typically related to the fact that it requires more storage space on the server. Additional TCP/IP Utilities Nearly all TCP/IP utilities can be accessed from the command prompt on any type of server or client running TCP/IP. The syntaxt may differ depending on the OS of the client. Below is a list of additional TCP/IP utilities – research their use on your own! Ipconfig (Windows) & Ifconfig (Linux) Netstat Nbtstat Hostname, Host & Nslookup Dig (Linux) Whois (Linux) Traceroute (Tracert) Mtr (my traceroute) Route

Read the article
Organization & Architecture UNISA Studies – Chap 4

- by MarkPearl

Learning Outcomes Explain the characteristics of memory systems Describe the memory hierarchy Discuss cache memory principles Discuss issues relevant to cache design Describe the cache organization of the Pentium Computer Memory Systems There are key characteristics of memory… Location – internal or external Capacity – expressed in terms of bytes Unit of Transfer – the number of bits read out of or written into memory at a time Access Method – sequential, direct, random or associative From a users perspective the two most important characteristics of memory are… Capacity Performance – access time, memory cycle time, transfer rate The trade off for memory happens along three axis… Faster access time, greater cost per bit Greater capacity, smaller cost per bit Greater capacity, slower access time This leads to people using a tiered approach in their use of memory As one goes down the hierarchy, the following occurs… Decreasing cost per bit Increasing capacity Increasing access time Decreasing frequency of access of the memory by the processor The use of two levels of memory to reduce average access time works in principle, but only if conditions 1 to 4 apply. A variety of technologies exist that allow us to accomplish this. Thus it is possible to organize data across the hierarchy such that the percentage of accesses to each successively lower level is substantially less than that of the level above. A portion of main memory can be used as a buffer to hold data temporarily that is to be read out to disk. This is sometimes referred to as a disk cache and improves performance in two ways… Disk writes are clustered. Instead of many small transfers of data, we have a few large transfers of data. This improves disk performance and minimizes processor involvement. Some data designed for write-out may be referenced by a program before the next dump to disk. In that case the data is retrieved rapidly from the software cache rather than slowly from disk. Cache Memory Principles Cache memory is substantially faster than main memory. A caching system works as follows.. When a processor attempts to read a word of memory, a check is made to see if this in in cache memory… If it is, the data is supplied, If it is not in the cache, a block of main memory, consisting of a fixed number of words is loaded to the cache. Because of the phenomenon of locality of references, when a block of data is fetched into the cache, it is likely that there will be future references to that same memory location or to other words in the block. Elements of Cache Design While there are a large number of cache implementations, there are a few basic design elements that serve to classify and differentiate cache architectures… Cache Addresses Cache Size Mapping Function Replacement Algorithm Write Policy Line Size Number of Caches Cache Addresses Almost all non-embedded processors support virtual memory. Virtual memory in essence allows a program to address memory from a logical point of view without needing to worry about the amount of physical memory available. When virtual addresses are used the designer may choose to place the cache between the MMU (memory management unit) and the processor or between the MMU and main memory. The disadvantage of virtual memory is that most virtual memory systems supply each application with the same virtual memory address space (each application sees virtual memory starting at memory address 0), which means the cache memory must be completely flushed with each application context switch or extra bits must be added to each line of the cache to identify which virtual address space the address refers to. Cache Size We would like the size of the cache to be small enough so that the overall average cost per bit is close to that of main memory alone and large enough so that the overall average access time is close to that of the cache alone. Also, larger caches are slightly slower than smaller ones. Mapping Function Because there are fewer cache lines than main memory blocks, an algorithm is needed for mapping main memory blocks into cache lines. The choice of mapping function dictates how the cache is organized. Three techniques can be used… Direct – simplest technique, maps each block of main memory into only one possible cache line Associative – Each main memory block to be loaded into any line of the cache Set Associative – exhibits the strengths of both the direct and associative approaches while reducing their disadvantages For detailed explanations of each approach – read the text book (page 148 – 154) Replacement Algorithm For associative and set associating mapping a replacement algorithm is needed to determine which of the existing blocks in the cache must be replaced by a new block. There are four common approaches… LRU (Least recently used) FIFO (First in first out) LFU (Least frequently used) Random selection Write Policy When a block resident in the cache is to be replaced, there are two cases to consider If no writes to that block have happened in the cache – discard it If a write has occurred, a process needs to be initiated where the changes in the cache are propagated back to the main memory. There are several approaches to achieve this including… Write Through – all writes to the cache are done to the main memory as well at the point of the change Write Back – when a block is replaced, all dirty bits are written back to main memory The problem is complicated when we have multiple caches, there are techniques to accommodate for this but I have not summarized them. Line Size When a block of data is retrieved and placed in the cache, not only the desired word but also some number of adjacent words are retrieved. As the block size increases from very small to larger sizes, the hit ratio will at first increase because of the principle of locality, which states that the data in the vicinity of a referenced word are likely to be referenced in the near future. As the block size increases, more useful data are brought into cache. The hit ratio will begin to decrease as the block becomes even bigger and the probability of using the newly fetched information becomes less than the probability of using the newly fetched information that has to be replaced. Two specific effects come into play… Larger blocks reduce the number of blocks that fit into a cache. Because each block fetch overwrites older cache contents, a small number of blocks results in data being overwritten shortly after they are fetched. As a block becomes larger, each additional word is farther from the requested word and therefore less likely to be needed in the near future. The relationship between block size and hit ratio is complex, and no set approach is judged to be the best in all circumstances. Pentium 4 and ARM cache organizations The processor core consists of four major components: Fetch/decode unit – fetches program instruction in order from the L2 cache, decodes these into a series of micro-operations, and stores the results in the L2 instruction cache Out-of-order execution logic – Schedules execution of the micro-operations subject to data dependencies and resource availability – thus micro-operations may be scheduled for execution in a different order than they were fetched from the instruction stream. As time permits, this unit schedules speculative execution of micro-operations that may be required in the future Execution units – These units execute micro-operations, fetching the required data from the L1 data cache and temporarily storing results in registers Memory subsystem – This unit includes the L2 and L3 caches and the system bus, which is used to access main memory when the L1 and L2 caches have a cache miss and to access the system I/O resources

Read the article
Parallelism in .NET – Part 9, Configuration in PLINQ and TPL

- by Reed

Parallel LINQ and the Task Parallel Library contain many options for configuration. Although the default configuration options are often ideal, there are times when customizing the behavior is desirable. Both frameworks provide full configuration support. When working with Data Parallelism, there is one primary configuration option we often need to control – the number of threads we want the system to use when parallelizing our routine. By default, PLINQ and the TPL both use the ThreadPool to schedule tasks. Given the major improvements in the ThreadPool in CLR 4, this default behavior is often ideal. However, there are times that the default behavior is not appropriate. For example, if you are working on multiple threads simultaneously, and want to schedule parallel operations from within both threads, you might want to consider restricting each parallel operation to using a subset of the processing cores of the system. Not doing this might over-parallelize your routine, which leads to inefficiencies from having too many context switches. In the Task Parallel Library, configuration is handled via the ParallelOptions class. All of the methods of the Parallel class have an overload which accepts a ParallelOptions argument. We configure the Parallel class by setting the ParallelOptions.MaxDegreeOfParallelism property. For example, let’s revisit one of the simple data parallel examples from Part 2: Parallel.For(0, pixelData.GetUpperBound(0), row => { for (int col=0; col < pixelData.GetUpperBound(1); ++col) { pixelData[row, col] = AdjustContrast(pixelData[row, col], minPixel, maxPixel); } }); .csharpcode, .csharpcode pre { font-size: small; color: black; font-family: consolas, "Courier New", courier, monospace; background-color: #ffffff; /*white-space: pre;*/ } .csharpcode pre { margin: 0em; } .csharpcode .rem { color: #008000; } .csharpcode .kwrd { color: #0000ff; } .csharpcode .str { color: #006080; } .csharpcode .op { color: #0000c0; } .csharpcode .preproc { color: #cc6633; } .csharpcode .asp { background-color: #ffff00; } .csharpcode .html { color: #800000; } .csharpcode .attr { color: #ff0000; } .csharpcode .alt { background-color: #f4f4f4; width: 100%; margin: 0em; } .csharpcode .lnum { color: #606060; } Here, we’re looping through an image, and calling a method on each pixel in the image. If this was being done on a separate thread, and we knew another thread within our system was going to be doing a similar operation, we likely would want to restrict this to using half of the cores on the system. This could be accomplished easily by doing: var options = new ParallelOptions(); options.MaxDegreeOfParallelism = Math.Max(Environment.ProcessorCount / 2, 1); Parallel.For(0, pixelData.GetUpperBound(0), options, row => { for (int col=0; col < pixelData.GetUpperBound(1); ++col) { pixelData[row, col] = AdjustContrast(pixelData[row, col], minPixel, maxPixel); } }); Now, we’re restricting this routine to using no more than half the cores in our system. Note that I included a check to prevent a single core system from supplying zero; without this check, we’d potentially cause an exception. I also did not hard code a specific value for the MaxDegreeOfParallelism property. One of our goals when parallelizing a routine is allowing it to scale on better hardware. Specifying a hard-coded value would contradict that goal. Parallel LINQ also supports configuration, and in fact, has quite a few more options for configuring the system. The main configuration option we most often need is the same as our TPL option: we need to supply the maximum number of processing threads. In PLINQ, this is done via a new extension method on ParallelQuery<T>: ParallelEnumerable.WithDegreeOfParallelism. Let’s revisit our declarative data parallelism sample from Part 6: double min = collection.AsParallel().Min(item => item.PerformComputation()); Here, we’re performing a computation on each element in the collection, and saving the minimum value of this operation. If we wanted to restrict this to a limited number of threads, we would add our new extension method: int maxThreads = Math.Max(Environment.ProcessorCount / 2, 1); double min = collection .AsParallel() .WithDegreeOfParallelism(maxThreads) .Min(item => item.PerformComputation()); This automatically restricts the PLINQ query to half of the threads on the system. PLINQ provides some additional configuration options. By default, PLINQ will occasionally revert to processing a query in parallel. This occurs because many queries, if parallelized, typically actually cause an overall slowdown compared to a serial processing equivalent. By analyzing the “shape” of the query, PLINQ often decides to run a query serially instead of in parallel. This can occur for (taken from MSDN): Queries that contain a Select, indexed Where, indexed SelectMany, or ElementAt clause after an ordering or filtering operator that has removed or rearranged original indices. Queries that contain a Take, TakeWhile, Skip, SkipWhile operator and where indices in the source sequence are not in the original order. Queries that contain Zip or SequenceEquals, unless one of the data sources has an originally ordered index and the other data source is indexable (i.e. an array or IList(T)). Queries that contain Concat, unless it is applied to indexable data sources. Queries that contain Reverse, unless applied to an indexable data source. If the specific query follows these rules, PLINQ will run the query on a single thread. However, none of these rules look at the specific work being done in the delegates, only at the “shape” of the query. There are cases where running in parallel may still be beneficial, even if the shape is one where it typically parallelizes poorly. In these cases, you can override the default behavior by using the WithExecutionMode extension method. This would be done like so: var reversed = collection .AsParallel() .WithExecutionMode(ParallelExecutionMode.ForceParallelism) .Select(i => i.PerformComputation()) .Reverse(); Here, the default behavior would be to not parallelize the query unless collection implemented IList<T>. We can force this to run in parallel by adding the WithExecutionMode extension method in the method chain. Finally, PLINQ has the ability to configure how results are returned. When a query is filtering or selecting an input collection, the results will need to be streamed back into a single IEnumerable<T> result. For example, the method above returns a new, reversed collection. In this case, the processing of the collection will be done in parallel, but the results need to be streamed back to the caller serially, so they can be enumerated on a single thread. This streaming introduces overhead. IEnumerable<T> isn’t designed with thread safety in mind, so the system needs to handle merging the parallel processes back into a single stream, which introduces synchronization issues. There are two extremes of how this could be accomplished, but both extremes have disadvantages. The system could watch each thread, and whenever a thread produces a result, take that result and send it back to the caller. This would mean that the calling thread would have access to the data as soon as data is available, which is the benefit of this approach. However, it also means that every item is introducing synchronization overhead, since each item needs to be merged individually. On the other extreme, the system could wait until all of the results from all of the threads were ready, then push all of the results back to the calling thread in one shot. The advantage here is that the least amount of synchronization is added to the system, which means the query will, on a whole, run the fastest. However, the calling thread will have to wait for all elements to be processed, so this could introduce a long delay between when a parallel query begins and when results are returned. The default behavior in PLINQ is actually between these two extremes. By default, PLINQ maintains an internal buffer, and chooses an optimal buffer size to maintain. Query results are accumulated into the buffer, then returned in the IEnumerable<T> result in chunks. This provides reasonably fast access to the results, as well as good overall throughput, in most scenarios. However, if we know the nature of our algorithm, we may decide we would prefer one of the other extremes. This can be done by using the WithMergeOptions extension method. For example, if we know that our PerformComputation() routine is very slow, but also variable in runtime, we may want to retrieve results as they are available, with no bufferring. This can be done by changing our above routine to: var reversed = collection .AsParallel() .WithExecutionMode(ParallelExecutionMode.ForceParallelism) .WithMergeOptions(ParallelMergeOptions.NotBuffered) .Select(i => i.PerformComputation()) .Reverse(); On the other hand, if are already on a background thread, and we want to allow the system to maximize its speed, we might want to allow the system to fully buffer the results: var reversed = collection .AsParallel() .WithExecutionMode(ParallelExecutionMode.ForceParallelism) .WithMergeOptions(ParallelMergeOptions.FullyBuffered) .Select(i => i.PerformComputation()) .Reverse(); Notice, also, that you can specify multiple configuration options in a parallel query. By chaining these extension methods together, we generate a query that will always run in parallel, and will always complete before making the results available in our IEnumerable<T>.

Read the article
E-Business Suite Technology Sessions at OAUG Collaborate 12

- by Max Arderius

Members of our E-Business Suite Applications Technology Group will be at the OAUG Collaborate 12 conference at the Mandalay Bay Convention Center in Las Vegas, Nevada on April 22 to 26, 2012. Please drop by any of our sessions to hear the latest news and meet up with us. Speaker Sessions Session 9675Planning Your Oracle E-Business Suite Upgrade from Release 11i to 12.1 and BeyondAnne Carlson, Senior Director, Applications Technology Group, OracleSunday, April 22, 2:00 pm - 3:00 pmLocation: Jasmine B Attend this session to hear the latest Oracle E-Business Suite Release 12.1 upgrade planning tips gleaned from customers who have already performed the upgrade. Youll get specific, cross-product advice on how to decide your project's scope, understand the factors that affect your project's duration, develop a robust testing strategy, leverage Oracle Support resources, and more. In a nutshell, this session tells you things you need to know before embarking upon your Release 12.1 upgrade project. Session 9401Minimizing Oracle E-Business Suite Maintenance DowntimesElke Phelps, Principal Product Manager, Applications Technology Group, OracleKevin Hudson, Sr. Director, Applications Technology Group, OracleSunday, April 22, 2:10 pm - 3:10 pmLocation: South Seas EThis session starts with an architecture review of Oracle E-Business Suite fundamentals and then moves to a practical view of the different tools and approaches for downtimes. Topics include patching shortcuts, merging patches, distributing worker processes across multiple servers, running ADPatch in no-interactive mode, staged APPL_TOPs, shared file systems, deferring system-wide database tasks, avoiding resource bottlenecks etc... This session also describes the online patching capabilities coming in Release 12.2. Session 9368Oracle E-Business Suite Technology: Latest Features and RoadmapLisa Parekh, Vice President, Applications Technology Group, Oracle Sunday, April 22, 4:30 pm - 5:30 pmLocation: South Seas EThis session provides an overview of Oracle E-Business Suite technology strategy, the capabilities and associated business benefits of recent releases, as well as a review of the product roadmap. As a cornerstone session for Oracle E-Business Suite technology, come hear about the latest usability enhancements, systems administration and configuration management tools, security-related updates, and tools and options for extending, customizing, and integrating the Oracle E-Business Suite with other applications. Session 10709Oracle E-Business Suite Applications Strategy and General Manager UpdateCliff Godwin, Sr. VP, Application Development, OracleMonday, April 23, 2:30 pm - 3:30 pmLocation: Mandalay Bay DIn this session, hear from Oracle E-Business Suite General Manager Cliff Godwin as he delivers an update on the Oracle E-Business Suite product line. The session covers the value delivered by the current release of Oracle E-Business Suite applications, the momentum, and how Oracle E-Business Suite applications integrate into Oracle’s overall applications strategy. You will come away with an understanding of the value Oracle E-Business Suite applications deliver now and in the future. Session 9398How to Reduce TCO Using Oracle Application Management Suite for Oracle E-Business SuiteAngelo Rosado, Principal Product Manager, Applications Technology Group, OracleKenneth Baxter, Principal Product Strategy Manager, Management Pack Fusion Middleware Management, OracleTuesday, April 24, 8:00 am - 9:00 amLocation: Breakers GThis session covers the methods and tools you can use to gain insights into your end users, troubleshoot performance problems, define service-level objectives, and proactively monitor your end-to-end Oracle E-Business Suite environment to meet your availability and performance targets. Come hear how you can manage, diagnose, and monitor the Oracle E-Business Suite environment from a single console by using Oracle Enterprise Manager together with the Oracle Application Management Suite for Oracle E-Business Suite. Session 9370 Coexistence of Oracle E-Business Suite and Oracle Fusion Applications: Platform Perspective Nadia Bendjedou, Senior Director, Product Strategy, Oracle Tuesday, April 24, 2:00 pm - 3:00 pm Location: South Seas E Join us at this session if you are wondering which tools to integrate your data, your processes and your User Interface. Or what tools to customize and extend your screens and reports (OAF, Forms, ADF, Oracle Reports, BI etc....), what tools to secure, protect and manage your Oracle E-Business Suite etc... Or simply if you are looking for a technical roadmap for your Oracle E-Business Suite infrastructure to CO-EXIST with the rest of your enterprise applications including Oracle Fusion Applications. Session 9375 Oracle E-Business Suite Directions: Deployment and System AdministrationMax Arderius, Manager, Applications Development Group, OracleTuesday, April 24, 4:30 pm - 5:30 pmLocation: Breakers GWhat's coming in the next major version of Oracle E-Business Suite 12? This session covers the latest technology stack, including the use of Oracle WebLogic Server and Oracle Database 11g Release 2. Topics include an architectural overview, installation and upgrade options, new configuration options, and new tools for hot-cloning and automated "lights out" cloning. Learn about how online patching will reduce your database patching downtimes to the time it takes to bounce your database server.Session 9369Oracle E-Business Suite Technology Certification Primer and RoadmapSteven Chan, Sr. Director, Applications Technology Group, Oracle Wednesday, April 25, 8:15 am - 9:15 amLocation: South Seas FThis Oracle Development session summarizes the latest certifications and roadmap for the Oracle E-Business Suite technology stack, including database releases/options, Java, Oracle Forms, Oracle Containers for J2EE, desktop OS, browsers, JRE releases, Office/OpenOffice, development and Web authoring tools, user authentication and management, BI, security options, clouds, Oracle VM etc.... It also covers the most-commonly-asked questions about technology stack component support dates and upgrade implications. Session 9407The Latest Oracle E-Business Suite Release User Interface and Usability EnhancementsGustavo Jimenez, Sr. Manager, Applications Technology Group, Oracle Wednesday, April 25, 1:00 pm - 2:00 pmLocation: South Seas GIn this session, developers will get a detailed look at new features designed to enhance usability, offer more capabilities for personalization and extensions, and support the development and use of dashboards and Web services. Topics include rich new UI capabilities such as new home page features, Navigator and Favorites pull-down menus, Oracle ADF task flows etc.... In addition, we will cover the personalization/extensibility enhancements, business layer extensions, Oracle ADF integration and much more. Session 9374Best Practices for Oracle E-Business Suite Performance Tuning and Upgrade OptimizationIsam Alyousfi, Senior Director, Applications Performance, OracleUdayan Parvate, Director, Release Engineering, Quality and Release Management, Oracle Thursday, April 26, 8:30 am - 9:30 amLocation: South Seas FThis presentation will offer tips and techniques on tuning all the layers of the Oracle E-Business Suite stack including the various tiers of the Oracle E-Business Suite environment. You will learn about tuning Oracle Forms, Concurrent Manager, Apache, and Oracle Discoverer. Track down memory leaks and other issues on the Java and Java Virtual Machine layers. The session also covers Oracle E-Business Suite product-level tuning, including Oracle Workflow, Oracle Order Management, Oracle Payroll, and other modules.Session 9412 Oracle E-Business Suite 12.1 Desktop Integration: Beyond Oracle Applications Desktop IntegratorGustavo Jimenez, Sr. Manager, Applications Technology Group, OracleThursday, April 26, 8:30 am - 9:30 amLocation: Breakers GThis session describes the new expanded functionality in Oracle Web Applications Desktop Integrator, Oracle Report Manager, and dedicated integrators. You have more options for desktop integration now, not fewer. Topics include an overview of prepackaged solutions for integrating Oracle E-Business Suite with desktop applications such as Microsoft Excel, Word, and Projects. The session also discusses how you can use the Desktop Integration Framework feature to create your own integrators quickly and easily.Session 9533 Upgrading your Customizations to Oracle E-Business Suite Release 12.1Sara Woodhull, Principal Product Manager, Applications Technology Group, Oracle Thursday, April 26, 11:00 am - 12:00 pmLocation: South Seas FHave you personalized Forms or OA Framework screens? Have you used mod_plsql or Applications Express to tailor your Release 11i functionality? Have you extended or customized your Release 11i environment using other tools? This session will help you understand customization scenarios, use cases, tools, and technologies for ensuring that your Oracle E-Business Suite Release 12.1 environment fits your users' needs closely and that any future customizations will be easy to upgrade. Special Interest Groups (SIG) Session 10535OAUG Database SIG- Part IMichael Brown, Colibri Limited Company Sunday, April 22, 3:20 pm - 4:20 pmLocation: South Seas FThis is the annual meeting of the Database SIG at Collaborate. The call for candidates for the chair will be closed at the meeting. Plans include a speaker from Oracle and a presentation on applications performance. The details of the meeting will be posted on http://www.dbsig.com. Guest Presentation: Oracle E-Business Suite Database PerformanceIsam Alyousfi, Senior Director, Applications Performance, Oracle Session 10720OAUG EBS Applications Technology SIG- Part ISrini Chaval, Cummins Monday, April 23, 2:30 pm - 3:30 pmLocation: South Seas F Guest Presentation:Oracle E-Business Suite Technology Certification RoadmapSteven Chan, Sr. Director, Applications Technology Group, Oracle Session 10510OAUG EBS Applications Technology SIG- Part IISrini Chaval, CumminsMonday, April 23, 3:45 pm - 4:45 pmLocation: South Seas F Guest Presentation:Oracle E-Business Suite 12.2 Online Patching Kevin Hudson, Sr. Director, Applications Technology Group, Oracle Session 10522 OAUG Upgrade SIG- Part IISandra Vucinic, VLAD Group, Inc. Wednesday, April 25, 3:00 pm - 4:00 pmLocation: South Seas FUpgrade SIG will host a business meeting followed by panel (Q&A) related to EBS Upgrade topics and Oracle presentation. Guest Presentation:Upgrading E-Business Suite Amrita Mehrok, Director, Financials Product Strategy, Oracle Nadia Bendjedou, Senior Director, Product Strategy, Oracle Session 10722OAUG Upgrade SIG- Part IISandra Vucinic, VLAD Group, Inc. Wednesday, April 25, 4:15 pm - 5:15 pmLocation: South Seas FUpgrade SIG will host a business meeting followed by panel (Q&A) related to EBS Upgrade topics and Oracle presentation. Guest Presentation:Tuning the Oracle E-Business Suite Upgrade Isam Alyousfi, Senior Director, Applications Performance, Oracle Panels Session 9360Oracle E-Business Suite Cloning PanelSandra Vucinic, VLAD Group, Inc. Guest Speaker: Max Arderius, Manager, Applications Technology Group, OracleWednesday, April 25, 9:30 am - 10:30 amLocation: South Seas FThis panel will discuss differences between available release 11i, R12 and R12.1 cloning methods. Advantages and disadvantages of each cloning method will be discussed in depth. This panel of experienced database administrators will lead a discussion focusing on the questions such as “which cloning method is best to use in your particular environment”. Attendees will gain practical knowledge, tips and tricks to assist with cloning of Oracle E-Business Suite release 11i, R12 and R12.1 environments. Session 10022Oracle Applications Tuning PanelMark Farnham, Rightsizing, Inc.Guest Speaker: Isam Alyousfi, Senior Director, Applications Performance, OracleThursday, April 26, 09:45 am - 10:45 amLocation: South Seas FThis applications performance panel session, sponsored by the OAUG Database SIG, provides a Q&A forum focused on helping you address your Oracle Applications (Oracle E-Business Suite and Oracle's PeopleSoft Enterprise and Siebel applications) performance- and scalability-related issues. The panel comprises several well-known Oracle Applications performance experts. Topic areas include Oracle Database; the network; and the applications tier, including patching and upgrade performance. For complete listing of all speaker sessions and other activities, please visit the OAUG Collaborate Web Site.

Read the article
C# Neural Networks with Encog

- by JoshReuben

Neural Networks · I recently read a book Introduction to Neural Networks for C# , by Jeff Heaton. http://www.amazon.com/Introduction-Neural-Networks-C-2nd/dp/1604390093/ref=sr_1_2?ie=UTF8&s=books&qid=1296821004&sr=8-2-spell. Not the 1st ANN book I've perused, but a nice revision. · Artificial Neural Networks (ANNs) are a mechanism of machine learning – see http://en.wikipedia.org/wiki/Artificial_neural_network , http://en.wikipedia.org/wiki/Category:Machine_learning · Problems Not Suited to a Neural Network Solution- Programs that are easily written out as flowcharts consisting of well-defined steps, program logic that is unlikely to change, problems in which you must know exactly how the solution was derived. · Problems Suited to a Neural Network – pattern recognition, classification, series prediction, and data mining. Pattern recognition - network attempts to determine if the input data matches a pattern that it has been trained to recognize. Classification - take input samples and classify them into fuzzy groups. · As far as machine learning approaches go, I thing SVMs are superior (see http://en.wikipedia.org/wiki/Support_vector_machine ) - a neural network has certain disadvantages in comparison: an ANN can be overtrained, different training sets can produce non-deterministic weights and it is not possible to discern the underlying decision function of an ANN from its weight matrix – they are black box. · In this post, I'm not going to go into internals (believe me I know them). An autoassociative network (e.g. a Hopfield network) will echo back a pattern if it is recognized. · Under the hood, there is very little maths. In a nutshell - Some simple matrix operations occur during training: the input array is processed (normalized into bipolar values of 1, -1) - transposed from input column vector into a row vector, these are subject to matrix multiplication and then subtraction of the identity matrix to get a contribution matrix. The dot product is taken against the weight matrix to yield a boolean match result. For backpropogation training, a derivative function is required. In learning, hill climbing mechanisms such as Genetic Algorithms and Simulated Annealing are used to escape local minima. For unsupervised training, such as found in Self Organizing Maps used for OCR, Hebbs rule is applied. · The purpose of this post is not to mire you in technical and conceptual details, but to show you how to leverage neural networks via an abstraction API - Encog Encog · Encog is a neural network API · Links to Encog: http://www.encog.org , http://www.heatonresearch.com/encog, http://www.heatonresearch.com/forum · Encog requires .Net 3.5 or higher – there is also a Silverlight version. Third-Party Libraries – log4net and nunit. · Encog supports feedforward, recurrent, self-organizing maps, radial basis function and Hopfield neural networks. · Encog neural networks, and related data, can be stored in .EG XML files. · Encog Workbench allows you to edit, train and visualize neural networks. The Encog Workbench can generate code. Synapses and layers · the primary building blocks - Almost every neural network will have, at a minimum, an input and output layer. In some cases, the same layer will function as both input and output layer. · To adapt a problem to a neural network, you must determine how to feed the problem into the input layer of a neural network, and receive the solution through the output layer of a neural network. · The Input Layer - For each input neuron, one double value is stored. An array is passed as input to a layer. Encog uses the interface INeuralData to hold these arrays. The class BasicNeuralData implements the INeuralData interface. Once the neural network processes the input, an INeuralData based class will be returned from the neural network's output layer. · convert a double array into an INeuralData object : INeuralData data = new BasicNeuralData(= new double[10]); · the Output Layer- The neural network outputs an array of doubles, wraped in a class based on the INeuralData interface. · The real power of a neural network comes from its pattern recognition capabilities. The neural network should be able to produce the desired output even if the input has been slightly distorted. · Hidden Layers– optional. between the input and output layers. very much a “black box”. If the structure of the hidden layer is too simple it may not learn the problem. If the structure is too complex, it will learn the problem but will be very slow to train and execute. Some neural networks have no hidden layers. The input layer may be directly connected to the output layer. Further, some neural networks have only a single layer. A single layer neural network has the single layer self-connected. · connections, called synapses, contain individual weight matrixes. These values are changed as the neural network learns. Constructing a Neural Network · the XOR operator is a frequent “first example” -the “Hello World” application for neural networks. · The XOR Operator- only returns true when both inputs differ. 0 XOR 0 = 0 1 XOR 0 = 1 0 XOR 1 = 1 1 XOR 1 = 0 · Structuring a Neural Network for XOR - two inputs to the XOR operator and one output. · input: 0.0,0.0 1.0,0.0 0.0,1.0 1.0,1.0 · Expected output: 0.0 1.0 1.0 0.0 · A Perceptron - a simple feedforward neural network to learn the XOR operator. · Because the XOR operator has two inputs and one output, the neural network will follow suit. Additionally, the neural network will have a single hidden layer, with two neurons to help process the data. The choice for 2 neurons in the hidden layer is arbitrary, and often comes down to trial and error. · Neuron Diagram for the XOR Network · · The Encog workbench displays neural networks on a layer-by-layer basis. · Encog Layer Diagram for the XOR Network: · Create a BasicNetwork - Three layers are added to this network. the FinalizeStructure method must be called to inform the network that no more layers are to be added. The call to Reset randomizes the weights in the connections between these layers. var network = new BasicNetwork(); network.AddLayer(new BasicLayer(2)); network.AddLayer(new BasicLayer(2)); network.AddLayer(new BasicLayer(1)); network.Structure.FinalizeStructure(); network.Reset(); · Neural networks frequently start with a random weight matrix. This provides a starting point for the training methods. These random values will be tested and refined into an acceptable solution. However, sometimes the initial random values are too far off. Sometimes it may be necessary to reset the weights again, if training is ineffective. These weights make up the long-term memory of the neural network. Additionally, some layers have threshold values that also contribute to the long-term memory of the neural network. Some neural networks also contain context layers, which give the neural network a short-term memory as well. The neural network learns by modifying these weight and threshold values. · Now that the neural network has been created, it must be trained. Training a Neural Network · construct a INeuralDataSet object - contains the input array and the expected output array (of corresponding range). Even though there is only one output value, we must still use a two-dimensional array to represent the output. public static double[][] XOR_INPUT ={ new double[2] { 0.0, 0.0 }, new double[2] { 1.0, 0.0 }, new double[2] { 0.0, 1.0 }, new double[2] { 1.0, 1.0 } }; public static double[][] XOR_IDEAL = { new double[1] { 0.0 }, new double[1] { 1.0 }, new double[1] { 1.0 }, new double[1] { 0.0 } }; INeuralDataSet trainingSet = new BasicNeuralDataSet(XOR_INPUT, XOR_IDEAL); · Training is the process where the neural network's weights are adjusted to better produce the expected output. Training will continue for many iterations, until the error rate of the network is below an acceptable level. Encog supports many different types of training. Resilient Propagation (RPROP) - general-purpose training algorithm. All training classes implement the ITrain interface. The RPROP algorithm is implemented by the ResilientPropagation class. Training the neural network involves calling the Iteration method on the ITrain class until the error is below a specific value. The code loops through as many iterations, or epochs, as it takes to get the error rate for the neural network to be below 1%. Once the neural network has been trained, it is ready for use. ITrain train = new ResilientPropagation(network, trainingSet); for (int epoch=0; epoch < 10000; epoch++) { train.Iteration(); Debug.Print("Epoch #" + epoch + " Error:" + train.Error); if (train.Error > 0.01) break; } Executing a Neural Network · Call the Compute method on the BasicNetwork class. Console.WriteLine("Neural Network Results:"); foreach (INeuralDataPair pair in trainingSet) { INeuralData output = network.Compute(pair.Input); Console.WriteLine(pair.Input[0] + "," + pair.Input[1] + ", actual=" + output[0] + ",ideal=" + pair.Ideal[0]); } · The Compute method accepts an INeuralData class and also returns a INeuralData object. Neural Network Results: 0.0,0.0, actual=0.002782538818034049,ideal=0.0 1.0,0.0, actual=0.9903741937121177,ideal=1.0 0.0,1.0, actual=0.9836807956566187,ideal=1.0 1.0,1.0, actual=0.0011646072586172778,ideal=0.0 · the network has not been trained to give the exact results. This is normal. Because the network was trained to 1% error, each of the results will also be within generally 1% of the expected value.

Read the article

Search Results

Search found 666 results on 27 pages for 'disadvantages'.

Page 26/27 | < Previous Page | 22 23 24 25 26 27 | Next Page >

- by brainbox

- by Steve Felts

- by bdiloreto

- by Wade Williams

- by Eugen

- by Noldorin

- by Laz

- by Will Marcouiller

- by DADU

- by Martha

- by Bruno De Fraine

- by DanM

- by ivo s

- by Eye of Hell

- by Steve

- by AlReece45

- by plblum

- by jorg

- by Paul White

- by mortezam

- by MarkPearl

- by MarkPearl

- by Reed

- by Max Arderius

- by JoshReuben

< Previous Page | 22 23 24 25 26 27 | Next Page >