Search Results

Search found 1477 results on 60 pages for 'advantages'.

Page 56/60 | < Previous Page | 52 53 54 55 56 57 58 59 60 | Next Page >

Big Data – Buzz Words: What is Hadoop – Day 6 of 21

- by Pinal Dave

In yesterday’s blog post we learned what is NoSQL. In this article we will take a quick look at one of the four most important buzz words which goes around Big Data – Hadoop. What is Hadoop? Apache Hadoop is an open-source, free and Java based software framework offers a powerful distributed platform to store and manage Big Data. It is licensed under an Apache V2 license. It runs applications on large clusters of commodity hardware and it processes thousands of terabytes of data on thousands of the nodes. Hadoop is inspired from Google’s MapReduce and Google File System (GFS) papers. The major advantage of Hadoop framework is that it provides reliability and high availability. What are the core components of Hadoop? There are two major components of the Hadoop framework and both fo them does two of the important task for it. Hadoop MapReduce is the method to split a larger data problem into smaller chunk and distribute it to many different commodity servers. Each server have their own set of resources and they have processed them locally. Once the commodity server has processed the data they send it back collectively to main server. This is effectively a process where we process large data effectively and efficiently. (We will understand this in tomorrow’s blog post). Hadoop Distributed File System (HDFS) is a virtual file system. There is a big difference between any other file system and Hadoop. When we move a file on HDFS, it is automatically split into many small pieces. These small chunks of the file are replicated and stored on other servers (usually 3) for the fault tolerance or high availability. (We will understand this in the day after tomorrow’s blog post). Besides above two core components Hadoop project also contains following modules as well. Hadoop Common: Common utilities for the other Hadoop modules Hadoop Yarn: A framework for job scheduling and cluster resource management There are a few other projects (like Pig, Hive) related to above Hadoop as well which we will gradually explore in later blog posts. A Multi-node Hadoop Cluster Architecture Now let us quickly see the architecture of the a multi-node Hadoop cluster. A small Hadoop cluster includes a single master node and multiple worker or slave node. As discussed earlier, the entire cluster contains two layers. One of the layer of MapReduce Layer and another is of HDFC Layer. Each of these layer have its own relevant component. The master node consists of a JobTracker, TaskTracker, NameNode and DataNode. A slave or worker node consists of a DataNode and TaskTracker. It is also possible that slave node or worker node is only data or compute node. The matter of the fact that is the key feature of the Hadoop. In this introductory blog post we will stop here while describing the architecture of Hadoop. In a future blog post of this 31 day series we will explore various components of Hadoop Architecture in Detail. Why Use Hadoop? There are many advantages of using Hadoop. Let me quickly list them over here: Robust and Scalable – We can add new nodes as needed as well modify them. Affordable and Cost Effective – We do not need any special hardware for running Hadoop. We can just use commodity server. Adaptive and Flexible – Hadoop is built keeping in mind that it will handle structured and unstructured data. Highly Available and Fault Tolerant – When a node fails, the Hadoop framework automatically fails over to another node. Why Hadoop is named as Hadoop? In year 2005 Hadoop was created by Doug Cutting and Mike Cafarella while working at Yahoo. Doug Cutting named Hadoop after his son’s toy elephant. Tomorrow In tomorrow’s blog post we will discuss Buzz Word – MapReduce. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Big Data, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL

Read the article
Is Linear Tape File System (LTFS) Best For Transportable Storage?

- by rickramsey

Those of us in tape storage engineering take a lot of pride in what we do, but understand that tape is the right answer to a storage problem only some of the time. And, unfortunately for a storage medium with such a long history, it has built up a few preconceived notions that are no longer valid. When I hear customers debate whether to implement tape vs. disk, one of the common strikes against tape is its perceived lack of usability. If you could go back a few generations of corporate acquisitions, you would discover that StorageTek engineers recognized this problem and started developing a solution where a tape drive could look just like a memory stick to a user. The goal was to not have to care about where files were on the cartridge, but to simply see the list of files that were on the tape, and click on them to open them up. Eventually, our friends in tape over at IBM built upon our work at StorageTek and Sun Microsystems and released the Linear Tape File System (LTFS) feature for the current LTO5 generation of tape drives as an open specification. LTFS is really a wonderful feature and we’re proud to have taken part in its beginnings and, as you’ll soon read, its future. Today we offer LTFS-Open Edition, which is free for you to use in your in Oracle Enterprise Linux 5.5 environment - not only on your LTO5 drives, but also on your Oracle StorageTek T10000C drives. You can download it free from Oracle and try it out. LTFS does exactly what its forefathers imagined. Now you can see immediately which files are on a cartridge. LTFS does this by splitting a cartridge into two partitions. The first holds all of the necessary metadata to create a directory structure for you to easily view the contents of the cartridge. The second partition holds all of the files themselves. When tape media is loaded onto a drive, a complete file system image is presented to the user. Adding files to a cartridge can be as simple as a drag-and-drop just as you do today on your laptop when transferring files from your hard drive to a thumb drive or with standard POSIX file operations. You may be thinking all of this sounds nice, but asking, “when will I actually use it?” As I mentioned at the beginning, tape is not the right solution all of the time. However, if you ever need to physically move data between locations, tape storage with LTFS should be your most cost-effective and reliable answer. I will give you a few use cases examples of when LTFS can be utilized. Media and Entertainment (M&E), Oil and Gas (O&G), and other industries have a strong need for their storage to be transportable. For example, an O&G company hunting for new oil deposits in remote locations takes very large underground seismic images which need to be shipped back to a central data center. M&E operations conduct similar activities when shooting video for productions. M&E companies also often transfers files to third-parties for editing and other activities. These companies have three highly flawed options for transporting data: electronic transfer, disk storage transport, or tape storage transport. The first option, electronic transfer, is impractical because of the expense of the bandwidth required to transfer multi-terabyte files reliably and efficiently. If there’s one place that has bandwidth, it’s your local post office so many companies revert to physically shipping storage media. Typically, M&E companies rely on transporting disk storage between sites even though it, too, is expensive. Tape storage should be the preferred format because as IDC points out, “Tape is more suitable for physical transportation of large amounts of data as it is less vulnerable to mechanical damage during transportation compared with disk" (See note 1, below). However, tape storage has not been used in the past because of the restrictions created by proprietary formats. A tape may only be readable if both the sender and receiver have the same proprietary application used to write the file. In addition, the workflows may be slowed by the need to read the entire tape cartridge during recall. LTFS solves both of these problems, clearing the way for tape to become the standard platform for transferring large files. LTFS is open and, as long as you’ve downloaded the free reader from our website or that of anyone in the LTO consortium, you can read the data. So if a movie studio ships a scene to a third-party partner to add, for example, sounds effects or a music score, it doesn’t have to care what technology the third-party has. If it’s written back to an LTFS-formatted tape cartridge, it can be read. Some tape vendors like to claim LTFS is a “standard,” but beauty is in the eye of the beholder. It’s a specification at this point, not a standard. That said, we’re already seeing application vendors create functionality to write in an LTFS format based on the specification. And it’s my belief that both customers and the tape storage industry will see the most benefit if we all follow the same path. As such, we have volunteered to lead the way in making LTFS a standard first with the Storage Network Industry Association (SNIA), and eventually through to standard bodies such as American National Standards Institute (ANSI). Expect to hear good news soon about our efforts. So, if storage transportability is one of your requirements, I recommend giving LTFS a look. It makes tape much more user-friendly and it’s free, which allows tape to maintain all of its cost advantages over disk! Note 1 - IDC Report. April, 2011. “IDC’s Archival Storage Solutions Taxonomy, 2011” - Brian Zents Website Newsletter Facebook Twitter

Read the article
MySQL Connect Only 10 Days Away - Focus on InnoDB Sessions

- by Bertrand Matthelié

Time flies and MySQL Connect is only 10 days away! You can check out the full program here as well as in the September edition of the MySQL newsletter. Mat recently blogged about the MySQL Cluster sessions you’ll have the opportunity to attend, and below are those focused on InnoDB. Remember you can plan your schedule with Schedule Builder. Saturday, 1.00 pm, Room Golden Gate 3: 10 Things You Should Know About InnoDB—Calvin Sun, Oracle InnoDB is the default storage engine for Oracle’s MySQL as of MySQL Release 5.5. It provides the standard ACID-compliant transactions, row-level locking, multiversion concurrency control, and referential integrity. InnoDB also implements several innovative technologies to improve its performance and reliability. This presentation gives a brief history of InnoDB; its main features; and some recent enhancements for better performance, scalability, and availability. Saturday, 5.30 pm, Room Golden Gate 4: Demystified MySQL/InnoDB Performance Tuning—Dimitri Kravtchuk, Oracle This session covers performance tuning with MySQL and the InnoDB storage engine for MySQL and explains the main improvements made in MySQL Release 5.5 and Release 5.6. Which setting for which workload? Which value will be better for my system? How can I avoid potential bottlenecks from the beginning? Do I need a purge thread? Is it true that InnoDB doesn't need thread concurrency anymore? These and many other questions are asked by DBAs and developers. Things are changing quickly and constantly, and there is no “silver bullet.” But understanding the configuration setting’s impact is already a huge step in performance improvement. Bring your ideas and problems to share them with others—the discussion is open, just moderated by a speaker. Sunday, 10.15 am, Room Golden Gate 4: Better Availability with InnoDB Online Operations—Calvin Sun, Oracle Many top Web properties rely on Oracle’s MySQL as a critical piece of infrastructure for serving millions of users. Database availability has become increasingly important. One way to enhance availability is to give users full access to the database during data definition language (DDL) operations. The online DDL operations in recent MySQL releases offer users the flexibility to perform schema changes while having full access to the database—that is, with minimal delay of operations on a table and without rebuilding the entire table. These enhancements provide better responsiveness and availability in busy production environments. This session covers these improvements in the InnoDB storage engine for MySQL for online DDL operations such as add index, drop foreign key, and rename column. Sunday, 11.45 am, Room Golden Gate 7: Developing High-Throughput Services with NoSQL APIs to InnoDB and MySQL Cluster—Andrew Morgan and John Duncan, Oracle Ever-increasing performance demands of Web-based services have generated significant interest in providing NoSQL access methods to MySQL (MySQL Cluster and the InnoDB storage engine of MySQL), enabling users to maintain all the advantages of their existing relational databases while providing blazing-fast performance for simple queries. Get the best of both worlds: persistence; consistency; rich SQL queries; high availability; scalability; and simple, flexible APIs and schemas for agile development. This session describes the memcached connectors and examines some use cases for how MySQL and memcached fit together in application architectures. It does the same for the newest MySQL Cluster native connector, an easy-to-use, fully asynchronous connector for Node.js. Sunday, 1.15 pm, Room Golden Gate 4: InnoDB Performance Tuning—Inaam Rana, Oracle The InnoDB storage engine has always been highly efficient and includes many unique architectural elements to ensure high performance and scalability. In MySQL 5.5 and MySQL 5.6, InnoDB includes many new features that take better advantage of recent advances in operating systems and hardware platforms than previous releases did. This session describes unique InnoDB architectural elements for performance, new features, and how to tune InnoDB to achieve better performance. Sunday, 4.15 pm, Room Golden Gate 3: InnoDB Compression for OLTP—Nizameddin Ordulu, Facebook and Inaam Rana, Oracle Data compression is an important capability of the InnoDB storage engine for Oracle’s MySQL. Compressed tables reduce the size of the database on disk, resulting in fewer reads and writes and better throughput by reducing the I/O workload. Facebook pushes the limit of InnoDB compression and has made several enhancements to InnoDB, making this technology ready for online transaction processing (OLTP). In this session, you will learn the fundamentals of InnoDB compression. You will also learn the enhancements the Facebook team has made to improve InnoDB compression, such as reducing compression failures, not logging compressed page images, and allowing changes of compression level. Not registered yet? You can still save US$ 300 over the on-site fee – Register Now!

Read the article
Top Reasons You Need A User Engagement Platform

- by Michael Snow

Guest post by: Amit Sircar, Senior Sales Consultant, Oracle Deliver complex enterprise functionality through a simple intuitive and unified User Interface (UI) The modern enterprise contains a wide range of applications that are used to manage the business and drive competitive advantages. Organizations respond by creating a complex structure that results in a functional and management grouping of users. Each of these groups of users requires access to multiple applications and information sources in order to perform their job functions. This leads to the lack of a unified view of enterprise information, inconsistent user interfaces and disjointed security. To be effective, portals must be designed from the end-user perspective, enabling the user to accomplish as many tasks as possible while visiting the fewest number of portals. This requires rethinking the way that portals are built, moving from a functional business unit perspective to a user-focused, process-oriented point of view. Oracle WebCenter provides the Common User Experience Architecture that allows organizations to seamlessly present a unified view of enterprise information tailored to a particular user’s role and preferences. This architecture provides the best practices, design patterns and delivery mechanism for myriad services, applications, and data sources. In order to serve as a primary system of access, Oracle WebCenter also provides access to unstructured content and to other users via integrated search, service-oriented artifacts, content management, and collaboration tools. Provide a modern and engaging experience without modifying the core business application Web 2.0 technologies such as blogs, wikis, forums or social media sites are having a profound impact in the public internet. These technologies can be leveraged by enterprises to add significant value to the business. Organizations need to integrate these technologies directly into their business applications while continuing to meet their security and governance needs. To deliver richer connections and become a more agile and intelligent business, WebCenter provides an enterprise portal platform that contains pre-integrated, standards-based Enterprise 2.0 services. These Enterprise 2.0 services can be easily accessed, integrated and utilized by users. By giving users the ability to use and integrate Enterprise 2.0 services such as tags, links, wikis, activities, blogs or social networking directly with their portals and applications, they are empowered to make richer connections, optimize their productivity, and ultimately increase the value of their applications. Foster a collaborative experience The organizational workplace has undergone a major change in the last decade. With increasing globalization and a distributed workforce, project teams may be physically separated by large distances. Online collaboration technologies are becoming a critical resource to enable virtual teams to share information and work together effectively. Oracle WebCenter delivers dynamic business communities with rich Services to empower teams to quickly and efficiently manage their information, applications, projects, and people without requiring IT assistance. It brings together the latest technology around Enterprise 2.0 and social computing, communities, personal productivity, and ad-hoc team interactions without any development effort. It enables the sharing and collaboration on team content, focusing an organization’s valuable resources on solving business problems, tapping into new ideas, and reducing time-to-market. Mobile Support The traditional workplace dynamics that required employees to access their work applications from their desktops have undergone a fundamental shift. Employees were used to primarily working from company offices and utilized an IT-issued computer for performing their job functions. With the introduction of flexible work hours and the growth of remote workers, more and more employees need the ability to remain productive even when they do not have access to a computer via the use of tablets and smartphones. In addition, customers and citizens have come to expect 24x7 access to resources and websites from wherever they are located. Tablets and smartphones have empowered everyone to quickly access services they need anytime and from any place. WebCenter provides out of the box capabilities to deliver the mobile experience in a seamless manner. Seeded device profiles and toolkits within WebCenter can be used to render the same web pages into multiple target devices such iPads, iPhones and android devices. Web designers can preview the portal using the built in simulator, make necessary updates and then deploy their UI design for the targeted device. Conclusion The competitive economy and resource constraints facing organizations today require them to find ways to make their applications, portals and Web sites more agile and intelligent and their knowledge workers more productive no matter where they are located. Organizations need to provide faster access to relevant information and resources, enhance existing applications and business processes with rich Enterprise 2.0 services, and seamlessly deliver content to mobile platforms. Oracle WebCenter successfully meets these challenges by providing the modern user experience platform for the enterprise and the Web.

Read the article
WPF: Timers

- by Ilya Verbitskiy

I believe, once your WPF application will need to execute something periodically, and today I would like to discuss how to do that. There are two possible solutions. You can use classical System.Threading.Timer class or System.Windows.Threading.DispatcherTimer class, which is the part of WPF. I have created an application to show you how to use the API. Let’s take a look how you can implement timer using System.Threading.Timer class. First of all, it has to be initialized. 1: private Timer timer; 2: 3: public MainWindow() 4: { 5: // Form initialization code 6: 7: timer = new Timer(OnTimer, null, Timeout.InfiniteTimeSpan, Timeout.InfiniteTimeSpan); 8: } Timer’s constructor accepts four parameters. The first one is the callback method which is executed when timer ticks. I will show it to you soon. The second parameter is a state which is passed to the callback. It is null because there is nothing to pass this time. The third parameter is the amount of time to delay before the callback parameter invokes its methods. I use System.Threading.Timeout helper class to represent infinite timeout which simply means the timer is not going to start at the moment. And the final fourth parameter represents the time interval between invocations of the methods referenced by callback. Infinite timeout timespan means the callback method will be executed just once. Well, the timer has been created. Let’s take a look how you can start the timer. 1: private void StartTimer(object sender, RoutedEventArgs e) 2: { 3: timer.Change(TimeSpan.Zero, new TimeSpan(0, 0, 1)); 4: 5: // Disable the start buttons and enable the reset button. 6: } The timer is started by calling its Change method. It accepts two arguments: the amount of time to delay before the invoking the callback method and the time interval between invocations of the callback. TimeSpan.Zero means we start the timer immediately and TimeSpan(0, 0, 1) tells the timer to tick every second. There is one method hasn’t been shown yet. This is the callback method OnTimer which does a simple task: it shows current time in the center of the screen. Unfortunately you cannot simple write something like this: 1: clock.Content = DateTime.Now.ToString("hh:mm:ss"); The reason is Timer runs callback method on a separate thread, and it is not possible to access GUI controls from a non-GUI thread. You can avoid the problem using System.Windows.Threading.Dispatcher class. 1: private void OnTimer(object state) 2: { 3: Dispatcher.Invoke(() => ShowTime()); 4: } 5: 6: private void ShowTime() 7: { 8: clock.Content = DateTime.Now.ToString("hh:mm:ss"); 9: } You can build similar application using System.Windows.Threading.DispatcherTimer class. The class represents a timer which is integrated into the Dispatcher queue. It means that your callback method is executed on GUI thread and you can write a code which updates your GUI components directly. 1: private DispatcherTimer dispatcherTimer; 2: 3: public MainWindow() 4: { 5: // Form initialization code 6: 7: dispatcherTimer = new DispatcherTimer { Interval = new TimeSpan(0, 0, 1) }; 8: dispatcherTimer.Tick += OnDispatcherTimer; 9: } Dispatcher timer has nicer and cleaner API. All you need is to specify tick interval and Tick event handler. The you just call Start method to start the timer. private void StartDispatcher(object sender, RoutedEventArgs e) { dispatcherTimer.Start(); // Disable the start buttons and enable the reset button. } And, since the Tick event handler is executed on GUI thread, the code which sets the actual time is straightforward. 1: private void OnDispatcherTimer(object sender, EventArgs e) 2: { 3: ShowTime(); 4: } We’re almost done. Let’s take a look how to stop the timers. It is easy with the Dispatcher Timer. 1: dispatcherTimer.Stop(); And slightly more complicated with the Timer. You should use Change method again. 1: timer.Change(Timeout.InfiniteTimeSpan, Timeout.InfiniteTimeSpan); What is the best way to add timer into an application? The Dispatcher Timer has simple interface, but its advantages are disadvantages at the same time. You should not use it if your Tick event handler executes time-consuming operations. It freezes your window which it is executing the event handler method. You should think about using System.Threading.Timer in this case. The code is available on GitHub.

Read the article
Is it Hard to Write a Blog?

- by Joe Mayo

Responding to a tweet I received, asking if I found it hard to write a blog and keep it interesting. This is one of the situations where a 140 character response doesn’t do a question justice. There’s a lot to think about between the subjects of writing, subject matter, and entertainment. Here’s my take on each of these three topics: There’s all types of writing you can do with various degrees of difficulty. If you’re writing a book and you have a gazillion editors bleeding over your every utterance, then the task becomes harder because you’re second-guessing yourself, not knowing whose opinion will be violated. However, if you’re communicating in a public forum, not too many people care about the grammar as much as whether what you have to say is correct. For a blog, I would say it’s somewhere in-between. Right now, I’m using Windows Live Writer, which gives me a few advantages to just typing into the blog editor, such as spelling correction and the ability to save my work and resume later. Overall, writing is one of those things that you just need to get used to. It’s an essential skill for developers because you need to document your work, depending on what your definition of proper documentation is, and communicate with other developers via various communications mediums. Not begin good (or not thinking that you’re good) shouldn’t hold you back. Like most things in life, practice will improve your skill. So, push away that inner voice that keeps you from moving forward and just do it. A good grasp on the subject matter you’re writing about helps. However, don’t let a lack of knowledge stop you from writing about something. I recall reading something a while back by a developer who didn’t know a technology but wrote about their experience in learning it. They ended up learning more by expressing their thoughts in writing. If you look around out many blogs today, there are many items written by developers learning what they’re writing about. So, whether you are sure or unsure, you can still write – just be honest with yourself and your readers about what you’re writing. Also, don’t be afraid to have a different opinion or worry if someone will disagree. I’ll freely admit that it took a while for me to become accustomed to being criticized. Take the good with the bad and use the bad to make yourself better. Guaranteed, someone will disagree with one or more parts of what I’ve written here or think they have a better approach. No problem, more power to them, and whatever constructive comments they have will be a benefit to me in the future; Otherwise, to h*ll with them. :) Every time you get knocked down, get right back up, dust the dirt off your backside, and keep moving forward. You’ll learn in time how to align a subject with your own presentation of the material. Entertainment could be hard or could be natural, depending on the personality of yourself and your target audience. It’s even more challenging because you can say something you think is funny and someone will be offended. In fact, there are a lot of things that you shouldn’t say in the name of a joke, but I won’t mention any of them here for want of not offending anyone. Of course, I probably offended someone by saying that and there is probably an organization somewhere in the world out to get me now. I’m probably not the best person to be giving you advice on entertaining an audience. I mean, every time I try to tell a joke on Twitter 10 people unfriend me. Okay, maybe 15, but you get my point. One thing you might be interested in knowing is that it’s not too hard for one technical person to entertain other technical people, especially when the subject is of interest. It’s the excitement in each sentence and passion in each paragraph that will keep another developer entertained and interested in what you have to say. Not everyone will like what you’ve written, but the important part is to find your own voice and it’s likely that there is one person in some corner of the world that likes what you have to say, even if it’s your mom and she doesn’t understand a single word you write. :) If I could leave you with one final thought; Just do it and don’t let anyone or anything hold you back. Joe

Read the article
Part 1 - Load Testing In The Cloud

- by Tarun Arora

Azure is fascinating, but even more fascinating is the marriage of Azure and TFS! Introduction Recently a client I worked for had 2 major business critical applications being delivered, with very little time budgeted for Performance testing, we immediately hit a bottleneck when the performance testing phase started, the in house infrastructure team could not support the hardware requirements in the short notice. It was suggested that the performance testing be performed on one of the QA environments which was a fraction of the production environment. This didn’t seem right, the team decided to turn to the cloud. The team took advantage of the elasticity offered by Azure, starting with a single test agent which was provisioned and ready for use with in 30 minutes the team scaled up to 17 test agents to perform a very comprehensive performance testing cycle. Issues were identified and resolved but the highlight was that the cost of running the ‘test rig’ proved to be less than if hosted on premise by the infrastructure team. Thank you for taking the time out to read this blog post, in the series of posts, I’ll try and cover the start to end of everything you need to know to use Azure to build your Test Rig in the cloud. But Why Azure? I have my own Data Centre… If the environment is provisioned in your own datacentre, - No matter what level of service agreement you may have with your infrastructure team there will be down time when the environment is patched - How fast can you scale up or down the environments (keeping the enterprise processes in mind) Administration, Cost, Flexibility and Scalability are the areas you would want to think around when taking the decision between your own Data Centre and Azure! How is Microsoft's Public Cloud Offering different from Amazon’s Public Cloud Offering? Microsoft's offering of the Cloud is a hybrid of Platform as a Service (PaaS) and Infrastructure as a Service (IaaS) which distinguishes Microsoft's offering from other providers such as Amazon (Amazon only offers IaaS). PaaS – Platform as a Service IaaS – Infrastructure as a Service Fills the needs of those who want to build and run custom applications as services. Similar to traditional hosting, where a business will use the hosted environment as a logical extension of the on-premises datacentre. A service provider offers a pre-configured, virtualized application server environment to which applications can be deployed by the development staff. Since the service providers manage the hardware (patching, upgrades and so forth), as well as application server uptime, the involvement of IT pros is minimized. On-demand scalability combined with hardware and application server management relieves developers from infrastructure concerns and allows them to focus on building applications. The servers (physical and virtual) are rented on an as-needed basis, and the IT professionals who manage the infrastructure have full control of the software configuration. This kind of flexibility increases the complexity of the IT environment, as customer IT professionals need to maintain the servers as though they are on-premises. The maintenance activities may include patching and upgrades of the OS and the application server, load balancing, failover clustering of database servers, backup and restoration, and any other activities that mitigate the risks of hardware and software failures. The biggest advantage with PaaS is that you do not have to worry about maintaining the environment, you can focus all your time in solving the business problems with your solution rather than worrying about maintaining the environment. If you decide to use a VM Role on Azure, you are asking for IaaS, more on this later. A nice blog post here on the difference between Saas, PaaS and IaaS. Now that we are convinced why we should be turning to the cloud and why in specific Azure, let’s discuss about the Test Rig. The Load Test Rig – Topology Now the moment of truth, Of course a big part of getting value from cloud computing is identifying the most adequate workloads to take to the cloud, so I’ve decided to try to make a Load Testing rig where the Agents are running on Windows Azure. I’ll talk you through the above Topology, - User: User kick starts the load test run from the developer workstation on premise. This passes the request to the Test Controller. - Test Controller: The Test Controller is on premise connected to the same domain as the developer workstation. As soon as the Test Controller receives the request it makes use of the Windows Azure Connect service to orchestrate the test responsibilities to all the Test Agents. The Windows Azure Connect endpoint software must be active on all Azure instances and on the Controller machine as well. This allows IP connectivity between them and, given that the firewall is properly configured, allows the Controller to send work loads to the agents. In parallel, the Controller will collect the performance data from the agents, using the traditional WMI mechanisms. - Test Agents: The Test Agents are on the Windows Azure Public Cloud, as soon as the test controller issues instructions to the test agents, the test agents start executing the load tests. The HTTP requests are issued against the web server on premise, the results are captured by the test agents. And finally the results are passed over to the controller. - Servers: The Web Server and DB Server are hosted on premise in the datacentre, this is usually the case with business critical applications, you probably want to manage them your self. Recap and What’s next? So, in the introduction in the series of blog posts on Load Testing in the cloud I highlighted why creating a test rig in the cloud is a good idea, what advantages does Windows Azure offer and the Test Rig topology that I will be using. I would also like to mention that i stumbled upon this [Video] on Azure in a nutshell, great watch if you are new to Windows Azure. In the next post I intend to start setting up the Load Test Environment and discuss pricing with respect to test agent machine types that will be used in the test rig. Hope you enjoyed this post, If you have any recommendations on things that I should consider or any questions or feedback, feel free to add to this blog post. Remember to subscribe to http://feeds.feedburner.com/TarunArora. See you in Part II. Share this post : CodeProject

Read the article
About Solaris 11 and UltraSPARC II/III/IV/IV+

- by nospam(at)example.com (Joerg Moellenkamp)

I know that I will get the usual amount of comments like "Oh, Jörg ? you can't be negative about Oracle" for this article. However as usual I want to explain the logic behind my reasoning. Yes ? I know that there is a lot of UltraSPARC III, IV and IV+ gear out there. But there are some very basic questions: Does your application you are currently running on this gear stops running just because you can't run Solaris 11 on it? What is the need to upgrade a system already in production to Solaris 11? I have the impression, that some people think that the systems get useless in the moment Oracle releases Solaris 11. I know that Sun sold UltraSPARC IV+ systems until 2009. The Sun SF490 introduced 2004 for example, that was a Sun SF480 with UltraSPARC IV and later with UltraSPARC IV+. And yes, Sun made some speedbumps. At that time the systems of the UltraSPARC III to IV+ generations were supported on Solaris 8, on Solaris 9 and on Solaris 10. However from my perspective we sold them to customers, which weren't able to migrate to Solaris 10 because they used applications not supported on Solaris 9 or who just didn't wanted to migrate to Solaris 10. Believe it or not ? I personally know two customers that migrated core systems to Solaris 10 in ? well 2008/9. This was especially true when the M3000 was announced in 2008 when it closed the darned single socket gap. It may be different at you site, however that's what I remember about that time when talking with customers. At first: Just because there is no Solaris 11 for UltraSPARC III, IV and IV+, it doesn't mean that Solaris 10 will go away anytime soon. I just want to point you to "Expect Lifetime Support - Hardware and Operating Systems". It states about Premier Support:Maintenance and software upgrades are included for Oracle operating systems and Oracle VM for a minimum of eight years from the general availability date.GA for Solaris 10 was in 2005. Plus 8 years ? 2013 ? at minimum. Then you can still opt for 3 years of "Extended Support" ? 2016 ? at minimum. 2016 your systems purchased in 2009 are 7 years old. Even on systems purchased at the very end of the lifetime of that system generation. That are the rules as written in the linked document. I said minimum The actual dates are even further in the future: Premier Support for Solaris 10 ends in 2015, Extended support ends 2018. Sustaining support ? indefinite. You will find this in the document "Oracle Lifetime Support Policy: Oracle Hardware and Operating Systems".So I don't understand when some people write, that Oracle is less protective about hardware investments than Sun. And for hardware it's the same as with Sun: Service 5 years after EOL as part of Premier Support. I would like to write about a different perspective as well: I have to be a little cautious here, because this is going in the roadmap area, so I will mention the public sources here: John Fowler told last year that we have to expect at at least 3x the single thread performance of T3 for T4. We have 8 cores in T4, as stated by Rick Hetherington. Let's assume for a moment that a T4 core will have the performance of a UltraSPARC core (just to simplify math and not to disclosing anything about the performance, all existing SPARC cores are considered equal). So given this pieces of information, you could consolidate 8 V215, 4 or 8 V245, 2 full blown V445,2 full blown 490, 2 full blown M3000 on a single T4 SPARC processor. The Fowler roadmap prezo talked about 4-socket systems with T4. So 32 V215, 16 to 8 V245, 8 fullblown V445, 8 full blown V490, 8 full blown M3000 in a system image. I think you get the idea. That said, most of the systems we are talking about have already amortized and perhaps it's just time to invest in new systems to yield other advantages like reduced space consumptions, like reduced power consumption, like some of the neat features sun4v gives you, and yes ? reduced number of processor licenses for Oracle and less money for Oracle HW/SW support. As much as I dislike it myself that my own UltraSPARC III and UltraSPARC II based systems won't run on Solaris 11 (and I have quite a few of them in my personal lab), I really think that the impact on production environments will be much less than most people think now. By the way: The reason for this move is a quite significant new feature. I will tell you that it was this feature, when it's out. I assume, telling just a word more could lead to much more time to blog.

Read the article
yield – Just yet another sexy c# keyword?

- by George Mamaladze

yield (see NSDN c# reference) operator came I guess with .NET 2.0 and I my feeling is that it’s not as wide used as it could (or should) be. I am not going to talk here about necessarity and advantages of using iterator pattern when accessing custom sequences (just google it). Let’s look at it from the clean code point of view. Let's see if it really helps us to keep our code understandable, reusable and testable. Let’s say we want to iterate a tree and do something with it’s nodes, for instance calculate a sum of their values. So the most elegant way would be to build a recursive method performing a classic depth traversal returning the sum. private int CalculateTreeSum(Node top) { int sumOfChildNodes = 0; foreach (Node childNode in top.ChildNodes) { sumOfChildNodes += CalculateTreeSum(childNode); } return top.Value + sumOfChildNodes; } “Do One Thing” Nevertheless it violates one of the most important rules “Do One Thing”. Our method CalculateTreeSum does two things at the same time. It travels inside the tree and performs some computation – in this case calculates sum. Doing two things in one method is definitely a bad thing because of several reasons: · Understandability: Readability / refactoring · Reuseability: when overriding - no chance to override computation without copying iteration code and vice versa. · Testability: you are not able to test computation without constructing the tree and you are not able to test correctness of tree iteration. I want to spend some more words on this last issue. How do you test the method CalculateTreeSum when it contains two in one: computation & iteration? The only chance is to construct a test tree and assert the result of the method call, in our case the sum against our expectation. And if the test fails you do not know wether was the computation algorithm wrong or was that the iteration? At the end to top it all off I tell you: according to Murphy’s Law the iteration will have a bug as well as the calculation. Both bugs in a combination will cause the sum to be accidentally exactly the same you expect and the test will PASS. J Ok let’s use yield! That’s why it is generally a very good idea not to mix but isolate “things”. Ok let’s use yield! private int CalculateTreeSumClean(Node top) { IEnumerable<Node> treeNodes = GetTreeNodes(top); return CalculateSum(treeNodes); } private int CalculateSum(IEnumerable<Node> nodes) { int sumOfNodes = 0; foreach (Node node in nodes) { sumOfNodes += node.Value; } return sumOfNodes; } private IEnumerable<Node> GetTreeNodes(Node top) { yield return top; foreach (Node childNode in top.ChildNodes) { foreach (Node currentNode in GetTreeNodes(childNode)) { yield return currentNode; } } } Two methods does not know anything about each other. One contains calculation logic another jut the iteration logic. You can relpace the tree iteration algorithm from depth traversal to breath trevaersal or use stack or visitor pattern instead of recursion. This will not influence your calculation logic. And vice versa you can relace the sum with product or do whatever you want with node values, the calculateion algorithm is not aware of beeng working on some tree or graph. How about not using yield? Now let’s ask the question – what if we do not have yield operator? The brief look at the generated code gives us an answer. The compiler generates a 150 lines long class to implement the iteration logic. [CompilerGenerated] private sealed class <GetTreeNodes>d__0 : IEnumerable<Node>, IEnumerable, IEnumerator<Node>, IEnumerator, IDisposable { ... 150 Lines of generated code ... } Often we compromise code readability, cleanness, testability, etc. – to reduce number of classes, code lines, keystrokes and mouse clicks. This is the human nature - we are lazy. Knowing and using such a sexy construct like yield, allows us to be lazy, write very few lines of code and at the same time stay clean and do one thing in a method. That's why I generally welcome using staff like that. Note: The above used recursive depth traversal algorithm is possibly the compact one but not the best one from the performance and memory utilization point of view. It was taken to emphasize on other primary aspects of this post.

Read the article
Project of Projects with team Foundation Server 2010

- by Martin Hinshelwood

It is pretty much accepted that you should use Areas instead of having many small Team Projects when you are using Team Foundation Server 2010. I have implemented this scenario many times and this is the current iteration of layout and considerations. If like me you work with many customers you will find that you get into a grove for how to set these things up to make them as easily understandable for everyone, while giving the best functionality. The trick is in making it as intuitive as possible for both you and the developers that need to work with it. There are five main places where you need to have the Product or Project name in prominence of any other value. Area Iteration Source Code Work Item Queries Build Once you decide how you are doing this in each of these places you need to keep to it religiously. Evan if you have one source code file to keep, make sure it is in the right place. This makes your developers and others working with the format familiar with where everything should go, as well as building up mussel memory. This prevents the neat system degenerating into a nasty mess. Areas Areas are traditionally used to separate out parts of your product / project so that you can see how much effort has gone into each. Figure: The top level areas are for reporting and work item separation There are massive advantages of using this method. You can: move work from one project to another rename a project / product It is far more likely that a project or product gets renamed than a department. Tip: If you have many projects, over 100, you should consider categorising them here, but make sure that the actual project name always sits at the same level so you know which is which. Figure: Always keep things that are the same at the same level Note: You may use these categories only at the Area/Iteration level to make it easier to select on drop down lists. You may not want to use them everywhere. On the other hand, for consistency it would be better to. Iterations Iterations are usually used to some sort of time based consideration. Here I am splitting into Iterations with periodic releases. Figure: Each product needs to be able to have its own cadence The ability to have each project run at its own pace and to enable them to have their own release schedule is often of paramount importance and you don’t want to fix your 100+ projects to all be released on the same date. Source Code Having a good structure for your source even if you are not branching or having multiple products under the same structure is always a good idea. Figure: Separate out your products source You need to think about both your branches as well as the structure of your source. All your code should be under “Source” and everything you need to build your solution including Build Scripts and 3rd party tools should be under your “Main” (branch) folder. This should them be branched by “Quality”, “Release” or both to get the most out of your branching structure. The important thing is to make sure you branch (or be able to branch) everything you need to build, test and deploy your application to an environment. That environment may be development, test or even production, but I can’t stress the importance of having everything your need. Note: You usually will not be able to install custom software on your build server. Store any *.dll’s or *.exe’s that you need under the “Tools\Tool1” folder. Note: Consult the Branching Guidance for Team Foundation Server 2010 for more on branching Figure: Adding category may be a necessary evil Even if you have to have a couple of categories called “Default”, it is better than not knowing the difference between a folder, Product and Branch. Work Item Queries Queries are used to load lists of Work Items out of TFS so you can see what work you have. This means that you want to also separate queries out by Product / project to make it easier to Figure: Again you have the same first level structure Having Folders also in Work Item Tracking we do the same thing. We put all the queries under a folder named for the Product / Project and change each query to have “AreaPath=[TeamProject]\[ProductX]” in the query instead of the standard “Project=@Project”. Tip: Don’t have a folder with new queries for each iteration. Instead have a single “Current” folder that has queries that point to the current iteration. Just change the queries as you move from one iteration to another. Tip: You can ctrl+drag the “Product1” folder to create your “Product2” folder. Builds You may have many builds both for individual products but also for different quality's. This can be further complicated by having some builds that action “Gated Check-In” and others that are specifically for “Release”, “Test” or another purpose. Figure: There are no folders, yet, for the builds so you need a good naming convention Its a pity that there are no folders under builds, some way to categorise would be nice. In lue of that at the moment you can use a functional naming convention that at least allows you to find what you want. Conclusion It is really easy to both achieve and to stick to this format if you take the time to do it. Unless you have 1000+ builds or 100+ Products you are unlikely run into any issues. Even then there are things you can do to mitigate the issues and I have describes some of them above. Let me know if you can think of any other things to make this easier.

Read the article
BizTalk: Sample: Context routing and Throttling with orchestration

- by Leonid Ganeline

The sample demonstrates using orchestration for throttling and using context routing. Usually throttling is implemented on the host level (in BizTalk 2010 we can also using the host instance level throttling). Here is demonstrated the throttling with orchestration convoy that slows down message flow from some customers. Sample implements sort of quality service agreement layer for different kind of customers. The sample demonstrates the context routing between orchestrations. It has several advantages over the content routing. For example, we don’t have to create the property schema and promote properties on the schemas; we don’t have to change the message content to change routing. Use case: The BizTalk application has a main processing orchestration that process all input messages. The application usually works as an OLTP application. Input messages came in random order without peaks, typical scenario for the on-line users. But sometimes the big data batch payloads come. These batches overload processing orchestrations. All processes, activated by on-line users after the payload, come to the same queue and are processed only after the payload. Result is on-line users can see significant delay in processing. It can be minutes or hours, depending of the batch size. Requirements: On-line user’s processing should work without delays. Big batches cannot disturb on-line users. There should be higher priority for the on-line users and the lower priority for the batches. Design: Decision is to divide the message flow in two branches, one for on-line users and second for batches. Branch with batches provides messages to the processing line with low priority, and the on-line user’s branch – with high priority. All messages are provided by hi-speed receive port. BTS.ReceivePortName context property is used for routing. The Router orchestration separates messages sent from on-line users and from the batch messages. But the Router does not use the BizTalk provided value of this property, the Router set up this value by itself. Router uses the content of the messages to decide if it is from on-line users or from batches. The message context property the BTS.ReceivePortName is changed respectively, its value works as a recipient address, as the “To” address for the next recipient orchestrations. Those next orchestrations are the BatchBottleneck and the MainProcess orchestrations. Messages with context equal “ToBatch” are filtered up by the BatchBottleneck orchestration. It is a unified convoy orchestration and it throttles the message flow, delaying the message delivery to the MainProcess orchestration. The BatchBottleneck orchestration changes the message context to the “ToProcess” and sends messages one after another with small delay in between. Delay can be configured in the BizTalk config file as: <appSettings> <add key="GLD_Tests_TwoWayRouting_BatchBottleneck_DelayMillisec" value="100"/> </appSettings> Of course, messages with context equal “ToProcess” are filtered up by the MainProcess orchestration. NOTES: Filters with string values: In Orchestrations (the first Receive shape in orchestration) use string values WITH quotes; in Send Ports use string values WITHOUT quotes. Filters on the Send Ports are dynamic; we can change them in run-time. Filters on the Orchestrations are static; we can change them only in design-time. To check the existence of the promoted property inside orchestration use the Expression shape with construction like this: if (BTS.ReceivePortName exists myMessage) { …; } It is not possible in the Message Assignment shape because using the “if” statement inside Message Assignment is prohibited. Several predefined context properties can behave in specific way. Say MessageTracking.OriginatingMessage or XMLNORM.DocumentSpecName, they are required some internal rules should be applied to the format or usage of this properties. MessageTracking.* parameters require you have to use tracking and you can get unexpected run-time errors in some cases. My recommendation is - use very limited set of the predefined context properties. To “attach” the new promoted property to the message, we have to use correlation. The correlation type should include this property. [Here is a good explanation by Saravana ] The sample code is here [sorry, temporary trubles with CodePlex].

Read the article
Data Source Security Part 2

- by Steve Felts

In Part 1, I introduced the default security behavior and listed the various options available to change that behavior. One of the key topics to understand is the difference between directly using database user and password values versus mapping from WLS user and password to the associated database values. The direct use of database credentials is relatively new to WLS, based on customer feedback. Some of the trade-offs are covered in this article. Credential Mapping vs. Database Credentials Each WLS data source has a credential map that is a mechanism used to map a key, in this case a WLS user, to security credentials (user and password). By default, when a user and password are specified when getting a connection, they are treated as credentials for a WLS user, validated, and are converted to a database user and password using a credential map associated with the data source. If a matching entry is not found in the credential map for the data source, then the user and password associated with the data source definition are used. Because of this defaulting mechanism, you should be careful what permissions are granted to the default user. Alternatively, you can define an invalid default user to ensure that no one can accidentally get through (in this case, you would need to set the initial capacity for the pool to zero so that the pool is populated only by valid users). To create an entry in the credential map: 1) First create a WLS user. In the administration console, go to Security realms, select your realm (e.g., myrealm), select Users, and select New. 2) Second, create the mapping. In the administration console, go to Services, select Data sources, select your data source name, select Security, select Credentials, and select New. See http://docs.oracle.com/cd/E24329_01/apirefs.1211/e24401/taskhelp/jdbc/jdbc_datasources/ConfigureCredentialMappingForADataSource.html for more information. The advantages of using the credential mapping are that: 1) You don’t hard-code the database user/password into a program or need to prompt for it in addition to the WLS user/password and 2) It provides a layer of abstraction between WLS security and database settings such that many WLS identities can be mapped to a smaller set of DB identities, thereby only requiring middle-tier configuration updates when WLS users are added/removed. You can cut down the number of users that have access to a data source to reduce the user maintenance overhead. For example, suppose that a servlet has the one pre-defined, special WLS user/password for data source access, hard-wired in its code in a getConnection(user, password) call. Every WebLogic user can reap the specific DBMS access coded into the servlet, but none has to have general access to the data source. For instance, there may be a ‘Sales’ DBMS which needs to be protected from unauthorized eyes, but it contains some day-to-day data that everyone needs. The Sales data source is configured with restricted access and a servlet is built that hard-wires the specific data source access credentials in its connection request. It uses that connection to deliver only the generally needed day-to-day information to any caller. The servlet cannot reveal any other data, and no WebLogic user can get any other access to the data source. This is the approach that many large applications take and is the reasoning behind the default mapping behavior in WLS. The disadvantages of using the credential map are that: 1) It is difficult to manage (create, update, delete) with a large number of users; it is possible to use WLST scripts or a custom JMX client utility to manage credential map entries. 2) You can’t share a credential map between data sources so they must be duplicated. Some applications prefer not to use the credential map. Instead, the credentials passed to getConnection(user, password) should be treated as database credentials and used to authenticate with the database for the connection, avoiding going through the credential map. This is enabled by setting the “use-database-credentials” to true. See http://docs.oracle.com/cd/E24329_01/apirefs.1211/e24401/taskhelp/jdbc/jdbc_datasources/ConfigureOracleParameters.html "Configure Oracle parameters" in Oracle WebLogic Server Administration Console Help. Use Database Credentials is not currently supported for Multi Data Source configurations. When enabled, it turns off credential mapping on Generic and Active GridLink data sources for the following attributes: 1. identity-based-connection-pooling-enabled (this interaction is available by patch in 10.3.6.0). 2. oracle-proxy-session (this interaction is first available in 10.3.6.0). 3. set client identifier (this interaction is available by patch in 10.3.6.0). Note that in the data source schema, the set client identifier feature is poorly named “credential-mapping-enabled”. The documentation and the console refer to it as Set Client Identifier. To review the behavior of credential mapping and using database credentials: - If using the credential map, there needs to be a mapping for each WLS user to database user for those users that will have access to the database; otherwise the default user for the data source will be used. If you always specify a user/password when getting a connection, you only need credential map entries for those specific users. - If using database credentials without specifying a user/password, the default user and password in the data source descriptor are always used. If you specify a user/password when getting a connection, that user will be used for the credentials. WLS users are not involved at all in the data source connection process.

Read the article
Objective-C wrapper API design methodology

- by Wade Williams

I know there's no one answer to this question, but I'd like to get people's thoughts on how they would approach the situation. I'm writing an Objective-C wrapper to a C library. My goals are: 1) The wrapper use Objective-C objects. For example, if the C API defines a parameter such as char *name, the Objective-C API should use name:(NSString *). 2) The client using the Objective-C wrapper should not have to have knowledge of the inner-workings of the C library. Speed is not really any issue. That's all easy with simple parameters. It's certainly no problem to take in an NSString and convert it to a C string to pass it to the C library. My indecision comes in when complex structures are involved. Let's say you have: struct flow { long direction; long speed; long disruption; long start; long stop; } flow_t; And then your C API call is: void setFlows(flow_t inFlows[4]); So, some of the choices are: 1) expose the flow_t structure to the client and have the Objective-C API take an array of those structures 2) build an NSArray of four NSDictionaries containing the properties and pass that as a parameter 3) create an NSArray of four "Flow" objects containing the structure's properties and pass that as a parameter My analysis of the approaches: Approach 1: Easiest. However, it doesn't meet the design goals Approach 2: For some reason, this seems to me to be the most "Objective-C" way of doing it. However, each element of the NSDictionary would have to be wrapped in an NSNumber. Now it seems like we're doing an awful lot just to pass the equivalent of a struct. Approach 3: Seems the cleanest to me from an object-oriented standpoint and the extra encapsulation could come in handy later. However, like #2, it now seems like we're doing an awful lot (creating an array, creating and initializing objects) just to pass a struct. So, the question is, how would you approach this situation? Are there other choices I'm not considering? Are there additional advantages or disadvantages to the approaches I've presented that I'm not considering?

Read the article
How to run a powershell script within a DOS batch file

- by Don Vince

How do I have a powershell script embedded within the same file as a DOS batch script? I know this kind of thing is possible in other scenarios: Embedding SQL in a DOS batch script using sqlcmd and a clever arrangements of goto's and comments at the beginning of the file In a *nix environment having a the name of the program you wish to run the script with on the first line of the script commented out e.g. #!/usr/local/bin/python There may not be a way to do this - in which case I will have to call the separate powershell script from the launching DOS script. One possible solution I've considered is to echo out the powershell script, and then run it. A good reason to not do this is that part of the reason to attempt this is to be using the advantages of the powershell environment without the pain of, for example, DOS escape characters I have some unusual constraints and would like to find an elegant solution. I suspect this question may be baiting responses of the variety: "why don't you try and solve this different problem instead." Suffice to say these are my constraints, sorry about that. Any ideas? Is there a suitable combination of clever comments and escape characters that will enable me to achieve this? Some thoughts on how to achieve this: A carat ^ at the end of a line in DOS is a continuation - like an underscore in VB An ampersand & in DOS typically is used to separate commands echo Hello & echo World results in 2 echos on separate lines %0 will give you the script that's currently running So something like this (if I could make it work) would be good: # & call powershell -psconsolefile %0 # & goto :EOF /* From here on in we're running nice juicy powershell code */ Write-Output "Hello World" Except... It doesn't work... because the extension of the file isn't as per powershell's liking: Windows PowerShell console file "insideout.bat" extension is not psc1. Windows PowerShell console file extension must be psc1. DOS isn't really altogether happy with the situation either - although it does stumble on '#' is not recognized as an internal or external command, operable program or batch file.

Read the article
How to store wiki sites (vcs)

- by Eugen

Hello, as a personal project I am trying to write a wiki with the help of django. I'm a beginner when it comes to web development. I am at the (early) point where I need to decide how to store the wiki sites. I have three approaches in mind and would like to know your suggestion. Flat files I considered a flat file approach with a version control system like git or mercurial. Firstly, I would have some example wikis to look at like http://hatta.sheep.art.pl/. Secondly, the vcs would probably deal with editing conflicts and keeping the edit history, so I would not have to reinvent the wheel. And thirdly, I could probably easily clone the wiki repository, so I (or for that matter others) can have an offline copy of the wiki. On the other hand, as far as I know, I can not use django models with flat files. Then, if I wanted to add fields to a wiki site, like a category, I would need to somehow keep a reference to that flat file in order to associate the fields in the database with the flat file. Besides, I don't know if it is a good idea to have all the wiki sites in one repository. I imagine it is more natural to have kind of like a repository per wiki site resp. file. Last but not least, I'm not sure, but I think using flat files would limit my deploying capabilities because web hosts maybe don't allow creating files (I'm thinking, for example, of Google App Engine) Storing in a database By storing the wiki sites in the database I can utilize django models and associate arbitrary fields with the wiki site. I probably would also have an easier life deploying the wiki. But I would not get vcs features like history and conflict resolving per se. I searched for django-extensions to help me and I found django-reversion. However, I do not fully understand if it fit my needs. Does it track model changes like for example if I change the django model file, or does it track the content of the models (which would fit my need). Plus, I do not see if django reversion would help me with edit conflicts. Storing a vcs repository in a database field This would be my ideal solution. It would combine the advantages of both previous approaches without the disadvantages. That is; I would have vcs features but I would save the wiki sites in a database. The problem is: I have no idea how feasible that is. I just imagine saving a wiki site/source together with a git/mercurial repository in a database field. Yet, I somehow doubt database fields work like that. So, I'm open for any other approaches but this is what I came up with. Also, if you're interested, you can find the crappy early test I'm working on here http://github.com/eugenkiss/instantwiki-test

Read the article
General Questions on setting up Subversion (w/Netbeans) and web development workflow.

- by Roeland

Hey guys, I am web developer who currently works mostly on his own but, some projects require outside coding help (my brother.) Anyway, after running in to the problem of "working on the same files" and "saving over each others edits" I decided to research ways to avoid this. Through the help of stackoverflow I decided on subversion. My setup is the following: A windows 2008 server with a clean install. My desktop which is on the local network of the server. Then my brothers desktop which is at his house, not on lan. We both prefer to use netbeans for development. My Questions: How do we set this thing up correctly and most optimally. Here is my current setup and work flow. dev site: in the past I just created sub domains with my web host for test sites (company1.bythepixel.com, company2.bythepixel.com), and editted those sites with netbeans set up having remote sources (ftp). how do i set up my netbeans now? should I set it up with remote sources? I guess I may need to set up a web server on my local server. I'm just not sure of the work flow. When i hit save in netbeans.. will it update the repository.. then do i need to update the site from the repository somehow? live site: when going live, i would just copy all the files from the dev site into the live site. from what i gather i should be able to update the site from the dev repository? Currently I am toying with virtualsvn server (http://www.visualsvn.com/server/) on my local server. It looks like it is set up to use the http protocol. Is there advantages to this or should I consider something that does the file//. Do you recommend any other subversion software that would run on my 2008 box? how will my brother connect? should i set up a permanent vpn from his house to mine? suggestions? (not that important) how do I deal with databases, there anyway to do subversion on database? I know I have a lot of questions and I am trying to read / make sense of the free online subversion book, but its all so overwhelming! Wish there was a small subversion for dummies guide :)

Read the article
architecture python question

- by tom smith

hi. creating a distributed crawling python app. it consists of a master server, and associated client apps that will run on client servers. the purpose of the client app is to run across a targeted site, to extract specific data. the clients need to go "deep" within the site, behind multiple levels of forms, so each client is specifically geared towards a given site. each client app looks something like main: parse initial url call function level1 (data1) function level1 (data) parse the url, for data1 use the required xpath to get the dom elements call the next function call level2 (data) function level2 (data2) parse the url, for data2 use the required xpath to get the dom elements call the next function call level3 function level3 (dat3) parse the url, for data3 use the required xpath to get the dom elements call the next function call level4 function level4 (data) parse the url, for data4 use the required xpath to get the dom elements at the final function.. --all the data output, and eventually returned to the server --at this point the data has elements from each function... my question: given that the number of calls that is made to the child function by the current function varies, i'm trying to figure out the best approach. each function essentialy fetches a page of content, and then parses the page using a number of different XPath expressions, combined with different regex expressions depending on the site/page. if i run a client on a single box, as a sequential process, it'll take awhile, but the load on the box is rather small. i've thought of attempting to implement the child functions as threads from the current function, but that could be a nightmare, as well as quickly bring the "box" to its knees! i've thought of breaking the app up in a manner that would allow the master to essentially pass packets to the client boxes, in a way to allow each client/function to be run directly from the master. this process requires a bit of rewrite, but it has a number of advantages. a bunch of redundancy, and speed. it would detect if a section of the process was crashing and restart from that point. but not sure if it would be any faster... i'm writing the parsing scripts in python.. so... any thoughts/comments would be appreciated... i can get into a great deal more detail, but didn't want to bore anyone!! thanks! tom

Read the article
Mercurial for Beginners: The Definitive Practical Guide

- by Laz

Inspired by Git for beginners: The definitive practical guide. This is a compilation of information on using Mercurial for beginners for practical use. Beginner - a programmer who has touched source control without understanding it very well. Practical - covering situations that the majority of users often encounter - creating a repository, branching, merging, pulling/pushing from/to a remote repository, etc. Notes: Explain how to get something done rather than how something is implemented. Deal with one question per answer. Answer clearly and as concisely as possible. Edit/extend an existing answer rather than create a new answer on the same topic. Please provide a link to the the Mercurial wiki or the HG Book for people who want to learn more. Questions: Installation/Setup How to install Mercurial? How to set up Mercurial? How do you create a new project/repository? How do you configure it to ignore files? Working with the code How do you get the latest code? How do you check out code? How do you commit changes? How do you see what's uncommitted, or the status of your current codebase? How do you destroy unwanted commits? How do you compare two revisions of a file, or your current file and a previous revision? How do you see the history of revisions to a file? How do you handle binary files (visio docs, for instance, or compiler environments)? How do you merge files changed at the "same time"? Tagging, branching, releases, baselines How do you 'mark' 'tag' or 'release' a particular set of revisions for a particular set of files so you can always pull that one later? How do you pull a particular 'release'? How do you branch? How do you merge branches? How do you merge parts of one branch into another branch? Other Good GUI/IDE plugin for Mercurial? Advantages/disadvantages? Any other common tasks a beginner should know? How do I interface with Subversion? Other Mercurial references Mercurial: The Definitive Guide Mercurial Wiki Meet Mercurial | Peepcode Screencast

Read the article
Is it too early to start designing for Task Parallel Library?

- by Joe Erickson

I have been following the development of the .NET Task Parallel Library (TPL) with great interest since Microsoft first announced it. There is no doubt in my mind that we will eventually take advantage of TPL. What I am questioning is whether it makes sense to start taking advantage of TPL when Visual Studio 2010 and .NET 4.0 are released, or whether it makes sense to wait a while longer. Why Start Now? The .NET 4.0 Task Parallel Library appears to be well designed and some relatively simple tests demonstrate that it works well on today's multi-core CPUs. I have been very interested in the potential advantages of using multiple lightweight threads to speed up our software since buying my first quad processor Dell Poweredge 6400 about seven years ago. Experiments at that time indicated that it was not worth the effort, which I attributed largely to the overhead of moving data between each CPU's cache (there was no shared cache back then) and RAM. Competitive advantage - some of our customers can never get enough performance and there is no doubt that we can build a faster product using TPL today. It sounds fun. Yes, I realize that some developers would rather poke themselves in the eye with a sharp stick, but we really enjoy maximizing performance. Why Wait? Are today's Intel Nehalem CPUs representative of where we are going as multi-core support matures? You can purchase a Nehalem CPU with 4 cores which share a single level 3 cache today, and most likely a 6 core CPU sharing a single level 3 cache by the time Visual Studio 2010 / .NET 4.0 are released. Obviously, the number of cores will go up over time, but what about the architecture? As the number of cores goes up, will they still share a cache? One issue with Nehalem is the fact that, even though there is a very fast interconnect between the cores, they have non-uniform memory access (NUMA) which can lead to lower performance and less predictable results. Will future multi-core architectures be able to do away with NUMA? Similarly, will the .NET Task Parallel Library change as it matures, requiring modifications to code to fully take advantage of it? Limitations Our core engine is 100% C# and has to run without full trust, so we are limited to using .NET APIs.

Read the article
Advantage of creating a generic repository vs. specific repository for each object?

- by LuckyLindy

We are developing an ASP.NET MVC application, and are now building the repository/service classes. I'm wondering if there are any major advantages to creating a generic IRepository interface that all repositories implement, vs. each Repository having its own unique interface and set of methods. For example: a generic IRepository interface might look like (taken from this answer): public interface IRepository : IDisposable { T[] GetAll<T>(); T[] GetAll<T>(Expression<Func<T, bool>> filter); T GetSingle<T>(Expression<Func<T, bool>> filter); T GetSingle<T>(Expression<Func<T, bool>> filter, List<Expression<Func<T, object>>> subSelectors); void Delete<T>(T entity); void Add<T>(T entity); int SaveChanges(); DbTransaction BeginTransaction(); } Each Repository would implement this interface (e.g. CustomerRepository:IRepository, ProductRepository:IRepository, etc). The alternate that we've followed in prior projects would be: public interface IInvoiceRepository : IDisposable { EntityCollection<InvoiceEntity> GetAllInvoices(int accountId); EntityCollection<InvoiceEntity> GetAllInvoices(DateTime theDate); InvoiceEntity GetSingleInvoice(int id, bool doFetchRelated); InvoiceEntity GetSingleInvoice(DateTime invoiceDate, int accountId); //unique InvoiceEntity CreateInvoice(); InvoiceLineEntity CreateInvoiceLine(); void SaveChanges(InvoiceEntity); //handles inserts or updates void DeleteInvoice(InvoiceEntity); void DeleteInvoiceLine(InvoiceLineEntity); } In the second case, the expressions (LINQ or otherwise) would be entirely contained in the Repository implementation, whoever is implementing the service just needs to know which repository function to call. I guess I don't see the advantage of writing all the expression syntax in the service class and passing to the repository. Wouldn't this mean easy-to-messup LINQ code is being duplicated in many cases? For example, in our old invoicing system, we call InvoiceRepository.GetSingleInvoice(DateTime invoiceDate, int accountId) from a few different services (Customer, Invoice, Account, etc). That seems much cleaner than writing the following in multiple places: rep.GetSingle(x => x.AccountId = someId && x.InvoiceDate = someDate.Date); The only disadvantage I see to using the specific approach is that we could end up with many permutations of Get* functions, but this still seems preferable to pushing the expression logic up into the Service classes. What am I missing?

Read the article
What to Return? Error String, Bool with Error String Out, or Void with Exception

- by Ranger Pretzel

I spend most of my time in C# and am trying to figure out which is the best practice for handling an exception and cleanly return an error message from a called method back to the calling method. For example, here is some ActiveDirectory authentication code. Please imagine this Method as part of a Class (and not just a standalone function.) bool IsUserAuthenticated(string domain, string user, string pass, out errStr) { bool authentic = false; try { // Instantiate Directory Entry object DirectoryEntry entry = new DirectoryEntry("LDAP://" + domain, user, pass); // Force connection over network to authenticate object nativeObject = entry.NativeObject; // No exception thrown? We must be good, then. authentic = true; } catch (Exception e) { errStr = e.Message().ToString(); } return authentic; } The advantages of doing it this way are a clear YES or NO that you can embed right in your If-Then-Else statement. The downside is that it also requires the person using the method to supply a string to get the Error back (if any.) I guess I could overload this method with the same parameters minus the "out errStr", but ignoring the error seems like a bad idea since there can be many reasons for such a failure... Alternatively, I could write a method that returns an Error String (instead of using "out errStr") in which a returned empty string means that the user authenticated fine. string AuthenticateUser(string domain, string user, string pass) { string errStr = ""; try { // Instantiate Directory Entry object DirectoryEntry entry = new DirectoryEntry("LDAP://" + domain, user, pass); // Force connection over network to authenticate object nativeObject = entry.NativeObject; } catch (Exception e) { errStr = e.Message().ToString(); } return errStr; } But this seems like a "weak" way of doing things. Or should I just make my method "void" and just not handle the exception so that it gets passed back to the calling function? void AuthenticateUser(string domain, string user, string pass) { // Instantiate Directory Entry object DirectoryEntry entry = new DirectoryEntry("LDAP://" + domain, user, pass); // Force connection over network to authenticate object nativeObject = entry.NativeObject; } This seems the most sane to me (for some reason). Yet at the same time, the only real advantage of wrapping those 2 lines over just typing those 2 lines everywhere I need to authenticate is that I don't need to include the "LDAP://" string. The downside with this way of doing it is that the user has to put this method in a try-catch block. Thoughts? Is there another way of doing this that I'm not thinking of?

Read the article
should I ever put a major version number into a C#/Java namespace?

- by Andrew Patterson

I am designing a set of 'service' layer objects (data objects and interface definitions) for a WCF web service (that will be consumed by third party clients i.e. not in-house, so outside my direct control). I know that I am not going to get the interface definition exactly right - and am wanting to prepare for the time when I know that I will have to introduce a breaking set of new data objects. However, the reality of the world I am in is that I will also need to run my first version simultaneously for quite a while. The first version of my service will have URL of http://host/app/v1service.svc and when the times comes by new version will live at http://host/app/v2service.svc However, when it comes to the data objects and interfaces, I am toying with putting the 'major' version of the interface number into the actual namespace of the classes. namespace Company.Product.V1 { [DataContract(Namespace = "company-product-v1")] public class Widget { [DataMember] string widgetName; } public interface IFunction { Widget GetWidgetData(int code); } } When the time comes for a fundamental change to the service, I will introduce some classes like namespace Company.Product.V2 { [DataContract(Namespace = "company-product-v2")] public class Widget { [DataMember] int widgetCode; [DataMember] int widgetExpiry; } public interface IFunction { Widget GetWidgetData(int code); } } The advantages as I see it are that I will be able to have a single set of code serving both interface versions, sharing functionality where possible. This is because I will be able to reference both interface versions as a distinct set of C# objects. Similarly, clients may use both interface versions simultaneously, perhaps using V1.Widget in some legacy code whilst new bits move on to V2.Widget. Can anyone tell why this is a stupid idea? I have a nagging feeling that this is a bit smelly.. notes: I am obviously not proposing every single new version of the service would be in a new namespace. Presumably I will do as many non-breaking interface changes as possible, but I know that I will hit a point where all the data modelling will probably need a significant rewrite. I understand assembly versioning etc but I think this question is tangential to that type of versioning. But I could be wrong.

Read the article
Adding a mini admin to a webpage.

- by DADU

Hello Picture this: you are creating a little module that people can incorporate into their website easily, for example, a little contact form. It would consist of a PHP file that outputs some HTML, a Javascript file (ajax etc.), a CSS file and a CSS skin. Now the person who doesn't know much about coding wants to integrate it on a webpage (website/index.php). We could do this with three rules of code: <link rel="stylesheet" href="module/css/module.css" /> <script src="module/js/module.js"></script> <?php require_once 'module/module.php'; ?> There's no doubt this part is questionable, right? Now when we want to add an admin for this little module, there are two options: Accessing the admin via an extra URL like website/module/admin.php and after authentication, displaying a page where the person can do all the settings. The person then goes back to index.php to see the results. Enabling the admin via an extra URL like website/module/admin.php and after authentication, redirecting back to index.php. The person can now edit the module directly (HTML5 contenteditable) and see changes live, on the webpage where everybody else will see it when the person saves the changes. Option 2 has a couple of advantages: The person doesn't have to toggle between admin and index.php. The person can see directly how it's looking at the webpage it's integrated in. The person probably feels like the module is more part of the webpage/website. Of course option 2 has some disadvantages too: Not everything works well editing it inline. The person would need to have an HTML5 compliant browser. Probably some more I can't think of right now. Now I have a few concerns that's I can't seem to see a clear answer to. How would we let the person integrate the admin on their webpage? The admin files only need to be included in index.php if the person has choosen to edit the module via the url (website/module/admin.php). But how can we do this if we have a admin.css file that belongs in the head section, an admin.php file that goes into the body, and another admin.js file that's included at the end of the body? How would we know the file that admin.php needs to redirect back to, after authentication? index.php could be any webpage with any name. Any real life website/web apps examples using this principle are welcome too. If there's something unclear, I am glad to add additional info.

Read the article
Executing a .NET Managed Assembly from SQL Server 2008 - Pro's, Con's & Recommendations

- by RPM1984

Hi guys, looking for opinions/recommendations/links for the following scenario im currently facing. The Platform: .NET 4.0 Web Application SQL Server 2008 The Task: Overhaul a component of the system that performs (fairly) complex mathematical operations based on a specific user activity, and updates numerous tables in the database. A common user activity might be "Bob" decides to post a forum topic. This results in (the end-solution) needing to look at various factors (about the post he did), then after doing some math based on lookup values/ratios as well as other data in the database, inserting some other data as a result of these operations. The Options: Ok - so here's what im thinking. Although it would be much easier to do this in C# (LINQ-SQL) it doesnt make much sense as the majority of the computations are based on values in the db, and it will get difficult to control/optimize/debug the LINQ over time. Hence, im leaning towards created a managed assembly (C# Class Library) that contains the lookup values (constants) as well as leveraging the math classes in the existing .NET BCL. Basically i'd expose a few methods that can be called by the T-SQL Stored Procedures. This to me has the following advantages: Simplicity of math. Do complex math in .NET vs complex math in T-SQL. No brainer. =) Abstraction of computatations, configurable "lookup" values and business logic from raw T-SQL. T-SQL only needs to care about the data, simplifying the stored procedures and making it easier to maintain. When it needs to do math it delegates off to the managed assembly. So, having said that - ive never done this before (call .NET assmembly from T-SQL), and after some googling the best site i could come up with is here, which is useful but outdated. So - what am i asking? Well, firstly - i need some better references on how to actually do this. "This" being how to call a C# .NET 4 Assembly from within T-SQL Stored Procedures in SQL Server 2008. Secondly, who out there has done this, what problems (if any) did you face? Realize this may be difficult to provide a "correct answer", so ill try to give it to whoever gives me the answer with a combination of good links and a list of pro's/con's/problems with this implementation. Cheers!

Read the article
Precise explanation of JavaScript <-> DOM circular reference issue

- by Joey Adams

One of the touted advantages of jQuery.data versus raw expando properties (arbitrary attributes you can assign to DOM nodes) is that jQuery.data is "safe from circular references and therefore free from memory leaks". An article from Google titled "Optimizing JavaScript code" goes into more detail: The most common memory leaks for web applications involve circular references between the JavaScript script engine and the browsers' C++ objects' implementing the DOM (e.g. between the JavaScript script engine and Internet Explorer's COM infrastructure, or between the JavaScript engine and Firefox XPCOM infrastructure). It lists two examples of circular reference patterns: DOM element → event handler → closure scope → DOM DOM element → via expando → intermediary object → DOM element However, if a reference cycle between a DOM node and a JavaScript object produces a memory leak, doesn't this mean that any non-trivial event handler (e.g. onclick) will produce such a leak? I don't see how it's even possible for an event handler to avoid a reference cycle, because the way I see it: The DOM element references the event handler. The event handler references the DOM (either directly or indirectly). In any case, it's almost impossible to avoid referencing window in any interesting event handler, short of writing a setInterval loop that reads actions from a global queue. Can someone provide a precise explanation of the JavaScript ↔ DOM circular reference problem? Things I'd like clarified: What browsers are effected? A comment in the jQuery source specifically mentions IE6-7, but the Google article suggests Firefox is also affected. Are expando properties and event handlers somehow different concerning memory leaks? Or are both of these code snippets susceptible to the same kind of memory leak? // Create an expando that references to its own element. var elem = document.getElementById('foo'); elem.myself = elem; // Create an event handler that references its own element. var elem = document.getElementById('foo'); elem.onclick = function() { elem.style.display = 'none'; }; If a page leaks memory due to a circular reference, does the leak persist until the entire browser application is closed, or is the memory freed when the window/tab is closed?

Read the article

Search Results

Search found 1477 results on 60 pages for 'advantages'.

Page 56/60 | < Previous Page | 52 53 54 55 56 57 58 59 60 | Next Page >

- by Pinal Dave

- by rickramsey

- by Bertrand Matthelié

- by Michael Snow

- by Ilya Verbitskiy

- by Joe Mayo

- by Tarun Arora

- by nospam(at)example.com (Joerg Moellenkamp)

- by George Mamaladze

- by Martin Hinshelwood

- by Leonid Ganeline

- by Steve Felts

- by Wade Williams

- by Don Vince

- by Eugen

- by Roeland

- by tom smith

- by Laz

- by Joe Erickson

- by LuckyLindy

- by Ranger Pretzel

- by Andrew Patterson

- by DADU

- by RPM1984

- by Joey Adams

< Previous Page | 52 53 54 55 56 57 58 59 60 | Next Page >