Search Results

Search found 2515 results on 101 pages for 'distributed filesystems'.

Page 42/101 | < Previous Page | 38 39 40 41 42 43 44 45 46 47 48 49 | Next Page >

table alias (or 'symlink') in mysql

- by andreash

Hi there, in MySQL5.1, is there a way to make one table accessible by two different names? I'm thinking about somethink like a symlink on linux filesystems. I know theres the CREATE VIEW myview AS SELECT * FrOM mytable thing, but I don't only need to SELECT from both names, but also delete etc ... You might ask why I want to do this? It's about getting a commercial, closed-source app to work, which is crappily programmed (usually, the table names are all lower-case, but occasionally, they use capitalized names for the same table ...). Oh, that would be another idea: Is there a way to tell MySQL not to care about capitalization of table names (like on Windows filesystems?)? that would also do the trick ... Thanks for your insight! A.

Read the article
USB blocks suspend on a Gigabyte GA-890GPA-UD3H with ATI SB700/SB800

- by poolie

Following on from question 12397, I'd still like to get suspend working on my Phenom II X6 / GA-890GPA desktop machine running current Maverick. When I run pmi action suspend the machine doesn't crash, but it also doesn't suspend. The kernel logs show: PM: Syncing filesystems ... done. PM: Preparing system for mem sleep Freezing user space processes ... (elapsed 0.02 seconds) done. Freezing remaining freezable tasks ... (elapsed 0.01 seconds) done. PM: Entering mem sleep Suspending console(s) (use no_console_suspend to debug) pm_op(): usb_dev_suspend+0x0/0x20 returns -2 PM: Device usb8 failed to suspend async: error -2 PM: Some devices failed to suspend PM: resume of devices complete after 0.430 msecs PM: resume devices took 0.000 seconds PM: Finishing wakeup. Restarting tasks ... done. PM: Syncing filesystems ... I've tried disconnecting all the USB devices, and then connecting in to run pmi over ssh, and I get the same failure. With everything unplugged, I see the following usb devices: Bus 007 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub Bus 006 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub Bus 005 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub Bus 004 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub Bus 003 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub Bus 002 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub and lspci shows the physical devices are: 00:12.0 USB Controller: ATI Technologies Inc SB700/SB800 USB OHCI0 Controller 00:12.2 USB Controller: ATI Technologies Inc SB700/SB800 USB EHCI Controller 00:13.0 USB Controller: ATI Technologies Inc SB700/SB800 USB OHCI0 Controller 00:13.2 USB Controller: ATI Technologies Inc SB700/SB800 USB EHCI Controller 00:14.5 USB Controller: ATI Technologies Inc SB700/SB800 USB OHCI2 Controller 00:16.0 USB Controller: ATI Technologies Inc SB700/SB800 USB OHCI0 Controller 00:16.2 USB Controller: ATI Technologies Inc SB700/SB800 USB EHCI Controller 02:00.0 USB Controller: NEC Corporation uPD720200 USB 3.0 Host Controller (rev 03) Booting with no_console_suspend makes no difference.

Read the article
Is your team is a high-performing team?

As a child I can remember looking out of the car window as my father drove along the Interstate in Florida while seeing prisoners wearing bright orange jump suits and prison guards keeping a watchful eye on them. The prisoners were taking part in a prison road gang. These road gangs were formed to help the state maintain the state highway infrastructure. The prisoner’s primary responsibilities are to pick up trash and debris from the roadway. This is a prime example of a work group or working group used by most prison systems in the United States. Work groups or working groups can be defined as a collection of individuals or entities working together to achieve a specific goal or accomplish a specific set of tasks. Typically these groups are only established for a short period of time and are dissolved once the desired outcome has been achieved. More often than not group members usually feel as though they are expendable to the group and some even dread that they are even in the group. "A team is a small number of people with complementary skills who are committed to a common purpose, performance goals, and approach for which they are mutually accountable." (Katzenbach and Smith, 1993) So how do you determine that a team is a high-performing team? This can be determined by three base line criteria that include: consistently high quality output, the promotion of personal growth and well being of all team members, and most importantly the ability to learn and grow as a unit. Initially, a team can successfully create high-performing output without meeting all three criteria, however this will erode over time because team members will feel detached from the group or that they are not growing then the quality of the output will decline. High performing teams are similar to work groups because they both utilize a collection of individuals or entities to accomplish tasks. What distinguish a high-performing team from a work group are its characteristics. High-performing teams contain five core characteristics. These characteristics are what separate a group from a team. The five characteristics of a high-performing team include: Purpose, Performance Measures, People with Tasks and Relationship Skills, Process, and Preparation and Practice. A high-performing team is much more than a work group, and typically has a life cycle that can vary from team to team. The standard team lifecycle consists of five states and is comparable to a human life cycle. The five states of a high-performing team lifecycle include: Formulating, Storming, Normalizing, Performing, and Adjourning. The Formulating State of a team is first realized when the team members are first defined and roles are assigned to all members. This initial stage is very important because it can set the tone for the team and can ultimately determine its success or failure. In addition, this stage requires the team to have a strong leader because team members are normally unclear about specific roles, specific obstacles and goals that my lay ahead of them. Finally, this stage is where most team members initially meet one another prior to working as a team unless the team members already know each other. The Storming State normally arrives directly after the formulation of a new team because there are still a lot of unknowns amongst the newly formed assembly. As a general rule most of the parties involved in the team are still getting used to the workload, pace of work, deadlines and the validity of various tasks that need to be performed by the group. In this state everything is questioned because there are so many unknowns. Items commonly questioned include the credentials of others on the team, the actual validity of a project, and the leadership abilities of the team leader. This can be exemplified by looking at the interactions between animals when they first meet. If we look at a scenario where two people are walking directly toward each other with their dogs. The dogs will automatically enter the Storming State because they do not know the other dog. Typically in this situation, they attempt to define which is more dominating via play or fighting depending on how the dogs interact with each other. Once dominance has been defined and accepted by both dogs then they will either want to play or leave depending on how the dogs interacted and other environmental variables. Once the Storming State has been realized then the Normalizing State takes over. This state is entered by a team once all the questions of the Storming State have been answered and the team has been tested by a few tasks or projects. Typically, participants in the team are filled with energy, and comradery, and a strong alliance with team goals and objectives. A high school football team is a perfect example of the Normalizing State when they start their season. The player positions have been assigned, the depth chart has been filled and everyone is focused on winning each game. All of the players encourage and expect each other to perform at the best of their abilities and are united by competition from other teams. The Performing State is achieved by a team when its history, working habits, and culture solidify the team as one working unit. In this state team members can anticipate specific behaviors, attitudes, reactions, and challenges are seen as opportunities and not problems. Additionally, each team member knows their role in the team’s success, and the roles of others. This is the most productive state of a group and is where all the time invested working together really pays off. If you look at an Olympic figure skating team skate you can easily see how the time spent working together benefits their performance. They skate as one unit even though it is comprised of two skaters. Each skater has their routine completely memorized as well as their partners. This allows them to anticipate each other’s moves on the ice makes their skating look effortless. The final state of a team is the Adjourning State. This state is where accomplishments by the team and each individual team member are recognized. Additionally, this state also allows for reflection of the interactions between team members, work accomplished and challenges that were faced. Finally, the team celebrates the challenges they have faced and overcome as a unit. Currently in the workplace teams are divided into two different types: Co-located and Distributed Teams. Co-located teams defined as the traditional group of people working together in an office, according to Andy Singleton of Assembla. This traditional type of a team has dominated business in the past due to inadequate technology, which forced workers to primarily interact with one another via face to face meetings. Team meetings are primarily lead by the person with the highest status in the company. Having personally, participated in meetings of this type, usually a select few of the team members dominate the flow of communication which reduces the input of others in group discussions. Since discussions are dominated by a select few individuals the discussions and group discussion are skewed in favor of the individuals who communicate the most in meetings. In addition, Team members might not give their full opinions on a topic of discussion in part not to offend or create controversy amongst the team and can alter decision made in meetings towards those of the opinions of the dominating team members. Distributed teams are by definition spread across an area or subdivided into separate sections. That is exactly what distributed teams when compared to a more traditional team. It is common place for distributed teams to have team members across town, in the next state, across the country and even with the advances in technology over the last 20 year across the world. These teams allow for more diversity compared to the other type of teams because they allow for more flexibility regarding location. A team could consist of a 30 year old male Italian project manager from New York, a 50 year old female Hispanic from California and a collection of programmers from India because technology allows them to communicate as if they were standing next to one another. In addition, distributed team members consult with more team members prior to making decisions compared to traditional teams, and take longer to come to decisions due to the changes in time zones and cultural events. However, team members feel more empowered to speak out when they do not agree with the team and to notify others of potential issues regarding the work that the team is doing. Virtual teams which are a subset of the distributed team type is changing organizational strategies due to the fact that a team can now in essence be working 24 hrs a day because of utilizing employees in various time zones and locations. A primary example of this is with customer services departments, a company can have multiple call centers spread across multiple time zones allowing them to appear to be open 24 hours a day while all a employees work from 9AM to 5 PM every day. Virtual teams also allow human resources departments to go after the best talent for the company regardless of where the potential employee works because they will be a part of a virtual team all that is need is the proper technology to be setup to allow everyone to communicate. In addition to allowing employees to work from home, the company can save space and resources by not having to provide a desk for every team member. In fact, those team members that randomly come into the office can actually share one desk amongst multiple people. This is definitely a cost cutting plus given the current state of the economy. One thing that can turn a team into a high-performing team is leadership. High-performing team leaders need to focus on investing in ongoing personal development, provide team members with direction, structure, and resources needed to accomplish their work, make the right interventions at the right time, and help the team manage boundaries between the team and various external parties involved in the teams work. A team leader needs to invest in ongoing personal development in order to effectively manage their team. People have said that attitude is everything; this is very true about leaders and leadership. A team takes on the attitudes and behaviors of its leaders. This can potentially harm the team and the team’s output. Leaders must concentrate on self-awareness, and understanding their team’s group dynamics to fully understand how to lead them. In addition, always learning new leadership techniques from other effective leaders is also very beneficial. Providing team members with direction, structure, and resources that they need to accomplish their work collectively sounds easy, but it is not. Leaders need to be able to effectively communicate with their team on how their work helps the company reach for its organizational vision. Conversely, the leader needs to allow his team to work autonomously within specific guidelines to turn the company’s vision into a reality. This being said the team must be appropriately staffed according to the size of the team’s tasks and their complexity. These tasks should be clear, and be meaningful to the company’s objectives and allow for feedback to be exchanged with the leader and the team member and the leader and upper management. Now if the team is properly staffed, and has a clear and full understanding of what is to be done; the company also must supply the workers with the proper tools to achieve the tasks that they are asked to do. No one should be asked to dig a hole without being given a shovel. Finally, leaders must reward their team members for accomplishments that they achieve. Awards could range from just a simple congratulatory email, a party to close the completion of a large project, or other monetary rewards. Managing boundaries is very important for team leaders because it can alter attitudes of team members and can add undue stress to the team which will force them to loose focus on the tasks at hand for the group. Team leaders should promote communication between team members so that burdens are shared amongst the team and solutions can be derived from hearing the opinions of multiple sources. This also reinforces team camaraderie and working as a unit. Team leaders must manage the type and timing of interventions as to not create an even bigger mess within the team. Poorly timed interventions can really deflate team members and make them question themselves. This could really increase further and undue interventions by the team leader. Typically, the best time for interventions is when the team is just starting to form so that all unproductive behaviors are removed from the team and that it can retain focus on its agenda. If an intervention is effectively executed the team will feel energized about the work that they are doing, promote communication and interaction amongst the group and improve moral overall. High-performing teams are very import to organizations because they consistently produce high quality output and develop a collective purpose for their work. This drive to succeed allows team members to utilize specific talents allowing for growth in these areas. In addition, these team members usually take on a sense of ownership with their projects and feel that the other team members are irreplaceable. References: http://blog.assembla.com/assemblablog/tabid/12618/bid/3127/Three-ways-to-organize-your-team-co-located-outsourced-or-global.aspx Katzenbach, J.R. & Smith, D.K. (1993). The Wisdom of Teams: Creating the High-performance Organization. Boston: Harvard Business School.

Read the article
What's up with OCFS2?

- by wcoekaer

On Linux there are many filesystem choices and even from Oracle we provide a number of filesystems, all with their own advantages and use cases. Customers often confuse ACFS with OCFS or OCFS2 which then causes assumptions to be made such as one replacing the other etc... I thought it would be good to write up a summary of how OCFS2 got to where it is, what we're up to still, how it is different from other options and how this really is a cool native Linux cluster filesystem that we worked on for many years and is still widely used. Work on a cluster filesystem at Oracle started many years ago, in the early 2000's when the Oracle Database Cluster development team wrote a cluster filesystem for Windows that was primarily focused on providing an alternative to raw disk devices and help customers with the deployment of Oracle Real Application Cluster (RAC). Oracle RAC is a cluster technology that lets us make a cluster of Oracle Database servers look like one big database. The RDBMS runs on many nodes and they all work on the same data. It's a Shared Disk database design. There are many advantages doing this but I will not go into detail as that is not the purpose of my write up. Suffice it to say that Oracle RAC expects all the database data to be visible in a consistent, coherent way, across all the nodes in the cluster. To do that, there were/are a few options : 1) use raw disk devices that are shared, through SCSI, FC, or iSCSI 2) use a network filesystem (NFS) 3) use a cluster filesystem(CFS) which basically gives you a filesystem that's coherent across all nodes using shared disks. It is sort of (but not quite) combining option 1 and 2 except that you don't do network access to the files, the files are effectively locally visible as if it was a local filesystem. So OCFS (Oracle Cluster FileSystem) on Windows was born. Since Linux was becoming a very important and popular platform, we decided that we would also make this available on Linux and thus the porting of OCFS/Windows started. The first version of OCFS was really primarily focused on replacing the use of Raw devices with a simple filesystem that lets you create files and provide direct IO to these files to get basically native raw disk performance. The filesystem was not designed to be fully POSIX compliant and it did not have any where near good/decent performance for regular file create/delete/access operations. Cache coherency was easy since it was basically always direct IO down to the disk device and this ensured that any time one issues a write() command it would go directly down to the disk, and not return until the write() was completed. Same for read() any sort of read from a datafile would be a read() operation that went all the way to disk and return. We did not cache any data when it came down to Oracle data files. So while OCFS worked well for that, since it did not have much of a normal filesystem feel, it was not something that could be submitted to the kernel mail list for inclusion into Linux as another native linux filesystem (setting aside the Windows porting code ...) it did its job well, it was very easy to configure, node membership was simple, locking was disk based (so very slow but it existed), you could create regular files and do regular filesystem operations to a certain extend but anything that was not database data file related was just not very useful in general. Logfiles ok, standard filesystem use, not so much. Up to this point, all the work was done, at Oracle, by Oracle developers. Once OCFS (1) was out for a while and there was a lot of use in the database RAC world, many customers wanted to do more and were asking for features that you'd expect in a normal native filesystem, a real "general purposes cluster filesystem". So the team sat down and basically started from scratch to implement what's now known as OCFS2 (Oracle Cluster FileSystem release 2). Some basic criteria were : Design it with a real Distributed Lock Manager and use the network for lock negotiation instead of the disk Make it a Linux native filesystem instead of a native shim layer and a portable core Support standard Posix compliancy and be fully cache coherent with all operations Support all the filesystem features Linux offers (ACL, extended Attributes, quotas, sparse files,...) Be modern, support large files, 32/64bit, journaling, data ordered journaling, endian neutral, we can mount on both endian /cross architecture,.. Needless to say, this was a huge development effort that took many years to complete. A few big milestones happened along the way... OCFS2 was development in the open, we did not have a private tree that we worked on without external code review from the Linux Filesystem maintainers, great folks like Christopher Hellwig reviewed the code regularly to make sure we were not doing anything out of line, we submitted the code for review on lkml a number of times to see if we were getting close for it to be included into the mainline kernel. Using this development model is standard practice for anyone that wants to write code that goes into the kernel and having any chance of doing so without a complete rewrite or.. shall I say flamefest when submitted. It saved us a tremendous amount of time by not having to re-fit code for it to be in a Linus acceptable state. Some other filesystems that were trying to get into the kernel that didn't follow an open development model had a lot harder time and a lot harsher criticism. March 2006, when Linus released 2.6.16, OCFS2 officially became part of the mainline kernel, it was accepted a little earlier in the release candidates but in 2.6.16. OCFS2 became officially part of the mainline Linux kernel tree as one of the many filesystems. It was the first cluster filesystem to make it into the kernel tree. Our hope was that it would then end up getting picked up by the distribution vendors to make it easy for everyone to have access to a CFS. Today the source code for OCFS2 is approximately 85000 lines of code. We made OCFS2 production with full support for customers that ran Oracle database on Linux, no extra or separate support contract needed. OCFS2 1.0.0 started being built for RHEL4 for x86, x86-64, ppc, s390x and ia64. For RHEL5 starting with OCFS2 1.2. SuSE was very interested in high availability and clustering and decided to build and include OCFS2 with SLES9 for their customers and was, next to Oracle, the main contributor to the filesystem for both new features and bug fixes. Source code was always available even prior to inclusion into mainline and as of 2.6.16, source code was just part of a Linux kernel download from kernel.org, which it still is, today. So the latest OCFS2 code is always the upstream mainline Linux kernel. OCFS2 is the cluster filesystem used in Oracle VM 2 and Oracle VM 3 as the virtual disk repository filesystem. Since the filesystem is in the Linux kernel it's released under the GPL v2 The release model has always been that new feature development happened in the mainline kernel and we then built consistent, well tested, snapshots that had versions, 1.2, 1.4, 1.6, 1.8. But these releases were effectively just snapshots in time that were tested for stability and release quality. OCFS2 is very easy to use, there's a simple text file that contains the node information (hostname, node number, cluster name) and a file that contains the cluster heartbeat timeouts. It is very small, and very efficient. As Sunil Mushran wrote in the manual : OCFS2 is an efficient, easily configured, quickly installed, fully integrated and compatible, feature-rich, architecture and endian neutral, cache coherent, ordered data journaling, POSIX-compliant, shared disk cluster file system. Here is a list of some of the important features that are included : Variable Block and Cluster sizes Supports block sizes ranging from 512 bytes to 4 KB and cluster sizes ranging from 4 KB to 1 MB (increments in power of 2). Extent-based Allocations Tracks the allocated space in ranges of clusters making it especially efficient for storing very large files. Optimized Allocations Supports sparse files, inline-data, unwritten extents, hole punching and allocation reservation for higher performance and efficient storage. File Cloning/snapshots REFLINK is a feature which introduces copy-on-write clones of files in a cluster coherent way. Indexed Directories Allows efficient access to millions of objects in a directory. Metadata Checksums Detects silent corruption in inodes and directories. Extended Attributes Supports attaching an unlimited number of name:value pairs to the file system objects like regular files, directories, symbolic links, etc. Advanced Security Supports POSIX ACLs and SELinux in addition to the traditional file access permission model. Quotas Supports user and group quotas. Journaling Supports both ordered and writeback data journaling modes to provide file system consistency in the event of power failure or system crash. Endian and Architecture neutral Supports a cluster of nodes with mixed architectures. Allows concurrent mounts on nodes running 32-bit and 64-bit, little-endian (x86, x86_64, ia64) and big-endian (ppc64) architectures. In-built Cluster-stack with DLM Includes an easy to configure, in-kernel cluster-stack with a distributed lock manager. Buffered, Direct, Asynchronous, Splice and Memory Mapped I/Os Supports all modes of I/Os for maximum flexibility and performance. Comprehensive Tools Support Provides a familiar EXT3-style tool-set that uses similar parameters for ease-of-use. The filesystem was distributed for Linux distributions in separate RPM form and this had to be built for every single kernel errata release or every updated kernel provided by the vendor. We provided builds from Oracle for Oracle Linux and all kernels released by Oracle and for Red Hat Enterprise Linux. SuSE provided the modules directly for every kernel they shipped. With the introduction of the Unbreakable Enterprise Kernel for Oracle Linux and our interest in reducing the overhead of building filesystem modules for every minor release, we decide to make OCFS2 available as part of UEK. There was no more need for separate kernel modules, everything was built-in and a kernel upgrade automatically updated the filesystem, as it should. UEK allowed us to not having to backport new upstream filesystem code into an older kernel version, backporting features into older versions introduces risk and requires extra testing because the code is basically partially rewritten. The UEK model works really well for continuing to provide OCFS2 without that extra overhead. Because the RHEL kernel did not contain OCFS2 as a kernel module (it is in the source tree but it is not built by the vendor in kernel module form) we stopped adding the extra packages to Oracle Linux and its RHEL compatible kernel and for RHEL. Oracle Linux customers/users obviously get OCFS2 included as part of the Unbreakable Enterprise Kernel, SuSE customers get it by SuSE distributed with SLES and Red Hat can decide to distribute OCFS2 to their customers if they chose to as it's just a matter of compiling the module and making it available. OCFS2 today, in the mainline kernel is pretty much feature complete in terms of integration with every filesystem feature Linux offers and it is still actively maintained with Joel Becker being the primary maintainer. Since we use OCFS2 as part of Oracle VM, we continue to look at interesting new functionality to add, REFLINK was a good example, and as such we continue to enhance the filesystem where it makes sense. Bugfixes and any sort of code that goes into the mainline Linux kernel that affects filesystems, automatically also modifies OCFS2 so it's in kernel, actively maintained but not a lot of new development happening at this time. We continue to fully support OCFS2 as part of Oracle Linux and the Unbreakable Enterprise Kernel and other vendors make their own decisions on support as it's really a Linux cluster filesystem now more than something that we provide to customers. It really just is part of Linux like EXT3 or BTRFS etc, the OS distribution vendors decide. Do not confuse OCFS2 with ACFS (ASM cluster Filesystem) also known as Oracle Cloud Filesystem. ACFS is a filesystem that's provided by Oracle on various OS platforms and really integrates into Oracle ASM (Automatic Storage Management). It's a very powerful Cluster Filesystem but it's not distributed as part of the Operating System, it's distributed with the Oracle Database product and installs with and lives inside Oracle ASM. ACFS obviously is fully supported on Linux (Oracle Linux, Red Hat Enterprise Linux) but OCFS2 independently as a native Linux filesystem is also, and continues to also be supported. ACFS is very much tied into the Oracle RDBMS, OCFS2 is just a standard native Linux filesystem with no ties into Oracle products. Customers running the Oracle database and ASM really should consider using ACFS as it also provides storage/clustered volume management. Customers wanting to use a simple, easy to use generic Linux cluster filesystem should consider using OCFS2. To learn more about OCFS2 in detail, you can find good documentation on http://oss.oracle.com/projects/ocfs2 in the Documentation area, or get the latest mainline kernel from http://kernel.org and read the source. One final, unrelated note - since I am not always able to publicly answer or respond to comments, I do not want to selectively publish comments from readers. Sometimes I forget to publish comments, sometime I publish them and sometimes I would publish them but if for some reason I cannot publicly comment on them, it becomes a very one-sided stream. So for now I am going to not publish comments from anyone, to be fair to all sides. You are always welcome to email me and I will do my best to respond to technical questions, questions about strategy or direction are sometimes not possible to answer for obvious reasons.

Read the article
Why should I use Amazon Route 53 over my registrar's DNS servers?

- by Abtin Forouzandeh

I am building a site that I anticipate will have high usage. Currently, my registrar (GoDaddy) is handling DNS. However, Amazon's Route 53 looks interesting. They promise high speed and offer globally distributed DNS servers and a programmable interface. While GoDaddy doesn't offer a programmable interface, I assume their servers are geographically distributed as well. What are the main reasons I should opt to use Amazon Route 53 over free registrar-based DNS?

Read the article
Objective-C Plugin Architecture Security (Mac, not iphone)

- by Tom Dalling

I'm possibly writing a plugin system for a Cocoa application (Mac, not iphone). A common approach is the make each plugin a bundle, then inject the bundle into the main application. I'm concerned with the security implications of doing this, as the bundle will have complete access to the Objective-C runtime. I am especially concerned with a plugin having access to the code that handles registration and serial keys. Another plugin system we are considering is based on distributed notifications. Basically, each plugin will be a separate process, and they will communicate via distributed notifications only. Is there a way to load bundles securely (e.g. sandboxing)? If not, do you see any problems with using distributed notifications? Are there any other plugin architectures that would be better?

Read the article
Deduping your redundancies

- by nospam(at)example.com (Joerg Moellenkamp)

Robin Harris of Storagemojo pointed to an interesting article about about deduplication and it's impact to the resiliency of your data against data corruption on ACM Queue. The problem in short: A considerable number of filesystems store important metadata at multiple locations. For example the ZFS rootblock is copied to three locations. Other filesystems have similar provisions to protect their metadata. However you can easily proof, that the rootblock pointer in the uberblock of ZFS for example is pointing to blocks with absolutely equal content in all three locatition (with zdb -uu and zdb -r). It has to be that way, because they are protected by the same checksum. A number of devices offer block level dedup, either as an option or as part of their inner workings. However when you store three identical blocks on them and the devices does block level dedup internally, the device may just deduplicated your redundant metadata to a block stored just once that is stored on the non-voilatile storage. When this block is corrupted, you have essentially three corrupted copies. Three hit with one bullet. This is indeed an interesting problem: A device doing deduplication doesn't know if a block is important or just a datablock. This is the reason why I like deduplication like it's done in ZFS. It's an integrated part and so important parts don't get deduplicated away. A disk accessed by a block level interface doesn't know anything about the importance of a block. A metadata block is nothing different to it's inner mechanism than a normal data block because there is no way to tell that this is important and that those redundancies aren't allowed to fall prey to some clever deduplication mechanism. Robin talks about this in regard of the Sandforce disk controllers who use a kind of dedup to reduce some of the nasty effects of writing data to flash, but the problem is much broader. However this is relevant whenever you are using a device with block level deduplication. It's just the point that you have to activate it for most implementation by command, whereas certain devices do this by default or by design and you don't know about it. However I'm not perfectly sure about that ? given that storage administration and server administration are often different groups with different business objectives I would ask your storage guys if they have activated dedup without telling somebody elase on their boxes in order to speak less often with the storage sales rep. The problem is even more interesting with ZFS. You may use ditto blocks to protect important data to store multiple copies of data in the pool to increase redundancy, even when your pool just consists out of one disk or just a striped set of disk. However when your device is doing dedup internally it may remove your redundancy before it hits the nonvolatile storage. You've won nothing. Just spend your disk quota on the the LUNs in the SAN and you make your disk admin happy because of the good dedup ratio However you can just fall in this specific "deduped ditto block"trap when your pool just consists out of a single device, because ZFS writes ditto blocks on different disks, when there is more than just one disk. Yet another reason why you should spend some extra-thought when putting your zpool on a single LUN, especially when the LUN is sliced and dices out of a large heap of storage devices by a storage controller. However I have one problem with the articles and their specific mention of ZFS: You can just hit by this problem when you are using the deduplicating device for the pool. However in the specifically mentioned case of SSD this isn't the usecase. Most implementations of SSD in conjunction with ZFS are hybrid storage pools and so rotating rust disk is used as pool and SSD are used as L2ARC/sZIL. And there it simply doesn't matter: When you really have to resort to the sZIL (your system went down, it doesn't matter of one block or several blocks are corrupt, you have to fail back to the last known good transaction group the device. On the other side, when a block in L2ARC is corrupt, you simply read it from the pool and in HSP implementations this is the already mentioned rust. In conjunction with ZFS this is more interesting when using a storage array, that is capable to do dedup and where you use LUNs for your pool. However as mentioned before, on those devices it's a user made decision to do so, and so it's less probable that you deduplicating your redundancies. Other filesystems lacking acapability similar to hybrid storage pools are more "haunted" by this problem of SSD using dedup-like mechanisms internally, because those filesystem really store the data on the the SSD instead of using it just as accelerating devices. However at the end Robin is correct: It's jet another point why protecting your data by creating redundancies by dispersing it several disks (by mirror or parity RAIDs) is really important. No dedup mechanism inside a device can dedup away your redundancy when you write it to a totally different and indepenent device.

Read the article
New Book From Luís Abreu: ASP.NET 4.0 – The Complete Course (Portuguese)

- by Paulo Morgado

Thsi book, with several practical examples, presents how to build web applications using ASP.NET 4.0. Starts by introducing the framework to build pages and controls and gradually introduces all the new features available. More compact that its previous versions (part of the content was moved to FCA’s site in the form of apendices), this new book gives emphasis to to the new features in ASP.NET 4.0 and targets both developers new to ASP.NET and developers moving from previous versions of ASP.NET. This time there’s good new for Brazilian readers. The book will be distributed in Brazil by: Zamboni Comércio de Livros Ltda. Av.Parada Pinto, 1476 São Paulo – SP Telf. / Fax: +55 11 2233-2333 E-mail: [email protected] Our book (LINQ Com C# (Portuguese)) isn’t still distributed in Brazil, but, if you want it, you can always try that distributer.

Read the article
What steps can you take to ensure sane build environments when compiling software?

- by Chris Adams

Hi guys, I've been stuck with a compilation problem when building a standardised virtual machine on CentOS 5.4, and I'm in the dark here as to a) why this error is occurring, and b) how to fix it, and in the hope that someone else stumbles across this problem too, I'm hoping someone can help me find the solution here. I'm getting a configure: error: newly created file is older than distributed files! error when trying to compile Ruby Enterprise like below when I try to run the installer, and the solutions offered to on the forums (of checking the tine, and touching the files to update the time associated with them) don't seem to be helping here. What steps can I take to work out what the cause of this problem? [vagrant@vagrant-centos-5 ruby-enterprise-1.8.7-2009.10]$ sudo ./installer Welcome to the Ruby Enterprise Edition installer This installer will help you install Ruby Enterprise Edition 1.8.7-2009.10. Don't worry, none of your system files will be touched if you don't want them to, so there is no risk that things will screw up. You can expect this from the installation process: 1. Ruby Enterprise Edition will be compiled and optimized for speed for this system. 2. Ruby on Rails will be installed for Ruby Enterprise Edition. 3. You will learn how to tell Phusion Passenger to use Ruby Enterprise Edition instead of regular Ruby. Press Enter to continue, or Ctrl-C to abort. Checking for required software... * C compiler... found at /usr/bin/gcc * C++ compiler... found at /usr/bin/g++ * The 'make' tool... found at /usr/bin/make * Zlib development headers... found * OpenSSL development headers... found * GNU Readline development headers... found -------------------------------------------- Target directory Where would you like to install Ruby Enterprise Edition to? (All Ruby Enterprise Edition files will be put inside that directory.) [/opt/ruby-enterprise] : -------------------------------------------- Compiling and optimizing the memory allocator for Ruby Enterprise Edition In the mean time, feel free to grab a cup of coffee. ./configure --prefix=/opt/ruby-enterprise --disable-dependency-tracking checking build system type... i686-pc-linux-gnu checking host system type... i686-pc-linux-gnu checking for a BSD-compatible install... /usr/bin/install -c checking whether build environment is sane... configure: error: newly created file is older than distributed files! Check your system clock This is a virtual machine running on virtualbox, and the time of the host and the virtual machine are identical, and up to date. I've also tried running this after updating time with an ntp-client, so no avail. I tried this after reading this post here of someone having a similar problem [vagrant@vagrant-centos-5 ruby-enterprise-1.8.7-2009.10]$ date Tue Apr 27 08:09:05 BST 2010 The other approach I've tried is to touch the top level the files in the build folder like suggested here, but this hasn't worked either (an to be honest, I'm not sure why it would have worked either) [vagrant@vagrant-centos-5 ruby-enterprise-1.8.7-2009.10]$ sudo touch ruby-enterprise-1.8.7-2009.10/* I'm not sure what I can do next here - the problem seems to be the bash configure script that returns this error error: newly created file is older than distributed files!, at line :2214 { echo "$as_me:$LINENO: checking whether build environment is sane" >&5 echo $ECHO_N "checking whether build environment is sane... $ECHO_C" >&6; } # Just in case sleep 1 echo timestamp > conftest.file # Do `set' in a subshell so we don't clobber the current shell's # arguments. Must try -L first in case configure is actually a # symlink; some systems play weird games with the mod time of symlinks # (eg FreeBSD returns the mod time of the symlink's containing # directory). if ( set X `ls -Lt $srcdir/configure conftest.file 2> /dev/null` if test "$*" = "X"; then # -L didn't work. set X `ls -t $srcdir/configure conftest.file` fi rm -f conftest.file if test "$*" != "X $srcdir/configure conftest.file" \ && test "$*" != "X conftest.file $srcdir/configure"; then # If neither matched, then we have a broken ls. This can happen # if, for instance, CONFIG_SHELL is bash and it inherits a # broken ls alias from the environment. This has actually # happened. Such a system could not be considered "sane". { { echo "$as_me:$LINENO: error: ls -t appears to fail. Make sure there is not a broken alias in your environment" >&5 echo "$as_me: error: ls -t appears to fail. Make sure there is not a broken alias in your environment" >&2;} { (exit 1); exit 1; }; } fi ### PROBLEM LINE #### # this line is the problem line - this is returned true, sometimes it isn't and I can't # see a pattern that that determines when this will test will pass or not. test "$2" = conftest.file ) then # Ok. : else { { echo "$as_me:$LINENO: error: newly created file is older than distributed files! Check your system clock" >&5 echo "$as_me: error: newly created file is older than distributed files! Check your system clock" >&2;} { (exit 1); exit 1; }; } fi the thing that makes this really frustrating is that this script works sometimes, when the VM has been running for an hour or so it works, but not at boot. There's nothing I see in the crontab that suggests any hourly tasks are run that might change the state of the system enough make a difference to this script working. I'm totally at a loss when it comes to debugging beyond here. What's the best approach to take here? Thanks

Read the article
Oracle Releases New Mainframe Re-Hosting in Oracle Tuxedo 11g

- by Jason Williamson

I'm excited to say that we've released our next generation of Re-hosting in 11g. In fact I'm doing some hands-on labs now for our Systems Integrators in Italy in a couple of weeks and targeting Latin America next month. If you are an SI, or Rehosting firm and are looking to become an Oracle Partner or get a better understanding of Tuxedo and how to use the workbench for rehosting...drop me a line. Oracle Tuxedo Application Runtime for CICS and Batch 11g provides a CICS API emulation and Batch environment that exploits the full range of Oracle Tuxedo's capabilities. Re-hosted applications run in a multi-node, grid environment with centralized production control. Also, enterprise integration of CICS application services benefits from an open and SOA-enabled framework. Key features include: CICS Application Runtime: Can run IBM CICS applications unchanged in an application grid, which enables the distribution of large workloads across multiple processors and nodes. This simplifies CICS administration and can scale to over 100,000 users and over 50,000 transactions per second. 3270 Terminal Server: Protects business users from change through support for tn3270 terminal emulation. Distributed CICS Resource Management: Simplifies deployment and administration by allowing customers to run CICS regions in a distributed configuration. Batch Application Runtime: Provides robust IBM JES-like job management that enables local or remote job submissions. In addition, distributed batch initiators can enable parallelization of jobs and support fail-over, shortening the batch window and helping to meet stringent SLAs. Batch Execution Environment: Helps to run IBM batch unchanged and also supports JCL functionality and all common batch utilities. Oracle Tuxedo Application Rehosting Workbench 11g provides a set of automated migration tools integrated around a central repository. The tools provide high precision which results in very low error rates and the ability to handle large applications. This enables less expensive, low-risk migration projects. Key capabilities include: Workbench Repository and Cataloguer: Ensures integrity of the migrated application assets through full dependency checking. The Cataloguer generates and maintains all relevant meta-data on source and target components. File Migrator: Supports reliable migration of datasets and flat files to an ISAM or Oracle Database 11g. This is done through the automated migration utilities for data unloading, reloading and validation. It also generates logical access functions to shield developers from data repository changes. DB2 Migrator: Similarly, this tool automates the migration of DB2 schema and data to Oracle Database 11g. COBOL Migrator: Supports migration of IBM mainframe COBOL assets (OLTP and Batch) to open systems. Adapts programs for compiler dialects and data access variations. JCL Migrator: Supports migration of IBM JCL jobs to a Tuxedo ART environment, maintaining the flow and characteristics of batch jobs.

Read the article
Big Data – Various Learning Resources – How to Start with Big Data? – Day 20 of 21

- by Pinal Dave

In yesterday’s blog post we learned how to become a Data Scientist for Big Data. In this article we will go over various learning resources related to Big Data. In this series we have covered many of the most essential details about Big Data. At the beginning of this series, I have encouraged readers to send me questions. One of the most popular questions is - “I want to learn more about Big Data. Where can I learn it?” This is indeed a great question as there are plenty of resources out to learn about Big Data and it is indeed difficult to select on one resource to learn Big Data. Hence I decided to write here a few of the very important resources which are related to Big Data. Learn from Pluralsight Pluralsight is a global leader in high-quality online training for hardcore developers. It has fantastic Big Data Courses and I started to learn about Big Data with the help of Pluralsight. Here are few of the courses which are directly related to Big Data. Big Data: The Big Picture Big Data Analytics with Tableau NoSQL: The Big Picture Understanding NoSQL Data Analysis Fundamentals with Tableau I encourage all of you start with this video course as they are fantastic fundamentals to learn Big Data. Learn from Apache Resources at Apache are single point the most authentic learning resources. If you want to learn fundamentals and go deep about every aspect of the Big Data, I believe you must understand various concepts in Apache’s library. I am pretty impressed with the documentation and I am personally referencing it every single day when I work with Big Data. I strongly encourage all of you to bookmark following all the links for authentic big data learning. Haddop - The Apache Hadoop® project develops open-source software for reliable, scalable, distributed computing. Ambari: A web-based tool for provisioning, managing, and monitoring Apache Hadoop clusters which include support for Hadoop HDFS, Hadoop MapReduce, Hive, HCatalog, HBase, ZooKeeper, Oozie, Pig and Sqoop. Ambari also provides a dashboard for viewing cluster health such as heat maps and ability to view MapReduce, Pig and Hive applications visually along with features to diagnose their performance characteristics in a user-friendly manner. Avro: A data serialization system. Cassandra: A scalable multi-master database with no single points of failure. Chukwa: A data collection system for managing large distributed systems. HBase: A scalable, distributed database that supports structured data storage for large tables. Hive: A data warehouse infrastructure that provides data summarization and ad hoc querying. Mahout: A Scalable machine learning and data mining library. Pig: A high-level data-flow language and execution framework for parallel computation. ZooKeeper: A high-performance coordination service for distributed applications. Learn from Vendors One of the biggest issues with about learning Big Data is setting up the environment. Every Big Data vendor has different environment request and there are lots of things require to set up Big Data framework. Many of the users do not start with Big Data as they are afraid about the resources required to set up framework as well as a time commitment. Here Hortonworks have created fantastic learning environment. They have created Sandbox with everything one person needs to learn Big Data and also have provided excellent tutoring along with it. Sandbox comes with a dozen hands-on tutorial that will guide you through the basics of Hadoop as well it contains the Hortonworks Data Platform. I think Hortonworks did a fantastic job building this Sandbox and Tutorial. Though there are plenty of different Big Data Vendors I have decided to list only Hortonworks due to their unique setup. Please leave a comment if there are any other such platform to learn Big Data. I will include them over here as well. Learn from Books There are indeed few good books out there which one can refer to learn Big Data. Here are few good books which I have read. I will update the list as I will learn more. Ethics of Big Data Balancing Risk and Innovation Big Data for Dummies Head First Data Analysis: A Learner’s Guide to Big Numbers, Statistics, and Good Decisions If you search on Amazon there are millions of the books but I think above three books are a great set of books and it will give you great ideas about Big Data. Once you go through above books, you will have a clear idea about what is the next step you should follow in this series. You will be capable enough to make the right decision for yourself. Tomorrow In tomorrow’s blog post we will wrap up this series of Big Data. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Big Data, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL

Read the article
Big Data – Operational Databases Supporting Big Data – RDBMS and NoSQL – Day 12 of 21

- by Pinal Dave

In yesterday’s blog post we learned the importance of the Cloud in the Big Data Story. In this article we will understand the role of Operational Databases Supporting Big Data Story. Even though we keep on talking about Big Data architecture, it is extremely crucial to understand that Big Data system can’t just exist in the isolation of itself. There are many needs of the business can only be fully filled with the help of the operational databases. Just having a system which can analysis big data may not solve every single data problem. Real World Example Think about this way, you are using Facebook and you have just updated your information about the current relationship status. In the next few seconds the same information is also reflected in the timeline of your partner as well as a few of the immediate friends. After a while you will notice that the same information is now also available to your remote friends. Later on when someone searches for all the relationship changes with their friends your change of the relationship will also show up in the same list. Now here is the question – do you think Big Data architecture is doing every single of these changes? Do you think that the immediate reflection of your relationship changes with your family member is also because of the technology used in Big Data. Actually the answer is Facebook uses MySQL to do various updates in the timeline as well as various events we do on their homepage. It is really difficult to part from the operational databases in any real world business. Now we will see a few of the examples of the operational databases. Relational Databases (This blog post) NoSQL Databases (This blog post) Key-Value Pair Databases (Tomorrow’s post) Document Databases (Tomorrow’s post) Columnar Databases (The Day After’s post) Graph Databases (The Day After’s post) Spatial Databases (The Day After’s post) Relational Databases We have earlier discussed about the RDBMS role in the Big Data’s story in detail so we will not cover it extensively over here. Relational Database is pretty much everywhere in most of the businesses which are here for many years. The importance and existence of the relational database are always going to be there as long as there are meaningful structured data around. There are many different kinds of relational databases for example Oracle, SQL Server, MySQL and many others. If you are looking for Open Source and widely accepted database, I suggest to try MySQL as that has been very popular in the last few years. I also suggest you to try out PostgreSQL as well. Besides many other essential qualities PostgreeSQL have very interesting licensing policies. PostgreSQL licenses allow modifications and distribution of the application in open or closed (source) form. One can make any modifications and can keep it private as well as well contribute to the community. I believe this one quality makes it much more interesting to use as well it will play very important role in future. Nonrelational Databases (NOSQL) We have also covered Nonrelational Dabases in earlier blog posts. NoSQL actually stands for Not Only SQL Databases. There are plenty of NoSQL databases out in the market and selecting the right one is always very challenging. Here are few of the properties which are very essential to consider when selecting the right NoSQL database for operational purpose. Data and Query Model Persistence of Data and Design Eventual Consistency Scalability Though above all of the properties are interesting to have in any NoSQL database but the one which most attracts to me is Eventual Consistency. Eventual Consistency RDBMS uses ACID (Atomicity, Consistency, Isolation, Durability) as a key mechanism for ensuring the data consistency, whereas NonRelational DBMS uses BASE for the same purpose. Base stands for Basically Available, Soft state and Eventual consistency. Eventual consistency is widely deployed in distributed systems. It is a consistency model used in distributed computing which expects unexpected often. In large distributed system, there are always various nodes joining and various nodes being removed as they are often using commodity servers. This happens either intentionally or accidentally. Even though one or more nodes are down, it is expected that entire system still functions normally. Applications should be able to do various updates as well as retrieval of the data successfully without any issue. Additionally, this also means that system is expected to return the same updated data anytime from all the functioning nodes. Irrespective of when any node is joining the system, if it is marked to hold some data it should contain the same updated data eventually. As per Wikipedia - Eventual consistency is a consistency model used in distributed computing that informally guarantees that, if no new updates are made to a given data item, eventually all accesses to that item will return the last updated value. In other words - Informally, if no additional updates are made to a given data item, all reads to that item will eventually return the same value. Tomorrow In tomorrow’s blog post we will discuss about various other Operational Databases supporting Big Data. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Big Data, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL

Read the article
Windows Azure Recipe: Big Data

- by Clint Edmonson

As the name implies, what we’re talking about here is the explosion of electronic data that comes from huge volumes of transactions, devices, and sensors being captured by businesses today. This data often comes in unstructured formats and/or too fast for us to effectively process in real time. Collectively, we call these the 4 big data V’s: Volume, Velocity, Variety, and Variability. These qualities make this type of data best managed by NoSQL systems like Hadoop, rather than by conventional Relational Database Management System (RDBMS). We know that there are patterns hidden inside this data that might provide competitive insight into market trends. The key is knowing when and how to leverage these “No SQL” tools combined with traditional business such as SQL-based relational databases and warehouses and other business intelligence tools. Drivers Petabyte scale data collection and storage Business intelligence and insight Solution The sketch below shows one of many big data solutions using Hadoop’s unique highly scalable storage and parallel processing capabilities combined with Microsoft Office’s Business Intelligence Components to access the data in the cluster. Ingredients Hadoop – this big data industry heavyweight provides both large scale data storage infrastructure and a highly parallelized map-reduce processing engine to crunch through the data efficiently. Here are the key pieces of the environment: Pig - a platform for analyzing large data sets that consists of a high-level language for expressing data analysis programs, coupled with infrastructure for evaluating these programs. Mahout - a machine learning library with algorithms for clustering, classification and batch based collaborative filtering that are implemented on top of Apache Hadoop using the map/reduce paradigm. Hive - data warehouse software built on top of Apache Hadoop that facilitates querying and managing large datasets residing in distributed storage. Directly accessible to Microsoft Office and other consumers via add-ins and the Hive ODBC data driver. Pegasus - a Peta-scale graph mining system that runs in parallel, distributed manner on top of Hadoop and that provides algorithms for important graph mining tasks such as Degree, PageRank, Random Walk with Restart (RWR), Radius, and Connected Components. Sqoop - a tool designed for efficiently transferring bulk data between Apache Hadoop and structured data stores such as relational databases. Flume - a distributed, reliable, and available service for efficiently collecting, aggregating, and moving large log data amounts to HDFS. Database – directly accessible to Hadoop via the Sqoop based Microsoft SQL Server Connector for Apache Hadoop, data can be efficiently transferred to traditional relational data stores for replication, reporting, or other needs. Reporting – provides easily consumable reporting when combined with a database being fed from the Hadoop environment. Training These links point to online Windows Azure training labs where you can learn more about the individual ingredients described above. Hadoop Learning Resources (20+ tutorials and labs) Huge collection of resources for learning about all aspects of Apache Hadoop-based development on Windows Azure and the Hadoop and Windows Azure Ecosystems SQL Azure (7 labs) Microsoft SQL Azure delivers on the Microsoft Data Platform vision of extending the SQL Server capabilities to the cloud as web-based services, enabling you to store structured, semi-structured, and unstructured data. See my Windows Azure Resource Guide for more guidance on how to get started, including links web portals, training kits, samples, and blogs related to Windows Azure.

Read the article
TG'10 Conference Set

Meeting of the nation's largest and most-comprehensive distributed cyberinfrastructure for open scientific research will take place August 2-5 Research - Computer Science - Cyberinfrastructure - National Science Foundation - Organizations

Read the article
TG'10 Conference Set

Meeting of the nation's largest and most-comprehensive distributed cyberinfrastructure for open scientific research will take place August 2-5 Research - Computer Science - Cyberinfrastructure - National Science Foundation - Organizations

Read the article
What are the real life implications for an Apache 2 license?

- by Duopixel

I want to use SVG Edit for project. This software is distributed under the Apache 2 license. I've seen that: all copies, modified or unmodified, are accompanied by a copy of the licence all modifications are clearly marked as being the work of the modifier all notices of copyright, trademark and patent rights are reproduced accurately in distributed copies the licensee does not use any trademarks that belong to the licensor Do these pertain to the code or should I display the license somewhere in the GUI? The orignal software displays a "powered by SVG Edit", is it ok if I remove this? And most importantly: what is the correct etiquette for doing this? I don't want to be an asshole, but at the same time I want to simplify the UI as much as possible and removing the link will be part of it if it's not considered rude.

Read the article
Self-Scaling Architecture for Group and Enterprise Computing

Better computing and communication for emergency personnel at disaster sites Computing - Business - Computer Science - Distributed Computing - Cloud computing

Read the article
Pay in the future should make you think in the present

- by BuckWoody

Distributed Computing - and more importantly “-as-a-Service” models of computing have a different cost model. This is something that sounds obvious on the surface but it’s often forgotten during the design and coding phase of a project. In on-premises computing, we’re used to purchasing a server and all of the hardware infrastructure and software licenses needed not only for one project, but several. This is an up-front or “sunk” cost that we consume by running code the organization needs to perform its function. Using a direct connection over wires you’ve already paid for, we don’t often have to think about bandwidth, hits on the data store or the amount of compute we use - we just know more is better. In a pay-as-you-go model, however, each of these architecture decisions has a potential cost impact. The amount of data you store, the number of times you access it, and the amount you send back all come with a charge. The offset is that you don’t buy anything at all up-front, so that sunk cost is freed up. And financial professionals know that money now is worth more than money later. Saving that up-front cost allows you to invest it in other things. It’s not just that you’re using things that now cost money - it’s that the design itself in distributed computing has a cost impact. That can be a really good thing, such as when you dynamically add capacity for paying customers. If you can tie back the cost of a series of clicks to what a user will pay to do so, you can set a profit margin that is easy to track. Here’s a case in point: Assume you are using a large instance in Windows Azure to compute some data that you retrieve from a SQL Azure database. If you don’t monitor the path of the application, you may not know what you are really using. Since you’re paying by the size of the instance, it’s best to maximize it all the time. Recently I evaluated just this situation, and found that downsizing the instance and adding another one where needed, adding a caching function to the application, moving part of the data into Windows Azure tables not only increased the speed of the application, but reduced the cost and more closely tied the cost to the profit. The key is this: from the very outset - the design - make sure you include metrics to measure for the cost/performance (sometimes these are the same) for your application. Windows Azure opens up awesome new ways of doing things, so make sure you study distributed systems architecture before you try and force in the application design you have on premises into your new application structure.

Read the article
Is Ubuntu MAAS free? Will it remain like that?

- by Bruno Pereira

Ubuntu MAAS, very cool, awesome in fact, looks like a unique tool for several jobs. It looks free, but part of its documentation starts already with clauses that would scare anyone with interest in it: Documentation is copy righted by Canonical; Documentation must be used only for non-commercial purposes; If documentation is distributed within the non-commercial clause you must retain copyright; It just sounds a lot for a guide on how to install MAAS + Juju + Openstack and that scares me a bit. Under what license is Ubuntu MAAS distributed and what would be the reasoning for being so worried about copyrighting a guide like that so heavily? Is Ubuntu MAAS free? Will it continue like that?

Read the article
Report: Memcached Vendors Bulk Up for Web 2.0

Gear6 and Schooner roll out new solutions built on top of the open source memcached distributed caching effort.

Read the article
Basic memcached question

- by Aadith

I have been reading up on distributed hashing. I learnt that consistent hashing is used for distributing the keys among cache machines. I also learnt that, a key is duplicated on mutiple caches to handle failure of cache hosts. But what I have come across on memcached doesn't seem to be in alignment with all this. I read that all cache nodes are independent of each other and that if a cache goes down, requests go to DB. Theres no mention of cache miss on a host resulting in the host directing the request to another host which could either be holding the key or is nearer to the key. Can you please tell me how these two fit together? Is memcached a very preliminary form of distributed hashing which doesnt have much sophistication?

Read the article
Hadoop:Only master node does the work

- by user287722

I've setup a Hadoop 2.2 cluster with 1 master node(namenode and secondary namenode) and 3 slave nodes(datanode and namenode on each one).All of the machines use Linux Mint 64bit. When I run my MapReduce program, writen in Java, I can only see that master node is using extra CPU and RAM. Slave nodes are not doing a thing. I've checked the logs from all of the namenodes and there is nothing wrong with the namenodes on slave nodes. Resource Manager is running and all of the slave nodes can see the Resource Manager. I used this http://n0where.net/hadoop-2-2-multi-node-cluster-setup/ tutorial to configure my nodes. Datanodes are working in terms of distributed data storing but I can't see any indication of distributed data processing. Do I have to configure the xml configuration files in some other way so all of the machines will process data while I'm running my MapReduce Job?

Read the article
On what name should I claim copyright in open source software?

- by ONOZ

When I want to use the Apache 2.0 licence in my project, I should include this in the comments of my source code: Copyright [yyyy] [name of copyright owner] Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License. What name should I fill in for [name of copyright owner]? I am currently working alone on this project, but I'm going to release the source code so there might be other contributors in the near future.

Read the article
WS-Eventing for WCF (Indigo)

This article describes the design, implementation and usage of the WS-Eventing for distributed applications driven by new MS communication model WCF (Windows Communication Foundation)

Read the article
EMC Gives Data Domain New Replication Features

Cascaded replication and 180-to-1 remote site fan-in features are aimed at large distributed enterprises.

Read the article

Search Results

Search found 2515 results on 101 pages for 'distributed filesystems'.

Page 42/101 | < Previous Page | 38 39 40 41 42 43 44 45 46 47 48 49 | Next Page >

- by andreash

- by poolie

- by wcoekaer

- by Abtin Forouzandeh

- by Tom Dalling

- by nospam(at)example.com (Joerg Moellenkamp)

- by Paulo Morgado

- by Chris Adams

- by Jason Williamson

- by Pinal Dave

- by Pinal Dave

- by Clint Edmonson

- by Duopixel

- by BuckWoody

- by Bruno Pereira

- by Aadith

- by user287722

- by ONOZ

< Previous Page | 38 39 40 41 42 43 44 45 46 47 48 49 | Next Page >