Search Results

Search found 4125 results on 165 pages for 'hash cluster'.

Page 160/165 | < Previous Page | 156 157 158 159 160 161 162 163 164 165 | Next Page >

Scripting Language Sessions at Oracle OpenWorld and MySQL Connect, 2012

- by cj

This posts highlights some great scripting language sessions coming up at the Oracle OpenWorld and MySQL Connect conferences. These events are happening in San Francisco from the end of September. You can search for other interesting conference sessions in the Content Catalog. Also check out what is happening at JavaOne in that event's Content Catalog (I haven't included sessions from it in this post.) To find the timeslots and locations of each session, click their respective link and check the "Session Schedule" box on the top right. GEN8431 - General Session: What’s New in Oracle Database Application Development This general session takes a look at what’s been new in the last year in Oracle Database application development tools using the latest generation of database technology. Topics range from Oracle SQL Developer and Oracle Application Express to Java and PHP. (Thomas Kyte - Architect, Oracle) BOF9858 - Meet the Developers of Database Access Services (OCI, ODBC, DRCP, PHP, Python) This session is your opportunity to meet in person the Oracle developers who have built Oracle Database access tools and products such as the Oracle Call Interface (OCI), Oracle C++ Call Interface (OCCI), and Open Database Connectivity (ODBC) drivers; Transparent Application Failover (TAF); Oracle Database Instant Client; Database Resident Connection Pool (DRCP); Oracle Net Services, and so on. The team also works with those who develop the PHP, Ruby, Python, and Perl adapters for Oracle Database. Come discuss with them the features you like, your pains, and new product enhancements in the latest database technology. CON8506 - Syndication and Consolidation: Oracle Database Driver for MySQL Applications This technical session presents a new Oracle Database driver that enables you to run MySQL applications (written in PHP, Perl, C, C++, and so on) against Oracle Database with almost no code change. Use cases for such a driver include application syndication such as interoperability across a relationship database management system, application migration, and database consolidation. In addition, the session covers enhancements in database technology that enable and simplify the migration of third-party databases and applications to and consolidation with Oracle Database. Attend this session to learn more and see a live demo. (Srinath Krishnaswamy - Director, Software Development, Oracle. Kuassi Mensah - Director Product Management, Oracle. Mohammad Lari - Principal Technical Staff, Oracle ) CON9167 - Current State of PHP and MySQL Together, PHP and MySQL power large parts of the Web. The developers of both technologies continue to enhance their software to ensure that developers can be satisfied despite all their changing and growing needs. This session presents an overview of changes in PHP 5.4, which was released earlier this year and shows you various new MySQL-related features available for PHP, from transparent client-side caching to direct support for scaling and high-availability needs. (Johannes Schlüter - SoftwareDeveloper, Oracle) CON8983 - Sharding with PHP and MySQL In deploying MySQL, scale-out techniques can be used to scale out reads, but for scaling out writes, other techniques have to be used. To distribute writes over a cluster, it is necessary to shard the database and store the shards on separate servers. This session provides a brief introduction to traditional MySQL scale-out techniques in preparation for a discussion on the different sharding techniques that can be used with MySQL server and how they can be implemented with PHP. You will learn about static and dynamic sharding schemes, their advantages and drawbacks, techniques for locating and moving shards, and techniques for resharding. (Mats Kindahl - Senior Principal Software Developer, Oracle) CON9268 - Developing Python Applications with MySQL Utilities and MySQL Connector/Python This session discusses MySQL Connector/Python and the MySQL Utilities component of MySQL Workbench and explains how to write MySQL applications in Python. It includes in-depth explanations of the features of MySQL Connector/Python and the MySQL Utilities library, along with example code to illustrate the concepts. Those interested in learning how to expand or build their own utilities and connector features will benefit from the tips and tricks from the experts. This session also provides an opportunity to meet directly with the engineers and provide feedback on your issues and priorities. You can learn what exists today and influence future developments. (Geert Vanderkelen - Software Developer, Oracle) BOF9141 - MySQL Utilities and MySQL Connector/Python: Python Developers, Unite! Come to this lively discussion of the MySQL Utilities component of MySQL Workbench and MySQL Connector/Python. It includes in-depth explanations of the features and dives into the code for those interested in learning how to expand or build their own utilities and connector features. This is an audience-driven session, so put on your best Python shirt and let’s talk about MySQL Utilities and MySQL Connector/Python. (Geert Vanderkelen - Software Developer, Oracle. Charles Bell - Senior Software Developer, Oracle) CON3290 - Integrating Oracle Database with a Social Network Facebook, Flickr, YouTube, Google Maps. There are many social network sites, each with their own APIs for sharing data with them. Most developers do not realize that Oracle Database has base tools for communicating with these sites, enabling all manner of information, including multimedia, to be passed back and forth between the sites. This technical presentation goes through the methods in PL/SQL for connecting to, and then sending and retrieving, all types of data between these sites. (Marcelle Kratochvil - CTO, Piction) CON3291 - Storing and Tuning Unstructured Data and Multimedia in Oracle Database Database administrators need to learn new skills and techniques when the decision is made in their organization to let Oracle Database manage its unstructured data. They will face new scalability challenges. A single row in a table can become larger than a whole database. This presentation covers the techniques a DBA needs for managing the large volume of data in a standard Oracle Database instance. (Marcelle Kratochvil - CTO, Piction) CON3292 - Using PHP, Perl, Visual Basic, Ruby, and Python for Multimedia in Oracle Database These five programming languages are just some of the most popular ones in use at the moment in the marketplace. This presentation details how you can use them to access and retrieve multimedia from Oracle Database. It covers programming techniques and methods for achieving faster development against Oracle Database. (Marcelle Kratochvil - CTO, Piction) UGF5181 - Building Real-World Oracle DBA Tools in Perl Perl is not normally associated with building mission-critical application or DBA tools. Learn why Perl could be a good choice for building your next killer DBA app. This session draws on real-world experience of building DBA tools in Perl, showing the framework and architecture needed to deal with portability, efficiency, and maintainability. Topics include Perl frameworks; Which Comprehensive Perl Archive Network (CPAN) modules are good to use; Perl and CPAN module licensing; Perl and Oracle connectivity; Compiling and deploying your app; An example of what is possible with Perl. (Arjen Visser - CEO & CTO, Dbvisit Software Limited) CON3153 - Perl: A DBA’s and Developer’s Best (Forgotten) Friend This session reintroduces Perl as a language of choice for many solutions for DBAs and developers. Discover what makes Perl so successful and why it is so versatile in our day-to-day lives. Perl can automate all those manual tasks and is truly platform-independent. Perl may not be in the limelight the way other languages are, but it is a remarkable language, it is still very current with ongoing development, and it has amazing online resources. Learn what makes Perl so great (including CPAN), get an introduction to Perl language syntax, find out what you can use Perl for, hear how Oracle uses Perl, discover the best way to learn Perl, and take away a small Perl project challenge. (Arjen Visser - CEO & CTO, Dbvisit Software Limited) CON10332 - Oracle RightNow CX Cloud Service’s Connect PHP API: Intro, What’s New, and Roadmap Connect PHP is a public API that enables developers to build solutions with the Oracle RightNow CX Cloud Service platform. This API is used primarily by developers working within the Oracle RightNow Customer Portal Cloud Service framework who are looking to gain access to data and services hosted by the Oracle RightNow CX Cloud Service platform through a backward-compatible API. Connect for PHP leverages the same data model and services as the Connect Web Services for SOAP API. Come to this session to get an introduction and learn what’s new and what’s coming up. (Mark Rhoads - Senior Principal Applications Engineer, Oracle. Mark Ericson - Sr. Principle Product Manager, Oracle) CON10330 - Oracle RightNow CX Cloud Service APIs and Frameworks Overview Oracle RightNow CX Cloud Service APIs are available in the following areas: desktop UI, Web services, customer portal, PHP, and knowledge. These frameworks provide access to Oracle RightNow CX Cloud Service’s Connect Common Object Model and custom objects. This session provides a broad overview of capabilities in all these areas. (Mark Ericson - Sr. Principle Product Manager, Oracle)

Read the article
How you can extend Tasklists in Fusion Applications

- by Elie Wazen

In this post we describe the process of modifying and extending a Tasklist available in the Regional Area of a Fusion Applications UI Shell. This is particularly useful to Customers who would like to expose Setup Tasks (generally available in the Fusion Setup Manager application) in the various functional pillars workareas. Oracle Composer, the tool used to implement such extensions allows changes to be made at runtime. The example provided in this document is for an Oracle Fusion Financials page. Let us examine the case of a customer role who requires access to both, a workarea and its associated functional tasks, and to an FSM (setup) task. Both of these tasks represent ADF Taskflows but each is accessible from a different page. We will show how an FSM task is added to a Functional tasklist and made accessible to a user from within a single workarea, eliminating the need to navigate between the FSM application and the Functional workarea where transactions are conducted. In general, tasks in Fusion Applications are grouped in two ways: Setup tasks are grouped in tasklists available to implementers in the Functional Setup Manager (FSM). These Tasks are accessed by implementation users and in general do not represent daily operational tasks that fit into a functional business process and were consequently included in the FSM application. For these tasks, the primary organizing principle is precedence between tasks. If task "Manage Suppliers" has prerequisites, those tasks must precede it in a tasklist. Task Lists are organized to efficiently implement an offering. Tasks frequently performed as part of business process flows are made available as links in the tasklist of their corresponding menu workarea. The primary organizing principle in the menu and task pane entries is to group tasks that are generally accessed together. Customizing a tasklist thus becomes required for business scenarios where a task packaged under FSM as a setup task, is for a particular customer a regular maintenance task that is accessed for record updates or creation as part of normal operational activities and where the frequency of this access merits the inclusion of that task in the related operational tasklist A user with the role of maintaining Journals in General Ledger is also responsible for maintaining Chart of Accounts Mappings. In the Fusion Financials Product Family, Manage Journals is a task available from within the Journals Menu whereas Chart of Accounts Mapping is available via FSM under the Define Chart of Accounts tasklist Figure 1. The Manage Chart of Accounts Mapping Task in FSM Figure 2. The Manage Journals Task in the Task Pane of the Journals Workarea Our goal is to simplify cross task navigation and allow the user to access both tasks from a single tasklist on a single page without having to navigate to FSM for the Mapping task and to the Journals workarea for the Manage task. To accomplish that, we use Oracle Composer to customize the Journals tasklist by adding to it the Mapping task. Identify the Taskflow name and path of the FSM Task The first step in our process is to identify the underlying taskflow for the Manage Chart of Accounts Mappings task. We select to Setup and Maintenance from the Navigator to launch the FSM Application, and we query the task from Manage Tasklists and Tasks Figure 3. Task Details including Taskflow path The Manage Chart of Accounts Mapping Task Taskflow is: /WEB-INF/oracle/apps/financials/generalLedger/sharedSetup/coaMappings/ui/flow /CoaMappingsMainAreaFlow.xml#CoaMappingsMainAreaFlow We copy that value and use it later as a parameter to our new task in the customized Journals Tasklist. Customize the Journals Page A user with Administration privileges can start the run time customization directly from the Administration Menu of the Global Area. This customization is done at the Site level and once implemented becomes available to all users with access to the Journals Workarea. Figure 4. Customization Menu The Oracle Composer Window is displayed in the same browser and the Hierarchy of the page component is displayed and available for modification. Figure 5. Oracle Composer In the composer Window select the PanelFormLayout node and click on the Edit Button. Note that the selected component is simultaneously highlighted in the lower pane in the browser. In the Properties popup window, select the Tasks List and Task Properties Tab, where the user finds the hierarchy of the Tasklist and is able to Edit nodes or create new ones. src="https://blogs.oracle.com/FunctionalArchitecture/resource/TL5.jpg" Figure 6. The Tasklist in edit mode Add a Child Task to the Tasklist In the Edit Window the user will now create a child node at the desired level in the hierarchy by selecting the immediate parent node and clicking on the insert node button. This process requires four values to be set as described in Table 1 below. Parameter Value How to Determine the Value Focus View Id /JournalEntryPage This is the Focus View ID of the UI Shell where the Tasklist we want to customize is. A simple way to determine this value is to copy it from any of the Standard tasks on the Tasklist Label COA Mapping This is the Display name of the Task as it will appear in the Tasklist Task Type dynamicMain If the value is dynamicMain, the page contains a new link in the Regional Area. When you click the link, a new tab with the loaded task opens Taskflowid /WEB-INF/oracle/apps/financials/generalLedger/sharedSetup/ coaMappings/ui/flow/ CoaMappingsMainAreaFlow.xml#CoaMappingsMainAreaFlow This is the Taskflow path we retrieved from the Task Definition in FSM earlier in the process Table 1. Parameters and Values for the Task to be added to the customized Tasklist Figure 7. The parameters window of the newly added Task Access the FSM Task from the Journals Workarea Once the FSM task is added and its parameters defined, the user saves the record, closes the Composer making the new task immediately available to users with access to the Journals workarea (Refer to Figure 8 below). Figure 8. The COA Mapping Task is now visible and can be invoked from the Journals Workarea Additional Considerations If a Task Flow is part of a product that is deployed on the same app server as the Tasklist workarea then that task flow can be added to a customized tasklist in that workarea. Otherwise that task flow can be invoked from its parent product’s workarea tasklist by selecting that workarea from the Navigator menu. For Example The following Taskflows belong respectively to the Subledger Accounting, and to the General Ledger Products. /WEB-INF/oracle/apps/financials/subledgerAccounting/accountingMethodSetup/mappingSets/ui/flow/MappingSetFlow.xml#MappingSetFlow /WEB-INF/oracle/apps/financials/generalLedger/sharedSetup/coaMappings/ui/flow/CoaMappingsMainAreaFlow.xml#CoaMappingsMainAreaFlow Since both the Subledger Accounting and General Ledger products are part of the LedgerApp J2EE Applicaton and are both deployed on the General Ledger Cluster Server (Figure 8 below), the user can add both of the above taskflows to the tasklist in the /JournalEntryPage FocusVIewID Workarea. Note: both FSM Taskflows and Functional Taskflows can be added to the Tasklists as described in this document Figure 8. The Topology of the Fusion Financials Product Family. Note that SubLedger Accounting and General Ledger are both deployed on the Ledger App Conclusion In this document we have shown how an administrative user can edit the Tasklist in the Regional Area of a Fusion Apps page using Oracle Composer. This is useful for cases where tasks packaged in different workareas are frequently accessed by the same user. By making these tasks available from the same page, we minimize the number of steps in the navigation the user has to do to perform their transactions and queries in Fusion Apps. The example explained above showed that tasks classified as Setup tasks, meaning made accessible to implementation users from the FSM module can be added to the workarea of their respective Fusion application. This eliminates the need to navigate to FSM to access tasks that are both setup and regular maintenance tasks. References Oracle Fusion Applications Extensibility Guide 11g Release 1 (11.1.1.5) Part Number E16691-02 (Section 3.2) Oracle Fusion Applications Developer's Guide 11g Release 1 (11.1.4) Part Number E15524-05

Read the article
Package management fails in update-manager with gzip problems and compilation errors. U12.04LTS

- by HarveyP

Similar to but not the same as Package management system corrupted. Cannot install or remove packages. U12.04LTS (an earlier problem) with package management system. Followed all of L. D. James suggestions in his answer to no avail. This time as well as the gzip error I am also getting compilation errors. The difference may be due to a lack of compilation in my earlier problem so it may be the same error. The packages concerned are enumerated in the output from update-manager below. Also included below that is the output from apt-get -f install apt-get autoremove gives same output. Tried update without SSL updates - 9 to install and got "Unhandled Error in aptdaemon". Output number 3 below. One at a time - output 4 - is for firefox, first in the list of packages. Falls over at libssl1.0.0 despite deselection of it from update ... Tried apt-get install --reinstall dpkg which succeeded, apt-get install --reinstall tar and apt-get install --reinstall gzip both of which failed at libssl1.0.0 as ever. (as suggested by Subv3rsion elsewhere in this forum) Now cannot apt-get update with complete success even after changing server and apt-get clean - output number 5 below ... 1). Output from update-manager The following packages will be upgraded:<> firefox firefox-globalmenu firefox-locale-en libavcodec-extra-53 libavformat53 libavutil-extra-51 libjson0 libpostproc52 libssl1.0.0 libswscale2 openssl 11 to upgrade, 0 to newly install, 0 to remove and 0 not to upgrade.<br> Need to get 0 B/46.5 MB of archives. After this operation, 1,416 kB of additional disk space will be used.<br> Do you want to continue [Y/n]? y debconf: Perl may be unconfigured (Bareword "gensym" not allowed while "strict subs" in use at /usr/lib/perl/5.14/IO/Handle.pm line 67. BEGIN not safe after errors--compilation aborted at /usr/lib/perl/5.14/IO/Handle.pm line 366. Compilation failed in require at /usr/lib/perl/5.14/IO/Seekable.pm line 9. BEGIN failed--compilation aborted at /usr/lib/perl/5.14/IO/Seekable.pm line 9. Compilation failed in require at /usr/lib/perl/5.14/IO/File.pm line 11. BEGIN failed--compilation aborted at /usr/lib/perl/5.14/IO/File.pm line 11. Compilation failed in require at /usr/share/perl/5.14/FileHandle.pm line 9. Compilation failed in require at (eval 1) line 3. BEGIN failed--compilation aborted at (eval 1) line 3. ) -- aborting (Reading database ... 160575 files and directories currently installed.) Preparing to replace libssl1.0.0 1.0.1-4ubuntu5.14 (using .../libssl1.0.0_1.0.1-4ubuntu5.15_i386.deb) ... Unpacking replacement libssl1.0.0 ... dpkg-deb (subprocess): data: internal gzip read error: '<fd:4>: data error' dpkg-deb: error: subprocess <decompress> returned error exit status 2 dpkg: error processing /var/cache/apt/archives/libssl1.0.0_1.0.1-4ubuntu5.15_i386.deb (--unpack):<br> subprocess dpkg-deb --fsys-tarfile returned error exit status 2 No apport report written because MaxReports has already been reached Bareword "gensym" not allowed while "strict subs" in use at /usr/lib/perl/5.14/IO/Handle.pm line 67. BEGIN not safe after errors--compilation aborted at /usr/lib/perl/5.14/IO/Handle.pm line 366. Compilation failed in require at /usr/lib/perl/5.14/IO/Seekable.pm line 9. BEGIN failed--compilation aborted at /usr/lib/perl/5.14/IO/Seekable.pm line 9. Compilation failed in require at /usr/lib/perl/5.14/IO/File.pm line 11. BEGIN failed--compilation aborted at /usr/lib/perl/5.14/IO/File.pm line 11. Compilation failed in require at /usr/share/perl/5.14/FileHandle.pm line 9. Compilation failed in require at /usr/share/perl5/Debconf/Template.pm line 8. BEGIN failed--compilation aborted at /usr/share/perl5/Debconf/Template.pm line 8. Compilation failed in require at /usr/share/perl5/Debconf/Question.pm line 8. BEGIN failed--compilation aborted at /usr/share/perl5/Debconf/Question.pm line 8. Compilation failed in require at /usr/share/perl5/Debconf/Config.pm line 7. BEGIN failed--compilation aborted at /usr/share/perl5/Debconf/Config.pm line 7. Compilation failed in require at /usr/share/perl5/Debconf/Log.pm line 10. Compilation failed in require at /usr/share/perl5/Debconf/Db.pm line 7. BEGIN failed--compilation aborted at /usr/share/perl5/Debconf/Db.pm line 7. Compilation failed in require at /usr/share/debconf/frontend line 6. BEGIN failed--compilation aborted at /usr/share/debconf/frontend line 6. dpkg: error whale cleanang up: subprgcess installed post-installation script returned error exit status 2 Errors were encountered while processing: /var/cache/apt/archives/libssl1.0.0_1.0.1-4ubuntu5.15_i386.deb E: Sub-process /usr/bin/dpkg returned an error code (1) 2). Output from install -f harveyp@harveyp:~$ sudo apt-get -f install [sudo] password for harveyp: Reading package lists... Done Building dependency tree Reading state information... Done 0 to upgrade, 0 to newly install, 0 to remove and 11 not to upgrade. 1 not fully installed or removed.<br> After this operation, 0 B of additional disk space will be used. E: Internal Error, No file name for libssl1.0.0 3). Unhandled error from aptdaemon Traceback (most recent call last): File "/usr/lib/python2.7/dist-packages/aptdaemon/worker.py", line 1045, in _simulate trans.unauthenticated = self.__simulate(trans) File "/usr/lib/python2.7/dist-packages/aptdaemon/worker.py", line 1160, in __simulate unauthenticated = self._get_unauthenticated() File "/usr/lib/python2.7/dist-packages/aptdaemon/worker.py", line 347, in _get_unauthenticated for pkg in self._iterate_packages(): File "/usr/lib/python2.7/dist-packages/aptdaemon/worker.py", line 1356, in _iterate_packages for enum, pkg in enumerate(self._cache): File "/usr/lib/python2.7/dist-packages/apt/cache.py", line 216, in __iter__ yield self[pkgname] File "/usr/lib/python2.7/dist-packages/apt/cache.py", line 201, in __getitem__ pkg = self._weakref[key] = Package(self, self._cache[key]) KeyError: 'librqrcode-rubq-doc 4). output from update of firefox installArchives() failed: Error in function: < Setting up libssl1.0.0 (1.0.1-4ubuntu5.14) ... Bareword "gensym" not allowed while "strict subs" in use at /usr/lib/perl/5.14/IO/Handle.pm line 67. BEGIN not safe after errors--compilation aborted at /usr/lib/perl/5.14/IO/Handle.pm line 366. Compilation failed in require at /usr/lib/perl/5.14/IO/Seekable.pm line 9. BEGIN failed--compilation aborted at /usr/lib/perl/5.14/IO/Seekable.pm line 9. Compilation failed in require at /usr/lib/perl/5.14/IO/File.pm line 11. BEGIN failed--compilation aborted at /usr/lib/perl/5.14/IO/File.pm line 11. Compilation failed in require at /usr/share/perl/5.14/FileHandle.pm line 9. Compilation failed in require at /usr/share/perl5/Debconf/Template.pm line 8. BEGIN failed--compilation aborted at /usr/share/perl5/Debconf/Template.pm line 8. Compilation failed in require at /usr/share/perl5/Debconf/Question.pm line 8. BEGIN failed--compilation aborted at /usr/share/perl5/Debconf/Question.pm line 8. Compilation failed in require at /usr/share/perl5/Debconf/Config.pm line 7. BEGIN failed--compilation aborted at /usr/share/perl5/Debconf/Config.pm line 7. Compilation failed in require at /usr/share/perl5/Debconf/Log.pm line 10. 5. output from apt-get update ...snip ... Hit http://ubuntu-archive.mirrors.free.org precise-security/multiverse Translation-en Hit http://ubuntu-archive.mirrors.free.org precise-security/restricted Translation-en Hit http://ubuntu-archive.mirrors.free.org precise-security/universe Translation-en Fetched 368 kB in 6s (59.5 kB/s) W: Failed to fetch gzip:/var/lib/apt/lists/partial/ubuntu-archive.mirrors.free.org_ubuntu_dists_precise_universe_source_Sources Hash Sum mismatch E: Some index files failed to download. They have been ignored, or old ones used instead.

Read the article
The SPARC SuperCluster

- by Karoly Vegh

Oracle has been providing a lead in the Engineered Systems business for quite a while now, in accordance with the motto "Hardware and Software Engineered to Work Together." Indeed it is hard to find a better definition of these systems. Allow me to summarize the idea. It is: Build a compute platform optimized to run your technologies Develop application aware, intelligently caching storage components Take an impressively fast network technology interconnecting it with the compute nodes Tune the application to scale with the nodes to yet unseen performance Reduce the amount of data moving via compression Provide this all in a pre-integrated single product with a single-pane management interface All these ideas have been around in IT for quite some time now. The real Oracle advantage is adding the last one to put these all together. Oracle has built quite a portfolio of Engineered Systems, to run its technologies - and run those like they never ran before. In this post I'll focus on one of them that serves as a consolidation demigod, a multi-purpose engineered system. As you probably have guessed, I am talking about the SPARC SuperCluster. It has many great features inherited from its predecessors, and it adds several new ones. Allow me to pick out and elaborate about some of the most interesting ones from a technological point of view. I. It is the SPARC SuperCluster T4-4. That is, as compute nodes, it includes SPARC T4-4 servers that we learned to appreciate and respect for their features: The SPARC T4 CPUs: Each CPU has 8 cores, each core runs 8 threads. The SPARC T4-4 servers have 4 sockets. That is, a single compute node can in parallel, simultaneously execute 256 threads. Now, a full-rack SPARC SuperCluster has 4 of these servers on board. Remember the keyword demigod. While retaining the forerunner SPARC T3's exceptional throughput, the SPARC T4 CPUs raise the bar with single performance too - a humble 5x better one than their ancestors. actually, the SPARC T4 CPU cores run in both single-threaded and multi-threaded mode, and switch between these two on-the-fly, fulfilling not only single-threaded OR multi-threaded applications' needs, but even mixed requirements (like in database workloads!). Data security, anyone? Every SPARC T4 CPU core has a built-in encryption engine, that is, encryption algorithms cast into silicon. A PCI controller right on the chip for customers who need I/O performance. Built-in, no-cost Virtualization: Oracle VM for SPARC (the former LDoms or Logical Domains) is not a server-emulation virtualization technology but rather a serverpartitioning one, the hypervisor runs in the server firmware, and all the VMs' HW resources (I/O, CPU, memory) are accessed natively, without performance overhead. This enables customers to run a number of Solaris 10 and Solaris 11 VMs separated, independent of each other within a physical server II. For Database performance, it includes Exadata Storage Cells - one of the main reasons why the Exadata Database Machine performs at diabolic speed. What makes them important? They provide DB backend storage for your Oracle Databases to run on the SPARC SuperCluster, that is what they are built and tuned for DB performance. These storage cells are SQL-aware. That is, if a SPARC T4 database compute node executes a query, it doesn't simply request tons of raw datablocks from the storage, filters the received data, and throws away most of it where the statement doesn't apply, but provides the SQL query to the storage node too. The storage cell software speaks SQL, that is, it is able to prefilter and through that transfer only the relevant data. With this, the traffic between database nodes and storage cells is reduced immensely. Less I/O is a good thing - as they say, all the CPUs of the world do one thing just as fast as any other - and that is waiting for I/O. They don't only pre-filter, but also provide data preprocessing features - e.g. if a DB-node requests an aggregate of data, they can calculate it, and handover only the results, not the whole set. Again, less data to transfer. They support the magical HCC, (Hybrid Columnar Compression). That is, data can be stored in a precompressed form on the storage. Less data to transfer. Of course one can't simply rely on disks for performance, there is Flash Storage included there for caching. III. The low latency, high-speed backbone network: InfiniBand, that interconnects all the members with: Real High Speed: 40 Gbit/s. Full Duplex, of course. Oh, and a really low latency. RDMA. Remote Direct Memory Access. This technology allows the DB nodes to do exactly that. Remotely, directly placing SQL commands into the Memory of the storage cells. Dodging all the network-stack bottlenecks, avoiding overhead, placing requests directly into the process queue. You can also run IP over InfiniBand if you please - that's the way the compute nodes can communicate with each other. IV. Including a general-purpose storage too: the ZFSSA, which is a unified storage, providing NAS and SAN access too, with the following features: NFS over RDMA over InfiniBand. Nothing is faster network-filesystem-wise. All the ZFS features onboard, hybrid storage pools, compression, deduplication, snapshot, replication, NFS and CIFS shares Storageheads in a HA-Cluster configuration providing availability of the data DTrace Live Analytics in a web-based Administration UI Being a general purpose application data storage for your non-database applications running on the SPARC SuperCluster over whichever protocol they prefer, easily replicating, snapshotting, cloning data for them. There's a lot of great technology included in Oracle's SPARC SuperCluster, we have talked its interior through. As for external scalability: you can start with a half- of full- rack SPARC SuperCluster, and scale out to several racks - that is, stacking not separate full-rack SPARC SuperClusters, but extending always one large instance of the size of several full-racks. Yes, over InfiniBand network. Add racks as you grow. What technologies shall run on it? SPARC SuperCluster is a general purpose scaleout consolidation/cloud environment. You can run Oracle Databases with RAC scaling, or Oracle Weblogic (end enjoy the SPARC T4's advantages to run Java). Remember, Oracle technologies have been integrated with the Oracle Engineered Systems - this is the Oracle on Oracle advantage. But you can run other software environments such as SAP if you please too. Run any application that runs on Oracle Solaris 10 or Solaris 11. Separate them in Virtual Machines, or even Oracle Solaris Zones, monitor and manage those from a central UI. Here the key takeaways once again: The SPARC SuperCluster: Is a pre-integrated Engineered System Contains SPARC T4-4 servers with built-in virtualization, cryptography, dynamic threading Contains the Exadata storage cells that intelligently offload the burden of the DB-nodes Contains a highly available ZFS Storage Appliance, that provides SAN/NAS storage in a unified way Combines all these elements over a high-speed, low-latency backbone network implemented with InfiniBand Can grow from a single half-rack to several full-rack size Supports the consolidation of hundreds of applications To summarize: All these technologies are great by themselves, but the real value is like in every other Oracle Engineered System: Integration. All these technologies are tuned to perform together. Together they are way more than the sum of all - and a careful and actually very time consuming integration process is necessary to orchestrate all these for performance. The SPARC SuperCluster's goal is to enable infrastructure operations and offer a pre-integrated solution that can be architected and delivered in hours instead of months of evaluations and tests. The tedious and most importantly time and resource consuming part of the work - testing and evaluating - has been done. Now go, provide services. -- charlie

Read the article
Master Data

- by david.butler(at)oracle.com

Let's take a deeper look at what we mean when we talk about 'Master' data. In its most general sense, master data is data that exists in more than one operational application. These are the applications that automate business processes. These applications require significant amounts of data to function correctly. This includes data about the objects that are involved in transactions, as well as the transaction data itself. For example, when a customer buys a product, the transaction is managed by a sales application. The objects of the transaction are the Customer and the Product. The transactional data is the time, place, price, discount, payment methods, etc. used at the point of sale. Many thousands of transactional data attributes are needed within the application. These important data elements are local to the applications and have no bearing on other applications. Harmonization and synchronization across applications is not necessary. The Customer and Product objects of the transaction also have a large number of attributes. Customer for example, includes hierarchies, hierarchical and matrixed relationships, contacts, classifications, preferences, accounts, identifiers, profiles, and addresses galore for 'ship to', 'mail to'; 'service at'; etc. Dozens of attributes exist for individuals, hundreds for organizations, and thousands for products. This data has meaning beyond any particular application. It exists in many applications and drives the vital cross application enterprise business processes. These are the processes that define and differentiate the organization. At every decision point, information about the objects of the process determines the direction of the process flow. This is the nature of the data that exists in more than one application, and this is why we call it 'master data'. Let me elaborate. Parties Oracle has developed a party schema to model all participants in your daily business operations. It models people, organizations, groups, customers, contacts, employees, and suppliers. It models their accounts, locations, classifications, and preferences. And most importantly, it models the vast array of hierarchical and matrixed relationships that exist between all the participants in your real world operations. The model logically separates people and organizations from their relationships and accounts. This separation creates flexibility unmatched in the industry and accounts for the fact that the Oracle schema for Customers, Suppliers, and Accounts is a true superset of the wide variety of commercial and homegrown customer models in existence. Sites Sites are places where business is conducted. They can be addresses, clusters such as retail malls, locations within a cluster, floors within a building, places where meters are located, rooms on floors, etc. Fully understanding all attributes of a site is key to many business processes. Attributes such as 'noise abatement policy' at a point of delivery, or the size of an oven in a business kitchen drive day-to-day activities such as delivery schedules or food promotions. Typically this kind of data is siloed in departments and scattered across applications and spreadsheets. This leads to conflicting information and poor operational efficiencies. Oracle's Global Single Schema can hold all site attributes in one place and enables a single version of authoritative site information across the enterprise. Products and Services The Oracle Global Single Schema also includes a number of entities that define the products and services a company creates and offers for sale. Key entities include Items organized into Catalogs and Price Lists. The Catalog structures provide for the ability to capture different views of a product such as engineering, manufacturing, and service which are based on a unified product model. As a result, designers, manufacturing engineers, purchasers and partners can work simultaneously on a common product definition. The Catalog schema allows for unlimited attributes, combines them into meaningful groups, and maps them to catalog categories to track these different types of information. The model also maps an unlimited number of functional structures for each item. For example, multiple Bills of Material (BOMs) can be constructed representing requirements BOM, features BOM, and packaging BOM for an item. The Catalog model also supports hierarchical information about each item and all standard Global Data Synchronization attributes. Business Processes Utilizing Linked Data Entities Each business entity codified into a centralized master data environment significantly improves the efficiency of the automated business processes that use the consolidated data. When all the key business entities used by an organization's process are so consolidated, the advantages are multiplied. The primary reason for business process breakdowns (i.e. data errors across application boundaries) is eliminated. All processes are positively impacted and business process automation is itself automated. I like to use the "Call to Resolution" business process as an example to help illustrate this important point. It involves call center applications, service applications, RMA applications, transportation applications, inventory applications, etc. Customer, Site, Product and Supplier master data must all be correct and consistent across these applications. What's more, the data relationships between customer and product, and product and suppliers must be right. This is the minimum quality needed to insure the business process flows without error. But that is not the end of the story. Critical master data attributes such as customer loyalty, profitability, credit worthiness, and propensity to buy can optimize the call center point of contact component of the process. Critical product information such as alternative parts or equivalent products can optimize the resolution selected by the process. A comprehensive understanding of the 'service at' location can help insure multiple trips are avoided in the process. Full supplier information on reliability, delivery delays, and potential alternates can prevent supplier exceptions and play a significant role in optimizing the process. In other words, these master data attributes enable the optimization of the "Call to Resolution" enterprise business process. Master data supports and guides business process flows. Thus the phrase 'Master Data' is indeed appropriate. MDM is the software that houses, manages, and governs the master data that resides in all applications and controls the enterprise business processes. A complete master data solution takes a data model that holds fully attributed master data entities and their inter-relationships. Oracle has this model. Oracle, with its deep understanding of application data is the logical choice for managing all your master data within the enterprise whether or not your organization actually runs any Oracle Applications.

Read the article
Answers to Your Common Oracle Database Lifecycle Management Questions

- by Scott McNeil

We recently ran a live webcast on Strategies for Managing Oracle Database's Lifecycle. There were tons of questions from our audience that we simply could not get to during the hour long presentation. Below are some of those questions along with their answers. Enjoy! Question: In the webcast the presenter talked about “gold” configuration standards, for those who want to use this technique, could you recommend a best practice to consider or follow? How do I get started? Answer:Gold configuration standardization is a quick and easy way to improve availability through consistency. Start by choosing a reference database and saving the configuration to the Oracle Enterprise Manager repository using the Save Configuration feature. Next create a comparison template using the Oracle provided template as a starting point and modify the ignored properties to eliminate expected differences in your environment. Finally create a comparison specification using the comparison template you created plus your saved gold configuration and schedule it to run on a regular basis. Don’t forget to fill in the email addresses of those you want to notify upon drift detection. Watch the database configuration management demo to learn more. Question: Can Oracle Lifecycle Management Pack for Database help with patching an Oracle Real Application Cluster (RAC) environment? Answer: Yes, Oracle Enterprise Manager supports both parallel and rolling patch application of Oracle Real Application Clusters. The use of rolling patching is recommended as there is no downtime involved. For more details watch this demo. Question: What are some of the things administrators can do to control configuration drift? Why is it important? Answer:Configuration drift is one of the main causes of instability and downtime of applications. Oracle Enterprise Manager makes it easy to manage and control drift using scheduled configuration comparisons combined with comparison templates. Question: Does Oracle Enterprise Manager 12c Release 2 offer an incremental update feature for "gold" images? For instance, if the source binary has a higher PSU level, what is the best approach to update the existing "gold" image in the software library? Do you have to create a new image or can you just update the original one? Answer:Provisioning Profiles (Gold images) can contain the installation files and database configuration templates. Although it is possible to make some changes to the profile after creation (mainly to configuration), it is normally recommended to simply create a new profile after applying a patch to your reference database. Question: The webcast talked about enforcing in-house standards, does Oracle Enterprise Manager 12c offer verification of your databases and systems to those standards? For example, the initial "gold" image has been massively deployed over time, and there may be some changes to it. How can you do regular checks from Enterprise Manager to ensure the in-house standards are being enforced? Answer:There are really two methods to validate conformity to standards. The first method is to use gold standards which you compare other databases to report unwanted differences. This method uses a new comparison template technology which allows users to ignore known differences (i.e. SID, Start time, etc) which results in a report only showing important or non-conformant differences. This method is quick to setup and configure and recommended for those who want to get started validating compliance quickly. The second method leverages the new compliance framework which allows the creation of specific and robust validations. These compliance rules are grouped into standards which can be assigned to databases quickly and easily. Compliance rules allow for targeted and more sophisticated validation beyond the basic equals operation available in the comparison method. The compliance framework can be used to implement just about any internal or industry standard. The compliance results will track current and historic compliance scores at the overall and individual database targets. When the issue is resolved, the score is automatically affected. Compliance framework is the recommended long term solution for validating compliance using Oracle Enterprise Manager 12c. Check out this demo on database compliance to learn more. Question: If you are using the integration between Oracle Enterprise Manager and My Oracle Support in an "offline" mode, how do you know if you have the latest My Oracle Support metadata? Answer:In Oracle Enterprise Manager 12c Release 2, you now only need to download one zip file containing all of the metadata xmls files. There is no indication that the metadata has changed but you could run a checksum on the file and compare it to the previously downloaded version to see if it has changed. Question: What happens if a patch fails while administrators are applying it to a database or system? Answer:A large portion of Oracle Enterprise Manager's patch automation is the pre-requisite checks that happen to ensure the highest level of confidence the patch will successfully apply. It is recommended you test the patch in a non-production environment and save the patch plan as a template once successful so you can create new plans using the saved template. If you are using the recommended ‘out of place’ patching methodology, there is no urgency because the database is still running as the cloned Oracle home is being patched. Users can address the issue and restart the patch procedure at the point it left off. If you are using 'in place' method, you can address the issue and continue where the procedure left off. Question: Can Oracle Enterprise Manager 12c R2 compare configurations between more than one target at the same time? Answer:Oracle Enterprise Manager 12c can compare any number of target configurations at one time. This is the basis of many important use cases including Configuration Drift Management. These comparisons can also be scheduled on a regular basis and emails notification sent should any differences appear. To learn more about configuration search and compare watch this demo. Question: How is data comparison done since changes are taking place in a live production system? Answer:There are many things to keep in mind when using the data comparison feature (as part of the Change Management ability to compare table data). It was primarily intended to be used for maintaining consistency of important but relatively static data. For example, application seed data and application setup configuration. This data does not change often but is critical when testing an application to ensure results are consistent with production. It is not recommended to use data comparison on highly dynamic data like transactional tables or very large tables. Question: Which versions of Oracle Database can be monitored through Oracle Enterprise Manager 12c? Answer:Oracle Database versions: 9.2.0.8, 10.1.0.5, 10.2.0.4, 10.2.0.5, 11.1.0.7, 11.2.0.1, 11.2.0.2, 11.2.0.3. Watch the On-Demand Webcast Stay Connected: Twitter | Facebook | YouTube | Linkedin | NewsletterDownload the Oracle Enterprise Manager Cloud Control12c Mobile app

Read the article
What's new in Solaris 11.1?

- by Karoly Vegh

Solaris 11.1 is released. This is the first release update since Solaris 11 11/11, the versioning has been changed from MM/YY style to 11.1 highlighting that this is Solaris 11 Update 1. Solaris 11 itself has been great. What's new in Solaris 11.1? Allow me to pick some new features from the What's New PDF that can be found in the official Oracle Solaris 11.1 Documentation. The updates are very numerous, I really can't include all. I. New AI Automated Installer RBAC profiles have been introduced to enable delegation of installation tasks. II. The interactive installer now supports installing the OS to iSCSI targets. III. ASR (Auto Service Request) and OCM (Oracle Configuration Manager) have been enabled by default to proactively provide support information and create service requests to speed up support processes. This is optional and can be disabled but helps a lot in supportcases. For further information, see: http://oracle.com/goto/solarisautoreg IV. The new command svcbundle helps you to create SMF manifests without having to struggle with XML editing. (btw, do you know the interactive editprop subcommand in svccfg? The listprop/setprop subcommands are great for scripting and automating, but for an interactive property editing session try, for example, this: svccfg -s svc:/application/pkg/system-repository:default editprop ) V. pfedit: Ever wondered how to delegate editing permissions to certain files? It is well known "sudo /usr/bin/vi /etc/hosts" is not the right way, for sudo elevates the complete vi process to admin levels, and the user can "break" out of the session as root with simply starting a shell from that vi. Now, the new pfedit command provides a solution exactly to this challenge - an auditable, secure, per-user configurable editing possibility. See the pfedit man page for examples. VI. rsyslog, the popular logging daemon (filters, SSL, formattable output, SQL collect...) has been included in Solaris 11.1 as an alternative to syslog. VII: Zones: Solaris Zones - as a major Solaris differentiator - got lots of love in terms of new features: ZOSS - Zones on Shared Storage: Placing your zones to shared storage (FC, iSCSI) has never been this easy - via zonecfg. parallell updates - with S11's bootenvironments updating zones was no problem and meant no downtime anyway, but still, now you can update them parallelly, a way faster update action if you are running a large number of zones. This is like parallell patching in Solaris 10, but with all the IPS/ZFS/S11 goodness. per-zone fstype statistics: Running zones on a shared filesystems complicate the I/O debugging, since ZFS collects all the random writes and delivers them sequentially to boost performance. Now, over kstat you can find out which zone's I/O has an impact on the other ones, see the examples in the documentation: http://docs.oracle.com/cd/E26502_01/html/E29024/gmheh.html#scrolltoc Zones got RDSv3 protocol support for InfiniBand, and IPoIB support with Crossbow's anet (automatic vnic creation) feature. NUMA I/O support for Zones: customers can now determine the NUMA I/O topology of the system from within zones. VIII: Security got a lot of attention too: Automated security/audit reporting, with builtin reporting templates e.g. for PCI (payment card industry) audits. PAM is now configureable on a per-user basis instead of system wide, allowing different authentication requirements for different users SSH in Solaris 11.1 now supports running in FIPS 140-2 mode, that is, in a U.S. government security accredited fashion. SHA512/224 and SHA512/256 cryptographic hash functions are implemented in a FIPS-compliant way - and on a T4 implemented in silicon! That is, goverment-approved cryptography at HW-speed. Generally, Solaris is currently under evaluation to be both FIPS and Common Criteria certified. IX. Networking, as one of the core strengths of Solaris 11, has been extended with: Data Center Bridging (DCB) - not only setups where network and storage share the same fabric (FCoE, anyone?) can have Quality-of-Service requirements. DCB enables peers to distinguish traffic based on priorities. Your NICs have to support DCB, see the documentation, and additional information on Wikipedia. DataLink MultiPathing, DLMP, enables link aggregation to span across multiple switches, even between those of different vendors. But there are essential differences to the good old bandwidth-aggregating LACP, see the documentation: http://docs.oracle.com/cd/E26502_01/html/E28993/gmdlu.html#scrolltoc VNIC live migration is now supported from one physical NIC to another on-the-fly X. Data management: FedFS, (Federated FileSystem) is new, it relies on Solaris 11's NFS referring mechanism to join separate shares of different NFS servers into a single filesystem namespace. The referring system has been there since S11 11/11, in Solaris 11.1 FedFS uses a LDAP - as the one global nameservice to bind them all. The iSCSI initiator now uses the T4 CPU's HW-implemented CRC32 algorithm - thus improving iSCSI throughput while reducing CPU utilization on a T4 Storage locking improvements are now RAC aware, speeding up throughput with better locking-communication between nodes up to 20%! XI: Kernel performance optimizations: The new Virtual Memory subsystem ("VM2") scales now to 100+ TB Memory ranges. The memory predictor monitors large memory page usage, and adjust memory page sizes to applications' needs OSM, the Optimized Shared Memory allows Oracle DBs' SGA to be resized online XII: The Power Aware Dispatcher in now by default enabled, reducing power consumption of idle CPUs. Also, the LDoms' Power Management policies and the poweradm settings in Solaris 11 OS will cooperate. XIII: x86 boot: upgrade to the (Grand Unified Bootloader) GRUB2. Because grub2 differs in the configuration syntactically from grub1, one shall not edit the new grub configuration (grub.cfg) but use the new bootadm features to update it. GRUB2 adds UEFI support and also support for disks over 2TB. XIV: Improved viewing of per-CPU statistics of mpstat. This one might seem of less importance at first, but nowadays having better sorting/filtering possibilities on a periodically updated mpstat output of 256+ vCPUs can be a blessing. XV: Support for Solaris Cluster 4.1: The What's New document doesn't actually mention this one, since OSC 4.1 has not been released at the time 11.1 was. But since then it is available, and it requires Solaris 11.1. And it's only a "pkg update" away. ...aand I seriously need to stop here. There's a lot I missed, Edge Virtual Bridging, lofi tuning, ZFS sharing and crypto enhancements, USB3.0, pulseaudio, trusted extensions updates, etc - but if I mention all those then I effectively copy the What's New document. Which I recommend reading now anyway, it is a great extract of the 300+ new projects and RFE-followups in S11.1. And this blogpost is a summary of that extract. For closing words, allow me to come back to Request For Enhancements, RFEs. Any customer can request features. Open up a Support Request, explain that this is an RFE, describe the feature you/your company desires to have in S11 implemented. The more SRs are collected for an RFE, the more chance it's got to get implemented. Feel free to provide feedback about the product, as well as about the Solaris 11.1 Documentation using the "Feedback" button there. Both the Solaris engineers and the documentation writers are eager to hear your input.Feel free to comment about this post too. Except that it's too long ;) wbr,charlie

Read the article
NUMA-aware placement of communication variables

- by Dave

For classic NUMA-aware programming I'm typically most concerned about simple cold, capacity and compulsory misses and whether we can satisfy the miss by locally connected memory or whether we have to pull the line from its home node over the coherent interconnect -- we'd like to minimize channel contention and conserve interconnect bandwidth. That is, for this style of programming we're quite aware of where memory is homed relative to the threads that will be accessing it. Ideally, a page is collocated on the node with the thread that's expected to most frequently access the page, as simple misses on the page can be satisfied without resorting to transferring the line over the interconnect. The default "first touch" NUMA page placement policy tends to work reasonable well in this regard. When a virtual page is first accessed, the operating system will attempt to provision and map that virtual page to a physical page allocated from the node where the accessing thread is running. It's worth noting that the node-level memory interleaving granularity is usually a multiple of the page size, so we can say that a given page P resides on some node N. That is, the memory underlying a page resides on just one node. But when thinking about accesses to heavily-written communication variables we normally consider what caches the lines underlying such variables might be resident in, and in what states. We want to minimize coherence misses and cache probe activity and interconnect traffic in general. I don't usually give much thought to the location of the home NUMA node underlying such highly shared variables. On a SPARC T5440, for instance, which consists of 4 T2+ processors connected by a central coherence hub, the home node and placement of heavily accessed communication variables has very little impact on performance. The variables are frequently accessed so likely in M-state in some cache, and the location of the home node is of little consequence because a requester can use cache-to-cache transfers to get the line. Or at least that's what I thought. Recently, though, I was exploring a simple shared memory point-to-point communication model where a client writes a request into a request mailbox and then busy-waits on a response variable. It's a simple example of delegation based on message passing. The server polls the request mailbox, and having fetched a new request value, performs some operation and then writes a reply value into the response variable. As noted above, on a T5440 performance is insensitive to the placement of the communication variables -- the request and response mailbox words. But on a Sun/Oracle X4800 I noticed that was not the case and that NUMA placement of the communication variables was actually quite important. For background an X4800 system consists of 8 Intel X7560 Xeons . Each package (socket) has 8 cores with 2 contexts per core, so the system is 8x8x2. Each package is also a NUMA node and has locally attached memory. Every package has 3 point-to-point QPI links for cache coherence, and the system is configured with a twisted ladder "mobius" topology. The cache coherence fabric is glueless -- there's not central arbiter or coherence hub. The maximum distance between any two nodes is just 2 hops over the QPI links. For any given node, 3 other nodes are 1 hop distant and the remaining 4 nodes are 2 hops distant. Using a single request (client) thread and a single response (server) thread, a benchmark harness explored all permutations of NUMA placement for the two threads and the two communication variables, measuring the average round-trip-time and throughput rate between the client and server. In this benchmark the server simply acts as a simple transponder, writing the request value plus 1 back into the reply field, so there's no particular computation phase and we're only measuring communication overheads. In addition to varying the placement of communication variables over pairs of nodes, we also explored variations where both variables were placed on one page (and thus on one node) -- either on the same cache line or different cache lines -- while varying the node where the variables reside along with the placement of the threads. The key observation was that if the client and server threads were on different nodes, then the best placement of variables was to have the request variable (written by the client and read by the server) reside on the same node as the client thread, and to place the response variable (written by the server and read by the client) on the same node as the server. That is, if you have a variable that's to be written by one thread and read by another, it should be homed with the writer thread. For our simple client-server model that means using split request and response communication variables with unidirectional message flow on a given page. This can yield up to twice the throughput of less favorable placement strategies. Our X4800 uses the QPI 1.0 protocol with source-based snooping. Briefly, when node A needs to probe a cache line it fires off snoop requests to all the nodes in the system. Those recipients then forward their response not to the original requester, but to the home node H of the cache line. H waits for and collects the responses, adjudicates and resolves conflicts and ensures memory-model ordering, and then sends a definitive reply back to the original requester A. If some node B needed to transfer the line to A, it will do so by cache-to-cache transfer and let H know about the disposition of the cache line. A needs to wait for the authoritative response from H. So if a thread on node A wants to write a value to be read by a thread on node B, the latency is dependent on the distances between A, B, and H. We observe the best performance when the written-to variable is co-homed with the writer A. That is, we want H and A to be the same node, as the writer doesn't need the home to respond over the QPI link, as the writer and the home reside on the very same node. With architecturally informed placement of communication variables we eliminate at least one QPI hop from the critical path. Newer Intel processors use the QPI 1.1 coherence protocol with home-based snooping. As noted above, under source-snooping a requester broadcasts snoop requests to all nodes. Those nodes send their response to the home node of the location, which provides memory ordering, reconciles conflicts, etc., and then posts a definitive reply to the requester. In home-based snooping the snoop probe goes directly to the home node and are not broadcast. The home node can consult snoop filters -- if present -- and send out requests to retrieve the line if necessary. The 3rd party owner of the line, if any, can respond either to the home or the original requester (or even to both) according to the protocol policies. There are myriad variations that have been implemented, and unfortunately vendor terminology doesn't always agree between vendors or with the academic taxonomy papers. The key is that home-snooping enables the use of a snoop filter to reduce interconnect traffic. And while home-snooping might have a longer critical path (latency) than source-based snooping, it also may require fewer messages and less overall bandwidth. It'll be interesting to reprise these experiments on a platform with home-based snooping. While collecting data I also noticed that there are placement concerns even in the seemingly trivial case when both threads and both variables reside on a single node. Internally, the cores on each X7560 package are connected by an internal ring. (Actually there are multiple contra-rotating rings). And the last-level on-chip cache (LLC) is partitioned in banks or slices, which with each slice being associated with a core on the ring topology. A hardware hash function associates each physical address with a specific home bank. Thus we face distance and topology concerns even for intra-package communications, although the latencies are not nearly the magnitude we see inter-package. I've not seen such communication distance artifacts on the T2+, where the cache banks are connected to the cores via a high-speed crossbar instead of a ring -- communication latencies seem more regular.

Read the article
Execution plan warnings–The final chapter

- by Dave Ballantyne

In my previous posts (here and here), I showed examples of some of the execution plan warnings that have been added to SQL Server 2012. There is one other warning that is of interest to me : “Unmatched Indexes”. Firstly, how do I know this is the final one ? The plan is an XML document, right ? So that means that it can have an accompanying XSD. As an XSD is a schema definition, we can poke around inside it to find interesting things that *could* be in the final XML file. The showplan schema is stored in the folder Microsoft SQL Server\110\Tools\Binn\schemas\sqlserver\2004\07\showplan and by comparing schemas over releases you can get a really good idea of any new functionality that has been added. Here is the section of the Sql Server 2012 showplan schema that has been interesting me so far : <xsd:complexType name="AffectingConvertWarningType"> <xsd:annotation> <xsd:documentation>Warning information for plan-affecting type conversion</xsd:documentation> </xsd:annotation> <xsd:sequence>  </xsd:sequence> <xsd:attribute name="ConvertIssue" use="required"> <xsd:simpleType> <xsd:restriction base="xsd:string"> <xsd:enumeration value="Cardinality Estimate" /> <xsd:enumeration value="Seek Plan" />  </xsd:restriction> </xsd:simpleType> </xsd:attribute> <xsd:attribute name="Expression" type ="xsd:string" use="required" /></xsd:complexType><xsd:complexType name="WarningsType"> <xsd:annotation> <xsd:documentation>List of all possible iterator or query specific warnings (e.g. hash spilling, no join predicate)</xsd:documentation> </xsd:annotation> <xsd:choice minOccurs="1" maxOccurs="unbounded"> <xsd:element name="ColumnsWithNoStatistics" type="shp:ColumnReferenceListType" minOccurs="0" maxOccurs="1" /> <xsd:element name="SpillToTempDb" type="shp:SpillToTempDbType" minOccurs="0" maxOccurs="unbounded" /> <xsd:element name="Wait" type="shp:WaitWarningType" minOccurs="0" maxOccurs="unbounded" /> <xsd:element name="PlanAffectingConvert" type="shp:AffectingConvertWarningType" minOccurs="0" maxOccurs="unbounded" /> </xsd:choice> <xsd:attribute name="NoJoinPredicate" type="xsd:boolean" use="optional" /> <xsd:attribute name="SpatialGuess" type="xsd:boolean" use="optional" /> <xsd:attribute name="UnmatchedIndexes" type="xsd:boolean" use="optional" /> <xsd:attribute name="FullUpdateForOnlineIndexBuild" type="xsd:boolean" use="optional" /></xsd:complexType> I especially like the “to be extended here” comment, high hopes that we will see more of these in the future. So “Unmatched Indexes” was a warning that I couldn’t get and many thanks must go to Fabiano Amorim (b|t) for showing me the way. Filtered indexes were introduced in Sql Server 2008 and are really useful if you only need to index only a portion of the data within a table. However, if your SQL code uses a variable as a predicate on the filtered data that matches the filtered condition, then the filtered index cannot be used as, naturally, the value in the variable may ( and probably will ) change and therefore will need to read data outside the index. As an aside, you could use option(recompile) here , in which case the optimizer will build a plan specific to the variable values and use the filtered index, but that can bring about other problems. To demonstrate this warning, we need to generate some test data : DROP TABLE #TestTab1GOCREATE TABLE #TestTab1 (Col1 Int not null, Col2 Char(7500) not null, Quantity Int not null)GOINSERT INTO #TestTab1 VALUES (1,1,1),(1,2,5),(1,2,10),(1,3,20), (2,1,101),(2,2,105),(2,2,110),(2,3,120)GO and then add a filtered index CREATE INDEX ixFilter ON #TestTab1 (Col1)WHERE Quantity = 122 Now if we execute SELECT COUNT(*) FROM #TestTab1 WHERE Quantity = 122 We will see the filtered index being scanned But if we parameterize the query DECLARE @i INT = 122SELECT COUNT(*) FROM #TestTab1 WHERE Quantity = @i The plan is very different a table scan, as the value of the variable used in the predicate can change at run time, and also we see the familiar warning triangle. If we now look at the properties pane, we will see two pieces of information “Warnings” and “UnmatchedIndexes”. So, handily, we are being told which filtered index is not being used due to parameterization.

Read the article
Delight and Excite

- by Applications User Experience

Mick McGee, CEO & President, EchoUser Editor’s Note: EchoUser is a User Experience design firm in San Francisco and a member of the Oracle Usability Advisory Board. Mick and his staff regularly consult on Oracle Applications UX projects. Being part of a user experience design firm, we have the luxury of working with a lot of great people across many great companies. We get to help people solve their problems. At least we used to. The basic design challenge is still the same; however, the goal is not necessarily to solve “problems” anymore; it is, “I want our products to delight and excite!” The question for us as UX professionals is how to design to those goals, and then how to assess them from a usability perspective. I’m not sure where I first heard “delight and excite” (A book? blog post? Facebook status? Steve Jobs quote?), but now I hear these listed as user experience goals all the time. In particular, somewhat paradoxically, I routinely hear them in enterprise software conversations. And when asking these same enterprise companies what will make the project successful, we very often hear, “Make it like Apple.” In past days, it was “make it like Yahoo (or Amazon or Google“) but now Apple is the common benchmark. Steve Jobs and Apple were not secrets, but with Jobs’ passing and Apple becoming the world’s most valuable company in the last year, the impact of great design and experience is suddenly very widespread. In particular, users’ expectations have gone way up. Being an enterprise company is no shield to the general expectations that users now have, for all products. Designing a “Minimum Viable Product” The user experience challenge has historically been, to echo the words of Eric Ries (author of Lean Startup) , to create a “minimum viable product”: the proverbial, “make it good enough”. But, in our profession, the “minimum viable” part of that phrase has oftentimes, unfortunately, referred to the design and user experience. Technology typically dominated the focus of the biggest, most successful companies. Few have had the laser focus of Apple to also create and sell design and user experience alongside great technology. But now that Apple is the most valuable company in the world, copying their success is a common undertaking. Great design is now a premium offering that everyone wants, from the one-person startup to the largest companies, consumer and enterprise. This emerging business paradigm will have significant impact across the user experience design process and profession. One area that particularly interests me is, how are we going to evaluate these new emerging “delight and excite” experiences, which are further customized to each particular domain? How to Measure “Delight and Excite” Traditional usability measures of task completion rate, assists, time, and errors are still extremely useful in many situations; however, they are too blunt to offer much insight into emerging experiences “Satisfaction” is usually assessed in user testing, in roughly equivalent importance to the above objective metrics. Various surveys and scales have provided ways to measure satisfying UX, with whatever questions they include. However, to meet the demands of new business goals and keep users at the center of design and development processes, we have to explore new methods to better capture custom-experience goals and emotion-driven user responses. We have had success assessing custom experiences, including “delight and excite”, by employing a variety of user testing methods that tend to combine formative and summative techniques (formative being focused more on identifying usability issues and ways to improve design, and summative focused more on metrics). Our most successful tool has been one we’ve been using for a long time, Magnitude Estimation Technique (MET). But it’s not necessarily about MET as a measure, rather how it is created. Caption: For one client, EchoUser did two rounds of testing. Each test was a mix of performing representative tasks and gathering qualitative impressions. Each user participated in an in-person moderated 1-on-1 session for 1 hour, using a testing set-up where they held the phone. The primary goal was to identify usability issues and recommend design improvements. MET is based on a definition of the desired experience, which users will then use to rate items of interest (usually tasks in a usability test). In other words, a custom experience definition needs to be created. This can then be used to measure satisfaction in accomplishing tasks; “delight and excite”; or anything else from strategic goals, user demands, or elsewhere. For reference, our standard MET definition in usability testing is: “User experience is your perception of how easy to use, well designed and productive an interface is to complete tasks.” Articulating the User Experience We’ve helped construct experience definitions for several clients to better match their business goals. One example is a modification of the above that was needed for a company that makes medical-related products: “User experience is your perception of how easy to use, well-designed, productive and safe an interface is for conducting tasks. ‘Safe’ is how free an environment (including devices, software, facilities, people, etc.) is from danger, risk, and injury.” Another example is from a company that is pushing hard to incorporate “delight” into their enterprise business line: “User experience is your perception of a product’s ease of use and learning, satisfaction and delight in design, and ability to accomplish objectives.” I find the last one particularly compelling in that there is little that identifies the experience as being for a highly technical enterprise application. That definition could easily be applied to any number of consumer products. We have gone further than the above, including “sexy” and “cool” where decision-makers insisted they were part of the desired experience. We also applied it to completely different experiences where the “interface” was, for example, riding public transit, the “tasks” were train rides, and we followed the participants through the train-riding journey and rated various aspects accordingly: “A good public transportation experience is a cost-effective way of reliably, conveniently, and safely getting me to my intended destination on time.” To construct these definitions, we’ve employed both bottom-up and top-down approaches, depending on circumstances. For bottom-up, user inputs help dictate the terms that best fit the desired experience (usually by way of cluster and factor analysis). Top-down depends on strategic, visionary goals expressed by upper management that we then attempt to integrate into product development (e.g., “delight and excite”). We like a combination of both approaches to push the innovation envelope, but still be mindful of current user concerns. Hopefully the idea of crafting your own custom experience, and a way to measure it, can provide you with some ideas how you can adapt your user experience needs to whatever company you are in. Whether product-development or service-oriented, nearly every company is ultimately providing a user experience. The Bottom Line Creating great experiences may have been popularized by Steve Jobs and Apple, but I’ll be honest, it’s a good feeling to be moving from “good enough” to “delight and excite,” despite the challenge that entails. In fact, it’s because of that challenge that we will expand what we do as UX professionals to help deliver and assess those experiences. I’m excited to see how we, Oracle, and the rest of the industry will live up to that challenge.

Read the article
Data Source Connection Pool Sizing

- by Steve Felts

Normal 0 false false false EN-US X-NONE X-NONE MicrosoftInternetExplorer4 /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin:0in; mso-para-margin-bottom:.0001pt; mso-pagination:widow-orphan; font-size:10.0pt; font-family:"Times New Roman","serif";} One of the most time-consuming procedures of a database application is establishing a connection. The connection pooling of the data source can be used to minimize this overhead. That argues for using the data source instead of accessing the database driver directly. Configuring the size of the pool in the data source is somewhere between an art and science – this article will try to move it closer to science. From the beginning, WLS data source has had an initial capacity and a maximum capacity configuration values. When the system starts up and when it shrinks, initial capacity is used. The pool can grow to maximum capacity. Customers found that they might want to set the initial capacity to 0 (more on that later) but didn’t want the pool to shrink to 0. In WLS 10.3.6, we added minimum capacity to specify the lower limit to which a pool will shrink. If minimum capacity is not set, it defaults to the initial capacity for upward compatibility. We also did some work on the shrinking in release 10.3.4 to reduce thrashing; the algorithm that used to shrink to the maximum of the currently used connections or the initial capacity (basically the unused connections were all released) was changed to shrink by half of the unused connections. The simple approach to sizing the pool is to set the initial/minimum capacity to the maximum capacity. Doing this creates all connections at startup, avoiding creating connections on demand and the pool is stable. However, there are a number of reasons not to take this simple approach. When WLS is booted, the deployment of the data source includes synchronously creating the connections. The more connections that are configured in initial capacity, the longer the boot time for WLS (there have been several projects for parallel boot in WLS but none that are available). Related to creating a lot of connections at boot time is the problem of logon storms (the database gets too much work at one time). WLS has a solution for that by setting the login delay seconds on the pool but that also increases the boot time. There are a number of cases where it is desirable to set the initial capacity to 0. By doing that, the overhead of creating connections is deferred out of the boot and the database doesn’t need to be available. An application may not want WLS to automatically connect to the database until it is actually needed, such as for some code/warm failover configurations. There are a number of cases where minimum capacity should be less than maximum capacity. Connections are generally expensive to keep around. They cause state to be kept on both the client and the server, and the state on the backend may be heavy (for example, a process). Depending on the vendor, connection usage may cost money. If work load is not constant, then database connections can be freed up by shrinking the pool when connections are not in use. When using Active GridLink, connections can be created as needed according to runtime load balancing (RLB) percentages instead of by connection load balancing (CLB) during data source deployment. Shrinking is an effective technique for clearing the pool when connections are not in use. In addition to the obvious reason that there times where the workload is lighter, there are some configurations where the database and/or firewall conspire to make long-unused or too-old connections no longer viable. There are also some data source features where the connection has state and cannot be used again unless the state matches the request. Examples of this are identity based pooling where the connection has a particular owner and XA affinity where the connection is associated with a particular RAC node. At this point, WLS does not re-purpose (discard/replace) connections and shrinking is a way to get rid of the unused existing connection and get a new one with the correct state when needed. So far, the discussion has focused on the relationship of initial, minimum, and maximum capacity. Computing the maximum size requires some knowledge about the application and the current number of simultaneously active users, web sessions, batch programs, or whatever access patterns are common. The applications should be written to only reserve and close connections as needed but multiple statements, if needed, should be done in one reservation (don’t get/close more often than necessary). This means that the size of the pool is likely to be significantly smaller then the number of users. If possible, you can pick a size and see how it performs under simulated or real load. There is a high-water mark statistic (ActiveConnectionsHighCount) that tracks the maximum connections concurrently used. In general, you want the size to be big enough so that you never run out of connections but no bigger. It will need to deal with spikes in usage, which is where shrinking after the spike is important. Of course, the database capacity also has a big influence on the decision since it’s important not to overload the database machine. Planning also needs to happen if you are running in a Multi-Data Source or Active GridLink configuration and expect that the remaining nodes will take over the connections when one of the nodes in the cluster goes down. For XA affinity, additional headroom is also recommended. In summary, setting initial and maximum capacity to be the same may be simple but there are many other factors that may be important in making the decision about sizing.

Read the article
Cisco ASA 5505 - L2TP over IPsec

- by xraminx

I have followed this document on cisco site to set up the L2TP over IPsec connection. When I try to establish a VPN to ASA 5505 from my Windows XP, after I click on "connect" button, the "Connecting ...." dialog box appears and after a while I get this error message: Error 800: Unable to establish VPN connection. The VPN server may be unreachable, or security parameters may not be configured properly for this connection. ASA version 7.2(4) ASDM version 5.2(4) Windows XP SP3 Windows XP and ASA 5505 are on the same LAN for test purposes. Edit 1: There are two VLANs defined on the cisco device (the standard setup on cisco ASA5505). - port 0 is on VLAN2, outside; - and ports 1 to 7 on VLAN1, inside. I run a cable from my linksys home router (10.50.10.1) to the cisco ASA5505 router on port 0 (outside). Port 0 have IP 192.168.1.1 used internally by cisco and I have also assigned the external IP 10.50.10.206 to port 0 (outside). I run a cable from Windows XP to Cisco router on port 1 (inside). Port 1 is assigned an IP from Cisco router 192.168.1.2. The Windows XP is also connected to my linksys home router via wireless (10.50.10.141). Edit 2: When I try to establish vpn, the Cisco device real time Log viewer shows 7 entries like this: Severity:5 Date:Sep 15 2009 Time: 14:51:29 SyslogID: 713904 Destination IP = 10.50.10.141, Decription: No crypto map bound to interface... dropping pkt Edit 3: This is the setup on the router right now. Result of the command: "show run" : Saved : ASA Version 7.2(4) ! hostname ciscoasa domain-name default.domain.invalid enable password HGFHGFGHFHGHGFHGF encrypted passwd NMMNMNMNMNMNMN encrypted names name 192.168.1.200 WebServer1 name 10.50.10.206 external-ip-address ! interface Vlan1 nameif inside security-level 100 ip address 192.168.1.1 255.255.255.0 ! interface Vlan2 nameif outside security-level 0 ip address external-ip-address 255.0.0.0 ! interface Vlan3 no nameif security-level 50 no ip address ! interface Ethernet0/0 switchport access vlan 2 ! interface Ethernet0/1 ! interface Ethernet0/2 ! interface Ethernet0/3 ! interface Ethernet0/4 ! interface Ethernet0/5 ! interface Ethernet0/6 ! interface Ethernet0/7 ! ftp mode passive dns server-group DefaultDNS domain-name default.domain.invalid object-group service l2tp udp port-object eq 1701 access-list outside_access_in remark Allow incoming tcp/http access-list outside_access_in extended permit tcp any host WebServer1 eq www access-list outside_access_in extended permit udp any any eq 1701 access-list inside_nat0_outbound extended permit ip any 192.168.1.208 255.255.255.240 access-list inside_cryptomap_1 extended permit ip interface outside interface inside pager lines 24 logging enable logging asdm informational mtu inside 1500 mtu outside 1500 ip local pool PPTP-VPN 192.168.1.210-192.168.1.220 mask 255.255.255.0 icmp unreachable rate-limit 1 burst-size 1 asdm image disk0:/asdm-524.bin no asdm history enable arp timeout 14400 global (outside) 1 interface nat (inside) 0 access-list inside_nat0_outbound nat (inside) 1 0.0.0.0 0.0.0.0 static (inside,outside) tcp interface www WebServer1 www netmask 255.255.255.255 access-group outside_access_in in interface outside timeout xlate 3:00:00 timeout conn 1:00:00 half-closed 0:10:00 udp 0:02:00 icmp 0:00:02 timeout sunrpc 0:10:00 h323 0:05:00 h225 1:00:00 mgcp 0:05:00 mgcp-pat 0:05:00 timeout sip 0:30:00 sip_media 0:02:00 sip-invite 0:03:00 sip-disconnect 0:02:00 timeout sip-provisional-media 0:02:00 uauth 0:05:00 absolute http server enable http 192.168.1.0 255.255.255.0 inside no snmp-server location no snmp-server contact snmp-server enable traps snmp authentication linkup linkdown coldstart crypto ipsec transform-set TRANS_ESP_3DES_SHA esp-3des esp-sha-hmac crypto ipsec transform-set TRANS_ESP_3DES_SHA mode transport crypto ipsec transform-set TRANS_ESP_3DES_MD5 esp-3des esp-md5-hmac crypto ipsec transform-set TRANS_ESP_3DES_MD5 mode transport crypto map outside_map 1 match address inside_cryptomap_1 crypto map outside_map 1 set transform-set TRANS_ESP_3DES_MD5 crypto map outside_map interface inside crypto isakmp enable outside crypto isakmp policy 10 authentication pre-share encryption 3des hash md5 group 2 lifetime 86400 telnet timeout 5 ssh timeout 5 console timeout 0 dhcpd auto_config outside ! dhcpd address 192.168.1.2-192.168.1.33 inside dhcpd enable inside ! group-policy DefaultRAGroup internal group-policy DefaultRAGroup attributes dns-server value 192.168.1.1 vpn-tunnel-protocol IPSec l2tp-ipsec username myusername password FGHFGHFHGFHGFGFHF nt-encrypted tunnel-group DefaultRAGroup general-attributes address-pool PPTP-VPN default-group-policy DefaultRAGroup tunnel-group DefaultRAGroup ipsec-attributes pre-shared-key * tunnel-group DefaultRAGroup ppp-attributes no authentication chap authentication ms-chap-v2 ! ! prompt hostname context Cryptochecksum:a9331e84064f27e6220a8667bf5076c1 : end

Read the article
Apache load balancer limits with Tomcat over AJP

- by PAS

Hi All, I have Apache acting as a load balancer in front of 3 Tomcat servers. Occasionally, Apache returns 503 responses, which I would like to remove completely. All 4 servers are not under significant load in terms of CPU, memory, or disk, so I am a little unsure what is reaching it's limits or why. 503s are returned when all workers are in error state - whatever that means. Here are the details: Apache config: <IfModule mpm_prefork_module> StartServers 30 MinSpareServers 30 MaxSpareServers 60 MaxClients 200 MaxRequestsPerChild 1000 </IfModule> ... <Proxy *> AddDefaultCharset Off Order deny,allow Allow from all </Proxy> # Tomcat HA cluster <Proxy balancer://mycluster> BalancerMember ajp://10.176.201.9:8009 keepalive=On retry=1 timeout=1 ping=1 BalancerMember ajp://10.176.201.10:8009 keepalive=On retry=1 timeout=1 ping=1 BalancerMember ajp://10.176.219.168:8009 keepalive=On retry=1 timeout=1 ping=1 </Proxy> # Passes thru track. or api. ProxyPreserveHost On ProxyStatus On # Original tracker ProxyPass /m balancer://mycluster/m ProxyPassReverse /m balancer://mycluster/m Tomcat config: <Server port="8005" shutdown="SHUTDOWN"> <Listener className="org.apache.catalina.core.AprLifecycleListener" SSLEngine="on" /> <Listener className="org.apache.catalina.core.JasperListener" /> <Listener className="org.apache.catalina.mbeans.ServerLifecycleListener" /> <Listener className="org.apache.catalina.mbeans.GlobalResourcesLifecycleListener" /> <Service name="Catalina"> <Connector port="8080" protocol="HTTP/1.1" connectionTimeout="20000" redirectPort="8443" /> <Connector port="8009" protocol="AJP/1.3" redirectPort="8443" /> <Engine name="Catalina" defaultHost="localhost"> <Host name="localhost" appBase="webapps" unpackWARs="true" autoDeploy="true" xmlValidation="false" xmlNamespaceAware="false"> </Engine> </Service> </Server> Apache error log: [Mon Mar 22 18:39:47 2010] [error] (70007)The timeout specified has expired: proxy: AJP: attempt to connect to 10.176.201.10:8009 (10.176.201.10) failed [Mon Mar 22 18:39:47 2010] [error] ap_proxy_connect_backend disabling worker for (10.176.201.10) [Mon Mar 22 18:39:47 2010] [error] proxy: AJP: failed to make connection to backend: 10.176.201.10 [Mon Mar 22 18:39:47 2010] [error] (70007)The timeout specified has expired: proxy: AJP: attempt to connect to 10.176.201.9:8009 (10.176.201.9) failed [Mon Mar 22 18:39:47 2010] [error] ap_proxy_connect_backend disabling worker for (10.176.201.9) [Mon Mar 22 18:39:47 2010] [error] proxy: AJP: failed to make connection to backend: 10.176.201.9 [Mon Mar 22 18:39:47 2010] [error] (70007)The timeout specified has expired: proxy: AJP: attempt to connect to 10.176.219.168:8009 (10.176.219.168) failed [Mon Mar 22 18:39:47 2010] [error] ap_proxy_connect_backend disabling worker for (10.176.219.168) [Mon Mar 22 18:39:47 2010] [error] proxy: AJP: failed to make connection to backend: 10.176.219.168 [Mon Mar 22 18:39:47 2010] [error] proxy: BALANCER: (balancer://mycluster). All workers are in error state [Mon Mar 22 18:39:47 2010] [error] proxy: BALANCER: (balancer://mycluster). All workers are in error state [Mon Mar 22 18:39:47 2010] [error] proxy: BALANCER: (balancer://mycluster). All workers are in error state [Mon Mar 22 18:39:47 2010] [error] proxy: BALANCER: (balancer://mycluster). All workers are in error state [Mon Mar 22 18:39:47 2010] [error] proxy: BALANCER: (balancer://mycluster). All workers are in error state [Mon Mar 22 18:39:47 2010] [error] proxy: BALANCER: (balancer://mycluster). All workers are in error state Load balancer top info: top - 23:44:11 up 210 days, 4:32, 1 user, load average: 0.10, 0.11, 0.09 Tasks: 135 total, 2 running, 133 sleeping, 0 stopped, 0 zombie Cpu(s): 0.1%us, 0.2%sy, 0.0%ni, 99.2%id, 0.1%wa, 0.0%hi, 0.1%si, 0.3%st Mem: 524508k total, 517132k used, 7376k free, 9124k buffers Swap: 1048568k total, 352k used, 1048216k free, 334720k cached Tomcat top info: top - 23:47:12 up 210 days, 3:07, 1 user, load average: 0.02, 0.04, 0.00 Tasks: 63 total, 1 running, 62 sleeping, 0 stopped, 0 zombie Cpu(s): 0.2%us, 0.0%sy, 0.0%ni, 99.8%id, 0.1%wa, 0.0%hi, 0.0%si, 0.0%st Mem: 2097372k total, 2080888k used, 16484k free, 21464k buffers Swap: 4194296k total, 380k used, 4193916k free, 1520912k cached Catalina.out does not have any error messages in it. According to Apache's server status, it seems to be maxing out at 143 requests/sec. I believe the servers can handle substantially more load than they are, so any hints about low default limits or other reasons why this setup would be maxing out would be greatly appreciated.

Read the article
Combining HBase and HDFS results in Exception in makeDirOnFileSystem

- by utrecht

Introduction An attempt to combine HBase and HDFS results in the following: 2014-06-09 00:15:14,777 WARN org.apache.hadoop.hbase.HBaseFileSystem: Create Dir ectory, retries exhausted 2014-06-09 00:15:14,780 FATAL org.apache.hadoop.hbase.master.HMaster: Unhandled exception. Starting shutdown. java.io.IOException: Exception in makeDirOnFileSystem at org.apache.hadoop.hbase.HBaseFileSystem.makeDirOnFileSystem(HBaseFile System.java:136) at org.apache.hadoop.hbase.master.MasterFileSystem.checkRootDir(MasterFi leSystem.java:428) at org.apache.hadoop.hbase.master.MasterFileSystem.createInitialFileSyst emLayout(MasterFileSystem.java:148) at org.apache.hadoop.hbase.master.MasterFileSystem.<init>(MasterFileSyst em.java:133) at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.j ava:572) at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:432) at java.lang.Thread.run(Thread.java:744) Caused by: org.apache.hadoop.security.AccessControlException: Permission denied: user=hbase, access=WRITE, inode="/":vagrant:supergroup:drwxr-xr-x at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPe rmissionChecker.java:224) at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPe rmissionChecker.java:204) at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermi ssion(FSPermissionChecker.java:149) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(F SNamesystem.java:4891) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(F SNamesystem.java:4873) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAncestorAcce ss(FSNamesystem.java:4847) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInternal(FS Namesystem.java:3192) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInt(FSNames ystem.java:3156) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesyst em.java:3137) at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameN odeRpcServer.java:669) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTra nslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:419) at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$Cl ientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:4497 0) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.cal l(ProtobufRpcEngine.java:453) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1002) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1752) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1748) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInforma tion.java:1438) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1746) at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstruct orAccessorImpl.java:62) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingC onstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:408) at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteExce ption.java:90) at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteExc eption.java:57) at org.apache.hadoop.hdfs.DFSClient.primitiveMkdir(DFSClient.java:2153) at org.apache.hadoop.hdfs.DFSClient.mkdirs(DFSClient.java:2122) at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirs(DistributedFileSy stem.java:545) at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:1915) at org.apache.hadoop.hbase.HBaseFileSystem.makeDirOnFileSystem(HBaseFile System.java:129) ... 6 more while configuration and system settings are as follows: [vagrant@localhost hadoop-hdfs]$ hadoop fs -ls hdfs://localhost/ Found 1 items -rw-r--r-- 3 vagrant supergroup 1010827264 2014-06-08 19:01 hdfs://localhost/u buntu-14.04-desktop-amd64.iso [vagrant@localhost hadoop-hdfs]$ /etc/hadoop/conf/core-site.xml <configuration> <property> <name>fs.defaultFS</name> <value>hdfs://localhost:8020</value> </property> </configuration> /etc/hbase/conf/hbase-site.xml <configuration> <property> <name>hbase.rootdir</name> <value>hdfs://localhost:8020/hbase</value> </property> <property> <name>hbase.cluster.distributed</name> <value>true</value> </property> </configuration> /etc/hadoop/conf/hdfs-site.xml <configuration> <property> <name>dfs.name.dir</name> <value>/var/lib/hadoop-hdfs/cache</value> </property> <property> <name>dfs.data.dir</name> <value>/tmp/hellodatanode</value> </property> </configuration> NameNode directory permissions [vagrant@localhost hadoop-hdfs]$ ls -ltr /var/lib/hadoop-hdfs/cache total 8 -rwxrwxrwx. 1 hbase hdfs 15 Jun 8 23:43 in_use.lock drwxrwxrwx. 2 hbase hdfs 4096 Jun 8 23:43 current [vagrant@localhost hadoop-hdfs]$ HMaster is able to start if fs.defaultFS property has been commented in core-site.xml NameNode is listening [vagrant@localhost hadoop-hdfs]$ netstat -nato | grep 50070 tcp 0 0 0.0.0.0:50070 0.0.0.0:* LIST EN off (0.00/0/0) tcp 0 0 33.33.33.33:50070 33.33.33.1:57493 ESTA BLISHED off (0.00/0/0) and accessible by navigating to http://33.33.33.33:50070/dfshealth.jsp. Question How to solve makeDirOnFileSystem exception and let HBase connect to HDFS?

Read the article
Emails forwarded via postfix get flagged as spam and forged in Gmail

- by Kendall Hopkins

I'm trying to setup a forwarding only email server. I'm running into the problem where all messages forwarded via postfix are getting put into gmail's spam folder and getting flagged as forged. I'm testing a very similar setup on a cpanel box and their forwarded emails make it through without any problem. Things I've done: Setup reverse dns on forwarding box Setup SPF record for forwarding box domain CPanel route (not flagged as spam): [email protected] - [email protected] - [email protected] AWS postfix route (flagged as spam): [email protected] - [email protected] - [email protected] Gmail error message: /etc/postfix/main.cf myhostname = sputnik.*domain*.com smtpd_banner = $myhostname ESMTP $mail_name (Ubuntu) biff = no append_dot_mydomain = no readme_directory = no myorigin = /etc/mailname mydestination = sputnik.*domain*.com, localhost.*domain*.com, , localhost relayhost = mynetworks = 127.0.0.0/8 10.0.0.0/24 [::1]/128 [fe80::%eth0]/64 mailbox_size_limit = 0 recipient_delimiter = + inet_interfaces = all inet_protocols = all virtual_alias_maps = hash:/etc/postfix/virtual Email forwarded by CPanel (doesn't get marked as spam): Delivered-To: *personaluser*@gmail.com Received: by 10.182.144.98 with SMTP id sl2csp14396obb; Wed, 9 May 2012 09:18:36 -0700 (PDT) Received: by 10.182.52.38 with SMTP id q6mr1137571obo.8.1336580316700; Wed, 09 May 2012 09:18:36 -0700 (PDT) Return-Path: <mail@*personaldomain*.com> Received: from web6.*domain*.com (173.193.55.66-static.reverse.softlayer.com. [173.193.55.66]) by mx.google.com with ESMTPS id ec7si1845451obc.67.2012.05.09.09.18.36 (version=TLSv1/SSLv3 cipher=OTHER); Wed, 09 May 2012 09:18:36 -0700 (PDT) Received-SPF: neutral (google.com: 173.193.55.66 is neither permitted nor denied by best guess record for domain of mail@*personaldomain*.com) client-ip=173.193.55.66; Authentication-Results: mx.google.com; spf=neutral (google.com: 173.193.55.66 is neither permitted nor denied by best guess record for domain of mail@*personaldomain*.com) smtp.mail=mail@*personaldomain*.com Received: from mail-vb0-f43.google.com ([209.85.212.43]:56152) by web6.*domain*.com with esmtps (TLSv1:RC4-SHA:128) (Exim 4.77) (envelope-from <mail@*personaldomain*.com>) id 1SS9b2-0007J9-LK for mail@kendall.*domain*.com; Wed, 09 May 2012 12:18:36 -0400 Received: by vbbfq11 with SMTP id fq11so599132vbb.2 for <mail@kendall.*domain*.com>; Wed, 09 May 2012 09:18:35 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20120113; h=mime-version:x-originating-ip:date:message-id:subject:from:to :content-type:x-gm-message-state; bh=Hr0AH40uUtx/w/u9hltbrhHJhRaD5ubKmz2gGg44VLs=; b=IBKi6Xalr9XVFYwdkWxn9PLRB69qqJ9AjUPdvGh8VxMNW4S+hF6r4GJcGOvkDn2drO kw5r4iOpGuWUQPEMHRPyO4+Ozc9SE9s4Px2oVpadR6v3hO+utvFGoj7UuchsXzHqPVZ8 A9FS4cKiE0E0zurTjR7pfQtZT64goeEJoI/CtvcoTXj/Mdrj36gZ2FYtO8Qj4dFXpfu9 uGAKa4jYfx9zwdvhLzQ3mouWwQtzssKUD+IvyuRppLwI2WFb9mWxHg9n8y9u5IaduLn7 7TvLIyiBtS3DgqSKQy18POVYgnUFilcDorJs30hxFxJhzfTFW1Gdhrwjvz0MTYDSRiGQ P4aw== MIME-Version: 1.0 Received: by 10.52.173.209 with SMTP id bm17mr326586vdc.54.1336580315681; Wed, 09 May 2012 09:18:35 -0700 (PDT) Received: by 10.220.191.134 with HTTP; Wed, 9 May 2012 09:18:35 -0700 (PDT) X-Originating-IP: [99.50.225.7] Date: Wed, 9 May 2012 12:18:35 -0400 Message-ID: <CA+tP6Viyn0ms5RJoqtd20ms3pmQCgyU0yy7GBiaALEACcDBC2g@mail.gmail.com> Subject: test5 From: Kendall Hopkins <mail@*personaldomain*.com> To: mail@kendall.*domain*.com Content-Type: multipart/alternative; boundary=bcaec51b9bf5ee11c004bf9cda9c X-Gm-Message-State: ALoCoQm3t1Hohu7fEr5zxQZsC8FQocg662Jv5MXlPXBnPnx2AiQrbLsNQNknLy39Su45xBMCM47K X-AntiAbuse: This header was added to track abuse, please include it with any abuse report X-AntiAbuse: Primary Hostname - web6.*domain*.com X-AntiAbuse: Original Domain - kendall.*domain*.com X-AntiAbuse: Originator/Caller UID/GID - [47 12] / [47 12] X-AntiAbuse: Sender Address Domain - *personaldomain*.com X-Source: X-Source-Args: X-Source-Dir: --bcaec51b9bf5ee11c004bf9cda9c Content-Type: text/plain; charset=ISO-8859-1 test5 --bcaec51b9bf5ee11c004bf9cda9c Content-Type: text/html; charset=ISO-8859-1 test5 --bcaec51b9bf5ee11c004bf9cda9c-- Email forwarded via AWS postfix box (marked as spam): Delivered-To: *personaluser*@gmail.com Received: by 10.182.144.98 with SMTP id sl2csp14350obb; Wed, 9 May 2012 09:17:46 -0700 (PDT) Received: by 10.229.137.143 with SMTP id w15mr389471qct.37.1336580266237; Wed, 09 May 2012 09:17:46 -0700 (PDT) Return-Path: <mail@*personaldomain*.com> Received: from sputnik.*domain*.com (sputnik.*domain*.com. [107.21.39.201]) by mx.google.com with ESMTP id o8si1330855qct.115.2012.05.09.09.17.46; Wed, 09 May 2012 09:17:46 -0700 (PDT) Received-SPF: neutral (google.com: 107.21.39.201 is neither permitted nor denied by best guess record for domain of mail@*personaldomain*.com) client-ip=107.21.39.201; Authentication-Results: mx.google.com; spf=neutral (google.com: 107.21.39.201 is neither permitted nor denied by best guess record for domain of mail@*personaldomain*.com) smtp.mail=mail@*personaldomain*.com Received: from mail-vb0-f52.google.com (mail-vb0-f52.google.com [209.85.212.52]) by sputnik.*domain*.com (Postfix) with ESMTP id A308122AD6 for <mail@*personaldomain2*.com>; Wed, 9 May 2012 16:17:45 +0000 (UTC) Received: by vbzb23 with SMTP id b23so448664vbz.25 for <mail@*personaldomain2*.com>; Wed, 09 May 2012 09:17:45 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20120113; h=mime-version:x-originating-ip:date:message-id:subject:from:to :content-type:x-gm-message-state; bh=XAzjH9tUXn6SbadVSLwJs2JVbyY4arosdTuV8Nv+ARI=; b=U8gIgHd6mhWYqPU4MH/eyvo3kyZsDn/GiYwZj5CLbs6Zz/ZOXQkenRi7zW3ewVFi/9 uAFylT8SQ+Wjw2l6OgAioCTojfZ58s4H/JW+1bu460KAP9aeOTcZDNSsHlsj0wvH5XRV 4DQJa11kz+WFVtVVcFuB33WVUPAgJfXzY+pSTe+FWsrZyrrwL7/Vm9TSKI5PBwRN9i4g zAZabgkmw1o2THT3kbJi6vAbPzlqK2LVbgt82PP0emHdto7jl4iD5F6lVix4U0dsrtRv xuGUE0gDyIwJuR4Q5YTkNubwGH/Y2bFBtpx2q1IORANrolWxIGaZSceUWawABkBGPABX 1/eg== MIME-Version: 1.0 Received: by 10.52.96.169 with SMTP id dt9mr282954vdb.107.1336580265812; Wed, 09 May 2012 09:17:45 -0700 (PDT) Received: by 10.220.191.134 with HTTP; Wed, 9 May 2012 09:17:45 -0700 (PDT) X-Originating-IP: [99.50.225.7] Date: Wed, 9 May 2012 12:17:45 -0400 Message-ID: <CA+tP6VgqZrdxP543Y28d1eMwJAs4DxkS4EE6bvRL8nFoMkgnQQ@mail.gmail.com> Subject: test4 From: Kendall Hopkins <mail@*personaldomain*.com> To: mail@*personaldomain2*.com Content-Type: multipart/alternative; boundary=20cf307f37f6f521b304bf9cd79d X-Gm-Message-State: ALoCoQkrNcfSTWz9t6Ir87KEYyM+zJM4y1AbwP86NMXlk8B3ALhnis+olFCKdgPnwH/sIdzF3+Nh --20cf307f37f6f521b304bf9cd79d Content-Type: text/plain; charset=ISO-8859-1 test4 --20cf307f37f6f521b304bf9cd79d Content-Type: text/html; charset=ISO-8859-1 test4 --20cf307f37f6f521b304bf9cd79d--

Read the article
hyper-v fails when attaching more disk to VM. The VM won't start and generates an error

- by CasperDK

I'm lost at what to do about this: Hi... System: Windows 2008 R2 Hyper-V farm running with failover cluster with a EVA 4400 as backend. When I attach a new disk to a VM it fails when I try to start it. If I move the VM to another, say node 1, I can add the disk and I can get them to start. If I move the VM back to node 2 where the problem arose and the VM is running, I get an error during live migration and the VM fails back to node1 where it did run... So it's like there is something wrong with Hyper-V on node 2 and not node 1. Also node 3 has the same issue. Restarting the nodes is NOT an option since I will have this problem again at a later time AND because not all the VMs can run on node 1 which means my client company will experience downtime on the VMs not running on node 1. Any fix for this? An update I have missed perhaps? It has been two years... Here are the errors: An error ocurred while attempting to change the state of virtual machine XXX. 'XXX' failed to start. Microsoft Emulated IDE Controller (Instance ID {83F8638B-8DCA-4152-9EDA-2CA8B33039B4}): Failed to power on with Error 'A device attached to the system is not functioning.' Failed to open attachment 'X:\XXX.vhd'. Error: 'A device attached to the system is not functioning.' Failed to open attachment 'X:\XXX.vhd'. Error: 'A device attached to the system is not functioning.' 'XXX' failed to start. (Virtual machine 36563C78-65B5-4C40-A52D-689BB39E8B08) Microsoft Emulated IDE Controller (Instance ID {83F8638B-8DCA-4152-9EDA-2CA8B33039B4}): Failed to power on with Error 'A device attached to the system is not functioning.' (0x8007001F). (Virtual machine 36563C78-65B5-4C40-A52D-689BB39E8B08) 'XXX': Failed to open attachment 'X:\XXX.vhd'. Error: 'A device attached to the system is not functioning.' (0x8007001F). (Virtual machine 36563C78-65B5-4C40-A52D-689BB39E8B08) 'XXX': Failed to open attachment 'X:\XXX.vhd'. Error: 'A device attached to the system is not functioning.' (0x8007001F). (Virtual machine 36563C78-65B5-4C40-A52D-689BB39E8B08) An error ocurred while attempting to change the state of virtual machine XXX. 'XXX' failed to start. Microsoft Emulated IDE Controller (Instance ID {83F8638B-8DCA-4152-9EDA-2CA8B33039B4}): Failed to power on with Error 'A device attached to the system is not functioning.' Failed to open attachment 'X:\XXX.vhd'. Error: 'A device attached to the system is not functioning.' Failed to open attachment 'X:\XXX.vhd'. Error: 'A device attached to the system is not functioning.' 'XXX' failed to start. (Virtual machine 36563C78-65B5-4C40-A52D-689BB39E8B08) Microsoft Emulated IDE Controller (Instance ID {83F8638B-8DCA-4152-9EDA-2CA8B33039B4}): Failed to power on with Error 'A device attached to the system is not functioning.' (0x8007001F). (Virtual machine 36563C78-65B5-4C40-A52D-689BB39E8B08) 'XXX': Failed to open attachment 'X:\XXX.vhd'. Error: 'A device attached to the system is not functioning.' (0x8007001F). (Virtual machine 36563C78-65B5-4C40-A52D-689BB39E8B08) 'XXX': Failed to open attachment 'X:\XXX.vhd'. Error: 'A device attached to the system is not functioning.' (0x8007001F). (Virtual machine 36563C78-65B5-4C40-A52D-689BB39E8B08) An error ocurred while attempting to change the state of virtual machine XXX. 'XXX' failed to start. Microsoft Emulated IDE Controller (Instance ID {83F8638B-8DCA-4152-9EDA-2CA8B33039B4}): Failed to power on with Error 'A device attached to the system is not functioning.' Failed to open attachment 'c:\clusterstorage/volume1/XXX.vhd'. Error: 'A device attached to the system is not functioning.' Failed to open attachment 'c:\clusterstorage/volume1\XXX.vhd'. Error: 'A device attached to the system is not functioning.' 'XXX' failed to start. (Virtual machine 36563C78-65B5-4C40-A52D-689BB39E8B08) Microsoft Emulated IDE Controller (Instance ID {83F8638B-8DCA-4152-9EDA-2CA8B33039B4}): Failed to power on with Error 'A device attached to the system is not functioning.' (0x8007001F). (Virtual machine 36563C78-65B5-4C40-A52D-689BB39E8B08) 'XXX': Failed to open attachment 'c:\clusterstorage/volume1\XXX.vhd'. Error: 'A device attached to the system is not functioning.' (0x8007001F). (Virtual machine 36563C78-65B5-4C40-A52D-689BB39E8B08) 'XXX': Failed to open attachment 'c:\clusterstorage/volume1\XXX.vhd'. Error: 'A device attached to the system is not functioning.' (0x8007001F). (Virtual machine 36563C78-65B5-4C40-A52D-689BB39E8B08) In the Hyper-V logs I found some more errors: In the hyper-v VMMS logs I have this: 'ServerName' failed to perform the operation. The virtual machine is not in a valid state to perform the operation. (Virtual machine ID 0A6CC4A9-39D6-4413-8CF0-B6DAA35B68D7)

Read the article
DRBD Not syncing between my nodes

- by Mike Curry

Some version info: Operating system is Ubuntu 11.10, on EC2, kernel is 3.0.0-16-virtual and the application info is: Version: 8.3.11 (api:88) GIT-hash: 0de839cee13a4160eed6037c4bddd066645e23c5 build by buildd@allspice, 2011-07-05 19:51:07 Getting some strange errors in dmesg (seen below) as well, there is no replication happening. I have made my first node primary and its showing: drbd driver loaded OK; device status: version: 8.3.11 (api:88/proto:86-96) srcversion: DA5A13F16DE6553FC7CE9B2 m:res cs ro ds p mounted fstype 0:r0 StandAlone Primary/Unknown UpToDate/DUnknown r----s ext3 my secondary node is showing: drbd driver loaded OK; device status: version: 8.3.11 (api:88/proto:86-96) srcversion: DA5A13F16DE6553FC7CE9B2 m:res cs ro ds p mounted fstype 0:r0 StandAlone Secondary/Unknown Inconsistent/DUnknown r----s Showing /proc/drbd on the master shows: version: 8.3.11 (api:88/proto:86-96) srcversion: DA5A13F16DE6553FC7CE9B2 0: cs:StandAlone ro:Primary/Unknown ds:UpToDate/DUnknown r----s ns:0 nr:0 dw:4 dr:1073 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:262135964 Showing /proc/drbd on the slave shows that there is nothing being transfered... version: 8.3.11 (api:88/proto:86-96) srcversion: DA5A13F16DE6553FC7CE9B2 0: cs:StandAlone ro:Secondary/Unknown ds:Inconsistent/DUnknown r----s ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:262135964 Here is my config... resource r0 { protocol C; startup { wfc-timeout 15; degr-wfc-timeout 60; } net { cram-hmac-alg sha1; shared-secret "test123; } on drbd01 { device /dev/drbd0; disk /dev/xvdm; address 23.XX.XX.XX:7788; # blocked out ip meta-disk internal; } on drbd02 { device /dev/drbd0; disk /dev/xvdm; address 184.XX.XX.XX:7788; #blocked out ip meta-disk internal; } } I have run the following on the master: sudo drbdadm -- --overwrite-data-of-peer primary all There is no firewall between the systems. Here is the dmesg with some errors: [2285172.969955] drbd: initialized. Version: 8.3.11 (api:88/proto:86-96) [2285172.969960] drbd: srcversion: DA5A13F16DE6553FC7CE9B2 [2285172.969962] drbd: registered as block device major 147 [2285172.969965] drbd: minor_table @ 0xffff88000276ea00 [2285173.000952] block drbd0: Starting worker thread (from drbdsetup [1300]) [2285173.003971] block drbd0: disk( Diskless -> Attaching ) [2285173.006150] block drbd0: No usable activity log found. [2285173.006154] block drbd0: Method to ensure write ordering: flush [2285173.006158] block drbd0: max BIO size = 4096 [2285173.006165] block drbd0: drbd_bm_resize called with capacity == 524271928 [2285173.008512] block drbd0: resync bitmap: bits=65533991 words=1023969 pages=2000 [2285173.008518] block drbd0: size = 250 GB (262135964 KB) [2285173.079566] block drbd0: bitmap READ of 2000 pages took 17 jiffies [2285173.081189] block drbd0: recounting of set bits took additional 1 jiffies [2285173.081194] block drbd0: 250 GB (65533991 bits) marked out-of-sync by on disk bit-map. [2285173.081203] block drbd0: Suspended AL updates [2285173.081210] block drbd0: disk( Attaching -> UpToDate ) [2285173.081214] block drbd0: attached to UUIDs 1C1291D39584C1D1:0000000000000004:0000000000000000:0000000000000000 [2285173.095016] block drbd0: conn( StandAlone -> Unconnected ) [2285173.095046] block drbd0: Starting receiver thread (from drbd0_worker [1301]) [2285173.099297] block drbd0: receiver (re)started [2285173.099304] block drbd0: conn( Unconnected -> WFConnection ) [2285173.099330] block drbd0: bind before connect failed, err = -99 [2285173.099346] block drbd0: conn( WFConnection -> Disconnecting ) [2285173.295788] block drbd0: Discarding network configuration. [2285173.295815] block drbd0: Connection closed [2285173.295826] block drbd0: conn( Disconnecting -> StandAlone ) [2285173.295840] block drbd0: receiver terminated [2285173.295844] block drbd0: Terminating drbd0_receiver Edit: Reading some other similar issues, it was suggested to do a 'drbdadm dump all', so I figured it couldn't hurt. ubuntu@drbd01:~$ drbdadm dump all /etc/drbd.conf:19: in resource r0, on drbd01: IP 23.XX.XX.XX not found on this host. and on slave: root@drbd02:~# drbdadm dump all /etc/drbd.conf:25: in resource r0, on drbd02: IP 184.XX.XX.XX not found on this host. Strange it doesn't find its own ip, however, this is an Amazon EC2 system using an elastic IP... here are my ipconfigs for both... master: ubuntu@drbd01:~$ ifconfig eth0 Link encap:Ethernet HWaddr 22:00:0a:1c:27:11 inet addr:10.28.39.17 Bcast:10.28.39.63 Mask:255.255.255.192 inet6 addr: fe80::2000:aff:fe1c:2711/64 Scope:Link UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 RX packets:1569 errors:0 dropped:0 overruns:0 frame:0 TX packets:1169 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:1000 RX bytes:124409 (124.4 KB) TX bytes:213601 (213.6 KB) Interrupt:26 lo Link encap:Local Loopback inet addr:127.0.0.1 Mask:255.0.0.0 inet6 addr: ::1/128 Scope:Host UP LOOPBACK RUNNING MTU:16436 Metric:1 RX packets:0 errors:0 dropped:0 overruns:0 frame:0 TX packets:0 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0 RX bytes:0 (0.0 B) TX bytes:0 (0.0 B) slave: root@drbd02:~# ifconfig eth0 Link encap:Ethernet HWaddr 12:31:3f:00:14:9d inet addr:10.160.27.107 Bcast:10.160.27.255 Mask:255.255.254.0 inet6 addr: fe80::1031:3fff:fe00:149d/64 Scope:Link UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 RX packets:915 errors:0 dropped:0 overruns:0 frame:0 TX packets:774 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:1000 RX bytes:75381 (75.3 KB) TX bytes:109673 (109.6 KB) Interrupt:26 lo Link encap:Local Loopback inet addr:127.0.0.1 Mask:255.0.0.0 inet6 addr: ::1/128 Scope:Host UP LOOPBACK RUNNING MTU:16436 Metric:1 RX packets:0 errors:0 dropped:0 overruns:0 frame:0 TX packets:0 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0 RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)

Read the article
MySQL 5.1.49 freezing every two days

- by maximus

Hi all, our mysql system is "freezing" every two days. By "freezing" i mean the following: it doesn't respond to ping we can't login with SSH we don't get any answer from MySQL there is no entry in the error logs! neither from linux neither from MySQL. we have already changed to a completely new hardware, we have the same problem, so it's definitely not a hardware problem. we do not have any other software installed except a firewall (iptables rule) we can restart the server from another server using rsyslog (www.rsyslog.com)(software reset) Could someone help me, by giving me some pointers what could i do to figure out the problem? I have included every detail about our settings. Thank you in advance for your help. Max. Our system parameters and settings: System-Memory: 12GB Processor: Intel 7-920 Quadcore Operating system: Debian 5 (lenny) 64bit MySQL 5.1.49 Databases: (a) a small phpbb forum (b) a 6GB database 3 tables with about 15 million rows my.cnf # # The MySQL database server configuration file. # # You can copy this to one of: # - "/etc/mysql/my.cnf" to set global options, # - "~/.my.cnf" to set user-specific options. # # One can use all long options that the program supports. # Run program with --help to get a list of available options and with # --print-defaults to see which it would actually understand and use. # # For explanations see # http://dev.mysql.com/doc/mysql/en/server-system-variables.html # This will be passed to all mysql clients # It has been reported that passwords should be enclosed with ticks/quotes # escpecially if they contain "#" chars... # Remember to edit /etc/mysql/debian.cnf when changing the socket location. [client] port = 3306 socket = /var/run/mysqld/mysqld.sock # Here is entries for some specific programs # The following values assume you have at least 32M ram # This was formally known as [safe_mysqld]. Both versions are currently parsed. [mysqld_safe] socket = /var/run/mysqld/mysqld.sock nice = 0 [mysqld] # # * Basic Settings # user = mysql pid-file = /var/run/mysqld/mysqld.pid socket = /var/run/mysqld/mysqld.sock port = 3306 basedir = /usr datadir = /var/lib/mysql tmpdir = /tmp language = /usr/share/mysql/english skip-external-locking # # Instead of skip-networking the default is now to listen only on # localhost which is more compatible and is not less secure. bind-address = our-ip-address # # * Fine Tuning # key_buffer = 16M max_allowed_packet = 16M thread_stack = 256K thread_cache_size = 32 max_connections = 300 table_cache = 2048 #thread_concurrency = 4 # Used for InnoDB tables recommended to 50%-80% available memory innodb_buffer_pool_size = 6G # 20MB sometimes larger innodb_additional_mem_pool_size = 20M # 8M-16M is good for most situations innodb_log_buffer_size = 8M # Disable XA support because we do not use it innodb-support-xa = 0 # 1 is default wich is 100% secure but 2 offers better performance innodb_flush_log_at_trx_commit = 1 innodb_flush_method = O_DIRECT #innodb_thread_concurency = 8 # Recommended 64M - 512M depending on server size innodb_log_file_size = 512M # One file per table innodb_file_per_table # # * Query Cache Configuration # query_cache_limit = 1M query_cache_size = 16M #query_cache_type = 1 #query_cache_min_res_unit= 2K #join_buffer_size = 1M # # * Logging and Replication # # Both location gets rotated by the cronjob. # Be aware that this log type is a performance killer. # As of 5.1 you can enable the log at runtime! #general_log_file = /var/log/mysql/mysql.log #general_log = 1 # # Error logging goes to syslog. This is a Debian improvement :) # # Here you can see queries with especially long duration log_slow_queries = /var/log/mysql/mysql-slow.log long_query_time = 2 log-queries-not-using-indexes # # The following can be used as easy to replay backup logs or for replication. #server-id = 1 log_bin = /var/log/mysql/mysql-bin.log # WARNING: Using expire_logs_days without bin_log crashes the server! See README.Debian! expire_logs_days = 10 max_binlog_size = 100M #binlog_do_db = include_database_name #binlog_ignore_db = include_database_name # # InnoDB is enabled by default with a 10MB datafile in /var/lib/mysql/. # Read the manual for more InnoDB related options. There are many! # * InnoDB plugin # As of MySQL 5.1.38, the InnoDB plugin from Oracle is included in the MySQL source code. # It has many improvements and better performances than the built-in InnoDB storage engine. # Please read http://www.innodb.com/products/innodb_plugin/ for more information. # Uncommenting the two following lines to use the InnoDB plugin. ignore_builtin_innodb plugin-load=innodb=ha_innodb_plugin.so # # * Security Features # # Read the manual, too, if you want chroot! # chroot = /var/lib/mysql/ # # For generating SSL certificates I recommend the OpenSSL GUI "tinyca". # # ssl-ca=/etc/mysql/cacert.pem # ssl-cert=/etc/mysql/server-cert.pem # ssl-key=/etc/mysql/server-key.pem [mysqldump] quick quote-names max_allowed_packet = 16M [mysql] #no-auto-rehash # faster start of mysql but no tab completition [isamchk] key_buffer = 16M # # * NDB Cluster # # See /usr/share/doc/mysql-server-*/README.Debian for more information. # # The following configuration is read by the NDB Data Nodes (ndbd processes) # not from the NDB Management Nodes (ndb_mgmd processes). # # [MYSQL_CLUSTER] # ndb-connectstring=127.0.0.1 # # * IMPORTANT: Additional settings that can override those from this file! # !includedir /etc/mysql/conf.d/ UPDATE After installing sysstat and configuring it to collect data after every minute i have the following datas. I used sar to generate the following output: The log-file is too big so coudn't enter it here but uploaded to box.net. The link is http://www.box.net/shared/xc6rh7qqob SECOND UPDATE We started a ping command in the background, and that solved the problem. Now the server does work since more then a week. We still don't know what's the problem.

Read the article
Macports irssi & perl5 installation issues

- by Dmitri DB

Long time reader, first time poster. Big, appreciative thanks for everyone's collective questioning and answering here and at stackoverflow, it's helped me quite a lot over the time I've been learning answers through these sites! Apologies in advance if I didn't search hard enough on posts already up on this site to find out what I could do about this issue, but I thought I'd just reach out for the sake of trying at least once. I've experienced this issue while starting up my macports-installed version of irssi: 13:25 -!- Irssi: Error in script dispatch: 13:25 Can't locate lib.pm in @INC (@INC contains: /opt/local/lib/perl5/site_perl/5.12.4/darwin-multi-2level /opt/local/lib/perl5/site_perl/5.12.4 /opt/local/lib/perl5/vendor_perl/5.12.4/darwin-multi-2level /opt/local/lib/perl5/vendor_perl/5.12.4 /opt/local/lib/perl5/5.12.4/darwin-multi-2level /opt/local/lib/perl5/5.12.4 /opt/local/lib/perl5/site_perl/5.12.3/darwin-multi-2level /opt/local/lib/perl5/site_perl/5.12.3 /opt/local/lib/perl5/site_perl /opt/local/lib/perl5/vendor_perl .) at (eval 18) line 1. 13:25 BEGIN failed--compilation aborted at (eval 18) line 1. 13:25 Huh, strange. I looked into it a bit: [email protected] /opt/local/lib/perl5 ?- find . -name "lib.pm" -ls 14673887 16 -r--r--r-- 1 root admin 6853 25 Jun 23:39 ./5.12.4/darwin-thread-multi- 2level/lib.pm [email protected] /opt/local/lib/perl5 ?- l 5.12.4/darwin-thread-multi-2level total 1864 drwxr-xr-x 55 root admin 1870 28 Jun 19:28 . drwxr-xr-x 158 root admin 5372 28 Jun 19:28 .. -rw-r--r-- 1 root admin 177814 25 Jun 23:39 .packlist drwxr-xr-x 6 root admin 204 28 Jun 19:28 B -r--r--r-- 1 root admin 25714 25 Jun 23:39 B.pm drwxr-xr-x 64 root admin 2176 28 Jun 19:28 CORE drwxr-xr-x 3 root admin 102 28 Jun 19:28 Compress -r--r--r-- 1 root admin 3000 25 Jun 23:39 Config.pm -r--r--r-- 1 root admin 228094 25 Jun 23:39 Config.pod -r--r--r-- 1 root admin 409 25 Jun 23:39 Config_git.pl -r--r--r-- 1 root admin 38759 25 Jun 23:39 Config_heavy.pl -r--r--r-- 1 root admin 21174 25 Jun 23:39 Cwd.pm -r--r--r-- 1 root admin 63535 25 Jun 23:39 DB_File.pm drwxr-xr-x 3 root admin 102 28 Jun 19:28 Data drwxr-xr-x 5 root admin 170 28 Jun 19:28 Devel drwxr-xr-x 4 root admin 136 28 Jun 19:28 Digest -r--r--r-- 1 root admin 25185 25 Jun 23:39 DynaLoader.pm drwxr-xr-x 22 root admin 748 28 Jun 19:28 Encode -r--r--r-- 1 root admin 29731 25 Jun 23:39 Encode.pm -r--r--r-- 1 root admin 6736 25 Jun 23:39 Errno.pm -r--r--r-- 1 root admin 5445 25 Jun 23:39 Fcntl.pm drwxr-xr-x 5 root admin 170 28 Jun 19:28 File drwxr-xr-x 3 root admin 102 28 Jun 19:28 Filter -r--r--r-- 1 root admin 1819 25 Jun 23:39 GDBM_File.pm drwxr-xr-x 4 root admin 136 28 Jun 19:28 Hash drwxr-xr-x 3 root admin 102 28 Jun 19:28 I18N drwxr-xr-x 11 root admin 374 28 Jun 19:28 IO -r--r--r-- 1 root admin 1404 25 Jun 23:39 IO.pm drwxr-xr-x 6 root admin 204 28 Jun 19:28 IPC drwxr-xr-x 4 root admin 136 28 Jun 19:28 List drwxr-xr-x 4 root admin 136 28 Jun 19:28 MIME drwxr-xr-x 3 root admin 102 28 Jun 19:28 Math -r--r--r-- 1 root admin 2519 25 Jun 23:39 NDBM_File.pm -r--r--r-- 1 root admin 4208 25 Jun 23:39 O.pm -r--r--r-- 1 root admin 15563 25 Jun 23:39 Opcode.pm -r--r--r-- 1 root admin 21011 25 Jun 23:39 POSIX.pm -r--r--r-- 1 root admin 58962 25 Jun 23:39 POSIX.pod drwxr-xr-x 5 root admin 170 28 Jun 19:28 PerlIO -r--r--r-- 1 root admin 2515 25 Jun 23:39 SDBM_File.pm drwxr-xr-x 4 root admin 136 28 Jun 19:28 Scalar -r--r--r-- 1 root admin 10837 25 Jun 23:39 Socket.pm -r--r--r-- 1 root admin 41003 25 Jun 23:39 Storable.pm drwxr-xr-x 4 root admin 136 28 Jun 19:28 Sys drwxr-xr-x 3 root admin 102 28 Jun 19:28 Text drwxr-xr-x 5 root admin 170 28 Jun 19:28 Time drwxr-xr-x 3 root admin 102 28 Jun 19:28 Unicode -r--r--r-- 1 root admin 14462 25 Jun 23:39 attributes.pm drwxr-xr-x 38 root admin 1292 28 Jun 19:28 auto -r--r--r-- 1 root admin 19892 25 Jun 23:39 encoding.pm -r--r--r-- 1 root admin 6853 25 Jun 23:39 lib.pm -r--r--r-- 1 root admin 11044 25 Jun 23:39 mro.pm -r--r--r-- 1 root admin 997 25 Jun 23:39 ops.pm -r--r--r-- 1 root admin 13945 25 Jun 23:39 re.pm drwxr-xr-x 3 root admin 102 28 Jun 19:28 threads -r--r--r-- 1 root admin 33283 25 Jun 23:39 threads.pm So, it sort of seems to me that the permissions which perl5 got installed with for these modules has gotten mixed up somehow? I'm not really a perl user beyond enjoying it for massive directory-recursive find/replace operations within text files, so I haven't much of an idea what the permissions here are supposed to look like, and I'm not really sure how to go about determining how macports has gone and installed perl this way when it's otherwise worked without failure for years now. Does anyone have any recommendations for the sanest path towards rectifying this issue? Also, is there any interesting reason as to why the macports default for the perl5 port installs 5.12.4, and not 5.16.0, which has to be explicitly installed via the perl5.16 port? Thanks again!

Read the article
Can't configure PAM + LDAP on Debian Lenny - Getting error=49 on server logs

- by Jorge Suárez de Lis

I've been migrating some servers and desktops using Ubuntu 10.04 from getting the users from an old OpenLDAP implementation to a newer Centos Active Directory. I haven't had any problems so far, until I reached a Debian Lenny server. I've set up the server as the others, setting /etc/ldap.conf and /etc/ldap/ldap.conf. However, when I issue "getent passwd", I get nothing from the LDAP server. Reading the pam_ldap manpage, I realized that /etc/ldap.conf was not an accepted file by pam_ldap -it worked with Ubuntu though-, so I renamed it to /etc/pam_ldap.conf. Same result. However, once I've changed the name of this file, when I login using SSH I get this on the LDAP server logs: [20/Jul/2012:11:19:40 +0200] conn=16501 fd=155 slot=155 connection from x.x.x.50 to 10.1.176.237 [20/Jul/2012:11:19:40 +0200] conn=16501 op=0 BIND dn="uid=ubuntu,ou=Applications,ou=CITIUS,dc=inv,dc=usc,dc=es" method=128 version=3 [20/Jul/2012:11:19:40 +0200] conn=16501 op=0 RESULT err=0 tag=97 nentries=0 etime=0 dn="uid=ubuntu,ou=applications,ou=citius,dc=inv,dc=usc,dc=es" [20/Jul/2012:11:19:40 +0200] conn=16501 op=1 SRCH base="ou=People,ou=CITIUS,dc=inv,dc=usc,dc=es" scope=2 filter="(uid=jorge.suarez)" attrs=ALL [20/Jul/2012:11:19:40 +0200] conn=16501 op=1 RESULT err=0 tag=101 nentries=1 etime=0 notes=U [20/Jul/2012:11:19:40 +0200] conn=16501 op=2 BIND dn="uid=jorge.suarez,ou=People,ou=CITIUS,dc=inv,dc=usc,dc=es" method=128 version=3 [20/Jul/2012:11:19:40 +0200] conn=16501 op=2 RESULT err=49 tag=97 nentries=0 etime=0 The password isn't working. I don't know that could be wrong, anything else seems to be OK. That user/password is working from another clients: [20/Jul/2012:11:29:39 +0200] conn=16528 fd=188 slot=188 connection from x.x.x.224 to 10.1.176.237 [20/Jul/2012:11:29:39 +0200] conn=16528 op=0 BIND dn="uid=ubuntu,ou=Applications,ou=CITIUS,dc=inv,dc=usc,dc=es" method=128 version=3 [20/Jul/2012:11:29:39 +0200] conn=16528 op=0 RESULT err=0 tag=97 nentries=0 etime=0 dn="uid=ubuntu,ou=applications,ou=citius,dc=inv,dc=usc,dc=es" [20/Jul/2012:11:29:39 +0200] conn=16528 op=1 SRCH base="ou=People,ou=CITIUS,dc=inv,dc=usc,dc=es" scope=2 filter="(uid=jorge.suarez)" attrs=ALL [20/Jul/2012:11:29:39 +0200] conn=16528 op=1 RESULT err=0 tag=101 nentries=1 etime=0 notes=U [20/Jul/2012:11:29:39 +0200] conn=16528 op=2 BIND dn="uid=jorge.suarez,ou=People,ou=CITIUS,dc=inv,dc=usc,dc=es" method=128 version=3 [20/Jul/2012:11:29:39 +0200] conn=16528 op=2 RESULT err=0 tag=97 nentries=0 etime=0 dn="uid=jorge.suarez,ou=people,ou=citius,dc=inv,dc=usc,dc=es" I'm using SSHA for storing passwords on the LDAP server. Maybe this is not supported by Debian Lenny? On pam_ldap.conf, I've set up this, as in all the other servers: # Do not hash the password at all; presume # the directory server will do it, if # necessary. This is the default. pam_password md5 Also tried clear, but it didn't work. Anyways, it's weird that issuing getent passwd still gets me no users. However, if I use pamtest from the package libpam-dotfile to test login, it works. # pamtest ssh jorge.suarez Trying to authenticate <jorge.suarez> for service <ssh>. Password: Authentication successful. # pamtest foo jorge.suarez Trying to authenticate <jorge.suarez> for service <foo>. Password: Authentication successful. But "su" won't work also: # su jorge.suarez Id. descoñecido: jorge.suarez Just the output from getent passwd : # getent passwd root:x:0:0:root:/root:/bin/bash daemon:x:1:1:daemon:/usr/sbin:/bin/sh bin:x:2:2:bin:/bin:/bin/sh sys:x:3:3:sys:/dev:/bin/sh sync:x:4:65534:sync:/bin:/bin/sync games:x:5:60:games:/usr/games:/bin/sh man:x:6:12:man:/var/cache/man:/bin/sh lp:x:7:7:lp:/var/spool/lpd:/bin/sh mail:x:8:8:mail:/var/mail:/bin/sh news:x:9:9:news:/var/spool/news:/bin/sh uucp:x:10:10:uucp:/var/spool/uucp:/bin/sh proxy:x:13:13:proxy:/bin:/bin/sh www-data:x:33:33:www-data:/var/www:/bin/sh backup:x:34:34:backup:/var/backups:/bin/sh list:x:38:38:Mailing List Manager:/var/list:/bin/sh irc:x:39:39:ircd:/var/run/ircd:/bin/sh gnats:x:41:41:Gnats Bug-Reporting System (admin):/var/lib/gnats:/bin/sh nobody:x:65534:65534:nobody:/nonexistent:/bin/sh libuuid:x:100:101::/var/lib/libuuid:/bin/sh Debian-exim:x:101:103::/var/spool/exim4:/bin/false statd:x:102:65534::/var/lib/nfs:/bin/false sshd:x:104:65534::/var/run/sshd:/usr/sbin/nologin luser:x:1000:1000:Usuario local de Burdeos,,,:/home/luser:/bin/bash messagebus:x:105:107::/var/run/dbus:/bin/false sge-admin:x:1001:1001:Administrador do SGE,,,:/home/cluster/sge-admin:/bin/bash ntp:x:107:110::/home/ntp:/bin/false haldaemon:x:108:111:Hardware abstraction layer,,,:/var/run/hald:/bin/false vde2-net:x:109:114::/var/run/vde2:/bin/false uml-net:x:110:115::/home/uml-net:/bin/false polkituser:x:111:116:PolicyKit,,,:/var/run/PolicyKit:/bin/false Debian-pxe:x:113:65534:Dummy user for Debian pxe package,,,:/home/Debian-pxe:/bin/false Nscd was stopped from the beginning.

Read the article
php crashes with no core file and this message : apc_mmap failed

- by greg0ire

Description of the problem Regularly, cron php processes crash on our production server, which result in mails with the following body : PHP Fatal error: PHP Startup: apc_mmap: mmap failed: in Unknown on line 0 Segmentation fault (core dumped) I think the Segmentation fault (core dumped) should result in core files being handled by apport and then written in /var/crashes, but the files I can see there are there since yesterday, although the last crash occured today : -rw-r----- 1 root whoopsie 1138528 mai 22 04:09 _usr_bin_php5.0.crash -rw-r----- 1 frontoffice whoopsie 1166373 mai 20 18:00 _usr_bin_php5.1005.crash -rw-r----- 1 frontoffice whoopsie 81622658 mai 22 00:05 _usr_sbin_php5-fpm.1005.crash I tried to download the last one anyway, and ran gdb /usr/sbin/php5-fpm /tmp/_usr_sbin_php5-fpm.1005.crash, only to be told that the file is not a core file (its format was not recognized). Here is the server's apc configuration : cat /etc/php5/cli/conf.d/20-apc.ini extension=apc.so apc.shm_size=512M apc.ttl=3600 apc.user_ttl=3600 apc.enable_cli=1 I'm mostly worried about the apc.shm_size… isn't it too high or too low ? I understand it has to do with the size of memory segments. Question(s) What could be the problem ? How can I troubleshoot it (how can I get a valid core file ?) ? System information free total used free shared buffers cached Mem: 5081296 4354684 726612 0 374744 959968 -/+ buffers/cache: 3019972 2061324 Swap: 522236 516888 5348 cat /etc/lsb-release DISTRIB_ID=Ubuntu DISTRIB_RELEASE=12.04 DISTRIB_CODENAME=precise DISTRIB_DESCRIPTION="Ubuntu 12.04.2 LTS" php -v PHP 5.4.17-1~precise+1 (cli) (built: Jul 17 2013 18:14:06) Copyright (c) 1997-2013 The PHP Group Zend Engine v2.4.0, Copyright (c) 1998-2013 Zend Technologies php -i excerpt : Configuration apc APC Support => enabled Version => 3.1.13 APC Debugging => Disabled MMAP Support => Enabled MMAP File Mask => Locking type => pthread mutex Locks Serialization Support => php Revision => $Revision: 327136 $ Build Date => Nov 20 2012 18:41:36 Directive => Local Value => Master Value apc.cache_by_default => On => On apc.canonicalize => On => On apc.coredump_unmap => Off => Off apc.enable_cli => On => On apc.enabled => On => On apc.file_md5 => Off => Off apc.file_update_protection => 2 => 2 apc.filters => no value => no value apc.gc_ttl => 3600 => 3600 apc.include_once_override => Off => Off apc.lazy_classes => Off => Off apc.lazy_functions => Off => Off apc.max_file_size => 1M => 1M apc.mmap_file_mask => no value => no value apc.num_files_hint => 1000 => 1000 apc.preload_path => no value => no value apc.report_autofilter => Off => Off apc.rfc1867 => Off => Off apc.rfc1867_freq => 0 => 0 apc.rfc1867_name => APC_UPLOAD_PROGRESS => APC_UPLOAD_PROGRESS apc.rfc1867_prefix => upload_ => upload_ apc.rfc1867_ttl => 3600 => 3600 apc.serializer => default => default apc.shm_segments => 1 => 1 apc.shm_size => 512M => 512M apc.shm_strings_buffer => 4M => 4M apc.slam_defense => On => On apc.stat => On => On apc.stat_ctime => Off => Off apc.ttl => 3600 => 3600 apc.use_request_time => On => On apc.user_entries_hint => 4096 => 4096 apc.user_ttl => 3600 => 3600 apc.write_lock => On => On php -m [PHP Modules] apc bcmath bz2 calendar Core ctype curl date dba dom ereg exif fileinfo filter ftp gd gettext hash iconv imagick intl json ldap libxml mbstring memcache memcached mhash mysql mysqli openssl pcntl pcre PDO pdo_mysql pdo_pgsql pdo_sqlite pgsql Phar posix Reflection session shmop SimpleXML soap sockets SPL sqlite3 standard sysvmsg sysvsem sysvshm tidy tokenizer wddx xml xmlreader xmlwriter zip zlib [Zend Modules] ulimit -a core file size (blocks, -c) 0 data seg size (kbytes, -d) unlimited scheduling priority (-e) 0 file size (blocks, -f) unlimited pending signals (-i) 39531 max locked memory (kbytes, -l) 64 max memory size (kbytes, -m) unlimited open files (-n) 1024 pipe size (512 bytes, -p) 8 POSIX message queues (bytes, -q) 819200 real-time priority (-r) 0 stack size (kbytes, -s) 8192 cpu time (seconds, -t) unlimited max user processes (-u) 39531 virtual memory (kbytes, -v) unlimited file locks (-x) unlimited

Read the article
Nginx + Haproxy + Thin + Rails - 503 Service Unavailable -

- by Luca G. Soave

I don't know how troubleshoot this. I get "503 Service Unavailable" http error for all "nginx upstreams" proxy passing calls to haproxy fast_thin and slow_thin ( server 127.0.0.1:3100 and server 127.0.0.1:3200 ), which loadbalance on 6 Thin servers ( 127.0.0.1:3000 .. 3005 ). Static files like /blog are currently fine. The falldown is: nginx on port 80 - haproxy on 3100 and 3200 - thin on 3000 .. 3005 and then Rails. Here it is /etc/nginx/nginx.conf : user nginx; worker_processes 2; pid /var/run/nginx.pid; events { worker_connections 1024; } http { include /etc/nginx/mime.types; default_type application/octet-stream; log_format main '$remote_addr - $remote_user [$time_local] "$request" ' '$status $body_bytes_sent "$http_referer" ' '"$http_user_agent" "$http_x_forwarded_for"'; sendfile on; tcp_nopush on; keepalive_timeout 65; tcp_nodelay on; include /etc/nginx/conf.d/*.conf; } then /etc/nginx/conf.d/default.conf upstream fast_thin { server 127.0.0.1:3100; } upstream slow_thin { server 127.0.0.1:3200; } server { listen 80; server_name www.gitwatcher.com; rewrite ^/(.*) http://gitwatcher.com/$1 permanent; } server { listen 80; server_name gitwatcher.com; access_log /var/www/gitwatcher/log/access.log; error_log /var/www/gitwatcher/log/error.log; root /var/www/gitwatcher/public; # index index.html; location /about { proxy_pass http://fast_thin; break; } location /trends { proxy_pass http://slow_thin; break; } location /categories { proxy_pass http://slow_thin; break; } location /signout { proxy_pass http://slow_thin; break; } location /auth/github { proxy_pass http://slow_thin; break; } location / { proxy_set_header X-Real-IP $remote_addr; proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for; proxy_set_header Host $http_host; proxy_redirect off; if (-f $request_filename/index.html) { rewrite (.*) $1/index.html break; } if (-f $request_filename.html) { rewrite (.*) $1.html break; } if (!-f $request_filename) { proxy_pass http://slow_thin; break; } } } then haproxy config file /etc/haproxy/haproxy.cfg : global log 127.0.0.1 local0 log 127.0.0.1 local1 notice #log loghost local0 info maxconn 4096 #chroot /usr/share/haproxy user haproxy group haproxy daemon #debug #quiet nbproc 1 # number of processing cores defaults log global retries 3 maxconn 2000 contimeout 5000 mode http clitimeout 60000 # maximum inactivity time on the client side srvtimeout 30000 # maximum inactivity time on the server side timeout connect 4000 # maximum time to wait for a connection attempt to a server to succeed option httplog option dontlognull option redispatch option httpclose # disable keepalive (HAProxy does not yet support the HTTP keep-alive mode) option abortonclose # enable early dropping of aborted requests from pending queue option httpchk # enable HTTP protocol to check on servers health option forwardfor # enable insert of X-Forwarded-For headers balance roundrobin # each server is used in turns, according to assigned weight stats enable # enable web-stats at /haproxy?stats stats auth haproxy:pr0xystats # force HTTP Auth to view stats stats refresh 5s # refresh rate of stats page listen rails_proxy 127.0.0.1:3100 # - equal weights on all servers # - maxconn will queue requests at HAProxy if limit is reached # - minconn dynamically scales the connection concurrency (bound my maxconn) depending on size of HAProxy queue # - check health every 20000 microseconds server web1 127.0.0.1:3000 weight 1 minconn 3 maxconn 6 check inter 20000 server web1 127.0.0.1:3001 weight 1 minconn 3 maxconn 6 check inter 20000 server web1 127.0.0.1:3002 weight 1 minconn 3 maxconn 6 check inter 20000 listen slow_proxy 127.0.0.1:3200 # cluster for slow requests, lower the queues, check less frequently server slow1 127.0.0.1:3003 weight 1 minconn 1 maxconn 3 check inter 40000 server slow2 127.0.0.1:3004 weight 1 minconn 1 maxconn 3 check inter 40000 server slow3 127.0.0.1:3005 weight 1 minconn 1 maxconn 3 check inter 40000 and the Thin config file /etc/thin/gitwatcher.yml : --- chdir: /var/www/gitwatcher environment: production address: 0.0.0.0 port: 3000 timeout: 30 log: log/thin.log pid: tmp/pids/thin.pid max_conns: 1024 max_persistent_conns: 100 require: [] wait: 30 servers: 6 daemonize: true if I look into open listen ports, I got the following : root@fullness:/var/www/gitwatcher# lsof | grep TCP | egrep "nginx|haproxy|thin" nginx 834 root 8u IPv4 921 0t0 TCP *:http (LISTEN) nginx 835 nginx 8u IPv4 921 0t0 TCP *:http (LISTEN) nginx 837 nginx 8u IPv4 921 0t0 TCP *:http (LISTEN) haproxy 1908 haproxy 4u IPv4 11699 0t0 TCP localhost:3100 (LISTEN) haproxy 1908 haproxy 6u IPv4 11701 0t0 TCP localhost:3200 (LISTEN) root@fullness:/var/www/gitwatcher# iptables -L get me the following : Chain INPUT (policy DROP) target prot opt source destination ACCEPT all -- anywhere anywhere state RELATED,ESTABLISHED ACCEPT all -- anywhere anywhere state RELATED,ESTABLISHED ACCEPT tcp -- anywhere anywhere tcp dpt:22222 ACCEPT tcp -- anywhere anywhere tcp dpt:http ACCEPT tcp -- anywhere anywhere tcp dpt:https ACCEPT all -- anywhere anywhere DROP all -- anywhere anywhere Chain FORWARD (policy ACCEPT) target prot opt source destination Chain OUTPUT (policy ACCEPT) target prot opt source destination ACCEPT all -- anywhere anywhere Any help ?

Read the article
Weird routing issue (updated)

- by smccloud

I just updated the route tables due to a mistake on my part. I am working on getting networking working correctly on a cluster of 14 virtual servers at a customer site. 11 of them work fine for routing and 3 don't work correctly for their administrative network (172.28.56.0). All are running Windows Web Server 2008R2. Default gateway is set on the production network (172.28.58.0) and not on the administrative network (handled with persistent static routes). On a working server, route print gives me the following (MACs redacted) =========================================================================== Interface List 11...XX XX XX XX XX XX ......Intel(R) PRO/1000 MT Network Connection 13...XX XX XX XX XX XX00 0c 29 85 b2 98 ......Intel(R) PRO/1000 MT Network Connection #2 1...........................Software Loopback Interface 1 12...00 00 00 00 00 00 00 e0 Microsoft ISATAP Adapter 14...00 00 00 00 00 00 00 e0 Microsoft ISATAP Adapter #2 15...00 00 00 00 00 00 00 e0 Teredo Tunneling Pseudo-Interface =========================================================================== IPv4 Route Table =========================================================================== Active Routes: Network Destination Netmask Gateway Interface Metric 0.0.0.0 0.0.0.0 172.28.58.1 172.28.58.11 266 10.18.1.22 255.255.255.255 172.28.58.1 172.28.58.11 11 10.32.0.0 255.255.0.0 172.28.56.1 172.28.56.201 11 127.0.0.0 255.0.0.0 On-link 127.0.0.1 306 127.0.0.1 255.255.255.255 On-link 127.0.0.1 306 127.255.255.255 255.255.255.255 On-link 127.0.0.1 306 172.28.34.0 255.255.255.0 172.28.56.1 172.28.56.201 11 172.28.42.0 255.255.255.0 172.28.56.1 172.28.56.201 11 172.28.56.0 255.255.255.0 On-link 172.28.56.201 266 172.28.56.0 255.255.255.0 172.28.56.1 172.28.56.201 11 172.28.56.201 255.255.255.255 On-link 172.28.56.201 266 172.28.56.255 255.255.255.255 On-link 172.28.56.201 266 172.28.58.0 255.255.255.224 On-link 172.28.58.11 266 172.28.58.0 255.255.255.224 172.28.58.1 172.28.58.11 11 172.28.58.1 255.255.255.255 172.28.58.1 172.28.58.11 11 172.28.58.11 255.255.255.255 On-link 172.28.58.11 266 172.28.58.31 255.255.255.255 On-link 172.28.58.11 266 172.28.60.0 255.255.255.0 172.28.56.1 172.28.56.201 11 172.28.63.0 255.255.255.0 172.28.56.1 172.28.56.201 11 192.168.0.0 255.255.0.0 172.28.56.1 172.28.56.201 11 224.0.0.0 240.0.0.0 On-link 127.0.0.1 306 224.0.0.0 240.0.0.0 On-link 172.28.56.201 266 224.0.0.0 240.0.0.0 On-link 172.28.58.11 266 255.255.255.255 255.255.255.255 On-link 127.0.0.1 306 255.255.255.255 255.255.255.255 On-link 172.28.56.201 266 255.255.255.255 255.255.255.255 On-link 172.28.58.11 266 =========================================================================== Persistent Routes: Network Address Netmask Gateway Address Metric 172.28.56.0 255.255.255.0 172.28.56.1 1 172.28.63.0 255.255.255.0 172.28.56.1 1 192.168.0.0 255.255.0.0 172.28.56.1 1 172.28.60.0 255.255.255.0 172.28.56.1 1 10.32.0.0 255.255.0.0 172.28.56.1 1 172.28.34.0 255.255.255.0 172.28.56.1 1 172.28.42.0 255.255.255.0 172.28.56.1 1 0.0.0.0 0.0.0.0 172.28.58.1 Default =========================================================================== IPv6 Route Table =========================================================================== Active Routes: If Metric Network Destination Gateway 1 306 ::1/128 On-link 1 306 ff00::/8 On-link =========================================================================== Persistent Routes: None On one of the non-working server, route print gives me the following (MACs redacted) =========================================================================== Interface List 11...XX XX XX XX XX XX ......Intel(R) PRO/1000 MT Network Connection 13...XX XX XX XX XX XX ......Intel(R) PRO/1000 MT Network Connection #2 1...........................Software Loopback Interface 1 12...00 00 00 00 00 00 00 e0 Microsoft ISATAP Adapter 14...00 00 00 00 00 00 00 e0 Microsoft ISATAP Adapter #2 16...00 00 00 00 00 00 00 e0 Teredo Tunneling Pseudo-Interface =========================================================================== IPv4 Route Table =========================================================================== Active Routes: Network Destination Netmask Gateway Interface Metric 0.0.0.0 0.0.0.0 172.28.58.1 172.28.58.21 266 10.32.0.0 255.255.0.0 172.28.56.1 172.28.56.211 11 127.0.0.0 255.0.0.0 On-link 127.0.0.1 306 127.0.0.1 255.255.255.255 On-link 127.0.0.1 306 127.255.255.255 255.255.255.255 On-link 127.0.0.1 306 172.28.34.0 255.255.255.0 172.28.56.1 172.28.56.211 11 172.28.42.0 255.255.255.0 172.28.56.1 172.28.56.211 11 172.28.56.0 255.255.255.0 172.28.56.1 172.28.56.211 11 172.28.56.211 255.255.255.255 On-link 172.28.56.211 266 172.28.58.0 255.255.255.0 172.28.58.1 172.28.58.21 11 172.28.58.0 255.255.255.224 On-link 172.28.58.21 266 172.28.58.21 255.255.255.255 On-link 172.28.58.21 266 172.28.58.31 255.255.255.255 On-link 172.28.58.21 266 172.28.60.0 255.255.255.0 172.28.56.1 172.28.56.211 11 172.28.63.0 255.255.255.0 172.28.56.1 172.28.56.211 11 192.168.0.0 255.255.0.0 172.28.56.1 172.28.56.211 11 224.0.0.0 240.0.0.0 On-link 127.0.0.1 306 224.0.0.0 240.0.0.0 On-link 172.28.56.211 266 224.0.0.0 240.0.0.0 On-link 172.28.58.21 266 255.255.255.255 255.255.255.255 On-link 127.0.0.1 06 255.255.255.255 255.255.255.255 On-link 172.28.56.211 266 255.255.255.255 255.255.255.255 On-link 172.28.58.21 266 =========================================================================== Persistent Routes: Network Address Netmask Gateway Address Metric 172.28.56.0 255.255.255.0 172.28.56.1 1 172.28.60.0 255.255.255.0 172.28.56.1 1 172.28.63.0 255.255.255.0 172.28.56.1 1 172.28.34.0 255.255.255.0 172.28.56.1 1 172.28.42.0 255.255.255.0 172.28.56.1 1 192.168.0.0 255.255.0.0 172.28.56.1 1 10.32.0.0 255.255.0.0 172.28.56.1 1 0.0.0.0 0.0.0.0 172.28.58.1 Default 172.28.58.0 255.255.255.0 172.28.58.1 1 =========================================================================== IPv6 Route Table =========================================================================== Active Routes: If Metric Network Destination Gateway 1 306 ::1/128 On-link 1 306 ff00::/8 On-link =========================================================================== Persistent Routes: None I am at a complete loss why the non-working servers have no On-link route for 172.28.56.0. Does anyone have any suggestions on what I should be looking at to figure this out? Also, I do have "physical" access to the console if needed through vSphere Client.

Read the article
vSphere ESX 5.5 hosts cannot connect to NFS Server

- by Gerald

Summary: My problem is I cannot use the QNAP NFS Server as an NFS datastore from my ESX hosts despite the hosts being able to ping it. I'm utilising a vDS with LACP uplinks for all my network traffic (including NFS) and a subnet for each vmkernel adapter. Setup: I'm evaluating vSphere and I've got two vSphere ESX 5.5 hosts (node1 and node2) and each one has 4x NICs. I've teamed them all up using LACP/802.3ad with my switch and then created a distributed switch between the two hosts with each host's LAG as the uplink. All my networking is going through the distributed switch, ideally, I want to take advantage of DRS and the redundancy. I have a domain controller VM ("Central") and vCenter VM ("vCenter") running on node1 (using node1's local datastore) with both hosts attached to the vCenter instance. Both hosts are in a vCenter datacenter and a cluster with HA and DRS currently disabled. I have a QNAP TS-669 Pro (Version 4.0.3) (TS-x69 series is on VMware Storage HCL) which I want to use as the NFS server for my NFS datastore, it has 2x NICs teamed together using 802.3ad with my switch. vmkernel.log: The error from the host's vmkernel.log is not very useful: NFS: 157: Command: (mount) Server: (10.1.2.100) IP: (10.1.2.100) Path: (/VM) Label (datastoreNAS) Options: (None) cpu9:67402)StorageApdHandler: 698: APD Handle 509bc29f-13556457 Created with lock[StorageApd0x411121] cpu10:67402)StorageApdHandler: 745: Freeing APD Handle [509bc29f-13556457] cpu10:67402)StorageApdHandler: 808: APD Handle freed! cpu10:67402)NFS: 168: NFS mount 10.1.2.100:/VM failed: Unable to connect to NFS server. Network Setup: Here is my distributed switch setup (JPG). Here are my networks. 10.1.1.0/24 VM Management (VLAN 11) 10.1.2.0/24 Storage Network (NFS, VLAN 12) 10.1.3.0/24 VM vMotion (VLAN 13) 10.1.4.0/24 VM Fault Tolerance (VLAN 14) 10.2.0.0/24 VM's Network (VLAN 20) vSphere addresses 10.1.1.1 node1 Management 10.1.1.2 node2 Management 10.1.2.1 node1 vmkernel (For NFS) 10.1.2.2 node2 vmkernel (For NFS) etc. Other addresses 10.1.2.100 QNAP TS-669 (NFS Server) 10.2.0.1 Domain Controller (VM on node1) 10.2.0.2 vCenter (VM on node1) I'm using a Cisco SRW2024P Layer-2 switch (Jumboframes enabled) with the following setup: LACP LAG1 for node1 (Ports 1 through 4) setup as VLAN trunk for VLANs 11-14,20 LACP LAG2 for my router (Ports 5 through 8) setup as VLAN trunk for VLANs 11-14,20 LACP LAG3 for node2 (Ports 9 through 12) setup as VLAN trunk for VLANs 11-14,20 LACP LAG4 for the QNAP (Ports 23 and 24) setup to accept untagged traffic into VLAN 12 Each subnet is routable to another, although, connections to the NFS server from vmk1 shouldn't need it. All other traffic (vSphere Web Client, RDP etc.) goes through this setup fine. I tested the QNAP NFS server beforehand using ESX host VMs atop of a VMware Workstation setup with a dedicated physical NIC and it had no problems. The ACL on the NFS Server share is permissive and allows all subnet ranges full access to the share. I can ping the QNAP from node1 vmk1, the adapter that should be used to NFS: ~ # vmkping -I vmk1 10.1.2.100 PING 10.1.2.100 (10.1.2.100): 56 data bytes 64 bytes from 10.1.2.100: icmp_seq=0 ttl=64 time=0.371 ms 64 bytes from 10.1.2.100: icmp_seq=1 ttl=64 time=0.161 ms 64 bytes from 10.1.2.100: icmp_seq=2 ttl=64 time=0.241 ms Netcat does not throw an error: ~ # nc -z 10.1.2.100 2049 Connection to 10.1.2.100 2049 port [tcp/nfs] succeeded! The routing table of node1: ~ # esxcfg-route -l VMkernel Routes: Network Netmask Gateway Interface 10.1.1.0 255.255.255.0 Local Subnet vmk0 10.1.2.0 255.255.255.0 Local Subnet vmk1 10.1.3.0 255.255.255.0 Local Subnet vmk2 10.1.4.0 255.255.255.0 Local Subnet vmk3 default 0.0.0.0 10.1.1.254 vmk0 VM Kernel NIC info ~ # esxcfg-vmknic -l Interface Port Group/DVPort IP Family IP Address Netmask Broadcast MAC Address MTU TSO MSS Enabled Type vmk0 133 IPv4 10.1.1.1 255.255.255.0 10.1.1.255 00:50:56:66:8e:5f 1500 65535 true STATIC vmk0 133 IPv6 fe80::250:56ff:fe66:8e5f 64 00:50:56:66:8e:5f 1500 65535 true STATIC, PREFERRED vmk1 164 IPv4 10.1.2.1 255.255.255.0 10.1.2.255 00:50:56:68:f5:1f 1500 65535 true STATIC vmk1 164 IPv6 fe80::250:56ff:fe68:f51f 64 00:50:56:68:f5:1f 1500 65535 true STATIC, PREFERRED vmk2 196 IPv4 10.1.3.1 255.255.255.0 10.1.3.255 00:50:56:66:18:95 1500 65535 true STATIC vmk2 196 IPv6 fe80::250:56ff:fe66:1895 64 00:50:56:66:18:95 1500 65535 true STATIC, PREFERRED vmk3 228 IPv4 10.1.4.1 255.255.255.0 10.1.4.255 00:50:56:72:e6:ca 1500 65535 true STATIC vmk3 228 IPv6 fe80::250:56ff:fe72:e6ca 64 00:50:56:72:e6:ca 1500 65535 true STATIC, PREFERRED Things I've tried/checked: I'm not using DNS names to connect to the NFS server. Checked MTU. Set to 9000 for vmk1, dvSwitch and Cisco switch and QNAP. Moved QNAP onto VLAN 11 (VM Management, vmk0) and gave it an appropriate address, still had same issue. Changed back afterwards of course. Tried initiating the connection of NAS datastore from vSphere Client (Connected to vCenter or directly to host), vSphere Web Client and the host's ESX Shell. All resulted in the same problem. Tried a path name of "VM", "/VM" and "/share/VM" despite not even having a connection to server. I plugged in a linux system (10.1.2.123) into a switch port configured for VLAN 12 and tried mounting the NFS share 10.1.2.100:/VM, it worked successfully and I had read-write access to it I tried disabling the firewall on the ESX host esxcli network firewall set --enabled false I'm out of ideas on what to try next. The things I'm doing differently from my VMware Workstation setup is the use of LACP with a physical switch and a virtual distributed switch between the two hosts. I'm guessing the vDS is probably the source of my troubles but I don't know how to fix this problem without eliminating it.

Read the article
Cross-platform distributed fault-tolerant (disconnected operation/local cache) filesystem

- by Adrian Frühwirth

We are facing a design "challenge" where we are required to set up a storage solution with the following properties: What we need HA a scalable storage backend offline/disconnected operation on the client to account for network outages cross-platform access client-side access from certainly Windows (probably XP upwards), possibly Linux backend integrates with AD/LDAP (permission management (user/group management, ...)) should work reasonably well over slow WAN-links Another problem is that we don't really know all possible use cases here, if people need to be able to have concurrent access to shared files or if they will only be accessing their own files, so a possible solution needs to account for concurrent access and how conflict management would look in this case from a user's point of view. This two years old blog posts sums up the impression that I have been getting during the last couple of days of research, that there are lots of current übercool projects implementing (non-Windows) clustered petabyte-capable blob-storage solutions but that there is none that supports disconnected operation nicely and natively, but I am hoping that we have missed an obvious solution. What we have tried OpenAFS We figured that we want a distributed network filesystem with a local cache and tested OpenAFS (which, as the only currently "stable" DFS supporting disconnected operation, seemed the way to go) for a week but there are several problems with it: it's a real pain to set up there are no official RHEL/CentOS packages the package of the current stable version 1.6.5.1 from elrepo randomly kernel panics on fresh installs, this is an absolute no-go Windows support (including the required Kerberos packages) is mystical. The current client for the 1.6 branch does not run on Windows 8, the current client for the 1.7 does but it just randomly crashes. After that experience we didn't even bother testing on XP and Windows 7. Suffice to say, we couldn't get it working and the whole setup has been so unstable and complicated to setup that it's just not an option for production. Samba + Unison Since OpenAFS was a complete disaster and no other DFS seems to support disconnected operation we went for a simpler idea that would sync files against a Samba server using Unison. This has the following advantages: Samba integrates with ADs; it's a pain but can be done. Samba solves the problem of remotely accessing the storage from Windows but introduces another SPOF and does not address the actual storage problem. We could probably stick any clustered FS underneath Samba, but that means we need a HA Samba setup on top of that to maintain HA which probably adds a lot of additional complexity. I vaguely remember trying to implement redundancy with Samba before and I could not silently failover between servers. Even when online, you are working with local files which will result in more conflicts than would be necessary if a local cache were only touched when disconnected It's not automatic. We cannot expect users to manually sync their files using the (functional, but not-so-pretty) GTK GUI on a regular basis. I attempted to semi-automate the process using the Windows task scheduler, but you cannot really do it in a satisfactory way. On top of that, the way Unison works makes syncing against Samba a costly operation, so I am afraid that it just doesn't scale very well or even at all. Samba + "Offline Files" After that we became a little desparate and gave Windows "offline files" a chance. We figured that having something that is inbuilt into the OS would reduce administrative efforts, helps blaming someone else when it's not working properly and should just work since people have been using this for years. Right? Wrong. We really wanted it to work, but it just doesn't. 30 minutes of copying files around and unplugging network cables/disabling network interfaces left us with (silent! there is only a tiny notification in Windows explorer in the statusbar, which doesn't even open Sync Center if you click on it!) undeletable files on the server (!) and conflicts that should not even be conflicts. In the end, we had one successful sync of a tiny text file, everything else just exploded horribly. Beyond that, there are other problems: Microsoft admits that "offline files" in Windows XP cannot cope with "large files" and therefore does not cache/sync them at all which would mean those files become unavailable if the connection drop In Windows 7 the feature is only available in the Professional/Ultimate/Enterprise editions. Summary Unless there is another fault-tolerant DFS that supports Windows natively I assume that stacking a HA Samba cluster on top of something like GlusterFS/Lustre/whatnot is the only option, but I hope that I am wrong here. How do other companies allow fault-tolerant network access to redundant storage in a heterogeneous environment with Windows?

Read the article

Search Results

Search found 4125 results on 165 pages for 'hash cluster'.

Page 160/165 | < Previous Page | 156 157 158 159 160 161 162 163 164 165 | Next Page >

- by cj

- by Elie Wazen

- by HarveyP

- by Karoly Vegh

- by david.butler(at)oracle.com

- by Scott McNeil

- by Karoly Vegh

- by Dave

- by Dave Ballantyne

- by Applications User Experience

- by Steve Felts

- by xraminx

- by PAS

- by utrecht

- by Kendall Hopkins

- by CasperDK

- by Mike Curry

- by maximus

- by Dmitri DB

- by Jorge Suárez de Lis

- by greg0ire

- by Luca G. Soave

- by smccloud

- by Gerald

- by Adrian Frühwirth

< Previous Page | 156 157 158 159 160 161 162 163 164 165 | Next Page >