Search Results

Search found 3295 results on 132 pages for 'solaris cluster'.

  • REDUX: How to overcome an incompatibility between the ksh on Linux vs. that installed on AIX/Solaris

    - by Andrew Stein
    I have uncovered another problem in our effort to port several hundred ksh scripts from AIX, Solaris and HP-UX to Linux. See here for the previous problem. This code:

        #!/bin/ksh
        if [ -a k* ]; then
            echo "Oh yeah!"
        else
            echo "No way!"
        fi
        exit 0

    (when run in a directory with several files whose names start with k) produces "Oh yeah!" when called with the AT&T ksh variants (ksh88 and ksh93). On the other hand, it produces an error message followed by "No way!" on the other shells (pdksh, MKS ksh and bash). Again, my questions are: Is there an environment variable that will cause pdksh to behave like ksh93? Failing that, is there an option on pdksh to get the required behavior?

    Read the article

  • A Special Oracle Business Breakfast in Berlin

    - by Detlef Drewanz
    We have been holding Business Breakfasts for several years now. The event is usually aimed at technologists with a deep thirst for technical knowledge. For a special occasion, the event on 13 June 2014 at our Customer Visit Center in Berlin is a little different. To mark the Solaris 11.2 launch, Markus Flierl, Oracle VP Software Development, is currently touring Germany. We have invited him to our Customer Visit Center in Berlin to discuss your cloud strategies and your requirements for a modern operating system with you. Perhaps you currently use an operating system that does not come from Oracle. That does not matter. The event is worth attending even then, because Mr. Flierl is just as interested in your requirements and the basis for your decisions. By the way: Markus Flierl was born and raised in southern Germany and therefore speaks fluent German.

    Agenda

        Start  End    Title
        08:30  09:30  Registration and breakfast
        09:30  09:45  Welcome and introduction (Ralf Zenses, Oracle Senior Director Systems Sales Consulting Europe North)
        09:45  11:30  Strategies for OpenStack, Software Defined Networking and Data Center Automation: Cloud Management Integrated, Not Just Installed (Markus Flierl, Oracle VP Software Development)
        11:30  11:45  Break
        11:45  12:15  Solaris 11.2 OpenStack Demo (Joost Pronk, Oracle Senior Principal Product Strategy Manager)
        12:15  13:00  Unified Archiving and SCAP: The Final Answer to Migration and Compliance Questions (Detlef Drewanz, Oracle Master Principal Sales Consultant)

    Further details and the registration link can be found here. The event is open to everyone who is interested. I look forward to your visit. See you there.

    Read the article

  • Using Solaris pkg to list all setuid or setgid programs

    - by darrenm
    $ pkg contents -a mode=4??? -a mode=2??? -t file -o pkg.name,path,mode

    We can also add a package name on the end to restrict it to just that single package, e.g.:

    $ pkg contents -a mode=4??? -a mode=2??? -t file -o pkg.name,path,mode core-os
    PKG.NAME        PATH                     MODE
    system/core-os  usr/bin/amd64/newtask    4555
    system/core-os  usr/bin/amd64/uptime     4555
    system/core-os  usr/bin/at               4755
    system/core-os  usr/bin/atq              4755
    system/core-os  usr/bin/atrm             4755
    system/core-os  usr/bin/crontab          4555
    system/core-os  usr/bin/mail             2511
    system/core-os  usr/bin/mailx            2511
    system/core-os  usr/bin/newgrp           4755
    system/core-os  usr/bin/pfedit           4755
    system/core-os  usr/bin/su               4555
    system/core-os  usr/bin/tip              4511
    system/core-os  usr/bin/write            2555
    system/core-os  usr/lib/utmp_update      4555
    system/core-os  usr/sbin/amd64/prtconf   2555
    system/core-os  usr/sbin/amd64/swap      2555
    system/core-os  usr/sbin/amd64/sysdef    2555
    system/core-os  usr/sbin/amd64/whodo     4555
    system/core-os  usr/sbin/prtdiag         2755
    system/core-os  usr/sbin/quota           4555
    system/core-os  usr/sbin/wall            2555

    Read the article

  • How to configure an Oracle BI cluster system in a Sun hardware environment

    - by Fekete Zoltán
    The following white paper, Deploying Oracle® Business Intelligence Enterprise Edition on Oracle's Sun Systems, describes in detail how to put together an Oracle BI cluster. This cluster environment provides:
    - high availability: the system keeps running even if one of the servers fails
    - load sharing between the BI servers, in an active-active role
    The document covers the architecture and configuration of both the hardware and the software components, and even the installation:
    - how the hardware components connect: the two Oracle Business Intelligence Sun SPARC Enterprise servers, the switch, the Sun Unified Storage, ...
    - software components: Oracle BI EE, WebLogic Server, Oracle Directory Server, Oracle Database, Oracle VM Server for SPARC, etc.
    Deploying Oracle® Business Intelligence Enterprise Edition on Oracle's Sun Systems

    Read the article

  • ndd on Solaris 10

    - by user12620111
    This is mostly a repost of LaoTsao's Weblog with some tweaks. The last time I tried to cut and paste directly from his page, some of the XML was messed up. I run this from my MacBook. It should also work from your Windows laptop if you use Cygwin.

    ================ If not already present, create an ssh key on your laptop ================

        # ssh-keygen -t rsa

    ================ Enable passwordless ssh from my laptop. I need to type in the root password for the remote machines once. After that, I no longer need to type in the password when I ssh or scp from my laptop to the servers. ================

        #!/usr/bin/env bash
        for server in `cat servers.txt`
        do
          echo root@$server
          cat ~/.ssh/id_rsa.pub | ssh root@$server "cat >> .ssh/authorized_keys"
        done

    ================ servers.txt ================

        testhost1
        testhost2

    ================ etc_system_addins ================

        set rpcmod:clnt_max_conns=8
        set zfs:zfs_arc_max=0x1000000000
        set nfs:nfs3_bsize=131072
        set nfs:nfs4_bsize=131072

    ================ ndd-nettune.txt ================

        #!/sbin/sh
        #
        # ident   "@(#)ndd-nettune.xml    1.0     01/08/06 SMI"

        . /lib/svc/share/smf_include.sh
        . /lib/svc/share/net_include.sh

        # Make sure that the libraries essential to this stage of booting can be found.
        LD_LIBRARY_PATH=/lib; export LD_LIBRARY_PATH
        echo "Performing Directory Server Tuning..." >> /tmp/smf.out
        #
        # Standard SuperCluster Tunables
        #
        /usr/sbin/ndd -set /dev/tcp tcp_max_buf 2097152
        /usr/sbin/ndd -set /dev/tcp tcp_xmit_hiwat 1048576
        /usr/sbin/ndd -set /dev/tcp tcp_recv_hiwat 1048576

        # Reset the library path now that we are past the critical stage
        unset LD_LIBRARY_PATH

    ================ ndd-nettune.xml ================

        <?xml version="1.0"?>
        <!DOCTYPE service_bundle SYSTEM "/usr/share/lib/xml/dtd/service_bundle.dtd.1">
        <!-- ident "@(#)ndd-nettune.xml 1.0 04/09/21 SMI" -->
        <service_bundle type='manifest' name='SUNWcsr:ndd'>
          <service name='network/ndd-nettune' type='service' version='1'>
            <create_default_instance enabled='true' />
            <single_instance />
            <dependency name='fs-minimal' type='service' grouping='require_all' restart_on='none'>
              <service_fmri value='svc:/system/filesystem/minimal' />
            </dependency>
            <dependency name='loopback-network' grouping='require_any' restart_on='none' type='service'>
              <service_fmri value='svc:/network/loopback' />
            </dependency>
            <dependency name='physical-network' grouping='optional_all' restart_on='none' type='service'>
              <service_fmri value='svc:/network/physical' />
            </dependency>
            <exec_method type='method' name='start' exec='/lib/svc/method/ndd-nettune' timeout_seconds='3' > </exec_method>
            <exec_method type='method' name='stop'  exec=':true'                       timeout_seconds='3' > </exec_method>
            <property_group name='startd' type='framework'>
              <propval name='duration' type='astring' value='transient' />
            </property_group>
            <stability value='Unstable' />
            <template>
              <common_name>
                <loctext xml:lang='C'> ndd network tuning </loctext>
              </common_name>
              <documentation>
                <manpage title='ndd' section='1M' manpath='/usr/share/man' />
              </documentation>
            </template>
          </service>
        </service_bundle>

    ================ system_tuning.sh ================

        #!/usr/bin/env bash
        for server in `cat servers.txt`
        do
          # Append the /etc/system tunables, then install and import the SMF service
          cat etc_system_addins | ssh root@$server "cat >> /etc/system"
          scp ndd-nettune.xml root@${server}:/var/svc/manifest/site/ndd-nettune.xml
          scp ndd-nettune.txt root@${server}:/lib/svc/method/ndd-nettune
          ssh root@$server chmod +x /lib/svc/method/ndd-nettune
          ssh root@$server svccfg validate /var/svc/manifest/site/ndd-nettune.xml
          ssh root@$server svccfg import /var/svc/manifest/site/ndd-nettune.xml
        done

    Read the article

  • How to Evict a Failed Node and Add it Back to SQL Server 2005 Cluster

    Adding and removing nodes in SQL Server Clusters is not so difficult, and instructions on how to do so abound on the internet. However, mismanagement when adding/removing nodes can quickly become a 'gotcha' that wastes time. Bo Chen offers insight into some of those scenarios that are not normally covered in the standard online documents.

    Read the article

  • Check which nodes of Beowulf HPC cluster system are free from PHP app?

    - by Skuja
    I am working on my diploma thesis project. I have access to a 32-node Dell PowerEdge HPC cluster with Linux (Debian, I think) installed on it. My first goal is to create a web (PHP) app where logged-in users can see free and busy nodes and turn them on and off. I am planning to do something like this: write a cron job or daemon that runs every 30 seconds (or at some other interval) and runs the ping utility against each node to find out whether it is on or off, then writes the results to a file. My web app could then read that file. Would this be a good solution? What existing node management solutions are there?
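The polling idea in the question can be sketched as follows. This is a minimal illustration in Python rather than PHP, and it assumes the Linux iputils `ping` flags (`-c` for count, `-W` for timeout in seconds). The function names, host names and status file path are hypothetical. Note that a ping only tells you whether a node is powered on and reachable; deciding whether a node is free or busy would need information from the cluster's job scheduler, which this sketch does not attempt.

```python
import json
import os
import subprocess
import time


def ping(host, timeout_s=1):
    """Return True if a single ICMP echo to `host` succeeds.

    Assumes the Linux iputils ping: -c is the packet count, -W the timeout.
    """
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def poll_nodes(hosts, is_up=ping):
    """Map each host name to "up" or "down" using the given probe function."""
    return {host: ("up" if is_up(host) else "down") for host in hosts}


def write_status(path, status):
    """Write the status file atomically so the web app never reads a partial file."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "w") as fh:
        json.dump({"checked_at": int(time.time()), "nodes": status}, fh)
    os.replace(tmp_path, path)


# Example use from a cron job (host names and path are hypothetical):
#   hosts = ["node%02d" % i for i in range(1, 33)]
#   write_status("/var/tmp/node_status.json", poll_nodes(hosts))
```

The web app then only has to read and decode one small JSON file. Writing to a temporary file and renaming it means a page load that coincides with a poll still sees a complete file.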

    Read the article

  • Algorithm to Find the Aggregate Mass of "Granola Bar"-Like Structures?

    - by Stuart Robbins
    I'm a planetary science researcher and one project I'm working on is N-body simulations of Saturn's rings. The goal of this particular study is to watch as particles clump together under their own self-gravity and measure the aggregate mass of the clumps versus the mean velocity of all particles in the cell. We're trying to figure out if this can explain some observations made by the Cassini spacecraft during the Saturnian summer solstice when large structures were seen casting shadows on the nearly edge-on rings. Below is a screenshot of what any given timestep looks like. (Each particle is 2 m in diameter and the simulation cell itself is around 700 m across.) The code I'm using already spits out the mean velocity at every timestep. What I need to do is figure out a way to determine the mass of particles in the clumps and NOT the stray particles between them. I know every particle's position, mass, size, etc., but I can't easily tell that, say, particles 30,000-40,000 along with 102,000-105,000 make up one strand that is obvious to the human eye. So, the algorithm I need to write should have as few user-entered parameters as possible (for replicability and objectivity), go through all the particle positions, figure out which particles belong to clumps, and then calculate the mass. It would be great if it could do it for "each" clump/strand as opposed to everything over the cell, but I don't think I actually need it to separate them out. The only thing I was thinking of was doing some sort of N² distance calculation where I'd calculate the distance between every pair of particles and if, say, the closest 100 particles were within a certain distance, then that particle would be considered part of a cluster. But that seems pretty sloppy and I was hoping that you CS folks and programmers might know of a more elegant solution?
    Edited with My Solution: What I did was to take a sort of nearest-neighbor / cluster approach and do the quick-n-dirty N² implementation first. So, take every particle, calculate the distance to all other particles, and the threshold for being in a cluster or not was whether there were N particles within distance d (two parameters that have to be set a priori, unfortunately, but as was said by some responses/comments, I wasn't going to get away with not having some of those). I then sped it up by not sorting distances but simply doing an order-N search and incrementing a counter for the particles within d, and that sped stuff up by a factor of 6. Then I added a "stupid programmer's tree" (because I know next to nothing about tree codes). I divide up the simulation cell into a set number of grids (best results when grid size ≈ 7d) where the main grid lines up with the cell, one grid is offset by half in x and y, and the other two are offset by 1/4 in ±x and ±y. The code then divides particles into the grids, then each particle only has to have distances calculated to the other particles in that grid cell. Theoretically, if this were a real tree, I should get order N*log(N) as opposed to N² speeds. I got somewhere between the two, where for a 50,000-particle sub-set I got a 17x increase in speed, and for a 150,000-particle cell, I got a 38x increase in speed. 12 seconds for the first, 53 seconds for the second, 460 seconds for a 500,000-particle cell. Those are comparable to how long the code takes to run the simulation 1 timestep forward, so that's reasonable at this point. Oh -- and it's fully threaded, so it'll take as many processors as I can throw at it.
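The neighbor-count criterion (a particle belongs to a clump if at least N other particles lie within distance d) can be sketched with a plain uniform cell list. Note this is a swapped-in, simpler technique than the four offset grids described above: particles are binned into square cells of side d, so all neighbors within d are in the same cell or one of the 8 adjacent ones. The sketch is 2D, ignores any periodic boundaries of the simulation cell, and the names `clump_mass`, `d` and `n_min` are illustrative assumptions.

```python
from collections import defaultdict
from math import floor


def clump_mass(positions, masses, d, n_min):
    """Return (total mass of clump members, sorted list of member indices).

    A particle counts as a clump member if at least `n_min` other particles
    lie within distance `d` of it. Assumes 2D positions and no periodic
    boundary wrapping.
    """
    # Bin particles into square cells of side d. Any neighbor within
    # distance d must then be in the same cell or an adjacent one.
    cells = defaultdict(list)
    for i, (x, y) in enumerate(positions):
        cells[(floor(x / d), floor(y / d))].append(i)

    d_squared = d * d
    members = []
    for i, (x, y) in enumerate(positions):
        cx, cy = floor(x / d), floor(y / d)
        count = 0
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                for j in cells.get((cx + dx, cy + dy), ()):
                    if j == i:
                        continue
                    px, py = positions[j]
                    # Compare squared distances to avoid a square root.
                    if (px - x) ** 2 + (py - y) ** 2 <= d_squared:
                        count += 1
        if count >= n_min:
            members.append(i)

    return sum(masses[i] for i in members), members
```

For roughly uniform particle densities the work per particle depends on the local density rather than on the total particle count, which is where the speed-up over the all-pairs N² search comes from.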

    Read the article

  • Exiting a reboot loop

    - by user12617035
    If you're in a situation where the system is panicking during boot, you can use

        # boot net -s

    to regain control of your system. In my case, I'd added some diagnostic code to a (PCI) driver (that is used to boot the root filesystem). There was a bug in the driver, the bug occurred on every boot, and so the system panicked each time:

        ...
        000000000180b950 genunix:vfs_mountroot+60 (800, 200, 0, 185d400, 1883000, 18aec00)
          %l0-3: 0000000000001770 0000000000000640 0000000001814000 00000000000008fc
          %l4-7: 0000000001833c00 00000000018b1000 0000000000000600 0000000000000200
        000000000180ba10 genunix:main+98 (18141a0, 1013800, 18362c0, 18ab800, 180e000, 1814000)
          %l0-3: 0000000070002000 0000000000000001 000000000180c000 000000000180e000
          %l4-7: 0000000000000001 0000000001074800 0000000000000060 0000000000000000

        skipping system dump - no dump device configured
        rebooting...

    If you're logged in via the console, you can send a BREAK sequence in order to gain control of the firmware's (OBP's) prompt. Enter Ctrl-Shift-[ in order to get the TELNET prompt. Once telnet has control, enter this:

        telnet> send brk

    You'll be presented with OBP's prompt:

        ok

    You then enter the following in order to boot into single-user mode via the network:

        ok boot net -s

    Note that booting from the network under Solaris will implicitly cause the system to be INSTALLED with whatever software had last been configured to be installed. However, we are using boot net -s as a "handle" with which to get at the Solaris prompt. Once at that prompt, we can perform actions as root that will let us back out our buggy driver (ok... MY buggy driver :-)) ...and replace it with the original, non-buggy driver. Entering the boot command caused the following output, and left us at the Solaris prompt (in single-user mode):

        Sun Blade 1500, No Keyboard
        Copyright 1998-2004 Sun Microsystems, Inc. All rights reserved.
        OpenBoot 4.16.4, 1024 MB memory installed, Serial #53463393.
        Ethernet address 0:3:ba:2f:c9:61, Host ID: 832fc961.

        Rebooting with command: boot net -s
        Boot device: /pci@1f,700000/network@2 File and args: -s
        1000 Mbps FDX Link up
        Timeout waiting for ARP/RARP packet
        Timeout waiting for ARP/RARP packet
        4000 1000 Mbps FDX Link up
        Requesting Internet address for 0:3:ba:2f:c9:61
        SunOS Release 5.10 Version Generic_118833-17 64-bit
        Copyright 1983-2005 Sun Microsystems, Inc. All rights reserved.
        Use is subject to license terms.
        Booting to milestone "milestone/single-user:default".
        Configuring devices.
        Using RPC Bootparams for network configuration information.
        Attempting to configure interface bge0...
        Configured interface bge0
        Requesting System Maintenance Mode
        SINGLE USER MODE
        #

    Our goal is now to move to the directory containing the buggy driver and replace it with the original driver (that we had saved away before ever loading our buggy driver! :-) However, since we booted from the network, the root filesystem ("/") is NOT mounted on one of our local disks. It is mounted on an NFS filesystem exported by our install server. To verify this, enter the following command:

        # mount | head -1
        / on my-server:/export/install/media/s10u2/solarisdvd.s10s_u2dvd/latest/Solaris_10/Tools/Boot remote/read/write/setuid/devices/dev=4ac0001 on Wed Dec 31 16:00:00 1969

    As a result, we have to create a temporary mount point and then mount the local disk onto that mount point:

        # mkdir /tmp/mnt
        # mount /dev/dsk/c0t0d0s0 /tmp/mnt

    Note that your system will not necessarily have had its root filesystem on "c0t0d0s0". This is something that you should also have recorded before you ever loaded your.. er... "my" buggy driver! :-) One can find the local disk mounted under the root filesystem by entering:

        # df -k /
        Filesystem            kbytes    used   avail capacity  Mounted on
        /dev/dsk/c0t0d0s0    76703839 4035535 71901266     6%    /

    To continue with our example, we can now move to the directory of the buggy driver in order to replace it with the original driver. Note that /tmp/mnt is prefixed to the path where we'd "normally" find the driver:

        # cd /tmp/mnt/platform/sun4u/kernel/drv/sparcv9
        # ls -l pci*
        -rw-r--r-- 1 root root 288504 Dec 6 15:38 pcisch
        -rw-r--r-- 1 root root 288504 Dec 6 15:38 pcisch.aar
        -rwxr-xr-x 1 root sys 211616 Jun 8 2006 pcisch.orig
        # cp -p pcisch.orig pcisch

    We can now synchronize any in-memory filesystem data structures with those on disk... and then reboot. The system will then boot correctly... as expected:

        # sync;sync
        # reboot
        syncing file systems... done
        Sun Blade 1500, No Keyboard
        Copyright 1998-2004 Sun Microsystems, Inc. All rights reserved.
        OpenBoot 4.16.4, 1024 MB memory installed, Serial #xxxxxxxx.
        Ethernet address 0:3:ba:2f:c9:61, Host ID: yyyyyyyy.

        Rebooting with command: boot
        Boot device: /pci@1e,600000/ide@d/disk@0,0:a File and args:
        SunOS Release 5.10 Version Generic_118833-17 64-bit
        Copyright 1983-2005 Sun Microsystems, Inc. All rights reserved.
        Use is subject to license terms.
        Hostname: my-host
        NIS domain name is my-campus.Central.Sun.COM

        my-host console login:

    ...so that's how it's done! Of course, the easier way is to never write a buggy driver... but.. then.. we all "have an eraser on the end of each of our pencils"... don't we? :-) "...thank you... and good night..."

    Read the article

  • Virtual Machine loses network connectivity on Hyper V Cluster

    - by Chris W
    We're running a number of VMs on a 6 node failover cluster of blades using Hyper V. We have an intermittent issue (every few days at different times - not a fixed frequency) of VMs losing network connectivity. Console access to the VM suggests all is fine and the underlying blade has normal connectivity. To resolve the problem we either have to re-start the VM or, more usually, we do a live migration to another blade which fires up connectivity and we then migrate it back to the original blade. I've had 3 instances of this happen with a specific VM running on a particular blade however it has happened once with a different VM running on a different blade. All VMs and blades have the same basic setup and are running Windows 2008 R2. Any ideas where I should be looking to diagnose the possible causes of this problem as the event logs provide no help? Edit: I've checked that each blade is running the latest NIC drivers and all seem to be fine. Something that is confusing me - a failover or restart of the VM resolves the issue. Whilst I need to work out the underlying issue that is causing the NICs to hang I'm also concerned that the VM didn't failover to another node which would have solved the outage for me. Is there a way to configure the cluster so that it can tell that the VM guest has lost connectivity and fail it over? As things stand the cluster is assuming that the VM is running happily as I presume Hyper V says everything is great even though there is a problem.

    Read the article

  • Configuration for a two machine ESXi cluster using VSA to present local storage to VMs

    - by MDMarra
    I'm designing a little vSphere 5 cluster for one of our remote sites. We have some IBM x3650s that have 6x 300GB 10K RPM drives in them, along with dual quad core CPUs and 24GB RAM. Because we use HP P4500 G2s at our primary site, we have licenses available for HP P4000 VSAs. I thought that this would be the perfect opportunity to use them. Below is a basic drawing of what I want to accomplish: I want to run a P4000 VSA on each server and run them in a Network RAID-10 (Lefthand speak for network mirroring, think of it as RAID 1 across nodes or as an active/active storage cluster). I will then present this storage to guests that will run on this mini-cluster. It will be managed by a vCenter Server on our main site. All connections will be GbE with two dedicated to storage. Management and Data will share a pair of connections, since I don't expect there to be high load. These servers are just there to provide directory services, dhcp, printing, etc. Does anyone see anything potentially wrong with this approach? Is this the best way to do this without adding additional dedicated storage heads? Are there any pitfalls in this design, besides the lack of dedicated Data/Mgmt interfaces?

    Read the article

  • Network Misconfiguration when adding first host to new vSphere cluster

    - by dunxd
    I am building a new vSphere cluster from scratch. I have installed ESXi on the first host, and built a vCenter server on a VM residing on that host (storage is on the local hard drive, although we have iSCSI targets which I can reach from the host). The cluster is configured for HA. When I try and add the host to the cluster, I get an error at the point where HA is configured - Cannot complete the . I have stripped the network configuration of the host down to the most basic - a single NIC attached to a single vSwitch - this is running the VMKernel Port on VLAN 8 - that is our Management VLAN. The vCenter server will have a network address on this VLAN, so I also set the initial Virtual Machine Port Group to this VLAN, and connected the vCenter server NIC to this port group. I understand I can't connect the vCenter server to the VMkernel port group, but shouldn't I be able to connect the vCenter server to a Port Group in the same VLAN? If not, do I need to create a VLAN specifically for VMKernel Port Group? I plan to set up another port group for vMotion with a dedicated and isolated VLAN (i.e. VLAN isn't routed) so this wouldn't allow vCenter to communicate. Does anyone have any suggestions, or other ideas for what might be causing the problem. I've read through the documentation, but it isn't giving me any pointers, and the error message isn't helping me beyond telling me something is wrong with my network config.

    Read the article

  • How to design highly scalable web services in Java?

    - by Kshitiz Sharma
    I am creating some Web Services that would have 2000 concurrent users. The services are offered for free and are hence expected to get a large user base. In the future it may be required to scale up to 50,000 users. There are already a few other questions that address the issue, like Building highly scalable web services. However, my requirements differ from the question above. For example:
    - My application does not have a user interface, so images, CSS and JavaScript are not an issue.
    - It is in Java, so suggestions like using HipHop to translate PHP to native code are useless.
    Hence I decided to ask my question separately. This is my project setup:
    - REST-based Web services using Apache CXF
    - Hibernate 3.0 (with relevant optimizations like lazy loading and custom HQL for tuning)
    - Tomcat 6.0
    - MySQL 5.5
    My questions are:
    1. Are there alternatives to MySQL that offer better performance for what I'm trying to do?
    2. What are some general things to abide by in order to scale a Java-based web application?
    3. I am thinking of putting my application in two Tomcat instances, with httpd redirecting each request to the appropriate Tomcat on the basis of load. Is this the right approach?
    4. Separate Tomcat instances can help, but then the database becomes the bottleneck since both applications access the same database? I am a programmer, not a DB admin. How difficult would it be to cluster a MySQL database (or to cluster whatever database is offered as an alternative in 1)?
    5. How effective are caching solutions like EHCache?
    6. Any other general best practices?
    Some clarifications:
    - Could you partition the data? Yes, we could, but we're trying to avoid it. We need to run a lot of data mining algorithms and the design would evolve over time, so we can't be sure where the lines of partition should be.

    Read the article

  • Clustering Strings on the basis of Common Substrings

    - by pk188
    I have 10,000+ strings and have to identify and group all the strings that look similar (I base the similarity on the number of common words between any two given strings). The more common words, the more similar the strings are. For instance:
    1. How to make another layer from an existing layer
    2. Unable to edit data on the network drive
    3. Existing layers in the desktop
    4. Assistance with network drive
    In this case, strings 1 and 3 are similar, with the common words Existing and Layer, and strings 2 and 4 are similar, with the common words Network and Drive (after eliminating stop words). The steps I'm following are:
    1. Iterate through the data set
    2. Do a row by row comparison
    3. Find the common words between the strings
    4. Form a cluster where the number of common words is greater than or equal to 2 (eliminating stop words)
    5. If the number of common words is less than 2, put the string in a new cluster
    6. Assign the rows either to the existing clusters or form a new one, depending on the common words
    7. Continue until all the strings are processed
    I am implementing the project in C#, and have got as far as step 3. However, I'm not sure how to proceed with the clustering. I have researched string clustering a lot but could not find any solution that fits my problem. Your input would be highly appreciated.
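The steps above can be sketched as a greedy, single-pass clustering. This is an illustration in Python rather than the asker's C#, under stated assumptions: the stop-word list is a tiny hypothetical sample, and plurals are normalized by crudely stripping a trailing "s" (needed so that "layers" matches "layer" in the example); a real implementation would likely use a proper stop-word list and stemmer. The names `keywords` and `cluster_strings` are made up for this sketch.

```python
import re

# Hypothetical, deliberately small stop-word list for illustration only.
STOP_WORDS = {"a", "an", "the", "to", "from", "on", "in", "with", "of",
              "how", "and", "or", "is"}


def keywords(text):
    """Lower-case, tokenize, drop stop words, and crudely singularize."""
    result = set()
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        if word in STOP_WORDS:
            continue
        if len(word) > 3 and word.endswith("s"):
            word = word[:-1]  # crude plural stripping, an assumption
        result.add(word)
    return result


def cluster_strings(strings, min_common=2):
    """Greedily group strings; returns clusters as lists of string indices.

    A string joins the first cluster containing any member with which it
    shares at least `min_common` keywords; otherwise it starts a new cluster.
    """
    keys = [keywords(s) for s in strings]
    clusters = []
    for i, kw in enumerate(keys):
        for cluster in clusters:
            if any(len(kw & keys[j]) >= min_common for j in cluster):
                cluster.append(i)
                break
        else:
            # No existing cluster was similar enough.
            clusters.append([i])
    return clusters
```

Because each string is compared against the members of existing clusters, the result depends on input order, and the worst case is still quadratic in the number of strings. For 10,000 strings that is usually workable, and an inverted index from keyword to string indices would cut down the comparisons if needed.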

    Read the article

  • What's up with OCFS2?

    - by wcoekaer
    On Linux there are many filesystem choices and even from Oracle we provide a number of filesystems, all with their own advantages and use cases. Customers often confuse ACFS with OCFS or OCFS2 which then causes assumptions to be made such as one replacing the other etc... I thought it would be good to write up a summary of how OCFS2 got to where it is, what we're up to still, how it is different from other options and how this really is a cool native Linux cluster filesystem that we worked on for many years and is still widely used. Work on a cluster filesystem at Oracle started many years ago, in the early 2000's when the Oracle Database Cluster development team wrote a cluster filesystem for Windows that was primarily focused on providing an alternative to raw disk devices and help customers with the deployment of Oracle Real Application Cluster (RAC). Oracle RAC is a cluster technology that lets us make a cluster of Oracle Database servers look like one big database. The RDBMS runs on many nodes and they all work on the same data. It's a Shared Disk database design. There are many advantages doing this but I will not go into detail as that is not the purpose of my write up. Suffice it to say that Oracle RAC expects all the database data to be visible in a consistent, coherent way, across all the nodes in the cluster. To do that, there were/are a few options : 1) use raw disk devices that are shared, through SCSI, FC, or iSCSI 2) use a network filesystem (NFS) 3) use a cluster filesystem(CFS) which basically gives you a filesystem that's coherent across all nodes using shared disks. It is sort of (but not quite) combining option 1 and 2 except that you don't do network access to the files, the files are effectively locally visible as if it was a local filesystem. So OCFS (Oracle Cluster FileSystem) on Windows was born. 
Since Linux was becoming a very important and popular platform, we decided that we would also make this available on Linux, and thus the porting of OCFS/Windows started. The first version of OCFS was really primarily focused on replacing the use of raw devices with a simple filesystem that lets you create files and provides direct IO to these files to get basically native raw disk performance. The filesystem was not designed to be fully POSIX compliant and it did not have anywhere near good/decent performance for regular file create/delete/access operations. Cache coherency was easy since it was basically always direct IO down to the disk device, and this ensured that any time one issues a write() command it would go directly down to the disk and not return until the write() was completed. Same for read(): any sort of read from a datafile would be a read() operation that went all the way to disk and returned. We did not cache any data when it came down to Oracle data files. So while OCFS worked well for that, since it did not have much of a normal filesystem feel, it was not something that could be submitted to the kernel mailing list for inclusion into Linux as another native Linux filesystem (setting aside the Windows porting code ...). It did its job well, it was very easy to configure, node membership was simple, locking was disk based (so very slow, but it existed), and you could create regular files and do regular filesystem operations to a certain extent, but anything that was not database data file related was just not very useful in general. Logfiles ok, standard filesystem use, not so much. Up to this point, all the work was done, at Oracle, by Oracle developers. Once OCFS (1) was out for a while and there was a lot of use in the database RAC world, many customers wanted to do more and were asking for features that you'd expect in a normal native filesystem, a real "general purpose cluster filesystem".
So the team sat down and basically started from scratch to implement what's now known as OCFS2 (Oracle Cluster FileSystem release 2). Some basic criteria were:
- Design it with a real Distributed Lock Manager and use the network for lock negotiation instead of the disk
- Make it a Linux native filesystem instead of a native shim layer and a portable core
- Support standard POSIX compliance and be fully cache coherent with all operations
- Support all the filesystem features Linux offers (ACL, extended attributes, quotas, sparse files, ...)
- Be modern: support large files, 32/64-bit, journaling, data ordered journaling, endian neutral (we can mount on both endian/cross architecture), ...
Needless to say, this was a huge development effort that took many years to complete. A few big milestones happened along the way... OCFS2 was developed in the open. We did not have a private tree that we worked on without external code review from the Linux filesystem maintainers; great folks like Christoph Hellwig reviewed the code regularly to make sure we were not doing anything out of line, and we submitted the code for review on lkml a number of times to see if we were getting close to having it included in the mainline kernel. Using this development model is standard practice for anyone who wants to write code that goes into the kernel and have any chance of doing so without a complete rewrite or.. shall I say flamefest when submitted. It saved us a tremendous amount of time by not having to re-fit code for it to be in a Linus-acceptable state. Some other filesystems that were trying to get into the kernel and didn't follow an open development model had a much harder time and got much harsher criticism. In March 2006, when Linus released 2.6.16, OCFS2 officially became part of the mainline Linux kernel tree as one of the many filesystems; it had been accepted a little earlier, in the release candidates.
It was the first cluster filesystem to make it into the kernel tree. Our hope was that it would then get picked up by the distribution vendors to make it easy for everyone to have access to a CFS. Today the source code for OCFS2 is approximately 85,000 lines of code. We made OCFS2 production, with full support for customers that ran Oracle database on Linux and no extra or separate support contract needed. OCFS2 1.0.0 started being built for RHEL4 on x86, x86-64, ppc, s390x and ia64, and builds for RHEL5 started with OCFS2 1.2. SuSE was very interested in high availability and clustering and decided to build and include OCFS2 with SLES9 for their customers; next to Oracle, SuSE was the main contributor to the filesystem for both new features and bug fixes. Source code was always available, even prior to inclusion into mainline, and as of 2.6.16 the source code was simply part of a Linux kernel download from kernel.org, which it still is today. So the latest OCFS2 code is always in the upstream mainline Linux kernel. OCFS2 is the cluster filesystem used in Oracle VM 2 and Oracle VM 3 as the virtual disk repository filesystem. Since the filesystem is in the Linux kernel, it's released under the GPL v2. The release model has always been that new feature development happened in the mainline kernel and we then built consistent, well-tested snapshots that had versions: 1.2, 1.4, 1.6, 1.8. These releases were effectively just snapshots in time that were tested for stability and release quality. OCFS2 is very easy to use: there's a simple text file that contains the node information (hostname, node number, cluster name) and a file that contains the cluster heartbeat timeouts. It is very small, and very efficient. As Sunil Mushran wrote in the manual: OCFS2 is an efficient, easily configured, quickly installed, fully integrated and compatible, feature-rich, architecture and endian neutral, cache coherent, ordered data journaling, POSIX-compliant, shared disk cluster file system.
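To make the "simple text file that contains the node information" more concrete, here is a minimal Python sketch that renders such a file from a list of nodes. The stanza layout (a `cluster:` block followed by one `node:` block per host, with tab-indented `key = value` lines) is an assumption based on the O2CB `cluster.conf` format as commonly documented; the IP address and port fields, the default port 7777, and all host and cluster names are illustrative values not taken from the text above. Check the OCFS2 user's guide before relying on it.

```python
def render_cluster_conf(cluster_name, nodes, ip_port=7777):
    """Render an O2CB-style cluster.conf (layout is an assumption, see lead-in).

    nodes is a list of (hostname, ip_address) pairs; node numbers are
    assigned in list order, starting at 0.
    """
    lines = [
        "cluster:",
        f"\tnode_count = {len(nodes)}",
        f"\tname = {cluster_name}",
        "",
    ]
    for number, (hostname, ip_address) in enumerate(nodes):
        lines += [
            "node:",
            f"\tip_port = {ip_port}",          # assumed default interconnect port
            f"\tip_address = {ip_address}",
            f"\tnumber = {number}",            # node number
            f"\tname = {hostname}",            # hostname
            f"\tcluster = {cluster_name}",     # cluster name
            "",
        ]
    return "\n".join(lines)


# Hypothetical two-node cluster, for illustration only.
conf = render_cluster_conf(
    "ocfs2demo",
    [("node1", "192.168.1.101"), ("node2", "192.168.1.102")],
)
print(conf)
```

Every node in the cluster would carry the same file, which is what keeps node membership simple: there is no separate membership service to configure in this sketch.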
Here is a list of some of the important features that are included:

Variable Block and Cluster sizes: Supports block sizes ranging from 512 bytes to 4 KB and cluster sizes ranging from 4 KB to 1 MB (increments in power of 2).
Extent-based Allocations: Tracks the allocated space in ranges of clusters making it especially efficient for storing very large files.
Optimized Allocations: Supports sparse files, inline-data, unwritten extents, hole punching and allocation reservation for higher performance and efficient storage.
File Cloning/snapshots: REFLINK is a feature which introduces copy-on-write clones of files in a cluster coherent way.
Indexed Directories: Allows efficient access to millions of objects in a directory.
Metadata Checksums: Detects silent corruption in inodes and directories.
Extended Attributes: Supports attaching an unlimited number of name:value pairs to the file system objects like regular files, directories, symbolic links, etc.
Advanced Security: Supports POSIX ACLs and SELinux in addition to the traditional file access permission model.
Quotas: Supports user and group quotas.
Journaling: Supports both ordered and writeback data journaling modes to provide file system consistency in the event of power failure or system crash.
Endian and Architecture neutral: Supports a cluster of nodes with mixed architectures. Allows concurrent mounts on nodes running 32-bit and 64-bit, little-endian (x86, x86_64, ia64) and big-endian (ppc64) architectures.
In-built Cluster-stack with DLM: Includes an easy to configure, in-kernel cluster-stack with a distributed lock manager.
Buffered, Direct, Asynchronous, Splice and Memory Mapped I/Os: Supports all modes of I/Os for maximum flexibility and performance.
Comprehensive Tools Support: Provides a familiar EXT3-style tool-set that uses similar parameters for ease-of-use.
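The size limits in the first feature can be expressed as a small check. The sketch below is illustrative only: the function names are hypothetical, it reads "increments in power of 2" as applying to both block and cluster sizes, and the `clusters_needed` helper simply rounds a file size up to whole clusters to show why the cluster size is the unit of allocation. It is not how the filesystem tools themselves validate input.

```python
def is_power_of_two(n):
    """True for 1, 2, 4, 8, ... (positive integers with a single bit set)."""
    return n > 0 and (n & (n - 1)) == 0


def valid_ocfs2_sizes(block_size, cluster_size):
    """Check sizes (in bytes) against the ranges quoted in the feature list."""
    block_ok = 512 <= block_size <= 4 * 1024 and is_power_of_two(block_size)
    cluster_ok = (4 * 1024 <= cluster_size <= 1024 * 1024
                  and is_power_of_two(cluster_size))
    return block_ok and cluster_ok


def clusters_needed(file_size, cluster_size):
    """Number of whole clusters needed to hold file_size bytes (rounded up)."""
    return -(-file_size // cluster_size)


print(valid_ocfs2_sizes(4096, 1024 * 1024))
print(clusters_needed(10 * 1024 * 1024, 1024 * 1024))
```

A larger cluster size means fewer clusters to track for a very large file, which is the trade-off behind the extent-based allocation described in the list.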
The filesystem was distributed for Linux distributions in separate RPM form, and this had to be built for every single kernel errata release or every updated kernel provided by the vendor. We provided builds from Oracle for Oracle Linux and all kernels released by Oracle, and for Red Hat Enterprise Linux. SuSE provided the modules directly for every kernel they shipped. With the introduction of the Unbreakable Enterprise Kernel for Oracle Linux and our interest in reducing the overhead of building filesystem modules for every minor release, we decided to make OCFS2 available as part of UEK. There was no more need for separate kernel modules: everything was built in, and a kernel upgrade automatically updated the filesystem, as it should. UEK also meant we no longer had to backport new upstream filesystem code into an older kernel version; backporting features into older versions introduces risk and requires extra testing because the code is basically partially rewritten. The UEK model works really well for continuing to provide OCFS2 without that extra overhead. Because the RHEL kernel did not contain OCFS2 as a kernel module (it is in the source tree but it is not built by the vendor in kernel module form), we stopped adding the extra packages to Oracle Linux and its RHEL-compatible kernel, and for RHEL. Oracle Linux customers/users get OCFS2 included as part of the Unbreakable Enterprise Kernel, SuSE customers get it distributed by SuSE with SLES, and Red Hat can decide to distribute OCFS2 to their customers if they choose to, as it's just a matter of compiling the module and making it available. OCFS2 today, in the mainline kernel, is pretty much feature complete in terms of integration with every filesystem feature Linux offers, and it is still actively maintained, with Joel Becker being the primary maintainer.
Since we use OCFS2 as part of Oracle VM, we continue to look at interesting new functionality to add (REFLINK was a good example), and as such we continue to enhance the filesystem where it makes sense. Bug fixes and any other changes to the mainline Linux kernel that affect filesystems automatically also apply to OCFS2, so it's in the kernel and actively maintained, but there is not a lot of new development happening at this time. We continue to fully support OCFS2 as part of Oracle Linux and the Unbreakable Enterprise Kernel, and other vendors make their own decisions on support, as it's really a Linux cluster filesystem now more than something that we provide to customers. It really just is part of Linux, like EXT3 or BTRFS; the OS distribution vendors decide. Do not confuse OCFS2 with ACFS (ASM Cluster Filesystem), also known as Oracle Cloud Filesystem. ACFS is a filesystem that's provided by Oracle on various OS platforms and really integrates into Oracle ASM (Automatic Storage Management). It's a very powerful cluster filesystem, but it's not distributed as part of the operating system; it's distributed with the Oracle Database product and installs with and lives inside Oracle ASM. ACFS is fully supported on Linux (Oracle Linux, Red Hat Enterprise Linux), and OCFS2, independently, as a native Linux filesystem, is and continues to be supported as well. ACFS is very much tied into the Oracle RDBMS; OCFS2 is just a standard native Linux filesystem with no ties into Oracle products. Customers running the Oracle database and ASM really should consider using ACFS, as it also provides storage/clustered volume management. Customers wanting a simple, easy-to-use, generic Linux cluster filesystem should consider using OCFS2. To learn more about OCFS2 in detail, you can find good documentation on http://oss.oracle.com/projects/ocfs2 in the Documentation area, or get the latest mainline kernel from http://kernel.org and read the source.
One final, unrelated note: since I am not always able to publicly answer or respond to comments, I do not want to selectively publish comments from readers. Sometimes I forget to publish comments, sometimes I publish them, and sometimes I would publish them but cannot publicly comment on them, so it becomes a very one-sided stream. So for now I am not going to publish comments from anyone, to be fair to all sides. You are always welcome to email me and I will do my best to respond to technical questions; questions about strategy or direction are sometimes not possible to answer, for obvious reasons.

    Read the article

  • Oracle JDBC connection exception in Solaris but not Windows?

    - by lupefiasco
    I have some Java code that connects to an Oracle database using DriverManager.getConnection(). It works just fine on my Windows XP machine. However, when running the same code on a Solaris machine, I get the following exception. Both machines can reach the database machine on the network. I have included the Oracle trace logs.

    Mar 23, 2010 12:12:33 PM org.apache.commons.configuration.ConfigurationUtils locate
    FINE: ConfigurationUtils.locate(): base is /users/theUser/ADCompare, name is props.txt
    Mar 23, 2010 12:12:33 PM org.apache.commons.configuration.ConfigurationUtils locate
    FINE: Loading configuration from the path /users/theUser/ADCompare/props.txt
    Mar 23, 2010 12:12:33 PM oracle.jdbc.driver.OracleDriver connect
    FINE: OracleDriver.connect(url=jdbc:oracle:thin:@//theServer:1521/theService, info)
    Mar 23, 2010 12:12:33 PM oracle.jdbc.driver.OracleDriver connect
    FINER: OracleDriver.connect() walletLocation:(null)
    Mar 23, 2010 12:12:33 PM oracle.jdbc.driver.OracleDriver parseUrl
    FINER: OracleDriver.parseUrl(url=jdbc:oracle:thin:@//theServer:1521/theService)
    Mar 23, 2010 12:12:33 PM oracle.jdbc.driver.OracleDriver parseUrl
    FINER: sub_sub_index=12, end=46, next_colon_index=16, user=17, slash=18, at_sign=17
    Mar 23, 2010 12:12:33 PM oracle.jdbc.driver.OracleDriver parseUrl
    FINER: OracleDriver.parseUrl(url):return
    Mar 23, 2010 12:12:33 PM oracle.jdbc.driver.OracleDriver connect
    FINER: user=theUser, password=******, database=//theServer:1521/theService, protocol=thin, prefetch=null, batch=null, accumulate batch result =true, remarks=null, synonyms=null
    Mar 23, 2010 12:12:33 PM oracle.jdbc.driver.PhysicalConnection <init>
    FINE: PhysicalConnection.PhysicalConnection(ur="jdbc:oracle:thin:@//theServer:1521/theService", us="theUser", p="******", db="//theServer:1521/theService", info)
    Mar 23, 2010 12:12:33 PM oracle.jdbc.driver.PhysicalConnection <init>
    FINEST: PhysicalConnection.PhysicalConnection() : connectionProperties={user=theUser, password=******, protocol=thin}
    Mar 23, 2010 12:12:33 PM oracle.jdbc.driver.PhysicalConnection initialize
    FINE: PhysicalConnection.initialize(ur="jdbc:oracle:thin:@//theServer:1521/theService", us="theUser", access)
    Mar 23, 2010 12:12:33 PM oracle.jdbc.driver.PhysicalConnection initialize
    FINE: PhysicalConnection.initialize(ur, us):return
    Mar 23, 2010 12:12:33 PM oracle.jdbc.driver.PhysicalConnection needLine
    FINE: PhysicalConnection.needLine()--no return
    java.lang.ArrayIndexOutOfBoundsException: 31
        at oracle.net.nl.NVTokens.parseTokens(Unknown Source)
        at oracle.net.nl.NVFactory.createNVPair(Unknown Source)
        at oracle.net.nl.NLParamParser.addNLPListElement(Unknown Source)
        at oracle.net.nl.NLParamParser.initializeNlpa(Unknown Source)
        at oracle.net.nl.NLParamParser.<init>(Unknown Source)
        at oracle.net.resolver.TNSNamesNamingAdapter.loadFile(Unknown Source)
        at oracle.net.resolver.TNSNamesNamingAdapter.checkAndReload(Unknown Source)
        at oracle.net.resolver.TNSNamesNamingAdapter.resolve(Unknown Source)
        at oracle.net.resolver.NameResolver.resolveName(Unknown Source)
        at oracle.net.resolver.AddrResolution.resolveAndExecute(Unknown Source)
        at oracle.net.ns.NSProtocol.establishConnection(Unknown Source)
        at oracle.net.ns.NSProtocol.connect(Unknown Source)
        at oracle.jdbc.driver.T4CConnection.connect(T4CConnection.java:1037)
        at oracle.jdbc.driver.T4CConnection.logon(T4CConnection.java:282)
        at oracle.jdbc.driver.PhysicalConnection.<init>(PhysicalConnection.java:468)
        at oracle.jdbc.driver.T4CConnection.<init>(T4CConnection.java:165)
        at oracle.jdbc.driver.T4CDriverExtension.getConnection(T4CDriverExtension.java:35)
        at oracle.jdbc.driver.OracleDriver.connect(OracleDriver.java:839)
        at java.sql.DriverManager.getConnection(DriverManager.java:582)
        at java.sql.DriverManager.getConnection(DriverManager.java:185)

    The above exception is also thrown if I use OracleDataSource instead of the generic DriverManager.getConnection(). Any ideas on why the behavior is different in the different environments?

    Read the article

  • RabbitMQ - How do I configure servers for zero-downtime upgrades?

    - by Terence Johnson
    Having read through the docs and RabbitMQ in Action, creating a RabbitMQ cluster seems straightforward enough, but upgrading or patching an existing RabbitMQ cluster seems to require the whole cluster to be restarted. Is there a way to combine clustering, shovel, federation, and load balancing to make a rolling upgrade possible without losing queues or messages, or have I missed something slightly more obvious?

    Read the article

  • Troubleshooting failover cluster problem in W2K8 / SQL05

    - by paulland
    I have an active/passive W2K8 (64) cluster pair, running SQL05 Standard. Shared storage is on an HP EVA SAN (FC). I recently expanded the filesystem on the active node for a database, adding a drive designation. The shared storage drives are designated as F:, I:, J:, L: and X:, with SQL filesystems on the first 4 and X: used for a backup destination. Last night, as part of a validation process (the passive node had been offline for maintenance), I moved the SQL instance to the other cluster node. The database in question immediately moved to Suspect status. Review of the system logs showed that the database would not load because the file "K:\SQLDATA\whatever.ndf" could not be found. (Note that we do not have a K: drive designation.) A review of the J: storage drive showed zero contents -- nothing -- and this is where "whatever.ndf" should have been. Hmm, I thought. Problem with the server. I'll just move SQL back to the other server and figure out what's wrong. Still no database. Suspect. Uh-oh. "Whatever.ndf" had gone into the bit bucket. I finally decided to just restore from the backup (which had been taken immediately before the validation test), so nothing was lost but a few hours of sleep. The questions: (1) Why did the passive node think the whatever.ndf files were supposed to go to drive "K:", when this drive didn't exist as a resource on the active node? (2) How can I get the cluster nodes "re-synced" so that failover can be accomplished? I don't know that there wasn't a "K:" drive as a cluster resource at some time in the past, but I do know that this drive did not exist on the original cluster at the time of the resource move.

    Read the article

  • Sharing storage on Linux and Solaris

    - by devlearn
    I'm looking for a solution in order to share a SAN-mounted volume between several hosts running Linux (RHEL) and/or Solaris (SPARC). Note that I basically need to share a set of directories containing large binary files that are accessed in random R/W mode. I have the following requirements:

    keep the data on the SAN
    suitable I/O performance, as the software is pretty demanding on IOPS
    stick to a shared file system, as I can't afford a cluster FS (lack of MDS/OSS infrastructure)
    compression could be really useful

    For now I've found only the following candidates:

    GFS2: supports Linux only, no compression
    VxFS: supports Linux and Solaris, compression supported

    So if you have some suggestions for this list, I'd really welcome them. Thanks in advance,

    Read the article

  • Do I need to enable DRS to use Dynacache in Websphere Application Server Cluster

    - by rabs
    We are running a WebSphere Commerce application with several WebSphere Application Servers configured in a cluster. We are using dynacache, so each server in the cluster will have its own cached objects in its own JVM. We are using CACHEIVL with database triggers for all cache invalidations. I was reading http://www.ibm.com/developerworks/websphere/library/techarticles/0603_crick/0603_crick.html and found an interesting sentence: "Furthermore, cache replication is necessary to ensure that invalidation messages are shared between the servers in a cluster." After thinking about this, it would make sense that for the invalidation to work it would need to be triggered on all the servers in the cluster, but I couldn't find confirmation of this in the mountains of IBM documentation. Does anyone know if you can use trigger-based cache invalidation (through CACHEIVL) when you have several application servers clustered, each with their own cache, without DRS turned on? Or do I need to use DRS for this to work?

    Read the article
