Search Results

Search found 1833 results on 74 pages for 'floppy disks'.

Page 73/74 | < Previous Page | 69 70 71 72 73 74  | Next Page >

  • 10 Windows Tweaking Myths Debunked

    - by Chris Hoffman
    Windows is big, complicated, and misunderstood. You’ll still stumble across bad advice from time to time when browsing the web. These Windows tweaking, performance, and system maintenance tips are mostly just useless, but some are actively harmful. Luckily, most of these myths have been stomped out on mainstream sites and forums. However, if you start searching the web, you’ll still find websites that recommend you do these things. Erase Cache Files Regularly to Speed Things Up You can free up disk space by running an application like CCleaner, another temporary-file-cleaning utility, or even the Windows Disk Cleanup tool. In some cases, you may even see an old computer speed up when you erase a large amount of useless files. However, running CCleaner or similar utilities every day to erase your browser’s cache won’t actually speed things up. It will slow down your web browsing as your web browser is forced to redownload the files all over again, and reconstruct the cache you regularly delete. If you’ve installed CCleaner or a similar program and run it every day with the default settings, you’re actually slowing down your web browsing. Consider at least preventing the program from wiping out your web browser cache. Enable ReadyBoost to Speed Up Modern PCs Windows still prompts you to enable ReadyBoost when you insert a USB stick or memory card. On modern computers, this is completely pointless — ReadyBoost won’t actually speed up your computer if you have at least 1 GB of RAM. If you have a very old computer with a tiny amount of RAM — think 512 MB — ReadyBoost may help a bit. Otherwise, don’t bother. Open the Disk Defragmenter and Manually Defragment On Windows 98, users had to manually open the defragmentation tool and run it, ensuring no other applications were using the hard drive while it did its work. Modern versions of Windows are capable of defragmenting your file system while other programs are using it, and they automatically defragment your disks for you. If you’re still opening the Disk Defragmenter every week and clicking the Defragment button, you don’t need to do this — Windows is doing it for you unless you’ve told it not to run on a schedule. Modern computers with solid-state drives don’t have to be defragmented at all. Disable Your Pagefile to Increase Performance When Windows runs out of empty space in RAM, it swaps out data from memory to a pagefile on your hard disk. If a computer doesn’t have much memory and it’s running slow, it’s probably moving data to the pagefile or reading data from it. Some Windows geeks seem to think that the pagefile is bad for system performance and disable it completely. The argument seems to be that Windows can’t be trusted to manage a pagefile and won’t use it intelligently, so the pagefile needs to be removed. As long as you have enough RAM, it’s true that you can get by without a pagefile. However, if you do have enough RAM, Windows will only use the pagefile rarely anyway. Tests have found that disabling the pagefile offers no performance benefit. Enable CPU Cores in MSConfig Some websites claim that Windows may not be using all of your CPU cores or that you can speed up your boot time by increasing the amount of cores used during boot. They direct you to the MSConfig application, where you can indeed select an option that appears to increase the amount of cores used. In reality, Windows always uses the maximum amount of processor cores your CPU has. (Technically, only one core is used at the beginning of the boot process, but the additional cores are quickly activated.) Leave this option unchecked. It’s just a debugging option that allows you to set a maximum number of cores, so it would be useful if you wanted to force Windows to only use a single core on a multi-core system — but all it can do is restrict the amount of cores used. Clean Your Prefetch To Increase Startup Speed Windows watches the programs you run and creates .pf files in its Prefetch folder for them. The Prefetch feature works as a sort of cache — when you open an application, Windows checks the Prefetch folder, looks at the application’s .pf file (if it exists), and uses that as a guide to start preloading data that the application will use. This helps your applications start faster. Some Windows geeks have misunderstood this feature. They believe that Windows loads these files at boot, so your boot time will slow down due to Windows preloading the data specified in the .pf files. They also argue you’ll build up useless files as you uninstall programs and .pf files will be left over. In reality, Windows only loads the data in these .pf files when you launch the associated application and only stores .pf files for the 128 most recently launched programs. If you were to regularly clean out the Prefetch folder, not only would programs take longer to open because they won’t be preloaded, Windows will have to waste time recreating all the .pf files. You could also modify the PrefetchParameters setting to disable Prefetch, but there’s no reason to do that. Let Windows manage Prefetch on its own. Disable QoS To Increase Network Bandwidth Quality of Service (QoS) is a feature that allows your computer to prioritize its traffic. For example, a time-critical application like Skype could choose to use QoS and prioritize its traffic over a file-downloading program so your voice conversation would work smoothly, even while you were downloading files. Some people incorrectly believe that QoS always reserves a certain amount of bandwidth and this bandwidth is unused until you disable it. This is untrue. In reality, 100% of bandwidth is normally available to all applications unless a program chooses to use QoS. Even if a program does choose to use QoS, the reserved space will be available to other programs unless the program is actively using it. No bandwidth is ever set aside and left empty. Set DisablePagingExecutive to Make Windows Faster The DisablePagingExecutive registry setting is set to 0 by default, which allows drivers and system code to be paged to the disk. When set to 1, drivers and system code will be forced to stay resident in memory. Once again, some people believe that Windows isn’t smart enough to manage the pagefile on its own and believe that changing this option will force Windows to keep important files in memory rather than stupidly paging them out. If you have more than enough memory, changing this won’t really do anything. If you have little memory, changing this setting may force Windows to push programs you’re using to the page file rather than push unused system files there — this would slow things down. This is an option that may be helpful for debugging in some situations, not a setting to change for more performance. Process Idle Tasks to Free Memory Windows does things, such as creating scheduled system restore points, when you step away from your computer. It waits until your computer is “idle” so it won’t slow your computer and waste your time while you’re using it. Running the “Rundll32.exe advapi32.dll,ProcessIdleTasks” command forces Windows to perform all of these tasks while you’re using the computer. This is completely pointless and won’t help free memory or anything like that — all you’re doing is forcing Windows to slow your computer down while you’re using it. This command only exists so benchmarking programs can force idle tasks to run before performing benchmarks, ensuring idle tasks don’t start running and interfere with the benchmark. Delay or Disable Windows Services There’s no real reason to disable Windows services anymore. There was a time when Windows was particularly heavy and computers had little memory — think Windows Vista and those “Vista Capable” PCs Microsoft was sued over. Modern versions of Windows like Windows 7 and 8 are lighter than Windows Vista and computers have more than enough memory, so you won’t see any improvements from disabling system services included with Windows. Some people argue for not disabling services, however — they recommend setting services from “Automatic” to “Automatic (Delayed Start)”. By default, the Delayed Start option just starts services two minutes after the last “Automatic” service starts. Setting services to Delayed Start won’t really speed up your boot time, as the services will still need to start — in fact, it may lengthen the time it takes to get a usable desktop as services will still be loading two minutes after booting. Most services can load in parallel, and loading the services as early as possible will result in a better experience. The “Delayed Start” feature is primarily useful for system administrators who need to ensure a specific service starts later than another service. If you ever find a guide that recommends you set a little-known registry setting to improve performance, take a closer look — the change is probably useless. Want to actually speed up your PC? Try disabling useless startup programs that run on boot, increasing your boot time and consuming memory in the background. This is a much better tip than doing any of the above, especially considering most Windows PCs come packed to the brim with bloatware.     

    Read the article

  • quick look at: dm_db_index_physical_stats

    - by fatherjack
    A quick look at the key data from this dmv that can help a DBA keep databases performing well and systems online as the users need them. When the dynamic management views relating to index statistics became available in SQL Server 2005 there was much hype about how they can help a DBA keep their servers running in better health than ever before. This particular view gives an insight into the physical health of the indexes present in a database. Whether they are use or unused, complete or missing some columns is irrelevant, this is simply the physical stats of all indexes; disabled indexes are ignored however. In it’s simplest form this dmv can be executed as:   The results from executing this contain a record for every index in every database but some of the columns will be NULL. The first parameter is there so that you can specify which database you want to gather index details on, rather than scan every database. Simply specifying DB_ID() in place of the first NULL achieves this. In order to avoid the NULLS, or more accurately, in order to choose when to have the NULLS you need to specify a value for the last parameter. It takes one of 4 values – DEFAULT, ‘SAMPLED’, ‘LIMITED’ or ‘DETAILED’. If you execute the dmv with each of these values you can see some interesting details in the times taken to complete each step. DECLARE @Start DATETIME DECLARE @First DATETIME DECLARE @Second DATETIME DECLARE @Third DATETIME DECLARE @Finish DATETIME SET @Start = GETDATE() SELECT * FROM [sys].[dm_db_index_physical_stats](DB_ID(), NULL, NULL, NULL, DEFAULT) AS ddips SET @First = GETDATE() SELECT * FROM [sys].[dm_db_index_physical_stats](DB_ID(), NULL, NULL, NULL, 'SAMPLED') AS ddips SET @Second = GETDATE() SELECT * FROM [sys].[dm_db_index_physical_stats](DB_ID(), NULL, NULL, NULL, 'LIMITED') AS ddips SET @Third = GETDATE() SELECT * FROM [sys].[dm_db_index_physical_stats](DB_ID(), NULL, NULL, NULL, 'DETAILED') AS ddips SET @Finish = GETDATE() SELECT DATEDIFF(ms, @Start, @First) AS [DEFAULT] , DATEDIFF(ms, @First, @Second) AS [SAMPLED] , DATEDIFF(ms, @Second, @Third) AS [LIMITED] , DATEDIFF(ms, @Third, @Finish) AS [DETAILED] Running this code will give you 4 result sets; DEFAULT will have 12 columns full of data and then NULLS in the remainder. SAMPLED will have 21 columns full of data. LIMITED will have 12 columns of data and the NULLS in the remainder. DETAILED will have 21 columns full of data. So, from this we can deduce that the DEFAULT value (the same one that is also applied when you query the view using a NULL parameter) is the same as using LIMITED. Viewing the final result set has some details that are worth noting: Running queries against this view takes significantly longer when using the SAMPLED and DETAILED values in the last parameter. The duration of the query is directly related to the size of the database you are working in so be careful running this on big databases unless you have tried it on a test server first. Let’s look at the data we get back with the DEFAULT value first of all and then progress to the extra information later. We know that the first parameter that we supply has to be a database id and for the purposes of this blog we will be providing that value with the DB_ID function. We could just as easily put a fixed value in there or a function such as DB_ID (‘AnyDatabaseName’). The first columns we get back are database_id and object_id. These are pretty explanatory and we can wrap those in some code to make things a little easier to read: SELECT DB_NAME([ddips].[database_id]) AS [DatabaseName] , OBJECT_NAME([ddips].[object_id]) AS [TableName] … FROM [sys].[dm_db_index_physical_stats](DB_ID(), NULL, NULL, NULL, NULL) AS ddips  gives us   SELECT DB_NAME([ddips].[database_id]) AS [DatabaseName] , OBJECT_NAME([ddips].[object_id]) AS [TableName], [i].[name] AS [IndexName] , ….. FROM [sys].[dm_db_index_physical_stats](DB_ID(), NULL, NULL, NULL, NULL) AS ddips INNER JOIN [sys].[indexes] AS i ON [ddips].[index_id] = [i].[index_id] AND [ddips].[object_id] = [i].[object_id]     These handily tie in with the next parameters in the query on the dmv. If you specify an object_id and an index_id in these then you get results limited to either the table or the specific index. Once again we can place a  function in here to make it easier to work with a specific table. eg. SELECT * FROM [sys].[dm_db_index_physical_stats] (DB_ID(), OBJECT_ID(‘AdventureWorks2008.Person.Address’) , 1, NULL, NULL) AS ddips   Note: Despite me showing that functions can be placed directly in the parameters for this dmv, best practice recommends that functions are not used directly in the function as it is possible that they will fail to return a valid object ID. To be certain of not passing invalid values to this function, and therefore setting an automated process off on the wrong path, declare variables for the OBJECT_IDs and once they have been validated, use them in the function: DECLARE @db_id SMALLINT; DECLARE @object_id INT; SET @db_id = DB_ID(N’AdventureWorks_2008′); SET @object_id = OBJECT_ID(N’AdventureWorks_2008.Person.Address’); IF @db_id IS NULL BEGINPRINT N’Invalid database’; ENDELSE IF @object_id IS NULL BEGINPRINT N’Invalid object’; ENDELSE BEGINSELECT * FROM sys.dm_db_index_physical_stats (@db_id, @object_id, NULL, NULL , ‘LIMITED’); END; GO In cases where the results of querying this dmv don’t have any effect on other processes (i.e. simply viewing the results in the SSMS results area)  then it will be noticed when the results are not consistent with the expected results and in the case of this blog this is the method I have used. So, now we can relate the values in these columns to something that we recognise in the database lets see what those other values in the dmv are all about. The next columns are: We’ll skip partition_number, index_type_desc, alloc_unit_type_desc, index_depth and index_level  as this is a quick look at the dmv and they are pretty self explanatory. The final columns revealed by querying this view in the DEFAULT mode are avg_fragmentation_in_percent. This is the amount that the index is logically fragmented. It will show NULL when the dmv is queried in SAMPLED mode. fragment_count. The number of pieces that the index is broken into. It will show NULL when the dmv is queried in SAMPLED mode. avg_fragment_size_in_pages. The average size, in pages, of a single fragment in the leaf level of the IN_ROW_DATA allocation unit. It will show NULL when the dmv is queried in SAMPLED mode. page_count. Total number of index or data pages in use. OK, so what does this give us? Well, there is an obvious correlation between fragment_count, page_count and avg_fragment_size-in_pages. We see that an index that takes up 27 pages and is in 3 fragments has an average fragment size of 9 pages (27/3=9). This means that for this index there are 3 separate places on the hard disk that SQL Server needs to locate and access to gather the data when it is requested by a DML query. If this index was bigger than 72KB then having it’s data in 3 pieces might not be too big an issue as each piece would have a significant piece of data to read and the speed of access would not be too poor. If the number of fragments increases then obviously the amount of data in each piece decreases and that means the amount of work for the disks to do in order to retrieve the data to satisfy the query increases and this would start to decrease performance. This information can be useful to keep in mind when considering the value in the avg_fragmentation_in_percent column. This is arrived at by an internal algorithm that gives a value to the logical fragmentation of the index taking into account the multiple files, type of allocation unit and the previously mentioned characteristics if index size (page_count) and fragment_count. Seeing an index with a high avg_fragmentation_in_percent value will be a call to action for a DBA that is investigating performance issues. It is possible that tables will have indexes that suffer from rapid increases in fragmentation as part of normal daily business and that regular defragmentation work will be needed to keep it in good order. In other cases indexes will rarely become fragmented and therefore not need rebuilding from one end of the year to another. Keeping this in mind DBAs need to use an ‘intelligent’ process that assesses key characteristics of an index and decides on the best, if any, defragmentation method to apply should be used. There is a simple example of this in the sample code found in the Books OnLine content for this dmv, in example D. There are also a couple of very popular solutions created by SQL Server MVPs Michelle Ufford and Ola Hallengren which I would wholly recommend that you review for much further detail on how to care for your SQL Server indexes. Right, let’s get back on track then. Querying the dmv with the fifth parameter value as ‘DETAILED’ takes longer because it goes through the index and refreshes all data from every level of the index. As this blog is only a quick look a we are going to skate right past ghost_record_count and version_ghost_record_count and discuss avg_page_space_used_in_percent, record_count, min_record_size_in_bytes, max_record_size_in_bytes and avg_record_size_in_bytes. We can see from the details below that there is a correlation between the columns marked. Column 1 (Page_Count) is the number of 8KB pages used by the index, column 2 is how full each page is (how much of the 8KB has actual data written on it), column 3 is how many records are recorded in the index and column 4 is the average size of each record. This approximates to: ((Col1*8) * 1024*(Col2/100))/Col3 = Col4*. avg_page_space_used_in_percent is an important column to review as this indicates how much of the disk that has been given over to the storage of the index actually has data on it. This value is affected by the value given for the FILL_FACTOR parameter when creating an index. avg_record_size_in_bytes is important as you can use it to get an idea of how many records are in each page and therefore in each fragment, thus reinforcing how important it is to keep fragmentation under control. min_record_size_in_bytes and max_record_size_in_bytes are exactly as their names set them out to be. A detail of the smallest and largest records in the index. Purely offered as a guide to the DBA to better understand the storage practices taking place. So, keeping an eye on avg_fragmentation_in_percent will ensure that your indexes are helping data access processes take place as efficiently as possible. Where fragmentation recurs frequently then potentially the DBA should consider; the fill_factor of the index in order to leave space at the leaf level so that new records can be inserted without causing fragmentation so rapidly. the columns used in the index should be analysed to avoid new records needing to be inserted in the middle of the index but rather always be added to the end. * – it’s approximate as there are many factors associated with things like the type of data and other database settings that affect this slightly.  Another great resource for working with SQL Server DMVs is Performance Tuning with SQL Server Dynamic Management Views by Louis Davidson and Tim Ford – a free ebook or paperback from Simple Talk. Disclaimer – Jonathan is a Friend of Red Gate and as such, whenever they are discussed, will have a generally positive disposition towards Red Gate tools. Other tools are often available and you should always try others before you come back and buy the Red Gate ones. All code in this blog is provided “as is” and no guarantee, warranty or accuracy is applicable or inferred, run the code on a test server and be sure to understand it before you run it on a server that means a lot to you or your manager.

    Read the article

  • Clusterware 11gR2 &ndash; Setting up an Active/Passive failover configuration

    - by Gilles Haro
    Oracle is providing a large range of interesting solutions to ensure High Availability of the database. Dataguard, RAC or even both configurations (as recommended by Oracle for a Maximum Available Architecture - MAA) are the most frequently found and used solutions. But, when it comes to protecting a system with an Active/Passive architecture with failover capabilities, people often thinks to other expensive third party cluster systems. Oracle Clusterware technology, which comes along at no extra-cost with Oracle Database or Oracle Unbreakable Linux, is - in the knowing of most people - often linked to Oracle RAC and therefore, is seldom used to implement failover solutions. Oracle Clusterware 11gR2  (a part of Oracle 11gR2 Grid Infrastructure)  provides a comprehensive framework to setup automatic failover configurations. It is actually possible to make "failover-able'", and then to protect, almost any kind of application (from the simple xclock to the most complex Application Server). Quoting Oracle: “Oracle Clusterware is a portable cluster software that allows clustering of single servers so that they cooperate as a single system. Oracle Clusterware also provides the required infrastructure for Oracle Real Application Clusters (RAC). In addition Oracle Clusterware enables the protection of any Oracle application or any other kind of application within a cluster.” In the next couple of lines, I will try to present the different steps to achieve this goal : Have a fully operational 11gR2 database protected by automatic failover capabilities. I assume you are fluent in installing Oracle Database 11gR2, Oracle Grid Infrastructure 11gR2 on a Linux system and that ASM is not a problem for you (as I am using it as a shared storage). If not, please have a look at Oracle Documentation. As often, I made my tests using an Oracle VirtualBox environment. The scripts are tested and functional on my system. Unfortunately, there can always be a typo or a mistake. This blog entry does not replace a course around the Clusterware Framework. I just hope it will let you see how powerful it is and that it will give you the whilst to go further with it...  Note : This entry has been revised (rev.2) following comments from Philip Newlan. Prerequisite 2 Linux boxes (OELCluster01 and OELCluster02) at the same OS level. I used OEL 5 Update 5 with an Enterprise Kernel. Shared Storage (SAN). On my VirtualBox system, I used Openfiler to simulate the SAN Oracle 11gR2 Database (11.2.0.1) Oracle 11gR2 Grid Infrastructure (11.2.0.1)   Step 1 - Install the software Using asmlib, create 3 ASM disks (ASM_CRS, ASM_DTA and ASM_FRA) Install Grid Infrastructure for a cluster (OELCluster01 and OELCluster02 are the 2 nodes of the cluster) Use ASM_CRS to store Voting Disk and OCR. Use SCAN. Install Oracle Database Standalone binaries on both nodes. Use asmca to check/mount the disk groups on 2 nodes Use dbca to create and configure a database on the primary node Let's name it DB11G. Copy the pfile, password file to the second node. Create adump directoty on the second node.   Step 2 - Setup the resource to be protected After its creation with dbca, the database is automatically protected by the Oracle Restart technology available with Grid Infrastructure. Consequently, it restarts automatically (if possible) after a crash (ex: kill -9 smon). A database resource has been created for that in the Cluster Registry. We can observe this with the command : crsctl status resource that shows and ora.dba11g.db entry. Let's save the definition of this resource, for future use : mkdir -p /crs/11.2.0/HA_scripts chown oracle:oinstall /crs/11.2.0/HA_scripts crsctl status resource ora.db11g.db -p > /crs/11.2.0/HA_scripts/myResource.txt Although very interesting, Oracle Restart is not cluster aware and cannot restart the database on any other node of the cluster. So, let's remove it from the OCR definitions, we don't need it ! srvctl stop database -d DB11G srvctl remove database -d DB11G Instead of it, we need to create a new resource of a more general type : cluster_resource. Here are the steps to achieve this : Create an action script :  /crs/11.2.0/HA_scripts/my_ActivePassive_Cluster.sh #!/bin/bash export ORACLE_HOME=/oracle/product/11.2.0/dbhome_1 export ORACLE_SID=DB11G case $1 in 'start')   $ORACLE_HOME/bin/sqlplus /nolog <<EOF   connect / as sysdba   startup EOF   RET=0   ;; 'stop')   $ORACLE_HOME/bin/sqlplus /nolog <<EOF   connect / as sysdba   shutdown immediate EOF   RET=0   ;; 'clean')   $ORACLE_HOME/bin/sqlplus /nolog <<EOF   connect / as sysdba   shutdown abort    ##for i in `ps -ef | grep -i $ORACLE_SID | awk '{print $2}' ` ;do kill -9 $i; done EOF   RET=0   ;; 'check')    ok=`ps -ef | grep smon | grep $ORACLE_SID | wc -l`    if [ $ok = 0 ]; then      RET=1    else      RET=0    fi    ;; '*')      RET=0   ;; esac if [ $RET -eq 0 ]; then    exit 0 else    exit 1 fi   This script must provide, at least, methods to start, stop, clean and check the database. It is self-explaining and contains nothing special. Just be aware that it must be runnable (+x), it runs as Oracle user (because of the ACL property - see later) and needs to know about the environment. Also make sure it exists on every node of the cluster. Moreover, as of 11.2, the clean method is mandatory. It must provide the “last gasp clean up”, for example, a shutdown abort or a kill –9 of all the remaining processes. chmod +x /crs/11.2.0/HA_scripts/my_ActivePassive_Cluster.sh scp  /crs/11.2.0/HA_scripts/my_ActivePassive_Cluster.sh   oracle@OELCluster02:/crs/11.2.0/HA_scripts Create a new resource file, based on the information we got from previous  myResource.txt . Name it myNewResource.txt. myResource.txt  is shown below. As we can see, it defines an ora.database.type resource, named ora.db11g.db. A lot of properties are related to this type of resource and do not need to be used for a cluster_resource. NAME=ora.db11g.db TYPE=ora.database.type ACL=owner:oracle:rwx,pgrp:oinstall:rwx,other::r-- ACTION_FAILURE_TEMPLATE= ACTION_SCRIPT= ACTIVE_PLACEMENT=1 AGENT_FILENAME=%CRS_HOME%/bin/oraagent%CRS_EXE_SUFFIX% AUTO_START=restore CARDINALITY=1 CHECK_INTERVAL=1 CHECK_TIMEOUT=600 CLUSTER_DATABASE=false DB_UNIQUE_NAME=DB11G DEFAULT_TEMPLATE=PROPERTY(RESOURCE_CLASS=database) PROPERTY(DB_UNIQUE_NAME= CONCAT(PARSE(%NAME%, ., 2), %USR_ORA_DOMAIN%, .)) ELEMENT(INSTANCE_NAME= %GEN_USR_ORA_INST_NAME%) DEGREE=1 DESCRIPTION=Oracle Database resource ENABLED=1 FAILOVER_DELAY=0 FAILURE_INTERVAL=60 FAILURE_THRESHOLD=1 GEN_AUDIT_FILE_DEST=/oracle/admin/DB11G/adump GEN_USR_ORA_INST_NAME= GEN_USR_ORA_INST_NAME@SERVERNAME(oelcluster01)=DB11G HOSTING_MEMBERS= INSTANCE_FAILOVER=0 LOAD=1 LOGGING_LEVEL=1 MANAGEMENT_POLICY=AUTOMATIC NLS_LANG= NOT_RESTARTING_TEMPLATE= OFFLINE_CHECK_INTERVAL=0 ORACLE_HOME=/oracle/product/11.2.0/dbhome_1 PLACEMENT=restricted PROFILE_CHANGE_TEMPLATE= RESTART_ATTEMPTS=2 ROLE=PRIMARY SCRIPT_TIMEOUT=60 SERVER_POOLS=ora.DB11G SPFILE=+DTA/DB11G/spfileDB11G.ora START_DEPENDENCIES=hard(ora.DTA.dg,ora.FRA.dg) weak(type:ora.listener.type,uniform:ora.ons,uniform:ora.eons) pullup(ora.DTA.dg,ora.FRA.dg) START_TIMEOUT=600 STATE_CHANGE_TEMPLATE= STOP_DEPENDENCIES=hard(intermediate:ora.asm,shutdown:ora.DTA.dg,shutdown:ora.FRA.dg) STOP_TIMEOUT=600 UPTIME_THRESHOLD=1h USR_ORA_DB_NAME=DB11G USR_ORA_DOMAIN=haroland USR_ORA_ENV= USR_ORA_FLAGS= USR_ORA_INST_NAME=DB11G USR_ORA_OPEN_MODE=open USR_ORA_OPI=false USR_ORA_STOP_MODE=immediate VERSION=11.2.0.1.0 I removed database type related entries from myResource.txt and modified some other to produce the following myNewResource.txt. Notice the NAME property that should not have the ora. prefix Notice the TYPE property that is not ora.database.type but cluster_resource. Notice the definition of ACTION_SCRIPT. Notice the HOSTING_MEMBERS that enumerates the members of the cluster (as returned by the olsnodes command). NAME=DB11G.db TYPE=cluster_resource DESCRIPTION=Oracle Database resource ACL=owner:oracle:rwx,pgrp:oinstall:rwx,other::r-- ACTION_SCRIPT=/crs/11.2.0/HA_scripts/my_ActivePassive_Cluster.sh PLACEMENT=restricted ACTIVE_PLACEMENT=0 AUTO_START=restore CARDINALITY=1 CHECK_INTERVAL=10 DEGREE=1 ENABLED=1 HOSTING_MEMBERS=oelcluster01 oelcluster02 LOGGING_LEVEL=1 RESTART_ATTEMPTS=1 START_DEPENDENCIES=hard(ora.DTA.dg,ora.FRA.dg) weak(type:ora.listener.type,uniform:ora.ons,uniform:ora.eons) pullup(ora.DTA.dg,ora.FRA.dg) START_TIMEOUT=600 STOP_DEPENDENCIES=hard(intermediate:ora.asm,shutdown:ora.DTA.dg,shutdown:ora.FRA.dg) STOP_TIMEOUT=600 UPTIME_THRESHOLD=1h Register the resource. Take care of the resource type. It needs to be a cluster_resource and not a ora.database.type resource (Oracle recommendation) .   crsctl add resource DB11G.db  -type cluster_resource -file /crs/11.2.0/HA_scripts/myNewResource.txt Step 3 - Start the resource crsctl start resource DB11G.db This command launches the ACTION_SCRIPT with a start and a check parameter on the primary node of the cluster. Step 4 - Test this We will test the setup using 2 methods. crsctl relocate resource DB11G.db This command calls the ACTION_SCRIPT  (on the two nodes)  to stop the database on the active node and start it on the other node. Once done, we can revert back to the original node, but, this time we can use a more "MS$ like" method :Turn off the server on which the database is running. After short delay, you should observe that the database is relocated on node 1. Conclusion Once the software installed and the standalone database created (which is a rather common and usual task), the steps to reach the objective are quite easy : Create an executable action script on every node of the cluster. Create a resource file. Create/Register the resource with OCR using the resource file. Start the resource. This solution is a very interesting alternative to licensable third party solutions. References Clusterware 11gR2 documentation Oracle Clusterware Resource Reference Clusterware for Unbreakable Linux Using Oracle Clusterware to Protect A Single Instance Oracle Database 11gR1 (to have an idea of complexity) Oracle Clusterware on OTN   Gilles Haro Technical Expert - Core Technology, Oracle Consulting   

    Read the article

  • New Enhancements for InnoDB Memcached

    - by Calvin Sun
    In MySQL 5.6, we continued our development on InnoDB Memcached and completed a few widely desirable features that make InnoDB Memcached a competitive feature in more scenario. Notablely, they are 1) Support multiple table mapping 2) Added background thread to auto-commit long running transactions 3) Enhancement in binlog performance  Let’s go over each of these features one by one. And in the last section, we will go over a couple of internally performed performance tests. Support multiple table mapping In our earlier release, all InnoDB Memcached operations are mapped to a single InnoDB table. In the real life, user might want to use this InnoDB Memcached features on different tables. Thus being able to support access to different table at run time, and having different mapping for different connections becomes a very desirable feature. And in this GA release, we allow user just be able to do both. We will discuss the key concepts and key steps in using this feature. 1) "mapping name" in the "get" and "set" command In order to allow InnoDB Memcached map to a new table, the user (DBA) would still require to "pre-register" table(s) in InnoDB Memcached “containers” table (there is security consideration for this requirement). If you would like to know about “containers” table, please refer to my earlier blogs in blogs.innodb.com. Once registered, the InnoDB Memcached will then be able to look for such table when they are referred. Each of such registered table will have a unique "registration name" (or mapping_name) corresponding to the “name” field in the “containers” table.. To access these tables, user will include such "registration name" in their get or set commands, in the form of "get @@new_mapping_name.key", prefix "@@" is required for signaling a mapped table change. The key and the "mapping name" are separated by a configurable delimiter, by default, it is ".". So the syntax is: get [@@mapping_name.]key_name set [@@mapping_name.]key_name  or  get @@mapping_name set @@mapping_name Here is an example: Let's set up three tables in the "containers" table: The first is a map to InnoDB table "test/demo_test" table with mapping name "setup_1" INSERT INTO containers VALUES ("setup_1", "test", "demo_test", "c1", "c2", "c3", "c4", "c5", "PRIMARY");  Similarly, we set up table mappings for table "test/new_demo" with name "setup_2" and that to table "mydatabase/my_demo" with name "setup_3": INSERT INTO containers VALUES ("setup_2", "test", "new_demo", "c1", "c2", "c3", "c4", "c5", "secondary_index_x"); INSERT INTO containers VALUES ("setup_3", "my_database", "my_demo", "c1", "c2", "c3", "c4", "c5", "idx"); To switch to table "my_database/my_demo", and get the value corresponding to “key_a”, user will do: get @@setup_3.key_a (this will also output the value that corresponding to key "key_a" or simply get @@setup_3 Once this is done, this connection will switch to "my_database/my_demo" table until another table mapping switch is requested. so it can continue issue regular command like: get key_b  set key_c 0 0 7 These DMLs will all be directed to "my_database/my_demo" table. And this also implies that different connections can have different bindings (to different table). 2) Delimiter: For the delimiter "." that separates the "mapping name" and key value, we also added a configure option in the "config_options" system table with name of "table_map_delimiter": INSERT INTO config_options VALUES("table_map_delimiter", "."); So if user wants to change to a different delimiter, they can change it in the config_option table. 3) Default mapping: Once we have multiple table mapping, there should be always a "default" map setting. For this, we decided if there exists a mapping name of "default", then this will be chosen as default mapping. Otherwise, the first row of the containers table will chosen as default setting. Please note, user tables can be repeated in the "containers" table (for example, user wants to access different columns of the table in different settings), as long as they are using different mapping/configure names in the first column, which is enforced by a unique index. 4) bind command In addition, we also extend the protocol and added a bind command, its usage is fairly straightforward. To switch to "setup_3" mapping above, you simply issue: bind setup_3 This will switch this connection's InnoDB table to "my_database/my_demo" In summary, with this feature, you now can direct access to difference tables with difference session. And even a single connection, you can query into difference tables. Background thread to auto-commit long running transactions This is a feature related to the “batch” concept we discussed in earlier blogs. This “batch” feature allows us batch the read and write operations, and commit them only after certain calls. The “batch” size is controlled by the configure parameter “daemon_memcached_w_batch_size” and “daemon_memcached_r_batch_size”. This could significantly boost performance. However, it also comes with some disadvantages, for example, you will not be able to view “uncommitted” operations from SQL end unless you set transaction isolation level to read_uncommitted, and in addition, this will held certain row locks for extend period of time that might reduce the concurrency. To deal with this, we introduce a background thread that “auto-commits” the transaction if they are idle for certain amount of time (default is 5 seconds). The background thread will wake up every second and loop through every “connections” opened by Memcached, and check for idle transactions. And if such transaction is idle longer than certain limit and not being used, it will commit such transactions. This limit is configurable by change “innodb_api_bk_commit_interval”. Its default value is 5 seconds, and minimum is 1 second, and maximum is 1073741824 seconds. With the help of such background thread, you will not need to worry about long running uncommitted transactions when set daemon_memcached_w_batch_size and daemon_memcached_r_batch_size to a large number. This also reduces the number of locks that could be held due to long running transactions, and thus further increase the concurrency. Enhancement in binlog performance As you might all know, binlog operation is not done by InnoDB storage engine, rather it is handled in the MySQL layer. In order to support binlog operation through InnoDB Memcached, we would have to artificially create some MySQL constructs in order to access binlog handler APIs. In previous lab release, for simplicity consideration, we open and destroy these MySQL constructs (such as THD) for each operations. This required us to set the “batch” size always to 1 when binlog is on, no matter what “daemon_memcached_w_batch_size” and “daemon_memcached_r_batch_size” are configured to. This put a big restriction on our capability to scale, and also there are quite a bit overhead in creating destroying such constructs that bogs the performance down. With this release, we made necessary change that would keep MySQL constructs as long as they are valid for a particular connection. So there will not be repeated and redundant open and close (table) calls. And now even with binlog option is enabled (with innodb_api_enable_binlog,), we still can batch the transactions with daemon_memcached_w_batch_size and daemon_memcached_r_batch_size, thus scale the write/read performance. Although there are still overheads that makes InnoDB Memcached cannot perform as fast as when binlog is turned off. It is much better off comparing to previous release. And we are continuing optimize the solution is this area to improve the performance as much as possible. Performance Study: Amerandra of our System QA team have conducted some performance studies on queries through our InnoDB Memcached connection and plain SQL end. And it shows some interesting results. The test is conducted on a “Linux 2.6.32-300.7.1.el6uek.x86_64 ix86 (64)” machine with 16 GB Memory, Intel Xeon 2.0 GHz CPU X86_64 2 CPUs- 4 Core Each, 2 RAID DISKS (1027 GB,733.9GB). Results are described in following tables: Table 1: Performance comparison on Set operations Connections 5.6.7-RC-Memcached-plugin ( TPS / Qps) with memcached-threads=8*** 5.6.7-RC* X faster Set (QPS) Set** 8 30,000 5,600 5.36 32 59,000 13,000 4.54 128 68,000 8,000 8.50 512 63,000 6.800 9.23 * mysql-5.6.7-rc-linux2.6-x86_64 ** The “set” operation when implemented in InnoDB Memcached involves a couple of DMLs: it first query the table to see whether the “key” exists, if it does not, the new key/value pair will be inserted. If it does exist, the “value” field of matching row (by key) will be updated. So when used in above query, it is a precompiled store procedure, and query will just execute such procedures. *** added “–daemon_memcached_option=-t8” (default is 4 threads) So we can see with this “set” query, InnoDB Memcached can run 4.5 to 9 time faster than MySQL server. Table 2: Performance comparison on Get operations Connections 5.6.7-RC-Memcached-plugin ( TPS / Qps) with memcached-threads=8 5.6.7-RC* X faster Get (QPS) Get 8 42,000 27,000 1.56 32 101,000 55.000 1.83 128 117,000 52,000 2.25 512 109,000 52,000 2.10 With the “get” query (or the select query), memcached performs 1.5 to 2 times faster than normal SQL. Summary: In summary, we added several much-desired features to InnoDB Memcached in this release, allowing user to operate on different tables with this Memcached interface. We also now provide a background commit thread to commit long running idle transactions, thus allow user to configure large batch write/read without worrying about large number of rows held or not being able to see (uncommit) data. We also greatly enhanced the performance when Binlog is enabled. We will continue making efforts in both performance enhancement and functionality areas to make InnoDB Memcached a good demo case for our InnoDB APIs. Jimmy Yang, September 29, 2012

    Read the article

  • Clusterware 11gR2 &ndash; Setting up an Active/Passive failover configuration

    - by Gilles Haro
    Oracle provides many interesting ways to ensure High Availability. Dataguard configurations, RAC configurations or even both (as recommended for a Maximum Available Architecture - MAA) are the most frequently found. But when it comes to protecting a system with an Active/Passive architecture with failover capabilities, one often thinks to expensive third party cluster systems. Oracle Clusterware technology, which comes free with Oracle Database, is – in the knowing of most people - often linked to Oracle RAC and therefore, is rarely used to implement failover solutions. 11gR2 Clusterware – which is part of Oracle Grid Infrastructure - provides a comprehensive framework to setup automatic failover configurations. It is actually possible to make “failover-able'” and, therefore to protect, almost every kind of application (from xclock to the more complex Application Server) In the next couple of lines, I will try to present the different steps to achieve this goal : Have a fully operational 11gR2 database protected by automatic failover capabilities. I assume you are fluent in installing Oracle Database 11gR2, Oracle Grid Infrastructure 11gR2 on a Linux system and that ASM is not a problem for you (as I am using it as a shared storage). If not, please have a look at Oracle Documentation. As often, I made my tests using an Oracle VirtualBox environment. The scripts are tested and functional. Unfortunately, there can always be a typo or a mistake. This blog entry is not a course around the Clusterware Framework. I just hope it will let you see how powerful it is and that it will give you the whilst to go further with it…   Prerequisite 2 Linux boxes (OELCluster01 and OELCluster02) at the same OS level. I used OEL 5 Update 5 with Enterprise Kernel. Shared Storage (SAN). On my VirtualBox system, I used Openfiler to simulate the SAN Oracle 11gR2 Database (11.2.0.1) Oracle 11gR2 Grid Infrastructure (11.2.0.1)   Step 1 – Install the software Using asmlib, create 3 ASM disks (ASM_CRS, ASM_DTA and ASM_FRA) Install Grid Infrastructure for a cluster (OELCluster01 and OELCluster02 are the 2 nodes of the cluster) Use ASM_CRS to store Voting Disk and OCR. Use SCAN. Install Oracle Database Standalone binaries on both nodes. Use asmca to check/mount the disk groups on 2 nodes Use dbca to create and configure a database on the primary node Let’s name it DB11G. Copy the pfile, password file to the second node. Create adump directoty on the second node.   Step 2 - Setup the resource to be protected After its creation with dbca, the database is automatically protected by the Oracle Restart technology available with Grid Infrastructure. Consequently, it restarts automatically (if possible) after a crash (ex: kill –9 smon). A database resource has been created for that in the Cluster Registry. We can observe this with the command : crsctl status resource that shows and ora.dba11g.db entry. Let’s save the definition of this resource, for future use : mkdir –p /crs/11.2.0/HA_scripts chown oracle:oinstall /crs/11.2.0/HA_scripts crsctl status resource ora.db11g.db -p > /crs/11.2.0/HA_scripts/myResource.txt Although very interesting, Oracle Restart is not cluster aware and cannot restart the database on any other node of the cluster. So, let’s remove it from the OCR definitions, we don’t need it ! srvctl stop database -d DB11G srvctl remove database -d DB11G Instead of it, we need to create a new resource of a more general type : cluster_resource. Here are the steps to achieve this : Create an action script :  /crs/11.2.0/HA_scripts/my_ActivePassive_Cluster.sh #!/bin/bash export ORACLE_HOME=/oracle/product/11.2.0/dbhome_1 export ORACLE_SID=DB11G case $1 in 'start')   $ORACLE_HOME/bin/sqlplus /nolog <<EOF   connect / as sysdba   startup EOF   RET=0   ;; 'stop')   $ORACLE_HOME/bin/sqlplus /nolog <<EOF   connect / as sysdba   shutdown immediate EOF   RET=0   ;; 'check')    ok=`ps -ef | grep smon | grep $ORACLE_SID | wc -l`    if [ $ok = 0 ]; then      RET=1    else      RET=0    fi    ;; '*')      RET=0   ;; esac if [ $RET -eq 0 ]; then    exit 0 else    exit 1 fi   This script must provide, at least, methods to start, stop and check the database. It is self-explaining and contains nothing special. Just be aware that it is run as Oracle user (because of the ACL property – see later) and needs to know about the environment. It also needs to be present on every node of the cluster. chmod +x /crs/11.2.0/HA_scripts/my_ActivePassive_Cluster.sh scp  /crs/11.2.0/HA_scripts/my_ActivePassive_Cluster.sh   oracle@OELCluster02:/crs/11.2.0/HA_scripts Create a new resource file, based on the information we got from previous  myResource.txt . Name it myNewResource.txt. myResource.txt  is shown below. As we can see, it defines an ora.database.type resource, named ora.db11g.db. A lot of properties are related to this type of resource and do not need to be used for a cluster_resource. NAME=ora.db11g.db TYPE=ora.database.type ACL=owner:oracle:rwx,pgrp:oinstall:rwx,other::r-- ACTION_FAILURE_TEMPLATE= ACTION_SCRIPT= ACTIVE_PLACEMENT=1 AGENT_FILENAME=%CRS_HOME%/bin/oraagent%CRS_EXE_SUFFIX% AUTO_START=restore CARDINALITY=1 CHECK_INTERVAL=1 CHECK_TIMEOUT=600 CLUSTER_DATABASE=false DB_UNIQUE_NAME=DB11G DEFAULT_TEMPLATE=PROPERTY(RESOURCE_CLASS=database) PROPERTY(DB_UNIQUE_NAME= CONCAT(PARSE(%NAME%, ., 2), %USR_ORA_DOMAIN%, .)) ELEMENT(INSTANCE_NAME= %GEN_USR_ORA_INST_NAME%) DEGREE=1 DESCRIPTION=Oracle Database resource ENABLED=1 FAILOVER_DELAY=0 FAILURE_INTERVAL=60 FAILURE_THRESHOLD=1 GEN_AUDIT_FILE_DEST=/oracle/admin/DB11G/adump GEN_USR_ORA_INST_NAME= GEN_USR_ORA_INST_NAME@SERVERNAME(oelcluster01)=DB11G HOSTING_MEMBERS= INSTANCE_FAILOVER=0 LOAD=1 LOGGING_LEVEL=1 MANAGEMENT_POLICY=AUTOMATIC NLS_LANG= NOT_RESTARTING_TEMPLATE= OFFLINE_CHECK_INTERVAL=0 ORACLE_HOME=/oracle/product/11.2.0/dbhome_1 PLACEMENT=restricted PROFILE_CHANGE_TEMPLATE= RESTART_ATTEMPTS=2 ROLE=PRIMARY SCRIPT_TIMEOUT=60 SERVER_POOLS=ora.DB11G SPFILE=+DTA/DB11G/spfileDB11G.ora START_DEPENDENCIES=hard(ora.DTA.dg,ora.FRA.dg) weak(type:ora.listener.type,uniform:ora.ons,uniform:ora.eons) pullup(ora.DTA.dg,ora.FRA.dg) START_TIMEOUT=600 STATE_CHANGE_TEMPLATE= STOP_DEPENDENCIES=hard(intermediate:ora.asm,shutdown:ora.DTA.dg,shutdown:ora.FRA.dg) STOP_TIMEOUT=600 UPTIME_THRESHOLD=1h USR_ORA_DB_NAME=DB11G USR_ORA_DOMAIN=haroland USR_ORA_ENV= USR_ORA_FLAGS= USR_ORA_INST_NAME=DB11G USR_ORA_OPEN_MODE=open USR_ORA_OPI=false USR_ORA_STOP_MODE=immediate VERSION=11.2.0.1.0 I removed database type related entries from myResource.txt and modified some other to produce the following myNewResource.txt. Notice the NAME property that should not have the ora. prefix Notice the TYPE property that is not ora.database.type but cluster_resource. Notice the definition of ACTION_SCRIPT. Notice the HOSTING_MEMBERS that enumerates the members of the cluster (as returned by the olsnodes command). NAME=DB11G.db TYPE=cluster_resource DESCRIPTION=Oracle Database resource ACL=owner:oracle:rwx,pgrp:oinstall:rwx,other::r-- ACTION_SCRIPT=/crs/11.2.0/HA_scripts/my_ActivePassive_Cluster.sh PLACEMENT=restricted ACTIVE_PLACEMENT=0 AUTO_START=restore CARDINALITY=1 CHECK_INTERVAL=10 DEGREE=1 ENABLED=1 HOSTING_MEMBERS=oelcluster01 oelcluster02 LOGGING_LEVEL=1 RESTART_ATTEMPTS=1 START_DEPENDENCIES=hard(ora.DTA.dg,ora.FRA.dg) weak(type:ora.listener.type,uniform:ora.ons,uniform:ora.eons) pullup(ora.DTA.dg,ora.FRA.dg) START_TIMEOUT=600 STOP_DEPENDENCIES=hard(intermediate:ora.asm,shutdown:ora.DTA.dg,shutdown:ora.FRA.dg) STOP_TIMEOUT=600 UPTIME_THRESHOLD=1h Register the resource. Take care of the resource type. It needs to be a cluster_resource and not a ora.database.type resource (Oracle recommendation) .   crsctl add resource DB11G.db  -type cluster_resource -file /crs/11.2.0/HA_scripts/myNewResource.txt Step 3 - Start the resource crsctl start resource DB11G.db This command launches the ACTION_SCRIPT with a start and a check parameter on the primary node of the cluster. Step 4 - Test this We will test the setup using 2 methods. crsctl relocate resource DB11G.db This command calls the ACTION_SCRIPT  (on the two nodes)  to stop the database on the active node and start it on the other node. Once done, we can revert back to the original node, but, this time we can use a more “MS$ like” method :Turn off the server on which the database is running. After short delay, you should observe that the database is relocated on node 1. Conclusion Once the software installed and the standalone database created (which is a rather common and usual task), the steps to reach the objective are quite easy : Create an executable action script on every node of the cluster. Create a resource file. Create/Register the resource with OCR using the resource file. Start the resource. This solution is a very interesting alternative to licensable third party solutions.   References Clusterware 11gR2 documentation Oracle Clusterware Resource Reference   Gilles Haro Technical Expert - Core Technology, Oracle Consulting   

    Read the article

  • ASMLib

    - by wcoekaer
    Oracle ASMlib on Linux has been a topic of discussion a number of times since it was released way back when in 2004. There is a lot of confusion around it and certainly a lot of misinformation out there for no good reason. Let me try to give a bit of history around Oracle ASMLib. Oracle ASMLib was introduced at the time Oracle released Oracle Database 10g R1. 10gR1 introduced a very cool important new features called Oracle ASM (Automatic Storage Management). A very simplistic description would be that this is a very sophisticated volume manager for Oracle data. Give your devices directly to the ASM instance and we manage the storage for you, clustered, highly available, redundant, performance, etc, etc... We recommend using Oracle ASM for all database deployments, single instance or clustered (RAC). The ASM instance manages the storage and every Oracle server process opens and operates on the storage devices like it would open and operate on regular datafiles or raw devices. So by default since 10gR1 up to today, we do not interact differently with ASM managed block devices than we did before with a datafile being mapped to a raw device. All of this is without ASMLib, so ignore that one for now. Standard Oracle on any platform that we support (Linux, Windows, Solaris, AIX, ...) does it the exact same way. You start an ASM instance, it handles storage management, all the database instances use and open that storage and read/write from/to it. There are no extra pieces of software needed, including on Linux. ASM is fully functional and selfcontained without any other components. In order for the admin to provide a raw device to ASM or to the database, it has to have persistent device naming. If you booted up a server where a raw disk was named /dev/sdf and you give it to ASM (or even just creating a tablespace without asm on that device with datafile '/dev/sdf') and next time you boot up and that device is now /dev/sdg, you end up with an error. Just like you can't just change datafile names, you can't change device filenames without telling the database, or ASM. persistent device naming on Linux, especially back in those days ways to say it bluntly, a nightmare. In fact there were a number of issues (dating back to 2004) : Linux async IO wasn't pretty persistent device naming including permissions (had to be owned by oracle and the dba group) was very, very difficult to manage system resource usage in terms of open file descriptors So given the above, we tried to find a way to make this easier on the admins, in many ways, similar to why we started working on OCFS a few years earlier - how can we make life easier for the admins on Linux. A feature of Oracle ASM is the ability for third parties to write an extension using what's called ASMLib. It is possible for any third party OS or storage vendor to write a library using a specific Oracle defined interface that gets used by the ASM instance and by the database instance when available. This interface offered 2 components : Define an IO interface - allow any IO to the devices to go through ASMLib Define device discovery - implement an external way of discovering, labeling devices to provide to ASM and the Oracle database instance This is similar to a library that a number of companies have implemented over many years called libODM (Oracle Disk Manager). ODM was specified many years before we introduced ASM and allowed third party vendors to implement their own IO routines so that the database would use this library if installed and make use of the library open/read/write/close,.. routines instead of the standard OS interfaces. PolyServe back in the day used this to optimize their storage solution, Veritas used (and I believe still uses) this for their filesystem. It basically allowed, in particular, filesystem vendors to write libraries that could optimize access to their storage or filesystem.. so ASMLib was not something new, it was basically based on the same model. You have libodm for just database access, you have libasm for asm/database access. Since this library interface existed, we decided to do a reference implementation on Linux. We wrote an ASMLib for Linux that could be used on any Linux platform and other vendors could see how this worked and potentially implement their own solution. As I mentioned earlier, ASMLib and ODMLib are libraries for third party extensions. ASMLib for Linux, since it was a reference implementation implemented both interfaces, the storage discovery part and the IO part. There are 2 components : Oracle ASMLib - the userspace library with config tools (a shared object and some scripts) oracleasm.ko - a kernel module that implements the asm device for /dev/oracleasm/* The userspace library is a binary-only module since it links with and contains Oracle header files but is generic, we only have one asm library for the various Linux platforms. This library is opened by Oracle ASM and by Oracle database processes and this library interacts with the OS through the asm device (/dev/asm). It can install on Oracle Linux, on SuSE SLES, on Red Hat RHEL,.. The library itself doesn't actually care much about the OS version, the kernel module and device cares. The support tools are simple scripts that allow the admin to label devices and scan for disks and devices. This way you can say create an ASM disk label foo on, currently /dev/sdf... So if /dev/sdf disappears and next time is /dev/sdg, we just scan for the label foo and we discover it as /dev/sdg and life goes on without any worry. Also, when the database needs access to the device, we don't have to worry about file permissions or anything it will be taken care of. So it's a convenience thing. The kernel module oracleasm.ko is a Linux kernel module/device driver. It implements a device /dev/oracleasm/* and any and all IO goes through ASMLib - /dev/oracleasm. This kernel module is obviously a very specific Oracle related device driver but it was released under the GPL v2 so anyone could easily build it for their Linux distribution kernels. Advantages for using ASMLib : A good async IO interface for the database, the entire IO interface is based on an optimal ASYNC model for performance A single file descriptor per Oracle process, not one per device or datafile per process reducing # of open filehandles overhead Device scanning and labeling built-in so you do not have to worry about messing with udev or devlabel, permissions or the likes which can be very complex and error prone. Just like with OCFS and OCFS2, each kernel version (major or minor) has to get a new version of the device drivers. We started out building the oracleasm kernel module rpms for many distributions, SLES (in fact in the early days still even for this thing called United Linux) and RHEL. The driver didn't make sense to get pushed into upstream Linux because it's unique and specific to the Oracle database. As it takes a huge effort in terms of build infrastructure and QA and release management to build kernel modules for every architecture, every linux distribution and every major and minor version we worked with the vendors to get them to add this tiny kernel module to their infrastructure. (60k source code file). The folks at SuSE understood this was good for them and their customers and us and added it to SLES. So every build coming from SuSE for SLES contains the oracleasm.ko module. We weren't as successful with other vendors so for quite some time we continued to build it for RHEL and of course as we introduced Oracle Linux end of 2006 also for Oracle Linux. With Oracle Linux it became easy for us because we just added the code to our build system and as we churned out Oracle Linux kernels whether it was for a public release or for customers that needed a one off fix where they also used asmlib, we didn't have to do any extra work it was just all nicely integrated. With the introduction of Oracle Linux's Unbreakable Enterprise Kernel and our interest in being able to exploit ASMLib more, we started working on a very exciting project called Data Integrity. Oracle (Martin Petersen in particular) worked for many years with the T10 standards committee and storage vendors and implemented Linux kernel support for DIF/DIX, data protection in the Linux kernel, note to those that wonder, yes it's all in mainline Linux and under the GPL. This basically gave us all the features in the Linux kernel to checksum a data block, send it to the storage adapter, which can then validate that block and checksum in firmware before it sends it over the wire to the storage array, which can then do another checksum and to the actual DISK which does a final validation before writing the block to the physical media. So what was missing was the ability for a userspace application (read: Oracle RDBMS) to write a block which then has a checksum and validation all the way down to the disk. application to disk. Because we have ASMLib we had an entry into the Linux kernel and Martin added support in ASMLib (kernel driver + userspace) for this functionality. Now, this is all based on relatively current Linux kernels, the oracleasm kernel module depends on the main kernel to have support for it so we can make use of it. Thanks to UEK and us having the ability to ship a more modern, current version of the Linux kernel we were able to introduce this feature into ASMLib for Linux from Oracle. This combined with the fact that we build the asm kernel module when we build every single UEK kernel allowed us to continue improving ASMLib and provide it to our customers. So today, we (Oracle) provide Oracle ASMLib for Oracle Linux and in particular on the Unbreakable Enterprise Kernel. We did the build/testing/delivery of ASMLib for RHEL until RHEL5 but since RHEL6 decided that it was too much effort for us to also maintain all the build and test environments for RHEL and we did not have the ability to use the latest kernel features to introduce the Data Integrity features and we didn't want to end up with multiple versions of asmlib as maintained by us. SuSE SLES still builds and comes with the oracleasm module and they do all the work and RHAT it certainly welcome to do the same. They don't have to rebuild the userspace library, it's really about the kernel module. And finally to re-iterate a few important things : Oracle ASM does not in any way require ASMLib to function completely. ASMlib is a small set of extensions, in particular to make device management easier but there are no extra features exposed through Oracle ASM with ASMLib enabled or disabled. Often customers confuse ASMLib with ASM. again, ASM exists on every Oracle supported OS and on every supported Linux OS, SLES, RHEL, OL withoutASMLib Oracle ASMLib userspace is available for OTN and the kernel module is shipped along with OL/UEK for every build and by SuSE for SLES for every of their builds ASMLib kernel module was built by us for RHEL4 and RHEL5 but we do not build it for RHEL6, nor for the OL6 RHCK kernel. Only for UEK ASMLib for Linux is/was a reference implementation for any third party vendor to be able to offer, if they want to, their own version for their own OS or storage ASMLib as provided by Oracle for Linux continues to be enhanced and evolve and for the kernel module we use UEK as the base OS kernel hope this helps.

    Read the article

  • Meet the New Windows Azure

    - by ScottGu
    Today we are releasing a major set of improvements to Windows Azure.  Below is a short-summary of just a few of them: New Admin Portal and Command Line Tools Today’s release comes with a new Windows Azure portal that will enable you to manage all features and services offered on Windows Azure in a seamless, integrated way.  It is very fast and fluid, supports filtering and sorting (making it much easier to use for large deployments), works on all browsers, and offers a lot of great new features – including built-in VM, Web site, Storage, and Cloud Service monitoring support. The new portal is built on top of a REST-based management API within Windows Azure – and everything you can do through the portal can also be programmed directly against this Web API. We are also today releasing command-line tools (which like the portal call the REST Management APIs) to make it even easier to script and automate your administration tasks.  We are offering both a Powershell (for Windows) and Bash (for Mac and Linux) set of tools to download.  Like our SDKs, the code for these tools is hosted on GitHub under an Apache 2 license. Virtual Machines Windows Azure now supports the ability to deploy and run durable VMs in the cloud.  You can easily create these VMs using a new Image Gallery built-into the new Windows Azure Portal, or alternatively upload and run your own custom-built VHD images. Virtual Machines are durable (meaning anything you install within them persists across reboots) and you can use any OS with them.  Our built-in image gallery includes both Windows Server images (including the new Windows Server 2012 RC) as well as Linux images (including Ubuntu, CentOS, and SUSE distributions).  Once you create a VM instance you can easily Terminal Server or SSH into it in order to configure and customize the VM however you want (and optionally capture your own image snapshot of it to use when creating new VM instances).  This provides you with the flexibility to run pretty much any workload within Windows Azure.   The new Windows Azure Portal provides a rich set of management features for Virtual Machines – including the ability to monitor and track resource utilization within them.  Our new Virtual Machine support also enables the ability to easily attach multiple data-disks to VMs (which you can then mount and format as drives).  You can optionally enable geo-replication support on these – which will cause Windows Azure to continuously replicate your storage to a secondary data-center at least 400 miles away from your primary data-center as a backup. We use the same VHD format that is supported with Windows virtualization today (and which we’ve released as an open spec), which enables you to easily migrate existing workloads you might already have virtualized into Windows Azure.  We also make it easy to download VHDs from Windows Azure, which also provides the flexibility to easily migrate cloud-based VM workloads to an on-premise environment.  All you need to do is download the VHD file and boot it up locally, no import/export steps required. Web Sites Windows Azure now supports the ability to quickly and easily deploy ASP.NET, Node.js and PHP web-sites to a highly scalable cloud environment that allows you to start small (and for free) and then scale up as your traffic grows.  You can create a new web site in Azure and have it ready to deploy to in under 10 seconds: The new Windows Azure Portal provides built-in administration support for Web sites – including the ability to monitor and track resource utilization in real-time: You can deploy to web-sites in seconds using FTP, Git, TFS and Web Deploy.  We are also releasing tooling updates today for both Visual Studio and Web Matrix that enable developers to seamlessly deploy ASP.NET applications to this new offering.  The VS and Web Matrix publishing support includes the ability to deploy SQL databases as part of web site deployment – as well as the ability to incrementally update database schema with a later deployment. You can integrate web application publishing with source control by selecting the “Set up TFS publishing” or “Set up Git publishing” links on a web-site’s dashboard: Doing do will enable integration with our new TFS online service (which enables a full TFS workflow – including elastic build and testing support), or create a Git repository that you can reference as a remote and push deployments to.  Once you push a deployment using TFS or Git, the deployments tab will keep track of the deployments you make, and enable you to select an older (or newer) deployment and quickly redeploy your site to that snapshot of the code.  This provides a very powerful DevOps workflow experience.   Windows Azure now allows you to deploy up to 10 web-sites into a free, shared/multi-tenant hosting environment (where a site you deploy will be one of multiple sites running on a shared set of server resources).  This provides an easy way to get started on projects at no cost. You can then optionally upgrade your sites to run in a “reserved mode” that isolates them so that you are the only customer within a virtual machine: And you can elastically scale the amount of resources your sites use – allowing you to increase your reserved instance capacity as your traffic scales: Windows Azure automatically handles load balancing traffic across VM instances, and you get the same, super fast, deployment options (FTP, Git, TFS and Web Deploy) regardless of how many reserved instances you use. With Windows Azure you pay for compute capacity on a per-hour basis – which allows you to scale up and down your resources to match only what you need. Cloud Services and Distributed Caching Windows Azure also supports the ability to build cloud services that support rich multi-tier architectures, automated application management, and scale to extremely large deployments.  Previously we referred to this capability as “hosted services” – with this week’s release we are now referring to this capability as “cloud services”.  We are also enabling a bunch of new features with them. Distributed Cache One of the really cool new features being enabled with cloud services is a new distributed cache capability that enables you to use and setup a low-latency, in-memory distributed cache within your applications.  This cache is isolated for use just by your applications, and does not have any throttling limits. This cache can dynamically grow and shrink elastically (without you have to redeploy your app or make code changes), and supports the full richness of the AppFabric Cache Server API (including regions, high availability, notifications, local cache and more).  In addition to supporting the AppFabric Cache Server API, it also now supports the Memcached protocol – allowing you to point code written against Memcached at it (no code changes required). The new distributed cache can be setup to run in one of two ways: 1) Using a co-located approach.  In this option you allocate a percentage of memory in your existing web and worker roles to be used by the cache, and then the cache joins the memory into one large distributed cache.  Any data put into the cache by one role instance can be accessed by other role instances in your application – regardless of whether the cached data is stored on it or another role.  The big benefit with the “co-located” option is that it is free (you don’t have to pay anything to enable it) and it allows you to use what might have been otherwise unused memory within your application VMs. 2) Alternatively, you can add “cache worker roles” to your cloud service that are used solely for caching.  These will also be joined into one large distributed cache ring that other roles within your application can access.  You can use these roles to cache 10s or 100s of GBs of data in-memory very effectively – and the cache can be elastically increased or decreased at runtime within your application: New SDKs and Tooling Support We have updated all of the Windows Azure SDKs with today’s release to include new features and capabilities.  Our SDKs are now available for multiple languages, and all of the source in them is published under an Apache 2 license and and maintained in GitHub repositories. The .NET SDK for Azure has in particular seen a bunch of great improvements with today’s release, and now includes tooling support for both VS 2010 and the VS 2012 RC. We are also now shipping Windows, Mac and Linux SDK downloads for languages that are offered on all of these systems – allowing developers to develop Windows Azure applications using any development operating system. Much, Much More The above is just a short list of some of the improvements that are shipping in either preview or final form today – there is a LOT more in today’s release.  These include new Virtual Private Networking capabilities, new Service Bus runtime and tooling support, the public preview of the new Azure Media Services, new Data Centers, significantly upgraded network and storage hardware, SQL Reporting Services, new Identity features, support within 40+ new countries and territories, and much, much more. You can learn more about Windows Azure and sign-up to try it for free at http://windowsazure.com.  You can also watch a live keynote I’m giving at 1pm June 7th (later today) where I’ll walk through all of the new features.  We will be opening up the new features I discussed above for public usage a few hours after the keynote concludes.  We are really excited to see the great applications you build with them. Hope this helps, Scott

    Read the article

  • How to reduce virtual memory by optimising my PHP code?

    - by iCeR
    My current code (see below) uses 147MB of virtual memory! My provider has allocated 100MB by default and the process is killed once run, causing an internal error. The code is utilising curl multi and must be able to loop with more than 150 iterations whilst still minimizing the virtual memory. The code below is only set at 150 iterations and still causes the internal server error. At 90 iterations the issue does not occur. How can I adjust my code to lower the resource use / virtual memory? Thanks! <?php function udate($format, $utimestamp = null) { if ($utimestamp === null) $utimestamp = microtime(true); $timestamp = floor($utimestamp); $milliseconds = round(($utimestamp - $timestamp) * 1000); return date(preg_replace('`(?<!\\\\)u`', $milliseconds, $format), $timestamp); } $url = 'https://www.testdomain.com/'; $curl_arr = array(); $master = curl_multi_init(); for($i=0; $i<150; $i++) { $curl_arr[$i] = curl_init(); curl_setopt($curl_arr[$i], CURLOPT_URL, $url); curl_setopt($curl_arr[$i], CURLOPT_RETURNTRANSFER, 1); curl_setopt($curl_arr[$i], CURLOPT_SSL_VERIFYHOST, FALSE); curl_setopt($curl_arr[$i], CURLOPT_SSL_VERIFYPEER, FALSE); curl_multi_add_handle($master, $curl_arr[$i]); } do { curl_multi_exec($master,$running); } while($running > 0); for($i=0; $i<150; $i++) { $results = curl_multi_getcontent ($curl_arr[$i]); $results = explode("<br>", $results); echo $results[0]; echo "<br>"; echo $results[1]; echo "<br>"; echo udate('H:i:s:u'); echo "<br><br>"; usleep(100000); } ?> Processor Information Total processors: 8 Processor #1 Vendor GenuineIntel Name Intel(R) Xeon(R) CPU E5405 @ 2.00GHz Speed 1995.120 MHz Cache 6144 KB Processor #2 Vendor GenuineIntel Name Intel(R) Xeon(R) CPU E5405 @ 2.00GHz Speed 1995.120 MHz Cache 6144 KB Processor #3 Vendor GenuineIntel Name Intel(R) Xeon(R) CPU E5405 @ 2.00GHz Speed 1995.120 MHz Cache 6144 KB Processor #4 Vendor GenuineIntel Name Intel(R) Xeon(R) CPU E5405 @ 2.00GHz Speed 1995.120 MHz Cache 6144 KB Processor #5 Vendor GenuineIntel Name Intel(R) Xeon(R) CPU E5405 @ 2.00GHz Speed 1995.120 MHz Cache 6144 KB Processor #6 Vendor GenuineIntel Name Intel(R) Xeon(R) CPU E5405 @ 2.00GHz Speed 1995.120 MHz Cache 6144 KB Processor #7 Vendor GenuineIntel Name Intel(R) Xeon(R) CPU E5405 @ 2.00GHz Speed 1995.120 MHz Cache 6144 KB Processor #8 Vendor GenuineIntel Name Intel(R) Xeon(R) CPU E5405 @ 2.00GHz Speed 1995.120 MHz Cache 6144 KB Memory Information Memory for crash kernel (0x0 to 0x0) notwithin permissible range Memory: 8302344k/9175040k available (2176k kernel code, 80272k reserved, 901k data, 228k init, 7466304k highmem) System Information Linux server3.server.com 2.6.18-194.17.1.el5PAE #1 SMP Wed Sep 29 13:31:51 EDT 2010 i686 i686 i386 GNU/Linux Physical Disks SCSI device sda: 1952448512 512-byte hdwr sectors (999654 MB) sda: Write Protect is off sda: Mode Sense: 03 00 00 08 SCSI device sda: drive cache: write back SCSI device sda: 1952448512 512-byte hdwr sectors (999654 MB) sda: Write Protect is off sda: Mode Sense: 03 00 00 08 SCSI device sda: drive cache: write back sd 0:1:0:0: Attached scsi disk sda sd 4:0:0:0: Attached scsi removable disk sdb sd 0:1:0:0: Attached scsi generic sg4 type 0 sd 4:0:0:0: Attached scsi generic sg7 type 0 Current Memory Usage total used free shared buffers cached Mem: 8306672 7847384 459288 0 487912 6444548 -/+ buffers/cache: 914924 7391748 Swap: 4095992 496 4095496 Total: 12402664 7847880 4554784 Current Disk Usage Filesystem Size Used Avail Use% Mounted on /dev/mapper/VolGroup00-LogVol00 898G 307G 546G 36% / /dev/sda1 99M 19M 76M 20% /boot none 4.0G 0 4.0G 0% /dev/shm /var/tmpMnt 4.0G 1.8G 2.0G 48% /tmp

    Read the article

  • Partitioned Repository for WebCenter Content using Oracle Database 11g

    - by Adao Junior
    One of the biggest challenges for content management solutions is related to the storage management due the high volumes of the unstoppable growing of information. Even if you have storage appliances and a lot of terabytes, thinks like backup, compression, deduplication, storage relocation, encryption, availability could be a nightmare. One standard option that you have with the Oracle WebCenter Content is to store data to the database. And the Oracle Database allows you leverage features like compression, deduplication, encryption and seamless backup. But with a huge volume, the challenge is passed to the DBA to keep the WebCenter Content Database up and running. One solution is the use of DB partitions for your content repository, but what are the implications of this? Can I fit this with my business requirements? Well, yes. It’s up to you how you will manage that, you just need a good plan. During you “storage brainstorm plan” take in your mind what you need, such as storage petabytes of documents? You need everything on-line? There’s a way to logically separate the “good content” from the “legacy content”? The first thing that comes to my mind is to use the creation date of the document, but you need to remember that this document could receive a lot of revisions and maybe you can consider the revision creation date. Your plan can have also complex rules like per Document Type or per a custom metadata like department or an hybrid per date, per DocType and an specific virtual folder. Extrapolation the use, you can have your repository distributed in different servers, different disks, different disk types (Such as ssds, sas, sata, tape,…), separated accordingly your business requirements, separating the “hot” content from the legacy and easily matching your compliance requirements. If you think to use by revision, the simple way is to consider the dId, that is the sequential unique id for every content created using the WebCenter Content or the dLastModified that is the date field of the FileStorage table that contains the date of inclusion of the content to the DB Table using SecureFiles. Using the scenario of partitioned repository using an hierarchical separation by date, we will transform the FileStorage table in an partitioned table using  “Partition by Range” of the dLastModified column (You can use the dId or a join with other tables for other metadata such as dDocType, Security, etc…). The test scenario bellow covers: Previous existent data on the JDBC Storage to be migrated to the new partitioned JDBC Storage Partition by Date Automatically generation of new partitions based on a pre-defined interval (Available only with Oracle Database 11g+) Deduplication and Compression for legacy data Oracle WebCenter Content 11g PS5 (Could present some customizations that do not affect the test scenario) For the test case you need some data stored using JDBC Storage to be the “legacy” data. If you do not have done before, just create an Storage rule pointed to the JDBC Storage: Enable the metadata StorageRule in the UI and upload some documents using this rule. For this test case you can run using the schema owner or an dba user. We will use the schema owner TESTS_OCS. I can’t forgot to tell that this is just a test and you should do a proper backup of your environment. When you use the schema owner, you need some privileges, using the dba user grant the privileges needed: REM Grant privileges required for online redefinition. GRANT EXECUTE ON DBMS_REDEFINITION TO TESTS_OCS; GRANT ALTER ANY TABLE TO TESTS_OCS; GRANT DROP ANY TABLE TO TESTS_OCS; GRANT LOCK ANY TABLE TO TESTS_OCS; GRANT CREATE ANY TABLE TO TESTS_OCS; GRANT SELECT ANY TABLE TO TESTS_OCS; REM Privileges required to perform cloning of dependent objects. GRANT CREATE ANY TRIGGER TO TESTS_OCS; GRANT CREATE ANY INDEX TO TESTS_OCS; In our test scenario we will separate the content as Legacy, Day1, Day2, Day3 and Future. This last one will partitioned automatically using 3 tablespaces in a round robin mode. In a real scenario the partition rule could be per month, per year or any rule that you choose. Table spaces for the test scenario: CREATE TABLESPACE TESTS_OCS_PART_LEGACY DATAFILE 'tests_ocs_part_legacy.dat' SIZE 500K AUTOEXTEND ON NEXT 500K MAXSIZE UNLIMITED; CREATE TABLESPACE TESTS_OCS_PART_DAY1 DATAFILE 'tests_ocs_part_day1.dat' SIZE 500K AUTOEXTEND ON NEXT 500K MAXSIZE UNLIMITED; CREATE TABLESPACE TESTS_OCS_PART_DAY2 DATAFILE 'tests_ocs_part_day2.dat' SIZE 500K AUTOEXTEND ON NEXT 500K MAXSIZE UNLIMITED; CREATE TABLESPACE TESTS_OCS_PART_DAY3 DATAFILE 'tests_ocs_part_day3.dat' SIZE 500K AUTOEXTEND ON NEXT 500K MAXSIZE UNLIMITED; CREATE TABLESPACE TESTS_OCS_PART_ROUND_ROBIN_A 'tests_ocs_part_round_robin_a.dat' DATAFILE SIZE 500K AUTOEXTEND ON NEXT 500K MAXSIZE UNLIMITED; CREATE TABLESPACE TESTS_OCS_PART_ROUND_ROBIN_B 'tests_ocs_part_round_robin_b.dat' DATAFILE SIZE 500K AUTOEXTEND ON NEXT 500K MAXSIZE UNLIMITED; CREATE TABLESPACE TESTS_OCS_PART_ROUND_ROBIN_C 'tests_ocs_part_round_robin_c.dat' DATAFILE SIZE 500K AUTOEXTEND ON NEXT 500K MAXSIZE UNLIMITED; Before start, gather optimizer statistics on the actual FileStorage table: EXEC DBMS_STATS.GATHER_TABLE_STATS(USER, 'FileStorage', cascade => TRUE); Now check if is possible execute the redefinition process: EXEC DBMS_REDEFINITION.CAN_REDEF_TABLE('TESTS_OCS', 'FileStorage',DBMS_REDEFINITION.CONS_USE_PK); If no errors messages, you are good to go. Create a Partitioned Interim FileStorage table. You need to create a new table with the partition information to act as an interim table: CREATE TABLE FILESTORAGE_Part ( DID NUMBER(*,0) NOT NULL ENABLE, DRENDITIONID VARCHAR2(30 CHAR) NOT NULL ENABLE, DLASTMODIFIED TIMESTAMP (6), DFILESIZE NUMBER(*,0), DISDELETED VARCHAR2(1 CHAR), BFILEDATA BLOB ) LOB (BFILEDATA) STORE AS SECUREFILE ( ENABLE STORAGE IN ROW NOCACHE LOGGING KEEP_DUPLICATES NOCOMPRESS ) PARTITION BY RANGE (DLASTMODIFIED) INTERVAL (NUMTODSINTERVAL(1,'DAY')) STORE IN (TESTS_OCS_PART_ROUND_ROBIN_A, TESTS_OCS_PART_ROUND_ROBIN_B, TESTS_OCS_PART_ROUND_ROBIN_C) ( PARTITION FILESTORAGE_PART_LEGACY VALUES LESS THAN (TO_DATE('05-APR-2012 12.00.00 AM', 'DD-MON-YYYY HH.MI.SS AM')) TABLESPACE TESTS_OCS_PART_LEGACY LOB (BFILEDATA) STORE AS SECUREFILE ( TABLESPACE TESTS_OCS_PART_LEGACY RETENTION NONE DEDUPLICATE COMPRESS HIGH ), PARTITION FILESTORAGE_PART_DAY1 VALUES LESS THAN (TO_DATE('06-APR-2012 07.25.00 PM', 'DD-MON-YYYY HH.MI.SS AM')) TABLESPACE TESTS_OCS_PART_DAY1 LOB (BFILEDATA) STORE AS SECUREFILE ( TABLESPACE TESTS_OCS_PART_DAY1 RETENTION AUTO KEEP_DUPLICATES COMPRESS ), PARTITION FILESTORAGE_PART_DAY2 VALUES LESS THAN (TO_DATE('06-APR-2012 07.55.00 PM', 'DD-MON-YYYY HH.MI.SS AM')) TABLESPACE TESTS_OCS_PART_DAY2 LOB (BFILEDATA) STORE AS SECUREFILE ( TABLESPACE TESTS_OCS_PART_DAY2 RETENTION AUTO KEEP_DUPLICATES NOCOMPRESS ), PARTITION FILESTORAGE_PART_DAY3 VALUES LESS THAN (TO_DATE('06-APR-2012 07.58.00 PM', 'DD-MON-YYYY HH.MI.SS AM')) TABLESPACE TESTS_OCS_PART_DAY3 LOB (BFILEDATA) STORE AS SECUREFILE ( TABLESPACE TESTS_OCS_PART_DAY3 RETENTION AUTO KEEP_DUPLICATES NOCOMPRESS ) ); After the creation you should see your partitions defined. Note that only the fixed range partitions have been created, none of the interval partition have been created. Start the redefinition process: BEGIN DBMS_REDEFINITION.START_REDEF_TABLE( uname => 'TESTS_OCS' ,orig_table => 'FileStorage' ,int_table => 'FileStorage_PART' ,col_mapping => NULL ,options_flag => DBMS_REDEFINITION.CONS_USE_PK ); END; This operation can take some time to complete, depending how many contents that you have and on the size of the table. Using the DBA user you can check the progress with this command: SELECT * FROM v$sesstat WHERE sid = 1; Copy dependent objects: DECLARE redefinition_errors PLS_INTEGER := 0; BEGIN DBMS_REDEFINITION.COPY_TABLE_DEPENDENTS( uname => 'TESTS_OCS' ,orig_table => 'FileStorage' ,int_table => 'FileStorage_PART' ,copy_indexes => DBMS_REDEFINITION.CONS_ORIG_PARAMS ,copy_triggers => TRUE ,copy_constraints => TRUE ,copy_privileges => TRUE ,ignore_errors => TRUE ,num_errors => redefinition_errors ,copy_statistics => FALSE ,copy_mvlog => FALSE ); IF (redefinition_errors > 0) THEN DBMS_OUTPUT.PUT_LINE('>>> FileStorage to FileStorage_PART temp copy Errors: ' || TO_CHAR(redefinition_errors)); END IF; END; With the DBA user, verify that there's no errors: SELECT object_name, base_table_name, ddl_txt FROM DBA_REDEFINITION_ERRORS; *Note that will show 2 lines related to the constrains, this is expected. Synchronize the interim table FileStorage_PART: BEGIN DBMS_REDEFINITION.SYNC_INTERIM_TABLE( uname => 'TESTS_OCS', orig_table => 'FileStorage', int_table => 'FileStorage_PART'); END; Gather statistics on the new table: EXEC DBMS_STATS.GATHER_TABLE_STATS(USER, 'FileStorage_PART', cascade => TRUE); Complete the redefinition: BEGIN DBMS_REDEFINITION.FINISH_REDEF_TABLE( uname => 'TESTS_OCS', orig_table => 'FileStorage', int_table => 'FileStorage_PART'); END; During the execution the FileStorage table is locked in exclusive mode until finish the operation. After the last command the FileStorage table is partitioned. If you have contents out of the range partition, you should see the new partitions created automatically, not generating an error if you “forgot” to create all the future ranges. You will see something like: You now can drop the FileStorage_PART table: border-bottom-width: 1px; border-bottom-style: solid; text-align: left; border-left-color: silver; border-left-width: 1px; border-left-style: solid; padding-bottom: 4px; line-height: 12pt; background-color: #f4f4f4; margin-top: 20px; margin-right: 0px; margin-bottom: 10px; margin-left: 0px; padding-left: 4px; width: 97.5%; padding-right: 4px; font-family: 'Courier New', Courier, monospace; direction: ltr; max-height: 200px; font-size: 8pt; overflow-x: auto; overflow-y: auto; border-top-color: silver; border-top-width: 1px; border-top-style: solid; cursor: text; border-right-color: silver; border-right-width: 1px; border-right-style: solid; padding-top: 4px; " id="codeSnippetWrapper"> DROP TABLE FileStorage_PART PURGE; To check the FileStorage table is valid and is partitioned, use the command: SELECT num_rows,partitioned FROM user_tables WHERE table_name = 'FILESTORAGE'; You can list the contents of the FileStorage table in a specific partition, per example: SELECT * FROM FileStorage PARTITION (FILESTORAGE_PART_LEGACY) Some useful commands that you can use to check the partitions, note that you need to run using a DBA user: SELECT * FROM DBA_TAB_PARTITIONS WHERE table_name = 'FILESTORAGE';   SELECT * FROM DBA_TABLESPACES WHERE tablespace_name like 'TESTS_OCS%'; After the redefinition process complete you have a new FileStorage table storing all content that has the Storage rule pointed to the JDBC Storage and partitioned using the rule set during the creation of the temporary interim FileStorage_PART table. At this point you can test the WebCenter Content downloading the documents (Original and Renditions). Note that the content could be already in the cache area, take a look in the weblayout directory to see if a file with the same id is there, then click on the web rendition of your test file and see if have created the file and you can open, this means that is all working. The redefinition process can be repeated many times, this allow you test what the better layout, over and over again. Now some interesting maintenance actions related to the partitions: Make an tablespace read only. No issues viewing, the WebCenter Content do not alter the revisions When try to delete an content that is part of an read only tablespace, an error will occurs and the document will not be deleted The only way to prevent errors today is creating an custom component that checks the partitions and if you have an document in an “Read Only” repository, execute the deletion process of the metadata and mark the document to be deleted on the next db maintenance, like a new redefinition. Take an tablespace off-line for archiving purposes or any other reason. When you try open an document that is included in this tablespace will receive an error that was unable to retrieve the content, but the others online tablespaces are not affected. Same behavior when deleting documents. Again, an custom component is the solution. If you have an document “out of range”, the component can show an message that the repository for that document is offline. This can be extended to a option to the user to request to put online again. Moving some legacy content to an offline repository (table) using the Exchange option to move the content from one partition to a empty nonpartitioned table like FileStorage_LEGACY. Note that this option will remove the registers from the FileStorage and will not be able to open the stored content. You always need to keep in mind the indexes and constrains. An redefinition separating the original content (vault) from the renditions and separate by date ate the same time. This could be an option for DAM environments that want to have an special place for the renditions and put the original files in a storage with less performance. The process will be the same, you just need to change the script of the interim table to use composite partitioning. Will be something like: CREATE TABLE FILESTORAGE_RenditionPart ( DID NUMBER(*,0) NOT NULL ENABLE, DRENDITIONID VARCHAR2(30 CHAR) NOT NULL ENABLE, DLASTMODIFIED TIMESTAMP (6), DFILESIZE NUMBER(*,0), DISDELETED VARCHAR2(1 CHAR), BFILEDATA BLOB ) LOB (BFILEDATA) STORE AS SECUREFILE ( ENABLE STORAGE IN ROW NOCACHE LOGGING KEEP_DUPLICATES NOCOMPRESS ) PARTITION BY LIST (DRENDITIONID) SUBPARTITION BY RANGE (DLASTMODIFIED) ( PARTITION Vault VALUES ('primaryFile') ( SUBPARTITION FILESTORAGE_VAULT_LEGACY VALUES LESS THAN (TO_DATE('05-APR-2012 12.00.00 AM', 'DD-MON-YYYY HH.MI.SS AM')) LOB (BFILEDATA) STORE AS SECUREFILE , SUBPARTITION FILESTORAGE_VAULT_DAY1 VALUES LESS THAN (TO_DATE('06-APR-2012 07.25.00 PM', 'DD-MON-YYYY HH.MI.SS AM')) LOB (BFILEDATA) STORE AS SECUREFILE , SUBPARTITION FILESTORAGE_VAULT_DAY2 VALUES LESS THAN (TO_DATE('06-APR-2012 07.55.00 PM', 'DD-MON-YYYY HH.MI.SS AM')) LOB (BFILEDATA) STORE AS SECUREFILE , SUBPARTITION FILESTORAGE_VAULT_DAY3 VALUES LESS THAN (TO_DATE('06-APR-2012 07.58.00 PM', 'DD-MON-YYYY HH.MI.SS AM')) LOB (BFILEDATA) STORE AS SECUREFILE , SUBPARTITION FILESTORAGE_VAULT_FUTURE VALUES LESS THAN (MAXVALUE) ) ,PARTITION WebLayout VALUES ('webViewableFile') ( SUBPARTITION FILESTORAGE_WEBLAYOUT_LEGACY VALUES LESS THAN (TO_DATE('05-APR-2012 12.00.00 AM', 'DD-MON-YYYY HH.MI.SS AM')) LOB (BFILEDATA) STORE AS SECUREFILE , SUBPARTITION FILESTORAGE_WEBLAYOUT_DAY1 VALUES LESS THAN (TO_DATE('06-APR-2012 07.25.00 PM', 'DD-MON-YYYY HH.MI.SS AM')) LOB (BFILEDATA) STORE AS SECUREFILE , SUBPARTITION FILESTORAGE_WEBLAYOUT_DAY2 VALUES LESS THAN (TO_DATE('06-APR-2012 07.55.00 PM', 'DD-MON-YYYY HH.MI.SS AM')) LOB (BFILEDATA) STORE AS SECUREFILE , SUBPARTITION FILESTORAGE_WEBLAYOUT_DAY3 VALUES LESS THAN (TO_DATE('06-APR-2012 07.58.00 PM', 'DD-MON-YYYY HH.MI.SS AM')) LOB (BFILEDATA) STORE AS SECUREFILE , SUBPARTITION FILESTORAGE_WEBLAYOUT_FUTURE VALUES LESS THAN (MAXVALUE) ) ,PARTITION Special VALUES ('Special') ( SUBPARTITION FILESTORAGE_SPECIAL_LEGACY VALUES LESS THAN (TO_DATE('05-APR-2012 12.00.00 AM', 'DD-MON-YYYY HH.MI.SS AM')) LOB (BFILEDATA) STORE AS SECUREFILE , SUBPARTITION FILESTORAGE_SPECIAL_DAY1 VALUES LESS THAN (TO_DATE('06-APR-2012 07.25.00 PM', 'DD-MON-YYYY HH.MI.SS AM')) LOB (BFILEDATA) STORE AS SECUREFILE , SUBPARTITION FILESTORAGE_SPECIAL_DAY2 VALUES LESS THAN (TO_DATE('06-APR-2012 07.55.00 PM', 'DD-MON-YYYY HH.MI.SS AM')) LOB (BFILEDATA) STORE AS SECUREFILE , SUBPARTITION FILESTORAGE_SPECIAL_DAY3 VALUES LESS THAN (TO_DATE('06-APR-2012 07.58.00 PM', 'DD-MON-YYYY HH.MI.SS AM')) LOB (BFILEDATA) STORE AS SECUREFILE , SUBPARTITION FILESTORAGE_SPECIAL_FUTURE VALUES LESS THAN (MAXVALUE) ) )ENABLE ROW MOVEMENT; The next post related to partitioned repository will come with an sample component to handle the possible exceptions when you need to take off line an tablespace/partition or move to another place. Also, we can include some integration to the Retention Management and Records Management. Another subject related to partitioning is the ability to create an FileStore Provider pointed to a different database, raising the level of the distributed storage vs. performance. Let us know if this is important to you or you have an use case not listed, leave a comment. Cross-posted on the blog.ContentrA.com

    Read the article

  • Windows Azure: Announcing release of Windows Azure SDK 2.2 (with lots of goodies)

    - by ScottGu
    Earlier today I blogged about a big update we made today to Windows Azure, and some of the great new features it provides. Today I’m also excited to also announce the release of the Windows Azure SDK 2.2. Today’s SDK release adds even more great features including: Visual Studio 2013 Support Integrated Windows Azure Sign-In support within Visual Studio Remote Debugging Cloud Services with Visual Studio Firewall Management support within Visual Studio for SQL Databases Visual Studio 2013 RTM VM Images for MSDN Subscribers Windows Azure Management Libraries for .NET Updated Windows Azure PowerShell Cmdlets and ScriptCenter The below post has more details on what’s available in today’s Windows Azure SDK 2.2 release.  Also head over to Channel 9 to see the new episode of the Visual Studio Toolbox show that will be available shortly, and which highlights these features in a video demonstration. Visual Studio 2013 Support Version 2.2 of the Window Azure SDK is the first official version of the SDK to support the final RTM release of Visual Studio 2013. If you installed the 2.1 SDK with the Preview of Visual Studio 2013 we recommend that you upgrade your projects to SDK 2.2.  SDK 2.2 also works side by side with the SDK 2.0 and SDK 2.1 releases on Visual Studio 2012: Integrated Windows Azure Sign In within Visual Studio Integrated Windows Azure Sign-In support within Visual Studio is one of the big improvements added with this Windows Azure SDK release.  Integrated sign-in support enables developers to develop/test/manage Windows Azure resources within Visual Studio without having to download or use management certificates.  You can now just right-click on the “Windows Azure” icon within the Server Explorer inside Visual Studio and choose the “Connect to Windows Azure” context menu option to connect to Windows Azure: Doing this will prompt you to enter the email address of the account you wish to sign-in with: You can use either a Microsoft Account (e.g. Windows Live ID) or an Organizational account (e.g. Active Directory) as the email.  The dialog will update with an appropriate login prompt depending on which type of email address you enter: Once you sign-in you’ll see the Windows Azure resources that you have permissions to manage show up automatically within the Visual Studio Server Explorer (and you can start using them): With this new integrated sign in experience you are now able to publish web apps, deploy VMs and cloud services, use Windows Azure diagnostics, and fully interact with your Windows Azure services within Visual Studio without the need for a management certificate.  All of the authentication is handled using the Windows Azure Active Directory associated with your Windows Azure account (details on this can be found in my earlier blog post). Integrating authentication this way end-to-end across the Service Management APIs + Dev Tools + Management Portal + PowerShell automation scripts enables a much more secure and flexible security model within Windows Azure, and makes it much more convenient to securely manage multiple developers + administrators working on a project.  It also allows organizations and enterprises to use the same authentication model that they use for their developers on-premises in the cloud.  It also ensures that employees who leave an organization immediately lose access to their company’s cloud based resources once their Active Directory account is suspended. Filtering/Subscription Management Once you login within Visual Studio, you can filter which Windows Azure subscriptions/regions are visible within the Server Explorer by right-clicking the “Filter Services” context menu within the Server Explorer.  You can also use the “Manage Subscriptions” context menu to mange your Windows Azure Subscriptions: Bringing up the “Manage Subscriptions” dialog allows you to see which accounts you are currently using, as well as which subscriptions are within them: The “Certificates” tab allows you to continue to import and use management certificates to manage Windows Azure resources as well.  We have not removed any functionality with today’s update – all of the existing scenarios that previously supported management certificates within Visual Studio continue to work just fine.  The new integrated sign-in support provided with today’s release is purely additive. Note: the SQL Database node and the Mobile Service node in Server Explorer do not support integrated sign-in at this time. Therefore, you will only see databases and mobile services under those nodes if you have a management certificate to authorize access to them.  We will enable them with integrated sign-in in a future update. Remote Debugging Cloud Resources within Visual Studio Today’s Windows Azure SDK 2.2 release adds support for remote debugging many types of Windows Azure resources. With live, remote debugging support from within Visual Studio, you are now able to have more visibility than ever before into how your code is operating live in Windows Azure.  Let’s walkthrough how to enable remote debugging for a Cloud Service: Remote Debugging of Cloud Services To enable remote debugging for your cloud service, select Debug as the Build Configuration on the Common Settings tab of your Cloud Service’s publish dialog wizard: Then click the Advanced Settings tab and check the Enable Remote Debugging for all roles checkbox: Once your cloud service is published and running live in the cloud, simply set a breakpoint in your local source code: Then use Visual Studio’s Server Explorer to select the Cloud Service instance deployed in the cloud, and then use the Attach Debugger context menu on the role or to a specific VM instance of it: Once the debugger attaches to the Cloud Service, and a breakpoint is hit, you’ll be able to use the rich debugging capabilities of Visual Studio to debug the cloud instance remotely, in real-time, and see exactly how your app is running in the cloud. Today’s remote debugging support is super powerful, and makes it much easier to develop and test applications for the cloud.  Support for remote debugging Cloud Services is available as of today, and we’ll also enable support for remote debugging Web Sites shortly. Firewall Management Support with SQL Databases By default we enable a security firewall around SQL Databases hosted within Windows Azure.  This ensures that only your application (or IP addresses you approve) can connect to them and helps make your infrastructure secure by default.  This is great for protection at runtime, but can sometimes be a pain at development time (since by default you can’t connect/manage the database remotely within Visual Studio if the security firewall blocks your instance of VS from connecting to it). One of the cool features we’ve added with today’s release is support that makes it easy to enable and configure the security firewall directly within Visual Studio.  Now with the SDK 2.2 release, when you try and connect to a SQL Database using the Visual Studio Server Explorer, and a firewall rule prevents access to the database from your machine, you will be prompted to add a firewall rule to enable access from your local IP address: You can simply click Add Firewall Rule and a new rule will be automatically added for you. In some cases, the logic to detect your local IP may not be sufficient (for example: you are behind a corporate firewall that uses a range of IP addresses) and you may need to set up a firewall rule for a range of IP addresses in order to gain access. The new Add Firewall Rule dialog also makes this easy to do.  Once connected you’ll be able to manage your SQL Database directly within the Visual Studio Server Explorer: This makes it much easier to work with databases in the cloud. Visual Studio 2013 RTM Virtual Machine Images Available for MSDN Subscribers Last week we released the General Availability Release of Visual Studio 2013 to the web.  This is an awesome release with a ton of new features. With today’s Windows Azure update we now have a set of pre-configured VM images of VS 2013 available within the Windows Azure Management Portal for use by MSDN customers.  This enables you to create a VM in the cloud with VS 2013 pre-installed on it in with only a few clicks: Windows Azure now provides the fastest and easiest way to get started doing development with Visual Studio 2013. Windows Azure Management Libraries for .NET (Preview) Having the ability to automate the creation, deployment, and tear down of resources is a key requirement for applications running in the cloud.  It also helps immensely when running dev/test scenarios and coded UI tests against pre-production environments. Today we are releasing a preview of a new set of Windows Azure Management Libraries for .NET.  These new libraries make it easy to automate tasks using any .NET language (e.g. C#, VB, F#, etc).  Previously this automation capability was only available through the Windows Azure PowerShell Cmdlets or to developers who were willing to write their own wrappers for the Windows Azure Service Management REST API. Modern .NET Developer Experience We’ve worked to design easy-to-understand .NET APIs that still map well to the underlying REST endpoints, making sure to use and expose the modern .NET functionality that developers expect today: Portable Class Library (PCL) support targeting applications built for any .NET Platform (no platform restriction) Shipped as a set of focused NuGet packages with minimal dependencies to simplify versioning Support async/await task based asynchrony (with easy sync overloads) Shared infrastructure for common error handling, tracing, configuration, HTTP pipeline manipulation, etc. Factored for easy testability and mocking Built on top of popular libraries like HttpClient and Json.NET Below is a list of a few of the management client classes that are shipping with today’s initial preview release: .NET Class Name Supports Operations for these Assets (and potentially more) ManagementClient Locations Credentials Subscriptions Certificates ComputeManagementClient Hosted Services Deployments Virtual Machines Virtual Machine Images & Disks StorageManagementClient Storage Accounts WebSiteManagementClient Web Sites Web Site Publish Profiles Usage Metrics Repositories VirtualNetworkManagementClient Networks Gateways Automating Creating a Virtual Machine using .NET Let’s walkthrough an example of how we can use the new Windows Azure Management Libraries for .NET to fully automate creating a Virtual Machine. I’m deliberately showing a scenario with a lot of custom options configured – including VHD image gallery enumeration, attaching data drives, network endpoints + firewall rules setup - to show off the full power and richness of what the new library provides. We’ll begin with some code that demonstrates how to enumerate through the built-in Windows images within the standard Windows Azure VM Gallery.  We’ll search for the first VM image that has the word “Windows” in it and use that as our base image to build the VM from.  We’ll then create a cloud service container in the West US region to host it within: We can then customize some options on it such as setting up a computer name, admin username/password, and hostname.  We’ll also open up a remote desktop (RDP) endpoint through its security firewall: We’ll then specify the VHD host and data drives that we want to mount on the Virtual Machine, and specify the size of the VM we want to run it in: Once everything has been set up the call to create the virtual machine is executed asynchronously In a few minutes we’ll then have a completely deployed VM running on Windows Azure with all of the settings (hard drives, VM size, machine name, username/password, network endpoints + firewall settings) fully configured and ready for us to use: Preview Availability via NuGet The Windows Azure Management Libraries for .NET are now available via NuGet. Because they are still in preview form, you’ll need to add the –IncludePrerelease switch when you go to retrieve the packages. The Package Manager Console screen shot below demonstrates how to get the entire set of libraries to manage your Windows Azure assets: You can also install them within your .NET projects by right clicking on the VS Solution Explorer and using the Manage NuGet Packages context menu command.  Make sure to select the “Include Prerelease” drop-down for them to show up, and then you can install the specific management libraries you need for your particular scenarios: Open Source License The new Windows Azure Management Libraries for .NET make it super easy to automate management operations within Windows Azure – whether they are for Virtual Machines, Cloud Services, Storage Accounts, Web Sites, and more.  Like the rest of the Windows Azure SDK, we are releasing the source code under an open source (Apache 2) license and it is hosted at https://github.com/WindowsAzure/azure-sdk-for-net/tree/master/libraries if you wish to contribute. PowerShell Enhancements and our New Script Center Today, we are also shipping Windows Azure PowerShell 0.7.0 (which is a separate download). You can find the full change log here. Here are some of the improvements provided with it: Windows Azure Active Directory authentication support Script Center providing many sample scripts to automate common tasks on Windows Azure New cmdlets for Media Services and SQL Database Script Center Windows Azure enables you to script and automate a lot of tasks using PowerShell.  People often ask for more pre-built samples of common scenarios so that they can use them to learn and tweak/customize. With this in mind, we are excited to introduce a new Script Center that we are launching for Windows Azure. You can learn about how to scripting with Windows Azure with a get started article. You can then find many sample scripts across different solutions, including infrastructure, data management, web, and more: All of the sample scripts are hosted on TechNet with links from the Windows Azure Script Center. Each script is complete with good code comments, detailed descriptions, and examples of usage. Summary Visual Studio 2013 and the Windows Azure SDK 2.2 make it easier than ever to get started developing rich cloud applications. Along with the Windows Azure Developer Center’s growing set of .NET developer resources to guide your development efforts, today’s Windows Azure SDK 2.2 release should make your development experience more enjoyable and efficient. If you don’t already have a Windows Azure account, you can sign-up for a free trial and start using all of the above features today.  Then visit the Windows Azure Developer Center to learn more about how to build apps with it. Hope this helps, Scott P.S. In addition to blogging, I am also now using Twitter for quick updates and to share links. Follow me at: twitter.com/scottgu

    Read the article

  • What's up with OCFS2?

    - by wcoekaer
    On Linux there are many filesystem choices and even from Oracle we provide a number of filesystems, all with their own advantages and use cases. Customers often confuse ACFS with OCFS or OCFS2 which then causes assumptions to be made such as one replacing the other etc... I thought it would be good to write up a summary of how OCFS2 got to where it is, what we're up to still, how it is different from other options and how this really is a cool native Linux cluster filesystem that we worked on for many years and is still widely used. Work on a cluster filesystem at Oracle started many years ago, in the early 2000's when the Oracle Database Cluster development team wrote a cluster filesystem for Windows that was primarily focused on providing an alternative to raw disk devices and help customers with the deployment of Oracle Real Application Cluster (RAC). Oracle RAC is a cluster technology that lets us make a cluster of Oracle Database servers look like one big database. The RDBMS runs on many nodes and they all work on the same data. It's a Shared Disk database design. There are many advantages doing this but I will not go into detail as that is not the purpose of my write up. Suffice it to say that Oracle RAC expects all the database data to be visible in a consistent, coherent way, across all the nodes in the cluster. To do that, there were/are a few options : 1) use raw disk devices that are shared, through SCSI, FC, or iSCSI 2) use a network filesystem (NFS) 3) use a cluster filesystem(CFS) which basically gives you a filesystem that's coherent across all nodes using shared disks. It is sort of (but not quite) combining option 1 and 2 except that you don't do network access to the files, the files are effectively locally visible as if it was a local filesystem. So OCFS (Oracle Cluster FileSystem) on Windows was born. Since Linux was becoming a very important and popular platform, we decided that we would also make this available on Linux and thus the porting of OCFS/Windows started. The first version of OCFS was really primarily focused on replacing the use of Raw devices with a simple filesystem that lets you create files and provide direct IO to these files to get basically native raw disk performance. The filesystem was not designed to be fully POSIX compliant and it did not have any where near good/decent performance for regular file create/delete/access operations. Cache coherency was easy since it was basically always direct IO down to the disk device and this ensured that any time one issues a write() command it would go directly down to the disk, and not return until the write() was completed. Same for read() any sort of read from a datafile would be a read() operation that went all the way to disk and return. We did not cache any data when it came down to Oracle data files. So while OCFS worked well for that, since it did not have much of a normal filesystem feel, it was not something that could be submitted to the kernel mail list for inclusion into Linux as another native linux filesystem (setting aside the Windows porting code ...) it did its job well, it was very easy to configure, node membership was simple, locking was disk based (so very slow but it existed), you could create regular files and do regular filesystem operations to a certain extend but anything that was not database data file related was just not very useful in general. Logfiles ok, standard filesystem use, not so much. Up to this point, all the work was done, at Oracle, by Oracle developers. Once OCFS (1) was out for a while and there was a lot of use in the database RAC world, many customers wanted to do more and were asking for features that you'd expect in a normal native filesystem, a real "general purposes cluster filesystem". So the team sat down and basically started from scratch to implement what's now known as OCFS2 (Oracle Cluster FileSystem release 2). Some basic criteria were : Design it with a real Distributed Lock Manager and use the network for lock negotiation instead of the disk Make it a Linux native filesystem instead of a native shim layer and a portable core Support standard Posix compliancy and be fully cache coherent with all operations Support all the filesystem features Linux offers (ACL, extended Attributes, quotas, sparse files,...) Be modern, support large files, 32/64bit, journaling, data ordered journaling, endian neutral, we can mount on both endian /cross architecture,.. Needless to say, this was a huge development effort that took many years to complete. A few big milestones happened along the way... OCFS2 was development in the open, we did not have a private tree that we worked on without external code review from the Linux Filesystem maintainers, great folks like Christopher Hellwig reviewed the code regularly to make sure we were not doing anything out of line, we submitted the code for review on lkml a number of times to see if we were getting close for it to be included into the mainline kernel. Using this development model is standard practice for anyone that wants to write code that goes into the kernel and having any chance of doing so without a complete rewrite or.. shall I say flamefest when submitted. It saved us a tremendous amount of time by not having to re-fit code for it to be in a Linus acceptable state. Some other filesystems that were trying to get into the kernel that didn't follow an open development model had a lot harder time and a lot harsher criticism. March 2006, when Linus released 2.6.16, OCFS2 officially became part of the mainline kernel, it was accepted a little earlier in the release candidates but in 2.6.16. OCFS2 became officially part of the mainline Linux kernel tree as one of the many filesystems. It was the first cluster filesystem to make it into the kernel tree. Our hope was that it would then end up getting picked up by the distribution vendors to make it easy for everyone to have access to a CFS. Today the source code for OCFS2 is approximately 85000 lines of code. We made OCFS2 production with full support for customers that ran Oracle database on Linux, no extra or separate support contract needed. OCFS2 1.0.0 started being built for RHEL4 for x86, x86-64, ppc, s390x and ia64. For RHEL5 starting with OCFS2 1.2. SuSE was very interested in high availability and clustering and decided to build and include OCFS2 with SLES9 for their customers and was, next to Oracle, the main contributor to the filesystem for both new features and bug fixes. Source code was always available even prior to inclusion into mainline and as of 2.6.16, source code was just part of a Linux kernel download from kernel.org, which it still is, today. So the latest OCFS2 code is always the upstream mainline Linux kernel. OCFS2 is the cluster filesystem used in Oracle VM 2 and Oracle VM 3 as the virtual disk repository filesystem. Since the filesystem is in the Linux kernel it's released under the GPL v2 The release model has always been that new feature development happened in the mainline kernel and we then built consistent, well tested, snapshots that had versions, 1.2, 1.4, 1.6, 1.8. But these releases were effectively just snapshots in time that were tested for stability and release quality. OCFS2 is very easy to use, there's a simple text file that contains the node information (hostname, node number, cluster name) and a file that contains the cluster heartbeat timeouts. It is very small, and very efficient. As Sunil Mushran wrote in the manual : OCFS2 is an efficient, easily configured, quickly installed, fully integrated and compatible, feature-rich, architecture and endian neutral, cache coherent, ordered data journaling, POSIX-compliant, shared disk cluster file system. Here is a list of some of the important features that are included : Variable Block and Cluster sizes Supports block sizes ranging from 512 bytes to 4 KB and cluster sizes ranging from 4 KB to 1 MB (increments in power of 2). Extent-based Allocations Tracks the allocated space in ranges of clusters making it especially efficient for storing very large files. Optimized Allocations Supports sparse files, inline-data, unwritten extents, hole punching and allocation reservation for higher performance and efficient storage. File Cloning/snapshots REFLINK is a feature which introduces copy-on-write clones of files in a cluster coherent way. Indexed Directories Allows efficient access to millions of objects in a directory. Metadata Checksums Detects silent corruption in inodes and directories. Extended Attributes Supports attaching an unlimited number of name:value pairs to the file system objects like regular files, directories, symbolic links, etc. Advanced Security Supports POSIX ACLs and SELinux in addition to the traditional file access permission model. Quotas Supports user and group quotas. Journaling Supports both ordered and writeback data journaling modes to provide file system consistency in the event of power failure or system crash. Endian and Architecture neutral Supports a cluster of nodes with mixed architectures. Allows concurrent mounts on nodes running 32-bit and 64-bit, little-endian (x86, x86_64, ia64) and big-endian (ppc64) architectures. In-built Cluster-stack with DLM Includes an easy to configure, in-kernel cluster-stack with a distributed lock manager. Buffered, Direct, Asynchronous, Splice and Memory Mapped I/Os Supports all modes of I/Os for maximum flexibility and performance. Comprehensive Tools Support Provides a familiar EXT3-style tool-set that uses similar parameters for ease-of-use. The filesystem was distributed for Linux distributions in separate RPM form and this had to be built for every single kernel errata release or every updated kernel provided by the vendor. We provided builds from Oracle for Oracle Linux and all kernels released by Oracle and for Red Hat Enterprise Linux. SuSE provided the modules directly for every kernel they shipped. With the introduction of the Unbreakable Enterprise Kernel for Oracle Linux and our interest in reducing the overhead of building filesystem modules for every minor release, we decide to make OCFS2 available as part of UEK. There was no more need for separate kernel modules, everything was built-in and a kernel upgrade automatically updated the filesystem, as it should. UEK allowed us to not having to backport new upstream filesystem code into an older kernel version, backporting features into older versions introduces risk and requires extra testing because the code is basically partially rewritten. The UEK model works really well for continuing to provide OCFS2 without that extra overhead. Because the RHEL kernel did not contain OCFS2 as a kernel module (it is in the source tree but it is not built by the vendor in kernel module form) we stopped adding the extra packages to Oracle Linux and its RHEL compatible kernel and for RHEL. Oracle Linux customers/users obviously get OCFS2 included as part of the Unbreakable Enterprise Kernel, SuSE customers get it by SuSE distributed with SLES and Red Hat can decide to distribute OCFS2 to their customers if they chose to as it's just a matter of compiling the module and making it available. OCFS2 today, in the mainline kernel is pretty much feature complete in terms of integration with every filesystem feature Linux offers and it is still actively maintained with Joel Becker being the primary maintainer. Since we use OCFS2 as part of Oracle VM, we continue to look at interesting new functionality to add, REFLINK was a good example, and as such we continue to enhance the filesystem where it makes sense. Bugfixes and any sort of code that goes into the mainline Linux kernel that affects filesystems, automatically also modifies OCFS2 so it's in kernel, actively maintained but not a lot of new development happening at this time. We continue to fully support OCFS2 as part of Oracle Linux and the Unbreakable Enterprise Kernel and other vendors make their own decisions on support as it's really a Linux cluster filesystem now more than something that we provide to customers. It really just is part of Linux like EXT3 or BTRFS etc, the OS distribution vendors decide. Do not confuse OCFS2 with ACFS (ASM cluster Filesystem) also known as Oracle Cloud Filesystem. ACFS is a filesystem that's provided by Oracle on various OS platforms and really integrates into Oracle ASM (Automatic Storage Management). It's a very powerful Cluster Filesystem but it's not distributed as part of the Operating System, it's distributed with the Oracle Database product and installs with and lives inside Oracle ASM. ACFS obviously is fully supported on Linux (Oracle Linux, Red Hat Enterprise Linux) but OCFS2 independently as a native Linux filesystem is also, and continues to also be supported. ACFS is very much tied into the Oracle RDBMS, OCFS2 is just a standard native Linux filesystem with no ties into Oracle products. Customers running the Oracle database and ASM really should consider using ACFS as it also provides storage/clustered volume management. Customers wanting to use a simple, easy to use generic Linux cluster filesystem should consider using OCFS2. To learn more about OCFS2 in detail, you can find good documentation on http://oss.oracle.com/projects/ocfs2 in the Documentation area, or get the latest mainline kernel from http://kernel.org and read the source. One final, unrelated note - since I am not always able to publicly answer or respond to comments, I do not want to selectively publish comments from readers. Sometimes I forget to publish comments, sometime I publish them and sometimes I would publish them but if for some reason I cannot publicly comment on them, it becomes a very one-sided stream. So for now I am going to not publish comments from anyone, to be fair to all sides. You are always welcome to email me and I will do my best to respond to technical questions, questions about strategy or direction are sometimes not possible to answer for obvious reasons.

    Read the article

  • Building a SOA/BPM/BAM Cluster Part I &ndash; Preparing the Environment

    - by antony.reynolds
    An increasing number of customers are using SOA Suite in a cluster configuration, I might hazard to say that the majority of production deployments are now using SOA clusters.  So I thought it may be useful to detail the steps in building an 11g cluster and explain a little about why things are done the way they are. In this series of posts I will explain how to build a SOA/BPM cluster using the Enterprise Deployment Guide. This post will explain the setting required to prepare the cluster for installation and configuration. Software Required The following software is required for an 11.1.1.3 SOA/BPM install. Software Version Notes Oracle Database Certified databases are listed here SOA & BPM Suites require a working database installation. Repository Creation Utility (RCU) 11.1.1.3 If upgrading an 11.1.1.2 repository then a separate script is available. Web Tier Utilities 11.1.1.3 Provides Web Server, 11.1.1.3 is an upgrade to 11.1.1.2, so 11.1.1.2 must be installed first. Web Tier Utilities 11.1.1.3 Web Server, 11.1.1.3 Patch.  You can use the 11.1.1.2 version without problems. Oracle WebLogic Server 11gR1 10.3.3 This is the host platform for 11.1.1.3 SOA/BPM Suites. SOA Suite 11.1.1.2 SOA Suite 11.1.1.3 is an upgrade to 11.1.1.2, so 11.1.1.2 must be installed first. SOA Suite 11.1.1.3 SOA Suite 11.1.1.3 patch, requires 11.1.12 to have been installed. My installation was performed on Oracle Enterprise Linux 5.4 64-bit. Database I will not cover setting up the database in this series other than to identify the database requirements.  If setting up a SOA cluster then ideally we would also be using a RAC database.  I assume that this is running on separate machines to the SOA cluster.  Section 2.1, “Database”, of the EDG covers the database configuration in detail. Settings The database should have processes set to at least 400 if running SOA/BPM and BAM. alter system set processes=400 scope=spfile Run RCU The Repository Creation Utility creates the necessary database tables for the SOA Suite.  The RCU can be run from any machine that can access the target database.  In 11g the RCU creates a number of pre-defined users and schema with a user defiend prefix.  This allows you to have multiple 11g installations in the same database. After running the RCU you need to grant some additional privileges to the soainfra user.  The soainfra user should have privileges on the transaction tables. grant select on sys.dba_pending_transactions to prefix_soainfra Grant force any transaction to prefix_soainfra Machines The cluster will be built on the following machines. EDG Name is the name used for this machine in the EDG. Notes are a description of the purpose of the machine. EDG Name Notes LB External load balancer to distribute load across and failover between web servers. WEBHOST1 Hosts a web server. WEBHOST2 Hosts a web server. SOAHOST1 Hosts SOA components. SOAHOST2 Hosts SOA components. BAMHOST1 Hosts BAM components. BAMHOST2 Hosts BAM components. Note that it is possible to collapse the BAM servers so that they run on the same machines as the SOA servers. In this case BAMHOST1 and SOAHOST1 would be the same, as would BAMHOST2 and SOAHOST2. The cluster may include more than 2 servers and in this case we add SOAHOST3, SOAHOST4 etc as needed. My cluster has WEBHOST1, SOAHOST1 and BAMHOST1 all running on a single machine. Software Components The cluster will use the following software components. EDG Name is the name used for this machine in the EDG. Type is the type of component, generally a WebLogic component. Notes are a description of the purpose of the component. EDG Name Type Notes AdminServer Admin Server Domain Admin Server WLS_WSM1 Managed Server Web Services Manager Policy Manager Server WLS_WSM2 Managed Server Web Services Manager Policy Manager Server WLS_SOA1 Managed Server SOA/BPM Managed Server WLS_SOA2 Managed Server SOA/BPM Managed Server WLS_BAM1 Managed Server BAM Managed Server running Active Data Cache WLS_BAM2 Managed Server BAM Manager Server without Active Data Cache   Node Manager Will run on all hosts with WLS servers OHS1 Web Server Oracle HTTP Server OHS2 Web Server Oracle HTTP Server LB Load Balancer Load Balancer, not part of SOA Suite The above assumes a 2 node cluster. Network Configuration The SOA cluster requires an extensive amount of network configuration.  I would recommend assigning a private sub-net (internal IP addresses such as 10.x.x.x, 192.168.x.x or 172.168.x.x) to the cluster for use by addresses that only need to be accessible to the Load Balancer or other cluster members.  Section 2.2, "Network", of the EDG covers the network configuration in detail. EDG Name is the hostname used in the EDG. IP Name is the IP address name used in the EDG. Type is the type of IP address: Fixed is fixed to a single machine. Floating is assigned to one of several machines to allow for server migration. Virtual is assigned to a load balancer and used to distribute load across several machines. Host is the host where this IP address is active.  Note for floating IP addresses a range of hosts is given. Bound By identifies which software component will use this IP address. Scope shows where this IP address needs to be resolved. Cluster scope addresses only have to be resolvable by machines in the cluster, i.e. the machines listed in the previous section.  These addresses are only used for inter-cluster communication or for access by the load balancer. Internal scope addresses Notes are comments on why that type of IP is used. EDG Name IP Name Type Host Bound By Scope Notes ADMINVHN VIP1 Floating SOAHOST1-SOAHOSTn AdminServer Cluster Admin server, must be able to migrate between SOA server machines. SOAHOST1 IP1 Fixed SOAHOST1 NodeManager, WLS_WSM1 Cluster WSM Server 1 does not require server migration. SOAHOST2 IP2 Fixed SOAHOST1 NodeManager, WLS_WSM2 Cluster WSM Server 2 does not require server migration SOAHOST1VHN VIP2 Floating SOAHOST1-SOAHOSTn WLS_SOA1 Cluster SOA server 1, must be able to migrate between SOA server machines SOAHOST2VHN VIP3 Floating SOAHOST1-SOAHOSTn WLS_SOA2 Cluster SOA server 2, must be able to migrate between SOA server machines BAMHOST1 IP4 Fixed BAMHOST1 NodeManager Cluster   BAMHOST1VHN VIP4 Floating BAMHOST1-BAMHOSTn WLS_BAM1 Cluster BAM server 1, must be able to migrate between BAM server machines BAMHOST2 IP3 Fixed BAMHOST2 NodeManager, WLS_BAM2 Cluster BAM server 2 does not require server migration WEBHOST1 IP5 Fixed WEBHOST1 OHS1 Cluster   WEBHOST2 IP6 Fixed WEBHOST2 OHS2 Cluster   soa.mycompany.com VIP5 Virtual LB LB Public External access point to SOA cluster. admin.mycompany.com VIP6 Virtual LB LB Internal Internal access to WLS console and EM soainternal.mycompany.com VIP7 Virtual LB LB Internal Internal access point to SOA cluster Floating IP addresses are IP addresses that may be re-assigned between machines in the cluster.  For example in the event of failure of SOAHOST1 then WLS_SOA1 will need to be migrated to another server.  In this case VIP2 (SOAHOST1VHN) will need to be activated on the new target machine.  Once set up the node manager will manage registration and removal of the floating IP addresses with the exception of the AdminServer floating IP address. Note that if the BAMHOSTs and SOAHOSTs are the same machine then you can obviously share the hostname and fixed IP addresses, but you still need separate floating IP addresses for the different managed servers.  The hostnames don’t have to be the ones given in the EDG, but they must be distinct in the same way as the ETC names are distinct.  If the type is a fixed IP then if the addresses are the same you can use the same hostname, for example if you collapse the soahost1, bamhost1 and webhost1 onto a single machine then you could refer to them all as HOST1 and give them the same IP address, however SOAHOST1VHN can never be the same as BAMHOST1VHN because these are floating IP addresses. Notes on DNS IP addresses that are of scope “Cluster” just need to be in the hosts file (/etc/hosts on Linux, C:\Windows\System32\drivers\etc\hosts on Windows) of all the machines in the cluster and the load balancer.  IP addresses that are of scope “Internal” need to be available on the internal DNS servers, whilst IP addresses of scope “Public” need to be available on external and internal DNS servers. Shared File System At a minimum the cluster needs shared storage for the domain configuration, XA transaction logs and JMS file stores.  It is also possible to place the software itself on a shared server.  I strongly recommend that all machines have the same file structure for their SOA installation otherwise you will experience pain!  Section 2.3, "Shared Storage and Recommended Directory Structure", of the EDG covers the shared storage recommendations in detail. The following shorthand is used for locations: ORACLE_BASE is the root of the file system used for software and configuration files. MW_HOME is the location used by the installed SOA/BPM Suite installation.  This is also used by the web server installation.  In my installation it is set to <ORACLE_BASE>/SOA11gPS2. ORACLE_HOME is the location of the Oracle SOA components or the Oracle Web components.  This directory is installed under the the MW_HOME but the name is decided by the user at installation, default values are Oracle_SOA1 and Oracle_Web1.  In my installation they are set to <MW_HOME>/Oracle_SOA and <MW_HOME>/Oracle _WEB. ORACLE_COMMON_HOME is the location of the common components and is located under the MW_HOME directory.  This is always <MW_HOME>/oracle_common. ORACLE_INSTANCE is used by the Oracle HTTP Server and/or Oracle Web Cache.  It is recommended to create it under <ORACLE_BASE>/admin.  In my installation they are set to <ORACLE_BASE>/admin/Web1, <ORACLE_BASE>/admin/Web2 and <ORACLE_BASE>/admin/WC1. WL_HOME is the WebLogic server home and is always found at <MW_HOME>/wlserver_10.3. Key file locations are shown below. Directory Notes <ORACLE_BASE>/admin/domain_name/aserver/domain_name Shared location for domain.  Used to allow admin server to manually fail over between machines.  When creating domain_name provide the aserver directory as the location for the domain. In my install this is <ORACLE_BASE>/admin/aserver/soa_domain as I only have one domain on the box. <ORACLE_BASE>/admin/domain_name/aserver/applications Shared location for deployed applications.  Needs to be provided when creating the domain. In my install this is <ORACLE_BASE>/admin/aserver/applications as I only have one domain on the box. <ORACLE_BASE>/admin/domain_name/mserver/domain_name Either unique location for each machine or can be shared between machines to simplify task of packing and unpacking domain.  This acts as the managed server configuration location.  Keeping it separate from Admin server helps to avoid problems with the managed servers messing up the Admin Server. In my install this is <ORACLE_BASE>/admin/mserver/soa_domain as I only have one domain on the box. <ORACLE_BASE>/admin/domain_name/mserver/applications Either unique location for each machine or can be shared between machines.  Holds deployed applications. In my install this is <ORACLE_BASE>/admin/mserver/applications as I only have one domain on the box. <ORACLE_BASE>/admin/domain_name/soa_cluster_name Shared directory to hold the following   dd – deployment descriptors   jms – shared JMS file stores   fadapter – shared file adapter co-ordination files   tlogs – shared transaction log files In my install this is <ORACLE_BASE>/admin/soa_cluster. <ORACLE_BASE>/admin/instance_name Local folder for web server (OHS) instance. In my install this is <ORACLE_BASE>/admin/web1 and <ORACLE_BASE>/admin/web2. I also have <ORACLE_BASE>/admin/wc1 for the Web Cache I use as a load balancer. <ORACLE_BASE>/product/fmw This can be a shared or local folder for the SOA/BPM Suite software.  I used a shared location so I only ran the installer once. In my install this is <ORACLE_BASE>/SOA11gPS2 All the shared files need to be put onto a shared storage media.  I am using NFS, but recommendation for production would be a SAN, with mirrored disks for resilience. Collapsing Environments To reduce the hardware requirements it is possible to collapse the BAMHOST, SOAHOST and WEBHOST machines onto a single physical machine.  This will require more memory but memory is a lot cheaper than additional machines.  For environments that require higher security then stay with a separate WEBHOST tier as per the EDG.  Similarly for high volume environments then keep a separate set of machines for BAM and/or Web tier as per the EDG. Notes on Dev Environments In a dev environment it is acceptable to use a a single node (non-RAC) database, but be aware that the config of the data sources is different (no need to use multi-data source in WLS).  Typically in a dev environment we will collapse the BAMHOST, SOAHOST and WEBHOST onto a single machine and use a software load balancer.  To test a cluster properly we will need at least 2 machines. For my test environment I used Oracle Web Cache as a load balancer.  I ran it on one of the SOA Suite machines and it load balanced across the Web Servers on both machines.  This was easy for me to set up and I could administer it from a web based console.

    Read the article

  • Acer aspire 5735z wireless not working after upgrade to 11.10

    - by Jon
    I cant get my wifi card to work at all after upgrading to 11.10 Oneiric. I'm not sure where to start to fix this. Ive tried using the additional drivers tool but this shows that no additional drivers are needed. Before my upgrade I had a drivers working for the Rt2860 chipset. Any help on this would be much appreciated.... thanks Jon jon@ubuntu:~$ ifconfig eth0 Link encap:Ethernet HWaddr 00:1d:72:ec:76:d5 inet addr:192.168.1.134 Bcast:192.168.1.255 Mask:255.255.255.0 inet6 addr: fe80::21d:72ff:feec:76d5/64 Scope:Link UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 RX packets:7846 errors:0 dropped:0 overruns:0 frame:0 TX packets:7213 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:1000 RX bytes:8046624 (8.0 MB) TX bytes:1329442 (1.3 MB) Interrupt:16 lo Link encap:Local Loopback inet addr:127.0.0.1 Mask:255.0.0.0 inet6 addr: ::1/128 Scope:Host UP LOOPBACK RUNNING MTU:16436 Metric:1 RX packets:91 errors:0 dropped:0 overruns:0 frame:0 TX packets:91 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0 RX bytes:34497 (34.4 KB) TX bytes:34497 (34.4 KB) Ive included by dmesg output below [ 0.428818] NET: Registered protocol family 2 [ 0.429003] IP route cache hash table entries: 131072 (order: 8, 1048576 bytes) [ 0.430562] TCP established hash table entries: 524288 (order: 11, 8388608 bytes) [ 0.436614] TCP bind hash table entries: 65536 (order: 8, 1048576 bytes) [ 0.437409] TCP: Hash tables configured (established 524288 bind 65536) [ 0.437412] TCP reno registered [ 0.437431] UDP hash table entries: 2048 (order: 4, 65536 bytes) [ 0.437482] UDP-Lite hash table entries: 2048 (order: 4, 65536 bytes) [ 0.437678] NET: Registered protocol family 1 [ 0.437705] pci 0000:00:02.0: Boot video device [ 0.437892] PCI: CLS 64 bytes, default 64 [ 0.437916] Simple Boot Flag at 0x57 set to 0x1 [ 0.438294] audit: initializing netlink socket (disabled) [ 0.438309] type=2000 audit(1319243447.432:1): initialized [ 0.440763] Freeing initrd memory: 13416k freed [ 0.468362] HugeTLB registered 2 MB page size, pre-allocated 0 pages [ 0.488192] VFS: Disk quotas dquot_6.5.2 [ 0.488254] Dquot-cache hash table entries: 512 (order 0, 4096 bytes) [ 0.488888] fuse init (API version 7.16) [ 0.488985] msgmni has been set to 5890 [ 0.489381] Block layer SCSI generic (bsg) driver version 0.4 loaded (major 253) [ 0.489413] io scheduler noop registered [ 0.489415] io scheduler deadline registered [ 0.489460] io scheduler cfq registered (default) [ 0.489583] pcieport 0000:00:1c.0: setting latency timer to 64 [ 0.489633] pcieport 0000:00:1c.0: irq 40 for MSI/MSI-X [ 0.489699] pcieport 0000:00:1c.1: setting latency timer to 64 [ 0.489741] pcieport 0000:00:1c.1: irq 41 for MSI/MSI-X [ 0.489800] pcieport 0000:00:1c.2: setting latency timer to 64 [ 0.489841] pcieport 0000:00:1c.2: irq 42 for MSI/MSI-X [ 0.489904] pcieport 0000:00:1c.3: setting latency timer to 64 [ 0.489944] pcieport 0000:00:1c.3: irq 43 for MSI/MSI-X [ 0.490006] pcieport 0000:00:1c.4: setting latency timer to 64 [ 0.490047] pcieport 0000:00:1c.4: irq 44 for MSI/MSI-X [ 0.490126] pci_hotplug: PCI Hot Plug PCI Core version: 0.5 [ 0.490149] pciehp: PCI Express Hot Plug Controller Driver version: 0.4 [ 0.490196] intel_idle: MWAIT substates: 0x1110 [ 0.490198] intel_idle: does not run on family 6 model 15 [ 0.491240] ACPI: Deprecated procfs I/F for AC is loaded, please retry with CONFIG_ACPI_PROCFS_POWER cleared [ 0.493473] ACPI: AC Adapter [ADP1] (on-line) [ 0.493590] input: Lid Switch as /devices/LNXSYSTM:00/device:00/PNP0C0D:00/input/input0 [ 0.496771] ACPI: Lid Switch [LID0] [ 0.496818] input: Sleep Button as /devices/LNXSYSTM:00/device:00/PNP0C0E:00/input/input1 [ 0.496823] ACPI: Sleep Button [SLPB] [ 0.496865] input: Power Button as /devices/LNXSYSTM:00/LNXPWRBN:00/input/input2 [ 0.496869] ACPI: Power Button [PWRF] [ 0.496900] ACPI: acpi_idle registered with cpuidle [ 0.498719] Monitor-Mwait will be used to enter C-1 state [ 0.498753] Monitor-Mwait will be used to enter C-2 state [ 0.498761] Marking TSC unstable due to TSC halts in idle [ 0.517627] thermal LNXTHERM:00: registered as thermal_zone0 [ 0.517630] ACPI: Thermal Zone [TZS0] (67 C) [ 0.524796] thermal LNXTHERM:01: registered as thermal_zone1 [ 0.524799] ACPI: Thermal Zone [TZS1] (67 C) [ 0.524823] ACPI: Deprecated procfs I/F for battery is loaded, please retry with CONFIG_ACPI_PROCFS_POWER cleared [ 0.524852] ERST: Table is not found! [ 0.524948] Serial: 8250/16550 driver, 32 ports, IRQ sharing enabled [ 0.680991] ACPI: Battery Slot [BAT0] (battery present) [ 0.688567] Linux agpgart interface v0.103 [ 0.688672] agpgart-intel 0000:00:00.0: Intel GM45 Chipset [ 0.688865] agpgart-intel 0000:00:00.0: detected gtt size: 2097152K total, 262144K mappable [ 0.689786] agpgart-intel 0000:00:00.0: detected 65536K stolen memory [ 0.689912] agpgart-intel 0000:00:00.0: AGP aperture is 256M @ 0xd0000000 [ 0.691006] brd: module loaded [ 0.691510] loop: module loaded [ 0.691967] Fixed MDIO Bus: probed [ 0.691990] PPP generic driver version 2.4.2 [ 0.692065] tun: Universal TUN/TAP device driver, 1.6 [ 0.692067] tun: (C) 1999-2004 Max Krasnyansky <[email protected]> [ 0.692146] ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver [ 0.692181] ehci_hcd 0000:00:1a.7: PCI INT C -> GSI 20 (level, low) -> IRQ 20 [ 0.692206] ehci_hcd 0000:00:1a.7: setting latency timer to 64 [ 0.692210] ehci_hcd 0000:00:1a.7: EHCI Host Controller [ 0.692255] ehci_hcd 0000:00:1a.7: new USB bus registered, assigned bus number 1 [ 0.692289] ehci_hcd 0000:00:1a.7: debug port 1 [ 0.696181] ehci_hcd 0000:00:1a.7: cache line size of 64 is not supported [ 0.696202] ehci_hcd 0000:00:1a.7: irq 20, io mem 0xf8904800 [ 0.712014] ehci_hcd 0000:00:1a.7: USB 2.0 started, EHCI 1.00 [ 0.712131] hub 1-0:1.0: USB hub found [ 0.712136] hub 1-0:1.0: 6 ports detected [ 0.712230] ehci_hcd 0000:00:1d.7: PCI INT A -> GSI 23 (level, low) -> IRQ 23 [ 0.712243] ehci_hcd 0000:00:1d.7: setting latency timer to 64 [ 0.712247] ehci_hcd 0000:00:1d.7: EHCI Host Controller [ 0.712287] ehci_hcd 0000:00:1d.7: new USB bus registered, assigned bus number 2 [ 0.712315] ehci_hcd 0000:00:1d.7: debug port 1 [ 0.716201] ehci_hcd 0000:00:1d.7: cache line size of 64 is not supported [ 0.716216] ehci_hcd 0000:00:1d.7: irq 23, io mem 0xf8904c00 [ 0.732014] ehci_hcd 0000:00:1d.7: USB 2.0 started, EHCI 1.00 [ 0.732130] hub 2-0:1.0: USB hub found [ 0.732135] hub 2-0:1.0: 6 ports detected [ 0.732209] ohci_hcd: USB 1.1 'Open' Host Controller (OHCI) Driver [ 0.732223] uhci_hcd: USB Universal Host Controller Interface driver [ 0.732254] uhci_hcd 0000:00:1a.0: PCI INT A -> GSI 20 (level, low) -> IRQ 20 [ 0.732262] uhci_hcd 0000:00:1a.0: setting latency timer to 64 [ 0.732265] uhci_hcd 0000:00:1a.0: UHCI Host Controller [ 0.732298] uhci_hcd 0000:00:1a.0: new USB bus registered, assigned bus number 3 [ 0.732325] uhci_hcd 0000:00:1a.0: irq 20, io base 0x00001820 [ 0.732441] hub 3-0:1.0: USB hub found [ 0.732445] hub 3-0:1.0: 2 ports detected [ 0.732508] uhci_hcd 0000:00:1a.1: PCI INT B -> GSI 20 (level, low) -> IRQ 20 [ 0.732514] uhci_hcd 0000:00:1a.1: setting latency timer to 64 [ 0.732518] uhci_hcd 0000:00:1a.1: UHCI Host Controller [ 0.732553] uhci_hcd 0000:00:1a.1: new USB bus registered, assigned bus number 4 [ 0.732577] uhci_hcd 0000:00:1a.1: irq 20, io base 0x00001840 [ 0.732696] hub 4-0:1.0: USB hub found [ 0.732700] hub 4-0:1.0: 2 ports detected [ 0.732762] uhci_hcd 0000:00:1a.2: PCI INT C -> GSI 20 (level, low) -> IRQ 20 [ 0.732768] uhci_hcd 0000:00:1a.2: setting latency timer to 64 [ 0.732772] uhci_hcd 0000:00:1a.2: UHCI Host Controller [ 0.732805] uhci_hcd 0000:00:1a.2: new USB bus registered, assigned bus number 5 [ 0.732829] uhci_hcd 0000:00:1a.2: irq 20, io base 0x00001860 [ 0.732942] hub 5-0:1.0: USB hub found [ 0.732946] hub 5-0:1.0: 2 ports detected [ 0.733007] uhci_hcd 0000:00:1d.0: PCI INT A -> GSI 23 (level, low) -> IRQ 23 [ 0.733014] uhci_hcd 0000:00:1d.0: setting latency timer to 64 [ 0.733017] uhci_hcd 0000:00:1d.0: UHCI Host Controller [ 0.733057] uhci_hcd 0000:00:1d.0: new USB bus registered, assigned bus number 6 [ 0.733082] uhci_hcd 0000:00:1d.0: irq 23, io base 0x00001880 [ 0.733202] hub 6-0:1.0: USB hub found [ 0.733206] hub 6-0:1.0: 2 ports detected [ 0.733265] uhci_hcd 0000:00:1d.1: PCI INT B -> GSI 17 (level, low) -> IRQ 17 [ 0.733273] uhci_hcd 0000:00:1d.1: setting latency timer to 64 [ 0.733276] uhci_hcd 0000:00:1d.1: UHCI Host Controller [ 0.733313] uhci_hcd 0000:00:1d.1: new USB bus registered, assigned bus number 7 [ 0.733351] uhci_hcd 0000:00:1d.1: irq 17, io base 0x000018a0 [ 0.733466] hub 7-0:1.0: USB hub found [ 0.733470] hub 7-0:1.0: 2 ports detected [ 0.733532] uhci_hcd 0000:00:1d.2: PCI INT C -> GSI 18 (level, low) -> IRQ 18 [ 0.733539] uhci_hcd 0000:00:1d.2: setting latency timer to 64 [ 0.733542] uhci_hcd 0000:00:1d.2: UHCI Host Controller [ 0.733578] uhci_hcd 0000:00:1d.2: new USB bus registered, assigned bus number 8 [ 0.733610] uhci_hcd 0000:00:1d.2: irq 18, io base 0x000018c0 [ 0.733730] hub 8-0:1.0: USB hub found [ 0.733736] hub 8-0:1.0: 2 ports detected [ 0.733843] i8042: PNP: PS/2 Controller [PNP0303:KBD0,PNP0f13:PS2M] at 0x60,0x64 irq 1,12 [ 0.751594] serio: i8042 KBD port at 0x60,0x64 irq 1 [ 0.751605] serio: i8042 AUX port at 0x60,0x64 irq 12 [ 0.751732] mousedev: PS/2 mouse device common for all mice [ 0.752670] rtc_cmos 00:08: RTC can wake from S4 [ 0.752770] rtc_cmos 00:08: rtc core: registered rtc_cmos as rtc0 [ 0.752796] rtc0: alarms up to one month, y3k, 242 bytes nvram, hpet irqs [ 0.752907] device-mapper: uevent: version 1.0.3 [ 0.752976] device-mapper: ioctl: 4.20.0-ioctl (2011-02-02) initialised: [email protected] [ 0.753028] cpuidle: using governor ladder [ 0.753093] cpuidle: using governor menu [ 0.753096] EFI Variables Facility v0.08 2004-May-17 [ 0.753361] TCP cubic registered [ 0.753482] NET: Registered protocol family 10 [ 0.753966] NET: Registered protocol family 17 [ 0.753992] Registering the dns_resolver key type [ 0.754113] PM: Hibernation image not present or could not be loaded. [ 0.754131] registered taskstats version 1 [ 0.771553] Magic number: 15:152:507 [ 0.771667] rtc_cmos 00:08: setting system clock to 2011-10-22 00:30:48 UTC (1319243448) [ 0.772238] BIOS EDD facility v0.16 2004-Jun-25, 0 devices found [ 0.772240] EDD information not available. [ 0.774165] Freeing unused kernel memory: 984k freed [ 0.774504] Write protecting the kernel read-only data: 10240k [ 0.774755] Freeing unused kernel memory: 20k freed [ 0.775093] input: AT Translated Set 2 keyboard as /devices/platform/i8042/serio0/input/input3 [ 0.779727] Freeing unused kernel memory: 1400k freed [ 0.801946] udevd[84]: starting version 173 [ 0.880950] sky2: driver version 1.28 [ 0.881046] sky2 0000:02:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16 [ 0.881096] sky2 0000:02:00.0: setting latency timer to 64 [ 0.881197] sky2 0000:02:00.0: Yukon-2 Extreme chip revision 2 [ 0.881871] sky2 0000:02:00.0: irq 45 for MSI/MSI-X [ 0.896273] sky2 0000:02:00.0: eth0: addr 00:1d:72:ec:76:d5 [ 0.910630] ahci 0000:00:1f.2: version 3.0 [ 0.910647] ahci 0000:00:1f.2: PCI INT B -> GSI 19 (level, low) -> IRQ 19 [ 0.910710] ahci 0000:00:1f.2: irq 46 for MSI/MSI-X [ 0.910775] ahci: SSS flag set, parallel bus scan disabled [ 0.910812] ahci 0000:00:1f.2: AHCI 0001.0200 32 slots 4 ports 3 Gbps 0x33 impl SATA mode [ 0.910816] ahci 0000:00:1f.2: flags: 64bit ncq sntf stag pm led clo pio slum part ccc ems sxs [ 0.910821] ahci 0000:00:1f.2: setting latency timer to 64 [ 0.941773] scsi0 : ahci [ 0.941954] scsi1 : ahci [ 0.942038] scsi2 : ahci [ 0.942118] scsi3 : ahci [ 0.942196] scsi4 : ahci [ 0.942268] scsi5 : ahci [ 0.942332] ata1: SATA max UDMA/133 abar m2048@0xf8904000 port 0xf8904100 irq 46 [ 0.942336] ata2: SATA max UDMA/133 abar m2048@0xf8904000 port 0xf8904180 irq 46 [ 0.942339] ata3: DUMMY [ 0.942340] ata4: DUMMY [ 0.942344] ata5: SATA max UDMA/133 abar m2048@0xf8904000 port 0xf8904300 irq 46 [ 0.942347] ata6: SATA max UDMA/133 abar m2048@0xf8904000 port 0xf8904380 irq 46 [ 1.028061] usb 1-5: new high speed USB device number 2 using ehci_hcd [ 1.181775] usbcore: registered new interface driver uas [ 1.260062] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300) [ 1.261126] ata1.00: ATA-8: Hitachi HTS543225L9A300, FBEOC40C, max UDMA/133 [ 1.261129] ata1.00: 488397168 sectors, multi 16: LBA48 NCQ (depth 31/32), AA [ 1.262360] ata1.00: configured for UDMA/133 [ 1.262518] scsi 0:0:0:0: Direct-Access ATA Hitachi HTS54322 FBEO PQ: 0 ANSI: 5 [ 1.262716] sd 0:0:0:0: Attached scsi generic sg0 type 0 [ 1.262762] sd 0:0:0:0: [sda] 488397168 512-byte logical blocks: (250 GB/232 GiB) [ 1.262824] sd 0:0:0:0: [sda] Write Protect is off [ 1.262827] sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00 [ 1.262851] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA [ 1.287277] sda: sda1 sda2 sda3 [ 1.287693] sd 0:0:0:0: [sda] Attached SCSI disk [ 1.580059] ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300) [ 1.581188] ata2.00: ATAPI: HL-DT-STDVDRAM GT10N, 1.00, max UDMA/100 [ 1.582663] ata2.00: configured for UDMA/100 [ 1.584162] scsi 1:0:0:0: CD-ROM HL-DT-ST DVDRAM GT10N 1.00 PQ: 0 ANSI: 5 [ 1.585821] sr0: scsi3-mmc drive: 24x/24x writer dvd-ram cd/rw xa/form2 cdda tray [ 1.585824] cdrom: Uniform CD-ROM driver Revision: 3.20 [ 1.585953] sr 1:0:0:0: Attached scsi CD-ROM sr0 [ 1.586038] sr 1:0:0:0: Attached scsi generic sg1 type 5 [ 1.632061] usb 6-1: new low speed USB device number 2 using uhci_hcd [ 1.908056] ata5: SATA link down (SStatus 0 SControl 300) [ 2.228065] ata6: SATA link down (SStatus 0 SControl 300) [ 2.228955] Initializing USB Mass Storage driver... [ 2.229052] usbcore: registered new interface driver usb-storage [ 2.229054] USB Mass Storage support registered. [ 2.235827] scsi6 : usb-storage 1-5:1.0 [ 2.235987] usbcore: registered new interface driver ums-realtek [ 2.244451] input: B16_b_02 USB-PS/2 Optical Mouse as /devices/pci0000:00/0000:00:1d.0/usb6/6-1/6-1:1.0/input/input4 [ 2.244598] generic-usb 0003:046D:C025.0001: input,hidraw0: USB HID v1.10 Mouse [B16_b_02 USB-PS/2 Optical Mouse] on usb-0000:00:1d.0-1/input0 [ 2.244620] usbcore: registered new interface driver usbhid [ 2.244622] usbhid: USB HID core driver [ 3.091083] EXT4-fs (loop0): mounted filesystem with ordered data mode. Opts: (null) [ 3.238275] scsi 6:0:0:0: Direct-Access Generic- Multi-Card 1.00 PQ: 0 ANSI: 0 CCS [ 3.348261] sd 6:0:0:0: Attached scsi generic sg2 type 0 [ 3.351897] sd 6:0:0:0: [sdb] Attached SCSI removable disk [ 47.138012] udevd[334]: starting version 173 [ 47.177678] lp: driver loaded but no devices found [ 47.197084] wmi: Mapper loaded [ 47.197526] acer_wmi: Acer Laptop ACPI-WMI Extras [ 47.210227] acer_wmi: Brightness must be controlled by generic video driver [ 47.566578] Disabling lock debugging due to kernel taint [ 47.584050] ndiswrapper version 1.56 loaded (smp=yes, preempt=no) [ 47.620666] type=1400 audit(1319239895.347:2): apparmor="STATUS" operation="profile_load" name="/sbin/dhclient" pid=624 comm="apparmor_parser" [ 47.620934] type=1400 audit(1319239895.347:3): apparmor="STATUS" operation="profile_load" name="/usr/lib/NetworkManager/nm-dhcp-client.action" pid=624 comm="apparmor_parser" [ 47.621108] type=1400 audit(1319239895.347:4): apparmor="STATUS" operation="profile_load" name="/usr/lib/connman/scripts/dhclient-script" pid=624 comm="apparmor_parser" [ 47.633056] [drm] Initialized drm 1.1.0 20060810 [ 47.722594] i915 0000:00:02.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16 [ 47.722602] i915 0000:00:02.0: setting latency timer to 64 [ 47.807152] ndiswrapper (check_nt_hdr:141): kernel is 64-bit, but Windows driver is not 64-bit;bad magic: 010B [ 47.807159] ndiswrapper (load_sys_files:206): couldn't prepare driver 'rt2860' [ 47.807930] ndiswrapper (load_wrap_driver:108): couldn't load driver rt2860; check system log for messages from 'loadndisdriver' [ 47.856250] usbcore: registered new interface driver ndiswrapper [ 47.861772] i915 0000:00:02.0: irq 47 for MSI/MSI-X [ 47.861781] [drm] Supports vblank timestamp caching Rev 1 (10.10.2010). [ 47.861783] [drm] Driver supports precise vblank timestamp query. [ 47.861842] vgaarb: device changed decodes: PCI:0000:00:02.0,olddecodes=io+mem,decodes=io+mem:owns=io+mem [ 47.980620] fixme: max PWM is zero. [ 48.286153] fbcon: inteldrmfb (fb0) is primary device [ 48.287033] Console: switching to colour frame buffer device 170x48 [ 48.287062] fb0: inteldrmfb frame buffer device [ 48.287064] drm: registered panic notifier [ 48.333883] acpi device:02: registered as cooling_device2 [ 48.334053] input: Video Bus as /devices/LNXSYSTM:00/device:00/PNP0A08:00/LNXVIDEO:00/input/input5 [ 48.334128] ACPI: Video Device [GFX0] (multi-head: yes rom: no post: no) [ 48.334203] [drm] Initialized i915 1.6.0 20080730 for 0000:00:02.0 on minor 0 [ 48.334644] HDA Intel 0000:00:1b.0: power state changed by ACPI to D0 [ 48.334652] HDA Intel 0000:00:1b.0: power state changed by ACPI to D0 [ 48.334673] HDA Intel 0000:00:1b.0: PCI INT A -> GSI 21 (level, low) -> IRQ 21 [ 48.334737] HDA Intel 0000:00:1b.0: irq 48 for MSI/MSI-X [ 48.334772] HDA Intel 0000:00:1b.0: setting latency timer to 64 [ 48.356107] Adding 261116k swap on /host/ubuntu/disks/swap.disk. Priority:-1 extents:1 across:261116k [ 48.380946] hda_codec: ALC268: BIOS auto-probing. [ 48.390242] input: HDA Intel Mic as /devices/pci0000:00/0000:00:1b.0/sound/card0/input6 [ 48.390365] input: HDA Intel Headphone as /devices/pci0000:00/0000:00:1b.0/sound/card0/input7 [ 48.490870] EXT4-fs (loop0): re-mounted. Opts: errors=remount-ro,user_xattr [ 48.917990] ppdev: user-space parallel port driver [ 48.950729] type=1400 audit(1319239896.675:5): apparmor="STATUS" operation="profile_load" name="/usr/lib/cups/backend/cups-pdf" pid=941 comm="apparmor_parser" [ 48.951114] type=1400 audit(1319239896.675:6): apparmor="STATUS" operation="profile_load" name="/usr/sbin/cupsd" pid=941 comm="apparmor_parser" [ 48.977706] Synaptics Touchpad, model: 1, fw: 7.2, id: 0x1c0b1, caps: 0xd04733/0xa44000/0xa0000 [ 49.048871] input: SynPS/2 Synaptics TouchPad as /devices/platform/i8042/serio1/input/input8 [ 49.078713] sky2 0000:02:00.0: eth0: enabling interface [ 49.079462] ADDRCONF(NETDEV_UP): eth0: link is not ready [ 50.762266] sky2 0000:02:00.0: eth0: Link is up at 100 Mbps, full duplex, flow control rx [ 50.762702] ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready [ 54.751478] type=1400 audit(1319239902.475:7): apparmor="STATUS" operation="profile_load" name="/usr/lib/lightdm/lightdm-guest-session-wrapper" pid=1039 comm="apparmor_parser" [ 54.755907] type=1400 audit(1319239902.479:8): apparmor="STATUS" operation="profile_replace" name="/sbin/dhclient" pid=1040 comm="apparmor_parser" [ 54.756237] type=1400 audit(1319239902.483:9): apparmor="STATUS" operation="profile_replace" name="/usr/lib/NetworkManager/nm-dhcp-client.action" pid=1040 comm="apparmor_parser" [ 54.756417] type=1400 audit(1319239902.483:10): apparmor="STATUS" operation="profile_replace" name="/usr/lib/connman/scripts/dhclient-script" pid=1040 comm="apparmor_parser" [ 54.764825] type=1400 audit(1319239902.491:11): apparmor="STATUS" operation="profile_load" name="/usr/bin/evince" pid=1041 comm="apparmor_parser" [ 54.768365] type=1400 audit(1319239902.495:12): apparmor="STATUS" operation="profile_load" name="/usr/bin/evince-previewer" pid=1041 comm="apparmor_parser" [ 54.770601] type=1400 audit(1319239902.495:13): apparmor="STATUS" operation="profile_load" name="/usr/bin/evince-thumbnailer" pid=1041 comm="apparmor_parser" [ 54.770729] type=1400 audit(1319239902.495:14): apparmor="STATUS" operation="profile_load" name="/usr/share/gdm/guest-session/Xsession" pid=1038 comm="apparmor_parser" [ 54.775181] type=1400 audit(1319239902.499:15): apparmor="STATUS" operation="profile_load" name="/usr/lib/telepathy/mission-control-5" pid=1043 comm="apparmor_parser" [ 54.775533] type=1400 audit(1319239902.499:16): apparmor="STATUS" operation="profile_load" name="/usr/lib/telepathy/telepathy-*" pid=1043 comm="apparmor_parser" [ 54.936691] init: failsafe main process (891) killed by TERM signal [ 54.944583] init: apport pre-start process (1096) terminated with status 1 [ 55.000373] init: apport post-stop process (1160) terminated with status 1 [ 55.005291] init: gdm main process (1159) killed by TERM signal [ 59.782579] EXT4-fs (loop0): re-mounted. Opts: errors=remount-ro,user_xattr,commit=0 [ 60.992021] eth0: no IPv6 routers present [ 61.936072] device eth0 entered promiscuous mode [ 62.053949] Bluetooth: Core ver 2.16 [ 62.054005] NET: Registered protocol family 31 [ 62.054007] Bluetooth: HCI device and connection manager initialized [ 62.054010] Bluetooth: HCI socket layer initialized [ 62.054012] Bluetooth: L2CAP socket layer initialized [ 62.054993] Bluetooth: SCO socket layer initialized [ 62.058750] Bluetooth: RFCOMM TTY layer initialized [ 62.058758] Bluetooth: RFCOMM socket layer initialized [ 62.058760] Bluetooth: RFCOMM ver 1.11 [ 62.059428] Bluetooth: BNEP (Ethernet Emulation) ver 1.3 [ 62.059432] Bluetooth: BNEP filters: protocol multicast [ 62.460389] init: plymouth-stop pre-start process (1662) terminated with status 1 '

    Read the article

  • AIX Checklist for stable obiee deployment

    - by user554629
    Common AIX configuration issues     ( last updated 27 Aug 2012 ) OBIEE is a complicated system with many moving parts and connection points.The purpose of this article is to provide a checklist to discuss OBIEE deployment with your systems administrators. The information in this article is time sensitive, and updated as I discover new  issues or details. What makes OBIEE different? When Tech Support suggests AIX component upgrades to a stable, locked-down production AIX environment, it is common to get "push back".  "Why is this necessary?  We aren't we seeing issues with other software?"It's a fair question that I have often struggled to answer; here are the talking points: OBIEE is memory intensive.  It is the entire purpose of the software to trade memory for repetitive, more expensive database requests across a network. OBIEE is implemented in C++ and is very dependent on the C++ runtime to behave correctly. OBIEE is aggressively thread efficient;  if atomic operations on a particular architecture do not work correctly, the software crashes. OBIEE dynamically loads third-party database client libraries directly into the nqsserver process.  If the library is not thread-safe, or corrupts process memory the OBIEE crash happens in an unrelated part of the code.  These are extremely difficult bugs to find. OBIEE software uses 99% common source across multiple platforms:  Windows, Linux, AIX, Solaris and HPUX.  If a crash happens on only one platform, we begin to suspect other factors.  load intensity, system differences, configuration choices, hardware failures.  It is rare to have a single product require so many diverse technical skills.   My role in support is to understand system configurations, performance issues, and crashes.   An analyst trained in Business Analytics can't be expected to know AIX internals in the depth required to make configuration choices.  Here are some guidelines. AIX C++ Runtime must be at  version 11.1.0.4$ lslpp -L | grep xlC.aixobiee software will crash if xlC.aix.rte is downlevel;  this is not a "try it" suggestion.Nov 2011 11.1.0.4 version  is appropriate for all AIX versions ( 5, 6, 7 )Download from here:https://www-304.ibm.com/support/docview.wss?uid=swg24031426 No reboot is necessary to install, it can even be installed while applications are using the current version.Restart the apps, and they will pick up the latest version. AIX 5.3 Technology Level 12 is required when running on Power5,6,7 processorsAIX 6.1 was introduced with the newer Power chips, and we have seen no issues with 6.1 or 7.1 versions.Customers with an unstable deployment, dozens of unexplained crashes, became stable after the upgrade.If your AIX system is 5.3, the minimum TL level should be at or higher than this:$ oslevel -s  5300-12-03-1107IBM typically supports only the two latest versions of AIX ( 6.1 and 7.1, for example).  AIX 5.3 is still supported and popular running in an LPAR. obiee userid limits$ ulimit -Ha  ( hard limits )$ ulimit -a   ( default limits )core file size (blocks)     unlimiteddata seg size (kbytes)      unlimitedfile size (blocks)          unlimitedmax memory size (kbytes)    unlimitedopen files                  10240 cpu time (seconds)          unlimitedvirtual memory (kbytes)     unlimitedIt is best to establish the values in /etc/security/limitsroot user is needed to observe and modify this file.If you modify a limit, you will need to relog in to change it again.  For example,$ ulimit -c 0$ ulimit -c 2097151cannot modify limit: Operation not permitted$ ulimit -c unlimited$ ulimit -c0There are only two meaningful values for ulimit -c ; zero or unlimited.Anything else is likely to produce a truncated core file that cannot be analyzed. Deploy 32-bit or 64-bit ?Early versions of OBIEE offered 32-bit or 64-bit choice to AIX customers.The 32-bit choice was needed if a database vendor did not supply a 64-bit client library.That's no longer an issue and beginning with OBIEE 11, 32-bit code is no longer shipped.A common error that leads to "out of memory" conditions to to accept the 32-bit memory configuration choices on 64-bit deployments.  The significant configuration choices are: Maximum process data (heap) size is in an AIX environment variableLDR_CNTRL=IGNOREUNLOAD@LOADPUBLIC@PREREAD_SHLIB@MAXDATA=0x... Two thread stack sizes are made in obiee NQSConfig.INI[ SERVER ]SERVER_THREAD_STACK_SIZE = 0;DB_GATEWAY_THREAD_STACK_SIZE = 0; Sort memory in NQSConfig.INI[ GENERAL ]SORT_MEMORY_SIZE = 4 MB ;SORT_BUFFER_INCREMENT_SIZE = 256 KB ; Choosing a value for MAXDATA:0x080000000  2GB Default maximum 32-bit heap size ( 8 with 7 zeros )0x100000000  4GB 64-bit breaking even with 32-bit ( 1 with 8 zeros )0x200000000  8GB 64-bit double 32-bit max0x400000000 16GB 64-bit safetyUsing 2GB heap size for a 64-bit process will almost certainly lead to an out-of-memory situation.Registers are twice as big ... consume twice as much memory in the heap.Upgrading to a 4GB heap for a 64-bit process is just "breaking even" with 32-bit.A 32-bit process is constrained by the 32-bit virtual addressing limits.  Heap memory is used for dynamic requirements of obiee software, thread stacks for each of the configured threads, and sometimes for shared libraries. 64-bit processes are not constrained in this way;  extra heap space can be configured for safety against a query that might create a sudden requirement for excessive storage.  If the storage is not available, this query might crash the whole server and disrupt existing users.There is no performance penalty on AIX for configuring more memory than required;  extra memory can be configured for safety.  If there are no other considerations, start with 8GB.Choosing a value for Thread Stack size:zero is the value documented to select an appropriate default for thread stack size.  My preference is to change this to an absolute value, even if you intend to use the documented default;  it provides better documentation and removes the "surprise" factor.There are two thread types that can be configured. GATEWAY is used by a thread pool to call a database client library to establish a DB connection.The default size is 256KB;  many customers raise this to 512KB ( no performance penalty for over-configuring ). This value must be set to 1 MB if Teradata connections are used. SERVER threads are used to run queries.  OBIEE uses recursive algorithms during the analysis of query structures which can consume significant thread stack storage.  It's difficult to provide guidance on a value that depends on data and complexity.  The general notion is to provide more space than you think you need,  "double down" and increase the value if you run out, otherwise inspect the query to understand why it is too complex for the thread stack.  There are protections built into the software to abort a single user query that is too complex, but the algorithms don't cover all situations.256 KB  The default 32-bit stack size.  Many customers increased this to 512KB on 32-bit.  A 64-bit server is very likely to crash with this value;  the stack contains mostly register values, which are twice as big.512 KB  The documented 64-bit default.  Some early releases of obiee didn't set this correctly, resulting in 256KB stacks.1 MB  The recommended 64-bit setting.  If your system only ever uses 512KB of stack space, there is no performance penalty for using 1MB stack size.2 MB  Many large customers use this value for safety.  No performance penalty.nqscheduler does not use the NQSConfig.INI file to set thread stack size.If this process crashes because the thread stack is too small, use this to set 2MB:export OBI_BACKGROUND_STACK_SIZE=2048 Shared libraries are not (shared) When application libraries are loaded at run-time, AIX makes a decision on whether to load the libraries in a "public" memory segment.  If the filesystem library permissions do not have the "Read-Other" permission bit, AIX loads the library into private process memory with two significant side-effects:* The libraries reduce the heap storage available.      Might be significant in 32-bit processes;  irrelevant in 64-bit processes.* Library code is loaded into multiple real pages for execution;  one copy for each process.Multiple execution images is a significant issue for both 32- and 64-bit processes.The "real memory pages" saved by using public memory segments is a minor concern.  Today's machines typically have plenty of real memory.The real problem with private copies of libraries is that they consume processor cache blocks, which are limited.   The same library instructions executing in different real pages will cause memory delays as the i-cache ( instruction cache 128KB blocks) are refreshed from real memory.   Performance loss because instructions are delayed is something that is difficult to measure without access to low-level cache fault data.   The machine just appears to be running slowly for no observable reason.This is an easy problem to detect, and an easy problem to correct.Detection:  "genld -l" AIX command produces a list of the libraries used by each process and the AIX memory address where they are loaded.32-bit public segment is 13 ( "dxxxxxxx" ).   private segments are 2-a.64-bit public segment is 9 ( "9xxxxxxxxxxxxxxx") ; private segment is 8.genld -l | grep -v ' d| 9' | sort +2provides a list of privately loaded libraries. Repair: chmod o+r <libname>AIX shared libraries will have a suffix of ".so" or ".a".Another technique is to change all libraries in a selected directory to repair those that might not be currently loaded.   The usual directories that need repair are obiee code, httpd code and plugins, database client libraries and java.chmod o+r /shr/dir/*.a /shr/dir/*.so Configure your system for diagnosticsProduction systems shouldn't crash, and yet bad things happen to good software.If obiee software crashes and produces a core, you should configure your system for reliable transfer of the failing conditions to Oracle Tech Support.  Here's what we need to be able to diagnose a core file from your system.* fullcore enabled. chdev -lsys0 -a fullcore=true* core naming enabled. chcore -n on -d* ulimit must not truncate core. see item 3.* pstack.sh is used to capture core documentation.* obidoc is used to capture current AIX configuration.* snapcore  AIX utility captures core and libraries. Use the proper syntax. $ snapcore -r corename executable-fullpath   /tmp/snapcore will contain the .pax.Z output file.  It is compressed.* If cores are directed to a common directory, ensure obiee userid can write to the directory.  ( chcore -p /cores -d ; chmod 777 /cores )The filesystem must have sufficient space to hold a crashing obiee application.Use:  df -k  Check the "Free" column ( not "% Used" )  8388608 is 8GB. Disable Oracle Client Library signal handlingThe Oracle DB Client Library is frequently distributed with the sqlplus development kit.By default, the library enables a signal handler, which will document a call stack if the application crashes.   The signal handler is not needed, and definitely disruptive to obiee diagnostics.   It needs to be disabled.   sqlnet.ora is typically located at:   $ORACLE_HOME/network/admin/sqlnet.oraAdd this line at the top of the file:   DIAG_SIGHANDLER_ENABLED=FALSE Disable async query in the RPD connection pool.This might be an obiee 10.1.3.4 issue only ( still checking  )."async query" must be disabled in the connection pools.It was designed to enable query cancellation to a database, and turned out to have too many edge conditions in normal communication that produced random corruption of data and crashes.  Please ensure it is turned off in the RPD. Check AIX error report (errpt).Errors external to obiee applications can trigger crashes.  $ /bin/errpt -aHardware errors ( firmware, adapters, disks ) should be reported to IBM support.All application core files are recorded by AIX;  the most recent ones are listed first. Reserved for something important to say.

    Read the article

  • ?11gR2 RAC???ASM DISK Path????

    - by Liu Maclean(???)
    ????T.askmaclean.com???????11gR2?ASM DISK?????,??????: aix 6.1,grid 11.2.0.3+asm11.2.0.3+rac ???????????aix????????mpio,??diskgroup ?????veritas dmp???,?????asm?disk_strings=/dev/vx/rdmp/*,crs/asm??????????????/dev/vx/rdmp/?????,?????????diskgroup??? crs???????:2012-07-13 15:07:29.748: [ GPNP][1286]clsgpnp_profileCallUrlInt: [at clsgpnp.c:2108 clsgpnp_profileCallUrlInt] get-profile call to url “ipc://GPNPD_ggtest1? disco “” [f=0 claimed- host: cname: seq: auth:]2012-07-13 15:07:29.762: [ GPNP][1286]clsgpnp_profileCallUrlInt: [at clsgpnp.c:2236 clsgpnp_profileCallUrlInt] Result: (0) CLSGPNP_OK. Successful get-profile CALL to remote “ipc://GPNPD_ggtest1? disco “”2012-07-13 15:07:29.762: [ CSSD][1286]clssnmReadDiscoveryProfile: voting file discovery string(/dev/vx/rdmp/*)2012-07-13 15:07:29.762: [ CSSD][1286]clssnmvDDiscThread: using discovery string /dev/vx/rdmp/* for initial discovery2012-07-13 15:07:29.762: [ SKGFD][1286]Discovery with str:/dev/vx/rdmp/*: 2012-07-13 15:07:29.762: [ SKGFD][1286]UFS discovery with :/dev/vx/rdmp/*: 2012-07-13 15:07:29.769: [ SKGFD][1286]Fetching UFS disk :/dev/vx/rdmp/v_df8000_919: 2012-07-13 15:07:29.770: [ SKGFD][1286]Fetching UFS disk :/dev/vx/rdmp/v_df8000_212: 2012-07-13 15:07:29.770: [ SKGFD][1286]Fetching UFS disk :/dev/vx/rdmp/v_df8000_211: 2012-07-13 15:07:29.770: [ SKGFD][1286]Fetching UFS disk :/dev/vx/rdmp/v_df8000_210: 2012-07-13 15:07:29.770: [ SKGFD][1286]Fetching UFS disk :/dev/vx/rdmp/v_df8000_209: 2012-07-13 15:07:29.771: [ SKGFD][1286]Fetching UFS disk :/dev/vx/rdmp/v_df8000_181: 2012-07-13 15:07:29.771: [ SKGFD][1286]Fetching UFS disk :/dev/vx/rdmp/v_df8000_180: 2012-07-13 15:07:29.771: [ SKGFD][1286]Fetching UFS disk :/dev/vx/rdmp/disk_3: 2012-07-13 15:07:29.771: [ SKGFD][1286]Fetching UFS disk :/dev/vx/rdmp/disk_2: 2012-07-13 15:07:29.771: [ SKGFD][1286]Fetching UFS disk :/dev/vx/rdmp/disk_1: 2012-07-13 15:07:29.771: [ SKGFD][1286]Fetching UFS disk :/dev/vx/rdmp/disk_0: 2012-07-13 15:07:29.771: [ SKGFD][1286]OSS discovery with :/dev/vx/rdmp/*: 2012-07-13 15:07:29.771: [ SKGFD][1286]Handle 1115e7510 from lib :UFS:: for disk :/dev/vx/rdmp/v_df8000_916: 2012-07-13 15:07:29.772: [ SKGFD][1286]Handle 1118758b0 from lib :UFS:: for disk :/dev/vx/rdmp/v_df8000_912: 2012-07-13 15:07:29.773: [ SKGFD][1286]Handle 1118d9cf0 from lib :UFS:: for disk :/dev/vx/rdmp/v_df8000_908: 2012-07-13 15:07:29.773: [ SKGFD][1286]Handle 1118da450 from lib :UFS:: for disk :/dev/vx/rdmp/v_df8000_904: 2012-07-13 15:07:29.773: [ SKGFD][1286]Handle 1118dad70 from lib :UFS:: for disk :/dev/vx/rdmp/v_df8000_903: 2012-07-13 15:07:29.802: [ CLSF][1286]checksum failed for disk:/dev/vx/rdmp/v_df8000_916:2012-07-13 15:07:29.803: [ SKGFD][1286]Lib :UFS:: closing handle 1115e7510 for disk :/dev/vx/rdmp/v_df8000_916: 2012-07-13 15:07:29.803: [ SKGFD][1286]Lib :UFS:: closing handle 1118758b0 for disk :/dev/vx/rdmp/v_df8000_912: 2012-07-13 15:07:29.804: [ SKGFD][1286]Handle 1115e6710 from lib :UFS:: for disk :/dev/vx/rdmp/v_df8000_202: 2012-07-13 15:07:29.808: [ SKGFD][1286]Handle 1115e7030 from lib :UFS:: for disk :/dev/vx/rdmp/v_df8000_201: 2012-07-13 15:07:29.809: [ SKGFD][1286]Handle 1115e7ad0 from lib :UFS:: for disk :/dev/vx/rdmp/v_df8000_200: 2012-07-13 15:07:29.809: [ SKGFD][1286]Handle 1118733f0 from lib :UFS:: for disk :/dev/vx/rdmp/v_df8000_199: 2012-07-13 15:07:29.816: [ CLSF][1286]checksum failed for disk:/dev/vx/rdmp/v_df8000_186:2012-07-13 15:07:29.816: [ SKGFD][1286]Lib :UFS:: closing handle 1118de5d0 for disk :/dev/vx/rdmp/v_df8000_186: 2012-07-13 15:07:29.816: [ CSSD][1286]clssnmvDiskVerify: Successful discovery of 0 disks2012-07-13 15:07:29.816: [ CSSD][1286]clssnmCompleteInitVFDiscovery: Completing initial voting file discovery2012-07-13 15:07:29.816: [ CSSD][1286]clssnmvFindInitialConfigs: No voting files found2012-07-13 15:07:29.816: [ CSSD][1286](:CSSNM00070:)clssnmCompleteInitVFDiscovery: Voting file not found. Retrying discovery in 15 seconds2012-07-13 15:07:30.169: [ CSSD][1029]clssgmExecuteClientRequest(): type(37) size(80) only connect and exit messages are allowed before lease acquisition proc(1115e4870) client(0) ??????ASM DISK PATH???????,????11gR2 RAC+ASM????,??CRS??????,????crsctl start crs -excl -nocrs???????CSS???ASM??, ???????(clssnmCompleteInitVFDiscovery: Voting file not found),????Voteing file????????????????? ?????????,???????11gR2 RAC+ASM??ASM DISK??: 1.?????????ASM DISK?????,??????UDEV????????,???UDEV????ASM DISK?/dev/asm-disk* ??? /dev/rasm-disk*???, ??????udev rule??????: [grid@maclean1 ~]$ export ORACLE_HOME=/g01/grid/app/11.2.0/grid [grid@maclean1 ~]$ /g01/grid/app/11.2.0/grid/bin/sqlplus / as sysasm SQL*Plus: Release 11.2.0.3.0 Production on Sun Jul 15 04:09:28 2012 Copyright (c) 1982, 2011, Oracle. All rights reserved. Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production With the Real Application Clusters and Automatic Storage Management options SQL> show parameter diskstri NAME TYPE VALUE ------------------------------------ ----------- ------------------------------ asm_diskstring string /dev/asm* ??????ASM?????asm_diskstring ?/dev/asm*, ???root????UDEV RULE?? : [root@maclean1 rules.d]# cp 99-oracle-asmdevices.rules 99-oracle-asmdevices.rules.bak [root@maclean1 rules.d]# vi 99-oracle-asmdevices.rules [root@maclean1 rules.d]# cat 99-oracle-asmdevices.rules KERNEL=="sd*", BUS=="scsi", PROGRAM=="/sbin/scsi_id -g -u -s %p", RESULT=="SATA_VBOX_HARDDISK_VB09cadb31-cfbea255_", NAME="rasm-diskb", OWNER="grid", GROUP="asmadmin", MODE="0660" KERNEL=="sd*", BUS=="scsi", PROGRAM=="/sbin/scsi_id -g -u -s %p", RESULT=="SATA_VBOX_HARDDISK_VB5f097069-59efb82f_", NAME="rasm-diskc", OWNER="grid", GROUP="asmadmin", MODE="0660" KERNEL=="sd*", BUS=="scsi", PROGRAM=="/sbin/scsi_id -g -u -s %p", RESULT=="SATA_VBOX_HARDDISK_VB4e1a81c0-20478bc4_", NAME="rasm-diskd", OWNER="grid", GROUP="asmadmin", MODE="0660" KERNEL=="sd*", BUS=="scsi", PROGRAM=="/sbin/scsi_id -g -u -s %p", RESULT=="SATA_VBOX_HARDDISK_VBdcce9285-b13c5a27_", NAME="rasm-diske", OWNER="grid", GROUP="asmadmin", MODE="0660" KERNEL=="sd*", BUS=="scsi", PROGRAM=="/sbin/scsi_id -g -u -s %p", RESULT=="SATA_VBOX_HARDDISK_VB82effe1a-dbca7dff_", NAME="rasm-diskf", OWNER="grid", GROUP="asmadmin", MODE="0660" KERNEL=="sd*", BUS=="scsi", PROGRAM=="/sbin/scsi_id -g -u -s %p", RESULT=="SATA_VBOX_HARDDISK_VB950d279f-c581cb51_", NAME="rasm-diskg", OWNER="grid", GROUP="asmadmin", MODE="0660" KERNEL=="sd*", BUS=="scsi", PROGRAM=="/sbin/scsi_id -g -u -s %p", RESULT=="SATA_VBOX_HARDDISK_VB14400d81-651672d7_", NAME="rasm-diskh", OWNER="grid", GROUP="asmadmin", MODE="0660" KERNEL=="sd*", BUS=="scsi", PROGRAM=="/sbin/scsi_id -g -u -s %p", RESULT=="SATA_VBOX_HARDDISK_VB31b1237b-78aa22bb_", NAME="rasm-diski", OWNER="grid", GROUP="asmadmin", MODE="0660" ???????99-oracle-asmdevices.rules?UDEV RULE????,??????????/dev/rasm-disk*???,??????ASM DISK???, ????????????????RAC CRS??????? ??????votedisk?ocr ????: [root@maclean1 rules.d]# /g01/grid/app/11.2.0/grid/bin/crsctl query css votedisk ## STATE File Universal Id File Name Disk group -- ----- ----------------- --------- --------- 1. ONLINE 6896bfc3d1464f9fbf0ea9df87e023ad (/dev/asm-diskb) [SYSTEMDG] 2. ONLINE 58eb81b656084ff2bfd315d9badd08b7 (/dev/asm-diskc) [SYSTEMDG] 3. ONLINE 6bf7324625c54f3abf2c942b1e7f70d9 (/dev/asm-diskd) [SYSTEMDG] 4. ONLINE 43ad8ae20c354f5ebf7083bc30bf94cc (/dev/asm-diske) [SYSTEMDG] 5. ONLINE 4c225359d51b4f93bfba01080664b3d7 (/dev/asm-diskf) [SYSTEMDG] Located 5 voting disk(s). [root@maclean1 rules.d]# /g01/grid/app/11.2.0/grid/bin/ocrcheck Status of Oracle Cluster Registry is as follows : Version : 3 Total space (kbytes) : 262120 Used space (kbytes) : 2844 Available space (kbytes) : 259276 ID : 879001605 Device/File Name : +SYSTEMDG Device/File integrity check succeeded Device/File not configured Device/File not configured Device/File not configured Device/File not configured Cluster registry integrity check succeeded Logical corruption check succeeded ??votedisk file?????????ASM DISK,?????????crsctl replace votedisk, ??????LINUX OS: [root@maclean1 rules.d]# init 6 rebooting ............ [root@maclean1 dev]# ls -l *asm* brw-rw---- 1 grid asmadmin 8, 16 Jul 15 04:15 rasm-diskb brw-rw---- 1 grid asmadmin 8, 32 Jul 15 04:15 rasm-diskc brw-rw---- 1 grid asmadmin 8, 48 Jul 15 04:15 rasm-diskd brw-rw---- 1 grid asmadmin 8, 64 Jul 15 04:15 rasm-diske brw-rw---- 1 grid asmadmin 8, 80 Jul 15 04:15 rasm-diskf brw-rw---- 1 grid asmadmin 8, 96 Jul 15 04:15 rasm-diskg brw-rw---- 1 grid asmadmin 8, 112 Jul 15 04:15 rasm-diskh brw-rw---- 1 grid asmadmin 8, 128 Jul 15 04:15 rasm-diski ??????????/dev/rasm-disk*?ASM DISK,??ASM??????css?????/dev/asm*?????ASM DISK,??????????????ASM DISK: more /g01/grid/app/11.2.0/grid/log/maclean1/cssd/ocssd.log 2012-07-15 04:17:45.208: [ SKGFD][1099548992]Discovery with str:/dev/asm*: 2012-07-15 04:17:45.208: [ SKGFD][1099548992]UFS discovery with :/dev/asm*: 2012-07-15 04:17:45.208: [ SKGFD][1099548992]OSS discovery with :/dev/asm*: 2012-07-15 04:17:45.208: [ CSSD][1099548992]clssnmvDiskVerify: Successful discovery of 0 disks 2012-07-15 04:17:45.208: [ CSSD][1099548992]clssnmCompleteInitVFDiscovery: Completing initial voting file discovery 2012-07-15 04:17:45.208: [ CSSD][1099548992]clssnmvFindInitialConfigs: No voting files found 2012-07-15 04:17:45.208: [ CSSD][1099548992](:CSSNM00070:)clssnmCompleteInitVFDiscovery: Voting file not found. Retrying discovery in 15 seconds 2012-07-15 04:17:45.251: [ CSSD][1096661312]clssgmExecuteClientRequest(): type(37) size(80) only connect and exit messages are allowed before lease acquisition proc(0x26a8ba0) client((nil)) 2012-07-15 04:17:45.251: [ CSSD][1096661312]clssgmDeadProc: proc 0x26a8ba0 2012-07-15 04:17:45.251: [ CSSD][1096661312]clssgmDestroyProc: cleaning up proc(0x26a8ba0) con(0xfe6) skgpid ospid 3751 with 0 clients, refcount 0 2012-07-15 04:17:45.252: [ CSSD][1096661312]clssgmDiscEndpcl: gipcDestroy 0xfe6 2012-07-15 04:17:45.829: [ CSSD][1096661312]clssscSelect: cookie accept request 0x2318ea0 2012-07-15 04:17:45.829: [ CSSD][1096661312]clssgmAllocProc: (0x2659480) allocated 2012-07-15 04:17:45.830: [ CSSD][1096661312]clssgmClientConnectMsg: properties of cmProc 0x2659480 - 1,2,3,4,5 2012-07-15 04:17:45.830: [ CSSD][1096661312]clssgmClientConnectMsg: Connect from con(0x114e) proc(0x2659480) pid(3751) version 11:2:1:4, properties: 1,2,3,4,5 2012-07-15 04:17:45.830: [ CSSD][1096661312]clssgmClientConnectMsg: msg flags 0x0000 2012-07-15 04:17:45.939: [ CSSD][1096661312]clssscSelect: cookie accept request 0x253ddd0 2012-07-15 04:17:45.939: [ CSSD][1096661312]clssscevtypSHRCON: getting client with cmproc 0x253ddd0 2012-07-15 04:17:45.939: [ CSSD][1096661312]clssgmRegisterClient: proc(3/0x253ddd0), client(61/0x26877b0) 2012-07-15 04:17:45.939: [ CSSD][1096661312]clssgmExecuteClientRequest(): type(6) size(684) only connect and exit messages are  allowed before lease acquisition proc(0x253ddd0) client(0x26877b0) 2012-07-15 04:17:45.939: [ CSSD][1096661312]clssgmDiscEndpcl: gipcDestroy 0x1174 2012-07-15 04:17:46.070: [ CSSD][1096661312]clssscSelect: cookie accept request 0x26368a0 2012-07-15 04:17:46.070: [ CSSD][1096661312]clssscevtypSHRCON: getting client with cmproc 0x26368a0 2012-07-15 04:17:46.070: [ CSSD][1096661312]clssgmRegisterClient: proc(5/0x26368a0), client(50/0x26877b0) ??11gR2?CRS?????ASM,??ocr???ASM?,??ASM???????,???CRS?????????: [root@maclean1 ~]# crsctl check has CRS-4638: Oracle High Availability Services is online [root@maclean1 ~]# crsctl check crs CRS-4638: Oracle High Availability Services is online CRS-4535: Cannot communicate with Cluster Ready Services CRS-4530: Communications failure contacting Cluster Synchronization Services daemon CRS-4534: Cannot communicate with Event Manager 2. ?????ASM DISK PATH???????,?????????????CRS: ??????OHASD??: [root@maclean1 ~]# crsctl stop has -f CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'maclean1' CRS-2673: Attempting to stop 'ora.mdnsd' on 'maclean1' CRS-2673: Attempting to stop 'ora.crf' on 'maclean1' CRS-2677: Stop of 'ora.mdnsd' on 'maclean1' succeeded CRS-2677: Stop of 'ora.crf' on 'maclean1' succeeded CRS-2673: Attempting to stop 'ora.gipcd' on 'maclean1' CRS-2677: Stop of 'ora.gipcd' on 'maclean1' succeeded CRS-2673: Attempting to stop 'ora.gpnpd' on 'maclean1' CRS-2677: Stop of 'ora.gpnpd' on 'maclean1' succeeded CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'maclean1' has completed CRS-4133: Oracle High Availability Services has been stopped. 3. ?-excl -nocrs????CRS,?????ASM ???????CRS??: [root@maclean1 ~]# crsctl start crs -excl -nocrs  CRS-4123: Oracle High Availability Services has been started. CRS-2672: Attempting to start 'ora.mdnsd' on 'maclean1' CRS-2676: Start of 'ora.mdnsd' on 'maclean1' succeeded CRS-2672: Attempting to start 'ora.gpnpd' on 'maclean1' CRS-2676: Start of 'ora.gpnpd' on 'maclean1' succeeded CRS-2672: Attempting to start 'ora.cssdmonitor' on 'maclean1' CRS-2672: Attempting to start 'ora.gipcd' on 'maclean1' CRS-2676: Start of 'ora.cssdmonitor' on 'maclean1' succeeded CRS-2676: Start of 'ora.gipcd' on 'maclean1' succeeded CRS-2672: Attempting to start 'ora.cssd' on 'maclean1' CRS-2672: Attempting to start 'ora.diskmon' on 'maclean1' CRS-2676: Start of 'ora.diskmon' on 'maclean1' succeeded CRS-2676: Start of 'ora.cssd' on 'maclean1' succeeded CRS-2679: Attempting to clean 'ora.cluster_interconnect.haip' on 'maclean1' CRS-2672: Attempting to start 'ora.ctssd' on 'maclean1' CRS-2681: Clean of 'ora.cluster_interconnect.haip' on 'maclean1' succeeded CRS-2672: Attempting to start 'ora.cluster_interconnect.haip' on 'maclean1' CRS-2676: Start of 'ora.ctssd' on 'maclean1' succeeded CRS-2676: Start of 'ora.cluster_interconnect.haip' on 'maclean1' succeeded CRS-2672: Attempting to start 'ora.asm' on 'maclean1' CRS-2676: Start of 'ora.asm' on 'maclean1' succeeded #??????CRS_HOME???ORACLE_BASE?777??,??????? [root@maclean1 ~]# chmod 777 /g01 4.??ASM???disk_strings????ASM DISK PATH??: [root@maclean1 ~]# su - grid [grid@maclean1 ~]$ sqlplus / as sysasm SQL*Plus: Release 11.2.0.3.0 Production on Sun Jul 15 04:40:40 2012 Copyright (c) 1982, 2011, Oracle. All rights reserved. Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production With the Real Application Clusters and Automatic Storage Management options SQL> alter system set asm_diskstring='/dev/rasm*'; System altered. SQL> alter diskgroup systemdg mount; Diskgroup altered. SQL> create spfile from memory; File created. SQL> startup force mount; ORA-32004: obsolete or deprecated parameter(s) specified for ASM instance ASM instance started Total System Global Area 283930624 bytes Fixed Size 2227664 bytes Variable Size 256537136 bytes ASM Cache 25165824 bytes ASM diskgroups mounted SQL> show parameter spfile NAME TYPE VALUE ------------------------------------ ----------- ------------------------------ spfile string /g01/grid/app/11.2.0/grid/dbs/ spfile+ASM1.ora SQL> show parameter disk NAME TYPE VALUE ------------------------------------ ----------- ------------------------------ asm_diskgroups string SYSTEMDG asm_diskstring string /dev/rasm* SQL> create pfile from spfile; File created. SQL> create spfile='+SYSTEMDG' from pfile; File created. SQL> startup force; ORA-32004: obsolete or deprecated parameter(s) specified for ASM instance ASM instance started Total System Global Area 283930624 bytes Fixed Size 2227664 bytes Variable Size 256537136 bytes ASM Cache 25165824 bytes ASM diskgroups mounted SQL> show parameter spfile NAME TYPE VALUE ------------------------------------ ----------- ------------------------------ spfile string +SYSTEMDG/maclean-cluster/asmp arameterfile/registry.253.7886 82933 ???????asm_diskstring ,????ASM DISKGROUP??SPFILE , ??ASM?????SPFILE?????????????????? 5. crsctl replace votedisk ???votedisk????: [root@maclean1 ~]# crsctl replace votedisk +systemdg Successful addition of voting disk 864a00efcfbe4f42bfd0f4f6b60472a0. Successful addition of voting disk ab14d6e727614f29bf53b9870052a5c8. Successful addition of voting disk 754c03c168854f46bf2daee7287bf260. Successful addition of voting disk 9ed58f37f3e84f28bfcd9b101f2af9f3. Successful addition of voting disk 4ce7b7c682364f12bf4df5ce1fb7814e. Successfully replaced voting disk group with +systemdg. CRS-4266: Voting file(s) successfully replaced [root@maclean1 ~]# crsctl query css votedisk ## STATE File Universal Id File Name Disk group -- ----- ----------------- --------- --------- 1. ONLINE 864a00efcfbe4f42bfd0f4f6b60472a0 (/dev/rasm-diskb) [SYSTEMDG] 2. ONLINE ab14d6e727614f29bf53b9870052a5c8 (/dev/rasm-diskc) [SYSTEMDG] 3. ONLINE 754c03c168854f46bf2daee7287bf260 (/dev/rasm-diskd) [SYSTEMDG] 4. ONLINE 9ed58f37f3e84f28bfcd9b101f2af9f3 (/dev/rasm-diske) [SYSTEMDG] 5. ONLINE 4ce7b7c682364f12bf4df5ce1fb7814e (/dev/rasm-diskf) [SYSTEMDG] Located 5 voting disk(s). [root@maclean1 ~]# ocrcheck Status of Oracle Cluster Registry is as follows : Version : 3 Total space (kbytes) : 262120 Used space (kbytes) : 2844 Available space (kbytes) : 259276 ID : 879001605 Device/File Name : +SYSTEMDG Device/File integrity check succeeded Device/File not configured Device/File not configured Device/File not configured Device/File not configured Cluster registry integrity check succeeded Logical corruption check succeeded ??replace?votedisk??? ASM DISK?,???votedisk?OCR??????? 6.??CRS??: [root@maclean1 ~]# crsctl stop crs CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'maclean1' CRS-2673: Attempting to stop 'ora.mdnsd' on 'maclean1' CRS-2673: Attempting to stop 'ora.ctssd' on 'maclean1' CRS-2673: Attempting to stop 'ora.asm' on 'maclean1' CRS-2677: Stop of 'ora.mdnsd' on 'maclean1' succeeded CRS-2677: Stop of 'ora.asm' on 'maclean1' succeeded CRS-2673: Attempting to stop 'ora.cluster_interconnect.haip' on 'maclean1' CRS-2677: Stop of 'ora.ctssd' on 'maclean1' succeeded CRS-2677: Stop of 'ora.cluster_interconnect.haip' on 'maclean1' succeeded CRS-2673: Attempting to stop 'ora.cssd' on 'maclean1' CRS-2677: Stop of 'ora.cssd' on 'maclean1' succeeded CRS-2673: Attempting to stop 'ora.gipcd' on 'maclean1' CRS-2677: Stop of 'ora.gipcd' on 'maclean1' succeeded CRS-2673: Attempting to stop 'ora.gpnpd' on 'maclean1' CRS-2677: Stop of 'ora.gpnpd' on 'maclean1' succeeded CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'maclean1' has completed CRS-4133: Oracle High Availability Services has been stopped. [root@maclean1 ~]# crsctl stat res -t -------------------------------------------------------------------------------- NAME TARGET STATE SERVER STATE_DETAILS -------------------------------------------------------------------------------- Local Resources -------------------------------------------------------------------------------- ora.BACKUPDG.dg ONLINE ONLINE maclean1 ora.DATA.dg ONLINE ONLINE maclean1 ora.LISTENER.lsnr ONLINE ONLINE maclean1 ora.SYSTEMDG.dg ONLINE ONLINE maclean1 ora.asm ONLINE ONLINE maclean1 Started ora.gsd OFFLINE OFFLINE maclean1 ora.net1.network ONLINE ONLINE maclean1 ora.ons ONLINE ONLINE maclean1 -------------------------------------------------------------------------------- Cluster Resources -------------------------------------------------------------------------------- ora.LISTENER_SCAN1.lsnr 1 ONLINE ONLINE maclean1 ora.cvu 1 ONLINE ONLINE maclean1 ora.maclean1.vip 1 ONLINE ONLINE maclean1 ora.maclean2.vip 1 ONLINE INTERMEDIATE maclean1 FAILED OVER ora.oc4j 1 ONLINE OFFLINE STARTING ora.prod.db 1 ONLINE OFFLINE Instance Shutdown,S TARTING 2 ONLINE OFFLINE ora.scan1.vip 1 ONLINE ONLINE maclean1 ???????ASM?????SPFILE,???????????????,?????CRS??????? ??11gR2 RAC+ASM?????????,????????????????ASM DISK PATH??????????

    Read the article

  • ?11gR2 RAC???ASM DISK Path????

    - by Liu Maclean(???)
    ????T.askmaclean.com???????11gR2?ASM DISK?????,??????: aix 6.1,grid 11.2.0.3+asm11.2.0.3+rac ???????????aix????????mpio,??diskgroup ?????veritas dmp???,?????asm?disk_strings=/dev/vx/rdmp/*,crs/asm??????????????/dev/vx/rdmp/?????,?????????diskgroup??? crs???????:2012-07-13 15:07:29.748: [ GPNP][1286]clsgpnp_profileCallUrlInt: [at clsgpnp.c:2108 clsgpnp_profileCallUrlInt] get-profile call to url “ipc://GPNPD_ggtest1? disco “” [f=0 claimed- host: cname: seq: auth:]2012-07-13 15:07:29.762: [ GPNP][1286]clsgpnp_profileCallUrlInt: [at clsgpnp.c:2236 clsgpnp_profileCallUrlInt] Result: (0) CLSGPNP_OK. Successful get-profile CALL to remote “ipc://GPNPD_ggtest1? disco “”2012-07-13 15:07:29.762: [ CSSD][1286]clssnmReadDiscoveryProfile: voting file discovery string(/dev/vx/rdmp/*)2012-07-13 15:07:29.762: [ CSSD][1286]clssnmvDDiscThread: using discovery string /dev/vx/rdmp/* for initial discovery2012-07-13 15:07:29.762: [ SKGFD][1286]Discovery with str:/dev/vx/rdmp/*: 2012-07-13 15:07:29.762: [ SKGFD][1286]UFS discovery with :/dev/vx/rdmp/*: 2012-07-13 15:07:29.769: [ SKGFD][1286]Fetching UFS disk :/dev/vx/rdmp/v_df8000_919: 2012-07-13 15:07:29.770: [ SKGFD][1286]Fetching UFS disk :/dev/vx/rdmp/v_df8000_212: 2012-07-13 15:07:29.770: [ SKGFD][1286]Fetching UFS disk :/dev/vx/rdmp/v_df8000_211: 2012-07-13 15:07:29.770: [ SKGFD][1286]Fetching UFS disk :/dev/vx/rdmp/v_df8000_210: 2012-07-13 15:07:29.770: [ SKGFD][1286]Fetching UFS disk :/dev/vx/rdmp/v_df8000_209: 2012-07-13 15:07:29.771: [ SKGFD][1286]Fetching UFS disk :/dev/vx/rdmp/v_df8000_181: 2012-07-13 15:07:29.771: [ SKGFD][1286]Fetching UFS disk :/dev/vx/rdmp/v_df8000_180: 2012-07-13 15:07:29.771: [ SKGFD][1286]Fetching UFS disk :/dev/vx/rdmp/disk_3: 2012-07-13 15:07:29.771: [ SKGFD][1286]Fetching UFS disk :/dev/vx/rdmp/disk_2: 2012-07-13 15:07:29.771: [ SKGFD][1286]Fetching UFS disk :/dev/vx/rdmp/disk_1: 2012-07-13 15:07:29.771: [ SKGFD][1286]Fetching UFS disk :/dev/vx/rdmp/disk_0: 2012-07-13 15:07:29.771: [ SKGFD][1286]OSS discovery with :/dev/vx/rdmp/*: 2012-07-13 15:07:29.771: [ SKGFD][1286]Handle 1115e7510 from lib :UFS:: for disk :/dev/vx/rdmp/v_df8000_916: 2012-07-13 15:07:29.772: [ SKGFD][1286]Handle 1118758b0 from lib :UFS:: for disk :/dev/vx/rdmp/v_df8000_912: 2012-07-13 15:07:29.773: [ SKGFD][1286]Handle 1118d9cf0 from lib :UFS:: for disk :/dev/vx/rdmp/v_df8000_908: 2012-07-13 15:07:29.773: [ SKGFD][1286]Handle 1118da450 from lib :UFS:: for disk :/dev/vx/rdmp/v_df8000_904: 2012-07-13 15:07:29.773: [ SKGFD][1286]Handle 1118dad70 from lib :UFS:: for disk :/dev/vx/rdmp/v_df8000_903: 2012-07-13 15:07:29.802: [ CLSF][1286]checksum failed for disk:/dev/vx/rdmp/v_df8000_916:2012-07-13 15:07:29.803: [ SKGFD][1286]Lib :UFS:: closing handle 1115e7510 for disk :/dev/vx/rdmp/v_df8000_916: 2012-07-13 15:07:29.803: [ SKGFD][1286]Lib :UFS:: closing handle 1118758b0 for disk :/dev/vx/rdmp/v_df8000_912: 2012-07-13 15:07:29.804: [ SKGFD][1286]Handle 1115e6710 from lib :UFS:: for disk :/dev/vx/rdmp/v_df8000_202: 2012-07-13 15:07:29.808: [ SKGFD][1286]Handle 1115e7030 from lib :UFS:: for disk :/dev/vx/rdmp/v_df8000_201: 2012-07-13 15:07:29.809: [ SKGFD][1286]Handle 1115e7ad0 from lib :UFS:: for disk :/dev/vx/rdmp/v_df8000_200: 2012-07-13 15:07:29.809: [ SKGFD][1286]Handle 1118733f0 from lib :UFS:: for disk :/dev/vx/rdmp/v_df8000_199: 2012-07-13 15:07:29.816: [ CLSF][1286]checksum failed for disk:/dev/vx/rdmp/v_df8000_186:2012-07-13 15:07:29.816: [ SKGFD][1286]Lib :UFS:: closing handle 1118de5d0 for disk :/dev/vx/rdmp/v_df8000_186: 2012-07-13 15:07:29.816: [ CSSD][1286]clssnmvDiskVerify: Successful discovery of 0 disks2012-07-13 15:07:29.816: [ CSSD][1286]clssnmCompleteInitVFDiscovery: Completing initial voting file discovery2012-07-13 15:07:29.816: [ CSSD][1286]clssnmvFindInitialConfigs: No voting files found2012-07-13 15:07:29.816: [ CSSD][1286](:CSSNM00070:)clssnmCompleteInitVFDiscovery: Voting file not found. Retrying discovery in 15 seconds2012-07-13 15:07:30.169: [ CSSD][1029]clssgmExecuteClientRequest(): type(37) size(80) only connect and exit messages are allowed before lease acquisition proc(1115e4870) client(0) ??????ASM DISK PATH???????,????11gR2 RAC+ASM????,??CRS??????,????crsctl start crs -excl -nocrs???????CSS???ASM??, ???????(clssnmCompleteInitVFDiscovery: Voting file not found),????Voteing file????????????????? ?????????,???????11gR2 RAC+ASM??ASM DISK??: 1.?????????ASM DISK?????,??????UDEV????????,???UDEV????ASM DISK?/dev/asm-disk* ??? /dev/rasm-disk*???, ??????udev rule??????: [grid@maclean1 ~]$ export ORACLE_HOME=/g01/grid/app/11.2.0/grid [grid@maclean1 ~]$ /g01/grid/app/11.2.0/grid/bin/sqlplus / as sysasm SQL*Plus: Release 11.2.0.3.0 Production on Sun Jul 15 04:09:28 2012 Copyright (c) 1982, 2011, Oracle. All rights reserved. Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production With the Real Application Clusters and Automatic Storage Management options SQL> show parameter diskstri NAME TYPE VALUE ------------------------------------ ----------- ------------------------------ asm_diskstring string /dev/asm* ??????ASM?????asm_diskstring ?/dev/asm*, ???root????UDEV RULE?? : [root@maclean1 rules.d]# cp 99-oracle-asmdevices.rules 99-oracle-asmdevices.rules.bak [root@maclean1 rules.d]# vi 99-oracle-asmdevices.rules [root@maclean1 rules.d]# cat 99-oracle-asmdevices.rules KERNEL=="sd*", BUS=="scsi", PROGRAM=="/sbin/scsi_id -g -u -s %p", RESULT=="SATA_VBOX_HARDDISK_VB09cadb31-cfbea255_", NAME="rasm-diskb", OWNER="grid", GROUP="asmadmin", MODE="0660" KERNEL=="sd*", BUS=="scsi", PROGRAM=="/sbin/scsi_id -g -u -s %p", RESULT=="SATA_VBOX_HARDDISK_VB5f097069-59efb82f_", NAME="rasm-diskc", OWNER="grid", GROUP="asmadmin", MODE="0660" KERNEL=="sd*", BUS=="scsi", PROGRAM=="/sbin/scsi_id -g -u -s %p", RESULT=="SATA_VBOX_HARDDISK_VB4e1a81c0-20478bc4_", NAME="rasm-diskd", OWNER="grid", GROUP="asmadmin", MODE="0660" KERNEL=="sd*", BUS=="scsi", PROGRAM=="/sbin/scsi_id -g -u -s %p", RESULT=="SATA_VBOX_HARDDISK_VBdcce9285-b13c5a27_", NAME="rasm-diske", OWNER="grid", GROUP="asmadmin", MODE="0660" KERNEL=="sd*", BUS=="scsi", PROGRAM=="/sbin/scsi_id -g -u -s %p", RESULT=="SATA_VBOX_HARDDISK_VB82effe1a-dbca7dff_", NAME="rasm-diskf", OWNER="grid", GROUP="asmadmin", MODE="0660" KERNEL=="sd*", BUS=="scsi", PROGRAM=="/sbin/scsi_id -g -u -s %p", RESULT=="SATA_VBOX_HARDDISK_VB950d279f-c581cb51_", NAME="rasm-diskg", OWNER="grid", GROUP="asmadmin", MODE="0660" KERNEL=="sd*", BUS=="scsi", PROGRAM=="/sbin/scsi_id -g -u -s %p", RESULT=="SATA_VBOX_HARDDISK_VB14400d81-651672d7_", NAME="rasm-diskh", OWNER="grid", GROUP="asmadmin", MODE="0660" KERNEL=="sd*", BUS=="scsi", PROGRAM=="/sbin/scsi_id -g -u -s %p", RESULT=="SATA_VBOX_HARDDISK_VB31b1237b-78aa22bb_", NAME="rasm-diski", OWNER="grid", GROUP="asmadmin", MODE="0660" ???????99-oracle-asmdevices.rules?UDEV RULE????,??????????/dev/rasm-disk*???,??????ASM DISK???, ????????????????RAC CRS??????? ??????votedisk?ocr ????: [root@maclean1 rules.d]# /g01/grid/app/11.2.0/grid/bin/crsctl query css votedisk ## STATE File Universal Id File Name Disk group -- ----- ----------------- --------- --------- 1. ONLINE 6896bfc3d1464f9fbf0ea9df87e023ad (/dev/asm-diskb) [SYSTEMDG] 2. ONLINE 58eb81b656084ff2bfd315d9badd08b7 (/dev/asm-diskc) [SYSTEMDG] 3. ONLINE 6bf7324625c54f3abf2c942b1e7f70d9 (/dev/asm-diskd) [SYSTEMDG] 4. ONLINE 43ad8ae20c354f5ebf7083bc30bf94cc (/dev/asm-diske) [SYSTEMDG] 5. ONLINE 4c225359d51b4f93bfba01080664b3d7 (/dev/asm-diskf) [SYSTEMDG] Located 5 voting disk(s). [root@maclean1 rules.d]# /g01/grid/app/11.2.0/grid/bin/ocrcheck Status of Oracle Cluster Registry is as follows : Version : 3 Total space (kbytes) : 262120 Used space (kbytes) : 2844 Available space (kbytes) : 259276 ID : 879001605 Device/File Name : +SYSTEMDG Device/File integrity check succeeded Device/File not configured Device/File not configured Device/File not configured Device/File not configured Cluster registry integrity check succeeded Logical corruption check succeeded ??votedisk file?????????ASM DISK,?????????crsctl replace votedisk, ??????LINUX OS: [root@maclean1 rules.d]# init 6 rebooting ............ [root@maclean1 dev]# ls -l *asm* brw-rw---- 1 grid asmadmin 8, 16 Jul 15 04:15 rasm-diskb brw-rw---- 1 grid asmadmin 8, 32 Jul 15 04:15 rasm-diskc brw-rw---- 1 grid asmadmin 8, 48 Jul 15 04:15 rasm-diskd brw-rw---- 1 grid asmadmin 8, 64 Jul 15 04:15 rasm-diske brw-rw---- 1 grid asmadmin 8, 80 Jul 15 04:15 rasm-diskf brw-rw---- 1 grid asmadmin 8, 96 Jul 15 04:15 rasm-diskg brw-rw---- 1 grid asmadmin 8, 112 Jul 15 04:15 rasm-diskh brw-rw---- 1 grid asmadmin 8, 128 Jul 15 04:15 rasm-diski ??????????/dev/rasm-disk*?ASM DISK,??ASM??????css?????/dev/asm*?????ASM DISK,??????????????ASM DISK: more /g01/grid/app/11.2.0/grid/log/maclean1/cssd/ocssd.log 2012-07-15 04:17:45.208: [ SKGFD][1099548992]Discovery with str:/dev/asm*: 2012-07-15 04:17:45.208: [ SKGFD][1099548992]UFS discovery with :/dev/asm*: 2012-07-15 04:17:45.208: [ SKGFD][1099548992]OSS discovery with :/dev/asm*: 2012-07-15 04:17:45.208: [ CSSD][1099548992]clssnmvDiskVerify: Successful discovery of 0 disks 2012-07-15 04:17:45.208: [ CSSD][1099548992]clssnmCompleteInitVFDiscovery: Completing initial voting file discovery 2012-07-15 04:17:45.208: [ CSSD][1099548992]clssnmvFindInitialConfigs: No voting files found 2012-07-15 04:17:45.208: [ CSSD][1099548992](:CSSNM00070:)clssnmCompleteInitVFDiscovery: Voting file not found. Retrying discovery in 15 seconds 2012-07-15 04:17:45.251: [ CSSD][1096661312]clssgmExecuteClientRequest(): type(37) size(80) only connect and exit messages are allowed before lease acquisition proc(0x26a8ba0) client((nil)) 2012-07-15 04:17:45.251: [ CSSD][1096661312]clssgmDeadProc: proc 0x26a8ba0 2012-07-15 04:17:45.251: [ CSSD][1096661312]clssgmDestroyProc: cleaning up proc(0x26a8ba0) con(0xfe6) skgpid ospid 3751 with 0 clients, refcount 0 2012-07-15 04:17:45.252: [ CSSD][1096661312]clssgmDiscEndpcl: gipcDestroy 0xfe6 2012-07-15 04:17:45.829: [ CSSD][1096661312]clssscSelect: cookie accept request 0x2318ea0 2012-07-15 04:17:45.829: [ CSSD][1096661312]clssgmAllocProc: (0x2659480) allocated 2012-07-15 04:17:45.830: [ CSSD][1096661312]clssgmClientConnectMsg: properties of cmProc 0x2659480 - 1,2,3,4,5 2012-07-15 04:17:45.830: [ CSSD][1096661312]clssgmClientConnectMsg: Connect from con(0x114e) proc(0x2659480) pid(3751) version 11:2:1:4, properties: 1,2,3,4,5 2012-07-15 04:17:45.830: [ CSSD][1096661312]clssgmClientConnectMsg: msg flags 0x0000 2012-07-15 04:17:45.939: [ CSSD][1096661312]clssscSelect: cookie accept request 0x253ddd0 2012-07-15 04:17:45.939: [ CSSD][1096661312]clssscevtypSHRCON: getting client with cmproc 0x253ddd0 2012-07-15 04:17:45.939: [ CSSD][1096661312]clssgmRegisterClient: proc(3/0x253ddd0), client(61/0x26877b0) 2012-07-15 04:17:45.939: [ CSSD][1096661312]clssgmExecuteClientRequest(): type(6) size(684) only connect and exit messages are  allowed before lease acquisition proc(0x253ddd0) client(0x26877b0) 2012-07-15 04:17:45.939: [ CSSD][1096661312]clssgmDiscEndpcl: gipcDestroy 0x1174 2012-07-15 04:17:46.070: [ CSSD][1096661312]clssscSelect: cookie accept request 0x26368a0 2012-07-15 04:17:46.070: [ CSSD][1096661312]clssscevtypSHRCON: getting client with cmproc 0x26368a0 2012-07-15 04:17:46.070: [ CSSD][1096661312]clssgmRegisterClient: proc(5/0x26368a0), client(50/0x26877b0) ??11gR2?CRS?????ASM,??ocr???ASM?,??ASM???????,???CRS?????????: [root@maclean1 ~]# crsctl check has CRS-4638: Oracle High Availability Services is online [root@maclean1 ~]# crsctl check crs CRS-4638: Oracle High Availability Services is online CRS-4535: Cannot communicate with Cluster Ready Services CRS-4530: Communications failure contacting Cluster Synchronization Services daemon CRS-4534: Cannot communicate with Event Manager 2. ?????ASM DISK PATH???????,?????????????CRS: ??????OHASD??: [root@maclean1 ~]# crsctl stop has -f CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'maclean1' CRS-2673: Attempting to stop 'ora.mdnsd' on 'maclean1' CRS-2673: Attempting to stop 'ora.crf' on 'maclean1' CRS-2677: Stop of 'ora.mdnsd' on 'maclean1' succeeded CRS-2677: Stop of 'ora.crf' on 'maclean1' succeeded CRS-2673: Attempting to stop 'ora.gipcd' on 'maclean1' CRS-2677: Stop of 'ora.gipcd' on 'maclean1' succeeded CRS-2673: Attempting to stop 'ora.gpnpd' on 'maclean1' CRS-2677: Stop of 'ora.gpnpd' on 'maclean1' succeeded CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'maclean1' has completed CRS-4133: Oracle High Availability Services has been stopped. 3. ?-excl -nocrs????CRS,?????ASM ???????CRS??: [root@maclean1 ~]# crsctl start crs -excl -nocrs  CRS-4123: Oracle High Availability Services has been started. CRS-2672: Attempting to start 'ora.mdnsd' on 'maclean1' CRS-2676: Start of 'ora.mdnsd' on 'maclean1' succeeded CRS-2672: Attempting to start 'ora.gpnpd' on 'maclean1' CRS-2676: Start of 'ora.gpnpd' on 'maclean1' succeeded CRS-2672: Attempting to start 'ora.cssdmonitor' on 'maclean1' CRS-2672: Attempting to start 'ora.gipcd' on 'maclean1' CRS-2676: Start of 'ora.cssdmonitor' on 'maclean1' succeeded CRS-2676: Start of 'ora.gipcd' on 'maclean1' succeeded CRS-2672: Attempting to start 'ora.cssd' on 'maclean1' CRS-2672: Attempting to start 'ora.diskmon' on 'maclean1' CRS-2676: Start of 'ora.diskmon' on 'maclean1' succeeded CRS-2676: Start of 'ora.cssd' on 'maclean1' succeeded CRS-2679: Attempting to clean 'ora.cluster_interconnect.haip' on 'maclean1' CRS-2672: Attempting to start 'ora.ctssd' on 'maclean1' CRS-2681: Clean of 'ora.cluster_interconnect.haip' on 'maclean1' succeeded CRS-2672: Attempting to start 'ora.cluster_interconnect.haip' on 'maclean1' CRS-2676: Start of 'ora.ctssd' on 'maclean1' succeeded CRS-2676: Start of 'ora.cluster_interconnect.haip' on 'maclean1' succeeded CRS-2672: Attempting to start 'ora.asm' on 'maclean1' CRS-2676: Start of 'ora.asm' on 'maclean1' succeeded #??????CRS_HOME???ORACLE_BASE?777??,??????? [root@maclean1 ~]# chmod 777 /g01 4.??ASM???disk_strings????ASM DISK PATH??: [root@maclean1 ~]# su - grid [grid@maclean1 ~]$ sqlplus / as sysasm SQL*Plus: Release 11.2.0.3.0 Production on Sun Jul 15 04:40:40 2012 Copyright (c) 1982, 2011, Oracle. All rights reserved. Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production With the Real Application Clusters and Automatic Storage Management options SQL> alter system set asm_diskstring='/dev/rasm*'; System altered. SQL> alter diskgroup systemdg mount; Diskgroup altered. SQL> create spfile from memory; File created. SQL> startup force mount; ORA-32004: obsolete or deprecated parameter(s) specified for ASM instance ASM instance started Total System Global Area 283930624 bytes Fixed Size 2227664 bytes Variable Size 256537136 bytes ASM Cache 25165824 bytes ASM diskgroups mounted SQL> show parameter spfile NAME TYPE VALUE ------------------------------------ ----------- ------------------------------ spfile string /g01/grid/app/11.2.0/grid/dbs/ spfile+ASM1.ora SQL> show parameter disk NAME TYPE VALUE ------------------------------------ ----------- ------------------------------ asm_diskgroups string SYSTEMDG asm_diskstring string /dev/rasm* SQL> create pfile from spfile; File created. SQL> create spfile='+SYSTEMDG' from pfile; File created. SQL> startup force; ORA-32004: obsolete or deprecated parameter(s) specified for ASM instance ASM instance started Total System Global Area 283930624 bytes Fixed Size 2227664 bytes Variable Size 256537136 bytes ASM Cache 25165824 bytes ASM diskgroups mounted SQL> show parameter spfile NAME TYPE VALUE ------------------------------------ ----------- ------------------------------ spfile string +SYSTEMDG/maclean-cluster/asmp arameterfile/registry.253.7886 82933 ???????asm_diskstring ,????ASM DISKGROUP??SPFILE , ??ASM?????SPFILE?????????????????? 5. crsctl replace votedisk ???votedisk????: [root@maclean1 ~]# crsctl replace votedisk +systemdg Successful addition of voting disk 864a00efcfbe4f42bfd0f4f6b60472a0. Successful addition of voting disk ab14d6e727614f29bf53b9870052a5c8. Successful addition of voting disk 754c03c168854f46bf2daee7287bf260. Successful addition of voting disk 9ed58f37f3e84f28bfcd9b101f2af9f3. Successful addition of voting disk 4ce7b7c682364f12bf4df5ce1fb7814e. Successfully replaced voting disk group with +systemdg. CRS-4266: Voting file(s) successfully replaced [root@maclean1 ~]# crsctl query css votedisk ## STATE File Universal Id File Name Disk group -- ----- ----------------- --------- --------- 1. ONLINE 864a00efcfbe4f42bfd0f4f6b60472a0 (/dev/rasm-diskb) [SYSTEMDG] 2. ONLINE ab14d6e727614f29bf53b9870052a5c8 (/dev/rasm-diskc) [SYSTEMDG] 3. ONLINE 754c03c168854f46bf2daee7287bf260 (/dev/rasm-diskd) [SYSTEMDG] 4. ONLINE 9ed58f37f3e84f28bfcd9b101f2af9f3 (/dev/rasm-diske) [SYSTEMDG] 5. ONLINE 4ce7b7c682364f12bf4df5ce1fb7814e (/dev/rasm-diskf) [SYSTEMDG] Located 5 voting disk(s). [root@maclean1 ~]# ocrcheck Status of Oracle Cluster Registry is as follows : Version : 3 Total space (kbytes) : 262120 Used space (kbytes) : 2844 Available space (kbytes) : 259276 ID : 879001605 Device/File Name : +SYSTEMDG Device/File integrity check succeeded Device/File not configured Device/File not configured Device/File not configured Device/File not configured Cluster registry integrity check succeeded Logical corruption check succeeded ??replace?votedisk??? ASM DISK?,???votedisk?OCR??????? 6.??CRS??: [root@maclean1 ~]# crsctl stop crs CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'maclean1' CRS-2673: Attempting to stop 'ora.mdnsd' on 'maclean1' CRS-2673: Attempting to stop 'ora.ctssd' on 'maclean1' CRS-2673: Attempting to stop 'ora.asm' on 'maclean1' CRS-2677: Stop of 'ora.mdnsd' on 'maclean1' succeeded CRS-2677: Stop of 'ora.asm' on 'maclean1' succeeded CRS-2673: Attempting to stop 'ora.cluster_interconnect.haip' on 'maclean1' CRS-2677: Stop of 'ora.ctssd' on 'maclean1' succeeded CRS-2677: Stop of 'ora.cluster_interconnect.haip' on 'maclean1' succeeded CRS-2673: Attempting to stop 'ora.cssd' on 'maclean1' CRS-2677: Stop of 'ora.cssd' on 'maclean1' succeeded CRS-2673: Attempting to stop 'ora.gipcd' on 'maclean1' CRS-2677: Stop of 'ora.gipcd' on 'maclean1' succeeded CRS-2673: Attempting to stop 'ora.gpnpd' on 'maclean1' CRS-2677: Stop of 'ora.gpnpd' on 'maclean1' succeeded CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'maclean1' has completed CRS-4133: Oracle High Availability Services has been stopped. [root@maclean1 ~]# crsctl stat res -t -------------------------------------------------------------------------------- NAME TARGET STATE SERVER STATE_DETAILS -------------------------------------------------------------------------------- Local Resources -------------------------------------------------------------------------------- ora.BACKUPDG.dg ONLINE ONLINE maclean1 ora.DATA.dg ONLINE ONLINE maclean1 ora.LISTENER.lsnr ONLINE ONLINE maclean1 ora.SYSTEMDG.dg ONLINE ONLINE maclean1 ora.asm ONLINE ONLINE maclean1 Started ora.gsd OFFLINE OFFLINE maclean1 ora.net1.network ONLINE ONLINE maclean1 ora.ons ONLINE ONLINE maclean1 -------------------------------------------------------------------------------- Cluster Resources -------------------------------------------------------------------------------- ora.LISTENER_SCAN1.lsnr 1 ONLINE ONLINE maclean1 ora.cvu 1 ONLINE ONLINE maclean1 ora.maclean1.vip 1 ONLINE ONLINE maclean1 ora.maclean2.vip 1 ONLINE INTERMEDIATE maclean1 FAILED OVER ora.oc4j 1 ONLINE OFFLINE STARTING ora.prod.db 1 ONLINE OFFLINE Instance Shutdown,S TARTING 2 ONLINE OFFLINE ora.scan1.vip 1 ONLINE ONLINE maclean1 ???????ASM?????SPFILE,???????????????,?????CRS??????? ??11gR2 RAC+ASM?????????,????????????????ASM DISK PATH?????????, ???????????????,????!

    Read the article

  • unexplainable packet drops with 5 ethernet NICs and low traffic on Ubuntu

    - by jon
    I'm stuck on problem where my machine started to drops packets with no sign of ANY system load or high interrupt usage after an upgrade to Ubuntu 12.04. My server is a network monitoring sensor, running Ubuntu LTS 12.04, it passively collects packets from 5 interfaces doing network intrusion type stuff. Before the upgrade I managed to collect 200+GB of packets a day while writing them to disk with around 0% packet loss depending on the day with the help of CPU affinity and NIC IRQ to CPU bindings. Now I lose a great deal of packets with none of my applications running and at very low PPS rate which a modern workstation NIC would have no trouble with. Specs: x64 Xeon 4 cores 3.2 Ghz 16 GB RAM NICs: 5 Intel Pro NICs using the e1000 driver (NAPI). [1] eth0 and eth1 are integrated NICs (in the motherboard) There are 2 other PCI-X network cards, each with 2 Ethernet ports. 3 of the interfaces are running at Gigabit Ethernet, the others are not because they're attached to hubs. Specs: [2] http://support.dell.com/support/edocs/systems/pe2850/en/ug/t1390aa.htm uptime 17:36:00 up 1:43, 2 users, load average: 0.00, 0.01, 0.05 # uname -a Linux nms 3.2.0-29-generic #46-Ubuntu SMP Fri Jul 27 17:03:23 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux I also have the CPU governor set to performance mode and irqbalance off. The problem still occurs with them on. # lspci -t -vv -[0000:00]-+-00.0 Intel Corporation E7520 Memory Controller Hub +-02.0-[01-03]--+-00.0-[02]----0e.0 Dell PowerEdge Expandable RAID controller 4 | \-00.2-[03]-- +-04.0-[04]-- +-05.0-[05-07]--+-00.0-[06]----07.0 Intel Corporation 82541GI Gigabit Ethernet Controller | \-00.2-[07]----08.0 Intel Corporation 82541GI Gigabit Ethernet Controller +-06.0-[08-0a]--+-00.0-[09]--+-04.0 Intel Corporation 82546EB Gigabit Ethernet Controller (Copper) | | \-04.1 Intel Corporation 82546EB Gigabit Ethernet Controller (Copper) | \-00.2-[0a]--+-02.0 Digium, Inc. Wildcard TE210P/TE212P dual-span T1/E1/J1 card 3.3V | +-03.0 Intel Corporation 82546EB Gigabit Ethernet Controller (Copper) | \-03.1 Intel Corporation 82546EB Gigabit Ethernet Controller (Copper) +-1d.0 Intel Corporation 82801EB/ER (ICH5/ICH5R) USB UHCI Controller #1 +-1d.1 Intel Corporation 82801EB/ER (ICH5/ICH5R) USB UHCI Controller #2 +-1d.2 Intel Corporation 82801EB/ER (ICH5/ICH5R) USB UHCI Controller #3 +-1d.7 Intel Corporation 82801EB/ER (ICH5/ICH5R) USB2 EHCI Controller +-1e.0-[0b]----0d.0 Advanced Micro Devices [AMD] nee ATI RV100 QY [Radeon 7000/VE] +-1f.0 Intel Corporation 82801EB/ER (ICH5/ICH5R) LPC Interface Bridge \-1f.1 Intel Corporation 82801EB/ER (ICH5/ICH5R) IDE Controller I believe the NIC nor the NIC drivers are dropping the packets because ethtool reports 0 under rx_missed_errors and rx_no_buffer_count for each interface. On the old system, if it couldn't keep up this is where the drops would be. I drop packets on multiple interfaces just about every second, usually in small increments of 2-4. I tried all these sysctl values, I'm currently using the uncommented ones. # cat /etc/sysctl.conf # high net.core.netdev_max_backlog = 3000000 net.core.rmem_max = 16000000 net.core.rmem_default = 8000000 # defaults #net.core.netdev_max_backlog = 1000 #net.core.rmem_max = 131071 #net.core.rmem_default = 163480 # moderate #net.core.netdev_max_backlog = 10000 #net.core.rmem_max = 33554432 #net.core.rmem_default = 33554432 Here's an example of an interface stats report with ethtool. They are all the same, nothing is out of the ordinary ( I think ), so I'm only going to show one: ethtool -S eth2 NIC statistics: rx_packets: 7498 tx_packets: 0 rx_bytes: 2722585 tx_bytes: 0 rx_broadcast: 327 tx_broadcast: 0 rx_multicast: 1504 tx_multicast: 0 rx_errors: 0 tx_errors: 0 tx_dropped: 0 multicast: 1504 collisions: 0 rx_length_errors: 0 rx_over_errors: 0 rx_crc_errors: 0 rx_frame_errors: 0 rx_no_buffer_count: 0 rx_missed_errors: 0 tx_aborted_errors: 0 tx_carrier_errors: 0 tx_fifo_errors: 0 tx_heartbeat_errors: 0 tx_window_errors: 0 tx_abort_late_coll: 0 tx_deferred_ok: 0 tx_single_coll_ok: 0 tx_multi_coll_ok: 0 tx_timeout_count: 0 tx_restart_queue: 0 rx_long_length_errors: 0 rx_short_length_errors: 0 rx_align_errors: 0 tx_tcp_seg_good: 0 tx_tcp_seg_failed: 0 rx_flow_control_xon: 0 rx_flow_control_xoff: 0 tx_flow_control_xon: 0 tx_flow_control_xoff: 0 rx_long_byte_count: 2722585 rx_csum_offload_good: 0 rx_csum_offload_errors: 0 alloc_rx_buff_failed: 0 tx_smbus: 0 rx_smbus: 0 dropped_smbus: 01 # ifconfig eth0 Link encap:Ethernet HWaddr 00:11:43:e0:e2:8c UP BROADCAST RUNNING NOARP PROMISC ALLMULTI MULTICAST MTU:1500 Metric:1 RX packets:373348 errors:16 dropped:95 overruns:0 frame:16 TX packets:0 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0 RX bytes:356830572 (356.8 MB) TX bytes:0 (0.0 B) eth1 Link encap:Ethernet HWaddr 00:11:43:e0:e2:8d UP BROADCAST RUNNING NOARP PROMISC ALLMULTI MULTICAST MTU:1500 Metric:1 RX packets:13616 errors:0 dropped:0 overruns:0 frame:0 TX packets:0 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0 RX bytes:8690528 (8.6 MB) TX bytes:0 (0.0 B) eth2 Link encap:Ethernet HWaddr 00:04:23:e1:77:6a UP BROADCAST RUNNING NOARP PROMISC ALLMULTI MULTICAST MTU:1500 Metric:1 RX packets:7750 errors:0 dropped:471 overruns:0 frame:0 TX packets:0 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0 RX bytes:2780935 (2.7 MB) TX bytes:0 (0.0 B) eth3 Link encap:Ethernet HWaddr 00:04:23:e1:77:6b UP BROADCAST RUNNING NOARP PROMISC ALLMULTI MULTICAST MTU:1500 Metric:1 RX packets:5112 errors:0 dropped:206 overruns:0 frame:0 TX packets:0 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0 RX bytes:639472 (639.4 KB) TX bytes:0 (0.0 B) eth4 Link encap:Ethernet HWaddr 00:04:23:b6:35:6c UP BROADCAST RUNNING NOARP PROMISC ALLMULTI MULTICAST MTU:1500 Metric:1 RX packets:961467 errors:0 dropped:935 overruns:0 frame:0 TX packets:0 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0 RX bytes:958561305 (958.5 MB) TX bytes:0 (0.0 B) eth5 Link encap:Ethernet HWaddr 00:04:23:b6:35:6d inet addr:192.168.1.6 Bcast:192.168.1.255 Mask:255.255.255.0 UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 RX packets:4264 errors:0 dropped:16 overruns:0 frame:0 TX packets:699 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:1000 RX bytes:572228 (572.2 KB) TX bytes:124456 (124.4 KB) I tried the defaults, then started to play around with settings. I wasn't using any flow control and I increased the RxDescriptor count to 4096 before the upgrade as well without any problems. # cat /etc/modprobe.d/e1000.conf options e1000 XsumRX=0,0,0,0,0 RxDescriptors=4096,4096,4096,4096,4096 FlowControl=0,0,0,0,0 debug=16 Here's my network configuration file, I turned off checksumming and various offloading mechanisms along with setting CPU affinity with heavy use interfaces getting an entire CPU and light use interfaces sharing a CPU. I used these settings prior to the upgrade without problems. # cat /etc/network/interfaces # The loopback network interface auto lo iface lo inet loopback # The primary network interface auto eth0 iface eth0 inet manual pre-up /sbin/ethtool -G eth0 rx 4096 tx 0 pre-up /sbin/ethtool -K eth0 gro off gso off rx off pre-up /sbin/ethtool -A eth0 rx off autoneg off up ifconfig eth0 0.0.0.0 -arp promisc mtu 1500 allmulti txqueuelen 0 up post-up echo "4" > /proc/irq/48/smp_affinity down ifconfig eth0 down post-down /sbin/ethtool -G eth0 rx 256 tx 256 post-down /sbin/ethtool -K eth0 gro on gso on rx on post-down /sbin/ethtool -A eth0 rx on autoneg on auto eth1 iface eth1 inet manual pre-up /sbin/ethtool -G eth1 rx 4096 tx 0 pre-up /sbin/ethtool -K eth1 gro off gso off rx off pre-up /sbin/ethtool -A eth1 rx off autoneg off up ifconfig eth1 0.0.0.0 -arp promisc mtu 1500 allmulti txqueuelen 0 up post-up echo "4" > /proc/irq/49/smp_affinity down ifconfig eth1 down post-down /sbin/ethtool -G eth1 rx 256 tx 256 post-down /sbin/ethtool -K eth1 gro on gso on rx on post-down /sbin/ethtool -A eth1 rx on autoneg on auto eth2 iface eth2 inet manual pre-up /sbin/ethtool -G eth2 rx 4096 tx 0 pre-up /sbin/ethtool -K eth2 gro off gso off rx off pre-up /sbin/ethtool -A eth2 rx off autoneg off up ifconfig eth2 0.0.0.0 -arp promisc mtu 1500 allmulti txqueuelen 0 up post-up echo "1" > /proc/irq/82/smp_affinity down ifconfig eth2 down post-down /sbin/ethtool -G eth2 rx 256 tx 256 post-down /sbin/ethtool -K eth2 gro on gso on rx on post-down /sbin/ethtool -A eth2 rx on autoneg on auto eth3 iface eth3 inet manual pre-up /sbin/ethtool -G eth3 rx 4096 tx 0 pre-up /sbin/ethtool -K eth3 gro off gso off rx off pre-up /sbin/ethtool -A eth3 rx off autoneg off up ifconfig eth3 0.0.0.0 -arp promisc mtu 1500 allmulti txqueuelen 0 up post-up echo "2" > /proc/irq/83/smp_affinity down ifconfig eth3 down post-down /sbin/ethtool -G eth3 rx 256 tx 256 post-down /sbin/ethtool -K eth3 gro on gso on rx on post-down /sbin/ethtool -A eth3 rx on autoneg on auto eth4 iface eth4 inet manual pre-up /sbin/ethtool -G eth4 rx 4096 tx 0 pre-up /sbin/ethtool -K eth4 gro off gso off rx off pre-up /sbin/ethtool -A eth4 rx off autoneg off up ifconfig eth4 0.0.0.0 -arp promisc mtu 1500 allmulti txqueuelen 0 up post-up echo "4" > /proc/irq/77/smp_affinity down ifconfig eth4 down post-down /sbin/ethtool -G eth4 rx 256 tx 256 post-down /sbin/ethtool -K eth4 gro on gso on rx on post-down /sbin/ethtool -A eth4 rx on autoneg on auto eth5 iface eth5 inet static pre-up /etc/fw.conf address 192.168.1.1 netmask 255.255.255.0 broadcast 192.168.1.255 gateway 192.168.1.1 dns-nameservers 192.168.1.2 192.168.1.3 up ifconfig eth5 up post-up echo "8" > /proc/irq/77/smp_affinity down ifconfig eth5 down Here's a few examples of packet drops, i ran one after another, probabling totaling 3 or 4 seconds. You can see increases in the drops from the 1st and 3rd. This was a non-busy time, very little traffic. # awk '{ print $1,$5 }' /proc/net/dev Inter-| face drop eth3: 225 lo: 0 eth2: 505 eth1: 0 eth5: 17 eth0: 105 eth4: 1034 # awk '{ print $1,$5 }' /proc/net/dev Inter-| face drop eth3: 225 lo: 0 eth2: 507 eth1: 0 eth5: 17 eth0: 105 eth4: 1034 # awk '{ print $1,$5 }' /proc/net/dev Inter-| face drop eth3: 227 lo: 0 eth2: 512 eth1: 0 eth5: 17 eth0: 105 eth4: 1039 I tried the pci=noacpi options. With and without, it's the same. This is what my interrupt stats looked like before the upgrade, after, with ACPI on PCI it showed multiple NICs bound to an interrupt and shared with other devices such as USB drives which I didn't like so I think i'm going to keep it with ACPI off as it's easier to designate sole purpose interrupts. Is there any advantage I would have using the default i.e. ACPI w/ PCI. ? # cat /etc/default/grub | grep CMD_LINE GRUB_CMDLINE_LINUX_DEFAULT="ipv6.disable=1 noacpi pci=noacpi" GRUB_CMDLINE_LINUX="" # cat /proc/interrupts CPU0 CPU1 CPU2 CPU3 0: 45 0 0 16 IO-APIC-edge timer 1: 1 0 0 7936 IO-APIC-edge i8042 2: 0 0 0 0 XT-PIC-XT-PIC cascade 6: 0 0 0 3 IO-APIC-edge floppy 8: 0 0 0 1 IO-APIC-edge rtc0 9: 0 0 0 0 IO-APIC-edge acpi 12: 0 0 0 1809 IO-APIC-edge i8042 14: 1 0 0 4498 IO-APIC-edge ata_piix 15: 0 0 0 0 IO-APIC-edge ata_piix 16: 0 0 0 0 IO-APIC-fasteoi uhci_hcd:usb2 18: 0 0 0 1350 IO-APIC-fasteoi uhci_hcd:usb4, radeon 19: 0 0 0 0 IO-APIC-fasteoi uhci_hcd:usb3 23: 0 0 0 4099 IO-APIC-fasteoi ehci_hcd:usb1 38: 0 0 0 61963 IO-APIC-fasteoi megaraid 48: 0 0 1002319 4 IO-APIC-fasteoi eth0 49: 0 0 38772 3 IO-APIC-fasteoi eth1 77: 0 0 130076 432159 IO-APIC-fasteoi eth4 78: 0 0 0 23917 IO-APIC-fasteoi eth5 82: 1329033 0 0 4 IO-APIC-fasteoi eth2 83: 0 4886525 0 6 IO-APIC-fasteoi eth3 NMI: 5 6 4 5 Non-maskable interrupts LOC: 61409 57076 64257 114764 Local timer interrupts SPU: 0 0 0 0 Spurious interrupts IWI: 0 0 0 0 IRQ work interrupts RES: 17956 25333 13436 14789 Rescheduling interrupts CAL: 22436 607 539 478 Function call interrupts TLB: 1525 1458 4600 4151 TLB shootdowns TRM: 0 0 0 0 Thermal event interrupts THR: 0 0 0 0 Threshold APIC interrupts MCE: 0 0 0 0 Machine check exceptions MCP: 16 16 16 16 Machine check polls ERR: 0 MIS: 0 Here's sample output of vmstat, showing the system. Barebones system right now. root@nms:~# vmstat -S m 1 procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu---- r b swpd free buff cache si so bi bo in cs us sy id wa 0 0 0 14992 192 1029 0 0 56 2 419 29 1 0 99 0 0 0 0 14992 192 1029 0 0 0 0 922 27 0 0 100 0 0 0 0 14991 192 1029 0 0 0 36 763 50 0 0 100 0 0 0 0 14991 192 1029 0 0 0 0 646 35 0 0 100 0 0 0 0 14991 192 1029 0 0 0 0 722 54 0 0 100 0 0 0 0 14991 192 1029 0 0 0 0 793 27 0 0 100 0 ^C Here's dmesg output. I can't figure out why my PCI-X slots are negotiated as PCI. The network cards are all PCI-X with the exception of the integrated NICs that came with the server. In the output below it looks as if eth3 and eth2 negotiated at PCI-X speeds rather than PCI:66Mhz. Wouldn't they all drop to PCI:66Mhz? If your integrated NICs are PCI, as labeled below (eth0,eth1), then wouldn't all devices on your bus speed drop down to that slower bus speed? If not, I still don't know why only one of my NICs ( each has two ethernet ports) is labeled as PCI-X in the output below. Does that mean it is running at PCI-X speeds are is it showing that it's capable? # dmesg | grep e1000 [ 3678.349337] e1000: Intel(R) PRO/1000 Network Driver - version 7.3.21-k8-NAPI [ 3678.349342] e1000: Copyright (c) 1999-2006 Intel Corporation. [ 3678.349394] e1000 0000:06:07.0: PCI->APIC IRQ transform: INT A -> IRQ 48 [ 3678.409725] e1000 0000:06:07.0: Receive Descriptors set to 4096 [ 3678.409730] e1000 0000:06:07.0: Checksum Offload Disabled [ 3678.409734] e1000 0000:06:07.0: Flow Control Disabled [ 3678.586409] e1000 0000:06:07.0: eth0: (PCI:66MHz:32-bit) 00:11:43:e0:e2:8c [ 3678.586419] e1000 0000:06:07.0: eth0: Intel(R) PRO/1000 Network Connection [ 3678.586642] e1000 0000:07:08.0: PCI->APIC IRQ transform: INT A -> IRQ 49 [ 3678.649854] e1000 0000:07:08.0: Receive Descriptors set to 4096 [ 3678.649859] e1000 0000:07:08.0: Checksum Offload Disabled [ 3678.649863] e1000 0000:07:08.0: Flow Control Disabled [ 3678.826436] e1000 0000:07:08.0: eth1: (PCI:66MHz:32-bit) 00:11:43:e0:e2:8d [ 3678.826444] e1000 0000:07:08.0: eth1: Intel(R) PRO/1000 Network Connection [ 3678.826627] e1000 0000:09:04.0: PCI->APIC IRQ transform: INT A -> IRQ 82 [ 3679.093266] e1000 0000:09:04.0: Receive Descriptors set to 4096 [ 3679.093271] e1000 0000:09:04.0: Checksum Offload Disabled [ 3679.093275] e1000 0000:09:04.0: Flow Control Disabled [ 3679.130239] e1000 0000:09:04.0: eth2: (PCI-X:133MHz:64-bit) 00:04:23:e1:77:6a [ 3679.130246] e1000 0000:09:04.0: eth2: Intel(R) PRO/1000 Network Connection [ 3679.130449] e1000 0000:09:04.1: PCI->APIC IRQ transform: INT B -> IRQ 83 [ 3679.397312] e1000 0000:09:04.1: Receive Descriptors set to 4096 [ 3679.397318] e1000 0000:09:04.1: Checksum Offload Disabled [ 3679.397321] e1000 0000:09:04.1: Flow Control Disabled [ 3679.434350] e1000 0000:09:04.1: eth3: (PCI-X:133MHz:64-bit) 00:04:23:e1:77:6b [ 3679.434360] e1000 0000:09:04.1: eth3: Intel(R) PRO/1000 Network Connection [ 3679.434553] e1000 0000:0a:03.0: PCI->APIC IRQ transform: INT A -> IRQ 77 [ 3679.704072] e1000 0000:0a:03.0: Receive Descriptors set to 4096 [ 3679.704077] e1000 0000:0a:03.0: Checksum Offload Disabled [ 3679.704081] e1000 0000:0a:03.0: Flow Control Disabled [ 3679.738364] e1000 0000:0a:03.0: eth4: (PCI:33MHz:64-bit) 00:04:23:b6:35:6c [ 3679.738371] e1000 0000:0a:03.0: eth4: Intel(R) PRO/1000 Network Connection [ 3679.738538] e1000 0000:0a:03.1: PCI->APIC IRQ transform: INT B -> IRQ 78 [ 3680.046060] e1000 0000:0a:03.1: eth5: (PCI:33MHz:64-bit) 00:04:23:b6:35:6d [ 3680.046067] e1000 0000:0a:03.1: eth5: Intel(R) PRO/1000 Network Connection [ 3682.132415] e1000: eth0 NIC Link is Up 100 Mbps Half Duplex, Flow Control: None [ 3682.224423] e1000: eth1 NIC Link is Up 100 Mbps Half Duplex, Flow Control: None [ 3682.316385] e1000: eth2 NIC Link is Up 100 Mbps Half Duplex, Flow Control: None [ 3682.408391] e1000: eth3 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None [ 3682.500396] e1000: eth4 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None [ 3682.708401] e1000: eth5 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX At first I thought it was the NIC drivers but I'm not so sure. I really have no idea where else to look at the moment. Any help is greatly appreciated as I'm struggling with this. If you need more information just ask. Thanks! [1]http://www.cs.fsu.edu/~baker/devices/lxr/http/source/linux/Documentation/networking/e1000.txt?v=2.6.11.8 [2] http://support.dell.com/support/edocs/systems/pe2850/en/ug/t1390aa.htm

    Read the article

  • Why load increase ,ssd iops increase but cpu iowait decrease?

    - by mq44944
    There is a strange thing on my server which has a mysql running on it. The QPS is more than 4000 but TPS is less than 20. The server load is more than 80 and cpu usr is more than 86% but iowait is less than 8%. The disk iops is more than 16000 and util of disk is more than 99%. When the QPS decreases, the load decreases, the cpu iowait increases. I can't catch this! root@mypc # dmidecode | grep "Product Name" Product Name: PowerEdge R510 Product Name: 084YMW root@mypc # megacli -PDList -aALL |grep "Inquiry Data" Inquiry Data: SEAGATE ST3600057SS ES656SL316PT Inquiry Data: SEAGATE ST3600057SS ES656SL30THV Inquiry Data: ATA INTEL SSDSA2CW300362CVPR201602A6300EGN Inquiry Data: ATA INTEL SSDSA2CW300362CVPR2044037K300EGN Inquiry Data: ATA INTEL SSDSA2CW300362CVPR204402PX300EGN Inquiry Data: ATA INTEL SSDSA2CW300362CVPR204403WN300EGN Inquiry Data: ATA INTEL SSDSA2CW300362CVPR202000HU300EGN Inquiry Data: ATA INTEL SSDSA2CW300362CVPR202001E7300EGN Inquiry Data: ATA INTEL SSDSA2CW300362CVPR204402WE300EGN Inquiry Data: ATA INTEL SSDSA2CW300362CVPR204404E5300EGN Inquiry Data: ATA INTEL SSDSA2CW300362CVPR204401QF300EGN Inquiry Data: ATA INTEL SSDSA2CW300362CVPR20450001300EGN the mysql data files lie on the ssd disks which are organizaed using RAID 10. root@mypc # megacli -LDInfo -L1 -a0 Adapter 0 -- Virtual Drive Information: Virtual Disk: 1 (Target Id: 1) Name: RAID Level: Primary-1, Secondary-0, RAID Level Qualifier-0 Size:1427840MB State: Optimal Stripe Size: 64kB Number Of Drives:2 Span Depth:5 Default Cache Policy: WriteThrough, ReadAheadNone, Direct, No Write Cache if Bad BBU Current Cache Policy: WriteThrough, ReadAheadNone, Direct, No Write Cache if Bad BBU Access Policy: Read/Write Disk Cache Policy: Disk's Default Exit Code: 0x00 -------- -----load-avg---- ---cpu-usage--- ---swap--- -------------------------io-usage----------------------- -QPS- -TPS- -Hit%- time | 1m 5m 15m |usr sys idl iow| si so| r/s w/s rkB/s wkB/s queue await svctm %util| ins upd del sel iud| lor hit| 09:05:29|79.80 64.49 42.00| 82 7 6 5| 0 0|16421.1 10.6262705.9 85.2 8.3 0.5 0.1 99.5| 0 0 0 3968 0| 495482 96.58| 09:05:30|79.80 64.49 42.00| 79 7 8 6| 0 0|15907.4 230.6254409.7 6357.5 8.4 0.5 0.1 98.5| 0 0 0 4195 0| 496434 96.68| 09:05:31|81.34 65.07 42.31| 81 7 7 5| 0 0|16198.7 8.6259029.2 99.8 8.1 0.5 0.1 99.3| 0 0 0 4220 0| 508983 96.70| 09:05:32|81.34 65.07 42.31| 82 7 5 5| 0 0|16746.6 8.7267853.3 92.4 8.5 0.5 0.1 99.4| 0 0 0 4084 0| 503834 96.54| 09:05:33|81.34 65.07 42.31| 81 7 6 5| 0 0|16498.7 9.6263856.8 92.3 8.0 0.5 0.1 99.3| 0 0 0 4030 0| 507051 96.60| 09:05:34|81.34 65.07 42.31| 80 8 7 6| 0 0|16328.4 11.5261101.6 95.8 8.1 0.5 0.1 98.3| 0 0 0 4119 0| 504409 96.63| 09:05:35|81.31 65.33 42.52| 82 7 6 5| 0 0|16374.0 8.7261921.9 92.5 8.1 0.5 0.1 99.7| 0 0 0 4127 0| 507279 96.66| 09:05:36|81.31 65.33 42.52| 81 8 6 5| 0 0|16496.2 8.6263832.0 84.5 8.5 0.5 0.1 99.2| 0 0 0 4100 0| 505054 96.59| 09:05:37|81.31 65.33 42.52| 82 8 6 4| 0 0|16239.4 9.6259768.8 84.3 8.0 0.5 0.1 99.1| 0 0 0 4273 0| 510621 96.72| 09:05:38|81.31 65.33 42.52| 81 7 6 5| 0 0|16349.6 8.7261439.2 81.4 8.2 0.5 0.1 98.9| 0 0 0 4171 0| 510145 96.67| 09:05:39|81.31 65.33 42.52| 82 7 6 5| 0 0|16116.8 8.7257667.6 96.5 8.0 0.5 0.1 99.1| 0 0 0 4348 0| 513093 96.74| 09:05:40|79.60 65.24 42.61| 79 7 7 7| 0 0|16154.2 242.9258390.4 6388.4 8.5 0.5 0.1 99.0| 0 0 0 4033 0| 507244 96.70| 09:05:41|79.60 65.24 42.61| 79 7 8 6| 0 0|16583.1 21.2265129.6 173.5 8.2 0.5 0.1 99.1| 0 0 0 3995 0| 501474 96.57| 09:05:42|79.60 65.24 42.61| 81 8 6 5| 0 0|16281.0 9.7260372.2 69.5 8.3 0.5 0.1 98.7| 0 0 0 4221 0| 509322 96.70| 09:05:43|79.60 65.24 42.61| 80 7 7 6| 0 0|16355.3 8.7261515.5 104.3 8.2 0.5 0.1 99.6| 0 0 0 4087 0| 502052 96.62| -------- -----load-avg---- ---cpu-usage--- ---swap--- -------------------------io-usage----------------------- -QPS- -TPS- -Hit%- time | 1m 5m 15m |usr sys idl iow| si so| r/s w/s rkB/s wkB/s queue await svctm %util| ins upd del sel iud| lor hit| 09:05:44|79.60 65.24 42.61| 83 7 5 4| 0 0|16469.4 11.6263387.0 138.8 8.2 0.5 0.1 98.7| 0 0 0 4292 0| 509979 96.65| 09:05:45|79.07 65.37 42.77| 80 7 6 6| 0 0|16659.5 9.7266478.7 85.0 8.4 0.5 0.1 98.5| 0 0 0 3899 0| 496234 96.54| 09:05:46|79.07 65.37 42.77| 78 7 7 8| 0 0|16752.9 8.7267921.8 97.1 8.4 0.5 0.1 98.9| 0 0 0 4126 0| 508300 96.57| 09:05:47|79.07 65.37 42.77| 82 7 6 5| 0 0|16657.2 9.6266439.3 84.3 8.3 0.5 0.1 98.9| 0 0 0 4086 0| 502171 96.57| 09:05:48|79.07 65.37 42.77| 79 8 6 6| 0 0|16814.5 8.7268924.1 77.6 8.5 0.5 0.1 99.0| 0 0 0 4059 0| 499645 96.52| 09:05:49|79.07 65.37 42.77| 81 7 6 5| 0 0|16553.0 6.8264708.6 42.5 8.3 0.5 0.1 99.4| 0 0 0 4249 0| 501623 96.60| 09:05:50|79.63 65.71 43.01| 79 7 7 7| 0 0|16295.1 246.9260475.0 6442.4 8.7 0.5 0.1 99.1| 0 0 0 4231 0| 511032 96.70| 09:05:51|79.63 65.71 43.01| 80 7 6 6| 0 0|16568.9 8.7264919.7 104.7 8.3 0.5 0.1 99.7| 0 0 0 4272 0| 517177 96.68| 09:05:53|79.63 65.71 43.01| 79 7 7 6| 0 0|16539.0 8.6264502.9 87.6 8.4 0.5 0.1 98.9| 0 0 0 3992 0| 496728 96.52| 09:05:54|79.63 65.71 43.01| 79 7 7 7| 0 0|16527.5 11.6264363.6 92.6 8.5 0.5 0.1 98.8| 0 0 0 4045 0| 502944 96.59| 09:05:55|79.63 65.71 43.01| 80 7 7 6| 0 0|16374.7 12.5261687.2 134.9 8.6 0.5 0.1 99.2| 0 0 0 4143 0| 507006 96.66| 09:05:56|76.05 65.20 42.96| 77 8 8 8| 0 0|16464.9 9.6263314.3 111.9 8.5 0.5 0.1 98.9| 0 0 0 4250 0| 505417 96.64| 09:05:57|76.05 65.20 42.96| 79 7 6 7| 0 0|16460.1 8.8263283.2 93.4 8.3 0.5 0.1 98.8| 0 0 0 4294 0| 508168 96.66| 09:05:58|76.05 65.20 42.96| 80 7 7 7| 0 0|16176.5 9.6258762.1 127.3 8.3 0.5 0.1 98.9| 0 0 0 4160 0| 509349 96.72| 09:05:59|76.05 65.20 42.96| 75 7 9 10| 0 0|16522.0 10.7264274.6 93.1 8.6 0.5 0.1 97.5| 0 0 0 4034 0| 492623 96.51| -------- -----load-avg---- ---cpu-usage--- ---swap--- -------------------------io-usage----------------------- -QPS- -TPS- -Hit%- time | 1m 5m 15m |usr sys idl iow| si so| r/s w/s rkB/s wkB/s queue await svctm %util| ins upd del sel iud| lor hit| 09:06:00|76.05 65.20 42.96| 79 7 7 7| 0 0|16369.6 21.2261867.3 262.5 8.4 0.5 0.1 98.9| 0 0 0 4305 0| 494509 96.59| 09:06:01|75.33 65.23 43.09| 73 6 9 12| 0 0|15864.0 209.3253685.4 6238.0 10.0 0.6 0.1 98.7| 0 0 0 3913 0| 483480 96.62| 09:06:02|75.33 65.23 43.09| 73 7 8 12| 0 0|15854.7 12.7253613.2 93.6 11.0 0.7 0.1 99.0| 0 0 0 4271 0| 483771 96.64| 09:06:03|75.33 65.23 43.09| 75 7 9 9| 0 0|16074.8 8.7257104.3 81.7 8.1 0.5 0.1 98.5| 0 0 0 4060 0| 480701 96.55| 09:06:04|75.33 65.23 43.09| 76 7 8 9| 0 0|16221.7 9.7259500.1 139.4 8.1 0.5 0.1 97.6| 0 0 0 3953 0| 486774 96.56| 09:06:05|74.98 65.33 43.24| 78 7 8 8| 0 0|16330.7 8.7261166.5 85.3 8.2 0.5 0.1 98.5| 0 0 0 3957 0| 481775 96.53| 09:06:06|74.98 65.33 43.24| 75 7 9 9| 0 0|16093.7 11.7257436.1 93.7 8.2 0.5 0.1 99.2| 0 0 0 3938 0| 489251 96.60| 09:06:07|74.98 65.33 43.24| 75 7 5 13| 0 0|15758.9 19.2251989.4 188.2 14.7 0.9 0.1 99.7| 0 0 0 4140 0| 494738 96.70| 09:06:08|74.98 65.33 43.24| 69 7 10 15| 0 0|16166.3 8.7258474.9 81.2 8.9 0.5 0.1 98.7| 0 0 0 3993 0| 487162 96.58| 09:06:09|74.98 65.33 43.24| 74 7 9 10| 0 0|16071.0 8.7257010.9 93.3 8.2 0.5 0.1 99.2| 0 0 0 4098 0| 491557 96.61| 09:06:10|70.98 64.66 43.14| 71 7 9 12| 0 0|15549.6 216.1248701.1 6188.7 8.3 0.5 0.1 97.8| 0 0 0 3879 0| 480832 96.66| 09:06:11|70.98 64.66 43.14| 71 7 10 13| 0 0|16233.7 22.4259568.1 257.1 8.2 0.5 0.1 99.2| 0 0 0 4088 0| 493200 96.62| 09:06:12|70.98 64.66 43.14| 78 7 8 7| 0 0|15932.4 10.6254779.5 108.1 8.1 0.5 0.1 98.6| 0 0 0 4168 0| 489838 96.63| 09:06:13|70.98 64.66 43.14| 71 8 9 12| 0 0|16255.9 11.5259902.3 103.9 8.3 0.5 0.1 98.0| 0 0 0 3874 0| 481246 96.52| 09:06:14|70.98 64.66 43.14| 60 6 16 18| 0 0|15621.0 9.7249826.1 81.9 8.0 0.5 0.1 99.3| 0 0 0 3956 0| 480278 96.65|

    Read the article

  • My computer freezes irregularly

    - by Manhim
    My computer started to freeze at irregular times for 3 weeks now. Please note that this question change with each things that i try. (For additional details) What happens My computer freezes, the video stops. (No graphic glitches, it just stops) Sound keeps playing up to some time (Usually 10-30 seconds) then stops playing. Sometimes, randomly, the screen on my G-15 keyboard flickers and I see caracters not at the right places. Usually happens for about 1-2 seconds and a bit before my computer freezes. I have to keep the power button pressed for 4 seconds to shut my computer down. I still hear my hard drives and fans working. Sometimes it works with no problems for a full day, some other times it just keeps freezing each time I restart my computer and I have to leave it for the rest of the day. Sometimes my mouse freezes for a fraction of a second (Like 0.01 to 0.2 seconds) quite randomly, usually before it freezes. No errors spotted by the "Action center" unlike when I had problems with my last video card on this system (Driver errors). My G-15 LCD screen also freezes. Sometimes my G-15 LCD screen flickers and caracters gets caried around temporary under heavy load. Now, most of the times, the BIOS hard disks boot order gets reversed for some reason and I have to put it to the right one and save each times I boot. (Might be unrelated, not sure, but it first started yesterday) Sometimes the BIOS doesn't detect my 750GB hard drive plugged in SATA1. What I did so far I have had similar problems in the past and I had changed my hard drive (It was faulty), so I tested my software RAID-0 array and it was faulty so I changed it. (I reinstalled Windows 7 with this part). I also tested with unplugging my secondary hard drive. My CPU was running at about 100 degree Celsius, I removed the dust between the fans and the heatsink and it's now between 45-55. I ran a CPU stress-test and it didn't freeze during the tests (using Prime95 on all cores) Ran a memory test (using memtest86+) for a single pass and there were no errors. Ran a GPU stress test with ati-tools and furmark and it didn't freeze during the tests. (No artefacts either) I had troubles with my graphic card when I got it, but I think that it got fixed with a driver update. I checked the voltages in my BIOS setup and they all seemed ok (±0.2 I think). I have ran on the computer without problems with Fedora 15 on an external hard drive (Appart that it couldn't load Gnome 3 and was reverting to Gnome 2, didn't want to install drivers since I use it on multiple computers) I used it to backup my files from the raid array to my 1TB hard drive for the reinstallation of Windows. (So the crashes only happenned on Windows) [The external hard drive is plugged directly on a SATA port] I contacted EVGA (My graphic card vendor) and pointed them on this question, I'm looking for an answer. Ran sensors on Fedora 15 and got this output: http://pastebin.com/0BHJnAvu Ran 6 short different CPU stress test on Fedora 15 (Haven't found any complete stress testers for Linux) and it didn't crash. Changed the thermal paste to some Artic Silver 5 for my CPU and stress tested the CPU, temperature was at 50 idle, then 64 highest and slowly went down to 62 during the test. Ran some stress testing with a temporary graphic card and it went ok. Ran furmark stress test with my original graphic card and it freezed again. GPU had a temp of 74C, a CPU temp of 58C and a mobo temp of 40C or 45C (Dunno which one it is from SpeedFan). Ran a furmark stress test and a CPU stress test at the same time, results: http://pastebin.com/2t6PLpdJ I have been using my computer without stressing it for about 2 hours now and no crashes yet. I also have disabled the AMD Cool'n'quiet function on the BIOS for a more regular power to the CPU. When I ran Furmark without C'n'q my computer didn't freeze but I had a "Driver Kernel Error" that have recovered (And Furmark crashed) all that while running a CPU stress test. The computer eventually frozed without me being at it, but this time my screen just went on sleep and I couldn't wake it. Using the stability tester in nTune my computer freezed again (In the same manner as before). I notived that Speedfan gives me a -12V of -16.97V and a -5V of -8.78V. I wonder if these numbers are reliable and if they are good or bad. I have swapped my G-15 with another basic USB keyboard (HP) and I have ran furmark for about 10 minutes with a CPU stability test running each 60 seconds for 30 seconds and my computer haven't crashed yet. Ran some more extended tests without my G-15 and it freezed like it usually do. Removed the nForce Hard disk controler. Disabled command queuing in the NVIDIA nForce SATA Controller for both port 0 and port 1 (Errors from the logs) Used CPUID HwMonitor, here are the voltages: http://pastebin.com/dfM7p4jV Changed some configurations in the motherboard BIOS: Disabled PEG Link Mode, Changed AI Tuning to Standard, Disabled the 1394 Controller, Disabled HD Audio, Disabled JMicron RAID controller and Disabled SATA Raid. When it happens When I play video games (Mostly) When I play flash games (Second most) When I'm looking at my desktop background (It rarely happens when I have a window open, but it does, sometimes) When my Graphic card and my CPU are stressed. Sometimes when my Graphic card is stressed. Never happenned while stressing only the CPU. Sometimes when my CPU is stressed. Specs Windows Seven x64 Home Premium Motherboard: M2N-SLI Deluxe CPU: AMD Phenom 9950 x2 @ 2.6GHz Memory: Kingston 4x2GB Dual Channel (Pretty basic memory sticks) Hard drives: Was 2x250GB (Western digital caviar) in raid-0 + 1TB (WD caviar black), I replaced the raid array with a 750GB (WD caviar black) [Yes I removed the array from the raid configurations] 750W Power supply No overcloking. Ever. There have been some power-downs like 4-5 weeks ago, but the problem didn't start immediately after. (I wasn't home, so my computer got shut-down) Event logs (Warnings, errors and critical errors) for the last 24 hours: http://pastebin.com/Bvvk31T7 My current to-try list Reinstall the drivers and software 1 by 1 and do extensive stress testing between each. Update the BIOS firmware to the most recent stable one. Change my motherboard. Status updates Keeping only the last 3 (28/06 04pm) More stress testing and still pass the tests. (28/06 03pm) Been stress testing for 10 minute straight now and 5 minutes with both CPU and GPU being stressed at the same time. (28/06 03pm) Stress-testing right now, so far no problems. A little hope Tests with Furmark and Prime95. Testing Windows bare-bone: 30 Minutes stress, no freeze. Installing an Anti-virus and some software, restarting computer. Testing with Anti-virus and some software (No drivers installed): 30 Minutes stress, no freeze. Installing audio drivers, restarting computer. Testing with the audio drivers: 30 Minutes stress, no freeze. Installing the latest graphic drivers from EVGA's website (without 3d vision since I don't use it), restarting computer. Testing with the graphic drivers: 30 Minutes stress, no freeze. Configuring Windows to my liking and installing more softwares. In this situation, how can I successfully pin-point the current hardware problem? (If it's a hardware problem) Because I don't really have the budget to just forget and replace everything. I also don't really have hardware to test-replace current hardware.

    Read the article

  • Xenserver 5.5 U2 a bit unstable with an unstable W2003 VM

    - by twistedbrain
    In the last week I had to reboot the host system twice and the second one by means of the power button. The system is a Dell PE 6950 (4 Opteron dual core, 2,8 Ghz, 16 GB RAM, 900 GB disk in a RAID 10 array composed by 4 450GB 15000 rpm disks) with XenServer 5.5 U2. We're installing it and at now there are working in production a 2003 32 bit Windows server VM with 2 GB RAM, 3 vcpu and about 200 GB disk in 4 partition (12 GB boot, 20 GB program, 80 GB user data, 80 GB other data). The first time I was compressing many Windows 2003 folders (some tens GB by means of W2003 compressed folder option) from a Windows remote console and some hours before my colleague installed Backup Exec agent that was alredy installed and that required a reboot (that was pending). The console stopped responding, it was no more possible to connect by means of remote console or by means of the console of the XenCentre, it was still possible from the network to use the shared folder of the VM and the programs on it (2 db and a GIS program), but the print server didn't work any more and I couldn't give remote reboot from other domain controller hosts. I couldn't stop the virtual machine neither from the XenCentre, neither from the command line of the host also forcing the reboot. I had to reboot the host server. Yesterday it has been worst. I installed a template of another VM, CentOS 5.3 and then, put the DVD in the drive of the host. Then, before the install and after the boot I checked for defect the DVD and the W2003 VM began to respond slowly (I was connected by means of an administration remote console) the task manager showed only mid or low load, it is, only the first of the 3 vcpu was loaded (about 70%), while the other two were about at 20% and also the disk I/O was not so heavy. Then the users were not so happy because they couldn't any more use the MS word docs on the server. I immediately stopped the check of the DVD (to do that I had to force the stop of such Centos 5.3 VM), then some users could again use their docs, but other had still problems, so I decided to reboot the VM, but it doesn't stopped, neither from XenCenter, neither from command line, neither forcing the reboot. Then I tried to reboot of the host, but it didn't worked, neither from the XenCentre, neither from the host prompt (shutdown -r now as root: it told that it was shutting down, but then it didn't did that). So I had to power off by means of the power button of the server (before I tried some Magic SyS REQ, but I saw that Xen isn't compiled with this option enabled). What could you suggest about my problem, what can I look, search and see? In the W2003 VM logs there are no errors or warning to explain what happened. Some more exciting, amusing and inspiring words of poetry (the facts happened around 11 am): \# egrep -i 'err|warning' xensource.log [20100219 10:32:05.597|debug|culo|6301 unix-RPC|VBD.plug R:c81bcda701f6|xenops] watch: watching xenstore paths: [ /xapi/0/frontend/vbd/51712/hotplug; /local/domain/0/backend/vbd/0/51712/tapdisk-error ] with timeout 1200.000000 seconds [20100219 10:32:05.597|debug|culo|6301 unix-RPC|VBD.plug R:c81bcda701f6|xenops] watch: fired on /local/domain/0/backend/vbd/0/51712/tapdisk-error [20100219 10:32:14.314|debug|culo|6335 unix-RPC|VBD.unplug R:9258f54578d6|xenops] watch: watching xenstore paths: [ /local/domain/0/backend/vbd/0/51712/shutdown-done; /local/domain/0/error/device/vbd/51712/error ] with timeout 1200.000000 seconds [20100219 10:32:14.337|debug|culo|6335 unix-RPC|VBD.unplug R:9258f54578d6|xenops] xenstore-rm /local/domain/0/error/backend/vbd/0 [20100219 10:32:14.337|debug|culo|6335 unix-RPC|VBD.unplug R:9258f54578d6|xenops] xenstore-rm /local/domain/0/error/device/vbd/51712 [20100219 10:32:14.338|debug|culo|6335 unix-RPC|VBD.unplug R:9258f54578d6|xenops] watch: fired on /local/domain/0/error/device/vbd/51712/error [20100219 10:53:48.903|debug|culo|6418|Async.VM.hard_shutdown R:88d2095678f7|helpers] Ignoring exception: INTERNAL_ERROR: [ Xb.Noent ] while Vmops.destroy_domain: Destroying domid 14 guest session [20100219 10:53:52.048|debug|culo|6418|Async.VM.hard_shutdown R:88d2095678f7|xenops] xenstore-rm /local/domain/0/error/backend/tap/14 [20100219 10:53:52.048|debug|culo|6418|Async.VM.hard_shutdown R:88d2095678f7|xenops] xenstore-rm /local/domain/14/error/device/vbd/51744 [20100219 10:53:52.085|debug|culo|6418|Async.VM.hard_shutdown R:88d2095678f7|xenops] xenstore-rm /local/domain/0/error/backend/tap/14 [20100219 10:53:52.086|debug|culo|6418|Async.VM.hard_shutdown R:88d2095678f7|xenops] xenstore-rm /local/domain/14/error/device/vbd/51728 [20100219 10:53:52.122|debug|culo|6418|Async.VM.hard_shutdown R:88d2095678f7|xenops] xenstore-rm /local/domain/0/error/backend/tap/14 [20100219 10:53:52.122|debug|culo|6418|Async.VM.hard_shutdown R:88d2095678f7|xenops] xenstore-rm /local/domain/14/error/device/vbd/51712 [20100219 10:53:52.127|debug|culo|6418|Async.VM.hard_shutdown R:88d2095678f7|xenops] xenstore-rm /local/domain/0/error/backend/vbd/14 [20100219 10:53:52.128|debug|culo|6418|Async.VM.hard_shutdown R:88d2095678f7|xenops] xenstore-rm /local/domain/14/error/device/vbd/51760 [20100219 10:53:52.496|debug|culo|6418|Async.VM.hard_shutdown R:88d2095678f7|xenops] Device.Vif.hard_shutdown about to blow away backend and error paths [20100219 10:53:52.497|debug|culo|6418|Async.VM.hard_shutdown R:88d2095678f7|xenops] xenstore-rm /local/domain/0/error/backend/vif/14 [20100219 10:53:52.497|debug|culo|6418|Async.VM.hard_shutdown R:88d2095678f7|xenops] xenstore-rm /local/domain/14/error/device/vif/0 [20100219 10:53:53.385|debug|culo|6418|Async.VM.hard_shutdown R:88d2095678f7|xenops] Device.Vif.hard_shutdown about to blow away backend and error paths [20100219 10:53:53.386|debug|culo|6418|Async.VM.hard_shutdown R:88d2095678f7|xenops] xenstore-rm /local/domain/0/error/backend/vif/14 [20100219 10:53:53.386|debug|culo|6418|Async.VM.hard_shutdown R:88d2095678f7|xenops] xenstore-rm /local/domain/14/error/device/vif/1 [20100219 10:53:53.389| warn|culo|6418|Async.VM.hard_shutdown R:88d2095678f7|hotplug] Warning, deleting 'vif' entry from /xapi/14/hotplug/vif/0 [20100219 10:53:53.391| warn|culo|6418|Async.VM.hard_shutdown R:88d2095678f7|hotplug] Warning, deleting 'vif' entry from /xapi/14/hotplug/vif/1 [20100219 11:20:49.766|debug|culo|6484|Async.VM.clean_shutdown R:78d3c3e28cb6|helpers] Ignoring exception: INTERNAL_ERROR: [ Xb.Noent ] while Vmops.destroy_domain: Destroying domid 11 guest session [20100219 11:20:50.339|debug|culo|6484|Async.VM.clean_shutdown R:78d3c3e28cb6|xenops] xenstore-rm /local/domain/0/error/backend/tap/11 [20100219 11:20:50.339|debug|culo|6484|Async.VM.clean_shutdown R:78d3c3e28cb6|xenops] xenstore-rm /local/domain/11/error/device/vbd/832 [20100219 11:20:50.360|debug|culo|6484|Async.VM.clean_shutdown R:78d3c3e28cb6|xenops] xenstore-rm /local/domain/0/error/backend/tap/11 [20100219 11:20:50.360|debug|culo|6484|Async.VM.clean_shutdown R:78d3c3e28cb6|xenops] xenstore-rm /local/domain/11/error/device/vbd/768 [20100219 11:20:50.366|debug|culo|6484|Async.VM.clean_shutdown R:78d3c3e28cb6|xenops] xenstore-rm /local/domain/0/error/backend/vbd/11 [20100219 11:20:50.366|debug|culo|6484|Async.VM.clean_shutdown R:78d3c3e28cb6|xenops] xenstore-rm /local/domain/11/error/device/vbd/5696 [20100219 11:20:50.753|debug|culo|6484|Async.VM.clean_shutdown R:78d3c3e28cb6|xenops] Device.Vif.hard_shutdown about to blow away backend and error paths [20100219 11:20:50.754|debug|culo|6484|Async.VM.clean_shutdown R:78d3c3e28cb6|xenops] xenstore-rm /local/domain/0/error/backend/vif/11 [20100219 11:20:50.754|debug|culo|6484|Async.VM.clean_shutdown R:78d3c3e28cb6|xenops] xenstore-rm /local/domain/11/error/device/vif/1 [20100219 11:20:50.757| warn|culo|6484|Async.VM.clean_shutdown R:78d3c3e28cb6|hotplug] Warning, deleting 'vif' entry from /xapi/11/hotplug/vif/1 [20100219 11:28:13.803|debug|culo|6610 inet-RPC|Connection to VM console R:e9f8b76e8975|console] error: INTERNAL_ERROR: [ Unix.Unix_error(63, "connect", "") ]

    Read the article

  • Troubleshooting latency spikes on ESXi NFS datastores

    - by exo_cw
    I'm experiencing fsync latencies of around five seconds on NFS datastores in ESXi, triggered by certain VMs. I suspect this might be caused by VMs using NCQ/TCQ, as this does not happen with virtual IDE drives. This can be reproduced using fsync-tester (by Ted Ts'o) and ioping. For example using a Grml live system with a 8GB disk: Linux 2.6.33-grml64: root@dynip211 /mnt/sda # ./fsync-tester fsync time: 5.0391 fsync time: 5.0438 fsync time: 5.0300 fsync time: 0.0231 fsync time: 0.0243 fsync time: 5.0382 fsync time: 5.0400 [... goes on like this ...] That is 5 seconds, not milliseconds. This is even creating IO-latencies on a different VM running on the same host and datastore: root@grml /mnt/sda/ioping-0.5 # ./ioping -i 0.3 -p 20 . 4096 bytes from . (reiserfs /dev/sda): request=1 time=7.2 ms 4096 bytes from . (reiserfs /dev/sda): request=2 time=0.9 ms 4096 bytes from . (reiserfs /dev/sda): request=3 time=0.9 ms 4096 bytes from . (reiserfs /dev/sda): request=4 time=0.9 ms 4096 bytes from . (reiserfs /dev/sda): request=5 time=4809.0 ms 4096 bytes from . (reiserfs /dev/sda): request=6 time=1.0 ms 4096 bytes from . (reiserfs /dev/sda): request=7 time=1.2 ms 4096 bytes from . (reiserfs /dev/sda): request=8 time=1.1 ms 4096 bytes from . (reiserfs /dev/sda): request=9 time=1.3 ms 4096 bytes from . (reiserfs /dev/sda): request=10 time=1.2 ms 4096 bytes from . (reiserfs /dev/sda): request=11 time=1.0 ms 4096 bytes from . (reiserfs /dev/sda): request=12 time=4950.0 ms When I move the first VM to local storage it looks perfectly normal: root@dynip211 /mnt/sda # ./fsync-tester fsync time: 0.0191 fsync time: 0.0201 fsync time: 0.0203 fsync time: 0.0206 fsync time: 0.0192 fsync time: 0.0231 fsync time: 0.0201 [... tried that for one hour: no spike ...] Things I've tried that made no difference: Tested several ESXi Builds: 381591, 348481, 260247 Tested on different hardware, different Intel and AMD boxes Tested with different NFS servers, all show the same behavior: OpenIndiana b147 (ZFS sync always or disabled: no difference) OpenIndiana b148 (ZFS sync always or disabled: no difference) Linux 2.6.32 (sync or async: no difference) It makes no difference if the NFS server is on the same machine (as a virtual storage appliance) or on a different host Guest OS tested, showing problems: Windows 7 64 Bit (using CrystalDiskMark, latency spikes happen mostly during preparing phase) Linux 2.6.32 (fsync-tester + ioping) Linux 2.6.38 (fsync-tester + ioping) I could not reproduce this problem on Linux 2.6.18 VMs. Another workaround is to use virtual IDE disks (vs SCSI/SAS), but that is limiting performance and the number of drives per VM. Update 2011-06-30: The latency spikes seem to happen more often if the application writes in multiple small blocks before fsync. For example fsync-tester does this (strace output): pwrite(3, "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"..., 1048576, 0) = 1048576 fsync(3) = 0 ioping does this while preparing the file: [lots of pwrites] pwrite(3, "********************************"..., 4096, 1036288) = 4096 pwrite(3, "********************************"..., 4096, 1040384) = 4096 pwrite(3, "********************************"..., 4096, 1044480) = 4096 fsync(3) = 0 The setup phase of ioping almost always hangs, while fsync-tester sometimes works fine. Is someone capable of updating fsync-tester to write multiple small blocks? My C skills suck ;) Update 2011-07-02: This problem does not occur with iSCSI. I tried this with the OpenIndiana COMSTAR iSCSI server. But iSCSI does not give you easy access to the VMDK files so you can move them between hosts with snapshots and rsync. Update 2011-07-06: This is part of a wireshark capture, captured by a third VM on the same vSwitch. This all happens on the same host, no physical network involved. I've started ioping around time 20. There were no packets sent until the five second delay was over: No. Time Source Destination Protocol Info 1082 16.164096 192.168.250.10 192.168.250.20 NFS V3 WRITE Call (Reply In 1085), FH:0x3eb56466 Offset:0 Len:84 FILE_SYNC 1083 16.164112 192.168.250.10 192.168.250.20 NFS V3 WRITE Call (Reply In 1086), FH:0x3eb56f66 Offset:0 Len:84 FILE_SYNC 1084 16.166060 192.168.250.20 192.168.250.10 TCP nfs > iclcnet-locate [ACK] Seq=445 Ack=1057 Win=32806 Len=0 TSV=432016 TSER=769110 1085 16.167678 192.168.250.20 192.168.250.10 NFS V3 WRITE Reply (Call In 1082) Len:84 FILE_SYNC 1086 16.168280 192.168.250.20 192.168.250.10 NFS V3 WRITE Reply (Call In 1083) Len:84 FILE_SYNC 1087 16.168417 192.168.250.10 192.168.250.20 TCP iclcnet-locate > nfs [ACK] Seq=1057 Ack=773 Win=4163 Len=0 TSV=769110 TSER=432016 1088 23.163028 192.168.250.10 192.168.250.20 NFS V3 GETATTR Call (Reply In 1089), FH:0x0bb04963 1089 23.164541 192.168.250.20 192.168.250.10 NFS V3 GETATTR Reply (Call In 1088) Directory mode:0777 uid:0 gid:0 1090 23.274252 192.168.250.10 192.168.250.20 TCP iclcnet-locate > nfs [ACK] Seq=1185 Ack=889 Win=4163 Len=0 TSV=769821 TSER=432716 1091 24.924188 192.168.250.10 192.168.250.20 RPC Continuation 1092 24.924210 192.168.250.10 192.168.250.20 RPC Continuation 1093 24.924216 192.168.250.10 192.168.250.20 RPC Continuation 1094 24.924225 192.168.250.10 192.168.250.20 RPC Continuation 1095 24.924555 192.168.250.20 192.168.250.10 TCP nfs > iclcnet_svinfo [ACK] Seq=6893 Ack=1118613 Win=32625 Len=0 TSV=432892 TSER=769986 1096 24.924626 192.168.250.10 192.168.250.20 RPC Continuation 1097 24.924635 192.168.250.10 192.168.250.20 RPC Continuation 1098 24.924643 192.168.250.10 192.168.250.20 RPC Continuation 1099 24.924649 192.168.250.10 192.168.250.20 RPC Continuation 1100 24.924653 192.168.250.10 192.168.250.20 RPC Continuation 2nd Update 2011-07-06: There seems to be some influence from TCP window sizes. I was not able to reproduce this problem using FreeNAS (based on FreeBSD) as a NFS server. The wireshark captures showed TCP window updates to 29127 bytes in regular intervals. I did not see them with OpenIndiana, which uses larger window sizes by default. I can no longer reproduce this problem if I set the following options in OpenIndiana and restart the NFS server: ndd -set /dev/tcp tcp_recv_hiwat 8192 # default is 128000 ndd -set /dev/tcp tcp_max_buf 1048575 # default is 1048576 But this kills performance: Writing from /dev/zero to a file with dd_rescue goes from 170MB/s to 80MB/s. Update 2011-07-07: I've uploaded this tcpdump capture (can be analyzed with wireshark). In this case 192.168.250.2 is the NFS server (OpenIndiana b148) and 192.168.250.10 is the ESXi host. Things I've tested during this capture: Started "ioping -w 5 -i 0.2 ." at time 30, 5 second hang in setup, completed at time 40. Started "ioping -w 5 -i 0.2 ." at time 60, 5 second hang in setup, completed at time 70. Started "fsync-tester" at time 90, with the following output, stopped at time 120: fsync time: 0.0248 fsync time: 5.0197 fsync time: 5.0287 fsync time: 5.0242 fsync time: 5.0225 fsync time: 0.0209 2nd Update 2011-07-07: Tested another NFS server VM, this time NexentaStor 3.0.5 community edition: Shows the same problems. Update 2011-07-31: I can also reproduce this problem on the new ESXi build 4.1.0.433742.

    Read the article

  • EC2 instance suddenly refusing SSH connections and won't respond to ping

    - by Chris
    My instance was running fine and this morning I was able to access a Ruby on Rails app hosted on it. An hour later I suddenly wasn't able to access my site, my SSH connection attempts were refused and the server wasn't even responding to ping. I didn't change anything on my system during that hour and reboots aren't fixing it. I've never had any problems connecting or pinging the system before. Can someone please help? This is on my production system! OS: CentOS 5 AMI ID: ami-10b55379 Type: m1.small [] ~% ssh -v *****@meeteor.com OpenSSH_5.2p1, OpenSSL 0.9.8l 5 Nov 2009 debug1: Reading configuration data /etc/ssh_config debug1: Connecting to meeteor.com [184.73.235.191] port 22. debug1: connect to address 184.73.235.191 port 22: Connection refused ssh: connect to host meeteor.com port 22: Connection refused [] ~% ping meeteor.com PING meeteor.com (184.73.235.191): 56 data bytes Request timeout for icmp_seq 0 Request timeout for icmp_seq 1 Request timeout for icmp_seq 2 ^C --- meeteor.com ping statistics --- 4 packets transmitted, 0 packets received, 100.0% packet loss [] ~% ========= System Log ========= Restarting system. Linux version 2.6.16-xenU ([email protected]) (gcc version 4.0.1 20050727 (Red Hat 4.0.1-5)) #1 SMP Mon May 28 03:41:49 SAST 2007 BIOS-provided physical RAM map: Xen: 0000000000000000 - 000000006a400000 (usable) 980MB HIGHMEM available. 727MB LOWMEM available. NX (Execute Disable) protection: active IRQ lockup detection disabled Built 1 zonelists Kernel command line: root=/dev/sda1 ro 4 Enabling fast FPU save and restore... done. Enabling unmasked SIMD FPU exception support... done. Initializing CPU#0 PID hash table entries: 4096 (order: 12, 65536 bytes) Xen reported: 2599.998 MHz processor. Dentry cache hash table entries: 131072 (order: 7, 524288 bytes) Inode-cache hash table entries: 65536 (order: 6, 262144 bytes) Software IO TLB disabled vmalloc area: ee000000-f53fe000, maxmem 2d7fe000 Memory: 1718700k/1748992k available (1958k kernel code, 20948k reserved, 620k data, 144k init, 1003528k highmem) Checking if this processor honours the WP bit even in supervisor mode... Ok. Calibrating delay using timer specific routine.. 5202.30 BogoMIPS (lpj=26011526) Mount-cache hash table entries: 512 CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line) CPU: L2 Cache: 1024K (64 bytes/line) Checking 'hlt' instruction... OK. Brought up 1 CPUs migration_cost=0 Grant table initialized NET: Registered protocol family 16 Brought up 1 CPUs xen_mem: Initialising balloon driver. highmem bounce pool size: 64 pages VFS: Disk quotas dquot_6.5.1 Dquot-cache hash table entries: 1024 (order 0, 4096 bytes) Initializing Cryptographic API io scheduler noop registered io scheduler anticipatory registered (default) io scheduler deadline registered io scheduler cfq registered i8042.c: No controller found. RAMDISK driver initialized: 16 RAM disks of 4096K size 1024 blocksize Xen virtual console successfully installed as tty1 Event-channel device installed. netfront: Initialising virtual ethernet driver. mice: PS/2 mouse device common for all mice md: md driver 0.90.3 MAX_MD_DEVS=256, MD_SB_DISKS=27 md: bitmap version 4.39 NET: Registered protocol family 2 Registering block device major 8 IP route cache hash table entries: 65536 (order: 6, 262144 bytes) TCP established hash table entries: 262144 (order: 9, 2097152 bytes) TCP bind hash table entries: 65536 (order: 7, 524288 bytes) TCP: Hash tables configured (established 262144 bind 65536) TCP reno registered TCP bic registered NET: Registered protocol family 1 NET: Registered protocol family 17 NET: Registered protocol family 15 Using IPI No-Shortcut mode md: Autodetecting RAID arrays. md: autorun ... md: ... autorun DONE. kjournald starting. Commit interval 5 seconds EXT3-fs: mounted filesystem with ordered data mode. VFS: Mounted root (ext3 filesystem) readonly. Freeing unused kernel memory: 144k freed *************************************************************** *************************************************************** ** WARNING: Currently emulating unsupported memory accesses ** ** in /lib/tls glibc libraries. The emulation is ** ** slow. To ensure full performance you should ** ** install a 'xen-friendly' (nosegneg) version of ** ** the library, or disable tls support by executing ** ** the following as root: ** ** mv /lib/tls /lib/tls.disabled ** ** Offending process: init (pid=1) ** *************************************************************** *************************************************************** Pausing... 5Pausing... 4Pausing... 3Pausing... 2Pausing... 1Continuing... INIT: version 2.86 booting Welcome to CentOS release 5.4 (Final) Press 'I' to enter interactive startup. Setting clock : Fri Oct 1 14:35:26 EDT 2010 [ OK ] Starting udev: [ OK ] Setting hostname localhost.localdomain: [ OK ] No devices found Setting up Logical Volume Management: [ OK ] Checking filesystems Checking all file systems. [/sbin/fsck.ext3 (1) -- /] fsck.ext3 -a /dev/sda1 /dev/sda1: clean, 275424/1310720 files, 1161123/2621440 blocks [ OK ] Remounting root filesystem in read-write mode: [ OK ] Mounting local filesystems: [ OK ] Enabling local filesystem quotas: [ OK ] Enabling /etc/fstab swaps: [ OK ] INIT: Entering runlevel: 4 Entering non-interactive startup Starting background readahead: [ OK ] Applying ip6tables firewall rules: modprobe: FATAL: Module ip6_tables not found. ip6tables-restore v1.3.5: ip6tables-restore: unable to initializetable 'filter' Error occurred at line: 3 Try `ip6tables-restore -h' or 'ip6tables-restore --help' for more information. [FAILED] Applying iptables firewall rules: [ OK ] Loading additional iptables modules: ip_conntrack_netbios_ns [ OK ] Bringing up loopback interface: [ OK ] Bringing up interface eth0: Determining IP information for eth0... done. [ OK ] Starting auditd: [FAILED] Starting irqbalance: [ OK ] Starting portmap: [ OK ] FATAL: Module lockd not found. Starting NFS statd: [ OK ] Starting RPC idmapd: FATAL: Module sunrpc not found. FATAL: Error running install command for sunrpc Error: RPC MTAB does not exist. Starting system message bus: [ OK ] Starting Bluetooth services:[ OK ] [ OK ] Can't open RFCOMM control socket: Address family not supported by protocol Mounting other filesystems: [ OK ] Starting PC/SC smart card daemon (pcscd): [ OK ] Starting hidd: Can't open HIDP control socket: Address family not supported by protocol [FAILED] Starting autofs: Starting automount: automount: test mount forbidden or incorrect kernel protocol version, kernel protocol version 5.00 or above required. [FAILED] [FAILED] Starting sshd: [ OK ] Starting cups: [ OK ] Starting sendmail: [ OK ] Starting sm-client: [ OK ] Starting console mouse services: no console device found[FAILED] Starting crond: [ OK ] Starting xfs: [ OK ] Starting anacron: [ OK ] Starting atd: [ OK ] % Total % Received % Xferd Average Speed Time Time Time Current Dload Upload Total Spent Left Speed 100 390 100 390 0 0 58130 0 --:--:-- --:--:-- --:--:-- 58130 100 390 100 390 0 0 56984 0 --:--:-- --:--:-- --:--:-- 0 Starting yum-updatesd: [ OK ] Starting Avahi daemon... [ OK ] Starting HAL daemon: [ OK ] Starting OSSEC: [ OK ] Starting smartd: [ OK ] c CentOS release 5.4 (Final) Kernel 2.6.16-xenU on an i686 domU-12-31-39-00-C4-97 login: INIT: Id "2" respawning too fast: disabled for 5 minutes INIT: Id "3" respawning too fast: disabled for 5 minutes INIT: Id "4" respawning too fast: disabled for 5 minutes INIT: Id "5" respawning too fast: disabled for 5 minutes INIT: Id "6" respawning too fast: disabled for 5 minutes

    Read the article

  • Strange Recurrent Excessive I/O Wait

    - by Chris
    I know quite well that I/O wait has been discussed multiple times on this site, but all the other topics seem to cover constant I/O latency, while the I/O problem we need to solve on our server occurs at irregular (short) intervals, but is ever-present with massive spikes of up to 20k ms a-wait and service times of 2 seconds. The disk affected is /dev/sdb (Seagate Barracuda, for details see below). A typical iostat -x output would at times look like this, which is an extreme sample but by no means rare: iostat (Oct 6, 2013) tps rd_sec/s wr_sec/s avgrq-sz avgqu-sz await svctm %util 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 16.00 0.00 156.00 9.75 21.89 288.12 36.00 57.60 5.50 0.00 44.00 8.00 48.79 2194.18 181.82 100.00 2.00 0.00 16.00 8.00 46.49 3397.00 500.00 100.00 4.50 0.00 40.00 8.89 43.73 5581.78 222.22 100.00 14.50 0.00 148.00 10.21 13.76 5909.24 68.97 100.00 1.50 0.00 12.00 8.00 8.57 7150.67 666.67 100.00 0.50 0.00 4.00 8.00 6.31 10168.00 2000.00 100.00 2.00 0.00 16.00 8.00 5.27 11001.00 500.00 100.00 0.50 0.00 4.00 8.00 2.96 17080.00 2000.00 100.00 34.00 0.00 1324.00 9.88 1.32 137.84 4.45 59.60 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 22.00 44.00 204.00 11.27 0.01 0.27 0.27 0.60 Let me provide you with some more information regarding the hardware. It's a Dell 1950 III box with Debian as OS where uname -a reports the following: Linux xx 2.6.32-5-amd64 #1 SMP Fri Feb 15 15:39:52 UTC 2013 x86_64 GNU/Linux The machine is a dedicated server that hosts an online game without any databases or I/O heavy applications running. The core application consumes about 0.8 of the 8 GBytes RAM, and the average CPU load is relatively low. The game itself, however, reacts rather sensitive towards I/O latency and thus our players experience massive ingame lag, which we would like to address as soon as possible. iostat: avg-cpu: %user %nice %system %iowait %steal %idle 1.77 0.01 1.05 1.59 0.00 95.58 Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn sdb 13.16 25.42 135.12 504701011 2682640656 sda 1.52 0.74 20.63 14644533 409684488 Uptime is: 19:26:26 up 229 days, 17:26, 4 users, load average: 0.36, 0.37, 0.32 Harddisk controller: 01:00.0 RAID bus controller: LSI Logic / Symbios Logic MegaRAID SAS 1078 (rev 04) Harddisks: Array 1, RAID-1, 2x Seagate Cheetah 15K.5 73 GB SAS Array 2, RAID-1, 2x Seagate ST3500620SS Barracuda ES.2 500GB 16MB 7200RPM SAS Partition information from df: Filesystem 1K-blocks Used Available Use% Mounted on /dev/sdb1 480191156 30715200 425083668 7% /home /dev/sda2 7692908 437436 6864692 6% / /dev/sda5 15377820 1398916 13197748 10% /usr /dev/sda6 39159724 19158340 18012140 52% /var Some more data samples generated with iostat -dx sdb 1 (Oct 11, 2013) Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util sdb 0.00 15.00 0.00 70.00 0.00 656.00 9.37 4.50 1.83 4.80 33.60 sdb 0.00 0.00 0.00 2.00 0.00 16.00 8.00 12.00 836.00 500.00 100.00 sdb 0.00 0.00 0.00 3.00 0.00 32.00 10.67 9.96 1990.67 333.33 100.00 sdb 0.00 0.00 0.00 4.00 0.00 40.00 10.00 6.96 3075.00 250.00 100.00 sdb 0.00 0.00 0.00 0.00 0.00 0.00 0.00 4.00 0.00 0.00 100.00 sdb 0.00 0.00 0.00 2.00 0.00 16.00 8.00 2.62 4648.00 500.00 100.00 sdb 0.00 0.00 0.00 0.00 0.00 0.00 0.00 2.00 0.00 0.00 100.00 sdb 0.00 0.00 0.00 1.00 0.00 16.00 16.00 1.69 7024.00 1000.00 100.00 sdb 0.00 74.00 0.00 124.00 0.00 1584.00 12.77 1.09 67.94 6.94 86.00 Characteristic charts generated with rrdtool can be found here: iostat plot 1, 24 min interval: http://imageshack.us/photo/my-images/600/yqm3.png/ iostat plot 2, 120 min interval: http://imageshack.us/photo/my-images/407/griw.png/ As we have a rather large cache of 5.5 GBytes, we thought it might be a good idea to test if the I/O wait spikes would perhaps be caused by cache miss events. Therefore, we did a sync and then this to flush the cache and buffers: echo 3 > /proc/sys/vm/drop_caches and directly afterwards the I/O wait and service times virtually went through the roof, and everything on the machine felt like slow motion. During the next few hours the latency recovered and everything was as before - small to medium lags in short, unpredictable intervals. Now my question is: does anybody have any idea what might cause this annoying behaviour? Is it the first indication of the disk array or the raid controller dying, or something that can be easily mended by rebooting? (At the moment we're very reluctant to do this, however, because we're afraid that the disks might not come back up again.) Any help is greatly appreciated. Thanks in advance, Chris. Edited to add: we do see one or two processes go to 'D' state in top, one of which seems to be kjournald rather frequently. If I'm not mistaken, however, this does not indicate the processes causing the latency, but rather those affected by it - correct me if I'm wrong. Does the information about uninterruptibly sleeping processes help us in any way to address the problem? @Andy Shinn requested smartctl data, here it is: smartctl -a -d megaraid,2 /dev/sdb yields: smartctl 5.40 2010-07-12 r3124 [x86_64-unknown-linux-gnu] (local build) Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net Device: SEAGATE ST3500620SS Version: MS05 Serial number: Device type: disk Transport protocol: SAS Local Time is: Mon Oct 14 20:37:13 2013 CEST Device supports SMART and is Enabled Temperature Warning Disabled or Not Supported SMART Health Status: OK Current Drive Temperature: 20 C Drive Trip Temperature: 68 C Elements in grown defect list: 0 Vendor (Seagate) cache information Blocks sent to initiator = 1236631092 Blocks received from initiator = 1097862364 Blocks read from cache and sent to initiator = 1383620256 Number of read and write commands whose size <= segment size = 531295338 Number of read and write commands whose size > segment size = 51986460 Vendor (Seagate/Hitachi) factory information number of hours powered up = 36556.93 number of minutes until next internal SMART test = 32 Error counter log: Errors Corrected by Total Correction Gigabytes Total ECC rereads/ errors algorithm processed uncorrected fast | delayed rewrites corrected invocations [10^9 bytes] errors read: 509271032 47 0 509271079 509271079 20981.423 0 write: 0 0 0 0 0 5022.039 0 verify: 1870931090 196 0 1870931286 1870931286 100558.708 0 Non-medium error count: 0 SMART Self-test log Num Test Status segment LifeTime LBA_first_err [SK ASC ASQ] Description number (hours) # 1 Background short Completed 16 36538 - [- - -] # 2 Background short Completed 16 36514 - [- - -] # 3 Background short Completed 16 36490 - [- - -] # 4 Background short Completed 16 36466 - [- - -] # 5 Background short Completed 16 36442 - [- - -] # 6 Background long Completed 16 36420 - [- - -] # 7 Background short Completed 16 36394 - [- - -] # 8 Background short Completed 16 36370 - [- - -] # 9 Background long Completed 16 36364 - [- - -] #10 Background short Completed 16 36361 - [- - -] #11 Background long Completed 16 2 - [- - -] #12 Background short Completed 16 0 - [- - -] Long (extended) Self Test duration: 6798 seconds [113.3 minutes] smartctl -a -d megaraid,3 /dev/sdb yields: smartctl 5.40 2010-07-12 r3124 [x86_64-unknown-linux-gnu] (local build) Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net Device: SEAGATE ST3500620SS Version: MS05 Serial number: Device type: disk Transport protocol: SAS Local Time is: Mon Oct 14 20:37:26 2013 CEST Device supports SMART and is Enabled Temperature Warning Disabled or Not Supported SMART Health Status: OK Current Drive Temperature: 19 C Drive Trip Temperature: 68 C Elements in grown defect list: 0 Vendor (Seagate) cache information Blocks sent to initiator = 288745640 Blocks received from initiator = 1097848399 Blocks read from cache and sent to initiator = 1304149705 Number of read and write commands whose size <= segment size = 527414694 Number of read and write commands whose size > segment size = 51986460 Vendor (Seagate/Hitachi) factory information number of hours powered up = 36596.83 number of minutes until next internal SMART test = 28 Error counter log: Errors Corrected by Total Correction Gigabytes Total ECC rereads/ errors algorithm processed uncorrected fast | delayed rewrites corrected invocations [10^9 bytes] errors read: 610862490 44 0 610862534 610862534 20470.133 0 write: 0 0 0 0 0 5022.480 0 verify: 2861227413 203 0 2861227616 2861227616 100872.443 0 Non-medium error count: 1 SMART Self-test log Num Test Status segment LifeTime LBA_first_err [SK ASC ASQ] Description number (hours) # 1 Background short Completed 16 36580 - [- - -] # 2 Background short Completed 16 36556 - [- - -] # 3 Background short Completed 16 36532 - [- - -] # 4 Background short Completed 16 36508 - [- - -] # 5 Background short Completed 16 36484 - [- - -] # 6 Background long Completed 16 36462 - [- - -] # 7 Background short Completed 16 36436 - [- - -] # 8 Background short Completed 16 36412 - [- - -] # 9 Background long Completed 16 36404 - [- - -] #10 Background short Completed 16 36401 - [- - -] #11 Background long Completed 16 2 - [- - -] #12 Background short Completed 16 0 - [- - -] Long (extended) Self Test duration: 6798 seconds [113.3 minutes]

    Read the article

  • Why are my hard drives failing?

    - by WishCow
    I have a small Ubuntu server running at home, with 2 HDDs. There are two software raids (raid1) on the disks, managed by mdadm, which I believe is irrelevant, but mentioning it anyway. Both of the HDDs are Western Digital, and have been used for around 2 years, when one of them started making clicking noises, and died. I figured that maybe it's natural after 2 years, so I bought a new one, and resynced the raid arrays. After about a month, the other drive also died. I didn't get suspicious, since both drives have been bought at the same time, it's not that surprising to see both of them near each other, so I bought another one. So far, 2 old drives failed, and 2 brand new in the system. After one month, one of the new drives died. This is when it started getting suspicious. Since the PC was put together from some really old parts (think AthlonXP), I figured that maybe the motherboard's SATA controller is the culprit. Of course you can't switch parts easily in an old PC like this, so I bought a whole system, new MB, new CPU, new RAM. Took the just failed drive back, since it was under warranty, and got it replaced. So it is up to 2 failed drives from the old ones, and 1 failed drive from the new ones. No problems, for 1 month. After that errors were creeping up again in /var/log/messages, and mdadm was reporting raid array failures. I started tearing my hair out. Everything is new in the system, it's up to the third brand new HDD, it's simply not possible that all of the new drives that I bought were faulty. Let's see what is still common... the cables. Okay, long shot, let's replace the SATA cables. Take HDD back, smile to the guy at the counter and say that I'm really unlucky. He replaces the HDD. I come home, one month passes and one of HDDs fails, again. I'm not joking. Two of the brand new HDDs have failed. Maybe it's a bug in the OS. Let's see what the manufacturer's testing tool says. Download testing tool, burn it to a CD, reboot, leave HDD testing overnight. Test says that the drive is faulty, and I should back up everything, if I still can. I don't know what's happening, but it does not look like a software problem, something is definitely thrashing the HDDs. I should mention now, that the whole system is in a shoebox. Since there are a load of "build your own ikea case" stuff, I thought there shouldn't be any problems throwing the thing in a box, and stuffing it away somewhere. The box is well ventilated, but I thought that just maybe the drives were overheating. There is no other possible answer to this. So I took the HDD back, and got it replaced (for the 3rd time), and bought HDD coolers. And just now, I have heard the sound of doom. click click whizzzzzzzzz. SSH into the box: You have new mail! mail r 1 DegradedArrayEvent on /dev/md0 ... dmesg output: [47128.000051] ata3: lost interrupt (Status 0x50) [47128.000097] end_request: I/O error, dev sda, sector 58588863 [47128.000134] md: super_written gets error=-5, uptodate=0 [48043.976054] ata3: lost interrupt (Status 0x50) [48043.976086] ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen [48043.976132] ata3.00: cmd c8/00:18:bf:40:52/00:00:00:00:00/e1 tag 0 dma 12288 in [48043.976135] res 40/00:00:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout) [48043.976208] ata3.00: status: { DRDY } [48043.976241] ata3: soft resetting link [48044.148446] ata3.00: configured for UDMA/133 [48044.148457] ata3.00: device reported invalid CHS sector 0 [48044.148477] ata3: EH complete Recap: No possibility of overheating 6 drives have failed, 4 of those have been brand new. I'm not sure now that the original two have been faulty, or suffered the same thing that the new ones. There is nothing common in the system, apart from the OS which is Ubuntu Karmic now (started with Jaunty). New MB, new CPU, new RAM, new SATA cables. No, the little holes on the HDD are not covered I'm crying. Really. I don't have the face to return to the store now, it's not possible for 4 drives to fail under 4 months. A few ideas that I have been thinking: Is it possible that I fuck up something when I partition and resync the drives? Can it be so bad that it physicaly wrecks the drive? (since the vendor supplied tool says that the drive is damaged) I do the partitoning with fdisk, and use the same block size for the raid1 partitions (I check the exact block sizes with fdisk -lu) Is it possible that the linux kernel or mdadm, or something is not compatible with this exact brand of HDDs, and thrashes them? Is it possible that it may be the shoebox? Try placing it somewhere else? It's under a shelf now, so humidity is not a problem either. Is it possible that a normal PC case will solve my problem (I'm going to shoot myself then)? I will get a picture tomorrow. Am I just simply cursed? Any help or speculation is greatly appreciated. Edit: The power strip is guarded against overvoltage. Edit2: I have moved inbetween these 4 months, so the possibility of the cause being "dirty" electricity in both places, is very low. Edit3: I have checked the voltages in the BIOS (couldn't borrow a multimeter), and they are all seem correct, the biggest discrepancy is in the 12V, because it's supplying 11.3. Should I be worried about that? Edit4: I put my desktop PC's PSU into the server. The BIOS reported much more accurate voltage readings, and also it has successfully rebuilt the raid1 array, which took some 3-4 hours, so I feel a little positive now. Will get a new PSU tomorrow to test with that. Also, attaching the picture about the box: (disregard the 3rd drive)

    Read the article

  • Alternate way to create a clone of a UNIX System

    - by Spirit
    THE STORY: (If you don't like to read much, down below is the question :) ) Where I work we have two HP RP2470 servers same hardware same number of hard drives same everything :). One of them is a production server and runs HP-UX 11.00. The poor ba***rd hasn't been turned off for years and now I have to make a clone of it on the other server - just in case, for redundancy. The problem is simple (or not simple) as I have to make the the other server exactly the same. However the old version of OS (UX 11.00 is a history now) and the old software running on it, have made my task almost impossible. On the production server there is also a cloning/recover utility Ignite-UX. I tried many times to create a recovery tape with it. Then when I load the tape on the backup server, it succeeds with the loading of the tape (no errors no warnings) but on the next restart it fails to load the OS :S and drops into HP`s ISL prompt. --- THE QUESTION: Is there an alternate way to create a clone of the Unix System? The environment is: 1. 2x HP RP2470 Servers (non-Intel), same hardware, same number od HDDs (two each of them) same everything. 2. OS running: HP-UX 11.00 The production server has to be cloned without downtime - sadly :( as I hope that they will reconsider on this one For example (like on Windows platforms), if you try to copy an entire HDD with Windows inside on another HDD, and then put that HDD on another PC it will still work, as long as the hardware is the same. Can I do something like that with a Unix system? Can I somehow COPY the contents of the entire HDD, put those on another HDD, and then just load the HDD into the other server? (If you haven't read the story the servers are exactly the same) Will it work? Can it be done with ordinary commands like cp or dump or something like that? Does any one have a similar experience? --- UPDATE: 26.01.2012 NOTE: The update is related to "The Story". If you haven't read that part then you can skip this update. This is just a short update on the recover logs from the Ignite Tape.. someone with more exp. might notice something.. ... --- READING CONTENTS OF THE IGNITE TAPE --- --- OUTPUT OMITED --- ... ... x ./configure3, 413696 bytes, 808 tape blocks x ./monitor_bpr, 20480 bytes, 40 tape blocks * Download_mini-system: Complete * Loading_software: Begin * Installing boot area on disk. * Enabling swap areas. * Backing up LVM configuration for "vg00". * Processing the archive source (Recovery Archive). * Wed Jan 25 15:27:32 EST 2012: Starting archive load of the source (Recovery Archive). * Positioning the tape (/dev/rmt/0mn). * Archive extraction from tape is beginning. Please wait. * Wed Jan 25 15:39:52 EST 2012: Completed archive load of the source (Recovery Archive). * Executing user specified script: "/opt/ignite/data/scripts/os_arch_post_l". * Running in recovery mode (os_arch_post_l). * Running the ioinit command ("/sbin/ioinit -c") * Creating device files via the insf command. insf: Installing special files for sdisk instance 0 address 0/0/1/1.15.0 insf: Installing special files for sdisk instance 1 address 0/0/2/0.1.0 insf: Installing special files for sdisk instance 2 address 0/0/2/1.15.0 insf: Installing special files for stape instance 0 address 0/0/1/0.3.0 insf: Installing special files for btlan instance 0 address 0/0/0/0 insf: Installing special files for btlan instance 1 address 0/2/0/0 insf: Installing special files for pseudo driver dlpi insf: Installing special files for pseudo driver kepd insf: Installing special files for pseudo driver framebuf insf: Installing special files for pseudo driver sad * Running "/opt/upgrade/bin/tlinstall -v" and correcting transition link permissions. * Constructing the bootconf file. * Setting primary boot path to "0/0/1/1.15.0". * Executing: "/var/adm/sw/products/PHSS_20146/pfiles/iux_postload". * Executing: "/var/adm/sw/products/PHSS_25982/pfiles/iux_postload". NOTE: tlinstall is searching filesystem - please be patient NOTE: Successfully completed * Loading_software: Complete * Build_Kernel: Begin NOTE: Since the /stand/vmunix kernel is already in place, the kernel will not be re-built. Note that no mod_kernel directives will be processed. * Build_Kernel: Complete * Boot_From_Client_Disk: Begin * Rebooting machine as expected. NOTE: Rebooting system. sync'ing disks (0 buffers to flush): 0 buffers not flushed 0 buffers still dirty Closing open logical volumes... Done Console reset done. Boot device reset done. ********** VIRTUAL FRONT PANEL ********** System Boot detected ***************************************** LEDs: RUN ATTENTION FAULT REMOTE POWER FLASH OFF OFF ON ON LED State: Running non-OS code. (i.e. Boot or Diagnostics) ... ... ... --- SERVER IS PERFORMING POST SEQUENCE HERE --- --- OUTPUT OMITED --- ... ... ... ***************************************** ************ EARLY BOOT VFP ************* End of early boot detected ***************************************** Firmware Version 43.50 Duplex Console IO Dependent Code (IODC) revision 1 ------------------------------------------------------------------------------ (c) Copyright 1995-2002, Hewlett-Packard Company, All rights reserved ------------------------------------------------------------------------------ Processor Speed State CoProcessor State Cache Size Number State Inst Data --------- -------- --------------------- ----------------- ------------ 0 650 MHz Active Functional 750 KB 1.5 MB 1 650 MHz Idle Functional 750 KB 1.5 MB Central Bus Speed (in MHz) : 120 Available Memory : 2097152 KB Good Memory Required : 16140 KB Primary boot path: 0/0/1/1.15 Alternate boot path: 0/0/2/1.15 Console path: 0/0/4/1.643 Keyboard path: 0/0/4/0.0 Processor is starting autoboot process. To discontinue, press any key within 10 seconds. 10 seconds expired. Proceeding... Trying Primary Boot Path ------------------------ Booting... Boot IO Dependent Code (IODC) revision 1 HARD Booted. ISL Revision A.00.38 OCT 26, 1994 ISL booting hpux ISL>

    Read the article

< Previous Page | 69 70 71 72 73 74  | Next Page >