Search Results

Search found 35683 results on 1428 pages for 'volume id'.

Page 37/1428 | < Previous Page | 33 34 35 36 37 38 39 40 41 42 43 44  | Next Page >

  • How to set up a centralized backup server with lots of offsite workstations, intermittent internet connectivity, and stubborn users?

    - by Zac B
    This might be an impossible question. Context: We have a bunch of computers across around 1000 users. We have a centralized office where 900 of the users work, most of the time. Most of the computers are laptops. They are very frequently coming on and off the network for hours at a time. Users often take their computers home and do lots of work from home. In addition, there are a handful of users who work elsewhere in the country, who are offline (no internet connection whatsoever) for more than half of the time they use their machines. All of the machines are Windows 7/XP. Problem: People are always losing data. One day someone accidentally deletes a bunch of files. The next day someone else installs a bad driver or tries to mess with something in system32 and needs a personal data backup/reinstall of Windows. Because of how many of our business operations are done without an internet connection, and how frequently computers come on- and offline, it's unfeasible to make users use network storage for all of their data. We tried giving them Dropboxes, and they stored their files elsewhere. We bought and deployed Altiris, and they uninstalled it and blamed us when they couldn't get files back that they accidentally deleted while they were offline and hadn't taken a backup in months. We tried teaching them backup best-practices, and using scheduled sync tools to upload things to the network drives, and they turned them off because they "looked like viruses". It doesn't help that many of these users are pretty high up in the business and are not amicable to any sort of "you need to do something regularly because we say so" solution. Question: Other than finding another job where IT is treated differently and users are willing to follow best practices, how would people recommend I implement a file backup solution that supports the following: Backs up to a centralized server over LAN or WAN whenever a network link becomes available, or on a schedule. Supports interrupted/resumed backups (and hopefully file-delta only backups), since connections to the network (WAN or LAN) are often slow and only open for half an hour or so. Supports relatively rapid, "I accidentally deleted the TPS reports! Oh no!" single-file recovery, ideally administered from the central backup server rather than the client PC. Supports local-to-local file delta backup on a schedule, so that users without a network connection for a few days can still retrieve accidental deletions or whatnot. Ideally, the local stored backups would be pushed up to the server whenever network link is available. Isn't configurable on the clients without certain credentials. Because the CFOs (who won't give up their admin rights on the domain) will disable it if they can. Backs up the entire hard drive. There are people who are self-righteous about storing things in C:\, or in the recycle bin, or in the C:\Windows dir (yes, I know). I'm fine integrating multiple products/solutions, or scripting different programs together myself (I'm a somewhat competent programmer), but I've been drawing a blank on where to start. Dropbox is folder-specific, Altiris doesn't cope with LAN outages or interrupted/resumed backups, Volume Shadow Copy is awesome for a local-to-local solution, but I don't know how to push days of stored shadow copies up to a server in a 2 hour window of network access. The company is fine with spending decent money on this, thousands (USD) on a server, and hundreds on clients, if necessary. I want to emphasize that this isn't a shopping list request. While I wish there was a program out there that did what I want, I've looked pretty hard, and not found anything that fits the bill. Instead, I'm hoping for ideas on where to start hacking things together from scratch/from different technologies to make something stable that works. Cheers!

    Read the article

  • How to implement an ID field on a POCO representing an Identity field in MS SQL?

    - by Dr. Zim
    If I have a Domain Model that has an ID that maps to a SQL identity column, what does the POCO look like that contains that field? Candidate 1: Allows anyone to set and get the ID. I don't think we want anyone setting the ID except the Repository, from the SQL table. public class Thing { public int ID {get;set;} } Candidate 2: Allows someone to set the ID upon creation, but we won't know the ID until after we create the object (factory creates a blank Thing object where ID = 0 until we persist it). How would we set the ID after persisting? public class Thing { public Thing () : This (ID: 0) {} public Thing (int ID) { this.ID = ID } private int _ID; public int ID { get { return this.ID;}; } Candidate 3: Methods to set ID? Somehow we would need to allow the Repository to set the ID without allowing the consumer to change it. Any ideas? Is this barking up the wrong tree? Do we send the object to the Repository, save it, throw it away, then create a new object from the loaded version and return that as a new object?

    Read the article

  • How to implement an ID field on a POCO representing an Identity field in SQL Server?

    - by Dr. Zim
    If I have a Domain Model that has an ID that maps to a SQL Server identity column, what does the POCO look like that contains that field? Candidate 1: Allows anyone to set and get the ID. I don't think we want anyone setting the ID except the Repository, from the SQL table. public class Thing { public int ID {get;set;} } Candidate 2: Allows someone to set the ID upon creation, but we won't know the ID until after we create the object (factory creates a blank Thing object where ID = 0 until we persist it). How would we set the ID after persisting? public class Thing { public Thing () : This (ID: 0) {} public Thing (int ID) { this.ID = ID } private int _ID; public int ID { get { return this.ID;}; } Candidate 3: Methods to set ID? Somehow we would need to allow the Repository to set the ID without allowing the consumer to change it. Any ideas? Is this barking up the wrong tree? Do we send the object to the Repository, save it, throw it away, then create a new object from the loaded version and return that as a new object?

    Read the article

  • RDP issue..RDP not working providing logon to access on user id

    - by Mohammed Najmuddin
    I got a request to provide logon to access to one of Windows 2008 server, after I added this server on user's logon to list and given local admin access to server. I am not able to take RDP session. Its giving error.. Local Security authority failed to connect... I see Event ID 56 ..Source Termdd When I given access to Windows 2003 it working fine.. I checked remote desktop security settings..its configured "Remote desktop security layer" Can somebody help to fix this issue... Regards, Mohammed Najmuddin

    Read the article

  • Receiving Event ID: 10107, Hyper-V -VMMS

    - by Stargaten
    We are using physical disk on two of Guest operating systems. Is this a know issue? Do we need to have DPM 2010? "One or more physical disks are attached to virtual machine 'Myserver'. Back up programs that use the Hyper-V VSS writer cannot back up volumes that are attached to virtual machines as physical disks. To avoid potential data loss, use another method to back up the data on the physical disks. If you restore the data on this virtual machine, make sure to check the data of the physical disk for integrity. (Virtual machine ID 8EF3C0CB-967D-4D67-B4D8-7B782C7AC07C)"

    Read the article

  • SBS 2003 no network connection and acting strangely a bunch of Event ID 13568

    - by JMan78
    I've got an SBS 2003 Standard server and it was running fine until earlier today when it was rebooted, after the reboot it has no network connection, I can't seem to right click on a lot of stuff and get dialog boxes, I can't launch IE, it's acting extremely strange. We are dead in the water at this point. I checked the event logs and noticed we're getting a ton of Event ID's 13568. I thought it was a Journal Wrap error, and while I was going to try to fix it using this article: http://support.microsoft.com/kb/290762 I can't even do that because after I set the D4 value, then went to restart NTFRS from command prompt and I got the following: System Error 1059 has occurred. Circular service dependency was specified. That is where I'm at and haven't been able to figure anything else out. ALso, I've posted this on EE, there are some screens of event logs and such there: http://www.experts-exchange.com/OS/Microsoft_Operating_Systems/Server/SBS_Small_Business_Server/Q_27969593.html

    Read the article

  • Ubuntu 10.04 upgrade: gdm-simple-greeter no seat-id found

    - by Broam
    Upgraded a machine straight from Ubuntu 8.04 to 10.04. After a successful upgrade, GDM shows no users on the login screen, with the error message: "gdm-simple-greeter no seat-id found" Users can log in via text mode. A recovery user can startx without issue. I'm convinced this is an issue with GDM and/or ConsoleKit. What are my options? I've debated a reinstall but would rather not. I've also thought of switching to KDM or XDM instead. I've filed a launchpad bug--no responses yet.

    Read the article

  • "Target the specific user you will be using and assign it user id 0/group 0"

    - by Jeremy Holovacs
    I am trying to virtualize an Ubuntu machine using VMWare vCenter Converter, but ran into permissions issues. I followed the instructions of part 1 and 2 on this page but when I got to "For Ubuntu operating systems further configuration is needed" I started running into trouble. I'm decent at Linux, but I'm not an experienced sysadmin. How do I Target the specific user you will be using and assign it user id 0/group 0? How do I Ensure that you also still enable Allow root to ssh even though you are not using the root account? Thanks for your help.

    Read the article

  • Could not retrieve backup settings for primary ID in Log shipping

    - by user1723139
    I am doing log shipping between two Amazon EC2 instances running Windows Server 2008 R2 with SQL Server 2008 R2 standard edition. Both the instances are in the same domain and I can access the shared folders between the instances. The SQL server service account, agent service account are all running under a domain account. When I activate log shipping (with stand by mode restore in secondary server), the initial backup gets restored on the secondary. After that the backup operation is getting failed and i get the following error message: *** Error: Could not retrieve backup settings for primary ID 'xxxxxx-xxxx-xxxx-xxxx-4d772cd7337e'.(Microsoft.SqlServer.Management.LogShipping) *** *** Error: Failed to connect to server IP-0A7653F2.(Microsoft.SqlServer.ConnectionInfo) *** ****** Error: A network-related or instance-specific error occurred while establishing a connection to SQL Server.******** **The server was not found or was not accessible. Verify that the instance name is correct and that SQL Server is configured to allow remote connections. (provider: Named Pipes Provider, error: 40 - Could not open a connection to SQL Server)(.Net SqlClient Data Provider) *** **----- END OF TRANSACTION LOG BACKUP -----**** Any ideas?

    Read the article

  • Does every SmartCard have unique ID?

    - by mr.b
    Does anyone knows if every SmartCard has some unique identifying number that can be read by using ordinary card reader? Number format doesn't matter at all (or string format), just the fact that it exists and that it's readable is enough. I know that new smartcards can be programmed to my liking, and that I can issue unique IDs to them in the process, but I'm more interested in reading unique ID from credit cards, phone calling cards, stuff like that. I should probably mention what is intended purpose for this, before I get incinerated. I want to find easy way to uniquely identify walk-in customers, most of people have at least one smartcard in their pocket...

    Read the article

  • Wrong Sound Blaster's PCI ID within Windows

    - by pavian
    I own Sound Blaster X-Fi Titanium, it was working with no problems under Linux or Windows 7. It's original PCI ID is 1102:000b but now I see different within MS Windows. BIOS setup: 1102:000b GNU/Linux: 1102:000b Windows 7: 1102:000d Windows 8: 1102:000d In last days I'm experimenting with IOMMU PCI passthrough in Xen and I tried to pass this device to virtual Windows 7 and 8. Here I found this problem. I don't know if this is just coincidence or reason of my problem but it's wrong even in physical system. Windows detects 1102:000d as a High Definition Audio sound device (I guess this name, I have localized Windows, but this is general name, the same was with Realtek HDA before drivers), it's playing but it's unstable (Windows speaker testing can crash that application) and I can't install Creative software. Used driver is hdaudio.sys. Booting in BIOS or UEFI mode doesn't change anything. Nor CMOS clean. Someone met the same problem.

    Read the article

  • Message-ID contains multiple '@' characters

    - by Thomaschaaf
    I am looking at my server's spamassasin and have stumbled across the "problem" that my outgoing email sends out its own X-Report and X-Score in the header (programs used are Outlook 2007 as client and exim4 with the vexim plugin and Spamassasin). On the one hand I want to get rid of the sent X-Report which gets sent out with every email and on the other hand I still want to have it for incoming mails. While I was trying to fix this (which I still haven't) I stumbled across this "error" which makes my email less trustworthy: 1.4 MSGID_MULTIPLE_AT Message-ID contains multiple '@' characters How do I get rid of the multiple '@' characters?

    Read the article

  • Does ssh-copy-id overwrite previous keys?

    - by decker
    I haven't yet found any definitive answer on this using google. It seems like the answer is no, but I need to know for sure before I go ahead and do it. Does ssh-copy-id append the key to authorized_keys or does it overwrite the previous keys? Thanks. Addendum: So the answer is right there in the man page. Go figure. I guess the question can at least help fellow Google-jockeys like me who get a little too used to googling and finding tutorials (that often explain things in layman's terms for us poor folks who have only used Windows our whole lives).

    Read the article

  • Windows 7 just restarts, no BSOD, shows "Event ID 14 volsnap"

    - by Mdc007
    My Windows 7 just restarts without giving me a BSOD. When it reboots, it will hang sometimes and you have to let it rest. Sometimes it will print a message saying that Hal.dll is missing or corrupt. When it does restart I can't run a backup or a clone drive – it comes up with constant errors. I tried to run chkdsk – it says everything's clean. What is really puzzling is SyncToy on a different drive hangs half way through and won't run either. Machine: OCZ SSD Agility 2 for C drive SATA HDD for programs, documents on D drive MSI AMD 880 chipset Event viewer says something about critical power failure I/O operations and "Event ID 14 volsnap" – but that is just telling me the computer powered off which I already know.

    Read the article

  • Windows 8 Activation - Product ID: Not available

    - by Guy Thomas
    The situation: I downloaded Windows 8 RTM from MSDN (I have a subscription). Naturally, I downloaded the product key as well. Windows 8 installed like a dream: lightning fast with no problems. I accepted the product key at the beginning of the install. Next, I thought I would download Updates, but they failed, so I checked the system's activation in Control Panel System. Problem: It returned "Product ID: Not available." There's nothing under "Windows activation" that I can click on, no blue links. I had a 'Chat' with MSDN, who introduced me to SLUI.exe. On Windows 8 it did nothing. (On Windows 7 it is supposed to bring up the Activation Menu). I phoned the Microsoft Activation number, they told me to contact MSDN. MSDN left the 'chat' by telling me to contact Microsoft! Hmm... I wonder if anyone at SuperUser can help?

    Read the article

  • Exchange 2003 Event ID 9337 - Offline Address Book

    - by Creepycc
    I have a support issue with an Exchange 2003 SP2 server. Event ID: 9337 Description: OALGen did not find any recipients in address list '\Global Address List'. This offline address list will not be generated. - Default Offline Address List When you preview the Global Address list within Exchange Systems Manager all is fine. Turning off cached mode on Outllok clients still errors Public fiolders / System folders are fine OABINTEG detects no issues, Pfdavadmin has checked all DACL The GAL and OAB have been deleted / recreated several times (With differnt names) DCDIAG, NETDIAG, ExchangeBPA all run without error Exhausted Google links diagnosing this issue, any suggestions?

    Read the article

  • ISP Couldn't Verify My HFC MAC Id

    - by Ann Rahn
    An Internet Service Provider whom I used in the past, claims to provide me with Internet service. However, when I gave his technician the Hybrid Fiber - COX MAC ID on my Modem, he could not find it on his list. Also, the Internet Service Provider doesn't even have my current mailing address. All he has is my phone number and he is demanding payment. On the other hand, I use a cable service which provide TV. At this point, the Internet Service Provider is threatening to disconnect my Internet service. My question: How can I verify if I am getting the service from him or through the cable?

    Read the article

  • ssh asks for password despite ssh-copy-id

    - by Aliud Alius
    I've been using public key authentication on a remote server for some time now for remote shell use as well as for sshfs mounts. After forcing a umount of my sshfs directory, I noticed that ssh began to prompt me for a password. I tried purging the remote .ssh/authorized_keys from any mention the local machine, and I cleaned the local machine from references to the remote machine. I then repeated my ssh-copy-id, it prompted me for a password, and returned normally. But lo and behold, when I ssh to the remote server I am still prompted for a password. I'm a little confused as to what the issue could be, any suggestions?

    Read the article

  • Manage a lot of open ssh sessions to servers with id's for hostname

    - by kimausloos
    Lately I'm working a lot with a few open ssh connections to multiple VPS servers. The hostnames of the servers all follow an ID approach and it's becoming very difficult to know on what machine I'm working with. I was wondering if there is a way to put a name I define somewhere as the title of the terminal. So I would, for example, associate IP 123.123.123.123 with webserver-stg. When opening the connection to the IP, webserver-stg would automaticaly be displayed as the name of the session. Of course I'm not able to change code on the VPS servers, so the solution should be client-side.

    Read the article

  • one email have multiple open id , unable to retrive specific open id password?

    - by superUser
    I have multiple OPENID accouts refrencing same email address, now i forget one of my accout's password. and when i tried to recover my password then only one openid accout link sent to my mail address whereas i need another openid password reset link what i have to do?? although i m able to login through gmail, but i want to login through openid. i have mailed already? but no satisfactory answer?? how do i collect all open ID password reset link referencing same email address??

    Read the article

  • Take Control Of Web Control ClientID Values in ASP.NET 4.0

    Each server-side Web control in an ASP.NET Web Forms application has an ID property that identifies the Web control and is name by which the Web control is accessed in the code-behind class. When rendered into HTML, the Web control turns its server-side ID value into a client-side id attribute. Ideally, there would be a one-to-one correspondence between the value of the server-side ID property and the generated client-side id, but in reality things aren't so simple. By default, the rendered client-side id is formed by taking the Web control's ID property and prefixed it with the ID properties of its naming containers. In short, a Web control with an ID of txtName can get rendered into an HTML element with a client-side id like ctl00_MainContent_txtName. This default translation from the server-side ID property value to the rendered client-side id attribute can introduce challenges when trying to access an HTML element via JavaScript, which is typically done by id, as the page developer building the web page and writing the JavaScript does not know what the id value of the rendered Web control will be at design time. (The client-side id value can be determined at runtime via the Web control's ClientID property.) ASP.NET 4.0 affords page developers much greater flexibility in how Web controls render their ID property into a client-side id. This article starts with an explanation as to why and how ASP.NET translates the server-side ID value into the client-side id value and then shows how to take control of this process using ASP.NET 4.0. Read on to learn more! Read More >Did you know that DotNetSlackers also publishes .net articles written by top known .net Authors? We already have over 80 articles in several categories including Silverlight. Take a look: here.

    Read the article

  • Take Control Of Web Control ClientID Values in ASP.NET 4.0

    Each server-side Web control in an ASP.NET Web Forms application has an ID property that identifies the Web control and is name by which the Web control is accessed in the code-behind class. When rendered into HTML, the Web control turns its server-side ID value into a client-side id attribute. Ideally, there would be a one-to-one correspondence between the value of the server-side ID property and the generated client-side id, but in reality things aren't so simple. By default, the rendered client-side id is formed by taking the Web control's ID property and prefixed it with the ID properties of its naming containers. In short, a Web control with an ID of txtName can get rendered into an HTML element with a client-side id like ctl00_MainContent_txtName. This default translation from the server-side ID property value to the rendered client-side id attribute can introduce challenges when trying to access an HTML element via JavaScript, which is typically done by id, as the page developer building the web page and writing the JavaScript does not know what the id value of the rendered Web control will be at design time. (The client-side id value can be determined at runtime via the Web control's ClientID property.) ASP.NET 4.0 affords page developers much greater flexibility in how Web controls render their ID property into a client-side id. This article starts with an explanation as to why and how ASP.NET translates the server-side ID value into the client-side id value and then shows how to take control of this process using ASP.NET 4.0. Read on to learn more! Read More >

    Read the article

  • Binary XML file <line 20>: Error inflating class <Unknown>

    - by user2750644
    <RelativeLayout xmlns:android="http://schemas.android.com/apk/res/android" xmlns:tools="http://schemas.android.com/tools" android:layout_width="match_parent" android:layout_height="match_parent" android:layout_gravity="top|left|right" android:background="@drawable/bg" tools:context=".MainActivity" > <ImageView android:id="@+id/imageView1" android:layout_width="wrap_content" android:layout_height="50dp" android:layout_alignParentTop="true" android:layout_centerHorizontal="true" android:src="@drawable/top_red_bar" android:layout_alignParentLeft="true" android:paddingLeft="0dp" android:scaleType="centerCrop" /> <ImageView android:id="@+id/imageView2" android:layout_width="wrap_content" android:layout_height="45dp" android:layout_alignParentLeft="true" android:layout_alignParentTop="true" android:layout_alignTop="@+id/imageView1" android:layout_centerHorizontal="true" android:src="@drawable/job_bar" /> <ImageView android:id="@+id/imageView3" android:layout_width="238dp" android:layout_height="50dp" android:layout_below="@+id/imageView1" android:layout_centerHorizontal="true" android:layout_centerVertical="true" android:layout_marginTop="123dp" android:src="@drawable/button" android:alpha="0.85" android:onClick="myhandler"/> <ImageView android:id="@+id/imageView4" android:layout_width="wrap_content" android:layout_height="50dp" android:layout_alignParentLeft="true" android:layout_below="@+id/imageView3" android:layout_marginTop="24dp" android:src="@drawable/button" android:alpha="0.85"/> <ImageView android:id="@+id/imageView5" android:layout_width="35dp" android:layout_height="45dp" android:layout_above="@+id/imageView4" android:layout_alignLeft="@+id/imageView3" android:layout_marginLeft="21dp" android:src="@drawable/employer" /> <ImageView android:id="@+id/imageView6" android:layout_width="50dp" android:layout_height="50dp" android:layout_alignLeft="@+id/imageView5" android:layout_alignTop="@+id/imageView4" android:src="@drawable/jobseekers" /> <TextView android:id="@+id/textView1" android:layout_width="wrap_content" android:layout_height="wrap_content" android:layout_alignTop="@+id/imageView5" android:layout_centerHorizontal="true" android:layout_marginLeft="18dp" android:layout_toRightOf="@+id/imageView6" android:text="Employer" android:textAppearance="?android:attr/textAppearanceLarge" android:textColor="#ffffff" android:textStyle="bold" /> <TextView android:id="@+id/textView2" android:layout_width="wrap_content" android:layout_height="wrap_content" android:layout_alignLeft="@+id/textView1" android:layout_alignTop="@+id/imageView4" android:text="Job Seekers" android:textAppearance="?android:attr/textAppearanceLarge" android:textColor="#ffffff" android:textStyle="bold" /> </RelativeLayout> My app gets crashed with error statement Binary XML file <line 20>: Error inflating class <Unknown>. Why does this happen?? Could anyone fix this? And interestingly, when I remove all code except background, it works!!!

    Read the article

  • Windows Server 2008 R2 Error CAPI2 Event ID 4107

    - by umar bhatti
    I am getting following error on couple of my 2008 R2 servers. I have tried couple of fixes which didn't fix the issues. Log Name: Application Source: Microsoft-Windows-CAPI2 Date: 18/03/2013 09:48:40 Event ID: 4107 Task Category: None Level: Error Keywords: Classic User: N/A Computer: ServerName Description: Failed extract of third-party root list from auto update cab at: <http://ctldl.windowsupdate.com/msdownload/update/v3/static/trustedr/en/authrootstl.cab> with error: The data is invalid. . Event Xml: <Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event"> <System> <Provider Name="Microsoft-Windows-CAPI2" Guid="{5bbca4a8-b209-48dc-a8c7-b23d3e5216fb}" EventSourceName="Microsoft-Windows-CAPI2" /> <EventID Qualifiers="0">4107</EventID> <Version>0</Version> <Level>2</Level> <Task>0</Task> <Opcode>0</Opcode> <Keywords>0x8080000000000000</Keywords> <TimeCreated SystemTime="2013-03-18T09:48:40.169581600Z" /> <EventRecordID>8713</EventRecordID> <Correlation /> <Execution ProcessID="412" ThreadID="5288" /> <Channel>Application</Channel> <Computer>ServerName <Security /> </System> <EventData> <Data>http://ctldl.windowsupdate.com/msdownload/update/v3/static/trustedr/en/authrootstl.cab</Data> <Data>The data is invalid. </Data> </EventData> </Event>

    Read the article

  • Oracle Coherence, Split-Brain and Recovery Protocols In Detail

    - by Ricardo Ferreira
    This article provides a high level conceptual overview of Split-Brain scenarios in distributed systems. It will focus on a specific example of cluster communication failure and recovery in Oracle Coherence. This includes a discussion on the witness protocol (used to remove failed cluster members) and the panic protocol (used to resolve Split-Brain scenarios). Note that the removal of cluster members does not necessarily indicate a Split-Brain condition. Oracle Coherence does not (and cannot) detect a Split-Brain as it occurs, the condition is only detected when cluster members that previously lost contact with each other regain contact. Cluster Topology and Configuration In order to create an good didactic for the article, let's assume a cluster topology and configuration. In this example we have a six member cluster, consisting of one JVM on each physical machine. The member IDs are as follows: Member ID  IP Address  1  10.149.155.76  2  10.149.155.77  3  10.149.155.236  4  10.149.155.75  5  10.149.155.79  6  10.149.155.78 Members 1, 2, and 3 are connected to a switch, and members 4, 5, and 6 are connected to a second switch. There is a link between the two switches, which provides network connectivity between all of the machines. Member 1 is the first member to join this cluster, thus making it the senior member. Member 6 is the last member to join this cluster. Here is a log snippet from Member 6 showing the complete member set: 2010-02-26 15:27:57.390/3.062 Oracle Coherence GE 3.5.3/465p2 <Info> (thread=main, member=6): Started DefaultCacheServer... SafeCluster: Name=cluster:0xDDEB Group{Address=224.3.5.3, Port=35465, TTL=4} MasterMemberSet ( ThisMember=Member(Id=6, Timestamp=2010-02-26 15:27:58.635, Address=10.149.155.78:8088, MachineId=1102, Location=process:228, Role=CoherenceServer) OldestMember=Member(Id=1, Timestamp=2010-02-26 15:27:06.931, Address=10.149.155.76:8088, MachineId=1100, Location=site:usdhcp.oraclecorp.com,machine:dhcp-burlington6-4fl-east-10-149,process:511, Role=CoherenceServer) ActualMemberSet=MemberSet(Size=6, BitSetCount=2 Member(Id=1, Timestamp=2010-02-26 15:27:06.931, Address=10.149.155.76:8088, MachineId=1100, Location=site:usdhcp.oraclecorp.com,machine:dhcp-burlington6-4fl-east-10-149,process:511, Role=CoherenceServer) Member(Id=2, Timestamp=2010-02-26 15:27:17.847, Address=10.149.155.77:8088, MachineId=1101, Location=site:usdhcp.oraclecorp.com,machine:dhcp-burlington6-4fl-east-10-149,process:296, Role=CoherenceServer) Member(Id=3, Timestamp=2010-02-26 15:27:24.892, Address=10.149.155.236:8088, MachineId=1260, Location=site:usdhcp.oraclecorp.com,machine:dhcp-burlington6-4fl-east-10-149,process:32459, Role=CoherenceServer) Member(Id=4, Timestamp=2010-02-26 15:27:39.574, Address=10.149.155.75:8088, MachineId=1099, Location=process:800, Role=CoherenceServer) Member(Id=5, Timestamp=2010-02-26 15:27:49.095, Address=10.149.155.79:8088, MachineId=1103, Location=site:usdhcp.oraclecorp.com,machine:dhcp-burlington6-4fl-east-10-149,process:3229, Role=CoherenceServer) Member(Id=6, Timestamp=2010-02-26 15:27:58.635, Address=10.149.155.78:8088, MachineId=1102, Location=process:228, Role=CoherenceServer) ) RecycleMillis=120000 RecycleSet=MemberSet(Size=0, BitSetCount=0 ) ) At approximately 15:30, the connection between the two switches is severed: Thirty seconds later (the default packet timeout in development mode) the logs indicate communication failures across the cluster. In this example, the communication failure was caused by a network failure. In a production setting, this type of communication failure can have many root causes, including (but not limited to) network failures, excessive GC, high CPU utilization, swapping/virtual memory, and exceeding maximum network bandwidth. In addition, this type of failure is not necessarily indicative of a split brain. Any communication failure will be logged in this fashion. Member 2 logs a communication failure with Member 5: 2010-02-26 15:30:32.638/196.928 Oracle Coherence GE 3.5.3/465p2 <Warning> (thread=PacketPublisher, member=2): Timeout while delivering a packet; requesting the departure confirmation for Member(Id=5, Timestamp=2010-02-26 15:27:49.095, Address=10.149.155.79:8088, MachineId=1103, Location=site:usdhcp.oraclecorp.com,machine:dhcp-burlington6-4fl-east-10-149,process:3229, Role=CoherenceServer) by MemberSet(Size=2, BitSetCount=2 Member(Id=1, Timestamp=2010-02-26 15:27:06.931, Address=10.149.155.76:8088, MachineId=1100, Location=site:usdhcp.oraclecorp.com,machine:dhcp-burlington6-4fl-east-10-149,process:511, Role=CoherenceServer) Member(Id=4, Timestamp=2010-02-26 15:27:39.574, Address=10.149.155.75:8088, MachineId=1099, Location=process:800, Role=CoherenceServer) ) The Coherence clustering protocol (TCMP) is a reliable transport mechanism built on UDP. In order for the protocol to be reliable, it requires an acknowledgement (ACK) for each packet delivered. If a packet fails to be acknowledged within the configured timeout period, the Coherence cluster member will log a packet timeout (as seen in the log message above). When this occurs, the cluster member will consult with other members to determine who is at fault for the communication failure. If the witness members agree that the suspect member is at fault, the suspect is removed from the cluster. If the witnesses unanimously disagree, the accuser is removed. This process is known as the witness protocol. Since Member 2 cannot communicate with Member 5, it selects two witnesses (Members 1 and 4) to determine if the communication issue is with Member 5 or with itself (Member 2). However, Member 4 is on the switch that is no longer accessible by Members 1, 2 and 3; thus a packet timeout for member 4 is recorded as well: 2010-02-26 15:30:35.648/199.938 Oracle Coherence GE 3.5.3/465p2 <Warning> (thread=PacketPublisher, member=2): Timeout while delivering a packet; requesting the departure confirmation for Member(Id=4, Timestamp=2010-02-26 15:27:39.574, Address=10.149.155.75:8088, MachineId=1099, Location=process:800, Role=CoherenceServer) by MemberSet(Size=2, BitSetCount=2 Member(Id=1, Timestamp=2010-02-26 15:27:06.931, Address=10.149.155.76:8088, MachineId=1100, Location=site:usdhcp.oraclecorp.com,machine:dhcp-burlington6-4fl-east-10-149,process:511, Role=CoherenceServer) Member(Id=6, Timestamp=2010-02-26 15:27:58.635, Address=10.149.155.78:8088, MachineId=1102, Location=process:228, Role=CoherenceServer) ) Member 1 has the ability to confirm the departure of member 4, however Member 6 cannot as it is also inaccessible. At the same time, Member 3 sends a request to remove Member 6, which is followed by a report from Member 3 indicating that Member 6 has departed the cluster: 2010-02-26 15:30:35.706/199.996 Oracle Coherence GE 3.5.3/465p2 <D5> (thread=Cluster, member=2): MemberLeft request for Member 6 received from Member(Id=3, Timestamp=2010-02-26 15:27:24.892, Address=10.149.155.236:8088, MachineId=1260, Location=site:usdhcp.oraclecorp.com,machine:dhcp-burlington6-4fl-east-10-149,process:32459, Role=CoherenceServer) 2010-02-26 15:30:35.709/199.999 Oracle Coherence GE 3.5.3/465p2 <D5> (thread=Cluster, member=2): MemberLeft notification for Member 6 received from Member(Id=3, Timestamp=2010-02-26 15:27:24.892, Address=10.149.155.236:8088, MachineId=1260, Location=site:usdhcp.oraclecorp.com,machine:dhcp-burlington6-4fl-east-10-149,process:32459, Role=CoherenceServer) The log for Member 3 determines how Member 6 departed the cluster: 2010-02-26 15:30:35.161/191.694 Oracle Coherence GE 3.5.3/465p2 <Warning> (thread=PacketPublisher, member=3): Timeout while delivering a packet; requesting the departure confirmation for Member(Id=6, Timestamp=2010-02-26 15:27:58.635, Address=10.149.155.78:8088, MachineId=1102, Location=process:228, Role=CoherenceServer) by MemberSet(Size=2, BitSetCount=2 Member(Id=1, Timestamp=2010-02-26 15:27:06.931, Address=10.149.155.76:8088, MachineId=1100, Location=site:usdhcp.oraclecorp.com,machine:dhcp-burlington6-4fl-east-10-149,process:511, Role=CoherenceServer) Member(Id=2, Timestamp=2010-02-26 15:27:17.847, Address=10.149.155.77:8088, MachineId=1101, Location=site:usdhcp.oraclecorp.com,machine:dhcp-burlington6-4fl-east-10-149,process:296, Role=CoherenceServer) ) 2010-02-26 15:30:35.165/191.698 Oracle Coherence GE 3.5.3/465p2 <Info> (thread=Cluster, member=3): Member departure confirmed by MemberSet(Size=2, BitSetCount=2 Member(Id=1, Timestamp=2010-02-26 15:27:06.931, Address=10.149.155.76:8088, MachineId=1100, Location=site:usdhcp.oraclecorp.com,machine:dhcp-burlington6-4fl-east-10-149,process:511, Role=CoherenceServer) Member(Id=2, Timestamp=2010-02-26 15:27:17.847, Address=10.149.155.77:8088, MachineId=1101, Location=site:usdhcp.oraclecorp.com,machine:dhcp-burlington6-4fl-east-10-149,process:296, Role=CoherenceServer) ); removing Member(Id=6, Timestamp=2010-02-26 15:27:58.635, Address=10.149.155.78:8088, MachineId=1102, Location=process:228, Role=CoherenceServer) In this case, Member 3 happened to select two witnesses that it still had connectivity with (Members 1 and 2) thus resulting in a simple decision to remove Member 6. Given the departure of Member 6, Member 2 is left with a single witness to confirm the departure of Member 4: 2010-02-26 15:30:35.713/200.003 Oracle Coherence GE 3.5.3/465p2 <Info> (thread=Cluster, member=2): Member departure confirmed by MemberSet(Size=1, BitSetCount=2 Member(Id=1, Timestamp=2010-02-26 15:27:06.931, Address=10.149.155.76:8088, MachineId=1100, Location=site:usdhcp.oraclecorp.com,machine:dhcp-burlington6-4fl-east-10-149,process:511, Role=CoherenceServer) ); removing Member(Id=4, Timestamp=2010-02-26 15:27:39.574, Address=10.149.155.75:8088, MachineId=1099, Location=process:800, Role=CoherenceServer) In the meantime, Member 4 logs a missing heartbeat from the senior member. This message is also logged on Members 5 and 6. 2010-02-26 15:30:07.906/150.453 Oracle Coherence GE 3.5.3/465p2 <Info> (thread=PacketListenerN, member=4): Scheduled senior member heartbeat is overdue; rejoining multicast group. Next, Member 4 logs a TcpRing failure with Member 2, thus resulting in the termination of Member 2: 2010-02-26 15:30:21.421/163.968 Oracle Coherence GE 3.5.3/465p2 <D4> (thread=Cluster, member=4): TcpRing: Number of socket exceptions exceeded maximum; last was "java.net.SocketTimeoutException: connect timed out"; removing the member: 2 For quick process termination detection, Oracle Coherence utilizes a feature called TcpRing which is a sparse collection of TCP/IP-based connections between different members in the cluster. Each member in the cluster is connected to at least one other member, which (if at all possible) is running on a different physical box. This connection is not used for any data transfer, only heartbeat communications are sent once a second per each link. If a certain number of exceptions are thrown while trying to re-establish a connection, the member throwing the exceptions is removed from the cluster. Member 5 logs a packet timeout with Member 3 and cites witnesses Members 4 and 6: 2010-02-26 15:30:29.791/165.037 Oracle Coherence GE 3.5.3/465p2 <Warning> (thread=PacketPublisher, member=5): Timeout while delivering a packet; requesting the departure confirmation for Member(Id=3, Timestamp=2010-02-26 15:27:24.892, Address=10.149.155.236:8088, MachineId=1260, Location=site:usdhcp.oraclecorp.com,machine:dhcp-burlington6-4fl-east-10-149,process:32459, Role=CoherenceServer) by MemberSet(Size=2, BitSetCount=2 Member(Id=4, Timestamp=2010-02-26 15:27:39.574, Address=10.149.155.75:8088, MachineId=1099, Location=process:800, Role=CoherenceServer) Member(Id=6, Timestamp=2010-02-26 15:27:58.635, Address=10.149.155.78:8088, MachineId=1102, Location=process:228, Role=CoherenceServer) ) 2010-02-26 15:30:29.798/165.044 Oracle Coherence GE 3.5.3/465p2 <Info> (thread=Cluster, member=5): Member departure confirmed by MemberSet(Size=2, BitSetCount=2 Member(Id=4, Timestamp=2010-02-26 15:27:39.574, Address=10.149.155.75:8088, MachineId=1099, Location=process:800, Role=CoherenceServer) Member(Id=6, Timestamp=2010-02-26 15:27:58.635, Address=10.149.155.78:8088, MachineId=1102, Location=process:228, Role=CoherenceServer) ); removing Member(Id=3, Timestamp=2010-02-26 15:27:24.892, Address=10.149.155.236:8088, MachineId=1260, Location=site:usdhcp.oraclecorp.com,machine:dhcp-burlington6-4fl-east-10-149,process:32459, Role=CoherenceServer) Eventually we are left with two distinct clusters consisting of Members 1, 2, 3 and Members 4, 5, 6, respectively. In the latter cluster, Member 4 is promoted to senior member. The connection between the two switches is restored at 15:33. Upon the restoration of the connection, the cluster members immediately receive cluster heartbeats from the two senior members. In the case of Members 1, 2, and 3, the following is logged: 2010-02-26 15:33:14.970/369.066 Oracle Coherence GE 3.5.3/465p2 <Warning> (thread=Cluster, member=1): The member formerly known as Member(Id=4, Timestamp=2010-02-26 15:30:35.341, Address=10.149.155.75:8088, MachineId=1099, Location=process:800, Role=CoherenceServer) has been forcefully evicted from the cluster, but continues to emit a cluster heartbeat; henceforth, the member will be shunned and its messages will be ignored. Likewise for Members 4, 5, and 6: 2010-02-26 15:33:14.343/336.890 Oracle Coherence GE 3.5.3/465p2 <Warning> (thread=Cluster, member=4): The member formerly known as Member(Id=1, Timestamp=2010-02-26 15:30:31.64, Address=10.149.155.76:8088, MachineId=1100, Location=site:usdhcp.oraclecorp.com,machine:dhcp-burlington6-4fl-east-10-149,process:511, Role=CoherenceServer) has been forcefully evicted from the cluster, but continues to emit a cluster heartbeat; henceforth, the member will be shunned and its messages will be ignored. This message indicates that a senior heartbeat is being received from members that were previously removed from the cluster, in other words, something that should not be possible. For this reason, the recipients of these messages will initially ignore them. After several iterations of these messages, the existence of multiple clusters is acknowledged, thus triggering the panic protocol to reconcile this situation. When the presence of more than one cluster (i.e. Split-Brain) is detected by a Coherence member, the panic protocol is invoked in order to resolve the conflicting clusters and consolidate into a single cluster. The protocol consists of the removal of smaller clusters until there is one cluster remaining. In the case of equal size clusters, the one with the older Senior Member will survive. Member 1, being the oldest member, initiates the protocol: 2010-02-26 15:33:45.970/400.066 Oracle Coherence GE 3.5.3/465p2 <Warning> (thread=Cluster, member=1): An existence of a cluster island with senior Member(Id=4, Timestamp=2010-02-26 15:27:39.574, Address=10.149.155.75:8088, MachineId=1099, Location=process:800, Role=CoherenceServer) containing 3 nodes have been detected. Since this Member(Id=1, Timestamp=2010-02-26 15:27:06.931, Address=10.149.155.76:8088, MachineId=1100, Location=site:usdhcp.oraclecorp.com,machine:dhcp-burlington6-4fl-east-10-149,process:511, Role=CoherenceServer) is the senior of an older cluster island, the panic protocol is being activated to stop the other island's senior and all junior nodes that belong to it. Member 3 receives the panic: 2010-02-26 15:33:45.803/382.336 Oracle Coherence GE 3.5.3/465p2 <Error> (thread=Cluster, member=3): Received panic from senior Member(Id=1, Timestamp=2010-02-26 15:27:06.931, Address=10.149.155.76:8088, MachineId=1100, Location=site:usdhcp.oraclecorp.com,machine:dhcp-burlington6-4fl-east-10-149,process:511, Role=CoherenceServer) caused by Member(Id=4, Timestamp=2010-02-26 15:27:39.574, Address=10.149.155.75:8088, MachineId=1099, Location=process:800, Role=CoherenceServer) Member 4, the senior member of the younger cluster, receives the kill message from Member 3: 2010-02-26 15:33:44.921/367.468 Oracle Coherence GE 3.5.3/465p2 <Error> (thread=Cluster, member=4): Received a Kill message from a valid Member(Id=3, Timestamp=2010-02-26 15:27:24.892, Address=10.149.155.236:8088, MachineId=1260, Location=site:usdhcp.oraclecorp.com,machine:dhcp-burlington6-4fl-east-10-149,process:32459, Role=CoherenceServer); stopping cluster service. In turn, Member 4 requests the departure of its junior members 5 and 6: 2010-02-26 15:33:44.921/367.468 Oracle Coherence GE 3.5.3/465p2 <Error> (thread=Cluster, member=4): Received a Kill message from a valid Member(Id=3, Timestamp=2010-02-26 15:27:24.892, Address=10.149.155.236:8088, MachineId=1260, Location=site:usdhcp.oraclecorp.com,machine:dhcp-burlington6-4fl-east-10-149,process:32459, Role=CoherenceServer); stopping cluster service. 2010-02-26 15:33:43.343/349.015 Oracle Coherence GE 3.5.3/465p2 <Error> (thread=Cluster, member=6): Received a Kill message from a valid Member(Id=4, Timestamp=2010-02-26 15:27:39.574, Address=10.149.155.75:8088, MachineId=1099, Location=process:800, Role=CoherenceServer); stopping cluster service. Once Members 4, 5, and 6 restart, they rejoin the original cluster with senior member 1. The log below is from Member 4. Note that it receives a different member id when it rejoins the cluster. 2010-02-26 15:33:44.921/367.468 Oracle Coherence GE 3.5.3/465p2 <Error> (thread=Cluster, member=4): Received a Kill message from a valid Member(Id=3, Timestamp=2010-02-26 15:27:24.892, Address=10.149.155.236:8088, MachineId=1260, Location=site:usdhcp.oraclecorp.com,machine:dhcp-burlington6-4fl-east-10-149,process:32459, Role=CoherenceServer); stopping cluster service. 2010-02-26 15:33:46.921/369.468 Oracle Coherence GE 3.5.3/465p2 <D5> (thread=Cluster, member=4): Service Cluster left the cluster 2010-02-26 15:33:47.046/369.593 Oracle Coherence GE 3.5.3/465p2 <D5> (thread=Invocation:InvocationService, member=4): Service InvocationService left the cluster 2010-02-26 15:33:47.046/369.593 Oracle Coherence GE 3.5.3/465p2 <D5> (thread=OptimisticCache, member=4): Service OptimisticCache left the cluster 2010-02-26 15:33:47.046/369.593 Oracle Coherence GE 3.5.3/465p2 <D5> (thread=ReplicatedCache, member=4): Service ReplicatedCache left the cluster 2010-02-26 15:33:47.046/369.593 Oracle Coherence GE 3.5.3/465p2 <D5> (thread=DistributedCache, member=4): Service DistributedCache left the cluster 2010-02-26 15:33:47.046/369.593 Oracle Coherence GE 3.5.3/465p2 <D5> (thread=Invocation:Management, member=4): Service Management left the cluster 2010-02-26 15:33:47.046/369.593 Oracle Coherence GE 3.5.3/465p2 <D5> (thread=Cluster, member=4): Member 6 left service Management with senior member 5 2010-02-26 15:33:47.046/369.593 Oracle Coherence GE 3.5.3/465p2 <D5> (thread=Cluster, member=4): Member 6 left service DistributedCache with senior member 5 2010-02-26 15:33:47.046/369.593 Oracle Coherence GE 3.5.3/465p2 <D5> (thread=Cluster, member=4): Member 6 left service ReplicatedCache with senior member 5 2010-02-26 15:33:47.046/369.593 Oracle Coherence GE 3.5.3/465p2 <D5> (thread=Cluster, member=4): Member 6 left service OptimisticCache with senior member 5 2010-02-26 15:33:47.046/369.593 Oracle Coherence GE 3.5.3/465p2 <D5> (thread=Cluster, member=4): Member 6 left service InvocationService with senior member 5 2010-02-26 15:33:47.046/369.593 Oracle Coherence GE 3.5.3/465p2 <D5> (thread=Cluster, member=4): Member(Id=6, Timestamp=2010-02-26 15:33:47.046, Address=10.149.155.78:8088, MachineId=1102, Location=process:228, Role=CoherenceServer) left Cluster with senior member 4 2010-02-26 15:33:49.218/371.765 Oracle Coherence GE 3.5.3/465p2 <Info> (thread=main, member=n/a): Restarting cluster 2010-02-26 15:33:49.421/371.968 Oracle Coherence GE 3.5.3/465p2 <D5> (thread=Cluster, member=n/a): Service Cluster joined the cluster with senior service member n/a 2010-02-26 15:33:49.625/372.172 Oracle Coherence GE 3.5.3/465p2 <Info> (thread=Cluster, member=n/a): This Member(Id=5, Timestamp=2010-02-26 15:33:50.499, Address=10.149.155.75:8088, MachineId=1099, Location=process:800, Role=CoherenceServer, Edition=Grid Edition, Mode=Development, CpuCount=2, SocketCount=1) joined cluster "cluster:0xDDEB" with senior Member(Id=1, Timestamp=2010-02-26 15:27:06.931, Address=10.149.155.76:8088, MachineId=1100, Location=site:usdhcp.oraclecorp.com,machine:dhcp-burlington6-4fl-east-10-149,process:511, Role=CoherenceServer, Edition=Grid Edition, Mode=Development, CpuCount=2, SocketCount=2) Cool isn't it?

    Read the article

< Previous Page | 33 34 35 36 37 38 39 40 41 42 43 44  | Next Page >