Search Results

Search found 4335 results on 174 pages for 'wait'.

Page 17/174 | < Previous Page | 13 14 15 16 17 18 19 20 21 22 23 24 | Next Page >

Executing sc.exe from .NET Process unable to start and stop service.

- by Jason

I'm trying to restart the service from a remote machine. Here is my code. The problem is that I need to enter startinfo.filename = "sc.exe" since I'm putting "start /wait sc" this is causing an error. Here is my code, any thoughts. Also if anyone has any idea how to keep the cmd window open after this is ran so I could see the code that was ran that would be awesome. string strCommandStop1; string strCommandStop2; string strCommandStart1; string strCommandStart2; string strServer = "\\" + txtServerName.Text; string strDb1 = "SqlAgent$" + txtInsName.Text; string strDb2 = "MSSQL$" + txtInsName.Text; strCommandStop1 = @"start /wait sc " + strServer + " Stop " + strDb1; strCommandStop2 = @"start /wait sc " + strServer + " Stop " + strDb2; strCommandStart1 = @"start /wait sc " + strServer + " Start " + strDb2; strCommandStart2 = @"start /wait sc " + strServer + " Start " + strDb1; try { ProcessStartInfo startInfo = new ProcessStartInfo(); startInfo.CreateNoWindow = true; startInfo.Arguments = strCommandStop1; startInfo.Arguments = strCommandStop2; startInfo.Arguments = strCommandStart1; startInfo.Arguments = strCommandStart2; startInfo .FileName = "sc.exe"; Process.Start(startInfo); } catch (Exception e) { MessageBox.Show(e.Message); }

Read the article
Know More About Oracle Row Lock

- by Liu Maclean(???)

??????Oracle??????????row lock,??ORACLE????????????????????,row lock???????????????????????????????,??Server Process?pin????block buffer????????? ????????,?process A ??update???????? Z?????????, ???????rollback???commit;??Process B??????DML??, ???????rowid???? Z???, ???????????process A????????ITL???,????????cleanout??,????????row???????????commit, ???????Process B????”enq: TX – row lock contention”??????? ????Process B????????????? ?????????Process A???????row,??Process B???????”enq: TX – row lock contention”???? ???????? ????????: SESSION A: SQL> select * from v$version; BANNER ---------------------------------------------------------------- Oracle Database 10g Enterprise Edition Release 10.2.0.5.0 - 64bi PL/SQL Release 10.2.0.5.0 - Production CORE 10.2.0.5.0 Production TNS for Linux: Version 10.2.0.5.0 - Production NLSRTL Version 10.2.0.5.0 - Production SQL> select * from global_name; GLOBAL_NAME -------------------------------------------------------------------------------- www.oracledatabase12g.com SQL> create table maclean_lock(t1 int); Table created. SQL> insert into maclean_lock values (1); 1 row created. SQL> commit; Commit complete. SQL> select dbms_rowid.rowid_block_number(rowid),dbms_rowid.rowid_relative_fno(rowid) from maclean_lock; DBMS_ROWID.ROWID_BLOCK_NUMBER(ROWID) DBMS_ROWID.ROWID_RELATIVE_FNO(ROWID) ------------------------------------ ------------------------------------ 67642 1 SQL> select distinct sid from v$mystat; SID ---------- 142 SQL> select pid,spid from v$process where addr = ( select paddr from v$session where sid=(select distinct sid from v$mystat)); PID SPID ---------- ------------ 17 15636 ??SESSION A ????savepoint ,?update ????????? SQL> savepoint NONLOCK; Savepoint created. SQL> select * From v$Lock where sid=142; no rows selected SQL> set linesize 140 pagesize 1400 SQL> update maclean_lock set t1=t1+2; 1 row updated. SQL> select * From v$Lock where sid=142; ADDR KADDR SID TY ID1 ID2 LMODE REQUEST CTIME BLOCK ---------------- ---------------- ---------- -- ---------- ---------- ---------- ---------- ---------- ---------- 0000000091FC69F0 0000000091FC6A18 142 TM 55829 0 3 0 6 0 00000000914B4008 00000000914B4040 142 TX 393232 609 6 0 6 0 SQL> select dump(3,16) from dual; DUMP(3,16) -------------------------------------------------------------------------------- Typ=2 Len=2: c1,4 ALTER SYSTEM DUMP DATAFILE 1 BLOCK 67642; Object id on Block? Y seg/obj: 0xda16 csc: 0x00.234718 itc: 2 flg: O typ: 1 - DATA fsl: 0 fnx: 0x0 ver: 0x01 Itl Xid Uba Flag Lck Scn/Fsc 0x01 0x000a.00f.000001e0 0x00800075.02a6.29 C--- 0 scn 0x0000.00234711 0x02 0x0007.018.000001fe 0x0080065c.017a.02 ---- 1 fsc 0x0000.00000000 data_block_dump,data header at 0x81d185c =============== tsiz: 0x1fa0 hsiz: 0x14 pbl: 0x081d185c bdba: 0x0041083a 76543210 flag=-------- ntab=1 nrow=1 frre=-1 fsbo=0x14 fseo=0x1f9a avsp=0x1f83 tosp=0x1f83 0xe:pti[0] nrow=1 offs=0 0x12:pri[0] offs=0x1f9a block_row_dump: tab 0, row 0, @0x1f9a tl: 6 fb: --H-FL-- lb: 0x2 cc: 1 col 0: [ 2] c1 04 end_of_block_dump ?? BLOCK DUMP ???? ??????XID=0x0007.018.000001fe ?transaction?? lb:0x1 ??SESSION B ,?????UPDATE?? ???enq: TX - row lock contention ?? SQL> select distinct sid from v$mystat; SID ---------- 140 SQL> select pid,spid from v$process where addr = ( select paddr from v$session where sid=(select distinct sid from v$mystat)); PID SPID ---------- ------------ 24 15652 SQL> alter system set "_trace_events"='10000-10999:255:24'; System altered. SQL> update maclean_lock set t1=t1+2; select * From v$Lock where sid=142 or sid=140 order by sid; SESSION C: SQL> select * From v$Lock where sid=142 or sid=140 order by sid; ADDR KADDR SID TY ID1 ID2 LMODE REQUEST CTIME BLOCK ---------------- ---------------- ---------- -- ---------- ---------- ---------- ---------- ---------- ---------- 0000000091FC6B10 0000000091FC6B38 140 TM 55829 0 3 0 84 0 00000000924F4A58 00000000924F4A78 140 TX 458776 510 0 6 84 0 00000000914B51E8 00000000914B5220 142 TX 458776 510 6 0 312 1 0000000091FC69F0 0000000091FC6A18 142 TM 55829 0 3 0 312 0 ???? SESSION B SID=140 ?SESSION A ?TX ENQUEUE ?X mode?REQUEST SQL> oradebug dump systemstate 266; Statement processed. SESSION B waiter's enqueue lock SO: 0x924f4a58, type: 5, owner: 0x92bb8dc8, flag: INIT/-/-/0x00 (enqueue) TX-00070018-000001FE DID: 0001-0018-00000022 lv: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 res_flag: 0x6 req: X, lock_flag: 0x0, lock: 0x924f4a78, res: 0x925617c0 own: 0x92b76be0, sess: 0x92b76be0, proc: 0x92a737a0, prv: 0x925617e0 TX-00070018-000001FE=> TX 458776 510 SESSION A owner's enqueue lock SO: 0x914b51e8, type: 40, owner: 0x92b796d0, flag: INIT/-/-/0x00 (trans) flg = 0x1e03, flg2 = 0xc0000, prx = 0x0, ros = 2147483647 bsn = 0xed5 bndsn = 0xee7 spn = 0xef7 efd = 3 file:xct.c lineno:1179 DID: 0001-0011-000000C2 parent xid: 0x0000.000.00000000 env: (scn: 0x0000.00234718 xid: 0x0007.018.000001fe uba: 0x0080065c.017a.02 statement num=0 parent xid: xid: 0x0000.000.00000000 scn: 0x00 00.00234718 0sch: scn: 0x0000.00000000) cev: (spc = 7818 arsp = 914e8310 ubk tsn: 1 rdba: 0x0080065c useg tsn: 1 rdba: 0x00800069 hwm uba: 0x0080065c.017a.02 col uba: 0x00000000.0000.00 num bl: 1 bk list: 0x91435070) cr opc: 0x0 spc: 7818 uba: 0x0080065c.017a.02 (enqueue) TX-00070018-000001FE DID: 0001-0011-000000C2 lv: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 res_flag: 0x6 mode: X, lock_flag: 0x0, lock: 0x914b5220, res: 0x925617c0 own: 0x92b796d0, sess: 0x92b796d0, proc: 0x92a6ffd8, prv: 0x925617d0 xga: 0x8b7c6d40, heap: UGA Trans IMU st: 2 Pool index 65535, Redo pool 0x914b58d0, Undo pool 0x914b59b8 Redo pool range [0x86de640 0x86de640 0x86e0e40] Undo pool range [0x86dbe40 0x86dbe40 0x86de640] ---------------------------------------- SO: 0x91435070, type: 39, owner: 0x914b51e8, flag: -/-/-/0x00 (List of Blocks) next index = 1 index itli buffer hint rdba savepoint ----------------------------------------------------------- 0 2 0x647f1fc8 0x41083a 0xee7 ?SESSION A? ROLLBACK ?savepoint: SQL> rollback to NONLOCK; Rollback complete. ????savepoint ??update??????? ??UPDATE???????? ROLLBACK: SQL> select * From v$Lock where sid=142 or sid=140; ADDR KADDR SID TY ID1 ID2 LMODE REQUEST CTIME BLOCK ---------------- ---------------- ---------- -- ---------- ---------- ---------- ---------- ---------- ---------- 00000000924F4A58 00000000924F4A78 140 TX 458776 510 0 6 822 0 0000000091FC6B10 0000000091FC6B38 140 TM 55829 0 3 0 822 0 00000000914B51E8 00000000914B5220 142 TX 458776 510 6 0 1050 1 ???? SESSION A 142 ???SAVEPOINT ???????TM LOCK ????? ROLLBACK TO SAVEPOINT?????SESSION???TX LOCK!!!! ??????SESSION 142???TX ID1=458776 ID2=510, ????ROLLBACK TO SAVEPOINT?????????ABORT TRANSACTION ?? SESSION B SID=140?? SESSION A ?? , ?????????SESSION B? update???HANG?? ?????????CACHE?????: Object id on Block? Y seg/obj: 0xda16 csc: 0x00.2347b7 itc: 2 flg: O typ: 1 - DATA fsl: 0 fnx: 0x0 ver: 0x01 Itl Xid Uba Flag Lck Scn/Fsc 0x01 0x000a.00f.000001e0 0x00800075.02a6.29 C--- 0 scn 0x0000.00234711 0x02 0x0000.000.00000000 0x00000000.0000.00 ---- 0 fsc 0x0000.00000000 data_block_dump,data header at 0x745d85c =============== tsiz: 0x1fa0 hsiz: 0x14 pbl: 0x0745d85c bdba: 0x0041083a 76543210 flag=-------- ntab=1 nrow=1 frre=-1 fsbo=0x14 fseo=0x1f9a avsp=0x1f83 tosp=0x1f83 0xe:pti[0] nrow=1 offs=0 0x12:pri[0] offs=0x1f9a block_row_dump: tab 0, row 0, @0x1f9a tl: 6 fb: --H-FL-- lb: 0x0 cc: 1 col 0: [ 2] c1 02 end_of_block_dump ???? ITL=0x02? ?????????,col 0: [ 2] c1 02 ????????? ?????????SESSION D ,??????row lock?? ?UPDATE???????? SESSION D: SQL> update maclean_lock set t1=t1+2; 1 row updated. SQL> rollback; Rollback complete. ??SESSION B ??????????? ?????ORACLE????????, ??????????? TX lock?? row lock , ????????2??? row lock?????????, ?TX lock????????ENQUEUE LOCK???? ?????????PROCESS K?DML???????????????????????,??????????TX LOCK, ????PROCESS Z?????????????????????????ROW LOCK????????, ???????OLTP?????????????????????? ??ROW LOCK?Release ??????TX?ENQUEUE LOCK,?????????Process J ????????????, Process K??????????? ,Process K?????????,???row piece?lb??0x0 ,?????ITL, Process Z???ITL???????Process J????XID,?????Process J?????TX lock, PROCESS K ???TX resource?Enqueue Waiter Linked List?????X mode(exclusive)?enqueue lock? ???Process J??TX lock?,Process J?????TX resource?Enqueue Waiter Linked List ???Process K??????,??POST?????Process K? TX lock??????, ???????row lock???????,????????? ?????????? ?????: SESSION A ???PID =17 ?????????????????? SESSION B ???PID =24 ??????? "_trace_events"='10000-10999:255:24'; KST trace ??????? Server Process??? SESSION A PID=17 ?? acqure?SX mode???TM Lock ,?? ????Transaction?????UNDO SEGMENT 7,???XID 7.24.510, ?acquire ?X mode? TX-00070018-000001fe ? ?????? 00070018-000001fe ???? 7- 24 - 510? XID ? 781F4B8A:007A569C 17 142 10704 83 ksqgtl: acquire TM-0000da15-00000000 mode=SX flags=GLOBAL|XACT why="contention" 781F4B92:007A569D 17 142 10704 19 ksqgtl: SUCCESS 781F4BB3:007A569E 17 142 10812 2 0x000000000041083A 0x0000000000000000 0x0000000000234717 781F4BBA:007A569F 17 142 10812 3 0x0000000000000000 0x0000000000000000 0x0000000000000000 781F4BC0:007A56A0 17 142 10812 4 0x0000000000000000 0x0000000000000000 0x0000000000000000 781F4BD3:007A56A1 17 142 10812 5 0x000000000041083A 0x0000000000000000 0x0000000000000000 781F4BFE:007A56A2 17 142 10811 1 0x000000000041083A 0x0000000000000000 0x0000000000234711 0x0000000000000002 781F4C06:007A56A3 17 142 10811 2 0x000000000041083A 0x0000000000000000 0x0000000000234718 0x00007FA074EDA560 781F4C26:007A56A4 17 142 10813 1 ktubnd: Bind usn 7 nax 1 nbx 0 lng 0 par 0 781F4C43:007A56A5 17 142 10813 2 ktubnd: Txn Bound xid: 7.24.510 781F4C4A:007A56A6 17 142 10704 83 ksqgtl: acquire TX-00070018-000001fe mode=X flags=GLOBAL|XACT why="contention" 781F4C51:007A56A7 17 142 10704 19 ksqgtl: SUCCESS ?????????? ???????? 781F4CBF:007A56A8 17 142 10005 1 KSL WAIT BEG [SQL*Net message to client] 1650815232/0x62657100 1/0x1 0/0x0 781F4CCC:007A56A9 17 142 10005 2 KSL WAIT END [SQL*Net message to client] 1650815232/0x62657100 1/0x1 0/0x0 time=13 781F4CDE:007A56AA 17 142 10005 1 KSL WAIT BEG [SQL*Net message from client] 1650815232/0x62657100 1/0x1 0/0x0 786BD85D:007A57E0 17 142 10005 2 KSL WAIT END [SQL*Net message from client] 1650815232/0x62657100 1/0x1 0/0x0 time=5016447 786BD966:007A57E1 17 142 10005 1 KSL WAIT BEG [SQL*Net message to client] 1650815232/0x62657100 1/0x1 0/0x0 786BD96E:007A57E2 17 142 10005 2 KSL WAIT END [SQL*Net message to client] 1650815232/0x62657100 1/0x1 0/0x0 time=8 SESSION B ???PID =24 ,??????? SX mode? TM lock,??row lock? acquire X mode?TX-00070018-000001fe ksqgtl: acquire TM-0000da15-00000000 mode=SX flags=GLOBAL|XACT why="contention" ksqgtl: SUCCESS 0x000000000041083A 0x0000000000000000 0x00000000002354F8 0x0000000000000000 0x0000000000000000 0x0000000000000000 0x0000000000000000 0x0000000000000000 0x0000000000000000 0x000000000041083A 0x0000000000000000 0x00000000002354F8 0x0000000000000001 0x000000000041083A 0x0000000000000000 0x00000000002354F8 0x0000000008A63780 0x0000000000000001 0x0000000000800861 0x0000000000000241 0x0000000000000001 0x000000000041083A 0x0000000000000001 0x0000000000000001 0x000000000041083A 0x0000000000000000 0x00000000002354F9 0x0000000000000002 ksqgtl: acquire TX-00070018-000001fe mode=X flags=GLOBAL|LONG why="row lock contention" C4048EBD:007F52B6 24 140 10005 2 KSL WAIT END [enq: TX - row lock contention] 1415053318/0x54580006 458776/0x70018 510/0x1fe time=2929879 C4048ED4:007F52B7 24 140 10005 1 KSL WAIT BEG [enq: TX - row lock contention] 1415053318/0x54580006 458776/0x70018 510/0x1fe C43146CA:007F535E 24 140 10005 2 KSL WAIT END [enq: TX - row lock contention] 1415053318/0x54580006 458776/0x70018 510/0x1fe time=2930676 ????????? ,PID=24 ??????ksqcmi???????? deadlock C43146D9:007F535F 24 140 10704 134 ksqcmi: performing local deadlock detection on TX-00070018-000001fe C43146F8:007F5360 24 140 10704 150 ksqcmi: deadlock not detected on TX-00070018-000001fe ?? ??? PID 17 ??ROLLBACK ???? ,????????: PID 17 ROLLBACK; D7A495BB:007F9D3E 17 142 10005 4 KSL POST SENT postee=24 loc='ksqrcl' id1=0 id2=0 name= type=0 D7A495D8:007F9D3F 17 142 10444 12 ABORT TRANSACTION - xid: 0x0007.018.000001fe ?? PID 17 ??? TX resource?Enqueue Waiter linked List ???PID 24???,????KSL POST SENT ?? PID 24, ???ksqrcl???ENQUEUE LOCK ?PID 24??????KSL POST (KSL POST RCVD poster=17), ?ksqgtl???? TX-00070018-000001fe ?? ksqrcl??, ??PID 24???????? TX lock?USN ,??????? USN 3 XID 3.11.582 ,???acquire TX-0003000b-00000246 D7A49616:007F9D41 24 140 10005 3 KSL POST RCVD poster=17 loc='ksqrcl' id1=0 id2=0 name= type=0 fac#=0 facpost=1 D7A4961C:007F9D42 24 140 10704 19 ksqgtl: SUCCESS D7A4967D:007F9D43 24 140 10704 117 ksqrcl: release TX-00070018-000001fe mode=X D7A496A5:007F9D44 24 140 10813 1 ktubnd: Bind usn 3 nax 1 nbx 0 lng 0 par 0 D7A496C2:007F9D45 24 140 10813 2 ktubnd: Txn Bound xid: 3.11.582 D7A496C7:007F9D46 24 140 10704 83 ksqgtl: acquire TX-0003000b-00000246 mode=X flags=GLOBAL|XACT why="contention" D7A496E4:007F9D47 24 140 10704 19 ksqgtl: SUCCESS ROW LOCK?Release ??????TX?ENQUEUE LOCK,?????????Process J ????????????, Process K??????????? ,Process K?????????,???row piece?lb??0×0 ,?????ITL,Process Z???ITL???????Process J????XID,?????Process J?????TX lock,PROCESS K ???TX resource?Enqueue Waiter Linked List?????X mode(exclusive)?enqueue lock? ???Process J??TX lock?,Process J?????TX resource?Enqueue Waiter Linked List ???Process K??????,??POST?????Process K? TX lock??????,???????row lock???????,?????????

Read the article
Could I centralize batch files more efficiently?

- by PeanutsMonkey

I am new to the world of batch scripting so please forgive what may appear as basic questions. I am learning as I get assigned different jobs and I am a huge proponent of automation where possible. I have several batch files that perform several tasks. Each of these files had their paths hard-coded e.g. c:\temp. d:\data, etc in the batch file. Initially I moved these to a text file I could call from a batch file e.g. for /f "tokens=1,2 delims==" %%R in (config.txt) do ( if %%R==bdata set bdata=%%S if %%R==cdata set cdata=%%S ) The config.txt file contains these values bdata=c:\temp cdata=d:\data I realized that each time I would need to create a new variable, I would need to update the config.txt file as well the config.bat files. I decided I would move all the values to just the config.bat file as follows set bdata=c:\temp set cdata=d:\data I then updated each of the existing batch files to call the variables rather than the hard-coded paths. I also added the following lines of code to each batch file except config.bat. The only additional line added to the config.bat file is @echo off. @echo off setlocal enableextensions enabledelayedexpansion call config.bat I then have another batch file that centralizes calling all the batch files in sequence. The name of this batch file is start.bat. The reason I am using start /wait is because there have been instances of where the delete.bat runs before compress.bat has had an opportunity to finish. start /wait compress.bat start /wait validate.bat start /wait delete.bat Questions Is this the best way to centralize values and if not, what is a better way? Do I need to specify setlocal enableextensions enabledelayedexpansion in all the existing batch files? Do all the batch files have to have @echo off or is it sufficient for just the config.bat file? Is start /wait the best way to call multiple files? Can I pass values from one batch file to another using the said command? All the batch files have different functions e.g. move, delete, etc however use %%a or %%b. Is this okay? For example The validate.bat file has the code for %%a in (%bdata%\*.*) do if "%%~xa" == "" move /Y "%bdata%\%%~xa" "%bdata%\%done%" and the delete.bat file has the code for %%a in (%bdata%\*.*) do if "%%~xa" == ".txt" del "%%a"

Read the article
An XEvent a Day (30 of 31) – Tracking Session and Statement Level Waits

- by Jonathan Kehayias

While attending PASS Summit this year, I got the opportunity to hang out with Brent Ozar ( Blog | Twitter ) one afternoon while he did some work for Yanni Robel ( Blog | Twitter ). After looking at the wait stats information, Brent pointed out some potential problem points, and based on that information I pulled up my code for my PASS session the next day on Wait Statistics and Extended Events and made some changes to one of the demo’s so that the Event Session only focused on those potentially...(read more)

Read the article
How can I change the "timeout" duration for Nautilus "find the filename as you type" feature?

- by fred.bear

I often get stalled by the long timeout while typeing the first few letters of a file name in Nautilus... The current timeout seems to be 5 seconds. I'd prefer 1 second ...(as per item 2 on this page about Response Times) I don't use the mouse much, which means I either wait, or press Escape, when I don't find the file... I realize that this is a feature to some, but I'd rather not wait. Is there any way to change this timeout behaviour?

Read the article
Open Source HTML/JS game(s) with license that would allow embedding in my app?

- by DustMason

I'm working on an educational app for kids. At the end of the sign-up process, the kids must wait for a confirmation from their parents in order to gain access to the app. While they wait for this to happen, we want to let the kid play a simple game as a way to keep their interest up. Is there a marketplace or repository for games with such a license that we could either purchase (affordably) or use for free in our own app?

Read the article
Rebooting access point via SSH with pexpect... hangs. Any ideas?

- by MiniQuark

When I want to reboot my D-Link DWL-3200-AP access point from my bash shell, I connect to the AP using ssh and I just type reboot in the CLI interface. After about 30 seconds, the AP is rebooted: # ssh [email protected] [email protected]'s password: ******** Welcome to Wireless SSH Console!! ['help' or '?' to see commands] Wireless Driver Rev 4.0.0.167 D-Link Access Point wlan1 -> reboot Sound's great? Well unfortunately the ssh client process never exits, for some reason (maybe the AP kills the ssh server a bit too fast, I don't know). My ssh client process is completely blocked (even if I wait for several minutes, nothing happens). I always have to wait for the AP to reboot, then open another shell, find the ssh client process ID (using ps aux | grep ssh) then kill the ssh process using kill <pid>. That's quite annoying. So I decided to write a python script to reboot the AP. The script connects to the AP's CLI interface via ssh, using python-pexpect, and it tries to launch the "reboot" command. Here's what the script looks like: #!/usr/bin/python # usage: python reboot_ap.py {host} {user} {password} import pexpect import sys import time command = "ssh %(user)s@%(host)s"%{"user":sys.argv[2], "host":sys.argv[1]} session = pexpect.spawn(command, timeout=30) # start ssh process response = session.expect(r"password:") # wait for password prompt session.sendline(sys.argv[3]) # send password session.expect(" -> ") # wait for D-Link CLI prompt session.sendline("reboot") # send the reboot command time.sleep(60) # make sure the reboot has time to actually take place session.close(force=True) # kill the ssh process The script connects properly to the AP (I tried running some other commands than reboot, they work fine), it sends the reboot command, waits for one minute, then kills the ssh process. The problem is: this time, the AP never reboots! I have no idea why. Any solution, anyone?

Read the article
C#/.NET Little Wonders: ConcurrentBag and BlockingCollection

- by James Michael Hare

In the first week of concurrent collections, began with a general introduction and discussed the ConcurrentStack<T> and ConcurrentQueue<T>. The last post discussed the ConcurrentDictionary<T> . Finally this week, we shall close with a discussion of the ConcurrentBag<T> and BlockingCollection<T>. For more of the "Little Wonders" posts, see C#/.NET Little Wonders: A Redux. Recap As you'll recall from the previous posts, the original collections were object-based containers that accomplished synchronization through a Synchronized member. With the advent of .NET 2.0, the original collections were succeeded by the generic collections which are fully type-safe, but eschew automatic synchronization. With .NET 4.0, a new breed of collections was born in the System.Collections.Concurrent namespace. Of these, the final concurrent collection we will examine is the ConcurrentBag and a very useful wrapper class called the BlockingCollection. For some excellent information on the performance of the concurrent collections and how they perform compared to a traditional brute-force locking strategy, see this informative whitepaper by the Microsoft Parallel Computing Platform team here. ConcurrentBag<T> – Thread-safe unordered collection. Unlike the other concurrent collections, the ConcurrentBag<T> has no non-concurrent counterpart in the .NET collections libraries. Items can be added and removed from a bag just like any other collection, but unlike the other collections, the items are not maintained in any order. This makes the bag handy for those cases when all you care about is that the data be consumed eventually, without regard for order of consumption or even fairness – that is, it’s possible new items could be consumed before older items given the right circumstances for a period of time. So why would you ever want a container that can be unfair? Well, to look at it another way, you can use a ConcurrentQueue and get the fairness, but it comes at a cost in that the ordering rules and synchronization required to maintain that ordering can affect scalability a bit. Thus sometimes the bag is great when you want the fastest way to get the next item to process, and don’t care what item it is or how long its been waiting. The way that the ConcurrentBag works is to take advantage of the new ThreadLocal<T> type (new in System.Threading for .NET 4.0) so that each thread using the bag has a list local to just that thread. This means that adding or removing to a thread-local list requires very low synchronization. The problem comes in where a thread goes to consume an item but it’s local list is empty. In this case the bag performs “work-stealing” where it will rob an item from another thread that has items in its list. This requires a higher level of synchronization which adds a bit of overhead to the take operation. So, as you can imagine, this makes the ConcurrentBag good for situations where each thread both produces and consumes items from the bag, but it would be less-than-idea in situations where some threads are dedicated producers and the other threads are dedicated consumers because the work-stealing synchronization would outweigh the thread-local optimization for a thread taking its own items. Like the other concurrent collections, there are some curiosities to keep in mind: IsEmpty(), Count, ToArray(), and GetEnumerator() lock collection Each of these needs to take a snapshot of whole bag to determine if empty, thus they tend to be more expensive and cause Add() and Take() operations to block. ToArray() and GetEnumerator() are static snapshots Because it is based on a snapshot, will not show subsequent updates after snapshot. Add() is lightweight Since adding to the thread-local list, there is very little overhead on Add. TryTake() is lightweight if items in thread-local list As long as items are in the thread-local list, TryTake() is very lightweight, much more so than ConcurrentStack() and ConcurrentQueue(), however if the local thread list is empty, it must steal work from another thread, which is more expensive. Remember, a bag is not ideal for all situations, it is mainly ideal for situations where a process consumes an item and either decomposes it into more items to be processed, or handles the item partially and places it back to be processed again until some point when it will complete. The main point is that the bag works best when each thread both takes and adds items. For example, we could create a totally contrived example where perhaps we want to see the largest power of a number before it crosses a certain threshold. Yes, obviously we could easily do this with a log function, but bare with me while I use this contrived example for simplicity. So let’s say we have a work function that will take a Tuple out of a bag, this Tuple will contain two ints. The first int is the original number, and the second int is the last multiple of that number. So we could load our bag with the initial values (let’s say we want to know the last multiple of each of 2, 3, 5, and 7 under 100. 1: var bag = new ConcurrentBag<Tuple<int, int>> 2: { 3: Tuple.Create(2, 1), 4: Tuple.Create(3, 1), 5: Tuple.Create(5, 1), 6: Tuple.Create(7, 1) 7: }; Then we can create a method that given the bag, will take out an item, apply the multiplier again, 1: public static void FindHighestPowerUnder(ConcurrentBag<Tuple<int,int>> bag, int threshold) 2: { 3: Tuple<int,int> pair; 4: 5: // while there are items to take, this will prefer local first, then steal if no local 6: while (bag.TryTake(out pair)) 7: { 8: // look at next power 9: var result = Math.Pow(pair.Item1, pair.Item2 + 1); 10: 11: if (result < threshold) 12: { 13: // if smaller than threshold bump power by 1 14: bag.Add(Tuple.Create(pair.Item1, pair.Item2 + 1)); 15: } 16: else 17: { 18: // otherwise, we're done 19: Console.WriteLine("Highest power of {0} under {3} is {0}^{1} = {2}.", 20: pair.Item1, pair.Item2, Math.Pow(pair.Item1, pair.Item2), threshold); 21: } 22: } 23: } Now that we have this, we can load up this method as an Action into our Tasks and run it: 1: // create array of tasks, start all, wait for all 2: var tasks = new[] 3: { 4: new Task(() => FindHighestPowerUnder(bag, 100)), 5: new Task(() => FindHighestPowerUnder(bag, 100)), 6: }; 7: 8: Array.ForEach(tasks, t => t.Start()); 9: 10: Task.WaitAll(tasks); Totally contrived, I know, but keep in mind the main point! When you have a thread or task that operates on an item, and then puts it back for further consumption – or decomposes an item into further sub-items to be processed – you should consider a ConcurrentBag as the thread-local lists will allow for quick processing. However, if you need ordering or if your processes are dedicated producers or consumers, this collection is not ideal. As with anything, you should performance test as your mileage will vary depending on your situation! BlockingCollection<T> – A producers & consumers pattern collection The BlockingCollection<T> can be treated like a collection in its own right, but in reality it adds a producers and consumers paradigm to any collection that implements the interface IProducerConsumerCollection<T>. If you don’t specify one at the time of construction, it will use a ConcurrentQueue<T> as its underlying store. If you don’t want to use the ConcurrentQueue, the ConcurrentStack and ConcurrentBag also implement the interface (though ConcurrentDictionary does not). In addition, you are of course free to create your own implementation of the interface. So, for those who don’t remember the producers and consumers classical computer-science problem, the gist of it is that you have one (or more) processes that are creating items (producers) and one (or more) processes that are consuming these items (consumers). Now, the crux of the problem is that there is a bin (queue) where the produced items are placed, and typically that bin has a limited size. Thus if a producer creates an item, but there is no space to store it, it must wait until an item is consumed. Also if a consumer goes to consume an item and none exists, it must wait until an item is produced. The BlockingCollection makes it trivial to implement any standard producers/consumers process set by providing that “bin” where the items can be produced into and consumed from with the appropriate blocking operations. In addition, you can specify whether the bin should have a limited size or can be (theoretically) unbounded, and you can specify timeouts on the blocking operations. As far as your choice of “bin”, for the most part the ConcurrentQueue is the right choice because it is fairly light and maximizes fairness by ordering items so that they are consumed in the same order they are produced. You can use the concurrent bag or stack, of course, but your ordering would be random-ish in the case of the former and LIFO in the case of the latter. So let’s look at some of the methods of note in BlockingCollection: BoundedCapacity returns capacity of the “bin” If the bin is unbounded, the capacity is int.MaxValue. Count returns an internally-kept count of items This makes it O(1), but if you modify underlying collection directly (not recommended) it is unreliable. CompleteAdding() is used to cut off further adds. This sets IsAddingCompleted and begins to wind down consumers once empty. IsAddingCompleted is true when producers are “done”. Once you are done producing, should complete the add process to alert consumers. IsCompleted is true when producers are “done” and “bin” is empty. Once you mark the producers done, and all items removed, this will be true. Add() is a blocking add to collection. If bin is full, will wait till space frees up Take() is a blocking remove from collection. If bin is empty, will wait until item is produced or adding is completed. GetConsumingEnumerable() is used to iterate and consume items. Unlike the standard enumerator, this one consumes the items instead of iteration. TryAdd() attempts add but does not block completely If adding would block, returns false instead, can specify TimeSpan to wait before stopping. TryTake() attempts to take but does not block completely Like TryAdd(), if taking would block, returns false instead, can specify TimeSpan to wait. Note the use of CompleteAdding() to signal the BlockingCollection that nothing else should be added. This means that any attempts to TryAdd() or Add() after marked completed will throw an InvalidOperationException. In addition, once adding is complete you can still continue to TryTake() and Take() until the bin is empty, and then Take() will throw the InvalidOperationException and TryTake() will return false. So let’s create a simple program to try this out. Let’s say that you have one process that will be producing items, but a slower consumer process that handles them. This gives us a chance to peek inside what happens when the bin is bounded (by default, the bin is NOT bounded). 1: var bin = new BlockingCollection<int>(5); Now, we create a method to produce items: 1: public static void ProduceItems(BlockingCollection<int> bin, int numToProduce) 2: { 3: for (int i = 0; i < numToProduce; i++) 4: { 5: // try for 10 ms to add an item 6: while (!bin.TryAdd(i, TimeSpan.FromMilliseconds(10))) 7: { 8: Console.WriteLine("Bin is full, retrying..."); 9: } 10: } 11: 12: // once done producing, call CompleteAdding() 13: Console.WriteLine("Adding is completed."); 14: bin.CompleteAdding(); 15: } And one to consume them: 1: public static void ConsumeItems(BlockingCollection<int> bin) 2: { 3: // This will only be true if CompleteAdding() was called AND the bin is empty. 4: while (!bin.IsCompleted) 5: { 6: int item; 7: 8: if (!bin.TryTake(out item, TimeSpan.FromMilliseconds(10))) 9: { 10: Console.WriteLine("Bin is empty, retrying..."); 11: } 12: else 13: { 14: Console.WriteLine("Consuming item {0}.", item); 15: Thread.Sleep(TimeSpan.FromMilliseconds(20)); 16: } 17: } 18: } Then we can fire them off: 1: // create one producer and two consumers 2: var tasks = new[] 3: { 4: new Task(() => ProduceItems(bin, 20)), 5: new Task(() => ConsumeItems(bin)), 6: new Task(() => ConsumeItems(bin)), 7: }; 8: 9: Array.ForEach(tasks, t => t.Start()); 10: 11: Task.WaitAll(tasks); Notice that the producer is faster than the consumer, thus it should be hitting a full bin often and displaying the message after it times out on TryAdd(). 1: Consuming item 0. 2: Consuming item 1. 3: Bin is full, retrying... 4: Bin is full, retrying... 5: Consuming item 3. 6: Consuming item 2. 7: Bin is full, retrying... 8: Consuming item 4. 9: Consuming item 5. 10: Bin is full, retrying... 11: Consuming item 6. 12: Consuming item 7. 13: Bin is full, retrying... 14: Consuming item 8. 15: Consuming item 9. 16: Bin is full, retrying... 17: Consuming item 10. 18: Consuming item 11. 19: Bin is full, retrying... 20: Consuming item 12. 21: Consuming item 13. 22: Bin is full, retrying... 23: Bin is full, retrying... 24: Consuming item 14. 25: Adding is completed. 26: Consuming item 15. 27: Consuming item 16. 28: Consuming item 17. 29: Consuming item 19. 30: Consuming item 18. Also notice that once CompleteAdding() is called and the bin is empty, the IsCompleted property returns true, and the consumers will exit. Summary The ConcurrentBag is an interesting collection that can be used to optimize concurrency scenarios where tasks or threads both produce and consume items. In this way, it will choose to consume its own work if available, and then steal if not. However, in situations where you want fair consumption or ordering, or in situations where the producers and consumers are distinct processes, the bag is not optimal. The BlockingCollection is a great wrapper around all of the concurrent queue, stack, and bag that allows you to add producer and consumer semantics easily including waiting when the bin is full or empty. That’s the end of my dive into the concurrent collections. I’d also strongly recommend, once again, you read this excellent Microsoft white paper that goes into much greater detail on the efficiencies you can gain using these collections judiciously (here). Tweet Technorati Tags: C#,.NET,Concurrent Collections,Little Wonders

Read the article
Query on simple C++ threadpool implementation

- by ticketman

Stackoverflow has been a tremendous help to me and I'd to give something back to the community. I have been implementing a simple threadpool using the tinythread C++ portable thread library, using what I have learnt from Stackoverflow. I am new to thread programming, so not that comfortable with mutexes, etc. I have a question best asked after presenting the code (which runs quite well under Linux): // ThreadPool.h class ThreadPool { public: ThreadPool(); ~ThreadPool(); // Creates a pool of threads and gets them ready to be used void CreateThreads(int numOfThreads); // Assigns a job to a thread in the pool, but doesn't start the job // Each SubmitJob call will use up one thread of the pool. // This operation can only be undone by calling StartJobs and // then waiting for the jobs to complete. On completion, // new jobs may be submitted. void SubmitJob( void (*workFunc)(void *), void *workData ); // Begins execution of all the jobs in the pool. void StartJobs(); // Waits until all jobs have completed. // The wait will block the caller. // On completion, new jobs may be submitted. void WaitForJobsToComplete(); private: enum typeOfWorkEnum { e_work, e_quit }; class ThreadData { public: bool ready; // thread has been created and is ready for work bool haveWorkToDo; typeOfWorkEnum typeOfWork; // Pointer to the work function each thread has to call. void (*workFunc)(void *); // Pointer to work data void *workData; ThreadData() : ready(false), haveWorkToDo(false) { }; }; struct ThreadArgStruct { ThreadPool *threadPoolInstance; int threadId; }; // Data for each thread ThreadData *m_ThreadData; ThreadPool(ThreadPool const&); // copy ctor hidden ThreadPool& operator=(ThreadPool const&); // assign op. hidden // Static function that provides the function pointer that a thread can call // By including the ThreadPool instance in the void * parameter, // we can use it to access other data and methods in the ThreadPool instance. static void ThreadFuncWrapper(void *arg) { ThreadArgStruct *threadArg = static_cast<ThreadArgStruct *>(arg); threadArg->threadPoolInstance->ThreadFunc(threadArg->threadId); } // The function each thread calls void ThreadFunc( int threadId ); // Called by the thread pool destructor void DestroyThreadPool(); // Total number of threads available // (fixed on creation of thread pool) int m_numOfThreads; int m_NumOfThreadsDoingWork; int m_NumOfThreadsGivenJobs; // List of threads std::vector<tthread::thread *> m_ThreadList; // Condition variable to signal each thread has been created and executing tthread::mutex m_ThreadReady_mutex; tthread::condition_variable m_ThreadReady_condvar; // Condition variable to signal each thread to start work tthread::mutex m_WorkToDo_mutex; tthread::condition_variable m_WorkToDo_condvar; // Condition variable to signal the main thread that // all threads in the pool have completed their work tthread::mutex m_WorkCompleted_mutex; tthread::condition_variable m_WorkCompleted_condvar; }; cpp file: // // ThreadPool.cpp // #include "ThreadPool.h" // This is the thread function for each thread. // All threads remain in this function until // they are asked to quit, which only happens // when terminating the thread pool. void ThreadPool::ThreadFunc( int threadId ) { ThreadData *myThreadData = &m_ThreadData[threadId]; std::cout << "Hello world: Thread " << threadId << std::endl; // Signal that this thread is ready m_ThreadReady_mutex.lock(); myThreadData->ready = true; m_ThreadReady_condvar.notify_one(); // notify the main thread m_ThreadReady_mutex.unlock(); while(true) { //tthread::lock_guard<tthread::mutex> guard(m); m_WorkToDo_mutex.lock(); while(!myThreadData->haveWorkToDo) // check for work to do m_WorkToDo_condvar.wait(m_WorkToDo_mutex); // if no work, wait here myThreadData->haveWorkToDo = false; // need to do this before unlocking the mutex m_WorkToDo_mutex.unlock(); // Do the work switch(myThreadData->typeOfWork) { case e_work: std::cout << "Thread " << threadId << ": Woken with work to do\n"; // Do work myThreadData->workFunc(myThreadData->workData); std::cout << "#Thread " << threadId << ": Work is completed\n"; break; case e_quit: std::cout << "Thread " << threadId << ": Asked to quit\n"; return; // ends the thread } // Now to signal the main thread that my work is completed m_WorkCompleted_mutex.lock(); m_NumOfThreadsDoingWork--; // Unsure if this 'if' would make the program more efficient // if(NumOfThreadsDoingWork == 0) m_WorkCompleted_condvar.notify_one(); // notify the main thread m_WorkCompleted_mutex.unlock(); } } ThreadPool::ThreadPool() { m_numOfThreads = 0; m_NumOfThreadsDoingWork = 0; m_NumOfThreadsGivenJobs = 0; } ThreadPool::~ThreadPool() { if(m_numOfThreads) { DestroyThreadPool(); delete [] m_ThreadData; } } void ThreadPool::CreateThreads(int numOfThreads) { // Check a thread pool has already been created if(m_numOfThreads > 0) return; m_NumOfThreadsGivenJobs = 0; m_NumOfThreadsDoingWork = 0; m_numOfThreads = numOfThreads; m_ThreadData = new ThreadData[m_numOfThreads]; ThreadArgStruct threadArg; for(int i=0; i<m_numOfThreads; ++i) { threadArg.threadId = i; threadArg.threadPoolInstance = this; // Creates the thread and save in a list so we can destroy it later m_ThreadList.push_back( new tthread::thread( ThreadFuncWrapper, (void *)&threadArg ) ); // It takes a little time for a thread to get established. // Best wait until it gets established before creating the next thread. m_ThreadReady_mutex.lock(); while(!m_ThreadData[i].ready) // Check if thread is ready m_ThreadReady_condvar.wait(m_ThreadReady_mutex); // If not, wait here m_ThreadReady_mutex.unlock(); } } // Adds a job to the batch, but doesn't start the job void ThreadPool::SubmitJob(void (*workFunc)(void *), void *workData) { // Check that the thread pool has been created if(!m_numOfThreads) return; if(m_NumOfThreadsGivenJobs >= m_numOfThreads) return; m_ThreadData[m_NumOfThreadsGivenJobs].workFunc = workFunc; m_ThreadData[m_NumOfThreadsGivenJobs].workData = workData; std::cout << "Submitted job " << m_NumOfThreadsGivenJobs << std::endl; m_NumOfThreadsGivenJobs++; } void ThreadPool::StartJobs() { // Check that the thread pool has been created // and some jobs have been assigned if(!m_numOfThreads || !m_NumOfThreadsGivenJobs) return; // Set 'haveworkToDo' flag for all threads m_WorkToDo_mutex.lock(); for(int i=0; i<m_NumOfThreadsGivenJobs; ++i) m_ThreadData[i].haveWorkToDo = true; m_NumOfThreadsDoingWork = m_NumOfThreadsGivenJobs; // Reset this counter so we can resubmit jobs later m_NumOfThreadsGivenJobs = 0; // Notify all threads they have work to do m_WorkToDo_condvar.notify_all(); m_WorkToDo_mutex.unlock(); } void ThreadPool::WaitForJobsToComplete() { // Check that a thread pool has been created if(!m_numOfThreads) return; m_WorkCompleted_mutex.lock(); while(m_NumOfThreadsDoingWork > 0) // Check if all threads have completed their work m_WorkCompleted_condvar.wait(m_WorkCompleted_mutex); // If not, wait here m_WorkCompleted_mutex.unlock(); } void ThreadPool::DestroyThreadPool() { std::cout << "Ask threads to quit\n"; m_WorkToDo_mutex.lock(); for(int i=0; i<m_numOfThreads; ++i) { m_ThreadData[i].haveWorkToDo = true; m_ThreadData[i].typeOfWork = e_quit; } m_WorkToDo_condvar.notify_all(); m_WorkToDo_mutex.unlock(); // As each thread terminates, catch them here for(int i=0; i<m_numOfThreads; ++i) { tthread::thread *t = m_ThreadList[i]; // Wait for thread to complete t->join(); } m_numOfThreads = 0; } Example of usage: (this calculates pi-squared/6) struct CalculationDataStruct { int inputVal; double outputVal; }; void LongCalculation( void *theSums ) { CalculationDataStruct *sums = (CalculationDataStruct *)theSums; int terms = sums->inputVal; double sum; for(int i=1; i<terms; i++) sum += 1.0/( double(i)*double(i) ); sums->outputVal = sum; } int main(int argc, char** argv) { int numThreads = 10; // Create pool ThreadPool threadPool; threadPool.CreateThreads(numThreads); // Create thread workspace CalculationDataStruct sums[numThreads]; // Set up jobs for(int i=0; i<numThreads; i++) { sums[i].inputVal = 3000*(i+1); threadPool.SubmitJob(LongCalculation, &sums[i]); } // Run the jobs threadPool.StartJobs(); threadPool.WaitForJobsToComplete(); // Print results for(int i=0; i<numThreads; i++) std::cout << "Sum of " << sums[i].inputVal << " terms is " << sums[i].outputVal << std::endl; return 0; } Question: In the ThreadPool::ThreadFunc method, would better performance be obtained if the following if statement if(NumOfThreadsDoingWork == 0) was included? Also, I'd be grateful of criticisms and ways to improve the code. At the same time, I hope the code is of use to others.

Read the article
My error with upgrading 4.0 to 4.2- What NOT to do...

- by Steve Tunstall

Last week, I was helping a client upgrade from the 2011.1.4.0 code to the newest 2011.1.4.2 code. We downloaded the 4.2 update from MOS, upload and unpacked it on both controllers, and upgraded one of the controllers in the cluster with no issues at all. As this was a brand-new system with no networking or pools made on it yet, there were not any resources to fail back and forth between the controllers. Each controller had it's own, private, management interface (igb0 and igb1) and that's it. So we took controller 1 as the passive controller and upgraded it first. The first controller came back up with no issues and was now on the 4.2 code. Great. We then did a takeover on controller 1, making it the active head (although there were no resources for it to take), and then proceeded to upgrade controller 2. Upon upgrading the second controller, we ran the health check with no issues. We then ran the update and it ran and rebooted normally. However, something strange then happened. It took longer than normal to come back up, and when it did, we got the "cluster controllers on different code" error message that one gets when the two controllers of a cluster are running different code. But we just upgraded the second controller to 4.2, so they should have been the same, right??? Going into the Maintenance-->System screen of controller 2, we saw something very strange. The "current version" was still on 4.0, and the 4.2 code was there but was in the "previous" state with the rollback icon, as if it was the OLDER code and not the newer code. I have never seen this happen before. I would have thought it was a bad 4.2 code file, but it worked just fine with controller 1, so I don't think that was it. Other than the fact the code did not update, there was nothing else going on with this system. It had no yellow lights, no errors in the Problems section, and no errors in any of the logs. It was just out of the box a few hours ago, and didn't even have a storage pool yet. So.... We deleted the 4.2 code, uploaded it from scratch, ran the health check, and ran the upgrade again. once again, it seemed to go great, rebooted, and came back up to the same issue, where it came to 4.0 instead of 4.2. See the picture below.... HERE IS WHERE I MADE A BIG MISTAKE.... I SHOULD have instantly called support and opened a Sev 2 ticket. They could have done a shared shell and gotten the correct Fishwork engineer to look at the files and the code and determine what file was messed up and fixed it. The system was up and working just fine, it was just on an older code version, not really a huge problem at all. Instead, I went ahead and clicked the "Rollback" icon, thinking that the system would rollback to the 4.2 code. Ouch... What happened was that the system said, "Fine, I will delete the 4.0 code and boot to your 4.2 code"... Which was stupid on my part because something was wrong with the 4.2 code file here and the 4.0 was just fine. So now the system could not boot at all, and the 4.0 code was completely missing from the system, and even a high-level Fishworks engineer could not help us. I had messed it up good. We could only get to the ILOM, and I had to re-image the system from scratch using a hard-to-get-and-use FishStick USB drive. These are tightly controlled and difficult to get, almost always handcuffed to an engineer who will drive out to re-image a system. This took another day of my client's time. So.... If you see a "previous version" of your system code which is actually a version higher than the current version... DO NOT ROLL IT BACK.... It did not upgrade for a very good reason. In my case, after the system was re-imaged to a code level just 3 back, we once again tried the same 4.2 code update and it worked perfectly the first time and is now great and stable. Lesson learned. By the way, our buddy Ryan Matthews wanted to point out the best practice and supported way of performing an upgrade of an active/active ZFSSA, where both controllers are doing some of the work. These steps would not have helpped me for the above issue, but it's important to follow the correct proceedure when doing an upgrade. 1) Upload software to both controllers and wait for it to unpack 2) On controller "A" navigate to configuration/cluster and click "takeover" 3) Wait for controller "B" to finish restarting, then login to it, navigate to maintenance/system, and roll forward to the new software. 4) Wait for controller "B" to apply the update and finish rebooting 5) Login to controller "B", navigate to configuration/cluster and click "takeover" 6) Wait for controller "A" to finish restarting, then login to it, navigate to maintenance/system, and roll forward to the new software. 7) Wait for controller "A" to apply the update and finish rebooting 8) Login to controller "B", navigate to configuration/cluster and click "failback"

Read the article
Sharing Bandwidth and Prioritizing Realtime Traffic via HTB, Which Scenario Works Better?

- by Mecki

I would like to add some kind of traffic management to our Internet line. After reading a lot of documentation, I think HFSC is too complicated for me (I don't understand all the curves stuff, I'm afraid I will never get it right), CBQ is not recommend, and basically HTB is the way to go for most people. Our internal network has three "segments" and I'd like to share bandwidth more or less equally between those (at least in the beginning). Further I must prioritize traffic according to at least three kinds of traffic (realtime traffic, standard traffic, and bulk traffic). The bandwidth sharing is not as important as the fact that realtime traffic should always be treated as premium traffic whenever possible, but of course no other traffic class may starve either. The question is, what makes more sense and also guarantees better realtime throughput: Creating one class per segment, each having the same rate (priority doesn't matter for classes that are no leaves according to HTB developer) and each of these classes has three sub-classes (leaves) for the 3 priority levels (with different priorities and different rates). Having one class per priority level on top, each having a different rate (again priority won't matter) and each having 3 sub-classes, one per segment, whereas all 3 in the realtime class have highest prio, lowest prio in the bulk class, and so on. I'll try to make this more clear with the following ASCII art image: Case 1: root --+--> Segment A | +--> High Prio | +--> Normal Prio | +--> Low Prio | +--> Segment B | +--> High Prio | +--> Normal Prio | +--> Low Prio | +--> Segment C +--> High Prio +--> Normal Prio +--> Low Prio Case 2: root --+--> High Prio | +--> Segment A | +--> Segment B | +--> Segment C | +--> Normal Prio | +--> Segment A | +--> Segment B | +--> Segment C | +--> Low Prio +--> Segment A +--> Segment B +--> Segment C Case 1 Seems like the way most people would do it, but unless I don't read the HTB implementation details correctly, Case 2 may offer better prioritizing. The HTB manual says, that if a class has hit its rate, it may borrow from its parent and when borrowing, classes with higher priority always get bandwidth offered first. However, it also says that classes having bandwidth available on a lower tree-level are always preferred to those on a higher tree level, regardless of priority. Let's assume the following situation: Segment C is not sending any traffic. Segment A is only sending realtime traffic, as fast as it can (enough to saturate the link alone) and Segment B is only sending bulk traffic, as fast as it can (again, enough to saturate the full link alone). What will happen? Case 1: Segment A-High Prio and Segment B-Low Prio both have packets to send, since A-High Prio has the higher priority, it will always be scheduled first, till it hits its rate. Now it tries to borrow from Segment A, but since Segment A is on a higher level and Segment B-Low Prio has not yet hit its rate, this class is now served first, till it also hits the rate and wants to borrow from Segment B. Once both have hit their rates, both are on the same level again and now Segment A-High Prio is going to win again, until it hits the rate of Segment A. Now it tries to borrow from root (which has plenty of traffic spare, as Segment C is not using any of its guaranteed traffic), but again, it has to wait for Segment B-Low Prio to also reach the root level. Once that happens, priority is taken into account again and this time Segment A-High Prio will get all the bandwidth left over from Segment C. Case 2: High Prio-Segment A and Low Prio-Segment B both have packets to send, again High Prio-Segment A is going to win as it has the higher priority. Once it hits its rate, it tries to borrow from High Prio, which has bandwidth spare, but being on a higher level, it has to wait for Low Prio-Segment B again to also hit its rate. Once both have hit their rate and both have to borrow, High Prio-Segment A will win again until it hits the rate of the High Prio class. Once that happens, it tries to borrow from root, which has again plenty of bandwidth left (all bandwidth of Normal Prio is unused at the moment), but it has to wait again until Low Prio-Segment B hits the rate limit of the Low Prio class and also tries to borrow from root. Finally both classes try to borrow from root, priority is taken into account, and High Prio-Segment A gets all bandwidth root has left over. Both cases seem sub-optimal, as either way realtime traffic sometimes has to wait for bulk traffic, even though there is plenty of bandwidth left it could borrow. However, in case 2 it seems like the realtime traffic has to wait less than in case 1, since it only has to wait till the bulk traffic rate is hit, which is most likely less than the rate of a whole segment (and in case 1 that is the rate it has to wait for). Or am I totally wrong here? I thought about even simpler setups, using a priority qdisc. But priority queues have the big problem that they cause starvation if they are not somehow limited. Starvation is not acceptable. Of course one can put a TBF (Token Bucket Filter) into each priority class to limit the rate and thus avoid starvation, but when doing so, a single priority class cannot saturate the link on its own any longer, even if all other priority classes are empty, the TBF will prevent that from happening. And this is also sub-optimal, since why wouldn't a class get 100% of the line's bandwidth if no other class needs any of it at the moment? Any comments or ideas regarding this setup? It seems so hard to do using standard tc qdiscs. As a programmer it was such an easy task if I could simply write my own scheduler (which I'm not allowed to do).

Read the article
How to create a shared lock blocking an intent exclusive lock

- by FremenFreedom

As I understand it, a SELECT statement will place a shared lock on the rows that it will return. While that SELECT is running, if an UPDATE statement comes along and needs to grab an intent exclusive lock then that UPDATE statement will need to wait until the SELECT statement releases its shared locks. I am trying to test this SELECT shared lock thing by doing a BEGIN TRAN and then running a SELECT, not COMMITing, and then running an UPDATE in another session on the exact same row. The UPDATE worked fine -- no lock, no wait. So this must not be a valid way to simulate a shared lock blocking an intent exclusive lock? Can you give me a scenario where I can create a lock with a SELECT that would force an UPDATE to wait? I'm working with SQL Server 2000 and 2005 across a linked server: the table is on the 2005 instance, the select is happening on 2000, and the update is executed from 2005. All in SSMS 2005.

Read the article
Understanding Node.js and concept of non-blocking I/O

- by Saif Bechan

Recently I became interested in using Node.js to tackle some of the parts of my web-application. I love the part that its full JavaScript and its very light weight so no use anymore to call an JavaScript-PHP call but a lighter JavaScript-JavaScript call. I however do not understand all the concepts explained. Basic concepts Now in the presentation for Node.js Ryan Dahl talks about non-blocking IO and why this is the way we need to create our programs. I can understand the theoretical concept. You just don't wait for a response, you go ahead and do other things. You make a callback for the response, and when the response arrives millions of clock-cycles later, you can fire that. If you have not already I recommend to watch this presentation. It is very easy to follow and pretty detailed. There are some nice concepts explained on how to write your code in a good manner. There are also some examples given and I am going to work with the basic example given. Examples The way we do thing now: puts("Enter your name: "); var name = gets(); puts("Name: " + name); Now the problem with this is that the code is halted at line 1. It blocks your code. The way we need to do things according to node puts("Enter your name: "); gets(function (name) { puts("Name: " + name); }); Now with this your program does not halt, because the input is a function within the output. So the programs continues to work without halting. Questions Now the basic question I have is how does this work in real-life situations. I am talking here for the use in web-applications. The application I am writing does I/O, bit is still does it in am blocking matter. I think that most of the time, if not all, you need to block, because you have to wait on what the response is you have to work with. When you need to get some information from the database, most of the time this data needs to be verified before you can further with the code. Example 1 If you take a login for example. You have to wait for the database to response to return, because you can not do anything else. I can't see a way around this without blocking. Example 2 Going back to the basic example. The use just request something from a database which does not need any verification. You still have to block because you don't have anything to do more. I can not come up with a single example where you want to do other things while you wait for the response to return. Possible answers I have read that this frees up recourses. When you program like this it takes less CPU or memory usage. So this non-blocking IO is ONLY meant to free up recourses and does not have any other practical use. Not that this is not a huge plus, freeing up recourses is always good. Yet I fail to see this as a good solution. because in both of the above examples, the program has to wait for the response of the user. Whether this is inside a function, or just inline, in my opinion there is a program that wait for input. Resources I looked at I have looked at some recourses before I posted this question. They talk a lot about the theoretical concept, which is quite clear. Yet i fail to see some real-life examples where this is makes a huge difference. Stackoverflow: What is in simple words blocking IO and non-blocking IO? Blocking IO vs non-blocking IO; looking for good articles tidy code for asynchronous IO Other recources: Wikipedia: Asynchronous I/O Introduction to non-blocking I/O The C10K problem

Read the article
how to word wrap, align text like the output of man?

- by cody

what is the command that word wraps and justifies a text file so that the output looks like that of a man page: All of these system calls are used to wait for state changes in a child of the calling process, and obtain information about the child whose state has changed. A state change is considered to be: the child terminated; the child was stopped by a signal; or the child was resumed by a signal. In the case of a terminated child, performing a wait allows the system to release the resources associated with the child; if a wait is not performed, then the termi- nated child remains in a "zombie" state (see NOTES below). Thanks.

Read the article
PowerShell Script to Deploy Multiple VM on Azure in Parallel #azure #powershell

- by Marco Russo (SQLBI)

This blog is usually dedicated to Business Intelligence and SQL Server, but I didn’t found easily on the web simple PowerShell scripts to help me deploying a number of virtual machines on Azure that I use for testing and development. Since I need to deploy, start, stop and remove many virtual machines created from a common image I created (you know, Tabular is not part of the standard images provided by Microsoft…), I wanted to minimize the time required to execute every operation from my Windows Azure PowerShell console (but I suggest you using Windows PowerShell ISE), so I also wanted to fire the commands as soon as possible in parallel, without losing the result in the console. In order to execute multiple commands in parallel, I used the Start-Job cmdlet, and using Get-Job and Receive-Job I wait for job completion and display the messages generated during background command execution. This technique allows me to reduce execution time when I have to deploy, start, stop or remove virtual machines. Please note that a few operations on Azure acquire an exclusive lock and cannot be really executed in parallel, but only one part of their execution time is subject to this lock. Thus, you obtain a better response time also in these scenarios (this is the case of the provisioning of a new VM). Finally, when you remove the VMs you still have the disk containing the virtual machine to remove. This cannot be done just after the VM removal, because you have to wait that the removal operation is completed on Azure. So I wrote a script that you have to run a few minutes after VMs removal and delete disks (and VHD) no longer related to a VM. I just check that the disk were associated to the original image name used to provision the VMs (so I don’t remove other disks deployed by other batches that I might want to preserve). These examples are specific for my scenario, if you need more complex configurations you have to change and adapt the code. But if your need is to create multiple instances of the same VM running in a workgroup, these scripts should be good enough. I prepared the following PowerShell scripts: ProvisionVMs: Provision many VMs in parallel starting from the same image. It creates one service for each VM. RemoveVMs: Remove all the VMs in parallel – it also remove the service created for the VM StartVMs: Starts all the VMs in parallel StopVMs: Stops all the VMs in parallel RemoveOrphanDisks: Remove all the disks no longer used by any VMs. Run this script a few minutes after RemoveVMs script. ProvisionVMs # Name of subscription $SubscriptionName = "Copy the SubscriptionName property you get from Get-AzureSubscription" # Name of storage account (where VMs will be deployed) $StorageAccount = "Copy the Label property you get from Get-AzureStorageAccount" function ProvisionVM( [string]$VmName ) { Start-Job -ArgumentList $VmName { param($VmName) $Location = "Copy the Location property you get from Get-AzureStorageAccount" $InstanceSize = "A5" # You can use any other instance, such as Large, A6, and so on $AdminUsername = "UserName" # Write the name of the administrator account in the new VM $Password = "Password" # Write the password of the administrator account in the new VM $Image = "Copy the ImageName property you get from Get-AzureVMImage" # You can list your own images using the following command: # Get-AzureVMImage | Where-Object {$_.PublisherName -eq "User" } New-AzureVMConfig -Name $VmName -ImageName $Image -InstanceSize $InstanceSize | Add-AzureProvisioningConfig -Windows -Password $Password -AdminUsername $AdminUsername| New-AzureVM -Location $Location -ServiceName "$VmName" -Verbose } } # Set the proper storage - you might remove this line if you have only one storage in the subscription Set-AzureSubscription -SubscriptionName $SubscriptionName -CurrentStorageAccount $StorageAccount # Select the subscription - this line is fundamental if you have access to multiple subscription # You might remove this line if you have only one subscription Select-AzureSubscription -SubscriptionName $SubscriptionName # Every line in the following list provisions one VM using the name specified in the argument # You can change the number of lines - use a unique name for every VM - don't reuse names # already used in other VMs already deployed ProvisionVM "test10" ProvisionVM "test11" ProvisionVM "test12" ProvisionVM "test13" ProvisionVM "test14" ProvisionVM "test15" ProvisionVM "test16" ProvisionVM "test17" ProvisionVM "test18" ProvisionVM "test19" ProvisionVM "test20" # Wait for all to complete While (Get-Job -State "Running") { Get-Job -State "Completed" | Receive-Job Start-Sleep 1 } # Display output from all jobs Get-Job | Receive-Job # Cleanup of jobs Remove-Job * # Displays batch completed echo "Provisioning VM Completed" RemoveVMs # Name of subscription $SubscriptionName = "Copy the SubscriptionName property you get from Get-AzureSubscription" function RemoveVM( [string]$VmName ) { Start-Job -ArgumentList $VmName { param($VmName) Remove-AzureService -ServiceName $VmName -Force -Verbose } } # Select the subscription - this line is fundamental if you have access to multiple subscription # You might remove this line if you have only one subscription Select-AzureSubscription -SubscriptionName $SubscriptionName # Every line in the following list remove one VM using the name specified in the argument # You can change the number of lines - use a unique name for every VM - don't reuse names # already used in other VMs already deployed RemoveVM "test10" RemoveVM "test11" RemoveVM "test12" RemoveVM "test13" RemoveVM "test14" RemoveVM "test15" RemoveVM "test16" RemoveVM "test17" RemoveVM "test18" RemoveVM "test19" RemoveVM "test20" # Wait for all to complete While (Get-Job -State "Running") { Get-Job -State "Completed" | Receive-Job Start-Sleep 1 } # Display output from all jobs Get-Job | Receive-Job # Cleanup Remove-Job * # Displays batch completed echo "Remove VM Completed" StartVMs # Name of subscription $SubscriptionName = "Copy the SubscriptionName property you get from Get-AzureSubscription" function StartVM( [string]$VmName ) { Start-Job -ArgumentList $VmName { param($VmName) Start-AzureVM -Name $VmName -ServiceName $VmName -Verbose } } # Select the subscription - this line is fundamental if you have access to multiple subscription # You might remove this line if you have only one subscription Select-AzureSubscription -SubscriptionName $SubscriptionName # Every line in the following list starts one VM using the name specified in the argument # You can change the number of lines - use a unique name for every VM - don't reuse names # already used in other VMs already deployed StartVM "test10" StartVM "test11" StartVM "test11" StartVM "test12" StartVM "test13" StartVM "test14" StartVM "test15" StartVM "test16" StartVM "test17" StartVM "test18" StartVM "test19" StartVM "test20" # Wait for all to complete While (Get-Job -State "Running") { Get-Job -State "Completed" | Receive-Job Start-Sleep 1 } # Display output from all jobs Get-Job | Receive-Job # Cleanup Remove-Job * # Displays batch completed echo "Start VM Completed" StopVMs # Name of subscription $SubscriptionName = "Copy the SubscriptionName property you get from Get-AzureSubscription" function StopVM( [string]$VmName ) { Start-Job -ArgumentList $VmName { param($VmName) Stop-AzureVM -Name $VmName -ServiceName $VmName -Verbose -Force } } # Select the subscription - this line is fundamental if you have access to multiple subscription # You might remove this line if you have only one subscription Select-AzureSubscription -SubscriptionName $SubscriptionName # Every line in the following list stops one VM using the name specified in the argument # You can change the number of lines - use a unique name for every VM - don't reuse names # already used in other VMs already deployed StopVM "test10" StopVM "test11" StopVM "test12" StopVM "test13" StopVM "test14" StopVM "test15" StopVM "test16" StopVM "test17" StopVM "test18" StopVM "test19" StopVM "test20" # Wait for all to complete While (Get-Job -State "Running") { Get-Job -State "Completed" | Receive-Job Start-Sleep 1 } # Display output from all jobs Get-Job | Receive-Job # Cleanup Remove-Job * # Displays batch completed echo "Stop VM Completed" RemoveOrphanDisks $Image = "Copy the ImageName property you get from Get-AzureVMImage" # You can list your own images using the following command: # Get-AzureVMImage | Where-Object {$_.PublisherName -eq "User" } # Remove all orphan disks coming from the image specified in $ImageName Get-AzureDisk | Where-Object {$_.attachedto -eq $null -and $_.SourceImageName -eq $ImageName} | Remove-AzureDisk -DeleteVHD -Verbose

Read the article
SQL SERVER – Identify Most Resource Intensive Queries – SQL in Sixty Seconds #028 – Video

- by pinaldave

During performance tuning conversation the very first question people often ask is what are the queries offending the server or in another word let us identify the queries which are the most resource intensive. The resources are often described as either Memory, CPU or IO. When we talk about the queries the same is applicable for them as well. The query which is doing lots of reads or writes are for sure resource intensive as well query which are taking maximum CPU time. Performance tuning is a very deep subject and we all have our own preference regarding what should be the first step to tuning and what should be looked with the salt of grain. Though there is no denying that a query which uses more resources than what it should be using for sure require tuning. There are many ways to do identify query using intense resources (e.g. Extended events etc) but in this one we will go by simple DMV. There is a small gotcha we all have to remember about usage of DMV is that it only brings back results from existing cache. So if you have a query which is very resource intensive but is not cached or if you have explicitly removed the query from the cache it will be not part of the result returned by this DMV. It is quite possible that a query is aged and removed from the cache if your cache is not huge. If your cache is large you may want to be careful in running this query during business hours as this query itself can be resource intensive. Get Script to identify resource intensive query from Here Related Tips in SQL in Sixty Seconds: SQL SERVER – Find Most Expensive Queries Using DMV Simple Example to Configure Resource Governor – Introduction to Resource Governor SQL SERVER – DMV – sys.dm_exec_query_optimizer_info – Statistics of Optimizer SQL SERVER – Wait Stats – Wait Types – Wait Queues – Day 0 of 28 Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Database, Pinal Dave, PostADay, SQL, SQL Authority, SQL in Sixty Seconds, SQL Query, SQL Scripts, SQL Server, SQL Server Management Studio, SQL Tips and Tricks, T SQL, Technology, Video Tagged: Excel

Read the article
Moving while doing loop animation in RPGMaker

- by AzDesign

I made a custom class to display character portrait in RPGMaker XP Here is the class : class Poser attr_accessor :images def initialize @images = Sprite.new @images.bitmap = RPG::Cache.picture('Character.png') #100x300 @images.x = 540 #place it on the bottom right corner of the screen @images.y = 180 end end Create an event on the map to create an instance as global variable, tada! image popped out. Ok nice. But Im not satisfied, Now I want it to have bobbing-head animation-like (just like when we breathe, sometimes bob our head up and down) so I added another method : def move(x,y) @images.x += x @images.y += y end def animate(x,y,step,delay) forward = true 2.times { step.times { wait(delay) if forward move(x/step,y/step) else move(-x/step,-y/step) end } wait(delay*3) forward = false } end def wait(time) while time > 0 time -= 1 Graphics.update end end I called the method to run the animation and it works, so far so good, but the problem is, WHILE the portrait goes up and down, I cannot move my character until the animation is finished. So that's it, I'm stuck in the loop block, what I want is to watch the portrait moving up and down while I walk around village, talk to npc, etc. Anyone can solve my problem ? Or better solution ? Thanks in advance

Read the article
Is using a dedicated thread just for sending gpu commands a good idea?

- by tigrou

The most basic game loop is like this : while(1) { update(); draw(); swapbuffers(); } This is very simple but have a problem : some drawing commands can be blocking and cpu will wait while he could do other things (like processing next update() call). Another possible solution i have in mind would be to use two threads : one for updating and preparing commands to be sent to gpu, and one for sending these commands to the gpu : //first thread while(1) { update(); render(); // use gamestate to generate all needed triangles and commands for gpu // put them in a buffer, no command is send to gpu // two buffers will be used, see below pulse(); //signal the other thread data is ready } //second thread while(1) { wait(); // wait for second thread for data to come send_data_togpu(); // send prepared commands from buffer to graphic card swapbuffers(); } also : two buffers would be used, so one buffer could be filled with gpu commands while the other would be processed by gpu. Do you thing such a solution would be effective ? What would be advantages and disadvantages of such a solution (especially against a simpler solution (eg : single threaded with triple buffering enabled) ?

Read the article
How to properly multi thread an RPG

- by Nagrom_17

I am working on an RPG type game in Java and I would like to know a few things relating to threading, What is the best way to implement a "wait for this then do this" without hanging the whole thread? Like waiting for a player to move to a location then pick up an item? or to wait one second then attack? Currently I am spawning new threads every time I need to wait for something, but that doesn't feel like the best solution. Any help is appreciated. EDIT: Clarification and an example of how I currently do things. User clicks on an item The function walkToAndPickUp(item) is called which is basically this: Make a new thread so we don't freeze the thread handling input while the player moves. Tell player to move to the item While the player is not at the item(The player moves through an update() function called in a different thread, I don't know how else to do it without freezing threads) Repeat until the player is at the item If the player is at the item then call delete item from map and add to inventory.

Read the article
Why is my producer-consumer blocking?

- by User007

My code is here: http://pastebin.com/Fi3h0E0P Here is the output 0 Should we take order today (y or n): y Enter order number: 100 More customers (y or n): n Stop serving customers right now. Passing orders to cooker: There are total of 1 order(s) 1 Roger, waiter. I am processing order #100 The goal is waiter must take orders and then give them to the cook. The waiter has to wait cook finishes all pizza, deliver the pizza, and then take new orders. I asked how P-V work in my previous post here. I don't think it has anything to do with \n consuming? I tried all kinds of combination of wait(), but none work. Where did I make a mistake? The main part is here: //Producer process if(pid > 0) { while(1) { printf("0"); P(emptyShelf); // waiter as P finds no items on shelf; P(mutex); // has permission to use the shelf waiter_as_producer(); V(mutex); // cooker now can use the shelf V(orderOnShelf); // cooker now can pickup orders wait(); printf("2"); P(pizzaOnShelf); P(mutex); waiter_as_consumer(); V(mutex); V(emptyShelf); printf("3 "); } } if(pid == 0) { while(1) { printf("1"); P(orderOnShelf); // make sure there is an order on shelf P(mutex); //permission to work cooker_as_consumer(); // take order and put pizza on shelf printf("return from cooker"); V(mutex); //release permission printf("just released perm"); V(pizzaOnShelf); // pizza is now on shelf printf("after"); wait(); printf("4"); } } So I imagine this is the execution path: enter waiter_as_producer, then go to child process (cooker), then transfer the control back to parent, finish waiter_as_consumer, switch back to child. The two waits switch back to parent (like I said I tried all possible wait() combination...).

Read the article
Global Cache CR Requested But Current Block Received

- by Liu Maclean(???)

????????«MINSCN?Cache Fusion Read Consistent» ????,???????????? ??????????????????: SQL> select * from V$version; BANNER -------------------------------------------------------------------------------- Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production PL/SQL Release 11.2.0.3.0 - Production CORE 11.2.0.3.0 Production TNS for Linux: Version 11.2.0.3.0 - Production NLSRTL Version 11.2.0.3.0 - Production SQL> select count(*) from gv$instance; COUNT(*) ---------- 2 SQL> select * from global_name; GLOBAL_NAME -------------------------------------------------------------------------------- www.oracledatabase12g.com ?11gR2 2??RAC??????????status???XG,????Xcurrent block???INSTANCE 2?hold?,?????INSTANCE 1?????????,?????: SQL> select * from test; ID ---------- 1 2 SQL> select dbms_rowid.rowid_block_number(rowid),dbms_rowid.rowid_relative_fno(rowid) from test; DBMS_ROWID.ROWID_BLOCK_NUMBER(ROWID) DBMS_ROWID.ROWID_RELATIVE_FNO(ROWID) ------------------------------------ ------------------------------------ 89233 1 89233 1 SQL> alter system flush buffer_cache; System altered. INSTANCE 1 Session A: SQL> update test set id=id+1 where id=1; 1 row updated. INSTANCE 1 Session B: SQL> select state,cr_scn_bas from x$bh where file#=1 and dbablk=89233 and state!=0; STATE CR_SCN_BAS ---------- ---------- 1 0 3 1755287 SQL> oradebug setmypid; Statement processed. SQL> oradebug dump gc_elements 255; Statement processed. SQL> oradebug tracefile_name; /s01/orabase/diag/rdbms/vprod/VPROD1/trace/VPROD1_ora_19111.trc GLOBAL CACHE ELEMENT DUMP (address: 0xa4ff3080): id1: 0x15c91 id2: 0x1 pkey: OBJ#76896 block: (1/89233) lock: X rls: 0x0 acq: 0x0 latch: 3 flags: 0x20 fair: 0 recovery: 0 fpin: 'kdswh11: kdst_fetch' bscn: 0x0.146e20 bctx: (nil) write: 0 scan: 0x0 lcp: (nil) lnk: [NULL] lch: [0xa9f6a6f8,0xa9f6a6f8] seq: 32 hist: 58 145:0 118 66 144:0 192 352 197 48 121 113 424 180 58 LIST OF BUFFERS LINKED TO THIS GLOBAL CACHE ELEMENT: flg: 0x02000001 lflg: 0x1 state: XCURRENT tsn: 0 tsh: 2 addr: 0xa9f6a5c8 obj: 76896 cls: DATA bscn: 0x0.1ac898 BH (0xa9f6a5c8) file#: 1 rdba: 0x00415c91 (1/89233) class: 1 ba: 0xa9e56000 set: 5 pool: 3 bsz: 8192 bsi: 0 sflg: 3 pwc: 0,15 dbwrid: 0 obj: 76896 objn: 76896 tsn: 0 afn: 1 hint: f hash: [0x91f4e970,0xbae9d5b8] lru: [0x91f58848,0xa9f6a828] lru-flags: debug_dump obj-flags: object_ckpt_list ckptq: [0x9df6d1d8,0xa9f6a740] fileq: [0xa2ece670,0xbdf4ed68] objq: [0xb4964e00,0xb4964e00] objaq: [0xb4964de0,0xb4964de0] st: XCURRENT md: NULL fpin: 'kdswh11: kdst_fetch' tch: 2 le: 0xa4ff3080 flags: buffer_dirty redo_since_read LRBA: [0x19.5671.0] LSCN: [0x0.1ac898] HSCN: [0x0.1ac898] HSUB: [1] buffer tsn: 0 rdba: 0x00415c91 (1/89233) scn: 0x0000.001ac898 seq: 0x01 flg: 0x00 tail: 0xc8980601 frmt: 0x02 chkval: 0x0000 type: 0x06=trans data ??????block: (1/89233)?GLOBAL CACHE ELEMENT DUMP?LOCK????X ??XG , ??????Current Block????Instance??modify???,????????????? ????Instance 2 ????: Instance 2 Session C: SQL> update test set id=id+1 where id=2; 1 row updated. Instance 2 Session D: SQL> select state,cr_scn_bas from x$bh where file#=1 and dbablk=89233 and state!=0; STATE CR_SCN_BAS ---------- ---------- 1 0 3 1756658 SQL> oradebug setmypid; Statement processed. SQL> oradebug dump gc_elements 255; Statement processed. SQL> oradebug tracefile_name; /s01/orabase/diag/rdbms/vprod/VPROD2/trace/VPROD2_ora_13038.trc GLOBAL CACHE ELEMENT DUMP (address: 0x89fb25a0): id1: 0x15c91 id2: 0x1 pkey: OBJ#76896 block: (1/89233) lock: XG rls: 0x0 acq: 0x0 latch: 3 flags: 0x20 fair: 0 recovery: 0 fpin: 'kduwh01: kdusru' bscn: 0x0.1acdf3 bctx: (nil) write: 0 scan: 0x0 lcp: (nil) lnk: [NULL] lch: [0x96f4cf80,0x96f4cf80] seq: 61 hist: 324 21 143:0 19 16 352 329 144:6 14 7 352 197 LIST OF BUFFERS LINKED TO THIS GLOBAL CACHE ELEMENT: flg: 0x0a000001 state: XCURRENT tsn: 0 tsh: 1 addr: 0x96f4ce50 obj: 76896 cls: DATA bscn: 0x0.1acdf6 BH (0x96f4ce50) file#: 1 rdba: 0x00415c91 (1/89233) class: 1 ba: 0x96bd4000 set: 5 pool: 3 bsz: 8192 bsi: 0 sflg: 2 pwc: 0,15 dbwrid: 0 obj: 76896 objn: 76896 tsn: 0 afn: 1 hint: f hash: [0x96ee1fe8,0xbae9d5b8] lru: [0x96f4d0b0,0x96f4cdc0] obj-flags: object_ckpt_list ckptq: [0xbdf519b8,0x96f4d5a8] fileq: [0xbdf519d8,0xbdf519d8] objq: [0xb4a47b90,0xb4a47b90] objaq: [0x96f4d0e8,0xb4a47b70] st: XCURRENT md: NULL fpin: 'kduwh01: kdusru' tch: 1 le: 0x89fb25a0 flags: buffer_dirty redo_since_read remote_transfered LRBA: [0x11.9e18.0] LSCN: [0x0.1acdf6] HSCN: [0x0.1acdf6] HSUB: [1] buffer tsn: 0 rdba: 0x00415c91 (1/89233) scn: 0x0000.001acdf6 seq: 0x01 flg: 0x00 tail: 0xcdf60601 frmt: 0x02 chkval: 0x0000 type: 0x06=trans data GCS CLIENT 0x89fb2618,6 resp[(nil),0x15c91.1] pkey 76896.0 grant 2 cvt 0 mdrole 0x42 st 0x100 lst 0x20 GRANTQ rl G0 master 1 owner 2 sid 0 remote[(nil),0] hist 0x94121c601163423c history 0x3c.0x4.0xd.0xb.0x1.0xc.0x7.0x9.0x14.0x1. cflag 0x0 sender 1 flags 0x0 replay# 0 abast (nil).x0.1 dbmap (nil) disk: 0x0000.00000000 write request: 0x0000.00000000 pi scn: 0x0000.00000000 sq[(nil),(nil)] msgseq 0x1 updseq 0x0 reqids[6,0,0] infop (nil) lockseq x2b8 pkey 76896.0 hv 93 [stat 0x0, 1->1, wm 32768, RMno 0, reminc 18, dom 0] kjga st 0x4, step 0.0.0, cinc 20, rmno 6, flags 0x0 lb 0, hb 0, myb 15250, drmb 15250, apifrz 0 ?Instance 2??????block: (1/89233)? GLOBAL CACHE ELEMENT Lock Convert?lock: XG ????GC_ELEMENTS DUMP???XCUR Cache Fusion?,???????X$ VIEW,??? X$LE X$KJBR X$KJBL, ???X$ VIEW???????????????????: INSTANCE 2 Session D: SELECT * FROM x$le WHERE le_addr IN (SELECT le_addr FROM x$bh WHERE obj IN (SELECT data_object_id FROM dba_objects WHERE owner = 'SYS' AND object_name = 'TEST') AND class = 1 AND state != 3); ADDR INDX INST_ID LE_ADDR LE_ID1 LE_ID2 ---------------- ---------- ---------- ---------------- ---------- ---------- LE_RLS LE_ACQ LE_FLAGS LE_MODE LE_WRITE LE_LOCAL LE_RECOVERY ---------- ---------- ---------- ---------- ---------- ---------- ----------- LE_BLKS LE_TIME LE_KJBL ---------- ---------- ---------------- 00007F94CA14CF60 7003 2 0000000089FB25A0 89233 1 0 0 32 2 0 1 0 1 0 0000000089FB2618 PCM Resource NAME?[ID1][ID2],[BL]???, ID1?ID2 ??blockno? fileno????, ??????????GC_elements dump?? id1: 0x15c91 id2: 0×1 pkey: OBJ#76896 block: (1/89233)?? ,? kjblname ? kjbrname ??”[0x15c91][0x1],[BL]” ??: INSTANCE 2 Session D: SQL> set linesize 80 pagesize 1400 SQL> SELECT * 2 FROM x$kjbl l 3 WHERE l.kjblname LIKE '%[0x15c91][0x1],[BL]%'; ADDR INDX INST_ID KJBLLOCKP KJBLGRANT KJBLREQUE ---------------- ---------- ---------- ---------------- --------- --------- KJBLROLE KJBLRESP KJBLNAME ---------- ---------------- ------------------------------ KJBLNAME2 KJBLQUEUE ------------------------------ ---------- KJBLLOCKST KJBLWRITING ---------------------------------------------------------------- ----------- KJBLREQWRITE KJBLOWNER KJBLMASTER KJBLBLOCKED KJBLBLOCKER KJBLSID KJBLRDOMID ------------ ---------- ---------- ----------- ----------- ---------- ---------- KJBLPKEY ---------- 00007F94CA22A288 451 2 0000000089FB2618 KJUSEREX KJUSERNL 0 00 [0x15c91][0x1],[BL][ext 0x0,0x 89233,1,BL 0 GRANTED 0 0 1 0 0 0 0 0 76896 SQL> SELECT r.* FROM x$kjbr r WHERE r.kjbrname LIKE '%[0x15c91][0x1],[BL]%'; no rows selected Instance 1 session B: SQL> SELECT r.* FROM x$kjbr r WHERE r.kjbrname LIKE '%[0x15c91][0x1],[BL]%'; ADDR INDX INST_ID KJBRRESP KJBRGRANT KJBRNCVL ---------------- ---------- ---------- ---------------- --------- --------- KJBRROLE KJBRNAME KJBRMASTER KJBRGRANTQ ---------- ------------------------------ ---------- ---------------- KJBRCVTQ KJBRWRITER KJBRSID KJBRRDOMID KJBRPKEY ---------------- ---------------- ---------- ---------- ---------- 00007F801ACA68F8 1355 1 00000000B5A62AE0 KJUSEREX KJUSERNL 0 [0x15c91][0x1],[BL][ext 0x0,0x 0 00000000B48BB330 00 00 0 0 76896 ??????Instance 1???block: (1/89233),??????Instance 2 build cr block ????Instance 1, ?????????? ????? Instance 1? Foreground Process ? Instance 2?LMS??????RAC TRACE: Instance 2: [oracle@vrh2 ~]$ ps -ef|grep ora_lms|grep -v grep oracle 23364 1 0 Apr29 ? 00:33:15 ora_lms0_VPROD2 SQL> oradebug setospid 23364 Oracle pid: 13, Unix process pid: 23364, image: [email protected] (LMS0) SQL> oradebug event 10046 trace name context forever,level 8:10708 trace name context forever,level 103: trace[rac.*] disk high; Statement processed. SQL> oradebug tracefile_name /s01/orabase/diag/rdbms/vprod/VPROD2/trace/VPROD2_lms0_23364.trc Instance 1 session B : SQL> select state,cr_scn_bas from x$bh where file#=1 and dbablk=89233 and state!=0; STATE CR_SCN_BAS ---------- ---------- 3 1756658 3 1756661 3 1755287 Instance 1 session A : SQL> alter session set events '10046 trace name context forever,level 8:10708 trace name context forever,level 103: trace[rac.*] disk high'; Session altered. SQL> select * from test; ID ---------- 2 2 SQL> select state,cr_scn_bas from x$bh where file#=1 and dbablk=89233 and state!=0; STATE CR_SCN_BAS ---------- ---------- 3 1761520 ?x$BH?????,???????Instance 1???build??CR block,????? TRACE ??: Instance 1 foreground Process: PARSING IN CURSOR #140336527348792 len=18 dep=0 uid=0 oct=3 lid=0 tim=1335939136125254 hv=1689401402 ad='b1a4c828' sqlid='c99yw1xkb4f1u' select * from test END OF STMT PARSE #140336527348792:c=2999,e=2860,p=0,cr=0,cu=0,mis=1,r=0,dep=0,og=1,plh=1357081020,tim=1335939136125253 EXEC #140336527348792:c=0,e=40,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=1357081020,tim=1335939136125373 WAIT #140336527348792: nam='SQL*Net message to client' ela= 6 driver id=1650815232 #bytes=1 p3=0 obj#=0 tim=1335939136125420 *** 2012-05-02 02:12:16.125 kclscrs: req=0 block=1/89233 2012-05-02 02:12:16.125574 : kjbcro[0x15c91.1 76896.0][4] *** 2012-05-02 02:12:16.125 kclscrs: req=0 typ=nowait-abort *** 2012-05-02 02:12:16.125 kclscrs: bid=1:3:1:0:f:1e:0:0:10:0:0:0:1:2:4:1:20:0:0:0:c3:49:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:4:3:2:1:2:0:1c:0:4d:26:a3:52:0:0:0:0:c7:c:ca:62:c3:49:0:0:0:0:1:0:14:8e:47:76:1:2:dc:5:a9:fe:17:75:0:0:0:0:0:0:0:0:0:0:0:0:99:ed:0:0:0:0:0:0:10:0:0:0 2012-05-02 02:12:16.125718 : kjbcro[0x15c91.1 76896.0][4] 2012-05-02 02:12:16.125751 : GSIPC:GMBQ: buff 0xba0ee018, queue 0xbb79a7b8, pool 0x60013fa0, freeq 0, nxt 0xbb79a7b8, prv 0xbb79a7b8 2012-05-02 02:12:16.125780 : kjbsentscn[0x0.1ae0f0][to 2] 2012-05-02 02:12:16.125806 : GSIPC:SENDM: send msg 0xba0ee088 dest x20001 seq 177740 type 36 tkts xff0000 mlen x1680198 2012-05-02 02:12:16.125918 : kjbmscr(0x15c91.1)reqid=0x8(req 0xa4ff30f8)(rinst 1)hldr 2(infosz 200)(lseq x2b8) 2012-05-02 02:12:16.126959 : GSIPC:KSXPCB: msg 0xba0ee088 status 30, type 36, dest 2, rcvr 1 *** 2012-05-02 02:12:16.127 kclwcrs: wait=0 tm=1233 *** 2012-05-02 02:12:16.127 kclwcrs: got 1 blocks from ksxprcv WAIT #140336527348792: nam='gc cr block 2-way' ela= 1233 p1=1 p2=89233 p3=1 obj#=76896 tim=1335939136127199 2012-05-02 02:12:16.127272 : kjbcrcomplete[0x15c91.1 76896.0][0] 2012-05-02 02:12:16.127309 : kjbrcvdscn[0x0.1ae0f0][from 2][idx 2012-05-02 02:12:16.127329 : kjbrcvdscn[no bscn <= rscn 0x0.1ae0f0][from 2] ???? kjbcro[0x15c91.1 76896.0][4] kjbsentscn[0x0.1ae0f0][to 2] ?Instance 2??SCN=1ae0f0=1761520? block: (1/89233),???’gc cr block 2-way’ ??,?????????CR block? Instance 2 LMS TRACE 2012-05-02 02:12:15.634057 : GSIPC:RCVD: ksxp msg 0x7f16e1598588 sndr 1 seq 0.177740 type 36 tkts 0 2012-05-02 02:12:15.634094 : GSIPC:RCVD: watq msg 0x7f16e1598588 sndr 1, seq 177740, type 36, tkts 0 2012-05-02 02:12:15.634108 : GSIPC:TKT: collect msg 0x7f16e1598588 from 1 for rcvr -1, tickets 0 2012-05-02 02:12:15.634162 : kjbrcvdscn[0x0.1ae0f0][from 1][idx 2012-05-02 02:12:15.634186 : kjbrcvdscn[no bscn1, wm 32768, RMno 0, reminc 18, dom 0] kjga st 0x4, step 0.0.0, cinc 20, rmno 6, flags 0x0 lb 0, hb 0, myb 15250, drmb 15250, apifrz 0 GCS CLIENT END 2012-05-02 02:12:15.635211 : kjbdowncvt[0x15c91.1 76896.0][1][options x0] 2012-05-02 02:12:15.635230 : GSIPC:AMBUF: rcv buff 0x7f16e1c56420, pool rcvbuf, rqlen 1103 2012-05-02 02:12:15.635308 : GSIPC:GPBMSG: new bmsg 0x7f16e1c56490 mb 0x7f16e1c56420 msg 0x7f16e1c564b0 mlen 152 dest x101 flushsz -1 2012-05-02 02:12:15.635334 : kjbmslset(0x15c91.1)) seq 0x4 reqid=0x6 (shadow 0xb48bb330.xb)(rsn 2)(mas@1) 2012-05-02 02:12:15.635355 : GSIPC:SPBMSG: send bmsg 0x7f16e1c56490 blen 184 msg 0x7f16e1c564b0 mtype 33 attr|dest x30101 bsz|fsz x1ffff 2012-05-02 02:12:15.635377 : GSIPC:SNDQ: enq msg 0x7f16e1c56490, type 65521 seq 118669, inst 1, receiver 1, queued 1 *** 2012-05-02 02:12:15.635 kclccctx: cleanup copy 0x7f16e1d94798 2012-05-02 02:12:15.635479 : [kjmpmsgi:compl][type 36][msg 0x7f16e1598588][seq 177740.0][qtime 0][ptime 1257] 2012-05-02 02:12:15.635511 : GSIPC:BSEND: flushing sndq 0xb491dd28, id 1, dcx 0xbc516778, inst 1, rcvr 1 qlen 0 1 2012-05-02 02:12:15.635536 : GSIPC:BSEND: no batch1 msg 0x7f16e1c56490 type 65521 len 184 dest (1:1) 2012-05-02 02:12:15.635557 : kjbsentscn[0x0.1ae0f1][to 1] 2012-05-02 02:12:15.635578 : GSIPC:SENDM: send msg 0x7f16e1c56490 dest x10001 seq 118669 type 65521 tkts x10002 mlen xb800e8 WAIT #0: nam='gcs remote message' ela= 180 waittime=1 poll=0 event=0 obj#=0 tim=1335939135635819 2012-05-02 02:12:15.635853 : GSIPC:RCVD: ksxp msg 0x7f16e167e0b0 sndr 1 seq 0.177741 type 32 tkts 0 2012-05-02 02:12:15.635875 : GSIPC:RCVD: watq msg 0x7f16e167e0b0 sndr 1, seq 177741, type 32, tkts 0 2012-05-02 02:12:15.636012 : GSIPC:TKT: collect msg 0x7f16e167e0b0 from 1 for rcvr -1, tickets 0 2012-05-02 02:12:15.636040 : kjbrcvdscn[0x0.1ae0f1][from 1][idx 2012-05-02 02:12:15.636060 : kjbrcvdscn[no bscn <= rscn 0x0.1ae0f1][from 1] 2012-05-02 02:12:15.636082 : GSIPC:TKT: dest (1:1) rtkt not acked 1 unassigned bufs 0 tkts 0 newbufs 0 2012-05-02 02:12:15.636102 : GSIPC:TKT: remove ctx dest (1:1) 2012-05-02 02:12:15.636125 : [kjmxmpm][type 32][seq 0.177741][msg 0x7f16e167e0b0][from 1] 2012-05-02 02:12:15.636146 : kjbmpocr(0xb0.6)seq 0x1,reqid=0x23a,(client 0x9fff7b58,0x1)(from 1)(lseq xdf0) 2????LMS????????? ??gcs remote message GSIPC ????SCN=[0x0.1ae0f0] block=1/89233???,??BAST kjbmpbast(0x15c91.1),?? block=1/89233??????? ??fairness??(?11.2.0.3???_fairness_threshold=2),?current block?KCL: F156: fairness downconvert,?Xcurrent DownConvert? Scurrent: Instance 2: SQL> select state,cr_scn_bas from x$bh where file#=1 and dbablk=89233 and state!=0; STATE CR_SCN_BAS ---------- ---------- 2 0 3 1756658 ??Instance 2 LMS ?cr block??? kjbmslset(0x15c91.1)) ????SEND QUEUE GSIPC:SNDQ: enq msg 0x7f16e1c56490? ???????Instance 1???? block: (1/89233)??? ??????: Instance 2: SQL> select CURRENT_RESULTS,LIGHT_WORKS from v$cr_block_server; CURRENT_RESULTS LIGHT_WORKS --------------- ----------- 29273 437 Instance 1 session A: SQL> SQL> select * from test; ID ---------- 2 2 SQL> select state,cr_scn_bas from x$bh where file#=1 and dbablk=89233 and state!=0; STATE CR_SCN_BAS ---------- ---------- 3 1761942 3 1761932 1 0 3 1761520 Instance 2: SQL> select CURRENT_RESULTS,LIGHT_WORKS from v$cr_block_server; CURRENT_RESULTS LIGHT_WORKS --------------- ----------- 29274 437 select * from test END OF STMT PARSE #140336529675592:c=0,e=337,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=1357081020,tim=1335939668940051 EXEC #140336529675592:c=0,e=96,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=1357081020,tim=1335939668940204 WAIT #140336529675592: nam='SQL*Net message to client' ela= 5 driver id=1650815232 #bytes=1 p3=0 obj#=0 tim=1335939668940348 *** 2012-05-02 02:21:08.940 kclscrs: req=0 block=1/89233 2012-05-02 02:21:08.940676 : kjbcro[0x15c91.1 76896.0][5] *** 2012-05-02 02:21:08.940 kclscrs: req=0 typ=nowait-abort *** 2012-05-02 02:21:08.940 kclscrs: bid=1:3:1:0:f:21:0:0:10:0:0:0:1:2:4:1:20:0:0:0:c3:49:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:4:3:2:1:2:0:1f:0:4d:26:a3:52:0:0:0:0:c7:c:ca:62:c3:49:0:0:0:0:1:0:17:8e:47:76:1:2:dc:5:a9:fe:17:75:0:0:0:0:0:0:0:0:0:0:0:0:99:ed:0:0:0:0:0:0:10:0:0:0 2012-05-02 02:21:08.940799 : kjbcro[0x15c91.1 76896.0][5] 2012-05-02 02:21:08.940833 : GSIPC:GMBQ: buff 0xba0ee018, queue 0xbb79a7b8, pool 0x60013fa0, freeq 0, nxt 0xbb79a7b8, prv 0xbb79a7b8 2012-05-02 02:21:08.940859 : kjbsentscn[0x0.1ae28c][to 2] 2012-05-02 02:21:08.940870 : GSIPC:SENDM: send msg 0xba0ee088 dest x20001 seq 177810 type 36 tkts xff0000 mlen x1680198 2012-05-02 02:21:08.940976 : kjbmscr(0x15c91.1)reqid=0xa(req 0xa4ff30f8)(rinst 1)hldr 2(infosz 200)(lseq x2b8) 2012-05-02 02:21:08.941314 : GSIPC:KSXPCB: msg 0xba0ee088 status 30, type 36, dest 2, rcvr 1 *** 2012-05-02 02:21:08.941 kclwcrs: wait=0 tm=707 *** 2012-05-02 02:21:08.941 kclwcrs: got 1 blocks from ksxprcv 2012-05-02 02:21:08.941818 : kjbassume[0x15c91.1][sender 2][mymode x1][myrole x0][srole x0][flgs x0][spiscn 0x0.0][swscn 0x0.0] 2012-05-02 02:21:08.941852 : kjbrcvdscn[0x0.1ae28d][from 2][idx 2012-05-02 02:21:08.941871 : kjbrcvdscn[no bscn ??????????????SCN=[0x0.1ae28c]=1761932 Version?CR block, ????receive????Xcurrent Block??SCN=1ae28d=1761933,Instance 1???Xcurrent Block???build????????SCN=1761932?CR BLOCK, ????????Current block,?????????'gc current block 2-way'? ?????????????request current block,?????kjbcro;?????Instance 2?LMS???????Current Block: Instance 2 LMS trace: 2012-05-02 02:21:08.448743 : GSIPC:RCVD: ksxp msg 0x7f16e14a4398 sndr 1 seq 0.177810 type 36 tkts 0 2012-05-02 02:21:08.448778 : GSIPC:RCVD: watq msg 0x7f16e14a4398 sndr 1, seq 177810, type 36, tkts 0 2012-05-02 02:21:08.448798 : GSIPC:TKT: collect msg 0x7f16e14a4398 from 1 for rcvr -1, tickets 0 2012-05-02 02:21:08.448816 : kjbrcvdscn[0x0.1ae28c][from 1][idx 2012-05-02 02:21:08.448834 : kjbrcvdscn[no bscn <= rscn 0x0.1ae28c][from 1] 2012-05-02 02:21:08.448857 : GSIPC:TKT: dest (1:1) rtkt not acked 2 unassigned bufs 0 tkts 0 newbufs 0 2012-05-02 02:21:08.448875 : GSIPC:TKT: remove ctx dest (1:1) 2012-05-02 02:21:08.448970 : [kjmxmpm][type 36][seq 0.177810][msg 0x7f16e14a4398][from 1] 2012-05-02 02:21:08.448993 : kjbmpbast(0x15c91.1) reqid=0x6 (req 0xa4ff30f8)(reqinst 1)(reqid 10)(flags x0) *** 2012-05-02 02:21:08.449 kclcrrf: req=48054 block=1/89233 *** 2012-05-02 02:21:08.449 kcl_compress_block: compressed: 6 free space: 7680 2012-05-02 02:21:08.449085 : kjbsentscn[0x0.1ae28d][to 1] 2012-05-02 02:21:08.449142 : kjbdeliver[to 1][0xa4ff30f8][10][current 1] 2012-05-02 02:21:08.449164 : kjbmssch(reqlock 0xa4ff30f8,10)(to 1)(bsz 344) 2012-05-02 02:21:08.449183 : GSIPC:AMBUF: rcv buff 0x7f16e18bcec8, pool rcvbuf, rqlen 1102 *** 2012-05-02 02:21:08.449 kclccctx: cleanup copy 0x7f16e1d94838 *** 2012-05-02 02:21:08.449 kcltouched: touch seconds 3271 *** 2012-05-02 02:21:08.449 kclgrantlk: req=48054 2012-05-02 02:21:08.449347 : [kjmpmsgi:compl][type 36][msg 0x7f16e14a4398][seq 177810.0][qtime 0][ptime 1119] WAIT #0: nam='gcs remote message' ela= 568 waittime=1 poll=0 event=0 obj#=0 tim=1335939668449962 2012-05-02 02:21:08.450001 : GSIPC:RCVD: ksxp msg 0x7f16e1bb22a0 sndr 1 seq 0.177811 type 32 tkts 0 2012-05-02 02:21:08.450024 : GSIPC:RCVD: watq msg 0x7f16e1bb22a0 sndr 1, seq 177811, type 32, tkts 0 2012-05-02 02:21:08.450043 : GSIPC:TKT: collect msg 0x7f16e1bb22a0 from 1 for rcvr -1, tickets 0 2012-05-02 02:21:08.450060 : kjbrcvdscn[0x0.1ae28e][from 1][idx 2012-05-02 02:21:08.450078 : kjbrcvdscn[no bscn <= rscn 0x0.1ae28e][from 1] 2012-05-02 02:21:08.450097 : GSIPC:TKT: dest (1:1) rtkt not acked 3 unassigned bufs 0 tkts 0 newbufs 0 2012-05-02 02:21:08.450116 : GSIPC:TKT: remove ctx dest (1:1) 2012-05-02 02:21:08.450136 : [kjmxmpm][type 32][seq 0.177811][msg 0x7f16e1bb22a0][from 1] 2012-05-02 02:21:08.450155 : kjbmpocr(0xb0.6)seq 0x1,reqid=0x23e,(client 0x9fff7b58,0x1)(from 1)(lseq xdf4) ???Instance 2??LMS???,???build cr block,??????Instance 1?????Current Block??????Instance 2??v$cr_block_server??????LIGHT_WORKS?????current block transfer??????,??????? CR server? Light Work Rule(Light Work Rule?8i Cr Server?????????,?Remote LMS?? build CR????????,resource holder?LMS???????block,????CR build If creating the consistent read version block involves too much work (such as reading blocks from disk), then the holder sends the block to the requestor, and the requestor completes the CR fabrication. The holder maintains a fairness counter of CR requests. After the fairness threshold is reached, the holder downgrades it to lock mode.)? ??????? CR Request ????Current Block?? ???:??????class?block,CR server??????? ??undo block?? undo header block?CR quest, LMS????Current Block, ????? ???? ??????? block cleanout? CR Version??????? ???????? data blocks, ??????? CR quest & CR received?(???????Light Work Rule,LMS"??"), ??Current Block??DownConvert???S lock,??LMS???????ship??current version?block? ??????? , ?????? ,???????DownConvert?????”_fairness_threshold“???200,????Xcurrent Block?????Scurrent, ????LMS?????Current Version?Data Block: SQL> show parameter fair NAME TYPE VALUE ------------------------------------ ----------- ------------------------------ _fairness_threshold integer 200 Instance 1: SQL> update test set id=id+1 where id=4; 1 row updated. Instance 2: SQL> update test set id=id+1 where id=2; 1 row updated. SQL> select state,cr_scn_bas from x$bh where file#=1 and dbablk=89233 and state!=0; STATE CR_SCN_BAS ---------- ---------- 1 0 3 1838166 ?Instance 1? ????,? ??instance 2? v$cr_block_server?? instance 1 SQL> select * from test; ID ---------- 10 3 instance 2: SQL> select state,cr_scn_bas from x$bh where file#=1 and dbablk=89233 and state!=0; STATE CR_SCN_BAS ---------- ---------- 1 0 3 1883707 8 0 SQL> select * from test; ID ---------- 10 3 SQL> select state,cr_scn_bas from x$bh where file#=1 and dbablk=89233 and state!=0; STATE CR_SCN_BAS ---------- ---------- 1 0 3 1883707 8 0 ................... SQL> / STATE CR_SCN_BAS ---------- ---------- 2 0 3 1883707 3 1883695 repeat cr request on Instance 1 SQL> / STATE CR_SCN_BAS ---------- ---------- 8 0 3 1883707 3 1883695 ??????_fairness_threshold????????,?????200 ????????CR serve??Downgrade?lock, ????data block? CR Request????Receive? Current Block?

Read the article
ZFS for Database Log Files

- by user12620111

I've been troubled by drop outs in CPU usage in my application server, characterized by the CPUs suddenly going from close to 90% CPU busy to almost completely CPU idle for a few seconds. Here is an example of a drop out as shown by a snippet of vmstat data taken while the application server is under a heavy workload. # vmstat 1 kthr memory page disk faults cpu r b w swap free re mf pi po fr de sr s3 s4 s5 s6 in sy cs us sy id 1 0 0 130160176 116381952 0 16 0 0 0 0 0 0 0 0 0 207377 117715 203884 70 21 9 12 0 0 130160160 116381936 0 25 0 0 0 0 0 0 0 0 0 200413 117162 197250 70 20 9 11 0 0 130160176 116381920 0 16 0 0 0 0 0 0 1 0 0 203150 119365 200249 72 21 7 8 0 0 130160176 116377808 0 19 0 0 0 0 0 0 0 0 0 169826 96144 165194 56 17 27 0 0 0 130160176 116377800 0 16 0 0 0 0 0 0 0 0 1 10245 9376 9164 2 1 97 0 0 0 130160176 116377792 0 16 0 0 0 0 0 0 0 0 2 15742 12401 14784 4 1 95 0 0 0 130160176 116377776 2 16 0 0 0 0 0 0 1 0 0 19972 17703 19612 6 2 92 14 0 0 130160176 116377696 0 16 0 0 0 0 0 0 0 0 0 202794 116793 199807 71 21 8 9 0 0 130160160 116373584 0 30 0 0 0 0 0 0 18 0 0 203123 117857 198825 69 20 11 This behavior occurred consistently while the application server was processing synthetic transactions: HTTP requests from JMeter running on an external machine. I explored many theories trying to explain the drop outs, including: Unexpected JMeter behavior Network contention Java Garbage Collection Application Server thread pool problems Connection pool problems Database transaction processing Database I/O contention Graphing the CPU %idle led to a breakthrough: Several of the drop outs were 30 seconds apart. With that insight, I went digging through the data again and looking for other outliers that were 30 seconds apart. In the database server statistics, I found spikes in the iostat "asvc_t" (average response time of disk transactions, in milliseconds) for the disk drive that was being used for the database log files. Here is an example: extended device statistics r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device 0.0 2053.6 0.0 8234.3 0.0 0.2 0.0 0.1 0 24 c3t60080E5...F4F6d0s0 0.0 2162.2 0.0 8652.8 0.0 0.3 0.0 0.1 0 28 c3t60080E5...F4F6d0s0 0.0 1102.5 0.0 10012.8 0.0 4.5 0.0 4.1 0 69 c3t60080E5...F4F6d0s0 0.0 74.0 0.0 7920.6 0.0 10.0 0.0 135.1 0 100 c3t60080E5...F4F6d0s0 0.0 568.7 0.0 6674.0 0.0 6.4 0.0 11.2 0 90 c3t60080E5...F4F6d0s0 0.0 1358.0 0.0 5456.0 0.0 0.6 0.0 0.4 0 55 c3t60080E5...F4F6d0s0 0.0 1314.3 0.0 5285.2 0.0 0.7 0.0 0.5 0 70 c3t60080E5...F4F6d0s0 Here is a little more information about my database configuration: The database and application server were running on two different SPARC servers. Storage for the database was on a storage array connected via 8 gigabit Fibre Channel Data storage and log file were on different physical disk drives Reliable low latency I/O is provided by battery backed NVRAM Highly available: Two Fibre Channel links accessed via MPxIO Two Mirrored cache controllers The log file physical disks were mirrored in the storage device Database log files on a ZFS Filesystem with cutting-edge technologies, such as copy-on-write and end-to-end checksumming Why would I be getting service time spikes in my high-end storage? First, I wanted to verify that the database log disk service time spikes aligned with the application server CPU drop outs, and they did: At first, I guessed that the disk service time spikes might be related to flushing the write through cache on the storage device, but I was unable to validate that theory. After searching the WWW for a while, I decided to try using a separate log device: # zpool add ZFS-db-41 log c3t60080E500017D55C000015C150A9F8A7d0 The ZFS log device is configured in a similar manner as described above: two physical disks mirrored in the storage array. This change to the database storage configuration eliminated the application server CPU drop outs: Here is the zpool configuration: # zpool status ZFS-db-41 pool: ZFS-db-41 state: ONLINE scan: none requested config: NAME STATE ZFS-db-41 ONLINE c3t60080E5...F4F6d0 ONLINE logs c3t60080E5...F8A7d0 ONLINE Now, the I/O spikes look like this: extended device statistics r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device 0.0 1053.5 0.0 4234.1 0.0 0.8 0.0 0.7 0 75 c3t60080E5...F8A7d0s0 extended device statistics r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device 0.0 1131.8 0.0 4555.3 0.0 0.8 0.0 0.7 0 76 c3t60080E5...F8A7d0s0 extended device statistics r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device 0.0 1167.6 0.0 4682.2 0.0 0.7 0.0 0.6 0 74 c3t60080E5...F8A7d0s0 0.0 162.2 0.0 19153.9 0.0 0.7 0.0 4.2 0 12 c3t60080E5...F4F6d0s0 extended device statistics r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device 0.0 1247.2 0.0 4992.6 0.0 0.7 0.0 0.6 0 71 c3t60080E5...F8A7d0s0 0.0 41.0 0.0 70.0 0.0 0.1 0.0 1.6 0 2 c3t60080E5...F4F6d0s0 extended device statistics r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device 0.0 1241.3 0.0 4989.3 0.0 0.8 0.0 0.6 0 75 c3t60080E5...F8A7d0s0 extended device statistics r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device 0.0 1193.2 0.0 4772.9 0.0 0.7 0.0 0.6 0 71 c3t60080E5...F8A7d0s0 We can see the steady flow of 4k writes to the ZIL device from O_SYNC database log file writes. The spikes are from flushing the transaction group. Like almost all problems that I run into, once I thoroughly understand the problem, I find that other people have documented similar experiences. Thanks to all of you who have documented alternative approaches. Saved for another day: now that the problem is obvious, I should try "zfs:zfs_immediate_write_sz" as recommended in the ZFS Evil Tuning Guide. References: The ZFS Intent Log Solaris ZFS, Synchronous Writes and the ZIL Explained ZFS Evil Tuning Guide: Cache Flushes ZFS Evil Tuning Guide: Tuning ZFS for Database Performance

Read the article
NHibernate 3.0 and FluentNHibernate, how to get up and running….

- by DesigningCode

First up. Its actually really easy. I’m not very religious about my DB tech, I don’t really care, I just want something that works. So I’m happy to consider all options if they provide an advantage, and recently I was considering jumping from NHibernate to EF 4.0. However before ditching NHibernate and jumping to EF 4.0 I thought I should try the head version of NHibernates trunk and the Head version of FluentNHibernate. I currently have a “Repository / Unit of Work” Framework built up around these two techs. All up it makes my life pretty simple for dealing with databases. The problem is the current release of NHibernate + the Linq provider wasn’t too hot for our purposes. Especially trying to plug it into older VB.NET code. The Linq provider spat the dummy with VB.NET lambdas. Mainly because in C# Query().Where(l => l.Name.Contains("x") || l.Name.Contains("y")).ToList(); is not the same as the VB.NET Query().Where(Function(l) l.Name.Contains("x") Or l.Name.Contains("y")).ToList VB.NET seems to spit out … well…. something different :-) so anyways… Compiling your own version of NHibernate and FluentNHibernate. It’s actually pretty easy! First you’ll need to install tortisesvn NAnt and Git if you don’t already have them. NHibernate first step, get the subversion trunk https://nhibernate.svn.sourceforge.net/svnroot/nhibernate/trunk/ into a directory somewhere. eg \thirdparty\nhibernate Then use NAnt to build it. (if you open the .sln it will show errors in that AssemblyInfo.cs doesn’t exist ) to build it, there is a .txt document with sample command line build instructions, I simply used :- NAnt -D:project.config=release clean build >output-release-build.log *wait* *wait* *wait* and ta da, you will have a bin directory with all the release dlls. FluentNHibernate This was pretty simple. there’s instructions here :- http://wiki.fluentnhibernate.org/Getting_started#Installation basically, with git, create a directory, and you issue the command git clone git://github.com/jagregory/fluent-nhibernate.git and wait, and soon enough you have the source. Now, from the bin directory that NHibernate spit out, take everything and dump it into the subdirectory “fluent-nhibernate\tools\NHibernate” Now, to build, you can use rake….which a ruby build system, however you can also just open the solution and build. Which is what I did. I had a few problems with the references which I simply re-added using the new ones. Once built, I just took all the NHibnerate dlls, and the fluent ones and replaced my existing NHibernate / Fluent and killed off the old linq project. All I had to change is the places that used .Linq<T> and replace them with .Query<T> (which was easy as I had wrapped it already to isolate my code from such changes) and hey presto, everything worked. Even the VB.NET linq calls. I need to do some more testing as I’ve only done basic smoke tests, but its all looking pretty good, so for now, I will stick to NHibernate!

Read the article
Recent improvements in Console Performance

- by loren.konkus

Recently, the WebLogic Server development and support organizations have worked with a number of customers to quantify and improve the performance of the Administration Console in large, distributed configurations where there is significant latency in the communications between the administration server and managed servers. These improvements fall into two categories: Constraining the amount of time that the Console stalls waiting for communication Reducing and streamlining the amount of data required for an update A few releases ago, we added support for a configurable domain-wide mbean "Invocation Timeout" value on the Console's configuration: general, advanced section for a domain. The default value for this setting is 0, which means wait indefinitely and was chosen for compatibility with the behavior of previous releases. This configuration setting applies to all mbean communications between the admin server and managed servers, and is the first line of defense against being blocked by a stalled or completely overloaded managed server. Each site should choose an appropriate timeout value for their environment and network latency. In the next release of WebLogic Server, we've added an additional console preference, "Management Operation Timeout", to the Console's shared preference page. This setting further constrains how long certain console pages will wait for slowly responding servers before returning partial results. While not all Console pages support this yet, key pages such as the Servers Configuration and Control table pages and the Deployments Control pages have been updated to support this. For example, if a user requests a Servers Table page and a Management Operation Timeout occurs, the table is displayed with both local configuration and remote runtime information from the responding managed servers and only local configuration information for servers that did not yet respond. This means that a troublesome managed server does not impede your ability to manage your domain using the Console. To support these changes, these Console pages have been re-written to use the Work Management feature of WebLogic Server to interact with each server or deployment concurrently, which further improves the responsiveness of these pages. The basic algorithm for these pages is: For each configuration mbean (ie, Servers) populate rows with configuration attributes from the fast, local mbean server Find a WorkManager For each server, Create a Work instance to obtain runtime mbean attributes for the server Schedule Work instance in the WorkManager Call WorkManager.waitForAll to wait WorkItems to finish, constrained by Management Operation Timeout For each WorkItem, if the runtime information obtained was not complete, add a message indicating which server has incomplete data Display collected data in table In addition to these changes to constrain how long the console waits for communication, a number of other changes have been made to reduce the amount and scope of managed server interactions for key pages. For example, in previous releases the Deployments Control table looked at the status of a deployment on every managed server, even those servers that the deployment was not currently targeted on. (This was done to handle an edge case where a deployment's target configuration was changed while it remained running on previously targeted servers.) We decided supporting that edge case did not warrant the performance impact for all, and instead only look at the status of a deployment on the servers it is targeted to. Comprehensive status continues to be available if a user clicks on the 'status' field for a deployment. Finally, changes have been made to the System Status portlet to reduce its impact on Console page display times. Obtaining health information for this display requires several mbean interactions with managed servers. In previous releases, this mbean interaction occurred with every display, and any delay or impediment in these interactions was reflected in the display time for every page. To reduce this impact, we've made several changes in this portlet: Using Work Management to obtain health concurrently Applying the operation timeout configuration to constrain how long we will wait Caching health information to reduce the cost during rapid navigation from page to page and only obtaining new health information if the previous information is over 30 seconds old. Eliminating heath collection if this portlet is minimized. Together, these Console changes have resulted in significant performance improvements for the customers with large configurations and high latency that we have worked with during their development, and some lesser performance improvements for those with small configurations and very fast networks. These changes will be included in the 11g Rel 1 patch set 2 (10.3.3.0) release of WebLogic Server.

Read the article
Boost Thread Synchronization

- by Dave18

I don't see synchronized output when i comment the the line wait(1) in thread(). can I make them run at the same time (one after another) without having to use 'wait(1)'? #include <boost/thread.hpp> #include <iostream> void wait(int seconds) { boost::this_thread::sleep(boost::posix_time::seconds(seconds)); } boost::mutex mutex; void thread() { for (int i = 0; i < 100; ++i) { wait(1); mutex.lock(); std::cout << "Thread " << boost::this_thread::get_id() << ": " << i << std::endl; mutex.unlock(); } } int main() { boost::thread t1(thread); boost::thread t2(thread); t1.join(); t2.join(); }

Read the article

< Previous Page | 13 14 15 16 17 18 19 20 21 22 23 24 | Next Page >