Search Results

Search found 22569 results on 903 pages for 'win32 process'.

Page 277/903 | < Previous Page | 273 274 275 276 277 278 279 280 281 282 283 284 | Next Page >

Error starting JBoss 5.1.0

- by Alexandre

I've installed JBoss 5.1.0 on a Xubuntu (running as a guest on VMWare - Windows 7 host). It did work fine for some days, but now I'm completelly unable to start it anymore. Every time I try to start it, I got a "Port 8x83 already in use". I've tried to run it with different ports configurations, and none of them works. I did look for the services using the problematic ports, using netstat and lsof, but they never show up. Since this error occurs in all port configurations, I think this is a Jboss problem. Below is the error stack trace: 2010-06-15 06:21:47,992 INFO [org.jboss.web.WebService] (main) Using RMI server codebase: http://192.168.0.104:8083/ 2010-06-15 06:21:48,085 ERROR [org.jboss.kernel.plugins.dependency.AbstractKernelController] (main) Error installing to Start: name=jboss:service=WebService state=Create mode=Manual requiredState=Installed java.lang.Exception: Port 8083 already in use. at org.jboss.web.WebServer.start(WebServer.java:233) at org.jboss.web.WebService.startService(WebService.java:322) at org.jboss.system.ServiceMBeanSupport.jbossInternalStart(ServiceMBeanSupport.java:376) at org.jboss.system.ServiceMBeanSupport.jbossInternalLifecycle(ServiceMBeanSupport.java:322) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.jboss.mx.interceptor.ReflectedDispatcher.invoke(ReflectedDispatcher.java:157) at org.jboss.mx.server.Invocation.dispatch(Invocation.java:96) at org.jboss.mx.server.Invocation.invoke(Invocation.java:88) at org.jboss.mx.server.AbstractMBeanInvoker.invoke(AbstractMBeanInvoker.java:264) at org.jboss.mx.server.MBeanServerImpl.invoke(MBeanServerImpl.java:668) at org.jboss.system.microcontainer.ServiceProxy.invoke(ServiceProxy.java:189) at $Proxy38.start(Unknown Source) at org.jboss.system.microcontainer.StartStopLifecycleAction.installAction(StartStopLifecycleAction.java:42) at org.jboss.system.microcontainer.StartStopLifecycleAction.installAction(StartStopLifecycleAction.java:37) at org.jboss.dependency.plugins.action.SimpleControllerContextAction.simpleInstallAction(SimpleControllerContextAction.java:62) at org.jboss.dependency.plugins.action.AccessControllerContextAction.install(AccessControllerContextAction.java:71) at org.jboss.dependency.plugins.AbstractControllerContextActions.install(AbstractControllerContextActions.java:51) at org.jboss.dependency.plugins.AbstractControllerContext.install(AbstractControllerContext.java:348) at org.jboss.system.microcontainer.ServiceControllerContext.install(ServiceControllerContext.java:286) at org.jboss.dependency.plugins.AbstractController.install(AbstractController.java:1631) at org.jboss.dependency.plugins.AbstractController.incrementState(AbstractController.java:934) at org.jboss.dependency.plugins.AbstractController.resolveContexts(AbstractController.java:1082) at org.jboss.dependency.plugins.AbstractController.resolveContexts(AbstractController.java:984) at org.jboss.dependency.plugins.AbstractController.change(AbstractController.java:822) at org.jboss.dependency.plugins.AbstractController.change(AbstractController.java:553) at org.jboss.system.ServiceController.doChange(ServiceController.java:688) at org.jboss.system.ServiceController.start(ServiceController.java:460) at org.jboss.system.deployers.ServiceDeployer.start(ServiceDeployer.java:163) at org.jboss.system.deployers.ServiceDeployer.deploy(ServiceDeployer.java:99) at org.jboss.system.deployers.ServiceDeployer.deploy(ServiceDeployer.java:46) at org.jboss.deployers.spi.deployer.helpers.AbstractSimpleRealDeployer.internalDeploy(AbstractSimpleRealDeployer.java:62) at org.jboss.deployers.spi.deployer.helpers.AbstractRealDeployer.deploy(AbstractRealDeployer.java:50) at org.jboss.deployers.spi.deployer.helpers.AbstractRealDeployer.deploy(AbstractRealDeployer.java:50) at org.jboss.deployers.plugins.deployers.DeployerWrapper.deploy(DeployerWrapper.java:171) at org.jboss.deployers.plugins.deployers.DeployersImpl.doDeploy(DeployersImpl.java:1439) at org.jboss.deployers.plugins.deployers.DeployersImpl.doInstallParentFirst(DeployersImpl.java:1157) at org.jboss.deployers.plugins.deployers.DeployersImpl.doInstallParentFirst(DeployersImpl.java:1178) at org.jboss.deployers.plugins.deployers.DeployersImpl.install(DeployersImpl.java:1098) at org.jboss.dependency.plugins.AbstractControllerContext.install(AbstractControllerContext.java:348) at org.jboss.dependency.plugins.AbstractController.install(AbstractController.java:1631) at org.jboss.dependency.plugins.AbstractController.incrementState(AbstractController.java:934) at org.jboss.dependency.plugins.AbstractController.resolveContexts(AbstractController.java:1082) at org.jboss.dependency.plugins.AbstractController.resolveContexts(AbstractController.java:984) at org.jboss.dependency.plugins.AbstractController.change(AbstractController.java:822) at org.jboss.dependency.plugins.AbstractController.change(AbstractController.java:553) at org.jboss.deployers.plugins.deployers.DeployersImpl.process(DeployersImpl.java:781) at org.jboss.deployers.plugins.main.MainDeployerImpl.process(MainDeployerImpl.java:702) at org.jboss.system.server.profileservice.repository.MainDeployerAdapter.process(MainDeployerAdapter.java:117) at org.jboss.system.server.profileservice.repository.ProfileDeployAction.install(ProfileDeployAction.java:70) at org.jboss.system.server.profileservice.repository.AbstractProfileAction.install(AbstractProfileAction.java:53) at org.jboss.system.server.profileservice.repository.AbstractProfileService.install(AbstractProfileService.java:361) at org.jboss.dependency.plugins.AbstractControllerContext.install(AbstractControllerContext.java:348) at org.jboss.dependency.plugins.AbstractController.install(AbstractController.java:1631) at org.jboss.dependency.plugins.AbstractController.incrementState(AbstractController.java:934) at org.jboss.dependency.plugins.AbstractController.resolveContexts(AbstractController.java:1082) at org.jboss.dependency.plugins.AbstractController.resolveContexts(AbstractController.java:984) at org.jboss.dependency.plugins.AbstractController.change(AbstractController.java:822) at org.jboss.dependency.plugins.AbstractController.change(AbstractController.java:553) at org.jboss.system.server.profileservice.repository.AbstractProfileService.activateProfile(AbstractProfileService.java:306) at org.jboss.system.server.profileservice.ProfileServiceBootstrap.start(ProfileServiceBootstrap.java:271) at org.jboss.bootstrap.AbstractServerImpl.start(AbstractServerImpl.java:461) at org.jboss.Main.boot(Main.java:221) at org.jboss.Main$1.run(Main.java:556) at java.lang.Thread.run(Thread.java:619) Caused by: java.net.BindException: Cannot assign requested address at java.net.PlainSocketImpl.socketBind(Native Method) at java.net.PlainSocketImpl.bind(PlainSocketImpl.java:365) at java.net.ServerSocket.bind(ServerSocket.java:319) at java.net.ServerSocket.<init>(ServerSocket.java:185) at org.jboss.web.WebServer.start(WebServer.java:226) Any hint on this? Thanks

Read the article
PHP crashing (seg-fault) under mod_fcgi, apache

- by Andras Gyomrey

I've been programming a site using: Zend Framework 1.11.5 (complete MVC) PHP 5.3.6 Apache 2.2.19 CentOS 5.6 i686 virtuozzo on vps cPanel WHM 11.30.1 (build 4) Mysql 5.1.56-log Mysqli API 5.1.56 The issue started here http://stackoverflow.com/questions/6769515/php-programming-seg-fault. In brief, php is giving me random segmentation-faults. [Wed Jul 20 17:45:34 2011] [error] mod_fcgid: process /usr/local/cpanel/cgi-sys/php5(11562) exit(communication error), get unexpected signal 11 [Wed Jul 20 17:45:34 2011] [warn] [client 190.78.208.30] (104)Connection reset by peer: mod_fcgid: error reading data from FastCGI server [Wed Jul 20 17:45:34 2011] [error] [client 190.78.208.30] Premature end of script headers: index.php About extensions. When i compile php with "--enable-debug" flag, i have to disable this line: zend_extension="/usr/local/IonCube/ioncube_loader_lin_5.3.so" Otherwise, the server doesn't accept requests and i get a "The connection with the server was reset". It is possible that i have to disable eaccelerator too because of the same reason. I still don't get why apache gets running it some times and some others not: extension="eaccelerator.so" Anyway, after i get httpd running, seg-faults can occurr randomly. If i don't compile php with "--enable-debug" flag, i can get DETERMINISTICALLY a php crash: <?php class Admin_DbController extends Controller_BaseController { public function updateSqlDefinitionsAction() { $db = Zend_Registry::get('db'); $row = $db->fetchRow("SHOW CREATE TABLE 222AFI"); } } ?> BUT if i compile php with "--enable-debug" flag, it's really hard to get this error. I must add some complexity to make it crash. I have to be doing many paralell requests for a few seconds to get a crash: <?php class Admin_DbController extends Controller_BaseController { public function updateSqlDefinitionsAction() { $db = Zend_Registry::get('db'); $tableList = $db->listTables(); foreach ($tableList as $tableName){ $row = $db->fetchRow("SHOW CREATE TABLE " . $db->quoteIdentifier($tableName)); file_put_contents( DB_DEFINITIONS_PATH . '/' . $tableName . '.sql', $row['Create Table'] . ';' ); } } } ?> Please notice this is the same script, but creating DDL for all tables in database rather than for one. It seems that if php is heavy loaded (with extensions and me doing many paralell requests) it's when i get php to crash. About starting httpd with "-X": i've tried. The thing is, it is already hard to make php crash with --enable-debug. With "-X" option (which only enables one child process) i can't do parallel requests. So i haven't been able to create to proper debug backtrace: https://bugs.php.net/bugs-generating-backtrace.php My concrete question is, what do i do to get a coredump? root@GWT4 [~]# httpd -V Server version: Apache/2.2.19 (Unix) Server built: Jul 20 2011 19:18:58 Cpanel::Easy::Apache v3.4.2 rev9999 Server's Module Magic Number: 20051115:28 Server loaded: APR 1.4.5, APR-Util 1.3.12 Compiled using: APR 1.4.5, APR-Util 1.3.12 Architecture: 32-bit Server MPM: Prefork threaded: no forked: yes (variable process count) Server compiled with.... -D APACHE_MPM_DIR="server/mpm/prefork" -D APR_HAS_SENDFILE -D APR_HAS_MMAP -D APR_HAVE_IPV6 (IPv4-mapped addresses enabled) -D APR_USE_SYSVSEM_SERIALIZE -D APR_USE_PTHREAD_SERIALIZE -D SINGLE_LISTEN_UNSERIALIZED_ACCEPT -D APR_HAS_OTHER_CHILD -D AP_HAVE_RELIABLE_PIPED_LOGS -D DYNAMIC_MODULE_LIMIT=128 -D HTTPD_ROOT="/usr/local/apache" -D SUEXEC_BIN="/usr/local/apache/bin/suexec" -D DEFAULT_PIDLOG="logs/httpd.pid" -D DEFAULT_SCOREBOARD="logs/apache_runtime_status" -D DEFAULT_LOCKFILE="logs/accept.lock" -D DEFAULT_ERRORLOG="logs/error_log" -D AP_TYPES_CONFIG_FILE="conf/mime.types" -D SERVER_CONFIG_FILE="conf/httpd.conf"

Read the article
Need help with yum,python and php in CentOS. (I made a complete mess!)

- by pek

a while back I wanted to install some plugins for Trac but it required python 2.5 I tried installing it (I don't remember how) and the only thing I managed was to have two versions of python (2.4 and 2.5). Trac still uses the old version but the console uses 2.5 (python -V = Python 2.5.2). Anyway, the problem is not python, the problem is yum (which uses python). I am trying to upgrade my PHP version from 5.1.x to 5.2.x. I tried following this tutorial but when I reach the step with yum I get this error: >[root@XXX]# yum update Loading "installonlyn" plugin Setting up Update Process Setting up repositories Reading repository metadata in from local files Traceback (most recent call last): File "/usr/bin/yum", line 29, in ? yummain.main(sys.argv[1:]) File "/usr/share/yum-cli/yummain.py", line 94, in main result, resultmsgs = base.doCommands() File "/usr/share/yum-cli/cli.py", line 381, in doCommands return self.yum_cli_commands[self.basecmd].doCommand(self, self.basecmd, self.extcmds) File "/usr/share/yum-cli/yumcommands.py", line 150, in doCommand return base.updatePkgs(extcmds) File "/usr/share/yum-cli/cli.py", line 672, in updatePkgs self.doRepoSetup() File "/usr/share/yum-cli/cli.py", line 109, in doRepoSetup self.doSackSetup(thisrepo=thisrepo) File "/usr/lib/python2.4/site-packages/yum/__init__.py", line 338, in doSackSetup self.repos.populateSack(which=repos) File "/usr/lib/python2.4/site-packages/yum/repos.py", line 200, in populateSack sack.populate(repo, with, callback, cacheonly) File "/usr/lib/python2.4/site-packages/yum/yumRepo.py", line 91, in populate dobj = repo.cacheHandler.getPrimary(xml, csum) File "/usr/lib/python2.4/site-packages/yum/sqlitecache.py", line 100, in getPrimary return self._getbase(location, checksum, 'primary') File "/usr/lib/python2.4/site-packages/yum/sqlitecache.py", line 86, in _getbase (db, dbchecksum) = self.getDatabase(location, metadatatype) File "/usr/lib/python2.4/site-packages/yum/sqlitecache.py", line 82, in getDatabase db = self.makeSqliteCacheFile(filename,cachetype) File "/usr/lib/python2.4/site-packages/yum/sqlitecache.py", line 245, in makeSqliteCacheFile self.createTablesPrimary(db) File "/usr/lib/python2.4/site-packages/yum/sqlitecache.py", line 165, in createTablesPrimary cur.execute(q) File "/usr/lib/python2.4/site-packages/sqlite/main.py", line 244, in execute self.rs = self.con.db.execute(SQL) _sqlite.DatabaseError: near "release": syntax error Any help? Thank you. Update OK, so I've managed to update yum hoping it would solve my problems but now I get a slightly different version of the same error: [root@XXX]# yum -y update Loaded plugins: fastestmirror Loading mirror speeds from cached hostfile * addons: mirror.skiplink.com * base: www.gtlib.gatech.edu * epel: mirrors.tummy.com * extras: yum.singlehop.com * updates: centos-distro.cavecreek.net (process:30840): GLib-CRITICAL **: g_timer_stop: assertion `timer != NULL' failed (process:30840): GLib-CRITICAL **: g_timer_destroy: assertion `timer != NULL' failed Traceback (most recent call last): File "/usr/bin/yum", line 29, in ? yummain.user_main(sys.argv[1:], exit_code=True) File "/usr/share/yum-cli/yummain.py", line 309, in user_main errcode = main(args) File "/usr/share/yum-cli/yummain.py", line 178, in main result, resultmsgs = base.doCommands() File "/usr/share/yum-cli/cli.py", line 345, in doCommands self._getTs(needTsRemove) File "/usr/lib/python2.4/site-packages/yum/depsolve.py", line 101, in _getTs self._getTsInfo(remove_only) File "/usr/lib/python2.4/site-packages/yum/depsolve.py", line 112, in _getTsInfo pkgSack = self.pkgSack File "/usr/lib/python2.4/site-packages/yum/__init__.py", line 661, in <lambda> pkgSack = property(fget=lambda self: self._getSacks(), File "/usr/lib/python2.4/site-packages/yum/__init__.py", line 501, in _getSacks self.repos.populateSack(which=repos) File "/usr/lib/python2.4/site-packages/yum/repos.py", line 260, in populateSack sack.populate(repo, mdtype, callback, cacheonly) File "/usr/lib/python2.4/site-packages/yum/yumRepo.py", line 190, in populate dobj = repo_cache_function(xml, csum) File "/usr/lib/python2.4/site-packages/sqlitecachec.py", line 42, in getPrimary self.repoid)) TypeError: Can not create packages table: near "release": syntax error I'm guessing that this "release" thing has something to do with a repository, but I didn't find anything... I went to the sqlitecachec.py at line 42 which writes (line numbers added for convenience): 39: return self.open_database(_sqlitecache.update_primary(location, 40: checksum, 41: self.callback, 42: self.repoid)) Update 2 I think I found the problem. This post suggests that the problem is sqlite and not yum. The version of sqlite I have installed is 3.6.10 but I have no idea which version does python 2.4 uses. ld.so.config contains the following: include ld.so.conf.d/*.conf /usr/local/lib In folder /usr/local/lib I find a symbolic link named libsqlite3.so that points to libsqlite3.so.0.8.6 WHAT IS HAPPENING??????? :S

Read the article
Monit won't run

- by Yaniro

I have two identical EC2 instances (the second is a replica of the first), running Gentoo. The first instance has monit running which monitors a single process and some system resources and functions great. In the second instance, monit runs but quits right away. The configuration is similar on both instances so are the versions of monit. monit.log shows: [GMT Oct 3 08:36:41] info : monit daemon with PID 5 awakened Final lines on strace monit show: write(2, "monit daemon with PID 5 awakened"..., 33monit daemon with PID 5 awakened ) = 33 time(NULL) = 1349252827 open("/etc/localtime", O_RDONLY) = 4 fstat64(4, {st_mode=S_IFREG|0644, st_size=118, ...}) = 0 fstat64(4, {st_mode=S_IFREG|0644, st_size=118, ...}) = 0 mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb773a000 read(4, "TZif2\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\1\0\0\0\1\0\0\0\0"..., 4096) = 118 _llseek(4, -6, [112], SEEK_CUR) = 0 read(4, "\nGMT0\n", 4096) = 6 close(4) = 0 munmap(0xb773a000, 4096) = 0 write(3, "[GMT Oct 3 08:27:07] info :"..., 33) = 33 write(3, "monit daemon with PID 5 awakened"..., 33) = 33 waitpid(-1, NULL, WNOHANG) = -1 ECHILD (No child processes) close(3) = 0 exit_group(0) = ? No core dumps (ulimit -c shows unlimited) monit -v shows: monit: Debug: Adding host allow 'localhost' monit: Debug: Skipping redundant host 'localhost' monit: Debug: Skipping redundant host 'localhost' monit: Debug: Adding credentials for user 'xxxx'. Runtime constants: Control file = /etc/monitrc Log file = /var/log/monit/monit.log Pid file = /var/run/monit.pid Id file = /var/run/monit.pid Debug = True Log = True Use syslog = False Is Daemon = True Use process engine = True Poll time = 30 seconds with start delay 0 seconds Expect buffer = 256 bytes Event queue = base directory /var/monit with 100 slots Mail server(s) = xx.xxx.xx.xxx with timeout 30 seconds Mail from = (not defined) Mail subject = (not defined) Mail message = (not defined) Start monit httpd = True httpd bind address = Any/All httpd portnumber = 2812 httpd signature = True Use ssl encryption = False httpd auth. style = Basic Authentication and Host/Net allow list Alert mail to = [email protected] Alert on = All events The service list contains the following entries: System Name = xxxx Monitoring mode = active CPU wait limit = if greater than 20.0% 1 times within 1 cycle(s) then alert else if succeeded 1 times within 1 cycle(s) then alert CPU system limit = if greater than 30.0% 1 times within 1 cycle(s) then alert else if succeeded 1 times within 1 cycle(s) then alert CPU user limit = if greater than 70.0% 1 times within 1 cycle(s) then alert else if succeeded 1 times within 1 cycle(s) then alert Swap usage limit = if greater than 25.0% 1 times within 1 cycle(s) then alert else if succeeded 1 times within 1 cycle(s) then alert Memory usage limit = if greater than 75.0% 1 times within 1 cycle(s) then alert else if succeeded 1 times within 1 cycle(s) then alert Load avg. (5min) = if greater than 2.0 1 times within 1 cycle(s) then alert else if succeeded 1 times within 1 cycle(s) then alert Load avg. (1min) = if greater than 4.0 1 times within 1 cycle(s) then alert else if succeeded 1 times within 1 cycle(s) then alert Process Name = xxxx Group = server Pid file = /var/run/xxxx.pid Monitoring mode = active Start program = '/etc/init.d/xxxx restart' timeout 20 second(s) Stop program = '/etc/init.d/xxxx stop' timeout 30 second(s) Existence = if does not exist 1 times within 1 cycle(s) then restart else if succeeded 1 times within 1 cycle(s) then alert Pid = if changed 1 times within 1 cycle(s) then alert Ppid = if changed 1 times within 1 cycle(s) then alert Timeout = If restarted 3 times within 5 cycle(s) then unmonitor Alert mail to = [email protected] Alert on = All events Alert mail to = [email protected] Alert on = All events ------------------------------------------------------------------------------- monit daemon with PID 5 awakened Ran emerge --sync before emerge -va monit which installed monit v5.3.2. When that didn't work i've downloaded v5.5 from their website and compiled from source which did not work either.

Read the article
Extended URL php

- by web.ask

I have a problem with the extended url here in php. I'm not even sure if it's called extended url. I was wondering if someone can help me since i'm no good with PHP. I have a php page the careers.php it shows the career listing but when I click on Learn more link it goes to http(dot)localhost/houseofpatel/jobs/4/Career-4 but it outputs an "Object not found". Does xampplite have something to do with this? or is it just the codes? I also have the breadcrumb.php in here. Please see code below: session_start(); include("Breadcrumb.php"); $trail = new Breadcrumb(); $trail->add('Careers', $_SERVER['PHP_SELF'], 1); $trail -> output(); # REMOVE SPECIAL CHARACTERS $sPattern = array('/[^a-zA-Z0-9 -]/', '/[ -]+/', '/^-|-$/'); $sReplace = array('', '-', ''); mysql_query("SET CHARACTER_SET utf8"); //$query = "SELECT * FROM careers_mst WHERE status = 1 ORDER BY dateposted DESC LIMIT 0, 4"; if(!$_GET['jobs']){ $jobsPerPage = 4; }else{ $jobsPerPage = $_GET['jobs']; } if(!$_GET['page']){ $currPage = 1; $showMoreJobs = true; }else{ $currPage = $_GET['page']; $showMoreJobs = false; } $offset = ($currPage - 1 )* $jobsPerPage; $query = "SELECT * FROM careers_mst WHERE status = 1 ORDER BY dateposted"; $process = mysql_query($query); if(@mysql_num_rows($process) > 0){ $totalRows = @mysql_num_rows($process); $query2 = "SELECT * FROM careers_mst WHERE status = 1 ORDER BY dateposted DESC LIMIT $offset, $jobsPerPage"; $process2 = mysql_query($query2); if(@mysql_num_rows($process2) > 0){ $pagNav = pageNavigator($totalRows,$jobsPerPage,$currPage,''); while($row = @mysql_fetch_array($process2)){ $id = $row[0]; $title = stripslashes($row['title']); $title1 = preg_replace($sPattern, $sReplace, $title); $description = stripslashes($row['description']); $description1 = substr($description, 0, 200); $careers_list .= ' <div class="left" style="width:340px; padding-left:10px;"> <p><span class="blue"><strong>'.$title.'</strong></span></p> '.$description1.'... <p><a href="jobs/'.$id.'/'.$title1.'" title="Learn more">[ Learn More ]</a></p> </div>'; } /*$careers_list .= ' <div class="left" style="width: 100%; margin-top:15px;"> <p><a href="more-jobs" title="More Job Opennings">MORE JOB OPENNINGS >> </a></p> <div class="divider left"></div> </div>'; */ if($showMoreJobs){ if($totalRows > $jobsPerPage){ $currPage++; } $jobNav = "<a href=careers.php?page=$currPage title='More Job Opennings'>MORE JOB OPENNINGS >> </a>"; }else{ $jobNav = "$pagNav"; } $careers_list .= "<div class='left' style='width: 100%; margin-top:15px;'> <p>$jobNav</p> <div class='divider left'></div> </div>"; } }

Read the article
Where's the Swap File/Partition?

- by chrisbunney

I'm investigating the virtual memory configuration of a Debian based Amazon EC2 instance, and as my background isn't in system admin, I'm slightly confused by what I'm seeing. We're using MongoDB, and the monitoring server we have indicates that the Mongo process is using about 20GB of swap space, however I can't figure out where this is located on the server. As far as I can tell from using the various suggested methods from Google, there is either a much smaller amount, or none at all. top indicates that there is 1.8GB of swap memory: top - 15:35:21 up 6 days, 3:23, 1 user, load average: 1.60, 1.43, 1.37 Tasks: 47 total, 2 running, 45 sleeping, 0 stopped, 0 zombie Cpu(s): 0.0%us, 1.3%sy, 0.0%ni, 14.7%id, 83.8%wa, 0.0%hi, 0.0%si, 0.1%st Mem: 3928924k total, 2855572k used, 1073352k free, 640564k buffers Swap: 0k total, 0k used, 0k free, 1887788k cached swapon -s doesn't seem to think there's any swap space: Filename Type Size Used Priority free -m doesn't think there's any swap either: total used free shared buffers cached Mem: 3836 3663 172 0 626 2701 -/+ buffers/cache: 336 3500 Swap: 0 0 0 And neither does vmstat: procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu---- r b swpd free buff cache si so bi bo in cs us sy id wa 0 3 0 66224 641372 2874744 0 0 21 5012 21 33 2 2 76 19 But cat /etc/fstab thinks there is a swap partition: /dev/xvda1 / ext3 defaults 1 1 /dev/xvda2 /mnt ext3 defaults 0 0 /dev/xvda3 swap swap defaults 0 0 none /proc proc defaults 0 0 none /sys sysfs defaults 0 0 However df -k gives no indication of the xvda3 partition: Filesystem 1K-blocks Used Available Use% Mounted on /dev/xvda1 16513960 15675324 0 100% / tmpfs 1964460 8 1964452 1% /lib/init/rw udev 1914148 28 1914120 1% /dev tmpfs 1964460 4 1964456 1% /dev/shm So I really don't know what to make of this, because I appear to have a process using about 10 times more virtual memory than what might be available, and I have no idea where this virtual memory is on the system. I'm probably misinterpreting the output of the tools, so I'd be grateful if someone would be able to set me straight: What have I got wrong, what's the right interpretation, and how do you reach that interpretation? EDIT0: We use 10gen's MMS for monitoring the database, the relevant section for memory from the last data point is: "mem": { "virtual": 20749, "bits": 64, "supported": true, "mappedWithJournal": 20376, "mapped": 10188, "resident": 1219 }, This JSON is specific to the database process (I believe) rather than the system as a whole. fdisk -l /dev/xvda outputs... nothing? I tried each of the 3 xvda entries in /etc/fstab as well: root@ip:~# fdisk -l /dev/xvda1 Disk /dev/xvda1: 34.4 GB, 34359738368 bytes 255 heads, 63 sectors/track, 4177 cylinders Units = cylinders of 16065 * 512 = 8225280 bytes Sector size (logical/physical): 512 bytes / 512 bytes I/O size (minimum/optimal): 512 bytes / 512 bytes Disk identifier: 0x00000000 Disk /dev/xvda1 doesn't contain a valid partition table root@ip:~# fdisk -l /dev/xvda2 root@ip:~# fdisk -l /dev/xvda3 root@ip:~# Edit1: Output of cat /proc/meminfo for the sake of completeness: MemTotal: 3928924 kB MemFree: 726600 kB Buffers: 648368 kB Cached: 2216556 kB SwapCached: 0 kB Active: 1945100 kB Inactive: 994016 kB Active(anon): 60476 kB Inactive(anon): 12952 kB Active(file): 1884624 kB Inactive(file): 981064 kB Unevictable: 0 kB Mlocked: 0 kB SwapTotal: 0 kB SwapFree: 0 kB Dirty: 387180 kB Writeback: 0 kB AnonPages: 73380 kB Mapped: 1188260 kB Shmem: 48 kB Slab: 149768 kB SReclaimable: 146076 kB SUnreclaim: 3692 kB KernelStack: 1104 kB PageTables: 16096 kB NFS_Unstable: 0 kB Bounce: 0 kB WritebackTmp: 0 kB CommitLimit: 1964460 kB Committed_AS: 305572 kB VmallocTotal: 34359738367 kB VmallocUsed: 16760 kB VmallocChunk: 34359721448 kB HardwareCorrupted: 0 kB HugePages_Total: 0 HugePages_Free: 0 HugePages_Rsvd: 0 HugePages_Surp: 0 Hugepagesize: 2048 kB DirectMap4k: 3932160 kB DirectMap2M: 0 kB

Read the article
Riak "error":"insufficient_vnodes_available"

- by Wolfiem

We have 4 nodes Riak installation. They are running on Ubuntu 12.04 LTS Precise installed servers. We have installed 1.1.4 at August 1st 2012 and upgraded 1.2.0 when its available. Server names are: f1 - 10.10.0.12 - This is the first installed server. We have joined other ones to this server. This also serves Riak control. s2 - 10.10.0.22 - s3 - 10.10.0.23 - s4 - 10.10.0.24 - This server also serves Riak control. This morning we've seen "insufficient nodes available" error at our applications log and restarted all nodes. 3 of them became available except "f1" UPDATE : while I prepare this message live 3 nodes became unavailable and need restart Riak. wolfiem@f01:~$ sudo /etc/init.d/riak start Riak failed to start within 15 seconds, see the output of 'riak console' for more information. If you want to wait longer, set the environment variable WAIT_FOR_ERLANG to the number of seconds to wait. I've tried to set WAIT_FOR_ERLANG value to 60 seconds but I can't. adding this line in vm.args didn't work: -env WAIT_FOR_ERLANG 60 I also tried to set this from terminal but it didn't work either. wolfiem@f01:~$ export WAIT_FOR_ERLANG=60 It still says "Riak failed to start within 15 seconds" This is the console.log output: 2012-09-11 10:58:02.532 [info] <0.7.0> Application lager started on node '[email protected]' 2012-09-11 10:58:02.560 [warning] <0.148.0>@riak_core_ring_manager:reload_ring:231 No ring file available. 2012-09-11 10:58:02.585 [error] <0.164.0> CRASH REPORT Process <0.164.0> with 0 neighbours exited with reason: eaddrnotavail in gen_server:init_it/6 line 320 This is the error.log output 2012-09-11 10:58:02.585 [error] <0.164.0> CRASH REPORT Process <0.164.0> with 0 neighbours exited with reason: eaddrnotavail in gen_server:init_it/6 line 320 This is the crash.log output: 2012-09-11 10:58:02 =CRASH REPORT==== crasher: initial call: mochiweb_socket_server:init/1 pid: <0.164.0> registered_name: [] exception exit: {eaddrnotavail,[{gen_server,init_it,6,[{file,"gen_server.erl"},{line,320}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,227}]}]} ancestors: [riak_core_sup,<0.135.0>] messages: [] links: [<0.136.0>] dictionary: [] trap_exit: true status: running heap_size: 377 stack_size: 24 reductions: 403 neighbours: You can find the riak console output below: wolfiem@f01:~$ riak console Attempting to restart script through sudo -H -u riak Exec: /usr/lib/riak/erts-5.9.1/bin/erlexec -boot /usr/lib/riak/releases/1.2.0/riak -embedded -config /etc/riak/app.config -pa /usr/lib/riak/basho-patches -args_file /etc/riak/vm.args -- console Root: /usr/lib/riak Erlang R15B01 (erts-5.9.1) [source] [64-bit] [smp:8:8] [async-threads:64] [kernel-poll:true] =INFO REPORT==== 11-Sep-2012::10:44:18 === alarm_handler: {set,{system_memory_high_watermark,[]}} ** /usr/lib/riak/lib/observer-1.1/ebin/etop_txt.beam hides /usr/lib/riak/lib/basho-patches/etop_txt.beam ** Found 1 name clashes in code paths 10:44:19.099 [info] Application lager started on node '[email protected]' 10:44:19.130 [warning] No ring file available. 10:44:19.158 [error] CRASH REPORT Process <0.164.0> with 0 neighbours exited with reason: eaddrnotavail in gen_server:init_it/6 line 320 /usr/lib/riak/lib/os_mon-2.2.9/priv/bin/memsup: Erlang has closed. =INFO REPORT==== 11-Sep-2012::10:44:19 === alarm_handler: {clear,system_memory_high_watermark} Erlang has closed {"Kernel pid terminated",application_controller,"{application_start_failure,riak_core,{shutdown,{riak_core_app,start,[normal,[]]}}}"} Crash dump was written to: /var/log/riak/erl_crash.dump Kernel pid terminated (application_controller) ({application_start_failure,riak_core,{shutdown,{riak_core_app,start,[normal,[]]}}})

Read the article
cherrypy fails to stop when puppet tries to ensure running and refresh it at the same time

- by ento

I am trying to manage a cherrypy service with puppet. However, when the configuration is applied, cherryd ends up with no PID file although the process is up and running. Since the PID file is lost I can no longer stop the process with /etc/init.d/mycherryd stop (unless I modify the handmade init script to lookup the PID with ps or something.) $ /etc/init.d/mycherryd status cherryd dead but subsys locked The problem seems to be that puppet is trying to refresh/restart cherryd (triggered by changes in configuration files) immediately after ensuring it's running (as specified in the manifest), and cherrypy fails to stop and start (restart) itself while still executing its startup process. Is there a clear cut solution to this problem? Is this a bug on the cherrypy side, or can I write a puppet manifest so refresh is called only after the service is up and running? Any suggestion welcome. cherrypy log See how cherrypy catches SIGTERM midway through startup and still starts to listen. :cherrypy.error[18666] 2010-02-12 13:10:23,551 INFO: ENGINE Listening for SIGHUP. :cherrypy.error[18666] 2010-02-12 13:10:23,552 INFO: ENGINE Listening for SIGTERM. :cherrypy.error[18666] 2010-02-12 13:10:23,552 INFO: ENGINE Listening for SIGUSR1. :cherrypy.error[18666] 2010-02-12 13:10:23,552 INFO: ENGINE Bus STARTING :cherrypy.error[18671] 2010-02-12 13:10:23,554 INFO: ENGINE Daemonized to PID: 18671 :cherrypy.error[18671] 2010-02-12 13:10:23,554 INFO: ENGINE PID 18671 written to '/var/mycherryd/cherry.pid'. :cherrypy.error[18671] 2010-02-12 13:10:23,555 INFO: ENGINE Started monitor thread '_TimeoutMonitor'. :cherrypy.error[18670] 2010-02-12 13:10:23,556 INFO: ENGINE Forking twice. :cherrypy.error[18666] 2010-02-12 13:10:23,557 INFO: ENGINE Forking once. :cherrypy.error[18671] 2010-02-12 13:10:23,716 INFO: ENGINE Caught signal SIGTERM. :cherrypy.error[18671] 2010-02-12 13:10:23,716 INFO: ENGINE Bus STOPPING :cherrypy.error[18671] 2010-02-12 13:10:23,716 INFO: ENGINE HTTP Server cherrypy._cpwsgi_server.CPWSGIServer(('0.0.0.0', 12380)) already shut down :cherrypy.error[18671] 2010-02-12 13:10:23,717 INFO: ENGINE Stopped thread '_TimeoutMonitor'. :cherrypy.error[18671] 2010-02-12 13:10:23,717 INFO: ENGINE Bus STOPPED :cherrypy.error[18671] 2010-02-12 13:10:23,732 INFO: ENGINE Bus EXITING :cherrypy.error[18671] 2010-02-12 13:10:23,759 INFO: ENGINE PID file removed: '/var/mycherryd/cherry.pid'. :cherrypy.error[18671] 2010-02-12 13:10:23,782 INFO: ENGINE Bus EXITED :cherrypy.error[18671] 2010-02-12 13:10:23,792 INFO: ENGINE Serving on 0.0.0.0:12380 :cherrypy.error[18671] 2010-02-12 13:10:23,826 INFO: ENGINE Bus STARTED puppet log puppet tries to refresh the service immediately after ensuring it to be 'running'. Feb 12 13:10:22 localhost puppetd[8159]: (//mycherrypy/File[conffiles]) Scheduling refresh of Service[cherryd] Feb 12 13:10:22 localhost last message repeated 12 times Feb 12 13:10:23 localhost puppetd[8159]: (//mycherrypy/Service[mycherryd]/ensure) ensure changed 'stopped' to 'running' Feb 12 13:10:23 localhost puppetd[8159]: (//mycherrypy/Service[mycherryd]) Triggering 'refresh' from 13 dependencies Feb 12 13:11:23 localhost puppetd[8159]: (//mycherrypy/Service[mycherryd]) Failed to call refresh on Service[mycherryd]: Could not stop Service[mycherryd]: Execution of '/sbin/service mycherryd stop' returned 1: at /etc/puppet/manifests/mycherrypy.pp:161 Feb 12 13:11:24 localhost puppetd[8159]: Value of 'preferred_serialization_format' (pson) is invalid for report, using default (marshal) Feb 12 13:11:24 localhost puppetd[8159]: Finished catalog run in 99.25 seconds puppet manifest excerpt class mycherrypy { file { 'rpm': path => "/tmp/${apiserver}.i386.rpm", source => "${fileserver}/${apiserver}.i386.rpm"; 'conffiles': require => Package["${apiserver}"], path => "${service_home}/config/", ensure => present, source => "${fileserver}/config/", notify => Service["mycherryd"]; } package { "$apiserver": provider => 'rpm', source => "/tmp/${apiserver}.i386.rpm", ensure => latest; } service { "mycherryd": require => [File["conffiles"], Package["${apiserver}"]], ensure => running, provider => redhat, hasstatus => true; } }

Read the article
cherrypy fails to stop when puppet tries to ensure running and refresh it at the same time

- by ento

I am trying to manage a cherrypy service with puppet. However, when the configuration is applied, cherryd ends up with no PID file although the process is up and running. Since the PID file is lost I can no longer stop the process with /etc/init.d/mycherryd stop (unless I modify the handmade init script to lookup the PID with ps or something.) $ /etc/init.d/mycherryd status cherryd dead but subsys locked The problem seems to be that puppet is trying to refresh/restart cherryd (triggered by changes in configuration files) immediately after ensuring it's running (as specified in the manifest), and cherrypy fails to stop and start (restart) itself while still executing its startup process. Is there a clear cut solution to this problem? Is this a bug on the cherrypy side, or can I write a puppet manifest so refresh is called only after the service is up and running? Any suggestion welcome. cherrypy log See how cherrypy catches SIGTERM midway through startup and still starts to listen. :cherrypy.error[18666] 2010-02-12 13:10:23,551 INFO: ENGINE Listening for SIGHUP. :cherrypy.error[18666] 2010-02-12 13:10:23,552 INFO: ENGINE Listening for SIGTERM. :cherrypy.error[18666] 2010-02-12 13:10:23,552 INFO: ENGINE Listening for SIGUSR1. :cherrypy.error[18666] 2010-02-12 13:10:23,552 INFO: ENGINE Bus STARTING :cherrypy.error[18671] 2010-02-12 13:10:23,554 INFO: ENGINE Daemonized to PID: 18671 :cherrypy.error[18671] 2010-02-12 13:10:23,554 INFO: ENGINE PID 18671 written to '/var/mycherryd/cherry.pid'. :cherrypy.error[18671] 2010-02-12 13:10:23,555 INFO: ENGINE Started monitor thread '_TimeoutMonitor'. :cherrypy.error[18670] 2010-02-12 13:10:23,556 INFO: ENGINE Forking twice. :cherrypy.error[18666] 2010-02-12 13:10:23,557 INFO: ENGINE Forking once. :cherrypy.error[18671] 2010-02-12 13:10:23,716 INFO: ENGINE Caught signal SIGTERM. :cherrypy.error[18671] 2010-02-12 13:10:23,716 INFO: ENGINE Bus STOPPING :cherrypy.error[18671] 2010-02-12 13:10:23,716 INFO: ENGINE HTTP Server cherrypy._cpwsgi_server.CPWSGIServer(('0.0.0.0', 12380)) already shut down :cherrypy.error[18671] 2010-02-12 13:10:23,717 INFO: ENGINE Stopped thread '_TimeoutMonitor'. :cherrypy.error[18671] 2010-02-12 13:10:23,717 INFO: ENGINE Bus STOPPED :cherrypy.error[18671] 2010-02-12 13:10:23,732 INFO: ENGINE Bus EXITING :cherrypy.error[18671] 2010-02-12 13:10:23,759 INFO: ENGINE PID file removed: '/var/mycherryd/cherry.pid'. :cherrypy.error[18671] 2010-02-12 13:10:23,782 INFO: ENGINE Bus EXITED :cherrypy.error[18671] 2010-02-12 13:10:23,792 INFO: ENGINE Serving on 0.0.0.0:12380 :cherrypy.error[18671] 2010-02-12 13:10:23,826 INFO: ENGINE Bus STARTED puppet log puppet tries to refresh the service immediately after ensuring it to be 'running'. Feb 12 13:10:22 localhost puppetd[8159]: (//mycherrypy/File[conffiles]) Scheduling refresh of Service[cherryd] Feb 12 13:10:22 localhost last message repeated 12 times Feb 12 13:10:23 localhost puppetd[8159]: (//mycherrypy/Service[mycherryd]/ensure) ensure changed 'stopped' to 'running' Feb 12 13:10:23 localhost puppetd[8159]: (//mycherrypy/Service[mycherryd]) Triggering 'refresh' from 13 dependencies Feb 12 13:11:23 localhost puppetd[8159]: (//mycherrypy/Service[mycherryd]) Failed to call refresh on Service[mycherryd]: Could not stop Service[mycherryd]: Execution of '/sbin/service mycherryd stop' returned 1: at /etc/puppet/manifests/mycherrypy.pp:161 Feb 12 13:11:24 localhost puppetd[8159]: Value of 'preferred_serialization_format' (pson) is invalid for report, using default (marshal) Feb 12 13:11:24 localhost puppetd[8159]: Finished catalog run in 99.25 seconds puppet manifest excerpt class mycherrypy { file { 'rpm': path => "/tmp/${apiserver}.i386.rpm", source => "${fileserver}/${apiserver}.i386.rpm"; 'conffiles': require => Package["${apiserver}"], path => "${service_home}/config/", ensure => present, source => "${fileserver}/config/", notify => Service["mycherryd"]; } package { "$apiserver": provider => 'rpm', source => "/tmp/${apiserver}.i386.rpm", ensure => latest; } service { "mycherryd": require => [File["conffiles"], Package["${apiserver}"]], ensure => running, provider => redhat, hasstatus => true; } }

Read the article
SQLExpress service unable to start Error code 17053

- by Chris Sobolewski

A user was instructed by their software support to upgrade a program and install SQLExpress as part of the installation process. Since that time, the service has been able to start, citing error 17053, which appears to be an authentication issue. Here is the error log: 2011-01-11 13:17:45.50 Server Microsoft SQL Server 2005 - 9.00.3042.00 (Intel X86) Feb 9 2007 22:47:07 Copyright (c) 1988-2005 Microsoft Corporation Express Edition on Windows NT 5.1 (Build 2600: Service Pack 2) 2011-01-11 13:17:45.50 Server (c) 2005 Microsoft Corporation. 2011-01-11 13:17:45.50 Server All rights reserved. 2011-01-11 13:17:45.50 Server Server process ID is 3332. 2011-01-11 13:17:45.50 Server Authentication mode is WINDOWS-ONLY. 2011-01-11 13:17:45.50 Server Logging SQL Server messages in file 'c:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\LOG\ERRORLOG'. 2011-01-11 13:17:45.52 Server This instance of SQL Server last reported using a process ID of 2332 at 11/10/2010 2:15:24 PM (local) 11/10/2010 7:15:24 PM (UTC). This is an informational message only; no user action is required. 2011-01-11 13:17:45.52 Server Error: 17053, Severity: 16, State: 1. 2011-01-11 13:17:45.52 Server UpdateUptimeRegKey: Operating system error 5(Access is denied.) encountered. 2011-01-11 13:17:45.52 Server Registry startup parameters: 2011-01-11 13:17:45.52 Server -d c:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\DATA\master.mdf 2011-01-11 13:17:45.52 Server -e c:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\LOG\ERRORLOG 2011-01-11 13:17:45.52 Server -l c:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\DATA\mastlog.ldf 2011-01-11 13:17:45.52 Server Error: 17113, Severity: 16, State: 1. 2011-01-11 13:17:45.52 Server Error 3(The system cannot find the path specified.) occurred while opening file 'c:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\DATA\master.mdf' to obtain configuration information at startup. An invalid startup option might have caused the error. Verify your startup options, and correct or remove them if necessary. 2011-01-11 13:17:45.52 Server Error: 17053, Severity: 16, State: 1. 2011-01-11 13:17:45.52 Server UpdateUptimeRegKey: Operating system error 5(Access is denied.) encountered. 4 Server Error: 17053, Severity: 16, State: 1. 2011-01-11 13:08:21.34 Server UpdateUptimeRegKey: Operating system error 5(Access is denied.) encountered. 12:47:20.85 spid5s SQL Trace ID 1 was started by login "sa". 2011-01-11 12:47:20.90 spid5s Starting up database 'mssqlsystemresource'. 2011-01-11 12:47:20.93 spid5s The resource database build version is 9.00.3042. This is an informational message only. No user action is required. 2011-01-11 12:47:21.21 spid5s Error: 15466, Severity: 16, State: 1. 2011-01-11 12:47:21.21 spid5s An error occurred during decryption. 2011-01-11 12:47:21.38 spid8s Starting up database 'model'. 2011-01-11 12:47:21.38 Server Error: 17182, Severity: 16, State: 1. 2011-01-11 12:47:21.38 Server TDSSNIClient initialization failed with error 0x5, status code 0x90. 2011-01-11 12:47:21.38 Server Error: 17182, Severity: 16, State: 1. 2011-01-11 12:47:21.38 Server TDSSNIClient initialization failed with error 0x5, status code 0x1. 2011-01-11 12:47:21.38 Server Error: 17826, Severity: 18, State: 3. 2011-01-11 12:47:21.38 Server Could not start the network library because of an internal error in the network library. To determine the cause, review the errors immediately preceding this one in the error log. 2011-01-11 12:47:21.38 Server Error: 17120, Severity: 16, State: 1. 2011-01-11 12:47:21.38 Server SQL Server could not spawn FRunCM thread. Check the SQL Server error log and the Windows event logs for information about possible related problems. One lead I had was to change the SQL logon account from "Network Service" to "Local System". Unfortunately, that is resulting in the error message The Security ID Structure is Invalid [0x80070539] Any help either uninstalling or getting SQLExpress running would be fantastic.

Read the article
apache webserver unresponsible with server-status showing all child processes waiting for connection

- by Jeff

My setup: i have 3 nearly identical webserver machines serving the same high loaded dynamic website with simple load balancing over dns. The service has been working for over two ears with the same apache config. apache2, php5, ubuntu 8.04 linux 2.6.24-29-server My problem: since about two weeks i'm experiencing problems with this config. Nearly every day i have one small moment about 5 minutes, in which the website is unreachable. I'm still able to login to the servers over ssh. If i run htop, i see the machine simply doing nothing. i have about 1000 apache processes running, but no cpu activity. i've used the apache mod_status to debug this situation. the process scoreboard looks like this: _C.___K_______________________R._______.__K_K____K___C_______.__ _______C__________.___________________________________.________C _.____K__________K___K_WK_____._K_____________________________._ W______K__________K________.____________________._______C_______ _C_.__K__K____.._.._____________________________________C_______ _R___________K___.______C________.C_________.______._____C______ ____________KKC____K_____K__WC_________________C_____.__.____.__ _____________________C_________K______.____C______._____________ _.___C____.___.___________________________.K______.____K________ W__.___________________C.__.____K________K_______R_._.__._______ __C__C_.__________C__C_______._____W______________C_.___C_______ ____.______C_____________C________.____C____________.________._K __.__________.K_____________K_________._____C____.K__________KW_ __K.W________R_________._______.___W___________.____.__K_____W__ W___.___..________W____K Scoreboard Key: "_" Waiting for Connection, "S" Starting up, "R" Reading Request, "W" Sending Reply, "K" Keepalive (read), "D" DNS Lookup, "C" Closing connection, "L" Logging, "G" Gracefully finishing, "I" Idle cleanup of worker, "." Open slot with no current process So the most of the processes are just waiting for connection. after about 5 minutes the situation will return to normal: i have lot least processes on every machine, the most workers have the "."-status (meaing they are open to process a request) and of course the website is reachable! so i'm trying to find something in the logs, but there is simply nothing... the apache access log is silent for about 4 minutes, the same is for the error log. i also can not figure out anything wrong in other system logs. the situation is the same on all 3 webservers (all of them have this load peak and unresposibility at the same time), so i do not thing this is hardware related. but i think, this might be related to some network (tcp) issue. any ideas? EDIT: some more information, that i have just discovered: it has just happened again. and i was able to verify that i'm also not able to connect locally when this problem occurs. i have made some connection statistics with the following command after it happend netstat -an|awk '/tcp/ {print $6}'|sort|uniq -c 109 CLOSE_WAIT 2652 ESTABLISHED 2 FIN_WAIT1 11 LAST_ACK 12 LISTEN 91 SYN_RECV 1 SYN_SENT 16 TIME_WAIT If i execute the same command some time later, i have something like this: 4 CLOSING 108 ESTABLISHED 18 FIN_WAIT1 182 FIN_WAIT2 37 LAST_ACK 12 LISTEN 50 SYN_RECV 11276 TIME_WAIT So in the normal situation i have only 100-200 open connections by clients beeing handled by apache in this moment. when i have this "crash", i have a lot more connections. what is the best way to analyse this? EDIT2: the important lines in apache2.conf are: KeepAlive On MaxKeepAliveRequests 20 KeepAliveTimeout 1 <IfModule mpm_prefork_module> ServerLimit 920 StartServers 30 MinSpareServers 80 MaxSpareServers 120 MaxClients 920 MaxRequestsPerChild 700 </IfModule> it is an apache2 prefork with php_mod. the server has 8GB ram and a 4gb swap partition.

Read the article
Why would VMWare to go defunct? How to recover from/prevent it?

- by Josh

I am running VMWare Server 2.0.2 (Build 203138) on a dual core Intel i5 with Ubuntu Server 10.04 LTS system (kernel 2.6.32-22-server #33-Ubuntu SMP). Disk Subsystem is a software RAID5 array. The system has been set up for a little over a week. For the past 5 days I have been running at leat 3 VMs (Linux and a variety of Windows OSes) with no issues whatsoever. But while I was installing Linux onto a new VM, suddenly all VMs became unresponsive, including the one I was installing to. I could not log in to the VMWare Management Interface, and the system was somewhat unresponsive via SSH. When I looked at top, I saw: top - 16:14:51 up 6 days, 1:49, 8 users, load average: 24.29, 24.33 17.54 Tasks: 203 total, 7 running, 195 sleeping, 0 stopped, 1 zombie Cpu(s): 0.2%us, 25.6%sy, 0.0%ni, 74.3%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Mem: 8056656k total, 5927580k used, 2129076k free, 20320k buffers Swap: 7811064k total, 240216k used, 7570848k free, 5045884k cached PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 21549 root 39 19 0 0 0 Z 100 0.0 15:02.44 [vmware-vmx] <defunct> 2115 root 20 0 0 0 0 S 1 0.0 170:32.08 [vmware-rtc] 2231 root 21 1 1494m 126m 100m S 1 1.6 892:58.05 /usr/lib/vmware/bin/vmware-vmx -# product=2; 2280 jnet 20 0 19320 1164 800 R 0 0.0 30:04.55 top 12236 root 20 0 833m 41m 34m S 0 0.5 88:34.24 /usr/lib/vmware/bin/vmware-vmx -# product=2; 1 root 20 0 23704 1476 920 S 0 0.0 0:00.80 /sbin/init 2 root 20 0 0 0 0 S 0 0.0 0:00.01 [kthreadd] 3 root RT 0 0 0 0 S 0 0.0 0:00.00 [migration/0] 4 root 20 0 0 0 0 S 0 0.0 0:00.84 [ksoftirqd/0] 5 root RT 0 0 0 0 S 0 0.0 0:00.00 [watchdog/0] 6 root RT 0 0 0 0 S 0 0.0 0:00.00 [migration/1] The VMWare process for the virtual machine I was installing into became a zombie. Yet, it was still consuming 100% of the CPU time on one of the cores, and I couldn't reach it or any other virtual machines. (I was logged in to one virtual machine over SSH, another via X11, and a third via VNC. All three connections died). When I ran ps -ef and similar commands, I found that the defunct vmware-vmx process had it's parent PID set to init (1). I also used lsof -p 21549 and found that the defunct process had no open files. Yet it was using 100% of CPU time... I was unable to kill any vmware-vmx processes, including the defunct one, even with kill -9. As a last resort to resolve the situation I tried to reboot the box, however shutdown, halt, reboot, and init 6 all failed to reboot/shutdown, even when given appropriate --force settings. ControlAltDel produced a message about rebooting on the console, but the system would not reboot. I had to hard power-cycle the box to resolve the situation. (See my other question, Should I worry about the integrity of my linux software RAID5 after a crash or kernel panic?) What would cause a scenario like this? What else could I have done to resolve it besides a hard reboot? What can I do to prevent such a situation in the future?

Read the article
What *exactly* gets screwed when I kill -9 or pull the power?

- by Mike

Set-Up I've been a programmer for quite some time now but I'm still a bit fuzzy on deep, internal stuff. Now. I am well aware that it's not a good idea to either: kill -9 a process (bad) spontaneously pull the power plug on a running computer or server (worse) However, sometimes you just plain have to. Sometimes a process just won't respond no matter what you do, and sometimes a computer just won't respond, no matter what you do. Let's assume a system running Apache 2, MySQL 5, PHP 5, and Python 2.6.5 through mod_wsgi. Note: I'm most interested about Mac OS X here, but an answer that pertains to any UNIX system would help me out. My Concern Each time I have to do either one of these, especially the second, I'm very worried for a period of time that something has been broken. Some file somewhere could be corrupt -- who knows which file? There are over 1,000,000 files on the computer. I'm often using OS X, so I'll run a "Verify Disk" operation through the Disk Utility. It will report no problems, but I'm still concerned about this. What if some configuration file somewhere got screwed up. Or even worse, what if a binary file somewhere is corrupt. Or a script file somewhere is corrupt now. What if some hardware is damaged? What if I don't find out about it until next month, in a critical scenario, when the corruption or damage causes a catastrophe? Or, what if valuable data is already lost? My Hope My hope is that these concerns and worries are unfounded. After all, after doing this many times before, nothing truly bad has happened yet. The worst is I've had to repair some MySQL tables, but I don't seem to have lost any data. But, if my worries are not unfounded, and real damage could happen in either situation 1 or 2, then my hope is that there is a way to detect it and prevent against it. My Question(s) Could this be because modern operating systems are designed to ensure that nothing is lost in these scenarios? Could this be because modern software is designed to ensure that nothing lost? What about modern hardware design? What measures are in place when you pull the power plug? My question is, for both of these scenarios, what exactly can go wrong, and what steps should be taken to fix it? I'm under the impression that one thing that can go wrong is some programs might not have flushed their data to the disk, so any highly recent data that was supposed to be written to the disk (say, a few seconds before the power pull) might be lost. But what about beyond that? And can this very issue of 5-second data loss screw up a system? What about corruption of random files hiding somewhere in the huge forest of files on my hard drives? What about hardware damage? What Would Help Me Most Detailed descriptions about what goes on internally when you either kill -9 a process or pull the power on the whole system. (it seems instant, but can someone slow it down for me?) Explanations of all things that could go wrong in these scenarios, along with (rough of course) probabilities (i.e., this is very unlikely, but this is likely)... Descriptions of measures in place in modern hardware, operating systems, and software, to prevent damage or corruption when these scenarios occur. (to comfort me) Instructions for what to do after a kill -9 or a power pull, beyond "verifying the disk", in order to truly make sure nothing is corrupt or damaged somewhere on the drive. Measures that can be taken to fortify a computer setup so that if something has to be killed or the power has to be pulled, any potential damage is mitigated. Thanks so much!

Read the article
"Service Unavailable" when browsing to static HTML page in non-application IIS website on Windows 2003 (possibly SharePoint WSS 2.0 related?)

- by Jordan Rieger

Background: My client has an old Pentium III Windows 2003 server whose 16/36 GB disks are dying. On it he has a database-driven web site and email application that needs further customization by a developer (me). First we need to get it working on the new server. The original developer is no longer available to provide a system setup guide. So my client got a tech who imaged the old drives over to the new server and managed to get it booting. But the IIS-driven site no longer works. In fact it seems that IIS itself does not work. Problem: Service Unavailable when attempting to browse from the server itself to the URL for a local Web Site called test which I setup in IIS to serve a single static index.htm file. This I did to isolate the problem, and eliminate the client's application from the equation. The site is setup on port 80 with the host header "test.myclientsdomain.com", and I used the etc\hosts file to point that host at the local IP. I know the host entry took effect because I can ping it. When doing an iisreset, I get: Attempting start... Restart attempt failed. IIS Admin Service or a service dependent on IIS Admin is not active. It most likely failed to start, which may mean that it's disabled. Despite this message, the services all stay in the Started state. The only relevant System event logs I found are: Event Type: Error Event Source: W3SVC Event Category: None Event ID: 1002 Date: 11/4/2012 Time: 11:04:47 PM User: N/A Computer: ALPHA1 Description: Application pool 'DefaultAppPool' is being automatically disabled due to a series of failures in the process(es) serving that application pool. Event Type: Error Event Source: W3SVC Event Category: None Event ID: 1039 Date: 11/4/2012 Time: 11:13:12 PM User: N/A Computer: ALPHA1 Description: A process serving application pool 'DefaultAppPool' reported a failure. The process id was '5636'. The data field contains the error number. Data: 0000: 7e 00 07 80 ~.. And one Application event log: Event Type: Error Event Source: Windows SharePoint Services 2.0 Event Category: None Event ID: 1000 Date: 11/4/2012 Time: 11:34:04 PM User: N/A Computer: ALPHA1 Description: #50070: Unable to connect to the database STS_Config on ALPHA2\SharePoint. Check the database connection information and make sure that the database server is running. That last log tells me that the tech may have initially tried to have both the old and the new server running, by renaming the new server from ALPHA1 to ALPHA2. And perhaps SharePoint grabbed onto that change, and now can't tell that the machine name has been switched back to the old ALPHA1. But why would SharePoint interfere with a static IIS web site serving a single HTML file? The test site is not even within an Application pool (I clicked the Remove button.) What I have tried/eliminated: No relevant services seem to be disabled: IIS Admin, WWW Publishing, Sharepoint Timer Giving Full Control to All Users/Everyone on the c:\inetpub\test folder serving my test site. I can connect to and query the local SharePoint config database (ALPHA1\SHAREPOINT\STS_CONFIG) from SSMS. But when I try to do stsadm -o setconfigdb -connect -databaseserver ALPHA1\SHAREPOINT it tells me The SharePoint admininstration port does not exist. Please use stsadm.exe to create it. And when I do that, using the port 9487 specified in the IIS SharePoint Admin site config, it tells me the port is already in use. Needless to say, simply browsing to the admin site gives me a similar error about being unable to reach the config database. I didn't want to go further down the SharePoint path as it may be completed unrelated to my IIS issue, and I don't even know yet if SharePoint is required for this application to work. The app itself is ASP.Net/C#/Silverlight and a little MS Word integration (maybe that's where the SharePoint stuff comes in.)

Read the article
Windows 7 disk errors after a few hours of runtime

- by GFK

I'm having trouble understanding what is going on with my work PC. Whenever I boot it, it runs fine for a while, then starts to randomly show disk errors. The displayed error often contains the message "not enough storage is available to process this command", although depending on the application that fails it can be different. This has happened for weeks now and is getting worse. This is what troubles me: It never seems to impact critical parts of the system (no BSOD, no freeze). Only some applications seem impacted, refusing to function correctly after a while: Outlook 2010 cannot download RSS feeds anymore, Firefox 6 or IE9 cannot download anything bigger than 3MB without failing, Windows Update fails, all msi installers fail, Visual Studio 2010 starts failing in weird manners... It only happens after a while using it (typically 3 hours, but it seems that installing a program or compiling several times makes it shorter) Rebooting solves it (temporarily). The system: The OS is Windows 7 Pro Spanish SP1, 32 bits The system is an HP Compaq 6000 Pro with 4 GB memory (only 3.4GB usable since the system is 32bit), one 500GB hard drive. Installed applications include: Visual Studio 2010, SQL Server 2008 R2, VMWare Workstation 7, Microsoft Security Essentials, Office 2010. Shutting down all related services and processes doesn't seem to change anything. The diagnostics I've run so far: Hard drive : 465GB, 165GB free Process Explorer : physical and virtual memory seem ok (pagefile is 5.3GB, physical memory usage 70%, system commit 39%) Windows Memory diagnostic tool: OK CHKDSK returned: 488282111 KB total disk space. 281668248 KB in 265779 files. 150188 KB in 62949 indexes. 0 KB in bad sectors. 571755 KB in use by the system. The log file has occupied 65536 kilobytes. 205891920 KB available on disk. For non-spanish speakers, that means all ok. SMART diagnostic tools (DiskCheckup) report all values normal. temperatures are in the normal range (HWinfo). The event viewer doesn't seem to contain any significant message. ran CCleaner 3, without any noticeable effect. I was thinking about some file number limit (between Visual Studio projects and other applications, there are around 300.000 files on the hard drive), but I couldn't find any. It's possible there is something related with the use of the temporary folders (it's the only explanation I have for why applications fail but Windows doesn't), but I cannot confirm that. Only thing I cannot find out is if chkdsk reporting 65MB for the log is normal. It seems since Vista it always reports this. Any other cleaning/diagnostic tool you might know of? Edit: I ran several other tools since I first published the question: Seagate SeaTools (the HD manufacturer's analysis tool): complete test run OK. Intel Rapid 10.1 (the HD controller manufacturer's troubleshooting tool): the HD's ok. Microsoft Desktop Heap Monitor: Desktop Heap Information Monitor Tool (Version 8.1.2925.0) Copyright (c) Microsoft Corporation. All rights reserved. Session ID: 1 Total Desktop: ( 46464 KB - 11 desktops) WinStation\Desktop Heap Size(KB) Used Rate(%) WinSta0\Winlogon (s1) 128 3.6 WinSta0\Disconnect (s1) 64 3.8 WinSta0\Default (s1) 20480 3.0 msswindowstation\mssrestricteddesk (s0) 1024 0.2 __X78B95_89_IW__A8D9S1_42_ID (s0) 1024 0.2 Service-0x0-3e5$\Default (s0) 1024 0.6 Service-0x0-3e4$\Default (s0) 1024 0.3 Service-0x0-3e7$\Default (s0) 1024 2.1 WinSta0\Winlogon (s0) 128 1.9 WinSta0\Disconnect (s0) 64 3.8 WinSta0\Default (s0) 20480 0.0 All ok, desktop heap usage < 5% Edit 2: I tried totally resetting my account by creating a new one, logging under this new one and delete the first one (local rights and files), then logging back with this deleted account (it is a domain account). No luck. Also, I found out often the error is "not enough storage is available to process this command". Searching on the internet, I found an old troubleshooting tip (setting a registry key to raise the IRP stack limit, whatever it is) which did not change anything.

Read the article
The best dvd ripper software in 2014 review

- by user328170

The top 3 DVD Ripping Tools in 2014 Nowadays everyone may have several smart mobile devices, such as iphone, ipad air, ipad mini ,Samsung Galaxy and Sony Xperia. If you want to take your movies with your mobile devices, or sometimes just want to backup those classic physical discs on your notebook or workstation with high quality resolutions, you need a fast and stable software to rip them and convert them to the format you like. Fortunately, there are plenty of great software products designed to make the process easy and transform DVD to the files that are playable on any mobile device you choose. We have done a full review on dozens of products. Here are five of the best, based on our review. We test the software from its ripping speed, friendly use guide , reliability and ripping capability. The top one is still Winx DVD Ripper platinum. We've test its 6.1 version 2 years ago for its ability to quickly and easily rip DVDs and Blu-ray discs to high quality MKV files with a single click. It gave us deep impression in the test. This time we test it’s lastest 7.3.5 version. Besides easy use and speed, we test its capability to decrypt all kinds of discs with different protect method, for example, Disney X-project DRM , Sony ArccOS, RCE and region code. The result shows that winx dvd ripper platinum still maintain its advantages in all the area. Winx dvd ripper platinum is a more focused on DVD ripping software with the basic duty to rip and convert DVD. The color of UI is a modern technical sense. All the main functions are shown obviously while others specials are hidden for advanced users, making it more clear and convenient to make option. There are two company weisoft limited and Digiarty who can provide the software. Weisoft limited focus on USA, UK and Australia market. Digiarty focus on others. ripping speed ????? friendly use guide ????? reliability ????? ripping capability ????? The second one DVDFab DVDFab is also very robust during ripping dics. It can also decrypt most of the dics in the market. The shortage it still friendly use and speed. We'd note that the app is frequently updated to cut through the copy protection on even the latest DVDs and Blu-ray discs . The app is shareware, meaning most features are free, including decrypting and ripping to your hard drive. Many of you note that you use another app for compression and authoring, but many of you say they hey, storage is cheap, and the rips from DVDFab are easy, one-click, and work. ripping speed ??? friendly use guide ???? reliability ????? ripping capability ????? The Third one is Handbrake Handbrake is our favorite video encoder for a reason: it's simple, easy to use, easy to install, and offers a lots of options to get the high quality file as a result. If you're scared by them, you don't even have to use them—the app will compensate for you and pick some settings it thinks you'll like based on your destination device. So many of you like Handbrake that many of you use it in conjunction with another app (like VLC, which makes ripping easy)—you'll let another app do the rip and crack the DRM on your discs, and then process the file through Handbrake for encoding. The app is fast, can make the most of multi-core processors to speed up the process, and is completely open source. ripping speed ??? friendly use guide ???? reliability ???? ripping capability ????

Read the article
SQL SERVER – Plan Recompilation and Reduce Recompilation – Performance Tuning

- by pinaldave

Recompilation process is same as compilation and degrades server performance. In SQL Server 2000 and earlier versions, this was a serious issue but in SQL server 2005, the severity of this issue has been significantly reduced by introducing a new feature called Statement-level recompilation. When SQL Server 2005 recompiles stored procedures, only the statement that [...]

Read the article
Converting a Visual Studio 2003 Web Project to a Visual Studio 2008 Web Application Project

- by navaneeth

This walkthrough describes how to convert a Visual Studio .NET 2002 or Visual Studio .NET 2003 Web project to a Visual Studio 2008 Web application project. The Visual Studio 2008 Web application project model is like the Visual Studio 2005 Web application project model. Therefore, the conversion processes are similar. For more information about Web application projects, see ASP.NET Web Application Projects. You can also convert from a Visual Studio .NET Web project to a Visual Studio 2008 Web site project. However, conversion to a Web application project is the approach that is supported, and gives you the convenience of tools to help with the conversion. For example, when you convert to a Visual Studio 2008 Web application project, you can use the Visual Studio Conversion Wizard to automate part of the process. For information about how to convert a Visual Studio .NET Web project to a Visual Studio 2008 Web site, see Common Web Project Conversion Issues and Solutions. There are two parts involved in converting a Visual Studio 2002 or 2003 Web project to a Visual Studio 2008 Web application project. The parts are as follows: Converting the project. You can use the Visual Studio Conversion Wizard for the initial conversion of the project and Web.config files. You can later use the Convert To Web Application command to update the project's files and structure. Upgrading the .NET Framework version of the project. You must upgrade the project's .NET Framework version to either .NET Framework 2.0 SP1 or to .NET Framework 3.5. This .NET Framework version upgrade is required because Visual Studio 2008 cannot target earlier versions of the .NET Framework. You can perform this upgrade during the project conversion, by using the Conversion Wizard. Alternatively, you can upgrade the .NET Framework version after you convert the project. NoteYou can change a project's .NET Framework version manually. To do so, in Visual Studio open the property pages for the project, click the Application tab, and then select a new version from the Target Framework list. This walkthrough illustrates the following tasks: Opening the Visual Studio .NET project in Visual Studio 2008 and creating a backup of the project files. Upgrading the .NET Framework version that the project targets. Converting the project file and the Web.config file. Converting ASP.NET code files. Testing the converted project. Prerequisites To complete this walkthrough, you will need: Visual Studio 2008. A Web site project that was created in Visual Studio .NET version 2002 or 2003 that compiles and runs without errors. Converting the Project and Upgrading the .NET Framework Version To begin, you open the project in Visual Studio 2008, which starts the conversion. It offers you an opportunity to back up the project before converting it. NoteIt is strongly recommended that you back up the project. The conversion works on the original project files, which cannot be recovered if the conversion is not successful.To convert the project and back up the files In Visual Studio 2008, in the File menu, click Open and then click Project. The Open Project dialog box is displayed. Browse to the folder that contains the project or solution file for the Visual Studio .NET project, select the file, and then click Open. NoteMake sure that you open the project by using the Open Project command. If you use the Open Web Site command, the project will be converted to the Web site project format.The Conversion Wizard opens and prompts you to create a backup before converting the project. To create the backup, click Yes. Click Browse, select the folder in which the backup should be created, and then click Next. Click Finish. The backup starts. NoteThere might be significant delays as the Conversion Wizard copies files, with no updates or progress indicated. Wait until the process finishes before you continue.When the conversion finishes, the wizard prompts you to upgrade the targeted version of the .NET Framework for the project. To upgrade to the .NET Framework 3.5, click Yes. To upgrade the project to target the .NET Framework 2.0 SP1, click No. It is recommended that you leave the check box selected that asks whether you want to upgrade all Webs in the solution. If you upgrade to .NET Framework 3.5, the project's Web.config file is modified at the same time as the project file. When the upgrade and conversion have finished, a message is displayed that indicates that you have completed the first step in converting your project. Click OK. The wizard displays status information about the conversion. Click Close. Testing the Converted Project After the conversion has finished, you can test the project to make sure that it runs. This will also help you identify code in the project that must be updated. To verify that the project runs If you know about changes that are required for the code to run with the new version of the .NET Framework, make those changes. In the Build menu, click Build. Any missing references or other compilation issues in the project are displayed in the Error List window. The most likely issues are missing assembly references or issues with dynamically generated types. In Solution Explorer, right-click the Web page that will be used to launch the application, and then click Set as Start Page. On the Debug menu, click Start Debugging. If debugging is not enabled, the Debugging Not Enabled dialog box is displayed. Select the option to add a Web.config file that has debugging enabled, and then click OK. Verify that the converted project runs as expected. Do not continue with the conversion process until all build and run-time errors are resolved. Converting ASP.NET Code Files ASP.NET Web page files and user-control files in Visual Studio 2008 that use the code-behind model have an associated designer file. The files that you just converted will have an associated code-behind file, but no designer file. Therefore, the next step is to generate designer files. NoteOnly ASP.NET Web pages and user controls that have their code in a separate code file require a separate designer file. For pages that have inline code and no associated code file, no designer file will be generated.To convert ASP.NET code files In Solution Explorer, right-click the project node, and then click Convert To Web Application. The files are converted. Verify that the converted code files have a code file and a designer file. Build and run the project to verify the results of the conversion.

Read the article
why is $0 set to -bash?

- by James Shimer

First login process name seems to be set to "-bash", but if I subshell then it becomes "bash". for example: root@nowere:~# echo $0 -bash root@nowere:~# bash root@nowere:~# echo $0 bash -bash is causing some scripts to fail, such as . /usr/share/debconf/confmodule exec /usr/share/debconf/frontend -bash Can't exec "-bash": No such file or directory at /usr/share/perl/5.14/IPC/Open3.pm line 186. open2: exec of -bash failed at /usr/share/perl5/Debconf/ConfModule.pm line 59 Anyone know the reason $0 is set to -bash?

Read the article
EFMVC Migrated to .NET 4.5, Visual Studio 2012, ASP.NET MVC 4 and EF 5 Code First

- by shiju

I have just migrated my EFMVC app from .NET 4.0 and ASP.NET MVC 4 RC to .NET 4.5, ASP.NET MVC 4 RTM and Entity Framework 5 Code First. In this release, the EFMVC solution is built with Visual Studio 2012 RTM. The migration process was very smooth and did not made any major changes other than adding simple unit tests with NUnit and Moq. I will add more unit tests on later and will also modify the existing solution. Source Code You can download the source code from http://efmvc.codeplex.com/

Read the article
OpenVPN not connecting

- by LandArch

There have been a number of post similar to this, but none seem to satisfy my need. Plus I am a Ubuntu newbie. I followed this tutorial to completely set up OpenVPN on Ubuntu 12.04 server. Here is my server.conf file ################################################# # Sample OpenVPN 2.0 config file for # # multi-client server. # # # # This file is for the server side # # of a many-clients <-> one-server # # OpenVPN configuration. # # # # OpenVPN also supports # # single-machine <-> single-machine # # configurations (See the Examples page # # on the web site for more info). # # # # This config should work on Windows # # or Linux/BSD systems. Remember on # # Windows to quote pathnames and use # # double backslashes, e.g.: # # "C:\\Program Files\\OpenVPN\\config\\foo.key" # # # # Comments are preceded with '#' or ';' # ################################################# # Which local IP address should OpenVPN # listen on? (optional) local 192.168.13.8 # Which TCP/UDP port should OpenVPN listen on? # If you want to run multiple OpenVPN instances # on the same machine, use a different port # number for each one. You will need to # open up this port on your firewall. port 1194 # TCP or UDP server? proto tcp ;proto udp # "dev tun" will create a routed IP tunnel, # "dev tap" will create an ethernet tunnel. # Use "dev tap0" if you are ethernet bridging # and have precreated a tap0 virtual interface # and bridged it with your ethernet interface. # If you want to control access policies # over the VPN, you must create firewall # rules for the the TUN/TAP interface. # On non-Windows systems, you can give # an explicit unit number, such as tun0. # On Windows, use "dev-node" for this. # On most systems, the VPN will not function # unless you partially or fully disable # the firewall for the TUN/TAP interface. dev tap0 up "/etc/openvpn/up.sh br0" down "/etc/openvpn/down.sh br0" ;dev tun # Windows needs the TAP-Win32 adapter name # from the Network Connections panel if you # have more than one. On XP SP2 or higher, # you may need to selectively disable the # Windows firewall for the TAP adapter. # Non-Windows systems usually don't need this. ;dev-node MyTap # SSL/TLS root certificate (ca), certificate # (cert), and private key (key). Each client # and the server must have their own cert and # key file. The server and all clients will # use the same ca file. # # See the "easy-rsa" directory for a series # of scripts for generating RSA certificates # and private keys. Remember to use # a unique Common Name for the server # and each of the client certificates. # # Any X509 key management system can be used. # OpenVPN can also use a PKCS #12 formatted key file # (see "pkcs12" directive in man page). ca "/etc/openvpn/ca.crt" cert "/etc/openvpn/server.crt" key "/etc/openvpn/server.key" # This file should be kept secret # Diffie hellman parameters. # Generate your own with: # openssl dhparam -out dh1024.pem 1024 # Substitute 2048 for 1024 if you are using # 2048 bit keys. dh dh1024.pem # Configure server mode and supply a VPN subnet # for OpenVPN to draw client addresses from. # The server will take 10.8.0.1 for itself, # the rest will be made available to clients. # Each client will be able to reach the server # on 10.8.0.1. Comment this line out if you are # ethernet bridging. See the man page for more info. ;server 10.8.0.0 255.255.255.0 # Maintain a record of client <-> virtual IP address # associations in this file. If OpenVPN goes down or # is restarted, reconnecting clients can be assigned # the same virtual IP address from the pool that was # previously assigned. ifconfig-pool-persist ipp.txt # Configure server mode for ethernet bridging. # You must first use your OS's bridging capability # to bridge the TAP interface with the ethernet # NIC interface. Then you must manually set the # IP/netmask on the bridge interface, here we # assume 10.8.0.4/255.255.255.0. Finally we # must set aside an IP range in this subnet # (start=10.8.0.50 end=10.8.0.100) to allocate # to connecting clients. Leave this line commented # out unless you are ethernet bridging. server-bridge 192.168.13.101 255.255.255.0 192.168.13.105 192.168.13.200 # Configure server mode for ethernet bridging # using a DHCP-proxy, where clients talk # to the OpenVPN server-side DHCP server # to receive their IP address allocation # and DNS server addresses. You must first use # your OS's bridging capability to bridge the TAP # interface with the ethernet NIC interface. # Note: this mode only works on clients (such as # Windows), where the client-side TAP adapter is # bound to a DHCP client. ;server-bridge # Push routes to the client to allow it # to reach other private subnets behind # the server. Remember that these # private subnets will also need # to know to route the OpenVPN client # address pool (10.8.0.0/255.255.255.0) # back to the OpenVPN server. push "route 192.168.13.1 255.255.255.0" push "dhcp-option DNS 192.168.13.201" push "dhcp-option DOMAIN blahblah.dyndns-wiki.com" ;push "route 192.168.20.0 255.255.255.0" # To assign specific IP addresses to specific # clients or if a connecting client has a private # subnet behind it that should also have VPN access, # use the subdirectory "ccd" for client-specific # configuration files (see man page for more info). # EXAMPLE: Suppose the client # having the certificate common name "Thelonious" # also has a small subnet behind his connecting # machine, such as 192.168.40.128/255.255.255.248. # First, uncomment out these lines: ;client-config-dir ccd ;route 192.168.40.128 255.255.255.248 # Then create a file ccd/Thelonious with this line: # iroute 192.168.40.128 255.255.255.248 # This will allow Thelonious' private subnet to # access the VPN. This example will only work # if you are routing, not bridging, i.e. you are # using "dev tun" and "server" directives. # EXAMPLE: Suppose you want to give # Thelonious a fixed VPN IP address of 10.9.0.1. # First uncomment out these lines: ;client-config-dir ccd ;route 10.9.0.0 255.255.255.252 # Then add this line to ccd/Thelonious: # ifconfig-push 10.9.0.1 10.9.0.2 # Suppose that you want to enable different # firewall access policies for different groups # of clients. There are two methods: # (1) Run multiple OpenVPN daemons, one for each # group, and firewall the TUN/TAP interface # for each group/daemon appropriately. # (2) (Advanced) Create a script to dynamically # modify the firewall in response to access # from different clients. See man # page for more info on learn-address script. ;learn-address ./script # If enabled, this directive will configure # all clients to redirect their default # network gateway through the VPN, causing # all IP traffic such as web browsing and # and DNS lookups to go through the VPN # (The OpenVPN server machine may need to NAT # or bridge the TUN/TAP interface to the internet # in order for this to work properly). ;push "redirect-gateway def1 bypass-dhcp" # Certain Windows-specific network settings # can be pushed to clients, such as DNS # or WINS server addresses. CAVEAT: # http://openvpn.net/faq.html#dhcpcaveats # The addresses below refer to the public # DNS servers provided by opendns.com. ;push "dhcp-option DNS 208.67.222.222" ;push "dhcp-option DNS 208.67.220.220" # Uncomment this directive to allow different # clients to be able to "see" each other. # By default, clients will only see the server. # To force clients to only see the server, you # will also need to appropriately firewall the # server's TUN/TAP interface. ;client-to-client # Uncomment this directive if multiple clients # might connect with the same certificate/key # files or common names. This is recommended # only for testing purposes. For production use, # each client should have its own certificate/key # pair. # # IF YOU HAVE NOT GENERATED INDIVIDUAL # CERTIFICATE/KEY PAIRS FOR EACH CLIENT, # EACH HAVING ITS OWN UNIQUE "COMMON NAME", # UNCOMMENT THIS LINE OUT. ;duplicate-cn # The keepalive directive causes ping-like # messages to be sent back and forth over # the link so that each side knows when # the other side has gone down. # Ping every 10 seconds, assume that remote # peer is down if no ping received during # a 120 second time period. keepalive 10 120 # For extra security beyond that provided # by SSL/TLS, create an "HMAC firewall" # to help block DoS attacks and UDP port flooding. # # Generate with: # openvpn --genkey --secret ta.key # # The server and each client must have # a copy of this key. # The second parameter should be '0' # on the server and '1' on the clients. ;tls-auth ta.key 0 # This file is secret # Select a cryptographic cipher. # This config item must be copied to # the client config file as well. ;cipher BF-CBC # Blowfish (default) ;cipher AES-128-CBC # AES ;cipher DES-EDE3-CBC # Triple-DES # Enable compression on the VPN link. # If you enable it here, you must also # enable it in the client config file. comp-lzo # The maximum number of concurrently connected # clients we want to allow. ;max-clients 100 # It's a good idea to reduce the OpenVPN # daemon's privileges after initialization. # # You can uncomment this out on # non-Windows systems. user nobody group nogroup # The persist options will try to avoid # accessing certain resources on restart # that may no longer be accessible because # of the privilege downgrade. persist-key persist-tun # Output a short status file showing # current connections, truncated # and rewritten every minute. status openvpn-status.log # By default, log messages will go to the syslog (or # on Windows, if running as a service, they will go to # the "\Program Files\OpenVPN\log" directory). # Use log or log-append to override this default. # "log" will truncate the log file on OpenVPN startup, # while "log-append" will append to it. Use one # or the other (but not both). ;log openvpn.log ;log-append openvpn.log # Set the appropriate level of log # file verbosity. # # 0 is silent, except for fatal errors # 4 is reasonable for general usage # 5 and 6 can help to debug connection problems # 9 is extremely verbose verb 3 # Silence repeating messages. At most 20 # sequential messages of the same message # category will be output to the log. ;mute 20 I am using Windows 7 as the Client and set that up accordingly using the OpenVPN GUI. That conf file is as follows: ############################################## # Sample client-side OpenVPN 2.0 config file # # for connecting to multi-client server. # # # # This configuration can be used by multiple # # clients, however each client should have # # its own cert and key files. # # # # On Windows, you might want to rename this # # file so it has a .ovpn extension # ############################################## # Specify that we are a client and that we # will be pulling certain config file directives # from the server. client # Use the same setting as you are using on # the server. # On most systems, the VPN will not function # unless you partially or fully disable # the firewall for the TUN/TAP interface. dev tap0 up "/etc/openvpn/up.sh br0" down "/etc/openvpn/down.sh br0" ;dev tun # Windows needs the TAP-Win32 adapter name # from the Network Connections panel # if you have more than one. On XP SP2, # you may need to disable the firewall # for the TAP adapter. ;dev-node MyTap # Are we connecting to a TCP or # UDP server? Use the same setting as # on the server. proto tcp ;proto udp # The hostname/IP and port of the server. # You can have multiple remote entries # to load balance between the servers. blahblah.dyndns-wiki.com 1194 ;remote my-server-2 1194 # Choose a random host from the remote # list for load-balancing. Otherwise # try hosts in the order specified. ;remote-random # Keep trying indefinitely to resolve the # host name of the OpenVPN server. Very useful # on machines which are not permanently connected # to the internet such as laptops. resolv-retry infinite # Most clients don't need to bind to # a specific local port number. nobind # Downgrade privileges after initialization (non-Windows only) user nobody group nobody # Try to preserve some state across restarts. persist-key persist-tun # If you are connecting through an # HTTP proxy to reach the actual OpenVPN # server, put the proxy server/IP and # port number here. See the man page # if your proxy server requires # authentication. ;http-proxy-retry # retry on connection failures ;http-proxy [proxy server] [proxy port #] # Wireless networks often produce a lot # of duplicate packets. Set this flag # to silence duplicate packet warnings. ;mute-replay-warnings # SSL/TLS parms. # See the server config file for more # description. It's best to use # a separate .crt/.key file pair # for each client. A single ca # file can be used for all clients. ca "C:\\Program Files\OpenVPN\config\\ca.crt" cert "C:\\Program Files\OpenVPN\config\\ChadMWade-THINK.crt" key "C:\\Program Files\OpenVPN\config\\ChadMWade-THINK.key" # Verify server certificate by checking # that the certicate has the nsCertType # field set to "server". This is an # important precaution to protect against # a potential attack discussed here: # http://openvpn.net/howto.html#mitm # # To use this feature, you will need to generate # your server certificates with the nsCertType # field set to "server". The build-key-server # script in the easy-rsa folder will do this. ns-cert-type server # If a tls-auth key is used on the server # then every client must also have the key. ;tls-auth ta.key 1 # Select a cryptographic cipher. # If the cipher option is used on the server # then you must also specify it here. ;cipher x # Enable compression on the VPN link. # Don't enable this unless it is also # enabled in the server config file. comp-lzo # Set log file verbosity. verb 3 # Silence repeating messages ;mute 20 Not sure whats left to do.

Read the article
Agile SOA Governance: SO-Aware and Visual Studio Integration

- by gsusx

One of the major limitations of traditional SOA governance platforms is the lack of integration as part of the development process. Tools like HP-Systinet or SOA Software are designed to operate by models on which the architects dictate the governance procedures and policies and the rest of the team members follow along. Consequently, those procedures are frequently rejected by developers and testers given that they can’t incorporate it as part of their daily activities. Having SOA governance products...(read more)

Read the article
[OT] : Windows Activation, en masse

- by AaronBertrand

This weekend I discovered a minor issue in one of my virtual environments. I had built out 100 VMs based on a Hyper-V template, but I forgot to activate the original source before creating the template, so all of the machines were suddenly out of compliance. While easy enough on a one- or two-machine basis to just log into the machine and activate manually, there was no way I was even going to dream of repeating that process on 100 machines. My First Reaction : PowerShell Whenever I do anything with...(read more)

Read the article
Improving Partitioned Table Join Performance

- by Paul White

The query optimizer does not always choose an optimal strategy when joining partitioned tables. This post looks at an example, showing how a manual rewrite of the query can almost double performance, while reducing the memory grant to almost nothing. Test Data The two tables in this example use a common partitioning partition scheme. The partition function uses 41 equal-size partitions: CREATE PARTITION FUNCTION PFT (integer) AS RANGE RIGHT FOR VALUES ( 125000, 250000, 375000, 500000, 625000, 750000, 875000, 1000000, 1125000, 1250000, 1375000, 1500000, 1625000, 1750000, 1875000, 2000000, 2125000, 2250000, 2375000, 2500000, 2625000, 2750000, 2875000, 3000000, 3125000, 3250000, 3375000, 3500000, 3625000, 3750000, 3875000, 4000000, 4125000, 4250000, 4375000, 4500000, 4625000, 4750000, 4875000, 5000000 ); GO CREATE PARTITION SCHEME PST AS PARTITION PFT ALL TO ([PRIMARY]); There two tables are: CREATE TABLE dbo.T1 ( TID integer NOT NULL IDENTITY(0,1), Column1 integer NOT NULL, Padding binary(100) NOT NULL DEFAULT 0x, CONSTRAINT PK_T1 PRIMARY KEY CLUSTERED (TID) ON PST (TID) ); CREATE TABLE dbo.T2 ( TID integer NOT NULL, Column1 integer NOT NULL, Padding binary(100) NOT NULL DEFAULT 0x, CONSTRAINT PK_T2 PRIMARY KEY CLUSTERED (TID, Column1) ON PST (TID) ); The next script loads 5 million rows into T1 with a pseudo-random value between 1 and 5 for Column1. The table is partitioned on the IDENTITY column TID: INSERT dbo.T1 WITH (TABLOCKX) (Column1) SELECT (ABS(CHECKSUM(NEWID())) % 5) + 1 FROM dbo.Numbers AS N WHERE n BETWEEN 1 AND 5000000; In case you don’t already have an auxiliary table of numbers lying around, here’s a script to create one with 10 million rows: CREATE TABLE dbo.Numbers (n bigint PRIMARY KEY); WITH L0 AS(SELECT 1 AS c UNION ALL SELECT 1), L1 AS(SELECT 1 AS c FROM L0 AS A CROSS JOIN L0 AS B), L2 AS(SELECT 1 AS c FROM L1 AS A CROSS JOIN L1 AS B), L3 AS(SELECT 1 AS c FROM L2 AS A CROSS JOIN L2 AS B), L4 AS(SELECT 1 AS c FROM L3 AS A CROSS JOIN L3 AS B), L5 AS(SELECT 1 AS c FROM L4 AS A CROSS JOIN L4 AS B), Nums AS(SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS n FROM L5) INSERT dbo.Numbers WITH (TABLOCKX) SELECT TOP (10000000) n FROM Nums ORDER BY n OPTION (MAXDOP 1); Table T1 contains data like this: Next we load data into table T2. The relationship between the two tables is that table 2 contains ‘n’ rows for each row in table 1, where ‘n’ is determined by the value in Column1 of table T1. There is nothing particularly special about the data or distribution, by the way. INSERT dbo.T2 WITH (TABLOCKX) (TID, Column1) SELECT T.TID, N.n FROM dbo.T1 AS T JOIN dbo.Numbers AS N ON N.n >= 1 AND N.n <= T.Column1; Table T2 ends up containing about 15 million rows: The primary key for table T2 is a combination of TID and Column1. The data is partitioned according to the value in column TID alone. Partition Distribution The following query shows the number of rows in each partition of table T1: SELECT PartitionID = CA1.P, NumRows = COUNT_BIG(*) FROM dbo.T1 AS T CROSS APPLY (VALUES ($PARTITION.PFT(TID))) AS CA1 (P) GROUP BY CA1.P ORDER BY CA1.P; There are 40 partitions containing 125,000 rows (40 * 125k = 5m rows). The rightmost partition remains empty. The next query shows the distribution for table 2: SELECT PartitionID = CA1.P, NumRows = COUNT_BIG(*) FROM dbo.T2 AS T CROSS APPLY (VALUES ($PARTITION.PFT(TID))) AS CA1 (P) GROUP BY CA1.P ORDER BY CA1.P; There are roughly 375,000 rows in each partition (the rightmost partition is also empty): Ok, that’s the test data done. Test Query and Execution Plan The task is to count the rows resulting from joining tables 1 and 2 on the TID column: SET STATISTICS IO ON; DECLARE @s datetime2 = SYSUTCDATETIME(); SELECT COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID; SELECT DATEDIFF(Millisecond, @s, SYSUTCDATETIME()); SET STATISTICS IO OFF; The optimizer chooses a plan using parallel hash join, and partial aggregation: The Plan Explorer plan tree view shows accurate cardinality estimates and an even distribution of rows across threads (click to enlarge the image): With a warm data cache, the STATISTICS IO output shows that no physical I/O was needed, and all 41 partitions were touched: Running the query without actual execution plan or STATISTICS IO information for maximum performance, the query returns in around 2600ms. Execution Plan Analysis The first step toward improving on the execution plan produced by the query optimizer is to understand how it works, at least in outline. The two parallel Clustered Index Scans use multiple threads to read rows from tables T1 and T2. Parallel scan uses a demand-based scheme where threads are given page(s) to scan from the table as needed. This arrangement has certain important advantages, but does result in an unpredictable distribution of rows amongst threads. The point is that multiple threads cooperate to scan the whole table, but it is impossible to predict which rows end up on which threads. For correct results from the parallel hash join, the execution plan has to ensure that rows from T1 and T2 that might join are processed on the same thread. For example, if a row from T1 with join key value ‘1234’ is placed in thread 5’s hash table, the execution plan must guarantee that any rows from T2 that also have join key value ‘1234’ probe thread 5’s hash table for matches. The way this guarantee is enforced in this parallel hash join plan is by repartitioning rows to threads after each parallel scan. The two repartitioning exchanges route rows to threads using a hash function over the hash join keys. The two repartitioning exchanges use the same hash function so rows from T1 and T2 with the same join key must end up on the same hash join thread. Expensive Exchanges This business of repartitioning rows between threads can be very expensive, especially if a large number of rows is involved. The execution plan selected by the optimizer moves 5 million rows through one repartitioning exchange and around 15 million across the other. As a first step toward removing these exchanges, consider the execution plan selected by the optimizer if we join just one partition from each table, disallowing parallelism: SELECT COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID WHERE $PARTITION.PFT(T1.TID) = 1 AND $PARTITION.PFT(T2.TID) = 1 OPTION (MAXDOP 1); The optimizer has chosen a (one-to-many) merge join instead of a hash join. The single-partition query completes in around 100ms. If everything scaled linearly, we would expect that extending this strategy to all 40 populated partitions would result in an execution time around 4000ms. Using parallelism could reduce that further, perhaps to be competitive with the parallel hash join chosen by the optimizer. This raises a question. If the most efficient way to join one partition from each of the tables is to use a merge join, why does the optimizer not choose a merge join for the full query? Forcing a Merge Join Let’s force the optimizer to use a merge join on the test query using a hint: SELECT COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID OPTION (MERGE JOIN); This is the execution plan selected by the optimizer: This plan results in the same number of logical reads reported previously, but instead of 2600ms the query takes 5000ms. The natural explanation for this drop in performance is that the merge join plan is only using a single thread, whereas the parallel hash join plan could use multiple threads. Parallel Merge Join We can get a parallel merge join plan using the same query hint as before, and adding trace flag 8649: SELECT COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID OPTION (MERGE JOIN, QUERYTRACEON 8649); The execution plan is: This looks promising. It uses a similar strategy to distribute work across threads as seen for the parallel hash join. In practice though, performance is disappointing. On a typical run, the parallel merge plan runs for around 8400ms; slower than the single-threaded merge join plan (5000ms) and much worse than the 2600ms for the parallel hash join. We seem to be going backwards! The logical reads for the parallel merge are still exactly the same as before, with no physical IOs. The cardinality estimates and thread distribution are also still very good (click to enlarge): A big clue to the reason for the poor performance is shown in the wait statistics (captured by Plan Explorer Pro): CXPACKET waits require careful interpretation, and are most often benign, but in this case excessive waiting occurs at the repartitioning exchanges. Unlike the parallel hash join, the repartitioning exchanges in this plan are order-preserving ‘merging’ exchanges (because merge join requires ordered inputs): Parallelism works best when threads can just grab any available unit of work and get on with processing it. Preserving order introduces inter-thread dependencies that can easily lead to significant waits occurring. In extreme cases, these dependencies can result in an intra-query deadlock, though the details of that will have to wait for another time to explore in detail. The potential for waits and deadlocks leads the query optimizer to cost parallel merge join relatively highly, especially as the degree of parallelism (DOP) increases. This high costing resulted in the optimizer choosing a serial merge join rather than parallel in this case. The test results certainly confirm its reasoning. Collocated Joins In SQL Server 2008 and later, the optimizer has another available strategy when joining tables that share a common partition scheme. This strategy is a collocated join, also known as as a per-partition join. It can be applied in both serial and parallel execution plans, though it is limited to 2-way joins in the current optimizer. Whether the optimizer chooses a collocated join or not depends on cost estimation. The primary benefits of a collocated join are that it eliminates an exchange and requires less memory, as we will see next. Costing and Plan Selection The query optimizer did consider a collocated join for our original query, but it was rejected on cost grounds. The parallel hash join with repartitioning exchanges appeared to be a cheaper option. There is no query hint to force a collocated join, so we have to mess with the costing framework to produce one for our test query. Pretending that IOs cost 50 times more than usual is enough to convince the optimizer to use collocated join with our test query: -- Pretend IOs are 50x cost temporarily DBCC SETIOWEIGHT(50); -- Co-located hash join SELECT COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID OPTION (RECOMPILE); -- Reset IO costing DBCC SETIOWEIGHT(1); Collocated Join Plan The estimated execution plan for the collocated join is: The Constant Scan contains one row for each partition of the shared partitioning scheme, from 1 to 41. The hash repartitioning exchanges seen previously are replaced by a single Distribute Streams exchange using Demand partitioning. Demand partitioning means that the next partition id is given to the next parallel thread that asks for one. My test machine has eight logical processors, and all are available for SQL Server to use. As a result, there are eight threads in the single parallel branch in this plan, each processing one partition from each table at a time. Once a thread finishes processing a partition, it grabs a new partition number from the Distribute Streams exchange…and so on until all partitions have been processed. It is important to understand that the parallel scans in this plan are different from the parallel hash join plan. Although the scans have the same parallelism icon, tables T1 and T2 are not being co-operatively scanned by multiple threads in the same way. Each thread reads a single partition of T1 and performs a hash match join with the same partition from table T2. The properties of the two Clustered Index Scans show a Seek Predicate (unusual for a scan!) limiting the rows to a single partition: The crucial point is that the join between T1 and T2 is on TID, and TID is the partitioning column for both tables. A thread that processes partition ‘n’ is guaranteed to see all rows that can possibly join on TID for that partition. In addition, no other thread will see rows from that partition, so this removes the need for repartitioning exchanges. CPU and Memory Efficiency Improvements The collocated join has removed two expensive repartitioning exchanges and added a single exchange processing 41 rows (one for each partition id). Remember, the parallel hash join plan exchanges had to process 5 million and 15 million rows. The amount of processor time spent on exchanges will be much lower in the collocated join plan. In addition, the collocated join plan has a maximum of 8 threads processing single partitions at any one time. The 41 partitions will all be processed eventually, but a new partition is not started until a thread asks for it. Threads can reuse hash table memory for the new partition. The parallel hash join plan also had 8 hash tables, but with all 5,000,000 build rows loaded at the same time. The collocated plan needs memory for only 8 * 125,000 = 1,000,000 rows at any one time. Collocated Hash Join Performance The collated join plan has disappointing performance in this case. The query runs for around 25,300ms despite the same IO statistics as usual. This is much the worst result so far, so what went wrong? It turns out that cardinality estimation for the single partition scans of table T1 is slightly low. The properties of the Clustered Index Scan of T1 (graphic immediately above) show the estimation was for 121,951 rows. This is a small shortfall compared with the 125,000 rows actually encountered, but it was enough to cause the hash join to spill to physical tempdb: A level 1 spill doesn’t sound too bad, until you realize that the spill to tempdb probably occurs for each of the 41 partitions. As a side note, the cardinality estimation error is a little surprising because the system tables accurately show there are 125,000 rows in every partition of T1. Unfortunately, the optimizer uses regular column and index statistics to derive cardinality estimates here rather than system table information (e.g. sys.partitions). Collocated Merge Join We will never know how well the collocated parallel hash join plan might have worked without the cardinality estimation error (and the resulting 41 spills to tempdb) but we do know: Merge join does not require a memory grant; and Merge join was the optimizer’s preferred join option for a single partition join Putting this all together, what we would really like to see is the same collocated join strategy, but using merge join instead of hash join. Unfortunately, the current query optimizer cannot produce a collocated merge join; it only knows how to do collocated hash join. So where does this leave us? CROSS APPLY sys.partitions We can try to write our own collocated join query. We can use sys.partitions to find the partition numbers, and CROSS APPLY to get a count per partition, with a final step to sum the partial counts. The following query implements this idea: SELECT row_count = SUM(Subtotals.cnt) FROM ( -- Partition numbers SELECT p.partition_number FROM sys.partitions AS p WHERE p.[object_id] = OBJECT_ID(N'T1', N'U') AND p.index_id = 1 ) AS P CROSS APPLY ( -- Count per collocated join SELECT cnt = COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID WHERE $PARTITION.PFT(T1.TID) = p.partition_number AND $PARTITION.PFT(T2.TID) = p.partition_number ) AS SubTotals; The estimated plan is: The cardinality estimates aren’t all that good here, especially the estimate for the scan of the system table underlying the sys.partitions view. Nevertheless, the plan shape is heading toward where we would like to be. Each partition number from the system table results in a per-partition scan of T1 and T2, a one-to-many Merge Join, and a Stream Aggregate to compute the partial counts. The final Stream Aggregate just sums the partial counts. Execution time for this query is around 3,500ms, with the same IO statistics as always. This compares favourably with 5,000ms for the serial plan produced by the optimizer with the OPTION (MERGE JOIN) hint. This is another case of the sum of the parts being less than the whole – summing 41 partial counts from 41 single-partition merge joins is faster than a single merge join and count over all partitions. Even so, this single-threaded collocated merge join is not as quick as the original parallel hash join plan, which executed in 2,600ms. On the positive side, our collocated merge join uses only one logical processor and requires no memory grant. The parallel hash join plan used 16 threads and reserved 569 MB of memory: Using a Temporary Table Our collocated merge join plan should benefit from parallelism. The reason parallelism is not being used is that the query references a system table. We can work around that by writing the partition numbers to a temporary table (or table variable): SET STATISTICS IO ON; DECLARE @s datetime2 = SYSUTCDATETIME(); CREATE TABLE #P ( partition_number integer PRIMARY KEY); INSERT #P (partition_number) SELECT p.partition_number FROM sys.partitions AS p WHERE p.[object_id] = OBJECT_ID(N'T1', N'U') AND p.index_id = 1; SELECT row_count = SUM(Subtotals.cnt) FROM #P AS p CROSS APPLY ( SELECT cnt = COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID WHERE $PARTITION.PFT(T1.TID) = p.partition_number AND $PARTITION.PFT(T2.TID) = p.partition_number ) AS SubTotals; DROP TABLE #P; SELECT DATEDIFF(Millisecond, @s, SYSUTCDATETIME()); SET STATISTICS IO OFF; Using the temporary table adds a few logical reads, but the overall execution time is still around 3500ms, indistinguishable from the same query without the temporary table. The problem is that the query optimizer still doesn’t choose a parallel plan for this query, though the removal of the system table reference means that it could if it chose to: In fact the optimizer did enter the parallel plan phase of query optimization (running search 1 for a second time): Unfortunately, the parallel plan found seemed to be more expensive than the serial plan. This is a crazy result, caused by the optimizer’s cost model not reducing operator CPU costs on the inner side of a nested loops join. Don’t get me started on that, we’ll be here all night. In this plan, everything expensive happens on the inner side of a nested loops join. Without a CPU cost reduction to compensate for the added cost of exchange operators, candidate parallel plans always look more expensive to the optimizer than the equivalent serial plan. Parallel Collocated Merge Join We can produce the desired parallel plan using trace flag 8649 again: SELECT row_count = SUM(Subtotals.cnt) FROM #P AS p CROSS APPLY ( SELECT cnt = COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID WHERE $PARTITION.PFT(T1.TID) = p.partition_number AND $PARTITION.PFT(T2.TID) = p.partition_number ) AS SubTotals OPTION (QUERYTRACEON 8649); The actual execution plan is: One difference between this plan and the collocated hash join plan is that a Repartition Streams exchange operator is used instead of Distribute Streams. The effect is similar, though not quite identical. The Repartition uses round-robin partitioning, meaning the next partition id is pushed to the next thread in sequence. The Distribute Streams exchange seen earlier used Demand partitioning, meaning the next partition id is pulled across the exchange by the next thread that is ready for more work. There are subtle performance implications for each partitioning option, but going into that would again take us too far off the main point of this post. Performance The important thing is the performance of this parallel collocated merge join – just 1350ms on a typical run. The list below shows all the alternatives from this post (all timings include creation, population, and deletion of the temporary table where appropriate) from quickest to slowest: Collocated parallel merge join: 1350ms Parallel hash join: 2600ms Collocated serial merge join: 3500ms Serial merge join: 5000ms Parallel merge join: 8400ms Collated parallel hash join: 25,300ms (hash spill per partition) The parallel collocated merge join requires no memory grant (aside from a paltry 1.2MB used for exchange buffers). This plan uses 16 threads at DOP 8; but 8 of those are (rather pointlessly) allocated to the parallel scan of the temporary table. These are minor concerns, but it turns out there is a way to address them if it bothers you. Parallel Collocated Merge Join with Demand Partitioning This final tweak replaces the temporary table with a hard-coded list of partition ids (dynamic SQL could be used to generate this query from sys.partitions): SELECT row_count = SUM(Subtotals.cnt) FROM ( VALUES (1),(2),(3),(4),(5),(6),(7),(8),(9),(10), (11),(12),(13),(14),(15),(16),(17),(18),(19),(20), (21),(22),(23),(24),(25),(26),(27),(28),(29),(30), (31),(32),(33),(34),(35),(36),(37),(38),(39),(40),(41) ) AS P (partition_number) CROSS APPLY ( SELECT cnt = COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID WHERE $PARTITION.PFT(T1.TID) = p.partition_number AND $PARTITION.PFT(T2.TID) = p.partition_number ) AS SubTotals OPTION (QUERYTRACEON 8649); The actual execution plan is: The parallel collocated hash join plan is reproduced below for comparison: The manual rewrite has another advantage that has not been mentioned so far: the partial counts (per partition) can be computed earlier than the partial counts (per thread) in the optimizer’s collocated join plan. The earlier aggregation is performed by the extra Stream Aggregate under the nested loops join. The performance of the parallel collocated merge join is unchanged at around 1350ms. Final Words It is a shame that the current query optimizer does not consider a collocated merge join (Connect item closed as Won’t Fix). The example used in this post showed an improvement in execution time from 2600ms to 1350ms using a modestly-sized data set and limited parallelism. In addition, the memory requirement for the query was almost completely eliminated – down from 569MB to 1.2MB. The problem with the parallel hash join selected by the optimizer is that it attempts to process the full data set all at once (albeit using eight threads). It requires a large memory grant to hold all 5 million rows from table T1 across the eight hash tables, and does not take advantage of the divide-and-conquer opportunity offered by the common partitioning. The great thing about the collocated join strategies is that each parallel thread works on a single partition from both tables, reading rows, performing the join, and computing a per-partition subtotal, before moving on to a new partition. From a thread’s point of view… If you have trouble visualizing what is happening from just looking at the parallel collocated merge join execution plan, let’s look at it again, but from the point of view of just one thread operating between the two Parallelism (exchange) operators. Our thread picks up a single partition id from the Distribute Streams exchange, and starts a merge join using ordered rows from partition 1 of table T1 and partition 1 of table T2. By definition, this is all happening on a single thread. As rows join, they are added to a (per-partition) count in the Stream Aggregate immediately above the Merge Join. Eventually, either T1 (partition 1) or T2 (partition 1) runs out of rows and the merge join stops. The per-partition count from the aggregate passes on through the Nested Loops join to another Stream Aggregate, which is maintaining a per-thread subtotal. Our same thread now picks up a new partition id from the exchange (say it gets id 9 this time). The count in the per-partition aggregate is reset to zero, and the processing of partition 9 of both tables proceeds just as it did for partition 1, and on the same thread. Each thread picks up a single partition id and processes all the data for that partition, completely independently from other threads working on other partitions. One thread might eventually process partitions (1, 9, 17, 25, 33, 41) while another is concurrently processing partitions (2, 10, 18, 26, 34) and so on for the other six threads at DOP 8. The point is that all 8 threads can execute independently and concurrently, continuing to process new partitions until the wider job (of which the thread has no knowledge!) is done. This divide-and-conquer technique can be much more efficient than simply splitting the entire workload across eight threads all at once. Related Reading Understanding and Using Parallelism in SQL Server Parallel Execution Plans Suck © 2013 Paul White – All Rights Reserved Twitter: @SQL_Kiwi

Read the article
BPEL 11.1.1.2 Certified for Prebuilt E-Business Suite 12.1.3 SOA Integrations

- by Steven Chan

A new certification was released simultaneously with the E-Business Suite 12.1.3 Maintenance Pack late last year: the use of BPEL 11g Version 11.1.1.2 with E-Business Suite 12.1.3. There are two major options for SOA-related integrations for the E-Business Suite:Custom integrations using the Oracle Application Server (SOA) Adapter for Oracle ApplicationsPrebuilt SOA integrations for E-Business Suite using BPEL Process ManagerFor more background about these two options, please see this article:BPEL 10.1.3.5 Certified for Prebuilt E-Business Suite 12 SOA Integrations

Read the article

Search Results

Search found 22569 results on 903 pages for 'win32 process'.

Page 277/903 | < Previous Page | 273 274 275 276 277 278 279 280 281 282 283 284 | Next Page >

- by Alexandre

- by Andras Gyomrey

- by pek

- by Yaniro

- by web.ask

- by chrisbunney

- by Wolfiem

- by ento

- by ento

- by Chris Sobolewski

- by Jeff

- by Josh

- by Mike

- by Jordan Rieger

- by GFK

- by user328170

- by pinaldave

- by navaneeth

- by James Shimer

- by shiju

- by LandArch

- by gsusx

- by AaronBertrand

- by Paul White

- by Steven Chan

< Previous Page | 273 274 275 276 277 278 279 280 281 282 283 284 | Next Page >