Search Results

Search found 4142 results on 166 pages for 'lost hobbit'.

Page 164/166 | < Previous Page | 160 161 162 163 164 165 166  | Next Page >

  • Multi-threading does not work correctly using std::thread (C++ 11)

    - by user1364743
    I coded a small c++ program to try to understand how multi-threading works using std::thread. Here's the step of my program execution : Initialization of a 5x5 matrix of integers with a unique value '42' contained in the class 'Toto' (initialized in the main). I print the initialized 5x5 matrix. Declaration of std::vector of 5 threads. I attach all threads respectively with their task (threadTask method). Each thread will manipulate a std::vector<int> instance. I join all threads. I print the new state of my 5x5 matrix. Here's the output : 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 It should be : 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 0 0 0 0 0 1 1 1 1 1 2 2 2 2 2 3 3 3 3 3 4 4 4 4 4 Here's the code sample : #include <iostream> #include <vector> #include <thread> class Toto { public: /* ** Initialize a 5x5 matrix with the 42 value. */ void initData(void) { for (int y = 0; y < 5; y++) { std::vector<int> vec; for (int x = 0; x < 5; x++) { vec.push_back(42); } this->m_data.push_back(vec); } } /* ** Display the whole matrix. */ void printData(void) const { for (int y = 0; y < 5; y++) { for (int x = 0; x < 5; x++) { printf("%d ", this->m_data[y][x]); } printf("\n"); } printf("\n"); } /* ** Function attached to the thread (thread task). ** Replace the original '42' value by another one. */ void threadTask(std::vector<int> &list, int value) { for (int x = 0; x < 5; x++) { list[x] = value; } } /* ** Return the m_data instance propertie. */ std::vector<std::vector<int> > &getData(void) { return (this->m_data); } private: std::vector<std::vector<int> > m_data; }; int main(void) { Toto toto; toto.initData(); toto.printData(); //Display the original 5x5 matrix (first display). std::vector<std::thread> threadList(5); //Initialization of vector of 5 threads. for (int i = 0; i < 5; i++) { //Threads initializationss std::vector<int> vec = toto.getData()[i]; //Get each sub-vectors. threadList.at(i) = std::thread(&Toto::threadTask, toto, vec, i); //Each thread will be attached to a specific vector. } for (int j = 0; j < 5; j++) { threadList.at(j).join(); } toto.printData(); //Second display. getchar(); return (0); } However, in the method threadTask, if I print the variable list[x], the output is correct. I think I can't print the correct data in the main because the printData() call is in the main thread and the display in the threadTask function is correct because the method is executed in its own thread (not the main one). It's strange, it means that all threads created in a parent processes can't modified the data in this parent processes ? I think I forget something in my code. I'm really lost. Does anyone can help me, please ? Thank a lot in advance for your help.

    Read the article

  • Compat Wireless Drivers Centrino N-2230

    - by user2699451
    So I am using linux and am having trouble installing the Compat Wireless drivers Hardware: Intel Centrino N-2230 OS: Linux Mint 64bit (kernel 13.08-generic) I followed this link http://www.mathyvanhoef.com/2012/09/compat-wireless-injection-patch-for.html Output: apt-get install linux-headers-$(uname -r) Reading package lists... Done Building dependency tree Reading state information... Done linux-headers-3.8.0-19-generic is already the newest version. 0 upgraded, 0 newly installed, 0 to remove and 19 not upgraded. charles-W55xEU compat-wireless-2010-10-16 # cd ~ charles-W55xEU ~ # dir adt-bundle-linux-x86_64-20130917.zip Desktop known_hosts_backup charles-W55xEU ~ # wget http://www.orbit-lab.org/kernel/compat-wireless-3-stable/v3.6/compat-wireless-3.6.2-1-snp.tar.bz2 --2013-10-29 10:28:23-- http://www.orbit-lab.org/kernel/compat-wireless-3-stable/v3.6/compat-wireless-3.6.2-1-snp.tar.bz2 Resolving www.orbit-lab.org (www.orbit-lab.org)... 128.6.192.131 Connecting to www.orbit-lab.org (www.orbit-lab.org)|128.6.192.131|:80... connected. HTTP request sent, awaiting response... 200 OK Length: 4443700 (4,2M) [application/x-bzip2] Saving to: ‘compat-wireless-3.6.2-1-snp.tar.bz2’ 100%[======================================>] 4 443 700 13,5KB/s in 11m 3s 2013-10-29 10:39:27 (6,55 KB/s) - ‘compat-wireless-3.6.2-1-snp.tar.bz2’ saved [4443700/4443700] charles-W55xEU ~ # tar -xf compat-wireless-3.6.2-1-snp.tar.bz2 charles-W55xEU ~ # cd compat-wireless-3.6-rc6-1 bash: cd: compat-wireless-3.6-rc6-1: No such file or directory charles-W55xEU ~ # dir adt-bundle-linux-x86_64-20130917.zip Desktop compat-wireless-3.6.2-1-snp known_hosts_backup compat-wireless-3.6.2-1-snp.tar.bz2 charles-W55xEU ~ # cd compat-wireless-3.6.2-1-snp/ charles-W55xEU compat-wireless-3.6.2-1-snp # dir code-metrics.txt defconfigs linux-next-pending pending-stable compat drivers MAINTAINERS README config.mk enable-older-kernels Makefile scripts COPYRIGHT include net udev crap linux-next-cherry-picks patches charles-W55xEU compat-wireless-3.6.2-1-snp # wget http://patches.aircrack-ng.org/mac80211.compat08082009.wl_frag+ack_v1.patch --2013-10-29 10:40:52-- http://patches.aircrack-ng.org/mac80211.compat08082009.wl_frag+ack_v1.patch Resolving patches.aircrack-ng.org (patches.aircrack-ng.org)... 213.186.33.2, 2001:41d0:1:1b00:213:186:33:2 Connecting to patches.aircrack-ng.org (patches.aircrack-ng.org)|213.186.33.2|:80... connected. HTTP request sent, awaiting response... 200 OK Length: 1049 (1,0K) [text/plain] Saving to: ‘mac80211.compat08082009.wl_frag+ack_v1.patch’ 100%[======================================>] 1 049 --.-K/s in 0s 2013-10-29 10:40:56 (180 MB/s) - ‘mac80211.compat08082009.wl_frag+ack_v1.patch’ saved [1049/1049] charles-W55xEU compat-wireless-3.6.2-1-snp # patch -p1 < mac80211.compat08082009.wl_frag+ack_v1.patch patching file net/mac80211/tx.c Hunk #1 succeeded at 792 (offset 115 lines). charles-W55xEU compat-wireless-3.6.2-1-snp # wget -Ocompatwireless_chan_qos_frag.patch http://pastie.textmate.org/pastes/4882675/download --2013-10-29 10:43:18-- http://pastie.textmate.org/pastes/4882675/download Resolving pastie.textmate.org (pastie.textmate.org)... 178.79.137.125 Connecting to pastie.textmate.org (pastie.textmate.org)|178.79.137.125|:80... connected. HTTP request sent, awaiting response... 301 Moved Permanently Location: http://pastie.org/pastes/4882675/download [following] --2013-10-29 10:43:20-- http://pastie.org/pastes/4882675/download Resolving pastie.org (pastie.org)... 96.126.119.119 Connecting to pastie.org (pastie.org)|96.126.119.119|:80... connected. HTTP request sent, awaiting response... 200 OK Length: 2036 (2,0K) [application/octet-stream] Saving to: ‘compatwireless_chan_qos_frag.patch’ 100%[======================================>] 2 036 --.-K/s in 0,001s 2013-10-29 10:43:21 (3,35 MB/s) - ‘compatwireless_chan_qos_frag.patch’ saved [2036/2036] charles-W55xEU compat-wireless-3.6.2-1-snp # patch -p1 < compatwireless_chan_qos_frag.patch patching file drivers/net/wireless/rtl818x/rtl8187/dev.c patching file net/mac80211/tx.c Hunk #1 succeeded at 1495 (offset 8 lines). patching file net/wireless/chan.c charles-W55xEU compat-wireless-3.6.2-1-snp # make ./scripts/gen-compat-autoconf.sh /root/compat-wireless-3.6.2-1-snp/.config /root/compat-wireless-3.6.2-1-snp/config.mk > include/linux/compat_autoconf.h make -C /lib/modules/3.8.0-19-generic/build M=/root/compat-wireless-3.6.2-1-snp modules make[1]: Entering directory `/usr/src/linux-headers-3.8.0-19-generic' CC [M] /root/compat-wireless-3.6.2-1-snp/compat/main.o LD [M] /root/compat-wireless-3.6.2-1-snp/compat/compat.o CC [M] /root/compat-wireless-3.6.2-1-snp/drivers/bcma/main.o In file included from /root/compat-wireless-3.6.2-1-snp/include/linux/bcma/bcma.h:8:0, from /root/compat-wireless-3.6.2-1-snp/drivers/bcma/bcma_private.h:8, from /root/compat-wireless-3.6.2-1-snp/drivers/bcma/main.c:8: /root/compat-wireless-3.6.2-1-snp/include/linux/bcma/bcma_driver_pci.h:217:23: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’ before ‘bcma_core_pci_init’ In file included from /root/compat-wireless-3.6.2-1-snp/include/linux/bcma/bcma.h:10:0, from /root/compat-wireless-3.6.2-1-snp/drivers/bcma/bcma_private.h:8, from /root/compat-wireless-3.6.2-1-snp/drivers/bcma/main.c:8: /root/compat-wireless-3.6.2-1-snp/include/linux/bcma/bcma_driver_gmac_cmn.h:95:23: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’ before ‘bcma_core_gmac_cmn_init’ In file included from /root/compat-wireless-3.6.2-1-snp/drivers/bcma/main.c:8:0: /root/compat-wireless-3.6.2-1-snp/drivers/bcma/bcma_private.h:25:15: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’ before ‘bcma_bus_register’ /root/compat-wireless-3.6.2-1-snp/drivers/bcma/main.c:152:15: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’ before ‘bcma_bus_register’ /root/compat-wireless-3.6.2-1-snp/drivers/bcma/main.c:17:21: warning: ‘bcma_bus_next_num’ defined but not used [-Wunused-variable] /root/compat-wireless-3.6.2-1-snp/drivers/bcma/main.c:93:12: warning: ‘bcma_register_cores’ defined but not used [-Wunused-function] make[3]: *** [/root/compat-wireless-3.6.2-1-snp/drivers/bcma/main.o] Error 1 make[2]: *** [/root/compat-wireless-3.6.2-1-snp/drivers/bcma] Error 2 make[1]: *** [_module_/root/compat-wireless-3.6.2-1-snp] Error 2 make[1]: Leaving directory `/usr/src/linux-headers-3.8.0-19-generic' make: *** [modules] Error 2 charles-W55xEU compat-wireless-3.6.2-1-snp # make install Warning: You may or may not need to update your initframfs, you should if any of the modules installed are part of your initramfs. To add support for your distribution to do this automatically send a patch against ./scripts/update-initramfs. If your distribution does not require this send a patch against the '/usr/bin/lsb_release -i -s': LinuxMint tag for your distribution to avoid this warning. make -C /lib/modules/3.8.0-19-generic/build M=/root/compat-wireless-3.6.2-1-snp modules make[1]: Entering directory `/usr/src/linux-headers-3.8.0-19-generic' CC [M] /root/compat-wireless-3.6.2-1-snp/drivers/bcma/main.o In file included from /root/compat-wireless-3.6.2-1-snp/include/linux/bcma/bcma.h:8:0, from /root/compat-wireless-3.6.2-1-snp/drivers/bcma/bcma_private.h:8, from /root/compat-wireless-3.6.2-1-snp/drivers/bcma/main.c:8: /root/compat-wireless-3.6.2-1-snp/include/linux/bcma/bcma_driver_pci.h:217:23: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’ before ‘bcma_core_pci_init’ In file included from /root/compat-wireless-3.6.2-1-snp/include/linux/bcma/bcma.h:10:0, from /root/compat-wireless-3.6.2-1-snp/drivers/bcma/bcma_private.h:8, from /root/compat-wireless-3.6.2-1-snp/drivers/bcma/main.c:8: /root/compat-wireless-3.6.2-1-snp/include/linux/bcma/bcma_driver_gmac_cmn.h:95:23: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’ before ‘bcma_core_gmac_cmn_init’ In file included from /root/compat-wireless-3.6.2-1-snp/drivers/bcma/main.c:8:0: /root/compat-wireless-3.6.2-1-snp/drivers/bcma/bcma_private.h:25:15: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’ before ‘bcma_bus_register’ /root/compat-wireless-3.6.2-1-snp/drivers/bcma/main.c:152:15: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’ before ‘bcma_bus_register’ /root/compat-wireless-3.6.2-1-snp/drivers/bcma/main.c:17:21: warning: ‘bcma_bus_next_num’ defined but not used [-Wunused-variable] /root/compat-wireless-3.6.2-1-snp/drivers/bcma/main.c:93:12: warning: ‘bcma_register_cores’ defined but not used [-Wunused-function] make[3]: *** [/root/compat-wireless-3.6.2-1-snp/drivers/bcma/main.o] Error 1 make[2]: *** [/root/compat-wireless-3.6.2-1-snp/drivers/bcma] Error 2 make[1]: *** [_module_/root/compat-wireless-3.6.2-1-snp] Error 2 make[1]: Leaving directory `/usr/src/linux-headers-3.8.0-19-generic' make: *** [modules] Error 2 charles-W55xEU compat-wireless-3.6.2-1-snp # It keeps giving errors, same with other sites, I get the same errors??? I am lost, help needed

    Read the article

  • Too many apache processes, killing the CPU

    - by RULE101
    I am noticed that too many apache processes killing the CPU in my dedicated server. 14193 (Trace) (Kill) nobody 0 66.1 0.0 /usr/local/apache/bin/httpd -k start -DSSL 14128 (Trace) (Kill) nobody 0 65.9 0.0 /usr/local/apache/bin/httpd -k start -DSSL 14136 (Trace) (Kill) nobody 0 65.9 0.0 /usr/local/apache/bin/httpd -k start -DSSL 14129 (Trace) (Kill) nobody 0 65.8 0.0 /usr/local/apache/bin/httpd -k start -DSSL 13419 (Trace) (Kill) nobody 0 65.7 0.0 /usr/local/apache/bin/httpd -k start -DSSL 13421 (Trace) (Kill) nobody 0 65.7 0.0 /usr/local/apache/bin/httpd -k start -DSSL 13426 (Trace) (Kill) nobody 0 65.7 0.0 /usr/local/apache/bin/httpd -k start -DSSL 13428 (Trace) (Kill) nobody 0 65.7 0.0 /usr/local/apache/bin/httpd -k start -DSSL 13429 (Trace) (Kill) nobody 0 65.7 0.0 /usr/local/apache/bin/httpd -k start -DSSL 12173 (Trace) (Kill) nobody 0 65.5 0.0 /usr/local/apache/bin/httpd -k start -DSSL 14073 (Trace) (Kill) nobody 0 65.5 0.0 /usr/local/apache/bin/httpd -k start -DSSL I am getting high load email notification from cpanel during the day. FROM httpd.conf Include "/usr/local/apache/conf/includes/pre_main_global.conf" Include "/usr/local/apache/conf/includes/pre_main_2.conf" LoadModule bwlimited_module modules/mod_bwlimited.so LoadModule h264_streaming_module /usr/local/apache/modules/mod_h264_streaming.so AddHandler h264-streaming.extensions .mp4 Include "/usr/local/apache/conf/php.conf" Include "/usr/local/apache/conf/includes/errordocument.conf" ErrorLog "logs/error_log" ScriptAliasMatch ^/?controlpanel/?$ /usr/local/cpanel/cgi-sys/redirect.cgi ScriptAliasMatch ^/?cpanel/?$ /usr/local/cpanel/cgi-sys/redirect.cgi ScriptAliasMatch ^/?kpanel/?$ /usr/local/cpanel/cgi-sys/redirect.cgi ScriptAliasMatch ^/?securecontrolpanel/?$ /usr/local/cpanel/cgi-sys/sredirect.cgi ScriptAliasMatch ^/?securecpanel/?$ /usr/local/cpanel/cgi-sys/sredirect.cgi ScriptAliasMatch ^/?securewhm/?$ /usr/local/cpanel/cgi-sys/swhmredirect.cgi ScriptAliasMatch ^/?webmail/?$ /usr/local/cpanel/cgi-sys/wredirect.cgi ScriptAliasMatch ^/?whm/?$ /usr/local/cpanel/cgi-sys/whmredirect.cgi RewriteEngine on AddType text/html .shtml Alias /akopia /usr/local/cpanel/3rdparty/interchange/share/akopia/ Alias /bandwidth /usr/local/bandmin/htdocs/ Alias /img-sys /usr/local/cpanel/img-sys/ Alias /interchange /usr/local/cpanel/3rdparty/interchange/share/interchange/ Alias /interchange-5 /usr/local/cpanel/3rdparty/interchange/share/interchange-5/ Alias /java-sys /usr/local/cpanel/java-sys/ Alias /mailman/archives /usr/local/cpanel/3rdparty/mailman/archives/public/ Alias /pipermail /usr/local/cpanel/3rdparty/mailman/archives/public/ Alias /sys_cpanel /usr/local/cpanel/sys_cpanel/ ScriptAlias /cgi-sys /usr/local/cpanel/cgi-sys/ ScriptAlias /mailman /usr/local/cpanel/3rdparty/mailman/cgi-bin/ <Directory "/"> AllowOverride All Options All </Directory> <Directory "/usr/local/apache/htdocs"> Options All AllowOverride None Require all granted </Directory> <Files ~ "^error_log$"> Order allow,deny Deny from all Satisfy All </Files> <Files ".ht*"> Require all denied </Files> <IfModule log_config_module> LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined LogFormat "%h %l %u %t \"%r\" %>s %b" common CustomLog "logs/access_log" common <IfModule logio_module> LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\" %I %O" combinedio </IfModule> </IfModule> <IfModule alias_module> ScriptAlias /cgi-bin/ "/usr/local/apache/cgi-bin/" </IfModule> <Directory "/usr/local/apache/cgi-bin"> AllowOverride None Options All Require all granted </Directory> <IfModule mime_module> TypesConfig conf/mime.types AddType application/x-compress .Z AddType application/x-gzip .gz .tgz </IfModule> <IfModule prefork.c> Mutex default mpm-accept </IfModule> <IfModule mod_log_config.c> LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined LogFormat "%h %l %u %t \"%r\" %>s %b" common LogFormat "%{Referer}i -> %U" referer LogFormat "%{User-agent}i" agent CustomLog logs/access_log common </IfModule> <IfModule worker.c> Mutex default mpm-accept </IfModule> # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # Direct modifications to the Apache configuration file may be lost upon subsequent regeneration of the # # configuration file. To have modifications retained, all modifications must be checked into the # # configuration system by running: # # /usr/local/cpanel/bin/apache_conf_distiller --update # # To see if your changes will be conserved, regenerate the Apache configuration file by running: # # /usr/local/cpanel/bin/build_apache_conf # # and check the configuration file for your alterations. If your changes have been ignored, then they will # # need to be added directly to their respective template files. # # # # It is also possible to add custom directives to the various "Include" files loaded by this httpd.conf # # For detailed instructions on using Include files and the apache_conf_distiller with the new configuration # # system refer to the documentation at: http://www.cpanel.net/support/docs/ea/ea3/customdirectives.html # # # # This configuration file was built from the following templates: # # /var/cpanel/templates/apache2/main.default # # /var/cpanel/templates/apache2/main.local # # /var/cpanel/templates/apache2/vhost.default # # /var/cpanel/templates/apache2/vhost.local # # /var/cpanel/templates/apache2/ssl_vhost.default # # /var/cpanel/templates/apache2/ssl_vhost.local # # # # Templates with the '.local' extension will be preferred over templates with the '.default' extension. # # The only template updated by the apache_conf_distiller is main.default. # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # PidFile logs/httpd.pid # Defined in /var/cpanel/cpanel.config: apache_port Listen 0.0.0.0:80 User nobody Group nobody ExtendedStatus On ServerAdmin [email protected] ServerName server.powerlabel.net LogLevel warn # These can be set in WHM under 'Apache Global Configuration' Timeout 300 ServerSignature On <IfModule prefork.c> </IfModule> RewriteEngine on RewriteMap LeechProtect prg:/usr/local/cpanel/bin/leechprotect Mutex file:/usr/local/apache/logs rewrite-map <IfModule !mod_ruid2.c> UserDir public_html </IfModule> <IfModule mod_ruid2.c> UserDir disabled </IfModule> # DirectoryIndex is set via the WHM -> Service Configuration -> Apache Setup -> DirectoryIndex Priority DirectoryIndex index.html.var index.htm index.html index.shtml index.xhtml index.wml index.perl index.pl index.plx index.ppl index.cgi index.jsp index.js index.jp index.php4 index.php3 index.php index.phtml default.htm default.html home.htm index.php5 Default.html Default.htm home.html # SSLCipherSuite can be set in WHM under 'Apache Global Configuration' SSLPassPhraseDialog builtin SSLUseStapling on SSLStaplingCache shmcb:/usr/local/apache/logs/stapling_cache_shmcb(256000) SSLSessionCache shmcb:/usr/local/apache/logs/ssl_gcache_data_shmcb(1024000) SSLSessionCacheTimeout 300 Mutex file:/usr/local/apache/logs ssl-cache SSLRandomSeed startup builtin SSLRandomSeed connect builtin # Defined in /var/cpanel/cpanel.config: apache_ssl_port Listen 0.0.0.0:443 AddType application/x-x509-ca-cert .crt AddType application/x-pkcs7-crl .crl AddHandler cgi-script .cgi .pl .plx .ppl .perl AddHandler server-parsed .shtml AddType text/html .shtml AddType application/x-tar .tgz AddType text/vnd.wap.wml .wml AddType image/vnd.wap.wbmp .wbmp AddType text/vnd.wap.wmlscript .wmls AddType application/vnd.wap.wmlc .wmlc AddType application/vnd.wap.wmlscriptc .wmlsc <Location /whm-server-status> SetHandler server-status Order deny,allow Deny from all Allow from 127.0.0.1 </Location> # SUEXEC is supported Include "/usr/local/apache/conf/includes/pre_virtualhost_global.conf" Include "/usr/local/apache/conf/includes/pre_virtualhost_2.conf" What can cause this and how can i fix it ?

    Read the article

  • Is post-sudden-power-loss filesystem corruption on an SSD drive's ext3 partition "expected behavior"?

    - by Jeremy Friesner
    My company makes an embedded Debian Linux device that boots from an ext3 partition on an internal SSD drive. Because the device is an embedded "black box", it is usually shut down the rude way, by simply cutting power to the device via an external switch. This is normally okay, as ext3's journalling keeps things in order, so other than the occasional loss of part of a log file, things keep chugging along fine. However, we've recently seen a number of units where after a number of hard-power-cycles the ext3 partition starts to develop structural issues -- in particular, we run e2fsck on the ext3 partition and it finds a number of issues like those shown in the output listing at the bottom of this Question. Running e2fsck until it stops reporting errors (or reformatting the partition) clears the issues. My question is... what are the implications of seeing problems like this on an ext3/SSD system that has been subjected to lots of sudden/unexpected shutdowns? My feeling is that this might be a sign of a software or hardware problem in our system, since my understanding is that (barring a bug or hardware problem) ext3's journalling feature is supposed to prevent these sorts of filesystem-integrity errors. (Note: I understand that user-data is not journalled and so munged/missing/truncated user-files can happen; I'm specifically talking here about filesystem-metadata errors like those shown below) My co-worker, on the other hand, says that this is known/expected behavior because SSD controllers sometimes re-order write commands and that can cause the ext3 journal to get confused. In particular, he believes that even given normally functioning hardware and bug-free software, the ext3 journal only makes filesystem corruption less likely, not impossible, so we should not be surprised to see problems like this from time to time. Which of us is right? Embedded-PC-failsafe:~# ls Embedded-PC-failsafe:~# umount /mnt/unionfs Embedded-PC-failsafe:~# e2fsck /dev/sda3 e2fsck 1.41.3 (12-Oct-2008) embeddedrootwrite contains a file system with errors, check forced. Pass 1: Checking inodes, blocks, and sizes Pass 2: Checking directory structure Invalid inode number for '.' in directory inode 46948. Fix<y>? yes Directory inode 46948, block 0, offset 12: directory corrupted Salvage<y>? yes Entry 'status_2012-11-26_14h13m41.csv' in /var/log/status_logs (46956) has deleted/unused inode 47075. Clear<y>? yes Entry 'status_2012-11-26_10h42m58.csv.gz' in /var/log/status_logs (46956) has deleted/unused inode 47076. Clear<y>? yes Entry 'status_2012-11-26_11h29m41.csv.gz' in /var/log/status_logs (46956) has deleted/unused inode 47080. Clear<y>? yes Entry 'status_2012-11-26_11h42m13.csv.gz' in /var/log/status_logs (46956) has deleted/unused inode 47081. Clear<y>? yes Entry 'status_2012-11-26_12h07m17.csv.gz' in /var/log/status_logs (46956) has deleted/unused inode 47083. Clear<y>? yes Entry 'status_2012-11-26_12h14m53.csv.gz' in /var/log/status_logs (46956) has deleted/unused inode 47085. Clear<y>? yes Entry 'status_2012-11-26_15h06m49.csv' in /var/log/status_logs (46956) has deleted/unused inode 47088. Clear<y>? yes Entry 'status_2012-11-20_14h50m09.csv' in /var/log/status_logs (46956) has deleted/unused inode 47073. Clear<y>? yes Entry 'status_2012-11-20_14h55m32.csv' in /var/log/status_logs (46956) has deleted/unused inode 47074. Clear<y>? yes Entry 'status_2012-11-26_11h04m36.csv.gz' in /var/log/status_logs (46956) has deleted/unused inode 47078. Clear<y>? yes Entry 'status_2012-11-26_11h54m45.csv.gz' in /var/log/status_logs (46956) has deleted/unused inode 47082. Clear<y>? yes Entry 'status_2012-11-26_12h12m20.csv.gz' in /var/log/status_logs (46956) has deleted/unused inode 47084. Clear<y>? yes Entry 'status_2012-11-26_12h33m52.csv.gz' in /var/log/status_logs (46956) has deleted/unused inode 47086. Clear<y>? yes Entry 'status_2012-11-26_10h51m59.csv.gz' in /var/log/status_logs (46956) has deleted/unused inode 47077. Clear<y>? yes Entry 'status_2012-11-26_11h17m09.csv.gz' in /var/log/status_logs (46956) has deleted/unused inode 47079. Clear<y>? yes Entry 'status_2012-11-26_12h54m11.csv.gz' in /var/log/status_logs (46956) has deleted/unused inode 47087. Clear<y>? yes Pass 3: Checking directory connectivity '..' in /etc/network/run (46948) is <The NULL inode> (0), should be /etc/network (46953). Fix<y>? yes Couldn't fix parent of inode 46948: Couldn't find parent directory entry Pass 4: Checking reference counts Unattached inode 46945 Connect to /lost+found<y>? yes Inode 46945 ref count is 2, should be 1. Fix<y>? yes Inode 46953 ref count is 5, should be 4. Fix<y>? yes Pass 5: Checking group summary information Block bitmap differences: -(208264--208266) -(210062--210068) -(211343--211491) -(213241--213250) -(213344--213393) -213397 -(213457--213463) -(213516--213521) -(213628--213655) -(213683--213688) -(213709--213728) -(215265--215300) -(215346--215365) -(221541--221551) -(221696--221704) -227517 Fix<y>? yes Free blocks count wrong for group #6 (17247, counted=17611). Fix<y>? yes Free blocks count wrong (161691, counted=162055). Fix<y>? yes Inode bitmap differences: +(47089--47090) +47093 +47095 +(47097--47099) +(47101--47104) -(47219--47220) -47222 -47224 -47228 -47231 -(47347--47348) -47350 -47352 -47356 -47359 -(47457--47488) -47985 -47996 -(47999--48000) -48017 -(48027--48028) -(48030--48032) -48049 -(48059--48060) -(48062--48064) -48081 -(48091--48092) -(48094--48096) Fix<y>? yes Free inodes count wrong for group #6 (7608, counted=7624). Fix<y>? yes Free inodes count wrong (61919, counted=61935). Fix<y>? yes embeddedrootwrite: ***** FILE SYSTEM WAS MODIFIED ***** embeddedrootwrite: ********** WARNING: Filesystem still has errors ********** embeddedrootwrite: 657/62592 files (24.4% non-contiguous), 87882/249937 blocks Embedded-PC-failsafe:~# Embedded-PC-failsafe:~# e2fsck /dev/sda3 e2fsck 1.41.3 (12-Oct-2008) embeddedrootwrite contains a file system with errors, check forced. Pass 1: Checking inodes, blocks, and sizes Pass 2: Checking directory structure Directory entry for '.' in ... (46948) is big. Split<y>? yes Missing '..' in directory inode 46948. Fix<y>? yes Setting filetype for entry '..' in ... (46948) to 2. Pass 3: Checking directory connectivity '..' in /etc/network/run (46948) is <The NULL inode> (0), should be /etc/network (46953). Fix<y>? yes Pass 4: Checking reference counts Inode 2 ref count is 12, should be 13. Fix<y>? yes Pass 5: Checking group summary information embeddedrootwrite: ***** FILE SYSTEM WAS MODIFIED ***** embeddedrootwrite: 657/62592 files (24.4% non-contiguous), 87882/249937 blocks Embedded-PC-failsafe:~# Embedded-PC-failsafe:~# e2fsck /dev/sda3 e2fsck 1.41.3 (12-Oct-2008) embeddedrootwrite: clean, 657/62592 files, 87882/249937 blocks

    Read the article

  • Does anyone really understand how HFSC scheduling in Linux/BSD works?

    - by Mecki
    I read the original SIGCOMM '97 PostScript paper about HFSC, it is very technically, but I understand the basic concept. Instead of giving a linear service curve (as with pretty much every other scheduling algorithm), you can specify a convex or concave service curve and thus it is possible to decouple bandwidth and delay. However, even though this paper mentions to kind of scheduling algorithms being used (real-time and link-share), it always only mentions ONE curve per scheduling class (the decoupling is done by specifying this curve, only one curve is needed for that). Now HFSC has been implemented for BSD (OpenBSD, FreeBSD, etc.) using the ALTQ scheduling framework and it has been implemented Linux using the TC scheduling framework (part of iproute2). Both implementations added two additional service curves, that were NOT in the original paper! A real-time service curve and an upper-limit service curve. Again, please note that the original paper mentions two scheduling algorithms (real-time and link-share), but in that paper both work with one single service curve. There never have been two independent service curves for either one as you currently find in BSD and Linux. Even worse, some version of ALTQ seems to add an additional queue priority to HSFC (there is no such thing as priority in the original paper either). I found several BSD HowTo's mentioning this priority setting (even though the man page of the latest ALTQ release knows no such parameter for HSFC, so officially it does not even exist). This all makes the HFSC scheduling even more complex than the algorithm described in the original paper and there are tons of tutorials on the Internet that often contradict each other, one claiming the opposite of the other one. This is probably the main reason why nobody really seems to understand how HFSC scheduling really works. Before I can ask my questions, we need a sample setup of some kind. I'll use a very simple one as seen in the image below: Here are some questions I cannot answer because the tutorials contradict each other: What for do I need a real-time curve at all? Assuming A1, A2, B1, B2 are all 128 kbit/s link-share (no real-time curve for either one), then each of those will get 128 kbit/s if the root has 512 kbit/s to distribute (and A and B are both 256 kbit/s of course), right? Why would I additionally give A1 and B1 a real-time curve with 128 kbit/s? What would this be good for? To give those two a higher priority? According to original paper I can give them a higher priority by using a curve, that's what HFSC is all about after all. By giving both classes a curve of [256kbit/s 20ms 128kbit/s] both have twice the priority than A2 and B2 automatically (still only getting 128 kbit/s on average) Does the real-time bandwidth count towards the link-share bandwidth? E.g. if A1 and B1 both only have 64kbit/s real-time and 64kbit/s link-share bandwidth, does that mean once they are served 64kbit/s via real-time, their link-share requirement is satisfied as well (they might get excess bandwidth, but lets ignore that for a second) or does that mean they get another 64 kbit/s via link-share? So does each class has a bandwidth "requirement" of real-time plus link-share? Or does a class only have a higher requirement than the real-time curve if the link-share curve is higher than the real-time curve (current link-share requirement equals specified link-share requirement minus real-time bandwidth already provided to this class)? Is upper limit curve applied to real-time as well, only to link-share, or maybe to both? Some tutorials say one way, some say the other way. Some even claim upper-limit is the maximum for real-time bandwidth + link-share bandwidth? What is the truth? Assuming A2 and B2 are both 128 kbit/s, does it make any difference if A1 and B1 are 128 kbit/s link-share only, or 64 kbit/s real-time and 128 kbit/s link-share, and if so, what difference? If I use the seperate real-time curve to increase priorities of classes, why would I need "curves" at all? Why is not real-time a flat value and link-share also a flat value? Why are both curves? The need for curves is clear in the original paper, because there is only one attribute of that kind per class. But now, having three attributes (real-time, link-share, and upper-limit) what for do I still need curves on each one? Why would I want the curves shape (not average bandwidth, but their slopes) to be different for real-time and link-share traffic? According to the little documentation available, real-time curve values are totally ignored for inner classes (class A and B), they are only applied to leaf classes (A1, A2, B1, B2). If that is true, why does the ALTQ HFSC sample configuration (search for 3.3 Sample configuration) set real-time curves on inner classes and claims that those set the guaranteed rate of those inner classes? Isn't that completely pointless? (note: pshare sets the link-share curve in ALTQ and grate the real-time curve; you can see this in the paragraph above the sample configuration). Some tutorials say the sum of all real-time curves may not be higher than 80% of the line speed, others say it must not be higher than 70% of the line speed. Which one is right or are they maybe both wrong? One tutorial said you shall forget all the theory. No matter how things really work (schedulers and bandwidth distribution), imagine the three curves according to the following "simplified mind model": real-time is the guaranteed bandwidth that this class will always get. link-share is the bandwidth that this class wants to become fully satisfied, but satisfaction cannot be guaranteed. In case there is excess bandwidth, the class might even get offered more bandwidth than necessary to become satisfied, but it may never use more than upper-limit says. For all this to work, the sum of all real-time bandwidths may not be above xx% of the line speed (see question above, the percentage varies). Question: Is this more or less accurate or a total misunderstanding of HSFC? And if assumption above is really accurate, where is prioritization in that model? E.g. every class might have a real-time bandwidth (guaranteed), a link-share bandwidth (not guaranteed) and an maybe an upper-limit, but still some classes have higher priority needs than other classes. In that case I must still prioritize somehow, even among real-time traffic of those classes. Would I prioritize by the slope of the curves? And if so, which curve? The real-time curve? The link-share curve? The upper-limit curve? All of them? Would I give all of them the same slope or each a different one and how to find out the right slope? I still haven't lost hope that there exists at least a hand full of people in this world that really understood HFSC and are able to answer all these questions accurately. And doing so without contradicting each other in the answers would be really nice ;-)

    Read the article

  • How to get full control of umask/PAM/permissions?

    - by plua
    OUR SITUATION Several people from our company log in to a server and upload files. They all need to be able to upload and overwrite the same files. They have different usernames, but are all part of the same group. However, this is an internet server, so the "other" users should have (in general) just read-only access. So what I want to have is these standard permissions: files: 664 directories: 771 My goal is that all users do not need to worry about permissions. The server should be configured in such a way that these permissions apply to all files and directories, newly created, copied, or over-written. Only when we need some special permissions we'd manually change this. We upload files to the server by SFTP-ing in Nautilus, by mounting the server using sshfs and accessing it in Nautilus as if it were a local folder, and by SCP-ing in the command line. That basically covers our situation and what we aim to do. Now, I have read many things about the beautiful umask functionality. From what I understand umask (together with PAM) should allow me to do exactly what I want: set standard permissions for new files and directories. However, after many many hours of reading and trial-and-error, I still do not get this to work. I get many unexpected results. I really like to get a solid grasp of umask and have many question unanswered. I will post these questions below, together with my findings and an explanation of my trials that led to these questions. Given that many things appear to go wrong, I think that I am doing several things wrong. So therefore, there are many questions. NOTE: I am using Ubuntu 9.10 and therefore can not change the sshd_config to set the umask for the SFTP server. Installed SSH OpenSSH_5.1p1 Debian-6ubuntu2 < required OpenSSH 5.4p1. So here go the questions. 1. DO I NEED TO RESTART FOR PAM CHANGS TO TAKE EFFECT? Let's start with this. There were so many files involved and I was unable to figure out what does and what does not affect things, also because I did not know whether or not I have to restart the whole system for PAM changes to take effect. I did do so after not seeing the expected results, but is this really necessary? Or can I just log out from the server and log back in, and should new PAM policies be effective? Or is there some 'PAM' program to reload? 2. IS THERE ONE SINGLE FILE TO CHANGE THAT AFFECTS ALL USERS FOR ALL SESSIONS? So I ended up changing MANY files, as I read MANY different things. I ended up setting the umask in the following files: ~/.profile -> umask=0002 ~/.bashrc -> umask=0002 /etc/profile -> umask=0002 /etc/pam.d/common-session -> umask=0002 /etc/pam.d/sshd -> umask=0002 /etc/pam.d/login -> umask=0002 I want this change to apply to all users, so some sort of system-wide change would be best. Can it be achieved? 3. AFTER ALL, THIS UMASK THING, DOES IT WORK? So after changing umask to 0002 at every possible place, I run tests. ------------SCP----------- TEST 1: scp testfile (which has 777 permissions for testing purposes) server:/home/ testfile 100% 4 0.0KB/s 00:00 Let's check permissions: user@server:/home$ ls -l total 4 -rwx--x--x 1 user uploaders 4 2011-02-05 17:59 testfile (711) ---------SSH------------ TEST 2: ssh server user@server:/home$ touch anotherfile user@server:/home$ ls -l total 4 -rw-rw-r-- 1 user uploaders 0 2011-02-05 18:03 anotherfile (664) --------SFTP----------- Nautilus: sftp://server/home/ Copy and paste newfile from client to server (777 on client) TEST 3: user@server:/home$ ls -l total 4 -rwxrwxrwx 1 user uploaders 3 2011-02-05 18:05 newfile (777) Create a new file through Nautilus. Check file permissions in terminal: TEST 4: user@server:/home$ ls -l total 4 -rw------- 1 user uploaders 0 2011-02-05 18:06 newfile (600) I mean... WHAT just happened here?! We should get 644 every single time. Instead I get 711, 777, 600, and then once 644. And the 644 is only achieved when creating a new, blank file through SSH, which is the least probable scenario. So I am asking, does umask/pam work after all? 4. SO WHAT DOES IT MEAN TO UMASK SSHFS? Sometimes we mount a server locally, using sshfs. Very useful. But again, we have permissions issues. Here is how we mount: sshfs -o idmap=user -o umask=0113 user@server:/home/ /mnt NOTE: we use umask = 113 because apparently, sshfs starts from 777 instead of 666, so with 113 we get 664 which is the desired file permission. But what now happens is that we see all files and directories as if they are 664. We browse in Nautilus to /mnt and: Right click - New File (newfile) --- TEST 5 Right click - New Folder (newfolder) --- TEST 6 Copy and paste a 777 file from our local client --- TEST 7 So let's check on the command line: user@client:/mnt$ ls -l total 8 -rw-rw-r-- 1 user 1007 3 Feb 5 18:05 copyfile (664) -rw-rw-r-- 1 user 1007 0 Feb 5 18:15 newfile (664) drw-rw-r-- 1 user 1007 4096 Feb 5 18:15 newfolder (664) But hey, let's check this same folder on the server-side: user@server:/home$ ls -l total 8 -rwxrwxrwx 1 user uploaders 3 2011-02-05 18:05 copyfile (777) -rw------- 1 user uploaders 0 2011-02-05 18:15 newfile (600) drwx--x--x 2 user uploaders 4096 2011-02-05 18:15 newfolder (711) What?! The REAL file permissions are very different from what we see in Nautilus. So does this umask on sshfs just create a 'filter' that shows unreal file permissions? And I tried to open a file from another user but the same group that had real 600 permissions but 644 'fake' permissions, and I could still not read this, so what good is this filter?? 5. UMASK IS ALL ABOUT FILES. BUT WHAT ABOUT DIRECTORIES? From my tests I can see that the umask that is being applied also somehow influences the directory permissions. However, I want my files to be 664 (002) and my directories to be 771 (006). So is it possible to have a different umask for directories? 6. PERHAPS UMASK/PAM IS REALLY COOL, BUT UBUNTU IS JUST BUGGY? On the one hand, I have read topics of people that have had success with PAM/UMASK and Ubuntu. On the other hand, I have found many older and newer bugs regarding umask/PAM/fuse on Ubuntu: https://bugs.launchpad.net/ubuntu/+source/gdm/+bug/241198 https://bugs.launchpad.net/ubuntu/+source/fuse/+bug/239792 https://bugs.launchpad.net/ubuntu/+source/pam/+bug/253096 https://bugs.launchpad.net/ubuntu/+source/sudo/+bug/549172 http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=314796 So I do not know what to believe anymore. Should I just give up? Would ACL solve all my problems? Or do I have again problems using Ubuntu? One word of caution with backups using tar. Red Hat /Centos distributions support acls in the tar program but Ubuntu does not support acls when backing up. This means that all acls will be lost when you create a backup. I am very willing to upgrade to Ubuntu 10.04 if that would solve my problems too, but first I want to understand what is happening.

    Read the article

  • Does anyone really understand how HFSC scheduling in Linux/BSD works?

    - by Mecki
    I read the original SIGCOMM '97 PostScript paper about HFSC, it is very technically, but I understand the basic concept. Instead of giving a linear service curve (as with pretty much every other scheduling algorithm), you can specify a convex or concave service curve and thus it is possible to decouple bandwidth and delay. However, even though this paper mentions to kind of scheduling algorithms being used (real-time and link-share), it always only mentions ONE curve per scheduling class (the decoupling is done by specifying this curve, only one curve is needed for that). Now HFSC has been implemented for BSD (OpenBSD, FreeBSD, etc.) using the ALTQ scheduling framework and it has been implemented Linux using the TC scheduling framework (part of iproute2). Both implementations added two additional service curves, that were NOT in the original paper! A real-time service curve and an upper-limit service curve. Again, please note that the original paper mentions two scheduling algorithms (real-time and link-share), but in that paper both work with one single service curve. There never have been two independent service curves for either one as you currently find in BSD and Linux. Even worse, some version of ALTQ seems to add an additional queue priority to HSFC (there is no such thing as priority in the original paper either). I found several BSD HowTo's mentioning this priority setting (even though the man page of the latest ALTQ release knows no such parameter for HSFC, so officially it does not even exist). This all makes the HFSC scheduling even more complex than the algorithm described in the original paper and there are tons of tutorials on the Internet that often contradict each other, one claiming the opposite of the other one. This is probably the main reason why nobody really seems to understand how HFSC scheduling really works. Before I can ask my questions, we need a sample setup of some kind. I'll use a very simple one as seen in the image below: Here are some questions I cannot answer because the tutorials contradict each other: What for do I need a real-time curve at all? Assuming A1, A2, B1, B2 are all 128 kbit/s link-share (no real-time curve for either one), then each of those will get 128 kbit/s if the root has 512 kbit/s to distribute (and A and B are both 256 kbit/s of course), right? Why would I additionally give A1 and B1 a real-time curve with 128 kbit/s? What would this be good for? To give those two a higher priority? According to original paper I can give them a higher priority by using a curve, that's what HFSC is all about after all. By giving both classes a curve of [256kbit/s 20ms 128kbit/s] both have twice the priority than A2 and B2 automatically (still only getting 128 kbit/s on average) Does the real-time bandwidth count towards the link-share bandwidth? E.g. if A1 and B1 both only have 64kbit/s real-time and 64kbit/s link-share bandwidth, does that mean once they are served 64kbit/s via real-time, their link-share requirement is satisfied as well (they might get excess bandwidth, but lets ignore that for a second) or does that mean they get another 64 kbit/s via link-share? So does each class has a bandwidth "requirement" of real-time plus link-share? Or does a class only have a higher requirement than the real-time curve if the link-share curve is higher than the real-time curve (current link-share requirement equals specified link-share requirement minus real-time bandwidth already provided to this class)? Is upper limit curve applied to real-time as well, only to link-share, or maybe to both? Some tutorials say one way, some say the other way. Some even claim upper-limit is the maximum for real-time bandwidth + link-share bandwidth? What is the truth? Assuming A2 and B2 are both 128 kbit/s, does it make any difference if A1 and B1 are 128 kbit/s link-share only, or 64 kbit/s real-time and 128 kbit/s link-share, and if so, what difference? If I use the seperate real-time curve to increase priorities of classes, why would I need "curves" at all? Why is not real-time a flat value and link-share also a flat value? Why are both curves? The need for curves is clear in the original paper, because there is only one attribute of that kind per class. But now, having three attributes (real-time, link-share, and upper-limit) what for do I still need curves on each one? Why would I want the curves shape (not average bandwidth, but their slopes) to be different for real-time and link-share traffic? According to the little documentation available, real-time curve values are totally ignored for inner classes (class A and B), they are only applied to leaf classes (A1, A2, B1, B2). If that is true, why does the ALTQ HFSC sample configuration (search for 3.3 Sample configuration) set real-time curves on inner classes and claims that those set the guaranteed rate of those inner classes? Isn't that completely pointless? (note: pshare sets the link-share curve in ALTQ and grate the real-time curve; you can see this in the paragraph above the sample configuration). Some tutorials say the sum of all real-time curves may not be higher than 80% of the line speed, others say it must not be higher than 70% of the line speed. Which one is right or are they maybe both wrong? One tutorial said you shall forget all the theory. No matter how things really work (schedulers and bandwidth distribution), imagine the three curves according to the following "simplified mind model": real-time is the guaranteed bandwidth that this class will always get. link-share is the bandwidth that this class wants to become fully satisfied, but satisfaction cannot be guaranteed. In case there is excess bandwidth, the class might even get offered more bandwidth than necessary to become satisfied, but it may never use more than upper-limit says. For all this to work, the sum of all real-time bandwidths may not be above xx% of the line speed (see question above, the percentage varies). Question: Is this more or less accurate or a total misunderstanding of HSFC? And if assumption above is really accurate, where is prioritization in that model? E.g. every class might have a real-time bandwidth (guaranteed), a link-share bandwidth (not guaranteed) and an maybe an upper-limit, but still some classes have higher priority needs than other classes. In that case I must still prioritize somehow, even among real-time traffic of those classes. Would I prioritize by the slope of the curves? And if so, which curve? The real-time curve? The link-share curve? The upper-limit curve? All of them? Would I give all of them the same slope or each a different one and how to find out the right slope? I still haven't lost hope that there exists at least a hand full of people in this world that really understood HFSC and are able to answer all these questions accurately. And doing so without contradicting each other in the answers would be really nice ;-)

    Read the article

  • Windows periodically disconnects, reconnects to the network

    - by einpoklum
    My setup: I have a PC with a Gigabyte GA-MA78S2H motherboard (Realtek Gigabit wired Ethernet on-board). I have the latest drivers (at least the latest driver for the NIC. I'm connecting via an Edimax BR-6216Mg (again, wired connection). For some reason I experience short periodic disconnects and reconnects. Specifically, Skype disconnects, tries to connect, succeeds after a short while; incoming SFTP sessions get dropped; using a browser, I sometime get stuck in the DNS lookup or connection to the website and a page won't load. A couple of seconds later, a reload works. All this happens with Windows XP SP3. With Windows 7, it also happens. (When I initially wrote this question I didn't notice it.) ipconfig for my adapter: Ethernet adapter Local Area Connection: Connection-specific DNS Suffix . : Description . . . . . . . . . . . : Realtek PCIe GBE Family Controller Physical Address. . . . . . . . . : 00-1D-7D-E9-72-9E Dhcp Enabled. . . . . . . . . . . : Yes Autoconfiguration Enabled . . . . : Yes IP Address. . . . . . . . . . . . : 192.168.0.2 Subnet Mask . . . . . . . . . . . : 255.255.255.0 Default Gateway . . . . . . . . . : 192.168.0.254 DHCP Server . . . . . . . . . . . : 192.168.0.254 DNS Servers . . . . . . . . . . . : 192.117.235.235 62.219.186.7 Lease Obtained. . . . . . . . . . : Saturday, March 10, 2012 8:28:20 AM Lease Expires . . . . . . . . . . : Friday, January 26, 1906 2:00:04 AM A result of some tests a couple of the disconnects: C:\Documents and Settings\eyalroz.BAKNUNIN>nslookup google.com DNS request timed out. timeout was 2 seconds. *** Can't find server name for address 192.117.235.235: Timed out DNS request timed out. timeout was 2 seconds. *** Can't find server name for address 62.219.186.7: Timed out *** Default servers are not available Server: UnKnown Address: 192.117.235.235 DNS request timed out. timeout was 2 seconds. DNS request timed out. timeout was 2 seconds. *** Request to UnKnown timed-out C:\Documents and Settings\eyalroz.BAKNUNIN>ping 194.90.1.5 Pinging 194.90.1.5 with 32 bytes of data: Control-C ^C C:\Documents and Settings\eyalroz.BAKNUNIN>tracert -d 194.90.1.5 Tracing route to 194.90.1.5 over a maximum of 30 hops 1 <1 ms <1 ms <1 ms 192.168.0.254 2 * * 11 ms 10.168.128.1 3 14 ms 13 ms 14 ms 212.179.160.142 4 * * * Request timed out. 5 * * * Request timed out. 6 * * 47 ms 62.219.189.169 7 31 ms 27 ms 32 ms 62.219.189.150 8 15 ms 14 ms 16 ms 192.114.65.202 9 15 ms 15 ms 11 ms 212.143.10.66 10 13 ms 29 ms 31 ms 212.143.12.234 11 35 ms 15 ms 18 ms 212.143.8.72 12 22 ms 22 ms 16 ms 194.90.1.5 I usually ping 194.90.1.5 (which is not at my ISP) with 15ms response time and no losses. Things I've done/tried: [2012-03-26] I replaced the cable; I thought that made a difference, but the disconnects were back a while later, so that wasn't it. Updated the NIC driver. Tried reducing the MTU (used a utility called Dr. TCP); there was no effect. I updated my board BIOS revision (which caused all the HW to be "reinstalled" or re-identified - successfully). I installed another NIC, and tried switching to it - same effect with the on-board NIC. A while back I tried another router (although it was an Edimax model) - same problem. Connected the computer directly, with no router. Same problem. ping -t to the router (192.168.0.254) gives pongs, nothing is lost, and time is < 1 ms almost always (sometimes it says 1 or 2 ms). This is the case also during the disconnects.

    Read the article

  • How to set up linux watchdog daemon with Intel 6300esb

    - by ACiD GRiM
    I've been searching for this on Google for sometime now and I have yet to find proper documentation on how to connect the kernel driver for my 6300esb watchdog timer to /dev/watchdog and ensure that watchdog daemon is keeping it alive. I am using RHEL compatible Scientific Linux 6.3 in a KVM virtual machine by the way Below is everything I've tried so far: dmesg|grep 6300 i6300ESB timer: Intel 6300ESB WatchDog Timer Driver v0.04 i6300ESB timer: initialized (0xffffc900008b8000). heartbeat=30 sec (nowayout=0) | ll /dev/watchdog crw-rw----. 1 root root 10, 130 Sep 22 22:25 /dev/watchdog | /etc/watchdog.conf #ping = 172.31.14.1 #ping = 172.26.1.255 #interface = eth0 file = /var/log/messages #change = 1407 # Uncomment to enable test. Setting one of these values to '0' disables it. # These values will hopefully never reboot your machine during normal use # (if your machine is really hung, the loadavg will go much higher than 25) max-load-1 = 24 max-load-5 = 18 max-load-15 = 12 # Note that this is the number of pages! # To get the real size, check how large the pagesize is on your machine. #min-memory = 1 #repair-binary = /usr/sbin/repair #test-binary = #test-timeout = watchdog-device = /dev/watchdog # Defaults compiled into the binary #temperature-device = #max-temperature = 120 # Defaults compiled into the binary #admin = root interval = 10 #logtick = 1 # This greatly decreases the chance that watchdog won't be scheduled before # your machine is really loaded realtime = yes priority = 1 # Check if syslogd is still running by enabling the following line #pidfile = /var/run/syslogd.pid Now maybe I'm not testing it correctly, but I would expecting that stopping the watchdog service would cause the /dev/watchdog to time out after 30 seconds and I should see the host reboot, however this does not happen. Also, here is my config for the KVM vm <!-- WARNING: THIS IS AN AUTO-GENERATED FILE. CHANGES TO IT ARE LIKELY TO BE OVERWRITTEN AND LOST. Changes to this xml configuration should be made using: virsh edit sl6template or other application using the libvirt API. --> <domain type='kvm'> <name>sl6template</name> <uuid>960d0ac2-2e6a-5efa-87a3-6bb779e15b6a</uuid> <memory unit='KiB'>262144</memory> <currentMemory unit='KiB'>262144</currentMemory> <vcpu placement='static'>1</vcpu> <os> <type arch='x86_64' machine='rhel6.3.0'>hvm</type> <boot dev='hd'/> </os> <features> <acpi/> <apic/> <pae/> </features> <cpu mode='custom' match='exact'> <model fallback='allow'>Westmere</model> <vendor>Intel</vendor> <feature policy='require' name='tm2'/> <feature policy='require' name='est'/> <feature policy='require' name='vmx'/> <feature policy='require' name='ds'/> <feature policy='require' name='smx'/> <feature policy='require' name='ss'/> <feature policy='require' name='vme'/> <feature policy='require' name='dtes64'/> <feature policy='require' name='rdtscp'/> <feature policy='require' name='ht'/> <feature policy='require' name='dca'/> <feature policy='require' name='pbe'/> <feature policy='require' name='tm'/> <feature policy='require' name='pdcm'/> <feature policy='require' name='pdpe1gb'/> <feature policy='require' name='ds_cpl'/> <feature policy='require' name='pclmuldq'/> <feature policy='require' name='xtpr'/> <feature policy='require' name='acpi'/> <feature policy='require' name='monitor'/> <feature policy='require' name='aes'/> </cpu> <clock offset='utc'/> <on_poweroff>destroy</on_poweroff> <on_reboot>restart</on_reboot> <on_crash>restart</on_crash> <devices> <emulator>/usr/libexec/qemu-kvm</emulator> <disk type='file' device='disk'> <driver name='qemu' type='raw'/> <source file='/mnt/data/vms/sl6template.img'/> <target dev='vda' bus='virtio'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/> </disk> <controller type='usb' index='0'> <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/> </controller> <interface type='bridge'> <mac address='52:54:00:44:57:f6'/> <source bridge='br0.2'/> <model type='virtio'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/> </interface> <interface type='bridge'> <mac address='52:54:00:88:0f:42'/> <source bridge='br1'/> <model type='virtio'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/> </interface> <serial type='pty'> <target port='0'/> </serial> <console type='pty'> <target type='serial' port='0'/> </console> <watchdog model='i6300esb' action='reset'> <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/> </watchdog> <memballoon model='virtio'> <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/> </memballoon> </devices> </domain> Any help is appreciated as the most I've found are patches to kvm and general softdog documentation or IPMI watchdog answers.

    Read the article

  • VPN still working after rebooting without client - DrayTek client shows "No Connection"

    - by HeavenCore
    My home network is a simple router + pc's setup, nothing fancy - the router has DHCP enabled for 192.168.0.X (255.255.255.0) and my PC picks up the address 192.168.0.82. There are no devices on my local lan in the 192.168.1.x range. On my pc i have the DrayTek VPN client, and a company i do some work for has a DrayTek Vigor router. The VPN client establishes a VPN to that remote company using an IPSec Tunnel (PreShared Key - no encryption) Last night i shut down my pc with the VPN tunnel still connected, when i turned my computer on this morning i accidentally clicked an RDP shortcut to 192.168.1.2 (a host in the remote company) and to my amazement it connected?!? I checked and the DrayTek VPN client isnt running, and when i did run it, it clearly shows "Status: No connection". confused as to how my machine can still talk to this remote machine i tried a trace: C:\Users\HeavenCore>tracert 192.168.1.2 Tracing route to C4SERVERII [192.168.1.2] over a maximum of 30 hops: 1 * * * Request timed out. 2 * * * Request timed out. 3 * * * Request timed out. 4 * * * Request timed out. 5 * * * Request timed out. 6 * * * Request timed out. 7 * * * Request timed out. 8 * * * Request timed out. 9 * * * Request timed out. 10 * * * Request timed out. 11 * * * Request timed out. 12 15 ms 21 ms 32 ms C4SERVERII [192.168.1.2] Trace complete. No indication there as to how it's getting from my network to the remote host. with my network mask being 255.255.255.0 with ip 192.168.0.1 i dont even see how packets are routing to 192.168.1.1 - unless there was a static route in place, so i checked the route table: IPv4 Route Table =========================================================================== Active Routes: Network Destination Netmask Gateway Interface Metric 0.0.0.0 0.0.0.0 192.168.0.1 192.168.0.82 266 127.0.0.0 255.0.0.0 On-link 127.0.0.1 306 127.0.0.1 255.255.255.255 On-link 127.0.0.1 306 127.255.255.255 255.255.255.255 On-link 127.0.0.1 306 192.168.0.0 255.255.255.0 On-link 192.168.0.82 266 192.168.0.82 255.255.255.255 On-link 192.168.0.82 266 192.168.0.255 255.255.255.255 On-link 192.168.0.82 266 224.0.0.0 240.0.0.0 On-link 127.0.0.1 306 224.0.0.0 240.0.0.0 On-link 192.168.0.82 266 255.255.255.255 255.255.255.255 On-link 127.0.0.1 306 255.255.255.255 255.255.255.255 On-link 192.168.0.82 266 =========================================================================== Persistent Routes: Network Address Netmask Gateway Address Metric 0.0.0.0 0.0.0.0 192.168.0.1 Default =========================================================================== As far as i can see, nothing indicating how my packets are getting to 192.168.1.2??? To confirm i was on a different subnet i did an ipconfig /all: Ethernet adapter Local Area Connection: Connection-specific DNS Suffix . : Description . . . . . . . . . . . : Marvell Yukon 88E8056 PCI-E Gigabit Ether net Controller Physical Address. . . . . . . . . : 00-23-54-F3-4E-BA DHCP Enabled. . . . . . . . . . . : No Autoconfiguration Enabled . . . . : Yes IPv4 Address. . . . . . . . . . . : 192.168.0.82(Preferred) Subnet Mask . . . . . . . . . . . : 255.255.255.0 Default Gateway . . . . . . . . . : 192.168.0.1 DNS Servers . . . . . . . . . . . : 192.168.0.1 208.67.222.222 NetBIOS over Tcpip. . . . . . . . : Enabled Yet straight after confirming my ip and subnet as above i can go ahead and ping the remote machine: C:\Users\HeavenCore>ping 192.168.1.2 Pinging 192.168.1.2 with 32 bytes of data: Reply from 192.168.1.2: bytes=32 time=48ms TTL=127 Reply from 192.168.1.2: bytes=32 time=23ms TTL=127 Reply from 192.168.1.2: bytes=32 time=103ms TTL=127 Reply from 192.168.1.2: bytes=32 time=25ms TTL=127 Ping statistics for 192.168.1.2: Packets: Sent = 4, Received = 4, Lost = 0 (0% loss), Approximate round trip times in milli-seconds: Minimum = 23ms, Maximum = 103ms, Average = 49ms Also, note on the ping how the times are 35ms ish, this clearly shows the pings are to the remote host and not something on my local lan (all stuff on my local lan pings in 0ms) - plus i verified the host was actually the host via RDP. My Question: Can an IPSec tunnel stay up some how after a reboot without use of the VPN client? (well, i can clearly see that it can) - where in windows is there visibility of this? how does my machine know where to route the packets? I appreciate any insights & thoughts!

    Read the article

  • Oracle Coherence, Split-Brain and Recovery Protocols In Detail

    - by Ricardo Ferreira
    This article provides a high level conceptual overview of Split-Brain scenarios in distributed systems. It will focus on a specific example of cluster communication failure and recovery in Oracle Coherence. This includes a discussion on the witness protocol (used to remove failed cluster members) and the panic protocol (used to resolve Split-Brain scenarios). Note that the removal of cluster members does not necessarily indicate a Split-Brain condition. Oracle Coherence does not (and cannot) detect a Split-Brain as it occurs, the condition is only detected when cluster members that previously lost contact with each other regain contact. Cluster Topology and Configuration In order to create an good didactic for the article, let's assume a cluster topology and configuration. In this example we have a six member cluster, consisting of one JVM on each physical machine. The member IDs are as follows: Member ID  IP Address  1  10.149.155.76  2  10.149.155.77  3  10.149.155.236  4  10.149.155.75  5  10.149.155.79  6  10.149.155.78 Members 1, 2, and 3 are connected to a switch, and members 4, 5, and 6 are connected to a second switch. There is a link between the two switches, which provides network connectivity between all of the machines. Member 1 is the first member to join this cluster, thus making it the senior member. Member 6 is the last member to join this cluster. Here is a log snippet from Member 6 showing the complete member set: 2010-02-26 15:27:57.390/3.062 Oracle Coherence GE 3.5.3/465p2 <Info> (thread=main, member=6): Started DefaultCacheServer... SafeCluster: Name=cluster:0xDDEB Group{Address=224.3.5.3, Port=35465, TTL=4} MasterMemberSet ( ThisMember=Member(Id=6, Timestamp=2010-02-26 15:27:58.635, Address=10.149.155.78:8088, MachineId=1102, Location=process:228, Role=CoherenceServer) OldestMember=Member(Id=1, Timestamp=2010-02-26 15:27:06.931, Address=10.149.155.76:8088, MachineId=1100, Location=site:usdhcp.oraclecorp.com,machine:dhcp-burlington6-4fl-east-10-149,process:511, Role=CoherenceServer) ActualMemberSet=MemberSet(Size=6, BitSetCount=2 Member(Id=1, Timestamp=2010-02-26 15:27:06.931, Address=10.149.155.76:8088, MachineId=1100, Location=site:usdhcp.oraclecorp.com,machine:dhcp-burlington6-4fl-east-10-149,process:511, Role=CoherenceServer) Member(Id=2, Timestamp=2010-02-26 15:27:17.847, Address=10.149.155.77:8088, MachineId=1101, Location=site:usdhcp.oraclecorp.com,machine:dhcp-burlington6-4fl-east-10-149,process:296, Role=CoherenceServer) Member(Id=3, Timestamp=2010-02-26 15:27:24.892, Address=10.149.155.236:8088, MachineId=1260, Location=site:usdhcp.oraclecorp.com,machine:dhcp-burlington6-4fl-east-10-149,process:32459, Role=CoherenceServer) Member(Id=4, Timestamp=2010-02-26 15:27:39.574, Address=10.149.155.75:8088, MachineId=1099, Location=process:800, Role=CoherenceServer) Member(Id=5, Timestamp=2010-02-26 15:27:49.095, Address=10.149.155.79:8088, MachineId=1103, Location=site:usdhcp.oraclecorp.com,machine:dhcp-burlington6-4fl-east-10-149,process:3229, Role=CoherenceServer) Member(Id=6, Timestamp=2010-02-26 15:27:58.635, Address=10.149.155.78:8088, MachineId=1102, Location=process:228, Role=CoherenceServer) ) RecycleMillis=120000 RecycleSet=MemberSet(Size=0, BitSetCount=0 ) ) At approximately 15:30, the connection between the two switches is severed: Thirty seconds later (the default packet timeout in development mode) the logs indicate communication failures across the cluster. In this example, the communication failure was caused by a network failure. In a production setting, this type of communication failure can have many root causes, including (but not limited to) network failures, excessive GC, high CPU utilization, swapping/virtual memory, and exceeding maximum network bandwidth. In addition, this type of failure is not necessarily indicative of a split brain. Any communication failure will be logged in this fashion. Member 2 logs a communication failure with Member 5: 2010-02-26 15:30:32.638/196.928 Oracle Coherence GE 3.5.3/465p2 <Warning> (thread=PacketPublisher, member=2): Timeout while delivering a packet; requesting the departure confirmation for Member(Id=5, Timestamp=2010-02-26 15:27:49.095, Address=10.149.155.79:8088, MachineId=1103, Location=site:usdhcp.oraclecorp.com,machine:dhcp-burlington6-4fl-east-10-149,process:3229, Role=CoherenceServer) by MemberSet(Size=2, BitSetCount=2 Member(Id=1, Timestamp=2010-02-26 15:27:06.931, Address=10.149.155.76:8088, MachineId=1100, Location=site:usdhcp.oraclecorp.com,machine:dhcp-burlington6-4fl-east-10-149,process:511, Role=CoherenceServer) Member(Id=4, Timestamp=2010-02-26 15:27:39.574, Address=10.149.155.75:8088, MachineId=1099, Location=process:800, Role=CoherenceServer) ) The Coherence clustering protocol (TCMP) is a reliable transport mechanism built on UDP. In order for the protocol to be reliable, it requires an acknowledgement (ACK) for each packet delivered. If a packet fails to be acknowledged within the configured timeout period, the Coherence cluster member will log a packet timeout (as seen in the log message above). When this occurs, the cluster member will consult with other members to determine who is at fault for the communication failure. If the witness members agree that the suspect member is at fault, the suspect is removed from the cluster. If the witnesses unanimously disagree, the accuser is removed. This process is known as the witness protocol. Since Member 2 cannot communicate with Member 5, it selects two witnesses (Members 1 and 4) to determine if the communication issue is with Member 5 or with itself (Member 2). However, Member 4 is on the switch that is no longer accessible by Members 1, 2 and 3; thus a packet timeout for member 4 is recorded as well: 2010-02-26 15:30:35.648/199.938 Oracle Coherence GE 3.5.3/465p2 <Warning> (thread=PacketPublisher, member=2): Timeout while delivering a packet; requesting the departure confirmation for Member(Id=4, Timestamp=2010-02-26 15:27:39.574, Address=10.149.155.75:8088, MachineId=1099, Location=process:800, Role=CoherenceServer) by MemberSet(Size=2, BitSetCount=2 Member(Id=1, Timestamp=2010-02-26 15:27:06.931, Address=10.149.155.76:8088, MachineId=1100, Location=site:usdhcp.oraclecorp.com,machine:dhcp-burlington6-4fl-east-10-149,process:511, Role=CoherenceServer) Member(Id=6, Timestamp=2010-02-26 15:27:58.635, Address=10.149.155.78:8088, MachineId=1102, Location=process:228, Role=CoherenceServer) ) Member 1 has the ability to confirm the departure of member 4, however Member 6 cannot as it is also inaccessible. At the same time, Member 3 sends a request to remove Member 6, which is followed by a report from Member 3 indicating that Member 6 has departed the cluster: 2010-02-26 15:30:35.706/199.996 Oracle Coherence GE 3.5.3/465p2 <D5> (thread=Cluster, member=2): MemberLeft request for Member 6 received from Member(Id=3, Timestamp=2010-02-26 15:27:24.892, Address=10.149.155.236:8088, MachineId=1260, Location=site:usdhcp.oraclecorp.com,machine:dhcp-burlington6-4fl-east-10-149,process:32459, Role=CoherenceServer) 2010-02-26 15:30:35.709/199.999 Oracle Coherence GE 3.5.3/465p2 <D5> (thread=Cluster, member=2): MemberLeft notification for Member 6 received from Member(Id=3, Timestamp=2010-02-26 15:27:24.892, Address=10.149.155.236:8088, MachineId=1260, Location=site:usdhcp.oraclecorp.com,machine:dhcp-burlington6-4fl-east-10-149,process:32459, Role=CoherenceServer) The log for Member 3 determines how Member 6 departed the cluster: 2010-02-26 15:30:35.161/191.694 Oracle Coherence GE 3.5.3/465p2 <Warning> (thread=PacketPublisher, member=3): Timeout while delivering a packet; requesting the departure confirmation for Member(Id=6, Timestamp=2010-02-26 15:27:58.635, Address=10.149.155.78:8088, MachineId=1102, Location=process:228, Role=CoherenceServer) by MemberSet(Size=2, BitSetCount=2 Member(Id=1, Timestamp=2010-02-26 15:27:06.931, Address=10.149.155.76:8088, MachineId=1100, Location=site:usdhcp.oraclecorp.com,machine:dhcp-burlington6-4fl-east-10-149,process:511, Role=CoherenceServer) Member(Id=2, Timestamp=2010-02-26 15:27:17.847, Address=10.149.155.77:8088, MachineId=1101, Location=site:usdhcp.oraclecorp.com,machine:dhcp-burlington6-4fl-east-10-149,process:296, Role=CoherenceServer) ) 2010-02-26 15:30:35.165/191.698 Oracle Coherence GE 3.5.3/465p2 <Info> (thread=Cluster, member=3): Member departure confirmed by MemberSet(Size=2, BitSetCount=2 Member(Id=1, Timestamp=2010-02-26 15:27:06.931, Address=10.149.155.76:8088, MachineId=1100, Location=site:usdhcp.oraclecorp.com,machine:dhcp-burlington6-4fl-east-10-149,process:511, Role=CoherenceServer) Member(Id=2, Timestamp=2010-02-26 15:27:17.847, Address=10.149.155.77:8088, MachineId=1101, Location=site:usdhcp.oraclecorp.com,machine:dhcp-burlington6-4fl-east-10-149,process:296, Role=CoherenceServer) ); removing Member(Id=6, Timestamp=2010-02-26 15:27:58.635, Address=10.149.155.78:8088, MachineId=1102, Location=process:228, Role=CoherenceServer) In this case, Member 3 happened to select two witnesses that it still had connectivity with (Members 1 and 2) thus resulting in a simple decision to remove Member 6. Given the departure of Member 6, Member 2 is left with a single witness to confirm the departure of Member 4: 2010-02-26 15:30:35.713/200.003 Oracle Coherence GE 3.5.3/465p2 <Info> (thread=Cluster, member=2): Member departure confirmed by MemberSet(Size=1, BitSetCount=2 Member(Id=1, Timestamp=2010-02-26 15:27:06.931, Address=10.149.155.76:8088, MachineId=1100, Location=site:usdhcp.oraclecorp.com,machine:dhcp-burlington6-4fl-east-10-149,process:511, Role=CoherenceServer) ); removing Member(Id=4, Timestamp=2010-02-26 15:27:39.574, Address=10.149.155.75:8088, MachineId=1099, Location=process:800, Role=CoherenceServer) In the meantime, Member 4 logs a missing heartbeat from the senior member. This message is also logged on Members 5 and 6. 2010-02-26 15:30:07.906/150.453 Oracle Coherence GE 3.5.3/465p2 <Info> (thread=PacketListenerN, member=4): Scheduled senior member heartbeat is overdue; rejoining multicast group. Next, Member 4 logs a TcpRing failure with Member 2, thus resulting in the termination of Member 2: 2010-02-26 15:30:21.421/163.968 Oracle Coherence GE 3.5.3/465p2 <D4> (thread=Cluster, member=4): TcpRing: Number of socket exceptions exceeded maximum; last was "java.net.SocketTimeoutException: connect timed out"; removing the member: 2 For quick process termination detection, Oracle Coherence utilizes a feature called TcpRing which is a sparse collection of TCP/IP-based connections between different members in the cluster. Each member in the cluster is connected to at least one other member, which (if at all possible) is running on a different physical box. This connection is not used for any data transfer, only heartbeat communications are sent once a second per each link. If a certain number of exceptions are thrown while trying to re-establish a connection, the member throwing the exceptions is removed from the cluster. Member 5 logs a packet timeout with Member 3 and cites witnesses Members 4 and 6: 2010-02-26 15:30:29.791/165.037 Oracle Coherence GE 3.5.3/465p2 <Warning> (thread=PacketPublisher, member=5): Timeout while delivering a packet; requesting the departure confirmation for Member(Id=3, Timestamp=2010-02-26 15:27:24.892, Address=10.149.155.236:8088, MachineId=1260, Location=site:usdhcp.oraclecorp.com,machine:dhcp-burlington6-4fl-east-10-149,process:32459, Role=CoherenceServer) by MemberSet(Size=2, BitSetCount=2 Member(Id=4, Timestamp=2010-02-26 15:27:39.574, Address=10.149.155.75:8088, MachineId=1099, Location=process:800, Role=CoherenceServer) Member(Id=6, Timestamp=2010-02-26 15:27:58.635, Address=10.149.155.78:8088, MachineId=1102, Location=process:228, Role=CoherenceServer) ) 2010-02-26 15:30:29.798/165.044 Oracle Coherence GE 3.5.3/465p2 <Info> (thread=Cluster, member=5): Member departure confirmed by MemberSet(Size=2, BitSetCount=2 Member(Id=4, Timestamp=2010-02-26 15:27:39.574, Address=10.149.155.75:8088, MachineId=1099, Location=process:800, Role=CoherenceServer) Member(Id=6, Timestamp=2010-02-26 15:27:58.635, Address=10.149.155.78:8088, MachineId=1102, Location=process:228, Role=CoherenceServer) ); removing Member(Id=3, Timestamp=2010-02-26 15:27:24.892, Address=10.149.155.236:8088, MachineId=1260, Location=site:usdhcp.oraclecorp.com,machine:dhcp-burlington6-4fl-east-10-149,process:32459, Role=CoherenceServer) Eventually we are left with two distinct clusters consisting of Members 1, 2, 3 and Members 4, 5, 6, respectively. In the latter cluster, Member 4 is promoted to senior member. The connection between the two switches is restored at 15:33. Upon the restoration of the connection, the cluster members immediately receive cluster heartbeats from the two senior members. In the case of Members 1, 2, and 3, the following is logged: 2010-02-26 15:33:14.970/369.066 Oracle Coherence GE 3.5.3/465p2 <Warning> (thread=Cluster, member=1): The member formerly known as Member(Id=4, Timestamp=2010-02-26 15:30:35.341, Address=10.149.155.75:8088, MachineId=1099, Location=process:800, Role=CoherenceServer) has been forcefully evicted from the cluster, but continues to emit a cluster heartbeat; henceforth, the member will be shunned and its messages will be ignored. Likewise for Members 4, 5, and 6: 2010-02-26 15:33:14.343/336.890 Oracle Coherence GE 3.5.3/465p2 <Warning> (thread=Cluster, member=4): The member formerly known as Member(Id=1, Timestamp=2010-02-26 15:30:31.64, Address=10.149.155.76:8088, MachineId=1100, Location=site:usdhcp.oraclecorp.com,machine:dhcp-burlington6-4fl-east-10-149,process:511, Role=CoherenceServer) has been forcefully evicted from the cluster, but continues to emit a cluster heartbeat; henceforth, the member will be shunned and its messages will be ignored. This message indicates that a senior heartbeat is being received from members that were previously removed from the cluster, in other words, something that should not be possible. For this reason, the recipients of these messages will initially ignore them. After several iterations of these messages, the existence of multiple clusters is acknowledged, thus triggering the panic protocol to reconcile this situation. When the presence of more than one cluster (i.e. Split-Brain) is detected by a Coherence member, the panic protocol is invoked in order to resolve the conflicting clusters and consolidate into a single cluster. The protocol consists of the removal of smaller clusters until there is one cluster remaining. In the case of equal size clusters, the one with the older Senior Member will survive. Member 1, being the oldest member, initiates the protocol: 2010-02-26 15:33:45.970/400.066 Oracle Coherence GE 3.5.3/465p2 <Warning> (thread=Cluster, member=1): An existence of a cluster island with senior Member(Id=4, Timestamp=2010-02-26 15:27:39.574, Address=10.149.155.75:8088, MachineId=1099, Location=process:800, Role=CoherenceServer) containing 3 nodes have been detected. Since this Member(Id=1, Timestamp=2010-02-26 15:27:06.931, Address=10.149.155.76:8088, MachineId=1100, Location=site:usdhcp.oraclecorp.com,machine:dhcp-burlington6-4fl-east-10-149,process:511, Role=CoherenceServer) is the senior of an older cluster island, the panic protocol is being activated to stop the other island's senior and all junior nodes that belong to it. Member 3 receives the panic: 2010-02-26 15:33:45.803/382.336 Oracle Coherence GE 3.5.3/465p2 <Error> (thread=Cluster, member=3): Received panic from senior Member(Id=1, Timestamp=2010-02-26 15:27:06.931, Address=10.149.155.76:8088, MachineId=1100, Location=site:usdhcp.oraclecorp.com,machine:dhcp-burlington6-4fl-east-10-149,process:511, Role=CoherenceServer) caused by Member(Id=4, Timestamp=2010-02-26 15:27:39.574, Address=10.149.155.75:8088, MachineId=1099, Location=process:800, Role=CoherenceServer) Member 4, the senior member of the younger cluster, receives the kill message from Member 3: 2010-02-26 15:33:44.921/367.468 Oracle Coherence GE 3.5.3/465p2 <Error> (thread=Cluster, member=4): Received a Kill message from a valid Member(Id=3, Timestamp=2010-02-26 15:27:24.892, Address=10.149.155.236:8088, MachineId=1260, Location=site:usdhcp.oraclecorp.com,machine:dhcp-burlington6-4fl-east-10-149,process:32459, Role=CoherenceServer); stopping cluster service. In turn, Member 4 requests the departure of its junior members 5 and 6: 2010-02-26 15:33:44.921/367.468 Oracle Coherence GE 3.5.3/465p2 <Error> (thread=Cluster, member=4): Received a Kill message from a valid Member(Id=3, Timestamp=2010-02-26 15:27:24.892, Address=10.149.155.236:8088, MachineId=1260, Location=site:usdhcp.oraclecorp.com,machine:dhcp-burlington6-4fl-east-10-149,process:32459, Role=CoherenceServer); stopping cluster service. 2010-02-26 15:33:43.343/349.015 Oracle Coherence GE 3.5.3/465p2 <Error> (thread=Cluster, member=6): Received a Kill message from a valid Member(Id=4, Timestamp=2010-02-26 15:27:39.574, Address=10.149.155.75:8088, MachineId=1099, Location=process:800, Role=CoherenceServer); stopping cluster service. Once Members 4, 5, and 6 restart, they rejoin the original cluster with senior member 1. The log below is from Member 4. Note that it receives a different member id when it rejoins the cluster. 2010-02-26 15:33:44.921/367.468 Oracle Coherence GE 3.5.3/465p2 <Error> (thread=Cluster, member=4): Received a Kill message from a valid Member(Id=3, Timestamp=2010-02-26 15:27:24.892, Address=10.149.155.236:8088, MachineId=1260, Location=site:usdhcp.oraclecorp.com,machine:dhcp-burlington6-4fl-east-10-149,process:32459, Role=CoherenceServer); stopping cluster service. 2010-02-26 15:33:46.921/369.468 Oracle Coherence GE 3.5.3/465p2 <D5> (thread=Cluster, member=4): Service Cluster left the cluster 2010-02-26 15:33:47.046/369.593 Oracle Coherence GE 3.5.3/465p2 <D5> (thread=Invocation:InvocationService, member=4): Service InvocationService left the cluster 2010-02-26 15:33:47.046/369.593 Oracle Coherence GE 3.5.3/465p2 <D5> (thread=OptimisticCache, member=4): Service OptimisticCache left the cluster 2010-02-26 15:33:47.046/369.593 Oracle Coherence GE 3.5.3/465p2 <D5> (thread=ReplicatedCache, member=4): Service ReplicatedCache left the cluster 2010-02-26 15:33:47.046/369.593 Oracle Coherence GE 3.5.3/465p2 <D5> (thread=DistributedCache, member=4): Service DistributedCache left the cluster 2010-02-26 15:33:47.046/369.593 Oracle Coherence GE 3.5.3/465p2 <D5> (thread=Invocation:Management, member=4): Service Management left the cluster 2010-02-26 15:33:47.046/369.593 Oracle Coherence GE 3.5.3/465p2 <D5> (thread=Cluster, member=4): Member 6 left service Management with senior member 5 2010-02-26 15:33:47.046/369.593 Oracle Coherence GE 3.5.3/465p2 <D5> (thread=Cluster, member=4): Member 6 left service DistributedCache with senior member 5 2010-02-26 15:33:47.046/369.593 Oracle Coherence GE 3.5.3/465p2 <D5> (thread=Cluster, member=4): Member 6 left service ReplicatedCache with senior member 5 2010-02-26 15:33:47.046/369.593 Oracle Coherence GE 3.5.3/465p2 <D5> (thread=Cluster, member=4): Member 6 left service OptimisticCache with senior member 5 2010-02-26 15:33:47.046/369.593 Oracle Coherence GE 3.5.3/465p2 <D5> (thread=Cluster, member=4): Member 6 left service InvocationService with senior member 5 2010-02-26 15:33:47.046/369.593 Oracle Coherence GE 3.5.3/465p2 <D5> (thread=Cluster, member=4): Member(Id=6, Timestamp=2010-02-26 15:33:47.046, Address=10.149.155.78:8088, MachineId=1102, Location=process:228, Role=CoherenceServer) left Cluster with senior member 4 2010-02-26 15:33:49.218/371.765 Oracle Coherence GE 3.5.3/465p2 <Info> (thread=main, member=n/a): Restarting cluster 2010-02-26 15:33:49.421/371.968 Oracle Coherence GE 3.5.3/465p2 <D5> (thread=Cluster, member=n/a): Service Cluster joined the cluster with senior service member n/a 2010-02-26 15:33:49.625/372.172 Oracle Coherence GE 3.5.3/465p2 <Info> (thread=Cluster, member=n/a): This Member(Id=5, Timestamp=2010-02-26 15:33:50.499, Address=10.149.155.75:8088, MachineId=1099, Location=process:800, Role=CoherenceServer, Edition=Grid Edition, Mode=Development, CpuCount=2, SocketCount=1) joined cluster "cluster:0xDDEB" with senior Member(Id=1, Timestamp=2010-02-26 15:27:06.931, Address=10.149.155.76:8088, MachineId=1100, Location=site:usdhcp.oraclecorp.com,machine:dhcp-burlington6-4fl-east-10-149,process:511, Role=CoherenceServer, Edition=Grid Edition, Mode=Development, CpuCount=2, SocketCount=2) Cool isn't it?

    Read the article

  • A Taxonomy of Numerical Methods v1

    - by JoshReuben
    Numerical Analysis – When, What, (but not how) Once you understand the Math & know C++, Numerical Methods are basically blocks of iterative & conditional math code. I found the real trick was seeing the forest for the trees – knowing which method to use for which situation. Its pretty easy to get lost in the details – so I’ve tried to organize these methods in a way that I can quickly look this up. I’ve included links to detailed explanations and to C++ code examples. I’ve tried to classify Numerical methods in the following broad categories: Solving Systems of Linear Equations Solving Non-Linear Equations Iteratively Interpolation Curve Fitting Optimization Numerical Differentiation & Integration Solving ODEs Boundary Problems Solving EigenValue problems Enjoy – I did ! Solving Systems of Linear Equations Overview Solve sets of algebraic equations with x unknowns The set is commonly in matrix form Gauss-Jordan Elimination http://en.wikipedia.org/wiki/Gauss%E2%80%93Jordan_elimination C++: http://www.codekeep.net/snippets/623f1923-e03c-4636-8c92-c9dc7aa0d3c0.aspx Produces solution of the equations & the coefficient matrix Efficient, stable 2 steps: · Forward Elimination – matrix decomposition: reduce set to triangular form (0s below the diagonal) or row echelon form. If degenerate, then there is no solution · Backward Elimination –write the original matrix as the product of ints inverse matrix & its reduced row-echelon matrix à reduce set to row canonical form & use back-substitution to find the solution to the set Elementary ops for matrix decomposition: · Row multiplication · Row switching · Add multiples of rows to other rows Use pivoting to ensure rows are ordered for achieving triangular form LU Decomposition http://en.wikipedia.org/wiki/LU_decomposition C++: http://ganeshtiwaridotcomdotnp.blogspot.co.il/2009/12/c-c-code-lu-decomposition-for-solving.html Represent the matrix as a product of lower & upper triangular matrices A modified version of GJ Elimination Advantage – can easily apply forward & backward elimination to solve triangular matrices Techniques: · Doolittle Method – sets the L matrix diagonal to unity · Crout Method - sets the U matrix diagonal to unity Note: both the L & U matrices share the same unity diagonal & can be stored compactly in the same matrix Gauss-Seidel Iteration http://en.wikipedia.org/wiki/Gauss%E2%80%93Seidel_method C++: http://www.nr.com/forum/showthread.php?t=722 Transform the linear set of equations into a single equation & then use numerical integration (as integration formulas have Sums, it is implemented iteratively). an optimization of Gauss-Jacobi: 1.5 times faster, requires 0.25 iterations to achieve the same tolerance Solving Non-Linear Equations Iteratively find roots of polynomials – there may be 0, 1 or n solutions for an n order polynomial use iterative techniques Iterative methods · used when there are no known analytical techniques · Requires set functions to be continuous & differentiable · Requires an initial seed value – choice is critical to convergence à conduct multiple runs with different starting points & then select best result · Systematic - iterate until diminishing returns, tolerance or max iteration conditions are met · bracketing techniques will always yield convergent solutions, non-bracketing methods may fail to converge Incremental method if a nonlinear function has opposite signs at 2 ends of a small interval x1 & x2, then there is likely to be a solution in their interval – solutions are detected by evaluating a function over interval steps, for a change in sign, adjusting the step size dynamically. Limitations – can miss closely spaced solutions in large intervals, cannot detect degenerate (coinciding) solutions, limited to functions that cross the x-axis, gives false positives for singularities Fixed point method http://en.wikipedia.org/wiki/Fixed-point_iteration C++: http://books.google.co.il/books?id=weYj75E_t6MC&pg=PA79&lpg=PA79&dq=fixed+point+method++c%2B%2B&source=bl&ots=LQ-5P_taoC&sig=lENUUIYBK53tZtTwNfHLy5PEWDk&hl=en&sa=X&ei=wezDUPW1J5DptQaMsIHQCw&redir_esc=y#v=onepage&q=fixed%20point%20method%20%20c%2B%2B&f=false Algebraically rearrange a solution to isolate a variable then apply incremental method Bisection method http://en.wikipedia.org/wiki/Bisection_method C++: http://numericalcomputing.wordpress.com/category/algorithms/ Bracketed - Select an initial interval, keep bisecting it ad midpoint into sub-intervals and then apply incremental method on smaller & smaller intervals – zoom in Adv: unaffected by function gradient à reliable Disadv: slow convergence False Position Method http://en.wikipedia.org/wiki/False_position_method C++: http://www.dreamincode.net/forums/topic/126100-bisection-and-false-position-methods/ Bracketed - Select an initial interval , & use the relative value of function at interval end points to select next sub-intervals (estimate how far between the end points the solution might be & subdivide based on this) Newton-Raphson method http://en.wikipedia.org/wiki/Newton's_method C++: http://www-users.cselabs.umn.edu/classes/Summer-2012/csci1113/index.php?page=./newt3 Also known as Newton's method Convenient, efficient Not bracketed – only a single initial guess is required to start iteration – requires an analytical expression for the first derivative of the function as input. Evaluates the function & its derivative at each step. Can be extended to the Newton MutiRoot method for solving multiple roots Can be easily applied to an of n-coupled set of non-linear equations – conduct a Taylor Series expansion of a function, dropping terms of order n, rewrite as a Jacobian matrix of PDs & convert to simultaneous linear equations !!! Secant Method http://en.wikipedia.org/wiki/Secant_method C++: http://forum.vcoderz.com/showthread.php?p=205230 Unlike N-R, can estimate first derivative from an initial interval (does not require root to be bracketed) instead of inputting it Since derivative is approximated, may converge slower. Is fast in practice as it does not have to evaluate the derivative at each step. Similar implementation to False Positive method Birge-Vieta Method http://mat.iitm.ac.in/home/sryedida/public_html/caimna/transcendental/polynomial%20methods/bv%20method.html C++: http://books.google.co.il/books?id=cL1boM2uyQwC&pg=SA3-PA51&lpg=SA3-PA51&dq=Birge-Vieta+Method+c%2B%2B&source=bl&ots=QZmnDTK3rC&sig=BPNcHHbpR_DKVoZXrLi4nVXD-gg&hl=en&sa=X&ei=R-_DUK2iNIjzsgbE5ID4Dg&redir_esc=y#v=onepage&q=Birge-Vieta%20Method%20c%2B%2B&f=false combines Horner's method of polynomial evaluation (transforming into lesser degree polynomials that are more computationally efficient to process) with Newton-Raphson to provide a computational speed-up Interpolation Overview Construct new data points for as close as possible fit within range of a discrete set of known points (that were obtained via sampling, experimentation) Use Taylor Series Expansion of a function f(x) around a specific value for x Linear Interpolation http://en.wikipedia.org/wiki/Linear_interpolation C++: http://www.hamaluik.com/?p=289 Straight line between 2 points à concatenate interpolants between each pair of data points Bilinear Interpolation http://en.wikipedia.org/wiki/Bilinear_interpolation C++: http://supercomputingblog.com/graphics/coding-bilinear-interpolation/2/ Extension of the linear function for interpolating functions of 2 variables – perform linear interpolation first in 1 direction, then in another. Used in image processing – e.g. texture mapping filter. Uses 4 vertices to interpolate a value within a unit cell. Lagrange Interpolation http://en.wikipedia.org/wiki/Lagrange_polynomial C++: http://www.codecogs.com/code/maths/approximation/interpolation/lagrange.php For polynomials Requires recomputation for all terms for each distinct x value – can only be applied for small number of nodes Numerically unstable Barycentric Interpolation http://epubs.siam.org/doi/pdf/10.1137/S0036144502417715 C++: http://www.gamedev.net/topic/621445-barycentric-coordinates-c-code-check/ Rearrange the terms in the equation of the Legrange interpolation by defining weight functions that are independent of the interpolated value of x Newton Divided Difference Interpolation http://en.wikipedia.org/wiki/Newton_polynomial C++: http://jee-appy.blogspot.co.il/2011/12/newton-divided-difference-interpolation.html Hermite Divided Differences: Interpolation polynomial approximation for a given set of data points in the NR form - divided differences are used to approximately calculate the various differences. For a given set of 3 data points , fit a quadratic interpolant through the data Bracketed functions allow Newton divided differences to be calculated recursively Difference table Cubic Spline Interpolation http://en.wikipedia.org/wiki/Spline_interpolation C++: https://www.marcusbannerman.co.uk/index.php/home/latestarticles/42-articles/96-cubic-spline-class.html Spline is a piecewise polynomial Provides smoothness – for interpolations with significantly varying data Use weighted coefficients to bend the function to be smooth & its 1st & 2nd derivatives are continuous through the edge points in the interval Curve Fitting A generalization of interpolating whereby given data points may contain noise à the curve does not necessarily pass through all the points Least Squares Fit http://en.wikipedia.org/wiki/Least_squares C++: http://www.ccas.ru/mmes/educat/lab04k/02/least-squares.c Residual – difference between observed value & expected value Model function is often chosen as a linear combination of the specified functions Determines: A) The model instance in which the sum of squared residuals has the least value B) param values for which model best fits data Straight Line Fit Linear correlation between independent variable and dependent variable Linear Regression http://en.wikipedia.org/wiki/Linear_regression C++: http://www.oocities.org/david_swaim/cpp/linregc.htm Special case of statistically exact extrapolation Leverage least squares Given a basis function, the sum of the residuals is determined and the corresponding gradient equation is expressed as a set of normal linear equations in matrix form that can be solved (e.g. using LU Decomposition) Can be weighted - Drop the assumption that all errors have the same significance –-> confidence of accuracy is different for each data point. Fit the function closer to points with higher weights Polynomial Fit - use a polynomial basis function Moving Average http://en.wikipedia.org/wiki/Moving_average C++: http://www.codeproject.com/Articles/17860/A-Simple-Moving-Average-Algorithm Used for smoothing (cancel fluctuations to highlight longer-term trends & cycles), time series data analysis, signal processing filters Replace each data point with average of neighbors. Can be simple (SMA), weighted (WMA), exponential (EMA). Lags behind latest data points – extra weight can be given to more recent data points. Weights can decrease arithmetically or exponentially according to distance from point. Parameters: smoothing factor, period, weight basis Optimization Overview Given function with multiple variables, find Min (or max by minimizing –f(x)) Iterative approach Efficient, but not necessarily reliable Conditions: noisy data, constraints, non-linear models Detection via sign of first derivative - Derivative of saddle points will be 0 Local minima Bisection method Similar method for finding a root for a non-linear equation Start with an interval that contains a minimum Golden Search method http://en.wikipedia.org/wiki/Golden_section_search C++: http://www.codecogs.com/code/maths/optimization/golden.php Bisect intervals according to golden ratio 0.618.. Achieves reduction by evaluating a single function instead of 2 Newton-Raphson Method Brent method http://en.wikipedia.org/wiki/Brent's_method C++: http://people.sc.fsu.edu/~jburkardt/cpp_src/brent/brent.cpp Based on quadratic or parabolic interpolation – if the function is smooth & parabolic near to the minimum, then a parabola fitted through any 3 points should approximate the minima – fails when the 3 points are collinear , in which case the denominator is 0 Simplex Method http://en.wikipedia.org/wiki/Simplex_algorithm C++: http://www.codeguru.com/cpp/article.php/c17505/Simplex-Optimization-Algorithm-and-Implemetation-in-C-Programming.htm Find the global minima of any multi-variable function Direct search – no derivatives required At each step it maintains a non-degenerative simplex – a convex hull of n+1 vertices. Obtains the minimum for a function with n variables by evaluating the function at n-1 points, iteratively replacing the point of worst result with the point of best result, shrinking the multidimensional simplex around the best point. Point replacement involves expanding & contracting the simplex near the worst value point to determine a better replacement point Oscillation can be avoided by choosing the 2nd worst result Restart if it gets stuck Parameters: contraction & expansion factors Simulated Annealing http://en.wikipedia.org/wiki/Simulated_annealing C++: http://code.google.com/p/cppsimulatedannealing/ Analogy to heating & cooling metal to strengthen its structure Stochastic method – apply random permutation search for global minima - Avoid entrapment in local minima via hill climbing Heating schedule - Annealing schedule params: temperature, iterations at each temp, temperature delta Cooling schedule – can be linear, step-wise or exponential Differential Evolution http://en.wikipedia.org/wiki/Differential_evolution C++: http://www.amichel.com/de/doc/html/ More advanced stochastic methods analogous to biological processes: Genetic algorithms, evolution strategies Parallel direct search method against multiple discrete or continuous variables Initial population of variable vectors chosen randomly – if weighted difference vector of 2 vectors yields a lower objective function value then it replaces the comparison vector Many params: #parents, #variables, step size, crossover constant etc Convergence is slow – many more function evaluations than simulated annealing Numerical Differentiation Overview 2 approaches to finite difference methods: · A) approximate function via polynomial interpolation then differentiate · B) Taylor series approximation – additionally provides error estimate Finite Difference methods http://en.wikipedia.org/wiki/Finite_difference_method C++: http://www.wpi.edu/Pubs/ETD/Available/etd-051807-164436/unrestricted/EAMPADU.pdf Find differences between high order derivative values - Approximate differential equations by finite differences at evenly spaced data points Based on forward & backward Taylor series expansion of f(x) about x plus or minus multiples of delta h. Forward / backward difference - the sums of the series contains even derivatives and the difference of the series contains odd derivatives – coupled equations that can be solved. Provide an approximation of the derivative within a O(h^2) accuracy There is also central difference & extended central difference which has a O(h^4) accuracy Richardson Extrapolation http://en.wikipedia.org/wiki/Richardson_extrapolation C++: http://mathscoding.blogspot.co.il/2012/02/introduction-richardson-extrapolation.html A sequence acceleration method applied to finite differences Fast convergence, high accuracy O(h^4) Derivatives via Interpolation Cannot apply Finite Difference method to discrete data points at uneven intervals – so need to approximate the derivative of f(x) using the derivative of the interpolant via 3 point Lagrange Interpolation Note: the higher the order of the derivative, the lower the approximation precision Numerical Integration Estimate finite & infinite integrals of functions More accurate procedure than numerical differentiation Use when it is not possible to obtain an integral of a function analytically or when the function is not given, only the data points are Newton Cotes Methods http://en.wikipedia.org/wiki/Newton%E2%80%93Cotes_formulas C++: http://www.siafoo.net/snippet/324 For equally spaced data points Computationally easy – based on local interpolation of n rectangular strip areas that is piecewise fitted to a polynomial to get the sum total area Evaluate the integrand at n+1 evenly spaced points – approximate definite integral by Sum Weights are derived from Lagrange Basis polynomials Leverage Trapezoidal Rule for default 2nd formulas, Simpson 1/3 Rule for substituting 3 point formulas, Simpson 3/8 Rule for 4 point formulas. For 4 point formulas use Bodes Rule. Higher orders obtain more accurate results Trapezoidal Rule uses simple area, Simpsons Rule replaces the integrand f(x) with a quadratic polynomial p(x) that uses the same values as f(x) for its end points, but adds a midpoint Romberg Integration http://en.wikipedia.org/wiki/Romberg's_method C++: http://code.google.com/p/romberg-integration/downloads/detail?name=romberg.cpp&can=2&q= Combines trapezoidal rule with Richardson Extrapolation Evaluates the integrand at equally spaced points The integrand must have continuous derivatives Each R(n,m) extrapolation uses a higher order integrand polynomial replacement rule (zeroth starts with trapezoidal) à a lower triangular matrix set of equation coefficients where the bottom right term has the most accurate approximation. The process continues until the difference between 2 successive diagonal terms becomes sufficiently small. Gaussian Quadrature http://en.wikipedia.org/wiki/Gaussian_quadrature C++: http://www.alglib.net/integration/gaussianquadratures.php Data points are chosen to yield best possible accuracy – requires fewer evaluations Ability to handle singularities, functions that are difficult to evaluate The integrand can include a weighting function determined by a set of orthogonal polynomials. Points & weights are selected so that the integrand yields the exact integral if f(x) is a polynomial of degree <= 2n+1 Techniques (basically different weighting functions): · Gauss-Legendre Integration w(x)=1 · Gauss-Laguerre Integration w(x)=e^-x · Gauss-Hermite Integration w(x)=e^-x^2 · Gauss-Chebyshev Integration w(x)= 1 / Sqrt(1-x^2) Solving ODEs Use when high order differential equations cannot be solved analytically Evaluated under boundary conditions RK for systems – a high order differential equation can always be transformed into a coupled first order system of equations Euler method http://en.wikipedia.org/wiki/Euler_method C++: http://rosettacode.org/wiki/Euler_method First order Runge–Kutta method. Simple recursive method – given an initial value, calculate derivative deltas. Unstable & not very accurate (O(h) error) – not used in practice A first-order method - the local error (truncation error per step) is proportional to the square of the step size, and the global error (error at a given time) is proportional to the step size In evolving solution between data points xn & xn+1, only evaluates derivatives at beginning of interval xn à asymmetric at boundaries Higher order Runge Kutta http://en.wikipedia.org/wiki/Runge%E2%80%93Kutta_methods C++: http://www.dreamincode.net/code/snippet1441.htm 2nd & 4th order RK - Introduces parameterized midpoints for more symmetric solutions à accuracy at higher computational cost Adaptive RK – RK-Fehlberg – estimate the truncation at each integration step & automatically adjust the step size to keep error within prescribed limits. At each step 2 approximations are compared – if in disagreement to a specific accuracy, the step size is reduced Boundary Value Problems Where solution of differential equations are located at 2 different values of the independent variable x à more difficult, because cannot just start at point of initial value – there may not be enough starting conditions available at the end points to produce a unique solution An n-order equation will require n boundary conditions – need to determine the missing n-1 conditions which cause the given conditions at the other boundary to be satisfied Shooting Method http://en.wikipedia.org/wiki/Shooting_method C++: http://ganeshtiwaridotcomdotnp.blogspot.co.il/2009/12/c-c-code-shooting-method-for-solving.html Iteratively guess the missing values for one end & integrate, then inspect the discrepancy with the boundary values of the other end to adjust the estimate Given the starting boundary values u1 & u2 which contain the root u, solve u given the false position method (solving the differential equation as an initial value problem via 4th order RK), then use u to solve the differential equations. Finite Difference Method For linear & non-linear systems Higher order derivatives require more computational steps – some combinations for boundary conditions may not work though Improve the accuracy by increasing the number of mesh points Solving EigenValue Problems An eigenvalue can substitute a matrix when doing matrix multiplication à convert matrix multiplication into a polynomial EigenValue For a given set of equations in matrix form, determine what are the solution eigenvalue & eigenvectors Similar Matrices - have same eigenvalues. Use orthogonal similarity transforms to reduce a matrix to diagonal form from which eigenvalue(s) & eigenvectors can be computed iteratively Jacobi method http://en.wikipedia.org/wiki/Jacobi_method C++: http://people.sc.fsu.edu/~jburkardt/classes/acs2_2008/openmp/jacobi/jacobi.html Robust but Computationally intense – use for small matrices < 10x10 Power Iteration http://en.wikipedia.org/wiki/Power_iteration For any given real symmetric matrix, generate the largest single eigenvalue & its eigenvectors Simplest method – does not compute matrix decomposition à suitable for large, sparse matrices Inverse Iteration Variation of power iteration method – generates the smallest eigenvalue from the inverse matrix Rayleigh Method http://en.wikipedia.org/wiki/Rayleigh's_method_of_dimensional_analysis Variation of power iteration method Rayleigh Quotient Method Variation of inverse iteration method Matrix Tri-diagonalization Method Use householder algorithm to reduce an NxN symmetric matrix to a tridiagonal real symmetric matrix vua N-2 orthogonal transforms     Whats Next Outside of Numerical Methods there are lots of different types of algorithms that I’ve learned over the decades: Data Mining – (I covered this briefly in a previous post: http://geekswithblogs.net/JoshReuben/archive/2007/12/31/ssas-dm-algorithms.aspx ) Search & Sort Routing Problem Solving Logical Theorem Proving Planning Probabilistic Reasoning Machine Learning Solvers (eg MIP) Bioinformatics (Sequence Alignment, Protein Folding) Quant Finance (I read Wilmott’s books – interesting) Sooner or later, I’ll cover the above topics as well.

    Read the article

  • An Introduction to Meteor

    - by Stephen.Walther
    The goal of this blog post is to give you a brief introduction to Meteor which is a framework for building Single Page Apps. In this blog entry, I provide a walkthrough of building a simple Movie database app. What is special about Meteor? Meteor has two jaw-dropping features: Live HTML – If you make any changes to the HTML, CSS, JavaScript, or data on the server then every client shows the changes automatically without a browser refresh. For example, if you change the background color of a page to yellow then every open browser will show the new yellow background color without a refresh. Or, if you add a new movie to a collection of movies, then every open browser will display the new movie automatically. With Live HTML, users no longer need a refresh button. Changes to an application happen everywhere automatically without any effort. The Meteor framework handles all of the messy details of keeping all of the clients in sync with the server for you. Latency Compensation – When you modify data on the client, these modifications appear as if they happened on the server without any delay. For example, if you create a new movie then the movie appears instantly. However, that is all an illusion. In the background, Meteor updates the database with the new movie. If, for whatever reason, the movie cannot be added to the database then Meteor removes the movie from the client automatically. Latency compensation is extremely important for creating a responsive web application. You want the user to be able to make instant modifications in the browser and the framework to handle the details of updating the database without slowing down the user. Installing Meteor Meteor is licensed under the open-source MIT license and you can start building production apps with the framework right now. Be warned that Meteor is still in the “early preview” stage. It has not reached a 1.0 release. According to the Meteor FAQ, Meteor will reach version 1.0 in “More than a month, less than a year.” Don’t be scared away by that. You should be aware that, unlike most open source projects, Meteor has financial backing. The Meteor project received an $11.2 million round of financing from Andreessen Horowitz. So, it would be a good bet that this project will reach the 1.0 mark. And, if it doesn’t, the framework as it exists right now is still very powerful. Meteor runs on top of Node.js. You write Meteor apps by writing JavaScript which runs both on the client and on the server. You can build Meteor apps on Windows, Mac, or Linux (Although the support for Windows is still officially unofficial). If you want to install Meteor on Windows then download the MSI from the following URL: http://win.meteor.com/ If you want to install Meteor on Mac/Linux then run the following CURL command from your terminal: curl https://install.meteor.com | /bin/sh Meteor will install all of its dependencies automatically including Node.js. However, I recommend that you install Node.js before installing Meteor by installing Node.js from the following address: http://nodejs.org/ If you let Meteor install Node.js then Meteor won’t install NPM which is the standard package manager for Node.js. If you install Node.js and then you install Meteor then you get NPM automatically. Creating a New Meteor App To get a sense of how Meteor works, I am going to walk through the steps required to create a simple Movie database app. Our app will display a list of movies and contain a form for creating a new movie. The first thing that we need to do is create our new Meteor app. Open a command prompt/terminal window and execute the following command: Meteor create MovieApp After you execute this command, you should see something like the following: Follow the instructions: execute cd MovieApp to change to your MovieApp directory, and run the meteor command. Executing the meteor command starts Meteor on port 3000. Open up your favorite web browser and navigate to http://localhost:3000 and you should see the default Meteor Hello World page: Open up your favorite development environment to see what the Meteor app looks like. Open the MovieApp folder which we just created. Here’s what the MovieApp looks like in Visual Studio 2012: Notice that our MovieApp contains three files named MovieApp.css, MovieApp.html, and MovieApp.js. In other words, it contains a Cascading Style Sheet file, an HTML file, and a JavaScript file. Just for fun, let’s see how the Live HTML feature works. Open up multiple browsers and point each browser at http://localhost:3000. Now, open the MovieApp.html page and modify the text “Hello World!” to “Hello Cruel World!” and save the change. The text in all of the browsers should update automatically without a browser refresh. Pretty amazing, right? Controlling Where JavaScript Executes You write a Meteor app using JavaScript. Some of the JavaScript executes on the client (the browser) and some of the JavaScript executes on the server and some of the JavaScript executes in both places. For a super simple app, you can use the Meteor.isServer and Meteor.isClient properties to control where your JavaScript code executes. For example, the following JavaScript contains a section of code which executes on the server and a section of code which executes in the browser: if (Meteor.isClient) { console.log("Hello Browser!"); } if (Meteor.isServer) { console.log("Hello Server!"); } console.log("Hello Browser and Server!"); When you run the app, the message “Hello Browser!” is written to the browser JavaScript console. The message “Hello Server!” is written to the command/terminal window where you ran Meteor. Finally, the message “Hello Browser and Server!” is execute on both the browser and server and the message appears in both places. For simple apps, using Meteor.isClient and Meteor.isServer to control where JavaScript executes is fine. For more complex apps, you should create separate folders for your server and client code. Here are the folders which you can use in a Meteor app: · client – This folder contains any JavaScript which executes only on the client. · server – This folder contains any JavaScript which executes only on the server. · common – This folder contains any JavaScript code which executes on both the client and server. · lib – This folder contains any JavaScript files which you want to execute before any other JavaScript files. · public – This folder contains static application assets such as images. For the Movie App, we need the client, server, and common folders. Delete the existing MovieApp.js, MovieApp.html, and MovieApp.css files. We will create new files in the right locations later in this walkthrough. Combining HTML, CSS, and JavaScript Files Meteor combines all of your JavaScript files, and all of your Cascading Style Sheet files, and all of your HTML files automatically. If you want to create one humongous JavaScript file which contains all of the code for your app then that is your business. However, if you want to build a more maintainable application, then you should break your JavaScript files into many separate JavaScript files and let Meteor combine them for you. Meteor also combines all of your HTML files into a single file. HTML files are allowed to have the following top-level elements: <head> — All <head> files are combined into a single <head> and served with the initial page load. <body> — All <body> files are combined into a single <body> and served with the initial page load. <template> — All <template> files are compiled into JavaScript templates. Because you are creating a single page app, a Meteor app typically will contain a single HTML file for the <head> and <body> content. However, a Meteor app typically will contain several template files. In other words, all of the interesting stuff happens within the <template> files. Displaying a List of Movies Let me start building the Movie App by displaying a list of movies. In order to display a list of movies, we need to create the following four files: · client\movies.html – Contains the HTML for the <head> and <body> of the page for the Movie app. · client\moviesTemplate.html – Contains the HTML template for displaying the list of movies. · client\movies.js – Contains the JavaScript for supplying data to the moviesTemplate. · server\movies.js – Contains the JavaScript for seeding the database with movies. After you create these files, your folder structure should looks like this: Here’s what the client\movies.html file looks like: <head> <title>My Movie App</title> </head> <body> <h1>Movies</h1> {{> moviesTemplate }} </body>   Notice that it contains <head> and <body> top-level elements. The <body> element includes the moviesTemplate with the syntax {{> moviesTemplate }}. The moviesTemplate is defined in the client/moviesTemplate.html file: <template name="moviesTemplate"> <ul> {{#each movies}} <li> {{title}} </li> {{/each}} </ul> </template> By default, Meteor uses the Handlebars templating library. In the moviesTemplate above, Handlebars is used to loop through each of the movies using {{#each}}…{{/each}} and display the title for each movie using {{title}}. The client\movies.js JavaScript file is used to bind the moviesTemplate to the Movies collection on the client. Here’s what this JavaScript file looks like: // Declare client Movies collection Movies = new Meteor.Collection("movies"); // Bind moviesTemplate to Movies collection Template.moviesTemplate.movies = function () { return Movies.find(); }; The Movies collection is a client-side proxy for the server-side Movies database collection. Whenever you want to interact with the collection of Movies stored in the database, you use the Movies collection instead of communicating back to the server. The moviesTemplate is bound to the Movies collection by assigning a function to the Template.moviesTemplate.movies property. The function simply returns all of the movies from the Movies collection. The final file which we need is the server-side server\movies.js file: // Declare server Movies collection Movies = new Meteor.Collection("movies"); // Seed the movie database with a few movies Meteor.startup(function () { if (Movies.find().count() == 0) { Movies.insert({ title: "Star Wars", director: "Lucas" }); Movies.insert({ title: "Memento", director: "Nolan" }); Movies.insert({ title: "King Kong", director: "Jackson" }); } }); The server\movies.js file does two things. First, it declares the server-side Meteor Movies collection. When you declare a server-side Meteor collection, a collection is created in the MongoDB database associated with your Meteor app automatically (Meteor uses MongoDB as its database automatically). Second, the server\movies.js file seeds the Movies collection (MongoDB collection) with three movies. Seeding the database gives us some movies to look at when we open the Movies app in a browser. Creating New Movies Let me modify the Movies Database App so that we can add new movies to the database of movies. First, I need to create a new template file – named client\movieForm.html – which contains an HTML form for creating a new movie: <template name="movieForm"> <fieldset> <legend>Add New Movie</legend> <form> <div> <label> Title: <input id="title" /> </label> </div> <div> <label> Director: <input id="director" /> </label> </div> <div> <input type="submit" value="Add Movie" /> </div> </form> </fieldset> </template> In order for the new form to show up, I need to modify the client\movies.html file to include the movieForm.html template. Notice that I added {{> movieForm }} to the client\movies.html file: <head> <title>My Movie App</title> </head> <body> <h1>Movies</h1> {{> moviesTemplate }} {{> movieForm }} </body> After I make these modifications, our Movie app will display the form: The next step is to handle the submit event for the movie form. Below, I’ve modified the client\movies.js file so that it contains a handler for the submit event raised when you submit the form contained in the movieForm.html template: // Declare client Movies collection Movies = new Meteor.Collection("movies"); // Bind moviesTemplate to Movies collection Template.moviesTemplate.movies = function () { return Movies.find(); }; // Handle movieForm events Template.movieForm.events = { 'submit': function (e, tmpl) { // Don't postback e.preventDefault(); // create the new movie var newMovie = { title: tmpl.find("#title").value, director: tmpl.find("#director").value }; // add the movie to the db Movies.insert(newMovie); } }; The Template.movieForm.events property contains an event map which maps event names to handlers. In this case, I am mapping the form submit event to an anonymous function which handles the event. In the event handler, I am first preventing a postback by calling e.preventDefault(). This is a single page app, no postbacks are allowed! Next, I am grabbing the new movie from the HTML form. I’m taking advantage of the template find() method to retrieve the form field values. Finally, I am calling Movies.insert() to insert the new movie into the Movies collection. Here, I am explicitly inserting the new movie into the client-side Movies collection. Meteor inserts the new movie into the server-side Movies collection behind the scenes. When Meteor inserts the movie into the server-side collection, the new movie is added to the MongoDB database associated with the Movies app automatically. If server-side insertion fails for whatever reasons – for example, your internet connection is lost – then Meteor will remove the movie from the client-side Movies collection automatically. In other words, Meteor takes care of keeping the client Movies collection and the server Movies collection in sync. If you open multiple browsers, and add movies, then you should notice that all of the movies appear on all of the open browser automatically. You don’t need to refresh individual browsers to update the client-side Movies collection. Meteor keeps everything synchronized between the browsers and server for you. Removing the Insecure Module To make it easier to develop and debug a new Meteor app, by default, you can modify the database directly from the client. For example, you can delete all of the data in the database by opening up your browser console window and executing multiple Movies.remove() commands. Obviously, enabling anyone to modify your database from the browser is not a good idea in a production application. Before you make a Meteor app public, you should first run the meteor remove insecure command from a command/terminal window: Running meteor remove insecure removes the insecure package from the Movie app. Unfortunately, it also breaks our Movie app. We’ll get an “Access denied” error in our browser console whenever we try to insert a new movie. No worries. I’ll fix this issue in the next section. Creating Meteor Methods By taking advantage of Meteor Methods, you can create methods which can be invoked on both the client and the server. By taking advantage of Meteor Methods you can: 1. Perform form validation on both the client and the server. For example, even if an evil hacker bypasses your client code, you can still prevent the hacker from submitting an invalid value for a form field by enforcing validation on the server. 2. Simulate database operations on the client but actually perform the operations on the server. Let me show you how we can modify our Movie app so it uses Meteor Methods to insert a new movie. First, we need to create a new file named common\methods.js which contains the definition of our Meteor Methods: Meteor.methods({ addMovie: function (newMovie) { // Perform form validation if (newMovie.title == "") { throw new Meteor.Error(413, "Missing title!"); } if (newMovie.director == "") { throw new Meteor.Error(413, "Missing director!"); } // Insert movie (simulate on client, do it on server) return Movies.insert(newMovie); } }); The addMovie() method is called from both the client and the server. This method does two things. First, it performs some basic validation. If you don’t enter a title or you don’t enter a director then an error is thrown. Second, the addMovie() method inserts the new movie into the Movies collection. When called on the client, inserting the new movie into the Movies collection just updates the collection. When called on the server, inserting the new movie into the Movies collection causes the database (MongoDB) to be updated with the new movie. You must add the common\methods.js file to the common folder so it will get executed on both the client and the server. Our folder structure now looks like this: We actually call the addMovie() method within our client code in the client\movies.js file. Here’s what the updated file looks like: // Declare client Movies collection Movies = new Meteor.Collection("movies"); // Bind moviesTemplate to Movies collection Template.moviesTemplate.movies = function () { return Movies.find(); }; // Handle movieForm events Template.movieForm.events = { 'submit': function (e, tmpl) { // Don't postback e.preventDefault(); // create the new movie var newMovie = { title: tmpl.find("#title").value, director: tmpl.find("#director").value }; // add the movie to the db Meteor.call( "addMovie", newMovie, function (err, result) { if (err) { alert("Could not add movie " + err.reason); } } ); } }; The addMovie() method is called – on both the client and the server – by calling the Meteor.call() method. This method accepts the following parameters: · The string name of the method to call. · The data to pass to the method (You can actually pass multiple params for the data if you like). · A callback function to invoke after the method completes. In the JavaScript code above, the addMovie() method is called with the new movie retrieved from the HTML form. The callback checks for an error. If there is an error then the error reason is displayed in an alert (please don’t use alerts for validation errors in a production app because they are ugly!). Summary The goal of this blog post was to provide you with a brief walk through of a simple Meteor app. I showed you how you can create a simple Movie Database app which enables you to display a list of movies and create new movies. I also explained why it is important to remove the Meteor insecure package from a production app. I showed you how to use Meteor Methods to insert data into the database instead of doing it directly from the client. I’m very impressed with the Meteor framework. The support for Live HTML and Latency Compensation are required features for many real world Single Page Apps but implementing these features by hand is not easy. Meteor makes it easy.

    Read the article

  • The broken Promise of the Mobile Web

    - by Rick Strahl
    High end mobile devices have been with us now for almost 7 years and they have utterly transformed the way we access information. Mobile phones and smartphones that have access to the Internet and host smart applications are in the hands of a large percentage of the population of the world. In many places even very remote, cell phones and even smart phones are a common sight. I’ll never forget when I was in India in 2011 I was up in the Southern Indian mountains riding an elephant out of a tiny local village, with an elephant herder in front riding atop of the elephant in front of us. He was dressed in traditional garb with the loin wrap and head cloth/turban as did quite a few of the locals in this small out of the way and not so touristy village. So we’re slowly trundling along in the forest and he’s lazily using his stick to guide the elephant and… 10 minutes in he pulls out his cell phone from his sash and starts texting. In the middle of texting a huge pig jumps out from the side of the trail and he takes a picture running across our path in the jungle! So yeah, mobile technology is very pervasive and it’s reached into even very buried and unexpected parts of this world. Apps are still King Apps currently rule the roost when it comes to mobile devices and the applications that run on them. If there’s something that you need on your mobile device your first step usually is to look for an app, not use your browser. But native app development remains a pain in the butt, with the requirement to have to support 2 or 3 completely separate platforms. There are solutions that try to bridge that gap. Xamarin is on a tear at the moment, providing their cross-device toolkit to build applications using C#. While Xamarin tools are impressive – and also *very* expensive – they only address part of the development madness that is app development. There are still specific device integration isssues, dealing with the different developer programs, security and certificate setups and all that other noise that surrounds app development. There’s also PhoneGap/Cordova which provides a hybrid solution that involves creating local HTML/CSS/JavaScript based applications, and then packaging them to run in a specialized App container that can run on most mobile device platforms using a WebView interface. This allows for using of HTML technology, but it also still requires all the set up, configuration of APIs, security keys and certification and submission and deployment process just like native applications – you actually lose many of the benefits that  Web based apps bring. The big selling point of Cordova is that you get to use HTML have the ability to build your UI once for all platforms and run across all of them – but the rest of the app process remains in place. Apps can be a big pain to create and manage especially when we are talking about specialized or vertical business applications that aren’t geared at the mainstream market and that don’t fit the ‘store’ model. If you’re building a small intra department application you don’t want to deal with multiple device platforms and certification etc. for various public or corporate app stores. That model is simply not a good fit both from the development and deployment perspective. Even for commercial, big ticket apps, HTML as a UI platform offers many advantages over native, from write-once run-anywhere, to remote maintenance, single point of management and failure to having full control over the application as opposed to have the app store overloads censor you. In a lot of ways Web based HTML/CSS/JavaScript applications have so much potential for building better solutions based on existing Web technologies for the very same reasons a lot of content years ago moved off the desktop to the Web. To me the Web as a mobile platform makes perfect sense, but the reality of today’s Mobile Web unfortunately looks a little different… Where’s the Love for the Mobile Web? Yet here we are in the middle of 2014, nearly 7 years after the first iPhone was released and brought the promise of rich interactive information at your fingertips, and yet we still don’t really have a solid mobile Web platform. I know what you’re thinking: “But we have lots of HTML/JavaScript/CSS features that allows us to build nice mobile interfaces”. I agree to a point – it’s actually quite possible to build nice looking, rich and capable Web UI today. We have media queries to deal with varied display sizes, CSS transforms for smooth animations and transitions, tons of CSS improvements in CSS 3 that facilitate rich layout, a host of APIs geared towards mobile device features and lately even a number of JavaScript framework choices that facilitate development of multi-screen apps in a consistent manner. Personally I’ve been working a lot with AngularJs and heavily modified Bootstrap themes to build mobile first UIs and that’s been working very well to provide highly usable and attractive UI for typical mobile business applications. From the pure UI perspective things actually look very good. Not just about the UI But it’s not just about the UI - it’s also about integration with the mobile device. When it comes to putting all those pieces together into what amounts to a consolidated platform to build mobile Web applications, I think we still have a ways to go… there are a lot of missing pieces to make it all work together and integrate with the device more smoothly, and more importantly to make it work uniformly across the majority of devices. I think there are a number of reasons for this. Slow Standards Adoption HTML standards implementations and ratification has been dreadfully slow, and browser vendors all seem to pick and choose different pieces of the technology they implement. The end result is that we have a capable UI platform that’s missing some of the infrastructure pieces to make it whole on mobile devices. There’s lots of potential but what is lacking that final 10% to build truly compelling mobile applications that can compete favorably with native applications. Some of it is the fragmentation of browsers and the slow evolution of the mobile specific HTML APIs. A host of mobile standards exist but many of the standards are in the early review stage and they have been there stuck for long periods of time and seem to move at a glacial pace. Browser vendors seem even slower to implement them, and for good reason – non-ratified standards mean that implementations may change and vendor implementations tend to be experimental and  likely have to be changed later. Neither Vendors or developers are not keen on changing standards. This is the typical chicken and egg scenario, but without some forward momentum from some party we end up stuck in the mud. It seems that either the standards bodies or the vendors need to carry the torch forward and that doesn’t seem to be happening quickly enough. Mobile Device Integration just isn’t good enough Current standards are not far reaching enough to address a number of the use case scenarios necessary for many mobile applications. While not every application needs to have access to all mobile device features, almost every mobile application could benefit from some integration with other parts of the mobile device platform. Integration with GPS, phone, media, messaging, notifications, linking and contacts system are benefits that are unique to mobile applications and could be widely used, but are mostly (with the exception of GPS) inaccessible for Web based applications today. Unfortunately trying to do most of this today only with a mobile Web browser is a losing battle. Aside from PhoneGap/Cordova’s app centric model with its own custom API accessing mobile device features and the token exception of the GeoLocation API, most device integration features are not widely supported by the current crop of mobile browsers. For example there’s no usable messaging API that allows access to SMS or contacts from HTML. Even obvious components like the Media Capture API are only implemented partially by mobile devices. There are alternatives and workarounds for some of these interfaces by using browser specific code, but that’s might ugly and something that I thought we were trying to leave behind with newer browser standards. But it’s not quite working out that way. It’s utterly perplexing to me that mobile standards like Media Capture and Streams, Media Gallery Access, Responsive Images, Messaging API, Contacts Manager API have only minimal or no traction at all today. Keep in mind we’ve had mobile browsers for nearly 7 years now, and yet we still have to think about how to get access to an image from the image gallery or the camera on some devices? Heck Windows Phone IE Mobile just gained the ability to upload images recently in the Windows 8.1 Update – that’s feature that HTML has had for 20 years! These are simple concepts and common problems that should have been solved a long time ago. It’s extremely frustrating to see build 90% of a mobile Web app with relative ease and then hit a brick wall for the remaining 10%, which often can be show stoppers. The remaining 10% have to do with platform integration, browser differences and working around the limitations that browsers and ‘pinned’ applications impose on HTML applications. The maddening part is that these limitations seem arbitrary as they could easily work on all mobile platforms. For example, SMS has a URL Moniker interface that sort of works on Android, works badly with iOS (only works if the address is already in the contact list) and not at all on Windows Phone. There’s no reason this shouldn’t work universally using the same interface – after all all phones have supported SMS since before the year 2000! But, it doesn’t have to be this way Change can happen very quickly. Take the GeoLocation API for example. Geolocation has taken off at the very beginning of the mobile device era and today it works well, provides the necessary security (a big concern for many mobile APIs), and is supported by just about all major mobile and even desktop browsers today. It handles security concerns via prompts to avoid unwanted access which is a model that would work for most other device APIs in a similar fashion. One time approval and occasional re-approval if code changes or caches expire. Simple and only slightly intrusive. It all works well, even though GeoLocation actually has some physical limitations, such as representing the current location when no GPS device is present. Yet this is a solved problem, where other APIs that are conceptually much simpler to implement have failed to gain any traction at all. Technically none of these APIs should be a problem to implement, but it appears that the momentum is just not there. Inadequate Web Application Linking and Activation Another important piece of the puzzle missing is the integration of HTML based Web applications. Today HTML based applications are not first class citizens on mobile operating systems. When talking about HTML based content there’s a big difference between content and applications. Content is great for search engine discovery and plain browser usage. Content is usually accessed intermittently and permanent linking is not so critical for this type of content.  But applications have different needs. Applications need to be started up quickly and must be easily switchable to support a multi-tasking user workflow. Therefore, it’s pretty crucial that mobile Web apps are integrated into the underlying mobile OS and work with the standard task management features. Unfortunately this integration is not as smooth as it should be. It starts with actually trying to find mobile Web applications, to ‘installing’ them onto a phone in an easily accessible manner in a prominent position. The experience of discovering a Mobile Web ‘App’ and making it sticky is by no means as easy or satisfying. Today the way you’d go about this is: Open the browser Search for a Web Site in the browser with your search engine of choice Hope that you find the right site Hope that you actually find a site that works for your mobile device Click on the link and run the app in a fully chrome’d browser instance (read tiny surface area) Pin the app to the home screen (with all the limitations outline above) Hope you pointed at the right URL when you pinned Even for you and me as developers, there are a few steps in there that are painful and annoying, but think about the average user. First figuring out how to search for a specific site or URL? And then pinning the app and hopefully from the right location? You’ve probably lost more than half of your audience at that point. This experience sucks. For developers too this process is painful since app developers can’t control the shortcut creation directly. This problem often gets solved by crazy coding schemes, with annoying pop-ups that try to get people to create shortcuts via fancy animations that are both annoying and add overhead to each and every application that implements this sort of thing differently. And that’s not the end of it - getting the link onto the home screen with an application icon varies quite a bit between browsers. Apple’s non-standard meta tags are prominent and they work with iOS and Android (only more recent versions), but not on Windows Phone. Windows Phone instead requires you to create an actual screen or rather a partial screen be captured for a shortcut in the tile manager. Who had that brilliant idea I wonder? Surprisingly Chrome on recent Android versions seems to actually get it right – icons use pngs, pinning is easy and pinned applications properly behave like standalone apps and retain the browser’s active page state and content. Each of the platforms has a different way to specify icons (WP doesn’t allow you to use an icon image at all), and the most widely used interface in use today is a bunch of Apple specific meta tags that other browsers choose to support. The question is: Why is there no standard implementation for installing shortcuts across mobile platforms using an official format rather than a proprietary one? Then there’s iOS and the crazy way it treats home screen linked URLs using a crazy hybrid format that is neither as capable as a Web app running in Safari nor a WebView hosted application. Moving off the Web ‘app’ link when switching to another app actually causes the browser and preview it to ‘blank out’ the Web application in the Task View (see screenshot on the right). Then, when the ‘app’ is reactivated it ends up completely restarting the browser with the original link. This is crazy behavior that you can’t easily work around. In some situations you might be able to store the application state and restore it using LocalStorage, but for many scenarios that involve complex data sources (like say Google Maps) that’s not a possibility. The only reason for this screwed up behavior I can think of is that it is deliberate to make Web apps a pain in the butt to use and forcing users trough the App Store/PhoneGap/Cordova route. App linking and management is a very basic problem – something that we essentially have solved in every desktop browser – yet on mobile devices where it arguably matters a lot more to have easy access to web content we have to jump through hoops to have even a remotely decent linking/activation experience across browsers. Where’s the Money? It’s not surprising that device home screen integration and Mobile Web support in general is in such dismal shape – the mobile OS vendors benefit financially from App store sales and have little to gain from Web based applications that bypass the App store and the cash cow that it presents. On top of that, platform specific vendor lock-in of both end users and developers who have invested in hardware, apps and consumables is something that mobile platform vendors actually aspire to. Web based interfaces that are cross-platform are the anti-thesis of that and so again it’s no surprise that the mobile Web is on a struggling path. But – that may be changing. More and more we’re seeing operations shifting to services that are subscription based or otherwise collect money for usage, and that may drive more progress into the Web direction in the end . Nothing like the almighty dollar to drive innovation forward. Do we need a Mobile Web App Store? As much as I dislike moderated experiences in today’s massive App Stores, they do at least provide one single place to look for apps for your device. I think we could really use some sort of registry, that could provide something akin to an app store for mobile Web apps, to make it easier to actually find mobile applications. This could take the form of a specialized search engine, or maybe a more formal store/registry like structure. Something like apt-get/chocolatey for Web apps. It could be curated and provide at least some feedback and reviews that might help with the integrity of applications. Coupled to that could be a native application on each platform that would allow searching and browsing of the registry and then also handle installation in the form of providing the home screen linking, plus maybe an initial security configuration that determines what features are allowed access to for the app. I’m not holding my breath. In order for this sort of thing to take off and gain widespread appeal, a lot of coordination would be required. And in order to get enough traction it would have to come from a well known entity – a mobile Web app store from a no name source is unlikely to gain high enough usage numbers to make a difference. In a way this would eliminate some of the freedom of the Web, but of course this would also be an optional search path in addition to the standard open Web search mechanisms to find and access content today. Security Security is a big deal, and one of the perceived reasons why so many IT professionals appear to be willing to go back to the walled garden of deployed apps is that Apps are perceived as safe due to the official review and curation of the App stores. Curated stores are supposed to protect you from malware, illegal and misleading content. It doesn’t always work out that way and all the major vendors have had issues with security and the review process at some time or another. Security is critical, but I also think that Web applications in general pose less of a security threat than native applications, by nature of the sandboxed browser and JavaScript environments. Web applications run externally completely and in the HTML and JavaScript sandboxes, with only a very few controlled APIs allowing access to device specific features. And as discussed earlier – security for any device interaction can be granted the same for mobile applications through a Web browser, as they can for native applications either via explicit policies loaded from the Web, or via prompting as GeoLocation does today. Security is important, but it’s certainly solvable problem for Web applications even those that need to access device hardware. Security shouldn’t be a reason for Web apps to be an equal player in mobile applications. Apps are winning, but haven’t we been here before? So now we’re finding ourselves back in an era of installed app, rather than Web based and managed apps. Only it’s even worse today than with Desktop applications, in that the apps are going through a gatekeeper that charges a toll and censors what you can and can’t do in your apps. Frankly it’s a mystery to me why anybody would buy into this model and why it’s lasted this long when we’ve already been through this process. It’s crazy… It’s really a shame that this regression is happening. We have the technology to make mobile Web apps much more prominent, but yet we’re basically held back by what seems little more than bureaucracy, partisan bickering and self interest of the major parties involved. Back in the day of the desktop it was Internet Explorer’s 98+%  market shareholding back the Web from improvements for many years – now it’s the combined mobile OS market in control of the mobile browsers. If mobile Web apps were allowed to be treated the same as native apps with simple ways to install and run them consistently and persistently, that would go a long way to making mobile applications much more usable and seriously viable alternatives to native apps. But as it is mobile apps have a severe disadvantage in placement and operation. There are a few bright spots in all of this. Mozilla’s FireFoxOs is embracing the Web for it’s mobile OS by essentially building every app out of HTML and JavaScript based content. It supports both packaged and certified package modes (that can be put into the app store), and Open Web apps that are loaded and run completely off the Web and can also cache locally for offline operation using a manifest. Open Web apps are treated as full class citizens in FireFoxOS and run using the same mechanism as installed apps. Unfortunately FireFoxOs is getting a slow start with minimal device support and specifically targeting the low end market. We can hope that this approach will change and catch on with other vendors, but that’s also an uphill battle given the conflict of interest with platform lock in that it represents. Recent versions of Android also seem to be working reasonably well with mobile application integration onto the desktop and activation out of the box. Although it still uses the Apple meta tags to find icons and behavior settings, everything at least works as you would expect – icons to the desktop on pinning, WebView based full screen activation, and reliable application persistence as the browser/app is treated like a real application. Hopefully iOS will at some point provide this same level of rudimentary Web app support. What’s also interesting to me is that Microsoft hasn’t picked up on the obvious need for a solid Web App platform. Being a distant third in the mobile OS war, Microsoft certainly has nothing to lose and everything to gain by using fresh ideas and expanding into areas that the other major vendors are neglecting. But instead Microsoft is trying to beat the market leaders at their own game, fighting on their adversary’s terms instead of taking a new tack. Providing a kick ass mobile Web platform that takes the lead on some of the proposed mobile APIs would be something positive that Microsoft could do to improve its miserable position in the mobile device market. Where are we at with Mobile Web? It sure sounds like I’m really down on the Mobile Web, right? I’ve built a number of mobile apps in the last year and while overall result and response has been very positive to what we were able to accomplish in terms of UI, getting that final 10% that required device integration dialed was an absolute nightmare on every single one of them. Big compromises had to be made and some features were left out or had to be modified for some devices. In two cases we opted to go the Cordova route in order to get the integration we needed, along with the extra pain involved in that process. Unless you’re not integrating with device features and you don’t care deeply about a smooth integration with the mobile desktop, mobile Web development is fraught with frustration. So, yes I’m frustrated! But it’s not for lack of wanting the mobile Web to succeed. I am still a firm believer that we will eventually arrive a much more functional mobile Web platform that allows access to the most common device features in a sensible way. It wouldn't be difficult for device platform vendors to make Web based applications first class citizens on mobile devices. But unfortunately it looks like it will still be some time before this happens. So, what’s your experience building mobile Web apps? Are you finding similar issues? Just giving up on raw Web applications and building PhoneGap apps instead? Completely skipping the Web and going native? Leave a comment for discussion. Resources Rick Strahl on DotNet Rocks talking about Mobile Web© Rick Strahl, West Wind Technologies, 2005-2014Posted in HTML5  Mobile   Tweet !function(d,s,id){var js,fjs=d.getElementsByTagName(s)[0];if(!d.getElementById(id)){js=d.createElement(s);js.id=id;js.src="//platform.twitter.com/widgets.js";fjs.parentNode.insertBefore(js,fjs);}}(document,"script","twitter-wjs"); (function() { var po = document.createElement('script'); po.type = 'text/javascript'; po.async = true; po.src = 'https://apis.google.com/js/plusone.js'; var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(po, s); })();

    Read the article

  • The Incremental Architect&acute;s Napkin &ndash; #3 &ndash; Make Evolvability inevitable

    - by Ralf Westphal
    Originally posted on: http://geekswithblogs.net/theArchitectsNapkin/archive/2014/06/04/the-incremental-architectacutes-napkin-ndash-3-ndash-make-evolvability-inevitable.aspxThe easier something to measure the more likely it will be produced. Deviations between what is and what should be can be readily detected. That´s what automated acceptance tests are for. That´s what sprint reviews in Scrum are for. It´s no small wonder our software looks like it looks. It has all the traits whose conformance with requirements can easily be measured. And it´s lacking traits which cannot easily be measured. Evolvability (or Changeability) is such a trait. If an operation is correct, if an operation if fast enough, that can be checked very easily. But whether Evolvability is high or low, that cannot be checked by taking a measure or two. Evolvability might correlate with certain traits, e.g. number of lines of code (LOC) per function or Cyclomatic Complexity or test coverage. But there is no threshold value signalling “evolvability too low”; also Evolvability is hardly tangible for the customer. Nevertheless Evolvability is of great importance - at least in the long run. You can get away without much of it for a short time. Eventually, though, it´s needed like any other requirement. Or even more. Because without Evolvability no other requirement can be implemented. Evolvability is the foundation on which all else is build. Such fundamental importance is in stark contrast with its immeasurability. To compensate this, Evolvability must be put at the very center of software development. It must become the hub around everything else revolves. Since we cannot measure Evolvability, though, we cannot start watching it more. Instead we need to establish practices to keep it high (enough) at all times. Chefs have known that for long. That´s why everybody in a restaurant kitchen is constantly seeing after cleanliness. Hygiene is important as is to have clean tools at standardized locations. Only then the health of the patrons can be guaranteed and production efficiency is constantly high. Still a kitchen´s level of cleanliness is easier to measure than software Evolvability. That´s why important practices like reviews, pair programming, or TDD are not enough, I guess. What we need to keep Evolvability in focus and high is… to continually evolve. Change must not be something to avoid but too embrace. To me that means the whole change cycle from requirement analysis to delivery needs to be gone through more often. Scrum´s sprints of 4, 2 even 1 week are too long. Kanban´s flow of user stories across is too unreliable; it takes as long as it takes. Instead we should fix the cycle time at 2 days max. I call that Spinning. No increment must take longer than from this morning until tomorrow evening to finish. Then it should be acceptance checked by the customer (or his/her representative, e.g. a Product Owner). For me there are several resasons for such a fixed and short cycle time for each increment: Clear expectations Absolute estimates (“This will take X days to complete.”) are near impossible in software development as explained previously. Too much unplanned research and engineering work lurk in every feature. And then pervasive interruptions of work by peers and management. However, the smaller the scope the better our absolute estimates become. That´s because we understand better what really are the requirements and what the solution should look like. But maybe more importantly the shorter the timespan the more we can control how we use our time. So much can happen over the course of a week and longer timespans. But if push comes to shove I can block out all distractions and interruptions for a day or possibly two. That´s why I believe we can give rough absolute estimates on 3 levels: Noon Tonight Tomorrow Think of a meeting with a Product Owner at 8:30 in the morning. If she asks you, how long it will take you to implement a user story or bug fix, you can say, “It´ll be fixed by noon.”, or you can say, “I can manage to implement it until tonight before I leave.”, or you can say, “You´ll get it by tomorrow night at latest.” Yes, I believe all else would be naive. If you´re not confident to get something done by tomorrow night (some 34h from now) you just cannot reliably commit to any timeframe. That means you should not promise anything, you should not even start working on the issue. So when estimating use these four categories: Noon, Tonight, Tomorrow, NoClue - with NoClue meaning the requirement needs to be broken down further so each aspect can be assigned to one of the first three categories. If you like absolute estimates, here you go. But don´t do deep estimates. Don´t estimate dozens of issues; don´t think ahead (“Issue A is a Tonight, then B will be a Tomorrow, after that it´s C as a Noon, finally D is a Tonight - that´s what I´ll do this week.”). Just estimate so Work-in-Progress (WIP) is 1 for everybody - plus a small number of buffer issues. To be blunt: Yes, this makes promises impossible as to what a team will deliver in terms of scope at a certain date in the future. But it will give a Product Owner a clear picture of what to pull for acceptance feedback tonight and tomorrow. Trust through reliability Our trade is lacking trust. Customers don´t trust software companies/departments much. Managers don´t trust developers much. I find that perfectly understandable in the light of what we´re trying to accomplish: delivering software in the face of uncertainty by means of material good production. Customers as well as managers still expect software development to be close to production of houses or cars. But that´s a fundamental misunderstanding. Software development ist development. It´s basically research. As software developers we´re constantly executing experiments to find out what really provides value to users. We don´t know what they need, we just have mediated hypothesises. That´s why we cannot reliably deliver on preposterous demands. So trust is out of the window in no time. If we switch to delivering in short cycles, though, we can regain trust. Because estimates - explicit or implicit - up to 32 hours at most can be satisfied. I´d say: reliability over scope. It´s more important to reliably deliver what was promised then to cover a lot of requirement area. So when in doubt promise less - but deliver without delay. Deliver on scope (Functionality and Quality); but also deliver on Evolvability, i.e. on inner quality according to accepted principles. Always. Trust will be the reward. Less complexity of communication will follow. More goodwill buffer will follow. So don´t wait for some Kanban board to show you, that flow can be improved by scheduling smaller stories. You don´t need to learn that the hard way. Just start with small batch sizes of three different sizes. Fast feedback What has been finished can be checked for acceptance. Why wait for a sprint of several weeks to end? Why let the mental model of the issue and its solution dissipate? If you get final feedback after one or two weeks, you hardly remember what you did and why you did it. Resoning becomes hard. But more importantly youo probably are not in the mood anymore to go back to something you deemed done a long time ago. It´s boring, it´s frustrating to open up that mental box again. Learning is harder the longer it takes from event to feedback. Effort can be wasted between event (finishing an issue) and feedback, because other work might go in the wrong direction based on false premises. Checking finished issues for acceptance is the most important task of a Product Owner. It´s even more important than planning new issues. Because as long as work started is not released (accepted) it´s potential waste. So before starting new work better make sure work already done has value. By putting the emphasis on acceptance rather than planning true pull is established. As long as planning and starting work is more important, it´s a push process. Accept a Noon issue on the same day before leaving. Accept a Tonight issue before leaving today or first thing tomorrow morning. Accept a Tomorrow issue tomorrow night before leaving or early the day after tomorrow. After acceptance the developer(s) can start working on the next issue. Flexibility As if reliability/trust and fast feedback for less waste weren´t enough economic incentive, there is flexibility. After each issue the Product Owner can change course. If on Monday morning feature slices A, B, C, D, E were important and A, B, C were scheduled for acceptance by Monday evening and Tuesday evening, the Product Owner can change her mind at any time. Maybe after A got accepted she asks for continuation with D. But maybe, just maybe, she has gotten a completely different idea by then. Maybe she wants work to continue on F. And after B it´s neither D nor E, but G. And after G it´s D. With Spinning every 32 hours at latest priorities can be changed. And nothing is lost. Because what got accepted is of value. It provides an incremental value to the customer/user. Or it provides internal value to the Product Owner as increased knowledge/decreased uncertainty. I find such reactivity over commitment economically very benefical. Why commit a team to some workload for several weeks? It´s unnecessary at beast, and inflexible and wasteful at worst. If we cannot promise delivery of a certain scope on a certain date - which is what customers/management usually want -, we can at least provide them with unpredecented flexibility in the face of high uncertainty. Where the path is not clear, cannot be clear, make small steps so you´re able to change your course at any time. Premature completion Customers/management are used to premeditating budgets. They want to know exactly how much to pay for a certain amount of requirements. That´s understandable. But it does not match with the nature of software development. We should know that by now. Maybe there´s somewhere in the world some team who can consistently deliver on scope, quality, and time, and budget. Great! Congratulations! I, however, haven´t seen such a team yet. Which does not mean it´s impossible, but I think it´s nothing I can recommend to strive for. Rather I´d say: Don´t try this at home. It might hurt you one way or the other. However, what we can do, is allow customers/management stop work on features at any moment. With spinning every 32 hours a feature can be declared as finished - even though it might not be completed according to initial definition. I think, progress over completion is an important offer software development can make. Why think in terms of completion beyond a promise for the next 32 hours? Isn´t it more important to constantly move forward? Step by step. We´re not running sprints, we´re not running marathons, not even ultra-marathons. We´re in the sport of running forever. That makes it futile to stare at the finishing line. The very concept of a burn-down chart is misleading (in most cases). Whoever can only think in terms of completed requirements shuts out the chance for saving money. The requirements for a features mostly are uncertain. So how does a Product Owner know in the first place, how much is needed. Maybe more than specified is needed - which gets uncovered step by step with each finished increment. Maybe less than specified is needed. After each 4–32 hour increment the Product Owner can do an experient (or invite users to an experiment) if a particular trait of the software system is already good enough. And if so, she can switch the attention to a different aspect. In the end, requirements A, B, C then could be finished just 70%, 80%, and 50%. What the heck? It´s good enough - for now. 33% money saved. Wouldn´t that be splendid? Isn´t that a stunning argument for any budget-sensitive customer? You can save money and still get what you need? Pull on practices So far, in addition to more trust, more flexibility, less money spent, Spinning led to “doing less” which also means less code which of course means higher Evolvability per se. Last but not least, though, I think Spinning´s short acceptance cycles have one more effect. They excert pull-power on all sorts of practices known for increasing Evolvability. If, for example, you believe high automated test coverage helps Evolvability by lowering the fear of inadverted damage to a code base, why isn´t 90% of the developer community practicing automated tests consistently? I think, the answer is simple: Because they can do without. Somehow they manage to do enough manual checks before their rare releases/acceptance checks to ensure good enough correctness - at least in the short term. The same goes for other practices like component orientation, continuous build/integration, code reviews etc. None of that is compelling, urgent, imperative. Something else always seems more important. So Evolvability principles and practices fall through the cracks most of the time - until a project hits a wall. Then everybody becomes desperate; but by then (re)gaining Evolvability has become as very, very difficult and tedious undertaking. Sometimes up to the point where the existence of a project/company is in danger. With Spinning that´s different. If you´re practicing Spinning you cannot avoid all those practices. With Spinning you very quickly realize you cannot deliver reliably even on your 32 hour promises. Spinning thus is pulling on developers to adopt principles and practices for Evolvability. They will start actively looking for ways to keep their delivery rate high. And if not, management will soon tell them to do that. Because first the Product Owner then management will notice an increasing difficulty to deliver value within 32 hours. There, finally there emerges a way to measure Evolvability: The more frequent developers tell the Product Owner there is no way to deliver anything worth of feedback until tomorrow night, the poorer Evolvability is. Don´t count the “WTF!”, count the “No way!” utterances. In closing For sustainable software development we need to put Evolvability first. Functionality and Quality must not rule software development but be implemented within a framework ensuring (enough) Evolvability. Since Evolvability cannot be measured easily, I think we need to put software development “under pressure”. Software needs to be changed more often, in smaller increments. Each increment being relevant to the customer/user in some way. That does not mean each increment is worthy of shipment. It´s sufficient to gain further insight from it. Increments primarily serve the reduction of uncertainty, not sales. Sales even needs to be decoupled from this incremental progress. No more promises to sales. No more delivery au point. Rather sales should look at a stream of accepted increments (or incremental releases) and scoup from that whatever they find valuable. Sales and marketing need to realize they should work on what´s there, not what might be possible in the future. But I digress… In my view a Spinning cycle - which is not easy to reach, which requires practice - is the core practice to compensate the immeasurability of Evolvability. From start to finish of each issue in 32 hours max - that´s the challenge we need to accept if we´re serious increasing Evolvability. Fortunately higher Evolvability is not the only outcome of Spinning. Customer/management will like the increased flexibility and “getting more bang for the buck”.

    Read the article

  • Automatic Standby Recreation for Data Guard

    - by pablo.boixeda(at)oracle.com
    Hi,Unfortunately sometimes a Standby Instance needs to be recreated. This can happen for many reasons such as lost archive logs, standby data files, failover, among others.This is why we wanted to have one script to recreate standby instances in an easy way.This script recreates the standby considering some prereqs:-Database Version should be at least 11gR1-Dummy instance started on the standby node (Seeking to improve this so it won't be needed)-Broker configuration hasn't been removed-In our case we have two TNSNAMES files, one for the Standby creation (using SID) and the other one for production using service names (including broker service name)-Some environment variables set up by the environment db script (like ORACLE_HOME, PATH...)-The directory tree should not have been modified in the stanby hostWe are currently using it on our 11gR2 Data Guard tests.Any improvements will be welcome! Normal 0 21 false false false ES X-NONE X-NONE MicrosoftInternetExplorer4 /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0cm 5.4pt 0cm 5.4pt; mso-para-margin-top:0cm; mso-para-margin-right:0cm; mso-para-margin-bottom:10.0pt; mso-para-margin-left:0cm; line-height:115%; mso-pagination:widow-orphan; font-size:11.0pt; font-family:"Calibri","sans-serif"; mso-ascii-font-family:Calibri; mso-ascii-theme-font:minor-latin; mso-fareast-font-family:"Times New Roman"; mso-fareast-theme-font:minor-fareast; mso-hansi-font-family:Calibri; mso-hansi-theme-font:minor-latin; mso-bidi-font-family:"Times New Roman"; mso-bidi-theme-font:minor-bidi;} #!/bin/ksh ###    NOMBRE / VERSION ###       recrea_dg.sh   v.1.00 ### ###    DESCRIPCION ###       reacreacion de la Standby ### ###    DEVUELVE ###       0 Creacion de STANDBY correcta ###       1 Fallo ### ###    NOTAS ###       Este shell script NO DEBE MODIFICARSE. ###       Todas las variables y constantes necesarias se toman del entorno. ### ###    MODIFICADO POR:    FECHA:        COMENTARIOS: ###    ---------------    ----------    ------------------------------------- ###      Oracle           15/02/2011    Creacion. ### ### ### Cargar entorno ### V_ADMIN_DIR=`dirname $0` . ${V_ADMIN_DIR}/entorno_bd.sh 1>>/dev/null if [ $? -ne 0 ] then   echo "Error Loading the environment."   exit 1 fi V_RET=0 V_DATE=`/bin/date` V_DATE_F=`/bin/date +%Y%m%d_%H%M%S` V_LOGFILE=${V_TRAZAS}/recrea_dg_${V_DATE_F}.log exec 4>&1 tee ${V_FICH_LOG} >&4 |& exec 1>&p 2>&1 ### ### Variables para Recrear el Data Guard ### V_DB_BR=`echo ${V_DB_NAME}|tr '[:lower:]' '[:upper:]'` if [ "${ORACLE_SID}" = "${V_DB_NAME}01" ] then         V_LOCAL_BR=${V_DB_BR}'01'         V_REMOTE_BR=${V_DB_BR}'02' else         V_LOCAL_BR=${V_DB_BR}'02'         V_REMOTE_BR=${V_DB_BR}'01' fi echo " Getting local instance ROLE ${ORACLE_SID} ..." sqlplus -s /nolog 1>>/dev/null 2>&1 <<-! whenever sqlerror exit 1 connect / as sysdba variable salida number declare   v_database_role v\$database.database_role%type; begin   select database_role into v_database_role from v\$database;   :salida := case v_database_role        when 'PRIMARY' then 2        when 'PHYSICAL STANDBY' then 3        else 4      end; end; / exit :salida ! case $? in 1) echo " ERROR: Cannot get instance ROLE ." | tee -a ${V_LOGFILE}   2>&1    V_RET=1 ;; 2) echo " Local Instance with PRIMARY role." | tee -a ${V_LOGFILE}   2>&1    V_DB_ROLE_LCL=PRIMARY ;; 3) echo " Local Instance with PHYSICAL STANDBY role." | tee -a ${V_LOGFILE}   2>&1    V_DB_ROLE_LCL=STANDBY ;; *) echo " ERROR: UNKNOWN ROLE." | tee -a ${V_LOGFILE}   2>&1    V_RET=1 ;; esac if [ "${V_DB_ROLE_LCL}" = "PRIMARY" ] then         echo "####################################################################" | tee -a ${V_LOGFILE}   2>&1         echo "${V_DATE} - Reacreating  STANDBY Instance." | tee -a ${V_LOGFILE}   2>&1         echo "" | tee -a ${V_LOGFILE}   2>&1         echo "DATAFILES, CONTROL FILES, REDO LOGS and ARCHIVE LOGS in standby instance ${V_REMOTE_BR} will be removed" | tee -a ${V_LOGFILE}   2>&1         echo "" | tee -a ${V_LOGFILE}   2>&1         V_PRIMARY=${V_LOCAL_BR}         V_STANDBY=${V_REMOTE_BR} fi if [ "${V_DB_ROLE_LCL}" = "STANDBY" ] then         echo "####################################################################" | tee -a ${V_LOGFILE}   2>&1         echo "${V_DATE} - Reacreating  STANDBY Instance." | tee -a ${V_LOGFILE}   2>&1         echo "" | tee -a ${V_LOGFILE}   2>&1         echo "DATAFILES, CONTROL FILES, REDO LOGS and ARCHIVE LOGS in standby instance ${V_LOCAL_BR} will be removed" | tee -a ${V_LOGFILE}   2>&1         echo "" | tee -a ${V_LOGFILE}   2>&1         V_PRIMARY=${V_REMOTE_BR}         V_STANDBY=${V_LOCAL_BR} fi # Cargamos las variables de los hosts # Cargamos las variables de los hosts PRY_HOST=`sqlplus  /nolog << EOF | grep KEEP | sed 's/KEEP//;s/[   ]//g' connect sys/${V_DB_PWD}@${V_PRIMARY} as sysdba select 'KEEP',host_name from v\\$instance; EOF` SBY_HOST=`sqlplus  /nolog << EOF | grep KEEP | sed 's/KEEP//;s/[   ]//g' connect sys/${V_DB_PWD}@${V_STANDBY} as sysdba select 'KEEP',host_name from v\\$instance; EOF` echo "el HOST primary es: ${PRY_HOST}" | tee -a ${V_LOGFILE}   2>&1 echo "el HOST standby es: ${SBY_HOST}" | tee -a ${V_LOGFILE}   2>&1 echo "" | tee -a ${V_LOGFILE}   2>&1 ## ## Paramos la instancia STANDBY ## V_DATE=`/bin/date` echo "${V_DATE} - Shutting down Standby instance" | tee -a ${V_LOGFILE}   2>&1 echo "" | tee -a ${V_LOGFILE}   2>&1 echo "********************************************************************************" | tee -a ${V_LOGFILE}   2>&1 ## ## Paramos la instancia STANDBY ## SBY_STATUS=`sqlplus  /nolog << EOF | grep KEEP | sed 's/KEEP//;s/[   ]//g' connect sys/${V_DB_PWD}@${V_STANDBY} as sysdba select 'KEEP',status from v\\$instance; EOF` if [ ${SBY_STATUS} = 'STARTED' ] || [ ${SBY_STATUS} = 'MOUNTED' ] || [ ${SBY_STATUS} = 'OPEN' ] then         echo "${V_DATE} - Standby instance shutdown in progress..." | tee -a ${V_LOGFILE}   2>&1         echo "" | tee -a ${V_LOGFILE}   2>&1         echo "********************************************************************************" | tee -a ${V_LOGFILE}   2>&1         sqlplus -s /nolog 1>>/dev/null 2>&1 <<-!         whenever sqlerror exit 1         connect sys/${V_DB_PWD}@${V_STANDBY} as sysdba         shutdown abort         ! fi V_DATE=`/bin/date` echo "" echo "${V_DATE} - Standby instance stopped" | tee -a ${V_LOGFILE}   2>&1 echo "" | tee -a ${V_LOGFILE}   2>&1 echo "********************************************************************************" | tee -a ${V_LOGFILE}   2>&1 ## ## Eliminamos los ficheros de la base de datos ## V_SBY_SID=`echo ${V_STANDBY}|tr '[:upper:]' '[:lower:]'` V_PRY_SID=`echo ${V_PRIMARY}|tr '[:upper:]' '[:lower:]'` ssh ${SBY_HOST} rm /opt/oracle/db/db${V_DB_NAME}/${V_SBY_SID}/data/*.dbf ssh ${SBY_HOST} rm /opt/oracle/db/db${V_DB_NAME}/${V_SBY_SID}/arch/*.arc ssh ${SBY_HOST} rm /opt/oracle/db/db${V_DB_NAME}/${V_SBY_SID}/ctl/*.ctl ssh ${SBY_HOST} rm /opt/oracle/db/db${V_DB_NAME}/${V_SBY_SID}/redo/*.ctl ssh ${SBY_HOST} rm /opt/oracle/db/db${V_DB_NAME}/${V_SBY_SID}/redo/*.rdo ## ## Startup nomount stby instance ## V_DATE=`/bin/date` echo "" | tee -a ${V_LOGFILE}   2>&1 echo "${V_DATE} - Starting  DUMMY Standby Instance " | tee -a ${V_LOGFILE}   2>&1 echo "" | tee -a ${V_LOGFILE}   2>&1 echo "********************************************************************************" | tee -a ${V_LOGFILE}   2>&1 ssh ${SBY_HOST} touch /home/oracle/init_dg.ora ssh ${SBY_HOST} 'echo "DB_NAME='${V_DB_NAME}'">>/home/oracle/init_dg.ora' ssh ${SBY_HOST} touch /home/oracle/start_dummy.sh ssh ${SBY_HOST} 'echo "ORACLE_HOME=/opt/oracle/db/db'${V_DB_NAME}'/soft/db11.2.0.2 ">>/home/oracle/start_dummy.sh' ssh ${SBY_HOST} 'echo "export ORACLE_HOME">>/home/oracle/start_dummy.sh' ssh ${SBY_HOST} 'echo "PATH=\$ORACLE_HOME/bin:\$PATH">>/home/oracle/start_dummy.sh' ssh ${SBY_HOST} 'echo "export PATH">>/home/oracle/start_dummy.sh' ssh ${SBY_HOST} 'echo "ORACLE_SID='${V_SBY_SID}'">>/home/oracle/start_dummy.sh' ssh ${SBY_HOST} 'echo "export ORACLE_SID">>/home/oracle/start_dummy.sh' ssh ${SBY_HOST} 'echo "sqlplus -s /nolog <<-!" >>/home/oracle/start_dummy.sh' ssh ${SBY_HOST} 'echo "      whenever sqlerror exit 1 ">>/home/oracle/start_dummy.sh' ssh ${SBY_HOST} 'echo "      connect / as sysdba ">>/home/oracle/start_dummy.sh' ssh ${SBY_HOST} 'echo "      startup nomount pfile='\''/home/oracle/init_dg.ora'\''">>/home/oracle/start_dummy.sh' ssh ${SBY_HOST} 'echo "! ">>/home/oracle/start_dummy.sh' ssh ${SBY_HOST} 'chmod 744 /home/oracle/start_dummy.sh' ssh ${SBY_HOST} 'sh /home/oracle/start_dummy.sh' ssh ${SBY_HOST} 'rm /home/oracle/start_dummy.sh' ssh ${SBY_HOST} 'rm /home/oracle/init_dg.ora' ## ## TNSNAMES change, specific for RMAN duplicate ## V_DATE=`/bin/date` echo "" | tee -a ${V_LOGFILE}   2>&1 echo "${V_DATE} - Setting up TNSNAMES in PRIMARY host " | tee -a ${V_LOGFILE}   2>&1 echo "" | tee -a ${V_LOGFILE}   2>&1 echo "********************************************************************************" | tee -a ${V_LOGFILE}   2>&1 ssh ${PRY_HOST} 'cp /opt/oracle/db/db'${V_DB_NAME}'/soft/db11.2.0.2/network/admin/tnsnames.ora.inst  /opt/oracle/db/db'${V_DB_NAME}'/soft/db11.2.0.2/network/admin/tnsnames.ora' V_DATE=`/bin/date` echo "" | tee -a ${V_LOGFILE}   2>&1 echo "${V_DATE} - Starting STANDBY creation with RMAN.. " | tee -a ${V_LOGFILE}   2>&1 echo "" | tee -a ${V_LOGFILE}   2>&1 echo "********************************************************************************" | tee -a ${V_LOGFILE}   2>&1 rman<<-! >>${V_LOGFILE} connect target sys/${V_DB_PWD}@${V_PRIMARY} connect auxiliary sys/${V_DB_PWD}@${V_STANDBY} run { allocate channel prmy1 type disk; allocate channel prmy2 type disk; allocate channel prmy3 type disk; allocate channel prmy4 type disk; allocate auxiliary channel stby type disk; duplicate target database for standby from active database dorecover spfile parameter_value_convert '${V_PRY_SID}','${V_SBY_SID}' set control_files='/opt/oracle/db/db${V_DB_NAME}/${V_SBY_SID}/ctl/control01.ctl','/opt/oracle/db/db${V_DB_NAME}/${V_SBY_SID}/redo/control02.ctl' set db_file_name_convert='/opt/oracle/db/db${V_DB_NAME}/${V_PRY_SID}/','/opt/oracle/db/db${V_DB_NAME}/${V_SBY_SID}/' set log_file_name_convert='/opt/oracle/db/db${V_DB_NAME}/${V_PRY_SID}/','/opt/oracle/db/db${V_DB_NAME}/${V_SBY_SID}/' set 'db_unique_name'='${V_SBY_SID}' set log_archive_config='DG_CONFIG=(${V_PRIMARY},${V_STANDBY})' set fal_client='${V_STANDBY}' set fal_server='${V_PRIMARY}' set log_archive_dest_1='LOCATION=/opt/oracle/db/db${V_DB_NAME}/${V_SBY_SID}/arch DB_UNIQUE_NAME=${V_SBY_SID} MANDATORY VALID_FOR=(ALL_LOGFILES,ALL_ROLES)' set log_archive_dest_2='SERVICE="${V_PRIMARY}"','SYNC AFFIRM DB_UNIQUE_NAME=${V_PRY_SID} DELAY=0 MAX_FAILURE=0 REOPEN=300 REGISTER VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE)' nofilenamecheck ; } ! V_DATE=`/bin/date` if [ $? -ne 0 ] then         echo ""         echo "${V_DATE} - Error creating STANDBY instance"         echo ""         echo "********************************************************************************" else         echo ""         echo "${V_DATE} - STANDBY instance created SUCCESSFULLY "         echo ""         echo "********************************************************************************" fi sqlplus -s /nolog 1>>/dev/null 2>&1 <<-!         whenever sqlerror exit 1         connect sys/${V_DB_PWD}@${V_STANDBY} as sysdba         alter system set local_listener='(ADDRESS=(PROTOCOL=TCP)(HOST=${SBY_HOST})(PORT=1544))' scope=both;         alter system set service_names='${V_DB_NAME}.eu.roca.net,${V_SBY_SID}.eu.roca.net,${V_SBY_SID}_DGMGRL.eu.roca.net' scope=both;         alter database recover managed standby database using current logfile disconnect from session;         alter system set dg_broker_start=true scope=both; ! ## ## TNSNAMES change, back to Production Mode ## V_DATE=`/bin/date` echo " " | tee -a ${V_LOGFILE}   2>&1 echo "${V_DATE} - Restoring TNSNAMES in PRIMARY "  | tee -a ${V_LOGFILE}   2>&1 echo ""  | tee -a ${V_LOGFILE}   2>&1 echo "********************************************************************************"  | tee -a ${V_LOGFILE}   2>&1 ssh ${PRY_HOST} 'cp /opt/oracle/db/db'${V_DB_NAME}'/soft/db11.2.0.2/network/admin/tnsnames.ora.prod  /opt/oracle/db/db'${V_DB_NAME}'/soft/db11.2.0.2/network/admin/tnsnames.ora' echo ""  | tee -a ${V_LOGFILE}   2>&1 echo "${V_DATE} -  Waiting for media recovery before check the DATA GUARD Broker"  | tee -a ${V_LOGFILE}   2>&1 echo ""  | tee -a ${V_LOGFILE}   2>&1 echo "********************************************************************************"  | tee -a ${V_LOGFILE}   2>&1 sleep 200 dgmgrl <<-! | grep SUCCESS 1>/dev/null 2>&1     connect ${V_DB_USR}/${V_DB_PWD}@${V_STANDBY}     show configuration verbose; ! if [ $? -ne 0 ] ; then         echo "       ERROR: El status del Broker no es SUCCESS" | tee -a ${V_LOGFILE}   2>&1 ;         V_RET=1 else          echo "      DATA GUARD OK " | tee -a ${V_LOGFILE}   2>&1 ; Normal 0 21 false false false ES X-NONE X-NONE MicrosoftInternetExplorer4 /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0cm 5.4pt 0cm 5.4pt; mso-para-margin-top:0cm; mso-para-margin-right:0cm; mso-para-margin-bottom:10.0pt; mso-para-margin-left:0cm; line-height:115%; mso-pagination:widow-orphan; font-size:11.0pt; font-family:"Calibri","sans-serif"; mso-ascii-font-family:Calibri; mso-ascii-theme-font:minor-latin; mso-fareast-font-family:"Times New Roman"; mso-fareast-theme-font:minor-fareast; mso-hansi-font-family:Calibri; mso-hansi-theme-font:minor-latin; mso-bidi-font-family:"Times New Roman"; mso-bidi-theme-font:minor-bidi;}         V_RET=0 fi Hope it helps.

    Read the article

  • ANTS Memory Profiler 7.0 Review

    - by Michael B. McLaughlin
    (This is my first review as a part of the GeeksWithBlogs.net Influencers program. It’s a program in which I (and the others who have been selected for it) get the opportunity to check out new products and services and write reviews about them. We don’t get paid for this, but we do generally get to keep a copy of the software or retain an account for some period of time on the service that we review. In this case I received a copy of Red Gate Software’s ANTS Memory Profiler 7.0, which was released in January. I don’t have any upgrade rights nor is my review guided, restrained, influenced, or otherwise controlled by Red Gate or anyone else. But I do get to keep the software license. I will always be clear about what I received whenever I do a review – I leave it up to you to decide whether you believe I can be objective. I believe I can be. If I used something and really didn’t like it, keeping a copy of it wouldn’t be worth anything to me. In that case though, I would simply uninstall/deactivate/whatever the software or service and tell the company what I didn’t like about it so they could (hopefully) make it better in the future. I don’t think it’d be polite to write up a terrible review, nor do I think it would be a particularly good use of my time. There are people who get paid for a living to review things, so I leave it to them to tell you what they think is bad and why. I’ll only spend my time telling you about things I think are good.) Overview of Common .NET Memory Problems When coming to land of managed memory from the wilds of unmanaged code, it’s easy to say to one’s self, “Wow! Now I never have to worry about memory problems again!” But this simply isn’t true. Managed code environments, such as .NET, make many, many things easier. You will never have to worry about memory corruption due to a bad pointer, for example (unless you’re working with unsafe code, of course). But managed code has its own set of memory concerns. For example, failing to unsubscribe from events when you are done with them leaves the publisher of an event with a reference to the subscriber. If you eliminate all your own references to the subscriber, then that memory is effectively lost since the GC won’t delete it because of the publishing object’s reference. When the publishing object itself becomes subject to garbage collection then you’ll get that memory back finally, but that could take a very long time depending of the life of the publisher. Another common source of resource leaks is failing to properly release unmanaged resources. When writing a class that contains members that hold unmanaged resources (e.g. any of the Stream-derived classes, IsolatedStorageFile, most classes ending in “Reader” or “Writer”), you should always implement IDisposable, making sure to use a properly written Dispose method. And when you are using an instance of a class that implements IDisposable, you should always make sure to use a 'using' statement in order to ensure that the object’s unmanaged resources are disposed of properly. (A ‘using’ statement is a nicer, cleaner looking, and easier to use version of a try-finally block. The compiler actually translates it as though it were a try-finally block. Note that Code Analysis warning 2202 (CA2202) will often be triggered by nested using blocks. A properly written dispose method ensures that it only runs once such that calling dispose multiple times should not be a problem. Nonetheless, CA2202 exists and if you want to avoid triggering it then you should write your code such that only the innermost IDisposable object uses a ‘using’ statement, with any outer code making use of appropriate try-finally blocks instead). Then, of course, there are situations where you are operating in a memory-constrained environment or else you want to limit or even eliminate allocations within a certain part of your program (e.g. within the main game loop of an XNA game) in order to avoid having the GC run. On the Xbox 360 and Windows Phone 7, for example, for every 1 MB of heap allocations you make, the GC runs; the added time of a GC collection can cause a game to drop frames or run slowly thereby making it look bad. Eliminating allocations (or else minimizing them and calling an explicit Collect at an appropriate time) is a common way of avoiding this (the other way is to simplify your heap so that the GC’s latency is low enough not to cause performance issues). ANTS Memory Profiler 7.0 When the opportunity to review Red Gate’s recently released ANTS Memory Profiler 7.0 arose, I jumped at it. In order to review it, I was given a free copy (which does not include upgrade rights for future versions) which I am allowed to keep. For those of you who are familiar with ANTS Memory Profiler, you can find a list of new features and enhancements here. If you are an experienced .NET developer who is familiar with .NET memory management issues, ANTS Memory Profiler is great. More importantly still, if you are new to .NET development or you have no experience or limited experience with memory profiling, ANTS Memory Profiler is awesome. From the very beginning, it guides you through the process of memory profiling. If you’re experienced and just want dive in however, it doesn’t get in your way. The help items GAHSFLASHDAJLDJA are well designed and located right next to the UI controls so that they are easy to find without being intrusive. When you first launch it, it presents you with a “Getting Started” screen that contains links to “Memory profiling video tutorials”, “Strategies for memory profiling”, and the “ANTS Memory Profiler forum”. I’m normally the kind of person who looks at a screen like that only to find the “Don’t show this again” checkbox. Since I was doing a review, though, I decided I should examine them. I was pleasantly surprised. The overview video clocks in at three minutes and fifty seconds. It begins by showing you how to get started profiling an application. It explains that profiling is done by taking memory snapshots periodically while your program is running and then comparing them. ANTS Memory Profiler (I’m just going to call it “ANTS MP” from here) analyzes these snapshots in the background while your application is running. It briefly mentions a new feature in Version 7, a new API that give you the ability to trigger snapshots from within your application’s source code (more about this below). You can also, and this is the more common way you would do it, take a memory snapshot at any time from within the ANTS MP window by clicking the “Take Memory Snapshot” button in the upper right corner. The overview video goes on to demonstrate a basic profiling session on an application that pulls information from a database and displays it. It shows how to switch which snapshots you are comparing, explains the different sections of the Summary view and what they are showing, and proceeds to show you how to investigate memory problems using the “Instance Categorizer” to track the path from an object (or set of objects) to the GC’s root in order to find what things along the path are holding a reference to it/them. For a set of objects, you can then click on it and get the “Instance List” view. This displays all of the individual objects (including their individual sizes, values, etc.) of that type which share the same path to the GC root. You can then click on one of the objects to generate an “Instance Retention Graph” view. This lets you track directly up to see the reference chain for that individual object. In the overview video, it turned out that there was an event handler which was holding on to a reference, thereby keeping a large number of strings that should have been freed in memory. Lastly the video shows the “Class List” view, which lets you dig in deeply to find problems that might not have been clear when following the previous workflow. Once you have at least one memory snapshot you can begin analyzing. The main interface is in the “Analysis” tab. You can also switch to the “Session Overview” tab, which gives you several bar charts highlighting basic memory data about the snapshots you’ve taken. If you hover over the individual bars (and the individual colors in bars that have more than one), you will see a detailed text description of what the bar is representing visually. The Session Overview is good for a quick summary of memory usage and information about the different heaps. You are going to spend most of your time in the Analysis tab, but it’s good to remember that the Session Overview is there to give you some quick feedback on basic memory usage stats. As described above in the summary of the overview video, there is a certain natural workflow to the Analysis tab. You’ll spin up your application and take some snapshots at various times such as before and after clicking a button to open a window or before and after closing a window. Taking these snapshots lets you examine what is happening with memory. You would normally expect that a lot of memory would be freed up when closing a window or exiting a document. By taking snapshots before and after performing an action like that you can see whether or not the memory is really being freed. If you already know an area that’s giving you trouble, you can run your application just like normal until just before getting to that part and then you can take a few strategic snapshots that should help you pin down the problem. Something the overview didn’t go into is how to use the “Filters” section at the bottom of ANTS MP together with the Class List view in order to narrow things down. The video tutorials page has a nice 3 minute intro video called “How to use the filters”. It’s a nice introduction and covers some of the basics. I’m going to cover a bit more because I think they’re a really neat, really helpful feature. Large programs can bring up thousands of classes. Even simple programs can instantiate far more classes than you might realize. In a basic .NET 4 WPF application for example (and when I say basic, I mean just MainWindow.xaml with a button added to it), the unfiltered Class List view will have in excess of 1000 classes (my simple test app had anywhere from 1066 to 1148 classes depending on which snapshot I was using as the “Current” snapshot). This is amazing in some ways as it shows you how in stark detail just how immensely powerful the WPF framework is. But hunting through 1100 classes isn’t productive, no matter how cool it is that there are that many classes instantiated and doing all sorts of awesome things. Let’s say you wanted to examine just the classes your application contains source code for (in my simple example, that would be the MainWindow and App). Under “Basic Filters”, click on “Classes with source” under “Show only…”. Voilà. Down from 1070 classes in the snapshot I was using as “Current” to 2 classes. If you then click on a class’s name, it will show you (to the right of the class name) two little icon buttons. Hover over them and you will see that you can click one to view the Instance Categorizer for the class and another to view the Instance List for the class. You can also show classes based on which heap they live on. If you chose both a Baseline snapshot and a Current snapshot then you can use the “Comparing snapshots” filters to show only: “New objects”; “Surviving objects”; “Survivors in growing classes”; or “Zombie objects” (if you aren’t sure what one of these means, you can click the helpful “?” in a green circle icon to bring up a popup that explains them and provides context). Remember that your selection(s) under the “Show only…” heading will still apply, so you should update those selections to make sure you are seeing the view you want. There are also links under the “What is my memory problem?” heading that can help you diagnose the problems you are seeing including one for “I don’t know which kind I have” for situations where you know generally that your application has some problems but aren’t sure what the behavior you have been seeing (OutOfMemoryExceptions, continually growing memory usage, larger memory use than expected at certain points in the program). The Basic Filters are not the only filters there are. “Filter by Object Type” gives you the ability to filter by: “Objects that are disposable”; “Objects that are/are not disposed”; “Objects that are/are not GC roots” (GC roots are things like static variables); and “Objects that implement _______”. “Objects that implement” is particularly neat. Once you check the box, you can then add one or more classes and interfaces that an object must implement in order to survive the filtering. Lastly there is “Filter by Reference”, which gives you the option to pare down the list based on whether an object is “Kept in memory exclusively by” a particular item, a class/interface, or a namespace; whether an object is “Referenced by” one or more of those choices; and whether an object is “Never referenced by” one or more of those choices. Remember that filtering is cumulative, so anything you had set in one of the filter sections still remains in effect unless and until you go back and change it. There’s quite a bit more to ANTS MP – it’s a very full featured product – but I think I touched on all of the most significant pieces. You can use it to debug: a .NET executable; an ASP.NET web application (running on IIS); an ASP.NET web application (running on Visual Studio’s built-in web development server); a Silverlight 4 browser application; a Windows service; a COM+ server; and even something called an XBAP (local XAML browser application). You can also attach to a .NET 4 process to profile an application that’s already running. The startup screen also has a large number of “Charting Options” that let you adjust which statistics ANTS MP should collect. The default selection is a good, minimal set. It’s worth your time to browse through the charting options to examine other statistics that may also help you diagnose a particular problem. The more statistics ANTS MP collects, the longer it will take to collect statistics. So just turning everything on is probably a bad idea. But the option to selectively add in additional performance counters from the extensive list could be a very helpful thing for your memory profiling as it lets you see additional data that might provide clues about a particular problem that has been bothering you. ANTS MP integrates very nicely with all versions of Visual Studio that support plugins (i.e. all of the non-Express versions). Just note that if you choose “Profile Memory” from the “ANTS” menu that it will launch profiling for whichever project you have set as the Startup project. One quick tip from my experience so far using ANTS MP: if you want to properly understand your memory usage in an application you’ve written, first create an “empty” version of the type of project you are going to profile (a WPF application, an XNA game, etc.) and do a quick profiling session on that so that you know the baseline memory usage of the framework itself. By “empty” I mean just create a new project of that type in Visual Studio then compile it and run it with profiling – don’t do anything special or add in anything (except perhaps for any external libraries you’re planning to use). The first thing I tried ANTS MP out on was a demo XNA project of an editor that I’ve been working on for quite some time that involves a custom extension to XNA’s content pipeline. The first time I ran it and saw the unmanaged memory usage I was convinced I had some horrible bug that was creating extra copies of texture data (the demo project didn’t have a lot of texture data so when I saw a lot of unmanaged memory I instantly figured I was doing something wrong). Then I thought to run an empty project through and when I saw that the amount of unmanaged memory was virtually identical, it dawned on me that the CLR itself sits in unmanaged memory and that (thankfully) there was nothing wrong with my code! Quite a relief. Earlier, when discussing the overview video, I mentioned the API that lets you take snapshots from within your application. I gave it a quick trial and it’s very easy to integrate and make use of and is a really nice addition (especially for projects where you want to know what, if any, allocations there are in a specific, complicated section of code). The only concern I had was that if I hadn’t watched the overview video I might never have known it existed. Even then it took me five minutes of hunting around Red Gate’s website before I found the “Taking snapshots from your code" article that explains what DLL you need to add as a reference and what method of what class you should call in order to take an automatic snapshot (including the helpful warning to wrap it in a try-catch block since, under certain circumstances, it can raise an exception, such as trying to call it more than 5 times in 30 seconds. The difficulty in discovering and then finding information about the automatic snapshots API was one thing I thought could use improvement. Another thing I think would make it even better would be local copies of the webpages it links to. Although I’m generally always connected to the internet, I imagine there are more than a few developers who aren’t or who are behind very restrictive firewalls. For them (and for me, too, if my internet connection happens to be down), it would be nice to have those documents installed locally or to have the option to download an additional “documentation” package that would add local copies. Another thing that I wish could be easier to manage is the Filters area. Finding and setting individual filters is very easy as is understanding what those filter do. And breaking it up into three sections (basic, by object, and by reference) makes sense. But I could easily see myself running a long profiling session and forgetting that I had set some filter a long while earlier in a different filter section and then spending quite a bit of time trying to figure out why some problem that was clearly visible in the data wasn’t showing up in, e.g. the instance list before remembering to check all the filters for that one setting that was only culling a few things from view. Some sort of indicator icon next to the filter section names that appears you have at least one filter set in that area would be a nice visual clue to remind me that “oh yeah, I told it to only show objects on the Gen 2 heap! That’s why I’m not seeing those instances of the SuperMagic class!” Something that would be nice (but that Red Gate cannot really do anything about) would be if this could be used in Windows Phone 7 development. If Microsoft and Red Gate could work together to make this happen (even if just on the WP7 emulator), that would be amazing. Especially given the memory constraints that apps and games running on mobile devices need to work within, a good memory profiler would be a phenomenally helpful tool. If anyone at Microsoft reads this, it’d be really great if you could make something like that happen. Perhaps even a (subsidized) custom version just for WP7 development. (For XNA games, of course, you can create a Windows version of the game and use ANTS MP on the Windows version in order to get a better picture of your memory situation. For Silverlight on WP7, though, there’s quite a bit of educated guess work and WeakReference creation followed by forced collections in order to find the source of a memory problem.) The only other thing I found myself wanting was a “Back” button. Between my Windows Phone 7, Zune, and other things, I’ve grown very used to having a “back stack” that lets me just navigate back to where I came from. The ANTS MP interface is surprisingly easy to use given how much it lets you do, and once you start using it for any amount of time, you learn all of the different areas such that you know where to go. And it does remember the state of the areas you were previously in, of course. So if you go to, e.g., the Instance Retention Graph from the Class List and then return back to the Class List, it will remember which class you had selected and all that other state information. Still, a “Back” button would be a welcome addition to a future release. Bottom Line ANTS Memory Profiler is not an inexpensive tool. But my time is valuable. I can easily see ANTS MP saving me enough time tracking down memory problems to justify it on a cost basis. More importantly to me, knowing what is happening memory-wise in my programs and having the confidence that my code doesn’t have any hidden time bombs in it that will cause it to OOM if I leave it running for longer than I do when I spin it up real quickly for debugging or just to see how a new feature looks and feels is a good feeling. It’s a feeling that I like having and want to continue to have. I got the current version for free in order to review it. Having done so, I’ve now added it to my must-have tools and will gladly lay out the money for the next version when it comes out. It has a 14 day free trial, so if you aren’t sure if it’s right for you or if you think it seems interesting but aren’t really sure if it’s worth shelling out the money for it, give it a try.

    Read the article

  • What is hogging my connection?

    - by SF.
    At times it seems like dozens, if not hundreds of root-owned HTTP connections spring up. This is not much of a problem on LAN or WLAN as each of them seems to transfer very little, but if I use GPRS link, my ping times go into minutes (seriously, 80000ms is not infrequent!) and all connections grind to a halt waiting till these end. This usually lasts some 15 minutes and ends about when I start troubleshooting it for real. I've managed to capture a fragment of Nethogs output NetHogs version 0.8.0 PID USER PROGRAM DEV SENT RECEIVED ? root 37.209.147.180:59854-141.101.114.59:80 0.013 0.000 KB/sec ? root 37.209.147.180:59853-141.101.114.59:80 0.000 0.000 KB/sec ? root 37.209.147.180:52804-173.194.70.95:80 0.000 0.000 KB/sec 1954 bw /home/bw/.dropbox-dist/dropbox ppp0 0.000 0.000 KB/sec ? root 37.209.147.180:59851-141.101.114.59:80 0.000 0.000 KB/sec ? root 37.209.147.180:59850-141.101.114.59:80 0.000 0.000 KB/sec ? root 37.209.147.180:52801-173.194.70.95:80 0.000 0.000 KB/sec 13301 bw /usr/lib/firefox/firefox ppp0 0.000 0.000 KB/sec ? root unknown TCP 0.000 0.000 KB/sec Unfortunately, it doesn't display the owning process of these. Does anyone recognize these addresses or is able to suggest how to troubleshoot it further or disable it? Is it some automatic update or something like that? EDIT: per request; netstat -n, for obvious reason that normal netstat won't ever launch as all DNS requests are hogged just the same. netstat -n Active Internet connections (w/o servers) Proto Recv-Q Send-Q Local Address Foreign Address State tcp 0 1 93.154.166.62:51314 198.252.206.16:80 FIN_WAIT1 tcp 0 1 37.209.147.180:44098 198.252.206.16:80 FIN_WAIT1 tcp 0 1 37.209.147.180:59855 141.101.114.59:80 FIN_WAIT1 tcp 1 0 192.168.43.224:38237 213.189.45.39:443 CLOSE_WAIT tcp 1 0 93.154.146.186:35167 75.101.152.29:80 CLOSE_WAIT tcp 1 0 192.168.43.224:32939 199.15.160.100:80 CLOSE_WAIT tcp 1 0 192.168.43.224:55619 63.245.217.207:443 CLOSE_WAIT tcp 1 0 93.154.146.186:60210 75.101.152.29:443 CLOSE_WAIT tcp 1 0 192.168.43.224:32944 199.15.160.100:80 CLOSE_WAIT tcp 0 1 37.209.147.180:52804 173.194.70.95:80 FIN_WAIT1 tcp 1 0 93.154.146.186:46606 23.21.151.181:80 CLOSE_WAIT tcp 1 0 93.154.146.186:52619 107.22.246.76:80 CLOSE_WAIT tcp 415 0 93.154.146.186:36156 82.112.106.104:80 CLOSE_WAIT tcp 1 0 93.154.146.186:50352 107.22.246.76:443 CLOSE_WAIT tcp 1 0 192.168.43.224:55000 213.189.45.44:443 CLOSE_WAIT tcp 0 1 37.209.147.180:59853 141.101.114.59:80 FIN_WAIT1 tcp 1 0 192.168.43.224:32937 199.15.160.100:80 CLOSE_WAIT tcp 1 0 192.168.43.224:56055 93.184.221.40:80 CLOSE_WAIT tcp 415 0 93.154.146.186:36155 82.112.106.104:80 CLOSE_WAIT tcp 0 1 37.209.147.180:44097 198.252.206.16:80 FIN_WAIT1 tcp 1 0 93.154.146.186:35166 75.101.152.29:80 CLOSE_WAIT tcp 1 0 192.168.43.224:32943 199.15.160.100:80 CLOSE_WAIT tcp 1 0 93.154.146.186:46607 23.21.151.181:80 CLOSE_WAIT tcp 1 0 93.154.146.186:36422 23.21.151.181:443 CLOSE_WAIT tcp 1 0 192.168.43.224:36081 93.184.220.148:80 CLOSE_WAIT tcp 1 0 192.168.43.224:44462 213.189.45.29:443 CLOSE_WAIT tcp 1 0 192.168.43.224:32938 199.15.160.100:80 CLOSE_WAIT tcp 1 0 93.154.146.186:36419 23.21.151.181:443 CLOSE_WAIT tcp 0 497 93.154.166.62:51313 198.252.206.16:80 FIN_WAIT1 tcp 0 1 37.209.147.180:59851 141.101.114.59:80 FIN_WAIT1 tcp 0 1 37.209.147.180:44095 198.252.206.16:80 FIN_WAIT1 tcp 1 0 93.154.146.186:46611 23.21.151.181:80 CLOSE_WAIT tcp 1 0 192.168.43.224:38236 213.189.45.39:443 CLOSE_WAIT tcp 0 171 37.209.147.180:45341 173.194.113.146:443 ESTABLISHED tcp 0 1 37.209.147.180:52801 173.194.70.95:80 FIN_WAIT1 tcp 1 0 192.168.43.224:36080 93.184.220.148:80 CLOSE_WAIT tcp 0 1 37.209.147.180:59856 141.101.114.59:80 FIN_WAIT1 tcp 0 1 37.209.147.180:44096 198.252.206.16:80 FIN_WAIT1 tcp 0 1 93.154.166.62:57471 108.160.162.49:80 FIN_WAIT1 tcp 0 1 37.209.147.180:59854 141.101.114.59:80 FIN_WAIT1 tcp 0 171 37.209.147.180:45340 173.194.113.146:443 ESTABLISHED tcp 0 168 37.209.147.180:45334 173.194.113.146:443 FIN_WAIT1 tcp 1 0 93.154.146.186:46609 23.21.151.181:80 CLOSE_WAIT tcp 0 1248 93.154.166.62:58270 64.251.23.59:443 FIN_WAIT1 tcp 0 1 37.209.147.180:59850 141.101.114.59:80 FIN_WAIT1 tcp 1 0 93.154.146.186:35181 75.101.152.29:80 CLOSE_WAIT tcp 232 0 93.154.172.168:46384 198.252.206.25:80 ESTABLISHED tcp 1 0 93.154.146.186:52618 107.22.246.76:80 CLOSE_WAIT tcp 1 0 93.154.172.168:36298 173.194.69.95:443 CLOSE_WAIT tcp 1 0 93.154.146.186:60209 75.101.152.29:443 CLOSE_WAIT tcp 0 168 37.209.147.180:45335 173.194.113.146:443 FIN_WAIT1 tcp 415 0 93.154.146.186:36157 82.112.106.104:80 CLOSE_WAIT tcp 1 0 192.168.43.224:36082 93.184.220.148:80 CLOSE_WAIT tcp 1 0 192.168.43.224:32942 199.15.160.100:80 CLOSE_WAIT tcp 1 0 93.154.146.186:50350 107.22.246.76:443 CLOSE_WAIT tcp 1 0 192.168.43.224:32941 199.15.160.100:80 CLOSE_WAIT tcp 0 534 37.209.147.180:44089 198.252.206.16:80 FIN_WAIT1 tcp 1 0 93.154.146.186:46608 23.21.151.181:80 CLOSE_WAIT tcp 1 0 93.154.146.186:46612 23.21.151.181:80 CLOSE_WAIT udp 0 0 37.209.147.180:49057 193.41.112.14:53 ESTABLISHED udp 0 0 37.209.147.180:51631 193.41.112.18:53 ESTABLISHED udp 0 0 37.209.147.180:34827 193.41.112.18:53 ESTABLISHED udp 0 0 37.209.147.180:35908 193.41.112.14:53 ESTABLISHED udp 0 0 37.209.147.180:44106 193.41.112.14:53 ESTABLISHED udp 0 0 37.209.147.180:42184 193.41.112.14:53 ESTABLISHED udp 0 0 37.209.147.180:54485 193.41.112.14:53 ESTABLISHED udp 0 0 37.209.147.180:42216 193.41.112.18:53 ESTABLISHED udp 0 0 37.209.147.180:51961 193.41.112.14:53 ESTABLISHED udp 0 0 37.209.147.180:48412 193.41.112.14:53 ESTABLISHED The interesting lines from ping got lost, but the summary over past few hours is: --- 8.8.8.8 ping statistics --- 107459 packets transmitted, 104376 received, +22 duplicates, 2% packet loss, time 195427362ms rtt min/avg/max/mdev = 24.822/528.132/90538.257/2519.263 ms, pipe 90 EDIT: Per request: Happened again, reboot didn't help but cleaned up all "hanging" processes. Currently netstat shows: bw@pony:/var/log$ netstat -n -t Active Internet connections (w/o servers) Proto Recv-Q Send-Q Local Address Foreign Address State tcp 0 0 93.154.188.68:42767 74.125.239.143:443 TIME_WAIT tcp 0 0 93.154.188.68:50270 173.194.69.189:443 ESTABLISHED tcp 0 0 93.154.188.68:45250 190.93.244.58:80 TIME_WAIT tcp 0 0 93.154.188.68:53488 173.194.32.198:80 ESTABLISHED tcp 0 0 93.154.188.68:53490 173.194.32.198:80 ESTABLISHED tcp 0 159 93.154.188.68:42741 74.125.239.143:443 LAST_ACK tcp 0 0 93.154.188.68:45808 198.252.206.25:80 ESTABLISHED tcp 0 0 93.154.188.68:52449 173.194.32.199:443 ESTABLISHED tcp 0 0 93.154.188.68:52600 173.194.32.199:443 TIME_WAIT tcp 0 0 93.154.188.68:50300 173.194.69.189:443 TIME_WAIT tcp 0 0 93.154.188.68:45253 190.93.244.58:80 TIME_WAIT tcp 0 0 93.154.188.68:46252 173.194.32.204:443 ESTABLISHED tcp 0 0 93.154.188.68:45246 190.93.244.58:80 ESTABLISHED tcp 0 0 93.154.188.68:47064 173.194.113.143:443 ESTABLISHED tcp 0 0 93.154.188.68:34484 173.194.69.95:443 ESTABLISHED tcp 0 0 93.154.188.68:45252 190.93.244.58:80 TIME_WAIT tcp 0 0 93.154.188.68:54290 173.194.32.202:443 ESTABLISHED tcp 0 0 93.154.188.68:47063 173.194.113.143:443 ESTABLISHED tcp 0 0 93.154.188.68:53469 173.194.32.198:80 TIME_WAIT tcp 0 0 93.154.188.68:45242 190.93.244.58:80 TIME_WAIT tcp 0 0 93.154.188.68:53468 173.194.32.198:80 ESTABLISHED tcp 0 0 93.154.188.68:50299 173.194.69.189:443 TIME_WAIT tcp 0 0 93.154.188.68:42764 74.125.239.143:443 TIME_WAIT tcp 0 0 93.154.188.68:45256 190.93.244.58:80 TIME_WAIT tcp 0 0 93.154.188.68:58047 108.160.162.105:80 ESTABLISHED tcp 0 0 93.154.188.68:45249 190.93.244.58:80 TIME_WAIT tcp 0 0 93.154.188.68:50297 173.194.69.189:443 TIME_WAIT tcp 0 0 93.154.188.68:53470 173.194.32.198:80 ESTABLISHED tcp 0 0 93.154.188.68:34100 68.232.35.121:443 ESTABLISHED tcp 0 0 93.154.188.68:42758 74.125.239.143:443 ESTABLISHED tcp 0 0 93.154.188.68:42765 74.125.239.143:443 TIME_WAIT tcp 0 0 93.154.188.68:39000 173.194.69.95:80 TIME_WAIT tcp 0 0 93.154.188.68:50296 173.194.69.189:443 TIME_WAIT tcp 0 0 93.154.188.68:53467 173.194.32.198:80 ESTABLISHED tcp 0 0 93.154.188.68:42766 74.125.239.143:443 TIME_WAIT tcp 0 0 93.154.188.68:45251 190.93.244.58:80 TIME_WAIT tcp 0 0 93.154.188.68:45248 190.93.244.58:80 TIME_WAIT tcp 0 0 93.154.188.68:45247 190.93.244.58:80 ESTABLISHED tcp 0 159 93.154.188.68:50254 173.194.69.189:443 LAST_ACK tcp 0 0 93.154.188.68:34483 173.194.69.95:443 ESTABLISHED Output of ps: USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND root 1 0.8 0.0 3628 2092 ? Ss 16:52 0:03 /sbin/init root 2 0.0 0.0 0 0 ? S 16:52 0:00 [kthreadd] root 3 0.1 0.0 0 0 ? S 16:52 0:00 [ksoftirqd/0] root 4 0.1 0.0 0 0 ? S 16:52 0:00 [kworker/0:0] root 6 0.0 0.0 0 0 ? S 16:52 0:00 [migration/0] root 7 0.0 0.0 0 0 ? S 16:52 0:00 [watchdog/0] root 8 0.0 0.0 0 0 ? S 16:52 0:00 [migration/1] root 10 0.1 0.0 0 0 ? S 16:52 0:00 [ksoftirqd/1] root 11 0.0 0.0 0 0 ? S 16:52 0:00 [watchdog/1] root 12 0.0 0.0 0 0 ? S 16:52 0:00 [migration/2] root 14 0.1 0.0 0 0 ? S 16:52 0:00 [ksoftirqd/2] root 15 0.0 0.0 0 0 ? S 16:52 0:00 [watchdog/2] root 16 0.0 0.0 0 0 ? S 16:52 0:00 [migration/3] root 17 0.0 0.0 0 0 ? S 16:52 0:00 [kworker/3:0] root 18 0.1 0.0 0 0 ? S 16:52 0:00 [ksoftirqd/3] root 19 0.0 0.0 0 0 ? S 16:52 0:00 [watchdog/3] root 20 0.0 0.0 0 0 ? S< 16:52 0:00 [cpuset] root 21 0.0 0.0 0 0 ? S< 16:52 0:00 [khelper] root 22 0.0 0.0 0 0 ? S 16:52 0:00 [kdevtmpfs] root 23 0.0 0.0 0 0 ? S< 16:52 0:00 [netns] root 24 0.0 0.0 0 0 ? S 16:52 0:00 [sync_supers] root 25 0.0 0.0 0 0 ? S 16:52 0:00 [bdi-default] root 26 0.0 0.0 0 0 ? S< 16:52 0:00 [kintegrityd] root 27 0.0 0.0 0 0 ? S< 16:52 0:00 [kblockd] root 28 0.0 0.0 0 0 ? S< 16:52 0:00 [ata_sff] root 29 0.0 0.0 0 0 ? S 16:52 0:00 [khubd] root 30 0.0 0.0 0 0 ? S< 16:52 0:00 [md] root 42 0.0 0.0 0 0 ? S 16:52 0:00 [khungtaskd] root 43 0.0 0.0 0 0 ? S 16:52 0:00 [kswapd0] root 44 0.0 0.0 0 0 ? SN 16:52 0:00 [ksmd] root 45 0.0 0.0 0 0 ? SN 16:52 0:00 [khugepaged] root 46 0.0 0.0 0 0 ? S 16:52 0:00 [fsnotify_mark] root 47 0.0 0.0 0 0 ? S 16:52 0:00 [ecryptfs-kthrea] root 48 0.0 0.0 0 0 ? S< 16:52 0:00 [crypto] root 59 0.0 0.0 0 0 ? S< 16:52 0:00 [kthrotld] root 70 0.1 0.0 0 0 ? S 16:52 0:00 [kworker/2:1] root 71 0.0 0.0 0 0 ? S 16:52 0:00 [scsi_eh_0] root 72 0.0 0.0 0 0 ? S 16:52 0:00 [scsi_eh_1] root 73 0.0 0.0 0 0 ? S 16:52 0:00 [scsi_eh_2] root 74 0.0 0.0 0 0 ? S 16:52 0:00 [scsi_eh_3] root 75 0.0 0.0 0 0 ? S 16:52 0:00 [kworker/u:2] root 76 0.0 0.0 0 0 ? S 16:52 0:00 [kworker/u:3] root 79 0.0 0.0 0 0 ? S 16:52 0:00 [kworker/1:1] root 99 0.0 0.0 0 0 ? S< 16:52 0:00 [deferwq] root 100 0.0 0.0 0 0 ? S< 16:52 0:00 [charger_manager] root 101 0.0 0.0 0 0 ? S< 16:52 0:00 [devfreq_wq] root 102 0.1 0.0 0 0 ? S 16:52 0:00 [kworker/2:2] root 106 0.0 0.0 0 0 ? S 16:52 0:00 [scsi_eh_4] root 107 0.0 0.0 0 0 ? S 16:52 0:00 [usb-storage] root 108 0.0 0.0 0 0 ? S 16:52 0:00 [scsi_eh_5] root 109 0.0 0.0 0 0 ? S 16:52 0:00 [usb-storage] root 271 0.1 0.0 0 0 ? S 16:52 0:00 [kworker/1:2] root 316 0.0 0.0 0 0 ? S 16:52 0:00 [jbd2/sda1-8] root 317 0.0 0.0 0 0 ? S< 16:52 0:00 [ext4-dio-unwrit] root 440 0.1 0.0 2820 608 ? S 16:52 0:00 upstart-udev-bridge --daemon root 478 0.0 0.0 3460 1648 ? Ss 16:52 0:00 /sbin/udevd --daemon root 632 0.0 0.0 3348 1336 ? S 16:52 0:00 /sbin/udevd --daemon root 633 0.0 0.0 3348 1204 ? S 16:52 0:00 /sbin/udevd --daemon root 782 0.0 0.0 2816 596 ? S 16:52 0:00 upstart-socket-bridge --daemon root 822 0.0 0.0 6684 2400 ? Ss 16:52 0:00 /usr/sbin/sshd -D 102 834 0.2 0.0 4064 1864 ? Ss 16:52 0:01 dbus-daemon --system --fork root 857 0.0 0.1 7420 3380 ? Ss 16:52 0:00 /usr/sbin/modem-manager root 858 0.0 0.0 4784 1636 ? Ss 16:52 0:00 /usr/sbin/bluetoothd syslog 860 0.0 0.0 31068 1496 ? Sl 16:52 0:00 rsyslogd -c5 root 869 0.1 0.1 24280 5564 ? Ssl 16:52 0:00 NetworkManager avahi 883 0.0 0.0 3448 1488 ? S 16:52 0:00 avahi-daemon: running [pony.local] avahi 884 0.0 0.0 3448 436 ? S 16:52 0:00 avahi-daemon: chroot helper root 885 0.0 0.0 0 0 ? S< 16:52 0:00 [kpsmoused] root 892 0.0 0.1 25696 4140 ? Sl 16:52 0:00 /usr/lib/policykit-1/polkitd --no-debug root 923 0.0 0.0 0 0 ? S 16:52 0:00 [scsi_eh_6] root 959 0.0 0.0 0 0 ? S< 16:52 0:00 [krfcommd] root 970 0.0 0.1 7536 3120 ? Ss 16:52 0:00 /usr/sbin/cupsd -F colord 976 0.1 0.3 55080 10396 ? Sl 16:52 0:00 /usr/lib/i386-linux-gnu/colord/colord root 979 0.0 0.0 4632 872 tty4 Ss+ 16:52 0:00 /sbin/getty -8 38400 tty4 root 987 0.0 0.0 4632 884 tty5 Ss+ 16:52 0:00 /sbin/getty -8 38400 tty5 root 994 0.0 0.0 4632 884 tty2 Ss+ 16:52 0:00 /sbin/getty -8 38400 tty2 root 995 0.0 0.0 4632 868 tty3 Ss+ 16:52 0:00 /sbin/getty -8 38400 tty3 root 998 0.0 0.0 4632 876 tty6 Ss+ 16:52 0:00 /sbin/getty -8 38400 tty6 root 1022 0.0 0.0 2176 680 ? Ss 16:52 0:00 acpid -c /etc/acpi/events -s /var/run/acpid.socket root 1029 0.0 0.0 3632 664 ? Ss 16:52 0:00 /usr/sbin/irqbalance daemon 1030 0.0 0.0 2476 120 ? Ss 16:52 0:00 atd root 1031 0.0 0.0 2620 880 ? Ss 16:52 0:00 cron root 1061 0.1 0.0 0 0 ? S 16:52 0:00 [kworker/3:2] root 1064 0.0 1.0 34116 31072 ? SLsl 16:52 0:00 lightdm root 1076 13.4 1.2 118688 37920 tty7 Ssl+ 16:52 0:55 /usr/bin/X :0 -core -auth /var/run/lightdm/root/:0 -nolisten tcp vt7 -novtswit root 1085 0.0 0.0 0 0 ? S 16:52 0:00 [rts_pstor] root 1087 0.0 0.0 0 0 ? S 16:52 0:00 [rtsx-polling] root 1095 0.0 0.0 0 0 ? S< 16:52 0:00 [cfg80211] root 1127 0.0 0.0 0 0 ? S 16:52 0:00 [flush-8:0] root 1130 0.0 0.0 6136 1824 ? Ss 16:52 0:00 /sbin/wpa_supplicant -B -P /run/sendsigs.omit.d/wpasupplicant.pid -u -s -O /va root 1137 0.0 0.1 24604 3164 ? Sl 16:52 0:00 /usr/lib/accountsservice/accounts-daemon root 1140 0.0 0.0 0 0 ? S< 16:52 0:00 [hd-audio0] root 1188 0.0 0.1 34308 3420 ? Sl 16:52 0:00 /usr/sbin/console-kit-daemon --no-daemon root 1425 0.0 0.0 4632 872 tty1 Ss+ 16:52 0:00 /sbin/getty -8 38400 tty1 root 1443 0.1 0.1 29460 4664 ? Sl 16:52 0:00 /usr/lib/upower/upowerd root 1579 0.0 0.1 16540 3272 ? Sl 16:53 0:00 lightdm --session-child 12 19 bw 1623 0.0 0.0 2232 644 ? Ss 16:53 0:00 /bin/sh /usr/bin/startkde bw 1672 0.0 0.0 4092 204 ? Ss 16:53 0:00 /usr/bin/ssh-agent /usr/bin/gpg-agent --daemon --sh --write-env-file=/home/bw/ bw 1673 0.0 0.0 5492 384 ? Ss 16:53 0:00 /usr/bin/gpg-agent --daemon --sh --write-env-file=/home/bw/.gnupg/gpg-agent-in bw 1676 0.0 0.0 3848 792 ? S 16:53 0:00 /usr/bin/dbus-launch --exit-with-session /usr/bin/startkde bw 1677 0.5 0.0 5384 2180 ? Ss 16:53 0:02 //bin/dbus-daemon --fork --print-pid 5 --print-address 7 --session root 1704 0.3 0.1 25348 3600 ? Sl 16:53 0:01 /usr/lib/udisks/udisks-daemon root 1705 0.0 0.0 6620 728 ? S 16:53 0:00 udisks-daemon: not polling any devices bw 1736 0.0 0.0 2008 64 ? S 16:53 0:00 /usr/lib/kde4/libexec/start_kdeinit +kcminit_startup bw 1737 0.0 0.5 115200 15588 ? Ss 16:53 0:00 kdeinit4: kdeinit4 Running... bw 1738 0.1 0.2 116756 8728 ? S 16:53 0:00 kdeinit4: klauncher [kdeinit] --fd=9 bw 1740 0.6 1.0 340524 31264 ? Sl 16:53 0:02 kdeinit4: kded4 [kdeinit] bw 1742 0.0 0.0 8944 2144 ? S 16:53 0:00 /usr/lib/i386-linux-gnu/gconf/gconfd-2 bw 1746 0.2 0.4 92028 14688 ? S 16:53 0:00 /usr/bin/kglobalaccel bw 1748 0.0 0.4 90804 13500 ? S 16:53 0:00 /usr/bin/kwalletd bw 1752 0.1 0.5 103764 15152 ? S 16:53 0:00 /usr/bin/kactivitymanagerd bw 1758 0.0 0.0 2144 280 ? S 16:53 0:00 kwrapper4 ksmserver bw 1759 0.1 0.5 150016 16088 ? Sl 16:53 0:00 kdeinit4: ksmserver [kdeinit] bw 1763 2.2 1.0 178492 32100 ? Sl 16:53 0:08 kwin bw 1772 0.2 0.5 106292 16340 ? Sl 16:53 0:00 /usr/bin/knotify4 bw 1777 0.9 1.1 246120 32912 ? Sl 16:53 0:03 /usr/bin/krunner bw 1778 6.3 2.7 389884 80216 ? Sl 16:53 0:23 /usr/bin/plasma-desktop bw 1785 0.0 0.0 2844 1208 ? S 16:53 0:00 ksysguardd bw 1789 0.1 0.4 82036 14176 ? S 16:53 0:00 /usr/bin/kuiserver bw 1805 0.3 0.1 61560 5612 ? Sl 16:53 0:01 /usr/bin/akonadi_control root 1806 0.0 0.0 0 0 ? S 16:53 0:00 [kworker/0:2] bw 1808 0.1 0.2 211852 8460 ? Sl 16:53 0:00 akonadiserver bw 1810 0.4 0.8 244116 25360 ? Sl 16:53 0:01 /usr/sbin/mysqld --defaults-file=/home/bw/.local/share/akonadi/mysql.conf --da bw 1874 0.0 0.0 35284 2956 ? Sl 16:53 0:00 /usr/bin/xsettings-kde bw 1876 0.0 0.3 68776 9488 ? Sl 16:53 0:00 /usr/bin/nepomukserver bw 1884 0.4 0.9 173876 29240 ? SNl 16:53 0:01 /usr/bin/nepomukservicestub nepomukstorage bw 1902 6.1 2.1 451512 63924 ? Sl 16:53 0:21 /home/bw/.dropbox-dist/dropbox bw 1906 3.8 1.0 142368 32376 ? Rl 16:53 0:13 /usr/bin/yakuake bw 1933 0.0 0.1 54636 4680 ? Sl 16:53 0:00 /usr/bin/zeitgeist-datahub bw 1943 0.5 1.5 164836 46836 ? Sl 16:53 0:01 python /usr/bin/printer-applet bw 1945 0.1 0.1 99636 5048 ? S<l 16:53 0:00 /usr/bin/pulseaudio --start --log-target=syslog rtkit 1947 0.0 0.0 21336 1248 ? SNl 16:53 0:00 /usr/lib/rtkit/rtkit-daemon bw 1958 0.0 0.1 44204 3792 ? Sl 16:53 0:00 /usr/bin/zeitgeist-daemon bw 1972 0.0 0.0 27008 2684 ? Sl 16:53 0:00 /usr/lib/gvfs/gvfsd bw 1974 0.1 0.5 90480 16660 ? Sl 16:53 0:00 /usr/bin/akonadi_agent_launcher akonadi_akonotes_resource akonadi_akonotes_res bw 1984 0.1 0.5 90472 16636 ? Sl 16:53 0:00 /usr/bin/akonadi_agent_launcher akonadi_akonotes_resource akonadi_akonotes_res bw 1985 0.3 0.9 148800 28304 ? S 16:53 0:01 /usr/bin/akonadi_archivemail_agent --identifier akonadi_archivemail_agent bw 1992 0.1 0.5 90020 16148 ? Sl 16:53 0:00 /usr/bin/akonadi_agent_launcher akonadi_contacts_resource akonadi_contacts_res bw 1993 0.1 0.5 90132 16452 ? Sl 16:53 0:00 /usr/bin/akonadi_agent_launcher akonadi_contacts_resource akonadi_contacts_res bw 1994 0.1 0.5 90564 16332 ? Sl 16:53 0:00 /usr/bin/akonadi_agent_launcher akonadi_ical_resource akonadi_ical_resource_0 bw 1995 0.1 0.5 90676 16732 ? Sl 16:53 0:00 /usr/bin/akonadi_agent_launcher akonadi_ical_resource akonadi_ical_resource_1 bw 1996 0.1 0.5 90468 16800 ? Sl 16:53 0:00 /usr/bin/akonadi_agent_launcher akonadi_maildir_resource akonadi_maildir_resou bw 1999 0.2 0.6 99324 19276 ? S 16:53 0:00 /usr/bin/akonadi_maildispatcher_agent --identifier akonadi_maildispatcher_agen bw 2006 0.3 0.9 148808 28332 ? S 16:53 0:01 /usr/bin/akonadi_mailfilter_agent --identifier akonadi_mailfilter_agent bw 2017 0.0 0.1 50256 4716 ? Sl 16:53 0:00 /usr/lib/zeitgeist/zeitgeist-fts bw 2024 0.2 0.6 103632 18376 ? Sl 16:53 0:00 /usr/bin/akonadi_nepomuk_feeder --identifier akonadi_nepomuk_feeder bw 2043 0.0 0.0 4484 280 ? S 16:53 0:00 /bin/cat bw 2101 0.2 0.7 113600 22396 ? Sl 16:53 0:00 /usr/lib/kde4/libexec/polkit-kde-authentication-agent-1 bw 2105 0.2 0.7 114196 22072 ? Sl 16:53 0:00 /usr/bin/nepomukcontroller bw 2156 0.3 1.0 333188 31244 ? Sl 16:54 0:01 /usr/bin/kmix bw 2167 0.0 0.0 6548 2724 pts/2 Ss 16:54 0:00 /bin/bash bw 2177 0.2 0.7 113496 22960 ? Sl 16:54 0:00 /usr/bin/klipper bw 2394 3.5 1.2 52932 35596 ? SNl 16:54 0:11 /usr/bin/virtuoso-t +foreground +configfile /tmp/virtuoso_hX1884.ini +wait root 2460 0.0 0.0 6184 1876 pts/2 S 16:54 0:00 sudo -s root 2500 0.0 0.0 6528 2700 pts/2 S 16:54 0:00 /bin/bash root 2599 0.0 0.0 5444 1280 pts/2 S+ 16:54 0:00 /bin/bash bin/aero root 2606 0.1 0.0 9836 2500 pts/2 S+ 16:54 0:00 wvdial aero2 root 2619 0.0 0.0 3504 1280 pts/2 S 16:54 0:00 /usr/sbin/pppd 57600 modem crtscts defaultroute usehostname -detach user aero bw 2653 0.0 0.0 6600 2880 pts/3 Ss 16:54 0:00 /bin/bash bw 2676 0.4 0.8 130296 24016 ? SNl 16:54 0:01 /usr/bin/nepomukservicestub nepomukfilewatch bw 2679 0.1 0.7 101636 22252 ? SNl 16:54 0:00 /usr/bin/nepomukservicestub nepomukqueryservice bw 2681 0.2 0.8 109836 24280 ? SNl 16:54 0:00 /usr/bin/nepomukservicestub nepomukbackupsync bw 3833 46.0 9.7 829272 288012 ? Rl 16:55 1:46 /usr/lib/firefox/firefox bw 3903 0.0 0.0 35128 2804 ? Sl 16:55 0:00 /usr/lib/at-spi2-core/at-spi-bus-launcher bw 4708 0.1 0.0 6564 2736 pts/4 Ss 16:56 0:00 /bin/bash root 5210 0.0 0.0 0 0 ? S 16:57 0:00 [kworker/u:0] root 6140 0.2 0.0 0 0 ? S 16:58 0:00 [kworker/0:1] root 6371 0.5 0.0 6184 1868 pts/4 S+ 16:59 0:00 sudo nethogs ppp0 root 6411 17.7 0.2 8616 6144 pts/4 S+ 16:59 0:05 nethogs ppp0 bw 6787 0.0 0.0 5464 1220 pts/3 R+ 16:59 0:00 ps auxw

    Read the article

  • Nagging As A Strategy For Better Linking: -z guidance

    - by user9154181
    The link-editor (ld) in Solaris 11 has a new feature that we call guidance that is intended to help you build better objects. The basic idea behind guidance is that if (and only if) you request it, the link-editor will issue messages suggesting better options and other changes you might make to your ld command to get better results. You can choose to take the advice, or you can disable specific types of guidance while acting on others. In some ways, this works like an experienced friend leaning over your shoulder and giving you advice — you're free to take it or leave it as you see fit, but you get nudged to do a better job than you might have otherwise. We use guidance to build the core Solaris OS, and it has proven to be useful, both in improving our objects, and in making sure that regressions don't creep back in later. In this article, I'm going to describe the evolution in thinking and design that led to the implementation of the -z guidance option, as well as give a brief description of how it works. The guidance feature issues non-fatal warnings. However, experience shows that once developers get used to ignoring warnings, it is inevitable that real problems will be lost in the noise and ignored or missed. This is why we have a zero tolerance policy against build noise in the core Solaris OS. In order to get maximum benefit from -z guidance while maintaining this policy, I added the -z fatal-warnings option at the same time. Much of the material presented here is adapted from the arc case: PSARC 2010/312 Link-editor guidance The History Of Unfortunate Link-Editor Defaults The Solaris link-editor is one of the oldest Unix commands. It stands to reason that this would be true — in order to write an operating system, you need the ability to compile and link code. The original link-editor (ld) had defaults that made sense at the time. As new features were needed, command line option switches were added to let the user use them, while maintaining backward compatibility for those who didn't. Backward compatibility is always a concern in system design, but is particularly important in the case of the tool chain (compilers, linker, and related tools), since it is a basic building block for the entire system. Over the years, applications have grown in size and complexity. Important concepts like dynamic linking that didn't exist in the original Unix system were invented. Object file formats changed. In the case of System V Release 4 Unix derivatives like Solaris, the ELF (Extensible Linking Format) was adopted. Since then, the ELF system has evolved to provide tools needed to manage today's larger and more complex environments. Features such as lazy loading, and direct bindings have been added. In an ideal world, many of these options would be defaults, with rarely used options that allow the user to turn them off. However, the reality is exactly the reverse: For backward compatibility, these features are all options that must be explicitly turned on by the user. This has led to a situation in which most applications do not take advantage of the many improvements that have been made in linking over the last 20 years. If their code seems to link and run without issue, what motivation does a developer have to read a complex manpage, absorb the information provided, choose the features that matter for their application, and apply them? Experience shows that only the most motivated and diligent programmers will make that effort. We know that most programs would be improved if we could just get you to use the various whizzy features that we provide, but the defaults conspire against us. We have long wanted to do something to make it easier for our users to use the linkers more effectively. There have been many conversations over the years regarding this issue, and how to address it. They always break down along the following lines: Change ld Defaults Since the world would be a better place the newer ld features were the defaults, why not change things to make it so? This idea is simple, elegant, and impossible. Doing so would break a large number of existing applications, including those of ISVs, big customers, and a plethora of existing open source packages. In each case, the owner of that code may choose to follow our lead and fix their code, or they may view it as an invitation to reconsider their commitment to our platform. Backward compatibility, and our installed base of working software, is one of our greatest assets, and not something to be lightly put at risk. Breaking backward compatibility at this level of the system is likely to do more harm than good. But, it sure is tempting. New Link-Editor One might create a new linker command, not called 'ld', leaving the old command as it is. The new one could use the same code as ld, but would offer only modern options, with the proper defaults for features such as direct binding. The resulting link-editor would be a pleasure to use. However, the approach is doomed to niche status. There is a vast pile of exiting code in the world built around the existing ld command, that reaches back to the 1970's. ld use is embedded in large and unknown numbers of makefiles, and is used by name by compilers that execute it. A Unix link-editor that is not named ld will not find a majority audience no matter how good it might be. Finally, a new linker command will eventually cease to be new, and will accumulate its own burden of backward compatibility issues. An Option To Make ld Do The Right Things Automatically This line of reasoning is best summarized by a CR filed in 2005, entitled 6239804 make it easier for ld(1) to do what's best The idea is to have a '-z best' option that unchains ld from its backward compatibility commitment, and allows it to turn on the "best" set of features, as determined by the authors of ld. The specific set of features enabled by -z best would be subject to change over time, as requirements change. This idea is more realistic than the other two, but was never implemented because it has some important issues that we could never answer to our satisfaction: The -z best proposal assumes that the user can turn it on, and trust it to select good options without the user needing to be aware of the options being applied. This is a fallacy. Features such as direct bindings require the user to do some analysis to ensure that the resulting program will still operate properly. A user who is willing to do the work to verify that what -z best does will be OK for their application is capable of turning on those features directly, and therefore gains little added benefit from -z best. The intent is that when a user opts into -z best, that they understand that z best is subject to sometimes incompatible evolution. Experience teaches us that this won't work. People will use this feature, the meaning of -z best will change, code that used to build will fail, and then there will be complaints and demands to retract the change. When (not if) this occurs, we will of course defend our actions, and point at the disclaimer. We'll win some of those debates, and lose others. Ultimately, we'll end up with -z best2 (-z better), or other compromises, and our goal of simplifying the world will have failed. The -z best idea rolls up a set of features that may or may not be related to each other into a unit that must be taken wholesale, or not at all. It could be that only a subset of what it does is compatible with a given application, in which case the user is expected to abandon -z best and instead set the options that apply to their application directly. In doing so, they lose one of the benefits of -z best, that if you use it, future versions of ld may choose a different set of options, and automatically improve the object through the act of rebuilding it. I drew two conclusions from the above history: For a link-editor, backward compatibility is vital. If a given command line linked your application 10 years ago, you have every reason to expect that it will link today, assuming that the libraries you're linking against are still available and compatible with their previous interfaces. For an application of any size or complexity, there is no substitute for the work involved in examining the code and determining which linker options apply and which do not. These options are largely orthogonal to each other, and it can be reasonable not to use any or all of them, depending on the situation, even in modern applications. It is a mistake to tie them together. The idea for -z guidance came from consideration of these points. By decoupling the advice from the act of taking the advice, we can retain the good aspects of -z best while avoiding its pitfalls: -z guidance gives advice, but the decision to take that advice remains with the user who must evaluate its merit and make a decision to take it or not. As such, we are free to change the specific guidance given in future releases of ld, without breaking existing applications. The only fallout from this will be some new warnings in the build output, which can be ignored or dealt with at the user's convenience. It does not couple the various features given into a single "take it or leave it" option, meaning that there will never be a need to offer "-zguidance2", or other such variants as things change over time. Guidance has the potential to be our final word on this subject. The user is given the flexibility to disable specific categories of guidance without losing the benefit of others, including those that might be added to future versions of the system. Although -z fatal-warnings stands on its own as a useful feature, it is of particular interest in combination with -z guidance. Used together, the guidance turns from advice to hard requirement: The user must either make the suggested change, or explicitly reject the advice by specifying a guidance exception token, in order to get a build. This is valuable in environments with high coding standards. ld Command Line Options The guidance effort resulted in new link-editor options for guidance and for turning warnings into fatal errors. Before I reproduce that text here, I'd like to highlight the strategic decisions embedded in the guidance feature: In order to get guidance, you have to opt in. We hope you will opt in, and believe you'll get better objects if you do, but our default mode of operation will continue as it always has, with full backward compatibility, and without judgement. Guidance suggestions always offers specific advice, and not vague generalizations. You can disable some guidance without turning off the entire feature. When you get guidance warnings, you can choose to take the advice, or you can specify a keyword to disable guidance for just that category. This allows you to get guidance for things that are useful to you, without being bothered about things that you've already considered and dismissed. As the world changes, we will add new guidance to steer you in the right direction. All such new guidance will come with a keyword that let's you turn it off. In order to facilitate building your code on different versions of Solaris, we quietly ignore any guidance keywords we don't recognize, assuming that they are intended for newer versions of the link-editor. If you want to see what guidance tokens ld does and does not recognize on your system, you can use the ld debugging feature as follows: % ld -Dargs -z guidance=foo,nodefs debug: debug: Solaris Linkers: 5.11-1.2275 debug: debug: arg[1] option=-D: option-argument: args debug: arg[2] option=-z: option-argument: guidance=foo,nodefs debug: warning: unrecognized -z guidance item: foo The -z fatal-warning option is straightforward, and generally useful in environments with strict coding standards. Note that the GNU ld already had this feature, and we accept their option names as synonyms: -z fatal-warnings | nofatal-warnings --fatal-warnings | --no-fatal-warnings The -z fatal-warnings and the --fatal-warnings option cause the link-editor to treat warnings as fatal errors. The -z nofatal-warnings and the --no-fatal-warnings option cause the link-editor to treat warnings as non-fatal. This is the default behavior. The -z guidance option is defined as follows: -z guidance[=item1,item2,...] Provide guidance messages to suggest ld options that can improve the quality of the resulting object, or which are otherwise considered to be beneficial. The specific guidance offered is subject to change over time as the system evolves. Obsolete guidance offered by older versions of ld may be dropped in new versions. Similarly, new guidance may be added to new versions of ld. Guidance therefore always represents current best practices. It is possible to enable guidance, while preventing specific guidance messages, by providing a list of item tokens, representing the class of guidance to be suppressed. In this way, unwanted advice can be suppressed without losing the benefit of other guidance. Unrecognized item tokens are quietly ignored by ld, allowing a given ld command line to be executed on a variety of older or newer versions of Solaris. The guidance offered by the current version of ld, and the item tokens used to disable these messages, are as follows. Specify Required Dependencies Dynamic executables and shared objects should explicitly define all of the dependencies they require. Guidance recommends the use of the -z defs option, should any symbol references remain unsatisfied when building dynamic objects. This guidance can be disabled with -z guidance=nodefs. Do Not Specify Non-Required Dependencies Dynamic executables and shared objects should not define any dependencies that do not satisfy the symbol references made by the dynamic object. Guidance recommends that unused dependencies be removed. This guidance can be disabled with -z guidance=nounused. Lazy Loading Dependencies should be identified for lazy loading. Guidance recommends the use of the -z lazyload option should any dependency be processed before either a -z lazyload or -z nolazyload option is encountered. This guidance can be disabled with -z guidance=nolazyload. Direct Bindings Dependencies should be referenced with direct bindings. Guidance recommends the use of the -B direct, or -z direct options should any dependency be processed before either of these options, or the -z nodirect option is encountered. This guidance can be disabled with -z guidance=nodirect. Pure Text Segment Dynamic objects should not contain relocations to non-writable, allocable sections. Guidance recommends compiling objects with Position Independent Code (PIC) should any relocations against the text segment remain, and neither the -z textwarn or -z textoff options are encountered. This guidance can be disabled with -z guidance=notext. Mapfile Syntax All mapfiles should use the version 2 mapfile syntax. Guidance recommends the use of the version 2 syntax should any mapfiles be encountered that use the version 1 syntax. This guidance can be disabled with -z guidance=nomapfile. Library Search Path Inappropriate dependencies that are encountered by ld are quietly ignored. For example, a 32-bit dependency that is encountered when generating a 64-bit object is ignored. These dependencies can result from incorrect search path settings, such as supplying an incorrect -L option. Although benign, this dependency processing is wasteful, and might hide a build problem that should be solved. Guidance recommends the removal of any inappropriate dependencies. This guidance can be disabled with -z guidance=nolibpath. In addition, -z guidance=noall can be used to entirely disable the guidance feature. See Chapter 7, Link-Editor Quick Reference, in the Linker and Libraries Guide for more information on guidance and advice for building better objects. Example The following example demonstrates how the guidance feature is intended to work. We will build a shared object that has a variety of shortcomings: Does not specify all it's dependencies Specifies dependencies it does not use Does not use direct bindings Uses a version 1 mapfile Contains relocations to the readonly allocable text (not PIC) This scenario is sadly very common — many shared objects have one or more of these issues. % cat hello.c #include <stdio.h> #include <unistd.h> void hello(void) { printf("hello user %d\n", getpid()); } % cat mapfile.v1 # This version 1 mapfile will trigger a guidance message % cc hello.c -o hello.so -G -M mapfile.v1 -lelf As you can see, the operation completes without error, resulting in a usable object. However, turning on guidance reveals a number of things that could be better: % cc hello.c -o hello.so -G -M mapfile.v1 -lelf -zguidance ld: guidance: version 2 mapfile syntax recommended: mapfile.v1 ld: guidance: -z lazyload option recommended before first dependency ld: guidance: -B direct or -z direct option recommended before first dependency Undefined first referenced symbol in file getpid hello.o (symbol belongs to implicit dependency /lib/libc.so.1) printf hello.o (symbol belongs to implicit dependency /lib/libc.so.1) ld: warning: symbol referencing errors ld: guidance: -z defs option recommended for shared objects ld: guidance: removal of unused dependency recommended: libelf.so.1 warning: Text relocation remains referenced against symbol offset in file .rodata1 (section) 0xa hello.o getpid 0x4 hello.o printf 0xf hello.o ld: guidance: position independent (PIC) code recommended for shared objects ld: guidance: see ld(1) -z guidance for more information Given the explicit advice in the above guidance messages, it is relatively easy to modify the example to do the right things: % cat mapfile.v2 # This version 2 mapfile will not trigger a guidance message $mapfile_version 2 % cc hello.c -o hello.so -Kpic -G -Bdirect -M mapfile.v2 -lc -zguidance There are situations in which the guidance does not fit the object being built. For instance, you want to build an object without direct bindings: % cc -Kpic hello.c -o hello.so -G -M mapfile.v2 -lc -zguidance ld: guidance: -B direct or -z direct option recommended before first dependency ld: guidance: see ld(1) -z guidance for more information It is easy to disable that specific guidance warning without losing the overall benefit from allowing the remainder of the guidance feature to operate: % cc -Kpic hello.c -o hello.so -G -M mapfile.v2 -lc -zguidance=nodirect Conclusions The linking guidelines enforced by the ld guidance feature correspond rather directly to our standards for building the core Solaris OS. I'm sure that comes as no surprise. It only makes sense that we would want to build our own product as well as we know how. Solaris is usually the first significant test for any new linker feature. We now enable guidance by default for all builds, and the effect has been very positive. Guidance helps us find suboptimal objects more quickly. Programmers get concrete advice for what to change instead of vague generalities. Even in the cases where we override the guidance, the makefile rules to do so serve as documentation of the fact. Deciding to use guidance is likely to cause some up front work for most code, as it forces you to consider using new features such as direct bindings. Such investigation is worthwhile, but does not come for free. However, the guidance suggestions offer a structured and straightforward way to tackle modernizing your objects, and once that work is done, for keeping them that way. The investment is often worth it, and will replay you in terms of better performance and fewer problems. I hope that you find guidance to be as useful as we have.

    Read the article

  • Windows 7 intermittently drops wired internet/lan connection.

    - by CraigTP
    In a nutshell, my Windows 7 Ultimate PC intermittently drops it's internet connection. Why? Background: My PC is wired to my ADSL modem/router which is directly connected to the phone line. I also have wireless connectivity turned on within the router for a laptop to connect wirelessly. Every few hours or so, when using my PC, I find I cannot access the internet and pages will not load. Eventually, Windows7 will update the network icon in the task-tray to show the exclamation mark symbol on the network icon. Opening up the Network And Sharing Centre will show the red cross between the "Multiple Networks" and "The Internet". Here's a picture of the "Network And Sharing Centre" (grabbed when everything was working!) As you can see, I'm running Sun's VirtualBox on this machine and that creates a Network connection for itself. This doesn't seem to affect the intermittent dropping (i.e. the intermittent drops occur whether the VirtualBox connection is in use or not). When the connection does drop, I cannot access any internet pages, nor can I access the router's web admin page at http://192.168.1.1/, so I'm assuming I've lost all local LAN access too. It's definitely not the router (or the internet connection itself) as my laptop, using the wireless connection (and running Vista Home Premium) continues to be able to access the internet (and the router's web admin pages) just fine. Every time this happens, I can immediately restore all internet and LAN access by opening Network Adapter page, disabling the "Local Area Connection" and then re-enabling it. Give it a few seconds and everything is fine again. I assume this is because, beneath the GUI, it's effectively doing an "ipconfig /release" then "ipconfig /renew". Why does this happen in the first place, though? I've googled for this and seen quite a few other people (even on MSDN/Technet forums) experiencing the same or almost the same problem, but with no clear resolution. Suggestions of turning off IPv6 on the LAN adapter, and ensuring there's no power management "sleeping" the network adapter have been tried but do not cure the problem. There does not seem to be any particular sequence of events that cause it to happen either. I've had it go twice in 20 minutes when just randomly browsing the web with no other traffic, and I've also had it go once then not go again for 2-3 hours with the same sort of usage. Can anyone tell me why this is happening and how to make it stop? EDIT: Additional information based upon the answer provided so far: Firstly, I forgot the mention that this is Windows 7 64 bit if that makes any difference at all. I mentioned that I don't think the VirtualBox network adpater is causing this problem in any way, and I also have VirtualBox installed on two other machines, one running Vista Home Premium and the other running XP. Neither of these machine experience the same network connectivity issues as the Windows 7 machine. The IP assignment for the Windows 7 machine is the same both before and after the "drop". I have a DHCP server on the router issuing IP Addresses, however my Windows 7 machine uses a static address. Here's the output from "ipconfig": Ethernet adapter Local Area Connection: Connection-specific DNS Suffix . : Description . . . . . . . . . . . : Realtek PCIe GBE Family Controller DHCP Enabled. . . . . . . . . . . : No Autoconfiguration Enabled . . . . : Yes IPv4 Address. . . . . . . . . . . : 192.168.1.2(Preferred) Subnet Mask . . . . . . . . . . . : 255.255.255.0 Default Gateway . . . . . . . . . : 192.168.1.1 DNS Servers . . . . . . . . . . . : 192.168.1.1 NetBIOS over Tcpip. . . . . . . . : Enabled Within the system's event logs, the only event that relates to the connection dropping is a "DNS Client Event" and this is generated after the connection has dropped and is an event detailing that DNS information can't be found for whatever website I may be trying to access, just as the connection drops: Log Name: System Source: Microsoft-Windows-DNS-Client Event ID: 1014 Task Category: None Level: Warning Keywords: User: NETWORK SERVICE Description: Name resolution for the name weather.service.msn.com timed out after none of the configured DNS servers responded. The network adapter chipset is Realtek PCIe GBE Family Controller and I have confirmed that this is the correct chipset for the motherboard (Asus M4A77TD PRO), and in fact, Windows Update installed an updated driver for this on 12/Jan/2009. The details of the update say that it's a Realtek software update from December 2009. Incidentally, I was still having the same intermittent problems prior to this update. It seems to have made no difference at all. EDIT 2 (1 Feb 2010): In my quest to solve this problem, I have discovered some more interesting information. On another forum, someone suggested that I should try running Windows in "Safe Mode With Networking" and see if the problem continues to occur. This was a fantastic suggestion and I don't know why I didn't think of it sooner myself. So, I proceeded to run in Safe Mode with Networking for a number of hours, and amazingly, the "drops" didn't occur once. It was a positive discovery, however, due to the intermittent nature of the original problem, I wasn't completely convinced that the problem was cured. One thing I did note is that the fan on my GFX card was running alot louder than normal. This is due to the fact that I have an ASUS ENGTS250 graphics card (http://www.asus.com/product.aspx?P_ID=B6imcoax3MRY42f3) which had a known problem with a noisy fan until a BIOS update fixed the issue. (See the "Manufacturer Response" here: http://www.newegg.com/Product/Product.aspx?Item=N82E16814121334 for details). Well, running in safe mode had the fan running (incorrectly) at full speed (as it did before the BIOS update), but with an (apparently) stable network connection. Obviously some driver was not loaded for the GFX card when in Safe Mode so this got me thinking about the GFX card (since the very noisy fan was quite obvious when running in Safe Mode). I rebooted into normal mode, and found that Nvidia had a very up-to-date new driver for my GFX card (only about 1 week old), so I downloaded the appropriate driver and installed it. After installation and a reboot, I was able to use my PC for an entire day with NO NETWORK DROPS!!! This was on Saturday. However, on the Sunday, I also had my PC for pretty much the entire day and experienced 2 network drops. No other changes have been made to my PC in this time. So, the story seems to be that updating my graphics card drivers seems to have improved (if not completely fixed) the issue, however, I'm still searching for a proper fix for this problem. Hopefully, this information may help anyone who may have additional ideas as to why this problem is occuring in the first place. (And why does new GFX card drivers have anything to do with the network?) I appreciate everyone's feedback so far. However, I'll have to ask once more if anyone has any further ideas of how to fix this particular problem? Thanks in advance.

    Read the article

  • Can't join OS X Mavericks to AD Domain

    - by watkipet
    I'm attempting to join an OS X Mavericks (10.9) client to a Windows Server 2008 Active Directory domain, however the bind fails with this error in the OS X client's system.log: Oct 24 15:03:15 host.domain.com com.apple.preferences.users.remoteservice[5547]: -[ODCAddServerSheetController handleOtherActionError: gotError: Error Domain=com.apple.OpenDirectory Code=5202 "Authentication server encountered an error while attempting the requested operation." UserInfo=0x7f9e6cb3e180 {NSLocalizedDescription=Authentication server encountered an error while attempting the requested operation., NSLocalizedFailureReason=Authentication server encountered an error while attempting the requested operation.}, Authentication server encountered an error while attempting the requested operation. I've joined (bound) Ubuntu Linux clients to the same domain with net ads join in the past with no problems (using the same administrative user). I don't have access to any server logs. Here's the GUI error (from Directory Utility) on the OS X client: Here's the GUI error (from User's and Groups) in System Preferences on the OS X client: Update After some Wiresharking I've got some more info: OS X Client - KDC (over UDP): AS_REQ (no padata) OS X Client <- KDC (over UDP): KRB5KDC_ERR_PREAUTH_REQUIRED OS X Client - KDC (over UDP): AS_REQ (this time with PA-ENC-TIMESTAMP in padata) OS X Client <- KDC (over UDP): KRB5KDC_ERR_RESPONSE_TOO_BIG OS X Client - KDC (over TCP): AS_REQ (also with PA-ENC-TIMESTAMP in padata) OS X Client <- KDC (over TCP): KDC_ERR_ETYPE_NOSUPP ...and that's it. This is what I think is going on: The OS X client sends a kerberos request. The KDC says, "You need to pre-authenticate. Try again" The OS X client tries to pre-authenticate (all this so far is over UDP) Something gets lost on our network and the KDC says, "Oops something went wrong" The OS X client switches to TCP and tries again. Over TCP, the KDC says, "You're using an encryption type I don't support" Note that in its padata records, the OS X client is always using "aes256-cts-hmac-sha1-96" as its encryption type. However, in its KDC_REQ_BODY record it lists the aes256-cts-hmac-sha1-96, aes128-cts-hmac-sha1-96, des3-cbc-sha1, and rc4-hmac encryption types. When the KDC comes back with KDC_ERR_ETYPE_NOSUPP, it uses rc4-hmac as its encryption type in its padata record. I know next to nothing about Kerberos, but it seems to me that the OS X client should go ahead and try the rc4-hmac encryption type. However, it does nothing after this. Update 2 Here's the debug log from Directory Services on the OS X client. Sorry--it's long. 2013-10-25 14:19:13.219128 PDT - 10544.20463 - ODNodeCustomCall request, NodeID: 52A65FAE-4B24-455D-86EC-2199A780D234, Code: 80 2013-10-25 14:19:13.220409 PDT - 10544.20463, Node: /Active Directory, Module: ActiveDirectory - client requested OU - 'CN=Computers,DC=domain,DC=com' 2013-10-25 14:19:13.220427 PDT - 10544.20463, Node: /Active Directory, Module: ActiveDirectory - Binding using '[email protected]' for kerberos ID 2013-10-25 14:19:13.220571 PDT - 10544.20463, Node: /Active Directory, Module: ActiveDirectory - new kerberos credential cache 'MEMORY:0x7fa713635470' for '[email protected]' 2013-10-25 14:19:13.220623 PDT - 10544.20463, Node: /Active Directory, Module: ActiveDirectory - krb5_credential - krb5_get_init_creds: loop 1 2013-10-25 14:19:13.220639 PDT - 10544.20463, Node: /Active Directory, Module: ActiveDirectory - krb5_credential - KDC send 0 patypes 2013-10-25 14:19:13.220653 PDT - 10544.20463, Node: /Active Directory, Module: ActiveDirectory - krb5_credential - fast disabled, not doing any fast wrapping 2013-10-25 14:19:13.220699 PDT - 10544.20463, Node: /Active Directory, Module: ActiveDirectory - krb5_credential - Trying to find service kdc for realm DOMAIN.COM flags 0 2013-10-25 14:19:13.221275 PDT - 10544.20463, Node: /Active Directory, Module: ActiveDirectory - krb5_credential - submissing new requests to new host 2013-10-25 14:19:13.221326 PDT - 10544.20463, Node: /Active Directory, Module: ActiveDirectory - krb5_credential - connecting to host: udp 192.168.0.1:kerberos (192.168.0.1) tid: 00000001 2013-10-25 14:19:13.221373 PDT - 10544.20463, Node: /Active Directory, Module: ActiveDirectory - krb5_credential - writing packet: udp 192.168.0.1:kerberos (192.168.0.1) tid: 00000001 2013-10-25 14:19:13.222588 PDT - 10544.20463, Node: /Active Directory, Module: ActiveDirectory - krb5_credential - reading packet: udp 192.168.0.1:kerberos (192.168.0.1) tid: 00000001 2013-10-25 14:19:13.222617 PDT - 10544.20463, Node: /Active Directory, Module: ActiveDirectory - krb5_credential - host completed: udp 192.168.0.1:kerberos (192.168.0.1) tid: 00000001 2013-10-25 14:19:13.222665 PDT - 10544.20463, Node: /Active Directory, Module: ActiveDirectory - krb5_credential - krb5_sendto_context DOMAIN.COM done: 0 hosts 1 packets 1 wc: 0.001960 nr: 0.000000 kh: 0.000560 tid: 00000001 2013-10-25 14:19:13.222705 PDT - 10544.20463, Node: /Active Directory, Module: ActiveDirectory - krb5_credential - krb5_get_init_creds: loop 2 2013-10-25 14:19:13.222737 PDT - 10544.20463, Node: /Active Directory, Module: ActiveDirectory - krb5_credential - krb5_get_init_creds: processing input 2013-10-25 14:19:13.222752 PDT - 10544.20463, Node: /Active Directory, Module: ActiveDirectory - krb5_credential - krb5_get_init_creds: got an KRB-ERROR from KDC 2013-10-25 14:19:13.222775 PDT - 10544.20463, Node: /Active Directory, Module: ActiveDirectory - krb5_credential - krb5_get_init_creds: KRB-ERROR -1765328359/Additional pre-authentication required 2013-10-25 14:19:13.222791 PDT - 10544.20463, Node: /Active Directory, Module: ActiveDirectory - krb5_credential - KDC send 4 patypes 2013-10-25 14:19:13.222800 PDT - 10544.20463, Node: /Active Directory, Module: ActiveDirectory - krb5_credential - KDC send PA-DATA type: 19 2013-10-25 14:19:13.222808 PDT - 10544.20463, Node: /Active Directory, Module: ActiveDirectory - krb5_credential - KDC send PA-DATA type: 2 2013-10-25 14:19:13.222816 PDT - 10544.20463, Node: /Active Directory, Module: ActiveDirectory - krb5_credential - KDC send PA-DATA type: 16 2013-10-25 14:19:13.222825 PDT - 10544.20463, Node: /Active Directory, Module: ActiveDirectory - krb5_credential - KDC send PA-DATA type: 15 2013-10-25 14:19:13.222840 PDT - 10544.20463, Node: /Active Directory, Module: ActiveDirectory - krb5_credential - krb5_get_init_creds: using ENC-TS with enctype 18 2013-10-25 14:19:13.222850 PDT - 10544.20463, Node: /Active Directory, Module: ActiveDirectory - krb5_credential - krb5_get_init_creds: using default_s2k_func 2013-10-25 14:19:13.227443 PDT - 10544.20463, Node: /Active Directory, Module: ActiveDirectory - krb5_credential - fast disabled, not doing any fast wrapping 2013-10-25 14:19:13.227502 PDT - 10544.20463, Node: /Active Directory, Module: ActiveDirectory - krb5_credential - Trying to find service kdc for realm DOMAIN.COM flags 0 2013-10-25 14:19:13.228233 PDT - 10544.20463, Node: /Active Directory, Module: ActiveDirectory - krb5_credential - submissing new requests to new host 2013-10-25 14:19:13.228320 PDT - 10544.20463, Node: /Active Directory, Module: ActiveDirectory - krb5_credential - connecting to host: udp 192.168.0.1:kerberos (192.168.0.1) tid: 00010001 2013-10-25 14:19:13.228374 PDT - 10544.20463, Node: /Active Directory, Module: ActiveDirectory - krb5_credential - writing packet: udp 192.168.0.1:kerberos (192.168.0.1) tid: 00010001 2013-10-25 14:19:13.229930 PDT - 10544.20463, Node: /Active Directory, Module: ActiveDirectory - krb5_credential - reading packet: udp 192.168.0.1:kerberos (192.168.0.1) tid: 00010001 2013-10-25 14:19:13.229957 PDT - 10544.20463, Node: /Active Directory, Module: ActiveDirectory - krb5_credential - host completed: udp 192.168.0.1:kerberos (192.168.0.1) tid: 00010001 2013-10-25 14:19:13.229975 PDT - 10544.20463, Node: /Active Directory, Module: ActiveDirectory - krb5_credential - krb5_sendto trying over again (reset): 0 2013-10-25 14:19:13.230023 PDT - 10544.20463, Node: /Active Directory, Module: ActiveDirectory - krb5_credential - Trying to find service kdc for realm DOMAIN.COM flags 2 2013-10-25 14:19:13.230664 PDT - 10544.20463, Node: /Active Directory, Module: ActiveDirectory - krb5_credential - submissing new requests to new host 2013-10-25 14:19:13.230726 PDT - 10544.20463, Node: /Active Directory, Module: ActiveDirectory - krb5_credential - connecting to host: tcp 192.168.0.1:kerberos (192.168.0.1) tid: 00010002 2013-10-25 14:19:13.230818 PDT - 10544.20463, Node: /Active Directory, Module: ActiveDirectory - krb5_credential - connecting to 11: tcp 192.168.0.1:kerberos (192.168.0.1) tid: 00010002 2013-10-25 14:19:13.231101 PDT - 10544.20463, Node: /Active Directory, Module: ActiveDirectory - krb5_credential - writing packet: tcp 192.168.0.1:kerberos (192.168.0.1) tid: 00010002 2013-10-25 14:19:13.232743 PDT - 10544.20463, Node: /Active Directory, Module: ActiveDirectory - krb5_credential - reading packet: tcp 192.168.0.1:kerberos (192.168.0.1) tid: 00010002 2013-10-25 14:19:13.232777 PDT - 10544.20463, Node: /Active Directory, Module: ActiveDirectory - krb5_credential - host completed: tcp 192.168.0.1:kerberos (192.168.0.1) tid: 00010002 2013-10-25 14:19:13.232798 PDT - 10544.20463, Node: /Active Directory, Module: ActiveDirectory - krb5_credential - krb5_sendto_context DOMAIN.COM done: 0 hosts 2 packets 2 wc: 0.005316 nr: 0.000000 kh: 0.001339 tid: 00010002 2013-10-25 14:19:13.232856 PDT - 10544.20463, Node: /Active Directory, Module: ActiveDirectory - krb5_credential - krb5_get_init_creds: loop 3 2013-10-25 14:19:13.232868 PDT - 10544.20463, Node: /Active Directory, Module: ActiveDirectory - krb5_credential - krb5_get_init_creds: processing input 2013-10-25 14:19:13.232900 PDT - 10544.20463, Node: /Active Directory, Module: ActiveDirectory - krb5_credential - krb5_get_init_creds: using keyproc 2013-10-25 14:19:13.232910 PDT - 10544.20463, Node: /Active Directory, Module: ActiveDirectory - krb5_credential - krb5_get_init_creds: using default_s2k_func 2013-10-25 14:19:13.236487 PDT - 10544.20463, Node: /Active Directory, Module: ActiveDirectory - krb5_credential - krb5_get_init_creds: extracting ticket 2013-10-25 14:19:13.236557 PDT - 10544.20463, Node: /Active Directory, Module: ActiveDirectory - krb5_credential - krb5_get_init_creds: wc: 0.015944 2013-10-25 14:19:13.237022 PDT - 10544.20463, Node: /Active Directory, Module: ActiveDirectory - krb5_credential - Trying to find service kdc for realm DOMAIN.COM flags 2 2013-10-25 14:19:13.237444 PDT - 10544.20463, Node: /Active Directory, Module: ActiveDirectory - krb5_credential - submissing new requests to new host 2013-10-25 14:19:13.237482 PDT - 10544.20463, Node: /Active Directory, Module: ActiveDirectory - krb5_credential - connecting to host: tcp 192.168.0.1:kerberos (192.168.0.1) tid: 00020001 2013-10-25 14:19:13.237551 PDT - 10544.20463, Node: /Active Directory, Module: ActiveDirectory - krb5_credential - connecting to 11: tcp 192.168.0.1:kerberos (192.168.0.1) tid: 00020001 2013-10-25 14:19:13.237900 PDT - 10544.20463, Node: /Active Directory, Module: ActiveDirectory - krb5_credential - writing packet: tcp 192.168.0.1:kerberos (192.168.0.1) tid: 00020001 2013-10-25 14:19:13.238616 PDT - 10544.20463, Node: /Active Directory, Module: ActiveDirectory - krb5_credential - reading packet: tcp 192.168.0.1:kerberos (192.168.0.1) tid: 00020001 2013-10-25 14:19:13.238645 PDT - 10544.20463, Node: /Active Directory, Module: ActiveDirectory - krb5_credential - host completed: tcp 192.168.0.1:kerberos (192.168.0.1) tid: 00020001 2013-10-25 14:19:13.238674 PDT - 10544.20463, Node: /Active Directory, Module: ActiveDirectory - krb5_credential - krb5_sendto_context DOMAIN.COM done: 0 hosts 1 packets 1 wc: 0.001656 nr: 0.000000 kh: 0.000409 tid: 00020001 2013-10-25 14:19:13.238839 PDT - 10544.20463, Node: /Active Directory, Module: ActiveDirectory - krb5_credential - Trying to find service kdc for realm DOMAIN.COM flags 2 2013-10-25 14:19:13.239302 PDT - 10544.20463, Node: /Active Directory, Module: ActiveDirectory - krb5_credential - submissing new requests to new host 2013-10-25 14:19:13.239360 PDT - 10544.20463, Node: /Active Directory, Module: ActiveDirectory - krb5_credential - connecting to host: tcp 192.168.0.1:kerberos (192.168.0.1) tid: 00030001 2013-10-25 14:19:13.239429 PDT - 10544.20463, Node: /Active Directory, Module: ActiveDirectory - krb5_credential - connecting to 11: tcp 192.168.0.1:kerberos (192.168.0.1) tid: 00030001 2013-10-25 14:19:13.239683 PDT - 10544.20463, Node: /Active Directory, Module: ActiveDirectory - krb5_credential - writing packet: tcp 192.168.0.1:kerberos (192.168.0.1) tid: 00030001 2013-10-25 14:19:13.240350 PDT - 10544.20463, Node: /Active Directory, Module: ActiveDirectory - krb5_credential - reading packet: tcp 192.168.0.1:kerberos (192.168.0.1) tid: 00030001 2013-10-25 14:19:13.240387 PDT - 10544.20463, Node: /Active Directory, Module: ActiveDirectory - krb5_credential - host completed: tcp 192.168.0.1:kerberos (192.168.0.1) tid: 00030001 2013-10-25 14:19:13.240415 PDT - 10544.20463, Node: /Active Directory, Module: ActiveDirectory - krb5_credential - krb5_sendto_context DOMAIN.COM done: 0 hosts 1 packets 1 wc: 0.001578 nr: 0.000000 kh: 0.000445 tid: 00030001 2013-10-25 14:19:13.240514 PDT - 10544.20463, Node: /Active Directory, Module: ActiveDirectory - krb5_credential - krb5_get_credentials_with_flags: DOMAIN.COM wc: 0.003615 2013-10-25 14:19:13.240537 PDT - 10544.20463, Node: /Active Directory, Module: ActiveDirectory - valid credentials for [email protected] 2013-10-25 14:19:13.240541 PDT - 10544.20463, Node: /Active Directory, Module: ActiveDirectory - switching to cache 'MEMORY:0x7fa713635470' 2013-10-25 14:19:13.240545 PDT - 10544.20463, Node: /Active Directory, Module: ActiveDirectory - switching GSS to cache 'MEMORY:0x7fa713635470 2013-10-25 14:19:13.240555 PDT - 10544.20463, Node: /Active Directory, Module: ActiveDirectory - Bind Step 5 - Bind/Join computer to domain - 'domain.com' 2013-10-25 14:19:13.241345 PDT - 10544.20463, Node: /Active Directory, Module: ActiveDirectory - resolving 'server.domain.com' 2013-10-25 14:19:13.241646 PDT - 10544.20463, Node: /Active Directory, Module: ActiveDirectory - added socket 12 for host 'server.domain.com:389' address '192.168.0.2' to kqueue list 2013-10-25 14:19:13.241930 PDT - 10544.20463, Node: /Active Directory, Module: ActiveDirectory - Setting kerberos server for 'Kerberos:DOMAIN.COM' to 'server.domain.com' 2013-10-25 14:19:13.241962 PDT - 10544.20463, Node: /Active Directory, Module: ActiveDirectory - switching to cache 'MEMORY:0x7fa713635470' 2013-10-25 14:19:13.241969 PDT - 10544.20463, Node: /Active Directory, Module: ActiveDirectory - switching GSS to cache 'MEMORY:0x7fa713635470 2013-10-25 14:19:13.242231 PDT - 10544.20463, Node: /Active Directory, Module: ActiveDirectory - GSSAPI allow Confidentiality 2013-10-25 14:19:13.242234 PDT - 10544.20463, Node: /Active Directory, Module: ActiveDirectory - setting realm 'DOMAIN.COM' for node '/Active Directory/domain.com' 2013-10-25 14:19:13.242239 PDT - 10544.20463, Node: /Active Directory, Module: ActiveDirectory - GSSAPI allow Integrity (signing) 2013-10-25 14:19:13.242274 PDT - 10544.20463, Node: /Active Directory, Module: ActiveDirectory - GSSAPI using hostname 'server.domain.com' 2013-10-25 14:19:13.242282 PDT - 10544.20463, Node: /Active Directory, Module: ActiveDirectory - GSSAPI using initiator credential '[email protected]' 2013-10-25 14:19:13.250771 PDT - 10544.20463, Node: /Active Directory, Module: ActiveDirectory - Authenticate to LDAP using Kerberos credential - 0 2013-10-25 14:19:13.250784 PDT - 10544.20463, Node: /Active Directory, Module: ActiveDirectory - verified connectivity to '192.168.0.2' with socket 12 2013-10-25 14:19:13.251513 PDT - 10544.20463, Node: /Active Directory, Module: ActiveDirectory - locating site using domain domain.com using CLDAP 2013-10-25 14:19:13.252145 PDT - 10544.20463, Node: /Active Directory, Module: ActiveDirectory - using site of 'DOMAINGROUP' from CLDAP 2013-10-25 14:19:13.253626 PDT - 10544.20463, Node: /Active Directory, Module: ActiveDirectory - resolving 'server2.domain.com' 2013-10-25 14:19:13.253933 PDT - 10544.20463, Node: /Active Directory, Module: ActiveDirectory - added socket 13 for host 'server2.domain.com:389' address '192.168.0.1' to kqueue list 2013-10-25 14:19:13.254428 PDT - 10544.20463, Node: /Active Directory, Module: ActiveDirectory - Setting kerberos server for 'Kerberos:DOMAIN.COM' to 'server2.domain.com' 2013-10-25 14:19:13.254462 PDT - 10544.20463, Node: /Active Directory, Module: ActiveDirectory - switching to cache 'MEMORY:0x7fa713635470' 2013-10-25 14:19:13.254468 PDT - 10544.20463, Node: /Active Directory, Module: ActiveDirectory - switching GSS to cache 'MEMORY:0x7fa713635470 2013-10-25 14:19:13.254617 PDT - 10544.20463, Node: /Active Directory, Module: ActiveDirectory - setting realm 'DOMAIN.COM' for node '/Active Directory/domain.com' 2013-10-25 14:19:13.254661 PDT - 10544.20463, Node: /Active Directory, Module: ActiveDirectory - GSSAPI allow Confidentiality 2013-10-25 14:19:13.254670 PDT - 10544.20463, Node: /Active Directory, Module: ActiveDirectory - GSSAPI allow Integrity (signing) 2013-10-25 14:19:13.254689 PDT - 10544.20463, Node: /Active Directory, Module: ActiveDirectory - GSSAPI using hostname 'server2.domain.com' 2013-10-25 14:19:13.254695 PDT - 10544.20463, Node: /Active Directory, Module: ActiveDirectory - GSSAPI using initiator credential '[email protected]' 2013-10-25 14:19:13.262092 PDT - 10544.20463, Node: /Active Directory, Module: ActiveDirectory - Authenticate to LDAP using Kerberos credential - 0 2013-10-25 14:19:13.262108 PDT - 10544.20463, Node: /Active Directory, Module: ActiveDirectory - verified connectivity to '192.168.0.1' with socket 13 2013-10-25 14:19:13.262982 PDT - 10544.20463, Node: /Active Directory, Module: ActiveDirectory - Computer account either already exists or DC is already Read/Write 2013-10-25 14:19:13.264968 PDT - 10544.20463, Node: /Active Directory, Module: ActiveDirectory - Adding record 'cn=spike,CN=Computers,DC=domain,DC=com' in 'domain.com' The failure point seems to be Computer account either already exists or DC is already Read/Write, however, I can search for 'spike' on the Active Directory server using Active Directory Explorer and it's not there. If I do the same search for the Linux and Windows PCs I added previously, I can find them.

    Read the article

  • Cisco SR520w FE - WAN Port Stops Working

    - by Mike Hanley
    I have setup a Cisco SR520W and everything appears to be working. After about 1-2 days, it looks like the WAN port stops forwarding traffic to the Internet gateway IP of the device. If I unplug and then plug in the network cable connecting the WAN port of the SR520W to my Comcast Cable Modem, traffic startings flowing again. Also, if I restart the SR520W, the traffic will flow again. Any ideas? Here is the running config: Current configuration : 10559 bytes ! version 12.4 no service pad no service timestamps debug uptime service timestamps log datetime msec no service password-encryption ! hostname hostname.mydomain.com ! boot-start-marker boot-end-marker ! logging message-counter syslog no logging rate-limit enable secret 5 <removed> ! aaa new-model ! ! aaa authentication login default local aaa authorization exec default local ! ! aaa session-id common clock timezone PST -8 clock summer-time PDT recurring ! crypto pki trustpoint TP-self-signed-334750407 enrollment selfsigned subject-name cn=IOS-Self-Signed-Certificate-334750407 revocation-check none rsakeypair TP-self-signed-334750407 ! ! crypto pki certificate chain TP-self-signed-334750407 certificate self-signed 01 <removed> quit dot11 syslog ! dot11 ssid <removed> vlan 75 authentication open authentication key-management wpa guest-mode wpa-psk ascii 0 <removed> ! ip source-route ! ! ip dhcp excluded-address 172.16.0.1 172.16.0.10 ! ip dhcp pool inside import all network 172.16.0.0 255.240.0.0 default-router 172.16.0.1 dns-server 10.0.0.15 10.0.0.12 domain-name mydomain.com ! ! ip cef ip domain name mydomain.com ip name-server 68.87.76.178 ip name-server 66.240.48.9 ip port-map user-ezvpn-remote port udp 10000 ip ips notify SDEE ip ips name sdm_ips_rule ! ip ips signature-category category all retired true category ios_ips basic retired false ! ip inspect log drop-pkt no ipv6 cef ! multilink bundle-name authenticated parameter-map type inspect z1-z2-pmap audit-trail on password encryption aes ! ! username admin privilege 15 secret 5 <removed> ! crypto key pubkey-chain rsa named-key realm-cisco.pub key-string <removed> quit ! ! ! ! ! ! crypto ipsec client ezvpn EZVPN_REMOTE_CONNECTION_1 connect auto group EZVPN_GROUP_1 key <removed> mode client peer 64.1.208.90 virtual-interface 1 username admin password <removed> xauth userid mode local ! ! archive log config logging enable logging size 600 hidekeys ! ! ! class-map type inspect match-any SDM_AH match access-group name SDM_AH class-map type inspect match-any SDM-Voice-permit match protocol sip class-map type inspect match-any SDM_ESP match access-group name SDM_ESP class-map type inspect match-any SDM_EASY_VPN_REMOTE_TRAFFIC match protocol isakmp match protocol ipsec-msft match class-map SDM_AH match class-map SDM_ESP match protocol user-ezvpn-remote class-map type inspect match-all SDM_EASY_VPN_REMOTE_PT match class-map SDM_EASY_VPN_REMOTE_TRAFFIC match access-group 101 class-map type inspect match-any Easy_VPN_Remote_VT match access-group 102 class-map type inspect match-any sdm-cls-icmp-access match protocol icmp match protocol tcp match protocol udp class-map type inspect match-any sdm-cls-insp-traffic match protocol cuseeme match protocol dns match protocol ftp match protocol h323 match protocol https match protocol icmp match protocol imap match protocol pop3 match protocol netshow match protocol shell match protocol realmedia match protocol rtsp match protocol smtp extended match protocol sql-net match protocol streamworks match protocol tftp match protocol vdolive match protocol tcp match protocol udp class-map type inspect match-any L4-inspect-class match protocol icmp class-map type inspect match-all sdm-invalid-src match access-group 100 class-map type inspect match-all dhcp_out_self match access-group name dhcp-resp-permit class-map type inspect match-all dhcp_self_out match access-group name dhcp-req-permit class-map type inspect match-all sdm-protocol-http match protocol http ! ! policy-map type inspect sdm-permit-icmpreply class type inspect dhcp_self_out pass class type inspect sdm-cls-icmp-access inspect class class-default pass policy-map type inspect sdm-permit_VT class type inspect Easy_VPN_Remote_VT pass class class-default drop policy-map type inspect sdm-inspect class type inspect SDM-Voice-permit pass class type inspect sdm-cls-insp-traffic inspect class type inspect sdm-invalid-src drop log class type inspect sdm-protocol-http inspect z1-z2-pmap class class-default pass policy-map type inspect sdm-inspect-voip-in class type inspect SDM-Voice-permit pass class class-default drop policy-map type inspect sdm-permit class type inspect SDM_EASY_VPN_REMOTE_PT pass class type inspect dhcp_out_self pass class class-default drop ! zone security ezvpn-zone zone security out-zone zone security in-zone zone-pair security sdm-zp-in-ezvpn1 source in-zone destination ezvpn-zone service-policy type inspect sdm-permit_VT zone-pair security sdm-zp-out-ezpn1 source out-zone destination ezvpn-zone service-policy type inspect sdm-permit_VT zone-pair security sdm-zp-ezvpn-out1 source ezvpn-zone destination out-zone service-policy type inspect sdm-permit_VT zone-pair security sdm-zp-self-out source self destination out-zone service-policy type inspect sdm-permit-icmpreply zone-pair security sdm-zp-out-in source out-zone destination in-zone service-policy type inspect sdm-inspect-voip-in zone-pair security sdm-zp-ezvpn-in1 source ezvpn-zone destination in-zone service-policy type inspect sdm-permit_VT zone-pair security sdm-zp-out-self source out-zone destination self service-policy type inspect sdm-permit zone-pair security sdm-zp-in-out source in-zone destination out-zone service-policy type inspect sdm-inspect ! bridge irb ! ! interface FastEthernet0 switchport access vlan 75 ! interface FastEthernet1 switchport access vlan 75 ! interface FastEthernet2 switchport access vlan 75 ! interface FastEthernet3 switchport access vlan 75 ! interface FastEthernet4 description $FW_OUTSIDE$ ip address 75.149.48.76 255.255.255.240 ip nat outside ip ips sdm_ips_rule out ip virtual-reassembly zone-member security out-zone duplex auto speed auto crypto ipsec client ezvpn EZVPN_REMOTE_CONNECTION_1 ! interface Virtual-Template1 type tunnel no ip address ip virtual-reassembly zone-member security ezvpn-zone tunnel mode ipsec ipv4 ! interface Dot11Radio0 no ip address ! encryption vlan 75 mode ciphers aes-ccm ! ssid <removed> ! speed basic-1.0 basic-2.0 basic-5.5 6.0 9.0 basic-11.0 12.0 18.0 24.0 36.0 48.0 54.0 station-role root ! interface Dot11Radio0.75 encapsulation dot1Q 75 native ip virtual-reassembly bridge-group 75 bridge-group 75 subscriber-loop-control bridge-group 75 spanning-disabled bridge-group 75 block-unknown-source no bridge-group 75 source-learning no bridge-group 75 unicast-flooding ! interface Vlan1 no ip address ip virtual-reassembly bridge-group 1 ! interface Vlan75 no ip address ip virtual-reassembly bridge-group 75 bridge-group 75 spanning-disabled ! interface BVI1 no ip address ip nat inside ip virtual-reassembly ! interface BVI75 description $FW_INSIDE$ ip address 172.16.0.1 255.240.0.0 ip nat inside ip ips sdm_ips_rule in ip virtual-reassembly zone-member security in-zone crypto ipsec client ezvpn EZVPN_REMOTE_CONNECTION_1 inside ! ip forward-protocol nd ip route 0.0.0.0 0.0.0.0 75.149.48.78 2 ! ip http server ip http authentication local ip http secure-server ip http timeout-policy idle 60 life 86400 requests 10000 ip nat inside source list 1 interface FastEthernet4 overload ! ip access-list extended SDM_AH remark SDM_ACL Category=1 permit ahp any any ip access-list extended SDM_ESP remark SDM_ACL Category=1 permit esp any any ip access-list extended dhcp-req-permit remark SDM_ACL Category=1 permit udp any eq bootpc any eq bootps ip access-list extended dhcp-resp-permit remark SDM_ACL Category=1 permit udp any eq bootps any eq bootpc ! access-list 1 remark SDM_ACL Category=2 access-list 1 permit 172.16.0.0 0.15.255.255 access-list 100 remark SDM_ACL Category=128 access-list 100 permit ip host 255.255.255.255 any access-list 100 permit ip 127.0.0.0 0.255.255.255 any access-list 100 permit ip 75.149.48.64 0.0.0.15 any access-list 101 remark SDM_ACL Category=128 access-list 101 permit ip host 64.1.208.90 any access-list 102 remark SDM_ACL Category=1 access-list 102 permit ip any any ! ! ! ! snmp-server community <removed> RO ! control-plane ! bridge 1 protocol ieee bridge 1 route ip bridge 75 route ip banner login ^CSR520 Base Config - MFG 1.0 ^C ! line con 0 no modem enable line aux 0 line vty 0 4 transport input telnet ssh ! scheduler max-task-time 5000 end I also ran some diagnostics when the WAN port stopped working: 1. show interface fa4 FastEthernet4 is up, line protocol is up Hardware is PQUICC_FEC, address is 0026.99c5.b434 (bia 0026.99c5.b434) Description: $FW_OUTSIDE$ Internet address is 75.149.48.76/28 MTU 1500 bytes, BW 100000 Kbit/sec, DLY 100 usec, reliability 255/255, txload 1/255, rxload 1/255 Encapsulation ARPA, loopback not set Keepalive set (10 sec) Full-duplex, 100Mb/s, 100BaseTX/FX ARP type: ARPA, ARP Timeout 04:00:00 Last input 01:08:15, output 00:00:00, output hang never Last clearing of "show interface" counters never Input queue: 0/75/23/0 (size/max/drops/flushes); Total output drops: 0 Queueing strategy: fifo Output queue: 0/40 (size/max) 5 minute input rate 0 bits/sec, 0 packets/sec 5 minute output rate 1000 bits/sec, 0 packets/sec 336446 packets input, 455403158 bytes Received 23 broadcasts, 0 runts, 0 giants, 37 throttles 41 input errors, 0 CRC, 0 frame, 0 overrun, 41 ignored 0 watchdog 0 input packets with dribble condition detected 172529 packets output, 23580132 bytes, 0 underruns 0 output errors, 0 collisions, 2 interface resets 0 unknown protocol drops 0 babbles, 0 late collision, 0 deferred 0 lost carrier, 0 no carrier 0 output buffer failures, 0 output buffers swapped out 2. show ip route Gateway of last resort is 75.149.48.78 to network 0.0.0.0 C 192.168.75.0/24 is directly connected, BVI75 64.0.0.0/32 is subnetted, 1 subnets S 64.1.208.90 [1/0] via 75.149.48.78 S 192.168.10.0/24 is directly connected, BVI75 75.0.0.0/28 is subnetted, 1 subnets C 75.149.48.64 is directly connected, FastEthernet4 S* 0.0.0.0/0 [2/0] via 75.149.48.78 3. show ip arp Protocol Address Age (min) Hardware Addr Type Interface Internet 75.149.48.65 69 001e.2a39.7b08 ARPA FastEthernet4 Internet 75.149.48.76 - 0026.99c5.b434 ARPA FastEthernet4 Internet 75.149.48.78 93 0022.2d6c.ae36 ARPA FastEthernet4 Internet 192.168.75.1 - 0027.0d58.f5f0 ARPA BVI75 Internet 192.168.75.12 50 7c6d.62c7.8c0a ARPA BVI75 Internet 192.168.75.13 0 001b.6301.1227 ARPA BVI75 4. sh ip cef Prefix Next Hop Interface 0.0.0.0/0 75.149.48.78 FastEthernet4 0.0.0.0/8 drop 0.0.0.0/32 receive 64.1.208.90/32 75.149.48.78 FastEthernet4 75.149.48.64/28 attached FastEthernet4 75.149.48.64/32 receive FastEthernet4 75.149.48.65/32 attached FastEthernet4 75.149.48.76/32 receive FastEthernet4 75.149.48.78/32 attached FastEthernet4 75.149.48.79/32 receive FastEthernet4 127.0.0.0/8 drop 192.168.10.0/24 attached BVI75 192.168.75.0/24 attached BVI75 192.168.75.0/32 receive BVI75 192.168.75.1/32 receive BVI75 192.168.75.12/32 attached BVI75 192.168.75.13/32 attached BVI75 192.168.75.255/32 receive BVI75 224.0.0.0/4 drop 224.0.0.0/24 receive 240.0.0.0/4 drop 255.255.255.255/32 receive Thanks in advance, -Mike

    Read the article

  • Can ping IP address and nslookup hostname but cannot ping hostname

    - by jao
    On a windows 2003 server I can nslookup www.google.com which returns Server: localhost Address: 127.0.0.1 Non-authoritative answer: Name: www.l.google.com Addresses: 74.125.79.104, 74.125.79.147, 74.125.79.99 Aliases: www.google.com I can then ping 74.125.79.104: Pinging 74.125.79.104 with 32 bytes of data: Reply from 74.125.79.104: bytes=32 time=16ms TTL=54 Reply from 74.125.79.104: bytes=32 time=32ms TTL=54 Reply from 74.125.79.104: bytes=32 time=15ms TTL=54 Reply from 74.125.79.104: bytes=32 time=15ms TTL=54 Ping statistics for 74.125.79.104: Packets: Sent = 4, Received = 4, Lost = 0 (0% loss), Approximate round trip times in milli-seconds: Minimum = 15ms, Maximum = 32ms, Average = 19ms But I cannot ping www.google.com: Ping request could not find host www.google.com. Please check the name and try again. (this one is different from the other question in that this one has a TLD, it is not a local domain.) Update: I am running a dns server at localhost (127.0.0.1). Even when I change it to use for example opendns, it still can nslookup hostname and ping ip address, but not ping hostname. So what is wrong? Update 2: here is the ipconfig /all result: Windows IP Configuration Host Name . . . . . . . . . . . . : SERVER Primary Dns Suffix . . . . . . . : NETWORK.local Node Type . . . . . . . . . . . . : Unknown IP Routing Enabled. . . . . . . . : No WINS Proxy Enabled. . . . . . . . : No DNS Suffix Search List. . . . . . : NETWORK.local Ethernet adapter Local Area Connection: Connection-specific DNS Suffix . : Description . . . . . . . . . . . : Broadcom NetXtreme Gigabit Ethernet #2 Physical Address. . . . . . . . . : 00-0F-1F-56-3B-AA DHCP Enabled. . . . . . . . . . . : No IP Address. . . . . . . . . . . . : 192.168.7.2 Subnet Mask . . . . . . . . . . . : 255.255.255.0 Default Gateway . . . . . . . . . : 192.168.7.1 DNS Servers . . . . . . . . . . . : 127.0.0.1 Update 3: Thanks everyone for their help and suggestions. I appreciate that. Ipconfig /flushdns returns: Sucessfully flushed the DNS resolver cache Ipconfig /displaydns returns: 2.7.168.192.in-addr.arpa ---------------------------------------- Record Name . . . . . : 2.7.168.192.in-addr.arpa. Record Type . . . . . : 12 Time To Live . . . . : 0 Data Length . . . . . : 4 Section . . . . . . . : Answer PTR Record . . . . . : webserver.mydomainname.com 1.0.0.127.in-addr.arpa ---------------------------------------- Record Name . . . . . : 1.0.0.127.in-addr.arpa. Record Type . . . . . : 12 Time To Live . . . . : 0 Data Length . . . . . : 4 Section . . . . . . . : Answer PTR Record . . . . . : localhost Update 4: Wireshark shows the following: 3 11.540542 208.67.220.220 192.168.7.2 DNS Standard query response A 74.125.79.99 A 74.125.79.104 A 74.125.79.147 6 42.056794 192.168.7.2 192.168.7.255 NBNS Name query NB WWW.GOOGLE.COM<00> which is weird: when I ping, it sends a packet to 192.168.7.255 instead of asking the DNS server for an address

    Read the article

  • org-sort multi: date/time (?d ?t) | priority (?p) | title (?a)

    - by lawlist
    Is anyone aware of an org-sort function / modification that can refile / organize a group of TODO so that it sorts them by three (3) criteria: first sort by due date, second sort by priority, and third sort by by title of the task? EDIT: I believe that org-sort by deadline (?d) has a bug that cannot properly handle undated tasks. I am working on a workaround (i.e., moving the undated todo to a different heading before the deadline (?d) sort occurs), but perhaps the best thing to do would be to try and fix the original sorting function. Development of the workaround can be found in this thread (i.e., moving the undated tasks to a different heading in one fell swoop): How to automate org-refile for multiple todo EDIT: Apparently, the following code (ancient history) that I found on the internet was eventually modified and included as a part of org-sort-entries. Unfortunately, undated todo are not properly sorted when sorting by deadline -- i.e., they are mixed in with the dated todo. ;; multiple sort (defun org-sort-multi (&rest sort-types) "Multiple sorts on a certain level of an outline tree, or plain list items. SORT-TYPES is a list where each entry is either a character or a cons pair (BOOL . CHAR), where BOOL is whether or not to sort case-sensitively, and CHAR is one of the characters defined in `org-sort-entries-or-items'. Entries are applied in back to front order. Example: To sort first by TODO status, then by priority, then by date, then alphabetically (case-sensitive) use the following call: (org-sort-multi '(?d ?p ?t (t . ?a)))" (interactive) (dolist (x (nreverse sort-types)) (when (char-valid-p x) (setq x (cons nil x))) (condition-case nil (org-sort-entries (car x) (cdr x)) (error nil)))) ;; sort current level (defun lawlist-sort (&rest sort-types) "Sort the current org level. SORT-TYPES is a list where each entry is either a character or a cons pair (BOOL . CHAR), where BOOL is whether or not to sort case-sensitively, and CHAR is one of the characters defined in `org-sort-entries-or-items'. Entries are applied in back to front order. Defaults to \"?o ?p\" which is sorted by TODO status, then by priority" (interactive) (when (equal mode-name "Org") (let ((sort-types (or sort-types (if (or (org-entry-get nil "TODO") (org-entry-get nil "PRIORITY")) '(?d ?t ?p) ;; date, time, priority '((nil . ?a)))))) (save-excursion (outline-up-heading 1) (let ((start (point)) end) (while (and (not (bobp)) (not (eobp)) (<= (point) start)) (condition-case nil (outline-forward-same-level 1) (error (outline-up-heading 1)))) (unless (> (point) start) (goto-char (point-max))) (setq end (point)) (goto-char start) (apply 'org-sort-multi sort-types) (goto-char end) (when (eobp) (forward-line -1)) (when (looking-at "^\\s-*$") ;; (delete-line) ) (goto-char start) ;; (dotimes (x ) (org-cycle)) ))))) EDIT: Here is a more modern version of multi-sort, which is likely based upon further development of the above-code: (defun org-sort-all () (interactive) (save-excursion (goto-char (point-min)) (while (re-search-forward "^\* " nil t) (goto-char (match-beginning 0)) (condition-case err (progn (org-sort-entries t ?a) (org-sort-entries t ?p) (org-sort-entries t ?o) (forward-line)) (error nil))) (goto-char (point-min)) (while (re-search-forward "\* PROJECT " nil t) (goto-char (line-beginning-position)) (ignore-errors (org-sort-entries t ?a) (org-sort-entries t ?p) (org-sort-entries t ?o)) (forward-line)))) EDIT: The best option will be to fix sorting of deadlines (?d) so that undated todo are moved to the bottom of the outline, instead of mixed in with the dated todo. Here is an excerpt from the current org.el included within Emacs Trunk (as of July 1, 2013): (defun org-sort (with-case) "Call `org-sort-entries', `org-table-sort-lines' or `org-sort-list'. Optional argument WITH-CASE means sort case-sensitively." (interactive "P") (cond ((org-at-table-p) (org-call-with-arg 'org-table-sort-lines with-case)) ((org-at-item-p) (org-call-with-arg 'org-sort-list with-case)) (t (org-call-with-arg 'org-sort-entries with-case)))) (defun org-sort-remove-invisible (s) (remove-text-properties 0 (length s) org-rm-props s) (while (string-match org-bracket-link-regexp s) (setq s (replace-match (if (match-end 2) (match-string 3 s) (match-string 1 s)) t t s))) s) (defvar org-priority-regexp) ; defined later in the file (defvar org-after-sorting-entries-or-items-hook nil "Hook that is run after a bunch of entries or items have been sorted. When children are sorted, the cursor is in the parent line when this hook gets called. When a region or a plain list is sorted, the cursor will be in the first entry of the sorted region/list.") (defun org-sort-entries (&optional with-case sorting-type getkey-func compare-func property) "Sort entries on a certain level of an outline tree. If there is an active region, the entries in the region are sorted. Else, if the cursor is before the first entry, sort the top-level items. Else, the children of the entry at point are sorted. Sorting can be alphabetically, numerically, by date/time as given by a time stamp, by a property or by priority. The command prompts for the sorting type unless it has been given to the function through the SORTING-TYPE argument, which needs to be a character, \(?n ?N ?a ?A ?t ?T ?s ?S ?d ?D ?p ?P ?o ?O ?r ?R ?f ?F). Here is the precise meaning of each character: n Numerically, by converting the beginning of the entry/item to a number. a Alphabetically, ignoring the TODO keyword and the priority, if any. o By order of TODO keywords. t By date/time, either the first active time stamp in the entry, or, if none exist, by the first inactive one. s By the scheduled date/time. d By deadline date/time. c By creation time, which is assumed to be the first inactive time stamp at the beginning of a line. p By priority according to the cookie. r By the value of a property. Capital letters will reverse the sort order. If the SORTING-TYPE is ?f or ?F, then GETKEY-FUNC specifies a function to be called with point at the beginning of the record. It must return either a string or a number that should serve as the sorting key for that record. Comparing entries ignores case by default. However, with an optional argument WITH-CASE, the sorting considers case as well." (interactive "P") (let ((case-func (if with-case 'identity 'downcase)) (cmstr ;; The clock marker is lost when using `sort-subr', let's ;; store the clocking string. (when (equal (marker-buffer org-clock-marker) (current-buffer)) (save-excursion (goto-char org-clock-marker) (looking-back "^.*") (match-string-no-properties 0)))) start beg end stars re re2 txt what tmp) ;; Find beginning and end of region to sort (cond ((org-region-active-p) ;; we will sort the region (setq end (region-end) what "region") (goto-char (region-beginning)) (if (not (org-at-heading-p)) (outline-next-heading)) (setq start (point))) ((or (org-at-heading-p) (condition-case nil (progn (org-back-to-heading) t) (error nil))) ;; we will sort the children of the current headline (org-back-to-heading) (setq start (point) end (progn (org-end-of-subtree t t) (or (bolp) (insert "\n")) (org-back-over-empty-lines) (point)) what "children") (goto-char start) (show-subtree) (outline-next-heading)) (t ;; we will sort the top-level entries in this file (goto-char (point-min)) (or (org-at-heading-p) (outline-next-heading)) (setq start (point)) (goto-char (point-max)) (beginning-of-line 1) (when (looking-at ".*?\\S-") ;; File ends in a non-white line (end-of-line 1) (insert "\n")) (setq end (point-max)) (setq what "top-level") (goto-char start) (show-all))) (setq beg (point)) (if (>= beg end) (error "Nothing to sort")) (looking-at "\\(\\*+\\)") (setq stars (match-string 1) re (concat "^" (regexp-quote stars) " +") re2 (concat "^" (regexp-quote (substring stars 0 -1)) "[ \t\n]") txt (buffer-substring beg end)) (if (not (equal (substring txt -1) "\n")) (setq txt (concat txt "\n"))) (if (and (not (equal stars "*")) (string-match re2 txt)) (error "Region to sort contains a level above the first entry")) (unless sorting-type (message "Sort %s: [a]lpha [n]umeric [p]riority p[r]operty todo[o]rder [f]unc [t]ime [s]cheduled [d]eadline [c]reated A/N/P/R/O/F/T/S/D/C means reversed:" what) (setq sorting-type (read-char-exclusive)) (and (= (downcase sorting-type) ?f) (setq getkey-func (org-icompleting-read "Sort using function: " obarray 'fboundp t nil nil)) (setq getkey-func (intern getkey-func))) (and (= (downcase sorting-type) ?r) (setq property (org-icompleting-read "Property: " (mapcar 'list (org-buffer-property-keys t)) nil t)))) (message "Sorting entries...") (save-restriction (narrow-to-region start end) (let ((dcst (downcase sorting-type)) (case-fold-search nil) (now (current-time))) (sort-subr (/= dcst sorting-type) ;; This function moves to the beginning character of the "record" to ;; be sorted. (lambda nil (if (re-search-forward re nil t) (goto-char (match-beginning 0)) (goto-char (point-max)))) ;; This function moves to the last character of the "record" being ;; sorted. (lambda nil (save-match-data (condition-case nil (outline-forward-same-level 1) (error (goto-char (point-max)))))) ;; This function returns the value that gets sorted against. (lambda nil (cond ((= dcst ?n) (if (looking-at org-complex-heading-regexp) (string-to-number (match-string 4)) nil)) ((= dcst ?a) (if (looking-at org-complex-heading-regexp) (funcall case-func (match-string 4)) nil)) ((= dcst ?t) (let ((end (save-excursion (outline-next-heading) (point)))) (if (or (re-search-forward org-ts-regexp end t) (re-search-forward org-ts-regexp-both end t)) (org-time-string-to-seconds (match-string 0)) (org-float-time now)))) ((= dcst ?c) (let ((end (save-excursion (outline-next-heading) (point)))) (if (re-search-forward (concat "^[ \t]*\\[" org-ts-regexp1 "\\]") end t) (org-time-string-to-seconds (match-string 0)) (org-float-time now)))) ((= dcst ?s) (let ((end (save-excursion (outline-next-heading) (point)))) (if (re-search-forward org-scheduled-time-regexp end t) (org-time-string-to-seconds (match-string 1)) (org-float-time now)))) ((= dcst ?d) (let ((end (save-excursion (outline-next-heading) (point)))) (if (re-search-forward org-deadline-time-regexp end t) (org-time-string-to-seconds (match-string 1)) (org-float-time now)))) ((= dcst ?p) (if (re-search-forward org-priority-regexp (point-at-eol) t) (string-to-char (match-string 2)) org-default-priority)) ((= dcst ?r) (or (org-entry-get nil property) "")) ((= dcst ?o) (if (looking-at org-complex-heading-regexp) (- 9999 (length (member (match-string 2) org-todo-keywords-1))))) ((= dcst ?f) (if getkey-func (progn (setq tmp (funcall getkey-func)) (if (stringp tmp) (setq tmp (funcall case-func tmp))) tmp) (error "Invalid key function `%s'" getkey-func))) (t (error "Invalid sorting type `%c'" sorting-type)))) nil (cond ((= dcst ?a) 'string<) ((= dcst ?f) compare-func) ((member dcst '(?p ?t ?s ?d ?c)) '<))))) (run-hooks 'org-after-sorting-entries-or-items-hook) ;; Reset the clock marker if needed (when cmstr (save-excursion (goto-char start) (search-forward cmstr nil t) (move-marker org-clock-marker (point)))) (message "Sorting entries...done"))) (defun org-do-sort (table what &optional with-case sorting-type) "Sort TABLE of WHAT according to SORTING-TYPE. The user will be prompted for the SORTING-TYPE if the call to this function does not specify it. WHAT is only for the prompt, to indicate what is being sorted. The sorting key will be extracted from the car of the elements of the table. If WITH-CASE is non-nil, the sorting will be case-sensitive." (unless sorting-type (message "Sort %s: [a]lphabetic, [n]umeric, [t]ime. A/N/T means reversed:" what) (setq sorting-type (read-char-exclusive))) (let ((dcst (downcase sorting-type)) extractfun comparefun) ;; Define the appropriate functions (cond ((= dcst ?n) (setq extractfun 'string-to-number comparefun (if (= dcst sorting-type) '< '>))) ((= dcst ?a) (setq extractfun (if with-case (lambda(x) (org-sort-remove-invisible x)) (lambda(x) (downcase (org-sort-remove-invisible x)))) comparefun (if (= dcst sorting-type) 'string< (lambda (a b) (and (not (string< a b)) (not (string= a b))))))) ((= dcst ?t) (setq extractfun (lambda (x) (if (or (string-match org-ts-regexp x) (string-match org-ts-regexp-both x)) (org-float-time (org-time-string-to-time (match-string 0 x))) 0)) comparefun (if (= dcst sorting-type) '< '>))) (t (error "Invalid sorting type `%c'" sorting-type))) (sort (mapcar (lambda (x) (cons (funcall extractfun (car x)) (cdr x))) table) (lambda (a b) (funcall comparefun (car a) (car b))))))

    Read the article

  • OS X won't see Windows 7 in network (and vice versa)

    - by meds
    I've enabled SMB sharing in OS X Lion and have added folders to share, it says 'Windows Sharing: On' with a green circle next to it (from the sharing window) and that to access the volume I will need to to go to \\192.168.0.17. It also says that the OS X should be visible as 'macbook' in the network. Both my WIndows 7 and OS X are connected to the same network, yet when I try to go to \\192.168.0.17 or from the Mac try to go to my Windows system (smb://192.168.0.6) the two OSs don't see each other. Any ideas why? Attempting to ping the Mac from Windows results in this output in the command prompt: Pinging 192.168.0.17 with 32 bytes of data: Reply from 192.168.0.6: Destination host unreachable. Request timed out. Request timed out. Request timed out. Ping statistics for 192.168.0.17: Packets: Sent = 4, Received = 1, Lost = 3 (75% loss), ipconfig in Windows is: Wireless LAN adapter Wireless Network Connection: Connection-specific DNS Suffix . : Link-local IPv6 Address . . . . . : fe80::8918:efd1:b05c:890f%21 IPv4 Address. . . . . . . . . . . : 192.168.0.6 Subnet Mask . . . . . . . . . . . : 255.255.255.0 Default Gateway . . . . . . . . . : 192.168.0.1 Ethernet adapter Bluetooth Network Connection: Media State . . . . . . . . . . . : Media disconnected Connection-specific DNS Suffix . : Ethernet adapter VMware Network Adapter VMnet1: Connection-specific DNS Suffix . : Link-local IPv6 Address . . . . . : fe80::98ab:63fc:3c07:d837%13 IPv4 Address. . . . . . . . . . . : 192.168.74.1 Subnet Mask . . . . . . . . . . . : 255.255.255.0 Default Gateway . . . . . . . . . : Ethernet adapter VMware Network Adapter VMnet8: Connection-specific DNS Suffix . : Link-local IPv6 Address . . . . . : fe80::80ff:c575:7b50:3a10%14 IPv4 Address. . . . . . . . . . . : 192.168.21.1 Subnet Mask . . . . . . . . . . . : 255.255.255.0 Default Gateway . . . . . . . . . : Tunnel adapter isatap.{2E97D0AE-9E18-4072-AC23-1979BA0DCB79}: Media State . . . . . . . . . . . : Media disconnected Connection-specific DNS Suffix . : Tunnel adapter isatap.{E260CE43-E9A7-4DE0-A88E-4EAFF68ACDDB}: Media State . . . . . . . . . . . : Media disconnected Connection-specific DNS Suffix . : Tunnel adapter isatap.{A5130812-59CE-4DDF-9C35-9433BCED9831}: Media State . . . . . . . . . . . : Media disconnected Connection-specific DNS Suffix . : Tunnel adapter Teredo Tunneling Pseudo-Interface: Media State . . . . . . . . . . . : Media disconnected Connection-specific DNS Suffix . : Tunnel adapter isatap.{134BCAE7-CFFF-4A98-8DA0-3708806AABEB}: Media State . . . . . . . . . . . : Media disconnected Connection-specific DNS Suffix . : Tunnel adapter isatap.{8D9E3B8F-161C-4ACE-B211-3EDD694416B2}: Media State . . . . . . . . . . . : Media disconnected Connection-specific DNS Suffix . : in OS X: lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> mtu 16384 options=3<RXCSUM,TXCSUM> inet6 fe80::1%lo0 prefixlen 64 scopeid 0x1 inet 127.0.0.1 netmask 0xff000000 inet6 ::1 prefixlen 128 gif0: flags=8010<POINTOPOINT,MULTICAST> mtu 1280 stf0: flags=0<> mtu 1280 en0: flags=8863<UP,BROADCAST,SMART,RUNNING,SIMPLEX,MULTICAST> mtu 1500 options=2b<RXCSUM,TXCSUM,VLAN_HWTAGGING,TSO4> ether c8:2a:14:01:24:c1 media: autoselect (none) status: inactive en1: flags=8863<UP,BROADCAST,SMART,RUNNING,SIMPLEX,MULTICAST> mtu 1500 ether e0:f8:47:0c:fe:04 inet6 fe80::e2f8:47ff:fe0c:fe04%en1 prefixlen 64 scopeid 0x5 inet 192.168.0.17 netmask 0xffffff00 broadcast 192.168.0.255 media: autoselect status: active p2p0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 2304 ether 02:f8:47:0c:fe:04 media: autoselect status: inactive fw0: flags=8863<UP,BROADCAST,SMART,RUNNING,SIMPLEX,MULTICAST> mtu 4078 lladdr 70:cd:60:ff:fe:d8:f1:32 media: autoselect <full-duplex> status: inactive

    Read the article

< Previous Page | 160 161 162 163 164 165 166  | Next Page >