Search Results

Search found 5472 results on 219 pages for 'jack low'.

Page 217/219 | < Previous Page | 213 214 215 216 217 218 219  | Next Page >

  • solved: puppet master REST API returns 403 when running under passenger works when master runs from command line

    - by Anadi Misra
    I am using the standard auth.conf provided in puppet install for the puppet master which is running through passenger under Nginx. However for most of the catalog, files and certitifcate request I get a 403 response. ### Authenticated paths - these apply only when the client ### has a valid certificate and is thus authenticated # allow nodes to retrieve their own catalog path ~ ^/catalog/([^/]+)$ method find allow $1 # allow nodes to retrieve their own node definition path ~ ^/node/([^/]+)$ method find allow $1 # allow all nodes to access the certificates services path ~ ^/certificate_revocation_list/ca method find allow * # allow all nodes to store their reports path /report method save allow * # unconditionally allow access to all file services # which means in practice that fileserver.conf will # still be used path /file allow * ### Unauthenticated ACL, for clients for which the current master doesn't ### have a valid certificate; we allow authenticated users, too, because ### there isn't a great harm in letting that request through. # allow access to the master CA path /certificate/ca auth any method find allow * path /certificate/ auth any method find allow * path /certificate_request auth any method find, save allow * path /facts auth any method find, search allow * # this one is not stricly necessary, but it has the merit # of showing the default policy, which is deny everything else path / auth any Puppet master however does not seems to be following this as I get this error on client [amisr1@blramisr195602 ~]$ sudo puppet agent --no-daemonize --verbose --server bangvmpllda02.XXXXX.com [sudo] password for amisr1: Starting Puppet client version 3.0.1 Warning: Unable to fetch my node definition, but the agent run will continue: Warning: Error 403 on SERVER: Forbidden request: XX.XXX.XX.XX(XX.XXX.XX.XX) access to /certificate_revocation_list/ca [find] at :110 Info: Retrieving plugin Error: /File[/var/lib/puppet/lib]: Failed to generate additional resources using 'eval_generate: Error 403 on SERVER: Forbidden request: XX.XXX.XX.XX(XX.XXX.XX.XX) access to /file_metadata/plugins [search] at :110 Error: /File[/var/lib/puppet/lib]: Could not evaluate: Error 403 on SERVER: Forbidden request: XX.XXX.XX.XX(XX.XXX.XX.XX) access to /file_metadata/plugins [find] at :110 Could not retrieve file metadata for puppet://devops.XXXXX.com/plugins: Error 403 on SERVER: Forbidden request: XX.XXX.XX.XX(XX.XXX.XX.XX) access to /file_metadata/plugins [find] at :110 Error: Could not retrieve catalog from remote server: Error 403 on SERVER: Forbidden request: XX.XXX.XX.XX(XX.XXX.XX.XX) access to /catalog/blramisr195602.XXXXX.com [find] at :110 Using cached catalog Error: Could not retrieve catalog; skipping run Error: Could not send report: Error 403 on SERVER: Forbidden request: XX.XXX.XX.XX(XX.XXX.XX.XX) access to /report/blramisr195602.XXXXX.com [save] at :110 and the server logs show XX.XXX.XX.XX - - [10/Dec/2012:14:46:52 +0530] "GET /production/certificate_revocation_list/ca? HTTP/1.1" 403 102 "-" "Ruby" XX.XXX.XX.XX - - [10/Dec/2012:14:46:52 +0530] "GET /production/file_metadatas/plugins?links=manage&recurse=true&&ignore=---+%0A++-+%22.svn%22%0A++-+CVS%0A++-+%22.git%22&checksum_type=md5 HTTP/1.1" 403 95 "-" "Ruby" XX.XXX.XX.XX - - [10/Dec/2012:14:46:52 +0530] "GET /production/file_metadata/plugins? HTTP/1.1" 403 93 "-" "Ruby" XX.XXX.XX.XX - - [10/Dec/2012:14:46:53 +0530] "POST /production/catalog/blramisr195602.XXXXX.com HTTP/1.1" 403 106 "-" "Ruby" XX.XXX.XX.XX - - [10/Dec/2012:14:46:53 +0530] "PUT /production/report/blramisr195602.XXXXX.com HTTP/1.1" 403 105 "-" "Ruby" thefile server conf file is as follows (and goin by what they say on puppet site, It is better to regulate access in auth.conf for reaching file server and then allow file server to server all) [files] path /apps/puppet/files allow * [private] path /apps/puppet/private/%H allow * [modules] allow * I am using server and client version 3 Nginx has been compiled using the following options nginx version: nginx/1.3.9 built by gcc 4.4.6 20120305 (Red Hat 4.4.6-4) (GCC) TLS SNI support enabled configure arguments: --prefix=/apps/nginx --conf-path=/apps/nginx/nginx.conf --pid-path=/apps/nginx/run/nginx.pid --error-log-path=/apps/nginx/logs/error.log --http-log-path=/apps/nginx/logs/access.log --with-http_ssl_module --with-http_gzip_static_module --add-module=/usr/lib/ruby/gems/1.8/gems/passenger-3.0.18/ext/nginx --add-module=/apps/Downloads/nginx/nginx-auth-ldap-master/ and the standard nginx puppet master conf server { ssl on; listen 8140 ssl; server_name _; passenger_enabled on; passenger_set_cgi_param HTTP_X_CLIENT_DN $ssl_client_s_dn; passenger_set_cgi_param HTTP_X_CLIENT_VERIFY $ssl_client_verify; passenger_min_instances 5; access_log logs/puppet_access.log; error_log logs/puppet_error.log; root /apps/nginx/html/rack/public; ssl_certificate /var/lib/puppet/ssl/certs/bangvmpllda02.XXXXXX.com.pem; ssl_certificate_key /var/lib/puppet/ssl/private_keys/bangvmpllda02.XXXXXX.com.pem; ssl_crl /var/lib/puppet/ssl/ca/ca_crl.pem; ssl_client_certificate /var/lib/puppet/ssl/certs/ca.pem; ssl_ciphers SSLv2:-LOW:-EXPORT:RC4+RSA; ssl_prefer_server_ciphers on; ssl_verify_client optional; ssl_verify_depth 1; ssl_session_cache shared:SSL:128m; ssl_session_timeout 5m; } Puppet is picking up the correct settings from the files mentioned because config print command points to /etc/puppet [amisr1@bangvmpllDA02 puppet]$ sudo puppet config print | grep conf async_storeconfigs = false authconfig = /etc/puppet/namespaceauth.conf autosign = /etc/puppet/autosign.conf catalog_cache_terminus = store_configs confdir = /etc/puppet config = /etc/puppet/puppet.conf config_file_name = puppet.conf config_version = "" configprint = all configtimeout = 120 dblocation = /var/lib/puppet/state/clientconfigs.sqlite3 deviceconfig = /etc/puppet/device.conf fileserverconfig = /etc/puppet/fileserver.conf genconfig = false hiera_config = /etc/puppet/hiera.yaml localconfig = /var/lib/puppet/state/localconfig name = config rest_authconfig = /etc/puppet/auth.conf storeconfigs = true storeconfigs_backend = puppetdb tagmap = /etc/puppet/tagmail.conf thin_storeconfigs = false I checked the firewall rules on this VM; 80, 443, 8140, 3000 are allowed. Do I still have to tweak any specifics to auth.conf for getting this to work? Update I added verbose logging to the puppet master and restarted nginx; here's the additional info I see in logs Mon Dec 10 18:19:15 +0530 2012 Puppet (err): Could not resolve 10.209.47.31: no name for 10.209.47.31 Mon Dec 10 18:19:15 +0530 2012 access[/] (info): defaulting to no access for 10.209.47.31 Mon Dec 10 18:19:15 +0530 2012 Puppet (warning): Denying access: Forbidden request: 10.209.47.31(10.209.47.31) access to /file_metadata/plugins [find] at :111 Mon Dec 10 18:19:15 +0530 2012 Puppet (err): Forbidden request: 10.209.47.31(10.209.47.31) access to /file_metadata/plugins [find] at :111 10.209.47.31 - - [10/Dec/2012:18:19:15 +0530] "GET /production/file_metadata/plugins? HTTP/1.1" 403 93 "-" "Ruby" On the agent machine facter fqdn and hostname both return a fully qualified host name [amisr1@blramisr195602 ~]$ sudo facter fqdn blramisr195602.XXXXXXX.com I then updated the agent configuration to add dns_alt_names = 10.209.47.31 cleaned all certificates on master and agent and regenerated the certificates and signed them on master using the option --allow-dns-alt-names [amisr1@bangvmpllDA02 ~]$ sudo puppet cert sign blramisr195602.XXXXXX.com Error: CSR 'blramisr195602.XXXXXX.com' contains subject alternative names (DNS:10.209.47.31, DNS:blramisr195602.XXXXXX.com), which are disallowed. Use `puppet cert --allow-dns-alt-names sign blramisr195602.XXXXXX.com` to sign this request. [amisr1@bangvmpllDA02 ~]$ sudo puppet cert --allow-dns-alt-names sign blramisr195602.XXXXXX.com Signed certificate request for blramisr195602.XXXXXX.com Removing file Puppet::SSL::CertificateRequest blramisr195602.XXXXXX.com at '/var/lib/puppet/ssl/ca/requests/blramisr195602.XXXXXX.com.pem' however, that doesn't help either; I get same errors as before. Not sure why in the logs it shows comparing access rules by IP and not hostname. Is there any Nginx configuration to change this behavior?

    Read the article

  • pasenger does not start puppet master under nginx

    - by Anadi Misra
    On the server [root@bangvmpllDA02 logs]# ruby -v ruby 1.8.7 (2011-06-30 patchlevel 352) [x86_64-linux] [root@bangvmpllDA02 logs]# puppet --version 3.0.1 and [root@bangvmpllDA02 logs]# service nginx configtest nginx: the configuration file /apps/nginx/nginx.conf syntax is ok nginx: configuration file /apps/nginx/nginx.conf test is successful [root@bangvmpllDA02 logs]# service nginx status nginx (pid 25923 25921 25920 25917 25908) is running... [root@bangvmpllDA02 logs]# however none of my agents are able to connect to the master, they all fail with errors like so [amisr1@blramisr195602 ~]$ puppet agent --test --verbose --server bangvmpllda02.XXX.com Info: Creating a new SSL certificate request for blramisr195602.XXX.com Info: Certificate Request fingerprint (SHA256): 26:EB:08:1F:82:32:E4:03:7A:64:8E:30:A3:99:93:26:E6:66:B9:B0:49:B6:08:F9:67:CA:1B:0C:00:B9:1D:41 Error: Could not request certificate: Error 405 on SERVER: <html> <head><title>405 Not Allowed</title></head> <body bgcolor="white"> <center><h1>405 Not Allowed</h1></center> <hr><center>nginx</center> </body> </html> Exiting; failed to retrieve certificate and waitforcert is disabled when I check logs on puppet master [root@bangvmpllDA02 logs]# tail puppet_access.log [05/Dec/2012:17:45:18 +0530] "GET /production/certificate/ca? HTTP/1.1" 404 162 "-" "Ruby" [05/Dec/2012:18:32:23 +0530] "PUT /production/certificate_request/sl63anadi.XXX.com HTTP/1.1" 405 166 "-" "-" [05/Dec/2012:18:33:33 +0530] "GET /production/certificate/sl63anadi.XXX.com? HTTP/1.1" 404 162 "-" "-" [05/Dec/2012:18:33:33 +0530] "GET /production/certificate_request/sl63anadi.XXX.com? HTTP/1.1" 404 162 "-" "-" [05/Dec/2012:18:33:33 +0530] "PUT /production/certificate_request/sl63anadi.XXX.com HTTP/1.1" 405 166 "-" "-" and the error logs show that nginx is not really able to process the request well 2012/12/05 18:33:33 [error] 25920#0: *23 open() "/etc/puppet/rack/public/production/certificate/sl63anadi.XXX.com" failed (2: No such file or directory), client: 10.209.47.26, server: , request: "GET /production/certificate/sl63anadi.XXX.com? HTTP/1.1", host: "bangvmpllda02.XXX.com:8140" 2012/12/05 18:33:33 [error] 25920#0: *24 open() "/etc/puppet/rack/public/production/certificate_request/sl63anadi.XXX.com" failed (2: No such file or directory), client: 10.209.47.26, server: , request: "GET /production/certificate_request/sl63anadi.XXX.com? HTTP/1.1", host: "bangvmpllda02.XXX.com:8140" 2012/12/05 18:47:56 [error] 25923#0: *27 open() "/etc/puppet/rack/public/production/certificate/ca" failed (2: No such file or directory), client: 10.209.47.31, server: , request: "GET /production/certificate/ca? HTTP/1.1", host: "bangvmpllda02.XXX.com:8140" 2012/12/05 18:47:56 [error] 25923#0: *28 open() "/etc/puppet/rack/public/production/certificate_request/blramisr195602.XXX.com" failed (2: No such file or directory), client: 10.209.47.31, server: , request: "GET /production/certificate_request/blramisr195602.XXX.com? HTTP/1.1", host: "bangvmpllda02.XXX.com:8140" Passenger does not show any application groups either [root@bangvmpllDA02 nginx]# passenger-status ----------- General information ----------- max = 15 count = 0 active = 0 inactive = 0 Waiting on global queue: 0 ----------- Application groups ----------- [root@bangvmpllDA02 nginx]# here's my nginx configuration [root@bangvmpllDA02 logs]# cat ../nginx.conf user puppet; worker_processes 4; #error_log logs/error.log; #error_log logs/error.log notice; error_log logs/error.log info; #pid logs/nginx.pid; events { use epoll; worker_connections 1024; } http { include mime.types; default_type application/octet-stream; log_format main '$remote_addr - $remote_user [$time_local] "$request" ' '$status $body_bytes_sent "$http_referer" ' '"$http_user_agent" "$http_x_forwarded_for"'; access_log logs/access.log main; sendfile on; #tcp_nopush on; server_tokens off; #keepalive_timeout 0; keepalive_timeout 120; gzip on; gzip_http_version 1.1; gzip_disable "msie6"; gzip_vary on; gzip_min_length 1100; gzip_buffers 64 8k; gzip_comp_level 3; gzip_proxied any; gzip_types text/plain text/css application/x-javascript text/xml application/xml; server { listen 80; server_name bangvmpllda02.XXXX.com; charset utf-8; #access_log logs/http.access.log main; location / { root html; index index.html index.htm index.php; } #error_page 404 /404.html; # redirect server error pages to the static page /50x.html # error_page 500 502 503 504 /50x.html; location = /50x.html { root html; } # proxy the PHP scripts to Apache listening on 127.0.0.1:80 # #location ~ \.php$ { # proxy_pass http://127.0.0.1; #} # pass the PHP scripts to FastCGI server listening on 127.0.0.1:9000 # location ~ \.php$ { root html; fastcgi_pass unix:/var/run/php-fpm/php-fpm.sock; fastcgi_index index.php; fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name; fastcgi_param SCRIPT_NAME $fastcgi_script_name; include fastcgi_params; } # deny access to .htaccess files, if Apache's document root # concurs with nginx's one # location ~ /\.ht { access_log off; log_not_found off; deny all; } location ~* \.(jpg|jpeg|gif|png|css|js|ico|xml)$ { access_log off; log_not_found off; expires 2d; } } # Passenger needed for puppet passenger_root /usr/lib/ruby/gems/1.8/gems/passenger-3.0.18; passenger_ruby /usr/bin/ruby; passenger_max_pool_size 15; server { ssl on; listen 8140 default ssl; server_name bangvmpllda02.XXXX.com; passenger_enabled on; passenger_set_cgi_param HTTP_X_CLIENT_DN $ssl_client_s_dn; passenger_set_cgi_param HTTP_X_CLIENT_VERIFY $ssl_client_verify; passenger_min_instances 5; access_log logs/puppet_access.log; error_log logs/puppet_error.log; root /etc/puppet/rack/public; ssl_certificate /var/lib/puppet/ssl/certs/bangvmpllda02.XXX.com.pem; ssl_certificate_key /var/lib/puppet/ssl/private_keys/bangvmpllda02.XXX.com.pem; ssl_crl /var/lib/puppet/ssl/ca/ca_crl.pem; ssl_client_certificate /var/lib/puppet/ssl/certs/ca.pem; ssl_ciphers SSLv2:-LOW:-EXPORT:RC4+RSA; ssl_prefer_server_ciphers on; ssl_verify_client optional; ssl_verify_depth 1; ssl_session_cache shared:SSL:128m; ssl_session_timeout 5m; } } and the puppet.conf [main] # The Puppet log directory. # The default value is '$vardir/log'. logdir = /var/log/puppet # Where Puppet PID files are kept. # The default value is '$vardir/run'. rundir = /var/run/puppet dns_alt_names = devops.XXXX.com,devops confdir = /etc/puppet vardir = /var/lib/puppet storeconfigs = true storeconfigs_backend = puppetdb thin_storeconfigs = false async_storeconfigs = false ssl_client_header = SSL_CLIENT_S_D ssl_client_verify_header = SSL_CLIENT_VERIFY # Where SSL certificates are kept. # The default value is '$confdir/ssl'. ssldir = $vardir/ssl any ideas where am I going wrong? I checkthe directory permissions; /usr/share/puppet, /etc/puppet and /var/lib/puppet (and files inside them) are owned by puppet user.

    Read the article

  • Strange Recurrent Excessive I/O Wait

    - by Chris
    I know quite well that I/O wait has been discussed multiple times on this site, but all the other topics seem to cover constant I/O latency, while the I/O problem we need to solve on our server occurs at irregular (short) intervals, but is ever-present with massive spikes of up to 20k ms a-wait and service times of 2 seconds. The disk affected is /dev/sdb (Seagate Barracuda, for details see below). A typical iostat -x output would at times look like this, which is an extreme sample but by no means rare: iostat (Oct 6, 2013) tps rd_sec/s wr_sec/s avgrq-sz avgqu-sz await svctm %util 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 16.00 0.00 156.00 9.75 21.89 288.12 36.00 57.60 5.50 0.00 44.00 8.00 48.79 2194.18 181.82 100.00 2.00 0.00 16.00 8.00 46.49 3397.00 500.00 100.00 4.50 0.00 40.00 8.89 43.73 5581.78 222.22 100.00 14.50 0.00 148.00 10.21 13.76 5909.24 68.97 100.00 1.50 0.00 12.00 8.00 8.57 7150.67 666.67 100.00 0.50 0.00 4.00 8.00 6.31 10168.00 2000.00 100.00 2.00 0.00 16.00 8.00 5.27 11001.00 500.00 100.00 0.50 0.00 4.00 8.00 2.96 17080.00 2000.00 100.00 34.00 0.00 1324.00 9.88 1.32 137.84 4.45 59.60 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 22.00 44.00 204.00 11.27 0.01 0.27 0.27 0.60 Let me provide you with some more information regarding the hardware. It's a Dell 1950 III box with Debian as OS where uname -a reports the following: Linux xx 2.6.32-5-amd64 #1 SMP Fri Feb 15 15:39:52 UTC 2013 x86_64 GNU/Linux The machine is a dedicated server that hosts an online game without any databases or I/O heavy applications running. The core application consumes about 0.8 of the 8 GBytes RAM, and the average CPU load is relatively low. The game itself, however, reacts rather sensitive towards I/O latency and thus our players experience massive ingame lag, which we would like to address as soon as possible. iostat: avg-cpu: %user %nice %system %iowait %steal %idle 1.77 0.01 1.05 1.59 0.00 95.58 Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn sdb 13.16 25.42 135.12 504701011 2682640656 sda 1.52 0.74 20.63 14644533 409684488 Uptime is: 19:26:26 up 229 days, 17:26, 4 users, load average: 0.36, 0.37, 0.32 Harddisk controller: 01:00.0 RAID bus controller: LSI Logic / Symbios Logic MegaRAID SAS 1078 (rev 04) Harddisks: Array 1, RAID-1, 2x Seagate Cheetah 15K.5 73 GB SAS Array 2, RAID-1, 2x Seagate ST3500620SS Barracuda ES.2 500GB 16MB 7200RPM SAS Partition information from df: Filesystem 1K-blocks Used Available Use% Mounted on /dev/sdb1 480191156 30715200 425083668 7% /home /dev/sda2 7692908 437436 6864692 6% / /dev/sda5 15377820 1398916 13197748 10% /usr /dev/sda6 39159724 19158340 18012140 52% /var Some more data samples generated with iostat -dx sdb 1 (Oct 11, 2013) Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util sdb 0.00 15.00 0.00 70.00 0.00 656.00 9.37 4.50 1.83 4.80 33.60 sdb 0.00 0.00 0.00 2.00 0.00 16.00 8.00 12.00 836.00 500.00 100.00 sdb 0.00 0.00 0.00 3.00 0.00 32.00 10.67 9.96 1990.67 333.33 100.00 sdb 0.00 0.00 0.00 4.00 0.00 40.00 10.00 6.96 3075.00 250.00 100.00 sdb 0.00 0.00 0.00 0.00 0.00 0.00 0.00 4.00 0.00 0.00 100.00 sdb 0.00 0.00 0.00 2.00 0.00 16.00 8.00 2.62 4648.00 500.00 100.00 sdb 0.00 0.00 0.00 0.00 0.00 0.00 0.00 2.00 0.00 0.00 100.00 sdb 0.00 0.00 0.00 1.00 0.00 16.00 16.00 1.69 7024.00 1000.00 100.00 sdb 0.00 74.00 0.00 124.00 0.00 1584.00 12.77 1.09 67.94 6.94 86.00 Characteristic charts generated with rrdtool can be found here: iostat plot 1, 24 min interval: http://imageshack.us/photo/my-images/600/yqm3.png/ iostat plot 2, 120 min interval: http://imageshack.us/photo/my-images/407/griw.png/ As we have a rather large cache of 5.5 GBytes, we thought it might be a good idea to test if the I/O wait spikes would perhaps be caused by cache miss events. Therefore, we did a sync and then this to flush the cache and buffers: echo 3 > /proc/sys/vm/drop_caches and directly afterwards the I/O wait and service times virtually went through the roof, and everything on the machine felt like slow motion. During the next few hours the latency recovered and everything was as before - small to medium lags in short, unpredictable intervals. Now my question is: does anybody have any idea what might cause this annoying behaviour? Is it the first indication of the disk array or the raid controller dying, or something that can be easily mended by rebooting? (At the moment we're very reluctant to do this, however, because we're afraid that the disks might not come back up again.) Any help is greatly appreciated. Thanks in advance, Chris. Edited to add: we do see one or two processes go to 'D' state in top, one of which seems to be kjournald rather frequently. If I'm not mistaken, however, this does not indicate the processes causing the latency, but rather those affected by it - correct me if I'm wrong. Does the information about uninterruptibly sleeping processes help us in any way to address the problem? @Andy Shinn requested smartctl data, here it is: smartctl -a -d megaraid,2 /dev/sdb yields: smartctl 5.40 2010-07-12 r3124 [x86_64-unknown-linux-gnu] (local build) Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net Device: SEAGATE ST3500620SS Version: MS05 Serial number: Device type: disk Transport protocol: SAS Local Time is: Mon Oct 14 20:37:13 2013 CEST Device supports SMART and is Enabled Temperature Warning Disabled or Not Supported SMART Health Status: OK Current Drive Temperature: 20 C Drive Trip Temperature: 68 C Elements in grown defect list: 0 Vendor (Seagate) cache information Blocks sent to initiator = 1236631092 Blocks received from initiator = 1097862364 Blocks read from cache and sent to initiator = 1383620256 Number of read and write commands whose size <= segment size = 531295338 Number of read and write commands whose size > segment size = 51986460 Vendor (Seagate/Hitachi) factory information number of hours powered up = 36556.93 number of minutes until next internal SMART test = 32 Error counter log: Errors Corrected by Total Correction Gigabytes Total ECC rereads/ errors algorithm processed uncorrected fast | delayed rewrites corrected invocations [10^9 bytes] errors read: 509271032 47 0 509271079 509271079 20981.423 0 write: 0 0 0 0 0 5022.039 0 verify: 1870931090 196 0 1870931286 1870931286 100558.708 0 Non-medium error count: 0 SMART Self-test log Num Test Status segment LifeTime LBA_first_err [SK ASC ASQ] Description number (hours) # 1 Background short Completed 16 36538 - [- - -] # 2 Background short Completed 16 36514 - [- - -] # 3 Background short Completed 16 36490 - [- - -] # 4 Background short Completed 16 36466 - [- - -] # 5 Background short Completed 16 36442 - [- - -] # 6 Background long Completed 16 36420 - [- - -] # 7 Background short Completed 16 36394 - [- - -] # 8 Background short Completed 16 36370 - [- - -] # 9 Background long Completed 16 36364 - [- - -] #10 Background short Completed 16 36361 - [- - -] #11 Background long Completed 16 2 - [- - -] #12 Background short Completed 16 0 - [- - -] Long (extended) Self Test duration: 6798 seconds [113.3 minutes] smartctl -a -d megaraid,3 /dev/sdb yields: smartctl 5.40 2010-07-12 r3124 [x86_64-unknown-linux-gnu] (local build) Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net Device: SEAGATE ST3500620SS Version: MS05 Serial number: Device type: disk Transport protocol: SAS Local Time is: Mon Oct 14 20:37:26 2013 CEST Device supports SMART and is Enabled Temperature Warning Disabled or Not Supported SMART Health Status: OK Current Drive Temperature: 19 C Drive Trip Temperature: 68 C Elements in grown defect list: 0 Vendor (Seagate) cache information Blocks sent to initiator = 288745640 Blocks received from initiator = 1097848399 Blocks read from cache and sent to initiator = 1304149705 Number of read and write commands whose size <= segment size = 527414694 Number of read and write commands whose size > segment size = 51986460 Vendor (Seagate/Hitachi) factory information number of hours powered up = 36596.83 number of minutes until next internal SMART test = 28 Error counter log: Errors Corrected by Total Correction Gigabytes Total ECC rereads/ errors algorithm processed uncorrected fast | delayed rewrites corrected invocations [10^9 bytes] errors read: 610862490 44 0 610862534 610862534 20470.133 0 write: 0 0 0 0 0 5022.480 0 verify: 2861227413 203 0 2861227616 2861227616 100872.443 0 Non-medium error count: 1 SMART Self-test log Num Test Status segment LifeTime LBA_first_err [SK ASC ASQ] Description number (hours) # 1 Background short Completed 16 36580 - [- - -] # 2 Background short Completed 16 36556 - [- - -] # 3 Background short Completed 16 36532 - [- - -] # 4 Background short Completed 16 36508 - [- - -] # 5 Background short Completed 16 36484 - [- - -] # 6 Background long Completed 16 36462 - [- - -] # 7 Background short Completed 16 36436 - [- - -] # 8 Background short Completed 16 36412 - [- - -] # 9 Background long Completed 16 36404 - [- - -] #10 Background short Completed 16 36401 - [- - -] #11 Background long Completed 16 2 - [- - -] #12 Background short Completed 16 0 - [- - -] Long (extended) Self Test duration: 6798 seconds [113.3 minutes]

    Read the article

  • solved: passenger(mod_rails) fails to start puppet master under nginx

    - by Anadi Misra
    On the server [root@bangvmpllDA02 logs]# ruby -v ruby 1.8.7 (2011-06-30 patchlevel 352) [x86_64-linux] [root@bangvmpllDA02 logs]# puppet --version 3.0.1 and [root@bangvmpllDA02 logs]# service nginx configtest nginx: the configuration file /apps/nginx/nginx.conf syntax is ok nginx: configuration file /apps/nginx/nginx.conf test is successful [root@bangvmpllDA02 logs]# service nginx status nginx (pid 25923 25921 25920 25917 25908) is running... [root@bangvmpllDA02 logs]# however none of my agents are able to connect to the master, they all fail with errors like so [amisr1@blramisr195602 ~]$ puppet agent --test --verbose --server bangvmpllda02.XXX.com Info: Creating a new SSL certificate request for blramisr195602.XXX.com Info: Certificate Request fingerprint (SHA256): 26:EB:08:1F:82:32:E4:03:7A:64:8E:30:A3:99:93:26:E6:66:B9:B0:49:B6:08:F9:67:CA:1B:0C:00:B9:1D:41 Error: Could not request certificate: Error 405 on SERVER: <html> <head><title>405 Not Allowed</title></head> <body bgcolor="white"> <center><h1>405 Not Allowed</h1></center> <hr><center>nginx</center> </body> </html> Exiting; failed to retrieve certificate and waitforcert is disabled when I check logs on puppet master [root@bangvmpllDA02 logs]# tail puppet_access.log [05/Dec/2012:17:45:18 +0530] "GET /production/certificate/ca? HTTP/1.1" 404 162 "-" "Ruby" [05/Dec/2012:18:32:23 +0530] "PUT /production/certificate_request/sl63anadi.XXX.com HTTP/1.1" 405 166 "-" "-" [05/Dec/2012:18:33:33 +0530] "GET /production/certificate/sl63anadi.XXX.com? HTTP/1.1" 404 162 "-" "-" [05/Dec/2012:18:33:33 +0530] "GET /production/certificate_request/sl63anadi.XXX.com? HTTP/1.1" 404 162 "-" "-" [05/Dec/2012:18:33:33 +0530] "PUT /production/certificate_request/sl63anadi.XXX.com HTTP/1.1" 405 166 "-" "-" and the error logs show that nginx is not really able to process the request well 2012/12/05 18:33:33 [error] 25920#0: *23 open() "/etc/puppet/rack/public/production/certificate/sl63anadi.XXX.com" failed (2: No such file or directory), client: 10.209.47.26, server: , request: "GET /production/certificate/sl63anadi.XXX.com? HTTP/1.1", host: "bangvmpllda02.XXX.com:8140" 2012/12/05 18:33:33 [error] 25920#0: *24 open() "/etc/puppet/rack/public/production/certificate_request/sl63anadi.XXX.com" failed (2: No such file or directory), client: 10.209.47.26, server: , request: "GET /production/certificate_request/sl63anadi.XXX.com? HTTP/1.1", host: "bangvmpllda02.XXX.com:8140" 2012/12/05 18:47:56 [error] 25923#0: *27 open() "/etc/puppet/rack/public/production/certificate/ca" failed (2: No such file or directory), client: 10.209.47.31, server: , request: "GET /production/certificate/ca? HTTP/1.1", host: "bangvmpllda02.XXX.com:8140" 2012/12/05 18:47:56 [error] 25923#0: *28 open() "/etc/puppet/rack/public/production/certificate_request/blramisr195602.XXX.com" failed (2: No such file or directory), client: 10.209.47.31, server: , request: "GET /production/certificate_request/blramisr195602.XXX.com? HTTP/1.1", host: "bangvmpllda02.XXX.com:8140" Passenger does not show any application groups either [root@bangvmpllDA02 nginx]# passenger-status ----------- General information ----------- max = 15 count = 0 active = 0 inactive = 0 Waiting on global queue: 0 ----------- Application groups ----------- [root@bangvmpllDA02 nginx]# here's my nginx configuration [root@bangvmpllDA02 logs]# cat ../nginx.conf user puppet; worker_processes 4; #error_log logs/error.log; #error_log logs/error.log notice; error_log logs/error.log info; #pid logs/nginx.pid; events { use epoll; worker_connections 1024; } http { include mime.types; default_type application/octet-stream; log_format main '$remote_addr - $remote_user [$time_local] "$request" ' '$status $body_bytes_sent "$http_referer" ' '"$http_user_agent" "$http_x_forwarded_for"'; access_log logs/access.log main; sendfile on; #tcp_nopush on; server_tokens off; #keepalive_timeout 0; keepalive_timeout 120; gzip on; gzip_http_version 1.1; gzip_disable "msie6"; gzip_vary on; gzip_min_length 1100; gzip_buffers 64 8k; gzip_comp_level 3; gzip_proxied any; gzip_types text/plain text/css application/x-javascript text/xml application/xml; server { listen 80; server_name bangvmpllda02.XXXX.com; charset utf-8; #access_log logs/http.access.log main; location / { root html; index index.html index.htm index.php; } #error_page 404 /404.html; # redirect server error pages to the static page /50x.html # error_page 500 502 503 504 /50x.html; location = /50x.html { root html; } # proxy the PHP scripts to Apache listening on 127.0.0.1:80 # #location ~ \.php$ { # proxy_pass http://127.0.0.1; #} # pass the PHP scripts to FastCGI server listening on 127.0.0.1:9000 # location ~ \.php$ { root html; fastcgi_pass unix:/var/run/php-fpm/php-fpm.sock; fastcgi_index index.php; fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name; fastcgi_param SCRIPT_NAME $fastcgi_script_name; include fastcgi_params; } # deny access to .htaccess files, if Apache's document root # concurs with nginx's one # location ~ /\.ht { access_log off; log_not_found off; deny all; } location ~* \.(jpg|jpeg|gif|png|css|js|ico|xml)$ { access_log off; log_not_found off; expires 2d; } } # Passenger needed for puppet passenger_root /usr/lib/ruby/gems/1.8/gems/passenger-3.0.18; passenger_ruby /usr/bin/ruby; passenger_max_pool_size 15; server { ssl on; listen 8140 default ssl; server_name bangvmpllda02.XXXX.com; passenger_enabled on; passenger_set_cgi_param HTTP_X_CLIENT_DN $ssl_client_s_dn; passenger_set_cgi_param HTTP_X_CLIENT_VERIFY $ssl_client_verify; passenger_min_instances 5; access_log logs/puppet_access.log; error_log logs/puppet_error.log; root /etc/puppet/rack/public; ssl_certificate /var/lib/puppet/ssl/certs/bangvmpllda02.XXX.com.pem; ssl_certificate_key /var/lib/puppet/ssl/private_keys/bangvmpllda02.XXX.com.pem; ssl_crl /var/lib/puppet/ssl/ca/ca_crl.pem; ssl_client_certificate /var/lib/puppet/ssl/certs/ca.pem; ssl_ciphers SSLv2:-LOW:-EXPORT:RC4+RSA; ssl_prefer_server_ciphers on; ssl_verify_client optional; ssl_verify_depth 1; ssl_session_cache shared:SSL:128m; ssl_session_timeout 5m; } } and the puppet.conf [main] # The Puppet log directory. # The default value is '$vardir/log'. logdir = /var/log/puppet # Where Puppet PID files are kept. # The default value is '$vardir/run'. rundir = /var/run/puppet dns_alt_names = devops.XXXX.com,devops confdir = /etc/puppet vardir = /var/lib/puppet storeconfigs = true storeconfigs_backend = puppetdb thin_storeconfigs = false async_storeconfigs = false ssl_client_header = SSL_CLIENT_S_D ssl_client_verify_header = SSL_CLIENT_VERIFY # Where SSL certificates are kept. # The default value is '$confdir/ssl'. ssldir = $vardir/ssl any ideas where am I going wrong? I checkthe directory permissions; /usr/share/puppet, /etc/puppet and /var/lib/puppet (and files inside them) are owned by puppet user. Solved The simple solution to my complicated problem was that I had placed the config.ru in wrong place moved it to /etc/puppet/rack , it was in /etc/puppet/rack/public Well!!! :-/

    Read the article

  • Connect ps/2->usb keyboard to linux?

    - by Daniel
    I have a lovely ancient ergonomic keyboard (no name SK - 6000) connected via a DIN-ps/2 adapter to a ps/2-usb adapter to my docking station. After Grub it stops working. It takes either suspending and waking up or replugging it while Linux is running to get it to work. No extra kernel modules get loaded for this. When it works and I restart without power off, it will work immediately. Even when it does not work, it is visible (lsusb device number varies but output is identical whether working or not): $ lsusb -v -s 001:006 Bus 001 Device 006: ID 0a81:0205 Chesen Electronics Corp. PS/2 Keyboard+Mouse Adapter Device Descriptor: bLength 18 bDescriptorType 1 bcdUSB 1.10 bDeviceClass 0 (Defined at Interface level) bDeviceSubClass 0 bDeviceProtocol 0 bMaxPacketSize0 8 idVendor 0x0a81 Chesen Electronics Corp. idProduct 0x0205 PS/2 Keyboard+Mouse Adapter bcdDevice 0.10 iManufacturer 1 CHESEN iProduct 2 PS2 to USB Converter iSerial 0 bNumConfigurations 1 Configuration Descriptor: bLength 9 bDescriptorType 2 wTotalLength 59 bNumInterfaces 2 bConfigurationValue 1 iConfiguration 2 PS2 to USB Converter bmAttributes 0xa0 (Bus Powered) Remote Wakeup MaxPower 100mA Interface Descriptor: bLength 9 bDescriptorType 4 bInterfaceNumber 0 bAlternateSetting 0 bNumEndpoints 1 bInterfaceClass 3 Human Interface Device bInterfaceSubClass 1 Boot Interface Subclass bInterfaceProtocol 1 Keyboard iInterface 0 HID Device Descriptor: bLength 9 bDescriptorType 33 bcdHID 1.10 bCountryCode 0 Not supported bNumDescriptors 1 bDescriptorType 34 Report wDescriptorLength 64 Report Descriptors: ** UNAVAILABLE ** Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x81 EP 1 IN bmAttributes 3 Transfer Type Interrupt Synch Type None Usage Type Data wMaxPacketSize 0x0008 1x 8 bytes bInterval 10 Interface Descriptor: bLength 9 bDescriptorType 4 bInterfaceNumber 1 bAlternateSetting 0 bNumEndpoints 1 bInterfaceClass 3 Human Interface Device bInterfaceSubClass 1 Boot Interface Subclass bInterfaceProtocol 2 Mouse iInterface 0 HID Device Descriptor: bLength 9 bDescriptorType 33 bcdHID 1.10 bCountryCode 0 Not supported bNumDescriptors 1 bDescriptorType 34 Report wDescriptorLength 148 Report Descriptors: ** UNAVAILABLE ** Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x82 EP 2 IN bmAttributes 3 Transfer Type Interrupt Synch Type None Usage Type Data wMaxPacketSize 0x0008 1x 8 bytes bInterval 10 Device Status: 0x0000 (Bus Powered) $ ll -R /sys/bus/hid/drivers/ /sys/bus/hid/drivers/: total 0 drwxr-xr-x 2 root root 0 Jul 8 2012 generic-usb/ /sys/bus/hid/drivers/generic-usb: total 0 lrwxrwxrwx 1 root root 0 Jul 7 23:33 0003:046D:C03D.0003 -> ../../../../devices/pci0000:00/0000:00:1a.0/usb1/1-1/1-1.2/1-1.2.2/1-1.2.2:1.0/0003:046D:C03D.0003/ lrwxrwxrwx 1 root root 0 Jul 7 23:33 0003:0A81:0205.0001 -> ../../../../devices/pci0000:00/0000:00:1a.0/usb1/1-1/1-1.2/1-1.2.1/1-1.2.1:1.0/0003:0A81:0205.0001/ lrwxrwxrwx 1 root root 0 Jul 7 23:33 0003:0A81:0205.0002 -> ../../../../devices/pci0000:00/0000:00:1a.0/usb1/1-1/1-1.2/1-1.2.1/1-1.2.1:1.1/0003:0A81:0205.0002/ --w------- 1 root root 4096 Jul 7 23:32 bind lrwxrwxrwx 1 root root 0 Jul 7 23:33 module -> ../../../../module/usbhid/ --w------- 1 root root 4096 Jul 7 23:32 new_id --w------- 1 root root 4096 Jul 8 2012 uevent --w------- 1 root root 4096 Jul 7 23:32 unbind When replugging, dmesg shows this (which except for the 1st line and different input numbers already came at boot time): [ 1583.295385] usb 1-1.2.1: new low-speed USB device number 6 using ehci_hcd [ 1583.446514] input: CHESEN PS2 to USB Converter as /devices/pci0000:00/0000:00:1a.0/usb1/1-1/1-1.2/1-1.2.1/1-1.2.1:1.0/input/input17 [ 1583.446817] generic-usb 0003:0A81:0205.0001: input,hidraw0: USB HID v1.10 Keyboard [CHESEN PS2 to USB Converter] on usb-0000:00:1a.0-1.2.1/input0 [ 1583.454764] input: CHESEN PS2 to USB Converter as /devices/pci0000:00/0000:00:1a.0/usb1/1-1/1-1.2/1-1.2.1/1-1.2.1:1.1/input/input18 [ 1583.455534] generic-usb 0003:0A81:0205.0002: input,hidraw1: USB HID v1.10 Mouse [CHESEN PS2 to USB Converter] on usb-0000:00:1a.0-1.2.1/input1 [ 1583.455578] usbcore: registered new interface driver usbhid [ 1583.455584] usbhid: USB HID core driver So I tried $ sudo udevadm test /sys/devices/pci0000:00/0000:00:1a.0/usb1/1-1/1-1.2/1-1.2.1/1-1.2.1:1.0/0003:0A81:0205.0001/hidraw/hidraw0 run_command: calling: test adm_test: version 175 This program is for debugging only, it does not run any program, specified by a RUN key. It may show incorrect results, because some values may be different, or not available at a simulation run. parse_file: reading '/lib/udev/rules.d/40-crda.rules' as rules file parse_file: reading '/lib/udev/rules.d/40-fuse.rules' as rules file ... parse_file: reading '/lib/udev/rules.d/40-usb-media-players.rules' as rules file parse_file: reading '/lib/udev/rules.d/40-usb_modeswitch.rules' as rules file ... parse_file: reading '/lib/udev/rules.d/42-qemu-usb.rules' as rules file ... parse_file: reading '/lib/udev/rules.d/69-cd-sensors.rules' as rules file add_rule: IMPORT found builtin 'usb_id', replacing /lib/udev/rules.d/69-cd-sensors.rules:76 ... parse_file: reading '/lib/udev/rules.d/77-mm-usb-device-blacklist.rules' as rules file ... parse_file: reading '/lib/udev/rules.d/85-usbmuxd.rules' as rules file ... parse_file: reading '/lib/udev/rules.d/95-upower-hid.rules' as rules file parse_file: reading '/lib/udev/rules.d/95-upower-wup.rules' as rules file parse_file: reading '/lib/udev/rules.d/97-bluetooth-hid2hci.rules' as rules file udev_rules_new: rules use 271500 bytes tokens (22625 * 12 bytes), 44331 bytes buffer udev_rules_new: temporary index used 76320 bytes (3816 * 20 bytes) udev_device_new_from_syspath: device 0x7f78a5e4d2d0 has devpath '/devices/pci0000:00/0000:00:1a.0/usb1/1-1/1-1.2/1-1.2.1/1-1.2.1:1.0/0003:0A81:0205.0001/hidraw/hidraw0' udev_device_new_from_syspath: device 0x7f78a5e5f820 has devpath '/devices/pci0000:00/0000:00:1a.0/usb1/1-1/1-1.2/1-1.2.1/1-1.2.1:1.0/0003:0A81:0205.0001/hidraw/hidraw0' udev_device_read_db: device 0x7f78a5e5f820 filled with db file data udev_device_new_from_syspath: device 0x7f78a5e60270 has devpath '/devices/pci0000:00/0000:00:1a.0/usb1/1-1/1-1.2/1-1.2.1/1-1.2.1:1.0/0003:0A81:0205.0001' udev_device_new_from_syspath: device 0x7f78a5e609c0 has devpath '/devices/pci0000:00/0000:00:1a.0/usb1/1-1/1-1.2/1-1.2.1/1-1.2.1:1.0' udev_device_new_from_syspath: device 0x7f78a5e61160 has devpath '/devices/pci0000:00/0000:00:1a.0/usb1/1-1/1-1.2/1-1.2.1' udev_device_new_from_syspath: device 0x7f78a5e61960 has devpath '/devices/pci0000:00/0000:00:1a.0/usb1/1-1/1-1.2' udev_device_new_from_syspath: device 0x7f78a5e62150 has devpath '/devices/pci0000:00/0000:00:1a.0/usb1/1-1' udev_device_new_from_syspath: device 0x7f78a5e62940 has devpath '/devices/pci0000:00/0000:00:1a.0/usb1' udev_device_new_from_syspath: device 0x7f78a5e630f0 has devpath '/devices/pci0000:00/0000:00:1a.0' udev_device_new_from_syspath: device 0x7f78a5e638a0 has devpath '/devices/pci0000:00' udev_event_execute_rules: no node name set, will use kernel supplied name 'hidraw0' udev_node_add: creating device node '/dev/hidraw0', devnum=251:0, mode=0600, uid=0, gid=0 udev_node_mknod: preserve file '/dev/hidraw0', because it has correct dev_t udev_node_mknod: preserve permissions /dev/hidraw0, 020600, uid=0, gid=0 node_symlink: preserve already existing symlink '/dev/char/251:0' to '../hidraw0' udev_device_update_db: created empty file '/run/udev/data/c251:0' for '/devices/pci0000:00/0000:00:1a.0/usb1/1-1/1-1.2/1-1.2.1/1-1.2.1:1.0/0003:0A81:0205.0001/hidraw/hidraw0' ACTION=add DEVNAME=/dev/hidraw0 DEVPATH=/devices/pci0000:00/0000:00:1a.0/usb1/1-1/1-1.2/1-1.2.1/1-1.2.1:1.0/0003:0A81:0205.0001/hidraw/hidraw0 MAJOR=251 MINOR=0 SUBSYSTEM=hidraw UDEV_LOG=6 USEC_INITIALIZED=969079051 The later lines sound like it's already there. And none of these awakes the keyboard: $ sudo udevadm trigger --verbose --sysname-match=usb* /sys/devices/pci0000:00/0000:00:1a.0/usb1 /sys/devices/pci0000:00/0000:00:1a.0/usbmon/usbmon1 /sys/devices/pci0000:00/0000:00:1d.0/usb2 /sys/devices/pci0000:00/0000:00:1d.0/usbmon/usbmon2 /sys/devices/virtual/usbmon/usbmon0 $ sudo udevadm trigger --verbose --sysname-match=hidraw0 /sys/devices/pci0000:00/0000:00:1a.0/usb1/1-1/1-1.2/1-1.2.1/1-1.2.1:1.0/0003:0A81:0205.0001/hidraw/hidraw0 $ sudo udevadm trigger I also tried this to no avail: # echo -n 0003:0A81:0205.0001 > /sys/bus/hid/drivers/generic-usb/bind ksh: echo: write to 1 failed [No such device] # echo -n 0003:0A81:0205.0001 > /sys/bus/hid/drivers/generic-usb/unbind # echo -n 0003:0A81:0205.0001 > /sys/bus/hid/drivers/generic-usb/bind # echo usb1 >/sys/bus/usb/drivers/usb/unbind # echo usb1 >/sys/bus/usb/drivers/usb/bind What else should I try to get the same result as replugging or suspending, by just issuing a command?

    Read the article

  • Why are my hard drives failing?

    - by WishCow
    I have a small Ubuntu server running at home, with 2 HDDs. There are two software raids (raid1) on the disks, managed by mdadm, which I believe is irrelevant, but mentioning it anyway. Both of the HDDs are Western Digital, and have been used for around 2 years, when one of them started making clicking noises, and died. I figured that maybe it's natural after 2 years, so I bought a new one, and resynced the raid arrays. After about a month, the other drive also died. I didn't get suspicious, since both drives have been bought at the same time, it's not that surprising to see both of them near each other, so I bought another one. So far, 2 old drives failed, and 2 brand new in the system. After one month, one of the new drives died. This is when it started getting suspicious. Since the PC was put together from some really old parts (think AthlonXP), I figured that maybe the motherboard's SATA controller is the culprit. Of course you can't switch parts easily in an old PC like this, so I bought a whole system, new MB, new CPU, new RAM. Took the just failed drive back, since it was under warranty, and got it replaced. So it is up to 2 failed drives from the old ones, and 1 failed drive from the new ones. No problems, for 1 month. After that errors were creeping up again in /var/log/messages, and mdadm was reporting raid array failures. I started tearing my hair out. Everything is new in the system, it's up to the third brand new HDD, it's simply not possible that all of the new drives that I bought were faulty. Let's see what is still common... the cables. Okay, long shot, let's replace the SATA cables. Take HDD back, smile to the guy at the counter and say that I'm really unlucky. He replaces the HDD. I come home, one month passes and one of HDDs fails, again. I'm not joking. Two of the brand new HDDs have failed. Maybe it's a bug in the OS. Let's see what the manufacturer's testing tool says. Download testing tool, burn it to a CD, reboot, leave HDD testing overnight. Test says that the drive is faulty, and I should back up everything, if I still can. I don't know what's happening, but it does not look like a software problem, something is definitely thrashing the HDDs. I should mention now, that the whole system is in a shoebox. Since there are a load of "build your own ikea case" stuff, I thought there shouldn't be any problems throwing the thing in a box, and stuffing it away somewhere. The box is well ventilated, but I thought that just maybe the drives were overheating. There is no other possible answer to this. So I took the HDD back, and got it replaced (for the 3rd time), and bought HDD coolers. And just now, I have heard the sound of doom. click click whizzzzzzzzz. SSH into the box: You have new mail! mail r 1 DegradedArrayEvent on /dev/md0 ... dmesg output: [47128.000051] ata3: lost interrupt (Status 0x50) [47128.000097] end_request: I/O error, dev sda, sector 58588863 [47128.000134] md: super_written gets error=-5, uptodate=0 [48043.976054] ata3: lost interrupt (Status 0x50) [48043.976086] ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen [48043.976132] ata3.00: cmd c8/00:18:bf:40:52/00:00:00:00:00/e1 tag 0 dma 12288 in [48043.976135] res 40/00:00:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout) [48043.976208] ata3.00: status: { DRDY } [48043.976241] ata3: soft resetting link [48044.148446] ata3.00: configured for UDMA/133 [48044.148457] ata3.00: device reported invalid CHS sector 0 [48044.148477] ata3: EH complete Recap: No possibility of overheating 6 drives have failed, 4 of those have been brand new. I'm not sure now that the original two have been faulty, or suffered the same thing that the new ones. There is nothing common in the system, apart from the OS which is Ubuntu Karmic now (started with Jaunty). New MB, new CPU, new RAM, new SATA cables. No, the little holes on the HDD are not covered I'm crying. Really. I don't have the face to return to the store now, it's not possible for 4 drives to fail under 4 months. A few ideas that I have been thinking: Is it possible that I fuck up something when I partition and resync the drives? Can it be so bad that it physicaly wrecks the drive? (since the vendor supplied tool says that the drive is damaged) I do the partitoning with fdisk, and use the same block size for the raid1 partitions (I check the exact block sizes with fdisk -lu) Is it possible that the linux kernel or mdadm, or something is not compatible with this exact brand of HDDs, and thrashes them? Is it possible that it may be the shoebox? Try placing it somewhere else? It's under a shelf now, so humidity is not a problem either. Is it possible that a normal PC case will solve my problem (I'm going to shoot myself then)? I will get a picture tomorrow. Am I just simply cursed? Any help or speculation is greatly appreciated. Edit: The power strip is guarded against overvoltage. Edit2: I have moved inbetween these 4 months, so the possibility of the cause being "dirty" electricity in both places, is very low. Edit3: I have checked the voltages in the BIOS (couldn't borrow a multimeter), and they are all seem correct, the biggest discrepancy is in the 12V, because it's supplying 11.3. Should I be worried about that? Edit4: I put my desktop PC's PSU into the server. The BIOS reported much more accurate voltage readings, and also it has successfully rebuilt the raid1 array, which took some 3-4 hours, so I feel a little positive now. Will get a new PSU tomorrow to test with that. Also, attaching the picture about the box: (disregard the 3rd drive)

    Read the article

  • How to diagnose failing 6Gbps SATA connection?

    - by whitequark
    I have a Samsung RC530 notebook and OCZ Vertex-3 6Gbps SATA SSD working in AHCI mode. # dmesg | grep DMI SAMSUNG ELECTRONICS CO., LTD. RC530/RC730/RC530/RC730, BIOS 03WD.M008.20110927.PSA 09/27/2011 # lspci -nn 00:1f.2 SATA controller [0106]: Intel Corporation 6 Series/C200 Series Chipset Family 6 port SATA AHCI Controller [8086:1c03] (rev 04) # sdparm -a /dev/sda /dev/sda: ATA OCZ-VERTEX3 2.15 At the boot, the following messages are present in dmesg (I am running Debian wheezy @ Linux 3.2.8): # dmesg | grep -iE '(ata|ahci)' [ 5.179783] ahci 0000:00:1f.2: version 3.0 [ 5.179802] ahci 0000:00:1f.2: PCI INT B -> GSI 19 (level, low) -> IRQ 19 [ 5.179864] ahci 0000:00:1f.2: irq 42 for MSI/MSI-X [ 5.195424] ahci 0000:00:1f.2: AHCI 0001.0300 32 slots 6 ports 6 Gbps 0x5 impl SATA mode [ 5.195429] ahci 0000:00:1f.2: flags: 64bit ncq sntf pm led clo pio slum part ems apst [ 5.195436] ahci 0000:00:1f.2: setting latency timer to 64 [ 5.204035] scsi0 : ahci [ 5.204301] scsi1 : ahci [ 5.204447] scsi2 : ahci [ 5.204592] scsi3 : ahci [ 5.204682] scsi4 : ahci [ 5.204799] scsi5 : ahci [ 5.204917] ata1: SATA max UDMA/133 abar m2048@0xf7c06000 port 0xf7c06100 irq 42 [ 5.204920] ata2: DUMMY [ 5.204923] ata3: SATA max UDMA/133 abar m2048@0xf7c06000 port 0xf7c06200 irq 42 [ 5.204924] ata4: DUMMY [ 5.204926] ata5: DUMMY [ 5.204927] ata6: DUMMY [ 5.523039] ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 300) [ 5.525911] ata3.00: ATAPI: TSSTcorp CDDVDW SN-208BB, SC00, max UDMA/100 [ 5.531006] ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300) [ 5.533703] ata3.00: configured for UDMA/100 [ 5.542790] ata1.00: ATA-8: OCZ-VERTEX3, 2.15, max UDMA/133 [ 5.542800] ata1.00: 117231408 sectors, multi 16: LBA48 NCQ (depth 31/32), AA [ 5.552751] ata1.00: configured for UDMA/133 [ 5.553050] scsi 0:0:0:0: Direct-Access ATA OCZ-VERTEX3 2.15 PQ: 0 ANSI: 5 [ 5.559621] scsi 2:0:0:0: CD-ROM TSSTcorp CDDVDW SN-208BB SC00 PQ: 0 ANSI: 5 [ 5.564059] sd 0:0:0:0: [sda] 117231408 512-byte logical blocks: (60.0 GB/55.8 GiB) [ 5.564127] sd 0:0:0:0: [sda] Write Protect is off [ 5.564131] sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00 [ 5.564158] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA [ 5.564582] sda: sda1 [ 5.564810] sd 0:0:0:0: [sda] Attached SCSI disk [ 5.572006] sr0: scsi3-mmc drive: 16x/24x writer dvd-ram cd/rw xa/form2 cdda tray [ 5.572010] cdrom: Uniform CD-ROM driver Revision: 3.20 [ 5.572189] sr 2:0:0:0: Attached scsi CD-ROM sr0 [ 6.717181] ata1.00: exception Emask 0x50 SAct 0x1 SErr 0x280900 action 0x6 frozen [ 6.717238] ata1.00: irq_stat 0x08000000, interface fatal error [ 6.717291] ata1: SError: { UnrecovData HostInt 10B8B BadCRC } [ 6.717342] ata1.00: failed command: READ FPDMA QUEUED [ 6.717395] ata1.00: cmd 60/50:00:20:39:58/00:00:00:00:00/40 tag 0 ncq 40960 in [ 6.717396] res 40/00:00:20:39:58/00:00:00:00:00/40 Emask 0x50 (ATA bus error) [ 6.717503] ata1.00: status: { DRDY } [ 6.717553] ata1: hard resetting link [ 7.033417] ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300) [ 7.055234] ata1.00: configured for UDMA/133 [ 7.055262] ata1: EH complete [ 7.147280] ata1.00: exception Emask 0x10 SAct 0xf8 SErr 0x280100 action 0x6 frozen [ 7.147340] ata1.00: irq_stat 0x08000000, interface fatal error [ 7.147393] ata1: SError: { UnrecovData 10B8B BadCRC } [ 7.147460] ata1.00: failed command: READ FPDMA QUEUED [ 7.147529] ata1.00: cmd 60/08:18:88:17:41/00:00:02:00:00/40 tag 3 ncq 4096 in [ 7.147531] res 40/00:38:50:99:64/00:00:02:00:00/40 Emask 0x10 (ATA bus error) [ 7.147691] ata1.00: status: { DRDY } [ 7.147754] ata1.00: failed command: READ FPDMA QUEUED [ 7.147821] ata1.00: cmd 60/00:20:f8:42:4c/01:00:02:00:00/40 tag 4 ncq 131072 in [ 7.147822] res 40/00:38:50:99:64/00:00:02:00:00/40 Emask 0x10 (ATA bus error) [ 7.147977] ata1.00: status: { DRDY } [ 7.148036] ata1.00: failed command: READ FPDMA QUEUED [ 7.148100] ata1.00: cmd 60/50:28:f8:43:4c/00:00:02:00:00/40 tag 5 ncq 40960 in [ 7.148101] res 40/00:38:50:99:64/00:00:02:00:00/40 Emask 0x10 (ATA bus error) [ 7.148255] ata1.00: status: { DRDY } [ 7.148315] ata1.00: failed command: READ FPDMA QUEUED [ 7.148379] ata1.00: cmd 60/00:30:50:98:64/01:00:02:00:00/40 tag 6 ncq 131072 in [ 7.148380] res 40/00:38:50:99:64/00:00:02:00:00/40 Emask 0x10 (ATA bus error) [ 7.148534] ata1.00: status: { DRDY } [ 7.148593] ata1.00: failed command: READ FPDMA QUEUED [ 7.148657] ata1.00: cmd 60/00:38:50:99:64/01:00:02:00:00/40 tag 7 ncq 131072 in [ 7.148658] res 40/00:38:50:99:64/00:00:02:00:00/40 Emask 0x10 (ATA bus error) [ 7.148813] ata1.00: status: { DRDY } [ 7.148875] ata1: hard resetting link [ 7.464842] ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300) [ 7.486794] ata1.00: configured for UDMA/133 [ 7.486822] ata1: EH complete [ 7.546395] ata1.00: exception Emask 0x10 SAct 0x2f SErr 0x280100 action 0x6 frozen [ 7.546470] ata1.00: irq_stat 0x08000000, interface fatal error [ 7.546531] ata1: SError: { UnrecovData 10B8B BadCRC } [ 7.546588] ata1.00: failed command: READ FPDMA QUEUED [ 7.546648] ata1.00: cmd 60/00:00:e0:4b:61/01:00:02:00:00/40 tag 0 ncq 131072 in [ 7.546649] res 40/00:28:e0:4c:61/00:00:02:00:00/40 Emask 0x10 (ATA bus error) [ 7.546794] ata1.00: status: { DRDY } [ 7.546847] ata1.00: failed command: READ FPDMA QUEUED [ 7.546906] ata1.00: cmd 60/00:08:90:2f:48/01:00:02:00:00/40 tag 1 ncq 131072 in [ 7.546907] res 40/00:28:e0:4c:61/00:00:02:00:00/40 Emask 0x10 (ATA bus error) [ 7.547053] ata1.00: status: { DRDY } [ 7.547106] ata1.00: failed command: READ FPDMA QUEUED [ 7.547165] ata1.00: cmd 60/00:10:90:30:48/01:00:02:00:00/40 tag 2 ncq 131072 in [ 7.547166] res 40/00:28:e0:4c:61/00:00:02:00:00/40 Emask 0x10 (ATA bus error) [ 7.547310] ata1.00: status: { DRDY } [ 7.547363] ata1.00: failed command: READ FPDMA QUEUED [ 7.547422] ata1.00: cmd 60/00:18:50:c7:64/01:00:02:00:00/40 tag 3 ncq 131072 in [ 7.547423] res 40/00:28:e0:4c:61/00:00:02:00:00/40 Emask 0x10 (ATA bus error) [ 7.547568] ata1.00: status: { DRDY } [ 7.547621] ata1.00: failed command: READ FPDMA QUEUED [ 7.547681] ata1.00: cmd 60/00:28:e0:4c:61/01:00:02:00:00/40 tag 5 ncq 131072 in [ 7.547682] res 40/00:28:e0:4c:61/00:00:02:00:00/40 Emask 0x10 (ATA bus error) [ 7.547825] ata1.00: status: { DRDY } [ 7.547882] ata1: hard resetting link [ 7.864408] ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300) [ 7.886351] ata1.00: configured for UDMA/133 [ 7.886375] ata1: EH complete [ 7.890012] ata1: limiting SATA link speed to 3.0 Gbps [ 7.890016] ata1.00: exception Emask 0x10 SAct 0x7 SErr 0x280100 action 0x6 frozen [ 7.890093] ata1.00: irq_stat 0x08000000, interface fatal error [ 7.890152] ata1: SError: { UnrecovData 10B8B BadCRC } [ 7.890210] ata1.00: failed command: READ FPDMA QUEUED [ 7.890272] ata1.00: cmd 60/00:00:90:33:48/01:00:02:00:00/40 tag 0 ncq 131072 in [ 7.890273] res 40/00:10:e0:4f:61/00:00:02:00:00/40 Emask 0x10 (ATA bus error) [ 7.890418] ata1.00: status: { DRDY } [ 7.890472] ata1.00: failed command: READ FPDMA QUEUED [ 7.890530] ata1.00: cmd 60/00:08:90:34:48/01:00:02:00:00/40 tag 1 ncq 131072 in [ 7.890531] res 40/00:10:e0:4f:61/00:00:02:00:00/40 Emask 0x10 (ATA bus error) [ 7.890672] ata1.00: status: { DRDY } [ 7.890724] ata1.00: failed command: READ FPDMA QUEUED [ 7.890781] ata1.00: cmd 60/78:10:e0:4f:61/00:00:02:00:00/40 tag 2 ncq 61440 in [ 7.890782] res 40/00:10:e0:4f:61/00:00:02:00:00/40 Emask 0x10 (ATA bus error) [ 7.890925] ata1.00: status: { DRDY } [ 7.890981] ata1: hard resetting link [ 8.208021] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 320) [ 8.230100] ata1.00: configured for UDMA/133 [ 8.230124] ata1: EH complete Looks like the SATA interface tries to use 6Gbps link, then fails miserably and Linux fallbacks to 3Gbps. This is somewhat fine for me, as the system boots successfully each time and works under high load (cd linux-3.2.8; make -j16). I've also ran memtest86+ and it did not find any errors. What concerns me more is that Grub sometimes takes a long time to load the images and/or fails to load itself completely. The error is consistent and is probablistic: that is, each time I boot I have a certain chance to fail. Actually, I have a slight suspiction on the cause of the failure. Look at the cabling: What kind of engineer does it this way? Nah. Even 1Gbps Ethernet hardly tolerates cables bent over a small angle, and there you have 6Gbps SATA. How cound I determine and fix the cause of errors and/or switch the link to 3Gbps mode permanently?

    Read the article

  • Windows Server 2008 R2 network adapter stops working, requires hard reboot

    - by Geoff Dalgas
    TL;DR version: Turns out this was a Windows Server 2008 R2 kernel networking bug. After siccing Microsoft support on it, we (eventually) got an unpublished kernel hotfix from Microsoft to address it. If you, too, are experiencing mysterious low-level network driver failures requiring a reboot/bluescreen cycle, you might want that hotfix (or maybe Service Pack 1 whenever it is released, too.) We have been using HAProxy along with heartbeat from the Linux-HA project. We are using two linux instances to provide a failover. Each server has with their own public IP and a single IP which is shared between the two using a virtual interface (eth1:1) at IP: 69.59.196.211 The virtual interface (eth1:1) IP 69.59.196.211 is configured as the gateway for the windows servers behind them and we use ip_forwarding to route traffic. We are experiencing an occasional network outage on one of our windows servers behind our linux gateways. HAProxy will detect the server is offline which we can verify by remoting to the failed server and attempting to ping the gateway: Pinging 69.59.196.211 with 32 bytes of data: Reply from 69.59.196.220: Destination host unreachable. Running arp -a on this failed server shows that there is no entry for the gateway address (69.59.196.211): Interface: 69.59.196.220 --- 0xa Internet Address Physical Address Type 69.59.196.161 00-26-88-63-c7-80 dynamic 69.59.196.210 00-15-5d-0a-3e-0e dynamic 69.59.196.212 00-21-5e-4d-45-c9 dynamic 69.59.196.213 00-15-5d-00-b2-0d dynamic 69.59.196.215 00-21-5e-4d-61-1a dynamic 69.59.196.217 00-21-5e-4d-2c-e8 dynamic 69.59.196.219 00-21-5e-4d-38-e5 dynamic 69.59.196.221 00-15-5d-00-b2-0d dynamic 69.59.196.222 00-15-5d-0a-3e-09 dynamic 69.59.196.223 ff-ff-ff-ff-ff-ff static 224.0.0.22 01-00-5e-00-00-16 static 224.0.0.252 01-00-5e-00-00-fc static 225.0.0.1 01-00-5e-00-00-01 static On our linux gateway instances arp -a shows: peak-colo-196-220.peak.org (69.59.196.220) at <incomplete> on eth1 stackoverflow.com (69.59.196.212) at 00:21:5e:4d:45:c9 [ether] on eth1 peak-colo-196-215.peak.org (69.59.196.215) at 00:21:5e:4d:61:1a [ether] on eth1 peak-colo-196-219.peak.org (69.59.196.219) at 00:21:5e:4d:38:e5 [ether] on eth1 peak-colo-196-222.peak.org (69.59.196.222) at 00:15:5d:0a:3e:09 [ether] on eth1 peak-colo-196-209.peak.org (69.59.196.209) at 00:26:88:63:c7:80 [ether] on eth1 peak-colo-196-217.peak.org (69.59.196.217) at 00:21:5e:4d:2c:e8 [ether] on eth1 Why would arp occasionally set the entry for this failed server as <incomplete>? Should we be defining our arp entries statically? I've always left arp alone since it works 99% of the time, but in this one instance it appears to be failing. Are there any additional troubleshooting steps we can take help resolve this issue? THINGS WE HAVE TRIED I added a static arp entry for testing on one of the linux gateways which still didn't help. root@haproxy2:~# arp -a peak-colo-196-215.peak.org (69.59.196.215) at 00:21:5e:4d:61:1a [ether] on eth1 peak-colo-196-221.peak.org (69.59.196.221) at 00:15:5d:00:b2:0d [ether] on eth1 stackoverflow.com (69.59.196.212) at 00:21:5e:4d:45:c9 [ether] on eth1 peak-colo-196-219.peak.org (69.59.196.219) at 00:21:5e:4d:38:e5 [ether] on eth1 peak-colo-196-209.peak.org (69.59.196.209) at 00:26:88:63:c7:80 [ether] on eth1 peak-colo-196-217.peak.org (69.59.196.217) at 00:21:5e:4d:2c:e8 [ether] on eth1 peak-colo-196-220.peak.org (69.59.196.220) at 00:21:5e:4d:30:8d [ether] PERM on eth1 root@haproxy2:~# arp -i eth1 -s 69.59.196.220 00:21:5e:4d:30:8d root@haproxy2:~# ping 69.59.196.220 PING 69.59.196.220 (69.59.196.220) 56(84) bytes of data. --- 69.59.196.220 ping statistics --- 7 packets transmitted, 0 received, 100% packet loss, time 6006ms Rebooting the windows web server solves this issue temporarily with no other changes to the network but our experience shows this issue will come back. Swapping network cards and switches I noticed the link light on the port of the switch for the failed windows server was running at 100Mb instead of 1Gb on the failed interface. I moved the cable to several other open ports and the link indicated 100Mb for each port that I tried. I also swapped the cable with the same result. I tried changing the properties of the network card in windows and the server locked up and required a hard reset after clicking apply. This windows server has two physical network interfaces so I have swapped the cables and network settings on the two interfaces to see if the problem follows the interface. If the public interface goes down again we will know that it is not an issue with the network card. (We also tried another switch we have on hand, no change) Changing network hardware driver versions We've had the same problem with the latest Broadcom driver, as well as the built-in driver that ships in Windows Server 2008 R2. Replacing network cables As a last ditch effort we remembered another change that occurred was the replacement of all of the patch cords between our servers / switch. We had purchased two sets, one green of lengths 1ft - 3ft for the private interfaces and another set of red cables for the public interfaces. We swapped out all of the public interface patch cables with a different brand and ran our servers without issue for a full week ... aaaaaand then the problem recurred. Disable checksum offload, remove TProxy We also tried disabling TCP/IP checksum offload in the driver, no change. We're now pulling out TProxy and moving to a more traditional x-forwarded-for network arrangement without any fancy IP address rewriting. We'll see if that helps. Switch Virtualization providers On the off chance this was related to Hyper-V in some way (we do host Linux VMs on it), we switched to VMWare Server. No change. Switch host model We've reached the end of our troubleshooting rope and are now formally involving Microsoft support. They recommended changing the host model: http://en.wikipedia.org/wiki/Host_model http://technet.microsoft.com/en-us/magazine/2007.09.cableguy.aspx We did that, and.. we'll see.

    Read the article

  • Linux server is only using 60% of memory, then swapping

    - by Kamil Kisiel
    I've got a Linux server that's running our bacula backup system. The machine is grinding like mad because it's going heavy in to swap. The problem is, it's only using 60% of its physical memory! Here's the output from free -m: free -m total used free shared buffers cached Mem: 3949 2356 1593 0 0 1 -/+ buffers/cache: 2354 1595 Swap: 7629 1804 5824 and some sample output from vmstat 1: procs -----------memory---------- ---swap-- -----io---- -system-- -----cpu------ r b swpd free buff cache si so bi bo in cs us sy id wa st 0 2 1843536 1634512 0 4188 54 13 2524 666 2 1 1 1 89 9 0 1 11 1845916 1640724 0 388 2700 4816 221880 4879 14409 170721 4 3 63 30 0 0 9 1846096 1643952 0 0 4956 756 174832 804 12357 159306 3 4 63 30 0 0 11 1846104 1643532 0 0 4916 540 174320 580 10609 139960 3 4 64 29 0 0 4 1846084 1640272 0 2336 4080 524 140408 548 9331 118287 3 4 63 30 0 0 8 1846104 1642096 0 1488 2940 432 102516 457 7023 82230 2 4 65 29 0 0 5 1846104 1642268 0 1276 3704 452 126520 452 9494 119612 3 5 65 27 0 3 12 1846104 1641528 0 328 6092 608 187776 636 8269 113059 4 3 64 29 0 2 2 1846084 1640960 0 724 5948 0 111480 0 7751 116370 4 4 63 29 0 0 4 1846100 1641484 0 404 4144 1476 125760 1500 10668 105358 2 3 71 25 0 0 13 1846104 1641932 0 0 5872 828 153808 840 10518 128447 3 4 70 22 0 0 8 1846096 1639172 0 3164 3556 556 74884 580 5082 65362 2 2 73 23 0 1 4 1846080 1638676 0 396 4512 28 50928 44 2672 38277 2 2 80 16 0 0 3 1846080 1628808 0 7132 2636 0 28004 8 1358 14090 0 1 78 20 0 0 2 1844728 1618552 0 11140 7680 0 12740 8 763 2245 0 0 82 18 0 0 2 1837764 1532056 0 101504 2952 0 95644 24 802 3817 0 1 87 12 0 0 11 1842092 1633324 0 4416 1748 10900 143144 11024 6279 134442 3 3 70 24 0 2 6 1846104 1642756 0 0 4768 468 78752 468 4672 60141 2 2 76 20 0 1 12 1846104 1640792 0 236 4752 440 140712 464 7614 99593 3 5 58 34 0 0 3 1846084 1630368 0 6316 5104 0 20336 0 1703 22424 1 1 72 26 0 2 17 1846104 1638332 0 3168 4080 1720 211960 1744 11977 155886 3 4 65 28 0 1 10 1846104 1640800 0 132 4488 556 126016 584 8016 106368 3 4 63 29 0 0 14 1846104 1639740 0 2248 3436 428 114188 452 7030 92418 3 3 59 35 0 1 6 1846096 1639504 0 1932 5500 436 141412 460 8261 112210 4 4 63 29 0 0 10 1846104 1640164 0 3052 4028 448 147684 472 7366 109554 4 4 61 30 0 0 10 1846100 1641040 0 2332 4952 632 147452 664 8767 118384 3 4 63 30 0 4 8 1846084 1641092 0 664 4948 276 152264 292 6448 98813 5 5 62 28 0 Furthermore, the output of top sorted by CPU time seems to support the theory that swap is what's bogging down the system: top - 09:05:32 up 37 days, 23:24, 1 user, load average: 9.75, 8.24, 7.12 Tasks: 173 total, 1 running, 172 sleeping, 0 stopped, 0 zombie Cpu(s): 1.6%us, 1.4%sy, 0.0%ni, 76.1%id, 20.6%wa, 0.1%hi, 0.2%si, 0.0%st Mem: 4044632k total, 2405628k used, 1639004k free, 0k buffers Swap: 7812492k total, 1851852k used, 5960640k free, 436k cached PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ TIME COMMAND 4174 root 17 0 63156 176 56 S 8 0.0 2138:52 35,38 bacula-fd 4185 root 17 0 63352 284 104 S 6 0.0 1709:25 28,29 bacula-sd 240 root 15 0 0 0 0 D 3 0.0 831:55.19 831:55 kswapd0 2852 root 10 -5 0 0 0 S 1 0.0 126:35.59 126:35 xfsbufd 2849 root 10 -5 0 0 0 S 0 0.0 119:50.94 119:50 xfsbufd 1364 root 10 -5 0 0 0 S 0 0.0 117:05.39 117:05 xfsbufd 21 root 10 -5 0 0 0 S 1 0.0 48:03.44 48:03 events/3 6940 postgres 16 0 43596 8 8 S 0 0.0 46:50.35 46:50 postmaster 1342 root 10 -5 0 0 0 S 0 0.0 23:14.34 23:14 xfsdatad/4 5415 root 17 0 1770m 108 48 S 0 0.0 15:03.74 15:03 bacula-dir 23 root 10 -5 0 0 0 S 0 0.0 13:09.71 13:09 events/5 5604 root 17 0 1216m 500 200 S 0 0.0 12:38.20 12:38 java 5552 root 16 0 1194m 580 248 S 0 0.0 11:58.00 11:58 java Here's the same sorted by virtual memory image size: top - 09:08:32 up 37 days, 23:27, 1 user, load average: 8.43, 8.26, 7.32 Tasks: 173 total, 1 running, 172 sleeping, 0 stopped, 0 zombie Cpu(s): 3.6%us, 3.4%sy, 0.0%ni, 62.2%id, 30.2%wa, 0.2%hi, 0.3%si, 0.0%st Mem: 4044632k total, 2404212k used, 1640420k free, 0k buffers Swap: 7812492k total, 1852548k used, 5959944k free, 100k cached PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ TIME COMMAND 5415 root 17 0 1770m 56 44 S 0 0.0 15:03.78 15:03 bacula-dir 5604 root 17 0 1216m 492 200 S 0 0.0 12:38.30 12:38 java 5552 root 16 0 1194m 476 200 S 0 0.0 11:58.20 11:58 java 4598 root 16 0 117m 44 44 S 0 0.0 0:13.37 0:13 eventmond 9614 gdm 16 0 93188 0 0 S 0 0.0 0:00.30 0:00 gdmgreeter 5527 root 17 0 78716 0 0 S 0 0.0 0:00.30 0:00 gdm 4185 root 17 0 63352 284 104 S 20 0.0 1709:52 28,29 bacula-sd 4174 root 17 0 63156 208 88 S 24 0.0 2139:25 35,39 bacula-fd 10849 postgres 18 0 54740 216 108 D 0 0.0 0:31.40 0:31 postmaster 6661 postgres 17 0 49432 0 0 S 0 0.0 0:03.50 0:03 postmaster 5507 root 15 0 47980 0 0 S 0 0.0 0:00.00 0:00 gdm 6940 postgres 16 0 43596 16 16 S 0 0.0 46:51.39 46:51 postmaster 5304 postgres 16 0 40580 132 88 S 0 0.0 6:21.79 6:21 postmaster 5301 postgres 17 0 40448 24 24 S 0 0.0 0:32.17 0:32 postmaster 11280 root 16 0 40288 28 28 S 0 0.0 0:00.11 0:00 sshd 5534 root 17 0 37580 0 0 S 0 0.0 0:56.18 0:56 X 30870 root 30 15 31668 28 28 S 0 0.0 1:13.38 1:13 snmpd 5305 postgres 17 0 30628 16 16 S 0 0.0 0:11.60 0:11 postmaster 27403 postfix 17 0 30248 0 0 S 0 0.0 0:02.76 0:02 qmgr 10815 postfix 15 0 30208 16 16 S 0 0.0 0:00.02 0:00 pickup 5306 postgres 16 0 29760 20 20 S 0 0.0 0:52.89 0:52 postmaster 5302 postgres 17 0 29628 64 32 S 0 0.0 1:00.64 1:00 postmaster I've tried tuning the swappiness kernel parameter to both high and low values, but nothing appears to change the behavior here. I'm at a loss to figure out what's going on. How can I find out what's causing this? Update: The system is a fully 64-bit system, so there should be no question of memory limitations due to 32-bit issues. Update2: As I mentioned in the original question, I've already tried tuning swappiness to all sorts of values, including 0. The result is always the same, with approximately 1.6 GB of memory remaining unused. Update3: Added top output to the above info.

    Read the article

  • Why is Java EE 6 better than Spring ?

    - by arungupta
    Java EE 6 was released over 2 years ago and now there are 14 compliant application servers. In all my talks around the world, a question that is frequently asked is Why should I use Java EE 6 instead of Spring ? There are already several blogs covering that topic: Java EE wins over Spring by Bill Burke Why will I use Java EE instead of Spring in new Enterprise Java projects in 2012 ? by Kai Waehner (more discussion on TSS) Spring to Java EE migration (Part 1 and 2, 3 and 4 coming as well) by David Heffelfinger Spring to Java EE - A Migration Experience by Lincoln Baxter Migrating Spring to Java EE 6 by Bert Ertman and Paul Bakker at NLJUG Moving from Spring to Java EE 6 - The Age of Frameworks is Over at TSS Java EE vs Spring Shootout by Rohit Kelapure and Reza Rehman at JavaOne 2011 Java EE 6 and the Ewoks by Murat Yener Definite excuse to avoid Spring forever - Bert Ertman and Arun Gupta I will try to share my perspective in this blog. First of all, I'd like to start with a note: Thank you Spring framework for filling the interim gap and providing functionality that is now included in the mainstream Java EE 6 application servers. The Java EE platform has evolved over the years learning from frameworks like Spring and provides all the functionality to build an enterprise application. Thank you very much Spring framework! While Spring was revolutionary in its time and is still very popular and quite main stream in the same way Struts was circa 2003, it really is last generation's framework - some people are even calling it legacy. However my theory is "code is king". So my approach is to build/take a simple Hello World CRUD application in Java EE 6 and Spring and compare the deployable artifacts. I started looking at the official tutorial Developing a Spring Framework MVC Application Step-by-Step but it is using the older version 2.5. I wasn't able to find any updated version in the current 3.1 release. Next, I downloaded Spring Tool Suite and thought that would provide some template samples to get started. A least a quick search did not show any handy tutorials - either video or text-based. So I searched and found a link to their SVN repository at src.springframework.org/svn/spring-samples/. I tried the "mvc-basic" sample and the generated WAR file was 4.43 MB. While it was named a "basic" sample it seemed to come with 19 different libraries bundled but it was what I could find: ./WEB-INF/lib/aopalliance-1.0.jar./WEB-INF/lib/hibernate-validator-4.1.0.Final.jar./WEB-INF/lib/jcl-over-slf4j-1.6.1.jar./WEB-INF/lib/joda-time-1.6.2.jar./WEB-INF/lib/joda-time-jsptags-1.0.2.jar./WEB-INF/lib/jstl-1.2.jar./WEB-INF/lib/log4j-1.2.16.jar./WEB-INF/lib/slf4j-api-1.6.1.jar./WEB-INF/lib/slf4j-log4j12-1.6.1.jar./WEB-INF/lib/spring-aop-3.0.5.RELEASE.jar./WEB-INF/lib/spring-asm-3.0.5.RELEASE.jar./WEB-INF/lib/spring-beans-3.0.5.RELEASE.jar./WEB-INF/lib/spring-context-3.0.5.RELEASE.jar./WEB-INF/lib/spring-context-support-3.0.5.RELEASE.jar./WEB-INF/lib/spring-core-3.0.5.RELEASE.jar./WEB-INF/lib/spring-expression-3.0.5.RELEASE.jar./WEB-INF/lib/spring-web-3.0.5.RELEASE.jar./WEB-INF/lib/spring-webmvc-3.0.5.RELEASE.jar./WEB-INF/lib/validation-api-1.0.0.GA.jar And it is not even using any database! The app deployed fine on GlassFish 3.1.2 but the "@Controller Example" link did not work as it was missing the context root. With a bit of tweaking I could deploy the application and assume that the account got created because no error was displayed in the browser or server log. Next I generated the WAR for "mvc-ajax" and the 5.1 MB WAR had 20 JARs (1 removed, 2 added): ./WEB-INF/lib/aopalliance-1.0.jar./WEB-INF/lib/hibernate-validator-4.1.0.Final.jar./WEB-INF/lib/jackson-core-asl-1.6.4.jar./WEB-INF/lib/jackson-mapper-asl-1.6.4.jar./WEB-INF/lib/jcl-over-slf4j-1.6.1.jar./WEB-INF/lib/joda-time-1.6.2.jar./WEB-INF/lib/jstl-1.2.jar./WEB-INF/lib/log4j-1.2.16.jar./WEB-INF/lib/slf4j-api-1.6.1.jar./WEB-INF/lib/slf4j-log4j12-1.6.1.jar./WEB-INF/lib/spring-aop-3.0.5.RELEASE.jar./WEB-INF/lib/spring-asm-3.0.5.RELEASE.jar./WEB-INF/lib/spring-beans-3.0.5.RELEASE.jar./WEB-INF/lib/spring-context-3.0.5.RELEASE.jar./WEB-INF/lib/spring-context-support-3.0.5.RELEASE.jar./WEB-INF/lib/spring-core-3.0.5.RELEASE.jar./WEB-INF/lib/spring-expression-3.0.5.RELEASE.jar./WEB-INF/lib/spring-web-3.0.5.RELEASE.jar./WEB-INF/lib/spring-webmvc-3.0.5.RELEASE.jar./WEB-INF/lib/validation-api-1.0.0.GA.jar 2 more JARs for just doing Ajax. Anyway, deploying this application gave the following error: Caused by: java.lang.NoSuchMethodError: org.codehaus.jackson.map.SerializationConfig.<init>(Lorg/codehaus/jackson/map/ClassIntrospector;Lorg/codehaus/jackson/map/AnnotationIntrospector;Lorg/codehaus/jackson/map/introspect/VisibilityChecker;Lorg/codehaus/jackson/map/jsontype/SubtypeResolver;)V    at org.springframework.samples.mvc.ajax.json.ConversionServiceAwareObjectMapper.<init>(ConversionServiceAwareObjectMapper.java:20)    at org.springframework.samples.mvc.ajax.json.JacksonConversionServiceConfigurer.postProcessAfterInitialization(JacksonConversionServiceConfigurer.java:40)    at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.applyBeanPostProcessorsAfterInitialization(AbstractAutowireCapableBeanFactory.java:407) Seems like some incorrect repos in the "pom.xml". Next one is "mvc-showcase" and the 6.49 MB WAR now has 28 JARs as shown below: ./WEB-INF/lib/aopalliance-1.0.jar./WEB-INF/lib/aspectjrt-1.6.10.jar./WEB-INF/lib/commons-fileupload-1.2.2.jar./WEB-INF/lib/commons-io-2.0.1.jar./WEB-INF/lib/el-api-2.2.jar./WEB-INF/lib/hibernate-validator-4.1.0.Final.jar./WEB-INF/lib/jackson-core-asl-1.8.1.jar./WEB-INF/lib/jackson-mapper-asl-1.8.1.jar./WEB-INF/lib/javax.inject-1.jar./WEB-INF/lib/jcl-over-slf4j-1.6.1.jar./WEB-INF/lib/jdom-1.0.jar./WEB-INF/lib/joda-time-1.6.2.jar./WEB-INF/lib/jstl-api-1.2.jar./WEB-INF/lib/jstl-impl-1.2.jar./WEB-INF/lib/log4j-1.2.16.jar./WEB-INF/lib/rome-1.0.0.jar./WEB-INF/lib/slf4j-api-1.6.1.jar./WEB-INF/lib/slf4j-log4j12-1.6.1.jar./WEB-INF/lib/spring-aop-3.1.0.RELEASE.jar./WEB-INF/lib/spring-asm-3.1.0.RELEASE.jar./WEB-INF/lib/spring-beans-3.1.0.RELEASE.jar./WEB-INF/lib/spring-context-3.1.0.RELEASE.jar./WEB-INF/lib/spring-context-support-3.1.0.RELEASE.jar./WEB-INF/lib/spring-core-3.1.0.RELEASE.jar./WEB-INF/lib/spring-expression-3.1.0.RELEASE.jar./WEB-INF/lib/spring-web-3.1.0.RELEASE.jar./WEB-INF/lib/spring-webmvc-3.1.0.RELEASE.jar./WEB-INF/lib/validation-api-1.0.0.GA.jar The app at least deployed and showed results this time. But still no database! Next I tried building "jpetstore" and got the error: [ERROR] Failed to execute goal on project org.springframework.samples.jpetstore:Could not resolve dependencies for project org.springframework.samples:org.springframework.samples.jpetstore:war:1.0.0-SNAPSHOT: Failed to collect dependencies for [commons-fileupload:commons-fileupload:jar:1.2.1 (compile), org.apache.struts:com.springsource.org.apache.struts:jar:1.2.9 (compile), javax.xml.rpc:com.springsource.javax.xml.rpc:jar:1.1.0 (compile), org.apache.commons:com.springsource.org.apache.commons.dbcp:jar:1.2.2.osgi (compile), commons-io:commons-io:jar:1.3.2 (compile), hsqldb:hsqldb:jar:1.8.0.7 (compile), org.apache.tiles:tiles-core:jar:2.2.0 (compile), org.apache.tiles:tiles-jsp:jar:2.2.0 (compile), org.tuckey:urlrewritefilter:jar:3.1.0 (compile), org.springframework:spring-webmvc:jar:3.0.0.BUILD-SNAPSHOT (compile), org.springframework:spring-orm:jar:3.0.0.BUILD-SNAPSHOT (compile), org.springframework:spring-context-support:jar:3.0.0.BUILD-SNAPSHOT (compile), org.springframework.webflow:spring-js:jar:2.0.7.RELEASE (compile), org.apache.ibatis:com.springsource.com.ibatis:jar:2.3.4.726 (runtime), com.caucho:com.springsource.com.caucho:jar:3.2.1 (compile), org.apache.axis:com.springsource.org.apache.axis:jar:1.4.0 (compile), javax.wsdl:com.springsource.javax.wsdl:jar:1.6.1 (compile), javax.servlet:jstl:jar:1.2 (runtime), org.aspectj:aspectjweaver:jar:1.6.5 (compile), javax.servlet:servlet-api:jar:2.5 (provided), javax.servlet.jsp:jsp-api:jar:2.1 (provided), junit:junit:jar:4.6 (test)]: Failed to read artifact descriptor for org.springframework:spring-webmvc:jar:3.0.0.BUILD-SNAPSHOT: Could not transfer artifact org.springframework:spring-webmvc:pom:3.0.0.BUILD-SNAPSHOT from/to JBoss repository (http://repository.jboss.com/maven2): Access denied to: http://repository.jboss.com/maven2/org/springframework/spring-webmvc/3.0.0.BUILD-SNAPSHOT/spring-webmvc-3.0.0.BUILD-SNAPSHOT.pom It appears the sample is broken - maybe I was pulling from the wrong repository - would be great if someone were to point me at a good target to use here. With a 50% hit on samples in this repository, I started searching through numerous blogs, most of which have either outdated information (using XML-heavy Spring 2.5), some piece of configuration (which is a typical "feature" of Spring) is missing, or too much complexity in the sample. I finally found this blog that worked like a charm. This blog creates a trivial Spring MVC 3 application using Hibernate and MySQL. This application performs CRUD operations on a single table in a database using typical Spring technologies.  I downloaded the sample code from the blog, deployed it on GlassFish 3.1.2 and could CRUD the "person" entity. The source code for this application can be downloaded here. More details on the application statistics below. And then I built a similar CRUD application in Java EE 6 using NetBeans wizards in a couple of minutes. The source code for the application can be downloaded here and the WAR here. The Spring Source Tool Suite may also offer similar wizard-driven capabilities but this blog focus primarily on comparing the runtimes. The lack of STS tutorials was slightly disappointing as well. NetBeans however has tons of text-based and video tutorials and tons of material even by the community. One more bit on the download size of tools bundle ... NetBeans 7.1.1 "All" is 211 MB (which includes GlassFish and Tomcat) Spring Tool Suite  2.9.0 is 347 MB (~ 65% bigger) This blog is not about the tooling comparison so back to the Java EE 6 version of the application .... In order to run the Java EE version on GlassFish, copy the MySQL Connector/J to glassfish3/glassfish/domains/domain1/lib/ext directory and create a JDBC connection pool and JDBC resource as: ./bin/asadmin create-jdbc-connection-pool --datasourceclassname \\ com.mysql.jdbc.jdbc2.optional.MysqlDataSource --restype \\ javax.sql.DataSource --property \\ portNumber=3306:user=mysql:password=mysql:databaseName=mydatabase \\ myConnectionPool ./bin/asadmin create-jdbc-resource --connectionpoolid myConnectionPool jdbc/myDataSource I generated WARs for the two projects and the table below highlights some differences between them: Java EE 6 Spring WAR File Size 0.021030 MB 10.87 MB (~516x) Number of files 20 53 (> 2.5x) Bundled libraries 0 36 Total size of libraries 0 12.1 MB XML files 3 5 LoC in XML files 50 (11 + 15 + 24) 129 (27 + 46 + 16 + 11 + 19) (~ 2.5x) Total .properties files 1 Bundle.properties 2 spring.properties, log4j.properties Cold Deploy 5,339 ms 11,724 ms Second Deploy 481 ms 6,261 ms Third Deploy 528 ms 5,484 ms Fourth Deploy 484 ms 5,576 ms Runtime memory ~73 MB ~101 MB Some points worth highlighting from the table ... 516x WAR file, 10x deployment time - With 12.1 MB of libraries (for a very basic application) bundled in your application, the WAR file size and the deployment time will naturally go higher. The WAR file for Spring-based application is 516x bigger and the deployment time is double during the first deployment and ~ 10x during subsequent deployments. The Java EE 6 application is fully portable and will run on any Java EE 6 compliant application server. 36 libraries in the WAR - There are 14 Java EE 6 compliant application servers today. Each of those servers provide all the functionality like transactions, dependency injection, security, persistence, etc typically required of an enterprise or web application. There is no need to bundle 36 libraries worth 12.1 MB for a trivial CRUD application. These 14 compliant application servers provide all the functionality baked in. Now you can also deploy these libraries in the container but then you don't get the "portability" offered by Spring in that case. Does your typical Spring deployment actually do that ? 3x LoC in XML - The number of XML files is about 1.6x and the LoC is ~ 2.5x. So much XML seems circa 2003 when the Java language had no annotations. The XML files can be further reduced, e.g. faces-config.xml can be replaced without providing i18n, but I just want to compare stock applications. Memory usage - Both the applications were deployed on default GlassFish 3.1.2 installation and any additional memory consumed as part of deployment/access was attributed to the application. This is by no means scientific but at least provides an initial ballpark. This area definitely needs more investigation. Another table that compares typical Java EE 6 compliant application servers and the custom-stack created for a Spring application ... Java EE 6 Spring Web Container ? 53 MB (tcServer 2.6.3 Developer Edition) Security ? 12 MB (Spring Security 3.1.0) Persistence ? 6.3 MB (Hibernate 4.1.0, required) Dependency Injection ? 5.3 MB (Framework) Web Services ? 796 KB (Spring WS 2.0.4) Messaging ? 3.4 MB (RabbitMQ Server 2.7.1) 936 KB (Java client 936) OSGi ? 1.3 MB (Spring OSGi 1.2.1) GlassFish and WebLogic (starting at 33 MB) 83.3 MB There are differentiating factors on both the stacks. But most of the functionality like security, persistence, and dependency injection is baked in a Java EE 6 compliant application server but needs to be individually managed and patched for a Spring application. This very quickly leads to a "stack explosion". The Java EE 6 servers are tested extensively on a variety of platforms in different combinations whereas a Spring application developer is responsible for testing with different JDKs, Operating Systems, Versions, Patches, etc. Oracle has both the leading OSS lightweight server with GlassFish and the leading enterprise Java server with WebLogic Server, both Java EE 6 and both with lightweight deployment options. The Web Container offered as part of a Java EE 6 application server not only deploys your enterprise Java applications but also provide operational management, diagnostics, and mission-critical capabilities required by your applications. The Java EE 6 platform also introduced the Web Profile which is a subset of the specifications from the entire platform. It is targeted at developers of modern web applications offering a reasonably complete stack, composed of standard APIs, and is capable out-of-the-box of addressing the needs of a large class of Web applications. As your applications grow, the stack can grow to the full Java EE 6 platform. The GlassFish Server Web Profile starting at 33MB (smaller than just the non-standard tcServer) provides most of the functionality typically required by a web application. WebLogic provides battle-tested functionality for a high throughput, low latency, and enterprise grade web application. No individual managing or patching, all tested and commercially supported for you! Note that VMWare does have a server, tcServer, but it is non-standard and not even certified to the level of the standard Web Profile most customers expect these days. Customers who choose this risk proprietary lock-in since VMWare does not seem to want to formally certify with either Java EE 6 Enterprise Platform or with Java EE 6 Web Profile but of course it would be great if they were to join the community and help their customers reduce the risk of deploying on VMWare software. Some more points to help you decide choose between Java EE 6 and Spring ... Freedom to choose container - There are 14 Java EE 6 compliant application servers today, with a variety of open source and commercial offerings. A Java EE 6 application can be deployed on any of those containers. So if you deployed your application on GlassFish today and would like to scale up with your demands then you can deploy the same application to WebLogic. And because of the portability of a Java EE 6 application, you can even take it a different vendor altogether. Spring requires a runtime which could be any of these app servers as well. But why use Spring when all the required functionality is already baked into the application server itself ? Spring also has a different definition of portability where they claim to bundle all the libraries in the WAR file and move to any application server. But we saw earlier how bloated that archive could be. The equivalent features in Spring runtime offerings (mainly tcServer) are not all open source, not as mature, and often require manual assembly.  Vendor choice - The Java EE 6 platform is created using the Java Community Process where all the big players like Oracle, IBM, RedHat, and Apache are conritbuting to make the platform successful. Each application server provides the basic Java EE 6 platform compliance and has its own competitive offerings. This allows you to choose an application server for deploying your Java EE 6 applications. If you are not happy with the support or feature of one vendor then you can move your application to a different vendor because of the portability promise offered by the platform. Spring is a set of products from a single company, one price book, one support organization, one sustaining organization, one sales organization, etc. If any of those cause a customer headache, where do you go ? Java EE, backed by multiple vendors, is a safer bet for those that are risk averse. Production support - With Spring, typically you need to get support from two vendors - VMWare and the container provider. With Java EE 6, all of this is typically provided by one vendor. For example, Oracle offers commercial support from systems, operating systems, JDK, application server, and applications on top of them. VMWare certainly offers complete production support but do you really want to put all your eggs in one basket ? Do you really use tcServer ? ;-) Maintainability - With Spring, you are likely building your own distribution with multiple JAR files, integrating, patching, versioning, etc of all those components. Spring's claim is that multiple JAR files allow you to go à la carte and pick the latest versions of different components. But who is responsible for testing whether all these versions work together ? Yep, you got it, its YOU! If something does not work, who patches and maintains the JARs ? Of course, you! Commercial support for such a configuration ? On your own! The Java EE application servers manage all of this for you and provide a well-tested and commercially supported bundle. While it is always good to realize that there is something new and improved that updates and replaces older frameworks like Spring, the good news is not only does a Java EE 6 container offer what is described here, most also will let you deploy and run your Spring applications on them while you go through an upgrade to a more modern architecture. End result, you get the best of both worlds - keeping your legacy investment but moving to a more agile, lightweight world of Java EE 6. A message to the Spring lovers ... The complexity in J2EE 1.2, 1.3, and 1.4 led to the genesis of Spring but that was in 2004. This is 2012 and the name has changed to "Java EE 6" :-) There are tons of improvements in the Java EE platform to make it easy-to-use and powerful. Some examples: Adding @Stateless on a POJO makes it an EJB EJBs can be packaged in a WAR with no special packaging or deployment descriptors "web.xml" and "faces-config.xml" are optional in most of the common cases Typesafe dependency injection is now part of the Java EE platform Add @Path on a POJO allows you to publish it as a RESTful resource EJBs can be used as backing beans for Facelets-driven JSF pages providing full MVC Java EE 6 WARs are known to be kilobytes in size and deployed in milliseconds Tons of other simplifications in the platform and application servers So if you moved away from J2EE to Spring many years ago and have not looked at Java EE 6 (which has been out since Dec 2009) then you should definitely try it out. Just be at least aware of what other alternatives are available instead of restricting yourself to one stack. Here are some workshops and screencasts worth trying: screencast #37 shows how to build an end-to-end application using NetBeans screencast #36 builds the same application using Eclipse javaee-lab-feb2012.pdf is a 3-4 hours self-paced hands-on workshop that guides you to build a comprehensive Java EE 6 application using NetBeans Each city generally has a "spring cleanup" program every year. It allows you to clean up the mess from your house. For your software projects, you don't need to wait for an annual event, just get started and reduce the technical debt now! Move away from your legacy Spring-based applications to a lighter and more modern approach of building enterprise Java applications using Java EE 6. Watch this beautiful presentation that explains how to migrate from Spring -> Java EE 6: List of files in the Java EE 6 project: ./index.xhtml./META-INF./person./person/Create.xhtml./person/Edit.xhtml./person/List.xhtml./person/View.xhtml./resources./resources/css./resources/css/jsfcrud.css./template.xhtml./WEB-INF./WEB-INF/classes./WEB-INF/classes/Bundle.properties./WEB-INF/classes/META-INF./WEB-INF/classes/META-INF/persistence.xml./WEB-INF/classes/org./WEB-INF/classes/org/javaee./WEB-INF/classes/org/javaee/javaeemysql./WEB-INF/classes/org/javaee/javaeemysql/AbstractFacade.class./WEB-INF/classes/org/javaee/javaeemysql/Person.class./WEB-INF/classes/org/javaee/javaeemysql/Person_.class./WEB-INF/classes/org/javaee/javaeemysql/PersonController$1.class./WEB-INF/classes/org/javaee/javaeemysql/PersonController$PersonControllerConverter.class./WEB-INF/classes/org/javaee/javaeemysql/PersonController.class./WEB-INF/classes/org/javaee/javaeemysql/PersonFacade.class./WEB-INF/classes/org/javaee/javaeemysql/util./WEB-INF/classes/org/javaee/javaeemysql/util/JsfUtil.class./WEB-INF/classes/org/javaee/javaeemysql/util/PaginationHelper.class./WEB-INF/faces-config.xml./WEB-INF/web.xml List of files in the Spring 3.x project: ./META-INF ./META-INF/MANIFEST.MF./WEB-INF./WEB-INF/applicationContext.xml./WEB-INF/classes./WEB-INF/classes/log4j.properties./WEB-INF/classes/org./WEB-INF/classes/org/krams ./WEB-INF/classes/org/krams/tutorial ./WEB-INF/classes/org/krams/tutorial/controller ./WEB-INF/classes/org/krams/tutorial/controller/MainController.class ./WEB-INF/classes/org/krams/tutorial/domain ./WEB-INF/classes/org/krams/tutorial/domain/Person.class ./WEB-INF/classes/org/krams/tutorial/service ./WEB-INF/classes/org/krams/tutorial/service/PersonService.class ./WEB-INF/hibernate-context.xml ./WEB-INF/hibernate.cfg.xml ./WEB-INF/jsp ./WEB-INF/jsp/addedpage.jsp ./WEB-INF/jsp/addpage.jsp ./WEB-INF/jsp/deletedpage.jsp ./WEB-INF/jsp/editedpage.jsp ./WEB-INF/jsp/editpage.jsp ./WEB-INF/jsp/personspage.jsp ./WEB-INF/lib ./WEB-INF/lib/antlr-2.7.6.jar ./WEB-INF/lib/aopalliance-1.0.jar ./WEB-INF/lib/c3p0-0.9.1.2.jar ./WEB-INF/lib/cglib-nodep-2.2.jar ./WEB-INF/lib/commons-beanutils-1.8.3.jar ./WEB-INF/lib/commons-collections-3.2.1.jar ./WEB-INF/lib/commons-digester-2.1.jar ./WEB-INF/lib/commons-logging-1.1.1.jar ./WEB-INF/lib/dom4j-1.6.1.jar ./WEB-INF/lib/ejb3-persistence-1.0.2.GA.jar ./WEB-INF/lib/hibernate-annotations-3.4.0.GA.jar ./WEB-INF/lib/hibernate-commons-annotations-3.1.0.GA.jar ./WEB-INF/lib/hibernate-core-3.3.2.GA.jar ./WEB-INF/lib/javassist-3.7.ga.jar ./WEB-INF/lib/jstl-1.1.2.jar ./WEB-INF/lib/jta-1.1.jar ./WEB-INF/lib/junit-4.8.1.jar ./WEB-INF/lib/log4j-1.2.14.jar ./WEB-INF/lib/mysql-connector-java-5.1.14.jar ./WEB-INF/lib/persistence-api-1.0.jar ./WEB-INF/lib/slf4j-api-1.6.1.jar ./WEB-INF/lib/slf4j-log4j12-1.6.1.jar ./WEB-INF/lib/spring-aop-3.0.5.RELEASE.jar ./WEB-INF/lib/spring-asm-3.0.5.RELEASE.jar ./WEB-INF/lib/spring-beans-3.0.5.RELEASE.jar ./WEB-INF/lib/spring-context-3.0.5.RELEASE.jar ./WEB-INF/lib/spring-context-support-3.0.5.RELEASE.jar ./WEB-INF/lib/spring-core-3.0.5.RELEASE.jar ./WEB-INF/lib/spring-expression-3.0.5.RELEASE.jar ./WEB-INF/lib/spring-jdbc-3.0.5.RELEASE.jar ./WEB-INF/lib/spring-orm-3.0.5.RELEASE.jar ./WEB-INF/lib/spring-tx-3.0.5.RELEASE.jar ./WEB-INF/lib/spring-web-3.0.5.RELEASE.jar ./WEB-INF/lib/spring-webmvc-3.0.5.RELEASE.jar ./WEB-INF/lib/standard-1.1.2.jar ./WEB-INF/lib/xml-apis-1.0.b2.jar ./WEB-INF/spring-servlet.xml ./WEB-INF/spring.properties ./WEB-INF/web.xml So, are you excited about Java EE 6 ? Want to get started now ? Here are some resources: Java EE 6 SDK (including runtime, samples, tutorials etc) GlassFish Server Open Source Edition 3.1.2 (Community) Oracle GlassFish Server 3.1.2 (Commercial) Java EE 6 using WebLogic 12c and NetBeans (Video) Java EE 6 with NetBeans and GlassFish (Video) Java EE with Eclipse and GlassFish (Video)

    Read the article

  • NoSQL with RavenDB and ASP.NET MVC - Part 1

    - by shiju
     A while back, I have blogged NoSQL with MongoDB, NoRM and ASP.NET MVC Part 1 and Part 2 on how to use MongoDB with an ASP.NET MVC application. The NoSQL movement is getting big attention and RavenDB is the latest addition to the NoSQL and document database world. RavenDB is an Open Source (with a commercial option) document database for the .NET/Windows platform developed  by Ayende Rahien.  Raven stores schema-less JSON documents, allow you to define indexes using Linq queries and focus on low latency and high performance. RavenDB is .NET focused document database which comes with a fully functional .NET client API  and supports LINQ. RavenDB comes with two components, a server and a client API. RavenDB is a REST based system, so you can write your own HTTP cleint API. As a .NET developer, RavenDB is becoming my favorite document database. Unlike other document databases, RavenDB is supports transactions using System.Transactions. Also it's supports both embedded and server mode of database. You can access RavenDB site at http://ravendb.netA demo App with ASP.NET MVCLet's create a simple demo app with RavenDB and ASP.NET MVC. To work with RavenDB, do the following steps. Go to http://ravendb.net/download and download the latest build.Unzip the downloaded file.Go to the /Server directory and run the RavenDB.exe. This will start the RavenDB server listening on localhost:8080You can change the port of RavenDB  by modifying the "Raven/Port" appSetting value in the RavenDB.exe.config file.When running the RavenDB, it will automatically create a database in the /Data directory. You can change the directory name data by modifying "Raven/DataDirt" appSetting value in the RavenDB.exe.config file.RavenDB provides a browser based admin tool. When the Raven server is running, You can be access the browser based admin tool and view and edit documents and index using your browser admin tool. The web admin tool available at http://localhost:8080The below is the some screen shots of web admin tool     Working with ASP.NET MVC  To working with RavenDB in our demo ASP.NET MVC application, do the following steps Step 1 - Add reference to Raven Cleint API In our ASP.NET MVC application, Add a reference to the Raven.Client.Lightweight.dll from the Client directory. Step 2 - Create DocumentStoreThe document store would be created once per application. Let's create a DocumentStore on application start-up in the Global.asax.cs. documentStore = new DocumentStore { Url = "http://localhost:8080/" }; documentStore.Initialise(); The above code will create a Raven DB document store and will be listening the server locahost at port 8080    Step 3 - Create DocumentSession on BeginRequest   Let's create a DocumentSession on BeginRequest event in the Global.asax.cs. We are using the document session for every unit of work. In our demo app, every HTTP request would be a single Unit of Work (UoW). BeginRequest += (sender, args) =>   HttpContext.Current.Items[RavenSessionKey] = documentStore.OpenSession(); Step 4 - Destroy the DocumentSession on EndRequest  EndRequest += (o, eventArgs) => {     var disposable = HttpContext.Current.Items[RavenSessionKey] as IDisposable;     if (disposable != null)         disposable.Dispose(); };  At the end of HTTP request, we are destroying the DocumentSession  object.The below  code block shown all the code in the Global.asax.cs  private const string RavenSessionKey = "RavenMVC.Session"; private static DocumentStore documentStore;   protected void Application_Start() { //Create a DocumentStore in Application_Start //DocumentStore should be created once per application and stored as a singleton. documentStore = new DocumentStore { Url = "http://localhost:8080/" }; documentStore.Initialise(); AreaRegistration.RegisterAllAreas(); RegisterRoutes(RouteTable.Routes); //DI using Unity 2.0 ConfigureUnity(); }   public MvcApplication() { //Create a DocumentSession on BeginRequest   //create a document session for every unit of work BeginRequest += (sender, args) =>     HttpContext.Current.Items[RavenSessionKey] = documentStore.OpenSession(); //Destroy the DocumentSession on EndRequest EndRequest += (o, eventArgs) => { var disposable = HttpContext.Current.Items[RavenSessionKey] as IDisposable; if (disposable != null) disposable.Dispose(); }; }   //Getting the current DocumentSession public static IDocumentSession CurrentSession {   get { return (IDocumentSession)HttpContext.Current.Items[RavenSessionKey]; } }  We have setup all necessary code in the Global.asax.cs for working with RavenDB. For our demo app, Let’s write a domain class  public class Category {       public string Id { get; set; }       [Required(ErrorMessage = "Name Required")]     [StringLength(25, ErrorMessage = "Must be less than 25 characters")]     public string Name { get; set;}     public string Description { get; set; }   } We have created simple domain entity Category. Let's create repository class for performing CRUD operations against our domain entity Category.  public interface ICategoryRepository {     Category Load(string id);     IEnumerable<Category> GetCategories();     void Save(Category category);     void Delete(string id);       }    public class CategoryRepository : ICategoryRepository {     private IDocumentSession session;     public CategoryRepository()     {             session = MvcApplication.CurrentSession;     }     //Load category based on Id     public Category Load(string id)     {         return session.Load<Category>(id);     }     //Get all categories     public IEnumerable<Category> GetCategories()     {         var categories= session.LuceneQuery<Category>()                 .WaitForNonStaleResults()             .ToArray();         return categories;       }     //Insert/Update category     public void Save(Category category)     {         if (string.IsNullOrEmpty(category.Id))         {             //insert new record             session.Store(category);         }         else         {             //edit record             var categoryToEdit = Load(category.Id);             categoryToEdit.Name = category.Name;             categoryToEdit.Description = category.Description;         }         //save the document session         session.SaveChanges();     }     //delete a category     public void Delete(string id)     {         var category = Load(id);         session.Delete<Category>(category);         session.SaveChanges();     }        } For every CRUD operations, we are taking the current document session object from HttpContext object. session = MvcApplication.CurrentSession; We are calling the static method CurrentSession from the Global.asax.cs public static IDocumentSession CurrentSession {     get { return (IDocumentSession)HttpContext.Current.Items[RavenSessionKey]; } }  Retrieve Entities  The Load method get the single Category object based on the Id. RavenDB is working based on the REST principles and the Id would be like categories/1. The Id would be created by automatically when a new object is inserted to the document store. The REST uri categories/1 represents a single category object with Id representation of 1.   public Category Load(string id) {    return session.Load<Category>(id); } The GetCategories method returns all the categories calling the session.LuceneQuery method. RavenDB is using a lucen query syntax for querying. I will explain more details about querying and indexing in my future posts.   public IEnumerable<Category> GetCategories() {     var categories= session.LuceneQuery<Category>()             .WaitForNonStaleResults()         .ToArray();     return categories;   } Insert/Update entityFor insert/Update a Category entity, we have created Save method in repository class. If  the Id property of Category is null, we call Store method of Documentsession for insert a new record. For editing a existing record, we load the Category object and assign the values to the loaded Category object. The session.SaveChanges() will save the changes to document store.  //Insert/Update category public void Save(Category category) {     if (string.IsNullOrEmpty(category.Id))     {         //insert new record         session.Store(category);     }     else     {         //edit record         var categoryToEdit = Load(category.Id);         categoryToEdit.Name = category.Name;         categoryToEdit.Description = category.Description;     }     //save the document session     session.SaveChanges(); }  Delete Entity  In the Delete method, we call the document session's delete method and call the SaveChanges method to reflect changes in the document store.  public void Delete(string id) {     var category = Load(id);     session.Delete<Category>(category);     session.SaveChanges(); }  Let’s create ASP.NET MVC controller and controller actions for handling CRUD operations for the domain class Category  public class CategoryController : Controller { private ICategoryRepository categoyRepository; //DI enabled constructor public CategoryController(ICategoryRepository categoyRepository) {     this.categoyRepository = categoyRepository; } public ActionResult Index() {         var categories = categoyRepository.GetCategories();     if (categories == null)         return RedirectToAction("Create");     return View(categories); }   [HttpGet] public ActionResult Edit(string id) {     var category = categoyRepository.Load(id);         return View("Save",category); } // GET: /Category/Create [HttpGet] public ActionResult Create() {     var category = new Category();     return View("Save", category); } [HttpPost] public ActionResult Save(Category category) {     if (!ModelState.IsValid)     {         return View("Save", category);     }           categoyRepository.Save(category);         return RedirectToAction("Index");     }        [HttpPost] public ActionResult Delete(string id) {     categoyRepository.Delete(id);     var categories = categoyRepository.GetCategories();     return PartialView("CategoryList", categories);      }        }  RavenDB is an awesome document database and I hope that it will be the winner in .NET space of document database world.  The source code of demo application available at http://ravenmvc.codeplex.com/

    Read the article

  • WCF WS-Security and WSE Nonce Authentication

    - by Rick Strahl
    WCF makes it fairly easy to access WS-* Web Services, except when you run into a service format that it doesn't support. Even then WCF provides a huge amount of flexibility to make the service clients work, however finding the proper interfaces to make that happen is not easy to discover and for the most part undocumented unless you're lucky enough to run into a blog, forum or StackOverflow post on the matter. This is definitely true for the Password Nonce as part of the WS-Security/WSE protocol, which is not natively supported in WCF. Specifically I had a need to create a WCF message on the client that includes a WS-Security header that looks like this from their spec document:<soapenv:Header> <wsse:Security soapenv:mustUnderstand="1" xmlns:wsse="http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-wssecurity-secext-1.0.xsd"> <wsse:UsernameToken wsu:Id="UsernameToken-8" xmlns:wsu="http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-wssecurity-utility-1.0.xsd"> <wsse:Username>TeStUsErNaMe1</wsse:Username> <wsse:Password Type="http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-username-token-profile-1.0#PasswordText" >TeStPaSsWoRd1</wsse:Password> <wsse:Nonce EncodingType="http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-soap-message-security-1.0#Base64Binary" >f8nUe3YupTU5ISdCy3X9Gg==</wsse:Nonce> <wsu:Created>2011-05-04T19:01:40.981Z</wsu:Created> </wsse:UsernameToken> </wsse:Security> </soapenv:Header> Specifically, the Nonce and Created keys are what WCF doesn't create or have a built in formatting for. Why is there a nonce? My first thought here was WTF? The username and password are there in clear text, what does the Nonce accomplish? The Nonce and created keys are are part of WSE Security specification and are meant to allow the server to detect and prevent replay attacks. The hashed nonce should be unique per request which the server can store and check for before running another request thus ensuring that a request is not replayed with exactly the same values. Basic ServiceUtl Import - not much Luck The first thing I did when I imported this service with a service reference was to simply import it as a Service Reference. The Add Service Reference import automatically detects that WS-Security is required and appropariately adds the WS-Security to the basicHttpBinding in the config file:<?xml version="1.0" encoding="utf-8" ?> <configuration> <system.serviceModel> <bindings> <basicHttpBinding> <binding name="RealTimeOnlineSoapBinding"> <security mode="Transport" /> </binding> <binding name="RealTimeOnlineSoapBinding1" /> </basicHttpBinding> </bindings> <client> <endpoint address="https://notarealurl.com:443/services/RealTimeOnline" binding="basicHttpBinding" bindingConfiguration="RealTimeOnlineSoapBinding" contract="RealTimeOnline.RealTimeOnline" name="RealTimeOnline" /> </client> </system.serviceModel> </configuration> If if I run this as is using code like this:var client = new RealTimeOnlineClient(); client.ClientCredentials.UserName.UserName = "TheUsername"; client.ClientCredentials.UserName.Password = "ThePassword"; … I get nothing in terms of WS-Security headers. The request is sent, but the the binding expects transport level security to be applied, rather than message level security. To fix this so that a WS-Security message header is sent the security mode can be changed to: <security mode="TransportWithMessageCredential" /> Now if I re-run I at least get a WS-Security header which looks like this:<s:Envelope xmlns:s="http://schemas.xmlsoap.org/soap/envelope/" xmlns:u="http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-wssecurity-utility-1.0.xsd"> <s:Header> <o:Security s:mustUnderstand="1" xmlns:o="http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-wssecurity-secext-1.0.xsd"> <u:Timestamp u:Id="_0"> <u:Created>2012-11-24T02:55:18.011Z</u:Created> <u:Expires>2012-11-24T03:00:18.011Z</u:Expires> </u:Timestamp> <o:UsernameToken u:Id="uuid-18c215d4-1106-40a5-8dd1-c81fdddf19d3-1"> <o:Username>TheUserName</o:Username> <o:Password Type="http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-username-token-profile-1.0#PasswordText" >ThePassword</o:Password> </o:UsernameToken> </o:Security> </s:Header> Closer! Now the WS-Security header is there along with a timestamp field (which might not be accepted by some WS-Security expecting services), but there's no Nonce or created timestamp as required by my original service. Using a CustomBinding instead My next try was to go with a CustomBinding instead of basicHttpBinding as it allows a bit more control over the protocol and transport configurations for the binding. Specifically I can explicitly specify the message protocol(s) used. Using configuration file settings here's what the config file looks like:<?xml version="1.0"?> <configuration> <system.serviceModel> <bindings> <customBinding> <binding name="CustomSoapBinding"> <security includeTimestamp="false" authenticationMode="UserNameOverTransport" defaultAlgorithmSuite="Basic256" requireDerivedKeys="false" messageSecurityVersion="WSSecurity10WSTrustFebruary2005WSSecureConversationFebruary2005WSSecurityPolicy11BasicSecurityProfile10"> </security> <textMessageEncoding messageVersion="Soap11"></textMessageEncoding> <httpsTransport maxReceivedMessageSize="2000000000"/> </binding> </customBinding> </bindings> <client> <endpoint address="https://notrealurl.com:443/services/RealTimeOnline" binding="customBinding" bindingConfiguration="CustomSoapBinding" contract="RealTimeOnline.RealTimeOnline" name="RealTimeOnline" /> </client> </system.serviceModel> <startup> <supportedRuntime version="v4.0" sku=".NETFramework,Version=v4.0"/> </startup> </configuration> This ends up creating a cleaner header that's missing the timestamp field which can cause some services problems. The WS-Security header output generated with the above looks like this:<s:Header> <o:Security s:mustUnderstand="1" xmlns:o="http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-wssecurity-secext-1.0.xsd"> <o:UsernameToken u:Id="uuid-291622ca-4c11-460f-9886-ac1c78813b24-1"> <o:Username>TheUsername</o:Username> <o:Password Type="http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-username-token-profile-1.0#PasswordText" >ThePassword</o:Password> </o:UsernameToken> </o:Security> </s:Header> This is closer as it includes only the username and password. The key here is the protocol for WS-Security:messageSecurityVersion="WSSecurity10WSTrustFebruary2005WSSecureConversationFebruary2005WSSecurityPolicy11BasicSecurityProfile10" which explicitly specifies the protocol version. There are several variants of this specification but none of them seem to support the nonce unfortunately. This protocol does allow for optional omission of the Nonce and created timestamp provided (which effectively makes those keys optional). With some services I tried that requested a Nonce just using this protocol actually worked where the default basicHttpBinding failed to connect, so this is a possible solution for access to some services. Unfortunately for my target service that was not an option. The nonce has to be there. Creating Custom ClientCredentials As it turns out WCF doesn't have support for the Digest Nonce as part of WS-Security, and so as far as I can tell there's no way to do it just with configuration settings. I did a bunch of research on this trying to find workarounds for this, and I did find a couple of entries on StackOverflow as well as on the MSDN forums. However, none of these are particularily clear and I ended up using bits and pieces of several of them to arrive at a working solution in the end. http://stackoverflow.com/questions/896901/wcf-adding-nonce-to-usernametoken http://social.msdn.microsoft.com/Forums/en-US/wcf/thread/4df3354f-0627-42d9-b5fb-6e880b60f8ee The latter forum message is the more useful of the two (the last message on the thread in particular) and it has most of the information required to make this work. But it took some experimentation for me to get this right so I'll recount the process here maybe a bit more comprehensively. In order for this to work a number of classes have to be overridden: ClientCredentials ClientCredentialsSecurityTokenManager WSSecurityTokenizer The idea is that we need to create a custom ClientCredential class to hold the custom properties so they can be set from the UI or via configuration settings. The TokenManager and Tokenizer are mainly required to allow the custom credentials class to flow through the WCF pipeline and eventually provide custom serialization. Here are the three classes required and their full implementations:public class CustomCredentials : ClientCredentials { public CustomCredentials() { } protected CustomCredentials(CustomCredentials cc) : base(cc) { } public override System.IdentityModel.Selectors.SecurityTokenManager CreateSecurityTokenManager() { return new CustomSecurityTokenManager(this); } protected override ClientCredentials CloneCore() { return new CustomCredentials(this); } } public class CustomSecurityTokenManager : ClientCredentialsSecurityTokenManager { public CustomSecurityTokenManager(CustomCredentials cred) : base(cred) { } public override System.IdentityModel.Selectors.SecurityTokenSerializer CreateSecurityTokenSerializer(System.IdentityModel.Selectors.SecurityTokenVersion version) { return new CustomTokenSerializer(System.ServiceModel.Security.SecurityVersion.WSSecurity11); } } public class CustomTokenSerializer : WSSecurityTokenSerializer { public CustomTokenSerializer(SecurityVersion sv) : base(sv) { } protected override void WriteTokenCore(System.Xml.XmlWriter writer, System.IdentityModel.Tokens.SecurityToken token) { UserNameSecurityToken userToken = token as UserNameSecurityToken; string tokennamespace = "o"; DateTime created = DateTime.Now; string createdStr = created.ToString("yyyy-MM-ddThh:mm:ss.fffZ"); // unique Nonce value - encode with SHA-1 for 'randomness' // in theory the nonce could just be the GUID by itself string phrase = Guid.NewGuid().ToString(); var nonce = GetSHA1String(phrase); // in this case password is plain text // for digest mode password needs to be encoded as: // PasswordAsDigest = Base64(SHA-1(Nonce + Created + Password)) // and profile needs to change to //string password = GetSHA1String(nonce + createdStr + userToken.Password); string password = userToken.Password; writer.WriteRaw(string.Format( "<{0}:UsernameToken u:Id=\"" + token.Id + "\" xmlns:u=\"http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-wssecurity-utility-1.0.xsd\">" + "<{0}:Username>" + userToken.UserName + "</{0}:Username>" + "<{0}:Password Type=\"http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-username-token-profile-1.0#PasswordText\">" + password + "</{0}:Password>" + "<{0}:Nonce EncodingType=\"http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-soap-message-security-1.0#Base64Binary\">" + nonce + "</{0}:Nonce>" + "<u:Created>" + createdStr + "</u:Created></{0}:UsernameToken>", tokennamespace)); } protected string GetSHA1String(string phrase) { SHA1CryptoServiceProvider sha1Hasher = new SHA1CryptoServiceProvider(); byte[] hashedDataBytes = sha1Hasher.ComputeHash(Encoding.UTF8.GetBytes(phrase)); return Convert.ToBase64String(hashedDataBytes); } } Realistically only the CustomTokenSerializer has any significant code in. The code there deals with actually serializing the custom credentials using low level XML semantics by writing output into an XML writer. I can't take credit for this code - most of the code comes from the MSDN forum post mentioned earlier - I made a few adjustments to simplify the nonce generation and also added some notes to allow for PasswordDigest generation. Per spec the nonce is nothing more than a unique value that's supposed to be 'random'. I'm thinking that this value can be any string that's unique and a GUID on its own probably would have sufficed. Comments on other posts that GUIDs can be potentially guessed are highly exaggerated to say the least IMHO. To satisfy even that aspect though I added the SHA1 encryption and binary decoding to give a more random value that would be impossible to 'guess'. The original example from the forum post used another level of encoding and decoding to string in between - but that really didn't accomplish anything but extra overhead. The header output generated from this looks like this:<s:Header> <o:Security s:mustUnderstand="1" xmlns:o="http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-wssecurity-secext-1.0.xsd"> <o:UsernameToken u:Id="uuid-f43d8b0d-0ebb-482e-998d-f544401a3c91-1" xmlns:u="http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-wssecurity-utility-1.0.xsd"> <o:Username>TheUsername</o:Username> <o:Password Type="http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-username-token-profile-1.0#PasswordText">ThePassword</o:Password> <o:Nonce EncodingType="http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-soap-message-security-1.0#Base64Binary" >PjVE24TC6HtdAnsf3U9c5WMsECY=</o:Nonce> <u:Created>2012-11-23T07:10:04.670Z</u:Created> </o:UsernameToken> </o:Security> </s:Header> which is exactly as it should be. Password Digest? In my case the password is passed in plain text over an SSL connection, so there's no digest required so I was done with the code above. Since I don't have a service handy that requires a password digest,  I had no way of testing the code for the digest implementation, but here is how this is likely to work. If you need to pass a digest encoded password things are a little bit trickier. The password type namespace needs to change to: http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-username-token-profile-1.0#Digest and then the password value needs to be encoded. The format for password digest encoding is this: Base64(SHA-1(Nonce + Created + Password)) and it can be handled in the code above with this code (that's commented in the snippet above): string password = GetSHA1String(nonce + createdStr + userToken.Password); The entire WriteTokenCore method for digest code looks like this:protected override void WriteTokenCore(System.Xml.XmlWriter writer, System.IdentityModel.Tokens.SecurityToken token) { UserNameSecurityToken userToken = token as UserNameSecurityToken; string tokennamespace = "o"; DateTime created = DateTime.Now; string createdStr = created.ToString("yyyy-MM-ddThh:mm:ss.fffZ"); // unique Nonce value - encode with SHA-1 for 'randomness' // in theory the nonce could just be the GUID by itself string phrase = Guid.NewGuid().ToString(); var nonce = GetSHA1String(phrase); string password = GetSHA1String(nonce + createdStr + userToken.Password); writer.WriteRaw(string.Format( "<{0}:UsernameToken u:Id=\"" + token.Id + "\" xmlns:u=\"http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-wssecurity-utility-1.0.xsd\">" + "<{0}:Username>" + userToken.UserName + "</{0}:Username>" + "<{0}:Password Type=\"http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-username-token-profile-1.0#Digest\">" + password + "</{0}:Password>" + "<{0}:Nonce EncodingType=\"http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-soap-message-security-1.0#Base64Binary\">" + nonce + "</{0}:Nonce>" + "<u:Created>" + createdStr + "</u:Created></{0}:UsernameToken>", tokennamespace)); } I had no service to connect to to try out Digest auth - if you end up needing it and get it to work please drop a comment… How to use the custom Credentials The easiest way to use the custom credentials is to create the client in code. Here's a factory method I use to create an instance of my service client:  public static RealTimeOnlineClient CreateRealTimeOnlineProxy(string url, string username, string password) { if (string.IsNullOrEmpty(url)) url = "https://notrealurl.com:443/cows/services/RealTimeOnline"; CustomBinding binding = new CustomBinding(); var security = TransportSecurityBindingElement.CreateUserNameOverTransportBindingElement(); security.IncludeTimestamp = false; security.DefaultAlgorithmSuite = SecurityAlgorithmSuite.Basic256; security.MessageSecurityVersion = MessageSecurityVersion.WSSecurity10WSTrustFebruary2005WSSecureConversationFebruary2005WSSecurityPolicy11BasicSecurityProfile10; var encoding = new TextMessageEncodingBindingElement(); encoding.MessageVersion = MessageVersion.Soap11; var transport = new HttpsTransportBindingElement(); transport.MaxReceivedMessageSize = 20000000; // 20 megs binding.Elements.Add(security); binding.Elements.Add(encoding); binding.Elements.Add(transport); RealTimeOnlineClient client = new RealTimeOnlineClient(binding, new EndpointAddress(url)); // to use full client credential with Nonce uncomment this code: // it looks like this might not be required - the service seems to work without it client.ChannelFactory.Endpoint.Behaviors.Remove<System.ServiceModel.Description.ClientCredentials>(); client.ChannelFactory.Endpoint.Behaviors.Add(new CustomCredentials()); client.ClientCredentials.UserName.UserName = username; client.ClientCredentials.UserName.Password = password; return client; } This returns a service client that's ready to call other service methods. The key item in this code is the ChannelFactory endpoint behavior modification that that first removes the original ClientCredentials and then adds the new one. The ClientCredentials property on the client is read only and this is the way it has to be added.   Summary It's a bummer that WCF doesn't suport WSE Security authentication with nonce values out of the box. From reading the comments in posts/articles while I was trying to find a solution, I found that this feature was omitted by design as this protocol is considered unsecure. While I agree that plain text passwords are rarely a good idea even if they go over secured SSL connection as WSE Security does, there are unfortunately quite a few services (mosly Java services I suspect) that use this protocol. I've run into this twice now and trying to find a solution online I can see that this is not an isolated problem - many others seem to have struggled with this. It seems there are about a dozen questions about this on StackOverflow all with varying incomplete answers. Hopefully this post provides a little more coherent content in one place. Again I marvel at WCF and its breadth of support for protocol features it has in a single tool. And even when it can't handle something there are ways to get it working via extensibility. But at the same time I marvel at how freaking difficult it is to arrive at these solutions. I mean there's no way I could have ever figured this out on my own. It takes somebody working on the WCF team or at least being very, very intricately involved in the innards of WCF to figure out the interconnection of the various objects to do this from scratch. Luckily this is an older problem that has been discussed extensively online and I was able to cobble together a solution from the online content. I'm glad it worked out that way, but it feels dirty and incomplete in that there's a whole learning path that was omitted to get here… Man am I glad I'm not dealing with SOAP services much anymore. REST service security - even when using some sort of federation is a piece of cake by comparison :-) I'm sure once standards bodies gets involved we'll be right back in security standard hell…© Rick Strahl, West Wind Technologies, 2005-2012Posted in WCF  Web Services   Tweet !function(d,s,id){var js,fjs=d.getElementsByTagName(s)[0];if(!d.getElementById(id)){js=d.createElement(s);js.id=id;js.src="//platform.twitter.com/widgets.js";fjs.parentNode.insertBefore(js,fjs);}}(document,"script","twitter-wjs"); (function() { var po = document.createElement('script'); po.type = 'text/javascript'; po.async = true; po.src = 'https://apis.google.com/js/plusone.js'; var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(po, s); })();

    Read the article

  • West Wind WebSurge - an easy way to Load Test Web Applications

    - by Rick Strahl
    A few months ago on a project the subject of load testing came up. We were having some serious issues with a Web application that would start spewing SQL lock errors under somewhat heavy load. These sort of errors can be tough to catch, precisely because they only occur under load and not during typical development testing. To replicate this error more reliably we needed to put a load on the application and run it for a while before these SQL errors would flare up. It’s been a while since I’d looked at load testing tools, so I spent a bit of time looking at different tools and frankly didn’t really find anything that was a good fit. A lot of tools were either a pain to use, didn’t have the basic features I needed, or are extravagantly expensive. In  the end I got frustrated enough to build an initially small custom load test solution that then morphed into a more generic library, then gained a console front end and eventually turned into a full blown Web load testing tool that is now called West Wind WebSurge. I got seriously frustrated looking for tools every time I needed some quick and dirty load testing for an application. If my aim is to just put an application under heavy enough load to find a scalability problem in code, or to simply try and push an application to its limits on the hardware it’s running I shouldn’t have to have to struggle to set up tests. It should be easy enough to get going in a few minutes, so that the testing can be set up quickly so that it can be done on a regular basis without a lot of hassle. And that was the goal when I started to build out my initial custom load tester into a more widely usable tool. If you’re in a hurry and you want to check it out, you can find more information and download links here: West Wind WebSurge Product Page Walk through Video Download link (zip) Install from Chocolatey Source on GitHub For a more detailed discussion of the why’s and how’s and some background continue reading. How did I get here? When I started out on this path, I wasn’t planning on building a tool like this myself – but I got frustrated enough looking at what’s out there to think that I can do better than what’s available for the most common simple load testing scenarios. When we ran into the SQL lock problems I mentioned, I started looking around what’s available for Web load testing solutions that would work for our whole team which consisted of a few developers and a couple of IT guys both of which needed to be able to run the tests. It had been a while since I looked at tools and I figured that by now there should be some good solutions out there, but as it turns out I didn’t really find anything that fit our relatively simple needs without costing an arm and a leg… I spent the better part of a day installing and trying various load testing tools and to be frank most of them were either terrible at what they do, incredibly unfriendly to use, used some terminology I couldn’t even parse, or were extremely expensive (and I mean in the ‘sell your liver’ range of expensive). Pick your poison. There are also a number of online solutions for load testing and they actually looked more promising, but those wouldn’t work well for our scenario as the application is running inside of a private VPN with no outside access into the VPN. Most of those online solutions also ended up being very pricey as well – presumably because of the bandwidth required to test over the open Web can be enormous. When I asked around on Twitter what people were using– I got mostly… crickets. Several people mentioned Visual Studio Load Test, and most other suggestions pointed to online solutions. I did get a bunch of responses though with people asking to let them know what I found – apparently I’m not alone when it comes to finding load testing tools that are effective and easy to use. As to Visual Studio, the higher end skus of Visual Studio and the test edition include a Web load testing tool, which is quite powerful, but there are a number of issues with that: First it’s tied to Visual Studio so it’s not very portable – you need a VS install. I also find the test setup and terminology used by the VS test runner extremely confusing. Heck, it’s complicated enough that there’s even a Pluralsight course on using the Visual Studio Web test from Steve Smith. And of course you need to have one of the high end Visual Studio Skus, and those are mucho Dinero ($$$) – just for the load testing that’s rarely an option. Some of the tools are ultra extensive and let you run analysis tools on the target serves which is useful, but in most cases – just plain overkill and only distracts from what I tend to be ultimately interested in: Reproducing problems that occur at high load, and finding the upper limits and ‘what if’ scenarios as load is ramped up increasingly against a site. Yes it’s useful to have Web app instrumentation, but often that’s not what you’re interested in. I still fondly remember early days of Web testing when Microsoft had the WAST (Web Application Stress Tool) tool, which was rather simple – and also somewhat limited – but easily allowed you to create stress tests very quickly. It had some serious limitations (mainly that it didn’t work with SSL),  but the idea behind it was excellent: Create tests quickly and easily and provide a decent engine to run it locally with minimal setup. You could get set up and run tests within a few minutes. Unfortunately, that tool died a quiet death as so many of Microsoft’s tools that probably were built by an intern and then abandoned, even though there was a lot of potential and it was actually fairly widely used. Eventually the tools was no longer downloadable and now it simply doesn’t work anymore on higher end hardware. West Wind Web Surge – Making Load Testing Quick and Easy So I ended up creating West Wind WebSurge out of rebellious frustration… The goal of WebSurge is to make it drop dead simple to create load tests. It’s super easy to capture sessions either using the built in capture tool (big props to Eric Lawrence, Telerik and FiddlerCore which made that piece a snap), using the full version of Fiddler and exporting sessions, or by manually or programmatically creating text files based on plain HTTP headers to create requests. I’ve been using this tool for 4 months now on a regular basis on various projects as a reality check for performance and scalability and it’s worked extremely well for finding small performance issues. I also use it regularly as a simple URL tester, as it allows me to quickly enter a URL plus headers and content and test that URL and its results along with the ability to easily save one or more of those URLs. A few weeks back I made a walk through video that goes over most of the features of WebSurge in some detail: Note that the UI has slightly changed since then, so there are some UI improvements. Most notably the test results screen has been updated recently to a different layout and to provide more information about each URL in a session at a glance. The video and the main WebSurge site has a lot of info of basic operations. For the rest of this post I’ll talk about a few deeper aspects that may be of interest while also giving a glance at how WebSurge works. Session Capturing As you would expect, WebSurge works with Sessions of Urls that are played back under load. Here’s what the main Session View looks like: You can create session entries manually by individually adding URLs to test (on the Request tab on the right) and saving them, or you can capture output from Web Browsers, Windows Desktop applications that call services, your own applications using the built in Capture tool. With this tool you can capture anything HTTP -SSL requests and content from Web pages, AJAX calls, SOAP or REST services – again anything that uses Windows or .NET HTTP APIs. Behind the scenes the capture tool uses FiddlerCore so basically anything you can capture with Fiddler you can also capture with Web Surge Session capture tool. Alternately you can actually use Fiddler as well, and then export the captured Fiddler trace to a file, which can then be imported into WebSurge. This is a nice way to let somebody capture session without having to actually install WebSurge or for your customers to provide an exact playback scenario for a given set of URLs that cause a problem perhaps. Note that not all applications work with Fiddler’s proxy unless you configure a proxy. For example, .NET Web applications that make HTTP calls usually don’t show up in Fiddler by default. For those .NET applications you can explicitly override proxy settings to capture those requests to service calls. The capture tool also has handy optional filters that allow you to filter by domain, to help block out noise that you typically don’t want to include in your requests. For example, if your pages include links to CDNs, or Google Analytics or social links you typically don’t want to include those in your load test, so by capturing just from a specific domain you are guaranteed content from only that one domain. Additionally you can provide url filters in the configuration file – filters allow to provide filter strings that if contained in a url will cause requests to be ignored. Again this is useful if you don’t filter by domain but you want to filter out things like static image, css and script files etc. Often you’re not interested in the load characteristics of these static and usually cached resources as they just add noise to tests and often skew the overall url performance results. In my testing I tend to care only about my dynamic requests. SSL Captures require Fiddler Note, that in order to capture SSL requests you’ll have to install the Fiddler’s SSL certificate. The easiest way to do this is to install Fiddler and use its SSL configuration options to get the certificate into the local certificate store. There’s a document on the Telerik site that provides the exact steps to get SSL captures to work with Fiddler and therefore with WebSurge. Session Storage A group of URLs entered or captured make up a Session. Sessions can be saved and restored easily as they use a very simple text format that simply stored on disk. The format is slightly customized HTTP header traces separated by a separator line. The headers are standard HTTP headers except that the full URL instead of just the domain relative path is stored as part of the 1st HTTP header line for easier parsing. Because it’s just text and uses the same format that Fiddler uses for exports, it’s super easy to create Sessions by hand manually or under program control writing out to a simple text file. You can see what this format looks like in the Capture window figure above – the raw captured format is also what’s stored to disk and what WebSurge parses from. The only ‘custom’ part of these headers is that 1st line contains the full URL instead of the domain relative path and Host: header. The rest of each header are just plain standard HTTP headers with each individual URL isolated by a separator line. The format used here also uses what Fiddler produces for exports, so it’s easy to exchange or view data either in Fiddler or WebSurge. Urls can also be edited interactively so you can modify the headers easily as well: Again – it’s just plain HTTP headers so anything you can do with HTTP can be added here. Use it for single URL Testing Incidentally I’ve also found this form as an excellent way to test and replay individual URLs for simple non-load testing purposes. Because you can capture a single or many URLs and store them on disk, this also provides a nice HTTP playground where you can record URLs with their headers, and fire them one at a time or as a session and see results immediately. It’s actually an easy way for REST presentations and I find the simple UI flow actually easier than using Fiddler natively. Finally you can save one or more URLs as a session for later retrieval. I’m using this more and more for simple URL checks. Overriding Cookies and Domains Speaking of HTTP headers – you can also overwrite cookies used as part of the options. One thing that happens with modern Web applications is that you have session cookies in use for authorization. These cookies tend to expire at some point which would invalidate a test. Using the Options dialog you can actually override the cookie: which replaces the cookie for all requests with the cookie value specified here. You can capture a valid cookie from a manual HTTP request in your browser and then paste into the cookie field, to replace the existing Cookie with the new one that is now valid. Likewise you can easily replace the domain so if you captured urls on west-wind.com and now you want to test on localhost you can do that easily easily as well. You could even do something like capture on store.west-wind.com and then test on localhost/store which would also work. Running Load Tests Once you’ve created a Session you can specify the length of the test in seconds, and specify the number of simultaneous threads to run each session on. Sessions run through each of the URLs in the session sequentially by default. One option in the options list above is that you can also randomize the URLs so each thread runs requests in a different order. This avoids bunching up URLs initially when tests start as all threads run the same requests simultaneously which can sometimes skew the results of the first few minutes of a test. While sessions run some progress information is displayed: By default there’s a live view of requests displayed in a Console-like window. On the bottom of the window there’s a running total summary that displays where you’re at in the test, how many requests have been processed and what the requests per second count is currently for all requests. Note that for tests that run over a thousand requests a second it’s a good idea to turn off the console display. While the console display is nice to see that something is happening and also gives you slight idea what’s happening with actual requests, once a lot of requests are processed, this UI updating actually adds a lot of CPU overhead to the application which may cause the actual load generated to be reduced. If you are running a 1000 requests a second there’s not much to see anyway as requests roll by way too fast to see individual lines anyway. If you look on the options panel, there is a NoProgressEvents option that disables the console display. Note that the summary display is still updated approximately once a second so you can always tell that the test is still running. Test Results When the test is done you get a simple Results display: On the right you get an overall summary as well as breakdown by each URL in the session. Both success and failures are highlighted so it’s easy to see what’s breaking in your load test. The report can be printed or you can also open the HTML document in your default Web Browser for printing to PDF or saving the HTML document to disk. The list on the right shows you a partial list of the URLs that were fired so you can look in detail at the request and response data. The list can be filtered by success and failure requests. Each list is partial only (at the moment) and limited to a max of 1000 items in order to render reasonably quickly. Each item in the list can be clicked to see the full request and response data: This particularly useful for errors so you can quickly see and copy what request data was used and in the case of a GET request you can also just click the link to quickly jump to the page. For non-GET requests you can find the URL in the Session list, and use the context menu to Test the URL as configured including any HTTP content data to send. You get to see the full HTTP request and response as well as a link in the Request header to go visit the actual page. Not so useful for a POST as above, but definitely useful for GET requests. Finally you can also get a few charts. The most useful one is probably the Request per Second chart which can be accessed from the Charts menu or shortcut. Here’s what it looks like:   Results can also be exported to JSON, XML and HTML. Keep in mind that these files can get very large rather quickly though, so exports can end up taking a while to complete. Command Line Interface WebSurge runs with a small core load engine and this engine is plugged into the front end application I’ve shown so far. There’s also a command line interface available to run WebSurge from the Windows command prompt. Using the command line you can run tests for either an individual URL (similar to AB.exe for example) or a full Session file. By default when it runs WebSurgeCli shows progress every second showing total request count, failures and the requests per second for the entire test. A silent option can turn off this progress display and display only the results. The command line interface can be useful for build integration which allows checking for failures perhaps or hitting a specific requests per second count etc. It’s also nice to use this as quick and dirty URL test facility similar to the way you’d use Apache Bench (ab.exe). Unlike ab.exe though, WebSurgeCli supports SSL and makes it much easier to create multi-URL tests using either manual editing or the WebSurge UI. Current Status Currently West Wind WebSurge is still in Beta status. I’m still adding small new features and tweaking the UI in an attempt to make it as easy and self-explanatory as possible to run. Documentation for the UI and specialty features is also still a work in progress. I plan on open-sourcing this product, but it won’t be free. There’s a free version available that provides a limited number of threads and request URLs to run. A relatively low cost license  removes the thread and request limitations. Pricing info can be found on the Web site – there’s an introductory price which is $99 at the moment which I think is reasonable compared to most other for pay solutions out there that are exorbitant by comparison… The reason code is not available yet is – well, the UI portion of the app is a bit embarrassing in its current monolithic state. The UI started as a very simple interface originally that later got a lot more complex – yeah, that never happens, right? Unless there’s a lot of interest I don’t foresee re-writing the UI entirely (which would be ideal), but in the meantime at least some cleanup is required before I dare to publish it :-). The code will likely be released with version 1.0. I’m very interested in feedback. Do you think this could be useful to you and provide value over other tools you may or may not have used before? I hope so – it already has provided a ton of value for me and the work I do that made the development worthwhile at this point. You can leave a comment below, or for more extensive discussions you can post a message on the West Wind Message Board in the WebSurge section Microsoft MVPs and Insiders get a free License If you’re a Microsoft MVP or a Microsoft Insider you can get a full license for free. Send me a link to your current, official Microsoft profile and I’ll send you a not-for resale license. Send any messages to [email protected]. Resources For more info on WebSurge and to download it to try it out, use the following links. West Wind WebSurge Home Download West Wind WebSurge Getting Started with West Wind WebSurge Video© Rick Strahl, West Wind Technologies, 2005-2014Posted in ASP.NET   Tweet !function(d,s,id){var js,fjs=d.getElementsByTagName(s)[0];if(!d.getElementById(id)){js=d.createElement(s);js.id=id;js.src="//platform.twitter.com/widgets.js";fjs.parentNode.insertBefore(js,fjs);}}(document,"script","twitter-wjs"); (function() { var po = document.createElement('script'); po.type = 'text/javascript'; po.async = true; po.src = 'https://apis.google.com/js/plusone.js'; var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(po, s); })();

    Read the article

  • Top things web developers should know about the Visual Studio 2013 release

    - by Jon Galloway
    ASP.NET and Web Tools for Visual Studio 2013 Release NotesASP.NET and Web Tools for Visual Studio 2013 Release NotesSummary for lazy readers: Visual Studio 2013 is now available for download on the Visual Studio site and on MSDN subscriber downloads) Visual Studio 2013 installs side by side with Visual Studio 2012 and supports round-tripping between Visual Studio versions, so you can try it out without committing to a switch Visual Studio 2013 ships with the new version of ASP.NET, which includes ASP.NET MVC 5, ASP.NET Web API 2, Razor 3, Entity Framework 6 and SignalR 2.0 The new releases ASP.NET focuses on One ASP.NET, so core features and web tools work the same across the platform (e.g. adding ASP.NET MVC controllers to a Web Forms application) New core features include new templates based on Bootstrap, a new scaffolding system, and a new identity system Visual Studio 2013 is an incredible editor for web files, including HTML, CSS, JavaScript, Markdown, LESS, Coffeescript, Handlebars, Angular, Ember, Knockdown, etc. Top links: Visual Studio 2013 content on the ASP.NET site are in the standard new releases area: http://www.asp.net/vnext ASP.NET and Web Tools for Visual Studio 2013 Release Notes Short intro videos on the new Visual Studio web editor features from Scott Hanselman and Mads Kristensen Announcing release of ASP.NET and Web Tools for Visual Studio 2013 post on the official .NET Web Development and Tools Blog Scott Guthrie's post: Announcing the Release of Visual Studio 2013 and Great Improvements to ASP.NET and Entity Framework Okay, for those of you who are still with me, let's dig in a bit. Quick web dev notes on downloading and installing Visual Studio 2013 I found Visual Studio 2013 to be a pretty fast install. According to Brian Harry's release post, installing over pre-release versions of Visual Studio is supported.  I've installed the release version over pre-release versions, and it worked fine. If you're only going to be doing web development, you can speed up the install if you just select Web Developer tools. Of course, as a good Microsoft employee, I'll mention that you might also want to install some of those other features, like the Store apps for Windows 8 and the Windows Phone 8.0 SDK, but they do download and install a lot of other stuff (e.g. the Windows Phone SDK sets up Hyper-V and downloads several GB's of VM's). So if you're planning just to do web development for now, you can pick just the Web Developer Tools and install the other stuff later. If you've got a fast internet connection, I recommend using the web installer instead of downloading the ISO. The ISO includes all the features, whereas the web installer just downloads what you're installing. Visual Studio 2013 development settings and color theme When you start up Visual Studio, it'll prompt you to pick some defaults. These are totally up to you -whatever suits your development style - and you can change them later. As I said, these are completely up to you. I recommend either the Web Development or Web Development (Code Only) settings. The only real difference is that Code Only hides the toolbars, and you can switch between them using Tools / Import and Export Settings / Reset. Web Development settings Web Development (code only) settings Usually I've just gone with Web Development (code only) in the past because I just want to focus on the code, although the Standard toolbar does make it easier to switch default web browsers. More on that later. Color theme Sigh. Okay, everyone's got their favorite colors. I alternate between Light and Dark depending on my mood, and I personally like how the low contrast on the window chrome in those themes puts the emphasis on my code rather than the tabs and toolbars. I know some people got pretty worked up over that, though, and wanted the blue theme back. I personally don't like it - it reminds me of ancient versions of Visual Studio that I don't want to think about anymore. So here's the thing: if you install Visual Studio Ultimate, it defaults to Blue. The other versions default to Light. If you use Blue, I won't criticize you - out loud, that is. You can change themes really easily - either Tools / Options / Environment / General, or the smart way: ctrl+q for quick launch, then type Theme and hit enter. Signing in During the first run, you'll be prompted to sign in. You don't have to - you can click the "Not now, maybe later" link at the bottom of that dialog. I recommend signing in, though. It's not hooked in with licensing or tracking the kind of code you write to sell you components. It is doing good things, like  syncing your Visual Studio settings between computers. More about that here. So, you don't have to, but I sure do. Overview of shiny new things in ASP.NET land There are a lot of good new things in ASP.NET. I'll list some of my favorite here, but you can read more on the ASP.NET site. One ASP.NET You've heard us talk about this for a while. The idea is that options are good, but choice can be a burden. When you start a new ASP.NET project, why should you have to make a tough decision - with long-term consequences - about how your application will work? If you want to use ASP.NET Web Forms, but have the option of adding in ASP.NET MVC later, why should that be hard? It's all ASP.NET, right? Ideally, you'd just decide that you want to use ASP.NET to build sites and services, and you could use the appropriate tools (the green blocks below) as you needed them. So, here it is. When you create a new ASP.NET application, you just create an ASP.NET application. Next, you can pick from some templates to get you started... but these are different. They're not "painful decision" templates, they're just some starting pieces. And, most importantly, you can mix and match. I can pick a "mostly" Web Forms template, but include MVC and Web API folders and core references. If you've tried to mix and match in the past, you're probably aware that it was possible, but not pleasant. ASP.NET MVC project files contained special project type GUIDs, so you'd only get controller scaffolding support in a Web Forms project if you manually edited the csproj file. Features in one stack didn't work in others. Project templates were painful choices. That's no longer the case. Hooray! I just did a demo in a presentation last week where I created a new Web Forms + MVC + Web API site, built a model, scaffolded MVC and Web API controllers with EF Code First, add data in the MVC view, viewed it in Web API, then added a GridView to the Web Forms Default.aspx page and bound it to the Model. In about 5 minutes. Sure, it's a simple example, but it's great to be able to share code and features across the whole ASP.NET family. Authentication In the past, authentication was built into the templates. So, for instance, there was an ASP.NET MVC 4 Intranet Project template which created a new ASP.NET MVC 4 application that was preconfigured for Windows Authentication. All of that authentication stuff was built into each template, so they varied between the stacks, and you couldn't reuse them. You didn't see a lot of changes to the authentication options, since they required big changes to a bunch of project templates. Now, the new project dialog includes a common authentication experience. When you hit the Change Authentication button, you get some common options that work the same way regardless of the template or reference settings you've made. These options work on all ASP.NET frameworks, and all hosting environments (IIS, IIS Express, or OWIN for self-host) The default is Individual User Accounts: This is the standard "create a local account, using username / password or OAuth" thing; however, it's all built on the new Identity system. More on that in a second. The one setting that has some configuration to it is Organizational Accounts, which lets you configure authentication using Active Directory, Windows Azure Active Directory, or Office 365. Identity There's a new identity system. We've taken the best parts of the previous ASP.NET Membership and Simple Identity systems, rolled in a lot of feedback and made big enhancements to support important developer concerns like unit testing and extensiblity. I've written long posts about ASP.NET identity, and I'll do it again. Soon. This is not that post. The short version is that I think we've finally got just the right Identity system. Some of my favorite features: There are simple, sensible defaults that work well - you can File / New / Run / Register / Login, and everything works. It supports standard username / password as well as external authentication (OAuth, etc.). It's easy to customize without having to re-implement an entire provider. It's built using pluggable pieces, rather than one large monolithic system. It's built using interfaces like IUser and IRole that allow for unit testing, dependency injection, etc. You can easily add user profile data (e.g. URL, twitter handle, birthday). You just add properties to your ApplicationUser model and they'll automatically be persisted. Complete control over how the identity data is persisted. By default, everything works with Entity Framework Code First, but it's built to support changes from small (modify the schema) to big (use another ORM, store your data in a document database or in the cloud or in XML or in the EXIF data of your desktop background or whatever). It's configured via OWIN. More on OWIN and Katana later, but the fact that it's built using OWIN means it's portable. You can find out more in the Authentication and Identity section of the ASP.NET site (and lots more content will be going up there soon). New Bootstrap based project templates The new project templates are built using Bootstrap 3. Bootstrap (formerly Twitter Bootstrap) is a front-end framework that brings a lot of nice benefits: It's responsive, so your projects will automatically scale to device width using CSS media queries. For example, menus are full size on a desktop browser, but on narrower screens you automatically get a mobile-friendly menu. The built-in Bootstrap styles make your standard page elements (headers, footers, buttons, form inputs, tables etc.) look nice and modern. Bootstrap is themeable, so you can reskin your whole site by dropping in a new Bootstrap theme. Since Bootstrap is pretty popular across the web development community, this gives you a large and rapidly growing variety of templates (free and paid) to choose from. Bootstrap also includes a lot of very useful things: components (like progress bars and badges), useful glyphicons, and some jQuery plugins for tooltips, dropdowns, carousels, etc.). Here's a look at how the responsive part works. When the page is full screen, the menu and header are optimized for a wide screen display: When I shrink the page down (this is all based on page width, not useragent sniffing) the menu turns into a nice mobile-friendly dropdown: For a quick example, I grabbed a new free theme off bootswatch.com. For simple themes, you just need to download the boostrap.css file and replace the /content/bootstrap.css file in your project. Now when I refresh the page, I've got a new theme: Scaffolding The big change in scaffolding is that it's one system that works across ASP.NET. You can create a new Empty Web project or Web Forms project and you'll get the Scaffold context menus. For release, we've got MVC 5 and Web API 2 controllers. We had a preview of Web Forms scaffolding in the preview releases, but they weren't fully baked for RTM. Look for them in a future update, expected pretty soon. This scaffolding system wasn't just changed to work across the ASP.NET frameworks, it's also built to enable future extensibility. That's not in this release, but should also hopefully be out soon. Project Readme page This is a small thing, but I really like it. When you create a new project, you get a Project_Readme.html page that's added to the root of your project and opens in the Visual Studio built-in browser. I love it. A long time ago, when you created a new project we just dumped it on you and left you scratching your head about what to do next. Not ideal. Then we started adding a bunch of Getting Started information to the new project templates. That told you what to do next, but you had to delete all of that stuff out of your website. It doesn't belong there. Not ideal. This is a simple HTML file that's not integrated into your project code at all. You can delete it if you want. But, it shows a lot of helpful links that are current for the project you just created. In the future, if we add new wacky project types, they can create readme docs with specific information on how to do appropriately wacky things. Side note: I really like that they used the internal browser in Visual Studio to show this content rather than popping open an HTML page in the default browser. I hate that. It's annoying. If you're doing that, I hope you'll stop. What if some unnamed person has 40 or 90 tabs saved in their browser session? When you pop open your "Thanks for installing my Visual Studio extension!" page, all eleventy billion tabs start up and I wish I'd never installed your thing. Be like these guys and pop stuff Visual Studio specific HTML docs in the Visual Studio browser. ASP.NET MVC 5 The biggest change with ASP.NET MVC 5 is that it's no longer a separate project type. It integrates well with the rest of ASP.NET. In addition to that and the other common features we've already looked at (Bootstrap templates, Identity, authentication), here's what's new for ASP.NET MVC. Attribute routing ASP.NET MVC now supports attribute routing, thanks to a contribution by Tim McCall, the author of http://attributerouting.net. With attribute routing you can specify your routes by annotating your actions and controllers. This supports some pretty complex, customized routing scenarios, and it allows you to keep your route information right with your controller actions if you'd like. Here's a controller that includes an action whose method name is Hiding, but I've used AttributeRouting to configure it to /spaghetti/with-nesting/where-is-waldo public class SampleController : Controller { [Route("spaghetti/with-nesting/where-is-waldo")] public string Hiding() { return "You found me!"; } } I enable that in my RouteConfig.cs, and I can use that in conjunction with my other MVC routes like this: public class RouteConfig { public static void RegisterRoutes(RouteCollection routes) { routes.IgnoreRoute("{resource}.axd/{*pathInfo}"); routes.MapMvcAttributeRoutes(); routes.MapRoute( name: "Default", url: "{controller}/{action}/{id}", defaults: new { controller = "Home", action = "Index", id = UrlParameter.Optional } ); } } You can read more about Attribute Routing in ASP.NET MVC 5 here. Filter enhancements There are two new additions to filters: Authentication Filters and Filter Overrides. Authentication filters are a new kind of filter in ASP.NET MVC that run prior to authorization filters in the ASP.NET MVC pipeline and allow you to specify authentication logic per-action, per-controller, or globally for all controllers. Authentication filters process credentials in the request and provide a corresponding principal. Authentication filters can also add authentication challenges in response to unauthorized requests. Override filters let you change which filters apply to a given action method or controller. Override filters specify a set of filter types that should not be run for a given scope (action or controller). This allows you to configure filters that apply globally but then exclude certain global filters from applying to specific actions or controllers. ASP.NET Web API 2 ASP.NET Web API 2 includes a lot of new features. Attribute Routing ASP.NET Web API supports the same attribute routing system that's in ASP.NET MVC 5. You can read more about the Attribute Routing features in Web API in this article. OAuth 2.0 ASP.NET Web API picks up OAuth 2.0 support, using security middleware running on OWIN (discussed below). This is great for features like authenticated Single Page Applications. OData Improvements ASP.NET Web API now has full OData support. That required adding in some of the most powerful operators: $select, $expand, $batch and $value. You can read more about OData operator support in this article by Mike Wasson. Lots more There's a huge list of other features, including CORS (cross-origin request sharing), IHttpActionResult, IHttpRequestContext, and more. I think the best overview is in the release notes. OWIN and Katana I've written about OWIN and Katana recently. I'm a big fan. OWIN is the Open Web Interfaces for .NET. It's a spec, like HTML or HTTP, so you can't install OWIN. The benefit of OWIN is that it's a community specification, so anyone who implements it can plug into the ASP.NET stack, either as middleware or as a host. Katana is the Microsoft implementation of OWIN. It leverages OWIN to wire up things like authentication, handlers, modules, IIS hosting, etc., so ASP.NET can host OWIN components and Katana components can run in someone else's OWIN implementation. Howard Dierking just wrote a cool article in MSDN magazine describing Katana in depth: Getting Started with the Katana Project. He had an interesting example showing an OWIN based pipeline which leveraged SignalR, ASP.NET Web API and NancyFx components in the same stack. If this kind of thing makes sense to you, that's great. If it doesn't, don't worry, but keep an eye on it. You're going to see some cool things happen as a result of ASP.NET becoming more and more pluggable. Visual Studio Web Tools Okay, this stuff's just crazy. Visual Studio has been adding some nice web dev features over the past few years, but they've really cranked it up for this release. Visual Studio is by far my favorite code editor for all web files: CSS, HTML, JavaScript, and lots of popular libraries. Stop thinking of Visual Studio as a big editor that you only use to write back-end code. Stop editing HTML and CSS in Notepad (or Sublime, Notepad++, etc.). Visual Studio starts up in under 2 seconds on a modern computer with an SSD. Misspelling HTML attributes or your CSS classes or jQuery or Angular syntax is stupid. It doesn't make you a better developer, it makes you a silly person who wastes time. Browser Link Browser Link is a real-time, two-way connection between Visual Studio and all connected browsers. It's only attached when you're running locally, in debug, but it applies to any and all connected browser, including emulators. You may have seen demos that showed the browsers refreshing based on changes in the editor, and I'll agree that's pretty cool. But it's really just the start. It's a two-way connection, and it's built for extensiblity. That means you can write extensions that push information from your running application (in IE, Chrome, a mobile emulator, etc.) back to Visual Studio. Mads and team have showed off some demonstrations where they enabled edit mode in the browser which updated the source HTML back on the browser. It's also possible to look at how the rendered HTML performs, check for compatibility issues, watch for unused CSS classes, the sky's the limit. New HTML editor The previous HTML editor had a lot of old code that didn't allow for improvements. The team rewrote the HTML editor to take advantage of the new(ish) extensibility features in Visual Studio, which then allowed them to add in all kinds of features - things like CSS Class and ID IntelliSense (so you type style="" and get a list of classes and ID's for your project), smart indent based on how your document is formatted, JavaScript reference auto-sync, etc. Here's a 3 minute tour from Mads Kristensen. The previous HTML editor had a lot of old code that didn't allow for improvements. The team rewrote the HTML editor to take advantage of the new(ish) extensibility features in Visual Studio, which then allowed them to add in all kinds of features - things like CSS Class and ID IntelliSense (so you type style="" and get a list of classes and ID's for your project), smart indent based on how your document is formatted, JavaScript reference auto-sync, etc. Lots more Visual Studio web dev features That's just a sampling - there's a ton of great features for JavaScript editing, CSS editing, publishing, and Page Inspector (which shows real-time rendering of your page inside Visual Studio). Here are some more short videos showing those features. Lots, lots more Okay, that's just a summary, and it's still quite a bit. Head on over to http://asp.net/vnext for more information, and download Visual Studio 2013 now to get started!

    Read the article

  • IIS SSL Certificate Renewal Pain

    - by Rick Strahl
    I’m in the middle of my annual certificate renewal for the West Wind site and I can honestly say that I hate IIS’s certificate system.  When it works it’s fine, but when it doesn’t man can it be a pain. Because I deal with public certificates on my site merely once a year, and you have to perform the certificate dance just the right way, I seem to run into some sort of trouble every year, thinking that Microsoft surely must have addressed the issues I ran into previously – HA! Not so. Don’t ever use the Renew Certificate Feature in IIS! The first rule that I should have never forgotten is that certificate renewals in IIS (7 is what I’m using but I think it’s no different in 7.5 and 8), simply don’t work if you’re submitting to get a public certificate from a certificate authority. I use DNSimple for my DNS domain management and SSL certificates because they provide ridiculously easy domain management and good prices for SSL certs – especially wildcard certificates, which is what I use on west-wind.com. Certificates in IIS can be found pegged to the machine root. If you go into the IIS Manager, go to the machine root the tree and then click on certificates and you then get various certificate options: Both of these options create a new Certificate request (CSR), which is just a text file. But if you’re silly enough like me to click on the Renew button on your old certificate, you’ll find that you end up generating a very long Certificate Request that looks nothing like the original certificate request and the format that’s used for this is not accepted by most certificate authorities. While I’m not sure exactly what the problem is, it simply looks like IIS is respecting none of your original certificate bit size choices and is generating a huge certificate request that is 3 times the size of a ‘normal’ certificate request. The end result is (and I’ve done this at least twice now) is that the certificate processor is likely to fail processing those renewals. Always create a new Certificate While it’s a little more work and you have to remember how to fill out the certificate request properly, this is the safe way to make sure your certificate generates properly. First comes the Distinguished Name Properties dialog: Ah yes you have to love the nomenclature of this stuff. Distinguished name, Common name – WTF is a common name? It doesn’t look common to me! Make sure this form gets filled out correctly. Common NameThis is the domain name of the Web site. In my case I’m creating a wildcard certificate so I’m using the * prefix. If you’re purchasing a certificate for a specific domain use www.west-wind.com or store.west-wind.com for example. Make sure this matches the EXACT domain you’re trying to use secure access on because that’s all the certificate is going to work on unless you get a wildcard certificate. Organization Is the name of your company or organization. Depending on the kind of certificate you purchase this name will show up on your certificate. Most low end SSL certificates (ie. those that cost under $100 for single domains) don’t list the organization, the higher signature certificates that also require extensive validation by the cert authority do. Regardless you should make sure this matches the right company/organization. Organizational Unit This can be anything. Not really sure what this is for, but traditionally I’ve always set this to Web because – well this is a Web thing after all right? I’ve never seen this used anywhere that I can tell other than to internally reference the cert. State and CountryPretty obvious. Should reflect the location of the business/organization/person or site.   Next you have to configure the bit size used for the certificate: The default on this dialog is 1024, but I’ve found that most providers these days request a minimum bit length of 2048, as did my DNSimple provider. Again check with the provider when you submit to make sure. Bit length mismatches can cause problems if you use a size that isn’t supported by the provider. I had that happen last year when I submitted my CSR and it got rejected quite a bit later, when the certs usually are issued within an hour or less. When you’re done here, the certificate is saved to disk as a .txt file and it should look something like this (this is a 2048 bit length CSR):-----BEGIN NEW CERTIFICATE REQUEST----- MIIEVGCCAz0CAQAwdjELMAkGA1UEBhMCVVMxDzANBgNVBAgMBkhhd2FpaTENMAsG A1UEBwwEUGFpYTEfMB0GA1UECgwWV2VzdCBXaW5kIFRlY2hub2xvZ2llczEMMAoG B1UECwwDV2ViMRgwFgYDVQQDDA8qLndlc3Qtd2luZC5jb20wggEiMA0GCSqGSIb3 DQEBAQUAA4IBDwAwggEKAoIBAQDIPWOFMkMVRp2Ftj9w/cCVV4OYYhoZYtl+8lTk oqDwKca0xWHLgioX/9v0rZLS6a82MHqKEBxVXu+cuCmSE4AQtB/1YH9lS4tpc/be OZDvnTotP6l4MCEzzAfROcw4CiIg6X0RMSnl8IATAvv2V5LQM9TDdt9oDdMpX2IY +vVC9RZ7PMHBmR9kwI2i/lrKitzhQKaHgpmKcRlM6iqpALUiX28w5HJaDKK1MDHN 607tyFJLHijuJKx7PdTqZYf50KkC3NupfZ2avVycf18Q13jHWj59tvwEOczoVzRL l4LQivAqbhyiqMpWnrZunIOUZta5aGm+jo7O1knGWJjxuraTAgMBAAGgggGYMBoG CisGAQQBgjcNAgMxDBYKNi4yLjkyMDAuMjA0BgkrBgEEAYI3FRQxJzAlAgEFDAZS QVNYUFMMC1JBU1hQU1xSaWNrDAtJbmV0TWdyLmV4ZTByBgorBgEEAYI3DQICMWQw YgIBAR5aAE0AaQBjAHIAbwBzAG8AZgB0ACAAUgBTAEEAIABTAEMAaABhAG4AbgBl AGwAIABDAHIAeQBwAHQAbwBnAHIAYQBwAGgAaQBjACAAUAByAG8AdgBpAGQAZQBy AwEAMIHPBgkqhkiG9w0BCQ4xgcEwgb4wDgYDVR0PAQH/BAQDAgTwMBMGA1UdJQQM MAoGCCsGAQUFBwMBMHgGCSqGSIb3DQEJDwRrMGkwDgYIKoZIhvcNAwICAgCAMA4G CCqGSIb3DQMEAgIAgDALBglghkgBZQMEASowCwYJYIZIAWUDBAEtMAsGCWCGSAFl AwQBAjALBglghkgBZQMEAQUwBwYFKw4DAgcwCgYIKoZIhvcNAwcwHQYDVR0OBBYE FD/yOsTbXE+GVFCFMmldzQvyloz9MA0GCSqGSIb3DQEBBQUAA4IBAQCK6LlsCuIM 1AU0niB6QZ9v0FTsGFxP1dYvVUnJyY6VEKNiGFiQjZac7UCs0p58yScdXWEFOE8V OsjAYD3xYNc05+ckyD67UHRGEUAVB9RBvbKW23KeR/8kBmEzc8PemD52YOgExxAJ 57xWmAwEHAvbgYzQvhO8AOzH3TGvvHbg5UKM1pYgNmuwZq5DkL/IDoeIJwfk/wrI wghNTuxxIFgbH4YrgLgv4PRvrS/LaTCRBdboaCgzATMczaOb1nd/DVNR+3fCtMhM W0psTAjzRbmXF3nJyAQa7jF/52gkY0RfFX2lG5tJnG+XDsVNvKNvh9Qa5Tlmkm06 ILKCm9ciWCKk -----END NEW CERTIFICATE REQUEST----- You can take that certificate request and submit that to your certificate provider. Since this is base64 encoded you can typically just paste it into a text box on the submission page, or some providers will ask you to upload the CSR as a file. What does a Renewal look like? Note the length of the CSR will vary somewhat with key strength, but compare this to a renewal request that IIS generated from my existing site:-----BEGIN NEW CERTIFICATE REQUEST----- MIIPpwYFKoZIhvcNAQcCoIIPmDCCD5QCAQExCzAJBgUrDgMCGgUAMIIIqAYJKoZI hvcNAQcBoIIImQSCCJUwggiRMIIH+gIBADBdMSEwHwYDVQQLDBhEb21haW4gQ29u dHJvbCBWYWxpFGF0ZWQxHjAcBgNVBAsMFUVzc2VudGlhbFNTTCBXaWxkY2FyZDEY MBYGA1UEAwwPKi53ZXN0LXdpbmQuY29tMIGfMA0GCSqGSIb3DQEBAQUAA4GNADCB iQKBgQCK4OuIOR18Wb8tNMGRZiD1c9X57b332Lj7DhbckFqLs0ys8kVDHrTXSj+T Ye9nmAvfPpZmBtE5p9qRNN79rUYugAdl+qEtE4IJe1bRfxXzcKa1SXa8+TEs3zQa zYSmcR2dDuC8om1eAdeCtt0NnkvANgm1VLwGOor/UHMASaEhCQIDAQABoIIG8jAa BgorBgEEAYI3DQIDMQwWCjYuMi45MjAwLjIwNAYJKwYBBAGCNxUUMScwJQIBBQwG UkFTWFBTDAtSQVNYUFNcUmljawwLSW5ldE1nci5leGUwZgYKKwYBBAGCNw0CAjFY MFYCAQIeTgBNAGkAYwByAG8AcwBvAGYAdAAgAFMAdAByAG8AbgBnACAAQwByAHkA cAB0AG8AZwByAGEAcABoAGkAYwAgAFAAcgBvAHYAaQBkAGUAcgMBADCCAQAGCSqG SIb3DQEJDjGB8jCB7zAOBgNVHQ8BAf8EBAMCBaAwDAYDVR0TAQH/BAIwADA0BgNV HSUELTArBggrBgEFBQcDAQYIKwYBBQUHAwIGCisGAQQBgjcKAwMGCWCGSAGG+EIE ATBPBgNVHSAESDBGMDoGCysGAQQBsjEBAgIHMCswKQYIKwYBBQUHAgEWHWh0dHBz Oi8vc2VjdXJlLmNvbW9kby5jb20vQ1BTMAgGBmeBDAECATApBgNVHREEIjAggg8q Lndlc3Qtd2luZC5jb22CDXdlc3Qtd2luZC5jb20wHQYDVR0OBBYEFEVLAyO8gDiv lsfovKrx9mHPyrsiMIIFMAYJKwYBBAGCNw0BMYIFITCCBR0wggQFoAMCAQICEQDu 1E1T5Jvtkm5LOfSHabWlMA0GCSqGSIb3DQEBBQUAMHIxCzAJBgNVBAYTAkdCMRsw GQYDVQQIExJHcmVhdGVyIE1hbmNoZXN0ZXIxEDAOBgNVBAcTB1NhbGZvcmQxGjAY BgNVBAoTEUNPTU9ETyBDQSBMaW1pdGVkMRgwFgYDVQQDEw9Fc3NlbnRpYWxTU0wg Q0EwHhcNMTQwNTA3MDAwMDAwWhcNMTUwNjA2MjM1OTU5WjBdMSEwHwYDVQQLExhE b21haW4gQ29udHJvbCBWYWxpZGF0ZWQxHjAcBgNVBAsTFUVzc2VudGlhbFNTTCBX aWxkY2FyZDEYMBYGA1UEAxQPKi53ZXN0LXdpbmQuY29tMIIBIjANBgkqhkiG9w0B AQEFAAOCAQ8AMIIBCgKCAQEAiyKfL66XB51DlUfm6xXqJBcvMU2qorRHxC+WjEpB amvg8XoqNfCKzDAvLMbY4BLhbYCTagqtslnP3Gj4AKhXqRKU0n6iSbmS1gcWzCJM CHufZ5RDtuTuxhTdJxzP9YqZUfKV5abWQp/TK6V1ryaBJvdqM73q4tRjrQODtkiR PfZjxpybnBHFJS8jYAf8jcOjSDZcgN1d9Evc5MrEJCp/90cAkozyF/NMcFtD6Yj8 UM97z3MzDT2JPDoH3kAr3cCgpUNyQ2+wDNCnL9eWYFkOQi8FZMsZol7KlZ5NgNfO a7iZMVGbqDg6rkS//2uGe6tSQJTTs+mAZB+na+M8XT2UqwIDAQABo4IBwTCCAb0w HwYDVR0jBBgwFoAU2svqrVsIXcz//CZUzknlVcY49PgwHQYDVR0OBBYEFH0AmLiL RSEL9+sQD/n5O4N7/nnqMA4GA1UdDwEB/wQEAwIFoDAMBgNVHRMBAf8EAjAAMDQG A1UdJQQtMCsGCCsGAQUFBwMBBggrBgEFBQcDAgYKKwYBBAGCNwoDAwYJYIZIAYb4 QgQBME8GA1UdIARIMEYwOgYLKwYBBAGyMQECAgcwKzApBggrBgEFBQcCARYdaHR0 cHM6Ly9zZWN1cmUuY29tb2RvLmNvbS9DUFMwCAYGZ4EMAQIBMDsGA1UdHwQ0MDIw MKAuoCyGKmh0dHA6Ly9jcmwuY29tb2RvY2EuY29tL0Vzc2VudGlhbFNTTENBLmNy bDBuBggrBgEFBQcBAQRiMGAwOAYIKwYBBQUHMAKGLGh0dHA6Ly9jcnQuY29tb2Rv Y2EuY29tL0Vzc2VudGlhbFNTTENBXzIuY3J0MCQGCCsGAQUFBzABhhhodHRwOi8v b2NzcC5jb21vZG9jYS5jb20wKQYDVR0RBCIwIIIPKi53ZXN0LXdpbmQuY29tgg13 ZXN0LXdpbmQuY29tMA0GCSqGSIb3DQEBBQUAA4IBAQBqBfd6QHrxXsfgfKARG6np 8yszIPhHGPPmaE7xq7RpcZjY9H+8l6fe4jQbGFjbA5uHBklYI4m2snhPaW2p8iF8 YOkm2V2hEsSTnkf5/flw9mZtlCFEDFXSsBxBdNz8RYTthPMu1h09C0XuDB30sztg nR692FrxJN5/bXsk+MC9nEweTFW/t2HW+XZ8bhM7vsAS+pZionR4MyuQ0mYIt/lD csZVZ91KxTsIm8rNMkkYGFoSIXjQ0+0tCbxMF0i2qnpmNRpA6PU8l7lxxvPkplsk 9KB8QIPFrR5p/i/SUAd9vECWh5+/ktlcrfFP2PK7XcEwWizsvMrNqLyvQVNXSUPT MA0GCSqGSIb3DQEBBQUAA4GBABt/NitwMzc5t22p5+zy4HXbVYzLEjesLH8/v0ot uLQ3kkG8tIWNh5RplxIxtilXt09H4Oxpo3fKUN0yw+E6WsBfg0sAF8pHNBdOJi48 azrQbt4HvKktQkGpgYFjLsormjF44SRtToLHlYycDHBNvjaBClUwMCq8HnwY6vDq xikRoIIFITCCBR0wggQFoAMCAQICEQDu1E1T5Jvtkm5LOfSHabWlMA0GCSqGSIb3 DQEBBQUAMHIxCzAJBgNVBAYTAkdCMRswGQYDVQQIExJHcmVhdGVyIE1hbmNoZXN0 ZXIxEDAOBgNVBAcTB1NhbGZvcmQxGjAYBgNVBAoTEUNPTU9ETyBDQSBMaW1pdGVk MRgwFgYDVQQDEw9Fc3NlbnRpYWxTU0wgQ0EwHhcNMTQwNTA3MDAwMDAwWhcNMTUw NjA2MjM1OTU5WjBdMSEwHwYDVQQLExhEb21haW4gQ29udHJvbCBWYWxpZGF0ZWQx HjAcBgNVBAsTFUVzc2VudGlhbFNTTCBXaWxkY2FyZDEYMBYGA1UEAxQPKi53ZXN0 LXdpbmQuY29tMIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEAiyKfL66X B51DlUfm6xXqJBcvMU2qorRHxC+WjEpBamvg8XoqNfCKzDAvLMbY4BLhbYCTagqt slnP3Gj4AKhXqRKU0n6iSbmS1gcWzCJMCHufZ5RDtuTuxhTdJxzP9YqZUfKV5abW Qp/TK6V1ryaBJvdqM73q4tRjrQODtkiRPfZjxpybnBHFJS8jYAf8jcOjSDZcgN1d 9Evc5MrEJCp/90cAkozyF/NMcFtD6Yj8UM97z3MzDT2JPDoH3kAr3cCgpUNyQ2+w DNCnL9eWYFkOQi8FZMsZol7KlZ5NgNfOa7iZMVGbqDg6rkS//2uGe6tSQJTTs+mA ZB+na+M8XT2UqwIDAQABo4IBwTCCAb0wHwYDVR0jBBgwFoAU2svqrVsIXcz//CZU zknlVcY49PgwHQYDVR0OBBYEFH0AmLiLRSEL9+sQD/n5O4N7/nnqMA4GA1UdDwEB /wQEAwIFoDAMBgNVHRMBAf8EAjAAMDQGA1UdJQQtMCsGCCsGAQUFBwMBBggrBgEF BQcDAgYKKwYBBAGCNwoDAwYJYIZIAYb4QgQBME8GA1UdIARIMEYwOgYLKwYBBAGy MQECAgcwKzApBggrBgEFBQcCARYdaHR0cHM6Ly9zZWN1cmUuY29tb2RvLmNvbS9D UFMwCAYGZ4EMAQIBMDsGA1UdHwQ0MDIwMKAuoCyGKmh0dHA6Ly9jcmwuY29tb2Rv Y2EuY29tL0Vzc2VudGlhbFNTTENBLmNybDBuBggrBgEFBQcBAQRiMGAwOAYIKwYB BQUHMAKGLGh0dHA6Ly9jcnQuY29tb2RvY2EuY29tL0Vzc2VudGlhbFNTTENBXzIu Y3J0MCQGCCsGAQUFBzABhhhodHRwOi8vb2NzcC5jb21vZG9jYS5jb20wKQYDVR0R BCIwIIIPKi53ZXN0LXdpbmQuY29tgg13ZXN0LXdpbmQuY29tMA0GCSqGSIb3DQEB BQUAA4IBAQBqBfd6QHrxXsfgfKARG6np8yszIPhHGPPmaE7xq7RpcZjY9H+8l6fe 4jQbGFjbA5uHBklYI4m2snhPaW2p8iF8YOkm2V2hEsSTnkf5/flw9mZtlCFEDFXS sBxBdNz8RYTthPMu1h09C0XuDB30sztgnR692FrxJN5/bXsk+MC9nEweTFW/t2HW +XZ8bhM7vsAS+pZionR4MyuQ0mYIt/lDcsZVZ91KxTsIm8rNMkkYGFoSIXjQ0+0t CbxMF0i2qnpmNRpA6PU8l7lxxvPkplsk9KB8QIPFrR5p/i/SUAd9vECWh5+/ktlc rfFP2PK7XcEwWizsvMrNqLyvQVNXSUPTMYIBrzCCAasCAQEwgYcwcjELMAkGA1UE BhMCR0IxGzAZBgNVBAgTEkdyZWF0ZXIgTWFuY2hlc3RlcjEQMA4GA1UEBxMHU2Fs Zm9yZDEaMBgGA1UEChMRQ09NT0RPIENBIExpbWl0ZWQxGDAWBgNVBAMTD0Vzc2Vu dGlhbFNTTCBDQQIRAO7UTVPkm+2Sbks59IdptaUwCQYFKw4DAhoFADANBgkqhkiG 9w0BAQEFAASCAQB8PNQ6bYnQpWfkHyxnDuvNKw3wrqF2p7JMZm+SuN2qp3R2LpCR mW2LrGtQIm9Iob/QOYH+8houYNVdvsATGPXX2T8gzn+anof4tOG0vCTK1Bp9bwf9 MkRP+1c8RW/vkYmUW4X5/C+y3CZpMH5dDTaXBIpXFzjX/fxNpH/rvLzGiaYYL3Cn OLO+aOADr9qq5yoqwpiYCSfYNNYKTUNNGfYIidQwYtbHXEYhSukB2oR89xD2sZZ4 bOqFjUPgTa5SsERLDDeg3omMKiIXVYGxlqBEq51Kge6IQt4qQV9P9VgInW7cWmKe dTqNHI9ri3ttewdEnT++TKGKKfTjX9SR8Waj -----END NEW CERTIFICATE REQUEST----- Clearly there’s something very different between this an my original request! And it didn’t work. IIS creates a custom CSR that is encoded in a format that no certificate authority I’ve ever used uses. If you want the gory details of what’s in there look at this ServerFault question (thanks to Mika in the comments). In the end it doesn’t matter  though – no certificate authority knows what to do with this CSR. So create a new CSR and skip the renewal. Always! Use the same Server Keep in mind that on IIS at least you should always create your certificate on a single server and then when you receive the final certificate from your provider import it on that server. IIS tracks the CSR it created and requires it in order to import the final certificate properly. So if for some reason you try to install the certificate on another server, it won’t work. I’ve also run into trouble trying to install the same certificate twice – this time around I didn’t give my certificate the proper friendly name and IIS failed to allow me to assign the certificate to any of my Web sites. So I removed the certificate and tried to import again, only to find it failed the second time around. There are other ways to fix this, but in my case I had to have the certificate re-issued to work – not what you want to do. Regardless of what you do though, when you import make sure you do it right the first time by crossing all your t’s and dotting your i's– it’ll save you a lot of grief! You don’t actually have to use the server that the certificate gets installed on to generate the CSR and first install it, but it is generally a good idea to do so just so you can get the certificate installed into the right place right away. If you have access to the server where you need to install the certificate you might as well use it. But you can use another machine to generated the and install the certificate, then export the certificate and move it to another machine as needed. So you can use your Dev machine to create a certificate then export it and install it on a live server. More on installation and back up/export later. Installing the Certificate Once you’ve submitted a CSR request your provider will process the request and eventually issue you a new final certificate that contains another text file with the final key to import into your certificate store. IIS does this by combining the content in your certificate request with the original CSR. If all goes well your new certificate shows up in the certificate list and you’re ready to assign the certificate to your sites. Make sure you use a friendly name that matches domain name of your site. So use *.mysite.com or www.mysite.com or store.mysite.com to ensure IIS recognizes the certificate. I made the mistake of not naming my friendly name this way and found that IIS was unable to link my sites to my wildcard certificate. It needed to have the *. as part of the certificate otherwise the Hostname input field was blanked out. Changing the Friendly Name If you by accidentally used an invalid friendly name you can change it later in the Windows certificate store. Bring up a Run Box Type MMC File | Add/Remove Snap In Add Certificates | Computer Account | Local Computer Drill into Certificates | Personal | Certificates Find your Certificate | Right Click | Properties Edit the Friendly Name | Click OK Backing up your Certificate The first thing you should do once your certificate is successfully installed is to back it up! In case your server crashes or you otherwise lose your configuration this will ensure you have an easy way to recover and reinstall your certificate either on the same server or a different one. If you’re running a server farm or using a wildcard certificate you also need to get the certificate onto other machines and a PFX file import is the easiest way to do this. To back up your certificate select your certificate and choose Export from the context or sidebar menu: The Export Certificate option allows you to export a password protected binary file that you can import in a single step. You can copy the resulting binary PFX file to back up or copy to other machines to install on. Importing the certificate on another machine is as easy as pointing at the PFX file and specifying the password. IIS handles the rest. Assigning a new certificate to your Site Once you have the new certificate installed, all that’s left to do is assign it to your site. In IIS select your Web site and bring up the Site Bindings from the right sidebar. Add a new binding for https, bind it to port 443, specify your hostname and pick the certificate from the pick list. If you’re using a root site make sure to set up your certificate for www.yoursite.com and also for yoursite.com so that both work properly with SSL. Note that you need to explicitly configure each hostname for a certificate if you plan to use SSL. Luckily if you update your SSL certificate in the following year, IIS prompts you and asks whether you like to update all other sites that are using the existing cert to the newer cert. And you’re done. So what’s the Pain? So, all of this is old hat and it doesn’t look all that bad right? So what’s the pain here? Well if you follow the instructions and do everything right, then the process is about as straight forward as you would expect it to be. You create a cert request, you import it and assign it to your sites. That’s the basic steps and to be perfectly fair it works well – if nothing goes wrong. However, renewing tends to be the problem. The first unintuitive issue is that you simply shouldn’t renew but create a new CSR and generate your new certificate from that. Over the years I’ve fallen prey to the belief that Microsoft eventually will fix this so that the renewal creates the same type of CSR as the old cert, but apparently that will just never happen. Booo! The other problem I ran into is that I accidentally misnamed my imported certificate which in turn set off a chain of events that caused my originally issued certificate to become uninstallable. When I received my completed certificate I installed it and it installed just fine, but the friendly name was wrong. As a result IIS refused to assign the certificate to any of my host headered sites. That’s strike number one. Why the heck should the friendly name have any effect on the ability to attach the certificate??? Next I uninstalled the certificate because I figured that would be the easiest way to make sure I get it right. But I found that I could not reinstall my certificate. I kept getting these stop errors: "ASN1 bad tag value met" that would prevent the installation from completion. After searching around for this error and reading countless long messages on forums, I found that this error supposedly does not actually mean the install failed, but the list wouldn’t refresh. Commodo has this to say: Note: There is a known issue in IIS 7 giving the following error: "Cannot find the certificate request associated with this certificate file. A certificate request must be completed on the computer where it was created." You may also receive a message stating "ASN1 bad tag value met". If this is the same server that you generated the CSR on then, in most cases, the certificate is actually installed. Simply cancel the dialog and press "F5" to refresh the list of server certificates. If the new certificate is now in the list, you can continue with the next step. If it is not in the list, you will need to reissue your certificate using a new CSR (see our CSR creation instructions for IIS 7). After creating a new CSR, login to your Comodo account and click the 'replace' button for your certificate. Not sure if this issue is fixed in IIS 8 but that’s an insane bug to have crop up. As it turns out, in my case the refresh didn’t work and the certificate didn’t show up in the IIS list after the reinstall. In fact when looking at the certificate store I could see my certificate was installed in the right place, but the private key is missing which is most likely why IIS is not picking it up. It looks like IIS could not match the final cert to the original CSR generated. But again some sort of message to that affect might be helpful instead of ASN1 bad tag value met. Recovering the Private Key So it turns out my original problem was that I received the published key, but when I imported the private key was missing. There’s a relatively easy way to recover from this. If your certificate doesn’t show up in IIS check in the certificate store for the local machine (see steps above on how to bring this up). If you look at the certificate in Certificates/Personal/Certificates make sure you see the key as shown in the image below: if the key is missing it means that the certificate is missing the private key most likely. To fix a certificate you can do the following: Double click the certificate Go to the Details Tab Copy down the Serial number You can copy the serial number from the area blurred out above. The serial number will be in a format like ?00 a7 9b a1 a4 9d 91 63 57 d6 9f 26 b8 ee 79 b5 cb and you’ll need to strip out the spaces in order to use it in the next step. Next open up an Administrative command prompt and issue the following command: certutil -repairstore my 00a79ba1a49d916357d69f26b8ee79b5cb You should get a confirmation message that the repair worked. If you now go back to the certificate store you should now see the key icon show up on the certificate. Your certificate is fixed. Now go back into IIS Manager and refresh the list of certificates and if all goes well you should see all the certificates that showed in the cert store now: Remember – back up the key first then map to your site… Summary I deal with a lot of customers who run their own IIS servers, and I can’t tell you how often I hear about botched SSL installations. When I posted some of my issues on Twitter yesterday I got a hell storm of “me too” responses. I’m clearly not the only one, who’s run into this especially with renewals. I feel pretty comfortable with IIS configuration and I do a lot of it for support purposes, but the SSL configuration is one that never seems to go seamlessly. This blog post is meant as reminder to myself to read next time I do a renewal. So I can dot my i's and dash my t’s before I get caught in the mess I’m dealing with today. Hopefully some of you find this useful as well.© Rick Strahl, West Wind Technologies, 2005-2014Posted in IIS7  Security   Tweet !function(d,s,id){var js,fjs=d.getElementsByTagName(s)[0];if(!d.getElementById(id)){js=d.createElement(s);js.id=id;js.src="//platform.twitter.com/widgets.js";fjs.parentNode.insertBefore(js,fjs);}}(document,"script","twitter-wjs"); (function() { var po = document.createElement('script'); po.type = 'text/javascript'; po.async = true; po.src = 'https://apis.google.com/js/plusone.js'; var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(po, s); })();

    Read the article

  • Guidance: A Branching strategy for Scrum Teams

    - by Martin Hinshelwood
    Having a good branching strategy will save your bacon, or at least your code. Be careful when deviating from your branching strategy because if you do, you may be worse off than when you started! This is one possible branching strategy for Scrum teams and I will not be going in depth with Scrum but you can find out more about Scrum by reading the Scrum Guide and you can even assess your Scrum knowledge by having a go at the Scrum Open Assessment. You can also read SSW’s Rules to Better Scrum using TFS which have been developed during our own Scrum implementations. Acknowledgements Bill Heys – Bill offered some good feedback on this post and helped soften the language. Note: Bill is a VS ALM Ranger and co-wrote the Branching Guidance for TFS 2010 Willy-Peter Schaub – Willy-Peter is an ex Visual Studio ALM MVP turned blue badge and has been involved in most of the guidance including the Branching Guidance for TFS 2010 Chris Birmele – Chris wrote some of the early TFS Branching and Merging Guidance. Dr Paul Neumeyer, Ph.D Parallel Processes, ScrumMaster and SSW Solution Architect – Paul wanted to have feature branches coming from the release branch as well. We agreed that this is really a spin-off that needs own project, backlog, budget and Team. Scenario: A product is developed RTM 1.0 is released and gets great sales.  Extra features are demanded but the new version will have double to price to pay to recover costs, work is approved by the guys with budget and a few sprints later RTM 2.0 is released.  Sales a very low due to the pricing strategy. There are lots of clients on RTM 1.0 calling out for patches. As I keep getting Reverse Integration and Forward Integration mixed up and Bill keeps slapping my wrists I thought I should have a reminder: You still seemed to use reverse and/or forward integration in the wrong context. I would recommend reviewing your document at the end to ensure that it agrees with the common understanding of these terms merge (forward integration) from parent to child (same direction as the branch), and merge  (reverse integration) from child to parent (the reverse direction of the branch). - one of my many slaps on the wrist from Bill Heys.   As I mentioned previously we are using a single feature branching strategy in our current project. The single biggest mistake developers make is developing against the “Main” or “Trunk” line. This ultimately leads to messy code as things are added and never finished. Your only alternative is to NEVER check in unless your code is 100%, but this does not work in practice, even with a single developer. Your ADD will kick in and your half-finished code will be finished enough to pass the build and the tests. You do use builds don’t you? Sadly, this is a very common scenario and I have had people argue that branching merely adds complexity. Then again I have seen the other side of the universe ... branching  structures from he... We should somehow convince everyone that there is a happy between no-branching and too-much-branching. - Willy-Peter Schaub, VS ALM Ranger, Microsoft   A key benefit of branching for development is to isolate changes from the stable Main branch. Branching adds sanity more than it adds complexity. We do try to stress in our guidance that it is important to justify a branch, by doing a cost benefit analysis. The primary cost is the effort to do merges and resolve conflicts. A key benefit is that you have a stable code base in Main and accept changes into Main only after they pass quality gates, etc. - Bill Heys, VS ALM Ranger & TFS Branching Lead, Microsoft The second biggest mistake developers make is branching anything other than the WHOLE “Main” line. If you branch parts of your code and not others it gets out of sync and can make integration a nightmare. You should have your Source, Assets, Build scripts deployment scripts and dependencies inside the “Main” folder and branch the whole thing. Some departments within MSFT even go as far as to add the environments used to develop the product in there as well; although I would not recommend that unless you have a massive SQL cluster to house your source code. We tried the “add environment” back in South-Africa and while it was “phenomenal”, especially when having to switch between environments, the disk storage and processing requirements killed us. We opted for virtualization to skin this cat of keeping a ready-to-go environment handy. - Willy-Peter Schaub, VS ALM Ranger, Microsoft   I think people often think that you should have separate branches for separate environments (e.g. Dev, Test, Integration Test, QA, etc.). I prefer to think of deploying to environments (such as from Main to QA) rather than branching for QA). - Bill Heys, VS ALM Ranger & TFS Branching Lead, Microsoft   You can read about SSW’s Rules to better Source Control for some additional information on what Source Control to use and how to use it. There are also a number of branching Anti-Patterns that should be avoided at all costs: You know you are on the wrong track if you experience one or more of the following symptoms in your development environment: Merge Paranoia—avoiding merging at all cost, usually because of a fear of the consequences. Merge Mania—spending too much time merging software assets instead of developing them. Big Bang Merge—deferring branch merging to the end of the development effort and attempting to merge all branches simultaneously. Never-Ending Merge—continuous merging activity because there is always more to merge. Wrong-Way Merge—merging a software asset version with an earlier version. Branch Mania—creating many branches for no apparent reason. Cascading Branches—branching but never merging back to the main line. Mysterious Branches—branching for no apparent reason. Temporary Branches—branching for changing reasons, so the branch becomes a permanent temporary workspace. Volatile Branches—branching with unstable software assets shared by other branches or merged into another branch. Note   Branches are volatile most of the time while they exist as independent branches. That is the point of having them. The difference is that you should not share or merge branches while they are in an unstable state. Development Freeze—stopping all development activities while branching, merging, and building new base lines. Berlin Wall—using branches to divide the development team members, instead of dividing the work they are performing. -Branching and Merging Primer by Chris Birmele - Developer Tools Technical Specialist at Microsoft Pty Ltd in Australia   In fact, this can result in a merge exercise no-one wants to be involved in, merging hundreds of thousands of change sets and trying to get a consolidated build. Again, we need to find a happy medium. - Willy-Peter Schaub on Merge Paranoia Merge conflicts are generally the result of making changes to the same file in both the target and source branch. If you create merge conflicts, you will eventually need to resolve them. Often the resolution is manual. Merging more frequently allows you to resolve these conflicts close to when they happen, making the resolution clearer. Waiting weeks or months to resolve them, the Big Bang approach, means you are more likely to resolve conflicts incorrectly. - Bill Heys, VS ALM Ranger & TFS Branching Lead, Microsoft   Figure: Main line, this is where your stable code lives and where any build has known entities, always passes and has a happy test that passes as well? Many development projects consist of, a single “Main” line of source and artifacts. This is good; at least there is source control . There are however a couple of issues that need to be considered. What happens if: you and your team are working on a new set of features and the customer wants a change to his current version? you are working on two features and the customer decides to abandon one of them? you have two teams working on different feature sets and their changes start interfering with each other? I just use labels instead of branches? That's a lot of “what if’s”, but there is a simple way of preventing this. Branching… In TFS, labels are not immutable. This does not mean they are not useful. But labels do not provide a very good development isolation mechanism. Branching allows separate code sets to evolve separately (e.g. Current with hotfixes, and vNext with new development). I don’t see how labels work here. - Bill Heys, VS ALM Ranger & TFS Branching Lead, Microsoft   Figure: Creating a single feature branch means you can isolate the development work on that branch.   Its standard practice for large projects with lots of developers to use Feature branching and you can check the Branching Guidance for the latest recommendations from the Visual Studio ALM Rangers for other methods. In the diagram above you can see my recommendation for branching when using Scrum development with TFS 2010. It consists of a single Sprint branch to contain all the changes for the current sprint. The main branch has the permissions changes so contributors to the project can only Branch and Merge with “Main”. This will prevent accidental check-ins or checkouts of the “Main” line that would contaminate the code. The developers continue to develop on sprint one until the completion of the sprint. Note: In the real world, starting a new Greenfield project, this process starts at Sprint 2 as at the start of Sprint 1 you would have artifacts in version control and no need for isolation.   Figure: Once the sprint is complete the Sprint 1 code can then be merged back into the Main line. There are always good practices to follow, and one is to always do a Forward Integration from Main into Sprint 1 before you do a Reverse Integration from Sprint 1 back into Main. In this case it may seem superfluous, but this builds good muscle memory into your developer’s work ethic and means that no bad habits are learned that would interfere with additional Scrum Teams being added to the Product. The process of completing your sprint development: The Team completes their work according to their definition of done. Merge from “Main” into “Sprint1” (Forward Integration) Stabilize your code with any changes coming from other Scrum Teams working on the same product. If you have one Scrum Team this should be quick, but there may have been bug fixes in the Release branches. (we will talk about release branches later) Merge from “Sprint1” into “Main” to commit your changes. (Reverse Integration) Check-in Delete the Sprint1 branch Note: The Sprint 1 branch is no longer required as its useful life has been concluded. Check-in Done But you are not yet done with the Sprint. The goal in Scrum is to have a “potentially shippable product” at the end of every Sprint, and we do not have that yet, we only have finished code.   Figure: With Sprint 1 merged you can create a Release branch and run your final packaging and testing In 99% of all projects I have been involved in or watched, a “shippable product” only happens towards the end of the overall lifecycle, especially when sprints are short. The in-between releases are great demonstration releases, but not shippable. Perhaps it comes from my 80’s brain washing that we only ship when we reach the agreed quality and business feature bar. - Willy-Peter Schaub, VS ALM Ranger, Microsoft Although you should have been testing and packaging your code all the way through your Sprint 1 development, preferably using an automated process, you still need to test and package with stable unchanging code. This is where you do what at SSW we call a “Test Please”. This is first an internal test of the product to make sure it meets the needs of the customer and you generally use a resource external to your Team. Then a “Test Please” is conducted with the Product Owner to make sure he is happy with the output. You can read about how to conduct a Test Please on our Rules to Successful Projects: Do you conduct an internal "test please" prior to releasing a version to a client?   Figure: If you find a deviation from the expected result you fix it on the Release branch. If during your final testing or your “Test Please” you find there are issues or bugs then you should fix them on the release branch. If you can’t fix them within the time box of your Sprint, then you will need to create a Bug and put it onto the backlog for prioritization by the Product owner. Make sure you leave plenty of time between your merge from the development branch to find and fix any problems that are uncovered. This process is commonly called Stabilization and should always be conducted once you have completed all of your User Stories and integrated all of your branches. Even once you have stabilized and released, you should not delete the release branch as you would with the Sprint branch. It has a usefulness for servicing that may extend well beyond the limited life you expect of it. Note: Don't get forced by the business into adding features into a Release branch instead that indicates the unspoken requirement is that they are asking for a product spin-off. In this case you can create a new Team Project and branch from the required Release branch to create a new Main branch for that product. And you create a whole new backlog to work from.   Figure: When the Team decides it is happy with the product you can create a RTM branch. Once you have fixed all the bugs you can, and added any you can’t to the Product Backlog, and you Team is happy with the result you can create a Release. This would consist of doing the final Build and Packaging it up ready for your Sprint Review meeting. You would then create a read-only branch that represents the code you “shipped”. This is really an Audit trail branch that is optional, but is good practice. You could use a Label, but Labels are not Auditable and if a dispute was raised by the customer you can produce a verifiable version of the source code for an independent party to check. Rare I know, but you do not want to be at the wrong end of a legal battle. Like the Release branch the RTM branch should never be deleted, or only deleted according to your companies legal policy, which in the UK is usually 7 years.   Figure: If you have made any changes in the Release you will need to merge back up to Main in order to finalise the changes. Nothing is really ever done until it is in Main. The same rules apply when merging any fixes in the Release branch back into Main and you should do a reverse merge before a forward merge, again for the muscle memory more than necessity at this stage. Your Sprint is now nearly complete, and you can have a Sprint Review meeting knowing that you have made every effort and taken every precaution to protect your customer’s investment. Note: In order to really achieve protection for both you and your client you would add Automated Builds, Automated Tests, Automated Acceptance tests, Acceptance test tracking, Unit Tests, Load tests, Web test and all the other good engineering practices that help produce reliable software.     Figure: After the Sprint Planning meeting the process begins again. Where the Sprint Review and Retrospective meetings mark the end of the Sprint, the Sprint Planning meeting marks the beginning. After you have completed your Sprint Planning and you know what you are trying to achieve in Sprint 2 you can create your new Branch to develop in. How do we handle a bug(s) in production that can’t wait? Although in Scrum the only work done should be on the backlog there should be a little buffer added to the Sprint Planning for contingencies. One of these contingencies is a bug in the current release that can’t wait for the Sprint to finish. But how do you handle that? Willy-Peter Schaub asked an excellent question on the release activities: In reality Sprint 2 starts when sprint 1 ends + weekend. Should we not cater for a possible parallelism between Sprint 2 and the release activities of sprint 1? It would introduce FI’s from main to sprint 2, I guess. Your “Figure: Merging print 2 back into Main.” covers, what I tend to believe to be reality in most cases. - Willy-Peter Schaub, VS ALM Ranger, Microsoft I agree, and if you have a single Scrum team then your resources are limited. The Scrum Team is responsible for packaging and release, so at least one run at stabilization, package and release should be included in the Sprint time box. If more are needed on the current production release during the Sprint 2 time box then resource needs to be pulled from Sprint 2. The Product Owner and the Team have four choices (in order of disruption/cost): Backlog: Add the bug to the backlog and fix it in the next Sprint Buffer Time: Use any buffer time included in the current Sprint to fix the bug quickly Make time: Remove a Story from the current Sprint that is of equal value to the time lost fixing the bug(s) and releasing. Note: The Team must agree that it can still meet the Sprint Goal. Cancel Sprint: Cancel the sprint and concentrate all resource on fixing the bug(s) Note: This can be a very costly if the current sprint has already had a lot of work completed as it will be lost. The choice will depend on the complexity and severity of the bug(s) and both the Product Owner and the Team need to agree. In this case we will go with option #2 or #3 as they are uncomplicated but severe bugs. Figure: Real world issue where a bug needs fixed in the current release. If the bug(s) is urgent enough then then your only option is to fix it in place. You can edit the release branch to find and fix the bug, hopefully creating a test so it can’t happen again. Follow the prior process and conduct an internal and customer “Test Please” before releasing. You can read about how to conduct a Test Please on our Rules to Successful Projects: Do you conduct an internal "test please" prior to releasing a version to a client?   Figure: After you have fixed the bug you need to ship again. You then need to again create an RTM branch to hold the version of the code you released in escrow.   Figure: Main is now out of sync with your Release. We now need to get these new changes back up into the Main branch. Do a reverse and then forward merge again to get the new code into Main. But what about the branch, are developers not working on Sprint 2? Does Sprint 2 now have changes that are not in Main and Main now have changes that are not in Sprint 2? Well, yes… and this is part of the hit you take doing branching. But would this scenario even have been possible without branching?   Figure: Getting the changes in Main into Sprint 2 is very important. The Team now needs to do a Forward Integration merge into their Sprint and resolve any conflicts that occur. Maybe the bug has already been fixed in Sprint 2, maybe the bug no longer exists! This needs to be identified and resolved by the developers before they continue to get further out of Sync with Main. Note: Avoid the “Big bang merge” at all costs.   Figure: Merging Sprint 2 back into Main, the Forward Integration, and R0 terminates. Sprint 2 now merges (Reverse Integration) back into Main following the procedures we have already established.   Figure: The logical conclusion. This then allows the creation of the next release. By now you should be getting the big picture and hopefully you learned something useful from this post. I know I have enjoyed writing it as I find these exploratory posts coupled with real world experience really help harden my understanding.  Branching is a tool; it is not a silver bullet. Don’t over use it, and avoid “Anti-Patterns” where possible. Although the diagram above looks complicated I hope showing you how it is formed simplifies it as much as possible.   Technorati Tags: Branching,Scrum,VS ALM,TFS 2010,VS2010

    Read the article

  • Improving Partitioned Table Join Performance

    - by Paul White
    The query optimizer does not always choose an optimal strategy when joining partitioned tables. This post looks at an example, showing how a manual rewrite of the query can almost double performance, while reducing the memory grant to almost nothing. Test Data The two tables in this example use a common partitioning partition scheme. The partition function uses 41 equal-size partitions: CREATE PARTITION FUNCTION PFT (integer) AS RANGE RIGHT FOR VALUES ( 125000, 250000, 375000, 500000, 625000, 750000, 875000, 1000000, 1125000, 1250000, 1375000, 1500000, 1625000, 1750000, 1875000, 2000000, 2125000, 2250000, 2375000, 2500000, 2625000, 2750000, 2875000, 3000000, 3125000, 3250000, 3375000, 3500000, 3625000, 3750000, 3875000, 4000000, 4125000, 4250000, 4375000, 4500000, 4625000, 4750000, 4875000, 5000000 ); GO CREATE PARTITION SCHEME PST AS PARTITION PFT ALL TO ([PRIMARY]); There two tables are: CREATE TABLE dbo.T1 ( TID integer NOT NULL IDENTITY(0,1), Column1 integer NOT NULL, Padding binary(100) NOT NULL DEFAULT 0x,   CONSTRAINT PK_T1 PRIMARY KEY CLUSTERED (TID) ON PST (TID) );   CREATE TABLE dbo.T2 ( TID integer NOT NULL, Column1 integer NOT NULL, Padding binary(100) NOT NULL DEFAULT 0x,   CONSTRAINT PK_T2 PRIMARY KEY CLUSTERED (TID, Column1) ON PST (TID) ); The next script loads 5 million rows into T1 with a pseudo-random value between 1 and 5 for Column1. The table is partitioned on the IDENTITY column TID: INSERT dbo.T1 WITH (TABLOCKX) (Column1) SELECT (ABS(CHECKSUM(NEWID())) % 5) + 1 FROM dbo.Numbers AS N WHERE n BETWEEN 1 AND 5000000; In case you don’t already have an auxiliary table of numbers lying around, here’s a script to create one with 10 million rows: CREATE TABLE dbo.Numbers (n bigint PRIMARY KEY);   WITH L0 AS(SELECT 1 AS c UNION ALL SELECT 1), L1 AS(SELECT 1 AS c FROM L0 AS A CROSS JOIN L0 AS B), L2 AS(SELECT 1 AS c FROM L1 AS A CROSS JOIN L1 AS B), L3 AS(SELECT 1 AS c FROM L2 AS A CROSS JOIN L2 AS B), L4 AS(SELECT 1 AS c FROM L3 AS A CROSS JOIN L3 AS B), L5 AS(SELECT 1 AS c FROM L4 AS A CROSS JOIN L4 AS B), Nums AS(SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS n FROM L5) INSERT dbo.Numbers WITH (TABLOCKX) SELECT TOP (10000000) n FROM Nums ORDER BY n OPTION (MAXDOP 1); Table T1 contains data like this: Next we load data into table T2. The relationship between the two tables is that table 2 contains ‘n’ rows for each row in table 1, where ‘n’ is determined by the value in Column1 of table T1. There is nothing particularly special about the data or distribution, by the way. INSERT dbo.T2 WITH (TABLOCKX) (TID, Column1) SELECT T.TID, N.n FROM dbo.T1 AS T JOIN dbo.Numbers AS N ON N.n >= 1 AND N.n <= T.Column1; Table T2 ends up containing about 15 million rows: The primary key for table T2 is a combination of TID and Column1. The data is partitioned according to the value in column TID alone. Partition Distribution The following query shows the number of rows in each partition of table T1: SELECT PartitionID = CA1.P, NumRows = COUNT_BIG(*) FROM dbo.T1 AS T CROSS APPLY (VALUES ($PARTITION.PFT(TID))) AS CA1 (P) GROUP BY CA1.P ORDER BY CA1.P; There are 40 partitions containing 125,000 rows (40 * 125k = 5m rows). The rightmost partition remains empty. The next query shows the distribution for table 2: SELECT PartitionID = CA1.P, NumRows = COUNT_BIG(*) FROM dbo.T2 AS T CROSS APPLY (VALUES ($PARTITION.PFT(TID))) AS CA1 (P) GROUP BY CA1.P ORDER BY CA1.P; There are roughly 375,000 rows in each partition (the rightmost partition is also empty): Ok, that’s the test data done. Test Query and Execution Plan The task is to count the rows resulting from joining tables 1 and 2 on the TID column: SET STATISTICS IO ON; DECLARE @s datetime2 = SYSUTCDATETIME();   SELECT COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID;   SELECT DATEDIFF(Millisecond, @s, SYSUTCDATETIME()); SET STATISTICS IO OFF; The optimizer chooses a plan using parallel hash join, and partial aggregation: The Plan Explorer plan tree view shows accurate cardinality estimates and an even distribution of rows across threads (click to enlarge the image): With a warm data cache, the STATISTICS IO output shows that no physical I/O was needed, and all 41 partitions were touched: Running the query without actual execution plan or STATISTICS IO information for maximum performance, the query returns in around 2600ms. Execution Plan Analysis The first step toward improving on the execution plan produced by the query optimizer is to understand how it works, at least in outline. The two parallel Clustered Index Scans use multiple threads to read rows from tables T1 and T2. Parallel scan uses a demand-based scheme where threads are given page(s) to scan from the table as needed. This arrangement has certain important advantages, but does result in an unpredictable distribution of rows amongst threads. The point is that multiple threads cooperate to scan the whole table, but it is impossible to predict which rows end up on which threads. For correct results from the parallel hash join, the execution plan has to ensure that rows from T1 and T2 that might join are processed on the same thread. For example, if a row from T1 with join key value ‘1234’ is placed in thread 5’s hash table, the execution plan must guarantee that any rows from T2 that also have join key value ‘1234’ probe thread 5’s hash table for matches. The way this guarantee is enforced in this parallel hash join plan is by repartitioning rows to threads after each parallel scan. The two repartitioning exchanges route rows to threads using a hash function over the hash join keys. The two repartitioning exchanges use the same hash function so rows from T1 and T2 with the same join key must end up on the same hash join thread. Expensive Exchanges This business of repartitioning rows between threads can be very expensive, especially if a large number of rows is involved. The execution plan selected by the optimizer moves 5 million rows through one repartitioning exchange and around 15 million across the other. As a first step toward removing these exchanges, consider the execution plan selected by the optimizer if we join just one partition from each table, disallowing parallelism: SELECT COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID WHERE $PARTITION.PFT(T1.TID) = 1 AND $PARTITION.PFT(T2.TID) = 1 OPTION (MAXDOP 1); The optimizer has chosen a (one-to-many) merge join instead of a hash join. The single-partition query completes in around 100ms. If everything scaled linearly, we would expect that extending this strategy to all 40 populated partitions would result in an execution time around 4000ms. Using parallelism could reduce that further, perhaps to be competitive with the parallel hash join chosen by the optimizer. This raises a question. If the most efficient way to join one partition from each of the tables is to use a merge join, why does the optimizer not choose a merge join for the full query? Forcing a Merge Join Let’s force the optimizer to use a merge join on the test query using a hint: SELECT COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID OPTION (MERGE JOIN); This is the execution plan selected by the optimizer: This plan results in the same number of logical reads reported previously, but instead of 2600ms the query takes 5000ms. The natural explanation for this drop in performance is that the merge join plan is only using a single thread, whereas the parallel hash join plan could use multiple threads. Parallel Merge Join We can get a parallel merge join plan using the same query hint as before, and adding trace flag 8649: SELECT COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID OPTION (MERGE JOIN, QUERYTRACEON 8649); The execution plan is: This looks promising. It uses a similar strategy to distribute work across threads as seen for the parallel hash join. In practice though, performance is disappointing. On a typical run, the parallel merge plan runs for around 8400ms; slower than the single-threaded merge join plan (5000ms) and much worse than the 2600ms for the parallel hash join. We seem to be going backwards! The logical reads for the parallel merge are still exactly the same as before, with no physical IOs. The cardinality estimates and thread distribution are also still very good (click to enlarge): A big clue to the reason for the poor performance is shown in the wait statistics (captured by Plan Explorer Pro): CXPACKET waits require careful interpretation, and are most often benign, but in this case excessive waiting occurs at the repartitioning exchanges. Unlike the parallel hash join, the repartitioning exchanges in this plan are order-preserving ‘merging’ exchanges (because merge join requires ordered inputs): Parallelism works best when threads can just grab any available unit of work and get on with processing it. Preserving order introduces inter-thread dependencies that can easily lead to significant waits occurring. In extreme cases, these dependencies can result in an intra-query deadlock, though the details of that will have to wait for another time to explore in detail. The potential for waits and deadlocks leads the query optimizer to cost parallel merge join relatively highly, especially as the degree of parallelism (DOP) increases. This high costing resulted in the optimizer choosing a serial merge join rather than parallel in this case. The test results certainly confirm its reasoning. Collocated Joins In SQL Server 2008 and later, the optimizer has another available strategy when joining tables that share a common partition scheme. This strategy is a collocated join, also known as as a per-partition join. It can be applied in both serial and parallel execution plans, though it is limited to 2-way joins in the current optimizer. Whether the optimizer chooses a collocated join or not depends on cost estimation. The primary benefits of a collocated join are that it eliminates an exchange and requires less memory, as we will see next. Costing and Plan Selection The query optimizer did consider a collocated join for our original query, but it was rejected on cost grounds. The parallel hash join with repartitioning exchanges appeared to be a cheaper option. There is no query hint to force a collocated join, so we have to mess with the costing framework to produce one for our test query. Pretending that IOs cost 50 times more than usual is enough to convince the optimizer to use collocated join with our test query: -- Pretend IOs are 50x cost temporarily DBCC SETIOWEIGHT(50);   -- Co-located hash join SELECT COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID OPTION (RECOMPILE);   -- Reset IO costing DBCC SETIOWEIGHT(1); Collocated Join Plan The estimated execution plan for the collocated join is: The Constant Scan contains one row for each partition of the shared partitioning scheme, from 1 to 41. The hash repartitioning exchanges seen previously are replaced by a single Distribute Streams exchange using Demand partitioning. Demand partitioning means that the next partition id is given to the next parallel thread that asks for one. My test machine has eight logical processors, and all are available for SQL Server to use. As a result, there are eight threads in the single parallel branch in this plan, each processing one partition from each table at a time. Once a thread finishes processing a partition, it grabs a new partition number from the Distribute Streams exchange…and so on until all partitions have been processed. It is important to understand that the parallel scans in this plan are different from the parallel hash join plan. Although the scans have the same parallelism icon, tables T1 and T2 are not being co-operatively scanned by multiple threads in the same way. Each thread reads a single partition of T1 and performs a hash match join with the same partition from table T2. The properties of the two Clustered Index Scans show a Seek Predicate (unusual for a scan!) limiting the rows to a single partition: The crucial point is that the join between T1 and T2 is on TID, and TID is the partitioning column for both tables. A thread that processes partition ‘n’ is guaranteed to see all rows that can possibly join on TID for that partition. In addition, no other thread will see rows from that partition, so this removes the need for repartitioning exchanges. CPU and Memory Efficiency Improvements The collocated join has removed two expensive repartitioning exchanges and added a single exchange processing 41 rows (one for each partition id). Remember, the parallel hash join plan exchanges had to process 5 million and 15 million rows. The amount of processor time spent on exchanges will be much lower in the collocated join plan. In addition, the collocated join plan has a maximum of 8 threads processing single partitions at any one time. The 41 partitions will all be processed eventually, but a new partition is not started until a thread asks for it. Threads can reuse hash table memory for the new partition. The parallel hash join plan also had 8 hash tables, but with all 5,000,000 build rows loaded at the same time. The collocated plan needs memory for only 8 * 125,000 = 1,000,000 rows at any one time. Collocated Hash Join Performance The collated join plan has disappointing performance in this case. The query runs for around 25,300ms despite the same IO statistics as usual. This is much the worst result so far, so what went wrong? It turns out that cardinality estimation for the single partition scans of table T1 is slightly low. The properties of the Clustered Index Scan of T1 (graphic immediately above) show the estimation was for 121,951 rows. This is a small shortfall compared with the 125,000 rows actually encountered, but it was enough to cause the hash join to spill to physical tempdb: A level 1 spill doesn’t sound too bad, until you realize that the spill to tempdb probably occurs for each of the 41 partitions. As a side note, the cardinality estimation error is a little surprising because the system tables accurately show there are 125,000 rows in every partition of T1. Unfortunately, the optimizer uses regular column and index statistics to derive cardinality estimates here rather than system table information (e.g. sys.partitions). Collocated Merge Join We will never know how well the collocated parallel hash join plan might have worked without the cardinality estimation error (and the resulting 41 spills to tempdb) but we do know: Merge join does not require a memory grant; and Merge join was the optimizer’s preferred join option for a single partition join Putting this all together, what we would really like to see is the same collocated join strategy, but using merge join instead of hash join. Unfortunately, the current query optimizer cannot produce a collocated merge join; it only knows how to do collocated hash join. So where does this leave us? CROSS APPLY sys.partitions We can try to write our own collocated join query. We can use sys.partitions to find the partition numbers, and CROSS APPLY to get a count per partition, with a final step to sum the partial counts. The following query implements this idea: SELECT row_count = SUM(Subtotals.cnt) FROM ( -- Partition numbers SELECT p.partition_number FROM sys.partitions AS p WHERE p.[object_id] = OBJECT_ID(N'T1', N'U') AND p.index_id = 1 ) AS P CROSS APPLY ( -- Count per collocated join SELECT cnt = COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID WHERE $PARTITION.PFT(T1.TID) = p.partition_number AND $PARTITION.PFT(T2.TID) = p.partition_number ) AS SubTotals; The estimated plan is: The cardinality estimates aren’t all that good here, especially the estimate for the scan of the system table underlying the sys.partitions view. Nevertheless, the plan shape is heading toward where we would like to be. Each partition number from the system table results in a per-partition scan of T1 and T2, a one-to-many Merge Join, and a Stream Aggregate to compute the partial counts. The final Stream Aggregate just sums the partial counts. Execution time for this query is around 3,500ms, with the same IO statistics as always. This compares favourably with 5,000ms for the serial plan produced by the optimizer with the OPTION (MERGE JOIN) hint. This is another case of the sum of the parts being less than the whole – summing 41 partial counts from 41 single-partition merge joins is faster than a single merge join and count over all partitions. Even so, this single-threaded collocated merge join is not as quick as the original parallel hash join plan, which executed in 2,600ms. On the positive side, our collocated merge join uses only one logical processor and requires no memory grant. The parallel hash join plan used 16 threads and reserved 569 MB of memory:   Using a Temporary Table Our collocated merge join plan should benefit from parallelism. The reason parallelism is not being used is that the query references a system table. We can work around that by writing the partition numbers to a temporary table (or table variable): SET STATISTICS IO ON; DECLARE @s datetime2 = SYSUTCDATETIME();   CREATE TABLE #P ( partition_number integer PRIMARY KEY);   INSERT #P (partition_number) SELECT p.partition_number FROM sys.partitions AS p WHERE p.[object_id] = OBJECT_ID(N'T1', N'U') AND p.index_id = 1;   SELECT row_count = SUM(Subtotals.cnt) FROM #P AS p CROSS APPLY ( SELECT cnt = COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID WHERE $PARTITION.PFT(T1.TID) = p.partition_number AND $PARTITION.PFT(T2.TID) = p.partition_number ) AS SubTotals;   DROP TABLE #P;   SELECT DATEDIFF(Millisecond, @s, SYSUTCDATETIME()); SET STATISTICS IO OFF; Using the temporary table adds a few logical reads, but the overall execution time is still around 3500ms, indistinguishable from the same query without the temporary table. The problem is that the query optimizer still doesn’t choose a parallel plan for this query, though the removal of the system table reference means that it could if it chose to: In fact the optimizer did enter the parallel plan phase of query optimization (running search 1 for a second time): Unfortunately, the parallel plan found seemed to be more expensive than the serial plan. This is a crazy result, caused by the optimizer’s cost model not reducing operator CPU costs on the inner side of a nested loops join. Don’t get me started on that, we’ll be here all night. In this plan, everything expensive happens on the inner side of a nested loops join. Without a CPU cost reduction to compensate for the added cost of exchange operators, candidate parallel plans always look more expensive to the optimizer than the equivalent serial plan. Parallel Collocated Merge Join We can produce the desired parallel plan using trace flag 8649 again: SELECT row_count = SUM(Subtotals.cnt) FROM #P AS p CROSS APPLY ( SELECT cnt = COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID WHERE $PARTITION.PFT(T1.TID) = p.partition_number AND $PARTITION.PFT(T2.TID) = p.partition_number ) AS SubTotals OPTION (QUERYTRACEON 8649); The actual execution plan is: One difference between this plan and the collocated hash join plan is that a Repartition Streams exchange operator is used instead of Distribute Streams. The effect is similar, though not quite identical. The Repartition uses round-robin partitioning, meaning the next partition id is pushed to the next thread in sequence. The Distribute Streams exchange seen earlier used Demand partitioning, meaning the next partition id is pulled across the exchange by the next thread that is ready for more work. There are subtle performance implications for each partitioning option, but going into that would again take us too far off the main point of this post. Performance The important thing is the performance of this parallel collocated merge join – just 1350ms on a typical run. The list below shows all the alternatives from this post (all timings include creation, population, and deletion of the temporary table where appropriate) from quickest to slowest: Collocated parallel merge join: 1350ms Parallel hash join: 2600ms Collocated serial merge join: 3500ms Serial merge join: 5000ms Parallel merge join: 8400ms Collated parallel hash join: 25,300ms (hash spill per partition) The parallel collocated merge join requires no memory grant (aside from a paltry 1.2MB used for exchange buffers). This plan uses 16 threads at DOP 8; but 8 of those are (rather pointlessly) allocated to the parallel scan of the temporary table. These are minor concerns, but it turns out there is a way to address them if it bothers you. Parallel Collocated Merge Join with Demand Partitioning This final tweak replaces the temporary table with a hard-coded list of partition ids (dynamic SQL could be used to generate this query from sys.partitions): SELECT row_count = SUM(Subtotals.cnt) FROM ( VALUES (1),(2),(3),(4),(5),(6),(7),(8),(9),(10), (11),(12),(13),(14),(15),(16),(17),(18),(19),(20), (21),(22),(23),(24),(25),(26),(27),(28),(29),(30), (31),(32),(33),(34),(35),(36),(37),(38),(39),(40),(41) ) AS P (partition_number) CROSS APPLY ( SELECT cnt = COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID WHERE $PARTITION.PFT(T1.TID) = p.partition_number AND $PARTITION.PFT(T2.TID) = p.partition_number ) AS SubTotals OPTION (QUERYTRACEON 8649); The actual execution plan is: The parallel collocated hash join plan is reproduced below for comparison: The manual rewrite has another advantage that has not been mentioned so far: the partial counts (per partition) can be computed earlier than the partial counts (per thread) in the optimizer’s collocated join plan. The earlier aggregation is performed by the extra Stream Aggregate under the nested loops join. The performance of the parallel collocated merge join is unchanged at around 1350ms. Final Words It is a shame that the current query optimizer does not consider a collocated merge join (Connect item closed as Won’t Fix). The example used in this post showed an improvement in execution time from 2600ms to 1350ms using a modestly-sized data set and limited parallelism. In addition, the memory requirement for the query was almost completely eliminated  – down from 569MB to 1.2MB. The problem with the parallel hash join selected by the optimizer is that it attempts to process the full data set all at once (albeit using eight threads). It requires a large memory grant to hold all 5 million rows from table T1 across the eight hash tables, and does not take advantage of the divide-and-conquer opportunity offered by the common partitioning. The great thing about the collocated join strategies is that each parallel thread works on a single partition from both tables, reading rows, performing the join, and computing a per-partition subtotal, before moving on to a new partition. From a thread’s point of view… If you have trouble visualizing what is happening from just looking at the parallel collocated merge join execution plan, let’s look at it again, but from the point of view of just one thread operating between the two Parallelism (exchange) operators. Our thread picks up a single partition id from the Distribute Streams exchange, and starts a merge join using ordered rows from partition 1 of table T1 and partition 1 of table T2. By definition, this is all happening on a single thread. As rows join, they are added to a (per-partition) count in the Stream Aggregate immediately above the Merge Join. Eventually, either T1 (partition 1) or T2 (partition 1) runs out of rows and the merge join stops. The per-partition count from the aggregate passes on through the Nested Loops join to another Stream Aggregate, which is maintaining a per-thread subtotal. Our same thread now picks up a new partition id from the exchange (say it gets id 9 this time). The count in the per-partition aggregate is reset to zero, and the processing of partition 9 of both tables proceeds just as it did for partition 1, and on the same thread. Each thread picks up a single partition id and processes all the data for that partition, completely independently from other threads working on other partitions. One thread might eventually process partitions (1, 9, 17, 25, 33, 41) while another is concurrently processing partitions (2, 10, 18, 26, 34) and so on for the other six threads at DOP 8. The point is that all 8 threads can execute independently and concurrently, continuing to process new partitions until the wider job (of which the thread has no knowledge!) is done. This divide-and-conquer technique can be much more efficient than simply splitting the entire workload across eight threads all at once. Related Reading Understanding and Using Parallelism in SQL Server Parallel Execution Plans Suck © 2013 Paul White – All Rights Reserved Twitter: @SQL_Kiwi

    Read the article

  • Toorcon14

    - by danx
    Toorcon 2012 Information Security Conference San Diego, CA, http://www.toorcon.org/ Dan Anderson, October 2012 It's almost Halloween, and we all know what that means—yes, of course, it's time for another Toorcon Conference! Toorcon is an annual conference for people interested in computer security. This includes the whole range of hackers, computer hobbyists, professionals, security consultants, press, law enforcement, prosecutors, FBI, etc. We're at Toorcon 14—see earlier blogs for some of the previous Toorcon's I've attended (back to 2003). This year's "con" was held at the Westin on Broadway in downtown San Diego, California. The following are not necessarily my views—I'm just the messenger—although I could have misquoted or misparaphrased the speakers. Also, I only reviewed some of the talks, below, which I attended and interested me. MalAndroid—the Crux of Android Infections, Aditya K. Sood Programming Weird Machines with ELF Metadata, Rebecca "bx" Shapiro Privacy at the Handset: New FCC Rules?, Valkyrie Hacking Measured Boot and UEFI, Dan Griffin You Can't Buy Security: Building the Open Source InfoSec Program, Boris Sverdlik What Journalists Want: The Investigative Reporters' Perspective on Hacking, Dave Maas & Jason Leopold Accessibility and Security, Anna Shubina Stop Patching, for Stronger PCI Compliance, Adam Brand McAfee Secure & Trustmarks — a Hacker's Best Friend, Jay James & Shane MacDougall MalAndroid—the Crux of Android Infections Aditya K. Sood, IOActive, Michigan State PhD candidate Aditya talked about Android smartphone malware. There's a lot of old Android software out there—over 50% Gingerbread (2.3.x)—and most have unpatched vulnerabilities. Of 9 Android vulnerabilities, 8 have known exploits (such as the old Gingerbread Global Object Table exploit). Android protection includes sandboxing, security scanner, app permissions, and screened Android app market. The Android permission checker has fine-grain resource control, policy enforcement. Android static analysis also includes a static analysis app checker (bouncer), and a vulnerablity checker. What security problems does Android have? User-centric security, which depends on the user to grant permission and make smart decisions. But users don't care or think about malware (the're not aware, not paranoid). All they want is functionality, extensibility, mobility Android had no "proper" encryption before Android 3.0 No built-in protection against social engineering and web tricks Alternative Android app markets are unsafe. Simply visiting some markets can infect Android Aditya classified Android Malware types as: Type A—Apps. These interact with the Android app framework. For example, a fake Netflix app. Or Android Gold Dream (game), which uploads user files stealthy manner to a remote location. Type K—Kernel. Exploits underlying Linux libraries or kernel Type H—Hybrid. These use multiple layers (app framework, libraries, kernel). These are most commonly used by Android botnets, which are popular with Chinese botnet authors What are the threats from Android malware? These incude leak info (contacts), banking fraud, corporate network attacks, malware advertising, malware "Hackivism" (the promotion of social causes. For example, promiting specific leaders of the Tunisian or Iranian revolutions. Android malware is frequently "masquerated". That is, repackaged inside a legit app with malware. To avoid detection, the hidden malware is not unwrapped until runtime. The malware payload can be hidden in, for example, PNG files. Less common are Android bootkits—there's not many around. What they do is hijack the Android init framework—alteering system programs and daemons, then deletes itself. For example, the DKF Bootkit (China). Android App Problems: no code signing! all self-signed native code execution permission sandbox — all or none alternate market places no robust Android malware detection at network level delayed patch process Programming Weird Machines with ELF Metadata Rebecca "bx" Shapiro, Dartmouth College, NH https://github.com/bx/elf-bf-tools @bxsays on twitter Definitions. "ELF" is an executable file format used in linking and loading executables (on UNIX/Linux-class machines). "Weird machine" uses undocumented computation sources (I think of them as unintended virtual machines). Some examples of "weird machines" are those that: return to weird location, does SQL injection, corrupts the heap. Bx then talked about using ELF metadata as (an uintended) "weird machine". Some ELF background: A compiler takes source code and generates a ELF object file (hello.o). A static linker makes an ELF executable from the object file. A runtime linker and loader takes ELF executable and loads and relocates it in memory. The ELF file has symbols to relocate functions and variables. ELF has two relocation tables—one at link time and another one at loading time: .rela.dyn (link time) and .dynsym (dynamic table). GOT: Global Offset Table of addresses for dynamically-linked functions. PLT: Procedure Linkage Tables—works with GOT. The memory layout of a process (not the ELF file) is, in order: program (+ heap), dynamic libraries, libc, ld.so, stack (which includes the dynamic table loaded into memory) For ELF, the "weird machine" is found and exploited in the loader. ELF can be crafted for executing viruses, by tricking runtime into executing interpreted "code" in the ELF symbol table. One can inject parasitic "code" without modifying the actual ELF code portions. Think of the ELF symbol table as an "assembly language" interpreter. It has these elements: instructions: Add, move, jump if not 0 (jnz) Think of symbol table entries as "registers" symbol table value is "contents" immediate values are constants direct values are addresses (e.g., 0xdeadbeef) move instruction: is a relocation table entry add instruction: relocation table "addend" entry jnz instruction: takes multiple relocation table entries The ELF weird machine exploits the loader by relocating relocation table entries. The loader will go on forever until told to stop. It stores state on stack at "end" and uses IFUNC table entries (containing function pointer address). The ELF weird machine, called "Brainfu*k" (BF) has: 8 instructions: pointer inc, dec, inc indirect, dec indirect, jump forward, jump backward, print. Three registers - 3 registers Bx showed example BF source code that implemented a Turing machine printing "hello, world". More interesting was the next demo, where bx modified ping. Ping runs suid as root, but quickly drops privilege. BF modified the loader to disable the library function call dropping privilege, so it remained as root. Then BF modified the ping -t argument to execute the -t filename as root. It's best to show what this modified ping does with an example: $ whoami bx $ ping localhost -t backdoor.sh # executes backdoor $ whoami root $ The modified code increased from 285948 bytes to 290209 bytes. A BF tool compiles "executable" by modifying the symbol table in an existing ELF executable. The tool modifies .dynsym and .rela.dyn table, but not code or data. Privacy at the Handset: New FCC Rules? "Valkyrie" (Christie Dudley, Santa Clara Law JD candidate) Valkyrie talked about mobile handset privacy. Some background: Senator Franken (also a comedian) became alarmed about CarrierIQ, where the carriers track their customers. Franken asked the FCC to find out what obligations carriers think they have to protect privacy. The carriers' response was that they are doing just fine with self-regulation—no worries! Carriers need to collect data, such as missed calls, to maintain network quality. But carriers also sell data for marketing. Verizon sells customer data and enables this with a narrow privacy policy (only 1 month to opt out, with difficulties). The data sold is not individually identifiable and is aggregated. But Verizon recommends, as an aggregation workaround to "recollate" data to other databases to identify customers indirectly. The FCC has regulated telephone privacy since 1934 and mobile network privacy since 2007. Also, the carriers say mobile phone privacy is a FTC responsibility (not FCC). FTC is trying to improve mobile app privacy, but FTC has no authority over carrier / customer relationships. As a side note, Apple iPhones are unique as carriers have extra control over iPhones they don't have with other smartphones. As a result iPhones may be more regulated. Who are the consumer advocates? Everyone knows EFF, but EPIC (Electrnic Privacy Info Center), although more obsecure, is more relevant. What to do? Carriers must be accountable. Opt-in and opt-out at any time. Carriers need incentive to grant users control for those who want it, by holding them liable and responsible for breeches on their clock. Location information should be added current CPNI privacy protection, and require "Pen/trap" judicial order to obtain (and would still be a lower standard than 4th Amendment). Politics are on a pro-privacy swing now, with many senators and the Whitehouse. There will probably be new regulation soon, and enforcement will be a problem, but consumers will still have some benefit. Hacking Measured Boot and UEFI Dan Griffin, JWSecure, Inc., Seattle, @JWSdan Dan talked about hacking measured UEFI boot. First some terms: UEFI is a boot technology that is replacing BIOS (has whitelisting and blacklisting). UEFI protects devices against rootkits. TPM - hardware security device to store hashs and hardware-protected keys "secure boot" can control at firmware level what boot images can boot "measured boot" OS feature that tracks hashes (from BIOS, boot loader, krnel, early drivers). "remote attestation" allows remote validation and control based on policy on a remote attestation server. Microsoft pushing TPM (Windows 8 required), but Google is not. Intel TianoCore is the only open source for UEFI. Dan has Measured Boot Tool at http://mbt.codeplex.com/ with a demo where you can also view TPM data. TPM support already on enterprise-class machines. UEFI Weaknesses. UEFI toolkits are evolving rapidly, but UEFI has weaknesses: assume user is an ally trust TPM implicitly, and attached to computer hibernate file is unprotected (disk encryption protects against this) protection migrating from hardware to firmware delays in patching and whitelist updates will UEFI really be adopted by the mainstream (smartphone hardware support, bank support, apathetic consumer support) You Can't Buy Security: Building the Open Source InfoSec Program Boris Sverdlik, ISDPodcast.com co-host Boris talked about problems typical with current security audits. "IT Security" is an oxymoron—IT exists to enable buiness, uptime, utilization, reporting, but don't care about security—IT has conflict of interest. There's no Magic Bullet ("blinky box"), no one-size-fits-all solution (e.g., Intrusion Detection Systems (IDSs)). Regulations don't make you secure. The cloud is not secure (because of shared data and admin access). Defense and pen testing is not sexy. Auditors are not solution (security not a checklist)—what's needed is experience and adaptability—need soft skills. Step 1: First thing is to Google and learn the company end-to-end before you start. Get to know the management team (not IT team), meet as many people as you can. Don't use arbitrary values such as CISSP scores. Quantitive risk assessment is a myth (e.g. AV*EF-SLE). Learn different Business Units, legal/regulatory obligations, learn the business and where the money is made, verify company is protected from script kiddies (easy), learn sensitive information (IP, internal use only), and start with low-hanging fruit (customer service reps and social engineering). Step 2: Policies. Keep policies short and relevant. Generic SANS "security" boilerplate policies don't make sense and are not followed. Focus on acceptable use, data usage, communications, physical security. Step 3: Implementation: keep it simple stupid. Open source, although useful, is not free (implementation cost). Access controls with authentication & authorization for local and remote access. MS Windows has it, otherwise use OpenLDAP, OpenIAM, etc. Application security Everyone tries to reinvent the wheel—use existing static analysis tools. Review high-risk apps and major revisions. Don't run different risk level apps on same system. Assume host/client compromised and use app-level security control. Network security VLAN != segregated because there's too many workarounds. Use explicit firwall rules, active and passive network monitoring (snort is free), disallow end user access to production environment, have a proxy instead of direct Internet access. Also, SSL certificates are not good two-factor auth and SSL does not mean "safe." Operational Controls Have change, patch, asset, & vulnerability management (OSSI is free). For change management, always review code before pushing to production For logging, have centralized security logging for business-critical systems, separate security logging from administrative/IT logging, and lock down log (as it has everything). Monitor with OSSIM (open source). Use intrusion detection, but not just to fulfill a checkbox: build rules from a whitelist perspective (snort). OSSEC has 95% of what you need. Vulnerability management is a QA function when done right: OpenVas and Seccubus are free. Security awareness The reality is users will always click everything. Build real awareness, not compliance driven checkbox, and have it integrated into the culture. Pen test by crowd sourcing—test with logging COSSP http://www.cossp.org/ - Comprehensive Open Source Security Project What Journalists Want: The Investigative Reporters' Perspective on Hacking Dave Maas, San Diego CityBeat Jason Leopold, Truthout.org The difference between hackers and investigative journalists: For hackers, the motivation varies, but method is same, technological specialties. For investigative journalists, it's about one thing—The Story, and they need broad info-gathering skills. J-School in 60 Seconds: Generic formula: Person or issue of pubic interest, new info, or angle. Generic criteria: proximity, prominence, timeliness, human interest, oddity, or consequence. Media awareness of hackers and trends: journalists becoming extremely aware of hackers with congressional debates (privacy, data breaches), demand for data-mining Journalists, use of coding and web development for Journalists, and Journalists busted for hacking (Murdock). Info gathering by investigative journalists include Public records laws. Federal Freedom of Information Act (FOIA) is good, but slow. California Public Records Act is a lot stronger. FOIA takes forever because of foot-dragging—it helps to be specific. Often need to sue (especially FBI). CPRA is faster, and requests can be vague. Dumps and leaks (a la Wikileaks) Journalists want: leads, protecting ourselves, our sources, and adapting tools for news gathering (Google hacking). Anonomity is important to whistleblowers. They want no digital footprint left behind (e.g., email, web log). They don't trust encryption, want to feel safe and secure. Whistleblower laws are very weak—there's no upside for whistleblowers—they have to be very passionate to do it. Accessibility and Security or: How I Learned to Stop Worrying and Love the Halting Problem Anna Shubina, Dartmouth College Anna talked about how accessibility and security are related. Accessibility of digital content (not real world accessibility). mostly refers to blind users and screenreaders, for our purpose. Accessibility is about parsing documents, as are many security issues. "Rich" executable content causes accessibility to fail, and often causes security to fail. For example MS Word has executable format—it's not a document exchange format—more dangerous than PDF or HTML. Accessibility is often the first and maybe only sanity check with parsing. They have no choice because someone may want to read what you write. Google, for example, is very particular about web browser you use and are bad at supporting other browsers. Uses JavaScript instead of links, often requiring mouseover to display content. PDF is a security nightmare. Executible format, embedded flash, JavaScript, etc. 15 million lines of code. Google Chrome doesn't handle PDF correctly, causing several security bugs. PDF has an accessibility checker and PDF tagging, to help with accessibility. But no PDF checker checks for incorrect tags, untagged content, or validates lists or tables. None check executable content at all. The "Halting Problem" is: can one decide whether a program will ever stop? The answer, in general, is no (Rice's theorem). The same holds true for accessibility checkers. Language-theoretic Security says complicated data formats are hard to parse and cannot be solved due to the Halting Problem. W3C Web Accessibility Guidelines: "Perceivable, Operable, Understandable, Robust" Not much help though, except for "Robust", but here's some gems: * all information should be parsable (paraphrasing) * if not parsable, cannot be converted to alternate formats * maximize compatibility in new document formats Executible webpages are bad for security and accessibility. They say it's for a better web experience. But is it necessary to stuff web pages with JavaScript for a better experience? A good example is The Drudge Report—it has hand-written HTML with no JavaScript, yet drives a lot of web traffic due to good content. A bad example is Google News—hidden scrollbars, guessing user input. Solutions: Accessibility and security problems come from same source Expose "better user experience" myth Keep your corner of Internet parsable Remember "Halting Problem"—recognize false solutions (checking and verifying tools) Stop Patching, for Stronger PCI Compliance Adam Brand, protiviti @adamrbrand, http://www.picfun.com/ Adam talked about PCI compliance for retail sales. Take an example: for PCI compliance, 50% of Brian's time (a IT guy), 960 hours/year was spent patching POSs in 850 restaurants. Often applying some patches make no sense (like fixing a browser vulnerability on a server). "Scanner worship" is overuse of vulnerability scanners—it gives a warm and fuzzy and it's simple (red or green results—fix reds). Scanners give a false sense of security. In reality, breeches from missing patches are uncommon—more common problems are: default passwords, cleartext authentication, misconfiguration (firewall ports open). Patching Myths: Myth 1: install within 30 days of patch release (but PCI §6.1 allows a "risk-based approach" instead). Myth 2: vendor decides what's critical (also PCI §6.1). But §6.2 requires user ranking of vulnerabilities instead. Myth 3: scan and rescan until it passes. But PCI §11.2.1b says this applies only to high-risk vulnerabilities. Adam says good recommendations come from NIST 800-40. Instead use sane patching and focus on what's really important. From NIST 800-40: Proactive: Use a proactive vulnerability management process: use change control, configuration management, monitor file integrity. Monitor: start with NVD and other vulnerability alerts, not scanner results. Evaluate: public-facing system? workstation? internal server? (risk rank) Decide:on action and timeline Test: pre-test patches (stability, functionality, rollback) for change control Install: notify, change control, tickets McAfee Secure & Trustmarks — a Hacker's Best Friend Jay James, Shane MacDougall, Tactical Intelligence Inc., Canada "McAfee Secure Trustmark" is a website seal marketed by McAfee. A website gets this badge if they pass their remote scanning. The problem is a removal of trustmarks act as flags that you're vulnerable. Easy to view status change by viewing McAfee list on website or on Google. "Secure TrustGuard" is similar to McAfee. Jay and Shane wrote Perl scripts to gather sites from McAfee and search engines. If their certification image changes to a 1x1 pixel image, then they are longer certified. Their scripts take deltas of scans to see what changed daily. The bottom line is change in TrustGuard status is a flag for hackers to attack your site. Entire idea of seals is silly—you're raising a flag saying if you're vulnerable.

    Read the article

  • Using BPEL Performance Statistics to Diagnose Performance Bottlenecks

    - by fip
    Tuning performance of Oracle SOA 11G applications could be challenging. Because SOA is a platform for you to build composite applications that connect many applications and "services", when the overall performance is slow, the bottlenecks could be anywhere in the system: the applications/services that SOA connects to, the infrastructure database, or the SOA server itself.How to quickly identify the bottleneck becomes crucial in tuning the overall performance. Fortunately, the BPEL engine in Oracle SOA 11G (and 10G, for that matter) collects BPEL Engine Performance Statistics, which show the latencies of low level BPEL engine activities. The BPEL engine performance statistics can make it a bit easier for you to identify the performance bottleneck. Although the BPEL engine performance statistics are always available, the access to and interpretation of them are somewhat obscure in the early and current (PS5) 11G versions. This blog attempts to offer instructions that help you to enable, retrieve and interpret the performance statistics, before the future versions provides a more pleasant user experience. Overview of BPEL Engine Performance Statistics  SOA BPEL has a feature of collecting some performance statistics and store them in memory. One MBean attribute, StatLastN, configures the size of the memory buffer to store the statistics. This memory buffer is a "moving window", in a way that old statistics will be flushed out by the new if the amount of data exceeds the buffer size. Since the buffer size is limited by StatLastN, impacts of statistics collection on performance is minimal. By default StatLastN=-1, which means no collection of performance data. Once the statistics are collected in the memory buffer, they can be retrieved via another MBean oracle.as.soainfra.bpel:Location=[Server Name],name=BPELEngine,type=BPELEngine.> My friend in Oracle SOA development wrote this simple 'bpelstat' web app that looks up and retrieves the performance data from the MBean and displays it in a human readable form. It does not have beautiful UI but it is fairly useful. Although in Oracle SOA 11.1.1.5 onwards the same statistics can be viewed via a more elegant UI under "request break down" at EM -> SOA Infrastructure -> Service Engines -> BPEL -> Statistics, some unsophisticated minds like mine may still prefer the simplicity of the 'bpelstat' JSP. One thing that simple JSP does do well is that you can save the page and send it to someone to further analyze Follows are the instructions of how to install and invoke the BPEL statistic JSP. My friend in SOA Development will soon blog about interpreting the statistics. Stay tuned. Step1: Enable BPEL Engine Statistics for Each SOA Servers via Enterprise Manager First st you need to set the StatLastN to some number as a way to enable the collection of BPEL Engine Performance Statistics EM Console -> soa-infra(Server Name) -> SOA Infrastructure -> SOA Administration -> BPEL Properties Click on "More BPEL Configuration Properties" Click on attribute "StatLastN", set its value to some integer number. Typically you want to set it 1000 or more. Step 2: Download and Deploy bpelstat.war File to Admin Server, Note: the WAR file contains a JSP that does NOT have any security restriction. You do NOT want to keep in your production server for a long time as it is a security hazard. Deactivate the war once you are done. Download the bpelstat.war to your local PC At WebLogic Console, Go to Deployments -> Install Click on the "upload your file(s)" Click the "Browse" button to upload the deployment to Admin Server Accept the uploaded file as the path, click next Check the default option "Install this deployment as an application" Check "AdminServer" as the target server Finish the rest of the deployment with default settings Console -> Deployments Check the box next to "bpelstat" application Click on the "Start" button. It will change the state of the app from "prepared" to "active" Step 3: Invoke the BPEL Statistic Tool The BPELStat tool merely call the MBean of BPEL server and collects and display the in-memory performance statics. You usually want to do that after some peak loads. Go to http://<admin-server-host>:<admin-server-port>/bpelstat Enter the correct admin hostname, port, username and password Enter the SOA Server Name from which you want to collect the performance statistics. For example, SOA_MS1, etc. Click Submit Keep doing the same for all SOA servers. Step 3: Interpret the BPEL Engine Statistics You will see a few categories of BPEL Statistics from the JSP Page. First it starts with the overall latency of BPEL processes, grouped by synchronous and asynchronous processes. Then it provides the further break down of the measurements through the life time of a BPEL request, which is called the "request break down". 1. Overall latency of BPEL processes The top of the page shows that the elapse time of executing the synchronous process TestSyncBPELProcess from the composite TestComposite averages at about 1543.21ms, while the elapse time of executing the asynchronous process TestAsyncBPELProcess from the composite TestComposite2 averages at about 1765.43ms. The maximum and minimum latency were also shown. Synchronous process statistics <statistics>     <stats key="default/TestComposite!2.0.2-ScopedJMSOSB*soa_bfba2527-a9ba-41a7-95c5-87e49c32f4ff/TestSyncBPELProcess" min="1234" max="4567" average="1543.21" count="1000">     </stats> </statistics> Asynchronous process statistics <statistics>     <stats key="default/TestComposite2!2.0.2-ScopedJMSOSB*soa_bfba2527-a9ba-41a7-95c5-87e49c32f4ff/TestAsyncBPELProcess" min="2234" max="3234" average="1765.43" count="1000">     </stats> </statistics> 2. Request break down Under the overall latency categorized by synchronous and asynchronous processes is the "Request breakdown". Organized by statistic keys, the Request breakdown gives finer grain performance statistics through the life time of the BPEL requests.It uses indention to show the hierarchy of the statistics. Request breakdown <statistics>     <stats key="eng-composite-request" min="0" max="0" average="0.0" count="0">         <stats key="eng-single-request" min="22" max="606" average="258.43" count="277">             <stats key="populate-context" min="0" max="0" average="0.0" count="248"> Please note that in SOA 11.1.1.6, the statistics under Request breakdown is aggregated together cross all the BPEL processes based on statistic keys. It does not differentiate between BPEL processes. If two BPEL processes happen to have the statistic that share same statistic key, the statistics from two BPEL processes will be aggregated together. Keep this in mind when we go through more details below. 2.1 BPEL process activity latencies A very useful measurement in the Request Breakdown is the performance statistics of the BPEL activities you put in your BPEL processes: Assign, Invoke, Receive, etc. The names of the measurement in the JSP page directly come from the names to assign to each BPEL activity. These measurements are under the statistic key "actual-perform" Example 1:  Follows is the measurement for BPEL activity "AssignInvokeCreditProvider_Input", which looks like the Assign activity in a BPEL process that assign an input variable before passing it to the invocation:                                <stats key="AssignInvokeCreditProvider_Input" min="1" max="8" average="1.9" count="153">                                     <stats key="sensor-send-activity-data" min="0" max="1" average="0.0" count="306">                                     </stats>                                     <stats key="sensor-send-variable-data" min="0" max="0" average="0.0" count="153">                                     </stats>                                     <stats key="monitor-send-activity-data" min="0" max="0" average="0.0" count="306">                                     </stats>                                 </stats> Note: because as previously mentioned that the statistics cross all BPEL processes are aggregated together based on statistic keys, if two BPEL processes happen to name their Invoke activity the same name, they will show up at one measurement (i.e. statistic key). Example 2: Follows is the measurement of BPEL activity called "InvokeCreditProvider". You can not only see that by average it takes 3.31ms to finish this call (pretty fast) but also you can see from the further break down that most of this 3.31 ms was spent on the "invoke-service".                                  <stats key="InvokeCreditProvider" min="1" max="13" average="3.31" count="153">                                     <stats key="initiate-correlation-set-again" min="0" max="0" average="0.0" count="153">                                     </stats>                                     <stats key="invoke-service" min="1" max="13" average="3.08" count="153">                                         <stats key="prep-call" min="0" max="1" average="0.04" count="153">                                         </stats>                                     </stats>                                     <stats key="initiate-correlation-set" min="0" max="0" average="0.0" count="153">                                     </stats>                                     <stats key="sensor-send-activity-data" min="0" max="0" average="0.0" count="306">                                     </stats>                                     <stats key="sensor-send-variable-data" min="0" max="0" average="0.0" count="153">                                     </stats>                                     <stats key="monitor-send-activity-data" min="0" max="0" average="0.0" count="306">                                     </stats>                                     <stats key="update-audit-trail" min="0" max="2" average="0.03" count="153">                                     </stats>                                 </stats> 2.2 BPEL engine activity latency Another type of measurements under Request breakdown are the latencies of underlying system level engine activities. These activities are not directly tied to a particular BPEL process or process activity, but they are critical factors in the overall engine performance. These activities include the latency of saving asynchronous requests to database, and latency of process dehydration. My friend Malkit Bhasin is working on providing more information on interpreting the statistics on engine activities on his blog (https://blogs.oracle.com/malkit/). I will update this blog once the information becomes available. Update on 2012-10-02: My friend Malkit Bhasin has published the detail interpretation of the BPEL service engine statistics at his blog http://malkit.blogspot.com/2012/09/oracle-bpel-engine-soa-suite.html.

    Read the article

  • Node.js Adventure - Host Node.js on Windows Azure Worker Role

    - by Shaun
    In my previous post I demonstrated about how to develop and deploy a Node.js application on Windows Azure Web Site (a.k.a. WAWS). WAWS is a new feature in Windows Azure platform. Since it’s low-cost, and it provides IIS and IISNode components so that we can host our Node.js application though Git, FTP and WebMatrix without any configuration and component installation. But sometimes we need to use the Windows Azure Cloud Service (a.k.a. WACS) and host our Node.js on worker role. Below are some benefits of using worker role. - WAWS leverages IIS and IISNode to host Node.js application, which runs in x86 WOW mode. It reduces the performance comparing with x64 in some cases. - WACS worker role does not need IIS, hence there’s no restriction of IIS, such as 8000 concurrent requests limitation. - WACS provides more flexibility and controls to the developers. For example, we can RDP to the virtual machines of our worker role instances. - WACS provides the service configuration features which can be changed when the role is running. - WACS provides more scaling capability than WAWS. In WAWS we can have at most 3 reserved instances per web site while in WACS we can have up to 20 instances in a subscription. - Since when using WACS worker role we starts the node by ourselves in a process, we can control the input, output and error stream. We can also control the version of Node.js.   Run Node.js in Worker Role Node.js can be started by just having its execution file. This means in Windows Azure, we can have a worker role with the “node.exe” and the Node.js source files, then start it in Run method of the worker role entry class. Let’s create a new windows azure project in Visual Studio and add a new worker role. Since we need our worker role execute the “node.exe” with our application code we need to add the “node.exe” into our project. Right click on the worker role project and add an existing item. By default the Node.js will be installed in the “Program Files\nodejs” folder so we can navigate there and add the “node.exe”. Then we need to create the entry code of Node.js. In WAWS the entry file must be named “server.js”, which is because it’s hosted by IIS and IISNode and IISNode only accept “server.js”. But here as we control everything we can choose any files as the entry code. For example, I created a new JavaScript file named “index.js” in project root. Since we created a C# Windows Azure project we cannot create a JavaScript file from the context menu “Add new item”. We have to create a text file, and then rename it to JavaScript extension. After we added these two files we should set their “Copy to Output Directory” property to “Copy Always”, or “Copy if Newer”. Otherwise they will not be involved in the package when deployed. Let’s paste a very simple Node.js code in the “index.js” as below. As you can see I created a web server listening at port 12345. 1: var http = require("http"); 2: var port = 12345; 3:  4: http.createServer(function (req, res) { 5: res.writeHead(200, { "Content-Type": "text/plain" }); 6: res.end("Hello World\n"); 7: }).listen(port); 8:  9: console.log("Server running at port %d", port); Then we need to start “node.exe” with this file when our worker role was started. This can be done in its Run method. I found the Node.js and entry JavaScript file name, and then create a new process to run it. Our worker role will wait for the process to be exited. If everything is OK once our web server was opened the process will be there listening for incoming requests, and should not be terminated. The code in worker role would be like this. 1: public override void Run() 2: { 3: // This is a sample worker implementation. Replace with your logic. 4: Trace.WriteLine("NodejsHost entry point called", "Information"); 5:  6: // retrieve the node.exe and entry node.js source code file name. 7: var node = Environment.ExpandEnvironmentVariables(@"%RoleRoot%\approot\node.exe"); 8: var js = "index.js"; 9:  10: // prepare the process starting of node.exe 11: var info = new ProcessStartInfo(node, js) 12: { 13: CreateNoWindow = false, 14: ErrorDialog = true, 15: WindowStyle = ProcessWindowStyle.Normal, 16: UseShellExecute = false, 17: WorkingDirectory = Environment.ExpandEnvironmentVariables(@"%RoleRoot%\approot") 18: }; 19: Trace.WriteLine(string.Format("{0} {1}", node, js), "Information"); 20:  21: // start the node.exe with entry code and wait for exit 22: var process = Process.Start(info); 23: process.WaitForExit(); 24: } Then we can run it locally. In the computer emulator UI the worker role started and it executed the Node.js, then Node.js windows appeared. Open the browser to verify the website hosted by our worker role. Next let’s deploy it to azure. But we need some additional steps. First, we need to create an input endpoint. By default there’s no endpoint defined in a worker role. So we will open the role property window in Visual Studio, create a new input TCP endpoint to the port we want our website to use. In this case I will use 80. Even though we created a web server we should add a TCP endpoint of the worker role, since Node.js always listen on TCP instead of HTTP. And then changed the “index.js”, let our web server listen on 80. 1: var http = require("http"); 2: var port = 80; 3:  4: http.createServer(function (req, res) { 5: res.writeHead(200, { "Content-Type": "text/plain" }); 6: res.end("Hello World\n"); 7: }).listen(port); 8:  9: console.log("Server running at port %d", port); Then publish it to Windows Azure. And then in browser we can see our Node.js website was running on WACS worker role. We may encounter an error if we tried to run our Node.js website on 80 port at local emulator. This is because the compute emulator registered 80 and map the 80 endpoint to 81. But our Node.js cannot detect this operation. So when it tried to listen on 80 it will failed since 80 have been used.   Use NPM Modules When we are using WAWS to host Node.js, we can simply install modules we need, and then just publish or upload all files to WAWS. But if we are using WACS worker role, we have to do some extra steps to make the modules work. Assuming that we plan to use “express” in our application. Firstly of all we should download and install this module through NPM command. But after the install finished, they are just in the disk but not included in the worker role project. If we deploy the worker role right now the module will not be packaged and uploaded to azure. Hence we need to add them to the project. On solution explorer window click the “Show all files” button, select the “node_modules” folder and in the context menu select “Include In Project”. But that not enough. We also need to make all files in this module to “Copy always” or “Copy if newer”, so that they can be uploaded to azure with the “node.exe” and “index.js”. This is painful step since there might be many files in a module. So I created a small tool which can update a C# project file, make its all items as “Copy always”. The code is very simple. 1: static void Main(string[] args) 2: { 3: if (args.Length < 1) 4: { 5: Console.WriteLine("Usage: copyallalways [project file]"); 6: return; 7: } 8:  9: var proj = args[0]; 10: File.Copy(proj, string.Format("{0}.bak", proj)); 11:  12: var xml = new XmlDocument(); 13: xml.Load(proj); 14: var nsManager = new XmlNamespaceManager(xml.NameTable); 15: nsManager.AddNamespace("pf", "http://schemas.microsoft.com/developer/msbuild/2003"); 16:  17: // add the output setting to copy always 18: var contentNodes = xml.SelectNodes("//pf:Project/pf:ItemGroup/pf:Content", nsManager); 19: UpdateNodes(contentNodes, xml, nsManager); 20: var noneNodes = xml.SelectNodes("//pf:Project/pf:ItemGroup/pf:None", nsManager); 21: UpdateNodes(noneNodes, xml, nsManager); 22: xml.Save(proj); 23:  24: // remove the namespace attributes 25: var content = xml.InnerXml.Replace("<CopyToOutputDirectory xmlns=\"\">", "<CopyToOutputDirectory>"); 26: xml.LoadXml(content); 27: xml.Save(proj); 28: } 29:  30: static void UpdateNodes(XmlNodeList nodes, XmlDocument xml, XmlNamespaceManager nsManager) 31: { 32: foreach (XmlNode node in nodes) 33: { 34: var copyToOutputDirectoryNode = node.SelectSingleNode("pf:CopyToOutputDirectory", nsManager); 35: if (copyToOutputDirectoryNode == null) 36: { 37: var n = xml.CreateNode(XmlNodeType.Element, "CopyToOutputDirectory", null); 38: n.InnerText = "Always"; 39: node.AppendChild(n); 40: } 41: else 42: { 43: if (string.Compare(copyToOutputDirectoryNode.InnerText, "Always", true) != 0) 44: { 45: copyToOutputDirectoryNode.InnerText = "Always"; 46: } 47: } 48: } 49: } Please be careful when use this tool. I created only for demo so do not use it directly in a production environment. Unload the worker role project, execute this tool with the worker role project file name as the command line argument, it will set all items as “Copy always”. Then reload this worker role project. Now let’s change the “index.js” to use express. 1: var express = require("express"); 2: var app = express(); 3:  4: var port = 80; 5:  6: app.configure(function () { 7: }); 8:  9: app.get("/", function (req, res) { 10: res.send("Hello Node.js!"); 11: }); 12:  13: app.get("/User/:id", function (req, res) { 14: var id = req.params.id; 15: res.json({ 16: "id": id, 17: "name": "user " + id, 18: "company": "IGT" 19: }); 20: }); 21:  22: app.listen(port); Finally let’s publish it and have a look in browser.   Use Windows Azure SQL Database We can use Windows Azure SQL Database (a.k.a. WACD) from Node.js as well on worker role hosting. Since we can control the version of Node.js, here we can use x64 version of “node-sqlserver” now. This is better than if we host Node.js on WAWS since it only support x86. Just install the “node-sqlserver” module from NPM, copy the “sqlserver.node” from “Build\Release” folder to “Lib” folder. Include them in worker role project and run my tool to make them to “Copy always”. Finally update the “index.js” to use WASD. 1: var express = require("express"); 2: var sql = require("node-sqlserver"); 3:  4: var connectionString = "Driver={SQL Server Native Client 10.0};Server=tcp:{SERVER NAME}.database.windows.net,1433;Database={DATABASE NAME};Uid={LOGIN}@{SERVER NAME};Pwd={PASSWORD};Encrypt=yes;Connection Timeout=30;"; 5: var port = 80; 6:  7: var app = express(); 8:  9: app.configure(function () { 10: app.use(express.bodyParser()); 11: }); 12:  13: app.get("/", function (req, res) { 14: sql.open(connectionString, function (err, conn) { 15: if (err) { 16: console.log(err); 17: res.send(500, "Cannot open connection."); 18: } 19: else { 20: conn.queryRaw("SELECT * FROM [Resource]", function (err, results) { 21: if (err) { 22: console.log(err); 23: res.send(500, "Cannot retrieve records."); 24: } 25: else { 26: res.json(results); 27: } 28: }); 29: } 30: }); 31: }); 32:  33: app.get("/text/:key/:culture", function (req, res) { 34: sql.open(connectionString, function (err, conn) { 35: if (err) { 36: console.log(err); 37: res.send(500, "Cannot open connection."); 38: } 39: else { 40: var key = req.params.key; 41: var culture = req.params.culture; 42: var command = "SELECT * FROM [Resource] WHERE [Key] = '" + key + "' AND [Culture] = '" + culture + "'"; 43: conn.queryRaw(command, function (err, results) { 44: if (err) { 45: console.log(err); 46: res.send(500, "Cannot retrieve records."); 47: } 48: else { 49: res.json(results); 50: } 51: }); 52: } 53: }); 54: }); 55:  56: app.get("/sproc/:key/:culture", function (req, res) { 57: sql.open(connectionString, function (err, conn) { 58: if (err) { 59: console.log(err); 60: res.send(500, "Cannot open connection."); 61: } 62: else { 63: var key = req.params.key; 64: var culture = req.params.culture; 65: var command = "EXEC GetItem '" + key + "', '" + culture + "'"; 66: conn.queryRaw(command, function (err, results) { 67: if (err) { 68: console.log(err); 69: res.send(500, "Cannot retrieve records."); 70: } 71: else { 72: res.json(results); 73: } 74: }); 75: } 76: }); 77: }); 78:  79: app.post("/new", function (req, res) { 80: var key = req.body.key; 81: var culture = req.body.culture; 82: var val = req.body.val; 83:  84: sql.open(connectionString, function (err, conn) { 85: if (err) { 86: console.log(err); 87: res.send(500, "Cannot open connection."); 88: } 89: else { 90: var command = "INSERT INTO [Resource] VALUES ('" + key + "', '" + culture + "', N'" + val + "')"; 91: conn.queryRaw(command, function (err, results) { 92: if (err) { 93: console.log(err); 94: res.send(500, "Cannot retrieve records."); 95: } 96: else { 97: res.send(200, "Inserted Successful"); 98: } 99: }); 100: } 101: }); 102: }); 103:  104: app.listen(port); Publish to azure and now we can see our Node.js is working with WASD through x64 version “node-sqlserver”.   Summary In this post I demonstrated how to host our Node.js in Windows Azure Cloud Service worker role. By using worker role we can control the version of Node.js, as well as the entry code. And it’s possible to do some pre jobs before the Node.js application started. It also removed the IIS and IISNode limitation. I personally recommended to use worker role as our Node.js hosting. But there are some problem if you use the approach I mentioned here. The first one is, we need to set all JavaScript files and module files as “Copy always” or “Copy if newer” manually. The second one is, in this way we cannot retrieve the cloud service configuration information. For example, we defined the endpoint in worker role property but we also specified the listening port in Node.js hardcoded. It should be changed that our Node.js can retrieve the endpoint. But I can tell you it won’t be working here. In the next post I will describe another way to execute the “node.exe” and Node.js application, so that we can get the cloud service configuration in Node.js. I will also demonstrate how to use Windows Azure Storage from Node.js by using the Windows Azure Node.js SDK.   Hope this helps, Shaun All documents and related graphics, codes are provided "AS IS" without warranty of any kind. Copyright © Shaun Ziyan Xu. This work is licensed under the Creative Commons License.

    Read the article

  • The broken Promise of the Mobile Web

    - by Rick Strahl
    High end mobile devices have been with us now for almost 7 years and they have utterly transformed the way we access information. Mobile phones and smartphones that have access to the Internet and host smart applications are in the hands of a large percentage of the population of the world. In many places even very remote, cell phones and even smart phones are a common sight. I’ll never forget when I was in India in 2011 I was up in the Southern Indian mountains riding an elephant out of a tiny local village, with an elephant herder in front riding atop of the elephant in front of us. He was dressed in traditional garb with the loin wrap and head cloth/turban as did quite a few of the locals in this small out of the way and not so touristy village. So we’re slowly trundling along in the forest and he’s lazily using his stick to guide the elephant and… 10 minutes in he pulls out his cell phone from his sash and starts texting. In the middle of texting a huge pig jumps out from the side of the trail and he takes a picture running across our path in the jungle! So yeah, mobile technology is very pervasive and it’s reached into even very buried and unexpected parts of this world. Apps are still King Apps currently rule the roost when it comes to mobile devices and the applications that run on them. If there’s something that you need on your mobile device your first step usually is to look for an app, not use your browser. But native app development remains a pain in the butt, with the requirement to have to support 2 or 3 completely separate platforms. There are solutions that try to bridge that gap. Xamarin is on a tear at the moment, providing their cross-device toolkit to build applications using C#. While Xamarin tools are impressive – and also *very* expensive – they only address part of the development madness that is app development. There are still specific device integration isssues, dealing with the different developer programs, security and certificate setups and all that other noise that surrounds app development. There’s also PhoneGap/Cordova which provides a hybrid solution that involves creating local HTML/CSS/JavaScript based applications, and then packaging them to run in a specialized App container that can run on most mobile device platforms using a WebView interface. This allows for using of HTML technology, but it also still requires all the set up, configuration of APIs, security keys and certification and submission and deployment process just like native applications – you actually lose many of the benefits that  Web based apps bring. The big selling point of Cordova is that you get to use HTML have the ability to build your UI once for all platforms and run across all of them – but the rest of the app process remains in place. Apps can be a big pain to create and manage especially when we are talking about specialized or vertical business applications that aren’t geared at the mainstream market and that don’t fit the ‘store’ model. If you’re building a small intra department application you don’t want to deal with multiple device platforms and certification etc. for various public or corporate app stores. That model is simply not a good fit both from the development and deployment perspective. Even for commercial, big ticket apps, HTML as a UI platform offers many advantages over native, from write-once run-anywhere, to remote maintenance, single point of management and failure to having full control over the application as opposed to have the app store overloads censor you. In a lot of ways Web based HTML/CSS/JavaScript applications have so much potential for building better solutions based on existing Web technologies for the very same reasons a lot of content years ago moved off the desktop to the Web. To me the Web as a mobile platform makes perfect sense, but the reality of today’s Mobile Web unfortunately looks a little different… Where’s the Love for the Mobile Web? Yet here we are in the middle of 2014, nearly 7 years after the first iPhone was released and brought the promise of rich interactive information at your fingertips, and yet we still don’t really have a solid mobile Web platform. I know what you’re thinking: “But we have lots of HTML/JavaScript/CSS features that allows us to build nice mobile interfaces”. I agree to a point – it’s actually quite possible to build nice looking, rich and capable Web UI today. We have media queries to deal with varied display sizes, CSS transforms for smooth animations and transitions, tons of CSS improvements in CSS 3 that facilitate rich layout, a host of APIs geared towards mobile device features and lately even a number of JavaScript framework choices that facilitate development of multi-screen apps in a consistent manner. Personally I’ve been working a lot with AngularJs and heavily modified Bootstrap themes to build mobile first UIs and that’s been working very well to provide highly usable and attractive UI for typical mobile business applications. From the pure UI perspective things actually look very good. Not just about the UI But it’s not just about the UI - it’s also about integration with the mobile device. When it comes to putting all those pieces together into what amounts to a consolidated platform to build mobile Web applications, I think we still have a ways to go… there are a lot of missing pieces to make it all work together and integrate with the device more smoothly, and more importantly to make it work uniformly across the majority of devices. I think there are a number of reasons for this. Slow Standards Adoption HTML standards implementations and ratification has been dreadfully slow, and browser vendors all seem to pick and choose different pieces of the technology they implement. The end result is that we have a capable UI platform that’s missing some of the infrastructure pieces to make it whole on mobile devices. There’s lots of potential but what is lacking that final 10% to build truly compelling mobile applications that can compete favorably with native applications. Some of it is the fragmentation of browsers and the slow evolution of the mobile specific HTML APIs. A host of mobile standards exist but many of the standards are in the early review stage and they have been there stuck for long periods of time and seem to move at a glacial pace. Browser vendors seem even slower to implement them, and for good reason – non-ratified standards mean that implementations may change and vendor implementations tend to be experimental and  likely have to be changed later. Neither Vendors or developers are not keen on changing standards. This is the typical chicken and egg scenario, but without some forward momentum from some party we end up stuck in the mud. It seems that either the standards bodies or the vendors need to carry the torch forward and that doesn’t seem to be happening quickly enough. Mobile Device Integration just isn’t good enough Current standards are not far reaching enough to address a number of the use case scenarios necessary for many mobile applications. While not every application needs to have access to all mobile device features, almost every mobile application could benefit from some integration with other parts of the mobile device platform. Integration with GPS, phone, media, messaging, notifications, linking and contacts system are benefits that are unique to mobile applications and could be widely used, but are mostly (with the exception of GPS) inaccessible for Web based applications today. Unfortunately trying to do most of this today only with a mobile Web browser is a losing battle. Aside from PhoneGap/Cordova’s app centric model with its own custom API accessing mobile device features and the token exception of the GeoLocation API, most device integration features are not widely supported by the current crop of mobile browsers. For example there’s no usable messaging API that allows access to SMS or contacts from HTML. Even obvious components like the Media Capture API are only implemented partially by mobile devices. There are alternatives and workarounds for some of these interfaces by using browser specific code, but that’s might ugly and something that I thought we were trying to leave behind with newer browser standards. But it’s not quite working out that way. It’s utterly perplexing to me that mobile standards like Media Capture and Streams, Media Gallery Access, Responsive Images, Messaging API, Contacts Manager API have only minimal or no traction at all today. Keep in mind we’ve had mobile browsers for nearly 7 years now, and yet we still have to think about how to get access to an image from the image gallery or the camera on some devices? Heck Windows Phone IE Mobile just gained the ability to upload images recently in the Windows 8.1 Update – that’s feature that HTML has had for 20 years! These are simple concepts and common problems that should have been solved a long time ago. It’s extremely frustrating to see build 90% of a mobile Web app with relative ease and then hit a brick wall for the remaining 10%, which often can be show stoppers. The remaining 10% have to do with platform integration, browser differences and working around the limitations that browsers and ‘pinned’ applications impose on HTML applications. The maddening part is that these limitations seem arbitrary as they could easily work on all mobile platforms. For example, SMS has a URL Moniker interface that sort of works on Android, works badly with iOS (only works if the address is already in the contact list) and not at all on Windows Phone. There’s no reason this shouldn’t work universally using the same interface – after all all phones have supported SMS since before the year 2000! But, it doesn’t have to be this way Change can happen very quickly. Take the GeoLocation API for example. Geolocation has taken off at the very beginning of the mobile device era and today it works well, provides the necessary security (a big concern for many mobile APIs), and is supported by just about all major mobile and even desktop browsers today. It handles security concerns via prompts to avoid unwanted access which is a model that would work for most other device APIs in a similar fashion. One time approval and occasional re-approval if code changes or caches expire. Simple and only slightly intrusive. It all works well, even though GeoLocation actually has some physical limitations, such as representing the current location when no GPS device is present. Yet this is a solved problem, where other APIs that are conceptually much simpler to implement have failed to gain any traction at all. Technically none of these APIs should be a problem to implement, but it appears that the momentum is just not there. Inadequate Web Application Linking and Activation Another important piece of the puzzle missing is the integration of HTML based Web applications. Today HTML based applications are not first class citizens on mobile operating systems. When talking about HTML based content there’s a big difference between content and applications. Content is great for search engine discovery and plain browser usage. Content is usually accessed intermittently and permanent linking is not so critical for this type of content.  But applications have different needs. Applications need to be started up quickly and must be easily switchable to support a multi-tasking user workflow. Therefore, it’s pretty crucial that mobile Web apps are integrated into the underlying mobile OS and work with the standard task management features. Unfortunately this integration is not as smooth as it should be. It starts with actually trying to find mobile Web applications, to ‘installing’ them onto a phone in an easily accessible manner in a prominent position. The experience of discovering a Mobile Web ‘App’ and making it sticky is by no means as easy or satisfying. Today the way you’d go about this is: Open the browser Search for a Web Site in the browser with your search engine of choice Hope that you find the right site Hope that you actually find a site that works for your mobile device Click on the link and run the app in a fully chrome’d browser instance (read tiny surface area) Pin the app to the home screen (with all the limitations outline above) Hope you pointed at the right URL when you pinned Even for you and me as developers, there are a few steps in there that are painful and annoying, but think about the average user. First figuring out how to search for a specific site or URL? And then pinning the app and hopefully from the right location? You’ve probably lost more than half of your audience at that point. This experience sucks. For developers too this process is painful since app developers can’t control the shortcut creation directly. This problem often gets solved by crazy coding schemes, with annoying pop-ups that try to get people to create shortcuts via fancy animations that are both annoying and add overhead to each and every application that implements this sort of thing differently. And that’s not the end of it - getting the link onto the home screen with an application icon varies quite a bit between browsers. Apple’s non-standard meta tags are prominent and they work with iOS and Android (only more recent versions), but not on Windows Phone. Windows Phone instead requires you to create an actual screen or rather a partial screen be captured for a shortcut in the tile manager. Who had that brilliant idea I wonder? Surprisingly Chrome on recent Android versions seems to actually get it right – icons use pngs, pinning is easy and pinned applications properly behave like standalone apps and retain the browser’s active page state and content. Each of the platforms has a different way to specify icons (WP doesn’t allow you to use an icon image at all), and the most widely used interface in use today is a bunch of Apple specific meta tags that other browsers choose to support. The question is: Why is there no standard implementation for installing shortcuts across mobile platforms using an official format rather than a proprietary one? Then there’s iOS and the crazy way it treats home screen linked URLs using a crazy hybrid format that is neither as capable as a Web app running in Safari nor a WebView hosted application. Moving off the Web ‘app’ link when switching to another app actually causes the browser and preview it to ‘blank out’ the Web application in the Task View (see screenshot on the right). Then, when the ‘app’ is reactivated it ends up completely restarting the browser with the original link. This is crazy behavior that you can’t easily work around. In some situations you might be able to store the application state and restore it using LocalStorage, but for many scenarios that involve complex data sources (like say Google Maps) that’s not a possibility. The only reason for this screwed up behavior I can think of is that it is deliberate to make Web apps a pain in the butt to use and forcing users trough the App Store/PhoneGap/Cordova route. App linking and management is a very basic problem – something that we essentially have solved in every desktop browser – yet on mobile devices where it arguably matters a lot more to have easy access to web content we have to jump through hoops to have even a remotely decent linking/activation experience across browsers. Where’s the Money? It’s not surprising that device home screen integration and Mobile Web support in general is in such dismal shape – the mobile OS vendors benefit financially from App store sales and have little to gain from Web based applications that bypass the App store and the cash cow that it presents. On top of that, platform specific vendor lock-in of both end users and developers who have invested in hardware, apps and consumables is something that mobile platform vendors actually aspire to. Web based interfaces that are cross-platform are the anti-thesis of that and so again it’s no surprise that the mobile Web is on a struggling path. But – that may be changing. More and more we’re seeing operations shifting to services that are subscription based or otherwise collect money for usage, and that may drive more progress into the Web direction in the end . Nothing like the almighty dollar to drive innovation forward. Do we need a Mobile Web App Store? As much as I dislike moderated experiences in today’s massive App Stores, they do at least provide one single place to look for apps for your device. I think we could really use some sort of registry, that could provide something akin to an app store for mobile Web apps, to make it easier to actually find mobile applications. This could take the form of a specialized search engine, or maybe a more formal store/registry like structure. Something like apt-get/chocolatey for Web apps. It could be curated and provide at least some feedback and reviews that might help with the integrity of applications. Coupled to that could be a native application on each platform that would allow searching and browsing of the registry and then also handle installation in the form of providing the home screen linking, plus maybe an initial security configuration that determines what features are allowed access to for the app. I’m not holding my breath. In order for this sort of thing to take off and gain widespread appeal, a lot of coordination would be required. And in order to get enough traction it would have to come from a well known entity – a mobile Web app store from a no name source is unlikely to gain high enough usage numbers to make a difference. In a way this would eliminate some of the freedom of the Web, but of course this would also be an optional search path in addition to the standard open Web search mechanisms to find and access content today. Security Security is a big deal, and one of the perceived reasons why so many IT professionals appear to be willing to go back to the walled garden of deployed apps is that Apps are perceived as safe due to the official review and curation of the App stores. Curated stores are supposed to protect you from malware, illegal and misleading content. It doesn’t always work out that way and all the major vendors have had issues with security and the review process at some time or another. Security is critical, but I also think that Web applications in general pose less of a security threat than native applications, by nature of the sandboxed browser and JavaScript environments. Web applications run externally completely and in the HTML and JavaScript sandboxes, with only a very few controlled APIs allowing access to device specific features. And as discussed earlier – security for any device interaction can be granted the same for mobile applications through a Web browser, as they can for native applications either via explicit policies loaded from the Web, or via prompting as GeoLocation does today. Security is important, but it’s certainly solvable problem for Web applications even those that need to access device hardware. Security shouldn’t be a reason for Web apps to be an equal player in mobile applications. Apps are winning, but haven’t we been here before? So now we’re finding ourselves back in an era of installed app, rather than Web based and managed apps. Only it’s even worse today than with Desktop applications, in that the apps are going through a gatekeeper that charges a toll and censors what you can and can’t do in your apps. Frankly it’s a mystery to me why anybody would buy into this model and why it’s lasted this long when we’ve already been through this process. It’s crazy… It’s really a shame that this regression is happening. We have the technology to make mobile Web apps much more prominent, but yet we’re basically held back by what seems little more than bureaucracy, partisan bickering and self interest of the major parties involved. Back in the day of the desktop it was Internet Explorer’s 98+%  market shareholding back the Web from improvements for many years – now it’s the combined mobile OS market in control of the mobile browsers. If mobile Web apps were allowed to be treated the same as native apps with simple ways to install and run them consistently and persistently, that would go a long way to making mobile applications much more usable and seriously viable alternatives to native apps. But as it is mobile apps have a severe disadvantage in placement and operation. There are a few bright spots in all of this. Mozilla’s FireFoxOs is embracing the Web for it’s mobile OS by essentially building every app out of HTML and JavaScript based content. It supports both packaged and certified package modes (that can be put into the app store), and Open Web apps that are loaded and run completely off the Web and can also cache locally for offline operation using a manifest. Open Web apps are treated as full class citizens in FireFoxOS and run using the same mechanism as installed apps. Unfortunately FireFoxOs is getting a slow start with minimal device support and specifically targeting the low end market. We can hope that this approach will change and catch on with other vendors, but that’s also an uphill battle given the conflict of interest with platform lock in that it represents. Recent versions of Android also seem to be working reasonably well with mobile application integration onto the desktop and activation out of the box. Although it still uses the Apple meta tags to find icons and behavior settings, everything at least works as you would expect – icons to the desktop on pinning, WebView based full screen activation, and reliable application persistence as the browser/app is treated like a real application. Hopefully iOS will at some point provide this same level of rudimentary Web app support. What’s also interesting to me is that Microsoft hasn’t picked up on the obvious need for a solid Web App platform. Being a distant third in the mobile OS war, Microsoft certainly has nothing to lose and everything to gain by using fresh ideas and expanding into areas that the other major vendors are neglecting. But instead Microsoft is trying to beat the market leaders at their own game, fighting on their adversary’s terms instead of taking a new tack. Providing a kick ass mobile Web platform that takes the lead on some of the proposed mobile APIs would be something positive that Microsoft could do to improve its miserable position in the mobile device market. Where are we at with Mobile Web? It sure sounds like I’m really down on the Mobile Web, right? I’ve built a number of mobile apps in the last year and while overall result and response has been very positive to what we were able to accomplish in terms of UI, getting that final 10% that required device integration dialed was an absolute nightmare on every single one of them. Big compromises had to be made and some features were left out or had to be modified for some devices. In two cases we opted to go the Cordova route in order to get the integration we needed, along with the extra pain involved in that process. Unless you’re not integrating with device features and you don’t care deeply about a smooth integration with the mobile desktop, mobile Web development is fraught with frustration. So, yes I’m frustrated! But it’s not for lack of wanting the mobile Web to succeed. I am still a firm believer that we will eventually arrive a much more functional mobile Web platform that allows access to the most common device features in a sensible way. It wouldn't be difficult for device platform vendors to make Web based applications first class citizens on mobile devices. But unfortunately it looks like it will still be some time before this happens. So, what’s your experience building mobile Web apps? Are you finding similar issues? Just giving up on raw Web applications and building PhoneGap apps instead? Completely skipping the Web and going native? Leave a comment for discussion. Resources Rick Strahl on DotNet Rocks talking about Mobile Web© Rick Strahl, West Wind Technologies, 2005-2014Posted in HTML5  Mobile   Tweet !function(d,s,id){var js,fjs=d.getElementsByTagName(s)[0];if(!d.getElementById(id)){js=d.createElement(s);js.id=id;js.src="//platform.twitter.com/widgets.js";fjs.parentNode.insertBefore(js,fjs);}}(document,"script","twitter-wjs"); (function() { var po = document.createElement('script'); po.type = 'text/javascript'; po.async = true; po.src = 'https://apis.google.com/js/plusone.js'; var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(po, s); })();

    Read the article

  • Quick guide to Oracle IRM 11g: Classification design

    - by Simon Thorpe
    Quick guide to Oracle IRM 11g indexThis is the final article in the quick guide to Oracle IRM. If you've followed everything prior you will now have a fully functional and tested Information Rights Management service. It doesn't matter if you've been following the 10g or 11g guide as this next article is common to both. ContentsWhy this is the most important part... Understanding the classification and standard rights model Identifying business use cases Creating an effective IRM classification modelOne single classification across the entire businessA context for each and every possible granular use caseWhat makes a good context? Deciding on the use of roles in the context Reviewing the features and security for context roles Summary Why this is the most important part...Now the real work begins, installing and getting an IRM system running is as simple as following instructions. However to actually have an IRM technology easily protecting your most sensitive information without interfering with your users existing daily work flows and be able to scale IRM across the entire business, requires thought into how confidential documents are created, used and distributed. This article is going to give you the information you need to ask the business the right questions so that you can deploy your IRM service successfully. The IRM team here at Oracle have over 10 years of experience in helping customers and it is important you understand the following to be successful in securing access to your most confidential information. Whatever you are trying to secure, be it mergers and acquisitions information, engineering intellectual property, health care documentation or financial reports. No matter what type of user is going to access the information, be they employees, contractors or customers, there are common goals you are always trying to achieve.Securing the content at the earliest point possible and do it automatically. Removing the dependency on the user to decide to secure the content reduces the risk of mistakes significantly and therefore results a more secure deployment. K.I.S.S. (Keep It Simple Stupid) Reduce complexity in the rights/classification model. Oracle IRM lets you make changes to access to documents even after they are secured which allows you to start with a simple model and then introduce complexity once you've understood how the technology is going to be used in the business. After an initial learning period you can review your implementation and start to make informed decisions based on user feedback and administration experience. Clearly communicate to the user, when appropriate, any changes to their existing work practice. You must make every effort to make the transition to sealed content as simple as possible. For external users you must help them understand why you are securing the documents and inform them the value of the technology to both your business and them. Before getting into the detail, I must pay homage to Martin White, Vice President of client services in SealedMedia, the company Oracle acquired and who created Oracle IRM. In the SealedMedia years Martin was involved with every single customer and was key to the design of certain aspects of the IRM technology, specifically the context model we will be discussing here. Listening carefully to customers and understanding the flexibility of the IRM technology, Martin taught me all the skills of helping customers build scalable, effective and simple to use IRM deployments. No matter how well the engineering department designed the software, badly designed and poorly executed projects can result in difficult to use and manage, and ultimately insecure solutions. The advice and information that follows was born with Martin and he's still delivering IRM consulting with customers and can be found at www.thinkers.co.uk. It is from Martin and others that Oracle not only has the most advanced, scalable and usable document security solution on the market, but Oracle and their partners have the most experience in delivering successful document security solutions. Understanding the classification and standard rights model The goal of any successful IRM deployment is to balance the increase in security the technology brings without over complicating the way people use secured content and avoid a significant increase in administration and maintenance. With Oracle it is possible to automate the protection of content, deploy the desktop software transparently and use authentication methods such that users can open newly secured content initially unaware the document is any different to an insecure one. That is until of course they attempt to do something for which they don't have any rights, such as copy and paste to an insecure application or try and print. Central to achieving this objective is creating a classification model that is simple to understand and use but also provides the right level of complexity to meet the business needs. In Oracle IRM the term used for each classification is a "context". A context defines the relationship between.A group of related documents The people that use the documents The roles that these people perform The rights that these people need to perform their role The context is the key to the success of Oracle IRM. It provides the separation of the role and rights of a user from the content itself. Documents are sealed to contexts but none of the rights, user or group information is stored within the content itself. Sealing only places information about the location of the IRM server that sealed it, the context applied to the document and a few other pieces of metadata that pertain only to the document. This important separation of rights from content means that millions of documents can be secured against a single classification and a user needs only one right assigned to be able to access all documents. If you have followed all the previous articles in this guide, you will be ready to start defining contexts to which your sensitive information will be protected. But before you even start with IRM, you need to understand how your own business uses and creates sensitive documents and emails. Identifying business use cases Oracle is able to support multiple classification systems, but usually there is one single initial need for the technology which drives a deployment. This need might be to protect sensitive mergers and acquisitions information, engineering intellectual property, financial documents. For this and every subsequent use case you must understand how users create and work with documents, to who they are distributed and how the recipients should interact with them. A successful IRM deployment should start with one well identified use case (we go through some examples towards the end of this article) and then after letting this use case play out in the business, you learn how your users work with content, how well your communication to the business worked and if the classification system you deployed delivered the right balance. It is at this point you can start rolling the technology out further. Creating an effective IRM classification model Once you have selected the initial use case you will address with IRM, you need to design a classification model that defines the access to secured documents within the use case. In Oracle IRM there is an inbuilt classification system called the "context" model. In Oracle IRM 11g it is possible to extend the server to support any rights classification model, but the majority of users who are not using an application integration (such as Oracle IRM within Oracle Beehive) are likely to be starting out with the built in context model. Before looking at creating a classification system with IRM, it is worth reviewing some recognized standards and methods for creating and implementing security policy. A very useful set of documents are the ISO 17799 guidelines and the SANS security policy templates. First task is to create a context against which documents are to be secured. A context consists of a group of related documents (all top secret engineering research), a list of roles (contributors and readers) which define how users can access documents and a list of users (research engineers) who have been given a role allowing them to interact with sealed content. Before even creating the first context it is wise to decide on a philosophy which will dictate the level of granularity, the question is, where do you start? At a department level? By project? By technology? First consider the two ends of the spectrum... One single classification across the entire business Imagine that instead of having separate contexts, one for engineering intellectual property, one for your financial data, one for human resources personally identifiable information, you create one context for all documents across the entire business. Whilst you may have immediate objections, there are some significant benefits in thinking about considering this. Document security classification decisions are simple. You only have one context to chose from! User provisioning is simple, just make sure everyone has a role in the only context in the business. Administration is very low, if you assign rights to groups from the business user repository you probably never have to touch IRM administration again. There are however some obvious downsides to this model.All users in have access to all IRM secured content. So potentially a sales person could access sensitive mergers and acquisition documents, if they can get their hands on a copy that is. You cannot delegate control of different documents to different parts of the business, this may not satisfy your regulatory requirements for the separation and delegation of duties. Changing a users role affects every single document ever secured. Even though it is very unlikely a business would ever use one single context to secure all their sensitive information, thinking about this scenario raises one very important point. Just having one single context and securing all confidential documents to it, whilst incurring some of the problems detailed above, has one huge value. Once secured, IRM protected content can ONLY be accessed by authorized users. Just think of all the sensitive documents in your business today, imagine if you could ensure that only everyone you trust could open them. Even if an employee lost a laptop or someone accidentally sent an email to the wrong recipient, only the right people could open that file. A context for each and every possible granular use case Now let's think about the total opposite of a single context design. What if you created a context for each and every single defined business need and created multiple contexts within this for each level of granularity? Let's take a use case where we need to protect engineering intellectual property. Imagine we have 6 different engineering groups, and in each we have a research department, a design department and manufacturing. The company information security policy defines 3 levels of information sensitivity... restricted, confidential and top secret. Then let's say that each group and department needs to define access to information from both internal and external users. Finally add into the mix that they want to review the rights model for each context every financial quarter. This would result in a huge amount of contexts. For example, lets just look at the resulting contexts for one engineering group. Q1FY2010 Restricted Internal - Engineering Group 1 - Research Q1FY2010 Restricted Internal - Engineering Group 1 - Design Q1FY2010 Restricted Internal - Engineering Group 1 - Manufacturing Q1FY2010 Restricted External- Engineering Group 1 - Research Q1FY2010 Restricted External - Engineering Group 1 - Design Q1FY2010 Restricted External - Engineering Group 1 - Manufacturing Q1FY2010 Confidential Internal - Engineering Group 1 - Research Q1FY2010 Confidential Internal - Engineering Group 1 - Design Q1FY2010 Confidential Internal - Engineering Group 1 - Manufacturing Q1FY2010 Confidential External - Engineering Group 1 - Research Q1FY2010 Confidential External - Engineering Group 1 - Design Q1FY2010 Confidential External - Engineering Group 1 - Manufacturing Q1FY2010 Top Secret Internal - Engineering Group 1 - Research Q1FY2010 Top Secret Internal - Engineering Group 1 - Design Q1FY2010 Top Secret Internal - Engineering Group 1 - Manufacturing Q1FY2010 Top Secret External - Engineering Group 1 - Research Q1FY2010 Top Secret External - Engineering Group 1 - Design Q1FY2010 Top Secret External - Engineering Group 1 - Manufacturing Now multiply the above by 6 for each engineering group, 18 contexts. You are then creating/reviewing another 18 every 3 months. After a year you've got 72 contexts. What would be the advantages of such a complex classification model? You can satisfy very granular rights requirements, for example only an authorized engineering group 1 researcher can create a top secret report for access internally, and his role will be reviewed on a very frequent basis. Your business may have very complex rights requirements and mapping this directly to IRM may be an obvious exercise. The disadvantages of such a classification model are significant...Huge administrative overhead. Someone in the business must manage, review and administrate each of these contexts. If the engineering group had a single administrator, they would have 72 classifications to reside over each year. From an end users perspective life will be very confusing. Imagine if a user has rights in just 6 of these contexts. They may be able to print content from one but not another, be able to edit content in 2 contexts but not the other 4. Such confusion at the end user level causes frustration and resistance to the use of the technology. Increased synchronization complexity. Imagine a user who after 3 years in the company ends up with over 300 rights in many different contexts across the business. This would result in long synchronization times as the client software updates all your offline rights. Hard to understand who can do what with what. Imagine being the VP of engineering and as part of an internal security audit you are asked the question, "What rights to researchers have to our top secret information?". In this complex model the answer is not simple, it would depend on many roles in many contexts. Of course this example is extreme, but it highlights that trying to build many barriers in your business can result in a nightmare of administration and confusion amongst users. In the real world what we need is a balance of the two. We need to seek an optimum number of contexts. Too many contexts are unmanageable and too few contexts does not give fine enough granularity. What makes a good context? Good context design derives mainly from how well you understand your business requirements to secure access to confidential information. Some customers I have worked with can tell me exactly the documents they wish to secure and know exactly who should be opening them. However there are some customers who know only of the government regulation that requires them to control access to certain types of information, they don't actually know where the documents are, how they are created or understand exactly who should have access. Therefore you need to know how to ask the business the right questions that lead to information which help you define a context. First ask these questions about a set of documentsWhat is the topic? Who are legitimate contributors on this topic? Who are the authorized readership? If the answer to any one of these is significantly different, then it probably merits a separate context. Remember that sealed documents are inherently secure and as such they cannot leak to your competitors, therefore it is better sealed to a broad context than not sealed at all. Simplicity is key here. Always revert to the first extreme example of a single classification, then work towards essential complexity. If there is any doubt, always prefer fewer contexts. Remember, Oracle IRM allows you to change your mind later on. You can implement a design now and continue to change and refine as you learn how the technology is used. It is easy to go from a simple model to a more complex one, it is much harder to take a complex model that is already embedded in the work practice of users and try to simplify it. It is also wise to take a single use case and address this first with the business. Don't try and tackle many different problems from the outset. Do one, learn from the process, refine it and then take what you have learned into the next use case, refine and continue. Once you have a good grasp of the technology and understand how your business will use it, you can then start rolling out the technology wider across the business. Deciding on the use of roles in the context Once you have decided on that first initial use case and a context to create let's look at the details you need to decide upon. For each context, identify; Administrative rolesBusiness owner, the person who makes decisions about who may or may not see content in this context. This is often the person who wanted to use IRM and drove the business purchase. They are the usually the person with the most at risk when sensitive information is lost. Point of contact, the person who will handle requests for access to content. Sometimes the same as the business owner, sometimes a trusted secretary or administrator. Context administrator, the person who will enact the decisions of the Business Owner. Sometimes the point of contact, sometimes a trusted IT person. Document related rolesContributors, the people who create and edit documents in this context. Reviewers, the people who are involved in reviewing documents but are not trusted to secure information to this classification. This role is not always necessary. (See later discussion on Published-work and Work-in-Progress) Readers, the people who read documents from this context. Some people may have several of the roles above, which is fine. What you are trying to do is understand and define how the business interacts with your sensitive information. These roles obviously map directly to roles available in Oracle IRM. Reviewing the features and security for context roles At this point we have decided on a classification of information, understand what roles people in the business will play when administrating this classification and how they will interact with content. The final piece of the puzzle in getting the information for our first context is to look at the permissions people will have to sealed documents. First think why are you protecting the documents in the first place? It is to prevent the loss of leaking of information to the wrong people. To control the information, making sure that people only access the latest versions of documents. You are not using Oracle IRM to prevent unauthorized people from doing legitimate work. This is an important point, with IRM you can erect many barriers to prevent access to content yet too many restrictions and authorized users will often find ways to circumvent using the technology and end up distributing unprotected originals. Because IRM is a security technology, it is easy to get carried away restricting different groups. However I would highly recommend starting with a simple solution with few restrictions. Ensure that everyone who reasonably needs to read documents can do so from the outset. Remember that with Oracle IRM you can change rights to content whenever you wish and tighten security. Always return to the fact that the greatest value IRM brings is that ONLY authorized users can access secured content, remember that simple "one context for the entire business" model. At the start of the deployment you really need to aim for user acceptance and therefore a simple model is more likely to succeed. As time passes and users understand how IRM works you can start to introduce more restrictions and complexity. Another key aspect to focus on is handling exceptions. If you decide on a context model where engineering can only access engineering information, and sales can only access sales data. Act quickly when a sales manager needs legitimate access to a set of engineering documents. Having a quick and effective process for permitting other people with legitimate needs to obtain appropriate access will be rewarded with acceptance from the user community. These use cases can often be satisfied by integrating IRM with a good Identity & Access Management technology which simplifies the process of assigning users the correct business roles. The big print issue... Printing is often an issue of contention, users love to print but the business wants to ensure sensitive information remains in the controlled digital world. There are many cases of physical document loss causing a business pain, it is often overlooked that IRM can help with this issue by limiting the ability to generate physical copies of digital content. However it can be hard to maintain a balance between security and usability when it comes to printing. Consider the following points when deciding about whether to give print rights. Oracle IRM sealed documents can contain watermarks that expose information about the user, time and location of access and the classification of the document. This information would reside in the printed copy making it easier to trace who printed it. Printed documents are slower to distribute in comparison to their digital counterparts, so time sensitive information in printed format may present a lower risk. Print activity is audited, therefore you can monitor and react to users abusing print rights. Summary In summary it is important to think carefully about the way you create your context model. As you ask the business these questions you may get a variety of different requirements. There may be special projects that require a context just for sensitive information created during the lifetime of the project. There may be a department that requires all information in the group is secured and you might have a few senior executives who wish to use IRM to exchange a small number of highly sensitive documents with a very small number of people. Oracle IRM, with its very flexible context classification system, can support all of these use cases. The trick is to introducing the complexity to deliver them at the right level. In another article i'm working on I will go through some examples of how Oracle IRM might map to existing business use cases. But for now, this article covers all the important questions you need to get your IRM service deployed and successfully protecting your most sensitive information.

    Read the article

  • How to find and fix performance problems in ORM powered applications

    - by FransBouma
    Once in a while we get requests about how to fix performance problems with our framework. As it comes down to following the same steps and looking into the same things every single time, I decided to write a blogpost about it instead, so more people can learn from this and solve performance problems in their O/R mapper powered applications. In some parts it's focused on LLBLGen Pro but it's also usable for other O/R mapping frameworks, as the vast majority of performance problems in O/R mapper powered applications are not specific for a certain O/R mapper framework. Too often, the developer looks at the wrong part of the application, trying to fix what isn't a problem in that part, and getting frustrated that 'things are so slow with <insert your favorite framework X here>'. I'm in the O/R mapper business for a long time now (almost 10 years, full time) and as it's a small world, we O/R mapper developers know almost all tricks to pull off by now: we all know what to do to make task ABC faster and what compromises (because there are almost always compromises) to deal with if we decide to make ABC faster that way. Some O/R mapper frameworks are faster in X, others in Y, but you can be sure the difference is mainly a result of a compromise some developers are willing to deal with and others aren't. That's why the O/R mapper frameworks on the market today are different in many ways, even though they all fetch and save entities from and to a database. I'm not suggesting there's no room for improvement in today's O/R mapper frameworks, there always is, but it's not a matter of 'the slowness of the application is caused by the O/R mapper' anymore. Perhaps query generation can be optimized a bit here, row materialization can be optimized a bit there, but it's mainly coming down to milliseconds. Still worth it if you're a framework developer, but it's not much compared to the time spend inside databases and in user code: if a complete fetch takes 40ms or 50ms (from call to entity object collection), it won't make a difference for your application as that 10ms difference won't be noticed. That's why it's very important to find the real locations of the problems so developers can fix them properly and don't get frustrated because their quest to get a fast, performing application failed. Performance tuning basics and rules Finding and fixing performance problems in any application is a strict procedure with four prescribed steps: isolate, analyze, interpret and fix, in that order. It's key that you don't skip a step nor make assumptions: these steps help you find the reason of a problem which seems to be there, and how to fix it or leave it as-is. Skipping a step, or when you assume things will be bad/slow without doing analysis will lead to the path of premature optimization and won't actually solve your problems, only create new ones. The most important rule of finding and fixing performance problems in software is that you have to understand what 'performance problem' actually means. Most developers will say "when a piece of software / code is slow, you have a performance problem". But is that actually the case? If I write a Linq query which will aggregate, group and sort 5 million rows from several tables to produce a resultset of 10 rows, it might take more than a couple of milliseconds before that resultset is ready to be consumed by other logic. If I solely look at the Linq query, the code consuming the resultset of the 10 rows and then look at the time it takes to complete the whole procedure, it will appear to me to be slow: all that time taken to produce and consume 10 rows? But if you look closer, if you analyze and interpret the situation, you'll see it does a tremendous amount of work, and in that light it might even be extremely fast. With every performance problem you encounter, always do realize that what you're trying to solve is perhaps not a technical problem at all, but a perception problem. The second most important rule you have to understand is based on the old saying "Penny wise, Pound Foolish": the part which takes e.g. 5% of the total time T for a given task isn't worth optimizing if you have another part which takes a much larger part of the total time T for that same given task. Optimizing parts which are relatively insignificant for the total time taken is not going to bring you better results overall, even if you totally optimize that part away. This is the core reason why analysis of the complete set of application parts which participate in a given task is key to being successful in solving performance problems: No analysis -> no problem -> no solution. One warning up front: hunting for performance will always include making compromises. Fast software can be made maintainable, but if you want to squeeze as much performance out of your software, you will inevitably be faced with the dilemma of compromising one or more from the group {readability, maintainability, features} for the extra performance you think you'll gain. It's then up to you to decide whether it's worth it. In almost all cases it's not. The reason for this is simple: the vast majority of performance problems can be solved by implementing the proper algorithms, the ones with proven Big O-characteristics so you know the performance you'll get plus you know the algorithm will work. The time taken by the algorithm implementing code is inevitable: you already implemented the best algorithm. You might find some optimizations on the technical level but in general these are minor. Let's look at the four steps to see how they guide us through the quest to find and fix performance problems. Isolate The first thing you need to do is to isolate the areas in your application which are assumed to be slow. For example, if your application is a web application and a given page is taking several seconds or even minutes to load, it's a good candidate to check out. It's important to start with the isolate step because it allows you to focus on a single code path per area with a clear begin and end and ignore the rest. The rest of the steps are taken per identified problematic area. Keep in mind that isolation focuses on tasks in an application, not code snippets. A task is something that's started in your application by either another task or the user, or another program, and has a beginning and an end. You can see a task as a piece of functionality offered by your application.  Analyze Once you've determined the problem areas, you have to perform analysis on the code paths of each area, to see where the performance problems occur and which areas are not the problem. This is a multi-layered effort: an application which uses an O/R mapper typically consists of multiple parts: there's likely some kind of interface (web, webservice, windows etc.), a part which controls the interface and business logic, the O/R mapper part and the RDBMS, all connected with either a network or inter-process connections provided by the OS or other means. Each of these parts, including the connectivity plumbing, eat up a part of the total time it takes to complete a task, e.g. load a webpage with all orders of a given customer X. To understand which parts participate in the task / area we're investigating and how much they contribute to the total time taken to complete the task, analysis of each participating task is essential. Start with the code you wrote which starts the task, analyze the code and track the path it follows through your application. What does the code do along the way, verify whether it's correct or not. Analyze whether you have implemented the right algorithms in your code for this particular area. Remember we're looking at one area at a time, which means we're ignoring all other code paths, just the code path of the current problematic area, from begin to end and back. Don't dig in and start optimizing at the code level just yet. We're just analyzing. If your analysis reveals big architectural stupidity, it's perhaps a good idea to rethink the architecture at this point. For the rest, we're analyzing which means we collect data about what could be wrong, for each participating part of the complete application. Reviewing the code you wrote is a good tool to get deeper understanding of what is going on for a given task but ultimately it lacks precision and overview what really happens: humans aren't good code interpreters, computers are. We therefore need to utilize tools to get deeper understanding about which parts contribute how much time to the total task, triggered by which other parts and for example how many times are they called. There are two different kind of tools which are necessary: .NET profilers and O/R mapper / RDBMS profilers. .NET profiling .NET profilers (e.g. dotTrace by JetBrains or Ants by Red Gate software) show exactly which pieces of code are called, how many times they're called, and the time it took to run that piece of code, at the method level and sometimes even at the line level. The .NET profilers are essential tools for understanding whether the time taken to complete a given task / area in your application is consumed by .NET code, where exactly in your code, the path to that code, how many times that code was called by other code and thus reveals where hotspots are located: the areas where a solution can be found. Importantly, they also reveal which areas can be left alone: remember our penny wise pound foolish saying: if a profiler reveals that a group of methods are fast, or don't contribute much to the total time taken for a given task, ignore them. Even if the code in them is perhaps complex and looks like a candidate for optimization: you can work all day on that, it won't matter.  As we're focusing on a single area of the application, it's best to start profiling right before you actually activate the task/area. Most .NET profilers support this by starting the application without starting the profiling procedure just yet. You navigate to the particular part which is slow, start profiling in the profiler, in your application you perform the actions which are considered slow, and afterwards you get a snapshot in the profiler. The snapshot contains the data collected by the profiler during the slow action, so most data is produced by code in the area to investigate. This is important, because it allows you to stay focused on a single area. O/R mapper and RDBMS profiling .NET profilers give you a good insight in the .NET side of things, but not in the RDBMS side of the application. As this article is about O/R mapper powered applications, we're also looking at databases, and the software making it possible to consume the database in your application: the O/R mapper. To understand which parts of the O/R mapper and database participate how much to the total time taken for task T, we need different tools. There are two kind of tools focusing on O/R mappers and database performance profiling: O/R mapper profilers and RDBMS profilers. For O/R mapper profilers, you can look at LLBLGen Prof by hibernating rhinos or the Linq to Sql/LLBLGen Pro profiler by Huagati. Hibernating rhinos also have profilers for other O/R mappers like NHibernate (NHProf) and Entity Framework (EFProf) and work the same as LLBLGen Prof. For RDBMS profilers, you have to look whether the RDBMS vendor has a profiler. For example for SQL Server, the profiler is shipped with SQL Server, for Oracle it's build into the RDBMS, however there are also 3rd party tools. Which tool you're using isn't really important, what's important is that you get insight in which queries are executed during the task / area we're currently focused on and how long they took. Here, the O/R mapper profilers have an advantage as they collect the time it took to execute the query from the application's perspective so they also collect the time it took to transport data across the network. This is important because a query which returns a massive resultset or a resultset with large blob/clob/ntext/image fields takes more time to get transported across the network than a small resultset and a database profiler doesn't take this into account most of the time. Another tool to use in this case, which is more low level and not all O/R mappers support it (though LLBLGen Pro and NHibernate as well do) is tracing: most O/R mappers offer some form of tracing or logging system which you can use to collect the SQL generated and executed and often also other activity behind the scenes. While tracing can produce a tremendous amount of data in some cases, it also gives insight in what's going on. Interpret After we've completed the analysis step it's time to look at the data we've collected. We've done code reviews to see whether we've done anything stupid and which parts actually take place and if the proper algorithms have been implemented. We've done .NET profiling to see which parts are choke points and how much time they contribute to the total time taken to complete the task we're investigating. We've performed O/R mapper profiling and RDBMS profiling to see which queries were executed during the task, how many queries were generated and executed and how long they took to complete, including network transportation. All this data reveals two things: which parts are big contributors to the total time taken and which parts are irrelevant. Both aspects are very important. The parts which are irrelevant (i.e. don't contribute significantly to the total time taken) can be ignored from now on, we won't look at them. The parts which contribute a lot to the total time taken are important to look at. We now have to first look at the .NET profiler results, to see whether the time taken is consumed in our own code, in .NET framework code, in the O/R mapper itself or somewhere else. For example if most of the time is consumed by DbCommand.ExecuteReader, the time it took to complete the task is depending on the time the data is fetched from the database. If there was just 1 query executed, according to tracing or O/R mapper profilers / RDBMS profilers, check whether that query is optimal, uses indexes or has to deal with a lot of data. Interpret means that you follow the path from begin to end through the data collected and determine where, along the path, the most time is contributed. It also means that you have to check whether this was expected or is totally unexpected. My previous example of the 10 row resultset of a query which groups millions of rows will likely reveal that a long time is spend inside the database and almost no time is spend in the .NET code, meaning the RDBMS part contributes the most to the total time taken, the rest is compared to that time, irrelevant. Considering the vastness of the source data set, it's expected this will take some time. However, does it need tweaking? Perhaps all possible tweaks are already in place. In the interpret step you then have to decide that further action in this area is necessary or not, based on what the analysis results show: if the analysis results were unexpected and in the area where the most time is contributed to the total time taken is room for improvement, action should be taken. If not, you can only accept the situation and move on. In all cases, document your decision together with the analysis you've done. If you decide that the perceived performance problem is actually expected due to the nature of the task performed, it's essential that in the future when someone else looks at the application and starts asking questions you can answer them properly and new analysis is only necessary if situations changed. Fix After interpreting the analysis results you've concluded that some areas need adjustment. This is the fix step: you're actively correcting the performance problem with proper action targeted at the real cause. In many cases related to O/R mapper powered applications it means you'll use different features of the O/R mapper to achieve the same goal, or apply optimizations at the RDBMS level. It could also mean you apply caching inside your application (compromise memory consumption over performance) to avoid unnecessary re-querying data and re-consuming the results. After applying a change, it's key you re-do the analysis and interpretation steps: compare the results and expectations with what you had before, to see whether your actions had any effect or whether it moved the problem to a different part of the application. Don't fall into the trap to do partly analysis: do the full analysis again: .NET profiling and O/R mapper / RDBMS profiling. It might very well be that the changes you've made make one part faster but another part significantly slower, in such a way that the overall problem hasn't changed at all. Performance tuning is dealing with compromises and making choices: to use one feature over the other, to accept a higher memory footprint, to go away from the strict-OO path and execute queries directly onto the RDBMS, these are choices and compromises which will cross your path if you want to fix performance problems with respect to O/R mappers or data-access and databases in general. In most cases it's not a big issue: alternatives are often good choices too and the compromises aren't that hard to deal with. What is important is that you document why you made a choice, a compromise: which analysis data, which interpretation led you to the choice made. This is key for good maintainability in the years to come. Most common performance problems with O/R mappers Below is an incomplete list of common performance problems related to data-access / O/R mappers / RDBMS code. It will help you with fixing the hotspots you found in the interpretation step. SELECT N+1: (Lazy-loading specific). Lazy loading triggered performance bottlenecks. Consider a list of Orders bound to a grid. You have a Field mapped onto a related field in Order, Customer.CompanyName. Showing this column in the grid will make the grid fetch (indirectly) for each row the Customer row. This means you'll get for the single list not 1 query (for the orders) but 1+(the number of orders shown) queries. To solve this: use eager loading using a prefetch path to fetch the customers with the orders. SELECT N+1 is easy to spot with an O/R mapper profiler or RDBMS profiler: if you see a lot of identical queries executed at once, you have this problem. Prefetch paths using many path nodes or sorting, or limiting. Eager loading problem. Prefetch paths can help with performance, but as 1 query is fetched per node, it can be the number of data fetched in a child node is bigger than you think. Also consider that data in every node is merged on the client within the parent. This is fast, but it also can take some time if you fetch massive amounts of entities. If you keep fetches small, you can use tuning parameters like the ParameterizedPrefetchPathThreshold setting to get more optimal queries. Deep inheritance hierarchies of type Target Per Entity/Type. If you use inheritance of type Target per Entity / Type (each type in the inheritance hierarchy is mapped onto its own table/view), fetches will join subtype- and supertype tables in many cases, which can lead to a lot of performance problems if the hierarchy has many types. With this problem, keep inheritance to a minimum if possible, or switch to a hierarchy of type Target Per Hierarchy, which means all entities in the inheritance hierarchy are mapped onto the same table/view. Of course this has its own set of drawbacks, but it's a compromise you might want to take. Fetching massive amounts of data by fetching large lists of entities. LLBLGen Pro supports paging (and limiting the # of rows returned), which is often key to process through large sets of data. Use paging on the RDBMS if possible (so a query is executed which returns only the rows in the page requested). When using paging in a web application, be sure that you switch server-side paging on on the datasourcecontrol used. In this case, paging on the grid alone is not enough: this can lead to fetching a lot of data which is then loaded into the grid and paged there. Keep note that analyzing queries for paging could lead to the false assumption that paging doesn't occur, e.g. when the query contains a field of type ntext/image/clob/blob and DISTINCT can't be applied while it should have (e.g. due to a join): the datareader will do DISTINCT filtering on the client. this is a little slower but it does perform paging functionality on the data-reader so it won't fetch all rows even if the query suggests it does. Fetch massive amounts of data because blob/clob/ntext/image fields aren't excluded. LLBLGen Pro supports field exclusion for queries. You can exclude fields (also in prefetch paths) per query to avoid fetching all fields of an entity, e.g. when you don't need them for the logic consuming the resultset. Excluding fields can greatly reduce the amount of time spend on data-transport across the network. Use this optimization if you see that there's a big difference between query execution time on the RDBMS and the time reported by the .NET profiler for the ExecuteReader method call. Doing client-side aggregates/scalar calculations by consuming a lot of data. If possible, try to formulate a scalar query or group by query using the projection system or GetScalar functionality of LLBLGen Pro to do data consumption on the RDBMS server. It's far more efficient to process data on the RDBMS server than to first load it all in memory, then traverse the data in-memory to calculate a value. Using .ToList() constructs inside linq queries. It might be you use .ToList() somewhere in a Linq query which makes the query be run partially in-memory. Example: var q = from c in metaData.Customers.ToList() where c.Country=="Norway" select c; This will actually fetch all customers in-memory and do an in-memory filtering, as the linq query is defined on an IEnumerable<T>, and not on the IQueryable<T>. Linq is nice, but it can often be a bit unclear where some parts of a Linq query might run. Fetching all entities to delete into memory first. To delete a set of entities it's rather inefficient to first fetch them all into memory and then delete them one by one. It's more efficient to execute a DELETE FROM ... WHERE query on the database directly to delete the entities in one go. LLBLGen Pro supports this feature, and so do some other O/R mappers. It's not always possible to do this operation in the context of an O/R mapper however: if an O/R mapper relies on a cache, these kind of operations are likely not supported because they make it impossible to track whether an entity is actually removed from the DB and thus can be removed from the cache. Fetching all entities to update with an expression into memory first. Similar to the previous point: it is more efficient to update a set of entities directly with a single UPDATE query using an expression instead of fetching the entities into memory first and then updating the entities in a loop, and afterwards saving them. It might however be a compromise you don't want to take as it is working around the idea of having an object graph in memory which is manipulated and instead makes the code fully aware there's a RDBMS somewhere. Conclusion Performance tuning is almost always about compromises and making choices. It's also about knowing where to look and how the systems in play behave and should behave. The four steps I provided should help you stay focused on the real problem and lead you towards the solution. Knowing how to optimally use the systems participating in your own code (.NET framework, O/R mapper, RDBMS, network/services) is key for success as well as knowing what's going on inside the application you built. I hope you'll find this guide useful in tracking down performance problems and dealing with them in a useful way.  

    Read the article

  • MacBook Pro Late 2009 SATA Resets, Slowness

    - by A Student at a University
    My MacBook Pro runs slower the longer it's on. I am getting kernel warnings. The resets correlate with AC power connects and disconnects. I don't know if the warnings do. (How do I tell?) Are these bus CRC errors? Or something else? Can this damage the drive or corrupt data? What is it seeing that motivates these? 02:37:16 :[ 0.791992] ahci 0000:00:0b.0: PCI INT A -> Link[LSI0] -> GSI 20 (level, low) -> IRQ 20 02:37:16 :[ 0.792053] ahci 0000:00:0b.0: controller can't do PMP, turning off CAP_PMP 02:37:16 :[ 0.792104] ahci 0000:00:0b.0: AHCI 0001.0200 32 slots 6 ports 1.5 Gbps 0x3 impl IDE mode 02:37:16 :[ 0.792107] ahci 0000:00:0b.0: flags: 64bit ncq sntf pm led pio slum part boh 02:37:16 :[ 0.813473] scsi0 : ahci 02:37:16 :[ 0.823340] scsi1 : ahci 02:37:16 :[ 0.848164] ata1: SATA max UDMA/133 abar m8192@0xe7484000 port 0xe7484100 irq 43 02:37:16 :[ 0.848166] ata2: SATA max UDMA/133 abar m8192@0xe7484000 port 0xe7484180 irq 43 02:37:16 :[ 1.190132] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300) 02:37:16 :[ 1.190153] ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300) 02:37:16 :[ 1.213568] ata1.00: ATA-8: OCZ-VERTEX2, 1.23, max UDMA/133 02:37:16 :[ 1.213572] ata1.00: 195371568 sectors, multi 1: LBA48 NCQ (depth 31/32) 02:37:16 :[ 1.227293] ata2.00: ATA-8: ST9500420ASG, 0002SDM1, max UDMA/133 02:37:16 :[ 1.227297] ata2.00: 976773168 sectors, multi 16: LBA48 NCQ (depth 31/32) 02:37:16 :[ 1.229570] ata2.00: configured for UDMA/133 02:37:16 :[ 1.240133] ata2: hard resetting link 02:37:16 :[ 1.260738] ata1.00: configured for UDMA/133 02:37:16 :[ 1.280122] ata1: hard resetting link 02:37:16 :[ 1.470125] usb 2-5: new high speed USB device using ehci_hcd and address 3 02:37:16 :[ 1.550165] firewire_core: created device fw0: GUID 58b035fffea99f5c, S800 02:37:16 :[ 1.631306] Initializing USB Mass Storage driver... 02:37:16 :[ 1.631392] scsi6 : usb-storage 2-5:1.0 02:37:16 :[ 1.631454] usbcore: registered new interface driver usb-storage 02:37:16 :[ 1.631455] USB Mass Storage support registered. 02:37:16 :[ 1.960128] usb 4-1: new full speed USB device using ohci_hcd and address 2 02:37:16 :[ 1.990101] ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300) 02:37:16 :[ 1.994215] ata2.00: configured for UDMA/133 02:37:16 :[ 1.994220] ata2: EH complete 02:37:16 :[ 2.030097] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300) 02:37:16 :[ 2.090773] ata1.00: configured for UDMA/133 02:37:16 :[ 2.090778] ata1: EH complete 02:37:16 :[ 2.090931] scsi 0:0:0:0: Direct-Access ATA OCZ-VERTEX2 1.23 PQ: 0 ANSI: 5 02:37:16 :[ 2.091045] sd 0:0:0:0: Attached scsi generic sg0 type 0 02:37:16 :[ 2.091121] sd 0:0:0:0: [sda] 195371568 512-byte logical blocks: (100 GB/93.1 GiB) 02:37:16 :[ 2.091159] scsi 1:0:0:0: Direct-Access ATA ST9500420ASG 0002 PQ: 0 ANSI: 5 02:37:16 :[ 2.091163] sd 0:0:0:0: [sda] Write Protect is off 02:37:16 :[ 2.091183] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA 02:37:16 :[ 2.091252] sd 1:0:0:0: Attached scsi generic sg1 type 0 02:37:16 :[ 2.091337] sda: 02:37:16 :[ 2.091446] sd 1:0:0:0: [sdb] 976773168 512-byte logical blocks: (500 GB/465 GiB) 02:37:16 :[ 2.091580] sd 1:0:0:0: [sdb] Write Protect is off 02:37:16 :[ 2.091637] sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA 02:37:16 :[ 2.091756] sdb: sda1 sda2 02:37:16 :[ 2.093140] sd 0:0:0:0: [sda] Attached SCSI disk 02:37:16 :[ 2.093505] sdb1 sdb2 sdb3 02:37:16 :[ 2.093773] sd 1:0:0:0: [sdb] Attached SCSI disk 02:37:16 :[ 2.693899] EXT4-fs (dm-0): mounted filesystem with ordered data mode. Opts: (null) 02:37:16 :[ 5.483492] EXT4-fs (dm-0): re-mounted. Opts: errors=remount-ro 02:37:16 :[ 7.905040] EXT4-fs (dm-2): mounted filesystem with ordered data mode. Opts: (null) 02:37:25 :[ 19.553095] EXT4-fs (dm-0): re-mounted. Opts: errors=remount-ro,commit=600 02:37:25 :[ 19.555266] EXT4-fs (dm-2): re-mounted. Opts: commit=600 02:37:25 :[ 19.641533] ata1: hard resetting link 02:37:25 :[ 19.642084] ata2: hard resetting link 02:37:26 :[ 20.392606] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300) 02:37:26 :[ 20.392610] ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300) 02:37:26 :[ 20.396697] ata2.00: configured for UDMA/133 02:37:26 :[ 20.396703] ata2: EH complete 02:37:26 :[ 20.451491] ata1.00: configured for UDMA/133 02:37:26 :[ 20.451498] ata1: EH complete 02:37:30 :[ 24.563725] EXT4-fs (dm-0): re-mounted. Opts: errors=remount-ro,commit=600 02:37:30 :[ 24.565939] EXT4-fs (dm-2): re-mounted. Opts: commit=600 02:37:30 :[ 24.627246] ata1: hard resetting link 02:37:30 :[ 24.632250] ata2: hard resetting link 02:37:31 :[ 25.372582] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300) 02:37:31 :[ 25.382615] ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300) 02:37:31 :[ 25.386782] ata2.00: configured for UDMA/133 02:37:31 :[ 25.386788] ata2: EH complete 02:37:31 :[ 25.431668] ata1.00: configured for UDMA/133 02:37:31 :[ 25.431674] ata1: EH complete 02:45:54 :[ 529.141844] EXT4-fs (dm-0): re-mounted. Opts: errors=remount-ro,commit=0 02:45:55 :[ 529.544529] EXT4-fs (dm-2): re-mounted. Opts: commit=0 02:45:55 :[ 529.622561] ata1: limiting SATA link speed to 1.5 Gbps 02:45:55 :[ 529.622583] ata1: hard resetting link 02:45:55 :[ 529.622609] ata2: limiting SATA link speed to 1.5 Gbps 02:45:55 :[ 529.622624] ata2: hard resetting link 02:45:56 :[ 530.380135] ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 310) 02:45:56 :[ 530.380157] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310) 02:45:56 :[ 530.384305] ata2.00: configured for UDMA/133 02:45:56 :[ 530.384314] ata2: EH complete 02:45:56 :[ 530.399225] ata1.00: configured for UDMA/133 02:45:56 :[ 530.399233] ata1: EH complete 02:45:58 :[ 532.395990] EXT4-fs (dm-0): re-mounted. Opts: errors=remount-ro,commit=600 02:45:58 :[ 532.518270] EXT4-fs (dm-2): re-mounted. Opts: commit=600 02:45:58 :[ 532.590983] ata1: hard resetting link 02:45:58 :[ 532.591045] ata2: hard resetting link 02:45:59 :[ 533.340147] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310) 02:45:59 :[ 533.340168] ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 310) 02:45:59 :[ 533.344416] ata2.00: configured for UDMA/133 02:45:59 :[ 533.344424] ata2: EH complete 02:45:59 :[ 533.360839] ata1.00: configured for UDMA/133 02:45:59 :[ 533.360847] ata1: EH complete 02:45:59 :[ 533.584449] EXT4-fs (dm-0): re-mounted. Opts: errors=remount-ro,commit=0 02:45:59 :[ 533.586999] EXT4-fs (dm-2): re-mounted. Opts: commit=0 02:45:59 :[ 533.660132] ata2: hard resetting link 02:45:59 :[ 533.660151] ata1: hard resetting link 02:46:00 :[ 534.412536] ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 310) 02:46:00 :[ 534.412562] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310) 02:46:00 :[ 534.416768] ata2.00: configured for UDMA/133 02:46:00 :[ 534.416777] ata2: EH complete 02:46:00 :[ 534.431396] ata1.00: configured for UDMA/133 02:46:00 :[ 534.431401] ata1: EH complete 02:46:03 :[ 537.384649] EXT4-fs (dm-0): re-mounted. Opts: errors=remount-ro,commit=600 02:46:03 :[ 537.504214] EXT4-fs (dm-2): re-mounted. Opts: commit=600 02:46:03 :[ 537.586002] ata1: hard resetting link 02:46:03 :[ 537.586036] ata2: hard resetting link 02:46:04 :[ 538.330147] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310) 02:46:04 :[ 538.330168] ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 310) 02:46:04 :[ 538.334389] ata2.00: configured for UDMA/133 02:46:04 :[ 538.334398] ata2: EH complete 02:46:04 :[ 538.343511] ata1.00: configured for UDMA/133 02:46:04 :[ 538.343519] ata1: EH complete 02:46:04 :[ 538.456413] EXT4-fs (dm-0): re-mounted. Opts: errors=remount-ro,commit=0 02:46:04 :[ 538.459404] EXT4-fs (dm-2): re-mounted. Opts: commit=0 02:46:04 :[ 538.540138] ata1.00: limiting speed to UDMA/100:PIO4 02:46:04 :[ 538.540159] ata1: hard resetting link 02:46:04 :[ 538.540202] ata2.00: limiting speed to UDMA/100:PIO4 02:46:04 :[ 538.540220] ata2: hard resetting link 02:46:05 :[ 539.290054] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310) 02:46:05 :[ 539.290041] ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 310) 02:46:05 :[ 539.294100] ata2.00: configured for UDMA/100 02:46:05 :[ 539.294106] ata2: EH complete 02:46:05 :[ 539.314125] ata1.00: configured for UDMA/100 02:46:05 :[ 539.314132] ------------[ cut here ]------------ 02:46:05 :[ 539.314140] WARNING: at /build/buildd/linux-2.6.35/drivers/ata/libata-eh.c:3638 ata_eh_finish+0xdf/0xf0() 02:46:05 :[ 539.314144] Hardware name: MacBookPro5,3 02:46:05 :[ 539.314146] Modules linked in: michael_mic arc4 xt_multiport binfmt_misc rfcomm sco bnep l2cap parport_pc ppdev nvidia(P) ipt_REJECT xt_recent snd_hda_codec_cirrus xt_limit xt_tcpudp ipt_addrtype xt_state snd_hda_intel snd_hda_codec snd_hwdep snd_pcm snd_seq_midi applesmc led_class ip6table_filter lib80211_crypt_tkip snd_rawmidi snd_seq_midi_event ip6_tables input_polldev hid_apple snd_seq wl(P) snd_timer snd_seq_device snd joydev bcm5974 usbhid mbp_nvidia_bl uvcvideo btusb videodev v4l1_compat v4l2_compat_ioctl32 nf_nat_irc hid nf_conntrack_irc soundcore snd_page_alloc i2c_nforce2 coretemp lib80211 bluetooth nf_nat_ftp nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_conntrack_ftp nf_conntrack lp parport iptable_filter ip_tables x_tables usb_storage firewire_ohci firewire_core forcedeth crc_itu_t ahci libahci 02:46:05 :[ 539.314221] Pid: 202, comm: scsi_eh_0 Tainted: P 2.6.35-25-generic #44-Ubuntu 02:46:05 :[ 539.314224] Call Trace: 02:46:05 :[ 539.314233] [<ffffffff8106091f>] warn_slowpath_common+0x7f/0xc0 02:46:05 :[ 539.314237] [<ffffffff8106097a>] warn_slowpath_null+0x1a/0x20 02:46:05 :[ 539.314242] [<ffffffff813dc77f>] ata_eh_finish+0xdf/0xf0 02:46:05 :[ 539.314246] [<ffffffff813e441e>] sata_pmp_error_handler+0x2e/0x40 02:46:05 :[ 539.314256] [<ffffffffa00021bf>] ahci_error_handler+0x1f/0x90 [libahci] 02:46:05 :[ 539.314261] [<ffffffff813dd6d2>] ata_scsi_error+0x492/0x5e0 02:46:05 :[ 539.314266] [<ffffffff813b24cd>] scsi_error_handler+0x10d/0x190 02:46:05 :[ 539.314270] [<ffffffff813b23c0>] ? scsi_error_handler+0x0/0x190 02:46:05 :[ 539.314275] [<ffffffff8107f266>] kthread+0x96/0xa0 02:46:05 :[ 539.314280] [<ffffffff8100aee4>] kernel_thread_helper+0x4/0x10 02:46:05 :[ 539.314284] [<ffffffff8107f1d0>] ? kthread+0x0/0xa0 02:46:05 :[ 539.314288] [<ffffffff8100aee0>] ? kernel_thread_helper+0x0/0x10 02:46:05 :[ 539.314291] ---[ end trace 76dbffc2d5d49d9b ]--- 02:46:05 :[ 539.314296] ata1: EH complete 02:46:12 :[ 547.040117] ata1: hard resetting link 02:46:13 :[ 547.390144] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310) 02:46:13 :[ 547.408430] ata1.00: configured for UDMA/100 02:46:13 :[ 547.408438] ------------[ cut here ]------------ 02:46:13 :[ 547.408447] WARNING: at /build/buildd/linux-2.6.35/drivers/ata/libata-eh.c:3638 ata_eh_finish+0xdf/0xf0() 02:46:13 :[ 547.408451] Hardware name: MacBookPro5,3 02:46:13 :[ 547.408453] Modules linked in: michael_mic arc4 xt_multiport binfmt_misc rfcomm sco bnep l2cap parport_pc ppdev nvidia(P) ipt_REJECT xt_recent snd_hda_codec_cirrus xt_limit xt_tcpudp ipt_addrtype xt_state snd_hda_intel snd_hda_codec snd_hwdep snd_pcm snd_seq_midi applesmc led_class ip6table_filter lib80211_crypt_tkip snd_rawmidi snd_seq_midi_event ip6_tables input_polldev hid_apple snd_seq wl(P) snd_timer snd_seq_device snd joydev bcm5974 usbhid mbp_nvidia_bl uvcvideo btusb videodev v4l1_compat v4l2_compat_ioctl32 nf_nat_irc hid nf_conntrack_irc soundcore snd_page_alloc i2c_nforce2 coretemp lib80211 bluetooth nf_nat_ftp nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_conntrack_ftp nf_conntrack lp parport iptable_filter ip_tables x_tables usb_storage firewire_ohci firewire_core forcedeth crc_itu_t ahci libahci 02:46:13 :[ 547.408528] Pid: 202, comm: scsi_eh_0 Tainted: P W 2.6.35-25-generic #44-Ubuntu 02:46:13 :[ 547.408531] Call Trace: 02:46:13 :[ 547.408540] [<ffffffff8106091f>] warn_slowpath_common+0x7f/0xc0 02:46:13 :[ 547.408544] [<ffffffff8106097a>] warn_slowpath_null+0x1a/0x20 02:46:13 :[ 547.408549] [<ffffffff813dc77f>] ata_eh_finish+0xdf/0xf0 02:46:13 :[ 547.408553] [<ffffffff813e441e>] sata_pmp_error_handler+0x2e/0x40 02:46:13 :[ 547.408563] [<ffffffffa00021bf>] ahci_error_handler+0x1f/0x90 [libahci] 02:46:13 :[ 547.408567] [<ffffffff813dd6d2>] ata_scsi_error+0x492/0x5e0 02:46:13 :[ 547.408572] [<ffffffff813b24cd>] scsi_error_handler+0x10d/0x190 02:46:13 :[ 547.408577] [<ffffffff813b23c0>] ? scsi_error_handler+0x0/0x190 02:46:13 :[ 547.408582] [<ffffffff8107f266>] kthread+0x96/0xa0 02:46:13 :[ 547.408587] [<ffffffff8100aee4>] kernel_thread_helper+0x4/0x10 02:46:13 :[ 547.408591] [<ffffffff8107f1d0>] ? kthread+0x0/0xa0 02:46:13 :[ 547.408595] [<ffffffff8100aee0>] ? kernel_thread_helper+0x0/0x10 02:46:13 :[ 547.408598] ---[ end trace 76dbffc2d5d49d9c ]--- 02:46:13 :[ 547.408620] ata1: EH complete 02:46:13 :[ 547.562470] EXT4-fs (dm-0): re-mounted. Opts: errors=remount-ro,commit=600 02:46:13 :[ 547.671380] EXT4-fs (dm-2): re-mounted. Opts: commit=600 02:46:13 :[ 547.738198] ata1.00: limiting speed to UDMA/33:PIO4 02:46:13 :[ 547.738218] ata1: hard resetting link 02:46:13 :[ 547.738274] ata2: hard resetting link 02:46:14 :[ 548.482561] ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 310) 02:46:14 :[ 548.484083] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310) 02:46:14 :[ 548.486809] ata2.00: configured for UDMA/100 02:46:14 :[ 548.486818] ata2: EH complete 02:46:14 :[ 548.498998] ata1.00: configured for UDMA/33 02:46:14 :[ 548.499004] ata1: EH complete 02:46:18 :[ 552.410499] EXT4-fs (dm-0): re-mounted. Opts: errors=remount-ro,commit=600 02:46:18 :[ 552.522521] EXT4-fs (dm-2): re-mounted. Opts: commit=600 02:46:18 :[ 552.529684] ata1: hard resetting link 02:46:18 :[ 552.529723] ata2: hard resetting link 02:46:19 :[ 553.280059] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310) 02:46:19 :[ 553.280068] ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 310) 02:46:19 :[ 553.284141] ata2.00: configured for UDMA/100 02:46:19 :[ 553.284150] ata2: EH complete 02:46:19 :[ 553.301629] ata1.00: configured for UDMA/33 02:46:19 :[ 553.301637] ata1: EH complete 02:46:21 :[ 556.078830] EXT4-fs (dm-0): re-mounted. Opts: errors=remount-ro,commit=0 02:46:21 :[ 556.180361] EXT4-fs (dm-2): re-mounted. Opts: commit=0 02:46:22 :[ 556.262612] ata1: hard resetting link 02:46:22 :[ 556.262617] ata2: hard resetting link 02:46:22 :[ 557.010050] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310) 02:46:22 :[ 557.010070] ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 310) 02:46:22 :[ 557.014069] ata2.00: configured for UDMA/100 02:46:22 :[ 557.014075] ata2: EH complete 02:46:22 :[ 557.023646] ata1.00: configured for UDMA/33 02:46:22 :[ 557.023654] ata1: EH complete 02:46:30 :[ 565.047438] EXT4-fs (dm-0): re-mounted. Opts: errors=remount-ro,commit=600 02:46:30 :[ 565.051554] EXT4-fs (dm-2): re-mounted. Opts: commit=600 02:46:30 :[ 565.108332] ata1: hard resetting link 02:46:30 :[ 565.108389] ata2.00: limiting speed to UDMA/33:PIO4 02:46:30 :[ 565.108406] ata2: hard resetting link 02:46:31 :[ 565.850048] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310) 02:46:31 :[ 565.850068] ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 310) 02:46:31 :[ 565.854304] ata2.00: configured for UDMA/33 02:46:31 :[ 565.854313] ata2: EH complete 02:46:31 :[ 565.868477] ata1.00: configured for UDMA/33 02:46:31 :[ 565.868485] ata1: EH complete 02:46:35 :[ 569.265469] EXT4-fs (dm-0): re-mounted. Opts: errors=remount-ro,commit=0 02:46:35 :[ 569.268139] EXT4-fs (dm-2): re-mounted. Opts: commit=0 02:46:35 :[ 569.340079] ata1: hard resetting link 02:46:35 :[ 569.340113] ata2: hard resetting link 02:46:35 :[ 570.092568] ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 310) 02:46:35 :[ 570.092589] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310) 02:46:35 :[ 570.096828] ata2.00: configured for UDMA/33 02:46:35 :[ 570.096837] ata2: EH complete 02:46:35 :[ 570.110727] ata1.00: configured for UDMA/33 02:46:35 :[ 570.110735] ata1: EH complete 02:47:04 :[ 598.528232] EXT4-fs (dm-0): re-mounted. Opts: errors=remount-ro,commit=600 02:47:04 :[ 598.653973] EXT4-fs (dm-2): re-mounted. Opts: commit=600 02:47:04 :[ 598.730854] ata1: hard resetting link 02:47:04 :[ 598.730910] ata2: hard resetting link 02:47:05 :[ 599.480136] ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 310) 02:47:05 :[ 599.480159] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310) 02:47:05 :[ 599.484206] ata2.00: configured for UDMA/33 02:47:05 :[ 599.484213] ata2: EH complete 02:47:05 :[ 599.496699] ata1.00: configured for UDMA/33 02:47:05 :[ 599.496707] ata1: EH complete 04:45:59 :[ 7733.756548] EXT4-fs (dm-0): re-mounted. Opts: errors=remount-ro,commit=0 04:45:59 :[ 7733.882748] EXT4-fs (dm-2): re-mounted. Opts: commit=0 04:45:59 :[ 7733.960142] ata1: hard resetting link 04:45:59 :[ 7733.960189] ata2: hard resetting link 04:46:00 :[ 7734.701926] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310) 04:46:00 :[ 7734.719939] ata1.00: configured for UDMA/33 04:46:00 :[ 7734.719946] ata1: EH complete 04:46:00 :[ 7734.722547] ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 310) 04:46:00 :[ 7734.726652] ata2.00: configured for UDMA/33 04:46:00 :[ 7734.726659] ata2: EH complete 04:46:02 :[ 7736.656465] ACPI: EC: GPE storm detected, transactions will use polling mode 13:38:49 :[39704.188621] EXT4-fs (dm-0): re-mounted. Opts: errors=remount-ro,commit=600 13:38:49 :[39704.280588] EXT4-fs (dm-2): re-mounted. Opts: commit=600 13:38:49 :[39704.360819] ata1: hard resetting link 13:38:49 :[39704.360882] ata2: hard resetting link 13:38:50 :[39705.112956] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310) 13:38:50 :[39705.114435] ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 310) 13:38:50 :[39705.118673] ata2.00: configured for UDMA/33 13:38:50 :[39705.118682] ata2: EH complete 13:38:50 :[39705.127076] ata1.00: configured for UDMA/33 13:38:50 :[39705.127084] ata1: EH complete 13:39:49 :[39764.142463] applesmc: F1Mn: write arg fail 13:48:11 :[40267.025145] applesmc: FS! : read arg fail 13:52:53 :[40548.596735] applesmc: FS! : read arg fail 13:53:58 :[40613.972856] applesmc: FS! : read arg fail 13:54:08 :[40624.057339] applesmc: FS! : read arg fail 13:58:20 :[40875.397749] applesmc: TC0D: read data fail 14:16:56 :[41991.722054] applesmc: Th2H: read data fail 14:22:32 :[42327.991522] applesmc: light sensor data length set to 10 14:26:19 :[42554.788886] applesmc: F1Mn: write arg fail 14:32:36 :[42931.860443] applesmc: TC0F: read data fail 14:34:32 :[43048.041469] EXT4-fs (dm-0): re-mounted. Opts: errors=remount-ro,commit=0 14:34:33 :[43048.185850] EXT4-fs (dm-2): re-mounted. Opts: commit=0 14:34:33 :[43048.270184] ata1: hard resetting link 14:34:33 :[43048.270224] ata2: hard resetting link 14:34:33 :[43049.030049] ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 310) 14:34:33 :[43049.030065] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310) 14:34:33 :[43049.034106] ata2.00: configured for UDMA/33 14:34:33 :[43049.034112] ata2: EH complete 14:34:33 :[43049.056952] ata1.00: configured for UDMA/33 14:34:33 :[43049.056959] ------------[ cut here ]------------ 14:34:33 :[43049.056968] WARNING: at /build/buildd/linux-2.6.35/drivers/ata/libata-eh.c:3638 ata_eh_finish+0xdf/0xf0() 14:34:33 :[43049.056971] Hardware name: MacBookPro5,3 14:34:33 :[43049.056973] Modules linked in: michael_mic arc4 xt_multiport binfmt_misc rfcomm sco bnep l2cap parport_pc ppdev nvidia(P) ipt_REJECT xt_recent snd_hda_codec_cirrus xt_limit xt_tcpudp ipt_addrtype xt_state snd_hda_intel snd_hda_codec snd_hwdep snd_pcm snd_seq_midi applesmc led_class ip6table_filter lib80211_crypt_tkip snd_rawmidi snd_seq_midi_event ip6_tables input_polldev hid_apple snd_seq wl(P) snd_timer snd_seq_device snd joydev bcm5974 usbhid mbp_nvidia_bl uvcvideo btusb videodev v4l1_compat v4l2_compat_ioctl32 nf_nat_irc hid nf_conntrack_irc soundcore snd_page_alloc i2c_nforce2 coretemp lib80211 bluetooth nf_nat_ftp nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_conntrack_ftp nf_conntrack lp parport iptable_filter ip_tables x_tables usb_storage firewire_ohci firewire_core forcedeth crc_itu_t ahci libahci 14:34:33 :[43049.057048] Pid: 202, comm: scsi_eh_0 Tainted: P W 2.6.35-25-generic #44-Ubuntu 14:34:33 :[43049.057052] Call Trace: 14:34:33 :[43049.057060] [<ffffffff8106091f>] warn_slowpath_common+0x7f/0xc0 14:34:33 :[43049.057064] [<ffffffff8106097a>] warn_slowpath_null+0x1a/0x20 14:34:33 :[43049.057069] [<ffffffff813dc77f>] ata_eh_finish+0xdf/0xf0 14:34:33 :[43049.057074] [<ffffffff813e441e>] sata_pmp_error_handler+0x2e/0x40 14:34:33 :[43049.057083] [<ffffffffa00021bf>] ahci_error_handler+0x1f/0x90 [libahci] 14:34:33 :[43049.057088] [<ffffffff813dd6d2>] ata_scsi_error+0x492/0x5e0 14:34:33 :[43049.057093] [<ffffffff813b24cd>] scsi_error_handler+0x10d/0x190 14:34:33 :[43049.057097] [<ffffffff813b23c0>] ? scsi_error_handler+0x0/0x190 14:34:33 :[43049.057102] [<ffffffff8107f266>] kthread+0x96/0xa0 14:34:33 :[43049.057107] [<ffffffff8100aee4>] kernel_thread_helper+0x4/0x10 14:34:33 :[43049.057111] [<ffffffff8107f1d0>] ? kthread+0x0/0xa0 14:34:33 :[43049.057115] [<ffffffff8100aee0>] ? kernel_thread_helper+0x0/0x10 14:34:33 :[43049.057118] ---[ end trace 76dbffc2d5d49d9d ]--- 14:34:33 :[43049.057123] ata1: EH complete 14:34:41 :[43057.012698] ata1: hard resetting link 14:34:42 :[43057.362780] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310) 14:34:42 :[43057.381432] ata1.00: configured for UDMA/33 14:34:42 :[43057.381441] ------------[ cut here ]------------ 14:34:42 :[43057.381450] WARNING: at /build/buildd/linux-2.6.35/drivers/ata/libata-eh.c:3638 ata_eh_finish+0xdf/0xf0() 14:34:42 :[43057.381453] Hardware name: MacBookPro5,3 14:34:42 :[43057.381455] Modules linked in: michael_mic arc4 xt_multiport binfmt_misc rfcomm sco bnep l2cap parport_pc ppdev nvidia(P) ipt_REJECT xt_recent snd_hda_codec_cirrus xt_limit xt_tcpudp ipt_addrtype xt_state snd_hda_intel snd_hda_codec snd_hwdep snd_pcm snd_seq_midi applesmc led_class ip6table_filter lib80211_crypt_tkip snd_rawmidi snd_seq_midi_event ip6_tables input_polldev hid_apple snd_seq wl(P) snd_timer snd_seq_device snd joydev bcm5974 usbhid mbp_nvidia_bl uvcvideo btusb videodev v4l1_compat v4l2_compat_ioctl32 nf_nat_irc hid nf_conntrack_irc soundcore snd_page_alloc i2c_nforce2 coretemp lib80211 bluetooth nf_nat_ftp nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_conntrack_ftp nf_conntrack lp parport iptable_filter ip_tables x_tables usb_storage firewire_ohci firewire_core forcedeth crc_itu_t ahci libahci 14:34:42 :[43057.381530] Pid: 202, comm: scsi_eh_0 Tainted: P W 2.6.35-25-generic #44-Ubuntu 14:34:42 :[43057.381533] Call Trace: 14:34:42 :[43057.381542] [<ffffffff8106091f>] warn_slowpath_common+0x7f/0xc0 14:34:42 :[43057.381546] [<ffffffff8106097a>] warn_slowpath_null+0x1a/0x20 14:34:42 :[43057.381551] [<ffffffff813dc77f>] ata_eh_finish+0xdf/0xf0 14:34:42 :[43057.381556] [<ffffffff813e441e>] sata_pmp_error_handler+0x2e/0x40 14:34:42 :[43057.381565] [<ffffffffa00021bf>] ahci_error_handler+0x1f/0x90 [libahci] 14:34:42 :[43057.381569] [<ffffffff813dd6d2>] ata_scsi_error+0x492/0x5e0 14:34:42 :[43057.381575] [<ffffffff813b24cd>] scsi_error_handler+0x10d/0x190 14:34:42 :[43057.381579] [<ffffffff813b23c0>] ? scsi_error_handler+0x0/0x190 14:34:42 :[43057.381584] [<ffffffff8107f266>] kthread+0x96/0xa0 14:34:42 :[43057.381589] [<ffffffff8100aee4>] kernel_thread_helper+0x4/0x10 14:34:42 :[43057.381594] [<ffffffff8107f1d0>] ? kthread+0x0/0xa0 14:34:42 :[43057.381598] [<ffffffff8100aee0>] ? kernel_thread_helper+0x0/0x10 14:34:42 :[43057.381601] ---[ end trace 76dbffc2d5d49d9e ]--- 14:34:42 :[43057.381624] ata1: EH complete 14:34:42 :[43057.557887] EXT4-fs (dm-0): re-mounted. Opts: errors=remount-ro,commit=600 14:34:42 :[43057.560517] EXT4-fs (dm-2): re-mounted. Opts: commit=600 14:34:42 :[43057.621194] ata1: hard resetting link 14:34:42 :[43057.621252] ata2: hard resetting link 14:34:43 :[43058.370141] ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 310) 14:34:43 :[43058.370162] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310) 14:34:43 :[43058.374407] ata2.00: configured for UDMA/33 14:34:43 :[43058.374415] ata2: EH complete 14:34:43 :[43058.381989] ata1.00: configured for UDMA/33 14:34:43 :[43058.381996] ata1: EH complete 14:34:43 :[43058.616228] EXT4-fs (dm-0): re-mounted. Opts: errors=remount-ro,commit=600 14:34:43 :[43058.618931] EXT4-fs (dm-2): re-mounted. Opts: commit=600 14:34:43 :[43058.626687] ata1: hard resetting link 14:34:43 :[43058.626731] ata2: hard resetting link 14:34:44 :[43059.372908] ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 310) 14:34:44 :[43059.372932] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310) 14:34:44 :[43059.376997] ata2.00: configured for UDMA/33 14:34:44 :[43059.377003] ata2: EH complete 14:34:44 :[43059.392576] ata1.00: configured for UDMA/33 14:34:44 :[43059.392585] ata1: EH complete 15:48:19 :[47474.710860] ata1: hard resetting link 15:48:19 :[47474.710882] ata2: hard resetting link 15:48:20 :[47475.460144] ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 310) 15:48:20 :[47475.460169] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310) 15:48:20 :[47475.473709] ata1.00: configured for UDMA/33 15:48:20 :[47475.473717] ata1: EH complete 15:48:20 :[47475.727960] ata2.00: configured for UDMA/33 15:48:20 :[47475.727969] ata2: EH complete 16:29:39 :[49954.295017] EXT4-fs (dm-0): re-mounted. Opts: errors=remount-ro,commit=0 16:29:39 :[49954.622307] EXT4-fs (dm-2): re-mounted. Opts: commit=0 16:29:39 :[49954.710139] ata1: hard resetting link 16:29:39 :[49954.710174] ata2: hard resetting link 16:29:40 :[49955.460046] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310) 16:29:40 :[49955.460062] ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 310) 16:29:40 :[49955.464138] ata2.00: configured for UDMA/33 16:29:40 :[49955.464144] ata2: EH complete 16:29:40 :[49955.473251] ata1.00: configured for UDMA/33 16:29:40 :[49955.473258] ata1: EH complete

    Read the article

  • Plan Caching and Query Memory Part I – When not to use stored procedure or other plan caching mechanisms like sp_executesql or prepared statement

    - by sqlworkshops
      The most common performance mistake SQL Server developers make: SQL Server estimates memory requirement for queries at compilation time. This mechanism is fine for dynamic queries that need memory, but not for queries that cache the plan. With dynamic queries the plan is not reused for different set of parameters values / predicates and hence different amount of memory can be estimated based on different set of parameter values / predicates. Common memory allocating queries are that perform Sort and do Hash Match operations like Hash Join or Hash Aggregation or Hash Union. This article covers Sort with examples. It is recommended to read Plan Caching and Query Memory Part II after this article which covers Hash Match operations.   When the plan is cached by using stored procedure or other plan caching mechanisms like sp_executesql or prepared statement, SQL Server estimates memory requirement based on first set of execution parameters. Later when the same stored procedure is called with different set of parameter values, the same amount of memory is used to execute the stored procedure. This might lead to underestimation / overestimation of memory on plan reuse, overestimation of memory might not be a noticeable issue for Sort operations, but underestimation of memory will lead to spill over tempdb resulting in poor performance.   This article covers underestimation / overestimation of memory for Sort. Plan Caching and Query Memory Part II covers underestimation / overestimation for Hash Match operation. It is important to note that underestimation of memory for Sort and Hash Match operations lead to spill over tempdb and hence negatively impact performance. Overestimation of memory affects the memory needs of other concurrently executing queries. In addition, it is important to note, with Hash Match operations, overestimation of memory can actually lead to poor performance.   To read additional articles I wrote click here.   In most cases it is cheaper to pay for the compilation cost of dynamic queries than huge cost for spill over tempdb, unless memory requirement for a stored procedure does not change significantly based on predicates.   The best way to learn is to practice. To create the below tables and reproduce the behavior, join the mailing list by using this link: www.sqlworkshops.com/ml and I will send you the table creation script. Most of these concepts are also covered in our webcasts: www.sqlworkshops.com/webcasts   Enough theory, let’s see an example where we sort initially 1 month of data and then use the stored procedure to sort 6 months of data.   Let’s create a stored procedure that sorts customers by name within certain date range.   --Example provided by www.sqlworkshops.com create proc CustomersByCreationDate @CreationDateFrom datetime, @CreationDateTo datetime as begin       declare @CustomerID int, @CustomerName varchar(48), @CreationDate datetime       select @CustomerName = c.CustomerName, @CreationDate = c.CreationDate from Customers c             where c.CreationDate between @CreationDateFrom and @CreationDateTo             order by c.CustomerName       option (maxdop 1)       end go Let’s execute the stored procedure initially with 1 month date range.   set statistics time on go --Example provided by www.sqlworkshops.com exec CustomersByCreationDate '2001-01-01', '2001-01-31' go The stored procedure took 48 ms to complete.     The stored procedure was granted 6656 KB based on 43199.9 rows being estimated.       The estimated number of rows, 43199.9 is similar to actual number of rows 43200 and hence the memory estimation should be ok.       There was no Sort Warnings in SQL Profiler.      Now let’s execute the stored procedure with 6 month date range. --Example provided by www.sqlworkshops.com exec CustomersByCreationDate '2001-01-01', '2001-06-30' go The stored procedure took 679 ms to complete.      The stored procedure was granted 6656 KB based on 43199.9 rows being estimated.      The estimated number of rows, 43199.9 is way different from the actual number of rows 259200 because the estimation is based on the first set of parameter value supplied to the stored procedure which is 1 month in our case. This underestimation will lead to sort spill over tempdb, resulting in poor performance.      There was Sort Warnings in SQL Profiler.    To monitor the amount of data written and read from tempdb, one can execute select num_of_bytes_written, num_of_bytes_read from sys.dm_io_virtual_file_stats(2, NULL) before and after the stored procedure execution, for additional information refer to the webcast: www.sqlworkshops.com/webcasts.     Let’s recompile the stored procedure and then let’s first execute the stored procedure with 6 month date range.  In a production instance it is not advisable to use sp_recompile instead one should use DBCC FREEPROCCACHE (plan_handle). This is due to locking issues involved with sp_recompile, refer to our webcasts for further details.   exec sp_recompile CustomersByCreationDate go --Example provided by www.sqlworkshops.com exec CustomersByCreationDate '2001-01-01', '2001-06-30' go Now the stored procedure took only 294 ms instead of 679 ms.    The stored procedure was granted 26832 KB of memory.      The estimated number of rows, 259200 is similar to actual number of rows of 259200. Better performance of this stored procedure is due to better estimation of memory and avoiding sort spill over tempdb.      There was no Sort Warnings in SQL Profiler.       Now let’s execute the stored procedure with 1 month date range.   --Example provided by www.sqlworkshops.com exec CustomersByCreationDate '2001-01-01', '2001-01-31' go The stored procedure took 49 ms to complete, similar to our very first stored procedure execution.     This stored procedure was granted more memory (26832 KB) than necessary memory (6656 KB) based on 6 months of data estimation (259200 rows) instead of 1 month of data estimation (43199.9 rows). This is because the estimation is based on the first set of parameter value supplied to the stored procedure which is 6 months in this case. This overestimation did not affect performance, but it might affect performance of other concurrent queries requiring memory and hence overestimation is not recommended. This overestimation might affect performance Hash Match operations, refer to article Plan Caching and Query Memory Part II for further details.    Let’s recompile the stored procedure and then let’s first execute the stored procedure with 2 day date range. exec sp_recompile CustomersByCreationDate go --Example provided by www.sqlworkshops.com exec CustomersByCreationDate '2001-01-01', '2001-01-02' go The stored procedure took 1 ms.      The stored procedure was granted 1024 KB based on 1440 rows being estimated.      There was no Sort Warnings in SQL Profiler.      Now let’s execute the stored procedure with 6 month date range. --Example provided by www.sqlworkshops.com exec CustomersByCreationDate '2001-01-01', '2001-06-30' go   The stored procedure took 955 ms to complete, way higher than 679 ms or 294ms we noticed before.      The stored procedure was granted 1024 KB based on 1440 rows being estimated. But we noticed in the past this stored procedure with 6 month date range needed 26832 KB of memory to execute optimally without spill over tempdb. This is clear underestimation of memory and the reason for the very poor performance.      There was Sort Warnings in SQL Profiler. Unlike before this was a Multiple pass sort instead of Single pass sort. This occurs when granted memory is too low.      Intermediate Summary: This issue can be avoided by not caching the plan for memory allocating queries. Other possibility is to use recompile hint or optimize for hint to allocate memory for predefined date range.   Let’s recreate the stored procedure with recompile hint. --Example provided by www.sqlworkshops.com drop proc CustomersByCreationDate go create proc CustomersByCreationDate @CreationDateFrom datetime, @CreationDateTo datetime as begin       declare @CustomerID int, @CustomerName varchar(48), @CreationDate datetime       select @CustomerName = c.CustomerName, @CreationDate = c.CreationDate from Customers c             where c.CreationDate between @CreationDateFrom and @CreationDateTo             order by c.CustomerName       option (maxdop 1, recompile)       end go Let’s execute the stored procedure initially with 1 month date range and then with 6 month date range. --Example provided by www.sqlworkshops.com exec CustomersByCreationDate '2001-01-01', '2001-01-30' exec CustomersByCreationDate '2001-01-01', '2001-06-30' go The stored procedure took 48ms and 291 ms in line with previous optimal execution times.      The stored procedure with 1 month date range has good estimation like before.      The stored procedure with 6 month date range also has good estimation and memory grant like before because the query was recompiled with current set of parameter values.      The compilation time and compilation CPU of 1 ms is not expensive in this case compared to the performance benefit.     Let’s recreate the stored procedure with optimize for hint of 6 month date range.   --Example provided by www.sqlworkshops.com drop proc CustomersByCreationDate go create proc CustomersByCreationDate @CreationDateFrom datetime, @CreationDateTo datetime as begin       declare @CustomerID int, @CustomerName varchar(48), @CreationDate datetime       select @CustomerName = c.CustomerName, @CreationDate = c.CreationDate from Customers c             where c.CreationDate between @CreationDateFrom and @CreationDateTo             order by c.CustomerName       option (maxdop 1, optimize for (@CreationDateFrom = '2001-01-01', @CreationDateTo ='2001-06-30'))       end go Let’s execute the stored procedure initially with 1 month date range and then with 6 month date range.   --Example provided by www.sqlworkshops.com exec CustomersByCreationDate '2001-01-01', '2001-01-30' exec CustomersByCreationDate '2001-01-01', '2001-06-30' go The stored procedure took 48ms and 291 ms in line with previous optimal execution times.    The stored procedure with 1 month date range has overestimation of rows and memory. This is because we provided hint to optimize for 6 months of data.      The stored procedure with 6 month date range has good estimation and memory grant because we provided hint to optimize for 6 months of data.       Let’s execute the stored procedure with 12 month date range using the currently cashed plan for 6 month date range. --Example provided by www.sqlworkshops.com exec CustomersByCreationDate '2001-01-01', '2001-12-31' go The stored procedure took 1138 ms to complete.      2592000 rows were estimated based on optimize for hint value for 6 month date range. Actual number of rows is 524160 due to 12 month date range.      The stored procedure was granted enough memory to sort 6 month date range and not 12 month date range, so there will be spill over tempdb.      There was Sort Warnings in SQL Profiler.      As we see above, optimize for hint cannot guarantee enough memory and optimal performance compared to recompile hint.   This article covers underestimation / overestimation of memory for Sort. Plan Caching and Query Memory Part II covers underestimation / overestimation for Hash Match operation. It is important to note that underestimation of memory for Sort and Hash Match operations lead to spill over tempdb and hence negatively impact performance. Overestimation of memory affects the memory needs of other concurrently executing queries. In addition, it is important to note, with Hash Match operations, overestimation of memory can actually lead to poor performance.   Summary: Cached plan might lead to underestimation or overestimation of memory because the memory is estimated based on first set of execution parameters. It is recommended not to cache the plan if the amount of memory required to execute the stored procedure has a wide range of possibilities. One can mitigate this by using recompile hint, but that will lead to compilation overhead. However, in most cases it might be ok to pay for compilation rather than spilling sort over tempdb which could be very expensive compared to compilation cost. The other possibility is to use optimize for hint, but in case one sorts more data than hinted by optimize for hint, this will still lead to spill. On the other side there is also the possibility of overestimation leading to unnecessary memory issues for other concurrently executing queries. In case of Hash Match operations, this overestimation of memory might lead to poor performance. When the values used in optimize for hint are archived from the database, the estimation will be wrong leading to worst performance, so one has to exercise caution before using optimize for hint, recompile hint is better in this case. I explain these concepts with detailed examples in my webcasts (www.sqlworkshops.com/webcasts), I recommend you to watch them. The best way to learn is to practice. To create the above tables and reproduce the behavior, join the mailing list at www.sqlworkshops.com/ml and I will send you the relevant SQL Scripts.     Register for the upcoming 3 Day Level 400 Microsoft SQL Server 2008 and SQL Server 2005 Performance Monitoring & Tuning Hands-on Workshop in London, United Kingdom during March 15-17, 2011, click here to register / Microsoft UK TechNet.These are hands-on workshops with a maximum of 12 participants and not lectures. For consulting engagements click here.     Disclaimer and copyright information:This article refers to organizations and products that may be the trademarks or registered trademarks of their various owners. Copyright of this article belongs to R Meyyappan / www.sqlworkshops.com. You may freely use the ideas and concepts discussed in this article with acknowledgement (www.sqlworkshops.com), but you may not claim any of it as your own work. This article is for informational purposes only; you use any of the suggestions given here entirely at your own risk.   R Meyyappan [email protected] LinkedIn: http://at.linkedin.com/in/rmeyyappan

    Read the article

< Previous Page | 213 214 215 216 217 218 219  | Next Page >