Search Results

Search found 32516 results on 1301 pages for 'oracle off topic'.

Page 531/1301 | < Previous Page | 527 528 529 530 531 532 533 534 535 536 537 538  | Next Page >

  • "FOR UPDATE" v/s "LOCK IN SHARE MODE" : Allow concurrent threads to read updated "state" value of locked row

    - by shadesco
    I have the following scenario: User X logs in to the application from location lc1: call it Ulc1 User X (has been hacked, or some friend of his knows his login credential, or he just logs in from a different browser on his machine,etc.. u got the point) logs in at the same time from location lc2: call it Ulc2 I am using a main servlet which : - gets a connection from database pooling - sets autocommit to false - executes a command that goes through app layers: if all successful, set autocommit to true in a "finally" statement, and closes connection. Else if an exception happens, rollback(). In my database (mysql/innoDb) i have a "history" table, with row columns: id(primary key) |username | date | topic | locked The column "locked" has by default value "false" and it serves as a flag that marks if a specific row is locked or not. Each row is specific to a user (as u can see from the username column) So back to the scenario: --Ulc1 sends the command to update his history from the db for date "D" and topic "T". --Ulc2 sends the same command to update history from the db for the same date "D" and same topic "T" at the exact same time. I want to implement an mysql/innoDB locking system that will enable whichever thread arriving to do the following check: Is column "locked" for this row true or not? if true, return a message to the user that " he is already updating the same data from another location" if not true (ie not locked) : flag it as locked and update then reset locked to false once finished. Which of these two mysql locking techniques, will actually allow the 2nd arriving thread from reading the "updated" value of the locked column to decide wt action to take?Should i use "FOR UPDATE" or "LOCK IN SHARE MODE"? This scenario explains what i want to accomplish: - Ulc1 thread arrives first: column "locked" is false, set it to true and continue updating process - Ulc2 thread arrives while Ulc1's transaction is still in process, and even though the row is locked through innoDb functionalities, it doesn't have to wait but in fact reads the "new" value of column locked which is "true", and so doesn't in fact have to wait till Ulc1 transaction commits to read the value of the "locked" column(anyway by that time the value of this column will already have been reset to false). I am not very experienced with the 2 types of locking mechanisms, what i understand so far is that LOCK IN SHARE MODE allow other transaction to read the locked row while FOR UPDATE doesn't even allow reading. But does this read gets on the updated value? or the 2nd arriving thread has to wait the first thread to commit to then read the value? Any recommendations about which locking mechanism to use for this scenario is appreciated. Also if there's a better way to "check" if the row has been locked (other than using a true/false column flag) please let me know about it. thank you SOLUTION (Jdbc pseudocode example based on @Darhazer's answer) Table : [ id(primary key) |username | date | topic | locked ] connection.setautocommit(false); //transaction-1 PreparedStatement ps1 = "Select locked from tableName for update where id="key" and locked=false); ps1.executeQuery(); //transaction 2 PreparedStatement ps2 = "Update tableName set locked=true where id="key"; ps2.executeUpdate(); connection.setautocommit(true);// here we allow other transactions threads to see the new value connection.setautocommit(false); //transaction 3 PreparedStatement ps3 = "Update tableName set aField="Sthg" where id="key" And date="D" and topic="T"; ps3.executeUpdate(); // reset locked to false PreparedStatement ps4 = "Update tableName set locked=false where id="key"; ps4.executeUpdate(); //commit connection.setautocommit(true);

    Read the article

  • How to implement tail calls in a custom VM

    - by DeadMG
    How can I implement tail calls in a custom virtual machine? I know that I need to pop off the original function's local stack, then it's arguments, then push on the new arguments. But, if I pop off the function's local stack, how am I supposed to push on the new arguments? They've just been popped off the stack.

    Read the article

  • Why does this anonymous function starting with println result in a NullPointerException?

    - by noahz
    I am learning about pmap and wrote the following function: (pmap #((println "hello from " (-> (Thread/currentThread) .getName)) (+ %1 %2)) [1 1 1] [-1 -1 -1]) When run, the result is a NullPointerException (hello from clojure-agent-send-off-pool-4 hello from clojure-agent-send-off-pool-3 hello from clojure-agent-send-off-pool-5 NullPointerException user/eval55/fn--56 (NO_SOURCE_FILE:11) Why is this happening? I have understood and observed the body of a fn to be an implicit do.

    Read the article

  • Random Cache Expiry

    - by mahemoff
    I've been experimenting with random cache expiry times to avoid situations where an individual request forces multiple things to update at once. For example, a web page might include five different components. If each is set to time out in 30 minutes, the user will have a long wait time every 30 minutes. So instead, you set them all to a random time between 15 and 45 minutes to make it likely at most only one component will reload for any given page load. I'm trying to find any research or guidelines on this topic, e.g. optimal variance parameters. I do recall seeing one article about how Google (?) uses this technique, but can't locate it, and there doesn't seem to be much written about the topic.

    Read the article

  • stdio data from write not making it into a file

    - by user1551209
    I'm having a problem with using stdio commands for manipulating data in a file. I short, when I write data into a file, write returns an int indicating that it was successful, but when I read it back out I only get the old data. Here's a stripped down version of the code: fd = open(filename,O_RDWR|O_APPEND); struct dE *cDE = malloc(sizeof(struct dE)); //Read present data printf("\nreading values at %d\n",off); printf("SeekStatus <%d>\n",lseek(fd,off,SEEK_SET)); printf("ReadStatus <%d>\n",read(fd,cDE,deSize)); printf("current Key/Data <%d/%s>\n",cDE->key,cDE->data); printf("\nwriting new values\n"); //Change the values locally cDE->key = //something new cDE->data = //something new //Write them back printf("SeekStatus <%d>\n",lseek(fd,off,SEEK_SET)); printf("WriteStatus <%d>\n",write(fd,cDE,deSize)); //Re-read to make sure that it got written back printf("\nre-reading values at %d\n",off); printf("SeekStatus <%d>\n",lseek(fd,off,SEEK_SET)); printf("ReadStatus <%d>\n",read(fd,cDE,deSize)); printf("current Key/Data <%d/%s>\n",cDE->key,cDE->data); Furthermore, here's the dE struct in case you're wondering: struct dE { int key; char data[DataSize]; }; This prints: reading values at 1072 SeekStatus <1072> ReadStatus <32> current Key/Data <27/old> writing new values SeekStatus <1072> WriteStatus <32> re-reading values at 1072 SeekStatus <1072> ReadStatus <32> current Key/Data <27/old>

    Read the article

  • What does ON [PRIMARY] mean?

    - by Icono123
    I'm creating an SQL setup script and I'm using someone else's script as an example. Here's an example of the script: SET ANSI_NULLS ON GO SET QUOTED_IDENTIFIER ON GO CREATE TABLE [dbo].[be_Categories]( [CategoryID] [uniqueidentifier] ROWGUIDCOL NOT NULL CONSTRAINT [DF_be_Categories_CategoryID] DEFAULT (newid()), [CategoryName] [nvarchar](50) NULL, [Description] [nvarchar](200) NULL, [ParentID] [uniqueidentifier] NULL, CONSTRAINT [PK_be_Categories] PRIMARY KEY CLUSTERED ( [CategoryID] ASC )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY] ) ON [PRIMARY] GO Does anyone know what the ON [PRIMARY] command does? Regards.

    Read the article

  • How can I build pyv8 from source on FreeBSD against the v8 port?

    - by Utkonos
    I am unable to build pyv8 from source on FreeBSD. I have installed the /usr/ports/lang/v8 port, and I'm running into the following error. It seems that pyv8 wants to build v8 itself even though v8 is already built and installed. How can I point pyv8 to the already installed location of v8? # python setup.py build Found Google v8 base on V8_HOME , update it to the latest SVN trunk at running build ==================== INFO: Installing or updating GYP... -------------------- INFO: Check out GYP from SVN ... DEBUG: make dependencies ERROR: Check out GYP from SVN failed: code=2 DEBUG: "Makefile", line 43: Missing dependency operator "Makefile", line 45: Need an operator "Makefile", line 46: Need an operator "Makefile", line 48: Need an operator "Makefile", line 50: Need an operator "Makefile", line 52: Need an operator "Makefile", line 54: Missing dependency operator "Makefile", line 56: Need an operator "Makefile", line 58: Missing dependency operator "Makefile", line 60: Need an operator "Makefile", line 62: Missing dependency operator "Makefile", line 64: Need an operator "Makefile", line 66: Missing dependency operator "Makefile", line 68: Need an operator "Makefile", line 70: Missing dependency operator "Makefile", line 72: Need an operator "Makefile", line 73: Missing dependency operator "Makefile", line 75: Need an operator "Makefile", line 77: Missing dependency operator "Makefile", line 79: Need an operator "Makefile", line 81: Missing dependency operator "Makefile", line 83: Need an operator "Makefile", line 85: Missing dependency operator "Makefile", line 87: Need an operator "Makefile", line 89: Need an operator "Makefile", line 91: Missing dependency operator "Makefile", line 93: Need an operator "Makefile", line 95: Need an operator "Makefile", line 97: Need an operator "Makefile", line 99: Missing dependency operator "Makefile", line 101: Need an operator "Makefile", line 103: Missing dependency operator "Makefile", line 105: Need an operator "Makefile", line 107: Missing dependency operator "Makefile", line 109: Need an operator "Makefile", line 111: Missing dependency operator "Makefile", line 113: Need an operator "Makefile", line 115: Missing dependency operator "Makefile", line 117: Need an operator Error expanding embedded variable. ==================== INFO: Patching the GYP scripts INFO: patch the Google v8 build/standalone.gypi file to enable RTTI and C++ Exceptions ==================== INFO: building Google v8 with GYP for x64 platform with release mode -------------------- INFO: build v8 from SVN ... DEBUG: make verifyheap=off component=shared_library visibility=on gdbjit=off liveobjectlist=off regexp=native disassembler=off objectprint=off debuggersupport=on extrachecks=off snapshot=on werror=on x64.release ERROR: build v8 from SVN failed: code=2 DEBUG: "Makefile", line 43: Missing dependency operator "Makefile", line 45: Need an operator "Makefile", line 46: Need an operator "Makefile", line 48: Need an operator "Makefile", line 50: Need an operator "Makefile", line 52: Need an operator "Makefile", line 54: Missing dependency operator "Makefile", line 56: Need an operator "Makefile", line 58: Missing dependency operator "Makefile", line 60: Need an operator "Makefile", line 62: Missing dependency operator "Makefile", line 64: Need an operator "Makefile", line 66: Missing dependency operator "Makefile", line 68: Need an operator "Makefile", line 70: Missing dependency operator "Makefile", line 72: Need an operator "Makefile", line 73: Missing dependency operator "Makefile", line 75: Need an operator "Makefile", line 77: Missing dependency operator "Makefile", line 79: Need an operator "Makefile", line 81: Missing dependency operator "Makefile", line 83: Need an operator "Makefile", line 85: Missing dependency operator "Makefile", line 87: Need an operator "Makefile", line 89: Need an operator "Makefile", line 91: Missing dependency operator "Makefile", line 93: Need an operator "Makefile", line 95: Need an operator "Makefile", line 97: Need an operator "Makefile", line 99: Missing dependency operator "Makefile", line 101: Need an operator "Makefile", line 103: Missing dependency operator "Makefile", line 105: Need an operator "Makefile", line 107: Missing dependency operator "Makefile", line 109: Need an operator "Makefile", line 111: Missing dependency operator "Makefile", line 113: Need an operator "Makefile", line 115: Missing dependency operator "Makefile", line 117: Need an operator Error expanding embedded variable. The files that are installed by the v8 port are the following (in /usr/local): bin/d8 include/v8.h include/v8-debug.h include/v8-preparser.h include/v8-profiler.h include/v8-testing.h include/v8stdint.h lib/libv8.so lib/libv8.so.1

    Read the article

  • Screen -X exec commands not working until manually attached

    - by James Watt
    I have a batch script that starts a java server application inside of a screen. The command looks like this: cd /dir/ && screen -A -m -d -S javascreen java -Xms640M -Xmx1024M -jar javaserverapp.jar nogui After I run the batch script, it starts the server and puts it inside the correct screen. If I list my screens after, I see something like this: user@gtwy /dir $ screen -list There is a screen on: 16180.javascreen (Detached) 1 Socket in /var/run/screen/S-user. However, I have a second batch script that sends automated commands to this server and runs on a different crontab interval. Because of the way the application works, I send commands to it like this (this command tells it to alert connected users "testing 123"): screen -X exec .\!\! echo say testing 123 I've also tried: screen -R -X exec .\!\! echo say testing 123 screen -S javascreen -X exec .\!\! echo say testing 123 Unfortunately, these commands DO NOT WORK. They don't even give me an error message, they just do nothing. HOWEVER - If I manually attach to the screen first (with the below command) and then detach, now I can run any of the above commands flawlessly. I can demonstrate this with a video, if I wasn't clear enough here. screen -r -d Thanks in advance. Update: here is the important parts of /etc/screenrc. It should be totally vanilla, I've never edited this file. # VARIABLES # =============================================================== # No annoying audible bell, using "visual bell" # vbell on # default: off # vbell_msg " -- Bell,Bell!! -- " # default: "Wuff,Wuff!!" # Automatically detach on hangup. autodetach on # default: on # Don't display the copyright page startup_message off # default: on # Uses nethack-style messages # nethack on # default: off # Affects the copying of text regions crlf off # default: off # Enable/disable multiuser mode. Standard screen operation is singleuser. # In multiuser mode the commands acladd, aclchg, aclgrp and acldel can be used # to enable (and disable) other user accessing this screen session. # Requires suid-root. multiuser off # Change default scrollback value for new windows defscrollback 1000 # default: 100 # Define the time that all windows monitored for silence should # wait before displaying a message. Default 30 seconds. silencewait 15 # default: 30 # bufferfile: The file to use for commands # "readbuf" ('<') and "writebuf" ('>'): bufferfile $HOME/.screen_exchange # # hardcopydir: The directory which contains all hardcopies. # hardcopydir ~/.hardcopy # hardcopydir ~/.screen # # shell: Default process started in screen's windows. # Makes it possible to use a different shell inside screen # than is set as the default login shell. # If begins with a '-' character, the shell will be started as a login shell. # shell zsh # shell bash # shell ksh shell -$SHELL # shellaka '> |tcsh' # shelltitle '$ |bash' # emulate .logout message pow_detach_msg "Screen session of \$LOGNAME \$:cr:\$:nl:ended." # caption always " %w --- %c:%s" # caption always "%3n %t%? @%u%?%? [%h]%?%=%c" # advertise hardstatus support to $TERMCAP # termcapinfo * '' 'hs:ts=\E_:fs=\E\\:ds=\E_\E\\' # set every new windows hardstatus line to somenthing descriptive # defhstatus "screen: ^En (^Et)" # don't kill window after the process died # zombie "^["

    Read the article

  • unable to sniff traffic despite network interface being in monitor or promiscuous mode

    - by user65126
    I'm trying to sniff out my network's wireless traffic but am having issues. I'm able to put the card in monitor mode, but am unable to see any traffic except broadcasts, multicasts and probe/beacon frames. I have two network interfaces on this laptop. One is connected normally to 'linksys' and the other is in monitor mode. The interface in monitor mode is on the right channel. I'm not associated with the access point because, as I understand, I don't need to if using monitor mode (vs promiscuous). When I try to ping the router ip, I'm not seeing that traffic show up in wireshark. Here's my ifconfig settings: daniel@seasonBlack:~$ ifconfig eth0 Link encap:Ethernet HWaddr 00:1f:29:9e:b2:89 UP BROADCAST MULTICAST MTU:1500 Metric:1 RX packets:0 errors:0 dropped:0 overruns:0 frame:0 TX packets:0 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:1000 RX bytes:0 (0.0 B) TX bytes:0 (0.0 B) Interrupt:16 lo Link encap:Local Loopback inet addr:127.0.0.1 Mask:255.0.0.0 inet6 addr: ::1/128 Scope:Host UP LOOPBACK RUNNING MTU:16436 Metric:1 RX packets:112 errors:0 dropped:0 overruns:0 frame:0 TX packets:112 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0 RX bytes:8518 (8.5 KB) TX bytes:8518 (8.5 KB) wlan0 Link encap:Ethernet HWaddr 00:21:00:34:f7:f4 inet addr:192.168.1.116 Bcast:192.168.1.255 Mask:255.255.255.0 inet6 addr: fe80::221:ff:fe34:f7f4/64 Scope:Link UP BROADCAST RUNNING MTU:1500 Metric:1 RX packets:9758 errors:0 dropped:0 overruns:0 frame:0 TX packets:4869 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:1000 RX bytes:3291516 (3.2 MB) TX bytes:677386 (677.3 KB) wlan1 Link encap:UNSPEC HWaddr 00-02-72-7B-92-53-33-34-00-00-00-00-00-00-00-00 UP BROADCAST NOTRAILERS PROMISC ALLMULTI MTU:1500 Metric:1 RX packets:112754 errors:0 dropped:0 overruns:0 frame:0 TX packets:101 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:1000 RX bytes:18569124 (18.5 MB) TX bytes:12874 (12.8 KB) wmaster0 Link encap:UNSPEC HWaddr 00-21-00-34-F7-F4-00-00-00-00-00-00-00-00-00-00 UP RUNNING MTU:0 Metric:1 RX packets:0 errors:0 dropped:0 overruns:0 frame:0 TX packets:0 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:1000 RX bytes:0 (0.0 B) TX bytes:0 (0.0 B) wmaster1 Link encap:UNSPEC HWaddr 00-02-72-7B-92-53-00-00-00-00-00-00-00-00-00-00 UP RUNNING MTU:0 Metric:1 RX packets:0 errors:0 dropped:0 overruns:0 frame:0 TX packets:0 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:1000 RX bytes:0 (0.0 B) TX bytes:0 (0.0 B) Here's my iwconfig settings: daniel@seasonBlack:~$ iwconfig lo no wireless extensions. eth0 no wireless extensions. wmaster0 no wireless extensions. wlan0 IEEE 802.11bg ESSID:"linksys" Mode:Managed Frequency:2.437 GHz Access Point: 00:18:F8:D6:17:34 Bit Rate=54 Mb/s Tx-Power=27 dBm Retry long limit:7 RTS thr:off Fragment thr:off Power Management:off Link Quality=68/70 Signal level=-42 dBm Noise level=-69 dBm Rx invalid nwid:0 Rx invalid crypt:0 Rx invalid frag:0 Tx excessive retries:0 Invalid misc:0 Missed beacon:0 wmaster1 no wireless extensions. wlan1 IEEE 802.11bg Mode:Monitor Frequency:2.437 GHz Tx-Power=27 dBm Retry long limit:7 RTS thr:off Fragment thr:off Power Management:off Link Quality:0 Signal level:0 Noise level:0 Rx invalid nwid:0 Rx invalid crypt:0 Rx invalid frag:0 Tx excessive retries:0 Invalid misc:0 Missed beacon:0 Here's how I know I'm on the right channel: daniel@seasonBlack:~$ iwlist channel lo no frequency information. eth0 no frequency information. wmaster0 no frequency information. wlan0 11 channels in total; available frequencies : Channel 01 : 2.412 GHz Channel 02 : 2.417 GHz Channel 03 : 2.422 GHz Channel 04 : 2.427 GHz Channel 05 : 2.432 GHz Channel 06 : 2.437 GHz Channel 07 : 2.442 GHz Channel 08 : 2.447 GHz Channel 09 : 2.452 GHz Channel 10 : 2.457 GHz Channel 11 : 2.462 GHz Current Frequency=2.437 GHz (Channel 6) wmaster1 no frequency information. wlan1 11 channels in total; available frequencies : Channel 01 : 2.412 GHz Channel 02 : 2.417 GHz Channel 03 : 2.422 GHz Channel 04 : 2.427 GHz Channel 05 : 2.432 GHz Channel 06 : 2.437 GHz Channel 07 : 2.442 GHz Channel 08 : 2.447 GHz Channel 09 : 2.452 GHz Channel 10 : 2.457 GHz Channel 11 : 2.462 GHz Current Frequency=2.437 GHz (Channel 6)

    Read the article

  • Can't get IPv6 address using radvd on wireless interface on windows, wired works fine.

    - by AndrejaKo
    I set up my router to use OpenWRT and set it up to use IPv6 using a tunnel form SixXs. I'm having problems with stateless autoconfiguration using radvd. On my computer, wired connection can get its IPv6 address fine, but wireless can't. After spending some time on OpenWRT forums, I'm pretty much sure now that router is set up fine and that the problem is with my windows settings. Also, I do not have any problems getting IPv6 address on openSUSE 11.3. So what should I do to solve this and what information should I post? Here's radvdump output for wired interface: interface br-lan { AdvSendAdvert on; # Note: {Min,Max}RtrAdvInterval cannot be obtained with radvdump AdvManagedFlag on; AdvOtherConfigFlag on; AdvReachableTime 0; AdvRetransTimer 0; AdvCurHopLimit 64; AdvDefaultLifetime 1800; AdvHomeAgentFlag off; AdvDefaultPreference medium; AdvSourceLLAddress on; prefix 2001:15c0:67d0::/64 { AdvValidLifetime 86400; AdvPreferredLifetime 14400; AdvOnLink on; AdvAutonomous on; AdvRouterAddr off; }; # End of prefix definition }; # End of interface definition Here's radvdump output for wireless interface: # # radvd configuration generated by radvdump 1.6 # based on Router Advertisement from fe80::a0b7:deff:fef0:5b34 # received by interface br-lan # interface br-lan { AdvSendAdvert on; # Note: {Min,Max}RtrAdvInterval cannot be obtained with radvdump AdvManagedFlag on; AdvOtherConfigFlag on; AdvReachableTime 0; AdvRetransTimer 0; AdvCurHopLimit 64; AdvDefaultLifetime 1800; AdvHomeAgentFlag off; AdvDefaultPreference medium; AdvSourceLLAddress on; prefix 2001:15c0:67d0::/64 { AdvValidLifetime 86400; AdvPreferredLifetime 14400; AdvOnLink on; AdvAutonomous on; AdvRouterAddr off; }; # End of prefix definition }; # End of interface definition # # radvd configuration generated by radvdump 1.6 # based on Router Advertisement from fe80::a0b7:deff:fef0:5b34 # received by interface br-lan # interface br-lan { AdvSendAdvert on; # Note: {Min,Max}RtrAdvInterval cannot be obtained with radvdump AdvManagedFlag on; AdvOtherConfigFlag on; AdvReachableTime 0; AdvRetransTimer 0; AdvCurHopLimit 64; AdvDefaultLifetime 1800; AdvHomeAgentFlag off; AdvDefaultPreference medium; AdvSourceLLAddress on; prefix 2001:15c0:67d0::/64 { AdvValidLifetime 86400; AdvPreferredLifetime 14400; AdvOnLink on; AdvAutonomous on; AdvRouterAddr off; }; # End of prefix definition }; # End of interface definition

    Read the article

  • MySQL Syslog Audit Plugin

    - by jonathonc
    This post shows the construction process of the Syslog Audit plugin that was presented at MySQL Connect 2012. It is based on an environment that has the appropriate development tools enabled including gcc,g++ and cmake. It also assumes you have downloaded the MySQL source code (5.5.16 or higher) and have compiled and installed the system into the /usr/local/mysql directory ready for use.  The information provided below is designed to show the different components that make up a plugin, and specifically an audit type plugin, and how it comes together to be used within the MySQL service. The MySQL Reference Manual contains information regarding the plugin API and how it can be used, so please refer there for more detailed information. The code in this post is designed to give the simplest information necessary, so handling every return code, managing race conditions etc is not part of this example code. Let's start by looking at the most basic implementation of our plugin code as seen below: /*    Copyright (c) 2012, Oracle and/or its affiliates. All rights reserved.    Author:  Jonathon Coombes    Licence: GPL    Description: An auditing plugin that logs to syslog and                 can adjust the loglevel via the system variables. */ #include <stdio.h> #include <string.h> #include <mysql/plugin_audit.h> #include <syslog.h> There is a commented header detailing copyright/licencing and meta-data information and then the include headers. The two important include statements for our plugin are the syslog.h plugin, which gives us the structures for syslog, and the plugin_audit.h include which has details regarding the audit specific plugin api. Note that we do not need to include the general plugin header plugin.h, as this is done within the plugin_audit.h file already. To implement our plugin within the current implementation we need to add it into our source code and compile. > cd /usr/local/src/mysql-5.5.28/plugin > mkdir audit_syslog > cd audit_syslog A simple CMakeLists.txt file is created to manage the plugin compilation: MYSQL_ADD_PLUGIN(audit_syslog audit_syslog.cc MODULE_ONLY) Run the cmake  command at the top level of the source and then you can compile the plugin using the 'make' command. This results in a compiled audit_syslog.so library, but currently it is not much use to MySQL as there is no level of api defined to communicate with the MySQL service. Now we need to define the general plugin structure that enables MySQL to recognise the library as a plugin and be able to install/uninstall it and have it show up in the system. The structure is defined in the plugin.h file in the MySQL source code.  /*   Plugin library descriptor */ mysql_declare_plugin(audit_syslog) {   MYSQL_AUDIT_PLUGIN,           /* plugin type                    */   &audit_syslog_descriptor,     /* descriptor handle               */   "audit_syslog",               /* plugin name                     */   "Author Name",                /* author                          */   "Simple Syslog Audit",        /* description                     */   PLUGIN_LICENSE_GPL,           /* licence                         */   audit_syslog_init,            /* init function     */   audit_syslog_deinit,          /* deinit function */   0x0001,                       /* plugin version                  */   NULL,                         /* status variables        */   NULL,                         /* system variables                */   NULL,                         /* no reserves                     */   0,                            /* no flags                        */ } mysql_declare_plugin_end; The general plugin descriptor above is standard for all plugin types in MySQL. The plugin type is defined along with the init/deinit functions and interface methods into the system for sharing information, and various other metadata information. The descriptors have an internally recognised version number so that plugins can be matched against the api on the running server. The other details are usually related to the type-specific methods and structures to implement the plugin. Each plugin has a type-specific descriptor as well which details how the plugin is implemented for the specific purpose of that plugin type. /*   Plugin type-specific descriptor */ static struct st_mysql_audit audit_syslog_descriptor= {   MYSQL_AUDIT_INTERFACE_VERSION,                        /* interface version    */   NULL,                                                 /* release_thd function */   audit_syslog_notify,                                  /* notify function      */   { (unsigned long) MYSQL_AUDIT_GENERAL_CLASSMASK |                     MYSQL_AUDIT_CONNECTION_CLASSMASK }  /* class mask           */ }; In this particular case, the release_thd function has not been defined as it is not required. The important method for auditing is the notify function which is activated when an event occurs on the system. The notify function is designed to activate on an event and the implementation will determine how it is handled. For the audit_syslog plugin, the use of the syslog feature sends all events to the syslog for recording. The class mask allows us to determine what type of events are being seen by the notify function. There are currently two major types of event: 1. General Events: This includes general logging, errors, status and result type events. This is the main one for tracking the queries and operations on the database. 2. Connection Events: This group is based around user logins. It monitors connections and disconnections, but also if somebody changes user while connected. With most audit plugins, the principle behind the plugin is to track changes to the system over time and counters can be an important part of this process. The next step is to define and initialise the counters that are used to track the events in the service. There are 3 counters defined in total for our plugin - the # of general events, the # of connection events and the total number of events.  static volatile int total_number_of_calls; /* Count MYSQL_AUDIT_GENERAL_CLASS event instances */ static volatile int number_of_calls_general; /* Count MYSQL_AUDIT_CONNECTION_CLASS event instances */ static volatile int number_of_calls_connection; The init and deinit functions for the plugin are there to be called when the plugin is activated and when it is terminated. These offer the best option to initialise the counters for our plugin: /*  Initialize the plugin at server start or plugin installation. */ static int audit_syslog_init(void *arg __attribute__((unused))) {     openlog("mysql_audit:",LOG_PID|LOG_PERROR|LOG_CONS,LOG_USER);     total_number_of_calls= 0;     number_of_calls_general= 0;     number_of_calls_connection= 0;     return(0); } The init function does a call to openlog to initialise the syslog functionality. The parameters are the service to log under ("mysql_audit" in this case), the syslog flags and the facility for the logging. Then each of the counters are initialised to zero and a success is returned. If the init function is not defined, it will return success by default. /*  Terminate the plugin at server shutdown or plugin deinstallation. */ static int audit_syslog_deinit(void *arg __attribute__((unused))) {     closelog();     return(0); } The deinit function will simply close our syslog connection and return success. Note that the syslog functionality is part of the glibc libraries and does not require any external factors.  The function names are what we define in the general plugin structure, so these have to match otherwise there will be errors. The next step is to implement the event notifier function that was defined in the type specific descriptor (audit_syslog_descriptor) which is audit_syslog_notify. /* Event notifier function */ static void audit_syslog_notify(MYSQL_THD thd __attribute__((unused)), unsigned int event_class, const void *event) { total_number_of_calls++; if (event_class == MYSQL_AUDIT_GENERAL_CLASS) { const struct mysql_event_general *event_general= (const struct mysql_event_general *) event; number_of_calls_general++; syslog(audit_loglevel,"%lu: User: %s Command: %s Query: %s\n", event_general->general_thread_id, event_general->general_user, event_general->general_command, event_general->general_query ); } else if (event_class == MYSQL_AUDIT_CONNECTION_CLASS) { const struct mysql_event_connection *event_connection= (const struct mysql_event_connection *) event; number_of_calls_connection++; syslog(audit_loglevel,"%lu: User: %s@%s[%s] Event: %d Status: %d\n", event_connection->thread_id, event_connection->user, event_connection->host, event_connection->ip, event_connection->event_subclass, event_connection->status ); } }   In the case of an event, the notifier function is called. The first step is to increment the total number of events that have occurred in our database.The event argument is then cast into the appropriate event structure depending on the class type, of general event or connection event. The event type counters are incremented and details are sent via the syslog() function out to the system log. There are going to be different line formats and information returned since the general events have different data compared to the connection events, even though some of the details overlap, for example, user, thread id, host etc. On compiling the code now, there should be no errors and the resulting audit_syslog.so can be loaded into the server and ready to use. Log into the server and type: mysql> INSTALL PLUGIN audit_syslog SONAME 'audit_syslog.so'; This will install the plugin and will start updating the syslog immediately. Note that the audit plugin attaches to the immediate thread and cannot be uninstalled while that thread is active. This means that you cannot run the UNISTALL command until you log into a different connection (thread) on the server. Once the plugin is loaded, the system log will show output such as the following: Oct  8 15:33:21 machine mysql_audit:[8337]: 87: User: root[root] @ localhost []  Command: (null)  Query: INSTALL PLUGIN audit_syslog SONAME 'audit_syslog.so' Oct  8 15:33:21 machine mysql_audit:[8337]: 87: User: root[root] @ localhost []  Command: Query  Query: INSTALL PLUGIN audit_syslog SONAME 'audit_syslog.so' Oct  8 15:33:40 machine mysql_audit:[8337]: 87: User: root[root] @ localhost []  Command: (null)  Query: show tables Oct  8 15:33:40 machine mysql_audit:[8337]: 87: User: root[root] @ localhost []  Command: Query  Query: show tables Oct  8 15:33:43 machine mysql_audit:[8337]: 87: User: root[root] @ localhost []  Command: (null)  Query: select * from t1 Oct  8 15:33:43 machine mysql_audit:[8337]: 87: User: root[root] @ localhost []  Command: Query  Query: select * from t1 It appears that two of each event is being shown, but in actuality, these are two separate event types - the result event and the status event. This could be refined further by changing the audit_syslog_notify function to handle the different event sub-types in a different manner.  So far, it seems that the logging is working with events showing up in the syslog output. The issue now is that the counters created earlier to track the number of events by type are not accessible when the plugin is being run. Instead there needs to be a way to expose the plugin specific information to the service and vice versa. This could be done via the information_schema plugin api, but for something as simple as counters, the obvious choice is the system status variables. This is done using the standard structure and the declaration: /*  Plugin status variables for SHOW STATUS */ static struct st_mysql_show_var audit_syslog_status[]= {   { "Audit_syslog_total_calls",     (char *) &total_number_of_calls,     SHOW_INT },   { "Audit_syslog_general_events",     (char *) &number_of_calls_general,     SHOW_INT },   { "Audit_syslog_connection_events",     (char *) &number_of_calls_connection,     SHOW_INT },   { 0, 0, SHOW_INT } };   The structure is simply the name that will be displaying in the mysql service, the address of the associated variables, and the data type being used for the counter. It is finished with a blank structure to show that there are no more variables. Remember that status variables may have the same name for variables from other plugin, so it is considered appropriate to add the plugin name at the start of the status variable name to avoid confusion. Looking at the status variables in the mysql client shows something like the following: mysql> show global status like "audit%"; +--------------------------------+-------+ | Variable_name                  | Value | +--------------------------------+-------+ | Audit_syslog_connection_events | 1     | | Audit_syslog_general_events    | 2     | | Audit_syslog_total_calls       | 3     | +--------------------------------+-------+ 3 rows in set (0.00 sec) The final connectivity piece for the plugin is to allow the interactive change of the logging level between the plugin and the system. This requires the ability to send changes via the mysql service through to the plugin. This is done using the system variables interface and defining a single variable to keep track of the active logging level for the facility. /* Plugin system variables for SHOW VARIABLES */ static MYSQL_SYSVAR_STR(loglevel, audit_loglevel,                         PLUGIN_VAR_RQCMDARG,                         "User can specify the log level for auditing",                         audit_loglevel_check, audit_loglevel_update, "LOG_NOTICE"); static struct st_mysql_sys_var* audit_syslog_sysvars[] = {     MYSQL_SYSVAR(loglevel),     NULL }; So now the system variable 'loglevel' is defined for the plugin and associated to the global variable 'audit_loglevel'. The check or validation function is defined to make sure that no garbage values are attempted in the update of the variable. The update function is used to save the new value to the variable. Note that the audit_syslog_sysvars structure is defined in the general plugin descriptor to associate the link between the plugin and the system and how much they interact. Next comes the implementation of the validation function and the update function for the system variable. It is worth noting that if you have a simple numeric such as integers for the variable types, the validate function is often not required as MySQL will handle the automatic check and validation of simple types. /* longest valid value */ #define MAX_LOGLEVEL_SIZE 100 /* hold the valid values */ static const char *possible_modes[]= { "LOG_ERROR", "LOG_WARNING", "LOG_NOTICE", NULL };  static int audit_loglevel_check(     THD*                        thd,    /*!< in: thread handle */     struct st_mysql_sys_var*    var,    /*!< in: pointer to system                                         variable */     void*                       save,   /*!< out: immediate result                                         for update function */     struct st_mysql_value*      value)  /*!< in: incoming string */ {     char buff[MAX_LOGLEVEL_SIZE];     const char *str;     const char **found;     int length;     length= sizeof(buff);     if (!(str= value->val_str(value, buff, &length)))         return 1;     /*         We need to return a pointer to a locally allocated value in "save".         Here we pick to search for the supplied value in an global array of         constant strings and return a pointer to one of them.         The other possiblity is to use the thd_alloc() function to allocate         a thread local buffer instead of the global constants.     */     for (found= possible_modes; *found; found++)     {         if (!strcmp(*found, str))         {             *(const char**)save= *found;             return 0;         }     }     return 1; } The validation function is simply to take the value being passed in via the SET GLOBAL VARIABLE command and check if it is one of the pre-defined values allowed  in our possible_values array. If it is found to be valid, then the value is assigned to the save variable ready for passing through to the update function. static void audit_loglevel_update(     THD*                        thd,        /*!< in: thread handle */     struct st_mysql_sys_var*    var,        /*!< in: system variable                                             being altered */     void*                       var_ptr,    /*!< out: pointer to                                             dynamic variable */     const void*                 save)       /*!< in: pointer to                                             temporary storage */ {     /* assign the new value so that the server can read it */     *(char **) var_ptr= *(char **) save;     /* assign the new value to the internal variable */     audit_loglevel= *(char **) save; } Since all the validation has been done already, the update function is quite simple for this plugin. The first part is to update the system variable pointer so that the server can read the value. The second part is to update our own global plugin variable for tracking the value. Notice that the save variable is passed in as a void type to allow handling of various data types, so it must be cast to the appropriate data type when assigning it to the variables. Looking at how the latest changes affect the usage of the plugin and the interaction within the server shows: mysql> show global variables like "audit%"; +-----------------------+------------+ | Variable_name         | Value      | +-----------------------+------------+ | audit_syslog_loglevel | LOG_NOTICE | +-----------------------+------------+ 1 row in set (0.00 sec) mysql> set global audit_syslog_loglevel="LOG_ERROR"; Query OK, 0 rows affected (0.00 sec) mysql> show global status like "audit%"; +--------------------------------+-------+ | Variable_name                  | Value | +--------------------------------+-------+ | Audit_syslog_connection_events | 1     | | Audit_syslog_general_events    | 11    | | Audit_syslog_total_calls       | 12    | +--------------------------------+-------+ 3 rows in set (0.00 sec) mysql> show global variables like "audit%"; +-----------------------+-----------+ | Variable_name         | Value     | +-----------------------+-----------+ | audit_syslog_loglevel | LOG_ERROR | +-----------------------+-----------+ 1 row in set (0.00 sec)   So now we have a plugin that will audit the events on the system and log the details to the system log. It allows for interaction to see the number of different events within the server details and provides a mechanism to change the logging level interactively via the standard system methods of the SET command. A more complex auditing plugin may have more detailed code, but each of the above areas is what will be involved and simply expanded on to add more functionality. With the above skeleton code, it is now possible to create your own audit plugins to implement your own auditing requirements. If, however, you are not of the coding persuasion, then you could always consider the option of the MySQL Enterprise Audit plugin that is available to purchase.

    Read the article

  • value types in the vm

    - by john.rose
    value types in the vm p.p1 {margin: 0.0px 0.0px 0.0px 0.0px; font: 14.0px Times} p.p2 {margin: 0.0px 0.0px 14.0px 0.0px; font: 14.0px Times} p.p3 {margin: 0.0px 0.0px 12.0px 0.0px; font: 14.0px Times} p.p4 {margin: 0.0px 0.0px 15.0px 0.0px; font: 14.0px Times} p.p5 {margin: 0.0px 0.0px 0.0px 0.0px; font: 14.0px Courier} p.p6 {margin: 0.0px 0.0px 0.0px 0.0px; font: 14.0px Courier; min-height: 17.0px} p.p7 {margin: 0.0px 0.0px 0.0px 0.0px; font: 14.0px Times; min-height: 18.0px} p.p8 {margin: 0.0px 0.0px 0.0px 36.0px; text-indent: -36.0px; font: 14.0px Times; min-height: 18.0px} p.p9 {margin: 0.0px 0.0px 12.0px 0.0px; font: 14.0px Times; min-height: 18.0px} p.p10 {margin: 0.0px 0.0px 12.0px 0.0px; font: 14.0px Times; color: #000000} li.li1 {margin: 0.0px 0.0px 0.0px 0.0px; font: 14.0px Times} li.li7 {margin: 0.0px 0.0px 0.0px 0.0px; font: 14.0px Times; min-height: 18.0px} span.s1 {font: 14.0px Courier} span.s2 {color: #000000} span.s3 {font: 14.0px Courier; color: #000000} ol.ol1 {list-style-type: decimal} Or, enduring values for a changing world. Introduction A value type is a data type which, generally speaking, is designed for being passed by value in and out of methods, and stored by value in data structures. The only value types which the Java language directly supports are the eight primitive types. Java indirectly and approximately supports value types, if they are implemented in terms of classes. For example, both Integer and String may be viewed as value types, especially if their usage is restricted to avoid operations appropriate to Object. In this note, we propose a definition of value types in terms of a design pattern for Java classes, accompanied by a set of usage restrictions. We also sketch the relation of such value types to tuple types (which are a JVM-level notion), and point out JVM optimizations that can apply to value types. This note is a thought experiment to extend the JVM’s performance model in support of value types. The demonstration has two phases.  Initially the extension can simply use design patterns, within the current bytecode architecture, and in today’s Java language. But if the performance model is to be realized in practice, it will probably require new JVM bytecode features, changes to the Java language, or both.  We will look at a few possibilities for these new features. An Axiom of Value In the context of the JVM, a value type is a data type equipped with construction, assignment, and equality operations, and a set of typed components, such that, whenever two variables of the value type produce equal corresponding values for their components, the values of the two variables cannot be distinguished by any JVM operation. Here are some corollaries: A value type is immutable, since otherwise a copy could be constructed and the original could be modified in one of its components, allowing the copies to be distinguished. Changing the component of a value type requires construction of a new value. The equals and hashCode operations are strictly component-wise. If a value type is represented by a JVM reference, that reference cannot be successfully synchronized on, and cannot be usefully compared for reference equality. A value type can be viewed in terms of what it doesn’t do. We can say that a value type omits all value-unsafe operations, which could violate the constraints on value types.  These operations, which are ordinarily allowed for Java object types, are pointer equality comparison (the acmp instruction), synchronization (the monitor instructions), all the wait and notify methods of class Object, and non-trivial finalize methods. The clone method is also value-unsafe, although for value types it could be treated as the identity function. Finally, and most importantly, any side effect on an object (however visible) also counts as an value-unsafe operation. A value type may have methods, but such methods must not change the components of the value. It is reasonable and useful to define methods like toString, equals, and hashCode on value types, and also methods which are specifically valuable to users of the value type. Representations of Value Value types have two natural representations in the JVM, unboxed and boxed. An unboxed value consists of the components, as simple variables. For example, the complex number x=(1+2i), in rectangular coordinate form, may be represented in unboxed form by the following pair of variables: /*Complex x = Complex.valueOf(1.0, 2.0):*/ double x_re = 1.0, x_im = 2.0; These variables might be locals, parameters, or fields. Their association as components of a single value is not defined to the JVM. Here is a sample computation which computes the norm of the difference between two complex numbers: double distance(/*Complex x:*/ double x_re, double x_im,         /*Complex y:*/ double y_re, double y_im) {     /*Complex z = x.minus(y):*/     double z_re = x_re - y_re, z_im = x_im - y_im;     /*return z.abs():*/     return Math.sqrt(z_re*z_re + z_im*z_im); } A boxed representation groups component values under a single object reference. The reference is to a ‘wrapper class’ that carries the component values in its fields. (A primitive type can naturally be equated with a trivial value type with just one component of that type. In that view, the wrapper class Integer can serve as a boxed representation of value type int.) The unboxed representation of complex numbers is practical for many uses, but it fails to cover several major use cases: return values, array elements, and generic APIs. The two components of a complex number cannot be directly returned from a Java function, since Java does not support multiple return values. The same story applies to array elements: Java has no ’array of structs’ feature. (Double-length arrays are a possible workaround for complex numbers, but not for value types with heterogeneous components.) By generic APIs I mean both those which use generic types, like Arrays.asList and those which have special case support for primitive types, like String.valueOf and PrintStream.println. Those APIs do not support unboxed values, and offer some problems to boxed values. Any ’real’ JVM type should have a story for returns, arrays, and API interoperability. The basic problem here is that value types fall between primitive types and object types. Value types are clearly more complex than primitive types, and object types are slightly too complicated. Objects are a little bit dangerous to use as value carriers, since object references can be compared for pointer equality, and can be synchronized on. Also, as many Java programmers have observed, there is often a performance cost to using wrapper objects, even on modern JVMs. Even so, wrapper classes are a good starting point for talking about value types. If there were a set of structural rules and restrictions which would prevent value-unsafe operations on value types, wrapper classes would provide a good notation for defining value types. This note attempts to define such rules and restrictions. Let’s Start Coding Now it is time to look at some real code. Here is a definition, written in Java, of a complex number value type. @ValueSafe public final class Complex implements java.io.Serializable {     // immutable component structure:     public final double re, im;     private Complex(double re, double im) {         this.re = re; this.im = im;     }     // interoperability methods:     public String toString() { return "Complex("+re+","+im+")"; }     public List<Double> asList() { return Arrays.asList(re, im); }     public boolean equals(Complex c) {         return re == c.re && im == c.im;     }     public boolean equals(@ValueSafe Object x) {         return x instanceof Complex && equals((Complex) x);     }     public int hashCode() {         return 31*Double.valueOf(re).hashCode()                 + Double.valueOf(im).hashCode();     }     // factory methods:     public static Complex valueOf(double re, double im) {         return new Complex(re, im);     }     public Complex changeRe(double re2) { return valueOf(re2, im); }     public Complex changeIm(double im2) { return valueOf(re, im2); }     public static Complex cast(@ValueSafe Object x) {         return x == null ? ZERO : (Complex) x;     }     // utility methods and constants:     public Complex plus(Complex c)  { return new Complex(re+c.re, im+c.im); }     public Complex minus(Complex c) { return new Complex(re-c.re, im-c.im); }     public double abs() { return Math.sqrt(re*re + im*im); }     public static final Complex PI = valueOf(Math.PI, 0.0);     public static final Complex ZERO = valueOf(0.0, 0.0); } This is not a minimal definition, because it includes some utility methods and other optional parts.  The essential elements are as follows: The class is marked as a value type with an annotation. The class is final, because it does not make sense to create subclasses of value types. The fields of the class are all non-private and final.  (I.e., the type is immutable and structurally transparent.) From the supertype Object, all public non-final methods are overridden. The constructor is private. Beyond these bare essentials, we can observe the following features in this example, which are likely to be typical of all value types: One or more factory methods are responsible for value creation, including a component-wise valueOf method. There are utility methods for complex arithmetic and instance creation, such as plus and changeIm. There are static utility constants, such as PI. The type is serializable, using the default mechanisms. There are methods for converting to and from dynamically typed references, such as asList and cast. The Rules In order to use value types properly, the programmer must avoid value-unsafe operations.  A helpful Java compiler should issue errors (or at least warnings) for code which provably applies value-unsafe operations, and should issue warnings for code which might be correct but does not provably avoid value-unsafe operations.  No such compilers exist today, but to simplify our account here, we will pretend that they do exist. A value-safe type is any class, interface, or type parameter marked with the @ValueSafe annotation, or any subtype of a value-safe type.  If a value-safe class is marked final, it is in fact a value type.  All other value-safe classes must be abstract.  The non-static fields of a value class must be non-public and final, and all its constructors must be private. Under the above rules, a standard interface could be helpful to define value types like Complex.  Here is an example: @ValueSafe public interface ValueType extends java.io.Serializable {     // All methods listed here must get redefined.     // Definitions must be value-safe, which means     // they may depend on component values only.     List<? extends Object> asList();     int hashCode();     boolean equals(@ValueSafe Object c);     String toString(); } //@ValueSafe inherited from supertype: public final class Complex implements ValueType { … The main advantage of such a conventional interface is that (unlike an annotation) it is reified in the runtime type system.  It could appear as an element type or parameter bound, for facilities which are designed to work on value types only.  More broadly, it might assist the JVM to perform dynamic enforcement of the rules for value types. Besides types, the annotation @ValueSafe can mark fields, parameters, local variables, and methods.  (This is redundant when the type is also value-safe, but may be useful when the type is Object or another supertype of a value type.)  Working forward from these annotations, an expression E is defined as value-safe if it satisfies one or more of the following: The type of E is a value-safe type. E names a field, parameter, or local variable whose declaration is marked @ValueSafe. E is a call to a method whose declaration is marked @ValueSafe. E is an assignment to a value-safe variable, field reference, or array reference. E is a cast to a value-safe type from a value-safe expression. E is a conditional expression E0 ? E1 : E2, and both E1 and E2 are value-safe. Assignments to value-safe expressions and initializations of value-safe names must take their values from value-safe expressions. A value-safe expression may not be the subject of a value-unsafe operation.  In particular, it cannot be synchronized on, nor can it be compared with the “==” operator, not even with a null or with another value-safe type. In a program where all of these rules are followed, no value-type value will be subject to a value-unsafe operation.  Thus, the prime axiom of value types will be satisfied, that no two value type will be distinguishable as long as their component values are equal. More Code To illustrate these rules, here are some usage examples for Complex: Complex pi = Complex.valueOf(Math.PI, 0); Complex zero = pi.changeRe(0);  //zero = pi; zero.re = 0; ValueType vtype = pi; @SuppressWarnings("value-unsafe")   Object obj = pi; @ValueSafe Object obj2 = pi; obj2 = new Object();  // ok List<Complex> clist = new ArrayList<Complex>(); clist.add(pi);  // (ok assuming List.add param is @ValueSafe) List<ValueType> vlist = new ArrayList<ValueType>(); vlist.add(pi);  // (ok) List<Object> olist = new ArrayList<Object>(); olist.add(pi);  // warning: "value-unsafe" boolean z = pi.equals(zero); boolean z1 = (pi == zero);  // error: reference comparison on value type boolean z2 = (pi == null);  // error: reference comparison on value type boolean z3 = (pi == obj2);  // error: reference comparison on value type synchronized (pi) { }  // error: synch of value, unpredictable result synchronized (obj2) { }  // unpredictable result Complex qq = pi; qq = null;  // possible NPE; warning: “null-unsafe" qq = (Complex) obj;  // warning: “null-unsafe" qq = Complex.cast(obj);  // OK @SuppressWarnings("null-unsafe")   Complex empty = null;  // possible NPE qq = empty;  // possible NPE (null pollution) The Payoffs It follows from this that either the JVM or the java compiler can replace boxed value-type values with unboxed ones, without affecting normal computations.  Fields and variables of value types can be split into their unboxed components.  Non-static methods on value types can be transformed into static methods which take the components as value parameters. Some common questions arise around this point in any discussion of value types. Why burden the programmer with all these extra rules?  Why not detect programs automagically and perform unboxing transparently?  The answer is that it is easy to break the rules accidently unless they are agreed to by the programmer and enforced.  Automatic unboxing optimizations are tantalizing but (so far) unreachable ideal.  In the current state of the art, it is possible exhibit benchmarks in which automatic unboxing provides the desired effects, but it is not possible to provide a JVM with a performance model that assures the programmer when unboxing will occur.  This is why I’m writing this note, to enlist help from, and provide assurances to, the programmer.  Basically, I’m shooting for a good set of user-supplied “pragmas” to frame the desired optimization. Again, the important thing is that the unboxing must be done reliably, or else programmers will have no reason to work with the extra complexity of the value-safety rules.  There must be a reasonably stable performance model, wherein using a value type has approximately the same performance characteristics as writing the unboxed components as separate Java variables. There are some rough corners to the present scheme.  Since Java fields and array elements are initialized to null, value-type computations which incorporate uninitialized variables can produce null pointer exceptions.  One workaround for this is to require such variables to be null-tested, and the result replaced with a suitable all-zero value of the value type.  That is what the “cast” method does above. Generically typed APIs like List<T> will continue to manipulate boxed values always, at least until we figure out how to do reification of generic type instances.  Use of such APIs will elicit warnings until their type parameters (and/or relevant members) are annotated or typed as value-safe.  Retrofitting List<T> is likely to expose flaws in the present scheme, which we will need to engineer around.  Here are a couple of first approaches: public interface java.util.List<@ValueSafe T> extends Collection<T> { … public interface java.util.List<T extends Object|ValueType> extends Collection<T> { … (The second approach would require disjunctive types, in which value-safety is “contagious” from the constituent types.) With more transformations, the return value types of methods can also be unboxed.  This may require significant bytecode-level transformations, and would work best in the presence of a bytecode representation for multiple value groups, which I have proposed elsewhere under the title “Tuples in the VM”. But for starters, the JVM can apply this transformation under the covers, to internally compiled methods.  This would give a way to express multiple return values and structured return values, which is a significant pain-point for Java programmers, especially those who work with low-level structure types favored by modern vector and graphics processors.  The lack of multiple return values has a strong distorting effect on many Java APIs. Even if the JVM fails to unbox a value, there is still potential benefit to the value type.  Clustered computing systems something have copy operations (serialization or something similar) which apply implicitly to command operands.  When copying JVM objects, it is extremely helpful to know when an object’s identity is important or not.  If an object reference is a copied operand, the system may have to create a proxy handle which points back to the original object, so that side effects are visible.  Proxies must be managed carefully, and this can be expensive.  On the other hand, value types are exactly those types which a JVM can “copy and forget” with no downside. Array types are crucial to bulk data interfaces.  (As data sizes and rates increase, bulk data becomes more important than scalar data, so arrays are definitely accompanying us into the future of computing.)  Value types are very helpful for adding structure to bulk data, so a successful value type mechanism will make it easier for us to express richer forms of bulk data. Unboxing arrays (i.e., arrays containing unboxed values) will provide better cache and memory density, and more direct data movement within clustered or heterogeneous computing systems.  They require the deepest transformations, relative to today’s JVM.  There is an impedance mismatch between value-type arrays and Java’s covariant array typing, so compromises will need to be struck with existing Java semantics.  It is probably worth the effort, since arrays of unboxed value types are inherently more memory-efficient than standard Java arrays, which rely on dependent pointer chains. It may be sufficient to extend the “value-safe” concept to array declarations, and allow low-level transformations to change value-safe array declarations from the standard boxed form into an unboxed tuple-based form.  Such value-safe arrays would not be convertible to Object[] arrays.  Certain connection points, such as Arrays.copyOf and System.arraycopy might need additional input/output combinations, to allow smooth conversion between arrays with boxed and unboxed elements. Alternatively, the correct solution may have to wait until we have enough reification of generic types, and enough operator overloading, to enable an overhaul of Java arrays. Implicit Method Definitions The example of class Complex above may be unattractively complex.  I believe most or all of the elements of the example class are required by the logic of value types. If this is true, a programmer who writes a value type will have to write lots of error-prone boilerplate code.  On the other hand, I think nearly all of the code (except for the domain-specific parts like plus and minus) can be implicitly generated. Java has a rule for implicitly defining a class’s constructor, if no it defines no constructors explicitly.  Likewise, there are rules for providing default access modifiers for interface members.  Because of the highly regular structure of value types, it might be reasonable to perform similar implicit transformations on value types.  Here’s an example of a “highly implicit” definition of a complex number type: public class Complex implements ValueType {  // implicitly final     public double re, im;  // implicitly public final     //implicit methods are defined elementwise from te fields:     //  toString, asList, equals(2), hashCode, valueOf, cast     //optionally, explicit methods (plus, abs, etc.) would go here } In other words, with the right defaults, a simple value type definition can be a one-liner.  The observant reader will have noticed the similarities (and suitable differences) between the explicit methods above and the corresponding methods for List<T>. Another way to abbreviate such a class would be to make an annotation the primary trigger of the functionality, and to add the interface(s) implicitly: public @ValueType class Complex { … // implicitly final, implements ValueType (But to me it seems better to communicate the “magic” via an interface, even if it is rooted in an annotation.) Implicitly Defined Value Types So far we have been working with nominal value types, which is to say that the sequence of typed components is associated with a name and additional methods that convey the intention of the programmer.  A simple ordered pair of floating point numbers can be variously interpreted as (to name a few possibilities) a rectangular or polar complex number or Cartesian point.  The name and the methods convey the intended meaning. But what if we need a truly simple ordered pair of floating point numbers, without any further conceptual baggage?  Perhaps we are writing a method (like “divideAndRemainder”) which naturally returns a pair of numbers instead of a single number.  Wrapping the pair of numbers in a nominal type (like “QuotientAndRemainder”) makes as little sense as wrapping a single return value in a nominal type (like “Quotient”).  What we need here are structural value types commonly known as tuples. For the present discussion, let us assign a conventional, JVM-friendly name to tuples, roughly as follows: public class java.lang.tuple.$DD extends java.lang.tuple.Tuple {      double $1, $2; } Here the component names are fixed and all the required methods are defined implicitly.  The supertype is an abstract class which has suitable shared declarations.  The name itself mentions a JVM-style method parameter descriptor, which may be “cracked” to determine the number and types of the component fields. The odd thing about such a tuple type (and structural types in general) is it must be instantiated lazily, in response to linkage requests from one or more classes that need it.  The JVM and/or its class loaders must be prepared to spin a tuple type on demand, given a simple name reference, $xyz, where the xyz is cracked into a series of component types.  (Specifics of naming and name mangling need some tasteful engineering.) Tuples also seem to demand, even more than nominal types, some support from the language.  (This is probably because notations for non-nominal types work best as combinations of punctuation and type names, rather than named constructors like Function3 or Tuple2.)  At a minimum, languages with tuples usually (I think) have some sort of simple bracket notation for creating tuples, and a corresponding pattern-matching syntax (or “destructuring bind”) for taking tuples apart, at least when they are parameter lists.  Designing such a syntax is no simple thing, because it ought to play well with nominal value types, and also with pre-existing Java features, such as method parameter lists, implicit conversions, generic types, and reflection.  That is a task for another day. Other Use Cases Besides complex numbers and simple tuples there are many use cases for value types.  Many tuple-like types have natural value-type representations. These include rational numbers, point locations and pixel colors, and various kinds of dates and addresses. Other types have a variable-length ‘tail’ of internal values. The most common example of this is String, which is (mathematically) a sequence of UTF-16 character values. Similarly, bit vectors, multiple-precision numbers, and polynomials are composed of sequences of values. Such types include, in their representation, a reference to a variable-sized data structure (often an array) which (somehow) represents the sequence of values. The value type may also include ’header’ information. Variable-sized values often have a length distribution which favors short lengths. In that case, the design of the value type can make the first few values in the sequence be direct ’header’ fields of the value type. In the common case where the header is enough to represent the whole value, the tail can be a shared null value, or even just a null reference. Note that the tail need not be an immutable object, as long as the header type encapsulates it well enough. This is the case with String, where the tail is a mutable (but never mutated) character array. Field types and their order must be a globally visible part of the API.  The structure of the value type must be transparent enough to have a globally consistent unboxed representation, so that all callers and callees agree about the type and order of components  that appear as parameters, return types, and array elements.  This is a trade-off between efficiency and encapsulation, which is forced on us when we remove an indirection enjoyed by boxed representations.  A JVM-only transformation would not care about such visibility, but a bytecode transformation would need to take care that (say) the components of complex numbers would not get swapped after a redefinition of Complex and a partial recompile.  Perhaps constant pool references to value types need to declare the field order as assumed by each API user. This brings up the delicate status of private fields in a value type.  It must always be possible to load, store, and copy value types as coordinated groups, and the JVM performs those movements by moving individual scalar values between locals and stack.  If a component field is not public, what is to prevent hostile code from plucking it out of the tuple using a rogue aload or astore instruction?  Nothing but the verifier, so we may need to give it more smarts, so that it treats value types as inseparable groups of stack slots or locals (something like long or double). My initial thought was to make the fields always public, which would make the security problem moot.  But public is not always the right answer; consider the case of String, where the underlying mutable character array must be encapsulated to prevent security holes.  I believe we can win back both sides of the tradeoff, by training the verifier never to split up the components in an unboxed value.  Just as the verifier encapsulates the two halves of a 64-bit primitive, it can encapsulate the the header and body of an unboxed String, so that no code other than that of class String itself can take apart the values. Similar to String, we could build an efficient multi-precision decimal type along these lines: public final class DecimalValue extends ValueType {     protected final long header;     protected private final BigInteger digits;     public DecimalValue valueOf(int value, int scale) {         assert(scale >= 0);         return new DecimalValue(((long)value << 32) + scale, null);     }     public DecimalValue valueOf(long value, int scale) {         if (value == (int) value)             return valueOf((int)value, scale);         return new DecimalValue(-scale, new BigInteger(value));     } } Values of this type would be passed between methods as two machine words. Small values (those with a significand which fits into 32 bits) would be represented without any heap data at all, unless the DecimalValue itself were boxed. (Note the tension between encapsulation and unboxing in this case.  It would be better if the header and digits fields were private, but depending on where the unboxing information must “leak”, it is probably safer to make a public revelation of the internal structure.) Note that, although an array of Complex can be faked with a double-length array of double, there is no easy way to fake an array of unboxed DecimalValues.  (Either an array of boxed values or a transposed pair of homogeneous arrays would be reasonable fallbacks, in a current JVM.)  Getting the full benefit of unboxing and arrays will require some new JVM magic. Although the JVM emphasizes portability, system dependent code will benefit from using machine-level types larger than 64 bits.  For example, the back end of a linear algebra package might benefit from value types like Float4 which map to stock vector types.  This is probably only worthwhile if the unboxing arrays can be packed with such values. More Daydreams A more finely-divided design for dynamic enforcement of value safety could feature separate marker interfaces for each invariant.  An empty marker interface Unsynchronizable could cause suitable exceptions for monitor instructions on objects in marked classes.  More radically, a Interchangeable marker interface could cause JVM primitives that are sensitive to object identity to raise exceptions; the strangest result would be that the acmp instruction would have to be specified as raising an exception. @ValueSafe public interface ValueType extends java.io.Serializable,         Unsynchronizable, Interchangeable { … public class Complex implements ValueType {     // inherits Serializable, Unsynchronizable, Interchangeable, @ValueSafe     … It seems possible that Integer and the other wrapper types could be retro-fitted as value-safe types.  This is a major change, since wrapper objects would be unsynchronizable and their references interchangeable.  It is likely that code which violates value-safety for wrapper types exists but is uncommon.  It is less plausible to retro-fit String, since the prominent operation String.intern is often used with value-unsafe code. We should also reconsider the distinction between boxed and unboxed values in code.  The design presented above obscures that distinction.  As another thought experiment, we could imagine making a first class distinction in the type system between boxed and unboxed representations.  Since only primitive types are named with a lower-case initial letter, we could define that the capitalized version of a value type name always refers to the boxed representation, while the initial lower-case variant always refers to boxed.  For example: complex pi = complex.valueOf(Math.PI, 0); Complex boxPi = pi;  // convert to boxed myList.add(boxPi); complex z = myList.get(0);  // unbox Such a convention could perhaps absorb the current difference between int and Integer, double and Double. It might also allow the programmer to express a helpful distinction among array types. As said above, array types are crucial to bulk data interfaces, but are limited in the JVM.  Extending arrays beyond the present limitations is worth thinking about; for example, the Maxine JVM implementation has a hybrid object/array type.  Something like this which can also accommodate value type components seems worthwhile.  On the other hand, does it make sense for value types to contain short arrays?  And why should random-access arrays be the end of our design process, when bulk data is often sequentially accessed, and it might make sense to have heterogeneous streams of data as the natural “jumbo” data structure.  These considerations must wait for another day and another note. More Work It seems to me that a good sequence for introducing such value types would be as follows: Add the value-safety restrictions to an experimental version of javac. Code some sample applications with value types, including Complex and DecimalValue. Create an experimental JVM which internally unboxes value types but does not require new bytecodes to do so.  Ensure the feasibility of the performance model for the sample applications. Add tuple-like bytecodes (with or without generic type reification) to a major revision of the JVM, and teach the Java compiler to switch in the new bytecodes without code changes. A staggered roll-out like this would decouple language changes from bytecode changes, which is always a convenient thing. A similar investigation should be applied (concurrently) to array types.  In this case, it seems to me that the starting point is in the JVM: Add an experimental unboxing array data structure to a production JVM, perhaps along the lines of Maxine hybrids.  No bytecode or language support is required at first; everything can be done with encapsulated unsafe operations and/or method handles. Create an experimental JVM which internally unboxes value types but does not require new bytecodes to do so.  Ensure the feasibility of the performance model for the sample applications. Add tuple-like bytecodes (with or without generic type reification) to a major revision of the JVM, and teach the Java compiler to switch in the new bytecodes without code changes. That’s enough musing me for now.  Back to work!

    Read the article

  • Guidelines for using Merge task in SSIS

    - by thursdaysgeek
    I have a table with three fields, one an identity field, and I need to add some new records from a source that has the other two fields. I'm using SSIS, and I think I should use the merge tool, because one of the sources is not in the local database. But, I'm confused by the merge tool and the proper process. I have my one source (an Oracle table), and I get two fields, well_id and well_name, with a sort after, sorting by well_id. I have the destination table (sql server), and I'm also using that as a source. It has three fields: well_key (identity field), well_id, and well_name, and I then have a sort task, sorting on well_id. Both of those are input to my merge task. I was going to output to a temporary table, and then somehow get the new records back into the sql server table. Oracle Well SQL Well | | V V Sort Source Sort Well | | -------> Merge* <----------- | V Temp well table I suspect this isn't the best way to use this tool, however. What are the proper steps for a merge like this? One of my reasons for questioning this method is that my merge has an error, telling me that the "Merge Input 2" must be sorted, but its source is a sort task, so it IS sorted. Example data SQL Well (before merge) well_key well_id well_name 1 123 well k 2 292 well c 3 344 well t 5 439 well d Oracle Well well_id well_name 123 well k 292 well c 311 well y 344 well t 439 well d 532 well j SQL Well (after merge) well_key well_id well_name 1 123 well k 2 292 well c 3 344 well t 5 439 well d 6 311 well y 7 532 well j Would it be better to load my Oracle Well to a temporary local file, and then just use a sql insert statment on it?

    Read the article

  • Blocking on DBCP connection pool (open and close connnection). Is database connection pooling in OpenEJB pluggable?

    - by topchef
    We use OpenEJB on Tomcat (used to run on JBoss, Weblogic, etc.). While running load tests we experience significant performance problems with handling JMS messages (queues). Problem was localized to blocking on database connection pool getting or releasing connection to the pool. Blocking prevented concurrent MDB instances (threads) from running hence performance suffered 10-fold and worse. The same code used to run on application servers (with their respective connection pool implementations) with no blocking at all. Example of thread blocked: Name: JMS Resource Adapter-worker-23 State: BLOCKED on org.apache.commons.pool.impl.GenericObjectPool@1ea6b4a owned by: JMS Resource Adapter-worker-19 Total blocked: 18,426 Total waited: 0 Stack trace: org.apache.commons.pool.impl.GenericObjectPool.returnObject(GenericObjectPool.java:916) org.apache.commons.dbcp.PoolableConnection.close(PoolableConnection.java:91) - locked org.apache.commons.dbcp.PoolableConnection@1bcba8 org.apache.commons.dbcp.managed.ManagedConnection.close(ManagedConnection.java:147) com.xxxxx.persistence.DbHelper.closeConnection(DbHelper.java:290) .... Couple of questions. I am almost certain that some transactional attributes and properties contribute to this blocking, but MDBs are defined as non-transactional (we use both annotations and ejb-jar.xml). Some EJBs do use container-managed transactions though (and we can observe blocking there as well). Are there any DBCP configurations that may fix blocking? Is DBCP connection pool implementation replaceable in OpenEJB? How easy (difficult) to replace it with another library? Just in case this is how we define data source in OpenEJB (openejb.xml): <Resource id="MyDataSource" type="DataSource"> JdbcDriver oracle.jdbc.driver.OracleDriver JdbcUrl ${oracle.jdbc} UserName ${oracle.user} Password ${oracle.password} JtaManaged true InitialSize 5 MaxActive 30 ValidationQuery SELECT 1 FROM DUAL TestOnBorrow true </Resource>

    Read the article

  • How to test a struts 2.1.x developer?

    - by Jason Pyeron
    We employ test to filter out those who can't. The tests are designed to be very low effort for those who can and too much effort for those who can't. Here is an example for java web application developer on an Oracle project: We only work with contractors who can use the tools we use, to determine if you can use the tools we have devised some very simple tests. Instructions If you are prepared and knowledgeable this will take you about 2-5 minutes. Suggested knowledge and tools: * subversion 1.6 see http://subversion.tigris.org/ or http://cygwin.com/setup.exe * java 1.6 see http://java.sun.com/javase/downloads/index.jsp * oracle >=10g see http://www.oracle.com/technology/software/products/jdev/index.html * j2ee server see http://tomcat.apache.org/download-55.cgi or http://www.jboss.org/jbossas/downloads/ Steps 1. check out svn://statics32.pdinc.us/home/subversion/guest 2. deploy the war file found at trunk/test.war 3. browse to the web application you installed from the war file and answer the one SQL question: How many rows are in the table 'testdata' where column 'value' ends with either an 'A' or an 'a'? The login credentials are in trunk/doc/oracle.txt 4. make a RESULTS HASH by submitting your answer to the form. 5. create a file in tmp/YourUserName.txt and put the RESULTS HASH in it, not the answer. 6. check in your file (don't forget to add the file first). 7. message me with the revision number of your check in. As such I am looking for ideas on how to test for someone to be a struts 2.1 w/ annotations.

    Read the article

  • Duplicate C# web service proxy classes generated for Java types

    - by Sergey
    My question is about integration between a Java web service and a C# .NET client. Service: CXF 2.2.3 with Aegis databinding Client: C#, .NET 3.5 SP1 For some reason Visual Studio generates two C# proxy enums for each Java enum. The generated C# classes do not compile. For example, this Java enum: public enum SqlDialect { GENERIC, SYBASE, SQL_SERVER, ORACLE; } Produces this WSDL: <xsd:simpleType name="SqlDialect"> <xsd:restriction base="xsd:string"> <xsd:enumeration value="GENERIC" /> <xsd:enumeration value="SYBASE" /> <xsd:enumeration value="SQL_SERVER" /> <xsd:enumeration value="ORACLE" /> </xsd:restriction> </xsd:simpleType> For this WSDL Visual Studio generates two partial C# classes (generated comments removed): [System.CodeDom.Compiler.GeneratedCodeAttribute("System.Runtime.Serialization", "3.0.0.0")] [System.Runtime.Serialization.DataContractAttribute(Name="SqlDialect", Namespace="http://somenamespace")] public enum SqlDialect : int { [System.Runtime.Serialization.EnumMemberAttribute()] GENERIC = 0, [System.Runtime.Serialization.EnumMemberAttribute()] SYBASE = 1, [System.Runtime.Serialization.EnumMemberAttribute()] SQL_SERVER = 2, [System.Runtime.Serialization.EnumMemberAttribute()] ORACLE = 3, } [System.CodeDom.Compiler.GeneratedCodeAttribute("System.Xml", "2.0.50727.3082")] [System.SerializableAttribute()] [System.Xml.Serialization.XmlTypeAttribute(Namespace="http://somenamespace")] public enum SqlDialect { GENERIC, SYBASE, SQL_SERVER, ORACLE, } The resulting C# code does not compile: The namespace 'somenamespace' already contains a definition for 'SqlDialect' I will appreciate any ideas...

    Read the article

  • Is there a Telecommunications Reference Architecture?

    - by raul.goycoolea
    @font-face { font-family: "Arial"; }@font-face { font-family: "Courier New"; }@font-face { font-family: "Wingdings"; }@font-face { font-family: "Cambria"; }p.MsoNormal, li.MsoNormal, div.MsoNormal { margin: 0cm 0cm 0.0001pt; font-size: 12pt; font-family: "Times New Roman"; }p.MsoListParagraph, li.MsoListParagraph, div.MsoListParagraph { margin: 0cm 0cm 0.0001pt 36pt; font-size: 12pt; font-family: "Times New Roman"; }p.MsoListParagraphCxSpFirst, li.MsoListParagraphCxSpFirst, div.MsoListParagraphCxSpFirst { margin: 0cm 0cm 0.0001pt 36pt; font-size: 12pt; font-family: "Times New Roman"; }p.MsoListParagraphCxSpMiddle, li.MsoListParagraphCxSpMiddle, div.MsoListParagraphCxSpMiddle { margin: 0cm 0cm 0.0001pt 36pt; font-size: 12pt; font-family: "Times New Roman"; }p.MsoListParagraphCxSpLast, li.MsoListParagraphCxSpLast, div.MsoListParagraphCxSpLast { margin: 0cm 0cm 0.0001pt 36pt; font-size: 12pt; font-family: "Times New Roman"; }div.Section1 { page: Section1; }ol { margin-bottom: 0cm; }ul { margin-bottom: 0cm; } Abstract   Reference architecture provides needed architectural information that can be provided in advance to an enterprise to enable consistent architectural best practices. Enterprise Reference Architecture helps business owners to actualize their strategies, vision, objectives, and principles. It evaluates the IT systems, based on Reference Architecture goals, principles, and standards. It helps to reduce IT costs by increasing functionality, availability, scalability, etc. Telecom Reference Architecture provides customers with the flexibility to view bundled service bills online with the provision of multiple services. It provides real-time, flexible billing and charging systems, to handle complex promotions, discounts, and settlements with multiple parties. This paper attempts to describe the Reference Architecture for the Telecom Enterprises. It lays the foundation for a Telecom Reference Architecture by articulating the requirements, drivers, and pitfalls for telecom service providers. It describes generic reference architecture for telecom enterprises and moves on to explain how to achieve Enterprise Reference Architecture by using SOA.   Introduction   A Reference Architecture provides a methodology, set of practices, template, and standards based on a set of successful solutions implemented earlier. These solutions have been generalized and structured for the depiction of both a logical and a physical architecture, based on the harvesting of a set of patterns that describe observations in a number of successful implementations. It helps as a reference for the various architectures that an enterprise can implement to solve various problems. It can be used as the starting point or the point of comparisons for various departments/business entities of a company, or for the various companies for an enterprise. It provides multiple views for multiple stakeholders.   Major artifacts of the Enterprise Reference Architecture are methodologies, standards, metadata, documents, design patterns, etc.   Purpose of Reference Architecture   In most cases, architects spend a lot of time researching, investigating, defining, and re-arguing architectural decisions. It is like reinventing the wheel as their peers in other organizations or even the same organization have already spent a lot of time and effort defining their own architectural practices. This prevents an organization from learning from its own experiences and applying that knowledge for increased effectiveness.   Reference architecture provides missing architectural information that can be provided in advance to project team members to enable consistent architectural best practices.   Enterprise Reference Architecture helps an enterprise to achieve the following at the abstract level:   ·       Reference architecture is more of a communication channel to an enterprise ·       Helps the business owners to accommodate to their strategies, vision, objectives, and principles. ·       Evaluates the IT systems based on Reference Architecture Principles ·       Reduces IT spending through increasing functionality, availability, scalability, etc ·       A Real-time Integration Model helps to reduce the latency of the data updates Is used to define a single source of Information ·       Provides a clear view on how to manage information and security ·       Defines the policy around the data ownership, product boundaries, etc. ·       Helps with cost optimization across project and solution portfolios by eliminating unused or duplicate investments and assets ·       Has a shorter implementation time and cost   Once the reference architecture is in place, the set of architectural principles, standards, reference models, and best practices ensure that the aligned investments have the greatest possible likelihood of success in both the near term and the long term (TCO).     Common pitfalls for Telecom Service Providers   Telecom Reference Architecture serves as the first step towards maturity for a telecom service provider. During the course of our assignments/experiences with telecom players, we have come across the following observations – Some of these indicate a lack of maturity of the telecom service provider:   ·       In markets that are growing and not so mature, it has been observed that telcos have a significant amount of in-house or home-grown applications. In some of these markets, the growth has been so rapid that IT has been unable to cope with business demands. Telcos have shown a tendency to come up with workarounds in their IT applications so as to meet business needs. ·       Even for core functions like provisioning or mediation, some telcos have tried to manage with home-grown applications. ·       Most of the applications do not have the required scalability or maintainability to sustain growth in volumes or functionality. ·       Applications face interoperability issues with other applications in the operator's landscape. Integrating a new application or network element requires considerable effort on the part of the other applications. ·       Application boundaries are not clear, and functionality that is not in the initial scope of that application gets pushed onto it. This results in the development of the multiple, small applications without proper boundaries. ·       Usage of Legacy OSS/BSS systems, poor Integration across Multiple COTS Products and Internal Systems. Most of the Integrations are developed on ad-hoc basis and Point-to-Point Integration. ·       Redundancy of the business functions in different applications • Fragmented data across the different applications and no integrated view of the strategic data • Lot of performance Issues due to the usage of the complex integration across OSS and BSS systems   However, this is where the maturity of the telecom industry as a whole can be of help. The collaborative efforts of telcos to overcome some of these problems have resulted in bodies like the TM Forum. They have come up with frameworks for business processes, data, applications, and technology for telecom service providers. These could be a good starting point for telcos to clean up their enterprise landscape.   Industry Trends in Telecom Reference Architecture   Telecom reference architectures are evolving rapidly because telcos are facing business and IT challenges.   “The reality is that there probably is no killer application, no silver bullet that the telcos can latch onto to carry them into a 21st Century.... Instead, there are probably hundreds – perhaps thousands – of niche applications.... And the only way to find which of these works for you is to try out lots of them, ramp up the ones that work, and discontinue the ones that fail.” – Martin Creaner President & CTO TM Forum.   The following trends have been observed in telecom reference architecture:   ·       Transformation of business structures to align with customer requirements ·       Adoption of more Internet-like technical architectures. The Web 2.0 concept is increasingly being used. ·       Virtualization of the traditional operations support system (OSS) ·       Adoption of SOA to support development of IP-based services ·       Adoption of frameworks like Service Delivery Platforms (SDPs) and IP Multimedia Subsystem ·       (IMS) to enable seamless deployment of various services over fixed and mobile networks ·       Replacement of in-house, customized, and stove-piped OSS/BSS with standards-based COTS products ·       Compliance with industry standards and frameworks like eTOM, SID, and TAM to enable seamless integration with other standards-based products   Drivers of Reference Architecture   The drivers of the Reference Architecture are Reference Architecture Goals, Principles, and Enterprise Vision and Telecom Transformation. The details are depicted below diagram. @font-face { font-family: "Cambria"; }p.MsoNormal, li.MsoNormal, div.MsoNormal { margin: 0cm 0cm 0.0001pt; font-size: 12pt; font-family: "Times New Roman"; }p.MsoCaption, li.MsoCaption, div.MsoCaption { margin: 0cm 0cm 10pt; font-size: 9pt; font-family: "Times New Roman"; color: rgb(79, 129, 189); font-weight: bold; }div.Section1 { page: Section1; } Figure 1. Drivers for Reference Architecture @font-face { font-family: "Arial"; }@font-face { font-family: "Courier New"; }@font-face { font-family: "Wingdings"; }@font-face { font-family: "Cambria"; }p.MsoNormal, li.MsoNormal, div.MsoNormal { margin: 0cm 0cm 0.0001pt; font-size: 12pt; font-family: "Times New Roman"; }p.MsoListParagraph, li.MsoListParagraph, div.MsoListParagraph { margin: 0cm 0cm 0.0001pt 36pt; font-size: 12pt; font-family: "Times New Roman"; }p.MsoListParagraphCxSpFirst, li.MsoListParagraphCxSpFirst, div.MsoListParagraphCxSpFirst { margin: 0cm 0cm 0.0001pt 36pt; font-size: 12pt; font-family: "Times New Roman"; }p.MsoListParagraphCxSpMiddle, li.MsoListParagraphCxSpMiddle, div.MsoListParagraphCxSpMiddle { margin: 0cm 0cm 0.0001pt 36pt; font-size: 12pt; font-family: "Times New Roman"; }p.MsoListParagraphCxSpLast, li.MsoListParagraphCxSpLast, div.MsoListParagraphCxSpLast { margin: 0cm 0cm 0.0001pt 36pt; font-size: 12pt; font-family: "Times New Roman"; }div.Section1 { page: Section1; }ol { margin-bottom: 0cm; }ul { margin-bottom: 0cm; } Today’s telecom reference architectures should seamlessly integrate traditional legacy-based applications and transition to next-generation network technologies (e.g., IP multimedia subsystems). This has resulted in new requirements for flexible, real-time billing and OSS/BSS systems and implications on the service provider’s organizational requirements and structure.   Telecom reference architectures are today expected to:   ·       Integrate voice, messaging, email and other VAS over fixed and mobile networks, back end systems ·       Be able to provision multiple services and service bundles • Deliver converged voice, video and data services ·       Leverage the existing Network Infrastructure ·       Provide real-time, flexible billing and charging systems to handle complex promotions, discounts, and settlements with multiple parties. ·       Support charging of advanced data services such as VoIP, On-Demand, Services (e.g.  Video), IMS/SIP Services, Mobile Money, Content Services and IPTV. ·       Help in faster deployment of new services • Serve as an effective platform for collaboration between network IT and business organizations ·       Harness the potential of converging technology, networks, devices and content to develop multimedia services and solutions of ever-increasing sophistication on a single Internet Protocol (IP) ·       Ensure better service delivery and zero revenue leakage through real-time balance and credit management ·       Lower operating costs to drive profitability   Enterprise Reference Architecture   The Enterprise Reference Architecture (RA) fills the gap between the concepts and vocabulary defined by the reference model and the implementation. Reference architecture provides detailed architectural information in a common format such that solutions can be repeatedly designed and deployed in a consistent, high-quality, supportable fashion. This paper attempts to describe the Reference Architecture for the Telecom Application Usage and how to achieve the Enterprise Level Reference Architecture using SOA.   • Telecom Reference Architecture • Enterprise SOA based Reference Architecture   Telecom Reference Architecture   Tele Management Forum’s New Generation Operations Systems and Software (NGOSS) is an architectural framework for organizing, integrating, and implementing telecom systems. NGOSS is a component-based framework consisting of the following elements:   ·       The enhanced Telecom Operations Map (eTOM) is a business process framework. ·       The Shared Information Data (SID) model provides a comprehensive information framework that may be specialized for the needs of a particular organization. ·       The Telecom Application Map (TAM) is an application framework to depict the functional footprint of applications, relative to the horizontal processes within eTOM. ·       The Technology Neutral Architecture (TNA) is an integrated framework. TNA is an architecture that is sustainable through technology changes.   NGOSS Architecture Standards are:   ·       Centralized data ·       Loosely coupled distributed systems ·       Application components/re-use  ·       A technology-neutral system framework with technology specific implementations ·       Interoperability to service provider data/processes ·       Allows more re-use of business components across multiple business scenarios ·       Workflow automation   The traditional operator systems architecture consists of four layers,   ·       Business Support System (BSS) layer, with focus toward customers and business partners. Manages order, subscriber, pricing, rating, and billing information. ·       Operations Support System (OSS) layer, built around product, service, and resource inventories. ·       Networks layer – consists of Network elements and 3rd Party Systems. ·       Integration Layer – to maximize application communication and overall solution flexibility.   Reference architecture for telecom enterprises is depicted below. @font-face { font-family: "Arial"; }@font-face { font-family: "Courier New"; }@font-face { font-family: "Wingdings"; }@font-face { font-family: "Cambria"; }p.MsoNormal, li.MsoNormal, div.MsoNormal { margin: 0cm 0cm 0.0001pt; font-size: 12pt; font-family: "Times New Roman"; }p.MsoCaption, li.MsoCaption, div.MsoCaption { margin: 0cm 0cm 10pt; font-size: 9pt; font-family: "Times New Roman"; color: rgb(79, 129, 189); font-weight: bold; }p.MsoListParagraph, li.MsoListParagraph, div.MsoListParagraph { margin: 0cm 0cm 0.0001pt 36pt; font-size: 12pt; font-family: "Times New Roman"; }p.MsoListParagraphCxSpFirst, li.MsoListParagraphCxSpFirst, div.MsoListParagraphCxSpFirst { margin: 0cm 0cm 0.0001pt 36pt; font-size: 12pt; font-family: "Times New Roman"; }p.MsoListParagraphCxSpMiddle, li.MsoListParagraphCxSpMiddle, div.MsoListParagraphCxSpMiddle { margin: 0cm 0cm 0.0001pt 36pt; font-size: 12pt; font-family: "Times New Roman"; }p.MsoListParagraphCxSpLast, li.MsoListParagraphCxSpLast, div.MsoListParagraphCxSpLast { margin: 0cm 0cm 0.0001pt 36pt; font-size: 12pt; font-family: "Times New Roman"; }div.Section1 { page: Section1; }ol { margin-bottom: 0cm; }ul { margin-bottom: 0cm; } Figure 2. Telecom Reference Architecture   The major building blocks of any Telecom Service Provider architecture are as follows:   1. Customer Relationship Management   CRM encompasses the end-to-end lifecycle of the customer: customer initiation/acquisition, sales, ordering, and service activation, customer care and support, proactive campaigns, cross sell/up sell, and retention/loyalty.   CRM also includes the collection of customer information and its application to personalize, customize, and integrate delivery of service to a customer, as well as to identify opportunities for increasing the value of the customer to the enterprise.   The key functionalities related to Customer Relationship Management are   ·       Manage the end-to-end lifecycle of a customer request for products. ·       Create and manage customer profiles. ·       Manage all interactions with customers – inquiries, requests, and responses. ·       Provide updates to Billing and other south bound systems on customer/account related updates such as customer/ account creation, deletion, modification, request bills, final bill, duplicate bills, credit limits through Middleware. ·       Work with Order Management System, Product, and Service Management components within CRM. ·       Manage customer preferences – Involve all the touch points and channels to the customer, including contact center, retail stores, dealers, self service, and field service, as well as via any media (phone, face to face, web, mobile device, chat, email, SMS, mail, the customer's bill, etc.). ·       Support single interface for customer contact details, preferences, account details, offers, customer premise equipment, bill details, bill cycle details, and customer interactions.   CRM applications interact with customers through customer touch points like portals, point-of-sale terminals, interactive voice response systems, etc. The requests by customers are sent via fulfillment/provisioning to billing system for ordering processing.   2. Billing and Revenue Management   Billing and Revenue Management handles the collection of appropriate usage records and production of timely and accurate bills – for providing pre-bill usage information and billing to customers; for processing their payments; and for performing payment collections. In addition, it handles customer inquiries about bills, provides billing inquiry status, and is responsible for resolving billing problems to the customer's satisfaction in a timely manner. This process grouping also supports prepayment for services.   The key functionalities provided by these applications are   ·       To ensure that enterprise revenue is billed and invoices delivered appropriately to customers. ·       To manage customers’ billing accounts, process their payments, perform payment collections, and monitor the status of the account balance. ·       To ensure the timely and effective fulfillment of all customer bill inquiries and complaints. ·       Collect the usage records from mediation and ensure appropriate rating and discounting of all usage and pricing. ·       Support revenue sharing; split charging where usage is guided to an account different from the service consumer. ·       Support prepaid and post-paid rating. ·       Send notification on approach / exceeding the usage thresholds as enforced by the subscribed offer, and / or as setup by the customer. ·       Support prepaid, post paid, and hybrid (where some services are prepaid and the rest of the services post paid) customers and conversion from post paid to prepaid, and vice versa. ·       Support different billing function requirements like charge prorating, promotion, discount, adjustment, waiver, write-off, account receivable, GL Interface, late payment fee, credit control, dunning, account or service suspension, re-activation, expiry, termination, contract violation penalty, etc. ·       Initiate direct debit to collect payment against an invoice outstanding. ·       Send notification to Middleware on different events; for example, payment receipt, pre-suspension, threshold exceed, etc.   Billing systems typically get usage data from mediation systems for rating and billing. They get provisioning requests from order management systems and inquiries from CRM systems. Convergent and real-time billing systems can directly get usage details from network elements.   3. Mediation   Mediation systems transform/translate the Raw or Native Usage Data Records into a general format that is acceptable to billing for their rating purposes.   The following lists the high-level roles and responsibilities executed by the Mediation system in the end-to-end solution.   ·       Collect Usage Data Records from different data sources – like network elements, routers, servers – via different protocol and interfaces. ·       Process Usage Data Records – Mediation will process Usage Data Records as per the source format. ·       Validate Usage Data Records from each source. ·       Segregates Usage Data Records coming from each source to multiple, based on the segregation requirement of end Application. ·       Aggregates Usage Data Records based on the aggregation rule if any from different sources. ·       Consolidates multiple Usage Data Records from each source. ·       Delivers formatted Usage Data Records to different end application like Billing, Interconnect, Fraud Management, etc. ·       Generates audit trail for incoming Usage Data Records and keeps track of all the Usage Data Records at various stages of mediation process. ·       Checks duplicate Usage Data Records across files for a given time window.   4. Fulfillment   This area is responsible for providing customers with their requested products in a timely and correct manner. It translates the customer's business or personal need into a solution that can be delivered using the specific products in the enterprise's portfolio. This process informs the customers of the status of their purchase order, and ensures completion on time, as well as ensuring a delighted customer. These processes are responsible for accepting and issuing orders. They deal with pre-order feasibility determination, credit authorization, order issuance, order status and tracking, customer update on customer order activities, and customer notification on order completion. Order management and provisioning applications fall into this category.   The key functionalities provided by these applications are   ·       Issuing new customer orders, modifying open customer orders, or canceling open customer orders; ·       Verifying whether specific non-standard offerings sought by customers are feasible and supportable; ·       Checking the credit worthiness of customers as part of the customer order process; ·       Testing the completed offering to ensure it is working correctly; ·       Updating of the Customer Inventory Database to reflect that the specific product offering has been allocated, modified, or cancelled; ·       Assigning and tracking customer provisioning activities; ·       Managing customer provisioning jeopardy conditions; and ·       Reporting progress on customer orders and other processes to customer.   These applications typically get orders from CRM systems. They interact with network elements and billing systems for fulfillment of orders.   5. Enterprise Management   This process area includes those processes that manage enterprise-wide activities and needs, or have application within the enterprise as a whole. They encompass all business management processes that   ·       Are necessary to support the whole of the enterprise, including processes for financial management, legal management, regulatory management, process, cost, and quality management, etc.;   ·       Are responsible for setting corporate policies, strategies, and directions, and for providing guidelines and targets for the whole of the business, including strategy development and planning for areas, such as Enterprise Architecture, that are integral to the direction and development of the business;   ·       Occur throughout the enterprise, including processes for project management, performance assessments, cost assessments, etc.     (i) Enterprise Risk Management:   Enterprise Risk Management focuses on assuring that risks and threats to the enterprise value and/or reputation are identified, and appropriate controls are in place to minimize or eliminate the identified risks. The identified risks may be physical or logical/virtual. Successful risk management ensures that the enterprise can support its mission critical operations, processes, applications, and communications in the face of serious incidents such as security threats/violations and fraud attempts. Two key areas covered in Risk Management by telecom operators are:   ·       Revenue Assurance: Revenue assurance system will be responsible for identifying revenue loss scenarios across components/systems, and will help in rectifying the problems. The following lists the high-level roles and responsibilities executed by the Revenue Assurance system in the end-to-end solution. o   Identify all usage information dropped when networks are being upgraded. o   Interconnect bill verification. o   Identify where services are routinely provisioned but never billed. o   Identify poor sales policies that are intensifying collections problems. o   Find leakage where usage is sent to error bucket and never billed for. o   Find leakage where field service, CRM, and network build-out are not optimized.   ·       Fraud Management: Involves collecting data from different systems to identify abnormalities in traffic patterns, usage patterns, and subscription patterns to report suspicious activity that might suggest fraudulent usage of resources, resulting in revenue losses to the operator.   The key roles and responsibilities of the system component are as follows:   o   Fraud management system will capture and monitor high usage (over a certain threshold) in terms of duration, value, and number of calls for each subscriber. The threshold for each subscriber is decided by the system and fixed automatically. o   Fraud management will be able to detect the unauthorized access to services for certain subscribers. These subscribers may have been provided unauthorized services by employees. The component will raise the alert to the operator the very first time of such illegal calls or calls which are not billed. o   The solution will be to have an alarm management system that will deliver alarms to the operator/provider whenever it detects a fraud, thus minimizing fraud by catching it the first time it occurs. o   The Fraud Management system will be capable of interfacing with switches, mediation systems, and billing systems   (ii) Knowledge Management   This process focuses on knowledge management, technology research within the enterprise, and the evaluation of potential technology acquisitions.   Key responsibilities of knowledge base management are to   ·       Maintain knowledge base – Creation and updating of knowledge base on ongoing basis. ·       Search knowledge base – Search of knowledge base on keywords or category browse ·       Maintain metadata – Management of metadata on knowledge base to ensure effective management and search. ·       Run report generator. ·       Provide content – Add content to the knowledge base, e.g., user guides, operational manual, etc.   (iii) Document Management   It focuses on maintaining a repository of all electronic documents or images of paper documents relevant to the enterprise using a system.   (iv) Data Management   It manages data as a valuable resource for any enterprise. For telecom enterprises, the typical areas covered are Master Data Management, Data Warehousing, and Business Intelligence. It is also responsible for data governance, security, quality, and database management.   Key responsibilities of Data Management are   ·       Using ETL, extract the data from CRM, Billing, web content, ERP, campaign management, financial, network operations, asset management info, customer contact data, customer measures, benchmarks, process data, e.g., process inputs, outputs, and measures, into Enterprise Data Warehouse. ·       Management of data traceability with source, data related business rules/decisions, data quality, data cleansing data reconciliation, competitors data – storage for all the enterprise data (customer profiles, products, offers, revenues, etc.) ·       Get online update through night time replication or physical backup process at regular frequency. ·       Provide the data access to business intelligence and other systems for their analysis, report generation, and use.   (v) Business Intelligence   It uses the Enterprise Data to provide the various analysis and reports that contain prospects and analytics for customer retention, acquisition of new customers due to the offers, and SLAs. It will generate right and optimized plans – bolt-ons for the customers.   The following lists the high-level roles and responsibilities executed by the Business Intelligence system at the Enterprise Level:   ·       It will do Pattern analysis and reports problem. ·       It will do Data Analysis – Statistical analysis, data profiling, affinity analysis of data, customer segment wise usage patterns on offers, products, service and revenue generation against services and customer segments. ·       It will do Performance (business, system, and forecast) analysis, churn propensity, response time, and SLAs analysis. ·       It will support for online and offline analysis, and report drill down capability. ·       It will collect, store, and report various SLA data. ·       It will provide the necessary intelligence for marketing and working on campaigns, etc., with cost benefit analysis and predictions.   It will advise on customer promotions with additional services based on loyalty and credit history of customer   ·       It will Interface with Enterprise Data Management system for data to run reports and analysis tasks. It will interface with the campaign schedules, based on historical success evidence.   (vi) Stakeholder and External Relations Management   It manages the enterprise's relationship with stakeholders and outside entities. Stakeholders include shareholders, employee organizations, etc. Outside entities include regulators, local community, and unions. Some of the processes within this grouping are Shareholder Relations, External Affairs, Labor Relations, and Public Relations.   (vii) Enterprise Resource Planning   It is used to manage internal and external resources, including tangible assets, financial resources, materials, and human resources. Its purpose is to facilitate the flow of information between all business functions inside the boundaries of the enterprise and manage the connections to outside stakeholders. ERP systems consolidate all business operations into a uniform and enterprise wide system environment.   The key roles and responsibilities for Enterprise System are given below:   ·        It will handle responsibilities such as core accounting, financial, and management reporting. ·       It will interface with CRM for capturing customer account and details. ·       It will interface with billing to capture the billing revenue and other financial data. ·       It will be responsible for executing the dunning process. Billing will send the required feed to ERP for execution of dunning. ·       It will interface with the CRM and Billing through batch interfaces. Enterprise management systems are like horizontals in the enterprise and typically interact with all major telecom systems. E.g., an ERP system interacts with CRM, Fulfillment, and Billing systems for different kinds of data exchanges.   6. External Interfaces/Touch Points   The typical external parties are customers, suppliers/partners, employees, shareholders, and other stakeholders. External interactions from/to a Service Provider to other parties can be achieved by a variety of mechanisms, including:   ·       Exchange of emails or faxes ·       Call Centers ·       Web Portals ·       Business-to-Business (B2B) automated transactions   These applications provide an Internet technology driven interface to external parties to undertake a variety of business functions directly for themselves. These can provide fully or partially automated service to external parties through various touch points.   Typical characteristics of these touch points are   ·       Pre-integrated self-service system, including stand-alone web framework or integration front end with a portal engine ·       Self services layer exposing atomic web services/APIs for reuse by multiple systems across the architectural environment ·       Portlets driven connectivity exposing data and services interoperability through a portal engine or web application   These touch points mostly interact with the CRM systems for requests, inquiries, and responses.   7. Middleware   The component will be primarily responsible for integrating the different systems components under a common platform. It should provide a Standards-Based Platform for building Service Oriented Architecture and Composite Applications. The following lists the high-level roles and responsibilities executed by the Middleware component in the end-to-end solution.   ·       As an integration framework, covering to and fro interfaces ·       Provide a web service framework with service registry. ·       Support SOA framework with SOA service registry. ·       Each of the interfaces from / to Middleware to other components would handle data transformation, translation, and mapping of data points. ·       Receive data from the caller / activate and/or forward the data to the recipient system in XML format. ·       Use standard XML for data exchange. ·       Provide the response back to the service/call initiator. ·       Provide a tracking until the response completion. ·       Keep a store transitional data against each call/transaction. ·       Interface through Middleware to get any information that is possible and allowed from the existing systems to enterprise systems; e.g., customer profile and customer history, etc. ·       Provide the data in a common unified format to the SOA calls across systems, and follow the Enterprise Architecture directive. ·       Provide an audit trail for all transactions being handled by the component.   8. Network Elements   The term Network Element means a facility or equipment used in the provision of a telecommunications service. Such terms also includes features, functions, and capabilities that are provided by means of such facility or equipment, including subscriber numbers, databases, signaling systems, and information sufficient for billing and collection or used in the transmission, routing, or other provision of a telecommunications service.   Typical network elements in a GSM network are Home Location Register (HLR), Intelligent Network (IN), Mobile Switching Center (MSC), SMS Center (SMSC), and network elements for other value added services like Push-to-talk (PTT), Ring Back Tone (RBT), etc.   Network elements are invoked when subscribers use their telecom devices for any kind of usage. These elements generate usage data and pass it on to downstream systems like mediation and billing system for rating and billing. They also integrate with provisioning systems for order/service fulfillment.   9. 3rd Party Applications   3rd Party systems are applications like content providers, payment gateways, point of sale terminals, and databases/applications maintained by the Government.   Depending on applicability and the type of functionality provided by 3rd party applications, the integration with different telecom systems like CRM, provisioning, and billing will be done.   10. Service Delivery Platform   A service delivery platform (SDP) provides the architecture for the rapid deployment, provisioning, execution, management, and billing of value added telecom services. SDPs are based on the concept of SOA and layered architecture. They support the delivery of voice, data services, and content in network and device-independent fashion. They allow application developers to aggregate network capabilities, services, and sources of content. SDPs typically contain layers for web services exposure, service application development, and network abstraction.   SOA Reference Architecture   SOA concept is based on the principle of developing reusable business service and building applications by composing those services, instead of building monolithic applications in silos. It’s about bridging the gap between business and IT through a set of business-aligned IT services, using a set of design principles, patterns, and techniques.   In an SOA, resources are made available to participants in a value net, enterprise, line of business (typically spanning multiple applications within an enterprise or across multiple enterprises). It consists of a set of business-aligned IT services that collectively fulfill an organization’s business processes and goals. We can choreograph these services into composite applications and invoke them through standard protocols. SOA, apart from agility and reusability, enables:   ·       The business to specify processes as orchestrations of reusable services ·       Technology agnostic business design, with technology hidden behind service interface ·       A contractual-like interaction between business and IT, based on service SLAs ·       Accountability and governance, better aligned to business services ·       Applications interconnections untangling by allowing access only through service interfaces, reducing the daunting side effects of change ·       Reduced pressure to replace legacy and extended lifetime for legacy applications, through encapsulation in services   ·       A Cloud Computing paradigm, using web services technologies, that makes possible service outsourcing on an on-demand, utility-like, pay-per-usage basis   The following section represents the Reference Architecture of logical view for the Telecom Solution. The new custom built application needs to align with this logical architecture in the long run to achieve EA benefits.   Packaged implementation applications, such as ERP billing applications, need to expose their functions as service providers (as other applications consume) and interact with other applications as service consumers.   COT applications need to expose services through wrappers such as adapters to utilize existing resources and at the same time achieve Enterprise Architecture goal and objectives.   The following are the various layers for Enterprise level deployment of SOA. This diagram captures the abstract view of Enterprise SOA layers and important components of each layer. Layered architecture means decomposition of services such that most interactions occur between adjacent layers. However, there is no strict rule that top layers should not directly communicate with bottom layers.   The diagram below represents the important logical pieces that would result from overall SOA transformation. @font-face { font-family: "Arial"; }@font-face { font-family: "Courier New"; }@font-face { font-family: "Wingdings"; }@font-face { font-family: "Cambria"; }p.MsoNormal, li.MsoNormal, div.MsoNormal { margin: 0cm 0cm 0.0001pt; font-size: 12pt; font-family: "Times New Roman"; }p.MsoCaption, li.MsoCaption, div.MsoCaption { margin: 0cm 0cm 10pt; font-size: 9pt; font-family: "Times New Roman"; color: rgb(79, 129, 189); font-weight: bold; }p.MsoListParagraph, li.MsoListParagraph, div.MsoListParagraph { margin: 0cm 0cm 0.0001pt 36pt; font-size: 12pt; font-family: "Times New Roman"; }p.MsoListParagraphCxSpFirst, li.MsoListParagraphCxSpFirst, div.MsoListParagraphCxSpFirst { margin: 0cm 0cm 0.0001pt 36pt; font-size: 12pt; font-family: "Times New Roman"; }p.MsoListParagraphCxSpMiddle, li.MsoListParagraphCxSpMiddle, div.MsoListParagraphCxSpMiddle { margin: 0cm 0cm 0.0001pt 36pt; font-size: 12pt; font-family: "Times New Roman"; }p.MsoListParagraphCxSpLast, li.MsoListParagraphCxSpLast, div.MsoListParagraphCxSpLast { margin: 0cm 0cm 0.0001pt 36pt; font-size: 12pt; font-family: "Times New Roman"; }div.Section1 { page: Section1; }ol { margin-bottom: 0cm; }ul { margin-bottom: 0cm; } Figure 3. Enterprise SOA Reference Architecture 1.          Operational System Layer: This layer consists of all packaged applications like CRM, ERP, custom built applications, COTS based applications like Billing, Revenue Management, Fulfilment, and the Enterprise databases that are essential and contribute directly or indirectly to the Enterprise OSS/BSS Transformation.   ERP holds the data of Asset Lifecycle Management, Supply Chain, and Advanced Procurement and Human Capital Management, etc.   CRM holds the data related to Order, Sales, and Marketing, Customer Care, Partner Relationship Management, Loyalty, etc.   Content Management handles Enterprise Search and Query. Billing application consists of the following components:   ·       Collections Management, Customer Billing Management, Invoices, Real-Time Rating, Discounting, and Applying of Charges ·       Enterprise databases will hold both the application and service data, whether structured or unstructured.   MDM - Master data majorly consists of Customer, Order, Product, and Service Data.     2.          Enterprise Component Layer:   This layer consists of the Application Services and Common Services that are responsible for realizing the functionality and maintaining the QoS of the exposed services. This layer uses container-based technologies such as application servers to implement the components, workload management, high availability, and load balancing.   Application Services: This Service Layer enables application, technology, and database abstraction so that the complex accessing logic is hidden from the other service layers. This is a basic service layer, which exposes application functionalities and data as reusable services. The three types of the Application access services are:   ·       Application Access Service: This Service Layer exposes application level functionalities as a reusable service between BSS to BSS and BSS to OSS integration. This layer is enabled using disparate technology such as Web Service, Integration Servers, and Adaptors, etc.   ·       Data Access Service: This Service Layer exposes application data services as a reusable reference data service. This is done via direct interaction with application data. and provides the federated query.   ·       Network Access Service: This Service Layer exposes provisioning layer as a reusable service from OSS to OSS integration. This integration service emphasizes the need for high performance, stateless process flows, and distributed design.   Common Services encompasses management of structured, semi-structured, and unstructured data such as information services, portal services, interaction services, infrastructure services, and security services, etc.   3.          Integration Layer:   This consists of service infrastructure components like service bus, service gateway for partner integration, service registry, service repository, and BPEL processor. Service bus will carry the service invocation payloads/messages between consumers and providers. The other important functions expected from it are itinerary based routing, distributed caching of routing information, transformations, and all qualities of service for messaging-like reliability, scalability, and availability, etc. Service registry will hold all contracts (wsdl) of services, and it helps developers to locate or discover service during design time or runtime.   • BPEL processor would be useful in orchestrating the services to compose a complex business scenario or process. • Workflow and business rules management are also required to support manual triggering of certain activities within business process. based on the rules setup and also the state machine information. Application, data, and service mediation layer typically forms the overall composite application development framework or SOA Framework.   4.          Business Process Layer: These are typically the intermediate services layer and represent Shared Business Process Services. At Enterprise Level, these services are from Customer Management, Order Management, Billing, Finance, and Asset Management application domains.   5.          Access Layer: This layer consists of portals for Enterprise and provides a single view of Enterprise information management and dashboard services.   6.          Channel Layer: This consists of various devices; applications that form part of extended enterprise; browsers through which users access the applications.   7.          Client Layer: This designates the different types of users accessing the enterprise applications. The type of user typically would be an important factor in determining the level of access to applications.   8.          Vertical pieces like management, monitoring, security, and development cut across all horizontal layers Management and monitoring involves all aspects of SOA-like services, SLAs, and other QoS lifecycle processes for both applications and services surrounding SOA governance.     9.          EA Governance, Reference Architecture, Roadmap, Principles, and Best Practices:   EA Governance is important in terms of providing the overall direction to SOA implementation within the enterprise. This involves board-level involvement, in addition to business and IT executives. At a high level, this involves managing the SOA projects implementation, managing SOA infrastructure, and controlling the entire effort through all fine-tuned IT processes in accordance with COBIT (Control Objectives for Information Technology).   Devising tools and techniques to promote reuse culture, and the SOA way of doing things needs competency centers to be established in addition to training the workforce to take up new roles that are suited to SOA journey.   Conclusions   Reference Architectures can serve as the basis for disparate architecture efforts throughout the organization, even if they use different tools and technologies. Reference architectures provide best practices and approaches in the independent way a vendor deals with technology and standards. Reference Architectures model the abstract architectural elements for an enterprise independent of the technologies, protocols, and products that are used to implement an SOA. Telecom enterprises today are facing significant business and technology challenges due to growing competition, a multitude of services, and convergence. Adopting architectural best practices could go a long way in meeting these challenges. The use of SOA-based architecture for communication to each of the external systems like Billing, CRM, etc., in OSS/BSS system has made the architecture very loosely coupled, with greater flexibility. Any change in the external systems would be absorbed at the Integration Layer without affecting the rest of the ecosystem. The use of a Business Process Management (BPM) tool makes the management and maintenance of the business processes easy, with better performance in terms of lead time, quality, and cost. Since the Architecture is based on standards, it will lower the cost of deploying and managing OSS/BSS applications over their lifecycles.

    Read the article

  • VS 2010 Debugger Improvements (BreakPoints, DataTips, Import/Export)

    - by ScottGu
    This is the twenty-first in a series of blog posts I’m doing on the VS 2010 and .NET 4 release.  Today’s blog post covers a few of the nice usability improvements coming with the VS 2010 debugger.  The VS 2010 debugger has a ton of great new capabilities.  Features like Intellitrace (aka historical debugging), the new parallel/multithreaded debugging capabilities, and dump debuging support typically get a ton of (well deserved) buzz and attention when people talk about the debugging improvements with this release.  I’ll be doing blog posts in the future that demonstrate how to take advantage of them as well.  With today’s post, though, I thought I’d start off by covering a few small, but nice, debugger usability improvements that were also included with the VS 2010 release, and which I think you’ll find useful. Breakpoint Labels VS 2010 includes new support for better managing debugger breakpoints.  One particularly useful feature is called “Breakpoint Labels” – it enables much better grouping and filtering of breakpoints within a project or across a solution.  With previous releases of Visual Studio you had to manage each debugger breakpoint as a separate item. Managing each breakpoint separately can be a pain with large projects and for cases when you want to maintain “logical groups” of breakpoints that you turn on/off depending on what you are debugging.  Using the new VS 2010 “breakpoint labeling” feature you can now name these “groups” of breakpoints and manage them as a unit. Grouping Multiple Breakpoints Together using a Label Below is a screen-shot of the breakpoints window within Visual Studio 2010.  This lists all of the breakpoints defined within my solution (which in this case is the ASP.NET MVC 2 code base): The first and last breakpoint in the list above breaks into the debugger when a Controller instance is created or released by the ASP.NET MVC Framework. Using VS 2010, I can now select these two breakpoints, right-click, and then select the new “Edit labels…” menu command to give them a common label/name (making them easier to find and manage): Below is the dialog that appears when I select the “Edit labels” command.  We can use it to create a new string label for our breakpoints or select an existing one we have already defined.  In this case we’ll create a new label called “Lifetime Management” to describe what these two breakpoints cover: When we press the OK button our two selected breakpoints will be grouped under the newly created “Lifetime Management” label: Filtering/Sorting Breakpoints by Label We can use the “Search” combobox to quickly filter/sort breakpoints by label.  Below we are only showing those breakpoints with the “Lifetime Management” label: Toggling Breakpoints On/Off by Label We can also toggle sets of breakpoints on/off by label group.  We can simply filter by the label group, do a Ctrl-A to select all the breakpoints, and then enable/disable all of them with a single click: Importing/Exporting Breakpoints VS 2010 now supports importing/exporting breakpoints to XML files – which you can then pass off to another developer, attach to a bug report, or simply re-load later.  To export only a subset of breakpoints, you can filter by a particular label and then click the “Export breakpoint” button in the Breakpoints window: Above I’ve filtered my breakpoint list to only export two particular breakpoints (specific to a bug that I’m chasing down).  I can export these breakpoints to an XML file and then attach it to a bug report or email – which will enable another developer to easily setup the debugger in the correct state to investigate it on a separate machine.  Pinned DataTips Visual Studio 2010 also includes some nice new “DataTip pinning” features that enable you to better see and track variable and expression values when in the debugger.  Simply hover over a variable or expression within the debugger to expose its DataTip (which is a tooltip that displays its value)  – and then click the new “pin” button on it to make the DataTip always visible: You can “pin” any number of DataTips you want onto the screen.  In addition to pinning top-level variables, you can also drill into the sub-properties on variables and pin them as well.  Below I’ve “pinned” three variables: “category”, “Request.RawUrl” and “Request.LogonUserIdentity.Name”.  Note that these last two variable are sub-properties of the “Request” object.   Associating Comments with Pinned DataTips Hovering over a pinned DataTip exposes some additional UI within the debugger: Clicking the comment button at the bottom of this UI expands the DataTip - and allows you to optionally add a comment with it: This makes it really easy to attach and track debugging notes: Pinned DataTips are usable across both Debug Sessions and Visual Studio Sessions Pinned DataTips can be used across multiple debugger sessions.  This means that if you stop the debugger, make a code change, and then recompile and start a new debug session - any pinned DataTips will still be there, along with any comments you associate with them.  Pinned DataTips can also be used across multiple Visual Studio sessions.  This means that if you close your project, shutdown Visual Studio, and then later open the project up again – any pinned DataTips will still be there, along with any comments you associate with them. See the Value from Last Debug Session (Great Code Editor Feature) How many times have you ever stopped the debugger only to go back to your code and say: $#@! – what was the value of that variable again??? One of the nice things about pinned DataTips is that they keep track of their “last value from debug session” – and you can look these values up within the VB/C# code editor even when the debugger is no longer running.  DataTips are by default hidden when you are in the code editor and the debugger isn’t running.  On the left-hand margin of the code editor, though, you’ll find a push-pin for each pinned DataTip that you’ve previously setup: Hovering your mouse over a pinned DataTip will cause it to display on the screen.  Below you can see what happens when I hover over the first pin in the editor - it displays our debug session’s last values for the “Request” object DataTip along with the comment we associated with them: This makes it much easier to keep track of state and conditions as you toggle between code editing mode and debugging mode on your projects. Importing/Exporting Pinned DataTips As I mentioned earlier in this post, pinned DataTips are by default saved across Visual Studio sessions (you don’t need to do anything to enable this). VS 2010 also now supports importing/exporting pinned DataTips to XML files – which you can then pass off to other developers, attach to a bug report, or simply re-load later. Combined with the new support for importing/exporting breakpoints, this makes it much easier for multiple developers to share debugger configurations and collaborate across debug sessions. Summary Visual Studio 2010 includes a bunch of great new debugger features – both big and small.  Today’s post shared some of the nice debugger usability improvements. All of the features above are supported with the Visual Studio 2010 Professional edition (the Pinned DataTip features are also supported in the free Visual Studio 2010 Express Editions)  I’ll be covering some of the “big big” new debugging features like Intellitrace, parallel/multithreaded debugging, and dump file analysis in future blog posts.  Hope this helps, Scott P.S. In addition to blogging, I am also now using Twitter for quick updates and to share links. Follow me at: twitter.com/scottgu

    Read the article

  • Take your colleague to see Paul and Kimberly for free

    - by simonsabin
    I’ve been given details of another great off that you can’t miss out on for the Paul Randal and Kimberly Tripp Masterclass next week.   REGISTER TODAY AT www.regonline.co.uk/kimtrippsql on the registration form simply quote discount code: BOGOF and enter your colleague’s details and you will save 100% off a second registration – that’s a 199 GBP saving! This offer is limited, book early to avoid disappointment....(read more)

    Read the article

  • Scripting with the Sun ZFS Storage 7000 Appliance

    - by Geoff Ongley
    The Sun ZFS Storage 7000 appliance has a user friendly and easy to understand graphical web based interface we call the "BUI" or "Browser User Interface".This interface is very useful for many tasks, but in some cases a script (or workflow) may be more appropriate, such as:Repetitive tasksTasks which work on (or obtain information about) a large number of shares or usersTasks which are triggered by an alert threshold (workflows)Tasks where you want a only very basic input, but a consistent output (workflows)The appliance scripting language is based on ECMAscript 3 (close to javascript). I'm not going to cover ECMAscript 3 in great depth (I'm far from an expert here), but I would like to show you some neat things you can do with the appliance, to get you started based on what I have found from my own playing around.I'm making the assumption you have some sort of programming background, and understand variables, arrays, functions to some extent - but of course if something is not clear, please let me know so I can fix it up or clarify it.Variable Declarations and ArraysVariablesECMAScript is a dynamically and weakly typed language. If you don't know what that means, google is your friend - but at a high level it means we can just declare variables with no specific type and on the fly.For example, I can declare a variable and use it straight away in the middle of my code, for example:projects=list();Which makes projects an array of values that are returned from the list(); function (which is usable in most contexts). With this kind of variable, I can do things like:projects.length (this property on array tells you how many objects are in it, good for for loops etc). Alternatively, I could say:projects=3;and now projects is just a simple number.Should we declare variables like this so loosely? In my opinion, the answer is no - I feel it is a better practice to declare variables you are going to use, before you use them - and given them an initial value. You can do so as follows:var myVariable=0;To demonstrate the ability to just randomly assign and change the type of variables, you can create a simple script at the cli as follows (bold for input):fishy10:> script("." to run)> run("cd /");("." to run)> run ("shares");("." to run)> var projects;("." to run)> projects=list();("." to run)> printf("Number of projects is: %d\n",projects.length);("." to run)> projects=152;("." to run)> printf("Value of the projects variable as an integer is now: %d\n",projects);("." to run)> .Number of projects is: 7Value of the projects variable as an integer is now: 152You can also confirm this behaviour by checking the typeof variable we are dealing with:fishy10:> script("." to run)> run("cd /");("." to run)> run ("shares");("." to run)> var projects;("." to run)> projects=list();("." to run)> printf("var projects is of type %s\n",typeof(projects));("." to run)> projects=152;("." to run)> printf("var projects is of type %s\n",typeof(projects));("." to run)> .var projects is of type objectvar projects is of type numberArraysSo you likely noticed that we have already touched on arrays, as the list(); (in the shares context) stored an array into the 'projects' variable.But what if you want to declare your own array? Easy! This is very similar to Java and other languages, we just instantiate a brand new "Array" object using the keyword new:var myArray = new Array();will create an array called "myArray".A quick example:fishy10:> script("." to run)> testArray = new Array();("." to run)> testArray[0]="This";("." to run)> testArray[1]="is";("." to run)> testArray[2]="just";("." to run)> testArray[3]="a";("." to run)> testArray[4]="test";("." to run)> for (i=0; i < testArray.length; i++)("." to run)> {("." to run)>    printf("Array element %d is %s\n",i,testArray[i]);("." to run)> }("." to run)> .Array element 0 is ThisArray element 1 is isArray element 2 is justArray element 3 is aArray element 4 is testWorking With LoopsFor LoopFor loops are very similar to those you will see in C, java and several other languages. One of the key differences here is, as you were made aware earlier, we can be a bit more sloppy with our variable declarations.The general way you would likely use a for loop is as follows:for (variable; test-case; modifier for variable){}For example, you may wish to declare a variable i as 0; and a MAX_ITERATIONS variable to determine how many times this loop should repeat:var i=0;var MAX_ITERATIONS=10;And then, use this variable to be tested against some case existing (has i reached MAX_ITERATIONS? - if not, increment i using i++);for (i=0; i < MAX_ITERATIONS; i++){ // some work to do}So lets run something like this on the appliance:fishy10:> script("." to run)> var i=0;("." to run)> var MAX_ITERATIONS=10;("." to run)> for (i=0; i < MAX_ITERATIONS; i++)("." to run)> {("." to run)>    printf("The number is %d\n",i);("." to run)> }("." to run)> .The number is 0The number is 1The number is 2The number is 3The number is 4The number is 5The number is 6The number is 7The number is 8The number is 9While LoopWhile loops again are very similar to other languages, we loop "while" a condition is met. For example:fishy10:> script("." to run)> var isTen=false;("." to run)> var counter=0;("." to run)> while(isTen==false)("." to run)> {("." to run)>    if (counter==10) ("." to run)>    { ("." to run)>            isTen=true;   ("." to run)>    } ("." to run)>    printf("Counter is %d\n",counter);("." to run)>    counter++;    ("." to run)> }("." to run)> printf("Loop has ended and Counter is %d\n",counter);("." to run)> .Counter is 0Counter is 1Counter is 2Counter is 3Counter is 4Counter is 5Counter is 6Counter is 7Counter is 8Counter is 9Counter is 10Loop has ended and Counter is 11So what do we notice here? Something has actually gone wrong - counter will technically be 11 once the loop completes... Why is this?Well, if we have a loop like this, where the 'while' condition that will end the loop may be set based on some other condition(s) existing (such as the counter has reached 10) - we must ensure that we  terminate this iteration of the loop when the condition is met - otherwise the rest of the code will be followed which may not be desirable. In other words, like in other languages, we will only ever check the loop condition once we are ready to perform the next iteration, so any other code after we set "isTen" to be true, will still be executed as we can see it was above.We can avoid this by adding a break into our loop once we know we have set the condition - this will stop the rest of the logic being processed in this iteration (and as such, counter will not be incremented). So lets try that again:fishy10:> script("." to run)> var isTen=false;("." to run)> var counter=0;("." to run)> while(isTen==false)("." to run)> {("." to run)>    if (counter==10) ("." to run)>    { ("." to run)>            isTen=true;   ("." to run)>            break;("." to run)>    } ("." to run)>    printf("Counter is %d\n",counter);("." to run)>    counter++;    ("." to run)> }("." to run)> printf("Loop has ended and Counter is %d\n", counter);("." to run)> .Counter is 0Counter is 1Counter is 2Counter is 3Counter is 4Counter is 5Counter is 6Counter is 7Counter is 8Counter is 9Loop has ended and Counter is 10Much better!Methods to Obtain and Manipulate DataGet MethodThe get method allows you to get simple properties from an object, for example a quota from a user. The syntax is fairly simple:var myVariable=get('property');An example of where you may wish to use this, is when you are getting a bunch of information about a user (such as quota information when in a shares context):var users=list();for(k=0; k < users.length; k++){     user=users[k];     run('select ' + user);     var username=get('name');     var usage=get('usage');     var quota=get('quota');...Which you can then use to your advantage - to print or manipulate infomation (you could change a user's information with a set method, based on the information returned from the get method). The set method is explained next.Set MethodThe set method can be used in a simple manner, similar to get. The syntax for set is:set('property','value'); // where value is a string, if it was a number, you don't need quotesFor example, we could set the quota on a share as follows (first observing the initial value):fishy10:shares default/test-geoff> script("." to run)> var currentQuota=get('quota');("." to run)> printf("Current Quota is: %s\n",currentQuota);("." to run)> set('quota','30G');("." to run)> run('commit');("." to run)> currentQuota=get('quota');("." to run)> printf("Current Quota is: %s\n",currentQuota);("." to run)> .Current Quota is: 0Current Quota is: 32212254720This shows us using both the get and set methods as can be used in scripts, of course when only setting an individual share, the above is overkill - it would be much easier to set it manually at the cli using 'set quota=3G' and then 'commit'.List MethodThe list method can be very powerful, especially in more complex scripts which iterate over large amounts of data and manipulate it if so desired. The general way you will use list is as follows:var myVar=list();Which will make "myVar" an array, containing all the objects in the relevant context (this could be a list of users, shares, projects, etc). You can then gather or manipulate data very easily.We could list all the shares and mountpoints in a given project for example:fishy10:shares another-project> script("." to run)> var shares=list();("." to run)> for (i=0; i < shares.length; i++)("." to run)> {("." to run)>    run('select ' + shares[i]);("." to run)>    var mountpoint=get('mountpoint');("." to run)>    printf("Share %s discovered, has mountpoint %s\n",shares[i],mountpoint);("." to run)>    run('done');("." to run)> }("." to run)> .Share and-another discovered, has mountpoint /export/another-project/and-anotherShare another-share discovered, has mountpoint /export/another-project/another-shareShare bob discovered, has mountpoint /export/another-projectShare more-shares-for-all discovered, has mountpoint /export/another-project/more-shares-for-allShare yep discovered, has mountpoint /export/another-project/yepWriting More Complex and Re-Usable CodeFunctionsThe best way to be able to write more complex code is to use functions to split up repeatable or reusable sections of your code. This also makes your more complex code easier to read and understand for other programmers.We write functions as follows:function functionName(variable1,variable2,...,variableN){}For example, we could have a function that takes a project name as input, and lists shares for that project (assuming we're already in the 'project' context - context is important!):function getShares(proj){        run('select ' + proj);        shares=list();        printf("Project: %s\n", proj);        for(j=0; j < shares.length; j++)        {                printf("Discovered share: %s\n",shares[i]);        }        run('done'); // exit selected project}Commenting your CodeLike any other language, a large part of making it readable and understandable is to comment it. You can use the same comment style as in C and Java amongst other languages.In other words, sngle line comments use://at the beginning of the comment.Multi line comments use:/*at the beginning, and:*/ at the end.For example, here we will use both:fishy10:> script("." to run)> // This is a test comment("." to run)> printf("doing some work...\n");("." to run)> /* This is a multi-line("." to run)> comment which I will span across("." to run)> three lines in total */("." to run)> printf("doing some more work...\n");("." to run)> .doing some work...doing some more work...Your comments do not have to be on their own, they can begin (particularly with single line comments this is handy) at the end of a statement, for examplevar projects=list(); // The variable projects is an array containing all projects on the system.Try and Catch StatementsYou may be used to using try and catch statements in other languages, and they can (and should) be utilised in your code to catch expected or unexpected error conditions, that you do NOT wish to stop your code from executing (if you do not catch these errors, your script will exit!):try{  // do some work}catch(err) // Catch any error that could occur{ // do something here under the error condition}For example, you may wish to only execute some code if a context can be reached. If you can't perform certain actions under certain circumstances, that may be perfectly acceptable.For example if you want to test a condition that only makes sense when looking at a SMB/NFS share, but does not make sense when you hit an iscsi or FC LUN, you don't want to stop all processing of other shares you may not have covered yet.For example we may wish to obtain quota information on all shares for all users on a share (but this makes no sense for a LUN):function getShareQuota(shar) // Get quota for each user of this share{        run('select ' + shar);        printf("  SHARE: %s\n", shar);        try        {                run('users');                printf("    %20s        %11s    %11s    %3s\n","Username","Usage(G)","Quota(G)","Quota(%)");                printf("    %20s        %11s    %11s    %4s\n","--------","--------","--------","----");                                users=list();                for(k=0; k < users.length; k++)                {                        user=users[k];                        getUserQuota(user);                }                run('done'); // exit user context        }        catch(err)        {                printf("    SKIPPING %s - This is NOT a NFS or CIFs share, not looking for users\n", shar);        }        run('done'); // done with this share}Running Scripts Remotely over SSHAs you have likely noticed, writing and running scripts for all but the simplest jobs directly on the appliance is not going to be a lot of fun.There's a couple of choices on what you can do here:Create scripts on a remote system and run them over sshCreate scripts, wrapping them in workflow code, so they are stored on the appliance and can be triggered under certain circumstances (like a threshold being reached)We'll cover the first one here, and then cover workflows later on (as these are for the most part just scripts with some wrapper information around them).Creating a SSH Public/Private SSH Key PairLog on to your handy Solaris box (You wouldn't be using any other OS, right? :P) and use ssh-keygen to create a pair of ssh keys. I'm storing this separate to my normal key:[geoff@lightning ~] ssh-keygen -t rsa -b 1024Generating public/private rsa key pair.Enter file in which to save the key (/export/home/geoff/.ssh/id_rsa): /export/home/geoff/.ssh/nas_key_rsaEnter passphrase (empty for no passphrase): Enter same passphrase again: Your identification has been saved in /export/home/geoff/.ssh/nas_key_rsa.Your public key has been saved in /export/home/geoff/.ssh/nas_key_rsa.pub.The key fingerprint is:7f:3d:53:f0:2a:5e:8b:2d:94:2a:55:77:66:5c:9b:14 geoff@lightningInstalling the Public Key on the ApplianceOn your Solaris host, observe the public key:[geoff@lightning ~] cat .ssh/nas_key_rsa.pub ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAIEAvYfK3RIaAYmMHBOvyhKM41NaSmcgUMC3igPN5gUKJQvSnYmjuWG6CBr1CkF5UcDji7v19jG3qAD5lAMFn+L0CxgRr8TNaAU+hA4/tpAGkjm+dKYSyJgEdMIURweyyfUFXoerweR8AWW5xlovGKEWZTAfvJX9Zqvh8oMQ5UJLUUc= geoff@lightningNow, copy and paste everything after "ssh-rsa" and before "user@hostname" - in this case, geoff@lightning. That is, this bit:AAAAB3NzaC1yc2EAAAABIwAAAIEAvYfK3RIaAYmMHBOvyhKM41NaSmcgUMC3igPN5gUKJQvSnYmjuWG6CBr1CkF5UcDji7v19jG3qAD5lAMFn+L0CxgRr8TNaAU+hA4/tpAGkjm+dKYSyJgEdMIURweyyfUFXoerweR8AWW5xlovGKEWZTAfvJX9Zqvh8oMQ5UJLUUc=Logon to your appliance and get into the preferences -> keys area for this user (root):[geoff@lightning ~] ssh [email protected]: Last login: Mon Dec  6 17:13:28 2010 from 192.168.0.2fishy10:> configuration usersfishy10:configuration users> select rootfishy10:configuration users root> preferences fishy10:configuration users root preferences> keysOR do it all in one hit:fishy10:> configuration users select root preferences keysNow, we create a new public key that will be accepted for this user and set the type to RSA:fishy10:configuration users root preferences keys> createfishy10:configuration users root preferences key (uncommitted)> set type=RSASet the key itself using the string copied previously (between ssh-rsa and user@host), and set the key ensuring you put double quotes around it (eg. set key="<key>"):fishy10:configuration users root preferences key (uncommitted)> set key="AAAAB3NzaC1yc2EAAAABIwAAAIEAvYfK3RIaAYmMHBOvyhKM41NaSmcgUMC3igPN5gUKJQvSnYmjuWG6CBr1CkF5UcDji7v19jG3qAD5lAMFn+L0CxgRr8TNaAU+hA4/tpAGkjm+dKYSyJgEdMIURweyyfUFXoerweR8AWW5xlovGKEWZTAfvJX9Zqvh8oMQ5UJLUUc="Now set the comment for this key (do not use spaces):fishy10:configuration users root preferences key (uncommitted)> set comment="LightningRSAKey" Commit the new key:fishy10:configuration users root preferences key (uncommitted)> commitVerify the key is there:fishy10:configuration users root preferences keys> lsKeys:NAME     MODIFIED              TYPE   COMMENT                                  key-000  2010-10-25 20:56:42   RSA    cycloneRSAKey                           key-001  2010-12-6 17:44:53    RSA    LightningRSAKey                         As you can see, we now have my new key, and a previous key I have created on this appliance.Running your Script over SSH from a Remote SystemHere I have created a basic test script, and saved it as test.ecma3:[geoff@lightning ~] cat test.ecma3 script// This is a test script, By Geoff Ongley 2010.printf("Testing script remotely over ssh\n");.Now, we can run this script remotely with our keyless login:[geoff@lightning ~] ssh -i .ssh/nas_key_rsa root@fishy10 < test.ecma3Pseudo-terminal will not be allocated because stdin is not a terminal.Testing script remotely over sshPutting it Together - An Example Completed Quota Gathering ScriptSo now we have a lot of the basics to creating a script, let us do something useful, like, find out how much every user is using, on every share on the system (you will recognise some of the code from my previous examples): script/************************************** Quick and Dirty Quota Check script ** Written By Geoff Ongley            ** 25 October 2010                    **************************************/function getUserQuota(usr){        run('select ' + usr);        var username=get('name');        var usage=get('usage');        var quota=get('quota');        var usage_g=usage / 1073741824; // convert bytes to gigabytes        var quota_g=quota / 1073741824; // as above        var quota_percent=0        if (quota > 0)        {                quota_percent=(usage / quota)*(100/1);        }        printf("    %20s        %8.2f           %8.2f           %d%%\n",username,usage_g,quota_g,quota_percent);        run('done'); // done with this selected user}function getShareQuota(shar){        //printf("DEBUG: selecting share %s\n", shar);        run('select ' + shar);        printf("  SHARE: %s\n", shar);        try        {                run('users');                printf("    %20s        %11s    %11s    %3s\n","Username","Usage(G)","Quota(G)","Quota(%)");                printf("    %20s        %11s    %11s    %4s\n","--------","--------","--------","--------");                                users=list();                for(k=0; k < users.length; k++)                {                        user=users[k];                        getUserQuota(user);                }                run('done'); // exit user context        }        catch(err)        {                printf("    SKIPPING %s - This is NOT a NFS or CIFs share, not looking for users\n", shar);        }        run('done'); // done with this share}function getShares(proj){        //printf("DEBUG: selecting project %s\n",proj);        run('select ' + proj);        shares=list();        printf("Project: %s\n", proj);        for(j=0; j < shares.length; j++)        {                share=shares[j];                getShareQuota(share);        }        run('done'); // exit selected project}function getProjects(){        run('cd /');        run('shares');        projects=list();                for (i=0; i < projects.length; i++)        {                var project=projects[i];                getShares(project);        }        run('done'); // exit context for all projects}getProjects();.Which can be run as follows, and will print information like this:[geoff@lightning ~/FISHWORKS_SCRIPTS] ssh -i ~/.ssh/nas_key_rsa root@fishy10 < get_quota_utilisation.ecma3Pseudo-terminal will not be allocated because stdin is not a terminal.Project: another-project  SHARE: and-another                Username           Usage(G)       Quota(G)    Quota(%)                --------           --------       --------    --------                  nobody            0.00            0.00        0%                 geoffro            0.05            0.00        0%                   Billy            0.10            0.00        0%                    root            0.00            0.00        0%            testing-user            0.05            0.00        0%  SHARE: another-share                Username           Usage(G)       Quota(G)    Quota(%)                --------           --------       --------    --------                    root            0.00            0.00        0%                  nobody            0.00            0.00        0%                 geoffro            0.05            0.49        9%            testing-user            0.05            0.02        249%                   Billy            0.10            0.29        33%  SHARE: bob                Username           Usage(G)       Quota(G)    Quota(%)                --------           --------       --------    --------                  nobody            0.00            0.00        0%                    root            0.00            0.00        0%  SHARE: more-shares-for-all                Username           Usage(G)       Quota(G)    Quota(%)                --------           --------       --------    --------                   Billy            0.10            0.00        0%            testing-user            0.05            0.00        0%                  nobody            0.00            0.00        0%                    root            0.00            0.00        0%                 geoffro            0.05            0.00        0%  SHARE: yep                Username           Usage(G)       Quota(G)    Quota(%)                --------           --------       --------    --------                    root            0.00            0.00        0%                  nobody            0.00            0.00        0%                   Billy            0.10            0.01        999%            testing-user            0.05            0.49        9%                 geoffro            0.05            0.00        0%Project: default  SHARE: Test-LUN    SKIPPING Test-LUN - This is NOT a NFS or CIFs share, not looking for users  SHARE: test-geoff                Username           Usage(G)       Quota(G)    Quota(%)                --------           --------       --------    --------                 geoffro            0.05            0.00        0%                    root            3.18           10.00        31%                    uucp            0.00            0.00        0%                  nobody            0.59            0.49        119%^CKilled by signal 2.Creating a WorkflowWorkflows are scripts that we store on the appliance, and can have the script execute either on request (even from the BUI), or on an event such as a threshold being met.Workflow BasicsA workflow allows you to create a simple process that can be executed either via the BUI interface interactively, or by an alert being raised (for some threshold being reached, for example).The basics parameters you will have to set for your "workflow object" (notice you're creating a variable, that embodies ECMAScript) are as follows (parameters is optional):name: A name for this workflowdescription: A Description for the workflowparameters: A set of input parameters (useful when you need user input to execute the workflow)execute: The code, the script itself to execute, which will be function (parameters)With parameters, you can specify things like this (slightly modified sample taken from the System Administration Guide):          ...parameters:        variableParam1:         {                             label: 'Name of Share',                             type: 'String'                  },                  variableParam2                  {                             label: 'Share Size',                             type: 'size'                  },execute: ....};  Note the commas separating the sections of name, parameters, execute, and so on. This is important!Also - there is plenty of properties you can set on the parameters for your workflow, these are described in the Sun ZFS Storage System Administration Guide.Creating a Basic Workflow from a Basic ScriptTo make a basic script into a basic workflow, you need to wrap the following around your script to create a 'workflow' object:var workflow = {name: 'Get User Quotas',description: 'Displays Quota Utilisation for each user on each share',execute: function() {// (basic script goes here, minus the "script" at the beginning, and "." at the end)}};However, it appears (at least in my experience to date) that the workflow object may only be happy with one function in the execute parameter - either that or I'm doing something wrong. As far as I can tell, after execute: you should only have a basic one function context like so:execute: function(){}To deal with this, and to give an example similar to our script earlier, I have created another simple quota check, to show the same basic functionality, but in a workflow format:var workflow = {name: 'Get User Quotas',description: 'Displays Quota Utilisation for each user on each share',execute: function () {        run('cd /');        run('shares');        projects=list();                for (i=0; i < projects.length; i++)        {                run('select ' + projects[i]);                shares=list('filesystem');                printf("Project: %s\n", projects[i]);                for(j=0; j < shares.length; j++)                {                        run('select ' +shares[j]);                        try                        {                                run('users');                                printf("  SHARE: %s\n", shares[j]);                                printf("    %20s        %11s    %11s    %3s\n","Username","Usage(G)","Quota(G)","Quota(%)");                                printf("    %20s        %11s    %11s    %4s\n","--------","--------","--------","-------");                                users=list();                                for(k=0; k < users.length; k++)                                {                                        run('select ' + users[k]);                                        username=get('name');                                        usage=get('usage');                                        quota=get('quota');                                        usage_g=usage / 1073741824; // convert bytes to gigabytes                                        quota_g=quota / 1073741824; // as above                                        quota_percent=0                                        if (quota > 0)                                        {                                                quota_percent=(usage / quota)*(100/1);                                        }                                        printf("    %20s        %8.2f   %8.2f   %d%%\n",username,usage_g,quota_g,quota_percent);                                        run('done');                                }                                run('done'); // exit user context                        }                        catch(err)                        {                        //      printf("    %s is a LUN, Not looking for users\n", shares[j]);                        }                        run('done'); // exit selected share context                }                run('done'); // exit project context        }        }};SummaryThe Sun ZFS Storage 7000 Appliance offers lots of different and interesting features to Sun/Oracle customers, including the world renowned Analytics. Hopefully the above will help you to think of new creative things you could be doing by taking advantage of one of the other neat features, the internal scripting engine!Some references are below to help you continue learning more, I'll update this post as I do the same! Enjoy...More information on ECMAScript 3A complete reference to ECMAScript 3 which will help you learn more of the details you may be interested in, can be found here:http://www.ecma-international.org/publications/files/ECMA-ST-ARCH/ECMA-262,%203rd%20edition,%20December%201999.pdfMore Information on Administering the Sun ZFS Storage 7000The Sun ZFS Storage 7000 System Administration guide can be a useful reference point, and can be found here:http://wikis.sun.com/download/attachments/186238602/2010_Q3_2_ADMIN.pdf

    Read the article

  • How to bill a client for frequently-interrupted time

    - by Greg
    I find that when I'm working on hourly-billable projects (in particular, those that are research/design/architecture-oriented as opposed to straight coding) that I'm easily distracted by any number of things (email, grab a drink (loss of focus, but nature happens), link off the webpage I was reading, wandering mind (easy when the job calls for a lot of thinking), etc.) This results in very fragmented time, far too incremental IMO to accurately track with a timeclock, and some time very gray. I frequently end up billing for only some fraction of the elapsed time I spent in order to feel fair, but sometimes it takes a really long time to put in an 8-hour day. By contrast, when I've worked for salary I've not worried about whether I'm actively working at any given minute, I just get the job done, and I've never had anything but stellar reviews/feedback from past salaried employers, so I think I get the job done well. I personally believe in an 80/20 cycle: I get 80% of my work done during an inspired 20% of my time. But I have to screw around the other 80% of the time in order to get that first 20%. So the question: what billing/time-tracking policy can I adopt in order to be fair to my hourly customers without having to write off my own less-productive 80% that a salaried employer is willing to overlook in light of the complete package? Note: This question is not about how to be more productive or focused. It's about how to work around whatever salient limitations that I have in a way that's both fair to me and to my customers. Update: A little clarification (to pre-emptively stop some righteous indignation): I currently have a half dozen different project/client groups. It's not a great situation and I'm working at reducing it down to two, but that's my current reality. It's very easy to get off on a thread related to a different project than the one I'm clocking, and I'm not always conscious of it at the time. [I did not intend the question to mean that I was off playing games or making personal calls, etc., and have adjusted wording above to be clearer. Most of the time. I am only human, and sometimes the mind does force you to take a break! :-)]

    Read the article

  • How many production servers should I start with?

    - by elfintar
    I'm building a site that I plan to grow to the size of SO. I'm only planning to have one prodcuction server to start off with. This will host everything including the database. I know it's very hard to say but am I likely to run into trouble quickly (if the site takes off) and if this is the case should I start out with more than one server so I can load balance everything from day 1? If no, should I be looking for something a little bigger than this spec?: http://www.123-reg.co.uk/dedicated-server-hosting/

    Read the article

  • How John Got 15x Improvement Without Really Trying

    - by rchrd
    The following article was published on a Sun Microsystems website a number of years ago by John Feo. It is still useful and worth preserving. So I'm republishing it here.  How I Got 15x Improvement Without Really Trying John Feo, Sun Microsystems Taking ten "personal" program codes used in scientific and engineering research, the author was able to get from 2 to 15 times performance improvement easily by applying some simple general optimization techniques. Introduction Scientific research based on computer simulation depends on the simulation for advancement. The research can advance only as fast as the computational codes can execute. The codes' efficiency determines both the rate and quality of results. In the same amount of time, a faster program can generate more results and can carry out a more detailed simulation of physical phenomena than a slower program. Highly optimized programs help science advance quickly and insure that monies supporting scientific research are used as effectively as possible. Scientific computer codes divide into three broad categories: ISV, community, and personal. ISV codes are large, mature production codes developed and sold commercially. The codes improve slowly over time both in methods and capabilities, and they are well tuned for most vendor platforms. Since the codes are mature and complex, there are few opportunities to improve their performance solely through code optimization. Improvements of 10% to 15% are typical. Examples of ISV codes are DYNA3D, Gaussian, and Nastran. Community codes are non-commercial production codes used by a particular research field. Generally, they are developed and distributed by a single academic or research institution with assistance from the community. Most users just run the codes, but some develop new methods and extensions that feed back into the general release. The codes are available on most vendor platforms. Since these codes are younger than ISV codes, there are more opportunities to optimize the source code. Improvements of 50% are not unusual. Examples of community codes are AMBER, CHARM, BLAST, and FASTA. Personal codes are those written by single users or small research groups for their own use. These codes are not distributed, but may be passed from professor-to-student or student-to-student over several years. They form the primordial ocean of applications from which community and ISV codes emerge. Government research grants pay for the development of most personal codes. This paper reports on the nature and performance of this class of codes. Over the last year, I have looked at over two dozen personal codes from more than a dozen research institutions. The codes cover a variety of scientific fields, including astronomy, atmospheric sciences, bioinformatics, biology, chemistry, geology, and physics. The sources range from a few hundred lines to more than ten thousand lines, and are written in Fortran, Fortran 90, C, and C++. For the most part, the codes are modular, documented, and written in a clear, straightforward manner. They do not use complex language features, advanced data structures, programming tricks, or libraries. I had little trouble understanding what the codes did or how data structures were used. Most came with a makefile. Surprisingly, only one of the applications is parallel. All developers have access to parallel machines, so availability is not an issue. Several tried to parallelize their applications, but stopped after encountering difficulties. Lack of education and a perception that parallelism is difficult prevented most from trying. I parallelized several of the codes using OpenMP, and did not judge any of the codes as difficult to parallelize. Even more surprising than the lack of parallelism is the inefficiency of the codes. I was able to get large improvements in performance in a matter of a few days applying simple optimization techniques. Table 1 lists ten representative codes [names and affiliation are omitted to preserve anonymity]. Improvements on one processor range from 2x to 15.5x with a simple average of 4.75x. I did not use sophisticated performance tools or drill deep into the program's execution character as one would do when tuning ISV or community codes. Using only a profiler and source line timers, I identified inefficient sections of code and improved their performance by inspection. The changes were at a high level. I am sure there is another factor of 2 or 3 in each code, and more if the codes are parallelized. The study’s results show that personal scientific codes are running many times slower than they should and that the problem is pervasive. Computational scientists are not sloppy programmers; however, few are trained in the art of computer programming or code optimization. I found that most have a working knowledge of some programming language and standard software engineering practices; but they do not know, or think about, how to make their programs run faster. They simply do not know the standard techniques used to make codes run faster. In fact, they do not even perceive that such techniques exist. The case studies described in this paper show that applying simple, well known techniques can significantly increase the performance of personal codes. It is important that the scientific community and the Government agencies that support scientific research find ways to better educate academic scientific programmers. The inefficiency of their codes is so bad that it is retarding both the quality and progress of scientific research. # cacheperformance redundantoperations loopstructures performanceimprovement 1 x x 15.5 2 x 2.8 3 x x 2.5 4 x 2.1 5 x x 2.0 6 x 5.0 7 x 5.8 8 x 6.3 9 2.2 10 x x 3.3 Table 1 — Area of improvement and performance gains of 10 codes The remainder of the paper is organized as follows: sections 2, 3, and 4 discuss the three most common sources of inefficiencies in the codes studied. These are cache performance, redundant operations, and loop structures. Each section includes several examples. The last section summaries the work and suggests a possible solution to the issues raised. Optimizing cache performance Commodity microprocessor systems use caches to increase memory bandwidth and reduce memory latencies. Typical latencies from processor to L1, L2, local, and remote memory are 3, 10, 50, and 200 cycles, respectively. Moreover, bandwidth falls off dramatically as memory distances increase. Programs that do not use cache effectively run many times slower than programs that do. When optimizing for cache, the biggest performance gains are achieved by accessing data in cache order and reusing data to amortize the overhead of cache misses. Secondary considerations are prefetching, associativity, and replacement; however, the understanding and analysis required to optimize for the latter are probably beyond the capabilities of the non-expert. Much can be gained simply by accessing data in the correct order and maximizing data reuse. 6 out of the 10 codes studied here benefited from such high level optimizations. Array Accesses The most important cache optimization is the most basic: accessing Fortran array elements in column order and C array elements in row order. Four of the ten codes—1, 2, 4, and 10—got it wrong. Compilers will restructure nested loops to optimize cache performance, but may not do so if the loop structure is too complex, or the loop body includes conditionals, complex addressing, or function calls. In code 1, the compiler failed to invert a key loop because of complex addressing do I = 0, 1010, delta_x IM = I - delta_x IP = I + delta_x do J = 5, 995, delta_x JM = J - delta_x JP = J + delta_x T1 = CA1(IP, J) + CA1(I, JP) T2 = CA1(IM, J) + CA1(I, JM) S1 = T1 + T2 - 4 * CA1(I, J) CA(I, J) = CA1(I, J) + D * S1 end do end do In code 2, the culprit is conditionals do I = 1, N do J = 1, N If (IFLAG(I,J) .EQ. 0) then T1 = Value(I, J-1) T2 = Value(I-1, J) T3 = Value(I, J) T4 = Value(I+1, J) T5 = Value(I, J+1) Value(I,J) = 0.25 * (T1 + T2 + T5 + T4) Delta = ABS(T3 - Value(I,J)) If (Delta .GT. MaxDelta) MaxDelta = Delta endif enddo enddo I fixed both programs by inverting the loops by hand. Code 10 has three-dimensional arrays and triply nested loops. The structure of the most computationally intensive loops is too complex to invert automatically or by hand. The only practical solution is to transpose the arrays so that the dimension accessed by the innermost loop is in cache order. The arrays can be transposed at construction or prior to entering a computationally intensive section of code. The former requires all array references to be modified, while the latter is cost effective only if the cost of the transpose is amortized over many accesses. I used the second approach to optimize code 10. Code 5 has four-dimensional arrays and loops are nested four deep. For all of the reasons cited above the compiler is not able to restructure three key loops. Assume C arrays and let the four dimensions of the arrays be i, j, k, and l. In the original code, the index structure of the three loops is L1: for i L2: for i L3: for i for l for l for j for k for j for k for j for k for l So only L3 accesses array elements in cache order. L1 is a very complex loop—much too complex to invert. I brought the loop into cache alignment by transposing the second and fourth dimensions of the arrays. Since the code uses a macro to compute all array indexes, I effected the transpose at construction and changed the macro appropriately. The dimensions of the new arrays are now: i, l, k, and j. L3 is a simple loop and easily inverted. L2 has a loop-carried scalar dependence in k. By promoting the scalar name that carries the dependence to an array, I was able to invert the third and fourth subloops aligning the loop with cache. Code 5 is by far the most difficult of the four codes to optimize for array accesses; but the knowledge required to fix the problems is no more than that required for the other codes. I would judge this code at the limits of, but not beyond, the capabilities of appropriately trained computational scientists. Array Strides When a cache miss occurs, a line (64 bytes) rather than just one word is loaded into the cache. If data is accessed stride 1, than the cost of the miss is amortized over 8 words. Any stride other than one reduces the cost savings. Two of the ten codes studied suffered from non-unit strides. The codes represent two important classes of "strided" codes. Code 1 employs a multi-grid algorithm to reduce time to convergence. The grids are every tenth, fifth, second, and unit element. Since time to convergence is inversely proportional to the distance between elements, coarse grids converge quickly providing good starting values for finer grids. The better starting values further reduce the time to convergence. The downside is that grids of every nth element, n > 1, introduce non-unit strides into the computation. In the original code, much of the savings of the multi-grid algorithm were lost due to this problem. I eliminated the problem by compressing (copying) coarse grids into continuous memory, and rewriting the computation as a function of the compressed grid. On convergence, I copied the final values of the compressed grid back to the original grid. The savings gained from unit stride access of the compressed grid more than paid for the cost of copying. Using compressed grids, the loop from code 1 included in the previous section becomes do j = 1, GZ do i = 1, GZ T1 = CA(i+0, j-1) + CA(i-1, j+0) T4 = CA1(i+1, j+0) + CA1(i+0, j+1) S1 = T1 + T4 - 4 * CA1(i+0, j+0) CA(i+0, j+0) = CA1(i+0, j+0) + DD * S1 enddo enddo where CA and CA1 are compressed arrays of size GZ. Code 7 traverses a list of objects selecting objects for later processing. The labels of the selected objects are stored in an array. The selection step has unit stride, but the processing steps have irregular stride. A fix is to save the parameters of the selected objects in temporary arrays as they are selected, and pass the temporary arrays to the processing functions. The fix is practical if the same parameters are used in selection as in processing, or if processing comprises a series of distinct steps which use overlapping subsets of the parameters. Both conditions are true for code 7, so I achieved significant improvement by copying parameters to temporary arrays during selection. Data reuse In the previous sections, we optimized for spatial locality. It is also important to optimize for temporal locality. Once read, a datum should be used as much as possible before it is forced from cache. Loop fusion and loop unrolling are two techniques that increase temporal locality. Unfortunately, both techniques increase register pressure—as loop bodies become larger, the number of registers required to hold temporary values grows. Once register spilling occurs, any gains evaporate quickly. For multiprocessors with small register sets or small caches, the sweet spot can be very small. In the ten codes presented here, I found no opportunities for loop fusion and only two opportunities for loop unrolling (codes 1 and 3). In code 1, unrolling the outer and inner loop one iteration increases the number of result values computed by the loop body from 1 to 4, do J = 1, GZ-2, 2 do I = 1, GZ-2, 2 T1 = CA1(i+0, j-1) + CA1(i-1, j+0) T2 = CA1(i+1, j-1) + CA1(i+0, j+0) T3 = CA1(i+0, j+0) + CA1(i-1, j+1) T4 = CA1(i+1, j+0) + CA1(i+0, j+1) T5 = CA1(i+2, j+0) + CA1(i+1, j+1) T6 = CA1(i+1, j+1) + CA1(i+0, j+2) T7 = CA1(i+2, j+1) + CA1(i+1, j+2) S1 = T1 + T4 - 4 * CA1(i+0, j+0) S2 = T2 + T5 - 4 * CA1(i+1, j+0) S3 = T3 + T6 - 4 * CA1(i+0, j+1) S4 = T4 + T7 - 4 * CA1(i+1, j+1) CA(i+0, j+0) = CA1(i+0, j+0) + DD * S1 CA(i+1, j+0) = CA1(i+1, j+0) + DD * S2 CA(i+0, j+1) = CA1(i+0, j+1) + DD * S3 CA(i+1, j+1) = CA1(i+1, j+1) + DD * S4 enddo enddo The loop body executes 12 reads, whereas as the rolled loop shown in the previous section executes 20 reads to compute the same four values. In code 3, two loops are unrolled 8 times and one loop is unrolled 4 times. Here is the before for (k = 0; k < NK[u]; k++) { sum = 0.0; for (y = 0; y < NY; y++) { sum += W[y][u][k] * delta[y]; } backprop[i++]=sum; } and after code for (k = 0; k < KK - 8; k+=8) { sum0 = 0.0; sum1 = 0.0; sum2 = 0.0; sum3 = 0.0; sum4 = 0.0; sum5 = 0.0; sum6 = 0.0; sum7 = 0.0; for (y = 0; y < NY; y++) { sum0 += W[y][0][k+0] * delta[y]; sum1 += W[y][0][k+1] * delta[y]; sum2 += W[y][0][k+2] * delta[y]; sum3 += W[y][0][k+3] * delta[y]; sum4 += W[y][0][k+4] * delta[y]; sum5 += W[y][0][k+5] * delta[y]; sum6 += W[y][0][k+6] * delta[y]; sum7 += W[y][0][k+7] * delta[y]; } backprop[k+0] = sum0; backprop[k+1] = sum1; backprop[k+2] = sum2; backprop[k+3] = sum3; backprop[k+4] = sum4; backprop[k+5] = sum5; backprop[k+6] = sum6; backprop[k+7] = sum7; } for one of the loops unrolled 8 times. Optimizing for temporal locality is the most difficult optimization considered in this paper. The concepts are not difficult, but the sweet spot is small. Identifying where the program can benefit from loop unrolling or loop fusion is not trivial. Moreover, it takes some effort to get it right. Still, educating scientific programmers about temporal locality and teaching them how to optimize for it will pay dividends. Reducing instruction count Execution time is a function of instruction count. Reduce the count and you usually reduce the time. The best solution is to use a more efficient algorithm; that is, an algorithm whose order of complexity is smaller, that converges quicker, or is more accurate. Optimizing source code without changing the algorithm yields smaller, but still significant, gains. This paper considers only the latter because the intent is to study how much better codes can run if written by programmers schooled in basic code optimization techniques. The ten codes studied benefited from three types of "instruction reducing" optimizations. The two most prevalent were hoisting invariant memory and data operations out of inner loops. The third was eliminating unnecessary data copying. The nature of these inefficiencies is language dependent. Memory operations The semantics of C make it difficult for the compiler to determine all the invariant memory operations in a loop. The problem is particularly acute for loops in functions since the compiler may not know the values of the function's parameters at every call site when compiling the function. Most compilers support pragmas to help resolve ambiguities; however, these pragmas are not comprehensive and there is no standard syntax. To guarantee that invariant memory operations are not executed repetitively, the user has little choice but to hoist the operations by hand. The problem is not as severe in Fortran programs because in the absence of equivalence statements, it is a violation of the language's semantics for two names to share memory. Codes 3 and 5 are C programs. In both cases, the compiler did not hoist all invariant memory operations from inner loops. Consider the following loop from code 3 for (y = 0; y < NY; y++) { i = 0; for (u = 0; u < NU; u++) { for (k = 0; k < NK[u]; k++) { dW[y][u][k] += delta[y] * I1[i++]; } } } Since dW[y][u] can point to the same memory space as delta for one or more values of y and u, assignment to dW[y][u][k] may change the value of delta[y]. In reality, dW and delta do not overlap in memory, so I rewrote the loop as for (y = 0; y < NY; y++) { i = 0; Dy = delta[y]; for (u = 0; u < NU; u++) { for (k = 0; k < NK[u]; k++) { dW[y][u][k] += Dy * I1[i++]; } } } Failure to hoist invariant memory operations may be due to complex address calculations. If the compiler can not determine that the address calculation is invariant, then it can hoist neither the calculation nor the associated memory operations. As noted above, code 5 uses a macro to address four-dimensional arrays #define MAT4D(a,q,i,j,k) (double *)((a)->data + (q)*(a)->strides[0] + (i)*(a)->strides[3] + (j)*(a)->strides[2] + (k)*(a)->strides[1]) The macro is too complex for the compiler to understand and so, it does not identify any subexpressions as loop invariant. The simplest way to eliminate the address calculation from the innermost loop (over i) is to define a0 = MAT4D(a,q,0,j,k) before the loop and then replace all instances of *MAT4D(a,q,i,j,k) in the loop with a0[i] A similar problem appears in code 6, a Fortran program. The key loop in this program is do n1 = 1, nh nx1 = (n1 - 1) / nz + 1 nz1 = n1 - nz * (nx1 - 1) do n2 = 1, nh nx2 = (n2 - 1) / nz + 1 nz2 = n2 - nz * (nx2 - 1) ndx = nx2 - nx1 ndy = nz2 - nz1 gxx = grn(1,ndx,ndy) gyy = grn(2,ndx,ndy) gxy = grn(3,ndx,ndy) balance(n1,1) = balance(n1,1) + (force(n2,1) * gxx + force(n2,2) * gxy) * h1 balance(n1,2) = balance(n1,2) + (force(n2,1) * gxy + force(n2,2) * gyy)*h1 end do end do The programmer has written this loop well—there are no loop invariant operations with respect to n1 and n2. However, the loop resides within an iterative loop over time and the index calculations are independent with respect to time. Trading space for time, I precomputed the index values prior to the entering the time loop and stored the values in two arrays. I then replaced the index calculations with reads of the arrays. Data operations Ways to reduce data operations can appear in many forms. Implementing a more efficient algorithm produces the biggest gains. The closest I came to an algorithm change was in code 4. This code computes the inner product of K-vectors A(i) and B(j), 0 = i < N, 0 = j < M, for most values of i and j. Since the program computes most of the NM possible inner products, it is more efficient to compute all the inner products in one triply-nested loop rather than one at a time when needed. The savings accrue from reading A(i) once for all B(j) vectors and from loop unrolling. for (i = 0; i < N; i+=8) { for (j = 0; j < M; j++) { sum0 = 0.0; sum1 = 0.0; sum2 = 0.0; sum3 = 0.0; sum4 = 0.0; sum5 = 0.0; sum6 = 0.0; sum7 = 0.0; for (k = 0; k < K; k++) { sum0 += A[i+0][k] * B[j][k]; sum1 += A[i+1][k] * B[j][k]; sum2 += A[i+2][k] * B[j][k]; sum3 += A[i+3][k] * B[j][k]; sum4 += A[i+4][k] * B[j][k]; sum5 += A[i+5][k] * B[j][k]; sum6 += A[i+6][k] * B[j][k]; sum7 += A[i+7][k] * B[j][k]; } C[i+0][j] = sum0; C[i+1][j] = sum1; C[i+2][j] = sum2; C[i+3][j] = sum3; C[i+4][j] = sum4; C[i+5][j] = sum5; C[i+6][j] = sum6; C[i+7][j] = sum7; }} This change requires knowledge of a typical run; i.e., that most inner products are computed. The reasons for the change, however, derive from basic optimization concepts. It is the type of change easily made at development time by a knowledgeable programmer. In code 5, we have the data version of the index optimization in code 6. Here a very expensive computation is a function of the loop indices and so cannot be hoisted out of the loop; however, the computation is invariant with respect to an outer iterative loop over time. We can compute its value for each iteration of the computation loop prior to entering the time loop and save the values in an array. The increase in memory required to store the values is small in comparison to the large savings in time. The main loop in Code 8 is doubly nested. The inner loop includes a series of guarded computations; some are a function of the inner loop index but not the outer loop index while others are a function of the outer loop index but not the inner loop index for (j = 0; j < N; j++) { for (i = 0; i < M; i++) { r = i * hrmax; R = A[j]; temp = (PRM[3] == 0.0) ? 1.0 : pow(r, PRM[3]); high = temp * kcoeff * B[j] * PRM[2] * PRM[4]; low = high * PRM[6] * PRM[6] / (1.0 + pow(PRM[4] * PRM[6], 2.0)); kap = (R > PRM[6]) ? high * R * R / (1.0 + pow(PRM[4]*r, 2.0) : low * pow(R/PRM[6], PRM[5]); < rest of loop omitted > }} Note that the value of temp is invariant to j. Thus, we can hoist the computation for temp out of the loop and save its values in an array. for (i = 0; i < M; i++) { r = i * hrmax; TEMP[i] = pow(r, PRM[3]); } [N.B. – the case for PRM[3] = 0 is omitted and will be reintroduced later.] We now hoist out of the inner loop the computations invariant to i. Since the conditional guarding the value of kap is invariant to i, it behooves us to hoist the computation out of the inner loop, thereby executing the guard once rather than M times. The final version of the code is for (j = 0; j < N; j++) { R = rig[j] / 1000.; tmp1 = kcoeff * par[2] * beta[j] * par[4]; tmp2 = 1.0 + (par[4] * par[4] * par[6] * par[6]); tmp3 = 1.0 + (par[4] * par[4] * R * R); tmp4 = par[6] * par[6] / tmp2; tmp5 = R * R / tmp3; tmp6 = pow(R / par[6], par[5]); if ((par[3] == 0.0) && (R > par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * tmp5; } else if ((par[3] == 0.0) && (R <= par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * tmp4 * tmp6; } else if ((par[3] != 0.0) && (R > par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * TEMP[i] * tmp5; } else if ((par[3] != 0.0) && (R <= par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * TEMP[i] * tmp4 * tmp6; } for (i = 0; i < M; i++) { kap = KAP[i]; r = i * hrmax; < rest of loop omitted > } } Maybe not the prettiest piece of code, but certainly much more efficient than the original loop, Copy operations Several programs unnecessarily copy data from one data structure to another. This problem occurs in both Fortran and C programs, although it manifests itself differently in the two languages. Code 1 declares two arrays—one for old values and one for new values. At the end of each iteration, the array of new values is copied to the array of old values to reset the data structures for the next iteration. This problem occurs in Fortran programs not included in this study and in both Fortran 77 and Fortran 90 code. Introducing pointers to the arrays and swapping pointer values is an obvious way to eliminate the copying; but pointers is not a feature that many Fortran programmers know well or are comfortable using. An easy solution not involving pointers is to extend the dimension of the value array by 1 and use the last dimension to differentiate between arrays at different times. For example, if the data space is N x N, declare the array (N, N, 2). Then store the problem’s initial values in (_, _, 2) and define the scalar names new = 2 and old = 1. At the start of each iteration, swap old and new to reset the arrays. The old–new copy problem did not appear in any C program. In programs that had new and old values, the code swapped pointers to reset data structures. Where unnecessary coping did occur is in structure assignment and parameter passing. Structures in C are handled much like scalars. Assignment causes the data space of the right-hand name to be copied to the data space of the left-hand name. Similarly, when a structure is passed to a function, the data space of the actual parameter is copied to the data space of the formal parameter. If the structure is large and the assignment or function call is in an inner loop, then copying costs can grow quite large. While none of the ten programs considered here manifested this problem, it did occur in programs not included in the study. A simple fix is always to refer to structures via pointers. Optimizing loop structures Since scientific programs spend almost all their time in loops, efficient loops are the key to good performance. Conditionals, function calls, little instruction level parallelism, and large numbers of temporary values make it difficult for the compiler to generate tightly packed, highly efficient code. Conditionals and function calls introduce jumps that disrupt code flow. Users should eliminate or isolate conditionls to their own loops as much as possible. Often logical expressions can be substituted for if-then-else statements. For example, code 2 includes the following snippet MaxDelta = 0.0 do J = 1, N do I = 1, M < code omitted > Delta = abs(OldValue ? NewValue) if (Delta > MaxDelta) MaxDelta = Delta enddo enddo if (MaxDelta .gt. 0.001) goto 200 Since the only use of MaxDelta is to control the jump to 200 and all that matters is whether or not it is greater than 0.001, I made MaxDelta a boolean and rewrote the snippet as MaxDelta = .false. do J = 1, N do I = 1, M < code omitted > Delta = abs(OldValue ? NewValue) MaxDelta = MaxDelta .or. (Delta .gt. 0.001) enddo enddo if (MaxDelta) goto 200 thereby, eliminating the conditional expression from the inner loop. A microprocessor can execute many instructions per instruction cycle. Typically, it can execute one or more memory, floating point, integer, and jump operations. To be executed simultaneously, the operations must be independent. Thick loops tend to have more instruction level parallelism than thin loops. Moreover, they reduce memory traffice by maximizing data reuse. Loop unrolling and loop fusion are two techniques to increase the size of loop bodies. Several of the codes studied benefitted from loop unrolling, but none benefitted from loop fusion. This observation is not too surpising since it is the general tendency of programmers to write thick loops. As loops become thicker, the number of temporary values grows, increasing register pressure. If registers spill, then memory traffic increases and code flow is disrupted. A thick loop with many temporary values may execute slower than an equivalent series of thin loops. The biggest gain will be achieved if the thick loop can be split into a series of independent loops eliminating the need to write and read temporary arrays. I found such an occasion in code 10 where I split the loop do i = 1, n do j = 1, m A24(j,i)= S24(j,i) * T24(j,i) + S25(j,i) * U25(j,i) B24(j,i)= S24(j,i) * T25(j,i) + S25(j,i) * U24(j,i) A25(j,i)= S24(j,i) * C24(j,i) + S25(j,i) * V24(j,i) B25(j,i)= S24(j,i) * U25(j,i) + S25(j,i) * V25(j,i) C24(j,i)= S26(j,i) * T26(j,i) + S27(j,i) * U26(j,i) D24(j,i)= S26(j,i) * T27(j,i) + S27(j,i) * V26(j,i) C25(j,i)= S27(j,i) * S28(j,i) + S26(j,i) * U28(j,i) D25(j,i)= S27(j,i) * T28(j,i) + S26(j,i) * V28(j,i) end do end do into two disjoint loops do i = 1, n do j = 1, m A24(j,i)= S24(j,i) * T24(j,i) + S25(j,i) * U25(j,i) B24(j,i)= S24(j,i) * T25(j,i) + S25(j,i) * U24(j,i) A25(j,i)= S24(j,i) * C24(j,i) + S25(j,i) * V24(j,i) B25(j,i)= S24(j,i) * U25(j,i) + S25(j,i) * V25(j,i) end do end do do i = 1, n do j = 1, m C24(j,i)= S26(j,i) * T26(j,i) + S27(j,i) * U26(j,i) D24(j,i)= S26(j,i) * T27(j,i) + S27(j,i) * V26(j,i) C25(j,i)= S27(j,i) * S28(j,i) + S26(j,i) * U28(j,i) D25(j,i)= S27(j,i) * T28(j,i) + S26(j,i) * V28(j,i) end do end do Conclusions Over the course of the last year, I have had the opportunity to work with over two dozen academic scientific programmers at leading research universities. Their research interests span a broad range of scientific fields. Except for two programs that relied almost exclusively on library routines (matrix multiply and fast Fourier transform), I was able to improve significantly the single processor performance of all codes. Improvements range from 2x to 15.5x with a simple average of 4.75x. Changes to the source code were at a very high level. I did not use sophisticated techniques or programming tools to discover inefficiencies or effect the changes. Only one code was parallel despite the availability of parallel systems to all developers. Clearly, we have a problem—personal scientific research codes are highly inefficient and not running parallel. The developers are unaware of simple optimization techniques to make programs run faster. They lack education in the art of code optimization and parallel programming. I do not believe we can fix the problem by publishing additional books or training manuals. To date, the developers in questions have not studied the books or manual available, and are unlikely to do so in the future. Short courses are a possible solution, but I believe they are too concentrated to be much use. The general concepts can be taught in a three or four day course, but that is not enough time for students to practice what they learn and acquire the experience to apply and extend the concepts to their codes. Practice is the key to becoming proficient at optimization. I recommend that graduate students be required to take a semester length course in optimization and parallel programming. We would never give someone access to state-of-the-art scientific equipment costing hundreds of thousands of dollars without first requiring them to demonstrate that they know how to use the equipment. Yet the criterion for time on state-of-the-art supercomputers is at most an interesting project. Requestors are never asked to demonstrate that they know how to use the system, or can use the system effectively. A semester course would teach them the required skills. Government agencies that fund academic scientific research pay for most of the computer systems supporting scientific research as well as the development of most personal scientific codes. These agencies should require graduate schools to offer a course in optimization and parallel programming as a requirement for funding. About the Author John Feo received his Ph.D. in Computer Science from The University of Texas at Austin in 1986. After graduate school, Dr. Feo worked at Lawrence Livermore National Laboratory where he was the Group Leader of the Computer Research Group and principal investigator of the Sisal Language Project. In 1997, Dr. Feo joined Tera Computer Company where he was project manager for the MTA, and oversaw the programming and evaluation of the MTA at the San Diego Supercomputer Center. In 2000, Dr. Feo joined Sun Microsystems as an HPC application specialist. He works with university research groups to optimize and parallelize scientific codes. Dr. Feo has published over two dozen research articles in the areas of parallel parallel programming, parallel programming languages, and application performance.

    Read the article

  • Friday Fun: Games that Look Like Productivity Apps

    - by Mysticgeek
    We’ve been showing you fun flash games to play during company time on a Friday afternoon. Hopefully while playing them, you haven’t received a “talking to”. Today we show you some cool games to play that look like productivity apps, so the boss will be none the wiser. The website cantyouseeimbusy.com has developed some very neat little games that look like productivity apps like Word and Excel. These apps look exactly like some project you would be working on, but are really neat little games. Here we take a look at three cool ones on the site called Breakdown, Leadership, and Cost Cutter. Leadership Leadership is a cool game that looks like something you would be working in Excel and is a spin off of the classic game Moon Lander. You navigate your ship through a variety of challenging line graphs. Breakdown This one is a knock off of the classic game Break Out. Use your mouse to scroll the racket at the bottom and bounce the ball off of the text in the document. Press the space bar to pause the game and the elements will disappear…good for when the boss comes around. Cost Cutter This one is a puzzle game where it looks like your working on some bar charts in Excel. You need to click combinations of two or more blocks that are the same color. Again, hit the spacebar and the game elements will disappear. If you’re looking for a way to goof off with some simple games without the boss knowing, these will definitely do the trick. Another cool game along these lines is Excit! which we covered previously. Play Cost Cutter, Breakdown, and Leadership at cantyouseeimbusy.com Similar Articles Productive Geek Tips Friday Fun: Get Your Mario OnFriday Fun: Bricks Breaking & Cube CrashFriday Fun: Fancy Pants AdventuresFriday Fun: GemCraft is a Totally Addictive Tower Defense GameFriday Fun: Five More Time Wasting Online Games TouchFreeze Alternative in AutoHotkey The Icy Undertow Desktop Windows Home Server – Backup to LAN The Clear & Clean Desktop Use This Bookmarklet to Easily Get Albums Use AutoHotkey to Assign a Hotkey to a Specific Window Latest Software Reviews Tinyhacker Random Tips Revo Uninstaller Pro Registry Mechanic 9 for Windows PC Tools Internet Security Suite 2010 PCmover Professional Download Microsoft Office Help tab The Growth of Citibank Quickly Switch between Tabs in IE Windows Media Player 12: Tweak Video & Sound with Playback Enhancements Own a cell phone, or does a cell phone own you? Make your Joomla & Drupal Sites Mobile with OSMOBI

    Read the article

< Previous Page | 527 528 529 530 531 532 533 534 535 536 537 538  | Next Page >