Search Results

Search found 49554 results on 1983 pages for 'database users'.

Page 498/1983 | < Previous Page | 494 495 496 497 498 499 500 501 502 503 504 505 | Next Page >

SQL Server – Undelete a Table and Restore a Single Table from Backup

- by Mladen Prajdic

This post is part of the monthly community event called T-SQL Tuesday started by Adam Machanic (blog|twitter) and hosted by someone else each month. This month the host is Sankar Reddy (blog|twitter) and the topic is Misconceptions in SQL Server. You can follow posts for this theme on Twitter by looking at #TSQL2sDay hashtag. Let me start by saying: This code is a crazy hack that is to never be used unless you really, really have to. Really! And I don’t think there’s a time when you would really have to use it for real. Because it’s a hack there are number of things that can go wrong so play with it knowing that. I’ve managed to totally corrupt one database. :) Oh… and for those saying: yeah yeah.. you have a single table in a file group and you’re restoring that, I say “nay nay” to you. As we all know SQL Server can’t do single table restores from backup. This is kind of a obvious thing due to different relational integrity (RI) concerns. Since we have to maintain that we have to restore all tables represented in a RI graph. For this exercise i say BAH! to those concerns. Note that this method “works” only for simple tables that don’t have LOB and off rows data. The code can be expanded to include those but I’ve tried to leave things “simple”. Note that for this to work our table needs to be relatively static data-wise. This doesn’t work for OLTP table. Products are a perfect example of static data. They don’t change much between backups, pretty much everything depends on them and their table is one of those tables that are relatively easy to accidentally delete everything from. This only works if the database is in Full or Bulk-Logged recovery mode for tables where the contents have been deleted or truncated but NOT when a table was dropped. Everything we’ll talk about has to be done before the data pages are reused for other purposes. After deletion or truncation the pages are marked as reusable so you have to act fast. The best thing probably is to put the database into single user mode ASAP while you’re performing this procedure and return it to multi user after you’re done. How do we do it? We will be using an undocumented but known DBCC commands: DBCC PAGE, an undocumented function sys.fn_dblog and a little known DATABASE RESTORE PAGE option. All tests will be on a copy of Production.Product table in AdventureWorks database called Production.Product1 because the original table has FK constraints that prevent us from truncating it for testing. -- create a duplicate table. This doesn't preserve indexes!SELECT *INTO AdventureWorks.Production.Product1FROM AdventureWorks.Production.Product After we run this code take a full back to perform further testing. First let’s see what the difference between DELETE and TRUNCATE is when it comes to logging. With DELETE every row deletion is logged in the transaction log. With TRUNCATE only whole data page deallocations are logged in the transaction log. Getting deleted data pages is simple. All we have to look for is row delete entry in the sys.fn_dblog output. But getting data pages that were truncated from the transaction log presents a bit of an interesting problem. I will not go into depths of IAM(Index Allocation Map) and PFS (Page Free Space) pages but suffice to say that every IAM page has intervals that tell us which data pages are allocated for a table and which aren’t. If we deep dive into the sys.fn_dblog output we can see that once you truncate a table all the pages in all the intervals are deallocated and this is shown in the PFS page transaction log entry as deallocation of pages. For every 8 pages in the same extent there is one PFS page row in the transaction log. This row holds information about all 8 pages in CSV format which means we can get to this data with some parsing. A great help for parsing this stuff is Peter Debetta’s handy function dbo.HexStrToVarBin that converts hexadecimal string into a varbinary value that can be easily converted to integer tus giving us a readable page number. The shortened (columns removed) sys.fn_dblog output for a PFS page with CSV data for 1 extent (8 data pages) looks like this: -- [Page ID] is displayed in hex format. -- To convert it to readable int we'll use dbo.HexStrToVarBin function found at -- http://sqlblog.com/blogs/peter_debetta/archive/2007/03/09/t-sql-convert-hex-string-to-varbinary.aspx -- This function must be installed in the master databaseSELECT Context, AllocUnitName, [Page ID], DescriptionFROM sys.fn_dblog(NULL, NULL)WHERE [Current LSN] = '00000031:00000a46:007d' The pages at the end marked with 0x00—> are pages that are allocated in the extent but are not part of a table. We can inspect the raw content of each data page with a DBCC PAGE command: -- we need this trace flag to redirect output to the query window.DBCC TRACEON (3604); -- WITH TABLERESULTS gives us data in table format instead of message format-- we use format option 3 because it's the easiest to read and manipulate further onDBCC PAGE (AdventureWorks, 1, 613, 3) WITH TABLERESULTS Since the DBACC PAGE output can be quite extensive I won’t put it here. You can see an example of it in the link at the beginning of this section. Getting deleted data back When we run a delete statement every row to be deleted is marked as a ghost record. A background process periodically cleans up those rows. A huge misconception is that the data is actually removed. It’s not. Only the pointers to the rows are removed while the data itself is still on the data page. We just can’t access it with normal means. To get those pointers back we need to restore every deleted page using the RESTORE PAGE option mentioned above. This restore must be done from a full backup, followed by any differential and log backups that you may have. This is necessary to bring the pages up to the same point in time as the rest of the data. However the restore doesn’t magically connect the restored page back to the original table. It simply replaces the current page with the one from the backup. After the restore we use the DBCC PAGE to read data directly from all data pages and insert that data into a temporary table. To finish the RESTORE PAGE procedure we finally have to take a tail log backup (simple backup of the transaction log) and restore it back. We can now insert data from the temporary table to our original table by hand. Getting truncated data back When we run a truncate the truncated data pages aren’t touched at all. Even the pointers to rows stay unchanged. Because of this getting data back from truncated table is simple. we just have to find out which pages belonged to our table and use DBCC PAGE to read data off of them. No restore is necessary. Turns out that the problems we had with finding the data pages is alleviated by not having to do a RESTORE PAGE procedure. Stop stalling… show me The Code! This is the code for getting back deleted and truncated data back. It’s commented in all the right places so don’t be afraid to take a closer look. Make sure you have a full backup before trying this out. Also I suggest that the last step of backing and restoring the tail log is performed by hand. USE masterGOIF OBJECT_ID('dbo.HexStrToVarBin') IS NULL RAISERROR ('No dbo.HexStrToVarBin installed. Go to http://sqlblog.com/blogs/peter_debetta/archive/2007/03/09/t-sql-convert-hex-string-to-varbinary.aspx and install it in master database' , 18, 1) SET NOCOUNT ONBEGIN TRY DECLARE @dbName VARCHAR(1000), @schemaName VARCHAR(1000), @tableName VARCHAR(1000), @fullBackupName VARCHAR(1000), @undeletedTableName VARCHAR(1000), @sql VARCHAR(MAX), @tableWasTruncated bit; /* THE FIRST LINE ARE OUR INPUT PARAMETERS In this case we're trying to recover Production.Product1 table in AdventureWorks database. My full backup of AdventureWorks database is at e:\AW.bak */ SELECT @dbName = 'AdventureWorks', @schemaName = 'Production', @tableName = 'Product1', @fullBackupName = 'e:\AW.bak', @undeletedTableName = '##' + @tableName + '_Undeleted', @tableWasTruncated = 0, -- copy the structure from original table to a temp table that we'll fill with restored data @sql = 'IF OBJECT_ID(''tempdb..' + @undeletedTableName + ''') IS NOT NULL DROP TABLE ' + @undeletedTableName + ' SELECT *' + ' INTO ' + @undeletedTableName + ' FROM [' + @dbName + '].[' + @schemaName + '].[' + @tableName + ']' + ' WHERE 1 = 0' EXEC (@sql) IF OBJECT_ID('tempdb..#PagesToRestore') IS NOT NULL DROP TABLE #PagesToRestore /* FIND DATA PAGES WE NEED TO RESTORE*/ CREATE TABLE #PagesToRestore ([ID] INT IDENTITY(1,1), [FileID] INT, [PageID] INT, [SQLtoExec] VARCHAR(1000)) -- DBCC PACE statement to run later RAISERROR ('Looking for deleted pages...', 10, 1) -- use T-LOG direct read to get deleted data pages INSERT INTO #PagesToRestore([FileID], [PageID], [SQLtoExec]) EXEC('USE [' + @dbName + '];SELECT FileID, PageID, ''DBCC TRACEON (3604); DBCC PAGE ([' + @dbName + '], '' + FileID + '', '' + PageID + '', 3) WITH TABLERESULTS'' as SQLToExecFROM (SELECT DISTINCT LEFT([Page ID], 4) AS FileID, CONVERT(VARCHAR(100), ' + 'CONVERT(INT, master.dbo.HexStrToVarBin(SUBSTRING([Page ID], 6, 20)))) AS PageIDFROM sys.fn_dblog(NULL, NULL)WHERE AllocUnitName LIKE ''%' + @schemaName + '.' + @tableName + '%'' ' + 'AND Context IN (''LCX_MARK_AS_GHOST'', ''LCX_HEAP'') AND Operation in (''LOP_DELETE_ROWS''))t');SELECT *FROM #PagesToRestore -- if upper EXEC returns 0 rows it means the table was truncated so find truncated pages IF (SELECT COUNT(*) FROM #PagesToRestore) = 0 BEGIN RAISERROR ('No deleted pages found. Looking for truncated pages...', 10, 1) -- use T-LOG read to get truncated data pages INSERT INTO #PagesToRestore([FileID], [PageID], [SQLtoExec]) -- dark magic happens here -- because truncation simply deallocates pages we have to find out which pages were deallocated. -- we can find this out by looking at the PFS page row's Description column. -- for every deallocated extent the Description has a CSV of 8 pages in that extent. -- then it's just a matter of parsing it. -- we also remove the pages in the extent that weren't allocated to the table itself -- marked with '0x00-->00' EXEC ('USE [' + @dbName + '];DECLARE @truncatedPages TABLE(DeallocatedPages VARCHAR(8000), IsMultipleDeallocs BIT);INSERT INTO @truncatedPagesSELECT REPLACE(REPLACE(Description, ''Deallocated '', ''Y''), ''0x00-->00 '', ''N'') + '';'' AS DeallocatedPages, CHARINDEX('';'', Description) AS IsMultipleDeallocsFROM (SELECT DISTINCT LEFT([Page ID], 4) AS FileID, CONVERT(VARCHAR(100), CONVERT(INT, master.dbo.HexStrToVarBin(SUBSTRING([Page ID], 6, 20)))) AS PageID, DescriptionFROM sys.fn_dblog(NULL, NULL)WHERE Context IN (''LCX_PFS'') AND Description LIKE ''Deallocated%'' AND AllocUnitName LIKE ''%' + @schemaName + '.' + @tableName + '%'') t;SELECT FileID, PageID , ''DBCC TRACEON (3604); DBCC PAGE ([' + @dbName + '], '' + FileID + '', '' + PageID + '', 3) WITH TABLERESULTS'' as SQLToExecFROM (SELECT LEFT(PageAndFile, 1) as WasPageAllocatedToTable , SUBSTRING(PageAndFile, 2, CHARINDEX('':'', PageAndFile) - 2 ) as FileID , CONVERT(VARCHAR(100), CONVERT(INT, master.dbo.HexStrToVarBin(SUBSTRING(PageAndFile, CHARINDEX('':'', PageAndFile) + 1, LEN(PageAndFile))))) as PageIDFROM ( SELECT SUBSTRING(DeallocatedPages, delimPosStart, delimPosEnd - delimPosStart) as PageAndFile, IsMultipleDeallocs FROM ( SELECT *, CHARINDEX('';'', DeallocatedPages)*(N-1) + 1 AS delimPosStart, CHARINDEX('';'', DeallocatedPages)*N AS delimPosEnd FROM @truncatedPages t1 CROSS APPLY (SELECT TOP (case when t1.IsMultipleDeallocs = 1 then 8 else 1 end) ROW_NUMBER() OVER(ORDER BY number) as N FROM master..spt_values) t2 )t)t)tWHERE WasPageAllocatedToTable = ''Y''') SELECT @tableWasTruncated = 1 END DECLARE @lastID INT, @pagesCount INT SELECT @lastID = 1, @pagesCount = COUNT(*) FROM #PagesToRestore SELECT @sql = 'Number of pages to restore: ' + CONVERT(VARCHAR(10), @pagesCount) IF @pagesCount = 0 RAISERROR ('No data pages to restore.', 18, 1) ELSE RAISERROR (@sql, 10, 1) -- If the table was truncated we'll read the data directly from data pages without restoring from backup IF @tableWasTruncated = 0 BEGIN -- RESTORE DATA PAGES FROM FULL BACKUP IN BATCHES OF 200 WHILE @lastID <= @pagesCount BEGIN -- create CSV string of pages to restore SELECT @sql = STUFF((SELECT ',' + CONVERT(VARCHAR(100), FileID) + ':' + CONVERT(VARCHAR(100), PageID) FROM #PagesToRestore WHERE ID BETWEEN @lastID AND @lastID + 200 ORDER BY ID FOR XML PATH('')), 1, 1, '') SELECT @sql = 'RESTORE DATABASE [' + @dbName + '] PAGE = ''' + @sql + ''' FROM DISK = ''' + @fullBackupName + '''' RAISERROR ('Starting RESTORE command:' , 10, 1) WITH NOWAIT; RAISERROR (@sql , 10, 1) WITH NOWAIT; EXEC(@sql); RAISERROR ('Restore DONE' , 10, 1) WITH NOWAIT; SELECT @lastID = @lastID + 200 END /* If you have any differential or transaction log backups you should restore them here to bring the previously restored data pages up to date */ END DECLARE @dbccSinglePage TABLE ( [ParentObject] NVARCHAR(500), [Object] NVARCHAR(500), [Field] NVARCHAR(500), [VALUE] NVARCHAR(MAX) ) DECLARE @cols NVARCHAR(MAX), @paramDefinition NVARCHAR(500), @SQLtoExec VARCHAR(1000), @FileID VARCHAR(100), @PageID VARCHAR(100), @i INT = 1 -- Get deleted table columns from information_schema view -- Need sp_executeSQL because database name can't be passed in as variable SELECT @cols = 'select @cols = STUFF((SELECT '', ['' + COLUMN_NAME + '']''FROM ' + @dbName + '.INFORMATION_SCHEMA.COLUMNSWHERE TABLE_NAME = ''' + @tableName + ''' AND TABLE_SCHEMA = ''' + @schemaName + '''ORDER BY ORDINAL_POSITIONFOR XML PATH('''')), 1, 2, '''')', @paramDefinition = N'@cols nvarchar(max) OUTPUT' EXECUTE sp_executesql @cols, @paramDefinition, @cols = @cols OUTPUT -- Loop through all the restored data pages, -- read data from them and insert them into temp table -- which you can then insert into the orignial deleted table DECLARE dbccPageCursor CURSOR GLOBAL FORWARD_ONLY FOR SELECT [FileID], [PageID], [SQLtoExec] FROM #PagesToRestore ORDER BY [FileID], [PageID] OPEN dbccPageCursor; FETCH NEXT FROM dbccPageCursor INTO @FileID, @PageID, @SQLtoExec; WHILE @@FETCH_STATUS = 0 BEGIN RAISERROR ('---------------------------------------------', 10, 1) WITH NOWAIT; SELECT @sql = 'Loop iteration: ' + CONVERT(VARCHAR(10), @i); RAISERROR (@sql, 10, 1) WITH NOWAIT; SELECT @sql = 'Running: ' + @SQLtoExec RAISERROR (@sql, 10, 1) WITH NOWAIT; -- if something goes wrong with DBCC execution or data gathering, skip it but print error BEGIN TRY INSERT INTO @dbccSinglePage EXEC (@SQLtoExec) -- make the data insert magic happen here IF (SELECT CONVERT(BIGINT, [VALUE]) FROM @dbccSinglePage WHERE [Field] LIKE '%Metadata: ObjectId%') = OBJECT_ID('['+@dbName+'].['+@schemaName +'].['+@tableName+']') BEGIN DELETE @dbccSinglePage WHERE NOT ([ParentObject] LIKE 'Slot % Offset %' AND [Object] LIKE 'Slot % Column %') SELECT @sql = 'USE tempdb; ' + 'IF (OBJECTPROPERTY(object_id(''' + @undeletedTableName + '''), ''TableHasIdentity'') = 1) ' + 'SET IDENTITY_INSERT ' + @undeletedTableName + ' ON; ' + 'INSERT INTO ' + @undeletedTableName + '(' + @cols + ') ' + STUFF((SELECT ' UNION ALL SELECT ' + STUFF((SELECT ', ' + CASE WHEN VALUE = '[NULL]' THEN 'NULL' ELSE '''' + [VALUE] + '''' END FROM ( -- the unicorn help here to correctly set ordinal numbers of columns in a data page -- it's turning STRING order into INT order (1,10,11,2,21 into 1,2,..10,11...21) SELECT [ParentObject], [Object], Field, VALUE, RIGHT('00000' + O1, 6) AS ParentObjectOrder, RIGHT('00000' + REVERSE(LEFT(O2, CHARINDEX(' ', O2)-1)), 6) AS ObjectOrder FROM ( SELECT [ParentObject], [Object], Field, VALUE, REPLACE(LEFT([ParentObject], CHARINDEX('Offset', [ParentObject])-1), 'Slot ', '') AS O1, REVERSE(LEFT([Object], CHARINDEX('Offset ', [Object])-2)) AS O2 FROM @dbccSinglePage WHERE t.ParentObject = ParentObject )t)t ORDER BY ParentObjectOrder, ObjectOrder FOR XML PATH('')), 1, 2, '') FROM @dbccSinglePage t GROUP BY ParentObject FOR XML PATH('') ), 1, 11, '') + ';' RAISERROR (@sql, 10, 1) WITH NOWAIT; EXEC (@sql) END END TRY BEGIN CATCH SELECT @sql = 'ERROR!!!' + CHAR(10) + CHAR(13) + 'ErrorNumber: ' + ERROR_NUMBER() + '; ErrorMessage' + ERROR_MESSAGE() + CHAR(10) + CHAR(13) + 'FileID: ' + @FileID + '; PageID: ' + @PageID RAISERROR (@sql, 10, 1) WITH NOWAIT; END CATCH DELETE @dbccSinglePage SELECT @sql = 'Pages left to process: ' + CONVERT(VARCHAR(10), @pagesCount - @i) + CHAR(10) + CHAR(13) + CHAR(10) + CHAR(13) + CHAR(10) + CHAR(13), @i = @i+1 RAISERROR (@sql, 10, 1) WITH NOWAIT; FETCH NEXT FROM dbccPageCursor INTO @FileID, @PageID, @SQLtoExec; END CLOSE dbccPageCursor; DEALLOCATE dbccPageCursor; EXEC ('SELECT ''' + @undeletedTableName + ''' as TableName; SELECT * FROM ' + @undeletedTableName)END TRYBEGIN CATCH SELECT ERROR_NUMBER() AS ErrorNumber, ERROR_MESSAGE() AS ErrorMessage IF CURSOR_STATUS ('global', 'dbccPageCursor') >= 0 BEGIN CLOSE dbccPageCursor; DEALLOCATE dbccPageCursor; ENDEND CATCH-- if the table was deleted we need to finish the restore page sequenceIF @tableWasTruncated = 0BEGIN -- take a log tail backup and then restore it to complete page restore process DECLARE @currentDate VARCHAR(30) SELECT @currentDate = CONVERT(VARCHAR(30), GETDATE(), 112) RAISERROR ('Starting Log Tail backup to c:\Temp ...', 10, 1) WITH NOWAIT; PRINT ('BACKUP LOG [' + @dbName + '] TO DISK = ''c:\Temp\' + @dbName + '_TailLogBackup_' + @currentDate + '.trn''') EXEC ('BACKUP LOG [' + @dbName + '] TO DISK = ''c:\Temp\' + @dbName + '_TailLogBackup_' + @currentDate + '.trn''') RAISERROR ('Log Tail backup done.', 10, 1) WITH NOWAIT; RAISERROR ('Starting Log Tail restore from c:\Temp ...', 10, 1) WITH NOWAIT; PRINT ('RESTORE LOG [' + @dbName + '] FROM DISK = ''c:\Temp\' + @dbName + '_TailLogBackup_' + @currentDate + '.trn''') EXEC ('RESTORE LOG [' + @dbName + '] FROM DISK = ''c:\Temp\' + @dbName + '_TailLogBackup_' + @currentDate + '.trn''') RAISERROR ('Log Tail restore done.', 10, 1) WITH NOWAIT;END-- The last step is manual. Insert data from our temporary table to the original deleted table The misconception here is that you can do a single table restore properly in SQL Server. You can't. But with little experimentation you can get pretty close to it. One way to possible remove a dependency on a backup to retrieve deleted pages is to quickly run a similar script to the upper one that gets data directly from data pages while the rows are still marked as ghost records. It could be done if we could beat the ghost record cleanup task.

Read the article
What's up with OCFS2?

- by wcoekaer

On Linux there are many filesystem choices and even from Oracle we provide a number of filesystems, all with their own advantages and use cases. Customers often confuse ACFS with OCFS or OCFS2 which then causes assumptions to be made such as one replacing the other etc... I thought it would be good to write up a summary of how OCFS2 got to where it is, what we're up to still, how it is different from other options and how this really is a cool native Linux cluster filesystem that we worked on for many years and is still widely used. Work on a cluster filesystem at Oracle started many years ago, in the early 2000's when the Oracle Database Cluster development team wrote a cluster filesystem for Windows that was primarily focused on providing an alternative to raw disk devices and help customers with the deployment of Oracle Real Application Cluster (RAC). Oracle RAC is a cluster technology that lets us make a cluster of Oracle Database servers look like one big database. The RDBMS runs on many nodes and they all work on the same data. It's a Shared Disk database design. There are many advantages doing this but I will not go into detail as that is not the purpose of my write up. Suffice it to say that Oracle RAC expects all the database data to be visible in a consistent, coherent way, across all the nodes in the cluster. To do that, there were/are a few options : 1) use raw disk devices that are shared, through SCSI, FC, or iSCSI 2) use a network filesystem (NFS) 3) use a cluster filesystem(CFS) which basically gives you a filesystem that's coherent across all nodes using shared disks. It is sort of (but not quite) combining option 1 and 2 except that you don't do network access to the files, the files are effectively locally visible as if it was a local filesystem. So OCFS (Oracle Cluster FileSystem) on Windows was born. Since Linux was becoming a very important and popular platform, we decided that we would also make this available on Linux and thus the porting of OCFS/Windows started. The first version of OCFS was really primarily focused on replacing the use of Raw devices with a simple filesystem that lets you create files and provide direct IO to these files to get basically native raw disk performance. The filesystem was not designed to be fully POSIX compliant and it did not have any where near good/decent performance for regular file create/delete/access operations. Cache coherency was easy since it was basically always direct IO down to the disk device and this ensured that any time one issues a write() command it would go directly down to the disk, and not return until the write() was completed. Same for read() any sort of read from a datafile would be a read() operation that went all the way to disk and return. We did not cache any data when it came down to Oracle data files. So while OCFS worked well for that, since it did not have much of a normal filesystem feel, it was not something that could be submitted to the kernel mail list for inclusion into Linux as another native linux filesystem (setting aside the Windows porting code ...) it did its job well, it was very easy to configure, node membership was simple, locking was disk based (so very slow but it existed), you could create regular files and do regular filesystem operations to a certain extend but anything that was not database data file related was just not very useful in general. Logfiles ok, standard filesystem use, not so much. Up to this point, all the work was done, at Oracle, by Oracle developers. Once OCFS (1) was out for a while and there was a lot of use in the database RAC world, many customers wanted to do more and were asking for features that you'd expect in a normal native filesystem, a real "general purposes cluster filesystem". So the team sat down and basically started from scratch to implement what's now known as OCFS2 (Oracle Cluster FileSystem release 2). Some basic criteria were : Design it with a real Distributed Lock Manager and use the network for lock negotiation instead of the disk Make it a Linux native filesystem instead of a native shim layer and a portable core Support standard Posix compliancy and be fully cache coherent with all operations Support all the filesystem features Linux offers (ACL, extended Attributes, quotas, sparse files,...) Be modern, support large files, 32/64bit, journaling, data ordered journaling, endian neutral, we can mount on both endian /cross architecture,.. Needless to say, this was a huge development effort that took many years to complete. A few big milestones happened along the way... OCFS2 was development in the open, we did not have a private tree that we worked on without external code review from the Linux Filesystem maintainers, great folks like Christopher Hellwig reviewed the code regularly to make sure we were not doing anything out of line, we submitted the code for review on lkml a number of times to see if we were getting close for it to be included into the mainline kernel. Using this development model is standard practice for anyone that wants to write code that goes into the kernel and having any chance of doing so without a complete rewrite or.. shall I say flamefest when submitted. It saved us a tremendous amount of time by not having to re-fit code for it to be in a Linus acceptable state. Some other filesystems that were trying to get into the kernel that didn't follow an open development model had a lot harder time and a lot harsher criticism. March 2006, when Linus released 2.6.16, OCFS2 officially became part of the mainline kernel, it was accepted a little earlier in the release candidates but in 2.6.16. OCFS2 became officially part of the mainline Linux kernel tree as one of the many filesystems. It was the first cluster filesystem to make it into the kernel tree. Our hope was that it would then end up getting picked up by the distribution vendors to make it easy for everyone to have access to a CFS. Today the source code for OCFS2 is approximately 85000 lines of code. We made OCFS2 production with full support for customers that ran Oracle database on Linux, no extra or separate support contract needed. OCFS2 1.0.0 started being built for RHEL4 for x86, x86-64, ppc, s390x and ia64. For RHEL5 starting with OCFS2 1.2. SuSE was very interested in high availability and clustering and decided to build and include OCFS2 with SLES9 for their customers and was, next to Oracle, the main contributor to the filesystem for both new features and bug fixes. Source code was always available even prior to inclusion into mainline and as of 2.6.16, source code was just part of a Linux kernel download from kernel.org, which it still is, today. So the latest OCFS2 code is always the upstream mainline Linux kernel. OCFS2 is the cluster filesystem used in Oracle VM 2 and Oracle VM 3 as the virtual disk repository filesystem. Since the filesystem is in the Linux kernel it's released under the GPL v2 The release model has always been that new feature development happened in the mainline kernel and we then built consistent, well tested, snapshots that had versions, 1.2, 1.4, 1.6, 1.8. But these releases were effectively just snapshots in time that were tested for stability and release quality. OCFS2 is very easy to use, there's a simple text file that contains the node information (hostname, node number, cluster name) and a file that contains the cluster heartbeat timeouts. It is very small, and very efficient. As Sunil Mushran wrote in the manual : OCFS2 is an efficient, easily configured, quickly installed, fully integrated and compatible, feature-rich, architecture and endian neutral, cache coherent, ordered data journaling, POSIX-compliant, shared disk cluster file system. Here is a list of some of the important features that are included : Variable Block and Cluster sizes Supports block sizes ranging from 512 bytes to 4 KB and cluster sizes ranging from 4 KB to 1 MB (increments in power of 2). Extent-based Allocations Tracks the allocated space in ranges of clusters making it especially efficient for storing very large files. Optimized Allocations Supports sparse files, inline-data, unwritten extents, hole punching and allocation reservation for higher performance and efficient storage. File Cloning/snapshots REFLINK is a feature which introduces copy-on-write clones of files in a cluster coherent way. Indexed Directories Allows efficient access to millions of objects in a directory. Metadata Checksums Detects silent corruption in inodes and directories. Extended Attributes Supports attaching an unlimited number of name:value pairs to the file system objects like regular files, directories, symbolic links, etc. Advanced Security Supports POSIX ACLs and SELinux in addition to the traditional file access permission model. Quotas Supports user and group quotas. Journaling Supports both ordered and writeback data journaling modes to provide file system consistency in the event of power failure or system crash. Endian and Architecture neutral Supports a cluster of nodes with mixed architectures. Allows concurrent mounts on nodes running 32-bit and 64-bit, little-endian (x86, x86_64, ia64) and big-endian (ppc64) architectures. In-built Cluster-stack with DLM Includes an easy to configure, in-kernel cluster-stack with a distributed lock manager. Buffered, Direct, Asynchronous, Splice and Memory Mapped I/Os Supports all modes of I/Os for maximum flexibility and performance. Comprehensive Tools Support Provides a familiar EXT3-style tool-set that uses similar parameters for ease-of-use. The filesystem was distributed for Linux distributions in separate RPM form and this had to be built for every single kernel errata release or every updated kernel provided by the vendor. We provided builds from Oracle for Oracle Linux and all kernels released by Oracle and for Red Hat Enterprise Linux. SuSE provided the modules directly for every kernel they shipped. With the introduction of the Unbreakable Enterprise Kernel for Oracle Linux and our interest in reducing the overhead of building filesystem modules for every minor release, we decide to make OCFS2 available as part of UEK. There was no more need for separate kernel modules, everything was built-in and a kernel upgrade automatically updated the filesystem, as it should. UEK allowed us to not having to backport new upstream filesystem code into an older kernel version, backporting features into older versions introduces risk and requires extra testing because the code is basically partially rewritten. The UEK model works really well for continuing to provide OCFS2 without that extra overhead. Because the RHEL kernel did not contain OCFS2 as a kernel module (it is in the source tree but it is not built by the vendor in kernel module form) we stopped adding the extra packages to Oracle Linux and its RHEL compatible kernel and for RHEL. Oracle Linux customers/users obviously get OCFS2 included as part of the Unbreakable Enterprise Kernel, SuSE customers get it by SuSE distributed with SLES and Red Hat can decide to distribute OCFS2 to their customers if they chose to as it's just a matter of compiling the module and making it available. OCFS2 today, in the mainline kernel is pretty much feature complete in terms of integration with every filesystem feature Linux offers and it is still actively maintained with Joel Becker being the primary maintainer. Since we use OCFS2 as part of Oracle VM, we continue to look at interesting new functionality to add, REFLINK was a good example, and as such we continue to enhance the filesystem where it makes sense. Bugfixes and any sort of code that goes into the mainline Linux kernel that affects filesystems, automatically also modifies OCFS2 so it's in kernel, actively maintained but not a lot of new development happening at this time. We continue to fully support OCFS2 as part of Oracle Linux and the Unbreakable Enterprise Kernel and other vendors make their own decisions on support as it's really a Linux cluster filesystem now more than something that we provide to customers. It really just is part of Linux like EXT3 or BTRFS etc, the OS distribution vendors decide. Do not confuse OCFS2 with ACFS (ASM cluster Filesystem) also known as Oracle Cloud Filesystem. ACFS is a filesystem that's provided by Oracle on various OS platforms and really integrates into Oracle ASM (Automatic Storage Management). It's a very powerful Cluster Filesystem but it's not distributed as part of the Operating System, it's distributed with the Oracle Database product and installs with and lives inside Oracle ASM. ACFS obviously is fully supported on Linux (Oracle Linux, Red Hat Enterprise Linux) but OCFS2 independently as a native Linux filesystem is also, and continues to also be supported. ACFS is very much tied into the Oracle RDBMS, OCFS2 is just a standard native Linux filesystem with no ties into Oracle products. Customers running the Oracle database and ASM really should consider using ACFS as it also provides storage/clustered volume management. Customers wanting to use a simple, easy to use generic Linux cluster filesystem should consider using OCFS2. To learn more about OCFS2 in detail, you can find good documentation on http://oss.oracle.com/projects/ocfs2 in the Documentation area, or get the latest mainline kernel from http://kernel.org and read the source. One final, unrelated note - since I am not always able to publicly answer or respond to comments, I do not want to selectively publish comments from readers. Sometimes I forget to publish comments, sometime I publish them and sometimes I would publish them but if for some reason I cannot publicly comment on them, it becomes a very one-sided stream. So for now I am going to not publish comments from anyone, to be fair to all sides. You are always welcome to email me and I will do my best to respond to technical questions, questions about strategy or direction are sometimes not possible to answer for obvious reasons.

Read the article
CodePlex Daily Summary for Friday, December 07, 2012

CodePlex Daily Summary for Friday, December 07, 2012Popular ReleasesTorrents-List Organizer: Torrents-list organizer v 0.5.0.0: ????? ? ?????? 0.5.0.0: 1) ????? ? ?????? ?????? ??????????? ???????? ?????????? ? ?????? ????????. ??????????????, ????????? ???????, ??? ????????? ???????? ?????? ("???????? ?? ??????"). 2) ????????? ???????? ???????? ???????-??????. ????? ??????, ?????? ??? ?????? ?????????????? ??? ???????-?????. 3) ?????? ??? ??????? ??????? ?????? ? ????? ????????????? ?????????? ???????????, ? ?????????????, ??????? ?????? ????? ?????? ??????????, ?????????? ???? ??????? ???? ?????? ???????? ??????? ? ...fastJSON: v2.0.11: - bug fix single char number json - added UseEscapedUnicode parameter for controlling string output in \uxxxx for unicode/utf8 format - bug fix null and generic ToObject<>() - bug fix List<> of custom typesMedia Companion: MediaCompanion3.508b: Recommended Download - Fixes IMDB title scrape bug and several mc_com movie and actor cache bugs.testddgit060720122: asdf: asdfFile Upload App: FileUploadApp_Source: You can download this source code,and run it direstly.Microsoft Ajax Minifier: Microsoft Ajax Minifier 4.77: Look for and strip "IDENT:nomunge" minification hints that define fields that should not be automatically renamed within the current scope (multiple IDENT:nomunge pairs can be specified in a single string literal if comma-separated). Will only work if the field is actually defined within the current scope (can't be used to not rename an outer field, for instance; will be ignored in those cases). Minification hints are expression statements that are composed of a single string literal similar ...Periodic.Net: 0.8: Whats new for Periodic.Net 0.8: New Element Info Dialog New Website MenuItem Minor Bug Fix's, improvements and speed upsActor Framework for Windows Azure: ActorFx V0.10: ActorFx V0.10 Very Preliminary release of the Actor Framework for Windows Azure. The code is still fairly unstable. After you download the code, please refer to the Actor Runtime Manager help page for instructions.mp3player-xslt-plugin: mp3player module 1.0: This is a mp3 player module that you can install on your umbraco project. It can display the list of the media and you can click the button to play or stop.Yahoo! UI Library: YUI Compressor for .Net: Version 2.2.0.0 - Epee: New : Web Optimization package! Cleaned up the nuget packages BugFix: minifying lots of files will now be faster because of a recent regression in some code. (We were instantiating something far too many times).DtPad - .NET Framework text editor: DtPad 2.9.0.40: http://dtpad.diariotraduttore.com/files/images/flag-eng.png English + A new built-in editor for the management of CSV files, including the edit of cells, deleting and adding new rows, replacement of delimiter character and much more (issue #1137) + The limit of rows allowed before the decommissioning of their side panel has been raised (new default: 1.000) (issue #1155, only partially solved) + Pressing CTRL+TAB now DtPad opens a screen that shows the list of opened tabs (issue #1143) + Note...AvalonDock: AvalonDock 2.0.1746: Welcome to the new release of AvalonDock 2.0 This release contains a lot (lot) of bug fixes and some great improvements: Views Caching: Content of Documents and Anchorables is no more recreated everytime user move it. Autohide pane opens really fast now. Two new themes Expression (Dark and Light) and Metro (both of them still in experimental stage). If you already use AD 2.0 or plan to integrate it in your future projects, I'm interested in your ideas for new features: http://avalondock...AcDown?????: AcDown????? v4.3.2: ??●AcDown??????????、??、??、???????。????，????，?????????????????????????。???????????Acfun、????(Bilibili)、??、??、YouTube、??、???、??????、SF????、????????????。 ●??????AcPlay?????，??????、????????????????。 ● AcDown??????????????????，????????????????????????????。 ● AcDown???????C#??，????.NET Framework 2.0??。?????"Acfun?????"。 ?? v4.3.2?? ?????????????????? ??Acfun??????? ??Bilibili?????? ??Bilibili???????????? ??Bilibili????????? ??????????????? ???? ??Bilibili??????? ????32??64? Windows XP/...ExtJS based ASP.NET 2.0 Controls: FineUI v3.2.2: ??FineUI ?? ExtJS ??? ASP.NET 2.0 ???。 FineUI??? ?? No JavaScript，No CSS，No UpdatePanel，No ViewState，No WebServices ???????。 ?????? IE 7.0、Firefox 3.6、Chrome 3.0、Opera 10.5、Safari 3.0+ ???? Apache License 2.0 (Apache) ???? ??：http://fineui.com/bbs/ ??：http://fineui.com/demo/ ??：http://fineui.com/doc/ ??：http://fineui.codeplex.com/ ???? +2012-12-03 v3.2.2 -?????????????，?????button/button_menu.aspx（????）。 +?Window????Plain??；?ToolbarPosition??Footer??；?????FooterBarAlign??。 -????win...Player Framework by Microsoft: Player Framework for Windows Phone 8: This is a brand new version of the Player Framework for Windows Phone, available exclusively for Windows Phone 8, and now based upon the Player Framework for Windows 8. While this new version is not backward compatible with Windows Phone 7 (get that http://smf.codeplex.com/releases/view/88970), it does offer the same great feature set plus dozens of new features such as advertising, localization support, and improved skinning. Click here for more information about what's new in the Windows P...SSH.NET Library: 2012.12.3: New feature(s): + SynchronizeDirectoriesQuest: Quest 5.3 Beta: New features in Quest 5.3 include: Grid-based map (sponsored by Phillip Zolla) Changable POV (sponsored by Phillip Zolla) Game log (sponsored by Phillip Zolla) Customisable object link colour (sponsored by Phillip Zolla) More room description options (by James Gregory) More mathematical functions now available to expressions Desktop Player uses the same UI as WebPlayer - this will make it much easier to implement customisation options New sorting functions: ObjectListSort(list,...Chinook Database: Chinook Database 1.4: Chinook Database 1.4 This is a sample database available in multiple formats: SQL scripts for multiple database vendors, embeded database files, and XML format. The Chinook data model is available here. ChinookDatabase1.4_CompleteVersion.zip is a complete package for all supported databases/data sources. There are also packages for each specific data source. Supported Database ServersDB2 EffiProz MySQL Oracle PostgreSQL SQL Server SQL Server Compact SQLite Issues Resolved293...RiP-Ripper & PG-Ripper: RiP-Ripper 2.9.34: changes FIXED: Thanks Function when "Download each post in it's own folder" is disabled FIXED: "PixHub.eu" linksD3 Loot Tracker: 1.5.6: Updated to work with D3 version 1.0.6.13300New Projects132702: dddActor Framework for Windows Azure: The goal for ActorFx for Windows Azure is to provide a non-prescriptive, language-independent model of dynamic distributed objects. Administração de Condominios: Sistema de administração de condominiosAssembly Searcher: Search references of assemblies to find interlinked assemblies.BadgeManager: Acceso Utenti Gestione gestionaleBee OPOA Platform: rapid-develop platform include 1,Multi-database supported, 2,MVC, 3,RBACBetterPlace WebApp: Student web site for out BetterPlaceBookingCanned Bytes Repository: A C# code repository for backing other codeplex open source projects I'm doing.churchtool: sermon, ftp, upload, churchDNN Schedule Clean DB log: Schedule task for cleaning database logs, eventlog and sitelog tables.Hestia's Cyberminer Software: Coming soon!IcyTowerMobile: A small project of two students from FH Hagenberg, Mobile Computing department, to port the great game IcyTower from desktop to Windows Phone 7.IronJS register custom javascript object: IronJS test examplesKaryadeka Alam Lestari Accounting: Sistem Informasi Akunting Penjualan Jasa PT Karyadeka Alam Lestari SemarangKJFramework????: KJFramework????, ???HA?????C/S????????????。Measure It: Measuring micro benchmarks in dot net Modern UI (Metro) Charts for Windows 8, WPF, Silverlight: This project provides a charting library which can be used in WPF, Windows8 and SilverlightMSurplus: NOMyGreatLittleSite: This is a demo projectNeznayka: Static Code Analysis Rule-set for Visual Studio 2010 Database Edition Nprove Pulgin for Visual Studio: Nprove Plugin A plugin for developers to ease the handling of .net projects on systems with hundreds of them.Nusya Tester: A software to create and use various tests with right answers explanations to help one in self-education. It's developed in C#Object Pool reference implementation: A simple object pooling reference implementation demonstrating how to create an object pool for expensive to create, but often used, objects. parserkit: little parserkit QConnectSDK: ????????OAuth2 SDK??sampleydoo: some stuff hereServEvo: ServEvo is an application that allows you to turn your computer into a complete development environmentSnakeApplet: Classic snake game.TestWSP: A test project setupWindSim: ...WinMoApps: A collection of useful Windows Mobile applications, including a secure password manager, poker blinds timer, shopping list manager and timer program.WParse: WParse is a CLS compliant class library developed in C# with .NET 4.0 as the target framework. WParse makes it easier to parse log files for the game Wurm Online and generate interesting and useful statistics from the logged data. It includes scripting support for parsers in C#.XLS File Deployment: Excel tool to support file deployment.Ynote Classic: Ynote Classic is a text editor made by SS Corporation Author : Samarjeet Singh It can compile batch files. And is a text editor which can color text. Ide too

Read the article
Using the ASP.NET Cache to cache data in a Model or Business Object layer, without a dependency on System.Web in the layer - Part One.

- by Rhames

ASP.NET applications can make use of the System.Web.Caching.Cache object to cache data and prevent repeated expensive calls to a database or other store. However, ideally an application should make use of caching at the point where data is retrieved from the database, which typically is inside a Business Objects or Model layer. One of the key features of using a UI pattern such as Model-View-Presenter (MVP) or Model-View-Controller (MVC) is that the Model and Presenter (or Controller) layers are developed without any knowledge of the UI layer. Introducing a dependency on System.Web into the Model layer would break this independence of the Model from the View. This article gives a solution to this problem, using dependency injection to inject the caching implementation into the Model layer at runtime. This allows caching to be used within the Model layer, without any knowledge of the actual caching mechanism that will be used. Create a sample application to use the caching solution Create a test SQL Server database This solution uses a SQL Server database with the same Sales data used in my previous post on calculating running totals. The advantage of using this data is that it gives nice slow queries that will exaggerate the effect of using caching! To create the data, first create a new SQL database called CacheSample. Next run the following script to create the Sale table and populate it: USE CacheSample GO CREATE TABLE Sale(DayCount smallint, Sales money) CREATE CLUSTERED INDEX ndx_DayCount ON Sale(DayCount) go INSERT Sale VALUES (1,120) INSERT Sale VALUES (2,60) INSERT Sale VALUES (3,125) INSERT Sale VALUES (4,40) DECLARE @DayCount smallint, @Sales money SET @DayCount = 5 SET @Sales = 10 WHILE @DayCount < 5000 BEGIN INSERT Sale VALUES (@DayCount,@Sales) SET @DayCount = @DayCount + 1 SET @Sales = @Sales + 15 END Next create a stored procedure to calculate the running total, and return a specified number of rows from the Sale table, using the following script: USE [CacheSample] GO SET ANSI_NULLS ON GO SET QUOTED_IDENTIFIER ON GO -- ============================================= -- Author: Robin -- Create date: -- Description: -- ============================================= CREATE PROCEDURE [dbo].[spGetRunningTotals] -- Add the parameters for the stored procedure here @HighestDayCount smallint = null AS BEGIN -- SET NOCOUNT ON added to prevent extra result sets from -- interfering with SELECT statements. SET NOCOUNT ON; IF @HighestDayCount IS NULL SELECT @HighestDayCount = MAX(DayCount) FROM dbo.Sale DECLARE @SaleTbl TABLE (DayCount smallint, Sales money, RunningTotal money) DECLARE @DayCount smallint, @Sales money, @RunningTotal money SET @RunningTotal = 0 SET @DayCount = 0 DECLARE rt_cursor CURSOR FOR SELECT DayCount, Sales FROM Sale ORDER BY DayCount OPEN rt_cursor FETCH NEXT FROM rt_cursor INTO @DayCount,@Sales WHILE @@FETCH_STATUS = 0 AND @DayCount <= @HighestDayCount BEGIN SET @RunningTotal = @RunningTotal + @Sales INSERT @SaleTbl VALUES (@DayCount,@Sales,@RunningTotal) FETCH NEXT FROM rt_cursor INTO @DayCount,@Sales END CLOSE rt_cursor DEALLOCATE rt_cursor SELECT DayCount, Sales, RunningTotal FROM @SaleTbl END GO Create the Sample ASP.NET application In Visual Studio create a new solution and add a class library project called CacheSample.BusinessObjects and an ASP.NET web application called CacheSample.UI. The CacheSample.BusinessObjects project will contain a single class to represent a Sale data item, with all the code to retrieve the sales from the database included in it for simplicity (normally I would at least have a separate Repository or other object that is responsible for retrieving data, and probably a data access layer as well, but for this sample I want to keep it simple). The C# code for the Sale class is shown below: using System; using System.Collections.Generic; using System.Data; using System.Data.SqlClient; namespace CacheSample.BusinessObjects { public class Sale { public Int16 DayCount { get; set; } public decimal Sales { get; set; } public decimal RunningTotal { get; set; } public static IEnumerable<Sale> GetSales(int? highestDayCount) { List<Sale> sales = new List<Sale>(); SqlParameter highestDayCountParameter = new SqlParameter("@HighestDayCount", SqlDbType.SmallInt); if (highestDayCount.HasValue) highestDayCountParameter.Value = highestDayCount; else highestDayCountParameter.Value = DBNull.Value; string connectionStr = System.Configuration.ConfigurationManager .ConnectionStrings["CacheSample"].ConnectionString; using(SqlConnection sqlConn = new SqlConnection(connectionStr)) using (SqlCommand sqlCmd = sqlConn.CreateCommand()) { sqlCmd.CommandText = "spGetRunningTotals"; sqlCmd.CommandType = CommandType.StoredProcedure; sqlCmd.Parameters.Add(highestDayCountParameter); sqlConn.Open(); using (SqlDataReader dr = sqlCmd.ExecuteReader()) { while (dr.Read()) { Sale newSale = new Sale(); newSale.DayCount = dr.GetInt16(0); newSale.Sales = dr.GetDecimal(1); newSale.RunningTotal = dr.GetDecimal(2); sales.Add(newSale); } } } return sales; } } } The static GetSale() method makes a call to the spGetRunningTotals stored procedure and then reads each row from the returned SqlDataReader into an instance of the Sale class, it then returns a List of the Sale objects, as IEnnumerable<Sale>. A reference to System.Configuration needs to be added to the CacheSample.BusinessObjects project so that the connection string can be read from the web.config file. In the CacheSample.UI ASP.NET project, create a single web page called ShowSales.aspx, and make this the default start up page. This page will contain a single button to call the GetSales() method and a label to display the results. The html mark up and the C# code behind are shown below: ShowSales.aspx <%@ Page Language="C#" AutoEventWireup="true" CodeBehind="ShowSales.aspx.cs" Inherits="CacheSample.UI.ShowSales" %> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> <html xmlns="http://www.w3.org/1999/xhtml"> <head runat="server"> <title>Cache Sample - Show All Sales</title> </head> <body> <form id="form1" runat="server"> <div> <asp:Button ID="btnTest1" runat="server" onclick="btnTest1_Click" Text="Get All Sales" />     <asp:Label ID="lblResults" runat="server"></asp:Label> </div> </form> </body> </html> ShowSales.aspx.cs using System; using System.Collections.Generic; using System.Linq; using System.Web; using System.Web.UI; using System.Web.UI.WebControls; using CacheSample.BusinessObjects; namespace CacheSample.UI { public partial class ShowSales : System.Web.UI.Page { protected void Page_Load(object sender, EventArgs e) { } protected void btnTest1_Click(object sender, EventArgs e) { System.Diagnostics.Stopwatch stopWatch = new System.Diagnostics.Stopwatch(); stopWatch.Start(); var sales = Sale.GetSales(null); var lastSales = sales.Last(); stopWatch.Stop(); lblResults.Text = string.Format( "Count of Sales: {0}, Last DayCount: {1}, Total Sales: {2}. Query took {3} ms", sales.Count(), lastSales.DayCount, lastSales.RunningTotal, stopWatch.ElapsedMilliseconds); } } } Finally we need to add a connection string to the CacheSample SQL Server database, called CacheSample, to the web.config file: <?xmlversion="1.0"?> <configuration> <connectionStrings> <addname="CacheSample" connectionString="data source=.\SQLEXPRESS;Integrated Security=SSPI;Initial Catalog=CacheSample" providerName="System.Data.SqlClient" /> </connectionStrings> <system.web> <compilationdebug="true"targetFramework="4.0" /> </system.web> </configuration> Run the application and click the button a few times to see how long each call to the database takes. On my system, each query takes about 450ms. Next I shall look at a solution to use the ASP.NET caching to cache the data returned by the query, so that subsequent requests to the GetSales() method are much faster. Adding Data Caching Support I am going to create my caching support in a separate project called CacheSample.Caching, so the next step is to add a class library to the solution. We shall be using the application configuration to define the implementation of our caching system, so we need a reference to System.Configuration adding to the project. ICacheProvider<T> Interface The first step in adding caching to our application is to define an interface, called ICacheProvider, in the CacheSample.Caching project, with methods to retrieve any data from the cache or to retrieve the data from the data source if it is not present in the cache. Dependency Injection will then be used to inject an implementation of this interface at runtime, allowing the users of the interface (i.e. the CacheSample.BusinessObjects project) to be completely unaware of how the caching is actually implemented. As data of any type maybe retrieved from the data source, it makes sense to use generics in the interface, with a generic type parameter defining the data type associated with a particular instance of the cache interface implementation. The C# code for the ICacheProvider interface is shown below: using System; using System.Collections.Generic; namespace CacheSample.Caching { public interface ICacheProvider { } public interface ICacheProvider<T> : ICacheProvider { T Fetch(string key, Func<T> retrieveData, DateTime? absoluteExpiry, TimeSpan? relativeExpiry); IEnumerable<T> Fetch(string key, Func<IEnumerable<T>> retrieveData, DateTime? absoluteExpiry, TimeSpan? relativeExpiry); } } The empty non-generic interface will be used as a type in a Dictionary generic collection later to store instances of the ICacheProvider<T> implementation for reuse, I prefer to use a base interface when doing this, as I think the alternative of using object makes for less clear code. The ICacheProvider<T> interface defines two overloaded Fetch methods, the difference between these is that one will return a single instance of the type T and the other will return an IEnumerable<T>, providing support for easy caching of collections of data items. Both methods will take a key parameter, which will uniquely identify the cached data, a delegate of type Func<T> or Func<IEnumerable<T>> which will provide the code to retrieve the data from the store if it is not present in the cache, and absolute or relative expiry policies to define when a cached item should expire. Note that at present there is no support for cache dependencies, but I shall be showing a method of adding this in part two of this article. CacheProviderFactory Class We need a mechanism of creating instances of our ICacheProvider<T> interface, using Dependency Injection to get the implementation of the interface. To do this we shall create a CacheProviderFactory static class in the CacheSample.Caching project. This factory will provide a generic static method called GetCacheProvider<T>(), which shall return instances of ICacheProvider<T>. We can then call this factory method with the relevant data type (for example the Sale class in the CacheSample.BusinessObject project) to get a instance of ICacheProvider for that type (e.g. call CacheProviderFactory.GetCacheProvider<Sale>() to get the ICacheProvider<Sale> implementation). The C# code for the CacheProviderFactory is shown below: using System; using System.Collections.Generic; using CacheSample.Caching.Configuration; namespace CacheSample.Caching { public static class CacheProviderFactory { private static Dictionary<Type, ICacheProvider> cacheProviders = new Dictionary<Type, ICacheProvider>(); private static object syncRoot = new object(); ///<summary> /// Factory method to create or retrieve an implementation of the /// ICacheProvider interface for type <typeparamref name="T"/>. ///</summary> ///<typeparam name="T"> /// The type that this cache provider instance will work with ///</typeparam> ///<returns>An instance of the implementation of ICacheProvider for type ///<typeparamref name="T"/>, as specified by the application /// configuration</returns> public static ICacheProvider<T> GetCacheProvider<T>() { ICacheProvider<T> cacheProvider = null; // Get the Type reference for the type parameter T Type typeOfT = typeof(T); // Lock the access to the cacheProviders dictionary // so multiple threads can work with it lock (syncRoot) { // First check if an instance of the ICacheProvider implementation // already exists in the cacheProviders dictionary for the type T if (cacheProviders.ContainsKey(typeOfT)) cacheProvider = (ICacheProvider<T>)cacheProviders[typeOfT]; else { // There is not already an instance of the ICacheProvider in // cacheProviders for the type T // so we need to create one // Get the Type reference for the application's implementation of // ICacheProvider from the configuration Type cacheProviderType = Type.GetType(CacheProviderConfigurationSection.Current. CacheProviderType); if (cacheProviderType != null) { // Now get a Type reference for the Cache Provider with the // type T generic parameter Type typeOfCacheProviderTypeForT = cacheProviderType.MakeGenericType(new Type[] { typeOfT }); if (typeOfCacheProviderTypeForT != null) { // Create the instance of the Cache Provider and add it to // the cacheProviders dictionary for future use cacheProvider = (ICacheProvider<T>)Activator. CreateInstance(typeOfCacheProviderTypeForT); cacheProviders.Add(typeOfT, cacheProvider); } } } } return cacheProvider; } } } As this code uses Activator.CreateInstance() to create instances of the ICacheProvider<T> implementation, which is a slow process, the factory class maintains a Dictionary of the previously created instances so that a cache provider needs to be created only once for each type. The type of the implementation of ICacheProvider<T> is read from a custom configuration section in the application configuration file, via the CacheProviderConfigurationSection class, which is described below. CacheProviderConfigurationSection Class The implementation of ICacheProvider<T> will be specified in a custom configuration section in the application’s configuration. To handle this create a folder in the CacheSample.Caching project called Configuration, and add a class called CacheProviderConfigurationSection to this folder. This class will extend the System.Configuration.ConfigurationSection class, and will contain a single string property called CacheProviderType. The C# code for this class is shown below: using System; using System.Configuration; namespace CacheSample.Caching.Configuration { internal class CacheProviderConfigurationSection : ConfigurationSection { public static CacheProviderConfigurationSection Current { get { return (CacheProviderConfigurationSection) ConfigurationManager.GetSection("cacheProvider"); } } [ConfigurationProperty("type", IsRequired=true)] public string CacheProviderType { get { return (string)this["type"]; } } } } Adding Data Caching to the Sales Class We now have enough code in place to add caching to the GetSales() method in the CacheSample.BusinessObjects.Sale class, even though we do not yet have an implementation of the ICacheProvider<T> interface. We need to add a reference to the CacheSample.Caching project to CacheSample.BusinessObjects so that we can use the ICacheProvider<T> interface within the GetSales() method. Once the reference is added, we can first create a unique string key based on the method name and the parameter value, so that the same cache key is used for repeated calls to the method with the same parameter values. Then we get an instance of the cache provider for the Sales type, using the CacheProviderFactory, and pass the existing code to retrieve the data from the database as the retrievalMethod delegate in a call to the Cache Provider Fetch() method. The C# code for the modified GetSales() method is shown below: public static IEnumerable<Sale> GetSales(int? highestDayCount) { string cacheKey = string.Format("CacheSample.BusinessObjects.GetSalesWithCache({0})", highestDayCount); return CacheSample.Caching.CacheProviderFactory. GetCacheProvider<Sale>().Fetch(cacheKey, delegate() { List<Sale> sales = new List<Sale>(); SqlParameter highestDayCountParameter = new SqlParameter("@HighestDayCount", SqlDbType.SmallInt); if (highestDayCount.HasValue) highestDayCountParameter.Value = highestDayCount; else highestDayCountParameter.Value = DBNull.Value; string connectionStr = System.Configuration.ConfigurationManager. ConnectionStrings["CacheSample"].ConnectionString; using (SqlConnection sqlConn = new SqlConnection(connectionStr)) using (SqlCommand sqlCmd = sqlConn.CreateCommand()) { sqlCmd.CommandText = "spGetRunningTotals"; sqlCmd.CommandType = CommandType.StoredProcedure; sqlCmd.Parameters.Add(highestDayCountParameter); sqlConn.Open(); using (SqlDataReader dr = sqlCmd.ExecuteReader()) { while (dr.Read()) { Sale newSale = new Sale(); newSale.DayCount = dr.GetInt16(0); newSale.Sales = dr.GetDecimal(1); newSale.RunningTotal = dr.GetDecimal(2); sales.Add(newSale); } } } return sales; }, null, new TimeSpan(0, 10, 0)); } This example passes the code to retrieve the Sales data from the database to the Cache Provider as an anonymous method, however it could also be written as a lambda. The main advantage of using an anonymous function (method or lambda) is that the code inside the anonymous function can access the parameters passed to the GetSales() method. Finally the absolute expiry is set to null, and the relative expiry set to 10 minutes, to indicate that the cache entry should be removed 10 minutes after the last request for the data. As the ICacheProvider<T> has a Fetch() method that returns IEnumerable<T>, we can simply return the results of the Fetch() method to the caller of the GetSales() method. This should be all that is needed for the GetSales() method to now retrieve data from a cache after the first time the data has be retrieved from the database. Implementing a ASP.NET Cache Provider The final step is to actually implement the ICacheProvider<T> interface, and add the implementation details to the web.config file for the dependency injection. The cache provider implementation needs to have access to System.Web. Therefore it could be placed in the CacheSample.UI project, or in its own project that has a reference to System.Web. Implementing the Cache Provider in a separate project is my favoured approach. Create a new project inside the solution called CacheSample.CacheProvider, and add references to System.Web and CacheSample.Caching to this project. Add a class to the project called AspNetCacheProvider. Make the class a generic class by adding the generic parameter <T> and indicate that the class implements ICacheProvider<T>. The C# code for the AspNetCacheProvider class is shown below: using System; using System.Collections.Generic; using System.Linq; using System.Web; using System.Web.Caching; using CacheSample.Caching; namespace CacheSample.CacheProvider { public class AspNetCacheProvider<T> : ICacheProvider<T> { #region ICacheProvider<T> Members public T Fetch(string key, Func<T> retrieveData, DateTime? absoluteExpiry, TimeSpan? relativeExpiry) { return FetchAndCache<T>(key, retrieveData, absoluteExpiry, relativeExpiry); } public IEnumerable<T> Fetch(string key, Func<IEnumerable<T>> retrieveData, DateTime? absoluteExpiry, TimeSpan? relativeExpiry) { return FetchAndCache<IEnumerable<T>>(key, retrieveData, absoluteExpiry, relativeExpiry); } #endregion #region Helper Methods private U FetchAndCache<U>(string key, Func<U> retrieveData, DateTime? absoluteExpiry, TimeSpan? relativeExpiry) { U value; if (!TryGetValue<U>(key, out value)) { value = retrieveData(); if (!absoluteExpiry.HasValue) absoluteExpiry = Cache.NoAbsoluteExpiration; if (!relativeExpiry.HasValue) relativeExpiry = Cache.NoSlidingExpiration; HttpContext.Current.Cache.Insert(key, value, null, absoluteExpiry.Value, relativeExpiry.Value); } return value; } private bool TryGetValue<U>(string key, out U value) { object cachedValue = HttpContext.Current.Cache.Get(key); if (cachedValue == null) { value = default(U); return false; } else { try { value = (U)cachedValue; return true; } catch { value = default(U); return false; } } } #endregion } } The two interface Fetch() methods call a private method called FetchAndCache(). This method first checks for a element in the HttpContext.Current.Cache with the specified cache key, and if so tries to cast this to the specified type (either T or IEnumerable<T>). If the cached element is found, the FetchAndCache() method simply returns it. If it is not found in the cache, the method calls the retrievalMethod delegate to get the data from the data source, and then adds this to the HttpContext.Current.Cache. The final step is to add the AspNetCacheProvider class to the relevant custom configuration section in the CacheSample.UI.Web.Config file. To do this there needs to be a <configSections> element added as the first element in <configuration>. This will match a custom section called <cacheProvider> with the CacheProviderConfigurationSection. Then we add a <cacheProvider> element, with a type property set to the fully qualified assembly name of the AspNetCacheProvider class, as shown below: <?xmlversion="1.0"?> <configuration> <configSections> <sectionname="cacheProvider" type="CacheSample.Base.Configuration.CacheProviderConfigurationSection, CacheSample.Base" /> </configSections> <connectionStrings> <addname="CacheSample" connectionString="data source=.\SQLEXPRESS;Integrated Security=SSPI;Initial Catalog=CacheSample" providerName="System.Data.SqlClient" /> </connectionStrings> <cacheProvidertype="CacheSample.CacheProvider.AspNetCacheProvider`1, CacheSample.CacheProvider, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null"> </cacheProvider> <system.web> <compilationdebug="true"targetFramework="4.0" /> </system.web> </configuration> One point to note is that the fully qualified assembly name of the AspNetCacheProvider class includes the notation `1 after the class name, which indicates that it is a generic class with a single generic type parameter. The CacheSample.UI project needs to have references added to CacheSample.Caching and CacheSample.CacheProvider so that the actual application is aware of the relevant cache provider implementation. Conclusion After implementing this solution, you should have a working cache provider mechanism, that will allow the middle and data access layers to implement caching support when retrieving data, without any knowledge of the actually caching implementation. If the UI is not ASP.NET based, if for example it is Winforms or WPF, the implementation of ICacheProvider<T> would be written around whatever technology is available. It could even be a standalone caching system that takes full responsibility for adding and removing items from a global store. The next part of this article will show how this caching mechanism may be extended to provide support for cache dependencies, such as the System.Web.Caching.SqlCacheDependency. Another possible extension would be to cache the cache provider implementations instead of storing them in a static Dictionary in the CacheProviderFactory. This would prevent a build up of seldom used cache providers in the application memory, as they could be removed from the cache if not used often enough, although in reality there are probably unlikely to be vast numbers of cache provider implementation instances, as most applications do not have a massive number of business object or model types.

Read the article
Securing an ADF Application using OES11g: Part 2

- by user12587121

To validate the integration with OES we need a sample ADF Application that is rich enough to allow us to test securing the various ADF elements. To achieve this we can add some items including bounded task flows to the application developed in this tutorial. A sample JDeveloper 11.1.1.6 project is available here. It depends on the Fusion Order Demo (FOD) database schema which is easily created using the FOD build scripts.In the deployment we have chosen to enable only ADF Authentication as we will delegate Authorization, mostly, to OES.The welcome page of the application with all the links exposed looks as follows: The Welcome, Browse Products, Browse Stock and System Administration links go to pages while the Supplier Registration and Update Stock are bounded task flows. The Login link goes to a basic login page and once logged in a link is presented that goes to a logout page. Only the Browse Products and Browse Stock pages are really connected to the database--the other pages and task flows do not really perform any operations on the database. Required Security Policies We make use of a set of test users and roles as decscribed on the welcome page of the application. In order to exercise the different authorization possibilities we would like to enforce the following sample policies: Anonymous users can see the Login, Welcome and Supplier Registration links. They can also see the Welcome page, the Login page and follow the Supplier Registration task flow. They can see the icon adjacent to the Login link indicating whether they have logged in or not. Authenticated users can see the Browse Product page. Only staff granted the right can see the Browse Product page cost price value returned from the database and then only if the value is below a configurable limit. Suppliers and staff can see the Browse Stock links and pages. Customers cannot. Suppliers can see the Update Stock link but only those with the update permission are allowed to follow the task flow that it launches. We could hide the link but leave it exposed here so we can easily demonstrate the method call activity protecting the task flow. Only staff granted the right can see the System Administration link and the System Administration page it accesses. Implementing the required policies In order to secure the application we will make use of the following techniques: EL Expressions and Java backing beans: JSF has the notion of EL expressions to reference data from backing Java classes. We use these to control the presentation of links on the navigation page which respect the security contraints. So a user will not see links that he is not allowed to click on into. These Java backing beans can call on to OES for an authorization decision. Important Note: naturally we would configure the WLS domain where our ADF application is running as an OES WLS SM, which would allow us to efficiently query OES over the PEP API. However versioning conflicts between OES 11.1.1.5 and ADF 11.1.1.6 mean that this is not possible. Nevertheless, we can make use of the OES RESTful gateway technique from this posting in order to call into OES. You can easily create and manage backing beans in Jdeveloper as follows: Custom ADF Phase Listener: ADF extends the JSF page lifecycle flow and allows one to hook into the flow to intercept page rendering. We use this to put a check prior to rendering any protected pages, again calling on to OES via the backing bean. Phase listeners are configured in the adf-settings.xml file. See the MyPageListener.java class in the project. Here, for example, is the code we use in the listener to check for allowed access to the sysadmin page, navigating back to the welcome page if authorization is not granted: if (page != null && (page.equals("/system.jspx") || page.equals("/system"))){ System.out.println("MyPageListener: Checking Authorization for /system"); if (getValue("#{oesBackingBean.UIAccessSysAdmin}").toString().equals("false") ){ System.out.println("MyPageListener: Forcing navigation away from system" + "to welcome"); NavigationHandler nh = fc.getApplication().getNavigationHandler(); nh.handleNavigation(fc, null, "welcome"); } else { System.out.println("MyPageListener: access allowed"); } } Method call activity: our app makes use of bounded task flows to implement the sequence of pages that update the stock or allow suppliers to self register. ADF takes care of ensuring that a bounded task flow can be entered by only one page. So a way to protect all those pages is to make a call to OES in the first activity and then either exit the task flow or continue depending on the authorization decision. The method call returns a String which contains the name of the transition to effect. This is where we configure the method call activity in JDeveloper: We implement each of the policies using the above techniques as follows: Policies 1 and 2: as these policies concern the coarse grained notions of controlling access to anonymous and authenticated users we can make use of the container’s security constraints which can be defined in the web.xml file. The allPages constraint is added automatically when we configure Authentication for the ADF application. We have added the “anonymousss” constraint to allow access to the the required pages, task flows and icons: <security-constraint> <web-resource-collection> <web-resource-name>anonymousss</web-resource-name> <url-pattern>/faces/welcome</url-pattern> <url-pattern>/afr/*</url-pattern> <url-pattern>/adf/*</url-pattern> <url-pattern>/key.png</url-pattern> <url-pattern>/faces/supplier-reg-btf/*</url-pattern> <url-pattern>/faces/supplier_register_complete</url-pattern> </web-resource-collection> </security-constraint> Policy 3: we can place an EL expression on the element representing the cost price on the products.jspx page: #{oesBackingBean.dataAccessCostPrice}. This EL Expression references a method in a Java backing bean that will call on to OES for an authorization decision. In OES we model the authorization requirement by requiring the view permission on the resource /MyADFApp/data/costprice and granting it only to the staff application role. We recover any obligations to determine the limit. Policy 4: is implemented by putting an EL expression on the Browse Stock link #{oesBackingBean.UIAccessBrowseStock} which checks for the view permission on the /MyADFApp/ui/stock resource. The stock.jspx page is protected by checking for the same permission in a custom phase listener—if the required permission is not satisfied then we force navigation back to the welcome page. Policy 5: the Update Stock link is protected with the same EL expression as the Browse Link: #{oesBackingBean.UIAccessBrowseStock}. However the Update Stock link launches a bounded task flow and to protect it the first activity in the flow is a method call activity which will execute an EL expression #{oesBackingBean.isUIAccessSupplierUpdateTransition} to check for the update permission on the /MyADFApp/ui/stock resource and either transition to the next step in the flow or terminate the flow with an authorization error. Policy 6: the System Administration link is protected with an EL Expression #{oesBackingBean.UIAccessSysAdmin} that checks for view access on the /MyADF/ui/sysadmin resource. The system page is protected in the same way at the stock page—the custom phase listener checks for the same permission that protects the link and if not satisfied we navigate back to the welcome page. Testing the Application To test the application: deploy the OES11g Admin to a WLS domain deploy the OES gateway in a another domain configured to be a WLS SM. You must ensure that the jps-config.xml file therein is configured to allow access to the identity store, otherwise the gateway will not b eable to resolve the principals for the requested users. To do this ensure that the following elements appear in the jps-config.xml file: <serviceProvider type="IDENTITY_STORE" name="idstore.ldap.provider" class="oracle.security.jps.internal.idstore.ldap.LdapIdentityStoreProvider"> <description>LDAP-based IdentityStore Provider</description> </serviceProvider> <serviceInstance name="idstore.ldap" provider="idstore.ldap.provider"> <property name="idstore.config.provider" value="oracle.security.jps.wls.internal.idstore.WlsLdapIdStoreConfigProvider"/> <property name="CONNECTION_POOL_CLASS" value="oracle.security.idm.providers.stdldap.JNDIPool"/></serviceInstance> <serviceInstanceRef ref="idstore.ldap"/> download the sample application and change the URL to the gateway in the MyADFApp OESBackingBean code to point to the OES Gateway and deploy the application to an 11.1.1.6 WLS domain that has been extended with the ADF JRF files. You will need to configure the FOD database connection to point your database which contains the FOD schema. populate the OES Admin and OES Gateway WLS LDAP stores with the sample set of users and groups. If you have configured the WLS domains to point to the same LDAP then it would only have to be done once. To help with this there is a directory called ldap_scripts in the sample project with ldif files for the test users and groups. start the OES Admin console and configure the required OES authorization policies for the MyADFApp application and push them to the WLS SM containing the OES Gateway. Login to the MyADFApp as each of the users described on the login page to test that the security policy is correct. You will see informative logging from the OES Gateway and the ADF application to their respective WLS consoles. Congratulations, you may now login to the OES Admin console and change policies that will control the behaviour of your ADF application--change the limit value in the obligation for the cost price for example, or define Role Mapping policies to determine staff access to the system administration page based on user profile attributes. ADF Development Notes Some notes on ADF development which are probably typical gotchas: May need this on WLS startup in order to allow us to overwrite credentials for the database, the signal here is that there is an error trying to access the data base: -Djps.app.credential.overwrite.allowed=true Best to call Bounded Task flows via a CommandLink (as opposed to a go link) as you cannot seem to start them again from a go link, even having completed the task flow correctly with a return activity. Once a bounded task flow (BTF) is initated it must complete correctly via a return activity—attempting to click on any other link whilst in the context of a BTF has no effect. See here for example: When using the ADF Authentication only security approach it seems to be awkward to allow anonymous access to the welcome and registration pages. We can achieve anonymous access using the web.xml security constraint shown above (where no auth-constraint is specified) however it is not clear what needs to be listed in there….for example the /afr/* and /adf/* are in there by trial and error as sometimes the welcome page will not render if we omit those items. I was not able to use the default allPages constraint with for example the anonymous-role or the everyone WLS group in order to be able to allow anonymous access to pages. The ADF security best practice advises placing all pages under the public_html/WEB-INF folder as then ADF will not allow any direct access to the .jspx pages but will only allow acces via a link of the form /faces/welcome rather than /faces/welcome.jspx. This seems like a very good practice to follow as having multiple entry points to data is a source of confusion in a web application (particulary from a security point of view). In Authentication+Authorization mode only pages with a Page definition file are protected. In order to add an emty one right click on the page and choose Go to Page Definition. This will create an empty page definition and now the page will require explicit permission to be seen. It is advisable to give a unique context root via the weblogic.xml for the application, as otherwise the application will clash with any other application with the same context root and it will not deploy

Read the article
Office 2010: It’s not just DOC(X) and XLS(X)

- by andrewbrust

Office 2010 has released to manufacturing. The bits have left the (product team’s) building. Will you upgrade? This version of Office is officially numbered 14, a designation that correlates with the various releases, through the years, of Microsoft Word. There were six major versions of Word for DOS, during whose release cycles came three 16-bit Windows versions. Then, starting with Word 95 and counting through Word 2007, there have been six more versions – all for the 32-bit Windows platform. Skip version 13 to ward off folksy bad luck (and, perhaps, the bugs that could come with it) and that brings us to version 14, which includes implementations for both 32- and 64-bit Windows platforms. We’ve come a long way baby. Or have we? As it does every three years or so, debate will now start to rage on over whether we need a “14th” version the PC platform’s standard word processor, or a “13th” version of the spreadsheet. If you accept the premise of that question, then you may be on a slippery slope toward answering it in the negative. Thing is, that premise is valid for certain customers and not others. The Microsoft Office product has morphed from one that offered core word processing, spreadsheet, presentation and email functionality to a suite of applications that provides unique, new value-added features, and even whole applications, in the context of those core services. The core apps thus grow in mission: Excel is a BI tool. Word is a collaborative editorial system for the production of publications. PowerPoint is a media production platform for for live presentations and, increasingly, for delivering more effective presentations online. Outlook is a time and task management system. Access is a rich client front-end for data-driven self-service SharePoint applications. OneNote helps you capture ideas, corral random thoughts in a semi-structured way, and then tie them back to other, more rigidly structured, Office documents. Google Docs and other cloud productivity platforms like Zoho don’t really do these things. And there is a growing chorus of voices who say that they shouldn’t, because those ancillary capabilities are over-engineered, over-produced and “under-necessary.” They might say Microsoft is layering on superfluous capabilities to avoid admitting that Office’s core capabilities, the ones people really need, have become commoditized. It’s hard to take sides in that argument, because different people, and the different companies that employ them, have different needs. For my own needs, it all comes down to three basic questions: will the new version of Office save me time, will it make the mundane parts of my job easier, and will it augment my services to customers? I need my time back. I need to spend more of it with my family, and more of it focusing on my own core capabilities rather than the administrative tasks around them. And I also need my customers to be able to get more value out of the services I provide. Help me triage my inbox, help me get proposals done more quickly and make them easier to read. Let me get my presentations done faster, make them more effective and make it easier for me to reuse materials from other presentations. And, since I’m in the BI and data business, help me and my customers manage data and analytics more easily, both on the desktop and online. Those are my criteria. And, with those in mind, Office 2010 is looking like a worthwhile upgrade. Perhaps it’s not earth-shattering, but it offers a combination of incremental improvements and a few new major capabilities that I think are quite compelling. I provide a brief roundup of them here. It’s admittedly arbitrary and not comprehensive, but I think it tells the Office 2010 story effectively. Across the Suite More than any other, this release of Office aims to give collaboration a real workout. In certain apps, for the first time, documents can be opened simultaneously by multiple users, with colleagues’ changes appearing in near real-time. Web-browser-based versions of Word, Excel, PowerPoint and OneNote will be available to extend collaboration to contributors who are off the corporate network. The ribbon user interface is now more pervasive (for example, it appears in OneNote and in Outlook’s main window). It’s also customizable, allowing users to add, easily, buttons and options of their choosing, into new tabs, or into new groups within existing tabs. Microsoft has also taken the File menu (which was the “Office Button” menu in the 2007 release) and made it into a full-screen “Backstage” view where document-wide operations, like saving, printing and online publishing are performed. And because, more and more, heavily formatted content is cut and pasted between documents and applications, Office 2010 makes it easier to manage the retention or jettisoning of that formatting right as the paste operation is performed. That’s much nicer than stripping it off, or adding it back, afterwards. And, speaking of pasting, a number of Office apps now make it especially easy to insert screenshots within their documents. I know that’s useful to me, because I often document or critique applications and need to show them in action. For the vast majority of users, I expect that this feature will be more useful for capturing snapshots of Web pages, but we’ll have to see whether this feature becomes popular. Excel At first glance, Excel 2010 looks and acts nearly identically to the 2007 version. But additional glances are necessary. It’s important to understand that lots of people in the working world use Excel as more of a database, analytics and mathematical modeling tool than merely as a spreadsheet. And it’s also important to understand that Excel wasn’t designed to handle such workloads past a certain scale. That all changes with this release. The first reason things change is that Excel has been tuned for performance. It’s been optimized for multi-threaded operation; previously lengthy processes have been shortened, especially for large data sets; more rows and columns are allowed and, for the first time, Excel (and the rest of Office) is available in a 64-bit version. For Excel, this means users can take advantage of more than the 2GB of memory that the 32-bit version is limited to. On the analysis side, Excel 2010 adds Sparklines (tiny charts that fit into a single cell and can therefore be presented down an entire column or across a row) and Slicers (a more user-friendly filter mechanism for PivotTables and charts, which visually indicates what the filtered state of a given data member is). But most important, Excel 2010 supports the new PowerPIvot add-in which brings true self-service BI to Office. PowerPivot allows users to import data from almost anywhere, model it, and then analyze it. Rather than forcing users to build “spreadmarts” or use corporate-built data warehouses, PowerPivot models function as true columnar, in-memory OLAP cubes that can accommodate millions of rows of data and deliver fast drill-down performance. And speaking of OLAP, Excel 2010 now supports an important Analysis Services OLAP feature called write-back. Write-back is especially useful in financial forecasting scenarios for which Excel is the natural home. Support for write-back is long overdue, but I’m still glad it’s there, because I had almost given up on it. PowerPoint This version of PowerPoint marks its progression from a presentation tool to a video and photo editing and production tool. Whether or not it’s successful in this pursuit, and if offering this is even a sensible goal, is another question. Regardless, the new capabilities are kind of interesting. A greatly enhanced set of slide transitions with 3D effects; in-product photo and video editing; accommodation of embedded videos from services such as YouTube; and the ability to save a presentation as a video each lay testimony to PowerPoint’s transformation into a media tool and away from a pure presentation tool. These capabilities also recognize the importance of the Web as both a source for materials and a channel for disseminating PowerPoint output. Congruent with that is PowerPoint’s new ability to broadcast a slide presentation, using a quickly-generated public URL, without involving the hassle or expense of a Web meeting service like GoToMeeting or Microsoft’s own LiveMeeting. Slides presented through this broadcast feature retain full color fidelity and transitions and animations are preserved as well. Outlook Microsoft’s ubiquitous email/calendar/contact/task management tool gains long overdue speed improvements, especially against POP3 email accounts. Outlook 2010 also supports multiple Exchange accounts, rather than just one; tighter integration with OneNote; and a new Social Connector providing integration with, and presence information from, online social network services like LinkedIn and Facebook (not to mention Windows Live). A revamped conversation view now includes messages that are part of a given thread regardless of which folder they may be stored in. I don’t know yet how well the Social Connector will work or whether it will keep Outlook relevant to those who live on Facebook and LinkedIn. But among the other features, there’s very little not to like. OneNote To me, OneNote is the part of Office that just keeps getting better. There is one major caveat to this, which I’ll cover in a moment, but let’s first catalog what new stuff OneNote 2010 brings. The best part of OneNote, is the way each of its versions have managed hierarchy: Notebooks have sections, sections have pages, pages have sub pages, multiple notes can be contained in either, and each note supports infinite levels of indentation. None of that is new to 2010, but the new version does make creation of pages and subpages easier and also makes simple work out of promoting and demoting pages from sub page to full page status. And relationships between pages are quite easy to create now: much like a Wiki, simply typing a page’s name in double-square-brackets (“[[…]]”) creates a link to it. OneNote is also great at integrating content outside of its notebooks. With a new Dock to Desktop feature, OneNote becomes aware of what window is displayed in the rest of the screen and, if it’s an Office document or a Web page, links the notes you’re typing, at the time, to it. A single click from your notes later on will bring that same document or Web page back on-screen. Embedding content from Web pages and elsewhere is also easier. Using OneNote’s Windows Key+S combination to grab part of the screen now allows you to specify the destination of that bitmap instead of automatically creating a new note in the Unfiled Notes area. Using the Send to OneNote buttons in Internet Explorer and Outlook result in the same choice. Collaboration gets better too. Real-time multi-author editing is better accommodated and determining author lineage of particular changes is easily carried out. My one pet peeve with OneNote is the difficulty using it when I’m not one a Windows PC. OneNote’s main competitor, Evernote, while I believe inferior in terms of features, has client versions for PC, Mac, Windows Mobile, Android, iPhone, iPad and Web browsers. Since I have an Android phone and an iPad, I am practically forced to use it. However, the OneNote Web app should help here, as should a forthcoming version of OneNote for Windows Phone 7. In the mean time, it turns out that using OneNote’s Email Page ribbon button lets you move a OneNote page easily into EverNote (since every EverNote account gets a unique email address for adding notes) and that Evernote’s Email function combined with Outlook’s Send to OneNote button (in the Move group of the ribbon’s Home tab) can achieve the reverse. Access To me, the big change in Access 2007 was its tight integration with SharePoint lists. Access 2010 and SharePoint 2010 continue this integration with the introduction of SharePoint’s Access Services. Much as Excel Services provides a SharePoint-hosted experience for viewing (and now editing) Excel spreadsheet, PivotTable and chart content, Access Services allows for SharePoint browser-hosted editing of Access data within the forms that are built in the Access client itself. To me this makes all kinds of sense. Although it does beg the question of where to draw the line between Access, InfoPath, SharePoint list maintenance and SharePoint 2010’s new Business Connectivity Services. Each of these tools provide overlapping data entry and data maintenance functionality. But if you do prefer Access, then you’ll like things like templates and application parts that make it easier to get off the blank page. These features help you quickly get tables, forms and reports built out. To make things look nice, Access even gets its own version of Excel’s Conditional Formatting feature, letting you add data bars and data-driven text formatting. Word As I said at the beginning of this post, upgrades to Office are about much more than enhancing the suite’s flagship word processing application. So are there any enhancements in Word worth mentioning? I think so. The most important one has to be the collaboration features. Essentially, when a user opens a Word document that is in a SharePoint document library (or Windows Live SkyDrive folder), rather than the whole document being locked, Word has the ability to observe more granular locks on the individual paragraphs being edited. Word also shows you who’s editing what and its Save function morphs into a sync feature that both saves your changes and loads those made by anyone editing the document concurrently. There’s also a new navigation pane that lets you manage sections in your document in much the same way as you manage slides in a PowerPoint deck. Using the navigation pane, you can reorder sections, insert new ones, or promote and demote sections in the outline hierarchy. Not earth shattering, but nice. Other Apps and Summarized Findings What about InfoPath, Publisher, Visio and Project? I haven’t looked at them yet. And for this post, I think that’s fine. While those apps (and, arguably, Access) cater to specific tasks, I think the apps we’ve looked at in this post service the general purpose needs of most users. And the theme in those 2010 apps is clear: collaboration is key, the Web and productivity are indivisible, and making data and analytics into a self-service amenity is the way to go. But perhaps most of all, features are still important, as long as they get you through your day faster, rather than adding complexity for its own sake. I would argue that this is true for just about every product Microsoft makes: users want utility, not complexity.

Read the article
Accessing Oracle 6i and 9i/10g Databases using C#

- by Mike M

Hi all, I am making two build files using NAnt. The first aims to automatically compile Oracle 6i forms and reports and the second aims to compile Oracle 9i/10g forms and reports. Within the NAnt task is a C# script which prompts the developer for database credentials (username, password, database) in order to compile the forms and reports. I want to then run these credentials against the relevant database to ensure the credentials entered are correct and, if they are not, prompt the user to re-enter their credentials. My script currently looks as follows: class GetInput { public static void ScriptMain(Project project) { Console.Clear(); Console.WriteLine("==================================================================="); Console.WriteLine("Welcome to the Compile and Deploy Oracle Forms and Reports Facility"); Console.WriteLine("==================================================================="); Console.WriteLine(); Console.WriteLine("Please enter the acronym of the project to work on from the following list:"); Console.WriteLine(); Console.WriteLine("--------"); Console.WriteLine("- BCS"); Console.WriteLine("- COPEN"); Console.WriteLine("- FCDD"); Console.WriteLine("--------"); Console.WriteLine(); Console.Write("Selection: "); project.Properties["project.type"] = Console.ReadLine(); Console.WriteLine(); Console.Write("Please enter username: "); string username = Console.ReadLine(); project.Properties["username"] = username; string password = ReturnPassword(); project.Properties["password"] = password Console.WriteLine(); Console.Write("Please enter database: "); string database = Console.ReadLine(); project.Properties["database"] = database Console.WriteLine(); //Call method to verify user credentials Console.WriteLine(); Console.WriteLine("Compiling files..."; } public static string ReturnPassword() { Console.Write("Please enter password: "); string password = ""; ConsoleKeyInfo nextKey = Console.ReadKey(true); while (nextKey.Key != ConsoleKey.Enter) { if (nextKey.Key == ConsoleKey.Backspace) { if (password.Length > 0) { password = password.Substring(0, password.Length - 1); Console.Write(nextKey.KeyChar); Console.Write(" "); Console.Write(nextKey.KeyChar); } } else { password += nextKey.KeyChar; Console.Write("*"); } nextKey = Console.ReadKey(true); } return password; } } Having done a bit of research, I find that you can connect to Oracle databases using the System.Data.OracleClient namespace clicky. However, as mentioned in the link, Microsoft is discontinuing support for this so it is not a desireable solution. I have also fonud that Oracle provides its own classes for connecting to Oracle databases clicky. However, this only seems to support connecting to Oracle 9 or newer databases (clicky) so it is not feasible solution as I also need to connect to Oracle 6i databases. I could achieve this by calling a bat script from within the C# script, but I would much prefer to have a single build file for simplicity. Ideally, I would like to run a series of commands such as is contained in the following .bat script: rem -- Set Database SID -- set ORACLE_SID=%DBSID% sqlplus -s %nameofuser%/%password%@%dbsid% set cmdsep on set cmdsep '"'; --" set term on set echo off set heading off select '========================================' || CHR(10) || 'Have checked and found both Password and ' || chr(10) || 'Database Identifier are valid, continuing ...' || CHR(10) || '========================================' from dual; exit; This requires me to set the environment variable of ORACLE_SID and then run sqlplus in silent mode (-s) followed by a series of sql set commands (set x), the actual select statement and an exit command. Can I achieve this within a c# script without calling a bat script, or am I forced to call a bat script? Thanks in advance!

Read the article
Data Profiling without SSIS

Strangely enough for a predominantly SSIS blog, this post is all about how to perform data profiling without using SSIS. Whilst the Data Profiling Task is a worthy addition, there are a couple of limitations I’ve encountered of late. The first is that it requires SQL Server 2008, and not everyone is there yet. The second is that it can only target SQL Server 2005 and above. What about older systems, which are the ones that we probably need to investigate the most, or other vendor databases such as Oracle? With these limitations in mind I did some searching to find a quick and easy alternative to help me perform some data profiling for a project I was working on recently. I only had SQL Server 2005 available, and anyway most of my target source systems were Oracle, and of course I had short timescales. I looked at several options. Some never got beyond the download stage, they failed to install or just did not run, and others provided less than I could have produced myself by spending 2 minutes writing some basic SQL queries. In the end I settled on an open source product called DataCleaner. To quote from their website: DataCleaner is an Open Source application for profiling, validating and comparing data. These activities help you administer and monitor your data quality in order to ensure that your data is useful and applicable to your business situation. DataCleaner is the free alternative to software for master data management (MDM) methodologies, data warehousing (DW) projects, statistical research, preparation for extract-transform-load (ETL) activities and more. DataCleaner is developed in Java and licensed under LGPL. As quoted above it claims to support profiling, validating and comparing data, but I didn’t really get past the profiling functions, so won’t comment on the other two. The profiling whilst not prefect certainly saved some time compared to the limited alternatives. The ability to profile heterogeneous data sources is a big advantage over the SSIS option, and I found it overall quite easy to use and performance was good. I could see it struggling at times, but actually for what it does I was impressed. It had some data type niggles with Oracle, and some metrics seem a little strange, although thankfully they were easy to augment with some SQL queries to ensure a consistent picture. The report export options didn’t do it for me, but copy and paste with a bit of Excel magic was sufficient. One initial point for me personally is that I have had limited exposure to things of the Java persuasion and whilst I normally get by fine, sometimes the simplest things can throw me. For example installing a JDBC driver, why do I have to copy files to make it all work, has nobody ever heard of an MSI? In case there are other people out there like me who have become totally indoctrinated with the Microsoft software paradigm, I’ve written a quick start guide that details every step required. Steps 1- 5 are the key ones, the rest is really an excuse for some screenshots to show you the tool. Quick Start Guide Step 1 - Download Data Cleaner. The Microsoft Windows zipped exe option, and I chose the latest stable build, currently DataCleaner 1.5.3 (final). Extract the files to a suitable location. Step 2 - Download Java. If you try and run datacleaner.exe without Java it will warn you, and then open your default browser and take you to the Java download site. Follow the installation instructions from there, normally just click Download Java a couple of times and you’re done. Step 3 - Download Microsoft SQL Server JDBC Driver. You may have SQL Server installed, but you won’t have a JDBC driver. Version 3.0 is the latest as of April 2010. There is no real installer, we are in the Java world here, but run the exe you downloaded to extract the files. The default Unzip to folder is not much help, so try a fully qualified path such as C:\Program Files\Microsoft SQL Server JDBC Driver 3.0\ to ensure you can find the files afterwards. Step 4 - If you wish to use Windows Authentication to connect to your SQL Server then first we need to copy a file so that Data Cleaner can find it. Browse to the JDBC extract location from Step 3 and drill down to the file sqljdbc_auth.dll. You will have to choose the correct directory for your processor architecture. e.g. C:\Program Files\Microsoft SQL Server JDBC Driver 3.0\sqljdbc_3.0\enu\auth\x86\sqljdbc_auth.dll. Now copy this file to the Data Cleaner extract folder you chose in Step 1. An alternative method is to edit datacleaner.cmd in the data cleaner extract folder as detailed in this data cleaner wiki topic, but I find copying the file simpler. Step 5 – Now lets run Data Cleaner, just run datacleaner.exe from the extract folder you chose in Step 1. Step 6 – Complete or skip the registration screen, and ignore the task window for now. In the main window click settings. Step 7 – In the Settings dialog, select the Database drivers tab, then click Register database driver and select the Local JAR file option. Step 8 – Browse to the JDBC driver extract location from Step 3 and drill down to select sqljdbc4.jar. e.g. C:\Program Files\Microsoft SQL Server JDBC Driver 3.0\sqljdbc_3.0\enu\sqljdbc4.jar Step 9 – Select the Database driver class as com.microsoft.sqlserver.jdbc.SQLServerDriver, and then click the Test and Save database driver button. Step 10 - You should be back at the Settings dialog with a the list of drivers that includes SQL Server. Just click Save Settings to persist all your hard work. Step 11 – Now we can start to profile some data. In the main Data Cleaner window click New Task, and then Profile from the task window. Step 12 – In the Profile window click Open Database Step 13 – Now choose the SQL Server connection string option. Selecting a connection string gives us a template like jdbc:sqlserver://<hostname>:1433;databaseName=<database>, but obviously it requires some details to be entered for example jdbc:sqlserver://localhost:1433;databaseName=SQLBits. This will connect to the database called SQLBits on my local machine. The port may also have to be changed if using such as when you have a multiple instances of SQL Server running. If using SQL Server Authentication enter a username and password as required and then click Connect to database. You can use Window Authentication, just add integratedSecurity=true to the end of your connection string. e.g jdbc:sqlserver://localhost:1433;databaseName=SQLBits;integratedSecurity=true. If you didn’t complete Step 4 above you will need to do so now and restart Data Cleaner before it will work. Manually setting the connection string is fine, but creating a named connection makes more sense if you will be spending any length of time profiling a specific database. As highlighted in the left-hand screen-shot, at the bottom of the dialog it includes partial instructions on how to create named connections. In the folder shown C:\Users\<Username>\.datacleaner\1.5.3, open the datacleaner-config.xml file in your editor of choice add your own details. You’ll see a sample connection in the file already, just add yours following the same pattern. e.g.  <bean class="dk.eobjects.datacleaner.gui.model.NamedConnection"> <property name="name" value="SQLBits Local Connection" /> <property name="driverClass" value="com.microsoft.sqlserver.jdbc.SQLServerDriver" /> <property name="connectionString" value="jdbc:sqlserver://localhost:1433;databaseName=SQLBits;integratedSecurity=true" /> <property name="tableTypes"> <list> <value>TABLE</value> <value>VIEW</value> </list> </property> </bean> Step 14 – Once back at the Profile window, you should now see your schemas, tables and/or views listed down the left hand side. Browse this tree and double-click a table to select it for profiling. You can then click Add profile, and choose some profiling options, before finally clicking Run profiling. You can see below a sample output for three of the most common profiles, click the image for full size. I hope this has given you a taster for DataCleaner, and should help you get up and running pretty quickly.

Read the article
Enterprise 2.0 - Connecting People, Processes & Content

- by kellsey.ruppel(at)oracle.com

With recent technological advances, the Internet is changing. When users head to the web, they are no longer just looking for information from a simple text and picture based website. Users want a more interactive experience - they want to participate, to share their views and get the feedback of others. And this is precisely what Web 2.0 technology addresses. Web 2.0 is about web applications that facilitate interactive information sharing, user-centered design and collaboration on the World Wide Web. Web 2.0 technology is everywhere on the Internet and is radically changing the speed and medium in which we interact and communicate. There are thousands of examples in the consumer world of Web 2.0 applications, technologies and solutions at work. You might be familiar with some of them...blogs, wikis (Wikipedia), Twitter, Facebook, LinkedIn - these are all examples of Web 2.0. And these technologies are transforming our world into a real-time, participation-oriented, user-driven, content-centric world. With all of these Web 2.0 solutions it's about the user, the consumer and all the content they are generating. It's a world full of online communities where people share and participate. We're not talking about disseminating information top-down , nor is it a bottom-up fight. Everyone has an equal opportunity to participate and share. The more you participate, the more you share, the more valued you are in the community. The web is not just a collection of documents online. It is the social web. For the active users in the community, staying connected becomes critically important so they can participate at anytime and from anywhere. And because feedback and interaction are so critical, time is of the essence. When everyone is providing immediate responses, you feel the urge to do the same. Hence everything needs to be done right now, together...and collaboratively. With all the content being generated online by users, there is complete information overload out there. (That's a good thing for Google). But...it's no longer just about search. Sometimes you want the information to just come to you. Recommendations and discovery engines will deliver you more applicable results than a non-contextual search. How many of you have heard about a news headline on Facebook as part of your feed before you read the paper or see it on TV? This is how the new generation of workers live their daily lives...and as they enter the workforce, these trends and technologies are showing up in the enterprise too. A lot of the Web 2.0 technologies and solutions in the consumer world are geared for just that....consumers. But the core concepts that put them into the Web 2.0 category can be applied to the enterprise as well. And that is what we mean when we talk about Enterprise 2.0. Enterprise 2.0 is the use of Web 2.0 tools and technologies in the workplace. It provides a modern user experience by connecting the people, content and business processes inside and outside the enterprise. Enterprise 2.0 empowers users to collaborate more effectively, find and share information in the proper content and improves the overall business processes which they participate in. As we head into 2011, is your organization using Enterprise 2.0 capabilities to the fullest? Are you connecting your people, processes and content together to provide a modern user experience?

Read the article
New Big Data Appliance Security Features

- by mgubar

The Oracle Big Data Appliance (BDA) is an engineered system for big data processing. It greatly simplifies the deployment of an optimized Hadoop Cluster – whether that cluster is used for batch or real-time processing. The vast majority of BDA customers are integrating the appliance with their Oracle Databases and they have certain expectations – especially around security. Oracle Database customers have benefited from a rich set of security features: encryption, redaction, data masking, database firewall, label based access control – and much, much more. They want similar capabilities with their Hadoop cluster. Unfortunately, Hadoop wasn’t developed with security in mind. By default, a Hadoop cluster is insecure – the antithesis of an Oracle Database. Some critical security features have been implemented – but even those capabilities are arduous to setup and configure. Oracle believes that a key element of an optimized appliance is that its data should be secure. Therefore, by default the BDA delivers the “AAA of security”: authentication, authorization and auditing. Security Starts at Authentication A successful security strategy is predicated on strong authentication – for both users and software services. Consider the default configuration for a newly installed Oracle Database; it’s been a long time since you had a legitimate chance at accessing the database using the credentials “system/manager” or “scott/tiger”. The default Oracle Database policy is to lock accounts thereby restricting access; administrators must consciously grant access to users. Default Authentication in Hadoop By default, a Hadoop cluster fails the authentication test. For example, it is easy for a malicious user to masquerade as any other user on the system. Consider the following scenario that illustrates how a user can access any data on a Hadoop cluster by masquerading as a more privileged user. In our scenario, the Hadoop cluster contains sensitive salary information in the file /user/hrdata/salaries.txt. When logged in as the hr user, you can see the following files. Notice, we’re using the Hadoop command line utilities for accessing the data: $ hadoop fs -ls /user/hrdataFound 1 items-rw-r--r-- 1 oracle supergroup 70 2013-10-31 10:38 /user/hrdata/salaries.txt$ hadoop fs -cat /user/hrdata/salaries.txtTom Brady,11000000Tom Hanks,5000000Bob Smith,250000Oprah,300000000 User DrEvil has access to the cluster – and can see that there is an interesting folder called “hrdata”. $ hadoop fs -ls /user Found 1 items drwx------ - hr supergroup 0 2013-10-31 10:38 /user/hrdata However, DrEvil cannot view the contents of the folder due to lack of access privileges: $ hadoop fs -ls /user/hrdata ls: Permission denied: user=drevil, access=READ_EXECUTE, inode="/user/hrdata":oracle:supergroup:drwx------ Accessing this data will not be a problem for DrEvil. He knows that the hr user owns the data by looking at the folder’s ACLs. To overcome this challenge, he will simply masquerade as the hr user. On his local machine, he adds the hr user, assigns that user a password, and then accesses the data on the Hadoop cluster: $ sudo useradd hr $ sudo passwd $ su hr $ hadoop fs -cat /user/hrdata/salaries.txt Tom Brady,11000000 Tom Hanks,5000000 Bob Smith,250000 Oprah,300000000 Hadoop has not authenticated the user; it trusts that the identity that has been presented is indeed the hr user. Therefore, sensitive data has been easily compromised. Clearly, the default security policy is inappropriate and dangerous to many organizations storing critical data in HDFS. Big Data Appliance Provides Secure Authentication The BDA provides secure authentication to the Hadoop cluster by default – preventing the type of masquerading described above. It accomplishes this thru Kerberos integration. Figure 1: Kerberos Integration The Key Distribution Center (KDC) is a server that has two components: an authentication server and a ticket granting service. The authentication server validates the identity of the user and service. Once authenticated, a client must request a ticket from the ticket granting service – allowing it to access the BDA’s NameNode, JobTracker, etc. At installation, you simply point the BDA to an external KDC or automatically install a highly available KDC on the BDA itself. Kerberos will then provide strong authentication for not just the end user – but also for important Hadoop services running on the appliance. You can now guarantee that users are who they claim to be – and rogue services (like fake data nodes) are not added to the system. It is common for organizations to want to leverage existing LDAP servers for common user and group management. Kerberos integrates with LDAP servers – allowing the principals and encryption keys to be stored in the common repository. This simplifies the deployment and administration of the secure environment. Authorize Access to Sensitive Data Kerberos-based authentication ensures secure access to the system and the establishment of a trusted identity – a prerequisite for any authorization scheme. Once this identity is established, you need to authorize access to the data. HDFS will authorize access to files using ACLs with the authorization specification applied using classic Linux-style commands like chmod and chown (e.g. hadoop fs -chown oracle:oracle /user/hrdata changes the ownership of the /user/hrdata folder to oracle). Authorization is applied at the user or group level – utilizing group membership found in the Linux environment (i.e. /etc/group) or in the LDAP server. For SQL-based data stores – like Hive and Impala – finer grained access control is required. Access to databases, tables, columns, etc. must be controlled. And, you want to leverage roles to facilitate administration. Apache Sentry is a new project that delivers fine grained access control; both Cloudera and Oracle are the project’s founding members. Sentry satisfies the following three authorization requirements: Secure Authorization: the ability to control access to data and/or privileges on data for authenticated users. Fine-Grained Authorization: the ability to give users access to a subset of the data (e.g. column) in a database Role-Based Authorization: the ability to create/apply template-based privileges based on functional roles. With Sentry, “all”, “select” or “insert” privileges are granted to an object. The descendants of that object automatically inherit that privilege. A collection of privileges across many objects may be aggregated into a role – and users/groups are then assigned that role. This leads to simplified administration of security across the system. Figure 2: Object Hierarchy – granting a privilege on the database object will be inherited by its tables and views. Sentry is currently used by both Hive and Impala – but it is a framework that other data sources can leverage when offering fine-grained authorization. For example, one can expect Sentry to deliver authorization capabilities to Cloudera Search in the near future. Audit Hadoop Cluster Activity Auditing is a critical component to a secure system and is oftentimes required for SOX, PCI and other regulations. The BDA integrates with Oracle Audit Vault and Database Firewall – tracking different types of activity taking place on the cluster: Figure 3: Monitored Hadoop services. At the lowest level, every operation that accesses data in HDFS is captured. The HDFS audit log identifies the user who accessed the file, the time that file was accessed, the type of access (read, write, delete, list, etc.) and whether or not that file access was successful. The other auditing features include: MapReduce: correlate the MapReduce job that accessed the file Oozie: describes who ran what as part of a workflow Hive: captures changes were made to the Hive metadata The audit data is captured in the Audit Vault Server – which integrates audit activity from a variety of sources, adding databases (Oracle, DB2, SQL Server) and operating systems to activity from the BDA. Figure 4: Consolidated audit data across the enterprise. Once the data is in the Audit Vault server, you can leverage a rich set of prebuilt and custom reports to monitor all the activity in the enterprise. In addition, alerts may be defined to trigger violations of audit policies. Conclusion Security cannot be considered an afterthought in big data deployments. Across most organizations, Hadoop is managing sensitive data that must be protected; it is not simply crunching publicly available information used for search applications. The BDA provides a strong security foundation – ensuring users are only allowed to view authorized data and that data access is audited in a consolidated framework.

Read the article
quick look at: dm_db_index_physical_stats

- by fatherjack

A quick look at the key data from this dmv that can help a DBA keep databases performing well and systems online as the users need them. When the dynamic management views relating to index statistics became available in SQL Server 2005 there was much hype about how they can help a DBA keep their servers running in better health than ever before. This particular view gives an insight into the physical health of the indexes present in a database. Whether they are use or unused, complete or missing some columns is irrelevant, this is simply the physical stats of all indexes; disabled indexes are ignored however. In it’s simplest form this dmv can be executed as: The results from executing this contain a record for every index in every database but some of the columns will be NULL. The first parameter is there so that you can specify which database you want to gather index details on, rather than scan every database. Simply specifying DB_ID() in place of the first NULL achieves this. In order to avoid the NULLS, or more accurately, in order to choose when to have the NULLS you need to specify a value for the last parameter. It takes one of 4 values – DEFAULT, ‘SAMPLED’, ‘LIMITED’ or ‘DETAILED’. If you execute the dmv with each of these values you can see some interesting details in the times taken to complete each step. DECLARE @Start DATETIME DECLARE @First DATETIME DECLARE @Second DATETIME DECLARE @Third DATETIME DECLARE @Finish DATETIME SET @Start = GETDATE() SELECT * FROM [sys].[dm_db_index_physical_stats](DB_ID(), NULL, NULL, NULL, DEFAULT) AS ddips SET @First = GETDATE() SELECT * FROM [sys].[dm_db_index_physical_stats](DB_ID(), NULL, NULL, NULL, 'SAMPLED') AS ddips SET @Second = GETDATE() SELECT * FROM [sys].[dm_db_index_physical_stats](DB_ID(), NULL, NULL, NULL, 'LIMITED') AS ddips SET @Third = GETDATE() SELECT * FROM [sys].[dm_db_index_physical_stats](DB_ID(), NULL, NULL, NULL, 'DETAILED') AS ddips SET @Finish = GETDATE() SELECT DATEDIFF(ms, @Start, @First) AS [DEFAULT] , DATEDIFF(ms, @First, @Second) AS [SAMPLED] , DATEDIFF(ms, @Second, @Third) AS [LIMITED] , DATEDIFF(ms, @Third, @Finish) AS [DETAILED] Running this code will give you 4 result sets; DEFAULT will have 12 columns full of data and then NULLS in the remainder. SAMPLED will have 21 columns full of data. LIMITED will have 12 columns of data and the NULLS in the remainder. DETAILED will have 21 columns full of data. So, from this we can deduce that the DEFAULT value (the same one that is also applied when you query the view using a NULL parameter) is the same as using LIMITED. Viewing the final result set has some details that are worth noting: Running queries against this view takes significantly longer when using the SAMPLED and DETAILED values in the last parameter. The duration of the query is directly related to the size of the database you are working in so be careful running this on big databases unless you have tried it on a test server first. Let’s look at the data we get back with the DEFAULT value first of all and then progress to the extra information later. We know that the first parameter that we supply has to be a database id and for the purposes of this blog we will be providing that value with the DB_ID function. We could just as easily put a fixed value in there or a function such as DB_ID (‘AnyDatabaseName’). The first columns we get back are database_id and object_id. These are pretty explanatory and we can wrap those in some code to make things a little easier to read: SELECT DB_NAME([ddips].[database_id]) AS [DatabaseName] , OBJECT_NAME([ddips].[object_id]) AS [TableName] … FROM [sys].[dm_db_index_physical_stats](DB_ID(), NULL, NULL, NULL, NULL) AS ddips gives us SELECT DB_NAME([ddips].[database_id]) AS [DatabaseName] , OBJECT_NAME([ddips].[object_id]) AS [TableName], [i].[name] AS [IndexName] , ….. FROM [sys].[dm_db_index_physical_stats](DB_ID(), NULL, NULL, NULL, NULL) AS ddips INNER JOIN [sys].[indexes] AS i ON [ddips].[index_id] = [i].[index_id] AND [ddips].[object_id] = [i].[object_id] These handily tie in with the next parameters in the query on the dmv. If you specify an object_id and an index_id in these then you get results limited to either the table or the specific index. Once again we can place a function in here to make it easier to work with a specific table. eg. SELECT * FROM [sys].[dm_db_index_physical_stats] (DB_ID(), OBJECT_ID(‘AdventureWorks2008.Person.Address’) , 1, NULL, NULL) AS ddips Note: Despite me showing that functions can be placed directly in the parameters for this dmv, best practice recommends that functions are not used directly in the function as it is possible that they will fail to return a valid object ID. To be certain of not passing invalid values to this function, and therefore setting an automated process off on the wrong path, declare variables for the OBJECT_IDs and once they have been validated, use them in the function: DECLARE @db_id SMALLINT; DECLARE @object_id INT; SET @db_id = DB_ID(N’AdventureWorks_2008′); SET @object_id = OBJECT_ID(N’AdventureWorks_2008.Person.Address’); IF @db_id IS NULL BEGINPRINT N’Invalid database’; ENDELSE IF @object_id IS NULL BEGINPRINT N’Invalid object’; ENDELSE BEGINSELECT * FROM sys.dm_db_index_physical_stats (@db_id, @object_id, NULL, NULL , ‘LIMITED’); END; GO In cases where the results of querying this dmv don’t have any effect on other processes (i.e. simply viewing the results in the SSMS results area) then it will be noticed when the results are not consistent with the expected results and in the case of this blog this is the method I have used. So, now we can relate the values in these columns to something that we recognise in the database lets see what those other values in the dmv are all about. The next columns are: We’ll skip partition_number, index_type_desc, alloc_unit_type_desc, index_depth and index_level as this is a quick look at the dmv and they are pretty self explanatory. The final columns revealed by querying this view in the DEFAULT mode are avg_fragmentation_in_percent. This is the amount that the index is logically fragmented. It will show NULL when the dmv is queried in SAMPLED mode. fragment_count. The number of pieces that the index is broken into. It will show NULL when the dmv is queried in SAMPLED mode. avg_fragment_size_in_pages. The average size, in pages, of a single fragment in the leaf level of the IN_ROW_DATA allocation unit. It will show NULL when the dmv is queried in SAMPLED mode. page_count. Total number of index or data pages in use. OK, so what does this give us? Well, there is an obvious correlation between fragment_count, page_count and avg_fragment_size-in_pages. We see that an index that takes up 27 pages and is in 3 fragments has an average fragment size of 9 pages (27/3=9). This means that for this index there are 3 separate places on the hard disk that SQL Server needs to locate and access to gather the data when it is requested by a DML query. If this index was bigger than 72KB then having it’s data in 3 pieces might not be too big an issue as each piece would have a significant piece of data to read and the speed of access would not be too poor. If the number of fragments increases then obviously the amount of data in each piece decreases and that means the amount of work for the disks to do in order to retrieve the data to satisfy the query increases and this would start to decrease performance. This information can be useful to keep in mind when considering the value in the avg_fragmentation_in_percent column. This is arrived at by an internal algorithm that gives a value to the logical fragmentation of the index taking into account the multiple files, type of allocation unit and the previously mentioned characteristics if index size (page_count) and fragment_count. Seeing an index with a high avg_fragmentation_in_percent value will be a call to action for a DBA that is investigating performance issues. It is possible that tables will have indexes that suffer from rapid increases in fragmentation as part of normal daily business and that regular defragmentation work will be needed to keep it in good order. In other cases indexes will rarely become fragmented and therefore not need rebuilding from one end of the year to another. Keeping this in mind DBAs need to use an ‘intelligent’ process that assesses key characteristics of an index and decides on the best, if any, defragmentation method to apply should be used. There is a simple example of this in the sample code found in the Books OnLine content for this dmv, in example D. There are also a couple of very popular solutions created by SQL Server MVPs Michelle Ufford and Ola Hallengren which I would wholly recommend that you review for much further detail on how to care for your SQL Server indexes. Right, let’s get back on track then. Querying the dmv with the fifth parameter value as ‘DETAILED’ takes longer because it goes through the index and refreshes all data from every level of the index. As this blog is only a quick look a we are going to skate right past ghost_record_count and version_ghost_record_count and discuss avg_page_space_used_in_percent, record_count, min_record_size_in_bytes, max_record_size_in_bytes and avg_record_size_in_bytes. We can see from the details below that there is a correlation between the columns marked. Column 1 (Page_Count) is the number of 8KB pages used by the index, column 2 is how full each page is (how much of the 8KB has actual data written on it), column 3 is how many records are recorded in the index and column 4 is the average size of each record. This approximates to: ((Col1*8) * 1024*(Col2/100))/Col3 = Col4*. avg_page_space_used_in_percent is an important column to review as this indicates how much of the disk that has been given over to the storage of the index actually has data on it. This value is affected by the value given for the FILL_FACTOR parameter when creating an index. avg_record_size_in_bytes is important as you can use it to get an idea of how many records are in each page and therefore in each fragment, thus reinforcing how important it is to keep fragmentation under control. min_record_size_in_bytes and max_record_size_in_bytes are exactly as their names set them out to be. A detail of the smallest and largest records in the index. Purely offered as a guide to the DBA to better understand the storage practices taking place. So, keeping an eye on avg_fragmentation_in_percent will ensure that your indexes are helping data access processes take place as efficiently as possible. Where fragmentation recurs frequently then potentially the DBA should consider; the fill_factor of the index in order to leave space at the leaf level so that new records can be inserted without causing fragmentation so rapidly. the columns used in the index should be analysed to avoid new records needing to be inserted in the middle of the index but rather always be added to the end. * – it’s approximate as there are many factors associated with things like the type of data and other database settings that affect this slightly. Another great resource for working with SQL Server DMVs is Performance Tuning with SQL Server Dynamic Management Views by Louis Davidson and Tim Ford – a free ebook or paperback from Simple Talk. Disclaimer – Jonathan is a Friend of Red Gate and as such, whenever they are discussed, will have a generally positive disposition towards Red Gate tools. Other tools are often available and you should always try others before you come back and buy the Red Gate ones. All code in this blog is provided “as is” and no guarantee, warranty or accuracy is applicable or inferred, run the code on a test server and be sure to understand it before you run it on a server that means a lot to you or your manager.

Read the article
Organizational characteristics that impact the selection of Development Methodology concepts applied to a project

Based on my experience, no one really follows a specific methodology exactly as it is formally designed. In fact, the key concepts of a few methodologies are usually combined to form a hybrid methodology for each project based on the current organizational makeup and the project need/requirements to be accomplished. Organizational characteristics that impact the selection of methodology concepts applied to a project. Prior subject knowledge pertaining to a project can be critical when deciding on what methodology or combination of methodologies to apply to a project. For example, if a project is very straight forward, and the development staff has experience in developing that are similar, then the waterfall method could possibly be the best choice because little to no research is needed in order to complete the project tasks and there is very little need for changes to occur. On the other hand, if the development staff has limited subject knowledge or the requirements/specification of the project could possibly change as the project progresses then the use of spiral, iterative, incremental, agile, or any combination would be preferred. The previous methodologies used by an organization typically do not change much from project to project unless the needs of a project dictate differently. For example, if the waterfall method is the preferred development methodology then most projects will be developed by the waterfall method. Depending on the time allotted to a project each day can impact the selection of a development methodology. In one example, if the staff can only devote a few hours a day to a project then the incremental methodology might be ideal because modules can be added to the final project as they are developed. On the other hand, if daily time allocation is not an issue, then a multitude of methodologies could work well for a project. Project characteristics that impact the selection of methodology concepts applied to a project. The type of project being developed can often dictate the type of methodology used for the project. Based on my experience, projects that tend to have a lot of user interaction, follow a more iterative, incremental, or agile approach typically using a prototype that develops into a final project. These methodologies desire back and forth communication between users, clients, and developers to allow for requirements to change and functionality to be enhanced. Conversely, limited interaction applications or automated services can still sometimes get away with using the waterfall or transactional approach. The timeline of a project can also force an organization to prefer a particular methodology over the rest. For instance, if the project must be completed within 24 hours, then there is very little time for discussions back and forth between clients, users and the development team. In this scenario, the waterfall method would be perfect because the only interaction with the client occurs prior to a development project to outline the system requirements, and the development team can quickly move through the software development stages in order to complete the project within the deadline. If the team had more time, then the other methodologies could also be considered because there is more time for client and users to review the project and make changes as they see fit, and/or allow for more time to review the project in order to enhance the business performance and functionality. Sometimes the client and or user involvement can dictate the selection of methodologies applied to a project. One example of this is if a client is highly motivated to get a project completed and desires to play an active part in the development process then the agile development approach would work perfectly with this client because it allows for frequent interaction between clients, users and the development team. The inverse of this situation is a client that just wants to provide the project requirements and only wants to get involved when the project is to be delivered. In this case the waterfall method would work well because there is no room for changes and no back and forth between the users, clients or the development team.

Read the article
Bad Spot to Be In: Playing Catch-up with Mobile Advertising

- by Mike Stiles

You probably noticed, there’s a mass migration going on from online desktop/laptop usage to smartphone/tablet usage. It’s an indicator of how we live our lives in the modern world: always on the go, with no intention of being disconnected while out there. Consequently, paid as it relates to mobile advertising is taking the social spotlight. eMarketer estimated that in 2013, US adults would spend about 2 hours, 21 minutes a day on mobile, not counting talking time. More people in the world own smartphones than own toothbrushes (bad news I suppose if you’re marketing toothpaste). They’re using those mobile devices to access social networks, consuming at least 17% of their mobile time on them. Frankly, you don’t need a deep dive into mobile usage stats to know what’s going on. Just look around you in any store, venue or coffee shop. It’s really obvious…our mobile devices are now where we “are,” so that’s where marketers can increasingly reach us. And it’s a smart place for them to do just that. Mobile devices can be viewed more and more as shopping facilitators. Usually when someone is on mobile, they are not in passive research mode. They are likely standing near a store or in front of a product, using their mobile to seek reassurance that buying that product is the right move. They are the hottest of hot prospects. Consider that 4 out of 5 consumers use smartphones to shop, 52% of Americans use mobile devices for in-store for research, 70% of mobile searches lead to online action inside of an hour, and people that find you on mobile convert at almost 3x the rate as those that find you on desktop or laptop. But what are marketers doing? Enter statistics from Mary Meeker’s latest State of the Internet report. Common sense says you buy advertising where people are spending their eyeball time, right? But while mobile is 20% of media use and rising, the ad spend there is 4%. Conversely, while print usage is at 5% and falling, ad spend there is 19%. We all love nostalgia, but come on. There are reasons marketing dollar migration to mobile has not matched user migration, including the availability of mobile ad products and the ability to measure user response to mobile ads. But interesting things are happening now. First came Facebook’s mobile ad, which let app developers pay to get potential downloads. Then their mobile ad network was announced at F8, allowing marketers to target users across non-Facebook apps while leveraging the wealth of diverse data Facebook has on those users, a big deal since Nielsen has pointed out mobile apps make up 89% of the media time spent on mobile. Twitter has a similar play in motion with their MoPub acquisition. And now mobile deeplinks have arrived, which can take users straight to sub-pages of mobile apps for a faster, more direct shopper/researcher user experience. The sooner the gratification, the smoother and faster the conversion. To be clear, growth in mobile ad spending is well underway. After posting $13.1 billion in 2013, Gartner expects global mobile ad spending to reach $18 billion this year, then go to $41.9 billion by 2017. Cheap smartphones and data plans are spreading worldwide, further fueling the shift to mobile. Mobile usage in India alone should grow 400% by 2018. And, of course, there’s the famous statistic that mobile should overtake desktop Internet usage this year. How can we as marketers mess up this opportunity? Two ways. We could position ourselves in perpetual “catch-up” mode and keep spending ad dollars where the public used to be. And we could annoy mobile users with horrid old-school marketing practices. Two-thirds of users told Forrester they think interruptive in-app ads are more annoying than TV ads. Make sure your brand’s social marketing technology platform is delivering a crystal clear picture of your social connections so the mobile touch point is highly relevant, mobile optimized, and delivering real value and satisfying experiences. Otherwise, all we’ve done is find a new way to be unwanted. @mikestiles @oraclesocialPhoto: Kate Mallatratt, freeimages.com

Read the article
New Management Console in Java SE Advanced 8u20

- by Erik Costlow-Oracle

Java SE 8 update 20 is a new feature release designed to provide desktop administrators with better control of their managed systems. The release notes for 8u20 are available from the public JDK release notes page. This release is not a Critical Patch Update (CPU). I would like to call attention to two noteworthy features of Oracle Java SE Advanced, the commercially supported version of Java SE for enterprises that require both support and specialized tools. The new Advanced Management Console provides a way to monitor and understand client systems at scale. It allows organizations to track usage and more easily create and manage client configuration like Deployment Rule Sets (DRS). DRS can control execution of tracked applications as well as specify compatibility of which application should use which Java SE installation. The new MSI Installer integrates into various desktop management tools, making it easier to customize and roll out different Java SE versions. Advanced Management Console The Advanced Management Console is part of Java SE Advanced designed for desktop administrators, whose users need to run many different Java applications. It provides usage tracking for those Applet & Web Start applications to help identify them for guided DRS creation. DRS can then be verified against the tracked data, to ensure that end-users can run their application against the appropriate Java version with no prompts. Usage tracking also has a different definition for Java SE than it does for most software applications. Unlike most applications where usage can be determined by a simple run-count, Java is a platform used for launching other applications. This means that usage tracking must answer both "how often is this Java SE version used" and "what applications are launched by it." Usage Tracking One piece of Java SE Advanced is a centralized usage tracker. Simply placing a properties file on the client informs systems to report information to this usage tracker, so that the desktop administrator can better understand usage. Information is sent via UDP to prevent any delay on the client. The usage tracking server resides at a central location on the intranet to collect information from those clients. The information is stored in a normalized database for performance, meaning that a single usage tracker can handle a large number of clients. Guided Deployment Rule Sets Deployment Rule Sets were introduced in Java 7 update 40 (September 2013) in order to help administrators control security prompts and guide compatibility. A previous post, Deployment Rule Sets by Example, explains how to configure a rule set so that most applications run against the most secure version but a specific applet may run against the Java version that was current several years ago. There are a different set of questions that can be asked by a desktop administrator in a large or distributed firm: Where are the Java RIAs that our users need? Which RIA needs which Java version? Which users need which Java versions? How do I verify these answers once I have them? The guided deployment rule set creation uses usage tracker data to identify applications both by certificate hash and location. After creating the rules, a comparison tool exists to verify them against the tracked data: If you intend to run an RIA, is it green? If something specific should be blocked, is it red? This makes user-testing easier. MSI Installer The Windows Installer format (MSI) provides a number of benefits for desktop administrators that customize or manage software at scale. Unlike the basic installer that most users obtain from Java.com or OTN, this installer is built around customization and integration with various desktop management products like SCCM. Desktop administrators using the MSI installer can use every feature provided by the format, such as silent installs/upgrades, low-privileged installations, or self-repair capabilities Customers looking for Java SE Advanced can download the MSI installer through their My Oracle Support (MOS) account. Java SE Advanced The new features in Java SE Advanced make it easier for desktop administrators to identify and control client installations at scale. Administrators at organizations that want either the tools or associated commercial support should consider Java SE Advanced.

Read the article
Windows Azure Recipe: Social Web / Big Media

- by Clint Edmonson

With the rise of social media there’s been an explosion of special interest media web sites on the web. From athletics to board games to funny animal behaviors, you can bet there’s a group of people somewhere on the web talking about it. Social media sites allow us to interact, share experiences, and bond with like minded enthusiasts around the globe. And through the power of software, we can follow trends in these unique domains in real time. Drivers Reach Scalability Media hosting Global distribution Solution Here’s a sketch of how a social media application might be built out on Windows Azure: Ingredients Traffic Manager (optional) – can be used to provide hosting and load balancing across different instances and/or data centers. Perfect if the solution needs to be delivered to different cultures or regions around the world. Access Control – this service is essential to managing user identity. It’s backed by a full blown implementation of Active Directory and allows the definition and management of users, groups, and roles. A pre-built ASP.NET membership provider is included in the training kit to leverage this capability but it’s also flexible enough to be combined with external Identity providers including Windows LiveID, Google, Yahoo!, and Facebook. The provider model has extensibility points to hook into other identity providers as well. Web Role – hosts the core of the web application and presents a central social hub users. Database – used to store core operational, functional, and workflow data for the solution’s web services. Caching (optional) – as a web site traffic grows caching can be leveraged to keep frequently used read-only, user specific, and application resource data in a high-speed distributed in-memory for faster response times and ultimately higher scalability without spinning up more web and worker roles. It includes a token based security model that works alongside the Access Control service. Tables (optional) – for semi-structured data streams that don’t need relational integrity such as conversations, comments, or activity streams, tables provide a faster and more flexible way to store this kind of historical data. Blobs (optional) – users may be creating or uploading large volumes of heterogeneous data such as documents or rich media. Blob storage provides a scalable, resilient way to store terabytes of user data. The storage facilities can also integrate with the Access Control service to ensure users’ data is delivered securely. Content Delivery Network (CDN) (optional) – for sites that service users around the globe, the CDN is an extension to blob storage that, when enabled, will automatically cache frequently accessed blobs and static site content at edge data centers around the world. The data can be delivered statically or streamed in the case of rich media content. Training These links point to online Windows Azure training labs and resources where you can learn more about the individual ingredients described above. (Note: The entire Windows Azure Training Kit can also be downloaded for offline use.) Windows Azure (16 labs) Windows Azure is an internet-scale cloud computing and services platform hosted in Microsoft data centers, which provides an operating system and a set of developer services which can be used individually or together. It gives developers the choice to build web applications; applications running on connected devices, PCs, or servers; or hybrid solutions offering the best of both worlds. New or enhanced applications can be built using existing skills with the Visual Studio development environment and the .NET Framework. With its standards-based and interoperable approach, the services platform supports multiple internet protocols, including HTTP, REST, SOAP, and plain XML SQL Azure (7 labs) Microsoft SQL Azure delivers on the Microsoft Data Platform vision of extending the SQL Server capabilities to the cloud as web-based services, enabling you to store structured, semi-structured, and unstructured data. Windows Azure Services (9 labs) As applications collaborate across organizational boundaries, ensuring secure transactions across disparate security domains is crucial but difficult to implement. Windows Azure Services provides hosted authentication and access control using powerful, secure, standards-based infrastructure. See my Windows Azure Resource Guide for more guidance on how to get started, including links web portals, training kits, samples, and blogs related to Windows Azure.

Read the article
Introducing the First Global Web Experience Management Content Management System

- by kellsey.ruppel

By Calvin Scharffs, VP of Marketing and Product Development, Lingotek Globalizing online content is more important than ever. The total spending power of online consumers around the world is nearly $50 trillion, a recent Common Sense Advisory report found. Three years ago, enterprises would have to translate content into 37 language to reach 98 percent of Internet users. This year, it takes 48 languages to reach the same amount of users. For companies seeking to increase global market share, “translate frequently and fast” is the name of the game. Today’s content is dynamic and ever-changing, covering the gamut from social media sites to company forums to press releases. With high-quality translation and localization, enterprises can tailor content to consumers around the world. Speed and Efficiency in Translation When it comes to the “frequently and fast” part of the equation, enterprises run into problems. Professional service providers provide translated content in files, which company workers then have to manually insert into their CMS. When companies update or edit source documents, they have to hunt down all the translated content and change each document individually. Lingotek and Oracle have solved the problem by making the Lingotek Collaborative Translation Platform fully integrated and interoperable with Oracle WebCenter Sites Web Experience Management. Lingotek combines best-in-class machine translation solutions, real-time community/crowd translation and professional translation to enable companies to publish globalized content in an efficient and cost-effective manner. WebCenter Sites Web Experience Management simplifies the creation and management of different types of content across multiple channels, including social media. Globalization Without Interrupting the Workflow The combination of the Lingotek platform with WebCenter Sites ensures that process of authoring, publishing, targeting, optimizing and personalizing global Web content is automated, saving companies the time and effort of manually entering content. Users can seamlessly integrate translation into their WebCenter Sites workflows, optimizing their translation and localization across web, social and mobile channels in multiple languages. The original structure and formatting of all translated content is maintained, saving workers the time and effort involved with inserting the text translation and reformatting. In addition, Lingotek’s continuous publication model addresses the dynamic nature of content, automatically updating the status of translated documents within the WebCenter Sites Workflow whenever users edit or update source documents. This enables users to sync translations in real time. The translation, localization, updating and publishing of Web Experience Management content happens in a single, uninterrupted workflow. The net result of Lingotek Inside for Oracle WebCenter Sites Web Experience Management is a system that more than meets the need for frequent and fast global translation. Workflows are accelerated. The globalization of content becomes faster and more streamlined. Enterprises save time, cost and effort in translation project management, and can address the needs of each of their global markets in a timely and cost-effective manner. About Lingotek Lingotek is an Oracle Gold Partner and is going to be one of the first Oracle Validated Integrator (OVI) partners with WebCenter Sites. Lingotek is also an OVI partner with Oracle WebCenter Content. Watch a video about how Lingotek Inside for Oracle WebCenter Sites works! Oracle WebCenter will be hosting a webinar, “Hitachi Data Systems Improves Global Web Experiences with Oracle WebCenter," tomorrow, September 13th. To attend the webinar, please register now! For more information about Lingotek for Oracle WebCenter, please visit http://www.lingotek.com/oracle.

Read the article
SQL SERVER – Weekly Series – Memory Lane – #007

- by pinaldave

Here is the list of selected articles of SQLAuthority.com across all these years. Instead of just listing all the articles I have selected a few of my most favorite articles and have listed them here with additional notes below it. Let me know which one of the following is your favorite article from memory lane. 2006 Find Stored Procedure Related to Table in Database – Search in All Stored Procedure In 2006 I wrote a small script which will help user find all the Stored Procedures (SP) which are related to one or more specific tables. This was quite a popular script however, in SQL Server 2012 the same can be achieved using new DMV sys.sql-expression_dependencies. I recently blogged about it over Find Referenced or Referencing Object in SQL Server using sys.sql_expression_dependencies. 2007 SQL SERVER – Versions, CodeNames, Year of Release 1993 – SQL Server 4.21 for Windows NT 1995 – SQL Server 6.0, codenamed SQL95 1996 – SQL Server 6.5, codenamed Hydra 1999 – SQL Server 7.0, codenamed Sphinx 1999 – SQL Server 7.0 OLAP, codenamed Plato 2000 – SQL Server 2000 32-bit, codenamed Shiloh (version 8.0) 2003 – SQL Server 2000 64-bit, codenamed Liberty 2005 – SQL Server 2005, codenamed Yukon (version 9.0) 2008 – SQL Server 2008, codenamed Katmai (version 10.0) 2011 – SQL Server 2008, codenamed Denali (version 11.0) Search String in Stored Procedure Searching sting in the stored procedure is one of the most frequent task developer do. They might be searching for a table, view or any other details. I have written a script to do the same in SQL Server 2000 and SQL Server 2005. This is worth bookmarking blog post. There is an alternative way to do the same as well here is the example. 2008 SQL SERVER – Refresh Database Using T-SQL NO! Some of the questions have a single answer NO! You may want to read the question in the original blog post. I had a great time saying No! SQL SERVER – Delete Backup History – Cleanup Backup History SQL Server stores history of all the taken backup forever. History of all the backup is stored in the msdb database. Many times older history is no more required. Following Stored Procedure can be executed with a parameter which takes days of history to keep. In the following example 30 is passed to keep a history of month. 2009 Stored Procedure are Compiled on First Run – SP taking Longer to Run First Time Is stored procedure pre-compiled? Why the Stored Procedure takes a long time to run for the first time? This is a very common questions often discussed by developers and DBAs. There is an absolutely definite answer but the question has been discussed forever. There is a misconception that stored procedures are pre-compiled. They are not pre-compiled, but compiled only during the first run. For every subsequent runs, it is for sure pre-compiled. Read the entire article for example and demonstration. Removing Key Lookup – Seek Predicate – Predicate – An Interesting Observation Related to Datatypes This is one of the most important performance tuning lesson on my blog. I suggest this weekend you spend time reading them and let me know what you think about the concepts which I have demonstrated in the four part series. Part 1 | Part 2 | Part 3 | Part 4 Seek Predicate is the operation that describes the b-tree portion of the Seek. Predicate is the operation that describes the additional filter using non-key columns. Based on the description, it is very clear that Seek Predicate is better than Predicate as it searches indexes whereas in Predicate, the search is on non-key columns – which implies that the search is on the data in page files itself. Policy Based Management – Create, Evaluate and Fix Policies This article will cover the most spectacular feature of SQL Server – Policy-based management and how the configuration of SQL Server with policy-based management architecture can make a powerful difference. Policy based management is loaded with several advantages. It can help you implement various policies for reliable configuration of the system. It also provides additional administration assistance to DBAs and helps them effortlessly manage various tasks of SQL Server across the enterprise. 2010 Recycle Error Log – Create New Log file without Server Restart Once I observed a DBA to restaring the SQL Server when he needed new error log file. This was funny and sad both at the same time. There is no need to restart the server to create a new log file or recycle the log file. You can run sp_cycle_errorlog and achieve the same result. Get Database Backup History for a Single Database Simple but effective script! Reducing CXPACKET Wait Stats for High Transactional Database The subject is very complex and I have done my best to simplify the concept. In simpler words, when a parallel operation is created for SQL Query, there are multiple threads for a single query. Each query deals with a different set of the data (or rows). Due to some reasons, one or more of the threads lag behind, creating the CXPACKET Wait Stat. Threads which came first have to wait for the slower thread to finish. The Wait by a specific completed thread is called CXPACKET Wait Stat. Information Related to DATETIME and DATETIME2 There are quite a lot of confusion with DATETIME and DATETIME2. DATETIME2 is also one of the underutilized datatype of SQL Server. In this blog post I have written a follow up of the my earlier datetime series where I clarify a few of the concepts related to datetime. Difference Between GETDATE and SYSDATETIME Difference Between DATETIME and DATETIME2 – WITH GETDATE Difference Between DATETIME and DATETIME2 2011 Introduction to CUME_DIST – Analytic Functions Introduced in SQL Server 2012 SQL Server 2012 introduces new analytical function CUME_DIST(). This function provides cumulative distribution value. It will be very difficult to explain this in words so I will attempt small example to explain you this function. Instead of creating new table, I will be using AdventureWorks sample database as most of the developer uses that for experiment. Introduction to FIRST _VALUE and LAST_VALUE – Analytic Functions Introduced in SQL Server 2012 SQL Server 2012 introduces new analytical functions FIRST_VALUE() and LAST_VALUE(). This function returns first and last value from the list. It will be very difficult to explain this in words so I’d like to attempt to explain its function through a brief example. Instead of creating a new table, I will be using the AdventureWorks sample database as most developers use that for experiment purposes. OVER clause with FIRST _VALUE and LAST_VALUE – Analytic Functions Introduced in SQL Server 2012 – ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING “Don’t you think there is bug in your first example where FIRST_VALUE is remain same but the LAST_VALUE is changing every line. I think the LAST_VALUE should be the highest value in the windows or set of result.” Puzzle – Functions FIRST_VALUE and LAST_VALUE with OVER clause and ORDER BY You can see that row number 2, 3, 4, and 5 has same SalesOrderID = 43667. The FIRST_VALUE is 78 and LAST_VALUE is 77. Now if these function was working on maximum and minimum value they should have given answer as 77 and 80 respectively instead of 78 and 77. Also the value of FIRST_VALUE is greater than LAST_VALUE 77. Why? Explain in detail. Introduction to LEAD and LAG – Analytic Functions Introduced in SQL Server 2012 SQL Server 2012 introduces new analytical function LEAD() and LAG(). This functions accesses data from a subsequent row (for lead) and previous row (for lag) in the same result set without the use of a self-join . It will be very difficult to explain this in words so I will attempt small example to explain you this function. Instead of creating new table, I will be using AdventureWorks sample database as most of the developer uses that for experiment. A Real Story of Book Getting ‘Out of Stock’ to A 25% Discount Story Available Our book was out of stock in 48 hours of it was arrived in stock! We got call from the online store with a request for more copies within 12 hours. But we had printed only as many as we had sent them. There were no extra copies. We finally talked to the printer to get more copies. However, due to festivals and holidays the copies could not be shipped to the online retailer for two days. We knew for sure that they were going to be out of the book for 48 hours. This is the story of how we overcame that situation! Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Memory Lane, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology

Read the article
DISA Cross Domain Enterprise Solutions on the NetBeans Platform

- by Geertjan

Bray 2.0 is a tool based on the NetBeans Platform that assists in creating valid Data Flow Configuration (DFC) files. The DFC Specification was developed to provide a standardized way for defining, validating, and approving data flows for use on cross-domain guarding solutions. A DFC document specifies key entities such as security domains, guards that facilitate data between security domains, data flows that describe how data travels between security domains, filters that transform and validate the data and more. Related info: http://www.disa.mil/Services/Information-Assurance/Cross-Domain-Solutions The Bray product is in development at Fulcrum IT (http://www.fulcrumco.com). The DFC Specification and Bray were developed in support of the US Department of Defense. Bray 2.0 marks the first release of Bray on the NetBeans Platform and utilizes a number of features that are core to the NetBeans Platform: Modular plugability. Bray consumers can integrate their own tools, file types, and more into the product with relative ease. Robust UI. The NetBeans Platform intuitive UI makes it easy to access and manipulate multiple aspects of a DFC. Explorer. The Explorer is a key component that makes the DFC XML easy to traverse, edit, and find errors. Context-sensitive help. JavaHelp can be readily integrated for the product as well as all the UI within. Editors. Any external file can be added to a DFC. Users can register their own editors or use the provided NetBeans editors to edit files. Printing. The NetBeans Platform Print API makes it easy to determine what should be printed and how. A screenshot: Bray 2.0 provides a lot of key features in developing valid, robust DFC files: XML validation. A DFC can be validated against the DFC schema specification. DFC Check List. An interactive, minimal guide for creating a complete DFC. Summary Window. The Summary Window functions like the Navigator in NetBeans IDE. The current "item of interest" is checked against various business rules and provides the ability to quickly find and fix errors. Change Log. Bray audits every change to a DFC and places them in a change log for users to peruse. Comments. Users can optionally add comments for other users to see. Digital signatures. DFC files can be digitally signed. A signature history and signature validation is provided in Bray. Pluggable security schemes. Bray ships with plain text and IC-ISM security schemes. If needed, users can integrate additional ones. ...and more to come! New features for Bray are constantly in development including use of the NetBeans Visual Library, language support, and more. More screenshots:

Read the article
Consumer Oriented Search In Oracle Endeca Information Discovery – Part 1

- by Bob Zurek

Information Discovery, a core capability of Oracle Endeca Information Discovery, enables business users to rapidly search, discover and navigate through a wide variety of big data including structured, unstructured and semi-structured data. One of the key capabilities, among many, that differentiate our solution from others in the Information Discovery market is our deep support for search across this growing amount of varied big data. Our method and approach is very different than classic simple keyword search that is found in may information discovery solutions. In this first part of a series on the topic of search, I will walk you through many of the key capabilities that go beyond the simple search box that you might experience in products where search was clearly an afterthought or attempt to catch up to our core capabilities in this area. Lets explore. The core data management solution of Oracle Endeca Information Discovery is the Endeca Server, a hybrid search-analytical database that his highly scalable and column-oriented in nature. We will talk in more technical detail about the capabilities of the Endeca Server in future blog posts as this post is intended to give you a feel for the deep search capabilities that are an integral part of the Endeca Server. The Endeca Server provides best-of-breed search features aw well as a new class of features that are the first to be designed around the requirement to bridge structured, semi-structured and unstructured big data. Some of the key features of search include type a heads, automatic alphanumeric spell corrections, positional search, Booleans, wildcarding, natural language, and category search and query classification dialogs. This is just a subset of the advanced search capabilities found in Oracle Endeca Information Discovery. Search is an important feature that makes it possible for business users to explore on the diverse data sets the Endeca Server can hold at any one time. The search capabilities in the Endeca server differ from other Information Discovery products with simple “search boxes” in the following ways: The Endeca Server Supports Exploratory Search. Enterprise data frequently requires the user to explore content through an ad hoc dialog, with guidance that helps them succeed. This has implications for how to design search features. Traditional search doesn’t assume a dialog, and so it uses relevance ranking to get its best guess to the top of the results list. It calculates many relevance factors for each query, like word frequency, distance, and meaning, and then reduces those many factors to a single score based on a proprietary “black box” formula. But how can a business users, searching, act on the information that the document is say only 38.1% relevant? In contrast, exploratory search gives users the opportunity to clarify what is relevant to them through refinements and summaries. This approach has received consumer endorsement through popular ecommerce sites where guided navigation across a broad range of products has helped consumers better discover choices that meet their, sometimes undetermined requirements. This same model exists in Oracle Endeca Information Discovery. In fact, the Endeca Server powers many of the most popular e-commerce sites in the world. The Endeca Server Supports Cascading Relevance. Traditional approaches of search reduce many relevance weights to a single score. This means that if a result with a good title match gets a similar score to one with an exact phrase match, they’ll appear next to each other in a list. But a user can’t deduce from their score why each got it’s ranking, even though that information could be valuable. Oracle Endeca Information Discovery takes a different approach. The Endeca Server stratifies results by a primary relevance strategy, and then breaks ties within a strata by ordering them with a secondary strategy, and so on. Application managers get the explicit means to compose these strategies based on their knowledge of their own domain. This approach gives both business users and managers a deterministic way to set and understand relevance. Now that you have an understanding of two of the core search capabilities in Oracle Endeca Information Discovery, our next blog post on this topic will discuss more advanced features including set search, second-order relevance as well as an understanding of faceted search mechanisms that include queries and filters.

Read the article
Social Targeting: This One's Just for You

- by Mike Stiles

Think of social targeting in terms of the archery competition we just saw in the Olympics. If someone loaded up 5 arrows and shot them straight up into the air all at once, hoping some would land near the target, the world would have united in laughter. But sadly for hysterical YouTube video viewing, that’s not what happened. The archers sought to maximize every arrow by zeroing in on the spot that would bring them the most points. Marketers have always sought to do the same. But they can only work with the tools that are available. A firm grasp of the desired target does little good if the ad products aren’t there to deliver that target. On the social side, both Facebook and Twitter have taken steps to enhance targeting for marketers. And why not? As the demand to monetize only goes up, they’re quite motivated to leverage and deliver their incredible user bases in ways that make economic sense for advertisers. You could target keywords on Twitter with promoted accounts, and get promoted tweets into search. They would surface for your followers and some users that Twitter thought were like them. Now you can go beyond keywords and target Twitter users based on 350 interests in 25 categories. How does a user wind up in one of these categories? Twitter looks at that user’s tweets, they look at whom they follow, and they run data through some sort of Twitter secret sauce. The result is, you have a much clearer shot at Twitter users who are most likely to welcome and be responsive to your tweets. And beyond the 350 interests, you can also create custom segments that find users who resemble followers of whatever Twitter handle you give it. That means you can now use boring tweets to sell like a madman, right? Not quite. This ad product is still quality-based, meaning if you’re not putting out tweets that lead to interest and thus, engagement, that tweet will earn a low quality score and wind up costing you more under Twitter’s auction system to maintain. That means, as the old knight in “Indiana Jones and the Last Crusade” cautions, “choose wisely” when targeting based on these interests and categories to make sure your interests truly do line up with theirs. On the Facebook side, they’re rolling out ad targeting that uses email addresses, phone numbers, game and app developers’ user ID’s, and eventually addresses for you bigger brands. Why? Because you marketers asked for it. Here you were with this amazing customer list but no way to reach those same customers should they be on Facebook. Now you can find and communicate with customers you gathered outside of social, and use Facebook to do it. Fair to say such users are a sensible target and will be responsive to your message since they’ve already bought something from you. And no you’re not giving your customer info to Facebook. They’ll use something called “hashing” to make sure you don’t see Facebook user data (beyond email, phone number, address, or user ID), and Facebook can’t see your customer data. The end result, social becomes far more workable and more valuable to marketers when it delivers on the promise that made it so exciting in the first place. That promise is the ability to move past casting wide nets to the masses and toward concentrating marketing dollars efficiently on the targets most likely to yield results.

Read the article
Brazil is Hot for Social Media

- by Mike Stiles

Today’s guest blog is from Oracle SVP Product Development Reggie Bradford, fresh off a visit to Sao Paulo, Brazil where he spoke at the Dachis Social Business Summit and spent some time getting a personal taste for the astonishing growth of social in Brazil, both in terms of usage and engagement. I knew it was big, but I now have an all-new appreciation for why the Wall Street Journal branded Brazil the “social media capital of the universe.” Brazil has the world’s 5th largest economy, an expanding middle class, an active younger demo market, a connected & outgoing culture, and an ongoing embrace of the social media platforms. According to comScore's 2012 Brazil Digital Future in Focus report, 97% are using social media, and that’s not even taking mobile-only users into account. There were 65 million Facebook users in 2012, spending an average 535 minutes there, up 208%. It’s one of Twitter’s fastest growing markets and the 2nd biggest market for YouTube. Instagram usage has grown over 300% since last year. That by itself is exciting, but look at the opportunity for social marketing brands. 74% of Brazilian social users follow brands on Facebook, and 59% have praised a company on either Twitter or Facebook. A 2011 Oh! Panel study found 81% of social networkers there used social to research new products and 75% went there looking for discounts. B2C eCommerce sales in Brazil is projected to hit $26.9 billion by 2015. I bet I’m not the only one who sees great things ahead, and I was fortunate enough give a keynote ABRADI, an association of leading digital agencies in Brazil with 53 execs from 35 agencies attending. I was also afforded the opportunity to give my impressions of what’s going on in Brazil to Jornal Propoganda & Marketing, one of the most popular publications in Latin America for marketers. I conveyed that especially in an environment like Brazil, where social users are so willing to connect and engage brands, marketers need to back away from the heavy-handed, one-way messaging of old school advertising and move toward genuine relationships and trust-building. To aide in this, organizational and operation changes must be embraced inside the enterprise. We've talked often about the new, tighter partnership forming between the CIO and CMO. If this partnership is not encouraged, fostered and resourced, the increasing amount of time consumers spend on mobile and digital, and the efficiencies and integrations offered by cloud-based software cannot be exploited. These are the kinds of changes that can yield social data that, when combined with enterprise data, helps you come to know your social audiences intimately and predict their needs. Consumers are always connected and need your brand to be accessible at any time, be it for information or customer service. And, of course, all of this is happening quite publicly. The holistic, socially-enable enterprise connects social to customer service systems and all other customer touch points, facilitating the kind of immediate, real-time, gratifying response customers are coming to expect. Social users in Brazil are highly active and clearly willing to meet us as brands more than halfway. Empowering yourself with a social management technology platform will have you set up to maximize this booming social market…from listening & monitoring to engagement to analytics to workflow & automation to globalization & language support. Brands, it’s time to be as social as the great people of Brazil are. Obrigado! @reggiebradfordPhoto: Gualberto107, freedigitalphotos.net

Read the article
Pinterest and the Rising Power of Imagery

- by Mike Stiles

If images keep you glued to a screen, you’re hardly alone. Countless social users are letting their eyes do the walking, waiting for that special photo to grab their attention. And perhaps more than any other social network, Pinterest has been giving those eyes plenty of room to walk. Pinterest came along in 2010. Its play was that users could simply create topic boards and pin pictures to the appropriate boards for sharing. Yes there are some words, captions mostly, but not many. The speed of its growth raised eyebrows. Traffic quadrupled in the last quarter of 2011, with 7.51 million unique visitors in December alone. It now gets 1.9 billion monthly page views. And it was sticky. In the US, the average time a user spends strolling through boards and photos on Pinterest is 15 minutes, 50 seconds. Proving the concept of browsing a catalogue is not dead, it became a top 5 referrer for several apparel retailers like Land’s End, Nordstrom, and Bergdorfs. Now a survey of online shoppers by BizRate Insights says that Pinterest is responsible for more purchases online than Facebook. Over 70% of its users are going there specifically to keep up with trends and get shopping ideas. And when they buy, the average order value is $179. Pinterest is also scoring better in terms of user engagement. 66% of pinners regularly follow and repin retailers, whereas 17% of Facebook fans turn to that platform for purchase ideas. (Facebook still wins when it comes to reach and driving traffic to 3rd-party sites by the way). Social posting best practices have consistently shown that posts with photos are rewarded with higher engagement levels. You may be downright Shakespearean in your writing, but what makes images in the digital world so much more powerful than prose? 1. They transcend language barriers. 2. They’re fun and addictive to look at. 3. They can be consumed in fractions of a second, important considering how fast users move through their social content (admit it, you do too). 4. They’re efficient gateways. A good picture might get them to the headline. A good headline might then get them to the written content. 5. The audience for them surpasses demographic limitations. 6. They can effectively communicate and trigger an emotion. 7. With mobile use soaring, photos are created on those devices and easily consumed and shared on them. Pinterest’s iPad app hit #1 in the Apple store in 1 day. Even as far back as 2009, over 2.5 billion devices with cameras were on the streets generating in just 1 year, 10% of the number of photos taken…ever. But let’s say you’re not a retailer. What if you’re a B2B whose products or services aren’t visual? Should you worry about your presence on Pinterest? As with all things, you need a keen awareness of who your audience is, where they reside online, and what they want to do there. If it doesn’t make sense to put a tent stake in Pinterest, fine. But ignore the power of pictures at your own peril. If not visually, how are you going to attention-grab social users scrolling down their News Feeds at top speed? You’re competing with every other cool image out there from countless content sources. Bore us and we’ll fly right past you.

Read the article
Consumer Oriented Search In Oracle Endeca Information Discovery - Part 2

- by Bob Zurek

As discussed in my last blog posting on this topic, Information Discovery, a core capability of the Oracle Endeca Information Discovery solution enables businesses to search, discover and navigate through a wide variety of big data including structured, unstructured and semi-structured data. With search as a core advanced capabilities of our product it is important to understand some of the key differences and capabilities in the underlying data store of Oracle Endeca Information Discovery and that is our Endeca Server. In the last post on this subject, we talked about Exploratory Search capabilities along with support for cascading relevance. Additional search capabilities in the Endeca Server, which differentiate from simple keyword based "search boxes" in other Information Discovery products also include: The Endeca Server Supports Set Search. The Endeca Server is organized around set retrieval, which means that it looks at groups of results (all the documents that match a search), as well as the relationship of each individual result to the set. Other approaches only compute the relevance of a document by comparing the document to the search query – not by comparing the document to all the others. For example, a search for “U.S.” in another approach might match to the title of a document and get a high ranking. But what if it were a collection of government documents in which “U.S.” appeared in many titles, making that clue less meaningful? A set analysis would reveal this and be used to adjust relevance accordingly. The Endeca Server Supports Second-Order Relvance. Unlike simple search interfaces in traditional BI tools, which provide limited relevance ranking, such as a list of results based on key word matching, Endeca enables users to determine the most salient terms to divide up the result. Determining this second-order relevance is the key to providing effective guidance. Support for Queries and Filters. Search is the most common query type, but hardly complete, and users need to express a wide range of queries. Oracle Endeca Information Discovery also includes navigation, interactive visualizations, analytics, range filters, geospatial filters, and other query types that are more commonly associated with BI tools. Unlike other approaches, these queries operate across structured, semi-structured and unstructured content stored in the Endeca Server. Furthermore, this set is easily extensible because the core engine allows for pluggable features to be added. Like a search engine, queries are answered with a results list, ranked to put the most likely matches first. Unlike “black box” relevance solutions, which generalize one strategy for everyone, we believe that optimal relevance strategies vary across domains. Therefore, it provides line-of-business owners with a set of relevance modules that let them tune the best results based on their content. The Endeca Server query result sets are summarized, which gives users guidance on how to refine and explore further. Summaries include Guided Navigation® (a form of faceted search), maps, charts, graphs, tag clouds, concept clusters, and clarification dialogs. Users don’t explicitly ask for these summaries; Oracle Endeca Information Discovery analytic applications provide the right ones, based on configurable controls and rules. For example, the analytic application might guide a procurement agent filtering for in-stock parts by visualizing the results on a map and calculating their average fulfillment time. Furthermore, the user can interact with summaries and filters without resorting to writing complex SQL queries. The user can simply just click to add filters. Within Oracle Endeca Information Discovery, all parts of the summaries are clickable and searchable. We are living in a search driven society where business users really seem to enjoy entering information into a search box. We do this everyday as consumers and therefore, we have gotten used to looking for that box. However, the key to getting the right results is to guide that user in a way that provides additional Discovery, beyond what they may have anticipated. This is why these important and advanced features of search inside the Endeca Server have been so important. They have helped to guide our great customers to success.

Read the article
I.T. Chargeback : Core to Cloud Computing

- by Anand Akela

Contributed by Mark McGill Consolidation and Virtualization have been widely adopted over the years to help deliver benefits such as increased server utilization, greater agility and lower cost to the I.T. organization. These are key enablers of cloud, but in themselves they do not provide a complete cloud solution. Building a true enterprise private cloud involves moving from an admin driven world, where the I.T. department is ultimately responsible for the provisioning of servers, databases, middleware and applications, to a world where the consumers of I.T. resources can provision their infrastructure, platforms and even complete application stacks on demand. Switching from an admin-driven provisioning model to a user-driven model creates some challenges. How do you ensure that users provisioning resources will not provision more than they need? How do you encourage users to return resources when they have finished with them so that others can use them? While chargeback has existed as a concept for many years (especially in mainframe environments), it is the move to this self-service model that has created a need for a new breed of chargeback applications for cloud. Enabling self-service without some form of chargeback is like opening a shop where all of the goods are free. A successful chargeback solution will be able to allocate the costs of shared I.T. infrastructure based on the relative consumption by the users. Doing this creates transparency between the I.T. department and the consumers of I.T. When users are able to understand how their consumption translates to cost they are much more likely to be prudent when it comes to their use of I.T. resources. This also gives them control of their I.T. costs, as moderate usage will translate to a lower charge at the end of the month. Implementing Chargeback successfully create a win-win situation for I.T. and the consumers. Chargeback can help to ensure that I.T. resources are used for activities that deliver business value. It also improves the overall utilization of I.T. infrastructure as I.T. resources that are not needed are not left running idle. Enterprise Manager 12c provides an integrated metering and chargeback solution for Enterprise Manager Targets. This solution is built on top of the rich configuration and utilization information already available in Enterprise Manager. It provides metering not just for virtual machines, but also for physical hosts, databases and middleware. Enterprise Manager 12c provides metering based on the utilization and configuration of the following types of Enterprise Manager Target: Oracle VM Host Oracle Database Oracle WebLogic Server Using Enterprise Manager Chargeback, administrators are able to create a set of Charge Plans that are used to attach prices to the various metered resources. These plans can contain fixed costs (eg. $10/month/database), configuration based costs (eg. $10/month if OS is Windows) and utilization based costs (eg. $0.05/GB of Memory/hour) The self-service user provisioning these resources is then able to view a report that details their usage and helps them understand how this usage translates into cost. Armed with this information, the user is able to determine if the resources are delivering adequate business value based on what is being charged. Figure 1: Chargeback in Self-Service Portal Enterprise Manager 12c provides a variety of additional interfaces into this data. The administrator can access summary and trending reports. Summary reports allow the administrator to drill-down through the cost center hierarchy to identify, for example, the top resource consumers across the organization. Figure 2: Charge Summary Report Trending reports can be used for I.T. planning and budgeting as they show utilization and charge trends over a period of time. Figure 3: CPU Trend Report We also provide chargeback reports through BI Publisher. This provides a way for users who do not have an Enterprise Manager login (such as Line of Business managers) to view charge and usage information. For situations where a bill needs to be produced, chargeback can be integrated with billing applications such as Oracle Billing and Revenue Management (BRM). Further information on Enterprise Manager 12c’s integrated metering and chargeback: White Paper Screenwatch Cloud Management on OTN

Read the article
Trouble connecting to vsftpd on ubuntu server

- by littleK

I have installed Ubuntu Server 10.10 and I am using it to host a domain that I have. I am trying to set up FTP for the server, but I am running into some problems. I have successfully installed vsFTPd and I have opened up ports 20, 21 on my firewall. In my vsFTPd configuration, I have enabled SSL. Every time I try to connect to my server via FTP, I receive a "Connection Refused" error. I have had a little more success with SSL disabled, however the connection process will time out after the LIST command (but it does accept my authentication). Here is my vsFTPd configuration, the SSL stuff is at the bottom: # Example config file /etc/vsftpd.conf # # The default compiled in settings are fairly paranoid. This sample file # loosens things up a bit, to make the ftp daemon more usable. # Please see vsftpd.conf.5 for all compiled in defaults. # # READ THIS: This example file is NOT an exhaustive list of vsftpd options. # Please read the vsftpd.conf.5 manual page to get a full idea of vsftpd's # capabilities. # # # Run standalone? vsftpd can run either from an inetd or as a standalone # daemon started from an initscript. listen=YES # # Run standalone with IPv6? # Like the listen parameter, except vsftpd will listen on an IPv6 socket # instead of an IPv4 one. This parameter and the listen parameter are mutually # exclusive. #listen_ipv6=YES # # Allow anonymous FTP? (Disabled by default) anonymous_enable=NO # # Uncomment this to allow local users to log in. local_enable=YES # # Uncomment this to enable any form of FTP write command. write_enable=YES # # Default umask for local users is 077. You may wish to change this to 022, # if your users expect that (022 is used by most other ftpd's) #local_umask=022 # # Uncomment this to allow the anonymous FTP user to upload files. This only # has an effect if the above global write enable is activated. Also, you will # obviously need to create a directory writable by the FTP user. #anon_upload_enable=YES # # Uncomment this if you want the anonymous FTP user to be able to create # new directories. #anon_mkdir_write_enable=YES # # Activate directory messages - messages given to remote users when they # go into a certain directory. dirmessage_enable=YES # # If enabled, vsftpd will display directory listings with the time # in your local time zone. The default is to display GMT. The # times returned by the MDTM FTP command are also affected by this # option. use_localtime=YES # # Activate logging of uploads/downloads. xferlog_enable=YES # # Make sure PORT transfer connections originate from port 20 (ftp-data). connect_from_port_20=YES # # If you want, you can arrange for uploaded anonymous files to be owned by # a different user. Note! Using "root" for uploaded files is not # recommended! #chown_uploads=YES #chown_username=whoever # # You may override where the log file goes if you like. The default is shown # below. #xferlog_file=/var/log/vsftpd.log # # If you want, you can have your log file in standard ftpd xferlog format. # Note that the default log file location is /var/log/xferlog in this case. #xferlog_std_format=YES # # You may change the default value for timing out an idle session. #idle_session_timeout=600 # # You may change the default value for timing out a data connection. #data_connection_timeout=120 # # It is recommended that you define on your system a unique user which the # ftp server can use as a totally isolated and unprivileged user. #nopriv_user=ftpsecure # # Enable this and the server will recognise asynchronous ABOR requests. Not # recommended for security (the code is non-trivial). Not enabling it, # however, may confuse older FTP clients. #async_abor_enable=YES # # By default the server will pretend to allow ASCII mode but in fact ignore # the request. Turn on the below options to have the server actually do ASCII # mangling on files when in ASCII mode. # Beware that on some FTP servers, ASCII support allows a denial of service # attack (DoS) via the command "SIZE /big/file" in ASCII mode. vsftpd # predicted this attack and has always been safe, reporting the size of the # raw file. # ASCII mangling is a horrible feature of the protocol. #ascii_upload_enable=YES #ascii_download_enable=YES # # You may fully customise the login banner string: #ftpd_banner=Welcome to blah FTP service. # # You may specify a file of disallowed anonymous e-mail addresses. Apparently # useful for combatting certain DoS attacks. #deny_email_enable=YES # (default follows) #banned_email_file=/etc/vsftpd.banned_emails # # You may restrict local users to their home directories. See the FAQ for # the possible risks in this before using chroot_local_user or # chroot_list_enable below. #chroot_local_user=YES # # You may specify an explicit list of local users to chroot() to their home # directory. If chroot_local_user is YES, then this list becomes a list of # users to NOT chroot(). #chroot_local_user=YES #chroot_list_enable=YES # (default follows) #chroot_list_file=/etc/vsftpd.chroot_list # # You may activate the "-R" option to the builtin ls. This is disabled by # default to avoid remote users being able to cause excessive I/O on large # sites. However, some broken FTP clients such as "ncftp" and "mirror" assume # the presence of the "-R" option, so there is a strong case for enabling it. #ls_recurse_enable=YES # # Debian customization # # Some of vsftpd's settings don't fit the Debian filesystem layout by # default. These settings are more Debian-friendly. # # This option should be the name of a directory which is empty. Also, the # directory should not be writable by the ftp user. This directory is used # as a secure chroot() jail at times vsftpd does not require filesystem # access. secure_chroot_dir=/var/run/vsftpd/empty # # This string is the name of the PAM service vsftpd will use. pam_service_name=vsftpd # # This option specifies the location of the RSA certificate to use for SSL # encrypted connections. rsa_cert_file=/etc/ssl/private/vsftpd.pem # SSL ssl_enable=YES allow_anon_ssl=NO force_local_data_ssl=YES force_local_logins_ssl=YES ssl_tlsv1=YES ssl_sslv2=YES ssl_sslv3=YES Thanks!

Read the article

Search Results

Search found 49554 results on 1983 pages for 'database users'.

Page 498/1983 | < Previous Page | 494 495 496 497 498 499 500 501 502 503 504 505 | Next Page >

- by Mladen Prajdic

- by wcoekaer

- by Rhames

- by user12587121

- by andrewbrust

- by Mike M

- by kellsey.ruppel(at)oracle.com

- by mgubar

- by fatherjack

- by Mike Stiles

- by Erik Costlow-Oracle

- by Clint Edmonson

- by kellsey.ruppel

- by pinaldave

- by Geertjan

- by Bob Zurek

- by Mike Stiles

- by Mike Stiles

- by Mike Stiles

- by Bob Zurek

- by Anand Akela

- by littleK

< Previous Page | 494 495 496 497 498 499 500 501 502 503 504 505 | Next Page >