Search Results

Search found 18698 results on 748 pages for 'oracle fusion sales cloud'.

Page 534/748 | < Previous Page | 530 531 532 533 534 535 536 537 538 539 540 541  | Next Page >

  • The Minimalist Approach to Content Governance - Create Phase

    - by Kellsey Ruppel
     Originally posted by John Brunswick. In this installment of our Minimalist Approach to Content Governance we finally get to the fun part of the content creation process! Once the content requester has addressed the items outlined in the Request Phase it is time to setup and begin the production of content.   For this to be done correctly it is important the the content be assigned appropriate workflow and security information. As in our prior phase, let's take a look at what can be done to streamline this process - as contributors are focused on getting information to their end users as quickly as possible. This often means that details around how to ensure that the materials are properly managed can be overlooked, but fortunately there are some techniques that leverage our content management system's native capabilities to automatically take care of some of the details. 1. Determine Access Why - Even if content is not something that needs to restricted due to security reasons, it is helpful to apply access rights so that the content ends up being visible only to users that it relates to. This will greatly improve user experience. For instance, if your team is working on a group project many of your fellow company employees do not need to see the content that is being worked on for that project. How - Make use of native content features that allow propagation of security and meta data from parent folders within your content system that have been setup for your particular effort. This makes it painless to enforce security, as well as meta data policies for even the most unorganized users. The default settings at a parent level can be set once the content creation request has been accepted and a location in the content management system is assigned for your specific project. Impact - Users can find information will less effort, as they will only be exposed to what they need for their work and can leverage advanced search features to take advantage of meta data assigned to content. The combination of default security and meta data will also help in running reports against the content in the Manage and Retire stages that we will discuss in the next 2 posts. 2. Assign Workflow (optional depending on nature of content) Why - Every case for workflow is going to be a bit different, but it generally involves ensuring that content conforms to management, legal and or editorial requirements. How - Oracle's Universal Content Management offers two ways of helping to workflow content without much effort. Workflow can be applied to content based on Criteria acting on meta data or explicitly assigned to content with a Basic workflow. Impact - Any content that needs additional attention before release is addressed, allowing users to comment and version until a suitable result is reached. By using inheritance from parent folders within the content management system content can automatically be given the right security, meta data and workflow information for a particular project's content. This relieves the burden of doing this for every piece of content from management teams and content contributors. We will cover more about the management phase within the content lifecycle in our next installment.

    Read the article

  • Thinking differently about BI delivery

    - by jamiet
    My day job involves implementing Business Intelligence (BI) solutions which, as I have said before, is simply about giving people the information they need to do their jobs. I’m always interested in learning about new ways of achieving that aim and that is my motivation for writing blog entries that are not concerned with SQL or SQL Server per se. Implementing BI systems usually involves hacking together a bunch third party products with some in-house “glue” and delivering information using some shiny, expensive web-based front-end tool; the list of vendors that supply such tools is big and ever-growing. No doubt these tools have their place and of late I have started to wonder whether they can be supplemented with different ways of delivering information. The problem I have with these separate web-based tools is exactly that – they are separate web-based tools. What’s the problem with that you might ask? I’ll explain! They force the information worker to go somewhere unfamiliar in order to get the information they need to do their jobs. Would it not be better if we could deliver information into the tools that those information workers are already using and not force them to go somewhere else? I look at the rise of blogging over recent years and I realise that what made them popular is that people can subscribe to RSS feeds and have information pushed to them in their tool of choice rather than them having to go and find the information for themselves in a tool that has been foisted upon them. Would it not be a good idea to adopt the principle of subscription for the benefit of delivering BI information as well? I think it would and in the rest of this blog entry I’ll outline such a scenario where the power of subscription could be used to enhance the delivery of information to information workers. Typical questions that information workers ask might be: What are my year-on-year sales figures? What was my footfall yesterday? How many widgets have I sold so far today? Each of those questions includes a time element and that shouldn’t surprise us, any BI system that I have worked on includes the dimension of time. Now, what do people use to view and organise their time-oriented information? Its not a trick question, they use a calendar and in the enterprise space more often than not that calendar is managed using Outlook. Given then that information workers are already looking at their calendar in Outlook anyway would it not make sense then to deliver information into that same calendar? Of course it would. Calendars are a great way of visualising information such as sales figures. Observe: Just in this single screenshot I have managed to convey a multitude of information. The information worker can see, at a glance, information about hourly/daily/weekly/monthly sales and, moreover, he/she is viewing that information right inside the tool that they use every day. There is no effort on the part of him/her, the information just appears hour after hour, day after day. Taking the idea further, each one of those calendar items could be a mini-dashboard in its own right. Double-clicking on an item could show a plethora of other information about that time slot such as breaking the sales down per region or year-over-year comparisons. Perhaps the title could employ a sparkline? Loads of possibilities. The point is that calendars are a completely natural way to visualise information; we should make more use of them! The real beauty of delivering information using calendars for us BI developers is that it should be so easy. In the case of Outlook we don’t need to write complicated VBA code that can go and manipulate a person’s calendar, simply publishing data in a format that Outlook can understand is sufficient and happily such formats already exist; iCalendar is the accepted format and the even more flexible xCalendar is hopefully on its way as well.   I’d like to make one last point and this one is with my SQL Server hat on. Reporting Services 2008 R2 introduced the ability to publish data as subscribable Atom feeds so it seems logical that it could also be a vehicle for delivering calendar feeds too. If you think this would be a good idea go and vote for it at Publish data as iCalendar feeds and please please please add some comments (especially if you vote it down). Work smarter, not harder! @Jamiet Share this post: email it! | bookmark it! | digg it! | reddit! | kick it! | live it!

    Read the article

  • Solaris continuera à supporter les processeurs Xeon d'Intel, son responsable dévoile les premiers éléments du prochain update

    Solaris continuera à supporter les processeurs Xeon d'Intel Le responsable de la plateforme chez Oracle dévoile les premiers éléments du prochain update De passage à Paris, le responsable de Solaris chez Oracle - Joost Pronk - a confirmé que l'OS « au coeur de la stratégie des nouveaux systèmes intégrés (Exadata, Exalogic et SPARC SuperCluster...), en partant des disques jusqu'aux applications » continuerait à être développé pour être compatible aussi bien avec SPARC qu'avec les processeurs d'Intel. « Peu importe ce que l'on vous raconte, ou ce que vous lisez ou ce que vous entendrez ailleurs, moi je vous le dis, Solaris supportera SPARC et les Xeon d'Intel », assure le port...

    Read the article

  • WHERE x = @x OR @x IS NULL

    - by steveh99999
    Every SQL DBA and developer should read the blog of MVP Erland Sommarskog – but particularly  his article on dynamic search conditions in T-SQL. I’ve linked above to his SQL 2005 article but his 2008 version is also a must-read. I seem to regularly come across uses of the SQL in the title above… Erland’s article explains in detail why this is inefficient, but I came across a nice example recently… A stored procedure contained the following code :- WHERE @Name is null or [Name] like @Name as a nonclustered index exists on the Name column, you might assume this would be handled efficiently by SQL Server. However, I got the following output from SET STATISTICS IO Table 'xxxxx'. Scan count 15, logical reads 47760, physical reads 9, read-ahead reads 13872, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. Note the high number of logical reads… After a bit of investigation, we found that @Name could never actually be set to NULL in this particular example. ie the @x IS NULL was spurious… So, we changed the call to WHERE  [Name] like @Name Now, how much more efficient is this code ? Table 'xxxxx'. Scan count 3, logical reads 24, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0 A nice easy win in this case…… a full index scan has been replaced by a significantly more efficient index seek. I managed to recreate the same behaviour on Adventureworks – here’s a quick query to demonstrate :- USE adventureworks SET STATISTICS IO ON DECLARE @id INT = 51721 SELECT * FROM Sales.SalesOrderDetail WHERE @id IS NULL OR salesorderid = @id SELECT * FROM Sales.SalesOrderDetail WHERE salesorderid = @id Take a look at the STATISTICS IO output and compare the actual query plans used to prove the impact of  WHERE @id IS NULL. And just to follow some of Erland’s advice – here’s how you could get similar performance if it was possible that @id could actually sometimes contain NULL. DECLARE @sql NVARCHAR(4000), @parameterlist NVARCHAR(4000) DECLARE @id INT = 51721 – or change to NULL to prove query is functionally correct SET @sql = 'SELECT * FROM Sales.SalesOrderDetail WHERE 1 = 1' IF @id IS NOT NULL SET @sql = @sql + ' AND salesorderid = @id' IF @id IS NULL SET @sql = @sql + ' AND salesorderid IS NULL' SET @parameterlist = '@id INT' EXEC sp_executesql @sql, @parameterlist,@id Sometimes I think we focus too much on hardware and SQL Server configuration – when really the answer is focus on writing efficient SQL.

    Read the article

  • How to set the initial component focus

    - by frank.nimphius
    Normal 0 false false false EN-US X-NONE X-NONE /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin:0in; mso-para-margin-bottom:.0001pt; mso-pagination:widow-orphan; font-size:11.0pt; font-family:"Calibri","sans-serif"; mso-ascii-font-family:Calibri; mso-ascii-theme-font:minor-latin; mso-fareast-font-family:"Times New Roman"; mso-fareast-theme-font:minor-fareast; mso-hansi-font-family:Calibri; mso-hansi-theme-font:minor-latin; mso-bidi-font-family:"Times New Roman"; mso-bidi-theme-font:minor-bidi;} In ADF Faces, you use the af:document tag's initialFocusId to define the initial component focus. For this, specify the id property value of the component that you want to put the initial focus on. Identifiers are relative to the component, and must account for NamingContainers. You can use a single colon to start the search from the root, or multiple colons to move up through the NamingContainers - "::" will pop out of the component's naming container and begin the search from there, ":::" will pop out of two naming containers and begin the search from there. Alternatively you can add the naming container IDs as a prefix to the component Id, e.g. nc1:nc2:comp1. http://download.oracle.com/docs/cd/E17904_01/apirefs.1111/e12419/tagdoc/af_document.html To set the initial focus to a component located in a page fragment that is exposed through an ADF region, keep in mind that ADF Faces regions - af:region - is a naming container too. To address an input text field with the id "it1" in an ADF region exposed by an af:region tag with the id r1, you use the following reference in af:document: <af:document id="d1" initialFocusId="r1:0:it1"> Note the "0" index in the client Id. Also, make sure the input text component has its clientComponent property set to true as otherwise no client component exist to put focus on.

    Read the article

  • HERMES Medical Solutions Helps Save Lives with MySQL

    - by Bertrand Matthelié
    Normal 0 false false false EN-US X-NONE X-NONE /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0cm 5.4pt 0cm 5.4pt; mso-para-margin:0cm; mso-para-margin-bottom:.0001pt; mso-pagination:widow-orphan; font-size:11.0pt; font-family:"Cambria","serif"; mso-ascii-font-family:Cambria; mso-ascii-theme-font:minor-latin; mso-fareast-font-family:"Times New Roman"; mso-fareast-theme-font:minor-fareast; mso-hansi-font-family:Cambria; mso-hansi-theme-font:minor-latin;} HERMES Medical Solutions was established in 1976 in Stockholm, Sweden, and is a leading innovator in medical imaging hardware/software products for health care facilities worldwide. HERMES delivers a plethora of different medical imaging solutions to optimize hospital workflow. Normal 0 false false false EN-US X-NONE X-NONE /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0cm 5.4pt 0cm 5.4pt; mso-para-margin:0cm; mso-para-margin-bottom:.0001pt; mso-pagination:widow-orphan; font-size:11.0pt; font-family:"Cambria","serif"; mso-ascii-font-family:Cambria; mso-ascii-theme-font:minor-latin; mso-fareast-font-family:"Times New Roman"; mso-fareast-theme-font:minor-fareast; mso-hansi-font-family:Cambria; mso-hansi-theme-font:minor-latin;} HERMES advanced algorithms make it possible to detect the smallest changes under therapies important and necessary to optimize different therapeutic methods and doses. Challenges Fighting illness & disease requires state-of-the-art imaging modalities and software in order to diagnose accurately, stage disease appropriately and select the best treatment available. Selecting and implementing a new database platform that would deliver the needed performance, reliability, security and flexibility required by the high-end medical solutions offered by HERMES. Solution Decision to migrate from in-house database to an embedded SQL database powering the HERMES products, delivered either as software, integrated hardware and software solutions, or via the cloud in a software-as-a-service configuration. Evaluation of several databases and selection of MySQL based on its high performance, ease of use and integration, and low Total Cost of Ownership. On average, between 4 and 12 Terabytes of data are stored in MySQL databases underpinning the HERMES solutions. The data generated by each medical study is indeed stored during 10 years or more after the treatment was performed. MySQL-based HERMES systems also allow doctors worldwide to conduct new drug research projects leveraging the large amount of medical data collected. Hospitals and other HERMES customers worldwide highly value the “zero administration” capabilities and reliability of MySQL, enabling them to perform medical analysis without any downtime. Relying on MySQL as their embedded database, the HERMES team has been able to increase their focus on further developing their clinical applications. HERMES Medical Solutions could leverage the Oracle Financing payment plan to spread its investment over time and make the MySQL choice even more valuable. “MySQL has proven to be an excellent database choice for us. We offer high-end medical solutions, and MySQL delivers the reliability, security and performance such solutions require.” Jan Bertling, CEO.

    Read the article

  • Is there any way to get faster app reviews?

    - by David
    I am trying to build a business around an iPhone app. The app will be our main sales channel, and being able to adapt the sales channel faster than the 9-10 days delay cause by the app review times is crucial. Therefore, I was wondering whether there is anything I can do to improve the speed of reviews. I am thinking that the publishers of Angry Birds, surely would not have to wait in line for reviews on the same terms as some obscure free app. So what can I do? Some things I am considering: Would Apple prioritize apps that they earn money on? Could I in some way pay Apple directly? I already know of the possibility of requesting an expedite review, but it seems like one can get punished for supplying a non-technical reason.

    Read the article

  • What books would I recommend?

    - by user12277104
    One of my mentees (I have three right now) said he had some time on his hands this Summer and was looking for good UX books to read ... I sigh heavily, because there is no shortage of good UX books to read. My bookshelves have titles by well-read authors like Nielsen, Norman, Tufte, Dumas, Krug, Gladwell, Pink, Csikszentmihalyi, and Roam. I have titles buy lesser-known authors, many whom I call friends, and many others whom I'll likely never meet. I have books on Excel pivot tables, typography, mental models, culture, accessibility, surveys, checklists, prototyping, Agile, Java, sketching, project management, HTML, negotiation, statistics, user research methods, six sigma, usability guidelines, dashboards, the effects of aging on cognition, UI design, and learning styles, among others ... many others. So I feel the need to qualify any book recommendations with "it depends ...", because it depends on who I'm talking to, and what they are looking for.  It's probably best that I also mention that the views expressed in this blog are mine, and may not necessarily reflect the views of Oracle. There. I'm glad I got that off my chest. For that mentee, who will be graduating with his MS HFID + MBA from Bentley in the Fall, I'll recommend this book: Universal Principles of Design -- this is a great book, which in its first edition held "100  ways to enhance usability, influence perception, increase appeal, make better design decisions, and teach through design." Granted, the second edition expanded that number to 125, but when I first found this book, I felt like I'd discovered the Grail. Its research-based principles are all laid out in 2 pages each, with lots of pictures and good references. A must-have for the new grad. Do I have recommendations for a book that will teach you how to conduct a usability test? Yes, three of them. To communicate what we do to management? Yes. To create personas? Yep -- two or three. Help you with UX in an Agile environment? You bet, I've got two I'd recommend. Create an excellent presentation? Uh hunh. Get buy-in from your team? Of course. There are a plethora of excellent UX books out there. But which ones I recommend ... well ... it depends. 

    Read the article

  • Architecture for dashbaord showing aggregated stats

    - by soulnafein
    I'm trying to find the best architecture for an application that shows a dashboard with aggregated stats that come from another one (e.g. number of sales in the last 12 months, current sales this month, a fairly complex score, performance of users over last 30 days, etc.) There is a fair bit of business logic that lives in Application 1 but the aggregated data gets saved in Application 2 (dashboard). What's the best way to create the aggregate data? 1) Pull data directly from Application 1 database and duplicate business logic for score calculation etc. 2) Push data from Application 1 to Application 2 somehow 3) Aggregate data in Application 1 on the fly and provide and api for Application 2 4) Other (probably) Please suggest solutions, Thanks.

    Read the article

  • Deduping your redundancies

    - by nospam(at)example.com (Joerg Moellenkamp)
    Robin Harris of Storagemojo pointed to an interesting article about about deduplication and it's impact to the resiliency of your data against data corruption on ACM Queue. The problem in short: A considerable number of filesystems store important metadata at multiple locations. For example the ZFS rootblock is copied to three locations. Other filesystems have similar provisions to protect their metadata. However you can easily proof, that the rootblock pointer in the uberblock of ZFS for example is pointing to blocks with absolutely equal content in all three locatition (with zdb -uu and zdb -r). It has to be that way, because they are protected by the same checksum. A number of devices offer block level dedup, either as an option or as part of their inner workings. However when you store three identical blocks on them and the devices does block level dedup internally, the device may just deduplicated your redundant metadata to a block stored just once that is stored on the non-voilatile storage. When this block is corrupted, you have essentially three corrupted copies. Three hit with one bullet. This is indeed an interesting problem: A device doing deduplication doesn't know if a block is important or just a datablock. This is the reason why I like deduplication like it's done in ZFS. It's an integrated part and so important parts don't get deduplicated away. A disk accessed by a block level interface doesn't know anything about the importance of a block. A metadata block is nothing different to it's inner mechanism than a normal data block because there is no way to tell that this is important and that those redundancies aren't allowed to fall prey to some clever deduplication mechanism. Robin talks about this in regard of the Sandforce disk controllers who use a kind of dedup to reduce some of the nasty effects of writing data to flash, but the problem is much broader. However this is relevant whenever you are using a device with block level deduplication. It's just the point that you have to activate it for most implementation by command, whereas certain devices do this by default or by design and you don't know about it. However I'm not perfectly sure about that ? given that storage administration and server administration are often different groups with different business objectives I would ask your storage guys if they have activated dedup without telling somebody elase on their boxes in order to speak less often with the storage sales rep. The problem is even more interesting with ZFS. You may use ditto blocks to protect important data to store multiple copies of data in the pool to increase redundancy, even when your pool just consists out of one disk or just a striped set of disk. However when your device is doing dedup internally it may remove your redundancy before it hits the nonvolatile storage. You've won nothing. Just spend your disk quota on the the LUNs in the SAN and you make your disk admin happy because of the good dedup ratio However you can just fall in this specific "deduped ditto block"trap when your pool just consists out of a single device, because ZFS writes ditto blocks on different disks, when there is more than just one disk. Yet another reason why you should spend some extra-thought when putting your zpool on a single LUN, especially when the LUN is sliced and dices out of a large heap of storage devices by a storage controller. However I have one problem with the articles and their specific mention of ZFS: You can just hit by this problem when you are using the deduplicating device for the pool. However in the specifically mentioned case of SSD this isn't the usecase. Most implementations of SSD in conjunction with ZFS are hybrid storage pools and so rotating rust disk is used as pool and SSD are used as L2ARC/sZIL. And there it simply doesn't matter: When you really have to resort to the sZIL (your system went down, it doesn't matter of one block or several blocks are corrupt, you have to fail back to the last known good transaction group the device. On the other side, when a block in L2ARC is corrupt, you simply read it from the pool and in HSP implementations this is the already mentioned rust. In conjunction with ZFS this is more interesting when using a storage array, that is capable to do dedup and where you use LUNs for your pool. However as mentioned before, on those devices it's a user made decision to do so, and so it's less probable that you deduplicating your redundancies. Other filesystems lacking acapability similar to hybrid storage pools are more "haunted" by this problem of SSD using dedup-like mechanisms internally, because those filesystem really store the data on the the SSD instead of using it just as accelerating devices. However at the end Robin is correct: It's jet another point why protecting your data by creating redundancies by dispersing it several disks (by mirror or parity RAIDs) is really important. No dedup mechanism inside a device can dedup away your redundancy when you write it to a totally different and indepenent device.

    Read the article

  • Have You Visited the New Procurement Enhancement Request Community?

    - by LuciaC
    Have you visited the new Procurement Enhancement Request Community yet?  If not, we strongly encourage you to visit this site to vote on current Enhancement Requests (ERs) available through the ‘Quick Preview of Voting List’.  You can also vote on any ER currently displayed.  Have an ER that is not listed?  Simply add it by creating a thread stating the ER and any detailed information you would like to include.  If the ER already exists in the database, we will add the ER # to the thread so that development can provide updates around the requested ERs. This community is your one-stop source for all Enhancement information.  It is being monitored regularly by development and soon we will be posting some updates around some of the top voted Enhancement Requests.  Know that your vote counts!  By voting, you will bring forward those ERs that impact the Procurement Suite's value and usability.  Is your request industry specific?  Let us know by posting this information in the body of the thread.  We have a team monitoring these ERs and will be happy to highlight industry specific ERs to ensure they also get equal visibility! Coming Soon:  A list of the Top implemented ERs!  Development has been working hard to make improvements to the Procurement Suite of Products and they want you to know about them!  Until then, check out the Best Practices Section for some key ERs and how they can help your company secure the most value from your implementation!! What you need to know: The Procurement Enhancement Requests Community is your 1-stop shop for the latest information on Enhancements! The Community allows you to vote on ERs bringing visibility to the collective audience interest in value and usability recommendations. Your place to submit any new enhancement requests. Get the latest on top Procurement Enhancement Requests (ERs) - know when an improvement is PLANNED, COMING SOON, and DELIVERED. This Community is owned and managed by the Oracle Procurement Development team! Let your voice be heard by telling us what you want to see implemented in the Procurement Suite.

    Read the article

  • Is How the Company Makes Its Money One of The Most Important Determining Factors in their work environment, culture, etc

    - by programmx10
    This is a viewpoint I've started to realize recently about some companies that I have worked for. They had their own software product that they developed in-house but most of the focus was on building an in-person sales team to push their product to businesses throughout the country. I figure that companies that are exclusively "online", meaning that their revenue source comes from online transactions where there is no "face" of the company to the customer would have a different work culture. Just curious if anyone has worked for both types of companies and notices a difference. I myself am hoping to get more into contract programming and figure that companies that don't have to employ a sales-force and things like that would be more focused on technology and maybe even willing to be flexible on partial telecommute, etc

    Read the article

  • SQL Azure Reporting Limited CTP Arrived

    - by Shaun
    It’s about 3 months later when I registered the SQL Azure Reporting CTP on the Microsoft Connect after TechED 2010 China. Today when I checked my mailbox I found that the SQL Azure team had just accepted my request and sent the activation code over to me. So let’s have a look on the new SQL Azure Reporting.   Concept The SQL Azure Reporting provides cloud-based reporting as a service, built on SQL Server Reporting Services and SQL Azure technologies. Cloud-based reporting solutions such as SQL Azure Reporting provide many benefits, including rapid provisioning, cost-effective scalability, high availability, and reduced management overhead for report servers; and secure access, viewing, and management of reports. By using the SQL Azure Reporting service, we can do: Embed the Visual Studio Report Viewer ADO.NET Ajax control or Windows Form control to view the reports deployed on SQL Azure Reporting Service in our web or desktop application. Leverage the SQL Azure Reporting SOAP API to manage and retrieve the report content from any kinds of application. Use the SQL Azure Reporting Service Portal to navigate and view the reports deployed on the cloud. Since the SQL Azure Reporting was built based on the SQL Server 2008 R2 Reporting Service, we can use any tools we are familiar with, such as the SQL Server Integration Studio, Visual Studio Report Viewer. The SQL Azure Reporting Service runs as a remote SQL Server Reporting Service just on the cloud rather than on a server besides us.   Establish a New SQL Azure Reporting Let’s move to the windows azure deveploer portal and click the Reporting item from the left side navigation bar. If you don’t have the activation code you can click the Sign Up button to send a requirement to the Microsoft Connect. Since I already recieved the received code mail I clicked the Provision button. Then after agree the terms of the service I will select the subscription for where my SQL Azure Reporting CTP should be provisioned. In this case I selected my free Windows Azure Pass subscription. Then the final step, paste the activation code and enter the password of our SQL Azure Reporting Service. The user name of the SQL Azure Reporting will be generated by SQL Azure automatically. After a while the new SQL Azure Reporting Server will be shown on our developer portal. The Reporting Service URL and the user name will be shown as well. We can reset the password from the toolbar button.   Deploy Report to SQL Azure Reporting If you are familiar with SQL Server Reporting Service you will find this part will be very similar with what you know and what you did before. Firstly we open the SQL Server Business Intelligence Development Studio and create a new Report Server Project. Then we will create a shared data source where the report data will be retrieved from. This data source can be SQL Azure but we can use local SQL Server or other database if it opens the port up. In this case we use a SQL Azure database located in the same data center of our reporting service. In the Credentials tab page we entered the user name and password to this SQL Azure database. The SQL Azure Reporting CTP only available at the North US Data Center now so that the related SQL Server and hosted service might be better to select the same data center to avoid the external data transfer fee. Then we create a very simple report, just retrieve all records from a table named Members and have a table in the report to list them. In the data source selection step we choose the shared data source we created before, then enter the T-SQL to select all records from the Member table, then put all fields into the table columns. The report will be like this as following In order to deploy the report onto the SQL Azure Reporting Service we need to update the project property. Right click the project node from the solution explorer and select the property item. In the Target Server URL item we will specify the reporting server URL of our SQL Azure Reporting. We can go back to the developer portal and select the reporting node from the left side, then copy the Web Service URL and paste here. But notice that we need to append “/reportserver” after pasted. Then just click the Deploy menu item in the context menu of the project, the Visual Studio will compile the report and then upload to the reporting service accordingly. In this step we will be prompted to input the user name and password of our SQL Azure Reporting Service. We can get the user name from the developer portal, just next to the Web Service URL in the SQL Azure Reporting page. And the password is the one we specified when created the reporting service. After about one minute the report will be deployed succeed.   View the Report in Browser SQL Azure Reporting allows us to view the reports which deployed on the cloud from a standard browser. We copied the Web Service URL from the reporting service main page and appended “/reportserver” in HTTPS protocol then we will have the SQL Azure Reporting Service login page. After entered the user name and password of the SQL Azure Reporting Service we can see the directories and reports listed. Click the report will launch the Report Viewer to render the report.   View Report in a Web Role with the Report Viewer The ASP.NET and Windows Form Report Viewer works well with the SQL Azure Reporting Service as well. We can create a ASP.NET Web Role and added the Report Viewer control in the default page. What we need to change to the report viewer are Change the Processing Mode to Remote. Specify the Report Server URL under the Server Remote category to the URL of the SQL Azure Reporting Web Service URL with “/reportserver” appended. Specify the Report Path to the report which we want to display. The report name should NOT include the extension name. For example my report was in the SqlAzureReportingTest project and named MemberList.rdl then the report path should be /SqlAzureReportingTest/MemberList. And the next one is to specify the SQL Azure Reporting Credentials. We can use the following class to wrap the report server credential. 1: private class ReportServerCredentials : IReportServerCredentials 2: { 3: private string _userName; 4: private string _password; 5: private string _domain; 6:  7: public ReportServerCredentials(string userName, string password, string domain) 8: { 9: _userName = userName; 10: _password = password; 11: _domain = domain; 12: } 13:  14: public WindowsIdentity ImpersonationUser 15: { 16: get 17: { 18: return null; 19: } 20: } 21:  22: public ICredentials NetworkCredentials 23: { 24: get 25: { 26: return null; 27: } 28: } 29:  30: public bool GetFormsCredentials(out Cookie authCookie, out string user, out string password, out string authority) 31: { 32: authCookie = null; 33: user = _userName; 34: password = _password; 35: authority = _domain; 36: return true; 37: } 38: } And then in the Page_Load method, pass it to the report viewer. 1: protected void Page_Load(object sender, EventArgs e) 2: { 3: ReportViewer1.ServerReport.ReportServerCredentials = new ReportServerCredentials( 4: "<user name>", 5: "<password>", 6: "<sql azure reporting web service url>"); 7: } Finally deploy it to Windows Azure and enjoy the report.   Summary In this post I introduced the SQL Azure Reporting CTP which had just available. Likes other features in Windows Azure, the SQL Azure Reporting is very similar with the SQL Server Reporting. As you can see in this post we can use the existing and familiar tools to build and deploy the reports and display them on a website. But the SQL Azure Reporting is just in the CTP stage which means It is free. There’s no support for it. Only available at the North US Data Center. You can get more information about the SQL Azure Reporting CTP from the links following SQL Azure Reporting Limited CTP at MSDN SQL Azure Reporting Samples at TechNet Wiki You can download the solutions and the projects used in this post here.   Hope this helps, Shaun All documents and related graphics, codes are provided "AS IS" without warranty of any kind. Copyright © Shaun Ziyan Xu. This work is licensed under the Creative Commons License.

    Read the article

  • The Sensemaking Spectrum for Business Analytics: Translating from Data to Business Through Analysis

    - by Joe Lamantia
    One of the most compelling outcomes of our strategic research efforts over the past several years is a growing vocabulary that articulates our cumulative understanding of the deep structure of the domains of discovery and business analytics. Modes are one example of the deep structure we’ve found.  After looking at discovery activities across a very wide range of industries, question types, business needs, and problem solving approaches, we've identified distinct and recurring kinds of sensemaking activity, independent of context.  We label these activities Modes: Explore, compare, and comprehend are three of the nine recognizable modes.  Modes describe *how* people go about realizing insights.  (Read more about the programmatic research and formal academic grounding and discussion of the modes here: https://www.researchgate.net/publication/235971352_A_Taxonomy_of_Enterprise_Search_and_Discovery) By analogy to languages, modes are the 'verbs' of discovery activity.  When applied to the practical questions of product strategy and development, the modes of discovery allow one to identify what kinds of analytical activity a product, platform, or solution needs to support across a spread of usage scenarios, and then make concrete and well-informed decisions about every aspect of the solution, from high-level capabilities, to which specific types of information visualizations better enable these scenarios for the types of data users will analyze. The modes are a powerful generative tool for product making, but if you've spent time with young children, or had a really bad hangover (or both at the same time...), you understand the difficult of communicating using only verbs.  So I'm happy to share that we've found traction on another facet of the deep structure of discovery and business analytics.  Continuing the language analogy, we've identified some of the ‘nouns’ in the language of discovery: specifically, the consistently recurring aspects of a business that people are looking for insight into.  We call these discovery Subjects, since they identify *what* people focus on during discovery efforts, rather than *how* they go about discovery as with the Modes. Defining the collection of Subjects people repeatedly focus on allows us to understand and articulate sense making needs and activity in more specific, consistent, and complete fashion.  In combination with the Modes, we can use Subjects to concretely identify and define scenarios that describe people’s analytical needs and goals.  For example, a scenario such as ‘Explore [a Mode] the attrition rates [a Measure, one type of Subject] of our largest customers [Entities, another type of Subject] clearly captures the nature of the activity — exploration of trends vs. deep analysis of underlying factors — and the central focus — attrition rates for customers above a certain set of size criteria — from which follow many of the specifics needed to address this scenario in terms of data, analytical tools, and methods. We can also use Subjects to translate effectively between the different perspectives that shape discovery efforts, reducing ambiguity and increasing impact on both sides the perspective divide.  For example, from the language of business, which often motivates analytical work by asking questions in business terms, to the perspective of analysis.  The question posed to a Data Scientist or analyst may be something like “Why are sales of our new kinds of potato chips to our largest customers fluctuating unexpectedly this year?” or “Where can innovate, by expanding our product portfolio to meet unmet needs?”.  Analysts translate questions and beliefs like these into one or more empirical discovery efforts that more formally and granularly indicate the plan, methods, tools, and desired outcomes of analysis.  From the perspective of analysis this second question might become, “Which customer needs of type ‘A', identified and measured in terms of ‘B’, that are not directly or indirectly addressed by any of our current products, offer 'X' potential for ‘Y' positive return on the investment ‘Z' required to launch a new offering, in time frame ‘W’?  And how do these compare to each other?”.  Translation also happens from the perspective of analysis to the perspective of data; in terms of availability, quality, completeness, format, volume, etc. By implication, we are proposing that most working organizations — small and large, for profit and non-profit, domestic and international, and in the majority of industries — can be described for analytical purposes using this collection of Subjects.  This is a bold claim, but simplified articulation of complexity is one of the primary goals of sensemaking frameworks such as this one.  (And, yes, this is in fact a framework for making sense of sensemaking as a category of activity - but we’re not considering the recursive aspects of this exercise at the moment.) Compellingly, we can place the collection of subjects on a single continuum — we call it the Sensemaking Spectrum — that simply and coherently illustrates some of the most important relationships between the different types of Subjects, and also illuminates several of the fundamental dynamics shaping business analytics as a domain.  As a corollary, the Sensemaking Spectrum also suggests innovation opportunities for products and services related to business analytics. The first illustration below shows Subjects arrayed along the Sensemaking Spectrum; the second illustration presents examples of each kind of Subject.  Subjects appear in colors ranging from blue to reddish-orange, reflecting their place along the Spectrum, which indicates whether a Subject addresses more the viewpoint of systems and data (Data centric and blue), or people (User centric and orange).  This axis is shown explicitly above the Spectrum.  Annotations suggest how Subjects align with the three significant perspectives of Data, Analysis, and Business that shape business analytics activity.  This rendering makes explicit the translation and bridging function of Analysts as a role, and analysis as an activity. Subjects are best understood as fuzzy categories [http://georgelakoff.files.wordpress.com/2011/01/hedges-a-study-in-meaning-criteria-and-the-logic-of-fuzzy-concepts-journal-of-philosophical-logic-2-lakoff-19731.pdf], rather than tightly defined buckets.  For each Subject, we suggest some of the most common examples: Entities may be physical things such as named products, or locations (a building, or a city); they could be Concepts, such as satisfaction; or they could be Relationships between entities, such as the variety of possible connections that define linkage in social networks.  Likewise, Events may indicate a time and place in the dictionary sense; or they may be Transactions involving named entities; or take the form of Signals, such as ‘some Measure had some value at some time’ - what many enterprises understand as alerts.   The central story of the Spectrum is that though consumers of analytical insights (represented here by the Business perspective) need to work in terms of Subjects that are directly meaningful to their perspective — such as Themes, Plans, and Goals — the working realities of data (condition, structure, availability, completeness, cost) and the changing nature of most discovery efforts make direct engagement with source data in this fashion impossible.  Accordingly, business analytics as a domain is structured around the fundamental assumption that sense making depends on analytical transformation of data.  Analytical activity incrementally synthesizes more complex and larger scope Subjects from data in its starting condition, accumulating insight (and value) by moving through a progression of stages in which increasingly meaningful Subjects are iteratively synthesized from the data, and recombined with other Subjects.  The end goal of  ‘laddering’ successive transformations is to enable sense making from the business perspective, rather than the analytical perspective.Synthesis through laddering is typically accomplished by specialized Analysts using dedicated tools and methods. Beginning with some motivating question such as seeking opportunities to increase the efficiency (a Theme) of fulfillment processes to reach some level of profitability by the end of the year (Plan), Analysts will iteratively wrangle and transform source data Records, Values and Attributes into recognizable Entities, such as Products, that can be combined with Measures or other data into the Events (shipment of orders) that indicate the workings of the business.  More complex Subjects (to the right of the Spectrum) are composed of or make reference to less complex Subjects: a business Process such as Fulfillment will include Activities such as confirming, packing, and then shipping orders.  These Activities occur within or are conducted by organizational units such as teams of staff or partner firms (Networks), composed of Entities which are structured via Relationships, such as supplier and buyer.  The fulfillment process will involve other types of Entities, such as the products or services the business provides.  The success of the fulfillment process overall may be judged according to a sophisticated operating efficiency Model, which includes tiered Measures of business activity and health for the transactions and activities included.  All of this may be interpreted through an understanding of the operational domain of the businesses supply chain (a Domain).   We'll discuss the Spectrum in more depth in succeeding posts.

    Read the article

  • Community Outreach - Where Should I Go

    - by Roger Brinkley
    A few days ago I was talking to person new to community development and they asked me what guidelines I used to determine the worthiness of a particular event. After our conversation was over I thought about it a little bit more and figured out there are three ways to determine if any event (be it conference, blog, podcast or other social medias) is worth doing: Transferability, Multiplication, and Impact. Transferability - Is what I have to say useful to the people that are going to hear it. For instance, consider a company that has product offering that can connect up using a number of languages like Scala, Grovey or Java. Sending a Scala expert to talk about Scala and the product is not transferable to a Java User Group, but a Java expert doing the same talk with a Java slant is. Similarly, talking about JavaFX to any Java User Group meeting in Brazil was pretty much a wasted effort until it was open sourced. Once it was open sourced it was well received. You can also look at transferability in relation to the subject matter that you're dealing with. How transferable is a presentation that I create. Can I, or a technical writer on the staff, turn it into some technical document. Could it be converted into some type of screen cast. If we have a regular podcast can we make a reference to the document, catch the high points or turn it into a interview. Is there a way of using this in the sales group. In other words is the document purely one dimensional or can it be re-purposed in other forms. Multiplication - On every trip I'm looking for 2 to 5 solid connections that I can make with developers. These are long term connections, because I know that once that relationship is established it will lead to another 2 - 5 from that connection and within a couple of years were talking about some 100 connections from just one developer. For instance, when I was working on JavaHelp in 2000 I hired a science teacher with a programming background. We've developed a very tight relationship over the year though we rarely see each other more than once a year. But at this JavaOne, one of his employees came up to me and said, "Richard (Rick Hard in Czech) told me to tell you that he couldn't make it to JavaOne this year but if I saw you to tell you hi". Another example is from my Mobile & Embedded days in Brasil. On our very first FISL trip about 5 years ago there were two university students that had created a project called "Marge". Marge was a Bluetooth framework that made connecting bluetooth devices easier. I invited them to a "Sun" dinner that evening. Originally they were planning on leaving that afternoon, but they changed their plans recognizing the opportunity. Their eyes were as big a saucers when they realized the level of engineers at the meeting. They went home started a JUG in Florianoplis that we've visited more than a couple of times. One of them went to work for Brazilian government lab like Berkley Labs, MIT Lab, John Hopkins Applied Physicas Labs or Lincoln Labs in the US. That presented us with an opportunity to show Embedded Java as a possibility for some of the work they were doing there. Impact - The final criteria is how life changing is what I'm going to say be to the individuals I'm reaching. A t-shirt is just a token, but when I reach down and tug at their developer hearts then I know I've succeeded. I'll never forget one time we flew all night to reach Joan Pasoa in Northern Brazil. We arrived at 2am went immediately to our hotel only to be woken up at 6 am to travel 2 hours by car to the presentation hall. When we arrived we were totally exhausted. Outside the facility there were 500 people lined up to hear 6 speakers for the day. That itself was uplifting.  I delivered one of my favorite talks on "I have passion". It was a talk on golf and embedded java development, "Find your passion". When we finished a couple of first year students came up to me and said how much my talk had inspired them. FISL is another great example. I had been about 4 years in a row. FISL is a very young group of developers so capturing their attention is important. Several of the students will come back 2 or 3 years later and ask me questions about research or jobs. And then there's Louis. Louis is one my favorite Brazilians. I can only describe him as a big Brazilian teddy bear. I see him every year at FISL. He works primarily in Java EE but he's attended every single one of my talks over the last 4 years. I can't tell you why, but he always greets me and gives me a hug. For some reason I've had a real impact. And of course when it comes to impact you don't just measure a presentation but every single interaction you have at an event. It's the hall way conversations, the booth conversations, but more importantly it's the conversations at dinner tables or in the cars when you're getting transported to an event. There's a good story that illustrates this. Last year in the spring I was traveling to Goiânia in Brazil. I've been there many times and leaders there no me well. One young man has picked me up at the airport on more than one occasion. We were going out to dinner one evening and he brought his girl friend along. One thing let to another and I eventually asked him, in front of her, "Why haven't you asked her to marry you?" There were all kinds of excuses and she just looked at him and smiled. When I came back in December for JavaOne he came and sought me. "I just want to tell you that I thought a lot about what you said, and I asked her to marry me. We're getting married next Spring." Sometimes just one presentation is all it takes to make an impact. Other times it takes years. Some impacts are directly related to the company and some are more personal in nature. It doesn't matter which it is because it's having the impact that matters.

    Read the article

  • Taking a Flying Leap

    - by Lance Shaw
    Yesterday, I went skydiving with three of my children.  It was thrilling, scary, invigorating and exciting. While there is obvious risk involved, the reward and feeling of success was well worth it. You might already be wondering what skydiving would have to with WebCenter, so let me explain. Implementing a skydiving program and becoming an instructor does not happen overnight.  It does not happen with the purchase of the needed technology. Not one of us would go out, buy a parachute, the harnesses, helmet and all the gear and be able to convince anyone that we are now ready to be a skydiving instructor. The fact is that obtaining the technology is merely a small piece of the overall process and so is the case with managing content in your company. You don't just buy the right software (Oracle WebCenter Content) and go to your boss and declare information management success. There is planning, research and effort that goes into deploying software of any kind and especially when it is as mission-critical to the success of your business as Enterprise Content Management. To become a certified skydiving instructor takes at least 3 years of commitment and often longer. In the United States, candidates must complete over 500 solo jumps of their own over a minimum of 36 months and then must complete additional rigorous training under observation.  When you consider the amount of time and effort involved, it's not unlike getting a college degree and anyone that has trusted their lives to one of these instructors will no doubt appreciate their dedication to the curriculum.  Implementing an ECM system won't take that long, but it certainly requires commitment, analysis and consideration. But guess what?  Humans are involved and that means that mistakes can happen and that rules change.  This struck me while reading an excellent post on darkreading.com by Glenn S. Phillips entitled "Mission Impossible: 4 Reasons Compliance is Impossible".  His over-arching point was that with information management and security, environments change and people are involved meaning the work is never done.  He stated that you can never claim your compliance efforts are complete because of the following reasons. People are involved.  And lets face it, some are more trustworthy than others. Change is Constant. There is always some new technology coming along that is disruptive. Consumer grade cloud file sharing and sync tools come to mind here. Compliance is interpreted, not defined.  Laws and the judges that read them are always on the move. Technology is a tool, not a complete solution. There is no magic pill. The skydiving analogy holds true here as well.  Ultimately, a single person packs your parachute.  For obvious reasons, you prefer that this person be trustworthy but there are no absolute guarantees of a 100% error-free scenario.  Weather and wind conditions are never a constant and the best-laid plans for a great day of skydiving are easily disrupted by forces outside of your control.  Rules and regulations vary by location and may be updated at any time and as I mentioned early on, even the best technology on its own will only get you started. The good news is that, like skydiving, with the right technology, the right planning, the right team and a proper understanding of the rules and regulations that govern your industry, your ECM deployment can be a great success.  Failure to plan for any of the 4 factors that Glenn outlined in his article will certainly put your deployment and maybe even your company at risk, so consider them carefully. As a final aside, for those of you who consider skydiving an incredibly dangerous and risky pastime, consider this comparative statistic.  In 2012, the U.S. Parachute Association recorded 19 fatal skydiving accidents in the U.S. out of roughly 3.1 million jumps.  That’s 0.006 fatalities per 1,000 jumps. By comparison, the U.S. National Highway Traffic Safety Administration reports that there were 34,080 deaths due to car accidents in 2012.  Based on the percentages, one could argue that it is safer to jump out of a plane than to drive to the airport where the skydiving will take place. While the way you manage, secure, classify, control, retain and dispose of company files may not carry as much risk as driving or skydiving, it certainly carries risk for the organization when not planned and deployed appropriately.  Consider all the factors involved in your organization as you make your content management plans.  For additional areas of consideration, be sure to download our free whitepaper on the topic entitled "The Top 10 Criteria for Choosing an ECM System" which is available for download here.

    Read the article

  • Using R to Analyze G1GC Log Files

    - by user12620111
    Using R to Analyze G1GC Log Files body, td { font-family: sans-serif; background-color: white; font-size: 12px; margin: 8px; } tt, code, pre { font-family: 'DejaVu Sans Mono', 'Droid Sans Mono', 'Lucida Console', Consolas, Monaco, monospace; } h1 { font-size:2.2em; } h2 { font-size:1.8em; } h3 { font-size:1.4em; } h4 { font-size:1.0em; } h5 { font-size:0.9em; } h6 { font-size:0.8em; } a:visited { color: rgb(50%, 0%, 50%); } pre { margin-top: 0; max-width: 95%; border: 1px solid #ccc; white-space: pre-wrap; } pre code { display: block; padding: 0.5em; } code.r, code.cpp { background-color: #F8F8F8; } table, td, th { border: none; } blockquote { color:#666666; margin:0; padding-left: 1em; border-left: 0.5em #EEE solid; } hr { height: 0px; border-bottom: none; border-top-width: thin; border-top-style: dotted; border-top-color: #999999; } @media print { * { background: transparent !important; color: black !important; filter:none !important; -ms-filter: none !important; } body { font-size:12pt; max-width:100%; } a, a:visited { text-decoration: underline; } hr { visibility: hidden; page-break-before: always; } pre, blockquote { padding-right: 1em; page-break-inside: avoid; } tr, img { page-break-inside: avoid; } img { max-width: 100% !important; } @page :left { margin: 15mm 20mm 15mm 10mm; } @page :right { margin: 15mm 10mm 15mm 20mm; } p, h2, h3 { orphans: 3; widows: 3; } h2, h3 { page-break-after: avoid; } } pre .operator, pre .paren { color: rgb(104, 118, 135) } pre .literal { color: rgb(88, 72, 246) } pre .number { color: rgb(0, 0, 205); } pre .comment { color: rgb(76, 136, 107); } pre .keyword { color: rgb(0, 0, 255); } pre .identifier { color: rgb(0, 0, 0); } pre .string { color: rgb(3, 106, 7); } var hljs=new function(){function m(p){return p.replace(/&/gm,"&").replace(/"}while(y.length||w.length){var v=u().splice(0,1)[0];z+=m(x.substr(q,v.offset-q));q=v.offset;if(v.event=="start"){z+=t(v.node);s.push(v.node)}else{if(v.event=="stop"){var p,r=s.length;do{r--;p=s[r];z+=("")}while(p!=v.node);s.splice(r,1);while(r'+M[0]+""}else{r+=M[0]}O=P.lR.lastIndex;M=P.lR.exec(L)}return r+L.substr(O,L.length-O)}function J(L,M){if(M.sL&&e[M.sL]){var r=d(M.sL,L);x+=r.keyword_count;return r.value}else{return F(L,M)}}function I(M,r){var L=M.cN?'':"";if(M.rB){y+=L;M.buffer=""}else{if(M.eB){y+=m(r)+L;M.buffer=""}else{y+=L;M.buffer=r}}D.push(M);A+=M.r}function G(N,M,Q){var R=D[D.length-1];if(Q){y+=J(R.buffer+N,R);return false}var P=q(M,R);if(P){y+=J(R.buffer+N,R);I(P,M);return P.rB}var L=v(D.length-1,M);if(L){var O=R.cN?"":"";if(R.rE){y+=J(R.buffer+N,R)+O}else{if(R.eE){y+=J(R.buffer+N,R)+O+m(M)}else{y+=J(R.buffer+N+M,R)+O}}while(L1){O=D[D.length-2].cN?"":"";y+=O;L--;D.length--}var r=D[D.length-1];D.length--;D[D.length-1].buffer="";if(r.starts){I(r.starts,"")}return R.rE}if(w(M,R)){throw"Illegal"}}var E=e[B];var D=[E.dM];var A=0;var x=0;var y="";try{var s,u=0;E.dM.buffer="";do{s=p(C,u);var t=G(s[0],s[1],s[2]);u+=s[0].length;if(!t){u+=s[1].length}}while(!s[2]);if(D.length1){throw"Illegal"}return{r:A,keyword_count:x,value:y}}catch(H){if(H=="Illegal"){return{r:0,keyword_count:0,value:m(C)}}else{throw H}}}function g(t){var p={keyword_count:0,r:0,value:m(t)};var r=p;for(var q in e){if(!e.hasOwnProperty(q)){continue}var s=d(q,t);s.language=q;if(s.keyword_count+s.rr.keyword_count+r.r){r=s}if(s.keyword_count+s.rp.keyword_count+p.r){r=p;p=s}}if(r.language){p.second_best=r}return p}function i(r,q,p){if(q){r=r.replace(/^((]+|\t)+)/gm,function(t,w,v,u){return w.replace(/\t/g,q)})}if(p){r=r.replace(/\n/g,"")}return r}function n(t,w,r){var x=h(t,r);var v=a(t);var y,s;if(v){y=d(v,x)}else{return}var q=c(t);if(q.length){s=document.createElement("pre");s.innerHTML=y.value;y.value=k(q,c(s),x)}y.value=i(y.value,w,r);var u=t.className;if(!u.match("(\\s|^)(language-)?"+v+"(\\s|$)")){u=u?(u+" "+v):v}if(/MSIE [678]/.test(navigator.userAgent)&&t.tagName=="CODE"&&t.parentNode.tagName=="PRE"){s=t.parentNode;var p=document.createElement("div");p.innerHTML=""+y.value+"";t=p.firstChild.firstChild;p.firstChild.cN=s.cN;s.parentNode.replaceChild(p.firstChild,s)}else{t.innerHTML=y.value}t.className=u;t.result={language:v,kw:y.keyword_count,re:y.r};if(y.second_best){t.second_best={language:y.second_best.language,kw:y.second_best.keyword_count,re:y.second_best.r}}}function o(){if(o.called){return}o.called=true;var r=document.getElementsByTagName("pre");for(var p=0;p|=||=||=|\\?|\\[|\\{|\\(|\\^|\\^=|\\||\\|=|\\|\\||~";this.ER="(?![\\s\\S])";this.BE={b:"\\\\.",r:0};this.ASM={cN:"string",b:"'",e:"'",i:"\\n",c:[this.BE],r:0};this.QSM={cN:"string",b:'"',e:'"',i:"\\n",c:[this.BE],r:0};this.CLCM={cN:"comment",b:"//",e:"$"};this.CBLCLM={cN:"comment",b:"/\\*",e:"\\*/"};this.HCM={cN:"comment",b:"#",e:"$"};this.NM={cN:"number",b:this.NR,r:0};this.CNM={cN:"number",b:this.CNR,r:0};this.BNM={cN:"number",b:this.BNR,r:0};this.inherit=function(r,s){var p={};for(var q in r){p[q]=r[q]}if(s){for(var q in s){p[q]=s[q]}}return p}}();hljs.LANGUAGES.cpp=function(){var a={keyword:{"false":1,"int":1,"float":1,"while":1,"private":1,"char":1,"catch":1,"export":1,virtual:1,operator:2,sizeof:2,dynamic_cast:2,typedef:2,const_cast:2,"const":1,struct:1,"for":1,static_cast:2,union:1,namespace:1,unsigned:1,"long":1,"throw":1,"volatile":2,"static":1,"protected":1,bool:1,template:1,mutable:1,"if":1,"public":1,friend:2,"do":1,"return":1,"goto":1,auto:1,"void":2,"enum":1,"else":1,"break":1,"new":1,extern:1,using:1,"true":1,"class":1,asm:1,"case":1,typeid:1,"short":1,reinterpret_cast:2,"default":1,"double":1,register:1,explicit:1,signed:1,typename:1,"try":1,"this":1,"switch":1,"continue":1,wchar_t:1,inline:1,"delete":1,alignof:1,char16_t:1,char32_t:1,constexpr:1,decltype:1,noexcept:1,nullptr:1,static_assert:1,thread_local:1,restrict:1,_Bool:1,complex:1},built_in:{std:1,string:1,cin:1,cout:1,cerr:1,clog:1,stringstream:1,istringstream:1,ostringstream:1,auto_ptr:1,deque:1,list:1,queue:1,stack:1,vector:1,map:1,set:1,bitset:1,multiset:1,multimap:1,unordered_set:1,unordered_map:1,unordered_multiset:1,unordered_multimap:1,array:1,shared_ptr:1}};return{dM:{k:a,i:"",k:a,r:10,c:["self"]}]}}}();hljs.LANGUAGES.r={dM:{c:[hljs.HCM,{cN:"number",b:"\\b0[xX][0-9a-fA-F]+[Li]?\\b",e:hljs.IMMEDIATE_RE,r:0},{cN:"number",b:"\\b\\d+(?:[eE][+\\-]?\\d*)?L\\b",e:hljs.IMMEDIATE_RE,r:0},{cN:"number",b:"\\b\\d+\\.(?!\\d)(?:i\\b)?",e:hljs.IMMEDIATE_RE,r:1},{cN:"number",b:"\\b\\d+(?:\\.\\d*)?(?:[eE][+\\-]?\\d*)?i?\\b",e:hljs.IMMEDIATE_RE,r:0},{cN:"number",b:"\\.\\d+(?:[eE][+\\-]?\\d*)?i?\\b",e:hljs.IMMEDIATE_RE,r:1},{cN:"keyword",b:"(?:tryCatch|library|setGeneric|setGroupGeneric)\\b",e:hljs.IMMEDIATE_RE,r:10},{cN:"keyword",b:"\\.\\.\\.",e:hljs.IMMEDIATE_RE,r:10},{cN:"keyword",b:"\\.\\.\\d+(?![\\w.])",e:hljs.IMMEDIATE_RE,r:10},{cN:"keyword",b:"\\b(?:function)",e:hljs.IMMEDIATE_RE,r:2},{cN:"keyword",b:"(?:if|in|break|next|repeat|else|for|return|switch|while|try|stop|warning|require|attach|detach|source|setMethod|setClass)\\b",e:hljs.IMMEDIATE_RE,r:1},{cN:"literal",b:"(?:NA|NA_integer_|NA_real_|NA_character_|NA_complex_)\\b",e:hljs.IMMEDIATE_RE,r:10},{cN:"literal",b:"(?:NULL|TRUE|FALSE|T|F|Inf|NaN)\\b",e:hljs.IMMEDIATE_RE,r:1},{cN:"identifier",b:"[a-zA-Z.][a-zA-Z0-9._]*\\b",e:hljs.IMMEDIATE_RE,r:0},{cN:"operator",b:"|=||   Using R to Analyze G1GC Log Files   Using R to Analyze G1GC Log Files Introduction Working in Oracle Platform Integration gives an engineer opportunities to work on a wide array of technologies. My team’s goal is to make Oracle applications run best on the Solaris/SPARC platform. When looking for bottlenecks in a modern applications, one needs to be aware of not only how the CPUs and operating system are executing, but also network, storage, and in some cases, the Java Virtual Machine. I was recently presented with about 1.5 GB of Java Garbage First Garbage Collector log file data. If you’re not familiar with the subject, you might want to review Garbage First Garbage Collector Tuning by Monica Beckwith. The customer had been running Java HotSpot 1.6.0_31 to host a web application server. I was told that the Solaris/SPARC server was running a Java process launched using a commmand line that included the following flags: -d64 -Xms9g -Xmx9g -XX:+UseG1GC -XX:MaxGCPauseMillis=200 -XX:InitiatingHeapOccupancyPercent=80 -XX:PermSize=256m -XX:MaxPermSize=256m -XX:+PrintGC -XX:+PrintGCTimeStamps -XX:+PrintHeapAtGC -XX:+PrintGCDateStamps -XX:+PrintFlagsFinal -XX:+DisableExplicitGC -XX:+UnlockExperimentalVMOptions -XX:ParallelGCThreads=8 Several sources on the internet indicate that if I were to print out the 1.5 GB of log files, it would require enough paper to fill the bed of a pick up truck. Of course, it would be fruitless to try to scan the log files by hand. Tools will be required to summarize the contents of the log files. Others have encountered large Java garbage collection log files. There are existing tools to analyze the log files: IBM’s GC toolkit The chewiebug GCViewer gchisto HPjmeter Instead of using one of the other tools listed, I decide to parse the log files with standard Unix tools, and analyze the data with R. Data Cleansing The log files arrived in two different formats. I guess that the difference is that one set of log files was generated using a more verbose option, maybe -XX:+PrintHeapAtGC, and the other set of log files was generated without that option. Format 1 In some of the log files, the log files with the less verbose format, a single trace, i.e. the report of a singe garbage collection event, looks like this: {Heap before GC invocations=12280 (full 61): garbage-first heap total 9437184K, used 7499918K [0xfffffffd00000000, 0xffffffff40000000, 0xffffffff40000000) region size 4096K, 1 young (4096K), 0 survivors (0K) compacting perm gen total 262144K, used 144077K [0xffffffff40000000, 0xffffffff50000000, 0xffffffff50000000) the space 262144K, 54% used [0xffffffff40000000, 0xffffffff48cb3758, 0xffffffff48cb3800, 0xffffffff50000000) No shared spaces configured. 2014-05-14T07:24:00.988-0700: 60586.353: [GC pause (young) 7324M->7320M(9216M), 0.1567265 secs] Heap after GC invocations=12281 (full 61): garbage-first heap total 9437184K, used 7496533K [0xfffffffd00000000, 0xffffffff40000000, 0xffffffff40000000) region size 4096K, 0 young (0K), 0 survivors (0K) compacting perm gen total 262144K, used 144077K [0xffffffff40000000, 0xffffffff50000000, 0xffffffff50000000) the space 262144K, 54% used [0xffffffff40000000, 0xffffffff48cb3758, 0xffffffff48cb3800, 0xffffffff50000000) No shared spaces configured. } A simple grep can be used to extract a summary: $ grep "\[ GC pause (young" g1gc.log 2014-05-13T13:24:35.091-0700: 3.109: [GC pause (young) 20M->5029K(9216M), 0.0146328 secs] 2014-05-13T13:24:35.440-0700: 3.459: [GC pause (young) 9125K->6077K(9216M), 0.0086723 secs] 2014-05-13T13:24:37.581-0700: 5.599: [GC pause (young) 25M->8470K(9216M), 0.0203820 secs] 2014-05-13T13:24:42.686-0700: 10.704: [GC pause (young) 44M->15M(9216M), 0.0288848 secs] 2014-05-13T13:24:48.941-0700: 16.958: [GC pause (young) 51M->20M(9216M), 0.0491244 secs] 2014-05-13T13:24:56.049-0700: 24.066: [GC pause (young) 92M->26M(9216M), 0.0525368 secs] 2014-05-13T13:25:34.368-0700: 62.383: [GC pause (young) 602M->68M(9216M), 0.1721173 secs] But that format wasn't easily read into R, so I needed to be a bit more tricky. I used the following Unix command to create a summary file that was easy for R to read. $ echo "SecondsSinceLaunch BeforeSize AfterSize TotalSize RealTime" $ grep "\[GC pause (young" g1gc.log | grep -v mark | sed -e 's/[A-SU-z\(\),]/ /g' -e 's/->/ /' -e 's/: / /g' | more SecondsSinceLaunch BeforeSize AfterSize TotalSize RealTime 2014-05-13T13:24:35.091-0700 3.109 20 5029 9216 0.0146328 2014-05-13T13:24:35.440-0700 3.459 9125 6077 9216 0.0086723 2014-05-13T13:24:37.581-0700 5.599 25 8470 9216 0.0203820 2014-05-13T13:24:42.686-0700 10.704 44 15 9216 0.0288848 2014-05-13T13:24:48.941-0700 16.958 51 20 9216 0.0491244 2014-05-13T13:24:56.049-0700 24.066 92 26 9216 0.0525368 2014-05-13T13:25:34.368-0700 62.383 602 68 9216 0.1721173 Format 2 In some of the log files, the log files with the more verbose format, a single trace, i.e. the report of a singe garbage collection event, was more complicated than Format 1. Here is a text file with an example of a single G1GC trace in the second format. As you can see, it is quite complicated. It is nice that there is so much information available, but the level of detail can be overwhelming. I wrote this awk script (download) to summarize each trace on a single line. #!/usr/bin/env awk -f BEGIN { printf("SecondsSinceLaunch IncrementalCount FullCount UserTime SysTime RealTime BeforeSize AfterSize TotalSize\n") } ###################### # Save count data from lines that are at the start of each G1GC trace. # Each trace starts out like this: # {Heap before GC invocations=14 (full 0): # garbage-first heap total 9437184K, used 325496K [0xfffffffd00000000, 0xffffffff40000000, 0xffffffff40000000) ###################### /{Heap.*full/{ gsub ( "\\)" , "" ); nf=split($0,a,"="); split(a[2],b," "); getline; if ( match($0, "first") ) { G1GC=1; IncrementalCount=b[1]; FullCount=substr( b[3], 1, length(b[3])-1 ); } else { G1GC=0; } } ###################### # Pull out time stamps that are in lines with this format: # 2014-05-12T14:02:06.025-0700: 94.312: [GC pause (young), 0.08870154 secs] ###################### /GC pause/ { DateTime=$1; SecondsSinceLaunch=substr($2, 1, length($2)-1); } ###################### # Heap sizes are in lines that look like this: # [ 4842M->4838M(9216M)] ###################### /\[ .*]$/ { gsub ( "\\[" , "" ); gsub ( "\ \]" , "" ); gsub ( "->" , " " ); gsub ( "\\( " , " " ); gsub ( "\ \)" , " " ); split($0,a," "); if ( split(a[1],b,"M") > 1 ) {BeforeSize=b[1]*1024;} if ( split(a[1],b,"K") > 1 ) {BeforeSize=b[1];} if ( split(a[2],b,"M") > 1 ) {AfterSize=b[1]*1024;} if ( split(a[2],b,"K") > 1 ) {AfterSize=b[1];} if ( split(a[3],b,"M") > 1 ) {TotalSize=b[1]*1024;} if ( split(a[3],b,"K") > 1 ) {TotalSize=b[1];} } ###################### # Emit an output line when you find input that looks like this: # [Times: user=1.41 sys=0.08, real=0.24 secs] ###################### /\[Times/ { if (G1GC==1) { gsub ( "," , "" ); split($2,a,"="); UserTime=a[2]; split($3,a,"="); SysTime=a[2]; split($4,a,"="); RealTime=a[2]; print DateTime,SecondsSinceLaunch,IncrementalCount,FullCount,UserTime,SysTime,RealTime,BeforeSize,AfterSize,TotalSize; G1GC=0; } } The resulting summary is about 25X smaller that the original file, but still difficult for a human to digest. SecondsSinceLaunch IncrementalCount FullCount UserTime SysTime RealTime BeforeSize AfterSize TotalSize ... 2014-05-12T18:36:34.669-0700: 3985.744 561 0 0.57 0.06 0.16 1724416 1720320 9437184 2014-05-12T18:36:34.839-0700: 3985.914 562 0 0.51 0.06 0.19 1724416 1720320 9437184 2014-05-12T18:36:35.069-0700: 3986.144 563 0 0.60 0.04 0.27 1724416 1721344 9437184 2014-05-12T18:36:35.354-0700: 3986.429 564 0 0.33 0.04 0.09 1725440 1722368 9437184 2014-05-12T18:36:35.545-0700: 3986.620 565 0 0.58 0.04 0.17 1726464 1722368 9437184 2014-05-12T18:36:35.726-0700: 3986.801 566 0 0.43 0.05 0.12 1726464 1722368 9437184 2014-05-12T18:36:35.856-0700: 3986.930 567 0 0.30 0.04 0.07 1726464 1723392 9437184 2014-05-12T18:36:35.947-0700: 3987.023 568 0 0.61 0.04 0.26 1727488 1723392 9437184 2014-05-12T18:36:36.228-0700: 3987.302 569 0 0.46 0.04 0.16 1731584 1724416 9437184 Reading the Data into R Once the GC log data had been cleansed, either by processing the first format with the shell script, or by processing the second format with the awk script, it was easy to read the data into R. g1gc.df = read.csv("summary.txt", row.names = NULL, stringsAsFactors=FALSE,sep="") str(g1gc.df) ## 'data.frame': 8307 obs. of 10 variables: ## $ row.names : chr "2014-05-12T14:00:32.868-0700:" "2014-05-12T14:00:33.179-0700:" "2014-05-12T14:00:33.677-0700:" "2014-05-12T14:00:35.538-0700:" ... ## $ SecondsSinceLaunch: num 1.16 1.47 1.97 3.83 6.1 ... ## $ IncrementalCount : int 0 1 2 3 4 5 6 7 8 9 ... ## $ FullCount : int 0 0 0 0 0 0 0 0 0 0 ... ## $ UserTime : num 0.11 0.05 0.04 0.21 0.08 0.26 0.31 0.33 0.34 0.56 ... ## $ SysTime : num 0.04 0.01 0.01 0.05 0.01 0.06 0.07 0.06 0.07 0.09 ... ## $ RealTime : num 0.02 0.02 0.01 0.04 0.02 0.04 0.05 0.04 0.04 0.06 ... ## $ BeforeSize : int 8192 5496 5768 22528 24576 43008 34816 53248 55296 93184 ... ## $ AfterSize : int 1400 1672 2557 4907 7072 14336 16384 18432 19456 21504 ... ## $ TotalSize : int 9437184 9437184 9437184 9437184 9437184 9437184 9437184 9437184 9437184 9437184 ... head(g1gc.df) ## row.names SecondsSinceLaunch IncrementalCount ## 1 2014-05-12T14:00:32.868-0700: 1.161 0 ## 2 2014-05-12T14:00:33.179-0700: 1.472 1 ## 3 2014-05-12T14:00:33.677-0700: 1.969 2 ## 4 2014-05-12T14:00:35.538-0700: 3.830 3 ## 5 2014-05-12T14:00:37.811-0700: 6.103 4 ## 6 2014-05-12T14:00:41.428-0700: 9.720 5 ## FullCount UserTime SysTime RealTime BeforeSize AfterSize TotalSize ## 1 0 0.11 0.04 0.02 8192 1400 9437184 ## 2 0 0.05 0.01 0.02 5496 1672 9437184 ## 3 0 0.04 0.01 0.01 5768 2557 9437184 ## 4 0 0.21 0.05 0.04 22528 4907 9437184 ## 5 0 0.08 0.01 0.02 24576 7072 9437184 ## 6 0 0.26 0.06 0.04 43008 14336 9437184 Basic Statistics Once the data has been read into R, simple statistics are very easy to generate. All of the numbers from high school statistics are available via simple commands. For example, generate a summary of every column: summary(g1gc.df) ## row.names SecondsSinceLaunch IncrementalCount FullCount ## Length:8307 Min. : 1 Min. : 0 Min. : 0.0 ## Class :character 1st Qu.: 9977 1st Qu.:2048 1st Qu.: 0.0 ## Mode :character Median :12855 Median :4136 Median : 12.0 ## Mean :12527 Mean :4156 Mean : 31.6 ## 3rd Qu.:15758 3rd Qu.:6262 3rd Qu.: 61.0 ## Max. :55484 Max. :8391 Max. :113.0 ## UserTime SysTime RealTime BeforeSize ## Min. :0.040 Min. :0.0000 Min. : 0.0 Min. : 5476 ## 1st Qu.:0.470 1st Qu.:0.0300 1st Qu.: 0.1 1st Qu.:5137920 ## Median :0.620 Median :0.0300 Median : 0.1 Median :6574080 ## Mean :0.751 Mean :0.0355 Mean : 0.3 Mean :5841855 ## 3rd Qu.:0.920 3rd Qu.:0.0400 3rd Qu.: 0.2 3rd Qu.:7084032 ## Max. :3.370 Max. :1.5600 Max. :488.1 Max. :8696832 ## AfterSize TotalSize ## Min. : 1380 Min. :9437184 ## 1st Qu.:5002752 1st Qu.:9437184 ## Median :6559744 Median :9437184 ## Mean :5785454 Mean :9437184 ## 3rd Qu.:7054336 3rd Qu.:9437184 ## Max. :8482816 Max. :9437184 Q: What is the total amount of User CPU time spent in garbage collection? sum(g1gc.df$UserTime) ## [1] 6236 As you can see, less than two hours of CPU time was spent in garbage collection. Is that too much? To find the percentage of time spent in garbage collection, divide the number above by total_elapsed_time*CPU_count. In this case, there are a lot of CPU’s and it turns out the the overall amount of CPU time spent in garbage collection isn’t a problem when viewed in isolation. When calculating rates, i.e. events per unit time, you need to ask yourself if the rate is homogenous across the time period in the log file. Does the log file include spikes of high activity that should be separately analyzed? Averaging in data from nights and weekends with data from business hours may alias problems. If you have a reason to suspect that the garbage collection rates include peaks and valleys that need independent analysis, see the “Time Series” section, below. Q: How much garbage is collected on each pass? The amount of heap space that is recovered per GC pass is surprisingly low: At least one collection didn’t recover any data. (“Min.=0”) 25% of the passes recovered 3MB or less. (“1st Qu.=3072”) Half of the GC passes recovered 4MB or less. (“Median=4096”) The average amount recovered was 56MB. (“Mean=56390”) 75% of the passes recovered 36MB or less. (“3rd Qu.=36860”) At least one pass recovered 2GB. (“Max.=2121000”) g1gc.df$Delta = g1gc.df$BeforeSize - g1gc.df$AfterSize summary(g1gc.df$Delta) ## Min. 1st Qu. Median Mean 3rd Qu. Max. ## 0 3070 4100 56400 36900 2120000 Q: What is the maximum User CPU time for a single collection? The worst garbage collection (“Max.”) is many standard deviations away from the mean. The data appears to be right skewed. summary(g1gc.df$UserTime) ## Min. 1st Qu. Median Mean 3rd Qu. Max. ## 0.040 0.470 0.620 0.751 0.920 3.370 sd(g1gc.df$UserTime) ## [1] 0.3966 Basic Graphics Once the data is in R, it is trivial to plot the data with formats including dot plots, line charts, bar charts (simple, stacked, grouped), pie charts, boxplots, scatter plots histograms, and kernel density plots. Histogram of User CPU Time per Collection I don't think that this graph requires any explanation. hist(g1gc.df$UserTime, main="User CPU Time per Collection", xlab="Seconds", ylab="Frequency") Box plot to identify outliers When the initial data is viewed with a box plot, you can see the one crazy outlier in the real time per GC. Save this data point for future analysis and drop the outlier so that it’s not throwing off our statistics. Now the box plot shows many outliers, which will be examined later, using times series analysis. Notice that the scale of the x-axis changes drastically once the crazy outlier is removed. par(mfrow=c(2,1)) boxplot(g1gc.df$UserTime,g1gc.df$SysTime,g1gc.df$RealTime, main="Box Plot of Time per GC\n(dominated by a crazy outlier)", names=c("usr","sys","elapsed"), xlab="Seconds per GC", ylab="Time (Seconds)", horizontal = TRUE, outcol="red") crazy.outlier.df=g1gc.df[g1gc.df$RealTime > 400,] g1gc.df=g1gc.df[g1gc.df$RealTime < 400,] boxplot(g1gc.df$UserTime,g1gc.df$SysTime,g1gc.df$RealTime, main="Box Plot of Time per GC\n(crazy outlier excluded)", names=c("usr","sys","elapsed"), xlab="Seconds per GC", ylab="Time (Seconds)", horizontal = TRUE, outcol="red") box(which = "outer", lty = "solid") Here is the crazy outlier for future analysis: crazy.outlier.df ## row.names SecondsSinceLaunch IncrementalCount ## 8233 2014-05-12T23:15:43.903-0700: 20741 8316 ## FullCount UserTime SysTime RealTime BeforeSize AfterSize TotalSize ## 8233 112 0.55 0.42 488.1 8381440 8235008 9437184 ## Delta ## 8233 146432 R Time Series Data To analyze the garbage collection as a time series, I’ll use Z’s Ordered Observations (zoo). “zoo is the creator for an S3 class of indexed totally ordered observations which includes irregular time series.” require(zoo) ## Loading required package: zoo ## ## Attaching package: 'zoo' ## ## The following objects are masked from 'package:base': ## ## as.Date, as.Date.numeric head(g1gc.df[,1]) ## [1] "2014-05-12T14:00:32.868-0700:" "2014-05-12T14:00:33.179-0700:" ## [3] "2014-05-12T14:00:33.677-0700:" "2014-05-12T14:00:35.538-0700:" ## [5] "2014-05-12T14:00:37.811-0700:" "2014-05-12T14:00:41.428-0700:" options("digits.secs"=3) times=as.POSIXct( g1gc.df[,1], format="%Y-%m-%dT%H:%M:%OS%z:") g1gc.z = zoo(g1gc.df[,-c(1)], order.by=times) head(g1gc.z) ## SecondsSinceLaunch IncrementalCount FullCount ## 2014-05-12 17:00:32.868 1.161 0 0 ## 2014-05-12 17:00:33.178 1.472 1 0 ## 2014-05-12 17:00:33.677 1.969 2 0 ## 2014-05-12 17:00:35.538 3.830 3 0 ## 2014-05-12 17:00:37.811 6.103 4 0 ## 2014-05-12 17:00:41.427 9.720 5 0 ## UserTime SysTime RealTime BeforeSize AfterSize ## 2014-05-12 17:00:32.868 0.11 0.04 0.02 8192 1400 ## 2014-05-12 17:00:33.178 0.05 0.01 0.02 5496 1672 ## 2014-05-12 17:00:33.677 0.04 0.01 0.01 5768 2557 ## 2014-05-12 17:00:35.538 0.21 0.05 0.04 22528 4907 ## 2014-05-12 17:00:37.811 0.08 0.01 0.02 24576 7072 ## 2014-05-12 17:00:41.427 0.26 0.06 0.04 43008 14336 ## TotalSize Delta ## 2014-05-12 17:00:32.868 9437184 6792 ## 2014-05-12 17:00:33.178 9437184 3824 ## 2014-05-12 17:00:33.677 9437184 3211 ## 2014-05-12 17:00:35.538 9437184 17621 ## 2014-05-12 17:00:37.811 9437184 17504 ## 2014-05-12 17:00:41.427 9437184 28672 Example of Two Benchmark Runs in One Log File The data in the following graph is from a different log file, not the one of primary interest to this article. I’m including this image because it is an example of idle periods followed by busy periods. It would be uninteresting to average the rate of garbage collection over the entire log file period. More interesting would be the rate of garbage collect in the two busy periods. Are they the same or different? Your production data may be similar, for example, bursts when employees return from lunch and idle times on weekend evenings, etc. Once the data is in an R Time Series, you can analyze isolated time windows. Clipping the Time Series data Flashing back to our test case… Viewing the data as a time series is interesting. You can see that the work intensive time period is between 9:00 PM and 3:00 AM. Lets clip the data to the interesting period:     par(mfrow=c(2,1)) plot(g1gc.z$UserTime, type="h", main="User Time per GC\nTime: Complete Log File", xlab="Time of Day", ylab="CPU Seconds per GC", col="#1b9e77") clipped.g1gc.z=window(g1gc.z, start=as.POSIXct("2014-05-12 21:00:00"), end=as.POSIXct("2014-05-13 03:00:00")) plot(clipped.g1gc.z$UserTime, type="h", main="User Time per GC\nTime: Limited to Benchmark Execution", xlab="Time of Day", ylab="CPU Seconds per GC", col="#1b9e77") box(which = "outer", lty = "solid") Cumulative Incremental and Full GC count Here is the cumulative incremental and full GC count. When the line is very steep, it indicates that the GCs are repeating very quickly. Notice that the scale on the Y axis is different for full vs. incremental. plot(clipped.g1gc.z[,c(2:3)], main="Cumulative Incremental and Full GC count", xlab="Time of Day", col="#1b9e77") GC Analysis of Benchmark Execution using Time Series data In the following series of 3 graphs: The “After Size” show the amount of heap space in use after each garbage collection. Many Java objects are still referenced, i.e. alive, during each garbage collection. This may indicate that the application has a memory leak, or may indicate that the application has a very large memory footprint. Typically, an application's memory footprint plateau's in the early stage of execution. One would expect this graph to have a flat top. The steep decline in the heap space may indicate that the application crashed after 2:00. The second graph shows that the outliers in real execution time, discussed above, occur near 2:00. when the Java heap seems to be quite full. The third graph shows that Full GCs are infrequent during the first few hours of execution. The rate of Full GC's, (the slope of the cummulative Full GC line), changes near midnight.   plot(clipped.g1gc.z[,c("AfterSize","RealTime","FullCount")], xlab="Time of Day", col=c("#1b9e77","red","#1b9e77")) GC Analysis of heap recovered Each GC trace includes the amount of heap space in use before and after the individual GC event. During garbage coolection, unreferenced objects are identified, the space holding the unreferenced objects is freed, and thus, the difference in before and after usage indicates how much space has been freed. The following box plot and bar chart both demonstrate the same point - the amount of heap space freed per garbage colloection is surprisingly low. par(mfrow=c(2,1)) boxplot(as.vector(clipped.g1gc.z$Delta), main="Amount of Heap Recovered per GC Pass", xlab="Size in KB", horizontal = TRUE, col="red") hist(as.vector(clipped.g1gc.z$Delta), main="Amount of Heap Recovered per GC Pass", xlab="Size in KB", breaks=100, col="red") box(which = "outer", lty = "solid") This graph is the most interesting. The dark blue area shows how much heap is occupied by referenced Java objects. This represents memory that holds live data. The red fringe at the top shows how much data was recovered after each garbage collection. barplot(clipped.g1gc.z[,c("AfterSize","Delta")], col=c("#7570b3","#e7298a"), xlab="Time of Day", border=NA) legend("topleft", c("Live Objects","Heap Recovered on GC"), fill=c("#7570b3","#e7298a")) box(which = "outer", lty = "solid") When I discuss the data in the log files with the customer, I will ask for an explaination for the large amount of referenced data resident in the Java heap. There are two are posibilities: There is a memory leak and the amount of space required to hold referenced objects will continue to grow, limited only by the maximum heap size. After the maximum heap size is reached, the JVM will throw an “Out of Memory” exception every time that the application tries to allocate a new object. If this is the case, the aplication needs to be debugged to identify why old objects are referenced when they are no longer needed. The application has a legitimate requirement to keep a large amount of data in memory. The customer may want to further increase the maximum heap size. Another possible solution would be to partition the application across multiple cluster nodes, where each node has responsibility for managing a unique subset of the data. Conclusion In conclusion, R is a very powerful tool for the analysis of Java garbage collection log files. The primary difficulty is data cleansing so that information can be read into an R data frame. Once the data has been read into R, a rich set of tools may be used for thorough evaluation.

    Read the article

  • JavaOne Latin America 2012 is a wrap!

    - by arungupta
    Third JavaOne in Latin America (2010, 2011) is now a wrap! Like last year, the event started with a Geek Bike Ride. I could not attend the bike ride because of pre-planned activities but heard lots of good comments about it afterwards. This is a great way to engage with JavaOne attendees in an informal setting. I highly recommend you joining next time! JavaOne Blog provides a a great coverage for the opening keynotes. I talked about all the great set of functionality that is coming in the Java EE 7 Platform. Also shared the details on how Java EE 7 JSRs are willing to take help from the Adopt-a-JSR program. glassfish.org/adoptajsr bridges the gap between JUGs willing to participate and looking for areas on where to help. The different specification leads have identified areas on where they are looking for feedback. So if you are JUG is interested in picking a JSR, I recommend to take a look at glassfish.org/adoptajsr and jump on the bandwagon. The main attraction for the Tuesday evening was the GlassFish Party. The party was packed with Latin American JUG leaders, execs from Oracle, and local community members. Free flowing food and beer/caipirinhas acted as great lubricant for great conversations. Some of them were considering the migration from Spring -> Java EE 6 and replacing their primary app server with GlassFish. Locaweb, a local hosting provider sponsored a round of beer at the party as well. They are planning to come with Java EE hosting next year and GlassFish would be a logical choice for them ;) I heard lots of positive feedback about the party afterwards. Many thanks to Bruno Borges for organizing a great party! Check out some more fun pictures of the party! Next day, I gave a presentation on "The Java EE 7 Platform: Productivity and HTML 5" and the slides are now available: With so much new content coming in the plaform: Java Caching API (JSR 107) Concurrency Utilities for Java EE (JSR 236) Batch Applications for the Java Platform (JSR 352) Java API for JSON (JSR 353) Java API for WebSocket (JSR 356) And JAX-RS 2.0 (JSR 339) and JMS 2.0 (JSR 343) getting major updates, there is definitely lot of excitement that was evident amongst the attendees. The talk was delivered in the biggest hall and had about 200 attendees. Also spent a lot of time talking to folks at the OTN Lounge. The JUG leaders appreciation dinner in the evening had its usual share of fun. Day 3 started with a session on "Building HTML5 WebSocket Apps in Java". The slides are now available: The room was packed with about 150 attendees and there was good interaction in the room as well. A collaborative whiteboard built using WebSocket was very well received. The following tweets made it more worthwhile: A WebSocket speek, by @ArunGupta, was worth every hour lost in transit. #JavaOneBrasil2012, #JavaOneBr @arungupta awesome presentation about WebSockets :) The session was immediately followed by the hands-on lab "Developing JAX-RS Web Applications Utilizing Server-Sent Events and WebSocket". The lab covers JAX-RS 2.0, Jersey-specific features such as Server-Sent Events, and a WebSocket endpoint using JSR 356. The complete self-paced lab guide can be downloaded from here. The lab was planned for 2 hours but several folks finished the entire exercise in about 75 mins. The wonderfully written lab material and an added incentive of Java EE 6 Pocket Guide did the trick ;-) I also spoke at "The Java Community Process: How You Can Make a Positive Difference". It was really great to see several JUG leaders talking about Adopt-a-JSR program and other activities that attendees can do to participate in the JCP. I shared details about Adopt a Java EE 7 JSR as well. The community keynote in the evening was looking fun but I had to leave in between to go through the peak Sao Paulo traffic time :) Enjoy the complete set of pictures in the album:

    Read the article

< Previous Page | 530 531 532 533 534 535 536 537 538 539 540 541  | Next Page >