rich baxter - Page 74 - Developer IT

Silverlight Recruiting Application Part 5 - Jobs Module / View

Now we starting getting into a more code-heavy portion of this series, thankfully though this means the groundwork is all set for the most part and after adding the modules we will have a complete application that can be provided with full source. The Jobs module will have two concerns- adding and maintaining jobs that can then be broadcast out to the website. How they are displayed on the site will be handled by our admin system (which will just poll from this common database), so we aren't too concerned with that, but rather with getting the information into the system and allowing the backend administration/HR users to keep things up to date. Since there is a fair bit of information that we want to display, we're going to move editing to a separate view so we can get all that information in an easy-to-use spot. With all the files created for this module, the project looks something like this: And now... on to the code. XAML for the Job Posting View All we really need for the Job Posting View is a RadGridView and a few buttons. This will let us both show off records and perform operations on the records without much hassle. That XAML is going to look something like this: 01.<Grid x:Name="LayoutRoot" 02.Background="White"> 03.<Grid.RowDefinitions> 04.<RowDefinition Height="30" /> 05.<RowDefinition /> 06.</Grid.RowDefinitions> 07.<StackPanel Orientation="Horizontal"> 08.<Button x:Name="xAddRecordButton" 09.Content="Add Job" 10.Width="120" 11.cal:Click.Command="{Binding AddRecord}" 12.telerik:StyleManager.Theme="Windows7" /> 13.<Button x:Name="xEditRecordButton" 14.Content="Edit Job" 15.Width="120" 16.cal:Click.Command="{Binding EditRecord}" 17.telerik:StyleManager.Theme="Windows7" /> 18.</StackPanel> 19.<telerikGrid:RadGridView x:Name="xJobsGrid" 20.Grid.Row="1" 21.IsReadOnly="True" 22.AutoGenerateColumns="False" 23.ColumnWidth="*" 24.RowDetailsVisibilityMode="VisibleWhenSelected" 25.ItemsSource="{Binding MyJobs}" 26.SelectedItem="{Binding SelectedJob, Mode=TwoWay}" 27.command:SelectedItemChangedEventClass.Command="{Binding SelectedItemChanged}"> 28.<telerikGrid:RadGridView.Columns> 29.<telerikGrid:GridViewDataColumn Header="Job Title" 30.DataMemberBinding="{Binding JobTitle}" 31.UniqueName="JobTitle" /> 32.<telerikGrid:GridViewDataColumn Header="Location" 33.DataMemberBinding="{Binding Location}" 34.UniqueName="Location" /> 35.<telerikGrid:GridViewDataColumn Header="Resume Required" 36.DataMemberBinding="{Binding NeedsResume}" 37.UniqueName="NeedsResume" /> 38.<telerikGrid:GridViewDataColumn Header="CV Required" 39.DataMemberBinding="{Binding NeedsCV}" 40.UniqueName="NeedsCV" /> 41.<telerikGrid:GridViewDataColumn Header="Overview Required" 42.DataMemberBinding="{Binding NeedsOverview}" 43.UniqueName="NeedsOverview" /> 44.<telerikGrid:GridViewDataColumn Header="Active" 45.DataMemberBinding="{Binding IsActive}" 46.UniqueName="IsActive" /> 47.</telerikGrid:RadGridView.Columns> 48.</telerikGrid:RadGridView> 49.</Grid> I'll explain what's happening here by line numbers: Lines 11 and 16: Using the same type of click commands as we saw in the Menu module, we tie the button clicks to delegate commands in the viewmodel. Line 25: The source for the jobs will be a collection in the viewmodel. Line 26: We also bind the selected item to a public property from the viewmodel for use in code. Line 27: We've turned the event into a command so we can handle it via code in the viewmodel. So those first three probably make sense to you as far as Silverlight/WPF binding magic is concerned, but for line 27... This actually comes from something I read onDamien Schenkelman's blog back in the day for creating an attached behavior from any event. So, any time you see me using command:Whatever.Command, the backing for it is actually something like this: SelectedItemChangedEventBehavior.cs: 01.public class SelectedItemChangedEventBehavior : CommandBehaviorBase<Telerik.Windows.Controls.DataControl> 02.{ 03.public SelectedItemChangedEventBehavior(DataControl element) 04.: base(element) 05.{ 06.element.SelectionChanged += new EventHandler<SelectionChangeEventArgs>(element_SelectionChanged); 07.} 08.void element_SelectionChanged(object sender, SelectionChangeEventArgs e) 09.{ 10.// We'll only ever allow single selection, so will only need item index 0 11.base.CommandParameter = e.AddedItems[0]; 12.base.ExecuteCommand(); 13.} 14.} SelectedItemChangedEventClass.cs: 01.public class SelectedItemChangedEventClass 02.{ 03.#region The Command Stuff 04.public static ICommand GetCommand(DependencyObject obj) 05.{ 06.return (ICommand)obj.GetValue(CommandProperty); 07.} 08.public static void SetCommand(DependencyObject obj, ICommand value) 09.{ 10.obj.SetValue(CommandProperty, value); 11.} 12.public static readonly DependencyProperty CommandProperty = 13.DependencyProperty.RegisterAttached("Command", typeof(ICommand), 14.typeof(SelectedItemChangedEventClass), new PropertyMetadata(OnSetCommandCallback)); 15.public static void OnSetCommandCallback(DependencyObject dependencyObject, DependencyPropertyChangedEventArgs e) 16.{ 17.DataControl element = dependencyObject as DataControl; 18.if (element != null) 19.{ 20.SelectedItemChangedEventBehavior behavior = GetOrCreateBehavior(element); 21.behavior.Command = e.NewValue as ICommand; 22.} 23.} 24.#endregion 25.public static SelectedItemChangedEventBehavior GetOrCreateBehavior(DataControl element) 26.{ 27.SelectedItemChangedEventBehavior behavior = element.GetValue(SelectedItemChangedEventBehaviorProperty) as SelectedItemChangedEventBehavior; 28.if (behavior == null) 29.{ 30.behavior = new SelectedItemChangedEventBehavior(element); 31.element.SetValue(SelectedItemChangedEventBehaviorProperty, behavior); 32.} 33.return behavior; 34.} 35.public static SelectedItemChangedEventBehavior GetSelectedItemChangedEventBehavior(DependencyObject obj) 36.{ 37.return (SelectedItemChangedEventBehavior)obj.GetValue(SelectedItemChangedEventBehaviorProperty); 38.} 39.public static void SetSelectedItemChangedEventBehavior(DependencyObject obj, SelectedItemChangedEventBehavior value) 40.{ 41.obj.SetValue(SelectedItemChangedEventBehaviorProperty, value); 42.} 43.public static readonly DependencyProperty SelectedItemChangedEventBehaviorProperty = 44.DependencyProperty.RegisterAttached("SelectedItemChangedEventBehavior", 45.typeof(SelectedItemChangedEventBehavior), typeof(SelectedItemChangedEventClass), null); 46.} These end up looking very similar from command to command, but in a nutshell you create a command based on any event, determine what the parameter for it will be, then execute. It attaches via XAML and ties to a DelegateCommand in the viewmodel, so you get the full event experience (since some controls get a bit event-rich for added functionality). Simple enough, right? Viewmodel for the Job Posting View The Viewmodel is going to need to handle all events going back and forth, maintaining interactions with the data we are using, and both publishing and subscribing to events. Rather than breaking this into tons of little pieces, I'll give you a nice view of the entire viewmodel and then hit up the important points line-by-line: 001.public class JobPostingViewModel : ViewModelBase 002.{ 003.private readonly IEventAggregator eventAggregator; 004.private readonly IRegionManager regionManager; 005.public DelegateCommand<object> AddRecord { get; set; } 006.public DelegateCommand<object> EditRecord { get; set; } 007.public DelegateCommand<object> SelectedItemChanged { get; set; } 008.public RecruitingContext context; 009.private QueryableCollectionView _myJobs; 010.public QueryableCollectionView MyJobs 011.{ 012.get { return _myJobs; } 013.} 014.private QueryableCollectionView _selectionJobActionHistory; 015.public QueryableCollectionView SelectedJobActionHistory 016.{ 017.get { return _selectionJobActionHistory; } 018.} 019.private JobPosting _selectedJob; 020.public JobPosting SelectedJob 021.{ 022.get { return _selectedJob; } 023.set 024.{ 025.if (value != _selectedJob) 026.{ 027._selectedJob = value; 028.NotifyChanged("SelectedJob"); 029.} 030.} 031.} 032.public SubscriptionToken editToken = new SubscriptionToken(); 033.public SubscriptionToken addToken = new SubscriptionToken(); 034.public JobPostingViewModel(IEventAggregator eventAgg, IRegionManager regionmanager) 035.{ 036.// set Unity items 037.this.eventAggregator = eventAgg; 038.this.regionManager = regionmanager; 039.// load our context 040.context = new RecruitingContext(); 041.this._myJobs = new QueryableCollectionView(context.JobPostings); 042.context.Load(context.GetJobPostingsQuery()); 043.// set command events 044.this.AddRecord = new DelegateCommand<object>(this.AddNewRecord); 045.this.EditRecord = new DelegateCommand<object>(this.EditExistingRecord); 046.this.SelectedItemChanged = new DelegateCommand<object>(this.SelectedRecordChanged); 047.SetSubscriptions(); 048.} 049.#region DelegateCommands from View 050.public void AddNewRecord(object obj) 051.{ 052.this.eventAggregator.GetEvent<AddJobEvent>().Publish(true); 053.} 054.public void EditExistingRecord(object obj) 055.{ 056.if (_selectedJob == null) 057.{ 058.this.eventAggregator.GetEvent<NotifyUserEvent>().Publish("No job selected."); 059.} 060.else 061.{ 062.this._myJobs.EditItem(this._selectedJob); 063.this.eventAggregator.GetEvent<EditJobEvent>().Publish(this._selectedJob); 064.} 065.} 066.public void SelectedRecordChanged(object obj) 067.{ 068.if (obj.GetType() == typeof(ActionHistory)) 069.{ 070.// event bubbles up so we don't catch items from the ActionHistory grid 071.} 072.else 073.{ 074.JobPosting job = obj as JobPosting; 075.GrabHistory(job.PostingID); 076.} 077.} 078.#endregion 079.#region Subscription Declaration and Events 080.public void SetSubscriptions() 081.{ 082.EditJobCompleteEvent editComplete = eventAggregator.GetEvent<EditJobCompleteEvent>(); 083.if (editToken != null) 084.editComplete.Unsubscribe(editToken); 085.editToken = editComplete.Subscribe(this.EditCompleteEventHandler); 086.AddJobCompleteEvent addComplete = eventAggregator.GetEvent<AddJobCompleteEvent>(); 087.if (addToken != null) 088.addComplete.Unsubscribe(addToken); 089.addToken = addComplete.Subscribe(this.AddCompleteEventHandler); 090.} 091.public void EditCompleteEventHandler(bool complete) 092.{ 093.if (complete) 094.{ 095.JobPosting thisJob = _myJobs.CurrentEditItem as JobPosting; 096.this._myJobs.CommitEdit(); 097.this.context.SubmitChanges((s) => 098.{ 099.ActionHistory myAction = new ActionHistory(); 100.myAction.PostingID = thisJob.PostingID; 101.myAction.Description = String.Format("Job '{0}' has been edited by {1}", thisJob.JobTitle, "default user"); 102.myAction.TimeStamp = DateTime.Now; 103.eventAggregator.GetEvent<AddActionEvent>().Publish(myAction); 104.} 105., null); 106.} 107.else 108.{ 109.this._myJobs.CancelEdit(); 110.} 111.this.MakeMeActive(this.regionManager, "MainRegion", "JobPostingsView"); 112.} 113.public void AddCompleteEventHandler(JobPosting job) 114.{ 115.if (job == null) 116.{ 117.// do nothing, new job add cancelled 118.} 119.else 120.{ 121.this.context.JobPostings.Add(job); 122.this.context.SubmitChanges((s) => 123.{ 124.ActionHistory myAction = new ActionHistory(); 125.myAction.PostingID = job.PostingID; 126.myAction.Description = String.Format("Job '{0}' has been added by {1}", job.JobTitle, "default user"); 127.myAction.TimeStamp = DateTime.Now; 128.eventAggregator.GetEvent<AddActionEvent>().Publish(myAction); 129.} 130., null); 131.} 132.this.MakeMeActive(this.regionManager, "MainRegion", "JobPostingsView"); 133.} 134.#endregion 135.public void GrabHistory(int postID) 136.{ 137.context.ActionHistories.Clear(); 138._selectionJobActionHistory = new QueryableCollectionView(context.ActionHistories); 139.context.Load(context.GetHistoryForJobQuery(postID)); 140.} Taking it from the top, we're injecting an Event Aggregator and Region Manager for use down the road and also have the public DelegateCommands (just like in the Menu module). We also grab a reference to our context, which we'll obviously need for data, then set up a few fields with public properties tied to them. We're also setting subscription tokens, which we have not yet seen but I will get into below. The AddNewRecord (50) and EditExistingRecord (54) methods should speak for themselves for functionality, the one thing of note is we're sending events off to the Event Aggregator which some module, somewhere will take care of. Since these aren't entirely relying on one another, the Jobs View doesn't care if anyone is listening, but it will publish AddJobEvent (52), NotifyUserEvent (58) and EditJobEvent (63)regardless. Don't mind the GrabHistory() method so much, that is just grabbing history items (visibly being created in the SubmitChanges callbacks), and adding them to the database. Every action will trigger a history event, so we'll know who modified what and when, just in case. ;) So where are we at? Well, if we click to Add a job, we publish an event, if we edit a job, we publish an event with the selected record (attained through the magic of binding). Where is this all going though? To the Viewmodel, of course! XAML for the AddEditJobView This is pretty straightforward except for one thing, noted below: 001.<Grid x:Name="LayoutRoot" 002.Background="White"> 003.<Grid x:Name="xEditGrid" 004.Margin="10" 005.validationHelper:ValidationScope.Errors="{Binding Errors}"> 006.<Grid.Background> 007.<LinearGradientBrush EndPoint="0.5,1" 008.StartPoint="0.5,0"> 009.<GradientStop Color="#FFC7C7C7" 010.Offset="0" /> 011.<GradientStop Color="#FFF6F3F3" 012.Offset="1" /> 013.</LinearGradientBrush> 014.</Grid.Background> 015.<Grid.RowDefinitions> 016.<RowDefinition Height="40" /> 017.<RowDefinition Height="40" /> 018.<RowDefinition Height="40" /> 019.<RowDefinition Height="100" /> 020.<RowDefinition Height="100" /> 021.<RowDefinition Height="100" /> 022.<RowDefinition Height="40" /> 023.<RowDefinition Height="40" /> 024.<RowDefinition Height="40" /> 025.</Grid.RowDefinitions> 026.<Grid.ColumnDefinitions> 027.<ColumnDefinition Width="150" /> 028.<ColumnDefinition Width="150" /> 029.<ColumnDefinition Width="300" /> 030.<ColumnDefinition Width="100" /> 031.</Grid.ColumnDefinitions> 032. 033.<TextBlock Margin="8" 034.Text="{Binding AddEditString}" 035.TextWrapping="Wrap" 036.Grid.Column="1" 037.Grid.ColumnSpan="2" 038.FontSize="16" /> 039. 040. 041.<TextBlock Margin="8,0,0,0" 042.Style="{StaticResource LabelTxb}" 043.Grid.Row="1" 044.Text="Job Title" 045.VerticalAlignment="Center" /> 046.<TextBox x:Name="xJobTitleTB" 047.Margin="0,8" 048.Grid.Column="1" 049.Grid.Row="1" 050.Text="{Binding activeJob.JobTitle, Mode=TwoWay, NotifyOnValidationError=True, ValidatesOnExceptions=True}" 051.Grid.ColumnSpan="2" /> 052.<TextBlock Margin="8,0,0,0" 053.Grid.Row="2" 054.Text="Location" 055.d:LayoutOverrides="Height" 056.VerticalAlignment="Center" /> 057.<TextBox x:Name="xLocationTB" 058.Margin="0,8" 059.Grid.Column="1" 060.Grid.Row="2" 061.Text="{Binding activeJob.Location, Mode=TwoWay, NotifyOnValidationError=True, ValidatesOnExceptions=True}" 062.Grid.ColumnSpan="2" /> 063. 064.<TextBlock Margin="8,11,8,0" 065.Grid.Row="3" 066.Text="Description" 067.TextWrapping="Wrap" 068.VerticalAlignment="Top" /> 069. 070.<TextBox x:Name="xDescriptionTB" 071.Height="84" 072.TextWrapping="Wrap" 073.ScrollViewer.VerticalScrollBarVisibility="Auto" 074.Grid.Column="1" 075.Grid.Row="3" 076.Text="{Binding activeJob.Description, Mode=TwoWay, NotifyOnValidationError=True, ValidatesOnExceptions=True}" 077.Grid.ColumnSpan="2" /> 078.<TextBlock Margin="8,11,8,0" 079.Grid.Row="4" 080.Text="Requirements" 081.TextWrapping="Wrap" 082.VerticalAlignment="Top" /> 083. 084.<TextBox x:Name="xRequirementsTB" 085.Height="84" 086.TextWrapping="Wrap" 087.ScrollViewer.VerticalScrollBarVisibility="Auto" 088.Grid.Column="1" 089.Grid.Row="4" 090.Text="{Binding activeJob.Requirements, Mode=TwoWay, NotifyOnValidationError=True, ValidatesOnExceptions=True}" 091.Grid.ColumnSpan="2" /> 092.<TextBlock Margin="8,11,8,0" 093.Grid.Row="5" 094.Text="Qualifications" 095.TextWrapping="Wrap" 096.VerticalAlignment="Top" /> 097. 098.<TextBox x:Name="xQualificationsTB" 099.Height="84" 100.TextWrapping="Wrap" 101.ScrollViewer.VerticalScrollBarVisibility="Auto" 102.Grid.Column="1" 103.Grid.Row="5" 104.Text="{Binding activeJob.Qualifications, Mode=TwoWay, NotifyOnValidationError=True, ValidatesOnExceptions=True}" 105.Grid.ColumnSpan="2" /> 106. 107. 108.<CheckBox x:Name="xResumeRequiredCB" Margin="8,8,8,15" 109.Content="Resume Required" 110.Grid.Row="6" 111.Grid.ColumnSpan="2" 112.IsChecked="{Binding activeJob.NeedsResume, Mode=TwoWay, NotifyOnValidationError=True, ValidatesOnExceptions=True}"/> 113. 114.<CheckBox x:Name="xCoverletterRequiredCB" Margin="8,8,8,15" 115.Content="Cover Letter Required" 116.Grid.Column="2" 117.Grid.Row="6" 118.IsChecked="{Binding activeJob.NeedsCV, Mode=TwoWay, NotifyOnValidationError=True, ValidatesOnExceptions=True}"/> 119. 120.<CheckBox x:Name="xOverviewRequiredCB" Margin="8,8,8,15" 121.Content="Overview Required" 122.Grid.Row="7" 123.Grid.ColumnSpan="2" 124.IsChecked="{Binding activeJob.NeedsOverview, Mode=TwoWay, NotifyOnValidationError=True, ValidatesOnExceptions=True}"/> 125. 126.<CheckBox x:Name="xJobActiveCB" Margin="8,8,8,15" 127.Content="Job is Active" 128.Grid.Column="2" 129.Grid.Row="7" 130.IsChecked="{Binding activeJob.IsActive, Mode=TwoWay, NotifyOnValidationError=True, ValidatesOnExceptions=True}"/> 131. 132. 133. 134.<Button x:Name="xAddEditButton" Margin="8,8,0,10" 135.Content="{Binding AddEditButtonString}" 136.cal:Click.Command="{Binding AddEditCommand}" 137.Grid.Column="2" 138.Grid.Row="8" 139.HorizontalAlignment="Left" 140.Width="125" 141.telerik:StyleManager.Theme="Windows7" /> 142. 143.<Button x:Name="xCancelButton" HorizontalAlignment="Right" 144.Content="Cancel" 145.cal:Click.Command="{Binding CancelCommand}" 146.Margin="0,8,8,10" 147.Width="125" 148.Grid.Column="2" 149.Grid.Row="8" 150.telerik:StyleManager.Theme="Windows7" /> 151.</Grid> 152.</Grid> The 'validationHelper:ValidationScope' line may seem odd. This is a handy little trick for catching current and would-be validation errors when working in this whole setup. This all comes from an approach found on theJoy Of Code blog, although it looks like the story for this will be changing slightly with new advances in SL4/WCF RIA Services, so this section can definitely get an overhaul a little down the road. The code is the fun part of all this, so let us see what's happening under the hood. Viewmodel for the AddEditJobView We are going to see some of the same things happening here, so I'll skip over the repeat info and get right to the good stuff: 001.public class AddEditJobViewModel : ViewModelBase 002.{ 003.private readonly IEventAggregator eventAggregator; 004.private readonly IRegionManager regionManager; 005. 006.public RecruitingContext context; 007. 008.private JobPosting _activeJob; 009.public JobPosting activeJob 010.{ 011.get { return _activeJob; } 012.set 013.{ 014.if (_activeJob != value) 015.{ 016._activeJob = value; 017.NotifyChanged("activeJob"); 018.} 019.} 020.} 021. 022.public bool isNewJob; 023. 024.private string _addEditString; 025.public string AddEditString 026.{ 027.get { return _addEditString; } 028.set 029.{ 030.if (_addEditString != value) 031.{ 032._addEditString = value; 033.NotifyChanged("AddEditString"); 034.} 035.} 036.} 037. 038.private string _addEditButtonString; 039.public string AddEditButtonString 040.{ 041.get { return _addEditButtonString; } 042.set 043.{ 044.if (_addEditButtonString != value) 045.{ 046._addEditButtonString = value; 047.NotifyChanged("AddEditButtonString"); 048.} 049.} 050.} 051. 052.public SubscriptionToken addJobToken = new SubscriptionToken(); 053.public SubscriptionToken editJobToken = new SubscriptionToken(); 054. 055.public DelegateCommand<object> AddEditCommand { get; set; } 056.public DelegateCommand<object> CancelCommand { get; set; } 057. 058.private ObservableCollection<ValidationError> _errors = new ObservableCollection<ValidationError>(); 059.public ObservableCollection<ValidationError> Errors 060.{ 061.get { return _errors; } 062.} 063. 064.private ObservableCollection<ValidationResult> _valResults = new ObservableCollection<ValidationResult>(); 065.public ObservableCollection<ValidationResult> ValResults 066.{ 067.get { return this._valResults; } 068.} 069. 070.public AddEditJobViewModel(IEventAggregator eventAgg, IRegionManager regionmanager) 071.{ 072.// set Unity items 073.this.eventAggregator = eventAgg; 074.this.regionManager = regionmanager; 075. 076.context = new RecruitingContext(); 077. 078.AddEditCommand = new DelegateCommand<object>(this.AddEditJobCommand); 079.CancelCommand = new DelegateCommand<object>(this.CancelAddEditCommand); 080. 081.SetSubscriptions(); 082.} 083. 084.#region Subscription Declaration and Events 085. 086.public void SetSubscriptions() 087.{ 088.AddJobEvent addJob = this.eventAggregator.GetEvent<AddJobEvent>(); 089. 090.if (addJobToken != null) 091.addJob.Unsubscribe(addJobToken); 092. 093.addJobToken = addJob.Subscribe(this.AddJobEventHandler); 094. 095.EditJobEvent editJob = this.eventAggregator.GetEvent<EditJobEvent>(); 096. 097.if (editJobToken != null) 098.editJob.Unsubscribe(editJobToken); 099. 100.editJobToken = editJob.Subscribe(this.EditJobEventHandler); 101.} 102. 103.public void AddJobEventHandler(bool isNew) 104.{ 105.this.activeJob = null; 106.this.activeJob = new JobPosting(); 107.this.activeJob.IsActive = true; // We assume that we want a new job to go up immediately 108.this.isNewJob = true; 109.this.AddEditString = "Add New Job Posting"; 110.this.AddEditButtonString = "Add Job"; 111. 112.MakeMeActive(this.regionManager, "MainRegion", "AddEditJobView"); 113.} 114. 115.public void EditJobEventHandler(JobPosting editJob) 116.{ 117.this.activeJob = null; 118.this.activeJob = editJob; 119.this.isNewJob = false; 120.this.AddEditString = "Edit Job Posting"; 121.this.AddEditButtonString = "Edit Job"; 122. 123.MakeMeActive(this.regionManager, "MainRegion", "AddEditJobView"); 124.} 125. 126.#endregion 127. 128.#region DelegateCommands from View 129. 130.public void AddEditJobCommand(object obj) 131.{ 132.if (this.Errors.Count > 0) 133.{ 134.List<string> errorMessages = new List<string>(); 135. 136.foreach (var valR in this.Errors) 137.{ 138.errorMessages.Add(valR.Exception.Message); 139.} 140. 141.this.eventAggregator.GetEvent<DisplayValidationErrorsEvent>().Publish(errorMessages); 142. 143.} 144.else if (!Validator.TryValidateObject(this.activeJob, new ValidationContext(this.activeJob, null, null), _valResults, true)) 145.{ 146.List<string> errorMessages = new List<string>(); 147. 148.foreach (var valR in this._valResults) 149.{ 150.errorMessages.Add(valR.ErrorMessage); 151.} 152. 153.this._valResults.Clear(); 154. 155.this.eventAggregator.GetEvent<DisplayValidationErrorsEvent>().Publish(errorMessages); 156.} 157.else 158.{ 159.if (this.isNewJob) 160.{ 161.this.eventAggregator.GetEvent<AddJobCompleteEvent>().Publish(this.activeJob); 162.} 163.else 164.{ 165.this.eventAggregator.GetEvent<EditJobCompleteEvent>().Publish(true); 166.} 167.} 168.} 169. 170.public void CancelAddEditCommand(object obj) 171.{ 172.if (this.isNewJob) 173.{ 174.this.eventAggregator.GetEvent<AddJobCompleteEvent>().Publish(null); 175.} 176.else 177.{ 178.this.eventAggregator.GetEvent<EditJobCompleteEvent>().Publish(false); 179.} 180.} 181. 182.#endregion 183.} 184.} We start seeing something new on line 103- the AddJobEventHandler will create a new job and set that to the activeJob item on the ViewModel. When this is all set, the view calls that familiar MakeMeActive method to activate itself. I made a bit of a management call on making views self-activate like this, but I figured it works for one reason. As I create this application, views may not exist that I have in mind, so after a view receives its 'ping' from being subscribed to an event, it prepares whatever it needs to do and then goes active. This way if I don't have 'edit' hooked up, I can click as the day is long on the main view and won't get lost in an empty region. Total personal preference here. :) Everything else should again be pretty straightforward, although I do a bit of validation checking in the AddEditJobCommand, which can either fire off an event back to the main view/viewmodel if everything is a success or sent a list of errors to our notification module, which pops open a RadWindow with the alerts if any exist. As a bonus side note, here's what my WCF RIA Services metadata looks like for handling all of the validation: private JobPostingMetadata() { } [StringLength(2500, ErrorMessage = "Description should be more than one and less than 2500 characters.", MinimumLength = 1)] [Required(ErrorMessage = "Description is required.")] public string Description; [Required(ErrorMessage="Active Status is Required")] public bool IsActive; [StringLength(100, ErrorMessage = "Posting title must be more than 3 but less than 100 characters.", MinimumLength = 3)] [Required(ErrorMessage = "Job Title is required.")] public bool JobTitle; [Required] public string Location; public bool NeedsCV; public bool NeedsOverview; public bool NeedsResume; public int PostingID; [Required(ErrorMessage="Qualifications are required.")] [StringLength(2500, ErrorMessage="Qualifications should be more than one and less than 2500 characters.", MinimumLength=1)] public string Qualifications; [StringLength(2500, ErrorMessage = "Requirements should be more than one and less than 2500 characters.", MinimumLength = 1)] [Required(ErrorMessage="Requirements are required.")] public string Requirements; The RecruitCB Alternative See all that Xaml I pasted above? Those are now two pieces sitting in the JobsView.xaml file now. The only real difference is that the xEditGrid now sits in the same place as xJobsGrid, with visibility swapping out between the two for a quick switch. I also took out all the cal: and command: command references and replaced Button events with clicks and the Grid selection command replaced with a SelectedItemChanged event. Also, at the bottom of the xEditGrid after the last button, I add a ValidationSummary (with Visibility=Collapsed) to catch any errors that are popping up. Simple as can be, and leads to this being the single code-behind file: 001.public partial class JobsView : UserControl 002.{ 003.public RecruitingContext context; 004.public JobPosting activeJob; 005.public bool isNew; 006.private ObservableCollection<ValidationResult> _valResults = new ObservableCollection<ValidationResult>(); 007.public ObservableCollection<ValidationResult> ValResults 008.{ 009.get { return this._valResults; } 010.} 011.public JobsView() 012.{ 013.InitializeComponent(); 014.this.Loaded += new RoutedEventHandler(JobsView_Loaded); 015.} 016.void JobsView_Loaded(object sender, RoutedEventArgs e) 017.{ 018.context = new RecruitingContext(); 019.xJobsGrid.ItemsSource = context.JobPostings; 020.context.Load(context.GetJobPostingsQuery()); 021.} 022.private void xAddRecordButton_Click(object sender, RoutedEventArgs e) 023.{ 024.activeJob = new JobPosting(); 025.isNew = true; 026.xAddEditTitle.Text = "Add a Job Posting"; 027.xAddEditButton.Content = "Add"; 028.xEditGrid.DataContext = activeJob; 029.HideJobsGrid(); 030.} 031.private void xEditRecordButton_Click(object sender, RoutedEventArgs e) 032.{ 033.activeJob = xJobsGrid.SelectedItem as JobPosting; 034.isNew = false; 035.xAddEditTitle.Text = "Edit a Job Posting"; 036.xAddEditButton.Content = "Edit"; 037.xEditGrid.DataContext = activeJob; 038.HideJobsGrid(); 039.} 040.private void xAddEditButton_Click(object sender, RoutedEventArgs e) 041.{ 042.if (!Validator.TryValidateObject(this.activeJob, new ValidationContext(this.activeJob, null, null), _valResults, true)) 043.{ 044.List<string> errorMessages = new List<string>(); 045.foreach (var valR in this._valResults) 046.{ 047.errorMessages.Add(valR.ErrorMessage); 048.} 049.this._valResults.Clear(); 050.ShowErrors(errorMessages); 051.} 052.else if (xSummary.Errors.Count > 0) 053.{ 054.List<string> errorMessages = new List<string>(); 055.foreach (var err in xSummary.Errors) 056.{ 057.errorMessages.Add(err.Message); 058.} 059.ShowErrors(errorMessages); 060.} 061.else 062.{ 063.if (this.isNew) 064.{ 065.context.JobPostings.Add(activeJob); 066.context.SubmitChanges((s) => 067.{ 068.ActionHistory thisAction = new ActionHistory(); 069.thisAction.PostingID = activeJob.PostingID; 070.thisAction.Description = String.Format("Job '{0}' has been edited by {1}", activeJob.JobTitle, "default user"); 071.thisAction.TimeStamp = DateTime.Now; 072.context.ActionHistories.Add(thisAction); 073.context.SubmitChanges(); 074.}, null); 075.} 076.else 077.{ 078.context.SubmitChanges((s) => 079.{ 080.ActionHistory thisAction = new ActionHistory(); 081.thisAction.PostingID = activeJob.PostingID; 082.thisAction.Description = String.Format("Job '{0}' has been added by {1}", activeJob.JobTitle, "default user"); 083.thisAction.TimeStamp = DateTime.Now; 084.context.ActionHistories.Add(thisAction); 085.context.SubmitChanges(); 086.}, null); 087.} 088.ShowJobsGrid(); 089.} 090.} 091.private void xCancelButton_Click(object sender, RoutedEventArgs e) 092.{ 093.ShowJobsGrid(); 094.} 095.private void ShowJobsGrid() 096.{ 097.xAddEditRecordButtonPanel.Visibility = Visibility.Visible; 098.xEditGrid.Visibility = Visibility.Collapsed; 099.xJobsGrid.Visibility = Visibility.Visible; 100.} 101.private void HideJobsGrid() 102.{ 103.xAddEditRecordButtonPanel.Visibility = Visibility.Collapsed; 104.xJobsGrid.Visibility = Visibility.Collapsed; 105.xEditGrid.Visibility = Visibility.Visible; 106.} 107.private void ShowErrors(List<string> errorList) 108.{ 109.string nm = "Errors received: \n"; 110.foreach (string anerror in errorList) 111.nm += anerror + "\n"; 112.RadWindow.Alert(nm); 113.} 114.} The first 39 lines should be pretty familiar, not doing anything too unorthodox to get this up and running. Once we hit the xAddEditButton_Click on line 40, we're still doing pretty much the same things except instead of checking the ValidationHelper errors, we both run a check on the current activeJob object as well as check the ValidationSummary errors list. Once that is set, we again use the callback of context.SubmitChanges (lines 68 and 78) to create an ActionHistory which we will use to track these items down the line. That's all? Essentially... yes. If you look back through this post, most of the code and adventures we have taken were just to get things working in the MVVM/Prism setup. Since I have the whole 'module' self-contained in a single JobView+code-behind setup, I don't have to worry about things like sending events off into space for someone to pick up, communicating through an Infrastructure project, or even re-inventing events to be used with attached behaviors. Everything just kinda works, and again with much less code. Here's a picture of the MVVM and Code-behind versions on the Jobs and AddEdit views, but since the functionality is the same in both apps you still cannot tell them apart (for two-strike): Looking ahead, the Applicants module is effectively the same thing as the Jobs module, so most of the code is being cut-and-pasted back and forth with minor tweaks here and there. So that one is being taken care of by me behind the scenes. Next time, we get into a new world of fun- the interview scheduling module, which will pull from available jobs and applicants for each interview being scheduled, tying everything together with RadScheduler to the rescue. Did you know that DotNetSlackers also publishes .net articles written by top known .net Authors? We already have over 80 articles in several categories including Silverlight. Take a look: here.

Read the article

Extending Oracle CEP with Predictive Analytics

- by vikram.shukla(at)oracle.com

Introduction: OCEP is often used as a business rules engine to execute a set of business logic rules via CQL statements, and take decisions based on the outcome of those rules. There are times where configuring rules manually is sufficient because an application needs to deal with only a small and well-defined set of static rules. However, in many situations customers don't want to pre-define such rules for two reasons. First, they are dealing with events with lots of columns and manually crafting such rules for each column or a set of columns and combinations thereof is almost impossible. Second, they are content with probabilistic outcomes and do not care about 100% precision. The former is the case when a user is dealing with data with high dimensionality, the latter when an application can live with "false" positives as they can be discarded after further inspection, say by a Human Task component in a Business Process Management software. The primary goal of this blog post is to show how this can be achieved by combining OCEP with Oracle Data Mining® and leveraging the latter's rich set of algorithms and functionality to do predictive analytics in real time on streaming events. The secondary goal of this post is also to show how OCEP can be extended to invoke any arbitrary external computation in an RDBMS from within CEP. The extensible facility is known as the JDBC cartridge. The rest of the post describes the steps required to achieve this: We use the dataset available at http://blogs.oracle.com/datamining/2010/01/fraud_and_anomaly_detection_made_simple.html to showcase the capabilities. We use it to show how transaction anomalies or fraud can be detected. Building the model: Follow the self-explanatory steps described at the above URL to build the model. It is very simple - it uses built-in Oracle Data Mining PL/SQL packages to cleanse, normalize and build the model out of the dataset. You can also use graphical Oracle Data Miner® to build the models. To summarize, it involves: Specifying which algorithms to use. In this case we use Support Vector Machines as we're trying to find anomalies in highly dimensional dataset.Build model on the data in the table for the algorithms specified. For this example, the table was populated in the scott/tiger schema with appropriate privileges. Configuring the Data Source: This is the first step in building CEP application using such an integration. Our datasource looks as follows in the server config file. It is advisable that you use the Visualizer to add it to the running server dynamically, rather than manually edit the file. <data-source> <name>DataMining</name> <data-source-params> <jndi-names> <element>DataMining</element> </jndi-names> <global-transactions-protocol>OnePhaseCommit</global-transactions-protocol> </data-source-params> <connection-pool-params> <credential-mapping-enabled></credential-mapping-enabled> <test-table-name>SQL SELECT 1 from DUAL</test-table-name> <initial-capacity>1</initial-capacity> <max-capacity>15</max-capacity> <capacity-increment>1</capacity-increment> </connection-pool-params> <driver-params> <use-xa-data-source-interface>true</use-xa-data-source-interface> <driver-name>oracle.jdbc.OracleDriver</driver-name> <url>jdbc:oracle:thin:@localhost:1522:orcl</url> <properties> <element> <value>scott</value> <name>user</name> </element> <element> <value>{Salted-3DES}AzFE5dDbO2g=</value> <name>password</name> </element> <element> <name>com.bea.core.datasource.serviceName</name> <value>oracle11.2g</value> </element> <element> <name>com.bea.core.datasource.serviceVersion</name> <value>11.2.0</value> </element> <element> <name>com.bea.core.datasource.serviceObjectClass</name> <value>java.sql.Driver</value> </element> </properties> </driver-params> </data-source> Designing the EPN: The EPN is very simple in this example. We briefly describe each of the components. The adapter ("DataMiningAdapter") reads data from a .csv file and sends it to the CQL processor downstream. The event payload here is same as that of the table in the database (refer to the attached project or do a "desc table-name" from a SQL*PLUS prompt). While this is for convenience in this example, it need not be the case. One can still omit fields in the streaming events, and need not match all columns in the table on which the model was built. Better yet, it does not even need to have the same name as columns in the table, as long as you alias them in the USING clause of the mining function. (Caveat: they still need to draw values from a similar universe or domain, otherwise it constitutes incorrect usage of the model). There are two things in the CQL processor ("DataMiningProc") that make scoring possible on streaming events. 1. User defined cartridge function Please refer to the OCEP CQL reference manual to find more details about how to define such functions. We include the function below in its entirety for illustration. <?xml version="1.0" encoding="UTF-8"?> <jdbcctxconfig:config xmlns:jdbcctxconfig="http://www.bea.com/ns/wlevs/config/application" xmlns:jc="http://www.oracle.com/ns/ocep/config/jdbc"> <jc:jdbc-ctx> <name>Oracle11gR2</name> <data-source>DataMining</data-source> <function name="prediction2"> <param name="CQLMONTH" type="char"/> <param name="WEEKOFMONTH" type="int"/> <param name="DAYOFWEEK" type="char" /> <param name="MAKE" type="char" /> <param name="ACCIDENTAREA" type="char" /> <param name="DAYOFWEEKCLAIMED" type="char" /> <param name="MONTHCLAIMED" type="char" /> <param name="WEEKOFMONTHCLAIMED" type="int" /> <param name="SEX" type="char" /> <param name="MARITALSTATUS" type="char" /> <param name="AGE" type="int" /> <param name="FAULT" type="char" /> <param name="POLICYTYPE" type="char" /> <param name="VEHICLECATEGORY" type="char" /> <param name="VEHICLEPRICE" type="char" /> <param name="FRAUDFOUND" type="int" /> <param name="POLICYNUMBER" type="int" /> <param name="REPNUMBER" type="int" /> <param name="DEDUCTIBLE" type="int" /> <param name="DRIVERRATING" type="int" /> <param name="DAYSPOLICYACCIDENT" type="char" /> <param name="DAYSPOLICYCLAIM" type="char" /> <param name="PASTNUMOFCLAIMS" type="char" /> <param name="AGEOFVEHICLES" type="char" /> <param name="AGEOFPOLICYHOLDER" type="char" /> <param name="POLICEREPORTFILED" type="char" /> <param name="WITNESSPRESNT" type="char" /> <param name="AGENTTYPE" type="char" /> <param name="NUMOFSUPP" type="char" /> <param name="ADDRCHGCLAIM" type="char" /> <param name="NUMOFCARS" type="char" /> <param name="CQLYEAR" type="int" /> <param name="BASEPOLICY" type="char" /> <return-component-type>char</return-component-type> <sql><![CDATA[ SELECT to_char(PREDICTION_PROBABILITY(CLAIMSMODEL, '0' USING *)) AS probability FROM (SELECT :CQLMONTH AS MONTH, :WEEKOFMONTH AS WEEKOFMONTH, :DAYOFWEEK AS DAYOFWEEK, :MAKE AS MAKE, :ACCIDENTAREA AS ACCIDENTAREA, :DAYOFWEEKCLAIMED AS DAYOFWEEKCLAIMED, :MONTHCLAIMED AS MONTHCLAIMED, :WEEKOFMONTHCLAIMED, :SEX AS SEX, :MARITALSTATUS AS MARITALSTATUS, :AGE AS AGE, :FAULT AS FAULT, :POLICYTYPE AS POLICYTYPE, :VEHICLECATEGORY AS VEHICLECATEGORY, :VEHICLEPRICE AS VEHICLEPRICE, :FRAUDFOUND AS FRAUDFOUND, :POLICYNUMBER AS POLICYNUMBER, :REPNUMBER AS REPNUMBER, :DEDUCTIBLE AS DEDUCTIBLE, :DRIVERRATING AS DRIVERRATING, :DAYSPOLICYACCIDENT AS DAYSPOLICYACCIDENT, :DAYSPOLICYCLAIM AS DAYSPOLICYCLAIM, :PASTNUMOFCLAIMS AS PASTNUMOFCLAIMS, :AGEOFVEHICLES AS AGEOFVEHICLES, :AGEOFPOLICYHOLDER AS AGEOFPOLICYHOLDER, :POLICEREPORTFILED AS POLICEREPORTFILED, :WITNESSPRESNT AS WITNESSPRESENT, :AGENTTYPE AS AGENTTYPE, :NUMOFSUPP AS NUMOFSUPP, :ADDRCHGCLAIM AS ADDRCHGCLAIM, :NUMOFCARS AS NUMOFCARS, :CQLYEAR AS YEAR, :BASEPOLICY AS BASEPOLICY FROM dual) ]]> </sql> </function> </jc:jdbc-ctx> </jdbcctxconfig:config> 2. Invoking the function for each event. Once this function is defined, you can invoke it from CQL as follows: <?xml version="1.0" encoding="UTF-8"?> <wlevs:config xmlns:wlevs="http://www.bea.com/ns/wlevs/config/application"> <processor> <name>DataMiningProc</name> <rules> <query id="q1"><![CDATA[ ISTREAM(SELECT S.CQLMONTH, S.WEEKOFMONTH, S.DAYOFWEEK, S.MAKE, : S.BASEPOLICY, C.F AS probability FROM StreamDataChannel [NOW] AS S, TABLE(prediction2@Oracle11gR2(S.CQLMONTH, S.WEEKOFMONTH, S.DAYOFWEEK, S.MAKE, ..., S.BASEPOLICY) AS F of char) AS C) ]]></query> </rules> </processor> </wlevs:config> Finally, the last stage in the EPN prints out the probability of the event being an anomaly. One can also define a threshold in CQL to filter out events that are normal, i.e., below a certain mark as defined by the analyst or designer. Sample Runs: Now let's see how this behaves when events are streamed through CEP. We use only two events for brevity, one normal and other one not. This is one of the "normal" looking events and the probability of it being anomalous is less than 60%. Event is: eventType=DataMiningOutEvent object=q1 time=2904821976256 S.CQLMONTH=Dec, S.WEEKOFMONTH=5, S.DAYOFWEEK=Wednesday, S.MAKE=Honda, S.ACCIDENTAREA=Urban, S.DAYOFWEEKCLAIMED=Tuesday, S.MONTHCLAIMED=Jan, S.WEEKOFMONTHCLAIMED=1, S.SEX=Female, S.MARITALSTATUS=Single, S.AGE=21, S.FAULT=Policy Holder, S.POLICYTYPE=Sport - Liability, S.VEHICLECATEGORY=Sport, S.VEHICLEPRICE=more than 69000, S.FRAUDFOUND=0, S.POLICYNUMBER=1, S.REPNUMBER=12, S.DEDUCTIBLE=300, S.DRIVERRATING=1, S.DAYSPOLICYACCIDENT=more than 30, S.DAYSPOLICYCLAIM=more than 30, S.PASTNUMOFCLAIMS=none, S.AGEOFVEHICLES=3 years, S.AGEOFPOLICYHOLDER=26 to 30, S.POLICEREPORTFILED=No, S.WITNESSPRESENT=No, S.AGENTTYPE=External, S.NUMOFSUPP=none, S.ADDRCHGCLAIM=1 year, S.NUMOFCARS=3 to 4, S.CQLYEAR=1994, S.BASEPOLICY=Liability, probability=.58931702982118561 isTotalOrderGuarantee=true\nAnamoly probability: .58931702982118561 However, the following event is scored as an anomaly with a very high probability of 89%. So there is likely to be something wrong with it. A close look reveals that the value of "deductible" field (10000) is not "normal". What exactly constitutes normal here?. If you run the query on the database to find ALL distinct values for the "deductible" field, it returns the following set: {300, 400, 500, 700} Event is: eventType=DataMiningOutEvent object=q1 time=2598483773496 S.CQLMONTH=Dec, S.WEEKOFMONTH=5, S.DAYOFWEEK=Wednesday, S.MAKE=Honda, S.ACCIDENTAREA=Urban, S.DAYOFWEEKCLAIMED=Tuesday, S.MONTHCLAIMED=Jan, S.WEEKOFMONTHCLAIMED=1, S.SEX=Female, S.MARITALSTATUS=Single, S.AGE=21, S.FAULT=Policy Holder, S.POLICYTYPE=Sport - Liability, S.VEHICLECATEGORY=Sport, S.VEHICLEPRICE=more than 69000, S.FRAUDFOUND=0, S.POLICYNUMBER=1, S.REPNUMBER=12, S.DEDUCTIBLE=10000, S.DRIVERRATING=1, S.DAYSPOLICYACCIDENT=more than 30, S.DAYSPOLICYCLAIM=more than 30, S.PASTNUMOFCLAIMS=none, S.AGEOFVEHICLES=3 years, S.AGEOFPOLICYHOLDER=26 to 30, S.POLICEREPORTFILED=No, S.WITNESSPRESENT=No, S.AGENTTYPE=External, S.NUMOFSUPP=none, S.ADDRCHGCLAIM=1 year, S.NUMOFCARS=3 to 4, S.CQLYEAR=1994, S.BASEPOLICY=Liability, probability=.89171554529576691 isTotalOrderGuarantee=true\nAnamoly probability: .89171554529576691 Conclusion: By way of this example, we show: real-time scoring of events as they flow through CEP leveraging Oracle Data Mining.how CEP applications can invoke complex arbitrary external computations (function shipping) in an RDBMS.

Read the article

Using R to Analyze G1GC Log Files

- by user12620111

Using R to Analyze G1GC Log Files body, td { font-family: sans-serif; background-color: white; font-size: 12px; margin: 8px; } tt, code, pre { font-family: 'DejaVu Sans Mono', 'Droid Sans Mono', 'Lucida Console', Consolas, Monaco, monospace; } h1 { font-size:2.2em; } h2 { font-size:1.8em; } h3 { font-size:1.4em; } h4 { font-size:1.0em; } h5 { font-size:0.9em; } h6 { font-size:0.8em; } a:visited { color: rgb(50%, 0%, 50%); } pre { margin-top: 0; max-width: 95%; border: 1px solid #ccc; white-space: pre-wrap; } pre code { display: block; padding: 0.5em; } code.r, code.cpp { background-color: #F8F8F8; } table, td, th { border: none; } blockquote { color:#666666; margin:0; padding-left: 1em; border-left: 0.5em #EEE solid; } hr { height: 0px; border-bottom: none; border-top-width: thin; border-top-style: dotted; border-top-color: #999999; } @media print { * { background: transparent !important; color: black !important; filter:none !important; -ms-filter: none !important; } body { font-size:12pt; max-width:100%; } a, a:visited { text-decoration: underline; } hr { visibility: hidden; page-break-before: always; } pre, blockquote { padding-right: 1em; page-break-inside: avoid; } tr, img { page-break-inside: avoid; } img { max-width: 100% !important; } @page :left { margin: 15mm 20mm 15mm 10mm; } @page :right { margin: 15mm 10mm 15mm 20mm; } p, h2, h3 { orphans: 3; widows: 3; } h2, h3 { page-break-after: avoid; } } pre .operator, pre .paren { color: rgb(104, 118, 135) } pre .literal { color: rgb(88, 72, 246) } pre .number { color: rgb(0, 0, 205); } pre .comment { color: rgb(76, 136, 107); } pre .keyword { color: rgb(0, 0, 255); } pre .identifier { color: rgb(0, 0, 0); } pre .string { color: rgb(3, 106, 7); } var hljs=new function(){function m(p){return p.replace(/&/gm,"&").replace(/"}while(y.length||w.length){var v=u().splice(0,1)[0];z+=m(x.substr(q,v.offset-q));q=v.offset;if(v.event=="start"){z+=t(v.node);s.push(v.node)}else{if(v.event=="stop"){var p,r=s.length;do{r--;p=s[r];z+=("")}while(p!=v.node);s.splice(r,1);while(r'+M[0]+""}else{r+=M[0]}O=P.lR.lastIndex;M=P.lR.exec(L)}return r+L.substr(O,L.length-O)}function J(L,M){if(M.sL&&e[M.sL]){var r=d(M.sL,L);x+=r.keyword_count;return r.value}else{return F(L,M)}}function I(M,r){var L=M.cN?'':"";if(M.rB){y+=L;M.buffer=""}else{if(M.eB){y+=m(r)+L;M.buffer=""}else{y+=L;M.buffer=r}}D.push(M);A+=M.r}function G(N,M,Q){var R=D[D.length-1];if(Q){y+=J(R.buffer+N,R);return false}var P=q(M,R);if(P){y+=J(R.buffer+N,R);I(P,M);return P.rB}var L=v(D.length-1,M);if(L){var O=R.cN?"":"";if(R.rE){y+=J(R.buffer+N,R)+O}else{if(R.eE){y+=J(R.buffer+N,R)+O+m(M)}else{y+=J(R.buffer+N+M,R)+O}}while(L1){O=D[D.length-2].cN?"":"";y+=O;L--;D.length--}var r=D[D.length-1];D.length--;D[D.length-1].buffer="";if(r.starts){I(r.starts,"")}return R.rE}if(w(M,R)){throw"Illegal"}}var E=e[B];var D=[E.dM];var A=0;var x=0;var y="";try{var s,u=0;E.dM.buffer="";do{s=p(C,u);var t=G(s[0],s[1],s[2]);u+=s[0].length;if(!t){u+=s[1].length}}while(!s[2]);if(D.length1){throw"Illegal"}return{r:A,keyword_count:x,value:y}}catch(H){if(H=="Illegal"){return{r:0,keyword_count:0,value:m(C)}}else{throw H}}}function g(t){var p={keyword_count:0,r:0,value:m(t)};var r=p;for(var q in e){if(!e.hasOwnProperty(q)){continue}var s=d(q,t);s.language=q;if(s.keyword_count+s.rr.keyword_count+r.r){r=s}if(s.keyword_count+s.rp.keyword_count+p.r){r=p;p=s}}if(r.language){p.second_best=r}return p}function i(r,q,p){if(q){r=r.replace(/^((]+|\t)+)/gm,function(t,w,v,u){return w.replace(/\t/g,q)})}if(p){r=r.replace(/\n/g,"")}return r}function n(t,w,r){var x=h(t,r);var v=a(t);var y,s;if(v){y=d(v,x)}else{return}var q=c(t);if(q.length){s=document.createElement("pre");s.innerHTML=y.value;y.value=k(q,c(s),x)}y.value=i(y.value,w,r);var u=t.className;if(!u.match("(\\s|^)(language-)?"+v+"(\\s|$)")){u=u?(u+" "+v):v}if(/MSIE [678]/.test(navigator.userAgent)&&t.tagName=="CODE"&&t.parentNode.tagName=="PRE"){s=t.parentNode;var p=document.createElement("div");p.innerHTML=""+y.value+"";t=p.firstChild.firstChild;p.firstChild.cN=s.cN;s.parentNode.replaceChild(p.firstChild,s)}else{t.innerHTML=y.value}t.className=u;t.result={language:v,kw:y.keyword_count,re:y.r};if(y.second_best){t.second_best={language:y.second_best.language,kw:y.second_best.keyword_count,re:y.second_best.r}}}function o(){if(o.called){return}o.called=true;var r=document.getElementsByTagName("pre");for(var p=0;p|=||=||=|\\?|\\[|\\{|\\(|\\^|\\^=|\\||\\|=|\\|\\||~";this.ER="(?![\\s\\S])";this.BE={b:"\\\\.",r:0};this.ASM={cN:"string",b:"'",e:"'",i:"\\n",c:[this.BE],r:0};this.QSM={cN:"string",b:'"',e:'"',i:"\\n",c:[this.BE],r:0};this.CLCM={cN:"comment",b:"//",e:"$"};this.CBLCLM={cN:"comment",b:"/\\*",e:"\\*/"};this.HCM={cN:"comment",b:"#",e:"$"};this.NM={cN:"number",b:this.NR,r:0};this.CNM={cN:"number",b:this.CNR,r:0};this.BNM={cN:"number",b:this.BNR,r:0};this.inherit=function(r,s){var p={};for(var q in r){p[q]=r[q]}if(s){for(var q in s){p[q]=s[q]}}return p}}();hljs.LANGUAGES.cpp=function(){var a={keyword:{"false":1,"int":1,"float":1,"while":1,"private":1,"char":1,"catch":1,"export":1,virtual:1,operator:2,sizeof:2,dynamic_cast:2,typedef:2,const_cast:2,"const":1,struct:1,"for":1,static_cast:2,union:1,namespace:1,unsigned:1,"long":1,"throw":1,"volatile":2,"static":1,"protected":1,bool:1,template:1,mutable:1,"if":1,"public":1,friend:2,"do":1,"return":1,"goto":1,auto:1,"void":2,"enum":1,"else":1,"break":1,"new":1,extern:1,using:1,"true":1,"class":1,asm:1,"case":1,typeid:1,"short":1,reinterpret_cast:2,"default":1,"double":1,register:1,explicit:1,signed:1,typename:1,"try":1,"this":1,"switch":1,"continue":1,wchar_t:1,inline:1,"delete":1,alignof:1,char16_t:1,char32_t:1,constexpr:1,decltype:1,noexcept:1,nullptr:1,static_assert:1,thread_local:1,restrict:1,_Bool:1,complex:1},built_in:{std:1,string:1,cin:1,cout:1,cerr:1,clog:1,stringstream:1,istringstream:1,ostringstream:1,auto_ptr:1,deque:1,list:1,queue:1,stack:1,vector:1,map:1,set:1,bitset:1,multiset:1,multimap:1,unordered_set:1,unordered_map:1,unordered_multiset:1,unordered_multimap:1,array:1,shared_ptr:1}};return{dM:{k:a,i:"",k:a,r:10,c:["self"]}]}}}();hljs.LANGUAGES.r={dM:{c:[hljs.HCM,{cN:"number",b:"\\b0[xX][0-9a-fA-F]+[Li]?\\b",e:hljs.IMMEDIATE_RE,r:0},{cN:"number",b:"\\b\\d+(?:[eE][+\\-]?\\d*)?L\\b",e:hljs.IMMEDIATE_RE,r:0},{cN:"number",b:"\\b\\d+\\.(?!\\d)(?:i\\b)?",e:hljs.IMMEDIATE_RE,r:1},{cN:"number",b:"\\b\\d+(?:\\.\\d*)?(?:[eE][+\\-]?\\d*)?i?\\b",e:hljs.IMMEDIATE_RE,r:0},{cN:"number",b:"\\.\\d+(?:[eE][+\\-]?\\d*)?i?\\b",e:hljs.IMMEDIATE_RE,r:1},{cN:"keyword",b:"(?:tryCatch|library|setGeneric|setGroupGeneric)\\b",e:hljs.IMMEDIATE_RE,r:10},{cN:"keyword",b:"\\.\\.\\.",e:hljs.IMMEDIATE_RE,r:10},{cN:"keyword",b:"\\.\\.\\d+(?![\\w.])",e:hljs.IMMEDIATE_RE,r:10},{cN:"keyword",b:"\\b(?:function)",e:hljs.IMMEDIATE_RE,r:2},{cN:"keyword",b:"(?:if|in|break|next|repeat|else|for|return|switch|while|try|stop|warning|require|attach|detach|source|setMethod|setClass)\\b",e:hljs.IMMEDIATE_RE,r:1},{cN:"literal",b:"(?:NA|NA_integer_|NA_real_|NA_character_|NA_complex_)\\b",e:hljs.IMMEDIATE_RE,r:10},{cN:"literal",b:"(?:NULL|TRUE|FALSE|T|F|Inf|NaN)\\b",e:hljs.IMMEDIATE_RE,r:1},{cN:"identifier",b:"[a-zA-Z.][a-zA-Z0-9._]*\\b",e:hljs.IMMEDIATE_RE,r:0},{cN:"operator",b:"|=|| Using R to Analyze G1GC Log Files Using R to Analyze G1GC Log Files Introduction Working in Oracle Platform Integration gives an engineer opportunities to work on a wide array of technologies. My team’s goal is to make Oracle applications run best on the Solaris/SPARC platform. When looking for bottlenecks in a modern applications, one needs to be aware of not only how the CPUs and operating system are executing, but also network, storage, and in some cases, the Java Virtual Machine. I was recently presented with about 1.5 GB of Java Garbage First Garbage Collector log file data. If you’re not familiar with the subject, you might want to review Garbage First Garbage Collector Tuning by Monica Beckwith. The customer had been running Java HotSpot 1.6.0_31 to host a web application server. I was told that the Solaris/SPARC server was running a Java process launched using a commmand line that included the following flags: -d64 -Xms9g -Xmx9g -XX:+UseG1GC -XX:MaxGCPauseMillis=200 -XX:InitiatingHeapOccupancyPercent=80 -XX:PermSize=256m -XX:MaxPermSize=256m -XX:+PrintGC -XX:+PrintGCTimeStamps -XX:+PrintHeapAtGC -XX:+PrintGCDateStamps -XX:+PrintFlagsFinal -XX:+DisableExplicitGC -XX:+UnlockExperimentalVMOptions -XX:ParallelGCThreads=8 Several sources on the internet indicate that if I were to print out the 1.5 GB of log files, it would require enough paper to fill the bed of a pick up truck. Of course, it would be fruitless to try to scan the log files by hand. Tools will be required to summarize the contents of the log files. Others have encountered large Java garbage collection log files. There are existing tools to analyze the log files: IBM’s GC toolkit The chewiebug GCViewer gchisto HPjmeter Instead of using one of the other tools listed, I decide to parse the log files with standard Unix tools, and analyze the data with R. Data Cleansing The log files arrived in two different formats. I guess that the difference is that one set of log files was generated using a more verbose option, maybe -XX:+PrintHeapAtGC, and the other set of log files was generated without that option. Format 1 In some of the log files, the log files with the less verbose format, a single trace, i.e. the report of a singe garbage collection event, looks like this: {Heap before GC invocations=12280 (full 61): garbage-first heap total 9437184K, used 7499918K [0xfffffffd00000000, 0xffffffff40000000, 0xffffffff40000000) region size 4096K, 1 young (4096K), 0 survivors (0K) compacting perm gen total 262144K, used 144077K [0xffffffff40000000, 0xffffffff50000000, 0xffffffff50000000) the space 262144K, 54% used [0xffffffff40000000, 0xffffffff48cb3758, 0xffffffff48cb3800, 0xffffffff50000000) No shared spaces configured. 2014-05-14T07:24:00.988-0700: 60586.353: [GC pause (young) 7324M->7320M(9216M), 0.1567265 secs] Heap after GC invocations=12281 (full 61): garbage-first heap total 9437184K, used 7496533K [0xfffffffd00000000, 0xffffffff40000000, 0xffffffff40000000) region size 4096K, 0 young (0K), 0 survivors (0K) compacting perm gen total 262144K, used 144077K [0xffffffff40000000, 0xffffffff50000000, 0xffffffff50000000) the space 262144K, 54% used [0xffffffff40000000, 0xffffffff48cb3758, 0xffffffff48cb3800, 0xffffffff50000000) No shared spaces configured. } A simple grep can be used to extract a summary: $ grep "\[ GC pause (young" g1gc.log 2014-05-13T13:24:35.091-0700: 3.109: [GC pause (young) 20M->5029K(9216M), 0.0146328 secs] 2014-05-13T13:24:35.440-0700: 3.459: [GC pause (young) 9125K->6077K(9216M), 0.0086723 secs] 2014-05-13T13:24:37.581-0700: 5.599: [GC pause (young) 25M->8470K(9216M), 0.0203820 secs] 2014-05-13T13:24:42.686-0700: 10.704: [GC pause (young) 44M->15M(9216M), 0.0288848 secs] 2014-05-13T13:24:48.941-0700: 16.958: [GC pause (young) 51M->20M(9216M), 0.0491244 secs] 2014-05-13T13:24:56.049-0700: 24.066: [GC pause (young) 92M->26M(9216M), 0.0525368 secs] 2014-05-13T13:25:34.368-0700: 62.383: [GC pause (young) 602M->68M(9216M), 0.1721173 secs] But that format wasn't easily read into R, so I needed to be a bit more tricky. I used the following Unix command to create a summary file that was easy for R to read. $ echo "SecondsSinceLaunch BeforeSize AfterSize TotalSize RealTime" $ grep "\[GC pause (young" g1gc.log | grep -v mark | sed -e 's/[A-SU-z,]/ /g' -e 's/->/ /' -e 's/: / /g' | more SecondsSinceLaunch BeforeSize AfterSize TotalSize RealTime 2014-05-13T13:24:35.091-0700 3.109 20 5029 9216 0.0146328 2014-05-13T13:24:35.440-0700 3.459 9125 6077 9216 0.0086723 2014-05-13T13:24:37.581-0700 5.599 25 8470 9216 0.0203820 2014-05-13T13:24:42.686-0700 10.704 44 15 9216 0.0288848 2014-05-13T13:24:48.941-0700 16.958 51 20 9216 0.0491244 2014-05-13T13:24:56.049-0700 24.066 92 26 9216 0.0525368 2014-05-13T13:25:34.368-0700 62.383 602 68 9216 0.1721173 Format 2 In some of the log files, the log files with the more verbose format, a single trace, i.e. the report of a singe garbage collection event, was more complicated than Format 1. Here is a text file with an example of a single G1GC trace in the second format. As you can see, it is quite complicated. It is nice that there is so much information available, but the level of detail can be overwhelming. I wrote this awk script (download) to summarize each trace on a single line. #!/usr/bin/env awk -f BEGIN { printf("SecondsSinceLaunch IncrementalCount FullCount UserTime SysTime RealTime BeforeSize AfterSize TotalSize\n") } ###################### # Save count data from lines that are at the start of each G1GC trace. # Each trace starts out like this: # {Heap before GC invocations=14 (full 0): # garbage-first heap total 9437184K, used 325496K [0xfffffffd00000000, 0xffffffff40000000, 0xffffffff40000000) ###################### /{Heap.*full/{ gsub ( "\\)" , "" ); nf=split($0,a,"="); split(a[2],b," "); getline; if ( match($0, "first") ) { G1GC=1; IncrementalCount=b[1]; FullCount=substr( b[3], 1, length(b[3])-1 ); } else { G1GC=0; } } ###################### # Pull out time stamps that are in lines with this format: # 2014-05-12T14:02:06.025-0700: 94.312: [GC pause (young), 0.08870154 secs] ###################### /GC pause/ { DateTime=$1; SecondsSinceLaunch=substr($2, 1, length($2)-1); } ###################### # Heap sizes are in lines that look like this: # [ 4842M->4838M(9216M)] ###################### /\[ .*]$/ { gsub ( "\\[" , "" ); gsub ( "\ \]" , "" ); gsub ( "->" , " " ); gsub ( "\$ " , " " ); gsub ( "\ $" , " " ); split($0,a," "); if ( split(a[1],b,"M") > 1 ) {BeforeSize=b[1]*1024;} if ( split(a[1],b,"K") > 1 ) {BeforeSize=b[1];} if ( split(a[2],b,"M") > 1 ) {AfterSize=b[1]*1024;} if ( split(a[2],b,"K") > 1 ) {AfterSize=b[1];} if ( split(a[3],b,"M") > 1 ) {TotalSize=b[1]*1024;} if ( split(a[3],b,"K") > 1 ) {TotalSize=b[1];} } ###################### # Emit an output line when you find input that looks like this: # [Times: user=1.41 sys=0.08, real=0.24 secs] ###################### /\[Times/ { if (G1GC==1) { gsub ( "," , "" ); split($2,a,"="); UserTime=a[2]; split($3,a,"="); SysTime=a[2]; split($4,a,"="); RealTime=a[2]; print DateTime,SecondsSinceLaunch,IncrementalCount,FullCount,UserTime,SysTime,RealTime,BeforeSize,AfterSize,TotalSize; G1GC=0; } } The resulting summary is about 25X smaller that the original file, but still difficult for a human to digest. SecondsSinceLaunch IncrementalCount FullCount UserTime SysTime RealTime BeforeSize AfterSize TotalSize ... 2014-05-12T18:36:34.669-0700: 3985.744 561 0 0.57 0.06 0.16 1724416 1720320 9437184 2014-05-12T18:36:34.839-0700: 3985.914 562 0 0.51 0.06 0.19 1724416 1720320 9437184 2014-05-12T18:36:35.069-0700: 3986.144 563 0 0.60 0.04 0.27 1724416 1721344 9437184 2014-05-12T18:36:35.354-0700: 3986.429 564 0 0.33 0.04 0.09 1725440 1722368 9437184 2014-05-12T18:36:35.545-0700: 3986.620 565 0 0.58 0.04 0.17 1726464 1722368 9437184 2014-05-12T18:36:35.726-0700: 3986.801 566 0 0.43 0.05 0.12 1726464 1722368 9437184 2014-05-12T18:36:35.856-0700: 3986.930 567 0 0.30 0.04 0.07 1726464 1723392 9437184 2014-05-12T18:36:35.947-0700: 3987.023 568 0 0.61 0.04 0.26 1727488 1723392 9437184 2014-05-12T18:36:36.228-0700: 3987.302 569 0 0.46 0.04 0.16 1731584 1724416 9437184 Reading the Data into R Once the GC log data had been cleansed, either by processing the first format with the shell script, or by processing the second format with the awk script, it was easy to read the data into R. g1gc.df = read.csv("summary.txt", row.names = NULL, stringsAsFactors=FALSE,sep="") str(g1gc.df) ## 'data.frame': 8307 obs. of 10 variables: ## $ row.names : chr "2014-05-12T14:00:32.868-0700:" "2014-05-12T14:00:33.179-0700:" "2014-05-12T14:00:33.677-0700:" "2014-05-12T14:00:35.538-0700:" ... ## $ SecondsSinceLaunch: num 1.16 1.47 1.97 3.83 6.1 ... ## $ IncrementalCount : int 0 1 2 3 4 5 6 7 8 9 ... ## $ FullCount : int 0 0 0 0 0 0 0 0 0 0 ... ## $ UserTime : num 0.11 0.05 0.04 0.21 0.08 0.26 0.31 0.33 0.34 0.56 ... ## $ SysTime : num 0.04 0.01 0.01 0.05 0.01 0.06 0.07 0.06 0.07 0.09 ... ## $ RealTime : num 0.02 0.02 0.01 0.04 0.02 0.04 0.05 0.04 0.04 0.06 ... ## $ BeforeSize : int 8192 5496 5768 22528 24576 43008 34816 53248 55296 93184 ... ## $ AfterSize : int 1400 1672 2557 4907 7072 14336 16384 18432 19456 21504 ... ## $ TotalSize : int 9437184 9437184 9437184 9437184 9437184 9437184 9437184 9437184 9437184 9437184 ... head(g1gc.df) ## row.names SecondsSinceLaunch IncrementalCount ## 1 2014-05-12T14:00:32.868-0700: 1.161 0 ## 2 2014-05-12T14:00:33.179-0700: 1.472 1 ## 3 2014-05-12T14:00:33.677-0700: 1.969 2 ## 4 2014-05-12T14:00:35.538-0700: 3.830 3 ## 5 2014-05-12T14:00:37.811-0700: 6.103 4 ## 6 2014-05-12T14:00:41.428-0700: 9.720 5 ## FullCount UserTime SysTime RealTime BeforeSize AfterSize TotalSize ## 1 0 0.11 0.04 0.02 8192 1400 9437184 ## 2 0 0.05 0.01 0.02 5496 1672 9437184 ## 3 0 0.04 0.01 0.01 5768 2557 9437184 ## 4 0 0.21 0.05 0.04 22528 4907 9437184 ## 5 0 0.08 0.01 0.02 24576 7072 9437184 ## 6 0 0.26 0.06 0.04 43008 14336 9437184 Basic Statistics Once the data has been read into R, simple statistics are very easy to generate. All of the numbers from high school statistics are available via simple commands. For example, generate a summary of every column: summary(g1gc.df) ## row.names SecondsSinceLaunch IncrementalCount FullCount ## Length:8307 Min. : 1 Min. : 0 Min. : 0.0 ## Class :character 1st Qu.: 9977 1st Qu.:2048 1st Qu.: 0.0 ## Mode :character Median :12855 Median :4136 Median : 12.0 ## Mean :12527 Mean :4156 Mean : 31.6 ## 3rd Qu.:15758 3rd Qu.:6262 3rd Qu.: 61.0 ## Max. :55484 Max. :8391 Max. :113.0 ## UserTime SysTime RealTime BeforeSize ## Min. :0.040 Min. :0.0000 Min. : 0.0 Min. : 5476 ## 1st Qu.:0.470 1st Qu.:0.0300 1st Qu.: 0.1 1st Qu.:5137920 ## Median :0.620 Median :0.0300 Median : 0.1 Median :6574080 ## Mean :0.751 Mean :0.0355 Mean : 0.3 Mean :5841855 ## 3rd Qu.:0.920 3rd Qu.:0.0400 3rd Qu.: 0.2 3rd Qu.:7084032 ## Max. :3.370 Max. :1.5600 Max. :488.1 Max. :8696832 ## AfterSize TotalSize ## Min. : 1380 Min. :9437184 ## 1st Qu.:5002752 1st Qu.:9437184 ## Median :6559744 Median :9437184 ## Mean :5785454 Mean :9437184 ## 3rd Qu.:7054336 3rd Qu.:9437184 ## Max. :8482816 Max. :9437184 Q: What is the total amount of User CPU time spent in garbage collection? sum(g1gc.df$UserTime) ## [1] 6236 As you can see, less than two hours of CPU time was spent in garbage collection. Is that too much? To find the percentage of time spent in garbage collection, divide the number above by total_elapsed_time*CPU_count. In this case, there are a lot of CPU’s and it turns out the the overall amount of CPU time spent in garbage collection isn’t a problem when viewed in isolation. When calculating rates, i.e. events per unit time, you need to ask yourself if the rate is homogenous across the time period in the log file. Does the log file include spikes of high activity that should be separately analyzed? Averaging in data from nights and weekends with data from business hours may alias problems. If you have a reason to suspect that the garbage collection rates include peaks and valleys that need independent analysis, see the “Time Series” section, below. Q: How much garbage is collected on each pass? The amount of heap space that is recovered per GC pass is surprisingly low: At least one collection didn’t recover any data. (“Min.=0”) 25% of the passes recovered 3MB or less. (“1st Qu.=3072”) Half of the GC passes recovered 4MB or less. (“Median=4096”) The average amount recovered was 56MB. (“Mean=56390”) 75% of the passes recovered 36MB or less. (“3rd Qu.=36860”) At least one pass recovered 2GB. (“Max.=2121000”) g1gc.df$Delta = g1gc.df$BeforeSize - g1gc.df$AfterSize summary(g1gc.df$Delta) ## Min. 1st Qu. Median Mean 3rd Qu. Max. ## 0 3070 4100 56400 36900 2120000 Q: What is the maximum User CPU time for a single collection? The worst garbage collection (“Max.”) is many standard deviations away from the mean. The data appears to be right skewed. summary(g1gc.df$UserTime) ## Min. 1st Qu. Median Mean 3rd Qu. Max. ## 0.040 0.470 0.620 0.751 0.920 3.370 sd(g1gc.df$UserTime) ## [1] 0.3966 Basic Graphics Once the data is in R, it is trivial to plot the data with formats including dot plots, line charts, bar charts (simple, stacked, grouped), pie charts, boxplots, scatter plots histograms, and kernel density plots. Histogram of User CPU Time per Collection I don't think that this graph requires any explanation. hist(g1gc.df$UserTime, main="User CPU Time per Collection", xlab="Seconds", ylab="Frequency") Box plot to identify outliers When the initial data is viewed with a box plot, you can see the one crazy outlier in the real time per GC. Save this data point for future analysis and drop the outlier so that it’s not throwing off our statistics. Now the box plot shows many outliers, which will be examined later, using times series analysis. Notice that the scale of the x-axis changes drastically once the crazy outlier is removed. par(mfrow=c(2,1)) boxplot(g1gc.df$UserTime,g1gc.df$SysTime,g1gc.df$RealTime, main="Box Plot of Time per GC\n(dominated by a crazy outlier)", names=c("usr","sys","elapsed"), xlab="Seconds per GC", ylab="Time (Seconds)", horizontal = TRUE, outcol="red") crazy.outlier.df=g1gc.df[g1gc.df$RealTime > 400,] g1gc.df=g1gc.df[g1gc.df$RealTime < 400,] boxplot(g1gc.df$UserTime,g1gc.df$SysTime,g1gc.df$RealTime, main="Box Plot of Time per GC\n(crazy outlier excluded)", names=c("usr","sys","elapsed"), xlab="Seconds per GC", ylab="Time (Seconds)", horizontal = TRUE, outcol="red") box(which = "outer", lty = "solid") Here is the crazy outlier for future analysis: crazy.outlier.df ## row.names SecondsSinceLaunch IncrementalCount ## 8233 2014-05-12T23:15:43.903-0700: 20741 8316 ## FullCount UserTime SysTime RealTime BeforeSize AfterSize TotalSize ## 8233 112 0.55 0.42 488.1 8381440 8235008 9437184 ## Delta ## 8233 146432 R Time Series Data To analyze the garbage collection as a time series, I’ll use Z’s Ordered Observations (zoo). “zoo is the creator for an S3 class of indexed totally ordered observations which includes irregular time series.” require(zoo) ## Loading required package: zoo ## ## Attaching package: 'zoo' ## ## The following objects are masked from 'package:base': ## ## as.Date, as.Date.numeric head(g1gc.df[,1]) ## [1] "2014-05-12T14:00:32.868-0700:" "2014-05-12T14:00:33.179-0700:" ## [3] "2014-05-12T14:00:33.677-0700:" "2014-05-12T14:00:35.538-0700:" ## [5] "2014-05-12T14:00:37.811-0700:" "2014-05-12T14:00:41.428-0700:" options("digits.secs"=3) times=as.POSIXct( g1gc.df[,1], format="%Y-%m-%dT%H:%M:%OS%z:") g1gc.z = zoo(g1gc.df[,-c(1)], order.by=times) head(g1gc.z) ## SecondsSinceLaunch IncrementalCount FullCount ## 2014-05-12 17:00:32.868 1.161 0 0 ## 2014-05-12 17:00:33.178 1.472 1 0 ## 2014-05-12 17:00:33.677 1.969 2 0 ## 2014-05-12 17:00:35.538 3.830 3 0 ## 2014-05-12 17:00:37.811 6.103 4 0 ## 2014-05-12 17:00:41.427 9.720 5 0 ## UserTime SysTime RealTime BeforeSize AfterSize ## 2014-05-12 17:00:32.868 0.11 0.04 0.02 8192 1400 ## 2014-05-12 17:00:33.178 0.05 0.01 0.02 5496 1672 ## 2014-05-12 17:00:33.677 0.04 0.01 0.01 5768 2557 ## 2014-05-12 17:00:35.538 0.21 0.05 0.04 22528 4907 ## 2014-05-12 17:00:37.811 0.08 0.01 0.02 24576 7072 ## 2014-05-12 17:00:41.427 0.26 0.06 0.04 43008 14336 ## TotalSize Delta ## 2014-05-12 17:00:32.868 9437184 6792 ## 2014-05-12 17:00:33.178 9437184 3824 ## 2014-05-12 17:00:33.677 9437184 3211 ## 2014-05-12 17:00:35.538 9437184 17621 ## 2014-05-12 17:00:37.811 9437184 17504 ## 2014-05-12 17:00:41.427 9437184 28672 Example of Two Benchmark Runs in One Log File The data in the following graph is from a different log file, not the one of primary interest to this article. I’m including this image because it is an example of idle periods followed by busy periods. It would be uninteresting to average the rate of garbage collection over the entire log file period. More interesting would be the rate of garbage collect in the two busy periods. Are they the same or different? Your production data may be similar, for example, bursts when employees return from lunch and idle times on weekend evenings, etc. Once the data is in an R Time Series, you can analyze isolated time windows. Clipping the Time Series data Flashing back to our test case… Viewing the data as a time series is interesting. You can see that the work intensive time period is between 9:00 PM and 3:00 AM. Lets clip the data to the interesting period: par(mfrow=c(2,1)) plot(g1gc.z$UserTime, type="h", main="User Time per GC\nTime: Complete Log File", xlab="Time of Day", ylab="CPU Seconds per GC", col="#1b9e77") clipped.g1gc.z=window(g1gc.z, start=as.POSIXct("2014-05-12 21:00:00"), end=as.POSIXct("2014-05-13 03:00:00")) plot(clipped.g1gc.z$UserTime, type="h", main="User Time per GC\nTime: Limited to Benchmark Execution", xlab="Time of Day", ylab="CPU Seconds per GC", col="#1b9e77") box(which = "outer", lty = "solid") Cumulative Incremental and Full GC count Here is the cumulative incremental and full GC count. When the line is very steep, it indicates that the GCs are repeating very quickly. Notice that the scale on the Y axis is different for full vs. incremental. plot(clipped.g1gc.z[,c(2:3)], main="Cumulative Incremental and Full GC count", xlab="Time of Day", col="#1b9e77") GC Analysis of Benchmark Execution using Time Series data In the following series of 3 graphs: The “After Size” show the amount of heap space in use after each garbage collection. Many Java objects are still referenced, i.e. alive, during each garbage collection. This may indicate that the application has a memory leak, or may indicate that the application has a very large memory footprint. Typically, an application's memory footprint plateau's in the early stage of execution. One would expect this graph to have a flat top. The steep decline in the heap space may indicate that the application crashed after 2:00. The second graph shows that the outliers in real execution time, discussed above, occur near 2:00. when the Java heap seems to be quite full. The third graph shows that Full GCs are infrequent during the first few hours of execution. The rate of Full GC's, (the slope of the cummulative Full GC line), changes near midnight. plot(clipped.g1gc.z[,c("AfterSize","RealTime","FullCount")], xlab="Time of Day", col=c("#1b9e77","red","#1b9e77")) GC Analysis of heap recovered Each GC trace includes the amount of heap space in use before and after the individual GC event. During garbage coolection, unreferenced objects are identified, the space holding the unreferenced objects is freed, and thus, the difference in before and after usage indicates how much space has been freed. The following box plot and bar chart both demonstrate the same point - the amount of heap space freed per garbage colloection is surprisingly low. par(mfrow=c(2,1)) boxplot(as.vector(clipped.g1gc.z$Delta), main="Amount of Heap Recovered per GC Pass", xlab="Size in KB", horizontal = TRUE, col="red") hist(as.vector(clipped.g1gc.z$Delta), main="Amount of Heap Recovered per GC Pass", xlab="Size in KB", breaks=100, col="red") box(which = "outer", lty = "solid") This graph is the most interesting. The dark blue area shows how much heap is occupied by referenced Java objects. This represents memory that holds live data. The red fringe at the top shows how much data was recovered after each garbage collection. barplot(clipped.g1gc.z[,c("AfterSize","Delta")], col=c("#7570b3","#e7298a"), xlab="Time of Day", border=NA) legend("topleft", c("Live Objects","Heap Recovered on GC"), fill=c("#7570b3","#e7298a")) box(which = "outer", lty = "solid") When I discuss the data in the log files with the customer, I will ask for an explaination for the large amount of referenced data resident in the Java heap. There are two are posibilities: There is a memory leak and the amount of space required to hold referenced objects will continue to grow, limited only by the maximum heap size. After the maximum heap size is reached, the JVM will throw an “Out of Memory” exception every time that the application tries to allocate a new object. If this is the case, the aplication needs to be debugged to identify why old objects are referenced when they are no longer needed. The application has a legitimate requirement to keep a large amount of data in memory. The customer may want to further increase the maximum heap size. Another possible solution would be to partition the application across multiple cluster nodes, where each node has responsibility for managing a unique subset of the data. Conclusion In conclusion, R is a very powerful tool for the analysis of Java garbage collection log files. The primary difficulty is data cleansing so that information can be read into an R data frame. Once the data has been read into R, a rich set of tools may be used for thorough evaluation.

Developer IT