restricted formats - Page 73

How to install/configure ffmpeg to compress mp4 videos for flash player delivery?

- by Andrew Fulton

We have a flash web-app that created interactive video, and are using ffmpeg to do some compression/resizing when a user "publishes" their project. The user can upload flv files and mp4 files, both of which play fine in the Flash UI before publishing. After publishing the flv files work fine, but the mp4 files will not play in the flash player: Audio will play but video won't. The mp4 files will play fine if I download them and play them in the Quicktime player but if I attempt to open them in the Adobe Media Player it reports "The media file does not contain a supported video track". If I open the Movie inspector in quicktime it tells me that the original file is an "h264" video and the ffmpeg-processed ones are "mpeg-4". I have tried forcing it to h264 by adding flags like -f h264 and -vcodec h264 but I get a screenfull of errors (no frame, illegal POC type, sps_id out of range) ending with Could not find codec parameters (Video: h264) h264 will show up if I run ffmpeg -formats and ffmpeg -codecs, and as I said it will play fine in Quicktime. Is there anything else I need to do to convince the flash player to play them? Is there anything else I need to tell you about the server that will help?

Read the article

OpenVPN-based VPN server on same system it's "protecting": feasible?

- by Johnny Utahh

Scenario: hosted machine (typically a VPS) serving wiki, svn, git, forums, email lists (eg: GNU mailman), Bugzilla (etc) privately to < 20 people. People not on team not allowed access. Seeking VPN-restricted access to said server. Have good user experience with OpenVPN-based servers/clients, but have yet to server-admin such systems. Otherwise, experienced Linux sysadmin. Target system: Ubuntu, probably 12.04. Seeking to put an OpenVPN process on above server to "protect" all the above-mentioned services, enabling only OpenVPN-authorized clients/processes to access above services. (Can easily acquire additional IP address(es) as needed for this setup.) Option: if absolutely needed, can employ an additional, dedicated, "VPN server" VPS simply to be my VPN server "front end." But prefer to have all server processes (VPN server plus other server apps) all running on same machine, if possible. Will consider further if dedicated-VPN-machine setup enables 1. easier installation/administration, 2. better/easier end-user experience, and/or 3. makes system significantly more secure. Any of above feasible? The main intention: create a VPN from purely-hosted resources, and not spend all the effort to make a non-VPN, secure site--which typically means "SSL wrapping" + all the continual webserver-application-update management. Let the VPN server deal with access security, and spend list time pushing said security "down" in the other apps/Apache.

Read the article

how to remove background layer of djvu file

- by Jon

Hello, I've downloaded some files from the Internet Archive. They come in different file formats and most of the time I use pdf. However, sometimes the scans are saves in colour instead of b/w. This makes it difficult/impossible to read on a dedicated ebook reader. In that case I downloaded the djvu files as on the PC you can select which layer (color, bw,fore,back) one would like to see. Selecting the bw gives excellent results. However, the ebook reader does not has this option. The question is, how can I remove /extract a layer from the djvu file and save only this layer. So far I've tried the following two approaches: 1) select bw in the djvu viewer on the PC and printed to postscript file. Followed by a ps2pdf conversion. This works, but generates a fairly large pdf file. Sure, I can again upload it to any2djvu but it just seems to much manual work for each file. 2) I tried the shared annotation feature and said (mode bw). This works on the PC as desired but is ignored on the ebook reader as the other layers are still present. Any help or suggestions would be greatly appreciated.

Read the article

Preserve embedded album art when converting from .flac to .ogg

- by Profpatsch

I want to convert my archived .flac library to .ogg for daily use. Using find ./ -iname '*.flac' -print0 | xargs -0 -n1 oggenc -q6 on the root music folder and then deleting every .flac (having copies of them in archive) seems straight forward, after trying it with one file it worked and all of the tags were transfered, too, except for one: Embedded album art! I always prefer emedded covers over folder images, since I have some albums with varying covers. One possible solution is discussed here, but the script only works if the image is already extracted: Embed album art in OGG through command line in linux One possible solution I thought about was extracting album art from every song (not every song has one, though, and some even 2 or 3!), temporarily saving it and then using the script to include it into the finished .ogg. But then I want to increase the number of processes xargs runs simultaniously to save time, so the temp images need to have a distinct name. Is there a (linux) program that knows how to handle this? Or is there a finished script floating around somewhere? It would be nice if oggenc supported adding embedded coverart and it really is a shame, since these two formats should (in theory) share the same tag format. Edit: 15 days and noone even tries to answer. It’s funny, most of my questions don’t get answered. Too hard? Wrong SE site?

Read the article

Export-Mailbox - fails with large folders

- by grojo

I am trying to move messages from a rather large mailbox to an archive mailbox. However I run into errors all the time. the command I am executing is Export-Mailbox -Identity MAILBOX_FROM -TargetMailbox ARCHIVE -TargetFolder ARCHIVE_FOLDER -StartDate 2009-02-01 -EndDate 2009-02-28 -DeleteContent -Confirm:$false I can copy/move some messages, but run into frequent "an unknown error has occurred" (statuscode -1056749164) I run the console as administrative user, and all permissions are set right, as far as I can tell. I've restricted the start and end dates in case the number of messages moved/deleted should create problems. Anything I am missing in my setup? Corrupted messages? Over-limit message sizes? Update: What I've learnt so far, is that folder with more than approx 3000 messages will generate errors. If mail retention is set (default 30 days), Export-Mailbox will scan all messages whether these were deleted in previous runs or not, and date restriction to limit number of messages will not work. To avoid errors, I've switched off deleted message retention for the mailbox, and moved the messages from one large folder to multiple folders, and moved these one by one...

Read the article

ubuntu preseed installation keep missing mirror files

- by JackWu

Install ubuntu12.04.2 with preseed file, but there is one buggy problem about preseed mirror setting. The symptom here is installing process got stuck. So I track down the log file, and find out the real problem, the installation is looking for a file that's not there. This is just one of them, another pops up if I faked this file. This all happened during preseed, so I believe preseed has something to do with this. I google ubuntu preseed mirror and find this post saying: # If you select ftp, the mirror/country string does not need to be set. #d-i mirror/protocol string ftp d-i mirror/country string manual d-i mirror/http/hostname string archive.ubuntu.com d-i mirror/http/directory string /ubuntu d-i mirror/http/proxy string # Alternatively: by default, the installer uses CC.archive.ubuntu.com where # CC is the ISO-3166-2 code for the selected country. You can preseed this # so that it does so without asking. #d-i mirror/http/mirror select CC.archive.ubuntu.com # Suite to install. #d-i mirror/suite string lucid # Suite to use for loading installer components (optional). #d-i mirror/udeb/suite string lucid # Components to use for loading installer components (optional). #d-i mirror/udeb/components multiselect main, restricted I wonder the difference between d-i mirror/http/hostname and d-i mirror/http/mirror, I mean they all specify a mirror, right? In my preseed file, this is no d-i mirror/http/mirror, and d-i mirror/http/hostname points to my own repo as you might notice in the previous image. Here is my question: Does preseed fetches file/resource from internet, if I use local repo? Why it's looking for file that's not even there? This has bothered for quite time, many thanks in advance to anyone who might give any help.

Read the article

How to connect the virtual networks of vmware guests running on different hosts?

- by gyrolf

In a test setup, we are running several virtual machines on a single vmware workstation host. All virtual machines are connected via a "host only" network. This runs fine up to 2 or 3 virtual machines (depending on the host hardware). To allow more virtual machines, we want to use more host machines. Details about the environment and applications: Host PCs are running Windows XP in a corporate intranet. VMware used is Workstation 6.5 Guests are running Windows Server 2003 All guests act as Web Servers One of the guests additionally acts as Windows File server, offering shared folders for the other guests to connect to. Restrictions: VMware guests shall not be visible from the intranet. Changes to the host PC are restricted by corporate policy. In the virtual network, no domain controller exists. All virtual machines are member of the same workgroup. Running the virtual network as NAT is possible. Port forwarding might be used if it does not conflict with ports used by the host PC. Looking for a solution, I found hints about using router or vpn software on the hosts, but without any details how to setup. (I found a similar question Sharing the network between 2 VMware hosts, but the answer was not sufficient for me.)

Read the article

How can I totally flatten a PDF in Mac OS on the command line?

- by Matthew Leingang

I use Mac OS X Snow Leopard. I have a PDF with form fields, annotations, and stamps on it. I would like to freeze (or "flatten") that PDF so that the form fields can't be changed and the annotations/stamps are no longer editable. Since I actually have many of these PDFs, I want to do this automatically on the command line. Some things I've tried/considered, with their degree of success: Open in Preview and Print to File. This creates a totally flat PDF without changing the file size. The only way to automate seems to be to write a kludgy UI-based AppleScript, though, which I've been trying to avoid. Open in Acrobat Pro and use a JavaScript function to flatten. Again, not sure how to automate this on the command line. Use pdftk with the flatten option. But this only flattens form fields, not stamps and other annotations. Use cupsfilter which can create PDF from many file formats. Like pdftk this flattened only the form fields. Use cups-pdf to hook into the Mac's printserver and save a PDF file instead of print. I used the macports version. The resulting file is flat but huge. I tried this on an 8MB file; the flattened PDF was 358MB! Perhaps this can be combined with a ghostscript call as in Ubuntu Tip:Howto reduce PDF file size from command line. Any other suggestions would be appreciated.

Read the article

Drowning in documents - recommend doc management solutions?

- by Martin Day

I've been researching document management lately. I want to organise my docs at home and also at the office. Finding affordable solutions one can actually test drive is quite hard. Some that I've downloaded just don't seem to work (testing on brand new Vista PC). I've seen some software on Amazon like Paperport but not really sure what they're like. For home I'd like something to organise files, full text search, good scanner integration, nice interface etc. But for the office it seems harder. I need something that does proper workflow and keeps versions. It will have an audit trail. Documents can be approved, checked in/out etc. I know a few clients who would like something similar. It would be great just to import thousands of documents from a shared drive and get them indexed with dupes killed. I'd like to be super clear about how/where the documents are being stored so that maintenance and backups are clear. My Google/twitter searches lead back to the same tired and vague webpages pushing what look like expensive and custom made solutions. Some might be very good I suppose but it's darn hard to tell. I don't mind a hosted package but all in all I don't think something like Google Docs, as good as it is now, will work. There are too many quirks and missing features (as compared to Office). Being able to work directly with the common Office file formats is important. I've noted a similar sounding question asked here back in August but it didn't seem to turn up too many solutions that I could easily and quickly apply. Also there could have been some changes since then so I feel it's worth asking.

Read the article

Lotus Notes 8.0.2 - how to stop all mail showing in customized view

- by mikolajek

I am using Lotus Notes 8.0.2 at work and unfortunately the admin restricted changing default folders design. Only little changes are possible (e.g. change columns order) and even them are resetted each time I restart the client. I've created a new view with my desired column order, changed sorting etc. I have only one problem - even though I changed the "view" preference to show messages from the inbox folder only, I keep seeing all mail, regardless of the folder they are placed into. I'm not a Lotus expert and don't really know how to code. Yet, I am surprised as I see in a "simple view" this: uses '(ChangeMeetingType), ...' form AND In folder 'Inbox' And in Formula view only this: SELECT ((Form = "(ChangeMeetingType)") | (Form = "(Return Receipt)") | (Form = "Return Receipt") | (Form = "(ReturnNonReceipt)") | (Form = "ReturnNonReceipt") | (Form = "Memo") | (Form = "Memo") | (Form = "MemoEA") | (Form = "Reply") | (Form = "Reply") | (Form = "Reply With History") | (Form = "Reply With History") | (Form = "To Do") | (Form = "Task") | (Form = "_Document Memo") | (Form = "$DocMemo") | (Form = "Word. Document$Word Memo") | (Form = "WordPro. Document$Word Pro Memo") | (Form = "AlternateMemo")) Therefore, it looks like no folder has been really selected. How can I create a solution to see: Inbox contents only? Just messages, invitations and other "normal" stuff - without calendar entries and contacts?

Read the article

Seagate 3TB hard drive loses format information

- by Victor Bugarin

I have a Windows 7x64 Ultimate, 6 GB memory, 1 TB HD. 3TB Barracuda XT HDD. The HDD is installed on a StarTech 4 bays external enclosure I had troubles so I converted to a GPT, created 1 partition and formatted as NTFS. The hard drive I can write and read to and from the hard drive but it will become unreadable at some point while I am copying files or after I have copied files to it. I have copied large Bluray movies and diverse video files, I have also copied 32 GB of pictures, and I have copied about 86 thousand music files in different formats. At some point the partition becomes unreadable and I have to format the partition again (all files lost) and I have to start the whole process again. At some point I have been unable to copy large ISO (Bluray movies) file images. I have partitioned the HDD in 2 partitions P1 - 2TB, P2 - 1TB and I have lost every single file in either partition the same way. I reformat the HDD and it seems fine. I have run seatools to check the hard drive and it reports to be OK. What gives?

Read the article

ms excel 2010 in windows xp - when open workbook the data is formatted differently than when i saved it

- by Justin

I haven't been able to find an answer to this. I have multiple files that I use regularly in excel that now have cell formats of "date". Every single cell in the entire workbook (all sheets) is now formatted as "date". The problem is that I lost my formatting for percents, numbers years, etc and now everything is converted to date (xx/xx/xxxx). I am able to open previously saved versions of a file (prior to me having the problem) and the cells are formatted as I intend them to be (percents, numbers, general, as well as dates). Since this has happened on a couple different files recently, I am wondering how this is happening and how do I prevent it from happening in the future. I cannot cure the problem just by highlighting the entire sheet and converting back to general because I lose all my percents and number formatting. Example (Correct formatting): Month Year Working Days MTD POS Curr Rem May 2012 22 0 1,553,549 June 2012 22 0 1,516,903 June 2011 22 0 1,555,512 June 2010 22 0 1,584,704 Example (Incorrect formatting): Month Year Working Days MTD POS Curr Rem June Tuesday, July 04, 1905 Wednesday, January 04, 1900 Wednesday, January 18, 1900 213,320 July Tuesday, July 04, 1905 Wednesday, January 04, 1900 Monday, January 16, 1900 314,261 July Monday, July 03, 1905 Wednesday, January 04, 1900 Sunday, January 15, 1900 447,759 July Sunday, July 02, 1905 Wednesday, January 04, 1900 Monday, January 16, 1900 321,952 Sorry for the mess. Any suggestions?

Read the article

mysql - moving to a lower performance server, how small can I go?

- by pedalpete

Read the article

Windows VPN client connect on different port

- by John Gardeniers

Scenario: Two Windows Server 2003 machines running RRAS VPNs. The firewall port forwards 1723 to one of those machines for normal remote access. I'd like to find a way to connect to the second machine as well. Not because I need to but just because it's the sort of thing I reckon should be possible but can't figure out how to do. Is it possible to have the Windows PPTP VPN client (on XP in this instance) connect on a port other than 1723? If so, I can simply port forward another port to the second server. I've done a fair bit of Googling over the last few days and have only found others asking the same question but no answers. I have of course tried to add a port number in the host name or IP connection box, in various formats, but to no avail. While this might be possible with a third part client I'm really only interested in whether or not it can be done with the Windows built-in client and if so how?. Perhaps there's a registry hack I'm not aware of?

Read the article

Efficient mirroring of directories using hardlinks

- by zoqaeski

I'm backing up my music collection on to a number of NTFS-formatted external hard-drives; however, as I store my main collection in FLAC and have my library on my laptop as MP3s to save space, I want to be able to back up both sets, because mass conversion between formats is time-consuming. The "music" directory can contain any format; the "mp3s" directory contains only MP3s converted from files in the "music" directory. The music collection on the laptop contains only MP3s, but they come from both sources. When I backup my laptop's library to the "mp3s" directory, I want to only copy across MP3 files that don't exist in the "music" directory; those that do should be hard-linked to the "music" directory. All directories have an identical hierarchy, sorted by artist, album, date, discnumber if applicable, etc, and I use a tagging editor to ensure consistency across all these locations. I'm also using a Linux computer, but keeping the music collections on NTFS-formatted partitions so that they are readable by both Linux and Windows. At the moment, I use the following command to perform the backups, but this is time-consuming due to the expensive nature of finding hard links. rsync -avu --progress --relative --ignore-existing --link-dest=../music/ **/*.mp3 /media/ntfspocket/mp3s Is there a way to perform this backup more efficiently, taking advantage of the directory hierarchy?

Read the article

How can I generate filesystem images that are usable on many different virtualization systems?

- by Mark Longair

I have written a script that generates a root filesystem image (based on Debian lenny) suitable for User-Mode Linux. (Essentially this script creates a filesystem image, mounts it with a loop device, uses debootstrap to create a lenny install, sets up a static IP for TUN/TAP networking, adds public keys for login by SSH and installs a web application.) These filesystem images work pretty well with UML, but it would be nice to be able to generate similar images that people can use on alternative virtualization software, and I'm not familiar with these options at all. In particular, since the idea is to use this image as a standalone server for testing the web application, it's important that the networking works. I wonder if anyone can suggest what would be involved in customizing such root filesystem images such that they could be used with other virtualization software, such as VMware, Xen or as an Amazon EC2 instance? Two particular concerns are: If such systems don't use a raw filesystem image (e.g. they need headers with metadata or are compressed in some particular way) do there exist tools to convert between the different formats? I assume that in the filesystem, at least /etc/network/interfaces will have to be customized, but are more involved changes likely to be necessary? Many thanks for any suggestions...

Read the article

Best format for hard drive for Windows and Mac?

- by Neil

I have a 500 GB USB External Hard Drive. I need four partitions on it, for the following purposes: 160 GB for a bootable backup of my Mac. 160 GB for a bootable backup of my Windows. 11 GB for a bootable Snow Leopard Install Disk Rest as for file storage. Now I need a partition table which will get recognised on both Windows and Mac, without needing extra software on Windows, which will let me keep bootable copies of both OS'es, but let me access the file storage from both OS'es. Currently, I have a GUI Partition Table, with Mac OS Extended (Journaled) Partitions for the two backups, Mac OS Extended for the Install Disk, and NTFS for the file storage. While this gets recognised perfectly on my Mac, thanks to an NTFS for Mac driver from Paragon, when connected to Windows, the drive is detected by the machine (listed in Safely Remove USB), but not recognised in Windows Explorer unless I install MacDrive, which is not feasible for me to install on public Windows Machines I might wanna access my storage area on. Can someone recommend the best combination of formats and software/drivers to get this done seamlessly?

Read the article

Which free RDBMS is best for small in-house development?

- by Nic Waller

I am the sole sysadmin for a small firm of about 50 people, and I have been asked to develop an in-house application for tracking job completion and providing reports based on that data. I'm planning on building it as a web application. I have roughly equal experience developing for MySQL, PostgreSQL, and MSSQL. We are primarily a Windows-based shop, but I'm fairly comfortable with both Windows and Linux system administration. These are my two biggest concerns: Ease of managability. I don't expect to be maintaining this database forever. For the sake of the person that eventually has to take over for me, which database has the lowest barrier to entry? Data integrity. This means transaction-safe, robust storage, and easy backup/recovery. Even better if the database can be easily replicated. There is not a lot of budget for this project, so I am restricted to working with one of the free database systems mentioned above. What would you choose?

Read the article

Bing Desktop not updating the wallpaper anymore

- by warmth

For some reason, first my workstation and then my tablet stopped updating the wallpaper. First I thought it was my company that was avoiding the app to work properly but then I started noticing that the app itself is a mess: It has two storage and formats for the wallpapers: C:\Users\<username>\AppData\Local\Microsoft\BingDesktop\en-US\Apps\Wallpaper_5386c77076d04cf9a8b5d619b4cba48e\VersionIndependent\images with a #####.jpg (single number) image format & C:\Users\<username>\AppData\Local\Microsoft\BingDesktop\themes with a ####-##-##.jpg (date) image format. I read here that deleting the themes folder it will get remade with the new images, and it worked. However those are not the files used by the Wallpaper app and deleting the images folder won't get the same result. I have added Bing Desktop to the Firewall white list and the issue is still there. Any ideas? Currently I'm using DisplayFusion to place the wallpaper manually because the company doesn't allow change the wallpapers (policies). Note: I wrote to the DisplayFusion developers to suggest adding a feature to support Bing Wallpapers. They told me there was no API support to implement it but they will study this possibility (workaround) for the future: http://stackoverflow.com/questions/10639914/is-there-a-way-to-get-bings-photo-of-the-day

Read the article

Why does loading a html page opens automaically the printer dialog?

- by Alex

I have the problem that loading a certain webpage in firefox automatically opens the printer dialog. How is that? Additional information: This only happens on firefox 24.0 on Windows 7. It neither happens on Windows Explorer in Windows 7 nor on Firefox 24 on a Linux system. This also happens when using the firefox safemode, with all add-ons disabled. I cannot post the webpage, since it is not public and restricted. Multiple javascript files are used, but none contains the expression print(). The page content does not contain the phrase 'printer'. It has nothing to do with printing something. This happens sometimes, but not always. I cannot say for sure the URL and/or the content is always identical. I can answer any other questions, like 'Does the page contain this and that...', 'does the URL contain this and that...' Basic question: What could be the reason for the annoying printer pop up?

Read the article

Need solution for Network/Servers.

- by rehanplus

Dear All, Please help me. I just joined a new Hospital and want some help managing my network. There are some requirements: Current Network: There is a D.S.L connection and that is terminated on a LINUX proxy and then connected to D-Link layer 2 switches and then providing internet to more then 200 PC's (Would be increasing to 1500 in couple of months). D-Link switches are not configured yet. Also there is one Database server Report server and an application server. In near Future Application should be accessed by local users as well as remote users from internet via our web server. We do have a sharing server and all these servers databases and PC's are on single sub net. Required Network: All i do want is to secure my network from outside access and just allowing specific users via web application and they will be submitting there record for patient card and appointment facility by means of application and entering there record (on our database) but not violating our network resources. Secondly in house users also need to access the same application and also internet but they must have some unique identity and rights (i.e. Finance lab dept. peoples do have limited access to that application). Notes: Should i create V LAN or break sub nets. Having a firewall will solve my issues? is a router needed on these type of scenario's. Currently all the access are restricted from Linux Proxy. Thanks.

Read the article

Efficient mirroring of directories using hard links [closed]

- by zoqaeski

I'm backing up my music collection on to a number of NTFS-formatted external hard-drives; however, as I store my main collection in FLAC and have my library on my laptop as MP3s to save space, I want to be able to back up both sets, because mass conversion between formats is time-consuming. The "music" directory can contain any format; the "mp3s" directory contains only MP3s converted from files in the "music" directory. The music collection on the laptop contains only MP3s, but they come from both sources. When I backup my laptop's library to the "mp3s" directory, I want to only copy across MP3 files that don't exist in the "music" directory; those that do should be hard-linked to the "music" directory. All directories have an identical hierarchy, sorted by artist, album, date, discnumber if applicable, etc, and I use a tagging editor to ensure consistency across all these locations. I'm also using a Linux computer, but keeping the music collections on NTFS-formatted partitions so that they are readable by both Linux and Windows. At the moment, I use the following command to perform the backups, but this is time-consuming due to the expensive nature of finding hard links. rsync -avu --progress --relative --ignore-existing --link-dest=../music/ **/*.mp3 /media/ntfspocket/mp3s Is there a way to perform this backup more efficiently, taking advantage of the directory hierarchy?

Read the article

How to securely control access to a backend key server?

- by andy

I need to securely encrypt data in my database so that if the database is dumped, hackers are unable to decrypt the data. I'm planning on creating a simple key server on a different machine, and allowing the DB server access to it (restricted by IP address on the key server to permit the DB server). The key server would contain the key required to encrypt/decrypt data. However, if a hacker were able to get a shell on the DB server, they could request the key from the key server and therefore decrypt the data in the database. How could I prevent this (assuming all firewalls are in place, DB is not connected directly to the internet, etc)? i.e. is there some method I could use that could secure a request from the DB server to the key server so that even if a hacker had a shell on the DB server they'd be unable to make those same requests? Signed requests from the DB server could make issuing these requests less trivial - I suppose that'd help increase the amount of time it'd take to compromise the key server, something a hacker probably wouldn't have much of. As far as I can see, if someone can get a shell on the DB server everything's lost anyway. This could be mitigated by using one key per data item in the DB so at least there's not a single "master" key, but multiple keys that the hacker would need to access. What would be a secure method of ensuring requests from the DB server to the key server were authentic and could be trusted?

Read the article

Windows Server 2008 - Non-Domain users can see my server shares

- by ManovrareSoft

Windows Server 2008 - Server Machine Windows 7 Professional - Client Machine I have a domain. It was setup by the client. The shares on the server are restricted correctly when a user logs on to the domain and uses their workstation, I have a few groups setup to restrict some access but the groups are at their core "Domain Users". The problem I am having is that when a user brings in a laptop with Windows 7 Pro on it, they can type up the name of the server in the "Run Dialog" on the start menu like "\SERVERNAME\" and access all of the shares freely... because they are not logged in to the domain there are no restrictions it seems.I have reviewed the permissions on the folders and they all have to be "Domain Users" and I have removed "Everyone" from the list of people able to see it. Guest access is also disabled...What am I doing wrong? Only group in the list is "Domain Users" isn't a domain user a user that is logged in to the domain? How do I stop non-domain users from seeing the shared folder? I noticed this on Windows Server 2003 too at another time. I assume they both had similar security issues and neither were set up by myself so I am not sure what could have been enabled or specifically deactivated that makes this issue appear.

Read the article

Using R to Analyze G1GC Log Files

- by user12620111

Using R to Analyze G1GC Log Files body, td { font-family: sans-serif; background-color: white; font-size: 12px; margin: 8px; } tt, code, pre { font-family: 'DejaVu Sans Mono', 'Droid Sans Mono', 'Lucida Console', Consolas, Monaco, monospace; } h1 { font-size:2.2em; } h2 { font-size:1.8em; } h3 { font-size:1.4em; } h4 { font-size:1.0em; } h5 { font-size:0.9em; } h6 { font-size:0.8em; } a:visited { color: rgb(50%, 0%, 50%); } pre { margin-top: 0; max-width: 95%; border: 1px solid #ccc; white-space: pre-wrap; } pre code { display: block; padding: 0.5em; } code.r, code.cpp { background-color: #F8F8F8; } table, td, th { border: none; } blockquote { color:#666666; margin:0; padding-left: 1em; border-left: 0.5em #EEE solid; } hr { height: 0px; border-bottom: none; border-top-width: thin; border-top-style: dotted; border-top-color: #999999; } @media print { * { background: transparent !important; color: black !important; filter:none !important; -ms-filter: none !important; } body { font-size:12pt; max-width:100%; } a, a:visited { text-decoration: underline; } hr { visibility: hidden; page-break-before: always; } pre, blockquote { padding-right: 1em; page-break-inside: avoid; } tr, img { page-break-inside: avoid; } img { max-width: 100% !important; } @page :left { margin: 15mm 20mm 15mm 10mm; } @page :right { margin: 15mm 10mm 15mm 20mm; } p, h2, h3 { orphans: 3; widows: 3; } h2, h3 { page-break-after: avoid; } } pre .operator, pre .paren { color: rgb(104, 118, 135) } pre .literal { color: rgb(88, 72, 246) } pre .number { color: rgb(0, 0, 205); } pre .comment { color: rgb(76, 136, 107); } pre .keyword { color: rgb(0, 0, 255); } pre .identifier { color: rgb(0, 0, 0); } pre .string { color: rgb(3, 106, 7); } var hljs=new function(){function m(p){return p.replace(/&/gm,"&").replace(/"}while(y.length||w.length){var v=u().splice(0,1)[0];z+=m(x.substr(q,v.offset-q));q=v.offset;if(v.event=="start"){z+=t(v.node);s.push(v.node)}else{if(v.event=="stop"){var p,r=s.length;do{r--;p=s[r];z+=("")}while(p!=v.node);s.splice(r,1);while(r'+M[0]+""}else{r+=M[0]}O=P.lR.lastIndex;M=P.lR.exec(L)}return r+L.substr(O,L.length-O)}function J(L,M){if(M.sL&&e[M.sL]){var r=d(M.sL,L);x+=r.keyword_count;return r.value}else{return F(L,M)}}function I(M,r){var L=M.cN?'':"";if(M.rB){y+=L;M.buffer=""}else{if(M.eB){y+=m(r)+L;M.buffer=""}else{y+=L;M.buffer=r}}D.push(M);A+=M.r}function G(N,M,Q){var R=D[D.length-1];if(Q){y+=J(R.buffer+N,R);return false}var P=q(M,R);if(P){y+=J(R.buffer+N,R);I(P,M);return P.rB}var L=v(D.length-1,M);if(L){var O=R.cN?"":"";if(R.rE){y+=J(R.buffer+N,R)+O}else{if(R.eE){y+=J(R.buffer+N,R)+O+m(M)}else{y+=J(R.buffer+N+M,R)+O}}while(L1){O=D[D.length-2].cN?"":"";y+=O;L--;D.length--}var r=D[D.length-1];D.length--;D[D.length-1].buffer="";if(r.starts){I(r.starts,"")}return R.rE}if(w(M,R)){throw"Illegal"}}var E=e[B];var D=[E.dM];var A=0;var x=0;var y="";try{var s,u=0;E.dM.buffer="";do{s=p(C,u);var t=G(s[0],s[1],s[2]);u+=s[0].length;if(!t){u+=s[1].length}}while(!s[2]);if(D.length1){throw"Illegal"}return{r:A,keyword_count:x,value:y}}catch(H){if(H=="Illegal"){return{r:0,keyword_count:0,value:m(C)}}else{throw H}}}function g(t){var p={keyword_count:0,r:0,value:m(t)};var r=p;for(var q in e){if(!e.hasOwnProperty(q)){continue}var s=d(q,t);s.language=q;if(s.keyword_count+s.rr.keyword_count+r.r){r=s}if(s.keyword_count+s.rp.keyword_count+p.r){r=p;p=s}}if(r.language){p.second_best=r}return p}function i(r,q,p){if(q){r=r.replace(/^((]+|\t)+)/gm,function(t,w,v,u){return w.replace(/\t/g,q)})}if(p){r=r.replace(/\n/g,"")}return r}function n(t,w,r){var x=h(t,r);var v=a(t);var y,s;if(v){y=d(v,x)}else{return}var q=c(t);if(q.length){s=document.createElement("pre");s.innerHTML=y.value;y.value=k(q,c(s),x)}y.value=i(y.value,w,r);var u=t.className;if(!u.match("(\\s|^)(language-)?"+v+"(\\s|$)")){u=u?(u+" "+v):v}if(/MSIE [678]/.test(navigator.userAgent)&&t.tagName=="CODE"&&t.parentNode.tagName=="PRE"){s=t.parentNode;var p=document.createElement("div");p.innerHTML=""+y.value+"";t=p.firstChild.firstChild;p.firstChild.cN=s.cN;s.parentNode.replaceChild(p.firstChild,s)}else{t.innerHTML=y.value}t.className=u;t.result={language:v,kw:y.keyword_count,re:y.r};if(y.second_best){t.second_best={language:y.second_best.language,kw:y.second_best.keyword_count,re:y.second_best.r}}}function o(){if(o.called){return}o.called=true;var r=document.getElementsByTagName("pre");for(var p=0;p|=||=||=|\\?|\\[|\\{|\\(|\\^|\\^=|\\||\\|=|\\|\\||~";this.ER="(?![\\s\\S])";this.BE={b:"\\\\.",r:0};this.ASM={cN:"string",b:"'",e:"'",i:"\\n",c:[this.BE],r:0};this.QSM={cN:"string",b:'"',e:'"',i:"\\n",c:[this.BE],r:0};this.CLCM={cN:"comment",b:"//",e:"$"};this.CBLCLM={cN:"comment",b:"/\\*",e:"\\*/"};this.HCM={cN:"comment",b:"#",e:"$"};this.NM={cN:"number",b:this.NR,r:0};this.CNM={cN:"number",b:this.CNR,r:0};this.BNM={cN:"number",b:this.BNR,r:0};this.inherit=function(r,s){var p={};for(var q in r){p[q]=r[q]}if(s){for(var q in s){p[q]=s[q]}}return p}}();hljs.LANGUAGES.cpp=function(){var a={keyword:{"false":1,"int":1,"float":1,"while":1,"private":1,"char":1,"catch":1,"export":1,virtual:1,operator:2,sizeof:2,dynamic_cast:2,typedef:2,const_cast:2,"const":1,struct:1,"for":1,static_cast:2,union:1,namespace:1,unsigned:1,"long":1,"throw":1,"volatile":2,"static":1,"protected":1,bool:1,template:1,mutable:1,"if":1,"public":1,friend:2,"do":1,"return":1,"goto":1,auto:1,"void":2,"enum":1,"else":1,"break":1,"new":1,extern:1,using:1,"true":1,"class":1,asm:1,"case":1,typeid:1,"short":1,reinterpret_cast:2,"default":1,"double":1,register:1,explicit:1,signed:1,typename:1,"try":1,"this":1,"switch":1,"continue":1,wchar_t:1,inline:1,"delete":1,alignof:1,char16_t:1,char32_t:1,constexpr:1,decltype:1,noexcept:1,nullptr:1,static_assert:1,thread_local:1,restrict:1,_Bool:1,complex:1},built_in:{std:1,string:1,cin:1,cout:1,cerr:1,clog:1,stringstream:1,istringstream:1,ostringstream:1,auto_ptr:1,deque:1,list:1,queue:1,stack:1,vector:1,map:1,set:1,bitset:1,multiset:1,multimap:1,unordered_set:1,unordered_map:1,unordered_multiset:1,unordered_multimap:1,array:1,shared_ptr:1}};return{dM:{k:a,i:"",k:a,r:10,c:["self"]}]}}}();hljs.LANGUAGES.r={dM:{c:[hljs.HCM,{cN:"number",b:"\\b0[xX][0-9a-fA-F]+[Li]?\\b",e:hljs.IMMEDIATE_RE,r:0},{cN:"number",b:"\\b\\d+(?:[eE][+\\-]?\\d*)?L\\b",e:hljs.IMMEDIATE_RE,r:0},{cN:"number",b:"\\b\\d+\\.(?!\\d)(?:i\\b)?",e:hljs.IMMEDIATE_RE,r:1},{cN:"number",b:"\\b\\d+(?:\\.\\d*)?(?:[eE][+\\-]?\\d*)?i?\\b",e:hljs.IMMEDIATE_RE,r:0},{cN:"number",b:"\\.\\d+(?:[eE][+\\-]?\\d*)?i?\\b",e:hljs.IMMEDIATE_RE,r:1},{cN:"keyword",b:"(?:tryCatch|library|setGeneric|setGroupGeneric)\\b",e:hljs.IMMEDIATE_RE,r:10},{cN:"keyword",b:"\\.\\.\\.",e:hljs.IMMEDIATE_RE,r:10},{cN:"keyword",b:"\\.\\.\\d+(?![\\w.])",e:hljs.IMMEDIATE_RE,r:10},{cN:"keyword",b:"\\b(?:function)",e:hljs.IMMEDIATE_RE,r:2},{cN:"keyword",b:"(?:if|in|break|next|repeat|else|for|return|switch|while|try|stop|warning|require|attach|detach|source|setMethod|setClass)\\b",e:hljs.IMMEDIATE_RE,r:1},{cN:"literal",b:"(?:NA|NA_integer_|NA_real_|NA_character_|NA_complex_)\\b",e:hljs.IMMEDIATE_RE,r:10},{cN:"literal",b:"(?:NULL|TRUE|FALSE|T|F|Inf|NaN)\\b",e:hljs.IMMEDIATE_RE,r:1},{cN:"identifier",b:"[a-zA-Z.][a-zA-Z0-9._]*\\b",e:hljs.IMMEDIATE_RE,r:0},{cN:"operator",b:"|=|| Using R to Analyze G1GC Log Files Using R to Analyze G1GC Log Files Introduction Working in Oracle Platform Integration gives an engineer opportunities to work on a wide array of technologies. My team’s goal is to make Oracle applications run best on the Solaris/SPARC platform. When looking for bottlenecks in a modern applications, one needs to be aware of not only how the CPUs and operating system are executing, but also network, storage, and in some cases, the Java Virtual Machine. I was recently presented with about 1.5 GB of Java Garbage First Garbage Collector log file data. If you’re not familiar with the subject, you might want to review Garbage First Garbage Collector Tuning by Monica Beckwith. The customer had been running Java HotSpot 1.6.0_31 to host a web application server. I was told that the Solaris/SPARC server was running a Java process launched using a commmand line that included the following flags: -d64 -Xms9g -Xmx9g -XX:+UseG1GC -XX:MaxGCPauseMillis=200 -XX:InitiatingHeapOccupancyPercent=80 -XX:PermSize=256m -XX:MaxPermSize=256m -XX:+PrintGC -XX:+PrintGCTimeStamps -XX:+PrintHeapAtGC -XX:+PrintGCDateStamps -XX:+PrintFlagsFinal -XX:+DisableExplicitGC -XX:+UnlockExperimentalVMOptions -XX:ParallelGCThreads=8 Several sources on the internet indicate that if I were to print out the 1.5 GB of log files, it would require enough paper to fill the bed of a pick up truck. Of course, it would be fruitless to try to scan the log files by hand. Tools will be required to summarize the contents of the log files. Others have encountered large Java garbage collection log files. There are existing tools to analyze the log files: IBM’s GC toolkit The chewiebug GCViewer gchisto HPjmeter Instead of using one of the other tools listed, I decide to parse the log files with standard Unix tools, and analyze the data with R. Data Cleansing The log files arrived in two different formats. I guess that the difference is that one set of log files was generated using a more verbose option, maybe -XX:+PrintHeapAtGC, and the other set of log files was generated without that option. Format 1 In some of the log files, the log files with the less verbose format, a single trace, i.e. the report of a singe garbage collection event, looks like this: {Heap before GC invocations=12280 (full 61): garbage-first heap total 9437184K, used 7499918K [0xfffffffd00000000, 0xffffffff40000000, 0xffffffff40000000) region size 4096K, 1 young (4096K), 0 survivors (0K) compacting perm gen total 262144K, used 144077K [0xffffffff40000000, 0xffffffff50000000, 0xffffffff50000000) the space 262144K, 54% used [0xffffffff40000000, 0xffffffff48cb3758, 0xffffffff48cb3800, 0xffffffff50000000) No shared spaces configured. 2014-05-14T07:24:00.988-0700: 60586.353: [GC pause (young) 7324M->7320M(9216M), 0.1567265 secs] Heap after GC invocations=12281 (full 61): garbage-first heap total 9437184K, used 7496533K [0xfffffffd00000000, 0xffffffff40000000, 0xffffffff40000000) region size 4096K, 0 young (0K), 0 survivors (0K) compacting perm gen total 262144K, used 144077K [0xffffffff40000000, 0xffffffff50000000, 0xffffffff50000000) the space 262144K, 54% used [0xffffffff40000000, 0xffffffff48cb3758, 0xffffffff48cb3800, 0xffffffff50000000) No shared spaces configured. } A simple grep can be used to extract a summary: $ grep "\[ GC pause (young" g1gc.log 2014-05-13T13:24:35.091-0700: 3.109: [GC pause (young) 20M->5029K(9216M), 0.0146328 secs] 2014-05-13T13:24:35.440-0700: 3.459: [GC pause (young) 9125K->6077K(9216M), 0.0086723 secs] 2014-05-13T13:24:37.581-0700: 5.599: [GC pause (young) 25M->8470K(9216M), 0.0203820 secs] 2014-05-13T13:24:42.686-0700: 10.704: [GC pause (young) 44M->15M(9216M), 0.0288848 secs] 2014-05-13T13:24:48.941-0700: 16.958: [GC pause (young) 51M->20M(9216M), 0.0491244 secs] 2014-05-13T13:24:56.049-0700: 24.066: [GC pause (young) 92M->26M(9216M), 0.0525368 secs] 2014-05-13T13:25:34.368-0700: 62.383: [GC pause (young) 602M->68M(9216M), 0.1721173 secs] But that format wasn't easily read into R, so I needed to be a bit more tricky. I used the following Unix command to create a summary file that was easy for R to read. $ echo "SecondsSinceLaunch BeforeSize AfterSize TotalSize RealTime" $ grep "\[GC pause (young" g1gc.log | grep -v mark | sed -e 's/[A-SU-z,]/ /g' -e 's/->/ /' -e 's/: / /g' | more SecondsSinceLaunch BeforeSize AfterSize TotalSize RealTime 2014-05-13T13:24:35.091-0700 3.109 20 5029 9216 0.0146328 2014-05-13T13:24:35.440-0700 3.459 9125 6077 9216 0.0086723 2014-05-13T13:24:37.581-0700 5.599 25 8470 9216 0.0203820 2014-05-13T13:24:42.686-0700 10.704 44 15 9216 0.0288848 2014-05-13T13:24:48.941-0700 16.958 51 20 9216 0.0491244 2014-05-13T13:24:56.049-0700 24.066 92 26 9216 0.0525368 2014-05-13T13:25:34.368-0700 62.383 602 68 9216 0.1721173 Format 2 In some of the log files, the log files with the more verbose format, a single trace, i.e. the report of a singe garbage collection event, was more complicated than Format 1. Here is a text file with an example of a single G1GC trace in the second format. As you can see, it is quite complicated. It is nice that there is so much information available, but the level of detail can be overwhelming. I wrote this awk script (download) to summarize each trace on a single line. #!/usr/bin/env awk -f BEGIN { printf("SecondsSinceLaunch IncrementalCount FullCount UserTime SysTime RealTime BeforeSize AfterSize TotalSize\n") } ###################### # Save count data from lines that are at the start of each G1GC trace. # Each trace starts out like this: # {Heap before GC invocations=14 (full 0): # garbage-first heap total 9437184K, used 325496K [0xfffffffd00000000, 0xffffffff40000000, 0xffffffff40000000) ###################### /{Heap.*full/{ gsub ( "\\)" , "" ); nf=split($0,a,"="); split(a[2],b," "); getline; if ( match($0, "first") ) { G1GC=1; IncrementalCount=b[1]; FullCount=substr( b[3], 1, length(b[3])-1 ); } else { G1GC=0; } } ###################### # Pull out time stamps that are in lines with this format: # 2014-05-12T14:02:06.025-0700: 94.312: [GC pause (young), 0.08870154 secs] ###################### /GC pause/ { DateTime=$1; SecondsSinceLaunch=substr($2, 1, length($2)-1); } ###################### # Heap sizes are in lines that look like this: # [ 4842M->4838M(9216M)] ###################### /\[ .*]$/ { gsub ( "\\[" , "" ); gsub ( "\ \]" , "" ); gsub ( "->" , " " ); gsub ( "\$ " , " " ); gsub ( "\ $" , " " ); split($0,a," "); if ( split(a[1],b,"M") > 1 ) {BeforeSize=b[1]*1024;} if ( split(a[1],b,"K") > 1 ) {BeforeSize=b[1];} if ( split(a[2],b,"M") > 1 ) {AfterSize=b[1]*1024;} if ( split(a[2],b,"K") > 1 ) {AfterSize=b[1];} if ( split(a[3],b,"M") > 1 ) {TotalSize=b[1]*1024;} if ( split(a[3],b,"K") > 1 ) {TotalSize=b[1];} } ###################### # Emit an output line when you find input that looks like this: # [Times: user=1.41 sys=0.08, real=0.24 secs] ###################### /\[Times/ { if (G1GC==1) { gsub ( "," , "" ); split($2,a,"="); UserTime=a[2]; split($3,a,"="); SysTime=a[2]; split($4,a,"="); RealTime=a[2]; print DateTime,SecondsSinceLaunch,IncrementalCount,FullCount,UserTime,SysTime,RealTime,BeforeSize,AfterSize,TotalSize; G1GC=0; } } The resulting summary is about 25X smaller that the original file, but still difficult for a human to digest. SecondsSinceLaunch IncrementalCount FullCount UserTime SysTime RealTime BeforeSize AfterSize TotalSize ... 2014-05-12T18:36:34.669-0700: 3985.744 561 0 0.57 0.06 0.16 1724416 1720320 9437184 2014-05-12T18:36:34.839-0700: 3985.914 562 0 0.51 0.06 0.19 1724416 1720320 9437184 2014-05-12T18:36:35.069-0700: 3986.144 563 0 0.60 0.04 0.27 1724416 1721344 9437184 2014-05-12T18:36:35.354-0700: 3986.429 564 0 0.33 0.04 0.09 1725440 1722368 9437184 2014-05-12T18:36:35.545-0700: 3986.620 565 0 0.58 0.04 0.17 1726464 1722368 9437184 2014-05-12T18:36:35.726-0700: 3986.801 566 0 0.43 0.05 0.12 1726464 1722368 9437184 2014-05-12T18:36:35.856-0700: 3986.930 567 0 0.30 0.04 0.07 1726464 1723392 9437184 2014-05-12T18:36:35.947-0700: 3987.023 568 0 0.61 0.04 0.26 1727488 1723392 9437184 2014-05-12T18:36:36.228-0700: 3987.302 569 0 0.46 0.04 0.16 1731584 1724416 9437184 Reading the Data into R Once the GC log data had been cleansed, either by processing the first format with the shell script, or by processing the second format with the awk script, it was easy to read the data into R. g1gc.df = read.csv("summary.txt", row.names = NULL, stringsAsFactors=FALSE,sep="") str(g1gc.df) ## 'data.frame': 8307 obs. of 10 variables: ## $ row.names : chr "2014-05-12T14:00:32.868-0700:" "2014-05-12T14:00:33.179-0700:" "2014-05-12T14:00:33.677-0700:" "2014-05-12T14:00:35.538-0700:" ... ## $ SecondsSinceLaunch: num 1.16 1.47 1.97 3.83 6.1 ... ## $ IncrementalCount : int 0 1 2 3 4 5 6 7 8 9 ... ## $ FullCount : int 0 0 0 0 0 0 0 0 0 0 ... ## $ UserTime : num 0.11 0.05 0.04 0.21 0.08 0.26 0.31 0.33 0.34 0.56 ... ## $ SysTime : num 0.04 0.01 0.01 0.05 0.01 0.06 0.07 0.06 0.07 0.09 ... ## $ RealTime : num 0.02 0.02 0.01 0.04 0.02 0.04 0.05 0.04 0.04 0.06 ... ## $ BeforeSize : int 8192 5496 5768 22528 24576 43008 34816 53248 55296 93184 ... ## $ AfterSize : int 1400 1672 2557 4907 7072 14336 16384 18432 19456 21504 ... ## $ TotalSize : int 9437184 9437184 9437184 9437184 9437184 9437184 9437184 9437184 9437184 9437184 ... head(g1gc.df) ## row.names SecondsSinceLaunch IncrementalCount ## 1 2014-05-12T14:00:32.868-0700: 1.161 0 ## 2 2014-05-12T14:00:33.179-0700: 1.472 1 ## 3 2014-05-12T14:00:33.677-0700: 1.969 2 ## 4 2014-05-12T14:00:35.538-0700: 3.830 3 ## 5 2014-05-12T14:00:37.811-0700: 6.103 4 ## 6 2014-05-12T14:00:41.428-0700: 9.720 5 ## FullCount UserTime SysTime RealTime BeforeSize AfterSize TotalSize ## 1 0 0.11 0.04 0.02 8192 1400 9437184 ## 2 0 0.05 0.01 0.02 5496 1672 9437184 ## 3 0 0.04 0.01 0.01 5768 2557 9437184 ## 4 0 0.21 0.05 0.04 22528 4907 9437184 ## 5 0 0.08 0.01 0.02 24576 7072 9437184 ## 6 0 0.26 0.06 0.04 43008 14336 9437184 Basic Statistics Once the data has been read into R, simple statistics are very easy to generate. All of the numbers from high school statistics are available via simple commands. For example, generate a summary of every column: summary(g1gc.df) ## row.names SecondsSinceLaunch IncrementalCount FullCount ## Length:8307 Min. : 1 Min. : 0 Min. : 0.0 ## Class :character 1st Qu.: 9977 1st Qu.:2048 1st Qu.: 0.0 ## Mode :character Median :12855 Median :4136 Median : 12.0 ## Mean :12527 Mean :4156 Mean : 31.6 ## 3rd Qu.:15758 3rd Qu.:6262 3rd Qu.: 61.0 ## Max. :55484 Max. :8391 Max. :113.0 ## UserTime SysTime RealTime BeforeSize ## Min. :0.040 Min. :0.0000 Min. : 0.0 Min. : 5476 ## 1st Qu.:0.470 1st Qu.:0.0300 1st Qu.: 0.1 1st Qu.:5137920 ## Median :0.620 Median :0.0300 Median : 0.1 Median :6574080 ## Mean :0.751 Mean :0.0355 Mean : 0.3 Mean :5841855 ## 3rd Qu.:0.920 3rd Qu.:0.0400 3rd Qu.: 0.2 3rd Qu.:7084032 ## Max. :3.370 Max. :1.5600 Max. :488.1 Max. :8696832 ## AfterSize TotalSize ## Min. : 1380 Min. :9437184 ## 1st Qu.:5002752 1st Qu.:9437184 ## Median :6559744 Median :9437184 ## Mean :5785454 Mean :9437184 ## 3rd Qu.:7054336 3rd Qu.:9437184 ## Max. :8482816 Max. :9437184 Q: What is the total amount of User CPU time spent in garbage collection? sum(g1gc.df$UserTime) ## [1] 6236 As you can see, less than two hours of CPU time was spent in garbage collection. Is that too much? To find the percentage of time spent in garbage collection, divide the number above by total_elapsed_time*CPU_count. In this case, there are a lot of CPU’s and it turns out the the overall amount of CPU time spent in garbage collection isn’t a problem when viewed in isolation. When calculating rates, i.e. events per unit time, you need to ask yourself if the rate is homogenous across the time period in the log file. Does the log file include spikes of high activity that should be separately analyzed? Averaging in data from nights and weekends with data from business hours may alias problems. If you have a reason to suspect that the garbage collection rates include peaks and valleys that need independent analysis, see the “Time Series” section, below. Q: How much garbage is collected on each pass? The amount of heap space that is recovered per GC pass is surprisingly low: At least one collection didn’t recover any data. (“Min.=0”) 25% of the passes recovered 3MB or less. (“1st Qu.=3072”) Half of the GC passes recovered 4MB or less. (“Median=4096”) The average amount recovered was 56MB. (“Mean=56390”) 75% of the passes recovered 36MB or less. (“3rd Qu.=36860”) At least one pass recovered 2GB. (“Max.=2121000”) g1gc.df$Delta = g1gc.df$BeforeSize - g1gc.df$AfterSize summary(g1gc.df$Delta) ## Min. 1st Qu. Median Mean 3rd Qu. Max. ## 0 3070 4100 56400 36900 2120000 Q: What is the maximum User CPU time for a single collection? The worst garbage collection (“Max.”) is many standard deviations away from the mean. The data appears to be right skewed. summary(g1gc.df$UserTime) ## Min. 1st Qu. Median Mean 3rd Qu. Max. ## 0.040 0.470 0.620 0.751 0.920 3.370 sd(g1gc.df$UserTime) ## [1] 0.3966 Basic Graphics Once the data is in R, it is trivial to plot the data with formats including dot plots, line charts, bar charts (simple, stacked, grouped), pie charts, boxplots, scatter plots histograms, and kernel density plots. Histogram of User CPU Time per Collection I don't think that this graph requires any explanation. hist(g1gc.df$UserTime, main="User CPU Time per Collection", xlab="Seconds", ylab="Frequency") Box plot to identify outliers When the initial data is viewed with a box plot, you can see the one crazy outlier in the real time per GC. Save this data point for future analysis and drop the outlier so that it’s not throwing off our statistics. Now the box plot shows many outliers, which will be examined later, using times series analysis. Notice that the scale of the x-axis changes drastically once the crazy outlier is removed. par(mfrow=c(2,1)) boxplot(g1gc.df$UserTime,g1gc.df$SysTime,g1gc.df$RealTime, main="Box Plot of Time per GC\n(dominated by a crazy outlier)", names=c("usr","sys","elapsed"), xlab="Seconds per GC", ylab="Time (Seconds)", horizontal = TRUE, outcol="red") crazy.outlier.df=g1gc.df[g1gc.df$RealTime > 400,] g1gc.df=g1gc.df[g1gc.df$RealTime < 400,] boxplot(g1gc.df$UserTime,g1gc.df$SysTime,g1gc.df$RealTime, main="Box Plot of Time per GC\n(crazy outlier excluded)", names=c("usr","sys","elapsed"), xlab="Seconds per GC", ylab="Time (Seconds)", horizontal = TRUE, outcol="red") box(which = "outer", lty = "solid") Here is the crazy outlier for future analysis: crazy.outlier.df ## row.names SecondsSinceLaunch IncrementalCount ## 8233 2014-05-12T23:15:43.903-0700: 20741 8316 ## FullCount UserTime SysTime RealTime BeforeSize AfterSize TotalSize ## 8233 112 0.55 0.42 488.1 8381440 8235008 9437184 ## Delta ## 8233 146432 R Time Series Data To analyze the garbage collection as a time series, I’ll use Z’s Ordered Observations (zoo). “zoo is the creator for an S3 class of indexed totally ordered observations which includes irregular time series.” require(zoo) ## Loading required package: zoo ## ## Attaching package: 'zoo' ## ## The following objects are masked from 'package:base': ## ## as.Date, as.Date.numeric head(g1gc.df[,1]) ## [1] "2014-05-12T14:00:32.868-0700:" "2014-05-12T14:00:33.179-0700:" ## [3] "2014-05-12T14:00:33.677-0700:" "2014-05-12T14:00:35.538-0700:" ## [5] "2014-05-12T14:00:37.811-0700:" "2014-05-12T14:00:41.428-0700:" options("digits.secs"=3) times=as.POSIXct( g1gc.df[,1], format="%Y-%m-%dT%H:%M:%OS%z:") g1gc.z = zoo(g1gc.df[,-c(1)], order.by=times) head(g1gc.z) ## SecondsSinceLaunch IncrementalCount FullCount ## 2014-05-12 17:00:32.868 1.161 0 0 ## 2014-05-12 17:00:33.178 1.472 1 0 ## 2014-05-12 17:00:33.677 1.969 2 0 ## 2014-05-12 17:00:35.538 3.830 3 0 ## 2014-05-12 17:00:37.811 6.103 4 0 ## 2014-05-12 17:00:41.427 9.720 5 0 ## UserTime SysTime RealTime BeforeSize AfterSize ## 2014-05-12 17:00:32.868 0.11 0.04 0.02 8192 1400 ## 2014-05-12 17:00:33.178 0.05 0.01 0.02 5496 1672 ## 2014-05-12 17:00:33.677 0.04 0.01 0.01 5768 2557 ## 2014-05-12 17:00:35.538 0.21 0.05 0.04 22528 4907 ## 2014-05-12 17:00:37.811 0.08 0.01 0.02 24576 7072 ## 2014-05-12 17:00:41.427 0.26 0.06 0.04 43008 14336 ## TotalSize Delta ## 2014-05-12 17:00:32.868 9437184 6792 ## 2014-05-12 17:00:33.178 9437184 3824 ## 2014-05-12 17:00:33.677 9437184 3211 ## 2014-05-12 17:00:35.538 9437184 17621 ## 2014-05-12 17:00:37.811 9437184 17504 ## 2014-05-12 17:00:41.427 9437184 28672 Example of Two Benchmark Runs in One Log File The data in the following graph is from a different log file, not the one of primary interest to this article. I’m including this image because it is an example of idle periods followed by busy periods. It would be uninteresting to average the rate of garbage collection over the entire log file period. More interesting would be the rate of garbage collect in the two busy periods. Are they the same or different? Your production data may be similar, for example, bursts when employees return from lunch and idle times on weekend evenings, etc. Once the data is in an R Time Series, you can analyze isolated time windows. Clipping the Time Series data Flashing back to our test case… Viewing the data as a time series is interesting. You can see that the work intensive time period is between 9:00 PM and 3:00 AM. Lets clip the data to the interesting period: par(mfrow=c(2,1)) plot(g1gc.z$UserTime, type="h", main="User Time per GC\nTime: Complete Log File", xlab="Time of Day", ylab="CPU Seconds per GC", col="#1b9e77") clipped.g1gc.z=window(g1gc.z, start=as.POSIXct("2014-05-12 21:00:00"), end=as.POSIXct("2014-05-13 03:00:00")) plot(clipped.g1gc.z$UserTime, type="h", main="User Time per GC\nTime: Limited to Benchmark Execution", xlab="Time of Day", ylab="CPU Seconds per GC", col="#1b9e77") box(which = "outer", lty = "solid") Cumulative Incremental and Full GC count Here is the cumulative incremental and full GC count. When the line is very steep, it indicates that the GCs are repeating very quickly. Notice that the scale on the Y axis is different for full vs. incremental. plot(clipped.g1gc.z[,c(2:3)], main="Cumulative Incremental and Full GC count", xlab="Time of Day", col="#1b9e77") GC Analysis of Benchmark Execution using Time Series data In the following series of 3 graphs: The “After Size” show the amount of heap space in use after each garbage collection. Many Java objects are still referenced, i.e. alive, during each garbage collection. This may indicate that the application has a memory leak, or may indicate that the application has a very large memory footprint. Typically, an application's memory footprint plateau's in the early stage of execution. One would expect this graph to have a flat top. The steep decline in the heap space may indicate that the application crashed after 2:00. The second graph shows that the outliers in real execution time, discussed above, occur near 2:00. when the Java heap seems to be quite full. The third graph shows that Full GCs are infrequent during the first few hours of execution. The rate of Full GC's, (the slope of the cummulative Full GC line), changes near midnight. plot(clipped.g1gc.z[,c("AfterSize","RealTime","FullCount")], xlab="Time of Day", col=c("#1b9e77","red","#1b9e77")) GC Analysis of heap recovered Each GC trace includes the amount of heap space in use before and after the individual GC event. During garbage coolection, unreferenced objects are identified, the space holding the unreferenced objects is freed, and thus, the difference in before and after usage indicates how much space has been freed. The following box plot and bar chart both demonstrate the same point - the amount of heap space freed per garbage colloection is surprisingly low. par(mfrow=c(2,1)) boxplot(as.vector(clipped.g1gc.z$Delta), main="Amount of Heap Recovered per GC Pass", xlab="Size in KB", horizontal = TRUE, col="red") hist(as.vector(clipped.g1gc.z$Delta), main="Amount of Heap Recovered per GC Pass", xlab="Size in KB", breaks=100, col="red") box(which = "outer", lty = "solid") This graph is the most interesting. The dark blue area shows how much heap is occupied by referenced Java objects. This represents memory that holds live data. The red fringe at the top shows how much data was recovered after each garbage collection. barplot(clipped.g1gc.z[,c("AfterSize","Delta")], col=c("#7570b3","#e7298a"), xlab="Time of Day", border=NA) legend("topleft", c("Live Objects","Heap Recovered on GC"), fill=c("#7570b3","#e7298a")) box(which = "outer", lty = "solid") When I discuss the data in the log files with the customer, I will ask for an explaination for the large amount of referenced data resident in the Java heap. There are two are posibilities: There is a memory leak and the amount of space required to hold referenced objects will continue to grow, limited only by the maximum heap size. After the maximum heap size is reached, the JVM will throw an “Out of Memory” exception every time that the application tries to allocate a new object. If this is the case, the aplication needs to be debugged to identify why old objects are referenced when they are no longer needed. The application has a legitimate requirement to keep a large amount of data in memory. The customer may want to further increase the maximum heap size. Another possible solution would be to partition the application across multiple cluster nodes, where each node has responsibility for managing a unique subset of the data. Conclusion In conclusion, R is a very powerful tool for the analysis of Java garbage collection log files. The primary difficulty is data cleansing so that information can be read into an R data frame. Once the data has been read into R, a rich set of tools may be used for thorough evaluation.

Search Results

Search found 2422 results on 97 pages for 'restricted formats'.

Page 73/97 | < Previous Page | 69 70 71 72 73 74 75 76 77 78 79 80 | Next Page >

- by Andrew Fulton

- by Johnny Utahh

- by Jon

- by Profpatsch

- by grojo

- by JackWu

- by gyrolf

- by Matthew Leingang

- by Martin Day

- by mikolajek

- by Victor Bugarin

- by Justin

- by pedalpete

- by John Gardeniers

- by zoqaeski

- by Mark Longair

- by Neil

- by Nic Waller

- by warmth

- by Alex

- by rehanplus

- by zoqaeski

- by andy

- by ManovrareSoft

- by user12620111

< Previous Page | 69 70 71 72 73 74 75 76 77 78 79 80 | Next Page >