Search Results

Search found 622 results on 25 pages for 'repeating'.

Page 25/25 | < Previous Page | 21 22 23 24 25

Managing highly repetitive code and documentation in Java

- by polygenelubricants

Highly repetitive code is generally a bad thing, and there are design patterns that can help minimize this. However, sometimes it's simply inevitable due to the constraints of the language itself. Take the following example from java.util.Arrays: /** * Assigns the specified long value to each element of the specified * range of the specified array of longs. The range to be filled * extends from index <tt>fromIndex</tt>, inclusive, to index * <tt>toIndex</tt>, exclusive. (If <tt>fromIndex==toIndex</tt>, the * range to be filled is empty.) * * @param a the array to be filled * @param fromIndex the index of the first element (inclusive) to be * filled with the specified value * @param toIndex the index of the last element (exclusive) to be * filled with the specified value * @param val the value to be stored in all elements of the array * @throws IllegalArgumentException if <tt>fromIndex > toIndex</tt> * @throws ArrayIndexOutOfBoundsException if <tt>fromIndex < 0</tt> or * <tt>toIndex > a.length</tt> */ public static void fill(long[] a, int fromIndex, int toIndex, long val) { rangeCheck(a.length, fromIndex, toIndex); for (int i=fromIndex; i<toIndex; i++) a[i] = val; } The above snippet appears in the source code 8 times, with very little variation in the documentation/method signature but exactly the same method body, one for each of the root array types int[], short[], char[], byte[], boolean[], double[], float[], and Object[]. I believe that unless one resorts to reflection (which is an entirely different subject in itself), this repetition is inevitable. I understand that as a utility class, such high concentration of repetitive Java code is highly atypical, but even with the best practice, repetition does happen! Refactoring doesn't always work because it's not always possible (the obvious case is when the repetition is in the documentation). Obviously maintaining this source code is a nightmare. A slight typo in the documentation, or a minor bug in the implementation, is multiplied by however many repetitions was made. In fact, the best example happens to involve this exact class: Google Research Blog - Extra, Extra - Read All About It: Nearly All Binary Searches and Mergesorts are Broken (by Joshua Bloch, Software Engineer) The bug is a surprisingly subtle one, occurring in what many thought to be just a simple and straightforward algorithm. // int mid =(low + high) / 2; // the bug int mid = (low + high) >>> 1; // the fix The above line appears 11 times in the source code! So my questions are: How are these kinds of repetitive Java code/documentation handled in practice? How are they developed, maintained, and tested? Do you start with "the original", and make it as mature as possible, and then copy and paste as necessary and hope you didn't make a mistake? And if you did make a mistake in the original, then just fix it everywhere, unless you're comfortable with deleting the copies and repeating the whole replication process? And you apply this same process for the testing code as well? Would Java benefit from some sort of limited-use source code preprocessing for this kind of thing? Perhaps Sun has their own preprocessor to help write, maintain, document and test these kind of repetitive library code? A comment requested another example, so I pulled this one from Google Collections: com.google.common.base.Predicates lines 276-310 (AndPredicate) vs lines 312-346 (OrPredicate). The source for these two classes are identical, except for: AndPredicate vs OrPredicate (each appears 5 times in its class) "And(" vs Or(" (in the respective toString() methods) #and vs #or (in the @see Javadoc comments) true vs false (in apply; ! can be rewritten out of the expression) -1 /* all bits on */ vs 0 /* all bits off */ in hashCode() &= vs |= in hashCode()

Read the article
SDL side-scroller scrolls inconsistantly

- by SDLFunTimes

So I'm working on an upgrade from my previous project (that I posted here for code review) this time implementing a repeating background (like what is used on cartoons) so that SDL doesn't have to load really big images for a level. There's a strange inconsistency in the program, however: the first time the user scrolls all the way to the right 2 less panels are shown than is specified. Going backwards (left) the correct number of panels is shown (that is the panels repeat the number of times specified in the code). After that it appears that going right again (once all the way at the left) the correct number of panels is shown and same going backwards. Here's some selected code and here's a .zip of all my code constructor: Game::Game(SDL_Event* event, SDL_Surface* scr, int level_w, int w, int h, int bpp) { this->event = event; this->bpp = bpp; level_width = level_w; screen = scr; w_width = w; w_height = h; //load images and set rects background = format_surface("background.jpg"); person = format_surface("person.png"); background_rect_left = background->clip_rect; background_rect_right = background->clip_rect; current_background_piece = 1; //we are displaying the first clip rect_in_view = &background_rect_right; other_rect = &background_rect_left; person_rect = person->clip_rect; background_rect_left.x = 0; background_rect_left.y = 0; background_rect_right.x = background->w; background_rect_right.y = 0; person_rect.y = background_rect_left.h - person_rect.h; person_rect.x = 0; } and here's the move method which is probably causing all the trouble: void Game::move(SDLKey direction) { if(direction == SDLK_RIGHT) { if(move_screen(direction)) { if(!background_reached_right()) { //move background right background_rect_left.x += movement_increment; background_rect_right.x += movement_increment; if(rect_in_view->x >= 0) { //move the other rect in to fill the empty space SDL_Rect* temp; other_rect->x = -w_width + rect_in_view->x; temp = rect_in_view; rect_in_view = other_rect; other_rect = temp; current_background_piece++; std::cout << current_background_piece << std::endl; } if(background_overshoots_right()) { //sees if this next blit is past the surface //this is used only for re-aligning the rects when //the end of the screen is reached background_rect_left.x = 0; background_rect_right.x = w_width; } } } else { //move the person instead person_rect.x += movement_increment; if(get_person_right_side() > w_width) { //person went too far right person_rect.x = w_width - person_rect.w; } } } else if(direction == SDLK_LEFT) { if(move_screen(direction)) { if(!background_reached_left()) { //moves background left background_rect_left.x -= movement_increment; background_rect_right.x -= movement_increment; if(rect_in_view->x <= -w_width) { //swap the rect in view SDL_Rect* temp; rect_in_view->x = w_width; temp = rect_in_view; rect_in_view = other_rect; other_rect = temp; current_background_piece--; std::cout << current_background_piece << std::endl; } if(background_overshoots_left()) { background_rect_left.x = 0; background_rect_right.x = w_width; } } } else { //move the person instead person_rect.x -= movement_increment; if(person_rect.x < 0) { //person went too far left person_rect.x = 0; } } } } without the rest of the code this doesn't make too much sense. Since there is too much of it I'll upload it here for testing. Anyway does anyone know how I could fix this inconsistency?

Read the article
Writing more efficient xquery code (avoiding redundant iteration)

- by Coquelicot

Here's a simplified version of a problem I'm working on: I have a bunch of xml data that encodes information about people. Each person is uniquely identified by an 'id' attribute, but they may go by many names. For example, in one document, I might find <person id=1>Paul Mcartney</person> <person id=2>Ringo Starr</person> And in another I might find: <person id=1>Sir Paul McCartney</person> <person id=2>Richard Starkey</person> I want to use xquery to produce a new document that lists every name associated with a given id. i.e.: <person id=1> <name>Paul McCartney</name> <name>Sir Paul McCartney</name> <name>James Paul McCartney</name> </person> <person id=2> ... </person> The way I'm doing this now in xquery is something like this (pseudocode-esque): let $ids := distinct-terms( [all the id attributes on people] ) for $id in $ids return <person id={$id}> { for $unique-name in distinct-values ( for $name in ( [all names] ) where $name/@id=$id return $name ) return <name>{$unique-name}</name> } </person> The problem is that this is really slow. I imagine the bottleneck is the innermost loop, which executes once for every id (of which there are about 1200). I'm dealing with a fair bit of data (300 MB, spread over about 800 xml files), so even a single execution of the query in the inner loop takes about 12 seconds, which means that repeating it 1200 times will take about 4 hours (which might be optimistic - the process has been running for 3 hours so far). Not only is it slow, it's using a whole lot of virtual memory. I'm using Saxon, and I had to set java's maximum heap size to 10 GB (!) to avoid getting out of memory errors, and it's currently using 6 GB of physical memory. So here's how I'd really like to do this (in Pythonic pseudocode): persons = {} for id in ids: person[id] = set() for person in all_the_people_in_my_xml_document: persons[person.id].add(person.name) There, I just did it in linear time, with only one sweep of the xml document. Now, is there some way to do something similar in xquery? Surely if I can imagine it, a reasonable programming language should be able to do it (he said quixotically). The problem, I suppose, is that unlike Python, xquery doesn't (as far as I know) have anything like an associative array. Is there some clever way around this? Failing that, is there something better than xquery that I might use to accomplish my goal? Because really, the computational resources I'm throwing at this relatively simple problem are kind of ridiculous.

Read the article
Open XML SDK 2.0 - Split table to new power point slide when content flows off current slide

- by amurra

I have a bunch of data that I need to export from a website to a power point presentation and have been using Open XML SDK 2.0 to perform this task. I have a power point presentation that I am putting through Open XML SDK 2.0 Productivity Tool to generate the template code that I can use to recreate the export. On one of those slides I have a table and the requirement is to add data to that table and break that table across multiple slides if the table exceeds the bottom of the slide. The approach I have taken is to determine the height of the table and if it exceeds the height of the slide, move that new content into the next slide. I have read Bryan and Jones blog on adding repeating data to a power point slide, but my scenario is a little different. They use the following code: A.Table tbl = current.Slide.Descendants<A.Table>().First(); A.TableRow tr = new A.TableRow(); tr.Height = heightInEmu; tr.Append(CreateDrawingCell(imageRel + imageRelId)); tr.Append(CreateTextCell(category)); tr.Append(CreateTextCell(subcategory)); tr.Append(CreateTextCell(model)); tr.Append(CreateTextCell(price.ToString())); tbl.Append(tr); imageRelId++; This won't work for me since they know what height to set the table row to since it will be the height of the image, but when adding in different amounts of text I do not know the height ahead of time so I just set tr.Heightto a default value. Here is my attempt at figuring at the table height: A.Table tbl = tableSlide.Slide.Descendants<A.Table>().First(); A.TableRow tr = new A.TableRow(); tr.Height = 370840L; tr.Append(PowerPointUtilities.CreateTextCell("This"); tr.Append(PowerPointUtilities.CreateTextCell("is")); tr.Append(PowerPointUtilities.CreateTextCell("a")); tr.Append(PowerPointUtilities.CreateTextCell("test")); tr.Append(PowerPointUtilities.CreateTextCell("Test")); tbl.Append(tr); tableSlide.Slide.Save(); long tableHeight = PowerPointUtilities.TableHeight(tbl); Here are the helper methods: public static A.TableCell CreateTextCell(string text) { A.TableCell tableCell = new A.TableCell( new A.TextBody(new A.BodyProperties(), new A.Paragraph(new A.Run(new A.Text(text)))), new A.TableCellProperties()); return tableCell; } public static Int64Value TableHeight(A.Table table) { long height = 0; foreach (var row in table.Descendants<A.TableRow>() .Where(h => h.Height.HasValue)) { height += row.Height.Value; } return height; } This correctly adds the new table row to the existing table, but when I try and get the height of the table, it returns the original height and not the new height. The new height meaning the default height I initially set and not the height after a large amount of text has been inserted. It seems the height only gets readjusted when it is opened in power point. I have also tried accessing the height of the largest table cell in the row, but can't seem to find the right property to perform that task. My question is how do you determine the height of a dynamically added table row since it doesn't seem to update the height of the row until it is opened in power point? Any other ways to determine when to split content to another slide while using Open XML SDK 2.0? I'm open to any suggestion on a better approach someone might have taken since there isn't much documentation on this subject.

Read the article
Paging, sorting and filtering in a stored procedure (SQL Server)

- by Fruitbat

I was looking at different ways of writing a stored procedure to return a "page" of data. This was for use with the asp ObjectDataSource, but it could be considered a more general problem. The requirement is to return a subset of the data based on the usual paging paremeters, startPageIndex and maximumRows, but also a sortBy parameter to allow the data to be sorted. Also there are some parameters passed in to filter the data on various conditions. One common way to do this seems to be something like this: [Method 1] ;WITH stuff AS ( SELECT CASE WHEN @SortBy = 'Name' THEN ROW_NUMBER() OVER (ORDER BY Name) WHEN @SortBy = 'Name DESC' THEN ROW_NUMBER() OVER (ORDER BY Name DESC) WHEN @SortBy = ... ELSE ROW_NUMBER() OVER (ORDER BY whatever) END AS Row, ., ., ., FROM Table1 INNER JOIN Table2 ... LEFT JOIN Table3 ... WHERE ... (lots of things to check) ) SELECT * FROM stuff WHERE (Row > @startRowIndex) AND (Row <= @startRowIndex + @maximumRows OR @maximumRows <= 0) ORDER BY Row One problem with this is that it doesn't give the total count and generally we need another stored procedure for that. This second stored procedure has to replicate the parameter list and the complex WHERE clause. Not nice. One solution is to append an extra column to the final select list, (SELECT COUNT(*) FROM stuff) AS TotalRows. This gives us the total but repeats it for every row in the result set, which is not ideal. [Method 2] An interesting alternative is given here (http://www.4guysfromrolla.com/articles/032206-1.aspx) using dynamic SQL. He reckons that the performance is better because the CASE statement in the first solution drags things down. Fair enough, and this solution makes it easy to get the totalRows and slap it into an output parameter. But I hate coding dynamic SQL. All that 'bit of SQL ' + STR(@parm1) +' bit more SQL' gubbins. [Method 3] The only way I can find to get what I want, without repeating code which would have to be synchronised, and keeping things reasonably readable is to go back to the "old way" of using a table variable: DECLARE @stuff TABLE (Row INT, ...) INSERT INTO @stuff SELECT CASE WHEN @SortBy = 'Name' THEN ROW_NUMBER() OVER (ORDER BY Name) WHEN @SortBy = 'Name DESC' THEN ROW_NUMBER() OVER (ORDER BY Name DESC) WHEN @SortBy = ... ELSE ROW_NUMBER() OVER (ORDER BY whatever) END AS Row, ., ., ., FROM Table1 INNER JOIN Table2 ... LEFT JOIN Table3 ... WHERE ... (lots of things to check) SELECT * FROM stuff WHERE (Row > @startRowIndex) AND (Row <= @startRowIndex + @maximumRows OR @maximumRows <= 0) ORDER BY Row (Or a similar method using an IDENTITY column on the table variable). Here I can just add a SELECT COUNT on the table variable to get the totalRows and put it into an output parameter. I did some tests and with a fairly simple version of the query (no sortBy and no filter), method 1 seems to come up on top (almost twice as quick as the other 2). Then I decided to test probably I needed the complexity and I needed the SQL to be in stored procedures. With this I get method 1 taking nearly twice as long as the other 2 methods. Which seems strange. Is there any good reason why I shouldn't spurn CTEs and stick with method 3? UPDATE - 15 March 2012 I tried adapting Method 1 to dump the page from the CTE into a temporary table so that I could extract the TotalRows and then select just the relevant columns for the resultset. This seemed to add significantly to the time (more than I expected). I should add that I'm running this on a laptop with SQL Server Express 2008 (all that I have available) but still the comparison should be valid. I looked again at the dynamic SQL method. It turns out I wasn't really doing it properly (just concatenating strings together). I set it up as in the documentation for sp_executesql (with a parameter description string and parameter list) and it's much more readable. Also this method runs fastest in my environment. Why that should be still baffles me, but I guess the answer is hinted at in Hogan's comment.

Read the article
Error when opening .tar.gz via Shell to install Apache Maven

- by adamsquared

Thank you in advance for the help. My Goal: To install apache maven per its websites instructions (http://maven.apache.org/download.html), in order to install the JUNG package according to its install instructions (http://sourceforge.net/apps/trac/jung/wiki/JUNGManual), so I can use the JUNG classes in various Java GUIs. The Problem: I get an error message when I try to extract the apache-maven .gz (install?) file in shell. Background: I'm trying to install the JUNG (http://jung.sourceforge.net/index.html) package to my system's Java, so I can write object-oriented code using various GUIs (Ecliplse, Dr. Java) using the classes in JUNG. I don't understand how the building/installing process works, and how I can get what I build/install to work on various GUIs and the command line. I'm new to shell and the command line, and mostly have experience using a simple IDE (DrJava, Python IDLE, R GUI) to write and compile object-oriented code. Machine: Mac OSX 10.5.8 32-bit. The Instructions: For the maven building Extract the distribution archive, i.e. apache-maven-3.0.4-bin.tar.gz to the directory you wish to install Maven 3.0.4. These instructions assume you chose /usr/local/apache-maven. The subdirectory apache-maven-3.0.4 will be created from the archive. ... for the JUNG installation Appendix: How to Build JUNG This is a brief intro to building JUNG jars with maven2 (the build system that JUNG currently uses). First, ensure that you have a JDK of at least version 1.5: JUNG 2.0+ requires Java 1.5+. Ensure that your JAVA_HOME variable is set to the location of the JDK. On a Windows platform, you may have a separate JRE (Java Runtime Environment) and JDK (Java Development Kit). The JRE has no capability to compile Java source files, so you must have a JDK installed. If your JAVA_HOME variable is set to the location of the JRE, and not the location of the JDK, you will be unable to compile. Get Maven Download and install maven2 from maven.apache.org: http://maven.apache.org/download.html At time of writing (early December 2009), the latest version was maven-2.2.1. Install the downloaded maven2 (there are installation instructions on the Maven website). Follow the installation instructions and confirm a successful installation by typing 'mvn --version' in a command terminal window. Get JUNG ... What I Did: I downloaded the file apache-maven-2.2.1-bin.tar.gz. The JUNG website specified to use apache maven 2. I wanted to stick to the recommended installation instructions, but I couldn't get to /usr on my GUI (i've noticed you click on the MacHD symbol on the desktop its missing several directories/folders that you can see using the shell using the ls command at root directory I couldn't find a way to access the file using my mac GUI. Therefore, I used the shell to navigate to the root directory and then to /usr/local, and used the mkdir command to make the directory apache-maven and entered it. I then moved the file using the mv command. Next I tried extracting the file using tar -zxvf apache-maven-2.2.1-bin.tar.gz. The Error Message: tar: apache-maven-2.2.1/direcoryandfile: Cannot open: No such file or directory ... apache-maven-2.2.1/lib/ext: Cannot mkdir: No such file or directory apache-maven-2.2.1/lib/ext/README.txt tar: apache-maven-2.2.1/lib/ext/README.txt: Cannot open: No such file or directory tar: Error exit delayed from previous errors From what I can tell the archive file is missing some directories or something. I tried deleting the file, redownloading the .tar.gz file from a different mirror and repeating the process. Same result. Thanks again for the help

Read the article
SVG: About using <defs> and <use> with variable text values

- by Lukman

Hi, I have the following SVG source code that generates a number of boxes with texts: <?xml version="1.0" encoding="UTF-8"?> <!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.0//EN" "http://www.w3.org/TR/2001/REC-SVG-20050904/DTD/svg11.dtd"> <svg xmlns="http://www.w3.org/2000/svg" version="1.1" xmlns:xlink="http://www.w3.org/1999/xlink" width="600" height="600"> <defs> </defs> <title>Draw</title> <g transform="translate(50,40)"> <rect width="80" height="30" x="0" y="-20" style="stroke: black; stroke-opacity: 1; stroke-width: 1; fill: #9dc2de" /> <text text-anchor="middle" x="40">Text</text> </g> <g transform="translate(150,40)"> <rect width="80" height="30" x="0" y="-20" style="stroke: black; stroke-opacity: 1; stroke-width: 1; fill: #9dc2de" /> <text text-anchor="middle" x="40">Text 2</text> </g> <g transform="translate(250,40)"> <rect width="80" height="30" x="0" y="-20" style="stroke: black; stroke-opacity: 1; stroke-width: 1; fill: #9dc2de" /> <text text-anchor="middle" x="40">Text 3</text> </g> </svg> As you can see, I repeated the <g></g> three times to get three such boxes, when SVG has <defs> and <use> elements that allow reusing elements using id references instead of repeating their definitions. Something like: <?xml version="1.0" encoding="UTF-8"?> <!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.0//EN" "http://www.w3.org/TR/2001/REC-SVG-20050904/DTD/svg11.dtd"> <svg xmlns="http://www.w3.org/2000/svg" version="1.1" xmlns:xlink="http://www.w3.org/1999/xlink" width="600" height="600"> <defs> <marker style="overflow:visible;fill:inherit;stroke:inherit" id="Arrow1Mend" refX="0.0" refY="0.0" orient="auto"> <path transform="scale(0.4) rotate(180) translate(20,0)" style="fill-rule:evenodd;stroke-width:2.0pt;marker-start:none;" d="M 0.0,-15.0 L -20.5,0.0 L 0.0,15.0 "/> </marker> <line marker-end="url(#Arrow1Mend)" id="systemthread" x1="40" y1="10" x2="40" y2="410" style="stroke: black; stroke-dasharray: 5, 5; stroke-width: 1; "/> </defs> <title>Draw</title> <use xlink:href="#systemthread" transform="translate(50,40)" /> <use xlink:href="#systemthread" transform="translate(150,40)" /> <use xlink:href="#systemthread" transform="translate(250,40)" /> </svg> Unfortunately I can't do this with the first SVG code since I need the texts to be different for each box, while the <use> tag simply duplicates 100% what's defined in <defs>. Is there any way to use <defs> and <use> with some kind of parameters/arguments mechanism like function calls?

Read the article
Building an external list while filtering in LINQ

- by Khnle

I have an array of input strings that contains either email addresses or account names in the form of domain\account. I would like to build a List of string that contains only email addresses. If an element in the input array is of the form domain\account, I will perform a lookup in the dictionary. If the key is found in the dictionary, that value is the email address. If not found, that won't get added to the result list. The code below will makes the above description clear: private bool where(string input, Dictionary<string, string> dict) { if (input.Contains("@")) { return true; } else { try { string value = dict[input]; return true; } catch (KeyNotFoundException) { return false; } } } private string select(string input, Dictionary<string, string> dict) { if (input.Contains("@")) { return input; } else { try { string value = dict[input]; return value; } catch (KeyNotFoundException) { return null; } } } public void run() { Dictionary<string, string> dict = new Dictionary<string, string>() { { "gmail\\nameless", "[email protected]"} }; string[] s = { "[email protected]", "gmail\\nameless", "gmail\\unknown" }; var q = s.Where(p => where(p, dict)).Select(p => select(p, dict)); List<string> resultList = q.ToList<string>(); } While the above code works (hope I don't have any typo here), there are 2 problems that I do not like with the above: The code in where() and select() seems to be redundant/repeating. It takes 2 passes. The second pass converts from the query expression to List. So I would like to add to the List resultList directly in the where() method. It seems like I should be able to do so. Here's the code: private bool where(string input, Dictionary<string, string> dict, List<string> resultList) { if (input.Contains("@")) { resultList.Add(input); //note the difference from above return true; } else { try { string value = dict[input]; resultList.Add(value); //note the difference from above return true; } catch (KeyNotFoundException) { return false; } } } The my LINQ expression can be nicely in 1 single statement: List<string> resultList = new List<string>(); s.Where(p => where(p, dict, resultList)); Or var q = s.Where(p => where(p, dict, resultList)); //do nothing with q afterward Which seems like perfect and legal C# LINQ. The result: sometime it works and sometime it doesn't. So why doesn't my code work reliably and how can I make it do so?

Read the article
Delphi: EReadError with message ‘Property PageNr does Not Exist’.

- by lyborko

Hi, I get SOMETIMES error message: EReadError with message 'Property PageNr does Not exist', when I try to run my own project. I am really desperate, because I see simply nothing what is the cause. The devilish is that it comes up sometimes but often. It concerns of my own component TPage. Here is declaration TPage = class(TCustomControl) // private FPaperHeight, FPaperWidth:Integer; FPaperBrush:TBrush; FPaperSize:TPaperSize; FPaperOrientation:TPaperOrientation; FPDFDocument: TPDFDocument; FPageNr:integer; procedure PaintBasicLayout; procedure PaintInterior; procedure SetPapersize(Value: TPapersize); procedure SetPaperHeight(Value: Integer); procedure SetPaperWidth(Value: Integer); procedure SetPaperOrientation(value:TPaperOrientation); procedure SetPaperBrush(Value:TBrush); procedure SetPageNr(Value:Integer); protected procedure CreateParams(var Params:TCreateParams); override; procedure AdjustClientRect(var Rect: TRect); override; public constructor Create(AOwner: TComponent);override; destructor Destroy;override; // function GetChildOwner:TComponent; override; procedure DrawControl(X,Y :integer; Dx,Dy:Double; Ctrl:TControl;NewCanvas:TCanvas); // procedure GetChildren(Proc:TGetChildProc; Root:TComponent); override; procedure Loaded; override; procedure MouseDown(Button: TMouseButton; Shift: TShiftState; X, Y: Integer); override; procedure Paint; override; procedure PrintOnCanvas(X,Y:integer; rX,rY:Double; ACanvas:TCanvas); procedure PrintOnPDFCanvas(X,Y:integer); procedure PrintOnPrinterCanvas(X,Y:integer); procedure Resize; override; procedure SetPrintKind(APrintKind:TPrintKind; APrintGroupindex:Integer); published property PageNr:integer read FPageNr write SetPageNr; property PaperBrush: TBrush read FPaperBrush write SetPaperBrush; property PaperHeight: integer read FPaperHeight write SetPaperHeight; property PaperWidth: integer read FPaperWidth write SetPaperWidth; property PaperSize: TPaperSize read FPaperSize write SetPaperSize; property PaperOrientation:TPaperOrientation read FPaperOrientation write SetPaperOrientation; property PDFDocument:TPDFDocument read FPDFDocument write FPDFDocument; property TabOrder; end; I thoroughly read the similar topic depicted here: Delphi: EReadError with message 'Property Persistence does Not exist' But here it is my own source code. No third party. Interesting: when I delete PageNr property in my dfm file (unit1.dfm), then pops up : EReadError with message 'Property PaperHeight does Not exist'. when I delete PaperHeight then it will claim PaperWidth and so on... Here is piece of dfm file: object pg1: TPage Left = 128 Top = 144 Width = 798 Height = 1127 PageNr = 0 PaperHeight = 1123 PaperWidth = 794 PaperSize = psA4 PaperOrientation = poPortrait TabOrder = 0 object bscshp4: TBasicShape Left = 112 Top = 64 Width = 105 Height = 105 PrintKind = pkNormal PrintGroupIndex = 0 Zooming = 100 Transparent = False Repeating = False PageRepeatOffset = 1 ShapeStyle = ssVertical LinePosition = 2 end object bscshp5: TBasicShape Left = 288 Top = 24 Width = 105 Height = 105 PrintKind = pkNormal PrintGroupIndex = 0 Zooming = 100 Transparent = False What the hell happens ??????? I have never seen that. I compiled the unit several times... Encoutered no problem. Maybe the cause is beyond this. I feel completely powerless.

Read the article
How Expedia Made My New Bride Cry

- by Lance Robinson

Tweet this? Email Expedia and ask them to give me and my new wife our honeymoon? When Expedia followed up their failure with our honeymoon trip with a complete and total lack of acknowledgement of any responsibility for the problem and endless loops of explaining the issue over and over again - I swore that they would make it right. When they brought my new bride to tears, I got an immediate and endless supply of motivation. I hope you will help me make them make it right by posting our story on Twitter, Facebook, your blog, on Expedia itself, and when talking to your friends in person about their own travel plans. If you are considering using them now for an important trip - reconsider. Short summary: We arrived early for a flight - but Expedia had made a mistake with the data they supplied to JetBlue and Emirates, which resulted in us not being able to check in (one leg of our trip was missing)! At the time of this post, three people (myself, my wife, and an exceptionally patient JetBlue employee named Mary) each spent hours on the phone with Expedia. I myself spent right at 3 hours (according to iPhone records), Lauren spent an hour and a half or so, and poor Mary was probably on the phone for a good 3.5 hours. This is after 5 hours total at the airport. If you add up our phone time, that is nearly 8 hours of phone time over a 5 hour period with little or no help, stall tactics (?), run-around, denial, shifting of blame, and holding. Details below (times are approximate): First, my wife and I were married yesterday - June 18th, the 3 year anniversary of our first date. She is awesome. She is the nicest person I have ever known, a ton of fun, absolutely beautiful in every way. Ok enough mushy - here are the dirty details. 2:30 AM - Early Check-in Attempt - we attempted to check-in for our flight online. Some sort of technology error on website, instructed to checkin at desk. 4:30 AM - Arrive at airport. Try to check-in at kiosk, get the same error. We got to the JetBlue desk at RDU International Airport, where Mary helped us. Mary discovered that the Expedia provided itinerary does not match the Expedia provided tickets. We are informed that when that happens American, JetBlue, and others that use the same software cannot check you in for the flight because. Why? Because the itinerary was missing a leg of our flight! Basically we were not shown in the system as definitely being able to make it home. Mary called Expedia and was put on hold by their automated system. 4:55 AM - Mary, myself, and my brand new bride all waited for about 25 minutes when finally I decided I would make a call myself on my iPhone while Mary was on the airport phone. In their automated system, I chose "make a new reservation", thinking they might answer a little more quickly than "customer service". Not surprisingly I was connected to an Expedia person within 1 minute. They informed me that they would have to forward me to a customer service specialist. I explained to them that we were already on hold for that and had been for nearly half an hour, that we were going on our honeymoon and that our flight would be leaving soon - could they please help us. "Yes, I will help you". I hand the phone to JetBlue Mary who explains the situation 3 or 4 times. Obviously I couldn't hear both ends of the conversation at this point, but the Expedia person explained what the problem was by stating exactly what Mary had just spent 15 minutes explaining. Mary calmly confirms that this is the problem, and asks Expedia to re-issue the itinerary. Expedia tells Mary that they'll have to transfer her to customer service. Mary asks for someone specific so that we get an answer this time, and goes on hold. Mary get's connected, explains the situation, and then Mary's connection gets terminated. 5:10 AM - Mary calls back to the Expedia automated system again, and we wait for about 5 minutes on hold this time before I pick up my iPhone and call Expedia again myself. Again I go to sales, a person picks up the phone in less than a minute. I explain the situation and let them know that we are now very close to missing our flight for our honeymoon, could they please help us. "Yes, I will help you". Again I give the phone to Mary who provides them with a call back number in case we get disconnected again and explains the situation again. More back and forth with Expedia doing nothing but repeating the same questions, Mary answering the questions with the same information she provided in the original explanation, and Expedia simply restating the problem. Mary again asks them to re-issue the itinerary, and explains that doing so will fix the problem. Expedia again repeats the problem instead of fixing it, and Mary's connection gets terminated. 5:20 AM - Mary again calls back to Expedia. My beautiful bride also calls on her own phone. At this point she is struggling to hold back her tears, stumbling through an explanation of all that has happened and that we are about to miss our flight. Please help us. "Yes, I will help". My beautiful bride's connection gets terminated. Ok, maybe this disconnection isn't an accident. We've now been disconnected 3 times on two different phones. 5:45 AM - I walk away and pleadingly beg a person to help me. They "escalate" the issue to "Rosy" (sp?) at Expedia. I go through the whole song and dance again with Rosy, who gives me the same treatment Mary was given. Rosy blames JetBlue for now having the correct data. Meanwhile Mary is on the phone with Emirates Air (the airline for the second leg of our trip), who agrees with JetBlue that Expedia's data isn't up to date. We are informed by two airport employees that issues like this with Expedia are not uncommon, and that the fix is simple. On the phone iwth Rosy, I ask her to re-issue the itinerary because we are about to miss our flight. She again explains the problem to me. At this point, I am standing at the window, pleading with Rosy to help us get to our honeymoon, watching our airplane. Then our airplane leaves without us. 6:03 AM - At this point we have missed our flight. Re-issuing the itinerary is no longer a solution. I ask Rosy to start from the beginning and work us up a new trip. She says that she cannot do that. She says that she needs to talk to JetBlue and Emirates and find out why we cannot check-in for our flight. I remind Rosy that our flight has already left - I just watched it taxi away - it no longer matters why (not to mention the fact that we already knew why, and have known why since 4:30 AM), and have known the solution since 4:30 AM. Rosy, can you please book a new trip? Yes, but it will cost $400. Excuse me? Now you can, but it will cost ME to fix your mistake? Rosy says that she can escalate the situation to her supervisor but that will take 1.5 hours. 6:15 AM - I told Rosy that if they had re-issued the itinerary as JetBlue asked (at 4:30 AM), my new wife and I might be on the airplane now instead of dealing with this on the phone and missing the beginning (and how much more?) of our honeymoon. Rosy said that it was not necessary to re-issue the itinerary. Out of curiosity, i asked Rosy if there was some financial burden on them to re-issue the itinerary. "No", said Rosy. I asked her if it was a large time burden on Expedia to re-issue the itinerary. "No", said Rosy. I directly asked Rosy: Why wouldn't Expedia have re-issued the itinerary when JetBlue asked? No answer. I asked Rosy: If you had re-issued the itinerary at 4:30, isn't it possible that I would be on that flight right now? She actually surprised me by answering "Yes" to that question. So I pointed out that it followed that Expedia was responsible for the fact that we missed out flight, and she immediately went into more about how the problem was with JetBlue - but now it was ALSO an Emirates Air problem as well. I tell Rosy to go ahead and escalate the issue again, and please call me back in that 1.5 hours (which how is about 1 hour and 10 minutes away). 6:30 AM - I start tweeting my frustration with iPhone. It's now pretty much impossible for us to make it to The Maldives by 3pm, which is the time at which we would need to arrive in order to be allowed service to the actual island where we are staying. Expedia has now given me the run-around for 2 hours, caused me to miss my flight, and worst of all caused my amazing new wife Lauren to miss our honeymoon. You think I was mad? No. Furious. Its ok to make mistakes - but to refuse to fix them and to ruin our honeymoon? No, not ok, Expedia. I swore right then that Expedia would make this right. 7:45 AM - JetBlue mary is still talking her tail off to other people in JetBlue and Emirates Air. Mary works it out so that if Expedia simply books a new trip, JetBlue and Emirates will both waive all the fees. Now we just have to convince Expedia to fix their mistake and get us on our way! Around this time Expedia Rosy calls me back! I inform her of the excellent work of JetBlue Mary - that JetBlue and Emirates both will waive the fees so Expedia can fix their mistake and get us going on our way. She says that she sees documentation of this in her system and that she needs to put me on hold "for 1 to 10 minutes" to talk to Emirates Air (why I'm not exactly sure). I say ok. 8:45 AM - After an hour on hold, Rosy comes on the line and asks me to hold more. I ask her to call me back. 9:35 AM - I put down the iPhone Twitter app and picks up the laptop. You think I made some noise with my iPhone? Heh 11:25 AM - Expedia follows me and sends a canned "We're sorry, DM us the details". If you look at their Twitter feed, 16 out of the most recent 20 tweets are exactly the same canned response. The other 4? Ads. Um - #MultiFAIL? To Expedia: You now have had (as explained above) 8 hours of 3 different people explaining our situation, you know the email address of our Expedia account, you know my web blog, you know my Twitter address, you know my phone number. You also know how upset you have made both me and my new bride by treating us with such a ... non caring, scripted, uncooperative, argumentative, and possibly even deceitful manner. In the wise words of the great Kenan Thompson of SNL: "FIX IT!". And no, I'm NOT going away until you make this right. Period. 11:45 AM - Expedia corporate office called. The woman I spoke to was very nice and apologetic. She listened to me tell the story again, she says she understands the problem and she is going to work to resolve it. I don't have any details on what exactly that resolution might me, she said she will call me back in 20 minutes. She found out about the problem via Twitter. Thank you Twitter, and all of you who helped. Hopefully social media will win my wife and I our honeymoon, and hopefully Expedia will encourage their customer service teams treat their customers properly. 12:22 PM - Spoke to Fran again from Expedia corporate office. She has a flight for us tonight. She is booking it now. We will arrive at our honeymoon destination of beautiful Veligandu Island Resort only 1 day late. She cannot confirm today, but she expects that Expedia will pay for the lost honeymoon night. Thank you everyone for your help. I will reflect more on this whole situation and confirm its resolution after our flight is 100% confirmed. For now, I'm going to take a breather and go kiss my wonderful wife! 1:50 PM - Have not yet received the promised phone call. We did receive an email with a new itinerary for a flight but the booking is not for specific seats, so there is no guarantee that my wife and I will be able to sit together. With the original booking I carefully selected our seats for every segment of our trip. I decided to call into the phone number that Fran from the Expedia corporate office gave me. Its automated voice system identified itself as "Tier 3 Support". I am currently still on hold with them, I have not gotten through to a human yet. 1:55 PM - Fran from Expedia called me back. She confirmed us as booked. She called the airlines to confirm. Unfortunately, Expedia was unwilling or unable to allow us any type of seat selection. It is possible that i won't get to sit next to the woman I married less than a day ago on our 40 total hours of flight time (there and back). In addition, our seats could be the worst seats on the planes, with no reclining seat back or right next to the restroom. Despite this fact (which in my opinion is huge), the horrible inconvenience, the hours at the airport, and the negative Internet publicity that Expedia is receiving, Expedia declined to offer us any kind of upgrade or to mark us as SFU (suitable for upgrade). Since they didn't offer - I asked, and was rejected. I am grateful to finally be heading in the right direction, but not only did Expedia horribly botch this job from the very beginning, they followed that botch job with near zero customer service, followed by a verbally apologetic but otherwise half-hearted resolution. If this works out favorably for us, great. If not - I'm not done making noise, Expedia. You owe us, and I expect you to make it right. You haven't quite done that yet. Thanks - Thank you to Twitter. Thanks to all those who sympathize with us and helped us get the attention of Expedia, since three people (one of them an airline employee) using Expedia's normal channels of communication for many hours didn't help. Thanks especially to my PowerShell and Sharepoint friends, my local friends, and those connectors who encouraged me and spread my story. 5:15 PM - Love Wins - After all this, Lauren and I are exhausted. We both took a short nap, and when we woke up we talked about the last 24 hours. It was a big, amazing, story-filled 24 hours. I said that Expedia won, but Lauren said no. She pointed out how lucky we are. We are in love and married. We have wonderful family and friends. We are both hard-working successful people who love what they do. We get to go to an amazing exotic destination for our honeymoon like Veligandu in The Maldives... That's a lot of good. Expedia didn't win. This was (is) a big loss for Expedia. It is a public blemish for all to see. But Lauren and I did win, big time. Expedia may not have made things right - but things are right for us. Post in progress... I will relay any further comments (or lack of) from Expedia soon, as well as an update on confirmation of their repayment of our lost resort room rates. I'll also post a picture of us on our honeymoon as soon as I can!

Read the article
The Art of Productivity

- by dwahlin

Getting things done has always been a challenge regardless of gender, age, race, skill, or job position. No matter how hard some people try, they end up procrastinating tasks until the last minute. Some people simply focus better when they know they’re out of time and can’t procrastinate any longer. How many times have you put off working on a term paper in school until the very last minute? With only a few hours left your mental energy and focus seem to kick in to high gear especially as you realize that you either get the paper done now or risk failing. It’s amazing how a little pressure can turn into a motivator and allow our minds to focus on a given task. Some people seem to specialize in procrastinating just about everything they do while others tend to be the “doers” who get a lot done and ultimately rise up the ladder at work. What’s the difference between these types of people? Is it pure laziness or are other factors at play? I think that some people are certainly more motivated than others, but I also think a lot of it is based on the process that “doers” tend to follow - whether knowingly or unknowingly. While I’ve certainly fought battles with procrastination, I’ve always had a knack for being able to get a lot done in a relatively short amount of time. I think a lot of my “get it done” attitude goes back to the the strong work ethic my parents instilled in me at a young age. I remember my dad saying, “You need to learn to work hard!” when I was around 5 years old. I remember that moment specifically because I was on a tractor with him the first time I heard it while he was trying to move some large rocks into a pile. The tractor was big but so were the rocks and my dad had to balance the tractor perfectly so that it didn’t tip forward too far. It was challenging work and somewhat tedious but my dad finished the task and taught me a few important lessons along the way including persistence, the importance of having a skill, and getting the job done right without skimping along the way. In this post I’m going to list a few of the techniques and processes I follow that I hope may be beneficial to others. I blogged about the general concept back in 2009 but thought I’d share some updated information and lessons learned since then. Most of the ideas that follow came from learning and refining my daily work process over the years. However, since most of the ideas are common sense (at least in my opinion), I suspect they can be found in other productivity processes that are out there. Let’s start off with one of the most important yet simple tips: Start Each Day with a List. Start Each Day with a List What are you planning to get done today? Do you keep track of everything in your head or rely on your calendar? While most of us think that we’re pretty good at managing “to do” lists strictly in our head you might be surprised at how affective writing out lists can be. By writing out tasks you’re forced to focus on the most important tasks to accomplish that day, commit yourself to those tasks, and have an easy way to track what was supposed to get done and what actually got done. Start every morning by making a list of specific tasks that you want to accomplish throughout the day. I’ll even go so far as to fill in times when I’d like to work on tasks if I have a lot of meetings or other events tying up my calendar on a given day. I’m not a big fan of using paper since I type a lot faster than I write (plus I write like a 3rd grader according to my wife), so I use the Sticky Notes feature available in Windows. Here’s an example of yesterday’s sticky note: What do you add to your list? That’s the subject of the next tip. Focus on Small Tasks It’s no secret that focusing on small, manageable tasks is more effective than trying to focus on large and more vague tasks. When you make your list each morning only add tasks that you can accomplish within a given time period. For example, if I only have 30 minutes blocked out to work on an article I don’t list “Write Article”. If I do that I’ll end up wasting 30 minutes stressing about how I’m going to get the article done in 30 minutes and ultimately get nothing done. Instead, I’ll list something like “Write Introductory Paragraphs for Article”. The next day I may add, “Write first section of article” or something that’s small and manageable – something I’m confident that I can get done. You’ll find that once you’ve knocked out several smaller tasks it’s easy to continue completing others since you want to keep the momentum going. In addition to keeping my tasks focused and small, I also make a conscious effort to limit my list to 4 or 5 tasks initially. I’ve found that if I list more than 5 tasks I feel a bit overwhelmed which hurts my productivity. It’s easy to add additional tasks as you complete others and you get the added benefit of that confidence boost of knowing that you’re being productive and getting things done as you remove tasks and add others. Getting Started is the Hardest (Yet Easiest) Part I’ve always found that getting started is the hardest part and one of the biggest contributors to procrastination. Getting started working on tasks is a lot like getting a large rock pushed to the bottom of a hill. It’s difficult to get the rock rolling at first, but once you manage to get it rocking some it’s really easy to get it rolling on its way to the bottom. As an example, I’ve written 100s of articles for technical magazines over the years and have really struggled with the initial introductory paragraphs. Keep in mind that these are the paragraphs that don’t really add that much value (in my opinion anyway). They introduce the reader to the subject matter and nothing more. What a waste of time for me to sit there stressing about how to start the article. On more than one occasion I’ve spent more than an hour trying to come up with 2-3 paragraphs of text. Talk about a productivity killer! Whether you’re struggling with a writing task, some code for a project, an email, or other tasks, jumping in without thinking too much is the best way to get started I’ve found. I’m not saying that you shouldn’t have an overall plan when jumping into a task, but on some occasions you’ll find that if you simply jump into the task and stop worrying about doing everything perfectly that things will flow more smoothly. For my introductory paragraph problem I give myself 5 minutes to write out some general concepts about what I know the article will cover and then spend another 10-15 minutes going back and refining that information. That way I actually have some ideas to work with rather than a blank sheet of paper. If I still find myself struggling I’ll write the rest of the article first and then circle back to the introductory paragraphs once I’m done. To sum this tip up: Jump into a task without thinking too hard about it. It’s better to to get the rock at the top of the hill rocking some than doing nothing at all. You can always go back and refine your work. Learn a Productivity Technique and Stick to It There are a lot of different productivity programs and seminars out there being sold by companies. I’ve always laughed at how much money people spend on some of these motivational programs/seminars because I think that being productive isn’t that hard if you create a re-useable set of steps and processes to follow. That’s not to say that some of these programs/seminars aren’t worth the money of course because I know they’ve definitely benefited some people that have a hard time getting things done and staying focused. One of the best productivity techniques I’ve ever learned is called the “Pomodoro Technique” and it’s completely free. This technique is an extremely simple way to manage your time without having to remember a bunch of steps, color coding mechanisms, or other processes. The technique was originally developed by Francesco Cirillo in the 80s and can be implemented with a simple timer. In a nutshell here’s how the technique works: Pick a task to work on Set the timer to 25 minutes and work on the task Once the timer rings record your time Take a 5 minute break Repeat the process Here’s why the technique works well for me: It forces me to focus on a single task for 25 minutes. In the past I had no time goal in mind and just worked aimlessly on a task until I got interrupted or bored. 25 minutes is a small enough chunk of time for me to stay focused. Any distractions that may come up have to wait until after the timer goes off. If the distraction is really important then I stop the timer and record my time up to that point. When the timer is running I act as if I only have 25 minutes total for the task (like you’re down to the last 25 minutes before turning in your term paper….frantically working to get it done) which helps me stay focused and turns into a “beat the clock” type of game. It’s actually kind of fun if you treat it that way and really helps me focus on a the task at hand. I automatically know how much time I’m spending on a given task (more on this later) by using this technique. I know that I have 5 minutes after each pomodoro (the 25 minute sprint) to waste on anything I’d like including visiting a website, stepping away from the computer, etc. which also helps me stay focused when the 25 minute timer is counting down. I use this technique so much that I decided to build a program for Windows 8 called Pomodoro Focus (I plan to blog about how it was built in a later post). It’s a Windows Store application that allows people to track tasks, productive time spent on tasks, interruption time experienced while working on a given task, and the number of pomodoros completed. If a time estimate is given when the task is initially created, Pomodoro Focus will also show the task completion percentage. I like it because it allows me to track my tasks, time spent on tasks (very useful in the consulting world), and even how much time I wasted on tasks (pressing the pause button while working on a task starts the interruption timer). I recently added a new feature that charts productive and interruption time for tasks since I wanted to see how productive I was from week to week and month to month. A few screenshots from the Pomodoro Focus app are shown next, I had a lot of fun building it and use it myself to as I work on tasks. There are certainly many other productivity techniques and processes out there (and a slew of books describing them), but the Pomodoro Technique has been the simplest and most effective technique I’ve ever come across for staying focused and getting things done. Persistence is Key Getting things done is great but one of the biggest lessons I’ve learned in life is that persistence is key especially when you’re trying to get something done that at times seems insurmountable. Small tasks ultimately lead to larger tasks getting accomplished, however, it’s not all roses along the way as some of the smaller tasks may come with their own share of bumps and bruises that lead to discouragement about the end goal and whether or not it is worth achieving at all. I’ve been on several long-term projects over my career as a software developer (I have one personal project going right now that fits well here) and found that repeating, “Persistence is the key!” over and over to myself really helps. Not every project turns out to be successful, but if you don’t show persistence through the hard times you’ll never know if you succeeded or not. Likewise, if you don’t persistently stick to the process of creating a daily list, follow a productivity process, etc. then the odds of consistently staying productive aren’t good. Track Your Time How much time do you actually spend working on various tasks? If you don’t currently track time spent answering emails, on phone calls, and working on various tasks then you might be surprised to find out that a task that you thought was going to take you 30 minutes ultimately ended up taking 2 hours. If you don’t track the time you spend working on tasks how can you expect to learn from your mistakes, optimize your time better, and become more productive? That’s another reason why I like the Pomodoro Technique – it makes it easy to stay focused on tasks while also tracking how much time I’m working on a given task. Eliminate Distractions I blogged about this final tip several years ago but wanted to bring it up again. If you want to be productive (and ultimately successful at whatever you’re doing) then you can’t waste a lot of time playing games or on Twitter, Facebook, or other time sucking websites. If you see an article you’re interested in that has no relation at all to the tasks you’re trying to accomplish then bookmark it and read it when you have some spare time (such as during a pomodoro break). Fighting the temptation to check your friends’ status updates on Facebook? Resist the urge and realize how much those types of activities are hurting your productivity and taking away from your focus. I’ll admit that eliminating distractions is still tough for me personally and something I have to constantly battle. But, I’ve made a conscious decision to cut back on my visits and updates to Facebook, Twitter, Google+ and other sites. Sure, my Klout score has suffered as a result lately, but does anyone actually care about those types of scores aside from your online “friends” (few of whom you’ve actually met in person)? :-) Ultimately it comes down to self-discipline and how badly you want to be productive and successful in your career, life goals, hobbies, or whatever you’re working on. Rather than having your homepage take you to a time wasting news site, game site, social site, picture site, or others, how about adding something like the following as your homepage? Every time your browser opens you’ll see a personal message which helps keep you on the right track. You can download my ubber-sophisticated homepage here if interested. Summary Is there a single set of steps that if followed can ultimately lead to productivity? I don’t think so since one size has never fit all. Every person is different, works in their own unique way, and has their own set of motivators, distractions, and more. While I certainly don’t consider myself to be an expert on the subject of productivity, I do think that if you learn what steps work best for you and gradually refine them over time that you can come up with a personal productivity process that can serve you well. Productivity is definitely an “art” that anyone can learn with a little practice and persistence. You’ve seen some of the steps that I personally like to follow and I hope you find some of them useful in boosting your productivity. If you have others you use please leave a comment. I’m always looking for ways to improve.

Read the article
CakePHP sending multiple lines of same cookie information in response header. why??

- by Vicer

Hi all, I am having this problem with one of my cakePHP applications. My application displays a big html table on the page. I noticed that when the table goes beyond a certain size limit, IE cannot display that page. While trying to figure out why this happens, I noticed that my html response header contains a HUGE number of lines repeating the same thing. Response Headers ------------------------------ Date Thu, 20 May 2010 04:18:10 GMT Server Apache/2.0.63 (Win32) mod_ssl/2.0.63 OpenSSL/0.9.7m PHP/5.2.10 X-Powered-By PHP/5.2.10 P3P CP="NOI ADM DEV PSAi COM NAV OUR OTRo STP IND DEM" Set-Cookie CAKEPHP=q4tp37tn9gkhcpgmf1ftr1i6c1; expires=Sun, 20-May-2035 10:18:12 GMT; path=/mywebapp CAKEPHP=q4tp37tn9gkhcpgmf1ftr1i6c1; expires=Sun, 20-May-2035 10:18:12 GMT; path=/mywebapp CAKEPHP=q4tp37tn9gkhcpgmf1ftr1i6c1; expires=Sun, 20-May-2035 10:18:12 GMT; path=/mywebapp CAKEPHP=q4tp37tn9gkhcpgmf1ftr1i6c1; expires=Sun, 20-May-2035 10:18:12 GMT; path=/mywebapp CAKEPHP=q4tp37tn9gkhcpgmf1ftr1i6c1; expires=Sun, 20-May-2035 10:18:12 GMT; path=/mywebapp CAKEPHP=q4tp37tn9gkhcpgmf1ftr1i6c1; expires=Sun, 20-May-2035 10:18:12 GMT; path=/mywebapp CAKEPHP=q4tp37tn9gkhcpgmf1ftr1i6c1; expires=Sun, 20-May-2035 10:18:12 GMT; path=/mywebapp CAKEPHP=q4tp37tn9gkhcpgmf1ftr1i6c1; expires=Sun, 20-May-2035 10:18:12 GMT; path=/mywebapp CAKEPHP=q4tp37tn9gkhcpgmf1ftr1i6c1; expires=Sun, 20-May-2035 10:18:12 GMT; path=/mywebapp CAKEPHP=q4tp37tn9gkhcpgmf1ftr1i6c1; expires=Sun, 20-May-2035 10:18:12 GMT; path=/mywebapp CAKEPHP=q4tp37tn9gkhcpgmf1ftr1i6c1; expires=Sun, 20-May-2035 10:18:12 GMT; path=/mywebapp CAKEPHP=q4tp37tn9gkhcpgmf1ftr1i6c1; expires=Sun, 20-May-2035 10:18:12 GMT; path=/mywebapp CAKEPHP=q4tp37tn9gkhcpgmf1ftr1i6c1; expires=Sun, 20-May-2035 10:18:13 GMT; path=/mywebapp CAKEPHP=q4tp37tn9gkhcpgmf1ftr1i6c1; expires=Sun, 20-May-2035 10:18:13 GMT; path=/mywebapp CAKEPHP=q4tp37tn9gkhcpgmf1ftr1i6c1; expires=Sun, 20-May-2035 10:18:13 GMT; path=/mywebapp CAKEPHP=q4tp37tn9gkhcpgmf1ftr1i6c1; expires=Sun, 20-May-2035 10:18:13 GMT; path=/mywebapp CAKEPHP=q4tp37tn9gkhcpgmf1ftr1i6c1; expires=Sun, 20-May-2035 10:18:13 GMT; path=/mywebapp CAKEPHP=q4tp37tn9gkhcpgmf1ftr1i6c1; expires=Sun, 20-May-2035 10:18:13 GMT; path=/mywebapp CAKEPHP=q4tp37tn9gkhcpgmf1ftr1i6c1; expires=Sun, 20-May-2035 10:18:13 GMT; path=/mywebapp CAKEPHP=q4tp37tn9gkhcpgmf1ftr1i6c1; expires=Sun, 20-May-2035 10:18:13 GMT; path=/mywebapp CAKEPHP=q4tp37tn9gkhcpgmf1ftr1i6c1; expires=Sun, 20-May-2035 10:18:13 GMT; path=/mywebapp CAKEPHP=q4tp37tn9gkhcpgmf1ftr1i6c1; expires=Sun, 20-May-2035 10:18:13 GMT; path=/mywebapp CAKEPHP=q4tp37tn9gkhcpgmf1ftr1i6c1; expires=Sun, 20-May-2035 10:18:13 GMT; path=/mywebapp CAKEPHP=q4tp37tn9gkhcpgmf1ftr1i6c1; expires=Sun, 20-May-2035 10:18:13 GMT; path=/mywebapp CAKEPHP=q4tp37tn9gkhcpgmf1ftr1i6c1; expires=Sun, 20-May-2035 10:18:13 GMT; path=/mywebapp CAKEPHP=q4tp37tn9gkhcpgmf1ftr1i6c1; expires=Sun, 20-May-2035 10:18:13 GMT; path=/mywebapp CAKEPHP=q4tp37tn9gkhcpgmf1ftr1i6c1; expires=Sun, 20-May-2035 10:18:13 GMT; path=/mywebapp CAKEPHP=q4tp37tn9gkhcpgmf1ftr1i6c1; expires=Sun, 20-May-2035 10:18:13 GMT; path=/mywebapp CAKEPHP=q4tp37tn9gkhcpgmf1ftr1i6c1; expires=Sun, 20-May-2035 10:18:13 GMT; path=/mywebapp Keep-Alive timeout=15, max=100 Connection Keep-Alive Transfer-Encoding chunked Content-Type text/html ========================================================================== Request Headers -------------------------- Host localhost:8080 User-Agent Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.3) Gecko/20100401 Firefox/3.6.3 ( .NET CLR 3.5.30729) Accept text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8 Accept-Language en-us,en;q=0.5 Accept-Encoding gzip,deflate Accept-Charset ISO-8859-1,utf-8;q=0.7,*;q=0.7 Keep-Alive 115 Connection keep-alive Referer http://localhost:8080/mywebapp/section2/bigtable/1 Cookie CAKEPHP=q4tp37tn9gkhcpgmf1ftr1i6c1 Authorization Basic cGt1bWFyYTpwa3VtYXJh Notice the huge set of repeated lines at 'Set-Cookie' in the Response Header. This only happens when I try to display large tables. Does anyone have any clue to what might be causing this? Any help to find the issue is appreciated. I am using CakePHP 1.2.5. As far as I know, I am not messing with any set cookie functions. But yes I am using session variables.

Read the article
Using Oracle Proxy Authentication with JPA (eclipselink-Style)

- by olaf.heimburger

Security is a very intriguing topic. You will find it everywhere and you need to implement it everywhere. Yes, you need. Unfortunately, one can easily forget it while implementing the last mile. The Last Mile In a multi-tier application it is a common practice to use connection pools between the business layer and the database layer. Connection pools are quite useful to speed database connection creation and to split the load. Another very common practice is to use a specific, often called technical, user to connect to the database. This user has authentication and authorization rules that apply to all application users. Imagine you've put every effort to define roles for different types of users that use your application. These roles are necessary to differentiate between normal users, premium users, and administrators (I bet you will find or already have more roles in your application). While these user roles are pretty well used within your application, once the flow of execution enters the database everything is gone. Each and every user just has one role and is the same database user. Issues? What Issues? As long as things go well, this is not a real issue. However, things do not go well all the time. Once your application becomes famous performance decreases in certain situations or, more importantly, current and upcoming regulations and laws require that your application must be able to apply different security measures on a per user role basis at every stage of your application. If you only have a bunch of users with the same name and role you are not able to find the application usage profile that causes the performance issue, or which user has accessed data that he/she is not allowed to. Another thread to your role concept is that databases tend to be used by different applications and tools. These tools can be developer tools like SQL*Plus, SQL Developer, etc. or end user applications like BI Publisher, Oracle Forms and so on. These tools have no idea of your applications role concept and access the database the way they think is appropriate. A big oversight for your perfect role model and a big nightmare for your Chief Security Officer. Speaking of the CSO, brings up another issue: Password management. Once your technical user account is compromised, every user is able to do things that he/she is not expected to do from the design of your application. Counter Measures In the Oracle world a common counter measure is to use Virtual Private Database (VPD). This restricts the values a database user can see to the allowed minimum. However, it doesn't help in regard of a connection pool user, because this one is still not the real user. Oracle Proxy Authentication Another feature of the Oracle database is Proxy Authentication. First introduced with version 9i it is a quite useful feature for nearly every situation. The main idea behind Proxy Authentication is, to create a crippled database user who has only connect rights. Even if this user is compromised the risks are well understood and fairly limited. This user can be used in every situation in which you need to connect to the database, no matter which tool or application (see above) you use.The proxy user is perfect for multi-tier connection pools. CREATE USER app_user IDENTIFIED BY abcd1234; GRANT CREATE SESSION TO app_user; But what if you need to access real data? Well, this is the primary use case, isn't it? Now is the time to bring the application's role concept into play. You define database roles that define the grants for your identified user groups. Once you have these groups you grant access through the proxy user with the application role to the specific user. CREATE ROLE app_role_a; GRANT app_role_a TO scott; ALTER USER scott GRANT CONNECT THROUGH app_user WITH ROLE app_role_a; Now, hr has permission to connect to the database through the proxy user. Through the role you can restrict the hr's rights the are needed for the application only. If hr connects to the database directly all assigned role and permissions apply. Testing the Setup To test the setup you can use SQL*Plus and connect to your database: $ sqlplus app_user[hr]/abcd1234 Java Persistence API The Java Persistence API (JPA) is a fairly easy means to build applications that retrieve data from the database and put it into Java objects. You use plain old Java objects (POJOs) and mixin some Java annotations that define how the attributes of the object are used for storing data from the database into the Java object. Here is a sample for objects from the HR sample schema EMPLOYEES table. When using Java annotations you only specify what can not be deduced from the code. If your Java class name is Employee but the table name is EMPLOYEES, you need to specify the table name, otherwise it will fail. package demo.proxy.ejb; import java.io.Serializable; import java.sql.Timestamp; import java.util.List; import javax.persistence.Column; import javax.persistence.Entity; import javax.persistence.Id; import javax.persistence.JoinColumn; import javax.persistence.ManyToOne; import javax.persistence.NamedQueries; import javax.persistence.NamedQuery; import javax.persistence.OneToMany; import javax.persistence.Table; @Entity @NamedQueries({ @NamedQuery(name = "Employee.findAll", query = "select o from Employee o") }) @Table(name = "EMPLOYEES") public class Employee implements Serializable { @Column(name="COMMISSION_PCT") private Double commissionPct; @Column(name="DEPARTMENT_ID") private Long departmentId; @Column(nullable = false, unique = true, length = 25) private String email; @Id @Column(name="EMPLOYEE_ID", nullable = false) private Long employeeId; @Column(name="FIRST_NAME", length = 20) private String firstName; @Column(name="HIRE_DATE", nullable = false) private Timestamp hireDate; @Column(name="JOB_ID", nullable = false, length = 10) private String jobId; @Column(name="LAST_NAME", nullable = false, length = 25) private String lastName; @Column(name="PHONE_NUMBER", length = 20) private String phoneNumber; private Double salary; @ManyToOne @JoinColumn(name = "MANAGER_ID") private Employee employee; @OneToMany(mappedBy = "employee") private List employeeList; public Employee() { } public Employee(Double commissionPct, Long departmentId, String email, Long employeeId, String firstName, Timestamp hireDate, String jobId, String lastName, Employee employee, String phoneNumber, Double salary) { this.commissionPct = commissionPct; this.departmentId = departmentId; this.email = email; this.employeeId = employeeId; this.firstName = firstName; this.hireDate = hireDate; this.jobId = jobId; this.lastName = lastName; this.employee = employee; this.phoneNumber = phoneNumber; this.salary = salary; } public Double getCommissionPct() { return commissionPct; } public void setCommissionPct(Double commissionPct) { this.commissionPct = commissionPct; } public Long getDepartmentId() { return departmentId; } public void setDepartmentId(Long departmentId) { this.departmentId = departmentId; } public String getEmail() { return email; } public void setEmail(String email) { this.email = email; } public Long getEmployeeId() { return employeeId; } public void setEmployeeId(Long employeeId) { this.employeeId = employeeId; } public String getFirstName() { return firstName; } public void setFirstName(String firstName) { this.firstName = firstName; } public Timestamp getHireDate() { return hireDate; } public void setHireDate(Timestamp hireDate) { this.hireDate = hireDate; } public String getJobId() { return jobId; } public void setJobId(String jobId) { this.jobId = jobId; } public String getLastName() { return lastName; } public void setLastName(String lastName) { this.lastName = lastName; } public String getPhoneNumber() { return phoneNumber; } public void setPhoneNumber(String phoneNumber) { this.phoneNumber = phoneNumber; } public Double getSalary() { return salary; } public void setSalary(Double salary) { this.salary = salary; } public Employee getEmployee() { return employee; } public void setEmployee(Employee employee) { this.employee = employee; } public List getEmployeeList() { return employeeList; } public void setEmployeeList(List employeeList) { this.employeeList = employeeList; } public Employee addEmployee(Employee employee) { getEmployeeList().add(employee); employee.setEmployee(this); return employee; } public Employee removeEmployee(Employee employee) { getEmployeeList().remove(employee); employee.setEmployee(null); return employee; } } JPA could be used in standalone applications and Java EE containers. In both worlds you normally create a Facade to retrieve or store the values of the Entities to or from the database. The Facade does this via an EntityManager which will be injected by the Java EE container. Here is sample Facade Session Bean for a Java EE container. package demo.proxy.ejb; import java.util.HashMap; import java.util.List; import javax.ejb.Local; import javax.ejb.Remote; import javax.ejb.Stateless; import javax.persistence.EntityManager; import javax.persistence.PersistenceContext; import javax.persistence.Query; import javax.interceptor.AroundInvoke; import javax.interceptor.InvocationContext; import oracle.jdbc.driver.OracleConnection; import org.eclipse.persistence.config.EntityManagerProperties; import org.eclipse.persistence.internal.jpa.EntityManagerImpl; @Stateless(name = "DataFacade", mappedName = "ProxyUser-TestEJB-DataFacade") @Remote @Local public class DataFacadeBean implements DataFacade, DataFacadeLocal { @PersistenceContext(unitName = "TestEJB") private EntityManager em; private String username; public Object queryByRange(String jpqlStmt, int firstResult, int maxResults) { // setSessionUser(); Query query = em.createQuery(jpqlStmt); if (firstResult 0) { query = query.setFirstResult(firstResult); } if (maxResults 0) { query = query.setMaxResults(maxResults); } return query.getResultList(); } public Employee persistEmployee(Employee employee) { // setSessionUser(); em.persist(employee); return employee; } public Employee mergeEmployee(Employee employee) { // setSessionUser(); return em.merge(employee); } public void removeEmployee(Employee employee) { // setSessionUser(); employee = em.find(Employee.class, employee.getEmployeeId()); em.remove(employee); } /** select o from Employee o */ public List getEmployeeFindAll() { Query q = em.createNamedQuery("Employee.findAll"); return q.getResultList(); } Putting Both Together To use Proxy Authentication with JPA and within a Java EE container you have to take care of the additional requirements: Use an OCI JDBC driver Provide the user name that connects through the proxy user Use an OCI JDBC driver To use the OCI JDBC driver you need to set up your JDBC data source file to use the correct JDBC URL. hr jdbc:oracle:oci8:@(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=localhost)(PORT=1521))(CONNECT_DATA=(SID=XE))) oracle.jdbc.OracleDriver user app_user 62C32F70E98297522AD97E15439FAC0E SQL SELECT 1 FROM DUAL jdbc/hrDS Application Additionally you need to make sure that the version of the shared libraries of the OCI driver match the version of the JDBC driver in your Java EE container or Java application and are within your PATH (on Windows) or LD_LIBRARY_PATH (on most Unix-based systems). Installing the Oracle Database Instance Client software works perfectly. Provide the user name that connects through the proxy user This part needs some modification of your application software and session facade. Session Facade Changes In the Session Facade we must ensure that every call that goes through the EntityManager must be prepared correctly and uniquely assigned to this session. The second is really important, as the EntityManager works with a connection pool and can not guarantee that we set the proxy user on the connection that will be used for the database activities. To avoid changing every method call of the Session Facade we provide a method to set the username of the user that connects through the proxy user. This method needs to be called by the Facade client bfore doing anything else. public void setUsername(String name) { username = name; } Next we provide a means to instruct the TopLink EntityManager Delegate to use Oracle Proxy Authentication. (I love small helper methods to hide the nitty-gritty details and avoid repeating myself.) private void setSessionUser() { setSessionUser(username); } private void setSessionUser(String user) { if (user != null && !user.isEmpty()) { EntityManagerImpl emDelegate = ((EntityManagerImpl)em.getDelegate()); emDelegate.setProperty(EntityManagerProperties.ORACLE_PROXY_TYPE, OracleConnection.PROXYTYPE_USER_NAME); emDelegate.setProperty(OracleConnection.PROXY_USER_NAME, user); emDelegate.setProperty(EntityManagerProperties.EXCLUSIVE_CONNECTION_MODE, "Always"); } } The final step is use the EJB 3.0 AroundInvoke interceptor. This interceptor will be called around every method invocation. We therefore check whether the Facade methods will be called or not. If so, we set the user for proxy authentication and the normal method flow continues. @AroundInvoke public Object proxyInterceptor(InvocationContext invocationCtx) throws Exception { if (invocationCtx.getTarget() instanceof DataFacadeBean) { setSessionUser(); } return invocationCtx.proceed(); } Benefits Using Oracle Proxy Authentification has a number of additional benefits appart from implementing the role model of your application: Fine grained access control for temporary users of the account, without compromising the original password. Enabling database auditing and logging. Better identification of performance bottlenecks. References Effective Oracle Database 10g Security by Design, David Knox TopLink Developer's Guide, Chapter 98

Read the article
SOLVED Install MythTV & 11.10 on Lenovo S12 (Intel atom) with wireless

- by keepitsimpleengineer

This is how I installed Ubuntu 11.10 and MythTV client on my Lenovo S12 (Intel Atom) laptop and use it using WiFi (see additional notes at end). I did this because the upgrade from 11.04 bricked the laptop. Note that the partitions on the Lenovo standard disk were already in place for this installation. Also note that my LAN is setup for fixed IP addresses. Downloaded and burned 11.10 x86 Desktop Ubuntu CD Connected the power supply cord, LAN wire and the external DVD USB drive. Ran Windows XP and made sure performance level "Performance" was set and "Wireless" was enabled. Booted S12 from CD Disabled Networking from icon on upper left panel icon Edited Connections… "Wired connection 1" ? Set IP address, accepted default netmask and set gateway. Also set DNS server. Good idea to check "Connection Information" here to verify everything's O.K. Selected Install Ubuntu from the initial "Install" window Verified the three items were checked (required disk space available, plugged into a power source, & connected to the Internet) Selected Download updates while installing and third party software. Hit Continue… At wireless selected don't want to connect…WiFi…now. Continue… At Installation type, selected Something else. Continue… At partition tale, selected the ext4 Linux partition, set the mount point as "/", and marked for formatting. Here I selected the main disk (/sda) for installing the boot manager. Continue… Selected or verified my Time zone. Continue… Selected my keyboard layout. Continue… Filled in the who are you fields. Make sure password is required to sign in is checked. Continue… Chose a picture. Continue… I selected import no accounts. Continue… Wait as the Install creeps along. If your screen goes blank, tap the space bar ? apparently the screen saver/power plan does this. There are several progress bars. The longest was "Installing system", and it was the next to the last one. Installation Complete window appears, Restart Now… Wait as it stops, The screen blanks then the message "…remove…media…close tray…press enter" I just unplugged the USB DVD and hit enter… It was disheartening but the screen turned Ubuntu Purple-beige and nothing happened, so I help down the power key until it shut down, the pressed it again and the Grub Boot screen appeared. Select Ubuntu… 25.The screen went blank with the little flashing underscore cursor on it and the disk light would occasionally flash. I hit the enter key and eventuality Ubuntu started. After a somewhat long time the unity desktop appeared. 11.10, unlike earlier versions, retains the connection information. Check this by checking the network icon on the upper left applet panel. Here the touch-pad·mouse quit working and I had to reboot. It takes and extremely long time to boot, sometimes requiring several power off/ power on (cold boot). You can try to get the default network manager to work, but it might not, it didn't on mine for WiFi. Thanks to: Chris at URL here's what to do… disconnect your wired Internet connection. input your wireless information into network manager open a terminal (unity dash, top of icon totem, open, and make sure the ruler&pen icon on the bottom is selected, 2nd from left) type in "terminal". Might be a good idea to drag and drop the terminal icon to the terminal, it's easy to get rid of later. click to open a terminal, and type in: sudo rmmod acer_wmi && echo "blacklist acer_wmi" >> /etc/modprobe.d/blacklist.conf and hit enter. type in your password as asked. if you have correctly entered your WiFi information and you are near your AP, you should connect immediately if not, see the URL above ? you might need to replace "network manager" with "wicd" ? I did with 11.04. Update the new 11.10, in the upper left panel applet weird·gear icon is menu with a line about updating. It's the new way to invoke Update Manager. Your lenovo S12 (intel atom) should now run the new unity Ubuntu. Point your elbow at the ceiling and pat yourself on the back. Installing Mythbuntu Client 24.1 Open mythbuntu.org/repos (I urge you not to directly use Ubuntu Software Center for this) Install Mythbuntu Repos Save the file (in ~/Downloads, the default) Run the file ? it will update your repositories so that you will get the proper installation sources ? it will start Ubuntu Software Center to do this ? Click Install… You will need your password. Debconf window will open, select by making sure check mark is in the little box "Would you like to activate…". Forward… Which version? At the time of writing the current "Stable" version was 24.1, select 0.24.x… Forward… Read the message, then forward… Delete the downloaded file. Install synaptic (unity dash, top of icon totem, open, and make sure the ruler&pen icon on the bottom is selected, 2nd from left) type in "synaptic". Click on the synaptic icon. Ubuntu Software Center will open and allow you to install synaptic package manager. Open Synaptic (unity dash, top of icon totem, open, and make sure the ruler&pen icon on the bottom is selected, 2nd from left) type in "Synaptic". Might be a good idea to drag and drop the terminal icon to the terminal, it's easy to get rid of later. Run synaptic, read the intro, and close the intro window. Type in mythbuntu-control-centre in the Quick filter text box, and then select it "Mark for installation" by clicking on the box next to it's name. Marvel at the additional to be installed items, then select "?Mark"… At the top of the synaptic window click on the "? Apply" button. Marvel at the amount of stuff to be installed, the click on "Apply". When finished, close finished window and synaptic. Open mythbuntu-control-centre (unity dash, top of icon totem, open, and make sure the ruler&pen icon on the bottom is selected, 2nd from left) type in "mythbuntu". Might be a good idea to drag and drop the mythbuntu-control-centre icon to the terminal, it's easy to get rid of later. You can now configure and install the frontend. Go down the icon totem on the right side of the window and click as needed… System roles. ? No Backend, Desktop Frontend, and Ubuntu Desktop. Apply… & Apply changes… & Password… MySQL Configuration ? from backend ? Setup General Alt-N(ext) Alt-N(ext) Stetting Access Setup PIN code: ~~~~ Input Security key and click "Test Connection", if ?, then Apply… & Apply… {note: for some inexplicable reason, control centre hung on this, but when I restarted it, it was set properly} Graphics drivers, When I did this, only the Broadcom wireless driver showed up. I closed without doing anything. Services. I enabled SSH & Samba. Apply… & Apply… Repositories. Asked & Answered. MythExport. Pass, I believe it requires backend on the same system. Proprietary Codec Support. Check to enable, Apply… & Apply… System Updates. No action necessary, will be a part of the Ubuntu update mechanism. Themes and Artwork. For themes, I selected Enable/Update all. Apply… & Apply… Infrared & Startup behavior and Plugins. Defer until you know more. Close software centre. Open mythTV (unity dash, top of icon totem, open, and make sure the ruler&pen icon on the bottom is selected, 2nd from left) type in "mythTV". Might be a good idea to drag and drop the mythTV icon to the terminal, it's easy to get rid of later. Incorrect Group Membership. Fix this by clicking "Yes"… Log out/end. Do this by clicking "Yes"… For my Lenovo S12, I had to manually restart Ubuntu - and still with the very long restart…/no start/cold boot/reboot/pressing the shift key required Open mythTV (unity dash, top of icon totem, open, and make sure the ruler&pen icon on the bottom is selected, 2nd from left) type in "mythTV". Might be a good idea to drag and drop the mythTV icon to the terminal, it's easy to get rid of later. Will open with Select country & language. Do so. then get message with "No", hit "Ok" and arrive at the data base Configuration 1/2 screen. You will need your brackend password, from backend ? Setup General Database Configuration 1/2 Password:~? Enter this Hit Alt-n to go to the next page. Select "Use custom id…", then enter a custom ID, I use the machine's name. Hit finish, and MythTV should start up with all default settings. For the lenovo S12, the first thing you want to do is to set Playback profiles to "Normal". From Setup TV Settings Playback Alt-N(ext) Alt-N(ext) Playback Profiles (3/8) : Change Current Video Playback Profile to "Normal". You can fiddle with this setting later. For the lenovo S12, the second thing is to get the sound going. From Setup General Alt-N(ext) Alt-N(ext) Alt-N(ext) Audio System: The top of the screen is a button title "Scan for audio devices", move the highlight there and press the Space bar. Then Tab down to Audio Output Device: and left-right arrow until "ALSA:hw:Card=Intel,DEV=0" is selected. Then Alt-N(ext) until "Finish". Now you should have sound. You should now have MythTV working nicely on the Lenovo S12 Notes about wireless: Running Lenovo S12 on wireless is demanding on both power and WiFi connection. Best results will be obtained when running on power and wired connection. I run my S12 on wireless, actually two serial connections with two access points, something that is not easy to achieve. Here Mythbuntu client-server (in den) <? wireless link 1 <?office LAN? wireless link 2 <? Lenovo S12 Ubuntu 11.10 The office LAN is fixed IP behind an Untangle firewall router. There is another MythTV client on Ubuntu 10.10 computer in the office (which has always worked well). ProblemMythbuntu\Win7 client hangs with frozen frames, short segment of audio repeating. Hardware Rosewill RNX-G300EX IEEE 802.11b/g PCI Wireless Card on client-server 2 Linksys WRT54GL wireless broadband routers on LAN for link1 and link 2 WRT54GL FirmwareDD-WRT v24-sp2(07/22/09) voip set up to act as an access point. Note? many people advised this was an unworkable scheme, and in probably most cases it will be. Solution? Set up DD-WRT with the following Wireless settings… Basic Channel: Different fixed channels at least 4 difference, I use 6 & 11 Basic Sensitivity Range (ACK timing): 50 MAC filter use filter: Enable, Selected Permit only clients listed to access… Requires adding MAC addresses in "Edit MAC Filter List" This causes the 54GL's to ignore any but the listed MAC address, down side, no "guest" capability. Advanced Basic rate: All Advanced CTS Protection Mode: Off Advanced Frame Burst: Enable Advanced Max associate clients: 4 for client link 2, 1 for client-server link 1 Advanced AP isolation: Enable Advanced Preamble: Short Advanced Afterburner: On Advanced Wireless GUI access: Off Advanced WMM support: Off Other settings: default for supplied firmware. Why I suspect this worked? The 54GL Access Points's with the firmware's setting are set to handle a multiple client, wide area situation. With these mods I reconfigured them for a small area, few client situation, disabling Advanced WMM probably the most important. In addition, the client mythtv when used all other users of its access point are turned off except for a Skype phone. Also, the client-server is set up to allow other connections though it's LAN connection, and these are used to connect the TV and disc players, not used when client is being used.

Read the article
timer_getoverrun() doesn't behave as expected when using sleep()

- by dlp

Here is a program that uses a POSIX per-process timer alongside the sleep subroutine. The signal used by the timer has been set to SIGUSR1 rather than SIGALRM, since SIGALRM may be used internally by sleep, but it still doesn't seem to work. I have run the program using the command line timer-overruns -d 1 -n 10000000 (1 cs interval) so, in theory, we should expect 100 overruns between calls to sigwaitinfo. However, timer_getoverrun returns 0. I have also tried a version using a time-consuming for loop to introduce the delay. In this case, overruns are recorded. Does anyone know why this happens? I am running a 3.4 Linux kernel. Program source /* * timer-overruns.c */ #include <unistd.h> #include <stdlib.h> #include <stdio.h> #include <signal.h> #include <time.h> // Signal to be used for timer expirations #define TIMER_SIGNAL SIGUSR1 int main(int argc, char **argv) { int opt; int d = 0; int r = 0; // Repeat indefinitely struct itimerspec its; its.it_interval.tv_sec = 0; its.it_interval.tv_nsec = 0; // Parse arguments while ((opt = getopt(argc, argv, "d:r:s:n:")) != -1) { switch (opt) { case 'd': // Delay before calling sigwaitinfo() d = atoi(optarg); break; case 'r': // Number of times to call sigwaitinfo() r = atoi(optarg); break; case 's': // Timer interval (seconds) its.it_interval.tv_sec = its.it_value.tv_sec = atoi(optarg); break; case 'n': // Timer interval (nanoseconds) its.it_interval.tv_nsec = its.it_value.tv_nsec = atoi(optarg); break; default: /* '?' */ fprintf(stderr, "Usage: %s [-d signal_accept_delay] [-r repetitions] [-s interval_seconds] [-n interval_nanoseconds]\n", argv[0]); exit(EXIT_FAILURE); } } // Check sanity of command line arguments short e = 0; if (d < 0) { fprintf(stderr, "Delay (-d) cannot be negative!\n"); e++; } if (r < 0) { fprintf(stderr, "Number of repetitions (-r) cannot be negative!\n"); e++; } if (its.it_interval.tv_sec < 0) { fprintf(stderr, "Interval seconds value (-s) cannot be negative!\n"); e++; } if (its.it_interval.tv_nsec < 0) { fprintf(stderr, "Interval nanoseconds value (-n) cannot be negative!\n"); e++; } if (its.it_interval.tv_nsec > 999999999) { fprintf(stderr, "Interval nanoseconds value (-n) must be < 1 second.\n"); e++; } if (e > 0) exit(EXIT_FAILURE); // Set default values if not specified if (its.it_interval.tv_sec == 0 && its.it_interval.tv_nsec == 0) { its.it_interval.tv_sec = its.it_value.tv_sec = 1; its.it_value.tv_nsec = 0; } printf("Running with timer delay %d.%09d seconds\n", (int) its.it_interval.tv_sec, (int) its.it_interval.tv_nsec); // Will be waiting for signals synchronously, so block the one in use. sigset_t sigset; sigemptyset(&sigset); sigaddset(&sigset, TIMER_SIGNAL); sigprocmask(SIG_BLOCK, &sigset, NULL ); // Create and arm the timer struct sigevent sev; timer_t timer; sev.sigev_notify = SIGEV_SIGNAL; sev.sigev_signo = TIMER_SIGNAL; sev.sigev_value.sival_ptr = timer; timer_create(CLOCK_REALTIME, &sev, &timer); timer_settime(timer, TIMER_ABSTIME, &its, NULL ); // Signal handling loop int overruns; siginfo_t si; // Make the loop infinite if r = 0 if (r == 0) r = -1; while (r != 0) { // Sleeping should cause overruns if (d > 0) sleep(d); sigwaitinfo(&sigset, &si); // Check that the signal is from the timer if (si.si_code != SI_TIMER) continue; overruns = timer_getoverrun(timer); if (overruns > 0) { printf("Timer overrun occurred for %d expirations.\n", overruns); } // Decrement r if not repeating indefinitely if (r > 0) r--; } return EXIT_SUCCESS; }

Read the article
Inconsistent results in R with RNetCDF - why?

- by sarcozona

I am having trouble extracting data from NetCDF data files using RNetCDF. The data files each have 3 dimensions (longitude, latitude, and a date) and 3 variables (latitude, longitude, and a climate variable). There are four datasets, each with a different climate variable. Here is some of the output from print.nc(p8m.tmax) for clarity. The other datasets are identical except for the specific climate variable. dimensions: month = UNLIMITED ; // (1368 currently) lat = 3105 ; lon = 7025 ; variables: float lat(lat) ; lat:long_name = "latitude" ; lat:standard_name = "latitude" ; lat:units = "degrees_north" ; float lon(lon) ; lon:long_name = "longitude" ; lon:standard_name = "longitude" ; lon:units = "degrees_east" ; short tmax(lon, lat, month) ; tmax:missing_value = -9999 ; tmax:_FillValue = -9999 ; tmax:units = "degree_celsius" ; tmax:scale_factor = 0.01 ; tmax:valid_min = -5000 ; tmax:valid_max = 6000 ; I am getting behavior I don't understand when I use the var.get.nc function from the RNetCDF package. For example, when I attempt to extract 82 values beginning at stval from the maximum temperature data (p8m.tmax <- open.nc(tmaxdataset.nc)) with > var.get.nc(p8m.tmax,'tmax', start=c(lon_val, lat_val, stval),count=c(1,1,82)) (where lon_val and lat_val specify the location in the dataset of the coordinates I'm interested in and stval is stval is set to which(time_vec==200201), which in this case equaled 1285.) I get Error: Invalid argument But after successfully extracting 80 and 81 values > var.get.nc(p8m.tmax,'tmax', start=c(lon_val, lat_val, stval),count=c(1,1,80)) > var.get.nc(p8m.tmax,'tmax', start=c(lon_val, lat_val, stval),count=c(1,1,81)) the command with 82 works: > var.get.nc(p8m.tmax,'tmax', start=c(lon_val, lat_val, stval),count=c(1,1,82)) [1] 444 866 1063 ... [output snipped] The same problem occurs in the identically structured tmin file, but at 36 instead of 82: > var.get.nc(p8m.tmin,'tmin', start=c(lon_val, lat_val, stval),count=c(1,1,36)) produces Error: Invalid argument But after repeating with counts of 30, 31, etc > var.get.nc(p8m.tmin,'tmin', start=c(lon_val, lat_val, stval), count=c(1,1,36)) works. These examples make it seem like the function is failing at the last count, but that actually isn't the case. In the first example, var.get.nc gave Error: Invalid argument after I asked for 84 values. I then narrowed the failure down to the 82nd count by varying the starting point in the dataset and asking for only 1 value at a time. The particular number the problem occurs at also varies. I can close and reopen the dataset and have the problem occur at a different location. In the particular examples above, lon_val and lat_val are 1595 and 1751, respectively, identifying the location in the dataset along the lat and lon dimensions for the latitude and longitude I'm interested in. The 1595th latitude and 1751th longitude are not the problem, however. The problem occurs with all other latitude and longitudes I've tried. Varying the starting location in the dataset along the climate variable dimension (stval) and/or specifying it different (as a number in the command instead of the object stval) also does not fix the problem. This problem doesn't always occur. I can run identical code three times in a row (clearing all objects in between runs) and get a different outcome each time. The first run may choke on the 7th entry I'm trying to get, the second might work fine, and the third run might choke on the 83rd entry. I'm absolutely baffled by such inconsistent behavior. The open.nc function has also started to fail with the same Error: Invalid argument. Like the var.get.nc problems, it also occurs inconsistently. Does anyone know what causes the initial failure to extract the variable? And how I might prevent it? Could have to do with the size of the data files (~60GB each) and/or the fact that I'm accessing them through networked drives? This was also asked here: https://stat.ethz.ch/pipermail/r-help/2011-June/281233.html > sessionInfo() R version 2.13.0 (2011-04-13) Platform: i386-pc-mingw32/i386 (32-bit) locale: [1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 [3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C [5] LC_TIME=English_United States.1252 attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] reshape_0.8.4 plyr_1.5.2 RNetCDF_1.5.2-2 loaded via a namespace (and not attached): [1] tools_2.13.0

Read the article
Loop crashing program having to do with 2D arrays

- by user450062

I am creating an encoding program and when I instruct the program to create a 5X5 grid based on the alphabet while skipping over letters that match up to certain pre-defined variables(which are given values by user input during runtime). I have a loop that instructs the loop to keep running until the values that access the array are out of bounds, the loop seems to cause the problem. This code is standardized so there shouldn't be much trouble compiling it in another compiler. Also would it be better to seperate my program into functions? here is the code: #include<iostream> #include<fstream> #include<cstdlib> #include<string> #include<limits> using namespace std; int main(){ while (!cin.fail()) { char type[81]; char filename[20]; char key [5]; char f[2] = "q"; char g[2] = "q"; char h[2] = "q"; char i[2] = "q"; char j[2] = "q"; char k[2] = "q"; char l[2] = "q"; int a = 1; int b = 1; int c = 1; int d = 1; int e = 1; string cipherarraytemplate[5][5]= { {"a","b","c","d","e"}, {"f","g","h","i","j"}, {"k","l","m","n","o"}, {"p","r","s","t","u"}, {"v","w","x","y","z"} }; string cipherarray[5][5]= { {"a","b","c","d","e"}, {"f","g","h","i","j"}, {"k","l","m","n","o"}, {"p","r","s","t","u"}, {"v","w","x","y","z"} }; cout<<"Enter the name of a file you want to create.\n"; cin>>filename; ofstream outFile; outFile.open(filename); outFile<<fixed; outFile.precision(2); outFile.setf(ios_base::showpoint); cin.ignore(std::numeric_limits<int>::max(),'\n'); cout<<"enter your codeword(codeword can have no repeating letters)\n"; cin>>key; while (key[a] != '\0' ){ while(b < 6){ cipherarray[b][c] = key[a]; if ( f == "q" ) { cipherarray[b][c] = f; } if ( f != "q" && g == "q" ) { cipherarray[b][c] = g; } if ( g != "q" && h == "q" ) { cipherarray[b][c] = h; } if ( h != "q" && i == "q" ) { cipherarray[b][c] = i; } if ( i != "q" && j == "q" ) { cipherarray[b][c] = j; } if ( j != "q" && k == "q" ) { cipherarray[b][c] = k; } if ( k != "q" && l == "q" ) { cipherarray[b][c] = l; } a++; b++; } c++; b = 1; } while (c < 6 || b < 6){ if (cipherarraytemplate[d][e] == f || cipherarraytemplate[d][e] == g || cipherarraytemplate[d][e] == h || cipherarraytemplate[d][e] == i || cipherarraytemplate[d][e] == j || cipherarraytemplate[d][e] == k || cipherarraytemplate[d][e] == l){ d++; } else { cipherarray[b][c] = cipherarraytemplate[d][e]; d++; b++; } if (d == 6){ d = 1; e++; } if (b == 6){ c++; b = 1; } } cout<<"now enter some text."<<endl<<"To end this program press Crtl-Z\n"; while(!cin.fail()){ cin.getline(type,81); outFile<<type<<endl; } outFile.close(); } } I know there is going to be some mid-forties guy out there who is going to stumble on to this post, he's have been programming for 20-some years and he's going to look at my code and say: "what is this guy doing".

Read the article
OpenVPN Client timing out

- by Austin

I recently installed OpenVPN on my Ubuntu VPS. Whenenver I try to connect to it, I can establish a connection just fine. However, everything I try to connect to times out. If I try to ping something, it will resolve the IP, but will time out after resolving the IP. (So DNS Server seems to be working correctly) My server.conf has this relevant information (At least I think it's relevant. I'm not sure if you need more or not) # Which local IP address should OpenVPN # listen on? (optional) ;local a.b.c.d # Which TCP/UDP port should OpenVPN listen on? # If you want to run multiple OpenVPN instances # on the same machine, use a different port # number for each one. You will need to # open up this port on your firewall. port 1194 # TCP or UDP server? ;proto tcp proto udp # "dev tun" will create a routed IP tunnel, # "dev tap" will create an ethernet tunnel. # Use "dev tap0" if you are ethernet bridging # and have precreated a tap0 virtual interface # and bridged it with your ethernet interface. # If you want to control access policies # over the VPN, you must create firewall # rules for the the TUN/TAP interface. # On non-Windows systems, you can give # an explicit unit number, such as tun0. # On Windows, use "dev-node" for this. # On most systems, the VPN will not function # unless you partially or fully disable # the firewall for the TUN/TAP interface. ;dev tap dev tun # Windows needs the TAP-Win32 adapter name # from the Network Connections panel if you # have more than one. On XP SP2 or higher, # you may need to selectively disable the # Windows firewall for the TAP adapter. # Non-Windows systems usually don't need this. ;dev-node MyTap # SSL/TLS root certificate (ca), certificate # (cert), and private key (key). Each client # and the server must have their own cert and # key file. The server and all clients will # use the same ca file. # # See the "easy-rsa" directory for a series # of scripts for generating RSA certificates # and private keys. Remember to use # a unique Common Name for the server # and each of the client certificates. # # Any X509 key management system can be used. # OpenVPN can also use a PKCS #12 formatted key file # (see "pkcs12" directive in man page). ca ca.crt cert server.crt key server.key # This file should be kept secret # Diffie hellman parameters. # Generate your own with: # openssl dhparam -out dh1024.pem 1024 # Substitute 2048 for 1024 if you are using # 2048 bit keys. dh dh1024.pem # Configure server mode and supply a VPN subnet # for OpenVPN to draw client addresses from. # The server will take 10.8.0.1 for itself, # the rest will be made available to clients. # Each client will be able to reach the server # on 10.8.0.1. Comment this line out if you are # ethernet bridging. See the man page for more info. server 10.8.0.0 255.255.255.0 # Maintain a record of client <-> virtual IP address # associations in this file. If OpenVPN goes down or # is restarted, reconnecting clients can be assigned # the same virtual IP address from the pool that was # previously assigned. ifconfig-pool-persist ipp.txt # Configure server mode for ethernet bridging. # You must first use your OS's bridging capability # to bridge the TAP interface with the ethernet # NIC interface. Then you must manually set the # IP/netmask on the bridge interface, here we # assume 10.8.0.4/255.255.255.0. Finally we # must set aside an IP range in this subnet # (start=10.8.0.50 end=10.8.0.100) to allocate # to connecting clients. Leave this line commented # out unless you are ethernet bridging. ;server-bridge 10.8.0.4 255.255.255.0 10.8.0.50 10.8.0.100 # Configure server mode for ethernet bridging # using a DHCP-proxy, where clients talk # to the OpenVPN server-side DHCP server # to receive their IP address allocation # and DNS server addresses. You must first use # your OS's bridging capability to bridge the TAP # interface with the ethernet NIC interface. # Note: this mode only works on clients (such as # Windows), where the client-side TAP adapter is # bound to a DHCP client. ;server-bridge # Push routes to the client to allow it # to reach other private subnets behind # the server. Remember that these # private subnets will also need # to know to route the OpenVPN client # address pool (10.8.0.0/255.255.255.0) # back to the OpenVPN server. ;push "route 192.168.10.0 255.255.255.0" ;push "route 192.168.20.0 255.255.255.0" # To assign specific IP addresses to specific # clients or if a connecting client has a private # subnet behind it that should also have VPN access, # use the subdirectory "ccd" for client-specific # configuration files (see man page for more info). # EXAMPLE: Suppose the client # having the certificate common name "Thelonious" # also has a small subnet behind his connecting # machine, such as 192.168.40.128/255.255.255.248. # First, uncomment out these lines: ;client-config-dir ccd ;route 192.168.40.128 255.255.255.248 # Then create a file ccd/Thelonious with this line: # iroute 192.168.40.128 255.255.255.248 # This will allow Thelonious' private subnet to # access the VPN. This example will only work # if you are routing, not bridging, i.e. you are # using "dev tun" and "server" directives. # EXAMPLE: Suppose you want to give # Thelonious a fixed VPN IP address of 10.9.0.1. # First uncomment out these lines: ;client-config-dir ccd ;route 10.9.0.0 255.255.255.252 # Then add this line to ccd/Thelonious: # ifconfig-push 10.9.0.1 10.9.0.2 # Suppose that you want to enable different # firewall access policies for different groups # of clients. There are two methods: # (1) Run multiple OpenVPN daemons, one for each # group, and firewall the TUN/TAP interface # for each group/daemon appropriately. # (2) (Advanced) Create a script to dynamically # modify the firewall in response to access # from different clients. See man # page for more info on learn-address script. ;learn-address ./script # If enabled, this directive will configure # all clients to redirect their default # network gateway through the VPN, causing # all IP traffic such as web browsing and # and DNS lookups to go through the VPN # (The OpenVPN server machine may need to NAT # or bridge the TUN/TAP interface to the internet # in order for this to work properly). push "redirect-gateway def1 bypass-dhcp" push "dhcp-option DNS 8.8.8.8" # Certain Windows-specific network settings # can be pushed to clients, such as DNS # or WINS server addresses. CAVEAT: # http://openvpn.net/faq.html#dhcpcaveats # The addresses below refer to the public # DNS servers provided by opendns.com. ;push "dhcp-option DNS 8.8.8.8" push "dhcp-option DNS 8.8.4.4" # Uncomment this directive to allow different # clients to be able to "see" each other. # By default, clients will only see the server. # To force clients to only see the server, you # will also need to appropriately firewall the # server's TUN/TAP interface. ;client-to-client # Uncomment this directive if multiple clients # might connect with the same certificate/key # files or common names. This is recommended # only for testing purposes. For production use, # each client should have its own certificate/key # pair. # # IF YOU HAVE NOT GENERATED INDIVIDUAL # CERTIFICATE/KEY PAIRS FOR EACH CLIENT, # EACH HAVING ITS OWN UNIQUE "COMMON NAME", # UNCOMMENT THIS LINE OUT. ;duplicate-cn # The keepalive directive causes ping-like # messages to be sent back and forth over # the link so that each side knows when # the other side has gone down. # Ping every 10 seconds, assume that remote # peer is down if no ping received during # a 120 second time period. keepalive 10 120 # For extra security beyond that provided # by SSL/TLS, create an "HMAC firewall" # to help block DoS attacks and UDP port flooding. # # Generate with: # openvpn --genkey --secret ta.key # # The server and each client must have # a copy of this key. # The second parameter should be '0' # on the server and '1' on the clients. ;tls-auth ta.key 0 # This file is secret # Select a cryptographic cipher. # This config item must be copied to # the client config file as well. ;cipher BF-CBC # Blowfish (default) ;cipher AES-128-CBC # AES ;cipher DES-EDE3-CBC # Triple-DES # Enable compression on the VPN link. # If you enable it here, you must also # enable it in the client config file. comp-lzo # The maximum number of concurrently connected # clients we want to allow. ;max-clients 100 # It's a good idea to reduce the OpenVPN # daemon's privileges after initialization. # # You can uncomment this out on # non-Windows systems. ;user nobody ;group nogroup # The persist options will try to avoid # accessing certain resources on restart # that may no longer be accessible because # of the privilege downgrade. persist-key persist-tun # Output a short status file showing # current connections, truncated # and rewritten every minute. status openvpn-status.log # By default, log messages will go to the syslog (or # on Windows, if running as a service, they will go to # the "\Program Files\OpenVPN\log" directory). # Use log or log-append to override this default. # "log" will truncate the log file on OpenVPN startup, # while "log-append" will append to it. Use one # or the other (but not both). ;log openvpn.log ;log-append openvpn.log # Set the appropriate level of log # file verbosity. # # 0 is silent, except for fatal errors # 4 is reasonable for general usage # 5 and 6 can help to debug connection problems # 9 is extremely verbose verb 3 # Silence repeating messages. At most 20 # sequential messages of the same message # category will be output to the log. ;mute 20 I've tried on multiple computers by the way. The same result on all of them. What could be wrong? Thanks in advance, and if you need other information I'll gladly post it. Information for new comments root@vps:~# iptables -L -n -v Chain INPUT (policy ACCEPT 862K packets, 51M bytes) pkts bytes target prot opt in out source destination Chain FORWARD (policy ACCEPT 3 packets, 382 bytes) pkts bytes target prot opt in out source destination 0 0 ACCEPT all -- * * 0.0.0.0/0 0.0.0.0/0 state RELATED,ESTABLISHED 4641 298K ACCEPT all -- * * 10.8.0.0/24 0.0.0.0/0 0 0 REJECT all -- * * 0.0.0.0/0 0.0.0.0/0 reject-with icmp-port-unreachable Chain OUTPUT (policy ACCEPT 1671K packets, 2378M bytes) pkts bytes target prot opt in out source destination And root@vps:~# iptables -t nat -L -n -v Chain PREROUTING (policy ACCEPT 17937 packets, 2013K bytes) pkts bytes target prot opt in out source destination Chain POSTROUTING (policy ACCEPT 8975 packets, 562K bytes) pkts bytes target prot opt in out source destination 1579 103K SNAT all -- * * 10.8.0.0/24 0.0.0.0/0 to:SERVERIP Chain OUTPUT (policy ACCEPT 8972 packets, 562K bytes) pkts bytes target prot opt in out source destination

Read the article
Set up linux box for secure local hosting a-z

- by microchasm

I am in the process of reinstalling the OS on a machine that will be used to host a couple of apps for our business. The apps will be local only; access from external clients will be via vpn only. The prior setup used a hosting control panel (Plesk) for most of the admin, and I was looking at using another similar piece of software for the reinstall - but I figured I should finally learn how it all works. I can do most of the things the software would do for me, but am unclear on the symbiosis of it all. This is all an attempt to further distance myself from the land of Configuration Programmer/Programmer, if at all possible. I can't find a full walkthrough anywhere for what I'm looking for, so I thought I'd put up this question, and if people can help me on the way I will edit this with the answers, and document my progress/pitfalls. Hopefully someday this will help someone down the line. The details: CentOS 5.5 x86_64 httpd: Apache/2.2.3 mysql: 5.0.77 (to be upgraded) php: 5.1 (to be upgraded) The requirements: SECURITY!! Secure file transfer Secure client access (SSL Certs and CA) Secure data storage Virtualhosts/multiple subdomains Local email would be nice, but not critical The Steps: Download latest CentOS DVD-iso (torrent worked great for me). Install CentOS: While going through the install, I checked the Server Components option thinking I was going to be using another Plesk-like admin. In hindsight, considering I've decided to try to go my own way, this probably wasn't the best idea. Basic config: Setup users, networking/ip address etc. Yum update/upgrade. Upgrade PHP/MySQL: To upgrade PHP and MySQL to the latest versions, I had to look to another repo outside CentOS. IUS looks great and I'm happy I found it! Add IUS repository to our package manager cd /tmp wget http://dl.iuscommunity.org/pub/ius/stable/Redhat/5/x86_64/epel-release-1-1.ius.el5.noarch.rpm rpm -Uvh epel-release-1-1.ius.el5.noarch.rpm wget http://dl.iuscommunity.org/pub/ius/stable/Redhat/5/x86_64/ius-release-1-4.ius.el5.noarch.rpm rpm -Uvh ius-release-1-4.ius.el5.noarch.rpm yum list | grep -w \.ius\. # list all the packages in the IUS repository; use this to find PHP/MySQL version and libraries you want to install Remove old version of PHP and install newer version from IUS rpm -qa | grep php # to list all of the installed php packages we want to remove yum shell # open an interactive yum shell remove php-common php-mysql php-cli #remove installed PHP components install php53 php53-mysql php53-cli php53-common #add packages you want transaction solve #important!! checks for dependencies transaction run #important!! does the actual installation of packages. [control+d] #exit yum shell php -v PHP 5.3.2 (cli) (built: Apr 6 2010 18:13:45) Upgrade MySQL from IUS repository /etc/init.d/mysqld stop rpm -qa | grep mysql # to see installed mysql packages yum shell remove mysql mysql-server #remove installed MySQL components install mysql51 mysql51-server mysql51-devel transaction solve #important!! checks for dependencies transaction run #important!! does the actual installation of packages. [control+d] #exit yum shell service mysqld start mysql -v Server version: 5.1.42-ius Distributed by The IUS Community Project Upgrade instructions courtesy of IUS wiki: http://wiki.iuscommunity.org/Doc/ClientUsageGuide Install rssh (restricted shell) to provide scp and sftp access, without allowing ssh login cd /tmp wget http://dag.wieers.com/rpm/packages/rssh/rssh-2.3.2-1.2.el5.rf.x86_64.rpm rpm -ivh rssh-2.3.2-1.2.el5.rf.x86_64.rpm useradd -m -d /home/dev -s /usr/bin/rssh dev passwd dev Edit /etc/rssh.conf to grant access to SFTP to rssh users. vi /etc/rssh.conf Uncomment or add: allowscp allowsftp This allows me to connect to the machine via SFTP protocol in Transmit (my FTP program of choice; I'm sure it's similar with other FTP apps). rssh instructions appropriated (with appreciation!) from http://www.cyberciti.biz/tips/linux-unix-restrict-shell-access-with-rssh.html Set up virtual interfaces ifconfig eth1:1 192.168.1.3 up #start up the virtual interface cd /etc/sysconfig/network-scripts/ cp ifcfg-eth1 ifcfg-eth1:1 #copy default script and match name to our virtual interface vi ifcfg-eth1:1 #modify eth1:1 script #ifcfg-eth1:1 | modify so it looks like this: DEVICE=eth1:1 IPADDR=192.168.1.3 NETMASK=255.255.255.0 NETWORK=192.168.1.0 ONBOOT=yes NAME=eth1:1 Add more Virtual interfaces as needed by repeating. Because of the ONBOOT=yes line in the ifcfg-eth1:1 file, this interface will be brought up when the system boots, or the network starts/restarts. service network restart Shutting down interface eth0: [ OK ] Shutting down interface eth1: [ OK ] Shutting down loopback interface: [ OK ] Bringing up loopback interface: [ OK ] Bringing up interface eth0: [ OK ] Bringing up interface eth1: [ OK ] ping 192.168.1.3 64 bytes from 192.168.1.3: icmp_seq=1 ttl=64 time=0.105 ms Virtualhosts In the rssh section above I added a user to use for SFTP. In this users' home directory, I created a folder called 'https'. This is where the documents for this site will live, so I need to add a virtualhost that will point to it. I will use the above virtual interface for this site (herein called dev.site.local). vi /etc/http/conf/httpd.conf Add the following to the end of httpd.conf: <VirtualHost 192.168.1.3:80> ServerAdmin [email protected] DocumentRoot /home/dev/https ServerName dev.site.local ErrorLog /home/dev/logs/error_log TransferLog /home/dev/logs/access_log </VirtualHost> I put a dummy index.html file in the https directory just to check everything out. I tried browsing to it, and was met with permission denied errors. The logs only gave an obscure reference to what was going on: [Mon May 17 14:57:11 2010] [error] [client 192.168.1.100] (13)Permission denied: access to /index.html denied I tried chmod 777 et. al., but to no avail. Turns out, I needed to chmod+x the https directory and its' parent directories. chmod +x /home chmod +x /home/dev chmod +x /home/dev/https This solved that problem. DNS I'm handling DNS via our local Windows Server 2003 box. However, the CentOS documentation for BIND can be found here: http://www.centos.org/docs/5/html/Deployment_Guide-en-US/ch-bind.html SSL To get SSL working, I changed the following in httpd.conf: NameVirtualHost 192.168.1.3:443 #make sure this line is in httpd.conf <VirtualHost 192.168.1.3:443> #change port to 443 ServerAdmin [email protected] DocumentRoot /home/dev/https ServerName dev.site.local ErrorLog /home/dev/logs/error_log TransferLog /home/dev/logs/access_log </VirtualHost> Unfortunately, I keep getting (Error code: ssl_error_rx_record_too_long) errors when trying to access a page with SSL. As JamesHannah gracefully pointed out below, I had not set up the locations of the certs in httpd.conf, and thusly was getting the page thrown at the broswer as the cert making the browser balk. So first, I needed to set up a CA and make certificate files. I found a great (if old) walkthrough on the process here: http://www.debian-administration.org/articles/284. Here are the relevant steps I took from that article: mkdir /home/CA cd /home/CA/ mkdir newcerts private echo '01' > serial touch index.txt #this and the above command are for the database that will keep track of certs Create an openssl.cnf file in the /home/CA/ dir and edit it per the walkthrough linked above. (For reference, my finished openssl.cnf file looked like this: http://pastebin.com/raw.php?i=hnZDij4T) openssl req -new -x509 -extensions v3_ca -keyout private/cakey.pem -out cacert.pem -days 3650 -config ./openssl.cnf #this creates the cacert.pem which gets distributed and imported to the browser(s) Modified openssl.cnf again per walkthrough instructions. openssl req -new -nodes -out dev.req.pem -config ./openssl.cnf #generates certificate request, and key.pem which I renamed dev.key.pem. Modified openssl.cnf again per walkthrough instructions. openssl ca -out dev.cert.pem -config ./openssl.cnf -infiles dev.req.pem #create and sign certificate. cp dev.cert.pem /home/dev/certs/cert.pem cp dev.key.pem /home/certs/key.pem I updated httpd.conf to reflect the certs and turn SSLEngine on: NameVirtualHost 192.168.1.3:443 <VirtualHost 192.168.1.3:443> ServerAdmin [email protected] DocumentRoot /home/dev/https SSLEngine on SSLCertificateFile /home/dev/certs/cert.pem SSLCertificateKeyFile /home/dev/certs/key.pem ServerName dev.site.local ErrorLog /home/dev/logs/error_log TransferLog /home/dev/logs/access_log </VirtualHost> Put the CA cert.pem in a web-accessible place, and downloaded/imported it into my browser. Now I can visit https://dev.site.local with no errors or warnings. And this is where I'm at. I will keep editing this as I make progress. Any tips on how to configure SSL email would be appreciated.

Read the article
A way of doing real-world test-driven development (and some thoughts about it)

- by Thomas Weller

Lately, I exchanged some arguments with Derick Bailey about some details of the red-green-refactor cycle of the Test-driven development process. In short, the issue revolved around the fact that it’s not enough to have a test red or green, but it’s also important to have it red or green for the right reasons. While for me, it’s sufficient to initially have a NotImplementedException in place, Derick argues that this is not totally correct (see these two posts: Red/Green/Refactor, For The Right Reasons and Red For The Right Reason: Fail By Assertion, Not By Anything Else). And he’s right. But on the other hand, I had no idea how his insights could have any practical consequence for my own individual interpretation of the red-green-refactor cycle (which is not really red-green-refactor, at least not in its pure sense, see the rest of this article). This made me think deeply for some days now. In the end I found out that the ‘right reason’ changes in my understanding depending on what development phase I’m in. To make this clear (at least I hope it becomes clear…) I started to describe my way of working in some detail, and then something strange happened: The scope of the article slightly shifted from focusing ‘only’ on the ‘right reason’ issue to something more general, which you might describe as something like 'Doing real-world TDD in .NET , with massive use of third-party add-ins’. This is because I feel that there is a more general statement about Test-driven development to make: It’s high time to speak about the ‘How’ of TDD, not always only the ‘Why’. Much has been said about this, and me myself also contributed to that (see here: TDD is not about testing, it's about how we develop software). But always justifying what you do is very unsatisfying in the long run, it is inherently defensive, and it costs time and effort that could be used for better and more important things. And frankly: I’m somewhat sick and tired of repeating time and again that the test-driven way of software development is highly preferable for many reasons - I don’t want to spent my time exclusively on stating the obvious… So, again, let’s say it clearly: TDD is programming, and programming is TDD. Other ways of programming (code-first, sometimes called cowboy-coding) are exceptional and need justification. – I know that there are many people out there who will disagree with this radical statement, and I also know that it’s not a description of the real world but more of a mission statement or something. But nevertheless I’m absolutely sure that in some years this statement will be nothing but a platitude. Side note: Some parts of this post read as if I were paid by Jetbrains (the manufacturer of the ReSharper add-in – R#), but I swear I’m not. Rather I think that Visual Studio is just not production-complete without it, and I wouldn’t even consider to do professional work without having this add-in installed... The three parts of a software component Before I go into some details, I first should describe my understanding of what belongs to a software component (assembly, type, or method) during the production process (i.e. the coding phase). Roughly, I come up with the three parts shown below: First, we need to have some initial sort of requirement. This can be a multi-page formal document, a vague idea in some programmer’s brain of what might be needed, or anything in between. In either way, there has to be some sort of requirement, be it explicit or not. – At the C# micro-level, the best way that I found to formulate that is to define interfaces for just about everything, even for internal classes, and to provide them with exhaustive xml comments. The next step then is to re-formulate these requirements in an executable form. This is specific to the respective programming language. - For C#/.NET, the Gallio framework (which includes MbUnit) in conjunction with the ReSharper add-in for Visual Studio is my toolset of choice. The third part then finally is the production code itself. It’s development is entirely driven by the requirements and their executable formulation. This is the delivery, the two other parts are ‘only’ there to make its production possible, to give it a decent quality and reliability, and to significantly reduce related costs down the maintenance timeline. So while the first two parts are not really relevant for the customer, they are very important for the developer. The customer (or in Scrum terms: the Product Owner) is not interested at all in how the product is developed, he is only interested in the fact that it is developed as cost-effective as possible, and that it meets his functional and non-functional requirements. The rest is solely a matter of the developer’s craftsmanship, and this is what I want to talk about during the remainder of this article… An example To demonstrate my way of doing real-world TDD, I decided to show the development of a (very) simple Calculator component. The example is deliberately trivial and silly, as examples always are. I am totally aware of the fact that real life is never that simple, but I only want to show some development principles here… The requirement As already said above, I start with writing down some words on the initial requirement, and I normally use interfaces for that, even for internal classes - the typical question “intf or not” doesn’t even come to mind. I need them for my usual workflow and using them automatically produces high componentized and testable code anyway. To think about their usage in every single situation would slow down the production process unnecessarily. So this is what I begin with: namespace Calculator { /// <summary> /// Defines a very simple calculator component for demo purposes. /// </summary> public interface ICalculator { /// <summary> /// Gets the result of the last successful operation. /// </summary> /// <value>The last result.</value> /// <remarks> /// Will be <see langword="null" /> before the first successful operation. /// </remarks> double? LastResult { get; } } // interface ICalculator } // namespace Calculator So, I’m not beginning with a test, but with a sort of code declaration - and still I insist on being 100% test-driven. There are three important things here: Starting this way gives me a method signature, which allows to use IntelliSense and AutoCompletion and thus eliminates the danger of typos - one of the most regular, annoying, time-consuming, and therefore expensive sources of error in the development process. In my understanding, the interface definition as a whole is more of a readable requirement document and technical documentation than anything else. So this is at least as much about documentation than about coding. The documentation must completely describe the behavior of the documented element. I normally use an IoC container or some sort of self-written provider-like model in my architecture. In either case, I need my components defined via service interfaces anyway. - I will use the LinFu IoC framework here, for no other reason as that is is very simple to use. The ‘Red’ (pt. 1) First I create a folder for the project’s third-party libraries and put the LinFu.Core dll there. Then I set up a test project (via a Gallio project template), and add references to the Calculator project and the LinFu dll. Finally I’m ready to write the first test, which will look like the following: namespace Calculator.Test { [TestFixture] public class CalculatorTest { private readonly ServiceContainer container = new ServiceContainer(); [Test] public void CalculatorLastResultIsInitiallyNull() { ICalculator calculator = container.GetService<ICalculator>(); Assert.IsNull(calculator.LastResult); } } // class CalculatorTest } // namespace Calculator.Test This is basically the executable formulation of what the interface definition states (part of). Side note: There’s one principle of TDD that is just plain wrong in my eyes: I’m talking about the Red is 'does not compile' thing. How could a compiler error ever be interpreted as a valid test outcome? I never understood that, it just makes no sense to me. (Or, in Derick’s terms: this reason is as wrong as a reason ever could be…) A compiler error tells me: Your code is incorrect, but nothing more. Instead, the ‘Red’ part of the red-green-refactor cycle has a clearly defined meaning to me: It means that the test works as intended and fails only if its assumptions are not met for some reason. Back to our Calculator. When I execute the above test with R#, the Gallio plugin will give me this output: So this tells me that the test is red for the wrong reason: There’s no implementation that the IoC-container could load, of course. So let’s fix that. With R#, this is very easy: First, create an ICalculator - derived type: Next, implement the interface members: And finally, move the new class to its own file: So far my ‘work’ was six mouse clicks long, the only thing that’s left to do manually here, is to add the Ioc-specific wiring-declaration and also to make the respective class non-public, which I regularly do to force my components to communicate exclusively via interfaces: This is what my Calculator class looks like as of now: using System; using LinFu.IoC.Configuration; namespace Calculator { [Implements(typeof(ICalculator))] internal class Calculator : ICalculator { public double? LastResult { get { throw new NotImplementedException(); } } } } Back to the test fixture, we have to put our IoC container to work: [TestFixture] public class CalculatorTest { #region Fields private readonly ServiceContainer container = new ServiceContainer(); #endregion // Fields #region Setup/TearDown [FixtureSetUp] public void FixtureSetUp() { container.LoadFrom(AppDomain.CurrentDomain.BaseDirectory, "Calculator.dll"); } ... Because I have a R# live template defined for the setup/teardown method skeleton as well, the only manual coding here again is the IoC-specific stuff: two lines, not more… The ‘Red’ (pt. 2) Now, the execution of the above test gives the following result: This time, the test outcome tells me that the method under test is called. And this is the point, where Derick and I seem to have somewhat different views on the subject: Of course, the test still is worthless regarding the red/green outcome (or: it’s still red for the wrong reasons, in that it gives a false negative). But as far as I am concerned, I’m not really interested in the test outcome at this point of the red-green-refactor cycle. Rather, I only want to assert that my test actually calls the right method. If that’s the case, I will happily go on to the ‘Green’ part… The ‘Green’ Making the test green is quite trivial. Just make LastResult an automatic property: [Implements(typeof(ICalculator))] internal class Calculator : ICalculator { public double? LastResult { get; private set; } } One more round… Now on to something slightly more demanding (cough…). Let’s state that our Calculator exposes an Add() method: ... /// <summary> /// Adds the specified operands. /// </summary> /// <param name="operand1">The operand1.</param> /// <param name="operand2">The operand2.</param> /// <returns>The result of the additon.</returns> /// <exception cref="ArgumentException"> /// Argument <paramref name="operand1"/> is < 0.<br/> /// -- or --<br/> /// Argument <paramref name="operand2"/> is < 0. /// </exception> double Add(double operand1, double operand2); } // interface ICalculator A remark: I sometimes hear the complaint that xml comment stuff like the above is hard to read. That’s certainly true, but irrelevant to me, because I read xml code comments with the CR_Documentor tool window. And using that, it looks like this: Apart from that, I’m heavily using xml code comments (see e.g. here for a detailed guide) because there is the possibility of automating help generation with nightly CI builds (using MS Sandcastle and the Sandcastle Help File Builder), and then publishing the results to some intranet location. This way, a team always has first class, up-to-date technical documentation at hand about the current codebase. (And, also very important for speeding up things and avoiding typos: You have IntelliSense/AutoCompletion and R# support, and the comments are subject to compiler checking…). Back to our Calculator again: Two more R# – clicks implement the Add() skeleton: ... public double Add(double operand1, double operand2) { throw new NotImplementedException(); } } // class Calculator As we have stated in the interface definition (which actually serves as our requirement document!), the operands are not allowed to be negative. So let’s start implementing that. Here’s the test: [Test] [Row(-0.5, 2)] public void AddThrowsOnNegativeOperands(double operand1, double operand2) { ICalculator calculator = container.GetService<ICalculator>(); Assert.Throws<ArgumentException>(() => calculator.Add(operand1, operand2)); } As you can see, I’m using a data-driven unit test method here, mainly for these two reasons: Because I know that I will have to do the same test for the second operand in a few seconds, I save myself from implementing another test method for this purpose. Rather, I only will have to add another Row attribute to the existing one. From the test report below, you can see that the argument values are explicitly printed out. This can be a valuable documentation feature even when everything is green: One can quickly review what values were tested exactly - the complete Gallio HTML-report (as it will be produced by the Continuous Integration runs) shows these values in a quite clear format (see below for an example). Back to our Calculator development again, this is what the test result tells us at the moment: So we’re red again, because there is not yet an implementation… Next we go on and implement the necessary parameter verification to become green again, and then we do the same thing for the second operand. To make a long story short, here’s the test and the method implementation at the end of the second cycle: // in CalculatorTest: [Test] [Row(-0.5, 2)] [Row(295, -123)] public void AddThrowsOnNegativeOperands(double operand1, double operand2) { ICalculator calculator = container.GetService<ICalculator>(); Assert.Throws<ArgumentException>(() => calculator.Add(operand1, operand2)); } // in Calculator: public double Add(double operand1, double operand2) { if (operand1 < 0.0) { throw new ArgumentException("Value must not be negative.", "operand1"); } if (operand2 < 0.0) { throw new ArgumentException("Value must not be negative.", "operand2"); } throw new NotImplementedException(); } So far, we have sheltered our method from unwanted input, and now we can safely operate on the parameters without further caring about their validity (this is my interpretation of the Fail Fast principle, which is regarded here in more detail). Now we can think about the method’s successful outcomes. First let’s write another test for that: [Test] [Row(1, 1, 2)] public void TestAdd(double operand1, double operand2, double expectedResult) { ICalculator calculator = container.GetService<ICalculator>(); double result = calculator.Add(operand1, operand2); Assert.AreEqual(expectedResult, result); } Again, I’m regularly using row based test methods for these kinds of unit tests. The above shown pattern proved to be extremely helpful for my development work, I call it the Defined-Input/Expected-Output test idiom: You define your input arguments together with the expected method result. There are two major benefits from that way of testing: In the course of refining a method, it’s very likely to come up with additional test cases. In our case, we might add tests for some edge cases like ‘one of the operands is zero’ or ‘the sum of the two operands causes an overflow’, or maybe there’s an external test protocol that has to be fulfilled (e.g. an ISO norm for medical software), and this results in the need of testing against additional values. In all these scenarios we only have to add another Row attribute to the test. Remember that the argument values are written to the test report, so as a side-effect this produces valuable documentation. (This can become especially important if the fulfillment of some sort of external requirements has to be proven). So your test method might look something like that in the end: [Test, Description("Arguments: operand1, operand2, expectedResult")] [Row(1, 1, 2)] [Row(0, 999999999, 999999999)] [Row(0, 0, 0)] [Row(0, double.MaxValue, double.MaxValue)] [Row(4, double.MaxValue - 2.5, double.MaxValue)] public void TestAdd(double operand1, double operand2, double expectedResult) { ICalculator calculator = container.GetService<ICalculator>(); double result = calculator.Add(operand1, operand2); Assert.AreEqual(expectedResult, result); } And this will produce the following HTML report (with Gallio): Not bad for the amount of work we invested in it, huh? - There might be scenarios where reports like that can be useful for demonstration purposes during a Scrum sprint review… The last requirement to fulfill is that the LastResult property is expected to store the result of the last operation. I don’t show this here, it’s trivial enough and brings nothing new… And finally: Refactor (for the right reasons) To demonstrate my way of going through the refactoring portion of the red-green-refactor cycle, I added another method to our Calculator component, namely Subtract(). Here’s the code (tests and production): // CalculatorTest.cs: [Test, Description("Arguments: operand1, operand2, expectedResult")] [Row(1, 1, 0)] [Row(0, 999999999, -999999999)] [Row(0, 0, 0)] [Row(0, double.MaxValue, -double.MaxValue)] [Row(4, double.MaxValue - 2.5, -double.MaxValue)] public void TestSubtract(double operand1, double operand2, double expectedResult) { ICalculator calculator = container.GetService<ICalculator>(); double result = calculator.Subtract(operand1, operand2); Assert.AreEqual(expectedResult, result); } [Test, Description("Arguments: operand1, operand2, expectedResult")] [Row(1, 1, 0)] [Row(0, 999999999, -999999999)] [Row(0, 0, 0)] [Row(0, double.MaxValue, -double.MaxValue)] [Row(4, double.MaxValue - 2.5, -double.MaxValue)] public void TestSubtractGivesExpectedLastResult(double operand1, double operand2, double expectedResult) { ICalculator calculator = container.GetService<ICalculator>(); calculator.Subtract(operand1, operand2); Assert.AreEqual(expectedResult, calculator.LastResult); } ... // ICalculator.cs: /// <summary> /// Subtracts the specified operands. /// </summary> /// <param name="operand1">The operand1.</param> /// <param name="operand2">The operand2.</param> /// <returns>The result of the subtraction.</returns> /// <exception cref="ArgumentException"> /// Argument <paramref name="operand1"/> is < 0.<br/> /// -- or --<br/> /// Argument <paramref name="operand2"/> is < 0. /// </exception> double Subtract(double operand1, double operand2); ... // Calculator.cs: public double Subtract(double operand1, double operand2) { if (operand1 < 0.0) { throw new ArgumentException("Value must not be negative.", "operand1"); } if (operand2 < 0.0) { throw new ArgumentException("Value must not be negative.", "operand2"); } return (this.LastResult = operand1 - operand2).Value; } Obviously, the argument validation stuff that was produced during the red-green part of our cycle duplicates the code from the previous Add() method. So, to avoid code duplication and minimize the number of code lines of the production code, we do an Extract Method refactoring. One more time, this is only a matter of a few mouse clicks (and giving the new method a name) with R#: Having done that, our production code finally looks like that: using System; using LinFu.IoC.Configuration; namespace Calculator { [Implements(typeof(ICalculator))] internal class Calculator : ICalculator { #region ICalculator public double? LastResult { get; private set; } public double Add(double operand1, double operand2) { ThrowIfOneOperandIsInvalid(operand1, operand2); return (this.LastResult = operand1 + operand2).Value; } public double Subtract(double operand1, double operand2) { ThrowIfOneOperandIsInvalid(operand1, operand2); return (this.LastResult = operand1 - operand2).Value; } #endregion // ICalculator #region Implementation (Helper) private static void ThrowIfOneOperandIsInvalid(double operand1, double operand2) { if (operand1 < 0.0) { throw new ArgumentException("Value must not be negative.", "operand1"); } if (operand2 < 0.0) { throw new ArgumentException("Value must not be negative.", "operand2"); } } #endregion // Implementation (Helper) } // class Calculator } // namespace Calculator But is the above worth the effort at all? It’s obviously trivial and not very impressive. All our tests were green (for the right reasons), and refactoring the code did not change anything. It’s not immediately clear how this refactoring work adds value to the project. Derick puts it like this: STOP! Hold on a second… before you go any further and before you even think about refactoring what you just wrote to make your test pass, you need to understand something: if your done with your requirements after making the test green, you are not required to refactor the code. I know… I’m speaking heresy, here. Toss me to the wolves, I’ve gone over to the dark side! Seriously, though… if your test is passing for the right reasons, and you do not need to write any test or any more code for you class at this point, what value does refactoring add? Derick immediately answers his own question: So why should you follow the refactor portion of red/green/refactor? When you have added code that makes the system less readable, less understandable, less expressive of the domain or concern’s intentions, less architecturally sound, less DRY, etc, then you should refactor it. I couldn’t state it more precise. From my personal perspective, I’d add the following: You have to keep in mind that real-world software systems are usually quite large and there are dozens or even hundreds of occasions where micro-refactorings like the above can be applied. It’s the sum of them all that counts. And to have a good overall quality of the system (e.g. in terms of the Code Duplication Percentage metric) you have to be pedantic on the individual, seemingly trivial cases. My job regularly requires the reading and understanding of ‘foreign’ code. So code quality/readability really makes a HUGE difference for me – sometimes it can be even the difference between project success and failure… Conclusions The above described development process emerged over the years, and there were mainly two things that guided its evolution (you might call it eternal principles, personal beliefs, or anything in between): Test-driven development is the normal, natural way of writing software, code-first is exceptional. So ‘doing TDD or not’ is not a question. And good, stable code can only reliably be produced by doing TDD (yes, I know: many will strongly disagree here again, but I’ve never seen high-quality code – and high-quality code is code that stood the test of time and causes low maintenance costs – that was produced code-first…) It’s the production code that pays our bills in the end. (Though I have seen customers these days who demand an acceptance test battery as part of the final delivery. Things seem to go into the right direction…). The test code serves ‘only’ to make the production code work. But it’s the number of delivered features which solely counts at the end of the day - no matter how much test code you wrote or how good it is. With these two things in mind, I tried to optimize my coding process for coding speed – or, in business terms: productivity - without sacrificing the principles of TDD (more than I’d do either way…). As a result, I consider a ratio of about 3-5/1 for test code vs. production code as normal and desirable. In other words: roughly 60-80% of my code is test code (This might sound heavy, but that is mainly due to the fact that software development standards only begin to evolve. The entire software development profession is very young, historically seen; only at the very beginning, and there are no viable standards yet. If you think about software development as a kind of casting process, where the test code is the mold and the resulting production code is the final product, then the above ratio sounds no longer extraordinary…) Although the above might look like very much unnecessary work at first sight, it’s not. With the aid of the mentioned add-ins, doing all the above is a matter of minutes, sometimes seconds (while writing this post took hours and days…). The most important thing is to have the right tools at hand. Slow developer machines or the lack of a tool or something like that - for ‘saving’ a few 100 bucks - is just not acceptable and a very bad decision in business terms (though I quite some times have seen and heard that…). Production of high-quality products needs the usage of high-quality tools. This is a platitude that every craftsman knows… The here described round-trip will take me about five to ten minutes in my real-world development practice. I guess it’s about 30% more time compared to developing the ‘traditional’ (code-first) way. But the so manufactured ‘product’ is of much higher quality and massively reduces maintenance costs, which is by far the single biggest cost factor, as I showed in this previous post: It's the maintenance, stupid! (or: Something is rotten in developerland.). In the end, this is a highly cost-effective way of software development… But on the other hand, there clearly is a trade-off here: coding speed vs. code quality/later maintenance costs. The here described development method might be a perfect fit for the overwhelming majority of software projects, but there certainly are some scenarios where it’s not - e.g. if time-to-market is crucial for a software project. So this is a business decision in the end. It’s just that you have to know what you’re doing and what consequences this might have… Some last words First, I’d like to thank Derick Bailey again. His two aforementioned posts (which I strongly recommend for reading) inspired me to think deeply about my own personal way of doing TDD and to clarify my thoughts about it. I wouldn’t have done that without this inspiration. I really enjoy that kind of discussions… I agree with him in all respects. But I don’t know (yet?) how to bring his insights into the described production process without slowing things down. The above described method proved to be very “good enough” in my practical experience. But of course, I’m open to suggestions here… My rationale for now is: If the test is initially red during the red-green-refactor cycle, the ‘right reason’ is: it actually calls the right method, but this method is not yet operational. Later on, when the cycle is finished and the tests become part of the regular, automated Continuous Integration process, ‘red’ certainly must occur for the ‘right reason’: in this phase, ‘red’ MUST mean nothing but an unfulfilled assertion - Fail By Assertion, Not By Anything Else!

Read the article
Using R to Analyze G1GC Log Files

- by user12620111

Using R to Analyze G1GC Log Files body, td { font-family: sans-serif; background-color: white; font-size: 12px; margin: 8px; } tt, code, pre { font-family: 'DejaVu Sans Mono', 'Droid Sans Mono', 'Lucida Console', Consolas, Monaco, monospace; } h1 { font-size:2.2em; } h2 { font-size:1.8em; } h3 { font-size:1.4em; } h4 { font-size:1.0em; } h5 { font-size:0.9em; } h6 { font-size:0.8em; } a:visited { color: rgb(50%, 0%, 50%); } pre { margin-top: 0; max-width: 95%; border: 1px solid #ccc; white-space: pre-wrap; } pre code { display: block; padding: 0.5em; } code.r, code.cpp { background-color: #F8F8F8; } table, td, th { border: none; } blockquote { color:#666666; margin:0; padding-left: 1em; border-left: 0.5em #EEE solid; } hr { height: 0px; border-bottom: none; border-top-width: thin; border-top-style: dotted; border-top-color: #999999; } @media print { * { background: transparent !important; color: black !important; filter:none !important; -ms-filter: none !important; } body { font-size:12pt; max-width:100%; } a, a:visited { text-decoration: underline; } hr { visibility: hidden; page-break-before: always; } pre, blockquote { padding-right: 1em; page-break-inside: avoid; } tr, img { page-break-inside: avoid; } img { max-width: 100% !important; } @page :left { margin: 15mm 20mm 15mm 10mm; } @page :right { margin: 15mm 10mm 15mm 20mm; } p, h2, h3 { orphans: 3; widows: 3; } h2, h3 { page-break-after: avoid; } } pre .operator, pre .paren { color: rgb(104, 118, 135) } pre .literal { color: rgb(88, 72, 246) } pre .number { color: rgb(0, 0, 205); } pre .comment { color: rgb(76, 136, 107); } pre .keyword { color: rgb(0, 0, 255); } pre .identifier { color: rgb(0, 0, 0); } pre .string { color: rgb(3, 106, 7); } var hljs=new function(){function m(p){return p.replace(/&/gm,"&").replace(/"}while(y.length||w.length){var v=u().splice(0,1)[0];z+=m(x.substr(q,v.offset-q));q=v.offset;if(v.event=="start"){z+=t(v.node);s.push(v.node)}else{if(v.event=="stop"){var p,r=s.length;do{r--;p=s[r];z+=("")}while(p!=v.node);s.splice(r,1);while(r'+M[0]+""}else{r+=M[0]}O=P.lR.lastIndex;M=P.lR.exec(L)}return r+L.substr(O,L.length-O)}function J(L,M){if(M.sL&&e[M.sL]){var r=d(M.sL,L);x+=r.keyword_count;return r.value}else{return F(L,M)}}function I(M,r){var L=M.cN?'':"";if(M.rB){y+=L;M.buffer=""}else{if(M.eB){y+=m(r)+L;M.buffer=""}else{y+=L;M.buffer=r}}D.push(M);A+=M.r}function G(N,M,Q){var R=D[D.length-1];if(Q){y+=J(R.buffer+N,R);return false}var P=q(M,R);if(P){y+=J(R.buffer+N,R);I(P,M);return P.rB}var L=v(D.length-1,M);if(L){var O=R.cN?"":"";if(R.rE){y+=J(R.buffer+N,R)+O}else{if(R.eE){y+=J(R.buffer+N,R)+O+m(M)}else{y+=J(R.buffer+N+M,R)+O}}while(L1){O=D[D.length-2].cN?"":"";y+=O;L--;D.length--}var r=D[D.length-1];D.length--;D[D.length-1].buffer="";if(r.starts){I(r.starts,"")}return R.rE}if(w(M,R)){throw"Illegal"}}var E=e[B];var D=[E.dM];var A=0;var x=0;var y="";try{var s,u=0;E.dM.buffer="";do{s=p(C,u);var t=G(s[0],s[1],s[2]);u+=s[0].length;if(!t){u+=s[1].length}}while(!s[2]);if(D.length1){throw"Illegal"}return{r:A,keyword_count:x,value:y}}catch(H){if(H=="Illegal"){return{r:0,keyword_count:0,value:m(C)}}else{throw H}}}function g(t){var p={keyword_count:0,r:0,value:m(t)};var r=p;for(var q in e){if(!e.hasOwnProperty(q)){continue}var s=d(q,t);s.language=q;if(s.keyword_count+s.rr.keyword_count+r.r){r=s}if(s.keyword_count+s.rp.keyword_count+p.r){r=p;p=s}}if(r.language){p.second_best=r}return p}function i(r,q,p){if(q){r=r.replace(/^((]+|\t)+)/gm,function(t,w,v,u){return w.replace(/\t/g,q)})}if(p){r=r.replace(/\n/g,"")}return r}function n(t,w,r){var x=h(t,r);var v=a(t);var y,s;if(v){y=d(v,x)}else{return}var q=c(t);if(q.length){s=document.createElement("pre");s.innerHTML=y.value;y.value=k(q,c(s),x)}y.value=i(y.value,w,r);var u=t.className;if(!u.match("(\\s|^)(language-)?"+v+"(\\s|$)")){u=u?(u+" "+v):v}if(/MSIE [678]/.test(navigator.userAgent)&&t.tagName=="CODE"&&t.parentNode.tagName=="PRE"){s=t.parentNode;var p=document.createElement("div");p.innerHTML=""+y.value+"";t=p.firstChild.firstChild;p.firstChild.cN=s.cN;s.parentNode.replaceChild(p.firstChild,s)}else{t.innerHTML=y.value}t.className=u;t.result={language:v,kw:y.keyword_count,re:y.r};if(y.second_best){t.second_best={language:y.second_best.language,kw:y.second_best.keyword_count,re:y.second_best.r}}}function o(){if(o.called){return}o.called=true;var r=document.getElementsByTagName("pre");for(var p=0;p|=||=||=|\\?|\\[|\\{|\\(|\\^|\\^=|\\||\\|=|\\|\\||~";this.ER="(?![\\s\\S])";this.BE={b:"\\\\.",r:0};this.ASM={cN:"string",b:"'",e:"'",i:"\\n",c:[this.BE],r:0};this.QSM={cN:"string",b:'"',e:'"',i:"\\n",c:[this.BE],r:0};this.CLCM={cN:"comment",b:"//",e:"$"};this.CBLCLM={cN:"comment",b:"/\\*",e:"\\*/"};this.HCM={cN:"comment",b:"#",e:"$"};this.NM={cN:"number",b:this.NR,r:0};this.CNM={cN:"number",b:this.CNR,r:0};this.BNM={cN:"number",b:this.BNR,r:0};this.inherit=function(r,s){var p={};for(var q in r){p[q]=r[q]}if(s){for(var q in s){p[q]=s[q]}}return p}}();hljs.LANGUAGES.cpp=function(){var a={keyword:{"false":1,"int":1,"float":1,"while":1,"private":1,"char":1,"catch":1,"export":1,virtual:1,operator:2,sizeof:2,dynamic_cast:2,typedef:2,const_cast:2,"const":1,struct:1,"for":1,static_cast:2,union:1,namespace:1,unsigned:1,"long":1,"throw":1,"volatile":2,"static":1,"protected":1,bool:1,template:1,mutable:1,"if":1,"public":1,friend:2,"do":1,"return":1,"goto":1,auto:1,"void":2,"enum":1,"else":1,"break":1,"new":1,extern:1,using:1,"true":1,"class":1,asm:1,"case":1,typeid:1,"short":1,reinterpret_cast:2,"default":1,"double":1,register:1,explicit:1,signed:1,typename:1,"try":1,"this":1,"switch":1,"continue":1,wchar_t:1,inline:1,"delete":1,alignof:1,char16_t:1,char32_t:1,constexpr:1,decltype:1,noexcept:1,nullptr:1,static_assert:1,thread_local:1,restrict:1,_Bool:1,complex:1},built_in:{std:1,string:1,cin:1,cout:1,cerr:1,clog:1,stringstream:1,istringstream:1,ostringstream:1,auto_ptr:1,deque:1,list:1,queue:1,stack:1,vector:1,map:1,set:1,bitset:1,multiset:1,multimap:1,unordered_set:1,unordered_map:1,unordered_multiset:1,unordered_multimap:1,array:1,shared_ptr:1}};return{dM:{k:a,i:"",k:a,r:10,c:["self"]}]}}}();hljs.LANGUAGES.r={dM:{c:[hljs.HCM,{cN:"number",b:"\\b0[xX][0-9a-fA-F]+[Li]?\\b",e:hljs.IMMEDIATE_RE,r:0},{cN:"number",b:"\\b\\d+(?:[eE][+\\-]?\\d*)?L\\b",e:hljs.IMMEDIATE_RE,r:0},{cN:"number",b:"\\b\\d+\\.(?!\\d)(?:i\\b)?",e:hljs.IMMEDIATE_RE,r:1},{cN:"number",b:"\\b\\d+(?:\\.\\d*)?(?:[eE][+\\-]?\\d*)?i?\\b",e:hljs.IMMEDIATE_RE,r:0},{cN:"number",b:"\\.\\d+(?:[eE][+\\-]?\\d*)?i?\\b",e:hljs.IMMEDIATE_RE,r:1},{cN:"keyword",b:"(?:tryCatch|library|setGeneric|setGroupGeneric)\\b",e:hljs.IMMEDIATE_RE,r:10},{cN:"keyword",b:"\\.\\.\\.",e:hljs.IMMEDIATE_RE,r:10},{cN:"keyword",b:"\\.\\.\\d+(?![\\w.])",e:hljs.IMMEDIATE_RE,r:10},{cN:"keyword",b:"\\b(?:function)",e:hljs.IMMEDIATE_RE,r:2},{cN:"keyword",b:"(?:if|in|break|next|repeat|else|for|return|switch|while|try|stop|warning|require|attach|detach|source|setMethod|setClass)\\b",e:hljs.IMMEDIATE_RE,r:1},{cN:"literal",b:"(?:NA|NA_integer_|NA_real_|NA_character_|NA_complex_)\\b",e:hljs.IMMEDIATE_RE,r:10},{cN:"literal",b:"(?:NULL|TRUE|FALSE|T|F|Inf|NaN)\\b",e:hljs.IMMEDIATE_RE,r:1},{cN:"identifier",b:"[a-zA-Z.][a-zA-Z0-9._]*\\b",e:hljs.IMMEDIATE_RE,r:0},{cN:"operator",b:"|=|| Using R to Analyze G1GC Log Files Using R to Analyze G1GC Log Files Introduction Working in Oracle Platform Integration gives an engineer opportunities to work on a wide array of technologies. My team’s goal is to make Oracle applications run best on the Solaris/SPARC platform. When looking for bottlenecks in a modern applications, one needs to be aware of not only how the CPUs and operating system are executing, but also network, storage, and in some cases, the Java Virtual Machine. I was recently presented with about 1.5 GB of Java Garbage First Garbage Collector log file data. If you’re not familiar with the subject, you might want to review Garbage First Garbage Collector Tuning by Monica Beckwith. The customer had been running Java HotSpot 1.6.0_31 to host a web application server. I was told that the Solaris/SPARC server was running a Java process launched using a commmand line that included the following flags: -d64 -Xms9g -Xmx9g -XX:+UseG1GC -XX:MaxGCPauseMillis=200 -XX:InitiatingHeapOccupancyPercent=80 -XX:PermSize=256m -XX:MaxPermSize=256m -XX:+PrintGC -XX:+PrintGCTimeStamps -XX:+PrintHeapAtGC -XX:+PrintGCDateStamps -XX:+PrintFlagsFinal -XX:+DisableExplicitGC -XX:+UnlockExperimentalVMOptions -XX:ParallelGCThreads=8 Several sources on the internet indicate that if I were to print out the 1.5 GB of log files, it would require enough paper to fill the bed of a pick up truck. Of course, it would be fruitless to try to scan the log files by hand. Tools will be required to summarize the contents of the log files. Others have encountered large Java garbage collection log files. There are existing tools to analyze the log files: IBM’s GC toolkit The chewiebug GCViewer gchisto HPjmeter Instead of using one of the other tools listed, I decide to parse the log files with standard Unix tools, and analyze the data with R. Data Cleansing The log files arrived in two different formats. I guess that the difference is that one set of log files was generated using a more verbose option, maybe -XX:+PrintHeapAtGC, and the other set of log files was generated without that option. Format 1 In some of the log files, the log files with the less verbose format, a single trace, i.e. the report of a singe garbage collection event, looks like this: {Heap before GC invocations=12280 (full 61): garbage-first heap total 9437184K, used 7499918K [0xfffffffd00000000, 0xffffffff40000000, 0xffffffff40000000) region size 4096K, 1 young (4096K), 0 survivors (0K) compacting perm gen total 262144K, used 144077K [0xffffffff40000000, 0xffffffff50000000, 0xffffffff50000000) the space 262144K, 54% used [0xffffffff40000000, 0xffffffff48cb3758, 0xffffffff48cb3800, 0xffffffff50000000) No shared spaces configured. 2014-05-14T07:24:00.988-0700: 60586.353: [GC pause (young) 7324M->7320M(9216M), 0.1567265 secs] Heap after GC invocations=12281 (full 61): garbage-first heap total 9437184K, used 7496533K [0xfffffffd00000000, 0xffffffff40000000, 0xffffffff40000000) region size 4096K, 0 young (0K), 0 survivors (0K) compacting perm gen total 262144K, used 144077K [0xffffffff40000000, 0xffffffff50000000, 0xffffffff50000000) the space 262144K, 54% used [0xffffffff40000000, 0xffffffff48cb3758, 0xffffffff48cb3800, 0xffffffff50000000) No shared spaces configured. } A simple grep can be used to extract a summary: $ grep "\[ GC pause (young" g1gc.log 2014-05-13T13:24:35.091-0700: 3.109: [GC pause (young) 20M->5029K(9216M), 0.0146328 secs] 2014-05-13T13:24:35.440-0700: 3.459: [GC pause (young) 9125K->6077K(9216M), 0.0086723 secs] 2014-05-13T13:24:37.581-0700: 5.599: [GC pause (young) 25M->8470K(9216M), 0.0203820 secs] 2014-05-13T13:24:42.686-0700: 10.704: [GC pause (young) 44M->15M(9216M), 0.0288848 secs] 2014-05-13T13:24:48.941-0700: 16.958: [GC pause (young) 51M->20M(9216M), 0.0491244 secs] 2014-05-13T13:24:56.049-0700: 24.066: [GC pause (young) 92M->26M(9216M), 0.0525368 secs] 2014-05-13T13:25:34.368-0700: 62.383: [GC pause (young) 602M->68M(9216M), 0.1721173 secs] But that format wasn't easily read into R, so I needed to be a bit more tricky. I used the following Unix command to create a summary file that was easy for R to read. $ echo "SecondsSinceLaunch BeforeSize AfterSize TotalSize RealTime" $ grep "\[GC pause (young" g1gc.log | grep -v mark | sed -e 's/[A-SU-z,]/ /g' -e 's/->/ /' -e 's/: / /g' | more SecondsSinceLaunch BeforeSize AfterSize TotalSize RealTime 2014-05-13T13:24:35.091-0700 3.109 20 5029 9216 0.0146328 2014-05-13T13:24:35.440-0700 3.459 9125 6077 9216 0.0086723 2014-05-13T13:24:37.581-0700 5.599 25 8470 9216 0.0203820 2014-05-13T13:24:42.686-0700 10.704 44 15 9216 0.0288848 2014-05-13T13:24:48.941-0700 16.958 51 20 9216 0.0491244 2014-05-13T13:24:56.049-0700 24.066 92 26 9216 0.0525368 2014-05-13T13:25:34.368-0700 62.383 602 68 9216 0.1721173 Format 2 In some of the log files, the log files with the more verbose format, a single trace, i.e. the report of a singe garbage collection event, was more complicated than Format 1. Here is a text file with an example of a single G1GC trace in the second format. As you can see, it is quite complicated. It is nice that there is so much information available, but the level of detail can be overwhelming. I wrote this awk script (download) to summarize each trace on a single line. #!/usr/bin/env awk -f BEGIN { printf("SecondsSinceLaunch IncrementalCount FullCount UserTime SysTime RealTime BeforeSize AfterSize TotalSize\n") } ###################### # Save count data from lines that are at the start of each G1GC trace. # Each trace starts out like this: # {Heap before GC invocations=14 (full 0): # garbage-first heap total 9437184K, used 325496K [0xfffffffd00000000, 0xffffffff40000000, 0xffffffff40000000) ###################### /{Heap.*full/{ gsub ( "\\)" , "" ); nf=split($0,a,"="); split(a[2],b," "); getline; if ( match($0, "first") ) { G1GC=1; IncrementalCount=b[1]; FullCount=substr( b[3], 1, length(b[3])-1 ); } else { G1GC=0; } } ###################### # Pull out time stamps that are in lines with this format: # 2014-05-12T14:02:06.025-0700: 94.312: [GC pause (young), 0.08870154 secs] ###################### /GC pause/ { DateTime=$1; SecondsSinceLaunch=substr($2, 1, length($2)-1); } ###################### # Heap sizes are in lines that look like this: # [ 4842M->4838M(9216M)] ###################### /\[ .*]$/ { gsub ( "\\[" , "" ); gsub ( "\ \]" , "" ); gsub ( "->" , " " ); gsub ( "\$ " , " " ); gsub ( "\ $" , " " ); split($0,a," "); if ( split(a[1],b,"M") > 1 ) {BeforeSize=b[1]*1024;} if ( split(a[1],b,"K") > 1 ) {BeforeSize=b[1];} if ( split(a[2],b,"M") > 1 ) {AfterSize=b[1]*1024;} if ( split(a[2],b,"K") > 1 ) {AfterSize=b[1];} if ( split(a[3],b,"M") > 1 ) {TotalSize=b[1]*1024;} if ( split(a[3],b,"K") > 1 ) {TotalSize=b[1];} } ###################### # Emit an output line when you find input that looks like this: # [Times: user=1.41 sys=0.08, real=0.24 secs] ###################### /\[Times/ { if (G1GC==1) { gsub ( "," , "" ); split($2,a,"="); UserTime=a[2]; split($3,a,"="); SysTime=a[2]; split($4,a,"="); RealTime=a[2]; print DateTime,SecondsSinceLaunch,IncrementalCount,FullCount,UserTime,SysTime,RealTime,BeforeSize,AfterSize,TotalSize; G1GC=0; } } The resulting summary is about 25X smaller that the original file, but still difficult for a human to digest. SecondsSinceLaunch IncrementalCount FullCount UserTime SysTime RealTime BeforeSize AfterSize TotalSize ... 2014-05-12T18:36:34.669-0700: 3985.744 561 0 0.57 0.06 0.16 1724416 1720320 9437184 2014-05-12T18:36:34.839-0700: 3985.914 562 0 0.51 0.06 0.19 1724416 1720320 9437184 2014-05-12T18:36:35.069-0700: 3986.144 563 0 0.60 0.04 0.27 1724416 1721344 9437184 2014-05-12T18:36:35.354-0700: 3986.429 564 0 0.33 0.04 0.09 1725440 1722368 9437184 2014-05-12T18:36:35.545-0700: 3986.620 565 0 0.58 0.04 0.17 1726464 1722368 9437184 2014-05-12T18:36:35.726-0700: 3986.801 566 0 0.43 0.05 0.12 1726464 1722368 9437184 2014-05-12T18:36:35.856-0700: 3986.930 567 0 0.30 0.04 0.07 1726464 1723392 9437184 2014-05-12T18:36:35.947-0700: 3987.023 568 0 0.61 0.04 0.26 1727488 1723392 9437184 2014-05-12T18:36:36.228-0700: 3987.302 569 0 0.46 0.04 0.16 1731584 1724416 9437184 Reading the Data into R Once the GC log data had been cleansed, either by processing the first format with the shell script, or by processing the second format with the awk script, it was easy to read the data into R. g1gc.df = read.csv("summary.txt", row.names = NULL, stringsAsFactors=FALSE,sep="") str(g1gc.df) ## 'data.frame': 8307 obs. of 10 variables: ## $ row.names : chr "2014-05-12T14:00:32.868-0700:" "2014-05-12T14:00:33.179-0700:" "2014-05-12T14:00:33.677-0700:" "2014-05-12T14:00:35.538-0700:" ... ## $ SecondsSinceLaunch: num 1.16 1.47 1.97 3.83 6.1 ... ## $ IncrementalCount : int 0 1 2 3 4 5 6 7 8 9 ... ## $ FullCount : int 0 0 0 0 0 0 0 0 0 0 ... ## $ UserTime : num 0.11 0.05 0.04 0.21 0.08 0.26 0.31 0.33 0.34 0.56 ... ## $ SysTime : num 0.04 0.01 0.01 0.05 0.01 0.06 0.07 0.06 0.07 0.09 ... ## $ RealTime : num 0.02 0.02 0.01 0.04 0.02 0.04 0.05 0.04 0.04 0.06 ... ## $ BeforeSize : int 8192 5496 5768 22528 24576 43008 34816 53248 55296 93184 ... ## $ AfterSize : int 1400 1672 2557 4907 7072 14336 16384 18432 19456 21504 ... ## $ TotalSize : int 9437184 9437184 9437184 9437184 9437184 9437184 9437184 9437184 9437184 9437184 ... head(g1gc.df) ## row.names SecondsSinceLaunch IncrementalCount ## 1 2014-05-12T14:00:32.868-0700: 1.161 0 ## 2 2014-05-12T14:00:33.179-0700: 1.472 1 ## 3 2014-05-12T14:00:33.677-0700: 1.969 2 ## 4 2014-05-12T14:00:35.538-0700: 3.830 3 ## 5 2014-05-12T14:00:37.811-0700: 6.103 4 ## 6 2014-05-12T14:00:41.428-0700: 9.720 5 ## FullCount UserTime SysTime RealTime BeforeSize AfterSize TotalSize ## 1 0 0.11 0.04 0.02 8192 1400 9437184 ## 2 0 0.05 0.01 0.02 5496 1672 9437184 ## 3 0 0.04 0.01 0.01 5768 2557 9437184 ## 4 0 0.21 0.05 0.04 22528 4907 9437184 ## 5 0 0.08 0.01 0.02 24576 7072 9437184 ## 6 0 0.26 0.06 0.04 43008 14336 9437184 Basic Statistics Once the data has been read into R, simple statistics are very easy to generate. All of the numbers from high school statistics are available via simple commands. For example, generate a summary of every column: summary(g1gc.df) ## row.names SecondsSinceLaunch IncrementalCount FullCount ## Length:8307 Min. : 1 Min. : 0 Min. : 0.0 ## Class :character 1st Qu.: 9977 1st Qu.:2048 1st Qu.: 0.0 ## Mode :character Median :12855 Median :4136 Median : 12.0 ## Mean :12527 Mean :4156 Mean : 31.6 ## 3rd Qu.:15758 3rd Qu.:6262 3rd Qu.: 61.0 ## Max. :55484 Max. :8391 Max. :113.0 ## UserTime SysTime RealTime BeforeSize ## Min. :0.040 Min. :0.0000 Min. : 0.0 Min. : 5476 ## 1st Qu.:0.470 1st Qu.:0.0300 1st Qu.: 0.1 1st Qu.:5137920 ## Median :0.620 Median :0.0300 Median : 0.1 Median :6574080 ## Mean :0.751 Mean :0.0355 Mean : 0.3 Mean :5841855 ## 3rd Qu.:0.920 3rd Qu.:0.0400 3rd Qu.: 0.2 3rd Qu.:7084032 ## Max. :3.370 Max. :1.5600 Max. :488.1 Max. :8696832 ## AfterSize TotalSize ## Min. : 1380 Min. :9437184 ## 1st Qu.:5002752 1st Qu.:9437184 ## Median :6559744 Median :9437184 ## Mean :5785454 Mean :9437184 ## 3rd Qu.:7054336 3rd Qu.:9437184 ## Max. :8482816 Max. :9437184 Q: What is the total amount of User CPU time spent in garbage collection? sum(g1gc.df$UserTime) ## [1] 6236 As you can see, less than two hours of CPU time was spent in garbage collection. Is that too much? To find the percentage of time spent in garbage collection, divide the number above by total_elapsed_time*CPU_count. In this case, there are a lot of CPU’s and it turns out the the overall amount of CPU time spent in garbage collection isn’t a problem when viewed in isolation. When calculating rates, i.e. events per unit time, you need to ask yourself if the rate is homogenous across the time period in the log file. Does the log file include spikes of high activity that should be separately analyzed? Averaging in data from nights and weekends with data from business hours may alias problems. If you have a reason to suspect that the garbage collection rates include peaks and valleys that need independent analysis, see the “Time Series” section, below. Q: How much garbage is collected on each pass? The amount of heap space that is recovered per GC pass is surprisingly low: At least one collection didn’t recover any data. (“Min.=0”) 25% of the passes recovered 3MB or less. (“1st Qu.=3072”) Half of the GC passes recovered 4MB or less. (“Median=4096”) The average amount recovered was 56MB. (“Mean=56390”) 75% of the passes recovered 36MB or less. (“3rd Qu.=36860”) At least one pass recovered 2GB. (“Max.=2121000”) g1gc.df$Delta = g1gc.df$BeforeSize - g1gc.df$AfterSize summary(g1gc.df$Delta) ## Min. 1st Qu. Median Mean 3rd Qu. Max. ## 0 3070 4100 56400 36900 2120000 Q: What is the maximum User CPU time for a single collection? The worst garbage collection (“Max.”) is many standard deviations away from the mean. The data appears to be right skewed. summary(g1gc.df$UserTime) ## Min. 1st Qu. Median Mean 3rd Qu. Max. ## 0.040 0.470 0.620 0.751 0.920 3.370 sd(g1gc.df$UserTime) ## [1] 0.3966 Basic Graphics Once the data is in R, it is trivial to plot the data with formats including dot plots, line charts, bar charts (simple, stacked, grouped), pie charts, boxplots, scatter plots histograms, and kernel density plots. Histogram of User CPU Time per Collection I don't think that this graph requires any explanation. hist(g1gc.df$UserTime, main="User CPU Time per Collection", xlab="Seconds", ylab="Frequency") Box plot to identify outliers When the initial data is viewed with a box plot, you can see the one crazy outlier in the real time per GC. Save this data point for future analysis and drop the outlier so that it’s not throwing off our statistics. Now the box plot shows many outliers, which will be examined later, using times series analysis. Notice that the scale of the x-axis changes drastically once the crazy outlier is removed. par(mfrow=c(2,1)) boxplot(g1gc.df$UserTime,g1gc.df$SysTime,g1gc.df$RealTime, main="Box Plot of Time per GC\n(dominated by a crazy outlier)", names=c("usr","sys","elapsed"), xlab="Seconds per GC", ylab="Time (Seconds)", horizontal = TRUE, outcol="red") crazy.outlier.df=g1gc.df[g1gc.df$RealTime > 400,] g1gc.df=g1gc.df[g1gc.df$RealTime < 400,] boxplot(g1gc.df$UserTime,g1gc.df$SysTime,g1gc.df$RealTime, main="Box Plot of Time per GC\n(crazy outlier excluded)", names=c("usr","sys","elapsed"), xlab="Seconds per GC", ylab="Time (Seconds)", horizontal = TRUE, outcol="red") box(which = "outer", lty = "solid") Here is the crazy outlier for future analysis: crazy.outlier.df ## row.names SecondsSinceLaunch IncrementalCount ## 8233 2014-05-12T23:15:43.903-0700: 20741 8316 ## FullCount UserTime SysTime RealTime BeforeSize AfterSize TotalSize ## 8233 112 0.55 0.42 488.1 8381440 8235008 9437184 ## Delta ## 8233 146432 R Time Series Data To analyze the garbage collection as a time series, I’ll use Z’s Ordered Observations (zoo). “zoo is the creator for an S3 class of indexed totally ordered observations which includes irregular time series.” require(zoo) ## Loading required package: zoo ## ## Attaching package: 'zoo' ## ## The following objects are masked from 'package:base': ## ## as.Date, as.Date.numeric head(g1gc.df[,1]) ## [1] "2014-05-12T14:00:32.868-0700:" "2014-05-12T14:00:33.179-0700:" ## [3] "2014-05-12T14:00:33.677-0700:" "2014-05-12T14:00:35.538-0700:" ## [5] "2014-05-12T14:00:37.811-0700:" "2014-05-12T14:00:41.428-0700:" options("digits.secs"=3) times=as.POSIXct( g1gc.df[,1], format="%Y-%m-%dT%H:%M:%OS%z:") g1gc.z = zoo(g1gc.df[,-c(1)], order.by=times) head(g1gc.z) ## SecondsSinceLaunch IncrementalCount FullCount ## 2014-05-12 17:00:32.868 1.161 0 0 ## 2014-05-12 17:00:33.178 1.472 1 0 ## 2014-05-12 17:00:33.677 1.969 2 0 ## 2014-05-12 17:00:35.538 3.830 3 0 ## 2014-05-12 17:00:37.811 6.103 4 0 ## 2014-05-12 17:00:41.427 9.720 5 0 ## UserTime SysTime RealTime BeforeSize AfterSize ## 2014-05-12 17:00:32.868 0.11 0.04 0.02 8192 1400 ## 2014-05-12 17:00:33.178 0.05 0.01 0.02 5496 1672 ## 2014-05-12 17:00:33.677 0.04 0.01 0.01 5768 2557 ## 2014-05-12 17:00:35.538 0.21 0.05 0.04 22528 4907 ## 2014-05-12 17:00:37.811 0.08 0.01 0.02 24576 7072 ## 2014-05-12 17:00:41.427 0.26 0.06 0.04 43008 14336 ## TotalSize Delta ## 2014-05-12 17:00:32.868 9437184 6792 ## 2014-05-12 17:00:33.178 9437184 3824 ## 2014-05-12 17:00:33.677 9437184 3211 ## 2014-05-12 17:00:35.538 9437184 17621 ## 2014-05-12 17:00:37.811 9437184 17504 ## 2014-05-12 17:00:41.427 9437184 28672 Example of Two Benchmark Runs in One Log File The data in the following graph is from a different log file, not the one of primary interest to this article. I’m including this image because it is an example of idle periods followed by busy periods. It would be uninteresting to average the rate of garbage collection over the entire log file period. More interesting would be the rate of garbage collect in the two busy periods. Are they the same or different? Your production data may be similar, for example, bursts when employees return from lunch and idle times on weekend evenings, etc. Once the data is in an R Time Series, you can analyze isolated time windows. Clipping the Time Series data Flashing back to our test case… Viewing the data as a time series is interesting. You can see that the work intensive time period is between 9:00 PM and 3:00 AM. Lets clip the data to the interesting period: par(mfrow=c(2,1)) plot(g1gc.z$UserTime, type="h", main="User Time per GC\nTime: Complete Log File", xlab="Time of Day", ylab="CPU Seconds per GC", col="#1b9e77") clipped.g1gc.z=window(g1gc.z, start=as.POSIXct("2014-05-12 21:00:00"), end=as.POSIXct("2014-05-13 03:00:00")) plot(clipped.g1gc.z$UserTime, type="h", main="User Time per GC\nTime: Limited to Benchmark Execution", xlab="Time of Day", ylab="CPU Seconds per GC", col="#1b9e77") box(which = "outer", lty = "solid") Cumulative Incremental and Full GC count Here is the cumulative incremental and full GC count. When the line is very steep, it indicates that the GCs are repeating very quickly. Notice that the scale on the Y axis is different for full vs. incremental. plot(clipped.g1gc.z[,c(2:3)], main="Cumulative Incremental and Full GC count", xlab="Time of Day", col="#1b9e77") GC Analysis of Benchmark Execution using Time Series data In the following series of 3 graphs: The “After Size” show the amount of heap space in use after each garbage collection. Many Java objects are still referenced, i.e. alive, during each garbage collection. This may indicate that the application has a memory leak, or may indicate that the application has a very large memory footprint. Typically, an application's memory footprint plateau's in the early stage of execution. One would expect this graph to have a flat top. The steep decline in the heap space may indicate that the application crashed after 2:00. The second graph shows that the outliers in real execution time, discussed above, occur near 2:00. when the Java heap seems to be quite full. The third graph shows that Full GCs are infrequent during the first few hours of execution. The rate of Full GC's, (the slope of the cummulative Full GC line), changes near midnight. plot(clipped.g1gc.z[,c("AfterSize","RealTime","FullCount")], xlab="Time of Day", col=c("#1b9e77","red","#1b9e77")) GC Analysis of heap recovered Each GC trace includes the amount of heap space in use before and after the individual GC event. During garbage coolection, unreferenced objects are identified, the space holding the unreferenced objects is freed, and thus, the difference in before and after usage indicates how much space has been freed. The following box plot and bar chart both demonstrate the same point - the amount of heap space freed per garbage colloection is surprisingly low. par(mfrow=c(2,1)) boxplot(as.vector(clipped.g1gc.z$Delta), main="Amount of Heap Recovered per GC Pass", xlab="Size in KB", horizontal = TRUE, col="red") hist(as.vector(clipped.g1gc.z$Delta), main="Amount of Heap Recovered per GC Pass", xlab="Size in KB", breaks=100, col="red") box(which = "outer", lty = "solid") This graph is the most interesting. The dark blue area shows how much heap is occupied by referenced Java objects. This represents memory that holds live data. The red fringe at the top shows how much data was recovered after each garbage collection. barplot(clipped.g1gc.z[,c("AfterSize","Delta")], col=c("#7570b3","#e7298a"), xlab="Time of Day", border=NA) legend("topleft", c("Live Objects","Heap Recovered on GC"), fill=c("#7570b3","#e7298a")) box(which = "outer", lty = "solid") When I discuss the data in the log files with the customer, I will ask for an explaination for the large amount of referenced data resident in the Java heap. There are two are posibilities: There is a memory leak and the amount of space required to hold referenced objects will continue to grow, limited only by the maximum heap size. After the maximum heap size is reached, the JVM will throw an “Out of Memory” exception every time that the application tries to allocate a new object. If this is the case, the aplication needs to be debugged to identify why old objects are referenced when they are no longer needed. The application has a legitimate requirement to keep a large amount of data in memory. The customer may want to further increase the maximum heap size. Another possible solution would be to partition the application across multiple cluster nodes, where each node has responsibility for managing a unique subset of the data. Conclusion In conclusion, R is a very powerful tool for the analysis of Java garbage collection log files. The primary difficulty is data cleansing so that information can be read into an R data frame. Once the data has been read into R, a rich set of tools may be used for thorough evaluation.

Read the article
Zen and the Art of File and Folder Organization

- by Mark Virtue

Is your desk a paragon of neatness, or does it look like a paper-bomb has gone off? If you’ve been putting off getting organized because the task is too huge or daunting, or you don’t know where to start, we’ve got 40 tips to get you on the path to zen mastery of your filing system. For all those readers who would like to get their files and folders organized, or, if they’re already organized, better organized—we have compiled a complete guide to getting organized and staying organized, a comprehensive article that will hopefully cover every possible tip you could want. Signs that Your Computer is Poorly Organized If your computer is a mess, you’re probably already aware of it. But just in case you’re not, here are some tell-tale signs: Your Desktop has over 40 icons on it “My Documents” contains over 300 files and 60 folders, including MP3s and digital photos You use the Windows’ built-in search facility whenever you need to find a file You can’t find programs in the out-of-control list of programs in your Start Menu You save all your Word documents in one folder, all your spreadsheets in a second folder, etc Any given file that you’re looking for may be in any one of four different sets of folders But before we start, here are some quick notes: We’re going to assume you know what files and folders are, and how to create, save, rename, copy and delete them The organization principles described in this article apply equally to all computer systems. However, the screenshots here will reflect how things look on Windows (usually Windows 7). We will also mention some useful features of Windows that can help you get organized. Everyone has their own favorite methodology of organizing and filing, and it’s all too easy to get into “My Way is Better than Your Way” arguments. The reality is that there is no perfect way of getting things organized. When I wrote this article, I tried to keep a generalist and objective viewpoint. I consider myself to be unusually well organized (to the point of obsession, truth be told), and I’ve had 25 years experience in collecting and organizing files on computers. So I’ve got a lot to say on the subject. But the tips I have described here are only one way of doing it. Hopefully some of these tips will work for you too, but please don’t read this as any sort of “right” way to do it. At the end of the article we’ll be asking you, the reader, for your own organization tips. Why Bother Organizing At All? For some, the answer to this question is self-evident. And yet, in this era of powerful desktop search software (the search capabilities built into the Windows Vista and Windows 7 Start Menus, and third-party programs like Google Desktop Search), the question does need to be asked, and answered. I have a friend who puts every file he ever creates, receives or downloads into his My Documents folder and doesn’t bother filing them into subfolders at all. He relies on the search functionality built into his Windows operating system to help him find whatever he’s looking for. And he always finds it. He’s a Search Samurai. For him, filing is a waste of valuable time that could be spent enjoying life! It’s tempting to follow suit. On the face of it, why would anyone bother to take the time to organize their hard disk when such excellent search software is available? Well, if all you ever want to do with the files you own is to locate and open them individually (for listening, editing, etc), then there’s no reason to ever bother doing one scrap of organization. But consider these common tasks that are not achievable with desktop search software: Find files manually. Often it’s not convenient, speedy or even possible to utilize your desktop search software to find what you want. It doesn’t work 100% of the time, or you may not even have it installed. Sometimes its just plain faster to go straight to the file you want, if you know it’s in a particular sub-folder, rather than trawling through hundreds of search results. Find groups of similar files (e.g. all your “work” files, all the photos of your Europe holiday in 2008, all your music videos, all the MP3s from Dark Side of the Moon, all your letters you wrote to your wife, all your tax returns). Clever naming of the files will only get you so far. Sometimes it’s the date the file was created that’s important, other times it’s the file format, and other times it’s the purpose of the file. How do you name a collection of files so that they’re easy to isolate based on any of the above criteria? Short answer, you can’t. Move files to a new computer. It’s time to upgrade your computer. How do you quickly grab all the files that are important to you? Or you decide to have two computers now – one for home and one for work. How do you quickly isolate only the work-related files to move them to the work computer? Synchronize files to other computers. If you have more than one computer, and you need to mirror some of your files onto the other computer (e.g. your music collection), then you need a way to quickly determine which files are to be synced and which are not. Surely you don’t want to synchronize everything? Choose which files to back up. If your backup regime calls for multiple backups, or requires speedy backups, then you’ll need to be able to specify which files are to be backed up, and which are not. This is not possible if they’re all in the same folder. Finally, if you’re simply someone who takes pleasure in being organized, tidy and ordered (me! me!), then you don’t even need a reason. Being disorganized is simply unthinkable. Tips on Getting Organized Here we present our 40 best tips on how to get organized. Or, if you’re already organized, to get better organized. Tip #1. Choose Your Organization System Carefully The reason that most people are not organized is that it takes time. And the first thing that takes time is deciding upon a system of organization. This is always a matter of personal preference, and is not something that a geek on a website can tell you. You should always choose your own system, based on how your own brain is organized (which makes the assumption that your brain is, in fact, organized). We can’t instruct you, but we can make suggestions: You may want to start off with a system based on the users of the computer. i.e. “My Files”, “My Wife’s Files”, My Son’s Files”, etc. Inside “My Files”, you might then break it down into “Personal” and “Business”. You may then realize that there are overlaps. For example, everyone may want to share access to the music library, or the photos from the school play. So you may create another folder called “Family”, for the “common” files. You may decide that the highest-level breakdown of your files is based on the “source” of each file. In other words, who created the files. You could have “Files created by ME (business or personal)”, “Files created by people I know (family, friends, etc)”, and finally “Files created by the rest of the world (MP3 music files, downloaded or ripped movies or TV shows, software installation files, gorgeous desktop wallpaper images you’ve collected, etc).” This system happens to be the one I use myself. See below: Mark is for files created by meVC is for files created by my company (Virtual Creations)Others is for files created by my friends and familyData is the rest of the worldAlso, Settings is where I store the configuration files and other program data files for my installed software (more on this in tip #34, below). Each folder will present its own particular set of requirements for further sub-organization. For example, you may decide to organize your music collection into sub-folders based on the artist’s name, while your digital photos might get organized based on the date they were taken. It can be different for every sub-folder! Another strategy would be based on “currentness”. Files you have yet to open and look at live in one folder. Ones that have been looked at but not yet filed live in another place. Current, active projects live in yet another place. All other files (your “archive”, if you like) would live in a fourth folder. (And of course, within that last folder you’d need to create a further sub-system based on one of the previous bullet points). Put some thought into this – changing it when it proves incomplete can be a big hassle! Before you go to the trouble of implementing any system you come up with, examine a wide cross-section of the files you own and see if they will all be able to find a nice logical place to sit within your system. Tip #2. When You Decide on Your System, Stick to It! There’s nothing more pointless than going to all the trouble of creating a system and filing all your files, and then whenever you create, receive or download a new file, you simply dump it onto your Desktop. You need to be disciplined – forever! Every new file you get, spend those extra few seconds to file it where it belongs! Otherwise, in just a month or two, you’ll be worse off than before – half your files will be organized and half will be disorganized – and you won’t know which is which! Tip #3. Choose the Root Folder of Your Structure Carefully Every data file (document, photo, music file, etc) that you create, own or is important to you, no matter where it came from, should be found within one single folder, and that one single folder should be located at the root of your C: drive (as a sub-folder of C:\). In other words, do not base your folder structure in standard folders like “My Documents”. If you do, then you’re leaving it up to the operating system engineers to decide what folder structure is best for you. And every operating system has a different system! In Windows 7 your files are found in C:\Users\YourName, whilst on Windows XP it was C:\Documents and Settings\YourName\My Documents. In UNIX systems it’s often /home/YourName. These standard default folders tend to fill up with junk files and folders that are not at all important to you. “My Documents” is the worst offender. Every second piece of software you install, it seems, likes to create its own folder in the “My Documents” folder. These folders usually don’t fit within your organizational structure, so don’t use them! In fact, don’t even use the “My Documents” folder at all. Allow it to fill up with junk, and then simply ignore it. It sounds heretical, but: Don’t ever visit your “My Documents” folder! Remove your icons/links to “My Documents” and replace them with links to the folders you created and you care about! Create your own file system from scratch! Probably the best place to put it would be on your D: drive – if you have one. This way, all your files live on one drive, while all the operating system and software component files live on the C: drive – simply and elegantly separated. The benefits of that are profound. Not only are there obvious organizational benefits (see tip #10, below), but when it comes to migrate your data to a new computer, you can (sometimes) simply unplug your D: drive and plug it in as the D: drive of your new computer (this implies that the D: drive is actually a separate physical disk, and not a partition on the same disk as C:). You also get a slight speed improvement (again, only if your C: and D: drives are on separate physical disks). Warning: From tip #12, below, you will see that it’s actually a good idea to have exactly the same file system structure – including the drive it’s filed on – on all of the computers you own. So if you decide to use the D: drive as the storage system for your own files, make sure you are able to use the D: drive on all the computers you own. If you can’t ensure that, then you can still use a clever geeky trick to store your files on the D: drive, but still access them all via the C: drive (see tip #17, below). If you only have one hard disk (C:), then create a dedicated folder that will contain all your files – something like C:\Files. The name of the folder is not important, but make it a single, brief word. There are several reasons for this: When creating a backup regime, it’s easy to decide what files should be backed up – they’re all in the one folder! If you ever decide to trade in your computer for a new one, you know exactly which files to migrate You will always know where to begin a search for any file If you synchronize files with other computers, it makes your synchronization routines very simple. It also causes all your shortcuts to continue to work on the other machines (more about this in tip #24, below). Once you’ve decided where your files should go, then put all your files in there – Everything! Completely disregard the standard, default folders that are created for you by the operating system (“My Music”, “My Pictures”, etc). In fact, you can actually relocate many of those folders into your own structure (more about that below, in tip #6). The more completely you get all your data files (documents, photos, music, etc) and all your configuration settings into that one folder, then the easier it will be to perform all of the above tasks. Once this has been done, and all your files live in one folder, all the other folders in C:\ can be thought of as “operating system” folders, and therefore of little day-to-day interest for us. Here’s a screenshot of a nicely organized C: drive, where all user files are located within the \Files folder: Tip #4. Use Sub-Folders This would be our simplest and most obvious tip. It almost goes without saying. Any organizational system you decide upon (see tip #1) will require that you create sub-folders for your files. Get used to creating folders on a regular basis. Tip #5. Don’t be Shy About Depth Create as many levels of sub-folders as you need. Don’t be scared to do so. Every time you notice an opportunity to group a set of related files into a sub-folder, do so. Examples might include: All the MP3s from one music CD, all the photos from one holiday, or all the documents from one client. It’s perfectly okay to put files into a folder called C:\Files\Me\From Others\Services\WestCo Bank\Statements\2009. That’s only seven levels deep. Ten levels is not uncommon. Of course, it’s possible to take this too far. If you notice yourself creating a sub-folder to hold only one file, then you’ve probably become a little over-zealous. On the other hand, if you simply create a structure with only two levels (for example C:\Files\Work) then you really haven’t achieved any level of organization at all (unless you own only six files!). Your “Work” folder will have become a dumping ground, just like your Desktop was, with most likely hundreds of files in it. Tip #6. Move the Standard User Folders into Your Own Folder Structure Most operating systems, including Windows, create a set of standard folders for each of its users. These folders then become the default location for files such as documents, music files, digital photos and downloaded Internet files. In Windows 7, the full list is shown below: Some of these folders you may never use nor care about (for example, the Favorites folder, if you’re not using Internet Explorer as your browser). Those ones you can leave where they are. But you may be using some of the other folders to store files that are important to you. Even if you’re not using them, Windows will still often treat them as the default storage location for many types of files. When you go to save a standard file type, it can become annoying to be automatically prompted to save it in a folder that’s not part of your own file structure. But there’s a simple solution: Move the folders you care about into your own folder structure! If you do, then the next time you go to save a file of the corresponding type, Windows will prompt you to save it in the new, moved location. Moving the folders is easy. Simply drag-and-drop them to the new location. Here’s a screenshot of the default My Music folder being moved to my custom personal folder (Mark): Tip #7. Name Files and Folders Intelligently This is another one that almost goes without saying, but we’ll say it anyway: Do not allow files to be created that have meaningless names like Document1.doc, or folders called New Folder (2). Take that extra 20 seconds and come up with a meaningful name for the file/folder – one that accurately divulges its contents without repeating the entire contents in the name. Tip #8. Watch Out for Long Filenames Another way to tell if you have not yet created enough depth to your folder hierarchy is that your files often require really long names. If you need to call a file Johnson Sales Figures March 2009.xls (which might happen to live in the same folder as Abercrombie Budget Report 2008.xls), then you might want to create some sub-folders so that the first file could be simply called March.xls, and living in the Clients\Johnson\Sales Figures\2009 folder. A well-placed file needs only a brief filename! Tip #9. Use Shortcuts! Everywhere! This is probably the single most useful and important tip we can offer. A shortcut allows a file to be in two places at once. Why would you want that? Well, the file and folder structure of every popular operating system on the market today is hierarchical. This means that all objects (files and folders) always live within exactly one parent folder. It’s a bit like a tree. A tree has branches (folders) and leaves (files). Each leaf, and each branch, is supported by exactly one parent branch, all the way back to the root of the tree (which, incidentally, is exactly why C:\ is called the “root folder” of the C: drive). That hard disks are structured this way may seem obvious and even necessary, but it’s only one way of organizing data. There are others: Relational databases, for example, organize structured data entirely differently. The main limitation of hierarchical filing structures is that a file can only ever be in one branch of the tree – in only one folder – at a time. Why is this a problem? Well, there are two main reasons why this limitation is a problem for computer users: The “correct” place for a file, according to our organizational rationale, is very often a very inconvenient place for that file to be located. Just because it’s correctly filed doesn’t mean it’s easy to get to. Your file may be “correctly” buried six levels deep in your sub-folder structure, but you may need regular and speedy access to this file every day. You could always move it to a more convenient location, but that would mean that you would need to re-file back to its “correct” location it every time you’d finished working on it. Most unsatisfactory. A file may simply “belong” in two or more different locations within your file structure. For example, say you’re an accountant and you have just completed the 2009 tax return for John Smith. It might make sense to you to call this file 2009 Tax Return.doc and file it under Clients\John Smith. But it may also be important to you to have the 2009 tax returns from all your clients together in the one place. So you might also want to call the file John Smith.doc and file it under Tax Returns\2009. The problem is, in a purely hierarchical filing system, you can’t put it in both places. Grrrrr! Fortunately, Windows (and most other operating systems) offers a way for you to do exactly that: It’s called a “shortcut” (also known as an “alias” on Macs and a “symbolic link” on UNIX systems). Shortcuts allow a file to exist in one place, and an icon that represents the file to be created and put anywhere else you please. In fact, you can create a dozen such icons and scatter them all over your hard disk. Double-clicking on one of these icons/shortcuts opens up the original file, just as if you had double-clicked on the original file itself. Consider the following two icons: The one on the left is the actual Word document, while the one on the right is a shortcut that represents the Word document. Double-clicking on either icon will open the same file. There are two main visual differences between the icons: The shortcut will have a small arrow in the lower-left-hand corner (on Windows, anyway) The shortcut is allowed to have a name that does not include the file extension (the “.docx” part, in this case) You can delete the shortcut at any time without losing any actual data. The original is still intact. All you lose is the ability to get to that data from wherever the shortcut was. So why are shortcuts so great? Because they allow us to easily overcome the main limitation of hierarchical file systems, and put a file in two (or more) places at the same time. You will always have files that don’t play nice with your organizational rationale, and can’t be filed in only one place. They demand to exist in two places. Shortcuts allow this! Furthermore, they allow you to collect your most often-opened files and folders together in one spot for convenient access. The cool part is that the original files stay where they are, safe forever in their perfectly organized location. So your collection of most often-opened files can – and should – become a collection of shortcuts! If you’re still not convinced of the utility of shortcuts, consider the following well-known areas of a typical Windows computer: The Start Menu (and all the programs that live within it) The Quick Launch bar (or the Superbar in Windows 7) The “Favorite folders” area in the top-left corner of the Windows Explorer window (in Windows Vista or Windows 7) Your Internet Explorer Favorites or Firefox Bookmarks Each item in each of these areas is a shortcut! Each of those areas exist for one purpose only: For convenience – to provide you with a collection of the files and folders you access most often. It should be easy to see by now that shortcuts are designed for one single purpose: To make accessing your files more convenient. Each time you double-click on a shortcut, you are saved the hassle of locating the file (or folder, or program, or drive, or control panel icon) that it represents. Shortcuts allow us to invent a golden rule of file and folder organization: “Only ever have one copy of a file – never have two copies of the same file. Use a shortcut instead” (this rule doesn’t apply to copies created for backup purposes, of course!) There are also lesser rules, like “don’t move a file into your work area – create a shortcut there instead”, and “any time you find yourself frustrated with how long it takes to locate a file, create a shortcut to it and place that shortcut in a convenient location.” So how to we create these massively useful shortcuts? There are two main ways: “Copy” the original file or folder (click on it and type Ctrl-C, or right-click on it and select Copy): Then right-click in an empty area of the destination folder (the place where you want the shortcut to go) and select Paste shortcut: Right-drag (drag with the right mouse button) the file from the source folder to the destination folder. When you let go of the mouse button at the destination folder, a menu pops up: Select Create shortcuts here. Note that when shortcuts are created, they are often named something like Shortcut to Budget Detail.doc (windows XP) or Budget Detail – Shortcut.doc (Windows 7). If you don’t like those extra words, you can easily rename the shortcuts after they’re created, or you can configure Windows to never insert the extra words in the first place (see our article on how to do this). And of course, you can create shortcuts to folders too, not just to files! Bottom line: Whenever you have a file that you’d like to access from somewhere else (whether it’s convenience you’re after, or because the file simply belongs in two places), create a shortcut to the original file in the new location. Tip #10. Separate Application Files from Data Files Any digital organization guru will drum this rule into you. Application files are the components of the software you’ve installed (e.g. Microsoft Word, Adobe Photoshop or Internet Explorer). Data files are the files that you’ve created for yourself using that software (e.g. Word Documents, digital photos, emails or playlists). Software gets installed, uninstalled and upgraded all the time. Hopefully you always have the original installation media (or downloaded set-up file) kept somewhere safe, and can thus reinstall your software at any time. This means that the software component files are of little importance. Whereas the files you have created with that software is, by definition, important. It’s a good rule to always separate unimportant files from important files. So when your software prompts you to save a file you’ve just created, take a moment and check out where it’s suggesting that you save the file. If it’s suggesting that you save the file into the same folder as the software itself, then definitely don’t follow that suggestion. File it in your own folder! In fact, see if you can find the program’s configuration option that determines where files are saved by default (if it has one), and change it. Tip #11. Organize Files Based on Purpose, Not on File Type If you have, for example a folder called Work\Clients\Johnson, and within that folder you have two sub-folders, Word Documents and Spreadsheets (in other words, you’re separating “.doc” files from “.xls” files), then chances are that you’re not optimally organized. It makes little sense to organize your files based on the program that created them. Instead, create your sub-folders based on the purpose of the file. For example, it would make more sense to create sub-folders called Correspondence and Financials. It may well be that all the files in a given sub-folder are of the same file-type, but this should be more of a coincidence and less of a design feature of your organization system. Tip #12. Maintain the Same Folder Structure on All Your Computers In other words, whatever organizational system you create, apply it to every computer that you can. There are several benefits to this: There’s less to remember. No matter where you are, you always know where to look for your files If you copy or synchronize files from one computer to another, then setting up the synchronization job becomes very simple Shortcuts can be copied or moved from one computer to another with ease (assuming the original files are also copied/moved). There’s no need to find the target of the shortcut all over again on the second computer Ditto for linked files (e.g Word documents that link to data in a separate Excel file), playlists, and any files that reference the exact file locations of other files. This applies even to the drive that your files are stored on. If your files are stored on C: on one computer, make sure they’re stored on C: on all your computers. Otherwise all your shortcuts, playlists and linked files will stop working! Tip #13. Create an “Inbox” Folder Create yourself a folder where you store all files that you’re currently working on, or that you haven’t gotten around to filing yet. You can think of this folder as your “to-do” list. You can call it “Inbox” (making it the same metaphor as your email system), or “Work”, or “To-Do”, or “Scratch”, or whatever name makes sense to you. It doesn’t matter what you call it – just make sure you have one! Once you have finished working on a file, you then move it from the “Inbox” to its correct location within your organizational structure. You may want to use your Desktop as this “Inbox” folder. Rightly or wrongly, most people do. It’s not a bad place to put such files, but be careful: If you do decide that your Desktop represents your “to-do” list, then make sure that no other files find their way there. In other words, make sure that your “Inbox”, wherever it is, Desktop or otherwise, is kept free of junk – stray files that don’t belong there. So where should you put this folder, which, almost by definition, lives outside the structure of the rest of your filing system? Well, first and foremost, it has to be somewhere handy. This will be one of your most-visited folders, so convenience is key. Putting it on the Desktop is a great option – especially if you don’t have any other folders on your Desktop: the folder then becomes supremely easy to find in Windows Explorer: You would then create shortcuts to this folder in convenient spots all over your computer (“Favorite Links”, “Quick Launch”, etc). Tip #14. Ensure You have Only One “Inbox” Folder Once you’ve created your “Inbox” folder, don’t use any other folder location as your “to-do list”. Throw every incoming or created file into the Inbox folder as you create/receive it. This keeps the rest of your computer pristine and free of randomly created or downloaded junk. The last thing you want to be doing is checking multiple folders to see all your current tasks and projects. Gather them all together into one folder. Here are some tips to help ensure you only have one Inbox: Set the default “save” location of all your programs to this folder. Set the default “download” location for your browser to this folder. If this folder is not your desktop (recommended) then also see if you can make a point of not putting “to-do” files on your desktop. This keeps your desktop uncluttered and Zen-like: (the Inbox folder is in the bottom-right corner) Tip #15. Be Vigilant about Clearing Your “Inbox” Folder This is one of the keys to staying organized. If you let your “Inbox” overflow (i.e. allow there to be more than, say, 30 files or folders in there), then you’re probably going to start feeling like you’re overwhelmed: You’re not keeping up with your to-do list. Once your Inbox gets beyond a certain point (around 30 files, studies have shown), then you’ll simply start to avoid it. You may continue to put files in there, but you’ll be scared to look at it, fearing the “out of control” feeling that all overworked, chaotic or just plain disorganized people regularly feel. So, here’s what you can do: Visit your Inbox/to-do folder regularly (at least five times per day). Scan the folder regularly for files that you have completed working on and are ready for filing. File them immediately. Make it a source of pride to keep the number of files in this folder as small as possible. If you value peace of mind, then make the emptiness of this folder one of your highest (computer) priorities If you know that a particular file has been in the folder for more than, say, six weeks, then admit that you’re not actually going to get around to processing it, and move it to its final resting place. Tip #16. File Everything Immediately, and Use Shortcuts for Your Active Projects As soon as you create, receive or download a new file, store it away in its “correct” folder immediately. Then, whenever you need to work on it (possibly straight away), create a shortcut to it in your “Inbox” (“to-do”) folder or your desktop. That way, all your files are always in their “correct” locations, yet you still have immediate, convenient access to your current, active files. When you finish working on a file, simply delete the shortcut. Ideally, your “Inbox” folder – and your Desktop – should contain no actual files or folders. They should simply contain shortcuts. Tip #17. Use Directory Symbolic Links (or Junctions) to Maintain One Unified Folder Structure Using this tip, we can get around a potential hiccup that we can run into when creating our organizational structure – the issue of having more than one drive on our computer (C:, D:, etc). We might have files we need to store on the D: drive for space reasons, and yet want to base our organized folder structure on the C: drive (or vice-versa). Your chosen organizational structure may dictate that all your files must be accessed from the C: drive (for example, the root folder of all your files may be something like C:\Files). And yet you may still have a D: drive and wish to take advantage of the hundreds of spare Gigabytes that it offers. Did you know that it’s actually possible to store your files on the D: drive and yet access them as if they were on the C: drive? And no, we’re not talking about shortcuts here (although the concept is very similar). By using the shell command mklink, you can essentially take a folder that lives on one drive and create an alias for it on a different drive (you can do lots more than that with mklink – for a full rundown on this programs capabilities, see our dedicated article). These aliases are called directory symbolic links (and used to be known as junctions). You can think of them as “virtual” folders. They function exactly like regular folders, except they’re physically located somewhere else. For example, you may decide that your entire D: drive contains your complete organizational file structure, but that you need to reference all those files as if they were on the C: drive, under C:\Files. If that was the case you could create C:\Files as a directory symbolic link – a link to D:, as follows: mklink /d c:\files d:\ Or it may be that the only files you wish to store on the D: drive are your movie collection. You could locate all your movie files in the root of your D: drive, and then link it to C:\Files\Media\Movies, as follows: mklink /d c:\files\media\movies d:\ (Needless to say, you must run these commands from a command prompt – click the Start button, type cmd and press Enter) Tip #18. Customize Your Folder Icons This is not strictly speaking an organizational tip, but having unique icons for each folder does allow you to more quickly visually identify which folder is which, and thus saves you time when you’re finding files. An example is below (from my folder that contains all files downloaded from the Internet): To learn how to change your folder icons, please refer to our dedicated article on the subject. Tip #19. Tidy Your Start Menu The Windows Start Menu is usually one of the messiest parts of any Windows computer. Every program you install seems to adopt a completely different approach to placing icons in this menu. Some simply put a single program icon. Others create a folder based on the name of the software. And others create a folder based on the name of the software manufacturer. It’s chaos, and can make it hard to find the software you want to run. Thankfully we can avoid this chaos with useful operating system features like Quick Launch, the Superbar or pinned start menu items. Even so, it would make a lot of sense to get into the guts of the Start Menu itself and give it a good once-over. All you really need to decide is how you’re going to organize your applications. A structure based on the purpose of the application is an obvious candidate. Below is an example of one such structure: In this structure, Utilities means software whose job it is to keep the computer itself running smoothly (configuration tools, backup software, Zip programs, etc). Applications refers to any productivity software that doesn’t fit under the headings Multimedia, Graphics, Internet, etc. In case you’re not aware, every icon in your Start Menu is a shortcut and can be manipulated like any other shortcut (copied, moved, deleted, etc). With the Windows Start Menu (all version of Windows), Microsoft has decided that there be two parallel folder structures to store your Start Menu shortcuts. One for you (the logged-in user of the computer) and one for all users of the computer. Having two parallel structures can often be redundant: If you are the only user of the computer, then having two parallel structures is totally redundant. Even if you have several users that regularly log into the computer, most of your installed software will need to be made available to all users, and should thus be moved out of the “just you” version of the Start Menu and into the “all users” area. To take control of your Start Menu, so you can start organizing it, you’ll need to know how to access the actual folders and shortcut files that make up the Start Menu (both versions of it). To find these folders and files, click the Start button and then right-click on the All Programs text (Windows XP users should right-click on the Start button itself): The Open option refers to the “just you” version of the Start Menu, while the Open All Users option refers to the “all users” version. Click on the one you want to organize. A Windows Explorer window then opens with your chosen version of the Start Menu selected. From there it’s easy. Double-click on the Programs folder and you’ll see all your folders and shortcuts. Now you can delete/rename/move until it’s just the way you want it. Note: When you’re reorganizing your Start Menu, you may want to have two Explorer windows open at the same time – one showing the “just you” version and one showing the “all users” version. You can drag-and-drop between the windows. Tip #20. Keep Your Start Menu Tidy Once you have a perfectly organized Start Menu, try to be a little vigilant about keeping it that way. Every time you install a new piece of software, the icons that get created will almost certainly violate your organizational structure. So to keep your Start Menu pristine and organized, make sure you do the following whenever you install a new piece of software: Check whether the software was installed into the “just you” area of the Start Menu, or the “all users” area, and then move it to the correct area. Remove all the unnecessary icons (like the “Read me” icon, the “Help” icon (you can always open the help from within the software itself when it’s running), the “Uninstall” icon, the link(s)to the manufacturer’s website, etc) Rename the main icon(s) of the software to something brief that makes sense to you. For example, you might like to rename Microsoft Office Word 2010 to simply Word Move the icon(s) into the correct folder based on your Start Menu organizational structure And don’t forget: when you uninstall a piece of software, the software’s uninstall routine is no longer going to be able to remove the software’s icon from the Start Menu (because you moved and/or renamed it), so you’ll need to remove that icon manually. Tip #21. Tidy C:\ The root of your C: drive (C:\) is a common dumping ground for files and folders – both by the users of your computer and by the software that you install on your computer. It can become a mess. There’s almost no software these days that requires itself to be installed in C:\. 99% of the time it can and should be installed into C:\Program Files. And as for your own files, well, it’s clear that they can (and almost always should) be stored somewhere else. In an ideal world, your C:\ folder should look like this (on Windows 7): Note that there are some system files and folders in C:\ that are usually and deliberately “hidden” (such as the Windows virtual memory file pagefile.sys, the boot loader file bootmgr, and the System Volume Information folder). Hiding these files and folders is a good idea, as they need to stay where they are and are almost never needed to be opened or even seen by you, the user. Hiding them prevents you from accidentally messing with them, and enhances your sense of order and well-being when you look at your C: drive folder. Tip #22. Tidy Your Desktop The Desktop is probably the most abused part of a Windows computer (from an organization point of view). It usually serves as a dumping ground for all incoming files, as well as holding icons to oft-used applications, plus some regularly opened files and folders. It often ends up becoming an uncontrolled mess. See if you can avoid this. Here’s why… Application icons (Word, Internet Explorer, etc) are often found on the Desktop, but it’s unlikely that this is the optimum place for them. The “Quick Launch” bar (or the Superbar in Windows 7) is always visible and so represents a perfect location to put your icons. You’ll only be able to see the icons on your Desktop when all your programs are minimized. It might be time to get your application icons off your desktop… You may have decided that the Inbox/To-do folder on your computer (see tip #13, above) should be your Desktop. If so, then enough said. Simply be vigilant about clearing it and preventing it from being polluted by junk files (see tip #15, above). On the other hand, if your Desktop is not acting as your “Inbox” folder, then there’s no reason for it to have any data files or folders on it at all, except perhaps a couple of shortcuts to often-opened files and folders (either ongoing or current projects). Everything else should be moved to your “Inbox” folder. In an ideal world, it might look like this: Tip #23. Move Permanent Items on Your Desktop Away from the Top-Left Corner When files/folders are dragged onto your desktop in a Windows Explorer window, or when shortcuts are created on your Desktop from Internet Explorer, those icons are always placed in the top-left corner – or as close as they can get. If you have other files, folders or shortcuts that you keep on the Desktop permanently, then it’s a good idea to separate these permanent icons from the transient ones, so that you can quickly identify which ones the transients are. An easy way to do this is to move all your permanent icons to the right-hand side of your Desktop. That should keep them separated from incoming items. Tip #24. Synchronize If you have more than one computer, you’ll almost certainly want to share files between them. If the computers are permanently attached to the same local network, then there’s no need to store multiple copies of any one file or folder – shortcuts will suffice. However, if the computers are not always on the same network, then you will at some point need to copy files between them. For files that need to permanently live on both computers, the ideal way to do this is to synchronize the files, as opposed to simply copying them. We only have room here to write a brief summary of synchronization, not a full article. In short, there are several different types of synchronization: Where the contents of one folder are accessible anywhere, such as with Dropbox Where the contents of any number of folders are accessible anywhere, such as with Windows Live Mesh Where any files or folders from anywhere on your computer are synchronized with exactly one other computer, such as with the Windows “Briefcase”, Microsoft SyncToy, or (much more powerful, yet still free) SyncBack from 2BrightSparks. This only works when both computers are on the same local network, at least temporarily. A great advantage of synchronization solutions is that once you’ve got it configured the way you want it, then the sync process happens automatically, every time. Click a button (or schedule it to happen automatically) and all your files are automagically put where they’re supposed to be. If you maintain the same file and folder structure on both computers, then you can also sync files depend upon the correct location of other files, like shortcuts, playlists and office documents that link to other office documents, and the synchronized files still work on the other computer! Tip #25. Hide Files You Never Need to See If you have your files well organized, you will often be able to tell if a file is out of place just by glancing at the contents of a folder (for example, it should be pretty obvious if you look in a folder that contains all the MP3s from one music CD and see a Word document in there). This is a good thing – it allows you to determine if there are files out of place with a quick glance. Yet sometimes there are files in a folder that seem out of place but actually need to be there, such as the “folder art” JPEGs in music folders, and various files in the root of the C: drive. If such files never need to be opened by you, then a good idea is to simply hide them. Then, the next time you glance at the folder, you won’t have to remember whether that file was supposed to be there or not, because you won’t see it at all! To hide a file, simply right-click on it and choose Properties: Then simply tick the Hidden tick-box: Tip #26. Keep Every Setup File These days most software is downloaded from the Internet. Whenever you download a piece of software, keep it. You’ll never know when you need to reinstall the software. Further, keep with it an Internet shortcut that links back to the website where you originally downloaded it, in case you ever need to check for updates. See tip #33 below for a full description of the excellence of organizing your setup files. Tip #27. Try to Minimize the Number of Folders that Contain Both Files and Sub-folders Some of the folders in your organizational structure will contain only files. Others will contain only sub-folders. And you will also have some folders that contain both files and sub-folders. You will notice slight improvements in how long it takes you to locate a file if you try to avoid this third type of folder. It’s not always possible, of course – you’ll always have some of these folders, but see if you can avoid it. One way of doing this is to take all the leftover files that didn’t end up getting stored in a sub-folder and create a special “Miscellaneous” or “Other” folder for them. Tip #28. Starting a Filename with an Underscore Brings it to the Top of a List Further to the previous tip, if you name that “Miscellaneous” or “Other” folder in such a way that its name begins with an underscore “_”, then it will appear at the top of the list of files/folders. The screenshot below is an example of this. Each folder in the list contains a set of digital photos. The folder at the top of the list, _Misc, contains random photos that didn’t deserve their own dedicated folder: Tip #29. Clean Up those CD-ROMs and (shudder!) Floppy Disks Have you got a pile of CD-ROMs stacked on a shelf of your office? Old photos, or files you archived off onto CD-ROM (or even worse, floppy disks!) because you didn’t have enough disk space at the time? In the meantime have you upgraded your computer and now have 500 Gigabytes of space you don’t know what to do with? If so, isn’t it time you tidied up that stack of disks and filed them into your gorgeous new folder structure? So what are you waiting for? Bite the bullet, copy them all back onto your computer, file them in their appropriate folders, and then back the whole lot up onto a shiny new 1000Gig external hard drive! Useful Folders to Create This next section suggests some useful folders that you might want to create within your folder structure. I’ve personally found them to be indispensable. The first three are all about convenience – handy folders to create and then put somewhere that you can always access instantly. For each one, it’s not so important where the actual folder is located, but it’s very important where you put the shortcut(s) to the folder. You might want to locate the shortcuts: On your Desktop In your “Quick Launch” area (or pinned to your Windows 7 Superbar) In your Windows Explorer “Favorite Links” area Tip #30. Create an “Inbox” (“To-Do”) Folder This has already been mentioned in depth (see tip #13), but we wanted to reiterate its importance here. This folder contains all the recently created, received or downloaded files that you have not yet had a chance to file away properly, and it also may contain files that you have yet to process. In effect, it becomes a sort of “to-do list”. It doesn’t have to be called “Inbox” – you can call it whatever you want. Tip #31. Create a Folder where Your Current Projects are Collected Rather than going hunting for them all the time, or dumping them all on your desktop, create a special folder where you put links (or work folders) for each of the projects you’re currently working on. You can locate this folder in your “Inbox” folder, on your desktop, or anywhere at all – just so long as there’s a way of getting to it quickly, such as putting a link to it in Windows Explorer’s “Favorite Links” area: Tip #32. Create a Folder for Files and Folders that You Regularly Open You will always have a few files that you open regularly, whether it be a spreadsheet of your current accounts, or a favorite playlist. These are not necessarily “current projects”, rather they’re simply files that you always find yourself opening. Typically such files would be located on your desktop (or even better, shortcuts to those files). Why not collect all such shortcuts together and put them in their own special folder? As with the “Current Projects” folder (above), you would want to locate that folder somewhere convenient. Below is an example of a folder called “Quick links”, with about seven files (shortcuts) in it, that is accessible through the Windows Quick Launch bar: See tip #37 below for a full explanation of the power of the Quick Launch bar. Tip #33. Create a “Set-ups” Folder A typical computer has dozens of applications installed on it. For each piece of software, there are often many different pieces of information you need to keep track of, including: The original installation setup file(s). This can be anything from a simple 100Kb setup.exe file you downloaded from a website, all the way up to a 4Gig ISO file that you copied from a DVD-ROM that you purchased. The home page of the software manufacturer (in case you need to look up something on their support pages, their forum or their online help) The page containing the download link for your actual file (in case you need to re-download it, or download an upgraded version) The serial number Your proof-of-purchase documentation Any other template files, plug-ins, themes, etc that also need to get installed For each piece of software, it’s a great idea to gather all of these files together and put them in a single folder. The folder can be the name of the software (plus possibly a very brief description of what it’s for – in case you can’t remember what the software does based in its name). Then you would gather all of these folders together into one place, and call it something like “Software” or “Setups”. If you have enough of these folders (I have several hundred, being a geek, collected over 20 years), then you may want to further categorize them. My own categorization structure is based on “platform” (operating system): The last seven folders each represents one platform/operating system, while _Operating Systems contains set-up files for installing the operating systems themselves. _Hardware contains ROMs for hardware I own, such as routers. Within the Windows folder (above), you can see the beginnings of the vast library of software I’ve compiled over the years: An example of a typical application folder looks like this: Tip #34. Have a “Settings” Folder We all know that our documents are important. So are our photos and music files. We save all of these files into folders, and then locate them afterwards and double-click on them to open them. But there are many files that are important to us that can’t be saved into folders, and then searched for and double-clicked later on. These files certainly contain important information that we need, but are often created internally by an application, and saved wherever that application feels is appropriate. A good example of this is the “PST” file that Outlook creates for us and uses to store all our emails, contacts, appointments and so forth. Another example would be the collection of Bookmarks that Firefox stores on your behalf. And yet another example would be the customized settings and configuration files of our all our software. Granted, most Windows programs store their configuration in the Registry, but there are still many programs that use configuration files to store their settings. Imagine if you lost all of the above files! And yet, when people are backing up their computers, they typically only back up the files they know about – those that are stored in the “My Documents” folder, etc. If they had a hard disk failure or their computer was lost or stolen, their backup files would not include some of the most vital files they owned. Also, when migrating to a new computer, it’s vital to ensure that these files make the journey. It can be a very useful idea to create yourself a folder to store all your “settings” – files that are important to you but which you never actually search for by name and double-click on to open them. Otherwise, next time you go to set up a new computer just the way you want it, you’ll need to spend hours recreating the configuration of your previous computer! So how to we get our important files into this folder? Well, we have a few options: Some programs (such as Outlook and its PST files) allow you to place these files wherever you want. If you delve into the program’s options, you will find a setting somewhere that controls the location of the important settings files (or “personal storage” – PST – when it comes to Outlook) Some programs do not allow you to change such locations in any easy way, but if you get into the Registry, you can sometimes find a registry key that refers to the location of the file(s). Simply move the file into your Settings folder and adjust the registry key to refer to the new location. Some programs stubbornly refuse to allow their settings files to be placed anywhere other then where they stipulate. When faced with programs like these, you have three choices: (1) You can ignore those files, (2) You can copy the files into your Settings folder (let’s face it – settings don’t change very often), or (3) you can use synchronization software, such as the Windows Briefcase, to make synchronized copies of all your files in your Settings folder. All you then have to do is to remember to run your sync software periodically (perhaps just before you run your backup software!). There are some other things you may decide to locate inside this new “Settings” folder: Exports of registry keys (from the many applications that store their configurations in the Registry). This is useful for backup purposes or for migrating to a new computer Notes you’ve made about all the specific customizations you have made to a particular piece of software (so that you’ll know how to do it all again on your next computer) Shortcuts to webpages that detail how to tweak certain aspects of your operating system or applications so they are just the way you like them (such as how to remove the words “Shortcut to” from the beginning of newly created shortcuts). In other words, you’d want to create shortcuts to half the pages on the How-To Geek website! Here’s an example of a “Settings” folder: Windows Features that Help with Organization This section details some of the features of Microsoft Windows that are a boon to anyone hoping to stay optimally organized. Tip #35. Use the “Favorite Links” Area to Access Oft-Used Folders Once you’ve created your great new filing system, work out which folders you access most regularly, or which serve as great starting points for locating the rest of the files in your folder structure, and then put links to those folders in your “Favorite Links” area of the left-hand side of the Windows Explorer window (simply called “Favorites” in Windows 7): Some ideas for folders you might want to add there include: Your “Inbox” folder (or whatever you’ve called it) – most important! The base of your filing structure (e.g. C:\Files) A folder containing shortcuts to often-accessed folders on other computers around the network (shown above as Network Folders) A folder containing shortcuts to your current projects (unless that folder is in your “Inbox” folder) Getting folders into this area is very simple – just locate the folder you’re interested in and drag it there! Tip #36. Customize the Places Bar in the File/Open and File/Save Boxes Consider the screenshot below: The highlighted icons (collectively known as the “Places Bar”) can be customized to refer to any folder location you want, allowing instant access to any part of your organizational structure. Note: These File/Open and File/Save boxes have been superseded by new versions that use the Windows Vista/Windows 7 “Favorite Links”, but the older versions (shown above) are still used by a surprisingly large number of applications. The easiest way to customize these icons is to use the Group Policy Editor, but not everyone has access to this program. If you do, open it up and navigate to: User Configuration > Administrative Templates > Windows Components > Windows Explorer > Common Open File Dialog If you don’t have access to the Group Policy Editor, then you’ll need to get into the Registry. Navigate to: HKEY_CURRENT_USER \ Software \ Microsoft \ Windows \ CurrentVersion \ Policies \ comdlg32 \ Placesbar It should then be easy to make the desired changes. Log off and log on again to allow the changes to take effect. Tip #37. Use the Quick Launch Bar as a Application and File Launcher That Quick Launch bar (to the right of the Start button) is a lot more useful than people give it credit for. Most people simply have half a dozen icons in it, and use it to start just those programs. But it can actually be used to instantly access just about anything in your filing system: For complete instructions on how to set this up, visit our dedicated article on this topic. Tip #38. Put a Shortcut to Windows Explorer into Your Quick Launch Bar This is only necessary in Windows Vista and Windows XP. The Microsoft boffins finally got wise and added it to the Windows 7 Superbar by default. Windows Explorer – the program used for managing your files and folders – is one of the most useful programs in Windows. Anyone who considers themselves serious about being organized needs instant access to this program at any time. A great place to create a shortcut to this program is in the Windows XP and Windows Vista “Quick Launch” bar: To get it there, locate it in your Start Menu (usually under “Accessories”) and then right-drag it down into your Quick Launch bar (and create a copy). Tip #39. Customize the Starting Folder for Your Windows 7 Explorer Superbar Icon If you’re on Windows 7, your Superbar will include a Windows Explorer icon. Clicking on the icon will launch Windows Explorer (of course), and will start you off in your “Libraries” folder. Libraries may be fine as a starting point, but if you have created yourself an “Inbox” folder, then it would probably make more sense to start off in this folder every time you launch Windows Explorer. To change this default/starting folder location, then first right-click the Explorer icon in the Superbar, and then right-click Properties:Then, in Target field of the Windows Explorer Properties box that appears, type %windir%\explorer.exe followed by the path of the folder you wish to start in. For example: %windir%\explorer.exe C:\Files If that folder happened to be on the Desktop (and called, say, “Inbox”), then you would use the following cleverness: %windir%\explorer.exe shell:desktop\Inbox Then click OK and test it out. Tip #40. Ummmmm…. No, that’s it. I can’t think of another one. That’s all of the tips I can come up with. I only created this one because 40 is such a nice round number… Case Study – An Organized PC To finish off the article, I have included a few screenshots of my (main) computer (running Vista). The aim here is twofold: To give you a sense of what it looks like when the above, sometimes abstract, tips are applied to a real-life computer, and To offer some ideas about folders and structure that you may want to steal to use on your own PC. Let’s start with the C: drive itself. Very minimal. All my files are contained within C:\Files. I’ll confine the rest of the case study to this folder: That folder contains the following: Mark: My personal files VC: My business (Virtual Creations, Australia) Others contains files created by friends and family Data contains files from the rest of the world (can be thought of as “public” files, usually downloaded from the Net) Settings is described above in tip #34 The Data folder contains the following sub-folders: Audio: Radio plays, audio books, podcasts, etc Development: Programmer and developer resources, sample source code, etc (see below) Humour: Jokes, funnies (those emails that we all receive) Movies: Downloaded and ripped movies (all legal, of course!), their scripts, DVD covers, etc. Music: (see below) Setups: Installation files for software (explained in full in tip #33) System: (see below) TV: Downloaded TV shows Writings: Books, instruction manuals, etc (see below) The Music folder contains the following sub-folders: Album covers: JPEG scans Guitar tabs: Text files of guitar sheet music Lists: e.g. “Top 1000 songs of all time” Lyrics: Text files MIDI: Electronic music files MP3 (representing 99% of the Music folder): MP3s, either ripped from CDs or downloaded, sorted by artist/album name Music Video: Video clips Sheet Music: usually PDFs The Data\Writings folder contains the following sub-folders: (all pretty self-explanatory) The Data\Development folder contains the following sub-folders: Again, all pretty self-explanatory (if you’re a geek) The Data\System folder contains the following sub-folders: These are usually themes, plug-ins and other downloadable program-specific resources. The Mark folder contains the following sub-folders: From Others: Usually letters that other people (friends, family, etc) have written to me For Others: Letters and other things I have created for other people Green Book: None of your business Playlists: M3U files that I have compiled of my favorite songs (plus one M3U playlist file for every album I own) Writing: Fiction, philosophy and other musings of mine Mark Docs: Shortcut to C:\Users\Mark Settings: Shortcut to C:\Files\Settings\Mark The Others folder contains the following sub-folders: The VC (Virtual Creations, my business – I develop websites) folder contains the following sub-folders: And again, all of those are pretty self-explanatory. Conclusion These tips have saved my sanity and helped keep me a productive geek, but what about you? What tips and tricks do you have to keep your files organized? Please share them with us in the comments. Come on, don’t be shy… Similar Articles Productive Geek Tips Fix For When Windows Explorer in Vista Stops Showing File NamesWhy Did Windows Vista’s Music Folder Icon Turn Yellow?Print or Create a Text File List of the Contents in a Directory the Easy WayCustomize the Windows 7 or Vista Send To MenuAdd Copy To / Move To on Windows 7 or Vista Right-Click Menu TouchFreeze Alternative in AutoHotkey The Icy Undertow Desktop Windows Home Server – Backup to LAN The Clear & Clean Desktop Use This Bookmarklet to Easily Get Albums Use AutoHotkey to Assign a Hotkey to a Specific Window Latest Software Reviews Tinyhacker Random Tips Acronis Online Backup DVDFab 6 Revo Uninstaller Pro Registry Mechanic 9 for Windows Track Daily Goals With 42Goals Video Toolbox is a Superb Online Video Editor Fun with 47 charts and graphs Tomorrow is Mother’s Day Check the Average Speed of YouTube Videos You’ve Watched OutlookStatView Scans and Displays General Usage Statistics

Read the article

Search Results

Search found 622 results on 25 pages for 'repeating'.

Page 25/25 | < Previous Page | 21 22 23 24 25

- by polygenelubricants

- by SDLFunTimes

- by Coquelicot

- by amurra

- by Fruitbat

- by adamsquared

- by Lukman

- by Khnle

- by lyborko

- by Lance Robinson

- by dwahlin

- by Vicer

- by olaf.heimburger

- by keepitsimpleengineer

- by dlp

- by sarcozona

- by user450062

- by Austin

- by microchasm

- by Thomas Weller

- by user12620111

- by Mark Virtue

< Previous Page | 21 22 23 24 25