Search Results

Search found 6682 results on 268 pages for 'edge cases'.

Page 23/268 | < Previous Page | 19 20 21 22 23 24 25 26 27 28 29 30 | Next Page >

Synergy doesn't work correctly if I switch client/server role (left of works, right of does not)

- by PhilW

When I use my win7/64bit as a server, with the mac (10.7.5) on its left, it works. Screens: [Mac/10.7.5]---[Win7/64bit] I've now switched the roles, so I use the Mac's keyboard (because Bug #18/19) and use windows as a client. Now I cannot move the mouse over the right edge to the windows client. But if I configure windows to be on the left (virtually at least), it works, I can use the left edge to cross over to the windows client. Dock is on the bottom. Synergy v1.4.15 What do I need to change in order to fix this? Thanks!

Read the article
Drawing outlines around organic shapes

- by ThunderChunky_SF

One thing that seems particularly easy to do in the Flash IDE but difficult to do with code is to outline an organic shape. In the IDE you can just use the inkbucket tool to draw a stroke around something. Using nothing but code it seems much trickier. One method I've seen is to add a glow filter to the shape in question and just mess with the strength. But what if i want to only show the outline? What I'd like to do is to collect all of the points that make up the edge of the shape and then just connect the dots. I've actually gotten so far as to collect all of the points with a quick and dirty edge detection script that I wrote. So now I have a Vector of all the points that makeup my shape. How do I connect them in the proper sequence so it actually looks like the original object? For anyone who is interested here is my edge detection script: // Create a new sprite which we'll use for our outline var sp:Sprite = new Sprite(); var radius:int = 50; sp.graphics.beginFill(0x00FF00, 1); sp.graphics.drawCircle(0, 0, radius); sp.graphics.endFill(); sp.x = stage.stageWidth / 2; sp.y = stage.stageHeight / 2; // Create a bitmap data object to draw our vector data var bmd:BitmapData = new BitmapData(sp.width, sp.height, true, 0); // Use a transform matrix to translate the drawn clip so that none of its // pixels reside in negative space. The draw method will only draw starting // at 0,0 var mat:Matrix = new Matrix(1, 0, 0, 1, radius, radius); bmd.draw(sp, mat); // Pass the bitmap data to an actual bitmap var bmp:Bitmap = new Bitmap(bmd); // Add the bitmap to the stage addChild(bmp); // Grab all of the pixel data from the bitmap data object var pixels:Vector.<uint> = bmd.getVector(bmd.rect); // Setup a vector to hold our stroke points var points:Vector.<Point> = new Vector.<Point>; // Loop through all of the pixels of the bitmap data object and // create a point instance for each pixel location that isn't // transparent. var l:int = pixels.length; for(var i:int = 0; i < l; ++i) { // Check to see if the pixel is transparent if(pixels[i] != 0) { var pt:Point; // Check to see if the pixel is on the first or last // row. We'll grab everything from these rows to close the outline if(i <= bmp.width || i >= (bmp.width * bmp.height) - bmp.width) { pt = new Point(); pt.x = int(i % bmp.width); pt.y = int(i / bmp.width); points.push(pt); continue; } // Check to see if the current pixel is on either extreme edge if(int(i % bmp.width) == 0 || int(i % bmp.width) == bmp.width - 1) { pt = new Point(); pt.x = int(i % bmp.width); pt.y = int(i / bmp.width); points.push(pt); continue; } // Check to see if the previous or next pixel are transparent, // if so save the current one. if(i > 0 && i < bmp.width * bmp.height) { if(pixels[i - 1] == 0 || pixels[i + 1] == 0) { pt = new Point(); pt.x = int(i % bmp.width); pt.y = int(i / bmp.width); points.push(pt); } } } }

Read the article
Get rid of jfreechart chartpanel unnecessary space

- by ryvantage

I am trying to get a JFreeChart ChartPanel to remove unwanted extra space between the edge of the panel and the graph itself. To best illustrate, here's a SSCCE (with JFreeChart installed): public static void main(String[] args) { JPanel panel = new JPanel(new GridBagLayout()); GridBagConstraints gbc = new GridBagConstraints(); gbc.fill = GridBagConstraints.BOTH; gbc.gridwidth = 1; gbc.gridheight = 1; gbc.weightx = 1; gbc.weighty = 1; gbc.gridy = 1; gbc.gridx = 1; panel.add(createChart("Sales", Chart_Type.DOLLARS, 100000, 115000), gbc); gbc.gridx = 2; panel.add(createChart("Quotes", Chart_Type.DOLLARS, 250000, 240000), gbc); gbc.gridx = 3; panel.add(createChart("Profits", Chart_Type.PERCENTAGE, 40.00, 38.00), gbc); JFrame frame = new JFrame(); frame.add(panel); frame.setSize(800, 300); frame.setLocationRelativeTo(null); frame.setVisible(true); frame.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE); } private static ChartPanel createChart(String title, Chart_Type type, double goal, double actual) { double maxValue = goal * 2; double yellowToGreenNum = goal; double redToYellowNum = goal * .75; DefaultValueDataset dataset = new DefaultValueDataset(actual); JFreeChart jfreechart = createChart(dataset, Math.max(actual, maxValue), redToYellowNum, yellowToGreenNum, title, type); ChartPanel chartPanel = new ChartPanel(jfreechart); chartPanel.setBorder(new LineBorder(Color.red)); return chartPanel; } private static JFreeChart createChart(ValueDataset valuedataset, Number maxValue, Number redToYellowNum, Number yellowToGreenNum, String title, Chart_Type type) { MeterPlot meterplot = new MeterPlot(valuedataset); meterplot.setRange(new Range(0.0D, maxValue.doubleValue())); meterplot.addInterval(new MeterInterval(" Goal Not Met ", new Range(0.0D, redToYellowNum.doubleValue()), Color.lightGray, new BasicStroke(2.0F), new Color(255, 0, 0, 128))); meterplot.addInterval(new MeterInterval(" Goal Almost Met ", new Range(redToYellowNum.doubleValue(), yellowToGreenNum.doubleValue()), Color.lightGray, new BasicStroke(2.0F), new Color(255, 255, 0, 64))); meterplot.addInterval(new MeterInterval(" Goal Met ", new Range(yellowToGreenNum.doubleValue(), maxValue.doubleValue()), Color.lightGray, new BasicStroke(2.0F), new Color(0, 255, 0, 64))); meterplot.setNeedlePaint(Color.darkGray); meterplot.setDialBackgroundPaint(Color.white); meterplot.setDialOutlinePaint(Color.gray); meterplot.setDialShape(DialShape.CHORD); meterplot.setMeterAngle(260); meterplot.setTickLabelsVisible(false); meterplot.setTickSize(maxValue.doubleValue() / 20); meterplot.setTickPaint(Color.lightGray); meterplot.setValuePaint(Color.black); meterplot.setValueFont(new Font("Dialog", Font.BOLD, 0)); meterplot.setUnits(""); if(type == Chart_Type.DOLLARS) meterplot.setTickLabelFormat(NumberFormat.getCurrencyInstance()); else if(type == Chart_Type.PERCENTAGE) meterplot.setTickLabelFormat(NumberFormat.getPercentInstance()); JFreeChart jfreechart = new JFreeChart(title, JFreeChart.DEFAULT_TITLE_FONT, meterplot, false); return jfreechart; } enum Chart_Type { DOLLARS, PERCENTAGE } If you resize the frame, you can see that you cannot make the edge of the graph go to the edge of the panel (the panels are outlined in red). Especially on the bottom - there is always a gap between the bottom the graph and the bottom of the panel. Is there a way to make the graph fill the entire area? Is there a way to at least guarantee that it is touching one edge of the panel (i.e., it is touching the top and bottom or the left and right) ??

Read the article
Right-aligning button in a grid with possibly no content - stretch grid to always fill the page

- by Peter Perhác

Hello people, I am losing my patience with this. I am working on a Windows Phone 7 application and I can't figure out what layout manager to use to achieve the following: Basically, when I use a Grid as the layout root, I can't make the grid to stretch to the size of the phone application page. When the main content area is full, all is well and the button sits where I want it to sit. However, in case the page content is very short, the grid is only as wide as to accommodate its content and then the button (which I am desperate to keep near the right edge of the screen) moves away from the right edge. If I replace the grid and use a vertically oriented stack panel for the layout root, the button sits where I want it but then the content area is capable of growing beyond the bottom edge. So, when I place a listbox full of items into the main content area, it doesn't adjust its height to be completely in view, but the majority of items in that listbox are just rendered below the bottom edge of the display area. I have tried using a third-party DockPanel layout manager and then docked the button in it's top section and set the button's HorizontalAlignment="Right" but the result was the same as with the grid, it also shrinks in size when there isn't enough content in the content area (or when title is short). How do I do this then? ==EDIT== I tried WPCoder's XAML, only I replaced the dummy text box with what I would have in a real page (stackpanel) and placed a listbox into the ContentPanel grid. I noticed that what I had before and what WPCoder is suggesting is very similar. Here's my current XAML and the page still doesn't grow to fit the width of the page and I get identical results to what I had before: <phone:PhoneApplicationPage x:Name="categoriesPage" x:Class="CatalogueBrowser.CategoriesPage" xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation" xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml" xmlns:phone="clr-namespace:Microsoft.Phone.Controls;assembly=Microsoft.Phone" xmlns:shell="clr-namespace:Microsoft.Phone.Shell;assembly=Microsoft.Phone" xmlns:d="http://schemas.microsoft.com/expression/blend/2008" xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006" FontFamily="{StaticResource PhoneFontFamilyNormal}" FontSize="{StaticResource PhoneFontSizeNormal}" Foreground="{StaticResource PhoneForegroundBrush}" SupportedOrientations="PortraitOrLandscape" Orientation="Portrait" mc:Ignorable="d" d:DesignWidth="480" d:DesignHeight="768" xmlns:ctrls="clr-namespace:Microsoft.Phone.Controls;assembly=Microsoft.Phone.Controls.Toolkit" shell:SystemTray.IsVisible="True"> <Grid x:Name="LayoutRoot" Background="Transparent"> <Grid.RowDefinitions> <RowDefinition Height="Auto"/> <RowDefinition Height="*"/> </Grid.RowDefinitions> <Grid> <Grid.ColumnDefinitions> <ColumnDefinition Width="*" /> <ColumnDefinition Width="Auto" /> </Grid.ColumnDefinitions> <StackPanel Orientation="Horizontal" VerticalAlignment="Center" > <TextBlock Text="Browsing:" Margin="10,10" Style="{StaticResource PhoneTextTitle3Style}" /> <TextBlock x:Name="ListTitle" Text="{Binding DisplayName}" Margin="0,10" Style="{StaticResource PhoneTextTitle3Style}" /> </StackPanel> <Button Grid.Column="1" x:Name="btnRefineSearch" Content="Refine Search" Style="{StaticResource buttonBarStyle}" FontSize="14" /> </Grid> <Grid x:Name="ContentPanel" Grid.Row="1"> <ListBox x:Name="CategoryList" ItemsSource="{Binding Categories}" Style="{StaticResource CatalogueList}" SelectionChanged="CategoryList_SelectionChanged"/> </Grid> </Grid> </phone:PhoneApplicationPage> This is what the page with the above XAML markup looks like in the emulator:

Read the article
Is it better to hard code data or find an algorithm?

- by OghmaOsiris

I've been working on a boardgame that has a hex grid as the board (the upper right grid in the image below) Since the board will never change and the spaces on the board will always be linked to the same other spaces around it, should I just hard code every space with the values that I need? Or should I use various algorithms to calculate links and traversals? To be more specific, my board game is a 4 player game where each player has a 5x5x5x5x5x5 hex grid (again, the upper right grid in th eimage above). The object is to get from the bottom of the grid to the top, with various obstacles in the way, and each players being able to attack eachother from the edge of their grid onto other players based on a range multiplier. Since the players grid will never change and the distance of any arbitrary space from the edge of the grid will always be the same, should I just hard code this number into each of the spaces, or should I still use a breadth first search algorithm when players are attacking? The only con I can think of for hard coding everything is that I'm going to code 9+ 2(5+6+7+8) = 61 individual cells. Is there anything else that I'm missing that I should consider using more complex algorithms?

Read the article
Profit at Oracle OpenWorld 2012

- by user462779

It's only a week away: Oracle OpenWorld descends on San Francisco from September 30 to October 4. It's always a frantic week for the Profit editorial staff, but here's a few thing we've got going in San Francisco that you'll want to watch out for: Profit on Oracle OpenWorld Live: The Oracle video team will be broadcasting live from the event all week. I have a few interesting on-air interviews booked, including a conversation with business/technology researcher Andrew Mcafee (Monday Oct 1 @ 11:45am), Acorn Paper CEO David Weissberg (Tuesday, Oct 2 @ 12:15pm) and Abhay Parasnis, Oracle Senior Vice President, Oracle Public Cloud (Wednesday, Oct 3, @ 10:45am). Profit in the Oracle Partner Network Lounge: This summer, I worked with the amazing Oracle Partner Network (OPN) team to create the Profit Oracle Specialized Partner Edition 2012. It's a great catalog of Oracle partner success stories and insight into the OPN strategy from its leadership. Look for the special issue of Profit in the Oracle PartnerNetwork Lounge: the place where partners can meet formally or informally with colleagues, customers, prospects, and other industry professionals. Moscone South, Exhibit Hall, Room 100 Oracle Customer Experience Summit @ OpenWorld: There's been a lot of discussion within my editorial team (and content published, as well)about Customer Experience. To keep pace with this evolving subject, I'll be attending this special embedded conference on Wednesday and Thursday (Oct. 3-4). Especially looking forward to Seth Godin's presentation: he was one of the first experts we interviewed forProfit Online five years ago. The Executive Edge @ OpenWorld: Of course, my Oracle OpenWorld is mostly filled with meetings/interviews with Oracle customers about completed Oracle projects and the strategic impact of enterprise IT on business. The ideal place for these conversations is The Executive Edge @ OpenWorld embedded conference. Samovar Tea Lounge at Moscone Center: I spend my down time on the roof of Moscone North, preparing for meetings or having impromptu conversations with attendees at this little oasis overlooking Yerba Buena Gardens. Fee free to drop my for a chat! See you in San Francisco! -Aaron Lazenby

Read the article
Real world pitfalls of introducing F# into a large codebase and engineering team

- by nganju

I'm CTO of a software firm with a large existing codebase (all C#) and a sizable engineering team. I can see how certain parts of the code would be far easier to write in F#, resulting in faster development time, fewer bugs, easier parallel implementations, etc., basically overall productivity gains for my team. However, I can also see several productivity pitfalls of introducing F#, namely: 1) Everyone has to learn F#, and it's not as trivial as switching from, say, Java to C#. Team members that have not learned F# will be unable to work on F# parts of the codebase. 2) The pool of hireable F# programmers, as of now (Dec 2010) is non-existent. Search various software engineer resume databases for "F#", way less than 1% of resumes contain the keyword. 3) Community support as of now (Dec 2010) is less available. You can google almost any problem in C# and find someone that has already dealt with it, not so with F#. Third party tool support (NUnit, Resharper etc) is also sketchy. I realize that this is a bit Catch-22, i.e. if people like me don't use F# then the community and tools will never materialize, etc. But, I've got a company to run, and I can be cutting edge but not bleeding edge. Any other pitfalls I'm not considering? Or anyone care to rebut the pitfalls I've mentioned? I think this is an important discussion and would love to hear your counter-arguments in this public forum that may do a lot to increase F# adoption by industry. Thanks.

Read the article
In a 2D platform game, how to ensure the player moves smoothly over sloping ground?

- by Kovsa

See image: http://i41.tinypic.com/huis13.jpg I'm developing a physics engine for a 2D platform game. I'm using the separating axis theorem for collision detection. The ground surface is constructed from oriented bounding boxes, with the player as an axis aligned bounding box. (Specifically, I'm using the algorithm from the book "Realtime Collision Detection" which performs swept collision detection for OBBs using SAT). I'm using a fairly small (close to zero) restitution coefficient in the collision response, to ensure that the dynamic objects don't penetrate the environment. The engine mostly works fine, it's just that I'm concerned about some edge cases that could possibly occur. For example, in the diagram, A, B and C are the ground surface. The player is heading left along B towards A. It seems to me that due to inaccuracy, the player box could be slightly below the box B as it continues up and left. When it reaches A, therefore, the bottom left corner of the player might then collide with the right side of A, which would be undesirable (as the intention is for the player to move smoothly over the top of A). It seems like a similar problem could happen when the player is on top of box C, moving left towards B - the most extreme point of B could collide with the left side of the player, instead of the player's bottom left corner sliding up and left above B. Box2D seems to handle this problem by storing connectivity information for its edge shapes, but I'm not really sure how it uses this information to solve the problem, and after looking at the code I don't really grasp what it's doing. Any suggestions would be greatly appreciated.

Read the article
How can I resolve collisions at different speeds, depending on the direction?

- by Raven Dreamer

I have, for all intents and purposes, a Triangle class that objects in my scene can collide with (In actuality, the right side of a parallelogram). My collision detection and resolution code works fine for the purposes of preventing a gameobject from entering into the space of the Triangle, instead directing the movement along the edge. The trouble is, the maximum speed along the x and y axis is not equivalent in my game, and moving along the Y axis (up or down) should take twice as long as an equivalent distance along the X axis (left or right). Unfortunately, these speeds apply to the collision resolution too, and movement along the blue path above progresses twice as fast. What can I do in my collision resolution to make sure that the speedlimit for Y axis movement is obeyed in the latter case? Collision Resolution for this case below (vecInput and velocity are the position and velocity vectors of the game object): // y = mx+c // solve for y. M = 2, x = input's x coord, c = rightYIntercept lowY = 2*vecInput.x + parag.rightYIntercept ; ... else { // y = mx+c // vecInput.y = 2(x) + RightYIntercept // (vecInput.y - RightYIntercept) / 2 = x; //if velocity.Y (positive) greater than velocity.X (negative) //pushing from bottom, so push right. if(velocity.y > -1*velocity.x) { //change the input vector's x position to match the //y position on the shape's edge. Formula for line: Y = MX+C // M is 2, C is rightYIntercept, y is the input y, solve for X. vecInput = new Vector2((vecInput.y - parag.rightYIntercept)/2, vecInput.y); Debug.Log("adjusted rightwards"); } else { vecInput = new Vector2( vecInput.x, lowY); Debug.Log("adjusted downwards"); } }

Read the article
Profit : August, 2012

- by user462779

August 2012 issue of Profit is now available online. Way back in 2003, I wrote my first feature for Profit. It was titled “Everything You Always Wanted to Know About Application Servers (But Were Afraid To Ask),” and it discussed “cutting-edge” technologies like portals and XML and the brand-new Java Platform, Enterprise Edition (Java EE; we’re now on Java EE 7). But despite the dated terms I used in my Profit debut, I noticed something in rereading that old story that has stayed constant: mid-tier technology is where innovative enterprise IT projects happen. It may have been XML in 2003, but it’s SOA in 2012. While preparing the August issue of Profit was more than just a stroll down memory lane for me, it has provided a nice bit of perspective about what changes and what doesn’t in this dynamic IT industry. Technologies continuously evolve—some become standard practice, some are revived or reinvented, and some are left by the wayside. But the drive to innovate and the desire to succeed are business principles that never go out of fashion. Also, be sure to check out the Profit JD Edwards Special Issue 2012 (PDF), featuring partner profiles, customer successes, and Oracle executive interviews. The Middleware Advantage Three ways a flexible, integrate software layer can deliver a competitive edge Playing to Win Electronic Arts’ superefficient hub processes millions of online gaming transactions every day. Adjustable Loans With Oracle Exadata, Reliance Commercial Finance keeps pace with India’s commercial loan market. Future Proof To keep pace with mobile, social, and location-based services, smart technologists are using middleware to innovate. Spring Training Knowledge and communication help Jackson Hewitt’s Tim Bechtold get seasonal workers in top shape. Keeping Online Customers Happy Customers worldwide are comfortable with online service—but are companies meeting customers’ needs?

Read the article
Character movement on a 2D tile map

- by Chris Morris

I'm working at making a HTML5 game. Top down, closest thing I can equate it to is the gameboy zeldas, but open world and no rooms. What I have so far is a procedurally generated map in a multi dimensional array. And a starting position on the map. Along with this I have an array of movable and non movable tile ID's. I also have a class for my player and have him being rendered out in the center of the starting tile. My problem however is getting the movement sorted out for the player. I want to be able to have the character free move around the map (pixel by pixel essentially) ontop of this 2D generated world. Ideally this would allow the user to move around the walk able area of the canvas. this is simple enough for me to do, but I am having problems now moving the world. If the user is 20% from the edge of the screen i want the world to start panning in the direction the player is heading. But I'm rather lacking in ideas of how to do this. I've looked around for some tutorials, but am coming up blank on ideas of how to generate the playable area (zoomed in) and to then move this generated area under the player when they reach near the end of the screen. My current idea was to generate a certain amount of tiles full size to fill the screen and place the player i the middle. Then when the user approaches the edge of the screen start generating the tiles offset by the distance moved and the direction. I can kind of see this working but I really have no idea if this is the best or easiest to code of methods for generating the world. sorry for the lack of code but I'm still just in the theory stages of working this all out.

Read the article
BPM Solution Catalogue–promote your process templates

- by JuergenKress

Oracle’s BPM Solution Catalogue showcasing our solutions with partners is now live. Take a look at the initial entries here. We are planning to use this catalogue not only to publish and highlight successful BPM implementations but also will be running campaigns in industry verticals with your solutions. If you have delivered a successful implementation in BPM and think it could be reused and applied again in a similar scenario in the same industry or in a similar environment, then we are keen to know about it and will add it to the solution catalogue. The solution catalogue will showcase successful BPM solutions both inside and outside Oracle. Be in touch with us on this via this e-mail id and we will make sure to add your solution. For more information you can also read the article “Leading-Edge BPM Benefits Without Bleeding-Edge Pain” SOA & BPM Partner Community For regular information on Oracle SOA Suite become a member in the SOA & BPM Partner Community for registration please visit www.oracle.com/goto/emea/soa (OPN account required) If you need support with your account please contact the Oracle Partner Business Center. Blog Twitter LinkedIn Mix Forum Technorati Tags: BPM,Solution Catalogue,process templates,BPM Suite,SOA Community,Oracle SOA,Oracle BPM,Community,OPN,Jürgen Kress

Read the article
Slick 2d scrolling off screen

- by Peter

I have something scrolling in and out of the screen. Now when it goes off screen, I want it to scroll into the screen at another location. What I do is I grab the last pixels at the screens edge using g.copyArea and then g.drawImage on the edge of the screen. And then I do a g.translate to create room for the next row which is next render cycle. My problem is that I get a single pixel row, which is not copied onto the canvas. Where as I want each row to be added and then translated, so that the image that scrolled off screen is recreated on the other side of the screen. Here is my code, maybe there is a better way of doing this, open to any suggests, cause I'm totally stuck @Override public void render(GameContainer gc, Graphics g) throws SlickException { //g.setClip(0, 0, 300, gc.getHeight()); g.translate(0, y); g.drawImage(image,0,200); g.resetTransform(); //g.clearClip(); g.copyArea(rightImage, 0, gc.getHeight() - 1); g.drawImage(rightImage, 300, 0); g.translate(0, y); y=y+3; }

Read the article
Graph data structures and journal format for mini-IDE

- by matec

Background: I am writing a small/partial IDE. Code is internally converted/parsed into a graph data structure (for fast navigation, syntax-check etc). Functionality to undo/redo (also between sessions) and restoring from crash is implemented by writing to and reading from journal. The journal records modifications to the graph (not to the source). Question: I am hoping for advice on a decision on data-structures and journal format. For the graph I see two possible versions: g-a Graph edges are implemented in the way that one node stores references to other nodes via memory address g-b Every node has an ID. There is an ID-to-memory-address map. Graph uses IDs (instead of addresses) to connect nodes. Moving along an edge from one node to another each time requires lookup in ID-to-address map. And also for the journal: j-a There is a current node (like current working directory in a shell + file-system setting). The journal contains entries like "create new node and connect to current", "connect first child of current node" (relative IDs) j-b Journal uses absolute IDs, e.g. "delete edge 7 - 5", "delete node 5" I could e.g. combine g-a with j-a or combine g-b with j-b. In principle also g-b and j-a should be possible. [My first/original attempt was g-a and a version of j-b that uses addresses, but that turned out to cause severe restrictions: nodes cannot change their addresses (or journal would have to keep track of it), and using journal between two sessions is a mess (or even impossible)] I wonder if variant a or variant b or a combination would be a good idea, what advantages and disadvantages they would have and especially if some variant might be causing troubles later.

Read the article
How does the new google maps make buildings and cityscapes 3D?

- by Aerovistae

Anyone who's seen the new Google maps has no doubt taken note of the incredible amount of three-dimensional detail in select American cities such as Boston, New York, Chicago, and San Francisco. They've even modeled the trees, bridges and some of the boats in the harbor! Minor architectural details are present. It's crazy. Looking at it up close, I've found there's a rectangular area around each of those cities, and anything within them is 3Dified, but it cuts off hard and fast at the edge, even if it's in the middle of a building. The edge of the rectangle is where the 3D stops. This leads me to think it's being done algorithmically (which would make sense, given the scale of the project, how many trees and buildings and details there are), and yet I can't imagine how that's possible. How could an algorithm model all these things without extensive data on their shapes and contours? How could it model the individual wires of a bridge, or the statues in a park? It must be done by hand, and yet how could it be for so much detail! Does anyone have any insight on this?

Read the article
How do you track existing requirements over time?

- by CaptainAwesomePants

I'm a software engineer working on a complex, ongoing website. It has a lot of moving parts and a small team of UI designers and business folks adding new features and tweaking old ones. Over the last year or so, we've added hundreds of interesting little edge cases. Planning, implementing, and testing them is not a problem. The problem comes later, when we want to refactor or add another new feature. Nobody remembers half of the old features and edge cases from a year ago. When we want to add a new change, we notice that code does all sorts of things in there, and we're not entirely sure which things are intentional requirements and which are meaningless side effects. Did someone last year request that the login token was supposed to only be valid for 30 minutes, or did some programmers just pick a sensible default? Can we change it? Back when the product was first envisioned, we created some documentation describing how the site worked. Since then we created a few additional documents describing new features, but nobody ever goes back and updates those documents when new features are requested, so the only authoritative documentation is the code itself. But the code provides no justification, no reason for its actions: only the how, never the why. What do other long-running teams do to keep track of what the requirements were and why?

Read the article
How to represent a graph with multiple edges allowed between nodes and edges that can selectively disappear

- by Pops

I'm trying to figure out what sort of data structure to use for modeling some hypothetical, idealized network usage. In my scenario, a number of users who are hostile to each other are all trying to form networks of computers where all potential connections are known. The computers that one user needs to connect may not be the same as the ones another user needs to connect, though; user 1 might need to connect computers A, B and D while user 2 might need to connect computers B, C and E. Image generated with the help of NCTM Graph Creator I think the core of this is going to be an undirected cyclic graph, with nodes representing computers and edges representing Ethernet cables. However, due to the nature of the scenario, there are a few uncommon features that rule out adjacency lists and adjacency matrices (at least, without non-trivial modifications): edges can become restricted-use; that is, if one user acquires a given network connection, no other user may use that connection in the example, the green user cannot possibly connect to computer A, but the red user has connected B to E despite not having a direct link between them in some cases, a given pair of nodes will be connected by more than one edge in the example, there are two independent cables running from D to E, so the green and blue users were both able to connect those machines directly; however, red can no longer make such a connection if two computers are connected by more than one cable, each user may own no more than one of those cables I'll need to do several operations on this graph, such as: determining whether any particular pair of computers is connected for a given user identifying the optimal path for a given user to connect target computers identifying the highest-latency computer connection for a given user (i.e. longest path without branching) My first thought was to simply create a collection of all of the edges, but that's terrible for searching. The best thing I can think to do now is to modify an adjacency list so that each item in the list contains not only the edge length but also its cost and current owner. Is this a sensible approach? Assuming space is not a concern, would it be reasonable to create multiple copies of the graph (one for each user) rather than a single graph?

Read the article
What could cause a sudden stop in Box2D?

- by alexanderpine

I'm using Box2d for a game, and I have a bug that's driving me nuts. I've simplified the situation down to a square player sliding back and forth frictionlessly on top of a floor composed of a series of square tiles, driven by the left and right keys (which apply a horizontal force). Works great, sliding back and forth across the whole floor. Except... Every once in a while, the player will suddenly stick at the edge of one of the tiles as if it is hitting a (nonexistent) wall. Further pushes in the same direction it was traveling will fail, but as soon as I push backwards once in the opposite direction, I can push forwards past the sticking point again. The sticking point seems to be random, except for being on the edge of a tile. Happens while going left or right. For debugging purposes, I keep the Positions/velocity values for the previous two update ticks and print them out when this stop occurs. As an example, here you see the player moving right, decelerating slightly; pos2 should be about 8.7, but it stops dead instead. tick0: pos= 8.4636 vel= 7.1875 tick1: pos= 8.5816 vel= 7.0833 tick2: pos= 8.5816 vel= 0.0000 So, as the player is 0.8 and the tiles 1.0 wide, the player is stopping just as it is about to cross onto the next tile (8.5816 + 0.8/2 = 8.9816). In fact, I get a collision message (which I ignore except noting that it happened). It only seems to happen at x.5816 (or -x.4184) while moving right, and x.4167 (or -x.5833) while moving left I said that it's like hitting a wall, but in fact, when it hits a wall, the numbers look more like: tick0: pos0= 12.4131 vel2= 8.4375 tick1: pos1= 12.5555 vel1= 8.5417 tick2: pos2= 12.5850 vel0= 0.0000 so it moves further right on the last tick, which puts it in contact with the wall. Anyone seen anything like this. Any suggestion on how I could be causing this behavior.

Read the article
Bug Triage

In this blog post brain dump, I'll attempt to describe the process my team tries to follow when dealing with new bug reports (specifically, code defect reports). This is not official Microsoft policy, just the way we do things… if you do things differently and want to share, you can do so at the bottom in the comments (or on your blog).Feature Triage TeamA subset of the feature crew, the triage team (which has representations from the PM, Dev and QA disciplines), looks at all unassigned bugs at regular intervals. This can be weekly or daily (or other frequency) dependent on which part of the product cycle we are in and what the untriaged bug load looks like. They discuss each bug considering the evidence and make a decision of whether the bug goes from Not Yet Assigned to Assigned (plus the name of the DEV to fix this) or whether it goes from Active to Resolved (which means it gets assigned back to the requestor for closure or further debate if they were not present at the triage meeting). Close to critical milestones, the feature triage team needs to further justify bugs they take to additional higher-level triage teams.Bug Opened = Not Yet AssignedSomeone (typically an SDET from the QA team) creates the bug item (e.g. in TFS), ensuring they populate all the relevant fields including: Title, Description, Repro Steps (including the Actual Result at the end of the steps), attachments of code and/or screenshots, Build number that they observed the issue in, regression details if applicable, how it was found, if a test case exists or needs to be created etc. They also indicate their opinion on the Priority and Severity. The bug status is left as Not Yet Assigned."Issue" versus "Fix for issue"The solution to some bugs is easy to determine, e.g. "bug: the column name is misspelled". Obviously the fix is to correct the spelling – still, the triage team should be explicit and enter the correct spelling in the bug's Description. Note that a bad bug name here would be "bug: fix the spelling of the column" (it describes the solution, rather than the problem).Other solutions are trickier to establish, e.g. "bug: the column header is not accessible (can only be clicked on with the mouse, not reached via keyboard)". What is the correct solution here? The last thing to do is leave this undetermined and just assign it to a developer. The solution has to be entered in the description. Behind this type of a bug usually hides a spec defect or a new feature request.The person opening the bug should focus on describing the issue, rather than the solution. The person indicates what the fix is in their opinion by stating the Expected Result (immediately after stating the Actual Result). If they have a complex suggested solution, that should be split out in a separate part, but the triage team has the final say before assigning it. If the solution is lengthy/complicated to describe, the bug can be assigned to the PM. Note: the strict interpretation suggests that any bug with no clear, obvious solution is always a hole in the spec and should always go to the PM. This also ensures the spec gets updated.Not Yet Assigned - Not Yet Assigned (on someone else's plate)If the bug is observed in our feature, but the cause is actually another team, we change the Area Path (which is the way we identify teams in TFS) and leave it as Not Yet Assigned. The triage team may add more comments as appropriate including potentially changing the repro steps. In some cases, we may even resolve the bug in our area path and open a new bug in the area path of the other team.Even though there is no action on a dev on the team, the bug still needs to be tracked. One way of doing this is to implement some notification system that informs the team when the tracked bug changed status; another way is to occasionally run a global query (against all area paths) for bugs that have been opened by a member of the team and follow up with the current owners for stale bugs.Not Yet Assigned - ResolvedThis state transition can only be made by the Feature Triage Team.0. Sometimes the bug description is not clear and in that case it gets Resolved as More Information Needed, so the original requestor can provide it.After understanding what the bug item is about, the first decision is to determine whether it needs to go to a dev.1. If it is a known bug, it gets resolved as "Duplicate" and linked to the existing bug.2. If it is "By Design" it gets resolved as such, indicating that the triage team does not think this is a bug.3. If the bug does not repro on latest bits, it is resolved as "No Repro"4. The most painful: If it is decided that we cannot fix it for this release it gets resolved as "Postponed" or "Won't Fix". The former is typically due to resources and time constraints, while the latter is due to deciding that it is not important enough to consume our resources in any release (yes, not all bugs must be fixed!). For both cases, there are other factors that contribute to the decision such as: existence of a reasonable workaround, frequency we expect users to encounter the issue, dependencies on other team to offer a solution, whether it breaks a core scenario, whether it prohibits customer feedback on a major feature, is it a regression from a previous release, impact of the fix on other partner teams (e.g. User Education, User Experience, Localization/Globalization), whether this is the right fix, does the fix impact performance goals, and last but not least, severity of bug (e.g. loss of customer data, security threat, crash, hang). The bar for fixing a bug goes up as the release date approaches. The triage team becomes hardnosed about which bugs to take, while the developers are busy resolving assigned bugs thus everyone drives for Zero Bug Bounce (ZBB). ZBB is when you have 0 active bugs older than 48 hours.Not Yet Assigned - AssignedIf the bug is something we decide to fix in this release and the solution is known, then it is assigned to a DEV. This is either the developer that will do the work, or a Lead that can further assign it to one of his developer team based on a load balancing algorithm of their choosing.Sometimes, the triage team needs the dev to do some investigation work before deciding whether to take the fix; similarly, the checkin for the fix may be gated on code review by the triage team. In these cases, these instructions are provided in the comments section of the bug and when the developer is done they notify the triage team for final decision.Additionally, a Priority and Severity (from 0 to 4) has to be entered, e.g. a P0 means "drop anything you are doing and fix this now" whereas a P4 is something you get to after all P0,1,2,3 bugs are fixed.From a testing perspective, if the bug was found through ad-hoc testing or an external team, the decision is made whether test cases should be added to avoid future regressions. This is communicated to the QA team.Assigned - ResolvedWhen the developer receives the bug (they should be checking daily for new bugs on their plate looking at bugs in order of priority and from older to newer) they can send it back to triage if the information is not clear. Otherwise, they investigate the bug, setting the Sub Status to "Investigating"; if they cannot make progress, they set the Sub Status to "Blocked" and discuss this with triage or whoever else can help them get unblocked. Once they are unblocked, they set the Sub Status to "Working on Solution"; once they are code complete they send a code review request, setting the Sub Status to "Fix Available". After the iterative code review process is over and everyone is happy with the fix, the developer checks it in and changes the state of the bug from Active (and Assigned to them) to Resolved (and Assigned to someone else).The developer needs to ensure that when the status is changed to Resolved that it is assigned to a QA person. For example, maybe the PM opened the bug, but it should be a QA person that will verify the fix - the developer needs to manually change the assignee in that case. Typically the QA person will send an email to the original requestor notifying them that the fix is verified.Resolved - ??In all cases above, note that the final state was Resolved. What happens after that? The final step should be Closed. The bug is closed once the QA person verifying the fix is happy with it. If the person is not happy, then they change the state from Resolved to Active, thus sending it back to the developer. If the developer and QA person cannot reach agreement, then triage can be brought into it. An easy way to do that is change the status back to Not Yet Assigned with appropriate comments so the triage team can re-review.It is important to note that only QA can close a bug. That means that if the opener of the bug was a PM, when the bug gets resolved by the dev it may land on the PM's plate and after a quick review, the PM would re-assign to an SDET, which is the only role that can close bugs. One exception to this is if the person that filed the bug is external: in that case, we leave it Resolved and assigned to them and also send them a notification that they need to verify the fix. Another exception is if specialized developer knowledge is needed for verifying the bug fix (e.g. it was a refactoring suggestion bug typically not observable by the user) in which case it is fine to have a developer verify the fix, and ideally a different developer to the one that opened the bug.Other links on bug triageA quick search reveals that others have talked about this subject, e.g. here, here, here, here and here.Your take?If you have other best practices your team uses to deal with incoming bug reports, feel free to share in the comments below or on your blog. Comments about this post welcome at the original blog.

Read the article
The Business case for Big Data

- by jasonw

The Business Case for Big Data Part 1 What's the Big Deal Okay, so a new buzz word is emerging. It's gone beyond just a buzzword now, and I think it is going to change the landscape of retail, financial services, healthcare....everything. Let me spend a moment to talk about what i'm going to talk about. Massive amounts of data are being collected every second, more than ever imaginable, and the size of this data is more than can be practically managed by today’s current strategies and technologies. There is a revolution at hand centering on this groundswell of data and it will change how we execute our businesses through greater efficiencies, new revenue discovery and even enable innovation. It is the revolution of Big Data. This is more than just a new buzzword is being tossed around technology circles.This blog series for Big Data will explain this new wave of technology and provide a roadmap for businesses to take advantage of this growing trend. Cases for Big Data There is a growing list of use cases for big data. We naturally think of Marketing as the low hanging fruit. Many projects look to analyze twitter feeds to find new ways to do marketing. I think of a great example from a TED speech that I recently saw on data visualization from Facebook from my masters studies at University of Virginia. We can see when the most likely time for breaks-ups occurs by looking at status changes and updates on users Walls. This is the intersection of Big Data, Analytics and traditional structured data. Ted Video Marketers can use this to sell more stuff. I really like the following piece on looking at twitter feeds to measure mood. The following company was bought by a hedge fund. They could predict how the S&P was going to do within three days at an 85% accuracy. Link to the article Here we see a convergence of predictive analytics and Big Data. So, we'll look at a lot of these business cases and start talking about what this means for the business. It's more than just finding ways to use Hadoop + NoSql and we'll talk about that too. How do I start in Big Data? That's what is coming next post.

Read the article
Quick guide to Oracle IRM 11g: Classification design

- by Simon Thorpe

Quick guide to Oracle IRM 11g indexThis is the final article in the quick guide to Oracle IRM. If you've followed everything prior you will now have a fully functional and tested Information Rights Management service. It doesn't matter if you've been following the 10g or 11g guide as this next article is common to both. ContentsWhy this is the most important part... Understanding the classification and standard rights model Identifying business use cases Creating an effective IRM classification modelOne single classification across the entire businessA context for each and every possible granular use caseWhat makes a good context? Deciding on the use of roles in the context Reviewing the features and security for context roles Summary Why this is the most important part...Now the real work begins, installing and getting an IRM system running is as simple as following instructions. However to actually have an IRM technology easily protecting your most sensitive information without interfering with your users existing daily work flows and be able to scale IRM across the entire business, requires thought into how confidential documents are created, used and distributed. This article is going to give you the information you need to ask the business the right questions so that you can deploy your IRM service successfully. The IRM team here at Oracle have over 10 years of experience in helping customers and it is important you understand the following to be successful in securing access to your most confidential information. Whatever you are trying to secure, be it mergers and acquisitions information, engineering intellectual property, health care documentation or financial reports. No matter what type of user is going to access the information, be they employees, contractors or customers, there are common goals you are always trying to achieve.Securing the content at the earliest point possible and do it automatically. Removing the dependency on the user to decide to secure the content reduces the risk of mistakes significantly and therefore results a more secure deployment. K.I.S.S. (Keep It Simple Stupid) Reduce complexity in the rights/classification model. Oracle IRM lets you make changes to access to documents even after they are secured which allows you to start with a simple model and then introduce complexity once you've understood how the technology is going to be used in the business. After an initial learning period you can review your implementation and start to make informed decisions based on user feedback and administration experience. Clearly communicate to the user, when appropriate, any changes to their existing work practice. You must make every effort to make the transition to sealed content as simple as possible. For external users you must help them understand why you are securing the documents and inform them the value of the technology to both your business and them. Before getting into the detail, I must pay homage to Martin White, Vice President of client services in SealedMedia, the company Oracle acquired and who created Oracle IRM. In the SealedMedia years Martin was involved with every single customer and was key to the design of certain aspects of the IRM technology, specifically the context model we will be discussing here. Listening carefully to customers and understanding the flexibility of the IRM technology, Martin taught me all the skills of helping customers build scalable, effective and simple to use IRM deployments. No matter how well the engineering department designed the software, badly designed and poorly executed projects can result in difficult to use and manage, and ultimately insecure solutions. The advice and information that follows was born with Martin and he's still delivering IRM consulting with customers and can be found at www.thinkers.co.uk. It is from Martin and others that Oracle not only has the most advanced, scalable and usable document security solution on the market, but Oracle and their partners have the most experience in delivering successful document security solutions. Understanding the classification and standard rights model The goal of any successful IRM deployment is to balance the increase in security the technology brings without over complicating the way people use secured content and avoid a significant increase in administration and maintenance. With Oracle it is possible to automate the protection of content, deploy the desktop software transparently and use authentication methods such that users can open newly secured content initially unaware the document is any different to an insecure one. That is until of course they attempt to do something for which they don't have any rights, such as copy and paste to an insecure application or try and print. Central to achieving this objective is creating a classification model that is simple to understand and use but also provides the right level of complexity to meet the business needs. In Oracle IRM the term used for each classification is a "context". A context defines the relationship between.A group of related documents The people that use the documents The roles that these people perform The rights that these people need to perform their role The context is the key to the success of Oracle IRM. It provides the separation of the role and rights of a user from the content itself. Documents are sealed to contexts but none of the rights, user or group information is stored within the content itself. Sealing only places information about the location of the IRM server that sealed it, the context applied to the document and a few other pieces of metadata that pertain only to the document. This important separation of rights from content means that millions of documents can be secured against a single classification and a user needs only one right assigned to be able to access all documents. If you have followed all the previous articles in this guide, you will be ready to start defining contexts to which your sensitive information will be protected. But before you even start with IRM, you need to understand how your own business uses and creates sensitive documents and emails. Identifying business use cases Oracle is able to support multiple classification systems, but usually there is one single initial need for the technology which drives a deployment. This need might be to protect sensitive mergers and acquisitions information, engineering intellectual property, financial documents. For this and every subsequent use case you must understand how users create and work with documents, to who they are distributed and how the recipients should interact with them. A successful IRM deployment should start with one well identified use case (we go through some examples towards the end of this article) and then after letting this use case play out in the business, you learn how your users work with content, how well your communication to the business worked and if the classification system you deployed delivered the right balance. It is at this point you can start rolling the technology out further. Creating an effective IRM classification model Once you have selected the initial use case you will address with IRM, you need to design a classification model that defines the access to secured documents within the use case. In Oracle IRM there is an inbuilt classification system called the "context" model. In Oracle IRM 11g it is possible to extend the server to support any rights classification model, but the majority of users who are not using an application integration (such as Oracle IRM within Oracle Beehive) are likely to be starting out with the built in context model. Before looking at creating a classification system with IRM, it is worth reviewing some recognized standards and methods for creating and implementing security policy. A very useful set of documents are the ISO 17799 guidelines and the SANS security policy templates. First task is to create a context against which documents are to be secured. A context consists of a group of related documents (all top secret engineering research), a list of roles (contributors and readers) which define how users can access documents and a list of users (research engineers) who have been given a role allowing them to interact with sealed content. Before even creating the first context it is wise to decide on a philosophy which will dictate the level of granularity, the question is, where do you start? At a department level? By project? By technology? First consider the two ends of the spectrum... One single classification across the entire business Imagine that instead of having separate contexts, one for engineering intellectual property, one for your financial data, one for human resources personally identifiable information, you create one context for all documents across the entire business. Whilst you may have immediate objections, there are some significant benefits in thinking about considering this. Document security classification decisions are simple. You only have one context to chose from! User provisioning is simple, just make sure everyone has a role in the only context in the business. Administration is very low, if you assign rights to groups from the business user repository you probably never have to touch IRM administration again. There are however some obvious downsides to this model.All users in have access to all IRM secured content. So potentially a sales person could access sensitive mergers and acquisition documents, if they can get their hands on a copy that is. You cannot delegate control of different documents to different parts of the business, this may not satisfy your regulatory requirements for the separation and delegation of duties. Changing a users role affects every single document ever secured. Even though it is very unlikely a business would ever use one single context to secure all their sensitive information, thinking about this scenario raises one very important point. Just having one single context and securing all confidential documents to it, whilst incurring some of the problems detailed above, has one huge value. Once secured, IRM protected content can ONLY be accessed by authorized users. Just think of all the sensitive documents in your business today, imagine if you could ensure that only everyone you trust could open them. Even if an employee lost a laptop or someone accidentally sent an email to the wrong recipient, only the right people could open that file. A context for each and every possible granular use case Now let's think about the total opposite of a single context design. What if you created a context for each and every single defined business need and created multiple contexts within this for each level of granularity? Let's take a use case where we need to protect engineering intellectual property. Imagine we have 6 different engineering groups, and in each we have a research department, a design department and manufacturing. The company information security policy defines 3 levels of information sensitivity... restricted, confidential and top secret. Then let's say that each group and department needs to define access to information from both internal and external users. Finally add into the mix that they want to review the rights model for each context every financial quarter. This would result in a huge amount of contexts. For example, lets just look at the resulting contexts for one engineering group. Q1FY2010 Restricted Internal - Engineering Group 1 - Research Q1FY2010 Restricted Internal - Engineering Group 1 - Design Q1FY2010 Restricted Internal - Engineering Group 1 - Manufacturing Q1FY2010 Restricted External- Engineering Group 1 - Research Q1FY2010 Restricted External - Engineering Group 1 - Design Q1FY2010 Restricted External - Engineering Group 1 - Manufacturing Q1FY2010 Confidential Internal - Engineering Group 1 - Research Q1FY2010 Confidential Internal - Engineering Group 1 - Design Q1FY2010 Confidential Internal - Engineering Group 1 - Manufacturing Q1FY2010 Confidential External - Engineering Group 1 - Research Q1FY2010 Confidential External - Engineering Group 1 - Design Q1FY2010 Confidential External - Engineering Group 1 - Manufacturing Q1FY2010 Top Secret Internal - Engineering Group 1 - Research Q1FY2010 Top Secret Internal - Engineering Group 1 - Design Q1FY2010 Top Secret Internal - Engineering Group 1 - Manufacturing Q1FY2010 Top Secret External - Engineering Group 1 - Research Q1FY2010 Top Secret External - Engineering Group 1 - Design Q1FY2010 Top Secret External - Engineering Group 1 - Manufacturing Now multiply the above by 6 for each engineering group, 18 contexts. You are then creating/reviewing another 18 every 3 months. After a year you've got 72 contexts. What would be the advantages of such a complex classification model? You can satisfy very granular rights requirements, for example only an authorized engineering group 1 researcher can create a top secret report for access internally, and his role will be reviewed on a very frequent basis. Your business may have very complex rights requirements and mapping this directly to IRM may be an obvious exercise. The disadvantages of such a classification model are significant...Huge administrative overhead. Someone in the business must manage, review and administrate each of these contexts. If the engineering group had a single administrator, they would have 72 classifications to reside over each year. From an end users perspective life will be very confusing. Imagine if a user has rights in just 6 of these contexts. They may be able to print content from one but not another, be able to edit content in 2 contexts but not the other 4. Such confusion at the end user level causes frustration and resistance to the use of the technology. Increased synchronization complexity. Imagine a user who after 3 years in the company ends up with over 300 rights in many different contexts across the business. This would result in long synchronization times as the client software updates all your offline rights. Hard to understand who can do what with what. Imagine being the VP of engineering and as part of an internal security audit you are asked the question, "What rights to researchers have to our top secret information?". In this complex model the answer is not simple, it would depend on many roles in many contexts. Of course this example is extreme, but it highlights that trying to build many barriers in your business can result in a nightmare of administration and confusion amongst users. In the real world what we need is a balance of the two. We need to seek an optimum number of contexts. Too many contexts are unmanageable and too few contexts does not give fine enough granularity. What makes a good context? Good context design derives mainly from how well you understand your business requirements to secure access to confidential information. Some customers I have worked with can tell me exactly the documents they wish to secure and know exactly who should be opening them. However there are some customers who know only of the government regulation that requires them to control access to certain types of information, they don't actually know where the documents are, how they are created or understand exactly who should have access. Therefore you need to know how to ask the business the right questions that lead to information which help you define a context. First ask these questions about a set of documentsWhat is the topic? Who are legitimate contributors on this topic? Who are the authorized readership? If the answer to any one of these is significantly different, then it probably merits a separate context. Remember that sealed documents are inherently secure and as such they cannot leak to your competitors, therefore it is better sealed to a broad context than not sealed at all. Simplicity is key here. Always revert to the first extreme example of a single classification, then work towards essential complexity. If there is any doubt, always prefer fewer contexts. Remember, Oracle IRM allows you to change your mind later on. You can implement a design now and continue to change and refine as you learn how the technology is used. It is easy to go from a simple model to a more complex one, it is much harder to take a complex model that is already embedded in the work practice of users and try to simplify it. It is also wise to take a single use case and address this first with the business. Don't try and tackle many different problems from the outset. Do one, learn from the process, refine it and then take what you have learned into the next use case, refine and continue. Once you have a good grasp of the technology and understand how your business will use it, you can then start rolling out the technology wider across the business. Deciding on the use of roles in the context Once you have decided on that first initial use case and a context to create let's look at the details you need to decide upon. For each context, identify; Administrative rolesBusiness owner, the person who makes decisions about who may or may not see content in this context. This is often the person who wanted to use IRM and drove the business purchase. They are the usually the person with the most at risk when sensitive information is lost. Point of contact, the person who will handle requests for access to content. Sometimes the same as the business owner, sometimes a trusted secretary or administrator. Context administrator, the person who will enact the decisions of the Business Owner. Sometimes the point of contact, sometimes a trusted IT person. Document related rolesContributors, the people who create and edit documents in this context. Reviewers, the people who are involved in reviewing documents but are not trusted to secure information to this classification. This role is not always necessary. (See later discussion on Published-work and Work-in-Progress) Readers, the people who read documents from this context. Some people may have several of the roles above, which is fine. What you are trying to do is understand and define how the business interacts with your sensitive information. These roles obviously map directly to roles available in Oracle IRM. Reviewing the features and security for context roles At this point we have decided on a classification of information, understand what roles people in the business will play when administrating this classification and how they will interact with content. The final piece of the puzzle in getting the information for our first context is to look at the permissions people will have to sealed documents. First think why are you protecting the documents in the first place? It is to prevent the loss of leaking of information to the wrong people. To control the information, making sure that people only access the latest versions of documents. You are not using Oracle IRM to prevent unauthorized people from doing legitimate work. This is an important point, with IRM you can erect many barriers to prevent access to content yet too many restrictions and authorized users will often find ways to circumvent using the technology and end up distributing unprotected originals. Because IRM is a security technology, it is easy to get carried away restricting different groups. However I would highly recommend starting with a simple solution with few restrictions. Ensure that everyone who reasonably needs to read documents can do so from the outset. Remember that with Oracle IRM you can change rights to content whenever you wish and tighten security. Always return to the fact that the greatest value IRM brings is that ONLY authorized users can access secured content, remember that simple "one context for the entire business" model. At the start of the deployment you really need to aim for user acceptance and therefore a simple model is more likely to succeed. As time passes and users understand how IRM works you can start to introduce more restrictions and complexity. Another key aspect to focus on is handling exceptions. If you decide on a context model where engineering can only access engineering information, and sales can only access sales data. Act quickly when a sales manager needs legitimate access to a set of engineering documents. Having a quick and effective process for permitting other people with legitimate needs to obtain appropriate access will be rewarded with acceptance from the user community. These use cases can often be satisfied by integrating IRM with a good Identity & Access Management technology which simplifies the process of assigning users the correct business roles. The big print issue... Printing is often an issue of contention, users love to print but the business wants to ensure sensitive information remains in the controlled digital world. There are many cases of physical document loss causing a business pain, it is often overlooked that IRM can help with this issue by limiting the ability to generate physical copies of digital content. However it can be hard to maintain a balance between security and usability when it comes to printing. Consider the following points when deciding about whether to give print rights. Oracle IRM sealed documents can contain watermarks that expose information about the user, time and location of access and the classification of the document. This information would reside in the printed copy making it easier to trace who printed it. Printed documents are slower to distribute in comparison to their digital counterparts, so time sensitive information in printed format may present a lower risk. Print activity is audited, therefore you can monitor and react to users abusing print rights. Summary In summary it is important to think carefully about the way you create your context model. As you ask the business these questions you may get a variety of different requirements. There may be special projects that require a context just for sensitive information created during the lifetime of the project. There may be a department that requires all information in the group is secured and you might have a few senior executives who wish to use IRM to exchange a small number of highly sensitive documents with a very small number of people. Oracle IRM, with its very flexible context classification system, can support all of these use cases. The trick is to introducing the complexity to deliver them at the right level. In another article i'm working on I will go through some examples of how Oracle IRM might map to existing business use cases. But for now, this article covers all the important questions you need to get your IRM service deployed and successfully protecting your most sensitive information.

Read the article
Option Trading: Getting the most out of the event session options

- by extended_events

You can control different aspects of how an event session behaves by setting the event session options as part of the CREATE EVENT SESSION DDL. The default settings for the event session options are designed to handle most of the common event collection situations so I generally recommend that you just use the defaults. Like everything in the real world though, there are going to be a handful of “special cases” that require something different. This post focuses on identifying the special cases and the correct use of the options to accommodate those cases. There is a reason it’s called Default The default session options specify a total event buffer size of 4 MB with a 30 second latency. Translating this into human terms; this means that our default behavior is that the system will start processing events from the event buffer when we reach about 1.3 MB of events or after 30 seconds, which ever comes first. Aside: What’s up with the 1.3 MB, I thought you said the buffer was 4 MB?The Extended Events engine takes the total buffer size specified by MAX_MEMORY (4MB by default) and divides it into 3 equally sized buffers. This is done so that a session can be publishing events to one buffer while other buffers are being processed. There are always at least three buffers; how to get more than three is covered later. Using this configuration, the Extended Events engine can “keep up” with most event sessions on standard workloads. Why is this? The fact is that most events are small, really small; on the order of a couple hundred bytes. Even when you start considering events that carry dynamically sized data (eg. binary, text, etc.) or adding actions that collect additional data, the total size of the event is still likely to be pretty small. This means that each buffer can likely hold thousands of events before it has to be processed. When the event buffers are finally processed there is an economy of scale achieved since most targets support bulk processing of the events so they are processed at the buffer level rather than the individual event level. When all this is working together it’s more likely that a full buffer will be processed and put back into the ready queue before the remaining buffers (remember, there are at least three) are full. I know what you’re going to say: “My server is exceptional! My workload is so massive it defies categorization!” OK, maybe you weren’t going to say that exactly, but you were probably thinking it. The point is that there are situations that won’t be covered by the Default, but that’s a good place to start and this post assumes you’ve started there so that you have something to look at in order to determine if you do have a special case that needs different settings. So let’s get to the special cases… What event just fired?! How about now?! Now?! If you believe the commercial adage from Heinz Ketchup (Heinz Slow Good Ketchup ad on You Tube), some things are worth the wait. This is not a belief held by most DBAs, particularly DBAs who are looking for an answer to a troubleshooting question fast. If you’re one of these anxious DBAs, or maybe just a Program Manager doing a demo, then 30 seconds might be longer than you’re comfortable waiting. If you find yourself in this situation then consider changing the MAX_DISPATCH_LATENCY option for your event session. This option will force the event buffers to be processed based on your time schedule. This option only makes sense for the asynchronous targets since those are the ones where we allow events to build up in the event buffer – if you’re using one of the synchronous targets this option isn’t relevant. Avoid forgotten events by increasing your memory Have you ever had one of those days where you keep forgetting things? That can happen in Extended Events too; we call it dropped events. In order to optimizes for server performance and help ensure that the Extended Events doesn’t block the server if to drop events that can’t be published to a buffer because the buffer is full. You can determine if events are being dropped from a session by querying the dm_xe_sessions DMV and looking at the dropped_event_count field. Aside: Should you care if you’re dropping events?Maybe not – think about why you’re collecting data in the first place and whether you’re really going to miss a few dropped events. For example, if you’re collecting query duration stats over thousands of executions of a query it won’t make a huge difference to miss a couple executions. Use your best judgment. If you find that your session is dropping events it means that the event buffer is not large enough to handle the volume of events that are being published. There are two ways to address this problem. First, you could collect fewer events – examine you session to see if you are over collecting. Do you need all the actions you’ve specified? Could you apply a predicate to be more specific about when you fire the event? Assuming the session is defined correctly, the next option is to change the MAX_MEMORY option to a larger number. Picking the right event buffer size might take some trial and error, but a good place to start is with the number of dropped events compared to the number you’ve collected. Aside: There are three different behaviors for dropping events that you specify using the EVENT_RETENTION_MODE option. The default is to allow single event loss and you should stick with this setting since it is the best choice for keeping the impact on server performance low.You’ll be tempted to use the setting to not lose any events (NO_EVENT_LOSS) – resist this urge since it can result in blocking on the server. If you’re worried that you’re losing events you should be increasing your event buffer memory as described in this section. Some events are too big to fail A less common reason for dropping an event is when an event is so large that it can’t fit into the event buffer. Even though most events are going to be small, you might find a condition that occasionally generates a very large event. You can determine if your session is dropping large events by looking at the dm_xe_sessions DMV once again, this time check the largest_event_dropped_size. If this value is larger than the size of your event buffer [remember, the size of your event buffer, by default, is max_memory / 3] then you need a large event buffer. To specify a large event buffer you set the MAX_EVENT_SIZE option to a value large enough to fit the largest event dropped based on data from the DMV. When you set this option the Extended Events engine will create two buffers of this size to accommodate these large events. As an added bonus (no extra charge) the large event buffer will also be used to store normal events in the cases where the normal event buffers are all full and waiting to be processed. (Note: This is just a side-effect, not the intended use. If you’re dropping many normal events then you should increase your normal event buffer size.) Partitioning: moving your events to a sub-division Earlier I alluded to the fact that you can configure your event session to use more than the standard three event buffers – this is called partitioning and is controlled by the MEMORY_PARTITION_MODE option. The result of setting this option is fairly easy to explain, but knowing when to use it is a bit more art than science. First the science… You can configure partitioning in three ways: None, Per NUMA Node & Per CPU. This specifies the location where sets of event buffers are created with fairly obvious implication. There are rules we follow for sub-dividing the total memory (specified by MAX_MEMORY) between all the event buffers that are specific to the mode used: None: 3 buffers (fixed)Node: 3 * number_of_nodesCPU: 2.5 * number_of_cpus Here are some examples of what this means for different Node/CPU counts: Configuration None Node CPU 2 CPUs, 1 Node 3 buffers 3 buffers 5 buffers 6 CPUs, 2 Node 3 buffers 6 buffers 15 buffers 40 CPUs, 5 Nodes 3 buffers 15 buffers 100 buffers Aside: Buffer size on multi-processor computersAs the number of Nodes or CPUs increases, the size of the event buffer gets smaller because the total memory is sub-divided into more pieces. The defaults will hold up to this for a while since each buffer set is holding events only from the Node or CPU that it is associated with, but at some point the buffers will get too small and you’ll either see events being dropped or you’ll get an error when you create your session because you’re below the minimum buffer size. Increase the MAX_MEMORY setting to an appropriate number for the configuration. The most likely reason to start partitioning is going to be related to performance. If you notice that running an event session is impacting the performance of your server beyond a reasonably expected level [Yes, there is a reasonably expected level of work required to collect events.] then partitioning might be an answer. Before you partition you might want to check a few other things: Is your event retention set to NO_EVENT_LOSS and causing blocking? (I told you not to do this.) Consider changing your event loss mode or increasing memory. Are you over collecting and causing more work than necessary? Consider adding predicates to events or removing unnecessary events and actions from your session. Are you writing the file target to the same slow disk that you use for TempDB and your other high activity databases? <kidding> <not really> It’s always worth considering the end to end picture – if you’re writing events to a file you can be impacted by I/O, network; all the usual stuff. Assuming you’ve ruled out the obvious (and not so obvious) issues, there are performance conditions that will be addressed by partitioning. For example, it’s possible to have a successful event session (eg. no dropped events) but still see a performance impact because you have many CPUs all attempting to write to the same free buffer and having to wait in line to finish their work. This is a case where partitioning would relieve the contention between the different CPUs and likely reduce the performance impact cause by the event session. There is no DMV you can check to find these conditions – sorry – that’s where the art comes in. This is largely a matter of experimentation. On the bright side you probably won’t need to to worry about this level of detail all that often. The performance impact of Extended Events is significantly lower than what you may be used to with SQL Trace. You will likely only care about the impact if you are trying to set up a long running event session that will be part of your everyday workload – sessions used for short term troubleshooting will likely fall into the “reasonably expected impact” category. Hey buddy – I think you forgot something OK, there are two options I didn’t cover: STARTUP_STATE & TRACK_CAUSALITY. If you want your event sessions to start automatically when the server starts, set the STARTUP_STATE option to ON. (Now there is only one option I didn’t cover.) I’m going to leave causality for another post since it’s not really related to session behavior, it’s more about event analysis. - Mike Share this post: email it! | bookmark it! | digg it! | reddit! | kick it! | live it!

Read the article
How to find and fix performance problems in ORM powered applications

- by FransBouma

Once in a while we get requests about how to fix performance problems with our framework. As it comes down to following the same steps and looking into the same things every single time, I decided to write a blogpost about it instead, so more people can learn from this and solve performance problems in their O/R mapper powered applications. In some parts it's focused on LLBLGen Pro but it's also usable for other O/R mapping frameworks, as the vast majority of performance problems in O/R mapper powered applications are not specific for a certain O/R mapper framework. Too often, the developer looks at the wrong part of the application, trying to fix what isn't a problem in that part, and getting frustrated that 'things are so slow with <insert your favorite framework X here>'. I'm in the O/R mapper business for a long time now (almost 10 years, full time) and as it's a small world, we O/R mapper developers know almost all tricks to pull off by now: we all know what to do to make task ABC faster and what compromises (because there are almost always compromises) to deal with if we decide to make ABC faster that way. Some O/R mapper frameworks are faster in X, others in Y, but you can be sure the difference is mainly a result of a compromise some developers are willing to deal with and others aren't. That's why the O/R mapper frameworks on the market today are different in many ways, even though they all fetch and save entities from and to a database. I'm not suggesting there's no room for improvement in today's O/R mapper frameworks, there always is, but it's not a matter of 'the slowness of the application is caused by the O/R mapper' anymore. Perhaps query generation can be optimized a bit here, row materialization can be optimized a bit there, but it's mainly coming down to milliseconds. Still worth it if you're a framework developer, but it's not much compared to the time spend inside databases and in user code: if a complete fetch takes 40ms or 50ms (from call to entity object collection), it won't make a difference for your application as that 10ms difference won't be noticed. That's why it's very important to find the real locations of the problems so developers can fix them properly and don't get frustrated because their quest to get a fast, performing application failed. Performance tuning basics and rules Finding and fixing performance problems in any application is a strict procedure with four prescribed steps: isolate, analyze, interpret and fix, in that order. It's key that you don't skip a step nor make assumptions: these steps help you find the reason of a problem which seems to be there, and how to fix it or leave it as-is. Skipping a step, or when you assume things will be bad/slow without doing analysis will lead to the path of premature optimization and won't actually solve your problems, only create new ones. The most important rule of finding and fixing performance problems in software is that you have to understand what 'performance problem' actually means. Most developers will say "when a piece of software / code is slow, you have a performance problem". But is that actually the case? If I write a Linq query which will aggregate, group and sort 5 million rows from several tables to produce a resultset of 10 rows, it might take more than a couple of milliseconds before that resultset is ready to be consumed by other logic. If I solely look at the Linq query, the code consuming the resultset of the 10 rows and then look at the time it takes to complete the whole procedure, it will appear to me to be slow: all that time taken to produce and consume 10 rows? But if you look closer, if you analyze and interpret the situation, you'll see it does a tremendous amount of work, and in that light it might even be extremely fast. With every performance problem you encounter, always do realize that what you're trying to solve is perhaps not a technical problem at all, but a perception problem. The second most important rule you have to understand is based on the old saying "Penny wise, Pound Foolish": the part which takes e.g. 5% of the total time T for a given task isn't worth optimizing if you have another part which takes a much larger part of the total time T for that same given task. Optimizing parts which are relatively insignificant for the total time taken is not going to bring you better results overall, even if you totally optimize that part away. This is the core reason why analysis of the complete set of application parts which participate in a given task is key to being successful in solving performance problems: No analysis -> no problem -> no solution. One warning up front: hunting for performance will always include making compromises. Fast software can be made maintainable, but if you want to squeeze as much performance out of your software, you will inevitably be faced with the dilemma of compromising one or more from the group {readability, maintainability, features} for the extra performance you think you'll gain. It's then up to you to decide whether it's worth it. In almost all cases it's not. The reason for this is simple: the vast majority of performance problems can be solved by implementing the proper algorithms, the ones with proven Big O-characteristics so you know the performance you'll get plus you know the algorithm will work. The time taken by the algorithm implementing code is inevitable: you already implemented the best algorithm. You might find some optimizations on the technical level but in general these are minor. Let's look at the four steps to see how they guide us through the quest to find and fix performance problems. Isolate The first thing you need to do is to isolate the areas in your application which are assumed to be slow. For example, if your application is a web application and a given page is taking several seconds or even minutes to load, it's a good candidate to check out. It's important to start with the isolate step because it allows you to focus on a single code path per area with a clear begin and end and ignore the rest. The rest of the steps are taken per identified problematic area. Keep in mind that isolation focuses on tasks in an application, not code snippets. A task is something that's started in your application by either another task or the user, or another program, and has a beginning and an end. You can see a task as a piece of functionality offered by your application. Analyze Once you've determined the problem areas, you have to perform analysis on the code paths of each area, to see where the performance problems occur and which areas are not the problem. This is a multi-layered effort: an application which uses an O/R mapper typically consists of multiple parts: there's likely some kind of interface (web, webservice, windows etc.), a part which controls the interface and business logic, the O/R mapper part and the RDBMS, all connected with either a network or inter-process connections provided by the OS or other means. Each of these parts, including the connectivity plumbing, eat up a part of the total time it takes to complete a task, e.g. load a webpage with all orders of a given customer X. To understand which parts participate in the task / area we're investigating and how much they contribute to the total time taken to complete the task, analysis of each participating task is essential. Start with the code you wrote which starts the task, analyze the code and track the path it follows through your application. What does the code do along the way, verify whether it's correct or not. Analyze whether you have implemented the right algorithms in your code for this particular area. Remember we're looking at one area at a time, which means we're ignoring all other code paths, just the code path of the current problematic area, from begin to end and back. Don't dig in and start optimizing at the code level just yet. We're just analyzing. If your analysis reveals big architectural stupidity, it's perhaps a good idea to rethink the architecture at this point. For the rest, we're analyzing which means we collect data about what could be wrong, for each participating part of the complete application. Reviewing the code you wrote is a good tool to get deeper understanding of what is going on for a given task but ultimately it lacks precision and overview what really happens: humans aren't good code interpreters, computers are. We therefore need to utilize tools to get deeper understanding about which parts contribute how much time to the total task, triggered by which other parts and for example how many times are they called. There are two different kind of tools which are necessary: .NET profilers and O/R mapper / RDBMS profilers. .NET profiling .NET profilers (e.g. dotTrace by JetBrains or Ants by Red Gate software) show exactly which pieces of code are called, how many times they're called, and the time it took to run that piece of code, at the method level and sometimes even at the line level. The .NET profilers are essential tools for understanding whether the time taken to complete a given task / area in your application is consumed by .NET code, where exactly in your code, the path to that code, how many times that code was called by other code and thus reveals where hotspots are located: the areas where a solution can be found. Importantly, they also reveal which areas can be left alone: remember our penny wise pound foolish saying: if a profiler reveals that a group of methods are fast, or don't contribute much to the total time taken for a given task, ignore them. Even if the code in them is perhaps complex and looks like a candidate for optimization: you can work all day on that, it won't matter. As we're focusing on a single area of the application, it's best to start profiling right before you actually activate the task/area. Most .NET profilers support this by starting the application without starting the profiling procedure just yet. You navigate to the particular part which is slow, start profiling in the profiler, in your application you perform the actions which are considered slow, and afterwards you get a snapshot in the profiler. The snapshot contains the data collected by the profiler during the slow action, so most data is produced by code in the area to investigate. This is important, because it allows you to stay focused on a single area. O/R mapper and RDBMS profiling .NET profilers give you a good insight in the .NET side of things, but not in the RDBMS side of the application. As this article is about O/R mapper powered applications, we're also looking at databases, and the software making it possible to consume the database in your application: the O/R mapper. To understand which parts of the O/R mapper and database participate how much to the total time taken for task T, we need different tools. There are two kind of tools focusing on O/R mappers and database performance profiling: O/R mapper profilers and RDBMS profilers. For O/R mapper profilers, you can look at LLBLGen Prof by hibernating rhinos or the Linq to Sql/LLBLGen Pro profiler by Huagati. Hibernating rhinos also have profilers for other O/R mappers like NHibernate (NHProf) and Entity Framework (EFProf) and work the same as LLBLGen Prof. For RDBMS profilers, you have to look whether the RDBMS vendor has a profiler. For example for SQL Server, the profiler is shipped with SQL Server, for Oracle it's build into the RDBMS, however there are also 3rd party tools. Which tool you're using isn't really important, what's important is that you get insight in which queries are executed during the task / area we're currently focused on and how long they took. Here, the O/R mapper profilers have an advantage as they collect the time it took to execute the query from the application's perspective so they also collect the time it took to transport data across the network. This is important because a query which returns a massive resultset or a resultset with large blob/clob/ntext/image fields takes more time to get transported across the network than a small resultset and a database profiler doesn't take this into account most of the time. Another tool to use in this case, which is more low level and not all O/R mappers support it (though LLBLGen Pro and NHibernate as well do) is tracing: most O/R mappers offer some form of tracing or logging system which you can use to collect the SQL generated and executed and often also other activity behind the scenes. While tracing can produce a tremendous amount of data in some cases, it also gives insight in what's going on. Interpret After we've completed the analysis step it's time to look at the data we've collected. We've done code reviews to see whether we've done anything stupid and which parts actually take place and if the proper algorithms have been implemented. We've done .NET profiling to see which parts are choke points and how much time they contribute to the total time taken to complete the task we're investigating. We've performed O/R mapper profiling and RDBMS profiling to see which queries were executed during the task, how many queries were generated and executed and how long they took to complete, including network transportation. All this data reveals two things: which parts are big contributors to the total time taken and which parts are irrelevant. Both aspects are very important. The parts which are irrelevant (i.e. don't contribute significantly to the total time taken) can be ignored from now on, we won't look at them. The parts which contribute a lot to the total time taken are important to look at. We now have to first look at the .NET profiler results, to see whether the time taken is consumed in our own code, in .NET framework code, in the O/R mapper itself or somewhere else. For example if most of the time is consumed by DbCommand.ExecuteReader, the time it took to complete the task is depending on the time the data is fetched from the database. If there was just 1 query executed, according to tracing or O/R mapper profilers / RDBMS profilers, check whether that query is optimal, uses indexes or has to deal with a lot of data. Interpret means that you follow the path from begin to end through the data collected and determine where, along the path, the most time is contributed. It also means that you have to check whether this was expected or is totally unexpected. My previous example of the 10 row resultset of a query which groups millions of rows will likely reveal that a long time is spend inside the database and almost no time is spend in the .NET code, meaning the RDBMS part contributes the most to the total time taken, the rest is compared to that time, irrelevant. Considering the vastness of the source data set, it's expected this will take some time. However, does it need tweaking? Perhaps all possible tweaks are already in place. In the interpret step you then have to decide that further action in this area is necessary or not, based on what the analysis results show: if the analysis results were unexpected and in the area where the most time is contributed to the total time taken is room for improvement, action should be taken. If not, you can only accept the situation and move on. In all cases, document your decision together with the analysis you've done. If you decide that the perceived performance problem is actually expected due to the nature of the task performed, it's essential that in the future when someone else looks at the application and starts asking questions you can answer them properly and new analysis is only necessary if situations changed. Fix After interpreting the analysis results you've concluded that some areas need adjustment. This is the fix step: you're actively correcting the performance problem with proper action targeted at the real cause. In many cases related to O/R mapper powered applications it means you'll use different features of the O/R mapper to achieve the same goal, or apply optimizations at the RDBMS level. It could also mean you apply caching inside your application (compromise memory consumption over performance) to avoid unnecessary re-querying data and re-consuming the results. After applying a change, it's key you re-do the analysis and interpretation steps: compare the results and expectations with what you had before, to see whether your actions had any effect or whether it moved the problem to a different part of the application. Don't fall into the trap to do partly analysis: do the full analysis again: .NET profiling and O/R mapper / RDBMS profiling. It might very well be that the changes you've made make one part faster but another part significantly slower, in such a way that the overall problem hasn't changed at all. Performance tuning is dealing with compromises and making choices: to use one feature over the other, to accept a higher memory footprint, to go away from the strict-OO path and execute queries directly onto the RDBMS, these are choices and compromises which will cross your path if you want to fix performance problems with respect to O/R mappers or data-access and databases in general. In most cases it's not a big issue: alternatives are often good choices too and the compromises aren't that hard to deal with. What is important is that you document why you made a choice, a compromise: which analysis data, which interpretation led you to the choice made. This is key for good maintainability in the years to come. Most common performance problems with O/R mappers Below is an incomplete list of common performance problems related to data-access / O/R mappers / RDBMS code. It will help you with fixing the hotspots you found in the interpretation step. SELECT N+1: (Lazy-loading specific). Lazy loading triggered performance bottlenecks. Consider a list of Orders bound to a grid. You have a Field mapped onto a related field in Order, Customer.CompanyName. Showing this column in the grid will make the grid fetch (indirectly) for each row the Customer row. This means you'll get for the single list not 1 query (for the orders) but 1+(the number of orders shown) queries. To solve this: use eager loading using a prefetch path to fetch the customers with the orders. SELECT N+1 is easy to spot with an O/R mapper profiler or RDBMS profiler: if you see a lot of identical queries executed at once, you have this problem. Prefetch paths using many path nodes or sorting, or limiting. Eager loading problem. Prefetch paths can help with performance, but as 1 query is fetched per node, it can be the number of data fetched in a child node is bigger than you think. Also consider that data in every node is merged on the client within the parent. This is fast, but it also can take some time if you fetch massive amounts of entities. If you keep fetches small, you can use tuning parameters like the ParameterizedPrefetchPathThreshold setting to get more optimal queries. Deep inheritance hierarchies of type Target Per Entity/Type. If you use inheritance of type Target per Entity / Type (each type in the inheritance hierarchy is mapped onto its own table/view), fetches will join subtype- and supertype tables in many cases, which can lead to a lot of performance problems if the hierarchy has many types. With this problem, keep inheritance to a minimum if possible, or switch to a hierarchy of type Target Per Hierarchy, which means all entities in the inheritance hierarchy are mapped onto the same table/view. Of course this has its own set of drawbacks, but it's a compromise you might want to take. Fetching massive amounts of data by fetching large lists of entities. LLBLGen Pro supports paging (and limiting the # of rows returned), which is often key to process through large sets of data. Use paging on the RDBMS if possible (so a query is executed which returns only the rows in the page requested). When using paging in a web application, be sure that you switch server-side paging on on the datasourcecontrol used. In this case, paging on the grid alone is not enough: this can lead to fetching a lot of data which is then loaded into the grid and paged there. Keep note that analyzing queries for paging could lead to the false assumption that paging doesn't occur, e.g. when the query contains a field of type ntext/image/clob/blob and DISTINCT can't be applied while it should have (e.g. due to a join): the datareader will do DISTINCT filtering on the client. this is a little slower but it does perform paging functionality on the data-reader so it won't fetch all rows even if the query suggests it does. Fetch massive amounts of data because blob/clob/ntext/image fields aren't excluded. LLBLGen Pro supports field exclusion for queries. You can exclude fields (also in prefetch paths) per query to avoid fetching all fields of an entity, e.g. when you don't need them for the logic consuming the resultset. Excluding fields can greatly reduce the amount of time spend on data-transport across the network. Use this optimization if you see that there's a big difference between query execution time on the RDBMS and the time reported by the .NET profiler for the ExecuteReader method call. Doing client-side aggregates/scalar calculations by consuming a lot of data. If possible, try to formulate a scalar query or group by query using the projection system or GetScalar functionality of LLBLGen Pro to do data consumption on the RDBMS server. It's far more efficient to process data on the RDBMS server than to first load it all in memory, then traverse the data in-memory to calculate a value. Using .ToList() constructs inside linq queries. It might be you use .ToList() somewhere in a Linq query which makes the query be run partially in-memory. Example: var q = from c in metaData.Customers.ToList() where c.Country=="Norway" select c; This will actually fetch all customers in-memory and do an in-memory filtering, as the linq query is defined on an IEnumerable<T>, and not on the IQueryable<T>. Linq is nice, but it can often be a bit unclear where some parts of a Linq query might run. Fetching all entities to delete into memory first. To delete a set of entities it's rather inefficient to first fetch them all into memory and then delete them one by one. It's more efficient to execute a DELETE FROM ... WHERE query on the database directly to delete the entities in one go. LLBLGen Pro supports this feature, and so do some other O/R mappers. It's not always possible to do this operation in the context of an O/R mapper however: if an O/R mapper relies on a cache, these kind of operations are likely not supported because they make it impossible to track whether an entity is actually removed from the DB and thus can be removed from the cache. Fetching all entities to update with an expression into memory first. Similar to the previous point: it is more efficient to update a set of entities directly with a single UPDATE query using an expression instead of fetching the entities into memory first and then updating the entities in a loop, and afterwards saving them. It might however be a compromise you don't want to take as it is working around the idea of having an object graph in memory which is manipulated and instead makes the code fully aware there's a RDBMS somewhere. Conclusion Performance tuning is almost always about compromises and making choices. It's also about knowing where to look and how the systems in play behave and should behave. The four steps I provided should help you stay focused on the real problem and lead you towards the solution. Knowing how to optimally use the systems participating in your own code (.NET framework, O/R mapper, RDBMS, network/services) is key for success as well as knowing what's going on inside the application you built. I hope you'll find this guide useful in tracking down performance problems and dealing with them in a useful way.

Read the article
Option Trading: Getting the most out of the event session options

- by extended_events

You can control different aspects of how an event session behaves by setting the event session options as part of the CREATE EVENT SESSION DDL. The default settings for the event session options are designed to handle most of the common event collection situations so I generally recommend that you just use the defaults. Like everything in the real world though, there are going to be a handful of “special cases” that require something different. This post focuses on identifying the special cases and the correct use of the options to accommodate those cases. There is a reason it’s called Default The default session options specify a total event buffer size of 4 MB with a 30 second latency. Translating this into human terms; this means that our default behavior is that the system will start processing events from the event buffer when we reach about 1.3 MB of events or after 30 seconds, which ever comes first. Aside: What’s up with the 1.3 MB, I thought you said the buffer was 4 MB?The Extended Events engine takes the total buffer size specified by MAX_MEMORY (4MB by default) and divides it into 3 equally sized buffers. This is done so that a session can be publishing events to one buffer while other buffers are being processed. There are always at least three buffers; how to get more than three is covered later. Using this configuration, the Extended Events engine can “keep up” with most event sessions on standard workloads. Why is this? The fact is that most events are small, really small; on the order of a couple hundred bytes. Even when you start considering events that carry dynamically sized data (eg. binary, text, etc.) or adding actions that collect additional data, the total size of the event is still likely to be pretty small. This means that each buffer can likely hold thousands of events before it has to be processed. When the event buffers are finally processed there is an economy of scale achieved since most targets support bulk processing of the events so they are processed at the buffer level rather than the individual event level. When all this is working together it’s more likely that a full buffer will be processed and put back into the ready queue before the remaining buffers (remember, there are at least three) are full. I know what you’re going to say: “My server is exceptional! My workload is so massive it defies categorization!” OK, maybe you weren’t going to say that exactly, but you were probably thinking it. The point is that there are situations that won’t be covered by the Default, but that’s a good place to start and this post assumes you’ve started there so that you have something to look at in order to determine if you do have a special case that needs different settings. So let’s get to the special cases… What event just fired?! How about now?! Now?! If you believe the commercial adage from Heinz Ketchup (Heinz Slow Good Ketchup ad on You Tube), some things are worth the wait. This is not a belief held by most DBAs, particularly DBAs who are looking for an answer to a troubleshooting question fast. If you’re one of these anxious DBAs, or maybe just a Program Manager doing a demo, then 30 seconds might be longer than you’re comfortable waiting. If you find yourself in this situation then consider changing the MAX_DISPATCH_LATENCY option for your event session. This option will force the event buffers to be processed based on your time schedule. This option only makes sense for the asynchronous targets since those are the ones where we allow events to build up in the event buffer – if you’re using one of the synchronous targets this option isn’t relevant. Avoid forgotten events by increasing your memory Have you ever had one of those days where you keep forgetting things? That can happen in Extended Events too; we call it dropped events. In order to optimizes for server performance and help ensure that the Extended Events doesn’t block the server if to drop events that can’t be published to a buffer because the buffer is full. You can determine if events are being dropped from a session by querying the dm_xe_sessions DMV and looking at the dropped_event_count field. Aside: Should you care if you’re dropping events?Maybe not – think about why you’re collecting data in the first place and whether you’re really going to miss a few dropped events. For example, if you’re collecting query duration stats over thousands of executions of a query it won’t make a huge difference to miss a couple executions. Use your best judgment. If you find that your session is dropping events it means that the event buffer is not large enough to handle the volume of events that are being published. There are two ways to address this problem. First, you could collect fewer events – examine you session to see if you are over collecting. Do you need all the actions you’ve specified? Could you apply a predicate to be more specific about when you fire the event? Assuming the session is defined correctly, the next option is to change the MAX_MEMORY option to a larger number. Picking the right event buffer size might take some trial and error, but a good place to start is with the number of dropped events compared to the number you’ve collected. Aside: There are three different behaviors for dropping events that you specify using the EVENT_RETENTION_MODE option. The default is to allow single event loss and you should stick with this setting since it is the best choice for keeping the impact on server performance low.You’ll be tempted to use the setting to not lose any events (NO_EVENT_LOSS) – resist this urge since it can result in blocking on the server. If you’re worried that you’re losing events you should be increasing your event buffer memory as described in this section. Some events are too big to fail A less common reason for dropping an event is when an event is so large that it can’t fit into the event buffer. Even though most events are going to be small, you might find a condition that occasionally generates a very large event. You can determine if your session is dropping large events by looking at the dm_xe_sessions DMV once again, this time check the largest_event_dropped_size. If this value is larger than the size of your event buffer [remember, the size of your event buffer, by default, is max_memory / 3] then you need a large event buffer. To specify a large event buffer you set the MAX_EVENT_SIZE option to a value large enough to fit the largest event dropped based on data from the DMV. When you set this option the Extended Events engine will create two buffers of this size to accommodate these large events. As an added bonus (no extra charge) the large event buffer will also be used to store normal events in the cases where the normal event buffers are all full and waiting to be processed. (Note: This is just a side-effect, not the intended use. If you’re dropping many normal events then you should increase your normal event buffer size.) Partitioning: moving your events to a sub-division Earlier I alluded to the fact that you can configure your event session to use more than the standard three event buffers – this is called partitioning and is controlled by the MEMORY_PARTITION_MODE option. The result of setting this option is fairly easy to explain, but knowing when to use it is a bit more art than science. First the science… You can configure partitioning in three ways: None, Per NUMA Node & Per CPU. This specifies the location where sets of event buffers are created with fairly obvious implication. There are rules we follow for sub-dividing the total memory (specified by MAX_MEMORY) between all the event buffers that are specific to the mode used: None: 3 buffers (fixed)Node: 3 * number_of_nodesCPU: 2.5 * number_of_cpus Here are some examples of what this means for different Node/CPU counts: Configuration None Node CPU 2 CPUs, 1 Node 3 buffers 3 buffers 5 buffers 6 CPUs, 2 Node 3 buffers 6 buffers 15 buffers 40 CPUs, 5 Nodes 3 buffers 15 buffers 100 buffers Aside: Buffer size on multi-processor computersAs the number of Nodes or CPUs increases, the size of the event buffer gets smaller because the total memory is sub-divided into more pieces. The defaults will hold up to this for a while since each buffer set is holding events only from the Node or CPU that it is associated with, but at some point the buffers will get too small and you’ll either see events being dropped or you’ll get an error when you create your session because you’re below the minimum buffer size. Increase the MAX_MEMORY setting to an appropriate number for the configuration. The most likely reason to start partitioning is going to be related to performance. If you notice that running an event session is impacting the performance of your server beyond a reasonably expected level [Yes, there is a reasonably expected level of work required to collect events.] then partitioning might be an answer. Before you partition you might want to check a few other things: Is your event retention set to NO_EVENT_LOSS and causing blocking? (I told you not to do this.) Consider changing your event loss mode or increasing memory. Are you over collecting and causing more work than necessary? Consider adding predicates to events or removing unnecessary events and actions from your session. Are you writing the file target to the same slow disk that you use for TempDB and your other high activity databases? <kidding> <not really> It’s always worth considering the end to end picture – if you’re writing events to a file you can be impacted by I/O, network; all the usual stuff. Assuming you’ve ruled out the obvious (and not so obvious) issues, there are performance conditions that will be addressed by partitioning. For example, it’s possible to have a successful event session (eg. no dropped events) but still see a performance impact because you have many CPUs all attempting to write to the same free buffer and having to wait in line to finish their work. This is a case where partitioning would relieve the contention between the different CPUs and likely reduce the performance impact cause by the event session. There is no DMV you can check to find these conditions – sorry – that’s where the art comes in. This is largely a matter of experimentation. On the bright side you probably won’t need to to worry about this level of detail all that often. The performance impact of Extended Events is significantly lower than what you may be used to with SQL Trace. You will likely only care about the impact if you are trying to set up a long running event session that will be part of your everyday workload – sessions used for short term troubleshooting will likely fall into the “reasonably expected impact” category. Hey buddy – I think you forgot something OK, there are two options I didn’t cover: STARTUP_STATE & TRACK_CAUSALITY. If you want your event sessions to start automatically when the server starts, set the STARTUP_STATE option to ON. (Now there is only one option I didn’t cover.) I’m going to leave causality for another post since it’s not really related to session behavior, it’s more about event analysis. - Mike Share this post: email it! | bookmark it! | digg it! | reddit! | kick it! | live it!

Read the article
Kuppinger Cole Paper on Entitlements Server

- by Naresh Persaud

Kuppinger Cole recently released a paper discussing external authorization describing how organizations can "future proof" their enterprise security by deploying Oracle Entitlements Server. By taking a declarative security approach, security policy can be flexible and distributed across multiple applications consistently. You can get a copy of the report here. In fact Oracle Entitlements Server is being used in many places to secure data and sensitive business transactions. The paper covers the major use cases for Entitlements Server as well as Kuppinger Cole's assessment of the market. Here are some additional resources that reinforce the cases discussed in the paper. Today applications for cloud and mobile applications can utilize RESTful interfaces. Click on this link to learn how. OES can also be used to secure data in Oracle Databases. To learn more check out the new Oracle U OES 11g course.

Read the article

Search Results

Search found 6682 results on 268 pages for 'edge cases'.

Page 23/268 | < Previous Page | 19 20 21 22 23 24 25 26 27 28 29 30 | Next Page >

- by PhilW

- by ThunderChunky_SF

- by ryvantage

- by Peter Perhác

- by OghmaOsiris

- by user462779

- by nganju

- by Kovsa

- by Raven Dreamer

- by user462779

- by Chris Morris

- by JuergenKress

- by Peter

- by matec

- by Aerovistae

- by CaptainAwesomePants

- by Pops

- by alexanderpine

- by jasonw

- by Simon Thorpe

- by extended_events

- by FransBouma

- by extended_events

- by Naresh Persaud

< Previous Page | 19 20 21 22 23 24 25 26 27 28 29 30 | Next Page >