Search Results

Search found 11362 results on 455 pages for 'big o analysis'.

Page 91/455 | < Previous Page | 87 88 89 90 91 92 93 94 95 96 97 98  | Next Page >

  • C++ Little Wonders: The C++11 auto keyword redux

    - by James Michael Hare
    I’ve decided to create a sub-series of my Little Wonders posts to focus on C++.  Just like their C# counterparts, these posts will focus on those features of the C++ language that can help improve code by making it easier to write and maintain.  The index of the C# Little Wonders can be found here. This has been a busy week with a rollout of some new website features here at my work, so I don’t have a big post for this week.  But I wanted to write something up, and since lately I’ve been renewing my C++ skills in a separate project, it seemed like a good opportunity to start a C++ Little Wonders series.  Most of my development work still tends to focus on C#, but it was great to get back into the saddle and renew my C++ knowledge.  Today I’m going to focus on a new feature in C++11 (formerly known as C++0x, which is a major move forward in the C++ language standard).  While this small keyword can seem so trivial, I feel it is a big step forward in improving readability in C++ programs. The auto keyword If you’ve worked on C++ for a long time, you probably have some passing familiarity with the old auto keyword as one of those rarely used C++ keywords that was almost never used because it was the default. That is, in the code below (before C++11): 1: int foo() 2: { 3: // automatic variables (allocated and deallocated on stack) 4: int x; 5: auto int y; 6:  7: // static variables (retain their value across calls) 8: static int z; 9:  10: return 0; 11: } The variable x is assumed to be auto because that is the default, thus it is unnecessary to specify it explicitly as in the declaration of y below that.  Basically, an auto variable is one that is allocated and de-allocated on the stack automatically.  Contrast this to static variables, that are allocated statically and exist across the lifetime of the program. Because auto was so rarely (if ever) used since it is the norm, they decided to remove it for this purpose and give it new meaning in C++11.  The new meaning of auto: implicit typing Now, if your compiler supports C++ 11 (or at least a good subset of C++11 or 0x) you can take advantage of type inference in C++.  For those of you from the C# world, this means that the auto keyword in C++ now behaves a lot like the var keyword in C#! For example, many of us have had to declare those massive type declarations for an iterator before.  Let’s say we have a std::map of std::string to int which will map names to ages: 1: std::map<std::string, int> myMap; And then let’s say we want to find the age of a given person: 1: // Egad that's a long type... 2: std::map<std::string, int>::const_iterator pos = myMap.find(targetName); Notice that big ugly type definition to declare variable pos?  Sure, we could shorten this by creating a typedef of our specific map type if we wanted, but now with the auto keyword there’s no need: 1: // much shorter! 2: auto pos = myMap.find(targetName); The auto now tells the compiler to determine what type pos should be based on what it’s being assigned to.  This is not dynamic typing, it still determines the type as if it were explicitly declared and once declared that type cannot be changed.  That is, this is invalid: 1: // x is type int 2: auto x = 42; 3:  4: // can't assign string to int 5: x = "Hello"; Once the compiler determines x is type int it is exactly as if we typed int x = 42; instead, so don’t' confuse it with dynamic typing, it’s still very type-safe. An interesting feature of the auto keyword is that you can modify the inferred type: 1: // declare method that returns int* 2: int* GetPointer(); 3:  4: // p1 is int*, auto inferred type is int 5: auto *p1 = GetPointer(); 6:  7: // ps is int*, auto inferred type is int* 8: auto p2 = GetPointer(); Notice in both of these cases, p1 and p2 are determined to be int* but in each case the inferred type was different.  because we declared p1 as auto *p1 and GetPointer() returns int*, it inferred the type int was needed to complete the declaration.  In the second case, however, we declared p2 as auto p2 which means the inferred type was int*.  Ultimately, this make p1 and p2 the same type, but which type is inferred makes a difference, if you are chaining multiple inferred declarations together.  In these cases, the inferred type of each must match the first: 1: // Type inferred is int 2: // p1 is int* 3: // p2 is int 4: // p3 is int& 5: auto *p1 = GetPointer(), p2 = 42, &p3 = p2; Note that this works because the inferred type was int, if the inferred type was int* instead: 1: // syntax error, p1 was inferred to be int* so p2 and p3 don't make sense 2: auto p1 = GetPointer(), p2 = 42, &p3 = p2; You could also use const or static to modify the inferred type: 1: // inferred type is an int, theAnswer is a const int 2: const auto theAnswer = 42; 3:  4: // inferred type is double, Pi is a static double 5: static auto Pi = 3.1415927; Thus in the examples above it inferred the types int and double respectively, which were then modified to const and static. Summary The auto keyword has gotten new life in C++11 to allow you to infer the type of a variable from it’s initialization.  This simple little keyword can be used to cut down large declarations for complex types into a much more readable form, where appropriate.   Technorati Tags: C++, C++11, Little Wonders, auto

    Read the article

  • An MCM exam, Rob? Really?

    - by Rob Farley
    I took the SQL 2008 MCM Knowledge exam while in Seattle for the PASS Summit ten days ago. I wasn’t planning to do it, but I got persuaded to try. I was meaning to write this post to explain myself before the result came out, but it seems I didn’t get typing quickly enough. Those of you who know me will know I’m a big fan of certification, to a point. I’ve been involved with Microsoft Learning to help create exams. I’ve kept my certifications current since I first took an exam back in 1998, sitting many in beta, across quite a variety of topics. I’ve probably become quite good at them – I know I’ve definitely passed some that I really should’ve failed. I’ve also written that I don’t think exams are worth studying for. (That’s probably not entirely true, but it depends on your motivation. If you’re doing learning, I would encourage you to focus on what you need to know to do your job better. That will help you pass an exam – but the two skills are very different. I can coach someone on how to pass an exam, but that’s a different kind of teaching when compared to coaching someone about how to do a job. For example, the real world includes a lot of “it depends”, where you develop a feel for what the influencing factors might be. In an exam, its better to be able to know some of the “Don’t use this technology if XYZ is true” concepts better.) As for the Microsoft Certified Master certification… I’m not opposed to the idea of having the MCM (or in the future, MCSM) cert. But the barrier to entry feels quite high for me. When it was first introduced, the nearest testing centres to me were in Kuala Lumpur and Manila. Now there’s one in Perth, but that’s still a big effort. I know there are options in the US – such as one about an hour’s drive away from downtown Seattle, but it all just seems too hard. Plus, these exams are more expensive, and all up – I wasn’t sure I wanted to try them, particularly with the fact that I don’t like to study. I used to study for exams. It would drive my wife crazy. I’d have some exam scheduled for some time in the future (like the time I had two booked for two consecutive days at TechEd Australia 2005), and I’d make sure I was ready. Every waking moment would be spent pouring over exam material, and it wasn’t healthy. I got shaken out of that, though, when I ended up taking four exams in those two days in 2005 and passed them all. I also worked out that if I had a Second Shot available, then failing wasn’t a bad thing at all. Even without Second Shot, I’m much more okay about failing. But even just trying an MCM exam is a big effort. I wouldn’t want to fail one of them. Plus there’s the illusion to maintain. People have told me for a long time that I should just take the MCM exams – that I’d pass no problem. I’ve never been so sure. It was almost becoming a pride-point. Perhaps I should fail just to demonstrate that I can fail these things. Anyway – boB Taylor (@sqlboBT) persuaded me to try the SQL 2008 MCM Knowledge exam at the PASS Summit. They set up a testing centre in one of the room there, so it wasn’t out of my way at all. I had to squeeze it in between other commitments, and I certainly didn’t have time to even see what was on the syllabus, let alone study. In fact, I was so exhausted from the week that I fell asleep at least once (just for a moment though) during the actual exam. Perhaps the questions need more jokes, I’m not sure. I knew if I failed, then I might disappoint some people, but that I wouldn’t’ve spent a great deal of effort in trying to pass. On the other hand, if I did pass I’d then be under pressure to investigate the MCM Lab exam, which can be taken remotely (therefore, a much smaller amount of effort to make happen). In some ways, passing could end up just putting a bunch more pressure on me. Oh, and I did.

    Read the article

  • SQL SERVER – Integrate Your Data with Skyvia – Cloud ETL Solution

    - by Pinal Dave
    In our days data integration often becomes a key aspect of business success. For business analysts it’s very important to get integrated data from various sources, such as relational databases, cloud CRMs, etc. to make correct and successful decisions. There are various data integration solutions on market, and today I will tell about one of them – Skyvia. Skyvia is a cloud data integration service, which allows integrating data in cloud CRMs and different relational databases. It is a completely online solution and does not require anything except for a browser. Skyvia provides powerful etl tools for data import, export, replication, and synchronization for SQL Server and other databases and cloud CRMs. You can use Skyvia data import tools to load data from various sources to SQL Server (and SQL Azure). Skyvia supports such cloud CRMs as Salesforce and Microsoft Dynamics CRM and such databases as MySQL and PostgreSQL. You even can migrate data from SQL Server to SQL Server, or from SQL Server to other databases and cloud CRMs. Additionally Skyvia supports import of CSV files, either uploaded manually or stored on cloud file storage services, such as Dropbox, Box, Google Drive, or FTP servers. When data import is not enough, Skyvia offers bidirectional data synchronization. With this tool, you can synchronize SQL Server data with other databases and cloud CRMs. After performing the first synchronization, Skyvia tracks data changes in the synchronized data storages. In SQL Server databases (and other relational databases) it creates additional tracking tables and triggers. This allows synchronizing only the changed data. Skyvia also maps records by their primary key values to each other, so it does not require different sources to have the same primary key structure. It still can match the corresponding records without having to add any additional columns or changing data structure. The only requirement for synchronization is that primary keys must be autogenerated. With Skyvia it’s not necessary for data to have the same structure in integrated data storages. Skyvia supports powerful mapping mechanisms that allow synchronizing data with completely different structure. It provides support for complex mathematical and string expressions when mapping data, using lookups, etc. You may use data splitting – loading data from a single CSV file or source table to multiple related target tables. Or you may load data from several source CSV files or tables to several related target tables. In each case Skyvia preserves data relations. It builds corresponding relations between the target data automatically. When you often work with cloud CRM data, native CRM data reporting and analysis tools may be not enough for you. And there is a vast set of professional data analysis and reporting tools available for SQL Server. With Skyvia you can quickly copy your cloud CRM data to an SQL Server database and apply corresponding SQL Server tools to the data. In such case you can use Skyvia data replication tools. It allows you to quickly copy cloud CRM data to SQL Server or other databases without customizing any mapping. You need just to specify columns to copy data from. Target database tables will be created automatically. Skyvia offers powerful filtering settings to replicate only the records you need. Skyvia also provides capability to export data from SQL Server (including SQL Azure) and other databases and cloud CRMs to CSV files. These files can be either downloadable manually or loaded to cloud file storages or FTP server. You can use export, for example, to backup SQL Azure data to Dropbox. Any data integration operation can be scheduled for automatic execution. Thus, you can automate your SQL Azure data backup or data synchronization – just configure it once, then schedule it, and benefit from automatic data integration with Skyvia. Currently registration and using Skyvia is completely free, so you can try it yourself and find out whether its data migration and integration tools suits for you. Visit this link to register on Skyvia: https://app.skyvia.com/register Reference: Pinal Dave (http://blog.sqlauthority.com)Filed under: PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL Tagged: Cloud Computing

    Read the article

  • Stir Trek: Iron Man Edition Recap and Photos

    - by Brian Jackett
    If you’ve noticed my blogging activity has reduced in frequency and technical content lately it’s primarily due to all of the conferences I’ve been attending, speaking at, or planning in the past few months.  This past Friday myself and six other dedicated individuals put on Stir Trek: Iron Man Edition as the culmination of a few months of hard work.  For those unfamiliar, Stir Trek is a web developer conference that was founded last year as an event to showcase content from Microsoft’s MIX conference and end the day with a private showing of the then just-released Star Trek movie.  This year’s conference expanded from 2 to 4 content tracks and upped the number of tickets from 350 to 600.  Even more amazing was the fact that we had 592 people show up day of the event for the lowest drop-off percentage of any conference I’ve been to before.   Nerd Dinner and Swag Bags     The night before Stir Trek: Iron Man Edition we hosted a nerd dinner at the Polaris Shopping mall food court with about 30 in attendance.  Nerd dinners are a great time to meet others passionate about technology and socialize before the whirlwind of the conference hits.  After the nerd dinner 20+ volunteers headed to the conference location and helped us stuff swag bags.  This in and of itself was a monumental task of putting together 600 swag bags with numerous leaflets, sponsor items, and t-shirts.  A big thanks goes out to all who assisted us that night so that we could finish in just under 2 hours instead of taking all night.  My sleep schedule also thanks you. Morning of Stir Trek     After getting a decent amount of sleep I arrived at Marcus Crosswoods theater at 6am to begin setting up for the day.  Myself and Jody Morgan were in charge of registration so we got tables set up, laid out swag bags, and organized our volunteer crew to assist with checking-in attendees.  Despite having 600+ people registration went fairly smoothly and got the day off to a great start.  I especially appreciated the 3+ cups of coffee from Crimson Cup, a local coffee shop.  For any of you that know me you’ll know that I rarely drink coffee except a few times a year when I really need the energy, so that says a lot about how good their coffee is.   Conference Starts     Once registration was completed the day kicked off with Molly Holzschlag keynoting.  Unfortunately Molly suffered from an ear infection and wasn’t able to fly so she had a virtual keynote and a session later in the day.  I was working behind the scenes on various tasks so I was only able to drop in very briefly on the keynote and rest of the morning sessions.  Throughout the day I tried to grab at least 1 or 2 pics of each presenter.  See my album below for the full set of pics.      For lunch we ordered around 150 pizzas from Mellow Mushroom, a local pizza place (notice the theme of supporting local businesses.)  Early on we were concerned about Mellow Mushroom being able to supply that many pizzas and get them delivered (still hot) to the theater, but they did an excellent job day of the event.  I wish I had gotten some pictures of the old school VW van they delivered the pizza in, but I was just a bit busy running around trying to get theaters ready for lunch.  We had attendees from last year who specifically requested that we have Mellow Mushroom supply lunch this year and I’m glad everything worked out being able to use them again.     During the afternoon I was able to attend a few sessions and hear some great content from various speakers.  It was also nice to just sit down and get off my feet for a bit.  After the last sessions the day concluded with a raffle.  There were a few logistical and technical issues that hampered our ability to smoothly conduct the raffle.  To those of you that agree the raffle wasn’t the smoothest experience I would like to say that the Stir Trek planning committee has already begun meeting to discuss ways of improving the conference for next year.  We are also accepting feedback (both positive and negative) at the following link: click here.  If you don’t wish to use the Joind In site you can also email me directly and I’ll be sure to pass along the feedback.   Iron Man 2 Movie     Last but not least, what Stir Trek event would be complete without the feature movie.  This year’s movie was Iron Man 2.  The theater had some really cool props and promotions (see pic below) for the movie.  I really enjoyed Iron Man 2, but I would recommend brushing up on the Iron Man comics and Marvel’s plans for future movies to understand some of the plot elements that come up.  Also make sure you stay through to the end of the movie credits to see a sneak peak of something special, that’s all I’ll say. Conclusion     Again a big thanks goes out to all of the speakers, sponsors, attendees, movie theater staff, volunteers, and everyone else involved in making this event great.  Also big thanks to my fellow Stir Trek planning committee members: Jeff Blankenburg, Matt Casto, Carey Payette, Jody Morgan, Rick Kierner, and Sarah Dutkiewitcz.  I am grateful for everything I learned while helping plan this event and look forward to being involved again next year.  For those interested we are currently targeting Thor as our movie theme for 2011 and then The Avengers for 2012.  These are tentative based on release dates that could shift as we get closer, but for now look solid.   Photos Pics on Facebook (includes tagging)     Stir Trek: Iron Man Edition photos on Facebook Pics on Live site (higher res)      View Full Album         -Frog Out

    Read the article

  • SQL SERVER – TechEd India 2012 – Content, Speakers and a Lots of Fun

    - by pinaldave
    TechEd is one event which every developers and IT professionals are looking forward to attend. It is opportunity of life time and no matter how many time one gets chance to engage with it, it is never enough. I still remember every single moment of every TechEd I have attended so far. We are less than 100 hours away from TechEd India 2012 event.This event is the one must attend event for every Technology Enthusiast. Fourth time in the row I am going to attend this event and I am equally excited as the first time of the event. There are going to be two very solid SQL Server track this time and I will be attending end of the end both the tracks. Here is my view on each of the 10 sessions. Each session is carefully crafted and leading exeprts from industry will present it. Day 1, March 21, 2012 T-SQL Rediscovered with SQL Server 2012 – This session is going to bring some of the lesser known enhancements that were brought with SQL Server 2012. When I learned that Jacob Sebastian is going to do this session my reaction to this is DEMO, DEMO and DEMO! Jacob spends hours and hours of his time preparing his session and this will be one of those session that I am confident will be delivered over and over through out the next many events. Catapult your data with SQL Server 2012 Integration Services – Praveen is expert story teller and one of the wizard when it is about SQL Server and business intelligence. He is surely going to mesmerize you with some interesting insights on SSIS performance too. Processing Big Data with SQL Server 2012 and Hadoop – There are three sessions on Big Data at TechEd India 2012. Stephen is going to deliver one of the session. Watching Stephen present is always joy and quite entertaining. He shares knowledge with his typical humor which captures ones attention. I wrote about what is BIG DATA in a blog post. SQL Server Misconceptions and Resolutions – I will be presenting this Session along with Vinod Kumar. READ MORE HERE. Securing with ContainedDB in SQL Server 2012 – Pranab is expert when it is about SQL Server and Security. I have seen him presenting and he is indeed very pleasant to watch. A dry subject like security, he makes it much lively. A Contained Database is a database which contains all the necessary settings and metadata, making database easily portable to another server. This database will contain all the necessary details and will not have to depend on any server where it is installed for anything. You can take this database and move it to another server without having any worries. Day 3, March 23, 2012 Peeling SQL Server like an Onion: Internals Demystified – Vinod Kumar has been writing about this extensively on his other blog post. In recent conversation he suggested that he will be creating very exclusive content for this presentation. I know Vinod for long time and have worked with him along many community activities. I am going to pay special attention to the details. I know Vinod has few give-away planned now for attending the session now only if he shares with us. Speed Up – Parallel Processes and unparalleled Performance – Performance tuning is my favorite subject. I will be discussing effect of parallelism on performance in this session. Here me out, there will be lots of quiz questions during this session and if you get the answers correct – you can win some really cool goodies – I Promise! READ MORE HERE. Keep your database available – AlwaysOn – Balmukund is like an army man. He is always ready to show and prove that he has coolest toys in terms of SQL Server and he knows how to keep them running AlwaysON. Availability groups, Listener, Clustering, Failover, Read-Only replica etc all will be demo’ed in this session. This is really heavy but very interesting content not to be missed. Lesser known facts about SQL Server Backup and Restore – Amit Banerjee – this name is known internationally for solving SQL Server problems in 140 characters. He has already blogged about this and this topic is going to be interesting. A successful restore strategy for applications is as good as their last good known backup. I have few difficult questions to ask to Amit and I am very sure that his unique style will entertain people. By the way, his one of the slide may give few in audience a funny heart attack. Top 5 reasons why you want SQL Server 2012 BI – Praveen plans to take a tour of some of the BI enhancements introduced in the new version. Business Insights with SQL Server is a critical building block and this version of SQL Server is no exception. For the matter of the fact, when I saw the demos he was going to show during this session, I felt like that I wish I can set up all of this on my machine. If you miss this session – you will miss one of the most informative session of the day. Also TechEd India 2012 has a Live streaming of some content and this can be watched here. The TechEd Team is planning to have some really good exclusive content in this channel as well. If you spot me, just do not hesitate to come by me and introduce yourself, I want to remember you! Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, SQLAuthority Author Visit, SQLServer, T SQL, Technology Tagged: TechEd, TechEdIn

    Read the article

  • What Will Happen to Real Estate Leases when Operating Leases are Gone?

    - by Theresa Hickman
    Many people are concerned about what will happen to real estate leases when FASB and IASB abolish operating leases. They plan to unveil the proposed standards on treating leases this summer as part of the convergence project but no "finalized ruling" is expected for at least a year because it will need to get formal consensus from many players, such as the SEC, American Association of Investors, Congress, the Big Four, American Associate of Realtors, the international equivalents of these, etc. If your accounting is a bit rusty, an Operating Lease is where you lease equipment or some asset for a shorter period than the actual (expected) life of the asset and then give the asset back while it still has some useful life in it. (Think leasing a car). Because an Operating Lease does not contain any of the provisions that would qualify it as a Capital Lease, the lease is not treated as a sale or purchase and hits the lessee's rental expense and the lessor's revenue. So it all stays on the P&L (assuming no prepayments are made). Capital Leases, on the other hand, hit lessee's and lessor's balance sheets because the asset is treated as a sale. (I'm ignoring interest and depreciation here to emphasize my point). Question: What will happen to real estate leases when Operating Leases go away and how will Oracle Financials address these changes? Before I attempt to address these questions, here's a real-life example to expound on some of the issues: Let's say a U.S. retailer leases a store in a mall for 15 years. Under U.S. GAAP, the lease is considered an operating or expense lease. Will that same lease be considered a capital lease under IFRS? Real estate leases are supposedly going to be capitalized under IFRS. If so, will everyone need to change all leases from operating to capital? Or, could we make some adjustments so we report the lease as an expense for operations reporting but capitalize it for SEC reporting? Would all aspects of the lease be capitalized, or would some line items still be expensed? For example, many retail store leases are defined to include (1) the agreed-to rent amount; (2) a negotiated increase in base rent, e.g., maybe a 5% increase in Year 5; (3) a sales rent component whereby the retailer pays a variable additional amount based on the sales generated in the prior month; (4) parking lot maintenance fees. Would the entire lease be capitalized, or would some portions still be expensed? To help answer these questions, I met up with our resident accounting expert and walking encyclopedia, Seamus Moran. Here's what he had to say: Oracle is aware of the potential changes specific to reporting/capitalization of real estate leases; i.e., we are aware that FASB and IASB have identified real estate leases as one of the areas for standards convergence. Oracle stays apprised of the on-going convergence through our domain expertise staff, our relationship with customers, our market awareness, and, of course, our relationships with the Big 4. This is part of our normal process with respect to regulatory compliance worldwide. At this time, Oracle expects that the standards convergence committee will make a recommendation about reporting standards for real estate leases in about a year. Following typical procedures, we also expect that the recommendation will be up for review for a year, and customers will then need to start reporting to the new standard about a year after that. So that means we would expect the first customer to report under the new standard in maybe 3 years. Typically, after the new standard is finalized and distributed, we find that our customers then begin to evaluate how they plan to meet the new standard. And through groups like the Customer Advisory Boards (CABs), our customers tell us what kind of product changes are needed in order to satisfy their new reporting requirements. Of course, Oracle is also working with the Big 4 and Accenture and other implementers in order to ascertain that these recommended changes will indeed meet new reporting standards. So the best advice we can offer right now is, stay apprised of the standards convergence committee; know that Oracle is also staying abreast of developments; get involved with your CAB so your voice is heard; know that Oracle products continue to be GAAP compliant, and we will continue to maintain that as our standard. But exactly what is that "standard"--we need to wait on the standards convergence committee. In a nut shell, operating leases will become either capital leases or month to month rentals, but it is still too early, too political and too uncertain to call out at this point.

    Read the article

  • Tuning Red Gate: #4 of Some

    - by Grant Fritchey
    First time connecting to these servers directly (keys to the kingdom, bwa-ha-ha-ha. oh, excuse me), so I'm going to take a look at the server properties, just to see if there are any issues there. Max memory is set, cool, first possible silly mistake clear. In fact, these look to be nicely set up. Oh, I'd like to see the ANSI Standards set by default, but it's not a big deal. The default location for database data is the F:\ drive, where I saw all the activity last time. Cool, the people maintaining the servers in our company listen, parallelism threshold is set to 35 and optimize for ad hoc is enabled. No shocks, no surprises. The basic setup is appropriate. On to the problem database. Nothing wrong in the properties. The database is in SIMPLE recovery, but I think it's a reporting system, so no worries there. Again, I'd prefer to see the ANSI settings for connections, but that's the worst thing I can see. Time to look at the queries, tables, indexes and statistics because all the information I've collected over the last several days suggests that we're not looking at a systemic problem (except possibly not enough memory), but at the traditional tuning issues. I just want to note that, I started looking at the system, not the queries. So should you when tuning your environment. I know, from the data collected through SQL Monitor, what my top poor performing queries are, and the most frequently called, etc. I'm starting with the most frequently called. I'm going to get the execution plan for this thing out of the cache (although, with the cache dumping constantly, I might not get it). And it's not there. Called 1.3 million times over the last 3 days, but it's not in cache. Wow. OK. I'll see what's in cache for this database: SELECT  deqs.creation_time,         deqs.execution_count,         deqs.max_logical_reads,         deqs.max_elapsed_time,         deqs.total_logical_reads,         deqs.total_elapsed_time,         deqp.query_plan,         SUBSTRING(dest.text, (deqs.statement_start_offset / 2) + 1,                   (deqs.statement_end_offset - deqs.statement_start_offset) / 2                   + 1) AS QueryStatement FROM    sys.dm_exec_query_stats AS deqs         CROSS APPLY sys.dm_exec_sql_text(deqs.sql_handle) AS dest         CROSS APPLY sys.dm_exec_query_plan(deqs.plan_handle) AS deqp WHERE   dest.dbid = DB_ID('Warehouse') AND deqs.statement_end_offset > 0 AND deqs.statement_start_offset > 0 ORDER BY deqs.max_logical_reads DESC ; And looking at the most expensive operation, we have our first bad boy: Multiple table scans against very large sets of data and a sort operation. a sort operation? It's an insert. Oh, I see, the table is a heap, so it's doing an insert, then sorting the data and then inserting into the primary key. First question, why isn't this a clustered index? Let's look at some more of the queries. The next one is deceiving. Here's the query plan: You're thinking to yourself, what's the big deal? Well, what if I told you that this thing had 8036318 reads? I know, you're looking at skinny little pipes. Know why? Table variable. Estimated number of rows = 1. Actual number of rows. well, I'm betting several more than one considering it's read 8 MILLION pages off the disk in a single execution. We have a serious and real tuning candidate. Oh, and I missed this, it's loading the table variable from a user defined function. Let me check, let me check. YES! A multi-statement table valued user defined function. And another tuning opportunity. This one's a beauty, seriously. Did I also mention that they're doing a hash against all the columns in the physical table. I'm sure that won't lead to scans of a 500,000 row table, no, not at all. OK. I lied. Of course it is. At least it's on the top part of the Loop which means the scan is only executed once. I just did a cursory check on the next several poor performers. all calling the UDF. I think I found a big tuning opportunity. At this point, I'm typing up internal emails for the company. Someone just had their baby called ugly. In addition to a series of suggested changes that we need to implement, I'm also apologizing for being such an unkind monster as to question whether that third eye & those flippers belong on such an otherwise lovely child.

    Read the article

  • ORA-4031 Troubleshooting

    - by [email protected]
      QUICKLINK: Note 396940.1 Troubleshooting and Diagnosing ORA-4031 Error Note 1087773.1 : ORA-4031 Diagnostics Tools [Video]   Have you observed an ORA-04031 error reported in your alert log? An ORA-4031 error is raised when memory is unavailable for use or reuse in the System Global Area (SGA).  The error message will indicate the memory pool getting errors and high level information about what kind of allocation failed and how much memory was unavailable.  The challenge with ORA-4031 analysis is that the error and associated trace is for a "victim" of the problem.   The failing code ran into the memory limitation, but in almost all cases it was not part of the root problem.    Looking for the best way to diagnose? When an ORA-4031 error occurs, a trace file is raised and noted in the alert log if the process experiencing the error is a background process.   User processes may experience errors without reports in the alert log or traces generated.   The V$SHARED_POOL_RESERVED view will show reports of misses for memory over the life of the database. Diagnostics scripts are available in Note 430473.1 to help in analysis of the problem.  There is also a training video on using and interpreting the script data Note 1087773.1. 11g DiagnosabilityStarting with Oracle Database 11g Release 1, the Diagnosability infrastructure was introduced which places traces and core files into a location controlled by the DIAGNOSTIC_DEST initialization parameter when an incident, such as an ORA-4031 occurs. For earlier versions, the trace file will be written to either USER_DUMP_DEST (if the error was caught in a user process) or BACKGROUND_DUMP_DEST (if the error was caught in a background process like PMON or SMON). The trace file contains vital information about what led to the error condition.  Note 443529.1 11g Quick Steps to Package and Send Critical Error Diagnostic Information to Support[Video]Oracle Configuration Manager (OCM)Oracle Configuration Manager (OCM) works with My Oracle Support to enable proactive support capability that helps you organize, collect and manage your Oracle configurations.Oracle Configuration Manager Quick Start GuideNote 548815.1: My Oracle Support Configuration Management FAQ Note 250434.1: BULLETIN: Learn More About My Oracle Support Configuration Manager    Common Causes/Solutions The ORA-4031 can occur for many different reasons.  Some possible causes are: SGA components too small for workload Auto-tuning issues Fragmentation due to application design Bug/leaks in memory allocationsFor more on the 4031 and how this affects the SGA, see Note 396940.1 Troubleshooting and Diagnosing ORA-4031 Error Because of the multiple potential causes, it is important to gather enough diagnostics so that an appropriate solution can be identified.  However, most commonly the cause is associated with configuration tuning.   Ensuring that MEMORY_TARGET or SGA_TARGET are large enough to accommodate workload can get around many scenarios.  The default trace associated with the error provides very high level information about the memory problem and the "victim" that ran into the issue.   The data in the default trace is not going to point to the root cause of the problem. When migrating from 9i to 10g and higher, it is necessary to increase the size of the Shared Pool due to changes in the basic design of the shared memory area. Note 270935.1 Shared pool sizing in 10gNOTE: Diagnostics on the errors should be investigated as close to the time of the error(s) as possible.  If you must restart a database, it is not feasible to diagnose the problem until the database has matured and/or started seeing the problems again. Note 801787.1 Common Cause for ORA-4031 in 10gR2, Excess "KGH: NO ACCESS" Memory Allocation ***For reference to the content in this blog, refer to Note.1088239.1 Master Note for Diagnosing ORA-4031 

    Read the article

  • LexisNexis and Oracle Join Forces to Prevent Fraud and Identity Abuse

    - by Tanu Sood
    Author: Mark Karlstrand About the Writer:Mark Karlstrand is a Senior Product Manager at Oracle focused on innovative security for enterprise web and mobile applications. Over the last sixteen years Mark has served as director in a number of tech startups before joining Oracle in 2007. Working with a team of talented architects and engineers Mark developed Oracle Adaptive Access Manager, a best of breed access security solution.The world’s top enterprise software company and the world leader in data driven solutions have teamed up to provide a new integrated security solution to prevent fraud and misuse of identities. LexisNexis Risk Solutions, a Gold level member of Oracle PartnerNetwork (OPN), today announced it has achieved Oracle Validated Integration of its Instant Authenticate product with Oracle Identity Management.Oracle provides the most complete Identity and Access Management platform. The only identity management provider to offer advanced capabilities including device fingerprinting, location intelligence, real-time risk analysis, context-aware authentication and authorization makes the Oracle offering unique in the industry. LexisNexis Risk Solutions provides the industry leading Instant Authenticate dynamic knowledge based authentication (KBA) service which offers customers a secure and cost effective means to authenticate new user or prove authentication for password resets, lockouts and such scenarios. Oracle and LexisNexis now offer an integrated solution that combines the power of the most advanced identity management platform and superior data driven user authentication to stop identity fraud in its tracks and, in turn, offer significant operational cost savings. The solution offers the ability to challenge users with dynamic knowledge based authentication based on the risk of an access request or transaction thereby offering an additional level to other authentication methods such as static challenge questions or one-time password when needed. For example, with Oracle Identity Management self-service, the forgotten password reset workflow utilizes advanced capabilities including device fingerprinting, location intelligence, risk analysis and one-time password (OTP) via short message service (SMS) to secure this sensitive flow. Even when a user has lost or misplaced his/her mobile phone and, therefore, cannot receive the SMS, the new integrated solution eliminates the need to contact the help desk. The Oracle Identity Management platform dynamically switches to use the LexisNexis Instant Authenticate service for authentication if the user is not able to authenticate via OTP. The advanced Oracle and LexisNexis integrated solution, thus, both improves user experience and saves money by avoiding unnecessary help desk calls. Oracle Identity and Access Management secures applications, Juniper SSL VPN and other web resources with a thoroughly modern layered and context-aware platform. Users don't gain access just because they happen to have a valid username and password. An enterprise utilizing the Oracle solution has the ability to predicate access based on the specific context of the current situation. The device, location, temporal data, and any number of other attributes are evaluated in real-time to determine the specific risk at that moment. If the risk is elevated a user can be challenged for additional authentication, refused access or allowed access with limited privileges. The LexisNexis Instant Authenticate dynamic KBA service plugs into the Oracle platform to provide an additional layer of security by validating a user's identity in high risk access or transactions. The large and varied pool of data the LexisNexis solution utilizes to quiz a user makes this challenge mechanism even more robust. This strong combination of Oracle and LexisNexis user authentication capabilities greatly mitigates the risk of exposing sensitive applications and services on the Internet which helps an enterprise grow their business with confidence.Resources:Press release: LexisNexis® Achieves Oracle Validated Integration with Oracle Identity Management Oracle Access Management (HTML)Oracle Adaptive Access Manager (pdf)

    Read the article

  • Oracle Data Integration 12c: Perspectives of Industry Experts, Customers and Partners

    - by Irem Radzik
    Normal 0 false false false EN-US X-NONE X-NONE MicrosoftInternetExplorer4 As you may have seen from our recent blog posts on Oracle Data Integrator 12c and Oracle GoldenGate 12c, we are very excited to share with you the great new features the 12c release brings to Oracle’s data integration solutions. And, fortunately we are not alone in this sentiment. Since the press announcement October 17th, which incorporates our customers' and experts' testimonials, we have seen positive comments in leading technology publications and social media as well. Here are some examples: In CIO and PCWorld you can find Joab Jackson’s article, Oracle Data Integrator 12c ready for real-time analysis, where wrote about the tight integration between Oracle Data Integrator and Oracle GoldenGate . He noted “Heeding the call from enterprise customers who clamor for more immediacy in their data-driven reports, Oracle has updated its data-integration software portfolio so that it can more rapidly deliver data to data warehouses and analysis applications.” Integration Developer News’ Vance McCarthy wrote the article Oracle Ships ‘Future Proofs’ Integration Tools for Traditional, Cloud, Big Data, Real-Time Projects and mentioned that “Oracle Data Integrator 12c and Oracle GoldenGate 12c sport a wide range of improvements to let devs more easily deliver data integration for cloud, analytics, big data and other new projects that leverage multiple datasets for business.“ InformationWeek’s Doug Henschen gave a great overview to several key features including the new flow-based UI in Oracle Data Integrator. Doug said “Oracle Data Integrator 12c introduces a complete makeover of the job-building experience, while real-time oriented GoldenGate 12c introduces performance gains “. In Database Trends and Applications’ article Oracle Strengthens Data Integration with Release of Oracle Data Integrator 12c and Oracle GoldenGate 12c highlighted the productivity aspect of the new solution with his remarks: “tight integration between Oracle Data Integrator 12c and Oracle GoldenGate 12c enables developers to leverage Oracle GoldenGate’s low overhead, real-time change data capture completely within the Oracle Data Integrator Studio without additional training”. We are also thrilled about what our customers and partners have to say about our products and the new release. And we are equally excited to share those perspectives with you in our upcoming launch video webcast on November 12th. SolarWorld Industries America’s Senior Database Manager, Russ Toyama will join our executives in our studio in Redwood Shores to discuss GoldenGate’s core benefits and the new release, while Surren Partharb, CTO of Strategic Technology Services for BT, and Mark Rittman, CTO of Rittman Mead, will provide their comments via the interviews conducted in the UK. This interactive panel discussion in the video webcast will unveil the new release with the expertise of our development executives and the great insight from our customers and partners. In addition, our product experts will be available online to answer chat questions. This is really a great opportunity to learn how Oracle's data integration offering has changed the integration and replication technology space with the new release, and established itself as the new leader. If you have not registered for this free event yet, you can do so via this link. We will run the live event at 8am PT/4pm GMT, followed by a replay of the event with live chat for Q&A  at 10am PT/6pm GMT. The replay will be available on-demand for those who register but cannot attend either session on November 12th. /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin-top:0in; mso-para-margin-right:0in; mso-para-margin-bottom:10.0pt; mso-para-margin-left:0in; line-height:115%; mso-pagination:widow-orphan; font-size:11.0pt; font-family:"Times New Roman","serif"; mso-fareast-font-family:"Times New Roman";}

    Read the article

  • Master Data Management for Location Data - Oracle Site Hub

    - by david.butler(at)oracle.com
    Most MDM discussions cover key domains such as customer, supplier, product, service, and reference data. It is usually understood that these domains have complex structures and hundreds if not thousands of attributes that need governing. Location, on the other hand, strikes most people as address data. How hard can that be? But for many industries, locations are complex, and site information is critical to efficient operations and relevant analytics. Retail stores and malls, bank branches, construction sites come to mind. But one of the best industries for illustrating the power of a site mastering application is Oil & Gas.   Oracle's Master Data Management solution for location data is the Oracle Site Hub. It is a location mastering solution that enables organizations to centralize site and location specific information from heterogeneous systems, creating a single view of site information that can be leveraged across all functional departments and analytical systems.   Let's take a look at the location entities the Oracle Site Hub can manage for the Oil & Gas industry: organizations, property, land, buildings, roads, oilfield, service center, inventory site, real estate, facilities, refineries, storage tanks, vendor locations, businesses, assets; project site, area, well, basin, pipelines, critical infrastructure, offshore platform, compressor station, gas station, etc. Any site can be classified into multiple hierarchies, like organizational hierarchy, operational hierarchy, geographic hierarchy, divisional hierarchies and so on. Any site can also be associated to multiple clusters, i.e. collections of sites, and these can be used as a foundation for driving reporting, analysis, organize daily work, etc. Hierarchies can also be used to model entities which are structured or non-structured collections of nodes, like for example routes, pipelines and more. The User Defined Attribute Framework provides the needed infrastructure to add single row attributes groups like well base attributes (well IDs, well type, well structure and key characterizing measures, and more) and well geometry, and multi row attribute groups like well applications, permits, production data, activities, operations, logs, treatments, tests, drills, treatments, and KPIs. Site Hub can also model areas, lands, fields, basins, pools, platforms, eco-zones, and stratigraphic layers as specific sites, tracking their base attributes, aliases, descriptions, subcomponents and more. Midstream entities (pipelines, logistic sites, pump stations) and downstream entities (cylinders, tanks, inventories, meters, partner's sites, routes, facilities, gas stations, and competitor sites) can also be easily modeled, together with their specific attributes and relationships. Site Hub can store any type of unstructured data associated to a site. This could be stored directly or on an external content management solution, like Oracle Universal Content Management. Considering a well, for example, Site Hub can store any relevant associated multimedia file such as: CAD drawings of the well profile, structure and/or parts, engineering documents, contracts, applications, permits, logs, pictures, photos, videos and more. For any site entity, Site Hub can associate all the related assets and equipments at the site, as well as all relationships between sites, between a site and multiple parties, and between a site and any purchasable or sellable item, over time. Items can be equipment, instruments, facilities, services, products, production entities, production facilities (pipelines, batteries, compressor stations, gas plants, meters, separators, etc.), support facilities (rigs, roads, transmission or radio towers, airstrips, etc.), supplier products and services, catalogs, and more. Items can just be associated to sites using standard Site Hub features, or they can be fully mastered by implementing Oracle Product Hub. Site locations (addresses or geographical coordinates) are also managed with out-of-the-box address geo-coding capabilities coupled with Google Maps integration to deliver powerful mapping capabilities and spatial data analysis. Locations can be shared between different sites. Centered on the site location, any site can also have associated areas. Site Hub can master any site location specific information, like for example cadastral, ownership, jurisdictional, geological, seismic and more, and any site-centric area specific information, like for example economical, political, risk, weather, logistic, traffic information and more. Now if anyone ever asks you why locations need MDM, think about how all these Oil & Gas entities and attributes would translate into your business locations. To learn more about Oracle's full MDM solution for the digital oil field, here is a link to Roberto Negro's outstanding whitepaper: Oracle Site Master Data Management for mastering wells and other PPDM entities in a digital oilfield context  

    Read the article

  • Build-time dependency resolving coming to Entity Framework. Now, how about those BI tools too?

    - by jamiet
    Three months ago I wrote a blog post entitled Some thoughts on Visual Studio database references and how they should be used for SQL Server BI where I shared some thoughts on a feature available to database developers in Visual Studio 2010 that I would love to see added to SQL Server Integration Services (SSIS), Analysis Services (SSAS) and Reporting Services (SSRS). In there I said: Over the past few weeks I have been making heavy use of the Database tools in Visual Studio 2010 and one of the features that has most impressed me has been database references.   Database references allow you to have stored procedures in your database project that refer to objects (tables, views, stored procedures etc…) that exist in other database projects and hence when you build your database project it is able to resolve those references.   It occurred to me that similar functionality would be incredibly useful for SQL Server Integration Services(SSIS), Analysis Services (SSAS) & Reporting Services (SSRS) projects. After all reports, packages and data source views are rife with references to database objects – why shouldn’t we be able to have design-time dependency checking in our BI projects the same way that database and .Net developers do? In that blog post I shared links to three Connect submissions where I requested this feature be added to SSIS, SSAS & SSRS. In addition I also submitted a request that the feature be extended to .Net projects so that any reference to a database object in a .Net assembly can be resolved at build time. That Connect submission is at [Entity FX] Use database references to constrain the EDM and overnight it received this comment from Microsoft: We have been working on this feature for a while and and will be available soon This is really good news - it improves the Microsoft developer ecosystem by ensuring invalid references to database references get caught at build time (ideally as part of a Continuous integration build) rather than run time. [Hopefully it might nip this code-first nonsense in the bud too (Ooo...way to incite flame comments :) ) ]. If you want to see this feature in action then check out a video from Teched Europe last month entitled SQL Server Developer Tools Code-named "Juneau" where it is demo'd by Lance Delano and Tim Laverty.   The point of this blog post though is not just to draw attention to this forthcoming feature for .Net developers, it is to ask you to petition Microsoft to get this feature added to SSIS/SSAS/SSRS too. After all, we already know (from the video above) that the feature is coming to this new code-name Juneau development environment plus we also know that Juneau will be the development environment for SSIS/SSAS/SSRS as well - is it really much of a stretch to expect the BI tools to have access to this great feature too? I don't think so and if you agree with me then I urge you to vote and add a comment to the Connection submissions that are requesting this feature. They are at: [SSAS] Declare Object Dependancies [SSRS] Declare Object Dependancies [SSIS] Declare Object Dependancies (Update, Apparently someone at Microsoft has deemed it necassary to set this to private and I am not able to change it back even though I submitted it. You can still vote on the other two though.) Let's close that SQL Developer Gap!   @Jamiet    

    Read the article

  • Is Oracle Policy Automation a Fit for My Agency? I'll bet it is.

    - by jeffrey.waterman
    Normal 0 false false false EN-US X-NONE X-NONE MicrosoftInternetExplorer4 /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin-top:0in; mso-para-margin-right:0in; mso-para-margin-bottom:10.0pt; mso-para-margin-left:0in; line-height:115%; mso-pagination:widow-orphan; font-size:11.0pt; font-family:"Calibri","sans-serif"; mso-ascii-font-family:Calibri; mso-ascii-theme-font:minor-latin; mso-fareast-font-family:"Times New Roman"; mso-fareast-theme-font:minor-fareast; mso-hansi-font-family:Calibri; mso-hansi-theme-font:minor-latin; mso-bidi-font-family:"Times New Roman"; mso-bidi-theme-font:minor-bidi;} Recently, I stumbled upon a new(-ish) whitepaper now posted on the Oracle Technology Network around Oracle Policy Automation (OPA). This paper is certain to become a must read for any customer interested in rules automation. What is OPA?  If you are not sitting in your favorite Greek restaurant waiting for that order of Saganaki to appear, OPA is Oracle’s solution for automated streamlining, standardizing, and the maintenance of policy. It is a specialized rules platform that simplifies the automation of rules and policies, putting the analysis in the hands of the analysts, not the IT organization. In other words, OPA allows the organization to be more efficient by eliminating (or at a minimum, reducing the engagement of) the middle man from the process. The whitepaper I mention above is titled, “Is Oracle Policy Automation a Good Fit for My Business?”. This short document walks the reader through use cases and advice for the reader to consider when deciding if OPA is right for their agency. The paper outlines many different scenarios, different uses of OPA in production today and, where OPA may not be a good fit. Many of the use case examples revolve around end user questionnaires or analyst research. What is often overlooked is OPA’s ability to act as a rules engine behind the scenes. That is, take inputs from one source (e.g., personnel data), process that data in OPA and send the output (e.g., pay data with benefits deductions) to a second source. The rules have been automated, no necessary human intervention to perform analysis. A few of my customers have used the embedded OPA solution to improve transaction processing and reduce the time spent analyzing exceptions. I suggest any reader whose organization is reliant on or deals with high complexity, volume or volatility in rules that are based on documentation – or which need to be documented – take a look at Oracle Policy Automation. You can find the white paper on Oracle Technology Network. You can find the white paper in the Oracle Policy Automation of the OTN. You can find more information around OPA on oracle.com. Finally, you can send me a question any time at [email protected] Thank you for reading. If you have any topics around Oracle Applications in the Federal or Public Sector industries you would like to see addressed in this blog, please leave suggestions in the comments section and I will do my best to address in a future post.

    Read the article

  • Advanced Record-Level Business Intelligence with Inner Queries

    - by gt0084e1
    While business intelligence is generally applied at an aggregate level to large data sets, it's often useful to provide a more streamlined insight into an individual records or to be able to sort and rank them. For instance, a salesperson looking at a specific customer could benefit from basic stats on that account. A marketer trying to define an ideal customer could pull the top entries and look for insights or patterns. Inner queries let you do sophisticated analysis without the overhead of traditional BI or OLAP technologies like Analysis Services. Example - Order History Constancy Let's assume that management has realized that the best thing for our business is to have customers ordering every month. We'll need to identify and rank customers based on how consistently they buy and when their last purchase was so sales & marketing can respond accordingly. Our current application may not be able to provide this and adding an OLAP server like SSAS may be overkill for our needs. Luckily, SQL Server provides the ability to do relatively sophisticated analytics via inner queries. Here's the kind of output we'd like to see. Creating the Queries Before you create a view, you need to create the SQL query that does the calculations. Here we are calculating the total number of orders as well as the number of months since the last order. These fields might be very useful to sort by but may not be available in the app. This approach provides a very streamlined and high performance method of delivering actionable information without radically changing the application. It's also works very well with self-service reporting tools like Izenda. SELECT CustomerID,CompanyName, ( SELECT COUNT(OrderID) FROM Orders WHERE Orders.CustomerID = Customers.CustomerID ) As Orders, DATEDIFF(mm, ( SELECT Max(OrderDate) FROM Orders WHERE Orders.CustomerID = Customers.CustomerID) ,getdate() ) AS MonthsSinceLastOrder FROM Customers Creating Views To turn this or any query into a view, just put CREATE VIEW AS before it. If you want to change it use the statement ALTER VIEW AS. Creating Computed Columns If you'd prefer not to create a view, inner queries can also be applied by using computed columns. Place you SQL in the (Formula) field of the Computed Column Specification or check out this article here. Advanced Scoring and Ranking One of the best uses for this approach is to score leads based on multiple fields. For instance, you may be in a business where customers that don't order every month require more persistent follow up. You could devise a simple formula that shows the continuity of an account. If they ordered every month since their first order, they would be at 100 indicating that they have been ordering 100% of the time. Here's the query that would calculate that. It uses a few SQL tricks to make this happen. We are extracting the count of unique months and then dividing by the months since initial order. This query will give you the following information which can be used to help sales and marketing now where to focus. You could sort by this percentage to know where to start calling or to find patterns describing your best customers. Number of orders First Order Date Last Order Date Percentage of months order was placed since last order. SELECT CustomerID, (SELECT COUNT(OrderID) FROM Orders WHERE Orders.CustomerID = Customers.CustomerID) As Orders, (SELECT Max(OrderDate) FROM Orders WHERE Orders.CustomerID = Customers.CustomerID) AS LastOrder, (SELECT Min(OrderDate) FROM Orders WHERE Orders.CustomerID = Customers.CustomerID) AS FirstOrder, DATEDIFF(mm,(SELECT Min(OrderDate) FROM Orders WHERE Orders.CustomerID = Customers.CustomerID),getdate()) AS MonthsSinceFirstOrder, 100*(SELECT COUNT(DISTINCT 100*DATEPART(yy,OrderDate) + DATEPART(mm,OrderDate)) FROM Orders WHERE Orders.CustomerID = Customers.CustomerID) / DATEDIFF(mm,(SELECT Min(OrderDate) FROM Orders WHERE Orders.CustomerID = Customers.CustomerID),getdate()) As OrderPercent FROM Customers

    Read the article

  • My Feelings About Microsoft Surface

    - by Valter Minute
    Advice: read the title carefully, I’m talking about “feelings” and not about advanced technical points proved in a scientific and objective way I still haven’t had a chance to play with a MS Surface tablet (I would love to, of course) and so my ideas just came from reading different articles on the net and MS official statements. Remember also that the MVP motto begins with “Independent” (“Independent Experts. Real World Answers.”) and this is just my humble opinion about a product and a technology. I know that, being an MS MVP you can be called an “MS-fanboy”, I don’t care, I hope that people can appreciate my opinion, even if it doesn’t match theirs. The “Surface” brand can be confusing for techies that knew the “original” surface concept but I think that will be a fresh new brand name for most of the people out there. But marketing department are here to confuse people… so I can understand this “recycle” of an existing name. So Microsoft is entering the hardware arena… for me this is good news. Microsoft developed some nice hardware in the past: the xbox, zune (even if the commercial success was quite limited) and, last but not least, the two arc mices (old and new model) that I use and appreciate. In the past Microsoft worked with OEMs and that model lead to good and bad things. Good thing (for microsoft, at least) is market domination by windows-based PCs that only in the last years has been reduced by the return of the Mac and tablets. Google is also moving in the hardware business with its acquisition of Motorola, and Apple leveraged his control of both the hardware and software sides to develop innovative products. Microsoft can scare OEMs and make them fly away from windows (but where?) or just lead the pack, showing how devices should be designed to compete in the market and bring back some of the innovation that disappeared from recent PC products (look at the shelves of your favorite electronics store and try to distinguish a laptop between the huge mass of anonymous PCs on displays… only Macs shine out there…). Having to compete with MS “official” hardware will force OEMs to develop better product and bring back some real competition in a market that was ruled only by prices (the lower the better even when that means low quality) and no innovative features at all (when it was the last time that a new PC surprised you?). Moving into a new market is a big and risky move, but with Windows 8 Microsoft is playing a crucial move for its future, trying to be back in the innovation run against apple and google. MS can’t afford to fail this time. I saw the new devices (the WinRT and Pro) and the specifications are scarce, misleading and confusing. The first impression is that the device looks like an iPad with a nice keyboard cover… Using “HD” and “full HD” to define display resolution instead of using the real figures and reviving the “ClearType” brand (now dead on Win8 as reported here and missed by people who hate to read text on displays, like myself) without providing clear figures (couldn’t you count those damned pixels?) seems to imply that MS was caught by surprise by apple recent “retina” displays that brought very high definition screens on tablets.Also there are no specifications about the processors used (even if some sources report NVidia Tegra for the ARM tablet and i5 for the x86 one) and expected battery life (a critical point for tablets and the point that killed Windows7 x86 based tablets). Also nothing about the price, and this will be another critical point because other platform out there already provide lots of applications and have a good user base, if MS want to enter this market tablets pricing must be competitive. There are some expansion ports (SD and USB), so no fixed storage model (even if the specs talks about 32-64GB for RT and 128-256GB for pro). I like this and don’t like the apple model where flash memory (that it’s dirt cheap used in thumdrives or SD cards) is as expensive as gold (or cocaine to have a more accurate per gram measurement) when mounted inside a tablet/phone. For big files you’ll be able to use external media and an SD card could be used to store files that don’t require super-fast SSD-like access times, I hope. To be honest I really don’t like the marketplace model and the limitation of Windows RT APIs (no local database? from a company that based a good share of its success on VB6+Access!) and lack of desktop support on the ARM (even if the support is here and has been used to port office). It’s a step toward the consumer market (where competitors are making big money), but may impact enterprise (and embedded) users that may not appreciate Windows 8 new UI or the limitations of the new app model (if you aren’t connected you are dead ). Not having compatibility with the desktop will require brand new applications and honestly made all the CPU cycles spent to convert .NET IL into real machine code in the past like a huge waste of time… as soon as a new processor architecture is supported by Windows you still have to rewrite part of your application (and MS is pushing HTML5+JS and native code more than .NET in my perception). On the other side I believe that the development experience provided by Visual Studio is still miles (or kilometres) ahead of the competition and even the all-uppercase menu of VS2012 hasn’t changed this situation. The new metro UI got mixed reviews. On my side I should say that is very pleasant to use on a touch screen, I like the minimalist design (even if sometimes is too minimal and hides stuff that, in my opinion, should be visible) but I should also say that using it with mouse and keyboard is like trying to pick your nose with boxing gloves… Metro is also very interesting for embedded devices where touch screen usage is quite common and where having an application taking all the screen is the norm. For devices like kiosks, vending machines etc. this kind of UI can be a great selling point. I don’t need a new tablet (to be honest I’m pretty happy with my wife’s iPad and with my PC), but I may change my opinion after having a chance to play a little bit with those new devices and understand what’s hidden under all this mysterious and generic announcements and specifications!

    Read the article

  • Investigating on xVelocity (VertiPaq) column size

    - by Marco Russo (SQLBI)
      In January I published an article about how to optimize high cardinality columns in VertiPaq. In the meantime, VertiPaq has been rebranded to xVelocity: the official name is now “xVelocity in-memory analytics engine (VertiPaq)” but using xVelocity and VertiPaq when we talk about Analysis Services has the same meaning. In this post I’ll show how to investigate on columns size of an existing Tabular database so that you can find the most important columns to be optimized. A first approach can be looking in the DataDir of Analysis Services and look for the folder containing the database. Then, look for the biggest files in all subfolders and you will find the name of a file that contains the name of the most expensive column. However, this heuristic process is not very optimized. A better approach is using a DMV that provides the exact information. For example, by using the following query (open SSMS, open an MDX query on the database you are interested to and execute it) you will see all database objects sorted by used size in a descending way. SELECT * FROM $SYSTEM.DISCOVER_STORAGE_TABLE_COLUMN_SEGMENTS ORDER BY used_size DESC You can look at the first rows in order to understand what are the most expensive columns in your tabular model. The interesting data provided are: TABLE_ID: it is the name of the object – it can be also a dictionary or an index COLUMN_ID: it is the column name the object belongs to – you can also see ID_TO_POS and POS_TO_ID in case they refer to internal indexes RECORDS_COUNT: it is the number of rows in the column USED_SIZE: it is the used memory for the object By looking at the ration between USED_SIZE and RECORDS_COUNT you can understand what you can do in order to optimize your tabular model. Your options are: Remove the column. Yes, if it contains data you will never use in a query, simply remove the column from the tabular model Change granularity. If you are tracking time and you included milliseconds but seconds would be enough, round the data source column to the nearest second. If you have a floating point number but two decimals are good enough (i.e. the temperature), round the number to the nearest decimal is relevant to you. Split the column. Create two or more columns that have to be combined together in order to produce the original value. This technique is described in VertiPaq optimization article. Sort the table by that column. When you read the data source, you might consider sorting data by this column, so that the compression will be more efficient. However, this technique works better on columns that don’t have too many distinct values and you will probably move the problem to another column. Sorting data starting from the lower density columns (those with a few number of distinct values) and going to higher density columns (those with high cardinality) is the technique that provides the best compression ratio. After the optimization you should be able to reduce the used size and improve the count/size ration you measured before. If you are interested in a longer discussion about internal storage in VertiPaq and you want understand why this approach can save you space (and time), you can attend my 24 Hours of PASS session “VertiPaq Under the Hood” on March 21 at 08:00 GMT.

    Read the article

  • Comparing Apples and Pairs

    - by Tony Davis
    A recent study, High Costs and Negative Value of Pair Programming, by Capers Jones, pulls no punches in its assessment of the costs-to- benefits ratio of pair programming, two programmers working together, at a single computer, rather than separately. He implies that pair programming is a method rushed into production on a wave of enthusiasm for Agile or Extreme Programming, without any real regard for its effectiveness. Despite admitting that his data represented a far from complete study of the economics of pair programming, his conclusions were stark: it was 2.5 times more expensive, resulted in a 15% drop in productivity, and offered no significant quality benefits. The author provides a more scientific analysis than Jon Evans’ Pair Programming Considered Harmful, but the theme is the same. In terms of upfront-coding costs, pair programming is surely more expensive. The claim of productivity loss is dubious and contested by other studies. The third claim, though, did surprise me. The author’s data suggests that if both the pair and the individual programmers employ static code analysis and testing, then there is no measurable difference in the resulting code quality, in terms of defects per function point. In other words, pair programming incurs a massive extra cost for no tangible return in investment. There were, inevitably, many criticisms of his data and his conclusions, a few of which are persuasive. Firstly, that the driver/observer model of pair programming, on which the study bases its findings, is far from the most effective. For example, many find Ping-Pong pairing, based on use of test-driven development, far more productive. Secondly, that it doesn’t distinguish between “expert” and “novice” pair programmers– that is, independently of other programming skills, how skilled was an individual at pair programming. Thirdly, that his measure of quality is too narrow. This point rings true, certainly at Red Gate, where developers don’t pair program all the time, but use the method in short bursts, while tackling a tricky problem and needing a fresh perspective on the best approach, or more in-depth knowledge in a particular domain. All of them argue that pair programming, and collective code ownership, offers significant rewards, if not in terms of immediate “bug reduction”, then in removing the likelihood of single points of failure, and improving the overall quality and longer-term adaptability/maintainability of the design. There is also a massive learning benefit for both participants. One developer told me how he once worked in the same team over consecutive summers, the first time with no pair programming and the second time pair-programming two-thirds of the time, and described the increased rate of learning the second time as “phenomenal”. There are a great many theories on how we should develop software (Scrum, XP, Lean, etc.), but woefully little scientific research in their effectiveness. For a group that spends so much time crunching other people’s data, I wonder if developers spend enough time crunching data about themselves. Capers Jones’ data may be incomplete, but should cause a pause for thought, especially for any large IT departments, supporting commerce and industry, who are considering pair programming. It certainly shouldn’t discourage teams from exploring new ways of developing software, as long as they also think about how to gather hard data to gauge their effectiveness.

    Read the article

  • How much is a subscriber worth?

    - by Tom Lewin
    This year at Red Gate, we’ve started providing a way to back up SQL Azure databases and Azure storage. We decided to sell this as a service, instead of a product, which means customers only pay for what they use. Unfortunately for us, it makes figuring out revenue much trickier. With a product like SQL Compare, a customer pays for it, and it’s theirs for good. Sure, we offer support and upgrades, but, fundamentally, the sale is a simple, upfront transaction: we’ve made this product, you need this product, we swap product for money and everyone is happy. With software as a service, it isn’t that easy. The money and product don’t change hands up front. Instead, we provide a service in exchange for a recurring fee. We know someone buying SQL Compare will pay us $X, but we don’t know how long service customers will stay with us, or how much they will spend. How do we find this out? We use lifetime value analysis. What is lifetime value? Lifetime value, or LTV, is how much a customer is worth to the business. For Entrepreneurs has a brilliant write up that we followed to conduct our analysis. Basically, it all boils down to this equation: LTV = ARPU x ALC To make it a bit less of an alphabet-soup and a bit more understandable, we can write it out in full: The lifetime value of a customer equals the average revenue per customer per month, times the average time a customer spends with the service Simple, right? A customer is worth the average spend times the average stay. If customers pay on average $50/month, and stay on average for ten months, then a new customer will, on average, bring in $500 over the time they are a customer! Average spend is easy to work out; it’s revenue divided by customers. The problem comes when we realise that we don’t know exactly how long a customer will stay with us. How can we figure out the average lifetime of a customer, if we only have six months’ worth of data? The answer lies in the fact that: Average Lifetime of a Customer = 1 / Churn Rate The churn rate is the percentage of customers that cancel in a month. If half of your customers cancel each month, then your average customer lifetime is two months. The problem we faced was that we didn’t have enough data to make an estimate of one month’s cancellations reliable (because barely anybody cancels)! To deal with this data problem, we can take data from the last three months instead. This means we have more data to play with. We can still use the equation above, we just need to multiply the final result by three (as we worked out how many three month periods customers stay for, and we want our answer to be in months). Now these estimates are likely to be fairly unreliable; when there’s not a lot of data it pays to be cautious with inference. That said, the numbers we have look fairly consistent, and it’s super easy to revise our estimates when new data comes in. At the very least, these numbers give us a vague idea of whether a subscription business is viable. As far as Cloud Services goes, the business looks very viable indeed, and the low cancellation rates are much more than just data points in LTV equations; they show that the product is working out great for our customers, which is exactly what we’re looking for!

    Read the article

  • Investigating on xVelocity (VertiPaq) column size

    - by Marco Russo (SQLBI)
      In January I published an article about how to optimize high cardinality columns in VertiPaq. In the meantime, VertiPaq has been rebranded to xVelocity: the official name is now “xVelocity in-memory analytics engine (VertiPaq)” but using xVelocity and VertiPaq when we talk about Analysis Services has the same meaning. In this post I’ll show how to investigate on columns size of an existing Tabular database so that you can find the most important columns to be optimized. A first approach can be looking in the DataDir of Analysis Services and look for the folder containing the database. Then, look for the biggest files in all subfolders and you will find the name of a file that contains the name of the most expensive column. However, this heuristic process is not very optimized. A better approach is using a DMV that provides the exact information. For example, by using the following query (open SSMS, open an MDX query on the database you are interested to and execute it) you will see all database objects sorted by used size in a descending way. SELECT * FROM $SYSTEM.DISCOVER_STORAGE_TABLE_COLUMN_SEGMENTS ORDER BY used_size DESC You can look at the first rows in order to understand what are the most expensive columns in your tabular model. The interesting data provided are: TABLE_ID: it is the name of the object – it can be also a dictionary or an index COLUMN_ID: it is the column name the object belongs to – you can also see ID_TO_POS and POS_TO_ID in case they refer to internal indexes RECORDS_COUNT: it is the number of rows in the column USED_SIZE: it is the used memory for the object By looking at the ration between USED_SIZE and RECORDS_COUNT you can understand what you can do in order to optimize your tabular model. Your options are: Remove the column. Yes, if it contains data you will never use in a query, simply remove the column from the tabular model Change granularity. If you are tracking time and you included milliseconds but seconds would be enough, round the data source column to the nearest second. If you have a floating point number but two decimals are good enough (i.e. the temperature), round the number to the nearest decimal is relevant to you. Split the column. Create two or more columns that have to be combined together in order to produce the original value. This technique is described in VertiPaq optimization article. Sort the table by that column. When you read the data source, you might consider sorting data by this column, so that the compression will be more efficient. However, this technique works better on columns that don’t have too many distinct values and you will probably move the problem to another column. Sorting data starting from the lower density columns (those with a few number of distinct values) and going to higher density columns (those with high cardinality) is the technique that provides the best compression ratio. After the optimization you should be able to reduce the used size and improve the count/size ration you measured before. If you are interested in a longer discussion about internal storage in VertiPaq and you want understand why this approach can save you space (and time), you can attend my 24 Hours of PASS session “VertiPaq Under the Hood” on March 21 at 08:00 GMT.

    Read the article

  • The five steps of business intelligence adoption: where are you?

    - by Red Gate Software BI Tools Team
    When I was in Orlando and New York last month, I spoke to a lot of business intelligence users. What they told me suggested a path of BI adoption. The user’s place on the path depends on the size and sophistication of their organisation. Step 1: A company with a database of customer transactions will often want to examine particular data, like revenue and unit sales over the last period for each product and territory. To do this, they probably use simple SQL queries or stored procedures to produce data on demand. Step 2: The results from step one are saved in an Excel document, so business users can analyse them with filters or pivot tables. Alternatively, SQL Server Reporting Services (SSRS) might be used to generate a report of the SQL query for display on an intranet page. Step 3: If these queries are run frequently, or business users want to explore data from multiple sources more freely, it may become necessary to create a new database structured for analysis rather than CRUD (create, retrieve, update, and delete). For example, data from more than one system — plus external information — may be incorporated into a data warehouse. This can become ‘one source of truth’ for the business’s operational activities. The warehouse will probably have a simple ‘star’ schema, with fact tables representing the measures to be analysed (e.g. unit sales, revenue) and dimension tables defining how this data is aggregated (e.g. by time, region or product). Reports can be generated from the warehouse with Excel, SSRS or other tools. Step 4: Not too long ago, Microsoft introduced an Excel plug-in, PowerPivot, which allows users to bring larger volumes of data into Excel documents and create links between multiple tables.  These BISM Tabular documents can be created by the database owners or other expert Excel users and viewed by anyone with Excel PowerPivot. Sometimes, business users may use PowerPivot to create reports directly from the primary database, bypassing the need for a data warehouse. This can introduce problems when there are misunderstandings of the database structure or no single ‘source of truth’ for key data. Step 5: Steps three or four are often enough to satisfy business intelligence needs, especially if users are sophisticated enough to work with the warehouse in Excel or SSRS. However, sometimes the relationships between data are too complex or the queries which aggregate across periods, regions etc are too slow. In these cases, it can be necessary to formalise how the data is analysed and pre-build some of the aggregations. To do this, a business intelligence professional will typically use SQL Server Analysis Services (SSAS) to create a multidimensional model — or “cube” — that more simply represents key measures and aggregates them across specified dimensions. Step five is where our tool, SSAS Compare, becomes useful, as it helps review and deploy changes from development to production. For us at Red Gate, the primary value of SSAS Compare is to establish a dialog with BI users, so we can develop a portfolio of products that support creation and deployment across a range of report and model types. For example, PowerPivot and the new BISM Tabular model create a potential customer base for tools that extend beyond BI professionals. We’re interested in learning where people are in this story, so we’ve created a six-question survey to find out. Whether you’re at step one or step five, we’d love to know how you use BI so we can decide how to build tools that solve your problems. So if you have a sixty seconds to spare, tell us on the survey!

    Read the article

  • Synchronizing Asynchronous request handlers in Silverlight environment

    - by Eric Lifka
    For our senior design project my group is making a Silverlight application that utilizes graph theory concepts and stores the data in a database on the back end. We have a situation where we add a link between two nodes in the graph and upon doing so we run analysis to re-categorize our clusters of nodes. The problem is that this re-categorization is quite complex and involves multiple queries and updates to the database so if multiple instances of it run at once it quickly garbles data and breaks (by trying to re-insert already used primary keys). Essentially it's not thread safe, and we're trying to make it safe, and that's where we're failing and need help :). The create link function looks like this: private Semaphore dblock = new Semaphore(1, 1); // This function is on our service reference and gets called // by the client code. public int addNeed(int nodeOne, int nodeTwo) { dblock.WaitOne(); submitNewNeed(createNewNeed(nodeOne, nodeTwo)); verifyClusters(nodeOne, nodeTwo); dblock.Release(); return 0; } private void verifyClusters(int nodeOne, int nodeTwo) { // Run analysis of nodeOne and nodeTwo in graph } All copies of addNeed should wait for the first one that comes in to finish before another can execute. But instead they all seem to be running and conflicting with each other in the verifyClusters method. One solution would be to force our front end calls to be made synchronously. And in fact, when we do that everything works fine, so the code logic isn't broken. But when it's launched our application will be deployed within a business setting and used by internal IT staff (or at least that's the plan) so we'll have the same problem. We can't force all clients to submit data at different times, so we really need to get it synchronized on the back end. Thanks for any help you can give, I'd be glad to supply any additional information that you could need!

    Read the article

  • OCR: How to improve accuracy - existing libraries for removing non-text 'furniture', shapes, etc to

    - by Rob
    I want to remove rectangles etc that enclose text in a screenshot image, so that I can perform optical character recognition to get accurate text from the screenshot. Background: I doing this to extract data from a legacy application for use with other applications. This is the only way to get at this data as associated files are in a closed, proprietary, binary format. I will be using AutoItScript to drive the application to show data in its UI, then I will screenshot this and feed this to tesseract. I've already had some success in automating the UI, and have been able to use tesseract to get plain ascii text out of the bitmap. There are several AutoItScripr forum articles discussing its use with tesseract/OCR but not specifically for my question. http://www.autoitscript.com/forum/index.php?s=6c32c3ece12756e635a619cdf175eff9&showforum=2 What I need to do There are thin, 1-pixel wide rectangles that closely enclose some text, when fed to tesseract, it sees them as I for example for a verticle line of the rectangle. Any thoughts on how to remove the rectangles, or best practices? I'm asking if there is a generic command line based toolset to overwrite rectangles, for example, in .png files. I could then pass the .png through this, then pass it to tesseract. Details on the tesseract release/setup I've used are as follows: Go here: http://code.google.com/p/tesseract-ocr/downloads/list - For the basic english generic character set to get Tesseract up and running and recognising your bitmapped text into ascii text, use tesseract-2.00.eng.tar.gz (current version at time of writing is: "English language data for Tesseract (2.00 and up) Jul 2007 989 KB 84845") Related questions I have already looked at on Stack Overflow http://stackoverflow.com/questions/1335581/how-to-give-best-chance-of-success-to-an-ocr-software http://stackoverflow.com/questions/2296568/analysis-and-transformation-of-the-image-on-the-basis-of-this-analysis-for-better http://stackoverflow.com/questions/2268028/reading-characters-off-of-the-screen In these, my question is not completely answered or a commercial solution is being sold. I do not want to consider a commercial solution at this stage.

    Read the article

  • Text mining on large database (data mining)

    - by yox
    Hello, I have a large database of resumes (CV), and a certain table skills grouping all users skills. inside that table there's a field skill_text that describes the skill in full text. I'm looking for an algorithm/software/method to extract significant terms/phrases from that table in order to build a new table with standarized skills.. Here are some examples skills extracted from the DB : Sectoral and competitive analysis Business Development (incl. in international settings) Specific structure and road design software - Microstation, Macao, AutoCAD (basic knowledge) Creative work (Photoshop, In-Design, Illustrator) checking and reporting back on campaign progress organising and attending events and exhibitions Development : Aptana Studio, PHP, HTML, CSS, JavaScript, SQL, AJAX Discipline: One to one marketing, E-marketing (SEO & SEA, display, emailing, affiliate program) Mix marketing, Viral Marketing, Social network marketing. The output shoud be something like : Sectoral and competitive analysis Business Development Specific structure and road design software - Macao AutoCAD Photoshop In-Design Illustrator organising events Development Aptana Studio PHP HTML CSS JavaScript SQL AJAX Mix marketing Viral Marketing Social network marketing emailing SEO One to one marketing As you see only skills remains no other representation text. I know this is possible using text mining technics but how to do it ? the database is realy large.. it's a good thing because we can calculate text frequency and decide if it's a real skill or just meaningless text... The big problem is .. how to determin that "blablabla" is a skill ? thanks

    Read the article

  • Agile and Scrum burning me down please help me figuring out the truth

    - by jadook
    hi all, in the last while I installed MS-TFS 2008 then started to get myself prepared to use Agile Process Guidance template shipped with the TFS. with little googling I passed through Mike Cohn materials: I watched his conference in youtube "sponsored by google: http://www.youtube.com/watch?v=fb9Rzyi8b90 http://www.youtube.com/watch?v=jeT0pOVg0EI Read his book "Agile Estimating and Planning" Watching the video series in his website: http://www.mountaingoatsoftware.com/presentations-tag/video-recorded I was very happy while absorbing and eating the techniques he is using with the teams and how agile and scrum is such a great software process/methodology until I saw Mike answering a question regarding an architect role and talking about the requirements document... at that point everything start falling apart due to the following: Last year I had been assigned to make full analysis "including requirements gathering" for big project "very high priority project". within 2 months of hardwork, dedication and commitment I delivered the whole analysis with full satisfaction of the customer and my BOSS and ZERO amendments. Later on, the project entered the architecting, development ... phases. due to the fact that the system included many competitive and exciting features I requested patenting it and its going in the process... so imagine you are the kind of person who used to love facing all kind of challenges and returning with excellent experience and results for the stakeholders and yourself, How fairly agile and scrum processes will credit and admit your talent and passion while the scrum master/coach treat the team as one unit that accomplish user stories and converge through trial and error approach??!!!! with that dark thoughts about agile and scrum I found many people "anti agile" and on top of them is "Crispin Rogers Johnson": http://agile-crispin.blogspot.com/ that guy made anti statement for everything Mike Cohn used to talk about. I really don't know what to do next! so any guidance will be appreciated. Thanks,

    Read the article

  • structured vs. unstructured data in db

    - by Igor
    the question is one of design. i'm gathering a big chunk of performance data with lots of key-value pairs. pretty much everything in /proc/cpuinfo, /proc/meminfo/, /proc/loadavg, plus a bunch of other stuff, from several hundred hosts. right now, i just need to display the latest chunk of data in my UI. i will probably end up doing some analysis of the data gathered to figure out performance problems down the road, but this is a new application so i'm not sure what exactly i'm looking for performance-wise just yet. i could structure the data in the db -- have a column for each key i'm gathering. the table would end up being O(100) columns wide, it would be a pain to put into the db, i would have to add new columns if i start gathering a new stat. but it would be easy to sort/analyze the data just using SQL. or i could just dump my unstructured data blob into the table. maybe three columns -- host id, timestamp, and a serialized version of my array, probably using JSON in a TEXT field. which should I do? am i going to be sorry if i go with the unstructured approach? when doing analysis, should i just convert the fields i'm interested in and create a new, more structured table? what are the trade-offs i'm missing here?

    Read the article

< Previous Page | 87 88 89 90 91 92 93 94 95 96 97 98  | Next Page >