Search Results

Search found 913 results on 37 pages for 'massive boisson'.

Page 5/37 | < Previous Page | 1 2 3 4 5 6 7 8 9 10 11 12  | Next Page >

  • Cloud Computing - just get started already!

    - by BuckWoody
    OK - you've been hearing about "cloud" (I really dislike that term, but whatever) for over two years. You've equated it with just throwing some VM's in some vendor's datacenter - which is certainly part of it, but not the whole story. There's a whole world of - wait for it - *coding* out there that you should be working on. If you're a developer, this is just a set of servers with operating systems and the runtime layer (like.NET, Java, PHP, etc.) that you can deploy code to and have it run. It can expand in a horizontal way, allowing massive - and I really, honestly mean massive, not just marketing talk kind of scale. We see this every day. If you're not a developer, well, now's the time to learn. Explore a little. Try it. We'll help you. There's a free conference you can attend in November, and you can sign up for it now. It's all on-line, and the tools you need to code are free. Put down Facebook and Twitter for a minute - go sign up. Learn. Do. :) See you there. http://www.windowsazureconf.net/

    Read the article

  • HTML5 - Does it have the power to handle a large 2D game with a huge world?

    - by user15858
    I have been using XNA game studio, but due to private reasons (as well as the ability to publish anywhere & my heavy interest in isogenic engine), I would like to switch to HTML5. However, I have very high 2D graphic demands for my game. The game itself will have a HDD size of anywhere between 6GB (min) to 12GB (max) which would be a full game deployed offline. The size of the images aren't significantly large, so streaming would be entirely possible if only those assets required were streamed as needed. The game has a massive file size because of the sheer amount of content. For some images or spritesheets, they would be quite massive. (ex. a very large Dragon, which if animated in a spritesheet would be split into two 4096x4096 sheets or one 8192x8192 sheet). Most assets would be very small, and about 7MB for a full character with 15 animations in every direction (all animations not required immediately) so in the size of a few hundred KB to download before the game loads. My question, however, is if the graphical power of HTML5 is enough to animate several characters on screen at once, when it flips through frames quite rapidly. All my sprites have about 25 frames per animation, 5 directions (a spritesheet for each direction & animation), and run at 30fps. Upon changing direction, animation, or a new character entering, spritesheets would change and be constantly loading/unloading. If I pack all directions in a single sheet, it would be about 2048x2048 per sheet. Most frameworks have no problem with this, but I am afraid from what I read that HTML5's graphical capabilities will limit me. Since it takes significant time simply to animate characters in any language, I'd like a quick answer.

    Read the article

  • Spend Analytics on a Grand Scale

    - by jacqueline.coolidge(at)oracle.com
    The Wall St. Journal reports in Billions in Bloat Uncovered in Beltway that a recent study by Government Accountability Office (GAO) has released a massive study of several programs and agencies that cost U.S. taxpayers billions of dollars each year.  This report help save $100 to $200 Billion dollars by identifying duplicate spending and ineffective programs that can be consolidated or eliminated. Now, that is spend analytics on a massive scale! It remains to be seen how actionable that information will be.  Certainly, there have been studies before that identify wasteful spending.  But, it’s a great case of the power of business intelligence and spend analytics.   Many companies do find significant savings when they implement spend and procurement analytics. What makes for an excellent spend analysis? It should be: Objective and provide visibility across programs and/or divisions A cross functional analysis that links financial with performance metrics Prescriptive and actionable Spend and procurement analytics have been HOT during the economic downturn! I expect 2011 will see many more companies get serious about spend analytics and would love to hear from companies who are willing to share their experience.

    Read the article

  • comparing two cursors in oracle instead of using MINUS

    - by Omnipresent
    The following query takes more than 3 minutes to run because tables contain massive amounts of data: SELECT RTRIM(LTRIM(A.HEAD)), A.EFFECTIVE_DATE, FROM TABLE_1 A WHERE A.TYPE_OF_ACTION='6' AND A.EFFECTIVE_DATE >= ADD_MONTHS(SYSDATE,-15) MINUS SELECT RTRIM(LTRIM(B.head)), B.EFFECTIVE_DATE, FROM TABLE_2 B In our system a query gets killed if it is running for more than 8 seconds. Is there a way to run the queries individually ..put them in cursors..compare and then get the results? that way each query will be ran individually rather than as one massive query which takes 3 minutes. How would two cursors be compared to mimic the MINUS?

    Read the article

  • Redirect print in Python: val = print(arg) to output mixed iterable to file

    - by emcee
    So lets say I have an incredibly nested iterable of lists/dictionaries. I would like to print them to a file as easily as possible. Why can't I just redirect print to a file? val = print(arg) gets a SyntaxError. Is there a way to access stdinput? And why does print take forever with massive strings? Bad programming on my side for outputting massive strings, but quick debugging--and isn't that leveraging the strength of an interactive prompt? There's probably also an easier way than my gripe. Has the hive-mind an answer?

    Read the article

  • pipelined function

    - by user289429
    Can someone provide an example of how to use parallel table function in oracle pl/sql. We need to run massive queries for 15 years and combine the result. SELECT * FROM Table(TableFunction(cursor(SELECT * FROM year_table))) ...is what we want effectively. The innermost select will give all the years, and the table function will take each year and run massive query and returns a collection. The problem we have is that all years are being fed to one table function itself, we would rather prefer the table function being called in parallel for each of the year. We tried all sort of partitioning by hash and range and it didn't help. Also, can we drop the keyword PIPELINED from the function declaration? because we are not performing any transformation and just need the aggregate of the resultset.

    Read the article

  • How to find and fix performance problems in ORM powered applications

    - by FransBouma
    Once in a while we get requests about how to fix performance problems with our framework. As it comes down to following the same steps and looking into the same things every single time, I decided to write a blogpost about it instead, so more people can learn from this and solve performance problems in their O/R mapper powered applications. In some parts it's focused on LLBLGen Pro but it's also usable for other O/R mapping frameworks, as the vast majority of performance problems in O/R mapper powered applications are not specific for a certain O/R mapper framework. Too often, the developer looks at the wrong part of the application, trying to fix what isn't a problem in that part, and getting frustrated that 'things are so slow with <insert your favorite framework X here>'. I'm in the O/R mapper business for a long time now (almost 10 years, full time) and as it's a small world, we O/R mapper developers know almost all tricks to pull off by now: we all know what to do to make task ABC faster and what compromises (because there are almost always compromises) to deal with if we decide to make ABC faster that way. Some O/R mapper frameworks are faster in X, others in Y, but you can be sure the difference is mainly a result of a compromise some developers are willing to deal with and others aren't. That's why the O/R mapper frameworks on the market today are different in many ways, even though they all fetch and save entities from and to a database. I'm not suggesting there's no room for improvement in today's O/R mapper frameworks, there always is, but it's not a matter of 'the slowness of the application is caused by the O/R mapper' anymore. Perhaps query generation can be optimized a bit here, row materialization can be optimized a bit there, but it's mainly coming down to milliseconds. Still worth it if you're a framework developer, but it's not much compared to the time spend inside databases and in user code: if a complete fetch takes 40ms or 50ms (from call to entity object collection), it won't make a difference for your application as that 10ms difference won't be noticed. That's why it's very important to find the real locations of the problems so developers can fix them properly and don't get frustrated because their quest to get a fast, performing application failed. Performance tuning basics and rules Finding and fixing performance problems in any application is a strict procedure with four prescribed steps: isolate, analyze, interpret and fix, in that order. It's key that you don't skip a step nor make assumptions: these steps help you find the reason of a problem which seems to be there, and how to fix it or leave it as-is. Skipping a step, or when you assume things will be bad/slow without doing analysis will lead to the path of premature optimization and won't actually solve your problems, only create new ones. The most important rule of finding and fixing performance problems in software is that you have to understand what 'performance problem' actually means. Most developers will say "when a piece of software / code is slow, you have a performance problem". But is that actually the case? If I write a Linq query which will aggregate, group and sort 5 million rows from several tables to produce a resultset of 10 rows, it might take more than a couple of milliseconds before that resultset is ready to be consumed by other logic. If I solely look at the Linq query, the code consuming the resultset of the 10 rows and then look at the time it takes to complete the whole procedure, it will appear to me to be slow: all that time taken to produce and consume 10 rows? But if you look closer, if you analyze and interpret the situation, you'll see it does a tremendous amount of work, and in that light it might even be extremely fast. With every performance problem you encounter, always do realize that what you're trying to solve is perhaps not a technical problem at all, but a perception problem. The second most important rule you have to understand is based on the old saying "Penny wise, Pound Foolish": the part which takes e.g. 5% of the total time T for a given task isn't worth optimizing if you have another part which takes a much larger part of the total time T for that same given task. Optimizing parts which are relatively insignificant for the total time taken is not going to bring you better results overall, even if you totally optimize that part away. This is the core reason why analysis of the complete set of application parts which participate in a given task is key to being successful in solving performance problems: No analysis -> no problem -> no solution. One warning up front: hunting for performance will always include making compromises. Fast software can be made maintainable, but if you want to squeeze as much performance out of your software, you will inevitably be faced with the dilemma of compromising one or more from the group {readability, maintainability, features} for the extra performance you think you'll gain. It's then up to you to decide whether it's worth it. In almost all cases it's not. The reason for this is simple: the vast majority of performance problems can be solved by implementing the proper algorithms, the ones with proven Big O-characteristics so you know the performance you'll get plus you know the algorithm will work. The time taken by the algorithm implementing code is inevitable: you already implemented the best algorithm. You might find some optimizations on the technical level but in general these are minor. Let's look at the four steps to see how they guide us through the quest to find and fix performance problems. Isolate The first thing you need to do is to isolate the areas in your application which are assumed to be slow. For example, if your application is a web application and a given page is taking several seconds or even minutes to load, it's a good candidate to check out. It's important to start with the isolate step because it allows you to focus on a single code path per area with a clear begin and end and ignore the rest. The rest of the steps are taken per identified problematic area. Keep in mind that isolation focuses on tasks in an application, not code snippets. A task is something that's started in your application by either another task or the user, or another program, and has a beginning and an end. You can see a task as a piece of functionality offered by your application.  Analyze Once you've determined the problem areas, you have to perform analysis on the code paths of each area, to see where the performance problems occur and which areas are not the problem. This is a multi-layered effort: an application which uses an O/R mapper typically consists of multiple parts: there's likely some kind of interface (web, webservice, windows etc.), a part which controls the interface and business logic, the O/R mapper part and the RDBMS, all connected with either a network or inter-process connections provided by the OS or other means. Each of these parts, including the connectivity plumbing, eat up a part of the total time it takes to complete a task, e.g. load a webpage with all orders of a given customer X. To understand which parts participate in the task / area we're investigating and how much they contribute to the total time taken to complete the task, analysis of each participating task is essential. Start with the code you wrote which starts the task, analyze the code and track the path it follows through your application. What does the code do along the way, verify whether it's correct or not. Analyze whether you have implemented the right algorithms in your code for this particular area. Remember we're looking at one area at a time, which means we're ignoring all other code paths, just the code path of the current problematic area, from begin to end and back. Don't dig in and start optimizing at the code level just yet. We're just analyzing. If your analysis reveals big architectural stupidity, it's perhaps a good idea to rethink the architecture at this point. For the rest, we're analyzing which means we collect data about what could be wrong, for each participating part of the complete application. Reviewing the code you wrote is a good tool to get deeper understanding of what is going on for a given task but ultimately it lacks precision and overview what really happens: humans aren't good code interpreters, computers are. We therefore need to utilize tools to get deeper understanding about which parts contribute how much time to the total task, triggered by which other parts and for example how many times are they called. There are two different kind of tools which are necessary: .NET profilers and O/R mapper / RDBMS profilers. .NET profiling .NET profilers (e.g. dotTrace by JetBrains or Ants by Red Gate software) show exactly which pieces of code are called, how many times they're called, and the time it took to run that piece of code, at the method level and sometimes even at the line level. The .NET profilers are essential tools for understanding whether the time taken to complete a given task / area in your application is consumed by .NET code, where exactly in your code, the path to that code, how many times that code was called by other code and thus reveals where hotspots are located: the areas where a solution can be found. Importantly, they also reveal which areas can be left alone: remember our penny wise pound foolish saying: if a profiler reveals that a group of methods are fast, or don't contribute much to the total time taken for a given task, ignore them. Even if the code in them is perhaps complex and looks like a candidate for optimization: you can work all day on that, it won't matter.  As we're focusing on a single area of the application, it's best to start profiling right before you actually activate the task/area. Most .NET profilers support this by starting the application without starting the profiling procedure just yet. You navigate to the particular part which is slow, start profiling in the profiler, in your application you perform the actions which are considered slow, and afterwards you get a snapshot in the profiler. The snapshot contains the data collected by the profiler during the slow action, so most data is produced by code in the area to investigate. This is important, because it allows you to stay focused on a single area. O/R mapper and RDBMS profiling .NET profilers give you a good insight in the .NET side of things, but not in the RDBMS side of the application. As this article is about O/R mapper powered applications, we're also looking at databases, and the software making it possible to consume the database in your application: the O/R mapper. To understand which parts of the O/R mapper and database participate how much to the total time taken for task T, we need different tools. There are two kind of tools focusing on O/R mappers and database performance profiling: O/R mapper profilers and RDBMS profilers. For O/R mapper profilers, you can look at LLBLGen Prof by hibernating rhinos or the Linq to Sql/LLBLGen Pro profiler by Huagati. Hibernating rhinos also have profilers for other O/R mappers like NHibernate (NHProf) and Entity Framework (EFProf) and work the same as LLBLGen Prof. For RDBMS profilers, you have to look whether the RDBMS vendor has a profiler. For example for SQL Server, the profiler is shipped with SQL Server, for Oracle it's build into the RDBMS, however there are also 3rd party tools. Which tool you're using isn't really important, what's important is that you get insight in which queries are executed during the task / area we're currently focused on and how long they took. Here, the O/R mapper profilers have an advantage as they collect the time it took to execute the query from the application's perspective so they also collect the time it took to transport data across the network. This is important because a query which returns a massive resultset or a resultset with large blob/clob/ntext/image fields takes more time to get transported across the network than a small resultset and a database profiler doesn't take this into account most of the time. Another tool to use in this case, which is more low level and not all O/R mappers support it (though LLBLGen Pro and NHibernate as well do) is tracing: most O/R mappers offer some form of tracing or logging system which you can use to collect the SQL generated and executed and often also other activity behind the scenes. While tracing can produce a tremendous amount of data in some cases, it also gives insight in what's going on. Interpret After we've completed the analysis step it's time to look at the data we've collected. We've done code reviews to see whether we've done anything stupid and which parts actually take place and if the proper algorithms have been implemented. We've done .NET profiling to see which parts are choke points and how much time they contribute to the total time taken to complete the task we're investigating. We've performed O/R mapper profiling and RDBMS profiling to see which queries were executed during the task, how many queries were generated and executed and how long they took to complete, including network transportation. All this data reveals two things: which parts are big contributors to the total time taken and which parts are irrelevant. Both aspects are very important. The parts which are irrelevant (i.e. don't contribute significantly to the total time taken) can be ignored from now on, we won't look at them. The parts which contribute a lot to the total time taken are important to look at. We now have to first look at the .NET profiler results, to see whether the time taken is consumed in our own code, in .NET framework code, in the O/R mapper itself or somewhere else. For example if most of the time is consumed by DbCommand.ExecuteReader, the time it took to complete the task is depending on the time the data is fetched from the database. If there was just 1 query executed, according to tracing or O/R mapper profilers / RDBMS profilers, check whether that query is optimal, uses indexes or has to deal with a lot of data. Interpret means that you follow the path from begin to end through the data collected and determine where, along the path, the most time is contributed. It also means that you have to check whether this was expected or is totally unexpected. My previous example of the 10 row resultset of a query which groups millions of rows will likely reveal that a long time is spend inside the database and almost no time is spend in the .NET code, meaning the RDBMS part contributes the most to the total time taken, the rest is compared to that time, irrelevant. Considering the vastness of the source data set, it's expected this will take some time. However, does it need tweaking? Perhaps all possible tweaks are already in place. In the interpret step you then have to decide that further action in this area is necessary or not, based on what the analysis results show: if the analysis results were unexpected and in the area where the most time is contributed to the total time taken is room for improvement, action should be taken. If not, you can only accept the situation and move on. In all cases, document your decision together with the analysis you've done. If you decide that the perceived performance problem is actually expected due to the nature of the task performed, it's essential that in the future when someone else looks at the application and starts asking questions you can answer them properly and new analysis is only necessary if situations changed. Fix After interpreting the analysis results you've concluded that some areas need adjustment. This is the fix step: you're actively correcting the performance problem with proper action targeted at the real cause. In many cases related to O/R mapper powered applications it means you'll use different features of the O/R mapper to achieve the same goal, or apply optimizations at the RDBMS level. It could also mean you apply caching inside your application (compromise memory consumption over performance) to avoid unnecessary re-querying data and re-consuming the results. After applying a change, it's key you re-do the analysis and interpretation steps: compare the results and expectations with what you had before, to see whether your actions had any effect or whether it moved the problem to a different part of the application. Don't fall into the trap to do partly analysis: do the full analysis again: .NET profiling and O/R mapper / RDBMS profiling. It might very well be that the changes you've made make one part faster but another part significantly slower, in such a way that the overall problem hasn't changed at all. Performance tuning is dealing with compromises and making choices: to use one feature over the other, to accept a higher memory footprint, to go away from the strict-OO path and execute queries directly onto the RDBMS, these are choices and compromises which will cross your path if you want to fix performance problems with respect to O/R mappers or data-access and databases in general. In most cases it's not a big issue: alternatives are often good choices too and the compromises aren't that hard to deal with. What is important is that you document why you made a choice, a compromise: which analysis data, which interpretation led you to the choice made. This is key for good maintainability in the years to come. Most common performance problems with O/R mappers Below is an incomplete list of common performance problems related to data-access / O/R mappers / RDBMS code. It will help you with fixing the hotspots you found in the interpretation step. SELECT N+1: (Lazy-loading specific). Lazy loading triggered performance bottlenecks. Consider a list of Orders bound to a grid. You have a Field mapped onto a related field in Order, Customer.CompanyName. Showing this column in the grid will make the grid fetch (indirectly) for each row the Customer row. This means you'll get for the single list not 1 query (for the orders) but 1+(the number of orders shown) queries. To solve this: use eager loading using a prefetch path to fetch the customers with the orders. SELECT N+1 is easy to spot with an O/R mapper profiler or RDBMS profiler: if you see a lot of identical queries executed at once, you have this problem. Prefetch paths using many path nodes or sorting, or limiting. Eager loading problem. Prefetch paths can help with performance, but as 1 query is fetched per node, it can be the number of data fetched in a child node is bigger than you think. Also consider that data in every node is merged on the client within the parent. This is fast, but it also can take some time if you fetch massive amounts of entities. If you keep fetches small, you can use tuning parameters like the ParameterizedPrefetchPathThreshold setting to get more optimal queries. Deep inheritance hierarchies of type Target Per Entity/Type. If you use inheritance of type Target per Entity / Type (each type in the inheritance hierarchy is mapped onto its own table/view), fetches will join subtype- and supertype tables in many cases, which can lead to a lot of performance problems if the hierarchy has many types. With this problem, keep inheritance to a minimum if possible, or switch to a hierarchy of type Target Per Hierarchy, which means all entities in the inheritance hierarchy are mapped onto the same table/view. Of course this has its own set of drawbacks, but it's a compromise you might want to take. Fetching massive amounts of data by fetching large lists of entities. LLBLGen Pro supports paging (and limiting the # of rows returned), which is often key to process through large sets of data. Use paging on the RDBMS if possible (so a query is executed which returns only the rows in the page requested). When using paging in a web application, be sure that you switch server-side paging on on the datasourcecontrol used. In this case, paging on the grid alone is not enough: this can lead to fetching a lot of data which is then loaded into the grid and paged there. Keep note that analyzing queries for paging could lead to the false assumption that paging doesn't occur, e.g. when the query contains a field of type ntext/image/clob/blob and DISTINCT can't be applied while it should have (e.g. due to a join): the datareader will do DISTINCT filtering on the client. this is a little slower but it does perform paging functionality on the data-reader so it won't fetch all rows even if the query suggests it does. Fetch massive amounts of data because blob/clob/ntext/image fields aren't excluded. LLBLGen Pro supports field exclusion for queries. You can exclude fields (also in prefetch paths) per query to avoid fetching all fields of an entity, e.g. when you don't need them for the logic consuming the resultset. Excluding fields can greatly reduce the amount of time spend on data-transport across the network. Use this optimization if you see that there's a big difference between query execution time on the RDBMS and the time reported by the .NET profiler for the ExecuteReader method call. Doing client-side aggregates/scalar calculations by consuming a lot of data. If possible, try to formulate a scalar query or group by query using the projection system or GetScalar functionality of LLBLGen Pro to do data consumption on the RDBMS server. It's far more efficient to process data on the RDBMS server than to first load it all in memory, then traverse the data in-memory to calculate a value. Using .ToList() constructs inside linq queries. It might be you use .ToList() somewhere in a Linq query which makes the query be run partially in-memory. Example: var q = from c in metaData.Customers.ToList() where c.Country=="Norway" select c; This will actually fetch all customers in-memory and do an in-memory filtering, as the linq query is defined on an IEnumerable<T>, and not on the IQueryable<T>. Linq is nice, but it can often be a bit unclear where some parts of a Linq query might run. Fetching all entities to delete into memory first. To delete a set of entities it's rather inefficient to first fetch them all into memory and then delete them one by one. It's more efficient to execute a DELETE FROM ... WHERE query on the database directly to delete the entities in one go. LLBLGen Pro supports this feature, and so do some other O/R mappers. It's not always possible to do this operation in the context of an O/R mapper however: if an O/R mapper relies on a cache, these kind of operations are likely not supported because they make it impossible to track whether an entity is actually removed from the DB and thus can be removed from the cache. Fetching all entities to update with an expression into memory first. Similar to the previous point: it is more efficient to update a set of entities directly with a single UPDATE query using an expression instead of fetching the entities into memory first and then updating the entities in a loop, and afterwards saving them. It might however be a compromise you don't want to take as it is working around the idea of having an object graph in memory which is manipulated and instead makes the code fully aware there's a RDBMS somewhere. Conclusion Performance tuning is almost always about compromises and making choices. It's also about knowing where to look and how the systems in play behave and should behave. The four steps I provided should help you stay focused on the real problem and lead you towards the solution. Knowing how to optimally use the systems participating in your own code (.NET framework, O/R mapper, RDBMS, network/services) is key for success as well as knowing what's going on inside the application you built. I hope you'll find this guide useful in tracking down performance problems and dealing with them in a useful way.  

    Read the article

  • Is Berkeley DB a NoSQL solution?

    - by Gregory Burd
    Berkeley DB is a library. To use it to store data you must link the library into your application. You can use most programming languages to access the API, the calls across these APIs generally mimic the Berkeley DB C-API which makes perfect sense because Berkeley DB is written in C. The inspiration for Berkeley DB was the DBM library, a part of the earliest versions of UNIX written by AT&T's Ken Thompson in 1979. DBM was a simple key/value hashtable-based storage library. In the early 1990s as BSD UNIX was transitioning from version 4.3 to 4.4 and retrofitting commercial code owned by AT&T with unencumbered code, it was the future founders of Sleepycat Software who wrote libdb (aka Berkeley DB) as the replacement for DBM. The problem it addressed was fast, reliable local key/value storage. At that time databases almost always lived on a single node, even the most sophisticated databases only had simple fail-over two node solutions. If you had a lot of data to store you would choose between the few commercial RDBMS solutions or to write your own custom solution. Berkeley DB took the headache out of the custom approach. These basic market forces inspired other DBM implementations. There was the "New DBM" (ndbm) and the "GNU DBM" (GDBM) and a few others, but the theme was the same. Even today TokyoCabinet calls itself "a modern implementation of DBM" mimicking, and improving on, something first created over thirty years ago. In the mid-1990s, DBM was the name for what you needed if you were looking for fast, reliable local storage. Fast forward to today. What's changed? Systems are connected over fast, very reliable networks. Disks are cheep, fast, and capable of storing huge amounts of data. CPUs continued to follow Moore's Law, processing power that filled a room in 1990 now fits in your pocket. PCs, servers, and other computers proliferated both in business and the personal markets. In addition to the new hardware entire markets, social systems, and new modes of interpersonal communication moved onto the web and started evolving rapidly. These changes cause a massive explosion of data and a need to analyze and understand that data. Taken together this resulted in an entirely different landscape for database storage, new solutions were needed. A number of novel solutions stepped up and eventually a category called NoSQL emerged. The new market forces inspired the CAP theorem and the heated debate of BASE vs. ACID. But in essence this was simply the market looking at what to trade off to meet these new demands. These new database systems shared many qualities in common. There were designed to address massive amounts of data, millions of requests per second, and scale out across multiple systems. The first large-scale and successful solution was Dynamo, Amazon's distributed key/value database. Dynamo essentially took the next logical step and added a twist. Dynamo was to be the database of record, it would be distributed, data would be partitioned across many nodes, and it would tolerate failure by avoiding single points of failure. Amazon did this because they recognized that the majority of the dynamic content they provided to customers visiting their web store front didn't require the services of an RDBMS. The queries were simple, key/value look-ups or simple range queries with only a few queries that required more complex joins. They set about to use relational technology only in places where it was the best solution for the task, places like accounting and order fulfillment, but not in the myriad of other situations. The success of Dynamo, and it's design, inspired the next generation of Non-SQL, distributed database solutions including Cassandra, Riak and Voldemort. The problem their designers set out to solve was, "reliability at massive scale" so the first focal point was distributed database algorithms. Underneath Dynamo there is a local transactional database; either Berkeley DB, Berkeley DB Java Edition, MySQL or an in-memory key/value data structure. Dynamo was an evolution of local key/value storage onto networks. Cassandra, Riak, and Voldemort all faced similar design decisions and one, Voldemort, choose Berkeley DB Java Edition for it's node-local storage. Riak at first was entirely in-memory, but has recently added write-once, append-only log-based on-disk storage similar type of storage as Berkeley DB except that it is based on a hash table which must reside entirely in-memory rather than a btree which can live in-memory or on disk. Berkeley DB evolved too, we added high availability (HA) and a replication manager that makes it easy to setup replica groups. Berkeley DB's replication doesn't partitioned the data, every node keeps an entire copy of the database. For consistency, there is a single node where writes are committed first - a master - then those changes are delivered to the replica nodes as log records. Applications can choose to wait until all nodes are consistent, or fire and forget allowing Berkeley DB to eventually become consistent. Berkeley DB's HA scales-out quite well for read-intensive applications and also effectively eliminates the central point of failure by allowing replica nodes to be elected (using a PAXOS algorithm) to mastership if the master should fail. This implementation covers a wide variety of use cases. MemcacheDB is a server that implements the Memcache network protocol but uses Berkeley DB for storage and HA to replicate the cache state across all the nodes in the cache group. Google Accounts, the user authentication layer for all Google properties, was until recently running Berkeley DB HA. That scaled to a globally distributed system. That said, most NoSQL solutions try to partition (shard) data across nodes in the replication group and some allow writes as well as reads at any node, Berkeley DB HA does not. So, is Berkeley DB a "NoSQL" solution? Not really, but it certainly is a component of many of the existing NoSQL solutions out there. Forgetting all the noise about how NoSQL solutions are complex distributed databases when you boil them down to a single node you still have to store the data to some form of stable local storage. DBMs solved that problem a long time ago. NoSQL has more to do with the layers on top of the DBM; the distributed, sometimes-consistent, partitioned, scale-out storage that manage key/value or document sets and generally have some form of simple HTTP/REST-style network API. Does Berkeley DB do that? Not really. Is Berkeley DB a "NoSQL" solution today? Nope, but it's the most robust solution on which to build such a system. Re-inventing the node-local data storage isn't easy. A lot of people are starting to come to appreciate the sophisticated features found in Berkeley DB, even mimic them in some cases. Could Berkeley DB grow into a NoSQL solution? Absolutely. Our key/value API could be extended over the net using any of a number of existing network protocols such as memcache or HTTP/REST. We could adapt our node-local data partitioning out over replicated nodes. We even have a nice query language and cost-based query optimizer in our BDB XML product that we could reuse were we to build out a document-based NoSQL-style product. XML and JSON are not so different that we couldn't adapt one to work with the other interchangeably. Without too much effort we could add what's missing, we could jump into this No SQL market withing a single product development cycle. Why isn't Berkeley DB already a NoSQL solution? Why aren't we working on it? Why indeed...

    Read the article

  • MIX 2010 Covert Operations Day 2 Silverlight + Windows 7 Phone

    - by GeekAgilistMercenary
    Left the Circus Circus and headed to the geek circus at Mandalay Bay.  Got in, got some breakfast, met a few more people and headed to the keynote. Upon arriving the crew I was hanging with at the event; Erik Mork, Beth Murray, and Brian Henderson and I were entertained with several other thousand geeks by the wicked yo-yoing. The first video demo of something was of Bing Maps and various aspects of Microsoft Research integrated together.  Namely the pictures, put in place, on real 3d element maps of various environments. Silverlight Scott Guthrie, as one would guess, kicked off the keynote.  His first point was that user experience has become a priority at Microsoft.  This can be seen by any observant soul with the release and push of Expression, Silverlight, and the other tools.  This is even more apparent when one takes note of Microsoft bringing in people that can actually do good design and putting them at the forefront. The next thing Scott brought up was a few key points about Silverlight.  Currently Silverlight is a little over 2 years old and has achieved a pretty solid 60% penetration.  Silverlight has all sorts of capabilities that have been developed and are now provided as open source including;  ad injection, smoothing, playback editing, and more.  Another thing he showed, which really struck me as awesome being in the analytics space, was the Olympics and a quick glimpse of the ad statistics, viewer experience, video playback performance, audience trends, and overall viewer participation.  All of it rendered in Silverlight in beautiful detail. The key piece of Scott's various points were all punctuated with the fact that all of this code is available as open source.  Not only is Microsoft really delving into this design element of things, they're getting involved in the right ways. One of the last points I'll bring up about Silverlight 4 is the ability to have HD video on a monitor, and an entirely different activity being done on the other monitor, effectively making Silverlight the only RIA framework that supports multi-monitor support.  Overall, Silverlight is continuing to impress – providing superior capabilities tit-for-tat with the competition. Windows 7 Phone The Windows 7 Phone has 3 primary buttons (yes, more than the iPhone, don't let your mind explode!!).  Start, Search, and Back control all of the needed functionality of the phone.  At the same time, of course, there is the multi-touch, touch, and other interactive abilities of the interface.  The intent, once start is pressed is to have all the information that a phone owner wants displayed immediately.  Avoiding the scrolling through pages of apps or rolling a ball to get through multitudes of other non-interactive phone interfaces.  The Windows 7 Phone simply has the data right in front of you, basically a phone dashboard.  From there it is easy to dive into the interactive areas of the phone. Each area of the interface of the phone is broken into hubs.  These hubs include applications, data, and other things based on a relative basis.  This basis being determined by the user.  These applications interact on many other levels, and form a kind of relationship between each other adding more and more meta-data to the phone user, their interactions between the applications, and of course the social element of their interactions on the phone.  This makes this phone a practical must have for a marketer involved in social media.  The level of wired together interaction is massive, and of course, if you've seen Office Outlook 2010 you know that the power that is pulled into the phone by being tied to Outlook is massive. Joe Belfiore also showed several UI & specifically UX elements of the phone interface that allows paging to be instinctual by simple clipped items, flipping page to page, and other excellent user experience advances for phone devices.  Belfiore's also showed how his people hub had a massive list of people, with pictures, all from various different social networks and other associated relations.  The rendering, speed, and viewing of these people's, their pictures, their social network information, and other characteristics was smooth and in some situations unbelievably rendered.  This demo showed some of the great power of the beta phone, which isn't even as powerful as the planned end device. Joe finished up by jumping into the music, videos, and other media with the Zune Component of the Windows 7 Mobile Phone.  This was all good stuff, but I'll get to what really sold me on the media element in a moment. When Joe was done, Scott Guthrie stepped back up to walk through building a Windows 7 Mobile Phone.  This is were I have to give serious props.  He built this application, in Visual Studio 2010, in front of 2000+ people.  That was cool, but what really was amazing that he build the application in about 2 minutes.  The IDE, side by side design that is standard in Visual Studio is light years ahead of x-Code or any of the iPhone IDEs.  The Windows 7 Mobile System, if it can get market penetration, poses a technologically superior development and phone platform over anything on the market right now.  The biggest problem with the phone, is it just isn't available yet.  I personally can't wait for a chance to build some apps for the new Windows Phone. Netflix, I May Start Up an Account Again! When I get my Windows 7 Phone device, I am absolutely getting a Netflix account again.  The Vertigo crew, as I wrote on Twitter "#MIX10 Props @seesharp on @netflix demo", displayed an application on the phone for Netflix that actually ran HD Video of Rescue Me (with Dennis Leary).  The video played back smooth as it would on a dedicated computer, I was instantly sold.  So this didn't actually sell me on the phone, because I'm already sold, but it did sell me whole heartedly on the media capabilities of the pending phone. Anyway, I try not to do this but I may double post today.  Lunch is over and I'm off to another session very near and dear to the heart of my occupation, Analytics Tracking.  Stay tuned and I should have that post up by the end of the day. Original Post – Check out my other blog for even more technical ramblings and reads.

    Read the article

  • Search Engine Optimisation Companies Push PPC on Mobile Phones

    Search engine optimisation companies are now huddling various strategies to offer search engine optimisation services for mobile browsers. This rapidly growing trend can be clearly seen just by mobile phone users' behavior on how they use their cell phones. Search engines optimisation combined with PPC advertising for mobile gadgets can generate massive profits.

    Read the article

  • CPU DB like IMDB for Microprocessors

    - by Jason Fitzpatrick
    If you’re interested in the history of microprocessors, the CPU DB at Stanford is a massive database of microprocessors that covers everything from code names to speed to processor families. Play with their visuals or download the entire database and make your own. CPU DB [Stanford.edu] The Best Free Portable Apps for Your Flash Drive Toolkit How to Own Your Own Website (Even If You Can’t Build One) Pt 3 How to Sync Your Media Across Your Entire House with XBMC

    Read the article

  • Create Keyword Dense Content For Better SEO

    Briefly touching on this in the introduction this is important to do and be aware of but do not make this a massive part of your efforts to achieve better search engine ranking. This basically means optimizing your content to be more keyword dense so that the search engine robots will pick up your site as being relevant for a certain search phrase or keyword.

    Read the article

  • What bots are really worth letting onto a site?

    - by blunders
    Having written a number of bots, and seen the massive amounts of random bots that happen to crawl a site, I am wondering if the goal of the site allowing bots is for the potential for the bot to send real traffic back to the site if there is any reason to allow bots that are not known to be sending real traffic back, and how to spot these "good" bots; based on how they ID themselves, IPs they come from, behaviors, etc.

    Read the article

  • Don't Cut Corners on Server Defragmentation

    Hard-Core Hardware: Fragmentation may not cut it as a big screen villain, but it remains a threat and handicap to optimal server performance. In this era of massive hard drives and virtualization, minimizing fragmentation is more critical than ever.

    Read the article

  • Don't Cut Corners on Server Defragmentation

    Hard-Core Hardware: Fragmentation may not cut it as a big screen villain, but it remains a threat and handicap to optimal server performance. In this era of massive hard drives and virtualization, minimizing fragmentation is more critical than ever.

    Read the article

  • Open Clip Art Library 2.0 Release!

    <b>Worldlabel:</b> "The Open Clip Art Library grew from a project between Jon Phillips (of Fabricatorz) and Bryce Harrington, in early 2004. From humble beginnings, it has evolved into a massive collection of over 24,000 scalable vector images, all created by 1200+ artists from around the world."

    Read the article

  • At The ATM: The Challenge of Tiny Buttons [Video]

    - by Jason Fitzpatrick
    If you’ve ever mis-mashed the buttons on an electronic device because your fingers are just too big, you’ll appreciate the situation this cheerful but massive fingered fellow gets into. Courtesy of Rikke Asbjoern, created while interning at Cartoon Network, the video is sure to hit home with those of us that fumble keypads and buttons wherever we go. [via Neatorama] HTG Explains: What Is RSS and How Can I Benefit From Using It? HTG Explains: Why You Only Have to Wipe a Disk Once to Erase It HTG Explains: Learn How Websites Are Tracking You Online

    Read the article

  • Google not indexing new forum

    - by Tom Gullen
    We installed a new forum a few months ago now. The URL is: https://www.scirra.com/forum I've 301'd the old topics/threads, as well as included all the new URLs in the sitemap. Yet they still are not appearing. Webmaster tools is showing: 139,512 URLs submitted 50,544 URLs indexed And has been stuck there for quite some time. A massive drop in indexed pages since we updated the forum as well: Any help much appreciated

    Read the article

  • Project Idea - Android

    - by Darren Young
    Hi, I am trying to come up with some project ideas for my final year at University, and I think that I have one that would offer be a (massive) challenge, and something I could potentially make money from. I just want to check something. Is it possible(from a photograph), to be able to determine somebodys face and the individual parts of that face - eyes, ears, nose, etc? This will probably be via Android. Thanks.

    Read the article

  • How do you manage extensibility in your multi-tenant systems?

    - by Brian MacKay
    I've got a few big web based multi-tenant products now, and very soon I can see that there will be a lot of customizations that are tenant specific. An extra field here or there, maybe an extra page or some extra logic in the middle of a workflow - that sort of thing. Some of these customizations can be rolled into the core product, and that's great. Some of them are highly specific and would get in everyone else's way. I have a few ideas in mind for managing this, but none of them seem to scale well. The obvious solution is to introduce a ton of client-level settings, allowing various 'features' to be enabled on per-client basis. The downside with that, of course, is massive complexity and clutter. You could introduce a truly huge number of settings, and over time various types of logic (presentation, business) could get way out of hand. Then there's the problem of client-specific fields, which begs for something cleaner than just adding a bunch of nullable fields to the existing tables. So what are people doing to manage this? Force.com seems to be the master of extensibility; obviously they've created a platform from the ground up that is super extensible. You can add on to almost anything with their web-based UI. FogBugz did something similiar where they created a robust plugin model that, come to think of it, might have actually been inspired by Force. I know they spent a lot of time and money on it and if I'm not mistaken the intention was to actually use it internally for future product development. Sounds like the kind of thing I could be tempted to build but probably shouldn't. :) Is a massive investment in pluggable architecture the only way to go? How are you managing these problems, and what kind of results are you seeing? EDIT: It does look as though FogBugz handled the problem by building a fairly robust platform and then using that to put together their screens. To extend it you create a DLL containing classes that implement interfaces like ISearchScreenGridColumn, and that becomes a module. I'm sure it was tremendously expensive to build considering that they have a large of devs and they worked on it for months, plus their surface area is perhaps 5% of the size of my application. Right now I am seriously wondering if Force.com is the right way to handle this. And I am a hard core ASP.Net guy, so this is a strange position to find myself in.

    Read the article

  • The Great PST Migration

    Having recently been on the front lines of a massive PST import operation, Sean Duffy offers advice and points out pitfalls. More than anything, he wishes he had a simple tool with which to banish PST hell, and finishes with some hard-won guidelines.

    Read the article

  • Staggered Isometric Map: Calculate map coordinates for point on screen

    - by Chris
    I know there are already a lot of resources about this, but I haven't found one that matches my coordinate system and I'm having massive trouble adjusting any of those solutions to my needs. What I learned is that the best way to do this is to use a transformation matrix. Implementing that is no problem, but I don't know in which way I have to transform the coordinate space. Here's an image that shows my coordinate system: How do I transform a point on screen to this coordinate system?

    Read the article

< Previous Page | 1 2 3 4 5 6 7 8 9 10 11 12  | Next Page >