Search Results

Search found 7490 results on 300 pages for 'algorithm analysis'.

Page 147/300 | < Previous Page | 143 144 145 146 147 148 149 150 151 152 153 154 | Next Page >

Trouble with copying dictionaries and using deepcopy on an SQLAlchemy ORM object

- by Az

Hi there, I'm doing a Simulated Annealing algorithm to optimise a given allocation of students and projects. This is language-agnostic pseudocode from Wikipedia: s ? s0; e ? E(s) // Initial state, energy. sbest ? s; ebest ? e // Initial "best" solution k ? 0 // Energy evaluation count. while k < kmax and e > emax // While time left & not good enough: snew ? neighbour(s) // Pick some neighbour. enew ? E(snew) // Compute its energy. if enew < ebest then // Is this a new best? sbest ? snew; ebest ? enew // Save 'new neighbour' to 'best found'. if P(e, enew, temp(k/kmax)) > random() then // Should we move to it? s ? snew; e ? enew // Yes, change state. k ? k + 1 // One more evaluation done return sbest // Return the best solution found. The following is an adaptation of the technique. My supervisor said the idea is fine in theory. First I pick up some allocation (i.e. an entire dictionary of students and their allocated projects, including the ranks for the projects) from entire set of randomised allocations, copy it and pass it to my function. Let's call this allocation aOld (it is a dictionary). aOld has a weight related to it called wOld. The weighting is described below. The function does the following: Let this allocation, aOld be the best_node From all the students, pick a random number of students and stick in a list Strip (DEALLOCATE) them of their projects ++ reflect the changes for projects (allocated parameter is now False) and lecturers (free up slots if one or more of their projects are no longer allocated) Randomise that list Try assigning (REALLOCATE) everyone in that list projects again Calculate the weight (add up ranks, rank 1 = 1, rank 2 = 2... and no project rank = 101) For this new allocation aNew, if the weight wNew is smaller than the allocation weight wOld I picked up at the beginning, then this is the best_node (as defined by the Simulated Annealing algorithm above). Apply the algorithm to aNew and continue. If wOld < wNew, then apply the algorithm to aOld again and continue. The allocations/data-points are expressed as "nodes" such that a node = (weight, allocation_dict, projects_dict, lecturers_dict) Right now, I can only perform this algorithm once, but I'll need to try for a number N (denoted by kmax in the Wikipedia snippet) and make sure I always have with me, the previous node and the best_node. So that I don't modify my original dictionaries (which I might want to reset to), I've done a shallow copy of the dictionaries. From what I've read in the docs, it seems that it only copies the references and since my dictionaries contain objects, changing the copied dictionary ends up changing the objects anyway. So I tried to use copy.deepcopy().These dictionaries refer to objects that have been mapped with SQLA. Questions: I've been given some solutions to the problems faced but due to my über green-ness with using Python, they all sound rather cryptic to me. Deepcopy isn't playing nicely with SQLA. I've been told thatdeepcopy on ORM objects probably has issues that prevent it from working as you'd expect. Apparently I'd be better off "building copy constructors, i.e. def copy(self): return FooBar(....)." Can someone please explain what that means? I checked and found out that deepcopy has issues because SQLAlchemy places extra information on your objects, i.e. an _sa_instance_state attribute, that I wouldn't want in the copy but is necessary for the object to have. I've been told: "There are ways to manually blow away the old _sa_instance_state and put a new one on the object, but the most straightforward is to make a new object with __init__() and set up the attributes that are significant, instead of doing a full deep copy." What exactly does that mean? Do I create a new, unmapped class similar to the old, mapped one? An alternate solution is that I'd have to "implement __deepcopy__() on your objects and ensure that a new _sa_instance_state is set up, there are functions in sqlalchemy.orm.attributes which can help with that." Once again this is beyond me so could someone kindly explain what it means? A more general question: given the above information are there any suggestions on how I can maintain the information/state for the best_node (which must always persist through my while loop) and the previous_node, if my actual objects (referenced by the dictionaries, therefore the nodes) are changing due to the deallocation/reallocation taking place? That is, without using copy?

Read the article
Online ALTER TABLE in MySQL 5.6

- by Marko Mäkelä

This is the low-level view of data dictionary language (DDL) operations in the InnoDB storage engine in MySQL 5.6. John Russell gave a more high-level view in his blog post April 2012 Labs Release – Online DDL Improvements. MySQL before the InnoDB Plugin Traditionally, the MySQL storage engine interface has taken a minimalistic approach to data definition language. The only natively supported operations were CREATE TABLE, DROP TABLE and RENAME TABLE. Consider the following example: CREATE TABLE t(a INT); INSERT INTO t VALUES (1),(2),(3); CREATE INDEX a ON t(a); DROP TABLE t; The CREATE INDEX statement would be executed roughly as follows: CREATE TABLE temp(a INT, INDEX(a)); INSERT INTO temp SELECT * FROM t; RENAME TABLE t TO temp2; RENAME TABLE temp TO t; DROP TABLE temp2; You could imagine that the database could crash when copying all rows from the original table to the new one. For example, it could run out of file space. Then, on restart, InnoDB would roll back the huge INSERT transaction. To fix things a little, a hack was added to ha_innobase::write_row for committing the transaction every 10,000 rows. Still, it was frustrating that even a simple DROP INDEX would make the table unavailable for modifications for a long time. Fast Index Creation in the InnoDB Plugin of MySQL 5.1 MySQL 5.1 introduced a new interface for CREATE INDEX and DROP INDEX. The old table-copying approach can still be forced by SET old_alter_table=0. This interface is used in MySQL 5.5 and in the InnoDB Plugin for MySQL 5.1. Apart from the ability to do a quick DROP INDEX, the main advantage is that InnoDB will execute a merge-sort algorithm before inserting the index records into each index that is being created. This should speed up the insert into the secondary index B-trees and potentially result in a better B-tree fill factor. The 5.1 ALTER TABLE interface was not perfect. For example, DROP FOREIGN KEY still invoked the table copy. Renaming columns could conflict with InnoDB foreign key constraints. Combining ADD KEY and DROP KEY in ALTER TABLE was problematic and not atomic inside the storage engine. The ALTER TABLE interface in MySQL 5.6 The ALTER TABLE storage engine interface was completely rewritten in MySQL 5.6. Instead of introducing a method call for every conceivable operation, MySQL 5.6 introduced a handful of methods, and data structures that keep track of the requested changes. In MySQL 5.6, online ALTER TABLE operation can be requested by specifying LOCK=NONE. Also LOCK=SHARED and LOCK=EXCLUSIVE are available. The old-style table copying can be requested by ALGORITHM=COPY. That one will require at least LOCK=SHARED. From the InnoDB point of view, anything that is possible with LOCK=EXCLUSIVE is also possible with LOCK=SHARED. Most ALGORITHM=INPLACE operations inside InnoDB can be executed online (LOCK=NONE). InnoDB will always require an exclusive table lock in two phases of the operation. The execution phases are tied to a number of methods: handler::check_if_supported_inplace_alter Checks if the storage engine can perform all requested operations, and if so, what kind of locking is needed. handler::prepare_inplace_alter_table InnoDB uses this method to set up the data dictionary cache for upcoming CREATE INDEX operation. We need stubs for the new indexes, so that we can keep track of changes to the table during online index creation. Also, crash recovery would drop any indexes that were incomplete at the time of the crash. handler::inplace_alter_table In InnoDB, this method is used for creating secondary indexes or for rebuilding the table. This is the ‘main’ phase that can be executed online (with concurrent writes to the table). handler::commit_inplace_alter_table This is where the operation is committed or rolled back. Here, InnoDB would drop any indexes, rename any columns, drop or add foreign keys, and finalize a table rebuild or index creation. It would also discard any logs that were set up for online index creation or table rebuild. The prepare and commit phases require an exclusive lock, blocking all access to the table. If MySQL times out while upgrading the table meta-data lock for the commit phase, it will roll back the ALTER TABLE operation. In MySQL 5.6, data definition language operations are still not fully atomic, because the data dictionary is split. Part of it is inside InnoDB data dictionary tables. Part of the information is only available in the *.frm file, which is not covered by any crash recovery log. But, there is a single commit phase inside the storage engine. Online Secondary Index Creation It may occur that an index needs to be created on a new column to speed up queries. But, it may be unacceptable to block modifications on the table while creating the index. It turns out that it is conceptually not so hard to support online index creation. All we need is some more execution phases: Set up a stub for the index, for logging changes. Scan the table for index records. Sort the index records. Bulk load the index records. Apply the logged changes. Replace the stub with the actual index. Threads that modify the table will log the operations to the logs of each index that is being created. Errors, such as log overflow or uniqueness violations, will only be flagged by the ALTER TABLE thread. The log is conceptually similar to the InnoDB change buffer. The bulk load of index records will bypass record locking. We still generate redo log for writing the index pages. It would suffice to log page allocations only, and to flush the index pages from the buffer pool to the file system upon completion. Native ALTER TABLE Starting with MySQL 5.6, InnoDB supports most ALTER TABLE operations natively. The notable exceptions are changes to the column type, ADD FOREIGN KEY except when foreign_key_checks=0, and changes to tables that contain FULLTEXT indexes. The keyword ALGORITHM=INPLACE is somewhat misleading, because certain operations cannot be performed in-place. For example, changing the ROW_FORMAT of a table requires a rebuild. Online operation (LOCK=NONE) is not allowed in the following cases: when adding an AUTO_INCREMENT column, when the table contains FULLTEXT indexes or a hidden FTS_DOC_ID column, or when there are FOREIGN KEY constraints referring to the table, with ON…CASCADE or ON…SET NULL option. The FOREIGN KEY limitations are needed, because MySQL does not acquire meta-data locks on the child or parent tables when executing SQL statements. Theoretically, InnoDB could support operations like ADD COLUMN and DROP COLUMN in-place, by lazily converting the table to a newer format. This would require that the data dictionary keep multiple versions of the table definition. For simplicity, we will copy the entire table, even for DROP COLUMN. The bulk copying of the table will bypass record locking and undo logging. For facilitating online operation, a temporary log will be associated with the clustered index of table. Threads that modify the table will also write the changes to the log. When altering the table, we skip all records that have been marked for deletion. In this way, we can simply discard any undo log records that were not yet purged from the original table. Off-page columns, or BLOBs, are an important consideration. We suspend the purge of delete-marked records if it would free any off-page columns from the old table. This is because the BLOBs can be needed when applying changes from the log. We have special logging for handling the ROLLBACK of an INSERT that inserted new off-page columns. This is because the columns will be freed at rollback.

Read the article
Organization & Architecture UNISA Studies – Chap 4

- by MarkPearl

Learning Outcomes Explain the characteristics of memory systems Describe the memory hierarchy Discuss cache memory principles Discuss issues relevant to cache design Describe the cache organization of the Pentium Computer Memory Systems There are key characteristics of memory… Location – internal or external Capacity – expressed in terms of bytes Unit of Transfer – the number of bits read out of or written into memory at a time Access Method – sequential, direct, random or associative From a users perspective the two most important characteristics of memory are… Capacity Performance – access time, memory cycle time, transfer rate The trade off for memory happens along three axis… Faster access time, greater cost per bit Greater capacity, smaller cost per bit Greater capacity, slower access time This leads to people using a tiered approach in their use of memory As one goes down the hierarchy, the following occurs… Decreasing cost per bit Increasing capacity Increasing access time Decreasing frequency of access of the memory by the processor The use of two levels of memory to reduce average access time works in principle, but only if conditions 1 to 4 apply. A variety of technologies exist that allow us to accomplish this. Thus it is possible to organize data across the hierarchy such that the percentage of accesses to each successively lower level is substantially less than that of the level above. A portion of main memory can be used as a buffer to hold data temporarily that is to be read out to disk. This is sometimes referred to as a disk cache and improves performance in two ways… Disk writes are clustered. Instead of many small transfers of data, we have a few large transfers of data. This improves disk performance and minimizes processor involvement. Some data designed for write-out may be referenced by a program before the next dump to disk. In that case the data is retrieved rapidly from the software cache rather than slowly from disk. Cache Memory Principles Cache memory is substantially faster than main memory. A caching system works as follows.. When a processor attempts to read a word of memory, a check is made to see if this in in cache memory… If it is, the data is supplied, If it is not in the cache, a block of main memory, consisting of a fixed number of words is loaded to the cache. Because of the phenomenon of locality of references, when a block of data is fetched into the cache, it is likely that there will be future references to that same memory location or to other words in the block. Elements of Cache Design While there are a large number of cache implementations, there are a few basic design elements that serve to classify and differentiate cache architectures… Cache Addresses Cache Size Mapping Function Replacement Algorithm Write Policy Line Size Number of Caches Cache Addresses Almost all non-embedded processors support virtual memory. Virtual memory in essence allows a program to address memory from a logical point of view without needing to worry about the amount of physical memory available. When virtual addresses are used the designer may choose to place the cache between the MMU (memory management unit) and the processor or between the MMU and main memory. The disadvantage of virtual memory is that most virtual memory systems supply each application with the same virtual memory address space (each application sees virtual memory starting at memory address 0), which means the cache memory must be completely flushed with each application context switch or extra bits must be added to each line of the cache to identify which virtual address space the address refers to. Cache Size We would like the size of the cache to be small enough so that the overall average cost per bit is close to that of main memory alone and large enough so that the overall average access time is close to that of the cache alone. Also, larger caches are slightly slower than smaller ones. Mapping Function Because there are fewer cache lines than main memory blocks, an algorithm is needed for mapping main memory blocks into cache lines. The choice of mapping function dictates how the cache is organized. Three techniques can be used… Direct – simplest technique, maps each block of main memory into only one possible cache line Associative – Each main memory block to be loaded into any line of the cache Set Associative – exhibits the strengths of both the direct and associative approaches while reducing their disadvantages For detailed explanations of each approach – read the text book (page 148 – 154) Replacement Algorithm For associative and set associating mapping a replacement algorithm is needed to determine which of the existing blocks in the cache must be replaced by a new block. There are four common approaches… LRU (Least recently used) FIFO (First in first out) LFU (Least frequently used) Random selection Write Policy When a block resident in the cache is to be replaced, there are two cases to consider If no writes to that block have happened in the cache – discard it If a write has occurred, a process needs to be initiated where the changes in the cache are propagated back to the main memory. There are several approaches to achieve this including… Write Through – all writes to the cache are done to the main memory as well at the point of the change Write Back – when a block is replaced, all dirty bits are written back to main memory The problem is complicated when we have multiple caches, there are techniques to accommodate for this but I have not summarized them. Line Size When a block of data is retrieved and placed in the cache, not only the desired word but also some number of adjacent words are retrieved. As the block size increases from very small to larger sizes, the hit ratio will at first increase because of the principle of locality, which states that the data in the vicinity of a referenced word are likely to be referenced in the near future. As the block size increases, more useful data are brought into cache. The hit ratio will begin to decrease as the block becomes even bigger and the probability of using the newly fetched information becomes less than the probability of using the newly fetched information that has to be replaced. Two specific effects come into play… Larger blocks reduce the number of blocks that fit into a cache. Because each block fetch overwrites older cache contents, a small number of blocks results in data being overwritten shortly after they are fetched. As a block becomes larger, each additional word is farther from the requested word and therefore less likely to be needed in the near future. The relationship between block size and hit ratio is complex, and no set approach is judged to be the best in all circumstances. Pentium 4 and ARM cache organizations The processor core consists of four major components: Fetch/decode unit – fetches program instruction in order from the L2 cache, decodes these into a series of micro-operations, and stores the results in the L2 instruction cache Out-of-order execution logic – Schedules execution of the micro-operations subject to data dependencies and resource availability – thus micro-operations may be scheduled for execution in a different order than they were fetched from the instruction stream. As time permits, this unit schedules speculative execution of micro-operations that may be required in the future Execution units – These units execute micro-operations, fetching the required data from the L1 data cache and temporarily storing results in registers Memory subsystem – This unit includes the L2 and L3 caches and the system bus, which is used to access main memory when the L1 and L2 caches have a cache miss and to access the system I/O resources

Read the article
Your thoughts on Best Practices for Scientific Computing?

- by John Smith

A recent paper by Wilson et al (2014) pointed out 24 Best Practices for scientific programming. It's worth to have a look. I would like to hear opinions about these points from experienced programmers in scientific data analysis. Do you think these advices are helpful and practical? Or are they good only in an ideal world? Wilson G, Aruliah DA, Brown CT, Chue Hong NP, Davis M, Guy RT, Haddock SHD, Huff KD, Mitchell IM, Plumbley MD, Waugh B, White EP, Wilson P (2014) Best Practices for Scientific Computing. PLoS Biol 12:e1001745. http://www.plosbiology.org/article/info%3Adoi%2F10.1371%2Fjournal.pbio.1001745 Box 1. Summary of Best Practices Write programs for people, not computers. (a) A program should not require its readers to hold more than a handful of facts in memory at once. (b) Make names consistent, distinctive, and meaningful. (c) Make code style and formatting consistent. Let the computer do the work. (a) Make the computer repeat tasks. (b) Save recent commands in a file for re-use. (c) Use a build tool to automate workflows. Make incremental changes. (a) Work in small steps with frequent feedback and course correction. (b) Use a version control system. (c) Put everything that has been created manually in version control. Don’t repeat yourself (or others). (a) Every piece of data must have a single authoritative representation in the system. (b) Modularize code rather than copying and pasting. (c) Re-use code instead of rewriting it. Plan for mistakes. (a) Add assertions to programs to check their operation. (b) Use an off-the-shelf unit testing library. (c) Turn bugs into test cases. (d) Use a symbolic debugger. Optimize software only after it works correctly. (a) Use a profiler to identify bottlenecks. (b) Write code in the highest-level language possible. Document design and purpose, not mechanics. (a) Document interfaces and reasons, not implementations. (b) Refactor code in preference to explaining how it works. (c) Embed the documentation for a piece of software in that software. Collaborate. (a) Use pre-merge code reviews. (b) Use pair programming when bringing someone new up to speed and when tackling particularly tricky problems. (c) Use an issue tracking tool. I'm relatively new to serious programming for scientific data analysis. When I tried to write code for pilot analyses of some of my data last year, I encountered tremendous amount of bugs both in my code and data. Bugs and errors had been around me all the time, but this time it was somewhat overwhelming. I managed to crunch the numbers at last, but I thought I couldn't put up with this mess any longer. Some actions must be taken. Without a sophisticated guide like the article above, I started to adopt "defensive style" of programming since then. A book titled "The Art of Readable Code" helped me a lot. I deployed meticulous input validations or assertions for every function, renamed a lot of variables and functions for better readability, and extracted many subroutines as reusable functions. Recently, I introduced Git and SourceTree for version control. At the moment, because my co-workers are much more reluctant about these issues, the collaboration practices (8a,b,c) have not been introduced. Actually, as the authors admitted, because all of these practices take some amount of time and effort to introduce, it may be generally hard to persuade your reluctant collaborators to comply them. I think I'm asking your opinions because I still suffer from many bugs despite all my effort on many of these practices. Bug fix may be, or should be, faster than before, but I couldn't really measure the improvement. Moreover, much of my time has been invested on defence, meaning that I haven't actually done much data analysis (offence) these days. Where is the point I should stop at in terms of productivity? I've already deployed: 1a,b,c, 2a, 3a,b,c, 4b,c, 5a,d, 6a,b, 7a,7b I'm about to have a go at: 5b,c Not yet: 2b,c, 4a, 7c, 8a,b,c (I could not really see the advantage of using GNU make (2c) for my purpose. Could anyone tell me how it helps my work with MATLAB?)

Read the article
Introducing Microsoft SQL Server 2008 R2 - Business Intelligence Samples

- by smisner

On April 14, 2010, Microsoft Press (blog | twitter) released my latest book, co-authored with Ross Mistry (twitter), as a free ebook download - Introducing Microsoft SQL Server 2008 R2. As the title implies, this ebook is an introduction to the latest SQL Server release. Although you'll find a comprehensive review of the product's features in this book, you will not find the step-by-step details that are typical in my other books. For those readers who are interested in a more interactive learning experience, I have created two samples file for download: IntroSQLServer2008R2Samples project Sales Analysis workbook Here's a recap of the business intelligence chapters and the samples I used to generate the screen shots by chapter: Chapter 6: Scalable Data Warehousing covers a new edition of SQL Server, Parallel Data Warehouse. Understandably, Microsoft did not ship me the software and hardware to set up my own Parallel Data Warehouse environment for testing purposes and consequently you won't see any screenshots in this chapter. I received a lot of information and a lot of help from the product team during the development of this chapter to ensure its technical accuracy. Chapter 7: Master Data Services is a new component in SQL Server. After you install Master Data Services (MDS), which is a separate installation from SQL Server although it's found on the same media, you can install sample models to explore (which is what I did to create screenshots for the book). To do this, you deploying packages found at \Program Files\Microsoft SQL Server\Master Data Services\Samples\Packages. You will first need to use the Configuration Manager (in the Microsoft SQL Server 2008 R2\Master Data Services program group) to create a database and a Web application for MDS. Then when you launch the application, you'll see a Getting Started page which has a Deploy Sample Data link that you can use to deploy any of the sample packages. Chapter 8: Complex Event Processing is an introduction to another new component, StreamInsight. This topic was way too large to cover in-depth in a single chapter, so I focused on information such as architecture, development models, and an overview of the key sections of code you'll need to develop for your own applications. StreamInsight is an engine that operates on data in-flight and as such has no user interface that I could include in the book as screenshots. The November CTP version of SQL Server 2008 R2 included code samples as part of the installation, but these are not the official samples that will eventually be available in Codeplex. At the time of this writing, the samples are not yet published. Chapter 9: Reporting Services Enhancements provides an overview of all the changes to Reporting Services in SQL Server 2008 R2, and there are many! In previous posts, I shared more details than you'll find in the book about new functions (Lookup, MultiLookup, and LookupSet), properties for page numbering, and the new global variable RenderFormat. I will confess that I didn't use actual data in the book for my discussion on the Lookup functions, but I did create real reports for the blog posts and will upload those separately. For the other screenshots and examples in the book, I have created the IntroSQLServer2008R2Samples project for you to download. To preview these reports in Business Intelligence Development Studio, you must have the AdventureWorksDW2008R2 database installed, and you must download and install SQL Server 2008 R2. For the map report, you must execute the PopulationData.sql script that I included in the samples file to add a table to the AdventureWorksDW2008R2 database. The IntroSQLServer2008R2Samples project includes the following files: 01_AggregateOfAggregates.rdl to illustrate the use of embedded aggregate functions 02_RenderFormatAndPaging.rdl to illustrate the use of page break properties (Disabled, ResetPageNumber), the PageName property, and the RenderFormat global variable 03_DataSynchronization.rdl to illustrate the use of the DomainScope property 04_TextboxOrientation.rdl to illustrate the use of the WritingMode property 05_DataBar.rdl 06_Sparklines.rdl 07_Indicators.rdl 08_Map.rdl to illustrate a simple analytical map that uses color to show population counts by state PopulationData.sql to provide the data necessary for the map report Chapter 10: Self-Service Analysis with PowerPivot introduces two new components to the Microsoft BI stack, PowerPivot for Excel and PowerPivot for SharePoint, which you can learn more about at the PowerPivot site. To produce the screenshots for this chapter, I created the Sales Analysis workbook which you can download (although you must have Excel 2010 and the PowerPivot for Excel add-in installed to explore it fully). It's a rather simple workbook because space in the book did not permit a complete exploration of all the wonderful things you can do with PowerPivot. I used a tutorial that was available with the CTP version as a basis for the report so it might look familiar if you've already started learning about PowerPivot. In future posts, I'll continue exploring the new features in greater detail. If there's any special requests, please let me know! Share this post: email it! | bookmark it! | digg it! | reddit! | kick it! | live it!

Read the article
SQL SERVER – Example of Performance Tuning for Advanced Users with DB Optimizer

- by Pinal Dave

Performance tuning is such a subject that everyone wants to master it. In beginning everybody is at a novice level and spend lots of time learning how to master the art of performance tuning. However, as we progress further the tuning of the system keeps on getting very difficult. I have understood in my early career there should be no need of ego in the technology field. There are always better solutions and better ideas out there and we should not resist them. Instead of resisting the change and new wave I personally adopt it. Here is a similar example, as I personally progress to the master level of performance tuning, I face that it is getting harder to come up with optimal solutions. In such scenarios I rely on various tools to teach me how I can do things better. Once I learn about tools, I am often able to come up with better solutions when I face the similar situation next time. A few days ago I had received a query where the user wanted to tune it further to get the maximum out of the performance. I have re-written the similar query with the help of AdventureWorks sample database. SELECT * FROM HumanResources.Employee e INNER JOIN HumanResources.EmployeeDepartmentHistory edh ON e.BusinessEntityID = edh.BusinessEntityID INNER JOIN HumanResources.Shift s ON edh.ShiftID = s.ShiftID; User had similar query to above query was used in very critical report and wanted to get best out of the query. When I looked at the query – here were my initial thoughts Use only column in the select statements as much as you want in the application Let us look at the query pattern and data workload and find out the optimal index for it Before I give further solutions I was told by the user that they need all the columns from all the tables and creating index was not allowed in their system. He can only re-write queries or use hints to further tune this query. Now I was in the constraint box – I believe * was not a great idea but if they wanted all the columns, I believe we can’t do much besides using *. Additionally, if I cannot create a further index, I must come up with some creative way to write this query. I personally do not like to use hints in my application but there are cases when hints work out magically and gives optimal solutions. Finally, I decided to use Embarcadero’s DB Optimizer. It is a fantastic tool and very helpful when it is about performance tuning. I have previously explained how it works over here. First open DBOptimizer and open Tuning Job from File >> New >> Tuning Job. Once you open DBOptimizer Tuning Job follow the various steps indicates in the following diagram. Essentially we will take our original script and will paste that into Step 1: New SQL Text and right after that we will enable Step 2 for Generating Various cases, Step 3 for Detailed Analysis and Step 4 for Executing each generated case. Finally we will click on Analysis in Step 5 which will generate the report detailed analysis in the result pan. The detailed pan looks like. It generates various cases of T-SQL based on the original query. It applies various hints and available hints to the query and generate various execution plans of the query and displays them in the resultant. You can clearly notice that original query had a cost of 0.0841 and logical reads about 607 pages. Whereas various options which are just following it has different execution cost as well logical read. There are few cases where we have higher logical read and there are few cases where as we have very low logical read. If we pay attention the very next row to original query have Merge_Join_Query in description and have lowest execution cost value of 0.044 and have lowest Logical Reads of 29. This row contains the query which is the most optimal re-write of the original query. Let us double click over it. Here is the query: SELECT * FROM HumanResources.Employee e INNER JOIN HumanResources.EmployeeDepartmentHistory edh ON e.BusinessEntityID = edh.BusinessEntityID INNER JOIN HumanResources.Shift s ON edh.ShiftID = s.ShiftID OPTION (MERGE JOIN) If you notice above query have additional hint of Merge Join. With the help of this Merge Join query hint this query is now performing much better than before. The entire process takes less than 60 seconds. Please note that it the join hint Merge Join was optimal for this query but it is not necessary that the same hint will be helpful in all the queries. Additionally, if the workload or data pattern changes the query hint of merge join may be no more optimal join. In that case, we will have to redo the entire exercise once again. This is the reason I do not like to use hints in my queries and I discourage all of my users to use the same. However, if you look at this example, this is a great case where hints are optimizing the performance of the query. It is humanly not possible to test out various query hints and index options with the query to figure out which is the most optimal solution. Sometimes, we need to depend on the efficiency tools like DB Optimizer to guide us the way and select the best option from the suggestion provided. Let me know what you think of this article as well your experience with DB Optimizer. Please leave a comment. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: PostADay, SQL, SQL Authority, SQL Joins, SQL Optimization, SQL Performance, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology

Read the article
Am I right about the differences between Floyd-Warshall, Dijkstra's and Bellman-Ford algorithms?

- by Programming Noob

I've been studying the three and I'm stating my inferences from them below. Could someone tell me if I have understood them accurately enough or not? Thank you. Dijkstra's algorithm is used only when you have a single source and you want to know the smallest path from one node to another, but fails in cases like this Floyd-Warshall's algorithm is used when any of all the nodes can be a source, so you want the shortest distance to reach any destination node from any source node. This only fails when there are negative cycles (this is the most important one. I mean, this is the one I'm least sure about:) 3.Bellman-Ford is used like Dijkstra's, when there is only one source. This can handle negative weights and its working is the same as Floyd-Warshall's except for one source, right? If you need to have a look, the corresponding algorithms are (courtesy Wikipedia): Bellman-Ford: procedure BellmanFord(list vertices, list edges, vertex source) // This implementation takes in a graph, represented as lists of vertices // and edges, and modifies the vertices so that their distance and // predecessor attributes store the shortest paths. // Step 1: initialize graph for each vertex v in vertices: if v is source then v.distance := 0 else v.distance := infinity v.predecessor := null // Step 2: relax edges repeatedly for i from 1 to size(vertices)-1: for each edge uv in edges: // uv is the edge from u to v u := uv.source v := uv.destination if u.distance + uv.weight < v.distance: v.distance := u.distance + uv.weight v.predecessor := u // Step 3: check for negative-weight cycles for each edge uv in edges: u := uv.source v := uv.destination if u.distance + uv.weight < v.distance: error "Graph contains a negative-weight cycle" Dijkstra: 1 function Dijkstra(Graph, source): 2 for each vertex v in Graph: // Initializations 3 dist[v] := infinity ; // Unknown distance function from 4 // source to v 5 previous[v] := undefined ; // Previous node in optimal path 6 // from source 7 8 dist[source] := 0 ; // Distance from source to source 9 Q := the set of all nodes in Graph ; // All nodes in the graph are 10 // unoptimized - thus are in Q 11 while Q is not empty: // The main loop 12 u := vertex in Q with smallest distance in dist[] ; // Start node in first case 13 if dist[u] = infinity: 14 break ; // all remaining vertices are 15 // inaccessible from source 16 17 remove u from Q ; 18 for each neighbor v of u: // where v has not yet been 19 removed from Q. 20 alt := dist[u] + dist_between(u, v) ; 21 if alt < dist[v]: // Relax (u,v,a) 22 dist[v] := alt ; 23 previous[v] := u ; 24 decrease-key v in Q; // Reorder v in the Queue 25 return dist; Floyd-Warshall: 1 /* Assume a function edgeCost(i,j) which returns the cost of the edge from i to j 2 (infinity if there is none). 3 Also assume that n is the number of vertices and edgeCost(i,i) = 0 4 */ 5 6 int path[][]; 7 /* A 2-dimensional matrix. At each step in the algorithm, path[i][j] is the shortest path 8 from i to j using intermediate vertices (1..k-1). Each path[i][j] is initialized to 9 edgeCost(i,j). 10 */ 11 12 procedure FloydWarshall () 13 for k := 1 to n 14 for i := 1 to n 15 for j := 1 to n 16 path[i][j] = min ( path[i][j], path[i][k]+path[k][j] );

Read the article
Code Metrics: Number of IL Instructions

- by DigiMortal

In my previous posting about code metrics I introduced how to measure LoC (Lines of Code) in .NET applications. Now let’s take a step further and let’s take a look how to measure compiled code. This way we can somehow have a picture about what compiler produces. In this posting I will introduce you code metric called number of IL instructions. NB! Number of IL instructions is not something you can use to measure productivity of your team. If you want to get better idea about the context of this metric and LoC then please read my first posting about LoC. What are IL instructions? When code written in some .NET Framework language is compiled then compiler produces assemblies that contain byte code. These assemblies are executed later by Common Language Runtime (CLR) that is code execution engine of .NET Framework. The byte code is called Intermediate Language (IL) – this is more common language than C# and VB.NET by example. You can use ILDasm tool to convert assemblies to IL assembler so you can read them. As IL instructions are building blocks of all .NET Framework binary code these instructions are smaller and highly general – we don’t want very rich low level language because it executes slower than more general language. For every method or property call in some .NET Framework language corresponds set of IL instructions. There is no 1:1 relationship between line in high level language and line in IL assembler. There are more IL instructions than lines in C# code by example. How much instructions there are? I have no common answer because it really depends on your code. Here you can see some metrics from my current community project that is developed on SharePoint Server 2007. As average I have about 7 IL instructions per line of code. This is not metric you should use, it is just illustrative example so you can see the differences between numbers of lines and IL instructions. Why should I measure the number of IL instructions? Just take a look at chart above. Compiler does something that you cannot see – it compiles your code to IL. This is not intuitive process because you usually cannot say what is exactly the end result. You know it at greater plain but you don’t know it exactly. Therefore we can expect some surprises and that’s why we should measure the number of IL instructions. By example, you may find better solution for some method in your source code. It looks nice, it works nice and everything seems to be okay. But on server under load your fix may be way slower than previous code. Although you minimized the number of lines of code it ended up with increasing the number of IL instructions. How to measure the number of IL instructions? My choice is NDepend because Visual Studio is not able to measure this metric. Steps to make are easy. Open your NDepend project or create new and add all your application assemblies to project (you can also add Visual Studio solution to project). Run project analysis and wait until it is done. You can see over-all stats form global summary window. This is the same window I used to read the LoC and the number of IL instructions metrics for my chart. Meanwhile I made some changes to my code (enabled advanced caching for events and event registrations module) and then I ran code analysis again to get results for this section of this posting. NDepend is also able to tell you exactly what parts of code have problematically much IL instructions. The code quality section of CQL Query Explorer shows you how much problems there are with members in analyzed code. If you click on the line Methods too big (NbILInstructions) you can see all the problematic members of classes in CQL Explorer shown in image on right. In my case if have 10 methods that are too big and two of them have horrible number of IL instructions – just take a look at first two methods in this TOP10. Also note the query box. NDepend has easy and SQL-like query language to query code analysis results. You can modify these queries if you like and also you can define your own ones if default set is not enough for you. What is good result? As you can see from query window then the number of IL instructions per member should have maximally 200 IL instructions. Of course, like always, the less instructions you have, the better performing code you have. I don’t mean here little differences but big ones. By example, take a look at my first method in warnings list. The number of IL instructions it has is huge. And believe me – this method looks awful. Conclusion The number of IL instructions is useful metric when optimizing your code. For analyzing code at general level to find out too long methods you can use the number of LoC metric because it is more intuitive for you and you can therefore handle the situation more easily. Also you can use NDepend as code metrics tool because it has a lot of metrics to offer.

Read the article
F# and the rose-tinted reflection

- by CliveT

We're already seeing increasing use of many cores on client desktops. It is a change that has been long predicted. It is not just a change in architecture, but our notions of efficiency in a program. No longer can we focus on the asymptotic complexity of an algorithm by counting the steps that a single core processor would take to execute it. Instead we'll soon be more concerned about the scalability of the algorithm and how well we can increase the performance as we increase the number of cores. This may even lead us to throw away our most efficient algorithms, and switch to less efficient algorithms that scale better. We might even be willing to waste cycles in order to speculatively execute at the algorithm rather than the hardware level. State is the big headache in this parallel world. At the hardware level, main memory doesn't necessarily contain the definitive value corresponding to a particular address. An update to a location might still be held in a CPU's local cache and it might be some time before the value gets propagated. To get the latest value, and the notion of "latest" takes a lot of defining in this world of rapidly mutating state, the CPUs may well need to communicate to decide who has the definitive value of a particular address in order to avoid lost updates. At the user program level, this means programmers will need to lock objects before modifying them, or attempt to avoid the overhead of locking by understanding the memory models at a very deep level. I think it's this need to avoid statefulness that has led to the recent resurgence of interest in functional languages. In the 1980s, functional languages started getting traction when research was carried out into how programs in such languages could be auto-parallelised. Sadly, the impracticality of some of the languages, the overheads of communication during this parallel execution, and rapid improvements in compiler technology on stock hardware meant that the functional languages fell by the wayside. The one thing that these languages were good at was getting rid of implicit state, and this single idea seems like a solution to the problems we are going to face in the coming years. Whether these languages will catch on is hard to predict. The mindset for writing a program in a functional language is really very different from the way that object-oriented problem decomposition happens - one has to focus on the verbs instead of the nouns, which takes some getting used to. There are a number of hybrid functional/object languages that have been becoming more popular in recent times. These half-way houses make it easy to use functional ideas for some parts of the program while still allowing access to the underlying object-focused platform without a great deal of impedance mismatch. One example is F# running on the CLR which, in Visual Studio 2010, has because a first class member of the pack. Inside Visual Studio 2010, the tooling for F# has improved to the point where it is easy to set breakpoints and watch values change while debugging at the source level. In my opinion, it is the tooling support that will enable the widespread adoption of functional languages - without this support, people will put off any transition into the functional world for as long as they possibly can. Without tool support it will make it hard to learn these languages. One tool that doesn't currently support F# is Reflector. The idea of decompiling IL to a functional language is daunting, but F# is potentially so important I couldn't dismiss the idea. As I'm currently developing Reflector 6.5, I thought it wise to take four days just to see how far I could get in doing so, even if it achieved little more than to be clearer on how much was possible, and how long it might take. You can read what happened here, and of the insights it gave us on ways to improve the tool.

Read the article
Speeding up procedural texture generation

- by FalconNL

Recently I've begun working on a game that takes place in a procedurally generated solar system. After a bit of a learning curve (having neither worked with Scala, OpenGL 2 ES or Libgdx before), I have a basic tech demo going where you spin around a single procedurally textured planet: The problem I'm running into is the performance of the texture generation. A quick overview of what I'm doing: a planet is a cube that has been deformed to a sphere. To each side, a n x n (e.g. 256 x 256) texture is applied, which are bundled in one 8n x n texture that is sent to the fragment shader. The last two spaces are not used, they're only there to make sure the width is a power of 2. The texture is currently generated on the CPU, using the updated 2012 version of the simplex noise algorithm linked to in the paper 'Simplex noise demystified'. The scene I'm using to test the algorithm contains two spheres: the planet and the background. Both use a greyscale texture consisting of six octaves of 3D simplex noise, so for example if we choose 128x128 as the texture size there are 128 x 128 x 6 x 2 x 6 = about 1.2 million calls to the noise function. The closest you will get to the planet is about what's shown in the screenshot and since the game's target resolution is 1280x720 that means I'd prefer to use 512x512 textures. Combine that with the fact the actual textures will of course be more complicated than basic noise (There will be a day and night texture, blended in the fragment shader based on sunlight, and a specular mask. I need noise for continents, terrain color variation, clouds, city lights, etc.) and we're looking at something like 512 x 512 x 6 x 3 x 15 = 70 million noise calls for the planet alone. In the final game, there will be activities when traveling between planets, so a wait of 5 or 10 seconds, possibly 20, would be acceptable since I can calculate the texture in the background while traveling, though obviously the faster the better. Getting back to our test scene, performance on my PC isn't too terrible, though still too slow considering the final result is going to be about 60 times worse: 128x128 : 0.1s 256x256 : 0.4s 512x512 : 1.7s This is after I moved all performance-critical code to Java, since trying to do so in Scala was a lot worse. Running this on my phone (a Samsung Galaxy S3), however, produces a more problematic result: 128x128 : 2s 256x256 : 7s 512x512 : 29s Already far too long, and that's not even factoring in the fact that it'll be minutes instead of seconds in the final version. Clearly something needs to be done. Personally, I see a few potential avenues, though I'm not particularly keen on any of them yet: Don't precalculate the textures, but let the fragment shader calculate everything. Probably not feasible, because at one point I had the background as a fullscreen quad with a pixel shader and I got about 1 fps on my phone. Use the GPU to render the texture once, store it and use the stored texture from then on. Upside: might be faster than doing it on the CPU since the GPU is supposed to be faster at floating point calculations. Downside: effects that cannot (easily) be expressed as functions of simplex noise (e.g. gas planet vortices, moon craters, etc.) are a lot more difficult to code in GLSL than in Scala/Java. Calculate a large amount of noise textures and ship them with the application. I'd like to avoid this if at all possible. Lower the resolution. Buys me a 4x performance gain, which isn't really enough plus I lose a lot of quality. Find a faster noise algorithm. If anyone has one I'm all ears, but simplex is already supposed to be faster than perlin. Adopt a pixel art style, allowing for lower resolution textures and fewer noise octaves. While I originally envisioned the game in this style, I've come to prefer the realistic approach. I'm doing something wrong and the performance should already be one or two orders of magnitude better. If this is the case, please let me know. If anyone has any suggestions, tips, workarounds, or other comments regarding this problem I'd love to hear them.

Read the article
Skewed: a rotating camera in a simple CPU-based voxel raycaster/raytracer

- by voxelizr

TL;DR -- in my first simple software voxel raycaster, I cannot get camera rotations to work, seemingly correct matrices notwithstanding. The result is skewed: like a flat rendering, correctly rotated, however distorted and without depth. (While axis-aligned ie. unrotated, depth and parallax are as expected.) I'm trying to write a simple voxel raycaster as a learning exercise. This is purely CPU based for now until I figure out how things work exactly -- fow now, OpenGL is just (ab)used to blit the generated bitmap to the screen as often as possible. Now I have gotten to the point where a perspective-projection camera can move through the world and I can render (mostly, minus some artifacts that need investigation) perspective-correct 3-dimensional views of the "world", which is basically empty but contains a voxel cube of the Stanford Bunny. So I have a camera that I can move up and down, strafe left and right and "walk forward/backward" -- all axis-aligned so far, no camera rotations. Herein lies my problem. Screenshot #1: correct depth when the camera is still strictly axis-aligned, ie. un-rotated. Now I have for a few days been trying to get rotation to work. The basic logic and theory behind matrices and 3D rotations, in theory, is very clear to me. Yet I have only ever achieved a "2.5 rendering" when the camera rotates... fish-eyey, bit like in Google Streetview: even though I have a volumetric world representation, it seems --no matter what I try-- like I would first create a rendering from the "front view", then rotate that flat rendering according to camera rotation. Needless to say, I'm by now aware that rotating rays is not particularly necessary and error-prone. Still, in my most recent setup, with the most simplified raycast ray-position-and-direction algorithm possible, my rotation still produces the same fish-eyey flat-render-rotated style looks: Screenshot #2: camera "rotated to the right by 39 degrees" -- note how the blue-shaded left-hand side of the cube from screen #2 is not visible in this rotation, yet by now "it really should"! Now of course I'm aware of this: in a simple axis-aligned-no-rotation-setup like I had in the beginning, the ray simply traverses in small steps the positive z-direction, diverging to the left or right and top or bottom only depending on pixel position and projection matrix. As I "rotate the camera to the right or left" -- ie I rotate it around the Y-axis -- those very steps should be simply transformed by the proper rotation matrix, right? So for forward-traversal the Z-step gets a bit smaller the more the cam rotates, offset by an "increase" in the X-step. Yet for the pixel-position-based horizontal+vertical-divergence, increasing fractions of the x-step need to be "added" to the z-step. Somehow, none of my many matrices that I experimented with, nor my experiments with matrix-less hardcoded verbose sin/cos calculations really get this part right. Here's my basic per-ray pre-traversal algorithm -- syntax in Go, but take it as pseudocode: fx and fy: pixel positions x and y rayPos: vec3 for the ray starting position in world-space (calculated as below) rayDir: vec3 for the xyz-steps to be added to rayPos in each step during ray traversal rayStep: a temporary vec3 camPos: vec3 for the camera position in world space camRad: vec3 for camera rotation in radians pmat: typical perspective projection matrix The algorithm / pseudocode: // 1: rayPos is for now "this pixel, as a vector on the view plane in 3d, at The Origin" rayPos.X, rayPos.Y, rayPos.Z = ((fx / width) - 0.5), ((fy / height) - 0.5), 0 // 2: rotate around Y axis depending on cam rotation. No prob since view plane still at Origin 0,0,0 rayPos.MultMat(num.NewDmat4RotationY(camRad.Y)) // 3: a temp vec3. planeDist is -0.15 or some such -- fov-based dist of view plane from eye and also the non-normalized, "in axis-aligned world" traversal step size "forward into the screen" rayStep.X, rayStep.Y, rayStep.Z = 0, 0, planeDist // 4: rotate this too -- 0,zstep should become some meaningful xzstep,xzstep rayStep.MultMat(num.NewDmat4RotationY(CamRad.Y)) // set up direction vector from still-origin-based-ray-position-off-rotated-view-plane plus rotated-zstep-vector rayDir.X, rayDir.Y, rayDir.Z = -rayPos.X - me.rayStep.X, -rayPos.Y, rayPos.Z + rayStep.Z // perspective projection rayDir.Normalize() rayDir.MultMat(pmat) // before traversal, the ray starting position has to be transformed from origin-relative to campos-relative rayPos.Add(camPos) I'm skipping the traversal and sampling parts -- as per screens #1 through #3, those are "basically mostly correct" (though not pretty) -- when axis-aligned / unrotated.

Read the article
Fastest pathfinding for static node matrix

- by Sean Martin

I'm programming a route finding routine in VB.NET for an online game I play, and I'm searching for the fastest route finding algorithm for my map type. The game takes place in space, with thousands of solar systems connected by jump gates. The game devs have provided a DB dump containing a list of every system and the systems it can jump to. The map isn't quite a node tree, since some branches can jump to other branches - more of a matrix. What I need is a fast pathfinding algorithm. I have already implemented an A* routine and a Dijkstra's, both find the best path but are too slow for my purposes - a search that considers about 5000 nodes takes over 20 seconds to compute. A similar program on a website can do the same search in less than a second. This website claims to use D*, which I have looked into. That algorithm seems more appropriate for dynamic maps rather than one that does not change - unless I misunderstand it's premise. So is there something faster I can use for a map that is not your typical tile/polygon base? GBFS? Perhaps a DFS? Or have I likely got some problem with my A* - maybe poorly chosen heuristics or movement cost? Currently my movement cost is the length of the jump (the DB dump has solar system coordinates as well), and the heuristic is a quick euclidean calculation from the node to the goal. In case anyone has some optimizations for my A*, here is the routine that consumes about 60% of my processing time, according to my profiler. The coordinateData table contains a list of every system's coordinates, and neighborNode.distance is the distance of the jump. Private Function findDistance(ByVal startSystem As Integer, ByVal endSystem As Integer) As Integer 'hCount += 1 'If hCount Mod 0 = 0 Then 'Return hCache 'End If 'Initialize variables to be filled Dim x1, x2, y1, y2, z1, z2 As Integer 'LINQ queries for solar system data Dim systemFromData = From result In jumpDataDB.coordinateDatas Where result.systemId = startSystem Select result.x, result.y, result.z Dim systemToData = From result In jumpDataDB.coordinateDatas Where result.systemId = endSystem Select result.x, result.y, result.z 'LINQ execute 'Fill variables with solar system data for from and to system For Each solarSystem In systemFromData x1 = (solarSystem.x) y1 = (solarSystem.y) z1 = (solarSystem.z) Next For Each solarSystem In systemToData x2 = (solarSystem.x) y2 = (solarSystem.y) z2 = (solarSystem.z) Next Dim x3 = Math.Abs(x1 - x2) Dim y3 = Math.Abs(y1 - y2) Dim z3 = Math.Abs(z1 - z2) 'Calculate distance and round 'Dim distance = Math.Round(Math.Sqrt(Math.Abs((x1 - x2) ^ 2) + Math.Abs((y1 - y2) ^ 2) + Math.Abs((z1 - z2) ^ 2))) Dim distance = firstConstant * Math.Min(secondConstant * (x3 + y3 + z3), Math.Max(x3, Math.Max(y3, z3))) 'Dim distance = Math.Abs(x1 - x2) + Math.Abs(z1 - z2) + Math.Abs(y1 - y2) 'hCache = distance Return distance End Function And the main loop, the other 30% 'Begin search While openList.Count() != 0 'Set current system and move node to closed currentNode = lowestF() move(currentNode.id) For Each neighborNode In neighborNodes If Not onList(neighborNode.toSystem, 0) Then If Not onList(neighborNode.toSystem, 1) Then Dim newNode As New nodeData() newNode.id = neighborNode.toSystem newNode.parent = currentNode.id newNode.g = currentNode.g + neighborNode.distance newNode.h = findDistance(newNode.id, endSystem) newNode.f = newNode.g + newNode.h newNode.security = neighborNode.security openList.Add(newNode) shortOpenList(OLindex) = newNode.id OLindex += 1 Else Dim proposedG As Integer = currentNode.g + neighborNode.distance If proposedG < gValue(neighborNode.toSystem) Then changeParent(neighborNode.toSystem, currentNode.id, proposedG) End If End If End If Next 'Check to see if done If currentNode.id = endSystem Then Exit While End If End While If clarification is needed on my spaghetti code, I'll try to explain.

Read the article
ORA-7445 Troubleshooting

- by [email protected]

QUICKLINK: Note 153788.1 ORA-600/ORA-7445 Lookup tool Note 1082674.1 : A Video To Demonstrate The Usage Of The ORA-600/ORA-7445 Lookup Tool [Video] Have you observed an ORA-07445 error reported in your alert log? While the ORA-600 error is "captured" as a handled exception in the Oracle source code, the ORA-7445 is an unhandled exception error due to an OS exception which should result in the creation of a core file. An ORA-7445 is a generic error, and can occur from anywhere in the Oracle code. The precise location of the error is identified by the core file and/or trace file it produces. Looking for the best way to diagnose? Whenever an ORA-7445 error is raised a core file is generated. There may be a trace file generated with the error as well. Prior to 11g, the core files are located in the CORE_DUMP_DEST directory. Starting with 11g, there is a new advanced fault diagnosability infrastructure to manage trace data. Diagnostic files are written into a root directory for all diagnostic data called the ADR home. Core files at 11g will go to the ADR HOME/cdump directory. For more information on the Oracle 11g Diagnosability feature see Note 453125.1 11g Diagnosability Frequently Asked Questions Note 443529.1 11g Quick Steps to Package and Send Critical Error Diagnostic Information to Support[Video] NOTE: While the core file is captured in the Diagnosability infrastructure, the file may not be included with a diagnostic package.1. Check the Alert LogThe alert log may indicate additional errors or other internal errors at the time of the problem. In some cases, the ORA-7445 error will occur along with ORA-600, ORA-3113, ORA-4030 errors. The ORA-7445 error can be side effects of the other problems and you should review the first error and associated core file or trace file and work down the list of errors. Note 1020463.6 DIAGNOSING ORA-3113 ERRORS Note 1812.1 TECH: Getting a Stack Trace from a CORE file Note 414966.1 RDA Documentation Index If the ORA-7445 errors are not associated with other error conditions, ensure the trace data is not truncated. If you see a message at the end of the file "MAX DUMP FILE SIZE EXCEEDED" the MAX_DUMP_FILE_SIZE parameter is not setup high enough or to 'unlimited'. There could be vital diagnostic information missing in the file and discovering the root issue may be very difficult. Set the MAX_DUMP_FILE_SIZE appropriately and regenerate the error for complete trace information. For pointers on deeper analysis of these errors see Note 390293.1 Introduction to 600/7445 Internal Error Analysis Note 211909.1 Customer Introduction to ORA-7445 Errors 2. Search 600/7445 Lookup Tool Visit My Oracle Support to access the ORA-00600 Lookup tool (Note 153788.1). The ORA-600/ORA-7445 Lookup tool may lead you to applicable content in My Oracle Support on the problem and can be used to investigate the problem with argument data from the error message or you can pull out key stack pointers from the associated trace file to match up against known bugs.3. "Fine tune" searches in Knowledge Base As the ORA-7445 error indicates an unhandled exception in the Oracle source code, your search in the Oracle Knowledge Base will need to focus on the stack data from the core file or the trace file. Keep in mind that searches on generic argument data will bring back a large result set. The more you can learn about the environment and code leading to the errors, the easier it will be to narrow the hit list to match your problem. Note 153788.1 ORA-600/ORA-7445 TroubleshooterNote 1082674.1 A Video To Demonstrate The Usage Of The ORA-600/ORA-7445 Lookup Tool [Video] NOTE: If no trace file is captured, see Note 1812.1 TECH: Getting a Stack Trace from a CORE file. Core files are managed through 11g Diagnosability, but are not packaged with other diagnostic data automatically. The core files can be quite large, but may be useful during analysis within Oracle Support.4. If assistance is required from Oracle Should it become necessary to get assistance from Oracle Support on an ORA-7445 problem, please provide at a minimum, the Alert log Associated tracefile(s) or incident package at 11g Patch level information Core file(s) Information about changes in configuration and/or application prior to issues If error is reproducible, a self-contained reproducible testcase: Note.232963.1 How to Build a Testcase for Oracle Data Server Support to Reproduce ORA-600 and ORA-7445 Errors RDA report or Oracle Configuration Manager information Oracle Configuration Manager Quick Start Guide Note 548815.1 My Oracle Support Configuration Management FAQ Note 414966.1 RDA Documentation Index ***For reference to the content in this blog, refer to Note.1092832.1 Master Note for Diagnosing ORA-600

Read the article
Oracle WebCenter at the Enterprise 2.0 Conference

- by Brian Dirking

We had a great week at the E20 Conference, presenting in four sessions – Andy MacMillan gave a session titled Today’s Successful Enterprises are Social Enterprises and was on a panel that Tony Byrne moderated; Christian Finn spoke on a panel on Unified Communications Unified Communications + Social Computing = Best of Both Worlds?, Mark Bennett spoke on a panel on The Evolution of Talent Management. The key areas of focus this year were sentiment analysis, adoption and community building, the benefits of failure, and social’s role in process applications. Sentiment analysis. This was focused not on external audiences but more on employee sentiment. Tim Young showed his internal "NikoNiko" project, where employees use smilies to report their current mood. The result was a dashboard that showed the company mood by department. Since the goal is to improve productivity, people can see which departments are running into issues and try and address them. A company might otherwise wait until the end of the quarter financials to find out that there was a problem and product didn’t ship. This is a way to identify issues immediately. Tim is great – he had the crowd laughing as soon as he hit the stage, with his proposed hastag for his session: by making it 138 characters long, people couldn’t say much behind his back. And as I tweeted during his session, I loved his comment that complexity diffuses energy - it sounds like something Sun Tzu would say. Another example of employee sentiment analysis was CubeVibe. Founder and CEO Aaron Aycock, in his 3 minute pitch or die session talked about how engaged employees perform better. It was too bad he got gonged, he was just picking up speed, but CubeVibe did win the vote – congratulations to them. Internal adoption, community building, and involvement. On this topic I spoke to Terri Griffith, and she said there is some good work going on at University of Indiana regarding this, and hinted that she might be blogging about it in the near future. This area holds lots of interest for me. Amongst our customers, - CPAC stands out as an organization that has successfully built a community. So, I wonder - what are the building blocks? A strong leader? A common or unifying purpose? A certain level of engagement? I imagine someone has created an equation that says “for a community to grow at 30% per month, there must be an engagement level x to the square root of y, where x equals current community size, and y equals the expected growth rate, and the result is how many engagements the average user must contribute to maintain that growth.” Does anyone have a framework like that? The net result of everyone’s experience is that there is nothing to do but start early and fail often. Kevin Jones made this the focus of his keynote. He talked about the types of failure and what they mean. And he showed his famous kids at work video: Kevin’s blog also has this post: Social Business Failure #8: Workflow Integration. This is something that we’ve been working on at Oracle. Since so much of business is based in enterprise applications such as ERP and CRM (and since Oracle offers e-Business Suite, Siebel, PeopleSoft, and JD Edwards, as well as Fusion Applications), it makes sense that the social capabilities of Oracle WebCenter is built right into these applications. There are two types of social collaboration – ad-hoc, and exception handling. When you are in a business process and encounter an exception, you immediately look for 1) the document that tells you how to handle it, or 2) the person who can tell you how to handle it. With WebCenter built into these processes, people either search their content management system, or engage in expertise location and conversation. The great thing is, THEY DON’T HAVE TO LEAVE THE APPLICATION TO DO IT. Oracle has built the social capabilities right into the applications and business processes. I don’t think enough folks were able to see that at the event, but I expect that over the next six months folks will become very aware of it. WebCenter also provides the ability to have ad-hoc collaboration, search, and expertise location that folks need when they are innovating or collaborating. We demonstrated Oracle Social Network. It’s built on our Oracle WebCenter product to provide social collaboration inside and outside of your company. When we showed it to people, there were a number of areas that they commented on that were different from the other products being shown at the conference: Screenshots from within the product Many authors working on documents simultaneously Flagging people for follow up Direct ability to call out to people Ability to see presence not just if someone is online, but which conversation they are actively in Great stuff, the conference was full of smart people that that we enjoy spending time with. We’ll keep up in the meantime, but we look forward to seeing you in Boston.

Read the article
Goto for the Java Programming Language

- by darcy

Work on JDK 8 is well-underway, but we thought this late-breaking JEP for another language change for the platform couldn't wait another day before being published. Title: Goto for the Java Programming Language Author: Joseph D. Darcy Organization: Oracle. Created: 2012/04/01 Type: Feature State: Funded Exposure: Open Component: core/lang Scope: SE JSR: 901 MR Discussion: compiler dash dev at openjdk dot java dot net Start: 2012/Q2 Effort: XS Duration: S Template: 1.0 Reviewed-by: Duke Endorsed-by: Edsger Dijkstra Funded-by: Blue Sun Corporation Summary Provide the benefits of the time-testing goto control structure to Java programs. The Java language has a history of adding new control structures over time, the assert statement in 1.4, the enhanced for-loop in 1.5,and try-with-resources in 7. Having support for goto is long-overdue and simple to implement since the JVM already has goto instructions. Success Metrics The goto statement will allow inefficient and verbose recursive algorithms and explicit loops to be replaced with more compact code. The effort will be a success if at least twenty five percent of the JDK's explicit loops are replaced with goto's. Coordination with IDE vendors is expected to help facilitate this goal. Motivation The goto construct offers numerous benefits to the Java platform, from increased expressiveness, to more compact code, to providing new programming paradigms to appeal to a broader demographic. In JDK 8, there is a renewed focus on using the Java platform on embedded devices with more modest resources than desktop or server environments. In such contexts, static and dynamic memory footprint is a concern. One significant component of footprint is the code attribute of class files and certain classes of important algorithms can be expressed more compactly using goto than using other constructs, saving footprint. For example, to implement state machines recursively, some parties have asked for the JVM to support tail calls, that is, to perform a complex transformation with security implications to turn a method call into a goto. Such complicated machinery should not be assumed for an embedded context. A better solution is just to expose to the programmer the desired functionality, goto. The web has familiarized users with a model of traversing links among different HTML pages in a free-form fashion with some state being maintained on the side, such as login credentials, to effect behavior. This is exactly the programming model of goto and code. While in the past this has been derided as leading to "spaghetti code," spaghetti is a tasty and nutritious meal for programmers, unlike quiche. The invokedynamic instruction added by JSR 292 exposes the JVM's linkage operation to programmers. This is a low-level operation that can be leveraged by sophisticated programmers. Likewise, goto is a also a low-level operation that should not be hidden from programmers who can use more efficient idioms. Some may object that goto was consciously excluded from the original design of Java as one of the removed feature from C and C++. However, the designers of the Java programming languages have revisited these removals before. The enum construct was also left out only to be added in JDK 5 and multiple inheritance was left out, only to be added back by the virtual extension method methods of Project Lambda. As a living language, the needs of the growing Java community today should be used to judge what features are needed in the platform tomorrow; the language should not be forever bound by the decisions of the past. Description From its initial version, the JVM has had two instructions for unconditional transfer of control within a method, goto (0xa7) and goto_w (0xc8). The goto_w instruction is used for larger jumps. All versions of the Java language have supported labeled statements; however, only the break and continue statements were able to specify a particular label as a target with the onerous restriction that the label must be lexically enclosing. The grammar addition for the goto statement is: GotoStatement: goto Identifier ; The new goto statement similar to break except that the target label can be anywhere inside the method and the identifier is mandatory. The compiler simply translates the goto statement into one of the JVM goto instructions targeting the right offset in the method. Therefore, adding the goto statement to the platform is only a small effort since existing compiler and JVM functionality is reused. Other language changes to support goto include obvious updates to definite assignment analysis, reachability analysis, and exception analysis. Possible future extensions include a computed goto as found in gcc, which would replace the identifier in the goto statement with an expression having the type of a label. Testing Since goto will be implemented using largely existing facilities, only light levels of testing are needed. Impact Compatibility: Since goto is already a keyword, there are no source compatibility implications. Performance/scalability: Performance will improve with more compact code. JVMs already need to handle irreducible flow graphs since goto is a VM instruction.

Read the article
How to prepare for a programming competition? Graphs, Stacks, Trees, oh my! [closed]

- by Simucal

Last semester I attended ACM's (Association for Computing Machinery) bi-annual programming competition at a local University. My University sent 2 teams of 3 people and we competed amongst other schools in the mid-west. We got our butts kicked. You are given a packet with about 11 problems (1 problem per page) and you have 4 hours to solve as many as you can. They'll run your program you submit against a set of data and your output must match theirs exactly. In fact, the judging is automated for the most part. In any case.. I went there fairly confident in my programming skills and I left there feeling drained and weak. It was a terribly humbling experience. In 4 hours my team of 3 people completed only one of the problems. The top team completed 4 of them and took 1st place. The problems they asked were like no problems I have ever had to answer before. I later learned that in order to solve them some of them effectively you have to use graphs/graph algorithms, trees, stacks. Some of them were simply "greedy" algo's. My question is, how can I better prepare for this semesters programming competition so I don't leave there feeling like a complete moron? What tips do you have for me to be able to answer these problems that involve graphs, trees, various "well known" algorithms? How can I easily identify the algorithm we should implement for a given problem? I have yet to take Algorithm Design in school so I just feel a little out of my element. Here are some examples of the questions asked at the competitions: ACM Problem Sets Update: Just wanted to update this since the latest competition is over. My team placed 1st for our small region (about 6-7 universities with between 1-5 teams each school) and ~15th for the midwest! So, it is a marked improvement over last years performance for sure. We also had no graduate students on our team and after reviewing the rules we found out that many teams had several! So, that would be a pretty big advantage in my own opinion. Problems this semester ranged from about 1-2 "easy" problems (ie bit manipulation, string manipulation) to hard (graph problems involving fairly complex math and network flow problems). We were able to solve 4 problems in our 5 hours. Just wanted to thank everyone for the resources they provided here, we used them for our weekly team practices and it definitely helped! Some quick tips that I have that aren't suggested below: When you are seated at your computer before the competition starts, quickly type out various data structures that you might need that you won't have access to in your languages libraries. I typed out a Graph data-structure complete with floyd-warshall and dijkstra's algorithm before the competition began. We ended up using it in our 2nd problem that we solved and this is the main reason why we solved this problem before anyone else in the midwest. We had it ready to go from the beginning. Similarly, type out the code to read in a file since this will be required for every problem. Save this answer "template" someplace so you can quickly copy/paste it to your IDE at the beginning of each problem. There are no rules on programming anything before the competition starts so get any boilerplate code out the way. We found it useful to have one person who is on permanent whiteboard duty. This is usually the person who is best at math and at working out solutions to get a head start on future problems you will be doing. One person is on permanent programming duty. Your fastest/most skilled "programmer" (most familiar with the language). This will save debugging time also. The last person has several roles between assessing the packet of problems for the next "easiest" problem, helping the person on the whiteboard work out solutions and helping the person programming work out bugs/issues. This person needs to be flexible and be able to switch between roles easily.

Read the article
First Foray–About timeout

- by SQLMonger

It has been quite a while since I signed up for this blog site and high time that something was posted. I have a list of topics that I will be working through and posting. Some I am sure will have been posted by others, but I will be sticking to the technical problems and challenges that I’ve recently faced, and the solutions that worked for me. My motto when learning something new has always been “My kingdom for an example!”, and I plan on delivering useful examples here so others can learn from my efforts, failures and successes. A bit of background about me… My name is Clayton Groom. I am a founding partner of a consulting firm in St. Louis Missouri, Covenant Technology Partners, LLC and focus on SQL Server Data Warehouse design, Analysis Services and Enterprise Reporting solutions. I have been working with SQL Server since the early nineties, when it still only ran on OS/2. I love solving puzzles and technical challenges. Enough about me… On to a real problem… SSIS Connection Time outs versus Command Time outs Last week, I was working on automating the processing for a large Analysis Services cube. I had reworked an SSIS package and script task originally posted by Vidas Matelis that automates the process of adding new and dropping old partitions to/from an Analysis Services cube. I had the package working great, tested, and ready for deployment. It basically performs a query against the source system to determine if there is new data in the warehouse that will require a new partition to be added to the cube, and it checks the cube to see if there are any partitions that are present that are no longer needed in a rolling 60 month window. My client uses Tivoli for running all their production jobs, and not SQL Agent, so I had to build a command line file for Tivoli to use to run the package. Everything was going great. I had tested the command file from my development workstation using an XML configuration file to pass in server-specific parameters into the package when executed using the DTExec utility. With all the pieces ready, I updated the dtsconfig file to point to the UAT environment and started working with the Tivoli developer to test the job. On the first run, the job failed, and from what I could see in the SSIS log, it had failed because of a timeout. Other errors in the log made me think that perhaps the connection string had not been passed into the package correctly. We bumped the Connection Manager timeout values from 20 seconds to 120 seconds and tried again. The job still failed. After changing the command line to use the /SET option instead of the /CONFIGFILE option, we tested again, and again failure. After a number more failed attempts, and getting the Teradata DBA involved to monitor and see if we were connecting and failing or just failing to connect, we determined that the job was indeed connecting to the server and then disconnecting itself after 30 seconds. This seemed odd, as we had the timeout values for the connection manager set to 180 seconds by then. At this point one of the DBA’s found a post on the Teradata forum that had the clues to the puzzle: There is a separate “CommandTimeout” custom property on the Data source object that may needed to be adjusted for longer running queries. I opened up the SSIS package, opened the data flow task that generated the partition list table and right-clicked on the data source. from the context menu, I selected “Show Advanced Editor” and found the property. Sure enough, it was set to 30 seconds. The CommandTimeout property can also be edited in the SSIS Properties sheet. In order to determine how long the timeout needed to be, I ran the query from the task in the development environment and received a response in a matter of seconds. I then tried the same query against the production database and waited several minutes for a response. This did not seem to be a reasonable response time for the query involved, and indeed it wasn’t. The Teradata DBA’s adjusted the query governor settings for the service account I was testing with, and we were able to get the response back down under a minute. Still, I set the CommandTimeout property to a much higher value in case the job was ever started during a time of high-demand on the production server. With this change in place, the job finally completed successfully. The lesson learned for me was two-fold: Always compare query execution times between development and production environments, and don’t assume that production will always be faster. With higher user demands, query governors, and a whole lot more data, the execution time of even what might seem to be simple queries can vary greatly. SSIS Connection time out settings do not affect command time outs. Connection timeouts control how long the package will wait for a response from the server before assuming the server is not available or is not responding. Command time outs control how long a task will wait for results to start being returned before deciding that the server is not responding. Both lessons seem pretty straight forward, and I felt pretty sheepish once I finally figured out what the issue was. To be fair though, In the 5+ years that I have been working with SSIS, I could only recall one other time where I had to set the CommandTimeout property, and that memory only resurfaced while I was penning this post.

Read the article
Clever memory usage through the years

- by Ben Emmett

A friend and I were recently talking about the really clever tricks people have used to get the most out of memory. I thought I’d share my favorites, and would love to hear yours too! Interleaving on drum memory Back in the ye olde days before I’d been born (we’re talking the 50s / 60s here), working memory commonly took the form of rotating magnetic drums. These would spin at a constant speed, and a fixed head would read from memory when the correct part of the drum passed it by, a bit like a primitive platter disk. Because each revolution took a few milliseconds, programmers took to manually arranging information non-sequentially on the drum, timing when an instruction or memory address would need to be accessed, then spacing information accordingly around the edge of the drum, thus reducing the access delay. Similar techniques were still used on hard disks and floppy disks into the 90s, but have become irrelevant with modern disk technologies. The Hashlife algorithm Conway’s Game of Life has attracted numerous implementations over the years, but Bill Gosper’s Hashlife algorithm is particularly impressive. Taking advantage of the repetitive nature of many cellular automata, it uses a quadtree structure to store the hashes of pieces of the overall grid. Over time there are fewer and fewer new structures which need to be evaluated, so it starts to run faster with larger grids, drastically outperforming other algorithms both in terms of speed and the size of grid which can be simulated. The actual amount of memory used is huge, but it’s used in a clever way, so makes the list . Elite’s procedural generation Ok, so this isn’t exactly a memory optimization – more a storage optimization – but it gets an honorable mention anyway. When writing Elite, David Braben and Ian Bell wanted to build a rich world which gamers could explore, but their 22K memory was something of a limitation (for comparison that’s about the size of my avatar picture at the top of this page). They procedurally generated all the characteristics of the 2048 planets in their virtual universe, including the names, which were stitched together using a lookup table of parts of names. In fact the original plans were for 2^52 planets, but it was decided that that was probably too many. Oh, and they did that all in assembly language. Other games of the time used similar techniques too – The Sentinel’s landscape generation algorithm being another example. Modern Garbage Collectors Garbage collection in managed languages like Java and .NET ensures that most of the time, developers stop needing to care about how they use and clean up memory as the garbage collector handles it automatically. Achieving this without killing performance is a near-miraculous feet of software engineering. Much like when learning chemistry, you find that every time you think you understand how the garbage collector works, it turns out to be a mere simplification; that there are yet more complexities and heuristics to help it run efficiently. Of course introducing memory problems is still possible (and there are tools like our memory profiler to help if that happens to you) but they’re much, much rarer. A cautionary note In the examples above, there were good and well understood reasons for the optimizations, but cunningly optimized code has usually had to trade away readability and maintainability to achieve its gains. Trying to optimize memory usage without being pretty confident that there’s actually a problem is doing it wrong. So what have I missed? Tell me about the ingenious (or stupid) tricks you’ve seen people use. Ben

Read the article
Observing flow control idle time in TCP

- by user12820842

Previously I described how to observe congestion control strategies during transmission, and here I talked about TCP's sliding window approach for handling flow control on the receive side. A neat trick would now be to put the pieces together and ask the following question - how often is TCP transmission blocked by congestion control (send-side flow control) versus a zero-sized send window (which is the receiver saying it cannot process any more data)? So in effect we are asking whether the size of the receive window of the peer or the congestion control strategy may be sub-optimal. The result of such a problem would be that we have TCP data that we could be transmitting but we are not, potentially effecting throughput. So flow control is in effect: when the congestion window is less than or equal to the amount of bytes outstanding on the connection. We can derive this from args[3]-tcps_snxt - args[3]-tcps_suna, i.e. the difference between the next sequence number to send and the lowest unacknowledged sequence number; and when the window in the TCP segment received is advertised as 0 We time from these events until we send new data (i.e. args[4]-tcp_seq = snxt value when window closes. Here's the script: #!/usr/sbin/dtrace -s #pragma D option quiet tcp:::send / (args[3]-tcps_snxt - args[3]-tcps_suna) = args[3]-tcps_cwnd / { cwndclosed[args[1]-cs_cid] = timestamp; cwndsnxt[args[1]-cs_cid] = args[3]-tcps_snxt; @numclosed["cwnd", args[2]-ip_daddr, args[4]-tcp_dport] = count(); } tcp:::send / cwndclosed[args[1]-cs_cid] && args[4]-tcp_seq = cwndsnxt[args[1]-cs_cid] / { @meantimeclosed["cwnd", args[2]-ip_daddr, args[4]-tcp_dport] = avg(timestamp - cwndclosed[args[1]-cs_cid]); @stddevtimeclosed["cwnd", args[2]-ip_daddr, args[4]-tcp_dport] = stddev(timestamp - cwndclosed[args[1]-cs_cid]); @numclosed["cwnd", args[2]-ip_daddr, args[4]-tcp_dport] = count(); cwndclosed[args[1]-cs_cid] = 0; cwndsnxt[args[1]-cs_cid] = 0; } tcp:::receive / args[4]-tcp_window == 0 && (args[4]-tcp_flags & (TH_SYN|TH_RST|TH_FIN)) == 0 / { swndclosed[args[1]-cs_cid] = timestamp; swndsnxt[args[1]-cs_cid] = args[3]-tcps_snxt; @numclosed["swnd", args[2]-ip_saddr, args[4]-tcp_dport] = count(); } tcp:::send / swndclosed[args[1]-cs_cid] && args[4]-tcp_seq = swndsnxt[args[1]-cs_cid] / { @meantimeclosed["swnd", args[2]-ip_daddr, args[4]-tcp_sport] = avg(timestamp - swndclosed[args[1]-cs_cid]); @stddevtimeclosed["swnd", args[2]-ip_daddr, args[4]-tcp_sport] = stddev(timestamp - swndclosed[args[1]-cs_cid]); swndclosed[args[1]-cs_cid] = 0; swndsnxt[args[1]-cs_cid] = 0; } END { printf("%-6s %-20s %-8s %-25s %-8s %-8s\n", "Window", "Remote host", "Port", "TCP Avg WndClosed(ns)", "StdDev", "Num"); printa("%-6s %-20s %-8d %@-25d %@-8d %@-8d\n", @meantimeclosed, @stddevtimeclosed, @numclosed); } So this script will show us whether the peer's receive window size is preventing flow ("swnd" events) or whether congestion control is limiting flow ("cwnd" events). As an example I traced on a server with a large file transfer in progress via a webserver and with an active ssh connection running "find / -depth -print". Here is the output: ^C Window Remote host Port TCP Avg WndClosed(ns) StdDev Num cwnd 10.175.96.92 80 86064329 77311705 125 cwnd 10.175.96.92 22 122068522 151039669 81 So we see in this case, the congestion window closes 125 times for port 80 connections and 81 times for ssh. The average time the window is closed is 0.086sec for port 80 and 0.12sec for port 22. So if you wish to change congestion control algorithm in Oracle Solaris 11, a useful step may be to see if congestion really is an issue on your network. Scripts like the one posted above can help assess this, but it's worth reiterating that if congestion control is occuring, that's not necessarily a problem that needs fixing. Recall that congestion control is about controlling flow to prevent large-scale drops, so looking at congestion events in isolation doesn't tell us the whole story. For example, are we seeing more congestion events with one control algorithm, but more drops/retransmission with another? As always, it's best to start with measures of throughput and latency before arriving at a specific hypothesis such as "my congestion control algorithm is sub-optimal".

Read the article
Reading graph inputs for a programming puzzle and then solving it

- by Vrashabh

I just took a programming competition question and I absolutely bombed it. I had trouble right at the beginning itself from reading the input set. The question was basically a variant of this puzzle http://codercharts.com/puzzle/evacuation-plan but also had an hour component in the first line(say 3 hours after start of evacuation). It reads like this This puzzle is a tribute to all the people who suffered from the earthquake in Japan. The goal of this puzzle is, given a network of road and locations, to determine the maximum number of people that can be evacuated. The people must be evacuated from evacuation points to rescue points. The list of road and the number of people they can carry per hour is provided. Input Specifications Your program must accept one and only one command line argument: the input file. The input file is formatted as follows: the first line contains 4 integers n r s t n is the number of locations (each location is given by a number from 0 to n-1) r is the number of roads s is the number of locations to be evacuated from (evacuation points) t is the number of locations where people must be evacuated to (rescue points) the second line contains s integers giving the locations of the evacuation points the third line contains t integers giving the locations of the rescue points the r following lines contain to the road definitions. Each road is defined by 3 integers l1 l2 width where l1 and l2 are the locations connected by the road (roads are one-way) and width is the number of people per hour that can fit on the road Now look at the sample input set 5 5 1 2 3 0 3 4 0 1 10 0 2 5 1 2 4 1 3 5 2 4 10 The 3 in the first line is the additional component and is defined as the number of hours since the resuce has started which is 3 in this case. Now my solution was to use Dijisktras algorithm to find the shortest path between each of the rescue and evac nodes. Now my problem started with how to read the input set. I read the first line in python and stored the values in variables. But then I did not know how to store the values of the distance between the nodes and what DS to use and how to input it to say a standard implementation of dijikstras algorithm. So my question is two fold 1.) How do I take the input of such problems? - I have faced this problem in quite a few competitions recently and I hope I can get a simple code snippet or an explanation in java or python to read the data input set in such a way that I can input it as a graph to graph algorithms like dijikstra and floyd/warshall. Also a solution to the above problem would also help. 2.) How to solve this puzzle? My algorithm was: Find shortest path between evac points (in the above example it is 14 from 0 to 3) Multiply it by number of hours to get maximal number of saves Also the answer given for the variant for the input set was 24 which I dont understand. Can someone explain that also. UPDATE: I get how the answer is 14 in the given problem link - it seems to be just the shortest path between node 0 and 3. But with the 3 hour component how is the answer 24 UPDATE I get how it is 24 - its a complete graph traversal at every hour and this is how I solve it Hour 1 Node 0 to Node 1 - 10 people Node 0 to Node 2- 5 people TotalRescueCount=0 Node 1=10 Node 2= 5 Hour 2 Node 1 to Node 3 = 5(Rescued) Node 2 to Node 4 = 5(Rescued) Node 0 to Node 1 = 10 Node 0 to Node 2 = 5 Node 1 to Node 2 = 4 TotalRescueCount = 10 Node 1 = 10 Node 2= 5+4 = 9 Hour 3 Node 1 to Node 3 = 5(Rescued) Node 2 to Node 4 = 5+4 = 9(Rescued) TotalRescueCount = 9+5+10 = 24 It hard enough for this case , for multiple evac and rescue points how in the world would I write a pgm for this ?

Read the article
Modular Reduction of Polynomials in NTRUEncrypt

- by Neville

Hello everyone. I'm implementing the NTRUEncrypt algorithm, according to an NTRU tutorial, a polynomial f has an inverse g such that f*g=1 mod x, basically the polynomial multiplied by its inverse reduced modulo x gives 1. I get the concept but in an example they provide, a polynomial f = -1 + X + X^2 - X4 + X6 + X9 - X10 which we will represent as the array [-1,1,1,0,-1,0,1,0,0,1,-1] has an inverse g of [1,2,0,2,2,1,0,2,1,2,0], so that when we multiply them and reduce the result modulo 3 we get 1, however when I use the NTRU algorithm for multiplying and reducing them I get -2. Here is my algorithm for multiplying them written in Java: public static int[] PolMulFun(int a[],int b[],int c[],int N,int M) { for(int k=N-1;k>=0;k--) { c[k]=0; int j=k+1; for(int i=N-1;i>=0;i--) { if(j==N) { j=0; } if(a[i]!=0 && b[j]!=0) { c[k]=(c[k]+(a[i]*b[j]))%M; } j=j+1; } } return c; } It basicall taken in polynomial a and multiplies it b, resturns teh result in c, N specifies the degree of the polynomials+1, in teh example above N=11; and M is the reuction modulo, in teh exampel above 3. Why am I getting -2 and not 1?

Read the article
Progress Bar design patterns?

- by shoosh

The application I'm writing performs a length algorithm which usually takes a few minutes to finish. During this time I'd like to show the user a progress bar which indicates how much of the algorithm is done as precisely as possible. The algorithm is divided into several steps, each with its own typical timing. For instance- initialization (500 milli-sec) reading inputs (5 sec) step 1 (30 sec) step 2 (3 minutes) writing outputs (7 sec) shutting down (10 milli-sec) Each step can report its progress quite easily by setting the range its working on, say [0 to 150] and then reporting the value it completed in its main loop. What I currently have set up is a scheme of nested progress monitors which form a sort of implicit tree of progress reporting. All progress monitors inherit from an interface IProgressMonitor: class IProgressMonitor { public: void setRange(int from, int to) = 0; void setValue(int v) = 0; }; The root of the tree is the ProgressMonitor which is connected to the actual GUI interface: class GUIBarProgressMonitor : public IProgressMonitor { GUIBarProgressMonitor(ProgressBarWidget *); }; Any other node in the tree are monitors which take control of a piece of the parent progress: class SubProgressMonitor : public IProgressMonitor { SubProgressMonitor(IProgressMonitor *parent, int parentFrom, int parentLength) ... }; A SubProgressMonitor takes control of the range [parentFrom, parentFrom+parentLength] of its parent. With this scheme I am able to statically divide the top level progress according to the expected relative portion of each step in the global timing. Each step can then be further subdivided into pieces etc' The main disadvantage of this is that the division is static and it gets painful to make changes according to variables which are discovered at run time. So the question: are there any known design patterns for progress monitoring which solve this issue?

Read the article
Updating a Minimum spanning tree when a new edge is inserted

- by Lynette

Hello, I've been presented the following problem in University: Let G = (V, E) be an (undirected) graph with costs ce = 0 on the edges e € E. Assume you are given a minimum-cost spanning tree T in G. Now assume that a new edge is added to G, connecting two nodes v, tv € V with cost c. a) Give an efficient algorithm to test if T remains the minimum-cost spanning tree with the new edge added to G (but not to the tree T). Make your algorithm run in time O(|E|). Can you do it in O(|V|) time? Please note any assumptions you make about what data structure is used to represent the tree T and the graph G. b)Suppose T is no longer the minimum-cost spanning tree. Give a linear-time algorithm (time O(|E|)) to update the tree T to the new minimum-cost spanning tree. This is the solution I found: Let e1=(a,b) the new edge added Find in T the shortest path from a to b (BFS) if e1 is the most expensive edge in the cycle then T remains the MST else T is not the MST It seems to work but i can easily make this run in O(|V|) time, while the problem asks O(|E|) time. Am i missing something? By the way we are authorized to ask for help from anyone so I'm not cheating :D Thanks in advance

Read the article
Python MD5 Hash Faster Calculation

- by balgan

Hi everyone. I will try my best to explain my problem and my line of thought on how I think I can solve it. I use this code for root, dirs, files in os.walk(downloaddir): for infile in files: f = open(os.path.join(root,infile),'rb') filehash = hashlib.md5() while True: data = f.read(10240) if len(data) == 0: break filehash.update(data) print "FILENAME: " , infile print "FILE HASH: " , filehash.hexdigest() and using start = time.time() elapsed = time.time() - start I measure how long it takes to calculate an hash. Pointing my code to a file with 653megs this is the result: root@Mars:/home/tiago# python algorithm-timer.py FILENAME: freebsd.iso FILE HASH: ace0afedfa7c6e0ad12c77b6652b02ab 12.624 root@Mars:/home/tiago# python algorithm-timer.py FILENAME: freebsd.iso FILE HASH: ace0afedfa7c6e0ad12c77b6652b02ab 12.373 root@Mars:/home/tiago# python algorithm-timer.py FILENAME: freebsd.iso FILE HASH: ace0afedfa7c6e0ad12c77b6652b02ab 12.540 Ok now 12 seconds +- on a 653mb file, my problem is I intend to use this code on a program that will run through multiple files, some of them might be 4/5/6Gb and it will take wayy longer to calculate. What am wondering is if there is a faster way for me to calculate the hash of the file? Maybe by doing some multithreading? I used a another script to check the use of the CPU second by second and I see that my code is only using 1 out of my 2 CPUs and only at 25% max, any way I can change this? Thank you all in advance for the given help.

Read the article
Finding the left-most and right-most points of a list. std::find_if the right way to go?

- by Tom

Hi, I have a list of Point objects, (each one with x,y properties) and would like to find the left-most and right-most points. I've been trying to do it with find_if, but i'm not sure its the way to go, because i can't seem to pass a comparator instance. Is find_if the way to go? Seems not. So, is there an algorithm in <algorithm> to achieve this? Thanks in advance. #include <iostream> #include <list> #include <algorithm> using namespace std; typedef struct Point{ float x; float y; } Point; bool left(Point& p1,Point& p2) { return p1.x < p2.x; } int main(){ Point p1 ={-1,0}; Point p2 ={1,0}; Point p3 ={5,0}; Point p4 ={7,0}; list <Point> points; points.push_back(p1); points.push_back(p2); points.push_back(p3); points.push_back(p4); //Should return an interator to p1. find_if(points.begin(),points.end(),left); return 0; }

Read the article

Search Results

Search found 7490 results on 300 pages for 'algorithm analysis'.

Page 147/300 | < Previous Page | 143 144 145 146 147 148 149 150 151 152 153 154 | Next Page >

- by Az

- by Marko Mäkelä

- by MarkPearl

- by John Smith

- by smisner

- by Pinal Dave

- by Programming Noob

- by DigiMortal

- by CliveT

- by FalconNL

- by voxelizr

- by Sean Martin

- by [email protected]

- by Brian Dirking

- by darcy

- by Simucal

- by SQLMonger

- by Ben Emmett

- by user12820842

- by Vrashabh

- by Neville

- by shoosh

- by Lynette

- by balgan

- by Tom

< Previous Page | 143 144 145 146 147 148 149 150 151 152 153 154 | Next Page >