Search Results

Search found 3190 results on 128 pages for 'isapi rewrite'.

Page 104/128 | < Previous Page | 100 101 102 103 104 105 106 107 108 109 110 111  | Next Page >

  • Wordpress Custom Type permalink containing Taxonomy slug

    - by treznik
    I'm trying to create a permalink pattern for a Custom Type, that includes one of its taxonomies. The taxonomy name is known from the start (so I'm not trying to add or mix all of its taxonomies, just a specific one), but the value will by dynamic, of course. Normally, the Custom Type permalink is built using the rewrite arg with the slug param, but I don't see how I could add a dynamic variable in there. http://codex.wordpress.org/Function_Reference/register_post_type I'm guessing a custom solution is required, but I'm not sure what the best unintrusive approach would be. Is there a known practice for this or has anyone built something similar recently? I'm using WP 3.2.1 btw.

    Read the article

  • .net load balancing for server

    - by user1439111
    Some time ago I wrote server software which is currently running at it's max. (3k users average). So I decided to rewrite certain parts so I can run the software at another server to balance it's load. I can't simply start another instance of the server since there is some data which has to be available to all users. So I was thinking of creating a small manager and all the servers connect and send their (relevant)data to the manager. But it also got me thinking about another problem. The manager could also reach it's limits which is exactly what i'm trying to prevent in the future. So I would like to know how I could fix this problem. (I have already tried to optimize critical parts of the software but I can't optimize it forever)

    Read the article

  • Process for Upgrading from RedBean 3.5 to RedBean 4

    - by Jay Haase
    I am currently using RedBean version 3.5. I think I would like to move to the latest version of RedBean, version 4. I have found no documentation about upgrade process and have a number of significant questions: Is my RedBean 3.5 database schema compatible 4, or will up have to migrate all of the tables to some new format? Is any of my RedBean 3.5 code compatible with version 4, or wouldI need to rewrite all of my code that uses RedBean 3.5? Would it make more sense to upgrade to Doctrine? As a side note, I am also feeling concerned about RedBean's drop of support for Composer, which I have found to be über helpful in managing the various versions of libraries I am using.

    Read the article

  • xsl:variable xsl:copy-of select

    - by user1901345
    I have the following XML: Picture 1 Picture 2 Picture 3 While this XSL does what is expected (output the attr of the first picture): It seems to be not possible to do the same inside the variable declaration using xsl:copy-of: Curious: If I just select "$FirstPicture" instead of "$FirstPicture/@attr" in the second example, it outputs the text node of Picture 1 as expected... Before you all suggest me to rewrite the code: This is just a simplified test, my real aim is to use a named template to select a node into the variable FirstPicture and reuse it for further selections. I hope someone could help me to understand the behavior or could suggest me a proper way to select a node with code which could be easily reused (the decission which node is the first one is complex in my real application). Thanks.

    Read the article

  • squid url_rewrite with cookie

    - by geekGod
    I have a squid 3.0 deployed which has a url_rewriter program which rewrites certain HTTP requests. I now need to modify this prpogram to rewrite along with the cookie setting code. As much as I have seen the url_rewrite_program documentation, it appears that I may not be able to set a cookie along with the 302 response. Is this correct? Can i set a cookie in the redirect response or would this require modifying the squid code. Appreciate any help in this regard!

    Read the article

  • Guidance on using drop in DLLs

    - by Scott Chamberlain
    I have been giving the task to rewrite a internal utility in .net for my work. One of program requirements the new system has is have a dll that implements a set of interfaces and have the program call the DLL. Now this dll will be changed out a lot per deployment. My question is what is the best way to do it from a development standpoint? Do I add a template DLL (one that only has the interfaces but no implementation) to the project references like I would do any other dll that I would use. Or do I need to use somthing like this every time I want to use code from the dll? var DropIn = System.Reflection.Assembly.LoadFrom("DropInDll.dll"); var getActions = DropIn.GetType("Main").GetMethod("GetActions"); List<IAction> ActionList = (List<IAction>)getActions.Invoke(null, null);

    Read the article

  • What scripts should not be ported from bash to python?

    - by Jack
    I decided to rewrite all our Bash scripts in Python (there are not so many of them) as my first Python project. The reason for it is that although being quite fluent in Bash I feel it's somewhat archaic language and since our system is in the first stages of its developments I think switching to Python now will be the right thing to do. Are there scripts that should always be written in Bash? For example, we have an init.d daemon script - is it OK to use Python for it? We run CentOS. Thanks.

    Read the article

  • building a hash lookup table during `git filter-branch` or `git-rebase`

    - by intuited
    I've been using the SHA1 hashes of my commits as references in documentation, etc. I've realized that if I need to rewrite those commits, I'll need to create a lookup table to correspond the hashes for the original repo with the hashes for the filtered repo. Since these are effectively UUID's, a simple lookup table would do. I think that it's relatively straightforward to write a script to do this during a filter-branch run; that's not really my question, though if there are some gotchas that make it complicated, I'd certainly like to hear about them. I'm really wondering if there are any tools that provide this functionality, or if there is some sort of convention on where to keep the lookup table/what to call it? I'd prefer not to do things in a completely idiosyncratic way.

    Read the article

  • What is this VB6 method doing?

    - by Craig
    We are converting a VB6 application to C# (4.0). and have come across a method in VB6 that we're battling to understand. Public Sub SaveToField(fldAttach As ADODB.Field) Dim bData() As Byte Dim nSize As Long nSize = Len(m_sEmail) bData = LngToByteArray(nSize) fldAttach.AppendChunk bData If nSize > 0 Then bData = StringToByteArray(m_sEmail) fldAttach.AppendChunk bData End If nSize = Len(m_sName) bData = LngToByteArray(nSize) fldAttach.AppendChunk bData If nSize > 0 Then bData = StringToByteArray(m_sName) fldAttach.AppendChunk bData End If bData = LngToByteArray(m_nContactID) fldAttach.AppendChunk bData End Sub It seems like it's doing some binary file copy type thing, but I'm not quite understanding. Could someone explain so that we can rewrite it?

    Read the article

  • Java : Using parent class method to access child class variable

    - by Jayant
    I have the following scenario : public class A { private int x = 5; public void print() { System.out.println(x); } } public class B extends A { private int x = 10; /*public void print() { System.out.println(x); }*/ public static void main(String[] args) { B b = new B(); b.print(); } } On executing the code, the output is : 5. How to access the child class(B's) variable(x) via the parent class method? Could this be done without overriding the print() method (i.e. uncommenting it in B)? [This is important because on overriding we will have to rewrite the whole code for the print() method again]

    Read the article

  • Input Iterator for a shared_ptr

    - by Baz
    I have an iterator which contains the following functions: ... T &operator*() { return *_i; } std::shared_ptr<T> operator->() { return _i; } private: std::shared_ptr<T> _i; ... How do I get a shared pointer to the internally stored _i? std::shared_ptr<Type> item = ??? Should I do: MyInterfaceIterator<Type> i; std::shared_ptr<Type> item = i.operator->(); Or should I rewrite operator*()?

    Read the article

  • magento .htaccess password protect inner pages (not homepage)

    - by Angel Wong
    I would like to use .htaccess to password protect all inner pages of Magento, except the home page. e.g. http://www.example.com/abc (password protect) http://www.example.com (home page, no need to password protect) I tried to use the setifenv request_uri = "/" => allow, but didn't work. It still password protect all pages including the homepage. I also tried a few ways inside the Magento admin URL rewrite, but those won't work either. Any expert can help? thx E

    Read the article

  • What is the right way to modify a wordpress query in a plugin?

    - by starepod
    Basically i am playing with a plugin that allows future-dated posts on archive pages. My question is broader than this specific functionality, but everyone likes some context. I have my head around many of the plugin development concepts, but must be missing something very basic. I can successfully rewrite a query that gives me the results i want like this: function modify_where( $where ) { global $wp_query; // define $year, $cat, etc if( is_archive() ) { $where = " AND YEAR(wp_posts.post_date)='".$year."' AND wp_term_taxonomy.taxonomy = 'category' AND wp_term_taxonomy.term_id IN ('".$cat."') AND wp_posts.post_type = 'post' AND (wp_posts.post_status = 'publish' OR wp_posts.post_status = 'future')"; } return $where; } add_filter('posts_where', 'catCal_where' ); However, if i attempt to create a new WP_Query('different_query_stuff') after the main loop the new query uses the same WHERE statement outlined above. The question is : What am I missing? Thanks.

    Read the article

  • Include upper bound in range()

    - by Jull
    How can I include the upper bound in range() function? I can't add by 1 because my for-loop looks like: for x in range(1,math.floor(math.sqrt(x))): y = math.sqrt(n - x * x) But as I understand it will actually be 1 < x < M where I need 1 < x <= M Adding 1 will completely change the result. I am trying to rewrite my old program from C# to Python. That's how it looked in C#: for (int x = 1; x <= Math.Floor(Math.Sqrt(n)); x++) double y = Math.Sqrt(n - x * x);

    Read the article

  • JavaScript: One ID, two functions. How can I do this with minimal code duplication?

    - by user1775598
    I've got an ID and I'd like to assign two functions to it. Here's what it currently looks like: document.getElementById(this.config.dragArea).addEventListener("drop", this._dropFiles, false); document.getElementById(this.config.dragArea).addEventListener("drop", this._handleFileDrop, false); How can I rewrite this file without so much duplication? I tried doing document.getElementById(this.config.dragArea).addEventListener("drop", this._dropFiles, this._handleFileDrop, false); and document.getElementById(this.config.dragArea).addEventListener("drop", function(){this._dropFiles; this._handleFileDrop}, false); All to no avail :(

    Read the article

  • Create div tag template and reuse

    - by user1683645
    Is it possible to create a template e.g with lots of other elements inside it with proper attribute "tagging" and reuse it with jquery? For instance when you want to display user submitted comments without refreshing the page. The reason I ask this is because the code between the div tags are rather long. So using for instance prepend() would be to long to rewrite. Whats the best approach for larger manipulations? Create a separate html? Im pretty new to manipulation, but since I have a programming background i would expect that there is an efficient way to reuse already existing HTML instead of redefining it in jquery.

    Read the article

  • CodePlex Daily Summary for Sunday, March 07, 2010

    CodePlex Daily Summary for Sunday, March 07, 2010New ProjectsAlgorithminator: Universal .NET algorithm visualizer, which helps you to illustrate any algorithm, written in any .NET language. Still in development.ALToolkit: Contains a set of handy .NET components/classes. Currently it contains: * A Numeric Text Box (an Extended NumericUpDown) * A Splash Screen base fo...Automaton Home: Automaton is a home automation software built with a n-Tier, MVVM pattern utilzing WCF, EF, WPF, Silverlight and XBAP.Developer Controls: Developer Controls contains various controls to help build applications that can script/write code.Dynamic Reference Manager: Dynamic Reference Manager is a set (more like a small group) of classes and attributes written in C# that allows any .NET program to reference othe...indiologic: Utilities of an IndioNeural Cryptography in F#: This project is my magistracy resulting work. It is intended to be an example of using neural networks in cryptography. Hashing functions are chose...Particle Filter Visualization: Particle Filter Visualization Program for the Intel Science and Engineering FairPólya: Efficient, immutable, polymorphic collections. .Net lacks them, we provide them*. * By we, we mean I; and by efficient, I mean hopefully so.project euler solutions from mhinze: mhinze project euler solutionsSilverlight 4 and WCF multi layer: Silverlight 4 and WCF multi layersqwarea: Project for a browser-based, minimalistic, massively multiplayer strategy game. Part of the "Génie logiciel et Cloud Computing" course of the ENS (...SuperSocket: SuperSocket, a socket application framework can build FTP/SMTP/POP server easilyToast (for ASP.NET MVC): Dynamic, developer & designer friendly content injection, compression and optimization for ASP.NET MVCNew ReleasesALToolkit: ALToolkit 1.0: Binary release of the libraries containing: NumericTextBox SplashScreen Based on the VB.NET code, but that doesn't really matter.Blacklist of Providers: 1.0-Milestone 1: Blacklist of Providers.Milestone 1In this development release implemented - Main interface (Work Item #5453) - Database (Work Item #5523)C# Linear Hash Table: Linear Hash Table b2: Now includes a default constructor, and will throw an exception if capacity is not set to a power of 2 or loadToMaintain is below 1.Composure: CassiniDev-Trunk-40745-VS2010.rc1.NET4: A simple port of the CassiniDev portable web server project for Visual Studio 2010 RC1 built against .NET 4.0. The WCF tests currently fail unless...Developer Controls: DevControls: These are the version 1.0 releases of these controls. Download the individually or all together (in a .zip file). More releases coming soon!Dynamic Reference Manager: DRM Alpha1: This is the first release. I'm calling it Alpha because I intend implementing other functions, but I do not intend changing the way current functio...ESB Toolkit Extensions: Tellago SOA ESB Extenstions v0.3: Windows Installer file that installs Library on a BizTalk ESB 2.0 system. This Install automatically configures the esb.config to use the new compo...GKO Libraries: GKO Libraries 0.1 Alpha: 0.1 AlphaHome Access Plus+: v3.0.3.0: Version 3.0.3.0 Release Change Log: Added Announcement Box Removed script files that aren't needed Fixed & issue in directory path Stylesheet...Icarus Scene Engine: Icarus Scene Engine 1.10.306.840: Icarus Professional, Icarus Player, the supporting software for Icarus Scene Engine, with some included samples, and the start of a tutorial (with ...mavjuz WndLpt: wndlpt-0.2.5: New: Response to 5 LPT inputs "test i 1" New: Reaction to 12 LPT outputs "test q 8" New: Reaction to all LPT pins "test pin 15" New: Syntax: ...Neural Cryptography in F#: Neural Cryptography 0.0.1: The most simple version of this project. It has a neural network that works just like logical AND and a possibility to recreate neural network from...Password Provider: 1.0.3: This release fixes a bug which caused the program to crash when double clicking on a generic item.RoTwee: RoTwee 6.2.0.0: New feature is as next. 16649 Add hashtag for tweet of tune.Now you can tweet your playing tune with hashtag.Visual Studio DSite: Picture Viewer (Visual C++ 2008): This example source code allows you to view any picture you want in the click of a button. All you got to do is click the button and browser via th...WatchersNET CKEditor™ Provider for DotNetNuke: CKEditor Provider 1.8.00: Whats New File Browser: Folders & Files View reworked File Browser: Folders & Files View reworked File Browser: Folders are displayed as TreeVi...WSDLGenerator: WSDLGenerator 0.0.0.4: - replaced CommonLibrary.dll by CommandLineParser.dll - added better support for custom complex typesMost Popular ProjectsMetaSharpSilverlight ToolkitASP.NET Ajax LibraryAll-In-One Code FrameworkWindows 7 USB/DVD Download Toolニコ生アラートWindows Double ExplorerVirtual Router - Wifi Hot Spot for Windows 7 / 2008 R2Caliburn: An Application Framework for WPF and SilverlightArkSwitchMost Active ProjectsUmbraco CMSRawrSDS: Scientific DataSet library and toolsBlogEngine.NETjQuery Library for SharePoint Web Servicespatterns & practices – Enterprise LibraryIonics Isapi Rewrite FilterFarseer Physics EngineFasterflect - A Fast and Simple Reflection APIFluent Assertions

    Read the article

  • Improving Partitioned Table Join Performance

    - by Paul White
    The query optimizer does not always choose an optimal strategy when joining partitioned tables. This post looks at an example, showing how a manual rewrite of the query can almost double performance, while reducing the memory grant to almost nothing. Test Data The two tables in this example use a common partitioning partition scheme. The partition function uses 41 equal-size partitions: CREATE PARTITION FUNCTION PFT (integer) AS RANGE RIGHT FOR VALUES ( 125000, 250000, 375000, 500000, 625000, 750000, 875000, 1000000, 1125000, 1250000, 1375000, 1500000, 1625000, 1750000, 1875000, 2000000, 2125000, 2250000, 2375000, 2500000, 2625000, 2750000, 2875000, 3000000, 3125000, 3250000, 3375000, 3500000, 3625000, 3750000, 3875000, 4000000, 4125000, 4250000, 4375000, 4500000, 4625000, 4750000, 4875000, 5000000 ); GO CREATE PARTITION SCHEME PST AS PARTITION PFT ALL TO ([PRIMARY]); There two tables are: CREATE TABLE dbo.T1 ( TID integer NOT NULL IDENTITY(0,1), Column1 integer NOT NULL, Padding binary(100) NOT NULL DEFAULT 0x,   CONSTRAINT PK_T1 PRIMARY KEY CLUSTERED (TID) ON PST (TID) );   CREATE TABLE dbo.T2 ( TID integer NOT NULL, Column1 integer NOT NULL, Padding binary(100) NOT NULL DEFAULT 0x,   CONSTRAINT PK_T2 PRIMARY KEY CLUSTERED (TID, Column1) ON PST (TID) ); The next script loads 5 million rows into T1 with a pseudo-random value between 1 and 5 for Column1. The table is partitioned on the IDENTITY column TID: INSERT dbo.T1 WITH (TABLOCKX) (Column1) SELECT (ABS(CHECKSUM(NEWID())) % 5) + 1 FROM dbo.Numbers AS N WHERE n BETWEEN 1 AND 5000000; In case you don’t already have an auxiliary table of numbers lying around, here’s a script to create one with 10 million rows: CREATE TABLE dbo.Numbers (n bigint PRIMARY KEY);   WITH L0 AS(SELECT 1 AS c UNION ALL SELECT 1), L1 AS(SELECT 1 AS c FROM L0 AS A CROSS JOIN L0 AS B), L2 AS(SELECT 1 AS c FROM L1 AS A CROSS JOIN L1 AS B), L3 AS(SELECT 1 AS c FROM L2 AS A CROSS JOIN L2 AS B), L4 AS(SELECT 1 AS c FROM L3 AS A CROSS JOIN L3 AS B), L5 AS(SELECT 1 AS c FROM L4 AS A CROSS JOIN L4 AS B), Nums AS(SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS n FROM L5) INSERT dbo.Numbers WITH (TABLOCKX) SELECT TOP (10000000) n FROM Nums ORDER BY n OPTION (MAXDOP 1); Table T1 contains data like this: Next we load data into table T2. The relationship between the two tables is that table 2 contains ‘n’ rows for each row in table 1, where ‘n’ is determined by the value in Column1 of table T1. There is nothing particularly special about the data or distribution, by the way. INSERT dbo.T2 WITH (TABLOCKX) (TID, Column1) SELECT T.TID, N.n FROM dbo.T1 AS T JOIN dbo.Numbers AS N ON N.n >= 1 AND N.n <= T.Column1; Table T2 ends up containing about 15 million rows: The primary key for table T2 is a combination of TID and Column1. The data is partitioned according to the value in column TID alone. Partition Distribution The following query shows the number of rows in each partition of table T1: SELECT PartitionID = CA1.P, NumRows = COUNT_BIG(*) FROM dbo.T1 AS T CROSS APPLY (VALUES ($PARTITION.PFT(TID))) AS CA1 (P) GROUP BY CA1.P ORDER BY CA1.P; There are 40 partitions containing 125,000 rows (40 * 125k = 5m rows). The rightmost partition remains empty. The next query shows the distribution for table 2: SELECT PartitionID = CA1.P, NumRows = COUNT_BIG(*) FROM dbo.T2 AS T CROSS APPLY (VALUES ($PARTITION.PFT(TID))) AS CA1 (P) GROUP BY CA1.P ORDER BY CA1.P; There are roughly 375,000 rows in each partition (the rightmost partition is also empty): Ok, that’s the test data done. Test Query and Execution Plan The task is to count the rows resulting from joining tables 1 and 2 on the TID column: SET STATISTICS IO ON; DECLARE @s datetime2 = SYSUTCDATETIME();   SELECT COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID;   SELECT DATEDIFF(Millisecond, @s, SYSUTCDATETIME()); SET STATISTICS IO OFF; The optimizer chooses a plan using parallel hash join, and partial aggregation: The Plan Explorer plan tree view shows accurate cardinality estimates and an even distribution of rows across threads (click to enlarge the image): With a warm data cache, the STATISTICS IO output shows that no physical I/O was needed, and all 41 partitions were touched: Running the query without actual execution plan or STATISTICS IO information for maximum performance, the query returns in around 2600ms. Execution Plan Analysis The first step toward improving on the execution plan produced by the query optimizer is to understand how it works, at least in outline. The two parallel Clustered Index Scans use multiple threads to read rows from tables T1 and T2. Parallel scan uses a demand-based scheme where threads are given page(s) to scan from the table as needed. This arrangement has certain important advantages, but does result in an unpredictable distribution of rows amongst threads. The point is that multiple threads cooperate to scan the whole table, but it is impossible to predict which rows end up on which threads. For correct results from the parallel hash join, the execution plan has to ensure that rows from T1 and T2 that might join are processed on the same thread. For example, if a row from T1 with join key value ‘1234’ is placed in thread 5’s hash table, the execution plan must guarantee that any rows from T2 that also have join key value ‘1234’ probe thread 5’s hash table for matches. The way this guarantee is enforced in this parallel hash join plan is by repartitioning rows to threads after each parallel scan. The two repartitioning exchanges route rows to threads using a hash function over the hash join keys. The two repartitioning exchanges use the same hash function so rows from T1 and T2 with the same join key must end up on the same hash join thread. Expensive Exchanges This business of repartitioning rows between threads can be very expensive, especially if a large number of rows is involved. The execution plan selected by the optimizer moves 5 million rows through one repartitioning exchange and around 15 million across the other. As a first step toward removing these exchanges, consider the execution plan selected by the optimizer if we join just one partition from each table, disallowing parallelism: SELECT COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID WHERE $PARTITION.PFT(T1.TID) = 1 AND $PARTITION.PFT(T2.TID) = 1 OPTION (MAXDOP 1); The optimizer has chosen a (one-to-many) merge join instead of a hash join. The single-partition query completes in around 100ms. If everything scaled linearly, we would expect that extending this strategy to all 40 populated partitions would result in an execution time around 4000ms. Using parallelism could reduce that further, perhaps to be competitive with the parallel hash join chosen by the optimizer. This raises a question. If the most efficient way to join one partition from each of the tables is to use a merge join, why does the optimizer not choose a merge join for the full query? Forcing a Merge Join Let’s force the optimizer to use a merge join on the test query using a hint: SELECT COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID OPTION (MERGE JOIN); This is the execution plan selected by the optimizer: This plan results in the same number of logical reads reported previously, but instead of 2600ms the query takes 5000ms. The natural explanation for this drop in performance is that the merge join plan is only using a single thread, whereas the parallel hash join plan could use multiple threads. Parallel Merge Join We can get a parallel merge join plan using the same query hint as before, and adding trace flag 8649: SELECT COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID OPTION (MERGE JOIN, QUERYTRACEON 8649); The execution plan is: This looks promising. It uses a similar strategy to distribute work across threads as seen for the parallel hash join. In practice though, performance is disappointing. On a typical run, the parallel merge plan runs for around 8400ms; slower than the single-threaded merge join plan (5000ms) and much worse than the 2600ms for the parallel hash join. We seem to be going backwards! The logical reads for the parallel merge are still exactly the same as before, with no physical IOs. The cardinality estimates and thread distribution are also still very good (click to enlarge): A big clue to the reason for the poor performance is shown in the wait statistics (captured by Plan Explorer Pro): CXPACKET waits require careful interpretation, and are most often benign, but in this case excessive waiting occurs at the repartitioning exchanges. Unlike the parallel hash join, the repartitioning exchanges in this plan are order-preserving ‘merging’ exchanges (because merge join requires ordered inputs): Parallelism works best when threads can just grab any available unit of work and get on with processing it. Preserving order introduces inter-thread dependencies that can easily lead to significant waits occurring. In extreme cases, these dependencies can result in an intra-query deadlock, though the details of that will have to wait for another time to explore in detail. The potential for waits and deadlocks leads the query optimizer to cost parallel merge join relatively highly, especially as the degree of parallelism (DOP) increases. This high costing resulted in the optimizer choosing a serial merge join rather than parallel in this case. The test results certainly confirm its reasoning. Collocated Joins In SQL Server 2008 and later, the optimizer has another available strategy when joining tables that share a common partition scheme. This strategy is a collocated join, also known as as a per-partition join. It can be applied in both serial and parallel execution plans, though it is limited to 2-way joins in the current optimizer. Whether the optimizer chooses a collocated join or not depends on cost estimation. The primary benefits of a collocated join are that it eliminates an exchange and requires less memory, as we will see next. Costing and Plan Selection The query optimizer did consider a collocated join for our original query, but it was rejected on cost grounds. The parallel hash join with repartitioning exchanges appeared to be a cheaper option. There is no query hint to force a collocated join, so we have to mess with the costing framework to produce one for our test query. Pretending that IOs cost 50 times more than usual is enough to convince the optimizer to use collocated join with our test query: -- Pretend IOs are 50x cost temporarily DBCC SETIOWEIGHT(50);   -- Co-located hash join SELECT COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID OPTION (RECOMPILE);   -- Reset IO costing DBCC SETIOWEIGHT(1); Collocated Join Plan The estimated execution plan for the collocated join is: The Constant Scan contains one row for each partition of the shared partitioning scheme, from 1 to 41. The hash repartitioning exchanges seen previously are replaced by a single Distribute Streams exchange using Demand partitioning. Demand partitioning means that the next partition id is given to the next parallel thread that asks for one. My test machine has eight logical processors, and all are available for SQL Server to use. As a result, there are eight threads in the single parallel branch in this plan, each processing one partition from each table at a time. Once a thread finishes processing a partition, it grabs a new partition number from the Distribute Streams exchange…and so on until all partitions have been processed. It is important to understand that the parallel scans in this plan are different from the parallel hash join plan. Although the scans have the same parallelism icon, tables T1 and T2 are not being co-operatively scanned by multiple threads in the same way. Each thread reads a single partition of T1 and performs a hash match join with the same partition from table T2. The properties of the two Clustered Index Scans show a Seek Predicate (unusual for a scan!) limiting the rows to a single partition: The crucial point is that the join between T1 and T2 is on TID, and TID is the partitioning column for both tables. A thread that processes partition ‘n’ is guaranteed to see all rows that can possibly join on TID for that partition. In addition, no other thread will see rows from that partition, so this removes the need for repartitioning exchanges. CPU and Memory Efficiency Improvements The collocated join has removed two expensive repartitioning exchanges and added a single exchange processing 41 rows (one for each partition id). Remember, the parallel hash join plan exchanges had to process 5 million and 15 million rows. The amount of processor time spent on exchanges will be much lower in the collocated join plan. In addition, the collocated join plan has a maximum of 8 threads processing single partitions at any one time. The 41 partitions will all be processed eventually, but a new partition is not started until a thread asks for it. Threads can reuse hash table memory for the new partition. The parallel hash join plan also had 8 hash tables, but with all 5,000,000 build rows loaded at the same time. The collocated plan needs memory for only 8 * 125,000 = 1,000,000 rows at any one time. Collocated Hash Join Performance The collated join plan has disappointing performance in this case. The query runs for around 25,300ms despite the same IO statistics as usual. This is much the worst result so far, so what went wrong? It turns out that cardinality estimation for the single partition scans of table T1 is slightly low. The properties of the Clustered Index Scan of T1 (graphic immediately above) show the estimation was for 121,951 rows. This is a small shortfall compared with the 125,000 rows actually encountered, but it was enough to cause the hash join to spill to physical tempdb: A level 1 spill doesn’t sound too bad, until you realize that the spill to tempdb probably occurs for each of the 41 partitions. As a side note, the cardinality estimation error is a little surprising because the system tables accurately show there are 125,000 rows in every partition of T1. Unfortunately, the optimizer uses regular column and index statistics to derive cardinality estimates here rather than system table information (e.g. sys.partitions). Collocated Merge Join We will never know how well the collocated parallel hash join plan might have worked without the cardinality estimation error (and the resulting 41 spills to tempdb) but we do know: Merge join does not require a memory grant; and Merge join was the optimizer’s preferred join option for a single partition join Putting this all together, what we would really like to see is the same collocated join strategy, but using merge join instead of hash join. Unfortunately, the current query optimizer cannot produce a collocated merge join; it only knows how to do collocated hash join. So where does this leave us? CROSS APPLY sys.partitions We can try to write our own collocated join query. We can use sys.partitions to find the partition numbers, and CROSS APPLY to get a count per partition, with a final step to sum the partial counts. The following query implements this idea: SELECT row_count = SUM(Subtotals.cnt) FROM ( -- Partition numbers SELECT p.partition_number FROM sys.partitions AS p WHERE p.[object_id] = OBJECT_ID(N'T1', N'U') AND p.index_id = 1 ) AS P CROSS APPLY ( -- Count per collocated join SELECT cnt = COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID WHERE $PARTITION.PFT(T1.TID) = p.partition_number AND $PARTITION.PFT(T2.TID) = p.partition_number ) AS SubTotals; The estimated plan is: The cardinality estimates aren’t all that good here, especially the estimate for the scan of the system table underlying the sys.partitions view. Nevertheless, the plan shape is heading toward where we would like to be. Each partition number from the system table results in a per-partition scan of T1 and T2, a one-to-many Merge Join, and a Stream Aggregate to compute the partial counts. The final Stream Aggregate just sums the partial counts. Execution time for this query is around 3,500ms, with the same IO statistics as always. This compares favourably with 5,000ms for the serial plan produced by the optimizer with the OPTION (MERGE JOIN) hint. This is another case of the sum of the parts being less than the whole – summing 41 partial counts from 41 single-partition merge joins is faster than a single merge join and count over all partitions. Even so, this single-threaded collocated merge join is not as quick as the original parallel hash join plan, which executed in 2,600ms. On the positive side, our collocated merge join uses only one logical processor and requires no memory grant. The parallel hash join plan used 16 threads and reserved 569 MB of memory:   Using a Temporary Table Our collocated merge join plan should benefit from parallelism. The reason parallelism is not being used is that the query references a system table. We can work around that by writing the partition numbers to a temporary table (or table variable): SET STATISTICS IO ON; DECLARE @s datetime2 = SYSUTCDATETIME();   CREATE TABLE #P ( partition_number integer PRIMARY KEY);   INSERT #P (partition_number) SELECT p.partition_number FROM sys.partitions AS p WHERE p.[object_id] = OBJECT_ID(N'T1', N'U') AND p.index_id = 1;   SELECT row_count = SUM(Subtotals.cnt) FROM #P AS p CROSS APPLY ( SELECT cnt = COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID WHERE $PARTITION.PFT(T1.TID) = p.partition_number AND $PARTITION.PFT(T2.TID) = p.partition_number ) AS SubTotals;   DROP TABLE #P;   SELECT DATEDIFF(Millisecond, @s, SYSUTCDATETIME()); SET STATISTICS IO OFF; Using the temporary table adds a few logical reads, but the overall execution time is still around 3500ms, indistinguishable from the same query without the temporary table. The problem is that the query optimizer still doesn’t choose a parallel plan for this query, though the removal of the system table reference means that it could if it chose to: In fact the optimizer did enter the parallel plan phase of query optimization (running search 1 for a second time): Unfortunately, the parallel plan found seemed to be more expensive than the serial plan. This is a crazy result, caused by the optimizer’s cost model not reducing operator CPU costs on the inner side of a nested loops join. Don’t get me started on that, we’ll be here all night. In this plan, everything expensive happens on the inner side of a nested loops join. Without a CPU cost reduction to compensate for the added cost of exchange operators, candidate parallel plans always look more expensive to the optimizer than the equivalent serial plan. Parallel Collocated Merge Join We can produce the desired parallel plan using trace flag 8649 again: SELECT row_count = SUM(Subtotals.cnt) FROM #P AS p CROSS APPLY ( SELECT cnt = COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID WHERE $PARTITION.PFT(T1.TID) = p.partition_number AND $PARTITION.PFT(T2.TID) = p.partition_number ) AS SubTotals OPTION (QUERYTRACEON 8649); The actual execution plan is: One difference between this plan and the collocated hash join plan is that a Repartition Streams exchange operator is used instead of Distribute Streams. The effect is similar, though not quite identical. The Repartition uses round-robin partitioning, meaning the next partition id is pushed to the next thread in sequence. The Distribute Streams exchange seen earlier used Demand partitioning, meaning the next partition id is pulled across the exchange by the next thread that is ready for more work. There are subtle performance implications for each partitioning option, but going into that would again take us too far off the main point of this post. Performance The important thing is the performance of this parallel collocated merge join – just 1350ms on a typical run. The list below shows all the alternatives from this post (all timings include creation, population, and deletion of the temporary table where appropriate) from quickest to slowest: Collocated parallel merge join: 1350ms Parallel hash join: 2600ms Collocated serial merge join: 3500ms Serial merge join: 5000ms Parallel merge join: 8400ms Collated parallel hash join: 25,300ms (hash spill per partition) The parallel collocated merge join requires no memory grant (aside from a paltry 1.2MB used for exchange buffers). This plan uses 16 threads at DOP 8; but 8 of those are (rather pointlessly) allocated to the parallel scan of the temporary table. These are minor concerns, but it turns out there is a way to address them if it bothers you. Parallel Collocated Merge Join with Demand Partitioning This final tweak replaces the temporary table with a hard-coded list of partition ids (dynamic SQL could be used to generate this query from sys.partitions): SELECT row_count = SUM(Subtotals.cnt) FROM ( VALUES (1),(2),(3),(4),(5),(6),(7),(8),(9),(10), (11),(12),(13),(14),(15),(16),(17),(18),(19),(20), (21),(22),(23),(24),(25),(26),(27),(28),(29),(30), (31),(32),(33),(34),(35),(36),(37),(38),(39),(40),(41) ) AS P (partition_number) CROSS APPLY ( SELECT cnt = COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID WHERE $PARTITION.PFT(T1.TID) = p.partition_number AND $PARTITION.PFT(T2.TID) = p.partition_number ) AS SubTotals OPTION (QUERYTRACEON 8649); The actual execution plan is: The parallel collocated hash join plan is reproduced below for comparison: The manual rewrite has another advantage that has not been mentioned so far: the partial counts (per partition) can be computed earlier than the partial counts (per thread) in the optimizer’s collocated join plan. The earlier aggregation is performed by the extra Stream Aggregate under the nested loops join. The performance of the parallel collocated merge join is unchanged at around 1350ms. Final Words It is a shame that the current query optimizer does not consider a collocated merge join (Connect item closed as Won’t Fix). The example used in this post showed an improvement in execution time from 2600ms to 1350ms using a modestly-sized data set and limited parallelism. In addition, the memory requirement for the query was almost completely eliminated  – down from 569MB to 1.2MB. The problem with the parallel hash join selected by the optimizer is that it attempts to process the full data set all at once (albeit using eight threads). It requires a large memory grant to hold all 5 million rows from table T1 across the eight hash tables, and does not take advantage of the divide-and-conquer opportunity offered by the common partitioning. The great thing about the collocated join strategies is that each parallel thread works on a single partition from both tables, reading rows, performing the join, and computing a per-partition subtotal, before moving on to a new partition. From a thread’s point of view… If you have trouble visualizing what is happening from just looking at the parallel collocated merge join execution plan, let’s look at it again, but from the point of view of just one thread operating between the two Parallelism (exchange) operators. Our thread picks up a single partition id from the Distribute Streams exchange, and starts a merge join using ordered rows from partition 1 of table T1 and partition 1 of table T2. By definition, this is all happening on a single thread. As rows join, they are added to a (per-partition) count in the Stream Aggregate immediately above the Merge Join. Eventually, either T1 (partition 1) or T2 (partition 1) runs out of rows and the merge join stops. The per-partition count from the aggregate passes on through the Nested Loops join to another Stream Aggregate, which is maintaining a per-thread subtotal. Our same thread now picks up a new partition id from the exchange (say it gets id 9 this time). The count in the per-partition aggregate is reset to zero, and the processing of partition 9 of both tables proceeds just as it did for partition 1, and on the same thread. Each thread picks up a single partition id and processes all the data for that partition, completely independently from other threads working on other partitions. One thread might eventually process partitions (1, 9, 17, 25, 33, 41) while another is concurrently processing partitions (2, 10, 18, 26, 34) and so on for the other six threads at DOP 8. The point is that all 8 threads can execute independently and concurrently, continuing to process new partitions until the wider job (of which the thread has no knowledge!) is done. This divide-and-conquer technique can be much more efficient than simply splitting the entire workload across eight threads all at once. Related Reading Understanding and Using Parallelism in SQL Server Parallel Execution Plans Suck © 2013 Paul White – All Rights Reserved Twitter: @SQL_Kiwi

    Read the article

  • Next Phase of ECM 11g Now Available - New UCM & URM 11g, & Updated I/PM & IRM 11g

    - by michelle.huff
    We're excited to announce that the Oracle Enterprise Content Management Suite 11g is now available! Today, Oracle announced ECM Suite 11g, a part of Fusion Middleware 11gR1 Patchset 2 release, which builds upon the Imaging and Process Management (I/PM) and Information Rights Management (IRM) 11g release earlier this year. Universal Content Management (UCM) and Universal Records Management (URM) 11g are now available with many new features and enhancements. All ECM products are localized into 27 languages, use a single repository, a single installer, centralized administration, and all run on the same Fusion Middleware tech stack. Oracle ECM Suite 11g, is better integrated to fit the way you work, with extreme performance and extreme scalability. Universal Content Management One click Web content management - brings Web content management authoring, design and presentation capabilities directly into how organizations design sites, portals, and custom Web applications. Simply take in the right amount of WCM that meets your needs - all without having to rewrite the application or port it over to a new technology stack or framework. Greater business user empowerment - with next generation desktop integrations and "smart productivity folders", new Web site "design mode" for business users, and enhanced rich media support enabling users to better work with photography, graphics, videos & podcasts created today as well as contribute content within Flash files directly from the Web. Advanced manageability with extreme performance & scalability - centralized system monitoring, installation, logging, performance metrics & diagnostics, with new built in "fast check-in" features, redesigned component management interface - all running on Fusion Middleware infrastructure. Universal Records Management Enhanced user experience: Oracle URM 11g makes records management easier for both business users and records administrators. Simplifications in the end user experience allow the creation of bookmarks into often-used part of the file plan, easy copying of categories and dispositions, and integrated folder and records search. The records management dashboard provides a consolidated view into records administrator tasks and system performance. DoD 5015.02 v3: Oracle URM is fully certified against all part of the US Department of Defense records management standard - baseline, classified, and Freedom of Information and Privacy Act. This enables Federal, state, & local governments & public agencies, as well as private companies, to maintain regulated compliance. Expanded functionality through Oracle integrations: Oracle URM 11g allows for an expanded set of functionality through integration capabilities with other Oracle products. This includes configurable records definition capabilities directly within a UCM instance. An out of the box integration with Oracle BI Publisher provides easily configured and robust reporting. Additionally, 11g offers an out of the box Oracle Secure Enterprise Search integration enabling real time full text discovery across disparate systems in an organization. Read the Press Release Watch the 3 Minute ECM 11g Video Get Up to Speed with the What's New in ECM Suite Datasheet Learn More on OTN with new tutorials, downloads and whitepapers

    Read the article

  • Woes of a Junior Developer - is it possible to not be cut out for programming?

    - by user575158
    (Let me start off by asking - please be gentle, I know this is subjective, but it's meant to incite discussion and provide information for others. If needed it can be converted to community wiki.) I recently was hired as a junior developer at a company I really like. I started out in the field doing QA and transitioned into more and more development work, which is what I really want to end up doing. I enjoy it, but more and more I am questioning whether I am really any good at it or not. Part of this is still growing into the junior developer role, I know, but how much? What are junior developers to expect, what should they be doing and not doing? What can I do to improve and show my company I am serious about this opportunity? I hate that I am costing them time by getting up to speed. I've been told by others that companies make investments in Junior devs and don't expect them to pay off for a while, but how much of this is true? There's got to be a point when it's apparent whether the investment will pay off or not. So far I've been trying to ask as many questions I can, but I've you've been obsessing over a simple problem for some time and the others know that, there comes a time when it's pretty embarrassing to have to get help after struggling so long. I've also tried to be as open to suggestion as possible and work with others to try to refactor my code, but sometimes this can be hard clashing with various team members' personal opinions (being told by someone to write it one way, and then having someone else make you rewrite it). I often get over-stressed and judge myself too harshly, but I just don't want to have to struggle the rest of my life trying to get things work if I just don't have the talent. In your experience, is programming something that almost everyone can learn, or something that some people just don't get? Do others feel this way, or did you feel that way when starting out? It scares me that I have no other job skills should I be unsuited for having the skills necessary to code well.

    Read the article

  • SharpDX and game engines, back to zero?

    - by Baboon
    I'm a desktop developer (I mainly do WPF for a living) but I want to make games as a hobby. So a few months ago, I started reading blogs, gamedevSE, you name it. I understand in the C++ DirectX world, you have engines such as Unity3D with designers and whatnot, but as much as I'm ok to spend months understanding game development, I'd prefer to stay with my comfy C#. So I thought I'd develop my first games in /C#/DirectX through SharpDX But then, I can't use game engines anymore, since they're made for C++DirectX and not SharpDX. (ok, I could do P/Invokes but that defeats the purpose of SharpDX). I do know about XNA, and I also do know I can't publish to the marketplace with it (and quite frankly, I don't really want to learn an API that is in jeopardy). So how do you conceal writing games in C# and using existing game engines instead of reinventing the wheel? wait for ports? So I've found out the following: After doing the first tutorials of MOgre and digging around, it seems MOgre gives you the worse of both worlds: . You can't port it with Mono because it directly references a C++/CLI dll (Ogre) and C++/CLI isn't supported by Mono. And since it's not C++ itself, you can't make a pure build. Which, as far as I understand, means I'd be stuck on windows (and not even WinRT/Metro compatible), without capability of porting anything to mobiles or other OS (Mac/Linux). Even though it looks really nice to develop with MOgre, I'd like to learn something a little more open to future broadening. On the other hand, MonoGame seems to be a rewrite of XNA with SharpDX which sounds very promising: Mono allows me to easily port my games to other platforms, mobiles included. SharpDX allows me to access the latest DirectX versions XNA would be my first choice if only MS showed some hope for the future It really looks like MonoGame is nothing else than XNA on SharpDX. Axiom looks good too but I lack info on the subject (and pages with poor design don't give me a good feeling about an API...) XNA with SunBurn looks good: It should be portable to Mono (can anyone with experience give us feedback on that?), thus multi-platform. It's then marketplace-able since Mono itself is. Did I miss something or are 2) and 4) my best options (aside from the fact Mono doesn't support any XAML)?

    Read the article

  • What micro web-framework has the lowest overhead but includes templating

    - by Simon Martin
    I want to rewrite a simple small (10 page) website and besides a contact form it could be written in pure html. It is currently built with classic asp and Dreamweaver templates. The reason I'm not simply writing 10 html pages is that I want to keep the layout all in 1 place so would need either includes or a masterpage. I don't want to use Dreamweaver templates, or batch processing (like org-mode) because I want to be able to edit using notepad (or Visual Studio) because occasionally I might need to edit a file on the server (Go Daddy's IIS admin interface will let me edit text). I don't want to use ASP.NET MVC or WebForms (which I use in my day job) because I don't need all the overhead they bring with them when essentially I'm serving up 9 static files, 1 contact form and 1 list of clubs (that I aim to use jQuery to filter). The shared hosting package I have on Go Daddy seems to take a long time to spin up when serving aspx files. Currently the clubs page is driven from an MS SQL database that I try to keep up to date by manually checking the dojo locator on the main HQ pages and editing the entries myself, this is again way over the top. I aim to get a text file with the club details (probably in JSON or xml format) and use that as the source for the clubs page. There will need to be a bit of programming for this as the HQ site is unable to provide an extract / feed so something will have to scrape the site periodically to update my clubs persistence file. I'd like that to be automated - but I'm happy to have that triggered on a visit to the clubs page so I don't need to worry about scheduling a job. I would probably have a separate process that updates the persistence that has nothing to do with the rest of the site. Ideally I'd like to use Mercurial (or git) to publish, I know Bitbucket (and github) both serve static page sites so they wouldn't work in this scenario (dynamic pages and a contact form) but that's the model I'd like to use if there is such a thing. My requirements are: Simple templating system, 1 place to define header, footers, menu etc., that can be edited using just notepad. Very minimal / lightweight framework. I don't need a monster for 10 pages Must run either on IIS7 (shared Go Daddy Windows hosting) or other free host

    Read the article

  • Appropriate response when client empowered with CMS destroys content to his own will

    - by dukeofgaming
    So, I just recently closed a website project that pretty much was The Oatmeals' Design Hell, but with content. The client loved the site at the beginning but started getting other people involved and mercilessly bombarding us with their opinions. We served a carefully thought content strategy (which the client approved) and extremely curated copywriting that took us four months after at least 5 requirement changes (new content, new objectives for the business, changed offerings, new mindfaps, etc.) that required us to rewrite the content about 3 times. The client never gave timely feedback even though we kept the process open for him and his people to see (content being developed transparently in Google Docs). Near the end of the project he still wanted to make changes but wanted us to finish already (there are not enough words in the world to even try to make sense of this). So I explained to him the obvious implications of the never-ending requirement changes and advised him to take the time to gather his thoughts with his own team and see the new content introduced as a new content maintenance project. He happily accepted, but on the day of training/delivery things went very wrong and we have no idea why. The client didn't even allow the site to be out for a week with the content we developed for him and quickly replaced us with a Joomla savvy intern so that he completely destroy the content with shallow, unstructured, tasteless and plain wordsmithing (and I'm not even being visceral). Worst insult of all, he revoked our access from his server and the deployed CMS not even having passed 10 minutes of being given his administrator account (we realized the day after that he did it in our own office, the nerve!). Everybody involved in the team is enraged and insulted. I never want to see this happen again. So, to try to make sense of this situation and avoid it in the future with new clients I have two concrete questions: Is there even an appropriate course of action with a client like this?, or is he just not worth the trouble of analyzing (blindly hoping this never repeats again). In the exercise to try and blame ourselves instead of the client and take this as a lesson of... something, how should we set expectations for new clients about the working terms, process and final product so that they are discouraged from mauling the content to their own contempt once they get the codes to the nukes (access to the CMS)?

    Read the article

  • Beginning with first project on game development [closed]

    - by Tsvetan
    Today is the day I am going to start my first real game project. It will be a Universe simulator. Basically, you can build anything from tiny meteor to quazars and universes. It is going to be my project for an olympiad in IT in my country and I really want to make it perfect(at least a bronze medal). So, I would like to ask some questions about organization and development methodologies. Firstly, my plan is to make a time schedule. In it I would write my plans for the next month or two(because that is the time I have). With this exact plan I hope to make my organisation at its best. Of course, if I am doing sth faster than the schedule I would involve more features for the game and/or continue with the tempo I have. Also, for the organisation I would make a basic pseudocode(maybe) and just rewrite it so it is compilable. Like a basic skeleton of everything. The last is an idea I tought of in the moment, but if it is good I will use it. Secondly, for the development methodologies, obviously, I think of making object-oriented code and make everything perfect(a lot of testing, good code, documentation etc.). Also, I am going to make my own menu system(I read that OpenGL hasn't got very good one). Maybe I would implement it with an xml file, holding the info about position of buttons, text boxes, images and everything. Maybe I would do a specific CSS for it and so on. I think that is very good way of doing the menu system, because it makes the presentation layer separate of the logic. But, if there is a better way, I would do it the better way. For the logic, well, I don't have much to say. OO code, testing, debuging, good and fast algorithms and so on. Also, a good documentation must be written and this is the area I need to make some research in. I think that is for now. I hope I have been enough descriptive. If more questions come on my mind, I will ask them. Edit: I think of blogging every part of the project, or at least writing down everything in a file or something like that. My question is: Is my plan of how to do everything around the project good? And if not, what is necessary to be improved and what other things I can involve for making the project good.

    Read the article

  • Advice on designing a robust program to handle a large library of meta-information & programs

    - by Sam Bryant
    So this might be overly vague, but here it is anyway I'm not really looking for a specific answer, but rather general design principles or direction towards resources that deal with problems like this. It's one of my first large-scale applications, and I would like to do it right. Brief Explanation My basic problem is that I have to write an application that handles a large library of meta-data, can easily modify the meta-data on-the-fly, is robust with respect to crashing, and is very efficient. (Sorta like the design parameters of iTunes, although sometimes iTunes performs more poorly than I would like). If you don't want to read the details, you can skip the rest Long Explanation Specifically I am writing a program that creates a library of image files and meta-data about these files. There is a list of tags that may or may not apply to each image. The program needs to be able to add new images, new tags, assign tags to images, and detect duplicate images, all while operating. The program contains an image Viewer which has tagging operations. The idea is that if a given image A is viewed while the library has tags T1, T2, and T3, then that image will have boolean flags for each of those tags (depending on whether the user tagged that image while it was open in the Viewer). However, prior to being viewed in the Viewer, image A would have no value for tags T1, T2, and T3. Instead it would have a "dirty" flag indicating that it is unknown whether or not A has these tags or not. The program can introduce new tags at any time (which would automatically set all images to "dirty" with respect to this new tag) This program must be fast. It must be easily able to pull up a list of images with or without a certain tag as well as images which are "dirty" with respect to a tag. It has to be crash-safe, in that if it suddenly crashes, all of the tagging information done in that session is not lost (though perhaps it's okay to loose some of it) Finally, it has to work with a lot of images (10,000) I am a fairly experienced programmer, but I have never tried to write a program with such demanding needs and I have never worked with databases. With respect to the meta-data storage, there seem to be a few design choices: Choice 1: Invidual meta-data vs centralized meta-data Individual Meta-Data: have a separate meta-data file for each image. This way, as soon as you change the meta-data for an image, it can be written to the hard disk, without having to rewrite the information for all of the other images. Centralized Meta-Data: Have a single file to hold the meta-data for every file. This would probably require meta-data writes in intervals as opposed to after every change. The benefit here is that you could keep a centralized list of all images with a given tag, ect, making the task of pulling up all images with a given tag very efficient

    Read the article

< Previous Page | 100 101 102 103 104 105 106 107 108 109 110 111  | Next Page >