Search Results

Search found 442 results on 18 pages for 'the naive'.

Page 18/18 | < Previous Page | 14 15 16 17 18

How to reduce iOS AVPlayer start delay

- by Bernt Habermeier

Note, for the below question: All assets are local on the device -- no network streaming is taking place. The videos contain audio tracks. I'm working on an iOS application that requires playing video files with minimum delay to start the video clip in question. Unfortunately we do not know what specific video clip is next until we actually need to start it up. Specifically: When one video clip is playing, we will know what the next set of (roughly) 10 video clips are, but we don't know which one exactly, until it comes time to 'immediately' play the next clip. What I've done to look at actual start delays is to call addBoundaryTimeObserverForTimes on the video player, with a time period of one millisecond to see when the video actually started to play, and I take the difference of that time stamp with the first place in the code that indicates which asset to start playing. From what I've seen thus-far, I have found that using the combination of AVAsset loading, and then creating an AVPlayerItem from that once it's ready, and then waiting for AVPlayerStatusReadyToPlay before I call play, tends to take between 1 and 3 seconds to start the clip. I've since switched to what I think is roughly equivalent: calling [AVPlayerItem playerItemWithURL:] and waiting for AVPlayerItemStatusReadyToPlay to play. Roughly same performance. One thing I'm observing is that the first AVPlayer item load is slower than the rest. Seems one idea is to pre-flight the AVPlayer with a short / empty asset before trying to play the first video might be of good general practice. [http://stackoverflow.com/questions/900461/slow-start-for-avaudioplayer-the-first-time-a-sound-is-played] I'd love to get the video start times down as much as possible, and have some ideas of things to experiment with, but would like some guidance from anyone that might be able to help. Update: idea 7, below, as-implemented yields switching times of around 500 ms. This is an improvement, but it it'd be nice to get this even faster. Idea 1: Use N AVPlayers (won't work) Using ~ 10 AVPPlayer objects and start-and-pause all ~ 10 clips, and once we know which one we really need, switch to, and un-pause the correct AVPlayer, and start all over again for the next cycle. I don't think this works, because I've read there is roughly a limit of 4 active AVPlayer's in iOS. There was someone asking about this on StackOverflow here, and found out about the 4 AVPlayer limit: fast-switching-between-videos-using-avfoundation Idea 2: Use AVQueuePlayer (won't work) I don't believe that shoving 10 AVPlayerItems into an AVQueuePlayer would pre-load them all for seamless start. AVQueuePlayer is a queue, and I think it really only makes the next video in the queue ready for immediate playback. I don't know which one out of ~10 videos we do want to play back, until it's time to start that one. ios-avplayer-video-preloading Idea 3: Load, Play, and retain AVPlayerItems in background (not 100% sure yet -- but not looking good) I'm looking at if there is any benefit to load and play the first second of each video clip in the background (suppress video and audio output), and keep a reference to each AVPlayerItem, and when we know which item needs to be played for real, swap that one in, and swap the background AVPlayer with the active one. Rinse and Repeat. The theory would be that recently played AVPlayer/AVPlayerItem's may still hold some prepared resources which would make subsequent playback faster. So far, I have not seen benefits from this, but I might not have the AVPlayerLayer setup correctly for the background. I doubt this will really improve things from what I've seen. Idea 4: Use a different file format -- maybe one that is faster to load? I'm currently using .m4v's (video-MPEG4) H.264 format. I have not played around with other formats, but it may well be that some formats are faster to decode / get ready than others. Possible still using video-MPEG4 but with a different codec, or maybe quicktime? Maybe a lossless video format where decoding / setup is faster? Idea 5: Combination of lossless video format + AVQueuePlayer If there is a video format that is fast to load, but maybe where the file size is insane, one idea might be to pre-prepare the first 10 seconds of each video clip with a version that is boated but faster to load, but back that up with an asset that is encoded in H.264. Use an AVQueuePlayer, and add the first 10 seconds in the uncompressed file format, and follow that up with one that is in H.264 which gets up to 10 seconds of prepare/preload time. So I'd get 'the best' of both worlds: fast start times, but also benefits from a more compact format. Idea 6: Use a non-standard AVPlayer / write my own / use someone else's Given my needs, maybe I can't use AVPlayer, but have to resort to AVAssetReader, and decode the first few seconds (possibly write raw file to disk), and when it comes to playback, make use of the raw format to play it back fast. Seems like a huge project to me, and if I go about it in a naive way, it's unclear / unlikely to even work better. Each decoded and uncompressed video frame is 2.25 MB. Naively speaking -- if we go with ~ 30 fps for the video, I'd end up with ~60 MB/s read-from-disk requirement, which is probably impossible / pushing it. Obviously we'd have to do some level of image compression (perhaps native openGL/es compression formats via PVRTC)... but that's kind crazy. Maybe there is a library out there that I can use? Idea 7: Combine everything into a single movie asset, and seekToTime One idea that might be easier than some of the above, is to combine everything into a single movie, and use seekToTime. The thing is that we'd be jumping all around the place. Essentially random access into the movie. I think this may actually work out okay: avplayer-movie-playing-lag-in-ios5 Which approach do you think would be best? So far, I've not made that much progress in terms of reducing the lag.

Read the article
The Windows Store... why did I sign up with this mess again?

- by FransBouma

Yesterday, Microsoft revealed that the Windows Store is now open to all developers in a wide range of countries and locations. For the people who think "wtf is the 'Windows Store'?", it's the central place where Windows 8 users will be able to find, download and purchase applications (or as we now have to say to not look like a computer illiterate: <accent style="Kentucky">aaaaappss</accent>) for Windows 8. As this is the store which is integrated into Windows 8, it's an interesting place for ISVs, as potential customers might very well look there first. This of course isn't true for all kinds of software, and developer tools in general aren't the kind of applications most users will download from the Windows store, but a presence there can't hurt. Now, this Windows Store hosts two kinds of applications: 'Metro-style' applications and 'Desktop' applications. The 'Metro-style' applications are applications created for the new 'Metro' UI which is present on Windows 8 desktop and Windows RT (the single color/big font fingerpaint-oriented UI). 'Desktop' applications are the applications we all run and use on Windows today. Our software are desktop applications. The Windows Store hosts all Metro-style applications locally in the store and handles the payment for these applications. This means you upload your application (sorry, 'app') to the store, jump through a lot of hoops, Microsoft verifies that your application is not violating a tremendous long list of rules and after everything is OK, it's published and hopefully you get customers and thus earn money. Money which Microsoft will pay you on a regular basis after customers buy your application. Desktop applications are not following this path however. Desktop applications aren't hosted by the Windows Store. Instead, the Windows Store more or less hosts a page with the application's information and where to get the goods. I.o.w.: it's nothing more than a product's Facebook page. Microsoft will simply redirect a visitor of the Windows Store to your website and the visitor will then use your site's system to purchase and download the application. This last bit of information is very important. So, this morning I started with fresh energy to register our company 'Solutions Design bv' at the Windows Store and our two applications, LLBLGen Pro and ORM Profiler. First I went to the Windows Store dashboard page. If you don't have an account, you have to log in or sign up if you don't have a live account. I signed in with my live account. After that, it greeted me with a page where I had to fill in a code which was mailed to me. My local mail server polls every several minutes for email so I had to kick it to get it immediately. I grabbed the code from the email and I was presented with a multi-step process to register myself as a company or as an individual. In red I was warned that this choice was permanent and not changeable. I chuckled: Microsoft apparently stores its data on paper, not in digital form. I chose 'company' and was presented with a lengthy form to fill out. On the form there were two strange remarks: Per company there can just be 1 (one, uno, not zero, not two or more) registered developer, and only that developer is able to upload stuff to the store. I have no idea how this works with large companies, oh the overhead nightmares... "Sorry, but John, our registered developer with the Windows Store is on holiday for 3 months, backpacking through Australia, no, he's not reachable at this point. M'yeah, sorry bud. Hey, did you fill in those TPS reports yesterday?" A separate Approver has to be specified, which has to be a different person than the registered developer. Apparently to Microsoft a company with just 1 person is not a company. Luckily we're with two people! *pfew*, dodged that one, otherwise I would be stuck forever: the choice I already made was not reversible! After I had filled out the form and it was all well and good and accepted by the Microsoft lackey who had to write it all down in some paper notebook ("Hey, be warned! It's a permanent choice! Written down in ink, can't be changed!"), I was presented with the question how I wanted to pay for all this. "Pay for what?" I wondered. Must be the paper they were scribbling the information on, I concluded. After all, there's a financial crisis going on! How could I forget! Silly me. "Ok fair enough". The price was 75 Euros, not the end of the world. I could only pay by credit card, so it was accepted quickly. Or so I thought. You see, Microsoft has a different idea about CC payments. In the normal world, you type in your CC number, some date, a name and a security code and that's it. But Microsoft wants to verify this even more. They want to make a verification purchase of a very small amount and are doing that with a special code in the description. You then have to type in that code in a special form in the Windows Store dashboard and after that you're verified. Of course they'll refund the small amount they pull from your card. Sounds simple, right? Well... no. The problem starts with the fact that I can't see the CC activity on some website: I have a bank issued CC card. I get the CC activity once a month on a piece of paper sent to me. The bank's online website doesn't show them. So it's possible I have to wait for this code till October 12th. One month. "So what, I'm not going to use it anyway, Desktop applications don't use the payment system", I thought. "Haha, you're so naive, dear developer!" Microsoft won't allow you to publish any applications till this verification is done. So no application publishing for a month. Wouldn't it be nice if things were, you know, digital, so things got done instantly? But of course, that lackey who scribbled everything in the Big Windows Store Registration Book isn't that quick. Can't blame him though. He's just doing his job. Now, after the payment was done, I was presented with a page which tells me Microsoft is going to use a third party company called 'Symantec', which will verify my identity again. The page explains to me that this could be done through email or phone and that they'll contact the Approver to verify my identity. "Phone?", I thought... that's a little drastic for a developer account to publish a single page of information about an external hosted software product, isn't it? On Facebook I just added a page, done. And paying you, Microsoft, took less information: you were happy to take my money before my identity was even 'verified' by this 3rd party's minions! "Double standards!", I roared. No-one cared. But it's the thought of getting it off your chest, you know. Luckily for me, everyone at Symantec was asleep when I was registering so they went for the fallback option in case phone calls were not possible: my Approver received an email. Imagine you have to explain the idiot web of security theater I was caught in to someone else who then has to reply a random person over the internet that I indeed was who I said I was. As she's a true sweetheart, she gave me the benefit of the doubt and assured that for now, I was who I said I was. Remember, this is for a desktop application, which is only a link to a website, some pictures and a piece of text. No file hosting, no payment processing, nothing, just a single page. Yeah, I also thought I was crazy. But we're not at the end of this quest yet. I clicked around in the confusing menus of the Windows Store dashboard and found the 'Desktop' section. I get a helpful screen with a warning in red that it can't find any certified 'apps'. True, I'm just getting started, buddy. I see a link: "Check the Windows apps you submitted for certification". Well, I haven't submitted anything, but let's see where it brings me. Oh the thrill of adventure! I click the link and I end up on this site: the hardware/desktop dashboard account registration. "Erm... but I just registered...", I mumbled to no-one in particular. Apparently for desktop registration / verification I have to register again, it tells me. But not only that, the desktop application has to be signed with a certificate. And not just some random el-cheapo certificate you can get at any mall's discount store. No, this certificate is special. It's precious. This certificate, the 'Microsoft Authenticode' Digital Certificate, is the only certificate that's acceptable, and jolly, it can be purchased from VeriSign for the price of only ... $99.-, but be quick, because this is a limited time offer! After that it's, I kid you not, $499.-. 500 dollars for a certificate to sign an executable. But, I do feel special, I got a special price. Only for me! I'm glowing. Not for long though. Here I started to wonder, what the benefit of it all was. I now again had to pay money for a shiny certificate which will add 'Solutions Design bv' to our installer as the publisher instead of 'unknown', while our customers download the file from our website. Not only that, but this was all about a Desktop application, which wasn't hosted by Microsoft. They only link to it. And make no mistake. These prices aren't single payments. Every year these have to be renewed. Like a membership of an exclusive club: you're special and privileged, but only if you cough up the dough. To give you an example how silly this all is: I added LLBLGen Pro and ORM Profiler to the Visual Studio Gallery some time ago. It's the same thing: it's a central place where one can find software which adds to / extends / works with Visual Studio. I could simply create the pages, add the information and they show up inside Visual Studio. No files are hosted at Microsoft, they're downloaded from our website. Exactly the same system. As I have to wait for the CC transcripts to arrive anyway, I can't proceed with publishing in this new shiny store. After the verification is complete I have to wait for verification of my software by Microsoft. Even Desktop applications need to be verified using a long list of rules which are mainly focused on Metro-style applications. Even while they're not hosted by Microsoft. I wonder what they'll find. "Your application wasn't approved. It violates rule 14 X sub D: it provides more value than our own competing framework". While I was writing this post, I tried to check something in the Windows Store Dashboard, to see whether I remembered it correctly. I was presented again with the question, after logging in with my live account, to enter the code that was just mailed to me. Not the previous code, a brand new one. Again I had to kick my mail server to pull the email to proceed. This was it. This 'experience' is so beyond miserable, I'm afraid I have to say goodbye for now to the 'Windows Store'. It's simply not worth my time. Now, about live accounts. You might know this: live accounts are tied to everything you do with Microsoft. So if you have an MSDN subscription, e.g. the one which costs over $5000.-, it's tied to this same live account. But the fun thing is, you can login with your live account to the MSDN subscriptions with just the account id and password. No additional code is mailed to you. While it gives you access to all Microsoft software available, including your licenses. Why the draconian security theater with this Windows Store, while all I want is to publish some desktop applications while on other Microsoft sites it's OK to simply sign in with your live account: no codes needed, no verification and no certificates? Microsoft, one thing you need with this store and that's: apps. Apps, apps, apps, apps, aaaaaaaaapps. Sorry, my bad, got carried away. I just can't stand the word 'app'. This store's shelves have to be filled to the brim with goods. But instead of being welcomed into the store with open arms, I have to fight an uphill battle with an endless list of rules and bullshit to earn the privilege to publish in this shiny store. As if I have to be thrilled to be one of the exclusive club called 'Windows Store Publishers'. As if Microsoft doesn't want it to succeed. Craig Stuntz sent me a link to an old blog post of his regarding code signing and uploading to Microsoft's old mobile store from back in the WinMo5 days: http://blogs.teamb.com/craigstuntz/2006/10/11/28357/. Good read and good background info about how little things changed over the years. I hope this helps Microsoft make things more clearer and smoother and also helps ISVs with their decision whether to go with the Windows Store scheme or ignore it. For now, I don't see the advantage of publishing there, especially not with the nonsense rules Microsoft cooked up. Perhaps it changes in the future, who knows.

Read the article
I never really understood: what is CGI?

- by claws

CGI is a Comman Gateway Interface. As the name says, it is a "common" gateway interface for everything. It is so trivial and naive from the name. I feel that I understood this and I felt this every time I encountered this word. But frankly, I didn't. I'm still confused. I am a PHP programmer. I did lot of web development. user (client) request for page --- webserver(-embedded PHP interpreter) ---- Server side(PHP) Script --- MySQL Server. Now say my PHP Script can fetch results from MySQL Server && MATLAB Server && Some other server. So, now PHP Script is the CGI? because its interface for the between webserver & All other servers? I don't know. Sometimes they call CGI, a technology & othertimes they call CGI a program or someother server. What exactly is CGI? Whats the big deal with /cgi-bin/*.cgi? Whats up with this? I don't know what is this cgi-bin directory on the server for. I don't know why they have *.cgi extensions. Why does Perl always comes in the way. CGI & Perl (language). I also don't know whats up with these two. Almost all the time I keep hearing these two in combination "CGI & Perl". This book is another great example CGI Programming with Perl Why not "CGI Programming with PHP/JSP/ASP". I never saw such things. CGI Programming in C this confuses me a lot. in C?? Seriously?? I don't know what to say. I"m just confused. "in C"?? This changes everything. Program needs to be compiled and executed. This entirely changes my view of web programming. When do I compile? How does the program gets executed (because it will be a machine code, so it must execute as a independent process). How does it communicate with the web server? IPC? and interfacing with all the servers (in my example MATLAB & MySQL) using socket programming? I'm lost!! They say that CGI is depreciated. Its no more in use. Is it so? What is its latest update? Once, I ran into a situation where I had to give HTTP PUT request access to web server (Apache HTTPD). Its a long back. So, as far as I remember this is what I did: Edited the configuration file of Apache HTTPD to tell webserver to pass all HTTP PUT requests to some put.php ( I had to write this PHP script) Implement put.php to handle the request (save the file to the location mentioned) People said that I wrote a CGI Script. Seriously, I didn't have clue what they were talking about. Did I really write CGI Script? I hope you understood what my confusion is. (Because I myself don't know where I'm confused). I request you guys to keep your answer as simple as possible. I really can't understand any fancy technical terminology. At least not in this case. EDIT: I found this amazing tutorial "CGI Programming Is Simple!" - CGI Tutorial Which explains the concepts in simplest possible way. I've only have one complaint about this tutorial. Just to make what ever he explained complete he should have shown the C code he used for generating response for those GET / POST requests. I've also added link to this tutorial to Wikipedia's article : http://en.wikipedia.org/wiki/Common_Gateway_Interface

Read the article
Inside the Concurrent Collections: ConcurrentDictionary

- by Simon Cooper

Using locks to implement a thread-safe collection is rather like using a sledgehammer - unsubtle, easy to understand, and tends to make any other tool redundant. Unlike the previous two collections I looked at, ConcurrentStack and ConcurrentQueue, ConcurrentDictionary uses locks quite heavily. However, it is careful to wield locks only where necessary to ensure that concurrency is maximised. This will, by necessity, be a higher-level look than my other posts in this series, as there is quite a lot of code and logic in ConcurrentDictionary. Therefore, I do recommend that you have ConcurrentDictionary open in a decompiler to have a look at all the details that I skip over. The problem with locks There's several things to bear in mind when using locks, as encapsulated by the lock keyword in C# and the System.Threading.Monitor class in .NET (if you're unsure as to what lock does in C#, I briefly covered it in my first post in the series): Locks block threads The most obvious problem is that threads waiting on a lock can't do any work at all. No preparatory work, no 'optimistic' work like in ConcurrentQueue and ConcurrentStack, nothing. It sits there, waiting to be unblocked. This is bad if you're trying to maximise concurrency. Locks are slow Whereas most of the methods on the Interlocked class can be compiled down to a single CPU instruction, ensuring atomicity at the hardware level, taking out a lock requires some heavy lifting by the CLR and the operating system. There's quite a bit of work required to take out a lock, block other threads, and wake them up again. If locks are used heavily, this impacts performance. Deadlocks When using locks there's always the possibility of a deadlock - two threads, each holding a lock, each trying to aquire the other's lock. Fortunately, this can be avoided with careful programming and structured lock-taking, as we'll see. So, it's important to minimise where locks are used to maximise the concurrency and performance of the collection. Implementation As you might expect, ConcurrentDictionary is similar in basic implementation to the non-concurrent Dictionary, which I studied in a previous post. I'll be using some concepts introduced there, so I recommend you have a quick read of it. So, if you were implementing a thread-safe dictionary, what would you do? The naive implementation is to simply have a single lock around all methods accessing the dictionary. This would work, but doesn't allow much concurrency. Fortunately, the bucketing used by Dictionary allows a simple but effective improvement to this - one lock per bucket. This allows different threads modifying different buckets to do so in parallel. Any thread making changes to the contents of a bucket takes the lock for that bucket, ensuring those changes are thread-safe. The method that maps each bucket to a lock is the GetBucketAndLockNo method: private void GetBucketAndLockNo( int hashcode, out int bucketNo, out int lockNo, int bucketCount) { // the bucket number is the hashcode (without the initial sign bit) // modulo the number of buckets bucketNo = (hashcode & 0x7fffffff) % bucketCount; // and the lock number is the bucket number modulo the number of locks lockNo = bucketNo % m_locks.Length; } However, this does require some changes to how the buckets are implemented. The 'implicit' linked list within a single backing array used by the non-concurrent Dictionary adds a dependency between separate buckets, as every bucket uses the same backing array. Instead, ConcurrentDictionary uses a strict linked list on each bucket: This ensures that each bucket is entirely separate from all other buckets; adding or removing an item from a bucket is independent to any changes to other buckets. Modifying the dictionary All the operations on the dictionary follow the same basic pattern: void AlterBucket(TKey key, ...) { int bucketNo, lockNo; 1: GetBucketAndLockNo( key.GetHashCode(), out bucketNo, out lockNo, m_buckets.Length); 2: lock (m_locks[lockNo]) { 3: Node headNode = m_buckets[bucketNo]; 4: Mutate the node linked list as appropriate } } For example, when adding another entry to the dictionary, you would iterate through the linked list to check whether the key exists already, and add the new entry as the head node. When removing items, you would find the entry to remove (if it exists), and remove the node from the linked list. Adding, updating, and removing items all follow this pattern. Performance issues There is a problem we have to address at this point. If the number of buckets in the dictionary is fixed in the constructor, then the performance will degrade from O(1) to O(n) when a large number of items are added to the dictionary. As more and more items get added to the linked lists in each bucket, the lookup operations will spend most of their time traversing a linear linked list. To fix this, the buckets array has to be resized once the number of items in each bucket has gone over a certain limit. (In ConcurrentDictionary this limit is when the size of the largest bucket is greater than the number of buckets for each lock. This check is done at the end of the TryAddInternal method.) Resizing the bucket array and re-hashing everything affects every bucket in the collection. Therefore, this operation needs to take out every lock in the collection. Taking out mutiple locks at once inevitably summons the spectre of the deadlock; two threads each hold a lock, and each trying to acquire the other lock. How can we eliminate this? Simple - ensure that threads never try to 'swap' locks in this fashion. When taking out multiple locks, always take them out in the same order, and always take out all the locks you need before starting to release them. In ConcurrentDictionary, this is controlled by the AcquireLocks, AcquireAllLocks and ReleaseLocks methods. Locks are always taken out and released in the order they are in the m_locks array, and locks are all released right at the end of the method in a finally block. At this point, it's worth pointing out that the locks array is never re-assigned, even when the buckets array is increased in size. The number of locks is fixed in the constructor by the concurrencyLevel parameter. This simplifies programming the locks; you don't have to check if the locks array has changed or been re-assigned before taking out a lock object. And you can be sure that when a thread takes out a lock, another thread isn't going to re-assign the lock array. This would create a new series of lock objects, thus allowing another thread to ignore the existing locks (and any threads controlling them), breaking thread-safety. Consequences of growing the array Just because we're using locks doesn't mean that race conditions aren't a problem. We can see this by looking at the GrowTable method. The operation of this method can be boiled down to: private void GrowTable(Node[] buckets) { try { 1: Acquire first lock in the locks array // this causes any other thread trying to take out // all the locks to block because the first lock in the array // is always the one taken out first // check if another thread has already resized the buckets array // while we were waiting to acquire the first lock 2: if (buckets != m_buckets) return; 3: Calculate the new size of the backing array 4: Node[] array = new array[size]; 5: Acquire all the remaining locks 6: Re-hash the contents of the existing buckets into array 7: m_buckets = array; } finally { 8: Release all locks } } As you can see, there's already a check for a race condition at step 2, for the case when the GrowTable method is called twice in quick succession on two separate threads. One will successfully resize the buckets array (blocking the second in the meantime), when the second thread is unblocked it'll see that the array has already been resized & exit without doing anything. There is another case we need to consider; looking back at the AlterBucket method above, consider the following situation: Thread 1 calls AlterBucket; step 1 is executed to get the bucket and lock numbers. Thread 2 calls GrowTable and executes steps 1-5; thread 1 is blocked when it tries to take out the lock in step 2. Thread 2 re-hashes everything, re-assigns the buckets array, and releases all the locks (steps 6-8). Thread 1 is unblocked and continues executing, but the calculated bucket and lock numbers are no longer valid. Between calculating the correct bucket and lock number and taking out the lock, another thread has changed where everything is. Not exactly thread-safe. Well, a similar problem was solved in ConcurrentStack and ConcurrentQueue by storing a local copy of the state, doing the necessary calculations, then checking if that state is still valid. We can use a similar idea here: void AlterBucket(TKey key, ...) { while (true) { Node[] buckets = m_buckets; int bucketNo, lockNo; GetBucketAndLockNo( key.GetHashCode(), out bucketNo, out lockNo, buckets.Length); lock (m_locks[lockNo]) { // if the state has changed, go back to the start if (buckets != m_buckets) continue; Node headNode = m_buckets[bucketNo]; Mutate the node linked list as appropriate } break; } } TryGetValue and GetEnumerator And so, finally, we get onto TryGetValue and GetEnumerator. I've left these to the end because, well, they don't actually use any locks. How can this be? Whenever you change a bucket, you need to take out the corresponding lock, yes? Indeed you do. However, it is important to note that TryGetValue and GetEnumerator don't actually change anything. Just as immutable objects are, by definition, thread-safe, read-only operations don't need to take out a lock because they don't change anything. All lockless methods can happily iterate through the buckets and linked lists without worrying about locking anything. However, this does put restrictions on how the other methods operate. Because there could be another thread in the middle of reading the dictionary at any time (even if a lock is taken out), the dictionary has to be in a valid state at all times. Every change to state has to be made visible to other threads in a single atomic operation (all relevant variables are marked volatile to help with this). This restriction ensures that whatever the reading threads are doing, they never read the dictionary in an invalid state (eg items that should be in the collection temporarily removed from the linked list, or reading a node that has had it's key & value removed before the node itself has been removed from the linked list). Fortunately, all the operations needed to change the dictionary can be done in that way. Bucket resizes are made visible when the new array is assigned back to the m_buckets variable. Any additions or modifications to a node are done by creating a new node, then splicing it into the existing list using a single variable assignment. Node removals are simply done by re-assigning the node's m_next pointer. Because the dictionary can be changed by another thread during execution of the lockless methods, the GetEnumerator method is liable to return dirty reads - changes made to the dictionary after GetEnumerator was called, but before the enumeration got to that point in the dictionary. It's worth listing at this point which methods are lockless, and which take out all the locks in the dictionary to ensure they get a consistent view of the dictionary: Lockless: TryGetValue GetEnumerator The indexer getter ContainsKey Takes out every lock (lockfull?): Count IsEmpty Keys Values CopyTo ToArray Concurrent principles That covers the overall implementation of ConcurrentDictionary. I haven't even begun to scratch the surface of this sophisticated collection. That I leave to you. However, we've looked at enough to be able to extract some useful principles for concurrent programming: Partitioning When using locks, the work is partitioned into independant chunks, each with its own lock. Each partition can then be modified concurrently to other partitions. Ordered lock-taking When a method does need to control the entire collection, locks are taken and released in a fixed order to prevent deadlocks. Lockless reads Read operations that don't care about dirty reads don't take out any lock; the rest of the collection is implemented so that any reading thread always has a consistent view of the collection. That leads us to the final collection in this little series - ConcurrentBag. Lacking a non-concurrent analogy, it is quite different to any other collection in the class libraries. Prepare your thinking hats!

Read the article
Techniques for modeling a dynamic dataflow with Java concurrency API

- by Maian

Is there an elegant way to model a dynamic dataflow in Java? By dataflow, I mean there are various types of tasks, and these tasks can be "connected" arbitrarily, such that when a task finishes, successor tasks are executed in parallel using the finished tasks output as input, or when multiple tasks finish, their output is aggregated in a successor task (see flow-based programming). By dynamic, I mean that the type and number of successors tasks when a task finishes depends on the output of that finished task, so for example, task A may spawn task B if it has a certain output, but may spawn task C if has a different output. Another way of putting it is that each task (or set of tasks) is responsible for determining what the next tasks are. Sample dataflow for rendering a webpage: I have as task types: file downloader, HTML/CSS renderer, HTML parser/DOM builder, image renderer, JavaScript parser, JavaScript interpreter. File downloader task for HTML file HTML parser/DOM builder task File downloader task for each embedded file/link If image, image renderer If external JavaScript, JavaScript parser JavaScript interpreter Otherwise, just store in some var/field in HTML parser task JavaScript parser for each embedded script JavaScript interpreter Wait for above tasks to finish, then HTML/CSS renderer (obviously not optimal or perfectly correct, but this is simple) I'm not saying the solution needs to be some comprehensive framework (in fact, the closer to the JDK API, the better), and I absolutely don't want something as heavyweight is say Spring Web Flow or some declarative markup or other DSL. To be more specific, I'm trying to think of a good way to model this in Java with Callables, Executors, ExecutorCompletionServices, and perhaps various synchronizer classes (like Semaphore or CountDownLatch). There are a couple use cases and requirements: Don't make any assumptions on what executor(s) the tasks will run on. In fact, to simplify, just assume there's only one executor. It can be a fixed thread pool executor, so a naive implementation can result in deadlocks (e.g. imagine a task that submits another task and then blocks until that subtask is finished, and now imagine several of these tasks using up all the threads). To simplify, assume that the data is not streamed between tasks (task output-succeeding task input) - the finishing task and succeeding task won't exist together, so the input data to the succeeding task will not be changed by the preceeding task (since it's already done). There are only a couple operations that the dataflow "engine" should be able to handle: A mechanism where a task can queue more tasks A mechanism whereby a successor task is not queued until all the required input tasks are finished A mechanism whereby the main thread (or other threads not managed by the executor) blocks until the flow is finished A mechanism whereby the main thread (or other threads not managed by the executor) blocks until certain tasks have finished Since the dataflow is dynamic (depends on input/state of the task), the activation of these mechanisms should occur within the task code, e.g. the code in a Callable is itself responsible for queueing more Callables. The dataflow "internals" should not be exposed to the tasks (Callables) themselves - only the operations listed above should be available to the task. Note that the type of the data is not necessarily the same for all tasks, e.g. a file download task may accept a File as input but will output a String. If a task throws an uncaught exception (indicating some fatal error requiring all dataflow processing to stop), it must propagate up to the thread that initiated the dataflow as quickly as possible and cancel all tasks (or something fancier like a fatal error handler). Tasks should be launched as soon as possible. This along with the previous requirement should preclude simple Future polling + Thread.sleep(). As a bonus, I would like to dataflow engine itself to perform some action (like logging) every time task is finished or when no has finished in X time since last task has finished. Something like: ExecutorCompletionService<T> ecs; while (hasTasks()) { Future<T> future = ecs.poll(1 minute); some_action_like_logging(); if (future != null) { future.get() ... } ... } Are there straightforward ways to do all this with Java concurrency API? Or if it's going to complex no matter what with what's available in the JDK, is there a lightweight library that satisfies the requirements? I already have a partial solution that fits my particular use case (it cheats in a way, since I'm using two executors, and just so you know, it's not related at all to the web browser example I gave above), but I'd like to see a more general purpose and elegant solution.

Read the article
CodePlex Daily Summary for Tuesday, November 08, 2011

CodePlex Daily Summary for Tuesday, November 08, 2011Popular ReleasesFacebook C# SDK: v5.3.2: This is a RTW release which adds new features and bug fixes to v5.2.1. Query/QueryAsync methods uses graph api instead of legacy rest api. removed dependency from Code Contracts enabled Task Parallel Support in .NET 4.0+ (experimental) added support for early preview for .NET 4.5 (binaries not distributed in codeplex nor nuget.org, will need to manually build from Facebook-Net45.sln) added additional method overloads for .NET 4.5 to support IProgress<T> for upload progress added ne...Delete Inactive TS Ports: List and delete the Inactive TS Ports: List and delete the Inactive TS Ports - The InactiveTSPortList.EXE accepts command line arguments The InactiveTSPortList.Standalone.WithoutPrompt.exe runs as a standalone exe without the need for any command line arguments.ClosedXML - The easy way to OpenXML: ClosedXML 0.60.0: Added almost full support for auto filters (missing custom date filters). See examples Filter Values, Custom Filters Fixed issues 7016, 7391, 7388, 7389, 7198, 7196, 7194, 7186, 7067, 7115, 7144Microsoft Research Boogie: Nightly builds: This download category contains automatically released nightly builds, reflecting the current state of Boogie's development. We try to make sure each nightly build passes the test suite. If you suspect that was not the case, please try the previous nightly build to see if that really is the problem. Also, please see the installation instructions.DotSpatial: Release 821: Updated releaseGoogleMap Control: GoogleMap Control 6.0: Major design changes to the control in order to achieve better scalability and extensibility for the new features comming with GoogleMaps API. GoogleMap control switched to GoogleMaps API v3 and .NET 4.0. GoogleMap control is 100% ScriptControl now, it requires ScriptManager to be registered on the pages where and before it is used. Markers, polylines, polygons and directions were implemented as ExtenderControl, instead of being inner properties of GoogleMap control. Better perfomance. Better...WDTVHubGen - Adds Metadata, thumbnails and subtitles to WDTV Live Hubs: V2.1: Version 2.1 (click on the right) this uses V4.0 of .net Version 2.1 adds the following features: (apologize if I forget some, added a lot of little things) Manual Lookup with TV or Movie (finally huh!), you can look up a movie or TV episode directly, you can right click on anythign, and choose manual lookup, then will allow you to type anything you want to look up and it will assign it to the file you right clicked. No Rename: a very popular request, this is an option you can set so that t...SubExtractor: Release 1020: Feature: added "baseline double quotes" character to selector box Feature: added option to save SRT files as ANSI (instead of previous UTF-8 only) Feature: made "Save Sup files to Source directory" apply to both Sup and Idx source files. Fix: removed SDH text (...) or [...] that is split over 2 lines Fix: better decision-making in when to prefix a line with a '-' because SDH was removedAcDown????? - Anime&Comic Downloader: AcDown????? v3.6.1: ?? ● AcDown??????????、??????，??????????????????????，???????Acfun、Bilibili、???、???、???、Tucao.cc、SF???、?????80????，???????????、?????????。 ● AcDown???????????????????????????，???，???????????????????。 ● AcDown???????C#??，????.NET Framework 2.0??。?????"Acfun?????"。 ????32??64? Windows XP/Vista/7 ????????????? ??：????????Windows XP???,?????????.NET Framework 2.0???(x86)?.NET Framework 2.0???(x64)，?????"?????????"??? ??????????????，??????????： ??"AcDown?????"????????? ?? v3.6.1?? ??.hlv...Track Folder Changes: Track Folder Changes 1.1: Fixed exception when right-clicking the root nodeMapWindow 4: MapWindow GIS v4.8.6 - Final release - 32Bit: This is the final release of MapWindow v4.8. It has 4.8.6 as version number. This version has been thoroughly tested. If you do get an exception send the exception to us. Don't forget to include your e-mail address. Use the forums at http://www.mapwindow.org/phorum/ for questions. Please consider donating a small portion of the money you have saved by having free GIS tools: http://www.mapwindow.org/pages/donate.php What’s New in 4.8.6 (Final release) · A few minor issues have been fixed Wha...Kinect Paint: Kinect Paint 1.1: Updated for Kinect for Windows SDK v1.0 Beta 2!Kinect Mouse Cursor: Kinect Mouse Cursor 1.1: Updated for Kinect for Windows SDK v1.0 Beta 2!Coding4Fun Kinect Toolkit: Coding4Fun Kinect Toolkit 1.1: Updated for Kinect for Windows SDK v1.0 Beta 2!Async Executor: 1.0: Source code of the AsyncExecutorMedia Companion: MC 3.421b Weekly: Ensure .NET 4.0 Full Framework is installed. (Available from http://www.microsoft.com/download/en/details.aspx?id=17718) Ensure the NFO ID fix is applied when transitioning from versions prior to 3.416b. (Details here) TV Show Resolutions... Fix to show the season-specials.tbn when selecting an episode from season 00. Before, MC would try & load season00.tbn Fix for issue #197 - new show added by 'Manually Add Path' not being picked up. Also made non-visible the same thing in Root Folders...?????????? - ????????: All-In-One Code Framework ??? 2011-11-02: http://download.codeplex.com/Project/Download/FileDownload.aspx?ProjectName=1codechs&DownloadId=216140 ??????，11??，?????20????Microsoft OneCode Sample，????6?Program Language Sample，2?Windows Base Sample，2?GDI+ Sample，4?Internet Explorer Sample?6?ASP.NET Sample。?????????????。 ????，?????。http://i3.codeplex.com/Project/Download/FileDownload.aspx?ProjectName=1code&DownloadId=128165 Program Language CSImageFullScreenSlideShow VBImageFullScreenSlideShow CSDynamicallyBuildLambdaExpressionWithFie...Python Tools for Visual Studio: 1.1 Alpha: We’re pleased to announce the release of Python Tools for Visual Studio 1.1 Alpha. Python Tools for Visual Studio (PTVS) is an open-source plug-in for Visual Studio which supports programming with the Python programming language. This release includes new core IDE features, a couple of new sample libraries for interacting with Kinect and Excel, and many bug fixes for issues reported since the release of 1.0. For the core IDE features we’ve added many new features which improve the basic edit...BExplorer (Better Explorer): Better Explorer 2.0.0.631 Alpha: Changelog: Added: Some new functions in ribbon Added: Possibility to choose displayed columns Added: Basic Search Fixed: Some bugs after navigation Fixed: Attempt to fix slow navigation and slow start Known issues: - BreadcrumbBar fails on some situations - Basic search does not work well in some situations Please if anyone finds bugs be kind and report them at the Issue Tracker! Thanks!DotNetNuke® Community Edition: 05.06.04: Major Highlights Fixed issue with upgrades on systems that had upgraded the Telerik library to 6.0.0 Fixed issue with Razor Host upgrade to 5.6.3 The logic for module administration checks contains incorrect logic in 1 place, opening the possibility of a user with edit permissions gaining access to functionality they should not have through a particularly crafted url Security FixesBrowsers support the ability to remember common strings such as usernames/addresses etc. Code was adde...New ProjectsAddThis for DotNetNuke: A simple AddThis module (plugin) for DotNetNukeAgnisExchange: <AGNIS>as A Growable Network Infor;ation System is developped in <C#>Blogger auto poster: Do you want to have a links blog? Share your favorites links in Google+ and then allow BAP(Blogger auto poster) to read its JSON feed and post its daily summary at your Blogger weblog automatically. CarbonEmissions: CarbonEmissionsCredivale: CredivaleCRM 2011 TreeView for Lookup: CRM 2011 WebResource Utility for showing Self-Joined Lookup data in TreeView form.Dafuscator: Dafuscator is a database data obfuscation system that allows you to tactically obfuscate or delete data out of your production database while leaving most of the data intact.DigDesDevSchoolHomeSaleSystem: This is study project. Done by Russian students. Use ASP.NET MVC.Dynamic Data Extensions: Dynamic Data Extensions add new features to the current ASP.Net 4 Dynamic Data versionEchosoft NoCrash From Scratch: Nocrash From scratch, a operating system designed (but not promised to be) almost invincible from viruses and %20 Windows-like using DoubleTime emulation to make it act like Windows 2000EdiProj: Just a PHP and Asp.Net project.experteer: decision theory class projectFlan Project: Fast-paced Action-RPG (2D Scrolling)GameXNA_Master: XNA MASETR GD 2011 Gyrf: SecretJogo Da Velha em CSharp: Jogo da velha feito para o curso de c#.Jogo das cores: Projeto da aula de extensão de C#Kookaburra: Kookaburra is a framework based on Windows PowerShell 2.0 that can be used for automating SharePoint 2010 administration. It contains a menu, several functions and a well defined structure for adding PowerShell scriptlets.LaunchWithParams: LaunchWithParams is a Windows Explorer add-on that allows you to launch an application executable with specific command-line parameters without having to open a command prompt.MerchantTribe - Shopping Cart and Ecommerce Platform in C# MVC ASP.NET: MerchantTribe is a free, open-source ASP.NET MVC ecommerce platform in C#. For more than a decade, we've been building and selling commercial .NET shopping cart software. Now, we're releasing an open-source project based on our award winning BV Commerce software. Thousands of companies use our shopping cart include Pebble Beach Resorts, DataPipe, TastyKake, MathBlaster and Chesapeake Energy. We need as many people as possible using MerchantTribe for it to thrive. We need Merchants, De...NaiveAlgorithms: Naive C# implementation of basic, general purpose algorithms, with an emphasis on readability, maintainability and correctness. While asymptotic complexity of algorithms is preserved, there is no premature optimisation performed to improve performance. NeonMikaWebserver: NeonMikaWebserver is easy to set up on your .Net MF device, as long as it has an ethernet port. It's espacially created for the Netduino, but I think it's no big work to port it to other plattforms. It provides the following functionalities: -XML Responses ... Just create an Instance of XMLResponse, give it an URI to listen for and fill a Hashtable with your response keys and values... Voila ;) -File Response ... No XML Response fitting your URL? Probably you want to watch a file sa...Next Action: The Next Action web application is a GTD task manager implemented with using MVC3, jQuery, Entity Framework. Orchard Comments Refactoring: Refactoring Orchard's comments moduleoutsource: outsourceScutex: Scutex, pronounced (sec-u-techs), is a 100% managed .Net Framework licensing platform for your applications. Scutex is a flexible licensing system allowing multiple licensing schemes allowing you the most control over how you licensing your products. Unlike any other licensing system on the market, Scutex provides 4 distinct licensing schemes, allowing you to protect your products at different levels using completely different licensing schemes, key types and protection systems. Using Scut...sigem2: Proyecto sigem2SSDL: An easy way to call and manage stored procedures in .NET.Trabalho C# - Datilografia: Software de datilografia produzido para o curso de extensão em C# da UFSCar Sorocaba.triliporra: Triliporra Euro 2012User Interface Toolkit: Ever dreamed about having one framework to write UI independently of your targetted client ? Now your dream has come true !vimrc: my vimrc fileVisuospatial Web Browsing using Kinect: The Kinect Web Browser project explores how users might use the Kinect sensor as a natural interface to browse the visual web. It's developed in C#.VNManager: VNManager or Version Number Manager is a simple Windows command line application which will update your Microsoft Visual Studio AssemblyInfo.cs files AssemblyVersion and AssemblyFileVersion attributes based on rules defined as comments on the same lines.WeddingAssistant: My site will be a ‘gift wish’ site for couples getting married. It will also be a place where people can come and look at ideas for hen and stag nights. Users log in and create a list of items they wish to receive for their wedding or look for ideas for hen and stag nights. The site will allow users to compile a list of items they have found on the internet from specific listed shops and order them in preference. The compiled list is then sent out to friends and relatives who can create a...WMS.NET: thewms is warehouse manager system! For the logistics industry! The team learning project! Built on the mvc3.0 and wcf ! Using jquery js framework as front endWorkflowRobot: Workflow based automation software.

Read the article
Finding the heaviest length-constrained path in a weighted Binary Tree

- by Hristo

UPDATE I worked out an algorithm that I think runs in O(n*k) running time. Below is the pseudo-code: routine heaviestKPath( T, k ) // create 2D matrix with n rows and k columns with each element = -8 // we make it size k+1 because the 0th column must be all 0s for a later // function to work properly and simplicity in our algorithm matrix = new array[ T.getVertexCount() ][ k + 1 ] (-8); // set all elements in the first column of this matrix = 0 matrix[ n ][ 0 ] = 0; // fill our matrix by traversing the tree traverseToFillMatrix( T.root, k ); // consider a path that would arc over a node globalMaxWeight = -8; findArcs( T.root, k ); return globalMaxWeight end routine // node = the current node; k = the path length; node.lc = node’s left child; // node.rc = node’s right child; node.idx = node’s index (row) in the matrix; // node.lc.wt/node.rc.wt = weight of the edge to left/right child; routine traverseToFillMatrix( node, k ) if (node == null) return; traverseToFillMatrix(node.lc, k ); // recurse left traverseToFillMatrix(node.rc, k ); // recurse right // in the case that a left/right child doesn’t exist, or both, // let’s assume the code is smart enough to handle these cases matrix[ node.idx ][ 1 ] = max( node.lc.wt, node.rc.wt ); for i = 2 to k { // max returns the heavier of the 2 paths matrix[node.idx][i] = max( matrix[node.lc.idx][i-1] + node.lc.wt, matrix[node.rc.idx][i-1] + node.rc.wt); } end routine // node = the current node, k = the path length routine findArcs( node, k ) if (node == null) return; nodeMax = matrix[node.idx][k]; longPath = path[node.idx][k]; i = 1; j = k-1; while ( i+j == k AND i < k ) { left = node.lc.wt + matrix[node.lc.idx][i-1]; right = node.rc.wt + matrix[node.rc.idx][j-1]; if ( left + right > nodeMax ) { nodeMax = left + right; } i++; j--; } // if this node’s max weight is larger than the global max weight, update if ( globalMaxWeight < nodeMax ) { globalMaxWeight = nodeMax; } findArcs( node.lc, k ); // recurse left findArcs( node.rc, k ); // recurse right end routine Let me know what you think. Feedback is welcome. I think have come up with two naive algorithms that find the heaviest length-constrained path in a weighted Binary Tree. Firstly, the description of the algorithm is as follows: given an n-vertex Binary Tree with weighted edges and some value k, find the heaviest path of length k. For both algorithms, I'll need a reference to all vertices so I'll just do a simple traversal of the Tree to have a reference to all vertices, with each vertex having a reference to its left, right, and parent nodes in the tree. Algorithm 1 For this algorithm, I'm basically planning on running DFS from each node in the Tree, with consideration to the fixed path length. In addition, since the path I'm looking for has the potential of going from left subtree to root to right subtree, I will have to consider 3 choices at each node. But this will result in a O(n*3^k) algorithm and I don't like that. Algorithm 2 I'm essentially thinking about using a modified version of Dijkstra's Algorithm in order to consider a fixed path length. Since I'm looking for heaviest and Dijkstra's Algorithm finds the lightest, I'm planning on negating all edge weights before starting the traversal. Actually... this doesn't make sense since I'd have to run Dijkstra's on each node and that doesn't seem very efficient much better than the above algorithm. So I guess my main questions are several. Firstly, do the algorithms I've described above solve the problem at hand? I'm not totally certain the Dijkstra's version will work as Dijkstra's is meant for positive edge values. Now, I am sure there exist more clever/efficient algorithms for this... what is a better algorithm? I've read about "Using spine decompositions to efficiently solve the length-constrained heaviest path problem for trees" but that is really complicated and I don't understand it at all. Are there other algorithms that tackle this problem, maybe not as efficiently as spine decomposition but easier to understand? Thanks.

Read the article
How would you go about tackling this problem? [SOLVED in C++]

- by incrediman

Intro: EDIT: See solution at the bottom of this question (c++) I have a programming contest coming up in about half a week, and I've been prepping :) I found a bunch of questions from this canadian competition, they're great practice: http://cemc.math.uwaterloo.ca/contests/computing/2009/stage2/day1.pdf I'm looking at problem B ("Dinner"). Any idea where to start? I can't really think of anything besides the naive approach (ie. trying all permutations) which would take too long to be a valid answer. Btw, the language there says c++ and pascal I think, but i don't care what language you use - I mean really all I want is a hint as to the direction I should proceed in, and perhpas a short explanation to go along with it. It feels like I'm missing something obvious... Of course extended speculation is more than welcome, but I just wanted to clarify that I'm not looking for a full solution here :) Short version of the question: You have a binary string N of length 1-100 (in the question they use H's and G's instead of one's and 0's). You must remove all of the digits from it, in the least number of steps possible. In each step you may remove any number of adjacent digits so long as they are the same. That is, in each step you can remove any number of adjacent G's, or any number of adjacent H's, but you can't remove H's and G's in one step. Example: HHHGHHGHH Solution to the example: 1. HHGGHH (remove middle Hs) 2. HHHH (remove middle Gs) 3. Done (remove Hs) -->Would return '3' as the answer. Note that there can also be a limit placed on how large adjacent groups have to be when you remove them. For example it might say '2', and then you can't remove single digits (you'd have to remove pairs or larger groups at a time). Solution I took Mark Harrison's main algorithm, and Paradigm's grouping idea and used them to create the solution below. You can try it out on the official test cases if you want. //B.cpp //include debug messages? #define DEBUG false #include <iostream> #include <stdio.h> #include <vector> using namespace std; #define FOR(i,n) for (int i=0;i<n;i++) #define FROM(i,s,n) for (int i=s;i<n;i++) #define H 'H' #define G 'G' class String{ public: int num; char type; String(){ type=H; num=0; } String(char type){ this->type=type; num=1; } }; //n is the number of bits originally in the line //k is the minimum number of people you can remove at a time //moves is the counter used to determine how many moves we've made so far int n, k, moves; int main(){ /*Input from File*/ scanf("%d %d",&n,&k); char * buffer = new char[200]; scanf("%s",buffer); /*Process input into a vector*/ //the 'line' is a vector of 'String's (essentially contigious groups of identical 'bits') vector<String> line; line.push_back(String()); FOR(i,n){ //if the last String is of the correct type, simply increment its count if (line.back().type==buffer[i]) line.back().num++; //if the last String is of the wrong type but has a 0 count, correct its type and set its count to 1 else if (line.back().num==0){ line.back().type=buffer[i]; line.back().num=1; } //otherwise this is the beginning of a new group, so create the new group at the back with the correct type, and a count of 1 else{ line.push_back(String(buffer[i])); } } /*Geedily remove groups until there are at most two groups left*/ moves=0; int I;//the position of the best group to remove int bestNum;//the size of the newly connected group the removal of group I will create while (line.size()>2){ /*START DEBUG*/ if (DEBUG){ cout<<"\n"<<moves<<"\n----\n"; FOR(i,line.size()) printf("%d %c \n",line[i].num,line[i].type); cout<<"----\n"; } /*END DEBUG*/ I=1; bestNum=-1; FROM(i,1,line.size()-1){ if (line[i-1].num+line[i+1].num>bestNum && line[i].num>=k){ bestNum=line[i-1].num+line[i+1].num; I=i; } } //remove the chosen group, thus merging the two adjacent groups line[I-1].num+=line[I+1].num; line.erase(line.begin()+I);line.erase(line.begin()+I); moves++; } /*START DEBUG*/ if (DEBUG){ cout<<"\n"<<moves<<"\n----\n"; FOR(i,line.size()) printf("%d %c \n",line[i].num,line[i].type); cout<<"----\n"; cout<<"\n\nFinal Answer: "; } /*END DEBUG*/ /*Attempt the removal of the last two groups, and output the final result*/ if (line.size()==2 && line[0].num>=k && line[1].num>=k) cout<<moves+2;//success else if (line.size()==1 && line[0].num>=k) cout<<moves+1;//success else cout<<-1;//not everyone could dine. /*START DEBUG*/ if (DEBUG){ cout<<" moves."; } /*END DEBUG*/ }

Read the article
The Incremental Architect´s Napkin – #3 – Make Evolvability inevitable

- by Ralf Westphal

Originally posted on: http://geekswithblogs.net/theArchitectsNapkin/archive/2014/06/04/the-incremental-architectacutes-napkin-ndash-3-ndash-make-evolvability-inevitable.aspxThe easier something to measure the more likely it will be produced. Deviations between what is and what should be can be readily detected. That´s what automated acceptance tests are for. That´s what sprint reviews in Scrum are for. It´s no small wonder our software looks like it looks. It has all the traits whose conformance with requirements can easily be measured. And it´s lacking traits which cannot easily be measured. Evolvability (or Changeability) is such a trait. If an operation is correct, if an operation if fast enough, that can be checked very easily. But whether Evolvability is high or low, that cannot be checked by taking a measure or two. Evolvability might correlate with certain traits, e.g. number of lines of code (LOC) per function or Cyclomatic Complexity or test coverage. But there is no threshold value signalling “evolvability too low”; also Evolvability is hardly tangible for the customer. Nevertheless Evolvability is of great importance - at least in the long run. You can get away without much of it for a short time. Eventually, though, it´s needed like any other requirement. Or even more. Because without Evolvability no other requirement can be implemented. Evolvability is the foundation on which all else is build. Such fundamental importance is in stark contrast with its immeasurability. To compensate this, Evolvability must be put at the very center of software development. It must become the hub around everything else revolves. Since we cannot measure Evolvability, though, we cannot start watching it more. Instead we need to establish practices to keep it high (enough) at all times. Chefs have known that for long. That´s why everybody in a restaurant kitchen is constantly seeing after cleanliness. Hygiene is important as is to have clean tools at standardized locations. Only then the health of the patrons can be guaranteed and production efficiency is constantly high. Still a kitchen´s level of cleanliness is easier to measure than software Evolvability. That´s why important practices like reviews, pair programming, or TDD are not enough, I guess. What we need to keep Evolvability in focus and high is… to continually evolve. Change must not be something to avoid but too embrace. To me that means the whole change cycle from requirement analysis to delivery needs to be gone through more often. Scrum´s sprints of 4, 2 even 1 week are too long. Kanban´s flow of user stories across is too unreliable; it takes as long as it takes. Instead we should fix the cycle time at 2 days max. I call that Spinning. No increment must take longer than from this morning until tomorrow evening to finish. Then it should be acceptance checked by the customer (or his/her representative, e.g. a Product Owner). For me there are several resasons for such a fixed and short cycle time for each increment: Clear expectations Absolute estimates (“This will take X days to complete.”) are near impossible in software development as explained previously. Too much unplanned research and engineering work lurk in every feature. And then pervasive interruptions of work by peers and management. However, the smaller the scope the better our absolute estimates become. That´s because we understand better what really are the requirements and what the solution should look like. But maybe more importantly the shorter the timespan the more we can control how we use our time. So much can happen over the course of a week and longer timespans. But if push comes to shove I can block out all distractions and interruptions for a day or possibly two. That´s why I believe we can give rough absolute estimates on 3 levels: Noon Tonight Tomorrow Think of a meeting with a Product Owner at 8:30 in the morning. If she asks you, how long it will take you to implement a user story or bug fix, you can say, “It´ll be fixed by noon.”, or you can say, “I can manage to implement it until tonight before I leave.”, or you can say, “You´ll get it by tomorrow night at latest.” Yes, I believe all else would be naive. If you´re not confident to get something done by tomorrow night (some 34h from now) you just cannot reliably commit to any timeframe. That means you should not promise anything, you should not even start working on the issue. So when estimating use these four categories: Noon, Tonight, Tomorrow, NoClue - with NoClue meaning the requirement needs to be broken down further so each aspect can be assigned to one of the first three categories. If you like absolute estimates, here you go. But don´t do deep estimates. Don´t estimate dozens of issues; don´t think ahead (“Issue A is a Tonight, then B will be a Tomorrow, after that it´s C as a Noon, finally D is a Tonight - that´s what I´ll do this week.”). Just estimate so Work-in-Progress (WIP) is 1 for everybody - plus a small number of buffer issues. To be blunt: Yes, this makes promises impossible as to what a team will deliver in terms of scope at a certain date in the future. But it will give a Product Owner a clear picture of what to pull for acceptance feedback tonight and tomorrow. Trust through reliability Our trade is lacking trust. Customers don´t trust software companies/departments much. Managers don´t trust developers much. I find that perfectly understandable in the light of what we´re trying to accomplish: delivering software in the face of uncertainty by means of material good production. Customers as well as managers still expect software development to be close to production of houses or cars. But that´s a fundamental misunderstanding. Software development ist development. It´s basically research. As software developers we´re constantly executing experiments to find out what really provides value to users. We don´t know what they need, we just have mediated hypothesises. That´s why we cannot reliably deliver on preposterous demands. So trust is out of the window in no time. If we switch to delivering in short cycles, though, we can regain trust. Because estimates - explicit or implicit - up to 32 hours at most can be satisfied. I´d say: reliability over scope. It´s more important to reliably deliver what was promised then to cover a lot of requirement area. So when in doubt promise less - but deliver without delay. Deliver on scope (Functionality and Quality); but also deliver on Evolvability, i.e. on inner quality according to accepted principles. Always. Trust will be the reward. Less complexity of communication will follow. More goodwill buffer will follow. So don´t wait for some Kanban board to show you, that flow can be improved by scheduling smaller stories. You don´t need to learn that the hard way. Just start with small batch sizes of three different sizes. Fast feedback What has been finished can be checked for acceptance. Why wait for a sprint of several weeks to end? Why let the mental model of the issue and its solution dissipate? If you get final feedback after one or two weeks, you hardly remember what you did and why you did it. Resoning becomes hard. But more importantly youo probably are not in the mood anymore to go back to something you deemed done a long time ago. It´s boring, it´s frustrating to open up that mental box again. Learning is harder the longer it takes from event to feedback. Effort can be wasted between event (finishing an issue) and feedback, because other work might go in the wrong direction based on false premises. Checking finished issues for acceptance is the most important task of a Product Owner. It´s even more important than planning new issues. Because as long as work started is not released (accepted) it´s potential waste. So before starting new work better make sure work already done has value. By putting the emphasis on acceptance rather than planning true pull is established. As long as planning and starting work is more important, it´s a push process. Accept a Noon issue on the same day before leaving. Accept a Tonight issue before leaving today or first thing tomorrow morning. Accept a Tomorrow issue tomorrow night before leaving or early the day after tomorrow. After acceptance the developer(s) can start working on the next issue. Flexibility As if reliability/trust and fast feedback for less waste weren´t enough economic incentive, there is flexibility. After each issue the Product Owner can change course. If on Monday morning feature slices A, B, C, D, E were important and A, B, C were scheduled for acceptance by Monday evening and Tuesday evening, the Product Owner can change her mind at any time. Maybe after A got accepted she asks for continuation with D. But maybe, just maybe, she has gotten a completely different idea by then. Maybe she wants work to continue on F. And after B it´s neither D nor E, but G. And after G it´s D. With Spinning every 32 hours at latest priorities can be changed. And nothing is lost. Because what got accepted is of value. It provides an incremental value to the customer/user. Or it provides internal value to the Product Owner as increased knowledge/decreased uncertainty. I find such reactivity over commitment economically very benefical. Why commit a team to some workload for several weeks? It´s unnecessary at beast, and inflexible and wasteful at worst. If we cannot promise delivery of a certain scope on a certain date - which is what customers/management usually want -, we can at least provide them with unpredecented flexibility in the face of high uncertainty. Where the path is not clear, cannot be clear, make small steps so you´re able to change your course at any time. Premature completion Customers/management are used to premeditating budgets. They want to know exactly how much to pay for a certain amount of requirements. That´s understandable. But it does not match with the nature of software development. We should know that by now. Maybe there´s somewhere in the world some team who can consistently deliver on scope, quality, and time, and budget. Great! Congratulations! I, however, haven´t seen such a team yet. Which does not mean it´s impossible, but I think it´s nothing I can recommend to strive for. Rather I´d say: Don´t try this at home. It might hurt you one way or the other. However, what we can do, is allow customers/management stop work on features at any moment. With spinning every 32 hours a feature can be declared as finished - even though it might not be completed according to initial definition. I think, progress over completion is an important offer software development can make. Why think in terms of completion beyond a promise for the next 32 hours? Isn´t it more important to constantly move forward? Step by step. We´re not running sprints, we´re not running marathons, not even ultra-marathons. We´re in the sport of running forever. That makes it futile to stare at the finishing line. The very concept of a burn-down chart is misleading (in most cases). Whoever can only think in terms of completed requirements shuts out the chance for saving money. The requirements for a features mostly are uncertain. So how does a Product Owner know in the first place, how much is needed. Maybe more than specified is needed - which gets uncovered step by step with each finished increment. Maybe less than specified is needed. After each 4–32 hour increment the Product Owner can do an experient (or invite users to an experiment) if a particular trait of the software system is already good enough. And if so, she can switch the attention to a different aspect. In the end, requirements A, B, C then could be finished just 70%, 80%, and 50%. What the heck? It´s good enough - for now. 33% money saved. Wouldn´t that be splendid? Isn´t that a stunning argument for any budget-sensitive customer? You can save money and still get what you need? Pull on practices So far, in addition to more trust, more flexibility, less money spent, Spinning led to “doing less” which also means less code which of course means higher Evolvability per se. Last but not least, though, I think Spinning´s short acceptance cycles have one more effect. They excert pull-power on all sorts of practices known for increasing Evolvability. If, for example, you believe high automated test coverage helps Evolvability by lowering the fear of inadverted damage to a code base, why isn´t 90% of the developer community practicing automated tests consistently? I think, the answer is simple: Because they can do without. Somehow they manage to do enough manual checks before their rare releases/acceptance checks to ensure good enough correctness - at least in the short term. The same goes for other practices like component orientation, continuous build/integration, code reviews etc. None of that is compelling, urgent, imperative. Something else always seems more important. So Evolvability principles and practices fall through the cracks most of the time - until a project hits a wall. Then everybody becomes desperate; but by then (re)gaining Evolvability has become as very, very difficult and tedious undertaking. Sometimes up to the point where the existence of a project/company is in danger. With Spinning that´s different. If you´re practicing Spinning you cannot avoid all those practices. With Spinning you very quickly realize you cannot deliver reliably even on your 32 hour promises. Spinning thus is pulling on developers to adopt principles and practices for Evolvability. They will start actively looking for ways to keep their delivery rate high. And if not, management will soon tell them to do that. Because first the Product Owner then management will notice an increasing difficulty to deliver value within 32 hours. There, finally there emerges a way to measure Evolvability: The more frequent developers tell the Product Owner there is no way to deliver anything worth of feedback until tomorrow night, the poorer Evolvability is. Don´t count the “WTF!”, count the “No way!” utterances. In closing For sustainable software development we need to put Evolvability first. Functionality and Quality must not rule software development but be implemented within a framework ensuring (enough) Evolvability. Since Evolvability cannot be measured easily, I think we need to put software development “under pressure”. Software needs to be changed more often, in smaller increments. Each increment being relevant to the customer/user in some way. That does not mean each increment is worthy of shipment. It´s sufficient to gain further insight from it. Increments primarily serve the reduction of uncertainty, not sales. Sales even needs to be decoupled from this incremental progress. No more promises to sales. No more delivery au point. Rather sales should look at a stream of accepted increments (or incremental releases) and scoup from that whatever they find valuable. Sales and marketing need to realize they should work on what´s there, not what might be possible in the future. But I digress… In my view a Spinning cycle - which is not easy to reach, which requires practice - is the core practice to compensate the immeasurability of Evolvability. From start to finish of each issue in 32 hours max - that´s the challenge we need to accept if we´re serious increasing Evolvability. Fortunately higher Evolvability is not the only outcome of Spinning. Customer/management will like the increased flexibility and “getting more bang for the buck”.

Read the article
How do I get .NET to garbage collect aggressively?

- by mmr

I have an application that is used in image processing, and I find myself typically allocating arrays in the 4000x4000 ushort size, as well as the occasional float and the like. Currently, the .NET framework tends to crash in this app apparently randomly, almost always with an out of memory error. 32mb is not a huge declaration, but if .NET is fragmenting memory, then it's very possible that such large continuous allocations aren't behaving as expected. Is there a way to tell the garbage collector to be more aggressive, or to defrag memory (if that's the problem)? I realize that there's the GC.Collect and GC.WaitForPendingFinalizers calls, and I've sprinkled them pretty liberally through my code, but I'm still getting the errors. It may be because I'm calling dll routines that use native code a lot, but I'm not sure. I've gone over that C++ code, and make sure that any memory I declare I delete, but still I get these C# crashes, so I'm pretty sure it's not there. I wonder if the C++ calls could be interfering with the GC, making it leave behind memory because it once interacted with a native call-- is that possible? If so, can I turn that functionality off? EDIT: Here is some very specific code that will cause the crash. According to this SO question, I do not need to be disposing of the BitmapSource objects here. Here is the naive version, no GC.Collects in it. It generally crashes on iteration 4 to 10 of the undo procedure. This code replaces the constructor in a blank WPF project, since I'm using WPF. I do the wackiness with the bitmapsource because of the limitations I explained in my answer to @dthorpe below as well as the requirements listed in this SO question. public partial class Window1 : Window { public Window1() { InitializeComponent(); //Attempts to create an OOM crash //to do so, mimic minute croppings of an 'image' (ushort array), and then undoing the crops int theRows = 4000, currRows; int theColumns = 4000, currCols; int theMaxChange = 30; int i; List<ushort[]> theList = new List<ushort[]>();//the list of images in the undo/redo stack byte[] displayBuffer = null;//the buffer used as a bitmap source BitmapSource theSource = null; for (i = 0; i < theMaxChange; i++) { currRows = theRows - i; currCols = theColumns - i; theList.Add(new ushort[(theRows - i) * (theColumns - i)]); displayBuffer = new byte[theList[i].Length]; theSource = BitmapSource.Create(currCols, currRows, 96, 96, PixelFormats.Gray8, null, displayBuffer, (currCols * PixelFormats.Gray8.BitsPerPixel + 7) / 8); System.Console.WriteLine("Got to change " + i.ToString()); System.Threading.Thread.Sleep(100); } //should get here. If not, then theMaxChange is too large. //Now, go back up the undo stack. for (i = theMaxChange - 1; i >= 0; i--) { displayBuffer = new byte[theList[i].Length]; theSource = BitmapSource.Create((theColumns - i), (theRows - i), 96, 96, PixelFormats.Gray8, null, displayBuffer, ((theColumns - i) * PixelFormats.Gray8.BitsPerPixel + 7) / 8); System.Console.WriteLine("Got to undo change " + i.ToString()); System.Threading.Thread.Sleep(100); } } } Now, if I'm explicit in calling the garbage collector, I have to wrap the entire code in an outer loop to cause the OOM crash. For me, this tends to happen around x = 50 or so: public partial class Window1 : Window { public Window1() { InitializeComponent(); //Attempts to create an OOM crash //to do so, mimic minute croppings of an 'image' (ushort array), and then undoing the crops for (int x = 0; x < 1000; x++){ int theRows = 4000, currRows; int theColumns = 4000, currCols; int theMaxChange = 30; int i; List<ushort[]> theList = new List<ushort[]>();//the list of images in the undo/redo stack byte[] displayBuffer = null;//the buffer used as a bitmap source BitmapSource theSource = null; for (i = 0; i < theMaxChange; i++) { currRows = theRows - i; currCols = theColumns - i; theList.Add(new ushort[(theRows - i) * (theColumns - i)]); displayBuffer = new byte[theList[i].Length]; theSource = BitmapSource.Create(currCols, currRows, 96, 96, PixelFormats.Gray8, null, displayBuffer, (currCols * PixelFormats.Gray8.BitsPerPixel + 7) / 8); } //should get here. If not, then theMaxChange is too large. //Now, go back up the undo stack. for (i = theMaxChange - 1; i >= 0; i--) { displayBuffer = new byte[theList[i].Length]; theSource = BitmapSource.Create((theColumns - i), (theRows - i), 96, 96, PixelFormats.Gray8, null, displayBuffer, ((theColumns - i) * PixelFormats.Gray8.BitsPerPixel + 7) / 8); GC.WaitForPendingFinalizers();//force gc to collect, because we're in scenario 2, lots of large random changes GC.Collect(); } System.Console.WriteLine("Got to changelist " + x.ToString()); System.Threading.Thread.Sleep(100); } } } If I'm mishandling memory in either scenario, if there's something I should spot with a profiler, let me know. That's a pretty simple routine there. Unfortunately, it looks like @Kevin's answer is right-- this is a bug in .NET and how .NET handles objects larger than 85k. This situation strikes me as exceedingly strange; could Powerpoint be rewritten in .NET with this kind of limitation, or any of the other Office suite applications? 85k does not seem to me to be a whole lot of space, and I'd also think that any program that uses so-called 'large' allocations frequently would become unstable within a matter of days to weeks when using .NET. EDIT: It looks like Kevin is right, this is a limitation of .NET's GC. For those who don't want to follow the entire thread, .NET has four GC heaps: gen0, gen1, gen2, and LOH (Large Object Heap). Everything that's 85k or smaller goes on one of the first three heaps, depending on creation time (moved from gen0 to gen1 to gen2, etc). Objects larger than 85k get placed on the LOH. The LOH is never compacted, so eventually, allocations of the type I'm doing will eventually cause an OOM error as objects get scattered about that memory space. We've found that moving to .NET 4.0 does help the problem somewhat, delaying the exception, but not preventing it. To be honest, this feels a bit like the 640k barrier-- 85k ought to be enough for any user application (to paraphrase this video of a discussion of the GC in .NET). For the record, Java does not exhibit this behavior with its GC.

Read the article
How to find an entry-level job after you already have a graduate degree?

- by Uri

Note: I asked this question in early 2009. A couple of months later, I found a great job. I've previously updated this question with some tips for whoever ends up in a similar situation, and now cleaned it up a little for the benefit of the fresh batch of graduates. Original post: In my early 20s I abandoned a great C++ development career path in a major company to go to graduate school and get a research masters (3 years). I did another year in industrial research, and then moved to the US to attend graduate school again, getting another masters and a Ph.D in software engineering from a top school (another 6 years down the drain). I was coding the whole way throughout my degrees (core Java and Eclipse plug-ins) and working on research related to software engineering (usability of APIs). I ended up graduating the year of the recession, with a son on the way and the prospects of no healthcare. Academic jobs and industrial research jobs are quite scarce. Initially, I was naive, thinking that with my background, I could easily find a coding job. Big mistake. It turns out that I'm in a complicated position. Entry level positions are usually offered to college undergraduates. I attended my school's career fairs, but you could immediately see signs of Ph.D. aversion and overqualification issues. Some of the recruiters I spoke with explicitly told me that they wanted 20 year olds with clean slates, and some were looking for interns since they are in various forms of hiring freezes. I managed to get a couple of interviews from these career fairs and through recruiters. However, since I've been out of school for a long time and programming primarily in Java, I am also no longer proficient in C/C++ and the usual range of college-level interview questions that everyone uses. I had no problems with this when I was 19 and interviewing for my first job since a lot of what you do in C is manipulate pointers and I was coding C++ for fun and for school. Later I was routinely doing pointer manipulation on the job, and during my first masters taught college courses with data structures and C++. But even though I remember many properties of C++ well, it's been close to ten years since I regularly used C++ and pointers. As a Java developer I rarely had to work at this level, but experience in OOD and in writing good maintainable code is meaningless for C++ interviews. Reading books as a refresh and looking at sample code did not do the trick. I also looked at mid-to-senior level Java positions, but most of them focused on J2EE APIs rather than on core Java and required a certain number of years in industrial positions. Coding research tools and prior C++ experience doesn't count. So that sends me back to entry-level jobs that are posted through job-boards, and these are not common (mostly they are Monster junk), and small companies are even less likely to answer a Ph.D. compared to the giants who participate in top-10 career fairs. Even worse, in many companies initial screening is done by HR folks who really don't want to deal with anything anomalous like a Ph.D. Any tips on how I should approach this intractable position? For example, what should I write in cover letters? Note that while immigration is not an issue for me, I cannot go freelance as I need the benefits (and in particular group health insurance). During my studies I had no time to contribute to open-source projects or maintain a popular blog, so even if I invested in that now there would be no immediate benefit. Updates: In the two months after posting this I received several offers to work as a core Java developer in the financial industry and accepted one from a firm where I am working to this day. For those who find themselves in similar situations, here are my tips: Give up on trying to find an entry level positions. You can't undo time. Accept the fact that there is Ph.D. discrimination in the job market (some might say rightfully so). It is legal to discriminate based on education. No point fighting it. The most important tip is to focus on the language you are comfortable with. The sad truth about programming in a particular language is that it is not like riding a bike. If you haven't used a language in the last few years, and can't actually apply it routinely (not just as a refresher) before you start your search, it is going to be very difficult to do well in an interview. Now that I'm interviewing others, I routinely see it in folks with a mixed C++/Java background. We maintain "a shadow" of the old language but end up with a weird mix that makes it hard to interview on either. Entry-level folks are at an advantage here since they usually have one language. Memory can help you do great in a screening interview, but without recent day-to-day experience, code tests will be difficult. Despite the supposed relation, core Java programming and J2EE programming are two different things with different skillsets. If you come from academia, you likely have very little J2EE experience and may find it hard to get accepted for a J2EE job. J2EE jobs seem to have a larger list of acronyms in their requirements. In addition, from interviewing J2EE developers it seems that for many there is a focus on mastering specific APIs and architectures, whereas core Java development tends to be secondary. In the same way that I can no longer manipulate pointers well, a J2EE developer may have difficulties doing low level Java manipulation. This puts you at a relative advantage in competing for core Java jobs! If you are able to work for startups (in terms of family life and stability) or migrate to startup-rich areas such as the west coast, you can find many exciting opportunities where advanced degrees are a benefit. I've since been approached by several startups, although I had to decline. Work through a recruiter if possible. They have direct contacts with the hiring parties, allowing you to "stand out". It is better to get a clear yes/no confirmation from a recruiter on whether a company might be interested in interviewing you, than it is to send your resume and hope that someone will ever see it. Recruiters are also a great way of bypassing HR. However, also beware of recruiters. They have a vested interest and will go to various shady practices and pressure tactics. To find a good recruiter, talk to a friend who declined a job offer he got through a recruiter. A good recruiter, to me, is measured in how they handle that. Interview for the jobs that require your core strength. If you're rusty or entirely unfamiliar with a technology around which the job revolves, you're probably not a good match. Yes, you probably have the talent to master them, but most companies would want "instant gratification". I got my offers from companies that wanted core Java developer. I didn't do well on places that wanted advance C++ because I am too rusty and not up to date on recent libraries. I also didn't hear from companies that wanted lots of J2EE experience, and that's ok. Finding companies that want core Java without web is harder, but exists in specific industries (e.g., finance, defense). This requires a lot more legwork in terms of search, but these jobs do exist. There are different interview styles. Some companies focus on puzzles, some companies focus on algorithms, and some companies focus on design and coding skills. I had the most success in places where the questions were the most related to the function I would have been performing. Pick companies accordingly as well.

Read the article
Odd behavior when recursively building a return type for variadic functions

- by Dennis Zickefoose

This is probably going to be a really simple explanation, but I'm going to give as much backstory as possible in case I'm wrong. Advanced apologies for being so verbose. I'm using gcc4.5, and I realize the c++0x support is still somewhat experimental, but I'm going to act on the assumption that there's a non-bug related reason for the behavior I'm seeing. I'm experimenting with variadic function templates. The end goal was to build a cons-list out of std::pair. It wasn't meant to be a custom type, just a string of pair objects. The function that constructs the list would have to be in some way recursive, with the ultimate return value being dependent on the result of the recursive calls. As an added twist, successive parameters are added together before being inserted into the list. So if I pass [1, 2, 3, 4, 5, 6] the end result should be {1+2, {3+4, 5+6}}. My initial attempt was fairly naive. A function, Build, with two overloads. One took two identical parameters and simply returned their sum. The other took two parameters and a parameter pack. The return value was a pair consisting of the sum of the two set parameters, and the recursive call. In retrospect, this was obviously a flawed strategy, because the function isn't declared when I try to figure out its return type, so it has no choice but to resolve to the non-recursive version. That I understand. Where I got confused was the second iteration. I decided to make those functions static members of a template class. The function calls themselves are not parameterized, but instead the entire class is. My assumption was that when the recursive function attempts to generate its return type, it would instantiate a whole new version of the structure with its own static function, and everything would work itself out. The result was: "error: no matching function for call to BuildStruct<double, double, char, char>::Go(const char&, const char&)" The offending code: static auto Go(const Type& t0, const Type& t1, const Types&... rest) -> std::pair<Type, decltype(BuildStruct<Types...>::Go(rest...))> My confusion comes from the fact that the parameters to BuildStruct should always be the same types as the arguments sent to BuildStruct::Go, but in the error code Go is missing the initial two double parameters. What am I missing here? If my initial assumption about how the static functions would be chosen was incorrect, why is it trying to call the wrong function rather than just not finding a function at all? It seems to just be mixing types willy-nilly, and I just can't come up with an explanation as to why. If I add additional parameters to the initial call, it always burrows down to that last step before failing, so presumably the recursion itself is at least partially working. This is in direct contrast to the initial attempt, which always failed to find a function call right away. Ultimately, I've gotten past the problem, with a fairly elegant solution that hardly resembles either of the first two attempts. So I know how to do what I want to do. I'm looking for an explanation for the failure I saw. Full code to follow since I'm sure my verbal description was insufficient. First some boilerplate, if you feel compelled to execute the code and see it for yourself. Then the initial attempt, which failed reasonably, then the second attempt, which did not. #include <iostream> using std::cout; using std::endl; #include <utility> template<typename T1, typename T2> std::ostream& operator <<(std::ostream& str, const std::pair<T1, T2>& p) { return str << "[" << p.first << ", " << p.second << "]"; } //Insert code here int main() { Execute(5, 6, 4.3, 2.2, 'c', 'd'); Execute(5, 6, 4.3, 2.2); Execute(5, 6); return 0; } Non-struct solution: template<typename Type> Type BuildFunction(const Type& t0, const Type& t1) { return t0 + t1; } template<typename Type, typename... Rest> auto BuildFunction(const Type& t0, const Type& t1, const Rest&... rest) -> std::pair<Type, decltype(BuildFunction(rest...))> { return std::pair<Type, decltype(BuildFunction(rest...))> (t0 + t1, BuildFunction(rest...)); } template<typename... Types> void Execute(const Types&... t) { cout << BuildFunction(t...) << endl; } Resulting errors: test.cpp: In function 'void Execute(const Types& ...) [with Types = {int, int, double, double, char, char}]': test.cpp:33:35: instantiated from here test.cpp:28:3: error: no matching function for call to 'BuildFunction(const int&, const int&, const double&, const double&, const char&, const char&)' Struct solution: template<typename... Types> struct BuildStruct; template<typename Type> struct BuildStruct<Type, Type> { static Type Go(const Type& t0, const Type& t1) { return t0 + t1; } }; template<typename Type, typename... Types> struct BuildStruct<Type, Type, Types...> { static auto Go(const Type& t0, const Type& t1, const Types&... rest) -> std::pair<Type, decltype(BuildStruct<Types...>::Go(rest...))> { return std::pair<Type, decltype(BuildStruct<Types...>::Go(rest...))> (t0 + t1, BuildStruct<Types...>::Go(rest...)); } }; template<typename... Types> void Execute(const Types&... t) { cout << BuildStruct<Types...>::Go(t...) << endl; } Resulting errors: test.cpp: In instantiation of 'BuildStruct<int, int, double, double, char, char>': test.cpp:33:3: instantiated from 'void Execute(const Types& ...) [with Types = {int, int, double, double, char, char}]' test.cpp:38:41: instantiated from here test.cpp:24:15: error: no matching function for call to 'BuildStruct<double, double, char, char>::Go(const char&, const char&)' test.cpp:24:15: note: candidate is: static std::pair<Type, decltype (BuildStruct<Types ...>::Go(BuildStruct<Type, Type, Types ...>::Go::rest ...))> BuildStruct<Type, Type, Types ...>::Go(const Type&, const Type&, const Types& ...) [with Type = double, Types = {char, char}, decltype (BuildStruct<Types ...>::Go(BuildStruct<Type, Type, Types ...>::Go::rest ...)) = char] test.cpp: In function 'void Execute(const Types& ...) [with Types = {int, int, double, double, char, char}]': test.cpp:38:41: instantiated from here test.cpp:33:3: error: 'Go' is not a member of 'BuildStruct<int, int, double, double, char, char>'

Read the article
Can anyone explain why my crypto++ decrypted file 16 bytes short?

- by Tom Williams

I suspect it might be too much to hope for, but can anyone with experience with crypto++ explain why the "decrypted.out" file created by main() is 16 characters short (which probably not coincidentally is the block size)? I think the issue must be in CryptStreamBuffer::GetNextChar(), but I've been staring at it and the crypto++ documentation for hours. Any other comments about how crummy or naive my std::streambuf implementation are also welcome ;-) And I've just noticed I'm missing some calls to delete so you don't have to tell me about those. Thanks, Tom // Runtime Includes #include <iostream> // Crypto++ Includes #include "aes.h" #include "modes.h" // xxx_Mode< > #include "filters.h" // StringSource and // StreamTransformation #include "files.h" using namespace std; class CryptStreamBuffer: public std::streambuf { public: CryptStreamBuffer(istream& encryptedInput, CryptoPP::StreamTransformation& c); CryptStreamBuffer(ostream& encryptedOutput, CryptoPP::StreamTransformation& c); protected: virtual int_type overflow(int_type ch = traits_type::eof()); virtual int_type uflow(); virtual int_type underflow(); virtual int_type pbackfail(int_type ch); virtual int sync(); private: int GetNextChar(); int m_NextChar; // Buffered character CryptoPP::StreamTransformationFilter* m_StreamTransformationFilter; CryptoPP::FileSource* m_Source; CryptoPP::FileSink* m_Sink; }; // class CryptStreamBuffer CryptStreamBuffer::CryptStreamBuffer(istream& encryptedInput, CryptoPP::StreamTransformation& c) : m_NextChar(traits_type::eof()), m_StreamTransformationFilter(0), m_Source(0), m_Sink(0) { m_StreamTransformationFilter = new CryptoPP::StreamTransformationFilter(c); m_Source = new CryptoPP::FileSource(encryptedInput, false, m_StreamTransformationFilter); } CryptStreamBuffer::CryptStreamBuffer(ostream& encryptedOutput, CryptoPP::StreamTransformation& c) : m_NextChar(traits_type::eof()), m_StreamTransformationFilter(0), m_Source(0), m_Sink(0) { m_Sink = new CryptoPP::FileSink(encryptedOutput); m_StreamTransformationFilter = new CryptoPP::StreamTransformationFilter(c, m_Sink); } CryptStreamBuffer::int_type CryptStreamBuffer::overflow(int_type ch) { return m_StreamTransformationFilter->Put((byte)ch); } CryptStreamBuffer::int_type CryptStreamBuffer::uflow() { int_type result = GetNextChar(); // Reset the buffered character m_NextChar = traits_type::eof(); return result; } CryptStreamBuffer::int_type CryptStreamBuffer::underflow() { return GetNextChar(); } CryptStreamBuffer::int_type CryptStreamBuffer::pbackfail(int_type ch) { return traits_type::eof(); } int CryptStreamBuffer::sync() { if (m_Sink) { m_StreamTransformationFilter->MessageEnd(); } } int CryptStreamBuffer::GetNextChar() { // If we have a buffered character do nothing if (m_NextChar != traits_type::eof()) { return m_NextChar; } // If there are no more bytes currently available then pump the source // *** I SUSPECT THE PROBLEM IS HERE *** if (m_StreamTransformationFilter->MaxRetrievable() == 0) { m_Source->Pump(1024); } // Retrieve the next byte byte nextByte; size_t noBytes = m_StreamTransformationFilter->Get(nextByte); if (0 == noBytes) { return traits_type::eof(); } // Buffer up the next character m_NextChar = nextByte; return m_NextChar; } void InitKey(byte key[]) { key[0] = -62; key[1] = 102; key[2] = 78; key[3] = 75; key[4] = -96; key[5] = 125; key[6] = 66; key[7] = 125; key[8] = -95; key[9] = -66; key[10] = 114; key[11] = 22; key[12] = 48; key[13] = 111; key[14] = -51; key[15] = 112; } void DecryptFile(const char* sourceFileName, const char* destFileName) { ifstream ifs(sourceFileName, ios::in | ios::binary); ofstream ofs(destFileName, ios::out | ios::binary); byte key[CryptoPP::AES::DEFAULT_KEYLENGTH]; InitKey(key); CryptoPP::ECB_Mode<CryptoPP::AES>::Decryption decryptor(key, sizeof(key)); if (ifs) { if (ofs) { CryptStreamBuffer cryptBuf(ifs, decryptor); std::istream decrypt(&cryptBuf); int c; while (EOF != (c = decrypt.get())) { ofs << (char)c; } ofs.flush(); } else { std::cerr << "Failed to open file '" << destFileName << "'." << endl; } } else { std::cerr << "Failed to open file '" << sourceFileName << "'." << endl; } } void EncryptFile(const char* sourceFileName, const char* destFileName) { ifstream ifs(sourceFileName, ios::in | ios::binary); ofstream ofs(destFileName, ios::out | ios::binary); byte key[CryptoPP::AES::DEFAULT_KEYLENGTH]; InitKey(key); CryptoPP::ECB_Mode<CryptoPP::AES>::Encryption encryptor(key, sizeof(key)); if (ifs) { if (ofs) { CryptStreamBuffer cryptBuf(ofs, encryptor); std::ostream encrypt(&cryptBuf); int c; while (EOF != (c = ifs.get())) { encrypt << (char)c; } encrypt.flush(); } else { std::cerr << "Failed to open file '" << destFileName << "'." << endl; } } else { std::cerr << "Failed to open file '" << sourceFileName << "'." << endl; } } int main(int argc, char* argv[]) { EncryptFile(argv[1], "encrypted.out"); DecryptFile("encrypted.out", "decrypted.out"); return 0; }

Read the article
Can anyone explain why my crypto++ decrypted file is 16 bytes short?

- by Tom Williams

I suspect it might be too much to hope for, but can anyone with experience with crypto++ explain why the "decrypted.out" file created by main() is 16 characters short (which probably not coincidentally is the block size)? I think the issue must be in CryptStreamBuffer::GetNextChar(), but I've been staring at it and the crypto++ documentation for hours. Any other comments about how crummy or naive my std::streambuf implementation are also welcome ;-) And I've just noticed I'm missing some calls to delete so you don't have to tell me about those. Thanks, Tom // Runtime Includes #include <iostream> // Crypto++ Includes #include "aes.h" #include "modes.h" // xxx_Mode< > #include "filters.h" // StringSource and // StreamTransformation #include "files.h" using namespace std; class CryptStreamBuffer: public std::streambuf { public: CryptStreamBuffer(istream& encryptedInput, CryptoPP::StreamTransformation& c); CryptStreamBuffer(ostream& encryptedOutput, CryptoPP::StreamTransformation& c); protected: virtual int_type overflow(int_type ch = traits_type::eof()); virtual int_type uflow(); virtual int_type underflow(); virtual int_type pbackfail(int_type ch); virtual int sync(); private: int GetNextChar(); int m_NextChar; // Buffered character CryptoPP::StreamTransformationFilter* m_StreamTransformationFilter; CryptoPP::FileSource* m_Source; CryptoPP::FileSink* m_Sink; }; // class CryptStreamBuffer CryptStreamBuffer::CryptStreamBuffer(istream& encryptedInput, CryptoPP::StreamTransformation& c) : m_NextChar(traits_type::eof()), m_StreamTransformationFilter(0), m_Source(0), m_Sink(0) { m_StreamTransformationFilter = new CryptoPP::StreamTransformationFilter(c); m_Source = new CryptoPP::FileSource(encryptedInput, false, m_StreamTransformationFilter); } CryptStreamBuffer::CryptStreamBuffer(ostream& encryptedOutput, CryptoPP::StreamTransformation& c) : m_NextChar(traits_type::eof()), m_StreamTransformationFilter(0), m_Source(0), m_Sink(0) { m_Sink = new CryptoPP::FileSink(encryptedOutput); m_StreamTransformationFilter = new CryptoPP::StreamTransformationFilter(c, m_Sink); } CryptStreamBuffer::int_type CryptStreamBuffer::overflow(int_type ch) { return m_StreamTransformationFilter->Put((byte)ch); } CryptStreamBuffer::int_type CryptStreamBuffer::uflow() { int_type result = GetNextChar(); // Reset the buffered character m_NextChar = traits_type::eof(); return result; } CryptStreamBuffer::int_type CryptStreamBuffer::underflow() { return GetNextChar(); } CryptStreamBuffer::int_type CryptStreamBuffer::pbackfail(int_type ch) { return traits_type::eof(); } int CryptStreamBuffer::sync() { if (m_Sink) { m_StreamTransformationFilter->MessageEnd(); } } int CryptStreamBuffer::GetNextChar() { // If we have a buffered character do nothing if (m_NextChar != traits_type::eof()) { return m_NextChar; } // If there are no more bytes currently available then pump the source // *** I SUSPECT THE PROBLEM IS HERE *** if (m_StreamTransformationFilter->MaxRetrievable() == 0) { m_Source->Pump(1024); } // Retrieve the next byte byte nextByte; size_t noBytes = m_StreamTransformationFilter->Get(nextByte); if (0 == noBytes) { return traits_type::eof(); } // Buffer up the next character m_NextChar = nextByte; return m_NextChar; } void InitKey(byte key[]) { key[0] = -62; key[1] = 102; key[2] = 78; key[3] = 75; key[4] = -96; key[5] = 125; key[6] = 66; key[7] = 125; key[8] = -95; key[9] = -66; key[10] = 114; key[11] = 22; key[12] = 48; key[13] = 111; key[14] = -51; key[15] = 112; } void DecryptFile(const char* sourceFileName, const char* destFileName) { ifstream ifs(sourceFileName, ios::in | ios::binary); ofstream ofs(destFileName, ios::out | ios::binary); byte key[CryptoPP::AES::DEFAULT_KEYLENGTH]; InitKey(key); CryptoPP::ECB_Mode<CryptoPP::AES>::Decryption decryptor(key, sizeof(key)); if (ifs) { if (ofs) { CryptStreamBuffer cryptBuf(ifs, decryptor); std::istream decrypt(&cryptBuf); int c; while (EOF != (c = decrypt.get())) { ofs << (char)c; } ofs.flush(); } else { std::cerr << "Failed to open file '" << destFileName << "'." << endl; } } else { std::cerr << "Failed to open file '" << sourceFileName << "'." << endl; } } void EncryptFile(const char* sourceFileName, const char* destFileName) { ifstream ifs(sourceFileName, ios::in | ios::binary); ofstream ofs(destFileName, ios::out | ios::binary); byte key[CryptoPP::AES::DEFAULT_KEYLENGTH]; InitKey(key); CryptoPP::ECB_Mode<CryptoPP::AES>::Encryption encryptor(key, sizeof(key)); if (ifs) { if (ofs) { CryptStreamBuffer cryptBuf(ofs, encryptor); std::ostream encrypt(&cryptBuf); int c; while (EOF != (c = ifs.get())) { encrypt << (char)c; } encrypt.flush(); } else { std::cerr << "Failed to open file '" << destFileName << "'." << endl; } } else { std::cerr << "Failed to open file '" << sourceFileName << "'." << endl; } } int main(int argc, char* argv[]) { EncryptFile(argv[1], "encrypted.out"); DecryptFile("encrypted.out", "decrypted.out"); return 0; }

Read the article
Numpy zero rank array indexing/broadcasting

- by Lemming

I'm trying to write a function that supports broadcasting and is fast at the same time. However, numpy's zero-rank arrays are causing trouble as usual. I couldn't find anything useful on google, or by searching here. So, I'm asking you. How should I implement broadcasting efficiently and handle zero-rank arrays at the same time? This whole post became larger than anticipated, sorry. Details: To clarify what I'm talking about I'll give a simple example: Say I want to implement a Heaviside step-function. I.e. a function that acts on the real axis, which is 0 on the negative side, 1 on the positive side, and from case to case either 0, 0.5, or 1 at the point 0. Implementation Masking The most efficient way I found so far is the following. It uses boolean arrays as masks to assign the correct values to the corresponding slots in the output vector. from numpy import * def step_mask(x, limit=+1): """Heaviside step-function. y = 0 if x < 0 y = 1 if x > 0 See below for x == 0. Arguments: x Evaluate the function at these points. limit Which limit at x == 0? limit > 0: y = 1 limit == 0: y = 0.5 limit < 0: y = 0 Return: The values corresponding to x. """ b = broadcast(x, limit) out = zeros(b.shape) out[x>0] = 1 mask = (limit > 0) & (x == 0) out[mask] = 1 mask = (limit == 0) & (x == 0) out[mask] = 0.5 mask = (limit < 0) & (x == 0) out[mask] = 0 return out List Comprehension The following-the-numpy-docs way is to use a list comprehension on the flat iterator of the broadcast object. However, list comprehensions become absolutely unreadable for such complicated functions. def step_comprehension(x, limit=+1): b = broadcast(x, limit) out = empty(b.shape) out.flat = [ ( 1 if x_ > 0 else ( 0 if x_ < 0 else ( 1 if l_ > 0 else ( 0.5 if l_ ==0 else ( 0 ))))) for x_, l_ in b ] return out For Loop And finally, the most naive way is a for loop. It's probably the most readable option. However, Python for-loops are anything but fast. And hence, a really bad idea in numerics. def step_for(x, limit=+1): b = broadcast(x, limit) out = empty(b.shape) for i, (x_, l_) in enumerate(b): if x_ > 0: out[i] = 1 elif x_ < 0: out[i] = 0 elif l_ > 0: out[i] = 1 elif l_ < 0: out[i] = 0 else: out[i] = 0.5 return out Test First of all a brief test to see if the output is correct. >>> x = array([-1, -0.1, 0, 0.1, 1]) >>> step_mask(x, +1) array([ 0., 0., 1., 1., 1.]) >>> step_mask(x, 0) array([ 0. , 0. , 0.5, 1. , 1. ]) >>> step_mask(x, -1) array([ 0., 0., 0., 1., 1.]) It is correct, and the other two functions give the same output. Performance How about efficiency? These are the timings: In [45]: xl = linspace(-2, 2, 500001) In [46]: %timeit step_mask(xl) 10 loops, best of 3: 19.5 ms per loop In [47]: %timeit step_comprehension(xl) 1 loops, best of 3: 1.17 s per loop In [48]: %timeit step_for(xl) 1 loops, best of 3: 1.15 s per loop The masked version performs best as expected. However, I'm surprised that the comprehension is on the same level as the for loop. Zero Rank Arrays But, 0-rank arrays pose a problem. Sometimes you want to use a function scalar input. And preferably not have to worry about wrapping all scalars in at least 1-D arrays. >>> step_mask(1) Traceback (most recent call last): File "<ipython-input-50-91c06aa4487b>", line 1, in <module> step_mask(1) File "script.py", line 22, in step_mask out[x>0] = 1 IndexError: 0-d arrays can't be indexed. >>> step_for(1) Traceback (most recent call last): File "<ipython-input-51-4e0de4fcb197>", line 1, in <module> step_for(1) File "script.py", line 55, in step_for out[i] = 1 IndexError: 0-d arrays can't be indexed. >>> step_comprehension(1) array(1.0) Only the list comprehension can handle 0-rank arrays. The other two versions would need special case handling for 0-rank arrays. Numpy gets a bit messy when you want to use the same code for arrays and scalars. However, I really like to have functions that work on as arbitrary input as possible. Who knows which parameters I'll want to iterate over at some point. Question: What is the best way to implement a function as the one above? Is there a way to avoid if scalar then like special cases? I'm not looking for a built-in Heaviside. It's just a simplified example. In my code the above pattern appears in many places to make parameter iteration as simple as possible without littering the client code with for loops or comprehensions. Furthermore, I'm aware of Cython, or weave & Co., or implementation directly in C. However, the performance of the masked version above is sufficient for the moment. And for the moment I would like to keep things as simple as possible.

Read the article
Followup: Python 2.6, 3 abstract base class misunderstanding

- by Aaron

I asked a question at Python 2.6, 3 abstract base class misunderstanding. My problem was that python abstract base classes didn't work quite the way I expected them to. There was some discussion in the comments about why I would want to use ABCs at all, and Alex Martelli provided an excellent answer on why my use didn't work and how to accomplish what I wanted. Here I'd like to address why one might want to use ABCs, and show my test code implementation based on Alex's answer. tl;dr: Code after the 16th paragraph. In the discussion on the original post, statements were made along the lines that you don't need ABCs in Python, and that ABCs don't do anything and are therefore not real classes; they're merely interface definitions. An abstract base class is just a tool in your tool box. It's a design tool that's been around for many years, and a programming tool that is explicitly available in many programming languages. It can be implemented manually in languages that don't provide it. An ABC is always a real class, even when it doesn't do anything but define an interface, because specifying the interface is what an ABC does. If that was all an ABC could do, that would be enough reason to have it in your toolbox, but in Python and some other languages they can do more. The basic reason to use an ABC is when you have a number of classes that all do the same thing (have the same interface) but do it differently, and you want to guarantee that that complete interface is implemented in all objects. A user of your classes can rely on the interface being completely implemented in all classes. You can maintain this guarantee manually. Over time you may succeed. Or you might forget something. Before Python had ABCs you could guarantee it semi-manually, by throwing NotImplementedError in all the base class's interface methods; you must implement these methods in derived classes. This is only a partial solution, because you can still instantiate such a base class. A more complete solution is to use ABCs as provided in Python 2.6 and above. Template methods and other wrinkles and patterns are ideas whose implementation can be made easier with full-citizen ABCs. Another idea in the comments was that Python doesn't need ABCs (understood as a class that only defines an interface) because it has multiple inheritance. The implied reference there seems to be Java and its single inheritance. In Java you "get around" single inheritance by inheriting from one or more interfaces. Java uses the word "interface" in two ways. A "Java interface" is a class with method signatures but no implementations. The methods are the interface's "interface" in the more general, non-Java sense of the word. Yes, Python has multiple inheritance, so you don't need Java-like "interfaces" (ABCs) merely to provide sets of interface methods to a class. But that's not the only reason in software development to use ABCs. Most generally, you use an ABC to specify an interface (set of methods) that will likely be implemented differently in different derived classes, yet that all derived classes must have. Additionally, there may be no sensible default implementation for the base class to provide. Finally, even an ABC with almost no interface is still useful. We use something like it when we have multiple except clauses for a try. Many exceptions have exactly the same interface, with only two differences: the exception's string value, and the actual class of the exception. In many exception clauses we use nothing about the exception except its class to decide what to do; catching one type of exception we do one thing, and another except clause catching a different exception does another thing. According to the exception module's doc page, BaseException is not intended to be derived by any user defined exceptions. If ABCs had been a first class Python concept from the beginning, it's easy to imagine BaseException being specified as an ABC. But enough of that. Here's some 2.6 code that demonstrates how to use ABCs, and how to specify a list-like ABC. Examples are run in ipython, which I like much better than the python shell for day to day work; I only wish it was available for python3. Your basic 2.6 ABC: from abc import ABCMeta, abstractmethod class Super(): __metaclass__ = ABCMeta @abstractmethod def method1(self): pass Test it (in ipython, python shell would be similar): In [2]: a = Super() --------------------------------------------------------------------------- TypeError Traceback (most recent call last) /home/aaron/projects/test/<ipython console> in <module>() TypeError: Can't instantiate abstract class Super with abstract methods method1 Notice the end of the last line, where the TypeError exception tells us that method1 has not been implemented ("abstract methods method1"). That was the method designated as @abstractmethod in the preceding code. Create a subclass that inherits Super, implement method1 in the subclass and you're done. My problem, which caused me to ask the original question, was how to specify an ABC that itself defines a list interface. My naive solution was to make an ABC as above, and in the inheritance parentheses say (list). My assumption was that the class would still be abstract (can't instantiate it), and would be a list. That was wrong; inheriting from list made the class concrete, despite the abstract bits in the class definition. Alex suggested inheriting from collections.MutableSequence, which is abstract (and so doesn't make the class concrete) and list-like. I used collections.Sequence, which is also abstract but has a shorter interface and so was quicker to implement. First, Super derived from Sequence, with nothing extra: from abc import abstractmethod from collections import Sequence class Super(Sequence): pass Test it: In [6]: a = Super() --------------------------------------------------------------------------- TypeError Traceback (most recent call last) /home/aaron/projects/test/<ipython console> in <module>() TypeError: Can't instantiate abstract class Super with abstract methods __getitem__, __len__ We can't instantiate it. A list-like full-citizen ABC; yea! Again, notice in the last line that TypeError tells us why we can't instantiate it: __getitem__ and __len__ are abstract methods. They come from collections.Sequence. But, I want a bunch of subclasses that all act like immutable lists (which collections.Sequence essentially is), and that have their own implementations of my added interface methods. In particular, I don't want to implement my own list code, Python already did that for me. So first, let's implement the missing Sequence methods, in terms of Python's list type, so that all subclasses act as lists (Sequences). First let's see the signatures of the missing abstract methods: In [12]: help(Sequence.__getitem__) Help on method __getitem__ in module _abcoll: __getitem__(self, index) unbound _abcoll.Sequence method (END) In [14]: help(Sequence.__len__) Help on method __len__ in module _abcoll: __len__(self) unbound _abcoll.Sequence method (END) __getitem__ takes an index, and __len__ takes nothing. And the implementation (so far) is: from abc import abstractmethod from collections import Sequence class Super(Sequence): # Gives us a list member for ABC methods to use. def __init__(self): self._list = [] # Abstract method in Sequence, implemented in terms of list. def __getitem__(self, index): return self._list.__getitem__(index) # Abstract method in Sequence, implemented in terms of list. def __len__(self): return self._list.__len__() # Not required. Makes printing behave like a list. def __repr__(self): return self._list.__repr__() Test it: In [34]: a = Super() In [35]: a Out[35]: [] In [36]: print a [] In [37]: len(a) Out[37]: 0 In [38]: a[0] --------------------------------------------------------------------------- IndexError Traceback (most recent call last) /home/aaron/projects/test/<ipython console> in <module>() /home/aaron/projects/test/test.py in __getitem__(self, index) 10 # Abstract method in Sequence, implemented in terms of list. 11 def __getitem__(self, index): ---> 12 return self._list.__getitem__(index) 13 14 # Abstract method in Sequence, implemented in terms of list. IndexError: list index out of range Just like a list. It's not abstract (for the moment) because we implemented both of Sequence's abstract methods. Now I want to add my bit of interface, which will be abstract in Super and therefore required to implement in any subclasses. And we'll cut to the chase and add subclasses that inherit from our ABC Super. from abc import abstractmethod from collections import Sequence class Super(Sequence): # Gives us a list member for ABC methods to use. def __init__(self): self._list = [] # Abstract method in Sequence, implemented in terms of list. def __getitem__(self, index): return self._list.__getitem__(index) # Abstract method in Sequence, implemented in terms of list. def __len__(self): return self._list.__len__() # Not required. Makes printing behave like a list. def __repr__(self): return self._list.__repr__() @abstractmethod def method1(): pass class Sub0(Super): pass class Sub1(Super): def __init__(self): self._list = [1, 2, 3] def method1(self): return [x**2 for x in self._list] def method2(self): return [x/2.0 for x in self._list] class Sub2(Super): def __init__(self): self._list = [10, 20, 30, 40] def method1(self): return [x+2 for x in self._list] We've added a new abstract method to Super, method1. This makes Super abstract again. A new class Sub0 which inherits from Super but does not implement method1, so it's also an ABC. Two new classes Sub1 and Sub2, which both inherit from Super. They both implement method1 from Super, so they're not abstract. Both implementations of method1 are different. Sub1 and Sub2 also both initialize themselves differently; in real life they might initialize themselves wildly differently. So you have two subclasses which both "is a" Super (they both implement Super's required interface) although their implementations are different. Also remember that Super, although an ABC, provides four non-abstract methods. So Super provides two things to subclasses: an implementation of collections.Sequence, and an additional abstract interface (the one abstract method) that subclasses must implement. Also, class Sub1 implements an additional method, method2, which is not part of Super's interface. Sub1 "is a" Super, but it also has additional capabilities. Test it: In [52]: a = Super() --------------------------------------------------------------------------- TypeError Traceback (most recent call last) /home/aaron/projects/test/<ipython console> in <module>() TypeError: Can't instantiate abstract class Super with abstract methods method1 In [53]: a = Sub0() --------------------------------------------------------------------------- TypeError Traceback (most recent call last) /home/aaron/projects/test/<ipython console> in <module>() TypeError: Can't instantiate abstract class Sub0 with abstract methods method1 In [54]: a = Sub1() In [55]: a Out[55]: [1, 2, 3] In [56]: b = Sub2() In [57]: b Out[57]: [10, 20, 30, 40] In [58]: print a, b [1, 2, 3] [10, 20, 30, 40] In [59]: a, b Out[59]: ([1, 2, 3], [10, 20, 30, 40]) In [60]: a.method1() Out[60]: [1, 4, 9] In [61]: b.method1() Out[61]: [12, 22, 32, 42] In [62]: a.method2() Out[62]: [0.5, 1.0, 1.5] [63]: a[:2] Out[63]: [1, 2] In [64]: a[0] = 5 --------------------------------------------------------------------------- TypeError Traceback (most recent call last) /home/aaron/projects/test/<ipython console> in <module>() TypeError: 'Sub1' object does not support item assignment Super and Sub0 are abstract and can't be instantiated (lines 52 and 53). Sub1 and Sub2 are concrete and have an immutable Sequence interface (54 through 59). Sub1 and Sub2 are instantiated differently, and their method1 implementations are different (60, 61). Sub1 includes an additional method2, beyond what's required by Super (62). Any concrete Super acts like a list/Sequence (63). A collections.Sequence is immutable (64). Finally, a wart: In [65]: a._list Out[65]: [1, 2, 3] In [66]: a._list = [] In [67]: a Out[67]: [] Super._list is spelled with a single underscore. Double underscore would have protected it from this last bit, but would have broken the implementation of methods in subclasses. Not sure why; I think because double underscore is private, and private means private. So ultimately this whole scheme relies on a gentleman's agreement not to reach in and muck with Super._list directly, as in line 65 above. Would love to know if there's a safer way to do that.

Read the article
Steganography Experiment - Trouble hiding message bits in DCT coefficients

- by JohnHankinson

I have an application requiring me to be able to embed loss-less data into an image. As such I've been experimenting with steganography, specifically via modification of DCT coefficients as the method I select, apart from being loss-less must also be relatively resilient against format conversion, scaling/DSP etc. From the research I've done thus far this method seems to be the best candidate. I've seen a number of papers on the subject which all seem to neglect specific details (some neglect to mention modification of 0 coefficients, or modification of AC coefficient etc). After combining the findings and making a few modifications of my own which include: 1) Using a more quantized version of the DCT matrix to ensure we only modify coefficients that would still be present should the image be JPEG'ed further or processed (I'm using this in place of simply following a zig-zag pattern). 2) I'm modifying bit 4 instead of the LSB and then based on what the original bit value was adjusting the lower bits to minimize the difference. 3) I'm only modifying the blue channel as it should be the least visible. This process must modify the actual image and not the DCT values stored in file (like jsteg) as there is no guarantee the file will be a JPEG, it may also be opened and re-saved at a later stage in a different format. For added robustness I've included the message multiple times and use the bits that occur most often, I had considered using a QR code as the message data or simply applying the reed-solomon error correction, but for this simple application and given that the "message" in question is usually going to be between 10-32 bytes I have plenty of room to repeat it which should provide sufficient redundancy to recover the true bits. No matter what I do I don't seem to be able to recover the bits at the decode stage. I've tried including / excluding various checks (even if it degrades image quality for the time being). I've tried using fixed point vs. double arithmetic, moving the bit to encode, I suspect that the message bits are being lost during the IDCT back to image. Any thoughts or suggestions on how to get this working would be hugely appreciated. (PS I am aware that the actual DCT/IDCT could be optimized from it's naive On4 operation using row column algorithm, or an FDCT like AAN, but for now it just needs to work :) ) Reference Papers: http://www.lokminglui.com/dct.pdf http://arxiv.org/ftp/arxiv/papers/1006/1006.1186.pdf Code for the Encode/Decode process in C# below: using System; using System.Collections.Generic; using System.Linq; using System.Text; using System.Drawing.Imaging; using System.Drawing; namespace ImageKey { public class Encoder { public const int HIDE_BIT_POS = 3; // use bit position 4 (1 << 3). public const int HIDE_COUNT = 16; // Number of times to repeat the message to avoid error. // JPEG Standard Quantization Matrix. // (to get higher quality multiply by (100-quality)/50 .. // for lower than 50 multiply by 50/quality. Then round to integers and clip to ensure only positive integers. public static double[] Q = {16,11,10,16,24,40,51,61, 12,12,14,19,26,58,60,55, 14,13,16,24,40,57,69,56, 14,17,22,29,51,87,80,62, 18,22,37,56,68,109,103,77, 24,35,55,64,81,104,113,92, 49,64,78,87,103,121,120,101, 72,92,95,98,112,100,103,99}; // Maximum qauality quantization matrix (if all 1's doesn't modify coefficients at all). public static double[] Q2 = {1,1,1,1,1,1,1,1, 1,1,1,1,1,1,1,1, 1,1,1,1,1,1,1,1, 1,1,1,1,1,1,1,1, 1,1,1,1,1,1,1,1, 1,1,1,1,1,1,1,1, 1,1,1,1,1,1,1,1, 1,1,1,1,1,1,1,1}; public static Bitmap Encode(Bitmap b, string key) { Bitmap response = new Bitmap(b.Width, b.Height, PixelFormat.Format32bppArgb); uint imgWidth = ((uint)b.Width) & ~((uint)7); // Maximum usable X resolution (divisible by 8). uint imgHeight = ((uint)b.Height) & ~((uint)7); // Maximum usable Y resolution (divisible by 8). // Start be transferring the unmodified image portions. // As we'll be using slightly less width/height for the encoding process we'll need the edges to be populated. for (int y = 0; y < b.Height; y++) for (int x = 0; x < b.Width; x++) { if( (x >= imgWidth && x < b.Width) || (y>=imgHeight && y < b.Height)) response.SetPixel(x, y, b.GetPixel(x, y)); } // Setup the counters and byte data for the message to encode. StringBuilder sb = new StringBuilder(); for(int i=0;i<HIDE_COUNT;i++) sb.Append(key); byte[] codeBytes = System.Text.Encoding.ASCII.GetBytes(sb.ToString()); int bitofs = 0; // Current bit position we've encoded too. int totalBits = (codeBytes.Length * 8); // Total number of bits to encode. for (int y = 0; y < imgHeight; y += 8) { for (int x = 0; x < imgWidth; x += 8) { int[] redData = GetRedChannelData(b, x, y); int[] greenData = GetGreenChannelData(b, x, y); int[] blueData = GetBlueChannelData(b, x, y); int[] newRedData; int[] newGreenData; int[] newBlueData; if (bitofs < totalBits) { double[] redDCT = DCT(ref redData); double[] greenDCT = DCT(ref greenData); double[] blueDCT = DCT(ref blueData); int[] redDCTI = Quantize(ref redDCT, ref Q2); int[] greenDCTI = Quantize(ref greenDCT, ref Q2); int[] blueDCTI = Quantize(ref blueDCT, ref Q2); int[] blueDCTC = Quantize(ref blueDCT, ref Q); HideBits(ref blueDCTI, ref blueDCTC, ref bitofs, ref totalBits, ref codeBytes); double[] redDCT2 = DeQuantize(ref redDCTI, ref Q2); double[] greenDCT2 = DeQuantize(ref greenDCTI, ref Q2); double[] blueDCT2 = DeQuantize(ref blueDCTI, ref Q2); newRedData = IDCT(ref redDCT2); newGreenData = IDCT(ref greenDCT2); newBlueData = IDCT(ref blueDCT2); } else { newRedData = redData; newGreenData = greenData; newBlueData = blueData; } MapToRGBRange(ref newRedData); MapToRGBRange(ref newGreenData); MapToRGBRange(ref newBlueData); for(int dy=0;dy<8;dy++) { for(int dx=0;dx<8;dx++) { int col = (0xff<<24) + (newRedData[dx+(dy*8)]<<16) + (newGreenData[dx+(dy*8)]<<8) + (newBlueData[dx+(dy*8)]); response.SetPixel(x+dx,y+dy,Color.FromArgb(col)); } } } } if (bitofs < totalBits) throw new Exception("Failed to encode data - insufficient cover image coefficients"); return (response); } public static void HideBits(ref int[] DCTMatrix, ref int[] CMatrix, ref int bitofs, ref int totalBits, ref byte[] codeBytes) { int tempValue = 0; for (int u = 0; u < 8; u++) { for (int v = 0; v < 8; v++) { if ( (u != 0 || v != 0) && CMatrix[v+(u*8)] != 0 && DCTMatrix[v+(u*8)] != 0) { if (bitofs < totalBits) { tempValue = DCTMatrix[v + (u * 8)]; int bytePos = (bitofs) >> 3; int bitPos = (bitofs) % 8; byte mask = (byte)(1 << bitPos); byte value = (byte)((codeBytes[bytePos] & mask) >> bitPos); // 0 or 1. if (value == 0) { int a = DCTMatrix[v + (u * 8)] & (1 << HIDE_BIT_POS); if (a != 0) DCTMatrix[v + (u * 8)] |= (1 << HIDE_BIT_POS) - 1; DCTMatrix[v + (u * 8)] &= ~(1 << HIDE_BIT_POS); } else if (value == 1) { int a = DCTMatrix[v + (u * 8)] & (1 << HIDE_BIT_POS); if (a == 0) DCTMatrix[v + (u * 8)] &= ~((1 << HIDE_BIT_POS) - 1); DCTMatrix[v + (u * 8)] |= (1 << HIDE_BIT_POS); } if (DCTMatrix[v + (u * 8)] != 0) bitofs++; else DCTMatrix[v + (u * 8)] = tempValue; } } } } } public static void MapToRGBRange(ref int[] data) { for(int i=0;i<data.Length;i++) { data[i] += 128; if(data[i] < 0) data[i] = 0; else if(data[i] > 255) data[i] = 255; } } public static int[] GetRedChannelData(Bitmap b, int sx, int sy) { int[] data = new int[8 * 8]; for (int y = sy; y < (sy + 8); y++) { for (int x = sx; x < (sx + 8); x++) { uint col = (uint)b.GetPixel(x,y).ToArgb(); data[(x - sx) + ((y - sy) * 8)] = (int)((col >> 16) & 0xff) - 128; } } return (data); } public static int[] GetGreenChannelData(Bitmap b, int sx, int sy) { int[] data = new int[8 * 8]; for (int y = sy; y < (sy + 8); y++) { for (int x = sx; x < (sx + 8); x++) { uint col = (uint)b.GetPixel(x, y).ToArgb(); data[(x - sx) + ((y - sy) * 8)] = (int)((col >> 8) & 0xff) - 128; } } return (data); } public static int[] GetBlueChannelData(Bitmap b, int sx, int sy) { int[] data = new int[8 * 8]; for (int y = sy; y < (sy + 8); y++) { for (int x = sx; x < (sx + 8); x++) { uint col = (uint)b.GetPixel(x, y).ToArgb(); data[(x - sx) + ((y - sy) * 8)] = (int)((col >> 0) & 0xff) - 128; } } return (data); } public static int[] Quantize(ref double[] DCTMatrix, ref double[] Q) { int[] DCTMatrixOut = new int[8*8]; for (int u = 0; u < 8; u++) { for (int v = 0; v < 8; v++) { DCTMatrixOut[v + (u * 8)] = (int)Math.Round(DCTMatrix[v + (u * 8)] / Q[v + (u * 8)]); } } return(DCTMatrixOut); } public static double[] DeQuantize(ref int[] DCTMatrix, ref double[] Q) { double[] DCTMatrixOut = new double[8*8]; for (int u = 0; u < 8; u++) { for (int v = 0; v < 8; v++) { DCTMatrixOut[v + (u * 8)] = (double)DCTMatrix[v + (u * 8)] * Q[v + (u * 8)]; } } return(DCTMatrixOut); } public static double[] DCT(ref int[] data) { double[] DCTMatrix = new double[8 * 8]; for (int v = 0; v < 8; v++) { for (int u = 0; u < 8; u++) { double cu = 1; if (u == 0) cu = (1.0 / Math.Sqrt(2.0)); double cv = 1; if (v == 0) cv = (1.0 / Math.Sqrt(2.0)); double sum = 0.0; for (int y = 0; y < 8; y++) { for (int x = 0; x < 8; x++) { double s = data[x + (y * 8)]; double dctVal = Math.Cos((2 * y + 1) * v * Math.PI / 16) * Math.Cos((2 * x + 1) * u * Math.PI / 16); sum += s * dctVal; } } DCTMatrix[u + (v * 8)] = (0.25 * cu * cv * sum); } } return (DCTMatrix); } public static int[] IDCT(ref double[] DCTMatrix) { int[] Matrix = new int[8 * 8]; for (int y = 0; y < 8; y++) { for (int x = 0; x < 8; x++) { double sum = 0; for (int v = 0; v < 8; v++) { for (int u = 0; u < 8; u++) { double cu = 1; if (u == 0) cu = (1.0 / Math.Sqrt(2.0)); double cv = 1; if (v == 0) cv = (1.0 / Math.Sqrt(2.0)); double idctVal = (cu * cv) / 4.0 * Math.Cos((2 * y + 1) * v * Math.PI / 16) * Math.Cos((2 * x + 1) * u * Math.PI / 16); sum += (DCTMatrix[u + (v * 8)] * idctVal); } } Matrix[x + (y * 8)] = (int)Math.Round(sum); } } return (Matrix); } } public class Decoder { public static string Decode(Bitmap b, int expectedLength) { expectedLength *= Encoder.HIDE_COUNT; uint imgWidth = ((uint)b.Width) & ~((uint)7); // Maximum usable X resolution (divisible by 8). uint imgHeight = ((uint)b.Height) & ~((uint)7); // Maximum usable Y resolution (divisible by 8). // Setup the counters and byte data for the message to decode. byte[] codeBytes = new byte[expectedLength]; byte[] outBytes = new byte[expectedLength / Encoder.HIDE_COUNT]; int bitofs = 0; // Current bit position we've decoded too. int totalBits = (codeBytes.Length * 8); // Total number of bits to decode. for (int y = 0; y < imgHeight; y += 8) { for (int x = 0; x < imgWidth; x += 8) { int[] blueData = ImageKey.Encoder.GetBlueChannelData(b, x, y); double[] blueDCT = ImageKey.Encoder.DCT(ref blueData); int[] blueDCTI = ImageKey.Encoder.Quantize(ref blueDCT, ref Encoder.Q2); int[] blueDCTC = ImageKey.Encoder.Quantize(ref blueDCT, ref Encoder.Q); if (bitofs < totalBits) GetBits(ref blueDCTI, ref blueDCTC, ref bitofs, ref totalBits, ref codeBytes); } } bitofs = 0; for (int i = 0; i < (expectedLength / Encoder.HIDE_COUNT) * 8; i++) { int bytePos = (bitofs) >> 3; int bitPos = (bitofs) % 8; byte mask = (byte)(1 << bitPos); List<int> values = new List<int>(); int zeroCount = 0; int oneCount = 0; for (int j = 0; j < Encoder.HIDE_COUNT; j++) { int val = (codeBytes[bytePos + ((expectedLength / Encoder.HIDE_COUNT) * j)] & mask) >> bitPos; values.Add(val); if (val == 0) zeroCount++; else oneCount++; } if (oneCount >= zeroCount) outBytes[bytePos] |= mask; bitofs++; values.Clear(); } return (System.Text.Encoding.ASCII.GetString(outBytes)); } public static void GetBits(ref int[] DCTMatrix, ref int[] CMatrix, ref int bitofs, ref int totalBits, ref byte[] codeBytes) { for (int u = 0; u < 8; u++) { for (int v = 0; v < 8; v++) { if ((u != 0 || v != 0) && CMatrix[v + (u * 8)] != 0 && DCTMatrix[v + (u * 8)] != 0) { if (bitofs < totalBits) { int bytePos = (bitofs) >> 3; int bitPos = (bitofs) % 8; byte mask = (byte)(1 << bitPos); int value = DCTMatrix[v + (u * 8)] & (1 << Encoder.HIDE_BIT_POS); if (value != 0) codeBytes[bytePos] |= mask; bitofs++; } } } } } } } UPDATE: By switching to using a QR Code as the source message and swapping a pair of coefficients in each block instead of bit manipulation I've been able to get the message to survive the transform. However to get the message to come through without corruption I have to adjust both coefficients as well as swap them. For example swapping (3,4) and (4,3) in the DCT matrix and then respectively adding 8 and subtracting 8 as an arbitrary constant seems to work. This survives a re-JPEG'ing of 96 but any form of scaling/cropping destroys the message again. I was hoping that by operating on mid to low frequency values that the message would be preserved even under some light image manipulation.

Read the article

Search Results

Search found 442 results on 18 pages for 'the naive'.

Page 18/18 | < Previous Page | 14 15 16 17 18

- by Bernt Habermeier

- by FransBouma

- by claws

- by Simon Cooper

- by Maian

- by Hristo

- by incrediman

- by Ralf Westphal

- by mmr

- by Uri

- by Dennis Zickefoose

- by Tom Williams

- by Tom Williams

- by Lemming

- by Aaron

- by JohnHankinson

< Previous Page | 14 15 16 17 18