Search Results

Search found 283 results on 12 pages for 'gotchas'.

Page 11/12 | < Previous Page | 7 8 9 10 11 12 | Next Page >

Pointer arithmetic and arrays: what's really legal?

- by bitcruncher

Consider the following statements: int *pFarr, *pVarr; int farr[3] = {11,22,33}; int varr[3] = {7,8,9}; pFarr = &(farr[0]); pVarr = varr; At this stage, both pointers are pointing at the start of each respective array address. For *pFarr, we are presently looking at 11 and for *pVarr, 7. Equally, if I request the contents of each array through *farr and *varr, i also get 11 and 7. So far so good. Now, let's try pFarr++ and pVarr++. Great. We're now looking at 22 and 8, as expected. But now... Trying to move up farr++ and varr++ ... and we get "wrong type of argument to increment". Now, I recognize the difference between an array pointer and a regular pointer, but since their behaviour is similar, why this limitation? This is further confusing to me when I also consider that in the same program I can call the following function in an ostensibly correct way and in another incorrect way, and I get the same behaviour, though in contrast to what happened in the code posted above!? working_on_pointers ( pFarr, farr ); // calling with expected parameters working_on_pointers ( farr, pFarr ); // calling with inverted parameters . void working_on_pointers ( int *pExpect, int aExpect[] ) { printf("%i", *pExpect); // displays the contents of pExpect ok printf("%i", *aExpect); // displays the contents of aExpect ok pExpect++; // no warnings or errors aExpect++; // no warnings or errors printf("%i", *pExpect); // displays the next element or an overflow element (with no errors) printf("%i", *aExpect); // displays the next element or an overflow element (with no errors) } Could someone help me to understand why array pointers and pointers behave in similar ways in some contexts, but different in others? So many thanks. EDIT: Noobs like myself could further benefit from this resource: http://www.panix.com/~elflord/cpp/gotchas/index.shtml

Read the article
What are the reasons why the CPU usage doesn’t go 100% with C# and APM?

- by Martin

I have an application which is CPU intensive. When the data is processed on a single thread, the CPU usage goes to 100% for many minutes. So the performance of the application appears to be bound by the CPU. I have multithreaded the logic of the application, which result in an increase of the overall performance. However, the CPU usage hardly goes above 30%-50%. I would expect the CPU (and the many cores) to go to 100% since I process many set of data at the same time. Below is a simplified example of the logic I use to start the threads. When I run this example, the CPU goes to 100% (on an 8/16 cores machine). However, my application which uses the same pattern doesn’t. public class DataExecutionContext { public int Counter { get; set; } // Arrays of data } static void Main(string[] args) { // Load data from the database into the context var contexts = new List<DataExecutionContext>(100); for (int i = 0; i < 100; i++) { contexts.Add(new DataExecutionContext()); } // Data loaded. Start to process. var latch = new CountdownEvent(contexts.Count); var processData = new Action<DataExecutionContext>(c => { // The thread doesn't access data from a DB, file, // network, etc. It reads and write data in RAM only // (in its context). for (int i = 0; i < 100000000; i++) c.Counter++; }); foreach (var context in contexts) { processData.BeginInvoke(context, new AsyncCallback(ar => { latch.Signal(); }), null); } latch.Wait(); } I have reduced the number of locks to the strict minimum (only the latch is locking). The best way I found was to create a context in which a thread can read/write in memory. Contexts are not shared among other threads. The threads can’t access the database, files or network. In other words, I profiled my application and I didn’t find any bottleneck. Why the CPU usage of my application doesn’t go about 50%? Is it the pattern I use? Should I create my own thread instead of using the .Net thread pool? Is there any gotchas? Is there any tool that you could recommend me to find my issue? Thanks!

Read the article
What is the best workaround for the WCF client `using` block issue?

- by Eric King

I like instantiating my WCF service clients within a using block as it's pretty much the standard way to use resources that implement IDisposable: using (var client = new SomeWCFServiceClient()) { //Do something with the client } But, as noted in this MSDN article, wrapping a WCF client in a using block could mask any errors that result in the client being left in a faulted state (like a timeout or communication problem). Long story short, when Dispose() is called, the client's Close() method fires, but throws and error because it's in a faulted state. The original exception is then masked by the second exception. Not good. The suggested workaround in the MSDN article is to completely avoid using a using block, and to instead instantiate your clients and use them something like this: try { ... client.Close(); } catch (CommunicationException e) { ... client.Abort(); } catch (TimeoutException e) { ... client.Abort(); } catch (Exception e) { ... client.Abort(); throw; } Compared to the using block, I think that's ugly. And a lot of code to write each time you need a client. Luckily, I found a few other workarounds, such as this one on IServiceOriented. You start with: public delegate void UseServiceDelegate<T>(T proxy); public static class Service<T> { public static ChannelFactory<T> _channelFactory = new ChannelFactory<T>(""); public static void Use(UseServiceDelegate<T> codeBlock) { IClientChannel proxy = (IClientChannel)_channelFactory.CreateChannel(); bool success = false; try { codeBlock((T)proxy); proxy.Close(); success = true; } finally { if (!success) { proxy.Abort(); } } } } Which then allows: Service<IOrderService>.Use(orderService => { orderService.PlaceOrder(request); } That's not bad, but I don't think it's as expressive and easily understandable as the using block. The workaround I'm currently trying to use I first read about on blog.davidbarret.net. Basically you override the client's Dispose() method wherever you use it. Something like: public partial class SomeWCFServiceClient : IDisposable { void IDisposable.Dispose() { if (this.State == CommunicationState.Faulted) { this.Abort(); } else { this.Close(); } } } This appears to be able to allow the using block again without the danger of masking a faulted state exception. So, are there any other gotchas I have to look out for using these workarounds? Has anybody come up with anything better?

Read the article
May volatile be in user defined types to help writing thread-safe code

- by David Rodríguez - dribeas

I know, it has been made quite clear in a couple of questions/answers before, that volatile is related to the visible state of the c++ memory model and not to multithreading. On the other hand, this article by Alexandrescu uses the volatile keyword not as a runtime feature but rather as a compile time check to force the compiler into failing to accept code that could be not thread safe. In the article the keyword is used more like a required_thread_safety tag than the actual intended use of volatile. Is this (ab)use of volatile appropriate? What possible gotchas may be hidden in the approach? The first thing that comes to mind is added confusion: volatile is not related to thread safety, but by lack of a better tool I could accept it. Basic simplification of the article: If you declare a variable volatile, only volatile member methods can be called on it, so the compiler will block calling code to other methods. Declaring an std::vector instance as volatile will block all uses of the class. Adding a wrapper in the shape of a locking pointer that performs a const_cast to release the volatile requirement, any access through the locking pointer will be allowed. Stealing from the article: template <typename T> class LockingPtr { public: // Constructors/destructors LockingPtr(volatile T& obj, Mutex& mtx) : pObj_(const_cast<T*>(&obj)), pMtx_(&mtx) { mtx.Lock(); } ~LockingPtr() { pMtx_->Unlock(); } // Pointer behavior T& operator*() { return *pObj_; } T* operator->() { return pObj_; } private: T* pObj_; Mutex* pMtx_; LockingPtr(const LockingPtr&); LockingPtr& operator=(const LockingPtr&); }; class SyncBuf { public: void Thread1() { LockingPtr<BufT> lpBuf(buffer_, mtx_); BufT::iterator i = lpBuf->begin(); for (; i != lpBuf->end(); ++i) { // ... use *i ... } } void Thread2(); private: typedef vector<char> BufT; volatile BufT buffer_; Mutex mtx_; // controls access to buffer_ };

Read the article
How to designing a generic databse whos layout may change over time?

- by mawg

Here's a tricky one - how do I programatically create and interrogate a database who's contents I can't really foresee? I am implementing a generic input form system. The user can create PHP forms with a WYSIWYG layout and use them for any purpose he wishes. He can also query the input. So, we have three stages: a form is designed and generated. This is a one-off procedure, although the form can be edited later. This designs the database. someone or several people make use of the form - say for daily sales reports, stock keeping, payroll, etc. Their input to the forms is written to the database. others, maybe management, can query the database and generate reports. Since these forms are generic, I can't predict the database structure - other than to say that it will reflect HTML form fields and consist of a the data input from collection of edit boxes, memos, radio buttons and the like. Questions and remarks: A) how can I best structure the database, in terms of tables and columns? What about primary keys? My first thought was to use the control name to identify each column, then I realized that the user can edit the form and rename, so that maybe "name" becomes "employee" or "wages" becomes ":salary". I am leaning towards a unique number for each. B) how best to key the rows? I was thinking of a timestamp to allow me to query and a column for the row Id from A) C) I have to handle column rename/insert/delete. Foe deletion, I am unsure whether to delete the data from the database. Even if the user is not inputting it from the form any more he may wish to query what was previously entered. Or there may be some legal requirements to retain the data. Any gotchas in column rename/insert/delete? D) For the querying, I can have my PHP interrogate the database to get column names and generate a form with a list where each entry has a database column name, a checkbox to say if it should be used in the query and, based on column type, some selection criteria. That ought to be enough to build searches like "position = 'senior salesman' and salary 50k". E) I probably have to generate some fancy charts - graphs, histograms, pie charts, etc for query results of numerical data over time. I need to find some good FOSS PHP for this. F) What else have I forgotten? This all seems very tricky to me, but I am database n00b - maybe it is simple to you gurus?

Read the article
Use of text() function when using xPath in dom4j

- by jlawless

I have inherited an application that parses xml using dom4j and xPath: The xml being parsed is similar to the following: <cache> <content> <transaction> <page> <widget name="PAGE_ID">WRK_REGISTRATION</widget> <widget name="TRANS_DETAIL_ID">77145</widget> <widget name="GRD_ERRORS" /> </page> <page> <widget name="PAGE_ID">WRK_REGISTRATION</widget> <widget name="TRANS_DETAIL_ID">77147</widget> <widget name="GRD_ERRORS" /> </page> <page> <widget name="PAGE_ID">WRK_PROCESSING</widget> <widget name="TRANS_DETAIL_ID">77152</widget> <widget name="GRD_ERRORS" /> </page> </transaction> </content> </cache> Individual Nodes are being searched using the following: String xPathToGridErrorNode = "//cache/content/transaction/page/widget[@name='PAGE_ID'][text()='WRK_DNA_REGISTRATION']/../widget[@name='TRANS_DETAIL_ID'][text()='77147']/../widget[@name='GRD_ERRORS_TEMP']"; org.dom4j.Element root = null; SAXReader reader = new SAXReader(); Document document = reader.read(new BufferedInputStream(new ByteArrayInputStream(xmlToParse.getBytes()))); root = document.getRootElement(); Node gridNode = root.selectSingleNode(xPathToGridErrorNode); where xmlToParse is a String of xml similar to the excerpt provided above. The code is trying to obtain the GRD_ERROR node for the page with the PAGE_ID and TRANS_DETAIL_ID provided in the xPath. I am seeing an intermittent (~1-2%) failure (returned node is null) of this selectSingleNode request even though the requested node is in the xml being searched. I know there are some gotchas associated with using text()= in xPath and was wondering if there was a better way to format the xPath string for this type of search.

Read the article
Linker Issues with boost::thread under linux using Eclipse and CMake

- by OcularProgrammer

I'm in the process of attempting to port some code across from PC to Ubuntu, and am having some issues due to limited experience developing under linux. We use CMake to generate all our build stuff. Under windows I'm making VS2010 projects, and under Linux I'm making Eclipse projects. I've managed to get my OpenCV stuff ported across successfully, but am having major headaches trying to port my threaded boost apps. Just so we're clear, the steps I have followed so-far on a clean Ubuntu 12 installation. (I've done 2 clean re-installs to try and fix potential library cock-ups, now I'm just giving up and asking): Install Eclipse and Eclipse CDT using my package manager Install CMake and CMake Gui using my package manager Install libboost-all-dev using my package manager So-far that's all I've done. I can create the eclipse project using CMake with no errors, so CMake is successfully finding my boost install. When I try and build through eclipse is when I get issues; The app I'm attempting to build uses boost::asio for some UDP I/O and boost::thread to create worker threads for the asio I/O services. I can successfully compile each module, but when I come to link I get spammed with errors such as: /usr/bin/c++ CMakeFiles/RE05DevelopmentDemo.dir/main.cpp.o CMakeFiles/RE05DevelopmentDemo.dir/RE05FusionListener/RE05FusionListener.cpp.o CMakeFiles/RE05DevelopmentDemo.dir/NewEye/NewEye.cpp.o -o RE05DevelopmentDemo -rdynamic -Wl,-Bstatic -lboost_system-mt -lboost_date_time-mt -lboost_regex-mt -lboost_thread-mt -Wl,-Bdynamic /usr/lib/gcc/x86_64-linux-gnu/4.6/../../../../lib/libboost_thread-mt.a(thread.o): In function `void boost::call_once<void (*)()>(boost::once_flag&, void (*)()) [clone .constprop.98]': make[2]: Leaving directory `/home/david/Code/Build/Support/RE05DevDemo' (.text+0xc8): undefined reference to `pthread_key_create' /usr/lib/gcc/x86_64-linux-gnu/4.6/../../../../lib/libboost_thread-mt.a(thread.o): In function `boost::this_thread::interruption_enabled()': (.text+0x540): undefined reference to `pthread_getspecific' make[1]: Leaving directory `/home/david/Code/Build/Support/RE05DevDemo' /usr/lib/gcc/x86_64-linux-gnu/4.6/../../../../lib/libboost_thread-mt.a(thread.o): In function `boost::this_thread::disable_interruption::disable_interruption()': (.text+0x570): undefined reference to `pthread_getspecific' /usr/lib/gcc/x86_64-linux-gnu/4.6/../../../../lib/libboost_thread-mt.a(thread.o): In function `boost::this_thread::disable_interruption::disable_interruption()': (.text+0x59f): undefined reference to `pthread_getspecific' Some Gotchas that I have collected from other StackOverflow posts and have already checked: The boost libs are all present at /usr/lib I am not getting any compile errors for inability to find the boost headers, so they must be getting found. I am trying to link statically, but I believe eclipse should be passing the correct arguments to make that happen since my CMakeLists.txt includes SET(Boost_USE_STATIC_LIBS ON) I'm officially out of ideas here, I have tried doing local builds of boost and a bunch of other stuff with no more success. I even re-installed Ubuntu to ensure I haven't completely fracked the libs directories and links with multiple weird versions or anything else. Any help would be muchly appreciated.

Read the article
Exchange Server 2007 Setup

- by AlamedaDad

Hi, I'm working on a upgrade to Exchange 2007 and I wanted to get some advise on hardware choices. We currently have an Exchange 2003 STD server with 400 users split between 6 AD Sites, that is housed on a single server. We need to move to a redundant, fault tolerant system to support our users. I'm planning on installing 2 Dell 1950 servers with W2k8-std to act as CAS and Hub servers, with NLB to allow abstraction of the actual server name to the users. There won't be an edge system since we have a Barracuda box already that will handle in/out spam/virus filtering. Backend I'm planning on 2 mailbox servers which will be Dell 2950s with 16GB RAM, 2 either dual-core or quad-core CPUs and 6 300GB SAS drives in some RAID config. These systems will be clustered using W2k8 Ent clustering and running CCR in Exchange. My questions are as follows: Is 16GB enough RAM for serving that many mailboxes along with the windows clustering and ccr? I'm trying to figure out disk layouts and I'm unsure of whether to use all local disk or some local and some SAN, via an OpenFiler iSCSI server. The SAN would be a Dell 2850 with 6 - 300GB SCSI drives and a PERC controller to slice as I want, with 8GB RAM. Option 1: 2 drives, RAID 1 - OS 2 drives, RAID 1 - Logs 2 drives, RAID 1 - Mail stores Option 2: 2 drives, RAID 1 - OS and logs 4 drives, RAID 5 - Mail Stores and scratch space for eseutil. Option 3: 2 drives, RAID 1 - OS 2 drives, RAID 1 - Logs 2 drives, RAID 0 - scratch space ~300GB iSCSI volume for mail stores Option 4: 2 drives, RAID 1 - OS 4 drives, RAID 5 - scratch space ~300GB iSCSI volume for mail stores ~300GB iSCSI volume for logs I have 2 sockets for CPUs and need to chose between dual and quad cores. The dual core have faster clocks but less cache and I'm thinking older architecture. Am I better off with more cores and cache while sacraficing clock speed? I am planning on adding the new E2K7 cluster to the E2K3 server and then move each mailbox over, all at once, then remove the old server. This seems more complicated than simply getting rid of the 2003 server and then adding the 2007 cluster and restoring the mailboxes using PowerControls or exmerge. The migration option lets me do this on my time, where a cutover means it all needs to work at once. If I go with the cutover method, how can I prebuild the servers and add them to the domain right after removing the 2003 server, or can't I? I think the answer is no and the migration is my only real option if I want to prebuild. I need to also migrate about 30GB of Public Folders. Is there anything special about this, other than specifying in the E2K7 install that I want older Outlook clients and PF's setup? I guess I could even keep the E2K3 server to host just the PFs? Lastly, if I have a mix of Outlook 200, 2003 and 2007 what do I need to do to make sure they all have access to the GAL and OAB? At time of cutover, we'll be at like 90% 2007, but we will have some older stuff around. My plan is to use Outlook Anywhere on laptops that are used outside the physical network. Are there any gotchas involved in that? I'm even thinking about using is for all Outlook clients, does anyone do that? The reason I'm considering it is that our WAN is really VPN tunnels over internet connections, so not a fully messhed, stable WAN. Thank you all very much for the assistance in advance and I look forward to discussion of these points! Regards...Michael

Read the article
Using different SSDs types (not only SATA based) as system drive

- by Hubert Kario

Currently I have a Thinkpad X61s and want to make it both a bit faster and a bit more power efficient. For that reason I thought that adding SSD drive would make most sense. Unfortunately, because of financial reasons, buying SSD of over 200GB capacity is out of reach for me (not only it would be worth more than the rest of the laptop, but also I currently have a 500GB drive in it, so even such a drive would be kind of a downgrade for me). During preliminary testing with a cheap Transcend 4GB Class 6 (14MiB/s streaming, 9MiB/s random read) card I experienced boot times to be reduced by half so putting the OS only on it would already would be an improvement. Unfortunately, my system now is about 11GiB in size so anything less than 16GB would be constraining. In this laptop I can connect additional drives on at least 5 different ways: using SATA-ATA converter caddy in the X6 Ultrabase using internal mini PCIe slot using integrated SDHC slot using CardBus (a.k.a PCMCIA or PC Card) slot using USB Thankfully, because I use only Linux on this PC the bootability of them is irrelevant as I can put the /boot partition on internal HDD and / on any of the above mentioned Flash memories (as I already did for the SDHC test). From what I was able to research and from my own experience those options come with rather big downsides or other problems: SATA-ATA caddy It has three downsides: I have to carry the Ultrabse with me at all times (it's not really inconvenient, but those grams do add) and couldn't disconnect it when I want to disconnect the battery It makes the bay unusable for the optical drive and occasional quick access to other hard drives the only caddies I could buy have rather flaky controllers in them so putting my OS on it would hamper its stability Internal mini PCIe slot This would be an ideal solution, if only I could find real PCIe SSDs, not only devices that could talk only SATA or ATA over PCIe mechanical connection (the ones used in Dell Mini or Asus EEE). Theoretically Samsung did release such devices but I couldn't find them in retail anywhere. Integrated SDHC slot It's a nice solution with a single drawback: the fastest 16GB SDHC card on the market can only do around 35MiB/s read and 15MiB/s write while still costing like a normal 40GB SATA SSD that's 10 times faster. Not really cost-effective. CardBus (a.k.a PCMCIA or PC Card) slot Those cards are much faster than the SDHC option (there are ones that can do well over 50MiB/s read in benchmarks) and from what I could find the PCMCIA controller in my laptop does support UDMA so it should be able to deliver comparable speeds. They still cost similarly to SD cards but at least they provide streaming performance comparable to my current HDD. USB That's the worst option. Not only is it limited to 20-30MiB/s by the interface itself the drive would stick out of the laptop so it's a big no no. The question As such I think that going the "CF in a CardBus adapter" route will be the best option. My question is: did anyone try using CF cards in CardBus adapters as system drives with Linux on Thinkpad laptops? Laptops in general? What was the real-world performance? I don't have any CF cards so I can't check how well does it work with suspend/resume, or whatever it's easy to make it work in initramfs (I'm using ArchLinux and SD card was trivial — add 3 modules in single config line and rebuilding initramfs) so any tips/gotchas on this are welcome as well.

Read the article
Brighton Rocks: UA Europe 2011

- by ultan o'broin

User Assistance Europe 2011 was held in Brighton, UK. Having seen Quadrophenia a dozen times, I just had to go along (OK, I wanted to talk about messages in enterprise applications). Sadly, it rained a lot, though that was still eminently more tolerable than being stuck home in Dublin during Bloomsday. So, here are my somewhat selective highlights and observations from the conference, massively skewed towards my own interests, as usual. Enjoyed Leah Guren's (Cow TC) great start ‘keynote’ on the Cultural Dimensions of Software Help Usage. Starting out by revisiting Hofstede's and Hall's work on culture (how many times I have done this for Multilingual magazine?) and then Neilsen’s findings on age as an indicator of performance, Leah showed how it is the expertise of the user that user assistance (UA) needs to be designed for (especially for high-end users), with some considerations made for age, while the gender and culture of users are not major factors. Help also needs to be contextual and concise, embedded close to the action. That users are saying things like “If I want help on Office, I go to Google ” isn't all that profound at this stage, but it is always worth reiterating how search can be optimized to return better results for users. Interestingly, regardless of user education level, the issue of information quality--hinging on the lynchpin of terminology reflecting that of the user--is critical. Major takeaway for me there. Matthew Ellison’s sessions on embedded help and demos were also impressive. Embedded help that is concise and contextual is definitely a powerful UX enabler, and I’m pleased to say that in Oracle Fusion Applications we have embraced the concept fully. Matthew also mentioned in his session about successful software demos that the principle of modality with demos is a must. Look no further than Oracle User Productivity Kit demos See It!, Try It!, Know It, and Do It! modes, for example. I also found some key takeaways in the presentation by Marie-Louise Flacke on notes and warnings. Here, legal considerations seemed to take precedence over providing any real information to users. I was delighted when Marie-Louise called out the Oracle JDeveloper documentation as an exemplar of how to use notes and instructions instead of trying to scare the bejaysus out of people and not providing them with any real information they’d find useful instead. My own session on designing messages for enterprise applications was well attended. Knowing your user profiles (remember user expertise is the king maker for UA so write for each audience involved), how users really work, the required application business and UI rules, what your application technology supports, and how messages integrate with the enterprise help desk and support policies and you will go much further than relying solely on the guideline of "writing messages in plain language". And, remember the value in warnings and confirmation messages too, and how you can use them smartly. I hope y’all got something from my presentation and from my answers to questions afterwards. Ellis Pratt stole the show with his presentation on applying game theory to software UA, using plenty of colorful, relevant examples (check out the Atlassian and DropBox approaches, for example), and striking just the right balance between theory and practice. Completely agree that the approach to take here is not to make UA itself a game, but to invoke UA as part of a bigger game dynamic (time-to-task completion, personal and communal goals, personal achievement and status, and so on). Sure there are gotchas and limitations to gamification, and we need to do more research. However, we'll hear a lot more about this subject in coming years, particularly in the enterprise space. I hope. I also heard good things about the different sessions about DITA usage (including one by Sonja Fuga that clearly opens the door for major innovation in the community content space using WordPress), the progressive disclosure of information (Cerys Willoughby), an overview of controlled language (or "information quality", as I like to position it) solutions and rationale by Dave Gash, and others. I also spent time chatting with Mike Hamilton of MadCap Software, who showed me a cool demo of their Flare product, and the Lingo translation solution. I liked the idea of their licensing model for workers-on-the-go; that’s smart UX-awareness in itself. Also chatted with Julian Murfitt of Mekon about uptake of DITA in the enterprise space. In all, it's worth attending UA Europe. I was surprised, however, not to see conference topics about mobile UA, community conversation and content, and search in its own right. These are unstoppable forces now, and the latter is pretty central to providing assistance now to all but the most irredentist of hard-copy fetishists or advanced technical or functional users working away on the back end of applications and systems. Only saw one iPad too (says the guy who carries three laptops). Tweeting during the conference was pretty much nonexistent during the event, so no community energy there. Perhaps all this can be addressed next year. I would love to see the next UA Europe event come to Dublin (despite Bloomsday, it's not a bad place place, really) now that hotels are so cheap and all. So, what is my overall impression of the state of user assistance in Europe? Clearly, there are still many people in the industry who feel there is something broken with the traditional forms of user assistance (particularly printed doc) and something needs to be done about it. I would suggest they move on and try and embrace change, instead. Many others see new possibilities, offered by UX and technology, as well as the reality of online user behavior in an increasingly connected world and that is encouraging. Such thought leaders need to be listened to. As Ellis Pratt says in his great book, Trends in Technical Communication - Rethinking Help: “To stay relevant means taking a new perspective on the role (of technical writer), and delivering “products” over and above the traditional manual and online Help file... there are a number of new trends in this field - some complementary, some conflicting. Whatever trends emerge as the norm, it’s likely the status quo will change.” It already has, IMO. I hear similar debates in the professional translation world about the onset of translation crowd sourcing (the Facebook model) and machine translation (trust me, that battle is over). Neither of these initiatives has put anyone out of a job and probably won't, though the nature of the work might change. If anything, such innovations have increased the overall need for professional translators as user expectations rise, new audiences emerge, and organizations need to collate and curate user-generated content, combining it with their own. Perhaps user assistance professionals can learn from other professions and grow accordingly.

Read the article
.NET to iOS: From WinForms to the iPad

- by RobertChipperfield

One of the great things about working at Red Gate is getting to play with new technology - and right now, that means mobile. A few weeks ago, we decided that a little research into the tablet computing arena was due, and purely from a numbers point of view, that suggested the iPad as a good target device. A quick trip to iPhoneDevCon in San Diego later, and Marine and I came back full of ideas, and with some concept of how iOS development was meant to work. Here's how we went from there to the release of Stacks & Heaps, our geeky take on the classic "Snakes & Ladders" game. Step 1: Buy a Mac I've played with many operating systems in my time: from the original BBC Model B, through DOS, Windows, Linux, and others, but I'd so far managed to avoid buying fruit-flavoured computer hardware! If you want to develop for the iPhone, iPad or iPod Touch, that's the first thing that needs to change. If you've not used OS X before, the first thing you'll realise is that everything is different! In the interests of avoiding a flame war in the comments section, I'll only go so far as to say that a lot of my Windows-flavoured muscle memory no longer worked. If you're in the UK, you'll also realise your keyboard is lacking a # key, and that " and @ are the other way around from normal. The wonderful Ukelele keyboard layout editor restores some sanity here, as long as you don't look at the keyboard when you're typing. I couldn't give up the PC entirely, but a handy application called Synergy comes to the rescue - it lets you share a single keyboard and mouse between multiple machines. There's a few limitations: Alt-Tab always seems to go to the Mac, and Windows 7's UAC dialogs require the local mouse for security reasons, but it gets you a long way at least. Step 2: Register as an Apple Developer You can register as an Apple Developer free of charge, and that lets you download XCode and the iOS SDK. You also get the iPhone / iPad emulator, which is handy, since you'll need to be a paid member before you can deploy your apps to a real device. You can either enroll as an individual, or as a company. They both cost the same ($99/year), but there's a few differences between them. If you register as a company, you can add multiple developers to your team (all for the same $99 - not $99 per developer), and you get to use your company name in the App Store. However, you'll need to send off significantly more documentation to Apple, and I suspect the process takes rather longer than for an individual, where they just need to verify some credit card details. Here's a tip: if you're registering as a company, do so as early as possible. The approval process can take a while to complete, so get the application in in plenty of time. Step 3: Learn to love the square brackets! Objective-C is the language of the iPad. C and C++ are also supported, and if you're doing some serious game development, you'll probably spend most of your time in C++ talking OpenGL, but for forms-based apps, you'll be interacting with a lot of the Objective-C SDK. Like shifting from Ctrl-C to Cmd-C, it feels a little odd at first, with the familiar string.format(.) turning into: NSString *myString = [NSString stringWithFormat:@"Hello world, it's %@", [NSDate date]]; Thankfully XCode's auto-complete is normally passable, if not up to Visual Studio's standards, which coupled with a huge amount of content on Stack Overflow means you'll soon get to grips with the API. You'll need to get used to some terminology changes, though; here's an incomplete approximation: Coming from a .NET background, there's some luxuries you no longer have developing Objective C in XCode: Generics! Remember back in .NET 1.1, when all collections were just objects? Yup, we're back there now. ReSharper. Or, more generally, very much refactoring support. The not-many-keystrokes to rename a class, its file, and al references to it in Visual Studio turns into a much more painful experience in XCode. Garbage collection. This is actually rather less of an issue than you might expect: if you follow the rules, the reference counting provided by Objective C gets you a long way without too much pain. Circular references are their usual problematic self, though. Decent exception handling. You do have exceptions, but they're nowhere near as widely used. Generally, if something goes wrong, you get nil (see translation table above) back. Which brings me on to. Calling a method on a nil object isn't a failure - it just returns nil itself! There's many arguments for and against this, but personally I fall into the "stuff should fail as quickly and explicitly as possible" camp. Less specifically, I found that there's more chance of code failing at runtime rather than getting caught at compile-time: using the @selector(.) syntax to pass a method signature isn't (can't be) checked at compile-time, so the first you know about a typo is a crash when you try and call it. The solution to this is of course lots of great testing, both automated and manual, but I still find comfort in provably correct type safety being enforced in addition to testing. Step 4: Submit to the App Store Assuming you want to distribute to more than a handful of devices, you're going to need to submit your app to the Apple App Store. There's a few gotchas in terms of getting builds signed with the right certificates, and you'll be bouncing around between XCode and iTunes Connect a fair bit, but eventually you get everything checked off the to-do list, and are ready to upload your first binary! With some amount of anticipation, I pressed the Upload button in XCode, ready to release our creation into the world, but was instead greeted by an error informing me my XML file was malformed. Uh. A little Googling later, and it turned out that a simple rename from "Stacks&Heaps.app" to "StacksAndHeaps.app" worked around an XML escaping bug, and we were good to go. The next step is to wait for approval (or otherwise). After a couple of weeks of intensive development, this part is agonising. Did we make it? The Apple jury is still out at the moment, but our fingers are firmly crossed! In the meantime, you can see some screenshots and leave us your email address if you'd like us to get in touch when it does go live at the MobileFoo website. Step 5: Profit! Actually, that wasn't the idea here: Stacks & Heaps is free; there's no adverts, and we're not going to sell all your data either. So why did we do it? We wanted to get an idea of what it's like to move from coding for a desktop environment, to something completely different. We don't know whether in a year's time, the iPad will still be the dominant force, or whether Android will have smoothed out some bugs, tweaked the performance, and polished the UI, but I think it's a fairly sure bet that the tablet form factor is here to stay. We want to meet people who are using it, start chatting to them, and find out about some of the pain they're feeling. What better way to do that than do it ourselves, and get to write a cool game in the process?

Read the article
SQL SERVER – SSIS Look Up Component – Cache Mode – Notes from the Field #028

- by Pinal Dave

[Notes from Pinal]: Lots of people think that SSIS is all about arranging various operations together in one logical flow. Well, the understanding is absolutely correct, but the implementation of the same is not as easy as it seems. Similarly most of the people think lookup component is just component which does look up for additional information and does not pay much attention to it. Due to the same reason they do not pay attention to the same and eventually get very bad performance. Linchpin People are database coaches and wellness experts for a data driven world. In this 28th episode of the Notes from the Fields series database expert Tim Mitchell (partner at Linchpin People) shares very interesting conversation related to how to write a good lookup component with Cache Mode. In SQL Server Integration Services, the lookup component is one of the most frequently used tools for data validation and completion. The lookup component is provided as a means to virtually join one set of data to another to validate and/or retrieve missing values. Properly configured, it is reliable and reasonably fast. Among the many settings available on the lookup component, one of the most critical is the cache mode. This selection will determine whether and how the distinct lookup values are cached during package execution. It is critical to know how cache modes affect the result of the lookup and the performance of the package, as choosing the wrong setting can lead to poorly performing packages, and in some cases, incorrect results. Full Cache The full cache mode setting is the default cache mode selection in the SSIS lookup transformation. Like the name implies, full cache mode will cause the lookup transformation to retrieve and store in SSIS cache the entire set of data from the specified lookup location. As a result, the data flow in which the lookup transformation resides will not start processing any data buffers until all of the rows from the lookup query have been cached in SSIS. The most commonly used cache mode is the full cache setting, and for good reason. The full cache setting has the most practical applications, and should be considered the go-to cache setting when dealing with an untested set of data. With a moderately sized set of reference data, a lookup transformation using full cache mode usually performs well. Full cache mode does not require multiple round trips to the database, since the entire reference result set is cached prior to data flow execution. There are a few potential gotchas to be aware of when using full cache mode. First, you can see some performance issues – memory pressure in particular – when using full cache mode against large sets of reference data. If the table you use for the lookup is very large (either deep or wide, or perhaps both), there’s going to be a performance cost associated with retrieving and caching all of that data. Also, keep in mind that when doing a lookup on character data, full cache mode will always do a case-sensitive (and in some cases, space-sensitive) string comparison even if your database is set to a case-insensitive collation. This is because the in-memory lookup uses a .NET string comparison (which is case- and space-sensitive) as opposed to a database string comparison (which may be case sensitive, depending on collation). There’s a relatively easy workaround in which you can use the UPPER() or LOWER() function in the pipeline data and the reference data to ensure that case differences do not impact the success of your lookup operation. Again, neither of these present a reason to avoid full cache mode, but should be used to determine whether full cache mode should be used in a given situation. Full cache mode is ideally useful when one or all of the following conditions exist: The size of the reference data set is small to moderately sized The size of the pipeline data set (the data you are comparing to the lookup table) is large, is unknown at design time, or is unpredictable Each distinct key value(s) in the pipeline data set is expected to be found multiple times in that set of data Partial Cache When using the partial cache setting, lookup values will still be cached, but only as each distinct value is encountered in the data flow. Initially, each distinct value will be retrieved individually from the specified source, and then cached. To be clear, this is a row-by-row lookup for each distinct key value(s). This is a less frequently used cache setting because it addresses a narrower set of scenarios. Because each distinct key value(s) combination requires a relational round trip to the lookup source, performance can be an issue, especially with a large pipeline data set to be compared to the lookup data set. If you have, for example, a million records from your pipeline data source, you have the potential for doing a million lookup queries against your lookup data source (depending on the number of distinct values in the key column(s)). Therefore, one has to be keenly aware of the expected row count and value distribution of the pipeline data to safely use partial cache mode. Using partial cache mode is ideally suited for the conditions below: The size of the data in the pipeline (more specifically, the number of distinct key column) is relatively small The size of the lookup data is too large to effectively store in cache The lookup source is well indexed to allow for fast retrieval of row-by-row values No Cache As you might guess, selecting no cache mode will not add any values to the lookup cache in SSIS. As a result, every single row in the pipeline data set will require a query against the lookup source. Since no data is cached, it is possible to save a small amount of overhead in SSIS memory in cases where key values are not reused. In the real world, I don’t see a lot of use of the no cache setting, but I can imagine some edge cases where it might be useful. As such, it’s critical to know your data before choosing this option. Obviously, performance will be an issue with anything other than small sets of data, as the no cache setting requires row-by-row processing of all of the data in the pipeline. I would recommend considering the no cache mode only when all of the below conditions are true: The reference data set is too large to reasonably be loaded into SSIS memory The pipeline data set is small and is not expected to grow There are expected to be very few or no duplicates of the key values(s) in the pipeline data set (i.e., there would be no benefit from caching these values) Conclusion The cache mode, an often-overlooked setting on the SSIS lookup component, represents an important design decision in your SSIS data flow. Choosing the right lookup cache mode directly impacts the fidelity of your results and the performance of package execution. Know how this selection impacts your ETL loads, and you’ll end up with more reliable, faster packages. If you want me to take a look at your server and its settings, or if your server is facing any issue we can Fix Your SQL Server. Reference: Pinal Dave (http://blog.sqlauthority.com)Filed under: Notes from the Field, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL Tagged: SSIS

Read the article
Rebuilding CoasterBuzz, Part III: The architecture using the "Web stack of love"

- by Jeff

This is the third post in a series about rebuilding one of my Web sites, which has been around for 12 years. I hope to relaunch in the next month or two. More: Part I: Evolution, and death to WCF Part II: Hot data objects I finally hit a point in the re-do of CoasterBuzz where I feel like the major pieces are in place... rewritten, ported and what not, so that I can focus now on front-end design and more interesting creative problems. I've been asked on more than one occasion (OK, just twice) what's going on under the covers, so I figure this might be a good time to explain the overall architecture. As it turns out, I'm using a whole lof of the "Web stack of love," as Scott Hanselman likes to refer to it. Oh that Hanselman. First off, at the center of it all, is BizTalk. Just kidding. That's "enterprise architecture" humor, where every discussion starts with how they'll use BizTalk. Here are the bigger moving parts: It's fairly straight forward. A common library lives in a number of Web apps, all of which are (or will be) powered by ASP.NET MVC 4. They all talk to the same database. There is the main Web site, which also has the endpoint for the Silverlight-based Feed app. The cstr.bz site handles redirects, which are generated when news items are published and sent to Twitter. Facebook publishing is handled via the RSS Graffiti Facebook app. The API site handles requests from the Windows Phone app. The main site depends very heavily on POP Forums, the open source, MVC-based forum I maintain. It serves a number of functions, primarily handling users. These user objects serve in non-forum roles to handle things like news and database contributions, maintaining track records (coaster nerd for "list of rides I've been on") and, perhaps most importantly, paid club memberships. Before I get into more specifics, note that the "glue" for everything is Ninject, the dependency injection framework. I actually prefer StructureMap these days, but I started with Ninject in POP Forums a long time ago. POP Forums has a static class, PopForumsActivation, that new's up an instance of the container, and you can call it from where ever. The downside is that the forums require Ninject in your MVC app as the default dependency resolver. At some point, I'll decouple it, but for now it's not in the way. In the general sense, the entire set of apps follow a repository-service-controller-view pattern. Repos just do data access, service classes do business logic, controllers compose and route, views view. The forum also provides Scoring Game functionality. The Scoring Game is a reasonably abstract framework to award users points based on certain actions, and then award achievements when a certain number of point events happen. For example, the forum already awards a point when someone plus-one's a post you made. You can set up an achievement that says, "Give the user an award when they've had 100 posts plus'd." It also does zero-point entries into the ledger, so if you make a post, you could award an achievement based on 100 posts made. Wiring in the scoring game to CoasterBuzz functionality is just a matter of going to the Ninject container and getting an instance of the event publisher, and passing it events. Forum adapters were introduced into POP Forums a few versions ago, and they can intercept the model generated for forum topic lists and threads and designate an alternate view. These are used to make the "Day in Pictures" forum, where users can upload photos as frame-by-frame photo threads. Another adapter adds an association UI, so users can associate specific amusement parks with their trip report posts. The Silverlight-based Feed app talks to a simple JSON endpoint in the main app. This uses an underlying library I wrote ages ago, simply called Feeds, that aggregates event information. You inherit from a base class that creates instances of a publisher interface, and then use that class to send it an event type and any number of data fields. Feeds has two publishers: One is to the database, and that's used for the endpoint that talks to the Silverlight app. The second publisher publishes to Twitter, if the event is of the type "news." The wiring is a little strange, because for the new posts and topics events, I'm actually pulling out the forum repository classes from the Ninject container and replacing them with overridden methods to publish. I should probably be doing this at the service class level, but whatever. It's my mess. cstr.bz doesn't do anything interesting. It looks up the path, and if it has a match, does a 301 redirect to the long URL. The API site just serves up JSON for the Windows Phone app. The Windows Phone app is Silverlight, of course, and there isn't much to it. It does use the control toolkit, but beyond that, it relies on a simple class that creates a Webclient and calls the server for JSON to deserialize. The same class is now used by the Feed app, which used to use WCF. Simple is better. Data access in POP Forums is all straight SQL, because a lot of it was ported from the ASP.NET version. Most CoasterBuzz data access is handled by the Entity Framework, using the code-first model. The context class in this case does a lot of work to make sure that the table and key mapping works, since much of it breaks from the normal conventions of EF. One of the more powerful things you can do with EF, once you understand the little gotchas, is split tables by row into different entities. For example, a roller coaster photo has everything in the same row, including the metadata, the thumbnail bytes and the image itself. Obviously, if you want to get a list of photos to iterate over in a view, you don't want to get the image data. The use of navigation properties makes it easier to get just what you want. The front end includes Razor views in MVC, and jQuery is used for client-side goodness. I'm also using jQuery UI in a few places, for tabs, a dialog box and autocomplete. I'm also, tentatively, using jQuery Mobile. I've already ported most forum views to Mobile, but they need some work as v1.1 isn't finished yet. I'm not sure if I'll ship CoasterBuzz with mobile views or not yet. It's on the radar, but not something in my delivery criteria. That covers all of the big frameworks in play. Next time I hope to talk more about the front-end experience, which to me is where most of the fun is these days. Hoping to launch in the next month or two. Getting tired of looking at the old site!

Read the article
Neo4J and Azure and VS2012 and Windows 8

- by Chris Skardon

Now, I know that this has been written about, but both of the main places (http://www.richard-banks.org/2011/02/running-neo4j-on-azure.html and http://blog.neo4j.org/2011/02/announcing-neo4j-on-windows-azure.html) utilise VS2010, and well, I’m on VS2012 and Windows 8. Not that I think Win 8 had anything to do with it really, anyhews! I’m going to begin from the beginning, this is my first foray into running something on Azure, so it’s been a bit of a learning curve. But luckily the Neo4J guys have got us started, so let’s download the VS2010 solution: http://neo4j.org/get?file=Neo4j.Azure.Server.zip OK, the other thing we’ll need is the VS2012 Azure SDK, so let’s get that as well: http://www.windowsazure.com/en-us/develop/downloads/ (I just did the full install). Now, unzip the VS2010 solution and let’s open it in VS2012: <your location>\Neo4j.Azure.Server\Neo4j.Azure.Server.sln One-way-upgrade? Yer! Ignore the migration report – we don’t care! Let’s build that sucker… Ahhh 14 errors… WindowsAzure does not exist in the namespace ‘Microsoft’ Not a problem right? We’ve installed the SDK, just need to update the references: We can ignore the Test projects, they don’t use Azure, we’re interested in the other projects, so what we’ll do is remove the broken references, and add the correct ones, so expand the references bit of each project: hunt out those yellow exclamation marks, and delete them! You’ll need to add the right ones back in (listed below), when you go to the ‘Add Reference’ dialog make sure you have ‘Assemblies’ and ‘Framework’ selected before you seach (and search for ‘microsoft.win’ to narrow it down) So the references you need for each project are: CollectDiagnosticsData Microsoft.WindowsAzure.Diagnostics Microsoft.WindowsAzure.StorageClient Diversify.WindowsAzure.ServiceRuntime Microsoft.WindowsAzure.CloudDrive Microsoft.WindowsAzure.ServiceRuntime Microsoft.WindowsAzure.StorageClient Right, so let’s build again… Sweet! No errors. Now we need to setup our Blobs, I’m assuming you are using the most up-to-date Java you happened to have downloaded :) in my case that’s JRE7, and that is located in: C:\Program Files (x86)\Java\jre7 So, zip up that folder into whatever you want to call it, I went with jre7.zip, and stuck it in a temp folder for now. In that same temp folder I also copied the neo4j zip I was using: neo4j-community-1.7.2-windows.zip OK, now, we need to get these into our Blob storage, this is where a lot of stuff becomes unstuck - I didn’t find any applications that helped me use the blob storage, one would crash (because my internet speed is so slow) and the other just didn’t work – sure it looked like it had worked, but when push came to shove it didn’t. So this is how I got my files into Blob (local first): 1. Run the ‘Storage Emulator’ (just search for that in the start menu) 2. That takes a little while to start up so fire up another instance of Visual Studio in the mean time, and create a new Console Application. 3. Manage Nuget Packages for that solution and add ‘Windows Azure Storage’ Now you’re set up to add the code: public static void Main() { CloudStorageAccount cloudStorageAccount = CloudStorageAccount.DevelopmentStorageAccount; CloudBlobClient client = cloudStorageAccount.CreateCloudBlobClient(); client.Timeout = TimeSpan.FromMinutes(30); CloudBlobContainer container = client.GetContainerReference("neo4j"); //This will create it as well UploadBlob(container, "jre7.zip", "c:\\temp\\jre7.zip"); UploadBlob(container, "neo4j-community-1.7.2-windows.zip", "c:\\temp\\neo4j-community-1.7.2-windows.zip"); } private static void UploadBlob(CloudBlobContainer container, string blobName, string filename) { CloudBlob blob = container.GetBlobReference(blobName); using (FileStream fileStream = File.OpenRead(filename)) blob.UploadFromStream(fileStream); } This will upload the files to your local storage account (to switch to an Azure one, you’ll need to create a storage account, and use those credentials when you make your CloudStorageAccount above) To test you’ve got them uploaded correctly, go to: http://localhost:10000/devstoreaccount1/neo4j/jre7.zip and you will hopefully download the zip file you just uploaded. Now that those files are there, we are ready for some final configuration… Right click on the Neo4jServerHost role in the Neo4j.Azure.Server cloud project: Click on the ‘Settings’ tab and we’ll need to do some changes – by default, the 1.7.2 edition of neo4J unzips to: neo4j-community-1.7.2 So, we need to update all the ‘neo4j-1.3.M02’ directories to be ‘neo4j-community-1.7.2’, we also need to update the Java runtime location, so we start with this: and end with this: Now, I also changed the Endpoints settings, to be HTTP (from TCP) and to have a port of 7410 (mainly because that’s straight down on the numpad) The last ‘gotcha’ is some hard coded consts, which had me looking for ages, they are in the ‘ConfigSettings’ class of the ‘Neo4jServerHost’ project, and the ones we’re interested in are: Neo4jFileName JavaZipFileName Change those both to what that should be. OK Nearly there (I promise)! Run the ‘Compute Emulator’ (same deal with the Start menu), in your system tray you should have an Azure icon, when the compute emulator is up and running, right click on the icon and select ‘Show Compute Emulator UI’ The last steps! Make sure the ‘Neo4j.Azure.Server’ cloud project is set up as the start project and let’s hit F5 tension mounts, the build takes place (you need to accept the UAC warning) and VS does it’s stuff. If you look at the Compute Emulator UI you’ll see some log stuff (which you’ll need if this goes awry – but it won’t don’t worry!) In a bit, the console and a Java window will pop up: Then the console will bog off, leaving just the Java one, and if we switch back to the Compute Emulator UI and scroll up we should be able to see a line telling us the port number we’ve been assigned (in my case 7411): (If you can’t see it, don’t worry.. press CTRL+A on the emulator, then CTRL+C, copy all the text and paste it into something like Notepad, then just do a Find for ‘port’ you’ll soon see it) Go to your favourite browser, and head to: http://localhost:YOURPORT/ and you should see the WebAdmin! See you on the cloud side hopefully! Chris PS Other gotchas! OK, I’ve been caught out a couple of times: I had an instance of Neo4J running as a service on my machine, the Azure instance wanted to run the https version of the server on the same port as the Service was running on, and so Java would complain that the port was already in use.. The first time I converted the project, it didn’t update the version of the Azure library to load, in the App.Config of the Neo4jServerHost project, and VS would throw an exception saying it couldn’t find the Azure dll version 1.0.0.0.

Read the article
Consuming the Amazon S3 service from a Win8 Metro Application

- by cibrax

As many of the existing Http APIs for Cloud Services, AWS also provides a set of different platform SDKs for hiding many of complexities present in the APIs. While there is a platform SDK for .NET, which is open source and available in C#, that SDK does not work in Win8 Metro Applications for the changes introduced in WinRT. WinRT offers a complete different set of APIs for doing I/O operations such as doing http calls or using cryptography for signing or encrypting data, two aspects that are absolutely necessary for consuming AWS. All the I/O APIs available as part of WinRT are asynchronous, and uses the TPL model for .NET applications (HTML and JavaScript Metro applications use a model based in promises, which is similar concept). In the case of S3, the http Authorization header is used for two purposes, authenticating clients and make sure the messages were not altered while they were in transit. For doing that, it uses a signature or hash of the message content and some of the headers using a symmetric key (That's just one of the available mechanisms). Windows Azure for example also uses the same mechanism in many of its APIs. There are three challenges that any developer working for first time in Metro will have to face to consume S3, the new WinRT APIs, the asynchronous nature of them and the complexity introduced for generating the Authorization header. Having said that, I decided to write this post with some of the gotchas I found myself trying to consume this Amazon service. 1. Generating the signature for the Authorization header All the cryptography APIs in WinRT are available under Windows.Security.Cryptography namespace. Many of operations available in these APIs uses the concept of buffers (IBuffer) for representing a chunk of binary data. As you will see in the example below, these buffers are mainly generated with the use of static methods in a WinRT class CryptographicBuffer available as part of the namespace previously mentioned. private string DeriveAuthToken(string resource, string httpMethod, string timestamp) { var stringToSign = string.Format("{0}\n" + "\n" + "\n" + "\n" + "x-amz-date:{1}\n" + "/{2}/", httpMethod, timestamp, resource); var algorithm = MacAlgorithmProvider.OpenAlgorithm("HMAC_SHA1"); var keyMaterial = CryptographicBuffer.CreateFromByteArray(Encoding.UTF8.GetBytes(this.secret)); var hmacKey = algorithm.CreateKey(keyMaterial); var signature = CryptographicEngine.Sign( hmacKey, CryptographicBuffer.CreateFromByteArray(Encoding.UTF8.GetBytes(stringToSign)) ); return CryptographicBuffer.EncodeToBase64String(signature); } .csharpcode, .csharpcode pre { font-size: small; color: black; font-family: consolas, "Courier New", courier, monospace; background-color: #ffffff; /*white-space: pre;*/ } .csharpcode pre { margin: 0em; } .csharpcode .rem { color: #008000; } .csharpcode .kwrd { color: #0000ff; } .csharpcode .str { color: #006080; } .csharpcode .op { color: #0000c0; } .csharpcode .preproc { color: #cc6633; } .csharpcode .asp { background-color: #ffff00; } .csharpcode .html { color: #800000; } .csharpcode .attr { color: #ff0000; } .csharpcode .alt { background-color: #f4f4f4; width: 100%; margin: 0em; } .csharpcode .lnum { color: #606060; } The algorithm that determines the information or content you need to use for generating the signature is very well described as part of the AWS documentation. In this case, this method is generating a signature required for creating a new bucket. A HmacSha1 hash is computed using a secret or symetric key provided by AWS in the management console. 2. Sending an Http Request to the S3 service WinRT also ships with the System.Net.Http.HttpClient that was first introduced some months ago with ASP.NET Web API. This client provides a rich interface on top the traditional WebHttpRequest class, and also solves some of limitations found in this last one. There are a few things that don't work with a raw WebHttpRequest such as setting the Host header, which is something absolutely required for consuming S3. Also, HttpClient is more friendly for doing unit tests, as it receives a HttpMessageHandler as part of the constructor that can fake to emulate a real http call. This is how the code for consuming the service with HttpClient looks like, public async Task<S3Response> CreateBucket(string name, string region = null, params string[] acl) { var timestamp = string.Format("{0:r}", DateTime.UtcNow); var auth = DeriveAuthToken(name, "PUT", timestamp); var request = new HttpRequestMessage(HttpMethod.Put, "http://s3.amazonaws.com/"); request.Headers.Host = string.Format("{0}.s3.amazonaws.com", name); request.Headers.TryAddWithoutValidation("Authorization", "AWS " + this.key + ":" + auth); request.Headers.Add("x-amz-date", timestamp); var client = new HttpClient(); var response = await client.SendAsync(request); return new S3Response { Succeed = response.StatusCode == HttpStatusCode.OK, Message = (response.Content != null) ? await response.Content.ReadAsStringAsync() : null }; } .csharpcode, .csharpcode pre { font-size: small; color: black; font-family: consolas, "Courier New", courier, monospace; background-color: #ffffff; /*white-space: pre;*/ } .csharpcode pre { margin: 0em; } .csharpcode .rem { color: #008000; } .csharpcode .kwrd { color: #0000ff; } .csharpcode .str { color: #006080; } .csharpcode .op { color: #0000c0; } .csharpcode .preproc { color: #cc6633; } .csharpcode .asp { background-color: #ffff00; } .csharpcode .html { color: #800000; } .csharpcode .attr { color: #ff0000; } .csharpcode .alt { background-color: #f4f4f4; width: 100%; margin: 0em; } .csharpcode .lnum { color: #606060; } You will notice a few additional things in this code. By default, HttpClient validates the values for some well-know headers, and Authorization is one of them. It won't allow you to set a value with ":" on it, which is something that S3 expects. However, that's not a problem at all, as you can skip the validation by using the TryAddWithoutValidation method. Also, the code is heavily relying on the new async and await keywords to transform all the asynchronous calls into synchronous ones. In case you would want to unit test this code and faking the call to the real S3 service, you should have to modify it to inject a custom HttpMessageHandler into the HttpClient. The following implementation illustrates this concept, In case you would want to unit test this code and faking the call to the real S3 service, you should have to modify it to inject a custom HttpMessageHandler into the HttpClient. The following implementation illustrates this concept, public class FakeHttpMessageHandler : HttpMessageHandler { HttpResponseMessage response; public FakeHttpMessageHandler(HttpResponseMessage response) { this.response = response; } protected override Task<HttpResponseMessage> SendAsync(HttpRequestMessage request, System.Threading.CancellationToken cancellationToken) { var tcs = new TaskCompletionSource<HttpResponseMessage>(); tcs.SetResult(response); return tcs.Task; } } .csharpcode, .csharpcode pre { font-size: small; color: black; font-family: consolas, "Courier New", courier, monospace; background-color: #ffffff; /*white-space: pre;*/ } .csharpcode pre { margin: 0em; } .csharpcode .rem { color: #008000; } .csharpcode .kwrd { color: #0000ff; } .csharpcode .str { color: #006080; } .csharpcode .op { color: #0000c0; } .csharpcode .preproc { color: #cc6633; } .csharpcode .asp { background-color: #ffff00; } .csharpcode .html { color: #800000; } .csharpcode .attr { color: #ff0000; } .csharpcode .alt { background-color: #f4f4f4; width: 100%; margin: 0em; } .csharpcode .lnum { color: #606060; } You can use this handler for injecting any response while you are unit testing the code.

Read the article
Consuming the Amazon S3 service from a Win8 Metro Application

- by cibrax

As many of the existing Http APIs for Cloud Services, AWS also provides a set of different platform SDKs for hiding many of complexities present in the APIs. While there is a platform SDK for .NET, which is open source and available in C#, that SDK does not work in Win8 Metro Applications for the changes introduced in WinRT. WinRT offers a complete different set of APIs for doing I/O operations such as doing http calls or using cryptography for signing or encrypting data, two aspects that are absolutely necessary for consuming AWS. All the I/O APIs available as part of WinRT are asynchronous, and uses the TPL model for .NET applications (HTML and JavaScript Metro applications use a model based in promises, which is similar concept). In the case of S3, the http Authorization header is used for two purposes, authenticating clients and make sure the messages were not altered while they were in transit. For doing that, it uses a signature or hash of the message content and some of the headers using a symmetric key (That's just one of the available mechanisms). Windows Azure for example also uses the same mechanism in many of its APIs. There are three challenges that any developer working for first time in Metro will have to face to consume S3, the new WinRT APIs, the asynchronous nature of them and the complexity introduced for generating the Authorization header. Having said that, I decided to write this post with some of the gotchas I found myself trying to consume this Amazon service. 1. Generating the signature for the Authorization header All the cryptography APIs in WinRT are available under Windows.Security.Cryptography namespace. Many of operations available in these APIs uses the concept of buffers (IBuffer) for representing a chunk of binary data. As you will see in the example below, these buffers are mainly generated with the use of static methods in a WinRT class CryptographicBuffer available as part of the namespace previously mentioned. private string DeriveAuthToken(string resource, string httpMethod, string timestamp) { var stringToSign = string.Format("{0}\n" + "\n" + "\n" + "\n" + "x-amz-date:{1}\n" + "/{2}/", httpMethod, timestamp, resource); var algorithm = MacAlgorithmProvider.OpenAlgorithm("HMAC_SHA1"); var keyMaterial = CryptographicBuffer.CreateFromByteArray(Encoding.UTF8.GetBytes(this.secret)); var hmacKey = algorithm.CreateKey(keyMaterial); var signature = CryptographicEngine.Sign( hmacKey, CryptographicBuffer.CreateFromByteArray(Encoding.UTF8.GetBytes(stringToSign)) ); return CryptographicBuffer.EncodeToBase64String(signature); } .csharpcode, .csharpcode pre { font-size: small; color: black; font-family: consolas, "Courier New", courier, monospace; background-color: #ffffff; /*white-space: pre;*/ } .csharpcode pre { margin: 0em; } .csharpcode .rem { color: #008000; } .csharpcode .kwrd { color: #0000ff; } .csharpcode .str { color: #006080; } .csharpcode .op { color: #0000c0; } .csharpcode .preproc { color: #cc6633; } .csharpcode .asp { background-color: #ffff00; } .csharpcode .html { color: #800000; } .csharpcode .attr { color: #ff0000; } .csharpcode .alt { background-color: #f4f4f4; width: 100%; margin: 0em; } .csharpcode .lnum { color: #606060; } The algorithm that determines the information or content you need to use for generating the signature is very well described as part of the AWS documentation. In this case, this method is generating a signature required for creating a new bucket. A HmacSha1 hash is computed using a secret or symetric key provided by AWS in the management console. 2. Sending an Http Request to the S3 service WinRT also ships with the System.Net.Http.HttpClient that was first introduced some months ago with ASP.NET Web API. This client provides a rich interface on top the traditional WebHttpRequest class, and also solves some of limitations found in this last one. There are a few things that don't work with a raw WebHttpRequest such as setting the Host header, which is something absolutely required for consuming S3. Also, HttpClient is more friendly for doing unit tests, as it receives a HttpMessageHandler as part of the constructor that can fake to emulate a real http call. This is how the code for consuming the service with HttpClient looks like, public async Task<S3Response> CreateBucket(string name, string region = null, params string[] acl) { var timestamp = string.Format("{0:r}", DateTime.UtcNow); var auth = DeriveAuthToken(name, "PUT", timestamp); var request = new HttpRequestMessage(HttpMethod.Put, "http://s3.amazonaws.com/"); request.Headers.Host = string.Format("{0}.s3.amazonaws.com", name); request.Headers.TryAddWithoutValidation("Authorization", "AWS " + this.key + ":" + auth); request.Headers.Add("x-amz-date", timestamp); var client = new HttpClient(); var response = await client.SendAsync(request); return new S3Response { Succeed = response.StatusCode == HttpStatusCode.OK, Message = (response.Content != null) ? await response.Content.ReadAsStringAsync() : null }; } .csharpcode, .csharpcode pre { font-size: small; color: black; font-family: consolas, "Courier New", courier, monospace; background-color: #ffffff; /*white-space: pre;*/ } .csharpcode pre { margin: 0em; } .csharpcode .rem { color: #008000; } .csharpcode .kwrd { color: #0000ff; } .csharpcode .str { color: #006080; } .csharpcode .op { color: #0000c0; } .csharpcode .preproc { color: #cc6633; } .csharpcode .asp { background-color: #ffff00; } .csharpcode .html { color: #800000; } .csharpcode .attr { color: #ff0000; } .csharpcode .alt { background-color: #f4f4f4; width: 100%; margin: 0em; } .csharpcode .lnum { color: #606060; } You will notice a few additional things in this code. By default, HttpClient validates the values for some well-know headers, and Authorization is one of them. It won't allow you to set a value with ":" on it, which is something that S3 expects. However, that's not a problem at all, as you can skip the validation by using the TryAddWithoutValidation method. Also, the code is heavily relying on the new async and await keywords to transform all the asynchronous calls into synchronous ones. In case you would want to unit test this code and faking the call to the real S3 service, you should have to modify it to inject a custom HttpMessageHandler into the HttpClient. The following implementation illustrates this concept, In case you would want to unit test this code and faking the call to the real S3 service, you should have to modify it to inject a custom HttpMessageHandler into the HttpClient. The following implementation illustrates this concept, public class FakeHttpMessageHandler : HttpMessageHandler { HttpResponseMessage response; public FakeHttpMessageHandler(HttpResponseMessage response) { this.response = response; } protected override Task<HttpResponseMessage> SendAsync(HttpRequestMessage request, System.Threading.CancellationToken cancellationToken) { var tcs = new TaskCompletionSource<HttpResponseMessage>(); tcs.SetResult(response); return tcs.Task; } } .csharpcode, .csharpcode pre { font-size: small; color: black; font-family: consolas, "Courier New", courier, monospace; background-color: #ffffff; /*white-space: pre;*/ } .csharpcode pre { margin: 0em; } .csharpcode .rem { color: #008000; } .csharpcode .kwrd { color: #0000ff; } .csharpcode .str { color: #006080; } .csharpcode .op { color: #0000c0; } .csharpcode .preproc { color: #cc6633; } .csharpcode .asp { background-color: #ffff00; } .csharpcode .html { color: #800000; } .csharpcode .attr { color: #ff0000; } .csharpcode .alt { background-color: #f4f4f4; width: 100%; margin: 0em; } .csharpcode .lnum { color: #606060; } You can use this handler for injecting any response while you are unit testing the code.

Read the article
FluentNHibernate Unit Of Work / Repository Design Pattern Questions

- by Echiban

Hi all, I think I am at a impasse here. I have an application I built from scratch using FluentNHibernate (ORM) / SQLite (file db). I have decided to implement the Unit of Work and Repository Design pattern. I am at a point where I need to think about the end game, which will start as a WPF windows app (using MVVM) and eventually implement web services / ASP.Net as UI. Now I already created domain objects (entities) for ORM. And now I don't know how should I use it outside of ORM. Questions about it include: Should I use ORM entity objects directly as models in MVVM? If yes, do I put business logic (such as certain values must be positive and be greater than another Property) in those entity objects? It is certainly the simpler approach, and one I am leaning right now. However, will there be gotchas that would trash this plan? If the answer above is no, do I then create a new set of classes to implement business logic and use those as Models in MVVM? How would I deal with the transition between model objects and entity objects? I guess a type converter implementation would work well here. Now I followed this well written article to implement the Unit Of Work pattern. However, due to the fact that I am using FluentNHibernate instead of NHibernate, I had to bastardize the implementation of UnitOfWorkFactory. Here's my implementation: using System; using FluentNHibernate.Cfg; using FluentNHibernate.Cfg.Db; using NHibernate; using NHibernate.Cfg; using NHibernate.Tool.hbm2ddl; namespace ELau.BlindsManagement.Business { public class UnitOfWorkFactory : IUnitOfWorkFactory { private static readonly string DbFilename; private static Configuration _configuration; private static ISession _currentSession; private ISessionFactory _sessionFactory; static UnitOfWorkFactory() { // arbitrary default filename DbFilename = "defaultBlindsDb.db3"; } internal UnitOfWorkFactory() { } #region IUnitOfWorkFactory Members public ISession CurrentSession { get { if (_currentSession == null) { throw new InvalidOperationException(ExceptionStringTable.Generic_NotInUnitOfWork); } return _currentSession; } set { _currentSession = value; } } public ISessionFactory SessionFactory { get { if (_sessionFactory == null) { _sessionFactory = BuildSessionFactory(); } return _sessionFactory; } } public Configuration Configuration { get { if (_configuration == null) { Fluently.Configure().ExposeConfiguration(c => _configuration = c); } return _configuration; } } public IUnitOfWork Create() { ISession session = CreateSession(); session.FlushMode = FlushMode.Commit; _currentSession = session; return new UnitOfWorkImplementor(this, session); } public void DisposeUnitOfWork(UnitOfWorkImplementor adapter) { CurrentSession = null; UnitOfWork.DisposeUnitOfWork(adapter); } #endregion public ISession CreateSession() { return SessionFactory.OpenSession(); } public IStatelessSession CreateStatelessSession() { return SessionFactory.OpenStatelessSession(); } private static ISessionFactory BuildSessionFactory() { ISessionFactory result = Fluently.Configure() .Database( SQLiteConfiguration.Standard .UsingFile(DbFilename) ) .Mappings(m => m.FluentMappings.AddFromAssemblyOf<UnitOfWorkFactory>()) .ExposeConfiguration(BuildSchema) .BuildSessionFactory(); return result; } private static void BuildSchema(Configuration config) { // this NHibernate tool takes a configuration (with mapping info in) // and exports a database schema from it _configuration = config; new SchemaExport(_configuration).Create(false, true); } } } I know that this implementation is flawed because a few tests pass when run individually, but when all tests are run, it would fail for some unknown reason. Whoever wants to help me out with this one, given its complexity, please contact me by private message. I am willing to send some $$$ by Paypal to someone who can address the issue and provide solid explanation. I am new to ORM, so any assistance is appreciated.

Read the article
Git for beginners: The definitive practical guide

- by Adam Davis

Ok, after seeing this post by PJ Hyett, I have decided to skip to the end and go with git. So what I need is a beginners practical guide to git. "Beginner" being defined as someone who knows how to handle their compiler, understands to some level what a makefile is, and has touched source control without understanding it very well. "Practical" being defined as this person doesn't want to get into great detail regarding what git is doing in the background, and doesn't even care (or know) that it's distributed. Your answers might hint at the possibilities, but try to aim for the beginner that wants to keep a 'main' repository on a 'server' which is backed up and secure, and treat their local repository as merely a 'client' resource. Procedural note: PLEASE pick one and only one of the below topics and answer it clearly and concisely in any given answer. Don't try to jam a bunch of information into one answer. Don't just link to other resources - cut and paste with attribution if copyright allows, otherwise learn it and explain it in your own words (ie, don't make people leave this page to learn a task). Please comment on, or edit, an already existing answer unless your explanation is very different and you think the community is better served with a different explanation rather than altering the existing explanation. So: Installation/Setup How to install git How do you set up git? Try to cover linux, windows, mac, think 'client/server' mindset. Setup GIT Server with Msysgit on Windows How do you create a new project/repository? How do you configure it to ignore files (.obj, .user, etc) that are not really part of the codebase? Working with the code How do you get the latest code? How do you check out code? How do you commit changes? How do you see what's uncommitted, or the status of your current codebase? How do you destroy unwanted commits? How do you compare two revisions of a file, or your current file and a previous revision? How do you see the history of revisions to a file? How do you handle binary files (visio docs, for instance, or compiler environments)? How do you merge files changed at the "same time"? How do you undo (revert or reset) a commit? Tagging, branching, releases, baselines How do you 'mark' 'tag' or 'release' a particular set of revisions for a particular set of files so you can always pull that one later? How do you pull a particular 'release'? How do you branch? How do you merge branches? How do you resolve conflicts and complete the merge? How do you merge parts of one branch into another branch? What is rebasing? How do I track remote branches? How can I create a branch on a remote repository? Other Describe and link to a good gui, IDE plugin, etc that makes git a non-command line resource, but please list its limitations as well as its good. msysgit - Cross platform, included with git gitk - Cross platform history viewer, included with git gitnub - OS X gitx - OS X history viewer smartgit - Cross platform, commercial, beta tig - console GUI for Linux qgit - GUI for Windows, Linux Any other common tasks a beginner should know? Git Status tells you what you just did, what branch you have, and other useful information How do I work effectively with a subversion repository set as my source control source? Other git beginner's references git guide git book git magic gitcasts github guides git tutorial Progit - book by Scott Chacon Git - SVN Crash Course Delving into git Understanding git conceptually I will go through the entries from time to time and 'tidy' them up so they have a consistent look/feel and it's easy to scan the list - feel free to follow a simple "header - brief explanation - list of instructions - gotchas and extra info" template. I'll also link to the entries from the bullet list above so it's easy to find them later.

Read the article
I'm trying to run some PHP scripts as CLI instead of over HTTP. How do I make them play nice?

- by gnfti

Hi everyone. I'm using some PHP scripts from FeedForAll to join together RSS feeds (RSSmesh) and display them as HTML (RSS2HTML). Because I intend to run these scripts fairly intensively and don't want the resulting HTTP requests and bandwidth to count towards my hosting quota, I am in the process of moving to running them on the web host's server in an umbrella PHP "batch" script, and call this script via cron (this is a Linux server, by the way). Here's a (working) sample request over HTTP: http://www.mydomain.com/a/rss2htmlcore/rss2html2.php?XMLFILE=http://www.mydomain.com/a/myapp/xmlcache/feed.xml&TEMPLATE=template.html This will produce the desired HTML output. An example of how I want this to work on the command line: /srv/customers/mycustomer#/mydomain.com/www/a/rss2htmlcore/rss2html2-cli.php /srv/customers/mycustomer#/mydomain.com/www/a/myapp/xmlcache/feed.xml /srv/customers/mycustomer#/mydomain.com/www/a/template.html This is with the correct shebang line added to "rss2html2-cli.php". I could just as well specify the executable ("/usr/local/bin/php") in the request, I doubt it makes a difference because I am able to run another script (that I wrote myself) either way without problems. Now, RSS2HTML and RSSmesh are different in that, for starters, they include secondary files -- for example, both include an XML parser script -- and I suspect that this is where I am getting a bit in over my head. Right now I'm calling exec() from the "umbrella" batch script, like so: exec("/srv/customers/mycustomer#/mydomain.com/www/a/rss2htmlcore/rss2html2-cli.php /srv/customers/mycustomer#/mydomain.com/www/a/myapp/xmlcache/feed.xml /srv/customers/mycustomer#/mydomain.com/www/a/template.html", $output) But no output is being produced. What's the best way to go about this and what "gotchas" should I keep in mind? Is exec() the right way to approach this? It works fine for the other (simple) script but that writes its own output. For this I want to get the output and write it to a file from within the umbrella script if possible. I've also tried output buffering but to no avail. Do I need to pay attention to anything specific with regard to the includes? Right now they're specified in the scripts as include_once("FeedForAll_XMLParser.inc.php"); and the specified files are indeed in the same folder. Further info: -This is a Linux server. -I have no direct access to the shell, so I can't test things directly on a command line, everything is via crontab. -I will admit that support for the FeedForAll scripts leaves a lot to be desired, but I'd like to keep using their scripts if at all possible, if only because I know them and have been using them for a while. I have looked into Simplepie, but the FFA scripts do some things that I've seen no obvious solutions for with Simplepie, like limiting the number of items per individual feed (RSSmesh) or limiting the description length (RSS2HTML). -Yahoo! Pipes is out, they cache their data for too long for my application. Should you want to take a look at the code, here are the scripts as txt files. RSS2HTML2 and RSSmesh are the FeedForAll scripts, FeedForAll_XMLParser... is the included parser. Note that I have not yet amended these to handle $argv etc. I have however in "scraper-universal-rss-cli", which works fine with CLI. If anyone has any thoughts to share on this it would be very much appreciated. Thank you in advance.

Read the article
Why are there so many man-made edge cases in IT, and is there any hope for simplification / unificat

- by Hamish Grubijan

This question is meant to generate discussion and so it is marked as community wiki. My observation is that the field of information technology grows so rapidly and randomly, that for many it takes a lot of time to learn many intricacies of some tools that will be obsolete in just short 3 years. If you look at the questions asked on StackOverflow ... at least half of them stem from the fact that some language / tool / API / protocol was poorly designed, is backwards and has gotchas. There are so many things which distract developers from converting English into machine code; instead they spend their time configuring stuff and gluing together things that do not really fit. How many times do you pick up somebody else's project (or someone picks up yours :) ) and realize that this program does not need half of the dialogs that it has, and that the logic can be simplified a great deal? But, it had to be made and sold here before a better thing is made and sold elsewhere, and hence all this rush. I often wish that things would just slow down. I do not want Microsoft Windows to run on my car's computer, my watch, my table, my toaster oven, and my toilet seat. I'd rather have Windows that DOES NOT HAVE WINDOWS REGISTRY, I'd rather have Windows that allows two different programs to work on the same file at the same time, the way it works on Unix systems. Microsoft is just an example. I am looking forward to the day when I do not have to worry about Windows vs Unix new line break, when System32 actually means that this directory contains 32-bit binaries, and not 64-bit ones, the day when dll hell and manifest hell are no longer an issue, the day when it takes me a lot less than 3 months on a new job to learn the system. I do not mean learning the entire code base of a product (depending on the size of it, it can take a long time). I mean - remembering which build-assisting scripts are written in Perl and which version of it, and which ones are done through .bat files, when do I need to manually make every file in some directory writable before running a script, or else a critical step of a database maintenance home-grown tool will bomb, and it will take 2 days to clean that up. Makes me wonder if humans enslaved computers, or if it is the other way around. The key is that improving those things will not bring extra revenue, and hence those taking the time to fix crap like that are not "business focused". However, these imperfections irritate me immensely, particularly because my memory is limited - I can hold only a small portion of that useless knowledge of a system in my head at any given point in time. I must not be alone. Did you also happen to notice that a programmer can waste a lot of time on things that should have been a lot more straight-forward? Is there hope? Will things get better/simpler in the future, or will there be a lot more IT crap floating around? I suppose I see diversity of tools, protocols, etc. as a bad thing. Thank you for participation.

Read the article
What are good CLI tools for JSON?

- by jasonmp85

General Problem Though I may be diagnosing the root cause of an event, determining how many users it affected, or distilling timing logs in order to assess the performance and throughput impact of a recent code change, my tools stay the same: grep, awk, sed, tr, uniq, sort, zcat, tail, head, join, and split. To glue them all together, Unix gives us pipes, and for fancier filtering we have xargs. If these fail me, there's always perl -e. These tools are perfect for processing CSV files, tab-delimited files, log files with a predictable line format, or files with comma-separated key-value pairs. In other words, files where each line has next to no context. XML Analogues I recently needed to trawl through Gigabytes of XML to build a histogram of usage by user. This was easy enough with the tools I had, but for more complicated queries the normal approaches break down. Say I have files with items like this: <foo user="me"> <baz key="zoidberg" value="squid" /> <baz key="leela" value="cyclops" /> <baz key="fry" value="rube" /> </foo> And let's say I want to produce a mapping from user to average number of <baz>s per <foo>. Processing line-by-line is no longer an option: I need to know which user's <foo> I'm currently inspecting so I know whose average to update. Any sort of Unix one liner that accomplishes this task is likely to be inscrutable. Fortunately in XML-land, we have wonderful technologies like XPath, XQuery, and XSLT to help us. Previously, I had gotten accustomed to using the wonderful XML::XPath Perl module to accomplish queries like the one above, but after finding a TextMate Plugin that could run an XPath expression against my current window, I stopped writing one-off Perl scripts to query XML. And I just found out about XMLStarlet which is installing as I type this and which I look forward to using in the future. JSON Solutions? So this leads me to my question: are there any tools like this for JSON? It's only a matter of time before some investigation task requires me to do similar queries on JSON files, and without tools like XPath and XSLT, such a task will be a lot harder. If I had a bunch of JSON that looked like this: { "firstName": "Bender", "lastName": "Robot", "age": 200, "address": { "streetAddress": "123", "city": "New York", "state": "NY", "postalCode": "1729" }, "phoneNumber": [ { "type": "home", "number": "666 555-1234" }, { "type": "fax", "number": "666 555-4567" } ] } And wanted to find the average number of phone numbers each person had, I could do something like this with XPath: fn:avg(/fn:count(phoneNumber)) Questions Are there any command-line tools that can "query" JSON files in this way? If you have to process a bunch of JSON files on a Unix command line, what tools do you use? Heck, is there even work being done to make a query language like this for JSON? If you do use tools like this in your day-to-day work, what do you like/dislike about them? Are there any gotchas? I'm noticing more and more data serialization is being done using JSON, so processing tools like this will be crucial when analyzing large data dumps in the future. Language libraries for JSON are very strong and it's easy enough to write scripts to do this sort of processing, but to really let people play around with the data shell tools are needed. Related Questions Grep and Sed Equivalent for XML Command Line Processing Is there a query language for JSON? JSONPath or other XPath like utility for JSON/Javascript; or Jquery JSON

Read the article
How to overlay a div (or any element) over a table row (tr)?

- by slolife

I'd like to overlay a div (or any element that'll work) over a table row (tr tag) that happens to have more than one column. I have tried a few methods, which don't seem to work. I've posted my current code below. I do get an overlay, but not directly over just the row. I tried setting the overlay top to $divBottom.css('top'), but that is always 'auto'. So, am I on the right track, or is there a better way of doing it? Utilizing jQuery is fine as you can see. If I am on the right track, how do I get the div placed correctly? Is the offsetTop an offset in the containing element, the table, and I need to do some math? Any other gotchas I'll run into with that? <html> <head> <title>Overlay Tests</title> <style> #rowBottom { outline:red solid 2px } #divBottom { margin:1em; font-size:xx-large; position:relative; } #divOverlay { background-color:Silver; text-align:center; position:absolute; z-index:10000; opacity:0.5; } </style> </head> <body> <p align="center"><a id="lnkDoIt" href="#">Do it!</a></p> <table width="100%" border="0" cellpadding="10" cellspacing="3" style="position:relative"> <tr> <td><p>Lorem ipsum dolor sit amet, consectetur adipiscing elit.</p></td> </tr> <tr id="rowBottom"> <td><div id="divBottom"><p align="center">This is the bottom text</p></div></td> </tr> <tr> <td><p>Lorem ipsum dolor sit amet, consectetur adipiscing elit.</p></td> </tr> </table> <div id="divOverlay" style=""><p>This is the overlay div.</p><p id="info"></p></div> <script src="../includes/javascript/jquery/jquery-1.4.2.js" type="text/javascript"></script> <script type="text/javascript"> $(document).ready(function() { $('#lnkDoIt').click(function() { var $divBottom = $('#rowBottom'); var $divOverlay = $('#divOverlay'); var bottomTop = $divBottom.attr('offsetTop'); var bottomLeft = $divBottom.attr('offsetLeft'); var bottomWidth = $divBottom.css('width'); var bottomHeight = $divBottom.css('height'); $divOverlay.css('top', bottomTop); $divOverlay.css('left', bottomLeft); $divOverlay.css('width', bottomWidth); $divOverlay.css('height', bottomHeight); $('#info').text('Top: ' + bottomTop + ' Left: ' + bottomLeft); }); }); </script> </body> </html>

Read the article
many-to-many-to-many, incl alignment of data from diff sources

- by JefeCoon

Re-factoring dbase to support many:many:many. At the second and third levels we need to preserve end-user 'mapping' or aligning of data from different sources, e.g. Order 17 FirstpartyOrderID => aha LineItem_for_BigShinyThingy => AA-1 # maps to 77-a LineItem_for_BigShinyThingy => AA-2 # maps to 77-b, 77-c LineItem_for_LittleWidget => AA-x # maps to 77-zulu, 77-alpha, 99-foxtrot LineItem_for_LittleWidget => AA-y # maps to 77-zulu, 99-foxtrot LineItem_for_LittleWidget => AA-z # maps to 77-alpha ThirdpartyOrderID => foo LineItem_for_BigShinyThingy => 77-a LineItem_for_BigShinyThingy => 77-b LineItem_for_BigShinyThingy => 77-c LineItem_for_LittleWidget => 77-zulu LineItem_for_LittleWidget => 77-alpha ThirdpartyOrderID => bar LineItem_for_LittleWidget => 99-foxtrot Each LineItem has daily datapoints reported from its own source (Firstparty|Thirdparty). In our UI & app we provide tools to align these, then we'd like to save them into the cleanest possible schema for querying, enabling us to diff the reported daily datapoints, and perform other daily calculations (which we'll store in the dbase also, fortunately that should be cake once we've nailed this). We need to map related [firstparty|thirdparty]line_items which have their own respective datapoints. We'll be using the association to pull each line_items collection of datapoints for summary and discrepancy calculations. I'm considering two options, std has_many,through x2 --or-- possibly (scary) ubermasterjoin table OptionA: order<<-->> order_join_table[id,order_id,firstparty_order_id,thirdparty_order_id] <<-->>line_item order_join_table[firstparty_order_id]-->raw_order[id] order_join_table[thirdparty_order_id]-->raw_order[id] raw_order-->raw_line_items[raw_order_id] line_item<<-->> line_item_join[id,LI_join_id,firstparty_LI,thirdparty_LI <<-->>raw_line_items line_item_join[firstparty_LI]-->raw_line_item[id] line_item_join[thirdparty_LI]-->raw_line_item[id] raw_line_item<<-->>datapoints = we rely upon join to store all mappings of first|third orders & line_items = keys to raw_* enable lookup of these order & line_item details = concerns about circular references and/or lack of correct mapping logic, e.g order--line_item--raw_line_items vs. order--raw_order--raw_line_items OptionB: order<<-->> join_master[id,order_id,FP_order_id,TP_order_id,FP_line_item_id,TP_line_item_id] join_master[FP_order_id & TP_order_id]-->raw_order[id] join_master[FP_line_item_id & TP_line_item_id]-->raw_line_item[id] = every combo of FP_line_item + TP_line_item writes a record into the join_master table = "theoretically" queries easy/fast/flexible/sexy At long last, my questions: a) any learnings from painful firsthand experience about how best to implement/tune/optimize many-to-many-to-many relationships b) in rails? c) any painful gotchas (circular references, slow queries, spaghetti-monsters) to watch out for? d) any joy & goodness in Rails3 that makes this magically easy & joyful? e) anyone written the "how to do many-to-many-to-many schema in Rails and make it fast & sexy?" tutorial that I somehow haven't found? If not, I'll follow up with our learnings in the hope it's helpful.. Thanks in advance- --Jeff

Read the article
C++ copy-construct construct-and-assign question

- by Andy

Blockquote Here is an extract from item 56 of the book "C++ Gotchas": It's not uncommon to see a simple initialization of a Y object written any of three different ways, as if they were equivalent. Y a( 1066 ); Y b = Y(1066); Y c = 1066; In point of fact, all three of these initializations will probably result in the same object code being generated, but they're not equivalent. The initialization of a is known as a direct initialization, and it does precisely what one might expect. The initialization is accomplished through a direct invocation of Y::Y(int). The initializations of b and c are more complex. In fact, they're too complex. These are both copy initializations. In the case of the initialization of b, we're requesting the creation of an anonymous temporary of type Y, initialized with the value 1066. We then use this anonymous temporary as a parameter to the copy constructor for class Y to initialize b. Finally, we call the destructor for the anonymous temporary. To test this, I did a simple class with a data member (program attached at the end) and the results were surprising. It seems that for the case of b, the object was constructed by the copy constructor rather than as suggested in the book. Does anybody know if the language standard has changed or is this simply an optimisation feature of the compiler? I was using Visual Studio 2008. Code sample: #include <iostream> class Widget { std::string name; public: // Constructor Widget(std::string n) { name=n; std::cout << "Constructing Widget " << this->name << std::endl; } // Copy constructor Widget (const Widget& rhs) { std::cout << "Copy constructing Widget from " << rhs.name << std::endl; } // Assignment operator Widget& operator=(const Widget& rhs) { std::cout << "Assigning Widget from " << rhs.name << " to " << this->name << std::endl; return *this; } }; int main(void) { // construct Widget a("a"); // copy construct Widget b(a); // construct and assign Widget c("c"); c = a; // copy construct! Widget d = a; // construct! Widget e = "e"; // construct and assign Widget f = Widget("f"); return 0; } Output: Constructing Widget a Copy constructing Widget from a Constructing Widget c Assigning Widget from a to c Copy constructing Widget from a Constructing Widget e Constructing Widget f Copy constructing Widget from f I was most surprised by the results of constructing d and e.

Read the article
Effective optimization strategies on modern C++ compilers

- by user168715

I'm working on scientific code that is very performance-critical. An initial version of the code has been written and tested, and now, with profiler in hand, it's time to start shaving cycles from the hot spots. It's well-known that some optimizations, e.g. loop unrolling, are handled these days much more effectively by the compiler than by a programmer meddling by hand. Which techniques are still worthwhile? Obviously, I'll run everything I try through a profiler, but if there's conventional wisdom as to what tends to work and what doesn't, it would save me significant time. I know that optimization is very compiler- and architecture- dependent. I'm using Intel's C++ compiler targeting the Core 2 Duo, but I'm also interested in what works well for gcc, or for "any modern compiler." Here are some concrete ideas I'm considering: Is there any benefit to replacing STL containers/algorithms with hand-rolled ones? In particular, my program includes a very large priority queue (currently a std::priority_queue) whose manipulation is taking a lot of total time. Is this something worth looking into, or is the STL implementation already likely the fastest possible? Along similar lines, for std::vectors whose needed sizes are unknown but have a reasonably small upper bound, is it profitable to replace them with statically-allocated arrays? I've found that dynamic memory allocation is often a severe bottleneck, and that eliminating it can lead to significant speedups. As a consequence I'm interesting in the performance tradeoffs of returning large temporary data structures by value vs. returning by pointer vs. passing the result in by reference. Is there a way to reliably determine whether or not the compiler will use RVO for a given method (assuming the caller doesn't need to modify the result, of course)? How cache-aware do compilers tend to be? For example, is it worth looking into reordering nested loops? Given the scientific nature of the program, floating-point numbers are used everywhere. A significant bottleneck in my code used to be conversions from floating point to integers: the compiler would emit code to save the current rounding mode, change it, perform the conversion, then restore the old rounding mode --- even though nothing in the program ever changed the rounding mode! Disabling this behavior significantly sped up my code. Are there any similar floating-point-related gotchas I should be aware of? One consequence of C++ being compiled and linked separately is that the compiler is unable to do what would seem to be very simple optimizations, such as move method calls like strlen() out of the termination conditions of loop. Are there any optimization like this one that I should look out for because they can't be done by the compiler and must be done by hand? On the flip side, are there any techniques I should avoid because they are likely to interfere with the compiler's ability to automatically optimize code? Lastly, to nip certain kinds of answers in the bud: I understand that optimization has a cost in terms of complexity, reliability, and maintainability. For this particular application, increased performance is worth these costs. I understand that the best optimizations are often to improve the high-level algorithms, and this has already been done.

Read the article

Search Results

Search found 283 results on 12 pages for 'gotchas'.

Page 11/12 | < Previous Page | 7 8 9 10 11 12 | Next Page >

- by bitcruncher

- by Martin

- by Eric King

- by David Rodríguez - dribeas

- by mawg

- by jlawless

- by OcularProgrammer

- by AlamedaDad

- by Hubert Kario

- by ultan o'broin

- by RobertChipperfield

- by Pinal Dave

- by Jeff

- by Chris Skardon

- by cibrax

- by cibrax

- by Echiban

- by Adam Davis

- by gnfti

- by Hamish Grubijan

- by jasonmp85

- by slolife

- by JefeCoon

- by Andy

- by user168715

< Previous Page | 7 8 9 10 11 12 | Next Page >