Search Results

Search found 9494 results on 380 pages for 'least squares'.

Page 344/380 | < Previous Page | 340 341 342 343 344 345 346 347 348 349 350 351  | Next Page >

  • Why is numpy's einsum faster than numpy's built in functions?

    - by Ophion
    Lets start with three arrays of dtype=np.double. Timings are performed on a intel CPU using numpy 1.7.1 compiled with icc and linked to intel's mkl. A AMD cpu with numpy 1.6.1 compiled with gcc without mkl was also used to verify the timings. Please note the timings scale nearly linearly with system size and are not due to the small overhead incurred in the numpy functions if statements these difference will show up in microseconds not milliseconds: arr_1D=np.arange(500,dtype=np.double) large_arr_1D=np.arange(100000,dtype=np.double) arr_2D=np.arange(500**2,dtype=np.double).reshape(500,500) arr_3D=np.arange(500**3,dtype=np.double).reshape(500,500,500) First lets look at the np.sum function: np.all(np.sum(arr_3D)==np.einsum('ijk->',arr_3D)) True %timeit np.sum(arr_3D) 10 loops, best of 3: 142 ms per loop %timeit np.einsum('ijk->', arr_3D) 10 loops, best of 3: 70.2 ms per loop Powers: np.allclose(arr_3D*arr_3D*arr_3D,np.einsum('ijk,ijk,ijk->ijk',arr_3D,arr_3D,arr_3D)) True %timeit arr_3D*arr_3D*arr_3D 1 loops, best of 3: 1.32 s per loop %timeit np.einsum('ijk,ijk,ijk->ijk', arr_3D, arr_3D, arr_3D) 1 loops, best of 3: 694 ms per loop Outer product: np.all(np.outer(arr_1D,arr_1D)==np.einsum('i,k->ik',arr_1D,arr_1D)) True %timeit np.outer(arr_1D, arr_1D) 1000 loops, best of 3: 411 us per loop %timeit np.einsum('i,k->ik', arr_1D, arr_1D) 1000 loops, best of 3: 245 us per loop All of the above are twice as fast with np.einsum. These should be apples to apples comparisons as everything is specifically of dtype=np.double. I would expect the speed up in an operation like this: np.allclose(np.sum(arr_2D*arr_3D),np.einsum('ij,oij->',arr_2D,arr_3D)) True %timeit np.sum(arr_2D*arr_3D) 1 loops, best of 3: 813 ms per loop %timeit np.einsum('ij,oij->', arr_2D, arr_3D) 10 loops, best of 3: 85.1 ms per loop Einsum seems to be at least twice as fast for np.inner, np.outer, np.kron, and np.sum regardless of axes selection. The primary exception being np.dot as it calls DGEMM from a BLAS library. So why is np.einsum faster that other numpy functions that are equivalent? The DGEMM case for completeness: np.allclose(np.dot(arr_2D,arr_2D),np.einsum('ij,jk',arr_2D,arr_2D)) True %timeit np.einsum('ij,jk',arr_2D,arr_2D) 10 loops, best of 3: 56.1 ms per loop %timeit np.dot(arr_2D,arr_2D) 100 loops, best of 3: 5.17 ms per loop The leading theory is from @sebergs comment that np.einsum can make use of SSE2, but numpy's ufuncs will not until numpy 1.8 (see the change log). I believe this is the correct answer, but have not been able to confirm it. Some limited proof can be found by changing the dtype of input array and observing speed difference and the fact that not everyone observes the same trends in timings.

    Read the article

  • Git repo planning questions

    - by masonk
    At work, development uses perforce to handle code sharing. I won't say "revision control", because we aren't allowed to check in changes until they are ready for regression testing. In order to get my personal change sets under revision control, I've been given the go-ahead to build my own git and initialize the client view of the perforce depot as a git repo. There are some difficulties in doing this, however. The client view lives in a subfolder of ~, (~/p4), and I want to put ~ under revision control as well, with its own separate history. I can't figure out how to keep the history for ~ separate from ~/p4 without using a submodule. The problem with a submodule is that it looks like I have to go make a repository that will become the submodule and then git submodule add <repo> <path>. But there is nowhere to make the submodule's repository except in ~. There seems to be no safe place to create the initial client view of the depot with git p4 clone. (I'm working off of the assumption that initing or cloning a repo into a subdirectory of a git repo is not supported. At least, I can find nothing authoritative on nested git repos.) edit: Is merely ignoring ~/p4 in the repo rooted at ~ enough to allow me to init a nested repo in ~/p4? My __git_ps1 function still thinks I'm in a git repository when I visit an ignored subdirectory of a git repo, so I'm inclined to think not. I need the "remote" repository created by git p4 sync to be a branch in ~/p4. We are required to keep all of our code in ~/p4 so that it doesn't get backed up. Can I pull from a "remote" branch that is really a local branch? This one is just for convenience, but I thought I could learn something by asking it. For 99% of the project, I just want to start the with the p4 head revision as the inital commit object. For the other 1%, I would like to suck down the entire p4 history so that I can browse it in git. IOW, after I'm done initalizing it, the initial commit of remotes/p4/master branch will contain: revision 1 of //depot/prod/Foo/Bar/* revision X of other files in //depot/prod/*, where X is the head revision and the remotes/p4/master branch contains Y commits, where Y is the number of changelists that had a file in //depot/prod/Foo/Bar/*, with each commit in the history corresponding to one of those p4 changelists, and HEAD looking like p4's head.

    Read the article

  • System.Net.Dns.GetHostAddresses("")

    - by dbasnett
    Yesterday s**ked, and today ain't (sic) looking better. I have an application I have been working on and it can be slow to start when my ISP is down because of DNS. My ISP was down for 3 hours yesterday, so I didn't think much about this piece of code I had added, until I found that it is always slow to start. This code is supposed to return your IP address and my reading of the link suggests that should be immediate, but it isn't, at least on my machine. Oh, and yesterday before the internet went down, I upgraded (oymoron) to XP SP3, and have had other problems. So my questions / request: 1. Am I doing this right? 2. If you run this on your machine does it take 39 seconds to return your IP address? It does on mine. One other note, I did a packet capture and the first request did NOT go on the wire, but the second did, and was answered quickly. So the question is what happened in XP SP3 that I am missing, besides a brain. One last note. If I resolve a FQDN all is well. Public Class Form1 'http://msdn.microsoft.com/en-us/library/system.net.dns.gethostaddresses.aspx ' 'excerpt 'The GetHostAddresses method queries a DNS server 'for the IP addresses associated with a host name. ' 'If hostNameOrAddress is an IP address, this address 'is returned without querying the DNS server. ' 'When an empty string is passed as the host name, 'this method returns the IPv4 addresses of the local host Private Sub Button1_Click(ByVal sender As System.Object, _ ByVal e As System.EventArgs) Handles Button1.Click Dim stpw As New Stopwatch stpw.Reset() stpw.Start() 'originally Dns.GetHostEntry, but slow also Dim myIPs() As System.Net.IPAddress = System.Net.Dns.GetHostAddresses("") stpw.Stop() Debug.WriteLine("'" & stpw.Elapsed.TotalSeconds) If myIPs.Length > 0 Then Debug.WriteLine("'" & myIPs(0).ToString) 'debug '39.8990525 '192.168.1.2 stpw.Reset() stpw.Start() 'originally Dns.GetHostEntry, but slow also myIPs = System.Net.Dns.GetHostAddresses("www.vbforums.com") stpw.Stop() Debug.WriteLine("'" & stpw.Elapsed.TotalSeconds) If myIPs.Length > 0 Then Debug.WriteLine("'" & myIPs(0).ToString) 'debug '0.042212 '63.236.73.220 End Sub End Class

    Read the article

  • Return pre-UPDATE column values in PostgreSQL without using triggers, functions or other "magic"

    - by Python Larry
    I have a related question, but this is another part of MY puzzle. I would like to get the OLD VALUE of a Column from a Row that was UPDATEd... WITHOUT using Triggers (nor Stored Procedures, nor any other extra, non-SQL/-query entities). The query I have is like this: UPDATE my_table SET processing_by = our_id_info -- unique to this instance WHERE trans_nbr IN ( SELECT trans_nbr FROM my_table GROUP BY trans_nbr HAVING COUNT(trans_nbr) > 1 LIMIT our_limit_to_have_single_process_grab ) RETURNING row_id If I could do "FOR UPDATE ON my_table" at the end of the subquery, that'd be devine (and fix my other question/problem). But, that won't work: can't have this AND a "GROUP BY" (which is necessary for figuring out the COUNT of trans_nbr's). Then I could just take those trans_nbr's and do a query first to get the (soon-to-be-) former processing_by values. I've tried doing like: UPDATE my_table SET processing_by = our_id_info -- unique to this instance FROM my_table old_my_table JOIN ( SELECT trans_nbr FROM my_table GROUP BY trans_nbr HAVING COUNT(trans_nbr) > 1 LIMIT our_limit_to_have_single_process_grab ) sub_my_table ON old_my_table.trans_nbr = sub_my_table.trans_nbr WHERE my_table.trans_nbr = sub_my_table.trans_nbr AND my_table.processing_by = old_my_table.processing_by RETURNING my_table.row_id, my_table.processing_by, old_my_table.processing_by But that can't work; "old_my_table" is not viewable outside of the join; the RETURNING clause is blind to it. I've long since lost count of all the attempts I've made; I have been researching this for literally hours. If I could just find a bullet-proof way to lock the rows in my subquery - and ONLY those rows, and WHEN the subquery happens - all the concurrency issues I'm trying to avoid disappear... UPDATE: [WIPES EGG OFF FACE] Okay, so I had a typo in the non-generic code of the above that I wrote "doesn't work"; it does... thanks to Erwin Brandstetter, below, who stated it would, I re-did it (after a night's sleep, refreshed eyes, and a banana for bfast). Since it took me so long/hard to find this sort of solution, perhaps my embarrassment is worth it? At least this is on SO for posterity now... : What I now have (that works) is like this: UPDATE my_table SET processing_by = our_id_info -- unique to this instance FROM my_table AS old_my_table WHERE trans_nbr IN ( SELECT trans_nbr FROM my_table GROUP BY trans_nbr HAVING COUNT(*) > 1 LIMIT our_limit_to_have_single_process_grab ) AND my_table.row_id = old_my_table.row_id RETURNING my_table.row_id, my_table.processing_by, old_my_table.processing_by AS old_processing_by The COUNT(*) is per a suggestion from Flimzy in a comment on my other (linked above) question. (I was more specific than necessary. [In this instance.])

    Read the article

  • mysql index optimization for a table with multiple indexes that index some of the same columns

    - by Sean
    I have a table that stores some basic data about visitor sessions on third party web sites. This is its structure: id, site_id, unixtime, unixtime_last, ip_address, uid There are four indexes: id, site_id/unixtime, site_id/ip_address, and site_id/uid There are many different types of ways that we query this table, and all of them are specific to the site_id. The index with unixtime is used to display the list of visitors for a given date or time range. The other two are used to find all visits from an IP address or a "uid" (a unique cookie value created for each visitor), as well as determining if this is a new visitor or a returning visitor. Obviously storing site_id inside 3 indexes is inefficient for both write speed and storage, but I see no way around it, since I need to be able to quickly query this data for a given specific site_id. Any ideas on making this more efficient? I don't really understand B-trees besides some very basic stuff, but it's more efficient to have the left-most column of an index be the one with the least variance - correct? Because I considered having the site_id being the second column of the index for both ip_address and uid but I think that would make the index less efficient since the IP and UID are going to vary more than the site ID will, because we only have about 8000 unique sites per database server, but millions of unique visitors across all ~8000 sites on a daily basis. I've also considered removing site_id from the IP and UID indexes completely, since the chances of the same visitor going to multiple sites that share the same database server are quite small, but in cases where this does happen, I fear it could be quite slow to determine if this is a new visitor to this site_id or not. The query would be something like: select id from sessions where uid = 'value' and site_id = 123 limit 1 ... so if this visitor had visited this site before, it would only need to find one row with this site_id before it stopped. This wouldn't be super fast necessarily, but acceptably fast. But say we have a site that gets 500,000 visitors a day, and a particular visitor loves this site and goes there 10 times a day. Now they happen to hit another site on the same database server for the first time. The above query could take quite a long time to search through all of the potentially thousands of rows for this UID, scattered all over the disk, since it wouldn't be finding one for this site ID. Any insight on making this as efficient as possible would be appreciated :) Update - this is a MyISAM table with MySQL 5.0. My concerns are both with performance as well as storage space. This table is both read and write heavy. If I had to choose between performance and storage, my biggest concern is performance - but both are important. We use memcached heavily in all areas of our service, but that's not an excuse to not care about the database design. I want the database to be as efficient as possible.

    Read the article

  • How to fine tune a Membership Provider?

    - by Venemo
    After all the answers to my last question about fine-tuning turned out to be more useful than I expected, I thought that I would ask another similar Question about the MembershipProviders as well. Okay, so firstly, to clarify: I know what a Membership, Role, and Profile provider is, how to implement my own, and how to configure them, and most of the things about them. Implementing a role and profile provider is pretty straightforward, because they only require simple CRUD most of the time. (A single line of LINQ is enough for about half of the RoleProvider's methods.) However, the Membership provider is a differend beast. Many of you may realize that it violates the SR (Single Responsibility) principle, because it has to do EVERYTHING related to user management. While this leaves a lot of room for customizations, it has its downsides as well. There is no information on the Internet about what their EXACT expected behaviour is, such as when should they throw exceptions or simply return null, and stuff like that. I use this sample implementation for reference, but it also contains several contradictions. For example, it uses its own ValidateUser method for checking for credentials in the ChangePassword method. But the ValidateUser also updates the user's LastLoginDate to the current date. So, does the framework expect that I set it in my own provider as well, or is it simply a mistake in the sample? The other is: the ChangePassword method throws an exception every time when validating the new password, but CreateUser doesn't ever throw an exception, it simply returns false. And last, but not least: it counts the invalid password attempts of the user and locks them if it passes a threshold. While this is good, but it requires manual action to unlock the users. Is it a problem if my provider automatically unlocks the user after a certain amount of time? (EDIT) I almost forgot: the CreateUser method in the sample inserts the ID from the method parameter. I actually think this is bad practice, because I use inters with auto incement as IDs, so inserting them from some method parameter is not an option. Should I just ignore the parameter, or require that its value is null and throw an exception if it isn't? All in all, does ASP.NET have any assumptions about the behaviour of a MembershipProvider? Is there any documentation which describes when should I throw an exception or just return null? I also tried to find a set of generic unit tests which would provide some guidance about the expected behaviour, but no luck, I found plenty of articles about "Unit testing is good", and "How to unit test a MembershipProvider", but not one where there would be any actual tests. Thanks in advance for everyone!

    Read the article

  • Multiple views and source list in a Core Data app

    - by Ellie P.
    I'm working on my first major Cocoa app for an undergraduate research project. The application is document-based and uses Core Data. One of the entities is an abstract entity, Page. Page is parent of several types of pages: ie PageWithHeaderAndFooter, PageWithTwoColumns, BasicPage etc. Page has attributes, such as title and author, that all pages have in common. Each specific type of page has a certain number of layout blocks (PageWithHeaderAndFooter has three: header, footer, body. BasicPage has one: body. etc.) Additionally, all Page subclasses define layout-specific implementations of certain methods. The other relevant entity is Style, which defines the visual look of a Page. (Think of Pages as HTML and Style as CSS.) I would like my app to have an iTunes/Mail-like source list with sections. (One section would be Pages, the other would be Styles.) I have a pretty good idea how to do the sectioned source list (this was a great help). However, after hours of headbanging and fruitless googling, here's what I can't figure out: Pages and Styles listed in the source list, and when you select one of them, all of the relevant fields for that object appear at the right (mostly NSTextViews, pop up menus, etc). I laid that out and did all of the bindings in Interface Builder. The problem is, if my source list contains different types of pages, how do I get a different view to display at the right depending on the type of page selected? For example, if a BasicPage is selected, I want just what you see above: the general page stuff and one NSTextView that corresponds to the one field body of BasicPage. But if I select a PageWithHeaderAndFooter, I want to display the general page stuff plus three NSTextViews (one for header, body, and footer.) If I have a Style selected, I want to display various pop up menus, color wells, etc. For the pages at least, we're only talking about one or more NSTextViews, each of which corresponds to a String attribute of the respective entity. How would you do this? Thank you for your help!

    Read the article

  • Corner Cases, Unexpected and Unusual Matlab

    - by Mikhail
    Over the years, reading others code, I encountered and collected some examples of Matlab syntax which can be at first unusual and counterintuitive. Please, feel free to comment or complement this list. I verified it r2006a. set([], 'Background:Color','red') Matlab is very forgiving sometimes. In this case, setting properties to an array of objects works also with nonsense properties, at least when the array is empty. myArray([1,round(end/2)]) This use of end keyword may seem unclean but is sometimes very handy instead of using length(myArray). any([]) ~= all([]) Surprisigly any([]) returns false and all([]) returns true. And I always thought that all is stronger then any. EDIT: with not empty argument all() returns true for a subset of values for which any() returns true (e.g. truth table). This means that any() false implies all() false. This simple rule is being violated by Matlab with [] as argument. Loren also blogged about it. Select(Range(ExcelComObj)) Procedural style COM object method dispatch. Do not wonder that exist('Select') returns zero! [myString, myCell] Matlab makes in this case an implicit cast of string variable myString to cell type {myString}. It works, also if I would not expect it to do so. [double(1.8), uint8(123)] => 2 123 Another cast example. Everybody would probably expect uint8 value being cast to double but Mathworks have another opinion. a = 5; b = a(); It looks silly but you can call a variable with round brackets. Actually it makes sense because this way you can execute a function given its handle. a = {'aa', 'bb' 'cc', 'dd'}; Surprsisingly this code neither returns a vector nor rises an error but defins matrix, using just code layout. It is probably a relict from ancient times. set(hobj, {'BackgroundColor','ForegroundColor'},{'red','blue'}) This code does what you probably expect it to do. That function set accepts a struct as its second argument is a known fact and makes sense, and this sintax is just a cell2struct away. Equvalence rules are sometimes unexpected at first. For example 'A'==65 returns true (although for C-experts it is self-evident). About which further unexpected/unusual Matlab features are you aware?

    Read the article

  • Parse int to string with stringstream

    - by SoulBeaver
    Well! I feel really stupid for this question, and I wholly don't mind if I get downvoted for this, but I guess I wouldn't be posting this if I had not at least made an earnest attempt at looking for the solution. I'm currently working on Euler Problem 4, finding the largest palindromic number of two three-digit numbers [100..999]. As you might guess, I'm at the part where I have to work with the integer I made. I looked up a few sites and saw a few standards for converting an Int to a String, one of which included stringstream. So my code looked like this: // tempTotal is my int value I want converted. void toString( int tempTotal, string &str ) { ostringstream ss; // C++ Standard compliant method. ss << tempTotal; str = ss.str(); // Overwrite referenced value of given string. } and the function calling it was: else { toString( tempTotal, store ); cout << loop1 << " x " << loop2 << "= " << store << endl; } So far, so good. I can't really see an error in what I've written, but the output gives me the address to something. It stays constant, so I don't really know what the program is doing there. Secondly, I tried .ToString(), string.valueOf( tempTotal ), (string)tempTotal, or simply store = temptotal. All refused to work. When I simply tried doing an implicit cast with store = tempTotal, it didn't give me a value at all. When I tried checking output it literally printed nothing. I don't know if anything was copied into my string that simply isn't a printable character, or if the compiler just ignored it. I really don't know. So even though I feel this is a really, really lame question, I just have to ask: How do I convert that stupid integer to a string with the stringstream? The other tries are more or less irrelevant for me, I just really want to know why my stringstream solution isn't working.

    Read the article

  • How would you organize this Javascript?

    - by Anurag
    How do you usually organize complex web applications that are extremely rich on the client side. I have created a contrived example to indicate the kind of mess it's easy to get into if things are not managed well for big apps. Feel free to modify/extend this example as you wish - http://jsfiddle.net/NHyLC/1/ The example basically mirrors part of the comment posting on SO, and follows the following rules: Must have 15 characters minimum, after multiple spaces are trimmed out to one. If Add Comment is clicked, but the size is less than 15 after removing multiple spaces, then show a popup with the error. Indicate amount of characters remaining and summarize with color coding. Gray indicates a small comment, brown indicates a medium comment, orange a large comment, and red a comment overflow. One comment can only be submitted every 15 seconds. If comment is submitted too soon, show a popup with appropriate error message. A couple of issues I noticed with this example. This should ideally be a widget or some sort of packaged functionality. Things like a comment per 15 seconds, and minimum 15 character comment belong to some application wide policies rather than being embedded inside each widget. Too many hard-coded values. No code organization. Model, Views, Controllers are all bundled together. Not that MVC is the only approach for organizing rich client side web applications, but there is none in this example. How would you go about cleaning this up? Applying a little MVC/MVP along the way? Here's some of the relevant functions, but it will make more sense if you saw the entire code on jsfiddle: /** * Handle comment change. * Update character count. * Indicate progress */ function handleCommentUpdate(comment) { var status = $('.comment-status'); status.text(getStatusText(comment)); status.removeClass('mild spicy hot sizzling'); status.addClass(getStatusClass(comment)); } /** * Is the comment valid for submission */ function commentSubmittable(comment) { var notTooSoon = !isTooSoon(); var notEmpty = !isEmpty(comment); var hasEnoughCharacters = !isTooShort(comment); return notTooSoon && notEmpty && hasEnoughCharacters; } // submit comment $('.add-comment').click(function() { var comment = $('.comment-box').val(); // submit comment, fake ajax call if(commentSubmittable(comment)) { .. } // show a popup if comment is mostly spaces if(isTooShort(comment)) { if(comment.length < 15) { // blink status message } else { popup("Comment must be at least 15 characters in length."); } } // show a popup is comment submitted too soon else if(isTooSoon()) { popup("Only 1 comment allowed per 15 seconds."); } });

    Read the article

  • PERL newbie : get a proper minimal debug_mode solution

    - by Michael Mao
    Hi all: I am learning PERL in a "head-first" manner. I am absolutely a newbie in this language: I am trying to have a debug_mode switch from CLI which can be used to control how my script works, by switching certain subroutines "on and off". And below is what I've got so far: #!/usr/bin/perl -s -w # purpose : make subroutine execution optional, # which is depending on a CLI switch flag use strict; use warnings; use constant DEBUG_VERBOSE => "v"; use constant DEBUG_SUPPRESS_ERROR_MSGS => "s"; use constant DEBUG_IGNORE_VALIDATION => "i"; use constant DEBUG_SETPPING_COMPUTATION => "c"; our ($debug_mode); mainMethod(); sub mainMethod # () { if(!$debug_mode) { print "debug_mode is OFF\n"; } elsif($debug_mode) { print "debug_mode is ON\n"; } else { print "OMG!\n"; exit -1; } checkArgv(); printErrorMsg("Error_Code_123", "Parsing Error at..."); verbose(); } sub checkArgv #() { print ("Number of ARGV : ".(1 + $#ARGV)."\n"); } sub printErrorMsg # ($error_code, $error_msg, ..) { if(defined($debug_mode) && !($debug_mode =~ DEBUG_SUPPRESS_ERROR_MSGS)) { print "You can only see me if -debug_mode is NOT set". " to DEBUG_SUPPRESS_ERROR_MSGS\n"; die("terminated prematurely...\n") and exit -1; } } sub verbose # () { if(defined($debug_mode) && ($debug_mode =~ DEBUG_VERBOSE)) { print "Blah blah blah...\n"; } } So far as I can tell, at least it works...: the -debug_mode switch doesn't interfere with normal ARGV the following commandlines work: ./optional.pl ./optional.pl -debug_mode ./optional.pl -debug_mode=v ./optional.pl -debug_mode=s However, I am puzzled when multiple debug_modes are "mixed", such as: ./optional.pl -debug_mode=sv ./optional.pl -debug_mode=vs I don't understand why the above lines of code "magically works". I see both of the "DEBUG_VERBOS" and "DEBUG_SUPPRESS_ERROR_MSGS" apply to the script, which is fine in this case. However, if there are some "conflicting" debug modes, I am not sure how to set the "precedence of debue_modes"? Also, I am not certain if my approach is good enough to Perlists and I hope I am getting my feet in the right direction. One biggest problem is that I now put if statements inside most of my subroutines for controlling their behavior under different modes. Is this okay? Is there a more elegant way? I know there must be a debug module from CPAN or elsewhere, but I wanna a real minimal solution that doesn't depend on any other module than the "default" And I cannot have any control on the environment where this script will be executed... Many thanks to the suggestions in advance.

    Read the article

  • Avoiding Redundancies in XML documents

    - by MarceloRamires
    I was working with a certain XML where there were no redundancies <person> <eye> <eye_info> <eye_color> blue </eye_color> </eye_info> </eye> <hair> <hair_info> <hair_color> blue </hair_color> </hair_info> </hair> </person> As you can see, the sub-tag eye-color makes reference to eye in it's name, so there was no need to avoid redundancies, I could get the eye color in a single line after loading the XML into a dataset: dataset.ReadXml(path); value = dataset.Tables("eye_info").Rows(0)("eye_color"); I do realise it's not the smartest way of doing so, and this situation I'm having now wasn't unforeseen. Now, let's say I have to read xml's that are in this format: <person> <eye> <info> <color> blue </color> </info> </eye> <hair> <info> <color> blue </color> </info> </hair> </person> So If I try to call it like this: dataset.ReadXml(path); value = dataset.Tables("info").Rows(0)("color"); There will be a redundancy, because I could only go as far as one up level to identify a single field in a XML with my previous method, and the 'disambiguator' is three levels above. Is there a practical way to reach with no mistake a single field given all the above (or at least a few) fields ?

    Read the article

  • Google Bookmarks Accelerator for IE8

    - by MACHiNESMiTH
    To Editors: Sorry, I didnt realise the question thing till you pointed it out. Also I didn't know about the appengine tag, I'm assuming it was selected by mistake from that JS autosuggestion box (it usually happens with me, I run a P3) my apologies for that too. I've edited it now, if it doesnt work let me know. Hi all, I'm making (or at least trying to make) a Google bookmarks accelerator for IE8, but I keep running into "Internet Explorer could not install this accelerator. There was a problem with the Accelerator's information." Does anyone know why this is happening? Here is the code I've made <?xml version="1.0" encoding="UTF-8" ?> <os:openServiceDescription xmlns="http://www.microsoft.com/schemas/openservicedescription/1.0"> <os:homepageUrl>http://www.google.com/bookmarks</os:homepageUrl> <os:display> <os:description>Add this page to GoogleBookmarks</os:description> <os:name>Add to GoogleBookmarks</os:name> <os:icon>http://www.google.com/favicon.ico</os:icon> </os:display> <os:activity category="Bookmarks"> <os:activityAction context="document"> <os:execute method="get" action="http://www.google.com/bookmarks/mark?op=add&output=popup"> <os:parameter name="bkmk" value="{documentUrl}" /> <os:parameter name="title" value="{documentTitle}" /> <os:parameter name="annotation" value="{selection}" /> </os:execute> </os:activityAction> </os:activity> </os:openServiceDescription> These are the references I used: MSDN Developers Guide MSDN Format Specification Unofficial Google API (used as a look up so far) and yes i know that I cannot send document variables between schemes (HTTP - HTTPS, for example) or between servers in different security zones (Intranet - Internet etc) Also, if this is the wrong place, what are the recommended sites and/or forums for xml related posts? (not really questions - but more like full blown detailed rant like posts) Thanks

    Read the article

  • Logging raw HTTP request/response in ASP.NET MVC & IIS7

    - by Greg Beech
    I'm writing a web service (using ASP.NET MVC) and for support purposes we'd like to be able to log the requests and response in as close as possible to the raw, on-the-wire format (i.e including HTTP method, path, all headers, and the body) into a database. What I'm not sure of is how to get hold of this data in the least 'mangled' way. I can re-constitute what I believe the request looks like by inspecting all the properties of the HttpRequest object and building a string from them (and similarly for the response) but I'd really like to get hold of the actual request/response data that's sent on the wire. I'm happy to use any interception mechanism such as filters, modules, etc. and the solution can be specific to IIS7. However, I'd prefer to keep it in managed code only. Any recommendations? Edit: I note that HttpRequest has a SaveAs method which can save the request to disk but this reconstructs the request from the internal state using a load of internal helper methods that cannot be accessed publicly (quite why this doesn't allow saving to a user-provided stream I don't know). So it's starting to look like I'll have to do my best to reconstruct the request/response text from the objects... groan. Edit 2: Please note that I said the whole request including method, path, headers etc. The current responses only look at the body streams which does not include this information. Edit 3: Does nobody read questions around here? Five answers so far and yet not one even hints at a way to get the whole raw on-the-wire request. Yes, I know I can capture the output streams and the headers and the URL and all that stuff from the request object. I already said that in the question, see: I can re-constitute what I believe the request looks like by inspecting all the properties of the HttpRequest object and building a string from them (and similarly for the response) but I'd really like to get hold of the actual request/response data that's sent on the wire. If you know the complete raw data (including headers, url, http method, etc.) simply cannot be retrieved then that would be useful to know. Similarly if you know how to get it all in the raw format (yes, I still mean including headers, url, http method, etc.) without having to reconstruct it, which is what I asked, then that would be very useful. But telling me that I can reconstruct it from the HttpRequest/HttpResponse objects is not useful. I know that. I already said it. Please note: Before anybody starts saying this is a bad idea, or will limit scalability, etc., we'll also be implementing throttling, sequential delivery, and anti-replay mechanisms in a distributed environment, so database logging is required anyway. I'm not looking for a discussion of whether this is a good idea, I'm looking for how it can be done.

    Read the article

  • Finding related tags using acts-as-taggable-on

    - by user284194
    In tag#show I list all entries with that tag. At the bottom of the page I'd like to have something like: "Related Tags: linked, list, of, related tags" My view looks like: <h2><%= link_to 'Tag', tags_path %>: <%= @tag.name.titleize %></h2> <% @entries.each do |entry| %> <h2><%= link_to h(entry.name), entry %></h2> <%- unless entry.phone.empty? -%> <p><%= h(entry.phone) %></p> <%- end -%> <%- unless entry.address.empty? -%> <p><%= h(entry.address) %></p> <%- end -%> <%- unless entry.description.empty? -%> <p><%= h(entry.description) %></p> <%- end -%> <p2><%= link_to "more...", entry %><p2> <% end %> Related Tags: <% @related.each do |tag| %> <%= link_to h(tag.tags), tag %> <% end %> tags_controller.rb: def show @title = Tag.find(params[:id]).name @tag = Tag.find(params[:id]) @entries = Entry.paginate(Entry.find_tagged_with(@tag), :page => params[:page], :per_page => 10, :order => "name") @related = Entry.tagged_with(@tag, :on => :tags) end Every entry has at least one tag, it's required by the entry model. I'd like duplicate tags ignored and the current tag (the tag that the list belongs to) ignored. My current code displays this: Related Tags: Gardens Gardens ToursGardens Gardens is a link to the entry, not the tag gardens. ToursGardens is a link to the entry that includes those tags. My desired result would be: Related Tags: Gardens, Tours Each link would link to it's associated tag. Can anyone help me achieve this? I tried using a div_for but I don't think that was right.

    Read the article

  • Rails ActiveResource Associations

    - by brad
    I have some ARes models (see below) that I'm trying to use associations with (which seems to be wholly undocumented and maybe not possible but I thought I'd give it a try) So on my service side, my ActiveRecord object will render something like render :xml => @group.to_xml(:include => :customers) (see generated xml below) The models Group and Customers are HABTM On my ARes side, I'm hoping that it can see the <customers> xml attribute and automatically populate the .customers attribute of that Group object , but the has_many etc methods aren't supported (at least as far as I can tell) So I'm wondering how ARes does it's reflection on the XML to set the attributes of an object. In AR for instance I could create a def customers=(customer_array) and set it myself, but this doesn't seem to work in ARes. One suggestion I found for an "association" is the just have a method def customers Customer.find(:all, :conditions => {:group_id => self.id}) end But this has the disadvantage that it makes a second service call to look up those customers... not cool I'd like my ActiveResource model to see that the customers attribute in the XML and automatically populate my model. Anyone have any experience with this?? # My Services class Customer < ActiveRecord::Base has_and_belongs_to_many :groups end class Group < ActiveRecord::Base has_and_belongs_to_many :customer end # My ActiveResource accessors class Customer < ActiveResource::Base; end class Group < ActiveResource::Base; end # XML from /groups/:id?customers=true <group> <domain>some.domain.com</domain> <id type="integer">266</id> <name>Some Name</name> <customers type="array"> <customer> <active type="boolean">true</active> <id type="integer">1</id> <name>Some Name</name> </customer> <customer> <active type="boolean" nil="true"></active> <id type="integer">306</id> <name>Some Other Name</name> </customer> </customers> </group>

    Read the article

  • PL/SQL pre-compile and Code Quality checks in an automatted build environment?

    - by Lars Corneliussen
    We build software using Hudson and Maven. We have C#, java and last, but not least PL/SQL sources (sprocs, packages, DDL, crud) For C# and Java we do unit tests and code analysis, but we don't really know the health of our PL/SQL sources before we actually publish them to the target database. Requirements There are a couple of things we wan't to test in the following priority: Are the sources valid, hence "compilable"? For packages, with respect to a certain database, would they compile? Code Quality: Do we have code flaws like duplicates, too complex methods or other violations to a defined set of rules? Also, the tool must run head-less (commandline, ant, ...) we wan't to do analysis on a partial code base (changed sources only) Tools We did a little research and found the following tools that could potencially help: Cast Application Intelligence Platform (AIP): Seems to be a server that grasps information about "anything". Couldn't find a console version that would export in readable format. Toad for Oracle: The Professional version is said to include something called Xpert validates a set of rules against a code base. Sonar + PL/SQL-Plugin: Uses Toad for Oracle to display code-health the sonar-way. This is for browsing the current state of the code base. Semantic Designs DMSToolkit: Quite general analysis of source code base. Commandline available? Semantic Designs Clones Detector: Detects clones. But also via command line? Fortify Source Code Analyzer: Seems to be focussed on security issues. But maybe it is extensible? more... So far, Toad for Oracle together with Sonar seems to be an elegant solution. But may be we are missing something here? Any ideas? Other products? Experiences? Related Questions on SO: http://stackoverflow.com/questions/531430/any-static-code-analysis-tools-for-stored-procedures http://stackoverflow.com/questions/839707/any-code-quality-tool-for-pl-sql http://stackoverflow.com/questions/956104/is-there-a-static-analysis-tool-for-python-ruby-sql-cobol-perl-and-pl-sql

    Read the article

  • What's the best way to do base36 arithmetic in perl?

    - by DVK
    What's the best way to do base36 arithmetic in Perl? To be more specific, I need to be able to do the following: Operate on positive N-digit numbers in base 36 (e.g. digits are 0-9 A-Z) N is finite, say 9 Provide basic arithmetic, at the very least the following 3: Addition (A+B) Subtraction (A-B) Whole division, e.g. floor(A/B). Strictly speaking, I don't really need a base10 conversion ability - the numbers will 100% of time be in base36. So I'm quite OK if the solution does NOT implement conversion from base36 back to base10 and vice versa. I don't much care whether the solution is brute-force "convert to base 10 and back" or converting to binary, or some more elegant approach "natively" performing baseN operations (as stated above, to/from base10 conversion is not a requirement). My only 3 considerations are: It fits the minimum specifications above It's "standard". Currently we're using and old homegrown module based on base10 conversion done by hand that is buggy and sucks. I'd much rather replace that with some commonly used CPAN solution instead of re-writing my own bicycle from scratch, but I'm perfectly capable of building it if no better standard possibility exists. It must be fast-ish (though not lightning fast). Something that takes 1 second to sum up 2 9-digit base36 numbers is worse than anything I can roll on my own :) P.S. Just to provide some context in case people decide to solve my XY problem for me in addition to answering the technical question above :) We have a fairly large tree (stored in DB as a bunch of edges), and we need to superimpose order on a subset of that tree. The tree dimentions are big both depth- and breadth- wise. The tree is VERY actively updated (inserts and deletes and branch moves). This is currently done by having a second table with 3 columns: parent_vertex, child_vertex, local_order, where local_order is an 9-character string built of A-Z0-9 (e.g. base 36 number). Additional considerations: It is required that the local order is unique per child (and obviously unique per parent), Any complete re-ordering of a parent is somewhat expensive, and thus the implementation is to try and assign - for a parent with X children - the orders which are somewhat evenly distributed between 0 and 36**10-1, so that almost no tree inserts result in a full re-ordering.

    Read the article

  • Boggling Direct3D9 dynamic vertex buffer Lock crash/post-lock failure on Intel GMA X3100.

    - by nj
    Hi, For starters I'm a fairly seasoned graphics programmer but as wel all know, everyone makes mistakes. Unfortunately the codebase is a bit too large to start throwing sensible snippets here and re-creating the whole situation in an isolated CPP/codebase is too tall an order -- for which I am sorry, do not have the time. I'll do my best to explain. B.t.w, I will of course supply specific pieces of code if someone wonders how I'm handling this-or-that! As with all resources in the D3DPOOL_DEFAULT pool, when the device context is taken away from you you'll sooner or later will have to reset your resources. I've built a mechanism to handle this for all relevant resources that's been working for years; but that fact nothingwithstanding I've of course checked, asserted and doubted any assumption since this bug came to light. What happens is as follows: I have a rather large dynamic vertex buffer, exact size 18874368 bytes. This buffer is locked (and discarded fully using the D3DLOCK_DISCARD flag) each frame prior to generating dynamic geometry (isosurface-related, f.y.i) to it. This works fine, until, of course, I start to reset. It might take 1 time, it might take 2 or it might take 5 resets to set off a bug that causes an access violation either on the pointer returned by the Lock() operation on the renewed resource or a plain crash -- regarding a somewhat similar address, but without the offset that it has tacked on to it in the first case because in that case we're somewhere halfway writing -- iside the D3D9 dll Lock() call. I've tested this on other hardware, upgraded my GMA X3100 drivers (using a MacBook with BootCamp) to the latest ones, but I can't reproduce it on any other machine and I'm at a loss about what's wrong here. I have tried to reproduce a similar situation with a similar buffer (I've got a large scratch pad of the same type I filled with quads) and beyond a certain amount of bytes it started to behave likewise. I'm not asking for a solution here but I'm very interested if there are other developers here who have battled with the same foe or maybe some who can point me in some insightful direction, maybe ask some questions that might shed a light on what I may or may not be overlooking. Another interesting artifact is that the vertex buffer starts to bug if I supply both D3DLOCK_DISCARD and D3DLOCK_NOOVERWRITE together which, even though not very logical (you're not going to overwrite if you've just discarded all), gives graphics glitches. Thanks and any corrections are more than welcome. Niels p.s - A friend of mine raised the valid point that it is a huge buffer for onboard video RAM and it's being at least double or triple buffered internally due to it's dynamic nature. On the other hand, the debug output (D3D9 debug DLL + max. warning output) remains silent. p.s 2 - Had it tested on more machines and still works -- it's probably a matter of circumstance: the huge dynamic, internally double/trippled buffered buffer, not a lot of memory and drivers that don't complain when they should.. Unless someone has a better suggestion; I'd still love to hear it :)

    Read the article

  • Mindblowing jQuery Weirdness

    - by Jason
    ...at least to me. This code used to work fine. I'm pretty sure nothing has changed, but now all of the sudden it behaves oddly. Basically I'm trying to create inline editing functionality. When the user clicks on the link, it dynamically generates a textbox and a confirm and cancel link. I'm having problems with the cancel link not removing everything in the cell. HTML: ... <td class="bid"> <a href="javascript:" class="102093" title="Click to modify bid">$0.45</a> </td> ... Binding jQuery (in $(function())): $('.bid a').live('click', renderBidChange); .... $('.report_table .cancel').live('click', cancelUpdate); renderBidChange (this function creates the dynamic elements): function renderBidChange(){ var cpc = $(this); var value = cpc.text().replace('$', ''); var cell = cpc.parent('.bid'); cpc.hide(); var input = document.createElement('input'); $(input).attr({type:'text',class:'dynamic cpc-input'}).val(value); cell.append(input); var accept = document.createElement('a'); $(accept).addClass('accept').attr({'href':'javascript:', 'title':'Accept Changes'}).text('Accept Changes'); cell.append(accept); var cancel = document.createElement('a'); $(cancel).addClass('cancel').attr({'href':'javascript:', 'title':'Cancel Changes'}).text('Cancel Changes'); cell.append(cancel); $(input).focus(); input.select(); } cancelUpdate this function just removes everything visible (all the dynamic junk in this case) in the cell and shows what used to be there. function cancelUpdate(){ var cell = $(this).parent(); cell.find(':visible').remove(); cell.find(':hidden').show(); } However, for some reason, the cancel link remains after it is clicked! Everything else is removed except that. W T F Thanks for any insight you're able to provide! I'm sure it's just some stupid little detail I'm over[caffeinatedly]looking... UPDATE Immediately after posting this I epiphanied that it may be a CSS issue, but after double checking my code, it is not.

    Read the article

  • Problem with final branch in a parallel activity

    - by Dan Revell
    This might seem like a silly thing to say, the final branch in a parallel activity so I'll clarify. It's a parallel activity with three branches each containing a simple create task, on task changed and complete task. The branch containing the task that is last to complete seems to break. So every task works in it's own right, but the last one encounters a problem. Say the user clicks the final tasks link to open the attached infopath form and submits that. Execution gets to the event handler for that onTaskChanged where a taskCompleted variable gets set to true which will exit the while loop. I've successfully hit a breakpoint on this line so I know that happens. However the final activity in that branch, the completeTask doesn't get hit. When submit is clicked in the final form, the operation in progess screen says of for quite a while before returning to the workflow status page. The task that was opened and submitted says "Not Started". I can disable any of the branches to leave only two, but the same problem happens with the last to be completed. Earlier on in the workflow I do essencially the same thing. I have another 3 branch parallel activity with each brach containing a task. This one works correctly which leads me to believe that it might be a problem with having two parallel activites in the same sequential workflow. I've considered the possibility that it might be a correlation token problem. The token that every task branch uses is unique to that branch and it's owner activity name is est to that of the branch. It stands to reason that if the task complete variable is indeed getting set to true but the while loop isn't being exited, then there's a wire crossing with the variable somewhere. However I'd still have thought that the task status back on the workflow status page would at least say that the task is in progress. This is a frustrating show stopper of a bug for me. Any thoughts or suggestions would be much appricated so I can investigate them.

    Read the article

  • Help choosing authentication method

    - by Dima
    I need to choose an authentication method for an application installed and integrated in customers environment. There are two types of environments - windows and linux/unix. Application is user based, no web stuff, pure Java. The requirement is to authenticate users which will use my application against customer provided user base. Meaning, customer installs my app, but uses his own users to grant or deny access to my app. Typical, right? I have three options to consider and I need to pick up the one which would be a) the most flexible to cover most common modern environments and b) would take least effort while stay robust and standard. Option (1) - Authenticate locally managing user credentials in some local storage, e.g. file. Customer would then add his users to my application and it will then check the passwords. Simple, clumsy but would work. Customers would have to punch every user they want to grant access to my app using some UI we will have to provide. Lots of work for me, headache to the customer. Option (2) - Use LDAP authentication. Customers would tell my app where to look for users and I will walk their directory resolving names into user names and trying to bind with found password. This is better approach IMO, but more fragile because I will have to walk an unknown directory structure and who knows if this will be permitted everywhere. Would be harder to test since there are many LDAP implementation out there, last thing I want is drowning in this voodoo. Option(3) - Use plain Kerberos authentication. Customers would tell my app what realm (domain) and which KDC (key distribution center) to use. In ideal world these two parameters would be all I need to set while customers could use their own administration tools to configure domain and kdc. My application would simply delegate user credentials to this third party (using JAAS or Spring security) and consider success when third party is happy with them. I personally prefer #3, but not sure what surprises I might face. Would this cover windows and *nix systems entirely? Is there another option to consider?

    Read the article

  • Scala actors: receive vs react

    - by jqno
    Let me first say that I have quite a lot of Java experience, but have only recently become interested in functional languages. Recently I've started looking at Scala, which seems like a very nice language. However, I've been reading about Scala's Actor framework in Programming in Scala, and there's one thing I don't understand. In chapter 30.4 it says that using react instead of receive makes it possible to re-use threads, which is good for performance, since threads are expensive in the JVM. Does this mean that, as long as I remember to call react instead of receive, I can start as many Actors as I like? Before discovering Scala, I've been playing with Erlang, and the author of Programming Erlang boasts about spawning over 200,000 processes without breaking a sweat. I'd hate to do that with Java threads. What kind of limits am I looking at in Scala as compared to Erlang (and Java)? Also, how does this thread re-use work in Scala? Let's assume, for simplicity, that I have only one thread. Will all the actors that I start run sequentially in this thread, or will some sort of task-switching take place? For example, if I start two actors that ping-pong messages to each other, will I risk deadlock if they're started in the same thread? According to Programming in Scala, writing actors to use react is more difficult than with receive. This sounds plausible, since react doesn't return. However, the book goes on to show how you can put a react inside a loop using Actor.loop. As a result, you get loop { react { ... } } which, to me, seems pretty similar to while (true) { receive { ... } } which is used earlier in the book. Still, the book says that "in practice, programs will need at least a few receive's". So what am I missing here? What can receive do that react cannot, besides return? And why do I care? Finally, coming to the core of what I don't understand: the book keeps mentioning how using react makes it possible to discard the call stack to re-use the thread. How does that work? Why is it necessary to discard the call stack? And why can the call stack be discarded when a function terminates by throwing an exception (react), but not when it terminates by returning (receive)? I have the impression that Programming in Scala has been glossing over some of the key issues here, which is a shame, because otherwise it's a truly excellent book.

    Read the article

  • Optimizing spacing of mesh containing a given set of points

    - by Feynman
    I tried to summarize the this as best as possible in the title. I am writing an initial value problem solver in the most general way possible. I start with an arbitrary number of initial values at arbitrary locations (inside a boundary.) The first part of my program creates a mesh/grid (I am not sure which is the correct nuance), with N points total, that contains all the initial values. My goal is to optimize the mesh such that the spacing is as uniform as possible. My solver seems to work half decently (it needs some more obscure debugging that is not relevant here.) I am starting with one dimension. I intend to generalize the algorithm to an arbitrary number of dimensions once I get it working consistently. I am writing my code in fortran, but feel free to reply with pseudocode or the language of your choice. Allow me to elaborate with an example: Say I am working on a closed interval [1,10] xmin=1 xmax=10 Say I have 3 initial points: xmin, 5 and xmax num_ivc=3 known(num_ivc)=[xmin,5,xmax] //my arrays start at 1. Assume "known" starts sorted I store my mesh/grid points in an array called coord. Say I want 10 points total in my mesh/grid. N=10 coord(10) Remember, all this is arbitrary--except the variable names of course. The algorithm should set coord to {1,2,3,4,5,6,7,8,9,10} Now for a less trivial example: num_ivc=3 known(num_ivc)=[xmin,5.5,xmax or just num_ivc=1 known(num_ivc)=[5.5] Now, would you have 5 evenly spaced points on the interval [1, 5.5] and 5 evenly spaced points on the interval (5.5, 10]? But there is more space between 1 and 5.5 than between 5.5 and 10. So would you have 6 points on [1, 5.5] followed by 4 on (5.5 to 10]. The key is to minimize the difference in spacing. I have been working on this for 2 days straight and I can assure you it is a lot trickier than it sounds. I have written code that only works if N is large only works if N is small only works if it the known points are close together only works if it the known points are far apart only works if at least one of the known points is near a boundary only works if none of the known points are near a boundary So as you can see, I have coded the gamut of almost-solutions. I cannot figure out a way to get it to perform equally well in all possible scenarios (that is, create the optimum spacing.)

    Read the article

  • JavaScript: Given an offset and substring length in an HTML string, what is the parent node?

    - by Bungle
    My current project requires locating an array of strings within an element's text content, then wrapping those matching strings in <a> elements using JavaScript (requirements simplified here for clarity). I need to avoid jQuery if at all possible - at least including the full library. For example, given this block of HTML: <div> <p>This is a paragraph of text used as an example in this Stack Overflow question.</p> </div> and this array of strings to match: ['paragraph', 'example'] I would need to arrive at this: <div> <p>This is a <a href="http://www.example.com/">paragraph</a> of text used as an <a href="http://www.example.com/">example</a> in this Stack Overflow question.</p> </div> I've arrived at a solution to this by using the innerHTML() method and some string manipulation - basically using the offsets (via indexOf()) and lengths of the strings in the array to break the HTML string apart at the appropriate character offsets and insert <a href="http://www.example.com/"> and </a> tags where needed. However, an additional requirement has me stumped. I'm not allowed to wrap any matched strings in <a> elements if they're already in one, or if they're a descendant of a heading element (<h1> to <h6>). So, given the same array of strings above and this block of HTML (the term matching has to be case-insensitive, by the way): <div> <h1>Example</a> <p>This is a <a href="http://www.example.com/">paragraph of text</a> used as an example in this Stack Overflow question.</p> </div> I would need to disregard both the occurrence of "Example" in the <h1> element, and the "paragraph" in <a href="http://www.example.com/">paragraph of text</a>. This suggests to me that I have to determine which node each matched string is in, and then traverse its ancestors until I hit <body>, checking to see if I encounter a <a> or <h_> node along the way. Firstly, does this sound reasonable? Is there a simpler or more obvious approach that I've failed to consider? It doesn't seem like regular expressions or another string-based comparison to find bounding tags would be robust - I'm thinking of issues like self-closing elements, irregularly nested tags, etc. There's also this... Secondly, is this possible, and if so, how would I approach it?

    Read the article

< Previous Page | 340 341 342 343 344 345 346 347 348 349 350 351  | Next Page >