Search Results

Search found 19554 results on 783 pages for 'xml pull parser'.


Webcrawler, feedback?

- by Jan Kuboschek

Hey folks, every once in a while I need to automate data collection tasks from websites. Sometimes I need a bunch of URLs from a directory, sometimes I need an XML sitemap (yes, I know there is lots of software and plenty of online services for that). Anyway, as a follow-up to my previous question, I've written a little webcrawler that can visit websites.

Basic crawler class to easily and quickly interact with one website.
Override "doAction(String URL, String content)" to process the content further (e.g. store it, parse it).
Concept allows for multi-threading of crawlers. All class instances share processed and queued lists of links.
Instead of keeping track of processed links and queued links within the object, a JDBC connection could be established to store links in a database.
Currently limited to one website at a time; however, this could be expanded by adding an externalLinks stack and adding to it as appropriate.

JCrawler is intended to be used to quickly generate XML sitemaps or parse websites for your desired information. It's lightweight. Is this a good/decent way to write the crawler, given the limitations above?

http://pastebin.com/VtgC4qVE - Main.java
http://pastebin.com/gF4sLHEW - JCrawler.java
http://pastebin.com/VJ1grArt - HTMLUtils.java

Thanks for your feedback in advance! :)

Getting elements children with certain tag jQuery

- by johnnyArt

I'm trying to get all the input elements of a certain form with jQuery, providing only the name of the form and knowing only that the wanted fields are input elements. Let's say:

    <form action='#' id='formId'>
        <input id='name' />
        <input id='surname'/>
    </form>

How do I access them individually with jQuery? I tried something like $('#formId > input') with no success; in fact an error came back on the console: "XML filter is applied to non-XML value (function (a, b) {return new (c.fn.init)(a, b);})". Maybe I have to do it with .children or something like that? I'm pretty new at jQuery and I'm not really liking the docs. It was much friendlier in Mootools, or maybe I just need to get used to it.

Oh, and last but not least: I've seen it asked before but with no final answer. Can I create a new DOM element with jQuery and work with it before inserting it (if I ever do) into the code? In Mootools we had something like

    var myEl = new Element(element[, properties]);

and you could then refer to it in further expressions, but I fail to understand how to do that in jQuery. Thanks in advance.

Perl Regex Multiple Items in Single String

- by Sho Minamimoto

I'm trying to parse a single string and get multiple chunks of data out of the same string with the same regex conditions. I'm parsing a single HTML doc that is static. (For an undisclosed reason, I can't use an HTML parser to do the job.) I have an expression that looks like

    $string =~ /\<img\ssrc\="(.*)"/;

and I want to get the value of $1. However, in the one string there are many img tags like this, so I need something like an array returned (@1?). Is this possible?
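The "all matches from one string" idea can be sketched as follows. This is written in Python rather than Perl, purely as an illustration; the sample HTML string is an assumption, and the capture group is made non-greedy so that each match stops at the first closing quote instead of running on to the last quote in the string.

```python
import re

# Hypothetical sample input; the real document is not shown in the question.
html = '<p><img src="a.png"><img src="b/c.jpg" alt="x"></p>'

# Non-greedy (.*?) stops at the first closing quote, so each tag yields one match.
IMG_SRC = re.compile(r'<img\ssrc="(.*?)"')

# findall returns the captured group of every match, in document order.
sources = IMG_SRC.findall(html)
print(sources)
```

The point of the sketch is that a "find all" style call returns a list of captures, so no special array variable is needed.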

Extracting Demographic and Contact Information from unstructured text files

- by jn29098

I am looking to extract specific items out of a large pool of unstructured documents. These documents could be 1-5 pages of text formatted in various ways by the user, but in most cases would contain at least: name, address (physical), email address, phone number, and website URL. I'm looking for a semantic parser that can attempt to extract these elements from the documents so that I can load that information into a relational database and work with these records as contacts. Other services I've looked at, while valuable for other purposes, do not address this specific need: Alchemy API, Open Calais, Saplo. Any thoughts, suggestions or leads?

parsing raw email in php

- by Uberfuzzy

i'm looking for good/working/simple to use php code for parsing raw email into parts. i've written a couple of brute force solutions, but every time, one small change/header/space/something comes along and my whole parser fails and the project falls apart. and before i get pointed at PEAR/PECL, i need actual code. my host has some screwy config or something, i can never seem to get the .so's to build right. if i do get the .so made, some difference in path/environment/php.ini means it isn't always available (apache vs cron vs cli). oh, and one last thing, i'm parsing the raw email text, NOT pop3, and NOT imap. it's being piped into the php script via a .qmail email redirect. i'm not expecting SOF to write it for me, i'm looking for some tips/starting points on doing it "right". this is one of those "wheel" problems that i know has already been solved.

Reformat a YouTube URL using jQuery

- by jamEs

I am using the YouTube Channel jQuery script to pull in data from a YouTube channel. Then I'm using an iFrame on the page to display the videos without leaving the site. The only problem is that the URLs the Channel plugin pulls in don't quite work with my iFrame concept: they load the whole YouTube page instead of just the video. I figured out a workaround: if I reformat the URL and tell it to display fullscreen and to autoplay, it does what I would like it to do. The URLs are currently formatted like

    http://www.youtube.com/watch?v=xxxxxxxxx&feature=youtube_gdata

and I need them to be rewritten as

    http://www.youtube.com/v/xxxxxxxxx?fs=1&autoplay=1

I've seen a couple of similar topics here on SO, but given my limited jQuery and regex talents I wasn't able to get anything to work for my purposes.
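The rewrite described above is a plain string transformation, so it can be sketched independently of jQuery. The following is a minimal Python illustration, not the plugin's own code; the function name `to_embed_url` is hypothetical, and it assumes the video ID always arrives in the `v` query parameter as in the example URL.

```python
from urllib.parse import urlparse, parse_qs


def to_embed_url(watch_url: str) -> str:
    """Turn a /watch?v=ID URL into the /v/ID?fs=1&autoplay=1 form from the question."""
    query = parse_qs(urlparse(watch_url).query)
    video_ids = query.get("v")
    if not video_ids:
        # No video ID to work with; fail loudly rather than build a broken URL.
        raise ValueError(f"no 'v' parameter in {watch_url!r}")
    return f"http://www.youtube.com/v/{video_ids[0]}?fs=1&autoplay=1"


print(to_embed_url("http://www.youtube.com/watch?v=xxxxxxxxx&feature=youtube_gdata"))
```

Parsing the query string instead of pattern-matching the whole URL means extra parameters such as `feature=youtube_gdata` are dropped regardless of their order.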

How to convert an HTML table to an array in python

- by user345660

I have an html document, and I want to pull the tables out of this document and return them as arrays. I'm picturing 2 functions, one that finds all the html tables in a document, and a second one that turns html tables into 2-dimensional arrays. Something like this:

    htmltables = get_tables(htmldocument)
    for table in htmltables:
        array = make_array(table)

There are 2 catches:
1. The number of tables varies day to day.
2. The tables have all kinds of weird extra formatting, like bold and blink tags, randomly thrown in.

Thanks!
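One way to approach this with only the standard library is sketched below. It is a minimal illustration under stated assumptions, not a complete solution: the class name `TableExtractor` is hypothetical, `get_tables` here returns the 2-dimensional arrays directly (so it covers both functions pictured above), and nested tables are not handled. Formatting tags such as bold or blink are ignored while their text is kept.

```python
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect every <table> as a list of rows, each row a list of cell strings."""

    def __init__(self):
        super().__init__()
        self.tables = []   # one list of rows per table seen so far
        self._row = None   # current row, or None when outside <tr>
        self._cell = None  # text pieces of the current cell, or None outside a cell

    def _close_cell(self):
        if self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None

    def _close_row(self):
        self._close_cell()
        if self._row is not None:
            self.tables[-1].append(self._row)
            self._row = None

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self._close_row()
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._close_row()          # tolerate a missing </tr>
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._close_cell()         # tolerate a missing </td>
            self._cell = []
        # Any other tag (<b>, <blink>, ...) is skipped; its text still
        # arrives in handle_data and is kept.

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._close_cell()
        elif tag in ("tr", "table"):
            self._close_row()


def get_tables(htmldocument):
    parser = TableExtractor()
    parser.feed(htmldocument)
    parser.close()
    return parser.tables


sample = """
<table><tr><th>Name</th><th>Qty</th></tr>
<tr><td><b>apple</b></td><td><blink>3</blink></td></tr></table>
<table><tr><td>x</td></tr></table>
"""
for table in get_tables(sample):
    print(table)
```

Because the parser appends a new list for every table it meets, the number of tables does not need to be known in advance.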

Chrome Extension: How to display tab objects?

- by Jalleluhah

I am in the process of designing an extension for Google Chrome that helps to organize tabs (I know, there are many that already exist; that doesn't matter). I wish to open a popup window that will display tabs as objects (i.e., in the same way that it is displayed in the tab bar at the top of the browser). One way of doing this would be to pull various details (ID, Title, URL, etc.) from each tab, create a class and make instantiations of it upon the opening of each tab using these data, but this seems rather convoluted considering that what I want is sitting right there in the tab bar. Is there any simpler way to achieve this? In addition, I have seen several apps that utilize page previews. Is there something in the API that allows direct access to these?

Join Query Help

- by John

Hello, The query below works well. It pulls data from two MySQL tables, "submission" and "login." I would like to also pull data from a third table called "comment" in the same database. The table "comment" has the following fields:

    commentid, loginid, submissionid, comment, datecommented

Two of the fields in the table "login" are called "loginid" and "username." In the query below, I would like to count all "commentid" in "comment" where "loginid" equals the "loginid" in "login" where "username" equals "$profile." How can I do this? Thanks in advance, John

    $sqlStr1 = "SELECT l.username, l.loginid, s.loginid, s.submissionid, s.title, s.url, s.datesubmitted, s.displayurl, l.created, count(s.submissionid) countSubmissions FROM submission AS s INNER JOIN login AS l ON s.loginid = l.loginid WHERE l.username = '$profile'";
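The counting requirement can be sketched with correlated subqueries, which avoid multiplying rows when two child tables hang off the same login. The example below is an illustration only: it runs against an in-memory SQLite database from Python rather than MySQL from PHP, the sample rows are invented, the tables carry only the columns needed for the count, and the alias `countComments` is an assumption. A bound parameter stands in for `$profile`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE login (loginid INTEGER PRIMARY KEY, username TEXT);
CREATE TABLE submission (submissionid INTEGER PRIMARY KEY, loginid INTEGER, title TEXT);
CREATE TABLE comment (commentid INTEGER PRIMARY KEY, loginid INTEGER,
                      submissionid INTEGER, comment TEXT, datecommented TEXT);
-- Invented sample data.
INSERT INTO login VALUES (1, 'john'), (2, 'anna');
INSERT INTO submission VALUES (10, 1, 'first'), (11, 1, 'second'), (12, 2, 'other');
INSERT INTO comment VALUES (100, 1, 12, 'hi', '2010-06-01'),
                           (101, 1, 10, 'yo', '2010-06-02'),
                           (102, 2, 10, 'ok', '2010-06-03');
""")

# Each subquery counts rows in one child table for the selected login.
query = """
SELECT l.username,
       (SELECT COUNT(*) FROM submission AS s WHERE s.loginid = l.loginid) AS countSubmissions,
       (SELECT COUNT(*) FROM comment AS c WHERE c.loginid = l.loginid) AS countComments
FROM login AS l
WHERE l.username = ?
"""
row = conn.execute(query, ("john",)).fetchone()
print(row)
```

Joining both `submission` and `comment` directly to `login` in one flat join would pair every submission with every comment, which inflates both counts; the subqueries keep the two counts independent.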

Hibernate bidirectional many-to-many mapping advice!

- by Rob

Hi all, I wondered if anyone might be able to help me out. I am trying to work out what to google for (or any other ideas!!). Basically I have a bidirectional many-to-many mapping between a user entity and a club entity (via a join table called userClubs). I now want to include a column in userClubs that represents the role, so that when I call user.getClubs() I can also work out what level of access they have. Is there a clever way to do this using Hibernate, or do I need to rethink the database structure? Thank you for any help (or just for reading this far!!)

The user.hbm.xml looks a bit like

    <set name="clubs" table="userClubs" cascade="save-update">
        <key column="user_ID"/>
        <many-to-many column="activity_ID" class="com.ActivityGB.client.domain.Activity"/>
    </set>

the activity.hbm.xml part

    <set name="members" inverse="true" table="userClubs" cascade="save-update">
        <key column="activity_ID"/>
        <many-to-many column="user_ID" class="com.ActivityGB.client.domain.User"/>
    </set>

The current userClubs table contains the fields

    id | user_ID | activity_ID

I would like to include in there

    id | user_ID | activity_ID | role

and be able to access the role on both sides...

How to see a branch created in master

- by richard

Hi, I created a branch in my master repository (192.168.1.2). On my other computer, I ran 'git pull --rebase' and saw:

    $ git pull --rebase
    Unpacking objects: 100% (16/16), done.
    From git+ssh://[email protected]/media/LINUXDATA/mozilla-1.9.1
       62d004e..b291703  master     -> origin/master
     * [new branch]      improv     -> origin/improv

But when I do a 'git branch' in my local repository, I see only 1 branch, and 'git checkout improv' fails:

    $ git branch
    * master
    $ git checkout improv
    error: pathspec 'improv' did not match any file(s) known to git.
    Did you forget to 'git add'?

Best way to parse command line arguments in C#

- by Paul Stovell

When building console applications that take parameters, you can use the arguments passed to Main(string[] args). In the past I've simply indexed/looped that array and done a few regular expressions to extract the values. However, when the commands get more complicated, the parsing can get pretty ugly. More recently, I built the world's simplest Backus-Naur Form parser in C# to parse the arguments. It does the job, but it also feels like overkill. So I'm interested in: libraries that you use, and patterns that you use. Assume the commands always adhere to common standards such as answered here.

.NET How would I build a DAL to meet my requirements?

- by Jonno

Assuming that I must deploy an asp.net app over the following 3 servers: 1) DB - not public 2) 'middle' - not public 3) Web server - public. I am not allowed to connect from the web server to the DB directly. I must pass through 'middle' - this is purely to slow down an attacker if they breached the web server. All db access is via stored procedures. No table access. I simply want to provide the web server with an ADO dataset (I know many will dislike this, but this is the requirement).

Using asmx web services - it works, but XML serialisation is slow and it's an extra set of code to maintain and deploy.
Using an ssh/vpn tunnel, so that the web server connects to the db 'via' the middle server, seems to remove any possible benefit of maintaining 'middle'.
Using WCF binary/tcp removes the XML problem, but still there is extra code.

Is there an approach that provides the ease of ssh/vpn, but the potential benefit of having the DAL on the middle server? Many thanks.

drupal form textfield #default_value not working

- by alvin.ng

I am working on a custom module with a multi-page form on Drupal 6. I found that #default_value is not working when my '#type' = 'textfield'. However, when '#type' = 'textarea', it displays correctly with the '#default_value' specified. Basically, I wrote a FormFactory to return a different form definition ($form) based on the post parameter received. Initially, it returns a display of the directory list; the user then selects from radio buttons until a specific directory contains an XML file, at which point it becomes an edit form. The edit form should have text fields displaying the data (#default_value) from the XML file; however, only the type 'textarea' works here, not 'textfield'. How can I make my '#default_value' work in this case?

Below is the non-working field definition:

    $form['pageset']['newsTitle'] = array(
        '#type' => 'textfield',
        '#title' => 'News Title',
        '#default_value' => "{$element->newsTitle}",
        '#rows' => 1,
        '#required' => TRUE,
    );

Then I changed it to textarea as shown below to make it work:

    $form['pageset']['newsTitle'] = array(
        '#type' => 'textarea',
        '#title' => 'News Title',
        '#default_value' => "{$element->newsTitle}",
        '#rows' => 1,
        '#required' => TRUE,
    );

Using Website Information Without WebView

- by Mr. Monkey

I am very new to this, and I'm mostly looking for what I need to study to be able to accomplish this. What I want to do is use the GUI I have built for my app, but pull the information from a website. If I have a website that looks like this: (Sorry, can't post pics yet) http:// dl.dropbox.com/u/7037695/ErrorCodeApp/FromWebsite.PNG (full website can be seen at http://www.atmequipment.com/Error-Codes) What would I need from the website so that if a user entered an error code here: http:// dl.dropbox.com/u/7037695/ErrorCodeApp/InApp.PNG it would use the search from the website, and populate the error description in my app? I know this is a huge question; I'm just looking for what is actually needed to accomplish this, and then I can start researching from there. -- Or is it even possible?

How to perform undirected graph processing from SQL data

- by recipriversexclusion

I ran into the following problem in dynamically creating topics for our ActiveMQ system: I have a number of processes (M_1, ..., M_n), where n is not large, typically 5-10. Some of the processes will listen to the output of others, through a message queue; these edges are specified in an XML file, e.g.

    <link from="M1" to="M3"></link>
    <link from="M2" to="M4"></link>
    <link from="M3" to="M4"></link>

etc. The edges are sparse, so there won't be many of them. I will parse this XML and store this information in an SQL DB, one table for nodes and another for edges. Now, I need to dynamically create strings of the form

    M1.exe --output_topic=T1
    M2.exe --output_topic=T2
    M3.exe --input_topic=T1 --output_topic=T3
    M4.exe --input_topic=T2 --input_topic=T3

where the topic tags are sequentially generated. What is the best way to go about querying SQL to obtain these relationships? Are there any tools or tutorials you can point me to? I've never done graphs with SQL. Using SQL is imperative, because we use it for other stuff, too. Thanks!
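A minimal sketch of the node and edge tables and the queries that produce those command lines is given below. It uses Python with an in-memory SQLite database purely for illustration; the table and column names (`nodes`, `edges`, `src`, `dst`) are assumptions, as is the rule that every process with at least one outgoing edge gets exactly one output topic, numbered in name order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE nodes (name TEXT PRIMARY KEY);
CREATE TABLE edges (src TEXT REFERENCES nodes(name),
                    dst TEXT REFERENCES nodes(name));
INSERT INTO nodes VALUES ('M1'), ('M2'), ('M3'), ('M4');
INSERT INTO edges VALUES ('M1', 'M3'), ('M2', 'M4'), ('M3', 'M4');
""")

# Every node with an outgoing edge gets one output topic: T1, T2, ...
# (plain text ordering; names like M10 would need a numeric sort key).
producers = [r[0] for r in conn.execute("SELECT DISTINCT src FROM edges ORDER BY src")]
topic_of = {name: f"T{i}" for i, name in enumerate(producers, start=1)}

commands = []
for (name,) in conn.execute("SELECT name FROM nodes ORDER BY name").fetchall():
    # A node listens on the output topic of each node that points at it.
    inputs = conn.execute(
        "SELECT src FROM edges WHERE dst = ? ORDER BY src", (name,)
    ).fetchall()
    args = [f"--input_topic={topic_of[src]}" for (src,) in inputs]
    if name in topic_of:
        args.append(f"--output_topic={topic_of[name]}")
    commands.append(" ".join([f"{name}.exe"] + args))

print("\n".join(commands))
```

With only 5-10 nodes, one small query per node is cheap; the graph logic stays in SQL as two simple lookups (outgoing edges for topic assignment, incoming edges for inputs).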

Agile development; on-line free tools!

- by BT.

We have been looking to implement Agile methodology within our geographically distributed development team, so I need suggestions on any free on-line application that you have used and found useful. Right now we are using paper cards and a wall to manage this :), but we want to shift to an on-line version, preferably free. I have used TargetProcess at my previous job! My core requirements are:

Business Analyst can add user stories.
We can assign and prioritize different user stories for developers.
QA team can add test cases around different user stories.
Project Manager can track the time of all the resources and can pull reports for upper management.

innerHTML doesn't work correctly with xhtml in Chrome

- by Desperadeus

Hi! I've got trouble with Chrome 5.0.375.70, while FF 3.6.3 and Opera 10.53 are OK. Below is the line of code:

    document.getElementById('content').innerHTML = data.documentElement.innerHTML;

The data object in the code is a document (typeof(data) == 'object') that I got from an ajax request to chapter01.xhtml:

    <?xml version="1.0" encoding="utf-8"?>
    <!DOCTYPE html [ <!ENTITY D "—"> <!ENTITY o "‘"> <!ENTITY c "’"> <!ENTITY O "“"> <!ENTITY C "”"> ]>
    <html xmlns="http://www.w3.org/1999/xhtml">
    <head>
    <title>Alice's Adventures in Wonderland by Lewis Carroll. Chapter I: Down the Rabbit-Hole</title>
    <link rel="stylesheet" type="text/css" href="style.css"/>
    <link rel="stylesheet" type="application/vnd.adobe-page-template+xml" href="page-template.xpgt"/>
    </head>
    <body>
    <div class="title_box">
    <h2 class="chapnum">Chapter I</h2>
    <h2 class="chaptitle">Down the Rabbit-Hole</h2>
    <hr/>
    </div>

Chrome cuts everything before body, so the link to the CSS in the head is lost; the user can't see formatted text and images. How can I fix or work around this?

using pom for test scope dependencies

- by IttayD

Hi, Is it possible to create a pom file so it can be used inside another pom to add test scope dependencies? So in module E's pom.xml I have:

    <dependencies>
        <dependency>
            <groupId>com.example</groupId>
            <artifactId>D</artifactId>
            <type>pom</type>
            <scope>test</scope>
        </dependency>
    </dependencies>

So that if D's pom.xml contains dependencies on artifacts A, B, C, then these artifacts are in the compilation and execution classpath of E's tests.

NOTE: the reason I want such a pom, rather than relying on regular dependency resolution, is that I have created a tests jar using maven-jar-plugin:test-jar, and using that jar as a dependency causes maven to not use its transitive dependencies. (see http://jira.codehaus.org/browse/MNG-1378)

UPDATE: this does not work for me (maybe because I'm trying to use it for the test scope): http://www.sonatype.com/books/mvnref-book/reference/pom-relationships-sect-pom-best-practice.html

JCo | How to iterate column wise

- by cedar715

The data from SAP is returned as a JCo.Table. However, we don't want to display ALL the columns in the VIEW. So we have created a file called display.xml which lists the JCO.Table columns to be displayed. The display.xml is converted to a List, and each field is checked against the display list (see the code below), which is redundant from the second row onwards.

    final Table outputTable = jcoFunction.getTableParameterList().getTable("OUTPUT_TABLE");
    final int numRows = outputTable.getNumRows();
    for (int i = 0; i < numRows; i++) {
        final FieldIterator fields = outputTable.fields();
        while (fields.hasNextFields()) {
            final JCO.Field recordField = fields.nextField();
            final String sapFieldName = recordField.getName();
            final DisplayFieldDto key = new DisplayFieldDto(sapFieldName);
            if (displayFields.contains(key)) {
                System.out.println("recordField.getName() = " + recordField.getName());
                final String sapFieldValue = (String) recordField.getValue();
            } else {
                // ignore the field.
            }
        }
    }

What is the better way to filter the fields in JCo? Can I iterate column wise? Thank you :)

In ASP.NET MVC, why can't I inherit from "MyCustomView" without specifying the full type name?

- by Seth Petry-Johnson

In my MVC apps I normally declare a base view type that all of my views inherit from. I get a parser error when I specify Inherits="MyView" in my Page declaration, but not if I specify Inherits="MyApp.Web.Views.MyView". Strangely enough, it also works fine if I specify Inherits="MyView<T>" (where T is any valid type). Why can I specify a strongly typed view without the full type name, but not the plain, untyped view? My base view class is declared like this:

    namespace MyApp.Web.Views {
        public class MyView : MyView<object> { }
        public class MyView<TModel> : ViewPage<TModel> where TModel : class { }
    }

Lua: Dynamically calling a function with arguments.

- by Tipx

Using Lua, I'm trying to dynamically call a function with parameters. What I want is to send a string that gets parsed so that:

the 1st argument is a class instance "Handle"
the 2nd is the function to be called
all that is left are arguments

"modules" is a table like { string= }
split() is a simple parser that returns a table with indexed strings

    function Dynamic(msg)
        local args = split(msg, " ")
        module = args[1]
        table.remove(args, 1)
        if module then
            module = modules[module]
            command = args[1]
            table.remove(args, 1)
            if command then
                if not args then
                    module[command]()
                else
                    module[command](unpack(args)) -- Reference 1
                end
            else
                -- Function doesnt exist
            end
        else
            -- Module doesnt exist
        end
    end

When I try this with "ignore remove bob", at "Reference 1" it tries to call "remove" on the instance associated with "ignore" in modules, and passes the argument "bob", contained in a table (with a single value). However, on the other side of the call, the remove function does not receive the argument. I even tried replacing the "Reference 1" line with

    module[command]("bob")

but I get the same result.

Query column and everything subordinate (hard to describe, non native speaker, PLS let me explain)

- by MAD9

A few weeks ago, I asked a question about how to generate hierarchical XML from a table that has a parentID column. It all works fine. The point is, I also want to query a table according to the hierarchy. I'll give you an example. That's the table with the codes:

    ID  CODE   NAME               PARENTID
    1   ROOT   IndustryCode       NULL
    2   IND    Industry           1
    3   CON    Consulting         1
    4   FIN    Finance            1
    5   PHARM  Pharmaceuticals    2
    6   AUTO   Automotive         2
    7   STRAT  Strategy           3
    8   IMPL   Implementation     3
    9   CFIN   Corporate Finance  4
    10  CMRKT  Capital Markets    9

From which I generate (for displaying in a TreeViewControl) this XML:

    <record key="1" parentkey="" Code="ROOT" Name="IndustryCode">
        <record key="2" parentkey="1" Code="IND" Name="Industry">
            <record key="5" parentkey="2" Code="PHARM" Name="Pharmaceuticals" />
            <record key="6" parentkey="2" Code="AUTO" Name="Automotive" />
        </record>
        <record key="3" parentkey="1" Code="CON" Name="Consulting">
            <record key="7" parentkey="3" Code="STRAT" Name="Strategy" />
            <record key="8" parentkey="3" Code="IMPL" Name="Implementation" />
        </record>
        <record key="4" parentkey="1" Code="FIN" Name="Finance">
            <record key="9" parentkey="4" Code="CFIN" Name="Corporate Finance">
                <record key="10" parentkey="9" Code="CMRKT" Name="Capital Markets" />
            </record>
        </record>
    </record>

As you can see, some codes are subordinate to others, for example AUTO << IND << ROOT.

What I want (and have absolutely no idea how to realise, or even where to start) is to be able to query another table (where one column is this code, of course) for a code and get all records with that specific code and all subordinate codes. For example: I query the other table for "IndustryCode = IND[ustry]" and get (of course) the records containing "IND", but also AUTO[motive] and PHARM[aceutical] (= all subordinates).

It's SQL Server 2008 Express with Advanced Services.
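One common way to express "a code plus everything below it" is a recursive common table expression. The sketch below runs in Python against an in-memory SQLite database, only so that it is self-contained; the `companies` table and its rows are hypothetical stand-ins for the "other table", and the table name `codes` is an assumption. SQL Server also supports recursive CTEs, written with plain `WITH` (no `RECURSIVE` keyword).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE codes (id INTEGER PRIMARY KEY, code TEXT, name TEXT, parentid INTEGER);
INSERT INTO codes VALUES
  (1,'ROOT','IndustryCode',NULL), (2,'IND','Industry',1), (3,'CON','Consulting',1),
  (4,'FIN','Finance',1), (5,'PHARM','Pharmaceuticals',2), (6,'AUTO','Automotive',2),
  (7,'STRAT','Strategy',3), (8,'IMPL','Implementation',3),
  (9,'CFIN','Corporate Finance',4), (10,'CMRKT','Capital Markets',9);
-- Hypothetical "other table" whose rows carry an industry code.
CREATE TABLE companies (name TEXT, industrycode TEXT);
INSERT INTO companies VALUES ('Acme Motors','AUTO'), ('PillCo','PHARM'),
                             ('Generic Industries','IND'), ('BankCorp','CFIN');
""")

# Anchor row: the requested code. Recursive step: children of rows found so far.
query = """
WITH RECURSIVE subtree(id, code) AS (
    SELECT id, code FROM codes WHERE code = ?
    UNION ALL
    SELECT c.id, c.code FROM codes AS c JOIN subtree AS s ON c.parentid = s.id
)
SELECT co.name, co.industrycode
FROM companies AS co
JOIN subtree AS st ON st.code = co.industrycode
ORDER BY co.name
"""
rows = conn.execute(query, ("IND",)).fetchall()
print(rows)
```

Querying for "IND" walks down to PHARM and AUTO, so rows tagged with any of the three codes come back, while rows under FIN or CON do not.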

Standard Android Button with a different color

- by Mike

I'd like to change the color of a standard Android button slightly in order to better match a client's branding. For example, see the "Find a Table" button for the OpenTable application: The best way I've found to do this so far is to change the Button's drawable to the following drawable located in res/drawable/red_button.xml: <?xml version="1.0" encoding="utf-8"?> <selector xmlns:android="http://schemas.android.com/apk/res/android"> <item android:state_pressed="true" android:drawable="@drawable/red_button_pressed" /> <item android:state_focused="true" android:drawable="@drawable/red_button_focus" /> <item android:drawable="@drawable/red_button_rest" /> </selector> But doing that requires that I actually create three different drawables for each button I want to customize (one for the button at rest, one when focused, and one when pressed). That seems more complicated and non-DRY than I need. All I really want to do is apply some sort of color transform to the button. Is there an easier way to go about changing a button's color than I'm doing?

Image Grabbing with False Referer

- by Mr Carl

Hey guys, I'm struggling with grabbing an image at the moment... sounds silly, but check out this link :P http://manga.justcarl.co.uk/A/Oishii_Kankei/31/1 If you open the image URL directly, the image loads. Go back, and it looks like it's working fine, but that's just the browser loading up the cached image. The application was working fine before, so I'm thinking they implemented some kind of Referer check on their images. So I found some code and came up with the following...

    $ref = 'http://www.thesite.com/';
    $file = 'theimage.jpg';
    $hdrs = array(
        'http' => array(
            'method' => "GET",
            'header' => "accept-language: en\r\n" .
                "Accept:application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*\/*;q=0.5\r\n" .
                "Referer: $ref\r\n" . // Setting the http-referer
                "Content-Type: image/jpeg\r\n"
        )
    );
    // get the requested page from the server
    // with our header as a request-header
    $context = stream_context_create($hdrs);
    $fp = fopen($imgChapterPath.$file, 'rb', false, $context);
    fpassthru($fp);
    fclose($fp);

Essentially it's making up a false referrer. All I'm getting back though is a bunch of gibberish (thanks to fpassthru), so I think it's getting the image, but I'm afraid to say I have no idea how to output/display the collected image. Cheers, Carl
