Search Results

Search found 7251 results on 291 pages for 'pdf parsing'.

Page 207/291

  • If you're not supposed to use Regular Expressions to parse HTML, then how are HTML parsers written?

    - by Andy E
    I see questions every day asking how to parse or extract something from some HTML string, and the first answer/comment is always "Don't use RegEx to parse HTML, lest you feel the wrath!" (that last part is sometimes omitted). This is rather confusing for me; I always thought that, in general, the best way to parse any complicated string is to use a regular expression. So how does an HTML parser work? Doesn't it use regular expressions to parse? One particular argument for using a regular expression is that there's not always a parsing alternative (such as in JavaScript, where DOMDocument isn't a universally available option). jQuery, for instance, seems to manage just fine using a regex to convert an HTML string to DOM nodes. I'm not sure whether or not to CW this; it's a genuine question that I want answered, and it's not really intended to be a discussion thread.

    Read the article

  • Configuration manager for PHP

    - by Jack
    I am refactoring the configuration-file loading code in a PHP project. Previously I used multiple 'ini' files, but now I plan to move to a single XML file that contains all the configuration details of the project. The problem is that if somebody wants the configuration in ini, a DB, or anything other than the default (in this case XML), my code should handle that. If somebody wants another option such as ini, they will have to create an ini file similar to my XML configuration file, and my configuration manager should take care of everything, such as parsing and storing in cache. For that I need a mechanism, let's say a proper interface for my configuration data, where the underlying data store can be anything (XML, DB, ini, etc.). I also don't want it to depend on the underlying store, and it should be extensible to other file formats in the future.

    Read the article
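    One way to picture the interface described above is a small "source" abstraction that the manager depends on, so the backing store can be swapped. The sketch below is in Python rather than PHP, and every name in it (ConfigSource, IniTextSource, DictSource, ConfigManager) is hypothetical, not taken from the question.

```python
from abc import ABC, abstractmethod
import configparser


class ConfigSource(ABC):
    """Hypothetical interface: any backing store returns a flat dict."""

    @abstractmethod
    def load(self):
        """Return the configuration as a dict of strings."""


class IniTextSource(ConfigSource):
    """Reads one section of ini-formatted text (section name is an assumption)."""

    def __init__(self, text, section="app"):
        self.text = text
        self.section = section

    def load(self):
        parser = configparser.ConfigParser()
        parser.read_string(self.text)
        return dict(parser[self.section])


class DictSource(ConfigSource):
    """Stand-in for a DB- or XML-backed source; wraps an in-memory dict."""

    def __init__(self, values):
        self.values = values

    def load(self):
        return dict(self.values)


class ConfigManager:
    """Depends only on ConfigSource and caches the first load."""

    def __init__(self, source):
        self._source = source
        self._cache = None

    def get(self, key, default=None):
        if self._cache is None:
            self._cache = self._source.load()
        return self._cache.get(key, default)


ini_manager = ConfigManager(IniTextSource("[app]\ndebug = true\n"))
dict_manager = ConfigManager(DictSource({"debug": "false"}))
print(ini_manager.get("debug"), dict_manager.get("debug"))
```

    Adding another format would then mean writing one more ConfigSource subclass, without touching the manager.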

  • PCI scan findings and problems with weak ciphers on ports 993,443,995,465

    - by user64991
    From PCI scan results:
        Synopsis: The remote service encrypts traffic using a protocol with known weaknesses.
        Description: The remote service accepts connections encrypted using SSL 2.0, which reportedly suffers from several cryptographic flaws and has been deprecated for several years. An attacker may be able to exploit these issues to conduct man-in-the-middle attacks or decrypt communications between the affected service and clients.
        See also: http://www.schneier.com/paper-ssl.pdf
        Solution: Consult the application's documentation to disable SSL 2.0 and use SSL 3.0 or TLS 1.0 instead.
        Risk Factor: Medium / CVSS Base Score: 2 (AV:R/AC:L/Au:NR/C:P/A:N/I:N/B:N)
    I have tried to change
        SSLProtocol all -SSLv2
    to
        SSLProtocol -ALL +SSLv3 +TLSv1
    and
        SSLCipherSuite ALL:!ADH:!EXPORT:!SSLv2:RC4+RSA:+HIGH:+MEDIUM:+LOW
    to
        SSLCipherSuite ALL:!ADH:RC4+RSA:+HIGH:!MEDIUM:!LOW:!SSLv2:!EXPORT
    But SSLdigger still shows the same result. Is this the right way to do something like this?

    Read the article

  • Updating a TableView with a WebService and Saving to CoreData

    - by jcady
    I am working on a project where I have a table view that is currently updated via a web request that returns XML. I implemented -(int)numberOfRowsInTableView:(NSTableView*)tv and -(id)tableView:(NSTableView *)tv objectValueForTableColumn:(NSTableColumn*)tableColumn row:(int)row in my XML parsing class, and have the table updated with the data that is pulled down from the server. I want to save the data that is pulled down using Core Data, so that the table can be saved/loaded. Then later on application start when the web request is made, it will only add data that is not already present. (The XML is sorted by release date, so later I will check to see which release dates are not loaded up from the Core Data store, and only load newer entries.) How would I go about implementing this? I am a very new Cocoa developer, but have gone through the entire Hillegass book. Thanks so much.

    Read the article

  • Server-to-Switch Trunking in Procurve switch, what does this mean?

    - by MattUebel
    I am looking to set up switch redundancy in a new datacenter environment. IEEE 802.3ad seems to be the go-to concept for this, at least when paired with a technology that gets around the "single switch" limitation for link aggregation. Looking through the brochure for a ProCurve switch I see: "Server-to-Switch Distributed Trunking, which allows a server to connect to two switches with one logical trunk; increases resiliency and enables load sharing in virtualized data centers" http://www.procurve.com/docs/products/brochures/5400_3500%20Product%20Brochure4AA0-4236ENW.pdf I am trying to figure out how this relates to the 802.3ad standard, as it seems that it would give me what I want (one server has 2 NICs, each connected to a separate switch, together forming a single logical NIC that provides the redundancy we want). I guess I am looking for someone who is familiar with this concept and could add to it.

    Read the article

  • Java: how to initialize int without assigning a value?

    - by HH
    $ javac InitInt.java
        InitInt.java:9: '[' expected
        right = new int;
        ^
        InitInt.java:9: ']' expected
        right = new int;
        ^
        InitInt.java:13: ';' expected
        }
        ^
        InitInt.java:14: ';' expected
        public int getRight(){return right;}
        ^
        InitInt.java:15: reached end of file while parsing
        }
        ^
        5 errors
    $ cat InitInt.java
        import java.util.*;
        import java.io.*;

        public class InitInt {
            private final int right;
            public static void main(String[] args) {
                // I don't want to assign any value.
                // just initialize it, how?
                right = new int;

                // later assigning a value

            }
            public int getRight(){return right;}
        }

    Read the article

  • Can I password protect a Publisher file?

    - by tombull89
    I was asked earlier this week if it was possible to password protect a Microsoft Office 2007 Publisher document. I was under the impression that it would be like protecting a Word document, by going to Office > Save As > Word Document > Tools > General Options and creating a password to modify, as shown below. This also works for Excel documents. However, in Publisher 2007 the option is not there. The only option under "Tools" is "Map network drive". We worked around the issue by saving as a PDF and distributing that, but is there a way to do what we want?

    Read the article

  • Why am I getting this WSDL SOAP error with authorize.net?

    - by Chad Johnson
    I have my script email me when there is a problem creating a recurring transaction with authorize.net. I received the following at 5:23AM Pacific time: SOAP-ERROR: Parsing WSDL: Couldn't load from 'https://api.authorize.net/soap/v1/service.asmx?wsdl' : failed to load external entity "https://api.authorize.net/soap/v1/service.asmx?wsdl" And of course, when I did exactly the same thing that the user did, it worked fine for me. Does this mean authorize.net's API is down? Their knowledge base simply sucks and provides no information whatsoever about this problem. I've contacted the company, but I'm not holding my breath for a response. Google reveals nothing. Looking through their code, nothing stands out. Maybe an authentication error? Has anyone seen an error like this before? What causes this?

    Read the article

  • ANTLR, optional ';' in JavaScript

    - by vava
    I'm just playing with ANTLR and decided to try parsing JavaScript with it. But I hit a wall in dealing with the optional ';', where the statement end is marked by a newline instead. Can it be done in some straightforward way? Here is a simple grammar example that doesn't work:
        grammar optional_newline;
        def : statements ;
        statements : statement (statement)* ;
        statement : expression (';' | '\n') ;
        expression : ID | INT | 'var' ID '=' INT ;
        ID : ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_')* ;
        INT : '0'..'9'+ ;
        WS : ( ' ' | '\t' | '\r' | '\n') {$channel=HIDDEN;} ;
    I want to be able to parse this (which JavaScript parsers can parse):
        var i = 10
        10;
    PS: I don't want to put WS in parser rules; I would be much happier if the lexer just got rid of those.

    Read the article

  • ASP.Net Cross Page Posting

    - by John
    Currently I have two pages: The first page contains an input form, and the 2nd page generates an excel document. The input form's button posts to this 2nd page. What I'd like to do is add a second button which also posts to the 2nd page; however, I'll need requests created from this new button to act differently, which brings me to my question: Is there a way I can tell, from the 2nd page, which button was pressed to submit the request? The main reason I'm asking is I'd like to re-use the 2nd page's logic in parsing the information from the first page if possible; I'd rather not have to copy it to a new page and have the new button post to that. Thanks!

    Read the article

  • Word 2007, Adding Page Numbers to Landscape, 5.5 by 8.5 Booklet Style Document

    - by nicorellius
    I am publishing a 5.5 by 8.5 booklet. I created this document in Word 2007 and will be converting it to PDF. It looks good as is, but I can't seem to figure out how to add page numbers automatically to this document. In general, I know how to add page numbers using footers, etc, but this application is a bit different: I have two pages (5.5 by 8.5) on one landscape 8.5 by 11 page. See picture below: I guess I could manually add page numbers, but then getting the formatting perfect will be tough. Any ideas?

    Read the article

  • How to parse an XML file using PHP?

    - by Jack
    Here I have a variable 'response' which is obtained by fetching an XML file:
        $url = 'http://xxxxx.xml';
        $ch = curl_init($url);
        // Return the body as a string instead of printing it.
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        $response = curl_exec($ch);
    The XML at that URL is structured as follows:
        <user>
          <id>734</id>
          <name>Peter Parker</name>
          <status>
            <favorited>false</favorited>
          </status>
        </user>
    How do I access each bit of info, like id, name, and favorited, from the response?

    Read the article
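    For comparison, here is a minimal sketch of reading those three fields with an XML parser. It is written in Python rather than PHP, uses the standard-library xml.etree.ElementTree module, and assumes the response body is exactly the sample document from the question, held in a string.

```python
import xml.etree.ElementTree as ET

# Assumption: the fetched response body is the sample document from the question.
response = (
    "<user>"
    "<id>734</id>"
    "<name>Peter Parker</name>"
    "<status><favorited>false</favorited></status>"
    "</user>"
)

root = ET.fromstring(response)                 # root is the <user> element
user_id = root.findtext("id")                  # direct child
name = root.findtext("name")                   # direct child
favorited = root.findtext("status/favorited")  # nested child, addressed by path

print(user_id, name, favorited)
```

    The values come back as strings, so "false" would still need converting if a boolean is wanted.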

  • JQuery XML option node

    - by JD
    Hi, I am having an issue parsing XML with jQuery when there is a node named option:
        <preferences><dashboard>
          <report id="si_pg_vw" order="0">
            <header>
              <data>
                <option type="reportname" value="Page View"/>
              </data>
            </header>
          </report>
    The following code in Firebug returns no children:
        $reportElement.find("data")[0]
    However, if I change option to any other name ("option2", "test", etc.), the line above returns one child, which is correct. Am I missing something, or is there a bug? Thanks, John

    Read the article

  • Programmatic resource monitoring per process in Linux

    - by tuxx
    Hi, I want to know if there is an efficient way to monitor a process's resource consumption (CPU, memory, network bandwidth) in Linux. I want to write a daemon in C++ that does this monitoring for some given PIDs. From what I know, the classic solution is to periodically read the information from /proc, but this doesn't seem the most efficient way (it involves many system calls). For example, to monitor the memory usage every second for 50 processes, I have to open, read and close 50 files from /proc (that means 150 system calls) every second, not to mention the parsing involved when reading these files. Another problem is the network bandwidth consumption: this cannot be easily computed for each process I want to monitor. The solution adopted by NetHogs involves a pretty high overhead in my opinion: it captures and analyzes every packet using libpcap, then for each packet the local port is determined and looked up in /proc to find the corresponding process. Do you know of any more efficient alternatives to these methods, or any libraries that deal with these problems?

    Read the article

  • How can I capture multiple matches from the same Perl regex?

    - by Sho Minamimoto
    I'm trying to parse a single string and get multiple chunks of data out of it with the same regex conditions. I'm parsing a single HTML doc that is static. (For an undisclosed reason, I can't use an HTML parser to do the job.) I have an expression that looks like: $string =~ /\<img\ssrc\="(.*)"/; and I want to get the value of $1. However, the one string contains many img tags like this, so I need something like an array returned (@1?). Is this possible?

    Read the article
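    The idea of getting every capture back as a list can be sketched as follows. This is Python rather than Perl, the sample HTML string is made up for illustration, and the pattern differs from the one in the question in two labeled ways: it uses \s+ instead of \s, and a non-greedy (.*?) so that each match stops at the first closing quote.

```python
import re

# Made-up sample input; the real document is not shown in the question.
html = '<p><img src="a.png" alt="A"><img src="b.jpg"></p>'

# Non-greedy capture: stop at the first closing quote after src=".
IMG_SRC = re.compile(r'<img\s+src="(.*?)"')

# findall returns one entry per match, each holding the captured group.
sources = IMG_SRC.findall(html)

# The greedy form runs on to the last quote on the line, merging the tags.
greedy = re.findall(r'<img\s+src="(.*)"', html)

print(sources)
print(greedy)
```

    With the greedy pattern the two tags collapse into a single match, which is one reason a global match alone may not give the expected list.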

  • htaccess rewrite and auth conflict

    - by Michael
    I have 2 directories, each with a .htaccess file.
    html/.htaccess has a rewrite that sends almost everything to url.php:
        RewriteCond %{REQUEST_URI} !(exported/?|\.(php|gif|jpe?g|png|css|js|pdf|doc|xml|ico))$
        RewriteRule (.*)$ /url.php [L]
    html/exported/.htaccess contains:
        AuthType Basic
        AuthName "exported"
        AuthUserFile "/home/siteuser/.htpasswd"
        require valid-user
    If I remove html/exported/.htaccess, the rewriting works fine and the exported directory can be accessed. If I remove html/.htaccess, the authentication works fine. However, when I have both .htaccess files, exported/ is rewritten to /url.php. Any ideas how I can prevent it?

    Read the article

  • Determine if a url matches a Route, and pull out the terms if it does

    - by Kevin Montrose
    I've got a big old log file I'm trying to break down in terms of routes. Essentially, I'm getting as input a path (/questions/31415, for example) and a list of all the registered Routes. What I want out is the matching Route and the parameters specified in it (so for /questions/{id}/{answer} I'd get id and answer out). I've got a working solution that basically generates a nasty bit of regex on the fly, with named groups, to do matching and parsing all in one. My gut tells me this is a brittle way to do it, and frankly there has to be a better way, right?

    Read the article
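    To make the approach described above concrete, here is a minimal sketch of building a named-group regex from a route template. It is in Python, not the asker's code or framework, and the helper names (compile_route, match_route), the route list, and the rule that a parameter matches one path segment are all assumptions.

```python
import re


def compile_route(template):
    """Turn a template like '/questions/{id}/{answer}' into a compiled regex."""
    pattern = ""
    pos = 0
    for placeholder in re.finditer(r"\{(\w+)\}", template):
        # Literal text between placeholders is escaped so it matches itself.
        pattern += re.escape(template[pos:placeholder.start()])
        # Assumption: each parameter matches exactly one path segment.
        pattern += "(?P<%s>[^/]+)" % placeholder.group(1)
        pos = placeholder.end()
    pattern += re.escape(template[pos:])
    return re.compile("^" + pattern + "/?$")


def match_route(path, templates):
    """Return (template, params) for the first matching template, else None."""
    for template in templates:
        found = compile_route(template).match(path)
        if found:
            return template, found.groupdict()
    return None


# Hypothetical route table for illustration.
routes = ["/questions/{id}/{answer}", "/questions/{id}", "/users/{name}"]
print(match_route("/questions/31415", routes))
print(match_route("/questions/31415/27", routes))
```

    Escaping the literal parts and anchoring both ends addresses some of the brittleness, though routes with defaults or constraints would need more than this.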

  • inserting std::strings into a std::map

    - by PaulH
    I have a program that reads data from a file line by line. I would like to copy some substring of that line into a map as below:
        std::map< DWORD, std::string > my_map;
        DWORD index;         // populated with some data
        char buffer[ 1024 ]; // populated with some data
        char* element_begin; // points to some location in buffer
        char* element_end;   // points to some location in buffer > element_begin
        my_map.insert( std::make_pair( index, std::string( element_begin, element_end ) ) );
    This std::map<>::insert() operation takes a long time (it doubles the file parsing time). Is there a way to make this a less expensive operation? Thanks, PaulH

    Read the article

  • appcelerator titanium cannot parse JSON

    - by Richard
    Hi, I'm new to Titanium and am having difficulty parsing JSON from a MySQL export. The JSON is valid, and I feel frustrated after many unsuccessful attempts. I have simplified the code below. It just stops and says:
        [ERROR] Script Error = Unable to parse JSON string
    The code:
        var win = Titanium.UI.currentWindow;
        var hotdealjson = "{'hotdeal':[{'place':'bangkok','date':'4D3N','cost':'$4999up'},{'place':'tokyo','date':'3D2N','cost':'$3799up'}]}";
        // read json
        var response = JSON.parse(hotdealjson);
        alert(response.hotdeal.length);
    Thanks & regards, Richard

    Read the article
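    One thing worth checking with a string like this is the quoting: the JSON grammar only allows double-quoted strings. The sketch below shows how a strict parser treats the two forms; it uses Python's standard json module, and whether Titanium's JSON.parse rejects the string for the same reason is an assumption, not something the question confirms. The helper try_parse is a hypothetical name.

```python
import json

# Same data in two spellings: single-quoted (as in the question) and double-quoted.
single_quoted = "{'hotdeal':[{'place':'bangkok','date':'4D3N','cost':'$4999up'}]}"
double_quoted = '{"hotdeal":[{"place":"bangkok","date":"4D3N","cost":"$4999up"}]}'


def try_parse(text):
    """Return the parsed object, or None if the text is not valid JSON."""
    try:
        return json.loads(text)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return None


print(try_parse(single_quoted))
print(try_parse(double_quoted))
```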

  • Allow users to view Word documents only and not be able to edit, copy or save them.

    - by Alexander
    Hello. In a traditional Windows Server 2003 environment with AD, we have shared a folder for our policy documents (MS Word). These documents get edited/updated now and then by the administrator (the principal of the college). Users only have read-only access to the folder, but they can still use Save As and then change the content. SharePoint is a possible solution but not easy to implement. We also thought of using a CMS on Linux, installing Joomla to let users only view the docs through a document management system. But is it possible to automatically retrieve the policy folder on the network and convert it, or put it in a format that users can only view and not copy? We also thought of saving the docs in a secure PDF format, but the principal wants an automated system. Basically she just wants to work in Word, and the policies must be available to staff members on the network. Any ideas? Much appreciated.

    Read the article

  • Double encoded url being fully decoded in ASP.NET

    - by Brad R
    I have just come across something that is quite strange, and yet I haven't found any mention on the interwebs of others having the same problem. If I hit my ASP.NET application with a double-encoded URL, then Request["myQueryParam"] will do a double decode of the query for me. This is not desirable, as I have double encoded my query string for a good reason. Can others confirm I'm not doing something obviously wrong, and explain why this would happen? A solution to prevent it, without doing some nasty query string parsing, would be great too! As an example, if you hit the URL:
        http://localhost/MyApp?originalUrl=http%3a%2f%2flocalhost%2fAction%2fRedirect%3fUrl%3d%252fsomeUrl%253futm_medium%253dabc%2526utm_source%253dabc%2526utm_campaign%253dabc
    (for reference, %25 is the % symbol) and then look at Request["originalUrl"] (page or controller), the string returned is:
        http://localhost/Action/Redirect?Url=/someUrl?utm_medium=abc&utm_source=abc&utm_campaign=abc
    I would expect:
        http://localhost/Action/Redirect?Url=%2fsomeUrl%3futm_medium%3dabc%26utm_source%3dabc%26utm_campaign%3dabc
    I have also checked in Fiddler, and the URL is being passed to the server correctly (one possible culprit could have been the browser decoding the URL before sending).

    Read the article
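    The difference between one and two rounds of decoding, which is what the expected and actual strings above come down to, can be reproduced in isolation. This is a sketch in Python 3 using urllib.parse, not ASP.NET; the inner URL is shortened from the question's example, and nothing here explains why ASP.NET decodes twice.

```python
from urllib.parse import quote, unquote

# Shortened version of the inner URL from the question.
inner = "/someUrl?utm_medium=abc&utm_source=abc"

once = quote(inner, safe="")   # one round of percent-encoding
twice = quote(once, safe="")   # every '%' from the first round becomes '%25'

decoded_once = unquote(twice)          # what the asker expects to receive
decoded_twice = unquote(decoded_once)  # what a second decode produces

print(twice)
print(decoded_once)
print(decoded_twice)
```

    A single decode of the double-encoded value gives back the singly encoded form; only a second decode yields the plain URL.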

  • "Arbitrary" context free grammars?

    - by danwroy
    Long time admirer, first time inquirer :) I'm working on a program which derives a deterministic finite-state automaton from a context-free grammar, and the paper I have been assigned which explains how to do this keeps referring to "arbitrary probabilistic context-free grammars" but never defines the meaning of "arbitrary" in relation to PCFGs. I assume they mean "any old PCFG", but then why not just say "any PCFG"? The term also turns up in several Wikipedia entries. At the top of the CFG page there is a reference to arbitrariness in relation to CFGs ("clauses can be nested inside clauses arbitrarily deeply"), but it doesn't make clear why someone would refer to a PCFG or subset of PCFGs as arbitrary. In case anyone is curious, the paper is Parsing and Hypergraphs by Klein and Manning (2001); I've also been reading two other papers by them related to this one (An Agenda-Based Chart Parser for Arbitrary Probabilistic Context-Free Grammars and Empirical Bounds, Theoretical Models, and the Penn Treebank) which use the term extensively but never explain it either. Thanks for your help!

    Read the article

  • Extract attachments from Mbox through MIME

    - by Simeon
    I am a little bit frustrated. I'm working on a project with the aim of building a system which automatically prints the attachments of incoming e-mails (an "E-Mail to Print" system). I have already set up an e-mail server (exim4) which receives e-mail perfectly and stores it in an mbox in /var/mail/. Now I want to extract the attachments from the mbox file through MIME back to the original .PDF, .DOC, .JPG, .GIF, ... files and save them in a directory, from where they get printed. After the attachments have been extracted, the e-mails should be deleted so the attachments don't get extracted again. But how can I get this to work? I am not a coder, so I looked for existing scripts and programs but found nothing to work with. Could anyone give me a little help? I would be very thankful! Thanks, Simeon

    Read the article
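    The extraction step described above can be sketched with Python's standard mailbox and email modules. This is only an illustration under assumptions: the function name extract_attachments and the "index_filename" naming scheme are made up, any part that carries a filename is treated as an attachment, and deleting processed messages and sending files to a printer are left out.

```python
import mailbox
import os


def extract_attachments(mbox_path, out_dir):
    """Save every MIME part that carries a filename; return the saved paths."""
    os.makedirs(out_dir, exist_ok=True)
    saved = []
    box = mailbox.mbox(mbox_path)
    try:
        for index, message in enumerate(box):
            # walk() visits the message itself and every nested MIME part.
            for part in message.walk():
                filename = part.get_filename()
                if not filename:
                    continue  # body text and multipart containers have no filename
                payload = part.get_payload(decode=True)  # undoes base64 etc.
                if payload is None:
                    continue
                # basename() guards against path components in the filename;
                # the index prefix keeps same-named files from colliding.
                safe_name = "%d_%s" % (index, os.path.basename(filename))
                target = os.path.join(out_dir, safe_name)
                with open(target, "wb") as handle:
                    handle.write(payload)
                saved.append(target)
    finally:
        box.close()
    return saved
```

    A call such as extract_attachments("/var/mail/printuser", "/var/spool/to-print") would be the starting point; both paths are placeholders, and the mbox should be locked or copied first if the mail server may be writing to it at the same time.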

  • ASP:LinkButton and Eval

    - by sgibbons
    I'm using an ASP:LinkButton inside of an ItemTemplate inside of a TemplateField in a GridView. For the command argument for the link button I want to pass the ID of the row from the datasource that the gridview is bound to, so I'm doing something like this:
        <asp:LinkButton ID="viewLogButton" CommandName="viewLog" CommandArgument="<%#Eval("ID")%>" Text="View Log" runat="server"/>
    Unfortunately, the resulting HTML is this:
        <asp:LinkButton ID="viewLogButton" CommandName="viewLog" CommandArgument="3" Text="View Log" runat="server"/>
    It seems that it is parsing the Eval() properly, but this is somehow causing it not to parse the LinkButton tag and just dump it out as literal text. Does anyone know: a) why this is happening and, b) what a good solution to this problem is?

    Read the article

  • Unescape _xHHHH_ XML escape sequences using Python

    - by John Machin
    I'm using Python 2.x [not negotiable] to read XML documents [created by others] that allow the content of many elements to contain characters that are not valid XML characters by escaping them using the _xHHHH_ convention, e.g. ASCII BEL aka U+0007 is represented by the 7-character sequence u"_x0007_". Neither the functionality that allows representation of any old character in the document nor the manner of escaping is negotiable. I'm parsing the documents using cElementTree or lxml [semi-negotiable]. Here is my best attempt at unescaping the parser output as efficiently as possible:
        import re
        def unescape(s,
            subber=re.compile(r'_x[0-9A-Fa-f]{4,4}_').sub,
            repl=lambda mobj: unichr(int(mobj.group(0)[2:6], 16)),
            ):
            if "_" in s:
                return subber(repl, s)
            return s
    The above is biassed by observing a very low frequency of "_" in typical text and a better-than-doubling of speed by avoiding the regex apparatus where possible. The question: Any better ideas out there?

    Read the article
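    For readers who want to try the convention out, here is the same technique restated as a small runnable sketch. It is not an improvement on the code above and it targets Python 3 (chr instead of unichr), which the question rules out; the sample strings are made up.

```python
import re

# Matches the 7-character escape, capturing the four hex digits.
_ESCAPE = re.compile(r"_x([0-9A-Fa-f]{4})_")


def unescape(text):
    """Replace each _xHHHH_ escape with the character U+HHHH."""
    # Cheap pre-check, as in the question: skip the regex when no "_" is present.
    if "_" not in text:
        return text
    return _ESCAPE.sub(lambda match: chr(int(match.group(1), 16)), text)


print(repr(unescape("ring_x0007_bell")))
print(repr(unescape("no escapes here")))
```

    Sequences that only look like escapes, such as "_xZZZZ_", are left untouched because the pattern requires four hex digits.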
