Search Results

Search found 2937 results on 118 pages for 'recursive descent parser'.

Page 25/118 | < Previous Page | 21 22 23 24 25 26 27 28 29 30 31 32  | Next Page >

  • recursive wget with hotlinked requisites

    - by dongle
    I often use wget to mirror very large websites. Sites that contain hotlinked content (be it images, video, css, js) pose a problem, as I seem unable to specify that I would like wget to grab page requisites that are on other hosts, without having the crawl also follow hyperlinks to other hosts. For example, let's look at this page: https://dl.dropbox.com/u/11471672/wget-all-the-things.html Let's pretend that this is a large site that I would like to completely mirror, including all page requisites – including those that are hotlinked.

        wget -e robots=off -r -l inf -pk
    ^^ gets everything but the hotlinked image

        wget -e robots=off -r -l inf -pk -H
    ^^ gets everything, including the hotlinked image, but goes wildly out of control, proceeding to download the entire web

        wget -e robots=off -r -l inf -pk -H --ignore-tags=a
    ^^ gets the first page, including both the hotlinked and the local image, and does not follow the hyperlink to the site outside of scope, but obviously also does not follow the hyperlink to the next page of the site.

    I know that there are various other tools and methods of accomplishing this (HTTrack and Heritrix allow the user to make a distinction between hotlinked content on other hosts vs hyperlinks to other hosts) but I'd like to see if this is possible with wget. Ideally this would not be done in post-processing, as I would like the external content, requests, and headers to be included in the WARC file I'm outputting.

    Read the article

  • Calling recursive function twice consecutively

    - by Zack
        #include <stdio.h>
        #define LENGTH 16

        void makeBranches(int, int);
        void display(int, int);

        int main(){
            makeBranches(0, LENGTH-1);
        }

        void makeBranches(int left, int right){
            if(left >= right){
                return;
            }
            else{
                display(left, right);
                makeBranches(left, (right+left)/2);
                makeBranches((right+left/2)+1, right);
            }
        }

        void display(int left, int right){
            printf("%d, %d", left, right);
            int mid = (left+right)/2;
            int i;
            for(i = left; i <= right; i++){
                if(i == mid) printf("X");
                else printf("-");
            }
            if(right == LENGTH-1) printf("\n");
        }

    The problem that I am having is that the second call of makeBranches only executes with the values that caused the first call of makeBranches to return, and not with the original values that the first call used.
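
    One thing worth checking in the snippet above is operator precedence: (right+left/2)+1 halves only left, so with left = 0 the second call starts past right and returns immediately. The following is a minimal Python sketch of the intended split, not the asker's code; the names make_branches and calls are made up for illustration, and display is replaced by recording each range.

```python
def make_branches(left, right, calls):
    """Record every (left, right) range visited, splitting at the midpoint."""
    if left >= right:
        return
    calls.append((left, right))
    mid = (left + right) // 2              # parenthesise the sum, then halve
    make_branches(left, mid, calls)        # left half
    make_branches(mid + 1, right, calls)   # right half starts just past mid


calls = []
make_branches(0, 3, calls)
print(calls)   # [(0, 3), (0, 1), (2, 3)]
```

    Because left and right are passed by value, the second recursive call still sees the caller's original arguments; only the computed midpoint differs between the two calls.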

    Read the article

  • recursive cumulative sums

    - by user1816377
    I need to write a program that computes cumulative sums from a list of numbers with def, but ONLY with recursion. I did it, but now I need to write the same program without using the method sum, and have had no success so far. Any ideas? My code:

        def rec_cumsum(numbers):
            ''' Input: numbers - a list of numbers,
                Output: a list of cumulative sums of the numbers'''
            if len(numbers)==0: return numbers
            return rec_cumsum(numbers[:-1])+ [sum(numbers)]

    input:
        1 [1,2,3]
        2 [2, 2, 2, 3]
    output:
        1 [1,3,6]
        2 [2, 4, 6, 9]
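
    One way to drop sum, sketched here under the assumption that reusing the last entry of the recursive result is allowed: the running total up to the previous element is already the final item of the prefix list, so only one addition is needed per step.

```python
def rec_cumsum(numbers):
    """Return the list of cumulative sums of `numbers`, using recursion only."""
    if len(numbers) == 0:
        return []
    # Cumulative sums of everything except the last element.
    prefix = rec_cumsum(numbers[:-1])
    # The running total so far is the last prefix entry (0 if there is none).
    previous_total = prefix[-1] if prefix else 0
    return prefix + [previous_total + numbers[-1]]


print(rec_cumsum([1, 2, 3]))      # [1, 3, 6]
print(rec_cumsum([2, 2, 2, 3]))   # [2, 4, 6, 9]
```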

    Read the article

  • SQL select descendants of a row

    - by Joey Adams
    Suppose a tree structure is implemented in SQL like this:

        CREATE TABLE nodes (
            id INTEGER PRIMARY KEY,
            parent INTEGER -- references nodes(id)
        );

    Although cycles can be created in this representation, let's assume we never let that happen. The table will only store a collection of roots (records where parent is null) and their descendants. The goal is to, given an id of a node on the table, find all nodes that are descendants of it. A is a descendant of B if either A's parent is B or A's parent is a descendant of B. Note the recursive definition. Here is some sample data:

        INSERT INTO nodes VALUES (1, NULL);
        INSERT INTO nodes VALUES (2, 1);
        INSERT INTO nodes VALUES (3, 2);
        INSERT INTO nodes VALUES (4, 3);
        INSERT INTO nodes VALUES (5, 3);
        INSERT INTO nodes VALUES (6, 2);

    which represents:

        1
        `-- 2
            |-- 3
            |   |-- 4
            |   `-- 5
            |
            `-- 6

    We can select the (immediate) children of 1 by doing this:

        SELECT a.* FROM nodes AS a WHERE parent=1;

    We can select the children and grandchildren of 1 by doing this:

        SELECT a.* FROM nodes AS a WHERE parent=1
        UNION ALL
        SELECT b.* FROM nodes AS a, nodes AS b WHERE a.parent=1 AND b.parent=a.id;

    We can select the children, grandchildren, and great grandchildren of 1 by doing this:

        SELECT a.* FROM nodes AS a WHERE parent=1
        UNION ALL
        SELECT b.* FROM nodes AS a, nodes AS b WHERE a.parent=1 AND b.parent=a.id
        UNION ALL
        SELECT c.* FROM nodes AS a, nodes AS b, nodes AS c WHERE a.parent=1 AND b.parent=a.id AND c.parent=b.id;

    How can a query be constructed that gets all descendants of node 1 rather than those at a finite depth? It seems like I would need to create a recursive query or something. I'd like to know if such a query would be possible using SQLite. However, if this type of query requires features not available in SQLite, I'm curious to know if it can be done in other SQL databases.
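
    The recursive definition above maps directly onto a recursive common table expression. The sketch below assumes a SQLite build that supports WITH RECURSIVE (version 3.8.3 or later) and drives it from Python's sqlite3 module; the helper name descendants and the CTE name sub are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE nodes (id INTEGER PRIMARY KEY, parent INTEGER);
    INSERT INTO nodes VALUES (1, NULL);
    INSERT INTO nodes VALUES (2, 1);
    INSERT INTO nodes VALUES (3, 2);
    INSERT INTO nodes VALUES (4, 3);
    INSERT INTO nodes VALUES (5, 3);
    INSERT INTO nodes VALUES (6, 2);
""")


def descendants(conn, root_id):
    """Return the ids of all descendants of root_id, at any depth."""
    rows = conn.execute("""
        WITH RECURSIVE sub(id) AS (
            -- base case: direct children of the root
            SELECT id FROM nodes WHERE parent = ?
            UNION ALL
            -- recursive case: children of anything already found
            SELECT n.id FROM nodes AS n JOIN sub ON n.parent = sub.id
        )
        SELECT id FROM sub ORDER BY id
    """, (root_id,)).fetchall()
    return [row[0] for row in rows]


print(descendants(conn, 1))   # [2, 3, 4, 5, 6]
print(descendants(conn, 3))   # [4, 5]
```

    This relies on the stated assumption that the table contains no cycles; with a cycle, the UNION ALL form would not terminate.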

    Read the article

  • Parsing Wiki XML Dumps ver0.4 just got tough

    - by syed
    Hello, I am trying to parse Wikipedia XML Dump using "Parse-MediaWikiDump-1.0.4" along with "Wikiprep.pl" script. I guess this script works fine with ver0.3 Wiki XML Dumps but not with the latest ver0.4 Dumps. I get the following error. Can't locate object method "page" via package "Parse::MediaWikiDump::Pages" at wikiprep.pl line 390. Also, under the "Parse-MediaWikiDump-1.0.4" documentation @ http://search.cpan.org/~triddle/Parse-MediaWikiDump-1.0.4/lib/Parse/MediaWikiDump/Pages.pm, I read "LIMITATIONS Version 0.4 This class was updated to support version 0.4 dump files from a MediaWiki instance but it does not currently support any of the new information available in those files." Any work arounds would help me get to the next level. Note: one may wonder why cannot we directly use SAX or STAX parser instead, wikipedia dump is a 25GB plus single file, stack/memory issues are obvious. Hence, the above perl script resolves this issue but currently I am stuck with this version problem.

    Read the article

  • How to implement a graph-structured stack?

    - by Emil
    Ok, so I would like to make a GLR parser generator. I know there exist such programs better than what I will probably make, but I am doing this for fun/learning so that's not important. I have been reading about GLR parsing and I think I have a decent high level understanding of it now. But now it's time to get down to business. The graph-structured stack (GSS) is the key data structure for use in GLR parsers. Conceptually I know how GSS works, but none of the sources I looked at so far explain how to implement GSS. I don't even have an authoritative list of operations to support. Can someone point me to some good sample code/tutorial for GSS? Google didn't help so far. I hope this question is not too vague.

    Read the article

  • IVR-style dialog system

    - by unbeli
    I need to build a dialog system similar to IVR used in call centers. My system is not phone-based, but the dialog is similar. Something like System: "Main menu: Enter [1] for menu1, [2] for menu2" User: [1] System: "menu1: enter [1] for apples, [2] for oranges, [3] for main menu" User: [7] System: "What??" System: "menu1: enter [1] for apples, [2] for oranges, [3] for main menu" User: [2] ... and so on I want to have a nice declarative description of all the possible options and a nice way to run through that tree, guided by user input. Already considered: ANTLR-generated lexer/parser (seems to be an overkill), SCXML-based state machine (seems like only transitions can be declared, the rest needs to be coded)

    Read the article

  • IVR-style dialog system / workflow / menu

    - by unbeli
    I need to build a dialog system similar to IVR used in call centers. My system is not phone-based, but the dialog is similar. Something like System: "Main menu: Enter [1] for menu1, [2] for menu2" User: [1] System: "menu1: enter [1] for apples, [2] for oranges, [3] for main menu" User: [7] System: "What??" System: "menu1: enter [1] for apples, [2] for oranges, [3] for main menu" User: [2] ... and so on I want to have a nice declarative description of all the possible options and a nice way to run through that tree, guided by user input. Already considered: ANTLR-generated lexer/parser (seems to be an overkill), SCXML-based state machine (seems like only transitions can be declared, the rest needs to be coded)

    Read the article

  • How can I build a Truth Table Generator?

    - by KingNestor
    I'm looking to write a Truth Table Generator as a personal project. There are several web-based online ones here and here. (Example screenshot of an existing Truth Table Generator) I have the following questions: How should I go about parsing expressions like: ((P = Q) & (Q = R)) = (P = R) Should I use a parser generator like ANTLr or YACC, or use straight regular expressions? Once I have the expression parsed, how should I go about generating the truth table? Each section of the expression needs to be divided up into its smallest components and re-built from the left side of the table to the right. How would I evaluate something like that? Can anyone provide me with tips concerning the parsing of these arbitrary expressions and eventually evaluating the parsed expression?
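
    For expressions this small, a hand-written recursive descent parser is one option alongside a parser generator. The sketch below is only an illustration under assumptions not stated in the question: variables are single capital letters, & binds tighter than =, and = means "has the same truth value". All function names are hypothetical.

```python
import itertools
import re


def tokenize(text):
    # Single capital letters are variables; blanks and unknown characters are skipped.
    return re.findall(r"[A-Z]|[()&=]", text)


def parse(tokens):
    """Recursive descent: equiv -> conj ('=' conj)*, conj -> atom ('&' atom)*."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take(expected=None):
        nonlocal pos
        tok = peek()
        if tok is None or (expected is not None and tok != expected):
            raise SyntaxError(f"expected {expected!r}, got {tok!r}")
        pos += 1
        return tok

    def atom():
        if peek() == "(":
            take("(")
            node = equiv()
            take(")")
            return node
        tok = take()
        if not tok.isalpha():
            raise SyntaxError(f"unexpected token {tok!r}")
        return ("var", tok)

    def conj():
        node = atom()
        while peek() == "&":
            take("&")
            node = ("and", node, atom())
        return node

    def equiv():
        node = conj()
        while peek() == "=":
            take("=")
            node = ("eq", node, conj())
        return node

    tree = equiv()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree


def evaluate(node, env):
    """Evaluate a parsed tree for one assignment of truth values."""
    if node[0] == "var":
        return env[node[1]]
    left, right = evaluate(node[1], env), evaluate(node[2], env)
    return (left and right) if node[0] == "and" else (left == right)


def truth_table(text):
    """Return (variable names, [(values, result), ...]) for every assignment."""
    tokens = tokenize(text)
    tree = parse(tokens)
    names = sorted({tok for tok in tokens if tok.isalpha()})
    rows = []
    for values in itertools.product([False, True], repeat=len(names)):
        rows.append((values, evaluate(tree, dict(zip(names, values)))))
    return names, rows


names, rows = truth_table("((P = Q) & (Q = R)) = (P = R)")
print(" ".join(names), "| result")
for values, result in rows:
    print(" ".join("T" if v else "F" for v in values), "|", "T" if result else "F")
```

    Each grammar rule becomes one function, and evaluating the tree once per row of itertools.product yields the table. Showing the intermediate sub-expressions as extra columns would mean collecting the value of every node, not just the root.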

    Read the article

  • Python regex on list

    - by Peter Nielsen
    Hi there, I am trying to build a parser and save the results as an XML file, but I have problems. For instance, I get a TypeError: expected string or buffer when I try to run the code. Would you experts please have a look at my code?

        import urllib2, re
        from xml.dom.minidom import Document
        from BeautifulSoup import BeautifulSoup as bs

        osc = open('OSCTEST.html','r')
        oscread = osc.read()
        soup=bs(oscread)
        doc = Document()
        root = doc.createElement('root')
        doc.appendChild(root)
        countries = doc.createElement('countries')
        root.appendChild(countries)
        findtags1 = re.compile ('<h1 class="title metadata_title content_perceived_text(.*?)</h1>', re.DOTALL | re.IGNORECASE).findall(soup)
        findtags2 = re.compile ('<span class="content_text">(.*?)</span>', re.DOTALL | re.IGNORECASE).findall(soup)
        for header in findtags1:
            title_elem = doc.createElement('title')
            countries.appendChild(title_elem)
            header_elem = doc.createTextNode(header)
            title_elem.appendChild(header_elem)
        for item in findtags2:
            art_elem = doc.createElement('artikel')
            countries.appendChild(art_elem)
            s = item.replace('<P>','')
            t = s.replace('</P>','')
            text_elem = doc.createTextNode(t)
            art_elem.appendChild(text_elem)
        print doc.toprettyxml()

    Read the article

  • Lexing newlines in scala StdLexical?

    - by Nick Fortescue
    I'm trying to lex (then parse) a C-like language. In C there are preprocessor directives where line breaks are significant, then the actual code where they are just whitespace. One way of doing this would be a two-pass process like early C compilers: have a separate preprocessor for the # directives, then lex the output of that. However, I wondered if it was possible to do it in a single lexer. I'm pretty happy with writing the Scala parser-combinator code, but I'm not so sure how StdLexical handles whitespace. Could someone write some simple sample code which could, say, lex a #include line (using the newline) and some trivial code (ignoring the newline)? Or is this not possible, and is it better to go with the two-pass approach?

    Read the article

  • lexers / parsers for (un) structured text documents

    - by wilson32
    There are lots of parsers and lexers for scripts (i.e. structured computer languages). But I'm looking for one which can break a (almost) non-structured text document into larger sections e.g. chapters, paragraphs, etc. It's relatively easy for a person to identify them: where the Table of Contents, acknowledgements, or where the main body starts and it is possible to build rule based systems to identify some of these (such as paragraphs). I don't expect it to be perfect, but does any one know of such a broad 'block based' lexer / parser? Or could you point me in the direction of literature which may help?

    Read the article

  • Delaying execution of Javascript function relative to Google Maps / geoxml3 parser?

    - by Terra Fimeira
    I'm working on implementing a Google map on a website with our own tile overlays and KML elements. I've previously been asked to create code so that, for instance, when the page is loaded from a specific URL, it initializes with one of the tile overlays already enabled. Recently, I've been asked to do the same for the buildings which are outlined by KML elements so that, on arriving at the page with a specific URL, it automatically zooms, centers, and displays information on the building. However, while starting with the tile overlays works, the building KML does not. After doing some testing, I've determined that when the code which checks the URL executes, the page is still loading the KML elements, so they do not yet exist for the code to compare to or use.

    Code for evaluating URL (placed at the end of onLoad="initialize()"):

        function urlClick() {
            var currentURL = window.location.href; //Retrieve page URL
            var URLpiece = currentURL.slice(-6); //pull the last 6 digits (for testing)
            if (URLpiece === "access") { //If the resulting string is "access":
                access_click(); //Display accessibility overlay
            } else if (URLpiece === "middle") { //Else if the string is "middle":
                facetClick('Middle College'); //Click on building "Middle College"
            };
        };

    facetClick():

        function facetClick(name) { //Convert building name to building ID.
            for (var i = 0; i < active.placemarks.length; i++) {
                if (active.placemarks[i].name === name) {
                    sideClick(i) //Click building whose id matches "Middle College"
                };
            };
        };

    Firebug Console Error:

        active is null
        for (var i = 0; i < active.placemarks.length; i++) {

    active.placemarks holds the KML elements loaded on the page; its being null means no KML has been loaded yet. In short, I have a mistiming and I can't seem to find a suitable place to put the URL code so that it executes after the KML has loaded. As noted above, I placed it at the end of onLoad="initialize()", but it would appear that, instead of waiting for the KML to completely load earlier in the function, the remainder of the function is executed.

    onLoad="initialize()":

        information(); //Use the buttons variables inital state to set up description
        buttons(); //and button state
        button_hover(0); //and button description to neutral.
        //Create and arrange the Google Map.
        //Create basic tile overlays.
        //Set up parser to work with KML elements.
        myParser = new geoXML3.parser({ //Parser: Takes KML and converts to JS.
            map: map, //Applies parsed KML to the map
            singleInfoWindow: true,
            afterParse: useTheData //Allows us to use the parsed KML in a function
        });
        myParser.parse(['/maps/kml/shapes.kml','/maps/kml/shapes_hidden.kml']);
        google.maps.event.addListener(map, 'maptypeid_changed', function() { autoOverlay(); });
        //Create other tile overlays to appear over KML elements.
        urlClick();

    I suspect one of my issues lies in using the geoxml3 parser (http://code.google.com/p/geoxml3/), which converts our KML files to Javascript. While the page has completed loading all of the elements, the map on the page is still loading, including the KML elements. I have also tried placing urlClick() in the parser itself in various places which appear to execute after all the shapes have been parsed, but I've had no success there either. While I've been intending to strip out the parser, I would like to know if there is any way of executing urlClick after the parser has returned the KML shapes. Ideally, I don't want to use an arbitrary means of defining a time to wait, such as "wait 3 seconds, and go", as my various browsers all load the page at different times; rather, I'm looking for some way to say "when the parser is done, execute" or "when the Google map is completely loaded, execute" or perhaps even "hold until the parser is complete before advancing to urlClick".

    Read the article

  • Using Recursive SQL and XML trick to PIVOT(OK, concat) a "Document Folder Structure Relationship" table, works like MySQL GROUP_CONCAT

    - by Kevin Shyr
    I'm in the process of building out a Data Warehouse and encountered this issue along the way. In the environment, there is a table that stores all the folders with the individual level. For example, if a document is created here: {App Path}\Level 1\Level 2\Level 3\{document}, then the DocumentFolder table would look like this:

        ID   ID_Parent   FolderName
        1    NULL        Level 1
        2    1           Level 2
        3    2           Level 3

    To my understanding, the table was built so that:
      - Each proposal can have multiple documents stored at various locations
      - Different users working on the proposal will have different access levels to the folder; if one user is assigned access to a folder level, she/he can see all the sub folders and their content.

    Now we understand from an application point of view why this table was built this way. But you can quickly see the pain this causes the report writer to show a document link on the report. I wasn't surprised to find the report query had 5 self outer joins, which is at the mercy of nobody creating a document that is buried 6 levels deep, not to mention the degradation in performance.

    With the help of 2 posts (at the end of this post), I was able to come up with this solution:
      - Use recursive SQL to build out the folder path
      - Use the SQL XML trick to concat the strings.

    Code (a reminder: I built this code in a stored procedure. If you copy the syntax into a simple query window and execute, you'll get an incorrect syntax error):

        -- Get all folders and group them by the original DocumentFolderID in PTSDocument table
        ;WITH DocFoldersByDocFolderID(PTSDocumentFolderID_Original, PTSDocumentFolderID_Parent, sDocumentFolder, nLevel)
        AS (
            -- first member
            SELECT 'PTSDocumentFolderID_Original' = d1.PTSDocumentFolderID
                , PTSDocumentFolderID_Parent
                , 'sDocumentFolder' = sName
                , 'nLevel' = CONVERT(INT, 1000000)
            FROM (SELECT DISTINCT PTSDocumentFolderID
                  FROM dbo.PTSDocument_DY WITH(READPAST)
                 ) AS d1
                INNER JOIN dbo.PTSDocumentFolder_DY AS df1 WITH(READPAST)
                    ON d1.PTSDocumentFolderID = df1.PTSDocumentFolderID
            UNION ALL
            -- recursive
            SELECT ddf1.PTSDocumentFolderID_Original
                , df1.PTSDocumentFolderID_Parent
                , 'sDocumentFolder' = df1.sName
                , 'nLevel' = ddf1.nLevel - 1
            FROM dbo.PTSDocumentFolder_DY AS df1 WITH(READPAST)
                INNER JOIN DocFoldersByDocFolderID AS ddf1
                    ON df1.PTSDocumentFolderID = ddf1.PTSDocumentFolderID_Parent
        )
        -- Flatten out folder path
        , DocFolderSingleByDocFolderID(PTSDocumentFolderID_Original, sDocumentFolder)
        AS (
            SELECT dfbdf.PTSDocumentFolderID_Original
                , 'sDocumentFolder' = STUFF((SELECT '\' + sDocumentFolder
                                             FROM DocFoldersByDocFolderID
                                             WHERE (PTSDocumentFolderID_Original = dfbdf.PTSDocumentFolderID_Original)
                                             ORDER BY PTSDocumentFolderID_Original, nLevel
                                             FOR XML PATH ('')),1,1,'')
            FROM DocFoldersByDocFolderID AS dfbdf
            GROUP BY dfbdf.PTSDocumentFolderID_Original
        )

    And voila, I use the second CTE to join back to my original query (which is now a CTE for Source, as we can now use MERGE to do INSERT and UPDATE at the same time).

    Each part of this solution would not solve the problem by itself because:
      - If I don't use recursion, I cannot build out the path properly. If I use the XML trick only, then I don't have the originating folder ID info that I need to link to the document.
      - If I don't use the XML trick, then I don't have one row per document to show in the report.
      - I could conceivably do this in the report function, but I'd rather not deal with the beginning or ending backslash and how to attach the document name.
      - PIVOT doesn't do strings, and UNPIVOT runs into the same problem as the above.

    I'm excited that each version of SQL Server provides us new tools to solve old problems and/or enables us to solve problems in a more elegant way.

    The 2 posts that helped me along:
      - Recursive Queries Using Common Table Expression
      - How to use GROUP BY to concatenate strings in SQL server?

    Read the article

  • Recognizing terminals in a CFG production previously not defined as tokens.

    - by kmels
    I'm making a generator of LL(1) parsers; my input is a CoCo/R language specification. I've already got a Scanner generator for that input. Suppose I've got the following specification:

        COMPILER 1.
        CHARACTERS
            digit="0123456789".
        TOKENS
            number = digit{digit}.
            decnumber = digit{digit}"."digit{digit}.
        PRODUCTIONS
            Expression = Term{"+"Term|"-"Term}.
            Term = Factor{"*"Factor|"/"Factor}.
            Factor = ["-"](Number|"("Expression")").
            Number = (number|decnumber).
        END 1.

    So, if the parser generated by this grammar receives the word "1+1", it would be accepted, i.e. a parse tree would be found. My question is: the character "+" was never defined in a token, but it appears in the non-terminal "Expression". How should my generated Scanner recognize it? It would not recognize it as a token. Is this a valid input then? Should I add this terminal in TOKENS and then consider an error routine for the Scanner to skip it? How do usual language specifications handle this?
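
    One common approach (and, I believe, what Coco/R itself does) is to treat every string literal that appears in PRODUCTIONS as an implicitly declared, anonymous token, so the scanner is generated from the union of TOKENS and those literals. The following hand-written Python sketch shows what a generated scanner and recursive descent parser could look like for the grammar above; every name in it is hypothetical, and it evaluates the expression rather than building a parse tree.

```python
import re

# Named tokens from TOKENS, plus the literals that appear only in PRODUCTIONS
# ("+", "-", "*", "/", "(", ")"), treated here as anonymous tokens.
TOKEN_RE = re.compile(r"""
    (?P<decnumber>\d+\.\d+)    # listed before number so 1.5 is a single token
  | (?P<number>\d+)
  | (?P<literal>[-+*/()])
""", re.VERBOSE)


def scan(text):
    """Turn the input into a list of (kind, text) tokens."""
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at {pos}")
        tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos][1] if self.pos < len(self.tokens) else None

    def advance(self):
        token = self.tokens[self.pos]
        self.pos += 1
        return token

    def expression(self):   # Expression = Term {"+" Term | "-" Term}.
        value = self.term()
        while self.peek() in ("+", "-"):
            op = self.advance()[1]
            rhs = self.term()
            value = value + rhs if op == "+" else value - rhs
        return value

    def term(self):         # Term = Factor {"*" Factor | "/" Factor}.
        value = self.factor()
        while self.peek() in ("*", "/"):
            op = self.advance()[1]
            rhs = self.factor()
            value = value * rhs if op == "*" else value / rhs
        return value

    def factor(self):       # Factor = ["-"] (Number | "(" Expression ")").
        sign = 1
        if self.peek() == "-":
            self.advance()
            sign = -1
        if self.peek() == "(":
            self.advance()
            value = self.expression()
            if self.peek() != ")":
                raise SyntaxError("expected ')'")
            self.advance()
            return sign * value
        if self.pos >= len(self.tokens):
            raise SyntaxError("unexpected end of input")
        kind, text = self.advance()
        if kind == "number":
            return sign * int(text)
        if kind == "decnumber":
            return sign * float(text)
        raise SyntaxError(f"unexpected token {text!r}")


def evaluate_expression(text):
    parser = Parser(scan(text))
    value = parser.expression()
    if parser.pos != len(parser.tokens):
        raise SyntaxError("trailing input")
    return value


print(evaluate_expression("1+1"))       # 2
print(evaluate_expression("2*(3+4)"))   # 14
```

    The grammar declares no whitespace to ignore, so this scanner rejects blanks; that is a choice of the sketch, not something the specification states.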

    Read the article

  • Parsing HTTP - Bytes.length != String.length

    - by hotzen
    Hello, I consume HTTP via nio.SocketChannel, so I get chunks of data as Array[Byte]. I want to put these chunks into a parser and continue parsing after each chunk has been put. HTTP itself seems to use an ISO8859 charset, but the payload/body itself may be arbitrarily encoded: if the HTTP Content-Length specifies X bytes, the UTF8-decoded body may have far fewer characters (1 character may be represented in UTF8 by 2 bytes, etc). So what is a good parsing strategy to honor an explicitly specified Content-Length and/or a Transfer-Encoding: Chunked which specifies a chunk length to be honored?
      - Append each data chunk to a mutable.ArrayBuffer[Byte], search for CRLF in the bytes, decode everything from 0 until CRLF to String and match with regular expressions like StatusRegex, HeaderRegex, etc?
      - Decode each data chunk with the proper charset (e.g. iso8859, utf8, etc) and add it to a StringBuilder. With this solution I am not able to honor any Content-Length or chunk size, but do I have to care about it?
      - Any other solution?
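
    The first option (stay in bytes until the framing is resolved) can be sketched as follows. This is an illustration in Python rather than Scala, with a hypothetical helper split_message; it handles only Content-Length, not chunked transfer encoding, and decodes only the header block as ISO-8859-1.

```python
def split_message(buffer):
    """Split raw HTTP bytes into (start_line, headers, body_bytes, rest).

    Returns None while the header block or the body is still incomplete.
    """
    head_end = buffer.find(b"\r\n\r\n")
    if head_end == -1:
        return None                          # header block not complete yet
    head = buffer[:head_end].decode("iso-8859-1")
    start_line, *header_lines = head.split("\r\n")
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    length = int(headers.get("content-length", "0"))
    body_start = head_end + 4
    if len(buffer) - body_start < length:
        return None                          # body not complete yet
    body = buffer[body_start:body_start + length]   # length counts bytes
    rest = buffer[body_start + length:]
    return start_line, headers, body, rest


payload = "gr\u00fc\u00dfe".encode("utf-8")  # 5 characters, 7 bytes
raw = (b"HTTP/1.1 200 OK\r\n"
       b"Content-Type: text/plain; charset=utf-8\r\n"
       b"Content-Length: " + str(len(payload)).encode("ascii") + b"\r\n\r\n"
       + payload)

start_line, headers, body, rest = split_message(raw)
print(start_line, len(body), len(body.decode("utf-8")))   # HTTP/1.1 200 OK 7 5
```

    Slicing by Content-Length happens on the byte buffer, and the body is decoded with its own charset only afterwards, so the byte count and the character count never need to agree.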

    Read the article

  • Python Permutation Program Flow help

    - by dsaccount1
    Hello world, I found this code at ActiveState; it takes a string and prints permutations of the string. I understand that it's a recursive function, but I don't really understand how it works. It'd be great if someone could walk me through the program flow, thanks a bunch!

        import sys

        def printList(alist, blist=[]):
            if not len(alist): print ''.join(blist)
            for i in range(len(alist)):
                blist.append(alist.pop(i))
                printList(alist, blist)
                alist.insert(i, blist.pop())

        if __name__ == '__main__':
            k='love'
            if len(sys.argv)>1: k = sys.argv[1]
            printList(list(k))

    Read the article

  • Recursion with an Array; can't get the right value to return

    - by Matt
    Recursive Solution: Not working!

    Explanation: An integer, time, is passed into the function. It's then used to provide an end to the FOR statement (counter<time). The IF section (time == 0) provides a base case where the recursion should terminate, returning 0. The ELSE section is where the recursive call occurs: total is a private variable defined in the header file, elsewhere. It's initialized to 0 in a constructor, elsewhere. The function calls itself, recursively, adding productsAndSales[time-1][0] to total, again and again, until the base case. Then the total is returned, and printed out later. Well, that's what I hoped for anyway. What I imagined would happen is that I would add up all the values in this one column of the array and the value would get returned and printed out. Instead it returns 0. If I set the IF section to "return 1", I noticed that it returns powers of 2, for whatever value time is. E.g.: if time = 3, it returns 2*2 + 1. If time = 5, it returns 2*2*2*2 + 1. I don't understand why it's not returning the value I'm expecting.

        int CompanySales::calcTotals( int time )
        {
            cout << setw( 4 );
            if ( time == 0 )
            {
                return 0;
            }
            else
            {
                return total += calcTotals( productsAndSales[ time-1 ][ 0 ]);
            }
        }

    Iterative Solution: Working!

    Explanation: An integer, time, is passed into the function. It's then used to provide an end to the FOR statement (counter<time). The FOR statement cycles through an array, adding all of the values in one column together. The value is then returned (and elsewhere in the program, printed out). Works perfectly.

        int CompanySales::calcTotals( int time )
        {
            int total = 0;
            cout << setw( 4 );
            for ( int counter = 0; counter < time; counter++ )
            {
                total += productsAndSales[counter][0];
            }
            return total;
        }
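
    In the recursive version above, the array value itself is passed as the next time, so the recursion is steered by sales figures instead of counting rows down. A minimal Python sketch of the recursion that mirrors the iterative loop, with a made-up table and the hypothetical name calc_totals, looks like this:

```python
def calc_totals(table, time):
    """Sum column 0 of the first `time` rows of `table`, recursively."""
    if time == 0:
        return 0
    # Recurse on the row count; add the cell value outside the recursive call.
    return table[time - 1][0] + calc_totals(table, time - 1)


products_and_sales = [[10, 1], [20, 2], [30, 3]]
print(calc_totals(products_and_sales, 3))   # 60
```

    Returning the sum directly also avoids accumulating into a member variable, which would otherwise carry its value over between separate calls.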

    Read the article

  • How to write a function to output unconstant loop

    - by tunpishuang
    Here is the function description: test($argv). $argv is an array, for example $argv=array($from1,$to1,$from2,$to2.....); the number of array items must be even. $argv=array(1,2,4,5) will output values like below: 1_4 1_5 2_4 2_5. The number of items in $argv is not constant, so 3 or 4 levels of loop may be needed. I know this will need recursion, but I don't know exactly how to code it.
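
    Reading the example as "every (from, to) pair is an inclusive range, and the output is every combination of one value per range", each recursion level can stand in for one nested loop. The sketch below is in Python rather than PHP, and the name combos is made up for illustration.

```python
def combos(bounds, prefix=()):
    """Yield strings like '1_4' for every combination of the (from, to) ranges.

    `bounds` is a flat list [from1, to1, from2, to2, ...].
    """
    if len(bounds) % 2 != 0:
        raise ValueError("bounds must hold an even number of items")
    if not bounds:
        # No ranges left: the prefix is one complete combination.
        yield "_".join(str(n) for n in prefix)
        return
    start, stop, rest = bounds[0], bounds[1], bounds[2:]
    for value in range(start, stop + 1):
        # Each recursion level plays the role of one nested loop.
        yield from combos(rest, prefix + (value,))


print(list(combos([1, 2, 4, 5])))   # ['1_4', '1_5', '2_4', '2_5']
```

    Adding a third pair, such as combos([1, 2, 4, 5, 7, 8]), simply adds one more level of recursion without changing the function.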

    Read the article

  • how to write a function to output unconstant loop with PHP

    - by tunpishuang
    Here is the function description: test($argv). $argv is an array, for example $argv=array($from1,$to1,$from2,$to2.....); the number of array items must be even. $argv=array(1,2,4,5) will output values like below: 1_4 1_5 2_4 2_5. The number of items in $argv is not constant, so 3 or 4 levels of loop may be needed. I know this will need recursion, but I don't know exactly how to code it.

    Read the article

  • Creating a Maze using Java

    - by user356184
    I'm using Java to create a maze of specified "rows" and "columns" laid over each other to look like a grid. I plan to use a depth-first recursive method to "open the doors" between the rooms (the boxes created by the rows and columns). I need help writing an openDoor method that will break the link between rooms.
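
    A depth-first maze carver can be sketched as below. This is Python rather than Java, and it assumes one particular representation that the question does not specify: a door is the unordered pair of the two rooms it connects, and "opening" it means adding that pair to a set. The names make_maze, open_door, and visit are hypothetical.

```python
import random


def make_maze(rows, cols, seed=None):
    """Carve a maze with recursive depth-first search.

    Returns the set of open doors, each a frozenset of two adjacent rooms.
    """
    rng = random.Random(seed)
    visited = set()
    doors = set()

    def open_door(room_a, room_b):
        # Opening a door records the link between two neighbouring rooms.
        doors.add(frozenset((room_a, room_b)))

    def visit(room):
        visited.add(room)
        r, c = room
        neighbours = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
        rng.shuffle(neighbours)              # random order gives a random maze
        for nr, nc in neighbours:
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and (nr, nc) not in visited:
                open_door(room, (nr, nc))
                visit((nr, nc))

    visit((0, 0))
    return doors


doors = make_maze(3, 3, seed=42)
print(len(doors))   # 8
```

    Every room except the starting one is entered through exactly one newly opened door, so a grid of rows x cols rooms always ends up with rows*cols - 1 open doors. For large grids the recursion depth can approach the number of rooms, so an explicit stack may be safer.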

    Read the article

  • Creating a bare bone web-browser: After the html parser, javascript parser, etc have done their work, how do I display the content of the webpage?

    - by aste123
    This is a personal project to learn computer programming. I took a look at this: https://www.udacity.com/course/viewer#!/c-cs262 The following is the approach taken in it: an Abstract Syntax Tree is created, but the javascript is still not completely broken down, in order not to confuse it with the html tags. Then the javascript interpreter is called on it. The javascript interpreter stores the text from write() and document.write() to be used later. Then a graphics library in Python is called which converts everything to a pdf file, and then we convert it into png or jpeg and display it.

    My Question: I want to display the actual text in a window (which I will design later) like firefox or chrome does, instead of image files, so that the data can be selected, copied, etc by the user of the browser. How do I accomplish this? In other words, what are the other elements of a bare bone web browser that I am missing? I would prefer to implement most of the stuff in C++, although if things seem too complicated I might go with Python to save time and create a prototype, and later create another bare bone browser in C++ and add more features. This is a project to learn more. I do realize we already have lots of reliable browsers like firefox, etc.

    The way I feel it is done: I think after all the broken down contents have been created by the parsers and interpreters, I will need to access them individually from within the window's code (like qt) and then decide upon a good way to display them. I am not sure if this is the way it should be done.

    Additions after useful comment by Kilian Foth: I found this page: http://friendlybit.com/css/rendering-a-web-page-step-by-step/
      14. A DOM tree is built out of the broken HTML
      15. New requests are made to the server for each new resource that is found in the HTML source (typically images, style sheets, and JavaScript files). Go back to step 3 and repeat for each resource.
      16. Stylesheets are parsed, and the rendering information in each gets attached to the matching node in the DOM tree
      17. Javascript is parsed and executed, and DOM nodes are moved and style information is updated accordingly
      18. The browser renders the page on the screen according to the DOM tree and the style information for each node
      19. You see the page on the screen

    I need help with step 18. How do I do that? How much work do Webkit and Gecko do? I want to use a readymade layout renderer for step number 18 and not for anything that comes before that.

    Read the article

  • Can Haskell's Parsec library be used to implement a recursive descent parser with backup?

    - by Thor Thurn
    I've been considering using Haskell's Parsec parsing library to parse a subset of Java as a recursive descent parser, as an alternative to more traditional parser-generator solutions like Happy. Parsec seems very easy to use, and parse speed is definitely not a factor for me. I'm wondering, though, if it's possible to implement "backup" with Parsec, a technique which finds the correct production to use by trying each one in turn. For a simple example, consider the very start of the JLS Java grammar:

        Literal:
            IntegerLiteral
            FloatingPointLiteral

    I'd like a way to not have to figure out how I should order these two rules to get the parse to succeed. As it stands, a naive implementation like this:

        literal = do {
            x <- try (do { v <- integer; return (IntLiteral v)})
                 <|> (do { v <- float; return (FPLiteral v)});
            return(Literal x)
        }

    will not work: inputs like "15.2" will cause the integer parser to succeed first, and then the whole thing will choke on the "." symbol. In this case, of course, it's obvious that you can solve the problem by re-ordering the two productions. In the general case, though, finding things like this is going to be a nightmare, and it's very likely that I'll miss some cases. Ideally, I'd like a way to have Parsec figure out stuff like this for me. Is this possible, or am I simply trying to do too much with the library? The Parsec documentation claims that it can "parse context-sensitive, infinite look-ahead grammars", so it seems like I should be able to do something here.
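
    The "backup" behaviour described above can be illustrated with a different technique from Parsec's committed choice: the "list of successes" style, in which every parser returns all the ways it can match and the caller keeps whichever result consumes the whole input. The sketch below is in Python, not Haskell, and is not Parsec's API; regex, choice, and parse_all are hypothetical helper names.

```python
import re


def regex(pattern, build):
    """Parser that matches `pattern` at a position and returns all results."""
    compiled = re.compile(pattern)

    def parser(text, pos):
        match = compiled.match(text, pos)
        return [(build(match.group()), match.end())] if match else []

    return parser


def choice(*parsers):
    """Try every alternative and collect all successes, in order."""
    def parser(text, pos):
        results = []
        for alternative in parsers:
            results.extend(alternative(text, pos))
        return results

    return parser


def parse_all(parser, text):
    """Return the first result that consumes the entire input."""
    for value, end in parser(text, 0):
        if end == len(text):
            return value
    raise SyntaxError(f"no alternative consumed all of {text!r}")


integer = regex(r"\d+", lambda s: ("IntLiteral", int(s)))
floating = regex(r"\d+\.\d+", lambda s: ("FPLiteral", float(s)))
literal = choice(integer, floating)   # order does not matter here

print(parse_all(literal, "15.2"))   # ('FPLiteral', 15.2)
print(parse_all(literal, "15"))     # ('IntLiteral', 15)
```

    The integer alternative still succeeds on "15.2", but its result stops at the "." and is discarded in favour of the alternative that reaches the end of the input. The cost is that all alternatives are explored, which can be expensive on ambiguous grammars.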

    Read the article
