Search Results

Search found 3284 results on 132 pages for 'parser generator'.

Page 109/132

  • Versioned RDF store

    - by Mat
    Let me try rephrasing this: I am looking for a robust RDF store or library with the following features:

    - Named graphs, or some other form of reification.
    - Version tracking (probably at the named-graph level).
    - Privacy between groups of users, either at the named-graph or triple level.
    - Human-readable data input and output, e.g. a TriG parser and serialiser.

    I've played with Jena, Sesame, Boca, RDFLib, Redland and one or two others some time ago, but each had its problems. Have any improved in the above areas recently? Can anything else do what I want, or is RDF not yet ready for prime time? Reading around the subject a bit more, I've found that:

    - Jena: nothing further.
    - Sesame: nothing further.
    - Boca: does not appear to be maintained any more and seems only really designed for DB2. OpenAnzo, an open-source fork, appears more promising.
    - RDFLib: nothing further.
    - Redland: nothing further.
    - Talis Platform: appears to support changesets (wiki page and reference in Kniblet Tutorial Part 5), but it's a hosted-only service. Still may look into it, though.
    - SemVersion: sounded promising, but appears to be stale.
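
    For anyone poking at the RDFLib end of that list: recent RDFLib releases do cover the named-graph and TriG parts of the wish list (version tracking and privacy still have to live in a layer above the store). A minimal sketch, assuming an RDFLib version with the TriG serializer registered; the URIs and triple are made up for illustration:

        from rdflib import ConjunctiveGraph, URIRef, Literal, Namespace

        EX = Namespace("http://example.org/")

        # A ConjunctiveGraph stores quads, so each named graph stays addressable.
        store = ConjunctiveGraph()
        v1 = store.get_context(URIRef("http://example.org/graphs/v1"))
        v1.add((EX.doc, EX.title, Literal("First draft")))

        # TriG keeps the named-graph boundaries visible in the output.
        print(store.serialize(format="trig"))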


  • Better viewing of postfix mail queue files than postcat?

    - by Geekman
    So I got a call early this morning about a client needing to see what email they have waiting to be delivered, sitting in our secondary mail server. Their link for the main server had (still is) been down for two days and they needed to see their email. So I wrote up a quick perl script to use mailq in combination with postcat to dump each email for their address into separate files, tar'd it up and sent it off. Horrible code, I know, but it was urgent.

    My solution works OK in that it at least gives a raw view, but I thought tonight it would be nice if I had a solution where I could provide their email attachments and maybe remove some "garbage" header text as well. Most of the important emails seem to have a PDF or similar attached. I've been looking around, but the only method of viewing queue files I can see is the postcat command, and I really don't want to write my own parser - so I was wondering if any of you have already done so, or know of a better command to use?

    Here's the code for my current solution:

        #!/usr/bin/perl
        $qCmd = "mailq | grep -B 2 \"someemailaddress@isp\" | cut -d \" \" -f 1";
        @data = split(/\n/, `$qCmd`);
        $i = 0;
        foreach $line (@data) {
            $i++;
            $remainder = $i % 2;
            if ($remainder == 0) { next; }
            if ($line =~ /\(/ || $line =~ /\n/ || $line eq "") { next; }
            print "Processing: " . $line . "\n";
            `postcat -q $line > $line.email.txt`;
            $subject = `cat $line.email.txt | grep "Subject:"`;
            #print "SUB" . $subject;
            #`cat $line.email.txt > \"$subject.$line.email.txt\"`;
        }

    Any advice appreciated.
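
    On the attachment side: once you have a raw RFC822 message on disk, a stock MIME parser will pull the PDFs out, so no hand-written queue-file parser is needed. A sketch using Python's standard email package, assuming the postcat output has been reduced to just headers plus body (newer postcat builds take -h/-b flags for that; otherwise strip the queue-file decoration first). The file handling here is illustrative only:

        import email
        from email import policy

        def extract_attachments(path):
            # Parse one message dumped from the queue,
            # e.g. `postcat -q <queue_id> -bh > msg.txt` on a newer Postfix.
            with open(path, "rb") as fh:
                msg = email.message_from_binary_file(fh, policy=policy.default)
            for part in msg.walk():
                filename = part.get_filename()
                if filename:  # any MIME part that names a file, e.g. an attached PDF
                    with open(filename, "wb") as out:
                        out.write(part.get_payload(decode=True))
                    print("saved", filename)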


  • DIY intellisense on XPath - design approach? (WinForms app)

    - by Cheeso
    I read the DIY Intellisense article on Code Project, which was referenced from the Mimic Intellisense? question here on SO. I wanna do something similar, DIY intellisense, but for XPath not C#.

    The design approach used there makes sense to me: maintain a tree of terms, and when the "completion character" is pressed, in the case of C# a dot, pop up the list of possible completions in a textfield. Then allow the user to select a term from the textfield either through typing, arrow keys, or double-click. How would you apply this to XPath autocompletion?

    - Should there be an autocomplete key? In XPath there is no obvious separator key like "dot" in C#. Should the popup be triggered explicitly in some other way, let's say ctrl-.? Or should the parser try to autocomplete continuously?
    - If I do the autocomplete continuously, how do I scale it properly? There are 93 XPath functions, not counting overloads. I certainly don't want to pop up a list of 93 choices. How do I decide when I've narrowed it enough to offer a useful list of possible completions? (See the sketch below.)
    - How to populate the tree of possible completions? For C#, it's easy: walk the type space via reflection. At a first level, the "syntax tree" for C# seems like a single tree, and the list of completions at any point depends on the graph of nodes you've traversed to that point. Typing System.Console. traverses to a certain node in that tree, and the list of completions is the set of child nodes available at that node in the tree. On the other hand, the XPath syntax seems like it is a "flatter" tree: function names, axis names, literals. Does this make sense?
    - What have I not considered?
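
    One common answer to the "when is the list narrow enough" question is to filter continuously but only surface the popup once the candidate set drops under a threshold. A language-agnostic sketch of that rule, written in Python for brevity; the function table here is deliberately truncated, not the real list of 93:

        # A few XPath function names; a real completer would load all 93.
        XPATH_FUNCTIONS = [
            "string", "string-length", "starts-with", "substring",
            "substring-after", "substring-before", "sum", "count", "contains",
        ]

        def completions(prefix, max_choices=10):
            """Return candidates only when the prefix narrows them enough to be useful."""
            matches = sorted(f for f in XPATH_FUNCTIONS if f.startswith(prefix))
            return matches if 0 < len(matches) <= max_choices else []

        print(completions("s"))   # 7 candidates here: under the threshold, so show them
        print(completions("su"))  # ['substring', 'substring-after', 'substring-before', 'sum']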


  • Registering custom webcontrol inside mvc view?

    - by kastermester
    I am in the middle of a project where I am migrating some code for a website from WebForms to MVC - unfortunately there's not enough time to do it all at once, so I will have to do some... not so pretty solutions. I am, though, facing a problem with a custom control I have written that inherits from the standard GridView control:

        namespace Controls {
            public class MyGridView : GridView {
                ...
            }
        }

    I have added it to the web.config file as usual:

        <configuration>
          ...
          <system.web>
            ...
            <pages>
              ...
              <controls>
                ...
                <add tagPrefix="Xui" namespace="Controls"/>
              </controls>
            </pages>
          </system.web>
        </configuration>

    Then on the MVC view:

        <Xui:MyGridView ID="GridView1" runat="server" ...>...</Xui:MyGridView>

    However I am getting a parser error stating that the control cannot be found. I suspect this has to do with the mix-up of MVC and WebForms; however, I am/was under the impression that such a mixup should be possible. Is there any kind of tweak for this? I realise this solution is far from ideal, however there's no time to "do the right thing". Thanks


  • Execute SQL on CSV files via JDBC

    - by Markos Fragkakis
    Dear all, I need to apply an SQL query to CSV files (comma-separated text files). My SQL is predefined by another tool and is not eligible for change. It may contain embedded selects and table aliases in the FROM part. For my task I have found the following options (open source is a project requirement), the first two being libraries that provide JDBC drivers:

    1. CsvJdbc
    2. XlSQL
    3. JBoss Teiid
    4. Create an Apache Derby DB, load all CSVs as tables and execute the query.

    These are the problems I encountered:

    1. CsvJdbc does not accept the syntax of the SQL (it uses internal selects and table aliases). Furthermore, it has not been maintained since 2004.
    2. XlSQL I could not get to work, as it has as a dependency a SAX parser that causes exceptions when parsing other documents. Similarly, no change since 2004.
    3. Teiid: I have not checked if it supports the syntax, but it seems like an overhead. It needs several entities defined (Virtual Databases, Bindings). From the mailing list they told me that the last release supports runtime creation of the required objects. Has anyone used it for such a simple task (normally it can connect to several types of data, like CSV, XML or other DBs, and create a virtual, unified one)? Can this even be done easily?

    From the 4 things I considered/tried, only 3 and 4 seem to me viable (see the sketch of option 4's shape below). Any advice on these, or any other way in which I can query my CSV files? Cheers
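
    The load-then-query pattern of option 4 is cheap to prototype before committing to Derby. Purely to illustrate the shape (the project itself needs Java/JDBC, and the file and column layout here are hypothetical), a Python/sqlite3 sketch:

        import csv
        import sqlite3

        conn = sqlite3.connect(":memory:")
        with open("orders.csv", newline="") as fh:  # hypothetical CSV file
            rows = list(csv.reader(fh))
        header, data = rows[0], rows[1:]

        # One table per CSV; every column as text is fine for a first pass.
        conn.execute("CREATE TABLE orders (%s)" % ", ".join(header))
        conn.executemany(
            "INSERT INTO orders VALUES (%s)" % ", ".join("?" * len(header)), data)

        # Embedded selects and table aliases work, since it's a real SQL engine.
        print(conn.execute(
            "SELECT o.* FROM (SELECT * FROM orders) AS o LIMIT 5").fetchall())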


  • Django-South introspection rule doesn't work.

    - by Ory Band
    I'm using Django 1.2.3 and South 0.7.3. I am trying to convert my app (named core) to use Django-South. I have a custom model/field that I'm using, named ImageWithThumbsField. It's basically just the ol' django.db.models.ImageField with some attributes such as height, weight, etc. While trying to ./manage.py convert_to_south core I receive South's freezing errors. I have no idea why; I'm probably missing something...

    I am using a simple custom field:

        from django.db.models import ImageField

        class ImageWithThumbsField(ImageField):
            def __init__(self, verbose_name=None, name=None, width_field=None,
                         height_field=None, sizes=None, **kwargs):
                self.verbose_name = verbose_name
                self.name = name
                self.width_field = width_field
                self.height_field = height_field
                self.sizes = sizes
                super(ImageField, self).__init__(**kwargs)

    And this is my introspection rule, which I add to the top of my models.py:

        from south.modelsinspector import add_introspection_rules
        from lib.thumbs import ImageWithThumbsField

        add_introspection_rules(
            [
                (
                    (ImageWithThumbsField, ),
                    [],
                    {
                        "verbose_name": ["verbose_name", {"default": None}],
                        "name": ["name", {"default": None}],
                        "width_field": ["width_field", {"default": None}],
                        "height_field": ["height_field", {"default": None}],
                        "sizes": ["sizes", {"default": None}],
                    },
                ),
            ],
            ["^core/.fields/.ImageWithThumbsField",])

    These are the errors I receive:

         ! Cannot freeze field 'core.additionalmaterialphoto.photo'
         ! (this field has class lib.thumbs.ImageWithThumbsField)
         ! Cannot freeze field 'core.material.photo'
         ! (this field has class lib.thumbs.ImageWithThumbsField)
         ! Cannot freeze field 'core.material.formulaimage'
         ! (this field has class lib.thumbs.ImageWithThumbsField)
         ! South cannot introspect some fields; this is probably because they are custom
         ! fields. If they worked in 0.6 or below, this is because we have removed the
         ! models parser (it often broke things).
         ! To fix this, read http://south.aeracode.org/wiki/MyFieldsDontWork

    Does anybody know why? What am I doing wrong?
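
    A hint for readers hitting the same error: South matches that final pattern, as a regex, against the field class's full dotted import path. The error output above says the class lives at lib.thumbs.ImageWithThumbsField, so a pattern of the following shape (an educated guess from the import in models.py, not a fix confirmed by the poster) is the first thing to try; rules stands for the list shown above:

        # Match the module path South actually reports, with the dots escaped:
        add_introspection_rules(rules, [r"^lib\.thumbs\.ImageWithThumbsField"])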


  • Sudden issues reading uncompressed video using opencv

    - by JohnSavage
    I have been using a particular pipeline to process video using OpenCV to encode uncompressed video (fourcc = 0), and the OpenCV Python bindings to then open and work on these files. This has been working fine for me on OpenCV 2.3.1a on Ubuntu 11.10 until just a few days ago. For some reason it currently is only allowing me to read the first frame of a given file the first time I open that file. Further frames are not read, and once I touch the file once with my program, it then cannot even read the first frame.

    More detail: I created the uncompressed video files as follows:

        out_video.open(out_vid_name, 0, // FOURCC = 0 means record raw
                       fps, Size(640, 480))

    Again, these videos worked fine for me until about a week ago. Now, when I try to open one of these I get the following message (from what I think is ffmpeg):

        Processing video.avi
        Using network protocols without global network initialization. Please use avformat_network_init(), this will become mandatory later.
        [avi @ 0x29251e0] parser not found for codec rawvideo, packets or times may be invalid.

    It reads and displays the first frame fine, but then fails to read the next frame. Then, when I try to run my code on the same video, the capture still opens with the same message as above. However, it cannot even read the very first frame. Here is the code to open the capture:

        self.capture = cv2.VideoCapture(filename)
        if not self.capture.isOpened():
            print "Error: could not open capture"
            sys.exit()

    Again, this part is passed without any issue, but then the break happens at:

        success, rgb = self.capture.read()
        if not success:
            print "error: could not read frame"
            return False

    This part breaks at the second frame on the first run of the video file, and then on the first frame on subsequent runs. I really don't know where to even begin debugging this. Please help!


  • problem parsing with XMLReader (using ReadSubTree)

    - by no9
    Hello. I'm trying to build a simple XML-to-Controls parser in my CF application. In the code below, the string I'm trying to parse looks like this:

        "<Panel><Label>Text1</Label><Label>Text2</Label></Panel>"

    The result I want with this code would be a Panel with two Labels. But the problem is that when the first Label is parsed, subReader.Read() returns false in the ParsePanelElement method, and so it falls out of the while statement. Since I'm new to XmlReader I must be missing something very simple. Any help would be appreciated! peace.

        static class XMLParser
        {
            public static Control Parse(string aXmlString)
            {
                XmlReader reader = XmlReader.Create(new StringReader(aXmlString));
                return ParseXML(reader);
            }

            public static Control ParseXML(XmlReader reader)
            {
                using (reader)
                {
                    while (reader.Read())
                    {
                        if (reader.NodeType == XmlNodeType.Element)
                        {
                            if (reader.LocalName == "Panel")
                            {
                                return ParsePanelElement(reader);
                            }
                            if (reader.LocalName == "Label")
                            {
                                return ParseLabelElement(reader);
                            }
                        }
                    }
                }
                return null;
            }

            private static Control ParsePanelElement(XmlReader reader)
            {
                var myPanel = new Panel();
                XmlReader subReader = reader.ReadSubtree();
                while (subReader.Read())
                {
                    Control subControl = ParseXML(subReader);
                    if (subControl != null)
                    {
                        myPanel.Controls.Add(subControl);
                    }
                }
                return myPanel;
            }

            private static Control ParseLabelElement(XmlReader reader)
            {
                reader.Read();
                var myString = reader.Value as string;
                var myLabel = new Label();
                myLabel.Text = myString;
                return myLabel;
            }
        }


  • Django upload failing on request data read error

    - by Jake
    Hi All, I've got a Django app that accepts uploads from jQuery uploadify, a jQuery plugin that uses Flash to upload files and give a progress bar. Files under about 150k work, but bigger files always fail, and almost always at around 192k (that's 3 chunks) completed, sometimes at around 160k. The exception I get is below.

        exceptions.IOError: request data read error

        File "/usr/lib/python2.4/site-packages/django/core/handlers/wsgi.py", line 171, in _get_post
            self._load_post_and_files()
        File "/usr/lib/python2.4/site-packages/django/core/handlers/wsgi.py", line 137, in _load_post_and_files
            self._post, self._files = self.parse_file_upload(self.META, self.environ['wsgi.input'])
        File "/usr/lib/python2.4/site-packages/django/http/__init__.py", line 124, in parse_file_upload
            return parser.parse()
        File "/usr/lib/python2.4/site-packages/django/http/multipartparser.py", line 192, in parse
            for chunk in field_stream:
        File "/usr/lib/python2.4/site-packages/django/http/multipartparser.py", line 314, in next
            output = self._producer.next()
        File "/usr/lib/python2.4/site-packages/django/http/multipartparser.py", line 468, in next
            for bytes in stream:
        File "/usr/lib/python2.4/site-packages/django/http/multipartparser.py", line 314, in next
            output = self._producer.next()
        File "/usr/lib/python2.4/site-packages/django/http/multipartparser.py", line 375, in next
            data = self.flo.read(self.chunk_size)
        File "/usr/lib/python2.4/site-packages/django/http/multipartparser.py", line 405, in read
            return self._file.read(num_bytes)

    When running locally on the Django development server, big files work. I've tried setting

        FILE_UPLOAD_HANDLERS = ("django.core.files.uploadhandler.TemporaryFileUploadHandler",)

    in case it was the memory upload handler, but it made no difference. Does anyone know how to fix this?


  • QueryReadStore loads JSON into DataGrid, but JsonRestStore does not (from the same source)

    - by labratmatt
    I'm building a Dojo DataGrid from JSON data provided by my REST interface. The DataGrid loads the data fine using a QueryReadStore, but doesn't seem to work with the same data piped into a JsonRestStore. I'm using the following Dojo libs with Dojo 1.4.1:

        dojo.require("dojox.data.JsonRestStore");
        dojo.require("dojox.grid.DataGrid");
        dojo.require("dojox.data.QueryReadStore");
        dojo.require("dojo.parser");

    I declare my stores in the following manner:

        var storeJRS = new dojox.data.JsonRestStore({target: "api/collaborations.php/1", idAttribute: 'items[].id'});
        var storeQRS = new dojox.data.QueryReadStore({url: "api/collaborations.php/1", requestMethod: "get"});

    I create my grid layout like this:

        var gridLayout = [
            new dojox.grid.cells.RowIndex({ name: "Row #", width: 5, styles: "text-align: left;" }),
            { name: "Name", field: "name", styles: "text-align:right;", width: 20 },
            { name: "Description", field: "description", width: 30 }
        ];

    I create my DataGrid as follows: The above works, but if I use QueryReadStore as my store, the grid is created with the headers (Name, Description), but it isn't populated with any rows:

        <div dojoType="dojox.grid.DataGrid" jsid="grid3" store="storeQRS" structure="gridLayout" style="height:500px; width:1000px;"></div>

    Using FireBug, I can see that QueryReadStore is getting my JSON data from my REST interface. It looks like the following:

        {"numRows":6,"items":[{"name":"My Super Cool Collab","description":"This is for all the super cool people in the super cool group","id":1},{"name":"My Other Super Cool","description":"This is for all the other super cool people","id":3},{"name":"This is another coll","description":"This is just some other collab","id":4},{"name":"some new collab","description":"this is a new collab","id":5},{"name":"yet another new coll","description":"uh huh","id":6},{"name":"asdf","description":"asdf","id":7}]}

    Any ideas? Thanks.


  • iPhone: variable type returned by yajl

    - by Luc
    Hello, I'm quite new to iPhone programming and I want to do the following stuff:

    - get data from a JSON REST web server
    - parse the received data using YAJL
    - draw a graph with those data using core-plot

    So, the 1st item is fine; I use ASIHttpRequest, which runs as expected. The 3rd is almost fine (I still have to learn how to tune core-plot). The problem I have is regarding the 2nd item. I use YAJL as it seems to be the fastest parser, so why not give it a try :) Here is the part of code that gets the data from the server and parses them:

        // Get server data
        response_data = [request responseData];
        // Parse JSON received
        self.arrayFromData = [response_data yajl_JSON];
        NSLog(@"Array from data: %@", self.arrayFromData);

    The parsing works quite well in fact; the NSLog output is something like:

        2010-06-14 17:56:35.375 TEST_APP[3733:207] Array from data : {
            data = (
                { val = 1317; date = "2010-06-10T15:50:01+02:00"; },
                { val = 1573; date = "2010-06-10T16:20:01+02:00"; },
                ........
                { val = 840; date = "2010-06-11T14:50:01+02:00"; },
                { val = 1265; date = "2010-06-11T15:20:01+02:00"; }
            );
            from = "2010-06-10T15:50:01+02:00";
            to = "2010-06-11T15:20:01+02:00";
            max = "2590";
        }

    According to the yajl-objc explanations (http://github.com/gabriel/yajl-objc), the parsing returns an NSArray... The thing is, I do not know how to get all the values from it, as for me it looks more like an NSDictionary than an NSArray... Could you please help?

    Thanks a lot, Luc

    edit1: it happens that this object is actually an NSCFDictionary (!); I am still not able to get values from it - when I try the objectFromKey method (that should work on a Dictionary, no?) it fails.


  • Slowdowns when reading from an urlconnection's inputstream (even with byte[] and buffers)

    - by user342677
    Ok so after spending two days trying to figure out the problem, and reading about a dizillion articles, I finally decided to man up and ask for some advice (my first time here). Now to the issue at hand: I am writing a program which will parse api data from a game, namely battle logs. There will be A LOT of entries in the database (20+ million) and so the parsing speed for each battle log page matters quite a bit. The pages to be parsed look like this: http://api.erepublik.com/v1/feeds/battle_logs/10000/0 (see source code if using Chrome, it doesn't display the page right). It has 1000 hit entries, followed by a little battle info (the last page will have <1000, obviously). On average, a page contains 175000 characters, UTF-8 encoding, xml format (v 1.0). The program will run locally on a good PC; memory is virtually unlimited (so that creating byte[250000] is quite ok). The format never changes, which is quite convenient. Now, I started off as usual:

        //global vars, class declaration skipped
        public WebObject(String url_string, int connection_timeout, int read_timeout,
                boolean redirects_allowed, String user_agent)
                throws java.net.MalformedURLException, java.io.IOException {
            // Open a URL connection
            java.net.URL url = new java.net.URL(url_string);
            java.net.URLConnection uconn = url.openConnection();
            if (!(uconn instanceof java.net.HttpURLConnection)) {
                throw new java.lang.IllegalArgumentException("URL protocol must be HTTP");
            }
            conn = (java.net.HttpURLConnection) uconn;
            conn.setConnectTimeout(connection_timeout);
            conn.setReadTimeout(read_timeout);
            conn.setInstanceFollowRedirects(redirects_allowed);
            conn.setRequestProperty("User-agent", user_agent);
        }

        public void executeConnection() throws IOException {
            try {
                is = conn.getInputStream(); //global var
                l = conn.getContentLength(); //global var
            } catch (Exception e) {
                //handling code skipped
            }
        }
        //getContentStream and getLength methods which just return 'is' and 'l' are skipped

    Here is where the fun part began. I ran some profiling (using System.currentTimeMillis()) to find out what takes long and what doesn't. The call to this method takes only 200ms on avg:

        public InputStream getWebPageAsStream(int battle_id, int page) throws Exception {
            String url = "http://api.erepublik.com/v1/feeds/battle_logs/" + battle_id + "/" + page;
            WebObject wobj = new WebObject(url, 10000, 10000, true, "Mozilla/5.0 "
                    + "(Windows; U; Windows NT 5.1; en-US; rv:1.9.2.3) Gecko/20100401 Firefox/3.6.3 ( .NET CLR 3.5.30729)");
            wobj.executeConnection();
            l = wobj.getContentLength(); // global variable
            return wobj.getContentStream(); //returns 'is' stream
        }

    200ms is quite expected from a network operation, and I am fine with it. BUT when I parse the inputStream in any way (read it into a string / use a java XML parser / read it into another ByteArrayStream) the process takes over 1000ms! For example, this code takes 1000ms IF I pass the stream I got ('is') above from getContentStream() directly to this method:

        public static Document convertToXML(InputStream is)
                throws ParserConfigurationException, IOException, SAXException {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            DocumentBuilder db = dbf.newDocumentBuilder();
            Document doc = db.parse(is);
            doc.getDocumentElement().normalize();
            return doc;
        }

    This code too takes around 920ms IF the initial InputStream 'is' is passed in (don't read into the code itself - it just extracts the data I need by directly counting the characters, which can be done thanks to the rigid api feed format):

        public static parsedBattlePage convertBattleToXMLWithoutDOM(InputStream is) throws IOException {
            // Point A
            BufferedReader br = new BufferedReader(new InputStreamReader(is));
            LinkedList ll = new LinkedList();
            String str = br.readLine();
            while (str != null) {
                ll.add(str);
                str = br.readLine();
            }
            if (((String) ll.get(1)).indexOf("error") != -1) {
                return new parsedBattlePage(null, null, true, -1);
            }
            // Point B
            Iterator it = ll.iterator();
            it.next(); it.next(); it.next(); it.next();
            String[][] hits_arr = new String[1000][4];
            String t_str = (String) it.next();
            String tmp = null;
            int j = 0;
            for (int i = 0; t_str.indexOf("time") != -1; i++) {
                hits_arr[i][0] = t_str.substring(12, t_str.length() - 11);
                tmp = (String) it.next();
                hits_arr[i][1] = tmp.substring(14, tmp.length() - 9);
                tmp = (String) it.next();
                hits_arr[i][2] = tmp.substring(15, tmp.length() - 10);
                tmp = (String) it.next();
                hits_arr[i][3] = tmp.substring(18, tmp.length() - 13);
                it.next();
                it.next();
                t_str = (String) it.next();
                j++;
            }
            String[] b_info_arr = new String[9];
            int[] space_nums = {13, 10, 13, 11, 11, 12, 5, 10, 13};
            for (int i = 0; i < space_nums.length; i++) {
                tmp = (String) it.next();
                b_info_arr[i] = tmp.substring(space_nums[i] + 4, tmp.length() - space_nums[i] - 1);
            }
            // Point C
            return new parsedBattlePage(hits_arr, b_info_arr, false, j);
        }

    I have tried replacing the default BufferedReader with

        BufferedReader br = new BufferedReader(new InputStreamReader(is), 250000);

    This didn't change much. My second try was to replace the code between A and B with:

        Iterator it = IOUtils.lineIterator(is, "UTF-8");

    Same result, except this time A-B was 0ms, and B-C was 1000ms, so then every call to it.next() must have been consuming some significant time. (IOUtils is from the apache-commons-io library.)

    And here is the culprit: the time taken to parse the stream to string, be it by an iterator or BufferedReader, in ALL cases was about 1000ms, while the rest of the code took 0ms (e.g. irrelevant). This means that parsing the stream to a LinkedList, or iterating over it, for some reason was eating up a lot of my system resources. The question was: why? Is it just the way java is made... no... that's just stupid, so I did another experiment. In my main method I added, after the getWebPageAsStream():

        // Point A
        ba = new byte[l]; // 'l' comes from wobj.getContentLength above
        bytesRead = is.read(ba); // 'is' is our URLConnection original InputStream
        offset = bytesRead;
        while (bytesRead != -1) {
            bytesRead = is.read(ba, offset - 1, l - offset);
            offset += bytesRead;
        }
        // Point B
        InputStream is2 = new ByteArrayInputStream(ba);
        // Now just working with 'is2' - the "copied" stream

    The InputStream-to-byte[] conversion took again 1000ms - this is the way many ppl suggested to read an InputStream, and still it is slow. And guess what: the 2 parser methods above (convertToXML() and convertBattleToXMLWithoutDOM()), when passed 'is2' instead of 'is', took, in all 4 cases, under 50ms to complete.

    I read a suggestion that the stream waits for the connection to close before unblocking, so I tried using HttpComponentsClient 4.0 (http://hc.apache.org/httpcomponents-client/index.html) instead, but the initial InputStream took just as long to parse. e.g. this code:

        public InputStream getWebPageAsStream2(int battle_id, int page) throws Exception {
            String url = "http://api.erepublik.com/v1/feeds/battle_logs/" + battle_id + "/" + page;
            HttpClient httpclient = new DefaultHttpClient();
            HttpGet httpget = new HttpGet(url);
            HttpParams p = new BasicHttpParams();
            HttpConnectionParams.setSocketBufferSize(p, 250000);
            HttpConnectionParams.setStaleCheckingEnabled(p, false);
            HttpConnectionParams.setConnectionTimeout(p, 5000);
            httpget.setParams(p);
            HttpResponse response = httpclient.execute(httpget);
            HttpEntity entity = response.getEntity();
            l = (int) entity.getContentLength();
            return entity.getContent();
        }

    took even longer to process (50ms more, for just the network part) and the stream parsing times remained the same. Obviously it can be instantiated so as to not create the HttpClient and properties every time (faster network time), but the stream issue won't be affected by that.

    So we come to the central problem: why does the initial URLConnection InputStream (or HttpClient InputStream) take so long to process, while any stream of the same size and content created locally is orders of magnitude faster? I mean, the initial response is already somewhere in RAM, and I can't see any good reason why it is processed so slowly compared to when a same stream is just created from a byte[]. Considering I have to parse millions of entries and thousands of pages like that, a total processing time of almost 1.5s/page seems WAY WAY too long. Any ideas?

    P.S. Please ask if any more code is required - the only thing I do after parsing is make a PreparedStatement and put the entries into JavaDB in packs of 1000+, and the perfomance is ok, ~200ms/1000 entries; prb could be optimized with more cache but I didn't look into it much.


  • Parsing multibyte string in PHP

    - by Petr Peller
    I would like to write an (HTML) parser based on a state machine, but I have doubts about how to actually read/use the input. I decided to load the whole input into one string and then work with it as with an array, holding its index as the current parsing position. There would be no problem with a single-byte encoding, but in a multi-byte encoding each value does not represent a character, but one byte of a character. Example:

        $mb_string = 'žšcr'; // 4 multi-byte characters in UTF-8
        for ($i = 0; $i < 4; $i++) {
            echo $mb_string[$i], PHP_EOL;
        }

    This outputs four lines of mojibake (lone bytes from the first two characters) rather than the four characters themselves. This means I cannot iterate through the string in a loop to check single characters, because I never know if I am in the middle of a character or not. So the questions are:

    - How do I read a single character from a string, multi-byte safe, in a performance-friendly way?
    - Is it a good idea to work with the string as if it were an array in this case?
    - How would you read the input?
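
    For contrast, the same pitfall expressed in Python, where the byte/character distinction is explicit in the types; it shows why indexing the raw bytes goes wrong and why decoding first (or, in PHP, using the mb_* functions) fixes it:

        s = "žšcr"
        b = s.encode("utf-8")

        print(len(s))  # 4 characters
        print(len(b))  # 6 bytes: 'ž' and 'š' each take two bytes in UTF-8

        print(s[0])    # 'ž' - indexing the decoded string yields whole characters
        print(b[0])    # 197 - indexing the bytes yields a lone byte value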


  • What options to parse a DTD using PHP

    - by Chadwick
    I need to parse DTDs using PHP and am hoping there's a simple library to help out. Each DTD has numerous <!ENTITY... and <!-- Comment... elements, which I need to act upon. Note that I do not need to validate anything against these DTDs, simply parse them as data files themselves. A few options I've looked at:

    - James Clarke's SD, which is an option of last resort, but I'd like to avoid the complexity of building/installing/configuring code external to PHP. I'm not sure it's even possible in my situation.
    - PEAR has an XML_DTD_Parser, which requires installing/configuring PEAR and a number of PEAR modules, which I'm also not sure is possible, and would rather avoid. Has anyone used it with success?
    - PHP XML Classes has the class_path_parser, which another site suggested, but it fails to read ENTITY elements. It appears to be using PHP's built-in XML parsing capabilities, which use EXPAT.
    - PHP's DOMDocument will validate against a DTD, so it must be able to read them, though I don't see how to get at the DTD parser directly at first glance.
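
    One trick worth knowing, sketched here in Python because its standard library exposes the relevant expat callbacks (PHP's xml extension may not surface them): wrap the DTD text in a doctype's internal subset and let expat report each entity declaration and comment as it parses. This is an illustration of the approach, not a PHP drop-in, and the file name is hypothetical:

        import xml.parsers.expat

        dtd_text = open("sample.dtd").read()             # hypothetical DTD file
        doc = "<!DOCTYPE r [\n%s\n]>\n<r/>" % dtd_text   # wrap DTD as an internal subset

        parser = xml.parsers.expat.ParserCreate()

        def entity_decl(name, is_param, value, base, system_id, public_id, notation):
            print("entity:", name, "=", value)

        parser.EntityDeclHandler = entity_decl
        parser.CommentHandler = lambda data: print("comment:", data.strip())
        parser.Parse(doc, True)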



  • Java remove HTML from String without regular expressions

    - by behrk2
    Hello, I am trying to remove all HTML elements from a String. Unfortunately, I cannot use regular expressions because I am developing on the BlackBerry platform and regular expressions are not yet supported. Is there any other way that I can remove HTML from a string? I read somewhere that you can use a DOM parser, but I couldn't find much on it.

    Text with HTML:

        <![CDATA[As a massive asteroid hurtles toward Earth, NASA head honcho Dan Truman (<a href="http://www.netflix.com/RoleDisplay/Billy_Bob_Thornton/20000303">Billy Bob Thornton</a>) hatches a plan to split the deadly rock in two before it annihilates the entire planet, calling on Harry Stamper (<a href="http://www.netflix.com/RoleDisplay/Bruce_Willis/99786">Bruce Willis</a>) -- the world's finest oil driller -- to head up the mission. With time rapidly running out, Stamper assembles a crack team and blasts off into space to attempt the treacherous task. <a href="http://www.netflix.com/RoleDisplay/Ben_Affleck/20000016">Ben Affleck</a> and <a href="http://www.netflix.com/RoleDisplay/Liv_Tyler/162745">Liv Tyler</a> co-star.]]>

    Text without HTML:

        As a massive asteroid hurtles toward Earth, NASA head honcho Dan Truman (Billy Bob Thornton) hatches a plan to split the deadly rock in two before it annihilates the entire planet, calling on Harry Stamper (Bruce Willis) -- the world's finest oil driller -- to head up the mission. With time rapidly running out, Stamper assembles a crack team and blasts off into space to attempt the treacherous task. Ben Affleck and Liv Tyler co-star.

    Thanks!
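
    Since only the tags need to go, a single character scan that tracks whether the cursor is inside a tag does the job with no regex support at all. A sketch of the idea in Python; the same loop transliterates directly to a Java StringBuffer on BlackBerry. It deliberately ignores edge cases such as '<' inside attribute values or HTML entities:

        def strip_tags(text):
            out = []
            in_tag = False
            for ch in text:
                if ch == '<':
                    in_tag = True    # entering markup: stop copying
                elif ch == '>':
                    in_tag = False   # leaving markup: resume copying
                elif not in_tag:
                    out.append(ch)   # plain text survives
            return ''.join(out)

        print(strip_tags('<a href="x">Billy Bob Thornton</a> co-stars.'))
        # Billy Bob Thornton co-stars.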


  • What is the correct way to handle object which is instance of class in apache.xerces?

    - by Roman
    Preface: I'm working on a docx parser for Java. The docx format is based on XML. When I read a document, its parts are unmarshalled (with JAXB), and I get a tree of certain elements based on the XML markup.

    Almost the problem: but some elements (which are at a very deep XML level) are returned not as a specific class from the docx spec (i.e. CTStyle, CTDrawing, CTInline etc.) but as Object. Those objects are in fact instances of Xerces classes, e.g. ElementNSImpl.

    Problem: how should I handle these objects? The simplest approach is:

        CTGraphicData gData = getGraphicData();
        Object obj = gData.getAny().get(0);
        ElementNSImpl element = (ElementNSImpl) obj;

    But it doesn't seem to be a good solution. I've never worked with Xerces directly. What is the better way to do this casting? (If anyone could also give me a tip about the right way to iterate through the nodes, that would be great.)


  • Extracting email addresses in an html block in ruby/rails

    - by corroded
    I am creating a parser that wards off against spamming and harvesting of emails from a block of text that comes from tinyMCE (so it may or may not have HTML tags in it). I've tried regexes and so far this has been successful:

        /\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b/i

    The problem is, I need to ignore all email addresses with mailto hrefs. For example:

        <a href="mailto:[email protected]">[email protected]</a>

    should only return the second email add. To get a background of what I'm doing, I'm reversing the email addresses in a block, so the above example would look like this:

        <a href="mailto:[email protected]">moc.liam@tset</a>

    The problem with my current regex is that it also replaces the one in the href. Is there a way for me to do this with a single regex? Or do I have to check for one then the other? Is there a way for me to do this just by using gsub, or do I have to use some nokogiri/hpricot magicks and whatnot to parse the mailtos? Thanks in advance!

    Here were my references btw:
    so.com/questions/504860/extract-email-addresses-from-a-block-of-text
    so.com/questions/1376149/regexp-for-extracting-a-mailto-address
    I'm also testing using this: http://rubular.com/

    edit: here's my current helper code:

        def email_obfuscator(text)
          text.gsub(/\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b/i) { |m|
            m = "<span class='anti-spam'>#{m.reverse}</span>"
          }
        end

    which results in this:

        <a target="_self" href="mailto:<span class='anti-spam'>moc.liamg@tset</span>"><span class="anti-spam">moc.liamg@tset</span></a>


  • Parsing basic math equations for children's educational software?

    - by Simucal
    Inspired by a recent TED talk, I want to write a small piece of educational software. The researcher created little miniature computers in the shape of blocks called "Siftables". [David Merril, inventor - with Siftables in the background.] There were many applications he used the blocks in, but my favorite was when each block was a number or basic operation symbol. You could then re-arrange the blocks of numbers or operation symbols in a line, and it would display an answer on another Siftable block.

    So, I've decided I want to implement a software version of "Math Siftables" on a limited scale as my final project for a CS course I'm taking.

    What is the generally accepted way for parsing and interpreting a string of math expressions, and, if they are valid, performing the operation? Is this a case where I should implement a full parser/lexer? I would imagine interpreting basic math expressions would be a semi-common problem in computer science, so I'm looking for the right way to approach this.

    For example, if my Math Siftable blocks were arranged like:

        [1] [+] [2]

    this would be a valid sequence and I would perform the necessary operation to arrive at "3". However, if the child were to drag several operation blocks together such as:

        [2] [\] [\] [5]

    it would obviously be invalid.

    Ultimately, I want to be able to parse and interpret any number of chains of operations with the blocks that the user can drag together. Can anyone explain to me or point me to resources for parsing basic math expressions? I'd prefer as much of a language-agnostic answer as possible.
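
    For a single line of blocks with no parentheses, a full parser/lexer is arguably overkill: the valid sequences are exactly "number, operator, number, operator, number, ...", which can be checked and folded left to right in one pass. A minimal sketch in Python; it evaluates strictly left to right (no operator precedence), which matches how the blocks are physically lined up, and precedence could be layered on later with, say, the shunting-yard algorithm:

        OPS = {
            '+': lambda a, b: a + b,
            '-': lambda a, b: a - b,
            '*': lambda a, b: a * b,
            '/': lambda a, b: a / b,
        }

        def evaluate(blocks):
            """Return the value of a block sequence, or None if it is invalid."""
            if len(blocks) % 2 == 0:        # valid chains always have odd length
                return None
            try:
                result = float(blocks[0])
                for op, num in zip(blocks[1::2], blocks[2::2]):
                    result = OPS[op](result, float(num))
                return result
            except (KeyError, ValueError, ZeroDivisionError):
                return None                 # unknown operator, misplaced block, or /0

        print(evaluate(['1', '+', '2']))       # 3.0
        print(evaluate(['2', '/', '/', '5']))  # None: two operators in a row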


  • How do I get an overview and a methodology for programming in Python

    - by Peter Nielsen
    I've started to learn Python and programming from scratch. I have not programmed before, so it's a new experience. I do seem to grasp most of the concepts, from variables to definitions and modules, but I still need to learn a lot more about what the different libraries and modules do, and I also lack knowledge of OOP and classes in Python. I see people who just program in Python like that's all they have ever done, and I am still just coming to grips with it. Is there a way, some tools, or a logical methodology that would give me an overview or a good hold on how to handle programming problems? For instance, I'm trying to create a parser which we need at the office. I also need to create a spider that would collect links from various websites. Is there a methodical way of studying the various modules to see what is needed? Or is it just nose to the grindstone, understanding what the documentation says? Sorry for the lengthy question.


  • Theory of formal languages - Automaton

    - by dader51
    Hi everybody! I'm wondering about formal languages. I have a kind of parser: it reads an XML-like serialized tree structure and turns it into a multidimensional array. I figured out that I need at least three variables to achieve the job:

        $tree = array();          // a new array
        $pTree = array(&$tree);   // a new array whose first element points to $tree
        $deep = 0;

    plus the one containing the sentence split into words. My point is about the similarities between the algorithm being used and the different kinds of automata (state machines, Turing machines, stack machines, ...). The $words variable is the "tape" of the automaton, the tests/conditions of the algorithm are the transitions, $deep is the state and $tree is the output. I cannot figure out what $pTree is. So the question is: which automaton am I implicitly using here, and to which formal-language family does it fit? And what about recursion?
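
    A reading that may help place it: matching nested open/close tags is the canonical context-free (not regular) job, so the machine being imitated is a pushdown automaton, and $pTree is its stack - the pile of "where do I return to when this subtree closes" references, with $deep as the stack height. A Python sketch with that stack made explicit; the token format is invented for illustration:

        def parse(tokens):
            # tokens like ["<a>", "hello", "<b>", "x", "</b>", "</a>"]
            root = {"tag": None, "children": []}
            stack = []                   # the pushdown store: parents awaiting a close tag
            node = root
            for tok in tokens:
                if tok.startswith("</"):
                    node = stack.pop()   # pop: climb back to the parent
                elif tok.startswith("<"):
                    child = {"tag": tok.strip("<>/"), "children": []}
                    node["children"].append(child)
                    stack.append(node)   # push: remember where we were
                    node = child
                else:
                    node["children"].append(tok)
            return root

        print(parse(["<a>", "hello", "<b>", "x", "</b>", "</a>"]))

    Recursion is the same machine in disguise: a recursive-descent parser uses the call stack where this version uses an explicit one.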


  • Computing complex math equations in python

    - by dassouki
    Are there any libraries or techniques that simplify computing equations? Take the following two examples:

        F = B * { [ a * b * sumOf(A / B ''' for all i ''') ] / [ sumOf(c * d * j) ] }

    where:

    - F = cost from i to j
    - B, a, b, c, d, j are all vectors in the format [ [zone_i, zone_j, cost_of_i_to_j], [..] ]

    This should produce a vector F = [ [1, 2, F_1_2], ..., [i, j, F_i_j] ].

        T_ij = [ P_i * A_j * F_i_j ] / [ SumOf[ A_j * F_i_j ] // j = 1 to j = n ]

    where:

    - n is the number of zones
    - T = vector [ [1, 2, A_1_2, P_1_2], ..., [i, j, A_i_j, P_i_j] ]
    - F = vector [ [1, 2, F_1_2], ..., [i, j, F_i_j] ]

    so P_i would be the sum of all P_i_j for all j, and A_j would be the sum of all P_j for all i.

    I'm not sure what I'm looking for, but perhaps a parser for these equations, or methods to deal with multiple multiplications and products between vectors? To calculate some of the factors, for example A_j, this is what I use:

        from collections import defaultdict

        A_j_dict = defaultdict(float)
        for A_item in TG:
            A_j_dict[A_item[1]] += A_item[3]

    Although this works fine, I really feel that it is a brute-force/hacking method and unmaintainable in the case where we want to add more variables or parameters. Are there any math equation parsers you'd recommend?

    Side note: these equations are used to model travel. Currently I use Excel to solve a lot of these equations, and I find that process to be daunting. I'd rather move to Python, where it pulls the data directly from our database (postgres) and outputs the results into the database. All that is figured out. I'm just struggling with evaluating the equations themselves. Thanks :)
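
    Rather than parsing the equations as text, one maintainable route is to treat each vector as a table keyed by (i, j) and let a dataframe library do the grouped sums, so each formula stays one or two lines. A sketch of the T_ij formula with pandas; the numbers are made up and the column names are assumptions, not anything from the poster's data:

        import pandas as pd

        # F_ij with columns i, j, F; A_j keyed by j; P_i keyed by i (toy values).
        F = pd.DataFrame({"i": [1, 1, 2], "j": [2, 3, 3], "F": [0.5, 0.2, 0.9]})
        A = pd.DataFrame({"j": [2, 3], "A": [10.0, 20.0]})
        P = pd.DataFrame({"i": [1, 2], "P": [5.0, 7.0]})

        t = F.merge(A, on="j").merge(P, on="i")
        t["AF"] = t.A * t.F

        # T_ij = P_i * A_j * F_ij / sum_j(A_j * F_ij); the denominator is a per-i sum.
        t["T"] = t.P * t.AF / t.groupby("i")["AF"].transform("sum")
        print(t[["i", "j", "T"]])

    The per-i transform("sum") plays the same role as the defaultdict loop above, and adding a variable means adding a column rather than another hand-rolled accumulator.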


  • XMLStreamReader and a real stream

    - by Yuri Ushakov
    Update: there is no ready XML parser in the Java community which can do NIO and XML parsing. This is the closest I found, and it's incomplete: http://wiki.fasterxml.com/AaltoHome

    I have the following code:

        InputStream input = ...;
        XMLInputFactory xmlInputFactory = XMLInputFactory.newInstance();
        XMLStreamReader streamReader = xmlInputFactory.createXMLStreamReader(input, "UTF-8");

    The question is, why does the method #createXMLStreamReader() expect to have an entire XML document in the input stream? Why is it called a "stream reader" if it can't seem to process a portion of XML data? For example, if I feed:

        <root>
            <child>

    to it, it would tell me I'm missing the closing tags - even before I begin iterating the stream reader itself. I suspect that I just don't know how to use an XMLStreamReader properly. I should be able to supply it with data by pieces, right? I need it because I'm processing an XML stream coming in from a network socket, and don't want to load the whole source text into memory.

    Thank you for your help, Yuri.


  • How do I recover from pushing a gitosis.conf file with parsing errors due to line breaks?

    - by Kasia
    I have successfully set up gitosis for an Android mirror (containing multiple git repositories). While adding a new .git path following writable= in gitosis.conf, I managed to insert a few line breaks. I saved, committed and pushed to the server, when I received the following parsing error:

        Traceback (most recent call last):
          File "/usr/bin/gitosis-run-hook", line 8, in <module>
            load_entry_point('gitosis==0.2', 'console_scripts', 'gitosis-run-hook')()
          File "/usr/lib/python2.5/site-packages/gitosis-0.2-py2.5.egg/gitosis/app.py", line 24, in run
            return app.main()
          File "/usr/lib/python2.5/site-packages/gitosis-0.2-py2.5.egg/gitosis/app.py", line 38, in main
            self.handle_args(parser, cfg, options, args)
          File "/usr/lib/python2.5/site-packages/gitosis-0.2-py2.5.egg/gitosis/run_hook.py", line 75, in handle_args
            post_update(cfg, git_dir)
          File "/usr/lib/python2.5/site-packages/gitosis-0.2-py2.5.egg/gitosis/run_hook.py", line 33, in post_update
            cfg.read(os.path.join(export, '..', 'gitosis.conf'))
          File "/usr/lib/python2.5/ConfigParser.py", line 267, in read
            self._read(fp, filename)
          File "/usr/lib/python2.5/ConfigParser.py", line 490, in _read
            raise e
        ConfigParser.ParsingError: File contains parsing errors: ./gitosis-export/../gitosis.conf
        (...)

    I have removed the line break and amended the commit with:

        git commit -m "fix linebreak" --amend

    However, git push still yields the exact same error. It leads me to believe gitosis is preventing me from doing any further pushes. How do I recover from this?


  • Display a tree dijit using zend framework

    - by churris43
    I am trying to display a tree of categories and subcategories by using dijits with Zend Framework. I haven't been able to find a good example. This is what I've got. Basically, I have the following code as my action:

        class SubcategoriesController extends Zend_Controller_Action
        {
            // .....
            public function loadtreeAction()
            {
                Zend_Dojo::enableView($this->view);
                Zend_Layout::getMvcInstance()->disableLayout();

                // Creating a sample tree of categories and subcategories
                $a["cat1"]["id"] = "id1";
                $a["cat1"]["name"] = "Category1";
                $a["cat1"]["type"] = "category";
                $subcat1 = array("id" => "Subcat1", "name" => "Subcategory1", "type" => "subcategory");
                $subcat2 = array("id" => "Subcat12", "name" => "Subcategory12", "type" => "subcategory");
                $a["cat1"]["children"] = array($subcat1, $subcat2);

                $treeObj = new Zend_Dojo_Data('id', $a);
                $treeObj->setLabel('name');
                $this->view->tree = $treeObj->toJson();
            }
            // ....
        }

    And on my view:

        <?php
        $this->dojo()->requireModule('dojo.data.ItemFileReadStore');
        $this->dojo()->requireModule('dijit.Tree');
        $this->dojo()->requireModule('dojo.parser');
        ?>
        <div dojoType="dojo.data.ItemFileReadStore" url="/Subcategories/loadtree" jsId="store"></div>
        <div dojoType="dijit.tree.ForestStoreModel" jsId="treeModel" store="store" rootId="root"
             rootLabel="List of Categories" childrenAttrs="children" query="{type:'category'}"></div>
        <div dojoType="dijit.Tree" model="treeModel" labelAttrs="ListOfCategories"></div>

    It doesn't even seem to try to load the tree at all. Any help is appreciated.

