Search Results

Search found 70351 results on 2815 pages for 'file parsing'.

  • Parsing large delimited files with dynamic number of columns

    - by annelie
    Hi, what would be the best approach to parse a delimited file when the columns are unknown before parsing the file? The file format is Rightmove v3 (.blm); the structure looks like this:

        #HEADER#
        Version : 3
        EOF : '^'
        EOR : '~'
        #DEFINITION#
        AGENT_REF^ADDRESS_1^POSTCODE1^MEDIA_IMAGE_00~   // can be any number of columns
        #DATA#
        agent1^the address^the postcode^an image~
        agent2^the address^the postcode^^~   // the records have to have the same number of columns as specified in the definition, however they can be empty etc
        #END#

    The files can potentially be very large: the example file I have is 40 MB, but they could be several hundred megabytes. Below is the code I had started on before I realised the columns were dynamic. I'm opening a FileStream, as I read that was the best way to handle large files. I'm not sure my idea of putting every record in a list and then processing it is any good though; I don't know if that will work with such large files.

        List<string> recordList = new List<string>();
        try
        {
            using (FileStream fs = new FileStream(path, FileMode.Open, FileAccess.Read))
            {
                StreamReader file = new StreamReader(fs);
                string line;
                while ((line = file.ReadLine()) != null)
                {
                    string[] records = line.Split('~');
                    foreach (string item in records)
                    {
                        if (item != String.Empty)
                        {
                            recordList.Add(item);
                        }
                    }
                }
            }
        }
        catch (FileNotFoundException ex)
        {
            Console.WriteLine(ex.Message);
        }

        foreach (string r in recordList)
        {
            Property property = new Property();
            string[] fields = r.Split('^');
            // can't do this as I don't know which field is the post code
            property.PostCode = fields[2];
            // etc
            propertyList.Add(property);
        }

    Any ideas of how to do this better? It's C# 3.0 and .NET 3.5 if that helps. Thanks, Annelie
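
    One possible direction (a rough sketch, not production code): read the column list from the #DEFINITION# section first, build a name-to-index map, and stream the #DATA# records one at a time instead of buffering them all. The sketch assumes the column list sits on the line after the #DEFINITION# marker and that each record fits on one line terminated by '~'; both assumptions would need checking against the real .blm specification.

        using System.Collections.Generic;
        using System.IO;

        static class BlmReader
        {
            public static IEnumerable<Dictionary<string, string>> Read(string path)
            {
                using (StreamReader reader = new StreamReader(path))
                {
                    string[] columns = null;
                    bool inData = false;
                    string line;
                    while ((line = reader.ReadLine()) != null)
                    {
                        if (line.StartsWith("#DEFINITION#"))
                            columns = reader.ReadLine().TrimEnd('~').Split('^');
                        else if (line.StartsWith("#DATA#"))
                            inData = true;
                        else if (line.StartsWith("#END#"))
                            yield break;
                        else if (inData && columns != null && line.Length > 0)
                        {
                            string[] fields = line.TrimEnd('~').Split('^');
                            var record = new Dictionary<string, string>();
                            for (int i = 0; i < columns.Length && i < fields.Length; i++)
                                record[columns[i]] = fields[i];
                            // Fields are looked up by name, e.g. record["POSTCODE1"],
                            // so the column order no longer matters.
                            yield return record;
                        }
                    }
                }
            }
        }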

  • Android - Parsing XML with XPath

    - by Ruben Deig Ramos
    First of all, thanks to everyone who's going to spend a little time on this question. Second, sorry for my English (not my first language! :D). Well, here is my problem. I'm learning Android and I'm making an app which uses an XML file to store some info. I have no problem creating the file, but when trying to read the XML tags with XPath (DOM, XMLPullParser, etc. only gave me problems), I've only been able to read the first one. Let's see the code. Here is the XML file the app generates:

        <dispositivo>
            <id>111</id>
            <nombre>Name</nombre>
            <intervalo>300</intervalo>
        </dispositivo>

    And here is the function which reads the XML file:

        private void leerXML() {
            try {
                XPathFactory factory = XPathFactory.newInstance();
                XPath xPath = factory.newXPath();
                // Load the XML into memory
                File xmlDocument = new File("/data/data/com.example.gps/files/devloc_cfg.xml");
                InputSource inputSource = new InputSource(new FileInputStream(xmlDocument));
                // Define the expressions used to find each value
                XPathExpression tag_id = xPath.compile("/dispositivo/id");
                String valor_id = tag_id.evaluate(inputSource);
                id = valor_id;
                XPathExpression tag_nombre = xPath.compile("/dispositivo/nombre");
                String valor_nombre = tag_nombre.evaluate(inputSource);
                nombre = valor_nombre;
            } catch (Exception e) {
                e.printStackTrace();
            }
        }

    The app correctly gets the id value and shows it on the screen (the "id" and "nombre" variables are each assigned to a TextView), but "nombre" is not working. What should I change? :) Thanks for all your time and help. This site is quite helpful! PS: I've searched the whole site for an answer but didn't find any.
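
    A likely culprit (an assumption, not verified against this exact code): the InputSource wraps a FileInputStream, and the first evaluate() consumes that stream, so the second expression has nothing left to read. A sketch of the usual fix is to parse the file into a DOM Document once and evaluate every expression against that document:

        // requires: javax.xml.parsers.DocumentBuilderFactory, javax.xml.xpath.*,
        //           org.w3c.dom.Document, java.io.File
        private void leerXML() {
            try {
                Document doc = DocumentBuilderFactory.newInstance()
                        .newDocumentBuilder()
                        .parse(new File("/data/data/com.example.gps/files/devloc_cfg.xml"));
                XPath xPath = XPathFactory.newInstance().newXPath();
                // Both queries run against the in-memory document,
                // so the underlying file is only read once.
                id = xPath.evaluate("/dispositivo/id", doc);
                nombre = xPath.evaluate("/dispositivo/nombre", doc);
            } catch (Exception e) {
                e.printStackTrace();
            }
        }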

  • When creating a new text file, should I add a .txt extension to its name?

    - by Agmenor
    When I create a new document meant to contain only plain text, Ubuntu does not oblige me to add a .txt extension to its name. It works very well indeed: gedit opens it without problems, understanding perfectly well that it is only text. The only two arguments in favour of adding an extension I have found so far are 1/ interoperability with Windows systems and 2/ avoiding confusion with folders having the same name. Nevertheless, those two arguments do not convince me at all. So, should I keep the habit of adding an extension to file names or not?

  • PDF parsing file trailer

    - by Ralph
    It is not clear from the PDF ISO standard document (PDF32000-2008) whether a comment may follow the startxref keyword:

        startxref
        Byte_offset_of_last_cross-reference_section
        %%EOF

    The standard does seem to imply that comments may appear anywhere:

        7.2.3 Comments
        Any occurrence of the PERCENT SIGN (25h) outside a string or stream introduces a comment.
        The comment consists of all characters after the PERCENT SIGN and up to but not including
        the end of the line, including regular, delimiter, SPACE (20h), and HORIZONTAL TAB
        characters (09h). A conforming reader shall ignore comments, and treat them as single
        white-space characters. That is, a comment separates the token preceding it from the one
        following it.
        EXAMPLE  The PDF fragment in this example is syntactically equivalent to just the tokens
        abc and 123.
            abc% comment ( /%) blah blah blah
            123
        Comments (other than the %PDF-n.m and %%EOF comments described in 7.5, "File Structure")
        have no semantics. They are not necessarily preserved by applications that edit PDF files.

    If comments are allowed to appear after startxref, parsing the file becomes more difficult, because you do not know how far to back up from the %%EOF comment to start parsing in order to find the byte offset. Any ideas?
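
    One common reading (an assumption on my part, not a quote from the standard) is to not measure backwards from %%EOF at all: read a small window from the end of the file, locate the startxref keyword itself, and take the first non-comment token after it as the offset. A rough C# sketch:

        using System;
        using System.IO;
        using System.Text;

        static long FindStartXrefOffset(string path)
        {
            using (FileStream fs = new FileStream(path, FileMode.Open, FileAccess.Read))
            {
                // 1024 bytes is the tail window many readers use in practice.
                int window = (int)Math.Min(1024, fs.Length);
                fs.Seek(-window, SeekOrigin.End);
                byte[] buffer = new byte[window];
                fs.Read(buffer, 0, window);
                string tail = Encoding.ASCII.GetString(buffer);

                int keyword = tail.LastIndexOf("startxref", StringComparison.Ordinal);
                if (keyword < 0)
                    throw new InvalidDataException("startxref not found near end of file");

                // Take the first non-empty, non-comment line after the keyword;
                // anything after a '%' on a line is discarded as a comment.
                foreach (string raw in tail.Substring(keyword + "startxref".Length).Split('\n'))
                {
                    string line = raw.Split('%')[0].Trim();
                    if (line.Length > 0)
                        return long.Parse(line);
                }
                throw new InvalidDataException("no byte offset after startxref");
            }
        }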

  • Parsing basic math equations for children's educational software?

    - by Simucal
    Inspired by a recent TED talk, I want to write a small piece of educational software. The researcher created little miniature computers in the shape of blocks called "Siftables". [David Merrill, inventor, with Siftables in the background.] There were many applications he used the blocks in, but my favorite was when each block was a number or a basic operation symbol. You could then re-arrange the blocks of numbers or operation symbols in a line, and it would display an answer on another Siftable block. So, I've decided I want to implement a software version of "Math Siftables" on a limited scale as my final project for a CS course I'm taking. What is the generally accepted way to parse and interpret a string of math expressions and, if they are valid, perform the operation? Is this a case where I should implement a full parser/lexer? I would imagine interpreting basic math expressions is a fairly common problem in computer science, so I'm looking for the right way to approach this. For example, if my Math Siftable blocks were arranged like:

        [1] [+] [2]

    this would be a valid sequence, and I would perform the necessary operation to arrive at "3". However, if the child were to drag several operation blocks together, such as:

        [2] [\] [\] [5]

    it would obviously be invalid. Ultimately, I want to be able to parse and interpret any number of chains of operations with the blocks that the user can drag together. Can anyone explain to me or point me to resources for parsing basic math expressions? I'd prefer as much of a language-agnostic answer as possible.
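
    Since each block is already a separate token, there is no lexing left to do; validating and evaluating the token list is enough, and a full parser generator would likely be overkill. Below is one hedged sketch, written in C# only because other entries on this page use it (the idea is language-agnostic): check that the blocks alternate number/operator, then evaluate with the usual two precedence levels (multiplication and division first, then addition and subtraction). A full shunting-yard or recursive-descent parser only becomes necessary if parentheses are added later.

        using System;
        using System.Collections.Generic;

        static class BlockMath
        {
            static readonly HashSet<string> Operators = new HashSet<string> { "+", "-", "*", "/" };

            // Returns null when the block sequence is not a valid expression.
            public static double? Evaluate(IList<string> blocks)
            {
                // A valid line alternates number, operator, number, ... and ends on a number.
                if (blocks.Count == 0 || blocks.Count % 2 == 0) return null;
                for (int i = 0; i < blocks.Count; i++)
                {
                    double ignored;
                    bool isNumber = double.TryParse(blocks[i], out ignored);
                    if (i % 2 == 0 && !isNumber) return null;
                    if (i % 2 == 1 && !Operators.Contains(blocks[i])) return null;
                }

                // First pass: fold * and / into the value on their left.
                var values = new List<double> { double.Parse(blocks[0]) };
                var pending = new List<string>();
                for (int i = 1; i < blocks.Count; i += 2)
                {
                    double right = double.Parse(blocks[i + 1]);
                    if (blocks[i] == "*") values[values.Count - 1] *= right;
                    else if (blocks[i] == "/") values[values.Count - 1] /= right;
                    else { pending.Add(blocks[i]); values.Add(right); }
                }

                // Second pass: apply + and - left to right.
                double result = values[0];
                for (int i = 0; i < pending.Count; i++)
                    result = pending[i] == "+" ? result + values[i + 1] : result - values[i + 1];
                return result;
            }
        }

        // e.g. BlockMath.Evaluate(new[] { "1", "+", "2" })      -> 3
        //      BlockMath.Evaluate(new[] { "2", "/", "/", "5" }) -> null (invalid)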

  • MalformedURLException with file URI

    - by Paul Reiners
    While executing the following code:

        doc = builder.parse(file);

    where doc is an instance of org.w3c.dom.Document and builder is an instance of javax.xml.parsers.DocumentBuilder, I'm getting the following exception:

        Exception in thread "main" java.net.MalformedURLException: unknown protocol: c
            at java.net.URL.<init>(Unknown Source)
            at java.net.URL.<init>(Unknown Source)
            at java.net.URL.<init>(Unknown Source)
            at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.setupCurrentEntity(Unknown Source)
            at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.startEntity(Unknown Source)
            at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.startDTDEntity(Unknown Source)
            at com.sun.org.apache.xerces.internal.impl.XMLDTDScannerImpl.setInputSource(Unknown Source)
            at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$DTDDriver.dispatch(Unknown Source)
            at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$DTDDriver.next(Unknown Source)
            at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$PrologDriver.next(Unknown Source)
            at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(Unknown Source)
            at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
            at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
            at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
            at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(Unknown Source)
            at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(Unknown Source)
            at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(Unknown Source)
            at javax.xml.parsers.DocumentBuilder.parse(Unknown Source)
            at com.acme.ItemToThetaValues.createFiles(ItemToThetaValues.java:47)

    It's choking on this line of the file:

        <!DOCTYPE questestinterop SYSTEM "C:\Program Files\Acme\parsers\acme_full.dtd">

    I am not getting this error on my machine, while a user is getting it on his machine. We are both using version 6 of the Sun JRE. The error also occurs when he uses double backslashes in the path instead of single backslashes, and when he uses forward slashes instead of backslashes. First of all, is the XML correct? Is the path expressed correctly? Second of all, why is this error occurring on one computer but not on another?
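
    A hedged explanation: a SYSTEM identifier is resolved as a URI, so "C:\Program Files\..." gets read as a URL whose scheme is the unknown protocol "c"; whether that blows up may depend on the exact parser version doing the resolution on each machine. A sketch of a workaround that sidesteps the guessing is to resolve the DTD reference explicitly (the DTD path below is the one from the DOCTYPE and is assumed to exist on the user's machine):

        import java.io.File;
        import javax.xml.parsers.DocumentBuilder;
        import javax.xml.parsers.DocumentBuilderFactory;
        import org.w3c.dom.Document;
        import org.xml.sax.EntityResolver;
        import org.xml.sax.InputSource;

        static Document parseWithLocalDtd(File file) throws Exception {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            builder.setEntityResolver(new EntityResolver() {
                public InputSource resolveEntity(String publicId, String systemId) {
                    // Hand Xerces a proper file: URI instead of letting it
                    // interpret the raw Windows path as a URL.
                    File dtd = new File("C:\\Program Files\\Acme\\parsers\\acme_full.dtd");
                    return new InputSource(dtd.toURI().toString());
                }
            });
            return builder.parse(file);
        }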

  • Parsing HTTP - Bytes.length != String.length

    - by hotzen
    Hello, I consume HTTP via nio.SocketChannel, so I get chunks of data as Array[Byte]. I want to feed these chunks into a parser and continue parsing after each chunk has been fed in. HTTP itself seems to use an ISO-8859-1 charset, but the payload/body may be arbitrarily encoded: if the HTTP Content-Length specifies X bytes, the UTF-8-decoded body may have far fewer characters (one character may be represented in UTF-8 by two bytes, etc.). So what is a good parsing strategy that honors an explicitly specified Content-Length and/or a Transfer-Encoding: chunked header, which specifies a chunk length to be honored?

    1. Append each data chunk to a mutable.ArrayBuffer[Byte], search for CRLF in the bytes, decode everything from 0 until the CRLF to a String, and match it with regular expressions like StatusRegex, HeaderRegex, etc.?
    2. Decode each data chunk with the proper charset (e.g. ISO-8859-1, UTF-8, etc.) and append it to a StringBuilder. With this solution I am not able to honor any Content-Length or chunk size, but... do I have to care about that?
    3. Any other solution?
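
    For what it's worth, a byte-first sketch of option 1 (assumptions: headers end at the first CRLFCRLF, only Content-Length is handled, chunked transfer encoding is left out, and the buffer is not drained between messages):

        import scala.collection.mutable.ArrayBuffer

        object ByteFirstParser {
          private val pending = new ArrayBuffer[Byte]
          private val HeaderSep = "\r\n\r\n".getBytes("ISO-8859-1")
          private val ContentLength = """(?i)Content-Length:\s*(\d+)""".r

          // Feed one chunk; returns Some((headerBlock, bodyBytes)) once a full message is buffered.
          def onChunk(chunk: Array[Byte]): Option[(String, Array[Byte])] = {
            pending ++= chunk
            val headerEnd = pending.indexOfSlice(HeaderSep)
            if (headerEnd < 0) {
              None                                    // headers not complete yet
            } else {
              val headers = new String(pending.take(headerEnd).toArray, "ISO-8859-1")
              val length = ContentLength.findFirstMatchIn(headers).map(_.group(1).toInt).getOrElse(0)
              val bodyStart = headerEnd + HeaderSep.length
              if (pending.length - bodyStart < length) {
                None                                  // body not complete yet
              } else {
                val body = pending.slice(bodyStart, bodyStart + length).toArray
                Some((headers, body))                 // decode body with the charset from Content-Type
              }
            }
          }
        }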

  • Need some ideas on how to accomplish this in Java (parsing strings)

    - by Matt
    Sorry I couldn't think of a better title, but thanks for reading! My ultimate goal is to read a .java file, parse it, pull out every identifier, and store them all in a list. Two preconditions are that there are no comments in the file and that all identifiers are composed of letters only. Right now I can read the file, split it on spaces, and store everything in a list. If anything in the list is a Java reserved word, it is removed. I also remove any loose symbols that are not attached to anything (brackets and arithmetic symbols). Now I am left with a bunch of weird strings, but at least they have no spaces in them. I know I am going to have to re-split everything on a . delimiter in order to pull out identifiers like System.out.print, but what about strings like this example:

        Logger.getLogger(MyHash.class.getName()).log(Level.SEVERE,

    After re-splitting on . I will be left with more crazy strings like:

        getLogger(MyHash getName()) log(Level SEVERE,

    How am I going to be able to pull out all the identifiers while leaving out all the trash? Just keep re-splitting on every symbol that could exist in Java code? That seems rather lame and time-consuming, and I am not even sure it would work completely. So, can you suggest a better way of doing this?
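
    Given the stated precondition that identifiers are letters only, one hedged alternative to repeated splitting is to flip the problem around: match every maximal run of letters with a regular expression and discard the reserved words. (String literals, if any, would still need stripping first; that is not handled in this sketch.)

        import java.util.ArrayList;
        import java.util.List;
        import java.util.Set;
        import java.util.regex.Matcher;
        import java.util.regex.Pattern;

        static List<String> identifiers(String source, Set<String> reservedWords) {
            List<String> result = new ArrayList<String>();
            Matcher m = Pattern.compile("[A-Za-z]+").matcher(source);
            while (m.find()) {
                String word = m.group();
                // Keep each run of letters unless it is a keyword such as "public" or "class".
                if (!reservedWords.contains(word)) {
                    result.add(word);
                }
            }
            return result;
        }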

  • How to remove file association in Windows 8?

    - by Chesnokov Yuriy
    I have Chrome associated with the .xlsx extension on a Windows 8.1 machine. In Control Panel\Programs\Default Programs\Set Associations it is not possible to remove an association, only to change it to another program. In Control Panel\Programs\Default Programs\Set Default Programs\Set Program Associations, .xlsx is not listed under Chrome. I removed all keys from HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\Explorer\FileExts\.xlsx, yet Chrome remains associated with that extension in Control Panel\Programs\Default Programs\Set Associations, and Windows Explorer still shows the Chrome icon for .xlsx files.

  • Parsing Complex Text File with C#

    - by David
    Hello, I need to parse a text file that has a lot of levels and characters. I've been trying different ways to parse it but I haven't been able to get anything to work. I've included a sample of the text file I'm dealing with; I have denoted the parts of the file I need with TEXTINEED. Any suggestions on how I can parse this file?

        (bean name: 'TEXTINEED context: (list '/text '/content/home/left-nav/text '/content/home/landing-page)
              type: '/text/types/text module: '/modules/TEXTINEED source: '|moretext|
        ((contents (list (list
            (bean type: '/directory/TEXTINEED ((directives (bean ((chartSize (list 600 400))
                (showCorners (list #f)) (showColHeader (list #f)) (showRowHeader (list #f)))))))
            (bean type: '/directory/TEXTINEED ((directives (bean ((displayName (list "MTD"))
                (showCorners (list #f)) (showColHeader (list #f)) (showRowLabels (list #f))
                (hideDetailedLink (list #t)) (showRowHeader (list #f)) (chartSize (list 600 400)))))))
            (bean type: '/directory/TEXTINEED ((directives (bean ((displayName (list "QTD"))
                (showCorners (list #f)) (showColHeader (list #f)) (showRowLabels (list #f))
                (hideDetailedLink (list #t)) (showRowHeader (list #f)) (chartSize (list 600 400))))))))

    Thanks!
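
    This looks like a Lisp-style s-expression dialect, so a tiny tokenizer plus a recursive parser may be easier than regular expressions against the whole file. A hedged C# sketch follows; the quoting rules (double quotes and |...| blocks) are guesses from the sample and would need adjusting. The result is nested List<object> values containing string tokens, which can then be walked to pull out the TEXTINEED parts.

        using System.Collections.Generic;
        using System.Text.RegularExpressions;

        static class SexpParser
        {
            // Tokens: parens, |...| blocks, "..." strings, or any run of non-space, non-paren chars.
            static readonly Regex Token = new Regex(@"[()]|\|[^|]*\||""[^""]*""|[^\s()]+");

            public static List<object> Parse(string text)
            {
                var tokens = new Queue<string>();
                foreach (Match m in Token.Matches(text)) tokens.Enqueue(m.Value);
                return ParseList(tokens);
            }

            static List<object> ParseList(Queue<string> tokens)
            {
                var items = new List<object>();
                while (tokens.Count > 0)
                {
                    string t = tokens.Dequeue();
                    if (t == "(") items.Add(ParseList(tokens));      // recurse into a nested list
                    else if (t == ")") return items;                 // close the current list
                    else items.Add(t);                               // plain token
                }
                return items;
            }
        }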

  • Best Practice: Apache File Upload

    - by matnagel
    I am looking for a solution that lets trusted users upload PDF files via HTML forms (with maybe PHP involved). This is quite a standard Ubuntu Linux server with Apache 2.x and PHP 5. I am wondering what the benefits of the Apache file upload module (http://commons.apache.org/fileupload) are. There have been no updates for some time; is it actively maintained? What are the advantages over a traditional PHP upload with Apache 2 and without this module? I remember that traditional PHP file upload is tricky, with some pitfalls; will the Apache file upload module improve the situation? The solution I am looking for will be part of an existing website and will be integrated into the admin web frontend. Things I am not considering are WebDAV, SSH, FTP, FTPS, and FTP over SSH. It should work with a browser and without installing special client software, so I am asking about a browser-based upload without special client-side requirements. I can require a modern browser like Firefox >= 3.5 or a modern WebKit browser like Chrome or Safari from the users.

  • Iterating through folders and files in batch file?

    - by Will Marcouiller
    Here's my situation. A project's objective is to migrate some attachments to another system. These attachments will be located under a parent folder, let's say "Folder 0" (see this question's diagram for a better understanding), and they will be zipped/compressed. I want my batch script to be called like so:

        BatchScript.bat "c:\temp\usd\Folder 0"

    I'm using 7za.exe as the command-line extraction tool. What I want my batch script to do is iterate through "Folder 0"'s subfolders and extract all of the ZIP files they contain into their respective folders. It is mandatory that the extracted files end up in the same folder as their respective ZIP files, so files contained in "File 1.zip" are needed in "Folder 1", and so forth. I have read about the FOR...DO command in the Windows XP Professional Product Documentation - Using Batch Files. Here's my script:

        @ECHO OFF
        FOR /D %folder IN (%%rootFolderCmdLnParam) DO FOR %zippedFile IN (*.zip) DO 7za.exe e %zippedFile

    I guess that I would also need to change the current directory before calling 7za.exe e %zippedFile for the extraction, but I can't figure out how to do that in this batch file (though I know how on the command line, and I know it is the same "cd" instruction). Anyone's help is gratefully appreciated.
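
    A possible corrected version (a sketch only, untested): in a batch file the loop variables need two percent signs, the command-line argument is %1 rather than a named parameter, and FOR /R can walk the whole tree in one pass. PUSHD/POPD take care of the "cd into the folder first" requirement so each archive is extracted next to itself. It assumes 7za.exe is on the PATH.

        @ECHO OFF
        REM Usage: BatchScript.bat "c:\temp\usd\Folder 0"
        FOR /R "%~1" %%Z IN (*.zip) DO (
            PUSHD "%%~dpZ"
            7za.exe e "%%Z"
            POPD
        )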

  • Windows file association for README, INSTALL, LICENSE and the like [closed]

    - by Lumi
    Possible Duplicate: How to set the default program for opening files without an extension in Windows?

    Many files originating in the UNIX world come without a file extension. Popular examples include README, INSTALL, and LICENSE. We know for a fact that these are text files. It is therefore a bit disappointing not to be able to just double-click them open in Explorer and see them in Notepad (actually, Notepad2, because of the UNIX line endings which silly Microsoft Notepad doesn't render correctly). Does anyone know of a way to create a file association for, say, README files without an extension? This could then be replicated to cover the most frequently occurring file types, and then double-clicking them open would work.

    Update (sort of in response to all your comments): Thanks, folks, your comments and answers have helped me. @Indrek, yes, I was under the assumption that you could somehow create an association for just README or Makefile, and couldn't do so for files without an extension. Turns out the contrary is true, and yes, that is a workaround that neatly solves the issue. Ultimately, I just want to be able to double-click to open a README or Makefile, that's all. @Sampo, the SendMe trick is also useful, although usability is not as great as a straight double-click. (I'm really lazy sometimes.) Turns out the following trick using assoc and ftype from an Administrator prompt does the double-click-enabling job:

        assoc .=no_ext
        ftype no_ext=%SystemRoot%\system32\NOTEPAD.EXE %1
        :: You can see it created some entries in the registry:
        reg query hkcr\no_ext /s
        reg query hkcr\. /s

  • Moving hidden files/folders with the command-line or batch-file

    - by Synetech
    Question: Does anyone know of a way to move files and folders that have the hidden, system, or read-only attribute set from the command line or a batch file? (No, stripping the attributes first is not an option, since there is no practical way to know which attributes were set in order to re-set them after the move.)

    (Failed) Attempts: Using the basic move command does not work with items that have the hidden or system attribute set, and for some reason it does not have switches to specify attributes like the dir and del commands do. I tried using a utility I wrote that uses the shell's file-operation function, but that requires using start /w to prevent the batch file from running on ahead, and it complains about long-filename support for some reason. I tried using robocopy, but it first copies the files and then deletes the originals instead of simply moving the source (which results in a frustrating delay, even with the excessive output redirected to nul). (Surprisingly, it seems that few people have ever needed to move hidden files from the command line. All I could find was this one person who abandoned the attempt.)

  • Windows network: deny file access to another user if file is currently open

    - by Steve
    Some users on my network are having difficulties saving files, because the file is open elsewhere. Let's say Lemuel wants to edit a file, but Bernice in the next office over is working on it. Lemuel opens, edits, and tries to save, but then gets a "no write access" error. Bernice chortles with glee (since earlier that week Lemuel stole her sandwich). Unfortunately, various programs will not warn the user that they have opened a read-only file. Is there a way (in Windows) to limit file access to ONE user only, i.e. 777 for the first user to open the file and 000 for all users after that? (Sorry for the Linux terminology, but it gets my point across.)

  • Exceptions with DateTime parsing in RSS feed in C#

    - by hIpPy
    I'm trying to parse RSS 2.0 and Atom feeds using the SyndicationFeedFormatter and SyndicationFeed objects, but I'm getting XmlExceptions while parsing DateTime fields like pubDate and/or lastBuildDate:

        Wed, 24 Feb 2010 18:56:04 GMT+00:00   does not work
        Wed, 24 Feb 2010 18:56:04 GMT         works

    So it's throwing because of the time-zone field. As a workaround, for familiar feeds I would manually fix those DateTime nodes: catch the XmlException, load the RSS into an XmlDocument, fix the values of those nodes, create a new XmlReader, and then return the formatter from this new XmlReader object (code not shown). But for this approach to work, I need to know beforehand which nodes cause the exception.

        SyndicationFeedFormatter syndicationFeedFormatter = null;
        XmlReaderSettings settings = new XmlReaderSettings();
        using (XmlReader reader = XmlReader.Create(url, settings))
        {
            try
            {
                syndicationFeedFormatter = SyndicationFormatterFactory.CreateFeedFormatter(reader);
                syndicationFeedFormatter.ReadFrom(reader);
            }
            catch (XmlException xexp)
            {
                // fix those datetime nodes with exceptions and read again.
            }
            return syndicationFeedFormatter;
        }

    RSS feed: http://news.google.com/news?pz=1&cf=all&ned=us&hl=en&q=test&cf=all&output=rss

    Exception details:

        XmlException
        Error in line 1 position 376. An error was encountered when parsing a DateTime value in the XML.
            at System.ServiceModel.Syndication.Rss20FeedFormatter.DateFromString(String dateTimeString, XmlReader reader)
            at System.ServiceModel.Syndication.Rss20FeedFormatter.ReadXml(XmlReader reader, SyndicationFeed result)
            at System.ServiceModel.Syndication.Rss20FeedFormatter.ReadFrom(XmlReader reader)
            at ... cs:line 171

        <rss version="2.0">
          <channel>
            ...
            <pubDate>Wed, 24 Feb 2010 18:56:04 GMT+00:00</pubDate>
            <lastBuildDate>Wed, 24 Feb 2010 18:56:04 GMT+00:00</lastBuildDate>   <----- exception
            ...
            <item>
              ...
              <pubDate>Wed, 24 Feb 2010 16:17:50 GMT+00:00</pubDate>
              <lastBuildDate>Wed, 24 Feb 2010 18:56:04 GMT+00:00</lastBuildDate>
            </item>
            ...
          </channel>
        </rss>

    Is there a better way to achieve this? Please help. Thanks.
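
    One hedged alternative that avoids knowing which nodes are broken: normalise the offending date suffix in the raw feed text before any XmlReader sees it. The sketch below only handles the zero-offset "GMT+00:00" form seen in this feed (non-zero offsets would need to be folded into the time rather than stripped), and SyndicationFeed.Load stands in for the custom formatter factory for brevity:

        using System.IO;
        using System.Net;
        using System.ServiceModel.Syndication;
        using System.Xml;

        static SyndicationFeed LoadFeed(string url)
        {
            string raw;
            using (WebClient client = new WebClient())
            {
                raw = client.DownloadString(url);
            }

            // "GMT+00:00" trips Rss20FeedFormatter's date parser; plain "GMT" is accepted.
            string cleaned = raw.Replace("GMT+00:00", "GMT");

            using (XmlReader reader = XmlReader.Create(new StringReader(cleaned)))
            {
                return SyndicationFeed.Load(reader);
            }
        }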

  • Ruby libraries for parsing .doc files?

    - by Platinum Azure
    Hi all, I was just wondering if anyone knew of any good libraries for parsing .doc files (and similar formats, like .odt) to extract text, yet also keep formatting information where possible for display on a website. Capability of doing similarly for PDFs would be a bonus, but I'm not looking as much for that. This is for a Rails project, if that helps at all. Thanks in advance!

  • Sending and Parsing JSON in Android

    - by primal
    Hi, in the application I am developing, I would like to send messages in the form of JSON objects to a Django server, parse the JSON response from the server, and populate a custom ListView. From the little JSON knowledge I have, I thought of this format for the response from the server:

        {
          "post": {
            "username": "someusername",
            "message": "this is a sweet message",
            "image": "http://localhost/someimage.jpg",
            "time": "present time"
          }
        }

    How much knowledge of JSON do I need to accomplish this? It would also be great if someone could point me to some tutorials on sending and parsing JSON objects.
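
    Not much is needed beyond the org.json classes that already ship with Android. A rough sketch of building the outgoing message and pulling the fields back out of the reply, using the field names from the example above (actually sending the request, e.g. with HttpURLConnection, is left out):

        import org.json.JSONException;
        import org.json.JSONObject;

        // Build the outgoing message ...
        static String buildMessage(String username, String message) throws JSONException {
            JSONObject post = new JSONObject();
            post.put("username", username);
            post.put("message", message);
            return new JSONObject().put("post", post).toString();
        }

        // ... and pull the fields back out of the server's reply.
        static String[] parseReply(String body) throws JSONException {
            JSONObject post = new JSONObject(body).getJSONObject("post");
            return new String[] {
                post.getString("username"),
                post.getString("message"),
                post.getString("image"),
                post.getString("time")
            };
        }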

  • Parsing HTML on the iPhone

    - by Ben Alpert
    Can anyone recommend a C or Objective-C library for HTML parsing? It needs to handle messy HTML code that won't quite validate. Does such a library exist, or am I better off just trying to use regular expressions?
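
    For what it's worth, libxml2 ships with the iPhone SDK and its HTML parser is designed to recover from markup that won't validate, which tends to beat regular expressions for this job. A minimal C sketch (assumes you link against libxml2 and add its headers to the include path):

        #include <string.h>
        #include <libxml/HTMLparser.h>

        static htmlDocPtr parse_messy_html(const char *html)
        {
            /* HTML_PARSE_RECOVER tells libxml2 to build a tree even for broken markup. */
            return htmlReadMemory(html, (int)strlen(html), NULL, "UTF-8",
                                  HTML_PARSE_RECOVER | HTML_PARSE_NOERROR | HTML_PARSE_NOWARNING);
        }

        /* Walk the resulting tree via doc->children, or query it with libxml2's
           XPath module, and free it with xmlFreeDoc(doc) when done. */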
