Search Results

Search found 186 results on 8 pages for 'newlines'.

  • How to parse a CSV file containing serialized PHP? [migrated]

    - by garbetjie
    I've just started dabbling in Perl, to try and gain some exposure to different programming languages - so forgive me if some of the following code is horrendous. I needed a quick and dirty CSV parser that could receive a CSV file and split it into batch files, each containing "X" CSV lines (taking into account that entries could contain embedded newlines). I came up with a working solution, and it was going along just fine. However, one of the CSV files that I'm trying to split contains serialized PHP data, and this seems to break the CSV parsing. As soon as I remove the serialization, the CSV file is parsed correctly. Are there any tricks I need to know when it comes to parsing serialized data in CSV files? Here is a shortened sample of the code: use strict; use warnings; my $csv = Text::CSV_XS->new({ eol => $/, always_quote => 1, binary => 1 }); my $out; my $in; open $in, "<:encoding(utf8)", "infile.csv" or die("cannot open input file $inputfile"); open $out, ">outfile.000"; binmode($out, ":utf8"); while (my $line = $csv->getline($in)) { $lines++; $csv->print($out, $line); } I'm never able to get into the while loop shown above. As soon as I remove the serialized data, I suddenly am able to get into the loop. 
Edit: An example of a line that is causing me trouble (taken straight from Vim - hence the ^M): "26","other","1","20,000 Subscriber Plan","Some text here.^M\ Some more text","on","","18","","0","","0","0","recurring","0","","payment","totalsend","0","tsadmin","R34bL9oq","37","0","0","","","","","","","","","","","","","","","","","","","","","","","0","0","0","a:18:{i:0;s:1:\"3\";i:1;s:1:\"2\";i:2;s:2:\"59\";i:3;s:2:\"60\";i:4;s:2:\"61\";i:5;s:2:\"62\";i:6;s:2:\"63\";i:7;s:2:\"64\";i:8;s:2:\"65\";i:9;s:2:\"66\";i:10;s:2:\"67\";i:11;s:2:\"68\";i:12;s:2:\"69\";i:13;s:2:\"70\";i:14;s:2:\"71\";i:15;s:2:\"72\";i:16;s:2:\"73\";i:17;s:2:\"74\";}","","","0","0","","0","0","0.0000","0.0000","0","","","0.00","","6","1" "27","other","1","35,000 Subscriber Plan","Some test here.^M\ Some more text","on","","18","","0","","0","0","recurring","0","","payment","totalsend","0","tsadmin","R34bL9oq","38","0","0","","","","","","","","","","","","","","","","","","","","","","","0","0","0","a:18:{i:0;s:1:\"3\";i:1;s:1:\"2\";i:2;s:2:\"59\";i:3;s:2:\"60\";i:4;s:2:\"61\";i:5;s:2:\"62\";i:6;s:2:\"63\";i:7;s:2:\"64\";i:8;s:2:\"65\";i:9;s:2:\"66\";i:10;s:2:\"67\";i:11;s:2:\"68\";i:12;s:2:\"69\";i:13;s:2:\"70\";i:14;s:2:\"71\";i:15;s:2:\"72\";i:16;s:2:\"73\";i:17;s:2:\"74\";}","","","0","0","","0","0","0.0000","0.0000","0","","","0.00","","7","1" "28","other","1","50,000 Subscriber Plan","Some text here.^M\ Some more 
text","on","","18","","0","","0","0","recurring","0","","payment","totalsend","0","tsadmin","R34bL9oq","39","0","0","","","","","","","","","","","","","","","","","","","","","","","0","0","0","a:18:{i:0;s:1:\"3\";i:1;s:1:\"2\";i:2;s:2:\"59\";i:3;s:2:\"60\";i:4;s:2:\"61\";i:5;s:2:\"62\";i:6;s:2:\"63\";i:7;s:2:\"64\";i:8;s:2:\"65\";i:9;s:2:\"66\";i:10;s:2:\"67\";i:11;s:2:\"68\";i:12;s:2:\"69\";i:13;s:2:\"70\";i:14;s:2:\"71\";i:15;s:2:\"72\";i:16;s:2:\"73\";i:17;s:2:\"74\";}","","","0","0","","0","0","0.0000","0.0000","0","","","0.00","","8","1""73","other","8","10,000,000","","","","0","","0","","0","0","recurring","0","","payment","","0","","","75","0","10000000","","","","","","","","","","","","","","","","","","","","","","","0","0","0","a:17:{i:0;s:1:\"3\";i:1;s:1:\"2\";i:2;s:2:\"59\";i:3;s:2:\"60\";i:4;s:2:\"61\";i:5;s:2:\"62\";i:6;s:2:\"63\";i:7;s:2:\"64\";i:8;s:2:\"65\";i:9;s:2:\"66\";i:10;s:2:\"67\";i:11;s:2:\"68\";i:12;s:2:\"69\";i:13;s:2:\"70\";i:14;s:2:\"71\";i:15;s:2:\"72\";i:16;s:2:\"74\";}","","","0","0","","0","0","0.0000","0.0000","0","","","0.00","","14","0"

  • Suggestions for displaying code on webpages, MUST use <br> for newline

    - by bguiz
    Hi, I want to post code snippets online (wordpress.com blog) and have their whitespace formatted nicely. See the answers suggested by this other SO question: Those would be OK, except that I like to copy code to the clipboard or clip entire pages using Evernote, and they use either the <pre> tag or <table> (or both) to format the code. So I end up with text whose newlines and white space are ignored, e.g. string url = "<a href=\"" + someObj.getUrl() + "\" target=\"_blank\">"; // single line comments // second single line override protected void OnLoad(EventArgs e) { if(Attributes["class"] != null) { //_year.CssClass = _month.CssClass = _day.CssClass = Attributes["class"]; } base.OnLoad(e); } Which I find rather annoying myself. I find that if the code is formatted using <br> tags, it copies/clips properly, e.g. string url = "<a href=\"" + someObj.getUrl() + "\" target=\"_blank\">"; // single line comments // second single line override protected void OnLoad(EventArgs e) { if(Attributes["class"] != null) { //_year.CssClass = _month.CssClass = _day.CssClass = Attributes["class"]; } base.OnLoad(e); } I find the first behaviour annoying myself, so I don't want to inflict it upon others when I post my own code. Please suggest methods of posting code snippets online that copy and clip properly. I would like to emphasise that syntax highlighting capability is secondary to correct white space markup. Thank you

  • newline-ignoring diff / diff across multiple lines / reflow-ignoring diff

    - by Adam
    Does anybody know of a diff-like tool that can show me the changes between two text files, but ignore changes in whitespace including newlines? Here's an example: the quick brown fox jumped over the lazy bear. the quick brown fox jumped over the lazy bear. the quick brown fox jumped over the lazy bear. the quick brown fox jumped over the lazy bear. quick brown fox jumped over the lazy bear. the quick brown fox jumped over the lazy bear. the quick brown fox jumped over the lazy bear. the quick brown fox jumped over the lazy bear. All I did was delete one word and reflow it, but "diff -b" detects a change on every line (as it should; I'm not saying this is a bug in diff). But for large LaTeX files this is a major problem; change one word in a long paragraph and the diff you get back is basically useless. By the way, I'm aware that this requires way more computational power than the usual lines-are-atomic diff. I'm only doing this on small human-generated files and am happy to wait a long time if I have to.
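
One way to approximate a reflow-insensitive diff is to compare the two files as sequences of words instead of lines, so that line breaks and runs of spaces no longer matter. The following is a minimal sketch using Python's standard difflib module; the function name `word_diff` is made up for illustration and is not an existing tool.

```python
import difflib


def word_diff(old_text, new_text):
    """Return the non-equal edit operations between two texts, word by word.

    Splitting on any whitespace makes the comparison blind to reflowing.
    """
    old_words = old_text.split()
    new_words = new_text.split()
    matcher = difflib.SequenceMatcher(a=old_words, b=new_words, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            changes.append((tag, old_words[i1:i2], new_words[j1:j2]))
    return changes


sentence = "the quick brown fox jumped over the lazy bear."
old = "\n".join([sentence] * 4)
# Delete the first word and reflow everything onto a single line.
new = " ".join(([sentence] * 4)).split(" ", 1)[1]

for change in word_diff(old, new):
    print(change)
```

As the question already anticipates, comparing word by word costs more than a line-based diff, which should be acceptable for small hand-written files.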

  • problem using getline with a unicode file

    - by hamishmcn
    UPDATE: Thank you to @Potatoswatter and @Jonathan Leffler for comments - rather embarrassingly I was caught out by the debugger tool tip not showing the value of a wstring correctly - however it still isn't quite working for me and I have updated the question below: If I have a small multibyte file I want to read into a string I use the following trick - I use getline with a delimiter of '\0' e.g. std::string contents_utf8; std::ifstream inf1("utf8.txt"); getline(inf1, contents_utf8, '\0'); This reads in the entire file including newlines. However if I try to do the same thing with a wide character file it doesn't work - my wstring only reads to the end of the first line. std::wstring contents_wide; std::wifstream inf2(L"ucs2-be.txt"); getline( inf2, contents_wide, wchar_t(0) ); //doesn't work For example, if my unicode file contains the chars A and B separated by CRLF, the hex looks like this: FE FF 00 41 00 0D 00 0A 00 42 Based on the fact that with a multibyte file getline with '\0' reads the entire file I believed that getline( inf2, contents_wide, wchar_t(0) ) should read in the entire unicode file. However it doesn't - with the example above my wide string would contain the following two wchar_ts: FF FF (If I remove the wchar_t(0) it reads in the first line as expected, i.e. FE FF 00 41 00 0D 00.) Why doesn't wchar_t(0) work as a delimiting wchar_t of "00 00"? Thank you

  • Variable-width inline underline effects in CSS

    - by sidereal
    I need to simulate the look of a typical paper form in CSS. It consists of a two-column table of fields. Each field consists of a field name (of variable width) followed by an underline that continues to the end of the column. The field might be populated, in which case there is some text centered above the line, or it may be blank. If that isn't clear, here's a rough idea in manky ASCII art: Name: _______Foo_______ Age: _____17______ Location: __Melbourne__ Handedness: _Left_ (except that the underline would continue under any text) To implement the underline without text, I assume I should use a border-bottom rather than a text-decoration: underline. Additionally, I need the bordered element to take up the full available space. Both of those argue for a block-level element. However, I can't find any way to get the block-level element (either a div, an li, or a span set to display: block or inline-block) to remain on the same line as the label. As soon as I give it a width: 100%, it breaks onto a new line. I've tried various combinations of floats, and I'm not inclined to do anything ridiculous with absolute positioning. Any recommendations?

  • PHP PCRE differences on testing and hosting servers

    - by Gary Pearman
    Hi all, I've got the following regular expression that works fine on my testing server, but just returns an empty string on my hosted server. $text = preg_replace('~[^\\pL\d]+~u', $use, $text); Now I'm pretty sure this comes down to the hosting server version of PCRE not being compiled with Unicode property support enabled. The differences in the two versions are as follows: My server: PCRE version 7.8 2008-09-05 Compiled with UTF-8 support Unicode properties support Newline sequence is LF \R matches all Unicode newlines Internal link size = 2 POSIX malloc threshold = 10 Default match limit = 10000000 Default recursion depth limit = 10000000 Match recursion uses stack Hosting server: PCRE version 4.5 01-December-2003 Compiled with UTF-8 support Newline character is LF Internal link size = 2 POSIX malloc threshold = 10 Default match limit = 10000000 Match recursion uses stack Also note that the version on the hosting server (the same version PHP is compiled against) is pretty old. What confuses me though, is that pcretest fails on both servers from the command line with re> ~[^\\pL\d]+~u ** Unknown option 'u' although this regexp works fine when run from PHP on my server. So, I guess my questions are does the regular expression fail on the hosting server because of the lack of Unicode properties? Or is there something else that I'm missing? Thanks all, Gaz.

  • Add newline to a text field (Win32)

    - by user146780
    I'm making a Notepad clone. Right now my text loads fine, but where there are newline characters, they do not make newlines in the text field. I load it like this: void LoadText(HWND ctrl,HWND parent) { int leng; char buf[330000]; char FileBuffer[500]; memset(FileBuffer,0,500); FileBuffer[0] = '*'; FileBuffer[1] = '.'; FileBuffer[2] = 't'; FileBuffer[3] = 'x'; FileBuffer[4] = 't'; OPENFILENAMEA ofn; memset(&ofn, 0, sizeof(OPENFILENAMEA)); ofn.lStructSize = sizeof(OPENFILENAMEA); ofn.hwndOwner = parent; ofn.lpstrFile = FileBuffer; ofn.nMaxFile = 500; ofn.lpstrFilter = "Filetype (*.txt)\0\0"; ofn.lpstrDefExt = "txt"; ofn.Flags = OFN_EXPLORER; if(!GetOpenFileNameA(&ofn)) { return; } ifstream *file; file = new ifstream(FileBuffer,ios::in); int lenn; lenn = 0; while (!file->eof()) { buf[lenn] = file->get(); lenn += 1; } buf[lenn - 1] = 0; file->read(buf,lenn); SetWindowTextA(ctrl,buf); file->close(); } How can I make it show the newline characters? Thanks

  • Sybase: how can I remove non-printable characters from CHAR or VARCHAR fields with SQL?

    - by Kenny Drobnack
    I'm working with a Sybase database that seems to have non-printable characters in some of the string fields and this is throwing off some of our processing code. At first glance, it seemed to only be newlines and carriage returns, but we also have an ASCII code 27 in there - an ESC character, some accented characters, and some other oddities in there. I have no direct access to change the database, so changing the bad data isn't an option, yet. For now I have to make do with just filtering it out. We're trying to export the table data from one database and load it into a database used by another application in a nightly batch process. Ideally, I'd like to have a function that I can pass a list of characters and just have Sybase return the data with those characters removed. I'd like to keep it something we could do in plain SQL if possible. Something like this to remove characters that are ASCII 0 - 31. select str_replace(FIELD1, (0-31), NULL) as FIELD1, str_replace(FIELD2, (0-31), NULL) as FIELD2 from TABLE So far, str_replace is the nearest I can find, but it only allows replacing one string with another. No support for character ranges and won't let me do the above. We're running on Sybase ASE 12.5 on Unix servers.
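
If the filtering can be done in the export step instead of in SQL, the "pass a list of characters and get the data back without them" idea is short to express. The sketch below is Python, under the assumption that the nightly batch job can run a script over the exported strings; `strip_control_chars` is a hypothetical helper, not a Sybase function.

```python
# Map every code point from 0 to 31 to None, which str.translate treats as "delete".
CONTROL_CHARS = {code: None for code in range(32)}


def strip_control_chars(value):
    """Remove ASCII control characters (codes 0-31) and keep everything else."""
    return value.translate(CONTROL_CHARS)


row = {"FIELD1": "line one\r\nline two", "FIELD2": "caf\u00e9 \x1b[0m"}
cleaned = {name: strip_control_chars(text) for name, text in row.items()}
print(cleaned)
```

Accented characters are left alone here because they are above code 31; whether they should also be removed depends on what the receiving application accepts.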

  • Strange \n in base64 encoded string in Ruby

    - by intellidiot
    The inbuilt Base64 library in Ruby is adding some '\n's, and I'm unable to find out the reason. For this particular example: irb(main):001:0> require 'rubygems' => true irb(main):002:0> require 'base64' => true irb(main):003:0> str = "1110--ad6ca0b06e1fbeb7e6518a0418a73a6e04a67054" => "1110--ad6ca0b06e1fbeb7e6518a0418a73a6e04a67054" irb(main):004:0> Base64.encode64(str) => "MTExMC0tYWQ2Y2EwYjA2ZTFmYmViN2U2NTE4YTA0MThhNzNhNmUwNGE2NzA1\nNA==\n" The \n's are at the last position and the 6th position from the end. The decoder (Base64.decode64) returns the original string perfectly. The strange thing is, these \n's don't add any value to the encoded string. When I remove the newlines from the output string, the decoder decodes it again perfectly. irb(main):005:0> Base64.decode64(Base64.encode64(str).gsub("\n", '')) == str => true What's more, I used another JS library to produce the base64 encoded output of the same input string, and the output comes without the \n's. Is this a bug or anything else? Has anybody faced this issue before? FYI, $ ruby -v ruby 1.8.7 (2008-08-11 patchlevel 72) [i486-linux]
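
For context, the line feeds are documented behaviour rather than a bug: Ruby's Base64.encode64 follows RFC 2045 and adds a line feed after every 60 encoded characters, including the last, shorter group. The Python sketch below reproduces that wrapping to show where the two '\n's come from; `wrap_base64` is a hypothetical helper written for this illustration.

```python
import base64


def wrap_base64(data, width=60):
    """Base64-encode bytes and end every `width`-character group with a line feed."""
    encoded = base64.b64encode(data).decode("ascii")
    groups = [encoded[i:i + width] for i in range(0, len(encoded), width)]
    return "".join(group + "\n" for group in groups)


raw = b"1110--ad6ca0b06e1fbeb7e6518a0418a73a6e04a67054"
wrapped = wrap_base64(raw)            # same shape as the encode64 output above
unwrapped = wrapped.replace("\n", "")  # what the JS library produced

print(repr(wrapped))
# Both forms decode to the same bytes, because line feeds are not part of
# the Base64 alphabet and are skipped by the decoder.
print(base64.b64decode(wrapped) == base64.b64decode(unwrapped) == raw)
```

The input is 46 bytes, which encodes to 64 characters: one full group of 60 and a final group of 4, hence a line feed at the end and another 6 positions before it.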

  • Find first item inside angular brackets after occurrence of other item, using RegEx, in C#

    - by Mihaela
    I have an xml-like text, in which I would like to find the item that occurs in the first occurrence of a certain pattern: typically: ... <PropertyGroup><name>true</name></PropertyGroup><PropertyGroup>.... .... Could also be ... <PropertyGroup> <name> true</name> </PropertyGroup> ... <PropertyGroup> ... In the above, I need to extract the "name". My initial assumption was that all occurrences would be on one line, and I wrote my code using string properties, but it is very difficult to take every possibility into consideration, and only RegEx can save me. I just don't know how to write it... I have started with something like this: Regex regex = new Regex("(?<=<PropertyGroup>#)<+"); Match matches = regex.Matches(Text)[0]; MessageBox.Show(matches.ToString()); I think this finds the first item after a <PropertyGroup>, but I don't know how to make it get the item within the angular brackets... (which may be after one or more newlines, and/or spaces). I know that there are utilities for parsing xml, but I am looking for something simple to insert in a C# program. Can someone please help me? Thank you very much.
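
The whitespace part of this can be handled with `\s*`, which matches any run of spaces, tabs and newlines, followed by a capture group for the tag name. A minimal sketch of that pattern, written in Python for illustration (the pattern syntax used here is also accepted by .NET's Regex class); `first_child_name` is a made-up helper name.

```python
import re

# "<PropertyGroup>", optional whitespace/newlines, then "<" and the tag name.
# The name stops at whitespace, "/" or ">", so closing tags are never captured.
FIRST_CHILD = re.compile(r"<PropertyGroup>\s*<([^\s/>]+)")


def first_child_name(text):
    """Return the name of the first tag opened right after <PropertyGroup>, or None."""
    match = FIRST_CHILD.search(text)
    return match.group(1) if match else None


one_line = "... <PropertyGroup><name>true</name></PropertyGroup><PropertyGroup>...."
multi_line = "... <PropertyGroup>\n   <name> true</name>\n</PropertyGroup> ..."

print(first_child_name(one_line))
print(first_child_name(multi_line))
```

This is only a sketch for well-behaved input; for anything more irregular an XML parser would be more robust than a regular expression.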

  • JSP Document/JSPX: what determines how tabs/space/linebreaks are removed in the output?

    - by NoozNooz42
    I've got a "JSP Document" ("JSP in XML") nicely formatted and when the webpage is generated and sent to the user, some linebreaks are removed. Now the really weird part: apparently the "main" .jsp always gets all its linebreaks removed but for any subsequent .jsp included from the main .jsp, linebreaks seem to be randomly removed (some are there, others aren't). For example, if I'm looking at the webpage served from Firefox and ask to "view source", I get to see what is generated. So, what determines when/how linebreaks are kept/removed? This is just an example I made up... Can you force a .jsp to serve this: <body><div id="page"><div id="header"><div class="title">... or this: <body> <div id="page"> <div id="header"> <div class="title">... ? I take it that linebreaks are removed to save on bandwidth, but what if I want to keep them? And what if I want to keep the same XML indentation as in my .jsp file? Is this doable? EDIT Following skaffman's advice, I took a look at the generated .java files and the "main" one does have lots of out.write calls, but not a single one writing tabs or newlines. Contrary to that file, all the ones that I'm including from that main .jsp have lots of lines like: out.write("\t...\n"); So I guess my question stays exactly the same: what determines how tabs/space/linebreaks are included/removed in the output?

  • jQuery RJS inserting string vs dom.

    - by Dmitriy Likhten
    So I am trying to use jQuery to insert data from an ajax call. I actually use the jquery.form plugin, and have the ajax form submitted with a dataType: 'script'. The response is a jQuery expression which contains a <%= escape_javascript(render ...) %> erb tag (similar to what the railscasts episode 136 instructs to do). However, the end result is that the full text of the render is inserted into the page as text, not as DOM elements. Could the fact that the render had some newlines at the beginning be the cause? Dom text: "\n \n &lt;li>....&lt;/li>" I also tried having jQuery just read the response as a script and execute it, and used the prototype-based rjs stuff, with the same effect: the text is inserted into the DOM. Are there any reasons why such a behavior would be experienced? A bit of clarification: My response.js.erb is jQuery("#content").append("<%= escape_javascript(render(:partial => "widgets")) %>"); jQuery("#information").text("Finally, something happened!"); The full text inside the append() call is inserted as text into #content.

  • Convert array to CSV/TSV-formated string in Python.

    - by dreeves
    Python provides csv.DictWriter for outputting CSV to a file. What is the simplest way to output CSV to a string or to stdout? For example, given a 2D array like this: [["a b c", "1,2,3"], ["i \"comma-heart\" you", "i \",heart\" u, too"]] return the following string: "a b c, \"1, 2, 3\"\n\"i \"\"comma-heart\"\" you\", \"i \"\",heart\"\" u, too\"" which when printed would look like this: a b c, "1,2,3" "i ""heart"" you", "i "",heart"" u, too" (I'm taking csv.DictWriter's word for it that that is in fact the canonical way to output that array as CSV. Excel does parse it correctly that way, though Mathematica does not. From a quick look at the wikipedia page on CSV it seems Mathematica is wrong.) One way would be to write to a temp file with csv.DictWriter and read it back with csv.DictReader. What's a better way? TSV instead of CSV It also occurs to me that I'm not wedded to CSV. TSV would make a lot of the headaches with delimiters and quotes go away: just replace tabs with spaces in the entries of the 2D array and then just intersperse tabs and newlines and you're done. Let's include solutions for both TSV and CSV in the answers to make this as useful as possible for future searchers.
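
A temp file is not needed: the csv writers accept any object with a write() method, so an in-memory buffer or sys.stdout works directly. The sketch below assumes Python 3 and uses csv.writer (rather than csv.DictWriter) because the input is a plain 2D list; `rows_to_delimited` is a name chosen for this example.

```python
import csv
import io
import sys


def rows_to_delimited(rows, delimiter=","):
    """Serialize a 2D list to a CSV (or TSV) string with the standard csv module."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=delimiter, lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


rows = [["a b c", "1,2,3"], ['i "comma-heart" you', 'i ",heart" u, too']]

csv_text = rows_to_delimited(rows)
tsv_text = rows_to_delimited(rows, delimiter="\t")

# Writing straight to stdout works the same way, with no intermediate string.
csv.writer(sys.stdout, lineterminator="\n").writerows(rows)
```

Note that the TSV variant still quotes fields that contain double quotes; replacing tabs in the entries beforehand, as suggested above, avoids the delimiter problem but not that one.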

  • How can I remove HTML span tags with a Perl one liner?

    - by yaya3
    I want to perform the following vim substitution as a one-liner in the terminal with Perl. I would prefer to allow for any occurrences of whitespace or newlines, rather than explicitly catering for them as I am below. %s/blockDontForget">\n*\s*<p><span><a\(.*\)<\/span>/blockDontForget"><p><a\1/g I've tried this: perl -pi -e 's/blockDontForget"><p><span><a(.*)<\/span>/blockDontForget"><p><a$1/msg' I presume I am misinterpreting the flags. Where am I going wrong? Thanks. EDIT: The above example is to strip the spans out of the following html: <div class="block blockDontForget"> <p><span><a href="../../../foo/bar/x/x.html">Lorem Ipsum</a></span></p> </div> EDIT: It's just the <span>'s and </span>'s that are in between <p> and <a> from the "blockDontForget" class that I want to remove (there are lots of these blockDontForget divs with spans inside anchors that I want to keep).

  • How to search cvs comment history

    - by Chris Noe
    I am aware of this command: cvs log -N -w<userid> -d"1 day ago" Unfortunately this generates a formatted report with lots of newlines in it, such that the file-path, the file-version, and the comment-text are all on separate lines. Therefore it is difficult to scan it for all occurrences of comment text, (eg, grep), and correlate the matches to file/version. (Note that the log output would be perfectly acceptable, if only cvs could perform the filtering natively.) EDIT: Sample output. A block of text like this is reported for each repository file: RCS file: /data/cvs/dps/build.xml,v Working file: build.xml head: 1.49 branch: locks: strict access list: keyword substitution: kv total revisions: 57; selected revisions: 1 description: ---------------------------- revision 1.48 date: 2008/07/09 17:17:32; author: noec; state: Exp; lines: +2 -2 Fixed src.jar references ---------------------------- revision 1.47 date: 2008/07/03 13:13:14; author: noec; state: Exp; lines: +1 -1 Fixed common-src.jar reference. =============================================================================

  • Python line file iteration and strange characters

    - by muckabout
    I have a huge gzipped text file which I need to read, line by line. I go with the following: for i, line in enumerate(codecs.getreader('utf-8')(gzip.open('file.gz'))): print i, line At some point late in the file, the python output diverges from the file. This is because lines are getting broken due to weird special characters that python thinks are newlines. When I open the file in 'vim', the lines are correct, but the suspect characters are formatted weirdly. Is there something I can do to fix this? I've tried other codecs including utf-16, latin-1. I've also tried with no codec. I looked at the file using 'od'. Sure enough, there are \n characters where they shouldn't be. But, the "wrong" ones are preceded by a weird character. I think there's some encoding here with some characters being 2 bytes, but the trailing byte being a \n if not viewed properly. If I replace: gzip.open('file.gz') With: os.popen('zcat file.gz') It works fine (and actually, quite a bit faster). But, I'd like to know where I'm going wrong.
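
One possible explanation, offered as an assumption since the file itself is not shown: the stream reader returned by codecs.getreader splits decoded text the way str.splitlines does, which also breaks on characters such as U+0085 and U+2028, whereas the zcat pipe is split on '\n' only. The sketch below assumes Python 3 and opens the gzip file in text mode with newline="\n", so that only '\n' ends a line; `iter_gzip_lines` is a name made up for this example.

```python
import gzip
import os
import tempfile


def iter_gzip_lines(path, encoding="utf-8"):
    """Yield decoded lines from a gzip file, treating only "\\n" as a line ending."""
    with gzip.open(path, "rt", encoding=encoding, newline="\n") as handle:
        for line in handle:
            yield line.rstrip("\n")


# Demo on a small throwaway file: the first line contains U+2028
# (LINE SEPARATOR), which str.splitlines would treat as a line break.
sample = "first\u2028still first\nsecond\n"
fd, demo_path = tempfile.mkstemp(suffix=".gz")
os.close(fd)
with gzip.open(demo_path, "wt", encoding="utf-8", newline="\n") as out:
    out.write(sample)

lines = list(iter_gzip_lines(demo_path))
os.remove(demo_path)

print(len(lines), "lines via iter_gzip_lines")
print(len(sample.splitlines()), "lines via str.splitlines")
```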

  • Perl Unicode glitch

    - by RedGrittyBrick
    In this output, why am I getting extra newlines between lines b&c and d&e? a: ....v....1....v... (a) b: 'Budejovický Budvar' length 18 (b) c: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 (c) d: B u d e j o v i c k ý B u d v a r (d) e: 42 75 64 11b 6a 6f 76 69 63 6b fd 20 42 75 64 76 61 72 (e) from this program #!perl use strict; use warnings; binmode (STDOUT, "encoding(UTF-8)"); # so no "Wide character in print" warning print "\n"; my $r = "Bud\N{U+011B}jovick\N{U+00FD} Budvar"; print "a: ....v....1....v... (a)\n"; print "b: '$r' length ", length($r)," (b)\n"; print "c:"; printf "%4d",$_ for (1..18); print " (c)\n"; print "d: "; print join(" ", split("", $r)); print " (d)\n"; print "e: "; printf "%*v3x", " ", $r; print " (e)\n";

  • Scrolling to the bottom of a div on page load: issue with syntaxhighlighter

    - by Rayne
    I've been using this code: var objDiv = document.getElementById("code"); objDiv.scrollTop = objDiv.scrollHeight; to scroll to the very bottom of the div. It worked perfectly in FF and Chrome (I asked a question about it not working in Chrome a few days ago, but it appears the guy who was testing it on Chrome was incorrect, so I tested it myself) until I started syntax highlighting the code that I put in the div with SyntaxHighlighter. Before, I was putting the code in a <p> and breaking lines with <br />, but the <br /> stuff doesn't fly with SyntaxHighlighter, so I replaced all of those with newlines (not entirely certain if this is important, but it's worth mentioning). Now, when the page loads, it does scroll, but not all the way down. It scrolls nearly to the bottom. I've tried all the methods listed in the other question I mentioned but they all do the same thing, or nothing at all. Is there anything else I can try? Here is the relevant piece of the generated HTML. Forgive the poor formatting, I'm not writing the HTML by hand, but rather using Hiccup with Clojure, and it doesn't bother with formatting. <div class="scroll" id="code"><pre class="brush: clojure">=> (doseq [x (range 1 100)] (println x)) 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 nil </pre></div><script type="text/javascript">var objDiv = document.getElementById("code"); objDiv.scrollTop = objDiv.scrollHeight;</script>

  • Removing HTML entities while preserving line breaks with JSoup

    - by shrodes
    I have been using JSoup to parse lyrics and it has been great until now, but I have run into a problem. I can use Node.html() to return the full HTML of the desired node, which retains line breaks as such: Gl&oacute;andi augu, silfurn&aacute;tt <br />Bl&oacute;&eth; alv&ouml;ru, starir &aacute; <br />&Oacute;&eth;ur hundur er &iacute; v&iacute;gam&oacute;&eth;, &iacute; maga... m&eacute;r <br /> <br />Kolni&eth;ur gref, kvik sem dreg h&eacute;r <br />Kolni&eth;ur svart, hvergi bjart n&eacute; But this has the unfortunate side-effect, as you can see, of retaining HTML entities and tags. However, if I use Node.text(), I can get a better looking result, free of tags and entities: Glóandi augu, silfurnátt Blóð alvöru, starir á Óður hundur er í vígamóð, í maga... mér Kolniður gref, kvik sem dreg hér Kolniður svart, Which has another unfortunate side-effect of removing the line breaks and compressing everything into a single line. Simply replacing <br /> with a newline in the node before calling Node.text() yields the same result, and it seems that the method itself is compressing the text onto a single line, ignoring newlines. Is it possible to have the best of both worlds, and have tags and entities replaced correctly while preserving the line breaks, or is there another method or way of decoding entities and removing tags without having to replace them manually?
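
The order of operations matters here: turn each <br /> into a newline first, then strip the remaining tags, and only then decode the entities, so that no step collapses the whitespace the previous one introduced. The sketch below shows that sequence in Python with the standard library only; it is not JSoup code, and `html_to_text` is a hypothetical helper that assumes simple fragments like the one above.

```python
import html
import re


def html_to_text(fragment):
    """Convert <br> tags to newlines, drop other tags, then decode entities."""
    with_breaks = re.sub(r"\s*<br\s*/?>\s*", "\n", fragment, flags=re.IGNORECASE)
    without_tags = re.sub(r"<[^>]+>", "", with_breaks)
    return html.unescape(without_tags)


fragment = (
    "Gl&oacute;andi augu, silfurn&aacute;tt <br />"
    "Bl&oacute;&eth; alv&ouml;ru, starir &aacute; <br /> <br />"
    "Kolni&eth;ur gref, kvik sem dreg h&eacute;r"
)
text = html_to_text(fragment)
print(text)
```

Two consecutive <br /> tags become two newlines, so the blank line between verses is kept.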

  • Terminal-based snake game: input thread manipulates output

    - by enlightened
    I'm writing a snake game for the terminal, i.e. output via print. The following works just fine: while status[snake_monad] do print to_string draw canvas, compose_all([ frame, specs, snake_to_hash(snake[snake_monad]) ]) turn! snake_monad, get_dir move! snake_monad, specs sleep 0.25 end But I don't want the turn!ing to block, of course. So I put it into a new Thread and let it loop: Thread.new do loop do turn! snake_monad, get_dir end end while status[snake_monad] do ... # no turn! here ... end Which also works logically (the snake is turning), but the output is somehow interspersed with newlines. As soon as I kill the input thread (^C) it looks normal again. So why and how does the thread have any effect on my output? And how do I work around this issue? (I don't know much about threads, even less about them in ruby. Input and output concurrently on the same terminal make the matter worse, I guess...) Also (not really important): Wanting my program as pure as possible, would it be somewhat easily possible to get the input non-blockingly while passing everything around? Thank you!

  • Ruby - Nokogiri - Need to put node.value to an array

    - by r3nrut
    What I'm trying to do is read the value for all the nodes in this XML and put them into an array. This should be simple but for some reason it's driving me nuts. XML <ArrayOfAddress> <Address> <AddressId>297424fe-cfff-4ee1-8faa-162971d2645f</AddressId> <FirstName>George</FirstName> <LastName>Washington</LastName> <Address1>123 Main St</Address1> <Address2>Apt #611</Address2> <City>New York</City> <State>NY</State> <PostalCode>10110</PostalCode> <CountryCode>US</CountryCode> <EmailAddress>[email protected]</EmailAddress> <PhoneNumber>5555551234</PhoneNumber> <AddressType>CustomerAddress</AddressType> </Address> </ArrayOfAddress> Code class MassageRepsone def parse_resp @@get_address.url_builder #URL passed through HTTPClient - @@resp is the xml above doc = Nokogiri::XML::Reader(@@resp) @@values = doc.each do |node| node.value end end @@get_address.parse_resp obj = [@@values] Array(obj) p obj end The code snippet from above returns the following: 297424fe-cfff-4ee1-8faa-162971d2645f George Washington 123 Main St Apt #622 New York NY 10110 US test.test.com 5555551234 CustomerAddress I tried putting @@values to a string and applying chomp but that just prints the newlines as nil and puts quotes around the values. Not sure what the next step is or if I need to approach this differently with Nokogiri.

  • Fast and efficient way to read a space separated file of numbers into an array?

    - by John_Sheares
    I need a fast and efficient method to read a space separated file of numbers into an array. The files are formatted this way: 4 6 1 2 3 4 5 6 2 5 4 3 21111 101 3 5 6234 1 2 3 4 2 33434 4 5 6 The first row is the dimension of the array [rows columns]. The lines following contain the array data. The data may also be formatted without any newlines like this: 4 6 1 2 3 4 5 6 2 5 4 3 21111 101 3 5 6234 1 2 3 4 2 33434 4 5 6 I can read the first line and initialize an array with the row and column values. Then I need to fill the array with the data values. My first idea was to read the file line by line and use the split function. But the second format listed gives me pause, because the entire array data would be loaded into memory all at once. Some of these files are hundreds of MBs in size. The second method would be to read the file in chunks and then parse them piece by piece. Maybe somebody else has a better way of doing this?
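
The chunked approach can be written so that it does not care whether the data contains newlines at all: read a fixed-size block, split it on whitespace, and carry over a possibly incomplete last token to the next block. The following is a minimal Python sketch of that idea; the function names and the 64 KiB default chunk size are choices made for this example, and the numbers are assumed to be integers as in the sample.

```python
import io


def iter_numbers(stream, chunk_size=1 << 16):
    """Yield integers from a whitespace-separated text stream, one chunk at a time."""
    leftover = ""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        chunk = leftover + chunk
        tokens = chunk.split()
        # A chunk that does not end in whitespace may have cut the last number in half.
        if not chunk[-1].isspace():
            leftover = tokens.pop()
        else:
            leftover = ""
        for token in tokens:
            yield int(token)
    if leftover:
        yield int(leftover)


def read_matrix(stream, chunk_size=1 << 16):
    """Read 'rows cols' followed by rows*cols values into a list of lists."""
    numbers = iter_numbers(stream, chunk_size)
    rows, cols = next(numbers), next(numbers)
    return [[next(numbers) for _ in range(cols)] for _ in range(rows)]


flat = "4 6 1 2 3 4 5 6 2 5 4 3 21111 101 3 5 6234 1 2 3 4 2 33434 4 5 6"
lined = "4 6\n1 2 3 4 5 6\n2 5 4 3 21111 101\n3 5 6234 1 2 3\n4 2 33434 4 5 6\n"

# A tiny chunk size is used here only to exercise the split-token handling.
matrix = read_matrix(io.StringIO(flat), chunk_size=3)
print(matrix)
```

With a real file, `open(path)` would take the place of `io.StringIO`, and memory use stays bounded by the chunk size plus the array being filled.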

  • Make error in installing Math support for MediaWiki

    - by Masi
    How can you solve the following Make error in installing MediaWiki? ... /local/lib/site_perl . /etc/perl /usr/local/lib/perl/5.10.0 /usr/local/share/perl/5.10.0 /usr/lib/perl5 /usr/share/perl5 /usr/lib/perl/5.10 /usr/share/perl/5.10 /usr/local/lib/site_perl .) at t/maint/php-tag.t line 8. BEGIN failed--compilation aborted at t/maint/php-tag.t line 8. # Looks like your test died before it could output anything. t/maint/php-tag..........dubious Test returned status 255 (wstat 65280, 0xff00) t/maint/unix-newlines....ok Failed Test Stat Wstat Total Fail List of Failed ------------------------------------------------------------------------------- t/00-test.t 127 32512 ?? ?? ?? t/inc/Database.t 127 32512 ?? ?? ?? t/inc/Global.t 127 32512 ?? ?? ?? t/inc/IP.t 127 32512 ?? ?? ?? t/inc/ImageFunctions.t 127 32512 ?? ?? ?? t/inc/Language.t 127 32512 ?? ?? ?? t/inc/Licenses.t 127 32512 ?? ?? ?? t/inc/LocalFile.t 127 32512 ?? ?? ?? t/inc/Parser.t 127 32512 ?? ?? ?? t/inc/Revision.t 127 32512 ?? ?? ?? t/inc/Sanitizer.t 127 32512 ?? ?? ?? t/inc/Search.t 127 32512 ?? ?? ?? t/inc/Title.t 127 32512 ?? ?? ?? t/inc/Xml.t 127 32512 ?? ?? ?? t/maint/php-lint.t 254 65024 966 966 1-966 t/maint/php-tag.t 255 65280 ?? ?? ?? Failed 16/19 test scripts. 966/4248 subtests failed. Files=19, Tests=4248, 46 wallclock secs (33.15 cusr + 10.38 csys = 43.53 CPU) Failed 16/19 test programs. 966/4248 subtests failed. make: *** [test] Error 255


  • Spaces and parentheses in Windows PATH variable screw up batch files.

    - by NoName
    So, my path variable (System-Adv Settings-Env Vars-System-PATH) is set to:

    C:\Python26\Lib\site-packages\PyQt4\bin;
    %SystemRoot%\system32;
    %SystemRoot%;
    %SystemRoot%\System32\Wbem;
    %SYSTEMROOT%\System32\WindowsPowerShell\v1.0\;
    C:\Python26\;
    C:\Python26\Scripts\;
    C:\cygwin\bin;
    "C:\PathWithSpaces\What_is_this_bullshit";
    "C:\PathWithSpaces 1.5\What_is_this_bullshit_1.5";
    "C:\PathWithSpaces (2.0)\What_is_this_bullshit_2.0";
    "C:\Program Files (x86)\IronPython 2.6";
    "C:\Program Files (x86)\Subversion\bin";
    "C:\Program Files (x86)\Git\cmd";
    "C:\Program Files (x86)\PuTTY";
    "C:\Program Files (x86)\Mercurial";
    Z:\droid\android-sdk-windows\tools;

    Although, obviously, without the newlines. Notice the lines containing PathWithSpaces - the first has no spaces, the second has a space, and the third has a space followed by a parenthesis. Now, notice the output of this batch file:

    C:\Program Files (x86)\Microsoft Visual Studio 9.0\VC\bin\>vcvars32.bat
    C:\Program Files (x86)\Microsoft Visual Studio 9.0\VC\bin>"C:\Program Files (x86)\Microsoft Visual Studio 9.0\Common7\Tools\vsvars32.bat"
    Setting environment for using Microsoft Visual Studio 2008 x86 tools.
    \What_is_this_bullshit_2.0";"C:\Program was unexpected at this time.
    C:\Program Files (x86)\Microsoft Visual Studio 9.0\VC\bin> set "PATH=C:\Program Files\Microsoft SDKs\Windows\v6.0A\bin;C:\Python26\Lib\site-packages\PyQt4\bin;C:\Windows\system32;C:\Windows;C:\Windows\System32\Wbem;C:\Windows\System32\WindowsPowerShell\v1.0\;C:\Python26\;C:\Python26\Scripts\;C:\cygwin\bin;"C:\PathWithSpaces\What_is_this_bullshit";"C:\PathWithSpaces 1.5\What_is_this_bullshit_1.5";"C:\PathWithSpaces (2.0)\What_is_this_bullshit_2.0";"C:\Program Files (x86)\IronPython 2.6";"C:\Program Files (x86)\Subversion\bin";"C:\Program Files (x86)\Git\cmd";"C:\Program Files (x86)\PuTTY";"C:\Program Files (x86)\Mercurial";Z:\droid\android-sdk-windows\tools;"

    or specifically the line:

    \What_is_this_bullshit_2.0";"C:\Program was unexpected at this time.

    So, what is this bullshit? Specifically:

    Directory in path that is properly escaped with quotes, but with no spaces = fine
    Directory in path that is properly escaped with quotes, and has spaces but no parentheses = fine
    Directory in path that is properly escaped with quotes, and has spaces and has a parenthesis = ERROR

    What's going on here? How can I fix this? I'll probably resort to a junction point to let my tools still work as a workaround, but if you have any insight into this, please let me know :)


  • Using Objective-C Blocks

    - by Sean
    Today I was experimenting with Objective-C's blocks so I thought I'd be clever and add to NSArray a few functional-style collection methods that I've seen in other languages:

    @interface NSArray (FunWithBlocks)
    - (NSArray *)collect:(id (^)(id obj))block;
    - (NSArray *)select:(BOOL (^)(id obj))block;
    - (NSArray *)flattenedArray;
    @end

    The collect: method takes a block which is called for each item in the array and is expected to return the result of some operation using that item. The result is the collection of all of those results. (If the block returns nil, nothing is added to the result set.) The select: method returns a new array with only the items from the original for which the block returns YES. And finally, the flattenedArray method iterates over the array's items. If an item is an array, it recursively calls flattenedArray on it and adds the results to the result set. If the item isn't an array, it adds the item to the result set. The result set is returned when everything is finished.

    So now that I had some infrastructure, I needed a test case. I decided to find all package files in the system's application directories. This is what I came up with:

    NSArray *packagePaths = [[[NSSearchPathForDirectoriesInDomains(NSAllApplicationsDirectory, NSAllDomainsMask, YES) collect:^(id path) { return (id)[[[NSFileManager defaultManager] contentsOfDirectoryAtPath:path error:nil] collect:^(id file) { return (id)[path stringByAppendingPathComponent:file]; }]; }] flattenedArray] select:^(id fullPath) { return [[NSWorkspace sharedWorkspace] isFilePackageAtPath:fullPath]; }];

    Yep - that's all one line and it's horrid. I tried a few approaches at adding newlines and indentation to try to clean it up, but it still feels like the actual algorithm is lost in all the noise. I don't know if it's just a syntax thing or my relative inexperience with using a functional style that's the problem, though.

    For comparison, I decided to do it "the old-fashioned way" and just use loops:

    NSMutableArray *packagePaths = [NSMutableArray new];
    for (NSString *searchPath in NSSearchPathForDirectoriesInDomains(NSAllApplicationsDirectory, NSAllDomainsMask, YES)) {
        for (NSString *file in [[NSFileManager defaultManager] contentsOfDirectoryAtPath:searchPath error:nil]) {
            NSString *packagePath = [searchPath stringByAppendingPathComponent:file];
            if ([[NSWorkspace sharedWorkspace] isFilePackageAtPath:packagePath]) {
                [packagePaths addObject:packagePath];
            }
        }
    }

    IMO this version was easier to write and is more readable to boot. I suppose it's possible this was somehow a bad example, but it seems like a legitimate way to use blocks to me. (Am I wrong?) Am I missing something about how to write or structure Objective-C code with blocks that would clean this up and make it clearer than (or even just as clear as) the looped version?
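The semantics of the three methods described in the question can be sketched outside Objective-C. The following is a minimal Python illustration, not the asker's implementation: `collect`, `select`, and `flattened` mirror the described behaviour (including dropping nil/None results), while the `listing` dictionary and the `is_package` check are hypothetical stand-ins for the file system and for `isFilePackageAtPath:`. Naming each intermediate step is one possible way to keep the pipeline readable.

```python
def collect(items, fn):
    """Apply fn to each item; None results are skipped, like nil in the question."""
    results = []
    for item in items:
        result = fn(item)
        if result is not None:
            results.append(result)
    return results


def select(items, predicate):
    """Keep only the items for which predicate returns a true value."""
    return [item for item in items if predicate(item)]


def flattened(items):
    """Recursively flatten nested lists into a single list."""
    out = []
    for item in items:
        if isinstance(item, list):
            out.extend(flattened(item))
        else:
            out.append(item)
    return out


# Hypothetical stand-in for directory contents (not real file system calls).
listing = {
    "/Applications": ["A.app", "notes.txt"],
    "/Users/me/Applications": ["B.app"],
}


def is_package(path):
    # Hypothetical stand-in for a real "is this a package?" check.
    return path.endswith(".app")


def paths_in(directory):
    return [directory + "/" + name for name in listing[directory]]


# Each stage gets a name, so the algorithm reads top to bottom.
per_directory = collect(list(listing), paths_in)
all_paths = flattened(per_directory)
package_paths = select(all_paths, is_package)
```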

