Search Results

Search found 5756 results on 231 pages for 'illegal characters'.

  • Efficient file buffering & scanning methods for large files in python

    - by eblume
    The description of the problem I am having is a bit complicated, and I will err on the side of providing more complete information. For the impatient, here is the briefest way I can summarize it: What is the fastest (least execution time) way to split a text file into ALL (overlapping) substrings of size N (bounded N, e.g. 36) while throwing out newline characters?

    I am writing a module which parses files in the FASTA ASCII-based genome format. These files comprise what is known as the 'hg18' human reference genome, which you can download from the UCSC genome browser (go slugs!) if you like. As you will notice, the genome files are composed of chr[1..22].fa and chr[XY].fa, as well as a set of other small files which are not used in this module.

    Several modules already exist for parsing FASTA files, such as BioPython's SeqIO. (Sorry, I'd post a link, but I don't have the points to do so yet.) Unfortunately, none of the modules I've been able to find does the specific operation I am trying to do. My module needs to split the genome data ('CAGTACGTCAGACTATACGGAGCTA' could be a line, for instance) into every single overlapping N-length substring. Let me give an example using a very small file (the actual chromosome files are between 355 and 20 million characters long) and N=8:

        import cStringIO
        example_file = cStringIO.StringIO("""\
        header
        CAGTcag
        TFgcACF
        """)
        for read in parse(example_file):
            print read

    which prints:

        CAGTCAGT
        AGTCAGTF
        GTCAGTFG
        TCAGTFGC
        CAGTFGCA
        AGTFGCAC
        GTFGCACF

    Of the methods I could think of, the function with the absolute best performance is this:

        def parse(file):
            size = 8  # of course in my code this is a function argument
            file.readline()  # skip past the header
            buffer = ''
            for line in file:
                buffer += line.rstrip().upper()
                while len(buffer) >= size:
                    yield buffer[:size]
                    buffer = buffer[1:]

    This works, but unfortunately it still takes about 1.5 hours (see note below) to parse the human genome this way. Perhaps this is the very best I am going to see with this method (a complete code refactor might be in order, but I'd like to avoid it as this approach has some very specific advantages in other areas of the code), but I thought I would turn this over to the community. Thanks!

    Note: this time includes a lot of extra calculation, such as computing the opposing strand read and doing hashtable lookups on a hash of approximately 5G in size.

    Post-answer conclusion: It turns out that using fileobj.read() and then manipulating the resulting string (string.replace(), etc.) took relatively little time and memory compared to the remainder of the program, and so I used that approach. Thanks everyone!
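
    The post-answer conclusion (read the whole file, strip newlines, then slice) can be sketched roughly as follows. This is a minimal Python 3 sketch, not the poster's actual code: the `size` parameter, the use of `io.StringIO` in place of `cStringIO`, and the assumption that the file has exactly one header line are all illustrative.

```python
import io


def parse(fileobj, size=8):
    """Yield every overlapping substring of length `size`.

    Assumes a single header line followed by sequence lines (an
    assumption for this sketch; real FASTA files may hold several records).
    """
    fileobj.readline()  # skip past the header line
    # Read the rest in one go and drop line breaks before slicing.
    sequence = fileobj.read().replace("\n", "").replace("\r", "").upper()
    for start in range(len(sequence) - size + 1):
        yield sequence[start:start + size]


example_file = io.StringIO("header\nCAGTcag\nTFgcACF\n")
reads = list(parse(example_file))
```

    Slicing by index avoids rebuilding the buffer with `buffer[1:]` on every step, which copies the whole remaining string each time.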

  • Images inside of UILabel

    - by enby
    Hello everyone! I'm trying to find the best way to display images inside of UILabels (in fact, I wouldn't mind switching to something other than UILabel if it supports images with no hassle). The scenario is: I have a table view with hundreds of cells, and UILabel is the main component of each cell. The text I assign to each cell contains sequences of characters that need to be parsed out and represented as an image. In simpler words, imagine a TableView of an instant messenger that parses and replaces all ":)", ":(", ":D" etc. with corresponding smiley images. Any input would be greatly appreciated!

  • Internationalization string testing

    - by LicenseQ
    Some people use look-alike Unicode symbols to replace English characters to test internationalization, e.g. "Test" is replaced as "Test". Is there a well-known name for this language/culture? Are there utilities, keyboard layouts, or translation tools for this "language"?

  • Creating a unique URL safe hash

    - by Ben Foster
    I want to hash/encode a unique integer (database ID) to create a similarly unique string. It needs to meet the following requirements: It must start with a letter or number, and can contain only letters and numbers. All letters in a container name must be lowercase. It must be from 3 through 63 characters long (although the shorter the better). The result does not need to be reversible, just repeatable, so a one-way hash would be fine.
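
    One way to meet these requirements without a hash at all is a base-36 encoding of the ID, sketched below. Note this swaps in a plain encoding (reversible, which the post allows but does not require) rather than a one-way hash; the function name `encode_id` and the zero-padding to three characters are assumptions made for the sketch.

```python
import string

# 36 symbols: digits first, then lowercase letters.
ALPHABET = string.digits + string.ascii_lowercase


def encode_id(record_id: int, min_length: int = 3) -> str:
    """Encode a non-negative integer as a lowercase base-36 string.

    Distinct IDs always give distinct strings, and the result is padded
    with leading zeros to at least `min_length` characters.
    """
    if record_id < 0:
        raise ValueError("record_id must be non-negative")
    digits = []
    n = record_id
    while True:
        n, remainder = divmod(n, 36)
        digits.append(ALPHABET[remainder])
        if n == 0:
            break
    return "".join(reversed(digits)).rjust(min_length, "0")
```

    Because the output contains only `0-9` and `a-z`, it always starts with a letter or number, and even a 64-bit ID needs no more than 13 characters.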

  • php regex - replace on "\${1}"

    - by Qiao
    I found this regex, which inserts " " every 10 characters: $text = preg_replace("|(.{10})|u", "\${1}"." ", $text); Can you please explain what \${1} means? Why is \ used, and what do the curly brackets mean?
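
    For comparison, here is the same "insert a space every 10 characters" replacement sketched in Python, where the reference to the first capture group is written `\g<1>`. In the PHP version, the backslash only stops PHP from interpolating `$` inside the double-quoted string, and the braces delimit the group number. The sample text and the function name `space_every_10` are made up for the sketch.

```python
import re


def space_every_10(text: str) -> str:
    """Append a space after each complete run of 10 characters."""
    # (.{10}) captures 10 characters; \g<1> puts them back, followed by a space.
    return re.sub(r"(.{10})", r"\g<1> ", text)


result = space_every_10("abcdefghijklmnopqrstuvwxy")
```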

  • â?? in my html after purify

    - by mmcgrail
    I have a database that I am rebuilding. The table structure was crap, so I'm porting some of the data from one table to another. This data appears to have been copied and pasted from an MSO product, so as I'm getting the data I clean it up with htmlpurifier and a little str_replace in PHP. Here is the clean function:

        function clean_html($html) {
            $config = HTMLPurifier_Config::createDefault();
            $config->set('AutoFormat','RemoveEmpty',true);
            $config->set('HTML','AllowedAttributes','href,src');
            $config->set('HTML','AllowedElements','p,em,strong,a,ul,li,ol,img');
            $purifier = new HTMLPurifier($config);
            $html = $purifier->purify($html);
            $html = str_replace('&nbsp;',' ',$html);
            $html = str_replace("\r",'',$html);
            $html = str_replace("\n",'',$html);
            $html = str_replace("\t",'',$html);
            $html = str_replace(' ',' ',$html);
            $html = str_replace('<p> </p>','',$html);
            $html = str_replace(chr(160),' ',$html);
            return trim($html);
        }

    But when I put the results into my new table and output them to the ckeditor, I get those three characters. I then have a JavaScript function that is called to remove special characters from the content of the ckeditor too. It doesn't clean it either:

        function remove_special(str) {
            var rExps=[ /[\xC0-\xC2]/g, /[\xE0-\xE2]/g,
                /[\xC8-\xCA]/g, /[\xE8-\xEB]/g,
                /[\xCC-\xCE]/g, /[\xEC-\xEE]/g,
                /[\xD2-\xD4]/g, /[\xF2-\xF4]/g,
                /[\xD9-\xDB]/g, /[\xF9-\xFB]/g,
                /\xD1/,/\xF1/g,
                "/[\u00a0|\u1680|[\u2000-\u2009]|u200a|\u200b|\u2028|\u2029|\u202f|\u205f|\u3000|\xa0]/g",
                /\u000b/g,'/[\u180e|\u000c]/g',
                /\u2013/g, /\u2014/g,
                /\xa9/g,/\xae/g,/\xb7/g,/\u2018/g,/\u2019/g,/\u201c/g,/\u201d/g,/\u2026/g];
            var repChar=['A','a','E','e','I','i','O','o','U','u','N','n',' ','\t','','-','--','(c)','(r)','*',"'","'",'"','"','...'];
            for(var i=0; i<rExps.length; i++) {
                str=str.replace(rExps[i],repChar[i]);
            }
            for (var x = 0; x < str.length; x++) {
                charcode = str.charCodeAt(x);
                if ((charcode < 32 || charcode > 126) && charcode !=10 && charcode != 13) {
                    str = str.replace(str.charAt(x), "");
                }
            }
            return str;
        }

    Does anyone know offhand what I need to do to get rid of them? I think they may be some sort of quote.
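
    A three-character sequence starting with "â" is typical of a curly quote stored as UTF-8 and then read back in a single-byte encoding. The Python sketch below reproduces that effect and reverses it. It assumes the character is a right single quote (U+2019) and the wrong encoding is Windows-1252; the post confirms neither, so treat this as a diagnostic illustration only.

```python
# A right single quotation mark, as word processors often insert it.
original = "It\u2019s"

# UTF-8 stores U+2019 as three bytes: E2 80 99.
utf8_bytes = original.encode("utf-8")

# Reading those bytes as Windows-1252 yields three separate characters.
garbled = utf8_bytes.decode("cp1252")

# Reversing the wrong decoding step recovers the original text.
repaired = garbled.encode("cp1252").decode("utf-8")
```

    If this is the cause, the durable fix is usually to make the database connection, the page, and the editor agree on one encoding rather than to strip the characters afterwards.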

  • How do I write an Oracle SQL query for this tricky question...

    - by atrueguy
    Here is the table data, with the column name Ships.

        +-----------------+
        | Ships           |
        +-----------------+
        | Duke of north   |
        | Prince of Wales |
        | Baltic          |
        +-----------------+

    In the Outcomes table, transform names of the ships containing more than one space, as follows: Replace all characters between the first and the last spaces (excluding these spaces) by symbols of an asterisk (*). The number of asterisks must be equal to number

  • SVN 255 Character Problem

    - by Tom
    Hi guys, I am using TortoiseSVN and we have a problem when exporting, etc., because Subversion reports errors. The path has a character limit of 255, so I am not sure if this is the problem (I think it is, on 64-bit Win7). How do I fix this, i.e. allow paths of 255 characters?

  • Problem on creating font using a custom ant task, which extends LWUIT's FontTask.

    - by Smithy
    Hi. I am new to LWUIT and J2ME, and I am building a J2ME application for showing Japanese text vertically. The phonetic symbol part of the text should be shown in a relatively small font size (about half the size of the text), small kana need to be shown as normal ones, some 'vertical only' characters need to be put into the Private Use Area, etc. I tried to build this font into a bitmap font using the FontTask ant task LWUIT provides, but found that it does not support the customizations mentioned above. So I decided to write my own task and add those. Below is what I have achieved:

    1. An ant task extending the LWUITTask task to support a new nested element <verticalfont>.

        public class VerticalFontBuildTask extends LWUITTask {
            public void addVerticalfont(VerticalFontTask anVerticalFont) {
                super.addFont(anVerticalFont);
            }
        }

    2. The VerticalFontTask task, which extends the original FontTask. Instead of inserting an EditorFont object, it inserts a VerticalEditorFont object (derived from EditorFont) into the resource.

        public class VerticalFontTask extends FontTask {
            // some constants are omitted
            public VerticalFontTask() {
                StringBuilder sb = new StringBuilder();
                sb.append(UPPER_ALPHABET);
                sb.append(UPPER_ALPHABET.toLowerCase());
                sb.append(HALFWIDTH);
                sb.append(HIRAGANA);
                sb.append(HIRAGANA_SMALL);
                sb.append(KATAKANA);
                sb.append(KATAKANA_SMALL);
                sb.append(WIDE);
                this.setCharset(sb.toString());
            }

            @Override
            public void addToResources(EditableResources e) {
                log("Putting rigged font into resource...");
                super.addToResources(e);
                // antialias settings
                Object aa = this.isAntiAliasing()
                    ? RenderingHints.VALUE_TEXT_ANTIALIAS_ON
                    : RenderingHints.VALUE_TEXT_ANTIALIAS_OFF;
                VerticalEditorFont ft = new VerticalEditorFont(
                    Font.createSystemFont(this.systemFace, this.systemStyle, this.systemSize),
                    null, getLogicalName(), isCreateBitmap(), aa, getCharset());
                e.setFont(getName(), ft);
            }
        }

    VerticalEditorFont is just a bunch of methods that log to output and call super. I am still trying to figure out how to extend it. But things are not going well: none of the methods on the VerticalEditorFont object get called when executing this task. My questions are:

    1. Where did I go wrong?

    2. I want to embed a TrueType font to support larger screens. I only need a small part of the font inside my application and I don't want it to carry a font resource weighing 1~2MB. Is there a way to extract only the characters needed and pack them into LWUIT?

  • Java OCR Help Needed

    - by maSnun
    Hello, how do I detect all the characters in an image? The image is a PNG and the font is constant. For simplicity, let's assume that the image has only numeric digits and there are only 4 digits in an image. I need to read all of them and output the text. Can you help? Thanks in advance.

  • UILabel to render partial character using clip

    - by magic-c0d3r
    I want a UILabel to render a partial character by setting the lineBreakMode to clip, but it is clipping the entire character. Is there a different way to clip a word so that only a partial character is displayed? Let's say I have a string like "Hello Word", and that string is in a myLabel with a width that only fits the first 5 characters and part of the "W". I want it to still render part of the "W" and not drop it from the render.

  • xcode user script - Apple Script - Sort selected lines by length

    - by Bach
    I need to create a user script in Xcode that sorts a selection of multiple lines by their length (number of characters). I know of the macro variable PBXTextLength, but I'm not sure how to write the script. This is the sort selection script in Xcode:

        echo -n "%%%{PBXSelection}%%%"
        sort <&0
        echo -n "%%%{PBXSelection}%%%"

    How can I modify that script to sort the selection by the length of the line (PBXTextLength)? Thanks.
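
    The sorting step itself, independent of Xcode, could look like the Python sketch below. The function name `sort_lines_by_length` is made up for illustration, and wiring it to standard input and the `%%%{PBXSelection}%%%` markers, as the shell script above does, is left out.

```python
def sort_lines_by_length(text: str) -> str:
    """Return the lines of `text` ordered from shortest to longest.

    The sort is stable, so lines of equal length keep their original order.
    """
    lines = text.splitlines()
    ordered = sorted(lines, key=len)
    result = "\n".join(ordered)
    # Preserve a trailing line break if the input ended with one.
    if text.endswith(("\n", "\r")):
        result += "\n"
    return result


sorted_text = sort_lines_by_length("ccc\na\nbb\n")
```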

  • timeout stringwithcontentsofurl

    - by sergiobuj
    Hi, I have this call to stringwithcontentsofurl: [NSString stringWithContentsOfURL:url usedEncoding:NSASCIIStringEncoding error:nil]; How can I give that a simple timeout? I don't want to use threads or operation queues (the content of the url is about 100 characters), I just don't want to wait too long when the connection is slow.

  • Problem with regular expression for some special pattern.

    - by SpawnCxy
    Hi all, I ran into a problem when I tried to find some characters with the following code:

        preg_match_all('/[\w\uFF10-\uFF19\uFF21-\uFF3A\uFF41-\uFF5A]/',$str,$match); //line 5
        print_r($match);

    And I got the error below:

        Warning: preg_match_all() [function.preg-match-all]: Compilation failed: PCRE does not support \L, \l, \N, \U, or \u at offset 4 in E:\mycake\app\webroot\re.php on line 5

    I'm not very familiar with regular expressions and have no idea what this error means. How can I fix this? Thanks.

  • Include a version control tag in VSS

    - by Sjuul Janssen
    I was reading Code Complete 2 and it mentions this: "Many version-control tools will insert version information into a file. In CVS, for example, the characters // $Id$ will automatically expand to // $Id: ClassName.java,v 1.1 2004/02/05 00:36:42 ismene Exp $". So now I would like to do something similar with VSS for our SQL scripts. I have been googling around for the answer but can't find it. Is this possible? Can someone maybe point me in the right direction?

  • How do I add html link to image title

    - by Jason
    I actually need to include HTML links in the longdesc attribute. I've altered prettyphoto to use longdesc instead of title for images, but I need to include HTML links in those descriptions. I know it's possible using character codes; I just don't remember what they are. Thanks.

  • Building proper link with spaces

    - by Joel
    Hello, I have the following code in Python:

        linkHTML = "<a href=\"page?q=%s\">click here</a>" % strLink

    The problem is that when strLink has spaces in it, the link shows up as

        <a href="page?q=with space">click here</a>

    I can use strLink.replace(" ","+"), but I am sure there are other characters which can cause errors. I tried using urllib.quote(strLink), but it doesn't seem to help. Thanks! Joel
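
    A sketch of the usual two-step approach: percent-encode the query value, then HTML-escape the attribute. It is written for Python 3, where the quoting functions live in `urllib.parse` rather than the Python 2 `urllib` module the post uses; the helper name `build_link` and its `label` parameter are assumptions for illustration.

```python
from html import escape
from urllib.parse import quote_plus


def build_link(query: str, label: str = "click here") -> str:
    """Build an anchor tag whose query value is safely encoded."""
    # quote_plus turns spaces into "+" and percent-encodes &, =, ? and so on.
    href = "page?q=" + quote_plus(query)
    # escape() protects the HTML attribute and the link text themselves.
    return '<a href="{}">{}</a>'.format(escape(href, quote=True), escape(label))


link = build_link("with space")
```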

  • How to move the textbox caret to the right.

    - by monkey_boys
    I would like to change all the characters entered into a textbox to upper case. The code will add the character, but how do I move the caret to the right?

        private void textBox3_KeyPress(object sender, KeyPressEventArgs e)
        {
            textBox3.Text += e.KeyChar.ToString().ToUpper();
            e.Handled = true;
        }

  • Formatting a CSV File that contains HTML for Import to Excel

    - by Dave
    I would like to export a CSV file from my application for importing into Excel (or any other spreadsheet that supports CSV files). However, one of the cells in my table has rich content (i.e. HTML), which can, of course, contain commas as well as other HTML characters and formatting. I realize that Excel "can" handle HTML-formatted text, but exporting it as CSV tends to screw up the data for importing. Is there some particular way I can format that particular cell so that it imports correctly?
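
    The usual CSV convention is to wrap such a field in double quotes and double any quote characters inside it. Python's standard `csv` module does this, as the sketch below shows. The helper name `rows_to_csv` and the sample rows are made up, and whether a given spreadsheet then displays the markup as text or renders it is a separate question.

```python
import csv
import io


def rows_to_csv(rows):
    """Serialize rows to CSV text, quoting every field."""
    buffer = io.StringIO(newline="")
    # QUOTE_ALL wraps each field in double quotes; embedded quotes are doubled.
    writer = csv.writer(buffer, quoting=csv.QUOTE_ALL)
    writer.writerows(rows)
    return buffer.getvalue()


rows = [["id", "body"], ["1", '<p class="x">a, b</p>']]
csv_text = rows_to_csv(rows)
```

    Reading `csv_text` back with `csv.reader` returns the original rows, commas and quotes included.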
