Search Results

Search found 5433 results on 218 pages for 'escaped characters'.

Page 82/218 | < Previous Page | 78 79 80 81 82 83 84 85 86 87 88 89  | Next Page >

  • Converting from ANSI to Unicode

    - by Rayne
    Hi all, I'm using Visual Studio .NET 2003, and I'm trying to convert a program written in purely ANSI characters to be independent of Unicode/Multi-byte characters. The program has a callback function of pcap_loop, called "got_packet". It's defined as void got_packet(u_char *user, const struct pcap_pkthdr *header, const u_char *cpacket) { USES_CONVERSION; _TUCHAR *packet; packet = A2T(cpacket); ... } However, I get the error message error C2440: 'type cast': cannot convert from 'const u_char *' to 'ATL::CA2WEX<>' How do fix this? Thank you. Regards, Rayne

    Read the article

  • Finding text's bounding rect in Core Text

    - by Mo
    I'm trying to find the boundaries of a line of text in Core Text. For simplicity, assume it has a single character. At the moment I'm using the following method: line = CTLineCreateWithAttributedString(attrString); rect = CTLineGetImageBounds(line, context); It works most of the times, but for some characters, like math italic d (Unicode: 0x1D451) or math italic q (Unicode: 0x1D45E), the width is a bit short. I tried using CTLineGetTypographicBounds() or CTFramesetterSuggestFrameSizeWithConstraints, but they didn't help either (I think they use glyph's advance to find the width, not its graphical width.) As the font itself isn't italic, I also can't use slant angle to correct this. I tried accessing the glyphs directly and using CTFontCreatePathForGlyph(), but failed as CGGlyph and UniChar are both 16-bits and I need 32-bit characters. Does anyone know if I'm doing anything wrong? If so, what's the right way?

    Read the article

  • to escape or not to escape: well formed XHTML with diacritics

    - by andresmh
    Say that you have a XHTML document in English but it has accented characters (e.g. meta name="author" content="José"). Let's say you have no control over the HTTP headers. Should the characters be replaced for their corresponding named entities (e.g. &aacute;, etc)? Should the doc type and the xml:lang attribute be set to English? I know I can check the W3C recommendation but I am asking more from a practical point of view.

    Read the article

  • C#, string replace russian to english

    - by Fabio Beoni
    Hello, I have a strange problem replacing chars in string... I read a .txt file containing russian text, and starting from a list of letters russian to english (ru=en), I loop the list and I WOULD like to replace russian characters with english characters. The problem is: I can see in the debug the right reading of the russian and the right reading of the english, but using myWord = myWord.Replace(ruChar, enChar), the string is not replaced. My txt file is a UTF-8 encoding. Any suggestions?? Thank you to all...

    Read the article

  • Is it possible to reliably auto-decode user files to Unicode? [C#]

    - by NVRAM
    I have a web application that allows users to upload their content for processing. The processing engine expects UTF8 (and I'm composing XML from multiple users' files), so I need to ensure that I can properly decode the uploaded files. Since I'd be surprised if any of my users knew their files even were encoded, I have very little hope they'd be able to correctly specify the encoding (decoder) to use. And so, my application is left with task of detecting before decoding. This seems like such a universal problem, I'm surprised not to find either a framework capability or general recipe for the solution. Can it be I'm not searching with meaningful search terms? I've implemented BOM-aware detection (http://en.wikipedia.org/wiki/Byte_order_mark) but I'm not sure how often files will be uploaded w/o a BOM to indicate encoding, and this isn't useful for most non-UTF files. My questions boil down to: Is BOM-aware detection sufficient for the vast majority of files? In the case where BOM-detection fails, is it possible to try different decoders and determine if they are "valid"? (My attempts indicate the answer is "no.") Under what circumstances will a "valid" file fail with the C# encoder/decoder framework? Is there a repository anywhere that has a multitude of files with various encodings to use for testing? While I'm specifically asking about C#/.NET, I'd like to know the answer for Java, Python and other languages for the next time I have to do this. So far I've found: A "valid" UTF-16 file with Ctrl-S characters has caused encoding to UTF-8 to throw an exception (Illegal character?) (That was an XML encoding exception.) Decoding a valid UTF-16 file with UTF-8 succeeds but gives text with null characters. Huh? Currently, I only expect UTF-8, UTF-16 and probably ISO-8859-1 files, but I want the solution to be extensible if possible. My existing set of input files isn't nearly broad enough to uncover all the problems that will occur with live files. Although the files I'm trying to decode are "text" I think they are often created w/methods that leave garbage characters in the files. Hence "valid" files may not be "pure". Oh joy. Thanks.

    Read the article

  • pyparsing ambiguity

    - by Claudiu
    I'm trying to parse some text using PyParser. The problem is that I have names that can contain white spaces. So my input might look like this: Joe Bob Jimmy Foo Joe decides to eat. Bob decides to not eat. Jimmy Foo decides to eat. How can I create a parser for the decides to eat line? If I create my name parser naively, meaning with alphabetic characters plus space characters, then it will match the entire line.

    Read the article

  • Why do C compilers prepend underscores to external names?

    - by Michael Burr
    I've been working in C for so long that the fact that compilers typically add an underscore to the start of an extern is just understood... However, another SO question today got me wondering about the real reason why the underscore is added. A wikipedia article claims that a reason is: It was common practice for C compilers to prepend a leading underscore to all external scope program identifiers to avert clashes with contributions from runtime language support I think there's at least a kernel of truth to this, but also it seems to no really answer the question, since if the underscore is added to all externs it won't help much with preventing clashes. Does anyone have good information on the rationale for the leading underscore? Is the added underscore part of the reason that the Unix creat() system call doesn't end with an 'e'? I've heard that early linkers on some platforms had a limit of 6 characters for names. If that's the case, then prepending an underscore to external names would seem to be a downright crazy idea (now I only have 5 characters to play with...).

    Read the article

  • Purpose of Trigraph sequences in C++?

    - by Kirill V. Lyadvinsky
    According to C++'03 Standard 2.3/1: Before any other processing takes place, each occurrence of one of the following sequences of three characters (“trigraph sequences”) is replaced by the single character indicated in Table 1. ---------------------------------------------------------------------------- | trigraph | replacement | trigraph | replacement | trigraph | replacement | ---------------------------------------------------------------------------- | ??= | # | ??( | [ | ??< | { | | ??/ | \ | ??) | ] | ??> | } | | ??’ | ˆ | ??! | | | ??- | ˜ | ---------------------------------------------------------------------------- In real life that means that code printf( "What??!\n" ); will result in printing What| because ??! is a trigraph sequence that is replaced with the | character. My question is what purpose of using trigraphs? Is there any practical advantage of using trigraphs? UPD: In answers was mentioned that some European keyboards don't have all the punctuation characters, so non-US programmers have to use trigraphs in everyday life? UPD2: Visual Studio 2010 has trigraph support turned off by default.

    Read the article

  • Unescape _xHHHH_ XML escape sequences using Python

    - by John Machin
    I'm using Python 2.x [not negotiable] to read XML documents [created by others] that allow the content of many elements to contain characters that are not valid XML characters by escaping them using the _xHHHH_ convention e.g. ASCII BEL aka U+0007 is represented by the 7-character sequence u"_x0007_". Neither the functionality that allows representation of any old character in the document nor the manner of escaping is negotiable. I'm parsing the documents using cElementTree or lxml [semi-negotiable]. Here is my best attempt at unescapeing the parser output as efficiently as possible: import re def unescape(s, subber=re.compile(r'_x[0-9A-Fa-f]{4,4}_').sub, repl=lambda mobj: unichr(int(mobj.group(0)[2:6], 16)), ): if "_" in s: return subber(repl, s) return s The above is biassed by observing a very low frequency of "_" in typical text and a better-than-doubling of speed by avoiding the regex apparatus where possible. The question: Any better ideas out there?

    Read the article

  • How to specify character encoding for Ant Task parameters in Java

    - by räph
    I'm writing an ANT task in Java. In my build.xml I specify parameters, which should be read from my java class. Problems occur, when I use special characters, like german umlauts (Ö,Ä,Ü) in these parameters. In my java task they appear as ?-characters (using System.out.print). All my files are encoded as UTF-8. and my build.xml has the corresponding declaration: <?xml version="1.0" encoding="UTF-8" ?> For the details of writing the task: I do it according to http://ant.apache.org/manual/develop.html (especially Point 5 nested elements). I have nested elements in my task like: <parameter name="test" value="ÖÄÜtest"/> and a java method: public void addConfiguredParameter(Parameter prop) { System.out.println(prop.getValue()); //prints ???test } to read the parameter values.

    Read the article

  • PHP: Replace umlauts with closest 7-bit ASCII equivalent in an UTF-8 string

    - by BlaM
    What I want to do is to remove all accents and umlauts from a string, turning "lärm" into "larm" or "andré" into "andre". What I tried to do was to utf8_decode the string and then use strtr on it, but since my source file is saved as UTF-8 file, I can't enter the ISO-8859-15 characters for all umlauts - the editor inserts the UTF-8 characters. Obviously a solution for this would be to have an include that's an ISO-8859-15 file, but there must be a better way than to have another required include? echo strtr(utf8_decode($input), 'ŠŒŽšœžŸ¥µÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖØÙÚÛÜÝßàáâãäåæçèéêëìíîïðñòóôõöøùúûüýÿ', 'SOZsozYYuAAAAAAACEEEEIIIIDNOOOOOOUUUUYsaaaaaaaceeeeiiiionoooooouuuuyy'); UPDATE: Maybe I was a bit inaccurate with what I try to do: I do not actually want to remove the umlauts, but to replace them with their closest "one character ASCII" aequivalent.

    Read the article

  • tchar safe functions -- count parameter for UTF-8 constants

    - by Dustin Getz
    I'm porting a library from char to TCHAR. the count parameter of this fragment, according to MSDN, is the number of multibyte characters, not the number of bytes. so, did I get this right? _tcsncmp(access, TEXT("ftp"), 3); //or do i want _tcsnccmp? "Supported on Windows platforms only, _mbsncmp and _mbsnbcmp are multibyte versions of strncmp. _mbsncmp will compare at most count multibyte characters and _mbsnbcmp will compare at most count bytes. They both use the current multibyte code page. _tcsnccmp and _tcsncmp are the corresponding Generic functions for _mbsncmp and _mbsnbcmp, respectively. _tccmp is equivalent to _tcsnccmp."

    Read the article

  • Python code formatting

    - by Curious2learn
    In response to another question of mine, someone suggested that I avoid long lines in the code and to use PEP-8 rules when writing Python code. One of the PEP-8 rules suggested avoiding lines which are longer than 80 characters. I changed a lot of my code to comply with this requirement without any problems. However, changing the following line in the manner shown below breaks the code. Any ideas why? Does it have to do with the fact that what follows return command has to be in a single line? The line longer that 80 characters: def __str__(self): return "Car Type \n"+"mpg: %.1f \n" % self.mpg + "hp: %.2f \n" %(self.hp) + "pc: %i \n" %self.pc + "unit cost: $%.2f \n" %(self.cost) + "price: $%.2f "%(self.price) The line changed by using Enter key and Spaces as necessary: def __str__(self): return "Car Type \n"+"mpg: %.1f \n" % self.mpg + "hp: %.2f \n" %(self.hp) + "pc: %i \n" %self.pc + "unit cost: $%.2f \n" %(self.cost) + "price: $%.2f "%(self.price)

    Read the article

  • UTF-8 to ISO-8859-1 mapping / lossless conversion libraries in Java

    - by Pawel Krupinski
    I need to perform a conversion of characters from UTF-8 to ISO-8859-1 in Java without losing for example all of the UTF-8 specific punctuation. Ideally would like these to be converted to equivalents in ISO (e.g. there are probably 5 different single quotes in UTF-8 and would like them all converted to ISO single quote character). String.getBytes("ISO-8859-1") just won't do the trick in this case as it will lose the UTF-8-specific chars. Do you know of any ready mappings or libraries in Java that would map UTF-8 specific characters to ISO?

    Read the article

  • Splitting a string in ASP Classic

    - by Sam
    So here's my string: Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nullam elit lacus, dignissim quis laoreet non, cursus id eros. Etiam lacinia tortor vel purus eleifend accumsan. Pellentesque habitant morbi tristique senectus et netus et malesuada fames ac turpis egestas. Quisque bibendum vestibulum nisl vitae volutpat. I need to split it every 100 characters (full words only) until all the characters are used up. So we'd end up with: Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nullam elit lacus, dignissim quis laoreet non, and cursus id eros. Etiam lacinia tortor vel purus eleifend accumsan. Pellentesque habitant morbi tristique and senectus et netus et malesuada fames ac turpis egestas. Quisque bibendum vestibulum nisl vitae volutpat. Any ideas on the best way to do that?

    Read the article

  • Convert Decimal to ASCII

    - by Dan Snyder
    I'm having difficulty using reinterpret_cast. Before I show you my code I'll let you know what I'm trying to do. I'm trying to get a filename from a vector full of data being used by a MIPS I processor I designed. Basically what I do is compile a binary from a test program for my processor, dump all the hex's from the binary into a vector in my c++ program, convert all of those hex's to decimal integers and store them in a DataMemory vector which is the data memory unit for my processor. I also have instruction memory. So When my processor runs a SYSCALL instruction such as "Open File" my C++ operating system emulator receives a pointer to the beginning of the filename in my data memory. So keep in mind that data memory is full of ints, strings, globals, locals, all sorts of stuff. When I'm told where the filename starts I do the following: Convert the whole decimal integer element that is being pointed to to its ASCII character representation, and then search from left to right to see if the string terminates, if not then just load each character consecutively into a "filename" string. Do this until termination of the string in memory and then store filename in a table. My difficulty is generating filename from my memory. Here is an example of what I'm trying to do: C++ Syntax (Toggle Plain Text) 1.Index Vector NewVector ASCII filename 2.0 240faef0 128123792 'abc7' 'a' 3.0 240faef0 128123792 'abc7' 'ab' 4.0 240faef0 128123792 'abc7' 'abc' 5.0 240faef0 128123792 'abc7' 'abc7' 6.1 1234567a 243225 'k2s0' 'abc7k' 7.1 1234567a 243225 'k2s0' 'abc7k2' 8.1 1234567a 243225 'k2s0' 'abc7k2s' 9. //EXIT LOOP// 10.1 1234567a 243225 'k2s0' 'abc7k2s' Index Vector NewVector ASCII filename 0 240faef0 128123792 'abc7' 'a' 0 240faef0 128123792 'abc7' 'ab' 0 240faef0 128123792 'abc7' 'abc' 0 240faef0 128123792 'abc7' 'abc7' 1 1234567a 243225 'k2s0' 'abc7k' 1 1234567a 243225 'k2s0' 'abc7k2' 1 1234567a 243225 'k2s0' 'abc7k2s' //EXIT LOOP// 1 1234567a 243225 'k2s0' 'abc7k2s' Here is the code that I've written so far to get filename (I'm just applying this to element 1000 of my DataMemory vector to test functionality. 1000 is arbitrary.): C++ Syntax (Toggle Plain Text) 1.int i = 0; 2.int step = 1000;//top->a0; 3.string filename; 4.char *temp = reinterpret_cast<char*>( DataMemory[1000] );//convert to char 5.cout << "a0:" << top->a0 << endl;//pointer supplied 6.cout << "Data:" << DataMemory[top->a0] << endl;//my vector at pointed to location 7.cout << "Data(1000):" << DataMemory[1000] << endl;//the element I'm testing 8.cout << "Characters:" << &temp << endl;//my temporary char array 9. 10.while(&temp[i]!=0) 11.{ 12. filename+=temp[i];//add most recent non-terminated character to string 13. i++; 14. if(i==4)//when 4 chatacters have been added.. 15. { 16. i=0; 17. step+=1;//restart loop at the next element in DataMemory 18. temp = reinterpret_cast<char*>( DataMemory[step] ); 19. } 20. } 21. cout << "Filename:" << filename << endl; int i = 0; int step = 1000;//top-a0; string filename; char *temp = reinterpret_cast( DataMemory[1000] );//convert to char cout << "a0:" << top-a0 << endl;//pointer supplied cout << "Data:" << DataMemory[top-a0] << endl;//my vector at pointed to location cout << "Data(1000):" << DataMemory[1000] << endl;//the element I'm testing cout << "Characters:" << &temp << endl;//my temporary char array while(&temp[i]!=0) { filename+=temp[i];//add most recent non-terminated character to string i++; if(i==3)//when 4 chatacters have been added.. { i=0; step+=1;//restart loop at the next element in DataMemory temp = reinterpret_cast( DataMemory[step] ); } } cout << "Filename:" << filename << endl; So the issue is that when I do the conversion of my decimal element to a char array I assume that 8 hex #'s will give me 4 characters. Why isn't this this case? Here is my output: C++ Syntax (Toggle Plain Text) 1.a0:0 2.Data:0 3.Data(1000):4428576 4.Characters:0x7fff5fbff128 5.Segmentation fault

    Read the article

  • This regx does not work only in Chrome

    - by Deeptechtons
    Hi i just put up a validation function in jScript to validate filename in fileupload control[input type file]. The function seems to work fine in FF and sometimes in ie but never in Chrome. Basically the function tests if File name is atleast 1 char upto 25 characters long.Contains only valid characters,numbers [no spaces] and are of file types in the list. Could you throw some light on this function validate(Uploadelem) { var objRgx = new RegExp(/^[\w]{1,25}\.*\.(jpg|gif|png|jpeg|doc|docx|pdf|txt|rtf)$/); objRgx.ignoreCase = true; if (objRgx.test(Uploadelem.value)) { document.getElementById('moreUploadsLink').style.display = 'block'; } else { document.getElementById('moreUploadsLink').style.display = 'none'; } }

    Read the article

  • ASP.NET plug-in architecture, settings problem

    - by Xaqron
    I want to divide business layer (BLL) of an asp.net application into multiple components. Each component is a .NET class library which is compiled as a standalone DLL. These components should have their own configuration files. For example "MyNameSpace.Users.dll" contains classes about users of the website and there's a password policy to check if password length is at least x characters. When webmaster edits the config file of this DLL and set x to y then component (DLL) should use new value (y) in the future and enforce passwords to be at least y characters. I want each component as a single project and compile them separaely (and not to put all projects in a solution in Visual Studio), and put the DLLs of these libraries into the "Bin" folder of my ASP.NET application. Is it possible ? Where should I put these config files ?

    Read the article

  • comparing strings in PostgreSQL

    - by binaryLV
    Hello! Is there any way in PostgreSQL to convert UTF-8 characters to "similar" ASCII characters? String glažškunu rukiši would have to be converted to glazskunu rukisi. UTF-8 text is not in some specific language, it might be in Latvian, Russian, English, Italian or any other language. This is needed for using in where clause, so it might be just "comparing strings" rather than "converting strings". I tried using convert, but it does not give desired results (e.g., select convert('A', 'utf8', 'sql_ascii') gives \304\200, not A). Database is created with: ENCODING = 'UTF8' LC_COLLATE = 'Latvian_Latvia.1257' LC_CTYPE = 'Latvian_Latvia.1257' These params may be changed, if necessary.

    Read the article

  • How to get some xml that comes before and a little from after a DOM Node.

    - by startoftext
    I am using java and I am pretty open to using w3c DOM or DOM4J at this point. So lets say I have a Node like a text node that I have found something interesting in, like say an occurrence of a substring in the nodes text. If I want to get a string with a number characters preceding that node and a few characters after that node how may I do that? Basically I need to be able to display a snippet of the original xml around the occurrence of that string. The problem I have with getting the parent node for example and then calling asXML is that I no longer know the exact location of the substring in the text node. If I search again for that string value in the parents xml then I may find 2 occurrences or many more if the parent has other children that contain an occurrence of that string. Much appreciation if any one can answer this question.

    Read the article

  • Select Range by string

    - by Petott
    How I can change this feature so I select the range of characters in a word document between the characters "E" and "F", if I have; xasdasdEcdscasdcFvfvsdfv is underlined to me the range - cdscasdc private void Rango() { Word.Range rng; Word.Document document = this.Application.ActiveDocument; object startLocation = "E"; object endLocation = "F"; // Supply a Start and End value for the Range. rng = document.Range(ref startLocation, ref endLocation); // Select the Range. rng.Select(); } This function will not let me pass by reference two objects of string type....... Thanks

    Read the article

  • Effective methods for reading and writing large files in C

    - by Bertholt Stutley Johnson
    I'm writing an application that deals with very large user-generated input files. The program will copy about 95 percent of the file, effectively duplicating it and switching a few words and values in the copy, and then appending the copy (in chunks) to the original file, such that each block (consisting of between 10 and 50 lines) in the original is followed by the copied and modified block, and then the next original block, and so on. The user-generated input conforms to a certain format, and it is highly unlikely that any line in the original file is longer than 100 characters long. Which would be the better approach? a) To use one file pointer and use variables that hold the current position of how much has been read and where to write to, seeking the file pointer back and forth to read and write; or b) To use multiple file pointers, one for reading and one for writing. I am mostly concerned with the efficiency of the program, as the input files will reach up to 25,000 lines, each about 50 characters long. Thanks!

    Read the article

  • Changing text depending on rounded total from database

    - by NeonBlue Bliss
    On a website I have a number of small PHP scripts to automate changes to the text of the site, depending on a figure that's calculated from a MySQL database. The site is for a fundraising group, and the text in question on the home page gives the total amount raised. The amount raised is pulled from the database and rounded to the nearest thousand. This is the PHP I use to round the figure and find the last three digits of the total: $query4 = mysql_query("SELECT SUM(amountraised) AS full_total FROM fundraisingtotal;"); $result4 = mysql_fetch_array($query4); $fulltotal = $result4["full_total"]; $num = $fulltotal + 30000; $ftotalr = round($num,-3); $roundnum = round($num); $string = $roundnum; $length = strlen($string); $characters = 3; $start = $length - $characters; $string = substr($string , $start ,$characters); $figure = $string; (£30,000 is the amount that had been raised by the previous fundraising team from when the project first started, which is why I've added 30000 to $fulltotal for the $num variable) Currently the text reads: the bookstall and other fundraising events have raised more than &pound;<? echo number_format($ftotalr); ?> I've just realised though that because the PHP is rounding to the nearest thousand, if the total's for example £39,200 and it's rounded to £40,000, to say it's more than £40,000 is incorrect, and in that case I'd need it to say 'almost £40,000' or something similar. I obviously need to replace the 'more than' with a variable. Obviously I need to test whether the last three digits of the total are nearer to 0 or 1000, so that if the total was for example £39,2000, the text would read 'just over', if it was between £39,250 and £39,400 something like 'over', between £39,400 and £39,700 something like 'well over', and between £39,700 and £39,999, 'almost.' I've managed to get the last three digits of the total as a variable, and I think I need some sort of an if/else/elseif code block (not sure if that would be the right approach, or whether to use case/break), and obviously I'm going to have to check whether the figure meets each of the criteria, but I can't figure out how to do that. Could anyone suggest what would be the best way to do this please?

    Read the article

  • A maximum character limit on the preg functions?

    - by animuson
    On my site I use output buffering to grab all the output and then run it through a process function before sending it out to the browser (I don't replace anything, just break it into more manageable pieces). In this particular case, there is a massive amount of output because it is listing out a label for every country in the database (around 240 countries). The problem is that in full, my preg_match functions seems to get skipped over, it does absolutely nothing and returns no matches. However, if I remove parts of the labels (no particular part, just random pieces to reduce characters) then the preg_match functions works again. It doesn't seem to matter what I remove from the label, it just seems to be that as long as I remove so many characters. Is there some sort of cap on what the preg functions can handle or will it time out if there is too much data to be scanned over?

    Read the article

< Previous Page | 78 79 80 81 82 83 84 85 86 87 88 89  | Next Page >