Search Results

Search found 5604 results on 225 pages for 'chinese characters'.

Page 62/225 | < Previous Page | 58 59 60 61 62 63 64 65 66 67 68 69  | Next Page >

  • help with regex needed

    - by user268375
    I need a regular expression with the following requirements: the string is alphanumeric and has exactly 6 characters in the first half, followed by an optional hyphen, followed by up to 4 optional characters (the second half cannot have more than 4 characters). Any of the following is valid: 11111A 111111-1 111111-yy yyyyy-989 yyyyyy-9090. I thought the expression /[a-zA-Z0-9]([-])?[a-zA-Z0-9]{5,10}$/; should work, but I am unable to get it working correctly. Any help will be appreciated.
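
    The expression quoted in the question has no start anchor and no quantifier on its first character class, so it reads as "one character, an optional hyphen, then 5 to 10 characters". A minimal Python sketch of one possible reading of the requirements follows. It assumes "exactly 6 alphanumerics, then optionally a hyphen and 1 to 4 more alphanumerics"; the names CODE_PATTERN and is_valid_code are hypothetical. Under that reading the sample "yyyyy-989" is rejected, because its first half has only five characters.

```python
import re

# Assumed format: 6 alphanumerics, optionally followed by an optional hyphen
# and 1 to 4 more alphanumerics. CODE_PATTERN is a hypothetical name.
CODE_PATTERN = re.compile(r"[A-Za-z0-9]{6}(?:-?[A-Za-z0-9]{1,4})?")


def is_valid_code(text: str) -> bool:
    """Return True when the whole string matches the assumed format."""
    return CODE_PATTERN.fullmatch(text) is not None


if __name__ == "__main__":
    for sample in ["11111A", "111111-1", "111111-yy", "yyyyyy-9090",
                   "yyyyy-989", "111111-12345"]:
        print(sample, is_valid_code(sample))
```

    Using fullmatch plays the role of anchoring both ends of the pattern, which is the piece the original expression was missing.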

    Read the article

  • How to get the substring in C#?

    - by Nano HE
    Hi, I can get the first three characters with the function below. However, how can I get the output of the last five characters (Three) with Substring() function. Or other string function will be used? Thank you. static void Main() { string input = "OneTwoThree"; // Get first three characters string sub = input.Substring(0, 3); Console.WriteLine("Substring: {0}", sub); // Output One. }
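
    The idea of taking the last five characters can be sketched as a slice counted from the end of the string. The sketch below is in Python and the helper name last_chars is hypothetical; in C# the same result would presumably come from input.Substring(input.Length - 5), which starts at the given index and runs to the end of the string.

```python
def last_chars(text: str, count: int) -> str:
    """Return the last `count` characters of `text` (all of it if shorter)."""
    if count <= 0:
        return ""
    return text[-count:]


if __name__ == "__main__":
    print(last_chars("OneTwoThree", 5))
```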

    Read the article

  • How do I convert filenames from unicode to ascii

    - by zedwarth
    I have a bunch of music files on a NTFS partition mounted on linux that have filenames with unicode characters. I'm having trouble writing a script to rename the files so that all of the file names use only ASCII characters. I think that using the iconv command should work, but I'm having trouble escaping the characters for the mv command.
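
    One way to sidestep the shell-escaping problem is to do the renaming from a script that never passes the names through a shell. The sketch below is a rough Python illustration, not the iconv approach from the question: it decomposes accented letters with unicodedata and drops anything that still has no ASCII form. The function names are hypothetical, and characters without a decomposition (dashes, CJK text) are simply removed, which may not be what every collection needs.

```python
import os
import unicodedata


def to_ascii_name(name: str) -> str:
    """Strip accents and drop characters that have no ASCII equivalent."""
    decomposed = unicodedata.normalize("NFKD", name)
    return decomposed.encode("ascii", "ignore").decode("ascii")


def rename_to_ascii(directory: str) -> None:
    """Rename every entry in `directory` whose name contains non-ASCII text."""
    for name in os.listdir(directory):
        new_name = to_ascii_name(name)
        if not new_name or new_name == name:
            continue  # nothing to do, or nothing would be left of the name
        target = os.path.join(directory, new_name)
        if os.path.exists(target):
            continue  # do not overwrite an existing file
        os.rename(os.path.join(directory, name), target)
```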

    Read the article

  • Turkish character problems while parsing (Android)

    - by alper35.5
    I am parsing HTML content and showing the output on my screen. The website has Turkish characters such as çÇşŞöÖğĞıİüÜ. I am not able to show them as proper characters; they are printed out as question marks. Eclipse - Project - Properties - Resource - Text File Encoding = Inherited from container (Cp1254). I searched the web and found this solution: Eclipse - Project - Properties - Resource - Text File Encoding = Other: UTF-8. However, it's not working. It only changes the characters already in my files. (I have titles with such characters in my activities.) Any help? Thanks in advance...
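
    Question marks of this kind usually mean the downloaded bytes were decoded with a different charset than the one the page was written in, which is separate from the encoding Eclipse uses for source files. The Python sketch below only illustrates that mechanism; it assumes the page is served as windows-1254 (Cp1254), and decode_page is a hypothetical helper.

```python
TURKISH = "çÇşŞöÖğĞıİüÜ"


def decode_page(raw: bytes, charset: str) -> str:
    """Decode downloaded bytes, marking undecodable bytes with U+FFFD."""
    return raw.decode(charset, errors="replace")


if __name__ == "__main__":
    raw = TURKISH.encode("cp1254")  # bytes as an assumed Cp1254 server sends them
    print(decode_page(raw, "cp1254") == TURKISH)   # right charset round-trips
    print("\ufffd" in decode_page(raw, "utf-8"))   # wrong charset loses letters
```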

    Read the article

  • Password generation, best practice

    - by Aidan
    I need to generate some passwords, and I want to avoid characters that can be confused with each other. Is there a definitive list of characters I should avoid? My current list is il10o8B3Evu![]{} Are there any other pairs of characters that are easy to confuse? For special characters I was going to limit myself to those under the number keys, though I know that this differs depending on your keyboard's nationality! As a rider question, I would like my passwords to be 'wordlike'; do you have a favoured algorithm for that? Thanks :)
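
    The exclusion-list idea can be sketched as follows. This is a minimal Python illustration that uses the list from the question as its only source of "confusable" characters; it does not attempt the 'wordlike' part, and the names CONFUSABLE, build_alphabet and generate_password are hypothetical.

```python
import secrets
import string

# The characters the question lists as easy to confuse.
CONFUSABLE = set("il10o8B3Evu![]{}")


def build_alphabet() -> str:
    """Letters and digits with the confusable characters removed."""
    candidates = string.ascii_letters + string.digits
    return "".join(ch for ch in candidates if ch not in CONFUSABLE)


def generate_password(length: int = 12) -> str:
    """Pick characters with a cryptographically secure random source."""
    alphabet = build_alphabet()
    return "".join(secrets.choice(alphabet) for _ in range(length))


if __name__ == "__main__":
    print(generate_password())
```

    Shrinking the alphabet lowers the entropy per character, so a longer default length is a reasonable trade for readability.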

    Read the article

  • Why does this Bash regex match return an Exit Status of "2"?

    - by PreservedMoose
    I'm writing a Bash script that needs to scan for the existence of non-ASCII characters in filenames. I'm using the POSIX bracket regex syntax to match the non-ASCII characters, but for some reason, when I test for the match in an if/then statement, the test always returns an Exit Status of 2, and never matches my test string. Here's the code in question: FILEREQ_SOURCEFILE="Filename–WithNonAScII-Charàcters-05sec_23.98.mov" REGEX_MATCH_NONASCII="[^[:ascii:]]" if [[ $FILEREQ_SOURCEFILE =~ $REGEX_MATCH_NONASCII ]]; then echo "Exit Status: $?" echo "Matched!" else echo "Exit Status: $?" echo "No Match" fi This code always returns: Exit Status: 2 No Match I've read and re-read the bash-hackers.org explanation of how regex matching works, as well as this previous question on SO regarding matching non-ASCII characters, but for the life of me, I can't get this to work. What am I missing here?
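
    For context: Bash documents an exit status of 2 for [[ ... =~ ... ]] when the regular expression is syntactically incorrect, and "ascii" is not one of the POSIX character class names (alnum, alpha, blank, cntrl, digit, graph, lower, print, punct, space, upper, xdigit), so the regex is most likely being rejected rather than failing to match. The underlying check itself is simple; a hedged Python sketch of it, with the hypothetical helper has_non_ascii, looks like this:

```python
def has_non_ascii(text: str) -> bool:
    """Return True if any character lies outside the 7-bit ASCII range."""
    return any(ord(ch) > 127 for ch in text)


if __name__ == "__main__":
    name = "Filename–WithNonAScII-Charàcters-05sec_23.98.mov"
    print(has_non_ascii(name))
    print(has_non_ascii("plain-ascii_name.mov"))
```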

    Read the article

  • Using traversal by pointer to check whether a string is repeated

    - by Bob John
    bool repeat_char(char *s, int n); //R: s is a C-string of at least n non-NUL characters and n > 0 //E: returns true if the first n characters are fully repeated throughout the string s, false // otherwise. I'm having trouble implementing this function using traversal by pointer. I was thinking that I could extract the first n characters from s, then use that in a comparison with s, but I'm not sure how I could do that. If I'm traversing through s one character at a time, how can I check that it matches a block of text, such as the first n characters of s? Thanks!
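
    One way to avoid extracting the first n characters at all is to compare each character with the one at the same offset inside the first block. The Python sketch below mirrors that traversal with an index instead of a pointer. It assumes "fully repeated" means the whole string consists of whole copies of the first n characters, which the specification in the question does not spell out; repeat_prefix is a hypothetical name.

```python
def repeat_prefix(s: str, n: int) -> bool:
    """True if s is made up entirely of copies of its first n characters."""
    if n <= 0 or len(s) < n:
        raise ValueError("s must hold at least n characters and n must be > 0")
    if len(s) % n != 0:
        return False  # a trailing partial block counts as a mismatch here
    for i, ch in enumerate(s):
        # s[i % n] is the matching character inside the first block
        if ch != s[i % n]:
            return False
    return True


if __name__ == "__main__":
    print(repeat_prefix("abcabcabc", 3))
    print(repeat_prefix("abcabd", 3))
```

    In C the same walk would keep two pointers, one advancing through s and one cycling through the first n characters.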

    Read the article

  • Changing the interface language in Windows 7 Home Premium

    - by Cristián Romo
    A friend of mine has recently purchased a laptop in the U.S. that has Windows 7 Home Premium on it with an English interface. Since my friend is not a native English speaker, I'm trying to change the interface language to traditional Chinese. I've looked through the Control Panel in search of something that might let me change the interface language. Naturally, I looked at the Region and Language section and managed to change the formats the computer uses and install a working keyboard, but I haven't found a way to change the interface language. Upon doing some research, I found out that there are two kinds of interface packs, Multilingual User Interface (MUI) and Language Interface Packs (LIP). It seems that MUIs can only be installed through Windows Update, so I looked through the list of updates. To my dismay, the language packs are not present. The optional updates tab doesn't even show up. Many sites show a drop-down menu under the Keyboards and Languages tab in the Region and Language options, yet it doesn't show up for me. We also don't have the Windows 7 DVD, which might contain this useful file. As far as the LIPs go, I can't find one in Chinese at all, let alone traditional Chinese. Can the interface language be changed in Home Premium at all? If it can, how would I do so?

    Read the article

  • Need help regarding internationalization of iPhone application

    - by Taufeeq Ahmed
    I have provided support for two languages, English and Chinese, in my iPhone application. I use string files for the languages using "key"-"value" pairs and my application displays the appropriate language using NSLocalizedString(@"Fund red not red?", @""). I get only Chinese text when I run the app in XCode. How can I switch to different languages in XCode (iPhone simulator)?

    Read the article

  • [C#] How divide a string into array

    - by Ricky
    If I have the following plain string, how do I divide it into an array of three elements? {["a","English"],["b","US"],["c","Chinese"]} ["a","English"],["b","US"],["c","Chinese"] This problem is related to JSON string parsing, so I wonder if there is any API to facilitate the conversion.
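
    The brace-wrapped form in the question is not valid JSON, but the comma-separated pairs become a valid JSON array once they are wrapped in square brackets. The Python sketch below illustrates that idea with the standard json module. Stripping the outer braces is an assumption about the input, and split_pairs is a hypothetical name; a C# JSON library would follow the same wrap-then-parse approach.

```python
import json


def split_pairs(text: str) -> list:
    """Parse 'pair,pair,pair' (optionally wrapped in braces) into a list."""
    body = text.strip()
    if body.startswith("{") and body.endswith("}"):
        body = body[1:-1]  # assumed: the outer braces carry no meaning
    return json.loads("[" + body + "]")


if __name__ == "__main__":
    raw = '{["a","English"],["b","US"],["c","Chinese"]}'
    for pair in split_pairs(raw):
        print(pair)
```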

    Read the article

  • Can't change Joomla Default language

    - by Moak
    On this site http://www.bostonteaclub.com I want the default language to be Chinese. I set the default language to Chinese in the backend (it's got the star next to it), but when you go to the page you will probably notice that the site is in English. If you check the source code, you will see, hidden at the very bottom, a var_dump of the language object, and by the looks of it the default is still en-GB: ["_default"]=> string(5) "en-GB" Why is this? Thanks

    Read the article

  • Observations in Migrating from JavaFX Script to JavaFX 2.0

    - by user12608080
    Observations in Migrating from JavaFX Script to JavaFX 2.0

    Introduction

    JavaFX has been available for a few years now, and there is a decent body of work written for it using the JavaFX Script language. With the general availability announcement of JavaFX 2.0 Beta, the natural question arises about converting the legacy code over to the new JavaFX 2.0 platform. This article reflects on some of the observations encountered while porting source code over from JavaFX Script to the new JavaFX API paradigm.

    The Application

    The program chosen for migration is an implementation of the Sudoku game and serves as a reference application for the book JavaFX – Developing Rich Internet Applications. The design of the program can be divided into two major components: (1) a user interface (ideally suited for JavaFX design) and (2) the puzzle generator. For the context of this article, our primary interest lies in the user interface. The puzzle generator code was lifted from a sourceforge.net project and is written entirely in Java. Regardless of which version of the UI we choose (JavaFX Script vs. JavaFX 2.0), no code changes were required for the puzzle generator code.

    The original user interface for the JavaFX Sudoku application was written exclusively in JavaFX Script, and as such is a suitable candidate to convert over to the new JavaFX 2.0 model. However, a few notable points are worth mentioning about this program. First off, it was written in the JavaFX 1.1 timeframe, when certain capabilities of the JavaFX framework were as yet unavailable. Citing two examples, this program creates many of its own UI controls from scratch because the built-in controls were yet to be introduced. In addition, layout of graphical nodes is done in a very manual manner, again because many of the automatic layout capabilities were in flux at the time. It is worth considering that this program was written at a time when most of us were just coming up to speed on this technology. One would think that, given the opportunity to recreate this application anew, it would look quite different from the current version.

    Comparing the Size of the Source Code

    An attempt was made to convert each of the original UI JavaFX Script source files (suffixed with .fx) over to a Java counterpart. Due to language feature differences, there are a small number of source files which only exist in one version or the other. The table below summarizes the size of each of the source files.

    JavaFX Script source file   Lines   Characters   JavaFX 2.0 Java source file   Lines   Characters
    -                           -       -            ArrowKey.java                 6       72
    Board.fx                    221     6831         Board.java                    205     6508
    BoardNode.fx                446     16054        BoardNode.java                723     29356
    ChooseNumberNode.fx         168     5267         ChooseNumberNode.java         302     10235
    CloseButtonNode.fx          115     3408         CloseButton.java              99      2883
    -                           -       -            ParentWithKeyTraversal.java   111     3276
    -                           -       -            FunctionPtr.java              6       80
    -                           -       -            Globals.java                  20      554
    Grouping.fx                 8       140          -                             -       -
    HowToPlayNode.fx            121     3632         HowToPlayNode.java            136     4849
    IconButtonNode.fx           196     5748         IconButtonNode.java           183     5865
    Main.fx                     98      3466         Main.java                     64      2118
    SliderNode.fx               288     10349        SliderNode.java               350     13048
    Space.fx                    78      1696         Space.java                    106     2095
    SpaceNode.fx                227     6703         SpaceNode.java                220     6861
    TraversalHelper.fx          111     3095         -                             -       -
    Total                       2,077   79,127                                     2531    87,800

    A few notes about this table are in order:
    - The number of lines in each file was determined by running the Unix 'wc -l' command over each file.
    - The number of characters in each file was determined by running the Unix 'ls -l' command over each file.
    - The examination of the code could certainly be much more rigorous. No standard formatting was performed on these files. All comments, however, were deleted.

    There was a certain expectation that the new Java version would require more lines of code than the original JavaFX Script version. As evidenced by a count of the total number of lines, the Java version has about 22% more lines than its FX Script counterpart. Furthermore, there was an additional expectation that the Java version would be more verbose in terms of the total number of characters. In fact, the preceding data shows that on average the Java source files contain fewer characters per line than the FX files. But that's not the whole story. Upon further examination, the FX Script source files had a disproportionate number of blank characters. Why? Because of the nature of how one develops JavaFX Script code. The object literal dominates FX Script code. It's not uncommon to see object literals indented halfway across the page, consuming lots of meaningless space characters.

    RAM consumption

    Though not the most scientific analysis, memory usage for the application was examined on a Windows Vista system by running the Windows Task Manager and viewing how much memory was being consumed by the Sudoku version in question. Roughly speaking, the FX Script version, after startup, had a RAM footprint of about 90MB and remained pretty much the same size. The Java version started out at about 55MB and maintained that size throughout its execution.

    What About Binding?

    Arguably, the most striking observation about the conversion from JavaFX Script to JavaFX 2.0 concerned the need for data synchronization, or lack thereof. In JavaFX Script, the primary means to synchronize data is via the bind expression (using the "bind" keyword), and perhaps to a lesser extent its "on replace" cousin. The bind keyword does not exist in Java, so for JavaFX 2.0 a Data Binding API has been introduced as a replacement. To give a feel for the difference between the two versions of the Sudoku program, the table that follows indicates how many binds were required for each source file. For JavaFX Script files, this was ascertained by simply counting the number of occurrences of the bind keyword. As can be seen, binding had been used frequently in the JavaFX Script version (and this does not take into consideration an additional half dozen or so "on replace" triggers). The JavaFX 2.0 program achieves the same functionality as the original JavaFX Script version, yet the equivalent of binding was only needed twice throughout the Java version of the source code.

    JavaFX Script source file   Number of Binds   JavaFX Next Java source file      Number of "Binds"
    -                           -                 ArrowKey.java                     0
    Board.fx                    1                 Board.java                        0
    BoardNode.fx                7                 BoardNode.java                    0
    ChooseNumberNode.fx         11                ChooseNumberNode.java             0
    CloseButtonNode.fx          6                 CloseButton.java                  0
    -                           -                 CustomNodeWithKeyTraversal.java   0
    -                           -                 FunctionPtr.java                  0
    -                           -                 Globals.java                      0
    Grouping.fx                 0                 -                                 -
    HowToPlayNode.fx            7                 HowToPlayNode.java                0
    IconButtonNode.fx           9                 IconButtonNode.java               0
    Main.fx                     1                 Main.java                         0
    Main_Mobile.fx              1                 -                                 -
    SliderNode.fx               6                 SliderNode.java                   1
    Space.fx                    0                 Space.java                        0
    SpaceNode.fx                9                 SpaceNode.java                    1
    TraversalHelper.fx          0                 -                                 -
    Total                       58                                                  2

    Conclusions

    As the JavaFX 2.0 technology is so new, and experience with the platform is correspondingly limited, it is possible and indeed probable that some of the observations noted in the preceding article may not apply across other attempts at migrating applications. That being said, this first experience indicates that the migrated Java code will likely be larger, though not extensively so, than the original JavaFX Script source. Furthermore, although very important, it appears that the requirements for data synchronization via binding may be significantly less with the new platform.
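
    The two size claims in the article (about 22% more lines, but fewer characters per line) can be checked from the totals it reports. The Python sketch below uses only the "Total" figures as printed in the size table; the constant and function names are hypothetical.

```python
# Totals as reported in the article's size table.
FX_LINES, FX_CHARS = 2077, 79127
JAVA_LINES, JAVA_CHARS = 2531, 87800


def percent_increase(old: float, new: float) -> float:
    """Relative growth from old to new, in percent."""
    return (new - old) / old * 100.0


def chars_per_line(chars: int, lines: int) -> float:
    """Average number of characters on a line."""
    return chars / lines


if __name__ == "__main__":
    print(f"extra lines in Java: {percent_increase(FX_LINES, JAVA_LINES):.1f}%")
    print(f"FX chars per line:   {chars_per_line(FX_CHARS, FX_LINES):.1f}")
    print(f"Java chars per line: {chars_per_line(JAVA_CHARS, JAVA_LINES):.1f}")
```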

    Read the article

  • Dynamic character animation - Using the physics engine or not

    - by Lex Webb
    I'm planning on building a dynamic reactant animation engine for the characters in my 2D game. I have already built templates for a skeleton-based animation system using key frames and interpolation to specify a limb's position at any given moment in time. I am using Farseer Physics (an extension of Box2D) in MonoGame/XNA in C#. My real question lies in how I go about tying this character animation into the physics engine. I have two options:
    - Moving limbs using the physics engine: applying an interpolated force to each limb (dynamic body) in order to attempt to get it to its position as denoted by the skeleton animation.
    - Moving limbs by simply changing the position of a fixed body: updating the new position of each limb manually, attempting to take into account physics collisions, then stepping the physics after the animation to allow for environment interaction.
    Each of these methods has its distinct advantages and disadvantages.
    Physics-based movement
    Advantages:
    - Possibly more natural/realistic movement.
    - Better interaction with game objects, as the force applied to objects colliding with characters would be calculated for me.
    - No need to convert to dynamic bodies when reacting to projectiles/death/fighting.
    Disadvantages:
    - Possible difficulty in calculating the correct amount of force to move a limb a certain distance at a constant rate.
    - An underlying character balance system would need to be created, and it would need to be robust enough to prevent characters falling over at the touch of a feather.
    - Added code complexity and processing time for the above.
    Static object movement
    Advantages:
    - Easy to interpolate movement of limbs between game steps.
    - Moving limbs is as simple as applying a rotation to the skeleton bone.
    - Greater control over limbs; I won't need to worry about characters falling over, as all animation would be pre-defined.
    Disadvantages:
    - Possible unnatural movement (depends entirely on my animation skills!).
    - Bad collision reactions with the physics engine (dynamic bodies simply slide out of the way of static objects).
    - Need to calculate collisions between physics objects and my limbs myself and apply directional forces to them.
    - Hard to account for slopes/stairs/non-standard planes when animating walking/running animations.
    - Need to convert objects to dynamic when reacting to projectile/fighting/death physics objects.
    The Question! As you can see, I have thought about this extensively. I have also Googled physics-based animation and have found mostly dissertation papers, which fills me with the sense that it may be a lot more advanced than my mathematics skills. My question is mostly subjective, based on my findings above and any experience you may have: which of the above methods should I use when creating my game? I am willing to spend the time to get a physics solution working if you think it would be possible. In the end I want to provide the most satisfying experience for the gamer, as well as a robust and dynamic system I can use to animate pretty much anything I need.
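
    For the physics-based option, one common way to turn a keyframed target into a force is a proportional-derivative (PD) controller. That technique is brought in here purely as an illustration and is not something the question proposes. The Python sketch below simulates a single limb angle with made-up gains and unit inertia; it does not use Farseer or any real engine API, and pd_torque and simulate are hypothetical names.

```python
def pd_torque(angle: float, velocity: float, target: float,
              kp: float = 40.0, kd: float = 8.0) -> float:
    """Torque pulling the limb toward `target`, damped by its velocity."""
    return kp * (target - angle) - kd * velocity


def simulate(target: float, steps: int = 600, dt: float = 1.0 / 60.0,
             inertia: float = 1.0) -> float:
    """Step one limb with semi-implicit Euler and return its final angle."""
    angle = 0.0
    velocity = 0.0
    for _ in range(steps):
        torque = pd_torque(angle, velocity, target)
        velocity += (torque / inertia) * dt  # integrate acceleration first
        angle += velocity * dt               # then position
    return angle


if __name__ == "__main__":
    print(simulate(1.0))
```

    The gains trade stiffness against overshoot, which is the same tuning problem the question lists under "correct amount of force".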

    Read the article

  • How do I get long command lines to wrap to the next line?

    - by BrianH
    Edit It was my .bashrc file. I've copied the same profile from machine to machine, and I used special characters in my $PS1 that are somehow throwing it off. I'm now sticking with the standard bash variables for my $PS1. Thanks to @ændrük for the tip on the .bashrc! ...End Edit... Something I have noticed in Ubuntu for a long time that has been frustrating to me is when I am typing a command at the command line that gets longer (wider) than the terminal width, instead of wrapping to a new line, it goes back to column 1 on the same line and starts over-writing the beginning of my command line. (It doesn't actually overwrite the actual command, but visually, it is overwriting the text that was displayed). It's hard to explain without seeing it, but let's say my terminal was 20 characters wide (Mine is more like 120 characters - but for the sake of an example), and I want to echo the English alphabet. What I type is this: echo abcdefghijklmnopqrstuvwxyz But what my terminal looks like before I hit the key is: pqrstuvwxyzghijklmno When I hit enter, it echos abcdefghijklmnopqrstuvwxyz so I know the command was received properly. It just wrapped my typing after the "o" and started over on the same line. What I would expect to happen, if I typed this command in on a terminal that was only 20 characters wide would be this: echo abcdefghijklmno pqrstuvwxyz Background: I am using bash as my shell, and I have this line in my ~/.bashrc: set -o vi to be able to navigate the command line with VI commands. I am currently using Ubuntu 10.10 server, and connecting to the server with Putty. In any other environment I have worked in, if I type a long command line, it will add a new line underneath the line I am working on when my command gets longer than the terminal width and when I keep typing I can see my command on 2 different lines. But for as long as I can remember using Ubuntu, my long commands only occupy 1 line. 
This also happens when I am going back to previous commands in the history (I hit Esc, then 'K' to go back to previous commands) - when I get to a previous command that was longer than the terminal width, the command line gets mangled and I cannot tell where I am at in the command. The only work-around I have found to see the entire long command is to hit "Esc-V", which opens up the current command in a VI editor. I don't think I have anything odd in my .bashrc file. I commented out the "set -o vi" line, and I still had the problem. I downloaded a fresh copy of Putty and didn't make any changes to the configuration - I just typed in my host name to connect, and I still have the problem, so I don't think it's anything with Putty (unless I need to make some config changes) Has anyone else had this problem, and can anyone think of how to fix it? Thanks in advance! Brian

    Read the article

  • How to make gvfs-smb always use UTF8

    - by Didier
    If I make an entry in /etc/fstab to mount a Samba share, I can give the option iocharset=utf8, and the share is then mounted correctly, in the right encoding, with special characters displayed correctly. GNOME's automount system, for some reason, never gets this right, and I can find nowhere to change its settings. Is there a way to make it always use UTF-8 by default? This is with Ubuntu 10.10, and the system can display the characters involved.

    Read the article

  • WebCenter Customer Spotlight: Marvel

    - by me
    Author: Peter Reiser - Social Business Evangelist, Oracle WebCenter

    Solution Summary
    Marvel Entertainment, LLC (Marvel) is one of the world's most prominent character-based entertainment companies, built on a proven library of over 8,000 characters featured in a variety of media over seventy years. The customer wanted to optimize their brand licensing process, so Marvel worked with Oracle WebCenter partner Fishbowl Solutions and implemented a centralized Content Hub based on Oracle WebCenter Content. The 100% web-based secure Intranet/Partner Extranet solution is now managing the entire life cycle of the brand licensing process. Marvel and their brand licensees now have complete visibility of brand license operations, including the history of approval requests and related content.

    Company Overview
    Marvel Entertainment, LLC (Marvel), a wholly-owned subsidiary of The Walt Disney Company, is one of the world's most prominent character-based entertainment companies, built on a proven library of over 8,000 characters featured in a variety of media over seventy years. Marvel utilizes its character franchises in entertainment, licensing and publishing. Sample characters:
    - Spider-Man
    - Iron Man
    - Captain America
    - X-MEN
    - Thor
    - Avengers
    - And a host of others

    Business Challenges
    Marvel wanted to optimize their brand licensing process for their characters and had the following business requirements:
    - Facilitating content worldwide
    - Scalable and flexible infrastructure to manage multiple content types and huge file sizes
    - Optimize the licensing process workflow through automatic notifications, tracking reviews, issuing approvals, etc.

    Solution Deployed
    Marvel worked with Oracle WebCenter partner Fishbowl Solutions and implemented a centralized Content Hub based on Oracle WebCenter Content. The 100% web-based secure Intranet/Partner Extranet solution is now managing the entire life cycle of the brand licensing process. The internal users can now manage all digital assets related to a character through proper categorization of all items, workflow-based review and approval of branding styles, and a powerful search and retrieval service. The licensees of Marvel brands can now develop and submit concepts and prototypes online, which are reviewed and approved using a collaborative process.

    Business Result
    Marvel and their brand licensees now have complete visibility of brand license operations, including the history of approval requests and related content. The character brand related content is now in the right place, at the right time, at the user's fingertips, with highly improved quality.

    Additional Information
    - Marvel Open World Presentation
    - Oracle WebCenter Content

    Read the article

  • Unsteady Display

    - by Elton McRae
    I use Ubuntu 12.04. The text on my screen while on the internet is too small, so I decided to adjust the display to give larger characters. This caused the screen to become very unstable; once I am logged in, the screen begins to flicker. I am now using the guest login. How can I readjust the display, first to make it stable and secondly to have larger text characters? Thanks in anticipation, Elton.

    Read the article

  • Working with Windows and Unix

    - by user554629
    Beware of new line characters

    One of the most frequent issues we encounter in Tech Support is the corruption of files that are transferred between Windows and Unix. The transfer can occur at any stage, but ultimately involves a transfer of a file using an ftp client that is running on Windows; it could be ftp or filezilla. Windows uses two characters to mark the end of a line in a text file (CR/LF: carriage return, linefeed). Unix uses a single character (LF). In all situations, it is best to use binary mode transfer for all files, including ascii text files.

    Common problems:
    - Upload a core file from unix to windows using ftp in ascii mode. The file is going to be larger on Windows than Unix. ftp doesn't know if this is a text file with real line-ends; it takes every LF byte and transmits the two characters CR/LF. The core file, tar file, library ... will be corrupted when transferred to Oracle.
    - Download a shell script to Windows, and transfer it to Unix using ftp. If the file is edited on Windows, the unix script line-end chars will be doubled. Unix doesn't know how to handle that, and will likely tell you the script is not executable. Why? The first line of a shell script (called "sh-bang") identifies the command interpreter the unix shell should use for this script. Common examples:
        #!/bin/sh
        #!/bin/ksh
        #!/bin/bash
        #!/bin/perl
        #!/bin/sh^M    # will not be understood.
        #!/bin/env ksh # special syntax.  Find ksh and run it

    dos2unix is a common utility found on most unix platforms that repairs the issue of Windows line-end characters in unix script files. I've written my own flavor of this utility for use in Tech Support and build environments; it is a bit easier to use and has some nice side-effects:
    - accepts a list of files: dos2unix *.sh
    - repairs the file in-place. Doesn't generate a new file you have to name
    - retains the same timestamp; it is the encoding that changed, not the file content.

    Here are the versions of dos2unix for each of the environments we work in. They are compressed with gzip, to avoid the ftp ascii transfer trap, and because I am quite limited in the number of files I can upload to this blog: AIX, Linux, Solaris sparc, Windows.
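
    The three behaviours listed for the author's utility (a list of files, in-place repair, preserved timestamp) can be sketched in a few lines. The Python version below is only an illustration of the idea, not the author's tool: it rewrites CR/LF pairs as LF and restores the original access and modification times with os.utime. The function names are hypothetical.

```python
import os
import sys


def dos2unix(path: str) -> bool:
    """Convert CR/LF line ends to LF in place; return True if the file changed."""
    stat = os.stat(path)  # remember the timestamps before touching the file
    with open(path, "rb") as handle:
        data = handle.read()
    fixed = data.replace(b"\r\n", b"\n")
    if fixed == data:
        return False
    with open(path, "wb") as handle:
        handle.write(fixed)
    # Put the original access/modification times back.
    os.utime(path, ns=(stat.st_atime_ns, stat.st_mtime_ns))
    return True


def main(paths: list) -> None:
    """Repair every file named on the command line."""
    for path in paths:
        print(path, "converted" if dos2unix(path) else "unchanged")


if __name__ == "__main__":
    main(sys.argv[1:])
```

    Reading and writing in binary mode matters here for the same reason the article recommends binary ftp transfers: no layer gets a chance to translate line ends on its own.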

    Read the article

  • Can a whitespace regex character be used to perform a javascript injection? [migrated]

    - by webose
    If I want to validate the input of a <textarea> so that it contains, for example, only numerical values, but I also want to let users insert new lines, I can select the wanted characters with a JavaScript regex that also includes the whitespace characters: /[0-9\s]/ The question is: can a whitespace character be used to perform injections, XSS (even if I think this last option is impossible), or any other type of attack? Thanks
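
    A detail worth noting about the pattern itself: /[0-9\s]/ without anchors or a quantifier succeeds as soon as any single character in the input is a digit or whitespace, so it does not restrict the rest of the text. A whitelist check has to cover the whole string. The Python sketch below shows the difference; the JavaScript equivalent would presumably be /^[0-9\s]*$/, and the names ALLOWED and is_digits_and_whitespace are hypothetical.

```python
import re

# Whole-string whitelist: only digits and whitespace, any length.
ALLOWED = re.compile(r"[0-9\s]*")


def is_digits_and_whitespace(text: str) -> bool:
    """True only when every character is a digit or whitespace."""
    return ALLOWED.fullmatch(text) is not None


if __name__ == "__main__":
    print(is_digits_and_whitespace("12\n34 56"))
    print(is_digits_and_whitespace("1<script>alert(1)</script>"))
    # The unanchored single-character test accepts the hostile input:
    print(re.search(r"[0-9\s]", "1<script>alert(1)</script>") is not None)
```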

    Read the article

  • Uneditable file and Unreadable(for further processing) file( WHY? ) after processing it through C++

    - by mgj
    Hi...:) This might look to be a very long question to you I understand, but trust me on this its not long. I am not able to identify why after processing this text is not being able to be read and edited. I tried using the ord() function in python to check if the text contains any Unicode characters( non ascii characters) apart from the ascii ones.. I found quite a number of them. I have a strong feeling that this could be due to the original text itself( The INPUT ). Input-File: Just copy paste it into a file "acle5v1.txt" The objective of this code below is to check for upper case characters and to convert it to lower case and also to remove all punctuations so that these words are taken for further processing for word alignment #include<iostrea> #include<fstream> #include<ctype.h> #include<cstring> using namespace std; ifstream fin2("acle5v1.txt"); ofstream fin3("acle5v1_op.txt"); ofstream fin4("chkcharadded.txt"); ofstream fin5("chkcharntadded.txt"); ofstream fin6("chkprintchar.txt"); ofstream fin7("chknonasci.txt"); ofstream fin8("nonprinchar.txt"); int main() { char ch,ch1; fin2.seekg(0); fin3.seekp(0); int flag = 0; while(!fin2.eof()) { ch1=ch; fin2.get(ch); if (isprint(ch))// if the character is printable flag = 1; if(flag) { fin6<<"Printable character:\t"<<ch<<"\t"<<(int)ch<<endl; flag = 0; } else { fin8<<"Non printable character caught:\t"<<ch<<"\t"<<int(ch)<<endl; } if( isalnum(ch) || ch == '@' || ch == ' ' )// checks for alpha numeric characters { fin4<<"char added: "<<ch<<"\tits ascii value: "<<int(ch)<<endl; if(isupper(ch)) { //tolower(ch); fin3<<(char)tolower(ch); } else { fin3<<ch; } } else if( ( ch=='\t' || ch=='.' || ch==',' || ch=='#' || ch=='?' || ch=='!' || ch=='"' || ch != ';' || ch != ':') && ch1 != ' ' ) { fin3<<' '; } else if( (ch=='\t' || ch=='.' || ch==',' || ch=='#' || ch=='?' || ch=='!' 
|| ch=='"' || ch != ';' || ch != ':') && ch1 == ' ' ) { //fin3<<" '; } else if( !(int(ch)>=0 && int(ch)<=127) ) { fin5<<"Char of ascii within range not added: "<<ch<<"\tits ascii value: "<<int(ch)<<endl; } else { fin7<<"Non ascii character caught(could be a -ve value also)\t"<<ch<<int(ch)<<endl; } } return 0; } I have similar code to the above written in Python, which gives me an output that is again not readable and not editable. The code in Python looks like this: #!/usr/bin/python # -*- coding: UTF-8 -*- import sys input_file=sys.argv[1] output_file=sys.argv[2] list1=[] f=open(input_file) for line in f: line=line.strip() #line=line.rstrip('.') line=line.replace('.','') line=line.replace(',','') line=line.replace('#','') line=line.replace('?','') line=line.replace('!','') line=line.replace('"','') line=line.replace('?','') line=line.replace('|','') line = line.lower() list1.append(line) f.close() f1=open(output_file,'w') f1.write(' '.join(list1)) f1.close() The script takes the input and output file names at runtime, as: python punc_remover.py acle5v1.txt acle5v1_op.txt The output of this script is in "acle5v1_op.txt". After processing, this particular output file is needed for further processing. This particular file "acle5v1_op.txt" is the UNREADABLE and UNEDITABLE file that I am not able to use for further processing. I need this for word alignment in NLP. 
I tried reading this output with the following program: #include<iostream> #include<fstream> using namespace std; ifstream fin1("acle5v1_op.txt"); ofstream fout1("chckread_acle5v1_op.txt"); ofstream fout2("chcknotread_acle5v1_op.txt"); int main() { char ch; int flag = 0; long int r = 0; long int nr = 0; while(!(fin1)) { fin1.get(ch); if(ch) { flag = 1; } if(flag) { fout1<<ch; flag = 0; r++; } else { fout2<<"Char not been able to be read from source file\n"; nr++; } } cout<<"Number of characters able to be read: "<<r; cout<<endl<<"Number of characters not been able to be read: "<<nr; return 0; } It prints each character if it is readable and skips it if not, but I observed that the output of both files is blank, so I concluded that the file "acle5v1_op.txt" is UNREADABLE AND UNEDITABLE. Could you please help me deal with this problem? To tell you a bit about the statistics of the original input file "acle5v1.txt": it has around 3441 lines and around 3 million characters. Given the number of characters in the file, your editor might or might not manage to open it. I was able to open the file in gedit on Fedora 10, which I am currently using. This is just to let you know that opening the file with a particular editor was not an issue, at least in my case. Can I use scripting languages like Python and Perl to deal with this problem? If yes, how? Could you please be specific, as I am a novice in Perl and Python? Or could you please tell me how to solve this problem using C++ itself? Thank you...:) I am really looking forward to some help or guidance on how to go about this problem.
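
    Two details in the posted C++ are worth noting before any rewrite. In the reader program, while(!(fin1)) is false for a stream that opened successfully, so the loop body never runs and both output files stay empty. In the first program, the test ch != ';' || ch != ':' is always true. Since the post asks whether Python could do the job, here is a rough sketch of the stated goal (lower-case the text, keep ASCII letters, digits and '@', and turn everything else into a single space). It assumes the input is UTF-8; the names normalize_text and clean_file are hypothetical.

```python
import string

# Characters kept as-is (after lower-casing); everything else becomes a space.
KEEP = set(string.ascii_letters + string.digits + "@")


def normalize_text(text: str) -> str:
    """Lower-case the text and collapse punctuation/non-ASCII runs to one space."""
    out = []
    for ch in text:
        if ch in KEEP:
            out.append(ch.lower())
        elif out and out[-1] != " ":
            out.append(" ")  # one separator per run of dropped characters
    return "".join(out).strip()


def clean_file(src: str, dst: str) -> None:
    """Normalize src line by line and write plain ASCII text to dst."""
    with open(src, encoding="utf-8", errors="replace") as fin, \
            open(dst, "w", encoding="ascii") as fout:
        for line in fin:
            cleaned = normalize_text(line)
            if cleaned:
                fout.write(cleaned + "\n")


if __name__ == "__main__":
    print(normalize_text("Hello, World! Caf\u00e9?"))
```

    Because the output contains only ASCII, any editor or downstream alignment tool should be able to read it regardless of its default encoding.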

    Read the article
