Search Results

Search found 5433 results on 218 pages for 'escaped characters'.

Page 104/218 | < Previous Page | 100 101 102 103 104 105 106 107 108 109 110 111 | Next Page >

Using C# to detect whether a filename character is considered international

- by Morten Mertner

I've written a small console application (source below) to locate and optionally rename files containing international characters, as they are a source of constant pain with most source control systems (some background on this below). The code I'm using has a simple dictionary with characters to look for and replace (and nukes every other character that uses more than one byte of storage), but it feels very hackish. What's the right way to (a) find out whether a character is international? and (b) what the best ASCII substitution character would be? Let me provide some background information on why this is needed. It so happens that the danish Å character has two different encodings in UTF-8, both representing the same symbol. These are known as NFC and NFD encodings. Windows and Linux will create NFC encoding by default but respect whatever encoding it is given. Mac will convert all names (when saving to a HFS+ partition) to NFD and therefore returns a different byte stream for the name of a file created on Windows. This effectively breaks Subversion, Git and lots of other utilities that don't care to properly handle this scenario. I'm currently evaluating Mercurial, which turns out to be even worse at handling international characters.. being fairly tired of these problems, either source control or the international character would have to go, and so here we are. My current implementation: public class Checker { private Dictionary<char, string> internationals = new Dictionary<char, string>(); private List<char> keep = new List<char>(); private List<char> seen = new List<char>(); public Checker() { internationals.Add( 'æ', "ae" ); internationals.Add( 'ø', "oe" ); internationals.Add( 'å', "aa" ); internationals.Add( 'Æ', "Ae" ); internationals.Add( 'Ø', "Oe" ); internationals.Add( 'Å', "Aa" ); internationals.Add( 'ö', "o" ); internationals.Add( 'ü', "u" ); internationals.Add( 'ä', "a" ); internationals.Add( 'é', "e" ); internationals.Add( 'è', "e" ); internationals.Add( 'ê', "e" ); internationals.Add( '¦', "" ); internationals.Add( 'Ã', "" ); internationals.Add( '©', "" ); internationals.Add( ' ', "" ); internationals.Add( '§', "" ); internationals.Add( '¡', "" ); internationals.Add( '³', "" ); internationals.Add( '', "" ); internationals.Add( 'º', "" ); internationals.Add( '«', "-" ); internationals.Add( '»', "-" ); internationals.Add( '´', "'" ); internationals.Add( '`', "'" ); internationals.Add( '"', "'" ); internationals.Add( Encoding.UTF8.GetString( new byte[] { 226, 128, 147 } )[ 0 ], "-" ); internationals.Add( Encoding.UTF8.GetString( new byte[] { 226, 128, 148 } )[ 0 ], "-" ); internationals.Add( Encoding.UTF8.GetString( new byte[] { 226, 128, 153 } )[ 0 ], "'" ); internationals.Add( Encoding.UTF8.GetString( new byte[] { 226, 128, 166 } )[ 0 ], "." ); keep.Add( '-' ); keep.Add( '=' ); keep.Add( '\'' ); keep.Add( '.' ); } public bool IsInternationalCharacter( char c ) { var s = c.ToString(); byte[] bytes = Encoding.UTF8.GetBytes( s ); if( bytes.Length > 1 && ! internationals.ContainsKey( c ) && ! seen.Contains( c ) ) { Console.WriteLine( "X '{0}' ({1})", c, string.Join( ",", bytes ) ); seen.Add( c ); if( ! keep.Contains( c ) ) { internationals[ c ] = ""; } } return internationals.ContainsKey( c ); } public bool HasInternationalCharactersInName( string name, out string safeName ) { StringBuilder sb = new StringBuilder(); Array.ForEach( name.ToCharArray(), c => sb.Append( IsInternationalCharacter( c ) ? internationals[ c ] : c.ToString() ) ); int length = sb.Length; sb.Replace( " ", " " ); while( sb.Length != length ) { sb.Replace( " ", " " ); } safeName = sb.ToString().Trim(); string namePart = Path.GetFileNameWithoutExtension( safeName ); if( namePart.EndsWith( "." ) ) safeName = namePart.Substring( 0, namePart.Length - 1 ) + Path.GetExtension( safeName ); return name != safeName; } } And this would be invoked like this: FileInfo file = new File( "Århus.txt" ); string safeName; if( checker.HasInternationalCharactersInName( file.Name, out safeName ) ) { // rename file }

Read the article
Writing an optimised and efficient search engine with mySQL and ColdFusion

- by Mel

I have a search page with the following scenarios listed below. I was told there was a better way to do it, but not how, and that I am using too many if statements, and that it's prone to causing an error through url manipulation: Search.cfm will processes a search made from a search bar present on all pages, with one search input (titleName). If search.cfm is accessed manually (through URL not through using the simple search bar on all pages) it displays an advanced search form with three inputs (titleName, genreID, platformID) or it evaluates searchResponse variable and decides what to do. If simple search query is blank, has no results, or less than 3 characters it displays an error If advanced search query is blank, has no results, or less than 3 characters it displays an error If any successful search returns results, they come back normally. The top-of-page logic is as follows:  <cfparam name="variables.searchResponse" default="">  <cfif IsDefined("Form.simpleSearch") AND Len(Trim(Form.titleName)) LTE 2> <cfset variables.searchResponse = "invalidString"> <cfelseif IsDefined("Form.simpleSearch") AND Len(Trim(Form.titleName)) GTE 3>  <cfinvoke component="myComponent" method="simpleSearch" searchString="#Form.titleName#" returnvariable="simpleSearchResult"> <cfset variables.searchResponse = "simpleSearchResult"> </cfif>  <cfif IsDefined("variables.simpleSearchResult") AND simpleSearchResult.RecordCount IS 0> <cfset variables.searchResponse = "noResult"> </cfif>  <cfif IsDefined("Form.AdvancedSearch") AND Len(Trim(Form.titleName)) LTE 2> <cfset variables.searchResponse = "invalidString"> <cfelseif IsDefined("Form.advancedSearch") AND Len(Trim(Form.titleName)) GTE 2>  <cfinvoke component="myComponent" method="advancedSearch" returnvariable="advancedSearchResult" titleName="#Form.titleName#" genreID="#Form.genreID#" platformID="#Form.platformID#"> <cfset variables.searchResponse = "advancedSearchResult"> </cfif>  <cfif IsDefined("variables.advancedSearchResult") AND advancedSearchResult.RecordCount IS 0> <cfset variables.searchResponse = "noResult"> </cfif> I'm using the searchResponse variable to decide what the the page displays, based on the following scenarios:  <form name="simpleSearch" action="search.cfm" method="post"> <input type="hidden" name="simpleSearch" /> <input type="text" name="titleName" /> <input type="button" value="Search" onclick="form.submit()" /> </form>  <cfif searchResponse IS ""> <h1>Advanced Search</h1>  <form name="advancedSearch" action="search.cfm" method="post"> <input type="hidden" name="advancedSearch" /> <input type="text" name="titleName" /> <input type="text" name="genreID" /> <input type="text" name="platformID" /> <input type="button" value="Search" onclick="form.submit()" /> </form> </cfif>  <cfif searchResponse IS "invalidString"> <cfoutput> <h1>INVALID SEARCH</h1> </cfoutput> </cfif>  <cfif searchResponse IS "noResult"> <cfoutput> <h1>NO RESULT FOUND</h1> </cfoutput> </cfif>  <cfif searchResponse IS "simpleSearchResult"> <cfoutput> <h1>Search Results</h1> </cfoutput> <cfoutput query="simpleSearchResult">  </cfoutput> </cfif>  <cfif searchResponse IS "advancedSearchResult"> <cfoutput> <h1>Search Results</h1> <p>Your search for "#Form.titleName#" returned #advancedSearchResult.RecordCount# result(s).</p> </cfoutput> <cfoutput query="advancedSearchResult">  </cfoutput> </cfif> Is my logic a) not efficient because my if statements/is there a better way to do this? And b) Can you see any scenarios where my code can break? I've tested it but I have not been able to find any issues with it. And I have no way of measuring performance. Any thoughts and ideas would be greatly appreciated. Many thanks

Read the article
GNU/Linux interactive table content GUI editor?

- by sdaau

I often find myself in the need to gather data (say from the internet), into a table, for comparison reasons. I usually need the final table output in HTML or MediaWiki mostly, but often times also Latex. My biggest problem is that I often forget the correct table syntax for these markup languages, as well as what needs to be properly escaped in the inline data, for the table to render correctly. So, I often wish there was a GUI application, which provides a tabular framework - which I could stick "Always on Top" as a desktop window, and I could paste content into specific cells - before finally exporting the table as a code in the correct language. One application that partially allows this is Open/LibreOffice calc: The good thing here is that: I can drag and drop browser content into a specifically targeted table cell (here B2) "Rich" text / HTML code gets pasted For long content, the cell (column) width stays put as it originally was The bad thing is, that: when the cell height (due to content size) becomes larger than the calc window, it becomes nearly impossible to scroll calc contents up and down (at least with the mousewheel), as the view gets reset to top-right corner of the selected cell calc shows an "endless"/unlimited field of cells, so not exactly a "table" - which I find visually very confusing (and cognitively taxing) Can only export table to HTML What I would need is an application that: Allows for a limited size table, but with quick adding of rows and columns (e.g. via corresponding + buttons) Allows for quick setup of row and column height and width (as well as table size) Stays put at those sizes, regardless of size of content pasted in; if cell content overflows, cell scrollbars are shown (cell content could be possibly re-edited in a separate/new window); if table overflows over window size, window scrollbars are shown Exports table in multiple formats (I'd need both HTML and mediawiki), properly escaping cell content for each (possibility to strip HTML tags from content pasted in cells, to get plain text, is a plus) Targeting a specific cell in the table for the content paste operation is a must - it doesn't have to be drag'n'drop though, a right click over a cell with "Paste content" is enough. I'd also want the ability to click in a specific cell and type in (plain text) content immediately. So, my question is: is there an application out there that already does something like this? The reason I'm asking is that - as the screenshots show - for instance Libre/OpenOffice allows it, but only somewhat (as using it for that purpose is tedious). I know there exist some GUI editors for Linux (both for UI like guile or HTML like amaya); but I don't know them enough to pinpoint if any of them would offer this kind of functionality (and at least in my searches, that kind of functionality, if present in diverse software, seems not to be advertised). Note I'm not interested in styling an HTML table, which is why I haven't used "table designer" in the title, but "table editor" (in lack of better terms) - I'm interested in (quickly) adjusting row/column size of the table, and populating it with pasted data (which is possibly HTML) in a GUI; and finally exporting such a table as self-contained HTML (or other) code.

Read the article
Upgraded to Ubuntu 13.10 - Apache not able to start

- by 0R10N

I updated to Ubuntu 13.10 (from Ubuntu 13.04) last weekend, and now Apache is not being able to start. It was working perfectly well until the upgraded, and I haven't changed anything myself. When I ran a restart this is what I get apache2: Syntax error on line 260 of /etc/apache2/apache2.conf: Could not open configuration file /etc/apache2/conf.d/: No such file or directory So, I created the directory, and then I get this: * Starting web server apache2 * * The apache2 configtest failed. Output of config test was: [Wed Oct 30 11:17:42.921934 2013] [proxy_html:notice] [pid 2496] AH01425: I18n support in mod_proxy_html requires mod_xml2enc. Without it, non-ASCII characters in proxied pages are likely to display incorrectly. AH00526: Syntax error on line 84 of /etc/apache2/apache2.conf: Invalid command 'LockFile', perhaps misspelled or defined by a module not included in the server configuration Action 'configtest' failed. The Apache error log may have more information. Thanks!

Read the article
Im unable to open pdf files using kword ?! what to do?

- by Curious Apprentice

I have read in many places that kword can open pdf and save it in doc format ! but on ubuntu 11.10 I can not open pdf using kword as it does not showing pdf as supported files ?!!! I want a pdf to doc converter on Ubuntu 11.10. I can open pdf using libreofffice but it displaying garbled characters with over 1000+ pages for a pdf with only 15 pages. What can I do so that Kword became able to open pdf and save it to doc or odt format ??

Read the article
SQLite, python, unicode, and non-utf data

- by Nathan Spears

I started by trying to store strings in sqlite using python, and got the message: sqlite3.ProgrammingError: You must not use 8-bit bytestrings unless you use a text_factory that can interpret 8-bit bytestrings (like text_factory = str). It is highly recommended that you instead just switch your application to Unicode strings. Ok, I switched to Unicode strings. Then I started getting the message: sqlite3.OperationalError: Could not decode to UTF-8 column 'tag_artist' with text 'Sigur Rós' when trying to retrieve data from the db. More research and I started encoding it in utf8, but then 'Sigur Rós' starts looking like 'Sigur RÃ³s' note: My console was set to display in 'latin_1' as @John Machin pointed out. What gives? After reading this, describing exactly the same situation I'm in, it seems as if the advice is to ignore the other advice and use 8-bit bytestrings after all. I didn't know much about unicode and utf before I started this process. I've learned quite a bit in the last couple hours, but I'm still ignorant of whether there is a way to correctly convert 'ó' from latin-1 to utf-8 and not mangle it. If there isn't, why would sqlite 'highly recommend' I switch my application to unicode strings? I'm going to update this question with a summary and some example code of everything I've learned in the last 24 hours so that someone in my shoes can have an easy(er) guide. If the information I post is wrong or misleading in any way please tell me and I'll update, or one of you senior guys can update. Summary of answers Let me first state the goal as I understand it. The goal in processing various encodings, if you are trying to convert between them, is to understand what your source encoding is, then convert it to unicode using that source encoding, then convert it to your desired encoding. Unicode is a base and encodings are mappings of subsets of that base. utf_8 has room for every character in unicode, but because they aren't in the same place as, for instance, latin_1, a string encoded in utf_8 and sent to a latin_1 console will not look the way you expect. In python the process of getting to unicode and into another encoding looks like: str.decode('source_encoding').encode('desired_encoding') or if the str is already in unicode str.encode('desired_encoding') For sqlite I didn't actually want to encode it again, I wanted to decode it and leave it in unicode format. Here are four things you might need to be aware of as you try to work with unicode and encodings in python. The encoding of the string you want to work with, and the encoding you want to get it to. The system encoding. The console encoding. The encoding of the source file Elaboration: (1) When you read a string from a source, it must have some encoding, like latin_1 or utf_8. In my case, I'm getting strings from filenames, so unfortunately, I could be getting any kind of encoding. Windows XP uses UCS-2 (a Unicode system) as its native string type, which seems like cheating to me. Fortunately for me, the characters in most filenames are not going to be made up of more than one source encoding type, and I think all of mine were either completely latin_1, completely utf_8, or just plain ascii (which is a subset of both of those). So I just read them and decoded them as if they were still in latin_1 or utf_8. It's possible, though, that you could have latin_1 and utf_8 and whatever other characters mixed together in a filename on Windows. Sometimes those characters can show up as boxes, other times they just look mangled, and other times they look correct (accented characters and whatnot). Moving on. (2) Python has a default system encoding that gets set when python starts and can't be changed during runtime. See here for details. Dirty summary ... well here's the file I added: \# sitecustomize.py \# this file can be anywhere in your Python path, \# but it usually goes in ${pythondir}/lib/site-packages/ import sys sys.setdefaultencoding('utf_8') This system encoding is the one that gets used when you use the unicode("str") function without any other encoding parameters. To say that another way, python tries to decode "str" to unicode based on the default system encoding. (3) If you're using IDLE or the command-line python, I think that your console will display according to the default system encoding. I am using pydev with eclipse for some reason, so I had to go into my project settings, edit the launch configuration properties of my test script, go to the Common tab, and change the console from latin-1 to utf-8 so that I could visually confirm what I was doing was working. (4) If you want to have some test strings, eg test_str = "ó" in your source code, then you will have to tell python what kind of encoding you are using in that file. (FYI: when I mistyped an encoding I had to ctrl-Z because my file became unreadable.) This is easily accomplished by putting a line like so at the top of your source code file: # -*- coding: utf_8 -*- If you don't have this information, python attempts to parse your code as ascii by default, and so: SyntaxError: Non-ASCII character '\xf3' in file _redacted_ on line 81, but no encoding declared; see http://www.python.org/peps/pep-0263.html for details Once your program is working correctly, or, if you aren't using python's console or any other console to look at output, then you will probably really only care about #1 on the list. System default and console encoding are not that important unless you need to look at output and/or you are using the builtin unicode() function (without any encoding parameters) instead of the string.decode() function. I wrote a demo function I will paste into the bottom of this gigantic mess that I hope correctly demonstrates the items in my list. Here is some of the output when I run the character 'ó' through the demo function, showing how various methods react to the character as input. My system encoding and console output are both set to utf_8 for this run: '?' = original char <type 'str'> repr(char)='\xf3' '?' = unicode(char) ERROR: 'utf8' codec can't decode byte 0xf3 in position 0: unexpected end of data 'ó' = char.decode('latin_1') <type 'unicode'> repr(char.decode('latin_1'))=u'\xf3' '?' = char.decode('utf_8') ERROR: 'utf8' codec can't decode byte 0xf3 in position 0: unexpected end of data Now I will change the system and console encoding to latin_1, and I get this output for the same input: 'ó' = original char <type 'str'> repr(char)='\xf3' 'ó' = unicode(char) <type 'unicode'> repr(unicode(char))=u'\xf3' 'ó' = char.decode('latin_1') <type 'unicode'> repr(char.decode('latin_1'))=u'\xf3' '?' = char.decode('utf_8') ERROR: 'utf8' codec can't decode byte 0xf3 in position 0: unexpected end of data Notice that the 'original' character displays correctly and the builtin unicode() function works now. Now I change my console output back to utf_8. '?' = original char <type 'str'> repr(char)='\xf3' '?' = unicode(char) <type 'unicode'> repr(unicode(char))=u'\xf3' '?' = char.decode('latin_1') <type 'unicode'> repr(char.decode('latin_1'))=u'\xf3' '?' = char.decode('utf_8') ERROR: 'utf8' codec can't decode byte 0xf3 in position 0: unexpected end of data Here everything still works the same as last time but the console can't display the output correctly. Etc. The function below also displays more information that this and hopefully would help someone figure out where the gap in their understanding is. I know all this information is in other places and more thoroughly dealt with there, but I hope that this would be a good kickoff point for someone trying to get coding with python and/or sqlite. Ideas are great but sometimes source code can save you a day or two of trying to figure out what functions do what. Disclaimers: I'm no encoding expert, I put this together to help my own understanding. I kept building on it when I should have probably started passing functions as arguments to avoid so much redundant code, so if I can I'll make it more concise. Also, utf_8 and latin_1 are by no means the only encoding schemes, they are just the two I was playing around with because I think they handle everything I need. Add your own encoding schemes to the demo function and test your own input. One more thing: there are apparently crazy application developers making life difficult in Windows. #!/usr/bin/env python # -*- coding: utf_8 -*- import os import sys def encodingDemo(str): validStrings = () try: print "str =",str,"{0} repr(str) = {1}".format(type(str), repr(str)) validStrings += ((str,""),) except UnicodeEncodeError as ude: print "Couldn't print the str itself because the console is set to an encoding that doesn't understand some character in the string. See error:\n\t", print ude try: x = unicode(str) print "unicode(str) = ",x validStrings+= ((x, " decoded into unicode by the default system encoding"),) except UnicodeDecodeError as ude: print "ERROR. unicode(str) couldn't decode the string because the system encoding is set to an encoding that doesn't understand some character in the string." print "\tThe system encoding is set to {0}. See error:\n\t".format(sys.getdefaultencoding()), print ude except UnicodeEncodeError as uee: print "ERROR. Couldn't print the unicode(str) because the console is set to an encoding that doesn't understand some character in the string. See error:\n\t", print uee try: x = str.decode('latin_1') print "str.decode('latin_1') =",x validStrings+= ((x, " decoded with latin_1 into unicode"),) try: print "str.decode('latin_1').encode('utf_8') =",str.decode('latin_1').encode('utf_8') validStrings+= ((x, " decoded with latin_1 into unicode and encoded into utf_8"),) except UnicodeDecodeError as ude: print "The string was decoded into unicode using the latin_1 encoding, but couldn't be encoded into utf_8. See error:\n\t", print ude except UnicodeDecodeError as ude: print "Something didn't work, probably because the string wasn't latin_1 encoded. See error:\n\t", print ude except UnicodeEncodeError as uee: print "ERROR. Couldn't print the str.decode('latin_1') because the console is set to an encoding that doesn't understand some character in the string. See error:\n\t", print uee try: x = str.decode('utf_8') print "str.decode('utf_8') =",x validStrings+= ((x, " decoded with utf_8 into unicode"),) try: print "str.decode('utf_8').encode('latin_1') =",str.decode('utf_8').encode('latin_1') except UnicodeDecodeError as ude: print "str.decode('utf_8').encode('latin_1') didn't work. The string was decoded into unicode using the utf_8 encoding, but couldn't be encoded into latin_1. See error:\n\t", validStrings+= ((x, " decoded with utf_8 into unicode and encoded into latin_1"),) print ude except UnicodeDecodeError as ude: print "str.decode('utf_8') didn't work, probably because the string wasn't utf_8 encoded. See error:\n\t", print ude except UnicodeEncodeError as uee: print "ERROR. Couldn't print the str.decode('utf_8') because the console is set to an encoding that doesn't understand some character in the string. See error:\n\t",uee print print "Printing information about each character in the original string." for char in str: try: print "\t'" + char + "' = original char {0} repr(char)={1}".format(type(char), repr(char)) except UnicodeDecodeError as ude: print "\t'?' = original char {0} repr(char)={1} ERROR PRINTING: {2}".format(type(char), repr(char), ude) except UnicodeEncodeError as uee: print "\t'?' = original char {0} repr(char)={1} ERROR PRINTING: {2}".format(type(char), repr(char), uee) print uee try: x = unicode(char) print "\t'" + x + "' = unicode(char) {1} repr(unicode(char))={2}".format(x, type(x), repr(x)) except UnicodeDecodeError as ude: print "\t'?' = unicode(char) ERROR: {0}".format(ude) except UnicodeEncodeError as uee: print "\t'?' = unicode(char) {0} repr(char)={1} ERROR PRINTING: {2}".format(type(x), repr(x), uee) try: x = char.decode('latin_1') print "\t'" + x + "' = char.decode('latin_1') {1} repr(char.decode('latin_1'))={2}".format(x, type(x), repr(x)) except UnicodeDecodeError as ude: print "\t'?' = char.decode('latin_1') ERROR: {0}".format(ude) except UnicodeEncodeError as uee: print "\t'?' = char.decode('latin_1') {0} repr(char)={1} ERROR PRINTING: {2}".format(type(x), repr(x), uee) try: x = char.decode('utf_8') print "\t'" + x + "' = char.decode('utf_8') {1} repr(char.decode('utf_8'))={2}".format(x, type(x), repr(x)) except UnicodeDecodeError as ude: print "\t'?' = char.decode('utf_8') ERROR: {0}".format(ude) except UnicodeEncodeError as uee: print "\t'?' = char.decode('utf_8') {0} repr(char)={1} ERROR PRINTING: {2}".format(type(x), repr(x), uee) print x = 'ó' encodingDemo(x) Much thanks for the answers below and especially to @John Machin for answering so thoroughly.

Read the article
What does this regex mean and why

- by Kalec

$ sed "s/$^[a-z,0-9]*$$.*$$ [a-z,0-9]*$$/\1\2 \1/g" desired_file_name I apreciate it even if you only explain part of it or at lest structure it with words as in s\alphanumerical_at_start\something\alphanumerical_at_end\something_else\global Could someone explain what that means, why and are all regEx so ... awful ? I know that it replaces the first lowcase alphanumerical word with the last one. But could you explain bit by bit what's going on here ? what's with all the /\ and $.*$\ and everything else ? I'm just lost. EDIT: Here is what I do get: (^[a-z0-9]*) starting with a trough z and 0 trough 9; and [a-z,0-9]*$ is the same but the last word (however [0-9,a-z] = just first 2 characters / first character, or the entire word ?). Also: what does the * or the $.*$\ even mean ?

Read the article
SQLite, python, unicode, and non-utf data

- by Nathan Spears

I started by trying to store strings in sqlite using python, and got the message: sqlite3.ProgrammingError: You must not use 8-bit bytestrings unless you use a text_factory that can interpret 8-bit bytestrings (like text_factory = str). It is highly recommended that you instead just switch your application to Unicode strings. Ok, I switched to Unicode strings. Then I started getting the message: sqlite3.OperationalError: Could not decode to UTF-8 column 'tag_artist' with text 'Sigur Rós' when trying to retrieve data from the db. More research and I started encoding it in utf8, but then 'Sigur Rós' starts looking like 'Sigur RÃ³s' note: My console was set to display in 'latin_1' as @John Machin pointed out. What gives? After reading this, describing exactly the same situation I'm in, it seems as if the advice is to ignore the other advice and use 8-bit bytestrings after all. I didn't know much about unicode and utf before I started this process. I've learned quite a bit in the last couple hours, but I'm still ignorant of whether there is a way to correctly convert 'ó' from latin-1 to utf-8 and not mangle it. If there isn't, why would sqlite 'highly recommend' I switch my application to unicode strings? I'm going to update this question with a summary and some example code of everything I've learned in the last 24 hours so that someone in my shoes can have an easy(er) guide. If the information I post is wrong or misleading in any way please tell me and I'll update, or one of you senior guys can update. Summary of answers Let me first state the goal as I understand it. The goal in processing various encodings, if you are trying to convert between them, is to understand what your source encoding is, then convert it to unicode using that source encoding, then convert it to your desired encoding. Unicode is a base and encodings are mappings of subsets of that base. utf_8 has room for every character in unicode, but because they aren't in the same place as, for instance, latin_1, a string encoded in utf_8 and sent to a latin_1 console will not look the way you expect. In python the process of getting to unicode and into another encoding looks like: str.decode('source_encoding').encode('desired_encoding') or if the str is already in unicode str.encode('desired_encoding') For sqlite I didn't actually want to encode it again, I wanted to decode it and leave it in unicode format. Here are four things you might need to be aware of as you try to work with unicode and encodings in python. The encoding of the string you want to work with, and the encoding you want to get it to. The system encoding. The console encoding. The encoding of the source file Elaboration: (1) When you read a string from a source, it must have some encoding, like latin_1 or utf_8. In my case, I'm getting strings from filenames, so unfortunately, I could be getting any kind of encoding. Windows XP uses UCS-2 (a Unicode system) as its native string type, which seems like cheating to me. Fortunately for me, the characters in most filenames are not going to be made up of more than one source encoding type, and I think all of mine were either completely latin_1, completely utf_8, or just plain ascii (which is a subset of both of those). So I just read them and decoded them as if they were still in latin_1 or utf_8. It's possible, though, that you could have latin_1 and utf_8 and whatever other characters mixed together in a filename on Windows. Sometimes those characters can show up as boxes, other times they just look mangled, and other times they look correct (accented characters and whatnot). Moving on. (2) Python has a default system encoding that gets set when python starts and can't be changed during runtime. See here for details. Dirty summary ... well here's the file I added: \# sitecustomize.py \# this file can be anywhere in your Python path, \# but it usually goes in ${pythondir}/lib/site-packages/ import sys sys.setdefaultencoding('utf_8') This system encoding is the one that gets used when you use the unicode("str") function without any other encoding parameters. To say that another way, python tries to decode "str" to unicode based on the default system encoding. (3) If you're using IDLE or the command-line python, I think that your console will display according to the default system encoding. I am using pydev with eclipse for some reason, so I had to go into my project settings, edit the launch configuration properties of my test script, go to the Common tab, and change the console from latin-1 to utf-8 so that I could visually confirm what I was doing was working. (4) If you want to have some test strings, eg test_str = "ó" in your source code, then you will have to tell python what kind of encoding you are using in that file. (FYI: when I mistyped an encoding I had to ctrl-Z because my file became unreadable.) This is easily accomplished by putting a line like so at the top of your source code file: # -*- coding: utf_8 -*- If you don't have this information, python attempts to parse your code as ascii by default, and so: SyntaxError: Non-ASCII character '\xf3' in file _redacted_ on line 81, but no encoding declared; see http://www.python.org/peps/pep-0263.html for details Once your program is working correctly, or, if you aren't using python's console or any other console to look at output, then you will probably really only care about #1 on the list. System default and console encoding are not that important unless you need to look at output and/or you are using the builtin unicode() function (without any encoding parameters) instead of the string.decode() function. I wrote a demo function I will paste into the bottom of this gigantic mess that I hope correctly demonstrates the items in my list. Here is some of the output when I run the character 'ó' through the demo function, showing how various methods react to the character as input. My system encoding and console output are both set to utf_8 for this run: '?' = original char <type 'str'> repr(char)='\xf3' '?' = unicode(char) ERROR: 'utf8' codec can't decode byte 0xf3 in position 0: unexpected end of data 'ó' = char.decode('latin_1') <type 'unicode'> repr(char.decode('latin_1'))=u'\xf3' '?' = char.decode('utf_8') ERROR: 'utf8' codec can't decode byte 0xf3 in position 0: unexpected end of data Now I will change the system and console encoding to latin_1, and I get this output for the same input: 'ó' = original char <type 'str'> repr(char)='\xf3' 'ó' = unicode(char) <type 'unicode'> repr(unicode(char))=u'\xf3' 'ó' = char.decode('latin_1') <type 'unicode'> repr(char.decode('latin_1'))=u'\xf3' '?' = char.decode('utf_8') ERROR: 'utf8' codec can't decode byte 0xf3 in position 0: unexpected end of data Notice that the 'original' character displays correctly and the builtin unicode() function works now. Now I change my console output back to utf_8. '?' = original char <type 'str'> repr(char)='\xf3' '?' = unicode(char) <type 'unicode'> repr(unicode(char))=u'\xf3' '?' = char.decode('latin_1') <type 'unicode'> repr(char.decode('latin_1'))=u'\xf3' '?' = char.decode('utf_8') ERROR: 'utf8' codec can't decode byte 0xf3 in position 0: unexpected end of data Here everything still works the same as last time but the console can't display the output correctly. Etc. The function below also displays more information that this and hopefully would help someone figure out where the gap in their understanding is. I know all this information is in other places and more thoroughly dealt with there, but I hope that this would be a good kickoff point for someone trying to get coding with python and/or sqlite. Ideas are great but sometimes source code can save you a day or two of trying to figure out what functions do what. Disclaimers: I'm no encoding expert, I put this together to help my own understanding. I kept building on it when I should have probably started passing functions as arguments to avoid so much redundant code, so if I can I'll make it more concise. Also, utf_8 and latin_1 are by no means the only encoding schemes, they are just the two I was playing around with because I think they handle everything I need. Add your own encoding schemes to the demo function and test your own input. One more thing: there are apparently crazy application developers making life difficult in Windows. #!/usr/bin/env python # -*- coding: utf_8 -*- import os import sys def encodingDemo(str): validStrings = () try: print "str =",str,"{0} repr(str) = {1}".format(type(str), repr(str)) validStrings += ((str,""),) except UnicodeEncodeError as ude: print "Couldn't print the str itself because the console is set to an encoding that doesn't understand some character in the string. See error:\n\t", print ude try: x = unicode(str) print "unicode(str) = ",x validStrings+= ((x, " decoded into unicode by the default system encoding"),) except UnicodeDecodeError as ude: print "ERROR. unicode(str) couldn't decode the string because the system encoding is set to an encoding that doesn't understand some character in the string." print "\tThe system encoding is set to {0}. See error:\n\t".format(sys.getdefaultencoding()), print ude except UnicodeEncodeError as uee: print "ERROR. Couldn't print the unicode(str) because the console is set to an encoding that doesn't understand some character in the string. See error:\n\t", print uee try: x = str.decode('latin_1') print "str.decode('latin_1') =",x validStrings+= ((x, " decoded with latin_1 into unicode"),) try: print "str.decode('latin_1').encode('utf_8') =",str.decode('latin_1').encode('utf_8') validStrings+= ((x, " decoded with latin_1 into unicode and encoded into utf_8"),) except UnicodeDecodeError as ude: print "The string was decoded into unicode using the latin_1 encoding, but couldn't be encoded into utf_8. See error:\n\t", print ude except UnicodeDecodeError as ude: print "Something didn't work, probably because the string wasn't latin_1 encoded. See error:\n\t", print ude except UnicodeEncodeError as uee: print "ERROR. Couldn't print the str.decode('latin_1') because the console is set to an encoding that doesn't understand some character in the string. See error:\n\t", print uee try: x = str.decode('utf_8') print "str.decode('utf_8') =",x validStrings+= ((x, " decoded with utf_8 into unicode"),) try: print "str.decode('utf_8').encode('latin_1') =",str.decode('utf_8').encode('latin_1') except UnicodeDecodeError as ude: print "str.decode('utf_8').encode('latin_1') didn't work. The string was decoded into unicode using the utf_8 encoding, but couldn't be encoded into latin_1. See error:\n\t", validStrings+= ((x, " decoded with utf_8 into unicode and encoded into latin_1"),) print ude except UnicodeDecodeError as ude: print "str.decode('utf_8') didn't work, probably because the string wasn't utf_8 encoded. See error:\n\t", print ude except UnicodeEncodeError as uee: print "ERROR. Couldn't print the str.decode('utf_8') because the console is set to an encoding that doesn't understand some character in the string. See error:\n\t",uee print print "Printing information about each character in the original string." for char in str: try: print "\t'" + char + "' = original char {0} repr(char)={1}".format(type(char), repr(char)) except UnicodeDecodeError as ude: print "\t'?' = original char {0} repr(char)={1} ERROR PRINTING: {2}".format(type(char), repr(char), ude) except UnicodeEncodeError as uee: print "\t'?' = original char {0} repr(char)={1} ERROR PRINTING: {2}".format(type(char), repr(char), uee) print uee try: x = unicode(char) print "\t'" + x + "' = unicode(char) {1} repr(unicode(char))={2}".format(x, type(x), repr(x)) except UnicodeDecodeError as ude: print "\t'?' = unicode(char) ERROR: {0}".format(ude) except UnicodeEncodeError as uee: print "\t'?' = unicode(char) {0} repr(char)={1} ERROR PRINTING: {2}".format(type(x), repr(x), uee) try: x = char.decode('latin_1') print "\t'" + x + "' = char.decode('latin_1') {1} repr(char.decode('latin_1'))={2}".format(x, type(x), repr(x)) except UnicodeDecodeError as ude: print "\t'?' = char.decode('latin_1') ERROR: {0}".format(ude) except UnicodeEncodeError as uee: print "\t'?' = char.decode('latin_1') {0} repr(char)={1} ERROR PRINTING: {2}".format(type(x), repr(x), uee) try: x = char.decode('utf_8') print "\t'" + x + "' = char.decode('utf_8') {1} repr(char.decode('utf_8'))={2}".format(x, type(x), repr(x)) except UnicodeDecodeError as ude: print "\t'?' = char.decode('utf_8') ERROR: {0}".format(ude) except UnicodeEncodeError as uee: print "\t'?' = char.decode('utf_8') {0} repr(char)={1} ERROR PRINTING: {2}".format(type(x), repr(x), uee) print x = 'ó' encodingDemo(x) Much thanks for the answers below and especially to @John Machin for answering so thoroughly.

Read the article
New Whitepaper: Advanced Compression 11gR1 Benchmarks with EBS 12

- by Steven Chan

In my opinion, if there's any reason to upgrade an E-Business Suite environment to the 11gR1 or 11gR2 database, it's the Advanced Compression database option. Oracle Advanced Compression was introduced in Oracle Database 11g, and allows you to compress structured data (numbers, characters) as well as unstructured data (documents, spreadsheets, XML and other files). It provides enhanced compression for database backups and also includes network compression for faster synchronization with standby databases.In other words, the promise of Advanced Compression is that it can make your E-Business Suite database smaller and faster. But how well does it actually deliver on that promise?Apps 12 + Advanced Compression Benchmarks now availableThree of my colleagues, Uday Moogala, Lester Gutierrez, and Andy Tremayne, have been benchmarking Oracle E-Business Suite Release 12 with Advanced Compression 11gR1. They've just released a detailed whitepaper with their benchmarking results and recommendations.This whitepaper is available in two locations:Oracle E-Business Suite Release 12.1 with Oracle Database 11g Advanced Compression (Note 1110648.1) (requires My Oracle Support access)Oracle E-Business Suite Release 12.1 with Oracle Database 11g Advanced Compression (Applications Benchmark website, PDF, 500K)

Read the article
The Dark Knight and Team Fortress 2 Mashup Movie Trailer [Video]

- by Asian Angel

If you are a Batman fan then you will find this video mashup interesting to watch. YouTube user TrueOneMoreUser has created a unique Dark Knight Trailer using characters from Team Fortress 2 and select scenes from The Dark Knight movie itself. Team Fortress 2 – The Demo Knight [via Geeks are Sexy] Latest Features How-To Geek ETC How To Make Disposable Sleeves for Your In-Ear Monitors Macs Don’t Make You Creative! So Why Do Artists Really Love Apple? MacX DVD Ripper Pro is Free for How-To Geek Readers (Time Limited!) HTG Explains: What’s a Solid State Drive and What Do I Need to Know? How to Get Amazing Color from Photos in Photoshop, GIMP, and Paint.NET Learn To Adjust Contrast Like a Pro in Photoshop, GIMP, and Paint.NET Bring the Grid to Your Desktop with the TRON Legacy Theme for Windows 7 The Dark Knight and Team Fortress 2 Mashup Movie Trailer [Video] Dirt Cheap DSLR Viewfinder Improves Outdoor DSLR LCD Visibility Lakeside Sunset in the Mountains [Wallpaper] Taskbar Meters Turn Your Taskbar into a System Resource Monitor Create Shortcuts for Your Favorite or Most Used Folders in Ubuntu

Read the article
How to build a game like HaxBall?

- by Cengiz Frostclaw

I'm a coder interested in game development, and I want to build a very simple p2p (real time) game. The perfect example for this is haxball. The client/server model doesn't matter, it could be like haxball (i.e. the room creator is the host) or changing host, or the server is host (if there is such a thing) I just want to learn the basics to create a field where players can move their characters in real time. So where should I start? Please guide me into the right direction. And with examples please. I found out that RTMFP does what I seek, but how do I use it? I know ActionScript 3.0, but no idea how to setup a multiplayer game. Please help ! Thanks in advance.

Read the article
A new Bursting patch for EBS Customers.

- by ashish.shrivastava

Patch 8594771 for Bursting functionality was released last week. The patch is for EBS R12 and 11i Customers. Following bugs have been fixed against this patch. 9550733 - DELIVERYREQUEST DOES NOT ACCEPT THE COMM IN EMAIL ADDRESS ALIAS. 9483876 : XML PUBLISHER REPORT BURSTING PROGRAM NOT DEBUGGING AFTER PATCH 7352374 9433950 : JAVA.UTIL.ZIP.ZIPEXCEPTION: ZIP FILE MUST HAVE AT LEAST ONE ENTRY 9285648 : XML PUBLISHER REPORT BURSTING PROGRAM IGNORES PORT NUMBER FOR WEBDAV DELIVERY 9269498 : NOT ABLE TO GENERATE ALL TYPES OF OUTPUTS WHILE BURSTING WITH JDE INTEGRATION. 9110360 : BURSTING USING BACKGROUND AND TITLE TAGS IN BURSTING CONTROL FILE FAILS 8937963 : BURSTING - UNABLE TO USE SPECIFIC FILENAME MORE THAN ONCE CAUSING ZIPEXCEPTION 8871779 : THE CONCURRENT LOG DOES NOT CONTAIN INFO AFTER PATCH 7352374 8594771 - BURSTING DOES NOT SUPPORT ESCAPE CHARACTERS LIKE "FIRST LAST "

Read the article
Use Advanced Font Ligatures in Office 2010

- by Matthew Guay

Fonts can help your documents stand out and be easier to read, and Office 2010 helps you take your fonts even further with support for OpenType ligatures, stylistic sets, and more. Here’s a quick look at these new font features in Office 2010. Introduction Starting with Windows 7, Microsoft has made an effort to support more advanced font features across their products. Windows 7 includes support for advanced OpenType font features and laid the groundwork for advanced font support in programs with the new DirectWrite subsystem. It also includes the new font Gabriola, which includes an incredible number of beautiful stylistic sets and ligatures. Now, with the upcoming release of Office 2010, Microsoft is bringing advanced typographical features to the Office programs we love. This includes support for OpenType ligatures, stylistic sets, number forms, contextual alternative characters, and more. These new features are available in Word, Outlook, and Publisher 2010, and work the same on Windows XP, Vista and Windows 7. Please note that Windows does include several OpenType fonts that include these advanced features. Calibri, Cambria, Constantia, and Corbel all include multiple number forms, while Consolas, Palatino Linotype, and Gabriola (Windows 7 only) include all the OpenType features. And, of course, these new features will work great with any other OpenType fonts you have that contain advanced ligatures, stylistic sets, and number forms. Using advanced typography in Word To use the new font features, open a new document, select an OpenType font, and enter some text. Here we have Word 2010 in Windows 7 with some random text in the Gabriola font. Click the arrow on the bottom of the Font section of the ribbon to open the font properties. Alternately, select the text and click Font. Now, click on the Advanced tab to see the OpenType features. You can change the ligatures setting… Choose Proportional or Tabular number spacing… And even select Lining or Old-style number forms. Here’s a comparison of Lining and Old-style number forms in Word 2010 with the Calibri font. Finally, you can choose various Stylistic sets for your font. The dialog always shows 20 styles, whether or not your font includes that many. Most include only 1 or 2; Gabriola includes 6. Here’s lorem ipsum text, using the Gabriola font with Stylistic set 6. Impressive, huh? The font ligatures change based on context, so they will automatically change as you are typing. Watch the transition as we typed the word Microsoft in Word with Gabriola stylistic set 6. Here’s another example, showing the fi and tt ligatures in Calibri. These effects work great in Word 2010 in XP, too. And, since Outlook uses Word as it’s editing engine, you can use the same options in Outlook 2010. Note that these font effects may not show up the same if the recipient’s email client doesn’t support advanced OpenType typography. It will, of course, display perfectly if the recipient is using Outlook 2010. Using advanced typography in Publisher 2010 Publisher 2010 includes the same advanced font features. This is especially nice for those using Publisher for professional layout and design. Simply insert a text box, enter some text, select it, and click the arrow on the bottom of the font box as in Word to open the font properties. This font options dialog is actually more advanced than Word’s font options. You can preview your font changes on sample text right in the properties box. You can also choose to add or remove a swash from your characters. Conclusion Advanced typographical effects are a welcome addition to Word and Publisher 2010, and they are very impressive when coupled with modern fonts such as Gabriola. From designing elegant headers to using old-style numbers, these features are very useful and fun. Do you have a favorite OpenType font that includes advanced typographical features? Let us know in the comments! More Reading Advances in typography in Windows 7 – Engineering 7 Blog New features in Microsoft Word 2010 Similar Articles Productive Geek Tips Change the Default Font in Excel 2007Ask the Readers: Do You Use a Laptop, Desktop, or Both?Keep Websites From Using Tiny Fonts in SafariAdd or Remove Apps from the Microsoft Office 2007 or 2010 SuiteFriday Fun: Desktop Tower Defense Pro TouchFreeze Alternative in AutoHotkey The Icy Undertow Desktop Windows Home Server – Backup to LAN The Clear & Clean Desktop Use This Bookmarklet to Easily Get Albums Use AutoHotkey to Assign a Hotkey to a Specific Window Latest Software Reviews Tinyhacker Random Tips Revo Uninstaller Pro Registry Mechanic 9 for Windows PC Tools Internet Security Suite 2010 PCmover Professional SpeedyFox Claims to Speed up your Firefox Beware Hover Kitties Test Drive Mobile Phones Online With TryPhone Ben & Jerry’s Free Cone Day, 3/23/10 New Stinger from McAfee Helps Remove ‘FakeAlert’ Threats Google Apps Marketplace: Tools & Services For Google Apps Users

Read the article
How can a code editor effectively hint at code nesting level - without using indentation?

- by pgfearo

I'm writing an XML text editor that provides 2 view options for the same XML text, one indented (virtually), the other left-justified. The motivation for the left-justified view is to help users 'see' the whitespace characters they're using for indentation of plain-text or XPath code without interference from indentation that is an automated side-effect of the XML context. I want to provide visual clues (in the non-editable part of the editor) for the left-justified mode that will help the user, but without getting too elaborate. I tried just using connecting lines, but that seemed too busy. The best I've come up with so far is shown in a mocked up screenshot of the editor below, but I'm seeking better/simpler alternatives (that don't require too much code). [Edit] Taking the heatmap idea (from: @jimp) I get something like this: or even these alternates:

Read the article
String Format for DateTime in C#

- by SAMIR BHOGAYTA

String Format for DateTime [C#] This example shows how to format DateTime using String.Format method. All formatting can be done also using DateTime.ToString method. Custom DateTime Formatting There are following custom format specifiers y (year), M (month), d (day), h (hour 12), H (hour 24), m (minute), s (second), f (second fraction), F (second fraction, trailing zeroes are trimmed), t (P.M or A.M) and z (time zone). Following examples demonstrate how are the format specifiers rewritten to the output. [C#] // create date time 2008-03-09 16:05:07.123 DateTime dt = new DateTime(2008, 3, 9, 16, 5, 7, 123); String.Format("{0:y yy yyy yyyy}", dt); // "8 08 008 2008" year String.Format("{0:M MM MMM MMMM}", dt); // "3 03 Mar March" month String.Format("{0:d dd ddd dddd}", dt); // "9 09 Sun Sunday" day String.Format("{0:h hh H HH}", dt); // "4 04 16 16" hour 12/24 String.Format("{0:m mm}", dt); // "5 05" minute String.Format("{0:s ss}", dt); // "7 07" second String.Format("{0:f ff fff ffff}", dt); // "1 12 123 1230" sec.fraction String.Format("{0:F FF FFF FFFF}", dt); // "1 12 123 123" without zeroes String.Format("{0:t tt}", dt); // "P PM" A.M. or P.M. String.Format("{0:z zz zzz}", dt); // "-6 -06 -06:00" time zone You can use also date separator / (slash) and time sepatator : (colon). These characters will be rewritten to characters defined in the current DateTimeFormatInfo.DateSeparator and DateTimeFormatInfo.TimeSeparator. [C#] // date separator in german culture is "." (so "/" changes to ".") String.Format("{0:d/M/yyyy HH:mm:ss}", dt); // "9/3/2008 16:05:07" - english (en-US) String.Format("{0:d/M/yyyy HH:mm:ss}", dt); // "9.3.2008 16:05:07" - german (de-DE) Here are some examples of custom date and time formatting: [C#] // month/day numbers without/with leading zeroes String.Format("{0:M/d/yyyy}", dt); // "3/9/2008" String.Format("{0:MM/dd/yyyy}", dt); // "03/09/2008" // day/month names String.Format("{0:ddd, MMM d, yyyy}", dt); // "Sun, Mar 9, 2008" String.Format("{0:dddd, MMMM d, yyyy}", dt); // "Sunday, March 9, 2008" // two/four digit year String.Format("{0:MM/dd/yy}", dt); // "03/09/08" String.Format("{0:MM/dd/yyyy}", dt); // "03/09/2008" Standard DateTime Formatting In DateTimeFormatInfo there are defined standard patterns for the current culture. For example property ShortTimePattern is string that contains value h:mm tt for en-US culture and value HH:mm for de-DE culture. Following table shows patterns defined in DateTimeFormatInfo and their values for en-US culture. First column contains format specifiers for the String.Format method. Specifier DateTimeFormatInfo property Pattern value (for en-US culture) t ShortTimePattern h:mm tt d ShortDatePattern M/d/yyyy T LongTimePattern h:mm:ss tt D LongDatePattern dddd, MMMM dd, yyyy f (combination of D and t) dddd, MMMM dd, yyyy h:mm tt F FullDateTimePattern dddd, MMMM dd, yyyy h:mm:ss tt g (combination of d and t) M/d/yyyy h:mm tt G (combination of d and T) M/d/yyyy h:mm:ss tt m, M MonthDayPattern MMMM dd y, Y YearMonthPattern MMMM, yyyy r, R RFC1123Pattern ddd, dd MMM yyyy HH':'mm':'ss 'GMT' (*) s SortableDateTimePattern yyyy'-'MM'-'dd'T'HH':'mm':'ss (*) u UniversalSortableDateTimePattern yyyy'-'MM'-'dd HH':'mm':'ss'Z' (*) (*) = culture independent Following examples show usage of standard format specifiers in String.Format method and the resulting output. [C#] String.Format("{0:t}", dt); // "4:05 PM" ShortTime String.Format("{0:d}", dt); // "3/9/2008" ShortDate String.Format("{0:T}", dt); // "4:05:07 PM" LongTime String.Format("{0:D}", dt); // "Sunday, March 09, 2008" LongDate String.Format("{0:f}", dt); // "Sunday, March 09, 2008 4:05 PM" LongDate+ShortTime String.Format("{0:F}", dt); // "Sunday, March 09, 2008 4:05:07 PM" FullDateTime String.Format("{0:g}", dt); // "3/9/2008 4:05 PM" ShortDate+ShortTime String.Format("{0:G}", dt); // "3/9/2008 4:05:07 PM" ShortDate+LongTime String.Format("{0:m}", dt); // "March 09" MonthDay String.Format("{0:y}", dt); // "March, 2008" YearMonth String.Format("{0:r}", dt); // "Sun, 09 Mar 2008 16:05:07 GMT" RFC1123 String.Format("{0:s}", dt); // "2008-03-09T16:05:07" SortableDateTime String.Format("{0:u}", dt); // "2008-03-09 16:05:07Z" UniversalSortableDateTime

Read the article
Formal Languages, Inductive Proofs & Regular Expressions

- by MarkPearl

So I am slogging away at my UNISA stuff. I have just finished doing the initial once non stop read through the first 11 chapters of my COS 201 Textbook - “Introduction to Computer Theory 2nd Edition” by Daniel Cohen. It has been an interesting couple of days, with familiar concepts coming up as well as some new territory. In this posting I am going to cover the first couple of chapters of the book. Let start with Formal Languages… What exactly is a formal language? Pretty much a no duh question for me but still a good one to ask – a formal language is a language that is defined in a precise mathematical way. Does that mean that the English language is a formal language? I would say no – and my main motivation for this is that one can have an English sentence that is correct grammatically that is also ambiguous. For example the ambiguous sentence: "I once shot an elephant in my pyjamas.” For this and possibly many other reasons that I am unaware of, English is termed a “Natural Language”. So why the importance of formal languages in computer science? Again a no duh question in my mind… If we want computers to be effective and useful tools then we need them to be able to evaluate a series of commands in some form of language that when interpreted by the device no confusion will exist as to what we were requesting. Imagine the mayhem that would exist if a computer misinterpreted a command to print a document and instead decided to delete it. So what is a Formal Language made up of… For my study purposes a language is made up of a finite alphabet. For a formal language to exist there needs to be a specification on the language that will describe whether a string of characters has membership in the language or not. There are two basic ways to do this: By a “machine” that will recognize strings of the language (e.g. Finite Automata). By a rule that describes how strings of a language can be formed (e.g. Regular Expressions). When we use the phrase “string of characters”, we can also be referring to a “word”. What is an Inductive Proof? So I am not to far into my textbook and of course it starts referring to proofs and different types. I have had to go through several different approaches of proofs in the past, but I can never remember their formal names , so when I saw “inductive proof” I thought to myself – what the heck is that? Google to the rescue… An inductive proof is like a normal proof but it employs a neat trick which allows you to prove a statement about an arbitrary number n by first proving it is true when n is 1 and then assuming it is true for n=k and showing it is true for n=k+1. The idea is that if you want to show that someone can climb to the nth floor of a fire escape, you need only show that you can climb the ladder up to the fire escape (n=1) and then show that you know how to climb the stairs from any level of the fire escape (n=k) to the next level (n=k+1). Does this sound like a form of recursion? No surprise then that in the same chapter they deal with recursive definitions. An example of a recursive definition for the language EVEN would the 3 rules below: 2 is in EVEN If x is in EVEN then so is x+2 The only elements in the set EVEN are those that be produced by the rules above. Nothing to exciting… So if a definition for a language is done recursively, then it makes sense that the language can be proved using induction. Regular Expressions So I am wondering to myself what use is this all – in fact – I find this the biggest challenge to any university material is that it is quite hard to find the immediate practical applications of some theory in real life stuff. How great was my joy when I suddenly saw the word regular expression being introduced. I had been introduced to regular expressions on Stack Overflow where I was trying to recognize if some text measurement put in by a user was in a valid form or not. For instance, the imperial system of measurement where you have feet and inches can be represented in so many different ways. I had eventually turned to regular expressions as an easy way to check if my parser could correctly parse the text or not and convert it to a normalize measurement. So some rules about languages and regular expressions… Any finite language can be represented by at least one if not more regular expressions A regular expressions is almost a rule syntax for expressing how regular languages can be formed regular expressions are cool For a regular expression to be valid for a language it must be able to generate all the words in the language and no other words. This is important. It doesn’t help me if my regular expression parses 100% of my measurement texts but also lets one or two invalid texts to pass as well. Okay, so this posting jumps around a bit – but introduces some very basic fundamentals for the subject which will be built on in later postings… Time to go and do some practical examples now…

Read the article
Coding error at open URL

- by Lobo

Hi, I have the following method to open a URL API String c=""; URL direccionURL; try { direccionURL = new URL("http://api.stackoverflow.com/1.0/users/523725"); BufferedReader in = new BufferedReader(new InputStreamReader( direccionURL.openStream())); String inputLine; while ((inputLine = in.readLine()) != null) c+=inputLine; in.close(); } catch (MalformedURLException e) { // TODO Auto-generated catch block e.printStackTrace(); } catch (IOException e) { // TODO Auto-generated catch block e.printStackTrace(); } return c; In the end, the "c" variable contains a set of characters that are not the same I get if I open the same URL with a browser. Why?, What am I doing wrong? Thank's for help. Regards!

Read the article
What are the pros and cons of Coffeescript?

- by Philip

Of course one big pro is the amount of syntactic sugar leading to shorter code in a lot of cases. On http://jashkenas.github.com/coffee-script/ there are impressive examples. On the other hand I have doubts that these examples represent code of complex real world applications. In my code for instance I never add functions to bare objects but rather to their prototypes. Moreover the prototype feature is hidden from the user, suggesting classical OOP rather than idiomatic Javascript. The array comprehension example would look in my code probably like this: cubes = $.map(list, math.cube); // which is 8 characters less using jQuery...

Read the article
If spaces in filenames are possible, why do some of us still avoid using them?

- by Chris W. Rea

Somebody I know expressed irritation today regarding those of us who tend not to use spaces in our filenames, e.g. NamingThingsLikeThis.txt -- despite most modern operating systems supporting spaces in filenames. Non-technical people must look at filenames created by geeks and wonder where we learned English. So, what are the reasons that spaces in filenames are avoided or discouraged? The most obvious reason I could think of, and why I typically avoid it, are the extra quotes required on the command line when dealing with such files. Are there any other significant reasons, other than the practice being a vestigial preference? UPDATE: Thanks for all your answers! I'm surprised how popular this was. So, here's a summary: Six Reasons Why Geeks Prefer Filenames Without Spaces In Them It's irritating to put quotes around them when referenced on the command line (or elsewhere.) Some older operating systems didn't used to support them and us old dogs are used to that. Some tools still don't support spaces in filenames at all or very well. (But they should.) It's irritating to escape spaces when used where spaces must be escaped, such as URLs. Certain unenlightened services (e.g. file hosting, webmail) remove or replace spaces anyway! Names without spaces can be shorter, which is sometimes desirable as paths are limited.

Read the article
Can I do filename pattern matching in a bash script?

- by Bob Bowden

Can I do filename pattern matching in a bash script? "test" is a directory with the following files ... bob@bob-laptop:~/test$ ls exclude exclude1 exclude2 include1 include2 from the command line, if I want to exclude some of the files, I can do ... bob@bob-laptop:~/test$ echo !(exclude*) include1 include2 but, if I put that command in a script (named exclude) ... bob@bob-laptop:~/test$ cat exclude echo !(exclude*) when I execute it, I get an error ... bob@bob-laptop:~/test$ ./exclude ./exclude: line 1: syntax error near unexpected token (' ./exclude: line 1:echo !(exclude*)' I've tried every (I think) variation of escaping some, all or none of the special characters and I still get an error. What am I missing here? If I can't do this, would someone please be so kind as to explain why?

Read the article
keyboard layout switching on restart

- by zidarsk8

I have two keyboard layouts that I use, My default keyboard is an USA layout, with a secondary Slovenian layout. I use the Slovenian layout only when I need some special characters when writing emails and such. But my problem is this: Every time I reboot my computer, the layout indicator shows I am on the USA layout, but the actual keyboard layout is Slovenian. Then I normally have to switch from USA to Slovenian and back, to get the layout I want. Is there anything I can do about this? I don't restart my computer often, but when I do I forget about that, and typing the passwords like that doesn't work.

Read the article
Login screen restarts while entering password

- by Shane L

I am having a problem that only occurred after installing the fglrx proprietary driver through the additional drivers app. The exact same issue is described in this question, however it's closed. Must login twice before entering Unity; first login screen has graphical anomalies When I boot up my computer, once I get to lightdm login screen I will start typing my password, and upon the entering of two of my password characters "n2" the screen will start displaying some corruption in the very top edge of the screen, then after I hit the number 2, lightdm seems to crash and it will restart and everything is fine, I can login from that point. So the issue is, every time I reboot, I have to login once, allow lightdm to crash and restart, then login as normal. The issue did not occur prior to installing fglrx, and has happened through several reinstalls.

Read the article
Symbolic Regular Expression Exploration

- by Robz / Fervent Coder

This is a pretty sweet little tool. Rex (Regular Expression Exploration) is a tool that allows you to give it a regular expression and it returns matching strings. The example below creates10 strings that start and end with a number and have at least 2 characters: > rex.exe "^\d.*\d$" /k:10 This is something I could use to validate/generate the Regular Expressions I have created with both UppercuT and RoundhousE. Check out the video below: Margus Veanes - Rex - Symbolic Regular Expression Exploration Margus Veanes, a Researcher from the RiSE group at Microsoft Research, gives an overview of Rex, a tool that generates matching string from .NET regular expressions. Rex turns regular expres...

Read the article
Seperating entities from their actions or behaviours

- by Jamie Dixon

Hi everyone, I'm having a go at creating a very simple text based game and am wondering what the standard design patterns are when it comes to entities (characters, sentient scenery) and the actions those entities can perform. As an example, I have entity that is a 'person' with various properties such as age, gender, height, etc. This 'person' can also perform some actions such as speaking, walking, jumping, flying, etc etc. How would you seperate out the entity from the actions it can perform and what are some common design patterns that solve this kind of problem?

Read the article
Isometric drawing "Not Tile Stuff" on isometric map?

- by Icebone1000

So I got my isometric renderer working, it can draw diamond or jagged maps...Then I want to move on...How do I draw characters/objects on it in a optimal way? What Im doing now, as one can imagine, is traversing my grid(map) and drawing the tiles in a order so alpha blending works correctly. So, anything I draw in this map must be drawed at the same time the map is being drawn, with sucks a lot, screws your very modular map drawer, because now everything on the game (but the HUD) must be included on the drawer.. I was thinking whats the best approach to do this, comparing the position of all objects(not tile stuff) on the grid against the current tile being draw seems stupid, would it be better to add an id ON the grid(map)? this also seems terrible, because objects can move freely, not per tile steps (it can occupies 2 tiles if its between them, etc.) Dont know if matters, but my grid is 3D, so its not a plane with objects poping out, its a bunch of pilled cubes.

Read the article

Search Results

Search found 5433 results on 218 pages for 'escaped characters'.

Page 104/218 | < Previous Page | 100 101 102 103 104 105 106 107 108 109 110 111 | Next Page >

- by Morten Mertner

- by Mel

- by sdaau

- by 0R10N

- by Curious Apprentice

- by Nathan Spears

- by Kalec

- by Nathan Spears

- by Steven Chan

- by Asian Angel

- by Cengiz Frostclaw

- by ashish.shrivastava

- by Matthew Guay

- by pgfearo

- by SAMIR BHOGAYTA

- by MarkPearl

- by Lobo

- by Philip

- by Chris W. Rea

- by Bob Bowden

- by zidarsk8

- by Shane L

- by Robz / Fervent Coder

- by Jamie Dixon

- by Icebone1000

< Previous Page | 100 101 102 103 104 105 106 107 108 109 110 111 | Next Page >