Search Results

Search found 85 results on 4 pages for 'redacted'.

Page 4/4 | < Previous Page | 1 2 3 4

Using JSON Data to Populate a Google Map with Database Objects

- by MikeH

I'm revising this question after reading the resources mentioned in the original answers and working through implementing it. I'm using the google maps api to integrate a map into my Rails site. I have a markets model with the following columns: ID, name, address, lat, lng. On my markets/index view, I want to populate a map with all the markets in my markets table. I'm trying to output @markets as json data, and that's where I'm running into problems. I have the basic map displaying, but right now it's just a blank map. I'm following the tutorials very closely, but I can't get the markers to generate dynamically from the json. Any help is much appreciated! Here's my setup: Markets Controller: def index @markets = Market.filter_city(params[:filter]) respond_to do |format| format.html # index.html.erb format.json { render :json => @market} format.xml { render :xml => @market } end end Markets/index view: <head> <script type="text/javascript" src="http://www.google.com/jsapi?key=GOOGLE KEY REDACTED, BUT IT'S THERE" > </script> <script type="text/javascript"> var markets = <%= @markets.to_json %>; </script> <script type="text/javascript" charset="utf-8"> google.load("maps", "2.x"); google.load("jquery", "1.3.2"); </script> </head> <body> <div id="map" style="width:400px; height:300px;"></div> </body> Public/javascripts/application.js: function initialize() { if (GBrowserIsCompatible() && typeof markets != 'undefined') { var map = new GMap2(document.getElementById("map")); map.setCenter(new GLatLng(40.7371, -73.9903), 13); map.addControl(new GLargeMapControl()); function createMarker(latlng, market) { var marker = new GMarker(latlng); var html="<strong>"+market.name+"</strong><br />"+market.address; GEvent.addListener(marker,"click", function() { map.openInfoWindowHtml(latlng, html); }); return marker; } var bounds = new GLatLngBounds; for (var i = 0; i < markets.length; i++) { var latlng=new GLatLng(markets[i].lat,markets[i].lng) bounds.extend(latlng); map.addOverlay(createMarker(latlng, markets[i])); } } } window.onload=initialize; window.onunload=GUnload;

Read the article
Using JSON Data to Populate a Google Map with Database Objects

- by MikeH

I'm revising this question after reading the resources mentioned in the original answers and working through implementing it. I'm using the google maps api to integrate a map into my Rails site. I have a markets model with the following columns: ID, name, address, lat, lng. On my markets/index view, I want to populate a map with all the markets in my markets table. I'm trying to output @markets as json data, and that's where I'm running into problems. I have the basic map displaying, but right now it's just a blank map. I'm following the tutorials very closely, but I can't get the markers to generate dynamically from the json. Any help is much appreciated! Here's my setup: Markets Controller: def index @markets = Market.filter_city(params[:filter]) respond_to do |format| format.html # index.html.erb format.json { render :json => @market} format.xml { render :xml => @market } end end Markets/index view: <head> <script type="text/javascript" src="http://www.google.com/jsapi?key=GOOGLE KEY REDACTED, BUT IT'S THERE" > </script> <script type="text/javascript"> var markets = <%= @markets.to_json %>; </script> <script type="text/javascript" charset="utf-8"> google.load("maps", "2.x"); google.load("jquery", "1.3.2"); </script> </head> <body> <div id="map" style="width:400px; height:300px;"></div> </body> Public/javascripts/application.js: function initialize() { if (GBrowserIsCompatible() && typeof markets != 'undefined') { var map = new GMap2(document.getElementById("map")); map.setCenter(new GLatLng(40.7371, -73.9903), 13); map.addControl(new GLargeMapControl()); function createMarker(latlng, market) { var marker = new GMarker(latlng); var html="<strong>"+market.name+"</strong><br />"+market.address; GEvent.addListener(marker,"click", function() { map.openInfoWindowHtml(latlng, html); }); return marker; } var bounds = new GLatLngBounds; for (var i = 0; i < markets.length; i++) { var latlng=new GLatLng(markets[i].lat,markets[i].lng) bounds.extend(latlng); map.addOverlay(createMarker(latlng, markets[i])); } } } window.onload=initialize; window.onunload=GUnload;

Read the article
Problem with character encoding on email sent via PHP?

- by cosgrove

Hi everybody, Having some trouble sending properly formatted HTML e-mail from a PHP script. I am running PHP 5.3.0 and Apache 2.2.11 on Windows XP Professional. The output looks like this: Agent Summary for Support on Tuesday April 20 2010=20 Ext. Name Time Volume 137 Agent Name 01:27:25 1 138 =09 00:00:00 0 139 =09 00:00:00 0 You see the =20 and =09 in there? If you look at the HTML you also see = signs being turned into =3D. I figure this is a character encoding issue as I read the following at Wikipedia: ISO-8859-1 and Windows-1252 confusion It is very common to mislabel text data with the charset label ISO-8859-1, even though the data is really Windows-1252 encoded. In Windows-1252, codes between 0x80 and 0x9F are used for letters and punctuation, whereas they are control codes in ISO-8859-1. Many web browsers and e-mail clients will interpret ISO-8859-1 control codes as Windows-1252 characters in order to accommodate such mislabeling but it is not standard behaviour and care should be taken to avoid generating these characters in ISO-8859-1 labeled content. This looks like the problem but I don't know how to fix. My code looks like this: ob_start(); report_queue_summary($yesterday,$yesterday,$first_extension,$last_extension,$queue); $body_report = ob_get_contents(); ob_end_clean(); $body_footer = "This is an automatically generated e-mail."; $message = new Mail_mime(); $html = $body_header.$body_report.$body_footer; $message->setHTMLBody($html); $body = $message->get(); $extraheaders = array("From"=>"***redacted***","To"=>$recipient, "Subject"=>"Agent Summary for $yesterday [$queue]", "Content-type"=>"text/html; charset=iso-8859-1"); $headers = $message->headers($extraheaders); # setup e-mail; $host = "*********"; $port = "26"; $username = "*****"; $password = "*****"; # Send e-mail $smtp = Mail::factory('smtp', array ('host' => $host, 'port' => $port, 'auth' => true, 'username' => $username, 'password' => $password)); $mail = $smtp->send($recipient, $extraheaders, $body); if (PEAR::isError($mail)) { echo("" . $mail->getMessage() . ""); } else { echo("Message successfully sent!"); } Is the problem that I'm using output buffering?

Read the article
Unable to connect to OpenVPN server

- by Incognito

I'm trying to get a working setup of OpenVPN on my VM and authenticate into it from a client. I'm not sure but it looks to me like it's socket related, as it's not set to LISTEN, and localhost seems wrong. I've never set up VPN before. # netstat -tulpn | grep vpn Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name udp 0 0 127.0.0.1:1194 0.0.0.0:* 24059/openvpn I don't think this is set up correctly. Here's some detail into what I've done. I have a VPS from MediaTemple: These are my interfaces before starting openvpn: lo Link encap:Local Loopback inet addr:127.0.0.1 Mask:255.0.0.0 inet6 addr: ::1/128 Scope:Host UP LOOPBACK RUNNING MTU:16436 Metric:1 RX packets:39482 errors:0 dropped:0 overruns:0 frame:0 TX packets:39482 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0 RX bytes:3237452 (3.2 MB) TX bytes:3237452 (3.2 MB) venet0 Link encap:UNSPEC HWaddr 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00 inet addr:127.0.0.1 P-t-P:127.0.0.1 Bcast:0.0.0.0 Mask:255.255.255.255 UP BROADCAST POINTOPOINT RUNNING NOARP MTU:1500 Metric:1 RX packets:4885284 errors:0 dropped:0 overruns:0 frame:0 TX packets:4679884 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0 RX bytes:835278537 (835.2 MB) TX bytes:1989289617 (1.9 GB) venet0:0 Link encap:UNSPEC HWaddr 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00 inet addr:205.[redacted] P-t-P:205.186.148.82 Bcast:0.0.0.0 Mask:255.255.255.255 UP BROADCAST POINTOPOINT RUNNING NOARP MTU:1500 Metric:1 I've followed this guide on setting up a basic server and getting a .p12 file, however, I was receiving an error that stated /dev/net/tun was missing, so I created it mkdir -p /dev/net mknod /dev/net/tun c 10 200 chmod 600 /dev/net/tun This resolved the error preventing the service from launching, however, I am unable to connect. On the server I've set up the myserver.conf file (as per the tutorial) to indicate local 127.0.0.1 (I've also attempted with the public IP address, perhaps I don't understand what they mean by local IP?). The server launches without error, this is what the log looks like when it starts: Sun Apr 1 17:21:27 2012 OpenVPN 2.1.3 x86_64-pc-linux-gnu [SSL] [LZO2] [EPOLL] [PKCS11] [MH] [PF_INET6] [eurephia] built on Mar 11 2011 Sun Apr 1 17:21:27 2012 IMPORTANT: OpenVPN's default port number is now 1194, based on an official port number assignment by IANA. OpenVPN 2.0-beta16 and earlier used 5000 as the default port. Sun Apr 1 17:21:27 2012 NOTE: the current --script-security setting may allow this configuration to call user-defined scripts Sun Apr 1 17:21:27 2012 /usr/bin/openssl-vulnkey -q -b 1024 -m <modulus omitted> Sun Apr 1 17:21:27 2012 TUN/TAP device tun0 opened Sun Apr 1 17:21:27 2012 /sbin/ifconfig tun0 10.8.0.1 pointopoint 10.8.0.2 mtu 1500 Sun Apr 1 17:21:27 2012 GID set to openvpn Sun Apr 1 17:21:27 2012 UID set to openvpn Sun Apr 1 17:21:27 2012 UDPv4 link local (bound): [AF_INET]127.0.0.1:1194 Sun Apr 1 17:21:27 2012 UDPv4 link remote: [undef] Sun Apr 1 17:21:27 2012 Initialization Sequence Completed This creates a tun0 interface that looks like this: tun0 Link encap:UNSPEC HWaddr 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00 inet addr:10.8.0.1 P-t-P:10.8.0.2 Mask:255.255.255.255 UP POINTOPOINT RUNNING NOARP MULTICAST MTU:1500 Metric:1 RX packets:0 errors:0 dropped:0 overruns:0 frame:0 TX packets:0 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:100 RX bytes:0 (0.0 B) TX bytes:0 (0.0 B) And the netstat command still indicates the state is not set to LISTEN. On the client-side I've installed the p12 certs onto two devices (one is an android tablet, the other is an Ubuntu desktop). I don't see port 1194 as open either. Both clients install the cert files and then ask me for the L2TP secret (which was set on the file), but then they oddly ask me for a username and a password, which I don't know where I could possibly get those from. I attempted all of my logins, and some whacky guesses that were frantically pulling at straws. If there's any more information I could provide let me know.

Read the article
Why am I getting a 403 error on a POST to a PHP script?

- by John Gallagher

Background I want to allow my users to submit a crash report which will get emailed to me. I'm using UKCrashReporter with the bundled PHP script I've modified. This code does a POST to a specified URL along with the crash report. I'm on a shared server running Linux. My main domain is synapticmishap.co.uk. The Problem When I send the crash report off, on the Cocoa side, it reports as having sent it successfully, but I don't receive an email. The code has been used in lots of other well established Cocoa projects and it was working for me a few months ago. That leads me to conclude that the problems are related to my web server setup, something I know almost nothing about. When I look at my log files, I see entries like this: IP Redacted - - [10/Jun/2010:09:47:53 +0100] "POST /synapticmishap/crashreportform.php HTTP/1.1" 403 74 "-" "UKCrashReporter" What I've tried I've tried accessing the page at http://synapticmishap.co.uk/synapticmishap/crashreportform.php via a browser. It loads fine. I've made sure the permissions on this php script are set so anyone can execute it. I've tried removing the deny entries from the section of .htaccess at various levels starting with root. I've downloaded the URLParams plugin for Firefox which allows you to simulate POSTs. I put in the URL above and tried a post with "crashlog" as the parameter and "test" as the value. This generated a 200 log entry in my log file - it seemed to work, although no mail message was sent. Code I've got the following at http://synapticmishap.co.uk/synapticmishap/crashreportform.php. I've simplified it to just the bare bones in an effort to get it working. <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd"> <html> <head> <title>Crash Report</title> </head> <body> <p>This page contains super special magic which submits a crash report item to me.</p> <p>Nothing to see here - move along.</p> <?php mail( "[email protected]", "Crash Report", "\r\n\r\nThis is a test."); ?> </body> </html> This is my top level .htaccess file: RewriteEngine on # -FrontPage- IndexIgnore .htaccess */.??* *~ *# */HEADER* */README* */_vti* <Limit GET POST> order deny,allow deny from all allow from all </Limit> <Limit PUT DELETE> order deny,allow deny from all </Limit> Options All -Indexes RewriteCond %{HTTP_HOST} ^synapticmishap.co.uk$ [OR] RewriteCond %{HTTP_HOST} ^www.synapticmishap.co.uk$ RewriteCond %{HTTP_HOST} ^lapsusapp.co.uk$ [OR] RewriteCond %{HTTP_HOST} ^www.lapsusapp.co.uk$ RewriteRule ^/?$ "http\:\/\/synapticmishap\.co\.uk\/synapticmishap\/lapsuspromo\/" [R=301,L] RewriteCond %{HTTP_HOST} ^jgtutoring.co.uk$ [OR] RewriteCond %{HTTP_HOST} ^www.jgtutoring.co.uk$ RewriteRule ^/?$ "http\:\/\/synapticmishap\.co\.uk\/tutoring" [R=301,L] RewriteCond %{HTTP_HOST} ^synapticmishap.co.uk$ [OR] RewriteCond %{HTTP_HOST} ^www.synapticmishap.co.uk$ RewriteRule ^/?$ "http\:\/\/synapticmishap\.co\.uk\/synapticmishap" [R=301,L] RewriteCond %{HTTP_HOST} ^jgediting.co.uk$ [OR] RewriteCond %{HTTP_HOST} ^www.jgediting.co.uk$ RewriteRule ^/?$ "http\:\/\/synapticmishap\.co\.uk\/editing" [R=301,L] RewriteCond %{HTTP_REFERER} !^$ RewriteCond %{HTTP_REFERER} !^http://synapticmishap.co.uk/.*$ [NC] RewriteCond %{HTTP_REFERER} !^http://synapticmishap.co.uk$ [NC] RewriteCond %{HTTP_REFERER} !^http://www.synapticmishap.co.uk/.*$ [NC] RewriteCond %{HTTP_REFERER} !^http://www.synapticmishap.co.uk$ [NC] RewriteCond %{HTTP_REFERER} !^http://synapticmishap.co.uk/synapticmishap/crashreportform.php/.*$ [NC] RewriteCond %{HTTP_REFERER} !^http://synapticmishap.co.uk/synapticmishap/crashreportform.php$ [NC] RewriteRule .*\.(jpg|jpeg|gif|png|bmp)$ - [F,NC] Help! I'm at the end of my tether with this and I'm in a very unfamiliar space with all this web stuff. I'd be most appreciative of any thoughts people had on why this isn't working. Thanks.

Read the article
Connect to VPN from Mac on Time Capsule network

- by Lou Franco

I have a few clients on my network that can connect to my work VPN (Windows PPTP) when they are not on my home network. On my home network (Cable Modem with Time Capsule providing Wifi), it fails very early -- looks like it can't even establish a connection. Logs just say that it failed -- even verbose logs don't have much: I redacted the host and IP from this log, but I can ping it. Wed Feb 2 14:32:41 2011 : PPTP connecting to server 'XXX.XXX.com' (XXX.XX.XX.XX)... Wed Feb 2 14:32:41 2011 : PPTP connection established. Wed Feb 2 14:32:41 2011 : using link 0 Wed Feb 2 14:32:41 2011 : Using interface ppp0 Wed Feb 2 14:32:41 2011 : Connect: ppp0 <--> socket[34:17] Wed Feb 2 14:32:41 2011 : sent [LCP ConfReq id=0x1 <asyncmap 0x0> <magic 0x543c7af8> <pcomp> <accomp>] Wed Feb 2 14:32:44 2011 : sent [LCP ConfReq id=0x1 <asyncmap 0x0> <magic 0x543c7af8> <pcomp> <accomp>] Wed Feb 2 14:32:47 2011 : sent [LCP ConfReq id=0x1 <asyncmap 0x0> <magic 0x543c7af8> <pcomp> <accomp>] Wed Feb 2 14:32:50 2011 : sent [LCP ConfReq id=0x1 <asyncmap 0x0> <magic 0x543c7af8> <pcomp> <accomp>] Wed Feb 2 14:32:53 2011 : sent [LCP ConfReq id=0x1 <asyncmap 0x0> <magic 0x543c7af8> <pcomp> <accomp>] Wed Feb 2 14:32:56 2011 : sent [LCP ConfReq id=0x1 <asyncmap 0x0> <magic 0x543c7af8> <pcomp> <accomp>] Wed Feb 2 14:32:59 2011 : sent [LCP ConfReq id=0x1 <asyncmap 0x0> <magic 0x543c7af8> <pcomp> <accomp>] Wed Feb 2 14:33:02 2011 : sent [LCP ConfReq id=0x1 <asyncmap 0x0> <magic 0x543c7af8> <pcomp> <accomp>] Wed Feb 2 14:33:05 2011 : sent [LCP ConfReq id=0x1 <asyncmap 0x0> <magic 0x543c7af8> <pcomp> <accomp>] Wed Feb 2 14:33:08 2011 : sent [LCP ConfReq id=0x1 <asyncmap 0x0> <magic 0x543c7af8> <pcomp> <accomp>] Wed Feb 2 14:33:11 2011 : LCP: timeout sending Config-Requests Wed Feb 2 14:33:11 2011 : Connection terminated. Wed Feb 2 14:33:11 2011 : PPTP disconnecting... Wed Feb 2 14:33:11 2011 : PPTP disconnected Others can get to the VPN and I can too, but not on my network. The only clue I have seen in other forums is to set the NAT default host on the Time Capsule -- I set this to the IP that my mac got over DHCP. I made sure that my Mac gets a different range of IP addresses that it would get if it connected to the VPN (192.168.1.x vs. 10.0.0.x). Not using any VPN client -- just Network System Preferences. It has worked in the past -- but it was a while ago, so I can't pinpoint a change. My sysadmin doesn't even see incoming connections to the VPN (nothing logged about me when I connect). Looking for any diagnostic advice at all

Read the article
Google Analytics V2 SDK for Android EasyTracker giving errors

- by Prince

I have followed the tutorial for the new Google Analytics V2 SDK for Android located here: https://developers.google.com/analytics/devguides/collection/android/v2/ Unfortunately whenever I go to run the application the reporting is not working and this is the messages that logcat gives me: 07-09 09:13:16.978: W/Ads(13933): No Google Analytics: Library Incompatible. 07-09 09:13:16.994: I/Ads(13933): To get test ads on this device, call adRequest.addTestDevice("2BB916E1BD6BE6407582A429D763EC71"); 07-09 09:13:17.018: I/Ads(13933): adRequestUrlHtml: <html><head><script src="http://media.admob.com/sdk-core-v40.js"></script><script>AFMA_getSdkConstants();AFMA_buildAdURL({"kw":[],"preqs":0,"session_id":"7925570029955749351","u_sd":2,"seq_num":"1","slotname":"a14fd91432961bd","u_w":360,"msid":"com.mysampleapp.sampleapp","js":"afma-sdk-a-v6.0.1","mv":"8013013.com.android.vending","isu":"2BB916E1BD6BE6407582A429D763EC71","cipa":1,"format":"320x50_mb","net":"wi","app_name":"1.android.com.mysampleapp.sampleapp","hl":"en","u_h":592,"carrier":"311480","ptime":0,"u_audio":3});</script></head><body></body></html> 07-09 09:13:17.041: W/ActivityManager(220): Unable to start service Intent { act=com.google.android.gms.analytics.service.START (has extras) }: not found 07-09 09:13:17.049: W/GAV2(13933): Thread[main,5,main]: Connection to service failed 1 07-09 09:13:17.057: W/GAV2(13933): Thread[main,5,main]: Need to call initializea() and be in fallback mode to start dispatch. 07-09 09:13:17.088: D/libEGL(13933): loaded /system/lib/egl/libGLES_android.so 07-09 09:13:17.096: D/libEGL(13933): loaded /vendor/lib/egl/libEGL_POWERVR_SGX540_120.so 07-09 09:13:17.096: D/libEGL(13933): loaded /vendor/lib/egl/libGLESv1_CM_POWERVR_SGX540_120.so 07-09 09:13:17.096: D/libEGL(13933): loaded /vendor/lib/egl/libGLESv2_POWERVR_SGX540_120.so Here is my code (I have redacted some of the code that had to do with httppost, etc.): package com.mysampleapp.sampleapp; import java.io.BufferedReader; import java.io.InputStream; import java.io.InputStreamReader; import java.util.ArrayList; import java.util.List; import org.apache.http.HttpEntity; import org.apache.http.HttpResponse; import org.apache.http.NameValuePair; import org.apache.http.client.HttpClient; import org.apache.http.client.entity.UrlEncodedFormEntity; import org.apache.http.client.methods.HttpPost; import org.apache.http.impl.client.DefaultHttpClient; import org.apache.http.message.BasicNameValuePair; import org.json.JSONArray; import org.json.JSONObject; import com.google.analytics.tracking.android.EasyTracker; import android.app.Activity; import android.app.ProgressDialog; import android.content.DialogInterface; import android.content.DialogInterface.OnCancelListener; import android.content.Intent; import android.content.SharedPreferences; import android.os.AsyncTask; import android.os.Bundle; import android.preference.PreferenceManager; import android.util.Log; import android.view.View; import android.view.View.OnClickListener; import android.widget.ImageView; import android.widget.TextView; public class viewRandom extends Activity { @Override public void onCreate(Bundle savedInstanceState) { super.onCreate(savedInstanceState); setContentView(R.layout.viewrandom); uservote.setVisibility(View.GONE); new randomViewClass().execute(); } public void onStart() { super.onStart(); EasyTracker.getInstance().activityStart(this); } public void onStop() { super.onStop(); EasyTracker.getInstance().activityStop(this); } }

Read the article
SSH login very slow on OS X Leopard

- by acjohnson55

My SSH sessions take a very long time to initiate. This applies for logins with and without passwords, interactive and non-interactive. I have tried setting 'GSSAPIAuthentication no' and 'IPQoS 0x00' on the client side, and 'UseDNS no' on the server side, but no dice. I'm really stumped and frustrated. The worst part is that it SFTP takes forever to establish connections too, making file transfer much longer than it would be otherwise. I thought the problem might be something with PAM, because of where the hang is in the sshd log below, so I tried commenting out each line one-by-one in the /etc/pam.d/sshd file. Some caused login to be impossible, some had no apparent effect. I can't really tell if PAM is stalling for other services, but I can say that su'ing into my account from another account with 'su -l' has no apparent delay. I tried creating a new user account, just to see if there was something wrong with my existing account, and the same problem persisted. Any ideas of what's going on? On the client side, the most verbose mode outputs (redacted where reasonable): OpenSSH_5.9p1, OpenSSL 0.9.8r 8 Feb 2011 debug1: Reading configuration data ... debug1: ... line 1: Applying options for ... debug1: Reading configuration data /etc/ssh_config debug1: /etc/ssh_config line 20: Applying options for * debug1: /etc/ssh_config line 53: Applying options for * debug2: ssh_connect: needpriv 0 debug1: Connecting to ... [x.x.x.x] port 22. debug1: Connection established. debug1: identity file /.../.ssh/id_rsa type -1 debug1: identity file /.../.ssh/id_rsa-cert type -1 debug3: Incorrect RSA1 identifier debug3: Could not load "/.../.ssh/id_dsa" as a RSA1 public key debug1: identity file /.../.ssh/id_dsa type 2 debug1: identity file /.../.ssh/id_dsa-cert type -1 debug1: Remote protocol version 2.0, remote software version OpenSSH_5.2 debug1: match: OpenSSH_5.2 pat OpenSSH* debug1: Enabling compatibility mode for protocol 2.0 debug1: Local version string SSH-2.0-OpenSSH_5.9 debug2: fd 3 setting O_NONBLOCK debug3: load_hostkeys: loading entries for host "..." from file "/.../.ssh/known_hosts" debug3: load_hostkeys: found key type RSA in file /.../.ssh/known_hosts:9 debug3: load_hostkeys: loaded 1 keys debug3: order_hostkeyalgs: prefer hostkeyalgs: [email protected],[email protected],ssh-rsa debug1: SSH2_MSG_KEXINIT sent debug1: SSH2_MSG_KEXINIT received debug2: kex_parse_kexinit: diffie-hellman-group-exchange-sha256,diffie-hellman-group-exchange-sha1,diffie-hellman-group14-sha1,diffie-hellman-group1-sha1 debug2: kex_parse_kexinit: [email protected],[email protected],ssh-rsa,[email protected],[email protected],ssh-dss debug2: kex_parse_kexinit: aes128-ctr,aes192-ctr,aes256-ctr,arcfour256,arcfour128,aes128-cbc,3des-cbc,blowfish-cbc,cast128-cbc,aes192-cbc,aes256-cbc,arcfour,[email protected] debug2: kex_parse_kexinit: aes128-ctr,aes192-ctr,aes256-ctr,arcfour256,arcfour128,aes128-cbc,3des-cbc,blowfish-cbc,cast128-cbc,aes192-cbc,aes256-cbc,arcfour,[email protected] debug2: kex_parse_kexinit: hmac-md5,hmac-sha1,[email protected],hmac-sha2-256,hmac-sha2-256-96,hmac-sha2-512,hmac-sha2-512-96,hmac-ripemd160,[email protected],hmac-sha1-96,hmac-md5-96 debug2: kex_parse_kexinit: hmac-md5,hmac-sha1,[email protected],hmac-sha2-256,hmac-sha2-256-96,hmac-sha2-512,hmac-sha2-512-96,hmac-ripemd160,[email protected],hmac-sha1-96,hmac-md5-96 debug2: kex_parse_kexinit: none,[email protected],zlib debug2: kex_parse_kexinit: none,[email protected],zlib debug2: kex_parse_kexinit: debug2: kex_parse_kexinit: debug2: kex_parse_kexinit: first_kex_follows 0 debug2: kex_parse_kexinit: reserved 0 debug2: kex_parse_kexinit: diffie-hellman-group-exchange-sha256,diffie-hellman-group-exchange-sha1,diffie-hellman-group14-sha1,diffie-hellman-group1-sha1 debug2: kex_parse_kexinit: ssh-rsa,ssh-dss debug2: kex_parse_kexinit: aes128-ctr,aes192-ctr,aes256-ctr,arcfour256,arcfour128,aes128-cbc,3des-cbc,blowfish-cbc,cast128-cbc,aes192-cbc,aes256-cbc,arcfour,[email protected] debug2: kex_parse_kexinit: aes128-ctr,aes192-ctr,aes256-ctr,arcfour256,arcfour128,aes128-cbc,3des-cbc,blowfish-cbc,cast128-cbc,aes192-cbc,aes256-cbc,arcfour,[email protected] debug2: kex_parse_kexinit: hmac-md5,hmac-sha1,[email protected],hmac-ripemd160,[email protected],hmac-sha1-96,hmac-md5-96 debug2: kex_parse_kexinit: hmac-md5,hmac-sha1,[email protected],hmac-ripemd160,[email protected],hmac-sha1-96,hmac-md5-96 debug2: kex_parse_kexinit: none,[email protected] debug2: kex_parse_kexinit: none,[email protected] debug2: kex_parse_kexinit: debug2: kex_parse_kexinit: debug2: kex_parse_kexinit: first_kex_follows 0 debug2: kex_parse_kexinit: reserved 0 debug2: mac_setup: found hmac-md5 debug1: kex: server->client aes128-ctr hmac-md5 none debug2: mac_setup: found hmac-md5 debug1: kex: client->server aes128-ctr hmac-md5 none debug1: SSH2_MSG_KEX_DH_GEX_REQUEST(1024<1024<8192) sent debug1: expecting SSH2_MSG_KEX_DH_GEX_GROUP debug2: dh_gen_key: priv key bits set: 136/256 debug2: bits set: 523/1024 debug1: SSH2_MSG_KEX_DH_GEX_INIT sent debug1: expecting SSH2_MSG_KEX_DH_GEX_REPLY debug1: Server host key: RSA ... debug3: load_hostkeys: loading entries for host "..." from file "/.../.ssh/known_hosts" debug3: load_hostkeys: found key type RSA in file /.../.ssh/known_hosts:9 debug3: load_hostkeys: loaded 1 keys debug3: load_hostkeys: loading entries for host "x.x.x.x" from file "/.../.ssh/known_hosts" debug3: load_hostkeys: found key type RSA in file /.../.ssh/known_hosts:9 debug3: load_hostkeys: loaded 1 keys debug1: Host '...' is known and matches the RSA host key. debug1: Found key in /.../.ssh/known_hosts:9 debug2: bits set: 492/1024 debug1: ssh_rsa_verify: signature correct debug2: kex_derive_keys debug2: set_newkeys: mode 1 debug1: SSH2_MSG_NEWKEYS sent debug1: expecting SSH2_MSG_NEWKEYS debug2: set_newkeys: mode 0 debug1: SSH2_MSG_NEWKEYS received debug1: Roaming not allowed by server debug1: SSH2_MSG_SERVICE_REQUEST sent debug2: service_accept: ssh-userauth debug1: SSH2_MSG_SERVICE_ACCEPT received debug2: key: /.../.ssh/id_dsa (0x7f8b7b41d6c0) debug2: key: /.../.ssh/id_rsa (0x0) debug1: Authentications that can continue: publickey,password,keyboard-interactive debug3: start over, passed a different list publickey,password,keyboard-interactive debug3: preferred publickey,keyboard-interactive,password debug3: authmethod_lookup publickey debug3: remaining preferred: keyboard-interactive,password debug3: authmethod_is_enabled publickey debug1: Next authentication method: publickey debug1: Offering DSA public key: /.../.ssh/id_dsa debug3: send_pubkey_test debug2: we sent a publickey packet, wait for reply debug1: Server accepts key: pkalg ssh-dss blen 434 debug2: input_userauth_pk_ok: fp ... debug3: sign_and_send_pubkey: DSA ... debug1: Authentication succeeded (publickey). Authenticated to ... ([x.x.x.x]:22). debug1: channel 0: new [client-session] debug3: ssh_session2_open: channel_new: 0 debug2: channel 0: send open debug1: Requesting [email protected] debug1: Entering interactive session. ****** Hangs here ****** debug2: callback start debug2: client_session2_setup: id 0 debug2: fd 3 setting TCP_NODELAY debug2: channel 0: request pty-req confirm 1 debug1: Sending environment. debug3: Ignored env TERM_PROGRAM debug3: Ignored env SHELL debug3: Ignored env TERM debug3: Ignored env TMPDIR debug3: Ignored env Apple_PubSub_Socket_Render debug3: Ignored env TERM_PROGRAM_VERSION debug3: Ignored env TERM_SESSION_ID debug3: Ignored env USER debug3: Ignored env COMMAND_MODE debug3: Ignored env SSH_AUTH_SOCK debug3: Ignored env Apple_Ubiquity_Message debug3: Ignored env __CF_USER_TEXT_ENCODING debug3: Ignored env PATH debug3: Ignored env MKL_NUM_THREADS debug3: Ignored env PWD debug1: Sending env LANG = en_US.UTF-8 debug2: channel 0: request env confirm 0 debug3: Ignored env HOME debug3: Ignored env SHLVL debug3: Ignored env DYLD_LIBRARY_PATH debug3: Ignored env PYTHONPATH debug3: Ignored env LOGNAME debug3: Ignored env DISPLAY debug3: Ignored env SECURITYSESSIONID debug3: Ignored env _ debug2: channel 0: request shell confirm 1 debug2: callback done debug2: channel 0: open confirm rwindow 0 rmax 32768 debug2: channel_input_status_confirm: type 99 id 0 debug2: PTY allocation request accepted on channel 0 debug2: channel 0: rcvd adjust 2097152 debug2: channel_input_status_confirm: type 99 id 0 debug2: shell request accepted on channel 0 On the server side, the debug output looks like: Sep 16 18:46:40 ... sshd[31435]: debug1: inetd sockets after dupping: 3, 4 Sep 16 18:46:40 ... sshd[31435]: Connection from x.x.x.x port 52758 Sep 16 18:46:40 ... sshd[31435]: debug1: Current Session ID is 56AC0FB0 / Session Attributes are 00008000 Sep 16 18:46:40 ... sshd[31435]: debug1: Running in inetd mode in a non-root session... assuming inetd created the session for us. Sep 16 18:46:40 ... sshd[31435]: debug1: Client protocol version 2.0; client software version OpenSSH_5.9 Sep 16 18:46:40 ... sshd[31435]: debug1: match: OpenSSH_5.9 pat OpenSSH* Sep 16 18:46:40 ... sshd[31435]: debug1: Enabling compatibility mode for protocol 2.0 Sep 16 18:46:40 ... sshd[31435]: debug1: Local version string SSH-2.0-OpenSSH_5.2 Sep 16 18:46:40 ... sshd[31435]: debug1: Checking with Service ACLs for ssh login restrictions Sep 16 18:46:40 ... sshd[31435]: debug1: call to mbr_user_name_to_uuid with <...> suceeded to retrieve user_uuid Sep 16 18:46:40 ... sshd[31435]: debug1: Call to mbr_check_service_membership failed with status <0> Sep 16 18:46:40 ... sshd[31435]: debug1: PAM: initializing for "..." Sep 16 18:46:40 ... sshd[31435]: debug1: PAM: setting PAM_RHOST to "x.x.x.x" Sep 16 18:46:40 ... sshd[31435]: Failed none for ... from x.x.x.x port 52758 ssh2 Sep 16 18:46:40 ... sshd[31435]: debug1: temporarily_use_uid: 509/20 (e=0/0) Sep 16 18:46:40 ... sshd[31435]: debug1: trying public key file /.../.ssh/authorized_keys Sep 16 18:46:40 ... sshd[31435]: debug1: restore_uid: 0/0 Sep 16 18:46:40 ... sshd[31435]: debug1: temporarily_use_uid: 509/20 (e=0/0) Sep 16 18:46:40 ... sshd[31435]: debug1: trying public key file /.../.ssh/authorized_keys2 Sep 16 18:46:40 ... sshd[31435]: debug1: fd 5 clearing O_NONBLOCK Sep 16 18:46:40 ... sshd[31435]: debug1: matching key found: file /.../.ssh/authorized_keys2, line 1 Sep 16 18:46:40 ... sshd[31435]: Found matching DSA key: ... Sep 16 18:46:40 ... sshd[31435]: debug1: restore_uid: 0/0 Sep 16 18:46:40 ... sshd[31435]: debug1: temporarily_use_uid: 509/20 (e=0/0) Sep 16 18:46:40 ... sshd[31435]: debug1: trying public key file /.../.ssh/authorized_keys Sep 16 18:46:40 ... sshd[31435]: debug1: restore_uid: 0/0 Sep 16 18:46:40 ... sshd[31435]: debug1: temporarily_use_uid: 509/20 (e=0/0) Sep 16 18:46:40 ... sshd[31435]: debug1: trying public key file /.../.ssh/authorized_keys2 Sep 16 18:46:40 ... sshd[31435]: debug1: fd 5 clearing O_NONBLOCK Sep 16 18:46:40 ... sshd[31435]: debug1: matching key found: file /.../.ssh/authorized_keys2, line 1 Sep 16 18:46:40 ... sshd[31435]: Found matching DSA key: ... Sep 16 18:46:40 ... sshd[31435]: debug1: restore_uid: 0/0 Sep 16 18:46:40 ... sshd[31435]: debug1: ssh_dss_verify: signature correct Sep 16 18:46:40 ... sshd[31435]: debug1: do_pam_account: called Sep 16 18:46:40 ... sshd[31435]: Accepted publickey for ... from x.x.x.x port 52758 ssh2 Sep 16 18:46:40 ... sshd[31435]: debug1: monitor_child_preauth: ... has been authenticated by privileged process Sep 16 18:46:40 ... sshd[31435]: debug1: PAM: establishing credentials ***** Hangs here ***** Sep 16 18:46:54 ... sshd[31435]: User child is on pid 31654 Sep 16 18:46:54 ... sshd[31654]: debug1: PAM: establishing credentials Sep 16 18:46:54 ... sshd[31654]: debug1: permanently_set_uid: 509/20 Sep 16 18:46:54 ... sshd[31654]: debug1: Entering interactive session for SSH2. Sep 16 18:46:54 ... sshd[31654]: debug1: server_init_dispatch_20 Sep 16 18:46:54 ... sshd[31654]: debug1: server_input_channel_open: ctype session rchan 0 win 1048576 max 16384 Sep 16 18:46:54 ... sshd[31654]: debug1: input_session_request Sep 16 18:46:54 ... sshd[31654]: debug1: channel 0: new [server-session] Sep 16 18:46:54 ... sshd[31654]: debug1: session_new: session 0 Sep 16 18:46:54 ... sshd[31654]: debug1: session_open: channel 0 Sep 16 18:46:54 ... sshd[31654]: debug1: session_open: session 0: link with channel 0 Sep 16 18:46:54 ... sshd[31654]: debug1: server_input_channel_open: confirm session Sep 16 18:46:54 ... sshd[31654]: debug1: server_input_global_request: rtype [email protected] want_reply 0 Sep 16 18:46:54 ... sshd[31654]: debug1: server_input_channel_req: channel 0 request pty-req reply 1 Sep 16 18:46:54 ... sshd[31654]: debug1: session_by_channel: session 0 channel 0 Sep 16 18:46:54 ... sshd[31654]: debug1: session_input_channel_req: session 0 req pty-req Sep 16 18:46:54 ... sshd[31654]: debug1: Allocating pty. Sep 16 18:46:54 ... sshd[31435]: debug1: session_new: session 0 Sep 16 18:46:54 ... sshd[31654]: debug1: session_pty_req: session 0 alloc /dev/ttys008 Sep 16 18:46:54 ... sshd[31654]: debug1: server_input_channel_req: channel 0 request env reply 0 Sep 16 18:46:54 ... sshd[31654]: debug1: session_by_channel: session 0 channel 0 Sep 16 18:46:54 ... sshd[31654]: debug1: session_input_channel_req: session 0 req env Sep 16 18:46:54 ... sshd[31654]: debug1: server_input_channel_req: channel 0 request shell reply 1 Sep 16 18:46:54 ... sshd[31654]: debug1: session_by_channel: session 0 channel 0 Sep 16 18:46:54 ... sshd[31654]: debug1: session_input_channel_req: session 0 req shell Sep 16 18:46:54 ... sshd[31655]: debug1: Setting controlling tty using TIOCSCTTY.

Read the article
SQLite, python, unicode, and non-utf data

- by Nathan Spears

I started by trying to store strings in sqlite using python, and got the message: sqlite3.ProgrammingError: You must not use 8-bit bytestrings unless you use a text_factory that can interpret 8-bit bytestrings (like text_factory = str). It is highly recommended that you instead just switch your application to Unicode strings. Ok, I switched to Unicode strings. Then I started getting the message: sqlite3.OperationalError: Could not decode to UTF-8 column 'tag_artist' with text 'Sigur Rós' when trying to retrieve data from the db. More research and I started encoding it in utf8, but then 'Sigur Rós' starts looking like 'Sigur RÃ³s' note: My console was set to display in 'latin_1' as @John Machin pointed out. What gives? After reading this, describing exactly the same situation I'm in, it seems as if the advice is to ignore the other advice and use 8-bit bytestrings after all. I didn't know much about unicode and utf before I started this process. I've learned quite a bit in the last couple hours, but I'm still ignorant of whether there is a way to correctly convert 'ó' from latin-1 to utf-8 and not mangle it. If there isn't, why would sqlite 'highly recommend' I switch my application to unicode strings? I'm going to update this question with a summary and some example code of everything I've learned in the last 24 hours so that someone in my shoes can have an easy(er) guide. If the information I post is wrong or misleading in any way please tell me and I'll update, or one of you senior guys can update. Summary of answers Let me first state the goal as I understand it. The goal in processing various encodings, if you are trying to convert between them, is to understand what your source encoding is, then convert it to unicode using that source encoding, then convert it to your desired encoding. Unicode is a base and encodings are mappings of subsets of that base. utf_8 has room for every character in unicode, but because they aren't in the same place as, for instance, latin_1, a string encoded in utf_8 and sent to a latin_1 console will not look the way you expect. In python the process of getting to unicode and into another encoding looks like: str.decode('source_encoding').encode('desired_encoding') or if the str is already in unicode str.encode('desired_encoding') For sqlite I didn't actually want to encode it again, I wanted to decode it and leave it in unicode format. Here are four things you might need to be aware of as you try to work with unicode and encodings in python. The encoding of the string you want to work with, and the encoding you want to get it to. The system encoding. The console encoding. The encoding of the source file Elaboration: (1) When you read a string from a source, it must have some encoding, like latin_1 or utf_8. In my case, I'm getting strings from filenames, so unfortunately, I could be getting any kind of encoding. Windows XP uses UCS-2 (a Unicode system) as its native string type, which seems like cheating to me. Fortunately for me, the characters in most filenames are not going to be made up of more than one source encoding type, and I think all of mine were either completely latin_1, completely utf_8, or just plain ascii (which is a subset of both of those). So I just read them and decoded them as if they were still in latin_1 or utf_8. It's possible, though, that you could have latin_1 and utf_8 and whatever other characters mixed together in a filename on Windows. Sometimes those characters can show up as boxes, other times they just look mangled, and other times they look correct (accented characters and whatnot). Moving on. (2) Python has a default system encoding that gets set when python starts and can't be changed during runtime. See here for details. Dirty summary ... well here's the file I added: \# sitecustomize.py \# this file can be anywhere in your Python path, \# but it usually goes in ${pythondir}/lib/site-packages/ import sys sys.setdefaultencoding('utf_8') This system encoding is the one that gets used when you use the unicode("str") function without any other encoding parameters. To say that another way, python tries to decode "str" to unicode based on the default system encoding. (3) If you're using IDLE or the command-line python, I think that your console will display according to the default system encoding. I am using pydev with eclipse for some reason, so I had to go into my project settings, edit the launch configuration properties of my test script, go to the Common tab, and change the console from latin-1 to utf-8 so that I could visually confirm what I was doing was working. (4) If you want to have some test strings, eg test_str = "ó" in your source code, then you will have to tell python what kind of encoding you are using in that file. (FYI: when I mistyped an encoding I had to ctrl-Z because my file became unreadable.) This is easily accomplished by putting a line like so at the top of your source code file: # -*- coding: utf_8 -*- If you don't have this information, python attempts to parse your code as ascii by default, and so: SyntaxError: Non-ASCII character '\xf3' in file _redacted_ on line 81, but no encoding declared; see http://www.python.org/peps/pep-0263.html for details Once your program is working correctly, or, if you aren't using python's console or any other console to look at output, then you will probably really only care about #1 on the list. System default and console encoding are not that important unless you need to look at output and/or you are using the builtin unicode() function (without any encoding parameters) instead of the string.decode() function. I wrote a demo function I will paste into the bottom of this gigantic mess that I hope correctly demonstrates the items in my list. Here is some of the output when I run the character 'ó' through the demo function, showing how various methods react to the character as input. My system encoding and console output are both set to utf_8 for this run: '?' = original char <type 'str'> repr(char)='\xf3' '?' = unicode(char) ERROR: 'utf8' codec can't decode byte 0xf3 in position 0: unexpected end of data 'ó' = char.decode('latin_1') <type 'unicode'> repr(char.decode('latin_1'))=u'\xf3' '?' = char.decode('utf_8') ERROR: 'utf8' codec can't decode byte 0xf3 in position 0: unexpected end of data Now I will change the system and console encoding to latin_1, and I get this output for the same input: 'ó' = original char <type 'str'> repr(char)='\xf3' 'ó' = unicode(char) <type 'unicode'> repr(unicode(char))=u'\xf3' 'ó' = char.decode('latin_1') <type 'unicode'> repr(char.decode('latin_1'))=u'\xf3' '?' = char.decode('utf_8') ERROR: 'utf8' codec can't decode byte 0xf3 in position 0: unexpected end of data Notice that the 'original' character displays correctly and the builtin unicode() function works now. Now I change my console output back to utf_8. '?' = original char <type 'str'> repr(char)='\xf3' '?' = unicode(char) <type 'unicode'> repr(unicode(char))=u'\xf3' '?' = char.decode('latin_1') <type 'unicode'> repr(char.decode('latin_1'))=u'\xf3' '?' = char.decode('utf_8') ERROR: 'utf8' codec can't decode byte 0xf3 in position 0: unexpected end of data Here everything still works the same as last time but the console can't display the output correctly. Etc. The function below also displays more information that this and hopefully would help someone figure out where the gap in their understanding is. I know all this information is in other places and more thoroughly dealt with there, but I hope that this would be a good kickoff point for someone trying to get coding with python and/or sqlite. Ideas are great but sometimes source code can save you a day or two of trying to figure out what functions do what. Disclaimers: I'm no encoding expert, I put this together to help my own understanding. I kept building on it when I should have probably started passing functions as arguments to avoid so much redundant code, so if I can I'll make it more concise. Also, utf_8 and latin_1 are by no means the only encoding schemes, they are just the two I was playing around with because I think they handle everything I need. Add your own encoding schemes to the demo function and test your own input. One more thing: there are apparently crazy application developers making life difficult in Windows. #!/usr/bin/env python # -*- coding: utf_8 -*- import os import sys def encodingDemo(str): validStrings = () try: print "str =",str,"{0} repr(str) = {1}".format(type(str), repr(str)) validStrings += ((str,""),) except UnicodeEncodeError as ude: print "Couldn't print the str itself because the console is set to an encoding that doesn't understand some character in the string. See error:\n\t", print ude try: x = unicode(str) print "unicode(str) = ",x validStrings+= ((x, " decoded into unicode by the default system encoding"),) except UnicodeDecodeError as ude: print "ERROR. unicode(str) couldn't decode the string because the system encoding is set to an encoding that doesn't understand some character in the string." print "\tThe system encoding is set to {0}. See error:\n\t".format(sys.getdefaultencoding()), print ude except UnicodeEncodeError as uee: print "ERROR. Couldn't print the unicode(str) because the console is set to an encoding that doesn't understand some character in the string. See error:\n\t", print uee try: x = str.decode('latin_1') print "str.decode('latin_1') =",x validStrings+= ((x, " decoded with latin_1 into unicode"),) try: print "str.decode('latin_1').encode('utf_8') =",str.decode('latin_1').encode('utf_8') validStrings+= ((x, " decoded with latin_1 into unicode and encoded into utf_8"),) except UnicodeDecodeError as ude: print "The string was decoded into unicode using the latin_1 encoding, but couldn't be encoded into utf_8. See error:\n\t", print ude except UnicodeDecodeError as ude: print "Something didn't work, probably because the string wasn't latin_1 encoded. See error:\n\t", print ude except UnicodeEncodeError as uee: print "ERROR. Couldn't print the str.decode('latin_1') because the console is set to an encoding that doesn't understand some character in the string. See error:\n\t", print uee try: x = str.decode('utf_8') print "str.decode('utf_8') =",x validStrings+= ((x, " decoded with utf_8 into unicode"),) try: print "str.decode('utf_8').encode('latin_1') =",str.decode('utf_8').encode('latin_1') except UnicodeDecodeError as ude: print "str.decode('utf_8').encode('latin_1') didn't work. The string was decoded into unicode using the utf_8 encoding, but couldn't be encoded into latin_1. See error:\n\t", validStrings+= ((x, " decoded with utf_8 into unicode and encoded into latin_1"),) print ude except UnicodeDecodeError as ude: print "str.decode('utf_8') didn't work, probably because the string wasn't utf_8 encoded. See error:\n\t", print ude except UnicodeEncodeError as uee: print "ERROR. Couldn't print the str.decode('utf_8') because the console is set to an encoding that doesn't understand some character in the string. See error:\n\t",uee print print "Printing information about each character in the original string." for char in str: try: print "\t'" + char + "' = original char {0} repr(char)={1}".format(type(char), repr(char)) except UnicodeDecodeError as ude: print "\t'?' = original char {0} repr(char)={1} ERROR PRINTING: {2}".format(type(char), repr(char), ude) except UnicodeEncodeError as uee: print "\t'?' = original char {0} repr(char)={1} ERROR PRINTING: {2}".format(type(char), repr(char), uee) print uee try: x = unicode(char) print "\t'" + x + "' = unicode(char) {1} repr(unicode(char))={2}".format(x, type(x), repr(x)) except UnicodeDecodeError as ude: print "\t'?' = unicode(char) ERROR: {0}".format(ude) except UnicodeEncodeError as uee: print "\t'?' = unicode(char) {0} repr(char)={1} ERROR PRINTING: {2}".format(type(x), repr(x), uee) try: x = char.decode('latin_1') print "\t'" + x + "' = char.decode('latin_1') {1} repr(char.decode('latin_1'))={2}".format(x, type(x), repr(x)) except UnicodeDecodeError as ude: print "\t'?' = char.decode('latin_1') ERROR: {0}".format(ude) except UnicodeEncodeError as uee: print "\t'?' = char.decode('latin_1') {0} repr(char)={1} ERROR PRINTING: {2}".format(type(x), repr(x), uee) try: x = char.decode('utf_8') print "\t'" + x + "' = char.decode('utf_8') {1} repr(char.decode('utf_8'))={2}".format(x, type(x), repr(x)) except UnicodeDecodeError as ude: print "\t'?' = char.decode('utf_8') ERROR: {0}".format(ude) except UnicodeEncodeError as uee: print "\t'?' = char.decode('utf_8') {0} repr(char)={1} ERROR PRINTING: {2}".format(type(x), repr(x), uee) print x = 'ó' encodingDemo(x) Much thanks for the answers below and especially to @John Machin for answering so thoroughly.

Read the article
SQLite, python, unicode, and non-utf data

- by Nathan Spears

I started by trying to store strings in sqlite using python, and got the message: sqlite3.ProgrammingError: You must not use 8-bit bytestrings unless you use a text_factory that can interpret 8-bit bytestrings (like text_factory = str). It is highly recommended that you instead just switch your application to Unicode strings. Ok, I switched to Unicode strings. Then I started getting the message: sqlite3.OperationalError: Could not decode to UTF-8 column 'tag_artist' with text 'Sigur Rós' when trying to retrieve data from the db. More research and I started encoding it in utf8, but then 'Sigur Rós' starts looking like 'Sigur RÃ³s' note: My console was set to display in 'latin_1' as @John Machin pointed out. What gives? After reading this, describing exactly the same situation I'm in, it seems as if the advice is to ignore the other advice and use 8-bit bytestrings after all. I didn't know much about unicode and utf before I started this process. I've learned quite a bit in the last couple hours, but I'm still ignorant of whether there is a way to correctly convert 'ó' from latin-1 to utf-8 and not mangle it. If there isn't, why would sqlite 'highly recommend' I switch my application to unicode strings? I'm going to update this question with a summary and some example code of everything I've learned in the last 24 hours so that someone in my shoes can have an easy(er) guide. If the information I post is wrong or misleading in any way please tell me and I'll update, or one of you senior guys can update. Summary of answers Let me first state the goal as I understand it. The goal in processing various encodings, if you are trying to convert between them, is to understand what your source encoding is, then convert it to unicode using that source encoding, then convert it to your desired encoding. Unicode is a base and encodings are mappings of subsets of that base. utf_8 has room for every character in unicode, but because they aren't in the same place as, for instance, latin_1, a string encoded in utf_8 and sent to a latin_1 console will not look the way you expect. In python the process of getting to unicode and into another encoding looks like: str.decode('source_encoding').encode('desired_encoding') or if the str is already in unicode str.encode('desired_encoding') For sqlite I didn't actually want to encode it again, I wanted to decode it and leave it in unicode format. Here are four things you might need to be aware of as you try to work with unicode and encodings in python. The encoding of the string you want to work with, and the encoding you want to get it to. The system encoding. The console encoding. The encoding of the source file Elaboration: (1) When you read a string from a source, it must have some encoding, like latin_1 or utf_8. In my case, I'm getting strings from filenames, so unfortunately, I could be getting any kind of encoding. Windows XP uses UCS-2 (a Unicode system) as its native string type, which seems like cheating to me. Fortunately for me, the characters in most filenames are not going to be made up of more than one source encoding type, and I think all of mine were either completely latin_1, completely utf_8, or just plain ascii (which is a subset of both of those). So I just read them and decoded them as if they were still in latin_1 or utf_8. It's possible, though, that you could have latin_1 and utf_8 and whatever other characters mixed together in a filename on Windows. Sometimes those characters can show up as boxes, other times they just look mangled, and other times they look correct (accented characters and whatnot). Moving on. (2) Python has a default system encoding that gets set when python starts and can't be changed during runtime. See here for details. Dirty summary ... well here's the file I added: \# sitecustomize.py \# this file can be anywhere in your Python path, \# but it usually goes in ${pythondir}/lib/site-packages/ import sys sys.setdefaultencoding('utf_8') This system encoding is the one that gets used when you use the unicode("str") function without any other encoding parameters. To say that another way, python tries to decode "str" to unicode based on the default system encoding. (3) If you're using IDLE or the command-line python, I think that your console will display according to the default system encoding. I am using pydev with eclipse for some reason, so I had to go into my project settings, edit the launch configuration properties of my test script, go to the Common tab, and change the console from latin-1 to utf-8 so that I could visually confirm what I was doing was working. (4) If you want to have some test strings, eg test_str = "ó" in your source code, then you will have to tell python what kind of encoding you are using in that file. (FYI: when I mistyped an encoding I had to ctrl-Z because my file became unreadable.) This is easily accomplished by putting a line like so at the top of your source code file: # -*- coding: utf_8 -*- If you don't have this information, python attempts to parse your code as ascii by default, and so: SyntaxError: Non-ASCII character '\xf3' in file _redacted_ on line 81, but no encoding declared; see http://www.python.org/peps/pep-0263.html for details Once your program is working correctly, or, if you aren't using python's console or any other console to look at output, then you will probably really only care about #1 on the list. System default and console encoding are not that important unless you need to look at output and/or you are using the builtin unicode() function (without any encoding parameters) instead of the string.decode() function. I wrote a demo function I will paste into the bottom of this gigantic mess that I hope correctly demonstrates the items in my list. Here is some of the output when I run the character 'ó' through the demo function, showing how various methods react to the character as input. My system encoding and console output are both set to utf_8 for this run: '?' = original char <type 'str'> repr(char)='\xf3' '?' = unicode(char) ERROR: 'utf8' codec can't decode byte 0xf3 in position 0: unexpected end of data 'ó' = char.decode('latin_1') <type 'unicode'> repr(char.decode('latin_1'))=u'\xf3' '?' = char.decode('utf_8') ERROR: 'utf8' codec can't decode byte 0xf3 in position 0: unexpected end of data Now I will change the system and console encoding to latin_1, and I get this output for the same input: 'ó' = original char <type 'str'> repr(char)='\xf3' 'ó' = unicode(char) <type 'unicode'> repr(unicode(char))=u'\xf3' 'ó' = char.decode('latin_1') <type 'unicode'> repr(char.decode('latin_1'))=u'\xf3' '?' = char.decode('utf_8') ERROR: 'utf8' codec can't decode byte 0xf3 in position 0: unexpected end of data Notice that the 'original' character displays correctly and the builtin unicode() function works now. Now I change my console output back to utf_8. '?' = original char <type 'str'> repr(char)='\xf3' '?' = unicode(char) <type 'unicode'> repr(unicode(char))=u'\xf3' '?' = char.decode('latin_1') <type 'unicode'> repr(char.decode('latin_1'))=u'\xf3' '?' = char.decode('utf_8') ERROR: 'utf8' codec can't decode byte 0xf3 in position 0: unexpected end of data Here everything still works the same as last time but the console can't display the output correctly. Etc. The function below also displays more information that this and hopefully would help someone figure out where the gap in their understanding is. I know all this information is in other places and more thoroughly dealt with there, but I hope that this would be a good kickoff point for someone trying to get coding with python and/or sqlite. Ideas are great but sometimes source code can save you a day or two of trying to figure out what functions do what. Disclaimers: I'm no encoding expert, I put this together to help my own understanding. I kept building on it when I should have probably started passing functions as arguments to avoid so much redundant code, so if I can I'll make it more concise. Also, utf_8 and latin_1 are by no means the only encoding schemes, they are just the two I was playing around with because I think they handle everything I need. Add your own encoding schemes to the demo function and test your own input. One more thing: there are apparently crazy application developers making life difficult in Windows. #!/usr/bin/env python # -*- coding: utf_8 -*- import os import sys def encodingDemo(str): validStrings = () try: print "str =",str,"{0} repr(str) = {1}".format(type(str), repr(str)) validStrings += ((str,""),) except UnicodeEncodeError as ude: print "Couldn't print the str itself because the console is set to an encoding that doesn't understand some character in the string. See error:\n\t", print ude try: x = unicode(str) print "unicode(str) = ",x validStrings+= ((x, " decoded into unicode by the default system encoding"),) except UnicodeDecodeError as ude: print "ERROR. unicode(str) couldn't decode the string because the system encoding is set to an encoding that doesn't understand some character in the string." print "\tThe system encoding is set to {0}. See error:\n\t".format(sys.getdefaultencoding()), print ude except UnicodeEncodeError as uee: print "ERROR. Couldn't print the unicode(str) because the console is set to an encoding that doesn't understand some character in the string. See error:\n\t", print uee try: x = str.decode('latin_1') print "str.decode('latin_1') =",x validStrings+= ((x, " decoded with latin_1 into unicode"),) try: print "str.decode('latin_1').encode('utf_8') =",str.decode('latin_1').encode('utf_8') validStrings+= ((x, " decoded with latin_1 into unicode and encoded into utf_8"),) except UnicodeDecodeError as ude: print "The string was decoded into unicode using the latin_1 encoding, but couldn't be encoded into utf_8. See error:\n\t", print ude except UnicodeDecodeError as ude: print "Something didn't work, probably because the string wasn't latin_1 encoded. See error:\n\t", print ude except UnicodeEncodeError as uee: print "ERROR. Couldn't print the str.decode('latin_1') because the console is set to an encoding that doesn't understand some character in the string. See error:\n\t", print uee try: x = str.decode('utf_8') print "str.decode('utf_8') =",x validStrings+= ((x, " decoded with utf_8 into unicode"),) try: print "str.decode('utf_8').encode('latin_1') =",str.decode('utf_8').encode('latin_1') except UnicodeDecodeError as ude: print "str.decode('utf_8').encode('latin_1') didn't work. The string was decoded into unicode using the utf_8 encoding, but couldn't be encoded into latin_1. See error:\n\t", validStrings+= ((x, " decoded with utf_8 into unicode and encoded into latin_1"),) print ude except UnicodeDecodeError as ude: print "str.decode('utf_8') didn't work, probably because the string wasn't utf_8 encoded. See error:\n\t", print ude except UnicodeEncodeError as uee: print "ERROR. Couldn't print the str.decode('utf_8') because the console is set to an encoding that doesn't understand some character in the string. See error:\n\t",uee print print "Printing information about each character in the original string." for char in str: try: print "\t'" + char + "' = original char {0} repr(char)={1}".format(type(char), repr(char)) except UnicodeDecodeError as ude: print "\t'?' = original char {0} repr(char)={1} ERROR PRINTING: {2}".format(type(char), repr(char), ude) except UnicodeEncodeError as uee: print "\t'?' = original char {0} repr(char)={1} ERROR PRINTING: {2}".format(type(char), repr(char), uee) print uee try: x = unicode(char) print "\t'" + x + "' = unicode(char) {1} repr(unicode(char))={2}".format(x, type(x), repr(x)) except UnicodeDecodeError as ude: print "\t'?' = unicode(char) ERROR: {0}".format(ude) except UnicodeEncodeError as uee: print "\t'?' = unicode(char) {0} repr(char)={1} ERROR PRINTING: {2}".format(type(x), repr(x), uee) try: x = char.decode('latin_1') print "\t'" + x + "' = char.decode('latin_1') {1} repr(char.decode('latin_1'))={2}".format(x, type(x), repr(x)) except UnicodeDecodeError as ude: print "\t'?' = char.decode('latin_1') ERROR: {0}".format(ude) except UnicodeEncodeError as uee: print "\t'?' = char.decode('latin_1') {0} repr(char)={1} ERROR PRINTING: {2}".format(type(x), repr(x), uee) try: x = char.decode('utf_8') print "\t'" + x + "' = char.decode('utf_8') {1} repr(char.decode('utf_8'))={2}".format(x, type(x), repr(x)) except UnicodeDecodeError as ude: print "\t'?' = char.decode('utf_8') ERROR: {0}".format(ude) except UnicodeEncodeError as uee: print "\t'?' = char.decode('utf_8') {0} repr(char)={1} ERROR PRINTING: {2}".format(type(x), repr(x), uee) print x = 'ó' encodingDemo(x) Much thanks for the answers below and especially to @John Machin for answering so thoroughly.

Read the article

< Previous Page | 1 2 3 4