utf8 - Page 13 - Developer IT

Is there any free host which supports php and mySQL in utf-8? [closed]

- by Maria Konnou

Possible Duplicate: How to find web hosting that meets my requirements?

Is there any free host which supports PHP and MySQL queries in UTF-8? I've already tried x10hosting and 000webhosting, but they don't support UTF-8 MySQL queries (I got mojibake). The default encoding of MySQL on both sites is latin-1, and you can't change it. Is there any other free host that fully supports UTF-8?

How do I create a Unicode database in PostgreSQL 8.4?

- by wildpeaks

I installed the postgresql-8.4 package with default options. Everything worked fine, but I can't seem to create Unicode databases:

    -- This doesn't work
    createdb test1 --encoding UNICODE
    -- This works
    createdb test2

The error message,

    createdb: database creation failed: ERROR: new encoding (UTF8) is incompatible with the encoding of the template database (SQL_ASCII)

is a bit puzzling because (afaik) I don't use a template for creating the new db. Or is it implicitly referring to the default "postgres" database for some reason? Or maybe I'm missing a setting in a .conf file?

mounted smb share through fstab, gets read only on added files

- by Jocke

I mounted my NAS in Ubuntu 12.10 and it works with read/write, but when I add a file or directory, that file gets read-only permissions. My fstab mount looks like this:

    //192.168.0.12/share/ /media/nas cifs credentials=/home/jocke/.smbcredentials,iocharset=utf8,file_mode=0777,dir_mode=0777 0 0

If I mount the SMB share manually through the GUI it works, but not through fstab. What am I doing wrong?

Mounting hard-drive on boot

- by Kicsi Mano

I had two HDDs and bought a new one today. I would like to mount the new HDD at boot. That works, but the new HDD is mounted as owned by robu, not root. Why?

Content of the fstab:

    UUID=8e492a04-c05d-4861-b996-a36ebbaf3d43 /media/WESYS_RAID ext4 rw 0 0
    UUID=12C81F25C81F071F /media/WESYS_DATA ntfs defaults,iocharset=utf8 0 0
    /dev/mapper/WeSyS_LVM /media/WESYS_LVM ext4 rw 0 0

These are the permissions:

    drwxrwxrwx 1 root root 4096 2012-04-05 11:51 WESYS_DATA
    drwxr-xr-x 9 root root 4096 2012-03-01 10:11 WESYS_LVM
    drwx------ 3 robu robu 4096 2012-04-10 12:33 WESYS_RAID

Unicode in PostgreSQL 8.4

- by user8382

I installed the "postgresql-8.4" package with default options. Everything worked fine, but I can't seem to create Unicode databases:

    -- This doesn't work
    createdb test1 --encoding UNICODE
    -- This works
    createdb test2

The error message

    createdb: database creation failed: ERROR: new encoding (UTF8) is incompatible with the encoding of the template database (SQL_ASCII)

is a bit puzzling because (afaik) I don't use a template for creating the new db. Or is it implicitly referring to the default "postgres" database for some reason? Or maybe I'm missing a setting in a .conf file?

Amazon Ec2: Problem In Setting up FTP Server

- by Muntasir

After setting up my vsftpd server on EC2 I am facing a problem. My client is FileZilla and I am getting this error:

    Response: 230 Login successful.
    Command: OPTS UTF8 ON
    Response: 200 Always in UTF8 mode.
    Status: Connected
    Status: Retrieving directory listing...
    Command: PWD
    Response: 257 "/"
    Command: TYPE I
    Response: 200 Switching to Binary mode.
    Command: PASV
    Response: 500 OOPS: invalid pasv_address
    Command: PORT 10,130,8,44,240,50
    Response: 500 OOPS: priv_sock_get_cmd
    Error: Failed to retrieve directory listing
    Error: Connection closed by server

This is the current setting in my vsftpd.conf:

    #nopriv_user=ftpsecure
    #async_abor_enable=YES
    # ASCII mangling is a horrible feature of the protocol.
    #ascii_upload_enable=YES
    #ascii_download_enable=YES
    # You may specify a file of disallowed anonymous e-mail addresses. Apparently
    # useful for combatting certain DoS attacks.
    #deny_email_enable=YES
    # (default follows)
    #banned_email_file=/etc/vsftpd/banned_emails
    # chroot_local_user=YES
    #chroot_list_enable=YES
    # (default follows)
    #chroot_list_file=/etc/vsftpd/chroot_list
    GNU nano 2.0.6 File: /etc/vsftpd/vsftpd.conf
    #
    #ls_recurse_enable=YES
    #
    # When "listen" directive is enabled, vsftpd runs in standalone mode and
    # listens on IPv4 sockets. This directive cannot be used in conjunction
    # with the listen_ipv6 directive.
    listen=YES
    #
    # This directive enables listening on IPv6 sockets. To listen on IPv4 and IPv6
    # sockets, you must run two copies of vsftpd with two configuration files.
    # Make sure, that one of the listen options is commented !!
    #listen_ipv6=YES
    pam_service_name=vsftpd
    userlist_enable=YES
    tcp_wrappers=YES
    pasv_enable=YES
    pasv_min_port=2345
    pasv_max_port=2355
    listen_port=1024
    pasv_address=ec2-xxxxxxx.compute-1.amazonaws.com
    pasv_promiscuous=YES

Note: I have already opened those ports in the security group (the listen port and the min/max passive ports). If someone can show me how to fix this I will be very grateful. Thanks.
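For readers unfamiliar with the `PORT` command in the log above: its argument packs an IPv4 address and a TCP port as six decimal bytes (`h1,h2,h3,h4,p1,p2`, as defined in RFC 959). The following is a minimal sketch in Python that decodes it; the function name `decode_port_argument` is made up for illustration and is not part of any FTP library.

```python
import ipaddress


def decode_port_argument(arg):
    """Decode an FTP PORT argument 'h1,h2,h3,h4,p1,p2' into (host, port)."""
    parts = [int(p) for p in arg.split(",")]
    if len(parts) != 6 or any(not 0 <= p <= 255 for p in parts):
        raise ValueError("expected six comma-separated values in 0..255")
    host = ".".join(str(p) for p in parts[:4])
    # The port is sent as two bytes, high byte first.
    port = parts[4] * 256 + parts[5]
    return host, port


# The argument taken from the FileZilla log in the question.
host, port = decode_port_argument("10,130,8,44,240,50")
is_private = ipaddress.ip_address(host).is_private
print(host, port, is_private)
```

The decoded address falls in a private range, so the address advertised in that command would not be routable over the public Internet. Whether that is related to the failure is not something the log alone confirms.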

mysql.proc has gone corrupt. How can I fix it?

- by Metalcoder

I have a server running Debian 5.0 and MySQL. Suddenly, MySQL stopped working, and after many attempts to fix it, I decided to reinstall it. I installed MySQL 5.1.63, and when started it goes into safe mode. I tried a few things, and when I executed mysql_upgrade as root, it complained:

    ...
    Running 'mysql_fix_privilege_tables'...
    ERROR 1548 (HY000) at line 1111: Cannot load from mysql.proc. The table is probably corrupted
    ERROR 1064 (42000) at line 1112: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'sqlstate 'HY000' set message_text='Unexpected content found in the performance_s' at line 1
    ERROR 1548 (HY000) at line 1125: Cannot load from mysql.proc. The table is probably corrupted
    FATAL ERROR: Upgrade failed

I checked the mysql.proc table, and its comment column was slightly different from my backup.

    -- My backup says:
    `comment` char(64) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL DEFAULT '',
    -- But it was:
    `comment` text CHARACTER SET utf8 COLLATE utf8_bin NOT NULL,

So I restored my mysql database backup, and now they all match, but mysql_upgrade still triggers the same errors. I also tried to check and repair the mysql.proc table, without success.

how to correctly mount fat32 partition in Ubuntu in order to preserve case

- by Dean

I've found a couple of problems that might be related to how my FAT32 partition was mounted. I hope you can help me solve them. I've also included the commands I used, to help others who find this post; apologies to those who feel I should use less space.

I have the following partition layout on my disk:

    dean@notebook:~$ sudo fdisk -l

    Disk /dev/sda: 160.0 GB, 160041885696 bytes
    255 heads, 63 sectors/track, 19457 cylinders
    Units = cylinders of 16065 * 512 = 8225280 bytes
    Disk identifier: 0x08860886

       Device Boot      Start         End      Blocks   Id  System
    /dev/sda1   *           1          13      102400    7  HPFS/NTFS
    Partition 1 does not end on cylinder boundary.
    /dev/sda2              13        5737    45978624    7  HPFS/NTFS
    /dev/sda3            5738       10600    39062047+  83  Linux
    /dev/sda4           10601       19457    71143852+   5  Extended
    /dev/sda5           10601       11208     4883728+  82  Linux swap / Solaris
    /dev/sda6           11209       15033    30720000    b  W95 FAT32
    /dev/sda7           15033       19457    35537920    7  HPFS/NTFS

In /etc/fstab I've got:

    UUID=91c57a65-dc53-476b-b219-28dac3682d31 / ext4 defaults 0 1
    UUID=BEA2A8AFA2A86D99 /media/NTFS ntfs-3g quiet,defaults,locale=en_US.utf8,umask=0 0 0
    UUID=0C0C-9BB3 /media/FAT32 vfat user,auto,utf8,fmask=0111,dmask=0000,uid=1000 0 0
    /dev/sda5 swap swap sw 0 0
    /dev/sda1 /media/sda1 ntfs nls=iso8859-1,ro,noauto,umask=000 0 0
    /dev/sda2 /media/sda2 ntfs nls=iso8859-1,ro,noauto,umask=000 0 0

I checked my id using id and got:

    dean@notebook:~$ id
    uid=1000(dean) gid=1000(dean) groups=4(adm),20(dialout),24(cdrom),46(plugdev),103(fuse),104(lpadmin),115(admin),120(sambashare),1000(dean)

I don't know why, with these settings, I still have problems using svn like in this one. Thank you for your help!

"unrecognized options" while installing php

- by user1692333

I want to compile PHP 5.4.8 on my Mac (OS X 10.8.2), but I get some errors which I can't solve by myself, so I need your help. First I get the default PHP options with php -i | head, then I run this command:

    ./configure --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --disable-dependency-tracking --sysconfdir=/private/etc --with-apxs2=/usr/sbin/apxs --enable-cli --with-config-file-path=/etc --with-libxml-dir=/usr --with-openssl=/usr --with-kerberos=/usr --with-zlib=/usr --enable-bcmath --with-bz2=/usr --enable-calendar --disable-cgi --with-curl=/usr --enable-dba --enable-ndbm=/usr --enable-exif --enable-fpm --enable-ftp --with-gd --with-freetype-dir=/BinaryCache/apache_mod_php/apache_mod_php-79~4/Root/usr/local --with-jpeg-dir=/BinaryCache/apache_mod_php/apache_mod_php-79~4/Root/usr/local --with-png-dir=/BinaryCache/apache_mod_php/apache_mod_php-79~4/Root/usr/local --enable-gd-native-ttf --with-icu-dir=/usr --with-iodbc=/usr --with-ldap=/usr --with-ldap-sasl=/usr --with-libedit=/usr --enable-mbstring --enable-mbregex --with-mysql=mysqlnd --with-mysqli=mysqlnd --without-pear --with-pdo-mysql=mysqlnd --with-mysql-sock=/var/mysql/mysql.sock --with-readline=/usr --enable-shmop --with-snmp=/usr --enable-soap --enable-sockets --enable-sqlite-utf8 --enable-suhosin --enable-sysvmsg --enable-sysvsem --enable-sysvshm --with-tidy --enable-wddx --with-xmlrpc --with-iconv-dir=/usr --with-xsl=/usr --enable-zend-multibyte --enable-zip --with-pcre-regex --with-pgsql=/usr --with-pdo-pgsql=/usr

But I get this error:

    config.status: creating Makefile
    config.status: creating jconfig.h
    config.status: jconfig.h is unchanged
    config.status: executing depfiles commands
    config.status: executing libtool commands
    configure: WARNING: unrecognized options: --enable-cli, --with-config-file-path, --with-libxml-dir, --with-openssl, --with-kerberos, --with-zlib, --enable-bcmath, --with-bz2, --enable-calendar, --disable-cgi, --with-curl, --enable-dba, --enable-ndbm, --enable-exif, --enable-fpm, --enable-ftp, --with-gd, --with-freetype-dir, --with-jpeg-dir, --with-png-dir, --enable-gd-native-ttf, --with-icu-dir, --with-iodbc, --with-ldap, --with-ldap-sasl, --with-libedit, --enable-mbstring, --enable-mbregex, --with-mysql, --with-mysqli, --without-pear, --with-pdo-mysql, --with-mysql-sock, --with-readline, --enable-shmop, --with-snmp, --enable-soap, --enable-sockets, --enable-sqlite-utf8, --enable-suhosin, --enable-sysvmsg, --enable-sysvsem, --enable-sysvshm, --with-tidy, --enable-wddx, --with-xmlrpc, --with-iconv-dir, --with-xsl, --enable-zend-multibyte, --enable-zip, --with-pcre-regex, --with-pgsql, --with-pdo-pgsql

Does anyone have any suggestions on this?

FTP not listing directory NcFTP PASV

- by Jacob Talbot

I am attempting to set up Multicraft on my server. All is running smoothly, however the FTP server won't allow anyone to connect from a remote FTP client, whereas net2ftp works smoothly from a remote location. I have included the transcript from my FTP client, Transmit, below to give you an idea of what's going on. I have disabled iptables as well, and still no luck either way.

    Transmit 4.1.7 (x86_64) Session Transcript [Version 10.8.2 (Build 12C54)] (21/10/12 11:23 PM)
    LibNcFTP 3.2.3 (July 23, 2009) compiled for UNIX
    220: Multicraft 1.7.1 FTP server
    Connected to ateam.bn-mc.net.
    Cmd: USER jacob.9
    331: Username ok, send password.
    Cmd: PASS xxxxxxxx
    230: Login successful
    Cmd: TYPE A
    200: Type set to: ASCII.
    Logged in to ateam.bn-mc.net as jacob.9.
    Cmd: SYST
    215: UNIX Type: L8
    Cmd: FEAT
    211: Features supported:
     EPRT EPSV MDTM MLSD MLST type*;perm*;size*;modify*;unique*;unix.mode;unix.uid;unix.gid; REST STREAM SIZE TVFS UTF8
    End FEAT.
    Cmd: OPTS UTF8 ON
    200: OK
    Cmd: PWD
    257: "/" is the current directory.
    Cmd: PASV
    Could not read reply from control connection -- timed out. (SReadline 1)

PhpMyAdmin import/export - strange character encoding issues.

- by John Hunt

Hello, I'm migrating a site to a new host, and there are a couple of databases on there. There's no SSH access so I'm stuck with phpMyAdmin. The issue is that certain characters (namely just whitespace) seem to be corrupted on the new site. The HTML is the same, and Apache doesn't seem to be messing with any encodings: I can see that the strange characters have changed when I use less on my Linux machine after downloading a table dump from both servers. The issue isn't as bad if I import into the new database as UTF-8: whitespace characters then have only one funny A-type symbol instead of two. I've been trying various combinations of character encodings etc. to no avail.

Exporting from:

    phpMyAdmin 2.6.2
    MySQL 4.1.20
    MySQL connection collation: utf8_general_ci
    MySQL charset: UTF-8 Unicode (utf8)
    Collation on tables and their fields is: latin1_swedish_ci

Importing to:

    phpMyAdmin - 2.11.9.2
    MySQL client version: 5.0.45
    MySQL charset: UTF-8 Unicode (utf8)
    MySQL connection collation: utf8_general_ci

The import SQL has this kind of thing in it:

    ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=192 ;

I get the impression this is actually a bug or something with mysqldump, as nothing seems to work. Does anyone have any insight into this? Cheers, John.
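The "one funny A symbol versus two" symptom is what you would see if a non-breaking space were encoded as UTF-8 and then read back as latin-1, once or twice. The following is a small Python sketch of that mechanism. It assumes the affected whitespace is U+00A0, which the question does not confirm, and the helper name `misread_utf8_as_latin1` is made up for illustration.

```python
# Assumption: the corrupted "whitespace" is a non-breaking space (U+00A0).
NBSP = "\u00a0"


def misread_utf8_as_latin1(text):
    """Encode text as UTF-8, then wrongly decode those bytes as latin-1."""
    return text.encode("utf-8").decode("latin-1")


# One wrong round trip: the two UTF-8 bytes C2 A0 become two characters,
# the first of which is U+00C2 (capital A with circumflex).
once = misread_utf8_as_latin1(NBSP)

# Two wrong round trips: now there are two accented capital A characters
# (U+00C3 and U+00C2) in front of the original non-breaking space.
twice = misread_utf8_as_latin1(once)

# Reversing a single wrong step recovers the previous state.
repaired = twice.encode("latin-1").decode("utf-8")

print(ascii(once), ascii(twice), ascii(repaired))
```

If this is what is happening, the data is being converted one time too many somewhere between export and import, and each extra conversion adds another stray character.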

reset locale in debian under Squeeze

- by si2w

I have problems with the locale in Debian. I tried many things, but nothing works for me:

    locale -a
    locale: Cannot set LC_CTYPE to default locale: No such file or directory
    C
    POSIX
    en_US.utf8

I tried to set en_US.utf8, without success, with this:

    dpkg-reconfigure locales -plow
    perl: warning: Setting locale failed.
    perl: warning: Please check that your locale settings:
        LANGUAGE = "en_US",
        LC_ALL = (unset),
        LC_CTYPE = "UTF-8",
        LANG = (unset)
        are supported and installed on your system.
    perl: warning: Falling back to the standard locale ("C").
    locale: Cannot set LC_CTYPE to default locale: No such file or directory
    locale: Cannot set LC_ALL to default locale: No such file or directory
    /usr/bin/locale: Cannot set LC_CTYPE to default locale: No such file or directory
    /usr/bin/locale: Cannot set LC_ALL to default locale: No such file or directory
    Generating locales (this might take a while)...
      en_US.UTF-8... done
    Generation complete.
    perl: warning: Setting locale failed.
    perl: warning: Please check that your locale settings:
        LANGUAGE = "en_US",
        LC_ALL = (unset),
        LC_CTYPE = "UTF-8",
        LANG = (unset)
        are supported and installed on your system.
    perl: warning: Falling back to the standard locale ("C").
    perl: warning: Setting locale failed.
    perl: warning: Please check that your locale settings:
        LANGUAGE = "en_US",
        LC_ALL = (unset),
        LC_CTYPE = "UTF-8",
        LANG = (unset)
        are supported and installed on your system.
    perl: warning: Falling back to the standard locale ("C").

After a reboot, I tried to run a perl script:

    perl: warning: Setting locale failed.
    perl: warning: Please check that your locale settings:
        LANGUAGE = "en_US",
        LC_ALL = (unset),
        LC_CTYPE = "UTF-8",
        LANG = "en_US.UTF-8"
        are supported and installed on your system.
    perl: warning: Falling back to the standard locale ("C").

Here is my /etc/default/locale config file:

    cat /etc/default/locale
    LANG=en_US.UTF-8
    LANGUAGE=en_US

Any idea how to solve this (stupid) problem? Thanks

Good HTTP Monitoring tools

- by ffffff

I am looking for a tool that monitors HTTP traffic on a Linux server at the protocol level. Does anyone know of one, ideally freeware?

To be concrete, here is what I get when I dump 80/tcp with a packet monitor:

    # tethereal -i ppp0 port 80 -x
    Capturing on ppp0
    1244206390.030474 219.111.xx.xx -> 74.125.xx.xx HTTP GET /search?output=js&num=0&dt=1244206414703&client=pub-3031568651010206&q=Cagliari%20Flight&ad=n3&ie=utf8&oe=utf8&channel=0091594208&adtest=off HTTP/1.1

    0000  00 04 02 00 00 00 00 00 00 00 00 00 00 00 08 00   ................
    0010  45 00 01 e5 ee 82 40 00 40 06 d2 b5 db 6f 02 5b   E.....@.@....o.[
    0020  4a 7d 4f 93 d4 29 00 50 3e df 4c 63 4b 6b 42 e0   J}O..).P>.LcKkB.

This kind of output contains too much unnecessary information, such as SYN packets and headers. What I want:

- The IP address of the client and the string it sends (the GET request, or the contents of the POST).
- Of the strings the server sends back, only the HTML, that is, responses whose Content-Type is text/html.

Ideally I could set a filter so that only the information I want accumulates in the log.

Find which files an apache process is writing to?

- by Haluk

We have an Apache process which becomes I/O-bound from time to time. Using atop, we can see it is a write operation. Using lsof -p <PID> we can see a list of files opened by the httpd process. At first we thought the log files must be the problem, so we turned them off just to test. However, the write operations still continue. We will continue testing a few other things. For instance, we use PHP session variables a lot; maybe the PHP session files are getting all the writes. But is there a way to quickly identify which files get written to by the httpd process? That way we can focus our efforts on those files.

UPDATE: We used the strace command as suggested. Here are two lines from the output.

    write(23, "\27\0\0\0\3SET CHARACTER SET utf8", 27) = 27
    write(23, "\17\0\0\0\3SET NAMES utf8", 19) = 19

We do not have a mysql process on this server. So is strace also showing what is being written to an ethernet port?

UPDATE2: During high I/O load, the process which consumes most of the write resources gives the following output for strace -e trace=write -p <PID>:

    --- SIGCHLD (Child exited) @ 0 (0) ---
    write(9, "!", 1) = 1
    write(19, "OPTIONS * HTTP/1.0\r\nUser-Agent: Apache (internal dummy connection)\r\n\r\n", 70) = 70

However, I cannot figure out where these are being written to.
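The two buffers in the first update look consistent with MySQL client/server packet framing: a 3-byte little-endian payload length, a 1-byte sequence id, then a payload whose first byte is the command (0x03 is COM_QUERY). The sketch below, in Python, decodes them under that assumption. If it holds, descriptor 23 would be a network socket to a remote MySQL server rather than a file, but that is an interpretation of the strace output, not something the post confirms. The function name `parse_mysql_packet` is made up for illustration.

```python
def parse_mysql_packet(data):
    """Split one packet into (payload_length, sequence_id, command, body)."""
    # 3-byte little-endian payload length, followed by a 1-byte sequence id.
    length = int.from_bytes(data[0:3], "little")
    sequence_id = data[3]
    payload = data[4:4 + length]
    command = payload[0]  # 0x03 would be COM_QUERY
    body = payload[1:].decode("utf-8")
    return length, sequence_id, command, body


# strace prints non-printable bytes in octal, so "\27" is 0o27 == 23
# and "\17" is 0o17 == 15. The same bytes are written in hex here.
first = b"\x17\x00\x00\x00\x03SET CHARACTER SET utf8"
second = b"\x0f\x00\x00\x00\x03SET NAMES utf8"

for packet in (first, second):
    print(len(packet), parse_mysql_packet(packet))
```

The total sizes (27 and 19 bytes) match the third argument of each `write` call: a 4-byte header plus the declared payload length.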

SQLite, python, unicode, and non-utf data

- by Nathan Spears

I started by trying to store strings in sqlite using python, and got the message:

    sqlite3.ProgrammingError: You must not use 8-bit bytestrings unless you use a text_factory that can interpret 8-bit bytestrings (like text_factory = str). It is highly recommended that you instead just switch your application to Unicode strings.

Ok, I switched to Unicode strings. Then I started getting the message:

    sqlite3.OperationalError: Could not decode to UTF-8 column 'tag_artist' with text 'Sigur Rós'

when trying to retrieve data from the db. More research and I started encoding it in utf8, but then 'Sigur Rós' starts looking like 'Sigur RÃ³s'

note: My console was set to display in 'latin_1' as @John Machin pointed out.

What gives? After reading this, describing exactly the same situation I'm in, it seems as if the advice is to ignore the other advice and use 8-bit bytestrings after all.

I didn't know much about unicode and utf before I started this process. I've learned quite a bit in the last couple hours, but I'm still ignorant of whether there is a way to correctly convert 'ó' from latin-1 to utf-8 and not mangle it. If there isn't, why would sqlite 'highly recommend' I switch my application to unicode strings?

I'm going to update this question with a summary and some example code of everything I've learned in the last 24 hours so that someone in my shoes can have an easy(er) guide. If the information I post is wrong or misleading in any way please tell me and I'll update, or one of you senior guys can update.

Summary of answers

Let me first state the goal as I understand it. The goal in processing various encodings, if you are trying to convert between them, is to understand what your source encoding is, then convert it to unicode using that source encoding, then convert it to your desired encoding. Unicode is a base and encodings are mappings of subsets of that base.
utf_8 has room for every character in unicode, but because they aren't in the same place as, for instance, latin_1, a string encoded in utf_8 and sent to a latin_1 console will not look the way you expect. In python the process of getting to unicode and into another encoding looks like:

    str.decode('source_encoding').encode('desired_encoding')

or if the str is already in unicode

    str.encode('desired_encoding')

For sqlite I didn't actually want to encode it again, I wanted to decode it and leave it in unicode format.

Here are four things you might need to be aware of as you try to work with unicode and encodings in python.

1. The encoding of the string you want to work with, and the encoding you want to get it to.
2. The system encoding.
3. The console encoding.
4. The encoding of the source file

Elaboration:

(1) When you read a string from a source, it must have some encoding, like latin_1 or utf_8. In my case, I'm getting strings from filenames, so unfortunately, I could be getting any kind of encoding. Windows XP uses UCS-2 (a Unicode system) as its native string type, which seems like cheating to me. Fortunately for me, the characters in most filenames are not going to be made up of more than one source encoding type, and I think all of mine were either completely latin_1, completely utf_8, or just plain ascii (which is a subset of both of those). So I just read them and decoded them as if they were still in latin_1 or utf_8. It's possible, though, that you could have latin_1 and utf_8 and whatever other characters mixed together in a filename on Windows. Sometimes those characters can show up as boxes, other times they just look mangled, and other times they look correct (accented characters and whatnot). Moving on.

(2) Python has a default system encoding that gets set when python starts and can't be changed during runtime. See here for details. Dirty summary ... well here's the file I added:

    # sitecustomize.py
    # this file can be anywhere in your Python path,
    # but it usually goes in ${pythondir}/lib/site-packages/
    import sys
    sys.setdefaultencoding('utf_8')

This system encoding is the one that gets used when you use the unicode("str") function without any other encoding parameters. To say that another way, python tries to decode "str" to unicode based on the default system encoding.

(3) If you're using IDLE or the command-line python, I think that your console will display according to the default system encoding. I am using pydev with eclipse for some reason, so I had to go into my project settings, edit the launch configuration properties of my test script, go to the Common tab, and change the console from latin-1 to utf-8 so that I could visually confirm what I was doing was working.

(4) If you want to have some test strings, eg

    test_str = "ó"

in your source code, then you will have to tell python what kind of encoding you are using in that file. (FYI: when I mistyped an encoding I had to ctrl-Z because my file became unreadable.) This is easily accomplished by putting a line like so at the top of your source code file:

    # -*- coding: utf_8 -*-

If you don't have this information, python attempts to parse your code as ascii by default, and so:

    SyntaxError: Non-ASCII character '\xf3' in file _redacted_ on line 81, but no encoding declared; see http://www.python.org/peps/pep-0263.html for details

Once your program is working correctly, or, if you aren't using python's console or any other console to look at output, then you will probably really only care about #1 on the list. System default and console encoding are not that important unless you need to look at output and/or you are using the builtin unicode() function (without any encoding parameters) instead of the string.decode() function. I wrote a demo function I will paste into the bottom of this gigantic mess that I hope correctly demonstrates the items in my list.
Here is some of the output when I run the character 'ó' through the demo function, showing how various methods react to the character as input. My system encoding and console output are both set to utf_8 for this run:

    '?' = original char <type 'str'> repr(char)='\xf3'
    '?' = unicode(char) ERROR: 'utf8' codec can't decode byte 0xf3 in position 0: unexpected end of data
    'ó' = char.decode('latin_1') <type 'unicode'> repr(char.decode('latin_1'))=u'\xf3'
    '?' = char.decode('utf_8') ERROR: 'utf8' codec can't decode byte 0xf3 in position 0: unexpected end of data

Now I will change the system and console encoding to latin_1, and I get this output for the same input:

    'ó' = original char <type 'str'> repr(char)='\xf3'
    'ó' = unicode(char) <type 'unicode'> repr(unicode(char))=u'\xf3'
    'ó' = char.decode('latin_1') <type 'unicode'> repr(char.decode('latin_1'))=u'\xf3'
    '?' = char.decode('utf_8') ERROR: 'utf8' codec can't decode byte 0xf3 in position 0: unexpected end of data

Notice that the 'original' character displays correctly and the builtin unicode() function works now.

Now I change my console output back to utf_8.

    '?' = original char <type 'str'> repr(char)='\xf3'
    '?' = unicode(char) <type 'unicode'> repr(unicode(char))=u'\xf3'
    '?' = char.decode('latin_1') <type 'unicode'> repr(char.decode('latin_1'))=u'\xf3'
    '?' = char.decode('utf_8') ERROR: 'utf8' codec can't decode byte 0xf3 in position 0: unexpected end of data

Here everything still works the same as last time but the console can't display the output correctly. Etc. The function below also displays more information than this and hopefully would help someone figure out where the gap in their understanding is. I know all this information is in other places and more thoroughly dealt with there, but I hope that this would be a good kickoff point for someone trying to get coding with python and/or sqlite. Ideas are great but sometimes source code can save you a day or two of trying to figure out what functions do what.
Disclaimers: I'm no encoding expert, I put this together to help my own understanding. I kept building on it when I should have probably started passing functions as arguments to avoid so much redundant code, so if I can I'll make it more concise. Also, utf_8 and latin_1 are by no means the only encoding schemes, they are just the two I was playing around with because I think they handle everything I need. Add your own encoding schemes to the demo function and test your own input.

One more thing: there are apparently crazy application developers making life difficult in Windows.

    #!/usr/bin/env python
    # -*- coding: utf_8 -*-
    # Python 2 code: it relies on print statements, unicode() and str.decode().

    import os
    import sys

    def encodingDemo(str):
        validStrings = ()
        # Print the byte string as-is.
        try:
            print "str =",str,"{0} repr(str) = {1}".format(type(str), repr(str))
            validStrings += ((str,""),)
        except UnicodeEncodeError as ude:
            print "Couldn't print the str itself because the console is set to an encoding that doesn't understand some character in the string. See error:\n\t",
            print ude
        # Decode with the default system encoding.
        try:
            x = unicode(str)
            print "unicode(str) = ",x
            validStrings+= ((x, " decoded into unicode by the default system encoding"),)
        except UnicodeDecodeError as ude:
            print "ERROR. unicode(str) couldn't decode the string because the system encoding is set to an encoding that doesn't understand some character in the string."
            print "\tThe system encoding is set to {0}. See error:\n\t".format(sys.getdefaultencoding()),
            print ude
        except UnicodeEncodeError as uee:
            print "ERROR. Couldn't print the unicode(str) because the console is set to an encoding that doesn't understand some character in the string. See error:\n\t",
            print uee
        # Decode as latin_1, then re-encode as utf_8.
        try:
            x = str.decode('latin_1')
            print "str.decode('latin_1') =",x
            validStrings+= ((x, " decoded with latin_1 into unicode"),)
            try:
                print "str.decode('latin_1').encode('utf_8') =",str.decode('latin_1').encode('utf_8')
                validStrings+= ((x, " decoded with latin_1 into unicode and encoded into utf_8"),)
            except UnicodeDecodeError as ude:
                print "The string was decoded into unicode using the latin_1 encoding, but couldn't be encoded into utf_8. See error:\n\t",
                print ude
        except UnicodeDecodeError as ude:
            print "Something didn't work, probably because the string wasn't latin_1 encoded. See error:\n\t",
            print ude
        except UnicodeEncodeError as uee:
            print "ERROR. Couldn't print the str.decode('latin_1') because the console is set to an encoding that doesn't understand some character in the string. See error:\n\t",
            print uee
        # Decode as utf_8, then re-encode as latin_1.
        try:
            x = str.decode('utf_8')
            print "str.decode('utf_8') =",x
            validStrings+= ((x, " decoded with utf_8 into unicode"),)
            try:
                print "str.decode('utf_8').encode('latin_1') =",str.decode('utf_8').encode('latin_1')
                # Recorded only on success, mirroring the latin_1 branch above.
                validStrings+= ((x, " decoded with utf_8 into unicode and encoded into latin_1"),)
            except UnicodeDecodeError as ude:
                print "str.decode('utf_8').encode('latin_1') didn't work. The string was decoded into unicode using the utf_8 encoding, but couldn't be encoded into latin_1. See error:\n\t",
                print ude
        except UnicodeDecodeError as ude:
            print "str.decode('utf_8') didn't work, probably because the string wasn't utf_8 encoded. See error:\n\t",
            print ude
        except UnicodeEncodeError as uee:
            print "ERROR. Couldn't print the str.decode('utf_8') because the console is set to an encoding that doesn't understand some character in the string. See error:\n\t",uee

        print
        print "Printing information about each character in the original string."
        for char in str:
            try:
                print "\t'" + char + "' = original char {0} repr(char)={1}".format(type(char), repr(char))
            except UnicodeDecodeError as ude:
                print "\t'?' = original char {0} repr(char)={1} ERROR PRINTING: {2}".format(type(char), repr(char), ude)
            except UnicodeEncodeError as uee:
                print "\t'?' = original char {0} repr(char)={1} ERROR PRINTING: {2}".format(type(char), repr(char), uee)
                print uee
            try:
                x = unicode(char)
                print "\t'" + x + "' = unicode(char) {1} repr(unicode(char))={2}".format(x, type(x), repr(x))
            except UnicodeDecodeError as ude:
                print "\t'?' = unicode(char) ERROR: {0}".format(ude)
            except UnicodeEncodeError as uee:
                print "\t'?' = unicode(char) {0} repr(char)={1} ERROR PRINTING: {2}".format(type(x), repr(x), uee)
            try:
                x = char.decode('latin_1')
                print "\t'" + x + "' = char.decode('latin_1') {1} repr(char.decode('latin_1'))={2}".format(x, type(x), repr(x))
            except UnicodeDecodeError as ude:
                print "\t'?' = char.decode('latin_1') ERROR: {0}".format(ude)
            except UnicodeEncodeError as uee:
                print "\t'?' = char.decode('latin_1') {0} repr(char)={1} ERROR PRINTING: {2}".format(type(x), repr(x), uee)
            try:
                x = char.decode('utf_8')
                print "\t'" + x + "' = char.decode('utf_8') {1} repr(char.decode('utf_8'))={2}".format(x, type(x), repr(x))
            except UnicodeDecodeError as ude:
                print "\t'?' = char.decode('utf_8') ERROR: {0}".format(ude)
            except UnicodeEncodeError as uee:
                print "\t'?' = char.decode('utf_8') {0} repr(char)={1} ERROR PRINTING: {2}".format(type(x), repr(x), uee)
            print

    x = 'ó'
    encodingDemo(x)

Much thanks for the answers below and especially to @John Machin for answering so thoroughly.
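For comparison, the same decode-then-encode workflow can be sketched in Python 3, where `str` is always Unicode and `bytes` is a separate type. This is not the author's code; it is a minimal illustration that assumes the incoming bytes are latin_1 and uses an in-memory sqlite database with a made-up table name (`tags`).

```python
import sqlite3

# Bytes as they might arrive from a latin_1 source (for example a filename).
raw = b"Sigur R\xf3s"

# Step 1: decode with the *source* encoding to get a Unicode string.
text = raw.decode("latin_1")

# Step 2: encode to the desired encoding only when bytes are needed.
utf8_bytes = text.encode("utf_8")

# Reading UTF-8 bytes as latin_1 produces the mangled form from the question.
mojibake = utf8_bytes.decode("latin_1")

# Decoding latin_1 bytes as UTF-8 fails, as in the demo output above.
try:
    raw.decode("utf_8")
    utf8_decode_failed = False
except UnicodeDecodeError:
    utf8_decode_failed = True

# sqlite3 accepts and returns Unicode strings directly.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tags (tag_artist TEXT)")
conn.execute("INSERT INTO tags VALUES (?)", (text,))
stored = conn.execute("SELECT tag_artist FROM tags").fetchone()[0]
conn.close()

print(ascii(text), utf8_bytes, ascii(mojibake), utf8_decode_failed)
```

The point of the sketch is the ordering: decode once at the boundary where bytes come in, keep Unicode inside the program and the database, and encode only at the boundary where bytes go out.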

Read the article

SQLite, python, unicode, and non-utf data

- by Nathan Spears

I started by trying to store strings in sqlite using python, and got the message: sqlite3.ProgrammingError: You must not use 8-bit bytestrings unless you use a text_factory that can interpret 8-bit bytestrings (like text_factory = str). It is highly recommended that you instead just switch your application to Unicode strings. Ok, I switched to Unicode strings. Then I started getting the message: sqlite3.OperationalError: Could not decode to UTF-8 column 'tag_artist' with text 'Sigur Rós' when trying to retrieve data from the db. More research and I started encoding it in utf8, but then 'Sigur Rós' starts looking like 'Sigur RÃ³s' note: My console was set to display in 'latin_1' as @John Machin pointed out. What gives? After reading this, describing exactly the same situation I'm in, it seems as if the advice is to ignore the other advice and use 8-bit bytestrings after all. I didn't know much about unicode and utf before I started this process. I've learned quite a bit in the last couple hours, but I'm still ignorant of whether there is a way to correctly convert 'ó' from latin-1 to utf-8 and not mangle it. If there isn't, why would sqlite 'highly recommend' I switch my application to unicode strings? I'm going to update this question with a summary and some example code of everything I've learned in the last 24 hours so that someone in my shoes can have an easy(er) guide. If the information I post is wrong or misleading in any way please tell me and I'll update, or one of you senior guys can update. Summary of answers Let me first state the goal as I understand it. The goal in processing various encodings, if you are trying to convert between them, is to understand what your source encoding is, then convert it to unicode using that source encoding, then convert it to your desired encoding. Unicode is a base and encodings are mappings of subsets of that base. 
utf_8 has room for every character in unicode, but because they aren't in the same place as, for instance, latin_1, a string encoded in utf_8 and sent to a latin_1 console will not look the way you expect. In python the process of getting to unicode and into another encoding looks like: str.decode('source_encoding').encode('desired_encoding') or if the str is already in unicode str.encode('desired_encoding') For sqlite I didn't actually want to encode it again, I wanted to decode it and leave it in unicode format. Here are four things you might need to be aware of as you try to work with unicode and encodings in python. The encoding of the string you want to work with, and the encoding you want to get it to. The system encoding. The console encoding. The encoding of the source file Elaboration: (1) When you read a string from a source, it must have some encoding, like latin_1 or utf_8. In my case, I'm getting strings from filenames, so unfortunately, I could be getting any kind of encoding. Windows XP uses UCS-2 (a Unicode system) as its native string type, which seems like cheating to me. Fortunately for me, the characters in most filenames are not going to be made up of more than one source encoding type, and I think all of mine were either completely latin_1, completely utf_8, or just plain ascii (which is a subset of both of those). So I just read them and decoded them as if they were still in latin_1 or utf_8. It's possible, though, that you could have latin_1 and utf_8 and whatever other characters mixed together in a filename on Windows. Sometimes those characters can show up as boxes, other times they just look mangled, and other times they look correct (accented characters and whatnot). Moving on. (2) Python has a default system encoding that gets set when python starts and can't be changed during runtime. See here for details. Dirty summary ... 
well here's the file I added:

    # sitecustomize.py
    # this file can be anywhere in your Python path,
    # but it usually goes in ${pythondir}/lib/site-packages/
    import sys
    sys.setdefaultencoding('utf_8')

This system encoding is the one that gets used when you use the unicode("str") function without any other encoding parameters. To say that another way, python tries to decode "str" to unicode based on the default system encoding.

(3) If you're using IDLE or the command-line python, I think that your console will display according to the default system encoding. I am using pydev with eclipse for some reason, so I had to go into my project settings, edit the launch configuration properties of my test script, go to the Common tab, and change the console from latin-1 to utf-8 so that I could visually confirm what I was doing was working.

(4) If you want to have some test strings, e.g. test_str = "ó" in your source code, then you will have to tell python what kind of encoding you are using in that file. (FYI: when I mistyped an encoding I had to ctrl-Z because my file became unreadable.) This is easily accomplished by putting a line like so at the top of your source code file:

    # -*- coding: utf_8 -*-

If you don't have this information, python attempts to parse your code as ascii by default, and so:

    SyntaxError: Non-ASCII character '\xf3' in file _redacted_ on line 81, but no encoding declared; see http://www.python.org/peps/pep-0263.html for details

Once your program is working correctly, or, if you aren't using python's console or any other console to look at output, then you will probably really only care about #1 on the list. System default and console encoding are not that important unless you need to look at output and/or you are using the builtin unicode() function (without any encoding parameters) instead of the string.decode() function. I wrote a demo function I will paste into the bottom of this gigantic mess that I hope correctly demonstrates the items in my list.
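For anyone reading this on Python 3, here is a minimal sketch of the decode-then-encode round trip from item (1). It assumes Python 3, where bytes plays the role that str plays in this post and str is already unicode. The helper name transcode is made up for illustration and is not part of any library.

```python
def transcode(raw, source_encoding, desired_encoding):
    """Decode raw bytes to unicode text, then encode to the desired encoding."""
    return raw.decode(source_encoding).encode(desired_encoding)


# 'Sigur Rós' as latin_1 bytes: the o-acute is the single byte 0xf3.
latin1_bytes = b"Sigur R\xf3s"

# The same text as utf_8 bytes: the o-acute becomes the two bytes 0xc3 0xb3.
utf8_bytes = transcode(latin1_bytes, "latin_1", "utf_8")

# Reading utf_8 bytes as if they were latin_1 is what produces the mangled
# text: each of the two bytes is shown as its own latin_1 character.
mojibake = utf8_bytes.decode("latin_1")

print(repr(utf8_bytes))
print(ascii(mojibake))
```

Nothing is lost in the conversion itself; the mangling only appears when the reader of the bytes assumes a different encoding than the writer used.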
Here is some of the output when I run the character 'ó' through the demo function, showing how various methods react to the character as input. My system encoding and console output are both set to utf_8 for this run:

    '?' = original char <type 'str'> repr(char)='\xf3'
    '?' = unicode(char) ERROR: 'utf8' codec can't decode byte 0xf3 in position 0: unexpected end of data
    'ó' = char.decode('latin_1') <type 'unicode'> repr(char.decode('latin_1'))=u'\xf3'
    '?' = char.decode('utf_8') ERROR: 'utf8' codec can't decode byte 0xf3 in position 0: unexpected end of data

Now I will change the system and console encoding to latin_1, and I get this output for the same input:

    'ó' = original char <type 'str'> repr(char)='\xf3'
    'ó' = unicode(char) <type 'unicode'> repr(unicode(char))=u'\xf3'
    'ó' = char.decode('latin_1') <type 'unicode'> repr(char.decode('latin_1'))=u'\xf3'
    '?' = char.decode('utf_8') ERROR: 'utf8' codec can't decode byte 0xf3 in position 0: unexpected end of data

Notice that the 'original' character displays correctly and the builtin unicode() function works now. Now I change my console output back to utf_8.

    '?' = original char <type 'str'> repr(char)='\xf3'
    '?' = unicode(char) <type 'unicode'> repr(unicode(char))=u'\xf3'
    '?' = char.decode('latin_1') <type 'unicode'> repr(char.decode('latin_1'))=u'\xf3'
    '?' = char.decode('utf_8') ERROR: 'utf8' codec can't decode byte 0xf3 in position 0: unexpected end of data

Here everything still works the same as last time but the console can't display the output correctly. Etc. The function below also displays more information than this and hopefully would help someone figure out where the gap in their understanding is. I know all this information is in other places and more thoroughly dealt with there, but I hope that this would be a good kickoff point for someone trying to get coding with python and/or sqlite. Ideas are great but sometimes source code can save you a day or two of trying to figure out what functions do what.
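The core of those runs fits in a few lines. This is a rough Python 3 sketch (an assumption; the post itself is Python 2) that tries the single byte 0xf3 against several codecs and records which ones accept it. The helper name try_decode is hypothetical.

```python
def try_decode(data, encoding):
    """Return the decoded text, or an 'ERROR: ...' string if decoding fails."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError as err:
        return "ERROR: {0}".format(err.reason)


raw = b"\xf3"  # the o-acute character as a single latin_1 byte

# latin_1 maps every byte to a character, so it always succeeds.
# utf_8 and ascii both reject a lone 0xf3 byte.
results = {enc: try_decode(raw, enc) for enc in ("latin_1", "utf_8", "ascii")}

for enc, outcome in results.items():
    print(enc, "->", ascii(outcome))
```

Because latin_1 never raises on decode, a successful latin_1 decode says nothing about whether the bytes really were latin_1; only the utf_8 failure is informative here.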
Disclaimers: I'm no encoding expert, I put this together to help my own understanding. I kept building on it when I should have probably started passing functions as arguments to avoid so much redundant code, so if I can I'll make it more concise. Also, utf_8 and latin_1 are by no means the only encoding schemes, they are just the two I was playing around with because I think they handle everything I need. Add your own encoding schemes to the demo function and test your own input. One more thing: there are apparently crazy application developers making life difficult in Windows. #!/usr/bin/env python # -*- coding: utf_8 -*- import os import sys def encodingDemo(str): validStrings = () try: print "str =",str,"{0} repr(str) = {1}".format(type(str), repr(str)) validStrings += ((str,""),) except UnicodeEncodeError as ude: print "Couldn't print the str itself because the console is set to an encoding that doesn't understand some character in the string. See error:\n\t", print ude try: x = unicode(str) print "unicode(str) = ",x validStrings+= ((x, " decoded into unicode by the default system encoding"),) except UnicodeDecodeError as ude: print "ERROR. unicode(str) couldn't decode the string because the system encoding is set to an encoding that doesn't understand some character in the string." print "\tThe system encoding is set to {0}. See error:\n\t".format(sys.getdefaultencoding()), print ude except UnicodeEncodeError as uee: print "ERROR. Couldn't print the unicode(str) because the console is set to an encoding that doesn't understand some character in the string. 
See error:\n\t", print uee try: x = str.decode('latin_1') print "str.decode('latin_1') =",x validStrings+= ((x, " decoded with latin_1 into unicode"),) try: print "str.decode('latin_1').encode('utf_8') =",str.decode('latin_1').encode('utf_8') validStrings+= ((x, " decoded with latin_1 into unicode and encoded into utf_8"),) except UnicodeDecodeError as ude: print "The string was decoded into unicode using the latin_1 encoding, but couldn't be encoded into utf_8. See error:\n\t", print ude except UnicodeDecodeError as ude: print "Something didn't work, probably because the string wasn't latin_1 encoded. See error:\n\t", print ude except UnicodeEncodeError as uee: print "ERROR. Couldn't print the str.decode('latin_1') because the console is set to an encoding that doesn't understand some character in the string. See error:\n\t", print uee try: x = str.decode('utf_8') print "str.decode('utf_8') =",x validStrings+= ((x, " decoded with utf_8 into unicode"),) try: print "str.decode('utf_8').encode('latin_1') =",str.decode('utf_8').encode('latin_1') except UnicodeDecodeError as ude: print "str.decode('utf_8').encode('latin_1') didn't work. The string was decoded into unicode using the utf_8 encoding, but couldn't be encoded into latin_1. See error:\n\t", validStrings+= ((x, " decoded with utf_8 into unicode and encoded into latin_1"),) print ude except UnicodeDecodeError as ude: print "str.decode('utf_8') didn't work, probably because the string wasn't utf_8 encoded. See error:\n\t", print ude except UnicodeEncodeError as uee: print "ERROR. Couldn't print the str.decode('utf_8') because the console is set to an encoding that doesn't understand some character in the string. See error:\n\t",uee print print "Printing information about each character in the original string." for char in str: try: print "\t'" + char + "' = original char {0} repr(char)={1}".format(type(char), repr(char)) except UnicodeDecodeError as ude: print "\t'?' 
= original char {0} repr(char)={1} ERROR PRINTING: {2}".format(type(char), repr(char), ude) except UnicodeEncodeError as uee: print "\t'?' = original char {0} repr(char)={1} ERROR PRINTING: {2}".format(type(char), repr(char), uee) print uee try: x = unicode(char) print "\t'" + x + "' = unicode(char) {1} repr(unicode(char))={2}".format(x, type(x), repr(x)) except UnicodeDecodeError as ude: print "\t'?' = unicode(char) ERROR: {0}".format(ude) except UnicodeEncodeError as uee: print "\t'?' = unicode(char) {0} repr(char)={1} ERROR PRINTING: {2}".format(type(x), repr(x), uee) try: x = char.decode('latin_1') print "\t'" + x + "' = char.decode('latin_1') {1} repr(char.decode('latin_1'))={2}".format(x, type(x), repr(x)) except UnicodeDecodeError as ude: print "\t'?' = char.decode('latin_1') ERROR: {0}".format(ude) except UnicodeEncodeError as uee: print "\t'?' = char.decode('latin_1') {0} repr(char)={1} ERROR PRINTING: {2}".format(type(x), repr(x), uee) try: x = char.decode('utf_8') print "\t'" + x + "' = char.decode('utf_8') {1} repr(char.decode('utf_8'))={2}".format(x, type(x), repr(x)) except UnicodeDecodeError as ude: print "\t'?' = char.decode('utf_8') ERROR: {0}".format(ude) except UnicodeEncodeError as uee: print "\t'?' = char.decode('utf_8') {0} repr(char)={1} ERROR PRINTING: {2}".format(type(x), repr(x), uee) print x = 'ó' encodingDemo(x) Much thanks for the answers below and especially to @John Machin for answering so thoroughly.
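To tie this back to the sqlite problem at the top: a minimal sketch, assuming Python 3 and an in-memory database, of decoding byte strings to unicode before handing them to sqlite3. The table name tracks and the helper to_text are invented for illustration (tag_artist is the column from the error message above), and the "try utf_8, fall back to latin_1" rule is only a heuristic that matches the filenames described in this post.

```python
import sqlite3


def to_text(raw):
    """Decode bytes as utf_8 if possible, otherwise fall back to latin_1.

    Heuristic only: latin_1 accepts any byte, so the fallback never raises,
    but it can silently mis-decode bytes written in some other encoding.
    """
    try:
        return raw.decode("utf_8")
    except UnicodeDecodeError:
        return raw.decode("latin_1")


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tracks (tag_artist TEXT)")

# The same artist name, once as latin_1 bytes and once as utf_8 bytes.
for raw in (b"Sigur R\xf3s", b"Sigur R\xc3\xb3s"):
    conn.execute("INSERT INTO tracks (tag_artist) VALUES (?)", (to_text(raw),))

rows = [row[0] for row in
        conn.execute("SELECT tag_artist FROM tracks ORDER BY rowid")]
print([ascii(value) for value in rows])
```

Both rows come back as the same unicode string, which is the point of decoding at the boundary and leaving the data as unicode inside the database.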

Read the article

Mysql - help me optimize this query

- by sandeepan-nath

About the system:

- The system has a total of 8 tables: Users; Tutor_Details (tutors are a type of user, and the Tutor_Details table is linked to Users); learning_packs (stores packs created by tutors); learning_packs_tag_relations (holds tag relations meant for search); tutors_tag_relations; tags; orders (containing purchase details of tutors' packs); and order_details, linked to orders and tutor_details. For a clearer idea of the tables involved, please check the section "The tables" at the end.
- A tag-based search approach is being followed. Tag relations are created when new tutors register and when tutors create packs (this makes tutors and packs searchable). For details please check the section "How tags work in this system?" below.

Following is a simpler representation (not the actual one) of the more complex query which I am trying to optimize. Backtick-quoted statements in the query stand in for the parts they describe.

select SUM(DISTINCT( t.tag LIKE "%Dictatorship%" )) as key_1_total_matches,
SUM(DISTINCT( t.tag LIKE "%democracy%" )) as key_2_total_matches,
td.*, u.*,
count(distinct(od.id_od)),
if (lp.id_lp > 0) then some conditional logic on lp fields else 0 as tutor_popularity
from Tutor_Details AS td
JOIN Users as u on u.id_user = td.id_user
LEFT JOIN Learning_Packs_Tag_Relations AS lptagrels ON td.id_tutor = lptagrels.id_tutor
LEFT JOIN Learning_Packs AS lp ON lptagrels.id_lp = lp.id_lp
LEFT JOIN `some other tables on lp.id_lp - let's call learning pack tables set (including Learning_Packs table)`
LEFT JOIN Order_Details as od on td.id_tutor = od.id_author
LEFT JOIN Orders as o on od.id_order = o.id_order
LEFT JOIN Tutors_Tag_Relations as ttagrels ON td.id_tutor = ttagrels.id_tutor
JOIN Tags as t on (t.id_tag = ttagrels.id_tag) OR (t.id_tag = lptagrels.id_tag)
where some condition on Users table's fields
AND CASE WHEN ((t.id_tag = lptagrels.id_tag) AND (lp.id_lp <> 0)) THEN `some conditions on learning pack tables set` ELSE 1 END
AND CASE WHEN ((t.id_tag = wtagrels.id_tag) AND (wc.id_wc <> 0)) THEN `some conditions on webclasses tables set` ELSE 1 END
AND CASE WHEN (od.id_od <> 0) THEN od.id_author = td.id_tutor and some conditions on Orders table's fields ELSE 1 END
AND ( t.tag LIKE "%Dictatorship%" OR t.tag LIKE "%democracy%")
group by td.id_tutor
HAVING key_1_total_matches = 1 AND key_2_total_matches = 1
order by tutor_popularity desc, u.surname asc, u.name asc
limit 0,20

=====================================================================

What does the above query do?

It does an AND-logic search on the search keywords (2 in this example - "Democracy" and "Dictatorship"). It returns only those tutors for which both keywords are present in the union of the two sets: the tutor's details and the details of all the packs created by that tutor. To make things clear - suppose a tutor named "Sandeepan Nath" has created a pack "My first pack", then:

- Searching "Sandeepan Nath" returns Sandeepan Nath.
- Searching "Sandeepan first" returns Sandeepan Nath.
- Searching "Sandeepan second" does not return Sandeepan Nath.

=====================================================================

The problem

The results returned by the above query are correct (AND logic working as per expectation), but the query takes about 25 seconds on heavily loaded databases, as against normal query timings of the order of 0.005 - 0.0002 seconds, which makes it totally unusable. It is possible that some of the delay is caused by fields that have not yet been indexed, but I would appreciate a better query as a solution, optimized as much as possible, displaying the same results.

=====================================================================

How tags work in this system?

When a tutor registers, tags are entered and tag relations are created with respect to tutor's details like name, surname etc.
When a Tutors create packs, again tags are entered and tag relations are created with respect to pack's details like pack name, description etc. tag relations for tutors stored in tutors_tag_relations and those for packs stored in learning_packs_tag_relations. All individual tags are stored in tags table. ==================================================================== The tables Most of the following tables contain many other fields which I have omitted here. CREATE TABLE IF NOT EXISTS users ( id_user int(10) unsigned NOT NULL AUTO_INCREMENT, name varchar(100) NOT NULL DEFAULT '', surname varchar(155) NOT NULL DEFAULT '', PRIMARY KEY (id_user) ) ENGINE=InnoDB DEFAULT CHARSET=utf8 AUTO_INCREMENT=636 ; CREATE TABLE IF NOT EXISTS tutor_details ( id_tutor int(10) NOT NULL AUTO_INCREMENT, id_user int(10) NOT NULL DEFAULT '0', PRIMARY KEY (id_tutor), KEY Users_FKIndex1 (id_user) ) ENGINE=InnoDB DEFAULT CHARSET=latin1 AUTO_INCREMENT=51 ; CREATE TABLE IF NOT EXISTS orders ( id_order int(10) unsigned NOT NULL AUTO_INCREMENT, PRIMARY KEY (id_order), KEY Orders_FKIndex1 (id_user), ) ENGINE=InnoDB DEFAULT CHARSET=utf8 AUTO_INCREMENT=275 ; ALTER TABLE orders ADD CONSTRAINT Orders_ibfk_1 FOREIGN KEY (id_user) REFERENCES users (id_user) ON DELETE NO ACTION ON UPDATE NO ACTION; CREATE TABLE IF NOT EXISTS order_details ( id_od int(10) unsigned NOT NULL AUTO_INCREMENT, id_order int(10) unsigned NOT NULL DEFAULT '0', id_author int(10) NOT NULL DEFAULT '0', PRIMARY KEY (id_od), KEY Order_Details_FKIndex1 (id_order) ) ENGINE=InnoDB DEFAULT CHARSET=utf8 AUTO_INCREMENT=284 ; ALTER TABLE order_details ADD CONSTRAINT Order_Details_ibfk_1 FOREIGN KEY (id_order) REFERENCES orders (id_order) ON DELETE NO ACTION ON UPDATE NO ACTION; CREATE TABLE IF NOT EXISTS learning_packs ( id_lp int(10) unsigned NOT NULL AUTO_INCREMENT, id_author int(10) unsigned NOT NULL DEFAULT '0', PRIMARY KEY (id_lp), KEY Learning_Packs_FKIndex2 (id_author), KEY id_lp (id_lp) ) ENGINE=InnoDB DEFAULT CHARSET=utf8 
AUTO_INCREMENT=23 ; CREATE TABLE IF NOT EXISTS tags ( id_tag int(10) unsigned NOT NULL AUTO_INCREMENT, tag varchar(255) DEFAULT NULL, PRIMARY KEY (id_tag), UNIQUE KEY tag (tag), KEY id_tag (id_tag), KEY tag_2 (tag), KEY tag_3 (tag) ) ENGINE=InnoDB DEFAULT CHARSET=latin1 AUTO_INCREMENT=3419 ; CREATE TABLE IF NOT EXISTS tutors_tag_relations ( id_tag int(10) unsigned NOT NULL DEFAULT '0', id_tutor int(10) DEFAULT NULL, KEY Tutors_Tag_Relations (id_tag), KEY id_tutor (id_tutor), KEY id_tag (id_tag) ) ENGINE=InnoDB DEFAULT CHARSET=latin1; ALTER TABLE tutors_tag_relations ADD CONSTRAINT Tutors_Tag_Relations_ibfk_1 FOREIGN KEY (id_tag) REFERENCES tags (id_tag) ON DELETE NO ACTION ON UPDATE NO ACTION; CREATE TABLE IF NOT EXISTS learning_packs_tag_relations ( id_tag int(10) unsigned NOT NULL DEFAULT '0', id_tutor int(10) DEFAULT NULL, id_lp int(10) unsigned DEFAULT NULL, KEY Learning_Packs_Tag_Relations_FKIndex1 (id_tag), KEY id_lp (id_lp), KEY id_tag (id_tag) ) ENGINE=InnoDB DEFAULT CHARSET=latin1; ALTER TABLE learning_packs_tag_relations ADD CONSTRAINT Learning_Packs_Tag_Relations_ibfk_1 FOREIGN KEY (id_tag) REFERENCES tags (id_tag) ON DELETE NO ACTION ON UPDATE NO ACTION;

=====================================================================

Following is the exact query (this includes classes also - tutors can create classes and search terms are matched with classes created by tutors):-

select count(distinct(od.id_od)) as tutor_popularity, CASE WHEN (IF((wc.id_wc <> 0), ( wc.wc_api_status = 1 AND wc.wc_type = 0 AND wc.class_date '2010-06-01 22:00:56' AND wccp.status = 1 AND (wccp.country_code='IE' or wccp.country_code IN ('INT'))), 0)) THEN 1 ELSE 0 END as 'classes_published', CASE WHEN (IF((lp.id_lp <> 0), (lp.id_status = 1 AND lp.published = 1 AND lpcp.status = 1 AND (lpcp.country_code='IE' or lpcp.country_code IN ('INT'))),0)) THEN 1 ELSE 0 END as 'packs_published', td.*, u.* from Tutor_Details AS td JOIN Users as u on u.id_user = td.id_user LEFT JOIN Learning_Packs_Tag_Relations AS lptagrels ON td.id_tutor = lptagrels.id_tutor LEFT JOIN Learning_Packs AS lp ON lptagrels.id_lp = lp.id_lp LEFT JOIN Learning_Packs_Categories AS lpc ON lpc.id_lp_cat = lp.id_lp_cat LEFT JOIN Learning_Packs_Categories AS lpcp ON lpcp.id_lp_cat = lpc.id_parent LEFT JOIN Learning_Pack_Content as lpct on (lp.id_lp = lpct.id_lp) LEFT JOIN Webclasses_Tag_Relations AS wtagrels ON td.id_tutor = wtagrels.id_tutor LEFT JOIN WebClasses AS wc ON wtagrels.id_wc = wc.id_wc LEFT JOIN Learning_Packs_Categories AS wcc ON wcc.id_lp_cat = wc.id_wp_cat LEFT JOIN Learning_Packs_Categories AS wccp ON wccp.id_lp_cat = wcc.id_parent LEFT JOIN Order_Details as od on td.id_tutor = od.id_author LEFT JOIN Orders as o on od.id_order = o.id_order LEFT JOIN Tutors_Tag_Relations as ttagrels ON td.id_tutor = ttagrels.id_tutor JOIN Tags as t on (t.id_tag = ttagrels.id_tag) OR (t.id_tag = lptagrels.id_tag) OR (t.id_tag = wtagrels.id_tag) where (u.country='IE' or u.country IN ('INT')) AND CASE WHEN ((t.id_tag = lptagrels.id_tag) AND (lp.id_lp <> 0)) THEN lp.id_status = 1 AND lp.published = 1 AND lpcp.status = 1 AND (lpcp.country_code='IE' or lpcp.country_code IN ('INT')) ELSE 1 END AND CASE WHEN ((t.id_tag = wtagrels.id_tag) AND (wc.id_wc <> 0)) THEN wc.wc_api_status = 1 AND wc.wc_type = 0 AND wc.class_date '2010-06-01 22:00:56' AND wccp.status = 1 AND (wccp.country_code='IE' or wccp.country_code IN ('INT')) ELSE 1 END AND CASE WHEN (od.id_od <> 0) THEN od.id_author = td.id_tutor and o.order_status = 'paid' and CASE WHEN (od.id_wc <> 0) THEN od.can_attend_class=1 ELSE 1 END ELSE 1 END AND 1 group by td.id_tutor order by tutor_popularity desc, u.surname asc, u.name asc limit 0,20

Please note - The provided database structure does not show all the fields and tables as in this query
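The AND-logic requirement on its own (a tutor matches only if every keyword is found among the tutor's tags or the tags of the tutor's packs) can be sketched in isolation. This is not the query above and not a tuned replacement for it: it is a stripped-down illustration run against an in-memory SQLite database from Python, it uses exact tag matches instead of LIKE, and it leaves out popularity ordering, country filters and publication status. The sample rows and the function name search are invented; the table and column names follow the schema in "The tables".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tags (id_tag INTEGER PRIMARY KEY, tag TEXT UNIQUE);
CREATE TABLE tutors_tag_relations (id_tag INTEGER, id_tutor INTEGER);
CREATE TABLE learning_packs_tag_relations (
    id_tag INTEGER, id_tutor INTEGER, id_lp INTEGER);

INSERT INTO tags VALUES
    (1, 'Sandeepan'), (2, 'Nath'), (3, 'first'), (4, 'second');
-- Tutor 10 is "Sandeepan Nath" and owns pack 100, "My first pack".
INSERT INTO tutors_tag_relations VALUES (1, 10), (2, 10);
INSERT INTO learning_packs_tag_relations VALUES (3, 10, 100);
-- Tutor 11 owns pack 101, which carries the tag 'second'.
INSERT INTO learning_packs_tag_relations VALUES (4, 11, 101);
""")


def search(keywords):
    """Return ids of tutors whose tutor tags and pack tags cover every keyword."""
    keywords = sorted(set(keywords))
    placeholders = ",".join("?" for _ in keywords)
    sql = """
        SELECT rel.id_tutor
        FROM (SELECT id_tag, id_tutor FROM tutors_tag_relations
              UNION ALL
              SELECT id_tag, id_tutor FROM learning_packs_tag_relations) AS rel
        JOIN tags AS t ON t.id_tag = rel.id_tag
        WHERE t.tag IN ({0})
        GROUP BY rel.id_tutor
        HAVING COUNT(DISTINCT t.tag) = ?
        ORDER BY rel.id_tutor
    """.format(placeholders)
    return [row[0] for row in conn.execute(sql, keywords + [len(keywords)])]


print(search(["Sandeepan", "Nath"]))
print(search(["Sandeepan", "first"]))
print(search(["Sandeepan", "second"]))
```

The idea is to union the two relation tables first and then count the distinct matched keywords per tutor, so the tags table is joined once on an equality condition instead of through an OR across several LEFT JOINs. Whether that helps on the real schema would have to be checked with EXPLAIN there.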

Read the article

Dynamic JSON Parsing in .NET with JsonValue

- by Rick Strahl

So System.Json has been around for a while in Silverlight, but it's relatively new for the desktop .NET framework and is now moving into the limelight with the pending release of ASP.NET Web API, which is bringing a ton of attention to server side JSON usage. The JsonValue, JsonObject and JsonArray objects are going to be pretty useful for Web API applications as they allow you to dynamically create and parse JSON values without explicit .NET types to serialize from or into. But even more so I think JsonValue et al. are going to be very useful when consuming JSON APIs from various services.

Yes I know C# is strongly typed, so why in the world would you want to use dynamic values? So many times I've needed to retrieve a small morsel of information from a large service JSON response and rather than having to map the entire type structure of what that service returns, JsonValue actually allows me to cherry pick and only work with the values I'm interested in, without having to explicitly create everything up front. With JavaScriptSerializer or DataContractJsonSerializer you always need to have a strong type to de-serialize JSON data into. Wouldn't it be nice if no explicit type was required and you could just parse the JSON directly using a very easy to use object syntax? That's exactly what JsonValue, JsonObject and JsonArray accomplish using a JSON parser and some sweet use of dynamic sauce to make it easy to access in code.

Creating JSON on the fly with JsonValue

Let's start with creating JSON on the fly. It's super easy to create a dynamic object structure. JsonValue uses the dynamic keyword extensively to make it intuitive to create object structures and turn them into JSON via dynamic object syntax.
Here's an example of creating a music album structure with child songs using JsonValue:

    [TestMethod]
    public void JsonValueOutputTest()
    {
        // strong type instance
        var jsonObject = new JsonObject();

        // dynamic expando instance you can add properties to
        dynamic album = jsonObject;
        album.AlbumName = "Dirty Deeds Done Dirt Cheap";
        album.Artist = "AC/DC";
        album.YearReleased = 1977;
        album.Songs = new JsonArray() as dynamic;

        dynamic song = new JsonObject();
        song.SongName = "Dirty Deeds Done Dirt Cheap";
        song.SongLength = "4:11";
        album.Songs.Add(song);

        song = new JsonObject();
        song.SongName = "Love at First Feel";
        song.SongLength = "3:10";
        album.Songs.Add(song);

        Console.WriteLine(album.ToString());
    }

This produces proper JSON just as you would expect:

    {"AlbumName":"Dirty Deeds Done Dirt Cheap","Artist":"AC\/DC","YearReleased":1977,"Songs":[{"SongName":"Dirty Deeds Done Dirt Cheap","SongLength":"4:11"},{"SongName":"Love at First Feel","SongLength":"3:10"}]}

The important thing about this code is that there's no explicit type used for holding the values to serialize to JSON. I am essentially creating this value structure on the fly by adding properties and then serializing it to JSON. This means this code can be entirely driven at runtime without compile time restraints of structure for the JSON output. Here I use JsonObject() to create a new object and immediately cast it to dynamic. JsonObject() is kind of similar in behavior to ExpandoObject in that it allows you to add properties by simply assigning to them. Internally, JsonValue/JsonObject values are stored in pseudo collections of key value pairs that are exposed as properties through the DynamicObject functionality in .NET. The syntax gets a little tedious only if you need to create child objects or arrays that have to be explicitly defined first. Other than that the syntax looks like normal object access syntax.
Always remember though these values are dynamic - which means no Intellisense and no compiler type checking. It's up to you to ensure that the values you create are accessed consistently and without typos in your code. Note that you can also access the JsonValue instance directly and get access to the underlying type. This means you can assign properties by string, which can be useful for fully data driven JSON generation from other structures. Below you can see both styles of access next to each other:

    // strong type instance
    var jsonObject = new JsonObject();

    // you can explicitly add values here
    jsonObject.Add("Entered", DateTime.Now);

    // expando style instance you can just 'use' properties
    dynamic album = jsonObject;
    album.AlbumName = "Dirty Deeds Done Dirt Cheap";

JsonValue internally stores property keys and values in collections and you can iterate over them at runtime. You can also manipulate the collections if you need to get the object structure to look exactly like you want. Again, if you've used ExpandoObject before, JsonObject/Value are very similar in the behavior of the structure.

Reading JSON strings into JsonValue

The JsonValue structure supports importing JSON via the Parse() and Load() methods which can read JSON data from a string or various streams respectively. Essentially JsonValue includes the core JSON parsing to turn a JSON string into a collection of JsonValue objects that can be then referenced using familiar dynamic object syntax.
Here's a simple example:

    [TestMethod]
    public void JsonValueParsingTest()
    {
        var jsonString = @"{""Name"":""Rick"",""Company"":""West Wind"",""Entered"":""2012-03-16T00:03:33.245-10:00""}";

        dynamic json = JsonValue.Parse(jsonString);

        // values require casting
        string name = json.Name;
        string company = json.Company;
        DateTime entered = json.Entered;

        Assert.AreEqual(name, "Rick");
        Assert.AreEqual(company, "West Wind");
    }

The JSON string represents an object with three properties which is parsed into a JsonValue object and cast to dynamic. Once cast to dynamic I can then go ahead and access the object using familiar object syntax. Note that the actual values - json.Name, json.Company, json.Entered - are actually of type JsonPrimitive and I have to assign them to their appropriate types first before I can do type comparisons. The dynamic properties will automatically cast to the right type expected as long as the compiler can resolve the type of the assignment or usage. The AreEqual() method doesn't, as it expects two object instances, and comparing json.Company to "West Wind" is comparing two different types (JsonPrimitive to String), which fails. So the intermediary assignment is required to make the test pass. The JSON structure can be much more complex than this simple example.
Here's another example of an array of albums serialized to JSON and then parsed through with JsonValue(): [TestMethod] public void JsonArrayParsingTest() { var jsonString = @"[ { ""Id"": ""b3ec4e5c"", ""AlbumName"": ""Dirty Deeds Done Dirt Cheap"", ""Artist"": ""AC/DC"", ""YearReleased"": 1977, ""Entered"": ""2012-03-16T00:13:12.2810521-10:00"", ""AlbumImageUrl"": ""http://ecx.images-amazon.com/images/I/61kTaH-uZBL._AA115_.jpg"", ""AmazonUrl"": ""http://www.amazon.com/gp/product/B00008BXJ4/ref=as_li_ss_tl?ie=UTF8&tag=westwindtechn-20&linkCode=as2&camp=1789&creative=390957&creativeASIN=B00008BXJ4"", ""Songs"": [ { ""AlbumId"": ""b3ec4e5c"", ""SongName"": ""Dirty Deeds Done Dirt Cheap"", ""SongLength"": ""4:11"" }, { ""AlbumId"": ""b3ec4e5c"", ""SongName"": ""Love at First Feel"", ""SongLength"": ""3:10"" }, { ""AlbumId"": ""b3ec4e5c"", ""SongName"": ""Big Balls"", ""SongLength"": ""2:38"" } ] }, { ""Id"": ""67280fb8"", ""AlbumName"": ""Echoes, Silence, Patience & Grace"", ""Artist"": ""Foo Fighters"", ""YearReleased"": 2007, ""Entered"": ""2012-03-16T00:13:12.2810521-10:00"", ""AlbumImageUrl"": ""http://ecx.images-amazon.com/images/I/41mtlesQPVL._SL500_AA280_.jpg"", ""AmazonUrl"": ""http://www.amazon.com/gp/product/B000UFAURI/ref=as_li_ss_tl?ie=UTF8&tag=westwindtechn-20&linkCode=as2&camp=1789&creative=390957&creativeASIN=B000UFAURI"", ""Songs"": [ { ""AlbumId"": ""67280fb8"", ""SongName"": ""The Pretender"", ""SongLength"": ""4:29"" }, { ""AlbumId"": ""67280fb8"", ""SongName"": ""Let it Die"", ""SongLength"": ""4:05"" }, { ""AlbumId"": ""67280fb8"", ""SongName"": ""Erase/Replay"", ""SongLength"": ""4:13"" } ] }, { ""Id"": ""7b919432"", ""AlbumName"": ""End of the Silence"", ""Artist"": ""Henry Rollins Band"", ""YearReleased"": 1992, ""Entered"": ""2012-03-16T00:13:12.2800521-10:00"", ""AlbumImageUrl"": ""http://ecx.images-amazon.com/images/I/51FO3rb1tuL._SL160_AA160_.jpg"", ""AmazonUrl"": ""http://www.amazon.com/End-Silence-Rollins-Band/dp/B0000040OX/ref=sr_1_5?ie=UTF8&qid=1302232195&sr=8-5"", ""Songs"": [ { ""AlbumId"": ""7b919432"", ""SongName"": ""Low Self Opinion"", ""SongLength"": ""5:24"" }, { ""AlbumId"": ""7b919432"", ""SongName"": ""Grip"", ""SongLength"": ""4:51"" } ] } ]"; dynamic albums = JsonValue.Parse(jsonString); foreach (dynamic album in albums) { Console.WriteLine(album.AlbumName + " (" + album.YearReleased.ToString() + ")"); foreach (dynamic song in album.Songs) { Console.WriteLine("\t" + song.SongName ); } } Console.WriteLine(albums[0].AlbumName); Console.WriteLine(albums[0].Songs[1].SongName); } It's pretty sweet how easy it becomes to parse even complex JSON and then just run through the object using object syntax, yet without an explicit type in the mix. In fact it looks and feels a lot like if you were using JavaScript to parse through this data, doesn't it? And that's the point…

© Rick Strahl, West Wind Technologies, 2005-2012. Posted in .NET Web Api JSON

Read the article

C# SOCKS proxy service for HTTP requests

- by Ed

I'm trying to build a service that will forward HTTP requests from agents like a browser to the Tor service. Problem is, the Tor service only accepts SOCKS4a connections. So my solution is to listen for HTTP requests, get the URL they're requesting, and make a request via Tor with the help of the Starksoft.Net.Proxy library. Then return the response. The library kind of works, but I'm not happy. It returns HTTP headers with the response and it can't handle images. So the responses are messed up. How could I improve my code? I'm very new to network programming. Sorry for the long example.

public AnonymiserService(ILogger logger)
{
    try
    {
        _logger = logger;
        _logger.Log("Listening on port {0}...", Properties.Settings.Default.ListeningPort);
        StartListener(new string[] { string.Format("http://*:{0}/", Properties.Settings.Default.ListeningPort) });
    }
    catch (Exception ex)
    {
        _logger.LogError("Exception!", ex);
    }
}

private void StartListener(string[] prefixes)
{
    if (!HttpListener.IsSupported)
    {
        _logger.LogError("HttpListener isn't supported on this machine!");
        return;
    }
    HttpListener listener = new HttpListener();
    foreach (string s in prefixes)
        listener.Prefixes.Add(s);
    while (true)
    {
        listener.Start();
        IAsyncResult result = listener.BeginGetContext(new AsyncCallback(ListenerCallback), listener);
        result.AsyncWaitHandle.WaitOne();
    }
}

private void ListenerCallback(IAsyncResult result)
{
    try
    {
        // Get HTTP request
        HttpListener listener = (HttpListener)result.AsyncState;
        HttpListenerContext context = listener.EndGetContext(result);
        _logger.Log("Retrieving [{0}]", context.Request.RawUrl);

        // Create connection
        // Use Tor as proxy
        IProxyClient proxyClient = new Socks4aProxyClient("localhost", 9050);
        TcpClient tcpClient = proxyClient.CreateConnection(context.Request.UserHostName, 80);

        // Create message
        // Need to set Connection: close to close the connection as soon as it's done
        byte[] data = Encoding.UTF8.GetBytes(String.Format("GET {0} HTTP/1.1\r\nHost: {1}\r\nConnection: close\r\n\r\n", context.Request.Url.PathAndQuery, context.Request.UserHostName));

        // Send message
        NetworkStream ns = tcpClient.GetStream();
        ns.Write(data, 0, data.Length);

        // Pass on HTTP response
        HttpListenerResponse responseOut = context.Response;
        if (ns.CanRead)
        {
            byte[] buffer = new byte[32768];
            int read = 0;
            string responseString = string.Empty;

            // Read response
            while ((read = ns.Read(buffer, 0, buffer.Length)) > 0)
            {
                responseString += Encoding.UTF8.GetString(buffer, 0, read);
            }

            // Remove headers
            if (responseString.IndexOf("HTTP/1.1 200 OK") > -1)
                responseString = responseString.Substring(responseString.IndexOf("\r\n\r\n"));

            // Forward response
            byte[] byteArray = Encoding.UTF8.GetBytes(responseString);
            responseOut.OutputStream.Write(byteArray, 0, byteArray.Length);
        }

        // Close streams
        responseOut.OutputStream.Close();
        ns.Close();

        // Close connection
        tcpClient.Close();
        _logger.Log("Retrieved [{0}]", context.Request.RawUrl);
    }
    catch (Exception ex)
    {
        _logger.LogError("Exception in ListenerCallback!", ex);
    }
}
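The code above decodes the whole response as UTF-8 text before stripping the headers, which is one plausible reason binary bodies such as images come out damaged. A minimal Python sketch of the alternative, splitting headers from body at the byte level, is shown here. The function name `split_http_response` and the sample response are made up for illustration, and the sketch does not handle chunked transfer encoding.

```python
def split_http_response(raw: bytes):
    """Split a raw HTTP response into (status_line, headers, body) without decoding the body."""
    head, sep, body = raw.partition(b"\r\n\r\n")
    if not sep:
        raise ValueError("no header/body separator found")
    lines = head.split(b"\r\n")
    # Header bytes are decoded as ISO-8859-1, which maps every byte to a character.
    status_line = lines[0].decode("iso-8859-1")
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(b":")
        headers[name.strip().lower().decode("iso-8859-1")] = value.strip().decode("iso-8859-1")
    return status_line, headers, body


# Hypothetical response: a few header lines followed by non-text bytes.
raw = (b"HTTP/1.1 200 OK\r\nContent-Type: image/png\r\nConnection: close\r\n\r\n"
       b"\x89PNG\r\n\x1a\n\x00\xff")
status, headers, body = split_http_response(raw)

# A UTF-8 decode/encode round trip changes bytes that are not valid UTF-8.
round_trip = body.decode("utf-8", errors="replace").encode("utf-8")
```

The body stays a byte string from the socket to the output stream, and the parsed headers (for example `content-type`) can be copied onto the outgoing response instead of being discarded.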

Read the article

R: extracting "clean" UTF-8 text from a web page scraped with RCurl

- by SlowLearner

Using R, I am trying to scrape a web page and save the text, which is in Japanese, to a file. Ultimately this needs to be scaled to tackle hundreds of pages on a daily basis. I already have a workable solution in Perl, but I am trying to migrate the script to R to reduce the cognitive load of switching between multiple languages. So far I am not succeeding. Related questions seem to be this one on saving csv files and this one on writing Hebrew to a HTML file. However, I haven't been successful in cobbling together a solution based on the answers there. The pages are from Yahoo! Japan Finance and my Perl code looks like this.

use strict;
use HTML::Tree;
use LWP::Simple;
#use Encode;
use utf8;
binmode STDOUT, ":utf8";
my @arr_links = ();
$arr_links[1] = "http://stocks.finance.yahoo.co.jp/stocks/detail/?code=7203";
$arr_links[2] = "http://stocks.finance.yahoo.co.jp/stocks/detail/?code=7201";
foreach my $link (@arr_links){
    $link =~ s/"//gi;
    print("$link\n");
    my $content = get($link);
    my $tree = HTML::Tree->new();
    $tree->parse($content);
    my $bar = $tree->as_text;
    open OUTFILE, ">>:utf8", join("","c:/", substr($link, -4),"_perl.txt") || die;
    print OUTFILE $bar;
}

This Perl script produces a CSV file that looks like the screenshot below, with proper kanji and kana that can be mined and manipulated offline:

My R code, such as it is, looks like the following. The R script is not an exact duplicate of the Perl solution just given, as it doesn't strip out the HTML and leave the text (this answer suggests an approach using R but it doesn't work for me in this case) and it doesn't have the loop and so on, but the intent is the same.

require(RCurl)
require(XML)
links <- list()
links[1] <- "http://stocks.finance.yahoo.co.jp/stocks/detail/?code=7203"
links[2] <- "http://stocks.finance.yahoo.co.jp/stocks/detail/?code=7201"
txt <- getURL(links, .encoding = "UTF-8")
Encoding(txt) <- "bytes"
write.table(txt, "c:/geturl_r.txt", quote = FALSE, row.names = FALSE, sep = "\t", fileEncoding = "UTF-8")

This R script generates the output shown in the screenshot below. Basically rubbish. I assume that there is some combination of HTML, text and file encoding that will allow me to generate in R a similar result to that of the Perl solution, but I cannot find it. The header of the HTML page I'm trying to scrape says the charset is utf-8, and I have set the encoding in the getURL call and in the write.table function to utf-8, but this alone isn't enough.

The question: How can I scrape the above web page using R and save the text as CSV in "well-formed" Japanese text rather than something that looks like line noise?

Edit: I have added a further screenshot to show what happens when I omit the Encoding step. I get what look like Unicode codes, but not the graphical representation of the characters. So it may be some kind of locale-related issue, but in the exact same locale the Perl script does provide useful output. So this is still puzzling.
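For comparison, the pipeline the question is after (take the downloaded bytes, decode them using the page's declared charset, then write the text with an explicit file encoding) can be sketched in Python. This only illustrates the general idea and is not an R fix. The sample bytes stand in for a downloaded page, and `save_text` and the output file name are hypothetical.

```python
import os
import tempfile


def save_text(raw: bytes, path: str, source_encoding: str = "utf-8") -> str:
    """Decode raw page bytes and write them back out as UTF-8, independent of the locale."""
    text = raw.decode(source_encoding)
    # newline="" keeps line endings exactly as they appear in the decoded text.
    with open(path, "w", encoding="utf-8", newline="") as f:
        f.write(text)
    return text


# Hypothetical stand-in for the bytes returned by an HTTP fetch.
sample = "トヨタ自動車(株)".encode("utf-8")
out_path = os.path.join(tempfile.gettempdir(), "geturl_sketch.txt")
text = save_text(sample, out_path)

# Read the file back as bytes to confirm nothing was re-encoded along the way.
with open(out_path, "rb") as f:
    round_tripped = f.read()
```

The point of the sketch is that both the decode step and the write step name their encoding explicitly, so the result does not depend on the console or system locale.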

Read the article

FTP Upload ftpWebRequest Proxy

- by Rodney Vinyard

Searchable: FTP Upload ftpWebRequest Proxy, FTP command is not supported when using HTTP proxy

In the article below I will cover 2 topics:
1. C# & Windows Command-Line FTP Upload with No Proxy Server
2. C# & Windows Command-Line FTP Upload with Proxy Server

Not covered here: Secure FTP / SFTP

Sample Attributes:
· UploadFilePath = "\\servername\folder\file.name"
· Proxy Server = "ftp://proxy.server/"
· FTP Target Server = ftp.target.com
· FTP User = "User"
· FTP Password = "Password"

with No Proxy Server

· Windows Command-Line
> ftp ftp.target.com
> ftp User: User
> ftp Password: Password
> ftp put \\servername\folder\file.name
> ftp dir (result: file.name listed)
> ftp del file.name
> ftp dir (result: file.name deleted)
> ftp quit

· C#
//-----------------
//Start FTP via _TargetFtpProxy
//-----------------
string relPath = Path.GetFileName(@"\\servername\folder\file.name"); //result: relPath = "file.name"
FtpWebRequest ftpWebRequest = (FtpWebRequest)WebRequest.Create("ftp://ftp.target.com/" + relPath);
ftpWebRequest.Method = WebRequestMethods.Ftp.UploadFile;
//-----------------
//user - password
//-----------------
ftpWebRequest.Credentials = new NetworkCredential("user", "password");
//-----------------
// set proxy = null!
//-----------------
ftpWebRequest.Proxy = null;
//-----------------
// Copy the contents of the file to the request stream.
//-----------------
StreamReader sourceStream = new StreamReader(@"\\servername\folder\file.name");
byte[] fileContents = Encoding.UTF8.GetBytes(sourceStream.ReadToEnd());
sourceStream.Close();
ftpWebRequest.ContentLength = fileContents.Length;
//-----------------
// transfer the stream.
//-----------------
Stream requestStream = ftpWebRequest.GetRequestStream();
requestStream.Write(fileContents, 0, fileContents.Length);
requestStream.Close();
//-----------------
// Look at the response results
//-----------------
FtpWebResponse response = (FtpWebResponse)ftpWebRequest.GetResponse();
Console.WriteLine("Upload File Complete, status {0}", response.StatusDescription);

with Proxy Server

· Windows Command-Line
> ftp proxy.server
> ftp User: [email protected]
> ftp Password: Password
> ftp put \\servername\folder\file.name
> ftp dir (result: file.name listed)
> ftp del file.name
> ftp dir (result: file.name deleted)
> ftp quit

· C#
//-----------------
//Start FTP via _TargetFtpProxy
//-----------------
string relPath = Path.GetFileName(@"\\servername\folder\file.name"); //result: relPath = "file.name"
FtpWebRequest ftpWebRequest = (FtpWebRequest)WebRequest.Create("ftp://proxy.server/" + relPath);
ftpWebRequest.Method = WebRequestMethods.Ftp.UploadFile;
//-----------------
//user - password
//-----------------
ftpWebRequest.Credentials = new NetworkCredential("[email protected]", "password");
//-----------------
// set proxy = null!
//-----------------
ftpWebRequest.Proxy = null;
//-----------------
// Copy the contents of the file to the request stream.
//-----------------
StreamReader sourceStream = new StreamReader(@"\\servername\folder\file.name");
byte[] fileContents = Encoding.UTF8.GetBytes(sourceStream.ReadToEnd());
sourceStream.Close();
ftpWebRequest.ContentLength = fileContents.Length;
//-----------------
// transfer the stream.
//-----------------
Stream requestStream = ftpWebRequest.GetRequestStream();
requestStream.Write(fileContents, 0, fileContents.Length);
requestStream.Close();
//-----------------
// Look at the response results
//-----------------
FtpWebResponse response = (FtpWebResponse)ftpWebRequest.GetResponse();
Console.WriteLine("Upload File Complete, status {0}", response.StatusDescription);
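The same upload can be sketched with Python's standard `ftplib`, mostly to show the two moving parts side by side: the login name sent to the server and the binary transfer. The `USER@HOST` login form is an assumption about how the proxy expects to be told the target server; the article's own login string is not legible here. The helper names `proxy_login_name`, `upload` and `upload_file` are hypothetical.

```python
import io
from ftplib import FTP


def proxy_login_name(user: str, target_host: str) -> str:
    """Build a USER@HOST login name (assumed convention for FTP proxies)."""
    return f"{user}@{target_host}"


def upload(ftp, remote_name: str, data: bytes) -> str:
    """Send bytes with STOR in binary mode and return the server's reply line."""
    return ftp.storbinary(f"STOR {remote_name}", io.BytesIO(data))


def upload_file(host: str, user: str, password: str, remote_name: str, data: bytes) -> str:
    """Connect, log in and upload. Not called here because it needs a live server."""
    with FTP(host) as ftp:
        ftp.login(user, password)
        return upload(ftp, remote_name, data)


# Direct:     upload_file("ftp.target.com", "User", "Password", "file.name", data)
# Via proxy:  upload_file("proxy.server", proxy_login_name("User", "ftp.target.com"),
#                         "Password", "file.name", data)
```

Unlike the C# above, which reads the file through a text reader and re-encodes it as UTF-8, this sketch passes the bytes through unchanged, which matters for files that are not plain text.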

Read the article

How to parse a CSV file containing serialized PHP? [migrated]

- by garbetjie

I've just started dabbling in Perl, to try and gain some exposure to different programming languages - so forgive me if some of the following code is horrendous. I needed a quick and dirty CSV parser that could receive a CSV file, and split it into file batches containing "X" number of CSV lines (taking into account that entries could contain embedded newlines). I came up with a working solution, and it was going along just fine. However, one of the CSV files that I'm trying to split contains serialized PHP data, and this seems to break the CSV parsing. As soon as I remove the serialization, the CSV file is parsed correctly. Are there any tricks I need to know when it comes to parsing serialized data in CSV files? Here is a shortened sample of the code:

use strict;
use warnings;
my $csv = Text::CSV_XS->new({ eol => $/, always_quote => 1, binary => 1 });
my $out;
my $in;
open $in, "<:encoding(utf8)", "infile.csv" or die("cannot open input file $inputfile");
open $out, ">outfile.000";
binmode($out, ":utf8");
while (my $line = $csv->getline($in)) {
    $lines++;
    $csv->print($out, $line);
}

I'm never able to get into the while loop shown above. As soon as I remove the serialized data, I suddenly am able to get into the loop.
Edit: An example of a line that is causing me trouble (taken straight from Vim - hence the ^M): "26","other","1","20,000 Subscriber Plan","Some text here.^M\ Some more text","on","","18","","0","","0","0","recurring","0","","payment","totalsend","0","tsadmin","R34bL9oq","37","0","0","","","","","","","","","","","","","","","","","","","","","","","0","0","0","a:18:{i:0;s:1:\"3\";i:1;s:1:\"2\";i:2;s:2:\"59\";i:3;s:2:\"60\";i:4;s:2:\"61\";i:5;s:2:\"62\";i:6;s:2:\"63\";i:7;s:2:\"64\";i:8;s:2:\"65\";i:9;s:2:\"66\";i:10;s:2:\"67\";i:11;s:2:\"68\";i:12;s:2:\"69\";i:13;s:2:\"70\";i:14;s:2:\"71\";i:15;s:2:\"72\";i:16;s:2:\"73\";i:17;s:2:\"74\";}","","","0","0","","0","0","0.0000","0.0000","0","","","0.00","","6","1" "27","other","1","35,000 Subscriber Plan","Some test here.^M\ Some more text","on","","18","","0","","0","0","recurring","0","","payment","totalsend","0","tsadmin","R34bL9oq","38","0","0","","","","","","","","","","","","","","","","","","","","","","","0","0","0","a:18:{i:0;s:1:\"3\";i:1;s:1:\"2\";i:2;s:2:\"59\";i:3;s:2:\"60\";i:4;s:2:\"61\";i:5;s:2:\"62\";i:6;s:2:\"63\";i:7;s:2:\"64\";i:8;s:2:\"65\";i:9;s:2:\"66\";i:10;s:2:\"67\";i:11;s:2:\"68\";i:12;s:2:\"69\";i:13;s:2:\"70\";i:14;s:2:\"71\";i:15;s:2:\"72\";i:16;s:2:\"73\";i:17;s:2:\"74\";}","","","0","0","","0","0","0.0000","0.0000","0","","","0.00","","7","1" "28","other","1","50,000 Subscriber Plan","Some text here.^M\ Some more 
text","on","","18","","0","","0","0","recurring","0","","payment","totalsend","0","tsadmin","R34bL9oq","39","0","0","","","","","","","","","","","","","","","","","","","","","","","0","0","0","a:18:{i:0;s:1:\"3\";i:1;s:1:\"2\";i:2;s:2:\"59\";i:3;s:2:\"60\";i:4;s:2:\"61\";i:5;s:2:\"62\";i:6;s:2:\"63\";i:7;s:2:\"64\";i:8;s:2:\"65\";i:9;s:2:\"66\";i:10;s:2:\"67\";i:11;s:2:\"68\";i:12;s:2:\"69\";i:13;s:2:\"70\";i:14;s:2:\"71\";i:15;s:2:\"72\";i:16;s:2:\"73\";i:17;s:2:\"74\";}","","","0","0","","0","0","0.0000","0.0000","0","","","0.00","","8","1""73","other","8","10,000,000","","","","0","","0","","0","0","recurring","0","","payment","","0","","","75","0","10000000","","","","","","","","","","","","","","","","","","","","","","","0","0","0","a:17:{i:0;s:1:\"3\";i:1;s:1:\"2\";i:2;s:2:\"59\";i:3;s:2:\"60\";i:4;s:2:\"61\";i:5;s:2:\"62\";i:6;s:2:\"63\";i:7;s:2:\"64\";i:8;s:2:\"65\";i:9;s:2:\"66\";i:10;s:2:\"67\";i:11;s:2:\"68\";i:12;s:2:\"69\";i:13;s:2:\"70\";i:14;s:2:\"71\";i:15;s:2:\"72\";i:16;s:2:\"74\";}","","","0","0","","0","0","0.0000","0.0000","0","","","0.00","","14","0"
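One thing that stands out in the sample rows is that the quotes inside the serialized field are escaped with a backslash (`\"`) rather than doubled (`""`). A parser that expects doubled quotes will misread such a field, so the escape character is worth checking (Text::CSV_XS exposes it as the `escape_char` option). Whether this is the actual cause here is an assumption. The idea can be sketched with Python's `csv` module on a shortened, made-up row:

```python
import csv
import io

# Shortened, hypothetical row in the same style as the sample: quotes inside
# the serialized PHP value are escaped with a backslash.
sample = '"26","other","20,000 Subscriber Plan","a:2:{i:0;s:1:\\"3\\";i:1;s:1:\\"2\\";}","6","1"\r\n'

reader = csv.reader(
    io.StringIO(sample, newline=""),
    quotechar='"',
    escapechar="\\",    # treat \" as a literal quote inside a quoted field
    doublequote=False,  # do not interpret "" as an escaped quote
)
rows = list(reader)
serialized = rows[0][3]
```

With these settings the row comes back as six fields, and `serialized` holds the PHP value with plain quotes, ready to be written out again in whatever quoting style the output needs.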

Read the article

jQuery $.ajax Not Working in IE8 but it works on FireFox & Chrome

- by Sam3k

I have the following ajax call which works perfectly in Firefox and Chrome but not IE:

function getAJAXdates( startDate, numberOfNights, opts ) {
    var month = startDate.getMonth() + 1;
    var day = startDate.getDate();
    var year = startDate.getFullYear();
    var d = new Date();
    var randNum = Math.floor(Math.random()*100000000);
    $.ajax({
        type : "GET",
        dataType : "json",
        url : "/availability/ajax/bookings?rand="+randNum,
        cache : false,
        data : 'month='+month+'&day='+day+'&year='+year+'&nights='+numberOfNights,
        contentType : 'application/json; charset=utf8',
        success : function(data) {
            console.log('@data: '+data);
            insertCellData(data, opts, startDate);
        },
        error:function(xhr, status, errorThrown) {
            console.log('@Error: '+errorThrown);
            console.log('@Status: '+status);
            console.log('@Status Text: '+xhr.statusText);
        }
    });
}

I know for a fact that all the variables hold the right content and that $.ajax is indeed passing all the parameters and values. This is what I get on error:

LOG: @Error: undefined
LOG: @Status: parsererror
LOG: @Status Text: OK

I'm aware of the cache issue on IE and implemented a random parameter to work around it. Finally, these are the headers that are sent back from the backend:

header('Content-Type: application/json; charset=utf8');
header("Cache-Control: no-cache");
header("Expires: 0");
header('Access-Control-Max-Age: 3628800');
header('Access-Control-Allow-Methods: GET, POST, PUT, DELETE');
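One detail that might be worth ruling out is the charset label: both the request and the response headers spell it `utf8`, while the registered name is `utf-8`, and older clients may be stricter about the spelling. That this explains the `parsererror` is an assumption, not something the question confirms. A minimal Python sketch of a backend helper that emits the hyphenated label (the function name `json_response` is hypothetical):

```python
import json


def json_response(payload):
    """Serialize a payload and build response headers with the registered charset name."""
    body = json.dumps(payload).encode("utf-8")
    headers = {
        "Content-Type": "application/json; charset=utf-8",  # "utf-8", not "utf8"
        "Cache-Control": "no-cache",
        "Content-Length": str(len(body)),
    }
    return headers, body


headers, body = json_response({"nights": 3, "available": True})
```

Checking that the body is exactly the JSON document, with no stray output before or after it, is a related test, since any extra bytes would also make a strict JSON parser fail.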

Read the article

Parsing with BeautifulSoup, error message TypeError: coercing to Unicode: need string or buffer, NoneType found

- by Samsun Knight

So I'm trying to scrape an Amazon page for data, and I'm getting an error when I try to parse for where the seller is located. Here's my code:

#getting the html
request = urllib2.Request('http://www.amazon.com/gp/offer-listing/0393934241/')
opener = urllib2.build_opener()
#hiding that I'm a webscraper
request.add_header('User-Agent', 'Mozilla/5 (Solaris 10) Gecko')
#opening it up, putting into soup form
html = opener.open(request).read()
soup = BeautifulSoup(html, "html5lib")
#parsing for the seller info
sellers = soup.findAll('div', {'class' : 'a-row a-spacing-medium olpOffer'})
for eachseller in sellers:
    #parsing for price
    price = eachseller.find('span', {'class' : 'a-size-large a-color-price olpOfferPrice a-text-bold'})
    #parsing for shipping costs
    shippingprice = eachseller.find('span' , {'class' : 'olpShippingPrice'})
    #parsing for condition
    condition = eachseller.find('span', {'class' : 'a-size-medium'})
    #parsing for seller name
    sellername = eachseller.find('b')
    #parsing for seller location
    location = eachseller.find('div', {'class' : 'olpAvailability'})
    #printing it all out
    print "price, " + price.string + ", shipping price, " + shippingprice.string + ", condition," + condition.string + ", seller name, " + sellername.string + ", location, " + location.string

I get the error message, pertaining to the 'print' command at the end, "TypeError: coercing to Unicode: need string or buffer, NoneType found". I know that it's coming from this line - location = eachseller.find('div', {'class' : 'olpAvailability'}) - because the code works fine without that line, and I know that I'm getting NoneType because the line isn't finding anything. Here's the html from the section I'm looking to parse:

<*div class="olpAvailability"> In Stock. Ships from WI, United States. <*br/><*a href="/gp/aag/details/ref=olp_merch_ship_9/175-0430757-3801038?ie=UTF8&asin=0393934241&seller=A1W2IX7T37FAMZ&sshmPath=shipping-rates#aag_shipping">Domestic shipping rates</a> and <*a href="/gp/aag/details/ref=olp_merch_return_9/175-0430757-3801038?ie=UTF8&asin=0393934241&seller=A1W2IX7T37FAMZ&sshmPath=returns#aag_returns">return policy</a>. <*/div>

(but without the stars - just making sure the HTML doesn't compile out of code form)

I don't see what's the problem with the 'location' line of code, or why it's not pulling the data I want. Help?
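A possible explanation, offered as an assumption rather than a confirmed diagnosis: in BeautifulSoup, `.string` is `None` whenever a tag has more than one child, and this `div` contains text, a `<br/>` and two links. In that case `find` does return the tag, and it is `location.string` that is `None`; `get_text()` is the usual alternative. The idea of collecting all text under the element can be sketched with the standard library's `html.parser` (a stand-in for illustration, not BeautifulSoup itself; the class name `DivTextExtractor` and the shortened `href` values are made up):

```python
from html.parser import HTMLParser


class DivTextExtractor(HTMLParser):
    """Collect all text inside the first <div> that carries a given class."""

    def __init__(self, css_class):
        super().__init__()
        self.css_class = css_class
        self.depth = 0       # nesting depth inside the target div; 0 means outside
        self.done = False    # set once the target div has been closed
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if self.done:
            return
        if self.depth:
            if tag == "div":
                self.depth += 1
        elif tag == "div" and self.css_class in (dict(attrs).get("class") or "").split():
            self.depth = 1

    def handle_endtag(self, tag):
        if self.depth and tag == "div":
            self.depth -= 1
            if self.depth == 0:
                self.done = True

    def handle_data(self, data):
        if self.depth:
            self.chunks.append(data)

    def text(self):
        # Join the pieces and collapse runs of whitespace.
        return " ".join("".join(self.chunks).split())


html = ('<div class="olpAvailability"> In Stock. Ships from WI, United States. <br/>'
        '<a href="#rates">Domestic shipping rates</a> and '
        '<a href="#returns">return policy</a>. </div>')
parser = DivTextExtractor("olpAvailability")
parser.feed(html)
location_text = parser.text()
```

Here `location_text` contains the stock and shipping sentence together with the link text, which is the same kind of result a call that gathers all descendant text would give.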

Search Results

Search found 1003 results on 41 pages for 'utf8'.

Page 13/41 | < Previous Page | 9 10 11 12 13 14 15 16 17 18 19 20 | Next Page >
