Search Results

Search found 3 results on 1 page for 'yallaa'.

  • How to convert non-Latin-based encoded text into UTF-8, or make them coexist on the same page?

    - by Yallaa
    Good day, I have a script that scrapes the title/description of remote pages and prints those values into a page served as charset=UTF-8. Here is the problem: whenever a remote page uses a non-Latin encoding (Arabic, Russian, Chinese, Japanese, etc.), the imported values print as garbled text.

    I've tried passing those values through either iconv or mb_convert_encoding, but without much success. Then I tried detecting the remote encoding first and changing my presentation page's encoding to the remote one instead of the current UTF-8. That works for the imported values, but the existing UTF-8 content on the page gets garbled instead.

    Example: I try to import those values from a Russian windows-1251 page into my UTF-8 encoded page, which has mixed English/Russian content. I convert the imported non-UTF-8 string into UTF-8 using either iconv or mb_convert_encoding. I tried:

    $RemoteValue = iconv($RemoteEncoding, 'UTF-8', $RemoteValue);

    or

    $RemoteValue = mb_convert_encoding($RemoteValue, "UTF-8", $RemoteEncoding);

    or

    $RemoteValue = mb_convert_encoding($RemoteValue, "UTF-8", "auto");

    without success. If I detect that the remote page is windows-1251 encoded and change my presentation page (which already has UTF-8 encoded mixed-language content) to match the remote page, then the Japanese UTF-8 content on the existing page gets garbled...

    Can two differently encoded strings coexist on the same page (e.g. UTF-8 and windows-1251)? Am I using the converters correctly? Any hints as to why they don't work? Is there a better way to do this? Thank you for your help.
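
    A browser decodes a page with a single charset, so the usual approach is to convert every imported value to UTF-8 exactly once and leave the page as UTF-8. The sketch below illustrates that idea only; `to_utf8` is a hypothetical helper name, the sample bytes are an illustrative windows-1251 string, and it assumes the remote encoding has already been detected correctly (a wrong or empty `$RemoteEncoding`, or converting text that is already UTF-8, would both produce garbled output). Note also that `"auto"` in mb_convert_encoding expands to the configured detection order, which by default is unlikely to include windows-1251.

```php
<?php
// Hypothetical helper (not from the original post): convert a string from a
// known source charset to UTF-8, converting at most once.
function to_utf8(string $text, string $sourceEncoding): string
{
    // Text already declared as UTF-8 is returned untouched to avoid double conversion.
    if (strcasecmp(trim($sourceEncoding), 'UTF-8') === 0) {
        return $text;
    }

    // iconv() returns false when the charset is unknown or the input is invalid.
    $converted = @iconv($sourceEncoding, 'UTF-8', $text);
    if ($converted === false) {
        throw new RuntimeException("Could not convert from {$sourceEncoding} to UTF-8");
    }

    return $converted;
}

// Illustrative input: the bytes of the Russian word "Привет" in windows-1251.
$remoteValue = "\xCF\xF0\xE8\xE2\xE5\xF2";
$utf8Value = to_utf8($remoteValue, 'windows-1251');

// Escape for HTML output on the UTF-8 page.
echo htmlspecialchars($utf8Value, ENT_QUOTES, 'UTF-8'), PHP_EOL;
```

    If the converted text is still garbled, it may be worth dumping `bin2hex($remoteValue)` before conversion to check which encoding the bytes are really in.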

  • Detect remote charset in php

    - by yallaa
    Hello, I would like to determine a remote page's encoding by detecting the Content-Type meta tag, <meta http-equiv="Content-Type" content="text/html; charset=XXXXX" />, if present. I retrieve the remote page and run a regex to find the charset setting. I am still learning, hence the problem below...

    Here is what I have:

    $EncStart = 'charset=';
    $EncEnd = '" \/\>';
    preg_match( "/$EncStart(.*)$EncEnd/s", $RemoteContent, $RemoteEncoding );
    echo $RemoteEncoding[ 1 ];

    The above does echo the name of the encoding, but it does not know where to stop, so it prints the rest of the line and then most of the rest of the remote page in my test. Example: when testing a remote Russian page it printed:

    windows-1251" / rest of page ....

    This means that $EncStart was okay, but the $EncEnd part of the regex failed to stop the match. This meta tag usually ends in one of three ways after the name of the encoding:

    "> | "/> | " />

    I do not know whether this can be used to terminate the match and, if so, how to escape it. I played with different ways of doing it but none worked. Thank you in advance for lending a hand.
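
    The runaway match comes from `(.*)` being greedy: with the `s` flag the dot also matches newlines, so the capture extends to the last `" />` in the page. One way to stop at the encoding name is a negated character class that ends at the first quote, space, semicolon, slash or `>`, which covers all three endings listed above. This is a minimal sketch, not a full HTML parser; `detect_meta_charset` is a hypothetical name, and it assumes the charset appears inside a `<meta ...>` tag.

```php
<?php
// Hypothetical helper (not from the original post): return the charset named in
// a <meta> tag, lowercased, or null when none is found.
function detect_meta_charset(string $html): ?string
{
    // [^"'\s;\/>]+ captures the name and stops at the first quote, whitespace,
    // semicolon, slash or '>' so it cannot run on into the rest of the page.
    $pattern = '/<meta[^>]+charset\s*=\s*["\']?([^"\'\s;\/>]+)/i';

    if (preg_match($pattern, $html, $matches) === 1) {
        return strtolower($matches[1]);
    }

    return null;
}

// The three endings mentioned in the question.
$samples = [
    '<meta http-equiv="Content-Type" content="text/html; charset=windows-1251">',
    '<meta http-equiv="Content-Type" content="text/html; charset=windows-1251"/>',
    '<meta http-equiv="Content-Type" content="text/html; charset=windows-1251" />',
];

foreach ($samples as $sample) {
    echo detect_meta_charset($sample), PHP_EOL;
}
```

    The HTTP Content-Type response header, when the server sends one, normally takes precedence over the meta tag, so checking it first may be worthwhile.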

  • Parse a CSV file extracting some of the values but not all.

    - by Yallaa
    Good day, I have a local CSV file called DailyValues.csv whose values change daily. I need to extract the value field of category2 and category4, then combine, sort and remove duplicates (if any) from the extracted values, and finally save the result to a new local file, NewValues.txt.

    Here is an example of the DailyValues.csv file:

    category,date,value
    category1,2010-05-18,value01
    category1,2010-05-18,value02
    category1,2010-05-18,value03
    category1,2010-05-18,value04
    category1,2010-05-18,value05
    category1,2010-05-18,value06
    category1,2010-05-18,value07
    category2,2010-05-18,value08
    category2,2010-05-18,value09
    category2,2010-05-18,value10
    category2,2010-05-18,value11
    category2,2010-05-18,value12
    category2,2010-05-18,value13
    category2,2010-05-18,value14
    category2,2010-05-18,value30
    category3,2010-05-18,value16
    category3,2010-05-18,value17
    category3,2010-05-18,value18
    category3,2010-05-18,value19
    category3,2010-05-18,value20
    category3,2010-05-18,value21
    category3,2010-05-18,value22
    category3,2010-05-18,value23
    category3,2010-05-18,value24
    category4,2010-05-18,value25
    category4,2010-05-18,value26
    category4,2010-05-18,value10
    category4,2010-05-18,value28
    category4,2010-05-18,value11
    category4,2010-05-18,value30
    category2,2010-05-18,value31
    category2,2010-05-18,value32
    category2,2010-05-18,value33
    category2,2010-05-18,value34
    category2,2010-05-18,value35
    category2,2010-05-18,value07

    I've found some helpful parsing examples at http://www.php.net/manual/en/function.fgetcsv.php and managed to extract all the values of the value column, but I don't know how to restrict the extraction to category2 and category4, then sort the values and remove duplicates. The solution needs to be in PHP, Perl or shell script. Any help would be much appreciated. Thank you in advance.
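
    The steps described above (filter by category, combine, de-duplicate, sort, save) could be sketched in PHP roughly as follows. `extract_values` is a hypothetical helper name, and the sketch assumes the file layout shown in the example: a header row followed by category, date and value columns.

```php
<?php
// Hypothetical helper (not from the original post): collect the unique, sorted
// values of the third column for rows whose first column is in $wanted.
function extract_values(string $csvPath, array $wanted): array
{
    if (!is_readable($csvPath)) {
        throw new RuntimeException("Cannot read {$csvPath}");
    }

    $handle = fopen($csvPath, 'r');
    $values = [];

    // Skip the header row (category,date,value).
    fgetcsv($handle, 0, ',', '"', '\\');

    while (($row = fgetcsv($handle, 0, ',', '"', '\\')) !== false) {
        // Blank or malformed lines have fewer than three fields; ignore them.
        if (count($row) < 3) {
            continue;
        }
        [$category, , $value] = $row;
        if (in_array($category, $wanted, true)) {
            $values[] = $value;
        }
    }
    fclose($handle);

    // Remove duplicates, then sort as strings; sort() also reindexes the array.
    $values = array_unique($values);
    sort($values, SORT_STRING);

    return $values;
}

// Only runs when the input file from the question is present.
if (is_readable('DailyValues.csv')) {
    $values = extract_values('DailyValues.csv', ['category2', 'category4']);
    file_put_contents('NewValues.txt', implode(PHP_EOL, $values) . PHP_EOL);
}
```

    Collecting everything first and de-duplicating at the end keeps the loop simple; for very large files, using the value as an array key while reading would avoid holding duplicates in memory.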