Efficient way to calculate byte length of a character, depending on the encoding

Posted by BalusC on Stack Overflow See other posts from Stack Overflow or by BalusC
Published on 2010-04-28T00:27:18Z Indexed on 2010/04/28 0:33 UTC
Read the original article Hit count: 422

Filed under:

java

|

character

|

bytes

|

character-encoding

What's the most efficient way to calculate the byte length of a character, taking the character encoding into account? In UTF-8 for example the characters have a variable byte length, so each character needs to be determined individually. As far now I've come up with this:

char c = getItSomehow();
String encoding = "UTF-8";

int length = new String(new char[] { c }).getBytes(encoding).length;

But this is clumsy and inefficient in a loop since a new String needs to be created everytime. I can't find other and more efficient ways in the Java API. I imagine that this can be done with bitwise operations like bit shifting, but that's my weak point and I'm unsure how to take the encoding into account here :)

_{If you question the need for this, check this topic.}

© Stack Overflow or respective owner

Related posts about java

Tomcat 6: Access Control Exception?

as seen on Server Fault - Search for 'Server Fault'
I'm trying to setup a tomcat6 server, and I'm trying to match another setup someone else established. However, my deployment (default Ubuntu install) uses a policy.d/ directory structure, and the established server just uses a catalina.policy file. I've tried setting every entry in policy.d to match… >>> More
Problem in creation MDB Queue connection at Jboss StartUp

as seen on Stack Overflow - Search for 'Stack Overflow'
I am not able to create a Queue connection in JBOSS4.2.3GA Version & Java1.5, as I am using MDB as per the below details. I am putting this MDB in a jar file(named utsJar.jar) and copied it in deploy folder of JBOSS, In the test env. this MDB works well but in another env. [ env settings and… >>> More
failing to establish connection between Postgres db and gwt

as seen on Stack Overflow - Search for 'Stack Overflow'
Hi, I am using Postgres and gwt 2.0 for one of my applications. I am facing problem connecting to the database. When I try to connect it gives "ClassNotFoundException". Here is what I get when I try to connect to database: java.lang.ClassNotFoundException: org.postgresql.Driver at java.net… >>> More
failing to establish connection between postgre db and gwt

as seen on Stack Overflow - Search for 'Stack Overflow'
Hi, For i am using postgre and gwt 2.0 for one of my applications. I am facing problem connecting to the database. When i try to connect it gives "ClassNotFoundException". Here is what i get when i try to connect to database: java.lang.ClassNotFoundException: org.postgresql.Driver at java.net… >>> More
Migration and deployement problems JBoss 4.2.2.GA to JBoss 6.0.0.M2

as seen on Stack Overflow - Search for 'Stack Overflow'
Hi, I'm trying to migrate an application running on JBoss 4.2.2.GA to JBoss 6.0.0.M2 I give you some log to explain my problem : boot.log : 2010-03-16 09:59:29,406 ERROR [org.jboss.system.server.profileservice.ProfileServiceBootstrap] (Thread-2) Failed to load profile: Summary of incomplete deployments… >>> More

Related posts about character

two byte character or one byte character

as seen on Stack Overflow - Search for 'Stack Overflow'
Hi, How can I see if the input string is a two byte character or one byte character; and from which encoding system the character is coming from? I am using C# and SilverLight; I assume I could find the encoding the computer is running and then the character? Any code snippet? Thank you, Rune //… >>> More
Find the occurrence of word/character in SQL column with wildcard character - PATINDEX

as seen on Geeks with Blogs - Search for 'Geeks with Blogs'
CharIndex and PatIndex both can be used to determine the presence of character or string within sql column data. Both returns the starting position of the first occurrence of the character/word within expression. However, one major difference between CharIndex and PatIndex is that later allows the… >>> More
Passing string with (accidental) escape character loses character even though it's a raw string

as seen on Stack Overflow - Search for 'Stack Overflow'
I have a function with a python doctest that fails because one of the test input strings has a backslash that's treated like an escape character even though I've encoded the string as a raw string. My doctest looks like this: >>> infile = [ "Todo: fix me", "/** todo: fix", "* me"… >>> More
How to determine if a character is a chinese character

as seen on Stack Overflow - Search for 'Stack Overflow'
How to determine if a character is a chinese character use ruby? >>> More
Display label character by character using javascript

as seen on Stack Overflow - Search for 'Stack Overflow'
Hi, I am creating Hang a Man using PHP, MySQL & Javascript. Every thing is going perfect, I get a word randomly from DB show it as a label apply it a class where display = none. Now when I click on a Character that character become disable fine which i actually want but the label-character does… >>> More