I have a UTF-8 encoded char*.
Is there a standard function to calculate the number of visible characters represented by the byte array?
I'm on Red Hat (RHEL 5).
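One common approach, sketched here in Python rather than C, is to count every byte that is not a UTF-8 continuation byte (a byte of the form 10xxxxxx). This gives the number of code points, which only approximates "visible characters": combining marks, for example, are still counted separately. The function name is hypothetical and the input is assumed to be valid UTF-8.

```python
def count_code_points(data: bytes) -> int:
    """Count UTF-8 code points by skipping continuation bytes (10xxxxxx)."""
    # Every code point has exactly one lead byte; all other bytes match 10xxxxxx.
    return sum(1 for byte in data if (byte & 0xC0) != 0x80)


sample = "héllo".encode("utf-8")
print(len(sample), count_code_points(sample))
```

The same bit test carries over to a loop over a char* in C, since it only inspects one byte at a time.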
I'm using grep to filter the Mac OS X dictionary words file (by default located at /usr/share/dict/words).
I want to use grep to retrieve all words four characters long. How do I do this?
My best guess for how to do this was:
grep [:alpha:]{4} words
But that returns zero results.
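For comparison, the intended filter can be sketched in Python. It assumes "four characters long" means exactly four ASCII letters and that the words arrive one per line; the function name is made up for illustration.

```python
import re

# Exactly four ASCII letters; fullmatch() anchors the pattern at both ends.
FOUR_LETTERS = re.compile(r"[A-Za-z]{4}")


def four_letter_words(lines):
    """Yield the words that consist of exactly four ASCII letters."""
    for line in lines:
        word = line.strip()
        if FOUR_LETTERS.fullmatch(word):
            yield word


print(list(four_letter_words(["tree\n", "a\n", "house\n", "Zulu\n"])))
```

Anchoring matters here: without it, any word containing four consecutive letters would also pass.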
Below is my data:
Data Here 94/452O
Data more 94/4522i
Data bla 94/111
Data bla 94/459es
Data bla 94/444
items is automatically generated by some previous code but it could come out like:
items = ["Data Here 94/452O", "Data more 94/4522i", "Data bla 94/111", "Data bla 94/459es", "Data bla 94/444"]
Currently I'm applying the following:
"\n".join(items).replace("4ke", "9"), along with a few other .replace calls. However, I want it to change the letters at the end of the numbers to capital letters instead of lowercase.
Output:
Data Here 94/452O
Data more 94/4522I
Data bla 94/111
Data bla 94/459ES
Data bla 94/444
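A minimal sketch of one way to get that output, assuming the letters to change are always a lowercase run at the very end of each item, directly after a digit. The helper name is hypothetical.

```python
import re

# A run of lowercase letters at the end of the string, directly after a digit.
TRAILING_LETTERS = re.compile(r"(?<=\d)[a-z]+$")


def uppercase_suffix(item):
    """Uppercase the lowercase letters that trail the final number."""
    return TRAILING_LETTERS.sub(lambda match: match.group().upper(), item)


items = ["Data Here 94/452O", "Data more 94/4522i", "Data bla 94/111",
         "Data bla 94/459es", "Data bla 94/444"]
print("\n".join(uppercase_suffix(item) for item in items))
```

Passing a function to re.sub avoids a chain of .replace calls, one per possible suffix.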
I would like to create a query on a field which after a certain number of characters adds/displays a number of dots to show the user that there is additional text to read. At the moment there is a syntax error using the following code in which it doesn't like the "Left" instruction:
X:IIF(len(description) > 5, Left(description, 5) & "....", description)
Note: "X" is what I am naming the field 'description' in my query screen in Access.
Basically I have binary data. I don't mind if it's unreadable, but I'm writing it to a file which is parsed, so it's important that newline characters are taken out.
I thought I had done the right thing when I converted it to a string:
byte[] b = (byte[])SubKey.GetValue(v[i]);
s = System.Text.ASCIIEncoding.ASCII.GetString(b);
and then removed the newlines:
String t = s.Replace("\n", "");
but it's not working. Why?
I want to remove all non-alphanumeric and space characters from a string. So I do want spaces to remain. What do I put for a space in the below function within the [ ] brackets:
ereg_replace("[^A-Za-z0-9]", "", $title);
In other words, what symbol represents a space? I know \n represents a new line; is there any such symbol for a single space?
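In most regex dialects a literal space can simply be placed inside the brackets. The sketch below shows this in Python rather than PHP, so the function name is hypothetical and only the character class carries over.

```python
import re


def keep_alnum_and_spaces(title):
    """Remove everything except ASCII letters, digits and plain spaces."""
    # The space before the closing bracket is a literal space character.
    return re.sub(r"[^A-Za-z0-9 ]", "", title)


print(keep_alnum_and_spaces("Hello, World! 2nd-try"))
```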
Hi all,
I have a string representation exactly like 'ComputerName -- IPAddress', e.g.:
'samarena -- 192.168.1.97'
I want to get only the 'ComputerName' part from the actual representation by removing the other characters. I'm quite a beginner at using string.FormatMethods().
Please help me out.
Thanks.
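As a rough illustration in Python (not the language of the original code), splitting on the separator is usually enough. This assumes the separator is always " -- " and appears at least once; the function name is hypothetical.

```python
def computer_name(entry):
    """Return the part before the first ' -- ' separator."""
    return entry.split(" -- ", 1)[0].strip()


print(computer_name("samarena -- 192.168.1.97"))
```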
Write an assembly program to input keystrokes from the PC’s keyboard and display the characters on the system monitor. Pressing any of the function keys F1-F10 should cause the program to end.
I have no idea how to write this code, and need some help, please.
Suppose you want to generate dynamic page titles that look like this:
"It was all a dream, I used to read word up magazine" from "Juicy" by The Notorious B.I.G
I.e., "LYRICS" from "SONG_NAME" by ARTIST
However, your title can only be 69 characters total and this template will sometimes generate titles that are longer.
One strategy for solving this problem is to truncate the entire string to 69 characters. However, a better approach is to truncate the less important parts of the string first. I.e., your algorithm might look something like this:
1. Truncate the lyrics until the entire string is <= 69 characters
2. If you still need to truncate, truncate the artist name until the entire string is <= 69 characters
3. If you still need to truncate, truncate the song name until the entire string is <= 69 characters
4. If all else fails, truncate the entire string to 69 characters
Ideally the algorithm would also limit the amount each part of the string could be truncated. E.g., step 1 would really be "Truncate the lyrics to a minimum of 10 characters until the entire string is <= 69 characters"
Since this is such a common situation, I was wondering if someone has a library or code snippet that can take care of it.
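The steps above can be sketched in Python as follows. This is only one possible reading of the algorithm: the function name, the parameter names and the choice to use the same minimum of 10 characters for every part are assumptions, not an existing library.

```python
def build_title(lyrics, song, artist, limit=69, min_len=10):
    """Build '"LYRICS" from "SONG_NAME" by ARTIST', trimming less important parts first."""
    parts = {"lyrics": lyrics, "song": song, "artist": artist}

    def render():
        return '"{lyrics}" from "{song}" by {artist}'.format(**parts)

    # Trim in order of importance: lyrics first, then artist, then song name.
    for key in ("lyrics", "artist", "song"):
        excess = len(render()) - limit
        if excess <= 0:
            break
        # Never cut a part below min_len (or below its own length if shorter).
        floor = min(min_len, len(parts[key]))
        parts[key] = parts[key][:max(floor, len(parts[key]) - excess)]

    # Last resort: hard-truncate the whole string.
    return render()[:limit]


title = build_title("It was all a dream, I used to read word up magazine",
                    "Juicy", "The Notorious B.I.G")
print(len(title), title)
```

Keeping the parts in a dictionary and re-rendering after each cut means the template can change without touching the trimming logic.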
I need a robust and simple way to remove illegal path and file characters from a simple string. I've used the code below, but it doesn't seem to do anything. What am I missing?
using System;
using System.IO;

namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {
            string illegal = "\"M<>\"\\a/ry/ h**ad:>> a\\/:*?\"<>| li*tt|le|| la\"mb.?";
            illegal = illegal.Trim(Path.GetInvalidFileNameChars());
            illegal = illegal.Trim(Path.GetInvalidPathChars());
            Console.WriteLine(illegal);
            Console.ReadLine();
        }
    }
}
I feel kind of dumb posting this when this seems kind of simple and there are tons of questions on strings/characters/regex, but I couldn't find quite what I needed (except in another language: http://stackoverflow.com/questions/2176544/remove-all-text-after-certain-point).
I've got the following code:
[Test]
public void stringManipulation()
{
    String filename = "testpage.aspx";
    String currentFullUrl = "http://localhost:2000/somefolder/myrep/test.aspx?q=qvalue";
    String fullUrlWithoutQueryString = currentFullUrl.Replace("?.*", "");
    String urlWithoutPageName = fullUrlWithoutQueryString.Remove(fullUrlWithoutQueryString.Length - filename.Length);
    String expected = "http://localhost:2000/somefolder/myrep/";
    String actual = urlWithoutPageName;
    Assert.AreEqual(expected, actual);
}
I tried the solution in the question above (hoping the syntax would be the same!) but nope. I want to first remove the queryString which could be any variable length, then remove the page name, which again could be any length.
How can I remove the query string from the full URL such that this test passes?
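For illustration only, here is how the two steps (drop the query string, then drop the page name) might look in Python using the standard urllib.parse module instead of string replacement. The function name is hypothetical, and the page name is assumed to be whatever follows the last slash of the path.

```python
from urllib.parse import urlsplit, urlunsplit


def directory_url(url):
    """Strip the query string and the page name, keeping the trailing slash."""
    parts = urlsplit(url)
    # Everything up to the last "/" of the path is the directory.
    directory = parts.path.rsplit("/", 1)[0] + "/"
    # Rebuild the URL with an empty query and an empty fragment.
    return urlunsplit((parts.scheme, parts.netloc, directory, "", ""))


print(directory_url("http://localhost:2000/somefolder/myrep/test.aspx?q=qvalue"))
```

Parsing the URL first means neither the query string nor the page name needs a known length.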
I'm trying to identify and condense single (uppercase) characters in a string.
For example:
"test A B test" - "test AB test"
"test A B C test" - "test ABC test"
"test A B test C D E test" - "test AB test CDE test"
I have it working for single occurrences (as in the first above example), but cannot figure out how to chain it for multiple occurrences.
$str =~ s/ ([A-Z]) ([A-Z]) / \1\2 /g;
I'll probably feel stupid when I see the solution, but I'm prepared for that. Thanks in advance.
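One way to handle chained occurrences is to stop consuming the second letter and check it with a lookahead instead, so it can start the next match. The sketch below shows the idea in Python rather than Perl; the function name is hypothetical.

```python
import re

# A standalone capital, a space, then (without consuming it) another standalone capital.
SINGLE_CAPS = re.compile(r"\b([A-Z]) (?=[A-Z]\b)")


def condense(text):
    """Join runs of single uppercase letters that are separated by one space."""
    return SINGLE_CAPS.sub(r"\1", text)


for sample in ("test A B test", "test A B C test", "test A B test C D E test"):
    print(condense(sample))
```

Because the lookahead leaves the following letter unconsumed, "A B C" is handled in one pass: "A " is removed first, then "B ".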
I'm having trouble reading special characters from stdin.
Here are my attempts:
import os
dir = raw_input("Dir name: ")
Dir name: c:/á
os.chdir(dir)
WindowsError: [Error 2] The system cannot find the file specified: 'c:/\x81\xe1'
Ok, so I tried to get the default system encoding and recode the string from stdin:
import locale
encoding = locale.getdefaultlocale()[1]
print encoding
cp1252
unicode(dir, encoding)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "c:\Python26\lib\encodings\cp1252.py", line 15, in decode
return codecs.charmap_decode(input,errors,decoding_table)
UnicodeDecodeError: 'charmap' codec can't decode byte 0x81 in position 3: character maps to <undefined>
Now, I don't know how to solve this. Nor can I understand - why is there a problem when I try to access a directory with a name written in the system default encoding itself??
I was trying to send a GET request to Twitter (user ID replaced for privacy reasons) using Net::HTTP:
url = URI.parse("http://api.twitter.com/1/friends/ids.json?user_id=12345")
resp = Net::HTTP.get_response(url)
This throws an exception in Net::HTTP:
NoMethodError: undefined method `empty?' for #<URI::HTTP:0x59f5c04>
from /System/Library/Frameworks/Ruby.framework/Versions/1.8/usr/lib/ruby/1.8/net/http.rb:1470:in `initialize'
Just by coincidence, I stumbled upon a similar code snippet, which used URI.encode prior to URI.parse, so I copied that and tried again:
url = URI.parse(URI.encode("http://api.twitter.com/1/friends/ids.json?user_id=12345"))
resp = Net::HTTP.get_response(url)
Now it works fine, but why? There are no reserved characters that need escaping in the URL I mentioned, so why do I have to call URI.encode for get_response to succeed?
I'm trying to get vim to display my tabs as ? so they cannot be mistaken for actual characters. I'd hoped the following would work:
if has("multi_byte")
set lcs=tab:?
else
set lcs=tab:>-
endif
However, this gives me
E474: Invalid argument: lcs=tab:?
The file is UTF-8 encoded and includes a BOM.
Googling "vim encoding" or similar gives me many results about the encoding of edited files, but nothing about the encoding of executed scripts. How to get this character into my .vimrc so that it is properly displayed?
My user enters message with special characters as follows:
this is a test of &&, &&, % as well as '' "", " instead of this is a test of &, &&, % as well as '' "", "
I need to construct XML as follows:
<?xml version='1.0' encoding='iso-8859-1'?>
<success>
<message>
this is a test of &, &&, % as well as '' "", " instead of this is a test of &, &&, % as well as '' "", "
</message>
</success>
Can anybody help me?
Thanks in advance
I'm trying to create random strings of characters. I'm wondering if there might be a more efficient way. Here's my algorithm:
string RANDOM = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz#@$^*()";
StringBuilder sb = new StringBuilder();
int length = rand.Next(10) + 1;
for (int idx = 0; idx < length; ++idx)
{
    sb.Append(RANDOM[rand.Next(RANDOM.Length)]);
}
string RandomString = sb.ToString();
I'm wondering if the StringBuilder is the best choice. Also if selecting a random character from my RANDOM string is the best way.
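For comparison, the same algorithm can be sketched in Python: pick a length from 1 to 10, then draw that many characters from the alphabet. The names below are hypothetical, and the random module used here is not suitable for security-sensitive strings.

```python
import random

ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz#@$^*()"


def random_string(rng=random):
    """Return 1 to 10 characters drawn uniformly from ALPHABET."""
    length = rng.randint(1, 10)  # inclusive on both ends
    return "".join(rng.choices(ALPHABET, k=length))


print(random_string())
```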
This displays the expected JavaScript alert message box:
RadAjaxManager1.ResponseScripts.Add("alert('blahblahblah');");
while these do not:
RadAjaxManager1.ResponseScripts.Add("alert('blahblah \n blahblahblah');");
RadAjaxManager1.ResponseScripts.Add("alert('blahblah \r blahblahblah');");
RadAjaxManager1.ResponseScripts.Add("alert('blahblah \r\n blahblahblah');");
RadAjaxManager1.ResponseScripts.Add("alert('blahblah \n\t blahblahblah');");
RadAjaxManager1.ResponseScripts.Add(@"alert('blahblah \n blahblahblah');");
string message = "blahblahblah \n blahblahblah";
RadAjaxManager1.ResponseScripts.Add(message);
I can't find any documentation on escape characters breaking this. I understand the single string argument to the Add method can be any script. No error is thrown, so my best guess is malformed javascript.
Given a java.lang.String instance, I want to verify that it doesn't contain any unicode characters that are not ASCII alphanumerics. e.g. The string should be limited to [A-Za-z0-9.]. What I'm doing now is something very inefficient:
import org.apache.commons.lang.CharUtils;
String s = ...;
char[] ch = s.toCharArray();
for (int i = 0; i < ch.length; i++)
{
    // Reject the first character that is not an ASCII letter or digit.
    if (!CharUtils.isAsciiAlphanumeric(ch[i]))
        throw new InvalidInput(ch[i] + " is invalid");
}
Is there a better way to solve this ?
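One alternative is a single regular-expression search for the first character outside the allowed set. The sketch below shows the idea in Python rather than Java, using the [A-Za-z0-9.] set from the question; the function name and the use of ValueError are assumptions.

```python
import re

# Any character that is not an ASCII letter, digit or dot.
INVALID = re.compile(r"[^A-Za-z0-9.]")


def validate(text):
    """Return text unchanged, or raise if it contains a disallowed character."""
    match = INVALID.search(text)
    if match:
        raise ValueError(f"{match.group()!r} is invalid")
    return text


print(validate("abc.123"))
```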
I've been trying to figure out this blasted regex for two hours! It's midnight, and I have to figure this out and go to bed!
String str = new String("filename\\");
if (str.matches(".*[?/<>|*:\"{\\}].*")) {
    System.out.println("match");
} else {
    System.out.println("no match");
}
".*[?/<>|*:\"{\\}].*" is my regex expression. It catches everything correctly except the backslash!!! I need to know how to make it catch the backslash correctly please help!
FYI, the illegal characters i'm trying to catch are
? \ / < | * : "
I've got it working except for the backslash.
I have two rows that have a varchar column whose values are different according to a Java .equals(). I can't easily change or debug the Java code that's running against this particular database, but I do have access to run queries directly against the database using SQLDeveloper. The fields look the same to me (they are street addresses with two lines separated by some new line or carriage return/new line combo).
Is there a way to see all of the hidden characters as the result of a query? I'd like to avoid having to use the ascii() function with substr() on each of the rows to figure out which hidden character is different.
I'd also accept some query that shows me which character is the first difference between the two fields.
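Outside the database, the "first difference" check is short to sketch. The Python below assumes the two values have already been copied out as strings; the sample addresses are made up, and repr() is used to make control characters such as \r visible.

```python
def first_difference(a, b):
    """Return the index of the first differing character, or None if equal."""
    for index, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return index
    # One string is a prefix of the other: they differ where the shorter ends.
    if len(a) != len(b):
        return min(len(a), len(b))
    return None


# Hypothetical sample values, differing only in the line separator.
line_a = "12 Main St\r\nApt 4"
line_b = "12 Main St\nApt 4"
print(repr(line_a))
print(repr(line_b))
print(first_difference(line_a, line_b))
```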
Hi
I want to remove all characters from a string other than a-z and A-Z. I created the following function for this and it works fine.
public String stripGarbage(String s) {
    String good = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789abcdefghijklmnopqrstuvwxyz";
    String result = "";
    for (int i = 0; i < s.length(); i++) {
        if (good.indexOf(s.charAt(i)) >= 0) {
            result += s.charAt(i);
        }
    }
    return result;
}
Can anyone tell me a better way to achieve the same? A regex may be a better option.
Regards
Harry
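A sketch of the same filter in Python, not a definitive answer: it keeps the whitelist approach of the function above (letters and digits, as in the good string) but tests membership against a set and joins the kept characters once instead of concatenating in a loop. The names are hypothetical.

```python
import string

# Same whitelist as the "good" string above: ASCII letters and digits.
GOOD = frozenset(string.ascii_letters + string.digits)


def strip_garbage(text):
    """Keep only ASCII letters and digits."""
    return "".join(ch for ch in text if ch in GOOD)


print(strip_garbage("Hello, World! 123"))
```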
I have a Lucene index that has several documents in it. Each document has multiple fields such as:
Id
Project
Name
Description
The Id field will be a unique identifier such as a GUID, Project is a user's ProjectID and a user can only view documents for their project, and Name and Description contain text that can have special characters.
When a user performs a search on the Name field, I want to be able to attempt to match the best I can such as:
First
Will return both:
First.Last
and
First.Middle.Last
Name can also be something like:
Test (NameTest)
Where, if a user types in 'Test', 'Name', or '(NameTest)', then they can find the result.
However, if I say that Project is 'ProjectA' then that needs to be an exact match (case insensitive search). The same goes with the Id field.
Which fields should I set up as Tokenized and which as Untokenized? Also, is there a good Analyzer I should consider to make this happen?
I am stuck trying to decide the best route to implement the desired searching.
I have a delphi form:
and my code:
when I run this form in Windows 7, I see:
At design time, the form had Polish letters in the first label, but it doesn't have them at runtime. It looks OK on Vista or Windows XP. When I set the caption of the second label in code, everything works fine and the characters are properly encoded.
First 5 codes of top label on Windows 7: 65 97 69 101 83
First 5 codes of top label on Windows Vista/XP: 165 185 202 234 140
First 5 codes of bottom label on every system: 165 185 202 234 140
Windows 7 changes encoding, why? My system settings seem to be ok. I have proper language set for non-unicode applications in control panel.