Search Results

Search found 5306 results on 213 pages for 'trailing character'.

  • How to debug anomalous C memory/stack problems

    - by EBM
    Hello, Sorry I can't be specific with code, but the problems I am seeing are anomalous. Character string values seem to be getting changed depending on other, unrelated code. For example, the value of the argument that is passed around below will change merely depending on whether I comment out one or two of the fprintf() calls! By the last fprintf() the value is typically completely empty (and no, I have checked to make sure I am not modifying the argument directly... all I have to do is comment out a fprintf() or add another fprintf() and the value of the string will change at certain points!): static void process_args(char *arg) { /* debug */ fprintf(stderr, "Function arg is %s\n", arg); ...do a bunch of stuff including call another function that uses alloc()... /* debug */ fprintf(stderr, "Function arg is now %s\n", arg); } int main(int argc, char *argv[]) { char *my_string; ... do a bunch of stuff ... /* just to show you it's nothing to do with the argv array */ my_string = strdup(argv[1]); /* debug */ fprintf(stderr, "Argument 1 is %s\n", my_string); process_args(my_string); } There's more code all around, so I can't ask for someone to debug my program -- what I want to know is HOW can I debug why character strings like this are getting their memory changed or overwritten based on unrelated code. Is my memory limited? My stack too small? How do I tell? What else can I do to track down the issue? My program isn't huge, it's like a thousand lines of code give or take and a couple of dynamically linked external libs, but nothing out of the ordinary. HELP! TIA!

  • regex to get current page or directory name?

    - by John Isaacks
    I am trying to get the page or last directory name from a URL. For example, if the URL is http://www.example.com/dir/ I want it to return dir, or if the passed URL is http://www.example.com/page.php I want it to return page. Notice I do not want the trailing slash or file extension. I tried this: $regex = "/.*\.(com|gov|org|net|mil|edu)/([a-z_\-]+).*/i"; $name = strtolower(preg_replace($regex,"$2",$url)); I ran this regex in PHP and it returned nothing. (However, I tested the same regex in ActionScript and it worked!) So what am I doing wrong here, and how do I get what I want? Thanks!!!
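One likely cause of the empty result is that the pattern uses "/" as its PCRE delimiter while also containing an unescaped "/" in the middle, so everything after that slash is read as modifiers. Independent of the regex, the same extraction can be sketched without a pattern at all. The following is a minimal Python sketch, not the asker's PHP; the function name `last_segment` is hypothetical.

```python
import posixpath
from urllib.parse import urlsplit


def last_segment(url):
    """Return the last path segment without trailing slash or extension."""
    # urlsplit isolates the path, so the host's TLD never needs matching.
    path = urlsplit(url).path.rstrip("/")
    name = posixpath.basename(path)
    # splitext drops ".php" and similar; lower() mirrors strtolower().
    return posixpath.splitext(name)[0].lower()
```

Parsing the URL first avoids having to enumerate top-level domains such as com, gov, or org in the pattern.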

  • Can I write this regex in one step?

    - by Marin Doric
    This is the input string "23x +y-34 x + y+21x - 3y2-3x-y+2". I want to surround every '+' and '-' character with whitespace, but only if it is not already surrounded on the left or right side. So my input string would look like this "23x + y - 34 x + y + 21x - 3y2 - 3x - y + 2". I wrote this code that does the job: Regex reg1 = new Regex(@"\+(?! )|\-(?! )"); input = reg1.Replace(input, delegate(Match m) { return m.Value + " "; }); Regex reg2 = new Regex(@"(?<! )\+|(?<! )\-"); input = reg2.Replace(input, delegate(Match m) { return " " + m.Value; }); explanation: reg1 // Match '+' followed by any character not ' ' (whitespace) or same thing for '-' reg2 // Same thing only that I match '+' or '-' not preceded by ' ' (whitespace) delegate 1 and 2 just insert " " before and after m.Value (match value) Question is, is there a way to create just one regex and just one delegate? i.e. do this job in one step? I am new to regex and I want to learn an efficient way.
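One way to do this in a single pass is to let the pattern consume an optional space on each side of the operator and always write back exactly one. The sketch below uses Python's `re` module rather than the .NET `Regex` class from the question; `space_operators` is a hypothetical name.

```python
import re

# An optional space, the operator (captured), then an optional space.
OPERATOR = re.compile(r" ?([+-]) ?")


def space_operators(expr):
    """Surround every '+' and '-' with exactly one space on each side."""
    return OPERATOR.sub(r" \1 ", expr)
```

Note that a leading sign such as "-x" would also gain a space in front of it, so this sketch assumes operators only appear between terms.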

  • How does real-time collaboration with multiple clients work in a system using operation transformati

    - by Saikat Chakrabarti
    I just finished reading High-Latency, Low-Bandwidth Windowing in the Jupiter Collaboration System and I mostly followed everything until part 6: global consistency. This part describes how the system described in the paper can be extended to accommodate multiple clients connected to the server. However, the explanation is very short and essentially says the system will work if the central server merely forwards client messages to all the other clients. I don't really understand how this works though. What state vector would be sent in the message that is sent to all the other clients? Does the server maintain separate state vectors for each client? Does it maintain a separate copy of the widgets locally for each client? The simple example I can think of is this setup: imagine client A, server, and client B with client A and client B both connected to the server. To start, all three have the state object "ABCD". Then, client A sends the message "insert character F at position 0" at the same time client B sends the message "insert character G at position 0" to the server. It seems like simply relaying client A's message to client B and vice versa doesn't actually handle this case. So what exactly does the server do?
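The "ABCD" scenario can be worked through with a toy transform function for concurrent inserts. This is a simplified sketch of the general operational-transformation idea, not the Jupiter paper's algorithm: it omits state vectors entirely, and the `Insert` type, the `site` tie-breaker field, and both function names are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Insert:
    pos: int
    char: str
    site: str  # hypothetical tie-breaker: lower site id wins the earlier slot


def apply_op(doc, op):
    """Apply an insert to a document string."""
    return doc[:op.pos] + op.char + doc[op.pos:]


def transform(op, against):
    """Rewrite op so it is valid after `against` has already been applied."""
    shifted = op.pos > against.pos or (
        op.pos == against.pos and op.site > against.site
    )
    return Insert(op.pos + 1, op.char, op.site) if shifted else op
```

In this sketch the server applies A's insert, transforms B's insert against it before applying and forwarding it, and each client transforms the incoming operation against its own pending one. All three copies then converge on the same string.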

  • IDN aware tools to encode/decode human readable IRI to/from valid URI

    - by Denis Otkidach
    Let's assume a user enter address of some resource and we need to translate it to: <a href="valid URI here">human readable form</a> HTML4 specification refers to RFC 3986 which allows only ASCII alphanumeric characters and dash in host part and all non-ASCII character in other parts should be percent-encoded. That's what I want to put in href attribute to make link working properly in all browsers. IDN should be encoded with Punycode. HTML5 draft refers to RFC 3987 which also allows percent-encoded unicode characters in host part and a large subset of unicode in both host and other parts without encoding them. User may enter address in any of these forms. To provide human readable form of it I need to decode all printable characters. Note that some parts of address might not correspond to valid UTF-8 sequences, usually when target site uses some other character encoding. An example of what I'd like to get: <a href="http://xn--80aswg.xn--p1ai/%D0%BF%D1%83%D1%82%D1%8C?%D0%B7%D0%B0%D0%BF%D1%80%D0%BE%D1%81"> http://????.??/???????????</a> Are there any tools to solve these tasks? I'm especially interested in libraries for Python and JavaScript.
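A rough sketch of both directions using only the Python standard library follows. It assumes the address has no port or userinfo, that percent-escapes encode UTF-8, and that Python's built-in `idna` codec (which implements the older IDNA 2003 rules) is acceptable. The function names are hypothetical.

```python
from urllib.parse import quote, unquote, urlsplit, urlunsplit


def iri_to_uri(iri):
    """Encode a human-readable address into an ASCII-only URI."""
    p = urlsplit(iri)
    host = p.hostname.encode("idna").decode("ascii")  # Punycode the host
    # "%" stays in the safe set so existing escapes are not double-encoded.
    path = quote(p.path, safe="/%")
    query = quote(p.query, safe="=&%")
    fragment = quote(p.fragment, safe="%")
    return urlunsplit((p.scheme, host, path, query, fragment))


def uri_to_iri(uri):
    """Decode an ASCII URI into a human-readable form for display."""
    p = urlsplit(uri)
    host = p.hostname.encode("ascii").decode("idna")
    return urlunsplit(
        (p.scheme, host, unquote(p.path), unquote(p.query), unquote(p.fragment))
    )
```

`unquote` replaces byte sequences that are not valid UTF-8 with U+FFFD and also decodes reserved characters such as %2F, so the readable form is for display only and should not be re-encoded when the original contained such sequences.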

  • Opposite of nl2br? Is it str_replace?

    - by Julian H. Lam
    So the function nl2br is handy. Except in my web app, I want to do the opposite: interpret line breaks as new lines, since they will be echoed into a pre-filled form. str_replace can take <br /> and replace it with whatever I want, but if I put in \n, it echoes literally a backslash and an n. It only works if I put a literal line break in the middle of my script, and break the indentation (so there are no trailing spaces). See: <?=str_replace('<br />',' ',$foo)?> Am I missing escape characters? I think I tried every combination...
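In PHP, escape sequences such as \n are only interpreted inside double-quoted strings, which would explain the literal backslash and n when single quotes are used. The reverse transformation itself can be sketched as follows in Python; `br2nl` is a hypothetical name, and the pattern assumes the tags look like `<br>`, `<br/>`, or `<br />`.

```python
import re

# A <br> tag in any common spelling, plus the newline nl2br leaves after it.
BR_TAG = re.compile(r"<br\s*/?>(\r?\n)?", re.IGNORECASE)


def br2nl(html):
    """Replace each <br> tag (and any newline right after it) with one newline."""
    return BR_TAG.sub("\n", html)
```

Consuming the optional newline after the tag avoids doubling line breaks, since nl2br keeps the original newline next to the tag it inserts.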

  • Regular expression replace in PL/pgSQL

    - by dreamlax
    If I have the following input (excluding quotes): "The ancestral territorial imperatives of the trumpeter swan" How can I collapse all multiple spaces to a single space so that the input is transformed to: "The ancestral territorial imperatives of the trumpeter swan" This is going to be used in a trigger function on insert/update (which already trims leading/trailing spaces). Currently, it raises an exception if the input contains multiple adjacent spaces, but I would rather it simply transforms it into something valid before inserting. What is the best approach? I can't seem to find a regular-expression replace function for PL/pgSQL. There is a text_replace function, but this will only collapse at most two spaces down to one (meaning three consecutive spaces will collapse to two). Calling this function over and over is not ideal.
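The collapse can be expressed as a single global regular-expression substitution. The sketch below shows the idea in Python; in PostgreSQL the built-in `regexp_replace` function with the 'g' flag should allow the same pattern inside a trigger, though that is not tested here. `collapse_spaces` is a hypothetical name.

```python
import re

# Two or more consecutive spaces.
MULTI_SPACE = re.compile(r" {2,}")


def collapse_spaces(text):
    """Collapse every run of spaces down to a single space in one pass."""
    return MULTI_SPACE.sub(" ", text)
```

Because the pattern matches the whole run at once, three or more spaces collapse in a single call with no need to repeat the replacement.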

  • Jquery Autocomplete plugin with Django (Trey Piepmeier solution)

    - by Sally
    So, I'm basing my code on Trey's solution on: http://solutions.treypiepmeier.com/2009/12/10/using-jquery-autocomplete-with-django/ The script is: <script> $(function() { $('#id_members').autocomplete('{{ object.get_absolute_url }}members/lookup', { dataType: 'json', width: 200, parse: function(data) { return $.map(data, function(row) { return { data:row, value:row[1], result:row[0] }; }); } }).result( function(e, data, value) { $("#id_members_pk").val(value); } ); } ); </script> The views.py: def members_lookup(request, pid): results = [] if request.method == "GET": if request.GET.has_key(u'q'): value = request.GET[u'q'] # Ignore queries shorter than length 1 if len(value) > 2: model_results = Member.objects.filter( Q(user__first_name__icontains=value) | Q(user__last_name__icontains=value) ) results = [ (x.user.get_full_name(), x.id) for x in model_results ] json = simplejson.dumps(results) print json return HttpResponse(json, mimetype='application/json') The problem is: It stops refining the search results after the initial lookup. For example: If I set len(value) 2, after I type the 3rd character it will give me a list of suggestions. But if I keep on typing the 4th or 5th character, the list of suggestions doesn't change. Any suggestions on why this is?

  • Sharepoint designer is replacing french characters with &#65533;

    - by chris
    First of all, I'm not a web designer, I'm a programmer, so I'm working a bit out of my knowledge area. However, as the person in my office who has some working knowledge of French, I'm stuck with this issue. The Problem: Sharepoint Designer is replacing all French accented characters with the &#65533; (square box or diamond-? �) character. It doesn't appear to matter if I enter the 'é' character as alt-130 (in either design or source) or as &eacute;. Everything works fine when editing, but when the file is saved and loaded into a browser, it replaces the characters. When reloading into designer, the file shows the 65533 symbol. EDIT: More info. I use &#233; and save, then close SP designer. Reloading SP designer will show the é (instead of the code) in source. The next reload will have replaced it with &#65533; Question 1: (more important) HOW DO I STOP THIS!? Question 2: (more interesting) Why does this happen? Charset is iso-8859-1
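Code point 65533 is U+FFFD, the Unicode replacement character, which decoders emit when they meet bytes that are invalid in the encoding they expect. One plausible mechanism, offered only as an assumption about what the editor does, is that text saved as ISO-8859-1 is later read as UTF-8. A small Python illustration, with `misread_as_utf8` as a hypothetical name:

```python
def misread_as_utf8(text):
    """Encode text as ISO-8859-1, then decode those bytes as if they were UTF-8."""
    raw = text.encode("iso-8859-1")              # 'é' becomes the single byte 0xE9
    return raw.decode("utf-8", errors="replace")  # 0xE9 alone is invalid UTF-8
```

If this is what happens, making the declared charset and the encoding the file is actually saved in agree would be the first thing to check.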

  • How can I read a DBF file with incorrectly defined column data types using ADO.NET?

    - by Jason
    I have several DBF files generated by a third party that I need to be able to query. I am having trouble because all of the column types have been defined as characters, but some of these fields actually contain binary data. If I try to read these fields using an OleDbDataReader as anything other than a string or character array, I get an InvalidCastException thrown, but I need to be able to read them as a binary value or at least cast/convert them after they are read. The columns that actually DO contain text are being returned as expected. For example, the very first column is defined as a character field with a length of 2 bytes, but the field contains a 16-bit integer. I have written the following test code to read the first column and convert it to the appropriate data type, but the value is not coming out right. The first row of the database has a value of 17365 (0x43D5) in the first column. Running the following code, what I end up getting is 17215 (0x433F). I'm pretty sure it has to do with using the ASCII encoding to get the bytes from the string returned by the data reader, but I'm not sure of another way to get the value into the format that I need, other than to write my own DBF reader and bypass ADO.NET altogether, which I don't want to do unless I absolutely have to. Any help would be greatly appreciated. byte[] c0; int i0; string con = @"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=C:\ASTM;Extended Properties=dBASE III;User ID=Admin;Password=;"; using (OleDbConnection c = new OleDbConnection(con)) { c.Open(); OleDbCommand cmd = c.CreateCommand(); cmd.CommandText = "SELECT * FROM astm2007"; OleDbDataReader dr = cmd.ExecuteReader(); while (dr.Read()) { c0 = Encoding.ASCII.GetBytes(dr.GetValue(0).ToString()); i0 = BitConverter.ToInt16(c0, 0); } dr.Dispose(); }
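The two numbers in the question are consistent with the ASCII suspicion: 0x43D5 stored little-endian is the byte pair D5 43, and an ASCII encoder replaces the non-ASCII 0xD5 with "?" (0x3F), giving 0x433F. The Python sketch below reproduces that arithmetic. It assumes the driver maps each byte to one character in a single-byte code page, modelled here with Latin-1, which may not match the real provider.

```python
import struct


def read_int16(raw):
    """Interpret two raw bytes as a little-endian signed 16-bit integer."""
    return struct.unpack("<h", raw)[0]


field = b"\xd5\x43"                          # 17365 == 0x43D5, little-endian
as_text = field.decode("latin-1")            # assumption: one char per byte
lossy = as_text.encode("ascii", "replace")   # 0xD5 is not ASCII, becomes b"?"
lossless = as_text.encode("latin-1")         # same code page in both directions
```

Under that assumption, converting the string back to bytes with the same single-byte encoding the driver used to decode it would be the thing to try before writing a custom DBF reader.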

  • Regex question: Why isn't this matching?

    - by AllenG
    I have the following regex: (?<=\.\d+?)0+(?=\D|$) I'm running it against a string which contains the following: SVC~NU^0270~313.3~329.18~~10~~6.00: When it runs, it matches the 6.00 (correctly) which my logic then trims by one zero to turn into 6.0. The regex then runs again (or should) but fails to pick up the 6.0. I'm by no means an expert on Regex, but my understanding of my expression is that it's looking for a decimal with 1 or more optional (so, really zero or more) digits prior to one or more zeros which are then followed by any non-digit character or the line break. Assuming that interpretation is correct, I can't see why it wouldn't match on the second pass. For that matter, I'm not sure why my Regex.Replace isn't matching the full 6.00 on the first pass and removing both of the trailing zeros... Any suggestions?
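In the pattern as written, `\d+?` still requires at least one digit between the decimal point and the zeros being matched, so in 6.00 only the final zero qualifies and 6.0 has nothing left to match. The sketch below reproduces that behaviour in Python. Python's `re` module does not support variable-length lookbehind, so the prefix is captured and written back instead; `trim_zeros` is a hypothetical name.

```python
import re

# Decimal point, at least one digit (lazy), then the trailing zeros to drop.
TRAILING_ZEROS = re.compile(r"(\.\d+?)0+(?=\D|$)")


def trim_zeros(text):
    """Drop trailing zeros after a decimal point, keeping one fractional digit."""
    return TRAILING_ZEROS.sub(r"\1", text)
```

Changing `\d+?` to `\d*?` would let the zeros directly after the point match as well, if removing them entirely is the goal.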

  • routing problem in codeigniter

    - by Obay
    I'm new to CodeIgniter and routing. I have a Login controller whose index() loads up a view to enter a username/password. In the view, the form has action="login/authenticate". Login-authenticate() determines if the login is valid or not. If it's valid, redirect('lobby'), if not redirect('login') routes.php: $route['default_controller'] = "login" config.php: $config['base_url'] = "http://localhost/dts/"; $config['index_page'] = "index.php"; The problem is that when i go to http://localhost/dts/ , click login, I am correctly (?) redirected to http://localhost/dts/login/authenticate but the browser says Object not found!. But when I go to http://localhost/dts/index.php/ (with trailing slash), it works correctly (I get redirected to http://localhost/dts/index.php/login/authenticate, and am logged in) I tried removing "index.php" by using a .htaccess: RewriteEngine on RewriteCond $1 !^(index\.php|images|robots\.txt) RewriteRule ^(.*)$ /index.php/$1 [L] and it would no longer open even the http://localhost/dts/ I'm confused.. what's going on?

  • Python string manipulation

    - by paradox
    I'm trying to split a string into a int list for further processing. But somehow I can't remove certain whitespaces in between elements of the list. The string x is supposed to have a length of 1000 instead of 1019. I tried reading the documentation for python and saw the function strip() for stripping whitespaces from strings. However, it only works for trailing and leading whitespaces. How should I go about removing these whitespaces and also how do I convert a str list to a int list? My code is as follows : import array x = """73167176531330624919225119674426574742355349194934 96983520312774506326239578318016984801869478851843 85861560789112949495459501737958331952853208805511 12540698747158523863050715693290963295227443043557 66896648950445244523161731856403098711121722383113 62229893423380308135336276614282806444486645238749 30358907296290491560440772390713810515859307960866 70172427121883998797908792274921901699720888093776 65727333001053367881220235421809751254540594752243 52584907711670556013604839586446706324415722155397 53697817977846174064955149290862569321978468622482 83972241375657056057490261407972968652414535100474 82166370484403199890008895243450658541227588666881 16427171479924442928230863465674813919123162824586 17866458359124566529476545682848912883142607690042 24219022671055626321111109370544217506941658960408 07198403850962455444362981230987879927244284909188 84580156166097919133875499200524063689912560717606 05886116467109405077541002256983155200055935729725 71636269561882670428252483600823257530420752963450""" y=[] for i in range(0,len(x)): #String is now in a string list if x[i]!='': y.append(x[i]) print(y[i]) print(len(x))
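The extra 19 characters are most likely the line breaks between the 20 rows of 50 digits. A minimal sketch that removes all whitespace, including the interior breaks that strip() leaves alone, and converts the result to integers (`digits_of` is a hypothetical name):

```python
def digits_of(text):
    """Remove all whitespace from text and return its digits as a list of ints."""
    # split() with no argument splits on any run of whitespace, including
    # newlines in the middle of the string.
    compact = "".join(text.split())
    return [int(ch) for ch in compact]
```

Called on the 20-line string from the question, this should return a list of 1000 integers.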

  • How to detect if certain characters are at the end of an NSString?

    - by Sheehan Alam
    Let's assume I can have the following strings: "hey @john..." "@john, hello" "@john(hello)" I am tokenizing the string to get every word separated by a space: [myString componentsSeparatedByString:@" "]; My array of tokens now contain: @john... @john, @john(hello) For these cases. How can I make sure only @john is tokenized, while retaining the trailing characters: ... , (hello) Note: I would like to be able to handle all cases of characters at the end of a string. The above are just 3 examples.
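One approach is to match the mention and whatever non-space characters follow it as two separate capture groups. The idea is sketched here with Python's `re` module instead of Cocoa's string APIs. `split_mentions` is a hypothetical name, and the sketch assumes a username consists only of word characters.

```python
import re

# Group 1: "@" plus word characters. Group 2: any trailing non-space characters.
MENTION = re.compile(r"(@\w+)(\S*)")


def split_mentions(text):
    """Return (mention, trailing_characters) pairs found in text."""
    return MENTION.findall(text)
```

For the three examples in the question this yields "@john" with the trailing parts "...", "," and "(hello)" respectively.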

  • Why does my DataTemplate break the WPF designer?

    - by PRINCESS FLUFF
    Why does the DataTemplate line break the WPF designer in Visual Studio 2008? The program compiles and runs properly. The DataTemplate is applied as it should. However the entire DataTemplate block of code is underlined in red, and when I simply "build" the program without running, I get the error "Type reference cannot find public type named 'Character'" How come it can't find it in the designer yet the program applies the template properly? <UserControl x:Class="WPF_Tests.Tests.TwoCollecViews.TwoViews" xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation" xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml" xmlns:DetailsPane="clr-namespace:WPF_Tests.Tests.DetailsPane" > <UserControl.Resources> <DataTemplate DataType="{x:Type DetailsPane:Character}"> <StackPanel Orientation="Horizontal"> <TextBlock Text="{Binding Path=Name}"></TextBlock> </StackPanel> </DataTemplate> </UserControl.Resources> <Grid> <ListBox ItemsSource="{Binding Path=Characters}" /> </Grid> </UserControl> EDIT: I am being told that this may be a bug in Visual Studio 2008, as it worked correctly in 2010. You can download the code here: http://www.mediafire.com/?z1myytvwm4n - The Test/TwoCollec xaml file's designer will break with this code.

  • Reading UTF-8 XML and writing it to a file with Python

    - by Harri
    I'm trying to parse a UTF-8 XML file and save some parts of it to another file. The problem is that this is my first Python script ever and I'm totally confused about the character encoding problems I'm finding. My script fails immediately when it tries to write a non-ASCII character to a file, but it can print it to the command prompt (at least on some level). Here's the XML (the parts that matter at least; it's a *.resx file which contains UI strings) <?xml version="1.0" encoding="utf-8"?> <root> <resheader name="foo"> <value>bar</value> </resheader> <data name="lorem" xml:space="preserve"> <value>ipsum öä</value> </data> </root> And here's my python script from xml.dom.minidom import parse names = [] values = [] def getStrings(path): dom = parse(path) data = dom.getElementsByTagName("data") for i in range(len(data)): name = data[i].getAttribute("name") names.append(name) value = data[i].getElementsByTagName("value") values.append(value[0].firstChild.nodeValue.encode("utf-8")) def writeToFile(): with open("uiStrings-fi.py", "w") as f: for i in range(len(names)): line = names[i] + '="'+ values[i] + '"' #varName='varValue' f.write(line) f.write("\n") getStrings("ResourceFile.fi-FI.resx") writeToFile() And here's the traceback: Traceback (most recent call last): File "GenerateLanguageFiles.py", line 24, in writeToFile() File "GenerateLanguageFiles.py", line 19, in writeToFile line = names[i] + '="'+ values[i] + '"' #varName='varValue' UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 2: ordinal not in range(128) How should I fix my script so it would read and write UTF-8 characters properly? The files I'm trying to generate would be used in test automation with Robot Framework.
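The traceback is consistent with Python 2 concatenating a unicode name with a value that was already encoded to a UTF-8 byte string, which triggers an implicit ASCII decode. A common way around it is to keep every value as text and let the file object do the encoding. The sketch below does that with `io.open`, which exists in both Python 2.6+ and Python 3; `write_strings` and `demo` are hypothetical names.

```python
import io
import os
import tempfile


def write_strings(path, pairs):
    """Write name="value" lines as UTF-8, keeping everything as text until I/O."""
    with io.open(path, "w", encoding="utf-8") as f:
        for name, value in pairs:
            f.write(u'{0}="{1}"\n'.format(name, value))


def demo():
    """Write one non-ASCII pair to a temporary file and read it back."""
    fd, path = tempfile.mkstemp(suffix=".py")
    os.close(fd)
    try:
        write_strings(path, [(u"lorem", u"ipsum öä")])
        with io.open(path, encoding="utf-8") as f:
            return f.read()
    finally:
        os.remove(path)
```

With this approach the `.encode("utf-8")` call on `nodeValue` in the original script would be dropped, since minidom already returns text.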

  • XML: Having trouble solving this XML error here..

    - by Capsud
    Hi i'm getting these errors Multiple annotations found at this line: - error: Error parsing XML: not well-formed (invalid token) - Content is not allowed in trailing section. on this XML file... <selector xmlns:android="http://schemas.android.com/apk/res/android"> <item android:state_enabled="false" android:drawable="@drawable/btn_red"> </item> <item android:state_pressed="true" android:state_enabled="true" android:drawable="@drawable/btn_orange"> </item> <item android:state_focused="true" android:state_enabled="true" android:drawable="@drawable/btn_orange"> </item> <item android:state_enabled="true" android:drawable="@drawable/btn_black"> </item> probably quite simple for you people who know XML. Can you help me please? Thanks.

  • How can I use htaccess to protect a subdirectory of codeigniter installation?

    - by Art Peterson
    I have codeigniter installed at the root directory, and would like to have a subdirectory called "test" password protected using htaccess. I keep getting a "404 page not found" no matter what I try. The directory structure is: /public_html /css /images /system (codeigniter directory) /test .htaccess .htaccess .htpasswd index.php The root .htaccess file looks like: RewriteEngine On RewriteBase / Options -Indexes # Removes trailing slashes RewriteCond %{REQUEST_FILENAME} !-d RewriteRule ^(.+)/$ $1 [L,R=301] # Enforce www RewriteCond %{HTTP_HOST} !^(www) [NC] RewriteRule ^(.*)$ http://www.mydomain.com/$1 [L,R=301] #Checks to see if the user is attempting to access a valid file, #such as an image or css document, if this isn't true it sends the #request to index.php RewriteCond %{REQUEST_FILENAME} !-f RewriteCond %{REQUEST_FILENAME} !-d RewriteCond %{REQUEST_URI} !^(.*)test(.*) RewriteRule ^(.*)$ index.php?/$1 [L] The /test/.htaccess file: AuthUserFile /home/dir/.htpasswd AuthName "Protected Area" AuthType Basic <limit GET POST PUT> require user adminuser </limit> I'm not even getting the authentication prompt, just the codeigniter 404 page when I navigate to the url "http://www.mydomain.com/test/". Please advise!

  • al32utf8 in oracle and SQL Server and DB2 pulling data

    - by Bob
    I have a non-UTF-8 Oracle database running on 11.1.0.7. We need to support Greek characters. So we have two options: (1) use nvarchar and nclob fields for those fields that need Greek (it is not all fields); we have tested this and gotten it to work with Java code. (2) Convert the Oracle database to AL32UTF8. I am not asking how to do this. I got this from the Oracle Site/Oracle Support. I know what is involved: lossy data, increasing the size of the database, etc. My question is this: we have users of our system who connect to our database with database links but work on SQL Server and IBM DB2 databases. I do not have access to those databases and I do not have experience with them. If they are not UTF-8 databases, what happens when they pull UTF-8 data? I would assume that English/ASCII characters are fine and the Greek will end up as junk data. I also ran the Oracle Character Set Scanner (the Oracle command-line utility you use to get info about the effects of a character set conversion). It says that my database will increase in size by about 20%. Does this have an effect on users with third-party databases? These are customers of our data and there is a limit to how much access I can have to them to run tests. Any information you have would be welcome.

  • Best terminal environment for Cygwin/Windows?

    - by Anders Sandvig
    Today I run Cygwin with rxvt using the following startup line: rxvt -bg black -sl 8192 -fg white -sr -g 150x56 -fn "Fixedsys" -e /usr/bin/bash --login -i This gives me a resizeable native Windows window which is much better than the standard "DOS box" the default cygwin.bat provides. However, the current configuration does have a couple of issues: I am not able to enter non-ASCII characters into the terminal window (i.e. æ, ø, å and Æ, Ø, Å, which I use semi-frequently). In fact, the terminal will not even accept them when I paste them into the window. If I paste a string like "bølle" (Norwegian for "bully"), all I get is "blle". I am not able to render UTF-8 characters; they only show as ?, even if they are supported by the font (i.e. when rendering the same characters in ISO-8859-1 they show just fine). I am running English Windows Vista with locale and keyboard layout set to Norwegian (ISO-8859-1 character set?), but I've had the exact same issue on Windows 2000 and XP. Does anyone know how to fix this (i.e. a better way to configure rxvt)? Apart from the issues mentioned above, I'm very happy with rxvt, so if I find a way to resolve them I'd like to continue using it. However, if the issues are not (easily) solvable, are there any other good terminal solutions for Cygwin? Update The solution provided by Andy and Mattias (editing the .inputrc file) did solve the input problem, but output rendering is still an issue. Output is fine when I render in ISO-8859-1, but when using UTF-8 I only get ? for non-ASCII characters. This behavior is consistent between rxvt, urxvt (under Cygwin XFree X Server), mintty and PuttyCyg. Is there a similar configuration file where output encoding can be set (i.e. the equivalent of setting output locale on a Linux system)?

  • Python performance improvement request for winkler

    - by Martlark
    I'm a python n00b and I'd like some suggestions on how to improve the algorithm to improve the performance of this method to compute the Jaro-Winkler distance of two names. def winklerCompareP(str1, str2): """Return approximate string comparator measure (between 0.0 and 1.0) USAGE: score = winkler(str1, str2) ARGUMENTS: str1 The first string str2 The second string DESCRIPTION: As described in 'An Application of the Fellegi-Sunter Model of Record Linkage to the 1990 U.S. Decennial Census' by William E. Winkler and Yves Thibaudeau. Based on the 'jaro' string comparator, but modifies it according to whether the first few characters are the same or not. """ # Quick check if the strings are the same - - - - - - - - - - - - - - - - - - # jaro_winkler_marker_char = chr(1) if (str1 == str2): return 1.0 len1 = len(str1) len2 = len(str2) halflen = max(len1,len2) / 2 - 1 ass1 = '' # Characters assigned in str1 ass2 = '' # Characters assigned in str2 #ass1 = '' #ass2 = '' workstr1 = str1 workstr2 = str2 common1 = 0 # Number of common characters common2 = 0 #print "'len1', str1[i], start, end, index, ass1, workstr2, common1" # Analyse the first string - - - - - - - - - - - - - - - - - - - - - - - - - # for i in range(len1): start = max(0,i-halflen) end = min(i+halflen+1,len2) index = workstr2.find(str1[i],start,end) #print 'len1', str1[i], start, end, index, ass1, workstr2, common1 if (index > -1): # Found common character common1 += 1 #ass1 += str1[i] ass1 = ass1 + str1[i] workstr2 = workstr2[:index]+jaro_winkler_marker_char+workstr2[index+1:] #print "str1 analyse result", ass1, common1 #print "str1 analyse result", ass1, common1 # Analyse the second string - - - - - - - - - - - - - - - - - - - - - - - - - # for i in range(len2): start = max(0,i-halflen) end = min(i+halflen+1,len1) index = workstr1.find(str2[i],start,end) #print 'len2', str2[i], start, end, index, ass1, workstr1, common2 if (index > -1): # Found common character common2 += 1 #ass2 += str2[i] ass2 = ass2 + 
str2[i] workstr1 = workstr1[:index]+jaro_winkler_marker_char+workstr1[index+1:] if (common1 != common2): print('Winkler: Wrong common values for strings "%s" and "%s"' % \ (str1, str2) + ', common1: %i, common2: %i' % (common1, common2) + \ ', common should be the same.') common1 = float(common1+common2) / 2.0 ##### This is just a fix ##### if (common1 == 0): return 0.0 # Compute number of transpositions - - - - - - - - - - - - - - - - - - - - - # transposition = 0 for i in range(len(ass1)): if (ass1[i] != ass2[i]): transposition += 1 transposition = transposition / 2.0 # Now compute how many characters are common at beginning - - - - - - - - - - # minlen = min(len1,len2) for same in range(minlen+1): if (str1[:same] != str2[:same]): break same -= 1 if (same > 4): same = 4 common1 = float(common1) w = 1./3.*(common1 / float(len1) + common1 / float(len2) + (common1-transposition) / common1) wn = w + same*0.1 * (1.0 - w) return wn
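One direction for speeding this up is to avoid rebuilding strings inside the loops: track matched positions in boolean lists and count transpositions in a single pass. The following is a compact sketch of the standard Jaro-Winkler measure written that way for Python 3, not a drop-in replacement for the function above. The name `jaro_winkler` and the `prefix_scale` parameter are assumptions, and no timing claims are made.

```python
def jaro_winkler(s1, s2, prefix_scale=0.1):
    """Return the Jaro-Winkler similarity of two strings (0.0 to 1.0)."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    window = max(max(len1, len2) // 2 - 1, 0)
    used1 = [False] * len1  # positions in s1 that found a partner
    used2 = [False] * len2  # positions in s2 already claimed
    matches = 0
    for i, ch in enumerate(s1):
        for j in range(max(0, i - window), min(i + window + 1, len2)):
            if not used2[j] and s2[j] == ch:
                used1[i] = used2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Walk both match sequences in order and count positions that differ.
    out_of_order = 0
    j = 0
    for i in range(len1):
        if used1[i]:
            while not used2[j]:
                j += 1
            if s1[i] != s2[j]:
                out_of_order += 1
            j += 1
    transpositions = out_of_order / 2.0
    jaro = (matches / len1 + matches / len2
            + (matches - transpositions) / matches) / 3.0
    # Winkler bonus for a shared prefix of up to four characters.
    prefix = 0
    for a, b in zip(s1[:4], s2[:4]):
        if a != b:
            break
        prefix += 1
    return jaro + prefix * prefix_scale * (1.0 - jaro)
```

Profiling both versions on representative name pairs would be needed to confirm whether the list-based bookkeeping actually helps for short strings.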

  • Munging non-printable characters to dots using string.translate()

    - by Jim Dennis
    So I've done this before and it's a surprisingly ugly bit of code for such a seemingly simple task. The goal is to translate any non-printable character into a . (dot). For my purposes "printable" does exclude the last few characters from string.printable (new-lines, tabs, and so on). This is for printing things like the old MS-DOS debug "hex dump" format ... or anything similar to that (where additional whitespace will mangle the intended dump layout). I know I can use string.translate() and, to use that, I need a translation table. So I use string.maketrans() for that. Here's the best I could come up with: filter = string.maketrans( string.translate(string.maketrans('',''), string.maketrans('',''),string.printable[:-5]), '.'*len(string.translate(string.maketrans('',''), string.maketrans('',''),string.printable[:-5]))) ... which is an unreadable mess (though it does work). From there you can use something like: for each_line in sometext: print string.translate(each_line, filter) ... and be happy. (So long as you don't look under the hood). Now it is more readable if I break that horrid expression into separate statements: ascii = string.maketrans('','') # The whole ASCII character set nonprintable = string.translate(ascii, ascii, string.printable[:-5]) # Optional delchars argument filter = string.maketrans(nonprintable, '.' * len(nonprintable)) And it's tempting to do that just for legibility. However, I keep thinking there has to be a more elegant way to express this!
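A more direct way to build the table is to generate all 256 entries in one expression, mapping each byte to itself if it is printable and to a dot otherwise. The sketch below targets Python 3 `bytes`, unlike the Python 2 `string` functions in the question; `munge` and `DOT_TABLE` are hypothetical names.

```python
import string

# Printable characters minus the last five (tab, newline, and so on).
_KEEP = string.printable[:-5].encode("ascii")

# 256-entry table: printable bytes map to themselves, the rest to ".".
DOT_TABLE = bytes(b if b in _KEEP else ord(".") for b in range(256))


def munge(data):
    """Replace every non-printable byte in data with a dot."""
    return data.translate(DOT_TABLE)
```

Building the table once at import time keeps each call to `munge` a single `translate` pass, which suits hex-dump style output.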

  • Checking if language is set in url with regex

    - by Saif Bechan
    I am working on a multi-language file. My URLs look something like this: http://www.mydomain.com/en/about/info http://www.mydomain.com/nl/about/info Now I use a small regex script that redirects users when they use a link without a language. The script looks like this: preg_match('~^/[a-z]{2}/~', $_SERVER['REQUEST_URI']) This finds out if there is a language set (en|nl|de etc.). This works fine on all links except for these: http://www.mydomain.com/en http://www.mydomain.com/nl There is no trailing slash, so the regex cannot find the given values. Does anyone know a fix for this?
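The usual fix is to accept either a slash or the end of the path after the two letters, instead of requiring the slash. The idea is sketched here with Python's `re` module, and the same pattern body should carry over to preg_match. `language_of` is a hypothetical name, and allowing "?" after the code is an added assumption to cover request URIs that carry a query string.

```python
import re

# Two lowercase letters after the leading slash, followed by "/", "?" or the end.
LANG_PREFIX = re.compile(r"^/([a-z]{2})(?=[/?]|$)")


def language_of(request_uri):
    """Return the two-letter language prefix of a request URI, or None."""
    match = LANG_PREFIX.match(request_uri)
    return match.group(1) if match else None
```

The lookahead keeps longer first segments such as "/english" from being mistaken for a language code.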

  • What is the correct JNA mapping for UniChar on Mac OS X?

    - by Trejkaz
    I have a C struct like this: struct HFSUniStr255 { UInt16 length; UniChar unicode[255]; }; I have mapped this in the expected way: public class HFSUniStr255 extends Structure { public UInt16 length; // UInt16 is just an IntegerType with length 2 for convenience. public /*UniChar*/ char[] unicode = new char[255]; //public /*UniChar*/ byte[] unicode = new byte[255*2]; //public /*UniChar*/ UInt16[] unicode = new UInt16[255]; public HFSUniStr255() { } public HFSUniStr255(Pointer pointer) { super(pointer); } } If I use this version, I get every second character of the string into my char[] ("aits D" for "Macintosh HD".) I am assuming that this is something to do with being on a 64-bit platform and JNA mapping the value to a 32-bit wchar_t but then chopping off the high 16 bits on each wchar_t on copying them back. If I use the byte[] version, I get data which decodes correctly using the UTF-16LE charset. If I use the UInt16[] version, I get the right code point for each character but it is then inconvenient to convert them back into a string. Is there some way I can define my type as char[], and yet have it convert correctly?
