the boy za - Page 12 - Developer IT

Zend RegEx Validator error message issue

- by Mallika Iyer

Hello, I'm validating a text field in my form as follows: $name = new Zend_Form_Element_Text('name'); $name->setLabel('First Name:') ->setRequired(true) ->addFilter(new Zend_Filter_StringTrim()) ->addValidator('regex',true,array('/^[(a-zA-Z0-9)]+$/')) ->addErrorMessage('Please enter a valid first name'); What I'm trying to accomplish is - how can i display a meaningful error message? Eg: If first name is 'XYZ-', how can i display '- is not allowed in first name.' Is there a way I can access what character the regex is failing for? Would you recommend something else altogether? I thought about writing a custom validator but the regex is pretty simple, so I don't see the point. I couldn't find a decent documentation for the zend 'regex' validator anywhere. If I don't override the default error message, I simple get something like : ';;;hhbhbhb' does not match against pattern '/^[(a-zA-Z0-9)]+$/' - which I obviously don't want to display to the user. I'd appreciate your inputs.

Read the article

how to prevent white spaces in a regular expression regex validation

- by Rees

i am completely new to regular expressions and am trying to create a regular expression in flex for a validation. using a regular expression, i am going to validate that the user input does NOT contain any white-space and consists of only characters and digits... starting with digit. so far i have: expression="[A-Za-z][A-Za-z0-9]*" this correctly checks for user input to start with a character followed by a possible digit, but this does not check if there is white space...(in my tests if user input has a space this input will pass through validation - this is not desired) can someone tell me how i can modify this expression to ensure that user input with whitespace is flagged as invalid?

Read the article

NSPredicate error/behaving differently on 10.5 vs 10.6

- by Tristan

I am using a NSPredicate to determine if an entered email address is valid. On 10.6 it works perfectly as expected. I recently decided to get my app going on 10.5 and this is the only thing that doesn't work. The error i get is as follows: "Can't do regex matching, reason: Can't open pattern U_MALFORMED_SET (string [email protected], pattern ([\w-+]+(?:\.[\w-+]+)*@(?:[\w-]+\.)+[a-zA-Z]{2,7}), case 0, canon 0)" The code im using is as follows: NSString *regex = @"([\\w-+]+(?:\\.[\\w-+]+)*@(?:[\\w-]+\\.)+[a-zA-Z]{2,7})"; NSPredicate *regextest = [NSPredicate predicateWithFormat:@"SELF MATCHES %@", regex]; if ([regextest evaluateWithObject:[userEmail objectValue]] == YES) Does anyone know why this isn't working on 10.5? And how I might get it working or be able to do this test in a way compatible for both 10.5 and 10.6?

Read the article

Regex-expression with danish characters

- by timkl

I'm currently trying to wrap my head around regex, I have a validation snippet that tests an input box against a regex-expression: $.validator.addMethod("customerName", function(value, element){ return (/^[a-zA-Z]*$/).test(value); }, "Some text"); That works well, but when I try to add a space and some special danish characters, it doesn't filter the danish characters, only the space. $.validator.addMethod("customerName", function(value, element){ return (/^[a-zA-Z æøåÆØÅ]*$/).test(value); }, "Some text"); Any ideas to what could be wrong?

Read the article

Django admin fails when using includes in urlpatterns

- by zenWeasel

I am trying to refactor out my application a little bit to keep it from getting too unwieldily. So I started to move some of the urlpatterns out to sub files as the documentation proposes. Besides that fact that it just doesn't seem to be working (the items are not being rerouted) but when I go to the admin, it says that 'urlpatterns has not been defined'. The urls.py I have at the root of my application is: if settings.ENABLE_SSL: urlpatterns = patterns('', (r'^checkout/orderform/onepage/(\w*)/$','checkout.views.one_page_orderform',{'SSL':True},'commerce.checkout.views.single_product_orderform'), ) else: urlpatterns = patterns('', (r'^checkout/orderform/onepage/(\w*)/$','commerce.checkout.views.single_product_orderform'), ) urlpatterns+= patterns('', (r'^$', 'alchemysites.views.route_to_home'), (r'^%s/' % settings.DAJAXICE_MEDIA_PREFIX, include('dajaxice.urls')), (r'^/checkout/', include('commerce.urls')), (r'^/offers',include('commerce.urls')), (r'^/order/',include('commerce.urls')), (r'^admin/', include(admin.site.urls)), (r'^accounts/login/$', login), (r'^accounts/logout/$', logout), (r'^(?P<path>.*)/$','alchemysites.views.get_path'), (r'^static/(?P<path>.*)$', 'django.views.static.serve', {'document_root':settings.MEDIA_ROOT}), The urls I have moved out so far are the checkout/offers/order which are all subapps of 'commerce' where the urls.py for the apps are so to be clear. /urls.py in questions (included here) /commerce/urls.py where the urls.py I want to include is: order_info = { 'queryset': Order.objects.all(), } urlpatterns+= patterns('', (r'^offers/$','offers.views.start_offers'), (r'^offers/([a-zA-Z0-9-]*)/order/(\d*)/add/([a-zA-Z0-9-]*)/(\w*)/next/([a-zA-Z0-9-)/$','offers.views.show_offer'), (r'^reports/orders/$', list_detail.object_list,order_info), ) and the applications offers lies under commerce. And so the additional problem is that admin will not work at all, so I'm thinking because I killed it somewhere with my includes. Things I have checked for: Is the urlpatterns variable accidentally getting reset somewhere (i.e. urlpatterns = patterns, instead of urlpatterns+= patterns) Are the patterns in commerce.urls valid (yes, when moved back to root they work). So from there I am stumped. I can move everything back into the root, but was trying to get a little decoupled, not just for theoretical reason but for some short terms ones. Lastly if I enter www.domainname/checkout/orderform/onepage/xxxjsd I get the correct page. However, entering www.domainname/checkout/ gets handled by the alchemysites.views.get_path. If not the answer (because this is pretty darn specific), then is there a good way for troubleshoot urls.py? It seems to just be trial and error. Seems there should be some sort of parser that will tell you what your urlpatterns will do.

Read the article

Shortening code

- by Misiur

Nah, looks like it was hosting fault. Who can make this code shorter? private function replaceFunc($subject) { foreach($this->func as $t) { preg_match_all('/\{'.$t.'$[a-zA-Z,\']+$\}/i', $subject, $res); for($j = 0; $j < sizeof($res[0]); $j++) { preg_match('/$[a-zA-Z,\']+$/i', $res[0][$j], $match); if($match > 0) { $prep = explode(", ", substr($match[0], 1, -1)); $args = array(); for($i = 0; $i < sizeof($prep); $i++) { $args[] = substr($prep[$i], 1, -1); } } else { $args = array(); } $subject = preg_replace('/\{'.$t.preg_quote($match[0]).'\}/i', call_user_func_array($t, $args), $subject); } } return $subject; }

Read the article

Convert Eregi_replace to preg_replace in PHP

- by alexy13

I need help converting eregi_replace to preg_replace (since in PHP5 it's depreciated): function makeClickableLinks($text) { $text = eregi_replace('(((f|ht){1}tp://)[-a-zA-Z0-9@:%_\+.~#?&//=]+)', '<a href="\\1">\\1</a>', $text); $text = eregi_replace('([[:space:]()[{}])(www.[-a-zA-Z0-9@:%_\+.~#?&//=]+)', '\\1<a href="http://\\2">\\2</a>', $text); $text = eregi_replace('([_\.0-9a-z-]+@([0-9a-z][0-9a-z-]+\.)+[a-z]{2,3})', '<a href="mailto:\\1">\\1</a>', $text); return $text; } (It turns text links and emails into hyperlinks so that the user can click on them)

Read the article

apache mod_rewrite regex problem with multiple parameter

- by iko

Regular expressions have always been my pet peeves. Every time I think that I finally got it I have a new problem ! I want to catch url like this : http://www.mydomain.com/boutique/blabla-1/bla-bla2/99/104 http://www.mydomain.com/boutique/blabla1/99 and eventually : http://www.mydomain.com/boutique/blabla-1/bla-bla2/product1/99/104/55/ after a lot of tries and errors I came up with this which seems to work with http://www.gskinner.com/RegExr/ but not in apache ^.*/boutique/([a-zA-Z-]*)(/?[a-zA-Z-]*)/?([0-9]*)/?([0-9]*)/?$ boutique.php?c1=$3&c2=$4 (I was only working with the first two url so far) MY apache rewrite log debug files are helpless : pass through /Users/iko/Sites/mysite/boutique.php I'm only interrested in getting the ids. Any help we'll be welcomed ! Thank you.

Read the article

htaccess rewriterule

- by user322731

I currently have this rule where I want the page to just render my static content. RewriteRule ^videos\/coverage\/view\/236159\-([0-9a-zA-Z-]+) http: //website.com/static/236159.html [NC] However, this doesn't work. It works with a L tag but then the URL is different: RewriteRule ^videos\/coverage\/view\/236159\-([0-9a-zA-Z-]+) http: //website.com/static/236159.html [L, NC] My goal is to keep the URL the same but the content different. Can anyone point out to what flags are needed in order to get this working properly? Thanks!

Read the article

Relative Paths, etc. on an Apache server

- by Matt H.

I'm really stuck here. I have 2 issues at once: First, my site is stored (both on local development and on live server), in a subdirectory.. as I'm working on multiple sites. i.e. /Sites/www.mysite.com/(site files here) When I'm referring to files in my web pages, I want to refer to, say, my /images directory without hard-coding every occurrence as /www.mysite.com/images/myfile.jpg Is there a way to simply redefine how the leading "/" gets interpreted by the server? Question two, concerning PHP mod_rewrite I have this set of rewrite rules. The objective is to turn www.mysite.com/faq into www.mysite.com/index.php?page="faq" RewriteEngine On RewriteCond %{REQUEST_URI} !mysite.com RewriteRule (.*) mysite.com/$1 RewriteRule ^([a-zA-Z0-9_-]+)$ /mysite.com/index.php?site=$1 RewriteRule ^([a-zA-Z0-9_-]+)/$ /mysite.com/index.php?site=$1 I don't have a problem when a url gets passed in the 2nd-to-last format (as the example above). However, if the trailing "/" is added: www.mysite.com/faq/, my external script references break: (such as src=js/script.js)...

Read the article

Regular expressions and matching URLs with metacharacters

- by James P.

I'm having trouble finding a regular expression that matches the following String. Korben;http://feeds.feedburner.com/KorbensBlog-UpgradeYourMind?format=xml;1 One problem is escaping the question mark. Java's pattern matcher doesn't seem to accept \? as a valid escape sequence but it also fails to work with the tester at myregexp.com. Here's what I have so far: ([a-zA-Z0-9])+;http://([a-zA-Z0-9./-]+);[0-9]+ Any suggestions? Edit: The original intent was to match all URLs that could be found after the first semi colon.

Read the article

Can this jQuery/Javascript functionality be replicated with PHP

- by benhowdle89

This is the code to grab tweets, but i need this in PHP, can anybody offer any insight? $(document).ready( function() { var url = "http://twitter.com/status/user_timeline/joebloggs.json?count=1&callback=?"; $.getJSON(url, function(data){ $.each(data, function(i, item) { $("#twitter-posts").append("" + item.text.linkify() + " " + relative_time(item.created_at) + " via " + item.source + ""); }); }); }); String.prototype.linkify = function() { return this.replace(/[A-Za-z]+:\/\/[A-Za-z0-9-_]+\.[A-Za-z0-9-_:%&\?\/.=]+/, function(m) { return m.link(m); }); }; function relative_time(time_value) { var values = time_value.split(" "); time_value = values[1] + " " + values[2] + ", " + values[5] + " " + values[3]; var parsed_date = Date.parse(time_value); var relative_to = (arguments.length > 1) ? arguments[1] : new Date(); var delta = parseInt((relative_to.getTime() - parsed_date) / 1000); delta = delta + (relative_to.getTimezoneOffset() * 60); var r = ''; if (delta < 60) { r = 'a minute ago'; } else if(delta < 120) { r = 'couple of minutes ago'; } else if(delta < (45*60)) { r = (parseInt(delta / 60)).toString() + ' minutes ago'; } else if(delta < (90*60)) { r = 'an hour ago'; } else if(delta < (24*60*60)) { r = '' + (parseInt(delta / 3600)).toString() + ' hours ago'; } else if(delta < (48*60*60)) { r = '1 day ago'; } else { r = (parseInt(delta / 86400)).toString() + ' days ago'; } return r; } function twitter_callback () { return true; }

Read the article

parse youtube video id using preg_match

- by Webbo

Hi, I am attempting to parse the video ID of a youtube URL using preg_match. I found a regular expression on this site that appears to work; (?<=v=)[a-zA-Z0-9-]+(?=&)|(?<=[0-9]/)[^&\n]+|(?<=v=)[^&\n]+ As shown in this pic; http://i.imgur.com/SQJW2.jpg My PHP is as follows, but it doesn't work (gives Unknown modifier '[' error)... <? $subject = "http://www.youtube.com/watch?v=z_AbfPXTKms&NR=1"; preg_match("(?<=v=)[a-zA-Z0-9-]+(?=&)|(?<=[0-9]/)[^&\n]+|(?<=v=)[^&\n]+", $subject, $matches); print "<pre>"; print_r($matches); print "</pre>"; ?> Cheers

Read the article

How can I convert Perl regular expressions to boost regular expressions?

- by YY

I'm not familiar with Perl and boost regular expression and I want to convert a Perl code to c++. I want to convert special regular expression in Perl into c++ using Boost regexp library. Please help me understand what I must do? Here is some regexps that a word of a sentence may match: if ($word =~ /^[\.:\,()\'\`-]/) { # hack for punctuation } if ($word =~ /^[A-Z]/) { return; } if ($word =~ /[A-Za-z0-9]+\-[A-Za-z0-9]+/) { # all hyphenated words return; } if ($word =~ /.*[0-9].*/) { # all numbers return; }

Read the article

php array regular expressions

- by bell

I am using regular expressions in php to match postcodes found in a string. The results are being returned as an array, I was wondering if there is any way to assign variables to each of the results, something like $postcode1 = first match found $postcode2 = second match found here is my code $html = "some text here bt123ab and another postcode bt112cd"; preg_match_all("/([a-zA-Z]{2})([0-9]{2,3})([a-zA-Z]{2})/", $html, $matches, PREG_SET_ORDER); foreach ($matches as $val) { echo $val[0]; } I am very new to regular expressions and php, forgive me if this is a stupid question. Thanks in advance

Read the article

an error "variable of field declared void"

- by lego69

I have this code: header - test.h Inside header I have some class Z and definitions of two functions test and test2 I call function test2 from test void test2(Z z, Z const *za); this is implementation of the function: void test2(Z z, Z const *za){ int i = z; //this row works cout << i << endl; } I call it from test: test2(z1, za1); // za1 is pinter to object and z1 is some object but in my header I receive an 3 errors: Multiple markers at this line - initializer expression list treated as compound expression - `A' was not declared in this scope - variable or field `quiz2' declared void can somebody please explain why? thanks in advance

Read the article

Assistance with regular expressions in Python

- by da5id

I am still learning REGEX, and I've run into an issue ... I am trying to separate a string that is composed of a mixture of letters and numbers that are in decimal format: AB0.500CD1.05EF2.29 Into something like this: list1 = AB,CD,EF list2 = 0.500,1.05,2.29 A complication to all this is that I also have strings that look like this: AB1CD2EF3 Which I'd also like to separate into this: list1 = AB,CD,EF list2 = 1,2,3 A previous inquiry yielded the following snippet, import re pattern = re.compile(r'([a-zA-Z]+)([0-9]+)') for (letters, numbers) in re.findall(pattern,cmpnd): print numbers print letters This example works fine for strings of the 2nd kind, but only "finds" the leading digit in the numbers that contain decimal places in the strings of the first kind. I've attempted an approach using the following line: pattern = re.compile(r'([a-zA-Z]+)([0-9]+(\.[0-9]))') But this results in an error: "ValueError: too many values to unpack" Thanks for any and all assistance!

Read the article

regular expression alphanumarics with sapce

- by Surya sasidhar

hi, i write regular expression for alphanumarics but it is not taking space i want space (white space between characters) i write like this ^[a-zA-Z0-9_]*$

Read the article

Perl - string matching issue

- by user2886545

I have a problem I cannot understand. I have this string: gene_id "siRNA_Z27kG1_20543"transcript_id "siRNA_Z27kG1_20543_X_1";tss_id "TSS124620" And I want to change the gene_id. So, I have the following code: if ($line =~ /;transcript_id "([A-Za-z0-9:\-._]*)(_[oxOX][_.][0-9]*)";/) { $num = $2; $line =~ s/gene_id "([A-Za-z0-9:\-._]*)";/gene_id "$1$num";/g; print $new $line."\n"; } The aim of my code is to change siRNA_Z27kG1_20543 for siRNA_Z27kG1_20543_X_1. However, my code does not produce that output. Why? I can't understand that. My regex needs to be as it is because I match other strings (this time with success). Thanks.

Read the article

preg_match_all confusing failing

- by James

$string = (string) file_get_contents($_FILES['file']['tmp_name']); echo $string; preg_match_all("/[\._a-zA-Z0-9-]+@[\._a-zA-Z0-9-]+/i", $string, $matches); print_r($matches); I am parsing text/csv files and grabbing email addresses from uploaded files. When parsing a Google Contact file I exported it weirdly fails. But when I simply copy the string that is echo'd and paste that instead of the file_get_contents result, it parses and works. Any idea why it is refusing to take the file_get_contents string, but if I paste in the raw data myself, it works?

Read the article

IIS7 URL Redirect with Regex

- by andyjv

I'm preparing for a major overhaul of our shopping cart, which is going to completely change how the urls are structured. For what its worth, this is for Magento 1.7. An example URL would be: {domain}/item/sub-domain/sub-sub-domain-5-16-7-16-/8083770?plpver=98&categid=1027&prodid=8090&origin=keyword and redirect it to {domain}/catalogsearch/result/?q=8083710 My web.config is: <?xml version="1.0" encoding="UTF-8"?> <configuration> <system.webServer> <rewrite> <rules> <rule name="Magento Required" stopProcessing="false"> <match url=".*" ignoreCase="false" /> <conditions> <add input="{URL}" pattern="^/(media|skin|js)/" ignoreCase="false" negate="true" /> <add input="{REQUEST_FILENAME}" matchType="IsFile" negate="true" /> <add input="{REQUEST_FILENAME}" matchType="IsDirectory" negate="true" /> </conditions> <action type="Rewrite" url="index.php" /> </rule> <rule name="Item Redirect" stopProcessing="true"> <match url="^item/([_\-a-zA-Z0-9]+)/([_\-a-zA-Z0-9]+)/([_\-a-zA-Z0-9]+)(\?.*)" /> <action type="Redirect" url="catalogsearch/result/?q={R:3}" appendQueryString="true" redirectType="Permanent" /> <conditions trackAllCaptures="true"> </conditions> </rule> </rules> </rewrite> <httpProtocol allowKeepAlive="false" /> <caching enabled="false" /> <urlCompression doDynamicCompression="true" /> </system.webServer> </configuration> Right now it seems the redirect is completely ignored, even though in the IIS GUI the sample url passes the regex test. Is there a better way to redirect or is there something wrong with my web.config?

Read the article

nginx + php-fpm cycle redirection error on linode new vps

- by chifliiiii

I'm new to nginx, and I'm trying to make my first server run. I followed this guide as I'm trying to use it for a multisite wordpress site. After installing everything, I get a 500 Internal server error. If I check logs, I see this: 012/09/27 08:55:54 [error] 11565#0: *8 rewrite or internal redirection cycle while internally redirecting to "/index.html", client: xxx.xxx.xxx.xxx, server: localhost, request: "GET /favicon.ico HTTP/1.1", host: "www.mydomain.com" 2012/09/27 08:59:32 [error] 11618#0: *1 rewrite or internal redirection cycle while internally redirecting to "/index.html", client: xxx.xxx.xxx.xxx, server: localhost, request: "GET /phpmyadmin HTTP/1.1", host: "www.mydomain.com" My conf files are the following: nano /etc/nginx/sites-available/mydomain.com server { listen 80 default_server; server_name mydomain.com *.mydomain.com; root /srv/www/aciup.com/public; access_log /srv/www/mydomain.com/log/access.log; error_log /srv/www/mydomain.com/log/error.log; location / { index index.php; try_files $uri $uri/ /index.php?$args; } # Add trailing slash to */wp-admin requests. rewrite /wp-admin$ $scheme://$host$uri/ permanent; # Directives to send expires headers and turn off 404 error logging. location ~* \.(js|css|png|jpg|jpeg|gif|ico)$ { expires 24h; log_not_found off; } # this prevents hidden files (beginning with a period) from being served location ~ /\. { access_log off; log_not_found off; deny all; } # Pass uploaded files to wp-includes/ms-files.php. rewrite /files/$ /index.php last; if ($uri !~ wp-content/plugins) { rewrite /files/(.+)$ /wp-includes/ms-files.php?file=$1 last; } # Rewrite multisite '.../wp-.*' and '.../*.php'. if (!-e $request_filename) { rewrite ^/[_0-9a-zA-Z-]+(/wp-.*) $1 last; rewrite ^/[_0-9a-zA-Z-]+.*(/wp-admin/.*\.php)$ $1 last; rewrite ^/[_0-9a-zA-Z-]+(/.*\.php)$ $1 last; } location ~ \.php$ { client_max_body_size 25M; fastcgi_pass unix:/var/run/php5-fpm.sock; fastcgi_index index.php; include /etc/nginx/fastcgi_params; } } nano /etc/nginx/nginx.conf user www-data; worker_processes 4; worker_cpu_affinity 0001 0010 0100 1000; error_log /var/log/nginx/error.log; pid /var/run/nginx.pid; events { worker_connections 2048; } http { include /etc/nginx/mime.types; access_log /var/log/nginx/access.log; sendfile on; tcp_nopush on; keepalive_timeout 5; tcp_nodelay on; server_tokens off; gzip on; gzip_types text/plain text/css application/x-javascript text/xml application/xml application/xml+rss text/javascript; gzip_disable "MSIE [1-6]\.(?!.*SV1)"; include /etc/nginx/conf.d/*.conf; include /etc/nginx/sites-enabled/*; } Any help will be appreciated.

Read the article

retrieve value from hashtable with clone of key; C#

- by Johnny

I would like to know if there is any possible way to retrieve an item from a hashtable using a key that is identical to the actual key, but a different object. I understand why it is probably not possible, but I would like to see if there is any tricky way to do it. My problem arises from the fact that, being as stupid as I am, I created hashtables with int[] as the keys, with the integer arrays containing indices representing spatial position. I somehow knew that I needed to create a new int[] every time I wanted to add a new entry, but neglected to think that when I generated spatial coordinate arrays later they would be worthless in retrieving the values from my hashtables. Now I am trying to decide whether to rearrange things so that I can store my values in ArrayLists, or whether to search through the list of keys in the Hashtable for the one I need every time I want to get a value, neither of the options being very cool. Unless of course there is a way to get //1 to work like //2! Thanks in advance. static void Main(string[] args) { Hashtable dog = new Hashtable(); //1 int[] man = new int[] { 5 }; dog.Add(man, "hello"); int[] cat = new int[] { 5 }; Console.WriteLine(dog.ContainsKey(cat)); //false //2 int boy = 5; dog.Add(boy, "wtf"); int kitten = 5; Console.WriteLine(dog.ContainsKey(kitten)); //true; }

Read the article

Hide All But First Matching Element

- by Batfan

I am using jquery to sort through multiple paragraphs. Currently I have it set to show only paragraphs that start with a current letter. But now, I would like to consolidate further. If the text between the paragraph tags has multiple instances, I would like all but the first hidden. This is what I have so far but, it is not working. var letter = '<?php echo(strlen($_GET['letter']) == 1) ? $_GET['letter'] : ''; ?>' function finish(){ jQuery('p').each(function(){ if(jQuery(this).text().substr(0,1).toUpperCase() == letter){ jQuery(this).addClass('current-series'); jQuery(this).html(letter + ''+jQuery(this).text().slice(1)+ ''); } else{ jQuery(this).hide();} }) } Update: Sorry guys, I know this is kind of hard to explain. Here's a basic example: The selected letter is B The values returned are: Ball Ball Ball Boy Brain Bat Bat Each of these values is in a paragraph tag. Is there a way to consolidate to this? Ball Boy Brain Bat

Read the article

SQL SERVER – 5 Tips for Improving Your Data with expressor Studio

- by pinaldave

It’s no secret that bad data leads to bad decisions and poor results. However, how do you prevent dirty data from taking up residency in your data store? Some might argue that it’s the responsibility of the person sending you the data. While that may be true, in practice that will rarely hold up. It doesn’t matter how many times you ask, you will get the data however they decide to provide it. So now you have bad data. What constitutes bad data? There are quite a few valid answers, for example: Invalid date values Inappropriate characters Wrong data Values that exceed a pre-set threshold While it is certainly possible to write your own scripts and custom SQL to identify and deal with these data anomalies, that effort often takes too long and becomes difficult to maintain. Instead, leveraging an ETL tool like expressor Studio makes the data cleansing process much easier and faster. Below are some tips for leveraging expressor to get your data into tip-top shape. Tip 1: Build reusable data objects with embedded cleansing rules One of the new features in expressor Studio 3.2 is the ability to define constraints at the metadata level. Using expressor’s concept of Semantic Types, you can define reusable data objects that have embedded logic such as constraints for dealing with dirty data. Once defined, they can be saved as a shared atomic type and then re-applied to other data attributes in other schemas. As you can see in the figure above, I’ve defined a constraint on zip code. I can then save the constraint rules I defined for zip code as a shared atomic type called zip_type for example. The next time I get a different data source with a schema that also contains a zip code field, I can simply apply the shared atomic type (shown below) and the previously defined constraints will be automatically applied. Tip 2: Unlock the power of regular expressions in Semantic Types Another powerful feature introduced in expressor Studio 3.2 is the option to use regular expressions as a constraint. A regular expression is used to identify patterns within data. The patterns could be something as simple as a date format or something much more complex such as a street address. For example, I could define that a valid IP address should be made up of 4 numbers, each 0 to 255, and separated by a period. So 192.168.23.123 might be a valid IP address whereas 888.777.0.123 would not be. How can I account for this using regular expressions? A very simple regular expression that would look for any 4 sets of 3 digits separated by a period would be: ^[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}$ Alternatively, the following would be the exact check for truly valid IP addresses as we had defined above: ^(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])$ . In expressor, we would enter this regular expression as a constraint like this: Here we select the corrective action to be ‘Escalate’, meaning that the expressor Dataflow operator will decide what to do. Some of the options include rejecting the offending record, skipping it, or aborting the dataflow. Tip 3: Email pattern expressions that might come in handy In the example schema that I am using, there’s a field for email. Email addresses are often entered incorrectly because people are trying to avoid spam. While there are a lot of different ways to define what constitutes a valid email address, a quick search online yields a couple of really useful regular expressions for validating email addresses: This one is short and sweet: \b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b (Source: http://www.regular-expressions.info/) This one is more specific about which characters are allowed: ^([a-zA-Z0-9_\-\.]+)@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.)|(([a-zA-Z0-9\-]+\.)+))([a-zA-Z]{2,4}|[0-9]{1,3})(\]?)$ (Source: http://regexlib.com/REDetails.aspx?regexp_id=26 ) Tip 4: Reject “dirty data” for analysis or further processing Yet another feature introduced in expressor Studio 3.2 is the ability to reject records based on constraint violations. To capture reject records on input, simply specify Reject Record in the Error Handling setting for the Read File operator. Then attach a Write File operator to the reject port of the Read File operator as such: Next, in the Write File operator, you can configure the expressor operator in a similar way to the Read File. The key difference would be that the schema needs to be derived from the upstream operator as shown below: Once configured, expressor will output rejected records to the file you specified. In addition to the rejected records, expressor also captures some diagnostic information that will be helpful towards identifying why the record was rejected. This makes diagnosing errors much easier! Tip 5: Use a Filter or Transform after the initial cleansing to finish the job Sometimes you may want to predicate the data cleansing on a more complex set of conditions. For example, I may only be interested in processing data containing males over the age of 25 in certain zip codes. Using an expressor Filter operator, you can define the conditional logic which isolates the records of importance away from the others. Alternatively, the expressor Transform operator can be used to alter the input value via a user defined algorithm or transformation. It also supports the use of conditional logic and data can be rejected based on constraint violations. However, the best tip I can leave you with is to not constrain your solution design approach – expressor operators can be combined in many different ways to achieve the desired results. For example, in the expressor Dataflow below, I can post-process the reject data from the Filter which did not meet my pre-defined criteria and, if successful, Funnel it back into the flow so that it gets written to the target table. I continue to be impressed that expressor offers all this functionality as part of their FREE expressor Studio desktop ETL tool, which you can download from here. Their Studio ETL tool is absolutely free and they are very open about saying that if you want to deploy their software on a dedicated Windows Server, you need to purchase their server software, whose pricing is posted on their website. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: Pinal Dave, PostADay, SQL, SQL Authority, SQL Query, SQL Scripts, SQL Server, SQL Tips and Tricks, T SQL, Technology

Search Results

Search found 837 results on 34 pages for 'the boy za'.

Page 12/34 | < Previous Page | 8 9 10 11 12 13 14 15 16 17 18 19 | Next Page >

- by Mallika Iyer

- by Rees

- by Tristan

- by timkl

- by zenWeasel

- by Misiur

- by alexy13

- by iko

- by user322731

- by Matt H.

- by James P.

- by benhowdle89

- by Webbo

- by YY

- by bell

- by lego69

- by da5id

- by Surya sasidhar

- by user2886545

- by James

- by andyjv

- by chifliiiii

- by Johnny

- by Batfan

- by pinaldave

< Previous Page | 8 9 10 11 12 13 14 15 16 17 18 19 | Next Page >