Search Results

Search found 22897 results on 916 pages for 'query processing'.

Page 57/916 | < Previous Page | 53 54 55 56 57 58 59 60 61 62 63 64 | Next Page >

SQL SERVER – 5 Tips for Improving Your Data with expressor Studio

- by pinaldave

It’s no secret that bad data leads to bad decisions and poor results. However, how do you prevent dirty data from taking up residency in your data store? Some might argue that it’s the responsibility of the person sending you the data. While that may be true, in practice that will rarely hold up. It doesn’t matter how many times you ask, you will get the data however they decide to provide it. So now you have bad data. What constitutes bad data? There are quite a few valid answers, for example: Invalid date values Inappropriate characters Wrong data Values that exceed a pre-set threshold While it is certainly possible to write your own scripts and custom SQL to identify and deal with these data anomalies, that effort often takes too long and becomes difficult to maintain. Instead, leveraging an ETL tool like expressor Studio makes the data cleansing process much easier and faster. Below are some tips for leveraging expressor to get your data into tip-top shape. Tip 1: Build reusable data objects with embedded cleansing rules One of the new features in expressor Studio 3.2 is the ability to define constraints at the metadata level. Using expressor’s concept of Semantic Types, you can define reusable data objects that have embedded logic such as constraints for dealing with dirty data. Once defined, they can be saved as a shared atomic type and then re-applied to other data attributes in other schemas. As you can see in the figure above, I’ve defined a constraint on zip code. I can then save the constraint rules I defined for zip code as a shared atomic type called zip_type for example. The next time I get a different data source with a schema that also contains a zip code field, I can simply apply the shared atomic type (shown below) and the previously defined constraints will be automatically applied. Tip 2: Unlock the power of regular expressions in Semantic Types Another powerful feature introduced in expressor Studio 3.2 is the option to use regular expressions as a constraint. A regular expression is used to identify patterns within data. The patterns could be something as simple as a date format or something much more complex such as a street address. For example, I could define that a valid IP address should be made up of 4 numbers, each 0 to 255, and separated by a period. So 192.168.23.123 might be a valid IP address whereas 888.777.0.123 would not be. How can I account for this using regular expressions? A very simple regular expression that would look for any 4 sets of 3 digits separated by a period would be: ^[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}$ Alternatively, the following would be the exact check for truly valid IP addresses as we had defined above: ^(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])$ . In expressor, we would enter this regular expression as a constraint like this: Here we select the corrective action to be ‘Escalate’, meaning that the expressor Dataflow operator will decide what to do. Some of the options include rejecting the offending record, skipping it, or aborting the dataflow. Tip 3: Email pattern expressions that might come in handy In the example schema that I am using, there’s a field for email. Email addresses are often entered incorrectly because people are trying to avoid spam. While there are a lot of different ways to define what constitutes a valid email address, a quick search online yields a couple of really useful regular expressions for validating email addresses: This one is short and sweet: \b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b (Source: http://www.regular-expressions.info/) This one is more specific about which characters are allowed: ^([a-zA-Z0-9_\-\.]+)@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.)|(([a-zA-Z0-9\-]+\.)+))([a-zA-Z]{2,4}|[0-9]{1,3})(\]?)$ (Source: http://regexlib.com/REDetails.aspx?regexp_id=26 ) Tip 4: Reject “dirty data” for analysis or further processing Yet another feature introduced in expressor Studio 3.2 is the ability to reject records based on constraint violations. To capture reject records on input, simply specify Reject Record in the Error Handling setting for the Read File operator. Then attach a Write File operator to the reject port of the Read File operator as such: Next, in the Write File operator, you can configure the expressor operator in a similar way to the Read File. The key difference would be that the schema needs to be derived from the upstream operator as shown below: Once configured, expressor will output rejected records to the file you specified. In addition to the rejected records, expressor also captures some diagnostic information that will be helpful towards identifying why the record was rejected. This makes diagnosing errors much easier! Tip 5: Use a Filter or Transform after the initial cleansing to finish the job Sometimes you may want to predicate the data cleansing on a more complex set of conditions. For example, I may only be interested in processing data containing males over the age of 25 in certain zip codes. Using an expressor Filter operator, you can define the conditional logic which isolates the records of importance away from the others. Alternatively, the expressor Transform operator can be used to alter the input value via a user defined algorithm or transformation. It also supports the use of conditional logic and data can be rejected based on constraint violations. However, the best tip I can leave you with is to not constrain your solution design approach – expressor operators can be combined in many different ways to achieve the desired results. For example, in the expressor Dataflow below, I can post-process the reject data from the Filter which did not meet my pre-defined criteria and, if successful, Funnel it back into the flow so that it gets written to the target table. I continue to be impressed that expressor offers all this functionality as part of their FREE expressor Studio desktop ETL tool, which you can download from here. Their Studio ETL tool is absolutely free and they are very open about saying that if you want to deploy their software on a dedicated Windows Server, you need to purchase their server software, whose pricing is posted on their website. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: Pinal Dave, PostADay, SQL, SQL Authority, SQL Query, SQL Scripts, SQL Server, SQL Tips and Tricks, T SQL, Technology

Read the article
Tellago speaks about Business Intellligence with SQL Server 2008 R2

- by gsusx

At Tellago , we always try to stay in the frontlines of technology that can enhance our solution development practices. This year we are putting a lot of emphasis on business intelligence and in particular the new set of BI technologies such as Microsoft's PowerPivot, Master Data Services and StreamInsight that are scheduled to be release with SQL Server 2008 R2. In the last few weeks we have been working closely with different Microsoft field offices to coordinate a series of customers events that...(read more)

Read the article
Shader effect similar to Metro 2033 gasmask

- by Tim

I was thinking about effects in games the other day and I was reminded of the Gasmask effect from Metro 2033. Once you put the gasmask on it blurred a bit in the corners and could ice up and even get cracked. I assume that something like that is done using a shader. I have been experimenting a bit with game development, so far mostly playing with existing rendering engines and adding physics support etc. I would like to learn more about this sort of effect. Can someone give me a simple example of a shader that would alter the entire scene like this. Or if not a shader then an idea on how it would be done. Thanks. Edit : Include screenshot of the metro 2033 gasmask effect.

Read the article
SQL University: Parallelism Week - Introduction

- by Adam Machanic

Welcome to Parallelism Week at SQL University . My name is Adam Machanic, and I'm your professor. Imagine having 8 brains, or 16, or 32. Imagine being able to break up complex thoughts and distribute them across your many brains, so that you could solve problems faster. Now quit imagining that, because you're human and you're stuck with only one brain, and you only get access to the entire thing if you're lucky enough to have avoided abusing too many recreational drugs. For your database server,...(read more)

Read the article
Useful link for Xpath Query Testing

- by Arvind Chaudhary

Following link is very useful to test xpath query instantly http://www.xmlme.com/XpathTool.aspx Cheers, Arvind Chaudhary

Read the article
Integrating BizTalk Server and StreamInsight paper

- by gsusx

With all the holidays madness I didn't realized that my "Integrating BizTalk Server and StreamInsight" paper is now available on MSDN . This paper was originally an idea of the BizTalk product team and intends to present some fundamental scenarios that can be enabled by the combination of BizTalk Server and StreamInsight. Thanks to everybody who, directly or indirectly, provided feedback about this paper: Syed Rasheed, Mark Simms , Richard Seroter , Roman Schindlauer and Torsten Grabs from the StreamInsight...(read more)

Read the article
How to speed up this simple mysql query?

- by Jim Thio

The query is simple: SELECT TB.ID, TB.Latitude, TB.Longitude, 111151.29341326*SQRT(pow(-6.185-TB.Latitude,2)+pow(106.773-TB.Longitude,2)*cos(-6.185*0.017453292519943)*cos(TB.Latitude*0.017453292519943)) AS Distance FROM `tablebusiness` AS TB WHERE -6.2767668133836 < TB.Latitude AND TB.Latitude < -6.0932331866164 AND FoursquarePeopleCount >5 AND 106.68123318662 < TB.Longitude AND TB.Longitude <106.86476681338 ORDER BY Distance See, we just look at all business within a rectangle. 1.6 million rows. Within that small rectangle there are only 67,565 businesses. The structure of the table is 1 ID varchar(250) utf8_unicode_ci No None Change Change Drop Drop More Show more actions 2 Email varchar(400) utf8_unicode_ci Yes NULL Change Change Drop Drop More Show more actions 3 InBuildingAddress varchar(400) utf8_unicode_ci Yes NULL Change Change Drop Drop More Show more actions 4 Price int(10) Yes NULL Change Change Drop Drop More Show more actions 5 Street varchar(400) utf8_unicode_ci Yes NULL Change Change Drop Drop More Show more actions 6 Title varchar(400) utf8_unicode_ci Yes NULL Change Change Drop Drop More Show more actions 7 Website varchar(400) utf8_unicode_ci Yes NULL Change Change Drop Drop More Show more actions 8 Zip varchar(400) utf8_unicode_ci Yes NULL Change Change Drop Drop More Show more actions 9 Rating Star double Yes NULL Change Change Drop Drop More Show more actions 10 Rating Weight double Yes NULL Change Change Drop Drop More Show more actions 11 Latitude double Yes NULL Change Change Drop Drop More Show more actions 12 Longitude double Yes NULL Change Change Drop Drop More Show more actions 13 Building varchar(200) utf8_unicode_ci Yes NULL Change Change Drop Drop More Show more actions 14 City varchar(100) utf8_unicode_ci No None Change Change Drop Drop More Show more actions 15 OpeningHour varchar(400) utf8_unicode_ci Yes NULL Change Change Drop Drop More Show more actions 16 TimeStamp timestamp on update CURRENT_TIMESTAMP No CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP Change Change Drop Drop More Show more actions 17 CountViews int(11) Yes NULL Change Change Drop Drop More Show more actions The indexes are: Edit Edit Drop Drop PRIMARY BTREE Yes No ID 1965990 A Edit Edit Drop Drop City BTREE No No City 131066 A Edit Edit Drop Drop Building BTREE No No Building 21 A YES Edit Edit Drop Drop OpeningHour BTREE No No OpeningHour (255) 21 A YES Edit Edit Drop Drop Email BTREE No No Email (255) 21 A YES Edit Edit Drop Drop InBuildingAddress BTREE No No InBuildingAddress (255) 21 A YES Edit Edit Drop Drop Price BTREE No No Price 21 A YES Edit Edit Drop Drop Street BTREE No No Street (255) 982995 A YES Edit Edit Drop Drop Title BTREE No No Title (255) 1965990 A YES Edit Edit Drop Drop Website BTREE No No Website (255) 491497 A YES Edit Edit Drop Drop Zip BTREE No No Zip (255) 178726 A YES Edit Edit Drop Drop Rating Star BTREE No No Rating Star 21 A YES Edit Edit Drop Drop Rating Weight BTREE No No Rating Weight 21 A YES Edit Edit Drop Drop Latitude BTREE No No Latitude 1965990 A YES Edit Edit Drop Drop Longitude BTREE No No Longitude 1965990 A YES The query took forever. I think there has to be something wrong there. Showing rows 0 - 29 ( 67,565 total, Query took 12.4767 sec)

Read the article
SQL SERVER – Weekly Series – Memory Lane – #039

- by Pinal Dave

Here is the list of selected articles of SQLAuthority.com across all these years. Instead of just listing all the articles I have selected a few of my most favorite articles and have listed them here with additional notes below it. Let me know which one of the following is your favorite article from memory lane. 2007 FQL – Facebook Query Language Facebook list following advantages of FQL: Condensed XML reduces bandwidth and parsing costs. More complex requests can reduce the number of requests necessary. Provides a single consistent, unified interface for all of your data. It’s fun! UDF – Get the Day of the Week Function The day of the week can be retrieved in SQL Server by using the DatePart function. The value returned by the function is between 1 (Sunday) and 7 (Saturday). To convert this to a string representing the day of the week, use a CASE statement. UDF – Function to Get Previous And Next Work Day – Exclude Saturday and Sunday While reading ColdFusion blog of Ben Nadel Getting the Previous Day In ColdFusion, Excluding Saturday And Sunday, I realize that I use similar function on my SQL Server Database. This function excludes the Weekends (Saturday and Sunday), and it gets previous as well as next work day. Complete Series of SQL Server Interview Questions and Answers Data Warehousing Interview Questions and Answers – Introduction Data Warehousing Interview Questions and Answers – Part 1 Data Warehousing Interview Questions and Answers – Part 2 Data Warehousing Interview Questions and Answers – Part 3 Data Warehousing Interview Questions and Answers Complete List Download 2008 Introduction to Log Viewer In SQL Server all the windows event logs can be seen along with SQL Server logs. Interface for all the logs is same and can be launched from the same place. This log can be exported and filtered as well. DBCC SHRINKFILE Takes Long Time to Run If you are DBA who are involved with Database Maintenance and file group maintenance, you must have experience that many times DBCC SHRINKFILE operations takes a long time but any other operations with Database are relatively quicker. mssqlsystemresource – Resource Database The purpose of resource database is to facilitates upgrading to the new version of SQL Server without any hassle. In previous versions whenever version of SQL Server was upgraded all the previous version system objects needs to be dropped and new version system objects to be created. 2009 Puzzle – Write Script to Generate Primary Key and Foreign Key In SQL Server Management Studio (SSMS), there is no option to script all the keys. If one is required to script keys they will have to manually script each key one at a time. If database has many tables, generating one key at a time can be a very intricate task. I want to throw a question to all of you if any of you have scripts for the same purpose. Maximizing View of SQL Server Management Studio – Full Screen – New Screen I had explained the following two different methods: 1) Open Results in Separate Tab - This is a very interesting method as result pan shows up in a different tab instead of the splitting screen horizontally. 2) Open SSMS in Full Screen - This works always and to its best. Not many people are aware of this method; hence, very few people use it to enhance performance. 2010 Find Queries using Parallelism from Cached Plan T-SQL script gets all the queries and their execution plan where parallelism operations are kicked up. Pay attention there is TOP 10 is used, if you have lots of transactional operations, I suggest that you change TOP 10 to TOP 50 This is the list of the all the articles in the series of computed columns. SQL SERVER – Computed Column – PERSISTED and Storage This article talks about how computed columns are created and why they take more storage space than before. SQL SERVER – Computed Column – PERSISTED and Performance This article talks about how PERSISTED columns give better performance than non-persisted columns. SQL SERVER – Computed Column – PERSISTED and Performance – Part 2 This article talks about how non-persisted columns give better performance than PERSISTED columns. SQL SERVER – Computed Column and Performance – Part 3 This article talks about how Index improves the performance of Computed Columns. SQL SERVER – Computed Column – PERSISTED and Storage – Part 2 This article talks about how creating index on computed column does not grow the row length of table. SQL SERVER – Computed Columns – Index and Performance This article summarized all the articles related to computed columns. 2011 SQL SERVER – Interview Questions and Answers – Frequently Asked Questions – Data Warehousing Concepts – Day 21 of 31 What is Data Warehousing? What is Business Intelligence (BI)? What is a Dimension Table? What is Dimensional Modeling? What is a Fact Table? What are the Fundamental Stages of Data Warehousing? What are the Different Methods of Loading Dimension tables? Describes the Foreign Key Columns in Fact Table and Dimension Table? What is Data Mining? What is the Difference between a View and a Materialized View? SQL SERVER – Interview Questions and Answers – Frequently Asked Questions – Data Warehousing Concepts – Day 22 of 31 What is OLTP? What is OLAP? What is the Difference between OLTP and OLAP? What is ODS? What is ER Diagram? SQL SERVER – Interview Questions and Answers – Frequently Asked Questions – Data Warehousing Concepts – Day 23 of 31 What is ETL? What is VLDB? Is OLTP Database is Design Optimal for Data Warehouse? If denormalizing improves Data Warehouse Processes, then why is the Fact Table is in the Normal Form? What are Lookup Tables? What are Aggregate Tables? What is Real-Time Data-Warehousing? What are Conformed Dimensions? What is a Conformed Fact? How do you Load the Time Dimension? What is a Level of Granularity of a Fact Table? What are Non-Additive Facts? What is a Factless Facts Table? What are Slowly Changing Dimensions (SCD)? SQL SERVER – Interview Questions and Answers – Frequently Asked Questions – Data Warehousing Concepts – Day 24 of 31 What is Hybrid Slowly Changing Dimension? What is BUS Schema? What is a Star Schema? What Snow Flake Schema? Differences between the Star and Snowflake Schema? What is Difference between ER Modeling and Dimensional Modeling? What is Degenerate Dimension Table? Why is Data Modeling Important? What is a Surrogate Key? What is Junk Dimension? What is a Data Mart? What is the Difference between OLAP and Data Warehouse? What is a Cube and Linked Cube with Reference to Data Warehouse? What is Snapshot with Reference to Data Warehouse? What is Active Data Warehousing? What is the Difference between Data Warehousing and Business Intelligence? What is MDS? Explain the Paradigm of Bill Inmon and Ralph Kimball. SQL SERVER – Azure Interview Questions and Answers – Guest Post by Paras Doshi – Day 25 of 31 Paras Doshi has submitted 21 interesting question and answers for SQL Azure. 1.What is SQL Azure? 2.What is cloud computing? 3.How is SQL Azure different than SQL server? 4.How many replicas are maintained for each SQL Azure database? 5.How can we migrate from SQL server to SQL Azure? 6.Which tools are available to manage SQL Azure databases and servers? 7.Tell me something about security and SQL Azure. 8.What is SQL Azure Firewall? 9.What is the difference between web edition and business edition? 10.How do we synchronize On Premise SQL server with SQL Azure? 11.How do we Backup SQL Azure Data? 12.What is the current pricing model of SQL Azure? 13.What is the current limitation of the size of SQL Azure DB? 14.How do you handle datasets larger than 50 GB? 15.What happens when the SQL Azure database reaches Max Size? 16.How many databases can we create in a single server? 17.How many servers can we create in a single subscription? 18.How do you improve the performance of a SQL Azure Database? 19.What is code near application topology? 20.What were the latest updates to SQL Azure service? 21.When does a workload on SQL Azure get throttled? SQL SERVER – Interview Questions and Answers – Guest Post by Malathi Mahadevan – Day 26 of 31 Malachi had asked a simple question which has several answers. Each answer makes you think and ponder about the reality of the IT world. Look at the simple question – ‘What is the toughest challenge you have faced in your present job and how did you handle it’? and its various answers. Each answer has its own story. SQL SERVER – Interview Questions and Answers – Guest Post by Rick Morelan – Day 27 of 31 Rick Morelan of Joes2Pros has written an excellent blog post on the subject how to find top N values. Most people are fully aware of how the TOP keyword works with a SELECT statement. After years preparing so many students to pass the SQL Certification I noticed they were pretty well prepared for job interviews too. Yes, they would do well in the interview but not great. There seemed to be a few questions that would come up repeatedly for almost everyone. Rick addresses similar questions in his lucid writing skills. 2012 Observation of Top with Index and Order of Resultset SQL Server has lots of things to learn and share. It is amazing to see how people evaluate and understand different techniques and styles differently when implementing. The real reason may be absolutely different but we may blame something totally different for the incorrect results. Read the blog post to learn more. How do I Record Video and Webcast How to Convert Hex to Decimal or INT Earlier I asked regarding a question about how to convert Hex to Decimal. I promised that I will post an answer with Due Credit to the author but never got around to post a blog post around it. Read the original post over here SQL SERVER – Question – How to Convert Hex to Decimal. Query to Get Unique Distinct Data Based on Condition – Eliminate Duplicate Data from Resultset The natural reaction will be to suggest DISTINCT or GROUP BY. However, not all the questions can be solved by DISTINCT or GROUP BY. Let us see the following example, where a user wanted only latest records to be displayed. Let us see the example to understand further. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Memory Lane, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology

Read the article
Tools for modelling data and workflows using structured text files

- by Alexey

Consider a case when I want to try some idea of an application. But I want to avoid investing a lot of effort in coding UI/work flows/database schema etc before I see that it's going to be useful to me (as example of potential user). My idea is stay lightweight and put all the data in text files. So the components could be following: Domain objects are represented by text files or their fragments Domain objects are grouped by their type using directories Structure the files using some both human- and machine-friendly format, e.g. YAML Use some smart text editor (e.g. vim, emacs, rubymine) to edit and navigate those files Use color schemes and macros/custom commands of the text editor to effectively manipulate those files Use scripts (or a lightweight web framework like Sinatra) to try some business logic ideas on top of the data model The question is: Are there tools or toolkits that support or can be adopted to this approach? Also any ideas, links to articles/other knowledge sources are very welcome. And more specific question: What is the simplest way to index and update index of files with YAML files?

Read the article
SQL University: Parallelism Week - Introduction

- by Adam Machanic

Welcome to Parallelism Week at SQL University . My name is Adam Machanic, and I'm your professor. Imagine having 8 brains, or 16, or 32. Imagine being able to break up complex thoughts and distribute them across your many brains, so that you could solve problems faster. Now quit imagining that, because you're human and you're stuck with only one brain, and you only get access to the entire thing if you're lucky enough to have avoided abusing too many recreational drugs. For your database server,...(read more)

Read the article
Stereo images rectification and disparity: which algorithms?

- by alessandro.francesconi

I'm trying to figure out what are currently the two most efficent algorithms that permit, starting from a L/R pair of stereo images created using a traditional camera (so affected by some epipolar lines misalignment), to produce a pair of adjusted images plus their depth information by looking at their disparity. Actually I've found lots of papers about these two methods, like: "Computing Rectifying Homographies for Stereo Vision" (Zhang - seems one of the best for rectification only) "Three-step image recti?cation" (Monasse) "Rectification and Disparity" (slideshow by Navab) "A fast area-based stereo matching algorithm" (Di Stefano - seems a bit inaccurate) "Computing Visual Correspondence with Occlusions via Graph Cuts" (Kolmogorov - this one produces a very good disparity map, with also occlusion informations, but is it efficient?) "Dense Disparity Map Estimation Respecting Image Discontinuities" (Alvarez - toooo long for a first review) Anyone could please give me some advices for orienting into this wide topic? What kind of algorithm/method should I treat first, considering that I'll work on a very simple input: a pair of left and right images and nothing else, no more information (some papers are based on additional, pre-taken, calibration infos)? Speaking about working implementations, the only interesting results I've seen so far belongs to this piece of software, but only for automatic rectification, not disparity: http://stereo.jpn.org/eng/stphmkr/index.html I tried the "auto-adjustment" feature and seems really effective. Too bad there is no source code...

Read the article
Serial plans: Threshold / Parallel_degree_limit = 1

- by jean-pierre.dijcks

As a very short follow up on the previous post. So here is some more on getting a serial plan and why that happens Another reason - compared to the auto DOP is not on as we looked at in the earlier post - and often more prevalent to get a serial plan is if the plan simply does not take long enough to consider a parallel path. The resulting plan and note looks like this (note that this is a serial plan!): explain plan for select count(1) from sales; SELECT PLAN_TABLE_OUTPUT FROM TABLE(DBMS_XPLAN.DISPLAY()); PLAN_TABLE_OUTPUT -------------------------------------------------------------------------------- Plan hash value: 672559287 -------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Cost (%CPU)| Time | Pstart| Pstop | -------------------------------------------------------------------------------------- PLAN_TABLE_OUTPUT -------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 5 (0)| 00:00:01 | | | | 1 | SORT AGGREGATE | | 1 | | | | | | 2 | PARTITION RANGE ALL| | 960 | 5 (0)| 00:00:01 | 1 | 16 | | 3 | TABLE ACCESS FULL | SALES | 960 | 5 (0)| 00:00:01 | 1 | 16 | Note ----- - automatic DOP: Computed Degree of Parallelism is 1 because of parallel threshold 14 rows selected. The parallel threshold is referring to parallel_min_time_threshold and since I did not change the default (10s) the plan is not being considered for a parallel degree computation and is therefore staying with the serial execution. Now we go into the land of crazy: Assume I do want this DOP=1 to happen, I could set the parameter in the init.ora, but to highlight it in this case I changed it on the session: alter session set parallel_degree_limit = 1; The result I get is: ERROR: ORA-02097: parameter cannot be modified because specified value is invalid ORA-00096: invalid value 1 for parameter parallel_degree_limit, must be from among CPU IO AUTO INTEGER>=2 Which of course makes perfect sense...

Read the article
How advanced are author-recognition methods?

- by Nick Rtz

From a written text by an author if a computer program analyses the text, how much can a computer program tell today about the author of some (long enough to be statistically significant) texts? Can the computer program even tell with "certainty" whether a man or a woman wrote this text based solely on the contents of the text and not an investigation such as ip numbers etc? I'm interested to know if there are algorithms in use for instance to automatically know whether an author was male or female or similar characteristics of an author that a computer program can decide based on analyses of the written text by an author. It could be useful to know before you read a message what a computer analyses says about the author, do you agree? If I for instance get a longer message from my wife that she has had an accident in Nigeria and the computer program says that with 99 % probability the message was written by a male author in his sixties of non-caucasian origin or likewise, or by somebody who is not my wife, then the computer program could help me investigate why a certain message differs in characteristics. There can also be other uses for instance just detecting outliers in a geographically or demographically bounded larger data set. Scam detection is the obvious use I'm thinking of but there could also be other uses. Are there already such programs that analyse a written text to tell something about the author based on word choice, use of pronouns, unusual language usage, or likewise?

Read the article
How to store generated eigen faces for future face recognition?

- by user3237134

My code works in the following manner: 1.First, it obtains several images from the training set 2.After loading these images, we find the normalized faces,mean face and perform several calculation. 3.Next, we ask for the name of an image we want to recognize 4.We then project the input image into the eigenspace, and based on the difference from the eigenfaces we make a decision. 5.Depending on eigen weight vector for each input image we make clusters using kmeans command. Source code i tried: clear all close all clc % number of images on your training set. M=1200; %Chosen std and mean. %It can be any number that it is close to the std and mean of most of the images. um=60; ustd=32; %read and show images(bmp); S=[]; %img matrix for i=1:M str=strcat(int2str(i),'.jpg'); %concatenates two strings that form the name of the image eval('img=imread(str);'); [irow icol d]=size(img); % get the number of rows (N1) and columns (N2) temp=reshape(permute(img,[2,1,3]),[irow*icol,d]); %creates a (N1*N2)x1 matrix S=[S temp]; %X is a N1*N2xM matrix after finishing the sequence %this is our S end %Here we change the mean and std of all images. We normalize all images. %This is done to reduce the error due to lighting conditions. for i=1:size(S,2) temp=double(S(:,i)); m=mean(temp); st=std(temp); S(:,i)=(temp-m)*ustd/st+um; end %show normalized images for i=1:M str=strcat(int2str(i),'.jpg'); img=reshape(S(:,i),icol,irow); img=img'; end %mean image; m=mean(S,2); %obtains the mean of each row instead of each column tmimg=uint8(m); %converts to unsigned 8-bit integer. Values range from 0 to 255 img=reshape(tmimg,icol,irow); %takes the N1*N2x1 vector and creates a N2xN1 matrix img=img'; %creates a N1xN2 matrix by transposing the image. % Change image for manipulation dbx=[]; % A matrix for i=1:M temp=double(S(:,i)); dbx=[dbx temp]; end %Covariance matrix C=A'A, L=AA' A=dbx'; L=A*A'; % vv are the eigenvector for L % dd are the eigenvalue for both L=dbx'*dbx and C=dbx*dbx'; [vv dd]=eig(L); % Sort and eliminate those whose eigenvalue is zero v=[]; d=[]; for i=1:size(vv,2) if(dd(i,i)>1e-4) v=[v vv(:,i)]; d=[d dd(i,i)]; end end %sort, will return an ascending sequence [B index]=sort(d); ind=zeros(size(index)); dtemp=zeros(size(index)); vtemp=zeros(size(v)); len=length(index); for i=1:len dtemp(i)=B(len+1-i); ind(i)=len+1-index(i); vtemp(:,ind(i))=v(:,i); end d=dtemp; v=vtemp; %Normalization of eigenvectors for i=1:size(v,2) %access each column kk=v(:,i); temp=sqrt(sum(kk.^2)); v(:,i)=v(:,i)./temp; end %Eigenvectors of C matrix u=[]; for i=1:size(v,2) temp=sqrt(d(i)); u=[u (dbx*v(:,i))./temp]; end %Normalization of eigenvectors for i=1:size(u,2) kk=u(:,i); temp=sqrt(sum(kk.^2)); u(:,i)=u(:,i)./temp; end % show eigenfaces; for i=1:size(u,2) img=reshape(u(:,i),icol,irow); img=img'; img=histeq(img,255); end % Find the weight of each face in the training set. omega = []; for h=1:size(dbx,2) WW=[]; for i=1:size(u,2) t = u(:,i)'; WeightOfImage = dot(t,dbx(:,h)'); WW = [WW; WeightOfImage]; end omega = [omega WW]; end % Acquire new image % Note: the input image must have a bmp or jpg extension. % It should have the same size as the ones in your training set. % It should be placed on your desktop ed_min=[]; srcFiles = dir('G:\newdatabase\*.jpg'); % the folder in which ur images exists for b = 1 : length(srcFiles) filename = strcat('G:\newdatabase\',srcFiles(b).name); Imgdata = imread(filename); InputImage=Imgdata; InImage=reshape(permute((double(InputImage)),[2,1,3]),[irow*icol,1]); temp=InImage; me=mean(temp); st=std(temp); temp=(temp-me)*ustd/st+um; NormImage = temp; Difference = temp-m; p = []; aa=size(u,2); for i = 1:aa pare = dot(NormImage,u(:,i)); p = [p; pare]; end InImWeight = []; for i=1:size(u,2) t = u(:,i)'; WeightOfInputImage = dot(t,Difference'); InImWeight = [InImWeight; WeightOfInputImage]; end noe=numel(InImWeight); % Find Euclidean distance e=[]; for i=1:size(omega,2) q = omega(:,i); DiffWeight = InImWeight-q; mag = norm(DiffWeight); e = [e mag]; end ed_min=[ed_min MinimumValue]; theta=6.0e+03; %disp(e) z(b,:)=InImWeight; end IDX = kmeans(z,5); clustercount=accumarray(IDX, ones(size(IDX))); disp(clustercount); QUESTIONS: 1.It is working fine for M=50(i.e Training set contains 50 images) but not for M=1200(i.e Training set contains 1200 images).It is not showing any error.There is no output.I waited for 10 min still there is no output. I think it is going infinite loop.What is the problem?Where i was wrong? 2.Instead of running the training set everytime how eigen faces generated are stored so that stored eigen faces are used for future face recoginition for a new input image.So it reduces wastage of time.

Read the article
How to properly render a Frame Buffer to the BackBuffer in Stage3D / AGAL

- by bigp

After doing a render pass with RenderToTarget (RTT), how do you properly render that texture buffer to the screen while maintaining original scale / proportions so it doesn't stretch or lose quality? Can an AGAL VertexShader & FragmentShader be written so it's adaptable to any Texture size and Viewport dimensions? I find I'm getting some "blocky" effects in some of my first attempts at "ping-ponging" between two Texture buffers (to create trailing effects). Perhaps I'm not using the UVs correctly between the rendering-to-target and/or the backbuffer? Is there a simpler way just to "splash" the texture on the backbuffer, or is a Quad absolutely necessary (4 vertices, 2 triangles)? If it needs the Quad, should the Texture buffer be fully drawn (0.0 to 1.0 for vertical and horizontal UVs), or only a percentage of it should, like the example below? Texture Buffer U: 0.0 to viewport.width/texturebuffer.width; Texture Buffer V: 0.0 to viewport.height/texturebuffer.height; Thanks!

Read the article
XNA - Error while rendering a texture to a 2D render target via SpriteBatch

- by Jared B

I've got this simple code that uses SpriteBatch to draw a texture onto a RenderTarget2D: private void drawScene(GameTime g) { GraphicsDevice.Clear(skyColor); GraphicsDevice.SetRenderTarget(targetScene); drawSunAndMoon(); effect.Fog = true; GraphicsDevice.SetVertexBuffer(line); effect.MainEffect.CurrentTechnique.Passes[0].Apply(); GraphicsDevice.DrawPrimitives(PrimitiveType.TriangleStrip, 0, 2); GraphicsDevice.SetRenderTarget(null); SceneTexture = targetScene; } private void drawPostProcessing(GameTime g) { effect.SceneTexture = SceneTexture; GraphicsDevice.SetRenderTarget(targetBloom); spriteBatch.Begin(SpriteSortMode.Immediate, BlendState.Opaque, null, null, null); { if (Bloom) effect.BlurEffect.CurrentTechnique.Passes[0].Apply(); spriteBatch.Draw( targetScene, new Rectangle(0, 0, Window.ClientBounds.Width, Window.ClientBounds.Height), Color.White); } spriteBatch.End(); BloomTexture = targetBloom; GraphicsDevice.SetRenderTarget(null); } Both methods are called from my Draw(GameTime gameTime) function. First drawScene is called, then drawPostProcessing is called. The thing is, when I run this code I get an error on the spriteBatch.Draw call: The render target must not be set on the device when it is used as a texture. I already found the solution, which is to draw the actual render target (targetScene) to the texture so it doesn't create a reference to the loaded render target. However, to my knowledge, the only way of doing this is to write: GraphicsDevice.SetRenderTarget(outputTarget) SpriteBatch.Draw(inputTarget, ...) GraphicsDevice.SetRenderTarget(null) Which encounters the same exact problem I'm having right now. So, the question I'm asking is: how would I render inputTarget to outputTarget without reference issues?

Read the article
Vectorization of matlab code for faster execution

- by user3237134

My code works in the following manner: 1.First, it obtains several images from the training set 2.After loading these images, we find the normalized faces,mean face and perform several calculation. 3.Next, we ask for the name of an image we want to recognize 4.We then project the input image into the eigenspace, and based on the difference from the eigenfaces we make a decision. 5.Depending on eigen weight vector for each input image we make clusters using kmeans command. Source code i tried: clear all close all clc % number of images on your training set. M=1200; %Chosen std and mean. %It can be any number that it is close to the std and mean of most of the images. um=60; ustd=32; %read and show images(bmp); S=[]; %img matrix for i=1:M str=strcat(int2str(i),'.jpg'); %concatenates two strings that form the name of the image eval('img=imread(str);'); [irow icol d]=size(img); % get the number of rows (N1) and columns (N2) temp=reshape(permute(img,[2,1,3]),[irow*icol,d]); %creates a (N1*N2)x1 matrix S=[S temp]; %X is a N1*N2xM matrix after finishing the sequence %this is our S end %Here we change the mean and std of all images. We normalize all images. %This is done to reduce the error due to lighting conditions. for i=1:size(S,2) temp=double(S(:,i)); m=mean(temp); st=std(temp); S(:,i)=(temp-m)*ustd/st+um; end %show normalized images for i=1:M str=strcat(int2str(i),'.jpg'); img=reshape(S(:,i),icol,irow); img=img'; end %mean image; m=mean(S,2); %obtains the mean of each row instead of each column tmimg=uint8(m); %converts to unsigned 8-bit integer. Values range from 0 to 255 img=reshape(tmimg,icol,irow); %takes the N1*N2x1 vector and creates a N2xN1 matrix img=img'; %creates a N1xN2 matrix by transposing the image. % Change image for manipulation dbx=[]; % A matrix for i=1:M temp=double(S(:,i)); dbx=[dbx temp]; end %Covariance matrix C=A'A, L=AA' A=dbx'; L=A*A'; % vv are the eigenvector for L % dd are the eigenvalue for both L=dbx'*dbx and C=dbx*dbx'; [vv dd]=eig(L); % Sort and eliminate those whose eigenvalue is zero v=[]; d=[]; for i=1:size(vv,2) if(dd(i,i)>1e-4) v=[v vv(:,i)]; d=[d dd(i,i)]; end end %sort, will return an ascending sequence [B index]=sort(d); ind=zeros(size(index)); dtemp=zeros(size(index)); vtemp=zeros(size(v)); len=length(index); for i=1:len dtemp(i)=B(len+1-i); ind(i)=len+1-index(i); vtemp(:,ind(i))=v(:,i); end d=dtemp; v=vtemp; %Normalization of eigenvectors for i=1:size(v,2) %access each column kk=v(:,i); temp=sqrt(sum(kk.^2)); v(:,i)=v(:,i)./temp; end %Eigenvectors of C matrix u=[]; for i=1:size(v,2) temp=sqrt(d(i)); u=[u (dbx*v(:,i))./temp]; end %Normalization of eigenvectors for i=1:size(u,2) kk=u(:,i); temp=sqrt(sum(kk.^2)); u(:,i)=u(:,i)./temp; end % show eigenfaces; for i=1:size(u,2) img=reshape(u(:,i),icol,irow); img=img'; img=histeq(img,255); end % Find the weight of each face in the training set. omega = []; for h=1:size(dbx,2) WW=[]; for i=1:size(u,2) t = u(:,i)'; WeightOfImage = dot(t,dbx(:,h)'); WW = [WW; WeightOfImage]; end omega = [omega WW]; end % Acquire new image % Note: the input image must have a bmp or jpg extension. % It should have the same size as the ones in your training set. % It should be placed on your desktop ed_min=[]; srcFiles = dir('G:\newdatabase\*.jpg'); % the folder in which ur images exists for b = 1 : length(srcFiles) filename = strcat('G:\newdatabase\',srcFiles(b).name); Imgdata = imread(filename); InputImage=Imgdata; InImage=reshape(permute((double(InputImage)),[2,1,3]),[irow*icol,1]); temp=InImage; me=mean(temp); st=std(temp); temp=(temp-me)*ustd/st+um; NormImage = temp; Difference = temp-m; p = []; aa=size(u,2); for i = 1:aa pare = dot(NormImage,u(:,i)); p = [p; pare]; end InImWeight = []; for i=1:size(u,2) t = u(:,i)'; WeightOfInputImage = dot(t,Difference'); InImWeight = [InImWeight; WeightOfInputImage]; end noe=numel(InImWeight); % Find Euclidean distance e=[]; for i=1:size(omega,2) q = omega(:,i); DiffWeight = InImWeight-q; mag = norm(DiffWeight); e = [e mag]; end ed_min=[ed_min MinimumValue]; theta=6.0e+03; %disp(e) z(b,:)=InImWeight; end IDX = kmeans(z,5); clustercount=accumarray(IDX, ones(size(IDX))); disp(clustercount); Running time for 50 images:Elapsed time is 103.947573 seconds. QUESTIONS: 1.It is working fine for M=50(i.e Training set contains 50 images) but not for M=1200(i.e Training set contains 1200 images).It is not showing any error.There is no output.I waited for 10 min still there is no output. I think it is going infinite loop.What is the problem?Where i was wrong?

Read the article
Sentence Tree v/s Words List

- by Rohit Jose

I was recently tasked with building a Name Entity Recognizer as part of a project. The objective was to parse a given sentence and come up with all the possible combinations of the entities. One approach that was suggested was to keep a lookup table for all the know connector words like articles and conjunctions, remove them from the words list after splitting the sentence on the basis of the spaces. This would leave out the Name Entities in the sentence. A lookup is then done for these identified entities on another lookup table that associates them to the entity type, for example if the sentence was: Remember the Titans was a movie directed by Boaz Yakin, the possible outputs would be: {Remember the Titans,Movie} was {a movie,Movie} directed by {Boaz Yakin,director} {Remember the Titans,Movie} was a movie directed by Boaz Yakin {Remember the Titans,Movie} was {a movie,Movie} directed by Boaz Yakin {Remember the Titans,Movie} was a movie directed by {Boaz Yakin,director} Remember the Titans was {a movie,Movie} directed by Boaz Yakin Remember the Titans was {a movie,Movie} directed by {Boaz Yakin,director} Remember the Titans was a movie directed by {Boaz Yakin,director} Remember the {the titans,Movie,Sports Team} was {a movie,Movie} directed by {Boaz Yakin,director} Remember the {the titans,Movie,Sports Team} was a movie directed by Boaz Yakin Remember the {the titans,Movie,Sports Team} was {a movie,Movie} directed by Boaz Yakin Remember the {the titans,Movie,Sports Team} was a movie directed by {Boaz Yakin,director} The entity lookup table here would contain the following data: Remember the Titans=Movie a movie=Movie Boaz Yakin=director the Titans=Movie the Titans=Sports Team Another alternative logic that was put forward was to build a crude sentence tree that would contain the connector words in the lookup table as parent nodes and do a lookup in the entity table for the leaf node that might contain the entities. The tree that was built for the sentence above would be: The question I am faced with is the benefits of the two approaches, should I be going for the tree approach to represent the sentence parsing, since it provides a more semantic structure? Is there a better approach I should be going for solving it?

Read the article
google custom search gives different result number for same query

- by santiagozky

We are using google custom search and we have found that often the totalResults iterates between two values, even for the same query. The different values can be slightly different or more than double. The parameters I am using look like this: https://www.googleapis.com/customsearch/v1? q=something cx=XXXXXXXXXX lr=lang_en siteSearch=www.mydomain.com start=1 fields=context%2Citems%28fileFormat%2CformattedUrl%2Clink%2Cpagemap%2Csnippet%2Ctitle%29%2Cqueries%2CsearchInformation%28searchTime%2CtotalResults%29%2Cspelling%2FcorrectedQuery key=YYYYYYYYYYYYYYY filter=0 This is problem because of calculating the number of result pages. How can I get the same results for the same query?

Read the article
Looking for mass cropping software

- by Bart van Heukelom

I'm looking for a tool than runs on Ubuntu that can let me: Open an image in a folder which has thousands Crop and rotate it Save as a copy, automatically named (not manually), with one click. Preferably with something in the name that I can later use to filter these cropped copies in Nautilus (unless it saves in another directory, that'd be even better). Move to next image and repeat Does it exist?

Read the article
Best Practice - XML To Excel

- by MemLeak

I've to read a big XML file with a lot of information. Afterwards I extract the needed information (~20 Points(columns) / ~80 relevant Data (rows, some of them with subdatasets) and write them out in a Excel File. My Question is how to handle the extraction (of unused Data) part, should I copy the whole file and delete the unused parts, and then write it to excel or is it a good approach to create Objects for each column? should I write the whole xml to excel and start to delete rows in excel? What would be performant and a acceptable solution?

Read the article
Stitch scanned images using CLI

- by Adam Matan

I have scanned a newspaper article which was larger than the scanner glass. Each page was scanned twice: the top and the bottom parts, where the middle part appeared in both images. Is there a way to quickly match and stitch these scanned images, preferably using CLI? The panorama stitching tools I know require lengthy configuration, which is mostly irrelevant: lens size, focus, angle etc. Hugin has a solution for this issue, but it isn't practical for batch jobs.

Read the article
OWB 11gR2 – Parallel DML and Query

- by David Allan

A quick post illustrating conventional (non direct path) parallel inserts and query using OWB following on from some recent posts from Jean-Pierre and Randolf on this topic. The mapping configuration properties is where you can define these hints in OWB, taking JP’s simplistic illustration, the parallel query hints in OWB are defined on the ‘Extraction hint’ property for the source, and the parallel DML hints are defined on the ‘Loading hint’ property on the target table operator. If we then generate the code you can see the intermediate code generated below… Finally…remember the parallel enabled session for this all to fly… Anyway, hope this helps join a few dots….

Read the article
Content Query Web Part and the Yes/No Field

- by Bil Simser

The Content Query Web Part (CQWP) is a pretty powerful beast. It allows you to do multiple site queries and aggregate the results. This is great for rolling up content and doing some summary type reporting. Here’s a trick to remember about Yes/No fields and using the CQWP. If you’re building a news style site and want to aggregate say all the announcements that people tag a certain way, up onto the home page this might be a solution. First we need to allow a way for users of all our sites to mark an announcement for inclusion on our Intranet Home Page. We’ll do this by just modifying the Announcement Content type and adding a Yes/No field to it. There are alternate ways of doing this like building a new Announcement type or stapling a feature to all sites to add our column but this is pretty low impact and only affects our current site collection so let’s go with it for now, okay? You can berate me in the comments about the proper way I should have done this part. Go to the Site Settings for the Site Collection and click on Site Content Types under the Galleries. This takes you to the gallery for this site and all subsites. Scroll down until you see the List Content Types and click on Announcements. Now we’re modifying the Announcement content type which affects all those announcement lists that are created by default if you’re building sites using the Team Site template (or creating a new Announcements list on any site for that matter). Click on Add from new site column under the Column list. This will allow us to create a new Yes/No field that users will see in Announcement items. This field will allow the user to flag the announcement for inclusion on the home page. Feel free to modify the fields as you see fit for your environment, this is just an example. Now that we’ve added the column to our Announcements Content type we can go into any site that has an announcement list, modify that announcement and flag it to be included on our home page. See the new Featured column? That was the result of modifying our Announcements Content Type on this site collection. Now we can move onto the dirty part, displaying it in a CQWP on the home page. And here is where the fun begins (and the head scratching should end). On our home page we want to drop a Content Query Web Part and aggregate any Announcement that’s been flagged as Featured by the users (we could also add the filter to handle Expires so we don’t show old content so go ahead and do that if you want). First add a CQWP to the page then modify the settings for the web part. In the first section, Query, we want the List Type to be set to Announcements and the Content type to be Announcement so set your options like this: Click Apply and you’ll see the results display all Announcements from any site in the site collection. I have five team sites created each with a unique announcement added to them. Now comes the filtering. We don’t want to include every announcement, only ones users flag using that Featured column we added. At first blush you might scroll down to the Additional Filters part of the Query options and set the Featured column to be equal to Yes: This seems correct doesn’t it? After all, the column is a Yes/No column and looking at an announcement in the site, it displays the field as Yes or No: However after applying the filter you get this result: (I have the announcements from Team Site 1 and Team Site 4 flagged as Featured) Huh? It’s BACKWARDS! Let’s confirm that. Go back in and change the Additional Filters section from Yes to No and hit Apply and you get this: Wait a minute? Shouldn’t I see Team Site 1 and 4 if the logic is backwards? Why am I seeing the same thing as before. What gives… For whatever reason, unknown to me, a Yes/No field (even though it displays as such) really uses 1 and 0 behind the scenes. Yeah, someone was stuck on using integer values for booleans when they wrote SharePoint (probably after a long night of white boarding ways to mess with developers heads) and came up with this. The solution is pretty simple but not very discoverable. Set the filter to include your flagged items like so: And it will filter the items marked as Featured correctly giving you this result: This kind of solution could also be extended and enhanced. Here are a few suggestions and ideas: Modify the ItemStyle.xsl file to add a new style for this aggregation which would include the first few paragraphs of the body (or perhaps add another field to the Content type called Excerpt or Summary and display that instead) Add an Image column to the Announcement Content type to include a Picture field and display it in the summary Add a Category choice field (Employee News, Current Events, Headlines, etc.) and add multiple CQWPs to the home page filtering each one on a different category I know some may find this topic old and dusty but I didn’t see a lot out there specifically on filtering the Yes/No fields and the whole 1/0 trick was a little wonky, so I figured a few pictures would help walk through overcoming yet another SharePoint weirdness. With a little work and some creative juices you can easily us the power of aggregation and the CQWP to build a news site from content on your team sites.

Read the article
Multiple volumetric lights

- by notabene

I recently read this GPU GEMS 3 article Volumetric Light Scattering as a Post-Process. I like the idea to add volumetric light property to realtime render i'm working on. Question is will it work for multiple lights? Our renderer uses one render pass per light and uses additive blending to sum incoming light. I'm mostly convinced that it have to work nice. Do you agree? Maybe there can be problem where light rays crosses each other.

Read the article

Search Results

Search found 22897 results on 916 pages for 'query processing'.

Page 57/916 | < Previous Page | 53 54 55 56 57 58 59 60 61 62 63 64 | Next Page >

- by pinaldave

- by gsusx

- by Tim

- by Adam Machanic

- by Arvind Chaudhary

- by gsusx

- by Jim Thio

- by Pinal Dave

- by Alexey

- by Adam Machanic

- by alessandro.francesconi

- by jean-pierre.dijcks

- by Nick Rtz

- by user3237134

- by bigp

- by Jared B

- by user3237134

- by Rohit Jose

- by santiagozky

- by Bart van Heukelom

- by MemLeak

- by Adam Matan

- by David Allan

- by Bil Simser

- by notabene

< Previous Page | 53 54 55 56 57 58 59 60 61 62 63 64 | Next Page >