Search Results

Search found 362 results on 15 pages for 'lookups'.

Page 1/15 | 1 2 3 4 5 6 7 8 9 10 11 12  | Next Page >

  • SSIS - Range lookups

    - by Repieter
      When developing an ETL solution in SSIS we sometimes need to do range lookups in SSIS. Several solutions for this can be found on the internet, but now we have built another solution which I would like to share, since it's pretty easy to implement and the performance is fast.   You can download the sample package to see how it works. Make sure you have the AdventureWorks2008R2 and AdventureWorksDW2008R2 databases installed. (Apologies for the layout of this blog, I don't do this too often :))   To give a little bit more information about the example, this is basically what is does: we load a facttable and do an SCD type 2 lookup operation of the Product dimension. This is done with a script component.   First we query the Data warehouse to create the lookup dataset. The query that is used for that is:   SELECT     [ProductKey]     ,[ProductAlternateKey]     ,[StartDate]     ,ISNULL([EndDate], '9999-01-01') AS EndDate FROM [DimProduct]     The output of this query is stored in a DataTable:     string lookupQuery = @"                         SELECT                             [ProductKey]                             ,[ProductAlternateKey]                             ,[StartDate]                             ,ISNULL([EndDate], '9999-01-01') AS EndDate                         FROM [DimProduct]";           OleDbCommand oleDbCommand = new OleDbCommand(lookupQuery, _oleDbConnection);         OleDbDataAdapter adapter = new OleDbDataAdapter(oleDbCommand);           _dataTable = new DataTable();         adapter.Fill(_dataTable);     Now that the dimension data is stored in the DataTable we use the following method to do the actual lookup:   public int RangeLookup(string businessKey, DateTime lookupDate)     {         // set default return value (Unknown)         int result = -1;           DataRow[] filteredRows;         filteredRows = _dataTable.Select(string.Format("ProductAlternateKey = '{0}'", businessKey));           for (int i = 0; i < filteredRows.Length; i++)         {             // check if the lookupdate is found between the startdate and enddate of any of the records             if (lookupDate >= (DateTime)filteredRows[i][2] && lookupDate < (DateTime)filteredRows[i][3])             {                 result = (filteredRows[i][0] == null) ? -1 : (int)filteredRows[i][0];                 break;             }         }           filteredRows = null;           return result;     }       This method is executed for every row that passes the script component. This is implemented in the ProcessInputRow method   public override void Input0_ProcessInputRow(Input0Buffer Row)     {         // Perform the lookup operation on the current row and put the value in the Surrogate Key Attribute         Row.ProductKey = RangeLookup(Row.ProductNumber, Row.OrderDate);     }   Now what actually happens?!   1. Every record passes the business key and the orderdate to the RangeLookup method. 2. The DataTable is then filtered on the business key of the current record. The output is stored in a DataRow [] object. 3. We loop over the DataRow[] object to see where the orderdate meets the following expression: (lookupDate >= (DateTime)filteredRows[i][2] && lookupDate < (DateTime)filteredRows[i][3]) 4. When the expression returns true (so where the data is between the Startdate and the EndDate), the surrogate key of the dimension record is returned   We have done some testing with this solution and it works great for us. Hope others can use this example to do their range lookups.

    Read the article

  • Multicast hostname lookups on OSX

    - by KARASZI István
    I have a problem with hostname lookups on my OSX computer. According to Apple's HK3473 document it says for v10.6: Host names that contain only one label in addition to local, for example "My-Computer.local", are resolved using Multicast DNS (Bonjour) by default. Host names that contain two or more labels in addition to local, for example "server.domain.local", are resolved using a DNS server by default. Which is not true as my testing. If I try to open a connection on my local computer to a remote port: telnet example.domain.local 22 then it will lookup the IP address with multicast DNS next to the A and AAAA lookups. This causes a two seconds lookup timeout on every lookup. Which is a lot! When I try with IPv4 only then it won't use the multicast queries to fetch the remote address just the simple A queries. telnet -4 example.domain.local 22 When I try with IPv6 only: telnet -6 example.domain.local 22 then it will lookup with multicast DNS and AAAA again, and the 2 seconds timeout delay occurs again. I've tried to create a resolver entry to my /etc/resolver/domain.local, and /etc/resolver/local.1, but none of them was working. Is there any way to disable this multicast lookups for the "two or more label addition to local" domains, or simply disable it for the selected subdomain (domain.local)? Thank you! Update #1 Thanks @mralexgray for the scutil --dns command, now I can see my domain in the list, but it's late in the order: DNS configuration resolver #1 domain : adverticum.lan nameserver[0] : 192.168.1.1 order : 200000 resolver #2 domain : local options : mdns timeout : 2 order : 300000 resolver #3 domain : 254.169.in-addr.arpa options : mdns timeout : 2 order : 300200 resolver #4 domain : 8.e.f.ip6.arpa options : mdns timeout : 2 order : 300400 resolver #5 domain : 9.e.f.ip6.arpa options : mdns timeout : 2 order : 300600 resolver #6 domain : a.e.f.ip6.arpa options : mdns timeout : 2 order : 300800 resolver #7 domain : b.e.f.ip6.arpa options : mdns timeout : 2 order : 301000 resolver #8 domain : domain.local nameserver[0] : 192.168.1.1 order : 200001 Maybe it would work if I could move the resolver #8 to the position #2. Update #2 No probably won't work because the local DNS server on 192.168.1.1 answering for domain.local requests and it's before the mDNS (resolver #2). Update #3 I could decrease the mDNS timeout in /System/Library/SystemConfiguration/IPMonitor.bundle/Contents/Info.plist file, which speeds up the lookups a little, but this is not the solution.

    Read the article

  • Sendmail is not ignoring MX lookups

    - by daniel
    Hello, I recently learned of the joys of square brackets with SMART_HOST to have sendmail ignore MX lookups. I need this functionality, however, I can't seem to make it persistant. Sending mail with -Am works, however, -bm does not. In the -Am case, the correct mail server is used. In the -bm case, an MX lookup is still being performed. Is there a way to disable MX lookups (or some working alternative)? Thanks

    Read the article

  • Inequality joins, Asynchronous transformations and Lookups : SSIS

    - by jamiet
    It is pretty much accepted by SQL Server Integration Services (SSIS) developers that synchronous transformations are generally quicker than asynchronous transformations (for a description of synchronous and asynchronous transformations go read Asynchronous and synchronous data flow components). Notice I said “generally” and not “always”; there are circumstances where using asynchronous transformations can be beneficial and in this blog post I’ll demonstrate such a scenario, one that is pretty common when building data warehouses. Imagine I have a [Customer] dimension table that manages information about all of my customers as a slowly-changing dimension. If that is a type 2 slowly changing dimension then you will likely have multiple rows per customer in that table. Furthermore you might also have datetime fields that indicate the effective time period of each member record. Here is such a table that contains data for four dimension members {Terry, Max, Henry, Horace}: Notice that we have multiple records per customer and that the [SCDStartDate] of a record is equivalent to the [SCDEndDate] of the record that preceded it (if there was one). (Note that I am on record as saying I am not a fan of this technique of storing an [SCDEndDate] but for the purposes of clarity I have included it here.) Anyway, the idea here is that we will have some incoming data containing [CustomerName] & [EffectiveDate] and we need to use those values to lookup [Customer].[CustomerId]. The logic will be: Lookup a [CustomerId] WHERE [CustomerName]=[CustomerName] AND [SCDStartDate] <= [EffectiveDate] AND [EffectiveDate] <= [SCDEndDate] The conventional approach to this would be to use a full cached lookup but that isn’t an option here because we are using inequality conditions. The obvious next step then is to use a non-cached lookup which enables us to change the SQL statement to use inequality operators: Let’s take a look at the dataflow: Notice these are all synchronous components. This approach works just fine however it does have the limitation that it has to issue a SQL statement against your lookup set for every row thus we can expect the execution time of our dataflow to increase linearly in line with the number of rows in our dataflow; that’s not good. OK, that’s the obvious method. Let’s now look at a different way of achieving this using an asynchronous Merge Join transform coupled with a Conditional Split. I’ve shown it post-execution so that I can include the row counts which help to illustrate what is going on here: Notice that there are more rows output from our Merge Join component than on the input. That is because we are joining on [CustomerName] and, as we know, we have multiple records per [CustomerName] in our lookup set. Notice also that there are two asynchronous components in here (the Sort and the Merge Join). I have embedded a video below that compares the execution times for each of these two methods. The video is just over 8minutes long. View on Vimeo  For those that can’t be bothered watching the video I’ll tell you the results here. The dataflow that used the Lookup transform took 36 seconds whereas the dataflow that used the Merge Join took less than two seconds. An illustration in case it is needed: Pretty conclusive proof that in some scenarios it may be quicker to use an asynchronous component than a synchronous one. Your mileage may of course vary. The scenario outlined here is analogous to performance tuning procedural SQL that uses cursors. It is common to eliminate cursors by converting them to set-based operations and that is effectively what we have done here. Our non-cached lookup is performing a discrete operation for every single row of data, exactly like a cursor does. By eliminating this cursor-in-disguise we have dramatically sped up our dataflow. I hope all of that proves useful. You can download the package that I demonstrated in the video from my SkyDrive at http://cid-550f681dad532637.skydrive.live.com/self.aspx/Public/BlogShare/20100514/20100514%20Lookups%20and%20Merge%20Joins.zip Comments are welcome as always. @Jamiet Share this post: email it! | bookmark it! | digg it! | reddit! | kick it! | live it!

    Read the article

  • Denali Paging–Key seek lookups

    - by Dave Ballantyne
    In my previous post “Denali Paging – is it win.win ?” I demonstrated the use of using the Paging functionality within Denali.  On reflection,  I think i may of been a little unfair and should of continued always planned to continue my investigations to the next step. In Pauls article, he uses a combination of ctes to first scan the ordered keys which is then filtered using TOP and rownumber and then uses those keys to seek the data.  So what happens if we replace the scanning portion of the code with the denali paging functionality. Heres the original procedure,  we are going to replace the functionality of the Keys and SelectedKeys ctes : CREATE  PROCEDURE dbo.FetchPageKeySeek         @PageSize   BIGINT,         @PageNumber BIGINT AS BEGIN         -- Key-Seek algorithm         WITH    Keys         AS      (                 -- Step 1 : Number the rows from the non-clustered index                 -- Maximum number of rows = @PageNumber * @PageSize                 SELECT  TOP (@PageNumber * @PageSize)                         rn = ROW_NUMBER() OVER (ORDER BY P1.post_id ASC),                         P1.post_id                 FROM    dbo.Post P1                 ORDER   BY                         P1.post_id ASC                 ),                 SelectedKeys         AS      (                 -- Step 2 : Get the primary keys for the rows on the page we want                 -- Maximum number of rows from this stage = @PageSize                 SELECT  TOP (@PageSize)                         SK.rn,                         SK.post_id                 FROM    Keys SK                 WHERE   SK.rn > ((@PageNumber - 1) * @PageSize)                 ORDER   BY                         SK.post_id ASC                 )         SELECT  -- Step 3 : Retrieve the off-index data                 -- We will only have @PageSize rows by this stage                 SK.rn,                 P2.post_id,                 P2.thread_id,                 P2.member_id,                 P2.create_dt,                 P2.title,                 P2.body         FROM    SelectedKeys SK         JOIN    dbo.Post P2                 ON  P2.post_id = SK.post_id         ORDER   BY                 SK.post_id ASC; END; and here is the replacement procedure using paging: CREATE  PROCEDURE dbo.FetchOffsetPageKeySeek         @PageSize   BIGINT,         @PageNumber BIGINT AS BEGIN         -- Key-Seek algorithm         WITH    SelectedKeys         AS      (                 SELECT  post_id                 FROM    dbo.Post P1                 ORDER   BY post_id ASC                 OFFSET  @PageSize * (@PageNumber-1) ROWS                 FETCH NEXT @PageSize ROWS ONLY                 )         SELECT  P2.post_id,                 P2.thread_id,                 P2.member_id,                 P2.create_dt,                 P2.title,                 P2.body         FROM    SelectedKeys SK         JOIN    dbo.Post P2                 ON  P2.post_id = SK.post_id         ORDER   BY                 SK.post_id ASC; END; Notice how all i have done is replace the functionality with the Keys and SelectedKeys CTEs with the paging functionality. So , what is the comparative performance now ?. Exactly the same amount of IO and memory usage , but its now pretty obvious that in terms of CPU and overall duration we are onto a winner.    

    Read the article

  • Cardinality Estimation Bug with Lookups in SQL Server 2008 onward

    - by Paul White
    Cost-based optimization stands or falls on the quality of cardinality estimates (expected row counts).  If the optimizer has incorrect information to start with, it is quite unlikely to produce good quality execution plans except by chance.  There are many ways we can provide good starting information to the optimizer, and even more ways for cardinality estimation to go wrong.  Good database people know this, and work hard to write optimizer-friendly queries with a schema and metadata (e.g. statistics) that reduce the chances of poor cardinality estimation producing a sub-optimal plan.  Today, I am going to look at a case where poor cardinality estimation is Microsoft’s fault, and not yours. SQL Server 2005 SELECT th.ProductID, th.TransactionID, th.TransactionDate FROM Production.TransactionHistory AS th WHERE th.ProductID = 1 AND th.TransactionDate BETWEEN '20030901' AND '20031231'; The query plan on SQL Server 2005 is as follows (if you are using a more recent version of AdventureWorks, you will need to change the year on the date range from 2003 to 2007): There is an Index Seek on ProductID = 1, followed by a Key Lookup to find the Transaction Date for each row, and finally a Filter to restrict the results to only those rows where Transaction Date falls in the range specified.  The cardinality estimate of 45 rows at the Index Seek is exactly correct.  The table is not very large, there are up-to-date statistics associated with the index, so this is as expected. The estimate for the Key Lookup is also exactly right.  Each lookup into the Clustered Index to find the Transaction Date is guaranteed to return exactly one row.  The plan shows that the Key Lookup is expected to be executed 45 times.  The estimate for the Inner Join output is also correct – 45 rows from the seek joining to one row each time, gives 45 rows as output. The Filter estimate is also very good: the optimizer estimates 16.9951 rows will match the specified range of transaction dates.  Eleven rows are produced by this query, but that small difference is quite normal and certainly nothing to worry about here.  All good so far. SQL Server 2008 onward The same query executed against an identical copy of AdventureWorks on SQL Server 2008 produces a different execution plan: The optimizer has pushed the Filter conditions seen in the 2005 plan down to the Key Lookup.  This is a good optimization – it makes sense to filter rows out as early as possible.  Unfortunately, it has made a bit of a mess of the cardinality estimates. The post-Filter estimate of 16.9951 rows seen in the 2005 plan has moved with the predicate on Transaction Date.  Instead of estimating one row, the plan now suggests that 16.9951 rows will be produced by each clustered index lookup – clearly not right!  This misinformation also confuses SQL Sentry Plan Explorer: Plan Explorer shows 765 rows expected from the Key Lookup (it multiplies a rounded estimate of 17 rows by 45 expected executions to give 765 rows total). Workarounds One workaround is to provide a covering non-clustered index (avoiding the lookup avoids the problem of course): CREATE INDEX nc1 ON Production.TransactionHistory (ProductID) INCLUDE (TransactionDate); With the Transaction Date filter applied as a residual predicate in the same operator as the seek, the estimate is again as expected: We could also force the use of the ultimate covering index (the clustered one): SELECT th.ProductID, th.TransactionID, th.TransactionDate FROM Production.TransactionHistory AS th WITH (INDEX(1)) WHERE th.ProductID = 1 AND th.TransactionDate BETWEEN '20030901' AND '20031231'; Summary Providing a covering non-clustered index for all possible queries is not always practical, and scanning the clustered index will rarely be optimal.  Nevertheless, these are the best workarounds we have today. In the meantime, watch out for poor cardinality estimates when a predicate is applied as part of a lookup. The worst thing is that the estimate after the lookup join in the 2008+ plans is wrong.  It’s not hopelessly wrong in this particular case (45 versus 16.9951 is not the end of the world) but it easily can be much worse, and there’s not much you can do about it.  Any decisions made by the optimizer after such a lookup could be based on very wrong information – which can only be bad news. If you think this situation should be improved, please vote for this Connect item. © 2012 Paul White – All Rights Reserved twitter: @SQL_Kiwi email: [email protected]

    Read the article

  • Multicast hostname lookups on OSX

    - by KARASZI István
    I have a problem with hostname lookups on my OSX computer. According to Apple's HK3473 document it says for v10.6: Host names that contain only one label in addition to local, for example "My-Computer.local", are resolved using Multicast DNS (Bonjour) by default. Host names that contain two or more labels in addition to local, for example "server.domain.local", are resolved using a DNS server by default. Which is not true as my testing. If I try to open a connection on my local computer to a remote port: telnet example.domain.local 22 then it will lookup the IP address with multicast DNS next to the A and AAAA lookups. This causes a two seconds lookup timeout on every lookup. Which is a lot! When I try with IPv4 only then it won't use the multicast queries to fetch the remote address just the simple A queries. telnet -4 example.domain.local 22 When I try with IPv6 only: telnet -6 example.domain.local 22 then it will lookup with multicast DNS and AAAA again, and the 2 seconds timeout delay occurs again. I've tried to create a resolver entry to my /etc/resolver/domain.local, and /etc/resolver/local.1, but none of them was working. Is there any way to disable this multicast lookups for the "two or more label addition to local" domains, or simply disable it for the selected subdomain (domain.local)? Thank you!

    Read the article

  • Exchange 2003 ActiveSync fine but GAL lookups not working

    - by Daniel Lucas
    We have Exchange 2003 SP2 and use ActiveSync for our mobile devices (iOS and sever Android versions). Everything works except for GAL lookups and we've confirmed that the devices and client versions we are trying it on are supposed to support it. The symptom is different depending on the client, but there is never an error. In a nutshell, lookups return no results. This was working previously and we aren't sure what has changed. Are there logs in Exchange or IIS that will allow me to see GAL lookup transactions? Could it be a GAL permissions issue? (No issues with GAL in Outlook) Could it have been caused by upgrading our DCs to 2008 and increasing the forest/domain functional levels?

    Read the article

  • Disable NSS LDAP IPv6 (AAAA) lookups

    - by pilcrow
    Question: How can I disable inet6 AAAA queries for my LDAP server during (LDAP-backed) NSS lookups on a CentOS (RHEL) 5 machine? Background: I've servers configured to consult ldap://ldap.internal for NSS passwd and group lookups. Every relevant NSS lookup, for example the getpwuid(3) implied by an ls -l which needs to translate UIDs to network user names, performs the following DNS dance before connecting to the ldap server: AAAA? ldap.internal -> (no records) AAAA? ldap.internal.internal -> NXDomain A? ldap.internal -> 192.168.3.89 I'd like to skip the first two queries completely. Configuration: [server]$ cat /etc/redhat-release CentOS release 5.4 (Final) [server]$ grep ^passwd /etc/nsswitch.conf passwd: files ldap [server]$ grep ^uri /etc/ldap.conf uri ldap://ldap.internal/ For what it's worth, IPv6 support is otherwise disabled on these systems: [server]$ grep off /etc/modprobe.conf alias ipv6 off alias net-pf-10 off [server]$ echo "$(ip a | grep -c inet6) IPv6-enabled interfaces" 0 IPv6-enabled interfaces

    Read the article

  • Office365 SPF record has too many lookups

    - by Sammitch
    For some utterly ridiculous administrative reasons we've got a split domain with one mailbox on Office365 which requires us to add include:outlook.com to our SPF record. The problem with this is that that rule alone requires nine DNS lookups of the maximum of 10. Seriously, it's horrible. Just look at it: v=spf1 include:spf-a.outlook.com include:spf-b.outlook.com ip4:157.55.9.128/25 include:spfa.bigfish.com include:spfb.bigfish.com include:spfc.bigfish.com include:spf-a.hotmail.com include:_spf-ssg-b.microsoft.com include:_spf-ssg-c.microsoft.com ~all Given that we have our own large-ish mail system we need to have rules for a, mx, include:_spf1.mydomain.com, and include:_spf2.mydomain.com which puts us at 13 DNS lookups which causes PERMERRORs with strict SPF validators, and completely unreliable/unpredictable validation with non-strict/badly implemented validators. Is it possible to somehow eliminate 3 of those include: rules from the bloated outlook.com record, but still cover the servers used by O365? Edit: Commentors have mentioned that we should simply use the shorter spf.protection.outlook.com record. While that is news to me, and it is shorter, it's only one record shorter: spf.protection.outlook.com include:spf-a.outlook.com include:spf-b.outlook.com include:spf-c.outlook.com include:spf.messaging.microsoft.com include:spfa.frontbridge.com include:spfb.frontbridge.com include:spfc.frontbridge.com Edit² I suppose we can technically pare this down to: v=spf1 a mx include:_spf1.mydomain.com include:_spf2.mydomain.com include:spf-a.outlook.com include:spf-b.outlook.com include:spf-c.outlook.com include:spfa.frontbridge.com include:spfb.frontbridge.com include:spfc.frontbridge.com ~all but the potential issues I see with this are: We need to keep abreast of any changes to the parent spf.protection.outlook.com and spf.messaging.microsoft.com records. If anything is changed or [god forbid] added we would have to manually update ours to reflect that. With our actual domain name the record's length is 260 chars, which would require 2 strings for the TXT record, and I honestly don't trust that all of the DNS clients and SPF resolvers out there will properly accept a TXT record longer than 255 bytes.

    Read the article

  • Way to store a large dictionary with low memory footprint + fast lookups (on Android)

    - by BobbyJim
    I'm developing an android word game app that needs a large (~250,000 word dictionary) available. I need: reasonably fast look ups e.g. constant time preferable, need to do maybe 200 lookups a second on occasion to solve a word puzzle and maybe 20 lookups within 0.2 second more often to check words the user just spelled. EDIT: Lookups are typically asking "Is in the dictionary?". I'd like to support up to two wildcards in the word as well, but this is easy enough by just generating all possible letters the wildcards could have been and checking the generated words (i.e. 26 * 26 lookups for a word with two wildcards). as it's a mobile app, using as little memory as possible and requiring only a small initial download for the dictionary data is top priority. My first naive attempts used Java's HashMap class, which caused an out of memory exception. I've looked into using the SQL lite databases available on android, but this seems like overkill. What's a good way to do what I need?

    Read the article

  • Apache disable DNS lookups

    - by odeceixe
    I'm using Debian 4.3.2-1 and Apache 2 on my production server. Watching the logs, I noticed Apache is resolving client's hostnames even with HostnameLookups Off in apache2.conf. I want to avoid these lookups so I'm guessing Apache is making this DNS query because I have mod_authz_host enabled. When I try to unlink this module, I get several modules complaining because they use the Order directive. How is the clean way to go? Should I comment all Order directives like Order allow,deny Deny from all Is this the only way to stop Apache from making DNS requests? I would like to deny access to .htaccess files and some rules like that.

    Read the article

  • Very long DNS lookups inside my network

    - by Nuno Cordeiro
    Ever since I installed DD-WRT (v24-sp2 08/07/10 std-usb-ftp) on my router (RT-N16), my browsing got substantially slower. Using FirePHP I figured out that it's being caused by VERY long DNS lookups (~30 seconds). When the domain name was very recently accessed then speed is very good. I tried changing DNS on the computer and I tried messing around with the options on DD-WRT. I have tried to configure the router with Google DNS and/or OpenDNS. My current DNS output after using ipconfig -all is: 192.168.1.1 208.67.220.220 8.8.8.8 208.67.222.222 Can someone help me debug and solve this problem? I'd like to snoop the requests themselves. How can I know which DNS requests are being sent and which are failing/succeeding? Note: I don't expect this to be relevant but my router is connected to the internet through an ONT.

    Read the article

  • Optimizing Python code with many attribute and dictionary lookups

    - by gotgenes
    I have written a program in Python which spends a large amount of time looking up attributes of objects and values from dictionary keys. I would like to know if there's any way I can optimize these lookup times, potentially with a C extension, to reduce the time of execution, or if I need to simply re-implement the program in a compiled language. The program implements some algorithms using a graph. It runs prohibitively slowly on our data sets, so I profiled the code with cProfile using a reduced data set that could actually complete. The vast majority of the time is being burned in one function, and specifically in two statements, generator expressions, within the function: The generator expression at line 202 is neighbors_in_selected_nodes = (neighbor for neighbor in node_neighbors if neighbor in selected_nodes) and the generator expression at line 204 is neighbor_z_scores = (interaction_graph.node[neighbor]['weight'] for neighbor in neighbors_in_selected_nodes) The source code for this function of context provided below. selected_nodes is a set of nodes in the interaction_graph, which is a NetworkX Graph instance. node_neighbors is an iterator from Graph.neighbors_iter(). Graph itself uses dictionaries for storing nodes and edges. Its Graph.node attribute is a dictionary which stores nodes and their attributes (e.g., 'weight') in dictionaries belonging to each node. Each of these lookups should be amortized constant time (i.e., O(1)), however, I am still paying a large penalty for the lookups. Is there some way which I can speed up these lookups (e.g., by writing parts of this as a C extension), or do I need to move the program to a compiled language? Below is the full source code for the function that provides the context; the vast majority of execution time is spent within this function. def calculate_node_z_prime( node, interaction_graph, selected_nodes ): """Calculates a z'-score for a given node. The z'-score is based on the z-scores (weights) of the neighbors of the given node, and proportional to the z-score (weight) of the given node. Specifically, we find the maximum z-score of all neighbors of the given node that are also members of the given set of selected nodes, multiply this z-score by the z-score of the given node, and return this value as the z'-score for the given node. If the given node has no neighbors in the interaction graph, the z'-score is defined as zero. Returns the z'-score as zero or a positive floating point value. :Parameters: - `node`: the node for which to compute the z-prime score - `interaction_graph`: graph containing the gene-gene or gene product-gene product interactions - `selected_nodes`: a `set` of nodes fitting some criterion of interest (e.g., annotated with a term of interest) """ node_neighbors = interaction_graph.neighbors_iter(node) neighbors_in_selected_nodes = (neighbor for neighbor in node_neighbors if neighbor in selected_nodes) neighbor_z_scores = (interaction_graph.node[neighbor]['weight'] for neighbor in neighbors_in_selected_nodes) try: max_z_score = max(neighbor_z_scores) # max() throws a ValueError if its argument has no elements; in this # case, we need to set the max_z_score to zero except ValueError, e: # Check to make certain max() raised this error if 'max()' in e.args[0]: max_z_score = 0 else: raise e z_prime = interaction_graph.node[node]['weight'] * max_z_score return z_prime Here are the top couple of calls according to cProfiler, sorted by time. ncalls tottime percall cumtime percall filename:lineno(function) 156067701 352.313 0.000 642.072 0.000 bpln_contextual.py:204(<genexpr>) 156067701 289.759 0.000 289.759 0.000 bpln_contextual.py:202(<genexpr>) 13963893 174.047 0.000 816.119 0.000 {max} 13963885 69.804 0.000 936.754 0.000 bpln_contextual.py:171(calculate_node_z_prime) 7116883 61.982 0.000 61.982 0.000 {method 'update' of 'set' objects}

    Read the article

  • DNS lookups failing somewhere between firewall and router

    - by TessellatingHeckler
    we have a setup of ADSL line - Cisco 837 ADSL router - Zyxel ZyWall 35 firewall/NAT - Switch == Intel load balanced NICS in a server. It has been fine for years, suddenly DNS resolution stopped working on the server. No changes that I know of, so I can't work backwards from there. It was configured with the ISP's DNS servers, neither network device does DNS relaying. Wireshark shows the request go out but nothing comes back. The server networking stack seems OK though, because if we query an internal DNS server on a remote site, that works. I can logon to the Cisco, and DNS resolves OK from the command line. I can logon to the ZyWall, and DNS does not resolve from the command line. So the problem seems to be the firewall, patch cable or router, yes? On the router: interface Ethernet0 ip address aaa.bbb.ccc.ddd 255.255.255.ddd ip tcp adjust-mss 1450 hold-queue 100 out On the firewall: DNS server set to 8.8.8.8 (Google's), DNS traffic allowed LAN-WAN. What else should I look for? Update: Following This guide I've got traffic logging on the Cisco. I have also got access to a public DNS server which I can run tcpdump on to see things from the other side. And as per the below comments, I've tested with Dig and see that DNS over TCP works, and over UDP does not. Currently: DNS request from the server using TCP shows up in the firewall log, and in the Cisco log, and in tcpdump on the DNS server, the answer comes back, it works fine. DNS request from the server using UDP shows up in the firewall log, and in the Cisco log, does NOT show in tcpdump on the DNS server, times out. DNS request from the cisco (using UDP) does show up in tcpdump on the DNS server, answer received, works fine. Ping requests from the server and the cisco to the DNS server show up in tcpdump on the DNS server. DNS request from the server using UDP does show up on the firewall. Summary: TCP seems fine throughought. UDP works over the ADSL and to the Cisco, and it works from the server to the Cisco, but it doesn't cross the Cisco properly, it seems. I did see the Cisco showing as connected at 10Mb/full-duplex internally, and the firewall showing as 100Mb/full-duplex externally. I have forced the firewall to 10Mb and rebooted both devices. That seemed to help get UDP traffic (server-firewall-cisco) instead of (server-firewall), but did not fix it. Update: Sanitized Cisco config: version 12.2 no service pad service timestamps debug datetime msec service timestamps log datetime msec service password-encryption ! hostname cisco ! logging queue-limit 100 enable secret 5 {password} enable password 7 {password} ! ip subnet-zero ip domain name example.org ip name-server {nameserver_IP} ! ! ip audit notify log ip audit po max-events 100 no ftp-server write-enable ! interface Ethernet0 ip address {Inside_public_IP} 255.255.255.248 ip tcp adjust-mss 1460 hold-queue 100 out ! interface ATM0 no ip address no atm ilmi-keepalive pvc 0/38 encapsulation aal5mux ppp dialer dialer pool-member 1 ! dsl operating-mode auto ! interface Dialer1 ip unnumbered Ethernet0 encapsulation ppp dialer pool 1 dialer idle-timeout 0 dialer persistent no cdp enable ppp chap hostname {ADSL_Username} ppp chap password 7 {ADSL_Password} ! ip classless ip route 0.0.0.0 0.0.0.0 Dialer1 no ip http server no ip http secure-server ! access-list 23 permit {IP} dialer-list 1 protocol ip permit no cdp run snmp-server enable traps tty ! {con, vty} end

    Read the article

  • Windows Server 2003 DNS cached lookups modification

    - by Mike
    Hi, Is it possible to modify the entries in the cached lookup? I need to temporarily change the resolution of an IP address of a domain name to something else. I can't wait until DNS updates. Sorry, forgot to mention that the interface of the server has DNS set to itself. DNS server is running.

    Read the article

  • CentOS 6.3 Virtual under OpenVZ cannot ping, host lookups, outbound connections while postfix running

    - by Paul Cravey
    My best theory is that some kernel limit is being hit preventing outbound connections. We have tried basically everything from tcpdumps to provisioning an entirely new virtual server (we do not have this problem on any other virtuals), however the problem somehow carried over, even with new postfix build (working). Emails work, and outbound connections work, so long as postfix does not have too much going on. /proc/user_beancounters shows no limits being hit (show below). Nevertheless, pings fail even to IP addresses. TCP stack appears healthy. Load is low. No iowait. Flushed iptables already. Has anyone experienced anything like this? uid resource held maxheld barrier limit failcnt 3: kmemsize 166216365 170262528 9223372036854775807 9223372036854775807 0 lockedpages 0 0 9223372036854775807 9223372036854775807 0 privvmpages 285727 351885 9223372036854775807 9223372036854775807 0 shmpages 16933 17605 9223372036854775807 9223372036854775807 0 dummy 0 0 0 0 0 numproc 150 303 9223372036854775807 9223372036854775807 0 physpages 314156 326191 0 1280000 0 vmguarpages 0 0 9223372036854775807 9223372036854775807 0 oomguarpages 165355 165355 9223372036854775807 9223372036854775807 0 numtcpsock 89 172 9223372036854775807 9223372036854775807 0 numflock 22 76 9223372036854775807 9223372036854775807 0 numpty 1 2 9223372036854775807 9223372036854775807 0 numsiginfo 0 75 9223372036854775807 9223372036854775807 0 tcpsndbuf 2733472 4371752 9223372036854775807 9223372036854775807 0 tcprcvbuf 1798336 5427296 9223372036854775807 9223372036854775807 0 othersockbuf 491120 1000760 9223372036854775807 9223372036854775807 0 dgramrcvbuf 0 238728 9223372036854775807 9223372036854775807 0 numothersock 361 505 9223372036854775807 9223372036854775807 0 dcachesize 135941831 136114679 9223372036854775807 9223372036854775807 0 numfile 2905 4990 9223372036854775807 9223372036854775807 0 dummy 0 0 0 0 0 dummy 0 0 0 0 0 dummy 0 0 0 0 0 numiptent 8 9 9223372036854775807 9223372036854775807 0 [root@bni /]# ping 4.2.2.1 PING 4.2.2.1 (4.2.2.1) 56(84) bytes of data. --- 4.2.2.1 ping statistics --- 9 packets transmitted, 0 received, 100% packet loss, time 8493ms [root@bni /]# service postfix stop [root@bni /]# ping 4.2.2.1 PING 4.2.2.1 (4.2.2.1) 56(84) bytes of data. 64 bytes from 4.2.2.1: icmp_seq=1 ttl=53 time=8.63 ms 64 bytes from 4.2.2.1: icmp_seq=2 ttl=53 time=8.62 ms 64 bytes from 4.2.2.1: icmp_seq=3 ttl=53 time=8.63 ms 64 bytes from 4.2.2.1: icmp_seq=4 ttl=53 time=8.66 ms Outbound connections of all sorts fail when postfix is running.

    Read the article

  • Ubuntu 12.04 host lookups extremely slow

    - by tubaguy50035
    I'm having issues with one of my servers taking a long time to look up host names. This is an Ubuntu 12.04 box, so I've tried following the new resolvconf directives. In my /etc/network/interfaces file, I defined my name servers like this: auto eth0 iface eth0 inet static address someaddress netmask 255.255.255.0 gateway 198.58.103.1 dns-nameservers 74.14.179.5 72.14.188.5 In my /etc/resolv.conf, I see these name servers, like this: # Dynamic resolv.conf(5) file for glibc resolver(3) generated by resolvconf(8) # DO NOT EDIT THIS FILE BY HAND -- YOUR CHANGES WILL BE OVERWRITTEN nameserver 74.14.179.5 nameserver 72.14.188.5 On another box, I edited the resolv.conf directly as directed by my hosts' setup help files. It looks like this: domain members.linode.com search members.linode.com nameserver 72.14.179.5 nameserver 72.14.188.5 options rotate This second box has no issues with host name look ups and responds quite quickly. Could not having the domain and search directives make my look ups slow? By slow, I mean it's taking anywhere from 5 to 15 seconds to find the IP address of a host.

    Read the article

  • Lookups targeting merged cells - only returning value for first row

    - by Ian
    I have a master worksheet which contains data that I wish to link to another 'summary' sheet using a lookup. However, some of the cells whose data I wish to include in the summary sheet are merged across two or more adjacent rows. To be clear, the 'primary' column A that I am using in my formula in order to identify the target row does not contain merged cells, but the column from which I wish to return a value does. I have tried VLOOKUP and INDEX+MATCH. The problem is that the data is only returned for the first row's key, and the others return zero (as though the cell in the target column were blank, where actually it is merged). I have tried inelegant ways around this, e.g. using IF statements to try to find the top row of the merged cell. However, these don't work well if the order of values in the summary sheet is different from that in the master sheet, as well as being messy. Can this be done?

    Read the article

  • Loading Fact Table + Lookup / UnionAll for SK lookups.

    - by Nev_Rahd
    I got to populate FactTable with 12 lookups to dimension table to get SK's, of which 6 are to different Dim Tables and rest 6 are lookup to same DimTable (type II) doing lookup to same natural key. Ex: PrimeObjectID = lookup to DimObject.ObjectID = get ObjectSK and got other columns which does same OtherObjectID1 = lookup to DimObject.ObjectID = get ObjectSK OtherObjectID2 = lookup to DimObject.ObjectID = get ObjectSK OtherObjectID3 = lookup to DimObject.ObjectID = get ObjectSK OtherObjectID4 = lookup to DimObject.ObjectID = get ObjectSK OtherObjectID5 = lookup to DimObject.ObjectID = get ObjectSK for such multiple lookup how should go in my SSIS package. for now am using lookup / unionall foreach lookup. Is there a better way to this.

    Read the article

  • How do you efficiently bulk index lookups?

    - by Liron Shapira
    I have these entity kinds: Molecule Atom MoleculeAtom Given a list(molecule_ids) whose lengths is in the hundreds, I need to get a dict of the form {molecule_id: list(atom_ids)}. Likewise, given a list(atom_ids) whose length is in the hunreds, I need to get a dict of the form {atom_id: list(molecule_ids)}. Both of these bulk lookups need to happen really fast. Right now I'm doing something like: atom_ids_by_molecule_id = {} for molecule_id in molecule_ids: moleculeatoms = MoleculeAtom.all().filter('molecule =', db.Key.from_path('molecule', molecule_id)).fetch(1000) atom_ids_by_molecule_id[molecule_id] = [ MoleculeAtom.atom.get_value_for_datastore(ma).id() for ma in moleculeatoms ] Like I said, len(molecule_ids) is in the hundreds. I need to do this kind of bulk index lookup on almost every single request, and I need it to be FAST, and right now it's too slow. Ideas: Will using a Molecule.atoms ListProperty do what I need? Consider that I am storing additional data on the MoleculeAtom node, and remember it's equally important for me to do the lookup in the molecule-atom and atom-molecule directions. Caching? I tried memcaching lists of atom IDs keyed by molecule ID, but I have tons of atoms and molecules, and the cache can't fit it. How about denormalizing the data by creating a new entity kind whose key name is a molecule ID and whose value is a list of atom IDs? The idea is, calling db.get on 500 keys is probably faster than looping through 500 fetches with filters, right?

    Read the article

  • Repeated host lookups failing in urllib2

    - by reve_etrange
    I have code which issues many HTTP GET requests using Python's urllib2, in several threads, writing the responses into files (one per thread). During execution, it looks like many of the host lookups fail (causing a name or service unknown error, see appended error log for an example). Is this due to a flaky DNS service? Is it bad practice to rely on DNS caching, if the host name isn't changing? I.e. should a single lookup's result be passed into the urlopen? Exception in thread Thread-16: Traceback (most recent call last): File "/usr/lib/python2.6/threading.py", line 532, in __bootstrap_inner self.run() File "/home/da/local/bin/ThreadedDownloader.py", line 61, in run page = urllib2.urlopen(url) # get the page File "/usr/lib/python2.6/urllib2.py", line 126, in urlopen return _opener.open(url, data, timeout) File "/usr/lib/python2.6/urllib2.py", line 391, in open response = self._open(req, data) File "/usr/lib/python2.6/urllib2.py", line 409, in _open '_open', req) File "/usr/lib/python2.6/urllib2.py", line 369, in _call_chain result = func(*args) File "/usr/lib/python2.6/urllib2.py", line 1170, in http_open return self.do_open(httplib.HTTPConnection, req) File "/usr/lib/python2.6/urllib2.py", line 1145, in do_open raise URLError(err) URLError: <urlopen error [Errno -2] Name or service not known>

    Read the article

  • Do connection string DNS lookups get cached?

    - by joshcomley
    Suppose the following: I have a database set up on database.mywebsite.com, which resolves to IP 111.111.1.1, running from a local DNS server on our network. I have countless ASP, ASP.NET and WinForms applications that use a connection string utilising database.mywebsite.com as the server name, all running from the internal network. Then the box running the database dies, and I switch over to a new box with an IP of 222.222.2.2. So, I update the DNS for database.mywebsite.com to point to 222.222.2.2. Will all the applications and computers running them have cached the old resolved IP address? I'm assuming they will have. Any suggestions along the lines of "don't have your IP change each time you switch box" are not too welcome as I cannot control this aspect of the situation, unfortunately. We are currently using the machine name of the box, which changes every time it dies and all apps etc. have to be updated with the new machine name. It hurts.

    Read the article

  • NSPredicate by NSManagedObject for many-to-one lookups

    - by niklassaers
    Hi guys, I've got the scenario with two NSManagedObjects, Arm and Person. Between them is a many-to-one relationship Person.arms and inverse Arm.owner. I'd like to write a simple NSPredicate where I've got the NSManagedObject *arm and I'd like to fetch the NSManagedObject *person that this arm belongs to. I could make a textual representation and look for that, but is there a better way where I can look it up by identity? Something like this perhaps? NSEntityDescription *person = [NSEntityDescription entityForName:@"Person" inManagedObjectContext:MOC]; NSPredicate *personPredicate = [NSPredicate predicateWithFormat:@"%@ IN arms", arm]; Cheers Nik

    Read the article

1 2 3 4 5 6 7 8 9 10 11 12  | Next Page >