Search Results

Search found 946 results on 38 pages for 'surrogate pairs'.

Page 26/38 | < Previous Page | 22 23 24 25 26 27 28 29 30 31 32 33 | Next Page >

GRE Tunnel over IPsec with Loopback

- by Alek

I'm having a really hard time trying to estabilish a VPN connection using a GRE over IPsec tunnel. The problem is that it involves some sort of "loopback" connection which I don't understand -- let alone be able to configure --, and the only help I could find is related to configuring Cisco routers. My network is composed of a router and a single host running Debian Linux. My task is to create a GRE tunnel over an IPsec infrastructure, which is particularly intended to route multicast traffic between my network, which I am allowed to configure, and a remote network, for which I only bear a form containing some setup information (IP addresses and phase information for IPsec). For now it suffices to estabilish a communication between this single host and the remote network, but in the future it will be desirable for the traffic to be routed to other machines on my network. As I said this GRE tunnel involves a "loopback" connection which I have no idea of how to configure. From my previous understanding, a loopback connection is simply a local pseudo-device used mostly for testing purposes, but in this context it might be something more specific that I do not have the knowledge of. I have managed to properly estabilish the IPsec communication using racoon and ipsec-tools, and I believe I'm familiar with the creation of tunnels and addition of addresses to interfaces using ip, so the focus is on the GRE step. The worst part is that the remote peers do not respond to ping requests and the debugging of the general setup is very difficult due to the encrypted nature of the traffic. There are two pairs of IP addresses involved: one pair for the GRE tunnel peer-to-peer connection and one pair for the "loopback" part. There is also an IP range involved, which is supposed to be the final IP addresses for the hosts inside the VPN. My question is: how (or if) can this setup be done? Do I need some special software or another daemon, or does the Linux kernel handle every aspect of the GRE/IPsec tunneling? Please inform me if any extra information could be useful. Any help is greatly appreciated.

Read the article
What's with the accesses to $random_existing_file/cache/df.php?

- by Bernd Jendrissek

Occasionally I eyeball Apache's access_log and lately I've been noticing these accesses to URLs that I don't serve. They're correctly 404'ed, but I'd like to know just who and what is involved here. "Obviously" it's some sort of vulnerability probing; I'd like to know which. (Not that it affects me, but I like to know the score.) Here's an example: 69.89.31.206 - - [28/Nov/2012:17:36:34 +0200] "GET /cvfull.pdf/cache/df.php HTTP/1.1" 404 489 "-" "-" Oddly, all 26 attempts are to either /cache/df.php, or to /cvfull.pdf/cache/df.php - they come in pairs. A few weeks ago it was zx.php, now it's df.php - I'm assuming the target is the same. Perhaps I should be flattered that a script is thinking of hiring me. Seriously, my CV is one of only two PDF files on my site, so I can only guess that non-PDF URLs aren't interesting? I've tried Googling for "cache df php", but my Google-fu is weak at the best of times, so I can only find a few reports of other script attacks. What's the vulnerability being scanned for here?

Read the article
How do I SSH tunnel using PuTTY or SecureCRT through gateway/proxy to development server?

- by DAE51D

We have some unix boxes setup in a way that to get to the development box via ssh, you have to ssh into a 'user@jumpoff' box first. There is no direct connection allowed on 'dev' via ssh from anywhere but 'jumpoff'. Furthermore, only key exchange is allowed on both servers. And you always login to the development box as 'build@dev'. It's painful to always do that hopping. I know this can be done with SOCKS or a Tunnel or something... I have setup a FreeBSD VM and I can get things to work awesome using unix ssh tools. Basically all I do is make sure my vm's ~/.ssh/id_rsa.pub key is on both jumpoff and dev and use this ~/.ssh/config file: # Development Server Host ext-dev # this must be a resolvable name for "dev" from Jumpoff Hostname 1.2.3.4 User build IdentityFile ~/.ssh/id_rsa # The Jumpoff Server Host ext Hostname 1.1.1.1 User daevid Port 22 IdentityFile ~/.ssh/id_rsa # This must come below all of the above Host ext-* ProxyCommand ssh ext nc $(echo '%h'|cut -d- -f2-) 22 Then I just simply type "ssh ext-dev" and I'm in like Flynn. The problem is I can't get this same thing to work using either PuTTY or SecureCRT -- and to be honest I've not found any tutorials that really walk me through it. I see many on setting up some kind of proxy tunnel for Firefox, but it doesn't seem to be the same concept. I've been messing with various trial and error most all day and nothing has worked (obviously) and I'm at the end of my ssh knowledge and Google searching. I found this link which seemed to be perfect, but it doesn't work for me. The "Master" connects fine, but the "client" portion doesn't connect. It tells me, the remote system refused the connection. http://www.vandyke.com/support/tips/socksproxy.html I've got the VM, PuTTY and SecureCRT all using the same public/private key pairs to make things consistent and easier to debug. Does anyone have a straight up example of how to do this in Windows?

Read the article
what web based tool, to allow a non-technical user to manage authorized keys files on a Linux (fedora/centos/ubuntu/debian) server

- by Tom H

(Edit: clarification below) We have a number of groups of developers that change frequently, and a security policy to require individual logins to servers using rsa or dsa public keys, which is achieved via the standard method of adding id_dsa.pub to their authorized keys file. I am using chef to sync the user accounts across machines, however our previous method of using webmin to manage the user passwords is not designed for key based auth, and hence is not easy to use for non-technical users. The developers are logging in from the WAN using ssh, they can either provide their own key, or an administrator will send them a private key. The development machines are located in the cloud and we have a single server available to host the master set of accounts. Obviously I could deploy ldap or other centralised authentication system, but that seems a bit over blown when webmin worked well for the simple case. It is easy to achieve synchronised users, groups and passwords across a bunch of low security development boxes using webmin clustered users and groups. However looking at the currently installed webmin it is not so easy to create the authorized keys as it is to create user accounts and passwords. (its possible, but its not easy - some functionality is in the usermin module, or would required some tedious steps) Ideally I'd like a web interface that is pretty much dedicated to creating users and groups, and can generate key pairs on the fly, and can accepted pasted in public keys to add to the users authorized keys file. If the tool sync'ed the users and keys as well, that would be great, but I can use chef to do that part if the accounts are created correctly on the "master" server.

Read the article
Files on ext4 on Drobo with corrupt, zero-ed out blocks

- by Patrick

I have a 2TB ext4 file system (Ubuntu running Linux kernel 2.6.31-22-server x86_64). This file system is the second drive on a Drobo box plugged in via USB. We've not had problems on the first drive (Drobo limits drive size to 2TB due to some OS limitations, so if you have more space than that it appears as two separate drives). I am sharing this files with Samba (smbd 3.4.0) with a mix of Windows and Linux workstations. Recently we've been experiencing some data corruption in multiple files. In many cases I have an un-corrupt original file stored on one of the workstations. These are binary files of various formats, (e.g. SQLite, but others as well). I used "split" to split a corrupt and uncorrupt file into 4096 byte chunks (this is the block size of the ext4 file system). I then ran md5sum on pairs of chunks and discovered that the chunks matched in many cases and in every case where they did not match, the corrupt chunk was a solid chunk of zeroes (620f0b67a91f7f74151bc5be745b7110 for what it's worth). I'm trying to track down a culprit but am a bit at a loss. I don't believe Samba is at fault since I'm using it without issue on the first drive exported by the Drobo. What can I do to narrow this down and find out what's going on?

Read the article
Can I have 2Gbit over 1Gbit Nics

- by Daniel

So this really baffles me. Apparently because 1Gbit can transmit data in both directions simultaneously it should be possible to get 2Gbit of data transfer on a single NIC (1Gbit flow seend and 1Gbit receive). People claim that because 1Gbit is full-duplex (almost always) it is exactly 2Gbit in total. My intuition and electrical background tells me that something is not right here 4 twisted pairs 250Mbit capacity each gives 1Gbit. Unless it is really possible to transfer data in both directions simultaneously. I did a test with iperf. Ubuntu server 12.04 <-- MacBook Pro. Both with decent CPU speed. Tested speed of connection individually and on Mac I can see 112MB/s regardless which direction data is going. On Ubuntu with vnstat and ifstat I got 970Mbit speeds. Now, launching iperf in server mode on both machines at the same time and sending data using 2 iperf clients shows that I'm for example on Ubuntu box sending at 600Mbit, and receiving 350Mbit. which adds up to pretty much 1Gbit link. So to me there is no magical 2Gbit. Can someone confirm that or tell why I'm wrong? Another thing that confuses me i the fact that e.g. 24-port switch has for example: Throughput»up»to:»50.6Mpps Switching»capacity:»68Gbps Switch»fabric»speed:»88Gbps Which would suggest thay can handle 2GBit per port.

Read the article
Ubuntu: unattended-upgrades from a local package archive

- by Novelocrat

I have a local apt archive with a bunch of packages I built in it. The Packages and Release file are generated by apt-ftparchive. The Release file looks like Date: Thu, 06 May 2010 23:04:33 UTC Label: PPL Origin: PPL Suite: ppl MD5Sum: ebec3527ebc8351468b2ef8796c19855 37325 Packages d41d8cd98f00b204e9800998ecf8427e 0 Release SHA1: a0593b663d77fde88ee35b56ae1f3c17801cfe99 37325 Packages da39a3ee5e6b4b0d3255bfef95601890afd80709 0 Release SHA256: dd73a02846aee111cac58a869c6bf650886632ba82c2172ffddd81aa4429981c 37325 Packages e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855 0 Release I'm using unattended-upgrades to keep the machines in the lab up to date on security and bug fixes, but I'm finding that it doesn't pull from my local archive. The configuration file for it looks like // Automaticall upgrade packages from these (origin, archive) pairs Unattended-Upgrade::Allowed-Origins { "Ubuntu hardy-security"; "Ubuntu hardy-updates"; "PPL ppl"; }; // List of packages to not update Unattended-Upgrade::Package-Blacklist { // "vim"; // "libc6"; // "libc6-dev"; // "libc6-i686"; }; // Send email to this address for problems or packages upgrades // If empty or unset then no email is sent, make sure that you // have a working mail setup on your system. The package 'mailx' // must be installed or anything that provides /usr/bin/mail. //Unattended-Upgrade::Mail "root@localhost"; Yet, when I run sudo unattended-upgrade on one of these machines, newer package versions don't get installed. Can anyone point out what I'm getting wrong?

Read the article
GRE Tunnel over IPsec with Loopback

- by Alek

Hello, I'm having a really hard time trying to estabilish a VPN connection using a GRE over IPsec tunnel. The problem is that it involves some sort of "loopback" connection which I don't understand -- let alone be able to configure --, and the only help I could find is related to configuring Cisco routers. My network is composed of a router and a single host running Debian Linux. My task is to create a GRE tunnel over an IPsec infrastructure, which is particularly intended to route multicast traffic between my network, which I am allowed to configure, and a remote network, for which I only bear a form containing some setup information (IP addresses and phase information for IPsec). For now it suffices to estabilish a communication between this single host and the remote network, but in the future it will be desirable for the traffic to be routed to other machines on my network. As I said this GRE tunnel involves a "loopback" connection which I have no idea of how to configure. From my previous understanding, a loopback connection is simply a local pseudo-device used mostly for testing purposes, but in this context it might be something more specific that I do not have the knowledge of. I have managed to properly estabilish the IPsec communication using racoon and ipsec-tools, and I believe I'm familiar with the creation of tunnels and addition of addresses to interfaces using ip, so the focus is on the GRE step. The worst part is that the remote peers do not respond to ping requests and the debugging of the general setup is very difficult due to the encrypted nature of the traffic. There are two pairs of IP addresses involved: one pair for the GRE tunnel peer-to-peer connection and one pair for the "loopback" part. There is also an IP range involved, which is supposed to be the final IP addresses for the hosts inside the VPN. My question is: how (or if) can this setup be done? Do I need some special software or another daemon, or does the Linux kernel handle every aspect of the GRE/IPsec tunneling? Please inform me if any extra information could be useful. Any help is greatly appreciated.

Read the article
mod_cache not working

- by Pistos

I have a PHP site that has many dynamically generated pages. I'm trying to turn to mod_cache to help boost performance, because in most cases, content does not change in a given day. I have configured mod_cache as best I could, following examples around the web, including the mod_cache page on apache.org. When I set LogLevel debug, I see a bit of information about the caching that is [not] happening. There are plenty of pairs of lines like this: [Fri Jun 01 17:28:18 2012] [debug] mod_cache.c(141): Adding CACHE_SAVE filter for /foo/bar [Fri Jun 01 17:28:18 2012] [debug] mod_cache.c(148): Adding CACHE_REMOVE_URL filter for /foo/bar Which is fine, because I've set CacheEnable disk /foo, to indicate that I want everything under /foo cached. I'm new to mod_cache, but my understanding about these lines is that it just means that mod_cache has acknowledged that the URL is supposed to be cached, but there are supposed to be more lines indicating that it is saving the data to cache, and then later retrieving them on subsequent hits to the same URL. I can hit the same URL till I'm blue in the face, whether with F5 refreshing, or not, or with different browsers, or different computers. It's always that pair of lines that shows in the logs, and nothing else. When I set CacheEnable disk /, then I see more activity. But I don't want to cache the entire site, and there are many, many different subpaths to the site, so I don't want to have to modify code to set no-cache headers in all the necessary places. I'll mention that mod_rewrite is in use here, rewriting /foo/bar to something like index.php?baz=/foo/bar, but my understanding is that mod_cache uses the pre-rewrite URL, not the post-rewrite URL. As far as I can tell, I have the response headers not getting in the way of caching. Here's an example from one hit: Cache-Control:must-revalidate, max-age=3600 Connection:Keep-Alive Content-Encoding:gzip Content-Length:16790 Content-Type:text/html Date:Fri, 01 Jun 2012 21:43:09 GMT Expires:Fri, 1 Jun 2012 18:43:09 -0400 Keep-Alive:timeout=15, max=100 Pragma: Server:Apache Vary:Accept-Encoding mod_cache config is as follows: CacheRoot /var/cache/apache2/ CacheDirLevels 3 CacheDirLength 2 CacheEnable disk /foo What is getting in the way of mod_cache doing its job of caching?

Read the article
DAS vs SAN storage for serving 2 to 4 nodes

- by Luke404

We currently have 4 Linux nodes with local storage, arranged in two active/passive pairs with storage mirrored using DRBD, running virtual machines (actually using Xen Hypervisor) for typical hosting workloads (mail, web, a couple VPS, etc.). We're approaching the (presumed) maximum IOPS of those servers, and we're planning to migrate to an external storage solution with two active nodes, with capacity for up to four active nodes. Since we're an all-Dell shop I've done some research and found the MD3200 / MD3200i products should be the ones we're looking for. We are pretty sure we won't be attaching more than 4 hosts on a single storage and I'm wondering if there is any clear advantage for one or the other. In theory I should be able to attach 4 SAS hosts to a single MD3200 (single links on a single controller MD3200, or dual redundant SAS links from each host to a dual-controller MD3200), or 4 iSCSI hosts to a single MD3200i (directly on its 4 GigE ports without any switch, again with dual links for the dual controller option). Both setups should let us implement live VM migration since all hosts can access all the LUNs at the same time, and also some shared filesystem like GFS2 or OCFS2. Also, both setups should allow full redundancy of the whole system (assuming dual controllers in the storage). One difference I can see is that the DAS solution is actually limited to 4 hosts while the iSCSI one should be able to grow to more hosts (adding two GigE switches to the mix). One point for the iSCSI solution is that it would allow us to start out with our current nodes and upgrade them at a later time (we can't add other SAS controllers, but they already have 4 GigE ports each). With the right (iSCSI|SAS) controllers I should be able to connect diskless nodes and boot them off the external storage which I think is a good thing (get rid of any local storage). On the other hand, I would have thought the SAS one to be cheaper but it seems like an MD3200 actually costs a little less than an MD3200i (?) (please note: I've used Dell gear in my examples since that's what we're looking for but I assume the same goes with other vendors) I would like to know if my assumptions above are correct, and if I'm missing any important difference between the two setups.

Read the article
How to get automatic upgrades to work on Ubuntu Server?

- by J. Pablo Fernández

I followed the documentation for enabling automatic upgrades in Ubuntu servers, but it's not really updating anything at all. My /etc/apt/apt.conf.d/50unattended-upgrades looks almost like the default. // Automatically upgrade packages from these (origin, archive) pairs Unattended-Upgrade::Allowed-Origins { "Ubuntu karmic-security"; "Ubuntu karmic-updates"; }; // List of packages to not update Unattended-Upgrade::Package-Blacklist { // "vim"; // "libc6"; // "libc6-dev"; // "libc6-i686"; }; // Send email to this address for problems or packages upgrades // If empty or unset then no email is sent, make sure that you // have a working mail setup on your system. The package 'mailx' // must be installed or anything that provides /usr/bin/mail. Unattended-Upgrade::Mail "[email protected]"; // Automatically reboot *WITHOUT CONFIRMATION* if a // the file /var/run/reboot-required is found after the upgrade //Unattended-Upgrade::Automatic-Reboot "false"; The directory /var/log/unattended-upgrades/ is empty. Running /etc/init.d/unattended-upgrades start is not very nice: root@mozart:~# /etc/init.d/unattended-upgrades start Checking for running unattended-upgrades: root@mozart:~# Something seems to be broken, but I'm not sure why. I have pending updates and they are not being applied: root@mozart:~# aptitude safe-upgrade Reading package lists... Done Building dependency tree Reading state information... Done Reading extended state information Initializing package states... Done The following packages will be upgraded: linux-libc-dev 1 packages upgraded, 0 newly installed, 0 to remove and 0 not upgraded. Need to get 0B/743kB of archives. After unpacking 4096B will be used. Do you want to continue? [Y/n/?] In all the servers I have, unattended upgrades seems to have been disabled: root@mozart:~# apt-config shell UnattendedUpgradeInterval APT::Periodic::Unattended-Upgrade root@mozart:~# Any ideas what am I missing?

Read the article
Excel or OpenOffice Table Summary: how to reconstruct a table from another, with "missing" values

- by Gilberto

I have a table of values (partial) with 3 columns: month (from 1 to 12), code and value. E.g., MONTH | CODE | VALUE 1 | aaa | 111 1 | bbb | 222 1 | ccc | 333 2 | aaa | 1111 2 | ccc | 2222 The codes are clients and the values are sales volumes. Each row represents the sales for one month for one client. So I have three clients, namely aaa, bbb, and ccc. For month=1 their sales volumes are: aaa-111, bbb-222, and ccc-333. A client may or may not have sales for every month; for example, for the month 2, the client bbb has no sales. I have to construct a completed summary table for all the MONTH / CODE pairs with their corresponding VALUE (using the value from the "partial" table, if present, otherwise print a string "missing"). MONTH | CODE | VALUE 1 | aaa | 111 1 | bbb | 222 1 | ccc | 333 2 | aaa | 1111 2 | bbb | missing 2 | ccc | 2222 Or, to put it another way, the table is a linear representation of a matrix: and I want to identify the cells for which no value was provided. How can I do that?

Read the article
Exchange 2010 Transport rules stepping on each other

- by TopHat

I have a group of users that I have to restrict email access for and so far using Exchange Transport Rules has worked very well. The problem I am having is that Rule 0 is supposed to bcc the email to a review mailbox but otherwise not change anything and Rule 9 is supposed to block the email and throw a custom NDR to tell the user why they were blocked. Here are my results in practice however. If Rule 0 is enabled and Rule 9 is enabled then only Rule 9 functions If Rule 0 is disabled and Rule 9 is enabled then Rule 9 functions If Rule 0 is enabled and Rule 9 is disabled then Rule 0 functions This is after the Transport Service has been restarted (multiple times actually). I have other rule pairs that work correctly. None of these are overlapping rulesets however. - copy email going to address outside domain and then block - copy email coming in from outside and then block Here is the rule for copying internal emails (Rule 0): Apply rule to messages from a member of Blind carbon copy (Bcc) the message to except when the message is sent to a member of or [email protected] Here is the rule to block the same email (rule 9): Apply rule to messages from a member of send 'Email to non-supervisors or managers has been prohibited. Please contact your supervisor for more information.' to sender with 5.7.420 except when the message is sent to , [email protected], The distribution group used for membership in these rules is used for the other blocking and copying rules and works as expected. Is there something I missed in this setup? All of the copy rules are at the front of the transport rule group and all the actual copies at at the end of the queue if that makes a difference. Any thoughts as to why the email doesn't get copied when it gets blocked?

Read the article
Are there any viable DNS or LDAP alternatives for distributed key/value storage and retrieval?

- by makerofthings7

I'm working on a software app that needs distributed decentralized name resolution, and isn't bound to TCP/IP. Or more precisely, I need to store a "key" and look up it's value, and the key may be a string, a number, or any other realistic data type. Examples: With a phone number, look up a name. (or with an area code, redirect to the server that handles that exchange) With an IP Address get a DNS name, or a Whois contact (string value) With a string, get an IP, ( like a DNS TXT or SRV record). I'm thinking out of the box here and looking for any software that allows for this. (more info below) Are there any secure, scalable DNS alternatives that have gained notoriety? I could ask on StackOverflow, but think the infrastructure groups would have better insight on this. Edit More info: I'm looking at "Namecoin" the DNS version of Bitcoin, and since that project is faltering, I'm looking at alternative ways to store name-value pairs, with an optional qualifier. I think a name value pair is of global interest is useful, but on a limited scale. Namecoin tried to be too much, and ended up becoming nothing. I'm trying to solve that problem in researching alternatives and applying distributed technologies where applicable. Bitcoin/Namecoin offers a Distributed Hash Table, which has some positive aspects, but not useful for DNS, except for root servers.

Read the article
Why can't I use my Bluetooth Headset with my laptop?

- by Michael Haren

I just received a new Plantronics M50 bluetooth headset. It works great with my phone, but I can't get it working with my laptop. What are the things I should be checking? Here's what I've done so far: It pairs successfully: It's not in multipoint mode--it's only paired to my laptop I've installed all available drivers from Plantronics and Dell I have no (!) in Device Manager (though I don't see the headset there either--would I?) I can "configure" the headset by double clicking on it: "Allow the computer to turn off this device to save power" is unchecked in the Bluetooth radio settings Apps that let me choose the playback/mic device only list my laptop, not the headset [UPDATE] I went into the Bluetooth device's properties and Checked "headset" under the services tab. This was successful but hasn't delivered any functionality as far as I can tell I'd like to use this headset for VOIP conferencing (Goto meeting, Gmail voice chat, G+ hangouts, Skype, etc.) and listening to music (iTunes). Where else should I be digging? Is it possible that this new headset is simply not compatible with computers (i.e. it's only compatible with phones)?

Read the article
Is it ok to share private key file between multiple computers/services?

- by Behrang

So we all know how to use public key/private keys using SSH, etc. But what's the best way to use/reuse them? Should I keep them in a safe place forever? I mean, I needed a pair of keys for accessing GitHub. I created a pair from scratch and used that for some time to access GitHub. Then I formatted my HDD and lost that pair. Big deal, I created a new pair and configured GitHub to use my new pair. Or is it something that I don't want to lose? I also needed a pair of public key/private keys to access our company systems. Our admin asked me for my public key and I generated a new pair and gave it to him. Is it generally better to create a new pair for access to different systems or is it better to have one pair and reuse it to access different systems? Similarly, is it better to create two different pairs and use one to access our companies systems from home and the other one to access the systems from work, or is it better to just have one pair and use it from both places?

Read the article
Something like Dropbox for local use

- by Casper

I am looking for a solution to sync folder pairs between a NAS and multiple local macs. Each of the macs could edit files and the other macs should then get synced automatically. Basically my own local version of Dropbox without using "cloud-storage". I have looked into solutions using rsync. As I understand it rsync is not really capable of doing a bi-directional sync. I also do not want to necessarily invoke the sync process. I would prefer a daemon running in the background - waiting and checking for changes and then syncing them "live". The program should also be flexible enough to recognize that it sometimes (in the case with laptops) can not reach the NAS. It should then just wait for the connection to be back again, without bugging me ever few minutes. I have looked into synk, folderwatch, rsync and a few others, but I haven't really found a solution. Isn't there something like "offline folders" from microsoft for the mac? Thanks PS: just for clarification - I don't want to sync for backup purposes, instead I am wanting to sync so that all macs have a local copy of the most recent changes to files.

Read the article
What are the typical methods used to scale up/out email storage servers?

- by nareshov

Hi, What I've tried: I have two email storage architectures. Old and new. Old: courier-imapds on several (18+) 1TB-storage servers. If one of them show signs of running out of disk space, we migrate a few email accounts to another server. the servers don't have replicas. no backups either. New: dovecot2 on a single huge server with 16TB (SATA) storage and a few SSDs we store fresh mails on the SSDs and run a doveadm purge to move mails older than a day to the SATA disks there is an identical server which has a max-15min-old rsync backup from the primary server higher-ups/management wanted to pack in as much storage as possible per server in order to minimise the cost of SSDs per server the rsync'ing is done because GlusterFS wasn't replicating well under that high small/random-IO. scaling out was expected to be done with provisioning another pair of such huge servers on facing disk-crunch issues like in the old architecture, manual moving of email accounts would be done. Concerns/doubts: I'm not convinced with the synchronously-replicated filesystem idea works well for heavy random/small-IO. GlusterFS isn't working for us yet, I'm not sure if there's another filesystem out there for this use case. The idea was to keep identical pairs and use DNS round-robin for email delivery and IMAP/POP3 access. And if one the servers went down for whatever reasons (planned/unplanned), we'd move the IP to the other server in the pair. In filesystems like Lustre, I get the advantage of a single namespace whereby I do not have to worry about manually migrating accounts around and updating MAILHOME paths and other metadata/data. Questions: What are the typical methods used to scale up/out with the traditional software (courier-imapd / dovecot)? Do traditional software that store on a locally mounted filesystem pose a roadblock to scale out with minimal "problems"? Does one have to re-write (parts of) these to work with an object-storage of some sort - such as OpenStack object storage?

Read the article
How can I assign DHCP leases from a script?

- by devicenull

I have an environment where there is one DHCP server servicing a number of different hosts/vlans. The switches are configured to forward the DHCP requests over (via ip-helper) and include information about the port (option 82). I'd like to take that information and translate it into an actual lease for the server. I don't think it's particularly feasible for me to pregenerate a list of available leases, but I should be able to determine an address for a lease as it comes in. Is there an DHCP server that can execute a script when it receives a request? (Note: I'm looking to assign the IP from the script, not have the DHCP server assign an IP then execute the script) Edit: So, ultimately I'm trying to provide DHCP/PXE services over a large number of distinct vlans. This is so we can do OS installs via PXE booting without having to have a separate PXE vlan. I've got the switch config down no problem, and I have the DHCP server recognizing option 82. I need a way to pull DHCP assignments from another system (this other system would know what subnet to use on what vlan), but I do not want to have to pregenerate a list of vlan:DHCP range pairs.

Read the article
Excel 2013: VLookup for cells that share common characters within cell but are both surrounded by other non-matching text

- by Kylie Z

I am pulling information from 2 different databases. The databases use different naming protocol for the exact same item/specified placement however they always have certain components of the name in common. The length of these names can vary throughout each of the databases (see the pic below) so I don't think counting characters would help. I need a formula (probably a vlookup/match/index of some sort) to pair up the names from the 2nd database name with the 1st database name and then place it in the adjacent column(B2) on sheet1. Until this point I've had to match, copy, and paste the pairs manually from one sheet to the other and it takes FOREVER. Any help would be much appreciated!!! For example: Database1 Name in Sheet1,A2: 728x90_Allstate_629930_ALL_JUL_2013_MASSACHUSETTSAUTO_BAN_MSN_ROSMSNAUTOSMASSACHUSETTS_7.2.13 Database2 Name in Sheet2, A13: BAN_MSN_ROSMSNAUTOSMASSACHUSETTS728X90_728X90_DFA Common Factors: "ROSMSNAUTOSMASSACHUSETTS" & "728X90" Therefore A2 and A13 need to pair up In some cases, Database 1 and 2 will have a common name aspect but sizing will be different. They need to have BOTH aspects in common in order to be paired so I would NOT want the below example to pair up. Database1 Name in Sheet1,A2: 728x90_Allstate_629930_ALL_JUL_2013_MASSACHUSETTSAUTO_BAN_MSN_ROSMSNAUTOSMASSACHUSETTS_7.2.13 Database2 Name in Sheet2, A12: BAN_MSN_ROSMSNAUTOSMASSACHUSETTS300X250_300X250_DFA Common Factor: Only "ROSMSNAUTOSMASSACHUSETTS" matches. "728x90" is not equal to "300X250" - Sizing is different so they should not be paired.

Read the article
Data Warehouse ETL slow - change primary key in dimension?

- by Jubbles

I have a working MySQL data warehouse that is organized as a star schema and I am using Talend Open Studio for Data Integration 5.1 to create the ETL process. I would like this process to run once per day. I have estimated that one of the dimension tables (dimUser) will have approximately 2 million records and 23 columns. I created a small test ETL process in Talend that worked, but given the amount of data that may need to be updated daily, the current performance will not cut it. It takes the ETL process four minutes to UPDATE or INSERT 100 records to dimUser. If I assumed a linear relationship between the count of records and the amount of time to UPDATE or INSERT, then there is no way the ETL can finish in 3-4 hours (my hope), let alone one day. Since I'm unfamiliar with Java, I wrote the ETL as a Python script and ran into the same problem. Although, I did discover that if I did only INSERT, the process went much faster. I am pretty sure that the bottleneck is caused by the UPDATE statements. The primary key in dimUser is an auto-increment integer. My friend suggested that I scrap this primary key and replace it with a multi-field primary key (in my case, 2-3 fields). Before I rip the test data out of my warehouse and change the schema, can anyone provide suggestions or guidelines related to the design of the data warehouse the ETL process how realistic it is to have an ETL process INSERT or UPDATE a few million records each day will my friend's suggestion significantly help If you need any further information, just let me know and I'll post it. UPDATE - additional information: mysql> describe dimUser; Field Type Null Key Default Extra user_key int(10) unsigned NO PRI NULL auto_increment id_A int(10) unsigned NO NULL id_B int(10) unsigned NO NULL field_4 tinyint(4) unsigned NO 0 field_5 varchar(50) YES NULL city varchar(50) YES NULL state varchar(2) YES NULL country varchar(50) YES NULL zip_code varchar(10) NO 99999 field_10 tinyint(1) NO 0 field_11 tinyint(1) NO 0 field_12 tinyint(1) NO 0 field_13 tinyint(1) NO 1 field_14 tinyint(1) NO 0 field_15 tinyint(1) NO 0 field_16 tinyint(1) NO 0 field_17 tinyint(1) NO 1 field_18 tinyint(1) NO 0 field_19 tinyint(1) NO 0 field_20 tinyint(1) NO 0 create_date datetime NO 2012-01-01 00:00:00 last_update datetime NO 2012-01-01 00:00:00 run_id int(10) unsigned NO 999 I used a surrogate key because I had read that it was good practice. Since, from a business perspective, I want to keep aware of potential fraudulent activity (say for 200 days a user is associated with state X and then the next day they are associated with state Y - they could have moved or their account could have been compromised), so that is why geographic data is kept. The field id_B may have a few distinct values of id_A associated with it, but I am interested in knowing distinct (id_A, id_B) tuples. In the context of this information, my friend suggested that something like (id_A, id_B, zip_code) be the primary key. For the large majority of daily ETL processes (80%), I only expect the following fields to be updated for existing records: field_10 - field_14, last_update, and run_id (this field is a foreign key to my etlLog table and is used for ETL auditing purposes).

Read the article
Parallelism in .NET – Part 20, Using Task with Existing APIs

- by Reed

Although the Task class provides a huge amount of flexibility for handling asynchronous actions, the .NET Framework still contains a large number of APIs that are based on the previous asynchronous programming model. While Task and Task<T> provide a much nicer syntax as well as extending the flexibility, allowing features such as continuations based on multiple tasks, the existing APIs don’t directly support this workflow. There is a method in the TaskFactory class which can be used to adapt the existing APIs to the new Task class: TaskFactory.FromAsync. This method provides a way to convert from the BeginOperation/EndOperation method pair syntax common through .NET Framework directly to a Task<T> containing the results of the operation in the task’s Result parameter. While this method does exist, it unfortunately comes at a cost – the method overloads are far from simple to decipher, and the resulting code is not always as easily understood as newer code based directly on the Task class. For example, a single call to handle WebRequest.BeginGetResponse/EndGetReponse, one of the easiest “pairs” of methods to use, looks like the following: var task = Task.Factory.FromAsync<WebResponse>( request.BeginGetResponse, request.EndGetResponse, null); .csharpcode, .csharpcode pre { font-size: small; color: black; font-family: consolas, "Courier New", courier, monospace; background-color: #ffffff; /*white-space: pre;*/ } .csharpcode pre { margin: 0em; } .csharpcode .rem { color: #008000; } .csharpcode .kwrd { color: #0000ff; } .csharpcode .str { color: #006080; } .csharpcode .op { color: #0000c0; } .csharpcode .preproc { color: #cc6633; } .csharpcode .asp { background-color: #ffff00; } .csharpcode .html { color: #800000; } .csharpcode .attr { color: #ff0000; } .csharpcode .alt { background-color: #f4f4f4; width: 100%; margin: 0em; } .csharpcode .lnum { color: #606060; } The compiler is unfortunately unable to infer the correct type, and, as a result, the WebReponse must be explicitly mentioned in the method call. As a result, I typically recommend wrapping this into an extension method to ease use. For example, I would place the above in an extension method like: public static class WebRequestExtensions { public static Task<WebResponse> GetReponseAsync(this WebRequest request) { return Task.Factory.FromAsync<WebResponse>( request.BeginGetResponse, request.EndGetResponse, null); } } This dramatically simplifies usage. For example, if we wanted to asynchronously check to see if this blog supported XHTML 1.0, and report that in a text box to the user, we could do: var webRequest = WebRequest.Create("http://www.reedcopsey.com"); webRequest.GetReponseAsync().ContinueWith(t => { using (var sr = new StreamReader(t.Result.GetResponseStream())) { string str = sr.ReadLine();; this.textBox1.Text = string.Format("Page at {0} supports XHTML 1.0: {1}", t.Result.ResponseUri, str.Contains("XHTML 1.0")); } }, TaskScheduler.FromCurrentSynchronizationContext()); By using a continuation with a TaskScheduler based on the current synchronization context, we can keep this request asynchronous, check based on the first line of the response string, and report the results back on our UI directly.

Read the article
SharePoint.DesignFactory.ContentFiles–building WCM sites

- by svdoever

One of the use cases where we use the SharePoint.DesignFactory.ContentFiles tooling is in building SharePoint Publishing (WCM) solutions for SharePoint 2007, SharePoint 2010 and Office365. Publishing solutions are often solutions that have one instance, the publishing site (possibly with subsites), that in most cases need to go through DTAP. If you dissect a publishing site, in most case you have the following findings: The publishing site spans a site collection The branding of the site is specified in the root site, because: Master pages live in the root site (/_catalogs/masterpage) Page layouts live in the root site (/_catalogs/masterpage) The style library lives in the root site ( /Style Library) and contains images, css, javascript, xslt transformations for your CQWP’s, … Preconfigured web parts live in the root site (/_catalogs/wp) The root site and subsites contains a document library called Pages (or your language-specific version of it) containing publishing pages using the page layouts and master pages The site collection contains content types, fields and lists When using the SharePoint.DesignFactory.ContentFiles tooling it is very easy to create, test, package and deploy the artifacts that can be uploaded to the SharePoint content database. This can be done in a fast and simple way without the need to create and deploy WSP packages. If we look at the above list of artifacts we can use SharePoint.DesignFactory.ContentFiles for master pages, page layouts, the style library, web part configurations, and initial publishing pages (these are normally made through the SharePoint web UI). Some artifacts like content types, fields and lists in the above list can NOT be handled by SharePoint.DesignFactory.ContentFiles, because they can’t be uploaded to the SharePoint content database. The good thing is that these artifacts are the artifacts that don’t change that much in the development of a SharePoint Publishing solution. There are however multiple ways to create these artifacts: Use paper script: create them manually in each of the environments based on documentation Automate the creation of the artifacts using (PowerShell) script Develop a WSP package to create these artifacts I’m not a big fan of the third option (see my blog post Thoughts on building deployable and updatable SharePoint solutions). It is a lot of work to create content types, fields and list definitions using all kind of XML files, and it is not allowed to modify these artifacts when in use. I know… SharePoint 2010 has some content type upgrade possibilities, but I think it is just too cumbersome. The first option has the problem that content types and fields get ID’s, and that these ID’s must be used by the metadata on for example page layouts. No problem for SharePoint.DesignFactory.ContentFiles, because it supports deploy-time resolving of these ID’s using PowerShell. For example consider the following metadata definition for the page layout contactpage-wcm.aspx.properties.ps1: Metadata page layout # This script must return a hashtable @{ name=value; ... } of field name-value pairs # for the content file that this script applies to. # On deployment to SharePoint, these values are written as fields in the corresponding list item (if any) # Note that fields must exist; they can be updated but not created or deleted. # This script is called right after the file is deployed to SharePoint. # You can use the script parameters and arbitrary PowerShell code to interact with SharePoint. # e.g. to calculate properties and values at deployment time. param([string]$SourcePath, [string]$RelativeUrl, $Context) @{ "ContentTypeId" = $Context.GetContentTypeID('GeneralPage'); "MasterPageDescription" = "Cloud Aviator Contact pagelayout (wcm - don't use)"; "PublishingHidden" = "1"; "PublishingAssociatedContentType" = $Context.GetAssociatedContentTypeInfo('GeneralPage') } The PowerShell functions GetContentTypeID and GetAssociatedContentTypeInfo can at deploy-time resolve the required information from the server we are deploying to. I personally prefer the second option: automate creation through PowerShell, because there are PowerShell scripts available to export content types and fields. An example project structure for a typical SharePoint WCM site looks like: Note that this project uses DualLayout. So if you build Publishing sites using SharePoint, checkout out the completely free SharePoint.DesignFactory.ContentFiles tooling and start flying!

Read the article
Subterranean IL: Pseudo custom attributes

- by Simon Cooper

Custom attributes were designed to make the .NET framework extensible; if a .NET language needs to store additional metadata on an item that isn't expressible in IL, then an attribute could be applied to the IL item to represent this metadata. For instance, the C# compiler uses DecimalConstantAttribute and DateTimeConstantAttribute to represent compile-time decimal or datetime constants, which aren't allowed in pure IL, and FixedBufferAttribute to represent fixed struct fields. How attributes are compiled Within a .NET assembly are a series of tables containing all the metadata for items within the assembly; for instance, the TypeDef table stores metadata on all the types in the assembly, and MethodDef does the same for all the methods and constructors. Custom attribute information is stored in the CustomAttribute table, which has references to the IL item the attribute is applied to, the constructor used (which implies the type of attribute applied), and a binary blob representing the arguments and name/value pairs used in the attribute application. For example, the following C# class: [Obsolete("Please use MyClass2", true)] public class MyClass { // ... } corresponds to the following IL class definition: .class public MyClass { .custom instance void [mscorlib]System.ObsoleteAttribute::.ctor(string, bool) = { string('Please use MyClass2' bool(true) } // ... } and results in the following entry in the CustomAttribute table: TypeDef(MyClass) MemberRef(ObsoleteAttribute::.ctor(string, bool)) blob -> {string('Please use MyClass2' bool(true)} However, there are some attributes that don't compile in this way. Pseudo custom attributes Just like there are some concepts in a language that can't be represented in IL, there are some concepts in IL that can't be represented in a language. This is where pseudo custom attributes come into play. The most obvious of these is SerializableAttribute. Although it looks like an attribute, it doesn't compile to a CustomAttribute table entry; it instead sets the serializable bit directly within the TypeDef entry for the type. This flag is fully expressible within IL; this C#: [Serializable] public class MySerializableClass {} compiles to this IL: .class public serializable MySerializableClass {} For those interested, a full list of pseudo custom attributes is available here. For the rest of this post, I'll be concentrating on the ones that deal with P/Invoke. P/Invoke attributes P/Invoke is built right into the CLR at quite a deep level; there are 2 metadata tables within an assembly dedicated solely to p/invoke interop, and many more that affect it. Furthermore, all the attributes used to specify p/invoke methods in C# or VB have their own keywords and syntax within IL. For example, the following C# method declaration: [DllImport("mscorsn.dll", SetLastError = true)] [return: MarshalAs(UnmanagedType.U1)] private static extern bool StrongNameSignatureVerificationEx( [MarshalAs(UnmanagedType.LPWStr)] string wszFilePath, [MarshalAs(UnmanagedType.U1)] bool fForceVerification, [MarshalAs(UnmanagedType.U1)] ref bool pfWasVerified); compiles to the following IL definition: .method private static pinvokeimpl("mscorsn.dll" lasterr winapi) bool marshal(unsigned int8) StrongNameSignatureVerificationEx( string marshal(lpwstr) wszFilePath, bool marshal(unsigned int8) fForceVerification, bool& marshal(unsigned int8) pfWasVerified) cil managed preservesig {} As you can see, all the p/invoke and marshal properties are specified directly in IL, rather than using attributes. And, rather than creating entries in CustomAttribute, a whole bunch of metadata is emitted to represent this information. This single method declaration results in the following metadata being output to the assembly: A MethodDef entry containing basic information on the method Four ParamDef entries for the 3 method parameters and return type An entry in ModuleRef to mscorsn.dll An entry in ImplMap linking ModuleRef and MethodDef, along with the name of the function to import and the pinvoke options (lasterr winapi) Four FieldMarshal entries containing the marshal information for each parameter. Phew! Applying attributes Most of the time, when you apply an attribute to an element, an entry in the CustomAttribute table will be created to represent that application. However, some attributes represent concepts in IL that aren't expressible in the language you're coding in, and can instead result in a single bit change (SerializableAttribute and NonSerializedAttribute), or many extra metadata table entries (the p/invoke attributes) being emitted to the output assembly.

Read the article
OWB 11gR2 - Early Arriving Facts

- by Dawei Sun

A common challenge when building ETL components for a data warehouse is how to handle early arriving facts. OWB 11gR2 introduced a new feature to address this for dimensional objects entitled Orphan Management. An orphan record is one that does not have a corresponding existing parent record. Orphan management automates the process of handling source rows that do not meet the requirements necessary to form a valid dimension or cube record. In this article, a simple example will be provided to show you how to use Orphan Management in OWB. We first import a sample MDL file that contains all the objects we need. Then we take some time to examine all the objects. After that, we prepare the source data, deploy the target table and dimension/cube loading map. Finally, we run the loading maps, and check the data in target dimension/cube tables. OK, let’s start… 1. Import MDL file and examine sample project First, download zip file from here, which includes a MDL file and three source data files. Then we open OWB design center, import orphan_management.mdl by using the menu File->Import->Warehouse Builder Metadata. Now we have several objects in BI_DEMO project as below: Mapping LOAD_CHANNELS_OM: The mapping for dimension loading. Mapping LOAD_SALES_OM: The mapping for cube loading. Dimension CHANNELS_OM: The dimension that contains channels data. Cube SALES_OM: The cube that contains sales data. Table CHANNELS_OM: The star implementation table of dimension CHANNELS_OM. Table SALES_OM: The star implementation table of cube SALES_OM. Table SRC_CHANNELS: The source table of channels data, that will be loaded into dimension CHANNELS_OM. Table SRC_ORDERS and SRC_ORDER_ITEMS: The source tables of sales data that will be loaded into cube SALES_OM. Sequence CLASS_OM_DIM_SEQ: The sequence used for loading dimension CHANNELS_OM. Dimension CHANNELS_OM This dimension has a hierarchy with three levels: TOTAL, CLASS and CHANNEL. Each level has three attributes: ID (surrogate key), NAME and SOURCE_ID (business key). It has a standard star implementation. The orphan management policy and the default parent setting are shown in the following screenshots: The orphan management policy options that you can set for loading are: Reject Orphan: The record is not inserted. Default Parent: You can specify a default parent record. This default record is used as the parent record for any record that does not have an existing parent record. If the default parent record does not exist, Warehouse Builder creates the default parent record. You specify the attribute values of the default parent record at the time of defining the dimensional object. If any ancestor of the default parent does not exist, Warehouse Builder also creates this record. No Maintenance: This is the default behavior. Warehouse Builder does not actively detect, reject, or fix orphan records. While removing data from a dimension, you can select one of the following orphan management policies: Reject Removal: Warehouse Builder does not allow you to delete the record if it has existing child records. No Maintenance: This is the default behavior. Warehouse Builder does not actively detect, reject, or fix orphan records. (More details are at http://download.oracle.com/docs/cd/E11882_01/owb.112/e10935/dim_objects.htm#insertedID1) Cube SALES_OM This cube is references to dimension CHANNELS_OM. It has three measures: AMOUNT, QUANTITY and COST. The orphan management policy setting are shown as following screenshot: The orphan management policy options that you can set for loading are: No Maintenance: Warehouse Builder does not actively detect, reject, or fix orphan rows. Default Dimension Record: Warehouse Builder assigns a default dimension record for any row that has an invalid or null dimension key value. Use the Settings button to define the default parent row. Reject Orphan: Warehouse Builder does not insert the row if it does not have an existing dimension record. (More details are at http://download.oracle.com/docs/cd/E11882_01/owb.112/e10935/dim_objects.htm#BABEACDG) Mapping LOAD_CHANNELS_OM This mapping loads source data from table SRC_CHANNELS to dimension CHANNELS_OM. The operator CHANNELS_IN is bound to table SRC_CHANNELS; CHANNELS_OUT is bound to dimension CHANNELS_OM. The TOTALS operator is used for generating a constant value for the top level in the dimension. The CLASS_FILTER operator is used to filter out the “invalid” class name, so then we can see what will happen when those channel records with an “invalid” parent are loading into dimension. Some properties of the dimension operator in this mapping are important to orphan management. See the screenshot below: Create Default Level Records: If YES, then default level records will be created. This property must be set to YES for dimensions and cubes if one of their orphan management policies is “Default Parent” or “Default Dimension Record”. This property is set to NO by default, so the user may need to set this to YES manually. LOAD policy for INVALID keys/ LOAD policy for NULL keys: These two properties have the same meaning as in the dimension editor. The values are set to the same as the dimension value when user drops the dimension into the mapping. The user does not need to modify these properties. Record Error Rows: If YES, error rows will be inserted into error table when loading the dimension. REMOVE Orphan Policy: This property is used when removing data from a dimension. Since the dimension loading type is set to LOAD in this example, this property is disabled. Mapping LOAD_SALES_OM This mapping loads source data from table SRC_ORDERS and SRC_ORDER_ITEMS to cube SALES_OM. This mapping seems a little bit complicated, but operators in the red rectangle are used to filter out and generate the records with “invalid” or “null” dimension keys. Some properties of the cube operator in a mapping are important to orphan management. See the screenshot below: Enable Source Aggregation: Should be checked in this example. If the default dimension record orphan policy is set for the cube operator, then it is recommended that source aggregation also be enabled. Otherwise, the orphan management processing may produce multiple fact rows with the same default dimension references, which will cause an “unstable rowset” execution error in the database, since the dimension refs are used as update match attributes for updating the fact table. LOAD policy for INVALID keys/ LOAD policy for NULL keys: These two properties have the same meaning as in the cube editor. The values are set to the same as in the cube editor when the user drops the cube into the mapping. The user does not need to modify these properties. Record Error Rows: If YES, error rows will be inserted into error table when loading the cube. 2. Deploy objects and mappings We now can deploy the objects. First, make sure location SALES_WH_LOCAL has been correctly configured. Then open Control Center Manager by using the menu Tools->Control Center Manager. Expand BI_DEMO->SALES_WH_LOCAL, click SALES_WH node on the project tree. We can see the following objects: Deploy all the objects in the following order: Sequence CLASS_OM_DIM_SEQ Table CHANNELS_OM, SALES_OM, SRC_CHANNELS, SRC_ORDERS, SRC_ORDER_ITEMS Dimension CHANNELS_OM Cube SALES_OM Mapping LOAD_CHANNELS_OM, LOAD_SALES_OM Note that we deployed source tables as well. Normally, we import source table from database instead of deploying them to target schema. However, in this example, we designed the source tables in OWB and deployed them to database for the purpose of this demonstration. 3. Prepare and examine source data Before running the mappings, we need to populate and examine the source data first. Run SRC_CHANNELS.sql, SRC_ORDERS.sql and SRC_ORDER_ITEMS.sql as target user. Then we check the data in these three tables. Table SRC_CHANNELS SQL> select rownum, id, class, name from src_channels; Records 1~5 are correct; they should be loaded into dimension without error. Records 6,7 and 8 have null parents; they should be loaded into dimension with a default parent value, and should be inserted into error table at the same time. Records 9, 10 and 11 have “invalid” parents; they should be rejected by dimension, and inserted into error table. Table SRC_ORDERS and SRC_ORDER_ITEMS SQL> select rownum, a.id, a.channel, b.amount, b.quantity, b.cost from src_orders a, src_order_items b where a.id = b.order_id; Record 178 has null dimension reference; it should be loaded into cube with a default dimension reference, and should be inserted into error table at the same time. Record 179 has “invalid” dimension reference; it should be rejected by cube, and inserted into error table. Other records should be aggregated and loaded into cube correctly. 4. Run the mappings and examine the target data In the Control Center Manager, expand BI_DEMO-> SALES_WH_LOCAL-> SALES_WH-> Mappings, right click on LOAD_CHANNELS_OM node, click Start. Use the same way to run mapping LOAD_SALES_OM. When they successfully finished, we can check the data in target tables. Table CHANNELS_OM SQL> select rownum, total_id, total_name, total_source_id, class_id,class_name, class_source_id, channel_id, channel_name,channel_source_id from channels_om order by abs(dimension_key); Records 1,2 and 3 are the default dimension records for the three levels. Records 8, 10 and 15 are the loaded records that originally have null parents. We see their parents name (class_name) is set to DEF_CLASS_NAME. Those records whose CHANNEL_NAME are Special_4, Special_5 and Special_6 are not loaded to this table because of the invalid parent. Error Table CHANNELS_OM_ERR SQL> select rownum, class_source_id, channel_id, channel_name,channel_source_id, err$$$_error_reason from channels_om_err order by channel_name; We can see all the record with null parent or invalid parent are inserted into this error table. Error reason is “Default parent used for record” for the first three records, and “No parent found for record” for the last three. Table SALES_OM SQL> select a.*, b.channel_name from sales_om a, channels_om b where a.channels=b.channel_id; We can see the order record with null channel_name has been loaded into target table with a default channel_name. The one with “invalid” channel_name are not loaded. Error Table SALES_OM_ERR SQL> select a.amount, a.cost, a.quantity, a.channels, b.channel_name, a.err$$$_error_reason from sales_om_err a, channels_om b where a.channels=b.channel_id(+); We can see the order records with null or invalid channel_name are inserted into error table. If the dimension reference column is null, the error reason is “Default dimension record used for fact”. If it is invalid, the error reason is “Dimension record not found for fact”. Summary In summary, this article illustrated the Orphan Management feature in OWB 11gR2. Automated orphan management policies improve ETL developer and administrator productivity by addressing an important cause of cube and dimension load failures, without requiring developers to explicitly build logic to handle these orphan rows.

Read the article

Search Results

Search found 946 results on 38 pages for 'surrogate pairs'.

Page 26/38 | < Previous Page | 22 23 24 25 26 27 28 29 30 31 32 33 | Next Page >

- by Alek

- by Bernd Jendrissek

- by DAE51D

- by Tom H

- by Patrick

- by Daniel

- by Novelocrat

- by Alek

- by Pistos

- by Luke404

- by J. Pablo Fernández

- by Gilberto

- by TopHat

- by makerofthings7

- by Michael Haren

- by Behrang

- by Casper

- by nareshov

- by devicenull

- by Kylie Z

- by Jubbles

- by Reed

- by svdoever

- by Simon Cooper

- by Dawei Sun

< Previous Page | 22 23 24 25 26 27 28 29 30 31 32 33 | Next Page >