Search Results

Search found 2995 results on 120 pages for 'logical operators'.

Page 34/120 | < Previous Page | 30 31 32 33 34 35 36 37 38 39 40 41  | Next Page >

  • How Do You Actually Model Data?

    Since the 1970’s Developers, Analysts and DBAs have been able to represent concepts and relations in the form of data through the use of generic symbols.  But what is data modeling?  The first time I actually heard this term I could not understand why anyone would want to display a computer on a fashion show runway. Hey, what do you expect? At that time I was a freshman in community college, and obviously this was a long time ago.  I have since had the chance to learn what data modeling truly is through using it. Data modeling is a process of breaking down information and/or requirements in to common categories called objects. Once objects start being defined then relationships start to form based on dependencies found amongst other existing objects.  Currently, there are several tools on the market that help data designer actually map out objects and their relationships through the use of symbols and lines.  These diagrams allow for designs to be review from several perspectives so that designers can ensure that they have the optimal data design for their project and that the design is flexible enough to allow for potential changes and/or extension in the future. Additionally these basic models can always be further refined to show different levels of details depending on the target audience through the use of three different types of models. Conceptual Data Model(CDM)Conceptual Data Models include all key entities and relationships giving a viewer a high level understanding of attributes. Conceptual data model are created by gathering and analyzing information from various sources pertaining to a project during the typical planning phase of a project. Logical Data Model (LDM)Logical Data Models are conceptual data models that have been expanded to include implementation details pertaining to the data that it will store. Additionally, this model typically represents an origination’s business requirements and business rules by defining various attribute data types and relationships regarding each entity. This additional information can be directly translated to the Physical Data Model which reduces the actual time need to implement it. Physical Data Model(PDMs)Physical Data Model are transformed Logical Data Models that include the necessary tables, columns, relationships, database properties for the creation of a database. This model also allows for considerations regarding performance, indexing and denormalization that are applied through database rules, data integrity. Further expanding on why we actually use models in modern application/database development can be seen in the benefits that data modeling provides for data modelers and projects themselves, Benefits of Data Modeling according to Applied Information Science Abstraction that allows data designers remove concepts and ideas form hard facts in the form of data. This gives the data designers the ability to express general concepts and/or ideas in a generic form through the use of symbols to represent data items and the relationships between the items. Transparency through the use of data models allows complex ideas to be translated in to simple symbols so that the concept can be understood by all viewpoints and limits the amount of confusion and misunderstanding. Effectiveness in regards to tuning a model for acceptable performance while maintaining affordable operational costs. In addition it allows systems to be built on a solid foundation in terms of data. I shudder at the thought of a world without data modeling, think about it? Data is everywhere in our lives. Data modeling allows for optimizing a design for performance and the reduction of duplication. If one was to design a database without data modeling then I would think that the first things to get impacted would be database performance due to poorly designed database and there would be greater chances of unnecessary data duplication that would also play in to the excessive query times because unneeded records would need to be processed. You could say that a data designer designing a database is like a box of chocolates. You will never know what kind of database you will get until after it is built.

    Read the article

  • Schizophrenic Ubuntu 12.10-12.04: Atheros 922 PCI WIFI is disabled in Unity but enabled in terminal - How to getit to work?

    - by zewone
    I am trying to get my PCI Wireless Atheros 922 card to work. It is disabled in Unity: both the network utility and the desktop (see screenshot http://www.amisdurailhalanzy.be/Screenshot%20from%202012-10-25%2013:19:54.png) I tried many different advises on many different forums. Installed 12.10 instead of 12.04, enabled all interfaces... etc. I have read about the aht9 driver... The terminal shows no hw or sw lock for the Atheros card, nevertheless, it is still disabled. Nothing worked so far, the card is still disabled. Any help is much appreciated. Here are more tech details: myuser@adri1:~$ sudo lshw -C network *-network:0 DISABLED description: Wireless interface product: AR922X Wireless Network Adapter vendor: Atheros Communications Inc. physical id: 2 bus info: pci@0000:03:02.0 logical name: wlan1 version: 01 serial: 00:18:e7:cd:68:b1 width: 32 bits clock: 66MHz capabilities: pm bus_master cap_list ethernet physical wireless configuration: broadcast=yes driver=ath9k driverversion=3.5.0-17-generic firmware=N/A latency=168 link=no multicast=yes wireless=IEEE 802.11bgn resources: irq:18 memory:d8000000-d800ffff *-network:1 description: Ethernet interface product: VT6105/VT6106S [Rhine-III] vendor: VIA Technologies, Inc. physical id: 6 bus info: pci@0000:03:06.0 logical name: eth0 version: 8b serial: 00:11:09:a3:76:4a size: 10Mbit/s capacity: 100Mbit/s width: 32 bits clock: 33MHz capabilities: pm bus_master cap_list ethernet physical tp mii 10bt 10bt-fd 100bt 100bt-fd autonegotiation configuration: autonegotiation=on broadcast=yes driver=via-rhine driverversion=1.5.0 duplex=half latency=32 link=no maxlatency=8 mingnt=3 multicast=yes port=MII speed=10Mbit/s resources: irq:18 ioport:d300(size=256) memory:d8013000-d80130ff *-network DISABLED description: Wireless interface physical id: 1 bus info: usb@1:8.1 logical name: wlan0 serial: 00:11:09:51:75:36 capabilities: ethernet physical wireless configuration: broadcast=yes driver=rt2500usb driverversion=3.5.0-17-generic firmware=N/A link=no multicast=yes wireless=IEEE 802.11bg myuser@adri1:~$ sudo rfkill list all 0: hci0: Bluetooth Soft blocked: no Hard blocked: no 1: phy1: Wireless LAN Soft blocked: no Hard blocked: yes 2: phy0: Wireless LAN Soft blocked: no Hard blocked: no myuser@adri1:~$ dmesg | grep wlan0 [ 15.114235] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready myuser@adri1:~$ dmesg | egrep 'ath|firm' [ 14.617562] ath: EEPROM regdomain: 0x30 [ 14.617568] ath: EEPROM indicates we should expect a direct regpair map [ 14.617572] ath: Country alpha2 being used: AM [ 14.617575] ath: Regpair used: 0x30 [ 14.637778] ieee80211 phy0: >Selected rate control algorithm 'ath9k_rate_control' [ 14.639410] Registered led device: ath9k-phy0 myuser@adri1:~$ dmesg | grep wlan1 [ 15.119922] IPv6: ADDRCONF(NETDEV_UP): wlan1: link is not ready myuser@adri1:~$ lspci -nn | grep 'Atheros' 03:02.0 Network controller [0280]: Atheros Communications Inc. AR922X Wireless Network Adapter [168c:0029] (rev 01) myuser@adri1:~$ sudo ifconfig eth0 Link encap:Ethernet HWaddr 00:11:09:a3:76:4a inet addr:192.168.2.2 Bcast:192.168.2.255 Mask:255.255.255.0 inet6 addr: fe80::211:9ff:fea3:764a/64 Scope:Link UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 RX packets:5457 errors:0 dropped:0 overruns:0 frame:0 TX packets:2548 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:1000 RX bytes:3425684 (3.4 MB) TX bytes:282192 (282.1 KB) lo Link encap:Local Loopback inet addr:127.0.0.1 Mask:255.0.0.0 inet6 addr: ::1/128 Scope:Host UP LOOPBACK RUNNING MTU:16436 Metric:1 RX packets:590 errors:0 dropped:0 overruns:0 frame:0 TX packets:590 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0 RX bytes:53729 (53.7 KB) TX bytes:53729 (53.7 KB) myuser@adri1:~$ sudo iwconfig wlan0 IEEE 802.11bg ESSID:off/any Mode:Managed Access Point: Not-Associated Tx-Power=off Retry long limit:7 RTS thr:off Fragment thr:off Encryption key:off Power Management:on lo no wireless extensions. eth0 no wireless extensions. wlan1 IEEE 802.11bgn ESSID:off/any Mode:Managed Access Point: Not-Associated Tx-Power=0 dBm Retry long limit:7 RTS thr:off Fragment thr:off Encryption key:off Power Management:off myuser@adri1:~$ lsmod | grep "ath9k" ath9k 116549 0 mac80211 461161 3 rt2x00usb,rt2x00lib,ath9k ath9k_common 13783 1 ath9k ath9k_hw 376155 2 ath9k,ath9k_common ath 19187 3 ath9k,ath9k_common,ath9k_hw cfg80211 175375 4 rt2x00lib,ath9k,mac80211,ath myuser@adri1:~$ iwlist scan wlan0 Failed to read scan data : Network is down lo Interface doesn't support scanning. eth0 Interface doesn't support scanning. wlan1 Failed to read scan data : Network is down myuser@adri1:~$ lsb_release -d Description: Ubuntu 12.10 myuser@adri1:~$ uname -mr 3.5.0-17-generic i686 ![Schizophrenic Ubuntu](http://www.amisdurailhalanzy.be/Screenshot%20from%202012-10-25%2013:19:54.png) Any help much appreciated... Thanks, Philippe

    Read the article

  • Wireless doesn't work on a Broadcom BCM4312

    - by Boderick
    As stated, I've just upgraded to 12.04 and my Dell Inspiron 1545 isn't recognising its wireless card and I was wondering if anybody could help? Edit: Okay, so I found the wireless card by using lspci -vvv and it returned this: 0c:00.0 Network controller: Broadcom Corporation BCM4312 802.11b/g LP-PHY (rev 01) Subsystem: Dell Wireless 1397 WLAN Mini-Card Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx- Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast TAbort- SERR- Kernel modules: ssb lsmod Module Size Used by dm_crypt 22528 0 joydev 17393 0 dell_wmi 12601 0 sparse_keymap 13658 1 dell_wmi dell_laptop 13671 0 dcdbas 14098 1 dell_laptop psmouse 72919 0 uvcvideo 67203 0 serio_raw 13027 0 videodev 86588 1 uvcvideo snd_hda_codec_idt 60251 1 mac_hid 13077 0 snd_hda_intel 32765 5 snd_hda_codec 109562 2 snd_hda_codec_idt,snd_hda_intel snd_hwdep 13276 1 snd_hda_codec parport_pc 32114 0 rfcomm 38139 0 bnep 17830 2 ppdev 12849 0 snd_pcm 80845 3 snd_hda_intel,snd_hda_codec bluetooth 158438 10 rfcomm,bnep snd_seq_midi 13132 0 snd_rawmidi 25424 1 snd_seq_midi snd_seq_midi_event 14475 1 snd_seq_midi snd_seq 51567 2 snd_seq_midi,snd_seq_midi_event snd_timer 28931 2 snd_pcm,snd_seq snd_seq_device 14172 3 snd_seq_midi,snd_rawmidi,snd_seq binfmt_misc 17292 1 snd 62064 18 snd_hda_codec_idt,snd_hda_intel,snd_hda_codec,snd_hwdep,snd_pcm,snd_rawmidi,snd_seq,snd_timer,snd_seq_device soundcore 14635 1 snd snd_page_alloc 14108 2 snd_hda_intel,snd_pcm lp 17455 0 parport 40930 3 parport_pc,ppdev,lp sky2 53628 0 ums_realtek 17920 0 uas 17699 0 i915 414603 3 wmi 18744 1 dell_wmi drm_kms_helper 45466 1 i915 drm 197692 4 i915,drm_kms_helper i2c_algo_bit 13199 1 i915 video 19068 1 i915 usb_storage 39646 1 ums_realtek ifconfig -a eth0 Link encap:Ethernet HWaddr 00:23:ae:24:71:45 inet addr:192.168.1.158 Bcast:192.168.1.255 Mask:255.255.255.0 inet6 addr: fe80::223:aeff:fe24:7145/64 Scope:Link UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 RX packets:14340 errors:0 dropped:0 overruns:0 frame:0 TX packets:10191 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:1000 RX bytes:15403754 (15.4 MB) TX bytes:1262570 (1.2 MB) Interrupt:18 ham0 Link encap:Ethernet HWaddr 7a:79:05:2d:b0:f7 inet addr:5.45.176.247 Bcast:5.255.255.255 Mask:255.0.0.0 inet6 addr: fe80::7879:5ff:fe2d:b0f7/64 Scope:Link inet6 addr: 2620:9b::52d:b0f7/96 Scope:Global UP BROADCAST RUNNING MULTICAST MTU:1404 Metric:1 RX packets:0 errors:0 dropped:0 overruns:0 frame:0 TX packets:179 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:500 RX bytes:0 (0.0 B) TX bytes:27480 (27.4 KB) lo Link encap:Local Loopback inet addr:127.0.0.1 Mask:255.0.0.0 inet6 addr: ::1/128 Scope:Host UP LOOPBACK RUNNING MTU:16436 Metric:1 RX packets:433 errors:0 dropped:0 overruns:0 frame:0 TX packets:433 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0 RX bytes:60051 (60.0 KB) TX bytes:60051 (60.0 KB) iwconfig lo no wireless extensions. ham0 no wireless extensions. eth0 no wireless extensions. the results for sudo lshw -class network *-network description: Wireless interface product: BCM4312 802.11b/g LP-PHY vendor: Broadcom Corporation physical id: 0 bus info: pci@0000:0c:00.0 logical name: eth1 version: 01 serial: 00:22:5f:77:1f:e6 width: 64 bits clock: 33MHz capabilities: pm msi pciexpress bus_master cap_list ethernet physical wireless configuration: broadcast=yes driver=wl0 driverversion=5.100.82.38 latency=0 multicast=yes wireless=IEEE 802.11bg resources: irq:17 memory:f69fc000-f69fffff *-network description: Ethernet interface product: 88E8040 PCI-E Fast Ethernet Controller vendor: Marvell Technology Group Ltd. physical id: 0 bus info: pci@0000:09:00.0 logical name: eth0 version: 13 serial: 00:23:ae:24:71:45 size: 100Mbit/s capacity: 100Mbit/s width: 64 bits clock: 33MHz capabilities: pm msi pciexpress bus_master cap_list ethernet physical tp 10bt 10bt-fd 100bt 100bt-fd autonegotiation configuration: autonegotiation=on broadcast=yes driver=sky2 driverversion=1.30 duplex=full firmware=N/A ip=192.168.1.158 latency=0 link=yes multicast=yes port=twisted pair speed=100Mbit/s resources: irq:45 memory:f68fc000-f68fffff ioport:de00(size=256) *-network description: Ethernet interface physical id: 2 logical name: ham0 serial: 7a:79:05:2d:b0:f7 size: 10Mbit/s capabilities: ethernet physical configuration: autonegotiation=off broadcast=yes driver=tun driverversion=1.6 duplex=full firmware=N/A ip=5.45.176.247 link=yes multicast=yes port=twisted pair speed=10Mbit/s and the results of rfkill list all 0: brcmwl-0: Wireless LAN Soft blocked: yes Hard blocked: yes 1: dell-wifi: Wireless LAN Soft blocked: yes Hard blocked: yes

    Read the article

  • Fresh Ubuntu Install - Grub not loading

    - by Ryan Sharp
    System Ubuntu 12.04 64-bit Windows 7 SP1 Samsung 64GB SSD - OS' Samsung 1TB HDD - Games, /Home, Swap WD 300'ishGB HDD - Backup Okay, so I'm very frustrated, so please excuse me if I miss anything out as my head is clouded by anger and impatience, etc. I'll try me best, though. First of all, I'll explain how I got to my predicament. I finally got my new SSD. I firstly installed Windows, which completed without a hitch. Afterwards, I tried to install Ubuntu, which failed several times due to problems irrelevant to this question, but I mention this to explain my frustrations, sorry. Anyway, I finally installed Ubuntu. However, I chose the 'bootloader' to be installed on the same partition as where I was installing the Ubuntu Root partition, as that was what I believed to be the best choice. It was of my thinking that it was supposed to go on the same partition and on the SSD, which is my OS drive, though with my problem, it apparently was wrong. So I tried to fix it by checking guides and following their directions, but seemed to have messed it up even more. Here is what I receive after I use the fdisk -l command: (I also added explanations for which I used each partition for) Disk /dev/sda: 64.0 GB, 64023257088 bytes 255 heads, 63 sectors/track, 7783 cylinders, total 125045424 sectors Units = sectors of 1 * 512 = 512 bytes Sector size (logical/physical): 512 bytes / 512 bytes I/O size (minimum/optimal): 512 bytes / 512 bytes Disk identifier: 0x324971d1 Device Boot Start End Blocks Id System /dev/sda1 * 2048 206847 102400 7 HPFS/NTFS/exFAT /dev/sda2 208896 48957439 24374272 7 HPFS/NTFS/exFAT /dev/sda3 48959486 125044735 38042625 5 Extended /dev/sda5 48959488 125044735 38042624 83 Linux sda1 --/ Windows Recovery sda2 --/ Windows 7 sda3/5 --/ Ubuntu root [ / ] Disk /dev/sdb: 1000.2 GB, 1000204886016 bytes 255 heads, 63 sectors/track, 121601 cylinders, total 1953525168 sectors Units = sectors of 1 * 512 = 512 bytes Sector size (logical/physical): 512 bytes / 512 bytes I/O size (minimum/optimal): 512 bytes / 512 bytes Disk identifier: 0xc0ee6a69 Device Boot Start End Blocks Id System /dev/sdb1 1024208894 1953523711 464657409 5 Extended /dev/sdb3 * 2048 1024206847 512102400 7 HPFS/NTFS/exFAT /dev/sdb5 1024208896 1939851263 457821184 83 Linux /dev/sdb6 1939853312 1953523711 6835200 82 Linux swap / Solaris sdb3 --/ Partition for Steam games, etc. sdb5 --/ Ubuntu Home [ /home ] sdb6 --/ Ubuntu Swap Partition table entries are not in disk order Disk /dev/sdc: 320.1 GB, 320072933376 bytes 255 heads, 63 sectors/track, 38913 cylinders, total 625142448 sectors Units = sectors of 1 * 512 = 512 bytes Sector size (logical/physical): 512 bytes / 512 bytes I/O size (minimum/optimal): 512 bytes / 512 bytes Disk identifier: 0x292eee23 Device Boot Start End Blocks Id System /dev/sdc1 2048 625141759 312569856 7 HPFS/NTFS/exFAT sdc1 --/ Generic backup I also used a Boot Script that other users suggested, so that I can give more details on my partitions and also where Grub is located... ============================= Boot Info Summary: =============================== => Grub2 (v1.99) is installed in the MBR of /dev/sda and looks at sector 1 of the same hard drive for core.img. core.img is at this location and looks for (,msdos5)/boot/grub on this drive. => Grub2 (v1.99) is installed in the MBR of /dev/sdb and looks at sector 1 of the same hard drive for core.img. core.img is at this location and looks for (,msdos5)/boot/grub on this drive. => Windows is installed in the MBR of /dev/sdc. Now that is weird... Why would Grub2 be installed on both my SSD and HDD? Even weirder is why is Windows on the MBR of my backup hard drive? Nothing I did should have done that... Anyway, here is the entire Output from that script... PASTEBIN So, to summarize what I need: How can I fix my setup so grub loads on startup? How can I clean my partitions to remove unnecessary grubs? What did I do wrong so that I don't do something so daft again? Thank you so much for reading, and I hope you can help me. I've been trying to have a successful setup since Friday, and I'm almost at the point that I'm really tempted to throw my computer out the window due to my frustration.

    Read the article

  • WiFi stops working after a while in Lenovo ThinkPad W520 (Ubuntu 12.04)

    - by el10780
    After several minutes(I do not know how many) there is no internet connection on my laptop via Wi-Fi.Ubuntu doesn't show any kind of message that my WiFi was disconnected neither there is a signal drop,but suddenly Firefox stops connecting to web pages.I checked my modem/router and it seems that it is working fine.I tried also to reboot the WiFi device and nothing happens.The only thing that it makes it work again is a reboot of the system and if I do not want to do a reboot then I am enforced to connect to the Internet using Ethernet cable.Does anybody know what is happening? ## Some Hardware info that might be helpful ## el10780@ThinkPad-W520:~$ sudo lshw -class network *-network description: Ethernet interface product: 82579LM Gigabit Network Connection vendor: Intel Corporation physical id: 19 bus info: pci@0000:00:19.0 logical name: eth0 version: 04 serial: f0:de:f1:f1:be:10 size: 100Mbit/s capacity: 1Gbit/s width: 32 bits clock: 33MHz capabilities: pm msi bus_master cap_list ethernet physical tp 10bt 10bt-fd 100bt 100bt-fd 1000bt-fd autonegotiation configuration: autonegotiation=on broadcast=yes driver=e1000e driverversion=1.5.1-k duplex=full firmware=0.13-3 ip=192.168.0.10 latency=0 link=yes multicast=yes port=twisted pair speed=100Mbit/s resources: irq:50 memory:f3a00000-f3a1ffff memory:f3a2b000-f3a2bfff ioport:6080(size=32) *-network description: Wireless interface product: Centrino Advanced-N + WiMAX 6250 vendor: Intel Corporation physical id: 0 bus info: pci@0000:03:00.0 logical name: wlan0 version: 5e serial: 64:80:99:63:14:74 width: 64 bits clock: 33MHz capabilities: pm msi pciexpress bus_master cap_list ethernet physical wireless configuration: broadcast=yes driver=iwlwifi driverversion=3.2.0-26-generic firmware=41.28.5.1 build 33926 ip=192.168.0.6 latency=0 link=yes multicast=yes wireless=IEEE 802.11abgn resources: irq:52 memory:f3900000-f3901fff *-network description: Ethernet interface physical id: 1 bus info: usb@2:1.3 logical name: wmx0 serial: 00:1d:e1:53:b2:e8 capabilities: ethernet physical configuration: driver=i2400m firmware=i6050-fw-usb-1.5.sbcf link=no el10780@ThinkPad-W520:~$ lspci 00:00.0 Host bridge: Intel Corporation 2nd Generation Core Processor Family DRAM Controller (rev 09) 00:01.0 PCI bridge: Intel Corporation Xeon E3-1200/2nd Generation Core Processor Family PCI Express Root Port (rev 09) 00:02.0 VGA compatible controller: Intel Corporation 2nd Generation Core Processor Family Integrated Graphics Controller (rev 09) 00:16.0 Communication controller: Intel Corporation 6 Series/C200 Series Chipset Family MEI Controller #1 (rev 04) 00:16.3 Serial controller: Intel Corporation 6 Series/C200 Series Chipset Family KT Controller (rev 04) 00:19.0 Ethernet controller: Intel Corporation 82579LM Gigabit Network Connection (rev 04) 00:1a.0 USB controller: Intel Corporation 6 Series/C200 Series Chipset Family USB Enhanced Host Controller #2 (rev 04) 00:1b.0 Audio device: Intel Corporation 6 Series/C200 Series Chipset Family High Definition Audio Controller (rev 04) 00:1c.0 PCI bridge: Intel Corporation 6 Series/C200 Series Chipset Family PCI Express Root Port 1 (rev b4) 00:1c.1 PCI bridge: Intel Corporation 6 Series/C200 Series Chipset Family PCI Express Root Port 2 (rev b4) 00:1c.3 PCI bridge: Intel Corporation 6 Series/C200 Series Chipset Family PCI Express Root Port 4 (rev b4) 00:1c.4 PCI bridge: Intel Corporation 6 Series/C200 Series Chipset Family PCI Express Root Port 5 (rev b4) 00:1c.6 PCI bridge: Intel Corporation 6 Series/C200 Series Chipset Family PCI Express Root Port 7 (rev b4) 00:1d.0 USB controller: Intel Corporation 6 Series/C200 Series Chipset Family USB Enhanced Host Controller #1 (rev 04) 00:1f.0 ISA bridge: Intel Corporation QM67 Express Chipset Family LPC Controller (rev 04) 00:1f.2 SATA controller: Intel Corporation 6 Series/C200 Series Chipset Family 6 port SATA AHCI Controller (rev 04) 00:1f.3 SMBus: Intel Corporation 6 Series/C200 Series Chipset Family SMBus Controller (rev 04) 01:00.0 VGA compatible controller: NVIDIA Corporation GF108 [Quadro 1000M] (rev a1) 03:00.0 Network controller: Intel Corporation Centrino Advanced-N + WiMAX 6250 (rev 5e) 0d:00.0 System peripheral: Ricoh Co Ltd Device e823 (rev 08) 0d:00.3 FireWire (IEEE 1394): Ricoh Co Ltd R5C832 PCIe IEEE 1394 Controller (rev 04) 0e:00.0 USB controller: NEC Corporation uPD720200 USB 3.0 Host Controller (rev 04) el10780@ThinkPad-W520:~$ rfkill list all 0: hci0: Bluetooth Soft blocked: no Hard blocked: no 1: tpacpi_bluetooth_sw: Bluetooth Soft blocked: no Hard blocked: no 2: phy0: Wireless LAN Soft blocked: no Hard blocked: no 3: i2400m-usb:2-1.3:1.0: WiMAX Soft blocked: yes Hard blocked: no The weirdest thing is this screenshot which I took after running the **Additional Drivers** program.I mean I have a NVidia Quadro 1000M and my Intel Centrino WiFi Card and this shows that there are not proprietay drivers for my system. http://imageshack.us/photo/my-images/268/screenshotfrom201207062.png/

    Read the article

  • Columnstore Case Study #1: MSIT SONAR Aggregations

    - by aspiringgeek
    Preamble This is the first in a series of posts documenting big wins encountered using columnstore indexes in SQL Server 2012 & 2014.  Many of these can be found in this deck along with details such as internals, best practices, caveats, etc.  The purpose of sharing the case studies in this context is to provide an easy-to-consume quick-reference alternative. Why Columnstore? If we’re looking for a subset of columns from one or a few rows, given the right indexes, SQL Server can do a superlative job of providing an answer. If we’re asking a question which by design needs to hit lots of rows—DW, reporting, aggregations, grouping, scans, etc., SQL Server has never had a good mechanism—until columnstore. Columnstore indexes were introduced in SQL Server 2012. However, they're still largely unknown. Some adoption blockers existed; yet columnstore was nonetheless a game changer for many apps.  In SQL Server 2014, potential blockers have been largely removed & they're going to profoundly change the way we interact with our data.  The purpose of this series is to share the performance benefits of columnstore & documenting columnstore is a compelling reason to upgrade to SQL Server 2014. App: MSIT SONAR Aggregations At MSIT, performance & configuration data is captured by SCOM. We archive much of the data in a partitioned data warehouse table in SQL Server 2012 for reporting via an application called SONAR.  By definition, this is a primary use case for columnstore—report queries requiring aggregation over large numbers of rows.  New data is refreshed each night by an automated table partitioning mechanism—a best practices scenario for columnstore. The Win Compared to performance using classic indexing which resulted in the expected query plan selection including partition elimination vs. SQL Server 2012 nonclustered columnstore, query performance increased significantly.  Logical reads were reduced by over a factor of 50; both CPU & duration improved by factors of 20 or more.  Other than creating the columnstore index, no special modifications or tweaks to the app or databases schema were necessary to achieve the performance improvements.  Existing nonclustered indexes were rendered superfluous & were deleted, thus mitigating maintenance challenges such as defragging as well as conserving disk capacity. Details The table provides the raw data & summarizes the performance deltas. Logical Reads (8K pages) CPU (ms) Durn (ms) Columnstore 160,323 20,360 9,786 Conventional Table & Indexes 9,053,423 549,608 193,903 ? x56 x27 x20 The charts provide additional perspective of this data.  "Conventional vs. Columnstore Metrics" document the raw data.  Note on this linear display the magnitude of the conventional index performance vs. columnstore.  The “Metrics (?)” chart expresses these values as a ratio. Summary For DW, reports, & other BI workloads, columnstore often provides significant performance enhancements relative to conventional indexing.  I have documented here, the first in a series of reports on columnstore implementations, results from an initial implementation at MSIT in which logical reads were reduced by over a factor of 50; both CPU & duration improved by factors of 20 or more.  Subsequent features in this series document performance enhancements that are even more significant. 

    Read the article

  • Need help partitioning when reinstalling Ubuntu 14.04

    - by Chris M.
    I upgraded to 14.04 about a month ago on my HP Mini netbook (about 16 GB hard disk). A few days ago the system crashed (I don't know why but I was using internet at the time). When I restarted the computer, Ubuntu would not load. Instead, I got a message from the BIOS saying Reboot and Select proper Boot device or Insert Boot Media in selected Boot device and press a key I took this to mean that I needed to reinstall 14.04. When I try to reinstall Ubuntu from the USB stick, I choose "Erase disk and install Ubuntu" but then I get a message: Some of the partitions you created are too small. Please make the following partitions at least this large: / 3.3 GB If you do not go back to the partitioner and increase the size of these partitions, the installation may fail. At first I hit Continue to see if it would install anyway, and it gave the message: The attempt to mount a file system with type ext4 in SCSI1 (0,0,0), partition # 1 (sda) at / failed. You may resume partitioning from the partitioning menu. The second time I hit Go Back, and it took me to the following partitioning table: Device Type Mount Point Format Size Used System /dev/sda /dev/sda1 ext4 (checked) 3228 MB Unknown /dev/sda5 swap (not checked) 1063 MB Unknown + - Change New Partition Table... Revert Device for boot loader installation: /dev/sda ATA JM Loader 001 (4.3 GB) At this point I'm not sure what to do. I've never partitioned my hard drive before and I don't want to screw things up. (I'm not particularly tech savvy.) Can you instruct me what I should do. (P.S. I'm afraid the table might not appear as I typed it in.) Results from fdisk: ubuntu@ubuntu:~$ sudo fdisk -l Disk /dev/sda: 4294 MB, 4294967296 bytes 255 heads, 63 sectors/track, 522 cylinders, total 8388608 sectors Units = sectors of 1 * 512 = 512 bytes Sector size (logical/physical): 512 bytes / 512 bytes I/O size (minimum/optimal): 512 bytes / 512 bytes Disk identifier: 0x00000000 Disk /dev/sda doesn't contain a valid partition table Disk /dev/sdb: 7860 MB, 7860125696 bytes 155 heads, 31 sectors/track, 3194 cylinders, total 15351808 sectors Units = sectors of 1 * 512 = 512 bytes Sector size (logical/physical): 512 bytes / 512 bytes I/O size (minimum/optimal): 512 bytes / 512 bytes Disk identifier: 0x0009a565 Device Boot Start End Blocks Id System /dev/sdb1 * 2768 15351807 7674520 b W95 FAT32 ubuntu@ubuntu:~$ Here is what it displays when I open the Disks utility (I tried the screenshot terminal command you suggested but it didn't seem to do anything): 4.3 GB Hard Disk /dev/sda Model: JM Loader 001 (01000001) Size: 4.3 GB (4,294,967,296 bytes) Serial Number: 01234123412341234 Assessment: SMART is not supported Volumes Size: 4.3 GB (4,294,967,296 bytes) Device: /dev/sda Contents: Unknown (There is a button in the utility that when you click it gives the following options: Format... Create Disk Image... Restore Disk Image... Benchmark but SMART Data & Self-Tests... is dimmed out) When I hit F9 Change Boot Device Order, it shows the hard drive as: SATA:PM-JM Loader 001 When I hit F10 to get me into the BIOS Setup Utility, under Diagnostic it shows: Primary Hard Disk Self Test Not Support NetworkManager Tool State: disconnected Device: eth0 Type: Wired Driver: atl1c State: unavailable Default: no HW Address: 00:26:55:B0:7F:0C Capabilities: Carrier Detect: yes Wired Properties Carrier: off When I run command lshw -C network, I get: WARNING: you should run this program as super-user. *-network description: Network controller product: BCM4312 802.11b/g LP-PHY vendor: Broadcom Corporation physical id: 0 bus info: pci@0000:01:00.0 version: 01 width: 64 bits clock: 33MHz capabilities: bus_master cap_list configuration: driver=b43-pci-bridge latency=0 resources: irq:16 memory:feafc000-feafffff *-network description: Ethernet interface product: AR8132 Fast Ethernet vendor: Qualcomm Atheros physical id: 0 bus info: pci@0000:02:00.0 logical name: eth0 version: c0 serial: 00:26:55:b0:7f:0c capacity: 100Mbit/s width: 64 bits clock: 33MHz capabilities: bus_master cap_list ethernet physical tp 10bt 10bt-fd 100bt 100bt-fd autonegotiation configuration: autonegotiation=on broadcast=yes driver=atl1c driverversion=1.0.1.1-NAPI latency=0 link=no multicast=yes port=twisted pair resources: irq:43 memory:febc0000-febfffff ioport:ec80(size=128) WARNING: output may be incomplete or inaccurate, you should run this program as super-user.

    Read the article

  • My LAN USB NIC is not working in ubuntu 11.10?

    - by Gaurav_Java
    Today i start my system its seems that my LAN port is not working . so i buy one USB to LAN adapter and i plug in ubuntu system its doen't connect automatically .when i check result lsusb its shows me that there is one DM9601 Ethernet adapter is connected when i click on network information in panel its shows me that there is something " wired NEtwork (Broadcom NetLink BCM5784M gigabit Ethernet PCIe) I think want some driver for that .i don't have any idea how it can be used ? here output of sudo lspci -nn *00:00.0 Host bridge [0600]: Intel Corporation Mobile 4 Series Chipset Memory Controller Hub [8086:2a40] (rev 07) 00:02.0 VGA compatible controller [0300]: Intel Corporation Mobile 4 Series Chipset Integrated Graphics Controller [8086:2a42] (rev 07) 00:02.1 Display controller [0380]: Intel Corporation Mobile 4 Series Chipset Integrated Graphics Controller [8086:2a43] (rev 07) 00:1a.0 USB Controller [0c03]: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #4 [8086:2937] (rev 03) 00:1a.1 USB Controller [0c03]: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #5 [8086:2938] (rev 03) 00:1a.7 USB Controller [0c03]: Intel Corporation 82801I (ICH9 Family) USB2 EHCI Controller #2 [8086:293c] (rev 03) 00:1b.0 Audio device [0403]: Intel Corporation 82801I (ICH9 Family) HD Audio Controller [8086:293e] (rev 03) 00:1c.0 PCI bridge [0604]: Intel Corporation 82801I (ICH9 Family) PCI Express Port 1 [8086:2940] (rev 03) 00:1c.1 PCI bridge [0604]: Intel Corporation 82801I (ICH9 Family) PCI Express Port 2 [8086:2942] (rev 03) 00:1c.2 PCI bridge [0604]: Intel Corporation 82801I (ICH9 Family) PCI Express Port 3 [8086:2944] (rev 03) 00:1c.4 PCI bridge [0604]: Intel Corporation 82801I (ICH9 Family) PCI Express Port 5 [8086:2948] (rev 03) 00:1d.0 USB Controller [0c03]: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #1 [8086:2934] (rev 03) 00:1d.1 USB Controller [0c03]: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #2 [8086:2935] (rev 03) 00:1d.2 USB Controller [0c03]: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #3 [8086:2936] (rev 03) 00:1d.3 USB Controller [0c03]: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #6 [8086:2939] (rev 03) 00:1d.7 USB Controller [0c03]: Intel Corporation 82801I (ICH9 Family) USB2 EHCI Controller #1 [8086:293a] (rev 03) 00:1e.0 PCI bridge [0604]: Intel Corporation 82801 Mobile PCI Bridge [8086:2448] (rev 93) 00:1f.0 ISA bridge [0601]: Intel Corporation ICH9M LPC Interface Controller [8086:2919] (rev 03) 00:1f.2 SATA controller [0106]: Intel Corporation ICH9M/M-E SATA AHCI Controller [8086:2929] (rev 03) 00:1f.3 SMBus [0c05]: Intel Corporation 82801I (ICH9 Family) SMBus Controller [8086:2930] (rev 03) 02:00.0 Ethernet controller [0200]: Broadcom Corporation NetLink BCM5784M Gigabit Ethernet PCIe [14e4:1698] (rev 10) 04:00.0 Network controller [0280]: Intel Corporation WiFi Link 5100 [8086:4232]* sudo lshw -class network *-network description: Ethernet interface product: NetLink BCM5784M Gigabit Ethernet PCIe vendor: Broadcom Corporation physical id: 0 bus info: pci@0000:02:00.0 logical name: eth0 version: 10 serial: 00:1f:16:9a:56:98 capacity: 1Gbit/s width: 64 bits clock: 33MHz capabilities: pm vpd msi pciexpress bus_master cap_list ethernet physical tp 10bt 10bt-fd 100bt 100bt-fd 1000bt 1000bt-fd autonegotiation configuration: autonegotiation=on broadcast=yes driver=tg3 driverversion=3.119 firmware=sb v2.19 latency=0 link=no multicast=yes port=twisted pair resources: irq:47 memory:f4500000-f450ffff *-network DISABLED description: Wireless interface product: WiFi Link 5100 vendor: Intel Corporation physical id: 0 bus info: pci@0000:04:00.0 logical name: wlan0 version: 00 serial: 00:22:fa:09:02:00 width: 64 bits clock: 33MHz capabilities: pm msi pciexpress bus_master cap_list ethernet physical wireless configuration: broadcast=yes driver=iwlagn driverversion=3.0.0-17-generic firmware=8.83.5.1 build 33692 latency=0 link=no multicast=yes wireless=IEEE 802.11abgn resources: irq:46 memory:f4600000-f4601fff *-network description: Ethernet interface physical id: 4 logical name: eth1 serial: 00:60:6e:00:f1:7d size: 100Mbit/s capacity: 100Mbit/s capabilities: ethernet physical tp mii 10bt 10bt-fd 100bt 100bt-fd autonegotiation configuration: autonegotiation=on broadcast=yes driver=dm9601 driverversion=22-Aug-2005 duplex=full firmware=Davicom DM9601 USB Ethernet ip=192.168.1.34 link=yes multicast=yes port=MII speed=100Mbit/s I am using Wimax internet connection which i have to connect from browser . at that time my system is not showing that i am connected to any wired connection. but when i connect internet from other system after getting conneted to internet . when i plug again my USB LAN then its shows that you are conneted to wired connetion. here is screenshot for conneting wimax from browser after connecting to internet network connection shows

    Read the article

  • help in the Donalds B. Johnson's algorithm, i cannot understand the pseudo code

    - by Pitelk
    Hi , does anyone know the Donald B. Johnson's algorithm which enumarates all the elementary circuits (cycles) in a Directed graph? link text I have the paper he had published in 1975 but I cannot understand the pseudo-code. My goal is to implement this algorithm in java. Some questions i have is for example what is the matrix Ak it refers to. In the pseudo code mentions that Ak:=adjacency structure of strong component K with least vertex in subgraph of G induced by {s,s+1,....n}; Does that mean i have to implement another algorithm that finds the Ak matrix? Another question is what the following means? begin logical f; Does also the line "logical procedure CIRCUIT (integer value v);" means that the circuit procedure returns a logical variable. In the pseudo code also has the line "CIRCUIT := f;" . Does this mean? It would be great if someone could translate this 1970's pseudocode to a more modern type of pseudo code so i can understand it in case you are interested to help but you cannot find the paper please email me at [email protected] and i will send you the paper. Thanks in advance

    Read the article

  • slicing a 2d numpy array

    - by MedicalMath
    The following code: import numpy as p myarr=[[0,1],[0,6],[0,1],[0,6],[0,1],[0,6],[0,1],[0,6],[0,1],[0,6],[0,1],[0,6],[0,1],[0,6],[0,1],[0,6],[0,1],[0,6]] copy=p.array(myarr) p.mean(copy)[:,1] Is generating the following error message: Traceback (most recent call last): File "<pyshell#3>", line 1, in <module> p.mean(copy)[:,1] IndexError: 0-d arrays can only use a single () or a list of newaxes (and a single ...) as an index I looked up the syntax at this link and I seem to be using the correct syntax to slice. However, when I type copy[:,1] into the Python shell, it gives me the following output, which is clearly wrong, and is probably what is throwing the error: array([1, 6, 1, 6, 1, 6, 1, 6, 1, 6, 1, 6, 1, 6, 1, 6, 1, 6]) Can anyone show me how to fix my code so that I can extract the second column and then take the mean of the second column as intended in the original code above? EDIT: Thank you for your solutions. However, my posting was an oversimplification of my real problem. I used your solutions in my real code, and got a new error. Here is my real code with one of your solutions that I tried: filteredSignalArray=p.array(filteredSignalArray) logical=p.logical_and(EndTime-10.0<=matchingTimeArray,matchingTimeArray<=EndTime) finalStageTime=matchingTimeArray.compress(logical) finalStageFiltered=filteredSignalArray.compress(logical) for j in range(len(finalStageTime)): if j == 0: outputArray=[[finalStageTime[j],finalStageFiltered[j]]] else: outputArray+=[[finalStageTime[j],finalStageFiltered[j]]] print 'outputArray[:,1].mean() is: ',outputArray[:,1].mean() And here is the error message that is now being generated by the new code: File "mypath\myscript.py", line 1545, in WriteToOutput10SecondsBeforeTimeMarker print 'outputArray[:,1].mean() is: ',outputArray[:,1].mean() TypeError: list indices must be integers, not tuple Second EDIT: This is solved now that I added: outputArray=p.array(outputArray) above my code. I have been at this too many hours and need to take a break for a while if I am making these kinds of mistakes.

    Read the article

  • iOS layout; I'm not getting it

    - by Tbee
    Well, "not getting it" is too harsh; I've got it working in for what for me is a logical setup, but it does not seem to be what iOS deems logical. So I'm not getting something. Suppose I've got an app that shows two pieces of information; a date and a table. According to the MVC approach I've got three MVC at work here, one for the date, one for the table and one that takes both these MCVs and makes it into a screen, wiring them up. The master MVC knows how/where it wants to layout the two sub MVC's. Each detail MVC only takes care of its own childeren within the bounds that were specified by the master MVC. Something like: - (void)loadView { MVC* mvc1 = [[MVC1 alloc] initwithFrame:...] [self.view addSubview:mvc1.view]; MVC* mvc2 = [[MVC2 alloc] initwithFrame:...] [self.view addSubview:mvc2.view]; } If the above is logical (which is it for me) then I would expect any MVC class to have a constructor "initWithFrame". But an MVC does not, only view have this. Why? How would one correctly layout nested MVCs? (Naturally I do not have just these two, but the detail MVCs have sub MVCs again.)

    Read the article

  • Improving Partitioned Table Join Performance

    - by Paul White
    The query optimizer does not always choose an optimal strategy when joining partitioned tables. This post looks at an example, showing how a manual rewrite of the query can almost double performance, while reducing the memory grant to almost nothing. Test Data The two tables in this example use a common partitioning partition scheme. The partition function uses 41 equal-size partitions: CREATE PARTITION FUNCTION PFT (integer) AS RANGE RIGHT FOR VALUES ( 125000, 250000, 375000, 500000, 625000, 750000, 875000, 1000000, 1125000, 1250000, 1375000, 1500000, 1625000, 1750000, 1875000, 2000000, 2125000, 2250000, 2375000, 2500000, 2625000, 2750000, 2875000, 3000000, 3125000, 3250000, 3375000, 3500000, 3625000, 3750000, 3875000, 4000000, 4125000, 4250000, 4375000, 4500000, 4625000, 4750000, 4875000, 5000000 ); GO CREATE PARTITION SCHEME PST AS PARTITION PFT ALL TO ([PRIMARY]); There two tables are: CREATE TABLE dbo.T1 ( TID integer NOT NULL IDENTITY(0,1), Column1 integer NOT NULL, Padding binary(100) NOT NULL DEFAULT 0x,   CONSTRAINT PK_T1 PRIMARY KEY CLUSTERED (TID) ON PST (TID) );   CREATE TABLE dbo.T2 ( TID integer NOT NULL, Column1 integer NOT NULL, Padding binary(100) NOT NULL DEFAULT 0x,   CONSTRAINT PK_T2 PRIMARY KEY CLUSTERED (TID, Column1) ON PST (TID) ); The next script loads 5 million rows into T1 with a pseudo-random value between 1 and 5 for Column1. The table is partitioned on the IDENTITY column TID: INSERT dbo.T1 WITH (TABLOCKX) (Column1) SELECT (ABS(CHECKSUM(NEWID())) % 5) + 1 FROM dbo.Numbers AS N WHERE n BETWEEN 1 AND 5000000; In case you don’t already have an auxiliary table of numbers lying around, here’s a script to create one with 10 million rows: CREATE TABLE dbo.Numbers (n bigint PRIMARY KEY);   WITH L0 AS(SELECT 1 AS c UNION ALL SELECT 1), L1 AS(SELECT 1 AS c FROM L0 AS A CROSS JOIN L0 AS B), L2 AS(SELECT 1 AS c FROM L1 AS A CROSS JOIN L1 AS B), L3 AS(SELECT 1 AS c FROM L2 AS A CROSS JOIN L2 AS B), L4 AS(SELECT 1 AS c FROM L3 AS A CROSS JOIN L3 AS B), L5 AS(SELECT 1 AS c FROM L4 AS A CROSS JOIN L4 AS B), Nums AS(SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS n FROM L5) INSERT dbo.Numbers WITH (TABLOCKX) SELECT TOP (10000000) n FROM Nums ORDER BY n OPTION (MAXDOP 1); Table T1 contains data like this: Next we load data into table T2. The relationship between the two tables is that table 2 contains ‘n’ rows for each row in table 1, where ‘n’ is determined by the value in Column1 of table T1. There is nothing particularly special about the data or distribution, by the way. INSERT dbo.T2 WITH (TABLOCKX) (TID, Column1) SELECT T.TID, N.n FROM dbo.T1 AS T JOIN dbo.Numbers AS N ON N.n >= 1 AND N.n <= T.Column1; Table T2 ends up containing about 15 million rows: The primary key for table T2 is a combination of TID and Column1. The data is partitioned according to the value in column TID alone. Partition Distribution The following query shows the number of rows in each partition of table T1: SELECT PartitionID = CA1.P, NumRows = COUNT_BIG(*) FROM dbo.T1 AS T CROSS APPLY (VALUES ($PARTITION.PFT(TID))) AS CA1 (P) GROUP BY CA1.P ORDER BY CA1.P; There are 40 partitions containing 125,000 rows (40 * 125k = 5m rows). The rightmost partition remains empty. The next query shows the distribution for table 2: SELECT PartitionID = CA1.P, NumRows = COUNT_BIG(*) FROM dbo.T2 AS T CROSS APPLY (VALUES ($PARTITION.PFT(TID))) AS CA1 (P) GROUP BY CA1.P ORDER BY CA1.P; There are roughly 375,000 rows in each partition (the rightmost partition is also empty): Ok, that’s the test data done. Test Query and Execution Plan The task is to count the rows resulting from joining tables 1 and 2 on the TID column: SET STATISTICS IO ON; DECLARE @s datetime2 = SYSUTCDATETIME();   SELECT COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID;   SELECT DATEDIFF(Millisecond, @s, SYSUTCDATETIME()); SET STATISTICS IO OFF; The optimizer chooses a plan using parallel hash join, and partial aggregation: The Plan Explorer plan tree view shows accurate cardinality estimates and an even distribution of rows across threads (click to enlarge the image): With a warm data cache, the STATISTICS IO output shows that no physical I/O was needed, and all 41 partitions were touched: Running the query without actual execution plan or STATISTICS IO information for maximum performance, the query returns in around 2600ms. Execution Plan Analysis The first step toward improving on the execution plan produced by the query optimizer is to understand how it works, at least in outline. The two parallel Clustered Index Scans use multiple threads to read rows from tables T1 and T2. Parallel scan uses a demand-based scheme where threads are given page(s) to scan from the table as needed. This arrangement has certain important advantages, but does result in an unpredictable distribution of rows amongst threads. The point is that multiple threads cooperate to scan the whole table, but it is impossible to predict which rows end up on which threads. For correct results from the parallel hash join, the execution plan has to ensure that rows from T1 and T2 that might join are processed on the same thread. For example, if a row from T1 with join key value ‘1234’ is placed in thread 5’s hash table, the execution plan must guarantee that any rows from T2 that also have join key value ‘1234’ probe thread 5’s hash table for matches. The way this guarantee is enforced in this parallel hash join plan is by repartitioning rows to threads after each parallel scan. The two repartitioning exchanges route rows to threads using a hash function over the hash join keys. The two repartitioning exchanges use the same hash function so rows from T1 and T2 with the same join key must end up on the same hash join thread. Expensive Exchanges This business of repartitioning rows between threads can be very expensive, especially if a large number of rows is involved. The execution plan selected by the optimizer moves 5 million rows through one repartitioning exchange and around 15 million across the other. As a first step toward removing these exchanges, consider the execution plan selected by the optimizer if we join just one partition from each table, disallowing parallelism: SELECT COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID WHERE $PARTITION.PFT(T1.TID) = 1 AND $PARTITION.PFT(T2.TID) = 1 OPTION (MAXDOP 1); The optimizer has chosen a (one-to-many) merge join instead of a hash join. The single-partition query completes in around 100ms. If everything scaled linearly, we would expect that extending this strategy to all 40 populated partitions would result in an execution time around 4000ms. Using parallelism could reduce that further, perhaps to be competitive with the parallel hash join chosen by the optimizer. This raises a question. If the most efficient way to join one partition from each of the tables is to use a merge join, why does the optimizer not choose a merge join for the full query? Forcing a Merge Join Let’s force the optimizer to use a merge join on the test query using a hint: SELECT COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID OPTION (MERGE JOIN); This is the execution plan selected by the optimizer: This plan results in the same number of logical reads reported previously, but instead of 2600ms the query takes 5000ms. The natural explanation for this drop in performance is that the merge join plan is only using a single thread, whereas the parallel hash join plan could use multiple threads. Parallel Merge Join We can get a parallel merge join plan using the same query hint as before, and adding trace flag 8649: SELECT COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID OPTION (MERGE JOIN, QUERYTRACEON 8649); The execution plan is: This looks promising. It uses a similar strategy to distribute work across threads as seen for the parallel hash join. In practice though, performance is disappointing. On a typical run, the parallel merge plan runs for around 8400ms; slower than the single-threaded merge join plan (5000ms) and much worse than the 2600ms for the parallel hash join. We seem to be going backwards! The logical reads for the parallel merge are still exactly the same as before, with no physical IOs. The cardinality estimates and thread distribution are also still very good (click to enlarge): A big clue to the reason for the poor performance is shown in the wait statistics (captured by Plan Explorer Pro): CXPACKET waits require careful interpretation, and are most often benign, but in this case excessive waiting occurs at the repartitioning exchanges. Unlike the parallel hash join, the repartitioning exchanges in this plan are order-preserving ‘merging’ exchanges (because merge join requires ordered inputs): Parallelism works best when threads can just grab any available unit of work and get on with processing it. Preserving order introduces inter-thread dependencies that can easily lead to significant waits occurring. In extreme cases, these dependencies can result in an intra-query deadlock, though the details of that will have to wait for another time to explore in detail. The potential for waits and deadlocks leads the query optimizer to cost parallel merge join relatively highly, especially as the degree of parallelism (DOP) increases. This high costing resulted in the optimizer choosing a serial merge join rather than parallel in this case. The test results certainly confirm its reasoning. Collocated Joins In SQL Server 2008 and later, the optimizer has another available strategy when joining tables that share a common partition scheme. This strategy is a collocated join, also known as as a per-partition join. It can be applied in both serial and parallel execution plans, though it is limited to 2-way joins in the current optimizer. Whether the optimizer chooses a collocated join or not depends on cost estimation. The primary benefits of a collocated join are that it eliminates an exchange and requires less memory, as we will see next. Costing and Plan Selection The query optimizer did consider a collocated join for our original query, but it was rejected on cost grounds. The parallel hash join with repartitioning exchanges appeared to be a cheaper option. There is no query hint to force a collocated join, so we have to mess with the costing framework to produce one for our test query. Pretending that IOs cost 50 times more than usual is enough to convince the optimizer to use collocated join with our test query: -- Pretend IOs are 50x cost temporarily DBCC SETIOWEIGHT(50);   -- Co-located hash join SELECT COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID OPTION (RECOMPILE);   -- Reset IO costing DBCC SETIOWEIGHT(1); Collocated Join Plan The estimated execution plan for the collocated join is: The Constant Scan contains one row for each partition of the shared partitioning scheme, from 1 to 41. The hash repartitioning exchanges seen previously are replaced by a single Distribute Streams exchange using Demand partitioning. Demand partitioning means that the next partition id is given to the next parallel thread that asks for one. My test machine has eight logical processors, and all are available for SQL Server to use. As a result, there are eight threads in the single parallel branch in this plan, each processing one partition from each table at a time. Once a thread finishes processing a partition, it grabs a new partition number from the Distribute Streams exchange…and so on until all partitions have been processed. It is important to understand that the parallel scans in this plan are different from the parallel hash join plan. Although the scans have the same parallelism icon, tables T1 and T2 are not being co-operatively scanned by multiple threads in the same way. Each thread reads a single partition of T1 and performs a hash match join with the same partition from table T2. The properties of the two Clustered Index Scans show a Seek Predicate (unusual for a scan!) limiting the rows to a single partition: The crucial point is that the join between T1 and T2 is on TID, and TID is the partitioning column for both tables. A thread that processes partition ‘n’ is guaranteed to see all rows that can possibly join on TID for that partition. In addition, no other thread will see rows from that partition, so this removes the need for repartitioning exchanges. CPU and Memory Efficiency Improvements The collocated join has removed two expensive repartitioning exchanges and added a single exchange processing 41 rows (one for each partition id). Remember, the parallel hash join plan exchanges had to process 5 million and 15 million rows. The amount of processor time spent on exchanges will be much lower in the collocated join plan. In addition, the collocated join plan has a maximum of 8 threads processing single partitions at any one time. The 41 partitions will all be processed eventually, but a new partition is not started until a thread asks for it. Threads can reuse hash table memory for the new partition. The parallel hash join plan also had 8 hash tables, but with all 5,000,000 build rows loaded at the same time. The collocated plan needs memory for only 8 * 125,000 = 1,000,000 rows at any one time. Collocated Hash Join Performance The collated join plan has disappointing performance in this case. The query runs for around 25,300ms despite the same IO statistics as usual. This is much the worst result so far, so what went wrong? It turns out that cardinality estimation for the single partition scans of table T1 is slightly low. The properties of the Clustered Index Scan of T1 (graphic immediately above) show the estimation was for 121,951 rows. This is a small shortfall compared with the 125,000 rows actually encountered, but it was enough to cause the hash join to spill to physical tempdb: A level 1 spill doesn’t sound too bad, until you realize that the spill to tempdb probably occurs for each of the 41 partitions. As a side note, the cardinality estimation error is a little surprising because the system tables accurately show there are 125,000 rows in every partition of T1. Unfortunately, the optimizer uses regular column and index statistics to derive cardinality estimates here rather than system table information (e.g. sys.partitions). Collocated Merge Join We will never know how well the collocated parallel hash join plan might have worked without the cardinality estimation error (and the resulting 41 spills to tempdb) but we do know: Merge join does not require a memory grant; and Merge join was the optimizer’s preferred join option for a single partition join Putting this all together, what we would really like to see is the same collocated join strategy, but using merge join instead of hash join. Unfortunately, the current query optimizer cannot produce a collocated merge join; it only knows how to do collocated hash join. So where does this leave us? CROSS APPLY sys.partitions We can try to write our own collocated join query. We can use sys.partitions to find the partition numbers, and CROSS APPLY to get a count per partition, with a final step to sum the partial counts. The following query implements this idea: SELECT row_count = SUM(Subtotals.cnt) FROM ( -- Partition numbers SELECT p.partition_number FROM sys.partitions AS p WHERE p.[object_id] = OBJECT_ID(N'T1', N'U') AND p.index_id = 1 ) AS P CROSS APPLY ( -- Count per collocated join SELECT cnt = COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID WHERE $PARTITION.PFT(T1.TID) = p.partition_number AND $PARTITION.PFT(T2.TID) = p.partition_number ) AS SubTotals; The estimated plan is: The cardinality estimates aren’t all that good here, especially the estimate for the scan of the system table underlying the sys.partitions view. Nevertheless, the plan shape is heading toward where we would like to be. Each partition number from the system table results in a per-partition scan of T1 and T2, a one-to-many Merge Join, and a Stream Aggregate to compute the partial counts. The final Stream Aggregate just sums the partial counts. Execution time for this query is around 3,500ms, with the same IO statistics as always. This compares favourably with 5,000ms for the serial plan produced by the optimizer with the OPTION (MERGE JOIN) hint. This is another case of the sum of the parts being less than the whole – summing 41 partial counts from 41 single-partition merge joins is faster than a single merge join and count over all partitions. Even so, this single-threaded collocated merge join is not as quick as the original parallel hash join plan, which executed in 2,600ms. On the positive side, our collocated merge join uses only one logical processor and requires no memory grant. The parallel hash join plan used 16 threads and reserved 569 MB of memory:   Using a Temporary Table Our collocated merge join plan should benefit from parallelism. The reason parallelism is not being used is that the query references a system table. We can work around that by writing the partition numbers to a temporary table (or table variable): SET STATISTICS IO ON; DECLARE @s datetime2 = SYSUTCDATETIME();   CREATE TABLE #P ( partition_number integer PRIMARY KEY);   INSERT #P (partition_number) SELECT p.partition_number FROM sys.partitions AS p WHERE p.[object_id] = OBJECT_ID(N'T1', N'U') AND p.index_id = 1;   SELECT row_count = SUM(Subtotals.cnt) FROM #P AS p CROSS APPLY ( SELECT cnt = COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID WHERE $PARTITION.PFT(T1.TID) = p.partition_number AND $PARTITION.PFT(T2.TID) = p.partition_number ) AS SubTotals;   DROP TABLE #P;   SELECT DATEDIFF(Millisecond, @s, SYSUTCDATETIME()); SET STATISTICS IO OFF; Using the temporary table adds a few logical reads, but the overall execution time is still around 3500ms, indistinguishable from the same query without the temporary table. The problem is that the query optimizer still doesn’t choose a parallel plan for this query, though the removal of the system table reference means that it could if it chose to: In fact the optimizer did enter the parallel plan phase of query optimization (running search 1 for a second time): Unfortunately, the parallel plan found seemed to be more expensive than the serial plan. This is a crazy result, caused by the optimizer’s cost model not reducing operator CPU costs on the inner side of a nested loops join. Don’t get me started on that, we’ll be here all night. In this plan, everything expensive happens on the inner side of a nested loops join. Without a CPU cost reduction to compensate for the added cost of exchange operators, candidate parallel plans always look more expensive to the optimizer than the equivalent serial plan. Parallel Collocated Merge Join We can produce the desired parallel plan using trace flag 8649 again: SELECT row_count = SUM(Subtotals.cnt) FROM #P AS p CROSS APPLY ( SELECT cnt = COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID WHERE $PARTITION.PFT(T1.TID) = p.partition_number AND $PARTITION.PFT(T2.TID) = p.partition_number ) AS SubTotals OPTION (QUERYTRACEON 8649); The actual execution plan is: One difference between this plan and the collocated hash join plan is that a Repartition Streams exchange operator is used instead of Distribute Streams. The effect is similar, though not quite identical. The Repartition uses round-robin partitioning, meaning the next partition id is pushed to the next thread in sequence. The Distribute Streams exchange seen earlier used Demand partitioning, meaning the next partition id is pulled across the exchange by the next thread that is ready for more work. There are subtle performance implications for each partitioning option, but going into that would again take us too far off the main point of this post. Performance The important thing is the performance of this parallel collocated merge join – just 1350ms on a typical run. The list below shows all the alternatives from this post (all timings include creation, population, and deletion of the temporary table where appropriate) from quickest to slowest: Collocated parallel merge join: 1350ms Parallel hash join: 2600ms Collocated serial merge join: 3500ms Serial merge join: 5000ms Parallel merge join: 8400ms Collated parallel hash join: 25,300ms (hash spill per partition) The parallel collocated merge join requires no memory grant (aside from a paltry 1.2MB used for exchange buffers). This plan uses 16 threads at DOP 8; but 8 of those are (rather pointlessly) allocated to the parallel scan of the temporary table. These are minor concerns, but it turns out there is a way to address them if it bothers you. Parallel Collocated Merge Join with Demand Partitioning This final tweak replaces the temporary table with a hard-coded list of partition ids (dynamic SQL could be used to generate this query from sys.partitions): SELECT row_count = SUM(Subtotals.cnt) FROM ( VALUES (1),(2),(3),(4),(5),(6),(7),(8),(9),(10), (11),(12),(13),(14),(15),(16),(17),(18),(19),(20), (21),(22),(23),(24),(25),(26),(27),(28),(29),(30), (31),(32),(33),(34),(35),(36),(37),(38),(39),(40),(41) ) AS P (partition_number) CROSS APPLY ( SELECT cnt = COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID WHERE $PARTITION.PFT(T1.TID) = p.partition_number AND $PARTITION.PFT(T2.TID) = p.partition_number ) AS SubTotals OPTION (QUERYTRACEON 8649); The actual execution plan is: The parallel collocated hash join plan is reproduced below for comparison: The manual rewrite has another advantage that has not been mentioned so far: the partial counts (per partition) can be computed earlier than the partial counts (per thread) in the optimizer’s collocated join plan. The earlier aggregation is performed by the extra Stream Aggregate under the nested loops join. The performance of the parallel collocated merge join is unchanged at around 1350ms. Final Words It is a shame that the current query optimizer does not consider a collocated merge join (Connect item closed as Won’t Fix). The example used in this post showed an improvement in execution time from 2600ms to 1350ms using a modestly-sized data set and limited parallelism. In addition, the memory requirement for the query was almost completely eliminated  – down from 569MB to 1.2MB. The problem with the parallel hash join selected by the optimizer is that it attempts to process the full data set all at once (albeit using eight threads). It requires a large memory grant to hold all 5 million rows from table T1 across the eight hash tables, and does not take advantage of the divide-and-conquer opportunity offered by the common partitioning. The great thing about the collocated join strategies is that each parallel thread works on a single partition from both tables, reading rows, performing the join, and computing a per-partition subtotal, before moving on to a new partition. From a thread’s point of view… If you have trouble visualizing what is happening from just looking at the parallel collocated merge join execution plan, let’s look at it again, but from the point of view of just one thread operating between the two Parallelism (exchange) operators. Our thread picks up a single partition id from the Distribute Streams exchange, and starts a merge join using ordered rows from partition 1 of table T1 and partition 1 of table T2. By definition, this is all happening on a single thread. As rows join, they are added to a (per-partition) count in the Stream Aggregate immediately above the Merge Join. Eventually, either T1 (partition 1) or T2 (partition 1) runs out of rows and the merge join stops. The per-partition count from the aggregate passes on through the Nested Loops join to another Stream Aggregate, which is maintaining a per-thread subtotal. Our same thread now picks up a new partition id from the exchange (say it gets id 9 this time). The count in the per-partition aggregate is reset to zero, and the processing of partition 9 of both tables proceeds just as it did for partition 1, and on the same thread. Each thread picks up a single partition id and processes all the data for that partition, completely independently from other threads working on other partitions. One thread might eventually process partitions (1, 9, 17, 25, 33, 41) while another is concurrently processing partitions (2, 10, 18, 26, 34) and so on for the other six threads at DOP 8. The point is that all 8 threads can execute independently and concurrently, continuing to process new partitions until the wider job (of which the thread has no knowledge!) is done. This divide-and-conquer technique can be much more efficient than simply splitting the entire workload across eight threads all at once. Related Reading Understanding and Using Parallelism in SQL Server Parallel Execution Plans Suck © 2013 Paul White – All Rights Reserved Twitter: @SQL_Kiwi

    Read the article

  • Showplan Operator of the Week – BookMark/Key Lookup

    Fabiano continues in his mission to describe the major Showplan Operators used by SQL Server's Query Optimiser. This week he meets a star, the Key Lookup, a stalwart performer, but most famous for its role in ill-performing queries where an index does not 'cover' the data required to execute the query. If you understand why, and in what circumstances, key lookups are slow, it helps greatly with optimising query performance.

    Read the article

  • Where can I find luxury goods advertisements for my website?

    - by Nazariy
    I'm running business directory for tourist attractions, and I would like to fill some empty blocks with useful advertisements like flight operators, car retailers, luxury goods etc. We have tried Google AdSense but it's full of cheap, pointless and irrelevant advertisement that would make our website look cheap and bad. So I'm curious is there any centralised resources for luxury goods and services?

    Read the article

  • Compute Scalars, Expressions and Execution Plan Performance

    - by Paul White
    The humble Compute Scalar is one of the least well-understood of the execution plan operators, and usually the last place people look for query performance problems. It often appears in execution plans with a very low (or even zero) cost, which goes some way to explaining why people ignore it. Some readers will already know that a Compute Scalar can contain a call to a user-defined function, and that any T-SQL function with a BEGIN…END block in its definition can have truly disastrous consequences...(read more)

    Read the article

  • The blocking nature of aggregates

    - by Rob Farley
    I wrote a post recently about how query tuning isn’t just about how quickly the query runs – that if you have something (such as SSIS) that is consuming your data (and probably introducing a bottleneck), then it might be more important to have a query which focuses on getting the first bit of data out. You can read that post here.  In particular, we looked at two operators that could be used to ensure that a query returns only Distinct rows. and The Sort operator pulls in all the data, sorts it (discarding duplicates), and then pushes out the remaining rows. The Hash Match operator performs a Hashing function on each row as it comes in, and then looks to see if it’s created a Hash it’s seen before. If not, it pushes the row out. The Sort method is quicker, but has to wait until it’s gathered all the data before it can do the sort, and therefore blocks the data flow. But that was my last post. This one’s a bit different. This post is going to look at how Aggregate functions work, which ties nicely into this month’s T-SQL Tuesday. I’ve frequently explained about the fact that DISTINCT and GROUP BY are essentially the same function, although DISTINCT is the poorer cousin because you have less control over it, and you can’t apply aggregate functions. Just like the operators used for Distinct, there are different flavours of Aggregate operators – coming in blocking and non-blocking varieties. The example I like to use to explain this is a pile of playing cards. If I’m handed a pile of cards and asked to count how many cards there are in each suit, it’s going to help if the cards are already ordered. Suppose I’m playing a game of Bridge, I can easily glance at my hand and count how many there are in each suit, because I keep the pile of cards in order. Moving from left to right, I could tell you I have four Hearts in my hand, even before I’ve got to the end. By telling you that I have four Hearts as soon as I know, I demonstrate the principle of a non-blocking operation. This is known as a Stream Aggregate operation. It requires input which is sorted by whichever columns the grouping is on, and it will release a row as soon as the group changes – when I encounter a Spade, I know I don’t have any more Hearts in my hand. Alternatively, if the pile of cards are not sorted, I won’t know how many Hearts I have until I’ve looked through all the cards. In fact, to count them, I basically need to put them into little piles, and when I’ve finished making all those piles, I can count how many there are in each. Because I don’t know any of the final numbers until I’ve seen all the cards, this is blocking. This performs the aggregate function using a Hash Match. Observant readers will remember this from my Distinct example. You might remember that my earlier Hash Match operation – used for Distinct Flow – wasn’t blocking. But this one is. They’re essentially doing a similar operation, applying a Hash function to some data and seeing if the set of values have been seen before, but before, it needs more information than the mere existence of a new set of values, it needs to consider how many of them there are. A lot is dependent here on whether the data coming out of the source is sorted or not, and this is largely determined by the indexes that are being used. If you look in the Properties of an Index Scan, you’ll be able to see whether the order of the data is required by the plan. A property called Ordered will demonstrate this. In this particular example, the second plan is significantly faster, but is dependent on having ordered data. In fact, if I force a Stream Aggregate on unordered data (which I’m doing by telling it to use a different index), a Sort operation is needed, which makes my plan a lot slower. This is all very straight-forward stuff, and information that most people are fully aware of. I’m sure you’ve all read my good friend Paul White (@sql_kiwi)’s post on how the Query Optimizer chooses which type of aggregate function to apply. But let’s take a look at SQL Server Integration Services. SSIS gives us a Aggregate transformation for use in Data Flow Tasks, but it’s described as Blocking. The definitive article on Performance Tuning SSIS uses Sort and Aggregate as examples of Blocking Transformations. I’ve just shown you that Aggregate operations used by the Query Optimizer are not always blocking, but that the SSIS Aggregate component is an example of a blocking transformation. But is it always the case? After all, there are plenty of SSIS Performance Tuning talks out there that describe the value of sorted data in Data Flow Tasks, describing the IsSorted property that can be set through the Advanced Editor of your Source component. And so I set about testing the Aggregate transformation in SSIS, to prove for sure whether providing Sorted data would let the Aggregate transform behave like a Stream Aggregate. (Of course, I knew the answer already, but it helps to be able to demonstrate these things). A query that will produce a million rows in order was in order. Let me rephrase. I used a query which produced the numbers from 1 to 1000000, in a single field, ordered. The IsSorted flag was set on the source output, with the only column as SortKey 1. Performing an Aggregate function over this (counting the number of rows per distinct number) should produce an additional column with 1 in it. If this were being done in T-SQL, the ordered data would allow a Stream Aggregate to be used. In fact, if the Query Optimizer saw that the field had a Unique Index on it, it would be able to skip the Aggregate function completely, and just insert the value 1. This is a shortcut I wouldn’t be expecting from SSIS, but certainly the Stream behaviour would be nice. Unfortunately, it’s not the case. As you can see from the screenshots above, the data is pouring into the Aggregate function, and not being released until all million rows have been seen. It’s not doing a Stream Aggregate at all. This is expected behaviour. (I put that in bold, because I want you to realise this.) An SSIS transformation is a piece of code that runs. It’s a physical operation. When you write T-SQL and ask for an aggregation to be done, it’s a logical operation. The physical operation is either a Stream Aggregate or a Hash Match. In SSIS, you’re telling the system that you want a generic Aggregation, that will have to work with whatever data is passed in. I’m not saying that it wouldn’t be possible to make a sometimes-blocking aggregation component in SSIS. A Custom Component could be created which could detect whether the SortKeys columns of the input matched the Grouping columns of the Aggregation, and either call the blocking code or the non-blocking code as appropriate. One day I’ll make one of those, and publish it on my blog. I’ve done it before with a Script Component, but as Script components are single-use, I was able to handle the data knowing everything about my data flow already. As per my previous post – there are a lot of aspects in which tuning SSIS and tuning execution plans use similar concepts. In both situations, it really helps to have a feel for what’s going on behind the scenes. Considering whether an operation is blocking or not is extremely relevant to performance, and that it’s not always obvious from the surface. In a future post, I’ll show the impact of blocking v non-blocking and synchronous v asynchronous components in SSIS, using some of LobsterPot’s Script Components and Custom Components as examples. When I get that sorted, I’ll make a Stream Aggregate component available for download.

    Read the article

  • The blocking nature of aggregates

    - by Rob Farley
    I wrote a post recently about how query tuning isn’t just about how quickly the query runs – that if you have something (such as SSIS) that is consuming your data (and probably introducing a bottleneck), then it might be more important to have a query which focuses on getting the first bit of data out. You can read that post here.  In particular, we looked at two operators that could be used to ensure that a query returns only Distinct rows. and The Sort operator pulls in all the data, sorts it (discarding duplicates), and then pushes out the remaining rows. The Hash Match operator performs a Hashing function on each row as it comes in, and then looks to see if it’s created a Hash it’s seen before. If not, it pushes the row out. The Sort method is quicker, but has to wait until it’s gathered all the data before it can do the sort, and therefore blocks the data flow. But that was my last post. This one’s a bit different. This post is going to look at how Aggregate functions work, which ties nicely into this month’s T-SQL Tuesday. I’ve frequently explained about the fact that DISTINCT and GROUP BY are essentially the same function, although DISTINCT is the poorer cousin because you have less control over it, and you can’t apply aggregate functions. Just like the operators used for Distinct, there are different flavours of Aggregate operators – coming in blocking and non-blocking varieties. The example I like to use to explain this is a pile of playing cards. If I’m handed a pile of cards and asked to count how many cards there are in each suit, it’s going to help if the cards are already ordered. Suppose I’m playing a game of Bridge, I can easily glance at my hand and count how many there are in each suit, because I keep the pile of cards in order. Moving from left to right, I could tell you I have four Hearts in my hand, even before I’ve got to the end. By telling you that I have four Hearts as soon as I know, I demonstrate the principle of a non-blocking operation. This is known as a Stream Aggregate operation. It requires input which is sorted by whichever columns the grouping is on, and it will release a row as soon as the group changes – when I encounter a Spade, I know I don’t have any more Hearts in my hand. Alternatively, if the pile of cards are not sorted, I won’t know how many Hearts I have until I’ve looked through all the cards. In fact, to count them, I basically need to put them into little piles, and when I’ve finished making all those piles, I can count how many there are in each. Because I don’t know any of the final numbers until I’ve seen all the cards, this is blocking. This performs the aggregate function using a Hash Match. Observant readers will remember this from my Distinct example. You might remember that my earlier Hash Match operation – used for Distinct Flow – wasn’t blocking. But this one is. They’re essentially doing a similar operation, applying a Hash function to some data and seeing if the set of values have been seen before, but before, it needs more information than the mere existence of a new set of values, it needs to consider how many of them there are. A lot is dependent here on whether the data coming out of the source is sorted or not, and this is largely determined by the indexes that are being used. If you look in the Properties of an Index Scan, you’ll be able to see whether the order of the data is required by the plan. A property called Ordered will demonstrate this. In this particular example, the second plan is significantly faster, but is dependent on having ordered data. In fact, if I force a Stream Aggregate on unordered data (which I’m doing by telling it to use a different index), a Sort operation is needed, which makes my plan a lot slower. This is all very straight-forward stuff, and information that most people are fully aware of. I’m sure you’ve all read my good friend Paul White (@sql_kiwi)’s post on how the Query Optimizer chooses which type of aggregate function to apply. But let’s take a look at SQL Server Integration Services. SSIS gives us a Aggregate transformation for use in Data Flow Tasks, but it’s described as Blocking. The definitive article on Performance Tuning SSIS uses Sort and Aggregate as examples of Blocking Transformations. I’ve just shown you that Aggregate operations used by the Query Optimizer are not always blocking, but that the SSIS Aggregate component is an example of a blocking transformation. But is it always the case? After all, there are plenty of SSIS Performance Tuning talks out there that describe the value of sorted data in Data Flow Tasks, describing the IsSorted property that can be set through the Advanced Editor of your Source component. And so I set about testing the Aggregate transformation in SSIS, to prove for sure whether providing Sorted data would let the Aggregate transform behave like a Stream Aggregate. (Of course, I knew the answer already, but it helps to be able to demonstrate these things). A query that will produce a million rows in order was in order. Let me rephrase. I used a query which produced the numbers from 1 to 1000000, in a single field, ordered. The IsSorted flag was set on the source output, with the only column as SortKey 1. Performing an Aggregate function over this (counting the number of rows per distinct number) should produce an additional column with 1 in it. If this were being done in T-SQL, the ordered data would allow a Stream Aggregate to be used. In fact, if the Query Optimizer saw that the field had a Unique Index on it, it would be able to skip the Aggregate function completely, and just insert the value 1. This is a shortcut I wouldn’t be expecting from SSIS, but certainly the Stream behaviour would be nice. Unfortunately, it’s not the case. As you can see from the screenshots above, the data is pouring into the Aggregate function, and not being released until all million rows have been seen. It’s not doing a Stream Aggregate at all. This is expected behaviour. (I put that in bold, because I want you to realise this.) An SSIS transformation is a piece of code that runs. It’s a physical operation. When you write T-SQL and ask for an aggregation to be done, it’s a logical operation. The physical operation is either a Stream Aggregate or a Hash Match. In SSIS, you’re telling the system that you want a generic Aggregation, that will have to work with whatever data is passed in. I’m not saying that it wouldn’t be possible to make a sometimes-blocking aggregation component in SSIS. A Custom Component could be created which could detect whether the SortKeys columns of the input matched the Grouping columns of the Aggregation, and either call the blocking code or the non-blocking code as appropriate. One day I’ll make one of those, and publish it on my blog. I’ve done it before with a Script Component, but as Script components are single-use, I was able to handle the data knowing everything about my data flow already. As per my previous post – there are a lot of aspects in which tuning SSIS and tuning execution plans use similar concepts. In both situations, it really helps to have a feel for what’s going on behind the scenes. Considering whether an operation is blocking or not is extremely relevant to performance, and that it’s not always obvious from the surface. In a future post, I’ll show the impact of blocking v non-blocking and synchronous v asynchronous components in SSIS, using some of LobsterPot’s Script Components and Custom Components as examples. When I get that sorted, I’ll make a Stream Aggregate component available for download.

    Read the article

  • Dynamic Filtering

    - by Ricardo Peres
    Continuing my previous posts on dynamic LINQ, now it's time for dynamic filtering. For now, I'll focus on string matching. There are three standard operators for string matching, which both NHibernate, Entity Framework and LINQ to SQL recognize: Equals Contains StartsWith EndsWith So, if we want to apply filtering by one of these operators on a string property, we can use this code: public enum MatchType { StartsWith = 0, EndsWith = 1, Contains = 2, Equals = 3 } public static List Filter(IEnumerable enumerable, String propertyName, String filter, MatchType matchType) { return (Filter(enumerable, typeof(T), propertyName, filter, matchType) as List); } public static IList Filter(IEnumerable enumerable, Type elementType, String propertyName, String filter, MatchType matchType) { MethodInfo asQueryableMethod = typeof(Queryable).GetMethods(BindingFlags.Static | BindingFlags.Public).Where(m = (m.Name == "AsQueryable") && (m.ContainsGenericParameters == false)).Single(); IQueryable query = (enumerable is IQueryable) ? (enumerable as IQueryable) : asQueryableMethod.Invoke(null, new Object [] { enumerable }) as IQueryable; MethodInfo whereMethod = typeof(Queryable).GetMethods(BindingFlags.Public | BindingFlags.Static).Where(m = m.Name == "Where").ToArray() [ 0 ].MakeGenericMethod(elementType); MethodInfo matchMethod = typeof(String).GetMethod ( (matchType == MatchType.StartsWith) ? "StartsWith" : (matchType == MatchType.EndsWith) ? "EndsWith" : (matchType == MatchType.Contains) ? "Contains" : "Equals", new Type [] { typeof(String) } ); PropertyInfo displayProperty = elementType.GetProperty(propertyName, BindingFlags.Public | BindingFlags.Instance); MemberExpression member = Expression.MakeMemberAccess(Expression.Parameter(elementType, "n"), displayProperty); MethodCallExpression call = Expression.Call(member, matchMethod, Expression.Constant(filter)); LambdaExpression where = Expression.Lambda(call, member.Expression as ParameterExpression); query = whereMethod.Invoke(null, new Object [] { query, where }) as IQueryable; MethodInfo toListMethod = typeof(Enumerable).GetMethod("ToList", BindingFlags.Static | BindingFlags.Public).MakeGenericMethod(elementType); IList list = toListMethod.Invoke(null, new Object [] { query }) as IList; return (list); } var list = new [] { new { A = "aa" }, new { A = "aabb" }, new { A = "ccaa" }, new { A = "ddaadd" } }; var contains = Filter(list, "A", "aa", MatchType.Contains); var endsWith = Filter(list, "A", "aa", MatchType.EndsWith); var startsWith = Filter(list, "A", "aa", MatchType.StartsWith); var equals = Filter(list, "A", "aa", MatchType.Equals); Perhaps I'll write some more posts on this subject in the near future. SyntaxHighlighter.config.clipboardSwf = 'http://alexgorbatchev.com/pub/sh/2.0.320/scripts/clipboard.swf'; SyntaxHighlighter.brushes.CSharp.aliases = ['c#', 'c-sharp', 'csharp']; SyntaxHighlighter.all();

    Read the article

  • 14 Special Google Searches That Show Instant Answers

    - by Chris Hoffman
    Google can do more than display lists of websites – Google will give you quick answers to many special searches. While Google isn’t quite as advanced as Wolfram Alpha, it has quite a few tricks up its sleeve. We’ve also covered searching Google like a pro by learning the Google search operators – if you want to master Google, be sure to learn those. How To Create a Customized Windows 7 Installation Disc With Integrated Updates How to Get Pro Features in Windows Home Versions with Third Party Tools HTG Explains: Is ReadyBoost Worth Using?

    Read the article

  • Showplan Operator of the Week - Compute Scalar

    The third part of Fabiano's mission to describe the major Showplan Operators used by SQL Server's Query Optimiser continues with the 'Compute Scalar' operator. Fabiano shows how a tweak to SQL to avoid a 'Compute Scalar' step can improve its performance.

    Read the article

  • Undocumented Query Plans: Equality Comparisons

    - by Paul White
    The diagram below shows two data sets, with differences highlighted: To find changed rows using TSQL, we might write a query like this: The logic is clear: join rows from the two sets together on the primary key column, and return rows where a change has occurred in one or more data columns.  Unfortunately, this query only finds one of the expected four rows: The problem, of course, is that our query does not correctly handle NULLs.  The ‘not equal to’ operators <> and != do not evaluate...(read more)

    Read the article

  • Showplan Operator of the Week - Concatenation

    Fabiano continues in his mission to describe, one week at a time, all the major Showplan Operators used by SQL Server's Query Optimiser to build the Query Plan. This week he gets the Concatenation operator ....Did you know that DotNetSlackers also publishes .net articles written by top known .net Authors? We already have over 80 articles in several categories including Silverlight. Take a look: here.

    Read the article

  • Showplan Operator of the week - Assert

    As part of his mission to explain the Query Optimiser in practical terms, Fabiano attempts the feat of describing, one week at a time, all the major Showplan Operators used by SQL Server's Query Optimiser to build the Query Plan. He starts with Assert

    Read the article

  • What are known approaches to graphing algebraic expressions?

    - by jeremynealbrown
    I am planning to build an expression parser that will be used to graph algebraic functions ( think TI-83 ) with JavaScript. Functions will take the form of f(x)= Aside from typical operators such as: + - * / ^ I'd also like to add support for inline functions such as: sin(), cos(), log() and random(). I have looked at implementing the Shunting Yard algorithm for parsing expressions, but it does not look like an efficient approach to evaluating a function with a hundreds or thousands of inputs. What other known algorithms exist for this task?

    Read the article

< Previous Page | 30 31 32 33 34 35 36 37 38 39 40 41  | Next Page >