Search Results

Search found 11153 results on 447 pages for 'count zero'.

Page 259/447 | < Previous Page | 255 256 257 258 259 260 261 262 263 264 265 266  | Next Page >

  • 12.04 Booting into Terminal

    - by user170796
    To preface this, I would like to say that I am completely new to Ubuntu and have essentially zero programming experience/experience working with command line and terminal. I installed Ubuntu because I would like to get into programming. If you could provide me with the simplest instructions possible, I would be grateful. I have a Lenovo Ideapad Y500 (Intel i7, NVidia GT 750m, 1TB HDD, 16GB SSD cache, 8GB RAM) with Windows 8 on it. Using a Live CD, I installed Ubuntu 12.04 onto a 75 GB partition. During the installation, I kept all default settings except for one thing; I decided to encrypt my home folder, and so checked the corresponding box. The installation completed, and I restarted. Once I restarted, I saw the options "Ubuntu, with Linux 3.2.0-23-generic" "Ubuntu, with Linux 3.2.0-23-generic (recovery mode)" "Memory test (memtest86+)" "Memory test (memtest86+, serial console 115200)" "Windows Recovery Environment (loader) (on /dev/sdb3)" "Windows 8 (loader) (on /dev/sdb5)" "System Setup" I chose the first option, and was directed to a screen with the Ubuntu logo and the row of five dots below that change from orange to white. Then, I was brought to a full screen terminal that prompted me to login, which I did. I saw no option to boot into GUI at all, and am lost. I've been searching around and have tried the "startx" command to no avail. Should the command have some sort of context or something? I've also tried selecting the recovery mode option from the boot manager. I've tried the resume option from the following menu, which eventually just shuts down the computer after displaying a lot of scrolling text that's too fast for me to read. I've also tried the failsafex mode from the recovery mode menu, which only brings up a terminal box at the bottom of the window that covers the entire bottom part of the screen. Commands won't work in this window. When I try to access Windows 8, I get a message saying that the EFI file path was not specified or something along those lines. I had to enable Secure Boot in order to access Windows 8 (I had disabled it to be able to boot from the Live CD), which is functioning normally. I am at a complete loss for what to do. Any help will be extremely appreciated. EDIT: Bonus question! If you could figure out a way for me to boot to Windows 8 without having to enable Secure Boot, it would save me a lot of trouble. I can deal with switching every time, but I'd rather not have to.

    Read the article

  • XNA 4.0: 2D Camera Y and X are going in wrong direction

    - by Setheron
    I asked this question on stackoverflow but assumed this might be a better area to ask it as well for a more informed answer. My problem is that I am trying to create a camera class and have it so that my camera follows the proper RHS, however the Y axis seems to be inverted since on the screen the 0 starts at the top. Here is my Camera2D Class: class Camera2D { private Vector2 _position; private float _zoom; private float _rotation; private float _cameraSpeed; private Viewport _viewport; private Matrix _viewMatrix; private Matrix _viewMatrixIverse; public static float MinZoom = float.Epsilon; public static float MaxZoom = float.MaxValue; public Camera2D(Viewport viewport) { _viewMatrix = Matrix.Identity; _viewport = viewport; _cameraSpeed = 4.0f; _zoom = 1.0f; _rotation = 0.0f; _position = Vector2.Zero; } public void Move(Vector2 amount) { _position += amount; } public void Zoom(float amount) { _zoom += amount; _zoom = MathHelper.Clamp(_zoom, MaxZoom, MinZoom); UpdateViewTransform(); } public Vector2 Position { get { return _position; } set { _position = value; UpdateViewTransform(); } } public Matrix ViewMatrix { get { return _viewMatrix; } } private void UpdateViewTransform() { Matrix proj = Matrix.CreateTranslation(new Vector3(_viewport.Width * 0.5f, _viewport.Height * 0.5f, 0)) * Matrix.CreateScale(new Vector3(1f, 1f, 1f)); _viewMatrix = Matrix.CreateRotationZ(_rotation) * Matrix.CreateScale(new Vector3(_zoom, _zoom, 1.0f)) * Matrix.CreateTranslation(_position.X, _position.Y, 0.0f); _viewMatrix = proj * _viewMatrix; } } I test it using SpriteBatch in the following way: protected override void Draw(GameTime gameTime) { GraphicsDevice.Clear(Color.CornflowerBlue); Vector2 position = new Vector2(0, 0); // TODO: Add your drawing code here spriteBatch.Begin(SpriteSortMode.Immediate, BlendState.AlphaBlend, null, null, null, null, camera.ViewMatrix); Texture2D circle = CreateCircle(100); spriteBatch.Draw(circle, position, Color.Red); spriteBatch.End(); base.Draw(gameTime); }

    Read the article

  • Why does Mass Effect 1 run so slow on my machine if I have an XFX NVidia 9400GT video card? [closed]

    - by Papuccino1
    I so sick and tired of having my components pass the minimum requirements of a game and then I get 15 FPS on the game on everything low. Should't PC developers say 'use at least this video card for a smooth 30 FPS'? Here are my specs: Windows 7 2GB DDR2 RAM XFX Nvidia 9400gt Intel Pentium D Dual Core 2.8ghz I should be at LEAST getting 30 FPS on everything low right? Please tell me what I can do to make games run as they should, or is my video card not good for these games? Here are the recommended requirements from the official site: Recommended System Requirements for Mass Effect on the PC Operating System: Windows XP or Vista Processor: 2.6+GHZ Intel or 2.4+GHZ AMD Memory: 2 Gigabyte Ram Video Card: NVIDIA GeForce 7900 GTX or higher. ATI X1800 XL series or higher Hard Drive Space: 12 Gigabytes Sound Card: DirectX 9.0c compatible sound card and drivers – 5.1 sound card recommended My videocard is 9400GT, how is that worse than a 7900GTX? :S Edit 2: I should note, that I get poor frames when running the game in absolute BOTTOM specs. lowest resolution, no particles, etc. etc. Absolute ZERO and getting poor framerates.

    Read the article

  • FreeBSD performance tuning. Sysctls, loader.conf, kernel.

    - by SaveTheRbtz
    I wanted to share knowledge of tuning FreeBSD via sysctls, so i'm posting them with comments. Based on Igor Sysoev (author of nginx) presentation about FreeBSD tuning up to 100,000-200,000 active connections. Sysctls are for 7.x FreeBSD. Since 7.2 amd64 some of them are tuned well by default. Prior 7.0 some of them are boot only (set via /boot/loader.conf) or does not exist at all. Highload web server sysctls: # Max. backlog size kern.ipc.somaxconn=4096 # Shared memory // 7.2+ can use shared memory > 2Gb kern.ipc.shmmax=2147483648 # Sockets kern.ipc.maxsockets=204800 # Do not use lager sockbufs on 8.0 # ( http://old.nabble.com/Significant-performance-regression-for-increased-maxsockbuf-on-8.0-RELEASE-tt26745981.html#a26745981 ) kern.ipc.maxsockbuf=262144 # Recive clusters (on amd64 7.2+ 65k is default) # For such high value vm.kmem_size must be increased to 3G #kern.ipc.nmbclusters=229376 # Jumbo pagesize(4k/8k) clusters # Used as general packet storage for jumbo frames # can be monitored via `netstat -m` #kern.ipc.nmbjumbop=192000 # Jumbo 9k/16k clusters # If you are using them #kern.ipc.nmbjumbo9=24000 #kern.ipc.nmbjumbo16=10240 # Every socket is a file, so increase them kern.maxfiles=204800 kern.maxfilesperproc=200000 kern.maxvnodes=200000 # Turn off receive autotuning #net.inet.tcp.recvbuf_auto=0 # Small receive space, only usable on http-server, on file server this # should be increased to 65535 or even more #net.inet.tcp.recvspace=8192 # Small send space is useful for http servers that serve small files # Autotuned since 7.x net.inet.tcp.sendspace=16384 # This should be enabled if you going to use big spaces (>64k) #net.inet.tcp.rfc1323=1 # Turn this off on highspeed, lossless connections (LAN 1Gbit+) #net.inet.tcp.delayed_ack=0 # This feature is useful if you are serving data over modems, Gigabit Ethernet, # or even high speed WAN links (or any other link with a high bandwidth delay product), # especially if you are also using window scaling or have configured a large send window. # You can try setting it to 0 on fileserver with 1GBit+ interfaces # Automatically disables on small RTT ( http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/netinet/tcp_subr.c?#rev1.237 ) #net.inet.tcp.inflight.enable=0 # Disable randomizing of ports to avoid false RST # Before usage check SA here www.bsdcan.org/2006/papers/ImprovingTCPIP.pdf # (it's also says that port randomization auto-disables at some conn.rates, but I didn't tested it thou) #net.inet.ip.portrange.randomized=0 # Increase portrange # For outgoing connections only. Good for seed-boxes and ftp servers. net.inet.ip.portrange.first=1024 net.inet.ip.portrange.last=65535 # Security net.inet.ip.redirect=0 net.inet.ip.sourceroute=0 net.inet.ip.accept_sourceroute=0 net.inet.icmp.maskrepl=0 net.inet.icmp.log_redirect=0 net.inet.icmp.drop_redirect=1 net.inet.tcp.drop_synfin=1 # Security net.inet.udp.blackhole=1 net.inet.tcp.blackhole=2 # Increases default TTL, sometimes useful # Default is 64 net.inet.ip.ttl=128 # Lessen max segment life to conserve resources # ACK waiting time in miliseconds (default: 30000 from RFC) net.inet.tcp.msl=5000 # Max bumber of timewait sockets net.inet.tcp.maxtcptw=40960 # Don't use tw on local connections # As of 15 Apr 2009. Igor Sysoev says that nolocaltimewait has some buggy realization. # So disable it or now till get fixed #net.inet.tcp.nolocaltimewait=1 # FIN_WAIT_2 state fast recycle net.inet.tcp.fast_finwait2_recycle=1 # Time before tcp keepalive probe is sent # default is 2 hours (7200000) #net.inet.tcp.keepidle=60000 # Should be increased until net.inet.ip.intr_queue_drops is zero net.inet.ip.intr_queue_maxlen=4096 # Interrupt handling via multiple CPU, but with context switch. # You can play with it. Default is 1; #net.isr.direct=0 # This is for routers only #net.inet.ip.forwarding=1 #net.inet.ip.fastforwarding=1 # This speed ups dummynet when channel isn't saturated net.inet.ip.dummynet.io_fast=1 # Increase dummynet(4) hash #net.inet.ip.dummynet.hash_size=2048 #net.inet.ip.dummynet.max_chain_len # Should be increased when you have A LOT of files on server # (Increase until vfs.ufs.dirhash_mem becames lower) vfs.ufs.dirhash_maxmem=67108864 # Explicit Congestion Notification (see http://en.wikipedia.org/wiki/Explicit_Congestion_Notification) net.inet.tcp.ecn.enable=1 # Flowtable - flow caching mechanism # Useful for routers #net.inet.flowtable.enable=1 #net.inet.flowtable.nmbflows=65535 # Extreme polling tuning #kern.polling.burst_max=1000 #kern.polling.each_burst=1000 #kern.polling.reg_frac=100 #kern.polling.user_frac=1 #kern.polling.idle_poll=0 # IPFW dynamic rules and timeouts tuning # Increase dyn_buckets till net.inet.ip.fw.curr_dyn_buckets is lower net.inet.ip.fw.dyn_buckets=65536 net.inet.ip.fw.dyn_max=65536 net.inet.ip.fw.dyn_ack_lifetime=120 net.inet.ip.fw.dyn_syn_lifetime=10 net.inet.ip.fw.dyn_fin_lifetime=2 net.inet.ip.fw.dyn_short_lifetime=10 # Make packets pass firewall only once when using dummynet # i.e. packets going thru pipe are passing out from firewall with accept #net.inet.ip.fw.one_pass=1 # shm_use_phys Wires all shared pages, making them unswappable # Use this to lessen Virtual Memory Manager's work when using Shared Mem. # Useful for databases #kern.ipc.shm_use_phys=1 /boot/loader.conf: # Accept filters for data, http and DNS requests # Usefull when your software uses select() instead of kevent/kqueue or when you under DDoS # DNS accf available on 8.0+ accf_data_load="YES" accf_http_load="YES" accf_dns_load="YES" # Async IO system calls aio_load="YES" # Adds NCQ support in FreeBSD # WARNING! all ad[0-9]+ devices will be renamed to ada[0-9]+ # 8.0+ only #ahci_load= #siis_load= # Increase kernel memory size to 3G. # # Use ONLY if you have KVA_PAGES in kernel configuration, and you have more than 3G RAM # Otherwise panic will happen on next reboot! # # It's required for high buffer sizes: kern.ipc.nmbjumbop, kern.ipc.nmbclusters, etc # Useful on highload stateful firewalls, proxies or ZFS fileservers # (FreeBSD 7.2+ amd64 users: Check that current value is lower!) #vm.kmem_size="3G" # Older versions of FreeBSD can't tune maxfiles on the fly #kern.maxfiles="200000" # Useful for databases # Sets maximum data size to 1G # (FreeBSD 7.2+ amd64 users: Check that current value is lower!) #kern.maxdsiz="1G" # Maximum buffer size(vfs.maxbufspace) # You can check current one via vfs.bufspace # Should be lowered/upped depending on server's load-type # Usually decreased to preserve kmem # (default is 200M) #kern.maxbcache="512M" # Sendfile buffers # For i386 only #kern.ipc.nsfbufs=10240 # syncache Hash table tuning net.inet.tcp.syncache.hashsize=1024 net.inet.tcp.syncache.bucketlimit=100 # Incresed hostcache net.inet.tcp.hostcache.hashsize="16384" net.inet.tcp.hostcache.bucketlimit="100" # TCP control-block Hash table tuning net.inet.tcp.tcbhashsize=4096 # Enable superpages, for 7.2+ only # Also read http://lists.freebsd.org/pipermail/freebsd-hackers/2009-November/030094.html vm.pmap.pg_ps_enabled=1 # Usefull if you are using Intel-Gigabit NIC #hw.em.rxd=4096 #hw.em.txd=4096 #hw.em.rx_process_limit="-1" # Also if you have ALOT interrupts on NIC - play with following parameters # NOTE: You should set them for every NIC #dev.em.0.rx_int_delay: 250 #dev.em.0.tx_int_delay: 250 #dev.em.0.rx_abs_int_delay: 250 #dev.em.0.tx_abs_int_delay: 250 # There is also multithreaded version of em drivers can be found here: # http://people.yandex-team.ru/~wawa/ # # for additional em monitoring and statistics use # `sysctl dev.em.0.stats=1 ; dmesg` # #Same tunings for igb #hw.igb.rxd=4096 #hw.igb.txd=4096 #hw.igb.rx_process_limit=100 # Some useful netisr tunables. See sysctl net.isr #net.isr.defaultqlimit=4096 #net.isr.maxqlimit: 10240 # Bind netisr threads to CPUs #net.isr.bindthreads=1 # Nicer boot logo =) loader_logo="beastie" And finally here is my additions to GENERIC kernel # Just some of them, see also # cat /sys/{i386,amd64,}/conf/NOTES # This one useful only on i386 #options KVA_PAGES=512 # You can play with HZ in environments with high interrupt rate (default is 1000) # 100 is for my notebook to prolong it's battery life #options HZ=100 # Polling is goot on network loads with high packet rates and low-end NICs # NB! Do not enable it if you want more than one netisr thread #options DEVICE_POLLING # Eliminate datacopy on socket read-write # To take advantage with zero copy sockets you should have an MTU of 8K(amd64) # (4k for i386). This req. is only for receiving data. # Read more in man zero_copy_sockets #options ZERO_COPY_SOCKETS # Support TCP sign. Used for IPSec options TCP_SIGNATURE options IPSEC # This ones can be loaded as modules. They described in loader.conf section #options ACCEPT_FILTER_DATA #options ACCEPT_FILTER_HTTP # Adding ipfw, also can be loaded as modules options IPFIREWALL options IPFIREWALL_VERBOSE options IPFIREWALL_VERBOSE_LIMIT=10 options IPFIREWALL_DEFAULT_TO_ACCEPT options IPFIREWALL_FORWARD # Adding kernel NAT options IPFIREWALL_NAT options LIBALIAS # Traffic shaping options DUMMYNET # Divert, i.e. for userspace NAT options IPDIVERT # This is for OpenBSD's pf firewall device pf device pflog # pf's QoS - ALTQ options ALTQ options ALTQ_CBQ # Class Bases Queuing (CBQ) options ALTQ_RED # Random Early Detection (RED) options ALTQ_RIO # RED In/Out options ALTQ_HFSC # Hierarchical Packet Scheduler (HFSC) options ALTQ_PRIQ # Priority Queuing (PRIQ) options ALTQ_NOPCC # Required for SMP build # Pretty console # Manual can be found here http://forums.freebsd.org/showthread.php?t=6134 #options VESA #options SC_PIXEL_MODE # Disable reboot on Ctrl Alt Del #options SC_DISABLE_REBOOT # Change normal|kernel messages color options SC_NORM_ATTR=(FG_GREEN|BG_BLACK) options SC_KERNEL_CONS_ATTR=(FG_YELLOW|BG_BLACK) # More scroll space options SC_HISTORY_SIZE=8192 # Adding hardware crypto device device crypto device cryptodev # Useful network interfaces device vlan device tap #Virtual Ethernet driver device gre #IP over IP tunneling device if_bridge #Bridge interface device pfsync #synchronization interface for PF device carp #Common Address Redundancy Protocol device enc #IPsec interface device lagg #Link aggregation interface device stf #IPv4-IPv6 port # Also for my notebook, but may be used with Opteron #device amdtemp # Support for ECMP. More than one route for destination # Works even with default route so one can use it as LB for two ISP # For now code is unstable and panics (panic: rtfree 2) on route deletions. #options RADIX_MPATH # Multicast routing #options MROUTING #options PIM # DTrace options KDTRACE_HOOKS # all architectures - enable general DTrace hooks options DDB_CTF # all architectures - kernel ELF linker loads CTF data #options KDTRACE_FRAME # amd64-only # Adaptive spining in lockmgr (8.x+) # See http://www.mail-archive.com/[email protected]/msg10782.html options ADAPTIVE_LOCKMGRS # UTF-8 in console (9.x+) #options TEKEN_UTF8 #options TEKEN_XTERM # NCQ support # WARNING! all ad[0-9]+ devices will be renamed to ada[0-9]+ #options ATA_CAM # FreeBSD 9+ # Deadlock resolver thread # For additional information see http://www.mail-archive.com/[email protected]/msg18124.html #options DEADLKRES PS. Also most of FreeBSD's limits can be monitored by # vmstat -z and # limits PPS. variety of network counters can be monitored via # netstat -s In FreeBSD-9 netstat's -Q option appeared, try following command to display netisr stats # netstat -Q PPPS. also see # man 7 tuning PPPPS. I wanted to thank FreeBSD community, especially author of nginx - Igor Sysoev, nginx-ru@ and FreeBSD-performance@ mailing lists for providing useful information about FreeBSD tuning. So here is the question: What tunings are you using on yours FreeBSD servers? You can also post your /etc/sysctl.conf, /boot/loader.conf, kernel options, etc with description of its' meaning (do not copy-paste from sysctl -d). Don't forget to specify server type (web, smb, gateway, etc) Let's share experience!

    Read the article

  • Issues with doing a P2V on Exchange 2007

    - by PokermUNKEE
    I shutdown all Exchange services on the physical box and did a convert. Everything converted fine. Server booted up no issues and I could receive email from the outside world with no issues. But I am unable to send emails internally or externally. They just sit in the Outbox. I also disabled the DHCPClient service as I didn't want the server to get an IP address from DHCP when it first came online. It came online with no IP, then I used console to assign the correct static IP address (same one on the physical server) and rebooted. I ran the Exchange Troubleshooting Agent and it gave me the following errors: The value for the '\MSExchangeIS Mailbox\Messages Queued For Submission' counter on server exchange is greater than zero (average value is 8.8) and it appears that 'MSExchangeMailSubmission' is failing to submit messages to at least one computer with the Hub Transport server role installed over the last minute. Found 1 computers with the Hub Transport server role installed in the same Active Directory site as server exchange using local Active Directory site GUID 'ce8a4367-1bf6-4825-9cac-c4e2b115c450'. Check Local Active Directory Site Hub Transport Server Role Health (Mailflow_CheckLocalBhHealth1.1.1.1.1.1.1.1) I have one Exchange server with all the roles on it. Has anyone see this before when doing a convert? Please help :(

    Read the article

  • Disable raid member check upon mount to mount damaged nvidia raid1 member

    - by Halfgaar
    Hi, A friend of mine destroyed his Nvidia RAID1 array somehow and in trying to fix it, he ended up with a non-working array. Because of the RAID metadata, the actual disk data was stored at an offset from the beginning. I was able to identify this offset with dd and a hexeditor and then I used losetup to create a loop device with the proper offset, so that I could mount the partition. It was then that I ran into problems, namely that mount says: "mount: unknown filesystem type 'nvidia_raid_member'". I also had this when trying to mount a Linux MD component the other day, and because I can remember that doing that in the past worked, I surmised that it may be some kind of protection. I therefore booted an old Sysrescue CD and tried it there, which worked (because of the older version of mount/libc/kernel/whatever). I still need to try to get more data, and because I don't want to keep using that SysrecueCD, I'd like to be able to mount the disk on my normal system. So, my question is: can the check for a disk being a raid member be disabled? I guess I could also zero out blocks that look like the raid block, but I'd rather not... I made an image of the disk with par2 data, so it's revertable, but still...

    Read the article

  • Arch Linux drops me on my school network.

    - by Kravlin
    I'm running a Lenovo X61 which i carry around my college for getting on the internet at various points in the day. The network has always been finicky but recently it's gotten worse. I'll connect using iwconfig, get an ip from dhcpcd and log in using vpnc to their system. Sometimes I'll stay connected for hours but most of the time within 30 seconds my network traffic will drop to zero and i'll be unable to do anything. My computer still belives it's connected, however to try again i need to put my wireless interface down, put it back up and try again. It's gotten so bad that i've got a window on my computer pinging yahoo or google constantly in order to know if i'm still able to get online. I know other people who have used Arch Linux that don't have the same problems as well as people who use Ubuntu who haven't had any problems either. It seems like my computer is a special case. Does anyone have any suggestions on how to fix it? dmesg doesn't show anything out of the ordinary going on and i don't know where else to look for errors or other things to try.

    Read the article

  • Static IP for dynamic IP

    - by scape279
    I have a dynamic IP address. I would like to have a static IP, but Virgin Media don't allow static IPs for residential broadband services, even if you ask them really nicely and offer to pay for it without switching to a business tariff. I am already registered with a dynamic DNS service which is updated by my router eg me.example.com will always resolve to my dynamic IP. This is fine for some circumstances, but not if you can only enter an IP address into configuration files/hardware etc like firewalls, subversion services etc etc. Is there a way I can have a static IP address 'forwarding' to my dynamic IP? Would a possible solution involve tunnelling? Setting up a private proxy? Please note the following: I am able to buy an IP address from my web host. I have access to a webserver and I am able to create custom DNS zones. I'm happy to have a webserver running at home if necessary also. I do not wish to change broadband providers. I have zero control over the services that require the IP address entering so I cannot tackle the problem that way round (services I need to access are at work). PS I've tried googling this issue, but it is very difficult to search for as most results are related to dynamic dns (which I already have set up and isnt quite what I'm after)

    Read the article

  • Trouble with IIS SMTP relaying to Gmail

    - by saille
    I appreciate that similar questions have been asked about how to setup SMTP relaying with IIS's virtual SMTP server. However I'm still completely stumped on this problem. Here's the setup: IIS 6.0 SMTP server running on Win2k3 box with a NAT'ed IP. Company uses Gmail for all email services. An app on the box needs to send email, so normally we'd just set the app up to talk to smtp.gmail.com directly, but this app doesn't support TLS. Easy, we just setup a local SMTP relay right? So I thought. What we have done so far: Setup IIS SMTP server to relay to smtp.gmail.com, as per these excellent instructions: http://fmuntean.wordpress.com/2008/10/26/how-to-configure-iis-smtp-server-to-forward-emails-using-a-gmail-account/ The local SMTP relay allows anonymous access. Both the local IP and the loopback IP have been explicitly allowed in the Connection... and Relay... dialogs. Tried sending email from 2 different apps via the local SMTP server, but failed (the emails end up in the Queue folder, but never get sent). The IIS logs show the conversation with the local app, but zero conversation happening with smtp.gmail.com. The port used by gmail is open outbound, and indeed the apps we have that support TLS can send email directly via smtp.gmail.com, so there is no problem with the network. At this point I changed the smtp settings in IIS SMTP server to use a different external SMTP server and hey-presto, the local apps can send email via local IIS SMTP relay. So smtp.gmail.com fails to work with our IIS SMTP relay, but another 3rd party SMTP service works fine. We need to use smtp.gmail.com, so how to troubleshoot this one?

    Read the article

  • How to automatically remove Flash history/privacy trail? Or stop Flash from storing it?

    - by Arjan van Bentem
    Many people have heard about third-party cookies, and some browsers even block those by default. Some people may even be using Private Browsing modes. However, only few seem to realise that Adobe's Flash player also leaves a cross-browser trail on your local hard drive, and allows for sending cookie-like information back to the server, including third-party sites. And because it is a plugin, Flash does not take any of the browser's privacy settings into account. Sorry for the long post, but first some details about why using Flash raises a privacy concern, followed by the results of my tests: The Flash player keeps a cross-browser history of the domain names of the Flash-sites your computer has visited. Unlike your browser's history, this history is not limited to a certain number of days. History is also recorded while using so-called Private Browsing modes. It is stored on your hard drive (though, as described below, without going to Adobe's site you won't know what is stored). I am not sure if any date and time information is kept about each visit, but to see the domain names: right-click on some Flash content, open the settings dialog, and click the Help icon or click the Advanced button within the Privacy tab. This opens a browser to the help pages on Adobe.com, where one can click through to the Website Storage Settings panel. One can clear the existing list, but one cannot stop it from being recorded again. Flash allows for storing data on your local hard drive, using so-called Local Shared Objects (aka "Flash Cookies"). Just like HTTP cookies, this data can be sent back to the server, for tracking purposes. They are cross-browser, have no expiration date, and no user defined maximum lifetime can be set in the Flash preferences either. These not being HTTP cookies, they are (of course) not blocked by a browser's cookies preferences and are not removed when the normal HTTP cookies are deleted. Adobe has announced that version 10.1 will obey Private Browsing in most popular browsers, but unfortunately no word about also removing the data whenever normal cookies are deleted manually. And its implementation might be confusing: [..] if the browser is in normal browsing mode when the Flash Player instance is created, then that particular instance will forever be in normal browsing mode (private browsing is turned off). Accordingly, toggling private browsing on or off without refreshing the page or closing the private browsing window will not impact Flash Player. Local Shared Objects are not limited to the site you visit, and third-party storage is enabled by default. At the Global Storage Settings panel one can deselect the default Allow third-party Flash content to store data on your computer. Because of the cross-browser and expiration-less nature (and the fact that few people know about it), I feel that the cross-browser third-party Flash Cookies are more dangerous for visitor tracking than third-party normal HTTP cookies. They are even used to restore plain HTTP cookies that the user tried to delete: "All advertisers, websites and networks use cookies for targeted advertising, but cookies are under attack. According to current research they are being erased by 40% of users creating serious problems," says Mookie Tenembaum, founder of United Virtualities. "From simple frequency capping to the more sophisticated behavioral targeting, cookies are an essential part of any online ad campaign. PIE ["Persistent Identification Element"] will give publishers and third-party providers a persistent backup to cookies effectively rendering them unassailable", adds Tenembaum. [..] To justify this tracking mechanism, UV's Tenembaum said, "The user is not proficient enough in technology to know if the cookie is good or bad, or how it works." When selecting None (zero KB) for Specify the amount of disk space that website websites that you haven't yet visited can use to store information on your computer, and checking Never ask again then some sites do not work. However, the same site might work when setting it to None but without selecting Never ask again, and then choose Deny whenever prompted. Both options would result in zero KB of data being allowed, but the behaviour differs. The plugin also provides a Flash Player cache for Adobe-signed files. I guess these files are not an issue. So: how to automatically delete that information? On a Mac, one can find a settings.sol file and a folder for each visited Flash-website in: $HOME/Library/Preferences/Macromedia/Flash Player/macromedia.com/support/flashplayer/sys/ Deleting the settings.sol file and all the folders in sys, removes the trail from the settings panels. However, the actual Local Shared Ojects are elsewhere (see Wikipedia for locations on other operating systems), in a randomly named subfolder of: $HOME/Library/Preferences/Macromedia/Flash Player/#SharedObjects But then: how to remove this automatically? Simply removing the folders and the settings.sol file every now and then (like by using launchd or Windows' Task Scheduler) may interfere with active browsers. Or is it safe to assume that, given the cross-browser nature, the plugin would not care if things are removed while it is active? Only clearing during log-off may not work for those who hibernate all the time. Firefox users can install BetterPrivacy or Objection to delete the Local Shared Objects (for all others browsers as well). I don't know if that also deletes the trail of website domain names. Or: how to stop Flash from storing a history trail? Change of plans: I'm currently testing prohibiting Flash to write to its own sys and #SharedObjects folders. So far, Flash has not tried to restore permissions (though, when deleting the folders, Flash will of course recreate them). I've not encountered any problems but this may take some while to validate, using multiple browsers and sites. I've not yet found a log that reports errors. On a Mac: cd "$HOME/Library/Preferences/Macromedia/Flash Player/macromedia.com/support/flashplayer" rm -r sys/* chmod u-w sys cd "$HOME/Library/Preferences/Macromedia/Flash Player" # preserve the randomly named subfolders (only preserving the latest would suffice; see below) rm -r \#SharedObjects/*/* chmod -R u-w \#SharedObjects I guess the above chmods cannot be achieved on an old Windows system (I'm not sure about XP and Vista?). Though maybe on Windows one could replace the folders sys and #SharedObjects with dummy files with the same names? Anyone? Obviously, keeping Flash from storing those Local Shared Objects for all sites may cause problems. Some test results (Flash 10 on Mac OS X): When blocking the sys folder (even when leaving the #SharedObjects folder writable) then YouTube won't remember your volume settings while viewing multiple videos. Temporarily allowing write access to the blocked folders while visiting trusted sites (to only create folders for domains you like, maybe including references in settings.sol) solves that. This way, for YouTube, Flash could be allowed to write to sys/#s.ytimg.com and #SharedObjects/s.ytimg.com, while Flash could not create new folders for other domains. One may also need to make settings.sol read-only afterwards, or delete it again. When blocking both the sys and #SharedObjects folders, YouTube and Vimeo work fine (though they might not remember any settings). However, Bits on the Run refuses to even show the video player. This is solved by temporarily unblocking the #SharedObjects folder, to allow Flash to create a subfolder with some random name. Within this folder, it would create yet another folder for the current Flash website (content.bitsontherun.com). Removing that website-specific folder, and blocking both #SharedObjects and the randomly named subfolder, still seems to allow Bits on the Run to operate, even though it still cannot write anything to disk. So: the existence of the randomly named subfolder (even when write protected) is important for some sites. When I first found the #SharedObjects folder, it held many subfolders with random names, some created on the very same day. I wonder when Flash decides it wants a new folder, and how it determines (and remembers) that random name. For a moment I considered not blocking write access for sys and #SharedObjects, but explicitly creating read-only folders for well-known third-party tracking domains (like based on a list from, for example, AdBlock Plus). That way, any other domain could still create Local Shared Objects. But the list would be long, and the domains from AdBlock Plus are probably all third-party domains anyway, so disabling Allow third-party Flash content to store data on your computer might have the very same result. Any experience anyone? (Final notes: if the above links to the settings panels do not work in the future, then use the URL that is known to Flash player as a starting point: www.adobe.com/go/settingsmanager. See also "You Deleted Your Cookies? Think Again" at Wired.com -- which uses Flash cookies itself as well... For the very suspicious using Time Machine: you may want to exclude both folders, for each user, and remove the trace that is already on your backup.)

    Read the article

  • HAProxy + NodeJS gets stuck on TCP Retransmission

    - by sled
    I have a HAProxy + NodeJS + Rails Setup, I use the NodeJS Server for file upload purposes. The problem I'm facing is that if I'm uploading through haproxy to nodejs and a "TCP (Fast) Retransmission" occurs because of a lost packet the TX rate on the client drops to zero for about 5-10 secs and gets flooded with TCP Retransmissions. This does not occur if I upload to NodeJS directly (TCP Retransmission happens too but it doesn't get stuck with dozens of retransmission attempts). My test setup is a simple HTML4 FORM (method POST) with a single file input field. The NodeJS Server only reads the incoming data and does nothing else. I've tested this on multiple machines, networks, browsers, always the same issue. Here's a TCP Traffic Dump from the client while uploading a file: ..... TCP 1506 [TCP segment of a reassembled PDU] >> everything is uploading fine until: TCP 1506 [TCP Fast Retransmission] [TCP segment of a reassembled PDU] TCP 66 [TCP Dup ACK 7392#1] 63265 > http [ACK] Seq=4844161 Ack=1 Win=524280 Len=0 TSval=657047088 TSecr=79373730 TCP 1506 [TCP Retransmission] [TCP segment of a reassembled PDU] >> the last message is repeated about 50 times for >>5-10 secs<< (TX drops to 0 on client, RX drops to 0 on server) TCP 1506 [TCP segment of a reassembled PDU] >> upload continues until the next TCP Fast Retransmission and the same thing happens again The haproxy.conf (haproxy v1.4.18 stable) is the following: global log 127.0.0.1 local1 debug maxconn 4096 # Total Max Connections. This is dependent on ulimit nbproc 2 defaults log global mode http option httplog option tcplog frontend http-in bind *:80 timeout client 6000 acl is_websocket path_beg /node/ use_backend node_backend if is_websocket default_backend app_backend # Rails Server (via nginx+passenger) backend app_backend option httpclose option forwardfor timeout server 30000 timeout connect 4000 server app1 127.0.0.1:3000 # node.js backend node_backend reqrep ^([^\ ]*)\ /node/(.*) \1\ /\2 option httpclose option forwardfor timeout queue 5000 timeout server 6000 timeout connect 5000 server node1 127.0.0.1:3200 weight 1 maxconn 4096 Thanks for reading! :) Simon

    Read the article

  • Files Corrupted on System Restore

    - by Yar
    I restored OSX 10.6.2 today (was 10.6.3 and not booting) by copying the system over from a backup. The data directories were not touched. I am seeing some files as 0 bytes, and getting permission-denied errors when copying, even when using sudo cp or the Finder itself. Some programs, differently, take the files at face value and see no permission problems (such as zip), but they see the files as zero bytes, which would be game-over for recovery. cp: .git/objects/fe/86b676974a44aa7f128a55bf27670f4a1073ca: could not copy extended attributes to /eraseme/blah/.git/objects/fe/86b676974a44aa7f128a55bf27670f4a1073ca: Operation not permitted I have tried sudo chown, sudo chmod -R 777 and sudo chflags -R nouchg which do not change the end result. Strangely, this is only affecting my .git directories (perhaps because they start with a period, but renaming them -- which works -- does not change anything). What else can I do to take ownership of these files? Edit: This question comes from StackOverflow because I originally thought it was a GIT problem. It's definitely not (just) GIT. Anyway, this is to help put some of the comments in context.

    Read the article

  • When I restart my virtual enviorment it does not re-bind to the IP address

    - by RoboTamer
    The IP does no longer respond to a remote ping With restart I mean: lxc-stop -n vm3 lxc-start -n vm3 -f /etc/lxc/vm3.conf -d -- /etc/network/interfaces auto lo iface lo inet loopback up route add -net 127.0.0.0 netmask 255.0.0.0 dev lo down route add -net 127.0.0.0 netmask 255.0.0.0 dev lo # device: eth0 auto eth0 iface eth0 inet manual auto br0 iface br0 inet static address 192.22.189.58 netmask 255.255.255.248 gateway 192.22.189.57 broadcast 192.22.189.63 bridge_ports eth0 bridge_fd 0 bridge_hello 2 bridge_maxage 12 bridge_stp off post-up ip route add 192.22.189.59 dev br0 post-up ip route add 192.22.189.60 dev br0 post-up ip route add 192.22.189.61 dev br0 post-up ip route add 192.22.189.62 dev br0 -- /etc/lxc/vm3.conf lxc.utsname = vm3 lxc.rootfs = /var/lib/lxc/vm3/rootfs lxc.tty = 4 #lxc.pts = 1024 # pseudo tty instance for strict isolation lxc.network.type = veth lxc.network.flags = up lxc.network.link = br0 lxc.network.name = eth0 lxc.network.mtu = 1500 #lxc.cgroup.cpuset.cpus = 0 # security parameter lxc.cgroup.devices.deny = a # Deny all access to devices lxc.cgroup.devices.allow = c 1:3 rwm # dev/null lxc.cgroup.devices.allow = c 1:5 rwm # dev/zero lxc.cgroup.devices.allow = c 5:1 rwm # dev/console lxc.cgroup.devices.allow = c 5:0 rwm # dev/tty lxc.cgroup.devices.allow = c 4:0 rwm # dev/tty0 lxc.cgroup.devices.allow = c 4:1 rwm # dev/tty1 lxc.cgroup.devices.allow = c 4:2 rwm # dev/tty2 lxc.cgroup.devices.allow = c 1:9 rwm # dev/urandon lxc.cgroup.devices.allow = c 1:8 rwm # dev/random lxc.cgroup.devices.allow = c 136:* rwm # dev/pts/* lxc.cgroup.devices.allow = c 5:2 rwm # dev/pts/ptmx lxc.cgroup.devices.allow = c 254:0 rwm # rtc # mounts point lxc.mount.entry=proc /var/lib/lxc/vm3/rootfs/proc proc nodev,noexec,nosuid 0 0 lxc.mount.entry=devpts /var/lib/lxc/vm3/rootfs/dev/pts devpts defaults 0 0 lxc.mount.entry=sysfs /var/lib/lxc/vm3/rootfs/sys sysfs defaults 0 0

    Read the article

  • What is the best keyboard for typing speed (not layouts)

    - by Gapton
    So I am a programmer, and I like playing typing speed games. My typing speed is, for common English words, 85 to 90 wpm, max 95. I type on various devices, my laptop, desktop, office pc.... they all have slightly different keyboards. Being a curious programmer, I wonder what types of keyboard is used for the highest possible typing speed. Or let me phrase it in another way, what is the type of keyboards that people use in typing speed contest? Here is something I know that I feel like I can share: It must be a wired keyboard, I can feel the lag as I am typing this on my wireless keyboard, even if it is a slightly more expensive model which claims to have zero lag. I know people prefer a mechanical keyboard, for the hepatic feedback, however I have not tried one. It lasts longer and is noisy, it also does not have the problem of normal keyboards where you press many keys at a time the signals will get all jammed and the computer will only receive one or two keys. I personally prefer those "thin profile" keyboards. I type a lot, and 95 wpm put me in the top 5%, this is of course just on a gaming site. However when I type on the fat keyboards, my fingers have to travel a much longer distance before the keys actually click. This is where I find myself typing much faster with those thin profile keyboards found on my laptop. Because my fingers only hover on the keys and I only need to press a short distance, each stroke takes less force and light rapid strokes is what makes me type fast. When I type on a fat keyboard, I was forced to use heavy strokes, and this slows me down. There must be some people out there who are keyboard scientists, who actually do experiments and user tests with different setups. It would be interesting to understand more about the things we use everyday for not just work but a majority of our communications. P.S. this is about hardware and not about switching keyboard layouts to dvorak

    Read the article

  • NVidia ION and /dev/mapper/nvidia_... issues.

    - by Ritsaert Hornstra
    I have an NVidia ION board with 4 SATA ports and want to use that to run a Linux Server (CentOS 5.4). I first hooed up 3 HDs (that will be a RAID5 array) and a forth small boot HD. I first started to use the onboard RAID capability but that does not work correctly under Linux: the raid capacity is not a real RAID but uses lvm to define some arays. After setting the BIOS back to normal SATA mode and whiping the HDs, the first boot harddisk (/dev/sda) is seen as /dev/sda BEFORE mounting and after mounting as /dev/mapper/nvidia_. CentOS is unable to install on it (and grub is not installable on it either). So somehow the harddisk is still seen as if it belongs to some lvm volume. I tried to clean out the HD by issuing a few dd if=/dev/zero of=/dev/sda commands to wipe the starting cylinders and final cylinders but to no avail. Did anyone see this problem and did anyone find a solution? UPDATE When I create only a single ext3 partition on the first HD (/dev/mapper/nvidia_...) no LVM partitions are seen and I can boot from /dev/mapper/nvidia_.... Now the next step is to see how I can get rid of this folly.

    Read the article

  • Getting an error when mounting LVM snapshot

    - by Sandra
    I have migrated a file based Xen guest to LVM using dd bs=1M if=/dev/zero of=/dev/vg00/vm10 qemu-img convert ~/vm10.qcow2 -O raw /dev/vg00/vm10 and changed the Xen domain file for the VM to use the LV instead of the old file. The VM boots up, and now on the Xen host would I like to make a snapshot of the running VM. # lvcreate --size 10G --snapshot --name vm10-snapshot /dev/vg00/vm10 Logical volume "vm10-snapshot" created # mount /dev/vg00/vm10-snapshot /mnt/snapshot/ mount: you must specify the filesystem type # dmesg |tail EXT3 FS on dm-3, internal journal EXT3-fs: mounted filesystem with ordered data mode. hfs: unable to find HFS+ superblock VFS: Can't find ext3 filesystem on dev dm-4. hfs: unable to find HFS+ superblock hfs: unable to find HFS+ superblock VFS: Can't find ext3 filesystem on dev dm-2. hfs: unable to find HFS+ superblock hfs: unable to find HFS+ superblock hfs: unable to find HFS+ superblock For some reason it can't see it is an EXT3 filesystem. I have also tried to mount with -t ext3, but still didn't mount. # lvdisplay --- Logical volume --- LV Name /dev/vg00/vm10 VG Name vg00 LV UUID I1y1vQ-Bac5-5jwW-melh-TY5h-l9NO-qaelKk LV Write Access read/write LV snapshot status source of /dev/vg00/vm10-snapshot [active] LV Status available # open 2 LV Size 8.00 GB Current LE 2048 Segments 1 Allocation inherit Read ahead sectors auto - currently set to 256 Block device 253:2 --- Logical volume --- LV Name /dev/vg00/vm10-snapshot VG Name vg00 LV UUID GWsOx3-TPpr-GW64-uiMz-u1YN-QU4h-l0Kala LV Write Access read/write LV snapshot status active destination for /dev/vg00/vm10 LV Status available # open 0 LV Size 8.00 GB Current LE 2048 COW-table size 10.00 GB COW-table LE 2560 Allocated to snapshot 0.00% Snapshot chunk size 4.00 KB Segments 1 Allocation inherit Read ahead sectors auto - currently set to 256 Block device 253:4 # What could the problem be?

    Read the article

  • Can you "swap" the Sysprep answer file in Windows 7

    - by Ben
    I have a load of new Lenovo laptops which I am due to distribute in my company. We are distributed in multiple locations and I want to ship the laptops "boxed" and untouched by IT hand for distribution. We are using LANDesk to do all the software distribution and provisioning, but are currently falling at the first hurdle as when booted, the laptops kick into the Lenovo mini-setup wizard. I assume this is because they have been sysprepped at Lenovo. In order to keep with our (almost) zero touch strategy I want the users to PXE boot into a PE of some sort, which will run a script on startup which replaces the sysprep answer file with one of my own. (i.e. prepopulated with product key, company info etc.) and then reboot to complete Sysprep. The plan is that this will run, and then install the LANDesk agent as a post-sysprep task, which in turn will complete the provisioning. Anyone have any experience / know any pitfalls to look out for / can suggest a suitable, PXE-bootable PE environment? Apologies for the verbosity of the question - it takes a bit of explaining! Thanks in advance, Ben

    Read the article

  • DRBD on a disk with existing file system that takes all the place

    - by Karolis T.
    I'm currently trying to simulate the environment via XEN. I have installed two debian systems with such FS layout: cltest1:/etc# df -h Filesystem Size Used Avail Use% Mounted on /dev/xvda2 6.0G 417M 5.2G 8% / tmpfs 257M 0 257M 0% /lib/init/rw udev 10M 16K 10M 1% /dev tmpfs 257M 4.0K 257M 1% /dev/shm Host cltest2 is identical. Here's my drbd.conf global { minor-count 1; } resource mysql { protocol C; syncer { rate 10M; # 10 Megabytes } on cltest1 { device /dev/drbd0; disk /dev/xvda2; address 192.168.1.186:7789; meta-disk internal; } on cltest2 { device /dev/drbd0; disk /dev/xvda2; address 192.168.1.187:7789; meta-disk internal; } } I have not created filesystem on drbd0 Starting DRBD via init.d script errors out with: Starting DRBD resources: [ d(mysql) /dev/drbd0: Failure: (114) Lower device is already claimed. This usually means it is mounted. [mysql] cmd /sbin/drbdsetup /dev/drbd0 disk /dev/xvda2 /dev/xvda2 internal --set-defaults --create-device failed - continuing! Running: drbdadm create-md mysql gives: cltest1:/etc# drbdadm create-md mysql md_offset 6442446848 al_offset 6442414080 bm_offset 6442217472 Found ext3 filesystem which uses 6291456 kB current configuration leaves usable 6291228 kB Device size would be truncated, which would corrupt data and result in 'access beyond end of device' errors. You need to either * use external meta data (recommended) * shrink that filesystem first * zero out the device (destroy the filesystem) Operation refused. Command 'drbdmeta /dev/drbd0 v08 /dev/xvda2 internal create-md' terminated with exit code 40 drbdadm aborting As I understand, all of my problems are because I don't have unallocated disk space on xvda2. What are my options besides shrinking FS and connecting a separate physical disk? Can't the meta-data be stored on a file in the local filesystem?

    Read the article

  • Drobo FS vs. MacBook Pro: Finder works, Drobo Dashboard doesn't

    - by dash-tom-bang
    Does anyone have any experience with the new Drobo FS, specifically using it from a MacBook Pro? My experience thus far is this: Set up the Drobo Dashboard software (hereinafter called simply 'Dashboard') on my WinXP machine, which is hard-wired to the network to do the data migration from my NAS-that's-being-replaced (a 250G SimpleShare which works well enough but I was always afraid of losing the one disk). The Dashboard seems to work ok, except that the DroboCopy function doesn't work at all. This is the backup solution, which I can configure, and if I launch it (e.g. to back up from the old NAS to the Drobo) it spins the NAS, seeking the drive all over hell and creation, until finally giving up an hour+ later with zero files copied. Selecting only a subset of the data yields the same effect albeit more quickly. On my Mac I installed the Dashboard software too, since most of my fiddling with the device will be from my couch in the living room. Finder connects to the box just fine, fwiw, but Dashboard just sits there, "waiting for connection." This is considerably more bothersome than the above paragraph, but I figured I'd give whatever information I have. Drobo is insisting that I send them this "Debug Log" file that their software generates. Does anyone know what's in it? It's encrypted and they won't tell me, which spooks me just a bit; not like I'm terribly concerned about privacy but I don't want to be sending personal information out to every clown who says they "need" it in order to help me. thanks a ton, -tom!

    Read the article

  • How to remove request blocking on apache reverse proxy after failure of backend before asking backen

    - by matnagel
    I am working on an apache2 reverse proxy vhost. When the server behind apache is down, the first request to apache shows the error page of course. But at subsequent requests it seems apache delays for some time before asking the backend server again. During all this time (which is short but in development I don't want a delay at all) only the apache error page is shown to the browser, although the backend server is already up. Where is this setting in apache, what is this behaviour, and how can I set the delay time to zero? Edit: I am not trying to change the timeout for a single request. I want to change the blocking time. It is my experience that apache blocks further requests for a certain time before asking a backend server again that has failed once. Edit2: This is what apache delivers: Service Temporarily Unavailable The server is temporarily unable to service your request due to maintenance downtime or capacity problems. Please try again later. Apache/2.2.8 (Ubuntu) PHP/5.2.4-2ubuntu5.7 with Suhosin-Patch proxy_html/3.0.0 Server at localhost Port 80 After hitting Ctrl-R in firefox for 60 seconds the page finally appears.

    Read the article

  • Trying to run QEMU with a file as hda

    - by Felix
    I'm trying to run QEMU and use a simple file on the host system as the guest's hard drive. Here's what I attempted so far: $ dd if=/dev/zero of=/home/felix/vm/archlinux.img bs=1MB count=8192 8192+0 records in 8192+0 records out 8192000000 bytes (8.2 GB) copied, 86.6054 s, 94.6 MB/s $ qemu -hda /home/felix/vm/archlinux.img -cdrom archlinux-2009.08-netinstall-i686.iso -boot d Then I try to install Archlinux to that file. It goes pretty well (it's able to format it from what I can tell) until I start installing packages, when I get errors like this: And, of course, everything goes downhill from there (unable to mount the partition, corrupted files, ...). What am I doing wrong? Note: I'm just doing this for entertainment purposes. I don't intend to actually use this on servers or anything. The only use I can think of for this kind of installation would be to actually get an 8GB USB stick and dd that file to it and wham! You have a bootable stick with a fully fledged and customized OS, and without torturing the stick through the installation.

    Read the article

  • Hyper-V Guests Dying

    - by Jon Rauschenberger
    I just hit my THIRD instance of a Hyper-V guest machine dying with the exact same behavior. In all three instances we are hosting WS2008 guests on a WS2008 host. AFter a config change, we reboot the guest and the guest OS comes up but in a very cripled state. Specifically, we are able to log into the guest, but can't launch any apps and the guest never comes active on the network. I opened a support ticket with MS the second time this happened and they focused in on the DCOM subsystem not coming up...best explanation they could provide was that permissions on key system files got corrupted. I eventually gave up on the ticket after close to 10 hours on the phone trying different things that were going no where. What really concerns me is that we have now seen the exact same thing happen to a guest hosted on a completly differet host machine. There is zero hardware overlap between the two. Has anyone seen this before?? It's really odd behavior, but it also seems like there's a pattern here that's concerning me. Thanks, jon

    Read the article

  • laptop motherboard "shorts" when connected to adapter

    - by Bash
    Disclaimer: I'm sort of a noob, and this is a long post. Thank you all in advance! summary: completely dead laptop with no signs of life whatsoever (suddenly, for no apparent reason) Here's the deal: Lenovo Y470 (only a few months old with no water or shock damage). It stopped working suddenly (no lights, no sound, even when connecting adapter with or without battery). I tried a different adapter (same electrical rating), but no luck. I disassembled the thing completely, and tried plugging in the adapter and looking for signs of life with all different combinations of components installed (tried all combinations of RAM, CPU, USB power cords, screen, etc plugged in). no luck. Then, I noticed (as I was plugging in the adapter to try for the millionth time) that there was a "spark" for an instant when I first connect the adapter to the power jack. The adapter's LED would then flash (indicating it isn't working or charging). So, I thought the power jack has a short of some sort (due to bad soldering or something). Scanned virtually every single component on the motherboard, and tested the power jack connections with a multimeter. No shorts or damage to anything on the entire motherboard. Now I'm thinking I need to replace the motherboard. But, my actual question: What does this "shorting" when connecting the adapter signify? (btw, the voltage across the power connections and current through it drop to virtually zero when the adapter is connected and "sparks", and they stay that way). The bewildering thing is that there are no damaged components, and the voltage across adapter terminals returns to normal after I disconnect it (so it's not damaged). Please take a look at the pictures (of the motherboard's power connection and nearby components) and see if I'm missing something completely obvious... Links to pictures and laptop and motherboard model: pictures on DropBox Motherboard model: LA-6881P Laptop model: Lenovo IdeaPad Y470

    Read the article

  • How John Got 15x Improvement Without Really Trying

    - by rchrd
    The following article was published on a Sun Microsystems website a number of years ago by John Feo. It is still useful and worth preserving. So I'm republishing it here.  How I Got 15x Improvement Without Really Trying John Feo, Sun Microsystems Taking ten "personal" program codes used in scientific and engineering research, the author was able to get from 2 to 15 times performance improvement easily by applying some simple general optimization techniques. Introduction Scientific research based on computer simulation depends on the simulation for advancement. The research can advance only as fast as the computational codes can execute. The codes' efficiency determines both the rate and quality of results. In the same amount of time, a faster program can generate more results and can carry out a more detailed simulation of physical phenomena than a slower program. Highly optimized programs help science advance quickly and insure that monies supporting scientific research are used as effectively as possible. Scientific computer codes divide into three broad categories: ISV, community, and personal. ISV codes are large, mature production codes developed and sold commercially. The codes improve slowly over time both in methods and capabilities, and they are well tuned for most vendor platforms. Since the codes are mature and complex, there are few opportunities to improve their performance solely through code optimization. Improvements of 10% to 15% are typical. Examples of ISV codes are DYNA3D, Gaussian, and Nastran. Community codes are non-commercial production codes used by a particular research field. Generally, they are developed and distributed by a single academic or research institution with assistance from the community. Most users just run the codes, but some develop new methods and extensions that feed back into the general release. The codes are available on most vendor platforms. Since these codes are younger than ISV codes, there are more opportunities to optimize the source code. Improvements of 50% are not unusual. Examples of community codes are AMBER, CHARM, BLAST, and FASTA. Personal codes are those written by single users or small research groups for their own use. These codes are not distributed, but may be passed from professor-to-student or student-to-student over several years. They form the primordial ocean of applications from which community and ISV codes emerge. Government research grants pay for the development of most personal codes. This paper reports on the nature and performance of this class of codes. Over the last year, I have looked at over two dozen personal codes from more than a dozen research institutions. The codes cover a variety of scientific fields, including astronomy, atmospheric sciences, bioinformatics, biology, chemistry, geology, and physics. The sources range from a few hundred lines to more than ten thousand lines, and are written in Fortran, Fortran 90, C, and C++. For the most part, the codes are modular, documented, and written in a clear, straightforward manner. They do not use complex language features, advanced data structures, programming tricks, or libraries. I had little trouble understanding what the codes did or how data structures were used. Most came with a makefile. Surprisingly, only one of the applications is parallel. All developers have access to parallel machines, so availability is not an issue. Several tried to parallelize their applications, but stopped after encountering difficulties. Lack of education and a perception that parallelism is difficult prevented most from trying. I parallelized several of the codes using OpenMP, and did not judge any of the codes as difficult to parallelize. Even more surprising than the lack of parallelism is the inefficiency of the codes. I was able to get large improvements in performance in a matter of a few days applying simple optimization techniques. Table 1 lists ten representative codes [names and affiliation are omitted to preserve anonymity]. Improvements on one processor range from 2x to 15.5x with a simple average of 4.75x. I did not use sophisticated performance tools or drill deep into the program's execution character as one would do when tuning ISV or community codes. Using only a profiler and source line timers, I identified inefficient sections of code and improved their performance by inspection. The changes were at a high level. I am sure there is another factor of 2 or 3 in each code, and more if the codes are parallelized. The study’s results show that personal scientific codes are running many times slower than they should and that the problem is pervasive. Computational scientists are not sloppy programmers; however, few are trained in the art of computer programming or code optimization. I found that most have a working knowledge of some programming language and standard software engineering practices; but they do not know, or think about, how to make their programs run faster. They simply do not know the standard techniques used to make codes run faster. In fact, they do not even perceive that such techniques exist. The case studies described in this paper show that applying simple, well known techniques can significantly increase the performance of personal codes. It is important that the scientific community and the Government agencies that support scientific research find ways to better educate academic scientific programmers. The inefficiency of their codes is so bad that it is retarding both the quality and progress of scientific research. # cacheperformance redundantoperations loopstructures performanceimprovement 1 x x 15.5 2 x 2.8 3 x x 2.5 4 x 2.1 5 x x 2.0 6 x 5.0 7 x 5.8 8 x 6.3 9 2.2 10 x x 3.3 Table 1 — Area of improvement and performance gains of 10 codes The remainder of the paper is organized as follows: sections 2, 3, and 4 discuss the three most common sources of inefficiencies in the codes studied. These are cache performance, redundant operations, and loop structures. Each section includes several examples. The last section summaries the work and suggests a possible solution to the issues raised. Optimizing cache performance Commodity microprocessor systems use caches to increase memory bandwidth and reduce memory latencies. Typical latencies from processor to L1, L2, local, and remote memory are 3, 10, 50, and 200 cycles, respectively. Moreover, bandwidth falls off dramatically as memory distances increase. Programs that do not use cache effectively run many times slower than programs that do. When optimizing for cache, the biggest performance gains are achieved by accessing data in cache order and reusing data to amortize the overhead of cache misses. Secondary considerations are prefetching, associativity, and replacement; however, the understanding and analysis required to optimize for the latter are probably beyond the capabilities of the non-expert. Much can be gained simply by accessing data in the correct order and maximizing data reuse. 6 out of the 10 codes studied here benefited from such high level optimizations. Array Accesses The most important cache optimization is the most basic: accessing Fortran array elements in column order and C array elements in row order. Four of the ten codes—1, 2, 4, and 10—got it wrong. Compilers will restructure nested loops to optimize cache performance, but may not do so if the loop structure is too complex, or the loop body includes conditionals, complex addressing, or function calls. In code 1, the compiler failed to invert a key loop because of complex addressing do I = 0, 1010, delta_x IM = I - delta_x IP = I + delta_x do J = 5, 995, delta_x JM = J - delta_x JP = J + delta_x T1 = CA1(IP, J) + CA1(I, JP) T2 = CA1(IM, J) + CA1(I, JM) S1 = T1 + T2 - 4 * CA1(I, J) CA(I, J) = CA1(I, J) + D * S1 end do end do In code 2, the culprit is conditionals do I = 1, N do J = 1, N If (IFLAG(I,J) .EQ. 0) then T1 = Value(I, J-1) T2 = Value(I-1, J) T3 = Value(I, J) T4 = Value(I+1, J) T5 = Value(I, J+1) Value(I,J) = 0.25 * (T1 + T2 + T5 + T4) Delta = ABS(T3 - Value(I,J)) If (Delta .GT. MaxDelta) MaxDelta = Delta endif enddo enddo I fixed both programs by inverting the loops by hand. Code 10 has three-dimensional arrays and triply nested loops. The structure of the most computationally intensive loops is too complex to invert automatically or by hand. The only practical solution is to transpose the arrays so that the dimension accessed by the innermost loop is in cache order. The arrays can be transposed at construction or prior to entering a computationally intensive section of code. The former requires all array references to be modified, while the latter is cost effective only if the cost of the transpose is amortized over many accesses. I used the second approach to optimize code 10. Code 5 has four-dimensional arrays and loops are nested four deep. For all of the reasons cited above the compiler is not able to restructure three key loops. Assume C arrays and let the four dimensions of the arrays be i, j, k, and l. In the original code, the index structure of the three loops is L1: for i L2: for i L3: for i for l for l for j for k for j for k for j for k for l So only L3 accesses array elements in cache order. L1 is a very complex loop—much too complex to invert. I brought the loop into cache alignment by transposing the second and fourth dimensions of the arrays. Since the code uses a macro to compute all array indexes, I effected the transpose at construction and changed the macro appropriately. The dimensions of the new arrays are now: i, l, k, and j. L3 is a simple loop and easily inverted. L2 has a loop-carried scalar dependence in k. By promoting the scalar name that carries the dependence to an array, I was able to invert the third and fourth subloops aligning the loop with cache. Code 5 is by far the most difficult of the four codes to optimize for array accesses; but the knowledge required to fix the problems is no more than that required for the other codes. I would judge this code at the limits of, but not beyond, the capabilities of appropriately trained computational scientists. Array Strides When a cache miss occurs, a line (64 bytes) rather than just one word is loaded into the cache. If data is accessed stride 1, than the cost of the miss is amortized over 8 words. Any stride other than one reduces the cost savings. Two of the ten codes studied suffered from non-unit strides. The codes represent two important classes of "strided" codes. Code 1 employs a multi-grid algorithm to reduce time to convergence. The grids are every tenth, fifth, second, and unit element. Since time to convergence is inversely proportional to the distance between elements, coarse grids converge quickly providing good starting values for finer grids. The better starting values further reduce the time to convergence. The downside is that grids of every nth element, n > 1, introduce non-unit strides into the computation. In the original code, much of the savings of the multi-grid algorithm were lost due to this problem. I eliminated the problem by compressing (copying) coarse grids into continuous memory, and rewriting the computation as a function of the compressed grid. On convergence, I copied the final values of the compressed grid back to the original grid. The savings gained from unit stride access of the compressed grid more than paid for the cost of copying. Using compressed grids, the loop from code 1 included in the previous section becomes do j = 1, GZ do i = 1, GZ T1 = CA(i+0, j-1) + CA(i-1, j+0) T4 = CA1(i+1, j+0) + CA1(i+0, j+1) S1 = T1 + T4 - 4 * CA1(i+0, j+0) CA(i+0, j+0) = CA1(i+0, j+0) + DD * S1 enddo enddo where CA and CA1 are compressed arrays of size GZ. Code 7 traverses a list of objects selecting objects for later processing. The labels of the selected objects are stored in an array. The selection step has unit stride, but the processing steps have irregular stride. A fix is to save the parameters of the selected objects in temporary arrays as they are selected, and pass the temporary arrays to the processing functions. The fix is practical if the same parameters are used in selection as in processing, or if processing comprises a series of distinct steps which use overlapping subsets of the parameters. Both conditions are true for code 7, so I achieved significant improvement by copying parameters to temporary arrays during selection. Data reuse In the previous sections, we optimized for spatial locality. It is also important to optimize for temporal locality. Once read, a datum should be used as much as possible before it is forced from cache. Loop fusion and loop unrolling are two techniques that increase temporal locality. Unfortunately, both techniques increase register pressure—as loop bodies become larger, the number of registers required to hold temporary values grows. Once register spilling occurs, any gains evaporate quickly. For multiprocessors with small register sets or small caches, the sweet spot can be very small. In the ten codes presented here, I found no opportunities for loop fusion and only two opportunities for loop unrolling (codes 1 and 3). In code 1, unrolling the outer and inner loop one iteration increases the number of result values computed by the loop body from 1 to 4, do J = 1, GZ-2, 2 do I = 1, GZ-2, 2 T1 = CA1(i+0, j-1) + CA1(i-1, j+0) T2 = CA1(i+1, j-1) + CA1(i+0, j+0) T3 = CA1(i+0, j+0) + CA1(i-1, j+1) T4 = CA1(i+1, j+0) + CA1(i+0, j+1) T5 = CA1(i+2, j+0) + CA1(i+1, j+1) T6 = CA1(i+1, j+1) + CA1(i+0, j+2) T7 = CA1(i+2, j+1) + CA1(i+1, j+2) S1 = T1 + T4 - 4 * CA1(i+0, j+0) S2 = T2 + T5 - 4 * CA1(i+1, j+0) S3 = T3 + T6 - 4 * CA1(i+0, j+1) S4 = T4 + T7 - 4 * CA1(i+1, j+1) CA(i+0, j+0) = CA1(i+0, j+0) + DD * S1 CA(i+1, j+0) = CA1(i+1, j+0) + DD * S2 CA(i+0, j+1) = CA1(i+0, j+1) + DD * S3 CA(i+1, j+1) = CA1(i+1, j+1) + DD * S4 enddo enddo The loop body executes 12 reads, whereas as the rolled loop shown in the previous section executes 20 reads to compute the same four values. In code 3, two loops are unrolled 8 times and one loop is unrolled 4 times. Here is the before for (k = 0; k < NK[u]; k++) { sum = 0.0; for (y = 0; y < NY; y++) { sum += W[y][u][k] * delta[y]; } backprop[i++]=sum; } and after code for (k = 0; k < KK - 8; k+=8) { sum0 = 0.0; sum1 = 0.0; sum2 = 0.0; sum3 = 0.0; sum4 = 0.0; sum5 = 0.0; sum6 = 0.0; sum7 = 0.0; for (y = 0; y < NY; y++) { sum0 += W[y][0][k+0] * delta[y]; sum1 += W[y][0][k+1] * delta[y]; sum2 += W[y][0][k+2] * delta[y]; sum3 += W[y][0][k+3] * delta[y]; sum4 += W[y][0][k+4] * delta[y]; sum5 += W[y][0][k+5] * delta[y]; sum6 += W[y][0][k+6] * delta[y]; sum7 += W[y][0][k+7] * delta[y]; } backprop[k+0] = sum0; backprop[k+1] = sum1; backprop[k+2] = sum2; backprop[k+3] = sum3; backprop[k+4] = sum4; backprop[k+5] = sum5; backprop[k+6] = sum6; backprop[k+7] = sum7; } for one of the loops unrolled 8 times. Optimizing for temporal locality is the most difficult optimization considered in this paper. The concepts are not difficult, but the sweet spot is small. Identifying where the program can benefit from loop unrolling or loop fusion is not trivial. Moreover, it takes some effort to get it right. Still, educating scientific programmers about temporal locality and teaching them how to optimize for it will pay dividends. Reducing instruction count Execution time is a function of instruction count. Reduce the count and you usually reduce the time. The best solution is to use a more efficient algorithm; that is, an algorithm whose order of complexity is smaller, that converges quicker, or is more accurate. Optimizing source code without changing the algorithm yields smaller, but still significant, gains. This paper considers only the latter because the intent is to study how much better codes can run if written by programmers schooled in basic code optimization techniques. The ten codes studied benefited from three types of "instruction reducing" optimizations. The two most prevalent were hoisting invariant memory and data operations out of inner loops. The third was eliminating unnecessary data copying. The nature of these inefficiencies is language dependent. Memory operations The semantics of C make it difficult for the compiler to determine all the invariant memory operations in a loop. The problem is particularly acute for loops in functions since the compiler may not know the values of the function's parameters at every call site when compiling the function. Most compilers support pragmas to help resolve ambiguities; however, these pragmas are not comprehensive and there is no standard syntax. To guarantee that invariant memory operations are not executed repetitively, the user has little choice but to hoist the operations by hand. The problem is not as severe in Fortran programs because in the absence of equivalence statements, it is a violation of the language's semantics for two names to share memory. Codes 3 and 5 are C programs. In both cases, the compiler did not hoist all invariant memory operations from inner loops. Consider the following loop from code 3 for (y = 0; y < NY; y++) { i = 0; for (u = 0; u < NU; u++) { for (k = 0; k < NK[u]; k++) { dW[y][u][k] += delta[y] * I1[i++]; } } } Since dW[y][u] can point to the same memory space as delta for one or more values of y and u, assignment to dW[y][u][k] may change the value of delta[y]. In reality, dW and delta do not overlap in memory, so I rewrote the loop as for (y = 0; y < NY; y++) { i = 0; Dy = delta[y]; for (u = 0; u < NU; u++) { for (k = 0; k < NK[u]; k++) { dW[y][u][k] += Dy * I1[i++]; } } } Failure to hoist invariant memory operations may be due to complex address calculations. If the compiler can not determine that the address calculation is invariant, then it can hoist neither the calculation nor the associated memory operations. As noted above, code 5 uses a macro to address four-dimensional arrays #define MAT4D(a,q,i,j,k) (double *)((a)->data + (q)*(a)->strides[0] + (i)*(a)->strides[3] + (j)*(a)->strides[2] + (k)*(a)->strides[1]) The macro is too complex for the compiler to understand and so, it does not identify any subexpressions as loop invariant. The simplest way to eliminate the address calculation from the innermost loop (over i) is to define a0 = MAT4D(a,q,0,j,k) before the loop and then replace all instances of *MAT4D(a,q,i,j,k) in the loop with a0[i] A similar problem appears in code 6, a Fortran program. The key loop in this program is do n1 = 1, nh nx1 = (n1 - 1) / nz + 1 nz1 = n1 - nz * (nx1 - 1) do n2 = 1, nh nx2 = (n2 - 1) / nz + 1 nz2 = n2 - nz * (nx2 - 1) ndx = nx2 - nx1 ndy = nz2 - nz1 gxx = grn(1,ndx,ndy) gyy = grn(2,ndx,ndy) gxy = grn(3,ndx,ndy) balance(n1,1) = balance(n1,1) + (force(n2,1) * gxx + force(n2,2) * gxy) * h1 balance(n1,2) = balance(n1,2) + (force(n2,1) * gxy + force(n2,2) * gyy)*h1 end do end do The programmer has written this loop well—there are no loop invariant operations with respect to n1 and n2. However, the loop resides within an iterative loop over time and the index calculations are independent with respect to time. Trading space for time, I precomputed the index values prior to the entering the time loop and stored the values in two arrays. I then replaced the index calculations with reads of the arrays. Data operations Ways to reduce data operations can appear in many forms. Implementing a more efficient algorithm produces the biggest gains. The closest I came to an algorithm change was in code 4. This code computes the inner product of K-vectors A(i) and B(j), 0 = i < N, 0 = j < M, for most values of i and j. Since the program computes most of the NM possible inner products, it is more efficient to compute all the inner products in one triply-nested loop rather than one at a time when needed. The savings accrue from reading A(i) once for all B(j) vectors and from loop unrolling. for (i = 0; i < N; i+=8) { for (j = 0; j < M; j++) { sum0 = 0.0; sum1 = 0.0; sum2 = 0.0; sum3 = 0.0; sum4 = 0.0; sum5 = 0.0; sum6 = 0.0; sum7 = 0.0; for (k = 0; k < K; k++) { sum0 += A[i+0][k] * B[j][k]; sum1 += A[i+1][k] * B[j][k]; sum2 += A[i+2][k] * B[j][k]; sum3 += A[i+3][k] * B[j][k]; sum4 += A[i+4][k] * B[j][k]; sum5 += A[i+5][k] * B[j][k]; sum6 += A[i+6][k] * B[j][k]; sum7 += A[i+7][k] * B[j][k]; } C[i+0][j] = sum0; C[i+1][j] = sum1; C[i+2][j] = sum2; C[i+3][j] = sum3; C[i+4][j] = sum4; C[i+5][j] = sum5; C[i+6][j] = sum6; C[i+7][j] = sum7; }} This change requires knowledge of a typical run; i.e., that most inner products are computed. The reasons for the change, however, derive from basic optimization concepts. It is the type of change easily made at development time by a knowledgeable programmer. In code 5, we have the data version of the index optimization in code 6. Here a very expensive computation is a function of the loop indices and so cannot be hoisted out of the loop; however, the computation is invariant with respect to an outer iterative loop over time. We can compute its value for each iteration of the computation loop prior to entering the time loop and save the values in an array. The increase in memory required to store the values is small in comparison to the large savings in time. The main loop in Code 8 is doubly nested. The inner loop includes a series of guarded computations; some are a function of the inner loop index but not the outer loop index while others are a function of the outer loop index but not the inner loop index for (j = 0; j < N; j++) { for (i = 0; i < M; i++) { r = i * hrmax; R = A[j]; temp = (PRM[3] == 0.0) ? 1.0 : pow(r, PRM[3]); high = temp * kcoeff * B[j] * PRM[2] * PRM[4]; low = high * PRM[6] * PRM[6] / (1.0 + pow(PRM[4] * PRM[6], 2.0)); kap = (R > PRM[6]) ? high * R * R / (1.0 + pow(PRM[4]*r, 2.0) : low * pow(R/PRM[6], PRM[5]); < rest of loop omitted > }} Note that the value of temp is invariant to j. Thus, we can hoist the computation for temp out of the loop and save its values in an array. for (i = 0; i < M; i++) { r = i * hrmax; TEMP[i] = pow(r, PRM[3]); } [N.B. – the case for PRM[3] = 0 is omitted and will be reintroduced later.] We now hoist out of the inner loop the computations invariant to i. Since the conditional guarding the value of kap is invariant to i, it behooves us to hoist the computation out of the inner loop, thereby executing the guard once rather than M times. The final version of the code is for (j = 0; j < N; j++) { R = rig[j] / 1000.; tmp1 = kcoeff * par[2] * beta[j] * par[4]; tmp2 = 1.0 + (par[4] * par[4] * par[6] * par[6]); tmp3 = 1.0 + (par[4] * par[4] * R * R); tmp4 = par[6] * par[6] / tmp2; tmp5 = R * R / tmp3; tmp6 = pow(R / par[6], par[5]); if ((par[3] == 0.0) && (R > par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * tmp5; } else if ((par[3] == 0.0) && (R <= par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * tmp4 * tmp6; } else if ((par[3] != 0.0) && (R > par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * TEMP[i] * tmp5; } else if ((par[3] != 0.0) && (R <= par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * TEMP[i] * tmp4 * tmp6; } for (i = 0; i < M; i++) { kap = KAP[i]; r = i * hrmax; < rest of loop omitted > } } Maybe not the prettiest piece of code, but certainly much more efficient than the original loop, Copy operations Several programs unnecessarily copy data from one data structure to another. This problem occurs in both Fortran and C programs, although it manifests itself differently in the two languages. Code 1 declares two arrays—one for old values and one for new values. At the end of each iteration, the array of new values is copied to the array of old values to reset the data structures for the next iteration. This problem occurs in Fortran programs not included in this study and in both Fortran 77 and Fortran 90 code. Introducing pointers to the arrays and swapping pointer values is an obvious way to eliminate the copying; but pointers is not a feature that many Fortran programmers know well or are comfortable using. An easy solution not involving pointers is to extend the dimension of the value array by 1 and use the last dimension to differentiate between arrays at different times. For example, if the data space is N x N, declare the array (N, N, 2). Then store the problem’s initial values in (_, _, 2) and define the scalar names new = 2 and old = 1. At the start of each iteration, swap old and new to reset the arrays. The old–new copy problem did not appear in any C program. In programs that had new and old values, the code swapped pointers to reset data structures. Where unnecessary coping did occur is in structure assignment and parameter passing. Structures in C are handled much like scalars. Assignment causes the data space of the right-hand name to be copied to the data space of the left-hand name. Similarly, when a structure is passed to a function, the data space of the actual parameter is copied to the data space of the formal parameter. If the structure is large and the assignment or function call is in an inner loop, then copying costs can grow quite large. While none of the ten programs considered here manifested this problem, it did occur in programs not included in the study. A simple fix is always to refer to structures via pointers. Optimizing loop structures Since scientific programs spend almost all their time in loops, efficient loops are the key to good performance. Conditionals, function calls, little instruction level parallelism, and large numbers of temporary values make it difficult for the compiler to generate tightly packed, highly efficient code. Conditionals and function calls introduce jumps that disrupt code flow. Users should eliminate or isolate conditionls to their own loops as much as possible. Often logical expressions can be substituted for if-then-else statements. For example, code 2 includes the following snippet MaxDelta = 0.0 do J = 1, N do I = 1, M < code omitted > Delta = abs(OldValue ? NewValue) if (Delta > MaxDelta) MaxDelta = Delta enddo enddo if (MaxDelta .gt. 0.001) goto 200 Since the only use of MaxDelta is to control the jump to 200 and all that matters is whether or not it is greater than 0.001, I made MaxDelta a boolean and rewrote the snippet as MaxDelta = .false. do J = 1, N do I = 1, M < code omitted > Delta = abs(OldValue ? NewValue) MaxDelta = MaxDelta .or. (Delta .gt. 0.001) enddo enddo if (MaxDelta) goto 200 thereby, eliminating the conditional expression from the inner loop. A microprocessor can execute many instructions per instruction cycle. Typically, it can execute one or more memory, floating point, integer, and jump operations. To be executed simultaneously, the operations must be independent. Thick loops tend to have more instruction level parallelism than thin loops. Moreover, they reduce memory traffice by maximizing data reuse. Loop unrolling and loop fusion are two techniques to increase the size of loop bodies. Several of the codes studied benefitted from loop unrolling, but none benefitted from loop fusion. This observation is not too surpising since it is the general tendency of programmers to write thick loops. As loops become thicker, the number of temporary values grows, increasing register pressure. If registers spill, then memory traffic increases and code flow is disrupted. A thick loop with many temporary values may execute slower than an equivalent series of thin loops. The biggest gain will be achieved if the thick loop can be split into a series of independent loops eliminating the need to write and read temporary arrays. I found such an occasion in code 10 where I split the loop do i = 1, n do j = 1, m A24(j,i)= S24(j,i) * T24(j,i) + S25(j,i) * U25(j,i) B24(j,i)= S24(j,i) * T25(j,i) + S25(j,i) * U24(j,i) A25(j,i)= S24(j,i) * C24(j,i) + S25(j,i) * V24(j,i) B25(j,i)= S24(j,i) * U25(j,i) + S25(j,i) * V25(j,i) C24(j,i)= S26(j,i) * T26(j,i) + S27(j,i) * U26(j,i) D24(j,i)= S26(j,i) * T27(j,i) + S27(j,i) * V26(j,i) C25(j,i)= S27(j,i) * S28(j,i) + S26(j,i) * U28(j,i) D25(j,i)= S27(j,i) * T28(j,i) + S26(j,i) * V28(j,i) end do end do into two disjoint loops do i = 1, n do j = 1, m A24(j,i)= S24(j,i) * T24(j,i) + S25(j,i) * U25(j,i) B24(j,i)= S24(j,i) * T25(j,i) + S25(j,i) * U24(j,i) A25(j,i)= S24(j,i) * C24(j,i) + S25(j,i) * V24(j,i) B25(j,i)= S24(j,i) * U25(j,i) + S25(j,i) * V25(j,i) end do end do do i = 1, n do j = 1, m C24(j,i)= S26(j,i) * T26(j,i) + S27(j,i) * U26(j,i) D24(j,i)= S26(j,i) * T27(j,i) + S27(j,i) * V26(j,i) C25(j,i)= S27(j,i) * S28(j,i) + S26(j,i) * U28(j,i) D25(j,i)= S27(j,i) * T28(j,i) + S26(j,i) * V28(j,i) end do end do Conclusions Over the course of the last year, I have had the opportunity to work with over two dozen academic scientific programmers at leading research universities. Their research interests span a broad range of scientific fields. Except for two programs that relied almost exclusively on library routines (matrix multiply and fast Fourier transform), I was able to improve significantly the single processor performance of all codes. Improvements range from 2x to 15.5x with a simple average of 4.75x. Changes to the source code were at a very high level. I did not use sophisticated techniques or programming tools to discover inefficiencies or effect the changes. Only one code was parallel despite the availability of parallel systems to all developers. Clearly, we have a problem—personal scientific research codes are highly inefficient and not running parallel. The developers are unaware of simple optimization techniques to make programs run faster. They lack education in the art of code optimization and parallel programming. I do not believe we can fix the problem by publishing additional books or training manuals. To date, the developers in questions have not studied the books or manual available, and are unlikely to do so in the future. Short courses are a possible solution, but I believe they are too concentrated to be much use. The general concepts can be taught in a three or four day course, but that is not enough time for students to practice what they learn and acquire the experience to apply and extend the concepts to their codes. Practice is the key to becoming proficient at optimization. I recommend that graduate students be required to take a semester length course in optimization and parallel programming. We would never give someone access to state-of-the-art scientific equipment costing hundreds of thousands of dollars without first requiring them to demonstrate that they know how to use the equipment. Yet the criterion for time on state-of-the-art supercomputers is at most an interesting project. Requestors are never asked to demonstrate that they know how to use the system, or can use the system effectively. A semester course would teach them the required skills. Government agencies that fund academic scientific research pay for most of the computer systems supporting scientific research as well as the development of most personal scientific codes. These agencies should require graduate schools to offer a course in optimization and parallel programming as a requirement for funding. About the Author John Feo received his Ph.D. in Computer Science from The University of Texas at Austin in 1986. After graduate school, Dr. Feo worked at Lawrence Livermore National Laboratory where he was the Group Leader of the Computer Research Group and principal investigator of the Sisal Language Project. In 1997, Dr. Feo joined Tera Computer Company where he was project manager for the MTA, and oversaw the programming and evaluation of the MTA at the San Diego Supercomputer Center. In 2000, Dr. Feo joined Sun Microsystems as an HPC application specialist. He works with university research groups to optimize and parallelize scientific codes. Dr. Feo has published over two dozen research articles in the areas of parallel parallel programming, parallel programming languages, and application performance.

    Read the article

  • How can I manually install pecl_http on Ubuntu 9.10?

    - by Richard
    This is essentially a repost of http://stackoverflow.com/questions/4159369/ubuntu-9-04-pecl-extension-downloads-but-does-not-install. Hoping maybe someone can help me here. I've done this: sudo apt-get install php-pear sudo apt-get install php5-dev sudo apt-get install libcurl3-openssl-dev which installs fine. However, the next step: sudo pecl install pecl_http Doesn't install the extension, but merely downloads it. There are no error messages. So I have unpacked it and built it myself per http://php.net/manual/en/install.pecl.phpize.php Essentially: cd pecl_http phpize ./configure make make install I also make test'd to check all ok - and it failed one test: HttpRequest, which is kind of fundamental to this package. And indeed this doesn't work: $r = new HttpRequest('http://www.google.com'); $r->send; echo $r->getResponseCode(); No request is sent, the response code is zero, but no errors either. How can I get this damn thing installed? Is this a bug? Am I doing something wrong? Any alternatives, workarounds? Help appreciated. Thanks

    Read the article

< Previous Page | 255 256 257 258 259 260 261 262 263 264 265 266  | Next Page >